Variables in the DRL model
| Variable | Description | Equation |
|---|---|---|
| | Current state of the environment at a given time, including mold and crew utilization and job progress | Eq. (8) |
| | Selected action at a decision point, representing the allocation of jobs to mold plates | Eq. (9) |
| | Valid action at a decision point | – |
| | Estimated Q-value of taking an action in a state | Eq. (12) |
| | Refined estimated value after action masking | – |
| | Information about the job currently being executed on a mold plate | Eq. (8) |
| | One-hot encoding: each type of PC job is assigned a unique vector that identifies which type of job is currently being executed on a mold plate | Eq. (8) |
| | Order completion progress: the current percentage of jobs completed or being executed for a specific type of PC order, consisting of the amount percentage and the relative due dates | Eq. (8) |
| | Job completion progress: the completion progress of the various processes for jobs currently being executed | Eq. (8) |
| | Action space | Eq. (11) |
| | Selecting n orders from a given type of PC job | Eq. (9) |
| | Coefficient of the policy | Eq. (12) |
| | Action masking function at a decision point | Eq. (13), Eq. (14) |
| | Masked action space at a decision point | – |
| | Reward received after taking an action | Eq. (14)–(17) |
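
The one-hot encoding and the action-masking entries in the table can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`one_hot`, `mask_q_values`, `greedy_valid_action`), the use of negative infinity to suppress invalid actions, and the example numbers are all assumptions made for illustration. It assumes Q-values are given as a plain list with one entry per action and that validity is a Boolean list of the same length.

```python
import math


def one_hot(job_type, num_types):
    """Return a one-hot list marking which job type occupies a mold plate."""
    if not 0 <= job_type < num_types:
        raise ValueError("job_type out of range")
    return [1.0 if i == job_type else 0.0 for i in range(num_types)]


def mask_q_values(q_values, valid):
    """Replace the Q-values of invalid actions with -inf (an assumed masking rule)."""
    if len(q_values) != len(valid):
        raise ValueError("q_values and valid must have the same length")
    return [q if ok else -math.inf for q, ok in zip(q_values, valid)]


def greedy_valid_action(q_values, valid):
    """Return the index of the highest masked Q-value among the valid actions."""
    if not any(valid):
        raise ValueError("no valid action at this decision point")
    masked = mask_q_values(q_values, valid)
    return max(range(len(masked)), key=masked.__getitem__)


# Hypothetical example: 4 candidate actions; actions 0 and 2 are infeasible.
q = [2.5, 0.7, 3.1, 1.2]
valid = [False, True, False, True]

print(one_hot(1, 3))                  # [0.0, 1.0, 0.0]
print(greedy_valid_action(q, valid))  # 3
```

Although action 2 has the highest raw Q-value, the mask removes it, so the greedy choice falls on action 3, the best of the remaining valid actions.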