Figure 5

A diagram shows a control flow for an agent interacting with an environment, likely for precast production sequencing.

At the top, a dashed rectangular box labeled “Environment Initialize” contains two inner rectangular blocks: “PC Quantity, PC Due Date” and “Mould Quantity, Crew Quantity.” An arrow points from “Environment Initialize” to a rectangular block labeled “Initial Action Space” on its right. From “Initial Action Space,” an arrow points downward to a parallelogram block labeled “Valid Action Masking.”

Below “Environment Initialize” is a dashed rectangular box labeled “State.” It contains four inner rectangular blocks: “Mould Utilization,” “Order Progress,” “Crew Utilization,” and “Job Progress.” An arrow runs from “State” to “Valid Action Masking” on its right.

Below “State,” a dashed rectangular box represents a neural network, with the label “Input: State s subscript t” to its right. Inside this dashed box, four layers of circles represent neurons: a leftmost layer of blue circles (input), two middle layers of gray circles (hidden), and a rightmost layer of green circles (output). To the right of the network is the label “Output: Q subscript t (s subscript t, a subscript t).”

From “Valid Action Masking,” a downward arrow leads to the label “Q subscript u t (s subscript t, a subscript u t).” Another downward arrow points to “Select action a subscript u with max Q subscript u,” and a further downward arrow leads to “Select job to be executed.” An arrow then curves back to the left from “Select job to be executed” to a rectangular decision box, “Precast order complete?” A “Yes” arrow from this decision box is labeled “Precast production sequence.” A “No” arrow from this decision box connects back to the “State” dashed box, closing the loop.
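The masking and selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy Q-function, and the state (a list of remaining units per job) are hypothetical stand-ins for the network and for the four state features shown in the figure. The sketch only shows how invalid actions can be excluded before taking the maximum Q-value, and how the loop repeats until the order is complete.

```python
import math
from typing import Callable, List, Sequence


def select_masked_action(q_values: Sequence[float], valid_mask: Sequence[bool]) -> int:
    """Return the index of the valid action with the largest Q-value."""
    if len(q_values) != len(valid_mask):
        raise ValueError("q_values and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("no valid action available")
    # Invalid actions are set to -inf so they can never win the argmax.
    masked = [q if ok else -math.inf for q, ok in zip(q_values, valid_mask)]
    return max(range(len(masked)), key=masked.__getitem__)


def build_sequence(
    remaining: Sequence[int],
    q_function: Callable[[List[int]], Sequence[float]],
) -> List[int]:
    """Repeat mask -> select -> execute until every job is finished.

    remaining[j] is the number of units of job j still to produce
    (a hypothetical, simplified state). q_function stands in for the
    network and returns one Q-value per job.
    """
    state = list(remaining)  # copy so the caller's list is not modified
    sequence: List[int] = []
    while any(r > 0 for r in state):        # "Precast order complete?" -> No
        mask = [r > 0 for r in state]       # valid action masking
        action = select_masked_action(q_function(state), mask)
        state[action] -= 1                  # execute the selected job
        sequence.append(action)
    return sequence                         # "Yes" branch: production sequence


# Toy usage: a fixed Q-vector in place of a trained network (assumption).
toy_q = lambda state: [0.2, 0.9, 0.5]
print(build_sequence([1, 2, 1], toy_q))  # prints [1, 1, 2, 0]
```

Setting masked Q-values to negative infinity, rather than removing entries, keeps action indices aligned with the network's output layer, which is one common way to apply such a mask.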

DQN architecture. Source: Authors’ own work
