Figure 7
A line graph depicts the trend of the rewards curve with respect to the episodes. The graph is titled “Learning Curve with DQN.” The vertical axis, labeled “Rewards” in blue text, ranges from negative 30000 to 10000 in increments of 10000. The horizontal axis, labeled “Episodes,” ranges from 0 to 90 in increments of 15. A single blue line, representing “Rewards,” is plotted across the graph. This line starts around episode 0 with a reward value close to negative 25000, then sharply drops to approximately negative 34000 before rising rapidly. It fluctuates significantly between episodes 5 and 20, generally increasing from around negative 20000 to negative 5000. From episode 20 to about episode 30, the reward sharply increases from around negative 20000 to slightly above 0, then plateaus until around episode 45. After episode 45, there is a significant jump to approximately 8000, followed by a slight dip and then a relatively stable period around 5000 to 6000 until about episode 75. From episode 75 onwards, the reward increases again, reaching a peak close to 10000 and remaining at that level until the end of the plotted episodes. A legend in the bottom-right corner of the plot explicitly labels the blue line as “Rewards.”

Reward curve. Authors’ own work
