A line graph compares the reward evolution patterns of the SRQL and DRQL algorithms over 200 iterations. The horizontal axis represents the number of iterations, ranging from 0 to 200. The vertical axis represents the reward, ranging from 0 to 2.5. Two data lines are present: one for DRQL in blue and one for SRQL in orange. The DRQL line starts at 0 and rises steeply, peaking around 2.2 at 100 iterations, then stabilizes around 2. The SRQL line also starts at 0, rises more gradually, and stabilizes around 1.7 after 100 iterations. Both lines show an overall upward trend, with DRQL consistently achieving higher rewards than SRQL.

Comparative analysis of reward evolution patterns in SRQL and DRQL algorithms