Figure 2
A diagram illustrates self-attention estimation from features and a share matrix in reinforcement learning. The diagram is divided into nine parts, (a) through (i).

Part (a): A simplified map with several seaports labeled “Seaport A” at the top, “Seaport B” on the right, and “Seaport N” at the bottom. Ships are depicted traveling between these seaports, with curved lines indicating routes. Curved lines connect “Seaport A” to “Seaport B” and “Seaport B” to “Seaport C.”

Part (b): Three tables for “Feature Seaport A,” “Feature Seaport B,” and “Feature Seaport C,” arranged in a horizontal sequence, with three horizontal dots between “Feature Seaport B” and “Feature Seaport C.” Each table contains numerical values across three columns. The values for “Feature Seaport A” are 0.6, 1.4, and 2.0. The values for “Feature Seaport B” are 1.2, 2.4, and 1.0. The values for “Feature Seaport C” are 2.2, 0.4, and 2.0. Below these tables are the “Normalised Feature” tables. The values for “Normalised Feature Seaport A” are 0.64, 0.80, and 0.88. The values for “Normalised Feature Seaport B” are 0.76, 0.91, and 0.73. The values for “Normalised Feature Seaport N” are 0.90, 0.59, and 0.88. The main calculation is stated at the top right: “Multiply feature, share, and adjacency matrix.”

Part (c): The “normalised features” matrix, displayed as a 3 by 3 grid using the normalized data from part (b). Row 1: 0.64, 0.80, 0.88. Row 2: 0.76, 0.91, 0.73. Row 3: 0.90, 0.59, 0.88. Three vertical dots are shown between rows 2 and 3.

Part (d): The “share matrix,” displayed as a 3 by 3 grid. Row 1: 0.2, 0.4, 0.4. Row 2: 0.4, 0.3, 0.3. Row 3: 0.1, 0.2, 0.7.

Part (e): The “adjacency matrix,” displayed as a 3 by 3 grid. Row 1: 0, 1, 1. Row 2: 1, 0, 1. Row 3: 1, 1, 0. (Note: The second and third rows and the second and third columns are obscured by a dot.)

Part (f): The “importance matrix,” the result of the multiplication of (c), (d), and (e), displayed as a 3 by 3 grid. Row 1: 0.54, 0.67, 1.11. Row 2: 0.59, 0.72, 1.08. Row 3: 0.50, 0.71, 1.15. Three vertical dots are shown between rows 2 and 3. The process continues with the text “Calculate ‘self attention’ of each customer.”

Part (g): The result of applying a “Leaky ReLU” activation function to the importance matrix (f), displayed as a 3 by 3 grid. Row 1: 0.54, 0.67, 1.11. Row 2: 0.59, 0.72, 1.08. Row 3: 0.50, 0.71, 1.15.

Part (h): The “attention coefficient” matrix, which is derived from (g) (implied by arrows) and contains different values, displayed as a 3 by 3 grid. Row 1: 0.23, 0.29, 0.47. Row 2: 0.24, 0.30, 0.45. Row 3: 0.21, 0.30, 0.48.

Part (i): The final “self attention” feature table, where the attention coefficient values are aggregated, shown as a column vector with labels: “Seaport A”: 0.67, “Seaport B”: 0.72, and “Seaport N”: 0.71. Three vertical dots are shown between “Seaport B” and “Seaport C.”

A leftward arrow is shown between part (g) and part (h), and between part (h) and part (i).

Self-attention estimation from features and share matrix. Source: Created by authors
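
The numbers in parts (b) through (h) of the figure can be retraced with a few lines of Python. The following is a minimal sketch rather than the authors' implementation, and it rests on several assumptions inferred from the listed values, none of which the figure states explicitly: the normalisation is taken to be a logistic sigmoid truncated to two decimals, which reproduces part (c); the importance matrix is taken to be the product of (c) and (d) only, because that product already agrees with part (f) to within about 0.01, so the role of the adjacency matrix (e) is left out; and the attention coefficients are taken to be each row of (g) divided by its row sum. The `negative_slope` value for the Leaky ReLU is hypothetical. Since every entry is positive, the activation leaves the matrix unchanged, which is consistent with (g) equalling (f). The aggregation into part (i) is not specified in the figure and is not attempted here.

```python
import math

# Values transcribed from the figure description, parts (b) and (d).
features = [
    [0.6, 1.4, 2.0],  # Seaport A
    [1.2, 2.4, 1.0],  # Seaport B
    [2.2, 0.4, 2.0],  # last seaport (labelled C or N in the figure)
]
share = [
    [0.2, 0.4, 0.4],
    [0.4, 0.3, 0.3],
    [0.1, 0.2, 0.7],
]


def sigmoid_truncated(x):
    # Assumed normalisation: logistic sigmoid, truncated to two decimals.
    return math.floor(100.0 / (1.0 + math.exp(-x))) / 100.0


def matmul(a, b):
    # Plain row-by-column matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def leaky_relu(x, negative_slope=0.01):
    # negative_slope is a hypothetical value; the figure gives none.
    return x if x >= 0 else negative_slope * x


def row_normalise(m):
    # Assumed rule for part (h): divide each entry by its row sum.
    result = []
    for row in m:
        total = sum(row)
        result.append([v / total for v in row])
    return result


normalised = [[sigmoid_truncated(v) for v in row] for row in features]  # cf. (c)
importance = matmul(normalised, share)                                  # cf. (f)
activated = [[leaky_relu(v) for v in row] for row in importance]        # cf. (g)
attention = row_normalise(activated)                                    # cf. (h)

for label, row in zip(["A", "B", "N"], attention):
    print(label, [round(v, 2) for v in row])
```

Under these assumptions the computed entries agree with the figure's values to within about 0.01. The remaining differences appear to come from inconsistent rounding or truncation in the printed matrices.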
