Figure 5

The figure is a multi-panel workflow summarizing evaluation outcomes for construction-law question answering. On the left, a vertical box titled “7 GPLLMs” lists the models with icons: “Llama-2-70b”, “Text-davinci-003”, “GPT-3.5 Turbo”, “GPT-4”, “ChatGLM2-6B”, “ERNIE-Bot-turbo”, and “ERNIE-Bot 4.0”. A double-headed arrow labeled “Integration” points downward to a box “Construction law knowledge repository (CLKR)” containing a miniature of the earlier eight-area knowledge diagram (C1 to C8 of “387 CL documents in CLKR”). From the models, an arrow leads to a central box stating “14,980 answers to 2,140 questions from 7 original GPLLMs” with “8,404 marks for 14,980 answers” beneath. A double-headed arrow labeled “Comparison” points from a dashed box below back to this box. The dashed box reads “14,980 answers generated by 7 CLKR-empowered GPLLMs” and adds “Retrieved knowledge chunks and similarity ranks”. Below the comparison arrow, another box notes “10,202.5 marks for 14,980 answers”, indicating improved scoring when using CLKR.
To the right, a large rectangular panel titled “Performance comparison of each test paper / on MSQs or MMQs / across 8 CL areas / on open-ended questions (Section 4.1)” contains bullet points: “CLKR enhances the CLQA performance of 7 GPLLMs by an average of 21.1%, varying from 9.9% to 44.9% (Table 4).” “CLQA performance on MSQs and MMQs improves by 14.9% and 38.3% (Table 5).” “CLKR enhances CLQA performance from 14.5% to 28.2% across eight CL knowledge areas (Table 6).” “CLKR enhances 7 GPLLMs’ performance in 100 open-ended questions by an average of 22.0% (Table S4).”
A lower right panel titled “Impact evaluation of individual CL document on performance enhancement (Section 4.2)” provides further bullets: “Top 10 (2.6%) documents offer 37.2% (unranked) and 37.3% (ranked) knowledge for CLQA (Figure 9).” “210 documents retrieved less than 5 times offer 3.7% (unranked) and 3.5% (ranked) knowledge for CLQA (Table S4).”

The comparison results of GPLLMs’ CLQA accuracies with and without CLKR. Source(s): Authors’ own work
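
The headline counts in Figure 5 can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the numbers quoted in the figure; the variable names are illustrative, and the pooled gain it computes is an assumption about how the totals relate, not the authors' scoring procedure. In particular, a pooled ratio of total marks need not equal the per-model average of 21.1% reported in Table 4.

```python
# Figures quoted in the Figure 5 description (variable names are illustrative).
n_models = 7
n_questions = 2_140
n_answers = n_models * n_questions  # answers per condition (with or without CLKR)

marks_without_clkr = 8_404.0
marks_with_clkr = 10_202.5

# Pooled gain over all answers. This is an assumption-laden aggregate and is
# not the same quantity as the per-model average gain reported in the paper.
gain = marks_with_clkr - marks_without_clkr
pooled_gain_pct = 100 * gain / marks_without_clkr

# Share of the repository represented by the top 10 of 387 CL documents.
top_docs_share_pct = 100 * 10 / 387

print(f"answers per condition: {n_answers}")
print(f"mark gain: {gain}")
print(f"pooled relative gain: {pooled_gain_pct:.1f}%")
print(f"top-10 document share: {top_docs_share_pct:.1f}%")
```

The answer count reproduces the 14,980 shown in the figure, and the top 10 documents come to about 2.6% of the 387 in the repository, matching the lower right panel.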