Table 4

Base LLMs adopted in this study

| | Qwen2.5-72B | Qwen2.5-MAX | Qwen3-235B-A22B | DeepSeek-V3 | DeepSeek-R1 | GPT-4o |
|---|---|---|---|---|---|---|
| Architecture | Dense | MoE | MoE | MoE | Dense | Dense, Multimodal |
| Total Params | 72B | 325B | 235B | 671B | 260B | ~1.8T (estimated) |
| Activated Params | 72B | 22B | 22B | 37B | 260B | ~1.8T (estimated) |
| Availability | Open Source | Open Source | Open Source | Open Source | Open Source | Closed Source |
| Model Category | General | General | Reasoning | General | Reasoning | General |

Note(s): MoE is short for mixture-of-experts

Source(s): Authors’ own work
