Constructing knowledge graphs for disciplinary analysis is challenging in Chinese programming domains because of semantic complexity and context dependency. Traditional entity–relation extraction methods struggle with entity nesting, overlapping relations and incomplete feature extraction. This study aims to develop an effective joint extraction model for Chinese programming knowledge to support knowledge graph construction.
The authors propose MMBRel, a hierarchical model that integrates four components: relation-aware encoding using RoBERTa-wwm for joint processing; dynamic threshold generation for filtering entity interactions; global–local attention fusion combining semantic and syntactic features; and adversarial training for enhanced generalization.
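As an illustration of the dynamic threshold idea only, the following minimal sketch filters candidate entity interactions against a threshold computed from the scores themselves rather than a fixed cutoff. The function name `dynamic_threshold_filter`, the `margin` parameter and the rule "row mean plus margin" are assumptions made for this example; the abstract does not specify how MMBRel generates its thresholds.

```python
from statistics import mean


def dynamic_threshold_filter(scores, margin=0.1):
    """Keep (head, tail) index pairs whose score exceeds a per-head threshold.

    Assumption: the threshold for each head token is the mean of that row's
    interaction scores plus a fixed margin. This rule is hypothetical and
    stands in for whatever threshold generator the model actually learns.
    """
    kept = []
    for head, row in enumerate(scores):
        threshold = mean(row) + margin  # threshold adapts to each row
        for tail, score in enumerate(row):
            if score > threshold:
                kept.append((head, tail))
    return kept


# Toy interaction scores between three tokens (illustrative values only).
scores = [
    [0.9, 0.1, 0.2],
    [0.3, 0.3, 0.3],
    [0.1, 0.8, 0.7],
]
pairs = dynamic_threshold_filter(scores)
```

In this toy input, the second row has uniform scores, so no pair clears its own threshold, whereas a single global cutoff of 0.25 would have kept all three. This is the kind of behavior a data-dependent threshold is meant to provide.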
A Chinese C++ knowledge data set with detailed annotations was constructed. Experimental results show MMBRel significantly outperforms baseline models in joint entity–relation extraction, effectively handling overlapping relations and entity nesting while achieving comprehensive feature extraction.
To the best of the authors’ knowledge, this study introduces the first hierarchical joint extraction model for Chinese programming knowledge, with novel encoding and threshold mechanisms. The work contributes a high-quality data set and provides robust support for disciplinary knowledge graph construction, advancing extraction methodology for Chinese technical domains.
