Construction and evaluation of a domain-specific knowledge graph for knowledge discovery

Nguyen, Huyen; Chen, Haihua; Chen, Jiangping; Kargozari, Kate; Ding, Junhua

doi:10.1108/IDD-06-2022-0054

Research Article | February 03 2023

Huyen Nguyen

Department of Information Science, University of North Texas, Denton, Texas, USA

Huyen Nguyen can be contacted at: huyennguyen5@my.unt.edu

Haihua Chen

Department of Information Science, University of North Texas, Denton, Texas, USA

Jiangping Chen

Department of Information Science, University of North Texas, Denton, Texas, USA

Kate Kargozari

Department of Information Science, University of North Texas, Denton, Texas, USA

Junhua Ding

University of North Texas, Denton, Texas, USA

Author & Article Information

Publisher: Emerald Publishing

Received: June 21 2022

Revision Received: October 28 2022

Revision Received: November 26 2022

Accepted: December 11 2022

Online ISSN: 2398-6255

Print ISSN: 2398-6247

2022 Emerald Publishing Limited. Licensed re-use rights only.

Information Discovery and Delivery (2023) 51 (4): 358–370.

https://doi.org/10.1108/IDD-06-2022-0054

Purpose

This study aims to evaluate a method of building a biomedical knowledge graph (KG).

Design/methodology/approach

This research first constructs a COVID-19 KG from the COVID-19 Open Research Data Set, covering six categories of information (i.e. disease, drug, gene, species, therapy and symptom). The construction uses open-source tools to extract entities, relations and triples. The COVID-19 KG is then evaluated on three data-quality dimensions (correctness, relatedness and comprehensiveness) using a semiautomatic approach. Finally, this study assesses the application of the KG by building a question answering (Q&A) system. Five queries regarding COVID-19 genomes, symptoms, transmissions and therapeutics were submitted to the system, and the results were analyzed.
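As a rough illustration of the pipeline described above, the following Python sketch stores (head, relation, tail) triples, answers a simple lookup query and computes two quality ratios. It is not the authors' implementation: the `TinyKG` class, the relation name `has_symptom`, the example triples and the metric definitions (correctness as the share of manually judged triples marked correct, comprehensiveness as the share of a reference entity list found in the graph) are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


class TinyKG:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        self.triples = set()
        # Index from (head, relation) to the set of tails, for fast lookup.
        self._tails = defaultdict(set)

    def add(self, head: str, relation: str, tail: str) -> None:
        # Normalize case and whitespace so duplicate mentions collapse.
        triple = Triple(head.strip().lower(),
                        relation.strip().lower(),
                        tail.strip().lower())
        self.triples.add(triple)
        self._tails[(triple.head, triple.relation)].add(triple.tail)

    def entities(self) -> set[str]:
        return {t.head for t in self.triples} | {t.tail for t in self.triples}

    def query(self, head: str, relation: str) -> list[str]:
        """Return all tails linked to `head` by `relation`, sorted."""
        key = (head.strip().lower(), relation.strip().lower())
        return sorted(self._tails.get(key, set()))


def correctness(judgments: Iterable[bool]) -> float:
    """Assumed definition: fraction of judged triples marked correct."""
    judged = list(judgments)
    if not judged:
        raise ValueError("at least one judgment is required")
    return sum(judged) / len(judged)


def comprehensiveness(kg: TinyKG, reference_entities: Iterable[str]) -> float:
    """Assumed definition: fraction of reference entities present in the KG."""
    reference = {e.strip().lower() for e in reference_entities}
    if not reference:
        raise ValueError("reference entity list is empty")
    return len(reference & kg.entities()) / len(reference)


if __name__ == "__main__":
    kg = TinyKG()
    # Illustrative triples only; not taken from the paper's data.
    kg.add("COVID-19", "has_symptom", "fever")
    kg.add("COVID-19", "has_symptom", "cough")
    print(kg.query("COVID-19", "has_symptom"))
    print(correctness([True, True, False, True]))
    print(comprehensiveness(kg, ["fever", "cough", "fatigue", "COVID-19"]))
```

In a setup like this, a question such as "What are the symptoms of COVID-19?" would first have to be mapped to a (head, relation) pair before `query` is called; that mapping step is outside the scope of the sketch.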

Findings

With current extraction tools, the quality of the KG is moderate and is difficult to improve unless the tools for entity extraction, relation extraction and other tasks are themselves improved. This study finds that comprehensiveness and relatedness correlate positively with data size. Furthermore, the results indicate that Q&A systems built on larger-scale KGs perform better than those built on smaller ones for most queries, underscoring the importance of relatedness and comprehensiveness to the usefulness of the KG.

Originality/value

The KG construction process and the data-quality-based and application-based evaluations discussed in this paper provide valuable references for KG researchers and practitioners building high-quality domain-specific knowledge discovery systems.
