Demystifying oral history with natural language processing and data analytics: a case study of the Densho digital collection

Chen, Haihua; Kim, Jeonghyun (Annie); Chen, Jiangping; Sakata, Aisa

doi:10.1108/EL-12-2023-0303

Research Article | June 28 2024

Haihua Chen
Department of Information Science, University of North Texas, Denton, Texas, USA
Haihua Chen can be contacted at: haihua.chen@unt.edu

Jeonghyun (Annie) Kim
Department of Information Science, University of North Texas, Denton, Texas, USA

Jiangping Chen
Department of Information Science, University of North Texas, Denton, Texas, USA

Aisa Sakata
Department of Information Science, University of North Texas, Denton, Texas, USA

Publisher: Emerald Publishing

Received: December 15 2023

Revision Received: April 18 2024

Accepted: April 24 2024

Online ISSN: 1758-616X

Print ISSN: 0264-0473

2024 Emerald Publishing Limited. Licensed re-use rights only.

The Electronic Library (2024) 42 (4): 643–663.

https://doi.org/10.1108/EL-12-2023-0303

Purpose

This study aims to explore the applications of natural language processing (NLP) and data analytics in understanding large-scale digital collections in oral history archives.

Design/methodology/approach

NLP and data analytics were used to analyse the oral interview transcripts of 904 survivors of the Japanese American incarceration camps, collected from the Densho Digital Repository. The analysis relied specifically on descriptive analysis, keyword extraction, topic modelling and sentiment analysis (SA).
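
As a rough illustration of two of the techniques named above (keyword extraction and SA), the following is a minimal sketch only. The abstract does not state which algorithms or libraries the authors used, so TF-IDF ranking and a lexicon lookup are stand-ins chosen for brevity. The sample snippets, stopword list, lexicon and function names are all hypothetical, and the snippets are invented placeholders rather than text from the Densho collection.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder snippets; NOT text from the Densho collection.
TRANSCRIPTS = [
    "we were moved to the camp and the barracks were cold",
    "my father lost the farm and we lived in the barracks",
    "years later the redress movement gave us hope",
]

# Tiny illustrative stopword list and sentiment lexicon (assumptions).
STOPWORDS = {"we", "were", "to", "the", "and", "my", "in", "us", "later"}
LEXICON = {"cold": -1, "lost": -1, "hope": 1}


def tokenize(text):
    """Lowercase, split on non-letters, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_keywords(docs, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        if not tokens:
            results.append([])
            continue
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_n])
    return results


def lexicon_sentiment(text):
    """Sum lexicon polarities; a negative total suggests negative sentiment."""
    return sum(LEXICON.get(t, 0) for t in tokenize(text))


keywords = tfidf_keywords(TRANSCRIPTS)
sentiments = [lexicon_sentiment(t) for t in TRANSCRIPTS]
```

Terms shared across many transcripts (here, "barracks") are down-weighted by the inverse document frequency, so the top-ranked keywords tend to be those distinctive to each narrator. A study at the scale described would more plausibly rely on established NLP libraries and a dedicated topic model, which this sketch does not attempt.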

Findings

The researchers identified multiple geographic areas with large residential communities of ethnic Japanese people, as well as the place names of the concentration camps. The extracted keywords and topics reflect the deplorable conditions and militaristic nature of the camps and the forced labour of the internees. When remembering history, the narrators' main focus remains the redress and reparation movement to obtain the restitution of their civil rights. SA further found that the forcible removal and incarceration of Japanese Americans during the Second World War negatively affected the narrators and caused them deep trauma.

Originality/value

This case study demonstrated how NLP and data analytics could be applied to analyse oral history archives and open avenues for discovery. Archival researchers and the general public may benefit from this type of analysis in making connections between temporal, spatial and emotional elements, which will contribute to a holistic understanding of individuals and communities in terms of their collective memory.
