A sound source localization method based on multi-scale cross-STFT complex-valued convolutional neural network

Liu, Mengran; Zhou, Chao; Feng, Hanghai; Gong, Chuanqi; Hu, Junhao; Jian, Zeming

doi:10.1108/SR-10-2024-0870


Research Article | January 31, 2025


Mengran Liu, Chao Zhou, Hanghai Feng, Chuanqi Gong, Junhao Hu and Zeming Jian are all at the Hubei Key Laboratory of Modern Manufacturing Quantity Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan, China.

Zeming Jian can be contacted at: jianzemingx@163.com

Author & Article Information


Publisher: Emerald Publishing

Received: August 20 2024

Revision Received: December 11 2024

Revision Received: January 07 2025

Accepted: January 11 2025

Online ISSN: 1758-6828

Print ISSN: 0260-2288

2025 Emerald Publishing Limited. Licensed re-use rights only.

Sensor Review (2025) 45 (3): 374–386.

https://doi.org/10.1108/SR-10-2024-0870

Purpose

This paper aims to address the limitations of current deep learning algorithms for sound source localization (SSL), which focus on a single feature and frequency scale, neglecting the integration of multi-scale information. The method developed in this study enhances localization accuracy by effectively using the spatial information and spectral diversity provided by microphone arrays.

Design/methodology/approach

The method is based on a multi-scale cross-short-time Fourier transform (STFT) complex-valued convolutional neural network (CCNN). It uses cross-STFT spectra at different scales to capture detailed acoustic information across various frequencies. The effectiveness of the algorithm was validated through both simulations and experimental studies.
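The idea of cross-STFT spectra at several scales can be illustrated with a minimal NumPy sketch. The abstract does not give the exact definition, so this assumes the cross-STFT of a microphone pair is the product of one channel's STFT with the complex conjugate of the other's, and that "scale" means STFT window length. The function names, the window lengths (256, 512, 1024), the Hann window and the 50% overlap are all illustrative assumptions, not the authors' settings.

```python
import numpy as np


def stft(x, win_len, hop):
    """Hann-windowed STFT; returns a complex array of shape (frames, win_len // 2 + 1)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[k * hop : k * hop + win_len] * window for k in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def cross_stft(x_i, x_j, win_len, hop):
    """Assumed cross-spectrum of two channels: X_i(t, f) * conj(X_j(t, f))."""
    return stft(x_i, win_len, hop) * np.conj(stft(x_j, win_len, hop))


def multi_scale_cross_stft(signals, win_lens=(256, 512, 1024)):
    """Cross-STFT for every microphone pair at every window length.

    signals: real array of shape (n_mics, n_samples).
    Returns a dict {win_len: complex array of shape (n_pairs, frames, bins)}.
    """
    n_mics = signals.shape[0]
    pairs = [(i, j) for i in range(n_mics) for j in range(i + 1, n_mics)]
    return {
        w: np.stack(
            [cross_stft(signals[i], signals[j], w, w // 2) for i, j in pairs]
        )
        for w in win_lens
    }


if __name__ == "__main__":
    # Hypothetical 4-microphone recording of 4096 samples (random placeholder data).
    rng = np.random.default_rng(0)
    recording = rng.standard_normal((4, 4096))
    for win_len, spec in multi_scale_cross_stft(recording).items():
        print(win_len, spec.shape, spec.dtype)
```

Short windows give finer time resolution and long windows give finer frequency resolution, which is one plausible reading of why combining scales helps. The outputs stay complex-valued, so the phase differences between microphones are preserved for a complex-valued network to consume.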

Findings

Experimental results show that the proposed multi-scale cross-STFT CCNN outperforms both the single-scale cross-STFT model and other advanced methods, achieving consistently higher localization accuracy. The method is robust across a range of signal-to-noise ratio (SNR) conditions and performs well even on imbalanced datasets, indicating strong generalization capabilities.
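Robustness across SNR conditions is usually tested by corrupting clean signals with noise scaled to a target SNR. The abstract does not say how the noisy data were generated, so the sketch below assumes additive white Gaussian noise; the function names and the test tone are hypothetical.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so that the result has roughly the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, given the clean signal and its noisy version."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(100_000) / 16_000.0          # assumed 16 kHz sampling rate
    clean = np.sin(2 * np.pi * 440.0 * t)      # placeholder 440 Hz tone
    for snr_db in (-5, 0, 10, 20):
        noisy = add_noise_at_snr(clean, snr_db, rng)
        print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

Because the noise is random, the measured SNR only approximates the target, and the match improves with signal length.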

Originality/value

This paper introduces a novel approach to SSL that integrates multi-scale information, addressing a key limitation of existing methods. The findings offer significant value to researchers and practitioners in the field of acoustic signal processing, particularly those focused on deep learning-based localization techniques.

