The purpose of this paper is two‐fold: first, to address the problem of audio speaker localization and, second, to address the problem of mobile camera control. Speaker localization consists of determining the position of the active speaker, and camera control consists of orienting a mobile camera towards that speaker. Together, these two steps make up speaker tracking, which is the overall goal of this research work.
In this approach, two‐channel estimation of the speaker position is achieved by comparing the signals received by two cardioid microphones, which are placed against each other and separated by a fixed distance. The localization technique presented in this paper is inspired by the human ears, which act as two different sound observation points and enable humans to estimate the direction of a speaking person with good precision. For the camera control part, the authors have designed an automatic system that generates the command signals and controls the rotation of the mobile camera through a stepper motor.
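As a rough illustration of the two‐channel idea, the sketch below compares the energy of two microphone frames and converts a target angle into stepper motor steps. It is not the authors' method: this summary does not state how the two signals are compared, so the energy (level difference) comparison, the 3 dB decision threshold, the 1.8 degree step angle and all function names are assumptions made for the example.

```python
import math


def frame_energy(samples):
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in samples) / len(samples)


def level_difference_db(left, right, eps=1e-12):
    """Energy ratio of the two channels in dB (positive: left is louder)."""
    return 10.0 * math.log10((frame_energy(left) + eps) / (frame_energy(right) + eps))


def speaker_side(left, right, threshold_db=3.0):
    """Coarse direction decision; the threshold is an assumed value."""
    diff = level_difference_db(left, right)
    if diff > threshold_db:
        return "left"
    if diff < -threshold_db:
        return "right"
    return "center"


def steps_to_target(current_deg, target_deg, step_angle_deg=1.8):
    """Signed number of motor steps needed to rotate the camera.

    The 1.8 degree step angle is an assumption, not a value from the paper.
    """
    return round((target_deg - current_deg) / step_angle_deg)


if __name__ == "__main__":
    # Synthetic 440 Hz tone sampled at 16 kHz, twice as strong on the left channel.
    tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(512)]
    left = [1.0 * s for s in tone]
    right = [0.5 * s for s in tone]
    print(speaker_side(left, right))
    print(steps_to_target(0.0, 45.0))
```

A real system would need a calibrated mapping from the measured difference to an angle, which this sketch deliberately leaves out.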
The off‐line experiments on speaker tracking by camera were conducted in a small meeting room without echo cancellation. Results show good performance of the proposed localization methods and correct tracking by the camera.
This new technique can be used for the automatic supervision of smart rooms.
The work described in this paper is original in that it uses only two microphones for speaker localization.
