
Detection of Target Speakers in Audio Databases

Files in this item

File                       Size      Format
Mag1999Non5Detection.PDF   58.41Kb   application/pdf
Mag1999Non5Detection.PPT   100.8Kb   application/vnd.ms-powerpoint
Mag1999Non5Detection.PS    88.45Kb   application/postscript


Item Metadata

Title: Detection of Target Speakers in Audio Databases
Author: Magrin-Chagnolleau, Ivan; Rosenberg, Aaron; Parthasarathy, S.
Type: Conference paper
Keywords: Temporary
Citation: I. Magrin-Chagnolleau, A. Rosenberg and S. Parthasarathy, "Detection of Target Speakers in Audio Databases," 1999.
Abstract: The problem of speaker detection in audio databases is addressed in this paper. Gaussian mixture modeling is used to build target speaker and background models. A detection algorithm based on a likelihood ratio calculation is applied to estimate target speaker segments. Evaluation procedures are defined in detail for this task. Results are given for different subsets of the HUB4 broadcast news database. For one target speaker, with the data restricted to high-quality speech segments, the segment miss rate is approximately 7%. For unrestricted data, the segment miss rate is approximately 27%. In both cases the segment false alarm rate is 4 or 5 per hour. For two target speakers with unrestricted data, the segment miss rate is approximately 63%, with about 27 segment false alarms per hour. The decrease in performance for two target speakers is largely associated with short speech segments in the two-target-speaker test data, which are undetectable in the current configuration of the detection algorithm.
Date Published: 1999-01-15
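
Illustrative sketch (not part of the item record): the abstract describes scoring audio with a likelihood ratio between a target speaker Gaussian mixture model and a background model. The Python below is a minimal sketch of that general idea only, not the authors' algorithm. It assumes diagonal-covariance mixtures, a per-frame log-likelihood ratio averaged over a short window, and a fixed threshold. The function names, the `window` and `threshold` parameters, and the toy one-dimensional models are all hypothetical.

```python
import math

def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))

def detect_segments(frames, target, background, window=3, threshold=0.0):
    """Return (start, end) frame index pairs, end exclusive, where the
    windowed mean log-likelihood ratio exceeds the threshold.

    `target` and `background` are (weights, means, variances) tuples.
    The smoothing window and threshold are assumptions of this sketch.
    """
    llr = [
        gmm_log_likelihood(f, *target) - gmm_log_likelihood(f, *background)
        for f in frames
    ]
    half = window // 2
    segments, start = [], None
    for i in range(len(llr)):
        lo, hi = max(0, i - half), min(len(llr), i + half + 1)
        score = sum(llr[lo:hi]) / (hi - lo)
        if score > threshold and start is None:
            start = i                      # segment opens
        elif score <= threshold and start is not None:
            segments.append((start, i))    # segment closes
            start = None
    if start is not None:
        segments.append((start, len(llr)))
    return segments

# Toy 1-D models (hypothetical): target centred at 0, background at 5.
toy_target = ([1.0], [[0.0]], [[1.0]])
toy_background = ([1.0], [[5.0]], [[1.0]])
toy_frames = [[5.0], [5.0], [5.0], [0.0], [0.0], [0.0], [0.0], [5.0], [5.0]]
toy_segments = detect_segments(toy_frames, toy_target, toy_background)
```

In this sketch a target stretch shorter than the smoothing window is averaged together with surrounding background frames and can fall below the threshold. That is one simple way a detector of this general kind could miss short segments, a limitation the abstract reports for the authors' own configuration.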

This item appears in the following Collection(s)

  • ECE Publications [1048 items]
    Publications by Rice University Electrical and Computer Engineering faculty and graduate students
  • DSP Publications [508 items]
    Publications by Rice faculty and graduate students in digital signal processing.