
    Speaker Detection in Broadcast Speech Databases

    Files
    Ros1998Non5SpeakerDe.PDF (PDF, 146.0Kb)
    Ros1998Non5SpeakerDe.PS (Postscript, 295.3Kb)
    Author
    Rosenberg, Aaron; Magrin-Chagnolleau, Ivan; Parthasarathy, S.
    Date
    2004-01-14
    Abstract
    Experiments have been carried out to assess the feasibility of detecting target speaker segments in multi-speaker broadcast databases. The experimental database consists of NBC Nightly News broadcasts. The target speaker is the news anchor, Tom Brokaw. Gaussian mixture models are constructed from labelled training data for the target speaker, along with background models for other speakers, commercials, and music. Four labelled 30-min. broadcasts are used for testing. Mel-frequency cepstral features, augmented by delta cepstral features, are calculated over 20 msec. windows shifted every 10 msec. through a broadcast. Likelihood ratio scores are calculated for each test frame and averaged over blocks of frames with a specified duration. The block scores are input to a detection routine which returns estimates of target segment boundaries. The range of best results obtained over the test broadcasts is 82% to 100% detection of target segments, with segment frame accuracy ranging from 86% to 95%. Between 0 and 2 false alarm segments are detected over each 30-min. broadcast.
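The scoring chain described in the abstract (frame likelihoods under Gaussian mixture models, a likelihood ratio per frame, averaging over blocks, then segment detection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes diagonal-covariance mixtures, a single background model in place of the several the abstract mentions, non-overlapping blocks, and a simple threshold rule standing in for the unspecified detection routine. All function names, the `threshold` parameter, and the default 10 msec. frame shift argument are hypothetical choices made for illustration.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, m, v in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def frame_llr(x, target_gmm, background_gmm):
    """Log-likelihood ratio of a frame: target model versus background model.

    Each model is a (weights, means, variances) tuple (an assumed layout).
    """
    return gmm_log_likelihood(x, *target_gmm) - gmm_log_likelihood(x, *background_gmm)


def block_scores(frame_llrs, block_len):
    """Average frame scores over consecutive non-overlapping blocks (assumed)."""
    scores = []
    for i in range(0, len(frame_llrs), block_len):
        block = frame_llrs[i:i + block_len]
        scores.append(sum(block) / len(block))
    return scores


def detect_segments(scores, block_len, threshold=0.0, frame_shift_s=0.010):
    """Return (start_s, end_s) pairs for runs of blocks scoring above threshold.

    A plain threshold is a stand-in; the abstract does not specify the routine.
    """
    segments = []
    start = None
    for idx, score in enumerate(scores):
        if score > threshold and start is None:
            start = idx
        elif score <= threshold and start is not None:
            segments.append((start * block_len * frame_shift_s,
                             idx * block_len * frame_shift_s))
            start = None
    if start is not None:
        segments.append((start * block_len * frame_shift_s,
                         len(scores) * block_len * frame_shift_s))
    return segments
```

With a 10 msec. frame shift, a block of 100 frames covers roughly one second of audio, so `block_len` trades boundary resolution against score stability. The block durations actually used in the experiments are not given in the abstract.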
    Description
    Conference Paper
    Citation
    A. Rosenberg, I. Magrin-Chagnolleau and S. Parthasarathy, "Speaker Detection in Broadcast Speech Databases," 1998.
    Keyword
    Temporary; Signal Processing Applications
    Type
    Conference paper
    Citable link to this page
    https://hdl.handle.net/1911/20304
    Collections
    • DSP Publications [508]
    • ECE Publications [1468]

    Home | FAQ | Contact Us | Privacy Notice | Accessibility Statement
    Managed by the Digital Scholarship Services at Fondren Library, Rice University
    Physical Address: 6100 Main Street, Houston, Texas 77005
    Mailing Address: MS-44, P.O.BOX 1892, Houston, Texas 77251-1892
    Site Map

     
