Unsupervised Speaker Clustering Using a Global Similarity and F0 Features
- Main information
This paper investigates an unsupervised speaker clustering approach that exploits global similarity, and proposes extending the standard cepstral feature set used for speaker clustering with prosodic features extracted from F0. The global-similarity-based speaker clustering algorithm, initially proposed by the authors in [6], leverages the insight that audio segments within a single cluster are not only similar to one another, but also display the same patterns of similarities and differences with audio segments belonging to all other clusters. First, speaker clustering performance using the standard Bayesian Information Criterion (BIC) is compared to the performance achieved using a BIC-based algorithm incorporating global similarity. Then both clustering techniques are tested using an extended feature set that includes F0-derived features in addition to the standard cepstral features. The evaluation, which is performed on data recorded from German-language radio, shows the clear benefits of using global information when performing clustering. It also demonstrates that in most cases F0-features outperform the cepstral feature set, both in standard BIC clustering and in the BIC global-similarity-based approach.
WP5: Detection, Extraction and Annotation of Knowledge. IAIS. Konstantin Biatov, Martha Larson. 2007-03-15 16:16
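The global-similarity idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm from [6]: the toy one-dimensional features, the negative-distance stand-in for a BIC-based local similarity, the use of Pearson correlation between similarity rows, and the names `global_similarity`, `features`, `local_sim` and `same_speaker` are all assumptions made for illustration only.

```python
import numpy as np


def global_similarity(local_sim: np.ndarray) -> np.ndarray:
    """Correlate each segment's row of local similarities with every other row.

    Two segments score high when they relate to all other segments in the
    same way, not merely when they are close to each other.
    (Hypothetical helper; Pearson correlation is an assumed choice.)
    """
    return np.corrcoef(local_sim)


# Toy 1-D "features" for six segments from two hypothetical speakers.
features = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Stand-in local similarity: negative absolute distance (NOT a BIC score).
local_sim = -np.abs(features[:, None] - features[None, :])

# Global similarity: agreement between whole rows of the local matrix.
g = global_similarity(local_sim)

# Segments whose similarity patterns agree are treated as one speaker.
same_speaker = g > 0.0
print(np.round(g, 2))
```

In this toy setup, segments from the same hypothetical speaker have strongly correlated similarity rows, while segments from different speakers have anti-correlated rows. A real system would replace `local_sim` with pairwise scores computed from cepstral and F0-derived features.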
- Access and Use Rights
Conditions of use are defined in response to a "need to access" request. Copyright Fraunhofer Institut Intelligente Analyse- und Informationssysteme. Closed: the attachment is not public.
This item is not available for public download.