Open-set Speaker Recognition with Bounded Laguerre Voronoi Clustering
Date
2024-08-19
Authors
Ohi, A. Q.
Abstract
Speaker recognition is a challenging problem in behavioral biometrics that has been investigated extensively over the last decade. Although numerous supervised closed-set systems successfully harness the power of deep neural networks, few studies have addressed open-set speaker recognition. This thesis proposes a self-supervised open-set speaker recognition framework that leverages the geometric properties of the speaker distribution for accurate and robust speaker identification. The proposed framework consists of a deep neural network that incorporates a wider view of temporal speech features, combined with Laguerre–Voronoi diagram-based speech feature extraction. The deep neural network is trained with a specialized clustering criterion that requires only positive pairs during training. The framework further incorporates a novel clustering approach that integrates concepts from Voronoi diagrams in Laguerre geometry. This approach offers flexibility because it requires only one hyperparameter, an upper bound on the number of centroids. In the experiments, the proposed system outperformed current state-of-the-art methods in open-set speaker verification and identification.
Citation
Ohi, A. Q. (2024). Open-set speaker recognition with bounded Laguerre Voronoi clustering (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.