Zehu's authentication technology is based on biometric Adaptive Speaker Verification (ASV), which identifies speakers by unique vocal characteristics related to the shape of the vocal tract. During enrollment of a new speaker, these characteristics (also known as features) are extracted from several voice samples to create a voice template, or voiceprint, which is then stored in a database. The voice template represents the distribution of the speaker's voice features but does not contain actual voice samples.
During verification, features are extracted from the test segment and compared with a single mathematical voice model or with a set of voice models. The result of this comparison is a numerical score that expresses how likely it is that the speaker who created the voice model is the person speaking in the test segment. Comparing this score with a threshold yields a binary accept/reject decision. The process can be repeated for several voice models, providing one-to-many identification results.
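The enrollment and verification flow described above can be illustrated with a minimal sketch. It is not Zehu's implementation: the text does not specify the feature type, the model, or the scoring function, so the sketch assumes a diagonal Gaussian as the voice template and an average log-likelihood as the score. The names `enroll`, `score`, `verify`, `identify`, and `fake_features`, as well as the threshold value, are hypothetical, and synthetic random vectors stand in for real extracted voice features.

```python
import math
import random


def fake_features(center, n_frames, rng):
    """Stand-in for feature extraction (assumption): random vectors near `center`."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n_frames)]


def enroll(samples):
    """Build a voice template from several samples (each a list of feature vectors).

    The template keeps only the per-dimension mean and variance of the features,
    i.e. a summary of their distribution, not the samples themselves.
    """
    frames = [frame for sample in samples for frame in sample]
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return {"mean": mean, "var": var}


def score(template, frames):
    """Average per-frame log-likelihood of the test features under the template."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, template["mean"], template["var"]):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def verify(template, frames, threshold):
    """Binary accept/reject decision: accept when the score reaches the threshold."""
    return score(template, frames) >= threshold


def identify(templates, frames, threshold):
    """One-to-many identification: best-scoring enrolled speaker, or None."""
    best_name, best_score = None, -math.inf
    for name, template in templates.items():
        s = score(template, frames)
        if s > best_score:
            best_name, best_score = name, s
    return best_name if best_score >= threshold else None


# Usage with synthetic data; the threshold is a hypothetical value for this toy setup.
rng = random.Random(0)
templates = {
    "alice": enroll([fake_features([0.0, 0.0, 0.0], 50, rng) for _ in range(3)]),
    "bob": enroll([fake_features([5.0, 5.0, 5.0], 50, rng) for _ in range(3)]),
}
THRESHOLD = -10.0
test_segment = fake_features([0.0, 0.0, 0.0], 50, rng)
print(verify(templates["alice"], test_segment, THRESHOLD))
print(identify(templates, test_segment, THRESHOLD))
```

In a real deployment the threshold would be tuned on held-out data to balance false accepts against false rejects; the fixed value here only serves the synthetic example.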