RESEARCH STARTER
Voiceprint analyses
Voiceprint analysis involves creating visual representations of an individual's voice through biometric measurements, helping to identify speakers in recorded conversations. Each person's voice is unique, shaped by anatomical features such as the vocal cords and the mouth as well as by individual speech patterns. The analysis is done with a device called a sound spectrograph, which produces a voiceprint that proponents consider as distinctive as a fingerprint. Law enforcement uses this technology to assist in investigations, although its admissibility in U.S. courts varies by state because of ongoing debates about accuracy and reliability.
Historically, voiceprint analysis began in the 1940s but gained traction in the 1960s when researchers explored its potential for identifying callers behind anonymous threats. While some studies, notably by engineer Lawrence Kersta, suggested high accuracy rates, subsequent investigations have reported mixed results. Concerns persist regarding the technology's capability to account for changes in an individual’s voice over time or the potential for voice disguise. For valid analysis, conditions such as recording quality and adequate sample duration must be met, and trained technicians are essential to ensure accurate comparisons. Despite its challenges, voiceprint analysis remains a significant tool within the forensic landscape.
Authored By: Davidson, Martiscia, A.M.
Published In: 2020
Full Article
DEFINITION: Visual representations of individuals’ voices based on biometric measurements.
SIGNIFICANCE: Law-enforcement investigators use voiceprint analysis to determine the likelihood that particular individuals are the persons speaking in recorded conversations. In the United States, the admissibility of voiceprints as evidence varies from state to state.
The principle underlying the use of voiceprints, also called sound spectrograms, for identification is that each individual has a unique voice; that is, each person’s voice has particular characteristics that allow the voice to be distinguished from every other voice. Each person’s voice is affected by the size and shape of the person’s vocal cords, mouth, throat, teeth, and nasal cavity. In addition, voice uniqueness derives from the movement of the tongue, lips, and jaw muscles during speech. When a speaker’s voice is analyzed by an instrument known as a sound spectrograph, which maps the voice onto a graph to produce a visual representation, the resulting voiceprint is, according to proponents of the technology, as unique as a fingerprint. This technology is one of several that have been used to authenticate the claim that Osama bin Laden was the speaker on audiotapes released by al-Qaeda.
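To illustrate how a spectrograph-style analysis turns sound into a graph, the following minimal Python sketch splits a signal into short overlapping frames and measures the strength of each frequency in every frame. It is an illustration only, not the method used by any forensic laboratory: the function names, the frame size, and the synthetic 1,000 Hz test tone standing in for a voice recording are all assumptions introduced here.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per time frame."""
    # A Hann window tapers the edges of each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive discrete Fourier transform; for real input only bins
        # 0 .. frame_size/2 carry distinct information.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


def dominant_frequency(magnitudes, sample_rate, frame_size):
    """Frequency in hertz of the strongest bin in one frame."""
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / frame_size


# Hypothetical input: a pure 1,000 Hz tone sampled at 8,000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(tone)                                 # time frames x frequency bins
peak_hz = dominant_frequency(spec[0], SAMPLE_RATE, 64)   # strongest frequency in frame 0
print(len(spec), "frames; strongest frequency in first frame:", peak_hz, "Hz")
```

Plotting the rows of `spec` with time on one axis and frequency on the other, and shading each cell by its magnitude, would give the kind of picture the article calls a voiceprint. Real speech produces several shifting frequency bands rather than the single steady line a pure tone yields.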
History
In 1941, Bell Telephone Laboratories in New Jersey developed the sound spectrograph, a device that analyzes the frequencies and intensities of sounds over time and creates visual records of sounds in the form of graphs. Although intelligence agencies were interested in the technology as a way to identify enemy agents from recorded telephone conversations, progress toward the identification of individual speakers was slow until the early 1960s. At that time, police in New York City became interested in using voice analysis to assist in identifying a caller who was repeatedly phoning in bomb threats to airlines. They asked Lawrence Kersta, a Bell Labs engineer, to determine whether a comparison of sound spectrograms, which Kersta later called voiceprints, could be used to identify a suspect positively as the caller. Kersta experimented with visual pattern matching of voiceprints and concluded that when an unknown voiceprint was compared with that of a known speaker, the likelihood of a match could be determined with more than 99 percent accuracy.
Kersta’s results were not universally accepted, and other researchers found a lower degree of accuracy, but in the early 1970s, law-enforcement agencies began trying to enter voiceprints into evidence in criminal cases. Some courts accepted voiceprint evidence; others threw it out on the grounds that the technology had not been adequately proven. Although the American Board of Recorded Evidence, an advisory board of the American College of Forensic Examiners, published standards in 1997 for the comparison of voice samples and certifies speaker identification examiners, voiceprint evidence is not uniformly admissible in American courts.
Controversies Surrounding Voiceprint Evidence
From the beginning, voiceprint evidence has been a topic of controversy. Although Kersta claimed almost perfect accuracy in identifying speakers, his experiments were performed under ideal conditions with high school girls as subjects. Other experimenters, working under different, less ideal conditions, reported lower accuracy rates. Currently, the results of voiceprint comparisons are classified, based on the number of similarities in the samples, as positive identification, probable identification, positive elimination, probable elimination, or unable to determine. In a study of 2,000 forensic voiceprints, the Federal Bureau of Investigation (FBI) found 0.31 percent false identifications and 0.53 percent false eliminations.
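As a rough worked example, the reported error percentages can be converted into approximate case counts. The short Python sketch below assumes that both percentages apply to the same 2,000 comparisons, which the article does not state explicitly; the variable and function names are illustrative only.

```python
TOTAL_COMPARISONS = 2000          # size of the FBI study, as reported above
FALSE_IDENTIFICATION_PCT = 0.31   # percent of comparisons
FALSE_ELIMINATION_PCT = 0.53      # percent of comparisons


def approximate_count(total, percent):
    """Convert a percentage of a total into an approximate number of cases."""
    return total * percent / 100


false_ids = approximate_count(TOTAL_COMPARISONS, FALSE_IDENTIFICATION_PCT)
false_elims = approximate_count(TOTAL_COMPARISONS, FALSE_ELIMINATION_PCT)
combined_pct = FALSE_IDENTIFICATION_PCT + FALSE_ELIMINATION_PCT

print(f"False identifications: about {false_ids:.1f} of {TOTAL_COMPARISONS}")
print(f"False eliminations:    about {false_elims:.1f} of {TOTAL_COMPARISONS}")
print(f"Combined error rate:   about {combined_pct:.2f} percent")
```

Under that assumption the figures correspond to roughly 6 false identifications and roughly 11 false eliminations. The fractional results suggest that the published numbers are rounded or that the two rates rest on slightly different totals.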
Among the questions that have plagued voiceprint evidence are whether voiceprints of the same person change over time and whether a voice can be disguised to fool the spectrograph. Studies have shown quite conclusively that although a person’s voice may sound different to listeners as the person ages, the frequency and wavelength of the sound remain essentially unchanged. Disguising or distorting the voice, however, can make voiceprint comparison invalid. A trained examiner will recognize that one voice sample has been artificially altered, and this may force an “unable to determine” finding. Courts have ruled that it is not a violation of suspects’ rights to compel them to provide acceptable voice samples.
Standards and Training
Certain conditions must be met for the results of voiceprint comparisons to be considered valid. Several minutes of speech from both the known speaker and the unknown speaker must be available for analysis. Ideally, the samples should contain many of the same words and phrases. The style of speech in the samples must be similar—for example, one cannot be shouted, and the other whispered. Relaxed, normal conversation produces the most accurate results. The quality of both recordings must be good (for instance, clear and free of excessive background noise). In addition, the analyst must be a trained voiceprint technician. The analyst should make both a visual comparison of the voiceprints and an auditory comparison of the samples to listen for vocal tics, phrasing, and accent similarities and differences.
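The conditions above can be read as a checklist applied before any comparison is attempted. The following Python sketch shows one way such a pre-check might look. It is a hypothetical illustration, not a published standard: the `VoiceSample` fields, the style labels, and especially the 120-second minimum (a stand-in for "several minutes") are assumptions introduced here.

```python
from typing import NamedTuple


class VoiceSample(NamedTuple):
    duration_seconds: float
    speech_style: str      # e.g. "conversational", "shouted", "whispered"
    good_quality: bool     # clear and free of excessive background noise


# Hypothetical threshold standing in for "several minutes" of speech.
MIN_DURATION_SECONDS = 120.0


def comparison_problems(known, unknown, min_duration=MIN_DURATION_SECONDS):
    """List the reasons a comparison of two samples would not be valid."""
    problems = []
    for label, sample in (("known", known), ("unknown", unknown)):
        if sample.duration_seconds < min_duration:
            problems.append(f"{label} sample is too short")
        if not sample.good_quality:
            problems.append(f"{label} recording quality is poor")
    if known.speech_style != unknown.speech_style:
        problems.append("speech styles differ")
    return problems


known = VoiceSample(300.0, "conversational", True)
unknown = VoiceSample(45.0, "whispered", True)
issues = comparison_problems(known, unknown)
print(issues or "samples are suitable for comparison")
```

An empty list means that the samples pass these basic checks. It says nothing about whether the voices match, which still depends on the visual and auditory comparison carried out by a trained analyst.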
Minimum training for a voiceprint technician involves completing a two- to four-week course, performing at least 100 voice comparisons under the direct supervision of an expert, and passing an examination given by experts in the field. Voiceprint technicians who serve as expert witnesses often have additional training, including academic research in forensic linguistics or forensic phonetics. The International Association of Forensic Linguists and the International Association of Forensic Phonetics and Acoustics publish the International Journal of Speech, Language and the Law, which presents research findings and reports on legal cases involving speaker identification through voice samples.
Bibliography
Dornman, Andy. “Biometrics Becomes a Commodity.” IT Architect, 1 Feb. 2006, p. 46.
“Forensic Science, No Consensus.” Issues in Science and Technology, vol. 20, Winter 2004, pp. 5–9.
Hollien, Harry. Forensic Voice Identification. Academic Press, 2002.
James, Stuart H., and Jon J. Nordby, editors. Forensic Science: An Introduction to Scientific and Investigative Techniques. 4th ed., CRC Press, 2014.
Kalat, David. “Nervous System: From a Cry in the Dark to the Forensic Voiceprint.” BRG, 4 May 2022, www.thinkbrg.com/insights/publications/kalat-ns-forensic-voiceprint/. Accessed 7 Dec. 2025.
Moore, Sarah. “Voice Analysis in Forensics.” AZO Life Sciences, 17 Dec. 2021, www.azolifesciences.com/article/Voice-Analysis-in-Forensics.aspx. Accessed 7 Dec. 2025.
Tanner, Dennis C. Medical-Legal and Forensic Aspects of Communication Disorders, Voice Prints, and Speaker Profiling. Lawyers & Judges, 2007.