Forensic Sound Analyses of Cellular Telephone Recordings

By: Durand R. Begault, Adrian L. Lu, and Philip J. Perry
Originally published by the Audio Engineering Society, 2015


Recordings involving cellular telephones or personal digital assistants ("PDAs") are increasingly the source of evidence in audio forensic examinations, compared to recordings originating with other devices such as hand-held digital recorders. On modern PDA cellular telephones, recordings can be made either directly to the telephone or transmitted as voice mail messages. The current investigation focuses on differences between the two types of recordings in terms of dynamic range and linearity of levels. Such information can be important for characterizing the distance of sound sources relative to the microphone and for understanding the transformation of recorded speech and non-speech sounds.


Recordings involving cellular telephones or personal digital assistants ("PDAs") are increasingly the focus of audio forensic examinations, while recordings originating with other devices such as hand-held digital recorders have decreased as these devices are supplanted by PDAs, much as has happened with consumer-level cameras. On modern PDA cellular telephones, recordings can be made either directly to the telephone or transmitted as voice mail messages to a particular messaging system. It becomes immediately obvious that the quality of such recordings varies widely between messaging systems and between built-in recording applications; furthermore, the sensitivity of recordings to different sound source levels varies greatly, due to codecs and/or telephonic transmission. An understanding of these transformational properties has important implications for forensic analyses that include speech transcription of "background" voices, gunshots, distance of sound sources, and environmental context analysis (sometimes referred to as "roomprints") [1, 2].

When a recording is made directly to the telephone, the recording function is similar to that of a traditional hand-held digital recorder. For example, the native voice memo recording application on the iPhone 5s writes .m4a files that, while technically written using a lossy codec, are both wideband and of decent dynamic range (sampling rate 44.1 kHz, bitrate 64 kbps). A particular application from a third party may allow alternative means of export (but not necessarily recording). The use of the iPhone microphone as input to a sound level meter application has also been evaluated in the literature [3].
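To put the 64 kbps figure in context, the short calculation below compares it with uncompressed PCM at the same sampling rate. This is only a sketch: the 16-bit mono PCM baseline is an assumption chosen for illustration, not a detail reported for the voice memo application.

```python
# Rough bitrate arithmetic for the voice memo format described above.
SAMPLE_RATE_HZ = 44_100      # sampling rate stated in the text
CODED_BITRATE_BPS = 64_000   # bitrate stated in the text
BITS_PER_SAMPLE = 16         # ASSUMPTION: illustrative PCM word length
CHANNELS = 1                 # ASSUMPTION: mono baseline


def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed linear PCM in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


pcm_bps = pcm_bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
ratio = pcm_bps / CODED_BITRATE_BPS

print(f"Uncompressed PCM baseline: {pcm_bps / 1000:.1f} kbps")
print(f"Approximate data reduction: {ratio:.1f}:1")
print(f"Upper frequency limit (Nyquist): {nyquist_hz(SAMPLE_RATE_HZ) / 1000:.2f} kHz")
```

Under these assumptions the lossy codec discards roughly nine tenths of the PCM data while the sampling rate still permits a bandwidth well beyond that of narrowband telephony.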

When a recording is saved as a voicemail message, either on another telephone, on a corporate voice mail server, or on an emergency ("911") call log recording system, several additional and varied stages of signal processing can be involved. These stages include algorithms designed to optimize the speech signal against background noise, including compression and voice activity detection algorithms. Most importantly, these stages of signal processing alter the spectrum of speech as a function of level.

The current investigation is necessarily limited in scope, focusing primarily on tests of an iPhone 5s in our laboratory using an AT&T carrier (GSM 4G) in the San Francisco Bay Area. Analyses of actual gunshots from a different, older-model cellular telephone are provided in a final section (details have been omitted due to privacy concerns). We have approached the understanding of these effects from the standpoint of a "black box" analysis, in which we provide a detailed description of the input and the resulting output. The analyses focus on differences between types of recordings in terms of dynamic range and linearity of levels. Such information can be important for characterizing the distance of sound sources relative to the microphone and for understanding the impact of recorded speech and non-speech sounds in forensic settings.
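One way to express "linearity of levels" in a black-box test is the slope of recorded level against input level, in dB per dB. The sketch below is not the authors' procedure: the function names are hypothetical, the level values are invented placeholders rather than measurements, and the distance calculation assumes an idealized point source in a free field.

```python
import math


def level_slope(input_db, output_db):
    """Least-squares slope of output level versus input level (dB per dB).

    A slope near 1.0 indicates a linear recording chain; a slope below 1.0
    indicates level compression.
    """
    n = len(input_db)
    if n != len(output_db) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_in = sum(input_db) / n
    mean_out = sum(output_db) / n
    sxx = sum((x - mean_in) ** 2 for x in input_db)
    if sxx == 0:
        raise ValueError("input levels must not all be identical")
    sxy = sum((x - mean_in) * (y - mean_out)
              for x, y in zip(input_db, output_db))
    return sxy / sxx


def free_field_level_change_db(d_near, d_far):
    """Level change when a point source moves from d_near to d_far.

    ASSUMPTION: free-field inverse-square spreading (no room reflections).
    """
    return -20.0 * math.log10(d_far / d_near)


# HYPOTHETICAL placeholder levels, not measured data.
input_levels = [50, 60, 70, 80, 90]          # source level, dB SPL
linear_out = [-50, -40, -30, -20, -10]       # recorded level, dBFS
compressed_out = [-30, -25, -20, -15, -10]   # recorded level, dBFS

slope_linear = level_slope(input_levels, linear_out)
slope_compressed = level_slope(input_levels, compressed_out)

# Doubling the source distance lowers the acoustic level by about 6 dB,
# but a compressing chain records a smaller change.
acoustic_change = free_field_level_change_db(1.0, 2.0)
recorded_change = slope_compressed * acoustic_change

print(f"slope (linear chain):     {slope_linear:.2f} dB/dB")
print(f"slope (compressed chain): {slope_compressed:.2f} dB/dB")
print(f"acoustic change on doubling distance: {acoustic_change:.2f} dB")
print(f"change as recorded by compressed chain: {recorded_change:.2f} dB")
```

The point of the illustration is that a recorded level difference can only be read as a distance difference once the slope of the recording chain is known.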

An admitted but not fatal limitation of this study is a lack of certainty regarding the technical aspects of the hardware involved, and how the signal was transformed by individual signal processing elements in the communication chain of the cellular telephone system, including those features caused by linear predictive coding, discontinuous transmission, voice activity detection, and the inclusion of so-called "comfort noise" (see e.g. [4] for an informative overview). In other words, we observe what occurs to the signal and its implications for forensic analysis, without analyzing the specific causes.

The frequency response characteristics of the voice mail recordings (AMR file structure) indicate a classic narrowband, fixed-rate telephony signal with a cutoff at around 4 kHz, consistent with (but not necessarily indicating) a G.711 mu-law codec. There are three microphones in the iPhone: one at the bottom, one next to the speaker above the screen, and one between the flash and the lens of the camera on the rear of the phone. Investigation indicates that different applications may access different microphones to optimize their functionality. For example, when the front-facing camera is used for recording video, the front-facing microphone is active. This microphone is deactivated when the camera is switched to rear facing, and the rear-facing microphone becomes active. The voice memo application appears to use all three microphones to varying degrees, with the bottom mic being the main source. When making voice calls, only the bottom mic actively transmits speech; however, the iPhone features a noise cancellation control system, and it is believed that at least one additional microphone is used for noise cancellation during voice calls when this feature is enabled.
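For readers unfamiliar with mu-law companding, the sketch below implements the continuous mu-law curve (mu = 255) that G.711 approximates. It is an illustration only: the actual G.711 standard uses a piecewise-linear 8-bit approximation of this curve, the function names are hypothetical, and nothing here establishes which codec the voice mail system used. A cutoff near 4 kHz is what one would expect from the 8 kHz sampling rate typical of narrowband telephony.

```python
import math

MU = 255  # companding constant used for mu-law in North American telephony


def mu_law_compress(x):
    """Continuous mu-law compression of a sample in the range [-1.0, 1.0]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("sample must lie in [-1.0, 1.0]")
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress for a companded value in [-1.0, 1.0]."""
    if not -1.0 <= y <= 1.0:
        raise ValueError("companded value must lie in [-1.0, 1.0]")
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Low-level samples are boosted far more than high-level samples,
# so the mapping between input and coded amplitude is non-linear.
for sample in (0.001, 0.01, 0.1, 1.0):
    print(f"input {sample:>6}: companded {mu_law_compress(sample):.3f}")
```

Because quantization is applied to the companded value, quiet sounds keep finer resolution than loud ones, which is one reason level relationships in telephony recordings need not mirror the acoustic levels.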

Durand R. Begault received a PhD in Computer Audio from UC San Diego in 1988 and an MFA from Mills College in 1981. He has been associated with the Audio Forensic Center and Charles M. Salter Associates, Inc. since 1996 and has worked as an expert witness, legal consultant, and acoustical consultant on a variety of audio-related cases since 1988. Dr. Begault also holds a research position at NASA Ames Research Center and is an adjunct faculty member at McGill University. He is the author of several U.S. patents and is qualified as an expert witness in both Federal and State Courts. Dr. Begault has testified in over 40 cases involving a variety of audio technology and forensic audio matters.

©Copyright - All Rights Reserved

