- Xref: sparky comp.dsp:2131 comp.compression.research:156
- Path: sparky!uunet!wupost!sdd.hp.com!swrinde!network.ucsd.edu!qualcom.qualcomm.com!qualcom!rdippold
- From: rdippold@qualcom.qualcomm.com (Ron Dippold)
- Newsgroups: comp.dsp,comp.compression.research
- Subject: Re: Looking for telephone quality audio compression
- Message-ID: <rdippold.716077895@qualcom>
- Date: 9 Sep 92 22:31:35 GMT
- References: <BuBn7u.BFu@news.cso.uiuc.edu>
- Sender: news@qualcomm.com
- Organization: Qualcomm, Inc., San Diego, CA
- Lines: 38
- Nntp-Posting-Host: qualcom.qualcomm.com
-
- ja51359@uxa.cso.uiuc.edu (axelrod) writes:
- > I'm looking for an audio compression algorithm that will result
- >in telephone quality reproduction, i.e. 4 kHz bandwidth, limited dynamic
- >range, average S/N ratio.
- > I'm already familiar with using delta-fibonacci, delta-huffman
- >techniques, but I'm looking for a more lossy algorithm that will give
- >better compression results, more like on the order of 8:1 with 8-bit
- >samples.
- > How is the quality of CELP compression? I heard voices end up
- >sounding robotic. I'd like something that sounds natural.
-
- Our version of CELP, QCELP, sounds quite decent. If things aren't
- tuned just right, voices can get a "sharpness" to them. To my ear it
- sounds superior to standard telephone, and people I've called have been
- unable to tell whether I'm calling from the desk phone or the
- cellular phone unless we introduce plenty of noise into the system
- (at which point the voice starts sounding somewhat "bubbly" as the
- noise overwhelms our error correction).
-
- We output 192 bits per 20 millisecond frame, which works out to 9600
- bits (1200 bytes) per second, or 4.3 megabytes per hour of speech. In
- addition, we do voice activity detection and can produce half,
- quarter, and eighth rate frames. The voice activity factor of standard
- speech works out to about 0.6 with this method, which means that the
- resulting data is only 60% of the size it would be if we forced full
- rate mode: about 720 bytes per second, or 2.6 megs for an hour of
- speech.
-
- Given your sampling rate of 8000 Hz with 8 bit samples, uncompressed
- audio would be 28.8 megs for an hour of speech, so we're doing around
- 7:1 without even bothering with voice activity and about 11:1 with
- it, both including error correction.
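
The arithmetic behind these figures can be checked with a short Python sketch. This is only an illustration, not anything from the QCELP implementation: the function and constant names are hypothetical, a megabyte is assumed to be 10^6 bytes, and the 0.6 voice activity factor is applied as a flat scale on the full rate byte count, which is how the post uses it.

```python
# Figures quoted in the post.
FRAME_BITS = 192           # bits in one full rate frame
FRAME_MS = 20              # frame duration in milliseconds
VOICE_ACTIVITY = 0.6       # average voice activity factor
PCM_RATE_HZ = 8000         # sampling rate of the uncompressed audio
PCM_BYTES_PER_SAMPLE = 1   # 8 bit samples


def bytes_per_second(frame_bits, frame_ms):
    """Byte rate of a codec that emits frame_bits every frame_ms."""
    frames_per_second = 1000 / frame_ms
    return frame_bits * frames_per_second / 8


def megabytes_per_hour(byte_rate):
    """Storage for one hour at byte_rate, assuming 1 MB = 10**6 bytes."""
    return byte_rate * 3600 / 1e6


full_rate = bytes_per_second(FRAME_BITS, FRAME_MS)   # 1200.0 bytes/s
variable_rate = full_rate * VOICE_ACTIVITY           # about 720 bytes/s
pcm_rate = PCM_RATE_HZ * PCM_BYTES_PER_SAMPLE        # 8000 bytes/s

ratio_full = pcm_rate / full_rate          # compression at full rate only
ratio_variable = pcm_rate / variable_rate  # compression with voice activity

print(f"full rate:     {full_rate:.0f} B/s, "
      f"{megabytes_per_hour(full_rate):.1f} MB/h, {ratio_full:.1f}:1")
print(f"variable rate: {variable_rate:.0f} B/s, "
      f"{megabytes_per_hour(variable_rate):.1f} MB/h, {ratio_variable:.1f}:1")
print(f"8 bit PCM:     {pcm_rate} B/s, {megabytes_per_hour(pcm_rate):.1f} MB/h")
```

The ratios come out slightly under 7:1 and just over 11:1, matching the rounded numbers above.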
-
- We're doing all this in an ASIC, but it demonstrates that it's
- possible to get what you want with a version of CELP. At least it
- might be worth looking into.
- --
- I never made a mistake in my life. I thought I did once, but I was wrong.
-