shadox has asked for the wisdom of the Perl Monks concerning the following question:
Hi guys, I am working on a project and need a little help. One part of the project is to compare 2 wav files: I need to get a match when the 2 files contain the same sound or the same music. The files will be small (about 5 seconds). I looked on CPAN but didn't find anything that could help. Has any of you done this before, or which module can I use for it?
Update: This project is not for speech recognition; it is for comparing 2 small wav files with music.
___________________________________________
Optimus magister, bonus liber
(jcwren) Re: How to compare 2 wav files.
by jcwren (Prior) on May 27, 2002 at 20:52 UTC
|
This is a non-trivial task, although not impossible.
Basically, you'd need to run a sliding window FFT/DFT (Fast Fourier Transform/Discrete Fourier Transform) and look at the spectral energy density of various frequency groups. If the energy density is the same for both samples at the same time, you could consider them the same.
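To make the sliding-window idea concrete, here is a minimal sketch in Python (used only for brevity; the steps port directly to Perl). It cuts each clip into overlapping windows, takes a naive DFT of each window, and compares the fraction of energy per frequency band. The function names, the 256-sample window, the 128-sample hop, and the 8 bands are all assumptions chosen for illustration, and the naive DFT is far too slow for real clips.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def band_energies(samples, window=256, hop=128, bands=8):
    """One row per window: fraction of spectral energy falling in each band."""
    # A Hann window reduces leakage caused by cutting the signal into frames.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (window - 1)) for i in range(window)]
    rows = []
    for start in range(0, len(samples) - window + 1, hop):
        frame = [samples[start + i] * hann[i] for i in range(window)]
        power = [m * m for m in dft_magnitudes(frame)[1:]]  # skip the DC bin
        per_band = len(power) // bands
        energy = [sum(power[b * per_band:(b + 1) * per_band]) for b in range(bands)]
        total = sum(energy) or 1.0
        rows.append([e / total for e in energy])  # normalise away loudness
    return rows


def spectral_distance(a, b, **kwargs):
    """Mean absolute difference between band-energy rows (0.0 = identical)."""
    rows_a, rows_b = band_energies(a, **kwargs), band_energies(b, **kwargs)
    n = min(len(rows_a), len(rows_b))
    if n == 0:
        raise ValueError("clips are shorter than one analysis window")
    diff = sum(abs(x - y) for ra, rb in zip(rows_a, rows_b) for x, y in zip(ra, rb))
    return diff / (n * len(rows_a[0]))
```

A small distance suggests the two clips have the same energy distribution at the same times. Where to put the cut-off between "same" and "different" has to be found by experiment.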
FFT/DFTs are not difficult to implement, and while I'd tend to do it in C, I think Perl should do it pretty well. There are dozens of FFT/DFT implementations on the 'net. Decoding the .WAV format is pretty simple, also.
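As an illustration of how little is needed to decode a .WAV file, the following Python sketch reads 16-bit PCM data into a list of floats using the standard `wave` module. The name `read_wav` and the decision to average the channels down to mono are assumptions made for this example; compressed or non-16-bit files are simply rejected.

```python
import io
import struct
import wave


def read_wav(source):
    """Return (sample_rate, samples) with samples as mono floats in [-1, 1).

    `source` may be a filename or a binary file-like object.
    Assumes 16-bit PCM; other sample widths are rejected.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Average interleaved channels down to mono, then scale to [-1, 1).
    samples = [
        sum(values[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(values), channels)
    ]
    return rate, samples


# Round trip through an in-memory file so the example needs nothing on disk.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(struct.pack("<4h", 0, 16384, -16384, 32767))
buffer.seek(0)
rate, samples = read_wav(buffer)
```

With a real file, the call would simply be `read_wav("clip.wav")`, where the filename is a placeholder.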
About the only problem I can really see is if you have two .WAV files sampled at different rates (e.g., one at 44.1 kHz and the other at 10 kHz). In that case, you'd have to resample one to match the other. Still not overly difficult, but an added step.
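The resampling step could look roughly like the Python sketch below, which uses plain linear interpolation. This is a deliberately crude stand-in: the name `resample_linear` is made up for the example, and a proper resampler would low-pass filter before lowering the rate to avoid aliasing.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighbouring samples.

    Crude on purpose: when lowering the rate, a real resampler would
    low-pass filter first to avoid aliasing.
    """
    if not samples:
        return []
    n_out = max(1, int(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate       # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For the case mentioned above, the 44.1 kHz clip would be brought down with `resample_linear(samples, 44100, 10000)` before the two clips are compared.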
Welcome to DSP-101
--Chris
| [reply] |
|
UPDATE: late at night and didn't check the date to see that it was a zombie thread... post is still relevant in case someone looks...
jcwren wrote: "FFT/DFTs are not difficult to implement, and while I'd tend to do it in C, I think Perl should do it pretty well."
It's not difficult, but I wouldn't bother implementing FFTs myself at all anymore, except as an exercise in implementing FFTs or if I needed to own the code. There are free libraries available, such as FFTW, that are already debugged, documented, and reasonably optimized. FFTW is pretty speedy: about 0.06 seconds to do a 2048x2048 2D-FFT on a new-ish (i7) MacBook.
| [reply] |
|
When and where should FFT be used?
| [reply] |
Re: How to compare 2 wav files.
by arunhorne (Pilgrim) on May 27, 2002 at 23:06 UTC
|
Fast Fourier Transform is a good option but you might also consider Hidden Markov Models (HMMs). These statistical models are commonly used for speech recognition tasks and have the advantage over FFT that they are less likely to be fooled by an off-target sample point. HMMs build a probabilistic model of a pattern (in this case a sound wave) and will provide you with a likelihood that the sound wave it is given matches the training set.
To apply the idea to your problem, you use the first wave file as a training set and the second as a test set. If the HMM returns a probability for the test set greater than, say, 0.9, consider them equal. This probabilistic approach will serve you well in this case. For example, with FFT, two identical wave files recorded at different frequencies may not be matched, whereas an HMM should be able to encapsulate this difference.
Here are some links to get you started. However, be aware that what you are attempting is non-trivial, as jcwren points out, and many people devote their entire degrees/PhDs to this area... would it be better if you just used a human? There are times when a computer isn't the best solution, and knowing when to recognise this can be key to many Artificial Intelligence tasks...
HMM Tutorial
HMMs for Speech
Hope this helps, or at least touches the tip of the iceberg.
____________
Arun
| [reply] |
Re: How to compare 2 wav files.
by graff (Chancellor) on May 28, 2002 at 02:43 UTC
|
Personally, I would not view this as a Perl question, nor
as a problem best solved using Perl -- except for building
any sort of "wrapper" utility that would make it easier to
use the existing tools that are available in C and C++.
For example,
http://www.isip.msstate.edu/ has a fairly comprehensive
set of signal processing tools (including an HMM toolkit).
These tend to prefer raw pcm data so you can use
SoX
to strip the WAV headers (it also does a lot of other useful
stuff -- you need it anyway).
A lot will depend on the scope and actual nature of your project:
how many files to compare, what criteria define "same" vs.
"not same", how confusable the samples are on these criteria,
what error rate is acceptable. If you're looking for cases
of two files that replicate the same portion of a single digital source with
little or no alteration, then DSP approaches are likely
to succeed quite well -- but any other condition will have a
measurable error rate on both "same" and "not-same" decisions.
Another approach to consider, if the job allows it, would be
to build a Perl/Tk interface that makes it very easy,
fast and efficient for a human to compare the audio files and make
the decisions.
Update: It's not at all clear to me that
HMMs are appropriate for classifying music data. The first
thing to try should probably just be comparing DFT vectors,
both "narrow band" (long analysis window) and "wide band"
(short window). I believe the ISIP toolkit includes a
vector quantization process, which will make the statistical
assessments easier. | [reply] |
Re: How to compare 2 wav files.
by thor (Priest) on May 27, 2002 at 22:49 UTC
|
As jcwren says above, DFT is probably the way to go. Make sure, however, that your clips have the same length. If you have the same sound that in one sample is 1/4 of a second longer than the other, and your sampling interval is 1/2 a second, then you will be taking completely different sample points and your algorithm will identify the sounds as different. That may be construed as a feature, however. | [reply] |
Re: How to compare 2 wav files.
by toma (Vicar) on May 29, 2002 at 04:59 UTC
|
Here is another approach that hasn't been mentioned
yet. No AI, no feature extraction, no FFT, and it
will work great! It is quite slow, though.
Imagine that you have a bunch of sound samples. These
samples are either positive or negative. If you take
another copy of the samples and slide them across the
original samples, they will line up at one particular
instant in time. A nice way to get the computer to
"see" this moment is to multiply the samples from
each waveform together, sample by sample.
Then, add up the products. This works because the lined-up
samples will all turn into positive numbers (instead
of the random mix of positive and negative numbers
when they don't line up.) All these positive numbers
will add up to a really big positive number,
which is called a correlation spike.
This algorithm of sliding the samples across each other,
multiplying them sample-by-sample, and adding them
together is called cross-correlation (it is often loosely
called convolution, which is the same operation with one
of the signals reversed in time). It works great but
requires a large amount of computer power. If you
have a really hot machine, or you are patient,
it should work fine.
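The slide, multiply, and add procedure described above can be sketched in a few lines of Python. The names `cross_correlation` and `best_match` are made up for this example, and dividing the peak by the signal energies is an added assumption so that loud and quiet copies of the same clip score alike.

```python
import math


def cross_correlation(a, b):
    """Slide b across a; return {lag: sum of products} for every overlap.

    A lag of k means b[i] is lined up with a[i + k].
    """
    result = {}
    for lag in range(-(len(b) - 1), len(a)):
        total = 0.0
        for i, value in enumerate(b):
            j = i + lag
            if 0 <= j < len(a):
                total += a[j] * value
        result[lag] = total
    return result


def best_match(a, b):
    """Return (lag, score); score is the peak scaled into [-1, 1]."""
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if energy == 0.0:
        return 0, 0.0
    lag, peak = max(cross_correlation(a, b).items(), key=lambda item: item[1])
    return lag, peak / energy
```

A score close to 1.0 is the correlation spike. The double loop is what makes this approach slow: two 5-second clips at 44.1 kHz need on the order of 10^10 multiplications.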
A short-cut for this procedure takes advantage of the
Fast Fourier Transform (FFT). This amazing algorithm allows
you to turn the sliding multiply-and-add in the time domain
into a simple multiplication in the frequency domain (for
correlation, one spectrum is multiplied by the complex
conjugate of the other). To
get the benefits of this algorithm, you will need
to learn about window functions, the effect of sample
rates, and some other gory details. It will be
*much* faster, but also much more work to learn
how to use.
To solve your particular problem, you will need to
correlate each possibly new sample against each song
in your collection, looking for a match. For the
FFT algorithm, you can compute the FFT of each song only
once, and store it. These stored FFTs are
multiplied by the complex conjugate of the FFT of the new
song, and the product is transformed back with an inverse
FFT. You get the same correlation spike this way as when
you slide and multiply the samples directly. If you get
the huge spike by either route, the songs are the same.
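Here is a hedged Python sketch of the FFT route, using a small textbook radix-2 FFT so that it runs without any extra modules. The function names are invented for the example. In practice a library FFT (PDL in Perl, for instance) would replace the hand-written one.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is left unscaled; the caller divides by n.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_cross_correlation(a, b):
    """Return {lag: value}, where lag k lines b[i] up with a[i + k]."""
    size = 1
    while size < len(a) + len(b) - 1:   # zero-pad so the result does not wrap
        size *= 2
    spec_a = fft(list(a) + [0.0] * (size - len(a)))
    spec_b = fft(list(b) + [0.0] * (size - len(b)))
    # Multiplying by the complex conjugate gives correlation, not convolution.
    product = [x * y.conjugate() for x, y in zip(spec_a, spec_b)]
    corr = [v.real / size for v in fft(product, inverse=True)]
    lags = {k: corr[k] for k in range(len(a))}
    lags.update({-k: corr[size - k] for k in range(1, len(b))})
    return lags
```

The lag with the largest value, `max(lags, key=lags.get)`, is where the two clips line up best. Storing `spec_a` for every known song is the "compute once and store" step from the paragraph above, provided all clips are padded to the same size.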
The sliding FFT that jcwren mentions solves an
important problem with the FFT. Imagine that two
songs start with identical notes, but they are
played in a different order. The magnitude spectrum
of an ordinary FFT taken over the whole clip looks
almost the same for these two songs. The sliding
FFT will fix this. The sliding FFT is yet more
complicated than the ordinary FFT, so I wouldn't
recommend it as a first project in signal processing.
For doing this type of number-crunching in perl I use the PDL
modules. They are well worth the trouble to install.
Update: See
Analyzing WAV Files with Perl for FFT usage.
It should work perfectly the first time! - toma | [reply] |
|
I have to do a real FFT on a wav file. How do I give the wav file as input to the realFFT method?
| [reply] |
|
Hi,
It's really not that easy, but such a tool exists:
http://www.sevana.fi/audio_speech_codecs_quality_analysis.php
It can compare two audio files and give % of similarity whether you want to test your codec quality or just compare two audio files (like original and received at destination of VoIP channel). It's also available for Linux.
Hope this helps.
Regards,
Vallu
| [reply] |
Re: How to compare 2 wav files.
by jotti (Scribe) on May 28, 2002 at 20:43 UTC
|
Of course there is always the trivial case where both wave files originally come from the same sample. In that case we simply search for similar byte patterns. But, as most comments seem to take for granted, shadox is probably talking about two different samples, say two mikes picking up the same sound source, right? | [reply] |
|
Well not totally right :)
I will have some wav files about 5 seconds in length (each file), and a friend's server will send me a wav file (5 seconds too). Then I will compare his file with the files I have and say "That song is ......" or "I don't know that song", and my program will "learn" that song.
This is just a learning project (not a college or work project (: ) and I really appreciate all your help, guys.
___________________________________________
Optimus magister, bonus liber
| [reply] |
|
Okay, if you and your friend agree on which 5-sec
portion to compare (e.g. always use the first 5 sec, not
counting any initial silence that might be present), then
you have a fairly good chance of building a DFT-based
discriminator/identifier with a pretty good success rate.
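Agreeing on "the first 5 seconds, not counting initial silence" could be implemented along the lines of this Python sketch. The function names and the 0.02 amplitude threshold (for samples scaled to the range -1 to 1) are assumptions. A noisy recording would need a higher threshold or an energy-based detector.

```python
def trim_leading_silence(samples, threshold=0.02):
    """Drop samples from the start until one exceeds `threshold` in magnitude."""
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            return samples[index:]
    return []


def first_seconds(samples, rate, seconds=5.0, threshold=0.02):
    """Return the first `seconds` of audio after any initial silence."""
    trimmed = trim_leading_silence(samples, threshold)
    return trimmed[:int(rate * seconds)]
```

Running both your own clips and your friend's clip through the same `first_seconds` call keeps the comparison windows aligned.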
In this case, Perl could be very
handy for driving the DFT/VQ engine on your friend's
audio file, doing data reduction on that output, and
running or maybe even computing the suitable statistics to
identify a "best match" in your local database of first-5-sec
snippets.
Just building your local database of "song signatures" will
be a very instructive exercise, and you can use it for both
"training" and "testing". I could go on... but it would all
be speculative, and you should work it out for yourself.
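One possible shape for the "song signature" database is sketched below in Python. It assumes each clip has already been reduced to a fixed-length list of numbers (a signature), and it uses cosine similarity with an arbitrary 0.9 cut-off. The similarity measure, the cut-off, and all names here are assumptions for illustration, not a tested recipe.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def identify(signature, database, threshold=0.9):
    """Return the best-matching title, or None if nothing is close enough."""
    best_title, best_score = None, threshold
    for title, known in database.items():
        score = cosine_similarity(signature, known)
        if score >= best_score:
            best_title, best_score = title, score
    return best_title


def identify_or_learn(signature, title_if_new, database, threshold=0.9):
    """Identify a clip; if it is unknown, store it under `title_if_new`."""
    match = identify(signature, database, threshold)
    if match is None:
        database[title_if_new] = list(signature)
    return match
```

A return value of `None` corresponds to "I don't know that song", after which the new signature is stored so the program has "learned" it.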
| [reply] |
|
If you already have the file on your server, is it the same sample, e.g. extracted from the same music CD with the same sample rate? Or might it be two different samples from two different LP records? Or could it be, e.g., two different recordings of Beethoven's 5th?
| [reply] |
|
|
Re: How to compare 2 wav files.
by vallu (Initiate) on Dec 29, 2009 at 09:58 UTC
|
Hi,
Have you tried AQuA Wideband?
http://www.sevana.fi/voice_quality_testing_measurement_analysis.php
Judging by the technology presentation, comparing audio is not a trivial task at all. Although this software is not primarily intended to find audio similarity, it does compare files and provides the percentage of their similarity. Besides, it's multiplatform, judging by the software and company blog: http://wordpress.sevana.fi | [reply] |
|
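This is possible with PHP; please try this one:
<?php
// Byte-for-byte comparison: detects identical files only,
// not two different recordings of the same music.
$audio1 = file_get_contents('audio1.wav');
$audio2 = file_get_contents('audio2.wav');
if ($audio1 === false || $audio2 === false) {
    exit("could not read input files\n");
}
// Use === so PHP never applies loose (numeric) string comparison.
echo ($audio1 === $audio2) ? "true" : "false";
?>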
| [reply] |