> Conversion from audio/WAV/MP3 to MIDI (article created 30/06/2006)
Conversion from audio/WAV/MP3 to MIDI
A comparison of the best software
This article should simultaneously inform those of you looking to buy software to transcribe music audio to MIDI, and also those who just wish to see the current state of the art. We'll begin with some background, and then rate four of the programs on the market (one is free) to see which is best.
Converting from a waveform to note data (such as MIDI) has been a subject of research since the 1960s. Although faithful conversion is now possible with monophonic audio (e.g. a solo singer or one-fingered piano player), the accuracy of polyphonic conversion (single instrument - many notes) is far from perfect. For example, a converted piano sonata by Mozart will have notes that are missing, repeated or off-time, along with wrong notes added.
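To make the monophonic case concrete, here is a minimal sketch in Python of how a single steady tone could be turned into a MIDI note number. It is not the method used by any of the programs reviewed here. The function names (`freq_to_midi`, `detect_pitch`), the 8000 Hz sample rate and the 80-1000 Hz search range are all assumptions chosen for illustration. The pitch is estimated with a plain autocorrelation, and the result is mapped to MIDI using the standard convention that A4 (440 Hz) is note 69.

```python
import math

def freq_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def detect_pitch(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental as the lag with the strongest autocorrelation.

    fmin and fmax (illustrative defaults) bound the search range in Hz.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Multiply the signal by a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# A synthetic "one-fingered" test tone: a pure 440 Hz sine wave.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(1024)]

estimate = detect_pitch(tone, sample_rate)
note = freq_to_midi(estimate)
print(f"Estimated {estimate:.1f} Hz -> MIDI note {note}")
```

The estimate is only as fine as a whole number of samples allows, so it lands a few Hz away from 440, but that is close enough to round to the right note. Real recordings add vibrato, noise and changing notes, and a second simultaneous note breaks the "one strongest lag" assumption entirely.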
That's nothing compared to the holy grail of signal analysis in artificial intelligence though. If we look at a typical pop song, or a full piano concerto with multiple instruments and complex timbres, we're way off from finding a decent algorithm for computers to extract pitch and instrument data from polyphonic+multitimbral music. This is somewhat curious, considering the human brain can perform the feat with relative ease.
Of course, such research is vital for many purposes, including the automatic classification, indexing, searching, and analysis of music, not to mention the advancement in mathematics and signal processing. Other possibilities include translating MP3 to traditional score, and individual instrument extraction to use in a new composition.
After wading through various programs on the internet, some of which share the same engine (and some of which crash), we filtered the results down to four contenders:
AKoff Music Composer (v2.0) $29
Intelliscore Polyphonic WAV to MIDI Converter (v6.3) $79
AmazingMIDI (v1.70) FREE
TS-AudioToMIDI Realtime Converter (v3.3) $34.99 / WIDI - Music Recognition System (v3.3 Build 587) $49.9
NB: WIDI - Music Recognition System has more or less the same engine as TS-AudioToMIDI Realtime Converter, though there are subtle differences. TS-AudioToMIDI is, of course, cheaper, though it lacks the ability to keep multiple MIDIs open at once on screen. Also, WIDI contains a few more pitch detection algorithms and settings.
Other software was excluded for various reasons. Digital Ear Real-Time 4.01 only allowed two seconds (!) of demo time - far from enough to make a proper judgement of accuracy. Solo Explorer 1.0 only allowed monophonic conversion, and 7Canaries uses the same engine as WIDI, but with fewer features in the conversion preferences.
Meet the music!
Below, we'll be looking at five sample MP3s* with which to put the programs through their paces.
Bread and Butter
Yoshi's Island - solo melody extracted (SNES platform, Composer: Koji Kondo, Publisher: Nintendo)
Beethoven's Für Elise
Casio CT-700 demo (Composer: ?)
Mozart's Symphony No. 40, K550
Over the Frozen Sea - 2nd BGM (From 'Polystars' arcade game, Publisher: Konami)
Subtle timbres +
The kitchen sink!
Now time to test them out! Note that we're looking strictly at the conversion quality to MIDI as opposed to the cost or program features. Throughout, we have endeavoured to always carefully choose the preferences to maximise the quality of a conversion, and you'll find this info available at the end of this page. Finally, as well as the ratings, we have given example MP3s so you can decide for yourself how good they are.
Click each score out of ten to download the mp3 conversion.
So there we have it! If you want the most accurate MP3 to MIDI conversion, then WIDI - Music Recognition System (or its close cousin - TS-AudioToMIDI Realtime Converter) seems to be the way to go.
Why is it so difficult to extract pitch and instrument data from audio?
As we have seen, even the WIDI software often misses the mark when translating audio data to raw MIDI data, but why should this be the case? The problem is how sound is encoded into a sound wave. Imagine multiple sine waves for each instrument (fundamental + harmonics), and then multiple instruments. They're all mixed together, and before long it becomes incredibly difficult to distinguish between instruments visually in the waveform. The only way is to mix all possible combinations of instruments (+ initial offsets) together, and see if the resulting wave is what is in the signal. But for that you need to have the original instruments! So we have to go even further, and look for similarities between different parts of the signal, and extract all the instruments first. This is, as you might imagine, a sort of chicken and egg scenario. You need the instruments before you can get at the note data, but on the other hand, you initially need the note data to easily separate and extract all the instruments.
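The "fundamental + harmonics" problem can be illustrated with a few lines of Python. This is a sketch under simplifying assumptions, not an analysis of real audio: each note is modelled as an ideal instrument whose partials sit at exact whole-number multiples of the fundamental, only the first eight partials are considered, and the fifth is tuned to an idealised 330 Hz rather than its equal-tempered value. The helper names `harmonics` and `shared_partials` are made up for this example.

```python
def harmonics(fundamental_hz, count=8):
    """Frequencies of the first `count` partials of an ideal harmonic tone."""
    return [fundamental_hz * k for k in range(1, count + 1)]

def shared_partials(f1, f2, count=8, tolerance_hz=1.0):
    """Pairs of partials from two notes that land on (nearly) the same frequency."""
    return [(x, y)
            for x in harmonics(f1, count)
            for y in harmonics(f2, count)
            if abs(x - y) <= tolerance_hz]

# A3 (220 Hz) played together with the A an octave above (440 Hz).
octave = shared_partials(220.0, 440.0)

# A3 played together with an idealised fifth above it (330 Hz).
fifth = shared_partials(220.0, 330.0)

print("Octave overlaps:", [x for x, _ in octave])
print("Fifth overlaps: ", [x for x, _ in fifth])
```

Under these assumptions, the first four partials of the upper A all coincide with partials of the lower A, so in the mixed signal the two notes together look much like one lower note with louder even harmonics. The fifth shares fewer partials (660 Hz and 1320 Hz here), but those still have to be divided between the two notes somehow, and every extra note or instrument adds more collisions of this kind.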
And finally, a challenge...
To any budding programmers out there, I challenge you to beat these benchmarks! (I know I'm going to try once I finish my degree). The rewards will be recognition, and a great journey into the abstract realm of mathematics and sound analysis. Plus you'll make it onto this page!! Can't get cooler than that!
Technical information for conversions:
A lot of the time, the best conversions were produced by sticking with the default preferences, but now and then, we had to change them to get the most out of the software. The following details are provided for those who wish to experiment more with the software:
WIDI "Correlators (Monophonic)"
WIDI (Velocity range - half decrease)
AmazingMIDI1 (Tone files: sine.wav. Increase min relative+analysis & decrease min note length)
AmazingMIDI2 (Tone files: piano0.wav. Increase min relative+analysis & decrease min note length)
TS-Audio (lower threshold)
intelliscore ("note on" unticked)
amazing (rightmost "minimum analysis" & "minimum relative")
akoff (Sound type: Piano)
WIDI1 (Time tick: 30, Pitch detection: Sensors)
WIDI2 (Time tick: 30, Pitch detection: Advanced Sensors)
TS-Audiotomidi (Minimal Duration - : 80ms)
akoff (Sound type: Piano)
AmazingMIDI (Tone File: sine.wav, minimum analysis: -60dB, minimum relative: -30dB)
WIDI1 (Pitch detection: Advanced Sensors, Velocity range: quarter decrease)
WIDI2 (Pitch detection: FFT, Time tick: 20, Minimal note duration: 60ms, Velocity range: half decrease)
WIDI1 - Advanced sensors (Time tick: 30ms)
WIDI2 - FFT (Time tick: 30ms)
Converting MIDI to traditional music score - Which programs best convert from MIDI to sheet music? We cover 25 notation programs, including Sibelius, MidiNotate, and Finale, and rank them.
MIDI Transform - A simple web applet to edit MIDI music files. Upload a file and change the volume, speed, instruments, key, and (especially interesting and unique to this software) the scale's mode. Listen to Paul McCartney's 'Yesterday' or Mozart's 'Eine kleine Nachtmusik' in a minor key!
A crash course on the standard MIDI specification - A quick start to programming and manipulating raw MIDI data (at the byte level)
All research on this page is copyright 2006 onwards Daniel White.
If you wish to duplicate any of the information from this page, please contact me for permission.
* These low bit-rate quality music clips are believed to constitute Fair Use, and are used only for the purposes of program comparison.
The copyright of course belongs to the original owners. Should you wish to hear the full versions, and if they're still in print, then buy the originals!