Hum to Sheet Music

Record yourself humming, singing, or playing a simple melody — get sheet music, MIDI, and MusicXML back in seconds. Runs entirely on your device. No upload, no account.

Record or Upload Audio

Record with your microphone or upload an audio file: WAV, MP3, OGG, M4A, or FLAC, mono or stereo, at any sample rate.

How it works

1. Record or upload a melody: hum, sing, whistle, or play a single-line instrument.
2. On-device AI (Spotify's Basic Pitch neural network via TensorFlow.js) detects pitch frames and note onsets. No audio ever leaves your device.
3. Notes are assembled into a melody, displayed as sheet music (via ABC notation + abcjs), and packaged as MIDI and MusicXML.
4. Download your files to import into Sibelius, MuseScore, GarageBand, Logic Pro, Finale, or any DAW.
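
To make step 3 more concrete, here is a minimal sketch of how detected notes could be written out as an ABC tune. The `Note` shape, the `midiToAbc` and `notesToAbc` names, the fixed sixteenth-note unit length, and the sharps-only spelling are all assumptions made for illustration; the tool's actual assembly logic is not described on this page. The sketch also ignores bar lines and the ABC rule that an accidental carries through the rest of its bar.

```typescript
// Hypothetical note shape: a MIDI pitch plus a duration counted in sixteenth notes.
interface Note {
  midi: number;       // 60 = middle C
  sixteenths: number; // duration in units of L:1/16
}

const PITCH_CLASSES = ["C", "^C", "D", "^D", "E", "F", "^F", "G", "^G", "A", "^A", "B"];

// Convert a MIDI pitch to an ABC pitch token (sharps only, no key-signature awareness).
function midiToAbc(midi: number): string {
  const name = PITCH_CLASSES[((midi % 12) + 12) % 12];
  const octave = Math.floor(midi / 12) - 1; // MIDI 60 -> octave 4
  if (octave >= 5) {
    // Lowercase letter for octave 5, plus one apostrophe per octave above it.
    return name.toLowerCase() + "'".repeat(octave - 5);
  }
  // Uppercase letter for octave 4, plus one comma per octave below it.
  return name + ",".repeat(4 - octave);
}

// Build a complete ABC tune: header fields followed by one line of notes.
function notesToAbc(notes: Note[], title = "Transcription"): string {
  const body = notes
    .map((n) => midiToAbc(n.midi) + (n.sixteenths === 1 ? "" : String(n.sixteenths)))
    .join(" ");
  return ["X:1", `T:${title}`, "M:4/4", "L:1/16", "K:C", body + " |]"].join("\n");
}

// Example: C4 and D4 as quarter notes, then E4 as a half note.
console.log(notesToAbc([
  { midi: 60, sixteenths: 4 },
  { midi: 62, sixteenths: 4 },
  { midi: 64, sixteenths: 8 },
]));
// Last line printed: C4 D4 E8 |]
```

The resulting text can be handed to an ABC renderer such as abcjs, which is what the page says it uses for display.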

The pitch-detection model is a compact convolutional neural network trained by Spotify Research on a large dataset of musical audio. It estimates, for each 11.6 ms frame, the probability that each of 88 MIDI pitches is present — enabling polyphonic (multiple simultaneous notes) transcription, though hum-to-sheet results are best for single-note melodies.
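
The 11.6 ms figure matches a hop of 256 samples at a 22,050 Hz sample rate (256 / 22050 ≈ 0.0116 s). Both values are assumptions taken from Basic Pitch's published defaults; this page does not state them. The sketch below shows that arithmetic, plus one simple way to reduce a frame-by-pitch probability grid to a single melody line by keeping the strongest pitch in each frame. The function names, the 0.5 threshold, and the choice of MIDI 21 (A0) for the lowest bin are illustrative assumptions, not the tool's actual post-processing.

```typescript
const SAMPLE_RATE = 22050; // assumed model sample rate in Hz
const HOP_SIZE = 256;      // assumed number of samples between frames
const LOWEST_MIDI = 21;    // assumed pitch of bin 0 (A0); 88 bins then reach MIDI 108

// Start time of a frame, in seconds.
function frameToSeconds(frame: number): number {
  return (frame * HOP_SIZE) / SAMPLE_RATE;
}

// For each frame, return the MIDI pitch with the highest probability,
// or null when no bin reaches the threshold (treated as silence).
function strongestPitchPerFrame(frames: number[][], threshold = 0.5): (number | null)[] {
  return frames.map((probs) => {
    let best = -1;
    let bestProb = threshold;
    probs.forEach((p, bin) => {
      if (p >= bestProb) {
        best = bin;
        bestProb = p;
      }
    });
    return best === -1 ? null : best + LOWEST_MIDI;
  });
}

// Toy input with 3 pitch bins per frame instead of 88.
console.log(strongestPitchPerFrame([[0.1, 0.9, 0.2], [0.1, 0.2, 0.3]])); // [ 22, null ]
```

Keeping only the strongest bin discards any simultaneous notes, which suits a hummed melody but not chords.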

Frequently asked questions

What kind of audio gives the best results?
Clear single-note melodies work best: humming, whistling, singing a melody on "la" or "na", or playing a flute, violin, or other monophonic instrument. Chords and complex harmonics make transcription harder — for polyphonic recordings, try the High onset sensitivity setting and accept that some notes may merge or split. Background noise and reverb reduce accuracy significantly, so record in a quiet room and keep the microphone close.
Does my audio get sent to a server?
No. The entire transcription pipeline runs inside your browser using WebAssembly and TensorFlow.js. The only network request is a one-time download of the ~10 MB model weights (served from a CDN, cached in your browser after the first use). Your microphone audio or uploaded file never leaves your device.
What do MIDI and MusicXML downloads give me?
The MIDI file (.mid) can be imported into any DAW or notation software — GarageBand, Logic Pro, Ableton Live, FL Studio, Cubase, or MuseScore. It carries the exact pitch and timing of each detected note. The MusicXML file (.xml) is the standard interchange format for notation software: Sibelius, Finale, Dorico, MuseScore, and Flat.io all import it, giving you an editable score with proper staves, clefs, and note values. The ABC file (.abc) is a lightweight text format supported by MuseScore, abcjs, and various online tools.
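
For reference, a MIDI file stores each pitch as a note number from 0 to 127. The sketch below uses the standard equal-temperament convention (note 69 = A4 = 440 Hz) to convert between note numbers, frequencies, and note names. The function names are illustrative and are not part of the tool.

```typescript
const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Frequency in Hz of a MIDI note number (equal temperament, A4 = 440 Hz).
function midiToFrequency(midi: number): number {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Nearest MIDI note number for a frequency in Hz.
function frequencyToMidi(hz: number): number {
  return Math.round(69 + 12 * Math.log2(hz / 440));
}

// Scientific pitch name, using the convention that MIDI 60 is C4.
function midiToName(midi: number): string {
  return NOTE_NAMES[midi % 12] + String(Math.floor(midi / 12) - 1);
}

console.log(midiToName(60), midiToFrequency(60).toFixed(2)); // C4 261.63
console.log(frequencyToMidi(440));                           // 69
```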
Why does the first run take longer?
The neural network model (~10 MB of weights) is downloaded from a CDN on your first visit and stored in your browser's cache. On later visits it loads from the cache almost immediately. Transcription itself takes up to roughly one second of processing per second of audio on a modern device.
What is the maximum recording length?
There is no hard limit, but transcription time scales linearly with audio length. A 30-second recording typically processes in 10–30 seconds depending on your device. Very long recordings (several minutes) may cause the browser tab to use significant memory during inference. For best results, keep recordings under 60 seconds and trim silence from the start and end.
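
Trimming silence can be done on the raw samples before transcription. A minimal sketch, assuming mono samples in the range -1 to 1 held in a `Float32Array`; the `trimSilence` name and the 0.01 amplitude threshold are illustrative choices, not settings taken from the tool.

```typescript
// Return a view of `samples` without leading and trailing samples whose
// absolute amplitude is below `threshold`. No audio data is copied.
function trimSilence(samples: Float32Array, threshold = 0.01): Float32Array {
  let start = 0;
  while (start < samples.length && Math.abs(samples[start]) < threshold) {
    start++;
  }
  let end = samples.length;
  while (end > start && Math.abs(samples[end - 1]) < threshold) {
    end--;
  }
  return samples.subarray(start, end);
}

const trimmed = trimSilence(new Float32Array([0, 0, 0.5, -0.3, 0, 0]));
console.log(trimmed.length); // 2
```

In a browser, the input could come from `AudioBuffer.getChannelData(0)` after decoding the recording. A real recording has a noise floor, so the threshold may need to be raised above 0.01 for the trim to take effect.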