by John Q. Walker*

Imagine hearing great musicians of the past or present play today – recreating any recording they ever made! At Zenph Studios, we are building the technology to re-create music performances live, starting from audio recordings.

Zenph is creating building blocks for the musical equivalent of familiar visual software. Our first offering will be a service that is like OCR (optical character recognition) for piano recordings. We take piano recordings and convert them back into the precise keystrokes and pedal motions that were used to create them. This is done in a new format, high-definition MIDI (for Musical Instrument Digital Interface), which can be played back with phenomenal realism on corresponding computer-controlled grand pianos. Horowitz, Glenn Gould, and Thelonious Monk can literally play “live” again.

Capturing and Recreating Fine Nuances

The problems we are working on are known technically as “automatic transcription” or “WAV to MIDI.” A high level of precision is needed to match the ultra-fine gradations of a musician’s touch. As a key or pedal is pressed, every millisecond of its timing and every micropressure of its movement is measured with fiber optics and captured in computer files. Zenph is the only implementer worldwide of Yamaha’s spec for high-definition MIDI, offering ten times the precision being used by others. Musicians who have heard themselves played back using high-definition MIDI praise its incredible realism. Zenph’s high-definition MIDI software processed all the files used in the judging of the International Piano-e-Competition in 2004.
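To see why extra resolution matters, consider how the force of a keystroke is stored. Standard MIDI records key velocity in 7 bits, or 128 levels. The sketch below compares that with a hypothetical 10-bit encoding (1,024 levels). The 10-bit figure, the function names, and the sample value are assumptions chosen for illustration, not a description of Yamaha's spec or of Zenph's file format.

```python
def quantize(value, bits):
    """Round a normalized value in [0.0, 1.0] to the nearest of 2**bits levels."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be in [0.0, 1.0]")
    top = (1 << bits) - 1  # highest integer code, e.g. 127 for 7 bits
    return round(value * top) / top


def worst_case_error(bits):
    """Largest possible rounding error: half the spacing between levels."""
    return 0.5 / ((1 << bits) - 1)


touch = 0.4321               # hypothetical normalized key velocity
coarse = quantize(touch, 7)  # standard MIDI velocity resolution
fine = quantize(touch, 10)   # assumed higher-resolution encoding (illustrative)
```

Comparing `abs(coarse - touch)` with `abs(fine - touch)` shows how much of the original touch survives each encoding; the finer grid leaves a much smaller rounding error.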

When the team at Zenph Studios heard how good the high-definition MIDI was – good enough to be at the heart of a piano competition – they asked themselves what it would take to hear great artists of the past play again. The answer came in the form of a massive body of “signal processing” software, capable of taking the sound waves of an audio recording and turning them into a precise computer description. We also undertook a deep study of how pianists of the past actually played, measuring their movements with very fine precision and reconstructing what they commonly did using new families of equations.
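As a rough illustration of what “turning sound waves into a computer description” means at the smallest scale, the sketch below finds which piano key best matches a single pure tone. It is a toy under stated assumptions, not Zenph's method: the sample rate, the function names, and the synthetic test tone are all invented for the example, and a real recording adds overtones, overlapping notes, pedaling, and noise that this approach does not handle.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this toy example)


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def power_at(samples, freq):
    """Signal power at one frequency, found by correlating with a cosine and a sine."""
    re = im = 0.0
    for n, x in enumerate(samples):
        angle = 2.0 * math.pi * freq * n / SAMPLE_RATE
        re += x * math.cos(angle)
        im += x * math.sin(angle)
    return re * re + im * im


def strongest_note(samples):
    """Return the piano key (MIDI notes 21 to 108) whose frequency carries the most power."""
    return max(range(21, 109), key=lambda note: power_at(samples, midi_to_hz(note)))


# Synthetic test tone: 0.1 s of a pure 440 Hz sine wave.
tone = [math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(1600)]
detected = strongest_note(tone)  # MIDI note number of the best-matching key
```

Probing only the 88 piano-key frequencies keeps the example short. Recovering a full performance also requires each note's onset time, loudness, and release, which is where the precision discussed above comes in.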

How Is This Different From Digital Remastering?

In digital remastering, the engineer is still working in the acoustic domain, manipulating the sound waves. It’s an easy place to do equalization (for example, increasing or decreasing bass or treble), change balance among performers, alter the dynamic range, add reverb, or clean up extraneous noises.

What we’re doing is literally recreating original performances. It is as if the performers were once again performing in exactly the same way they did for the original recordings. Their finger and foot motions are regenerated precisely in the form of computer data, used by the computer-controlled piano to recreate the same human performances without loss of quality. Many improvements can thus be made in a new re-recording:

better piano (its timbre or richness)
better piano tuning (particularly individual out-of-tune strings)
better piano voicing (how the hammers hit the strings)
better room acoustics
less background noise – no interruptions from cars, coughs, airplanes, etc.
better microphones, more (or fewer) microphones
better microphone placement
better recording equipment
recorded at a better (higher) bit rate

This can become, essentially, a new archival medium. As years pass, the performance can be re-recorded, as enhancements or improvements in any or all of the above are achieved.

A Century of Aging Recordings

There are 100 years of piano recordings in the vaults of the recording companies and in private collections. Many great recordings have never been released because they were marred in some way that made them substandard. Live performances are often unattractive to release because of background noises or out-of-tune piano strings. They also may never have been released because they were recorded off the radio or on cassette recorders. Similarly, many wonderful studio recordings have never seen release due to instrument or equipment problems during the sessions. The challenge is bringing this older audio material forward, and that is where we can help. We can bring these rarely heard treasures back to life, to be re-recorded for modern release.

Implications for Music Production and Listening

But the implications go further. Imagine musical software that is like Photoshop. Musicians or recording engineers could take high-definition MIDI performances and work with them in their computers. Notes, phrasing, or pedaling could be touched up. Software could make the performance more delicate or “emotional,” for example. We are now able to see and study performances as high-resolution computer data – literally seeing what our brains and emotions have reacted to for centuries. This opens a world of opportunity for creating natural-behavioral algorithms – what is the equation for “slightly happier”?

We started with solo piano music because of the high quality of the hardware for playback. These same techniques can readily be applied to other instruments as well. How they are played back will evolve swiftly through the coming years as the qualities of virtual instruments – and robotic players – improve.

Consider also the extraction of “artistic DNA.” What were the distinctive things Horowitz did that made him unique as a performer? We can compare his performances to the original scores – essentially “hold them up to the light” in the computer – and build a software template for Horowitz that might be applied to any musical score.
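A minimal sketch of that “hold them up to the light” comparison follows. It assumes the score and the performance are already aligned note for note, and it reduces style to two numbers per note: a timing shift and a loudness ratio. The `Note` class, both function names, and all of the data are hypothetical, and a real style model would need to capture far more (pedaling, voicing, phrase shape) and generalize to scores of any length.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int       # MIDI note number
    onset: float     # start time in seconds
    velocity: float  # normalized loudness, 0.0 to 1.0 (nonzero in the score)


def extract_template(score, performance):
    """Per-note (timing shift, loudness ratio) of a performance relative to its score.

    Assumes both lists hold the same notes in the same order.
    """
    if len(score) != len(performance):
        raise ValueError("score and performance must align note for note")
    return [(p.onset - s.onset, p.velocity / s.velocity)
            for s, p in zip(score, performance)]


def apply_template(score, template):
    """Impose the extracted deviations on another score of the same length."""
    return [Note(s.pitch, s.onset + shift, min(1.0, s.velocity * ratio))
            for s, (shift, ratio) in zip(score, template)]


# Hypothetical data: a "flat" score and a slightly rushed, crescendo performance.
score = [Note(60, 0.0, 0.5), Note(64, 0.5, 0.5), Note(67, 1.0, 0.5)]
played = [Note(60, 0.0, 0.4), Note(64, 0.45, 0.5), Note(67, 0.9, 0.6)]
template = extract_template(score, played)

# A different three-note score, rendered with the same deviations.
other_score = [Note(62, 0.0, 0.5), Note(65, 0.5, 0.5), Note(69, 1.0, 0.5)]
styled = apply_template(other_score, template)
```

Here `styled` keeps the pitches of `other_score` but inherits the rushed timing and the crescendo of `played`, which is the idea of a transferable style template in its simplest form.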

This opens the remarkable opportunity for creating models of an artist’s style or creating style models from scratch. In fact, new playback devices (unlike today’s MP3 players) could be built to hold digitized music scores – and listeners could then download and exchange style templates that would let them change how the music is heard. As the 21st century progresses, listeners will get to control how recordings sound – and how the music is played, even after the fact.

Audio Equivalent of CG in Films

We see a remarkable opportunity for using digital techniques to reinvent the art of music interpretation, performance, and recording. Think of the analogies in video. The entertainment industry uses three-dimensional CG (computer graphics) in many aspects of movie production. Software and equations for describing natural behavior in the visual domain are being developed rapidly for films. A groundbreaking example occurred in the early 1990s: the programmers who animated Jurassic Park had to figure out the “code” for animals’ natural gaits, how muscles moved in relation to a skeleton, and how skin reflected light. CG manipulation is now used in nearly every commercial film or ad. The analogous set of problems in the domain of music remains nearly untouched. To learn more about this remarkable technology, visit http://www.zenph.com/.

And for a firsthand demonstration of this new technology, readers are urged to attend a recital by Mei-Ting Sun on Thursday, May 19, in Raleigh’s Fletcher Opera Theater. The artist captured the 2002 International Piano-e-Competition and more recently garnered first prize in the National Chopin Piano Competition – he will represent the USA in the International Chopin Competition in Warsaw in September 2005. His program in Raleigh will include preludes by Robert Cuckson and Chopin, Scriabin’s Sonata No. 3, and Liszt’s spectacular transcription of the Overture to Wagner’s Tannhäuser. Glenn Gould and Alfred Cortot will be on hand, too, figuratively speaking, as Zenph Studios demonstrates its latest musical re-creations. For details, see CVNC’s Triangle Chamber Music page or the website of the Raleigh Chamber Music Guild.

Copyright ©2005 Zenph Studios, Inc.

*Note: Walker is President & CEO of Zenph Studios.

Note 2: For another perspective on this new development, see Mick Hamer’s article in New Scientist Magazine at http://www.eurekalert.org/pub_releases/2005-04/ns-ief042005.php.