AI accelerated sound restoration

Artificial intelligence has truly matured to a point where it can do mind-boggling things. Tony and I briefly touched on this in another forum section recently, and here’s a hint of what it’s able to do for us piano nerds.

In computer graphics there’s a technique called ray tracing which simulates natural lighting in scenes by simply letting each light source emit virtual photons, which are then traced as they bounce around and interact with surfaces in the scene - similarly to how light behaves in the real world. Tracking the interactions of millions of photons is tremendously computationally expensive however, which has prevented it from being used in real-time applications. Now… Nvidia has put AI on the problem, and trained a neural network to learn how a ray traced frame looks before it’s finished converging (i.e. a noisy image, since it’s made up from few photons) compared to the final result. The network can now take an incompletely converged noisy image - even one not present in the training set - and predict how it would have looked had the rendering finished, just like we can look at the image and “imagine” how it would look without the noise. The result is below, with the noisy frame fed to the network to the left and the AI de-noised output overlaid to the right.

Thing is - if this technique works on video, it will work on audio as well: just record a piano with 1927-level equipment and modern 2018 equipment simultaneously, and let the network learn how the piano sounds without the sonic limitations (though it will need hundreds and hundreds of hours of training data with current tech). This will allow a recording of Rachmaninoff’s to sound like it was recorded yesterday. Let me repeat that: this will allow a recording of Rachmaninoff’s to sound like it was recorded yesterday.
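
For the curious, here’s a toy sketch of the paired-recording idea. Big caveats: this is not Nvidia’s method and not a neural network. I’m assuming a crude fake "1927 chain" (a moving-average low-pass plus hiss) and swapping in a plain least-squares FIR filter where the network would go, just to show what learning from simultaneous old/new recordings means. All the names (`degrade`, `fit_restorer`, `restore`, `tone`, `demo`) are made up for illustration.

```python
import numpy as np

def degrade(clean, rng, noise_level=0.05, taps=9):
    """Assumed stand-in for old equipment: moving-average low-pass plus hiss."""
    kernel = np.ones(taps) / taps
    dull = np.convolve(clean, kernel, mode="same")
    return dull + noise_level * rng.standard_normal(len(clean))

def frames(signal, width):
    """Sliding windows of `width` samples, one row per window."""
    return np.lib.stride_tricks.sliding_window_view(signal, width)

def fit_restorer(old, clean, width=31):
    """Fit a linear filter that maps windows of the old recording
    to the centre sample of the matching clean recording."""
    X = frames(old, width)
    y = clean[width // 2 : width // 2 + len(X)]
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def restore(old, weights):
    """Apply the learned filter; output is len(weights) - 1 samples shorter."""
    return frames(old, len(weights)) @ weights

def tone(rng, n=8000, rate=8000):
    """Hypothetical 'piano-ish' test signal: three partials, random levels and phases."""
    t = np.arange(n) / rate
    return sum(rng.uniform(0.3, 1.0) * np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
               for f in (220.0, 440.0, 660.0))

def demo(seed=0):
    """Train on one paired take, evaluate on a different one.
    Returns (error of old recording, error after restoration)."""
    rng = np.random.default_rng(seed)
    train_clean = tone(rng)
    weights = fit_restorer(degrade(train_clean, rng), train_clean)

    test_clean = tone(rng)
    test_old = degrade(test_clean, rng)
    restored = restore(test_old, weights)

    half = len(weights) // 2
    target = test_clean[half : half + len(restored)]
    before = np.mean((test_old[half : half + len(restored)] - target) ** 2)
    after = np.mean((restored - target) ** 2)
    return before, after
```

Calling `demo()` gives the mean squared error of the degraded take against the clean one, before and after restoration. A real network would replace `fit_restorer`, and a real 1927 signal chain is far nastier than this fake one, so treat it as an illustration of the training setup only.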

Now we just need to hijack one of these networks and a juicy looking Chinese supercomputer and let them look at hundreds of photos of Liszt, sequence his DNA, and let them recreate how he would have played the Hammerklavier sonata. :pimp:

Idk man.

It will sound modern, glittery and wrong.

All the things it “adds” and recreates will be guesses.

On the other hand, computers will be able to play more expressively than pianists soon.
Who knows.

Hopefully I’m not about to miss the point. It’s been a long day here -

Firstly, I’ve sat in a studio, supervising noise removal, and, done manually, it is the most fucking tedious, mind-numbing, anti-muzical thing I have ever done in my life. So… amen to AI.


Do we actually want that? You know, there’s something about the 1920s-1930s piano sound that is so much more mellow, so much more aurally pleasing than today’s almost synthetically, hygienically clean, typical Steinway sound.

Yes, and it will also replace Rach’s sound with something else entirely.

Kind of taking a Rec and reproducing it like a piano roll.
The ideas will be there but the sound won’t be.

Maybe teach a computer to play in a Friedman style, but don’t advertise it as the real thing.


I would also add: having done automated noise reduction (the software learns noise profiles and, in some way, “corrects” or subtracts the noise from the initial recording) I am genuinely unconvinced that the edited audio sound is superior to the original, when the original has been recorded with good equipment. You can hear that the edited sound is “cleaner”, especially in quiet passages, but is it more pleasing to the ear? I am really not sure at all.
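
For anyone wondering what "learns a noise profile and subtracts it" looks like in practice, here’s a minimal sketch of plain spectral subtraction in NumPy. I’m assuming this is roughly the family of technique involved; commercial tools are more elaborate, and the function names, frame size, hop and `floor` value are my own picks for illustration.

```python
import numpy as np

def stft(x, size=512, hop=128):
    """Short-time Fourier transform with a Hann window."""
    window = np.hanning(size)
    starts = range(0, len(x) - size + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + size]) for s in starts])

def istft(spec, length, size=512, hop=128):
    """Weighted overlap-add resynthesis matching stft() above."""
    window = np.hanning(size)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(spec):
        s = i * hop
        out[s:s + size] += window * np.fft.irfft(frame, n=size)
        norm[s:s + size] += window ** 2
    return out / np.maximum(norm, 1e-8)

def spectral_subtract(noisy, noise_only, floor=0.05, over=1.0):
    """Subtract an averaged noise magnitude profile from every frame.

    noise_only: a stretch of pure surface noise (e.g. the lead-in groove).
    floor:      fraction of the original magnitude always kept, to limit
                the chirpy artefacts of over-aggressive subtraction.
    over:       over-subtraction factor; above 1 removes more noise and more music.
    """
    profile = np.abs(stft(noise_only)).mean(axis=0)   # the "learned" noise profile
    spec = stft(noisy)
    mag = np.abs(spec)
    cleaned = np.maximum(mag - over * profile, floor * mag)
    # Keep the noisy phase; only the magnitudes are altered.
    return istft(cleaned * np.exp(1j * np.angle(spec)), len(noisy))
```

Note that the profile gets subtracted from every bin, including the ones carrying the music, which is exactly why the quieter harmonics can take a hit even when the result sounds "clean".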

Any sort of aggressive noise removal tech will get rid of a lot of the original frequencies

Yeah I agree. When it goes badly wrong you can get hideous artefacts. But when it appears to work I suspect harmonics are affected in an insidious manner.

Listen to inferior labels like Urania etc which take normal transfers and use insane noise removal.

You get pretty significant changes to the original 88 tone.

Lots of information is contained in the hiss.
The human ear is the best noise removal there is.
You’ll get used to the surface noise and ignore it.

If anything - imho, many of the modern studio 88 recs need to

  1. Go easy on the editing
  2. No upclose mics or reverb orgies
  3. Record with some less lifelike mics like sum ribbons, and then maybe mix in some ambient sounds from condenser mics?

Or just take 2 neutral sounding Omni mics and place a bit further in the Hall.

I’m working with NVIDIA right now on some POC in another field. Smart dudes, but this is sort of a different discipline.

The simple fact that interpolation accuracy is dependent on the accuracy of your primary data set means applying similar tech to super flawed old recs may just produce some nicer sounding artificialness, or let you know what freq bands don’t have much relevant action and are ok to snip.

I think there are good programs out there for the latter.

…could make for some killer artificial reverb settings on a modern day mixing board though…

Could see also some application with “reimagining” shite like the brahms cylinders after heavy pre filtering…

Totally. It doesn’t take long to adapt either.

About Rachmaninoff, you might know the “Windows in Time” discs.
The result is staggeringly good - this is what computer technology can do with piano rolls.

That said, after listening once I went back to the old scratchy, noisy acoustic and early electric recs, they give me more, much more satisfaction… :rock: :stop:

Right, I forgot to get back about this. It’s a completely different technique than anything used before - there’s no interpolation involved, and it’s also not comparable to a piano roll. I actually talked to a friend with a PhD in computer science today who works with convolutional neural networks, and he not only confirmed that it’s doable but said he thought it would be relatively easy - at least compared to the example above.

of course…

that is really sort of like a glorified piano roll.

Get with that guy and try some stuff!

I’d buy some reissues with some cool new tech applied.

skimmed this

thinking AI can learn a lot of conditional patterning to make better conditions based interpolations

If you went super crazy, you’d capture modern full-res data at the same time as you make the training versions on, like, wax cylinders, electric phonograph etc., then let the network train its way through the conditions until the outcome looks like the reference set.

Take your learned conditions to like old recs maybe it sounds good? Piano is a machine after all.

that or drink exactly three glasses of wine until the crackles go away ; )

or take sum ACID and conjure up a 1840 Liszt performance

Yeah time to start a company 8)

The paper there would be a different thing, but I guess it’s a similar technique. AI super-resolution mentioned in the abstract there is pure magic as well - I can’t find the link now, but Nvidia had an example of this in a presentation or blog or something I saw a couple of weeks ago as well which had my jaw drop. What’s beginning to happen is computers which can “imagine”.