In an earlier post we talked about the technology behind Instant Mix for Music Beta by Google. Instant Mix uses machine hearing to characterize music attributes such as its timbre, mood and tempo. Today we would like to talk about acoustic and visual analysis -- this time on YouTube. A fundamental part of YouTube's mission is to allow anyone anywhere to showcase their talents -- occasionally leading to life-changing success -- but many talented performers are never discovered. Part of the problem is the sheer volume of videos: forty eight hours of video are uploaded to YouTube every minute (that’s eight years of content every day). We wondered if we could use acoustic analysis and machine learning to pore over these videos and automatically identify talented musicians.

First we analyzed audio and visual features of videos being uploaded. We wanted to find “singing at home” videos -- often correlated with features such as ambient indoor lighting, head-and-shoulders view of a person singing in front of a fixed camera, few instruments and often a single dominant voice. Here’s a sample set of videos we found.



Then we estimated the quality of singing in each video. Our approach is based on acoustic analysis similar to that used by Instant Mix, coupled with a small set of singing quality annotations from human raters. Given these data we used machine learning to build a ranker that predicts if an average listener would like a performance.

While machines are useful for weeding through thousands of not-so-great videos to find potential stars, we know they alone can't pick the next great star. So we turn to YouTube users to help us identify the real hidden gems by playing a voting game called YouTube Slam. We're putting an equal amount of effort into the game itself -- how do people vote? What makes it fun? How do we know when we have a true hit? We're looking forward to your feedback to help us refine this process: give it a try*. You can also check out singer and voter leaderboards. Toggle “All time” to “Last week” to find emerging talent in fresh videos or all-time favorites.

Our “Music Slam” has only been running for a few weeks and we have already found some very talented musicians. Many of the videos have less than 100 views when we find them.



And while we're excited about what we've done with music, there's as much undiscovered potential in almost any subject you can think of. Try our other slams: cute, bizarre, comedy, and dance*. Enjoy!

Related work by Google Researchers:
Video2Text: Learning to Annotate Video Content”, Hrishikesh Aradhye, George Toderici, Jay Yagnik, ICDM Workshop on Internet Multimedia Mining, 2009.

* Music and dance slams are currently available only in the US.