Google Trained Its AI To Learn Depth Perception From YouTube’s Mannequin Challenge


Google’s latest blog post focuses on how depth perception works in videos where both the subject and the camera are in motion. Training an AI for the study required a huge amount of footage. The foundation of that training was scenes in which the camera moves but the subject remains stationary.

Google showed remarkable resourcefulness by drawing on a source of its own that was perfect for the job. YouTube, Google’s video-sharing platform, had a vast amount of footage that fit the study’s requirements. In Mannequin Challenge videos, a person or, more often, a group of people stays perfectly still in poses ranging from sitting to bizarre handstands, while one person moves the camera around the motionless subjects (the “mannequins”).

To train the AI to detect human figures in a variety of scenes, Google used approximately 2,000 Mannequin Challenge videos from YouTube. Interestingly, depth sensing generally relies on multiple cameras viewing a scene from different angles. Google, on the other hand, taught its AI to create depth maps from footage that had only one view, i.e. from a single camera.
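
For readers curious why multiple cameras help, the usual two-camera setup estimates depth by triangulation: a point appears shifted between the two views, and the size of that shift (the disparity) is inversely proportional to its distance. The snippet below is a minimal sketch of that textbook relation only. It is not Google’s method, and the function name and example numbers are illustrative assumptions.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate distance to a point seen by two side-by-side cameras.

    Uses the standard rectified-stereo relation: depth = f * B / d, where
    f is the focal length in pixels, B is the distance between the two
    cameras in metres, and d is the horizontal shift of the point between
    the two images in pixels.
    """
    if disparity_px <= 0:
        # Zero disparity would mean the point is infinitely far away.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: cameras 10 cm apart, 1000 px focal length.
near = depth_from_disparity(1000, 0.1, 40)  # large shift, close object
far = depth_from_disparity(1000, 0.1, 20)   # small shift, distant object
print(near, far)
```

With a single camera there is no second view to measure a shift against, which is why a learned model is needed to fill the gap.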


With its Pixel lineup of smartphones, the company has already achieved similar results for still images, where portrait mode produces the bokeh effect. The benefits of the study also extend to the augmented reality features the company is known to dabble in; Playmojis from Google’s Playground is one such example.


If the feature arrives for videos, it will open up new and previously impossible effects in video capture, such as a live bokeh effect similar to the one found on most smartphones. This may also one day lead to 3D images and 2D scenes being shot on smartphone cameras. Google’s progress on the software side shows that hardware improvements alone are not always what lead to great strides in photography and videography.

About The Author
Vikhyaat Vivek
Newbie at iGyaan