How YouTube Uses Artificial Intelligence And Machine Learning

Tanumoy Deb
Oct 20, 2020


Artificial Intelligence

More than 1.9 billion logged-in users visit YouTube every month, and together they watch over a billion hours of video every day. Every minute, 300 hours of video are uploaded to the platform. With this many users and this much activity and content, it makes sense for YouTube to use Artificial Intelligence (AI) to support its operations.

Automatically remove objectionable content

In the first quarter of 2019, 8.3 million videos were removed from YouTube, and 76% of them were automatically identified and flagged by AI classifiers. More than 70% of these were identified before any user had viewed them. The algorithms are not foolproof, but they comb through content far more quickly than humans could if they monitored the platform on their own. In some cases, the algorithm mistakenly pulled down newsworthy videos after classifying them as “violent extremism.” This is one of the reasons Google employs full-time human specialists who work alongside the AI to address violative content. The rest of this article looks at how YouTube, which is owned by Google, uses AI.
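To put these percentages in absolute terms, here is a back-of-the-envelope calculation using only the figures quoted above. Reading "70% of these" as 70% of the automatically flagged videos is an assumption, and the variable names are purely illustrative.

```python
# Back-of-the-envelope arithmetic using only the figures quoted above.
removed_total = 8_300_000     # videos removed in Q1 2019
auto_flagged_share = 0.76     # share first flagged by AI classifiers
no_view_share = 0.70          # "more than 70%", so this is a lower bound

auto_flagged = removed_total * auto_flagged_share
# Assumption: "70% of these" refers to the automatically flagged videos.
caught_before_views = auto_flagged * no_view_share

print(f"Flagged automatically: about {auto_flagged / 1e6:.1f} million")
print(f"Caught before any views: at least {caught_before_views / 1e6:.1f} million")
```

Under that reading, roughly 6.3 million videos were flagged automatically, and at least 4.4 million of them were caught before anyone watched them.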

In the first quarter of 2020, 49.9% of the videos removed from YouTube had no views, 29.4% had only 1–10 views, and 22.7% had more than 10 views. AI is what made this early detection possible. Of the removed videos, 1,021,380 came from the United States, 826,661 from India, and 484,536 from Brazil, with the rest spread across many other countries.

Bar chart showing the number of removed videos by country

YouTube’s top priority is to protect its users from harmful content. In pursuit of that, the company has invested not only in human specialists but also in machine learning technology to support the effort. AI has contributed greatly to YouTube’s ability to identify objectionable content quickly. Before artificial intelligence was used, only 8% of videos containing “violent extremism” were flagged and removed before they had reached ten views; after machine learning was introduced, more than half of the videos removed had fewer than ten views.

One of the main drivers of YouTube’s diligence in removing objectionable content is pressure from brands, agencies, and governments, along with the backlash that follows when ads appear alongside offensive videos. When ads started appearing next to YouTube videos supporting racism and terrorism, Havas UK and other brands began pulling their advertising dollars. In response, YouTube deployed advanced machine learning to better identify content that viewers and advertisers might find offensive, and Google partnered with third-party companies to give advertising partners more transparency and to make sure YouTube content is safe for brands.

YouTube has been criticized in the past for not doing enough to stop the spread of low-quality “trash” videos on its platform. Google has deployed AI software that examines large numbers of videos on its own and keeps those that look troubling off the home page of the website and the home screen of the app. According to people working on the project, this software is known as the “trashy video classifier”. The system plays an essential role in attracting and retaining visitors on YouTube’s homepage. Despite its significance, the company had not publicly reported the classifier before. The AI examines feedback from users who report videos for misleading titles, misleading thumbnails, clickbait, or inappropriate content.
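To make the idea of feedback-driven filtering concrete, here is a minimal sketch. It is not YouTube's classifier, which is a learned model whose details are not public; it is a hand-written heuristic stand-in. The report reasons mirror the feedback types named above, while the weights, the per-1,000-views normalization, the threshold, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical weights for the report reasons mentioned in the article.
REPORT_WEIGHTS = {
    "misleading_title": 1.0,
    "misleading_thumbnail": 1.0,
    "clickbait": 0.5,
    "inappropriate": 2.0,
}

def trashiness_score(reports, views):
    """Weighted user reports per 1,000 views (an invented heuristic)."""
    counts = Counter(reports)
    weighted = sum(REPORT_WEIGHTS.get(reason, 0.0) * n for reason, n in counts.items())
    return 1000.0 * weighted / max(views, 1)

def homepage_eligible(reports, views, threshold=5.0):
    """Keep a video off the home page once its score reaches the threshold."""
    return trashiness_score(reports, views) < threshold

rarely_reported = ["clickbait"] * 2
often_reported = ["misleading_title"] * 30 + ["inappropriate"] * 20

print(homepage_eligible(rarely_reported, views=10_000))  # True
print(homepage_eligible(often_reported, views=10_000))   # False
```

Note that the video is only kept off the home page here, not deleted, which matches the behavior described above.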

New effects on videos

Move over, Snapchat: Google’s artificial intelligence researchers trained a neural network to swap out the backgrounds of videos without the need for specialized equipment. The researchers trained the algorithm on carefully labeled imagery so that it could learn the relevant patterns, and the result is a system fast enough to keep up with video.
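The article does not describe how the background swap is carried out once the network has done its work. A common approach, assumed here, is for the network to output a per-pixel foreground mask that is then used to blend each frame with the new background. The sketch below shows only that blending step on a tiny grayscale image; the mask values are made up, and `composite` is an illustrative name.

```python
def composite(frame, background, mask):
    """Blend per pixel: mask 1.0 keeps the frame, mask 0.0 shows the background."""
    out = []
    for frame_row, bg_row, mask_row in zip(frame, background, mask):
        out.append([
            round(m * f + (1.0 - m) * b)
            for f, b, m in zip(frame_row, bg_row, mask_row)
        ])
    return out

# Tiny 2x3 grayscale example. In a real system the mask would be
# predicted by the trained network for every video frame.
frame      = [[200, 200, 50], [200, 200, 50]]
background = [[10, 10, 10], [10, 10, 10]]
mask       = [[1.0, 0.5, 0.0], [1.0, 1.0, 0.0]]

print(composite(frame, background, mask))
# [[200, 105, 10], [200, 200, 10]]
```

Fractional mask values, such as the 0.5 above, give soft edges around the subject instead of a hard cutout.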

The “Up Next” feature

If you have ever used YouTube’s “Up Next” feature, you benefited from the platform’s artificial intelligence. Since the dataset on YouTube is constantly changing as its users upload hours of video every minute, the AI required to power its recommendation engine needed to be different than the recommendation engines of Netflix or Spotify. It had to be able to handle real-time recommendations while new data is constantly added by users. The solution they came up with is a two-part system. The first is candidate generation, where the algorithm assesses the YouTube history of the user. The second part is the ranking system that assigns a score to each video.

YouTube has one of the largest and most advanced recommendation systems in the industry. As one of the world’s leading websites, it must recommend relevant videos to keep its users satisfied. YouTube differs from other services that rely on recommendation systems (e.g., Netflix and Spotify) because users upload hundreds of hours of video to the platform every minute. YouTube’s corpus is constantly changing, and the company does not control the content being added. This creates the need for a robust model that can handle constant incoming data and still output quality recommendations in real time. The architecture described next is YouTube’s response to this need.

The Recommendation System’s Architecture

The recommendation system YouTube designed has two stages. The first is a neural network for candidate generation, and the second is a neural network for ranking.
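The two-stage flow can be sketched in a few lines. In the sketch below, both neural networks are replaced by deliberately simple stand-ins (a topic filter for candidate generation and a fixed per-video score for ranking), so it shows only the shape of the pipeline, not YouTube's models. The field names, the catalog, and the scoring function are all hypothetical.

```python
def generate_candidates(watch_history, catalog, limit=3):
    """Stage 1: narrow the catalog to unwatched videos on topics the user has watched."""
    watched_ids = {video["id"] for video in watch_history}
    liked_topics = {video["topic"] for video in watch_history}
    candidates = [
        video for video in catalog
        if video["topic"] in liked_topics and video["id"] not in watched_ids
    ]
    return candidates[:limit]

def rank(candidates, score_fn):
    """Stage 2: score each candidate and sort from highest to lowest."""
    return sorted(candidates, key=score_fn, reverse=True)

catalog = [
    {"id": "a", "topic": "cooking", "avg_watch_minutes": 4.0},
    {"id": "b", "topic": "cooking", "avg_watch_minutes": 9.5},
    {"id": "c", "topic": "chess", "avg_watch_minutes": 12.0},
    {"id": "d", "topic": "cooking", "avg_watch_minutes": 6.0},
]
history = [{"id": "a", "topic": "cooking", "avg_watch_minutes": 4.0}]

candidates = generate_candidates(history, catalog)
ranked = rank(candidates, score_fn=lambda video: video["avg_watch_minutes"])
print([video["id"] for video in ranked])
# ['b', 'd']
```

The split matters for scale: the first stage cheaply reduces a huge catalog to a short list, so the more expensive scoring in the second stage only runs on a handful of videos.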

Guillaume Chaslot, a former Google employee and founder of AlgoTransparency, an initiative urging greater transparency, explained that the metric YouTube’s algorithm uses to judge a successful recommendation is watch time. This is good for the platform and the advertisers, but not so good for the users, he said. It can amplify videos with outlandish content: the more time people spend watching them, the more they get recommended.
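The feedback loop Chaslot describes can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: recommendations are split in proportion to accumulated watch time, each recommendation yields a fixed average number of minutes watched, and the "outlandish" video is assumed to hold attention slightly longer. It is not a model of YouTube's actual system.

```python
def simulate(avg_minutes, rounds=5, impressions_per_round=1000):
    """Each round, split recommendations in proportion to accumulated watch time."""
    watch_time = {name: 1.0 for name in avg_minutes}  # equal starting point
    shares = []
    for _ in range(rounds):
        total = sum(watch_time.values())
        share = {name: watch_time[name] / total for name in watch_time}
        shares.append(share)
        for name in watch_time:
            # More recommendations produce more watch time, which feeds the next round.
            watch_time[name] += impressions_per_round * share[name] * avg_minutes[name]
    return shares

# Hypothetical average minutes watched per recommendation.
shares = simulate({"measured": 2.0, "outlandish": 3.0})
for round_number, share in enumerate(shares, start=1):
    print(round_number, f"outlandish share of recommendations: {share['outlandish']:.2f}")
```

Both videos start with an equal share, but the one that holds attention longer receives a larger share of recommendations in every subsequent round.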

Training on depth prediction

With so much data, YouTube videos provide a fertile training ground for artificial intelligence algorithms. Google AI researchers used more than 2,000 “mannequin challenge” videos posted on the platform to train an AI model that can estimate depth in videos. In a “mannequin challenge” video, a group of people stands still as if frozen while one person moves through the scene filming it. Ultimately, this skill of depth prediction could help propel the development of augmented reality experiences.

With the assistance of artificial intelligence, YouTube, Twitter, and Facebook already work to delete terrorist content, but what’s new in the President’s request is that they work with the Department of Justice and law enforcement agencies. There are many questions about how such a partnership would work, whether social media channels could detect actual terrorists before they act, and how it might affect the civil liberties of innocent people. Whether YouTube and other social media companies can use artificial intelligence to stop terrorists without infringing on the rights of others remains to be seen.