YouTube’s AI-powered auto-dubbing feature is now available to hundreds of thousands of creators in the YouTube Partner Program. Initially introduced at VidCon last year with a limited rollout, the feature is now accessible to channels focused on informational content, including tutorials and educational videos. YouTube plans to extend it to other types of content soon.
What Is AI-Powered Dubbing on YouTube
When creators upload a video, YouTube automatically detects the original language and generates dubbed versions using AI. YouTube’s AI-powered auto-dubbing supports the languages below:
- English (for translation into other supported languages)
- French
- German
- Hindi
- Indonesian
- Italian
- Japanese
- Portuguese
- Spanish
For English videos, YouTube’s AI dubbing tool creates dubs in all of the other supported languages. Videos uploaded in any other supported language, however, get only an English dub. In the current implementation you cannot, for example, upload a French video and generate an Italian dub.
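The pairing rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not YouTube’s code: the `SUPPORTED` list and the `dub_targets` helper are hypothetical names, and returning an empty list for an unsupported source language is an assumption.

```python
# Languages the article lists as supported by auto-dubbing.
SUPPORTED = [
    "English", "French", "German", "Hindi", "Indonesian",
    "Italian", "Japanese", "Portuguese", "Spanish",
]


def dub_targets(source_language: str) -> list[str]:
    """Return the dub languages for a video (hypothetical helper)."""
    if source_language not in SUPPORTED:
        # Assumption: an unsupported source language gets no dubs.
        return []
    if source_language == "English":
        # English videos are dubbed into every other supported language.
        return [lang for lang in SUPPORTED if lang != "English"]
    # Any other supported language is dubbed into English only.
    return ["English"]


print(dub_targets("English"))
print(dub_targets("French"))
```

Calling `dub_targets("French")` yields only English, which is why a French-to-Italian dub is not possible today.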
To manage dubs, navigate to the Languages section in YouTube Studio, where you can review, publish, unpublish, or delete the dubbed versions. Dubbed videos carry an auto-dubbed label, and viewers can switch to the original audio using the Audio track option.
How YouTube’s AI-Powered Dubbing Works
YouTube’s new dubbing tool leverages Google’s Gemini AI to generate these dubbed versions. When a creator uploads a video, Google uses natural language processing (NLP) to identify the spoken language automatically, irrespective of accent or dialect. The audio is then transcribed into text using the Google Speech-to-Text API. Once transcribed, the text is translated into the target language(s) with Google Translate, which uses neural machine translation (NMT) models.
The translated text is then converted into audio by Google Gemini AI using technology like WaveNet, a step known as synthetic voice generation. The dubbed audio tracks are automatically synchronized with the original video and added alongside it in YouTube Studio.
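The stages described above (detect, transcribe, translate, synthesize) can be pictured as a simple pipeline. The sketch below is purely illustrative: every function is a hypothetical placeholder that returns canned data, none of them calls a real Google API, and the final synchronization step is left out.

```python
from dataclasses import dataclass


@dataclass
class DubTrack:
    language: str
    text: str
    audio: bytes
    label: str = "auto-dubbed"  # dubbed tracks carry this label


def detect_language(audio: bytes) -> str:
    # Placeholder for automatic language identification.
    return "English"


def transcribe(audio: bytes, language: str) -> str:
    # Placeholder for speech-to-text.
    return "hello world"


def translate(text: str, source: str, target: str) -> str:
    # Placeholder for neural machine translation.
    return f"[{target}] {text}"


def synthesize(text: str, language: str) -> bytes:
    # Placeholder for synthetic voice generation.
    return text.encode("utf-8")


def auto_dub(audio: bytes, targets: list[str]) -> list[DubTrack]:
    """Run the four stages once per target language."""
    source = detect_language(audio)
    transcript = transcribe(audio, source)
    tracks = []
    for target in targets:
        translated = translate(transcript, source, target)
        tracks.append(DubTrack(target, translated, synthesize(translated, target)))
    return tracks


tracks = auto_dub(b"raw-audio", ["French", "Spanish"])
for track in tracks:
    print(track.language, track.label, track.text)
```

Detection and transcription run once per upload, while translation and synthesis repeat for each target language, which mirrors the order of the steps described above.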
Current AI-generated dubs may lack perfect tone and emotional fidelity. However, YouTube is working on an enhancement called Expressive Speech. This feature, under development with Google DeepMind, focuses on replicating the creator’s tone, emotions, and even the ambiance of the original environment. Previewed at the Made on YouTube event, it promises to make auto-dubs more natural and engaging.
Helping Creators Reach a Global Audience
By expanding the reach of AI-powered dubbing, YouTube is helping creators connect with audiences worldwide and making content more inclusive. While the technology is still in its early stages, YouTube’s collaboration with Google DeepMind and Google Translate signals ongoing improvements.
Although the current implementation has some limitations, such as restricted language pairings and less expressive dubs, future updates are expected to address these gaps.