
Convert MP4 to OPUS Online

Extract audio from any MP4 as modern Opus, free, in your browser.

Drag your file here

.mp4 · up to 100 MB

Processed in your browser: your file is never uploaded. Free.
Note: The first conversion loads the FFmpeg engine (~25 MB). Subsequent conversions will be faster.

From MP4 video to modern Opus audio

50% smaller than MP3

Opus at 64 kbps perceptually outperforms MP3 at 128 kbps per Xiph.org MUSHRA studies.

Native Discord and WebRTC

Opus is a mandatory-to-implement codec in WebRTC (RFC 7874). Your clips are natively compatible.

100% private

Your video never leaves your device. Local processing with FFmpeg.wasm.

No usage limits

No signup, no watermark, no queues. Convert as many files as you need.

Three steps, no hassle

1

Upload your MP4 file

Drag or select your .mp4 video. Up to 500 MB, no signup required.

2

Audio extraction and encoding

FFmpeg.wasm decodes the MP4 audio track and re-encodes it to Opus 48 kHz directly in your browser.

3

Download your Opus file

Get your .opus file ready for Discord, podcasts, bots, and VoIP. Up to 50% smaller than MP3.
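The three steps above can be sketched as a single function. This is a minimal illustration, not the site's actual code: the `FfmpegLike` interface and the `extractOpus` name are hypothetical, the method shapes are modeled on the `writeFile`, `exec`, and `readFile` calls of @ffmpeg/ffmpeg 0.12, and the 64 kbps bitrate is an arbitrary example value.

```typescript
// Hypothetical minimal shape of the FFmpeg.wasm methods used here,
// modeled on @ffmpeg/ffmpeg 0.12 (an assumption, not the site's code).
interface FfmpegLike {
  writeFile(path: string, data: Uint8Array): Promise<unknown>;
  exec(args: string[]): Promise<number>;
  readFile(path: string): Promise<Uint8Array | string>;
}

async function extractOpus(
  ffmpeg: FfmpegLike,
  mp4Bytes: Uint8Array
): Promise<Uint8Array> {
  // Step 1: place the selected file in FFmpeg's in-memory file system.
  await ffmpeg.writeFile("input.mp4", mp4Bytes);

  // Step 2: drop the video (-vn) and encode the audio track with libopus.
  const exitCode = await ffmpeg.exec([
    "-i", "input.mp4",
    "-vn",
    "-c:a", "libopus",
    "-b:a", "64k", // example bitrate, chosen for illustration
    "output.opus",
  ]);
  if (exitCode !== 0) {
    throw new Error(`FFmpeg exited with code ${exitCode}`);
  }

  // Step 3: read the result back so it can be offered as a download.
  const out = await ffmpeg.readFile("output.opus");
  if (typeof out === "string") {
    throw new Error("Expected binary output");
  }
  return out;
}
```

Because everything goes through the in-memory file system, nothing in this flow requires a network request once the engine itself has loaded.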

Got questions?

Opus delivers perceptually superior quality to MP3 at half the bitrate. At 64 kbps, Opus outperforms MP3 at 128 kbps according to Xiph.org MUSHRA studies (2012). For voice, Opus at 32 kbps is comparable to MP3 at 64–96 kbps. Discord, Telegram, WebRTC (RFC 7874, May 2016), and most modern VoIP applications use Opus natively. If that is your destination, Opus is the technically superior choice.

There is one transcoding generation: the original audio in the MP4 (typically AAC-LC) is decoded to PCM and re-encoded to Opus. This introduces minimal loss at appropriate bitrates (96–160 kbps for music, 48–64 kbps for voice). The loss is lower than with an equivalent MP3 transcode because Opus uses more modern coding techniques (its SILK and CELT layers were designed between 2007 and 2012, whereas MP3's subband model dates from 1993).

Yes. MP4 files from downloaded YouTube videos, GoPro, DJI, and Sony cameras, OBS Studio screen recordings, and any other standard MP4 work correctly. The MP4 container typically carries AAC-LC, HE-AAC, or occasionally Opus audio. In all cases, FFmpeg extracts and converts the audio stream.

Yes. Discord accepts .opus files in text chats and plays them natively. The free upload limit was 25 MB at the time of writing; Discord has changed it before, so check the current value. For real-time voice transmission, Discord encodes audio to Opus internally at 64 kbps (voice) or 8 kbps (low-quality channels). The .opus file you produce here is a static file for sharing as an attachment, separate from real-time voice.

The video stream is discarded entirely. This tool extracts only the audio stream. Keeping the video with an Opus audio track remuxed in requires a video remuxing tool, which is beyond the scope of this audio conversion.

For spoken voice (lectures, podcasts, vlogs): 48–64 kbps provides high-quality speech, well above telephone grade. For voice with background music or mixed content: 96–128 kbps. For music recorded in video (concerts, studio sessions): 128–160 kbps. The libopus encoder uses variable bitrate (VBR) by default and spends fewer bits on silence and simple passages, so the instantaneous bitrate varies around the target and the final file size can differ slightly from a straight bitrate-times-duration estimate.
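The bitrate ranges above translate directly into approximate file sizes. The sketch below is illustrative only: the `ContentKind` type, the table, and both function names are hypothetical, and the arithmetic ignores Ogg container overhead and VBR variation, so real files will differ somewhat.

```typescript
// Hypothetical names; the ranges are the ones recommended in the text above.
type ContentKind = "voice" | "mixed" | "music";

const BITRATE_RANGE_KBPS: Record<ContentKind, [number, number]> = {
  voice: [48, 64],
  mixed: [96, 128],
  music: [128, 160],
};

// Approximate size in megabytes (1 MB = 1,000,000 bytes).
// 1 kbps = 1000 bits per second, 8 bits per byte.
function estimateOpusSizeMB(durationSeconds: number, bitrateKbps: number): number {
  return (bitrateKbps * 1000 * durationSeconds) / 8 / 1e6;
}

// Longest duration (in seconds) that fits under a size limit at a given bitrate.
function maxDurationSeconds(limitMB: number, bitrateKbps: number): number {
  return (limitMB * 1e6 * 8) / (bitrateKbps * 1000);
}
```

For example, one hour of voice at 64 kbps comes to roughly 28.8 MB, and a 25 MB attachment limit leaves room for about 3125 seconds (around 52 minutes) at that bitrate.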

Convert MP4 to Opus: extract video audio with the most efficient modern codec

MP4 (MPEG-4 Part 14, ISO/IEC 14496-14, built on the ISO base media file format of ISO/IEC 14496-12) is the world's most widely used video container. It stores video streams (typically H.264 or H.265) alongside audio streams encoded in AAC-LC, HE-AAC, or increasingly Opus. The need to extract only the audio from an MP4 is common: podcast recordings made with a camera or smartphone, YouTube sessions to be redistributed as audio, conference recordings, online lectures, and vlogs where only the voice is needed.

The question is which format to extract to. MP3 has been around since 1993 (ISO/IEC 11172-3) and is ubiquitous, but its subband psychoacoustic model was designed for 1990s hardware. Opus, standardized by the IETF in RFC 6716 (September 2012) with major contributions from Xiph.Org, combines SILK (originally developed by Skype for VoIP) and CELT (designed for low-latency audio) with dynamic mode-switching logic. The result is that Opus at 64 kbps produces perceptually superior quality to MP3 at 128 kbps in 2012 MUSHRA tests, and Opus at 96 kbps is roughly equivalent to AAC at 128 kbps.

For extracting audio from MP4 videos destined for modern platforms (Discord, which uses Opus for all its audio, WebRTC applications, audio bots, and podcast distribution), Opus is the technically optimal choice: smaller file size at equal perceived quality, configurable frame sizes from 2.5 to 60 ms, and a completely royalty-free license (unlike AAC, which is licensed through a patent pool; MP3's patents have since expired).

The internal architecture of an MP4 is relevant to understanding what happens during audio extraction. The MP4 container organizes its streams in atoms (boxes) per ISO/IEC 14496-12: the moov atom contains container metadata, trak atoms describe each stream (video, audio, subtitles), and mdat atoms contain the interleaved sample data. The audio stream in an MP4 recorded by a GoPro Hero 12 action camera typically uses AAC-LC at 48 kHz stereo at 192 kbps. A YouTube MP4 downloaded in 137+140 format has H.264 video and AAC-LC audio at 128 kbps. An OBS Studio MP4 recording may have AAC at 160 kbps or, if configured, Opus.

In all these cases, FFmpeg identifies the audio stream, decodes it to PCM (if it is not already Opus), and re-encodes to Opus in an Ogg container. The conversion introduces a single transcoding generation if the original audio was AAC or MP3. If the MP4 originally contained Opus (an unusual case), FFmpeg can perform a direct stream copy without re-encoding, preserving original quality. The resulting .opus file uses the Ogg encapsulation for Opus (defined in RFC 7845, April 2016), compatible with VLC, mpv, Firefox (since v15, 2012), Chrome (since v25, 2013), and foobar2000.
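To make the atom layout concrete, here is a small sketch that walks the top-level boxes of an MP4 held in memory. It is an illustration rather than what FFmpeg does internally: `Mp4Box` and `listTopLevelBoxes` are hypothetical names, and the function only reads box headers (a 32-bit big-endian size followed by a four-character type) without descending into moov or trak.

```typescript
interface Mp4Box {
  type: string;   // four-character code such as "ftyp", "moov", "mdat"
  offset: number; // byte offset of the box header
  size: number;   // total box size in bytes, header included
}

function listTopLevelBoxes(data: Uint8Array): Mp4Box[] {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const boxes: Mp4Box[] = [];
  let offset = 0;

  while (offset + 8 <= data.byteLength) {
    let size = view.getUint32(offset); // DataView reads big-endian by default
    const type = String.fromCharCode(
      view.getUint8(offset + 4),
      view.getUint8(offset + 5),
      view.getUint8(offset + 6),
      view.getUint8(offset + 7)
    );
    let headerSize = 8;

    if (size === 1) {
      // A size of 1 means a 64-bit "largesize" follows the type field.
      if (offset + 16 > data.byteLength) break;
      const high = view.getUint32(offset + 8);
      const low = view.getUint32(offset + 12);
      size = high * 4294967296 + low;
      headerSize = 16;
    } else if (size === 0) {
      // A size of 0 means the box extends to the end of the file.
      size = data.byteLength - offset;
    }

    if (size < headerSize) break; // malformed header; stop rather than loop

    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}
```

On a typical file this returns entries such as ftyp, moov, and mdat. The audio track description lives inside moov, which is why a tool can identify the audio codec by reading metadata before touching the sample data.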

Convertir.ai extracts the audio from the MP4 and converts it to Opus entirely in the browser with FFmpeg.wasm, without sending any video frame or audio sample to external servers. The process: FFmpeg.wasm opens the MP4 in memory, identifies the audio stream (typically AAC-LC on stream 0:1), decodes it to PCM at 44.1 kHz or 48 kHz, resamples to 48 kHz if necessary using FFmpeg's Kaiser-windowed sinc filter, and encodes with libopus at the user-selected bitrate. The application mode is selected automatically: audio for musical content (uses CELT across the full spectrum), voip for spoken content (uses SILK at low frequencies with CELT for highs).

MP4 metadata (title, artist, album, date from the udta/© atom) is transferred to Vorbis Comment tags in the output Opus when available. The output file is .opus in Ogg, the standard format for Opus outside WebM containers. Processing time depends on file size and device speed: a 1-hour voice MP4 at 1.5 GB processes in approximately 3–8 minutes on modern consumer hardware with FFmpeg.wasm. The service is completely free, with no signup, no limit on the number of files, and no watermark.
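The pipeline described above maps onto a short FFmpeg argument list. The following is a sketch under stated assumptions, not Convertir.ai's actual implementation: `ConversionOptions` and `buildOpusArgs` are hypothetical names, `sourceCodec` is assumed to come from an earlier probe of the file, and the FFmpeg.wasm build is assumed to include libopus. The flags themselves (`-vn`, `-map`, `-map_metadata`, `-c:a`, `-b:a`, `-ar`, and libopus's `-application`) are standard FFmpeg options.

```typescript
// libopus application modes as accepted by FFmpeg's -application option.
type OpusApplication = "voip" | "audio" | "lowdelay";

// Hypothetical options object, for illustration only.
interface ConversionOptions {
  input: string;
  output: string;
  bitrateKbps: number;
  application: OpusApplication;
  sourceCodec: string; // e.g. "aac" or "opus", assumed known from a prior probe
}

function buildOpusArgs(opts: ConversionOptions): string[] {
  const common = [
    "-i", opts.input,
    "-vn",                 // discard the video stream
    "-map", "0:a:0",       // take the first audio stream of the input
    "-map_metadata", "0",  // carry global tags over to the output
  ];

  if (opts.sourceCodec === "opus") {
    // Source is already Opus: stream copy, no transcoding generation.
    return [...common, "-c:a", "copy", opts.output];
  }

  return [
    ...common,
    "-c:a", "libopus",
    "-b:a", `${opts.bitrateKbps}k`,
    "-ar", "48000", // resample to 48 kHz if the source differs
    "-application", opts.application,
    opts.output,
  ];
}
```

With @ffmpeg/ffmpeg 0.12, an array like this would typically be passed to `ffmpeg.exec(...)` after the input has been written to the in-memory file system.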