Compress Audio Online

Reduce audio file size by lowering the bitrate. Free, in your browser, no file uploads.

.mp3, .wav, .ogg, .flac, .aac · up to 100 MB

Processed in your browser: your file is never uploaded. Free.
Note: The first conversion loads the FFmpeg engine (~25MB). Subsequent conversions will be faster.

Compress audio without losing what matters

Bitrate control

Choose any bitrate between 64 and 320 kbps and estimate the resulting file size with a simple formula.

100% private

Compression happens in your browser. Your audio is never uploaded to any server.

Universal compatibility

Output in MP3 or AAC. Compatible with all devices and platforms.

Up to 80% smaller

Reduce audio files to a fraction of the original size in seconds.

Three steps, no hassle

1

Select your audio file

Drag or select your MP3, AAC, OGG, or other audio format file. Up to 200 MB, no signup.

2

Choose the target bitrate

Select between 64 and 320 kbps for your use case. 64-96 kbps for voice, 128-192 kbps for casual music, 256-320 kbps for hi-fi.

3

Download the compressed audio

Compare original and new file size before downloading. Typical savings are 50-80% versus the original file.

Got questions?

Reducing the bitrate in lossy formats (MP3, AAC, OGG Vorbis) does discard audio information, so some quality is lost. Modern codecs such as AAC-LC, standardized by MPEG in 1997 with contributions from Dolby, Fraunhofer, AT&T, Sony, and Nokia, use psychoacoustic models to discard the information the human ear is least able to perceive. The model analyzes the frequency spectrum in short time windows (typically 20–50 ms) and exploits two phenomena. The first is frequency masking: a loud sound at one frequency makes weaker sounds at nearby frequencies imperceptible, an effect studied by Harvey Fletcher at Bell Laboratories from the 1920s onward; the related Fletcher-Munson equal-loudness contours (1933) describe how the ear's sensitivity varies with frequency. The second is temporal masking: a loud sound hides quieter sounds for a short interval before it (pre-masking, typically a few tens of milliseconds at most) and for a longer interval after it (post-masking, up to roughly 200 ms). At 128 kbps, AAC and MP3 preserve nearly all perceptually relevant information for most listeners under normal listening conditions. Below 96 kbps, artifacts such as pre-echo, pumping, and high-frequency distortion become audible in complex musical content.

The optimal bitrate depends on the content and the listening context. For podcasts and other spoken-voice content, 64 kbps mono is a widely used de facto standard for shows distributed through Spotify, Apple Podcasts, and Overcast, and works out to approximately 28 MB per hour. At 64 kbps, speech remains fully clear because the spectral bandwidth that matters for intelligibility (roughly 85 Hz to 8 kHz) is much narrower than that of music. For casual music streaming, 128 kbps is acceptable on Bluetooth speakers or mid-range headphones; Spotify's free tier streams OGG/Vorbis at up to about 160 kbps (128 kbps in the web player), while Premium goes up to 320 kbps. For attentive music listening, use 192–256 kbps; Apple Music uses 256 kbps AAC-LC as the base format for its entire library. For reference files or archiving, use 320 kbps MP3 or, better yet, FLAC (lossless). Professional music producers often deliver masters as 24-bit/96 kHz WAV to platforms, which then transcode them internally.

For constant-bitrate files, the compressed size can be estimated as: Size (MB) = (Bitrate in kbps × Duration in seconds) / 8 / 1024. This convention counts 1 MB as 1,024 KB and ignores container and tag overhead; with strictly decimal units (1 MB = 1,000,000 bytes) the divisor becomes 8,000 and the results are about 2.4% larger. Concrete examples: a 4-minute song (240 seconds) at 320 kbps occupies (320 × 240) / 8 / 1024 = 9.375 MB. The same song at 128 kbps occupies 3.75 MB (60% less). At 64 kbps it occupies 1.875 MB (80% less than at 320 kbps). For comparison, the uncompressed WAV of the same song at 44.1 kHz, 16-bit, stereo occupies approximately 44,100 × 2 × 2 × 240 / 1,024 / 1,024 = 40.4 MB. A 12-track album of 4-minute songs at 128 kbps occupies 45 MB in total, versus about 485 MB as uncompressed WAV. This roughly tenfold difference explains why lossy formats were essential to making digital music distribution viable in the 1990s and 2000s, given the connection speeds of the era (56k modem: 7 KB/s; 1 Mbps ADSL: 125 KB/s).
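The arithmetic above can be reproduced with a short Python sketch. The function names and the `binary` switch are illustrative choices, not part of the tool; the estimate covers audio data only and ignores container and tag overhead.

```python
def compressed_size_mb(bitrate_kbps: float, duration_s: float, binary: bool = True) -> float:
    """Estimate the size of a constant-bitrate file in MB.

    binary=True uses the formula quoted above (divide by 8, then by 1,024);
    binary=False uses strictly decimal megabytes (divide by 8,000).
    """
    divisor = 8 * 1024 if binary else 8000
    return bitrate_kbps * duration_s / divisor


def wav_size_mb(duration_s: float, sample_rate: int = 44_100,
                bytes_per_sample: int = 2, channels: int = 2) -> float:
    """Size of uncompressed PCM audio in MB (1 MB = 1,024 x 1,024 bytes)."""
    return sample_rate * bytes_per_sample * channels * duration_s / 1024 / 1024


# A 4-minute (240 s) song at the three bitrates discussed in the text.
for kbps in (320, 128, 64):
    print(f"{kbps} kbps: {compressed_size_mb(kbps, 240):.3f} MB")

# The same song as 44.1 kHz, 16-bit, stereo PCM.
print(f"WAV: {wav_size_mb(240):.1f} MB")
```

For variable-bitrate files, the same function gives only an estimate when called with the average bitrate.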

Lossy compression is a data compression method that reduces file size by permanently discarding information considered non-essential to the listener's perceptual experience. In audio, lossy codecs such as MP3 (patented by Fraunhofer IIS and Thomson Consumer Electronics, with the fundamental patents expiring in 2017), AAC (ISO/IEC 13818-7:1997 standard), and OGG Vorbis (open source, developed by the Xiph.Org Foundation since 1998) exploit the limitations of the human auditory system to discard information that cannot be perceived. Lossless compression such as FLAC (Free Lossless Audio Codec, created by Josh Coalson in 2001) is fully reversible, like ZIP, but uses prediction and entropy coding tuned to audio signals; it typically shrinks files to about 50–70% of their original size (roughly 1.4:1 to 2:1) without discarding any information. Lossy compression, on the other hand, can achieve ratios of 10:1 to 20:1 because it discards perceptually irrelevant information, at the cost of being irreversible: the original audio cannot be recovered from the compressed file.

CBR (Constant Bitrate) and VBR (Variable Bitrate) are two bit allocation strategies in lossy audio encoding. CBR maintains a fixed bitrate throughout the file: at 128 kbps CBR, every second of audio occupies 128,000 bits, regardless of whether that second contains silence, simple speech, or spectrally dense music. This guarantees a predictable file size and simplifies streaming (the server can calculate how many bytes are needed to reach any position in the file), but it is inefficient because simple and complex passages receive the same number of bits. VBR assigns more bits to complex segments (higher spectral density, more information to encode) and fewer to simple ones (silence, monotone speech), which keeps perceptual quality more uniform. In MP3, LAME VBR with -V 2 (about 190 kbps on average) produces results that most listeners cannot distinguish from 320 kbps CBR in blind ABX tests, at a significantly smaller file size. AAC encoders offer comparable VBR modes; in FFmpeg, a VBR quality setting of 3–4 is the rough counterpart, though the quality scale depends on which AAC encoder the build uses.

Compressing a file that is already lossy-compressed (MP3 to MP3, or MP3 to AAC) causes additional quality degradation compared with compressing from a lossless source (WAV or FLAC). This phenomenon is known as generational loss. Each lossy compression cycle introduces new artifacts: those from the first pass (pre-echo, high-frequency distortion, pumping) combine with those of the second, and the second codec's psychoacoustic model may allocate bits suboptimally because the input no longer has the statistical properties of natural PCM audio. In practice, at bitrates of 128 kbps or higher, the difference between compressing from WAV and from a 320 kbps MP3 is audible only in carefully controlled ABX tests and is imperceptible in casual listening. However, compressing a 64 kbps MP3 to 32 kbps produces clearly audible artifacts even for untrained listeners. The general recommendation is to always start from the highest-quality source available for any transcoding operation.

Compress audio: bitrate, quality, and psychoacoustics explained

Lossy audio compression is one of the most influential technologies of the digital era, and understanding how it works makes it easier to choose the right bitrate for each use case. The fundamental principle of modern lossy audio codecs, including MP3 (MPEG-1 Audio Layer III), AAC (Advanced Audio Coding), and OGG Vorbis, is the psychoacoustic model: a set of algorithms that analyze the spectral content of the audio and determine what information can be discarded without being perceived by the human ear. The model exploits two well-documented phenomena of human auditory perception. The first is simultaneous frequency masking: when two tones are played at the same time, the louder one can make the quieter one imperceptible if they are close enough in frequency. Harvey Fletcher investigated this effect systematically at Bell Laboratories from the 1920s onward; the related Fletcher-Munson equal-loudness contours (1933), later standardized as ISO 226 (revised in 2003), describe how the ear's sensitivity varies with frequency. The second phenomenon is temporal masking: a loud sound masks quieter sounds for a short interval before it (pre-masking, typically a few tens of milliseconds at most) and for up to roughly 200 ms after it (post-masking). Audio codecs concentrate their bits on spectral components that exceed the masking threshold and discard or coarsely encode what falls below it.
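The idea of a masking threshold can be illustrated with a deliberately simplified toy model. This is not how any real codec implements its psychoacoustic model: it considers only simultaneous masking between isolated tones, uses the Zwicker and Terhardt approximation for the Bark scale, and assumes a two-slope spreading function (25 dB per Bark below the masker, 10 dB per Bark above it) with a fixed 15 dB offset. All of those constants and function names are illustrative assumptions.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical-band rate in Bark (Zwicker and Terhardt formula)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


def masking_threshold_db(masker_hz: float, masker_db: float, probe_hz: float,
                         offset_db: float = 15.0, lower_slope: float = 25.0,
                         upper_slope: float = 10.0) -> float:
    """Level below which a tone at probe_hz is hidden by the masker (toy model)."""
    distance = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    # Masking spreads further toward higher frequencies, so the upper slope is shallower.
    slope = upper_slope if distance >= 0 else lower_slope
    return masker_db - offset_db - slope * abs(distance)


def audible_components(components: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Keep only the (frequency_hz, level_db) tones not masked by any other tone."""
    kept = []
    for i, (freq, level) in enumerate(components):
        masked = any(
            level < masking_threshold_db(other_freq, other_level, freq)
            for j, (other_freq, other_level) in enumerate(components)
            if j != i
        )
        if not masked:
            kept.append((freq, level))
    return kept


# A loud 1 kHz tone, a quiet neighbor at 1.1 kHz, and a quiet distant tone at 5 kHz.
print(audible_components([(1000.0, 80.0), (1100.0, 40.0), (5000.0, 40.0)]))
```

In this toy model the quiet 1.1 kHz tone is dropped because it sits close to the loud masker, while the equally quiet 5 kHz tone survives because it is many critical bands away.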

The relationship between bitrate and perceived quality is not linear: each increase in bitrate yields a smaller improvement than the last. With the LAME MP3 encoder (LAME Ain't an MP3 Encoder, the open source encoder developed since 1998), the typical progression is as follows. At 64 kbps, spoken voice is completely intelligible but music shows audible artifacts (pre-echo on percussion, high-frequency distortion). At 96 kbps, quality is acceptable for music in undemanding contexts. At 128 kbps, generally considered the minimum standard for music, most untrained listeners cannot tell the result from the original in a blind test. At 192 kbps, only trained listeners with high-fidelity equipment detect differences in complex musical content. At 256 kbps, the result is transparent for virtually all listeners in almost all conditions. At 320 kbps, the upper limit of the MP3 standard, differences from FLAC are very rarely detectable even in controlled ABX tests with high-fidelity equipment.

The formula for estimating compressed audio file size in decimal megabytes is: Size (MB) = (Bitrate_kbps × Duration_seconds) / 8000. Dividing by 8 and then by 1,024 instead, to count 1 MB as 1,024 KB, gives figures about 2.3% smaller. The formula assumes CBR (Constant Bitrate). For VBR (Variable Bitrate), size varies with content complexity, but the average bitrate specified in the encoder serves as an estimate.

The main use cases and their recommended bitrates are as follows. Mono spoken voice (podcasts, audiobooks, calls): 32–64 kbps. VoIP (Voice over IP) telephony uses specialized codecs such as Opus at 6–64 kbps or G.711 at 64 kbps; WhatsApp and Telegram use Opus at 32–64 kbps for voice calls. RSS-distributed podcasts: 64 kbps mono is the de facto industry standard, although Spotify recommends 128 kbps stereo for maximum quality on its platform. Casual streaming music: 128–192 kbps; Spotify's free tier streams OGG/Vorbis at up to about 160 kbps (128 kbps in the web player). Music for attentive listening: 256–320 kbps; Apple Music uses 256 kbps AAC, and Amazon Music HD streams FLAC at an average of about 850 kbps. Working files and masters: uncompressed WAV or FLAC, regardless of final use.

A special case is audio compression for web and mobile applications. The W3C Web Audio API (first drafted in 2011 and implemented from Chrome 10, Firefox 25, and Safari 6) does not mandate specific codecs; it decodes whatever the browser supports, which in current browsers generally includes MP3, AAC, and WAV, with OGG/Vorbis and Opus support varying by browser. For background music in web applications, the recommendation is AAC at 128 kbps (universally supported in iOS Safari) or OGG/Vorbis at 128 kbps with an MP3 fallback for Safari. For UI sound effects (buttons, notifications), 64–96 kbps is sufficient given their short duration and low spectral complexity.

The Opus format, standardized by the IETF in 2012 (RFC 6716), outperforms MP3 and AAC at low bitrates (8–64 kbps) and is the preferred format for real-time communications (WebRTC) and modern adaptive streaming.
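The recommendations in this section can be collected into a small lookup table. This is only a sketch: the use-case keys and the function name are illustrative, the ranges are the ones quoted in the text, and the clamp reflects the 64–320 kbps range mentioned earlier for this tool.

```python
# Recommended bitrate ranges in kbps, taken from the use cases described above.
RECOMMENDED_KBPS = {
    "voice": (32, 64),
    "podcast": (64, 64),
    "casual_music": (128, 192),
    "attentive_music": (256, 320),
    "web_background": (128, 128),
    "ui_sound": (64, 96),
}


def pick_bitrate(use_case: str, prefer_quality: bool = False,
                 tool_min: int = 64, tool_max: int = 320) -> int:
    """Pick the low or high end of the recommended range, clamped to the tool's limits."""
    low, high = RECOMMENDED_KBPS[use_case]
    chosen = high if prefer_quality else low
    return max(tool_min, min(tool_max, chosen))


for case in RECOMMENDED_KBPS:
    print(f"{case}: {pick_bitrate(case)}-{pick_bitrate(case, prefer_quality=True)} kbps")
```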