# Audio Quality Optimization

Even with the best voice model and a great song, AI cover quality varies based on input quality, settings, and post-processing. This guide covers how to get professional-quality results from MusicWave.

## The quality chain

AI cover quality depends on three stages:

1. **Input quality** — the source audio you start with
2. **Generation settings** — how MusicWave processes the audio
3. **Post-processing** — what you do after generation

Weak quality at any stage compromises the final result.

## Stage 1: Input quality

The cleaner your input, the better your output. The "garbage in, garbage out" principle applies strongly to AI audio.

### Use high-quality source audio

* Prefer **WAV or FLAC** over MP3
* If MP3 is your only option, use the highest bitrate available (ideally 320 kbps, the MP3 maximum)
* Avoid heavily compressed or low-quality streaming rips
* Source from official releases when possible

### Avoid these input problems

* **Background noise** — kills voice extraction quality
* **Heavy effects on vocals** — autotune, harmonizers, heavy reverb
* **Multiple overlapping vocalists** — confuses the model
* **Crowd noise or live recordings** — adds unwanted artifacts
* **Distortion or clipping** — produces unusable results
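One way to screen a source for clipping before uploading is to look for runs of consecutive samples at or near full scale. The sketch below is an illustration only, not MusicWave's own check. The function name `count_clipped_runs`, the 0.999 threshold, and the minimum run length of 3 are assumptions chosen for the example, and samples are assumed to be floats normalized to the range -1.0 to 1.0.

```python
def count_clipped_runs(samples, threshold=0.999, min_run=3):
    """Count runs of at least `min_run` consecutive samples whose
    magnitude is at or above `threshold` (assumed full scale = 1.0)."""
    runs = 0
    current = 0
    for s in samples:
        if abs(s) >= threshold:
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    # Account for a run that reaches the end of the signal.
    if current >= min_run:
        runs += 1
    return runs


clean = [0.0, 0.5, 0.9, 0.5, 0.0, -0.5, -0.9, -0.5]
clipped = [0.0, 0.8, 1.0, 1.0, 1.0, 0.8, 0.0,
           -0.8, -1.0, -1.0, -1.0, -1.0, -0.8]

print(count_clipped_runs(clean))    # 0
print(count_clipped_runs(clipped))  # 2
```

A single full-scale sample is usually just a loud peak, which is why the sketch only counts flat-topped runs.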

### Pre-processing input

Before submitting to MusicWave:

1. **Normalize the audio** to consistent volume
2. **Remove silence** at the start and end
3. **Trim to the section you want** — don't process unnecessary parts
4. **Consider noise reduction** if needed

## Stage 2: Generation settings

MusicWave offers several settings that affect output quality and processing time.

### Quality presets

| Setting  | Speed   | Use for                 |
| -------- | ------- | ----------------------- |
| Draft    | Fastest | Testing voice/style fit |
| Standard | Medium  | Most regular use        |
| High     | Slower  | Final renders           |
| Studio   | Slowest | Professional output     |

### Sample rate

* **44.1 kHz** — CD quality, fine for most uses
* **48 kHz** — broadcast standard, good for video
* **96 kHz** — overkill for most use cases, larger files

### Bit depth

* **16-bit** — standard, sufficient for streaming
* **24-bit** — professional, better for further editing

For most users, 44.1 kHz / 16-bit is fine. Choose higher only if you're doing further professional production.
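The arithmetic behind these options is short. The highest frequency a sample rate can represent is half that rate (the Nyquist frequency), and the theoretical dynamic range of N-bit PCM is approximately 6.02 × N + 1.76 dB. The sketch below just evaluates those two textbook formulas; the function names are hypothetical.

```python
def nyquist_hz(sample_rate_hz):
    """Highest representable frequency for a given sample rate."""
    return sample_rate_hz / 2


def pcm_dynamic_range_db(bit_depth):
    """Theoretical dynamic range of linear PCM (full-scale sine
    relative to quantization noise)."""
    return 6.02 * bit_depth + 1.76


for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz captures up to {nyquist_hz(rate):.0f} Hz")

for bits in (16, 24):
    print(f"{bits}-bit: about {pcm_dynamic_range_db(bits):.0f} dB")
```

Since 44.1 kHz already covers frequencies up to 22.05 kHz, above the commonly cited 20 kHz limit of human hearing, higher rates mostly matter when the file will be processed further.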

### Voice tuning options

* **Pitch correction** — subtly tunes the AI vocal (recommended on)
* **Vibrato matching** — matches vibrato to original singer
* **Breath sounds** — adds realistic breathing (more natural sound)
* **Emotion intensity** — controls how expressive the vocals are

## Stage 3: Post-processing

Even great AI vocals usually need post-processing to sound truly professional.

### Volume balancing

Compare the AI vocal's volume to the instrumental's:

* Vocals should sit on top of the mix, not be buried in it
* Don't let vocals overpower everything else
* Aim for the vocal and the instrumental to have roughly equal perceived loudness
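If you want a number to go with your ears, a rough starting point is to compare the RMS level of the vocal stem with that of the instrumental. The sketch below assumes float samples in the range -1.0 to 1.0 and uses hypothetical helper names. RMS is only a crude proxy for perceived loudness; a loudness meter that follows ITU-R BS.1770 (LUFS) is a better guide when you have one.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10 * math.log10(mean_square)


def vocal_minus_instrumental_db(vocal, instrumental):
    """Positive result: the vocal is louder than the instrumental."""
    return rms_db(vocal) - rms_db(instrumental)


vocal = [0.5, -0.5] * 100
instrumental = [0.25, -0.25] * 100
difference = vocal_minus_instrumental_db(vocal, instrumental)
print(f"{difference:.1f} dB")  # 6.0 dB
```

Doubling the amplitude raises the level by about 6 dB, which is what the example shows.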

### EQ (Equalization)

Common EQ adjustments for AI vocals:

* **High-pass filter at 80 Hz** — removes low-end rumble
* **Cut around 250-500 Hz** — reduces "muddiness"
* **Slight boost at 3-5 kHz** — adds presence and clarity
* **Boost at 10 kHz** — adds "air" and brightness

### Compression

Apply gentle compression to even out vocal dynamics:

* Ratio: 2:1 to 4:1
* Attack: 5-10 ms
* Release: 100-300 ms
* Threshold: just enough to catch peaks
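The relationship between these parameters can be shown with a small sketch of a compressor's static gain curve and its time constants. It is an illustration, not the behavior of any specific plugin: the function names are hypothetical, the -18 dBFS threshold is an assumed example value, and real compressors differ in knee shape and level detection.

```python
import math


def gain_reduction_db(level_db, threshold_db=-18.0, ratio=3.0):
    """Gain change (0 or negative, in dB) for a hard-knee compressor.

    Above the threshold, every `ratio` dB of input yields 1 dB of output.
    """
    if level_db <= threshold_db:
        return 0.0
    over = level_db - threshold_db
    return over / ratio - over


def smoothing_coefficient(time_ms, sample_rate_hz):
    """One-pole smoothing coefficient for an attack or release time."""
    return math.exp(-1.0 / (time_ms / 1000.0 * sample_rate_hz))


# A peak at -6 dBFS is 12 dB over the threshold; at 3:1 it comes out
# 4 dB over, so the compressor applies 8 dB of gain reduction.
print(gain_reduction_db(-6.0))   # -8.0
print(gain_reduction_db(-24.0))  # 0.0

attack = smoothing_coefficient(10, 44_100)
release = smoothing_coefficient(200, 44_100)
print(attack < release)  # True: gain recovers more slowly than it clamps
```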

### Reverb and ambience

Adding subtle reverb makes AI vocals feel more natural:

* Use a short room or plate reverb
* Mix at 10-20% wet
* Match the reverb to the original instrumental's space
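The wet percentage is a simple linear blend of the unprocessed (dry) vocal and the reverb output (wet). The sketch below assumes the reverb has already been rendered to a signal of the same length; the function name `mix_wet_dry` is hypothetical. Note that conventions vary: some setups keep the dry signal at full level and add reverb on a send instead of crossfading.

```python
def mix_wet_dry(dry, wet, wet_amount=0.15):
    """Blend dry and wet signals; wet_amount=0.15 means 15% wet."""
    if not 0.0 <= wet_amount <= 1.0:
        raise ValueError("wet_amount must be between 0.0 and 1.0")
    if len(dry) != len(wet):
        raise ValueError("dry and wet must have the same length")
    return [(1.0 - wet_amount) * d + wet_amount * w
            for d, w in zip(dry, wet)]


dry = [1.0, 0.0, 0.0, 0.0]
wet = [0.0, 0.5, 0.25, 0.125]  # stand-in for a decaying reverb tail
mixed = mix_wet_dry(dry, wet, wet_amount=0.2)
print(mixed)
```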

### De-essing

AI vocals sometimes have harsh sibilants. A de-esser at 5-8 kHz tames them.

## Common quality problems and fixes

### "The vocal sounds detached from the instrumental"

The vocal needs ambience to sit in the mix. Add a small amount of reverb that matches the instrumental's space.

### "The vocal is robotic"

Try a different voice model with more natural characteristics. Also enable "breath sounds" in generation settings.

### "Sibilants are harsh"

Apply a de-esser. Reduce frequencies between 5-8 kHz dynamically when sibilants occur.

### "Vocals sound thin"

Apply a subtle EQ boost in the low-mid range (250-400 Hz). Too much makes the vocal muddy.

### "Vocals sound muddy"

Apply an EQ cut in the low-mid range (200-400 Hz), and use a high-pass filter at 80 Hz to remove content below that point.

### "Pitch sounds slightly off"

Use the pitch correction setting in MusicWave, or apply post-processing pitch correction in your DAW.

### "Volume is uneven"

Apply compression to even out dynamics.

## Output formats

Choose the right format for your use case:

| Format       | Quality  | File size | Best for                          |
| ------------ | -------- | --------- | --------------------------------- |
| MP3 320 kbps | High     | Small     | Streaming, sharing                |
| WAV          | Lossless | Large     | Video editing, further production |
| FLAC         | Lossless | Medium    | Archive, audiophile               |
| AAC          | High     | Small     | Apple devices, video              |

For YouTube uploads, prefer WAV: YouTube re-compresses the audio anyway, so giving it the cleanest source matters.
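To put rough numbers on the file size column, the sketch below estimates sizes for uncompressed PCM (WAV) and constant-bitrate MP3. The function names are hypothetical, and the estimates ignore headers and metadata. FLAC is left out because its size depends on the audio content.

```python
def pcm_size_bytes(duration_s, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Approximate size of uncompressed PCM audio (WAV payload)."""
    return int(duration_s * sample_rate_hz * (bit_depth // 8) * channels)


def cbr_size_bytes(duration_s, bitrate_kbps=320):
    """Approximate size of a constant-bitrate file such as MP3 320 kbps."""
    return int(duration_s * bitrate_kbps * 1000 / 8)


duration = 210  # a 3.5-minute song, in seconds
print(pcm_size_bytes(duration))  # 37044000 bytes, about 37 MB
print(cbr_size_bytes(duration))  # 8400000 bytes, about 8.4 MB
```

At these settings the WAV is a little over four times the size of the MP3.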

## Testing your final mix

Before publishing, listen to your mix on multiple systems:

1. **Studio headphones** — for technical accuracy
2. **Earbuds / AirPods** — what most people use
3. **Phone speaker** — worst-case scenario
4. **Car speakers** — common listening environment
5. **Laptop speakers** — common for content viewing

A mix that sounds good on all of these should translate well to whatever your audience listens on.

## Reference tracks

Compare your mix to commercial releases in the same genre:

* Load a reference track in your editor
* Switch back and forth between yours and the reference
* Match overall loudness, brightness, and balance

Reference tracks reveal what professional mixes sound like and highlight what's missing in yours.

## Avoiding over-processing

It's easy to over-process AI vocals. Signs you've gone too far:

* The vocal sounds artificial or "plasticky"
* Heavy autotune artifacts are audible
* Excessive reverb obscures the lyrics
* Compression pumps unnaturally
* The processed vocal no longer resembles the vocal MusicWave generated

Less is usually more. Trust the AI's output and add only what's needed.

## Quick quality checklist

* [ ] Used high-quality source audio
* [ ] Pre-processed to remove noise and silence
* [ ] Selected appropriate quality preset
* [ ] Applied gentle compression
* [ ] Added subtle reverb to match instrumental
* [ ] EQ'd for clarity without muddiness
* [ ] De-essed if needed
* [ ] Listened on multiple speakers
* [ ] Compared to reference tracks
* [ ] Exported in correct format

## Next steps

* [Voice Model Selection Guide](/docs/cover-song-tutorials/voice-models.md) — Choosing the right voice
* [Creating Cover Songs Ethically](/docs/cover-song-tutorials/creating-ethically.md) — Legal considerations
* [Stem Splitter Guide](/docs/tools-documentation/stem-splitter.md) — Source separation

[Try MusicWave free →](https://www.musicwave.ai)

