Q & A: Vocal Isolation
Danzibar writes:
Hello!
How can I isolate vocals from a song? I'm using Cubase SX, but have very basic knowledge of the programme so far. It's a very simple song, vocals and mbira. I want to get rid of the mbira and use the vocal track.
Help would be much appreciated.
Cheers.
Thanks for writing in, Danzibar. That's a really common question we sound guys get. Unfortunately, the answer comes with lots of “ifs and buts.”
Short Answer:
First of all, as far as I’m aware, there aren’t any “perfect” methods of extracting vocals from a mix. There are plugins and processes that try to separate the audio in the middle from the audio on the sides. This kind of processing is called “Mid-Side” processing (different from, but related to, Mid-Side recording). It will help you extract a vocal if that vocal happens to be the only or loudest track centered in the mix.
The catch is that you’ll generally still get a lot of bleed, and if it’s not done right, it can create all sorts of funky artifacts and distortion. But it’s kind of the “best” option out there for now.
Here are a couple of free things that might help you on your way.
Brainworx makes a free plugin called “Solo” which lets you, well, “solo” the mid and sides of your track. Try throwing that on a track and soloing your mid. Then throw on some EQ to cut out the extreme highs and lows and lower the volume of the remaining bleed. Plugin formats: VST, AU, RTAS, TDM.
Blue Cat Audio makes a cool set of plugs, one of which is a mid-side gain plugin. If you try this one, you can “play” with how much mid and side you get by adjusting the gains. Plugin formats: VST, AU, DX. More on that here.
Oh and check out their description of mid-side processing here.
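If you're curious what a mid-side gain control is doing under the hood, here's a rough sketch in Python. It is not how any particular plugin is implemented; the function name `mid_side_gain` and the plain lists of samples are just my assumptions for illustration. It splits left/right into mid and side, scales each, and folds them back into left/right.

```python
def mid_side_gain(left, right, mid_gain=1.0, side_gain=1.0):
    """Scale the mid and side parts of a stereo signal, then return new (left, right).

    Hypothetical helper for illustration; samples are plain lists of floats.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0 * mid_gain     # what both channels share
        side = (l - r) / 2.0 * side_gain   # what differs between them
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right

left = [0.75, 0.0, 0.25, 0.5]
right = [0.5, -0.25, 0.75, 0.0]

# Gains of 1.0 give back the original stereo signal.
same_left, same_right = mid_side_gain(left, right)

# Side gain of 0.0 "solos" the mid: both outputs become the same mono signal.
mid_left, mid_right = mid_side_gain(left, right, mid_gain=1.0, side_gain=0.0)
print(mid_left, mid_right)
```

Turning `side_gain` down gradually rather than all the way to zero is the code equivalent of "playing" with the gains until the vocal sits where you want it.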
Here’s an Example:
The Original Stereo Mix
[Audio clip]
The Sides Channel – What you’re trying to remove.
[Audio clip]
The Mid Channel – The sides removed from the stereo mix, high-pass filtered at 150 Hz.
[Audio clip]
Notice how on the Mid channel, you’ve still got a fair amount of snare and hat, and you can still make out some bass/kick. This is because those tracks were also centered in the mix, so this kind of processing won’t entirely get rid of them. However, I was relatively successful in raising the vocal channel above a lot of the other elements in the mix, and this kind of processing really requires tweaking. I did the above example in about ten minutes.
***
(Much) Longer Answer:
The basic idea behind most vocal extraction plugins is to use the nature of stereo to your advantage. If you have a stereo audio file and a vocal is panned completely center, then it theoretically has equal level in both the left and right channels. If that vocal is panned all the way to the left, or hard left, then it has roughly full volume on the left and is not present in the right. Panning hard right reverses the situation.
Nerd Note: This is roughly true, as panning is handled differently by different hardware/software, and a lot of panning mechanisms aren’t actually linear in scale. More on that here.
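To make the Nerd Note a bit more concrete, here's a small sketch of two common pan laws in Python. The function names and the -1 (hard left) to +1 (hard right) pan range are my own conventions for this example, and your DAW may well use a different law.

```python
import math

def linear_pan(pan):
    """Linear pan law: the two gains always add up to 1.

    pan runs from -1.0 (hard left) to +1.0 (hard right).
    """
    right = (pan + 1.0) / 2.0
    return 1.0 - right, right

def constant_power_pan(pan):
    """Constant-power (sine/cosine) pan law: left^2 + right^2 stays at 1."""
    angle = (pan + 1.0) * math.pi / 4.0   # 0 (hard left) .. pi/2 (hard right)
    return math.cos(angle), math.sin(angle)

for pan in (-1.0, 0.0, 1.0):
    print(pan, linear_pan(pan), constant_power_pan(pan))
```

Either way, a center-panned track lands at the same level in both channels and a hard-panned track lands in only one. What changes between pan laws is how loud "center" is compared to "hard left", which is why the explanation above is only roughly true.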
But how does that help us with vocal extraction? In theory, there are ways to “separate” the audio that is different between the L and R channels from audio that is the same in both channels. That might sound a bit confusing, but think of it as being able to take all the things from the “sides” and separate them from things that are in the “center”.
Great, that’s exactly what we want, right? Well, it doesn’t generally work perfectly. Most of the time, there is stuff other than your vocal in the center channel, and you often get some “bleed” into your center channel.
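Here's a tiny worked example of the basic sum-and-difference math, written in Python. The toy "mix" (a center vocal plus one hard-left instrument) and the function name `mid_side_split` are made up for illustration; real plugins may do something fancier.

```python
def mid_side_split(left, right):
    """Split stereo samples into mid (shared by L and R) and side (the difference)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

# Toy mix: a vocal panned center plus an instrument panned hard left.
vocal = [0.5, -0.25, 0.75, 0.0]
instrument = [0.25, 0.25, -0.5, 0.5]

left = [v + i for v, i in zip(vocal, instrument)]   # vocal + hard-left instrument
right = list(vocal)                                  # vocal only

mid, side = mid_side_split(left, right)
print("mid: ", mid)    # the vocal plus the instrument at half level
print("side:", side)   # the instrument at half level, with no vocal in it
```

Notice that the side channel ends up with no vocal at all, but the mid channel still carries the hard-left instrument at half level. That leftover is one source of the bleed: a plain mid/side split turns the sides down relative to the center, it doesn't erase them.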
There’s actually a great article on the nitty gritty of this process, here. It actually explains a lot about how FM stereo signals work, which is basically what this processing is based on.
After you’ve isolated the “mid” channel, or the center, you should be left with your main vocal, and anything else that was panned center in the mix. Usually that includes bass guitar, kick drum, snare drum, and some other melodic tracks.
Now comes the part that takes a little skill, tweaking, and patience. You have to EQ and gate the track to try to reduce the volume of things other than the vocal. Start by high-pass filtering the track. I’d suggest trying a pretty steep filter and then bringing the cutoff up while listening to the track. Try to find a frequency between the kick drum/bass elements and the lower end of your vocal. I often find it somewhere between 150 and 250 Hz, but listen for yourself.
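If you want to see what a high-pass filter does to low versus high content, here's a bare-bones sketch in Python. Big caveat: this is a simple first-order filter (only about 6 dB/octave), much gentler than the steep filter I'd actually reach for in a plugin, and the 60 Hz and 1 kHz sine waves are just stand-ins I made up for "kick/bass" and "vocal".

```python
import math

def high_pass(samples, cutoff_hz, sample_rate):
    """First-order (roughly 6 dB/octave) RC-style high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def sine(freq_hz, sample_rate, seconds=0.5):
    """Full-scale sine wave as a list of floats."""
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(count)]

def peak(samples):
    return max(abs(s) for s in samples)

sample_rate = 44100
cutoff_hz = 200.0   # somewhere in the 150 to 250 Hz range; tune by ear

low_out = high_pass(sine(60.0, sample_rate), cutoff_hz, sample_rate)     # "kick/bass" stand-in
high_out = high_pass(sine(1000.0, sample_rate), cutoff_hz, sample_rate)  # "vocal" stand-in

settle = len(low_out) // 2   # skip the start so the filter has settled
print("60 Hz peak after filter:", round(peak(low_out[settle:]), 3))
print("1 kHz peak after filter:", round(peak(high_out[settle:]), 3))
```

The low tone comes out much quieter than it went in, while the 1 kHz tone passes nearly untouched. A steeper filter would push the low stuff down further for the same cutoff.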
If you have way too much high end in the Mid channel, you can try Low Pass Filtering, but this can make your vocal sound muted or unnatural, so at the very least, use a shallow slope (6-12 dB/octave).
Next, try a gate. You’re going to try to set the threshold of the gate below the volume of the vocal but above everything else. This can get tricky, and you probably won’t be able to catch everything. The goal here is to let the gate “turn down” the parts between vocal lines, and leave the actual vocal parts untouched.
Once the threshold is set, mess with the “range” or “reduction” and try something heavy first, like 60 dB, then bring it back to try to make the gating sound more natural. Setting the attack/release/hold really requires some finesse. I’d suggest a rather quick attack, and then play with the hold and release till it sounds more natural.
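For the curious, here's a stripped-down gate in Python showing how threshold, range, attack, hold, and release fit together. It's a sketch under my own assumptions (sample-by-sample level detection, linear fades, made-up default values), not how any specific gate plugin works, and the test signal is just a loud tone sandwiched between two quiet ones.

```python
import math

def noise_gate(samples, sample_rate, threshold_db=-30.0, range_db=60.0,
               attack_ms=1.0, hold_ms=50.0, release_ms=100.0):
    """Very simple gate: turn the signal down by range_db when it is below threshold."""
    threshold = 10.0 ** (threshold_db / 20.0)
    floor_gain = 10.0 ** (-range_db / 20.0)           # gain when fully closed
    attack_step = 1.0 / max(1.0, attack_ms * sample_rate / 1000.0)
    release_step = 1.0 / max(1.0, release_ms * sample_rate / 1000.0)
    hold_samples = int(hold_ms * sample_rate / 1000.0)

    out = []
    openness = 0.0    # 0.0 = fully closed, 1.0 = fully open
    hold_left = 0
    for x in samples:
        if abs(x) >= threshold:
            hold_left = hold_samples   # loud enough: (re)start the hold timer
            opening = True
        elif hold_left > 0:
            hold_left -= 1             # quiet, but still inside the hold time
            opening = True
        else:
            opening = False
        if opening:
            openness = min(1.0, openness + attack_step)
        else:
            openness = max(0.0, openness - release_step)
        out.append(x * (floor_gain + (1.0 - floor_gain) * openness))
    return out

sample_rate = 8000

def tone(amplitude, start, count, freq_hz=440.0):
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(start, start + count)]

def peak(samples):
    return max(abs(s) for s in samples)

# Quiet "bleed" (-40 dB), a loud "vocal line", then quiet bleed again.
signal = tone(0.01, 0, 1600) + tone(0.5, 1600, 1600) + tone(0.01, 3200, 3200)
gated = noise_gate(signal, sample_rate)

print("bleed before vocal:", peak(gated[:1600]))
print("vocal line:        ", peak(gated[2000:3000]))
print("bleed at the end:  ", peak(gated[5600:]))
```

The hold time is what keeps the gate from chattering every time the waveform crosses zero. In this sketch, dropping `range_db` from 60 to something like 12 leaves more of the bleed in, which often sounds more natural than dead silence between lines.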
Lastly, you can try some spectral editors. They let you see your audio in an FFT view, which is a fancy way of saying you can see Time (left-to-right), Frequency (top-to-bottom), and Volume (either shades of color or brightness) all in one view.
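If you've never seen where that view comes from, here's a toy version in Python: chop the audio into short overlapping frames and measure how much of each frequency is in each frame. The frame size, hop, window, and the 1 kHz test tone are all my own choices for the example, and I'm using a slow, naive DFT so it runs without extra libraries. Real editors use a proper FFT and much bigger frames.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame is a list of magnitudes per frequency bin."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_size)
              for n in range(frame_size)]                 # Hann window
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):              # bins from 0 Hz up to Nyquist
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

sample_rate = 8000
frame_size = 64
test_tone = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(1024)]

frames = spectrogram(test_tone, frame_size=frame_size, hop=32)

# Time runs along `frames`, frequency along each inner list, volume is the value.
first = frames[0]
loudest_bin = max(range(len(first)), key=first.__getitem__)
print("frames:", len(frames))
print("loudest frequency in first frame (Hz):", loudest_bin * sample_rate / frame_size)
```

A steady tone shows up as a horizontal line, while something like a snare hit shows up as a short vertical smear across lots of frequencies. That's what makes a drum hit easy to spot and select in a spectral editor.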
I loaded the Mid channel into my spectral editor and zoomed in to a phrase with a snare drum hit in it. Then I selected the snare drum hit.
Then I processed it using “spectral repair.” Some editors use “repair” or “remove event.” In any case it is a function that tries to remove a transient sound (like a drum hit) and overwrite it using the audio before and after the hit. It kind of works like Photoshop’s smudge tool if you’re familiar with that.
I use iZotope RX, which is kind of an investment but works great. There are other ones out there that you can try, though I’m not familiar with too many. I’ll try to add a list of some soon.






Thanks for linking to my article on Mid/Side extraction.
However, I only used FM stereo as an example in my article…I never intended it to be a “how it works”, although I suppose that’s just a side effect of how well I overexplain things.
Thanks Jay, your article really sheds some light on the concepts behind Mid-Side processing, and I think it could really benefit people without a lot of experience with the technique. Keep up the great work!