Voice Recording for Beginners

(Last updated: February 2020)

1. Disclaimer

This guide is meant for beginners. I am not a professional. I give advice to the best of my knowledge, but please bear in mind that you may get better advice, or advice that suits you better, from people who have deeper knowledge and more experience than I do.

This is about the recording of spoken words. Singing, or music, are different matters, and much of what is said here doesn’t apply there.

2. Introduction

You need a recording room, recording equipment (a microphone, a recording device, recording software if it isn’t built into the device, and basic knowledge of how to use them), and, optionally, a computer with sound editing software (optionally, because maybe you’ll have someone else editing your recordings).

3. The Recording Room

It is important that the room in which you record meets two criteria: that it is free of ambient noise, and that it has as little echo as possible. The closer the microphone is to your mouth the less effect noise and echo will have on your recordings, but a good recording environment will always be essential for getting good results.

“Noise” includes faint sounds that you may hardly notice because you are used to them — the ticking of a clock, the central heating, the ceiling fan, the computer fan, or, as a friend of mine once found out, the air pump of the fish tank. And, of course, external noises, traffic, children, dogs … I know it is difficult to have silence, but you need to have it. And don’t forget to turn the fish tank air pump and other life-saving equipment on again, if you had to turn them off for your recording session!

Echo isn’t necessarily bad if you’re a professional recording engineer and know what you’re doing, but for our purposes the rule is simple: the less echo you have, the better. The problem with echo is that your microphone picks it up, even if your ears aren’t aware of it.

This is not the place to discuss the finer points of room acoustics, but the important thing for us is, sound-absorbing surfaces reduce echo. Anything that’s soft: carpets, curtains, rugs on the floor or on the wall, upholstered furniture, pillows, blankets, duvets, clothes, laundry on a laundry rack, but also, for instance, books. If there is an armoire in the room, open it. Your location in the room and the direction in which you speak may also make a difference, though probably only a small one.

4. The Recording Equipment

4.1. Microphone

You need a proper microphone — the one that’s built into your cell phone, tablet or laptop computer will not do!

Microphones come with a number of different connectors; the most common ones are 3.5mm, XLR (for professional use), and USB. They come in different technologies, like dynamic, condenser or electret microphones. Dynamic microphones do not need power; others may need to be powered by the recording device, or they may pack their own batteries. They come with different directionalities, like omnidirectional (spherical) or cardioid, they come in all sizes and shapes, they come for a variety of intended uses, and the range of their prices is at least as wide as that of their quality.

If you are on a tight budget, you may want to use an inexpensive lavalier microphone. These are the small microphones that get clipped to your clothes, usually equipped with a 3.5mm connector. There are models that can be powered by their own batteries, but recording devices like cell phones and audio recorders provide the necessary power (you may need to enable it), and allow you to use the more common and more practical battery-less models. (You cannot use a battery-less model with a computer!) The other advantages of lavalier microphones: they cost little (just don’t go for the cheapest ones), they take up little space, they don’t need a microphone stand or any other support apart from what you’re wearing on your body, and they’ll be out of the way of the air flow from your mouth and thus not be susceptible to those nasty “pop” sounds.

They have disadvantages, too: their quality will not satisfy higher demands, their cheap electronic circuits produce white background noise (which can later be removed from the recording, though), and, not being that close to your mouth, they pick up ambient sound and echo. Still, they are a small investment, and the results you’ll get may meet your requirements.

The usual place to wear a lavalier microphone is on your chest, clipping it to the button placket of your shirt or blouse, or to your tie if you’re wearing one for the occasion. A caveat: avoid having the cable rub against your clothes when you move — this may create noises that can ruin your recording, but with a little practice you’ll find out how to avoid it.

Of course, if you have the money to spend and the expertise to spend it wisely, there’s nothing better than a good “real” microphone with a pop screen and a microphone stand in front of you. Pay attention to connectivity, though!

4.2. Recording Device

A good choice, both in terms of quality and usability, is a hand-held audio recorder, for instance by Roland, Tascam or Zoom. They aren’t that expensive, and they will serve your recording needs for a long time (I have had mine for more than 10 years now, far longer than any computer, camera or cell phone). Despite their built-in pair of stereo microphones, you will still have to get an external one. Most microphones that require low-voltage external power can be powered by most audio recorders, provided that they can be plugged in. USB microphones will not work. (Note: do not confuse a proper hand-held audio recorder with those voice recorders for office use!)

“Not expensive” is of course relative, so you may prefer to use a recording device that you already have, like a cell phone or a tablet computer. How well they do the job depends upon the quality of their audio processing hardware (the analog-to-digital converter), for which, unfortunately, you will hardly be able to get any technical specifications. (With an iPhone you’ll probably be on the safe side, as Apple has a tradition of caring about audio quality. If you have a new iPhone model without a 3.5mm connector, you need Apple’s Lightning to 3.5mm adapter, the same one that also lets you connect headphones with standard 3.5mm plugs.) Your phone or tablet computer will have come with its own recording app, but you will need one that gives you better control over the recording process, even if it costs a little money.

If you record with a computer, whether a desktop model or a portable one, you will probably use either a USB microphone that plugs into the computer, or an XLR microphone that is connected to a small mixing console which in turn connects to the computer via USB. In these cases, the analog-to-digital conversion is done by the microphone or the mixing console respectively, so that the computer’s usually less-than-perfect audio circuits do not interfere with the quality of the recording. What does interfere, though, is the noise made by your computer’s fan; the only real solution to this problem is to use a computer or a laptop without a fan.

The advantage of recording with a computer is that you can choose from a wide variety of USB and XLR microphones, from low cost to semi-professional to professional. USB means that the analog-to-digital conversion is done by the microphone itself, while XLR is an analog connector, basically the same as 3.5mm. USB microphones are easier to use, and also easier to buy: you just need to make your choice of microphone. The combination of a good XLR microphone and a good mixing console cannot be topped when it comes to sound quality, but this is a more complex setup which requires more know-how to buy and also to operate, with more things that have to be done right (and, of course, with one more piece of equipment to clutter your workspace).

When you record with a computer you may want to use software that lets you do both the recording and whatever editing you need to do. Audacity, for instance, is free and open source software that probably offers all you will need.

4.3 Summary and Prices

Studio quality you will only get in a recording studio, but a decent quality that will suffice for many purposes can be had for relatively little money. Microphone and recording device have to match, which unfortunately is not trivial.

Currently (February 2020) you can get a “broadcast-grade” Rode Smartlav+ lavalier microphone for about $60 (Amazon US), which is a great choice if your recording device can supply the power that it needs (most hand-held devices do, computers usually don’t). For only $15 you can get a Movo PM10 lavalier mic, which has some good user reviews. A Tascam DR-05X audio recorder, for instance, costs about $90 — it looks good, but I haven’t tested it. Other recorders, for instance by Tascam, Zoom or Roland, can be had for anything from $100 to $200 and more.

You can find good USB microphones from $100 upwards; my favorite is the Rode NT USB for $170. If not included, you’ll also need a microphone stand, a shock mount, and a pop filter. If you use that with a fan-less mobile computer, you have a perfect little portable recording studio.

There is a large choice of condenser microphones with XLR connectors, from really cheap to quite expensive, which need a power supply or a mixing console to work, and there are dynamic microphones which do not, but also mostly come with XLR. There are all kinds of microphones, adapters, interfaces and recording devices which may, or may not, work together. Just do a little research before you buy, and I’m sure you’ll find something that suits you, for an acceptable price.

5. How to Record

Depending on the device and software that you use there can be different settings to make, but these two are essential: volume and file saving format.

5.1 Volume

For volume, you can use the easy way out and, if this option is provided, choose “automatic gain control.” That way, you cannot do much wrong, but you’re not doing it perfectly right, either — your recording loses expression and emotional depth that comes from speaking some words louder or softer. Still, if you find it difficult to keep the volume of your voice steady and in the appropriate loudness range, or if technical correctness is more important than artistic expression, automatic gain control can be useful, and you can disregard the following paragraphs.

In the world of sound recording, volume is usually measured in negative numbers: in decibels (dB) below maximum (dB measures relative distance on a logarithmic scale; alternatively, 0 dB can also be shown as 100%). In analog days, this was the level that a signal should not exceed, to avoid distortions; in our digital age, it is the level that the signal cannot exceed. That is not good news, though. What happens when the volume exceeds this limit is called “clipping”: the peaks of the sound wave, when they hit the ceiling, are cut off, distorting the sound beyond repair. Too low a volume is by far the lesser evil, but it also has a negative effect on sound quality.

I recommend this way to set the proper recording volume: speak a text with your normal recording voice. Adjust the volume so that the peaks get close to, but do not exceed, -9dB (or 35%). Now emphasize some important or dramatic words, and make sure that the peaks still stay below -3dB (or 70%) — if they don’t, lower the recording volume until they do. (You should never raise or lower your voice too much, or the changes in volume will make listening to the recording difficult.) Important: If you can, use a sound editor (like Audacity) to verify that the recording levels shown on your recording device are accurate. I've seen cases where they are not.

Remember the volume setting: you can use it in the future as long as you use the same microphone in the same position, and you will not have to keep the volume indicator under constant observation. It’s still a good idea to speak a few words and check the volume indicator before each recording, and to check the recorded files afterwards. (Also keep in mind that, if you record using a computer, there can be up to three volume controls involved: on your microphone, in the recording app, and on the operating system level. If you use a mixer console, it can actually be four.)

Feel free to arrive at this result in any other way, but ideally volume peaks should be in the range between -9dB and -3dB, and never, ever, reach 0dB or 100%.
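If you are comfortable with a little programming, you can also check the peak level of a recorded file yourself instead of relying on the device’s meter. What follows is only a minimal sketch in Python, under my own assumptions: the file is a 16-bit PCM WAV, and the names peak_db, read_wav_samples and judge_peak are made up for this illustration, they do not come from any recorder or editor.

```python
import array
import math
import sys
import wave


def peak_db(samples, full_scale=32768):
    """Peak level of integer samples in dB relative to full scale (0 dB)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # nothing but digital silence
    return 20 * math.log10(peak / full_scale)


def read_wav_samples(path):
    """Read all samples of a 16-bit PCM WAV file (assumption: 16-bit only)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is stored little-endian
    return samples


def judge_peak(db):
    """Compare a peak level with the -9 dB to -3 dB range recommended above."""
    if db > -0.1:
        return "at or near 0 dB, probably clipped"
    if db > -3:
        return "too loud"
    if db >= -9:
        return "in range"
    return "on the low side"
```

You would call it as judge_peak(peak_db(read_wav_samples("take1.wav"))), where take1.wav stands for whatever test recording you made. The -0.1dB margin for “probably clipped” is my own choice, not a standard.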

(In case you are interested: The relationship between dB and % is a bit complicated. Bels are values on a logarithmic scale with base 10; a decibel is one tenth of a Bel, so, -10dB should equal 0.1, that is 10%. But, decibels measure sound volume, which corresponds to signal power, which grows with the square of its amplitude, to which the percentage scale refers. Therefore, -10dB in volume means sqrt(0.1) in amplitude, which is 0.316, or 32%, on a linear scale.)
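For those who prefer to see the arithmetic spelled out, here is a small Python sketch of that conversion. It assumes, as explained above, that the percentage scale refers to amplitude, so 20 (and not 10) is the factor in front of the logarithm; the function names are mine.

```python
import math


def db_to_percent(db):
    """Convert a level in dB (0 dB = maximum) to percent of full amplitude."""
    return 100 * 10 ** (db / 20)


def percent_to_db(percent):
    """Convert percent of full amplitude back to dB below maximum."""
    return 20 * math.log10(percent / 100)


for level in (-3, -9, -10):
    print(f"{level} dB is about {db_to_percent(level):.0f}%")
```

This prints about 71% for -3dB, 35% for -9dB and 32% for -10dB, which matches the rounded figures used in this guide.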

5.2 File Format

The important thing to bear in mind is that each time you save audio in a “lossy” compressed file format, such as mp3 (which is by far the one most commonly used), you lose some quality — how much depends upon the “quality” (or bitrate) settings.

The final product of your work will, in most cases, be a compressed mp3 file, but this comes at the very end of the postproduction process, that is, after all the recording, sound editing and mixing have been done. On the way there, you can safely use lossy file compression once, if you choose the highest quality setting (with mp3, this is 320 kbit/s). Use it more often than once, and you risk audible quality loss.

This means: If your recording isn’t too long and the resulting file size not too big, save it either in an uncompressed format (like WAV) or a lossless compressed format (like FLAC). If you have to economize with file size, you can use mp3 with 320 kbit/s, but then you have to take care to avoid another lossy compression through the entire postproduction process, until the (probable) creation of the final mp3 file.

If you send your recordings to someone else for editing/mixing, and want to have a small file to send, either record in mp3 with 320 kbit/s and send the original file, or record in a lossless format, do some preliminary editing, and then save as mp3 with 320 kbit/s to send that file. If you do the editing yourself, and use Audacity, the best idea is to record in WAV or FLAC (mp3 if otherwise the files get too large for your recording device) and stay with Audacity’s own format throughout the editing process.
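To judge whether file size is a problem at all, a rough estimate helps. The sketch below is just arithmetic in Python, and the settings in it are my assumptions, not something your device necessarily uses: a mono WAV at 44,100 samples per second and 16 bits per sample, compared with an mp3 at a constant 320 kbit/s. File headers are ignored.

```python
def wav_megabytes(seconds, sample_rate=44100, bits=16, channels=1):
    """Approximate size of uncompressed PCM audio in megabytes (10^6 bytes)."""
    return seconds * sample_rate * (bits // 8) * channels / 1_000_000


def mp3_megabytes(seconds, kbit_per_s=320):
    """Approximate size of a constant-bitrate mp3 in megabytes."""
    return seconds * kbit_per_s * 1000 / 8 / 1_000_000


ten_minutes = 10 * 60
print(wav_megabytes(ten_minutes))  # mono WAV, about 53 MB
print(mp3_megabytes(ten_minutes))  # 320 kbit/s mp3, 24 MB
```

Under these assumptions, ten minutes of mono WAV is only a little more than twice the size of the 320 kbit/s mp3, so for spoken word the saving from recording in mp3 is smaller than you might expect. A stereo WAV doubles the first figure.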

5.3 Recording Tips

Three tips that will greatly help with editing your recording:

Always include a few seconds of silence, either at the beginning or the end of your recording — the editing software needs this as a reference for noise reduction.

If you notice you’ve made a mistake, pause for a few seconds, then go back to the beginning of the sentence or the verse or the paragraph, and read on.

Read the text twice, in the same recording.

6. Basic Editing

Your recordings need to be edited. If someone else does it, then you can stop reading.

If you want or need to edit your recordings yourself, you need a computer, a decent pair of headphones (you cannot properly do it with speakers), and sound editing software. Whatever platform your computer runs on, Audacity is an obvious choice, as it is free, open source, powerful, and reasonably simple to use. You may have your reasons to choose a different sound editor. The following discussion of basic editing steps is written with Audacity in mind, but the principles stay the same whichever tool you use. You will not find any detailed instructions here, just a general idea of what needs to be done.

Always keep the original file. During the editing process, and when it is completed, always save your work in a lossless format.

a. Convert stereo to mono, if your recording happens to be stereo.

b. If the volume is low, normalize it to something like -3dB (for this, you first have to cut out loud noises, like coughs).

c. Check the plot spectrum. If it shows a lot of subsonic noise, use a high pass filter to eliminate it. A cutoff frequency of 60Hz and a rolloff of 24dB per octave work well, unless you have a very deep voice, in which case you may want to use a lower cutoff frequency.
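In case you wonder what such a filter does under the hood, here is a rough Python sketch. It is not how Audacity or any other editor implements its filter; it is my own illustration, built from two standard second-order high-pass stages (the well-known “audio EQ cookbook” formulas) whose Q values give a fourth-order Butterworth response, which falls off at 24dB per octave. Samples are assumed to be floating-point numbers between -1 and 1, and the function names are made up.

```python
import math


def highpass_stage(samples, sample_rate, cutoff_hz, q):
    """One second-order (12 dB per octave) high-pass stage."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    # Filter coefficients, normalized by a0
    b0 = (1 + cos_w0) / 2 / a0
    b1 = -(1 + cos_w0) / a0
    b2 = b0
    a1 = -2 * cos_w0 / a0
    a2 = (1 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def highpass_24db(samples, sample_rate, cutoff_hz=60.0):
    """Two stages in series: 24 dB per octave below the cutoff."""
    for q in (0.5412, 1.3066):  # Q values of a 4th-order Butterworth filter
        samples = highpass_stage(samples, sample_rate, cutoff_hz, q)
    return samples
```

With the default cutoff of 60Hz, a rumble at 15Hz (two octaves lower) comes out roughly 48dB quieter, while a 1000Hz tone passes practically unchanged. In practice you will of course use your editor’s built-in filter.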

d. Reduce background noise.

You can easily reduce or nearly eliminate a constant background noise, such as white noise generated by less-than-perfect electronic circuitry, or a hum that somehow sneaked in from the AC mains, with the help of your sound editor’s noise reduction feature.

e. Optional: use a noise gate filter to eliminate low noises in between your spoken words. Do this with caution, or you may lose bits that belong to the recording. (The values I use are level reduction -12dB, gate threshold -45dB, attack/decay 100ms.) Don’t do this if you are not sure how it works.
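If you are not sure how a noise gate works, the following Python sketch may help. It is a deliberately simplified illustration of the principle, not the algorithm of Audacity’s noise gate: the recording is examined in short blocks (10ms, my assumption), and whenever a block’s peak stays below the threshold, the gain glides down to the reduced level over the attack/decay time, and back up again when the signal returns. Samples are assumed to be floats between -1 and 1; the function and parameter names are mine, the default values are the ones quoted above.

```python
def noise_gate(samples, sample_rate, threshold_db=-45.0,
               reduction_db=-12.0, ramp_ms=100.0, block_ms=10.0):
    """Turn down passages whose level stays below the threshold."""
    threshold = 10 ** (threshold_db / 20)   # amplitude that opens the gate
    floor_gain = 10 ** (reduction_db / 20)  # gain while the gate is closed
    ramp_samples = max(1, int(sample_rate * ramp_ms / 1000))
    step = (1.0 - floor_gain) / ramp_samples  # gain change per sample
    block = max(1, int(sample_rate * block_ms / 1000))
    gain = 1.0
    out = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        peak = max(abs(s) for s in chunk)
        target = 1.0 if peak >= threshold else floor_gain
        for s in chunk:
            # Move the gain gradually toward the target to avoid clicks
            if gain < target:
                gain = min(target, gain + step)
            elif gain > target:
                gain = max(target, gain - step)
            out.append(s * gain)
    return out
```

Note that the gate only turns the quiet parts down by 12dB, it does not silence them. That is also why a threshold set too high eats the soft beginnings and endings of words.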

f. Delete all that’s not supposed to be in the recording.

You have to distinguish between cutting out a sound and silencing it. Cutting out is trivial, silencing not so. There are three ways of silencing. The first one is to mute that section, but when you listen to your recording closely, you will find that “silence” isn’t totally silent. With muting, you create total silence that can stand out from the “natural” silence of the recording. It is better to use the second method: copy a stretch of silence from somewhere else in your recording, and paste it over the part you want to silence. The third method, which I use for instance with the sounds of breathing between words, is not to delete them but only to reduce their volume, by something between 6 and 18dB. Leaving the breathing (barely) audible makes the listener feel the reader’s presence, and, in the right place, an audible intake of air may add a desired dramatic effect. Those very short little clicking sounds that come out of your mouth can safely be cut out when they are only a few milliseconds long, without any noticeable effect on the rhythm of your speech.

g. Pay attention to pauses.

Pauses are important. Some pauses between words or sentences may seem too short, others may seem too long. Listen, and trust your ears. Shortening a pause is trivial; lengthening it is best done by copying a part of it and pasting it in right next to where you copied it from (see above). With short pauses (fractions of a second) that you want to insert between words, you can use the editor’s “Create silence” function.

h. Repair flaws.

You won’t be able to do wonders, but there can be some little flaws that can be fixed. For instance, in an otherwise flawless recording, the voice artist had said “hand” instead of “hands” — I copied the “s” sound from the word “winds” two lines above, pasted it, and all was well. What flaws you can fix depends upon your editor’s features and on your skills, but always bear in mind that you can easily make things worse instead of better, so always save your work before you try something, and critically review the result.

i. Normalize volume.

A simple step, you only have to select the target volume (you can use 0, though I prefer -1dB). Or not so simple, if one or a few parts of the recording stand out in volume — then, normalizing will give those peaks the maximum volume, but for the larger parts of the recording the volume will be too low. To avoid this you have to look for those loud parts — words or syllables given too much emphasis by a raised voice. Carefully reduce their volume, maybe by 2 or 3dB, before you “normalize” the volume of the entire recording.
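What “normalizing” does is simple enough to show in a few lines of Python. This is a sketch of plain peak normalization under my own assumptions (samples as floats between -1 and 1, made-up function name), not the code of any particular editor.

```python
def normalize(samples, target_db=-1.0):
    """Scale the recording so that its loudest peak lands at target_db."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # pure silence, nothing to scale
    gain = 10 ** (target_db / 20) / peak
    return [s * gain for s in samples]


# One stray loud syllable (0.8) dictates the gain for everything else:
recording = [0.1, -0.2, 0.8, 0.15]
print(max(abs(s) for s in normalize(recording)))
```

The whole recording is multiplied by one single factor, and that factor is determined by the loudest sample alone. In the example, the quiet parts are barely raised because of the one 0.8 peak, which is exactly why those outliers should be tamed first.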

j. Add or trim leading and trailing silence.

This is entirely up to you, but I use 0.3 seconds of leading and 2 seconds of trailing silence. Total silence, as created by the “Create silence” function, is appropriate here.
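Expressed as a Python sketch, this step looks as follows. The function name and the threshold parameter are my own inventions for the illustration: everything at the two ends that does not exceed the threshold is treated as silence and trimmed, then fresh digital silence of the desired length is added. Samples are assumed to be floats between -1 and 1.

```python
def set_edge_silence(samples, sample_rate, lead_s=0.3, trail_s=2.0,
                     threshold=0.001):
    """Trim existing silence at both ends, then pad with total silence."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    body = list(samples[loud[0]:loud[-1] + 1]) if loud else []
    lead = [0.0] * round(lead_s * sample_rate)    # leading silence
    trail = [0.0] * round(trail_s * sample_rate)  # trailing silence
    return lead + body + trail
```

The default threshold of 0.001 corresponds to -60dB; with a noisier recording you would have to raise it, or simply do the trimming by eye in your editor.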

k. Export to mp3.

Save your work in a lossless format (with Audacity, the obvious choice is its native one), and only then export to mp3. You have to decide on a compromise between file size and sound quality, with the available settings depending on your software. If given the option, a good idea is to select “variable bit rate” (VBR), with the quality setting of “2” on a scale from 9 to 0, with 0 being the best.

That’s it! Comments or questions welcome!
