One of the most frequent questions beginner (and many experienced) podcasters have is “how do I record a podcast with my co-hosts/guests being far away?” Here I’ll share my experience.
First of all, we can roughly divide all remote recording options into two categories:
- Centralized recording, where all participants are recorded into a single audio track.
- Double-ender style recording, where each participant gets their own audio track, usually recorded on that participant’s own computer.
Each approach has its own advantages and disadvantages. A double-ender recording is usually of higher quality than a centralized one, for two reasons:
- Real-time VoIP audio is severely compressed[1], and this is the audio stream that’s typically captured in the centralized recording scenario. In the double-ender scenario, each participant’s audio stream is saved locally before it is compressed and transmitted over the network, so the recorded track has a much higher bitrate than the VoIP stream.
- The double-ender is not affected by network glitches. If your guest’s voice turns robotic or simply vanishes during the call, you still know that their speech is being captured by their own computer in its original form.
Another advantage of a double-ender, besides quality, is that having each participant on their own audio track gives you more control in post-production. You can apply equalization, compression, and set levels for each voice independently.
That said, double-enders have two potential drawbacks:
- Unless you use specialized services (which I discuss below), each participant needs to record their own side of the conversation. A lot can go wrong here: they may not know how to record sound on their computer, or their recording may fail, or they may simply forget to press record… and if at least one participant’s track is lost, so is the whole episode.
- The tracks recorded on different computers tend to drift out of sync over time, an effect known as “audio drift”.
When I started podcasting, I leaned towards centralized recording for its simplicity and robustness. I even created my own web tool which would join a Skype group call and record the conversation. (This was in 2015, before Skype introduced their call recording feature.)
Since then, I have become a huge fan of double-enders, for the reasons described above and because now we have a few services that address the potential drawbacks I mentioned: they don’t require any extra actions from you and your guests, are reliable, and help with the audio drift problem.
The services I have the most experience with are Zencastr and Cast (also known as “Try Cast”). They are very similar for my purposes, except one of them works much better than the other one. I’ll give the details below, but feel free to skip straight to the conclusions. There are other offerings in this space, such as SquadCast, but I haven’t tried them.
I used Zencastr for a year, from April 2017 to May 2018. At first I was quite happy with it, but in October 2017 I started to notice some weird clicks in Zencastr’s mp3 files.
I always do a local recording as a backup (using the method described here), and when I found the corresponding piece in my local recording, it didn’t have these artifacts.
I reported this issue to the Zencastr developers on October 20, 2017, and they replied that they would look into it.
They didn’t get back to me after that, but in February 2018 a post was published on Zencastr’s blog which alluded to these issues and claimed they were fixed. Except they weren’t.
On February 28, I wrote this to Zencastr support:
Thank you for the recent update — there are many great features there.
However, the cracking sound I reported back in October, sadly, wasn’t resolved by this update.
Here is an example of a recording I did just today (zencastr vs local, made at the same time). I used Chrome 64 (specifically, google-chrome-stable-64.0.3282.186-1.x86_64), and zencastr gave me no warnings.
I understood that these problems were incredibly hard to track down and didn’t expect any more than “we’ll look into it”. But the response really disappointed me:
It appears one of your guests was using Chrome 63 at the time of your most recent recording.
We know you can’t always control what your guests are running, but an up to date browser will likely improve things going forward. Since Zencastr is very reliant on the browser, we would recommend double checking that Chrome up to date before each recording.
So they were saying that the clicks in my track, recorded locally on my computer, were somehow induced by my guest’s older browser. It might be that I don’t fully understand how Zencastr works, but this sounded like BS to me.
I continued using Zencastr for some time, taking my own track from my local recording and using Zencastr only to get my guests’ tracks. It was tolerable, but the cracking sound kept getting worse over time (even when all participants were on Chrome 64), so in May 2018 I finally switched to Cast.
Also, Zencastr couldn’t fully solve the drift problem. Their February 2018 update improved things, but as the screenshot shows, the drift was still about 2 seconds per 2 hours of recording time (which is a lot).
Cast has worked much better for me than Zencastr. There are occasional artifacts with Cast, too, such as this one (which I reported in July 2018):
Cast recordings suffer less from audio drift. Though not perfect, they drift by about 0.5 seconds or less per hour of recording in my experience.
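To put these drift figures on a common scale, here is a minimal sketch that converts a measured drift into parts per million and into the speed factor needed to bring a drifted track back into alignment. The function names are my own, and the 50 ms offset in the last example is only an illustrative threshold, not a number reported by either service.

```python
def drift_ppm(drift_seconds: float, duration_seconds: float) -> float:
    """Clock mismatch, in parts per million, implied by a measured drift."""
    return drift_seconds / duration_seconds * 1_000_000


def tempo_factor(reference_duration: float, drifted_duration: float) -> float:
    """Speed multiplier that makes the drifted track as long as the reference.

    A factor above 1 means the drifted track must be played slightly faster.
    """
    return drifted_duration / reference_duration


def seconds_until_offset(ppm: float, offset_seconds: float) -> float:
    """Recording time after which two tracks are offset_seconds apart."""
    return offset_seconds / (ppm / 1_000_000)


# Figures quoted in the text above.
zencastr = drift_ppm(2, 2 * 3600)   # about 278 ppm
cast = drift_ppm(0.5, 3600)         # about 139 ppm

# Hypothetical threshold: how soon are the tracks 50 ms apart?
print(seconds_until_offset(zencastr, 0.05))  # roughly 180 seconds
```

Seen this way, even a drift that sounds small per hour adds up to an audible offset within minutes, which is why the tracks need to be stretched or re-aligned in post-production.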
On the other hand, some things were better in Zencastr. For instance, in Zencastr when you create a recording session, it remains accessible indefinitely for anyone with the link. So I could create a recording link a month in advance and send it to my guests. Not only was that convenient, but my guests could use the link to test that Zencastr works in their browser and that their microphone is detected. In Cast, session links are only valid until the host leaves the session, so this workflow no longer works. Cast has a feature to schedule a session, which sends the recording link to the guests 30 minutes before the start, but it’s not very convenient and doesn’t allow the guest to test their browser and microphone in advance.
I should also mention that there was one time, in April 2017, when my guest couldn’t get Cast working with her browser/OS, but Zencastr worked. (That’s when I started using Zencastr.) But since I switched back to Cast in May 2018, Cast has always worked for my guests.
Update from 2020-06-12
Lately I started noticing that Cast’s audio has noticeably lower quality than the local audio. (I wouldn’t be surprised if that issue has always been there, and I just became more attentive to these things.)
Here’s an example from a recent recording. The difference is subtle, but I can recognize it 100% of the time in an ABX test:
I thought that this could be a consequence of Cast recording to a 128kbps mp3. So I encoded my local version as 128kbps mp3 (using LAME 3.100):
I haven’t noticed any artifacts in the local 128kbps-encoded version compared to the lossless version, and I can still tell it from the Cast version 100% of the time in an ABX test.
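For readers unfamiliar with ABX testing, the sketch below shows why a perfect score is meaningful: it computes the probability of getting at least a given number of trials right by pure guessing. The function name is my own, and the trial counts are hypothetical, since the number of trials isn't stated above.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers in `trials` ABX trials
    when guessing at random (each trial is a fair coin flip)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical session: 10 trials, all identified correctly.
print(abx_p_value(10, 10))  # 0.0009765625, i.e. 1 in 1024

# Hypothetical session: 8 correct out of 10 is much weaker evidence.
print(abx_p_value(8, 10))   # 0.0546875
```

The smaller this probability, the less plausible it is that the difference was identified by luck.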
(Added on 2018-11-03.)
Shortly after I published this article, Mark Hills from Cleanfeed got in touch with me and recommended that I give it a try. Cleanfeed is not a double-ender service per se; instead, it’s a web-based centralized recording service. From what I understand, recording and mixing happen in the host’s browser, not on their server.
I only played with Cleanfeed for an hour or two, so take this with a grain of salt.
Mark wrote to me:
Though I see you were looking at double-ender, I would be grateful if you’d give it a try, at least just to experience VoIP that is as good as local audio.
So I did, and I have to say that Cleanfeed is to my ear indistinguishable from local audio. Take a listen yourself to these audio samples. One of them is local, the other one is from Cleanfeed.
If you hear a difference, try to guess which one is local and which one is remote, then hover/tap below to reveal the correct answer:
To conduct the above experiment, I opened two Chrome windows, emulating a “host” and a “guest”. To make sure Cleanfeed doesn’t cheat by going through the high-bandwidth and low-latency loopback network, my guest Chrome window was configured to go through an HTTP proxy in Texas, while the host window would use my normal Internet connection in Ukraine.
To achieve this audio quality, Cleanfeed uses the awesome Opus audio codec with bitrates averaging 72kbps for mono audio and 172kbps for stereo music.
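To give a feel for what these bitrates mean in practice, here is a rough sketch that converts a bitrate into data volume per hour and compares it with uncompressed audio. The PCM parameters (44.1 kHz, 16-bit, mono) are an assumption chosen for illustration, not the format of any recording mentioned here, and the function names are my own.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Data volume of one hour of audio at the given bitrate (1 MB = 10^6 bytes)."""
    bits = bitrate_kbps * 1000 * 3600
    return bits / 8 / 1_000_000


opus_mono = megabytes_per_hour(72)    # Opus average quoted above: about 32.4 MB
mp3_128 = megabytes_per_hour(128)     # 128kbps MP3: about 57.6 MB

# Assumed format for comparison: 44.1 kHz, 16-bit, mono PCM.
pcm_kbps = pcm_bitrate_kbps(44_100, 16, 1)   # 705.6 kbps
pcm = megabytes_per_hour(pcm_kbps)           # about 317.5 MB
```

Bitrate alone doesn't determine perceived quality, since codecs differ in efficiency, so these numbers only describe data volume.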
Thus Cleanfeed gives you great audio quality without audio drift. So what’s the catch? Here are a few limitations of Cleanfeed that I’ve discovered during my testing.
With Cleanfeed you can only have two separate audio tracks, which will become two stereo channels in the wav file. To enable this, choose “Separate tracks” when starting the recording. This works if you have a single guest or co-host, but otherwise you’ll get an already mixed recording. For comparison, Cast allows up to 3 guests and Zencastr allows up to 2 guests on their free plan and unlimited guests on the $20/mo plan (all of which get their separate tracks).
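If you want each voice in its own file for post-production, a two-channel WAV like this can be split into two mono files. The following is a minimal sketch using Python's standard `wave` module. It assumes the file is plain integer PCM, which is what that module reads, and I haven't verified which channel Cleanfeed assigns to the host and which to the guest. The function and file names are hypothetical.

```python
import wave


def deinterleave(frames: bytes, sample_width: int) -> tuple[bytes, bytes]:
    """Split interleaved two-channel PCM data into its two channels."""
    step = 2 * sample_width
    first, second = bytearray(), bytearray()
    for i in range(0, len(frames), step):
        first += frames[i:i + sample_width]
        second += frames[i + sample_width:i + step]
    return bytes(first), bytes(second)


def split_stereo_wav(src: str, first_path: str, second_path: str) -> None:
    """Write each channel of a two-channel WAV file to its own mono WAV file."""
    with wave.open(src, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a two-channel WAV file")
        width = wav.getsampwidth()
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # Reads the whole file into memory; fine for a sketch, not for huge files.
    for path, data in zip((first_path, second_path), deinterleave(frames, width)):
        with wave.open(path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(data)
```

For example, `split_stereo_wav("session.wav", "track_a.wav", "track_b.wav")` would produce two mono files of the same length that can be processed independently.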
Unlike Zencastr and Cast, Cleanfeed does not save the recording on the server, nor does it have an option to retrieve the recording from the browser’s local storage. You have to download your track before you close the window; otherwise it’s lost. It’s very scary to me how easy it is to lose a recording by accidentally closing the Chrome window or by having your laptop run out of battery.
On the other hand, with Cleanfeed you can download the recording up to this moment without stopping the recording session—a feature I haven’t seen in any other services.
Cleanfeed tolerates packet loss of up to 30% surprisingly well. But if the packet loss reaches a certain level or the network connection drops for some time, the guests will disconnect and won’t be able to connect to this session again. In that case you’ll need to start a new session (and don’t forget to download your audio track before doing that). None of the services handle network disconnects particularly well, but with Cast and Zencastr guests can usually reconnect to the current session and what they were saying during the disconnect is still saved.
(Added on 2020-03-28.)
I decided to try RINGR after I saw a comment saying that they support recording from a mobile app. Given that microphones in a mobile phone are placed much closer to the mouth than microphones in laptops (and are designed with that in mind), I wouldn’t be surprised if using a mobile phone for recording resulted in better audio than using a laptop’s built-in mic. So I was eager to give RINGR a try.
Another thing I found compelling was the ability to customize the recording format and to record losslessly; a multitrack FLAC would be ideal for me.
However, after spending two hours testing RINGR with a friend, I can say it’s by far the buggiest of all the services I’ve seen. With the Android app, we observed the following failure modes on different occasions:
1. When clicking the link from the email, it would more often than not fail to detect the installed app and redirect to Google Play instead.
2. When joining the call as a guest, it would sometimes just cycle back to the initial login screen.
3. Other times, it would just stay on the “loading” screen forever.
Here’s a video demonstrating (1) and (3). (We originally ran into (2), but then couldn’t replicate it because it somehow switched to (3).)
We managed to make it work a single time, but then couldn’t do it again (following the exact same steps).
Using the in-browser version (we used Google Chrome on both sides) was similarly unreliable: we did get it working one time, but two other times we couldn’t hear each other despite the waveform animation being displayed in the browser.
In addition to it simply not working most of the time, here are the permissions that you have to grant to the Android app:
Microphone and purchases? Sure. Files and WiFi info? Makes me a bit uncomfortable, but I can imagine there being technical reasons for why it needs them. Identity, call info, and contacts?! This is super shady, and I don’t even care what excuses RINGR might have for requesting these. There’s no way I’m going to ask my guests to install this.
My final disappointment with RINGR came when I logged in to my bank account. RINGR advertises a free 30-day trial, which I was going to use to evaluate the service.
However, while creating the account, I was asked for my credit card number. I didn’t think much of it: this is a frequent trick used by companies to ask for your credit card and charge you after your free trial is over and you forget to cancel. So I entered my credit card and created a reminder to cancel the subscription if I ended up not using it. But then I got an email with a receipt from RINGR:
And checking my bank account showed that they were not joking about charging me. I contacted their support on March 29 in the hope of getting a refund but haven’t heard from them (for 3 weeks as I’m writing this, and probably more as you’re reading this).
So you can try RINGR if you have a spare $17, but be prepared that it may not work, and you may not get your money back.
At this point I would definitely not recommend Zencastr because of its cracking/clicking issues. Cast, on the other hand, proved to be reliable, and I will continue to use and recommend it. Still, with any service, you should always do a local backup recording in case your service fails. (If you use Linux, see my audio recording guide.)
If audio drift drives you crazy, or if you are currently using Skype or Hangouts to record your podcast, give Cleanfeed a try. Subjectively, Cleanfeed offers much better quality than those, although I haven’t rigorously compared them head to head. But don’t forget to save your track and always have a backup.
In the coming years, I hope to see the relevant browser APIs standardized and made interoperable (most of these services are Chrome-only) and an open source version of Cast/Zencastr emerge that I could host on my own server for free for as long as I like.
[1] In the sense of “lossy data compression”, not “dynamic range compression”.