Recording and transcribing conversations or meetings can be arduous. That’s why many who need to do either turn to transcription services. Otter takes an innovative approach to the task, offering real-time transcripts of conversations and meetings as they occur. It also integrates other features, such as cross-conversation speaker identification, good search tools, and excellent mobile apps, though some features are a bit rough around the edges. Otter is more accurate than most automatic services, but it’s still not a viable alternative to human-based services at this point. Until Otter improves its technologies, we recommend Editors’ Choice Rev for your important transcription jobs. For impromptu meetings or personal notes though, Otter may work well for you.
Otter offers both a free and paid plan. With the free plan, users get 600 minutes of transcriptions per month. The free version doesn’t restrict any other features of the app or web interface. I also appreciate that Otter doesn’t show ads in its free version.
The Premium account level lets users transcribe 6,000 minutes of audio per month. Otter charges $9.99 per month for this plan, though you can save money if you pay annually at a rate of $79.99 per year. Students and teachers also get a discount, paying only $2.99 per month for this plan. In addition to the extra transcription minutes, Otter Premium includes priority email support and will soon offer advanced export options.
For comparison, Scribie’s automatic transcription service is completely free, though it does not have a mobile app. Trint’s cheapest subscription plan charges $15 per month for up to three hours’ worth of uploaded audio. Temi, another automatic service, costs $0.10 per minute.
Human-based services cost more, given the human capital they require and the improved rates of accuracy. Most charge on a per-minute basis, usually in the $1 to $3 range. Rev, for example, charges a base fee of $1 per minute of audio. It, along with others, also bumps up the price if you add extra options to the transcription job, such as speaker identification and timestamps, or if the audio recording is of poor quality.
How Otter Works
At its core, Otter is an ambient transcription service that relies on automatic speech recognition (ASR) to process your recordings in real time. All you need to do is hit the record button, start speaking, and watch your words appear in the app. It even adds in proper punctuation and separates individual speakers (albeit with mixed results). In our experience, it takes a couple of minutes for the transcript to show up in the app, but after that, it updates in near real-time. We detail Otter’s performance in our accuracy test in a later section of the review.
Otter’s ASR technology is similar to that of other services we reviewed, including Scribie, Temi, and Trint. The underlying technology is developed by a group called AI Sense; it notably integrates with the
In case you’re curious, there’s nothing special about the name, Otter. Yes, it sounds a bit like “utter.” However, my contact offered a better explanation: otters are cute animals. In any case, I agree that otters are cool creatures and do appreciate the logo.
Otter does not allow users to set up two-factor authentication, which is disappointing. Any service that hosts potentially sensitive information should enable this security measure by default. That said, automatic services are inherently safer, since there is no human on the other end of the process with access to your (potentially) sensitive files.
Truly Mobile Transcripts
Otter offers apps for both Android and iOS. Setup is easy: Just download the app from the respective app store and create an account. We installed the app on both a Google Pixel running Android 8.1 and an iPhone 8 with iOS 11, though the majority of our review refers to our experience with the Android version. Otter’s iPhone app looks almost identical to its Android counterpart.
The app uses a primarily white interface with the occasional blue accent for emphasis. It looks clean and modern, but I would appreciate a dark mode. You navigate the app by tapping one of the five icons in the bottom menu: Dashboard, Conversations, Record, Groups, and Settings. You can’t swipe to navigate between screens, however, which is annoying.
If you click on a particular conversation, you can view the transcript in its entirety along with some basic information up top, including the recording date, time, and total length. In the upper-right corner, you can share access to the transcript with contacts and set editing permissions. Alternatively, you can directly share the view-only link that Otter automatically generates.
To edit a particular part of your transcription, simply hold down on it and hit the Edit button. You need to tap on each section individually to make edits; you can’t, for example, scroll to another section and make changes in one smooth session. This is a bit of a nuisance given that Otter isn’t very accurate with how it identifies new speakers or breaks in the conversation.
Otter also lets you edit the title of a transcript, but you can’t change the date. This option would be useful if you uploaded a file on the web interface (more on this later) after the original recording took place. You can scrub through a recording via the playback controls at the bottom and Otter highlights the words as the transcript audio plays. A top menu lets you control the playback speed (0.5x-2x) or delete the recording entirely.
The Groups tab allows you to organize contacts for easily sharing conversations. Otter’s approach to sharing is one of the better implementations we’ve seen. Other services, like Rev and Trint, let you set up collaborators or teams, but neither lets you do so seamlessly on mobile. The Settings section hosts a toggle to restrict audio uploading and streaming to a Wi-Fi connection, along with Bluetooth, device storage, and notification options. You can also manage connected accounts, access a host of support options, and train Otter to recognize your voice. The training process entails recording yourself reading text aloud so that Otter can build a model of your voice.
Otter on the Web
Otter’s web interface features the same clean and modern design style as its mobile apps, though it could also benefit from a dark mode. Along the left rail, there are two main menu items: Conversations and Groups. Further down, Otter now features an Account Settings section, where you can edit basic profile info, link your Google or Zoom account, or upgrade to the premium account tier. When I first reviewed Otter, the web interface changed quite a few times during the testing period, but the design looks more finalized now. For example, the top-level search bar is back, which lets users search for terms across all of their recordings. Performance glitches I’d encountered previously were not in evidence this time, either.
The Conversations section works the same way as it does on mobile; it lists all of your recordings in reverse chronological order. Selecting a file opens a full view of the transcript along with playback controls. From here, you can share the transcript, export the audio or text, rerun the speaker matching, or delete the transcription entirely.
At the top, you can search for complete terms within the transcript. Alternatively, Otter generates a series of common terms in the transcript under the title, which you can click on to find instances within the text. Otter includes playback controls at the bottom, complete with a play/pause button and playback speed options. You can scrub through the file with the included slider, but Otter does not include fast rewind or forward buttons. Other services such as Rev also include highlighting and strikethrough features for more focused editing.
To edit the text, simply click on the pencil icon to the right of each section. Instead of letting you edit everything all at once, Otter requires you to click into each individual section before you can make any changes, which slows down the workflow considerably.
Otter also lets you change the speaker IDs for individual blocks of text. Unfortunately, Otter isn’t great at separating sections on its own; it often splits paragraphs in weird places. That said, Otter does have some clever features. For example, once you add a speaker ID to a section, it automatically goes through the rest of the transcript and adds that name to whatever paragraphs it detects as having the same speaker. It uses the same information to identify the same person in any other conversations or recordings hosted on your account as well, though this feature only partially worked in testing.
To test out the accuracy of the transcription services, I uploaded the same 16-minute recording to each one. The original recording of a three-person conference call came from an Olympus VN-722PC dedicated voice recorder. It’s not an easy recording, but all the voices are clearly audible. Although this is not Otter’s primary purpose, it’s the best way to compare its ASR engine directly to that of other services.
Otter finished the transcription process in about six minutes. The other automated transcription services completed the task in the range of three to four minutes. The quickest human-based service, Rev, required about an hour for the same task.
Instead of comparing the entirety of each transcript, I chose three paragraphs, one from each speaker on the call. For each snippet of the transcript, I marked an error wherever there was a missing or an extra word. I calculated the overall error rate by dividing the total number of mistakes by the total number of words across the combined sections (in this case, 201 words). The sample for section A is a short introductory section. Section B is slightly longer and uses more complex vocabulary. Section C is even lengthier and contains some technical language.
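For readers who want to run a similar comparison on their own recordings, here is a minimal Python sketch of the error-rate arithmetic described above. It is not the tooling used for this review, where errors were marked by hand. The function names are hypothetical, the word alignment via the standard library's difflib is a stand-in for manual marking, and counting a substituted word as a single error is an assumption.

```python
from difflib import SequenceMatcher


def count_word_errors(reference: str, transcript: str) -> int:
    """Count missing and extra words in a transcript versus a reference.

    Assumption: words are compared case-insensitively, split on whitespace,
    and punctuation is NOT stripped. A substituted word counts as one error.
    """
    ref_words = reference.lower().split()
    hyp_words = transcript.lower().split()
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)

    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":      # words in the reference that are missing
            errors += i2 - i1
        elif tag == "insert":    # extra words in the transcript
            errors += j2 - j1
        elif tag == "replace":   # substituted words (assumed: one error each)
            errors += max(i2 - i1, j2 - j1)
    return errors


def overall_error_rate(samples: list[tuple[str, str]]) -> float:
    """Total errors divided by total reference words across all samples."""
    total_errors = sum(count_word_errors(ref, hyp) for ref, hyp in samples)
    total_words = sum(len(ref.split()) for ref, _ in samples)
    return total_errors / total_words


if __name__ == "__main__":
    # Illustrative sentences only, not taken from the test recording.
    samples = [("the quick brown fox jumps", "the quick fox jumps high")]
    print(f"{overall_error_rate(samples):.0%}")
```

In this toy sample, the transcript drops one word and adds another, so two errors over five reference words gives a 40 percent error rate. Pooling errors and words across all sections, as the review does, weights longer sections more heavily than averaging per-section rates would.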
Otter produced excellent results for an automatic service (it had an error rate of only 17 percent), but it still fell short of the human-based services I tested. For comparison, Rev had an error rate of just 3 percent, and Scribie turned in a final copy with a 6 percent error rate. Take a look at the full chart below for the complete breakdown.
I retested all the automatic services, including Otter, with a simpler recording (two people, in-person) and calculated the error rate in the same manner, using two samples instead of three. The automatic services fared better with this task as a whole, but they still weren’t perfect. Otter actually fell to the middle of the pack with an error rate of 21 percent, though this was not too far off from Trint’s 14 percent or Temi’s 20 percent. The full results of the second test appear below.
Talk and Transcribe
Wherever and whenever you have an important conversation, you should have some way to record it and turn it into usable text data. Otter’s focus on real-time transcriptions and sharing makes it an innovative option in the space. It’s more consistently accurate than any other automatic transcription service we tested, even if it can’t compete with human-based services. After a few iterations, Otter could very well be a front-runner in the category, but for now, we recommend using Editors’ Choice Rev for affordable and accurate transcripts.