FAQs

Cost/Performance/Quality

Q: How are you able to run at such a low cost? We’ve engineered highly efficient models and optimized our infrastructure, which allows us to deliver state-of-the-art transcription at a fraction of the cost.

Q: How do you calculate Word Error Rate (WER)? Word Error Rate (WER) is the standard way to measure how accurate a Speech-to-Text system is. It compares the model’s transcript to a correct “reference” transcript and counts how many mistakes it makes.
WER looks at three types of errors:

  • Substitutions: wrong word instead of the correct one
  • Deletions: a word is missing
  • Insertions: extra word that should not be there

The formula is:

  WER = (Substitutions + Deletions + Insertions) ÷ Total words in the reference

A lower WER means higher accuracy. For example, a WER of 6% means roughly 94% of the words were transcribed correctly. Because insertions count as errors, WER can exceed 100% on very poor transcripts.
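As a rough illustration of the formula above, the following Python sketch computes WER with a word-level edit distance. The function name word_error_rate and the normalization step (lowercasing and splitting on whitespace) are assumptions made for this example. They are not a description of our internal scoring pipeline, and real benchmarks usually apply more careful text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (S + D + I) / number of words in the reference.

    Hypothetical helper for illustration. Normalization here is only
    lowercasing and whitespace splitting, which is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match, or substitution if words differ
            )
        prev = curr

    # The minimum number of edits equals S + D + I for the best alignment.
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
# One substitution (jumps -> jumped) and one deletion (the): 2 errors / 9 words.
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # prints "WER: 22.2%"
```

The edit distance finds the alignment with the fewest total errors, so the substitution, deletion, and insertion counts do not need to be tracked separately to get the final score.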

Q: How accurate are you compared to competitors? We calculate WER on a public, transparent benchmark, the Hugging Face Open ASR Leaderboard, and consistently outperform other providers on audio with noisy environments, multiple speakers, background chatter, and diverse accents.

Q: How do you handle filler words like “um” and “uh”? Our models naturally filter many of these out. If you need different handling, such as stricter filtering or adding filler words back in, we can apply optional post-processing.
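For teams that prefer to clean transcripts on their own side, a filler-word filter can be approximated in a few lines. The sketch below is only an assumption about what such post-processing might look like, not our actual implementation. The name remove_fillers and the filler list (just "um" and "uh", including drawn-out forms such as "umm") are hypothetical.

```python
import re

# Assumed filler list: "um" and "uh", allowing repeated final letters.
# The (?!-) lookahead leaves hyphenated words such as "uh-huh" untouched.
FILLER_PATTERN = re.compile(r"\b(?:um+|uh+)\b(?!-)[,.]?\s*", re.IGNORECASE)


def remove_fillers(transcript: str) -> str:
    """Strip standalone filler words and any comma or period attached to them."""
    cleaned = FILLER_PATTERN.sub("", transcript)
    # Collapse any doubled spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(remove_fillers("Um, I think, uh, we should go."))
# prints "I think, we should go."
```

A regex like this does not restore capitalization after removing a sentence-initial filler, so a production filter would likely need an extra pass for casing and punctuation.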

Q: Can your system handle noisy audio or overlapping speakers? Yes, our models are specifically trained on noisy, real-world audio and overlapping dialogue.

Features/Product

Q: Do you support both pre-recorded audio and real-time streaming? Our API currently supports pre-recorded file transcription. Live streaming transcription is coming soon.

Q: What languages do you support? We currently support English. Expansion into additional languages is on the roadmap.

Q: Do you provide speaker diarization (who said what)? Not yet, but it is an option we are considering.

Q: Do you offer custom vocabularies or domain-specific tuning? Yes, you can bias the model toward specific words, phrases, or jargon. Reach out to our enterprise sales team for more information.

Pricing

Q: Is there a free tier or trial? We provide 100 free hours so developers can test the system before committing.

Q: Do you offer enterprise discounts or volume-based pricing? Yes, we provide tiered pricing models for higher usage.

Q: Will pricing differ between pre-recorded and streaming? Our base pricing is consistent across both, though streaming may include additional compute overhead depending on volume.

Compliance & Security

Q: Are you SOC 2 and HIPAA compliant? We are in the process of completing the SOC 2 (~Nov 1st) and HIPAA (~Jan 1st) compliance requirements for data handling and security.

Q: How is data handled and stored? Audio and transcripts are processed securely. By default, we do not retain customer audio unless you explicitly opt in to sharing it for model improvement.