BYOK AI notes for developers - fastest models, your key, per-task routing

For engineering teams that want provider-grade transcription and summarization on long technical calls without handing key custody to a SaaS. Pick a provider for transcription, pick another for summarization if you want, and route everything under your own API key.

When a cloud provider is the right call

When you bring your own cloud provider, jotty.pro routes audio for transcription and the transcript for summarization to whichever providers you configure. You pick from nine supported providers (OpenAI, Groq, Deepgram, HuggingFace, Ollama, X.AI, Google, Mistral, DeepSeek), and your key sets the terms that apply.

This is the right call when you need the freshest model versions, real speaker diarization on multi-participant calls, or low latency on dense technical meetings. On-device models still can't match that at scale. We don't provision or hold enterprise keys; we route requests under the key you supply.

Org-provisioned keys are supported. Your IT team can issue a shared key for OpenAI, Groq, or any other supported provider; team members paste it in Settings, and every request runs under that key's terms.

The tradeoff against keeping it on-device: the provider sees both audio and transcript. You decide which provider that is, and your key sets the contract.

How it works

  1. In Settings, pick a transcription provider and a summarization provider. They can be the same provider or two different ones; per-task routing is supported.
  2. Paste your BYOK key: personal, team-issued, or an org key provisioned by IT.
  3. Record the technical meeting, design review, or architecture call.
  4. The transcription provider handles audio-to-text; the summary provider processes the transcript and returns structured notes.
  5. The transcript writes to local DuckDB regardless of which providers handled it, ready for semantic search on the device later.
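
The steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not jotty.pro's implementation: `TaskRoute`, `RoutingConfig`, `run_pipeline`, and the two stub provider functions are hypothetical names, and a plain Python list stands in for the local DuckDB store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signatures: each provider call receives its input plus the
# API key configured for that task.
Transcriber = Callable[[bytes, str], str]
Summarizer = Callable[[str, str], str]


@dataclass
class TaskRoute:
    provider: str  # e.g. "Deepgram" or "OpenAI"
    api_key: str   # the key you pasted in Settings for this provider


@dataclass
class RoutingConfig:
    transcription: TaskRoute
    summarization: TaskRoute


def run_pipeline(
    audio: bytes,
    config: RoutingConfig,
    transcribe: Transcriber,
    summarize: Summarizer,
    local_store: List[Dict[str, str]],
) -> str:
    """Step 4: audio -> transcript -> notes. Step 5: persist locally."""
    transcript = transcribe(audio, config.transcription.api_key)
    notes = summarize(transcript, config.summarization.api_key)
    # The transcript is written locally regardless of which providers ran.
    local_store.append(
        {
            "transcript": transcript,
            "notes": notes,
            "transcription_provider": config.transcription.provider,
            "summarization_provider": config.summarization.provider,
        }
    )
    return notes


# Stub providers so the sketch runs without network access (assumptions,
# not real provider SDK calls).
def fake_transcribe(audio: bytes, api_key: str) -> str:
    return f"transcript of {len(audio)} bytes"


def fake_summarize(transcript: str, api_key: str) -> str:
    return f"summary: {transcript}"
```

In this sketch the two tasks never share state except the transcript itself, which is why swapping either provider leaves the other task and the local write untouched.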

How it compares

Dimension | On-device + BYOK summary | Generic meeting bots (Otter, Fathom) | jotty.pro with cloud-provider BYOK
Where audio goes | Stays on the device | Uploaded to the bot vendor | Routed to your chosen transcription provider under your key
Which terms apply | Provider terms apply only to the text you send | Bot vendor's terms apply to the whole meeting | Your chosen provider's terms apply to audio and transcript
Transcript residency | Local DuckDB; excerpts leave only when you send them | Vendor-hosted; export depends on vendor policy | Local DuckDB after processing; provider retains per their terms
Model freshness | Summary is provider-current; transcription is local | Vendor-selected models, often lagging | Both transcription and summary use provider-current models
Cost per meeting | Device CPU for transcription; pay only for sent text | Subscription or per-seat fee | Pay-per-use under your provider's pricing for both tasks

Honest answers

Why not keep it on-device if I already have BYOK?

The on-device version transcribes locally, which is slower on long, dense calls and lacks provider-grade speaker diarization. If you need clean speaker separation across a multi-participant architecture review, or your team's technical vocabulary trips up on-device models, routing transcription to a cloud provider built for it is the better path. The DuckDB storage on the back end is the same either way; the difference is where transcription happens.

Can I use one provider for transcription and another for summary?

Yes. Settings supports per-task provider configuration. Send audio to Deepgram for transcription, route the transcript to OpenAI for summarization, or any other combination across the nine supported providers. Each task uses the key and terms of the provider you assigned.
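
A per-task configuration like this could be resolved as follows. This is a minimal sketch, not the app's actual settings format: `resolve_routes` and the dictionary shapes are hypothetical, only the nine provider names come from this page, and the sketch assumes every provider has a key entry. It does not check whether a given provider actually offers a given task.

```python
from typing import Dict, Tuple

# The nine supported providers named on this page.
SUPPORTED_PROVIDERS = {
    "OpenAI", "Groq", "Deepgram", "HuggingFace", "Ollama",
    "X.AI", "Google", "Mistral", "DeepSeek",
}
TASKS = ("transcription", "summarization")


def resolve_routes(
    task_providers: Dict[str, str],
    keys: Dict[str, str],
) -> Dict[str, Tuple[str, str]]:
    """Map each task to (provider, key), failing fast on bad config."""
    routes: Dict[str, Tuple[str, str]] = {}
    for task in TASKS:
        provider = task_providers.get(task)
        if provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"{task}: unsupported provider {provider!r}")
        if provider not in keys:
            raise ValueError(f"{task}: no key configured for {provider}")
        # Each task uses the key of the provider assigned to it.
        routes[task] = (provider, keys[provider])
    return routes


routes = resolve_routes(
    {"transcription": "Deepgram", "summarization": "OpenAI"},
    {"Deepgram": "dg-key", "OpenAI": "oa-key"},
)
```

Using the same provider for both tasks is the same call with one key: both entries in `routes` then point at that provider's key.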

Whose terms apply if my company key is used on a personal device?

The terms of whoever issued the key. If your IT team provisions a company OpenAI key, OpenAI's terms govern every request made with that key, regardless of which laptop kicked off the call. We use the key only to route the request; we don't otherwise see or store it. The audit and data-handling obligations run between your org and the provider.


If you'd rather keep transcription on-device and send only selected excerpts to AI providers, the on-device version for technical notes offers the same key custody with a narrower default data path.

Handling highly sensitive content?

Use the Local version of this workflow.