Chapter I. Story so far
By Roman G.
TOC
- Motivation
- Sorry, I'm not ready to be responsible for your secrets and PII
- Speaking of data
- AI race
- Real price of all-local web applications
Motivation
When I decided to share my old idea with the world through the jotty.pro product, I realized I had something special to offer beyond just the "Founder Letter" or a whitepaper. I have a full story to tell. This isn't just a development log or a sales pitch, but a reflection born of my experience as the sole user of my app as I brought it out into the open. There are 9 chapters in total.
Sorry, I'm not ready to be responsible for your secrets and PII
Working in heavily regulated environments leaves you with a professional deformation: your first reaction to a new product idea is not how users will receive it, but how much money your company will lose if something goes wrong, even if you put all the goodwill in the world into the feature.
Every time I wanted to start using a note-taking or transcription tool for myself, I could not be sure that my silly data wouldn't leak to some third party. No, I don't have any top-secret information to protect. One example was an attempt to use AI to challenge my rhetorical skills and set up zero-tolerance detection of filler words. But who knows what you might say in a brainstorm or a stream of thought when you're locked in during a meeting?
Given that, I don't want to leak my own data to AI giants or anyone in between, now or in 20 years, because of a small line in the Terms and Conditions or Privacy Policy. So I have no desire to be responsible for anyone else's data either. How can I avoid this? Keep all your data on your device. Do I need to pay for transcription and inference compute, object storage, encryption in transit and transport costs, a legal department, and a large engineering team if the data stays on your own device? The answer is clear. As a bonus, the product scales almost without limit, because every user brings their own compute and storage.
Speaking of data
The problem with data storage is solved. But if the data just sits there and we can't analyze it or do anything with it using modern tools, it has no value. Around that time, the transformers.js library moved from an unstable experiment to something fundamental and ready for production. The fact that you can run an LLM straight in a Chrome browser tab blew my mind. I ran a bunch of tests on my hardware using most of the open-source ONNX models available on HuggingFace. It was a pure rollercoaster, from excitement to complete depression, but I now have a full understanding of the current state of in-browser AI models. Not the massive 30-70 billion parameter models running on super expensive hardware, but the "right"-sized, often quantized models are ready to help you with your day-to-day tasks. With that idea of a handy, fully local app for meeting transcripts and note-taking in mind, I split the most useful models into categories, using the HuggingFace taxonomy:
- sentence similarity (or embeddings) - for vector search
- speech-to-text - transcribing audio and voice
- summarization - pretty much self-explanatory
- text generation - chat as you know it and data enhancement from meeting notes
The order is not random: the list is sorted by the size of the artifacts to download and by impact, starting with the smallest downloads.
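To make the first category concrete, here is a minimal sketch of vector search over note embeddings. It assumes the embeddings have already been produced by a sentence-similarity model; the `EmbeddedNote` shape and the function names are hypothetical and only illustrate the idea, not how the app actually does it.

```typescript
// Hypothetical note shape for illustration; not taken from the app.
interface EmbeddedNote {
  id: string;
  vector: number[]; // embedding produced by a sentence-similarity model
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the ids of the k notes most similar to the query vector.
function topK(query: number[], notes: EmbeddedNote[], k: number): string[] {
  return notes
    .map((n) => ({ id: n.id, score: cosineSimilarity(query, n.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((n) => n.id);
}
```

A brute-force scan like this is usually enough for a personal note collection; an index only starts to matter at much larger scales.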
We have reached the point where we no longer need to use cloud AI models! But the reality is that the most desirable kinds of LLMs, like text generation, require more than 13GB of data to download and allocate on a GPU. Smaller models mean a higher risk of hallucinations or outright failure.
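A back-of-the-envelope way to see where sizes of that order come from is to multiply the parameter count by the bits stored per weight. The sketch below counts weights only and uses a hypothetical 7-billion-parameter model as the example; whether that matches the models tested above is an assumption.

```typescript
// Rough size estimate: parameters × bits per weight, weights only.
// Ignores tokenizer files, activations, and any runtime overhead.
function estimateWeightBytes(parameterCount: number, bitsPerWeight: number): number {
  return (parameterCount * bitsPerWeight) / 8;
}

function toGiB(bytes: number): number {
  return bytes / 1024 ** 3;
}

// Hypothetical 7-billion-parameter model at two precisions.
const fp16GiB = toGiB(estimateWeightBytes(7e9, 16)); // about 13.0 GiB
const q4GiB = toGiB(estimateWeightBytes(7e9, 4)); // about 3.3 GiB
```

This is also why quantization matters so much for in-browser use: dropping from 16 to 4 bits per weight cuts the download to a quarter, at some cost in accuracy.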
Solution: BYOK (Bring Your Own Key). Most of us have our own API keys for popular AI models, or use corporate API keys in a corporate environment. As noted above, the data and the API keys live on your device, so you can reach most (hey, CORS!) cloud providers directly from your browser tab, provided you're ready to share your data with them. At least it's under your control.
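As a rough illustration of BYOK, the sketch below builds a chat request for an OpenAI-compatible endpoint entirely on the client, so the key goes from the browser straight to the provider. The `ByokRequest` shape, the function name, and the `<model-name>` placeholder are hypothetical; this is not the app's actual code, and a provider that does not send CORS headers will still reject the call from a browser tab.

```typescript
// Hypothetical shape for a BYOK request; field names are illustrative.
interface ByokRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a chat-completion request for an OpenAI-compatible endpoint.
// The caller reads the key from local storage; no backend sees it.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string
): ByokRequest {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Usage in a browser tab (not executed here):
// const { url, init } = buildChatRequest("https://api.openai.com/v1", key, "<model-name>", "Summarize my notes");
// const response = await fetch(url, init);
```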
AI race
Technologically, nothing currently on the market competes with this approach. Let me break it down:
- It runs purely in your web browser - you don't need to download or install anything. Just log in, and you're good to go.
- It makes requests to cloud AI providers directly from your web browser using your own keys (BYOK). For local AI models, it uses transformers.js on your GPU (or CPU) to run them on your device. Yes, in your browser.
- The only data stored and touched on the backend is account info, payment info, community templates we're launching soon, anonymous telemetry, and your local DB encryption key.
Why keep your personal encryption key on the backend? If you want to use your DuckDB backup file locally for experimentation, you're good to go. If you lose your encryption key, just log in to get it back. You can also re-encrypt the data in this database with your own key and upload it.
Quite flexible. I didn't predict this or consciously plan it as a feature; it is a result of the strict limitations of working with the modern stack and cutting-edge Chrome/web features to keep all the data on your device. Did you know you already have a tiny built-in AI model in your Chrome browser?
Real price of all-local web applications
On paper, the above sounds great, but there are always some trade-offs.
| Trade-off | Pain point | Benefit |
|---|---|---|
| Data synchronization | You must manually or semi-manually sync across devices (p2p sync coming soon) | Your data never leaves your device |
| Device performance | Speed depends on your hardware (weaker laptops = slower) | Zero cloud latency + full offline capability |
| Model size & accuracy | Tiny local models vs giant cloud models lead to occasionally lower accuracy | Instant processing + privacy |
| Browser limitations | Safari and older browsers have weaker WebGPU support; Chrome/Edge work well | True cross-platform privacy without apps |
| Initial model download | First use downloads ~250 MB of models | After that everything works completely offline |
| Battery & heat | Local inference can drain battery or warm up the device during long sessions | You control the cost (your electricity, not a subscription) |
| Feature velocity | Cutting-edge cloud-only models are harder to integrate quickly | You own the entire stack - no surprise price hikes or policy changes |
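The browser-limitations row usually translates into a feature check at startup. The sketch below is one minimal way to choose between GPU and CPU execution; `pickDevice` and `NavigatorLike` are hypothetical names, and it assumes, as in recent transformers.js versions, that the chosen value can be passed as the `device` option when creating a pipeline.

```typescript
type Device = "webgpu" | "wasm";

// Minimal structural type so the function can be tested outside a browser.
interface NavigatorLike {
  gpu?: unknown;
}

// Prefer WebGPU when the browser exposes navigator.gpu, else fall back to WASM (CPU).
// A fuller check would also await navigator.gpu.requestAdapter(), which can resolve to null.
function pickDevice(nav: NavigatorLike): Device {
  return nav.gpu !== undefined && nav.gpu !== null ? "webgpu" : "wasm";
}

// Usage in a browser tab (not executed here):
// const device = pickDevice(navigator);
```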
I will expand on all of the above and more in the next articles. Please stand by. The next chapter, coming soon, is Chapter II: If a tiny transformers.js transcription model can understand your articulation, you're good.