Show HN: Tonbo – an embedded database for serverless and edge runtimes

rubenvanwyk · 2025-12-24T10:04:28 1766570668

How does it compare to https://slatedb.io/ ?

The ideas seem similar, although SlateDB seems a bit more lightweight, and using Parquet as the primitive (even with Arrow) might mean heavier compute on the client side?

pdyc · 2025-12-24T13:39:20 1766583560

From the SlateDB FAQ: https://slatedb.io/docs/get-started/faq/

>SlateDB is designed for key/value (KV) online transaction processing (OLTP) workloads. It is optimized for lowish-latency, high-throughput writes. It is not optimized for analytical queries that scan large amounts of columnar data. For online analytical processing (OLAP) workloads, we recommend checking out Tonbo.

spwa4 · 2025-12-24T11:12:49 1766574769

This is so weird. If you're using this library:

1) your serverless and edge runtime needs to have internet access, so it can contact anyone

2) you're obviously not going to be able to efficiently write to S3 while providing guarantees, so it'll be expensive

3) you're writing in Rust, so you really care about correctness and efficiency

This seems like a contradiction. Why would you do this as opposed to hosting a redundant Postgres on 2 Hetzner/OVH/... servers and writing to that?

ethegwo · 2025-12-25T03:41:54 1766634114

Owner of Tonbo here. This critique makes sense in a classic web-app model.

What's shifting is workloads. More and more compute runs in short-lived sandboxes: WASM runtimes (browser, edge), Firecracker, etc. These are edge environments, but not just for web applications.

We're exploring a different architecture for these workloads: ephemeral, stateless compute with storage treated as a format rather than a service.

This also maps to how many AI agent services want per-user or per-workspace isolation at large scale, without operating millions of always-on database servers.

If you're happy running a long-lived Postgres service, Neon or Supabase are great choices.

spwa4 · 2025-12-25T11:37:17 1766662637

This makes no sense. DB connections have been part of the "short-lived sandbox" since the very beginning. CGI, PHP, ... all use database connections, and that's way faster and more correct (with proper transactions) than this approach.

And you use Rust ... so you care about speed and correctness. This seems like a very wrong approach.

ethegwo · 2025-12-26T04:18:44 1766722724

CGI/PHP treated database connections as something that's always available. That pushes a lot of hidden complexity onto the database platform: it has to be reachable from anywhere, handle massive fan-out, survive bursty short-lived clients, and remain correct under constant connect/disconnect.

That model worked when you had a small number of stable app servers. It becomes much harder when compute fans out into thousands or millions of short-lived sandboxes.

We're already seeing parts of the data ecosystem move away from this assumption. Projects like Iceberg and DuckDB decouple storage from long-running database services, treating data as durable formats that many ephemeral compute instances can operate on. That's the direction we're exploring as well.
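To make "storage as a format" a bit more concrete, here is a toy Python sketch. It is not Tonbo's API or its actual design; `ObjectStore`, `put_if_version`, `append_rows`, `scan`, and the JSON manifest layout are all hypothetical names invented for illustration. It assumes only that the object store offers some form of conditional write (for example, a version or ETag precondition), which is simulated in memory here. Each short-lived worker writes an immutable segment, then publishes it by compare-and-swapping a manifest, so no long-running database process is involved.

```python
import json
from dataclasses import dataclass, field


class PreconditionFailed(Exception):
    """Raised when a conditional write finds an unexpected version."""


@dataclass
class ObjectStore:
    """Hypothetical in-memory stand-in for an object store with conditional writes."""
    objects: dict = field(default_factory=dict)  # key -> (version, bytes)

    def get(self, key):
        # Missing keys are reported as version 0 with no data.
        return self.objects.get(key, (0, None))

    def put_if_version(self, key, data, expected_version):
        # Write only if the stored version matches what the caller last saw.
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise PreconditionFailed(key)
        self.objects[key] = (current_version + 1, data)
        return current_version + 1


MANIFEST = "manifest.json"  # hypothetical key listing all published segments


def append_rows(store, worker_id, rows, max_retries=5):
    """One ephemeral worker: write an immutable segment, then publish it."""
    # Assumption: worker_id is unique per invocation, so the key is never reused.
    segment_key = f"segments/{worker_id}.json"
    store.put_if_version(segment_key, json.dumps(rows).encode(), 0)

    for _ in range(max_retries):
        version, raw = store.get(MANIFEST)
        segments = json.loads(raw) if raw else []
        new_manifest = json.dumps(segments + [segment_key]).encode()
        try:
            # Compare-and-swap: succeeds only if nobody else published meanwhile.
            store.put_if_version(MANIFEST, new_manifest, version)
            return segment_key
        except PreconditionFailed:
            continue  # another worker won the race; reload and retry
    raise RuntimeError("too much contention on the manifest")


def scan(store):
    """Any reader can reconstruct the data from the manifest alone."""
    _, raw = store.get(MANIFEST)
    rows = []
    for key in (json.loads(raw) if raw else []):
        _, data = store.get(key)
        rows.extend(json.loads(data))
    return rows


store = ObjectStore()
append_rows(store, "w1", [{"k": 1}])
append_rows(store, "w2", [{"k": 2}])
print(scan(store))  # [{'k': 1}, {'k': 2}]
```

The trade-off in a sketch like this is that every commit costs at least two object-store round trips and contended writers have to retry, which is roughly the latency and cost concern raised elsewhere in the thread.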

rglover · 2025-12-24T12:49:09 1766580549

Because the means have been given priority over the ends.

brainless · 2025-12-24T12:07:50 1766578070

Lovely project. Also, @rubenvanwyk mentioned SlateDB. I am not sure if this will fit my use case, but just today I was looking for data hosting options for a self-hosted LLM+bot for email/calendar.

I have a product I have tried and stopped before: https://github.com/pixlie/dwata and I want to restart it. The idea is to create a knowledge graph (using GLiNER for NER). Compute would either be on desktop or cloud (instances).

The data would then be stored on S3, Cloudflare Workers KV, or AWS DynamoDB and accessed with cloud functions to hook up to a WhatsApp/Telegram bot. I may stick with the DynamoDB or Cloudflare options eventually though (both have cloud functions support).

I need persistent storage for key/value data (the graph, maybe embeddings) for cloud functions. Completely self-hosted email/calendar bot with LLM, own cloud, own API keys. Super low running cost.

Eikon · 2025-12-24T12:45:25 1766580325

SlateDB is awesome; it’s the storage backend for ZeroFS [0] and it’s been great!

[0] https://github.com/Barre/ZeroFS

WilcoKruijer · 2025-12-17T20:14:04 1766002444

Sounds very interesting, but the README has me pondering the downsides. Is the latency very high? Are requests not immediately durable? Is it super expensive?

ethegwo · 2025-12-18T04:14:29 1766031269

Yes. We'll provide a report explaining how we trade off these things; please stay tuned.

rubenvanwyk · 2025-12-24T10:03:04 1766570584

License does not yet exist? Hope it’s Apache 2.

niek_pas · 2025-12-24T10:26:48 1766572008

For some reason this post links to the dev branch on GitHub. If you switch to the main branch, you will see the license file is indeed Apache 2.0.

ethegwo · 2025-12-25T03:44:46 1766634286

Yes, it's Apache 2. Thanks for pointing this out; I'll fix it.

canadiantim · 2025-12-24T12:28:50 1766579330

How big is the wasm?

ethegwo · 2025-12-25T03:43:43 1766634223

It's currently 3 MB, and we've done almost nothing to reduce the file size, so we can expect it to get smaller.