
This is one of the main use cases we are building "Durable Streams" for; it's an open source spec for a resumable, durable stream protocol. It's essentially an append-only log with an HTTP API.

https://github.com/durable-streams/durable-streams

https://electric-sql.com/blog/2025/12/09/announcing-durable-...

When we built ElectricSQL we needed a resumable and durable stream of messages for sync, and we developed a highly robust and scalable protocol for it. We have now taken that experience and are extracting the underlying transport as an open protocol. This is something the industry needs, and it's essential that it's a standard that is portable between providers, libraries and SDKs.

The idea is that a stream is a URL-addressable entity that can be read and tailed using a very simple HTTP protocol (long polling and an SSE-like mode), and it's fully resumable from a known offset.

We've been using the previous iteration of this as the transport part of the Electric sync protocol for the last 18 months. It's very well tested on servers and in the browser, but importantly also in combination with CDNs. It's possible to scale this to essentially unlimited connections (we've tested to 1 million) via request collapsing in the CDN, and as it's so cacheable it lifts a lot of load off your origin when a client reconnects from the start.

For the LLM use case you will be able to append messages/tokens directly to a stream via an HTTP POST (we're working on specifying a WebSocket write path) and the client just tails it. If the user refreshes the page it will just read back from the start and continue tailing the live session. This avoids appending tokens to a database in order to provide durability.
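
To give a rough feel for the shape of it, here's an illustrative client sketch in TypeScript. The stream URL, the `offset` query parameter and the `x-stream-offset` header are placeholders for illustration, not the actual protocol names - see the spec repo for the real details:

    // Hypothetical sketch: parameter and header names are placeholders,
    // not the actual Durable Streams spec.
    const STREAM_URL = 'https://example.com/streams/chat-session-123'

    // Producer side: append a message to the stream with a plain HTTP POST.
    async function append(message: string): Promise<void> {
      await fetch(STREAM_URL, { method: 'POST', body: message })
    }

    // Consumer side: read from a known offset, then keep long polling to tail.
    async function tail(onMessage: (msg: string) => void, offset = 0): Promise<void> {
      while (true) {
        const res = await fetch(`${STREAM_URL}?offset=${offset}`)
        if (res.status === 204) continue // long poll timed out, retry
        onMessage(await res.text())
        // The resume point comes back with each response, so a refreshed
        // page can pick up exactly where it left off.
        offset = Number(res.headers.get('x-stream-offset') ?? offset)
      }
    }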


> But they need to FEEL the weather

You mean they need to smell it!


The key limitation (at the moment) is that it only supports a single connection. We're planning to lift that limitation though.


This is what I'm most interested in. I have an application with a smaller, trimmed-down client version that shares a lot of code with the larger full version. Part of that code is query logic; it depends heavily on multiple connections, and even the simplest transactions will deadlock without them. Right now, if one wants to use the Postgres option, Postgres needs to be manually installed and connected to, which is a mess. It would be the dream to have a way to easily ship Postgres in a small-to-medium-sized app in an enterprise-Windows-sysadmin-friendly way and be able to use the same Postgres queries.


Was going to ask exactly about that. Thanks for sharing. Looking forward to it!


This is such awesome work! We *are* going to get this integrated with the ongoing work for "libpglite".


You can use http://electric-sql.com to sync into PGlite in the browser from postgres. There are docs here: https://pglite.dev/docs/sync
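
A minimal sketch of what that looks like, roughly following those docs (check the docs for the current API, as option names may have changed):

    import { PGlite } from '@electric-sql/pglite'
    import { electricSync } from '@electric-sql/pglite-sync'

    // PGlite instance persisted to IndexedDB, with the Electric sync plugin.
    const pg = await PGlite.create('idb://my-app', {
      extensions: { electric: electricSync() },
    })

    // Sync an Electric "shape" (a filtered view of a Postgres table)
    // into a local table and keep it up to date.
    await pg.electric.syncShapeToTable({
      shape: { url: 'http://localhost:3000/v1/shape', params: { table: 'todos' } },
      table: 'todos',
      primaryKey: ['id'],
      shapeKey: 'todos',
    })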


There are a few people using it in prod for customer facing web apps.

Extensions are also available - we have a list here: https://pglite.dev/extensions/. We would love to make more available, though some are more complex than others. We are getting close to getting PostGIS to work; there is an open PR that anyone is welcome to pick up and hack on.


We have a long-running research project with the intention of creating a "libpglite" with a C FFI, compiled as a dynamic library for native embedding. We're making steady progress towards it.


It's now used by a huge number of developers for running local dev environments and emulating server products (Google Firebase and Prisma both embed it in their CLIs). Unit testing Postgres-backed apps is also made significantly easier with it.


Hey everyone, I work on PGlite. Excited to see this on HN again.

If you have any questions I'll be sure to answer them.

We recently crossed a massive usage milestone with over 3M weekly downloads (we're nearly at 4M!) - see https://www.npmjs.com/package/@electric-sql/pglite

While we originally built this for embedding into web apps, we have seen enormous growth in devtools and developer environments - both Google Firebase and Prisma have embedded PGlite into their CLIs to emulate their server products.
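
If you haven't tried it yet, basic usage looks roughly like this (a sketch; the table and queries are made up for illustration):

    import { PGlite } from '@electric-sql/pglite'

    // In-memory Postgres; pass 'idb://my-db' to persist to IndexedDB instead.
    const db = new PGlite()

    await db.exec(`CREATE TABLE IF NOT EXISTS todos (id serial PRIMARY KEY, title text)`)
    await db.query(`INSERT INTO todos (title) VALUES ($1)`, ['try PGlite'])

    const { rows } = await db.query<{ id: number; title: string }>(`SELECT * FROM todos`)
    console.log(rows)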


This looks really interesting...but why WASM-only? Naively it seems like WASM-ification would be a 2nd step, after lib-ification.

Obviously missing something...


If I understand correctly, what this project does is take the actual postgresql sources, which are written in C, compile them to wasm and provide typescript wrappers. So you need the wasm to be able to use the C code from js/ts.


Yes. I would like to use the code as a library from something other than js/ts.


You can use it in Rust if you like. I've used pglite through wasmer before. Also [pglite-oxide](https://lib.rs/crates/pglite-oxide) is pretty usable.


Sounds like you only need to create the APIs for calling into WASM, so as long as your language of choice can do that, you're good to go.


That adds extra unnecessary complexity. The code is written in C. There are C compilers for all CPUs. So just call the C code from <other language that's not JS>.


Well, a project has scope.

Looking at the repo, it started as postgres-in-the-browser. An abstract interface with C and wasm as targets is just more scope.

But it looks like the hard part of patching postgres to librar-ify it is already done agnostically in C.

So you just need to ctrl-f for "#if defined(__EMSCRIPTEN__)" to impl those else branches and port the emmake file to make.


So compile it and use it?


WASM means you only need to develop for one target runtime. That's my guess as to why.


Yeah... I was super excited by this project when it was first announced--and would even use it from Wasm--but since it ONLY works in Wasm, that seemed way too niche.


Hi there, would you like to share the progress of converting PGlite into a native system library? I can see there is a repo for that, but it hasn't been updated in 5 months.


We are actively looking into it. But as you can see from the comments here, there are quite a lot of other features that users want and we have limited bandwidth. We will do it!


This is awesome, thanks for your work! Could this work with the File System API in the browser to write to the user's disk instead of IndexedDB? I'm interested in easy ways of syncing for local-first single-user stuff <3 thanks again


That's a very nice idea, we will look into it!


I see you guys are working on supporting the PostGIS extension. This would be HUGE!!! The GIS community would be all over this.

If anyone who has compiled the PostGIS extension and is familiar with WASM wants to help out, you can do so here: https://github.com/electric-sql/pglite/pull/807


Thanks for your work!

Is the project interested in supporting http-vfs read-only use cases? I'm thinking of tools like DuckDB or sql.js-httpvfs that support reading blocks from a remote URL via range requests.

Curious because we build stuff like this https://news.ycombinator.com/item?id=45774571 at my lab, and the current ecosystem for http-vfs is very slim — a lot of proofs of concept, not many widely used and optimized libraries.

I have no idea if this makes sense for postgres — are the disk access patterns better or worse for http-vfs in postgres than they are in sqlite?
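
(For anyone unfamiliar with the pattern, the core trick is just fetching individual pages of a remote database file with HTTP Range requests, roughly like the sketch below. This is an illustration of the idea, not an existing PGlite API; 8192 bytes is Postgres's default page size.)

    // Fetch one fixed-size page of a remote database file via an HTTP Range request.
    // A real VFS would also cache pages and batch adjacent reads.
    async function readPage(url: string, pageNumber: number, pageSize = 8192): Promise<Uint8Array> {
      const start = pageNumber * pageSize
      const res = await fetch(url, {
        headers: { Range: `bytes=${start}-${start + pageSize - 1}` },
      })
      if (res.status !== 206) throw new Error(`server did not honour Range request: ${res.status}`)
      return new Uint8Array(await res.arrayBuffer())
    }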


This looks REALLY awesome. Could you name a few use cases where I would want to use this? Is the goal to be an SQLite/DuckDB alternative?


Any chance for a Flutter library?


I'm interested in using PGlite for local unit testing, but I'm using TimescaleDB in prod. Do you think you will have this extension pre-built for PGlite?


We have a walk-through on porting extensions to PGlite: https://pglite.dev/extensions/development#building-postgres-...


I'm not aware of anything trying to compile Timescale for it. Some extensions are easier than others: if there is limited (or ideally no) network IO and the extension is written in C (Timescale is!) with minimal dependencies, then it's a little easier to get it working.


I’ve had incredible success with testcontainers for local unit-testing


Does pglite in memory outperform “normal” postgres?

If so, then supporting the network protocol so it could be run in CI for non-JS languages would be really cool.


Look into the libeatmydata LD_PRELOAD library. It disables fsync and other durability syscalls, which is fabulous for CI. Materialize.com uses it for their CI; that's where I learned about it.


For CI you can already use PostgreSQL with the "eat-my-data" library. I don't know if there's a more official image, but at my company we're using https://github.com/allan-simon/postgres-eatmydata


You can just set fsync=off if you don't want to flush to disk and are OK with corruption in case of an OS/HW-level crash.


Huh, I always just mounted the data directory as a tmpfs/ramdisk. Worked nicely too.


Yupp, this has big potential for local-first!


Small world! We spoke about this at the QCon dinner.


Amazing work! It makes setting up CI so much easier.


Huh, could you tell me how you use it in CI?


I'm using it for a service that has DB dependencies. Instead of using SQLite in tests and PG in production, or spinning up a Postgres container, you use Postgres via pglite.

In my case, the focus is on DX, i.e. faster tests. I load a shared database from `pglite-schema.tgz` (~1040ms) instead of running migrations on a fresh DB, and then use transaction-rollback isolation (~10ms per test).

This is a lot faster and more convenient than spinning up a container. Test runs are 5x faster.

I'm hoping to get this working on a python service soon as well (with py-pglite).
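
A rough sketch of the pattern (PGlite's `loadDataDir` option plus vitest hooks is how I'd wire it up; not my exact code, and details will vary per stack):

    import { readFile } from 'node:fs/promises'
    import { beforeAll, beforeEach, afterEach } from 'vitest'
    import { PGlite } from '@electric-sql/pglite'

    let db: PGlite

    beforeAll(async () => {
      // Load a pre-migrated data directory once (~1s) instead of re-running
      // migrations for every test file.
      const dump = await readFile('pglite-schema.tgz')
      db = await PGlite.create({ loadDataDir: new Blob([dump]) })
    })

    // Each test runs inside a transaction that is rolled back afterwards,
    // so tests stay isolated without re-seeding the database (~10ms each).
    beforeEach(() => db.exec('BEGIN'))
    afterEach(() => db.exec('ROLLBACK'))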


Thank you for the details. This makes a lot of sense!


Well, downloads don't equal usage, do they?

How do you know how many deployments you actually have in the wild?


True, downloads don't equal usage, but there's a correlation. I also doubt deployment equals usage - I can deploy to some env and not make any requests.

Additionally, how can you get data on how many deployments there are without telemetry? The only telemetry I'm interested in is for my own uses, and I don't really care about sending deployment-count data to a third party. So the download count becomes a "good enough" metric.


I'm really intrigued by the use of differential dataflow in a static site toolkit, but there isn't much written about it. If anyone from the team is here, I would love it if you could explain how it's being used. Does this enable fast incremental builds, only rebuilding the parts of the output whose inputs changed? If so, how do you model that? Are you using multisets as the message format inside the engine?

For context, I work on TanStack DB which is a differential dataflow / DBSP inspired reactive client datastore. Super interested in your use case.


Excellent question. We're not using differential dataflow (DD), but are rolling our own differential runtime. It's basically functions stitched together with operators, heavily inspired by DD and RxJS, and optimized for performance and ease of use. The decision to build from scratch allows us to provide something that, IMHO, is much simpler to work with than DD and Rx, as our goal is to allow an ecosystem to evolve, as happened with MkDocs. For this, the API needs to be as simple as possible, while preserving DD semantics.

Right now, builds are not fast, since Python Markdown is our bottleneck. We decided to go this route to offer the best compatibility possible with the Material for MkDocs ecosystem, so users can switch easily. In the next 12 months, we'll be working on moving the rest of the code base gradually to Rust and entirely detaching from Python, which will make builds significantly faster. Rebuilds are already fast, due to the (still preliminary) caching we employ.

The differential runtime is called ZRX[1], which forms the foundation of Zensical.

[1]: https://github.com/zensical/zrx
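
To illustrate the core idea for readers unfamiliar with it: operators in a differential runtime consume and emit diffs (multisets of value/multiplicity pairs) rather than whole collections, so work is proportional to what changed. This is a toy TypeScript sketch of that diff-propagation idea, not ZRX's API (which is Rust):

    // A diff is a value plus a multiplicity: +1 = added, -1 = removed.
    type Diff<T> = { value: T; multiplicity: number }

    // An incremental `map`: transforms each change without touching the
    // rest of the collection, so output cost is proportional to the diff.
    function mapDiffs<A, B>(diffs: Diff<A>[], f: (a: A) => B): Diff<B>[] {
      return diffs.map(({ value, multiplicity }) => ({ value: f(value), multiplicity }))
    }

    // An incremental `filter`: drops changes whose values don't match.
    function filterDiffs<A>(diffs: Diff<A>[], pred: (a: A) => boolean): Diff<A>[] {
      return diffs.filter(({ value }) => pred(value))
    }

    // Editing one page produces a retraction of the old value and an
    // addition of the new one; downstream operators only see that delta.
    const changes: Diff<string>[] = [
      { value: 'docs/old.md', multiplicity: -1 },
      { value: 'docs/new.md', multiplicity: +1 },
    ]
    console.log(mapDiffs(filterDiffs(changes, (p) => p.endsWith('.md')), (p) => p.toUpperCase()))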

