Hacker News | deepdarkforest's comments

> Codex is more hands off, I personally prefer that over claude's more hands-on approach

Agree, and it's a nice reflection of the two companies' goals. OpenAI is about AGI, and they're under insane pressure from investors to show that's still the goal; hence with Codex, when it works, they can say "look, it ran for 5 hours!", disregarding that 90% of the time the output is pure trash.

Anthropic/Boris, meanwhile, is more about value now: more grounded/realistic, providing a more consistent, hence more trustworthy and intuitive, experience that you can steer (even if Dario says the opposite). The ceiling/best-case scenario of a Claude Code session is maybe a bit lower than Codex's, but with less variance.


Well, if you had tried using GPT/Codex for development you would know that the output from those 5 hours would not be 90% trash; it would be close to 100% pure magic. I'm not kidding. It's incredible as long as you use a proper analyze-plan-implement-test-document process.

> which does pose interesting questions over nvidia's throne...

> Zebra-Llama is a family of hybrid large language models (LLMs) proposed by AMD that...

Hmmm


It's definitely cool, and engineering-wise close to SOTA, given Lovable and all of the app generators.

But, assuming you are trying to sit in between Lovable and Google, how are you not going to be steamrolled by Google or Perplexity etc. the moment you get solid traction? Like, if your insight for v3 was that the model should make its own tools, so even less is hardcoded, then I just don't see a moat or any vertical direction. What really is the difference?


Thanks, and great question. The custom Phind models are really key here -- off-the-shelf models (even SOTA models from big labs) are slow and error-prone when it comes to generating full websites on-the-fly.

Our long-term vision is to build a fully personalized internet. For Google this is an innovator's dilemma, as Google currently serves as a portal to the current internet.


All VCs have preferred shares, meaning that in a liquidation like this one, they get their investment back first, and then the remainder gets shared.

Additionally, depending on the round, they may also have multiples, like 2x, meaning they get at least 2x their investment back before anyone else gets anything.
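To make the preference math concrete, here's a toy sketch. All the numbers, and the non-participating structure (investor takes the better of preference or pro-rata, not both), are assumptions for illustration, not details of any specific deal:

```python
def payout(invested, preference_multiple, ownership, sale_price):
    """Non-participating preference: the investor takes the better of
    (multiple * investment) or their pro-rata share of the sale,
    capped at the sale price; common shareholders split the rest."""
    preference = preference_multiple * invested
    pro_rata = ownership * sale_price
    investor = min(sale_price, max(preference, pro_rata))
    common = sale_price - investor
    return investor, common

# Hypothetical: $10M invested at 2x for 20% ownership; company sells for $30M.
investor, common = payout(10e6, 2, 0.20, 30e6)
# The 2x preference ($20M) beats the pro-rata share (20% of $30M = $6M),
# so the investor takes $20M off the top, leaving $10M for everyone else.
```

In a big exit the pro-rata share wins instead, and the preference becomes irrelevant; it mostly bites in middling or bad outcomes.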


Because the secret is that the web runs on advertising/targeted recommendations. Brezos(tm) wants you to actively browse Ramazon so he can harvest your data, search patterns, etc. Amazon and most sites like it are deliberately crawl-unfriendly for this reason. Why would Brezos let Saltman get all the juicy preference data?


On the foundational level: test-time compute (reasoning), heavy RL post-training, 1M+ context lengths, etc.

On the application layer, connecting with sandboxes/VMs is one of the biggest shifts (Cloudflare's Code Mode, etc.). Giving an LLM a sandbox unlocks on-the-fly computation, calculations, RPA, anything really.

MCP, or rather standardized function calling, is another one.
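In case it's useful, here's a schematic of what standardized function calling boils down to. This is not any real MCP client; the tool name and JSON shape are made up, and a real protocol adds schemas, discovery, and error handling on top:

```python
import json

# Hypothetical registry of tools the host exposes to the model.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def handle_model_output(raw):
    """The core loop: the model emits a structured call, the host
    looks up the tool, executes it, and feeds the result back in."""
    call = json.loads(raw)
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

result = handle_model_output('{"tool": "get_weather", "args": {"city": "Athens"}}')
# result == "Sunny in Athens"
```

The point of standardizing this is that any model and any tool server agreeing on the wire format can interoperate, instead of every app hardcoding its own dispatch.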

Also, local LLMs are becoming almost viable thanks to better and better distillation, plus relying on quick web searches for facts, etc.


Test-time compute (reasoning): running the thing in a loop so it can hallucinate on its own words.

Let's not kid ourselves as to what it actually is.


A very obvious AI review with 80 points(?), plus a couple more comments. Discussion also here: https://old.reddit.com/r/MachineLearning/comments/1oyce03/d_...


Their page itself looks classic v0/AI-generated: that yellow/orange warning box, plus the general shadows/borders, screams LLM slop. Is it too hard these days to spend 30 minutes thinking about the UI/user experience?

I actually like the idea, not sure about monetization.

It also requires access to all the data?? And it's not even open source.


> I actually like the idea, not sure about monetization.

To be fair, we're not sure about monetization either :) We just had a lot of fun building it and have enjoyed seeing what people make with it.

> It also requires access to all the data??

Think of us like Tampermonkey/some other userscript manager. The scripts you run have to go through our script engine. That means that any data/permission your script needs access to, our script needs access to. We do try to make the scripting transparent. If you're familiar with the Greasemonkey API, we show you which permissions a given script requests (e.g. here https://www.tweeks.io/share/script/d856f07a2cb843c5bfa1b455, requires GM_addStyle)


I'm working on Flavia, an ultra-low-latency voice AI data analyst that can join your meetings. You can throw in data (CSVs, Postgres DBs, BigQuery, and PostHog analytics for now) and just talk and ask questions. Using Cerebras (~2,000 tokens per second) and very low-latency sandboxes spun up on the fly, you get back charts/tables/analysis in under 1 second (excluding the time of the actual SQL query if you're on BigQuery).

She can also join your Google Meet or Teams meetings and share her screen, and then everyone in the meeting can ask questions and see live results. Currently being used by product managers and executives, mainly for analytics and data science use cases.

We plan to open-source it soon if there's demand. Very fast voice+actions is the future, imo.

https://www.tryflavia.com/


This sounds amazing. A demo video would help me finish sign up - I can’t try it without hooking it up to real data, and I don’t want to for a test.


Great feedback, thanks! We've added a synthetic e-commerce dataset as an example when you sign up, so you can test it without your own data first. Will also add a demo video ASAP.


What kind of plan do you have with Cerebras? It seems like something like that would need one of the $1500/month plans at least if there were more than a handful of customers.


They introduced pay-as-you-go recently. The limits on that are similar to the plans (1 million tokens per minute), so if you stack a few keys and do simple load balancing with Redis, you can cover a decent amount of traffic with no upfront cost. Eventually we'd have to go enterprise though, yes!
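For illustration, a stripped-down sketch of that kind of key rotation. The key names are made up, and a real multi-worker setup would keep the counter in Redis (a single INCR per request) so all workers share it; here a local counter stands in so the sketch is self-contained:

```python
import itertools

# Hypothetical pool of pay-as-you-go API keys; each key carries its own
# tokens-per-minute budget, so the pool's effective budget adds up.
API_KEYS = ["key-a", "key-b", "key-c"]
_counter = itertools.count()  # in production: redis.incr("key_counter")

def next_key():
    """Round-robin over the key pool so load spreads evenly."""
    return API_KEYS[next(_counter) % len(API_KEYS)]

picked = [next_key() for _ in range(4)]
# picked == ["key-a", "key-b", "key-c", "key-a"]
```

Round-robin via a shared atomic counter is about the simplest balancing that stays correct across processes; anything fancier (per-key token accounting) only matters once requests vary a lot in size.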


Ok.. when I tried to use pay-as-you-go it was unusable for me because there were a ton of 429s and 503s. In one test it was just constant 429s or 503s for a few seconds straight.

I am using it for a voice application though, so retrying causes a delay the user doesn't expect, especially if it stays unavailable for a few seconds.


Hey! I have a business inquiry for you but I don't see a contact anywhere on your website. Is there a place I can reach you? Thank you!


Are you guys built on Recall, or did you build out the meeting-joining functionality yourselves?


Breaking news: For profit company chases profit, briefly pretends it's not while it is


This is it exactly.

Plus, why do people think OAI is still special? Facebook, Google, and many smaller companies are doing the exact same work developing models.


It is special because of what is being discussed here: it attempted (pretended?) to do so as a non-profit, which arguably gave it early support by people who otherwise may not have provided it. None of the other players you mention did so, which to me makes it an unfair advantage. Or not, given that it seems that anything is fair that you can get away with these days.

