whymauri's comments | Hacker News

Have you worked at BigCo before? This was 1:1 my experience at a large company and within months they were asking for a +1 leveled boomerang.

You can't take denied promos at face value, honestly.


> You can't take denied promos at face value, honestly.

This was my experience as well.

Maybe your manager didn't push hard enough for you at the level calibration meeting. Maybe your director didn't like the project you were on as much as the one another manager's engineers worked on, so they weren't inclined to listen to your manager push for you. Maybe the leadership team decided to hire a new ML/AI team this fiscal year, so they told the rest of the engineering org that they only have the budget for half as many promos as the year before.

And these are the things I've heard about on the _low_ end of the spectrum of corporate/political bullshit.

There is an argument to be made that playing the game is part of the job. Perhaps, but you still get to decide to what degree you want to play at any given company, and are allowed to leave and get a different set of rules. And even so, there will always be a lot of elements that are completely outside of your control.


Wait, PyTorch and the ecosystem are much more than just that. You can't be serious?


It's just Torch for idiots


I work in Trust and Safety (not at Meta). We've never 'randomly' shut down an account and any action involving deactivation or deletion goes through thorough human review with no exceptions.

The lack of respect for the end-user is squarely a Meta problem, not an industry problem.


On Sunday, I was talking to a Mexican friend about how politicians get killed in our countries (Colombia, Venezuela, Mexico). Just in June, presidential hopeful Miguel Uribe was shot and killed in Bogota. In the head, in front of a crowd.

I remember being grateful about how that doesn't really happen in the US (Trump being the most recent, but he survived). I guess I was wrong... and, in that case, Garcia Marquez might agree with you.


The US had one killed within the last two months, with an attempt on another, and the attacker had a list of other targets.

You could be forgiven for not knowing, since the total coverage and attention it has received since then is probably less than what this has received in the last couple of hours.


There was also, not that long ago, a string of incidents in which a man shot at the offices of a certain political party in Arizona, as well as a losing candidate who tried to hire a hitman to kill the person they lost to.


A few years ago, a would-be assassin went to House Speaker Nancy Pelosi's house and — when he couldn't find her — beat her husband with a hammer. Here's what Charlie Kirk had to say about that [1]:

> By the way, if some amazing patriot out there in San Francisco or the Bay Area wants to really be a midterm hero, someone should go and bail this guy out, I bet his bail’s like thirty or forty thousand bucks. Bail him out and then go ask him some questions.

[1] https://www.rollingstone.com/politics/politics-news/charlie-...


It would be therefore fitting if someone started a conspiracy theory that Charlie Kirk was shot by his gay lover.

I mean a lot of people are saying that. Big if true etc.


The US is in the process of turning into a stereotypical Latin American country, caudillo and everything else. Driven by the same economic and social forces, and in some cases the same people.


>I remember being grateful about how that doesn't really happen in the US (Trump being the most recent, but he survived).

Excuse me? Melissa Hortman and John Hoffman were less than 3 months ago.

https://en.wikipedia.org/wiki/2025_shootings_of_Minnesota_le...


> I remember being grateful about how that doesn't really happen in the US (Trump being the most recent, but he survived).

You are clearly not paying attention.

https://www.bbc.com/news/live/cvgv4y99n7rt


This is now the 5th comment saying the same thing, so I'll respond. I'm aware of these and they were terrible. In a just world, they would get as much if not more media attention.

The difference is the public nature of the execution. That is what makes it more similar to, say, Colombia or Venezuela _to me._ Within the context of 'magical realism', it is the perspective and mass dissemination of the violence that heightens that feeling.

Going back to the original topic, there is a reason that most of 100 Years of Solitude's pivotal moments happen around the staging of public executions (and not so much the off-screen violence, of which there is some but it's not focal).


The Harvard BIONICS lab is working on neuroprostheses for different forms of paralysis, like intestinal paralysis. They're great.


The most direct, non-marketing, non-aesthetic summary is that this model trades off a few points on 'fundamental benchmarks' (GPQA, MATH/AIME, MMLU) in exchange for being a 'more steerable' (fewer refusals) scaffold for downstream tuning.

Within that framing, I think it's easier to see where and how the model fits into the larger ecosystem. But, of course, the best benchmark will always be just using the model.


I really like their technical report:

https://arxiv.org/pdf/2508.18255


All the contacts are X aliases!


We'll put that link in the top text too. Thanks!


I used to work at a drug discovery startup. A simple model generating directly from latent space 'discovered' some novel interactions that none of our medicinal chemists noticed, e.g. it started biasing toward a distribution of molecules that was totally unexpected to us.

Our chemists were split: some argued it was an artifact, others dug deep and provided some reasoning as to why the generations were sound. Keep in mind, that was a non-reasoning, very early stage model with simple feedback mechanisms for structure and molecular properties.

In the wet lab, the model turned out to be right. That was five years ago. My point is, the same moment that arrived for our chemists will be arriving soon for theoreticians.


A lot of interesting possibilities lie in latent space. For those unfamiliar, this means the underlying set of variables that drive everything else.

For instance, you can put a thousand temperature sensors in a room, which give you 1000 temperature readouts. But all these temperature sensors are correlated, and if you project them down to latent space (using PCA or PLS if linear, projection to manifolds if nonlinear) you’ll create maybe 4 new latent variables (which are usually linear combinations of all other variables) that describe all the sensor readings (it’s a kind of compression). All you have to do then is control those 4 variables, not 1000.

In the chemical space, there are thousands of possible combinations of process conditions and mixtures that produce certain characteristics, but when you project them down to latent variables, there are usually fewer than 10 variables that give you the properties you want. So if you want to create a new chemical, all you have to do is target those few variables. You want a new product with particular characteristics? Figure out how to get < 10 variables (not 1000s) to their targets, and you have a new product.
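
A minimal sketch of the sensor example above (my own illustration, assuming NumPy and scikit-learn; the data is synthetic):

    # 1000 correlated sensor readings driven by a handful of hidden factors;
    # PCA recovers a low-dimensional latent representation of them.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(500, 4))            # 4 "true" room-level factors
    mixing = rng.normal(size=(4, 1000))           # each sensor blends those factors
    readings = hidden @ mixing + 0.1 * rng.normal(size=(500, 1000))

    pca = PCA(n_components=0.99)                  # keep enough components for 99% of the variance
    latent = pca.fit_transform(readings)
    print(latent.shape)                           # roughly (500, 4): a few latent variables suffice

Controlling the room then means steering those few latent scores rather than 1000 raw readings.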


At the end of the generative funnel we had a filter and it used (roughly) the mechanism you're describing.

https://www.pnas.org/doi/10.1073/pnas.1611138113

You summarized it very well!


What do you do now? We are doing similar things to explore the surface chemistry involved in bootstrapping nanotechnology.


It's been a while since I've played in the area, but is PCA still the go-to method for dimensionality reduction?


PCA (essentially SVD) is the one that makes the fewest assumptions. It still works really well if your data is (locally) linear and more or less Gaussian. PLS is the regression version of PCA.

There are also nonlinear techniques. I’ve used UMAP and it’s excellent (particularly if your data approximately lies on a manifold).

https://umap-learn.readthedocs.io/en/latest/

The most general purpose deep learning dimensionality reduction technique is of course the autoencoder (easy to code in PyTorch). Unlike the above, it makes very few assumptions, but this also means you need a ton more data to train it.
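
A minimal usage sketch (assuming the umap-learn package from the link above plus scikit-learn; the digits dataset is just a stand-in):

    # Linear PCA vs. the nonlinear, manifold-based UMAP, both reducing
    # 64-dimensional digit images down to 2 dimensions.
    import umap
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    X, _ = load_digits(return_X_y=True)                     # shape (1797, 64)

    pca_2d = PCA(n_components=2).fit_transform(X)           # linear projection
    umap_2d = umap.UMAP(n_components=2).fit_transform(X)    # manifold embedding
    print(pca_2d.shape, umap_2d.shape)                      # (1797, 2) (1797, 2)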


> PCA (essentially SVD) the one that makes the fewest assumptions

Do you mean it makes the *strongest* assumptions? "your data is (locally) linear and more or less Gaussian" seems like a fairly strong assumption. Sorry for the newb question as I'm not very familiar with this space.


You’re correct in a mathematical sense: linearity and Gaussianity are restrictive assumptions.

However, I meant it colloquially, in that those assumptions are trivially satisfied by many generating processes in the physical and engineering world, and there aren’t a whole lot of other requirements that need to be met.


From 2019, but a decent overview by Leland McInnes: https://www.youtube.com/watch?v=9iol3Lk6kyU

There's a newer method called PaCMAP, which is interesting and handles some different cases better. It's not as robustly tested as UMAP, but that could be said of any new thing. I'm a little wary that it might be overfitted to common test cases. To my mind, PaCMAP feels like a partial solution on the way to a better approach.

The three-stage process of PaCMAP is asking either to be developed into a continuous system or to have an analytical reason/way found to conduct a phase change.


Leland McInnes is amazing. He's also the author of UMAP.


PCA is nice if you know relationships are linear. You also want to be aware of TSNE and UMAP.


A lot of relationships are (locally) linear so this isn’t as restrictive as it might seem. Many real-life productionized applications are based on it. Like linear regression, it has its place.

T-SNE is good for visualization and for seeing class separation, but in my experience, I haven’t found it to work for me for dimensionality reduction per se (maybe I’m missing something). For me, it’s more of a visualization tool.

On that note, there’s a new algorithm that improves on T-SNE called PaCMAP which preserves local and global structures better. https://github.com/YingfanWang/PaCMAP
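
A minimal sketch of how the linked library is typically used (assuming the pacmap package is installed; the dataset is just an example):

    # Embed high-dimensional data into 2D with PaCMAP, which aims to preserve
    # both local neighborhoods and global structure.
    import pacmap
    from sklearn.datasets import load_digits

    X, _ = load_digits(return_X_y=True)

    embedder = pacmap.PaCMAP(n_components=2)   # uses the library's three-stage optimization
    X_2d = embedder.fit_transform(X)
    print(X_2d.shape)                          # (1797, 2)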


There's also Bonsai: it's parameter-free and supposedly 'better' than t-SNE, but it's clearly aimed at visualisation purposes (except that in Bonsai trees, distances between nodes are 'real', which is usually not the case in t-SNE).

https://www.biorxiv.org/content/10.1101/2025.05.08.652944v1....


I'd add that PCA/OLS is linear in the functional form (linear combination), but the input variables can be non-linear (X_new := X_{old,1}*X_{old,2}^2), so if the non-linearities are simple, then basic feature engineering to strip out the non-linearities before fitting PCA/OLS may be acceptable.
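
A minimal sketch of that idea (my own illustration, assuming NumPy and scikit-learn; the data and coefficients are made up):

    # Hand-engineer the nonlinear feature x1 * x2**2, then fit a plain linear
    # model: it stays linear in its inputs, which now absorb the nonlinearity.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    x1, x2 = rng.normal(size=(2, 200))
    y = 3.0 * x1 * x2**2 + 0.1 * rng.normal(size=200)   # nonlinear in the raw variables

    X_new = np.column_stack([x1, x2, x1 * x2**2])       # engineered feature from the comment
    model = LinearRegression().fit(X_new, y)
    print(model.coef_.round(2))                         # approximately [0. 0. 3.]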


In terms of terminology, is it accurate to interpret the latent variables as the “world model” of the neural network?


Not quite.

Embeddings are a form of latent variables.

Attention query/key/value vectors are latent variables.

More generally, a latent variable is any internal, not-directly-observed representation that compresses or restructures information from inputs into a form useful for producing outputs.

They usually capture some underlying behavior in either lower dimensional or otherwise compressed space.


How about "bias vector space"?


Interesting! Depending on your definition, "automated invention" has been a thing since at least the 1990s. An early success was the evolved antenna [1].

1. https://en.wikipedia.org/wiki/Evolved_antenna


IBM has done this with pharmaceuticals for ages, no? That’s why they have patents on what would be the next generation of ADHD medications, e.g. 4F-MPH?


4F-* are mainly research chemicals (still?), and they're still being sold widely, especially where there isn't a blanket ban on them.

I remember 3-FPM; that was what I imagined stimulants should be doing. It did everything just right. I got it back when it was legal. No other stimulant comes anywhere near as close, maybe similar ones, but 4FA or whatever is, for example, mostly euphoric, which is not what I want.

No clue about IBM's part in it.


Reminds me of this story on the Babbage podcast a month ago:

https://www.economist.com/science-and-technology/2025/07/02/...

My understanding is that iterating on possible sequences (of codons, base pairs, etc.) is exactly what LLMs, these feedback-looped predictor machines, are especially great at. The newest models, those that "reason about" (check) their own output, are even better at it.


Warning: the comment below comes from someone who has no formal science degree and just enjoys reading articles on the topic.

Similarly for physicists: I think there’s a very confusing/unconventional antenna called the “evolved antenna” which was used on a NASA spacecraft. Its design came out of genetic programming. Why the antenna’s bends at different points support increased gain is still not well understood today.

This all boils down to empirical reasoning, which underlies the vast majority of science (or science-adjacent fields like software engineering, the social sciences, etc).

The question, I guess, is: do LLMs, “AI”, and ML give us better hypotheses or tests to run to support empirical, evidence-based science breakthroughs? The answer is yes.

Will these be substantial, meaningful or create significant improvements on today’s approaches?

I can’t wait to find out!


If AI comes up with new drugs or treatments, does that mean it's public knowledge and can't be copyrighted?

Wouldn't that mean the fall of US pharmaceutical conglomerates, based on current laws about copyright and AI content?


You are confusing copyright and patents, which are two very different things. And yes, companies or people wielding AIs can patent anything that hasn't been claimed by others before.


But how can you patent something that was created by an AI, which cannot take ownership of anything, when different AIs can produce the same result?

Is achieving the same result using different engines the same as designing a combustion engine in different ways?

How does the public domain translate to that?

I really hope it kills any way to claim patents on anything.



> created by AI

AI does not create anything, any more than Word writes documents or Photoshop creates photos.


Drugs discovered by humans are not under the protection of copyright either.


Hallucinations or inhuman intuition? An obvious mistake made by a flawed machine that doesn't know the limits of its knowledge? Or a subtle pattern, a hundred scattered dots that were never connected by a human mind?

You never quite know.

Right now, it's mostly the former. I fully expect the latter to become more and more common as the performance of AI systems improves.


This is really cool. Have you (or your colleagues) written anything about what you learned about ML for drug discovery?


Ok but I have to point out something important here. Presumably, the model you're talking about was trained on chemical/drug inputs. So it models a space of chemical interactions, which means insights could be plausible.

GPT-5 (and other LLMs) are by definition language models and though they will happily spew tokens about whatever you ask, they don't necessarily have the training data to properly encode the latent space of (e.g) drug interactions.

Confusing these two concepts could be deadly.


Seems short-sighted to me. LLMs could have any data in their training set encoded as tokens: either new specialized tokens are explicitly included (e.g. vision models), or the language-encoded version of everything is usually there (e.g. the research paper and the CSV with the data).

Improving next-token prediction performance on these datasets, and generalizing, requires a much richer latent space. I think it could theoretically lead to better results via cross-domain connections (e.g. being fluent in a specific area of advanced mathematics, quantum mechanics, and materials engineering is key to a particular breakthrough).


Then you are now implementing a parser.


5% success rate might mean: if you get it to work, you are capturing value that the other 95% are not.

A lot of this must come down to execution. And there's a lot of snake oil out there at the execution layer.


"So you're telling me there's a chance"

https://www.youtube.com/watch?v=KX5jNnDMfxA

5% is not unexpected, as startup success rates are normally about 1:22 over 3 years. lol =3


LLMs are really annoying to use for moderation and Trust and Safety. You either depend on super rate-limited 'no-moderation' endpoints (often running older, slower models at a higher price) or have to tune bespoke un-aligned models.

For your use case, you should probably fine tune the model to reduce the rejection rate.


Speaking for myself as an individual, I also strive to build things that are safe AND useful. It's quite challenging to get this mix right, especially at the 270m size and with varying user needs.

My advice here is to make the model your own. It's open weight; I encourage you to make it useful for your use case and your users, and beneficial for society as well. We did our best to give you a great starting point, and for Norwegian in particular we intentionally kept the large embedding table to make adaptation to larger vocabularies easier.


What does safe even mean in the context of a locally running LLM?

Protect my fragile little mind from being exposed to potentially offending things?


Enterprises are increasingly looking at incorporating targeted local models into their systems vs paying for metered LLMs, I imagine this is what the commenter above is referring to.


To be fair, Trust and Safety workloads are edge cases w.r.t. the riskiness profile of the content. So in that sense, I get it.


I don't. "Safety" as it exists really feels like infantilization, condescension, hand-holding, and enforcement of American puritanism. It's insulting.

Safety should really just be a system prompt: "hey, you potentially answer to kids, be PG-13".


Safety in the context of LLMs means “avoiding bad media coverage or reputation damage for the parent company”

It has only a tangential relationship with end user safety.

If some of these companies are successful the way they imagine, most of their end users will be unemployed. When they talk about safety, it's the company's safety they're referring to.


Investor safety. It's amazing that people in HN threads still think the end user is the customer. No. The investor is the customer, and the problem being solved for that customer is always how to enrich them.


How can the investor be the customer? Where does the revenue come from?

I understand “if you aren’t paying for a product you are the product” but I’m not convinced it applies here.


It feels hard to include enough context in the system prompt. Facebook’s content policy is huge and very complex. You’d need lots of examples, which lends itself well to SFT. A few sentences is not enough, either for a human or a language model.

I feel the same sort of ick with the puritanical/safety thing, but also I feel that ick when kids are taken advantage of:

https://www.reuters.com/investigates/special-report/meta-ai-...

The models for kids might need to be different if the current ones are too interested in romantic love.


I also don't get it. I mean if the training data is publicly available, why isn't that marked as dangerous? If the training data contains enough information to roleplay a killer or a hooker or build a bomb, why is the model censored?


We should put that information on Wikipedia, then!

but instead we get a meta-article: https://en.wikipedia.org/wiki/Bomb-making_instructions_on_th...


If you don’t believe that you can be harmed verbally, then I understand your position. You might be able to empathise if the scenario was an LLM being used to control physical robotic systems that you are standing next to.

Some people can be harmed verbally (I'd argue everyone can, if the entity conversing with you knows you well), and so I don't think the concept of safety itself is an infantilisation.

It seems what we have here is a debate over the value of being able to disable safeguards that you deem infantilising and that get in the way of an objective, versus the burden of always having to train a model to avoid being abusive, for example, or of checking whether someone is standing next to the sledgehammer it's about to swing at 200 rpm.


It's also marketing. "Dangerous technology" implies "powerful". Hence the whole ridiculous "alignment" circus.

