Hacker News
The Sparsely-Gated Mixture-of-Experts Layer (2017) [pdf] (arxiv.org)
1 point by swatson741 2 hours ago | past | discuss
Large Language Models Struggle to Learn Long-Tail Knowledge (2023) (arxiv.org)
1 point by wslh 2 hours ago | past | discuss
Information, complexity, brains and reality (Kolmogorov Manifesto) (2007) (arxiv.org)
2 points by jxmorris12 14 hours ago | past | discuss
LLMs, LoRA, and Slerp Shape Representational Geometry of Embeddings (arxiv.org)
1 point by PaulHoule 14 hours ago | past | discuss
Deep sequence models tend to memorize geometrically; it is unclear why (arxiv.org)
3 points by tzury 16 hours ago | past | discuss
Optimal Software Pipelining and Warp Specialization for Tensor Core GPUs (arxiv.org)
1 point by matt_d 16 hours ago | past | discuss
Generative Caching for Structurally Similar Prompts and Responses (arxiv.org)
1 point by PaulHoule 18 hours ago | past | discuss
Propose, Solve, Verify: Self-Play Through Formal Verification (arxiv.org)
2 points by imakwana 19 hours ago | past | discuss
Memelang: "Axial grammar" makes ultra token efficient query strings (arxiv.org)
1 point by bri-holt 21 hours ago | past | discuss
Position: Privacy Is Not Just Memorization (arxiv.org)
1 point by PaulHoule 22 hours ago | past | discuss
A Profit-Based Measure of Lending Discrimination (arxiv.org)
3 points by neehao 1 day ago | past | discuss
Memelang: Terse SQL uses "axial grammar" for LLM generation (arxiv.org)
1 point by bri-holt 1 day ago | past | discuss
Automating Deception: Scalable Multi-Turn LLM Jailbreaks (arxiv.org)
3 points by PaulHoule 1 day ago | past | discuss
ChatGPT: Excellent Paper Accept It. Editor: Imposter Found Review Rejected (arxiv.org)
1 point by belter 1 day ago | past | discuss
Designing Predictable LLM-Verifier Systems for Formal Method Guarantee (arxiv.org)
58 points by PaulHoule 1 day ago | past | 12 comments
Toward Training Superintelligent Software Agents Through Self-Play SWE-RL (arxiv.org)
1 point by pama 1 day ago | past | discuss
Hypothesis Testing with E-Values (arxiv.org)
3 points by weiliddat 1 day ago | past | discuss
Towards a Science of Scaling Agent Systems (arxiv.org)
1 point by Anon84 2 days ago | past | discuss
Beyond Context: Large Language Models Failure to Grasp Users Intent (arxiv.org)
4 points by mpweiher 2 days ago | past | discuss
Toward Training Superintelligent Software Agents Through Self-Play SWE-RL (arxiv.org)
1 point by klipt 2 days ago | past | discuss
Epistemological Fault Lines Between Human and Artificial Intelligence (arxiv.org)
2 points by sarusso 2 days ago | past | discuss
Large Causal Models from Large Language Models (arxiv.org)
2 points by walterbell 2 days ago | past | discuss
Prompt Repetition Improves Non-Reasoning LLMs (arxiv.org)
2 points by ksec 2 days ago | past | 1 comment
Memelang: Token-efficient LLM query language (arxiv.org)
2 points by bri-holt 2 days ago | past | discuss
A Century of Noether's Theorem (arxiv.org)
61 points by fanf2 2 days ago | past | 9 comments
Emergent temporal abstractions in autoregressive models enable hierarchical RL (arxiv.org)
2 points by simonpure 3 days ago | past | discuss
Large Causal Models from Large Language Models (arxiv.org)
2 points by Anon84 3 days ago | past | discuss
Extremal descendant integrals on spaces of curves: inequality proved with AI (arxiv.org)
2 points by thunderbong 3 days ago | past | discuss
Attention Is Not What You Need: Grassmann Flows as an Attention-Free Alternative (arxiv.org)
3 points by lexandstuff 3 days ago | past | discuss
Dual Codebook Representation Learning for Generative Recommendation (arxiv.org)
2 points by PaulHoule 3 days ago | past | discuss
