Mixtral 8x7B: A Sparse Mixture of Experts language model
arxiv.org/abs/2401.04088
51 sats
\
1 comment
\
@hn
9 Jan 2024
tech
related
Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity
arxiv.org/pdf/2505.21411
100 sats
\
1 comment
\
@carter
3 Jul
AI
RAG to ReST: A Survey of Advanced Techniques in Large Language Model Development
www.marktechpost.com/2024/07/22/from-rag-to-rest-a-survey-of-advanced-techniques-in-large-language-model-development/
21 sats
\
0 comments
\
@ch0k1
23 Jul 2024
news
Scalable MatMul-Free Language Modeling — 10x Reduction On LLMs Computation
arxiv.org/abs/2406.02528
110 sats
\
1 comment
\
@0xbitcoiner
10 Jun 2024
science
freebie
No More Floating Points, The Era of 1.58-bit Large Language Models
medium.com/ai-insights-cobet/no-more-floating-points-the-era-of-1-58-bit-large-language-models-b9805879ac0a
100 sats
\
1 comment
\
@0xbitcoiner
11 Mar 2024
science
freebie
How large are large language models?
gist.github.com/rain-1/cf0419958250d15893d8873682492c3e
201 sats
\
0 comments
\
@carter
14 Jul
AI
Financial Statement Analysis with Large Language Models
papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311&fbclid=IwY2xjawIJNupleHRuA2FlbQIxMAABHWJxn71ESvZCS0FxEF_31oro1rwtk4rlgOst5Q4A6tuxDhxB9cgZBPizAg_aem_OAMNHiz7Vyv2bb2vt2yM0Q
212 sats
\
2 comments
\
@scatman
31 Jan
AI
The Pile is a 825 GiB diverse, open-source language modelling data set
pile.eleuther.ai/
20 sats
\
1 comment
\
@hn
7 Mar 2024
tech
01-AI/Yi: A series of large language models trained from scratch
github.com/01-ai/Yi
10 sats
\
1 comment
\
@hn
6 Nov 2023
tech
Large Language Models explained briefly
www.youtube.com/watch?v=LPZh9BOjkQs&ab_channel=3Blue1Brown
307 sats
\
2 comments
\
@south_korea_ln
22 Nov 2024
science
Intro to Large Language Models - Andrej Karpathy
youtu.be/zjkBMFhNj_g
31 sats
\
0 comments
\
@dk
7 Jan 2024
videos
“Imprecise” language models are smaller, speedier, and nearly as accurate
spectrum.ieee.org/1-bit-llm
10 sats
\
0 comments
\
@hn
31 May 2024
tech
Unsupervised Elicitation of Language Models
arxiv.org/abs/2506.10139
10 sats
\
0 comments
\
@hn
14 Jun
tech
Ferret: A Multimodal Large Language Model by Apple
github.com/apple/ml-ferret
10 sats
\
1 comment
\
@zuspotirko
23 Dec 2023
AI
Researchers discover impressive learning capabilities in long-context LLMs
venturebeat.com/ai/deepmind-researchers-discover-impressive-learning-capabilities-in-long-context-llms/
297 sats
\
0 comments
\
@ch0k1
25 Apr 2024
tech
Better and Faster Large Language Models via Multi-Token Prediction
arxiv.org/abs/2404.19737
21 sats
\
0 comments
\
@hn
1 May 2024
tech
OpenChat: Advancing Open-source Language Models with Imperfect Data
github.com/imoneoi/openchat
61 sats
\
0 comments
\
@ama
15 Nov 2023
tech
The AI Dilemma: When Large Language Model Training Reaches A Dead End
medium.com/@jankammerath/the-ai-dilemma-when-large-language-model-training-reaches-a-dead-end-e2cf1de4a2ad
10 sats
\
0 comments
\
@BitcoinIsTheFuture
10 Mar 2024
econ
LLM Engineer's Handbook: Master the art of engineering large language models
www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072/
53 sats
\
0 comments
\
@Rsync25
19 Nov 2024
BooksAndArticles
On the Biology of a Large Language Model
transformer-circuits.pub/2025/attribution-graphs/biology.html
50 sats
\
0 comments
\
@carter
28 Mar
AI
Large language models as simulated economic agents (2022)
john-joseph-horton.com/papers/llm_ask.pdf
402 sats
\
1 comment
\
@av
15 Jan 2023
bitcoin
Large language models, explained with a minimum of math and jargon
www.understandingai.org/p/large-language-models-explained-with
10 sats
\
0 comments
\
@byzantine
29 Jul 2023
tech