FlashAttention: Fast Transformer training with long sequences
www.adept.ai/blog/flashier-attention
10 sats
\
1 comment
\
@hn
1 Oct 2023
tech
related
Google AI Proposes TransformerFAM: A Novel Transformer Architecture
www.marktechpost.com/2024/04/17/google-ai-proposes-transformerfam-a-novel-transformer-architecture-that-leverages-a-feedback-loop-to-enable-the-neural-network-to-attend-to-its-latent-representations/
61 sats
\
2 comments
\
@ch0k1
20 Apr 2024
tech
The Engineer's Guide to Deep Learning: Understanding the Transformer Model
www.interdb.jp/dl/
240 sats
\
0 comments
\
@hn
16 Jul 2024
tech
Mass Editing Memory in a Transformer
memit.baulab.info/
60 sats
\
1 comment
\
@hn
21 Apr 2023
tech
But what is a GPT? Visual intro to Transformers | Deep learning, chapter 5
m.youtube.com/watch?v=wjZofJX0v4M
1000 sats
\
0 comments
\
@south_korea_ln
2 Apr 2024
science
Understanding Transformers Using A Minimal Example
rti.github.io/gptvis/
228 sats
\
0 comments
\
@carter
4 Sep
AI
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
arxiv.org/abs/2402.12875
10 sats
\
0 comments
\
@Rsync25
17 Sep 2024
tech
RustGPT: A pure-Rust transformer LLM built from scratch
github.com/tekaratzas/RustGPT
100 sats
\
0 comments
\
@hn
15 Sep
tech
VGGT: Visual Geometry Grounded Transformer
github.com/facebookresearch/vggt
10 sats
\
0 comments
\
@hn
25 Mar
tech
New AI Paradigm?! Energy-Based Transformers Explained
www.youtube.com/watch?v=LUQkWzjv2RM
100 sats
\
0 comments
\
@carter
6 Sep
AI
Visualizing Attention, a Transformer's Heart [video]
www.3blue1brown.com/lessons/attention
31 sats
\
0 comments
\
@hn
15 Apr 2024
tech
Generative AI exists because of the transformer
ig.ft.com/generative-ai/
232 sats
\
2 comments
\
@elvismercury
14 Oct 2023
tech
Transformer – Spreadsheet
www.byhand.ai/p/transformer-spreadsheet
9 sats
\
0 comments
\
@hn
7 Feb
tech
Sohu – first specialized chip (ASIC) for transformer models
x.com/Etched/status/1805625693113663834
42 sats
\
0 comments
\
@Rsync25
25 Jun 2024
alter_native
The FFT Strikes Back: An Efficient Alternative to Self-Attention
arxiv.org/abs/2502.18394
69 sats
\
0 comments
\
@hn
26 Feb
tech
PixNerd: Pixel Neural Field Diffusion
arxiv.org/abs/2507.23268
302 sats
\
2 comments
\
@optimism
4 Aug
AI
3Blue1Brown: How might LLMs store facts | Chapter 7, Deep Learning
www.youtube.com/watch?v=9-Jl0dxWQs8
184 sats
\
3 comments
\
@south_korea_ln
5 Sep 2024
science
Launch HN: Deepsilicon (YC S24) – Software and hardware for ternary transforme
news.ycombinator.com/item?id=41490196
21 sats
\
0 comments
\
@hn
9 Sep 2024
tech
A look at Apple’s new Transformer-powered predictive text model
jackcook.com/2023/09/08/predictive-text.html
10 sats
\
1 comment
\
@hn
17 Sep 2023
tech
Rapid-fire Macro Reorientation
thebitcoinlayer.substack.com/p/rapid-fire-macro-reorientation
76 sats
\
1 comment
\
@ExponentialBTC
9 Dec 2022
bitcoin
Donut: OCR-Free Document Understanding Transformer
github.com/clovaai/donut
30 sats
\
1 comment
\
@hn
29 May 2023
tech
Alibaba's Wan 2.5 video generation neural network has been released
wan.video/
187 sats
\
2 comments
\
@lunin
25 Sep
AI