Efficient LLM inference solution on Intel GPU
arxiv.org/abs/2401.05391
0 sats · 1 comment · @hn · 20 Jan 2024 · tech

AMD unveils powerful new AI chip to challenge Nvidia
arstechnica.com/ai/2024/10/amd-unveils-powerful-new-ai-chip-to-challenge-nvidia/
41 sats · 0 comments · @ch0k1 · 11 Oct 2024 · news

NVIDIA DGX Spark Arrives for World’s AI Developers
nvidianews.nvidia.com/news/nvidia-dgx-spark-arrives-for-worlds-ai-developers
121 sats · 1 comment · @jakoyoh629 · 14 Oct · AI

A New RISC-V Breakthrough Chip Merges CPU, GPU & AI into One - techovedas
techovedas.com/a-new-risc-v-breakthrough-chip-merges-cpu-gpu-ai-into-one/
78 sats · 0 comments · @ch0k1 · 6 Apr 2024 · tech

Hybrid AMD Graphics and VM with GPU Passthrough - Can actually run faster!
120 sats · 0 comments · @l0k18 · 20 Jun 2023 · tech

Meet PowerInfer: A Fast LLM on a Single Consumer-Grade GPU
www.marktechpost.com/2023/12/23/meet-powerinfer-a-fast-large-language-model-llm-on-a-single-consumer-grade-gpu-that-speeds-up-machine-learning-model-inference-by-11-times/
10 sats · 2 comments · @ch0k1 · 24 Dec 2023 · AI

AMD's MI300X Outperforms Nvidia's H100 for LLM Inference
www.blog.tensorwave.com/amds-mi300x-outperforms-nvidias-h100-for-llm-inference/
202 sats · 0 comments · @hn · 13 Jun 2024 · tech

ATLAS: A New Paradigm in LLM Inference via Runtime-Learning Accelerators
www.together.ai/blog/adaptive-learning-speculator-system-atlas
100 sats · 0 comments · @carter · 14 Oct · AI

LLM in a Flash: Efficient LLM Inference with Limited Memory
huggingface.co/papers/2312.11514
13 sats · 1 comment · @hn · 20 Dec 2023 · tech

Compiling LLMs into a MegaKernel: A path to low-latency inference
zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17
10 sats · 0 comments · @hn · 19 Jun · tech

Hardware Acceleration of LLMs: A comprehensive survey and comparison
arxiv.org/abs/2409.03384
21 sats · 0 comments · @hn · 7 Sep 2024 · tech

Lm.rs: Minimal CPU LLM inference in Rust with no dependency
github.com/samuel-vitorino/lm.rs
10 sats · 0 comments · @hn · 11 Oct 2024 · tech

NVIDIA: Transforming LLM Alignment with Efficient Reinforcement Learning
www.marktechpost.com/2024/05/05/nvidia-ai-open-sources-nemo-aligner-transforming-large-language-model-alignment-with-efficient-reinforcement-learning/
20 sats · 0 comments · @ch0k1 · 7 May 2024 · tech

Bend: a high-level language that runs on GPUs (via HVM2)
github.com/HigherOrderCO/Bend
51 sats · 0 comments · @hn · 17 May 2024 · tech

Nvidia Shows Off GPU for Ultra-Long Context Models
developer.nvidia.com/blog/nvidia-rubin-cpx-accelerates-inference-performance-and-efficiency-for-1m-token-context-workloads/
157 sats · 1 comment · @lunin · 14 Sep · AI

1-Bit LLM: The Most Efficient LLM Possible?
www.youtube.com/watch?v=7hMoz9q4zv0
533 sats · 1 comment · @carter · 24 Jun · AI

Apple collaborates with NVIDIA to research faster LLM performance - 9to5Mac
9to5mac.com/2024/12/18/apple-collaborates-with-nvidia-to-research-faster-llm-performance/
14 sats · 1 comment · @Rsync25 · 19 Dec 2024 · tech

Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM
www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm
306 sats · 1 comment · @nullama · 13 Apr 2023 · bitcoin

Minimal implementation of Mamba, the new LLM architecture, in 1 file of PyTorch
github.com/johnma2006/mamba-minimal
15 sats · 1 comment · @hn · 20 Dec 2023 · tech

Running LLMs Locally on AMD GPUs with Ollama
community.amd.com/t5/ai/running-llms-locally-on-amd-gpus-with-ollama/ba-p/713266
10 sats · 0 comments · @Rsync25 · 27 Sep 2024 · tech

NVIDIA 560 Linux Driver Released with Open GPU Kernel Modules by Default
9to5linux.com/nvidia-560-linux-driver-released-with-open-gpu-kernel-modules-by-default
31 sats · 0 comments · @ch0k1 · 22 Aug 2024 · news

Hidet: A Deep Learning Compiler for Efficient Model Serving
pytorch.org/blog/introducing-hidet/
110 sats · 1 comment · @hn · 28 Apr 2023 · tech