Efficient LLM inference solution on Intel GPU
arxiv.org/abs/2401.05391
0 sats · 1 comment · @hn · 20 Jan 2024 · tech

AMD unveils powerful new AI chip to challenge Nvidia
arstechnica.com/ai/2024/10/amd-unveils-powerful-new-ai-chip-to-challenge-nvidia/
41 sats · 0 comments · @ch0k1 · 11 Oct 2024 · news

NVIDIA DGX Spark Arrives for World’s AI Developers
nvidianews.nvidia.com/news/nvidia-dgx-spark-arrives-for-worlds-ai-developers
121 sats · 1 comment · @jakoyoh629 · 14 Oct · AI

A New RISC-V Breakthrough Chip Merges CPU, GPU & AI into One - techovedas
techovedas.com/a-new-risc-v-breakthrough-chip-merges-cpu-gpu-ai-into-one/
78 sats · 0 comments · @ch0k1 · 6 Apr 2024 · tech

Hybrid AMD Graphics and VM with GPU Passthrough - Can actually run faster!
120 sats · 0 comments · @l0k18 · 20 Jun 2023 · tech

Meet PowerInfer: A Fast LLM on a Single Consumer-Grade GPU
www.marktechpost.com/2023/12/23/meet-powerinfer-a-fast-large-language-model-llm-on-a-single-consumer-grade-gpu-that-speeds-up-machine-learning-model-inference-by-11-times/
10 sats · 2 comments · @ch0k1 · 24 Dec 2023 · AI

AMD's MI300X Outperforms Nvidia's H100 for LLM Inference
www.blog.tensorwave.com/amds-mi300x-outperforms-nvidias-h100-for-llm-inference/
202 sats · 0 comments · @hn · 13 Jun 2024 · tech

ATLAS: A New Paradigm in LLM Inference via Runtime-Learning Accelerators
www.together.ai/blog/adaptive-learning-speculator-system-atlas
100 sats · 0 comments · @carter · 14 Oct · AI

LLM in a Flash: Efficient LLM Inference with Limited Memory
huggingface.co/papers/2312.11514
13 sats · 1 comment · @hn · 20 Dec 2023 · tech

Compiling LLMs into a MegaKernel: A path to low-latency inference
zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17
10 sats · 0 comments · @hn · 19 Jun · tech

Hardware Acceleration of LLMs: A comprehensive survey and comparison
arxiv.org/abs/2409.03384
21 sats · 0 comments · @hn · 7 Sep 2024 · tech

Lm.rs: Minimal CPU LLM inference in Rust with no dependency
github.com/samuel-vitorino/lm.rs
10 sats · 0 comments · @hn · 11 Oct 2024 · tech

NVIDIA: Transforming LLM Alignment with Efficient Reinforcement Learning
www.marktechpost.com/2024/05/05/nvidia-ai-open-sources-nemo-aligner-transforming-large-language-model-alignment-with-efficient-reinforcement-learning/
20 sats · 0 comments · @ch0k1 · 7 May 2024 · tech

Bend: a high-level language that runs on GPUs (via HVM2)
github.com/HigherOrderCO/Bend
51 sats · 0 comments · @hn · 17 May 2024 · tech

Nvidia Shows Off GPU for Ultra-Long Context Models
developer.nvidia.com/blog/nvidia-rubin-cpx-accelerates-inference-performance-and-efficiency-for-1m-token-context-workloads/
157 sats · 1 comment · @lunin · 14 Sep · AI

1-Bit LLM: The Most Efficient LLM Possible?
www.youtube.com/watch?v=7hMoz9q4zv0
533 sats · 1 comment · @carter · 24 Jun · AI

Apple collaborates with NVIDIA to research faster LLM performance - 9to5Mac
9to5mac.com/2024/12/18/apple-collaborates-with-nvidia-to-research-faster-llm-performance/
14 sats · 1 comment · @Rsync25 · 19 Dec 2024 · tech

Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM
www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm
306 sats · 1 comment · @nullama · 13 Apr 2023 · bitcoin

Minimal implementation of Mamba, the new LLM architecture, in 1 file of PyTorch
github.com/johnma2006/mamba-minimal
15 sats · 1 comment · @hn · 20 Dec 2023 · tech

Running LLMs Locally on AMD GPUs with Ollama
community.amd.com/t5/ai/running-llms-locally-on-amd-gpus-with-ollama/ba-p/713266
10 sats · 0 comments · @Rsync25 · 27 Sep 2024 · tech

NVIDIA 560 Linux Driver Released with Open GPU Kernel Modules by Default
9to5linux.com/nvidia-560-linux-driver-released-with-open-gpu-kernel-modules-by-default
31 sats · 0 comments · @ch0k1 · 22 Aug 2024 · news

Hidet: A Deep Learning Compiler for Efficient Model Serving
pytorch.org/blog/introducing-hidet/
110 sats · 1 comment · @hn · 28 Apr 2023 · tech