Hacker News: Newest shared a link post in group #Hacker News
github.com

Make loading weights 10-100x faster by jart · Pull Request #613 · ggerganov/llama.cpp

This is a breaking change that's going to give us three benefits:

- Your inference commands should load 100x faster
- You may be able to safely load models 2x larger
- You can run many concurrent infere...
