To: Pravin Kamdar who wrote (72953) 12/16/2025 5:15:21 PM
From: Pravin Kamdar of 73040
 
I've been coming up to speed on how to do local inference of models on my laptop, which has an Nvidia RTX 3060 discrete GPU with 6 GB of VRAM. I wanted to use ollama on the GPU to run small models. After reinstalling the Nvidia drivers, installing ollama, and downloading the 7 GB Mistral model, it works as a local chatbot at a speedy 50 tokens per second.
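
For anyone who wants to try the same thing, here's a minimal sketch of hitting the local ollama server from Python once the model is pulled. The prompt is just a placeholder, and I'm assuming ollama's default endpoint and its documented eval_count/eval_duration response fields:

```python
import json
import urllib.request

# ollama serves a local HTTP API on port 11434 by default.
# This asks the Mistral model for a completion and computes
# tokens/sec from the eval stats ollama returns.
URL = "http://localhost:11434/api/generate"

payload = {
    "model": "mistral",          # pulled earlier with `ollama pull mistral`
    "prompt": "Explain VRAM in one sentence.",
    "stream": False,             # one JSON blob instead of a token stream
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

print(result["response"])

# eval_count = tokens generated, eval_duration = time spent in nanoseconds
tokens_per_sec = result["eval_count"] / (result["eval_duration"] / 1e9)
print(f"{tokens_per_sec:.1f} tokens/sec")
```

With stream set to True you get tokens back incrementally instead, which is what a chat UI would use.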

It was much harder getting Cursor to attach to the local model through the Continue extension. It's clunky, but I got it set up to use either the new cloud models as programming agents or the local models downloaded and served through ollama.
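
In case it saves someone the fumbling: the key piece was pointing Continue at the ollama server in its config. This is roughly the shape of it, assuming the older ~/.continue/config.json format (newer Continue versions moved to a YAML config, so adjust accordingly; the title is just a label I picked):

```json
{
  "models": [
    {
      "title": "Mistral (local)",
      "provider": "ollama",
      "model": "mistral",
      "apiBase": "http://localhost:11434"
    }
  ]
}
```

With that entry in place, the local model shows up in Continue's model dropdown alongside the cloud ones.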

Pretty cool. Learning a lot.