Wondering about services to test on either a 16gb ram “AI Capable” arm64 board or on a laptop with modern rtx. Only looking for open source options, but curious to hear what people say. Cheers!
I run kobold.cpp which is a cutting edge local model engine, on my local gaming rig turned server. I like to play around with the latest models to see how they improve/change over time. The current chain of thought thinking models like deepseek r1 distills and qwen qwq are fun to poke at with advanced open ended STEM questions.
STEM questions like “What does Gödel’s incompleteness theorem imply about scientific theories of everything?” Or “Could the speed of light be more accurately refered to as ‘the speed of causality’?”
As for actual daily use, I prefer using mistral small 24b and treating it like a local search engine with the legitimacy of wikipedia. Its a starting point to ask questions about general things I don’t know about or want advice on, then do further research through more legitimate sources.
Its important to not take the LLM too seriously as theres always a small statistical chance it hallucinates some bullshit but most of the time its fairly accurate and is a pretty good jumping off point for further research.
Lets say I want an overview of how can I repair small holes forming in concrete, or general ideas on how to invest financially, how to change fluids in a car, how much fat and protein is in an egg, ect.
If the LLM says a word or related concept I don’t recognize I grill it for clarifying info and follow it through the infinite branching garden of related information.
I’ve used an LLM to help me go through old declassified documents and speculate on internal gov terminalogy I was unfamiliar with.
I’ve used a speech to text model and get it to speek just for fun. Ive used multimodal model and get it to see/scan documents for info.
Ive used websearch to get the model to retrieve information it didn’t know off a ddg search, again mostly for fun.
Feel free to ask me anything, I’m glad to help get newbies started.
LMStudio is pretty much the standard. I think it’s opensource except for the UI. Even if you don’t end up using it long-term, it’s great for getting used to a lot of the models.
Otherwise there’s OpenWebUI that I would imagine would work as a docker compose, as I think there’s ARM images for OWU and ollama
None currently. Wish I could afford a GPU to play with some stuff.
Well, let me know your suggestions if you wish. I took the plunge and am willing to test on your behalf, assuming I can.
I was able to run a distilled version of DeepSeek on Linux. I ran it inside a PODMAN container with ROCM support (I have an AMD GPU). It wasn’t super fast but for a locally deployed and self hosted option the performance was okay. Apart from that I have deployed Fooocus for image generation in a similar manner. Currently, I am working on deploying Stable Diffusion with either ComfyUI or Automatic1111 inside a PODMAN container with ROCM support.
Didn’t know about these image generation tools, besides Stable Diffusion. Thanks!