A terminal window showing a Python script calling a local LLM with no API key

Your Local AI Stack: uv and Ollama in 10 Minutes

How do you run a local LLM from a Python script? Install Ollama, pull a model, install uv, write one file with inline dependencies, and run it. No API key. No virtual environment to activate. No Docker. The whole setup takes under ten minutes. Why run local? Three reasons: cost, privacy, and offline access. Frontier APIs charge per token. For experimentation, prototyping, and batch tasks, those costs add up before you have anything to show. A local model costs nothing per call. ...
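The "one file with inline dependencies" setup might look like this minimal sketch: a single script with uv's inline metadata block (PEP 723) that talks to Ollama's default local endpoint over the standard library, so there is nothing to install beyond uv and Ollama themselves. The model name `llama3.2` and filename `ask.py` are assumptions; substitute whatever you pulled with `ollama pull`.

```python
# /// script
# requires-python = ">=3.9"
# ///
"""Call a local Ollama model from one self-contained file: `uv run ask.py`."""
import json
import urllib.request

# Ollama's default local API endpoint; no API key required.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str, model: str = "llama3.2") -> dict:
    # Payload shape for Ollama's /api/chat endpoint. The model name is an
    # assumption -- use any model you have pulled locally.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON response instead of a token stream
    }


def ask(prompt: str, model: str = "llama3.2") -> str:
    # POST the chat request and return the assistant's reply text.
    payload = json.dumps(build_chat_request(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


# Usage (requires a running Ollama server):
#   print(ask("Say hello in five words."))
```

Since the script declares its own requirements in the metadata block, `uv run ask.py` handles the environment for you; there is no venv to create or activate.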

April 10, 2026 · 4 min · Jamal Hansen
A home garden growing happily

Why I Run AI Locally (and You Might Want to)

As I admitted in posts 1 and 2, I vibe-coded and lived to tell the tale. This post answers the question I kept avoiding: why run any of this locally when cloud models are flatly better at the task? The answer isn't that cloud AI is bad. It's more complicated than that. It turns out that's the wrong question, and there is a place for both tools. Frontier cloud models are impressive. Cloud models are better than anything I can run locally for complex reasoning. They handle more context, have larger parameter counts, and represent the current state of the art. For plenty of tasks, they are a frankly amazing tool. ...

March 22, 2026 · 5 min · Jamal Hansen

I trusted three local AI models, and Python had to clean up their mess

Previously, I reported that I vibe-coded a tool that reads a blog post I’ve written and generates platform-specific promo copy using a local Ollama model. I chose local models because I’m curious about them. They seem to be the future of AI, at least for use cases like this… and it works… sort of. Now, the continued story of how I trusted three local AI models and Python had to clean up after them. The truth is that I was asking too much of them, and they returned occasionally insightful and often malformed and hallucinatory results. ...
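The "Python had to clean up after them" cleanup might look like this minimal sketch, assuming (as the post suggests) the model was asked for structured output and returned it wrapped in markdown fences or chatty prose. The function name `extract_json` and the fallback strategy are illustrative, not the post's actual implementation.

```python
import json
import re


def extract_json(raw: str):
    """Salvage a JSON object from a chatty local-model reply.

    Small models often wrap JSON in markdown code fences or surround it
    with prose. Strip the fences, try a straight parse, then fall back
    to the outermost {...} span. Returns None if nothing parses.
    """
    # Drop markdown code fences like ```json ... ```
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass
    # Fall back to the widest brace-delimited span in the text.
    match = re.search(r"\{.*\}", cleaned, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    return None
```

A guard like this turns "often malformed" output into a recoverable failure: you either get a dict you can validate further, or a None that tells you to retry the model call.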

March 13, 2026 · 9 min · Jamal Hansen
A tool box with some socket wrenches in it

I Vibe Coded a Local AI-Powered Promo Generator

Every Monday, I publish a blog post. Then I write five slightly different versions of “hey, I wrote a thing” for LinkedIn, Twitter, Bluesky, and Mastodon. Each platform has different character limits, different audiences, and different best practices. It’s tedious. I wanted to automate it. Not with a frontier model, but with a small local one running on my laptop. Something like phi or llama, through Ollama. I didn’t need a polished production app. I needed a quick prototype to test my theory. My theory was that a small local model can handle a real, recurring task. …and it can do it well enough to be useful. ...

February 28, 2026 · 6 min · Jamal Hansen