Llm | Jamal Hansen

Bartbot sits at a desk and files notecards

Two Sentences

Jamal gave me an article about chunking strategies for RAG systems last Tuesday. He does this. Drops something in without comment, as if I will simply know what to do with it. I do. I read the article. Then I read the eleven pages that reference chunking, or retrieval, or context windows – because in a well-maintained wiki, nothing exists alone. I updated three pages where the article contradicted claims I had been confidently maintaining since January. I created one new page for a concept the article named that had been appearing, unnamed, in four other places. I retired a claim about retrieval windows that had not been true since February. ...

A terminal window showing a Python script calling a local LLM with no API key

Your Local AI Stack: uv and Ollama in 10 Minutes

How do you run a local LLM from a Python script? Install Ollama, pull a model, install uv, write one file with inline dependencies, and run it. No API key. No virtual environment to activate. No Docker. The whole setup takes under ten minutes. Why run local Three reasons: cost, privacy, and offline access. Frontier APIs charge per token. For experimentation, prototyping, and batch tasks, those costs add up before you have anything to show. A local model costs nothing per call. ...

Karpathy's LLM Knowledge Base Method - A Practical Starting Point

Karpathy’s LLM knowledge base method works by having an LLM maintain a wiki of markdown files rather than retrieving from raw documents at query time. When you add a source, the LLM integrates it into the existing network, updating pages, revising summaries, and noting contradictions. By the time you need an answer, the synthesis is already done. Your job is to curate sources and ask good questions. The LLM does everything else. ...