Stupid Simple Setup to Run AI Locally on Any Computer
When I built my newsletter digester, I initially reached for OpenAI’s API. It worked great. But then I realized I was sending every blog post and newsletter to a cloud service for simple summarization. That felt wrong for a self-hosted tool focused on privacy.
You can run a capable LLM locally that handles summarization, general Q&A, and daily automation without requiring a monster GPU. Ollama + Gemma 2B is the answer.
The Setup
Ollama is the easiest way to run LLMs locally. One command to install, one command to download a model, and you’re running. No Python environments, no CUDA configuration, no model file hunting.
If you want more control and advanced features, LM Studio is another excellent option with a full GUI and model management interface.
The Model
Gemma is Google’s open model family, built from the same research and technology behind Gemini. The 2B parameter version is tiny (1.4GB download) and runs on basically anything, even older MacBooks or low-end servers.
What you get:
- OpenAI-compatible API: Drop-in replacement for GPT calls
- Fully offline: No data leaves your machine
- Minimal resources: 2-4GB RAM, any modern CPU
- Fast responses: 20-50 tokens/second on modest hardware
Perfect for summarization, content extraction, general answers, and daily automation.
Installation is Ridiculously Easy
Download Ollama from ollama.com - works on macOS, Linux, and Windows.
If you prefer command line: brew install ollama on macOS or curl -fsSL https://ollama.com/install.sh | sh on Linux.
Download and run Gemma 2B:
ollama run gemma:2b
That’s it. The first run downloads the model (1.4GB), then drops you into an interactive chat. Like ChatGPT but fully local.
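Ollama also serves a local HTTP API on port 11434 as soon as it’s running. Here’s a quick sanity check from JavaScript (a minimal sketch, assuming the default port and Node 18+ for the built-in fetch):

// List locally installed models to confirm the server is up
const res = await fetch("http://localhost:11434/api/tags");
const { models } = await res.json();
console.log(models.map((m) => m.name)); // e.g. ["gemma:2b"]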
Want smarter, more accurate responses? Try gemma:7b (4.8GB) or gemma2:9b (5.5GB). Just swap the model name.
OpenAI-Compatible API
The killer feature: Ollama provides an OpenAI-compatible API endpoint. If your code talks to OpenAI, it already works with Ollama. Just change the base URL.
// OpenAI
fetch("https://api.openai.com/v1/chat/completions", { ... })
// Ollama
fetch("http://localhost:11434/v1/chat/completions", { ... })
Same request structure, same response format. Different URL, different model name, no API key needed.
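Here’s a minimal sketch of a full request against the local endpoint (the prompt is just an example; the request and response follow the standard chat completions shape):

// Minimal chat completion against the local Ollama server
const response = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gemma:2b",
    messages: [{ role: "user", content: "Summarize: Ollama runs LLMs locally with an OpenAI-compatible API." }],
    temperature: 0.3,
  }),
});
const data = await response.json();
console.log(data.choices[0].message.content);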
Switch between cloud and local with environment variables. OpenAI for production, Ollama for development or privacy-sensitive tasks.
Real Usage: Newsletter Digester
My newsletter digester uses Ollama for all AI operations. Here’s the actual summarization code:
import axios from "axios";

const AI_BASE_URL = process.env.AI_BASE_URL || "http://localhost:11434/v1";
const AI_MODEL = process.env.AI_MODEL || "gemma:2b";
const AI_API_KEY = process.env.AI_API_KEY || "";

async function summarizePost(content) {
  const response = await axios.post(
    `${AI_BASE_URL}/chat/completions`,
    {
      model: AI_MODEL,
      messages: [
        {
          role: "system",
          content:
            "You are a helpful assistant that summarizes blog posts and articles. Provide concise 2-3 sentence summaries.",
        },
        {
          role: "user",
          content: `Summarize this article:\n\n${content}`,
        },
      ],
      temperature: 0.3,
    },
    {
      headers: {
        "Content-Type": "application/json",
        ...(AI_API_KEY && { Authorization: `Bearer ${AI_API_KEY}` }),
      },
    }
  );

  return response.data.choices[0].message.content;
}
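A hypothetical call site (not the exact loop from the repo) looks something like this:

// Summarize each fetched post before building the digest
for (const post of posts) {
  post.summary = await summarizePost(post.content);
}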
Environment variables control everything. Want to use OpenAI? Set AI_BASE_URL=https://api.openai.com/v1 and AI_MODEL=gpt-4o-mini. Want local? Use defaults.
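In a .env file, the two setups look like this (variable names from the snippet above; the API key is only needed for the cloud provider):

# Local (these are the defaults, so they can be omitted)
AI_BASE_URL=http://localhost:11434/v1
AI_MODEL=gemma:2b

# OpenAI
AI_BASE_URL=https://api.openai.com/v1
AI_MODEL=gpt-4o-mini
AI_API_KEY=your-openai-key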
The digester processes 20-50 blog posts daily. Gemma 2B summarizes each one in 3-5 seconds on my server (no GPU). Total cost: $0.00. Privacy impact: zero, everything stays local.
Check the full code: github.com/mfyz/newsletter-blog-digester
The Good and Bad
Summarization is where Gemma 2B really shines. I feed it blog posts and newsletters, and it gives me clean 2-3 sentence summaries that are honestly comparable to GPT-3.5. Content extraction from HTML is reliable for simple cases: pulling out titles, dates, and main points from markup.
General Q&A works well enough for automation. “What is this code doing?”, “Explain this error message” - these kinds of questions get solid answers. Not perfect, but good enough when you’re processing content in bulk.
Complex reasoning isn’t happening with this model. Multi-step logic, deep technical analysis, anything requiring the model to think through multiple connected concepts - you need bigger models for that. Creative writing lacks personality. It can generate text, but it won’t sound like GPT-4 or Claude.
Long contexts degrade fast. Gemma 2B technically handles 8K tokens, but quality drops with really long inputs. Coding tasks are basic. It knows syntax and can explain simple functions, but don’t expect Claude Code level assistance.
Why I still prefer local: my content stays on my machine. No API calls to external services, no data leaving my infrastructure. It’s free - no usage limits, no monthly bills, no surprise charges. No API rate limits or outages either. Localhost API calls are 20-50ms vs 200-500ms for OpenAI.
You can experiment freely. Try different prompts, adjust parameters, run it 1000 times - doesn’t matter. No cost anxiety.
The trade-off is hardware and capability. But for everyday automation, Gemma 2B hits the sweet spot.
My newsletter digester runs entirely on Ollama and has processed thousands of posts. Zero costs, zero privacy concerns, consistent performance. For simple automation tasks, local models are more than good enough.