S02

Local vs Cloud LLMs

Running AI on your computer vs using online services

Type | What It Means | Examples
-----|---------------|---------
Cloud | Use someone else's servers via internet | ChatGPT, Claude, Gemini
Local | Run the AI on your own computer | Ollama, LM Studio

Both have their place. Let's figure out which is right for you.


Cloud LLMs (ChatGPT, Claude, etc.)

How It Works

Your device → Internet → Company's servers → Response back to you

The AI runs on powerful computers owned by OpenAI, Anthropic, or Google. You just send messages and get responses.

Advantages

Advantage | Why It Matters
----------|---------------
No setup | Create account, start chatting
Most powerful models | GPT-4, Claude 3.5, etc. are only available here
No hardware needed | Works on any phone, tablet, or computer
Always updated | Companies improve models without you doing anything
Free tiers | Casual use costs nothing

Disadvantages

Disadvantage | Why It Matters
-------------|---------------
Privacy concerns | Your conversations go through their servers
Internet required | No connection = no AI
Can be slow | Server load affects response time
Usage limits | Free tiers have caps; heavy use costs money
Data policy concerns | May be used for training (check settings)
Censorship/guardrails | Some topics are restricted

Best For

  • Most people, most of the time
  • When you need the smartest models
  • When you don't want to deal with technical setup
  • When privacy isn't a critical concern

Local LLMs (Ollama, LM Studio)

How It Works

Your device → Your device
(Everything stays on your computer)

You download the AI model and run it on your own hardware. No internet needed once set up.

Advantages

Advantage | Why It Matters
----------|---------------
Complete privacy | Data never leaves your computer
Works offline | Use on planes, in remote areas, anywhere
No usage limits | Run as much as you want
No subscription fees | Free forever after setup
Full control | No censorship or content restrictions
Customizable | Can fine-tune for specific tasks

Disadvantages

Disadvantage | Why It Matters
-------------|---------------
Hardware requirements | Need a decent computer (see below)
Smaller models | Best models don't run locally (yet)
Setup required | Need some technical comfort
Slower | Unless you have a good GPU
No internet features | Can't search web, access current info
Your responsibility | Updates, troubleshooting, etc.

Best For

  • Privacy-conscious users
  • Developers building applications
  • Offline use cases
  • Learning how AI works
  • Avoiding monthly subscriptions

Hardware Requirements for Local LLMs

Minimum Specs

Component | Minimum | Recommended
----------|---------|------------
RAM | 8 GB | 16-32 GB
Storage | 10 GB free | 50+ GB free
CPU | Modern 4-core | 8+ cores
GPU | Not required | NVIDIA with 8+ GB VRAM

What Can You Run?

Your Hardware | What Models Work | Quality
--------------|------------------|--------
Basic laptop (8GB RAM) | 3B parameter models | Basic, but fast
Good laptop (16GB RAM) | 7-8B parameter models | Decent
Gaming PC (32GB RAM + GPU) | 13-70B parameter models | Good
Workstation (64GB+ RAM, pro GPU) | 70B+ parameter models | Great

Model Size Guide

Model Size | RAM Needed | Quality Comparison
-----------|------------|-------------------
3B | 4-6 GB | Basic assistant
7-8B | 8-10 GB | Like GPT-3.5
13B | 16 GB | Better reasoning
30-34B | 24-32 GB | Approaching GPT-4 quality
70B | 48+ GB | Strong performance
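The RAM figures above follow from simple arithmetic: each parameter costs a fraction of a byte once quantized, plus runtime overhead. Here is a rough sketch; the 4-bit default and 20% overhead factor are illustrative assumptions, and real usage also needs headroom for the OS and a context-length-dependent cache, which is why the table's numbers run higher:

```python
# Rough RAM estimate for running a quantized local model.
# Assumptions (illustrative): weights dominate memory use, plus ~20%
# overhead for runtime buffers and cache. 4-bit quantization = 0.5
# bytes per weight; 8-bit = 1 byte; fp16 = 2 bytes.

def estimate_ram_gb(params_billions: float, bits_per_weight: int = 4,
                    overhead: float = 1.2) -> float:
    """Estimate RAM (GB): parameters x bytes-per-weight x overhead."""
    bytes_per_weight = bits_per_weight / 8           # e.g. 4 bits = 0.5 bytes
    weights_gb = params_billions * bytes_per_weight  # 1B params ~ 1 GB at 1 byte each
    return round(weights_gb * overhead, 1)

for size in (3, 8, 13, 70):
    print(f"{size}B @ 4-bit: ~{estimate_ram_gb(size)} GB")
```

Doubling the bit-width roughly doubles the estimate, which is why the same model can fit on one laptop at 4-bit but not at 8-bit.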

How to Get Started: Cloud

Option 1: ChatGPT

  1. Go to chat.openai.com
  2. Click "Sign Up"
  3. Create account (email or Google)
  4. Start chatting

Free tier: GPT-3.5 unlimited, limited GPT-4o

Option 2: Claude

  1. Go to claude.ai
  2. Click "Try Claude"
  3. Create account
  4. Start chatting

Free tier: Generous daily limits

Option 3: Gemini

  1. Go to gemini.google.com
  2. Sign in with Google account
  3. Start chatting

Free tier: Unlimited basic use


How to Get Started: Local

Option 1: Ollama (Easiest for Mac/Linux)

# Install (Mac/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Run a model
ollama run llama3.2

# That's it! Start chatting.

Windows: Download from ollama.com

Option 2: LM Studio (Easiest for Everyone)

  1. Go to lmstudio.ai
  2. Download for your OS
  3. Open the app
  4. Browse and download a model (app handles this)
  5. Click "Chat" and start talking

Best for: People who prefer graphical interfaces

Option 3: Jan (Simple, Cross-Platform)

  1. Go to jan.ai
  2. Download the app
  3. Download a model through the app
  4. Chat

Best for: Beginners who want a ChatGPT-like experience locally


Which Model Should You Try?

For most people starting out:

Llama 3.2 3B - Good balance of quality and speed, runs on most computers

# With Ollama
ollama run llama3.2

If you have more RAM (16GB+):

Llama 3.1 8B - Better quality, still reasonable speed

ollama run llama3.1:8b

Privacy Deep Dive

Cloud Privacy Concerns

Concern | Reality | Mitigation
--------|---------|-----------
Conversations stored | Usually yes, for service improvement | Check data settings
Used for training | Sometimes; opt-out usually available | Disable in settings
Employees can see chats | Rarely, for safety review | Don't share secrets
Data breaches possible | Small risk, but big companies are targets | Avoid very sensitive info

What NOT to Put in Cloud LLMs

  • Passwords or secret keys
  • Financial account numbers
  • Medical records
  • Confidential business strategies
  • Personal secrets you wouldn't share
  • Client confidential information

When Local is Worth the Effort

  • Processing sensitive documents
  • Journaling or personal reflection
  • Business confidential work
  • Medical or legal information
  • Anything you wouldn't email

Cost Comparison

Cloud Costs

Service | Free Tier | Paid Tier
--------|-----------|----------
ChatGPT | GPT-3.5 unlimited | $20/month for Plus
Claude | ~100 messages/day | $20/month for Pro
Gemini | Unlimited basic | $20/month for Advanced
API (per token) | Varies | $0.50-30 per million tokens

  • Typical user: free tier is enough
  • Power user: $20/month
  • Developer building apps: $10-100/month depending on usage
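Per-million-token API pricing turns into per-request cost with simple arithmetic. A quick sketch; the $3 input / $15 output rates are illustrative only, so check your provider's pricing page for real numbers:

```python
# Per-request API cost from per-million-token prices.
# The example rates below are illustrative, not any provider's actual pricing.

def api_cost_usd(input_tokens: int, output_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost in USD for one request at the given per-million-token rates."""
    return round(input_tokens / 1e6 * in_price_per_m
                 + output_tokens / 1e6 * out_price_per_m, 4)

# A typical chat turn: 2,000 tokens in, 500 tokens out, at $3/$15 per million
print(api_cost_usd(2_000, 500, 3.0, 15.0))  # 0.0135
```

At about a penny per exchange, casual API use is cheap; costs add up only with long contexts or high volume.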

Local Costs

Component | One-time Cost | Ongoing Cost
----------|---------------|-------------
Software | Free (Ollama, LM Studio) | $0
Models | Free (open weights) | $0
Electricity | $0 | ~$2-10/month if heavy use
Hardware (if upgrading) | $500-2000+ | $0

  • If you already have a decent computer: essentially free
  • If you need to upgrade: significant upfront cost, but no subscription
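Whether an upgrade pays off against a $20/month plan is a simple break-even calculation. A back-of-envelope sketch, ignoring electricity and hardware resale value for simplicity:

```python
# Break-even: how many months of a cloud subscription a one-time hardware
# purchase replaces. Electricity and resale value are ignored for simplicity;
# the $20/month default matches the paid tiers above.
import math

def breakeven_months(hardware_cost: float, monthly_plan: float = 20.0) -> int:
    return math.ceil(hardware_cost / monthly_plan)

print(breakeven_months(500))    # 25 months (just over 2 years)
print(breakeven_months(2000))   # 100 months (over 8 years)
```

The takeaway: a modest upgrade can pay for itself within a couple of years of heavy use, while a high-end workstation rarely makes sense on cost grounds alone.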


The Hybrid Approach

Many people use both:

Task | Use
-----|----
General questions | Cloud (faster, smarter)
Sensitive documents | Local (privacy)
Creative writing | Either (test both)
Coding help | Cloud (better for now)
Offline work | Local (obviously)
Learning how LLMs work | Local (full access)

The best of both worlds:

  • Keep a cloud account for powerful models
  • Set up a local option for privacy-sensitive tasks
  • Use whichever fits each situation

Quick Decision Guide

Use Cloud If:

  • [ ] You want the smartest AI available
  • [ ] You don't want technical setup
  • [ ] Privacy isn't a major concern
  • [ ] You're always online
  • [ ] You want features like web search, image generation

Use Local If:

  • [ ] Privacy is important to you
  • [ ] You work with sensitive information
  • [ ] You want to avoid subscriptions
  • [ ] You're often offline
  • [ ] You want to learn how AI works
  • [ ] You have decent hardware

Use Both If:

  • [ ] You want flexibility
  • [ ] Different tasks have different needs
  • [ ] You're a curious person who likes options
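The checklists above can be sketched as a toy scoring function. This is purely illustrative; real decisions involve judgment, not a formula, and the four flags here are a deliberate simplification:

```python
# Toy encoding of the decision guide: tally local-leaning and cloud-leaning
# answers and suggest a path. Illustrative only, not a real recommendation engine.

def suggest(privacy_matters: bool, often_offline: bool,
            wants_best_models: bool, has_decent_hardware: bool) -> str:
    local_score = privacy_matters + often_offline + has_decent_hardware
    cloud_score = wants_best_models + (not has_decent_hardware)
    if local_score and cloud_score:
        return "both"       # reasons point in both directions
    return "local" if local_score else "cloud"

print(suggest(privacy_matters=True, often_offline=False,
              wants_best_models=True, has_decent_hardware=True))  # both
```

Note that "both" is the most common outcome, which matches the hybrid approach above.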

Model Comparison (Honest Assessment)

Capability | Best Cloud | Best Local | Gap
-----------|------------|------------|----
General chat | GPT-4/Claude | Llama 3.1 70B | Noticeable
Simple tasks | Any | Any 7B model | Small
Coding | Claude/GPT-4 | DeepSeek Coder | Moderate
Creative writing | Claude | Llama 3.1 | Moderate
Reasoning | GPT-4/Claude | Llama 3.1 70B | Noticeable
Speed | Fast | Depends on hardware | Varies

Bottom line: Cloud models are still noticeably better, but local models are improving rapidly and are "good enough" for many tasks.


Key Takeaways

  1. Cloud is easier, local is more private - choose based on your needs
  2. Free cloud tiers are generous - most people never need to pay
  3. Local is surprisingly good - especially for 7B+ models
  4. You can use both - many people do
  5. Hardware matters for local - but it's getting better
  6. Start with cloud - add local when you have a reason to

What's Next?

  • Using cloud? Just go to Claude.ai or ChatGPT and start
  • Trying local? Download LM Studio and try Llama 3.2
  • Ready to go deep? Check out the Deep Dive modules based on Karpathy's video

Original content comparing local and cloud approaches.