# Running AI on Your Computer vs. Using Online Services

| Approach | What It Means | Examples |
|---|---|---|
| Cloud | Use someone else's servers via the internet | ChatGPT, Claude, Gemini |
| Local | Run the AI on your own computer | Ollama, LM Studio |

Both have their place. Let's figure out which is right for you.
## Cloud LLMs (ChatGPT, Claude, etc.)

### How It Works

Your device → Internet → Company's servers → Response back to you
The AI runs on powerful computers owned by OpenAI, Anthropic, or Google. You just send messages and get responses.
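Under the hood, every cloud chat is one HTTPS round trip to a hosted API. A minimal sketch using OpenAI's chat completions endpoint (the model name is just an example, and it assumes you have an API key in the `OPENAI_API_KEY` environment variable):

```shell
# One round trip to a cloud LLM: your prompt goes out, the reply comes back.
# Assumes OPENAI_API_KEY is set; prints a hint otherwise.
if [ -n "$OPENAI_API_KEY" ]; then
  curl -s https://api.openai.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'
else
  echo "Set OPENAI_API_KEY to try this"
fi
```

Chat apps like ChatGPT do exactly this for you behind their interface - which is also why your messages necessarily pass through the provider's servers.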
### Advantages

| Advantage | Why It Matters |
|---|---|
| No setup | Create account, start chatting |
| Most powerful models | GPT-4, Claude 3.5, etc. are only available here |
| No hardware needed | Works on any phone, tablet, or computer |
| Always updated | Companies improve models without you doing anything |
| Free tiers | Casual use costs nothing |
### Disadvantages

| Disadvantage | Why It Matters |
|---|---|
| Privacy concerns | Your conversations go through their servers |
| Internet required | No connection = no AI |
| Can be slow | Server load affects response time |
| Usage limits | Free tiers have caps; heavy use costs money |
| Data policy concerns | May be used for training (check settings) |
| Censorship/guardrails | Some topics are restricted |
### Best For
- Most people, most of the time
- When you need the smartest models
- When you don't want to deal with technical setup
- When privacy isn't a critical concern
## Local LLMs (Ollama, LM Studio)

### How It Works

Your device → Your device
(Everything stays on your computer)
You download the AI model and run it on your own hardware. No internet needed once set up.
### Advantages

| Advantage | Why It Matters |
|---|---|
| Complete privacy | Data never leaves your computer |
| Works offline | Use on planes, in remote areas, anywhere |
| No usage limits | Run as much as you want |
| No subscription fees | Free forever after setup |
| Full control | No censorship or content restrictions |
| Customizable | Can fine-tune for specific tasks |
### Disadvantages

| Disadvantage | Why It Matters |
|---|---|
| Hardware requirements | Need a decent computer (see below) |
| Smaller models | Best models don't run locally (yet) |
| Setup required | Need some technical comfort |
| Slower responses | Generation is sluggish unless you have a good GPU |
| No internet features | Can't search web, access current info |
| Your responsibility | Updates, troubleshooting, etc. |
### Best For
- Privacy-conscious users
- Developers building applications
- Offline use cases
- Learning how AI works
- Avoiding monthly subscriptions
## Hardware Requirements for Local LLMs

### Minimum Specs

| Component | Minimum | Recommended |
|---|---|---|
| RAM | 8 GB | 16-32 GB |
| Storage | 10 GB free | 50+ GB free |
| CPU | Modern 4-core | 8+ cores |
| GPU | Not required | NVIDIA with 8+ GB VRAM |
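Not sure what your machine has? On Linux you can check with a few standard commands (a sketch; macOS equivalents are noted in the comments):

```shell
# CPU cores (macOS: sysctl -n hw.ncpu)
nproc

# Total RAM (macOS: sysctl -n hw.memsize)
grep MemTotal /proc/meminfo

# Free disk space on the current drive
df -h .
```

Compare the numbers against the table above before downloading a model.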
### What Can You Run?

| Your Hardware | What Models Work | Quality |
|---|---|---|
| Basic laptop (8GB RAM) | 3B parameter models | Basic but fast |
| Good laptop (16GB RAM) | 7-8B parameter models | Decent |
| Gaming PC (32GB RAM + GPU) | 13-70B parameter models | Good |
| Workstation (64GB+ RAM, pro GPU) | 70B+ parameter models | Great |
### Model Size Guide

| Model Size | RAM Needed | Quality Comparison |
|---|---|---|
| 3B | 4-6 GB | Basic assistant |
| 7-8B | 8-10 GB | Like GPT-3.5 |
| 13B | 16 GB | Better reasoning |
| 30-34B | 24-32 GB | Approaching GPT-4 quality |
| 70B | 48+ GB | Strong performance |
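The RAM figures above follow from simple arithmetic: a 4-bit quantized model needs roughly half a byte per parameter, plus overhead for context and the runtime (the 1.2 overhead factor here is a rough illustrative assumption, not an exact rule):

```shell
# Rough RAM estimate for an 8B model quantized to 4 bits per weight:
# RAM ≈ parameters × bytes-per-weight × ~1.2 overhead
awk 'BEGIN { params = 8e9; bytes = 0.5; printf "~%.1f GB\n", params * bytes * 1.2 / 1e9 }'
```

That lands near the table's 8-10 GB row once the OS's own memory use is added; full 16-bit weights would need roughly four times as much.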
## How to Get Started: Cloud

### Option 1: ChatGPT
- Go to chat.openai.com
- Click "Sign Up"
- Create account (email or Google)
- Start chatting
Free tier: GPT-3.5 unlimited, limited GPT-4o
### Option 2: Claude
- Go to claude.ai
- Click "Try Claude"
- Create account
- Start chatting
Free tier: Generous daily limits
### Option 3: Gemini
- Go to gemini.google.com
- Sign in with Google account
- Start chatting
Free tier: Unlimited basic use
## How to Get Started: Local

### Option 1: Ollama (Easiest for Mac/Linux)

```bash
# Install (Mac/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Run a model
ollama run llama3.2

# That's it! Start chatting.
```

Windows: Download from ollama.com
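Beyond the interactive chat, Ollama also runs a local HTTP server (port 11434 by default) that your own scripts can call. A sketch using its documented `/api/generate` endpoint, assuming the server is running:

```shell
# Ask the local server for a completion; falls back to a message if it's not up.
if curl -s --max-time 2 http://localhost:11434/ >/dev/null; then
  curl -s http://localhost:11434/api/generate -d '{
    "model": "llama3.2",
    "prompt": "Why is the sky blue?",
    "stream": false
  }'
else
  echo "Ollama server not reachable on localhost:11434"
fi
```

This is the same pattern as the cloud APIs - except the request never leaves your machine.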
### Option 2: LM Studio (Easiest for Everyone)
- Go to lmstudio.ai
- Download for your OS
- Open the app
- Browse and download a model (app handles this)
- Click "Chat" and start talking
Best for: People who prefer graphical interfaces
### Option 3: Jan (Simple, Cross-Platform)
- Go to jan.ai
- Download the app
- Download a model through the app
- Chat
Best for: Beginners who want a ChatGPT-like experience locally
## Recommended First Local Model

For most people starting out:

**Llama 3.2 3B** - Good balance of quality and speed; runs on most computers

```bash
# With Ollama
ollama run llama3.2
```
If you have more RAM (16GB+):

**Llama 3.1 8B** - Better quality, still reasonable speed

```bash
ollama run llama3.1:8b
```
## Privacy Deep Dive

### Cloud Privacy Concerns

| Concern | Reality | Mitigation |
|---|---|---|
| Conversations stored | Usually yes, for service improvement | Check data settings |
| Used for training | Sometimes, usually opt-out available | Disable in settings |
| Employees can see chats | Rarely, for safety review | Don't share secrets |
| Data breaches possible | Small risk, big companies are targets | Avoid very sensitive info |
### What NOT to Put in Cloud LLMs
- Passwords or secret keys
- Financial account numbers
- Medical records
- Confidential business strategies
- Personal secrets you wouldn't share
- Client confidential information
### When Local Is Worth the Effort
- Processing sensitive documents
- Journaling or personal reflection
- Business confidential work
- Medical or legal information
- Anything you wouldn't email
## Cost Comparison

### Cloud Costs

| Service | Free Tier | Paid Tier |
|---|---|---|
| ChatGPT | GPT-3.5 unlimited | $20/month for Plus |
| Claude | ~100 messages/day | $20/month for Pro |
| Gemini | Unlimited basic | $20/month for Advanced |
| API (per token) | Varies | $0.50-30 per million tokens |
- **Typical user:** the free tier is enough
- **Power user:** $20/month
- **Developer building apps:** $10-100/month depending on usage
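For API use, per-token rates make cost easy to estimate. A sketch with illustrative rates ($3 per million input tokens, $15 per million output tokens) and an assumed month of moderate use - real rates vary by model and provider:

```shell
# cost = (input_tokens × input_rate + output_tokens × output_rate) ÷ 1,000,000
awk 'BEGIN {
  in_tok = 200000; out_tok = 50000      # tokens for the month (assumed)
  in_rate = 3.00;  out_rate = 15.00     # $ per million tokens (assumed)
  printf "$%.2f\n", (in_tok * in_rate + out_tok * out_rate) / 1e6
}'
```

That works out to $1.35 for the month - well below the $20 subscriptions, which is why light API use can be the cheapest cloud option.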
### Local Costs

| Component | One-time Cost | Ongoing Cost |
|---|---|---|
| Software | Free (Ollama, LM Studio) | $0 |
| Models | Free (open weights) | $0 |
| Electricity | ~$2-10/month if heavy use | ~$2-10/month |
| Hardware (if upgrading) | $500-2000+ | $0 |
- **If you already have a decent computer:** essentially free
- **If you need to upgrade:** significant upfront cost, but no subscription
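The electricity line is easy to sanity-check: watts drawn × hours used × your utility rate. A sketch assuming a 300 W GPU, two hours of generation a day, and $0.15/kWh (all assumptions - plug in your own numbers):

```shell
# monthly cost = watts × hours/day × 30 days × rate per kWh ÷ 1000
awk 'BEGIN { watts = 300; hours = 2; rate = 0.15;
  printf "$%.2f/month\n", watts * hours * 30 * rate / 1000 }'
```

About $2.70/month at those numbers, consistent with the $2-10 range in the table.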
## The Hybrid Approach (Recommended)

Many people use both:

| Task | Use |
|---|---|
| General questions | Cloud (faster, smarter) |
| Sensitive documents | Local (privacy) |
| Creative writing | Either (test both) |
| Coding help | Cloud (better for now) |
| Offline work | Local (obviously) |
| Learning how LLMs work | Local (full access) |
The best of both worlds:
- Keep a cloud account for powerful models
- Set up a local option for privacy-sensitive tasks
- Use whichever fits each situation
## Quick Decision Guide

### Use Cloud If:
- [ ] You want the smartest AI available
- [ ] You don't want technical setup
- [ ] Privacy isn't a major concern
- [ ] You're always online
- [ ] You want features like web search, image generation
### Use Local If:
- [ ] Privacy is important to you
- [ ] You work with sensitive information
- [ ] You want to avoid subscriptions
- [ ] You're often offline
- [ ] You want to learn how AI works
- [ ] You have decent hardware
### Use Both If:
- [ ] You want flexibility
- [ ] Different tasks have different needs
- [ ] You're a curious person who likes options
## Model Comparison (Honest Assessment)

| Capability | Best Cloud | Best Local | Gap |
|---|---|---|---|
| General chat | GPT-4/Claude | Llama 3.1 70B | Noticeable |
| Simple tasks | Any | Any 7B model | Small |
| Coding | Claude/GPT-4 | DeepSeek Coder | Moderate |
| Creative writing | Claude | Llama 3.1 | Moderate |
| Reasoning | GPT-4/Claude | Llama 3.1 70B | Noticeable |
| Speed | Fast | Depends on hardware | Varies |
**Bottom line:** Cloud models are still noticeably better, but local models are improving rapidly and are "good enough" for many tasks.
## Key Takeaways

- **Cloud is easier, local is more private** - choose based on your needs
- **Free cloud tiers are generous** - most people never need to pay
- **Local is surprisingly good** - especially for 7B+ models
- **You can use both** - many people do
- **Hardware matters for local** - but it's getting better
- **Start with cloud** - add local when you have a reason to
## What's Next?

- **Using cloud?** Just go to claude.ai or ChatGPT and start
- **Trying local?** Download LM Studio and try Llama 3.2
- **Ready to go deep?** Check out the Deep Dive modules based on Karpathy's video
*Original content comparing local and cloud approaches.*