The default token pool is fine — until it isn't
When you first install LacPointer, you're on the free tier: 4,000 tokens every 15 hours, shared across all your activity. For casual use that's workable. But if you're using it properly — voice sessions, Wand explanations, Skills firing mid-conversation — you'll hit that ceiling faster than you expect, usually mid-afternoon on a busy day.
The fix is simple: bring your own API key. Once you do, LacPointer routes your requests straight to that provider under your account. Your keys, your rate limits, your costs. No shared pool, no 15-hour reset clock hanging over you.
Where to add keys
Open LacPointer with Option+Space, go into Settings, and find the API Keys section. You'll see a list of slots. On the free plan you get up to 3 key slots; on Pro you get up to 10.
Each slot takes a provider and a key string. Right now LacPointer supports:
- OpenAI — paste your
sk-...key and pick a model (GPT-4o, GPT-4o mini, etc.) - Anthropic — paste your Claude key and pick a model (Claude 3.5 Sonnet, Claude 3 Haiku, etc.)
- Custom / OpenAI-compatible — point it at any endpoint that speaks the OpenAI API format
Once a key is saved, LacPointer uses it for all subsequent conversations. If you've added multiple keys, you can switch the active provider from the same settings panel. No restart needed — it takes effect immediately.
Which model for which job
This is where it gets worth thinking about. Not every task needs the heaviest model, and running everything through GPT-4o when you're asking what the weather is wastes money and adds latency.
Here's roughly how I split it:
Day-to-day chat and quick lookups
For casual questions, Slack message drafts, quick Notion task creation — I use GPT-4o mini or Claude 3 Haiku. Both are fast, cheap, and totally capable for anything that doesn't require multi-step reasoning. The response time is noticeably snappier, which matters when you're firing off quick queries throughout the day.
Code explanations and Wand output
When I hit Caps Lock to use the Wand on a block of code or an error message, I want a model that actually understands context — not just syntax. For this I lean on Claude 3.5 Sonnet. It tends to give cleaner explanations with less noise, and it's good at surfacing the why behind something rather than just restating what the code does.
Voice sessions and Personas
Voice mode (triggered with Cmd+Shift+V) benefits from lower latency above all else. I've found GPT-4o mini strikes the best balance here — responses come back fast enough that the conversation doesn't feel stilted. If you've set up a custom Persona with a specific system prompt, the model still follows it, so you're not giving anything up on the instruction-following side.
Long documents and deep analysis
If I copy a long document to my clipboard and Copy Mode kicks in (that's the Wand feature where LacPointer auto-explains anything you copy), or if I'm asking something that requires holding a lot of context, I'll temporarily switch the active key over to Claude 3.5 Sonnet or GPT-4o. Claude in particular handles long context well without the response quality degrading halfway through.
Managing multiple keys smartly
The 3-slot limit on the free plan is enough if you pick one primary provider and one backup. Most people end up with OpenAI as their main key and a Claude key as an alternative they switch to for specific tasks.
On Pro the 10-slot limit opens up more interesting setups. I keep:
- A GPT-4o mini key as my default — fast and cheap for 80% of tasks
- A Claude 3.5 Sonnet key for when I'm doing focused work and want better reasoning
- A GPT-4o key for the occasional heavy lift
- A custom endpoint pointed at a local Ollama instance for anything I don't want leaving my machine
Switching between them takes about two seconds in Settings. It's not quite per-conversation automatic routing, but being able to swap the active model without restarting anything makes it practical enough.
Your keys stay yours
One thing worth knowing: LacPointer doesn't store your API keys on its servers. They're held locally on your machine. When you make a request, LacPointer sends it directly to the provider using your key. The lacai backend isn't in the middle of that call.
This matters if you're using LacPointer in a work context where you care about where API calls are going. The answer is: straight to OpenAI or Anthropic, just like they would be if you called the API yourself.
The free tier is still there if you need it
Even after you add your own keys, the built-in token pool doesn't disappear. It's still available as a fallback if you hit a rate limit on your own key or just want to try something without burning your quota. You can toggle back to the default provider in Settings at any point.
But honestly, once you've run LacPointer off your own keys for a day, going back to the shared pool feels like borrowing someone's Wi-Fi. Your own key is just faster and less constrained.
Quick tip to get started
If you don't have an Anthropic key yet and you're not sure which provider to start with, go with OpenAI first — the setup is the most straightforward and GPT-4o mini will handle the majority of what you'll do in LacPointer without costing much at all. Add a Claude key later once you know where you want the extra reasoning headroom.
Open LacPointer with Option+Space → Settings → API Keys → Add Key. You'll be running off your own key in under two minutes.