Lite, Standard, and Full use progressively larger quantized models so you can balance speed and quality. Replies stream in, and optional “thinking” text helps you follow reasoning before the final answer.
Example “Explain recursion like I’m new to code” (works offline after the model is downloaded).
On Standard and Full, questions can trigger local embedding + vector search over your indexed notes (sqlite-vec). Lite skips retrieval for speed.
Example “What did I save about the dentist?”
A validated planner can run allow-listed tools: calendar, reminders, alarms, contacts, notes—only with the permissions you grant.
Examples “Remind me in 20 minutes to stretch”; “What’s on my calendar tomorrow?”
When enabled in settings, prompts that match web-style phrases—for example search for, look up, or what is the current …—can trigger a DuckDuckGo-backed fetch—not on every message.
Example “Search for population of Tokyo”
Push-to-talk transcription (Whisper-based path when configured), then the same pipelines as text. Replies can use system text-to-speech.
Example “Add a note to pick up bread” after you release the mic button.
Keyword routing steers reminders, memory lookup, or general chat without an extra router model—saving latency and battery on small devices.
Example “Remind me…” is steered toward reminders early.