What if learning math felt less like reading a textbook and more like having a private tutor draw everything out on a whiteboard — in real time — while you talk to it?
That's the idea behind Claw Learn: an AI-powered visual math tutor that you speak to and that responds by animating a full multi-scene explanation on a live canvas, narrated in sync. No slides. No pre-recorded videos. No typing required.
The Problem It Solves
Most math explainer tools fall into one of two camps: walls of text with static diagrams, or pre-recorded video walkthroughs you can't interact with. Neither adapts to your question, in your words, at your pace.
Claw Learn takes a different approach. You ask a question — by voice or text — and the app:
- Sends it to an AI to generate a structured visual teaching plan (10+ scenes)
- Renders those scenes live on a 2D canvas with smooth animations
- Narrates each scene in sync via ElevenLabs, streamed over WebRTC
- Lets you interrupt mid-explanation and ask a follow-up — just by speaking
The result is something that genuinely feels like a tutor thinking alongside you.
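The post does not show what a teaching plan looks like, so here is a rough sketch of one possible shape. The `SceneElement`, `Scene`, and `TeachingPlan` types, their field names, and the `validatePlan` helper are all hypothetical; only the "10+ scenes" minimum comes from the list above.

```typescript
// Hypothetical shapes for an AI-generated teaching plan.
// Field names are assumptions for illustration, not the project's actual schema.
interface SceneElement {
  type: string;           // e.g. "tangent", "axes", "graph"
  [key: string]: unknown; // element-specific options
}

interface Scene {
  title: string;
  narration: string;        // text handed to the narration layer
  elements: SceneElement[]; // what the canvas draws for this scene
}

interface TeachingPlan {
  question: string;
  scenes: Scene[];
}

// Returns a list of problems; an empty list means the plan looks usable.
function validatePlan(plan: TeachingPlan, minScenes = 10): string[] {
  const problems: string[] = [];
  if (plan.scenes.length < minScenes) {
    problems.push(`expected at least ${minScenes} scenes, got ${plan.scenes.length}`);
  }
  plan.scenes.forEach((scene, i) => {
    if (scene.narration.trim() === "") problems.push(`scene ${i} has no narration`);
    if (scene.elements.length === 0) problems.push(`scene ${i} draws nothing`);
  });
  return problems;
}
```

Checking the plan before rendering means a malformed model response can be rejected or retried instead of producing a half-empty canvas.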
Voice-First Architecture
The centerpiece is the ElevenLabs Speech Engine integration. Rather than the typical record-then-transcribe loop, Claw Learn uses a persistent WebRTC connection to an ElevenLabs Conversational AI agent. This gives you:
- Sub-100ms voice input with natural interruption support
- Streaming TTS output that plays as each scene begins — not after it renders
- A single connection for both mic input and audio playback
When the Speech Engine isn't configured, the app gracefully degrades: it falls back to the ElevenLabs REST API for TTS, then to the browser's Web Speech API for recognition. The app is fully functional at every tier.
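One way to read that fallback order is as a simple tier selection at startup. The sketch below is an assumption about how that could be wired; `VoiceConfig`, its field names, and `selectVoiceTier` are hypothetical and not taken from the project.

```typescript
// Hypothetical tier names; the project may organize this differently.
type VoiceTier = "speech-engine" | "rest-tts" | "browser-only";

interface VoiceConfig {
  agentId?: string; // assumed: identifies the Conversational AI agent
  apiKey?: string;  // assumed: ElevenLabs key for REST text-to-speech
}

// Picks the richest tier the configuration allows.
function selectVoiceTier(config: VoiceConfig): VoiceTier {
  if (config.agentId) return "speech-engine"; // one WebRTC connection for mic and playback
  if (config.apiKey) return "rest-tts";       // REST TTS for output, browser recognition for input
  return "browser-only";                      // Web Speech API recognition, no ElevenLabs at all
}
```

Deciding the tier once, up front, keeps the rest of the app from having to probe for voice features on every request.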
What the Canvas Can Draw
The custom 2D canvas renderer supports 30+ visual element types, including everything you'd expect a good math teacher to reach for:
- Coordinate axes, function graphs, tangent and secant lines
- Riemann sums, shaded areas, slope fields
- Matrices with highlights, complex planes, parametric curves
- Scatter plots, histograms, bar and line charts
- Physics springs, angle arcs, vectors, 3D isometric axes
The coordinate system is centered with a typical visible range of x ∈ [−6, 6] and y ∈ [−4, 4] — enough room for almost any standard explanation.
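To make that range concrete, here is a minimal sketch of mapping math coordinates to canvas pixels. The `Viewport` type and `toPixel` function are illustrative names, not the renderer's actual API; only the default range comes from the paragraph above.

```typescript
interface Viewport {
  xMin: number;
  xMax: number;
  yMin: number;
  yMax: number;
}

// Default visible range quoted in the post.
const DEFAULT_VIEW: Viewport = { xMin: -6, xMax: 6, yMin: -4, yMax: 4 };

// Maps a math-space point to canvas pixels for a canvas of the given size.
function toPixel(
  x: number,
  y: number,
  width: number,
  height: number,
  view: Viewport = DEFAULT_VIEW
): { px: number; py: number } {
  const px = ((x - view.xMin) / (view.xMax - view.xMin)) * width;
  // Canvas y grows downward, so the vertical axis is flipped.
  const py = ((view.yMax - y) / (view.yMax - view.yMin)) * height;
  return { px, py };
}
```

On a 1200×800 canvas, the origin lands at pixel (600, 400) and the math point (−6, 4) lands at the top-left corner (0, 0).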
// Example scene element — a tangent line to a curve at x = 1
{
  type: "tangent",
  expression: "x^2",
  at: 1,
  label: "slope = 2",
  color: "#f97316"
}
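For a sense of what a renderer has to compute from an element like this, here is a small sketch that finds the tangent line numerically. It assumes the expression has already been turned into a function, and it uses a central-difference estimate, which may not be how the project computes slopes. `Line` and `tangentLine` are hypothetical names.

```typescript
interface Line {
  slope: number;
  intercept: number;
}

// Central-difference estimate of f'(x0), then point-slope form y = slope * x + intercept.
function tangentLine(f: (x: number) => number, x0: number, h = 1e-5): Line {
  const slope = (f(x0 + h) - f(x0 - h)) / (2 * h);
  const intercept = f(x0) - slope * x0;
  return { slope, intercept };
}

// The element above: f(x) = x^2 at x = 1.
const line = tangentLine((x) => x * x, 1);
```

Up to floating-point error, this gives a slope of 2 and an intercept of −1, which matches the "slope = 2" label in the element.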
The math parser is a custom recursive-descent implementation — no eval, no new Function. Safe to run in the browser on untrusted AI-generated expressions.
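To show why that approach is safe, here is a minimal recursive-descent evaluator for expressions in one variable. It is a sketch written for this post, not the project's parser: the grammar, the `evaluate` name, and the supported operators are assumptions, and a real implementation would also need functions such as sin or sqrt.

```typescript
// Grammar assumed for this sketch:
//   expr    := term (("+" | "-") term)*
//   term    := unary (("*" | "/") unary)*
//   unary   := "-" unary | power
//   power   := primary ("^" unary)?
//   primary := number | "x" | "(" expr ")"
function evaluate(source: string, x: number): number {
  const src = source.replace(/\s+/g, ""); // whitespace is stripped up front
  let pos = 0;

  function peek(): string {
    return src.charAt(pos); // "" once the input is exhausted
  }

  function fail(msg: string): never {
    throw new SyntaxError(`${msg} at position ${pos}`);
  }

  function expr(): number {
    let value = term();
    while (peek() === "+" || peek() === "-") {
      const op = src.charAt(pos++);
      const rhs = term();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  function term(): number {
    let value = unary();
    while (peek() === "*" || peek() === "/") {
      const op = src.charAt(pos++);
      const rhs = unary();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  function unary(): number {
    if (peek() === "-") {
      pos++;
      return -unary();
    }
    return power();
  }

  function power(): number {
    const base = primary();
    if (peek() === "^") {
      pos++;
      return Math.pow(base, unary()); // right-associative: 2^3^2 is 2^(3^2)
    }
    return base;
  }

  function primary(): number {
    if (peek() === "(") {
      pos++;
      const value = expr();
      if (peek() !== ")") fail("expected ')'");
      pos++;
      return value;
    }
    if (peek() === "x") {
      pos++;
      return x;
    }
    const start = pos;
    while (/[0-9.]/.test(peek())) pos++;
    const value = Number(src.slice(start, pos));
    if (pos === start || Number.isNaN(value)) fail("expected a number");
    return value;
  }

  const result = expr();
  if (pos < src.length) fail("unexpected trailing input");
  return result;
}
```

Because the evaluator only understands the tokens in its grammar, an input like "alert(1)" is rejected with a SyntaxError instead of being executed.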
Tech Stack
The AI layer is provider-agnostic by design. Point it at Gemini (free via AI Studio), OpenAI, or a local Ollama instance — all using the same OpenAI-compatible format:
# Gemini (default)
OPENAI_API_KEY=your_gemini_key
OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai
OPENAI_MODEL=gemini-2.5-flash
# Or OpenAI
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4o
# Or local Ollama
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_MODEL=llama3.1
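Because every provider speaks the same OpenAI-compatible format, switching between them only changes three values. The sketch below shows how a server-side helper might read those variables and build a chat-completions request. `ProviderConfig`, `loadProviderConfig`, and `buildChatRequest` are hypothetical names, and treating the Gemini values as code-level defaults is an assumption based on the "(default)" label above.

```typescript
interface ProviderConfig {
  apiKey: string;
  baseUrl: string;
  model: string;
}

// Reads the three variables shown above; fallbacks mirror the Gemini block (an assumption).
function loadProviderConfig(env: Record<string, string | undefined>): ProviderConfig {
  return {
    apiKey: env["OPENAI_API_KEY"] ?? "",
    baseUrl: env["OPENAI_BASE_URL"] ?? "https://generativelanguage.googleapis.com/v1beta/openai",
    model: env["OPENAI_MODEL"] ?? "gemini-2.5-flash",
  };
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds a request for the OpenAI-compatible /chat/completions endpoint.
function buildChatRequest(config: ProviderConfig, question: string): ChatRequest {
  const url = `${config.baseUrl.replace(/\/+$/, "")}/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify({
        model: config.model,
        messages: [{ role: "user", content: question }],
      }),
    },
  };
}
```

On the server, the result can be passed straight to `fetch(url, init)`; the same code path then works for Gemini, OpenAI, or a local Ollama instance.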
Getting Started in 3 Steps
# 1. Clone and install
git clone https://github.com/arzumanabbasov/claw-learn.git
cd claw-learn && npm install
# 2. Configure environment
cp .env.local.example .env.local
# Fill in OPENAI_API_KEY, AUTH_SECRET, GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET
# 3. Run
npm run dev
Open http://localhost:3000 in your browser. The only hard requirements are an OpenAI-compatible API key and Google OAuth credentials. Voice and rate limiting are optional enhancements.
Try These Questions
Here are a few prompts that showcase the canvas renderer well:
- "Why does the derivative represent slope?" — tangent lines, secant convergence
- "How does matrix multiplication work?" — matrix grids with cell highlights
- "Explain the Fourier transform visually" — waves, decomposition, frequency plots
- "Show me Euler's formula e^(iπ) + 1 = 0" — complex plane, unit circle, rotation
- "How does gravity create orbits?" — vectors, parametric curves, physics spring
Security Notes
A few design decisions worth highlighting for anyone deploying this:
- All API keys are server-side only — the browser never sees them
- The canvas math parser uses a safe recursive-descent implementation (no eval)
- CORS is locked to ALLOWED_ORIGIN in production
- Standard security headers (X-Frame-Options, X-Content-Type-Options, etc.) are set on every response
- Rate limiting uses Upstash Redis atomic INCR with a TTL to UTC midnight — serverless-safe across all Vercel edge instances
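The rate-limiting idea is easier to see in code. The sketch below computes the TTL to the next UTC midnight and uses an in-memory map as a stand-in for Redis so that it runs on its own. `DailyCounter`, `isAllowed`, the key format, and the limit value are hypothetical. Unlike the Redis version described above, this stand-in is neither atomic nor shared across instances.

```typescript
// Seconds from `now` until the next 00:00:00 UTC, rounded up so a key never expires early.
function secondsUntilUtcMidnight(now: Date): number {
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return Math.ceil((next - now.getTime()) / 1000);
}

// In-memory stand-in for a Redis counter with expiry, for illustration only.
class DailyCounter {
  private counts = new Map<string, { value: number; expiresAt: number }>();

  // Mirrors an increment that sets a TTL to UTC midnight when the key is first created.
  increment(key: string, now: Date): number {
    const entry = this.counts.get(key);
    if (entry && entry.expiresAt > now.getTime()) {
      entry.value += 1;
      return entry.value;
    }
    const expiresAt = now.getTime() + secondsUntilUtcMidnight(now) * 1000;
    this.counts.set(key, { value: 1, expiresAt });
    return 1;
  }
}

// Allows a request while the user's count for the current UTC day is within the limit.
function isAllowed(counter: DailyCounter, userId: string, limit: number, now: Date): boolean {
  return counter.increment(`rate:${userId}`, now) <= limit;
}
```

Tying the expiry to UTC midnight means every user's quota resets at the same moment, regardless of which instance served their first request of the day.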
What's Next
A few areas the project could grow into:
- Persistence — conversation history currently lives in-memory and clears on refresh; a database-backed session store would enable "resume where you left off"
- More element types — probability trees, number lines, geometric proofs
- Export — render a scene plan to a shareable animation or PDF summary
- Multi-language narration — ElevenLabs supports 30+ languages, so the narration pipeline could pass the user's locale through
Wrapping Up
Claw Learn is an interesting demonstration of what becomes possible when you combine a capable LLM for structured planning, a real-time voice interface, and a purpose-built canvas renderer. The pieces aren't individually new — but the way they're wired together produces something that genuinely feels different from existing math tools.
The project is MIT licensed and open for contributions. If you build on it, add a visual element type, or deploy it for a classroom, the maintainers want to hear about it.