A SaaS-style helpdesk where every screen has an AI affordance: suggested replies with KB citations, conversation summaries, sentiment + category auto-tagging, RAG chat for end users, and EN ↔ AR translation. Built to demonstrate real LLM wiring — system prompts, streaming, caching, fallback, cost tracking — not just an OpenAI playground demo.
Not a wrapper around a chat widget — Claude is woven into the helpdesk workflow.
Suggested reply, conversation summary, sentiment + category auto-tag, EN↔AR translation. Click "Insert" to push the AI's draft into the composer.
End-users chat with an AI that answers from the knowledge base with clickable citations. "Talk to a human" creates a real ticket in the agent inbox.
Admin-only buttons on each article: Generate FAQ, Suggest improvements, Translate to Arabic. "Find gaps" clusters recent tickets and proposes new articles.
Switch between Haiku 4.5 / Sonnet 4.6 / Opus 4.7 live. Edit the system prompt with a test-with-sample preview. Temperature, max-tokens, prompt caching toggles.
Dashboard shows AI calls, latency, fallback %, and dollar cost. Every call (live or mock) is logged so you can see what the LLM is actually being used for.
API key never touches the browser. Cloudflare Worker proxy at /api/sanad/ai/* holds the secret. Per-IP rate limit. Falls back to mock when key is absent — demo never breaks.