Coming Soon

The kit is actively developed, and these features are on the roadmap. They aren’t in the template yet — this page is here so you can see where it’s headed and plan around it.

Roadmap items are subject to change in scope, naming, and order. Nothing here is guaranteed for a specific release. Already-shipped features live under Features.

RAG with Docling

Document Q&A for your app: users upload files, and the AI answers from their contents with citations. Ingestion is handled by Docling, an open-source document-processing engine.

Docling lets the kit handle real-world documents, not just plain text:

Many formats in — PDF, DOCX, PPTX, XLSX, HTML, Markdown, and images, through one pipeline.
Scanned and image-based files — built-in OCR extracts text from scans and photos, so image-only PDFs are searchable too.
Structure preserved — layout analysis, reading order, and table recognition keep tables, headings, and sections intact instead of flattening them to a wall of text.
Retrieval-ready chunking — documents are split into clean, semantically coherent chunks sized for embedding and retrieval.
Processed in your own infrastructure — parsing runs locally rather than shipping documents to a third-party API, which keeps sensitive content in-house.

The result is higher-quality retrieval: the model grounds its answers in what your users actually uploaded, with the document’s structure carried through to the context it sees.
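To make "structure carried through to the context" concrete, here is a minimal sketch of heading-aware chunking. It is not the kit's implementation and does not call Docling's API. It assumes the document has already been converted to Markdown, and the names `Chunk`, `chunk_markdown`, and `max_chars` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    heading_path: tuple[str, ...]  # e.g. ("Guide", "Setup")
    text: str


def chunk_markdown(markdown: str, max_chars: int = 800) -> list[Chunk]:
    """Split Markdown into chunks that never cross a heading boundary."""
    chunks: list[Chunk] = []
    path: list[str] = []    # headings currently in scope, outermost first
    buffer: list[str] = []  # lines collected for the chunk being built

    def flush() -> None:
        text = "\n".join(buffer).strip()
        if text:
            chunks.append(Chunk(tuple(path), text))
        buffer.clear()

    for line in markdown.splitlines():
        if line.startswith("#"):
            # A heading closes the current chunk and updates the path.
            flush()
            level = len(line) - len(line.lstrip("#"))
            title = line.lstrip("#").strip()
            del path[level - 1:]  # drop headings at the same or deeper level
            path.append(title)
        else:
            # Start a new chunk when this one would exceed the size limit.
            size = sum(len(part) + 1 for part in buffer)
            if buffer and size + len(line) > max_chars:
                flush()
            buffer.append(line)
    flush()
    return chunks
```

Each chunk keeps the headings it sat under, so the retriever can pass "Guide > Setup" to the model alongside the text. The sketch ignores edge cases such as `#` lines inside fenced code, which a real document parser would handle.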

Advanced AI memory

Persistent memory that carries across conversations. Instead of starting cold every thread, the assistant remembers durable facts about each user and workspace — stated preferences, recurring context, and details from earlier chats — and recalls the relevant pieces when they matter.

The aim is an assistant that feels continuous: it stops re-asking what it’s already been told, tailors answers to the workspace it’s in, and gets more useful the longer a customer uses it. Memory is workspace-scoped, so what one tenant teaches the assistant never leaks into another.
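One way to picture workspace scoping is a store where every read and write is keyed by workspace ID, so a lookup cannot reach another tenant's facts. The sketch below is illustrative only: `WorkspaceMemory`, `remember`, and `recall` are hypothetical names, and the word-overlap ranking stands in for whatever retrieval the shipped feature ends up using.

```python
from collections import defaultdict


class WorkspaceMemory:
    """In-memory store; every read and write is keyed by workspace ID."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = defaultdict(list)

    def remember(self, workspace_id: str, fact: str) -> None:
        # Skip exact duplicates so repeated statements are stored once.
        if fact not in self._facts[workspace_id]:
            self._facts[workspace_id].append(fact)

    def recall(self, workspace_id: str, query: str, limit: int = 3) -> list[str]:
        """Return this workspace's facts that share at least one word with the query."""
        query_words = set(query.lower().split())
        scored = []
        for fact in self._facts.get(workspace_id, []):
            overlap = len(query_words & set(fact.lower().split()))
            if overlap:
                scored.append((overlap, fact))
        # Highest overlap first; ties keep insertion order (sort is stable).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:limit]]
```

Because `recall` only ever iterates over the list for the given `workspace_id`, isolation is a property of the data layout rather than of a filter that could be forgotten.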

Auto-compact context

Long conversations eventually outgrow the model’s context window. Auto-compaction keeps them going by summarizing older turns into a compact running digest while preserving the active thread — so a chat can run for hundreds of messages without hitting a wall, losing the plot, or forcing the user to start over.
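The compaction step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the kit's code: `Conversation`, `compact`, `budget`, and `keep_recent` are hypothetical names, the token estimate is a rough characters-per-token heuristic, and `summarize` is a caller-supplied function that would normally call a model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Conversation:
    digest: str = ""                                # running summary of older turns
    turns: list[str] = field(default_factory=list)  # recent turns kept verbatim


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); a real tokenizer differs.
    return (len(text) + 3) // 4


def compact(
    convo: Conversation,
    summarize: Callable[[str, list[str]], str],
    budget: int,
    keep_recent: int = 4,
) -> Conversation:
    """Fold older turns into the digest once the conversation exceeds the budget."""
    total = estimate_tokens(convo.digest) + sum(estimate_tokens(t) for t in convo.turns)
    if total <= budget or len(convo.turns) <= keep_recent:
        return convo  # nothing to do yet

    older = convo.turns[:-keep_recent]
    recent = convo.turns[-keep_recent:]
    # The summarizer receives the previous digest so earlier context is not lost.
    return Conversation(digest=summarize(convo.digest, older), turns=recent)
```

Running `compact` before each model call keeps the prompt at roughly the digest plus the last `keep_recent` turns, however long the chat gets.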

Token cost tracker & observability

Per-workspace, per-model cost tracking that turns raw token usage into dollars, plus hooks into observability platforms for tracing and debugging AI calls. See exactly what each tenant and model is costing you, spot runaway usage early, and inspect prompts, tool calls, and latencies when something looks off.
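Turning token usage into dollars is mostly bookkeeping, as the sketch below suggests. The model names and prices are placeholders invented for illustration, not real provider rates, and `CostTracker` is a hypothetical name rather than the kit's API.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Optional

# Placeholder prices in USD per 1M tokens (assumed values, not real rates).
PRICES = {
    "model-small": {"input": Decimal("0.50"), "output": Decimal("1.50")},
    "model-large": {"input": Decimal("5.00"), "output": Decimal("15.00")},
}
MILLION = Decimal(1_000_000)


class CostTracker:
    def __init__(self) -> None:
        # Keyed by (workspace_id, model); Decimal() starts each total at 0.
        self._totals: dict[tuple[str, str], Decimal] = defaultdict(Decimal)

    def record(self, workspace_id: str, model: str,
               input_tokens: int, output_tokens: int) -> Decimal:
        """Price one AI call and add it to the workspace/model total."""
        price = PRICES[model]
        cost = (input_tokens * price["input"]
                + output_tokens * price["output"]) / MILLION
        self._totals[(workspace_id, model)] += cost
        return cost

    def total(self, workspace_id: str, model: Optional[str] = None) -> Decimal:
        """Total spend for a workspace, optionally narrowed to one model."""
        return sum(
            (cost for (ws, m), cost in self._totals.items()
             if ws == workspace_id and model in (None, m)),
            Decimal(0),
        )
```

`Decimal` is used instead of floats so that many small per-call amounts add up without rounding drift.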

Audit trails

A tamper-evident log of who did what and when — sign-ins, role and plan changes, admin actions, and other security-relevant events — for compliance, incident review, and customer trust.
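A common way to make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that general technique. Whether the kit will use it is an assumption, and `append_event`, `verify`, and the entry fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the hash is reproducible.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list[dict], actor: str, action: str, at: str) -> None:
    """Append an entry whose hash covers the event and the previous hash."""
    event = {"actor": actor, "action": action, "at": at}
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify(log: list[dict]) -> bool:
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits only if the most recent hash is also recorded somewhere an attacker cannot rewrite. Otherwise the whole chain could be recomputed after a change.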

Blog

A built-in blog for content marketing and SEO, served from the same app and styled by your admin theme. Publish posts without bolting on a separate CMS.

Landing page builder

A visual builder for marketing and landing pages, so you can ship and iterate on the top of your funnel without editing code for every campaign.