Data Assistant
The Data Assistant is an AI-powered interface that allows you to explore, analyze, and visualize your data using natural language. It works directly on top of your existing DataSpace workspaces and datasets, without moving or exposing data outside your infrastructure.
The assistant is designed to support both precise, metric-driven questions and open-ended, exploratory analysis across one or multiple workspaces.
Linking Workspaces
Before asking questions, you must link at least one workspace.
Linked workspaces are managed on the right-hand side of the Data Assistant interface.
You can link one or more workspaces, provided you have permission to access them.
Only linked workspaces are visible to the assistant and used for analysis.
Linking multiple workspaces enables cross-domain questions, such as combining sales, marketing, or operational data.
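The linking rules above can be pictured with a small sketch. It is only an illustration of the described behavior, not the product's API: the `User` and `AssistantSession` classes, their methods, and the workspace names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # Hypothetical: names of the workspaces this user has permission to access.
    accessible: set = field(default_factory=set)


@dataclass
class AssistantSession:
    user: User
    linked: list = field(default_factory=list)

    def link(self, workspace: str) -> None:
        """Link a workspace, refusing any the user cannot access."""
        if workspace not in self.user.accessible:
            raise PermissionError(f"{self.user.name} cannot access {workspace!r}")
        if workspace not in self.linked:
            self.linked.append(workspace)

    def visible_workspaces(self) -> list:
        """Only linked workspaces are visible to the assistant."""
        return list(self.linked)

    def can_ask(self) -> bool:
        """At least one workspace must be linked before asking questions."""
        return bool(self.linked)


user = User("ana", accessible={"sales", "marketing"})
session = AssistantSession(user)
session.link("sales")
session.link("marketing")  # two linked workspaces allow cross-domain questions
```

In this sketch, a workspace the user can access but has not linked stays invisible to the assistant, and linking a workspace without permission fails.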
Asking Questions
Once at least one workspace is linked, you can start asking questions in natural language.
Questions can be:
Specific and well-defined
Examples:
How many products do we have?
What was the most bought item in November?
How many patients were processed last week?
Open-ended and exploratory
Examples:
What trends do we see in customer behavior over the last quarter?
Is there a correlation between marketing campaigns and sales in November?
When multiple workspaces are linked, the assistant can reason across them to answer higher-level business questions.
How the Assistant Works
The underlying large language model (LLM) used by the Data Assistant is predefined by an administrator. Users cannot change the model themselves.
When a question is submitted, the assistant follows an iterative analysis process:
Question analysis: The assistant interprets the intent of the question and determines which datasets and metrics may be relevant.
Workspace context discovery: The assistant explores the available datasets in the linked workspaces.
Business logic discovery: If a README.md file is present in a workspace, it is read and used as contextual documentation. This is the recommended place to document:
Business logic
Dataset purpose
Column semantics
Domain-specific rules
Providing this context helps the assistant choose the correct datasets and columns during analysis.
Iterative data exploration: The assistant runs queries directly against the datasets to extract information. This exploration happens in a loop of querying and reasoning and can take up to 25 steps, as defined by the administrator.
Result synthesis: All gathered information is combined into a clear, human-readable answer.
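The query-and-reason loop with a step limit can be sketched as follows. This is a minimal illustration under stated assumptions, not the assistant's actual implementation: `run_analysis`, `toy_planner`, and the SQLite table are hypothetical stand-ins, with the planner function playing the role of the LLM.

```python
import sqlite3
from typing import Callable, Optional

MAX_STEPS = 25  # step limit taken from the description above


def run_analysis(
    question: str,
    conn: sqlite3.Connection,
    next_query: Callable[[str, list], Optional[str]],
    max_steps: int = MAX_STEPS,
) -> list:
    """Alternate between reasoning (pick a query) and querying (run it)."""
    findings = []
    for _ in range(max_steps):
        sql = next_query(question, findings)  # reasoning step (LLM stand-in)
        if sql is None:  # the planner has enough information
            break
        rows = conn.execute(sql).fetchall()  # querying step
        findings.append((sql, rows))
    return findings


# Toy dataset standing in for a linked workspace (assumption for the example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany("INSERT INTO products VALUES (?)", [("pen",), ("ink",), ("pad",)])


def toy_planner(question, findings):
    # Hypothetical planner: issue one counting query, then stop.
    if not findings:
        return "SELECT COUNT(*) FROM products"
    return None


findings = run_analysis("How many products do we have?", conn, toy_planner)
# Result synthesis: combine what was gathered into a readable answer.
answer = f"You have {findings[0][1][0][0]} products."
print(answer)
```

The loop ends either when the planner signals it has enough information or when the step limit is reached, whichever comes first.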
Visualizations and Tables
When appropriate, the Data Assistant automatically generates visual output to best convey the results.
Supported outputs include:
Pie charts
Bar charts
Line charts
Scatter plots
Data tables
Each chart or table is always accompanied by query details, showing the exact query that was executed to produce the result. This ensures transparency, reproducibility, and auditability.
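One way to picture why attached query details make a result reproducible is the sketch below. It is an assumption-laden illustration: the `TableOutput` class, the `run_with_details` helper, and the `orders` table are hypothetical and do not describe how the Data Assistant stores its outputs.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TableOutput:
    columns: list
    rows: list
    query: str  # the exact query that produced the rows


def run_with_details(conn: sqlite3.Connection, query: str) -> TableOutput:
    """Run a query and keep its text alongside the result."""
    cur = conn.execute(query)
    columns = [d[0] for d in cur.description]
    return TableOutput(columns, cur.fetchall(), query)


# Toy dataset (assumption for the example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [("pen", 2), ("ink", 5), ("pen", 4)]
)

result = run_with_details(
    conn,
    "SELECT item, SUM(qty) AS total FROM orders GROUP BY item ORDER BY total DESC",
)
# Anyone holding the output can re-run the stored query and compare the rows.
rerun = conn.execute(result.query).fetchall()
```

Because the query travels with the output, a reviewer can re-run it against the same data and confirm that the chart or table matches.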
Privacy and Security
All chats are private and visible only to the user who created them.
The assistant operates entirely within the boundaries of the linked workspaces.
No data is sent to external systems.
On-Premise and Offline Operation
The Data Assistant is designed to work fully offline.
Given the appropriate infrastructure, all LLMs and data processing components can run entirely on-premise, ensuring full data sovereignty.