Data Assistant

The Data Assistant is an AI-powered interface that allows you to explore, analyze, and visualize your data using natural language. It works directly on top of your existing DataSpace workspaces and datasets, without moving or exposing data outside your infrastructure.

The assistant is designed to support both precise, metric-driven questions and open-ended, exploratory analysis across one or multiple workspaces.

Linking Workspaces

Before asking questions, you must link at least one workspace.

  • Linked workspaces are managed on the right-hand side of the Data Assistant interface.

  • You can link one or more workspaces, provided you have permission to access them.

  • Only linked workspaces are visible to the assistant and used for analysis.

Linking multiple workspaces enables cross-domain questions, for example questions that combine sales, marketing, and operational data.

Asking Questions

Once at least one workspace is linked, you can start asking questions in natural language.

Questions can be:

Specific and well-defined

Examples:

  • How many products do we have?

  • What was the most bought item in November?

  • How many patients were processed last week?

Open-ended and exploratory

Examples:

  • What trends do we see in customer behavior over the last quarter?

  • Is there a correlation between marketing campaigns and sales in November?

When multiple workspaces are linked, the assistant can reason across them to answer higher-level business questions.

How the Assistant Works

The underlying large language model (LLM) used by the Data Assistant is predefined by an administrator. Users cannot change the model themselves.

When a question is submitted, the assistant follows an iterative analysis process:

  1. Question analysis: The assistant interprets the intent of the question and determines which datasets and metrics may be relevant.

  2. Workspace context discovery: The assistant explores the available datasets in the linked workspaces.

  3. Business logic discovery: If a README.md file is present in a workspace, the assistant reads it and uses it as contextual documentation. This file is the recommended place to document:

    • Business logic

    • Dataset purpose

    • Column semantics

    • Domain-specific rules

    Providing this context helps the assistant choose the correct datasets and columns during analysis.

  4. Iterative data exploration: The assistant runs queries directly against the datasets to extract information. This exploration is a loop of querying and reasoning that can take up to 25 steps; the step limit is defined by the administrator.

  5. Result synthesis: The assistant combines all gathered information into a clear, human-readable answer.
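The iterative process above can be pictured as a bounded loop. The following is a minimal sketch for illustration only, not the Data Assistant's actual implementation: every name in it (run_analysis, plan_next_query, run_query, synthesize, MAX_STEPS) is hypothetical, and the query result is a placeholder rather than real data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_STEPS = 25  # assumed default; in the product the limit is set by the administrator


@dataclass
class Step:
    query: str   # the query that was executed
    result: str  # what the query returned


@dataclass
class Analysis:
    question: str
    context: str  # e.g. dataset listing plus README.md text (steps 2 and 3)
    steps: list[Step] = field(default_factory=list)


def run_analysis(
    question: str,
    context: str,
    plan_next_query: Callable[[Analysis], Optional[str]],
    run_query: Callable[[str], str],
    synthesize: Callable[[Analysis], str],
    max_steps: int = MAX_STEPS,
):
    """Query and reason in a loop until the planner is done or the budget runs out."""
    analysis = Analysis(question, context)
    for _ in range(max_steps):
        query = plan_next_query(analysis)  # reasoning: decide what to look at next
        if query is None:                  # planner has enough information
            break
        analysis.steps.append(Step(query, run_query(query)))  # querying
    return synthesize(analysis)            # step 5: result synthesis


# Hypothetical stand-ins for the model and the query engine.
def demo_planner(analysis: Analysis) -> Optional[str]:
    queries = ["SELECT COUNT(*) FROM products"]
    done = len(analysis.steps)
    return queries[done] if done < len(queries) else None


def demo_run_query(query: str) -> str:
    return "42"  # placeholder value, not real data


def demo_synthesize(analysis: Analysis) -> str:
    results = ", ".join(step.result for step in analysis.steps)
    return f"{len(analysis.steps)} step(s): {results}"


answer = run_analysis(
    "How many products do we have?",
    "README.md: the products table holds one row per product",
    demo_planner,
    demo_run_query,
    demo_synthesize,
)
print(answer)
```

The point of the sketch is the step budget: the loop stops either when no further query is needed or when the limit is reached, and in both cases the answer is synthesized from whatever was gathered.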

Visualizations and Tables

When appropriate, the Data Assistant automatically generates visual output to best convey the results.

Supported outputs include:

  • Pie charts

  • Bar charts

  • Line charts

  • Scatter plots

  • Data tables

Each chart or table is always accompanied by query details, showing the exact query that was executed to produce the result. This ensures transparency, reproducibility, and auditability.
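One way to think about this pairing is as a result record that cannot exist without its query. The sketch below is an illustrative model only, not the product's data format: VisualOutput, SUPPORTED_KINDS, and the field names are hypothetical, and the rows and query are made-up placeholders.

```python
from dataclasses import dataclass

# Output kinds listed above, under assumed short names.
SUPPORTED_KINDS = {"pie", "bar", "line", "scatter", "table"}


@dataclass(frozen=True)
class VisualOutput:
    kind: str    # one of SUPPORTED_KINDS
    rows: tuple  # the data behind the chart or table
    query: str   # the exact query that produced the rows

    def __post_init__(self):
        if self.kind not in SUPPORTED_KINDS:
            raise ValueError(f"unsupported output kind: {self.kind!r}")
        if not self.query.strip():
            # Without the query the result could not be reproduced or audited.
            raise ValueError("query details are required")


# Placeholder rows and query, for illustration only.
chart = VisualOutput(
    kind="bar",
    rows=(("A", 3), ("B", 5)),
    query="SELECT category, COUNT(*) FROM items GROUP BY category",
)
print(chart.kind, "<-", chart.query)
```

Keeping the query on the same record as the data means that anyone reviewing a chart can rerun the query and check the result.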

Privacy and Security

  • All chats are private and visible only to the user who created them.

  • The assistant operates entirely within the boundaries of the linked workspaces.

  • No data is sent to external systems.

On-Premise and Offline Operation

The Data Assistant is designed to work fully offline.

Given the appropriate infrastructure, all LLMs and data processing components can run entirely on-premise, ensuring full data sovereignty.
