Health Checks
Health checks provide a structured way to define and enforce data quality expectations directly in your transformation code. After each build, DataSpace evaluates these expectations automatically and presents the results in the Health tab, so you can see immediately whether your datasets meet them.
What are Health Checks?
A health check is a declarative rule that validates specific assumptions about your data, including but not limited to:
Columns must not contain null values
String values must not be empty
Values must be unique
Numeric values must fall within a defined range
Rather than implementing imperative validation logic, you explicitly declare the conditions that must hold true. DataSpace then evaluates these conditions automatically once the build has completed.
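To illustrate the declarative style, here is a minimal sketch in plain Python. It does not use the DataSpace API: the `Check` class, the helper functions (`not_null`, `not_empty`, `unique`, `in_range`), and `evaluate` are hypothetical names invented for this example. The point is only that the rules are declared as data and a separate evaluator applies them.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class Check:
    """A declared expectation about one column (hypothetical, not the DataSpace API)."""
    column: str
    name: str
    predicate: Callable[[list[Any]], bool]


def not_null(column: str) -> Check:
    return Check(column, "not_null", lambda values: all(v is not None for v in values))


def not_empty(column: str) -> Check:
    return Check(column, "not_empty", lambda values: all(v != "" for v in values))


def unique(column: str) -> Check:
    return Check(column, "unique", lambda values: len(set(values)) == len(values))


def in_range(column: str, low: float, high: float) -> Check:
    # Assumption: nulls are skipped here and left to a separate not_null check.
    return Check(
        column,
        "in_range",
        lambda values: all(low <= v <= high for v in values if v is not None),
    )


def evaluate(rows: list[Row], checks: list[Check]) -> dict[tuple[str, str], bool]:
    """Apply every declared check to its column and record pass or fail."""
    results = {}
    for check in checks:
        values = [row.get(check.column) for row in rows]
        results[(check.column, check.name)] = check.predicate(values)
    return results


rows = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "", "age": 41},
    {"id": 2, "name": "Grace", "age": 130},
]

# The expectations are declared once, as data; no validation loop is written by hand.
checks = [not_null("id"), unique("id"), not_empty("name"), in_range("age", 0, 120)]

results = evaluate(rows, checks)
failed = [key for key, ok in results.items() if not ok]
```

In this sample, the duplicate `id`, the empty `name`, and the out-of-range `age` each cause one check to fail, while `not_null` on `id` passes.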
How it works
Health checks are defined within a transformation using a declarative API. Checks are associated with individual columns and returned alongside the transformed dataset.
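The shape of such a transformation might look like the following sketch. All names here (`ColumnCheck`, `TransformResult`, `clean_orders`) are assumptions made for illustration; the actual DataSpace API is described in the API documentation and may differ.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ColumnCheck:
    """A check attached to a single column (hypothetical structure)."""
    column: str
    rule: str  # for example "not_null", "unique", "in_range"


@dataclass
class TransformResult:
    """The transformed rows together with the checks declared for them."""
    rows: list[dict[str, Any]]
    checks: list[ColumnCheck] = field(default_factory=list)


def clean_orders(raw_rows: list[dict[str, Any]]) -> TransformResult:
    # Transformation step: normalise the amount to a float with two decimals.
    rows = [
        {"order_id": r["order_id"], "amount": round(float(r["amount"]), 2)}
        for r in raw_rows
    ]
    # Declaration step: the checks travel with the dataset and are not run here.
    checks = [
        ColumnCheck("order_id", "not_null"),
        ColumnCheck("order_id", "unique"),
        ColumnCheck("amount", "in_range"),
    ]
    return TransformResult(rows=rows, checks=checks)


result = clean_orders([{"order_id": "A1", "amount": "19.999"}])
```

Note that the transformation only declares the checks. In this sketch, nothing is validated inside `clean_orders`, which mirrors the idea that evaluation happens after the build.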
Once the build finishes:
All defined checks are executed automatically
Results are aggregated into an overall pass rate
Detailed per-column results are displayed in the Health tab
Warnings and failures are clearly indicated
For a complete list of supported checks and configuration options, refer to the API documentation.
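The post-build aggregation can be pictured with a small sketch. The exact formula DataSpace uses for the pass rate is not stated here, so this example assumes the simplest reading: the fraction of checks that passed. `CheckResult`, `pass_rate`, and `by_column` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one check on one column (hypothetical structure)."""
    column: str
    rule: str
    passed: bool


def pass_rate(results: list[CheckResult]) -> float:
    """Fraction of checks that passed (assumed definition of the pass rate)."""
    if not results:
        return 1.0  # Assumption: with no checks defined, nothing has failed.
    return sum(r.passed for r in results) / len(results)


def by_column(results: list[CheckResult]) -> dict[str, list[CheckResult]]:
    """Group results per column, as a per-column view would need."""
    grouped: dict[str, list[CheckResult]] = defaultdict(list)
    for r in results:
        grouped[r.column].append(r)
    return dict(grouped)


results = [
    CheckResult("order_id", "not_null", True),
    CheckResult("order_id", "unique", True),
    CheckResult("amount", "in_range", False),
    CheckResult("customer", "not_empty", True),
]

rate = pass_rate(results)        # 3 of 4 checks passed
per_column = by_column(results)  # keys: "order_id", "amount", "customer"
```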
Severity levels
Each health check specifies a severity level, which determines how a failed check affects the build outcome:
Warn: The build completes successfully, but the issue is recorded and visible in the Health tab
Fail: The build is marked as failed, even if the transformation code itself executed successfully
Severity levels allow you to differentiate between non-blocking data quality issues and strict guarantees that must be satisfied for a build to be considered successful.
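The rule for combining severities into a build outcome can be sketched as follows. This is an illustration of the behaviour described above, not DataSpace code: `Severity`, `CheckResult`, `build_status`, and `warnings` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARN = "warn"
    FAIL = "fail"


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one check, including its declared severity (hypothetical)."""
    column: str
    rule: str
    passed: bool
    severity: Severity


def build_status(transform_succeeded: bool, results: list[CheckResult]) -> str:
    """A build fails if the code failed or any Fail-severity check did not pass."""
    if not transform_succeeded:
        return "failed"
    if any(not r.passed and r.severity is Severity.FAIL for r in results):
        return "failed"
    return "succeeded"


def warnings(results: list[CheckResult]) -> list[CheckResult]:
    """Failed Warn-severity checks: recorded, but they do not block the build."""
    return [r for r in results if not r.passed and r.severity is Severity.WARN]


only_warn = [
    CheckResult("name", "not_empty", False, Severity.WARN),
    CheckResult("id", "unique", True, Severity.FAIL),
]
with_fail = only_warn + [CheckResult("age", "in_range", False, Severity.FAIL)]

status_warn = build_status(True, only_warn)  # the failed Warn check does not block
status_fail = build_status(True, with_fail)  # the failed Fail check blocks the build
```

A useful consequence of this design is that a check can start at Warn while you learn how the data behaves, then be promoted to Fail once the expectation is a firm guarantee.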