Stream Data Archives

The Stream Data Archives section of the Refinery Studio lets you create and manage the time-series archives that capture data into CrateDB. Each archive defines a CK type, a curated list of attribute paths, and a status — and the studio walks you through every state transition (Activate, Disable, Enable, Retry, Delete) with confirmation dialogs.

For the underlying mechanics see the tech guide: Stream Data Archives.

Prerequisites

  • StreamData must be enabled for the tenant. If you see an empty state with a single "Enable StreamData for this tenant" button, the tenant flag is off — click it to enable (requires write permissions on the tenant).
  • The StreamDataAdmin role is required for every action on this page. Reader / Writer accounts will see the navigation entry only if their role implies admin (AdminPanelManagement).
  • The archives feature is hidden entirely when the instance-level StreamData:Enabled flag is false — that's an operator-level decision, not something the tenant admin can change from the UI.

Accessing the Archives Page

Navigate to Repository → Archives in the tenant menu. The page shows every archive on the active tenant — raw, rollup, and time-range — with a status badge and per-row actions.

Archives List

Column | Meaning
Name | rtWellKnownName of the archive entity.
Type | Raw, Rollup, or Time-Range.
Target CK Type | The CK type the archive captures (Industry.Energy/EnergyMeter, …).
Status | Created / Activated / Disabled / Failed. Hover for the last transition timestamp; on Failed, the underlying error code is also shown.
Columns | Count of captured attribute paths.
Actions | Context-aware: Activate, Disable, Enable, Retry, Delete, Copy ID, View content (Activated only), Edit.

The action menu is status-aware — only legal transitions are enabled:

Current status | Available actions
Created | Edit · Activate · Delete
Activated | View content · Disable · Delete (no Edit; the schema is frozen)
Disabled | Enable · Delete
Failed | Retry Activation · Delete

All state-changing actions (Activate, Disable, Enable, Retry, Delete) open a confirmation dialog before the mutation runs.
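
The status-aware menu can be pictured as a simple lookup from status to legal actions. The sketch below is illustrative only: the `Status` enum, `ALLOWED_ACTIONS`, and `is_action_enabled` are hypothetical names, not part of the studio's API, and the contents are copied from the table above.

```python
from enum import Enum


class Status(Enum):
    CREATED = "Created"
    ACTIVATED = "Activated"
    DISABLED = "Disabled"
    FAILED = "Failed"


# Legal actions per status, copied from the table above (names are illustrative).
ALLOWED_ACTIONS = {
    Status.CREATED: ("Edit", "Activate", "Delete"),
    Status.ACTIVATED: ("View content", "Disable", "Delete"),
    Status.DISABLED: ("Enable", "Delete"),
    Status.FAILED: ("Retry Activation", "Delete"),
}


def is_action_enabled(status: Status, action: str) -> bool:
    """Return True when the action is a legal transition for the given status."""
    return action in ALLOWED_ACTIONS[status]
```

For example, `is_action_enabled(Status.ACTIVATED, "Edit")` is false because the schema is frozen once the archive is activated, while Delete is available in every status.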

Creating an Archive

Three entry points sit in the page toolbar:

  • New Archive — a RawArchive for instant measurements (one row per timestamp).
  • New Rollup — a RollupArchive derived from an existing archive. Picks a source first (raw or rollup-of-rollup) and then the aggregations.
  • New Time-Range — a TimeRangeArchive for externally pre-aggregated data with explicit [from, to) windows (EDA reports, smart meters, weather APIs).

Raw Archive

  1. Click New Archive.
  2. Enter a name (rtWellKnownName) — short and stable; the archive is referenced by rtId from pipelines and queries, but the name is what you'll see in dropdowns.
  3. Pick the Target CK Type. The picker auto-completes from the CK model loaded in the tenant. Inheritance is supported: an archive on a base type captures every concrete derived type with its own table.
  4. Use the Attribute Path Picker to select the columns to capture:
    • Tree view of the CK type's attributes; click into records to expand.
    • Per row, toggle Required (the column becomes NOT NULL) and Indexed (default on; disable only for columns that are read but never filtered).
    • Array marker [*] is shown inline for collection-typed attributes (readings[*].value).
    • At least one column must be selected.
  5. Save. The archive is created in status Created. The CrateDB table is not provisioned yet.
  6. Back on the list, click Activate on the new row. The studio runs activateArchive server-side; on success the badge flips to Activated and the data plane (inserts + queries) is open.

Rollup Archive

  1. Click New Rollup. The source picker lists every existing archive on the tenant — raw or rollup. Chaining rollup-of-rollup is supported.
  2. Pick a source. The next screen pre-fills the source archive metadata in read-only fields:
    • Target CK type (inherited from source)
    • Source columns (available paths for aggregation)
  3. Configure the rollup:
    • Bucket size — the length of each aggregation window. Shown human-readable (e.g. 1 hour, 1 day) alongside the raw millisecond value, with an inline hint explaining that a bucket is the [start, end) window one output row summarises. The bucket must be at least as long as the source's granularity, and an integer multiple of it — e.g. a rollup over a 15-minute source may use 15 min, 30 min, 1 h or 1 d buckets, but not 5 min (finer than the source) or 20 min (not a whole multiple of 15 min). A finer or non-aligned bucket would split single source rows across buckets and produce misleading aggregates, so the rule is enforced at Activate time (see the RollupBucketIntervalException row under Failure Modes). Raw sources that declare no granularity are not constrained.
    • Bucket alignment: FixedSize for engineering data; CalendarDay / Iso8601Week / CalendarMonth / CalendarYear for energy and EDA workflows.
    • Reference time zone — for the calendar alignments, pick an IANA zone (e.g. Europe/Vienna) so the buckets snap to local midnight / week / month / year and stay DST-correct. Leave it empty for UTC boundaries; it is ignored for FixedSize. Alignment and time zone are immutable after activation — to change them, delete and re-create the rollup.
    • Watermark lag — how far behind real time the orchestrator stays before closing a bucket. Shown human-readable with an inline hint; buy yourself some headroom for late-arriving data.
    • Aggregations — one or more { source path, function, target column } rows. Functions: Avg, Min, Max, Sum, Count. Avg is stored as sum + count columns so chained rollups stay numerically correct.
  4. Save. The studio derives TargetCkTypeId (from the source) and Columns (from the aggregations) server-side via createRollupArchive — you never set those by hand.
  5. Activate as for a raw archive. From then on, the rollup orchestrator ticks hourly (configurable), writes one row per closed bucket, and keeps the current in-progress bucket refreshed so partial-period totals (this month / this year so far) stay live.
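
Step 3 notes that Avg is stored as sum + count columns so chained rollups stay numerically correct. The sketch below shows why, using a hypothetical `AvgState` type (not the studio's storage format): merging sums and counts gives the true average, whereas averaging the per-bucket averages does not when buckets hold different numbers of readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvgState:
    total: float  # sum of the values that fell into the bucket
    count: int    # number of values that fell into the bucket

    @property
    def avg(self) -> float:
        return self.total / self.count


def merge(states):
    """Combine finer buckets into one coarser bucket by adding sums and counts."""
    states = list(states)
    return AvgState(sum(s.total for s in states), sum(s.count for s in states))


hour_a = AvgState(total=10.0, count=1)  # one reading of 10
hour_b = AvgState(total=60.0, count=3)  # three readings of 20
day = merge([hour_a, hour_b])

correct = day.avg                       # (10 + 60) / 4
naive = (hour_a.avg + hour_b.avg) / 2   # average of averages, ignores the counts
```

Here `correct` is 17.5 and `naive` is 15.0; the gap grows as the reading counts of the source buckets diverge.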

Note: Backfill / recompute needs an activated rollup

Queuing a backfill or recompute against a rollup that is still Created, Disabled or Failed fails fast with a clear message — activate the rollup first. Disabling or deleting a rollup also cancels any queued recompute work, so no un-processable job is left behind. See Rollups & recompute — Lifecycle interactions.
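
To see what the reference time zone does for a CalendarDay rollup, the following sketch computes the [start, end) bucket that snaps to local midnight. It is an illustration using Python's standard `zoneinfo` module, not the orchestrator's code; `calendar_day_bucket` is a hypothetical helper, and an empty zone is treated as UTC as described in step 3.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def calendar_day_bucket(ts_utc: datetime, zone: str = ""):
    """Return the UTC [start, end) of the local calendar day containing ts_utc."""
    tz = ZoneInfo(zone) if zone else timezone.utc
    local_date = ts_utc.astimezone(tz).date()
    # Build both boundaries as local midnights, then convert to UTC.
    start_local = datetime.combine(local_date, time.min, tzinfo=tz)
    end_local = datetime.combine(local_date + timedelta(days=1), time.min, tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)


# 31 March 2024 is the day Europe/Vienna switches to summer time.
start, end = calendar_day_bucket(
    datetime(2024, 3, 31, 12, 0, tzinfo=timezone.utc), "Europe/Vienna"
)
length = end - start  # a 23-hour day, not 24
```

With a FixedSize alignment the same day would be cut into a fixed 24-hour window, which is why calendar alignments need the zone to stay DST-correct.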

Time-Range Archive

  1. Click New Time-Range.
  2. Enter name, target CK type, and the column list (same picker as for raw archives).
  3. Optionally set a Period (TimeSpan) — advisory only; the studio uses it to pick a default time range when rendering query previews. Not enforced at insert time.
  4. Save and activate.
  5. External producers push rows via SaveTimeRangeStreamDataInArchive@1 (pipeline) or POST /api/v1/streamData/archives/{rtId}/insertTimeRange (REST).
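
As a rough sketch of step 5, an external producer could prepare the REST call like this. Only the endpoint path comes from this page; the JSON body shape (`rows`, `from`, `to`, `values`), the host name, and the helper names are assumptions for illustration, and authentication is omitted. Check the API reference for the actual payload contract before using it.

```python
import json
from datetime import datetime, timezone
from urllib.parse import quote
from urllib.request import Request


def make_row(window_from: datetime, window_to: datetime, values: dict) -> dict:
    """Build one row for a half-open [from, to) window (field names are assumed)."""
    if window_from >= window_to:
        raise ValueError("window must satisfy from < to")
    return {
        "from": window_from.isoformat(),
        "to": window_to.isoformat(),
        "values": values,
    }


def build_insert_request(base_url: str, rt_id: str, rows: list) -> Request:
    """Prepare (but do not send) the POST request for the insertTimeRange endpoint."""
    path = f"/api/v1/streamData/archives/{quote(rt_id, safe='')}/insertTimeRange"
    body = json.dumps({"rows": rows}).encode("utf-8")  # body shape is an assumption
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


row = make_row(
    datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 0, 15, tzinfo=timezone.utc),
    {"energy": 1.25},
)
request = build_insert_request("https://studio.example.invalid", "abc123", [row])
```

The request object is only constructed here; sending it would additionally require a valid token for an account with the required role.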

Editing an Archive

  • While Created, the archive is fully editable — click the name or Edit to reopen the form. Columns, target CK type, name, indexing, required flags — everything can change.
  • Once Activated, the schema is frozen. The form opens read-only with a banner explaining the immutability rule. The only mutable bits at that point are the name and (for rollups) the freeze / rewind controls below.
  • For a breaking schema change, create a new archive with a new name and migrate consumers. There is no in-place schema migration by design.

Viewing Archive Content

To inspect the rows an archive has captured — for debugging a pipeline, spot-checking values, or confirming a backfill — open the content viewer:

  • From the archives list, right-click an Activated archive and choose View content. The action is offered only for Activated archives; the other states have no queryable CrateDB table.
  • From an archive's edit form (raw, time-range, or rollup), click View content in the toolbar.

Both open a read-only grid (Repository → Archives → <archive> → Content) that runs a query selecting all of the archive's columns and pages through the CrateDB table — there is no row limit; the grid loads one page at a time. Rows are shown newest first by default.

Columns

  • Raw archives show a single Timestamp column.
  • Time-range and rollup archives show Window start and Window end — their rows describe a [from, to) window rather than a single instant.
  • A Source column carries the originating entity's name, followed by one column per captured attribute.

Time zone

Dates are shown in your browser's local time by default. Use the Local / UTC toggle in the toolbar to switch all date columns to UTC — handy when correlating with backend logs or CrateDB, which store timestamps in UTC.

Filtering and sorting

  • Filter — click Filter, then Add filter to constrain any column (data columns as well as the time and system fields such as window_start or rtWellKnownName). Each filter is a { field, operator, value } row; the grid refreshes once the filter is complete. Empty or half-filled filter rows are ignored.
  • Sort — click Sort to order by one or more columns (ascending / descending). Both filtering and sorting run server-side across the whole table, not just the page currently loaded.
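
The rule that empty or half-filled filter rows are ignored can be sketched as follows. `FilterRow` and `complete_filters` are hypothetical names; the viewer's real filter model may carry more fields than the { field, operator, value } triple described above.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FilterRow:
    field: Optional[str] = None
    operator: Optional[str] = None
    value: Any = None


def _filled(x: Any) -> bool:
    # 0 and False are legitimate filter values; only None and "" count as empty.
    return x is not None and x != ""


def complete_filters(rows: List[FilterRow]) -> List[FilterRow]:
    """Keep only rows where field, operator, and value are all set."""
    return [r for r in rows if _filled(r.field) and _filled(r.operator) and _filled(r.value)]


rows = [
    FilterRow("window_start", ">=", "2024-01-01T00:00:00Z"),
    FilterRow("activePower", ">", 0),
    FilterRow("rtWellKnownName", "=", ""),  # half-filled: ignored
    FilterRow(),                            # empty: ignored
]
active = complete_filters(rows)
```

Only the first two rows would be applied; the grid would refresh once the third row receives a value.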

The viewer never modifies archive data. For saved, reusable queries — aggregations, grouping, downsampling, or querying across archives — use the Query Builder.

Lifecycle Actions

Activate

  • Created → Activated: provisions the CrateDB table and opens the data plane.
  • Disabled → Activated: re-validates column paths against the current CK model (CK migrations between disable and re-enable can break paths — failure shows up here, not silently).
  • Failed → Activated (Retry): idempotent — re-runs CREATE TABLE IF NOT EXISTS. Safe to retry repeatedly.
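
The reason a retry is safe is that CREATE TABLE IF NOT EXISTS does nothing when the table is already there. The sketch below demonstrates that property with an in-memory SQLite database as a stand-in for CrateDB; the table and column names are made up, and the DDL is not the statement the studio actually issues.

```python
import sqlite3

# Hypothetical DDL; the real archive table layout is defined by the studio.
DDL = """
CREATE TABLE IF NOT EXISTS energy_meter (
    ts INTEGER NOT NULL,
    active_power REAL
)
"""


def activate(conn: sqlite3.Connection) -> None:
    """Provision the table; a no-op when it already exists."""
    conn.execute(DDL)


conn = sqlite3.connect(":memory:")
activate(conn)                                         # first activation creates the table
conn.execute("INSERT INTO energy_meter VALUES (1, 2.5)")
activate(conn)                                         # retry: no error, existing rows untouched
row_count = conn.execute("SELECT COUNT(*) FROM energy_meter").fetchone()[0]
```

After the second call the table still holds its one row, which is the behavior that makes repeated retries harmless.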

Disable

Activated → Disabled. Inserts and queries are rejected; the CrateDB table and its data are preserved. Use Disable to quiesce an archive before a destructive operation or to investigate a downstream issue without losing history.

Delete

Drops the CrateDB table and soft-deletes the archive entity (rtState = Archived). The confirmation dialog reads:

Delete archive '<rtId>'? The CrateDB table will be dropped and historical data lost. This action cannot be undone.

Delete is rejected with a notification if the archive is a RawArchive that has at least one active RollupArchive referencing it. Drop the rollups first, or freeze them and check what depends on them.

Retry Activation

Only available on Failed. The status hover tooltip carries the underlying error code (often a DDL or path-resolution failure). After fixing the root cause — usually a CK-model gap — click Retry.

Computed Columns

A computed column is an extra column whose value is derived by a formula from other columns of the same row — for example powerFactor = activePower / apparentPower. Unlike the ingested columns (which are frozen once the archive is activated), computed columns can be added, re-formulated, and removed on an active archive. The Computed columns panel on the archive edit page manages them.

While a column is being added or its formula changed, the archive keeps serving the previous values; the new values appear atomically once the backfill of the existing rows completes. If a backfill fails, the archive is left exactly as it was.
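
One way to picture this behavior is a staged backfill followed by a single swap. The sketch below is a simplified in-memory model, not the studio's implementation: the `Archive` class and `power_factor` function are hypothetical, and the real backfill runs against CrateDB rather than a Python list.

```python
class Archive:
    def __init__(self, rows):
        self.rows = rows  # the rows readers currently see

    def add_computed_column(self, name, formula):
        """Backfill into a staged copy, then swap it in with one assignment."""
        try:
            staged = [{**row, name: formula(row)} for row in self.rows]
        except Exception:
            return False       # backfill failed: self.rows was never touched
        self.rows = staged     # readers switch from old to new values in one step
        return True


def power_factor(row):
    return row["activePower"] / row["apparentPower"]


ok = Archive([{"activePower": 80.0, "apparentPower": 100.0}])
added = ok.add_computed_column("powerFactor", power_factor)

bad = Archive([{"activePower": 80.0, "apparentPower": 0.0}])
before = bad.rows
failed = bad.add_computed_column("powerFactor", power_factor)  # division by zero
```

In the failing case `bad.rows` is still the original list, mirroring the guarantee that a failed backfill leaves the archive exactly as it was.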

Add a computed column

  1. Open the archive (it must be Activated) and find the Computed columns panel.
  2. Click Add computed column and fill in:
    • Name — the column name, also the identifier other formulas reference (e.g. powerFactor).
    • Formula — an expression over the archive's other columns (e.g. activePower / apparentPower). See the Formula Expressions reference for the supported syntax.
    • Result type: Double, Int, Int64, Boolean, or DateTime. (Text is not supported.)
    • Indexed — leave on unless the column is rarely filtered.
  3. Click Add. The column is backfilled across the existing rows and becomes visible when done.

Change a formula

Click the pencil icon on a row, enter the new formula, and click Apply. Readers keep seeing the old values until the new ones are backfilled and switched in atomically.

A formula change is rejected if another computed column references this one — change or remove the dependent column first. Direct SQL / Grafana queries that use the column also need updating, because a formula change moves the column to a new internal name.

Remove a computed column

Click the trash icon and confirm. Removal is rejected if another computed column still references it.
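
The dependency rule for changing or removing a computed column can be sketched as a lookup over the other formulas. This is an approximation: it assumes column references appear as plain identifiers in the formula text, which may not match the full Formula Expressions syntax, and `dependents_of` and `can_change_or_remove` are hypothetical helpers.

```python
import re

# Assumption: a column reference is a bare identifier inside the formula text.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def dependents_of(column: str, formulas: dict) -> list:
    """Return the computed columns whose formula references `column`."""
    return sorted(
        name
        for name, formula in formulas.items()
        if name != column and column in IDENTIFIER.findall(formula)
    )


def can_change_or_remove(column: str, formulas: dict) -> bool:
    return not dependents_of(column, formulas)


formulas = {
    "powerFactor": "activePower / apparentPower",
    "lowPowerFactor": "powerFactor < 0.8",
}
blocked_by = dependents_of("powerFactor", formulas)
```

Here `powerFactor` is blocked by `lowPowerFactor`, so the dependent column has to be changed or removed first; `lowPowerFactor` itself has no dependents and can be edited freely.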

Rollup-Specific Actions

The rollup edit form exposes four operator controls:

Control | Effect
Freeze | Sets FrozenUntil. The orchestrator stops producing buckets whose bucketEnd falls in the frozen range. Monotonic: a new until must be ≥ the current value.
Unfreeze | Clears FrozenUntil. Idempotent. The Accept gaps toggle is recorded for audit but the gap-detection guard is a follow-up.
Rewind watermark | Resets LastAggregatedBucketEnd to the chosen timestamp (truncated to the bucket boundary). Subsequent ticks re-aggregate the rewound range. Destructive: previously committed rows in that range are temporarily out of sync until the orchestrator catches up.
Recompute | Re-aggregates a chosen [from, to) range (optionally scoped to one entity). Unlike rewind this is non-destructive to readers: an optimistic atomic swap keeps queries on a consistent snapshot throughout, then flips to the new values in one step. The right tool for correcting historical data without taking dashboards out of sync.
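
Two of the rules in the table lend themselves to a short sketch: the monotonic Freeze and the truncation applied by Rewind watermark. The helpers below are hypothetical and assume FixedSize buckets anchored at the Unix epoch with millisecond timestamps; calendar alignments would truncate to local boundaries instead.

```python
from typing import Optional


def set_frozen_until(current: Optional[int], new_until: int) -> int:
    """Freeze is monotonic: the new value must not be earlier than the current one."""
    if current is not None and new_until < current:
        raise ValueError("new until must be >= the current FrozenUntil")
    return new_until


def truncate_to_bucket(ts_ms: int, bucket_ms: int) -> int:
    """Round a timestamp down to the start of its fixed-size bucket."""
    return ts_ms - (ts_ms % bucket_ms)


HOUR_MS = 3_600_000
chosen = 5 * HOUR_MS + 1_234                       # a little after the fifth hour boundary
watermark = truncate_to_bucket(chosen, HOUR_MS)    # lands exactly on the boundary
frozen = set_frozen_until(None, 10 * HOUR_MS)
frozen = set_frozen_until(frozen, 12 * HOUR_MS)    # extending the freeze is allowed
```

Trying to move the freeze backwards, for example `set_frozen_until(frozen, 11 * HOUR_MS)`, raises an error in this sketch.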

Recompute panel

The rollups panel on the archive-detail view shows, per rollup, a Recompute column with the live health: whether a recompute is in progress, how many dirty windows / pending ranges are queued, and the last success and last failure (with reason). The Recompute action opens a from/to range picker; below it, a recompute-job history table lists recent jobs (state, rows, windows, duration, error reason) so you can see why a run failed. These map to the recomputeArchive mutation and the recomputeJobsFor query.

Typical workflows:

  • Correcting historical data without downtime: prefer Recompute over rewind — pick the affected [from, to) range (and an entity scope if only one stream changed) and run it. Readers never see a partial result.
  • Backfilling raw data: Freeze every dependent rollup → run the import → rewind each rollup's watermark to the start of the back-fill range → unfreeze. The orchestrator re-aggregates the affected buckets on its next tick. (Rewind reconciles any prior recompute generations for that range automatically.)
  • Fixing a misconfigured aggregation: There is no in-place "edit aggregations". Delete the rollup, create a new one with the corrected spec, activate.
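
The raw-data backfill workflow above is mostly about ordering. The sketch below records that order with a stand-in client; `RecordingClient`, its method names, and the `freeze_until` parameter are all hypothetical and do not correspond to a documented API. In the studio these steps are performed from the rollup edit form.

```python
class RecordingClient:
    """Stand-in client that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def freeze(self, rollup_id, until):
        self.calls.append(("freeze", rollup_id, until))

    def rewind_watermark(self, rollup_id, to):
        self.calls.append(("rewind", rollup_id, to))

    def unfreeze(self, rollup_id):
        self.calls.append(("unfreeze", rollup_id))


def backfill_raw(client, rollup_ids, range_start, freeze_until, run_import):
    """Freeze every dependent rollup, import, rewind each watermark, then unfreeze."""
    for rollup_id in rollup_ids:
        client.freeze(rollup_id, freeze_until)
    run_import()
    for rollup_id in rollup_ids:
        client.rewind_watermark(rollup_id, range_start)
    for rollup_id in rollup_ids:
        client.unfreeze(rollup_id)


client = RecordingClient()
backfill_raw(
    client,
    ["hourly", "daily"],
    range_start=0,
    freeze_until=86_400_000,
    run_import=lambda: client.calls.append(("import",)),
)
steps = [call[0] for call in client.calls]
```

The recorded order is freeze, freeze, import, rewind, rewind, unfreeze, unfreeze: no rollup is unfrozen before the import has finished and every watermark has been rewound.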

Reading the Status Badge

Badge color (default theme) | Status | What it means
Gray | Created | Not provisioned yet. Inserts and queries are rejected.
Green | Activated | CrateDB table exists, data plane open.
Yellow | Disabled | Table preserved, inserts and queries rejected. Re-enable to resume.
Red | Failed | Activation DDL failed. Hover for the error code. Retry once you've fixed it.

Hover the badge to see the last transition timestamp. On Failed, the tooltip also shows the underlying error code.

Failure Modes

Symptom | Likely cause
Activate returns ArchivePathInvalidException | One of the captured paths cannot be resolved against the current CK model. Edit the archive (if still Created) or fix the CK model.
Activate returns ArchiveColumnTypeUnsupportedException | An attribute's CK type cannot be mapped to a CrateDB column type. Choose a different attribute or restructure the CK type.
Activate returns RollupBucketIntervalException | The rollup's bucket interval is finer than, or not an integer multiple of, the source's granularity (e.g. a 5-min or 20-min bucket over a 15-min source). Delete the rollup and re-create it with a bucket that is ≥ and a whole multiple of the source granularity.
Delete returns RollupSourceInUseException | Active rollups reference this raw archive. Drop or freeze the rollups first.
Retry Activation keeps failing | The error is environmental (Crate unreachable, schema permissions). Check the asset-repo logs.
Archive page is empty, no "Enable" button | Instance-level StreamData:Enabled is false. This is an operator decision (appsettings).
Buttons are disabled even though the role looks right | Token doesn't carry the role claim. Re-log in; check the identity service's API resource configuration.
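
The bucket rule behind RollupBucketIntervalException can be checked before you click Activate. The function below is a sketch of that rule as described on this page, not the server's validation code; `check_bucket` and its messages are hypothetical, and intervals are given in minutes for readability.

```python
from typing import Optional


def check_bucket(bucket_min: int, source_granularity_min: Optional[int]) -> Optional[str]:
    """Return a reason the bucket is invalid, or None when it is acceptable."""
    if source_granularity_min is None:
        return None  # raw sources that declare no granularity are not constrained
    if bucket_min < source_granularity_min:
        return "bucket is finer than the source granularity"
    if bucket_min % source_granularity_min != 0:
        return "bucket is not a whole multiple of the source granularity"
    return None


# Candidate buckets over a 15-minute source: 5 min, 15 min, 20 min, 30 min, 1 h, 1 d.
results = {bucket: check_bucket(bucket, 15) for bucket in (5, 15, 20, 30, 60, 1440)}
```

Over a 15-minute source, 15 min, 30 min, 1 h, and 1 d pass, while 5 min and 20 min are reported as invalid.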

Where the Data Lives

Archive metadata (the Archive runtime entities, status, columns, rollup state) lives in the tenant's MongoDB database. Time-series rows live in CrateDB, in a schema named after the tenant. Deleting a tenant drops both. mongodump --db=<tenant> captures the metadata but not the CrateDB rows — for time-series backups use CrateDB's own snapshot tools.

See Also