StatisticalAnomalyDetection@1

The StatisticalAnomalyDetection@1 node detects anomalies in numeric data streams using one of four statistical methods: Z-Score, IQR, percent change, or moving average.

Adapter Prerequisites

Mesh Adapter

Node Configuration

For fields path, targetPath, targetValueWriteMode, and targetValueKind, see Overview.

transformations:
- type: StatisticalAnomalyDetection@1
  path: $.Items[*]  # Path to array of items to analyze
  targetPath: $.anomalies  # Path where anomaly results will be stored
  resetStatistics: false  # Reset statistics on each run (true = stateless, false = stateful)
  detectors:
    - path: $.Attributes.GrossTotal  # JSONPath to the numeric value to monitor
      groupByPath: $.Attributes.Issuer.Attributes.CompanyName  # Optional: Group statistics by this path
      contextPath: $.Attributes.DocumentNumber  # Optional: Include context in anomaly results
      method: PercentChange  # Detection method: ZScore, Iqr, PercentChange, MovingAverage
      threshold: 50.0  # Threshold for anomaly detection (interpretation depends on method)
      minSamples: 2  # Minimum samples required before detection starts
      maxSamples: 1000  # Maximum samples to keep in memory (0 = unlimited)
      windowSize: 10  # Window size for moving average method

Detection Methods

Z-Score Method

The Z-Score method detects anomalies by measuring how many standard deviations a data point is from the mean of the distribution. It assumes data follows a normal distribution.

How it works:

Calculates the mean (μ) and standard deviation (σ) from collected samples
For each new value, computes Z-Score = |value - μ| / σ
Values exceeding the threshold are flagged as anomalies
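
The steps above can be sketched in a few lines of Python. This is only an illustration of the formula, not the node's implementation: the function name zscore_anomaly is hypothetical, and the use of the population standard deviation and the handling of σ = 0 are assumptions.

```python
import statistics

def zscore_anomaly(samples, value, threshold=3.0):
    """Return (is_anomaly, score) for value against previously collected samples."""
    mean = statistics.fmean(samples)
    # Assumption: population standard deviation; the node may use the sample form.
    std = statistics.pstdev(samples)
    if std == 0:
        # All samples are identical, so the Z-Score is undefined.
        # Assumption: treat this case as "no anomaly".
        return False, 0.0
    score = abs(value - mean) / std
    return score > threshold, score

history = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0, 10.0, 12.0, 9.0, 11.0]
print(zscore_anomaly(history, 10.5))  # value equal to the mean
print(zscore_anomaly(history, 20.0))  # value far above the mean
```

With a history this small the estimate of σ is noisy, which is why a larger minSamples is recommended for this method.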

Parameters:

threshold: Number of standard deviations (e.g., 3.0 for 3σ rule)
- 2.0 = ~95% of normal data (5% outliers expected)
- 3.0 = ~99.7% of normal data (0.3% outliers expected)
- 4.0 = ~99.99% of normal data (very rare outliers only)
minSamples: Recommended minimum 30 for statistical validity
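
The coverage percentages listed for each threshold follow from the normal distribution: the fraction of values within k standard deviations of the mean is erf(k / √2). A short Python check, where normal_coverage is a hypothetical helper name used only for this illustration:

```python
import math

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (2.0, 3.0, 4.0):
    inside = normal_coverage(k)
    # 1 - inside is the share of normal data that would be flagged at this threshold.
    print(f"threshold {k}: {inside:.4%} inside, {1 - inside:.4%} flagged")
```

These figures hold only if the monitored values are close to normally distributed; for skewed data the share of flagged values can be much higher.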

When to use:

Data follows normal/Gaussian distribution
Consistent, stable processes
Manufacturing quality control
Server response time monitoring

IQR (Interquartile Range) Method

The IQR method uses quartiles to detect outliers, making it robust against extreme values and suitable for non-normal distributions.

How it works:

Calculates Q1 (25th percentile) and Q3 (75th percentile)
Computes IQR = Q3 - Q1
Defines bounds: Lower = Q1 - (threshold × IQR), Upper = Q3 + (threshold × IQR)
Values outside these bounds are anomalies
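
As a rough sketch of the bounds calculation above, the following Python uses the standard library's quantile function. The name iqr_anomaly is hypothetical, and the quartile interpolation method and the zero-IQR handling are assumptions; the node may compute quartiles differently.

```python
import statistics

def iqr_anomaly(samples, value, threshold=1.5):
    """Return (is_anomaly, score, (lower, upper)) for value against samples."""
    # Assumption: "inclusive" quartile interpolation; other conventions give
    # slightly different Q1/Q3 on small sample sets.
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - threshold * iqr
    upper = q3 + threshold * iqr
    # Score: distance beyond the nearest bound, expressed in IQR units.
    distance = max(lower - value, value - upper, 0.0)
    # Assumption: report a score of 0.0 when the IQR is zero.
    score = distance / iqr if iqr > 0 else 0.0
    return value < lower or value > upper, score, (lower, upper)

samples = list(range(1, 10))      # 1..9, so Q1 = 3 and Q3 = 7
print(iqr_anomaly(samples, 12))   # inside the bounds [-3, 13]
print(iqr_anomaly(samples, 14))   # one unit above the upper bound
```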

Parameters:

threshold: IQR multiplier
- 1.5 = Standard outlier detection (Tukey's method)
- 3.0 = Extreme outlier detection
- Custom values for domain-specific needs
minSamples: Recommended minimum 10-20 for stable quartiles

When to use:

Skewed or non-normal distributions
Data with natural outliers
Financial data analysis
Customer behavior metrics

Percent Change Method

Detects anomalies based on the percentage change from the previous value, ideal for detecting sudden jumps or drops in sequential data.

How it works:

Calculates: Change = |current_value - last_value| / |last_value| × 100
Flags as anomaly if change exceeds threshold percentage
Simple but effective for trend monitoring
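
The calculation can be illustrated with a small Python function. The name percent_change_anomaly is hypothetical, and skipping the check when the previous value is zero is an assumption made for this sketch, not documented node behavior.

```python
def percent_change_anomaly(last_value, current_value, threshold=50.0):
    """Return (is_anomaly, change_percent) for two consecutive values."""
    if last_value == 0:
        # The formula divides by |last_value|, so the change is undefined here.
        # Assumption: skip the check instead of raising an error.
        return False, None
    change = abs(current_value - last_value) / abs(last_value) * 100
    return change > threshold, change

print(percent_change_anomaly(288, 2880))  # tenfold jump
print(percent_change_anomaly(100, 50))    # value halves
```

A tenfold jump from 288 to 2880 is a 900% change, while a drop from 100 to 50 is a 50% change, which does not exceed a threshold of 50.0.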

Parameters:

threshold: Maximum allowed percentage change
- 10.0 = 10% change triggers anomaly
- 50.0 = 50% change triggers anomaly
- 100.0 = Change of more than 100% triggers anomaly (e.g., a value that more than doubles; halving is only a 50% change)
minSamples: Can work with just 1 previous sample

When to use:

Stock price monitoring
Sales volume tracking
Traffic pattern analysis
Resource utilization monitoring

Note: Not suitable when the previous value can be zero or near zero. The formula divides by |last_value|, so the change is undefined at zero and becomes extremely large near it.

Moving Average Method

Detects anomalies by comparing values against a moving average, effectively identifying deviations from recent trends.

How it works:

Maintains a sliding window of recent values
Calculates the mean of the window (moving average)
Computes deviation: |value - moving_avg| / moving_avg × 100
Flags anomaly if deviation exceeds threshold
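
A minimal Python sketch of the sliding-window logic follows. The class name MovingAverageDetector is hypothetical. Three choices here are assumptions rather than documented behavior: detection starts only once the window is full, every value (anomalous or not) enters the window, and the absolute value of the average is used as the divisor so that negative averages do not produce a negative deviation.

```python
from collections import deque

class MovingAverageDetector:
    def __init__(self, window_size=10, threshold=25.0):
        # deque(maxlen=...) drops the oldest value automatically when full.
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def check(self, value):
        """Return (is_anomaly, deviation_percent); deviation is None until the window is full."""
        is_anomaly, deviation = False, None
        if len(self.window) == self.window.maxlen:
            avg = sum(self.window) / len(self.window)
            if avg != 0:  # avoid division by zero
                deviation = abs(value - avg) / abs(avg) * 100
                is_anomaly = deviation > self.threshold
        # Assumption: the value joins the window even if it was flagged.
        self.window.append(value)
        return is_anomaly, deviation

detector = MovingAverageDetector(window_size=3, threshold=25.0)
for v in (100, 100, 100, 110, 200):
    print(v, detector.check(v))
```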

Parameters:

threshold: Maximum percentage deviation from moving average
- 10.0 = 10% deviation from average
- 25.0 = 25% deviation from average
windowSize: Number of recent values for average calculation
- Smaller (5-10): More responsive to changes
- Larger (20-50): More stable, less sensitive to noise
minSamples: Must be at least equal to windowSize

When to use:

Trending data with seasonal patterns
Network traffic analysis
Temperature monitoring
Business metrics with weekly/daily cycles

Output Format

The node produces an array of anomaly results at the target path. Each anomaly object contains the following fields:

Output Fields

Field	Type	Description
`path`	string	The JSONPath that was monitored (from detector configuration)
`value`	number	The actual numeric value that triggered the anomaly
`isAnomaly`	boolean	Always `true` in output (non-anomalies are not included)
`score`	number	Anomaly severity score. Interpretation varies by method. Z-Score: number of standard deviations from mean; IQR: distance from bounds divided by IQR; PercentChange: actual percentage change; MovingAverage: percentage deviation from average
`method`	string	Detection method used: `"ZScore"`, `"Iqr"`, `"PercentChange"`, or `"MovingAverage"`
`reason`	string	Human-readable explanation of why the anomaly was detected, includes calculated values and thresholds
`context`	any	Additional context data from `contextPath` (if configured). Can be any JSON type depending on source data

Example Output

[
  {
    "path": "$.Attributes.GrossTotal",
    "value": 2880,
    "isAnomaly": true,
    "score": 900.0,
    "method": "PercentChange",
    "reason": "Change: 900.00% (threshold: 50.0%)",
    "context": "Document-5"
  },
  {
    "path": "$.Attributes.Temperature",
    "value": 45.2,
    "isAnomaly": true,
    "score": 3.5,
    "method": "ZScore",
    "reason": "Z-Score: 3.50 (threshold: 3.0)",
    "context": "Sensor-A1"
  },
  {
    "path": "$.Attributes.ResponseTime",
    "value": 1250,
    "isAnomaly": true,
    "score": 2.1,
    "method": "Iqr",
    "reason": "Value outside IQR bounds [150.00, 850.00]",
    "context": "Server-02"
  },
  {
    "path": "$.Attributes.Traffic",
    "value": 15000,
    "isAnomaly": true,
    "score": 35.5,
    "method": "MovingAverage",
    "reason": "Deviation from MA: 35.50% (threshold: 25.0%)",
    "context": "2024-01-15T10:00:00"
  }
]

Interpreting the Score

The score field interpretation depends on the detection method:

Method	Score Interpretation	Typical Anomaly Thresholds
ZScore	Standard deviations from mean	> 2.0 mild, > 3.0 strong, > 4.0 extreme
Iqr	Multiple of IQR distance from bounds	> 0 outlier, > 1.0 strong outlier
PercentChange	Actual percentage change	Depends on domain (e.g., > 50% for prices)
MovingAverage	Percentage deviation from average	> 20% mild, > 50% strong deviation
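
A downstream consumer could map each anomaly to a severity label using the typical thresholds in the table. The following Python is a sketch of such post-processing; the function name severity and the fallback labels "domain-specific" and "below typical thresholds" are hypothetical and not part of the node's output.

```python
def severity(anomaly):
    """Map an anomaly result object to a label based on its method and score."""
    method, score = anomaly["method"], anomaly["score"]
    if method == "ZScore":
        bands = [(4.0, "extreme"), (3.0, "strong"), (2.0, "mild")]
    elif method == "Iqr":
        bands = [(1.0, "strong outlier"), (0.0, "outlier")]
    elif method == "MovingAverage":
        bands = [(50.0, "strong"), (20.0, "mild")]
    else:
        # PercentChange thresholds depend on the domain, so no generic bands apply.
        return "domain-specific"
    # Bands are ordered from highest to lowest limit; the first match wins.
    for limit, label in bands:
        if score > limit:
            return label
    return "below typical thresholds"

print(severity({"method": "ZScore", "score": 3.5}))
print(severity({"method": "MovingAverage", "score": 35.5}))
```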

Output Behavior

Empty array: No anomalies detected in the current batch
Stateful mode (resetStatistics: false): Statistics accumulate across runs, improving accuracy over time
Stateless mode (resetStatistics: true): Each run starts fresh, suitable for independent batches
Grouping: When using groupByPath, separate statistics are maintained for each group
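
To make the interaction between grouping, sample limits, and resets concrete, here is a small Python sketch of a per-group sample store. The class GroupedSamples and its methods are hypothetical, and applying maxSamples to each group separately is an assumption; the node's internal storage is not documented here.

```python
from collections import defaultdict, deque

class GroupedSamples:
    def __init__(self, max_samples=1000):
        # maxSamples of 0 means unlimited; deque(maxlen=None) never evicts.
        maxlen = max_samples if max_samples > 0 else None
        # Assumption: the limit applies to each group independently.
        self._groups = defaultdict(lambda: deque(maxlen=maxlen))

    def add(self, group, value):
        # When the deque is full, the oldest sample is dropped (FIFO).
        self._groups[group].append(value)

    def samples(self, group):
        return list(self._groups[group])

    def reset(self):
        # Corresponds to resetStatistics: true, where each run starts fresh.
        self._groups.clear()

store = GroupedSamples(max_samples=3)
for amount in (1, 2, 3, 4):
    store.add("Company A", amount)
store.add("Company B", 10)
print(store.samples("Company A"))  # oldest sample has been evicted
print(store.samples("Company B"))
```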

Examples

Example 1: Detect Invoice Amount Anomalies by Issuer

transformations:
- type: StatisticalAnomalyDetection@1
  path: $.documents.Items[*]
  targetPath: $.invoiceAnomalies
  resetStatistics: false
  detectors:
    - path: $.Attributes.GrossTotal
      groupByPath: $.Attributes.Issuer.Attributes.CompanyName
      contextPath: $.Attributes.DocumentNumber
      method: PercentChange
      threshold: 50.0
      minSamples: 2

Example 2: Multiple Detector Configuration

transformations:
- type: StatisticalAnomalyDetection@1
  path: $.measurements[*]
  targetPath: $.detectedAnomalies
  detectors:
    # Detect sudden temperature spikes
    - path: $.temperature
      method: ZScore
      threshold: 3.0
      minSamples: 10
    # Detect pressure changes
    - path: $.pressure
      method: PercentChange
      threshold: 20.0
      minSamples: 1
    # Detect flow rate deviations from moving average
    - path: $.flowRate
      method: MovingAverage
      threshold: 15.0
      windowSize: 20
      minSamples: 20

Example 3: Stateless Anomaly Detection

transformations:
- type: StatisticalAnomalyDetection@1
  path: $.sensorData[*]
  targetPath: $.alerts
  resetStatistics: true  # Each run starts with fresh statistics
  detectors:
    - path: $.value
      method: Iqr
      threshold: 1.5
      minSamples: 5
      maxSamples: 100  # Keep only last 100 samples in memory

Example 4: Financial Monitoring with IQR

transformations:
- type: StatisticalAnomalyDetection@1
  path: $.transactions[*]
  targetPath: $.suspiciousTransactions
  resetStatistics: false
  detectors:
    - path: $.amount
      groupByPath: $.accountType  # Separate statistics per account type
      contextPath: $.transactionId
      method: Iqr
      threshold: 3.0  # Detect extreme outliers only
      minSamples: 20
      maxSamples: 500  # Sliding window of 500 transactions

Notes

Stateful vs Stateless: When resetStatistics is false, the node maintains statistics across pipeline runs, allowing it to learn from historical data. When true, each run starts fresh.
GroupBy: The groupByPath parameter allows separate statistical tracking for different groups (e.g., per customer, per sensor).
Memory Management: Use maxSamples to limit memory usage for long-running pipelines. Older samples are removed in FIFO order.
Context Data: The contextPath parameter includes additional information with anomaly results for better debugging and analysis.
Method Selection Guide:
- Use Z-Score for normally distributed data with stable patterns
- Use IQR for skewed data or when robust outlier detection is needed
- Use PercentChange for detecting sudden changes in sequential data
- Use MovingAverage for trending data with expected variations
Minimum Samples: Each method requires different minimum samples for reliable detection. Z-Score needs more samples (30+) for statistical validity, while PercentChange can work with just 1 previous value.
