Skip to content

Data Handling Operators

Operators for missing data and cumulative computations.

Missing Data

is_nan(x)

Check for NaN (Not a Number) values.

Text Only
is_nan(x: Numeric) -> Boolean
Text Only
signal example:
  missing = is_nan(prices)
  valid = not(is_nan(prices))
  emit where(valid, prices, 0)

fill_nan(x, value)

Replace NaN values with a constant.

Text Only
fill_nan(x: Numeric, value: Scalar) -> Numeric
Text Only
signal example:
  // Fill missing prices with 0
  filled = fill_nan(prices, 0)

  // Fill missing returns with 0
  returns = ret(prices, 20)
  safe_returns = fill_nan(returns, 0)

  emit safe_returns

coalesce(a, b)

Return first non-NaN value.

Text Only
coalesce(a: Numeric, b: Numeric) -> Numeric
Text Only
signal example:
  // Use primary source, fallback to secondary
  price = coalesce(primary_price, secondary_price)

  // Chain multiple fallbacks
  value = coalesce(estimate_a, coalesce(estimate_b, default_value))

  emit price

Cumulative Operations

cumsum(x)

Cumulative sum.

Text Only
cumsum(x: Numeric) -> Numeric
Text Only
signal example:
  // Cumulative returns
  cum_returns = cumsum(daily_returns)

  // Running total
  total_volume = cumsum(volume)

  emit cum_returns

cumprod(x)

Cumulative product.

Text Only
cumprod(x: Numeric) -> Numeric
Text Only
signal example:
  // Cumulative wealth (starting at 1)
  growth_factors = 1 + daily_returns
  wealth = cumprod(growth_factors)

  emit wealth

cummax(x)

Cumulative maximum (running max).

Text Only
cummax(x: Numeric) -> Numeric
Text Only
signal example:
  // Track all-time high
  all_time_high = cummax(prices)

  // Drawdown from peak
  drawdown = (prices - all_time_high) / all_time_high

  emit drawdown

cummin(x)

Cumulative minimum (running min).

Text Only
cummin(x: Numeric) -> Numeric
Text Only
signal example:
  // Track all-time low
  all_time_low = cummin(prices)

  // Distance from trough
  recovery = (prices - all_time_low) / all_time_low

  emit recovery

Common Patterns

Handle Missing Returns

Text Only
signal safe_returns:
  raw_returns = ret(prices, 20)
  // Fill missing returns with 0 (no change)
  safe = fill_nan(raw_returns, 0)
  emit zscore(safe)

Forward Fill (with Lag)

Text Only
signal forward_fill:
  // Use previous value if current is missing
  filled = where(is_nan(prices), lag(prices, 1), prices)
  emit filled

Multiple Data Sources

Text Only
signal multi_source:
  // Primary source preferred, fallback to secondary
  combined = coalesce(primary_data, secondary_data)

  // Triple fallback
  best_estimate = coalesce(estimate_a, coalesce(estimate_b, estimate_c))

  emit combined

Drawdown Calculation

Text Only
signal drawdown:
  // Peak wealth
  cum_return = cumsum(ret(prices, 1))
  peak = cummax(cum_return)

  // Current drawdown
  dd = cum_return - peak

  // Max drawdown (most negative)
  max_dd = cummin(dd)

  emit dd

Wealth Index

Text Only
signal wealth:
  // Daily returns
  daily_ret = ret(prices, 1)

  // Growth factor (1 + return)
  growth = 1 + daily_ret

  // Cumulative wealth (assumes $1 start)
  wealth = cumprod(growth)

  emit wealth

High Water Mark

Text Only
signal hwm:
  // Cumulative performance
  performance = cumsum(returns)

  // High water mark
  hwm = cummax(performance)

  // Distance from HWM
  below_hwm = performance - hwm

  emit below_hwm

Data Quality Score

Text Only
signal quality:
  // Count non-missing values
  has_price = where(is_nan(prices), 0, 1)
  has_volume = where(is_nan(volume), 0, 1)

  // Quality score
  quality = (has_price + has_volume) / 2

  emit quality

Safe Division

Text Only
signal safe_div:
  // Avoid division by zero
  ratio = where(denominator != 0, numerator / denominator, 0)

  // Or use coalesce
  safe_ratio = coalesce(numerator / denominator, 0)

  emit ratio

Cumulative Indicator

Text Only
signal trend_days:
  // Count consecutive up days
  up = where(ret(prices, 1) > 0, 1, 0)
  cum_up = cumsum(up)

  emit cum_up

NaN Propagation

Most operators propagate NaN:

Text Only
// If any input is NaN, output is NaN
NaN + 1 = NaN
zscore([1, NaN, 3]) = [z1, NaN, z3]
rolling_mean([1, NaN, 3], 2) = [NaN, NaN, NaN]  // Window contains NaN

Handle NaN explicitly:

Text Only
signal robust:
  // Remove NaN before computation
  clean = fill_nan(raw_data, 0)
  result = zscore(clean)
  emit result

Type Behavior

Operator Input Output
is_nan Numeric Boolean
fill_nan Numeric, Scalar Numeric
coalesce Numeric, Numeric Numeric
cumsum Numeric Numeric
cumprod Numeric Numeric
cummax Numeric Numeric
cummin Numeric Numeric

Best Practices

1. Handle Missing Data Early

Text Only
signal clean_first:
  // Clean data at the start
  clean_prices = fill_nan(prices, 0)

  // Then compute
  returns = ret(clean_prices, 20)
  emit zscore(returns)

2. Use Coalesce for Fallbacks

Text Only
signal with_fallback:
  // Prefer primary, use secondary if missing
  price = coalesce(bloomberg_price, yahoo_price)
  emit zscore(ret(price, 20))

3. Check for NaN in Conditions

Text Only
signal safe_condition:
  has_data = not(is_nan(x))
  valid_and_positive = has_data and x > 0
  emit where(valid_and_positive, x, 0)

4. Document Missing Data Handling

Text Only
// Missing prices are forward-filled
// Missing returns are set to 0
// Missing fundamentals are excluded
signal documented:
  ...

Next Steps