
Time-Series Features in HeliosDB


Status: Production-Ready
Version: 6.0
Last Updated: January 4, 2026


Overview

HeliosDB provides time-series data management for IoT, observability, financial analytics, and log-processing workloads. Per the figures in the Performance Targets table, the time-series engine achieves time-range query latency under 5 ms, compression ratios of 10-15x on favorable data, and ingestion throughput of 500K+ points per second against a target of 1M points per second.

Key Capabilities

| Feature | Description | Performance |
| --- | --- | --- |
| Native Time-Series Storage | Columnar storage optimized for time-ordered data | Zero-copy batch operations |
| Gorilla Compression | Facebook’s industry-standard compression algorithm | 10-15x compression ratio |
| Time-Based Partitioning | Hourly/daily/weekly/monthly/yearly partitions | Partition pruning for queries |
| Retention Policies | Automatic data expiration with TTL and size limits | Background cleanup |
| Downsampling | Multi-tier aggregation with configurable intervals | Preserves statistical properties |
| Continuous Aggregates | Pre-computed rollups for fast analytics | Real-time materialization |
| Window Functions | Tumbling, sliding, and session windows | Time-based analysis |
| Gap Filling | Interpolation strategies for missing data | Forward/backward/linear fill |

Architecture

HeliosDB Time-Series Architecture
┌──────────────────────────────────────────────────────────────────────┐
│                        Time-Series API Layer                         │
│    write_point() | query_range() | downsample() | set_retention()    │
├──────────────────────────────────────────────────────────────────────┤
│                      High-Performance Ingestion                      │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────────┐  │
│  │  Batching  │  │Out-of-Order│  │  Backfill  │  │  Compression   │  │
│  │   Buffer   │  │  Handler   │  │  Support   │  │    Pipeline    │  │
│  └────────────┘  └────────────┘  └────────────┘  └────────────────┘  │
├──────────────────────────────────────────────────────────────────────┤
│                          Query Engine Layer                          │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────────┐  │
│  │ Time Range │  │   Window   │  │ Time-Based │  │     Result     │  │
│  │  Queries   │  │ Functions  │  │   Joins    │  │    Caching     │  │
│  └────────────┘  └────────────┘  └────────────┘  └────────────────┘  │
├──────────────────────────────────────────────────────────────────────┤
│                        Data Management Layer                         │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────────┐  │
│  │ Retention  │  │Downsampling│  │ Partition  │  │     Tiered     │  │
│  │   Engine   │  │   Engine   │  │  Manager   │  │    Storage     │  │
│  └────────────┘  └────────────┘  └────────────┘  └────────────────┘  │
├──────────────────────────────────────────────────────────────────────┤
│                          Compression Layer                           │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                       Gorilla Compressor                       │  │
│  │     Delta-of-Delta (Timestamps) | XOR Bitpacking (Values)      │  │
│  │             Dictionary Compression (Metrics/Tags)              │  │
│  └────────────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────────────┤
│                          LSM Storage Engine                          │
└──────────────────────────────────────────────────────────────────────┘

Native Time-Series Storage Engine

HeliosDB’s time-series storage is built on a columnar architecture that separates timestamps, values, and metadata for optimal compression and query performance.

Data Model

use std::collections::HashMap;

/// Time-series data point with timestamp and value
pub struct TimeSeriesPoint {
    /// Metric name or series identifier
    pub metric: String,
    /// Unix timestamp in milliseconds
    pub timestamp: u64,
    /// Data value
    pub value: f64,
    /// Optional tags for multi-dimensional querying
    pub tags: HashMap<String, String>,
}

Storage Key Format

metric:timestamp:tags_hash

This key format enables:

  • Efficient range scans by metric and time
  • Tag-based filtering with hash lookups
  • Partition-aware query routing
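
To make the key layout concrete, here is a minimal sketch of how a `metric:timestamp:tags_hash` key could be assembled. It is not HeliosDB's actual encoding: the function names `tags_hash` and `storage_key`, the zero-padded decimal timestamp, and the choice of FNV-1a as the hash are all assumptions made for illustration. Zero-padding is used so that lexicographic key order matches time order within a metric, which is what makes range scans cheap.

```rust
use std::collections::HashMap;
use std::iter::once;

/// Hypothetical tag hash: FNV-1a (64-bit) over sorted `key=value,` pairs,
/// so the result does not depend on HashMap iteration order.
fn tags_hash(tags: &HashMap<String, String>) -> u64 {
    let mut pairs: Vec<(&String, &String)> = tags.iter().collect();
    pairs.sort();
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for (key, value) in pairs {
        let bytes = key.bytes().chain(once(b'=')).chain(value.bytes()).chain(once(b','));
        for byte in bytes {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV-1a 64-bit prime
        }
    }
    hash
}

/// Hypothetical key builder following the `metric:timestamp:tags_hash` layout.
/// The timestamp is padded to 20 digits (the width of u64::MAX) so that
/// string ordering of keys agrees with numeric ordering of timestamps.
fn storage_key(metric: &str, timestamp_ms: u64, tags: &HashMap<String, String>) -> String {
    format!("{}:{:020}:{:016x}", metric, timestamp_ms, tags_hash(tags))
}
```

With this layout, all keys for one metric share a prefix and sort by time, so a range scan from `metric:<start>` to `metric:<end>` touches only the relevant keys.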

Time-Bucketed Aggregations

HeliosDB supports automatic time-bucketing for analytical queries with multiple aggregation functions.

Aggregation Functions

| Function | Description |
| --- | --- |
| Average | Mean of all values in bucket |
| Min | Minimum value |
| Max | Maximum value |
| Sum | Sum of all values |
| Count | Number of data points |
| First | First value in bucket |
| Last | Last value in bucket |
| StdDev | Standard deviation |
| Percentile(n) | Nth percentile |
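
As a rough illustration of what each function computes over the values of a single bucket, the sketch below implements them on a plain slice. The `Aggregation` enum and `aggregate` function are stand-ins written for this example, not the HeliosDB types. Two details are assumptions, since the table does not specify them: `StdDev` is computed as the population standard deviation, and `Percentile` uses the nearest-rank method.

```rust
/// Illustrative stand-in for the aggregation functions listed above.
#[derive(Debug, Clone, Copy)]
enum Aggregation {
    Average,
    Min,
    Max,
    Sum,
    Count,
    First,
    Last,
    StdDev,
    Percentile(f64),
}

/// Aggregate the values of one bucket (in arrival order).
/// Returns None for an empty bucket.
fn aggregate(values: &[f64], agg: Aggregation) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let n = values.len() as f64;
    let sum: f64 = values.iter().sum();
    Some(match agg {
        Aggregation::Average => sum / n,
        Aggregation::Min => values.iter().copied().fold(f64::INFINITY, f64::min),
        Aggregation::Max => values.iter().copied().fold(f64::NEG_INFINITY, f64::max),
        Aggregation::Sum => sum,
        Aggregation::Count => n,
        Aggregation::First => values[0],
        Aggregation::Last => values[values.len() - 1],
        Aggregation::StdDev => {
            // Assumption: population standard deviation (divide by n).
            let mean = sum / n;
            let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
            variance.sqrt()
        }
        Aggregation::Percentile(p) => {
            // Assumption: nearest-rank percentile on a sorted copy.
            let mut sorted = values.to_vec();
            sorted.sort_by(|a, b| a.total_cmp(b));
            let rank = ((p / 100.0) * n).ceil().max(1.0) as usize;
            sorted[rank.min(sorted.len()) - 1]
        }
    })
}
```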

Time Intervals

pub enum TimeInterval {
    Second(i64),
    Minute(i64),
    Hour(i64),
    Day(i64),
    Week(i64),
    Month(i64),
    Year(i64),
}

Example: Time Bucket Query

-- Aggregate sensor readings by 5-minute buckets
SELECT
    time_bucket('5 minutes', timestamp) AS bucket,
    sensor_id,
    AVG(temperature) AS avg_temp,
    MAX(temperature) AS max_temp,
    MIN(temperature) AS min_temp
FROM sensor_readings
WHERE timestamp BETWEEN '2025-01-01' AND '2025-01-02'
GROUP BY bucket, sensor_id
ORDER BY bucket;
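
Conceptually, a time bucket is the timestamp rounded down to a multiple of the bucket width, and the query groups rows by that value. The following sketch shows the idea on `(timestamp_ms, value)` pairs. The helpers `time_bucket` and `bucket_averages` are hypothetical names for this example and assume fixed-width buckets aligned to the Unix epoch. How HeliosDB aligns buckets internally is not stated in this document.

```rust
use std::collections::BTreeMap;

/// Round a millisecond timestamp down to the start of its bucket.
/// Assumes epoch-aligned, fixed-width buckets; `width_ms` must be non-zero.
fn time_bucket(timestamp_ms: u64, width_ms: u64) -> u64 {
    timestamp_ms - timestamp_ms % width_ms
}

/// Group points by bucket and return (bucket_start, average) in time order.
fn bucket_averages(points: &[(u64, f64)], width_ms: u64) -> Vec<(u64, f64)> {
    // BTreeMap keeps buckets sorted by start time, like ORDER BY bucket.
    let mut acc: BTreeMap<u64, (f64, u64)> = BTreeMap::new();
    for &(ts, value) in points {
        let entry = acc.entry(time_bucket(ts, width_ms)).or_insert((0.0, 0));
        entry.0 += value; // running sum
        entry.1 += 1; // running count
    }
    acc.into_iter()
        .map(|(bucket, (sum, count))| (bucket, sum / count as f64))
        .collect()
}
```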

Downsampling and Retention Policies

Multi-Tier Downsampling

Configure cascading downsampling tiers to reduce storage while preserving analytical value:

let config = DownsamplingConfig::new(Duration::from_secs(60)) // 1-minute primary
    .with_aggregation(AggregationFunction::Average)
    .add_tier(
        DownsamplingTier::new(Duration::from_secs(300), AggregationFunction::Average)
            .with_age_threshold(Duration::from_secs(3600)) // After 1 hour
    )
    .add_tier(
        DownsamplingTier::new(Duration::from_secs(3600), AggregationFunction::Average)
            .with_age_threshold(Duration::from_secs(86400)) // After 1 day
    );
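
One way to read the configuration above: data younger than one hour stays at 1-minute resolution, data older than one hour is rolled up to 5 minutes, and data older than one day is rolled up to 1 hour. The sketch below expresses that selection rule with plain `std` types. The `Tier` struct and `resolution_for_age` function are hypothetical and only model the cascading logic, under the assumption that a tier applies once the data's age reaches its threshold and that the coarsest applicable tier wins.

```rust
use std::time::Duration;

/// Hypothetical, simplified view of one downsampling tier.
struct Tier {
    interval: Duration,
    age_threshold: Duration,
}

/// Pick the resolution that applies to data of the given age:
/// the coarsest tier whose age threshold has been reached,
/// or the primary interval if no tier applies yet.
fn resolution_for_age(primary: Duration, tiers: &[Tier], age: Duration) -> Duration {
    tiers
        .iter()
        .filter(|tier| age >= tier.age_threshold)
        .map(|tier| tier.interval)
        .max()
        .unwrap_or(primary)
}
```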

Retention Policies

// Time-based retention (30 days)
let policy = RetentionPolicy::new(Duration::from_secs(30 * 24 * 3600))
    .with_cleanup_interval(3600) // Check every hour
    .with_max_size(100_000_000_000); // Optional 100GB limit

// Per-metric retention
engine.set_metric_policy(
    "metrics.high_frequency.*",
    RetentionPolicy::new(Duration::from_secs(7 * 24 * 3600)), // 7 days
);
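
The per-metric policy above implies two steps during cleanup: resolve which TTL applies to a metric, then compare each point's age against it. A minimal sketch of that logic follows. The functions `pattern_matches`, `ttl_for`, and `is_expired` are hypothetical, and the pattern syntax is assumed to support only a single trailing `*`. The actual matching rules of `set_metric_policy` are not documented here.

```rust
use std::time::Duration;

/// Assumed pattern syntax: an exact name, or a prefix followed by a trailing `*`.
fn pattern_matches(pattern: &str, metric: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => metric.starts_with(prefix),
        None => pattern == metric,
    }
}

/// Resolve the TTL for a metric: the first matching per-metric override wins,
/// otherwise the default policy applies.
fn ttl_for(metric: &str, overrides: &[(&str, Duration)], default_ttl: Duration) -> Duration {
    overrides
        .iter()
        .find(|(pattern, _)| pattern_matches(pattern, metric))
        .map(|(_, ttl)| *ttl)
        .unwrap_or(default_ttl)
}

/// A point is expired once its age strictly exceeds the TTL.
fn is_expired(timestamp_ms: u64, now_ms: u64, ttl: Duration) -> bool {
    u128::from(now_ms.saturating_sub(timestamp_ms)) > ttl.as_millis()
}
```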

Continuous Aggregates

Pre-compute rollups for commonly accessed time ranges:

// Configure continuous aggregate for CPU metrics
let aggregate_config = ContinuousAggregateConfig {
    source_metric: "cpu.usage",
    target_metric: "cpu.usage.1h_avg",
    interval: Duration::from_secs(3600),
    aggregation: AggregationFunction::Average,
    lag: Duration::from_secs(60), // 1-minute lag for late data
};

Benefits

  • Query Performance: Pre-computed results eliminate runtime aggregation
  • Storage Efficiency: Aggregated data is smaller than raw data
  • Real-Time Updates: Continuous background processing
  • Late Data Handling: Configurable lag for out-of-order points
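
The role of the lag setting can be sketched as a simple rule: a bucket is only materialized once its end time plus the lag has passed, so points arriving slightly late still land in the rollup. The function `finalizable_buckets` below is a hypothetical helper that encodes this rule. It is an assumption about how the lag is applied, not a description of the engine's scheduler.

```rust
/// Return the bucket start times that are safe to materialize at `now_ms`.
/// A bucket covering [start, start + interval) is considered final once
/// `start + interval + lag <= now`, leaving `lag` for late-arriving points.
fn finalizable_buckets(
    bucket_starts_ms: &[u64],
    interval_ms: u64,
    lag_ms: u64,
    now_ms: u64,
) -> Vec<u64> {
    bucket_starts_ms
        .iter()
        .copied()
        .filter(|start| start.saturating_add(interval_ms).saturating_add(lag_ms) <= now_ms)
        .collect()
}
```

With a 1-hour interval and a 1-minute lag as configured above, the bucket starting at 00:00 would be materialized at 01:01 rather than 01:00.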

Time-Based Partitioning

Partition Strategies

| Strategy | Partition ID Format | Use Case |
| --- | --- | --- |
| Hourly | YYYYMMDDHH | High-frequency data |
| Daily | YYYYMMDD | Standard metrics |
| Weekly | YYYYWW | Low-frequency data |
| Monthly | YYYYMM | Long-term storage |
| Yearly | YYYY | Historical archives |
| Custom(secs) | timestamp/interval | Flexible partitioning |
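
To show how a partition ID in these formats could be derived from a millisecond UTC timestamp, here is a self-contained sketch using only the standard library. The `PartitionStrategy` enum is a local stand-in for illustration (Weekly is left out because the week-numbering scheme is not specified in this document), and `partition_id` and `civil_from_days` are hypothetical helpers. The date conversion uses the well-known days-to-civil-date algorithm for the proleptic Gregorian calendar.

```rust
/// Local stand-in for the strategies in the table (Weekly omitted).
enum PartitionStrategy {
    Hourly,
    Daily,
    Monthly,
    Yearly,
    Custom(u64), // interval in seconds, must be non-zero
}

/// Convert days since 1970-01-01 to (year, month, day), UTC.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycle index
    let doe = z.rem_euclid(146_097); // day within the cycle
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year within the cycle
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Build the partition ID for a Unix timestamp in milliseconds.
fn partition_id(strategy: &PartitionStrategy, timestamp_ms: u64) -> String {
    let secs = timestamp_ms / 1_000;
    let (year, month, day) = civil_from_days((secs / 86_400) as i64);
    let hour = (secs % 86_400) / 3_600;
    match strategy {
        PartitionStrategy::Hourly => format!("{:04}{:02}{:02}{:02}", year, month, day, hour),
        PartitionStrategy::Daily => format!("{:04}{:02}{:02}", year, month, day),
        PartitionStrategy::Monthly => format!("{:04}{:02}", year, month),
        PartitionStrategy::Yearly => format!("{:04}", year),
        // Custom: integer division of the timestamp (in seconds) by the interval.
        PartitionStrategy::Custom(interval_secs) => (secs / interval_secs).to_string(),
    }
}
```

Because the ID is a pure function of the timestamp, a range query can compute the IDs for its start and end times and skip every partition outside that span.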

Partition Management

let manager = PartitionManager::new(
    "/data/partitions",
    PartitionStrategy::Daily,
).await?;

// Automatic partition creation for new data
let partition = manager.get_or_create_partition(timestamp).await?;

// Query partition pruning
let partitions = manager.get_partitions_for_range(start_time, end_time).await?;

// Archive old partitions
manager.archive_partition(partition_id).await?;

Compression Performance

Gorilla Algorithm Results

| Data Type | Compression Ratio | Throughput |
| --- | --- | --- |
| IoT Temperature | 8-12x | 500K+ pts/sec |
| Network Metrics | 5-8x | 500K+ pts/sec |
| CPU/System Metrics | 5-10x | 500K+ pts/sec |
| High-Frequency Trading | 10-15x | 500K+ pts/sec |

Compression Pipeline

  1. Delta-of-Delta Encoding (Timestamps)
    • Regular intervals compress to 1-4 bits per timestamp
    • 16-64x compression for uniformly sampled data
  2. XOR + Bit-packing (Values)
    • Exploits temporal correlation in values
    • 4-20x compression for slowly changing values
  3. Dictionary Compression (Metrics/Tags)
    • String to u32 ID mapping
    • 10-20x reduction for metric names
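
The first two stages can be illustrated without any bit-packing. The sketch below computes only the intermediate values that a Gorilla-style encoder would then write with variable-length bit codes: the delta-of-delta of consecutive timestamps, and the XOR of consecutive values' IEEE 754 bit patterns. The function names are hypothetical and the bit-level encoding itself is deliberately left out, so this is not the `GorillaCompressor` implementation.

```rust
/// Delta-of-delta for a timestamp series.
/// The first output is the first delta itself (previous delta taken as 0);
/// each later output is the change in delta relative to the previous one.
fn delta_of_delta(timestamps: &[u64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(timestamps.len().saturating_sub(1));
    let mut prev_delta: i64 = 0;
    for pair in timestamps.windows(2) {
        let delta = pair[1] as i64 - pair[0] as i64;
        out.push(delta - prev_delta);
        prev_delta = delta;
    }
    out
}

/// XOR of each value's bit pattern with its predecessor's.
/// Identical values give 0; similar values give a word with long runs of
/// leading and trailing zero bits, which is what bit-packing exploits.
fn xor_with_previous(values: &[f64]) -> Vec<u64> {
    values
        .windows(2)
        .map(|pair| pair[0].to_bits() ^ pair[1].to_bits())
        .collect()
}
```

For timestamps sampled at a fixed interval, every delta-of-delta after the first is 0, which is why regular series need only a bit or so per timestamp. For values, the number of meaningful bits in each XOR result is `64 - leading_zeros - trailing_zeros`, and a repeated value needs only a single control bit.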

Window Functions

Window Types

pub enum WindowType {
    /// Fixed-size, non-overlapping windows
    Tumbling { size: Duration },
    /// Fixed-size, overlapping windows
    Sliding { size: Duration, slide: Duration },
    /// Dynamic windows based on inactivity gap
    Session { gap: Duration },
}

Window Query Example

let engine = TimeSeriesQueryEngine::new();

// Execute windowed aggregation
let results = engine.execute_windowed_query(
    &points,
    WindowType::Tumbling { size: Duration::from_secs(300) },
    AggregationFunction::Average,
)?;

// Session windows for user activity
let sessions = engine.execute_windowed_query(
    &user_events,
    WindowType::Session { gap: Duration::from_secs(1800) }, // 30-min gap
    AggregationFunction::Count,
)?;
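
Session windows differ from tumbling and sliding windows in that their boundaries come from the data: a new session starts whenever the gap since the previous event exceeds the configured inactivity gap. The sketch below shows that rule with a `Count` aggregation on a sorted list of event timestamps. The function `session_windows` is a hypothetical helper and does not reflect how `execute_windowed_query` is implemented.

```rust
use std::time::Duration;

/// Split ascending event timestamps (ms) into sessions.
/// Returns (session_start, session_end, event_count) for each session.
/// An event joins the current session if it arrives within `gap` of the
/// session's last event; otherwise it opens a new session.
fn session_windows(timestamps_ms: &[u64], gap: Duration) -> Vec<(u64, u64, usize)> {
    let gap_ms = gap.as_millis() as u64;
    let mut sessions: Vec<(u64, u64, usize)> = Vec::new();
    for &ts in timestamps_ms {
        if let Some(last) = sessions.last_mut() {
            if ts.saturating_sub(last.1) <= gap_ms {
                last.1 = ts; // extend the session to this event
                last.2 += 1; // count the event
                continue;
            }
        }
        sessions.push((ts, ts, 1));
    }
    sessions
}
```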

Gap Filling and Interpolation

Fill Strategies

| Strategy | Description |
| --- | --- |
| Null | Leave gaps as null/None |
| Zero | Fill with zero values |
| Forward | Use previous known value |
| Backward | Use next known value |
| Linear | Linear interpolation between points |

Example

let filled = TimeSeriesOps::fill_missing(
    &ts,
    TimeInterval::Minute(1),
    FillMethod::Linear,
)?;
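
For the Linear strategy, each missing grid point between two known samples is placed on the straight line connecting them. The sketch below shows the arithmetic on `(timestamp_ms, value)` pairs. The function `fill_linear` is a hypothetical stand-in, not the body of `TimeSeriesOps::fill_missing`, and it assumes the input is sorted by timestamp and that the grid is anchored at each known sample.

```rust
/// Insert linearly interpolated points every `step_ms` between known samples.
/// Input must be sorted by timestamp; `step_ms` must be non-zero.
fn fill_linear(points: &[(u64, f64)], step_ms: u64) -> Vec<(u64, f64)> {
    assert!(step_ms > 0, "step_ms must be non-zero");
    let mut out = Vec::new();
    for pair in points.windows(2) {
        let (t0, v0) = pair[0];
        let (t1, v1) = pair[1];
        out.push((t0, v0));
        let mut t = t0 + step_ms;
        while t < t1 {
            // Fraction of the way from t0 to t1, applied to the value range.
            let frac = (t - t0) as f64 / (t1 - t0) as f64;
            out.push((t, v0 + frac * (v1 - v0)));
            t += step_ms;
        }
    }
    if let Some(&last) = points.last() {
        out.push(last);
    }
    out
}
```

Forward and Backward fill follow the same loop shape, with the interpolated value replaced by `v0` or `v1` respectively.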

Time-Zone Handling

HeliosDB stores all timestamps in UTC and provides time-zone conversion at query time:

-- Query with timezone conversion
SELECT
    timestamp AT TIME ZONE 'America/New_York' AS local_time,
    value
FROM metrics
WHERE timestamp > NOW() - INTERVAL '24 hours';

Integration with MVCC

Time-series data integrates with HeliosDB’s Multi-Version Concurrency Control:

  • Point-in-Time Queries: Query data as it existed at any past moment
  • Consistent Snapshots: Transactional reads across time ranges
  • Conflict-Free Writes: Append-only model eliminates write conflicts

API Modules

| Module | Description |
| --- | --- |
| TimeSeriesEngine | Main engine coordinating all operations |
| TimeSeriesPoint | Data point structure |
| BatchCompressor | Production batch compression |
| GorillaCompressor | Low-level Gorilla implementation |
| DictionaryCompressor | String dictionary compression |
| PartitionManager | Time-based partition management |
| RetentionEngine | Data expiration and cleanup |
| DownsamplingEngine | Multi-tier aggregation |
| TimeSeriesQueryEngine | Query execution and caching |
| IngestionPipeline | High-throughput ingestion |

Related Documentation

| Document | Description |
| --- | --- |
| Quick Start Guide | Get started in 10 minutes |
| User Guide | Comprehensive documentation |
| Examples | Code examples for common use cases |
| Compression Details | Technical compression reference |
| Performance Tuning | Optimization guide |

Performance Targets

| Metric | Target | Achieved |
| --- | --- | --- |
| Ingestion throughput | 1M pts/sec | 500K+ pts/sec |
| Compression ratio | 8-10x | 10-15x |
| Compression latency | <5ms/1K pts | <3ms/1K pts |
| Decompression latency | <3ms/1K pts | <2ms/1K pts |
| Query latency (time range) | <10ms | <5ms |
| Partition pruning | 95% reduction | 95%+ |

Use Cases

IoT and Sensor Networks

  • High-volume sensor data ingestion
  • Edge device telemetry
  • Industrial monitoring

Observability and Monitoring

  • Infrastructure metrics (CPU, memory, disk)
  • Application performance monitoring
  • Log aggregation and analysis

Financial Data

  • High-frequency trading ticks
  • Market data feeds
  • Risk analytics

Operational Analytics

  • Real-time dashboards
  • Anomaly detection
  • Trend analysis

See Also: HeliosDB Feature Index