# Storage Cost Attribution - Quick Start Guide
Get started with HeliosDB’s storage cost attribution and optimization in 5 minutes.
## What is Storage Cost Attribution?
A comprehensive system for tracking, analyzing, and optimizing storage costs at granular table and column levels. Intelligent tiering and compression typically enable a 20-30% cost reduction.
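To make the cost model concrete, here is a crate-independent sketch of the underlying arithmetic: monthly cost is simply size in GB times the per-GB tier rate, so moving rarely-read data to a cheaper tier shrinks the bill. The table sizes and rates below are illustrative, not HeliosDB defaults:

```rust
// Monthly storage cost as (size in GB) x (per-GB tier rate).
fn monthly_cost(bytes: u64, cost_per_gb: f64) -> f64 {
    (bytes as f64 / 1_073_741_824.0) * cost_per_gb
}

fn main() {
    const GIB: u64 = 1_073_741_824;
    // All 1000 GB on the hot tier at $0.10/GB/month...
    let all_hot = monthly_cost(1000 * GIB, 0.10);
    // ...versus 200 GB hot plus 800 GB of rarely-read data on cold at $0.01/GB/month.
    let tiered = monthly_cost(200 * GIB, 0.10) + monthly_cost(800 * GIB, 0.01);
    // prints "all hot: $100.00/month, tiered: $28.00/month"
    println!("all hot: ${:.2}/month, tiered: ${:.2}/month", all_hot, tiered);
}
```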
## Quick Start
### 1. Initialize Storage Attributor
```rust
use heliosdb_cost_optimizer_v2::{StorageAttributor, TierCostConfig};

// Create with default costs ($0.10/GB/month for hot tier)
let attributor = StorageAttributor::new(0.10);

// OR create with custom tier costs
let tier_costs = TierCostConfig {
    hot_cost_per_gb: 0.10,      // SSD
    warm_cost_per_gb: 0.05,     // HDD
    cold_cost_per_gb: 0.01,     // Object storage
    archive_cost_per_gb: 0.004, // Glacier-like
};
let attributor = StorageAttributor::with_tier_costs(tier_costs);
```

### 2. Track Table Metrics
```rust
use heliosdb_cost_optimizer_v2::{TableStorageMetrics, StorageTier, AccessFrequency, AccessPattern};

let metrics = TableStorageMetrics {
    table_name: "users".to_string(),
    total_bytes: 10_737_418_240, // 10 GB
    row_count: 1_000_000,
    index_bytes: 1_073_741_824,
    data_bytes: 9_663_676_416,
    compressed_bytes: 3_579_139_413,
    uncompressed_bytes: 10_737_418_240,
    compression_ratio: 3.0,
    avg_row_size: 10737.42,
    storage_tier: StorageTier::Hot,
    last_accessed: chrono::Utc::now(),
    created_at: chrono::Utc::now() - chrono::Duration::days(365),
    access_frequency: AccessFrequency {
        reads_per_day: 1000.0,
        writes_per_day: 100.0,
        last_7_days_reads: 7000,
        last_30_days_reads: 30000,
        access_pattern: AccessPattern::Hot,
    },
};

attributor.update_table_metrics(metrics).await;
```

### 3. Calculate Costs
```rust
// Total cost across all tables
let total_cost = attributor.calculate_total_cost().await;
println!("Total monthly cost: ${:.2}", total_cost);

// Cost by table
let cost_by_table = attributor.cost_by_table().await;
for (table, cost) in cost_by_table {
    println!("{}: ${:.2}/month", table, cost);
}

// Compression savings
let savings = attributor.compression_savings().await;
println!("Monthly savings from compression: ${:.2}", savings);
```

### 4. Get Optimization Recommendations
```rust
// Tiering recommendations
let tiering_recs = attributor.tiering_recommendations().await;
for rec in tiering_recs.iter().take(5) {
    println!("Move {} from {:?} to {:?}", rec.table, rec.from_tier, rec.to_tier);
    println!("  Annual savings: ${:.2}", rec.annual_savings_usd);
    println!("  Reason: {}", rec.reason);
}
```
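The exact heuristics behind `tiering_recommendations` are internal to the crate, but a useful mental model is an access-age rule plus a price delta. A crate-independent sketch (the `Table` struct, `annual_savings` helper, 30-day threshold, and rates are all illustrative, not the HeliosDB algorithm):

```rust
// Illustrative tiering rule: flag hot-tier tables untouched for 30+ days
// and price the move from hot ($0.10/GB/mo) to warm ($0.05/GB/mo).
struct Table {
    name: &'static str,
    bytes: u64,
    days_since_access: u32,
}

// Annual savings from moving `bytes` between two per-GB monthly rates.
fn annual_savings(bytes: u64, from_per_gb: f64, to_per_gb: f64) -> f64 {
    (bytes as f64 / 1_073_741_824.0) * (from_per_gb - to_per_gb) * 12.0
}

fn main() {
    const GIB: u64 = 1_073_741_824;
    let tables = [
        Table { name: "events_log", bytes: 500 * GIB, days_since_access: 90 },
        Table { name: "users", bytes: 10 * GIB, days_since_access: 0 },
    ];
    for t in tables.iter().filter(|t| t.days_since_access >= 30) {
        // prints "events_log: move hot -> warm, ~$300.00/year"
        println!("{}: move hot -> warm, ~${:.2}/year",
            t.name, annual_savings(t.bytes, 0.10, 0.05));
    }
}
```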
```rust
// Compression recommendations
let compression_recs = attributor.compression_recommendations().await;
for rec in compression_recs.iter().take(5) {
    println!("Compress {}.{} with {:?}", rec.table, rec.column, rec.recommended_compression);
    println!("  Annual savings: ${:.2}", rec.expected_savings_usd_annual);
}
```

### 5. Track Trends and Forecast
```rust
use std::collections::HashMap;

use heliosdb_cost_optimizer_v2::{StorageTrendTracker, StorageSnapshot};

let tracker = StorageTrendTracker::new();

// Add daily snapshots
let snapshot = StorageSnapshot {
    timestamp: chrono::Utc::now(),
    total_bytes: 5_368_709_120_000, // ~5 TB
    total_cost: 500.0,
    table_count: 100,
    by_table: HashMap::new(),
    by_tier: HashMap::new(),
};
tracker.add_snapshot(snapshot).await;

// Forecast 30 days
let forecast = tracker.forecast_growth(30).await;
println!("Current: {} GB", forecast.current_bytes / 1_073_741_824);
println!("Predicted in 30 days: {} GB", forecast.predicted_bytes / 1_073_741_824);
println!("Growth rate: {:.2} GB/day", forecast.growth_rate_gb_per_day);
println!("Trend: {:?}", forecast.growth_trend);
```
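Under the hood, a forecast like this amounts to fitting a trend over the snapshot history and extrapolating. A crate-independent sketch using a least-squares slope (`linear_forecast` and the sample data are illustrative, not the HeliosDB implementation):

```rust
// Fit a least-squares line through (day, usage) samples and extrapolate.
fn linear_forecast(samples: &[(f64, f64)], days_ahead: f64) -> f64 {
    let n = samples.len() as f64;
    let mx = samples.iter().map(|(x, _)| x).sum::<f64>() / n;
    let my = samples.iter().map(|(_, y)| y).sum::<f64>() / n;
    let num: f64 = samples.iter().map(|(x, y)| (x - mx) * (y - my)).sum();
    let den: f64 = samples.iter().map(|(x, _)| (x - mx).powi(2)).sum();
    let slope = num / den; // growth per day
    let last = samples.last().unwrap();
    last.1 + slope * days_ahead
}

fn main() {
    // 30 days of history growing linearly at 10 GB/day from 1000 GB.
    let samples: Vec<(f64, f64)> = (0..30)
        .map(|d| (d as f64, 1000.0 + 10.0 * d as f64))
        .collect();
    let predicted = linear_forecast(&samples, 30.0);
    // prints "predicted GB in 30 days: 1590"
    println!("predicted GB in 30 days: {:.0}", predicted);
}
```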
```rust
// Analyze growth
let analysis = tracker.analyze_growth(None).await;
println!("Daily growth: {:.2} GB", analysis.average_daily_growth_gb);
println!("Weekly growth: {:.2} GB", analysis.average_weekly_growth_gb);
```

### 6. Use Dashboard API
```rust
use std::sync::Arc;

use axum::Router;
use heliosdb_cost_optimizer_v2::{DashboardState, storage_dashboard_routes};

// Create shared state
let state = DashboardState {
    attributor: Arc::new(attributor),
    trend_tracker: Arc::new(tracker),
};

// Create router
let app = storage_dashboard_routes().with_state(state);

// Start server
let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
axum::serve(listener, app).await?;
```

### 7. Access Dashboard
```bash
# Main dashboard (quote URLs containing '&' so the shell doesn't split them)
curl "http://localhost:8080/api/storage/dashboard?top_n=10&forecast_days=30"

# Cost breakdown
curl http://localhost:8080/api/storage/cost

# Table details
curl http://localhost:8080/api/storage/tables/users

# Recommendations
curl http://localhost:8080/api/storage/recommendations

# Forecast
curl "http://localhost:8080/api/storage/forecast?forecast_days=60"

# Efficiency metrics
curl http://localhost:8080/api/storage/efficiency
```

## Common Use Cases
### Use Case 1: Identify Cost Hotspots
```rust
// Get top 10 most expensive tables
let top_tables = attributor.top_tables_by_cost(10).await;

for table in top_tables {
    println!("{}: ${:.2}/month ({:.1}% of total)",
        table.table,
        table.cost_usd_monthly,
        table.percent_of_total
    );
}
```

Output:

```text
events_log: $500.00/month (40.0% of total)
user_sessions: $200.00/month (16.0% of total)
transactions: $150.00/month (12.0% of total)
```

### Use Case 2: Optimize Old Data
```rust
// Find tables that should be moved to cheaper tiers
let recommendations = attributor.tiering_recommendations().await;

let big_savers = recommendations.iter()
    .filter(|r| r.annual_savings_usd > 1000.0)
    .collect::<Vec<_>>();

println!("Found {} high-value optimizations", big_savers.len());
```

### Use Case 3: Forecast Capacity Needs
```rust
// Predict when you'll hit capacity
let capacity_bytes = 100_000_000_000_000; // 100 TB
let analysis = tracker.analyze_growth(Some(capacity_bytes)).await;

if let Some(days) = analysis.days_until_capacity {
    println!("Will reach capacity in {} days", days);
    println!("Consider adding storage or archiving old data");
}
```

### Use Case 4: Monitor Efficiency
```rust
// Track storage efficiency over time
let score = attributor.storage_efficiency_score().await;

match score {
    s if s >= 90.0 => println!("✅ Excellent efficiency"),
    s if s >= 70.0 => println!("⚠ Good efficiency, room for improvement"),
    s if s >= 50.0 => println!("⚠ Fair efficiency, optimization recommended"),
    _ => println!("❌ Poor efficiency, immediate action needed"),
}
```

## Configuration
### Tier Cost Configuration
```rust
// AWS-like pricing
let aws_costs = TierCostConfig {
    hot_cost_per_gb: 0.10,      // EBS SSD
    warm_cost_per_gb: 0.045,    // EBS HDD
    cold_cost_per_gb: 0.023,    // S3 Standard
    archive_cost_per_gb: 0.004, // S3 Glacier
};

// Azure-like pricing
let azure_costs = TierCostConfig {
    hot_cost_per_gb: 0.12,
    warm_cost_per_gb: 0.06,
    cold_cost_per_gb: 0.015,
    archive_cost_per_gb: 0.002,
};

// GCP-like pricing
let gcp_costs = TierCostConfig {
    hot_cost_per_gb: 0.085,
    warm_cost_per_gb: 0.040,
    cold_cost_per_gb: 0.020,
    archive_cost_per_gb: 0.004,
};
```

### Snapshot Retention
```rust
// Keep 1 year of daily snapshots (default)
let tracker = StorageTrendTracker::new();

// Keep 2 years of daily snapshots
let tracker = StorageTrendTracker::with_retention(730);

// Keep 90 days of daily snapshots
let tracker = StorageTrendTracker::with_retention(90);
```

## Best Practices
### 1. Regular Snapshot Collection
Collect daily snapshots for accurate forecasting:
```rust
tokio::spawn(async move {
    let mut interval = tokio::time::interval(std::time::Duration::from_secs(86400)); // 24 hours

    loop {
        interval.tick().await;
        let snapshot = collect_snapshot(&attributor).await;
        trend_tracker.add_snapshot(snapshot).await;
    }
});
```

### 2. Automatic Tier Migration
Implement automatic migration based on recommendations:
```rust
let recommendations = attributor.tiering_recommendations().await;

for rec in recommendations {
    if rec.risk_level == RiskLevel::Low && rec.annual_savings_usd > 1000.0 {
        // Auto-migrate if low risk and significant savings
        migrate_table(&rec.table, rec.to_tier).await?;
        println!("Auto-migrated {} to {:?}", rec.table, rec.to_tier);
    }
}
```

### 3. Set Budget Alerts
```rust
let total_cost = attributor.calculate_total_cost().await;
let budget = 2000.0; // $2,000/month

if total_cost > budget * 0.9 {
    println!("⚠ Warning: At 90% of budget (${:.2} / ${:.2})", total_cost, budget);
}

if total_cost > budget {
    println!("❌ Alert: Over budget! (${:.2} / ${:.2})", total_cost, budget);
}
```

### 4. Monitor Growth Trends
```rust
let forecast = tracker.forecast_growth(30).await;

match forecast.growth_trend {
    GrowthTrend::Accelerating => {
        println!("⚠ Growth is accelerating - investigate data sources");
    },
    GrowthTrend::Linear => {
        println!("✅ Steady growth - predictable");
    },
    GrowthTrend::Stable => {
        println!("✅ Stable storage usage");
    },
    _ => {}
}
```

## Troubleshooting
### Issue: Forecasts are inaccurate
Solution: Ensure you have at least 30 days of historical snapshots:
```rust
let snapshots = tracker.get_snapshots().await;
if snapshots.len() < 30 {
    println!("Warning: Only {} snapshots. Need 30+ for accurate forecasting.",
        snapshots.len());
}
```

### Issue: Recommendations seem wrong
Solution: Verify table metrics are up-to-date:
```rust
let metrics = attributor.get_table_metrics("table_name").await;
if let Some(m) = metrics {
    let days_old = (chrono::Utc::now() - m.last_accessed).num_days();
    if days_old > 1 {
        println!("Warning: Metrics are {} days old. Re-analyze table.", days_old);
    }
}
```

### Issue: High memory usage
Solution: Reduce snapshot retention or limit table count:
```rust
// Reduce retention to 90 days
let tracker = StorageTrendTracker::with_retention(90);

// Only track top N tables
let top_tables = attributor.top_tables_by_cost(100).await; // Top 100 only
```

## Performance Tips
- Batch Updates: Update metrics in batches to reduce lock contention
- Async Operations: Use async/await for all operations
- Index Separately: Track index costs separately from data costs
- Cache Results: Cache dashboard results for 5-10 minutes
- Parallel Forecasting: Run forecasts for multiple scenarios in parallel
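For the caching tip, a minimal TTL cache in plain Rust shows the idea: serve a stored dashboard response while it is fresh, recompute once it expires. `TtlCache` is illustrative, not part of the HeliosDB API; production code would more likely reach for a caching crate such as `moka`:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Minimal TTL cache sketch: values expire `ttl` after insertion.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    // Return the value only while it is still fresh.
    fn get(&self, key: &str) -> Option<&String> {
        self.entries
            .get(key)
            .and_then(|(at, v)| (at.elapsed() < self.ttl).then_some(v))
    }

    fn put(&mut self, key: String, value: String) {
        self.entries.insert(key, (Instant::now(), value));
    }
}

fn main() {
    // Cache dashboard JSON for 5 minutes.
    let mut cache = TtlCache::new(Duration::from_secs(300));
    cache.put("dashboard".into(), "{\"total_cost\":500.0}".into());
    assert!(cache.get("dashboard").is_some());   // fresh hit
    assert!(cache.get("missing").is_none());     // unknown key
    println!("cache hit: {:?}", cache.get("dashboard"));
}
```

A handler would check `get` first and only call `calculate_total_cost` / `forecast_growth` on a miss, then `put` the rendered response back.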
## Next Steps
- Read the full documentation
- Explore API reference
- Check out integration examples
- Learn about Week 3: Network Cost Tracking
## Support
For questions or issues, see:
- GitHub Issues: https://github.com/heliosdb/heliosdb
- Documentation: /home/claude/HeliosDB/docs/
- Examples: /home/claude/HeliosDB/heliosdb-cost-optimizer-v2/tests/