SqrtSpace SpaceTime Best Practices
This project demonstrates best practices for building production-ready applications using the SqrtSpace SpaceTime library. It showcases advanced patterns and configurations for optimal memory efficiency and performance.
Key Concepts Demonstrated
1. Comprehensive Service Configuration
The application demonstrates proper configuration of all SpaceTime services:
// Environment-aware memory configuration
builder.Services.Configure<SpaceTimeConfiguration>(options =>
{
    options.Memory.MaxMemory = builder.Environment.IsDevelopment()
        ? 256 * 1024 * 1024   // 256MB for dev
        : 1024 * 1024 * 1024; // 1GB for production

    // Respect container limits
    var memoryLimit = Environment.GetEnvironmentVariable("MEMORY_LIMIT");
    if (long.TryParse(memoryLimit, out var limit))
    {
        options.Memory.MaxMemory = (long)(limit * 0.8); // Use 80% of the container limit
    }
});
2. Layered Caching Strategy
Implements hot/cold tiered caching with automatic spill-to-disk:
builder.Services.AddSpaceTimeCaching(options =>
{
    options.MaxHotMemory = 50 * 1024 * 1024; // 50MB hot cache
    options.EnableColdStorage = true;
    options.ColdStoragePath = Path.Combine(Path.GetTempPath(), "spacetime-cache");
});
3. Production-Ready Diagnostics
Comprehensive monitoring with OpenTelemetry integration:
builder.Services.AddSpaceTimeDiagnostics(options =>
{
    options.EnableMetrics = true;
    options.EnableTracing = true;
    options.SamplingRate = builder.Environment.IsDevelopment() ? 1.0 : 0.1;
});
4. Entity Framework Integration
Shows how to configure EF Core with SpaceTime optimizations:
options.UseSqlServer(connectionString)
    .UseSpaceTimeOptimizer(opt =>
    {
        opt.EnableSqrtNChangeTracking = true;
        opt.BufferPoolStrategy = BufferPoolStrategy.SqrtN;
    });
5. Memory-Aware Background Processing
Background services that respond to memory pressure:
_memoryMonitor.PressureEvents
    .Where(e => e.CurrentLevel >= MemoryPressureLevel.High)
    .Subscribe(e =>
    {
        _logger.LogWarning("High memory pressure detected, pausing processing");
        // Implement backpressure
    });
6. Pipeline Pattern for Complex Processing
Multi-stage processing with checkpointing:
var pipeline = _pipelineFactory.CreatePipeline<Order, ProcessedOrder>("OrderProcessing")
    .Configure(config =>
    {
        config.ExpectedItemCount = orders.Count();
        config.EnableCheckpointing = true;
    })
    .AddTransform("Validate", ValidateOrder)
    .AddBatch("EnrichCustomerData", EnrichWithCustomerData)
    .AddParallel("CalculateTax", CalculateTax, maxConcurrency: 4)
    .AddCheckpoint("SaveProgress")
    .Build();
7. Distributed Processing Coordination
Shows how to partition work across multiple nodes:
var partition = await _coordinator.RequestPartitionAsync(
    request.WorkloadId,
    request.EstimatedSize);

// Process only this node's portion
var filter = new OrderFilter
{
    StartDate = partition.StartRange,
    EndDate = partition.EndRange
};
8. Streaming API Endpoints
Demonstrates memory-efficient streaming with automatic chunking:
[HttpGet("export")]
[SpaceTimeStreaming(ChunkStrategy = ChunkStrategy.SqrtN)]
public async IAsyncEnumerable<OrderExportDto> ExportOrders([FromQuery] OrderFilter filter)
{
    // 'orders' is the filtered query built from 'filter' (construction omitted here)
    await foreach (var batch in orders.BatchBySqrtNAsync())
    {
        foreach (var order in batch)
        {
            yield return MapToDto(order);
        }
    }
}
Architecture Patterns
Service Layer Pattern
The OrderService demonstrates the following; a skeleton sketch appears after the list:
- Dependency injection of SpaceTime services
- Operation tracking with diagnostics
- External sorting for large datasets
- Proper error handling and logging
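A minimal skeleton of such a service, assuming constructor injection; AppDbContext is a placeholder name for the EF Core context, and the query calls are the SpaceTime extensions shown below under Memory-Aware Queries:
public class OrderService
{
    private readonly AppDbContext _context;                 // EF Core context (placeholder name)
    private readonly IMemoryPressureMonitor _memoryMonitor; // SpaceTime memory monitor
    private readonly ILogger<OrderService> _logger;

    public OrderService(AppDbContext context, IMemoryPressureMonitor memoryMonitor, ILogger<OrderService> logger)
    {
        _context = context;
        _memoryMonitor = memoryMonitor;
        _logger = logger;
    }

    public async Task<List<Order>> GetOrdersSortedByDateAsync()
    {
        try
        {
            // External sorting keeps memory bounded for large result sets
            return await _context.Orders
                .OrderByExternal(o => o.CreatedDate)
                .ToListWithSqrtNMemoryAsync();
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to load orders sorted by date");
            throw;
        }
    }
}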
Memory-Aware Queries
// Automatically switches to external sorting for large results
var orders = await query
    .OrderByExternal(o => o.CreatedDate)
    .ToListWithSqrtNMemoryAsync();
Batch Processing
// Process data in memory-efficient batches
await foreach (var batch in context.Orders
    .Where(o => o.Status == "Pending")
    .BatchBySqrtNAsync())
{
    // Process batch
}
Task Scheduling
// Schedule work based on memory availability
await _scheduler.ScheduleAsync(
    async () => await ProcessNextBatchAsync(stoppingToken),
    estimatedMemory: 50 * 1024 * 1024, // 50MB
    priority: TaskPriority.Low);
Configuration Best Practices
1. Environment-Based Configuration
- Development: Lower memory limits, full diagnostics
- Production: Higher limits, sampled diagnostics
- Container: Respect container memory limits (see the configuration-binding sketch below)
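One way to keep these differences out of code is to bind SpaceTimeConfiguration from per-environment appsettings files; the "SpaceTime" section name below is an assumption for illustration, not a documented convention:
// appsettings.Production.json (illustrative):
// {
//   "SpaceTime": {
//     "Memory": { "MaxMemory": 1073741824 }
//   }
// }

// Program.cs: bind the section; environment-specific files override the defaults automatically
builder.Services.Configure<SpaceTimeConfiguration>(
    builder.Configuration.GetSection("SpaceTime"));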
2. Conditional Service Registration
// Only add distributed coordination if Redis is available
var redisConnection = builder.Configuration.GetConnectionString("Redis");
if (!string.IsNullOrEmpty(redisConnection))
{
    builder.Services.AddSpaceTimeDistributed(options =>
    {
        options.NodeId = Environment.MachineName;
        options.CoordinationEndpoint = redisConnection;
    });
}
3. Health Monitoring
app.MapGet("/health", (IMemoryPressureMonitor monitor) =>
{
    var stats = monitor.CurrentStatistics;
    return Results.Ok(new
    {
        Status = "Healthy",
        MemoryPressure = monitor.CurrentPressureLevel.ToString(),
        MemoryUsage = new
        {
            ManagedMemoryMB = stats.ManagedMemory / (1024.0 * 1024.0),
            WorkingSetMB = stats.WorkingSet / (1024.0 * 1024.0),
            AvailablePhysicalMemoryMB = stats.AvailablePhysicalMemory / (1024.0 * 1024.0)
        }
    });
});
Production Considerations
1. Memory Limits
Always configure memory limits based on your deployment environment (a container fallback sketch follows the list):
- Container deployments: Use 80% of container limit
- VMs: Consider other processes running
- Serverless: Respect function memory limits
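For containers, if no MEMORY_LIMIT variable is provided, one hedged option is to fall back to the cgroup v2 limit on Linux; the path and 80% headroom below are assumptions, and the snippet belongs inside the Configure<SpaceTimeConfiguration> callback shown earlier:
// Fallback: derive the budget from the cgroup v2 limit (Linux, cgroup v2 only)
const string cgroupPath = "/sys/fs/cgroup/memory.max";
if (File.Exists(cgroupPath))
{
    var text = File.ReadAllText(cgroupPath).Trim();
    // The file contains "max" when no limit is set for the cgroup
    if (long.TryParse(text, out var containerBytes))
    {
        options.Memory.MaxMemory = (long)(containerBytes * 0.8); // leave ~20% headroom
    }
}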
2. Checkpointing Strategy
Enable checkpointing for:
- Long-running operations
- Operations that process large datasets
- Critical business processes that must be resumable
3. Monitoring and Alerting
Monitor these key metrics (a pressure-event sketch follows the list):
- Memory pressure levels
- External sort operations
- Checkpoint frequency
- Cache hit rates
- Pipeline processing times
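As a starting point, pressure-level changes can be forwarded to logs or metrics so alerts can fire on them; this reuses the IMemoryPressureMonitor subscription shown earlier, and the alert sink itself is left as a placeholder:
_memoryMonitor.PressureEvents
    .Subscribe(e =>
    {
        // Record every transition so dashboards can chart pressure over time
        _logger.LogInformation("Memory pressure changed to {Level}", e.CurrentLevel);

        if (e.CurrentLevel >= MemoryPressureLevel.High)
        {
            // Placeholder: emit a metric or notify on-call here
            _logger.LogWarning("High memory pressure - check batch sizes and cache limits");
        }
    });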
4. Error Handling
Implement proper error handling (a retry sketch follows the list):
- Use diagnostics to track operations
- Log errors with context
- Implement retry logic for transient failures
- Clean up resources on failure
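A minimal retry sketch for the transient-failure point, in plain C# with exponential backoff; ProcessBatchAsync and IsTransient are placeholders for your own batch operation and exception classification:
const int maxAttempts = 3;
for (var attempt = 1; attempt <= maxAttempts; attempt++)
{
    try
    {
        await ProcessBatchAsync(batch, stoppingToken);
        break; // success
    }
    catch (Exception ex) when (attempt < maxAttempts && IsTransient(ex))
    {
        _logger.LogWarning(ex, "Batch attempt {Attempt} failed, retrying", attempt);
        await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)), stoppingToken);
    }
}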
5. Performance Tuning
- Adjust batch sizes based on workload
- Configure parallelism based on CPU cores (see the sketch after this list)
- Set appropriate cache sizes
- Monitor and adjust memory thresholds
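For the parallelism point, one option is to derive maxConcurrency from the host's processor count instead of hard-coding it; this reuses the AddParallel call from the pipeline example, and the cap of 8 is an arbitrary illustration:
// Scale the parallel stage with available cores, capped to avoid oversubscription
var concurrency = Math.Min(Environment.ProcessorCount, 8);

var pipeline = _pipelineFactory.CreatePipeline<Order, ProcessedOrder>("OrderProcessing")
    .AddTransform("Validate", ValidateOrder)
    .AddParallel("CalculateTax", CalculateTax, maxConcurrency: concurrency)
    .Build();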
Testing Recommendations
1. Load Testing
Test with datasets that exceed memory limits to ensure the following (a test sketch follows the list):
- External processing activates correctly
- Memory pressure is handled gracefully
- Checkpointing works under load
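A hedged xUnit-style sketch of the first point: configure a deliberately small memory budget, push through more data than fits, and assert the query still completes. CreateContextWithSmallMemoryLimit and SeedOrdersAsync are placeholders; the query calls are the ones shown earlier:
[Fact]
public async Task Sorting_large_dataset_completes_under_small_memory_budget()
{
    // Arrange: context configured with a small MaxMemory (e.g. 32MB) and a large dataset
    await using var context = CreateContextWithSmallMemoryLimit();
    await SeedOrdersAsync(context, count: 1_000_000);

    // Act: sort more data than fits in the configured budget
    var sorted = await context.Orders
        .OrderByExternal(o => o.CreatedDate)
        .ToListWithSqrtNMemoryAsync();

    // Assert: the full result came back despite the small budget
    Assert.Equal(1_000_000, sorted.Count);
}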
2. Failure Testing
Test recovery scenarios:
- Process crashes during batch processing
- Memory pressure during operations
- Network failures in distributed scenarios
3. Performance Testing
Measure:
- Response times under various memory conditions
- Throughput with different batch sizes
- Resource utilization patterns
Deployment Checklist
- Configure memory limits based on deployment environment
- Set up monitoring and alerting
- Configure persistent storage for checkpoints and cold cache
- Test failover and recovery procedures
- Document memory requirements and scaling limits
- Configure appropriate logging levels
- Set up distributed coordination (if using multiple nodes)
- Verify health check endpoints
- Test under expected production load
Advanced Scenarios
Multi-Node Deployment
For distributed deployments:
- Configure Redis for coordination
- Set unique node IDs
- Implement partition-aware processing
- Monitor cross-node communication
High-Availability Setup
- Use persistent checkpoint storage
- Implement automatic failover
- Configure redundant cache storage
- Monitor node health
Performance Optimization
- Profile memory usage patterns
- Adjust algorithm selection thresholds
- Optimize batch sizes for your workload
- Configure appropriate parallelism levels
Summary
This best practices project demonstrates how to build robust, memory-efficient applications using SqrtSpace SpaceTime. By following these patterns, you can build applications that:
- Scale gracefully under memory pressure
- Process large datasets efficiently
- Recover from failures automatically
- Provide predictable performance
- Optimize resource utilization
The key is to embrace the √n space-time tradeoff philosophy throughout your application architecture, letting the library handle the complexity of memory management while you focus on business logic.