SqrtSpace SpaceTime Sample Web API

This sample demonstrates how to build a memory-efficient Web API using the SqrtSpace SpaceTime library. It showcases real-world scenarios where √n space-time tradeoffs can significantly improve application performance and scalability.

Features Demonstrated

1. Memory-Efficient Data Processing

  • Streaming large datasets without loading everything into memory
  • Automatic batching using √n-sized chunks
  • External sorting and aggregation for datasets that exceed memory limits
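The √n-sized chunking above can be pictured with a small, library-free sketch (the helper names here are illustrative, not part of the SpaceTime API): with a chunk size of ⌈√n⌉, both the in-memory batch and the number of batches stay around √n.

```csharp
using System;
using System.Collections.Generic;

static class SqrtChunking
{
    // For n items, a chunk size of ceil(sqrt(n)) yields roughly sqrt(n) chunks,
    // so neither the chunk buffer nor the chunk count grows faster than sqrt(n).
    public static int ChunkSize(int totalCount) =>
        totalCount <= 1 ? 1 : (int)Math.Ceiling(Math.Sqrt(totalCount));

    // Yield the source in sqrt(n)-sized batches without materializing it all at once.
    public static IEnumerable<List<T>> BatchBySqrtN<T>(IReadOnlyCollection<T> source)
    {
        int size = ChunkSize(source.Count);
        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}
```

For 10,000 items this gives 100-item batches; the library's `BatchBySqrtNAsync` (used later in this README) applies the same sizing to async streams.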

2. Checkpoint-Enabled Operations

  • Resumable bulk operations that can recover from failures
  • Progress tracking for long-running tasks
  • Automatic state persistence at optimal intervals
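One way to picture resumable processing is this hand-rolled sketch (not the library's checkpoint API): persist the last completed index at fixed intervals, and on restart skip work that is already done.

```csharp
using System;
using System.IO;

static class CheckpointedRun
{
    // Process items [0, total), saving progress to checkpointFile every
    // `interval` items so a crashed run can resume where it left off.
    public static int Run(int total, int interval, string checkpointFile, Action<int> processItem)
    {
        // Resume from the saved index if a checkpoint exists, else start fresh.
        int start = File.Exists(checkpointFile)
            ? int.Parse(File.ReadAllText(checkpointFile))
            : 0;
        for (int i = start; i < total; i++)
        {
            processItem(i);
            if ((i + 1) % interval == 0)
                File.WriteAllText(checkpointFile, (i + 1).ToString());
        }
        File.Delete(checkpointFile); // completed; clear saved state
        return start; // index we resumed from, for illustration
    }
}
```

Choosing a √n interval balances checkpoint overhead against the amount of work lost on failure, which is the tradeoff the library automates.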

3. Real-World API Patterns

Products Controller (/api/products)

  • Paginated queries - Basic memory control through pagination
  • Streaming endpoints - Stream millions of products using NDJSON format
  • Smart search - Automatically switches to external sorting for large result sets
  • Bulk updates - Checkpoint-enabled price updates that can resume after failures
  • CSV export - Stream large exports without memory bloat
  • Statistics - Calculate aggregates over large datasets efficiently

Analytics Controller (/api/analytics)

  • Revenue analysis - External grouping for large-scale aggregations
  • Top customers - Find top N using external sorting when needed
  • Real-time streaming - Server-Sent Events for continuous analytics
  • Complex reports - Multi-stage report generation with checkpointing
  • Pattern analysis - ML-ready data processing with memory constraints
  • Memory monitoring - Track how the system manages memory

4. Automatic Memory Management

  • Adapts processing strategy based on data size
  • Spills to disk when memory pressure is detected
  • Provides memory usage statistics for monitoring
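The strategy selection can be sketched as a simple decision rule (a simplification of what the library does; the names and the byte estimate are illustrative, but the 10,000-item cutoff mirrors the threshold used later in this sample):

```csharp
using System;

enum ProcessingStrategy { InMemory, External }

static class StrategySelector
{
    // Pick a strategy from the item count and a memory budget: large result
    // sets, or working sets that would exceed the budget, spill to external
    // (disk-backed) processing.
    public static ProcessingStrategy Choose(long itemCount, long bytesPerItem, long maxMemoryBytes)
    {
        if (itemCount > 10_000) return ProcessingStrategy.External;
        return itemCount * bytesPerItem > maxMemoryBytes
            ? ProcessingStrategy.External
            : ProcessingStrategy.InMemory;
    }
}
```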

Running the Sample

  1. Start the API:

    dotnet run
    
  2. Access Swagger UI: Navigate to https://localhost:5001/swagger to explore the API.

  3. Generate Test Data: The application automatically seeds the database with:

    • 1,000 customers
    • 10,000 products
    • 50,000 orders

    A background service continuously generates new orders to simulate real-time data.

Key Scenarios to Try

1. Stream Large Dataset

# Stream all products (10,000+) without loading into memory
curl -N https://localhost:5001/api/products/stream

# The response is newline-delimited JSON (NDJSON)

2. Bulk Update with Checkpointing

# Start a bulk price update
curl -X POST https://localhost:5001/api/products/bulk-update-prices \
  -H "Content-Type: application/json" \
  -H "X-Operation-Id: price-update-123" \
  -d '{"categoryFilter": "Electronics", "priceMultiplier": 1.1}'

# If it fails, resume with the same Operation ID

3. Generate Complex Report

# Generate a report with automatic checkpointing
curl -X POST https://localhost:5001/api/analytics/reports/generate \
  -H "Content-Type: application/json" \
  -d '{
    "startDate": "2024-01-01",
    "endDate": "2024-12-31",
    "metricsToInclude": ["revenue", "categories", "customers", "products"],
    "includeDetailedBreakdown": true
  }'

4. Real-Time Analytics Stream

# Connect to real-time analytics stream
curl -N https://localhost:5001/api/analytics/real-time/orders

# Streams analytics data every second using Server-Sent Events
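For reference, the Server-Sent Events wire format the endpoint emits is simple: each payload line is prefixed with `data: `, and a blank line terminates the event. A minimal formatter (not the sample's actual implementation) looks like this:

```csharp
using System;
using System.Text;

static class ServerSentEvents
{
    // Format one SSE message: optional "event:" field, one "data:" line per
    // payload line, and a trailing blank line to terminate the event.
    public static string FormatEvent(string payload, string eventName = null)
    {
        var sb = new StringBuilder();
        if (eventName != null) sb.Append("event: ").Append(eventName).Append('\n');
        foreach (var line in payload.Split('\n'))
            sb.Append("data: ").Append(line).Append('\n');
        sb.Append('\n');
        return sb.ToString();
    }
}
```

An SSE endpoint writes such messages to a response with `Content-Type: text/event-stream`, flushing after each event.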

5. Export Large Dataset

# Export all products to CSV (streams the file)
curl https://localhost:5001/api/products/export/csv > products.csv

Memory Efficiency Examples

Small Dataset (In-Memory Processing)

When working with small datasets (<10,000 items), the API uses standard in-memory processing:

// Standard LINQ operations
var results = await query
    .Where(p => p.Category == "Books")
    .OrderBy(p => p.Price)
    .ToListAsync();

Large Dataset (External Processing)

For large datasets (>10,000 items), the API automatically switches to external processing:

// Automatic external sorting
if (count > 10000)
{
    query = query.UseExternalSorting();
}

// Process in √n-sized batches
await foreach (var batch in query.BatchBySqrtNAsync())
{
    // Process batch
}

Configuration

The sample includes configurable memory limits:

// appsettings.json
{
  "MemoryOptions": {
    "MaxMemoryMB": 512,
    "WarningThresholdPercent": 80
  }
}
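That section can be bound with the standard ASP.NET Core options pattern; the `MemoryOptions` class below is a sketch that simply mirrors the JSON keys (the sample's own options type may differ).

```csharp
// Mirrors the "MemoryOptions" section of appsettings.json.
public class MemoryOptions
{
    public int MaxMemoryMB { get; set; } = 512;
    public int WarningThresholdPercent { get; set; } = 80;
}

// In Program.cs, registration would look like:
// builder.Services.Configure<MemoryOptions>(
//     builder.Configuration.GetSection("MemoryOptions"));
// Consumers then take IOptions<MemoryOptions> via dependency injection.
```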

Monitoring

Check memory usage statistics:

curl https://localhost:5001/api/analytics/memory-stats

Response:

{
  "currentMemoryUsageMB": 245,
  "peakMemoryUsageMB": 412,
  "externalSortOperations": 3,
  "checkpointsSaved": 15,
  "dataSpilledToDiskMB": 89,
  "cacheHitRate": 0.87,
  "currentMemoryPressure": "Medium"
}

Architecture Highlights

  1. Service Layer: Encapsulates business logic and SpaceTime optimizations
  2. Entity Framework Integration: Seamless integration with EF Core queries
  3. Middleware: Automatic checkpoint and streaming support
  4. Background Services: Continuous data generation for testing
  5. Memory Monitoring: Real-time tracking of memory usage

Best Practices Demonstrated

  1. Know Your Data Size: Check count before choosing processing strategy
  2. Stream When Possible: Use IAsyncEnumerable for large results
  3. Checkpoint Long Operations: Enable recovery from failures
  4. Monitor Memory Usage: Track and respond to memory pressure
  5. Use External Processing: Let the library handle large datasets efficiently
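Practice 2 in action: an async iterator lets the consumer (or ASP.NET Core's serializer, when a controller action returns `IAsyncEnumerable<T>`) pull items one at a time instead of waiting for a fully buffered list. A minimal, framework-free sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class Streaming
{
    // Produce items lazily; each item becomes available as soon as it is
    // ready, so peak memory stays at one item rather than the whole result.
    public static async IAsyncEnumerable<int> ProduceAsync(int count)
    {
        for (int i = 0; i < count; i++)
        {
            await Task.Yield(); // stand-in for async I/O (e.g. an EF Core query enumerated asynchronously)
            yield return i;
        }
    }
}
```

A consumer iterates with `await foreach`, which is the same shape the streaming endpoints in this sample use internally.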

Next Steps

  • Modify the memory limits and observe behavior changes
  • Add your own endpoints using SpaceTime patterns
  • Connect to a real database for production scenarios
  • Implement caching with hot/cold storage tiers
  • Add distributed processing with Redis coordination