# SqrtSpace SpaceTime Sample Web API

This sample demonstrates how to build a memory-efficient Web API using the SqrtSpace SpaceTime library. It showcases real-world scenarios where √n space-time tradeoffs can significantly improve application performance and scalability.

## Features Demonstrated

### 1. **Memory-Efficient Data Processing**

- Streaming large datasets without loading everything into memory
- Automatic batching using √n-sized chunks
- External sorting and aggregation for datasets that exceed memory limits

### 2. **Checkpoint-Enabled Operations**

- Resumable bulk operations that can recover from failures
- Progress tracking for long-running tasks
- Automatic state persistence at optimal intervals

### 3. **Real-World API Patterns**

#### Products Controller (`/api/products`)

- **Paginated queries** - Basic memory control through pagination
- **Streaming endpoints** - Stream millions of products using NDJSON format
- **Smart search** - Automatically switches to external sorting for large result sets
- **Bulk updates** - Checkpoint-enabled price updates that can resume after failures
- **CSV export** - Stream large exports without memory bloat
- **Statistics** - Calculate aggregates over large datasets efficiently

#### Analytics Controller (`/api/analytics`)

- **Revenue analysis** - External grouping for large-scale aggregations
- **Top customers** - Find top N using external sorting when needed
- **Real-time streaming** - Server-Sent Events for continuous analytics
- **Complex reports** - Multi-stage report generation with checkpointing
- **Pattern analysis** - ML-ready data processing with memory constraints
- **Memory monitoring** - Track how the system manages memory

### 4. **Automatic Memory Management**

- Adapts processing strategy based on data size
- Spills to disk when memory pressure is detected
- Provides memory usage statistics for monitoring

## Running the Sample

1. **Start the API:**

   ```bash
   dotnet run
   ```

2.
   **Access Swagger UI:** Navigate to `https://localhost:5001/swagger` to explore the API.

3. **Generate Test Data:** The application automatically seeds the database with:

   - 1,000 customers
   - 10,000 products
   - 50,000 orders

   A background service continuously generates new orders to simulate real-time data.

## Key Scenarios to Try

### 1. Stream Large Dataset

```bash
# Stream all products (10,000+) without loading into memory
curl -N https://localhost:5001/api/products/stream

# The response is newline-delimited JSON (NDJSON)
```

### 2. Bulk Update with Checkpointing

```bash
# Start a bulk price update
curl -X POST https://localhost:5001/api/products/bulk-update-prices \
  -H "Content-Type: application/json" \
  -H "X-Operation-Id: price-update-123" \
  -d '{"categoryFilter": "Electronics", "priceMultiplier": 1.1}'

# If it fails, resume with the same Operation ID
```

### 3. Generate Complex Report

```bash
# Generate a report with automatic checkpointing
curl -X POST https://localhost:5001/api/analytics/reports/generate \
  -H "Content-Type: application/json" \
  -d '{
    "startDate": "2024-01-01",
    "endDate": "2024-12-31",
    "metricsToInclude": ["revenue", "categories", "customers", "products"],
    "includeDetailedBreakdown": true
  }'
```

### 4. Real-Time Analytics Stream

```bash
# Connect to real-time analytics stream
curl -N https://localhost:5001/api/analytics/real-time/orders

# Streams analytics data every second using Server-Sent Events
```

### 5.
Export Large Dataset

```bash
# Export all products to CSV (streams the file)
curl https://localhost:5001/api/products/export/csv > products.csv
```

## Memory Efficiency Examples

### Small Dataset (In-Memory Processing)

When working with small datasets (fewer than 10,000 items), the API uses standard in-memory processing:

```csharp
// Standard LINQ operations
var results = await query
    .Where(p => p.Category == "Books")
    .OrderBy(p => p.Price)
    .ToListAsync();
```

### Large Dataset (External Processing)

For large datasets (more than 10,000 items), the API automatically switches to external processing:

```csharp
// Automatic external sorting
if (count > 10000)
{
    query = query.UseExternalSorting();
}

// Process in √n-sized batches
await foreach (var batch in query.BatchBySqrtNAsync())
{
    // Process batch
}
```

## Configuration

The sample includes configurable memory limits in `appsettings.json`:

```json
{
  "MemoryOptions": {
    "MaxMemoryMB": 512,
    "WarningThresholdPercent": 80
  }
}
```

## Monitoring

Check memory usage statistics:

```bash
curl https://localhost:5001/api/analytics/memory-stats
```

Response:

```json
{
  "currentMemoryUsageMB": 245,
  "peakMemoryUsageMB": 412,
  "externalSortOperations": 3,
  "checkpointsSaved": 15,
  "dataSpilledToDiskMB": 89,
  "cacheHitRate": 0.87,
  "currentMemoryPressure": "Medium"
}
```

## Architecture Highlights

1. **Service Layer**: Encapsulates business logic and SpaceTime optimizations
2. **Entity Framework Integration**: Seamless integration with EF Core queries
3. **Middleware**: Automatic checkpoint and streaming support
4. **Background Services**: Continuous data generation for testing
5. **Memory Monitoring**: Real-time tracking of memory usage

## Best Practices Demonstrated

1. **Know Your Data Size**: Check the count before choosing a processing strategy
2. **Stream When Possible**: Use `IAsyncEnumerable` for large results
3. **Checkpoint Long Operations**: Enable recovery from failures
4. **Monitor Memory Usage**: Track and respond to memory pressure
5.
   **Use External Processing**: Let the library handle large datasets efficiently

## Next Steps

- Modify the memory limits and observe behavior changes
- Add your own endpoints using SpaceTime patterns
- Connect to a real database for production scenarios
- Implement caching with hot/cold storage tiers
- Add distributed processing with Redis coordination
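## Concept Sketch: √n Batching

The √n-sized chunking described under "Memory-Efficient Data Processing" is language-agnostic. As a minimal illustration (in Python rather than the sample's C#; `sqrt_batches` is a hypothetical name, not part of the SpaceTime library), processing n items in ⌊√n⌋-sized batches keeps only O(√n) items buffered at any moment:

```python
import math
from typing import Iterator, List

def sqrt_batches(items: List[int]) -> Iterator[List[int]]:
    """Yield ~sqrt(n)-sized slices so only O(sqrt n) items are buffered."""
    n = len(items)
    batch_size = max(math.isqrt(n), 1)  # floor(sqrt(n)), at least 1
    for start in range(0, n, batch_size):
        yield items[start:start + batch_size]

# 10,000 items split into 100 batches of 100 items each
batches = list(sqrt_batches(list(range(10_000))))
```

Splitting 10,000 items this way yields 100 batches of 100 items, mirroring how the streaming endpoints can walk a large table while holding only a small window in memory.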
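## Concept Sketch: Resumable Checkpointing

The bulk-update scenario resumes after a failure when the request is retried with the same `X-Operation-Id`. The underlying pattern is to persist the index of the last completed item under a stable operation key and skip past it on restart. A sketch of that idea (Python for illustration; `bulk_update` is our simplification, and it checkpoints after every item, whereas the library persists state at optimal intervals):

```python
import json
import os
import tempfile

def bulk_update(items, apply_update, checkpoint_path):
    """Apply an update to each item, persisting the last completed index
    so a rerun with the same checkpoint (operation id) resumes where it
    left off instead of starting over."""
    last_done = -1
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            last_done = json.load(f)["last_done"]
    for i, item in enumerate(items):
        if i <= last_done:
            continue  # already applied before the failure
        apply_update(item)
        with open(checkpoint_path, "w") as f:
            json.dump({"last_done": i}, f)

# Simulate a failure mid-run, then resume with the same checkpoint file
applied = []
checkpoint = os.path.join(tempfile.mkdtemp(), "price-update-123.json")
failed_once = {"done": False}

def flaky_update(item):
    if item == 5 and not failed_once["done"]:
        failed_once["done"] = True
        raise RuntimeError("transient failure")
    applied.append(item)

try:
    bulk_update(range(10), flaky_update, checkpoint)
except RuntimeError:
    pass

bulk_update(range(10), flaky_update, checkpoint)
# applied is now [0, 1, ..., 9]: items 0-4 were not re-applied on resume
```

The second call skips items 0-4 because the checkpoint recorded them as done, which is the behavior the bulk price-update endpoint exposes via the Operation ID header.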
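## Concept Sketch: External Sorting

"Automatic external sorting" above means sorting data that does not fit the memory budget by spilling sorted runs to disk and merging them. A classic external merge sort with √n-sized runs, sketched in Python (not the library's actual implementation), looks like this:

```python
import heapq
import math
import os
import tempfile

def external_sort(values):
    """Sort integers using ~sqrt(n) memory: write sorted sqrt(n)-sized
    runs to temp files, then lazily k-way merge them with heapq.merge."""
    values = list(values)
    n = len(values)
    if n == 0:
        return []
    run_size = max(math.isqrt(n), 1)
    paths = []
    for start in range(0, n, run_size):
        run = sorted(values[start:start + run_size])  # only ~sqrt(n) in memory
        fd, path = tempfile.mkstemp()
        with os.fdopen(fd, "w") as f:
            f.write("\n".join(map(str, run)))
        paths.append(path)

    def read_run(path):
        with open(path) as f:
            for line in f:
                yield int(line)

    merged = list(heapq.merge(*(read_run(p) for p in paths)))
    for p in paths:
        os.remove(p)
    return merged
```

With √n-sized runs there are about √n runs, so both the run buffer and the merge heap stay at O(√n) entries, the same tradeoff the API applies when `count > 10000`.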