# SqrtSpace SpaceTime Sample Web API
This sample demonstrates how to build a memory-efficient Web API using the SqrtSpace SpaceTime library. It showcases real-world scenarios where √n space-time tradeoffs can significantly improve application performance and scalability.
## Features Demonstrated

### 1. Memory-Efficient Data Processing

- Streaming large datasets without loading everything into memory
- Automatic batching using √n-sized chunks
- External sorting and aggregation for datasets that exceed memory limits
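The √n chunking idea can be sketched in a few lines of plain C#. This is an illustration of the tradeoff only, not the library's actual API: holding one chunk of ~√n items at a time bounds resident memory at O(√n) instead of O(n).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SqrtChunking
{
    // Illustrative sketch (not the SpaceTime API): split a materialized
    // list into chunks of ~√n items each, so only one chunk needs to be
    // resident in memory at a time.
    public static IEnumerable<List<T>> ChunkBySqrtN<T>(IReadOnlyList<T> source)
    {
        int chunkSize = Math.Max(1, (int)Math.Sqrt(source.Count));
        for (int i = 0; i < source.Count; i += chunkSize)
        {
            yield return source.Skip(i).Take(chunkSize).ToList();
        }
    }
}
```

For n = 10,000 items this produces chunks of about 100 items each, which is the balance point between number of passes and per-pass memory.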
### 2. Checkpoint-Enabled Operations

- Resumable bulk operations that can recover from failures
- Progress tracking for long-running tasks
- Automatic state persistence at optimal intervals
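The shape of a resumable operation can be sketched as follows. `ICheckpointStore` and its methods are hypothetical names for illustration, not the library's real API; the point is the loop structure: load the last saved position, process forward, and persist progress at √n-spaced intervals so a crash loses at most ~√n items of work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical checkpoint abstraction, named for this sketch only.
public interface ICheckpointStore
{
    Task<int?> LoadAsync(string operationId);
    Task SaveAsync(string operationId, int position);
}

static class ResumableProcessing
{
    public static async Task RunAsync<T>(
        IReadOnlyList<T> items,
        string operationId,
        Func<T, Task> processAsync,
        ICheckpointStore store)
    {
        int start = await store.LoadAsync(operationId) ?? 0;     // resume point, 0 on first run
        int interval = Math.Max(1, (int)Math.Sqrt(items.Count)); // √n checkpoint spacing
        for (int i = start; i < items.Count; i++)
        {
            await processAsync(items[i]);
            if ((i + 1) % interval == 0)
                await store.SaveAsync(operationId, i + 1);       // persist completed count
        }
    }
}
```

Re-running with the same operation ID skips everything before the last saved position, which is the behavior the bulk-update endpoint below relies on.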
### 3. Real-World API Patterns

#### Products Controller (`/api/products`)

- **Paginated queries** - Basic memory control through pagination
- **Streaming endpoints** - Stream millions of products using NDJSON format
- **Smart search** - Automatically switches to external sorting for large result sets
- **Bulk updates** - Checkpoint-enabled price updates that can resume after failures
- **CSV export** - Stream large exports without memory bloat
- **Statistics** - Calculate aggregates over large datasets efficiently

#### Analytics Controller (`/api/analytics`)

- **Revenue analysis** - External grouping for large-scale aggregations
- **Top customers** - Find the top N using external sorting when needed
- **Real-time streaming** - Server-Sent Events for continuous analytics
- **Complex reports** - Multi-stage report generation with checkpointing
- **Pattern analysis** - ML-ready data processing with memory constraints
- **Memory monitoring** - Track how the system manages memory
### 4. Automatic Memory Management

- Adapts the processing strategy based on data size
- Spills to disk when memory pressure is detected
- Provides memory usage statistics for monitoring
## Running the Sample

1. Start the API:

   ```bash
   dotnet run
   ```

2. Access Swagger UI: navigate to `https://localhost:5001/swagger` to explore the API.

3. Generate test data: the application automatically seeds the database with:
   - 1,000 customers
   - 10,000 products
   - 50,000 orders

A background service continuously generates new orders to simulate real-time data.
## Key Scenarios to Try

### 1. Stream Large Dataset

```bash
# Stream all products (10,000+) without loading them into memory
curl -N https://localhost:5001/api/products/stream

# The response is newline-delimited JSON (NDJSON)
```
### 2. Bulk Update with Checkpointing

```bash
# Start a bulk price update
curl -X POST https://localhost:5001/api/products/bulk-update-prices \
  -H "Content-Type: application/json" \
  -H "X-Operation-Id: price-update-123" \
  -d '{"categoryFilter": "Electronics", "priceMultiplier": 1.1}'

# If it fails, resume by repeating the request with the same X-Operation-Id
```
### 3. Generate Complex Report

```bash
# Generate a report with automatic checkpointing
curl -X POST https://localhost:5001/api/analytics/reports/generate \
  -H "Content-Type: application/json" \
  -d '{
    "startDate": "2024-01-01",
    "endDate": "2024-12-31",
    "metricsToInclude": ["revenue", "categories", "customers", "products"],
    "includeDetailedBreakdown": true
  }'
```
### 4. Real-Time Analytics Stream

```bash
# Connect to the real-time analytics stream
curl -N https://localhost:5001/api/analytics/real-time/orders

# Streams analytics data every second using Server-Sent Events
```
### 5. Export Large Dataset

```bash
# Export all products to CSV (streams the file)
curl https://localhost:5001/api/products/export/csv > products.csv
```
## Memory Efficiency Examples

### Small Dataset (In-Memory Processing)

When working with small datasets (fewer than 10,000 items), the API uses standard in-memory processing:

```csharp
// Standard LINQ operations
var results = await query
    .Where(p => p.Category == "Books")
    .OrderBy(p => p.Price)
    .ToListAsync();
```
### Large Dataset (External Processing)

For large datasets (more than 10,000 items), the API automatically switches to external processing:

```csharp
// Automatic external sorting
if (count > 10000)
{
    query = query.UseExternalSorting();
}

// Process in √n-sized batches
await foreach (var batch in query.BatchBySqrtNAsync())
{
    // Process each batch here
}
```
## Configuration

The sample includes configurable memory limits in `appsettings.json`:

```json
{
  "MemoryOptions": {
    "MaxMemoryMB": 512,
    "WarningThresholdPercent": 80
  }
}
```
## Monitoring

Check memory usage statistics:

```bash
curl https://localhost:5001/api/analytics/memory-stats
```

Response:

```json
{
  "currentMemoryUsageMB": 245,
  "peakMemoryUsageMB": 412,
  "externalSortOperations": 3,
  "checkpointsSaved": 15,
  "dataSpilledToDiskMB": 89,
  "cacheHitRate": 0.87,
  "currentMemoryPressure": "Medium"
}
```
## Architecture Highlights

- **Service layer** - Encapsulates business logic and SpaceTime optimizations
- **Entity Framework integration** - Seamless integration with EF Core queries
- **Middleware** - Automatic checkpoint and streaming support
- **Background services** - Continuous data generation for testing
- **Memory monitoring** - Real-time tracking of memory usage
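A minimal `Program.cs` wiring for these layers might look like the following. The `Configure<MemoryOptions>` binding and `AddHostedService` call are standard ASP.NET Core; the `MemoryOptions` and `OrderGeneratorService` type names are placeholders for this sample's own classes, and any SpaceTime-specific registration the library may require is omitted because its API is not shown here.

```csharp
// Sketch of composing the sample's layers in Program.cs (ASP.NET Core).
// MemoryOptions and OrderGeneratorService are illustrative placeholder names.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

// Bind the memory limits from the "MemoryOptions" section of appsettings.json.
builder.Services.Configure<MemoryOptions>(
    builder.Configuration.GetSection("MemoryOptions"));

// Background service that continuously generates test orders.
builder.Services.AddHostedService<OrderGeneratorService>();

var app = builder.Build();
app.UseSwagger();
app.UseSwaggerUI();
app.MapControllers();
app.Run();
```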
## Best Practices Demonstrated

- **Know your data size** - Check the item count before choosing a processing strategy
- **Stream when possible** - Use `IAsyncEnumerable` for large results
- **Checkpoint long operations** - Enable recovery from failures
- **Monitor memory usage** - Track and respond to memory pressure
- **Use external processing** - Let the library handle large datasets efficiently
Next Steps
- Modify the memory limits and observe behavior changes
- Add your own endpoints using SpaceTime patterns
- Connect to a real database for production scenarios
- Implement caching with hot/cold storage tiers
- Add distributed processing with Redis coordination