# SqrtSpace SpaceTime Sample Web API
This sample demonstrates how to build a memory-efficient Web API using the SqrtSpace SpaceTime library. It showcases real-world scenarios where √n space-time tradeoffs can significantly improve application performance and scalability.
## Features Demonstrated
### 1. **Memory-Efficient Data Processing**
- Streaming large datasets without loading everything into memory
- Automatic batching using √n-sized chunks
- External sorting and aggregation for datasets that exceed memory limits
### 2. **Checkpoint-Enabled Operations**
- Resumable bulk operations that can recover from failures
- Progress tracking for long-running tasks
- Automatic state persistence at optimal intervals
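In application code, a resumable operation of this shape can be sketched as follows. The `checkpoints` store, its `LoadAsync`/`SaveAsync` methods, and `BulkUpdateState` are illustrative assumptions, not the library's confirmed API; `BatchBySqrtNAsync` is the batching operator shown later in this README:

```csharp
// Sketch: resume a bulk operation from its last checkpoint.
// checkpoints, BulkUpdateState, and operationId are assumed names.
var state = await checkpoints.LoadAsync<BulkUpdateState>(operationId)
            ?? new BulkUpdateState { LastProcessedId = 0 };

await foreach (var batch in query
    .Where(p => p.Id > state.LastProcessedId)
    .BatchBySqrtNAsync())                 // √n-sized batches
{
    await ProcessBatchAsync(batch);
    state.LastProcessedId = batch[^1].Id; // record progress per batch
    await checkpoints.SaveAsync(operationId, state);
}
```

Persisting state once per √n-sized batch keeps checkpoint overhead proportional to the batch count rather than the item count.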
### 3. **Real-World API Patterns**
#### Products Controller (`/api/products`)
- **Paginated queries** - Basic memory control through pagination
- **Streaming endpoints** - Stream millions of products using NDJSON format
- **Smart search** - Automatically switches to external sorting for large result sets
- **Bulk updates** - Checkpoint-enabled price updates that can resume after failures
- **CSV export** - Stream large exports without memory bloat
- **Statistics** - Calculate aggregates over large datasets efficiently
#### Analytics Controller (`/api/analytics`)
- **Revenue analysis** - External grouping for large-scale aggregations
- **Top customers** - Find top N using external sorting when needed
- **Real-time streaming** - Server-Sent Events for continuous analytics
- **Complex reports** - Multi-stage report generation with checkpointing
- **Pattern analysis** - ML-ready data processing with memory constraints
- **Memory monitoring** - Track how the system manages memory
### 4. **Automatic Memory Management**
- Adapts processing strategy based on data size
- Spills to disk when memory pressure is detected
- Provides memory usage statistics for monitoring
## Running the Sample
1. **Start the API:**

   ```bash
   dotnet run
   ```

2. **Access Swagger UI:**

   Navigate to `https://localhost:5001/swagger` to explore the API.

3. **Generate Test Data:**

   The application automatically seeds the database with:

   - 1,000 customers
   - 10,000 products
   - 50,000 orders

   A background service continuously generates new orders to simulate real-time data.
## Key Scenarios to Try
### 1. Stream Large Dataset
```bash
# Stream all products (10,000+) without loading into memory
curl -N https://localhost:5001/api/products/stream
# The response is newline-delimited JSON (NDJSON)
```
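On the server side, an endpoint of roughly this shape can back the stream. The `_products.StreamAllAsync` service call is an assumption for illustration, and the NDJSON formatting is handled by the sample's streaming middleware rather than by the action itself:

```csharp
// Sketch: items are yielded one at a time, so the full product list
// is never materialized in memory. Requires System.Runtime.CompilerServices
// for [EnumeratorCancellation].
[HttpGet("stream")]
public async IAsyncEnumerable<Product> StreamProducts(
    [EnumeratorCancellation] CancellationToken ct)
{
    await foreach (var product in _products.StreamAllAsync(ct)) // assumed service
        yield return product;
}
```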
### 2. Bulk Update with Checkpointing
```bash
# Start a bulk price update
curl -X POST https://localhost:5001/api/products/bulk-update-prices \
  -H "Content-Type: application/json" \
  -H "X-Operation-Id: price-update-123" \
  -d '{"categoryFilter": "Electronics", "priceMultiplier": 1.1}'
# If it fails, resume with the same Operation ID
```
### 3. Generate Complex Report
```bash
# Generate a report with automatic checkpointing
curl -X POST https://localhost:5001/api/analytics/reports/generate \
  -H "Content-Type: application/json" \
  -d '{
    "startDate": "2024-01-01",
    "endDate": "2024-12-31",
    "metricsToInclude": ["revenue", "categories", "customers", "products"],
    "includeDetailedBreakdown": true
  }'
```
### 4. Real-Time Analytics Stream
```bash
# Connect to real-time analytics stream
curl -N https://localhost:5001/api/analytics/real-time/orders
# Streams analytics data every second using Server-Sent Events
```
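A minimal shape for such an endpoint is sketched below; the `_analytics.GetOrderSnapshotAsync` service method is an assumption for illustration:

```csharp
// Sketch: Server-Sent Events loop emitting one analytics snapshot per second.
[HttpGet("real-time/orders")]
public async Task StreamOrderAnalytics(CancellationToken ct)
{
    Response.ContentType = "text/event-stream";
    while (!ct.IsCancellationRequested)
    {
        var snapshot = await _analytics.GetOrderSnapshotAsync(ct); // assumed method
        await Response.WriteAsync($"data: {JsonSerializer.Serialize(snapshot)}\n\n", ct);
        await Response.Body.FlushAsync(ct);                        // push the event now
        await Task.Delay(TimeSpan.FromSeconds(1), ct);
    }
}
```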
### 5. Export Large Dataset
```bash
# Export all products to CSV (streams the file)
curl https://localhost:5001/api/products/export/csv > products.csv
```
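A streaming CSV export can be sketched like this; `_products.StreamAllAsync` and the column set are assumptions for illustration:

```csharp
// Sketch: write CSV rows as they are read, keeping memory use flat
// regardless of how many products are exported.
[HttpGet("export/csv")]
public async Task ExportCsv(CancellationToken ct)
{
    Response.ContentType = "text/csv";
    await Response.WriteAsync("Id,Name,Category,Price\n", ct);
    await foreach (var p in _products.StreamAllAsync(ct)) // assumed service
        // Naive quoting; a real export should also escape embedded quotes.
        await Response.WriteAsync($"{p.Id},\"{p.Name}\",{p.Category},{p.Price}\n", ct);
}
```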
## Memory Efficiency Examples
### Small Dataset (In-Memory Processing)
When working with small datasets (<10,000 items), the API uses standard in-memory processing:
```csharp
// Standard LINQ operations
var results = await query
    .Where(p => p.Category == "Books")
    .OrderBy(p => p.Price)
    .ToListAsync();
```
### Large Dataset (External Processing)
For large datasets (>10,000 items), the API automatically switches to external processing:
```csharp
// Automatic external sorting
if (count > 10000)
{
    query = query.UseExternalSorting();
}

// Process in √n-sized batches
await foreach (var batch in query.BatchBySqrtNAsync())
{
    // Process batch
}
```
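To make the tradeoff concrete: with the seeded 50,000 orders, a √n batch holds about 224 rows, so at most ~224 rows are resident at once instead of 50,000, at the cost of roughly 224 batch boundaries:

```csharp
int n = 50_000;
int batchSize  = (int)Math.Ceiling(Math.Sqrt(n));  // 224 for n = 50,000
int batchCount = (n + batchSize - 1) / batchSize;  // 224 batches
```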
## Configuration
The sample includes configurable memory limits in `appsettings.json`:
```json
{
  "MemoryOptions": {
    "MaxMemoryMB": 512,
    "WarningThresholdPercent": 80
  }
}
```
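These values can be bound with the standard ASP.NET Core options pattern; the `MemoryOptions` class below is an illustrative assumption that mirrors the JSON keys:

```csharp
// Assumed options class matching the configuration section above.
public sealed class MemoryOptions
{
    public int MaxMemoryMB { get; set; } = 512;
    public int WarningThresholdPercent { get; set; } = 80;
}

// Program.cs: bind the "MemoryOptions" section for injection via IOptions<MemoryOptions>.
builder.Services.Configure<MemoryOptions>(
    builder.Configuration.GetSection("MemoryOptions"));
```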
## Monitoring
Check memory usage statistics:
```bash
curl https://localhost:5001/api/analytics/memory-stats
```
Response:
```json
{
  "currentMemoryUsageMB": 245,
  "peakMemoryUsageMB": 412,
  "externalSortOperations": 3,
  "checkpointsSaved": 15,
  "dataSpilledToDiskMB": 89,
  "cacheHitRate": 0.87,
  "currentMemoryPressure": "Medium"
}
```
## Architecture Highlights
1. **Service Layer**: Encapsulates business logic and SpaceTime optimizations
2. **Entity Framework Integration**: Seamless integration with EF Core queries
3. **Middleware**: Automatic checkpoint and streaming support
4. **Background Services**: Continuous data generation for testing
5. **Memory Monitoring**: Real-time tracking of memory usage
## Best Practices Demonstrated
1. **Know Your Data Size**: Check count before choosing processing strategy
2. **Stream When Possible**: Use IAsyncEnumerable for large results
3. **Checkpoint Long Operations**: Enable recovery from failures
4. **Monitor Memory Usage**: Track and respond to memory pressure
5. **Use External Processing**: Let the library handle large datasets efficiently
## Next Steps
- Modify the memory limits and observe behavior changes
- Add your own endpoints using SpaceTime patterns
- Connect to a real database for production scenarios
- Implement caching with hot/cold storage tiers
- Add distributed processing with Redis coordination