# SpaceTime Benchmark Suite

Standardized benchmarks for measuring and comparing space-time tradeoffs across algorithms and systems.

## Features

- **Standard Benchmarks**: Sorting, searching, graph algorithms, matrix operations
- **Real-World Workloads**: Database queries, ML training, distributed computing
- **Accurate Measurement**: Time, memory (peak/average), cache misses, throughput
- **Statistical Analysis**: Compare strategies with confidence
- **Reproducible Results**: Controlled environment, result validation
- **Visualization**: Automatic plots and analysis

## Installation

```bash
# From sqrtspace-tools root directory
pip install numpy matplotlib psutil

# For database benchmarks: sqlite3 ships with Python's standard library,
# so no extra install is needed
```

## Quick Start

```bash
# Run quick benchmark suite
python spacetime_benchmarks.py --quick

# Run all benchmarks
python spacetime_benchmarks.py

# Run specific suite
python spacetime_benchmarks.py --suite sorting

# Analyze saved results
python spacetime_benchmarks.py --analyze results_20240315_143022.json
```

## Benchmark Categories

### 1. Sorting Algorithms

Compare memory-time tradeoffs in sorting:

```python
# Strategies benchmarked:
- standard: In-memory quicksort/mergesort (O(n) space)
- sqrt_n: External sort with √n buffer (O(√n) space)
- constant: Streaming sort (O(1) space)

# Example results for n=1,000,000:
Standard: 0.125s, 8.0MB memory
√n buffer: 0.187s, 0.3MB memory (96% less memory, 50% slower)
Streaming: 0.543s, 0.01MB memory (99.9% less memory, 4.3x slower)
```
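
The `sqrt_n` strategy is essentially a classic external merge sort. As a rough illustration (a standard-library sketch, not code from the suite), sort √n-sized runs, spill each run to a temporary file, then stream a k-way merge with `heapq.merge`:

```python
import heapq
import math
import os
import tempfile
from itertools import islice


def external_sort(items, n):
    """Yield items in sorted order using roughly O(sqrt(n)) memory:
    sort sqrt(n)-sized runs, spill each run to a temp file, then k-way merge."""
    buffer_size = max(1, int(math.sqrt(n)))
    items = iter(items)
    run_files = []

    while True:
        run = sorted(islice(items, buffer_size))   # one sqrt(n)-sized run in memory
        if not run:
            break
        f = tempfile.NamedTemporaryFile('w+', delete=False)
        f.writelines(f"{x}\n" for x in run)
        f.seek(0)
        run_files.append(f)

    try:
        yield from heapq.merge(*((int(line) for line in f) for f in run_files))
    finally:
        for f in run_files:
            f.close()
            os.unlink(f.name)


print(list(external_sort([5, 3, 8, 1, 9, 2, 7], n=7)))  # [1, 2, 3, 5, 7, 8, 9]
```

Only one run (about √n items) is held in memory while writing, and the merge keeps roughly one buffered item per run, so peak memory stays around O(√n).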

### 2. Search Data Structures

Compare different index structures:

```python
# Strategies benchmarked:
- hash: Standard hash table (O(n) space)
- btree: B-tree index (O(n) space, cache-friendly)
- external: External index with √n cache

# Example results for n=1,000,000:
Hash table: 0.003s per query, 40MB memory
B-tree: 0.008s per query, 35MB memory
External: 0.025s per query, 2MB memory (95% less)
```
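
For intuition about the `external` strategy, here is a minimal sketch (not the suite's index implementation) of a √n-entry LRU cache in front of a slow external lookup; hot keys are answered from memory while the full index stays external:

```python
import math
from collections import OrderedDict


class SqrtCacheIndex:
    """LRU cache with roughly sqrt(n) entries in front of a slow external lookup."""

    def __init__(self, n, external_lookup):
        self.capacity = max(1, int(math.sqrt(n)))
        self.external_lookup = external_lookup   # e.g. a disk or database read
        self.cache = OrderedDict()

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)          # mark as recently used
            return self.cache[key]
        value = self.external_lookup(key)        # slow path: go to external storage
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)       # evict least recently used entry
        return value


# Toy usage: a plain dict stands in for the external (on-disk) store.
store = {i: i * i for i in range(100_000)}
index = SqrtCacheIndex(n=100_000, external_lookup=store.__getitem__)
print(index.get(42), len(index.cache))  # 1764 1
```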

### 3. Database Operations

Real SQLite database with different cache configurations:

```python
# Strategies benchmarked:
- standard: Default cache size (2000 pages)
- sqrt_n: √n cache pages
- minimal: Minimal cache (10 pages)

# Example results for n=100,000 rows:
Standard: 1000 queries in 0.45s, 16MB cache
√n cache: 1000 queries in 0.52s, 1.2MB cache
Minimal: 1000 queries in 1.83s, 0.08MB cache
```
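
The cache strategies map directly onto SQLite's `PRAGMA cache_size`. A minimal sketch of how such a configuration might be applied (the table, queries, and in-memory database here are illustrative, not the suite's actual schema):

```python
import math
import sqlite3

n_rows = 100_000
strategy = "sqrt_n"   # or "standard", "minimal"

conn = sqlite3.connect(":memory:")   # a real run would use a file-backed database
cache_pages = {
    "standard": 2000,                     # SQLite's historical default
    "sqrt_n": int(math.sqrt(n_rows)),     # ~316 pages for 100,000 rows
    "minimal": 10,
}[strategy]
conn.execute(f"PRAGMA cache_size = {cache_pages}")

# Illustrative workload: insert rows, then run point queries against the cache.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 ((i, f"value-{i}") for i in range(n_rows)))
for key in (7, 42, 99_999):
    row = conn.execute("SELECT value FROM items WHERE id = ?", (key,)).fetchone()
    print(row)
conn.close()
```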

### 4. ML Training

Neural network training with memory optimizations:

```python
# Strategies benchmarked:
- standard: Keep all activations for backprop
- gradient_checkpoint: Recompute activations (√n checkpoints)
- mixed_precision: FP16 compute, FP32 master weights

# Example results for 50,000 samples:
Standard: 2.3s, 195MB peak memory
Checkpointing: 2.8s, 42MB peak memory (78% less)
Mixed precision: 2.1s, 98MB peak memory (50% less)
```
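
If you want to apply the `gradient_checkpoint` idea in your own training code, PyTorch (not a dependency of this suite) ships the recomputation machinery in `torch.utils.checkpoint`; a sketch with roughly √(layers) segments:

```python
import math

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep stack of layers; keeping every activation for backprop is what the
# "standard" strategy does, and it is exactly what checkpointing avoids.
num_layers = 16
model = nn.Sequential(*[nn.Sequential(nn.Linear(512, 512), nn.ReLU())
                        for _ in range(num_layers)])

x = torch.randn(64, 512, requires_grad=True)

# Split the stack into ~sqrt(num_layers) segments: only segment boundaries keep
# their activations; everything in between is recomputed during backward.
segments = max(1, int(math.sqrt(num_layers)))
out = checkpoint_sequential(model, segments, x)
loss = out.sum()
loss.backward()
print(f"segments={segments}, loss={loss.item():.3f}")
```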

### 5. Graph Algorithms

Graph traversal with memory constraints:

```python
# Strategies benchmarked:
- bfs: Standard breadth-first search
- dfs_iterative: Depth-first with explicit stack
- memory_bounded: Limited queue size (like IDA*)

# Example results for n=50,000 nodes:
BFS: 0.18s, 12MB memory (full frontier)
DFS: 0.15s, 4MB memory (stack only)
Bounded: 0.31s, 0.8MB memory (√n queue)
```
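
One standard way to bound traversal memory, and the reason for the "like IDA*" comparison, is iterative deepening. This is a sketch of the general technique, not necessarily the suite's exact `memory_bounded` implementation:

```python
def iddfs_depth(graph, start, target, max_depth=50):
    """Iterative-deepening DFS: run depth-limited DFS with an increasing limit.
    No frontier queue or global visited set is kept, so memory is O(depth),
    at the price of re-expanding shallow nodes on every iteration (fine for
    trees / implicit graphs; cycles are handled with a small on-path check)."""
    def dls(node, depth, limit, on_path):
        if node == target:
            return depth
        if depth == limit:
            return None
        for neighbor in graph.get(node, ()):
            if neighbor in on_path:          # avoid walking around a cycle
                continue
            on_path.add(neighbor)
            found = dls(neighbor, depth + 1, limit, on_path)
            on_path.discard(neighbor)
            if found is not None:
                return found
        return None

    for limit in range(max_depth + 1):
        result = dls(start, 0, limit, {start})
        if result is not None:
            return result
    return None


graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
print(iddfs_depth(graph, 0, 4))  # shortest depth: 3
```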

### 6. Matrix Operations

Cache-aware matrix multiplication:

```python
# Strategies benchmarked:
- standard: Naive multiplication
- blocked: Cache-blocked multiplication
- streaming: Row-by-row streaming

# Example results for 2000×2000 matrices:
Standard: 1.2s, 32MB memory
Blocked: 0.8s, 32MB memory (33% faster)
Streaming: 3.5s, 0.5MB memory (98% less memory)
```
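
A minimal NumPy sketch of the blocked idea (NumPy's `@` already calls a blocked BLAS kernel internally; the explicit loops below just make the tiling visible):

```python
import numpy as np

def blocked_matmul(A, B, block=128):
    """Cache-blocked matrix multiply: work on block x block tiles so each tile
    fits in cache and is reused many times before being evicted."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, block):
        for j in range(0, m, block):
            for p in range(0, k, block):
                # Multiply one pair of tiles and accumulate into the output tile.
                C[i:i + block, j:j + block] += (
                    A[i:i + block, p:p + block] @ B[p:p + block, j:j + block]
                )
    return C

A = np.random.rand(512, 512)
B = np.random.rand(512, 512)
print(np.allclose(blocked_matmul(A, B), A @ B))  # True
```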

## Running Benchmarks

### Command Line Options

```bash
# Run all benchmarks
python spacetime_benchmarks.py

# Quick benchmarks (subset for testing)
python spacetime_benchmarks.py --quick

# Specific suite only
python spacetime_benchmarks.py --suite sorting
python spacetime_benchmarks.py --suite database
python spacetime_benchmarks.py --suite ml

# With automatic plotting
python spacetime_benchmarks.py --plot

# Analyze previous results
python spacetime_benchmarks.py --analyze results_20240315_143022.json
```

### Programmatic Usage

```python
from spacetime_benchmarks import BenchmarkRunner, BenchmarkCategory, benchmark_sorting

runner = BenchmarkRunner()

# Run single benchmark
result = runner.run_benchmark(
    name="Custom Sort",
    category=BenchmarkCategory.SORTING,
    strategy="sqrt_n",
    benchmark_func=benchmark_sorting,
    data_size=1000000
)

print(f"Time: {result.time_seconds:.3f}s")
print(f"Memory: {result.memory_peak_mb:.1f}MB")
print(f"Space-Time Product: {result.space_time_product:.1f}")

# Compare strategies
comparisons = runner.compare_strategies(
    name="Sort Comparison",
    category=BenchmarkCategory.SORTING,
    benchmark_func=benchmark_sorting,
    strategies=["standard", "sqrt_n", "constant"],
    data_sizes=[10000, 100000, 1000000]
)

for comp in comparisons:
    print(f"\n{comp.baseline.strategy} vs {comp.optimized.strategy}:")
    print(f"  Memory reduction: {comp.memory_reduction:.1f}%")
    print(f"  Time overhead: {comp.time_overhead:.1f}%")
    print(f"  Recommendation: {comp.recommendation}")
```

## Custom Benchmarks

Add your own benchmarks:

```python
import numpy as np
from spacetime_benchmarks import BenchmarkRunner, BenchmarkCategory

def benchmark_custom_algorithm(n: int, strategy: str = 'standard', **kwargs) -> int:
    """Custom algorithm with space-time tradeoffs"""

    if strategy == 'standard':
        # O(n) space implementation
        data = list(range(n))
        # ... algorithm ...
        return n  # Return operation count

    elif strategy == 'memory_efficient':
        # O(√n) space implementation
        buffer_size = int(np.sqrt(n))
        # ... algorithm ...
        return n

# Register and run
runner = BenchmarkRunner()
runner.compare_strategies(
    "Custom Algorithm",
    BenchmarkCategory.CUSTOM,
    benchmark_custom_algorithm,
    ["standard", "memory_efficient"],
    [1000, 10000, 100000]
)
```

## Understanding Results

### Key Metrics

1. **Time (seconds)**: Wall-clock execution time
2. **Peak Memory (MB)**: Maximum memory usage during execution
3. **Average Memory (MB)**: Average memory over execution
4. **Throughput (ops/sec)**: Operations completed per second
5. **Space-Time Product**: Memory × Time (lower is better)
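
As a concrete instance of the space-time product: the sample result in the results file below (0.187 s at a peak of 8.2 MB) scores 8.2 MB × 0.187 s ≈ 1.53 MB·s, so a strategy that halves memory while staying under 2× slower still improves this metric.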

### Interpreting Comparisons

```
Comparison standard vs sqrt_n:
  Memory reduction: 94.3%        # How much less memory
  Time overhead: 47.2%           # How much slower
  Space-time improvement: 91.8%  # Overall efficiency gain
  Recommendation: Use sqrt_n for 94% memory savings
```

### When to Use Each Strategy

| Strategy | Use When | Avoid When |
|----------|----------|------------|
| Standard | Memory abundant, speed critical | Memory constrained |
| √n Optimized | Memory limited, moderate slowdown OK | Real-time systems |
| O(log n) | Extreme memory constraints | Random access needed |
| O(1) Space | Streaming data, minimal memory | Need multiple passes |

## Benchmark Output

### Results File Format

```json
{
  "system_info": {
    "cpu_count": 8,
    "memory_gb": 32.0,
    "l3_cache_mb": 12.0
  },
  "results": [
    {
      "name": "Sorting",
      "category": "sorting",
      "strategy": "sqrt_n",
      "data_size": 1000000,
      "time_seconds": 0.187,
      "memory_peak_mb": 8.2,
      "memory_avg_mb": 6.5,
      "throughput": 5347593.5,
      "space_time_product": 1.534,
      "metadata": {
        "success": true,
        "operations": 1000000
      }
    }
  ],
  "timestamp": 1710512345.678
}
```
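
To post-process a saved results file yourself, a few lines of Python are enough (field names as in the sample above; the filename is the example used earlier):

```python
import json

with open("results_20240315_143022.json") as f:
    data = json.load(f)

print(f"CPU count: {data['system_info']['cpu_count']}")
for r in data["results"]:
    # Print one summary line per recorded run.
    print(f"{r['name']:>10} [{r['strategy']:>8}] n={r['data_size']:>9,} "
          f"time={r['time_seconds']:.3f}s peak={r['memory_peak_mb']:.1f}MB "
          f"space-time={r['space_time_product']:.3f}")
```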

### Visualization

Automatic plots show:
- Time complexity curves
- Memory usage scaling
- Space-time product comparison
- Throughput vs data size

## Performance Tips

1. **System Preparation**:
   ```bash
   # Disable CPU frequency scaling
   sudo cpupower frequency-set -g performance

   # Clear caches
   sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
   ```

2. **Accurate Memory Measurement**:
   - Results include Python overhead
   - Use `memory_peak_mb` for maximum usage
   - `memory_avg_mb` shows typical usage

3. **Reproducibility**:
   - Run multiple times and average (see the sketch below)
   - Control background processes
   - Use consistent data sizes
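
For the repeat-and-average advice, a small harness on top of the programmatic API shown earlier might look like this (a sketch, assuming the same imports as the Programmatic Usage example):

```python
import statistics

from spacetime_benchmarks import BenchmarkRunner, BenchmarkCategory, benchmark_sorting

runner = BenchmarkRunner()
times, peaks = [], []
for _ in range(5):                      # repeat runs and report the spread
    result = runner.run_benchmark(
        name="Sorting (repeatability check)",
        category=BenchmarkCategory.SORTING,
        strategy="sqrt_n",
        benchmark_func=benchmark_sorting,
        data_size=1_000_000,
    )
    times.append(result.time_seconds)
    peaks.append(result.memory_peak_mb)

print(f"time: mean={statistics.mean(times):.3f}s stdev={statistics.stdev(times):.3f}s")
print(f"peak memory: mean={statistics.mean(peaks):.1f}MB")
```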

## Extending the Suite

### Adding New Categories

```python
class BenchmarkCategory(Enum):
    # ... existing categories ...
    CUSTOM = "custom"

def custom_suite(runner: BenchmarkRunner):
    """Run custom benchmarks"""
    strategies = ['approach1', 'approach2']
    data_sizes = [1000, 10000, 100000]

    runner.compare_strategies(
        "Custom Workload",
        BenchmarkCategory.CUSTOM,
        benchmark_custom,
        strategies,
        data_sizes
    )
```

### Platform-Specific Metrics

```python
import platform

def get_cache_misses():
    """Get L3 cache misses (Linux perf)"""
    if platform.system() == 'Linux':
        # Use perf_event_open or read from perf
        pass
    return None
```

## Real-World Insights

From our benchmarks:

1. **√n strategies typically save 90-99% memory** with 20-100% time overhead

2. **Cache-aware algorithms can be faster** despite theoretical complexity

3. **Memory bandwidth often dominates** over computational complexity

4. **Optimal strategy depends on**:
   - Data size vs available memory
   - Latency requirements
   - Power/cost constraints

## Troubleshooting

### Memory Measurements Seem Low
- Python may not release memory immediately
- Use `gc.collect()` before benchmarks
- Check for lazy evaluation

### High Variance in Results
- Disable CPU throttling
- Close other applications
- Increase data sizes for stability

### Database Benchmarks Fail
- Ensure write permissions in output directory
- Check SQLite installation
- Verify disk space available

## Contributing

Add new benchmarks following the pattern:

1. Implement `benchmark_*` function
2. Return operation count
3. Handle different strategies
4. Add suite function
5. Update documentation

## See Also

- [SpaceTimeCore](../core/spacetime_core.py): Core calculations
- [Profiler](../profiler/): Profile your applications
- [Visual Explorer](../explorer/): Visualize tradeoffs