
SpaceTime Benchmark Suite

Standardized benchmarks for measuring and comparing space-time tradeoffs across algorithms and systems.

Features

  • Standard Benchmarks: Sorting, searching, graph algorithms, matrix operations
  • Real-World Workloads: Database queries, ML training, distributed computing
  • Accurate Measurement: Time, memory (peak/average), cache misses, throughput
  • Statistical Analysis: Compare strategies with confidence
  • Reproducible Results: Controlled environment, result validation
  • Visualization: Automatic plots and analysis

Installation

# From sqrtspace-tools root directory
pip install numpy matplotlib psutil

# Database benchmarks use sqlite3, which ships with Python's
# standard library (no separate install needed)

Quick Start

# Run quick benchmark suite
python spacetime_benchmarks.py --quick

# Run all benchmarks
python spacetime_benchmarks.py

# Run specific suite
python spacetime_benchmarks.py --suite sorting

# Analyze saved results
python spacetime_benchmarks.py --analyze results_20240315_143022.json

Benchmark Categories

1. Sorting Algorithms

Compare memory-time tradeoffs in sorting:

# Strategies benchmarked:
- standard: In-memory quicksort/mergesort (O(n) space)
- sqrt_n: External sort with √n buffer (O(√n) space)
- constant: Streaming sort (O(1) space)

# Example results for n=1,000,000:
Standard: 0.125s, 8.0MB memory
√n buffer: 0.187s, 0.3MB memory (96% less memory, 50% slower)
Streaming: 0.543s, 0.01MB memory (99.9% less memory, 4.3x slower)
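
The √n strategy follows the classic external-sort pattern. A minimal sketch of the idea (illustrative only, not the suite's implementation; assumes integer input):

import heapq
import math
import os
import tempfile

def sqrt_external_sort(values):
    """Illustrative √n external sort: sort √n-sized runs, spill each
    run to disk, then k-way merge so only one item per run needs to
    be resident at a time."""
    run_size = max(1, math.isqrt(len(values)))
    run_files = []
    for i in range(0, len(values), run_size):
        run = sorted(values[i:i + run_size])   # only ~√n items in memory
        f = tempfile.NamedTemporaryFile(mode='w+', delete=False)
        f.writelines(f"{v}\n" for v in run)
        f.seek(0)
        run_files.append(f)
    runs = ((int(line) for line in f) for f in run_files)
    result = list(heapq.merge(*runs))          # in practice, stream this out
    for f in run_files:
        f.close()
        os.unlink(f.name)
    return result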

2. Search Data Structures

Compare different index structures:

# Strategies benchmarked:
- hash: Standard hash table (O(n) space)
- btree: B-tree index (O(n) space, cache-friendly)
- external: External index with √n cache

# Example results for n=1,000,000:
Hash table: 0.003s per query, 40MB memory
B-tree: 0.008s per query, 35MB memory
External: 0.025s per query, 2MB memory (95% less)
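
The external strategy amounts to a small hot cache in front of slower storage. A minimal LRU sketch of the idea (ExternalIndex and disk_lookup are illustrative names, not the suite's API):

from collections import OrderedDict

class ExternalIndex:
    """Keep only ~√n hot entries in an LRU cache; misses fall
    through to a slower on-disk lookup (illustrative sketch)."""
    def __init__(self, disk_lookup, cache_size):
        self.disk_lookup = disk_lookup    # callable: key -> value (slow path)
        self.cache = OrderedDict()
        self.cache_size = cache_size      # ~√n entries

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)   # mark as recently used
            return self.cache[key]
        value = self.disk_lookup(key)     # cache miss: hit disk
        self.cache[key] = value
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used
        return value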

3. Database Operations

Real SQLite database with different cache configurations:

# Strategies benchmarked:
- standard: Default cache size (2000 pages)
- sqrt_n: √n cache pages
- minimal: Minimal cache (10 pages)

# Example results for n=100,000 rows:
Standard: 1000 queries in 0.45s, 16MB cache
√n cache: 1000 queries in 0.52s, 1.2MB cache
Minimal: 1000 queries in 1.83s, 0.08MB cache
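
SQLite exposes this knob directly through PRAGMA cache_size. A sketch of how the three configurations above can be applied (open_with_cache is an illustrative helper, not part of the suite):

import math
import sqlite3

def open_with_cache(path, n_rows, strategy='sqrt_n'):
    """Open a connection with one of the cache configurations above."""
    pages = {'standard': 2000,                       # default page count
             'sqrt_n': max(10, math.isqrt(n_rows)),  # ~√n pages
             'minimal': 10}[strategy]
    conn = sqlite3.connect(path)
    conn.execute(f"PRAGMA cache_size = {pages}")     # positive value = pages
    return conn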

4. ML Training

Neural network training with memory optimizations:

# Strategies benchmarked:
- standard: Keep all activations for backprop
- gradient_checkpoint: Recompute activations (√n checkpoints)
- mixed_precision: FP16 compute, FP32 master weights

# Example results for 50,000 samples:
Standard: 2.3s, 195MB peak memory
Checkpointing: 2.8s, 42MB peak memory (78% less)
Mixed precision: 2.1s, 98MB peak memory (50% less)
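
Gradient checkpointing trades recomputation for memory: keep activations only every ~√L layers and re-run the forward pass between checkpoints during backprop. A conceptual, framework-free sketch (illustrative only):

import math

def forward_with_checkpoints(layers, x):
    """Store activations only every ~√L layers."""
    stride = max(1, math.isqrt(len(layers)))
    checkpoints = {0: x}
    for i, layer in enumerate(layers):
        x = layer(x)
        if (i + 1) % stride == 0:
            checkpoints[i + 1] = x        # ~√L stored activations total
    return x, checkpoints

def recompute_segment(layers, checkpoints, start, end):
    """Re-run forward from the nearest checkpoint to recover the
    activations one backprop segment needs."""
    x = checkpoints[start]
    activations = [x]
    for layer in layers[start:end]:
        x = layer(x)
        activations.append(x)
    return activations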

5. Graph Algorithms

Graph traversal with memory constraints:

# Strategies benchmarked:
- bfs: Standard breadth-first search
- dfs_iterative: Depth-first with explicit stack
- memory_bounded: Limited queue size (like IDA*)

# Example results for n=50,000 nodes:
BFS: 0.18s, 12MB memory (full frontier)
DFS: 0.15s, 4MB memory (stack only)
Bounded: 0.31s, 0.8MB memory (√n queue)
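
The memory-bounded strategy follows the iterative-deepening idea noted above: repeated depth-limited searches keep only the current path in memory, at the cost of re-expanding shallow nodes. A minimal sketch for dict-of-lists graphs (illustrative only):

def iterative_deepening_search(graph, start, goal, max_depth=50):
    """IDA*-style traversal: memory stays O(depth) because only the
    current path is stored; shallow nodes are re-expanded each pass."""
    def dls(node, depth, path):
        if node == goal:
            return path
        if depth == 0:
            return None
        for nbr in graph.get(node, ()):
            if nbr not in path:              # avoid cycles along the path
                found = dls(nbr, depth - 1, path + [nbr])
                if found:
                    return found
        return None

    for depth in range(max_depth + 1):       # deepen one level at a time
        result = dls(start, depth, [start])
        if result:
            return result
    return None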

6. Matrix Operations

Cache-aware matrix multiplication:

# Strategies benchmarked:
- standard: Naive multiplication
- blocked: Cache-blocked multiplication
- streaming: Row-by-row streaming

# Example results for 2000×2000 matrices:
Standard: 1.2s, 32MB memory
Blocked: 0.8s, 32MB memory (33% faster)
Streaming: 3.5s, 0.5MB memory (98% less memory)
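
Blocking multiplies tile-by-tile so each tile is reused while it is still cache-resident, which is the layout change behind the speedup above. A minimal NumPy sketch for square matrices (the block size of 64 is an assumption; tune it to your cache):

import numpy as np

def blocked_matmul(A, B, block=64):
    """Cache-blocked multiplication over block x block tiles."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i in range(0, n, block):
        for k in range(0, n, block):
            for j in range(0, n, block):
                # slicing clamps at the edges, so partial tiles just work
                C[i:i+block, j:j+block] += (
                    A[i:i+block, k:k+block] @ B[k:k+block, j:j+block])
    return C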

Running Benchmarks

Command Line Options

# Run all benchmarks
python spacetime_benchmarks.py

# Quick benchmarks (subset for testing)
python spacetime_benchmarks.py --quick

# Specific suite only
python spacetime_benchmarks.py --suite sorting
python spacetime_benchmarks.py --suite database
python spacetime_benchmarks.py --suite ml

# With automatic plotting
python spacetime_benchmarks.py --plot

# Analyze previous results
python spacetime_benchmarks.py --analyze results_20240315_143022.json

Programmatic Usage

from spacetime_benchmarks import BenchmarkRunner, BenchmarkCategory, benchmark_sorting

runner = BenchmarkRunner()

# Run single benchmark
result = runner.run_benchmark(
    name="Custom Sort",
    category=BenchmarkCategory.SORTING,
    strategy="sqrt_n",
    benchmark_func=benchmark_sorting,
    data_size=1000000
)

print(f"Time: {result.time_seconds:.3f}s")
print(f"Memory: {result.memory_peak_mb:.1f}MB")
print(f"Space-Time Product: {result.space_time_product:.1f}")

# Compare strategies
comparisons = runner.compare_strategies(
    name="Sort Comparison",
    category=BenchmarkCategory.SORTING,
    benchmark_func=benchmark_sorting,
    strategies=["standard", "sqrt_n", "constant"],
    data_sizes=[10000, 100000, 1000000]
)

for comp in comparisons:
    print(f"\n{comp.baseline.strategy} vs {comp.optimized.strategy}:")
    print(f"  Memory reduction: {comp.memory_reduction:.1f}%")
    print(f"  Time overhead: {comp.time_overhead:.1f}%")
    print(f"  Recommendation: {comp.recommendation}")

Custom Benchmarks

Add your own benchmarks:

import numpy as np

def benchmark_custom_algorithm(n: int, strategy: str = 'standard', **kwargs) -> int:
    """Custom algorithm with space-time tradeoffs"""

    if strategy == 'standard':
        # O(n) space implementation
        data = list(range(n))
        # ... algorithm ...
        return n  # Return operation count

    elif strategy == 'memory_efficient':
        # O(√n) space implementation
        buffer_size = int(np.sqrt(n))
        # ... algorithm ...
        return n

    raise ValueError(f"Unknown strategy: {strategy}")
        
# Register and run
runner = BenchmarkRunner()
runner.compare_strategies(
    "Custom Algorithm",
    BenchmarkCategory.CUSTOM,
    benchmark_custom_algorithm,
    ["standard", "memory_efficient"],
    [1000, 10000, 100000]
)

Understanding Results

Key Metrics

  1. Time (seconds): Wall-clock execution time
  2. Peak Memory (MB): Maximum memory usage during execution
  3. Average Memory (MB): Average memory over execution
  4. Throughput (ops/sec): Operations completed per second
  5. Space-Time Product: Memory × Time (lower is better)

Interpreting Comparisons

Comparison standard vs sqrt_n:
  Memory reduction: 94.3%      # How much less memory
  Time overhead: 47.2%         # How much slower
  Space-time improvement: 91.8% # Overall efficiency gain
  Recommendation: Use sqrt_n for 94% memory savings
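
These percentages follow directly from the raw metrics. A sketch of the arithmetic, using the result fields shown elsewhere in this document (not necessarily the suite's exact code):

def summarize(baseline, optimized):
    """Derive the comparison percentages from two benchmark results."""
    memory_reduction = (1 - optimized.memory_peak_mb
                          / baseline.memory_peak_mb) * 100
    time_overhead = (optimized.time_seconds
                     / baseline.time_seconds - 1) * 100
    # space-time product is memory x time, so lower is better
    spacetime_improvement = (1 - optimized.space_time_product
                               / baseline.space_time_product) * 100
    return memory_reduction, time_overhead, spacetime_improvement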

When to Use Each Strategy

Strategy       Use When                                  Avoid When
-------------  ----------------------------------------  --------------------
Standard       Memory abundant, speed critical           Memory constrained
√n Optimized   Memory limited, moderate slowdown OK      Real-time systems
O(log n)       Extreme memory constraints                Random access needed
O(1) Space     Streaming data, minimal memory            Need multiple passes

Benchmark Output

Results File Format

{
  "system_info": {
    "cpu_count": 8,
    "memory_gb": 32.0,
    "l3_cache_mb": 12.0
  },
  "results": [
    {
      "name": "Sorting",
      "category": "sorting",
      "strategy": "sqrt_n",
      "data_size": 1000000,
      "time_seconds": 0.187,
      "memory_peak_mb": 8.2,
      "memory_avg_mb": 6.5,
      "throughput": 5347593.5,
      "space_time_product": 1.534,
      "metadata": {
        "success": true,
        "operations": 1000000
      }
    }
  ],
  "timestamp": 1710512345.678
}
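
Because results are plain JSON, they are easy to post-process outside the suite, for example:

import json

with open("results_20240315_143022.json") as f:
    data = json.load(f)

# Rank strategies by space-time product (lower is better)
for r in sorted(data["results"], key=lambda r: r["space_time_product"]):
    print(f"{r['strategy']:>10}: {r['time_seconds']:.3f}s, "
          f"{r['memory_peak_mb']:.1f}MB peak")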

Visualization

Automatic plots show:

  • Time complexity curves
  • Memory usage scaling
  • Space-time product comparison
  • Throughput vs data size

Performance Tips

  1. System Preparation:

    # Disable CPU frequency scaling
    sudo cpupower frequency-set -g performance
    
    # Clear caches
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
    
  2. Accurate Memory Measurement:

    • Results include Python overhead
    • Use memory_peak_mb for maximum usage
    • memory_avg_mb shows typical usage
  3. Reproducibility:

    • Run multiple times and average
    • Control background processes
    • Use consistent data sizes

Extending the Suite

Adding New Categories

from enum import Enum

class BenchmarkCategory(Enum):
    # ... existing categories ...
    CUSTOM = "custom"

def custom_suite(runner: BenchmarkRunner):
    """Run custom benchmarks"""
    strategies = ['approach1', 'approach2']
    data_sizes = [1000, 10000, 100000]
    
    runner.compare_strategies(
        "Custom Workload",
        BenchmarkCategory.CUSTOM,
        benchmark_custom,
        strategies,
        data_sizes
    )

Platform-Specific Metrics

import platform

def get_cache_misses():
    """Get L3 cache misses (Linux perf)"""
    if platform.system() == 'Linux':
        # Use perf_event_open or parse `perf stat` output
        pass
    return None

Real-World Insights

From our benchmarks:

  1. √n strategies typically save 90-99% memory with 20-100% time overhead

  2. Cache-aware algorithms can be faster despite theoretical complexity

  3. Memory bandwidth often dominates over computational complexity

  4. Optimal strategy depends on:

    • Data size vs available memory
    • Latency requirements
    • Power/cost constraints

Troubleshooting

Memory Measurements Seem Low

  • Python may not release memory immediately
  • Use gc.collect() before benchmarks (see the snippet below)
  • Check for lazy evaluation
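
For example, to flush garbage and sample a clean memory baseline before a run (psutil is already a dependency of the suite):

import gc
import psutil

gc.collect()                                     # flush unreachable objects
rss_mb = psutil.Process().memory_info().rss / 2**20
print(f"Baseline RSS: {rss_mb:.1f}MB")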

High Variance in Results

  • Disable CPU throttling
  • Close other applications
  • Increase data sizes for stability

Database Benchmarks Fail

  • Ensure write permissions in output directory
  • Check SQLite installation
  • Verify disk space available

Contributing

Add new benchmarks following the pattern:

  1. Implement benchmark_* function
  2. Return operation count
  3. Handle different strategies
  4. Add suite function
  5. Update documentation

See Also