sqrtspace-experiments/experiments/README.md
2025-07-20 03:56:21 -04:00

3.2 KiB
Raw Blame History

Space-Time Tradeoff Experiments

This directory contains practical experiments demonstrating Williams' theoretical result about space-time tradeoffs in computation. Each experiment has been rigorously tested with real data, multiple trials, and statistical analysis.

Experiments Overview

1. Checkpointed Sorting (Python) ✓

Location: checkpointed_sorting/

External merge sort with limited memory:

  • In-memory O(n): 0.022ms (baseline)
  • Checkpointed O(√n): 8.2ms (375× slower)
  • Extreme O(log n): 152s (6.9M× slower)

Real data from 10 trials with error bars.

2. Maze Solver (C#) ✓

Location: maze_solver/

Graph traversal with memory constraints:

  • BFS: O(n) memory, explores efficiently
  • Memory-Limited: O(√n) memory, ~5× slower
  • Shows path recomputation overhead

3. Stream Processing (Python) ✓

Location: stream_processing/

Sliding window vs full storage:

  • Surprising result: Less memory = 30× faster!
  • Cache locality beats theoretical predictions
  • Demonstrates memory hierarchy effects

4. SQLite Buffer Pool (NEW) ✓

Location: database_buffer_pool/

Real database system (150MB, 50k docs):

  • Tests page cache sizing: O(n), O(√n), O(log n), O(1)
  • Modern SSDs minimize penalties
  • Still follows √n recommendations

5. LLM KV-Cache (NEW) ✓

Location: llm_kv_cache/

Transformer attention memory tradeoffs:

  • Full O(n): 197 tokens/sec
  • Flash O(√n): 1,349 tokens/sec (6.8× faster!)
  • Minimal O(1): 4,169 tokens/sec (21× faster!)
  • Memory bandwidth bottleneck dominates

Quick Start

# Install dependencies
pip install -r requirements.txt

# Run all experiments
./run_all_experiments.sh

# Or run individually:
cd checkpointed_sorting && python run_final_experiment.py
cd ../maze_solver && dotnet run
cd ../stream_processing && python sliding_window.py
cd ../database_buffer_pool && python sqlite_heavy_experiment.py
cd ../llm_kv_cache && python llm_kv_cache_experiment.py

Key Findings

  1. Williams' √n bound confirmed with massive constant factors (100-10,000×)
  2. Memory hierarchies create cliffs: L1→L2→L3→RAM→Disk transitions
  3. Modern hardware changes everything: Fast SSDs, memory bandwidth limits
  4. Cache-aware beats optimal: Locality > theoretical complexity
  5. The pattern is everywhere: Databases, AI, algorithms, systems

Statistical Rigor

All experiments include:

  • Multiple trials (5-20 per configuration)
  • 95% confidence intervals
  • Hardware/software environment logging
  • JSON output for reproducibility
  • Publication-quality plots

Real-World Impact

These patterns appear in:

  • 2+ billion smartphones (SQLite)
  • ChatGPT/Claude/Gemini (KV-cache optimizations)
  • Google/Meta infrastructure (MapReduce, external sorts)
  • Video games (A* pathfinding with memory limits)
  • Embedded systems (severe memory constraints)

Files

  • measurement_framework.py: Profiling utilities
  • FINDINGS.md: Detailed analysis
  • requirements.txt: Dependencies
  • Individual READMEs in each subdirectory

Paper

These experiments support "The Ubiquity of Space-Time Simulation in Modern Computing: From Theory to Practice" which bridges Williams' STOC 2025 result to real systems.