SpaceTime Configuration Advisor

Intelligent system configuration advisor that applies Williams' √n space-time tradeoffs to optimize database, JVM, kernel, container, and application settings.

Features

  • System Analysis: Comprehensive hardware profiling (CPU, memory, storage, network)
  • Workload Characterization: Analyze access patterns and resource requirements
  • Multi-System Support: Database, JVM, kernel, container, and application configs
  • √n Optimization: Apply theoretical bounds to real-world settings
  • A/B Testing: Compare configurations with statistical confidence
  • AI Explanations: Clear reasoning for each recommendation

Installation

# From sqrtspace-tools root directory
pip install -r requirements-minimal.txt

Quick Start

from advisor import ConfigurationAdvisor, SystemType

advisor = ConfigurationAdvisor()

# Analyze for database workload
config = advisor.analyze(
    workload_data={
        'read_ratio': 0.8,
        'working_set_gb': 50,
        'total_data_gb': 500,
        'qps': 10000
    },
    target=SystemType.DATABASE
)

print(config.explanation)
# "Database configured with 12.5GB buffer pool (√n sizing), 
#  128MB work memory per operation, and standard checkpointing."

System Types

1. Database Configuration

Optimizes PostgreSQL/MySQL settings:

# E-commerce OLTP workload
config = advisor.analyze(
    workload_data={
        'read_ratio': 0.9,
        'working_set_gb': 20,
        'total_data_gb': 200,
        'qps': 5000,
        'connections': 300,
        'latency_sla_ms': 50
    },
    target=SystemType.DATABASE
)

# Generated PostgreSQL config:
# shared_buffers = 5120MB      # √n sized if data > memory
# work_mem = 21MB              # Per-operation memory
# checkpoint_segments = 16      # Based on write ratio
# max_connections = 600        # 2x concurrent users

2. JVM Configuration

Tunes heap size, GC, and thread settings:

# Low-latency trading system
config = advisor.analyze(
    workload_data={
        'latency_sla_ms': 10,
        'working_set_gb': 8,
        'connections': 100
    },
    target=SystemType.JVM
)

# Generated JVM flags:
# -Xmx16g -Xms16g              # 50% of system memory
# -Xmn512m                     # √n young generation
# -XX:+UseG1GC                # Low-latency GC
# -XX:MaxGCPauseMillis=10     # Match SLA

3. Kernel Configuration

Optimizes Linux kernel parameters:

# High-throughput web server
config = advisor.analyze(
    workload_data={
        'request_rate': 50000,
        'connections': 10000,
        'working_set_gb': 32
    },
    target=SystemType.KERNEL
)

# Generated sysctl settings:
# vm.dirty_ratio = 20
# vm.swappiness = 60
# net.core.somaxconn = 65535
# net.ipv4.tcp_max_syn_backlog = 65535

4. Container Configuration

Sets Docker/Kubernetes resource limits:

# Microservice API
config = advisor.analyze(
    workload_data={
        'working_set_gb': 2,
        'connections': 100,
        'qps': 1000
    },
    target=SystemType.CONTAINER
)

# Generated Docker command:
# docker run --memory=3.0g --cpus=100

5. Application Configuration

Tunes thread pools, caches, and batch sizes:

# Data processing application
config = advisor.analyze(
    workload_data={
        'working_set_gb': 50,
        'connections': 200,
        'batch_size': 10000
    },
    target=SystemType.APPLICATION
)

# Generated settings:
# thread_pool_size: 16         # Based on CPU cores
# connection_pool_size: 200    # Match concurrency
# cache_size: 229,739          # √n entries
# batch_size: 10,000           # Optimized for memory

System Analysis

The advisor automatically profiles your system:

from advisor import SystemAnalyzer

analyzer = SystemAnalyzer()
profile = analyzer.analyze_system()

print(f"CPU: {profile.cpu_count} cores ({profile.cpu_model})")
print(f"Memory: {profile.memory_gb:.1f}GB")
print(f"Storage: {profile.storage_type} ({profile.storage_iops} IOPS)")
print(f"L3 Cache: {profile.l3_cache_mb:.1f}MB")

Workload Analysis

Characterize workloads from metrics or logs:

from advisor import WorkloadAnalyzer

analyzer = WorkloadAnalyzer()

# From metrics
workload = analyzer.analyze_workload(metrics={
    'read_ratio': 0.8,
    'working_set_gb': 100,
    'qps': 10000,
    'connections': 500
})

# From logs
workload = analyzer.analyze_workload(logs=[
    "SELECT * FROM users WHERE id = 123",
    "UPDATE orders SET status = 'shipped'",
    # ... more log entries
])

A/B Testing

Compare configurations scientifically:

# Create two configurations
config_a = advisor.analyze(workload_a, target=SystemType.DATABASE)
config_b = advisor.analyze(workload_b, target=SystemType.DATABASE)

# Run A/B test
results = advisor.compare_configs(
    [config_a, config_b],
    test_duration=300  # 5 minutes
)

for result in results:
    print(f"{result.config_name}:")
    print(f"  Throughput: {result.metrics['throughput']} QPS")
    print(f"  Latency: {result.metrics['latency']} ms")
    print(f"  Winner: {'Yes' if result.winner else 'No'}")

Export Configurations

Save configurations in appropriate formats:

# PostgreSQL config file
advisor.export_config(db_config, "postgresql.conf")

# JVM startup script
advisor.export_config(jvm_config, "jvm_startup.sh")

# JSON for other systems
advisor.export_config(app_config, "app_config.json")

√n Optimization Examples

The advisor applies Williams' space-time tradeoffs:

Database Buffer Pool

For data larger than memory:

  • Traditional: Try to cache everything (thrashing)
  • √n approach: Cache √(data_size) for optimal performance
  • Example: 1TB data → 32GB buffer pool (not 1TB!)

JVM Young Generation

Balance GC frequency vs pause time:

  • Traditional: Fixed percentage (25% of heap)
  • √n approach: √(heap_size) for optimal GC
  • Example: 64GB heap → 8GB young gen

Application Cache

Limited memory for caching:

  • Traditional: LRU with fixed size
  • √n approach: √(total_items) cache entries
  • Example: 1B items → 31,622 cache entries
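
The arithmetic behind these examples is simple. A minimal illustration in plain Python (not the advisor's internal API; sizes follow the unit convention used above, i.e. the square root of the numeric value):

import math

# Database buffer pool: √(data size in GB), e.g. 1TB of data
buffer_pool_gb = math.sqrt(1024)               # ≈ 32GB, not 1TB

# JVM young generation: √(heap size in GB), e.g. a 64GB heap
young_gen_gb = math.sqrt(64)                   # = 8GB

# Application cache: √(total item count) entries, e.g. 1B items
cache_entries = int(math.sqrt(1_000_000_000))  # ≈ 31,622 entries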

Real-World Impact

Similar square-root sizing heuristics appear in widely deployed systems:

  • Google: Bigtable uses √n buffer sizes
  • Facebook: RocksDB applies similar concepts
  • PostgreSQL: Shared buffers tuning
  • JVM: G1GC uses √n heuristics
  • Linux: Page cache management

Advanced Usage

Custom System Types

class CustomConfigGenerator(ConfigurationGenerator):
    def generate_custom_config(self, system, workload):
        # Apply √n principles to your system
        buffer_size = self.sqrt_calc.calculate_optimal_buffer(
            workload.total_data_size_gb * 1024
        )
        return Configuration(...)

Continuous Optimization

import time

# Monitor and adapt over time (collect_metrics, significant_change, and
# apply_config are placeholders for your own monitoring/deployment hooks)
last_metrics = None
while True:
    current_metrics = collect_metrics()

    if last_metrics is None or significant_change(current_metrics, last_metrics):
        new_config = advisor.analyze(
            workload_data=current_metrics,
            target=SystemType.DATABASE
        )
        apply_config(new_config)
        last_metrics = current_metrics

    time.sleep(3600)  # Check hourly

Examples

See example_advisor.py for comprehensive examples:

  • PostgreSQL tuning for OLTP vs OLAP
  • JVM configuration for latency vs throughput
  • Container resource allocation
  • Kernel tuning for different workloads
  • A/B testing configurations
  • Adaptive configuration over time

Troubleshooting

Memory Calculations

  • Buffer sizes are capped at available memory
  • √n sizing is applied only when the data size exceeds available memory
  • Allow for OS overhead (typically ~20% of memory is reserved); the interaction is sketched below
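
As an illustration of how these rules combine, here is a sketch with an assumed helper name and assumed numbers, not the advisor's exact logic:

import math

def suggest_buffer_gb(total_data_gb, working_set_gb, system_memory_gb,
                      os_overhead=0.20):
    """Illustrative sizing: √n when data exceeds memory, capped below available RAM."""
    usable_gb = system_memory_gb * (1 - os_overhead)  # reserve ~20% for the OS
    if total_data_gb > system_memory_gb:
        candidate = math.sqrt(total_data_gb)          # data > memory: √n sizing
    else:
        candidate = working_set_gb                    # everything fits: cache the working set
    return min(candidate, usable_gb)                  # never exceed available memory

# 500GB of data on a 64GB machine → ≈22.4GB buffer, well under the 51.2GB usable
print(suggest_buffer_gb(total_data_gb=500, working_set_gb=50, system_memory_gb=64))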

Performance Testing

  • A/B tests use simulated load; validate with real-world tests before adopting a configuration
  • Confidence intervals require sufficient samples (a minimal check is sketched below)
  • Network conditions affect results for distributed systems
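
For a quick sanity check on sample sufficiency, a minimal confidence-interval comparison in plain Python (independent of the advisor; the throughput numbers are made up and measurement noise is assumed to be roughly normal):

import math
import statistics

def confidence_interval(samples, z=1.96):
    """Approximate 95% confidence interval for the mean of the samples."""
    mean = statistics.mean(samples)
    stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - z * stderr, mean + z * stderr

throughput_a = [9800, 10050, 9900, 10120, 9950]     # QPS samples, config A
throughput_b = [10400, 10550, 10300, 10620, 10480]  # QPS samples, config B

# Non-overlapping intervals suggest a real difference; overlapping intervals
# mean you need more samples or a longer test window.
print(confidence_interval(throughput_a))
print(confidence_interval(throughput_b))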

Future Enhancements

  • Cloud provider specific configs (AWS, GCP, Azure)
  • Kubernetes operator for automatic tuning
  • Machine learning workload detection
  • Integration with monitoring systems
  • Automated rollback on regression

See Also