# SpaceTime Configuration Advisor

Intelligent system configuration advisor that applies Williams' √n space-time tradeoffs to optimize database, JVM, kernel, container, and application settings.

## Features

- **System Analysis**: Comprehensive hardware profiling (CPU, memory, storage, network)
- **Workload Characterization**: Analyze access patterns and resource requirements
- **Multi-System Support**: Database, JVM, kernel, container, and application configs
- **√n Optimization**: Apply theoretical bounds to real-world settings
- **A/B Testing**: Compare configurations with statistical confidence
- **AI Explanations**: Clear reasoning for each recommendation

## Installation

```bash
# From sqrtspace-tools root directory
pip install -r requirements-minimal.txt
```

## Quick Start

```python
from advisor import ConfigurationAdvisor, SystemType

advisor = ConfigurationAdvisor()

# Analyze for database workload
config = advisor.analyze(
    workload_data={
        'read_ratio': 0.8,
        'working_set_gb': 50,
        'total_data_gb': 500,
        'qps': 10000
    },
    target=SystemType.DATABASE
)

print(config.explanation)
# "Database configured with 12.5GB buffer pool (√n sizing),
#  128MB work memory per operation, and standard checkpointing."
```

## System Types

### 1. Database Configuration

Optimizes PostgreSQL/MySQL settings:

```python
# E-commerce OLTP workload
config = advisor.analyze(
    workload_data={
        'read_ratio': 0.9,
        'working_set_gb': 20,
        'total_data_gb': 200,
        'qps': 5000,
        'connections': 300,
        'latency_sla_ms': 50
    },
    target=SystemType.DATABASE
)

# Generated PostgreSQL config:
# shared_buffers = 5120MB       # √n sized if data > memory
# work_mem = 21MB               # Per-operation memory
# checkpoint_segments = 16      # Based on write ratio
# max_connections = 600         # 2x concurrent users
```

### 2. JVM Configuration

Tunes heap size, GC, and thread settings:

```python
# Low-latency trading system
config = advisor.analyze(
    workload_data={
        'latency_sla_ms': 10,
        'working_set_gb': 8,
        'connections': 100
    },
    target=SystemType.JVM
)

# Generated JVM flags:
# -Xmx16g -Xms16g               # 50% of system memory
# -Xmn512m                      # √n young generation
# -XX:+UseG1GC                  # Low-latency GC
# -XX:MaxGCPauseMillis=10       # Match SLA
```

### 3. Kernel Configuration

Optimizes Linux kernel parameters:

```python
# High-throughput web server
config = advisor.analyze(
    workload_data={
        'request_rate': 50000,
        'connections': 10000,
        'working_set_gb': 32
    },
    target=SystemType.KERNEL
)

# Generated sysctl settings:
# vm.dirty_ratio = 20
# vm.swappiness = 60
# net.core.somaxconn = 65535
# net.ipv4.tcp_max_syn_backlog = 65535
```

### 4. Container Configuration

Sets Docker/Kubernetes resource limits:

```python
# Microservice API
config = advisor.analyze(
    workload_data={
        'working_set_gb': 2,
        'connections': 100,
        'qps': 1000
    },
    target=SystemType.CONTAINER
)

# Generated Docker command:
# docker run --memory=3.0g --cpus=100
```

### 5. Application Configuration

Tunes thread pools, caches, and batch sizes:

```python
# Data processing application
config = advisor.analyze(
    workload_data={
        'working_set_gb': 50,
        'connections': 200,
        'batch_size': 10000
    },
    target=SystemType.APPLICATION
)

# Generated settings:
# thread_pool_size: 16          # Based on CPU cores
# connection_pool_size: 200     # Match concurrency
# cache_size: 229,739           # √n entries
# batch_size: 10,000            # Optimized for memory
```

## System Analysis

The advisor automatically profiles your system:

```python
from advisor import SystemAnalyzer

analyzer = SystemAnalyzer()
profile = analyzer.analyze_system()

print(f"CPU: {profile.cpu_count} cores ({profile.cpu_model})")
print(f"Memory: {profile.memory_gb:.1f}GB")
print(f"Storage: {profile.storage_type} ({profile.storage_iops} IOPS)")
print(f"L3 Cache: {profile.l3_cache_mb:.1f}MB")
```

## Workload Analysis

Characterize workloads from metrics or logs:

```python
from advisor import WorkloadAnalyzer

analyzer = WorkloadAnalyzer()

# From metrics
workload = analyzer.analyze_workload(metrics={
    'read_ratio': 0.8,
    'working_set_gb': 100,
    'qps': 10000,
    'connections': 500
})

# From logs
workload = analyzer.analyze_workload(logs=[
    "SELECT * FROM users WHERE id = 123",
    "UPDATE orders SET status = 'shipped'",
    # ... more log entries
])
```

## A/B Testing

Compare configurations scientifically:

```python
# Create two configurations
config_a = advisor.analyze(workload_a, target=SystemType.DATABASE)
config_b = advisor.analyze(workload_b, target=SystemType.DATABASE)

# Run A/B test
results = advisor.compare_configs(
    [config_a, config_b],
    test_duration=300  # 5 minutes
)

for result in results:
    print(f"{result.config_name}:")
    print(f"  Throughput: {result.metrics['throughput']} QPS")
    print(f"  Latency: {result.metrics['latency']} ms")
    print(f"  Winner: {'Yes' if result.winner else 'No'}")
```

## Export Configurations

Save configurations in appropriate formats:

```python
# PostgreSQL config file
advisor.export_config(db_config, "postgresql.conf")

# JVM startup script
advisor.export_config(jvm_config, "jvm_startup.sh")

# JSON for other systems
advisor.export_config(app_config, "app_config.json")
```

## √n Optimization Examples

The advisor applies Williams' space-time tradeoffs:

### Database Buffer Pool

For data larger than memory:

- Traditional: Try to cache everything (thrashing)
- √n approach: Cache √(data_size) for optimal performance
- Example: 1TB data → 32GB buffer pool (not 1TB!)

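The buffer pool rule can be sketched in a few lines. This is a minimal illustration, not the advisor's actual implementation: `sqrt_buffer_pool_gb` and its parameters are hypothetical names, the square root is taken over the data size in GB (which matches the 1TB → 32GB example), and falling back to the full data size when the data fits in memory is an assumption.

```python
import math


def sqrt_buffer_pool_gb(total_data_gb, system_memory_gb, os_reserve=0.20):
    """Hypothetical helper: size a buffer pool with the √n rule (all values in GB)."""
    # Assumption: reserve a fraction of RAM for the OS before sizing anything.
    usable_gb = system_memory_gb * (1.0 - os_reserve)

    if total_data_gb <= usable_gb:
        # Assumption: when the data fits in memory, √n sizing is skipped
        # and the whole data set is cached.
        return float(total_data_gb)

    # Data exceeds memory: cache √(data size), capped at usable memory.
    return min(math.sqrt(total_data_gb), usable_gb)


# 1TB (1024GB) of data on a 64GB machine
print(sqrt_buffer_pool_gb(1024, 64))  # 32.0
```

Because the result depends on the unit chosen for `n`, any real implementation needs to fix that unit explicitly; GB is used here only because it reproduces the example figures in this section.
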
### JVM Young Generation

Balance GC frequency vs pause time:

- Traditional: Fixed percentage (25% of heap)
- √n approach: √(heap_size) for optimal GC
- Example: 64GB heap → 8GB young gen

### Application Cache

Limited memory for caching:

- Traditional: LRU with fixed size
- √n approach: √(total_items) cache entries
- Example: 1B items → 31,622 cache entries

## Real-World Impact

Organizations using these principles:

- **Google**: Bigtable uses √n buffer sizes
- **Facebook**: RocksDB applies similar concepts
- **PostgreSQL**: Shared buffers tuning
- **JVM**: G1GC uses √n heuristics
- **Linux**: Page cache management

## Advanced Usage

### Custom System Types

```python
class CustomConfigGenerator(ConfigurationGenerator):
    def generate_custom_config(self, system, workload):
        # Apply √n principles to your system
        buffer_size = self.sqrt_calc.calculate_optimal_buffer(
            workload.total_data_size_gb * 1024
        )
        return Configuration(...)
```

### Continuous Optimization

```python
import time

# Monitor and adapt over time.
# collect_metrics, significant_change, and apply_config are placeholders
# for your own monitoring and deployment hooks.
last_metrics = collect_metrics()

while True:
    current_metrics = collect_metrics()

    if significant_change(current_metrics, last_metrics):
        new_config = advisor.analyze(
            workload_data=current_metrics,
            target=SystemType.DATABASE
        )
        apply_config(new_config)
        # Remember the baseline the new config was derived from
        last_metrics = current_metrics

    time.sleep(3600)  # Check hourly
```

## Examples

See [example_advisor.py](example_advisor.py) for comprehensive examples:

- PostgreSQL tuning for OLTP vs OLAP
- JVM configuration for latency vs throughput
- Container resource allocation
- Kernel tuning for different workloads
- A/B testing configurations
- Adaptive configuration over time

## Troubleshooting

### Memory Calculations

- Buffer sizes are capped at available memory
- √n sizing is only applied when data > memory
- Consider OS overhead (typically 20% reserved)

### Performance Testing

- A/B tests simulate load; validate results with real tests before rollout
- Confidence intervals require sufficient samples
- Network conditions affect distributed systems

## Future Enhancements

- Cloud provider specific configs (AWS, GCP, Azure)
- Kubernetes operator for automatic tuning
- Machine learning workload detection
- Integration with monitoring systems
- Automated rollback on regression

## See Also

- [SpaceTimeCore](../core/spacetime_core.py): √n calculations
- [Memory Profiler](../profiler/): Identify bottlenecks