# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Context

This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as an AI model testing environment featuring:

### Core Purpose

A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.

### Main Features:

1. **Comprehensive System Monitoring** - Real-time resource tracking for AI workloads
   - Live dashboard with GPU, CPU, memory, disk, and network monitoring
   - Process monitoring with real-time top processes display
   - Enhanced header with critical metrics (GPU load, VRAM, RAM, disk space)
   - Detailed tooltips showing active Ollama models
2. **Model Manager** - Complete Ollama model management interface
   - Download, delete, create, and test models
   - Support for Hugging Face models via Ollama pull syntax
   - Rich model metadata display with size, quantization, context length
   - Quick in-app chat testing interface
3. **Plugin-Based Tool System** - Extensible framework for AI testing tools
   - Auto-discovery of tools from `src/tools/` directory
   - Each tool can have multiple sub-pages with routing
   - Tools have access to system monitors via ToolContext
   - Enable/disable tools via simple property override
4. **External Integrations** - Quick access to related services
   - Direct link to Open WebUI for advanced model interactions

## Development Commands

### Running the Application

```bash
# Install dependencies
uv sync

# Run the development server (use port 8081 for testing, as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by the main instance
uv run python src/main.py
```

### Dependency Management

```bash
# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync
```

## Architecture Overview

### Technology Stack

- **Package Manager**: uv (version 0.8.17)
- **UI Framework**: NiceGUI (async web framework based on FastAPI/Vue.js)
- **Python Version**: 3.13+
- **Ollama API**: Running on localhost:11434
- **Dependencies**:
  - `nicegui` - Main UI framework
  - `niceguiasyncelement` - Custom async component framework (from git)
  - `psutil` - System monitoring
  - `httpx` - Async HTTP client for Ollama API
  - `python-dotenv` - Environment configuration

### Project Structure

```
src/
├── main.py                      # Entry point, NiceGUI app configuration with all routes
├── pages/                       # Core page components
│   ├── dashboard.py             # Comprehensive system monitoring dashboard
│   └── ollama_manager.py        # Ollama model management interface (AsyncColumn)
├── components/                  # Reusable UI components
│   ├── header.py                # Enhanced header with critical metrics and tooltips
│   ├── sidebar.py               # Navigation sidebar with auto-populated tools
│   ├── bottom_nav.py            # Mobile bottom navigation
│   ├── ollama_downloader.py     # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py # Model creation component (AsyncCard)
│   └── ollama_quick_test.py     # Model testing component (AsyncCard)
├── tools/                       # Plugin system for extensible tools
│   ├── __init__.py              # Auto-discovery and tool registry
│   ├── base_tool.py             # BaseTool and BasePage classes, ToolContext
│   └── example_tool/            # Example tool demonstrating plugin system
│       ├── __init__.py
│       └── tool.py              # ExampleTool with main, settings, history pages
├── utils/                       # Utility modules
│   ├── gpu_monitor.py           # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py        # Comprehensive system resource monitoring
│   ├── ollama_monitor.py        # Ollama status and active models monitoring
│   └── ollama.py                # Ollama API client functions
└── static/                      # Static assets (CSS, images)
    └── style.css                # Custom dark theme styles
```

### Key Design Patterns

1. **Plugin Architecture**: Extensible tool system with auto-discovery
   - Tools are auto-discovered from the `src/tools/` directory
   - Each tool inherits from `BaseTool` and defines routes for sub-pages
   - Tools can be enabled/disabled via simple property override
   - Sub-route support: tools can have multiple pages (main, settings, etc.)
2. **Async Components**: Uses custom `niceguiasyncelement` framework
   - `BasePage(AsyncColumn)` for consistent tool page structure
   - `AsyncCard` base classes for complex components
   - All tool pages inherit from `BasePage` to eliminate boilerplate
3. **Context Pattern**: Shared resource access via ToolContext
   - `ToolContext` provides access to system monitors from any tool
   - Global context initialized in main.py and accessible via `tool.context`
   - Clean separation between tools and system resources
4. **Bindable Dataclasses**: Monitor classes use `@binding.bindable_dataclass`
   - Real-time UI updates with 2-second refresh intervals
   - `SystemMonitor`, `GPUMonitor`, `OllamaMonitor` for live data
5. **Enhanced Header**: Critical metrics display with detailed tooltips
   - GPU load, VRAM usage, system RAM, disk space badges
   - Active model tooltip with detailed model information
   - Clean metric formatting with proper units

## Component Architecture

### Monitor Classes (Supporting AI Testing)

- **SystemMonitor**: Tracks system resources during AI model testing
  - CPU usage during model inference
  - Memory consumption by loaded models
  - Disk I/O for model loading
  - Process statistics for Ollama and GPU processes
- **GPUMonitor**: Critical for AI workload monitoring
  - Auto-detects AMD/NVIDIA GPUs
  - Tracks GPU usage during model inference
  - Memory usage by loaded models
  - Temperature monitoring during extended testing
  - Power draw under AI workloads
- **OllamaMonitor**: Core service monitoring
  - Ollama service status and version
  - Currently loaded/active models
  - Real-time model state tracking

### UI Components

- **MetricCircle**: Small circular progress indicator with icon
- **LargeMetricCircle**: Large circular progress for primary metrics
- **ColorfulMetricCard**: Action cards with gradient backgrounds
- **Sidebar**: Navigation menu with updated structure:
  - Main: Dashboard, System Overview
  - Tools: Censor (content filtering)
  - Bottom: Model Manager, Settings
- **Header**: Top bar with system status indicators

### Ollama-Specific Components (AsyncCard-based):

- **OllamaDownloaderComponent**: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
- **OllamaModelCreationComponent**: Custom model creation from Modelfile
- **ModelQuickTestComponent**: Interactive model testing interface

## Ollama Integration

The Ollama API client (`src/utils/ollama.py`) provides async functions:

- `status()`: Check if Ollama is online and get version
- `available_models()`: List installed models with detailed metadata
- `active_models()`: Get currently loaded/running models
- `delete_model()`: Remove a model
- `model_info()`: Get detailed model information
  and Modelfile
- `stream_chat()`: Stream chat responses

## Tools Plugin System

The application features an extensible plugin system for AI testing tools:

### Creating a New Tool

1. **Create tool directory**: `src/tools/my_tool/`
2. **Create tool class**: `src/tools/my_tool/tool.py`

```python
from typing import Dict, Callable, Awaitable

from nicegui import ui

from tools.base_tool import BaseTool, BasePage


class MyTool(BaseTool):
    @property
    def name(self) -> str:
        return "My Tool"

    @property
    def description(self) -> str:
        return "Description of what this tool does"

    @property
    def icon(self) -> str:
        return "build"  # Material icon name

    @property
    def enabled(self) -> bool:
        return True  # Set to False to disable

    @property
    def routes(self) -> Dict[str, Callable[[], Awaitable]]:
        return {
            '': lambda: MainPage().create(self),
            '/settings': lambda: SettingsPage().create(self),
        }


class MainPage(BasePage):
    async def content(self):
        # Access system monitors via context
        cpu_usage = self.tool.context.system_monitor.cpu_percent
        active_models = self.tool.context.ollama_monitor.active_models

        # Your tool UI here
        ui.label(f"CPU: {cpu_usage}%")


class SettingsPage(BasePage):
    async def content(self):
        # Settings UI for this tool
        ui.label("Settings")
```

### Tool Features:

- **Auto-discovery**: Tools are automatically found and loaded
- **Sub-routes**: Tools can have multiple pages (/, /settings, /history, etc.)
- **Context Access**: Access to system monitors via `self.tool.context`
- **Enable/Disable**: Control tool visibility via `enabled` property
- **Consistent Layout**: `BasePage` handles standard layout structure

### AI Model Testing Features:

- **Model Discovery & Management**:
  - Browse and pull models from the Ollama library
  - Support for HuggingFace models via Ollama syntax
  - Rich metadata display (size, quantization, parameters, format)
  - Time tracking for model versions
- **Testing Capabilities**:
  - Quick chat interface for immediate model testing
  - Model information and Modelfile inspection
  - Custom model creation
    from Modelfiles
  - Real-time resource monitoring during inference
- **Testing Tools**:
  - Censor tool for output filtering analysis
  - Extensible framework for adding new testing tools

API endpoints at `http://localhost:11434/api/`:

- `/api/version`: Get Ollama version
- `/api/tags`: List available models
- `/api/pull`: Download models
- `/api/delete`: Remove models
- `/api/generate`: Generate text
- `/api/chat`: Chat completion
- `/api/ps`: List running models
- `/api/show`: Show model details

## System Monitoring

### GPU Monitoring Strategy

The application uses a hierarchical approach for GPU monitoring:

1. **NVIDIA GPUs** (via `nvidia-smi`):
   - Temperature, usage, memory, power draw
   - CUDA version and driver info
   - Multi-GPU support
2. **AMD GPUs** (multiple fallbacks):
   - Primary: `rocm-smi` for full metrics
   - Fallback: `/sys/class/drm` filesystem
   - Reads hwmon for temperature data
   - Supports both server and consumer GPUs

### CPU & System Monitoring

- Real-time CPU usage and per-core statistics
- Memory (RAM and swap) usage
- Disk usage and I/O statistics
- Network traffic monitoring
- Process tracking with top processes by CPU/memory
- System uptime and kernel information

## UI/UX Features

### Dark Theme

Custom dark theme with:

- Background: `#1a1d2e` (main), `#252837` (sidebar)
- Card backgrounds: `rgba(26, 29, 46, 0.7)` with backdrop blur
- Accent colors: Cyan (`#06b6d4`) for primary actions
- Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)

### Responsive Design

- Desktop: Full sidebar navigation
- Mobile: Bottom navigation bar
- Adaptive grid layouts for different screen sizes
- Viewport-aware content scaling

### Real-time Updates

- System metrics update every 2 seconds (configurable via `MONITORING_UPDATE_INTERVAL`)
- Live data binding for all metrics
- Smooth transitions and animations

## Enhanced Dashboard Features

The dashboard provides comprehensive real-time monitoring specifically designed for AI workload testing:
### Primary Monitoring Sections:

- **GPU Performance**: Large circular progress for GPU load, VRAM usage bar, temperature & power draw
- **CPU & Memory**: Dual circular progress with detailed specs and frequency info
- **Ollama Service**: Live status, version, and grid display of active models with metadata
- **Storage & Network**: Disk usage bars and real-time network I/O monitoring
- **Process Monitoring**: Live table of top processes with CPU%, memory usage, and status
- **System Information**: OS details, uptime, load average, hardware specifications

### Header Enhancements:

- **Critical Metrics Badges**: GPU load, VRAM usage, system RAM, disk space with live updates
- **Active Models Tooltip**: Detailed grid showing running models with context length, size, VRAM usage
- **Live Status Indicators**: Ollama service status with version information

## NiceGUI Patterns

- **Plugin-Based Routing**: Tools auto-register their routes with sub-page support
- **Context Pattern**: Shared monitor access via `tool.context` for all plugins
- **BasePage Pattern**: Consistent tool page structure with `BasePage(AsyncColumn)`
- **Data Binding**: Reactive UI updates with `bind_text_from()` and `bind_value_from()`
- **Async Components**: `niceguiasyncelement` framework with `@ui.refreshable` decorators
- **Timer Updates**: 2-second intervals for real-time monitoring data
- **Dark Mode**: Comprehensive dark theme with custom metric colors

## Environment Variables

Configured in `.env`:

- `MONITORING_UPDATE_INTERVAL`: Update frequency in seconds (default: 2)
- `APP_PORT`: Web server port (default: 8080, use 8081 for testing)
- `APP_TITLE`: Application title
- `APP_STORAGE_SECRET`: Session storage encryption key
- `APP_SHOW`: Auto-open browser on startup

## Testing & Development

- Run on port 8081 to avoid conflicts: `APP_PORT=8081 uv run python src/main.py`
- Monitor GPU detection in console logs
- Check Ollama connectivity at startup
- Use browser DevTools for WebSocket debugging
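For quick reference, the environment variables above can be collected into a sample `.env`. All values here are illustrative assumptions, not the repository's actual configuration; in particular, `APP_TITLE` and `APP_STORAGE_SECRET` must be set to real values:

```bash
# .env - illustrative values only
MONITORING_UPDATE_INTERVAL=2
APP_PORT=8081                      # 8081 for testing; 8080 is the main instance
APP_TITLE="AI Model Testing"       # hypothetical title
APP_STORAGE_SECRET=change-me       # replace with a real secret
APP_SHOW=true
```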
## Current Route Structure

### Core Application Routes:

- `/` - Comprehensive system monitoring dashboard
- `/ollama` - Advanced model manager (download, test, create, manage)
- `/settings` - Application configuration and monitoring intervals

### Plugin System Routes (Auto-Generated):

- `/example-tool` - Example tool demonstrating plugin capabilities
- `/example-tool/settings` - Tool-specific settings page
- `/example-tool/history` - Tool-specific history page
- **Dynamic Discovery**: Additional tool routes auto-discovered from the `src/tools/` directory

### External Integrations:

- Direct link to Open WebUI for advanced model interactions

## Tool Development Guide

### Quick Start:

1. Create a `src/tools/my_tool/` directory
2. Add `tool.py` with a class inheriting from `BaseTool`
3. Define a routes dictionary mapping paths to page classes
4. Create page classes inheriting from `BasePage`
5. The tool automatically appears in the sidebar and its routes are registered

### Advanced Features:

- **Context Access**: Access system monitors via `self.tool.context.system_monitor`
- **Sub-routing**: Multiple pages per tool (main, settings, config, etc.)
- **Enable/Disable**: Control tool visibility via `enabled` property
- **Live Data**: Bind to real-time system metrics and Ollama status

## Future Enhancements

- Local AI model testing capabilities that prioritize privacy and security
- Tools for testing model behaviors that external providers might restrict
- Advanced local prompt engineering and safety testing frameworks
- Private data processing and analysis tools using local models
- Additional testing capabilities as needs are discovered through usage
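The auto-discovery step in the quick start above can be sketched as follows. This is a minimal illustration of the general pattern, not the actual code in `src/tools/__init__.py`; the helper name `discover_tool_dirs` is hypothetical:

```python
# Hedged sketch of plugin auto-discovery: scan a tools directory for
# packages that contain a tool.py, skipping private/dunder directories.
from pathlib import Path


def discover_tool_dirs(tools_root: Path) -> list[str]:
    """Return names of subdirectories that look like tool packages."""
    return sorted(
        entry.name
        for entry in tools_root.iterdir()
        if entry.is_dir()
        and (entry / "tool.py").exists()
        and not entry.name.startswith("_")
    )
```

Each discovered name could then be imported (e.g. with `importlib.import_module(f"tools.{name}.tool")`) and its `BaseTool` subclass registered, which is presumably how the sidebar entries and plugin routes come into being.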