# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Context

This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as a local AI model testing environment.

### Core Purpose

A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.
### Main Features:

1. **Comprehensive System Monitoring** - Real-time resource tracking for AI workloads
   - Live dashboard with GPU, CPU, memory, disk, and network monitoring
   - Process monitoring with real-time top processes display
   - Enhanced header with critical metrics (GPU load, VRAM, RAM, disk space)
   - Detailed tooltips showing active Ollama models

2. **Model Manager** - Complete Ollama model management interface
   - Download, delete, create, and test models
   - Support for Hugging Face models via Ollama pull syntax
   - Rich model metadata display with size, quantization, and context length
   - Quick in-app chat testing interface

3. **Plugin-Based Tool System** - Extensible framework for AI testing tools
   - Auto-discovery of tools from the `src/tools/` directory
   - Each tool can have multiple sub-pages with routing
   - Tools have access to system monitors via `ToolContext`
   - Enable/disable tools via a simple property override

4. **External Integrations** - Quick access to related services
   - Direct link to Open WebUI for advanced model interactions
## Development Commands

### Running the Application
```bash
# Install dependencies
uv sync

# Run the development server (use port 8081 for testing, as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by the main instance
uv run python src/main.py
```

### Dependency Management
```bash
# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Re-sync the environment with the lockfile (e.g. after editing pyproject.toml)
uv sync
```
## Architecture Overview

### Technology Stack
- **Package Manager**: uv (version 0.8.17)
- **UI Framework**: NiceGUI (async web framework based on FastAPI/Vue.js)
- **Python Version**: 3.13+
- **Ollama API**: Running on `localhost:11434`
- **Dependencies**:
  - `nicegui` - Main UI framework
  - `niceguiasyncelement` - Custom async component framework (from git)
  - `psutil` - System monitoring
  - `httpx` - Async HTTP client for the Ollama API
  - `python-dotenv` - Environment configuration
### Project Structure
```
src/
├── main.py                      # Entry point, NiceGUI app configuration with all routes
├── pages/                       # Core page components
│   ├── dashboard.py             # Comprehensive system monitoring dashboard
│   └── ollama_manager.py        # Ollama model management interface (AsyncColumn)
├── components/                  # Reusable UI components
│   ├── header.py                # Enhanced header with critical metrics and tooltips
│   ├── sidebar.py               # Navigation sidebar with auto-populated tools
│   ├── bottom_nav.py            # Mobile bottom navigation
│   ├── ollama_downloader.py     # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py # Model creation component (AsyncCard)
│   └── ollama_quick_test.py     # Model testing component (AsyncCard)
├── tools/                       # Plugin system for extensible tools
│   ├── __init__.py              # Auto-discovery and tool registry
│   ├── base_tool.py             # BaseTool and BasePage classes, ToolContext
│   └── example_tool/            # Example tool demonstrating the plugin system
│       ├── __init__.py
│       └── tool.py              # ExampleTool with main, settings, history pages
├── utils/                       # Utility modules
│   ├── gpu_monitor.py           # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py        # Comprehensive system resource monitoring
│   ├── ollama_monitor.py        # Ollama status and active models monitoring
│   └── ollama.py                # Ollama API client functions
└── static/                      # Static assets (CSS, images)
    └── style.css                # Custom dark theme styles
```
### Key Design Patterns

1. **Plugin Architecture**: Extensible tool system with auto-discovery
   - Tools are auto-discovered from the `src/tools/` directory
   - Each tool inherits from `BaseTool` and defines routes for sub-pages
   - Tools can be enabled/disabled via a simple property override
   - Sub-route support: tools can have multiple pages (main, settings, etc.)

2. **Async Components**: Uses the custom `niceguiasyncelement` framework
   - `BasePage(AsyncColumn)` for consistent tool page structure
   - `AsyncCard` base classes for complex components
   - All tool pages inherit from `BasePage` to eliminate boilerplate

3. **Context Pattern**: Shared resource access via `ToolContext`
   - `ToolContext` provides access to system monitors from any tool
   - Global context initialized in `main.py` and accessible via `tool.context`
   - Clean separation between tools and system resources

4. **Bindable Dataclasses**: Monitor classes use `@binding.bindable_dataclass`
   - Real-time UI updates with 2-second refresh intervals
   - `SystemMonitor`, `GPUMonitor`, `OllamaMonitor` for live data

5. **Enhanced Header**: Critical metrics display with detailed tooltips
   - GPU load, VRAM usage, system RAM, and disk space badges
   - Active model tooltip with detailed model information
   - Clean metric formatting with proper units
## Component Architecture

### Monitor Classes (Supporting AI Testing)
- **SystemMonitor**: Tracks system resources during AI model testing
  - CPU usage during model inference
  - Memory consumption by loaded models
  - Disk I/O for model loading
  - Process statistics for Ollama and GPU processes

- **GPUMonitor**: Critical for AI workload monitoring
  - Auto-detects AMD/NVIDIA GPUs
  - Tracks GPU usage during model inference
  - Memory usage by loaded models
  - Temperature monitoring during extended testing
  - Power draw under AI workloads

- **OllamaMonitor**: Core service monitoring
  - Ollama service status and version
  - Currently loaded/active models
  - Real-time model state tracking
### UI Components
- **MetricCircle**: Small circular progress indicator with icon
- **LargeMetricCircle**: Large circular progress for primary metrics
- **ColorfulMetricCard**: Action cards with gradient backgrounds
- **Sidebar**: Navigation menu with updated structure:
  - Main: Dashboard, System Overview
  - Tools: Censor (content filtering)
  - Bottom: Model Manager, Settings
- **Header**: Top bar with system status indicators
### Ollama-Specific Components (AsyncCard-based):
- **OllamaDownloaderComponent**: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
- **OllamaModelCreationComponent**: Custom model creation from a Modelfile
- **ModelQuickTestComponent**: Interactive model testing interface
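Model creation starts from a standard Ollama Modelfile, for example (base model and values illustrative):

```
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM "You are a concise assistant for model testing."
```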
## Ollama Integration
The Ollama API client (`src/utils/ollama.py`) provides async functions:
- `status()`: Check if Ollama is online and get its version
- `available_models()`: List installed models with detailed metadata
- `active_models()`: Get currently loaded/running models
- `delete_model()`: Remove a model
- `model_info()`: Get detailed model information and the Modelfile
- `stream_chat()`: Stream chat responses
## Tools Plugin System

The application features an extensible plugin system for AI testing tools:

### Creating a New Tool

1. **Create tool directory**: `src/tools/my_tool/`
2. **Create tool class**: `src/tools/my_tool/tool.py`

```python
from typing import Awaitable, Callable, Dict

from nicegui import ui

from tools.base_tool import BasePage, BaseTool

class MyTool(BaseTool):
    @property
    def name(self) -> str:
        return "My Tool"

    @property
    def description(self) -> str:
        return "Description of what this tool does"

    @property
    def icon(self) -> str:
        return "build"  # Material icon name

    @property
    def enabled(self) -> bool:
        return True  # Set to False to disable

    @property
    def routes(self) -> Dict[str, Callable[[], Awaitable]]:
        return {
            '': lambda: MainPage().create(self),
            '/settings': lambda: SettingsPage().create(self),
        }

class MainPage(BasePage):
    async def content(self):
        # Access system monitors via context
        cpu_usage = self.tool.context.system_monitor.cpu_percent
        active_models = self.tool.context.ollama_monitor.active_models

        # Your tool UI here
        ui.label(f"CPU: {cpu_usage}%")

class SettingsPage(BasePage):
    async def content(self):
        ui.label("Settings for My Tool")
```

### Tool Features:
- **Auto-discovery**: Tools are automatically found and loaded
- **Sub-routes**: Tools can have multiple pages (/, /settings, /history, etc.)
- **Context Access**: Access to system monitors via `self.tool.context`
- **Enable/Disable**: Control tool visibility via the `enabled` property
- **Consistent Layout**: `BasePage` handles the standard layout structure

### AI Model Testing Features:
- **Model Discovery & Management**:
  - Browse and pull models from the Ollama library
  - Support for Hugging Face models via Ollama syntax
  - Rich metadata display (size, quantization, parameters, format)
  - Time tracking for model versions

- **Testing Capabilities**:
  - Quick chat interface for immediate model testing
  - Model information and Modelfile inspection
  - Custom model creation from Modelfiles
  - Real-time resource monitoring during inference

- **Testing Tools**:
  - Censor tool for output filtering analysis
  - Extensible framework for adding new testing tools
API endpoints at `http://localhost:11434/api/`:
- `/api/version`: Get the Ollama version
- `/api/tags`: List available models
- `/api/pull`: Download models
- `/api/delete`: Remove models
- `/api/generate`: Generate text
- `/api/chat`: Chat completion
- `/api/ps`: List running models
- `/api/show`: Show model details
## System Monitoring

### GPU Monitoring Strategy
The application uses a hierarchical approach for GPU monitoring:

1. **NVIDIA GPUs** (via `nvidia-smi`):
   - Temperature, usage, memory, power draw
   - CUDA version and driver info
   - Multi-GPU support

2. **AMD GPUs** (multiple fallbacks):
   - Primary: `rocm-smi` for full metrics
   - Fallback: the `/sys/class/drm` filesystem
   - Reads hwmon for temperature data
   - Supports both server and consumer GPUs
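The fallback hierarchy can be sketched like this (illustrative; the actual selection logic in `gpu_monitor.py` may differ):

```python
import shutil


def detect_gpu_backend() -> str:
    """Pick a GPU monitoring backend in priority order."""
    if shutil.which('nvidia-smi'):
        return 'nvidia-smi'  # NVIDIA: full metrics incl. CUDA/driver info
    if shutil.which('rocm-smi'):
        return 'rocm-smi'    # AMD: full metrics
    return 'sysfs'           # AMD fallback: /sys/class/drm + hwmon reads
```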
### CPU & System Monitoring
- Real-time CPU usage and per-core statistics
- Memory (RAM and swap) usage
- Disk usage and I/O statistics
- Network traffic monitoring
- Process tracking with top processes by CPU/memory
- System uptime and kernel information
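Process tracking of this kind is a thin layer over `psutil`; a sketch (the function name is hypothetical):

```python
import psutil


def top_processes(n: int = 5) -> list[tuple[str, float]]:
    """Return the n processes currently using the most CPU, as (name, cpu%) pairs."""
    snapshot = [
        (proc.info['name'] or '?', proc.info['cpu_percent'] or 0.0)
        for proc in psutil.process_iter(['name', 'cpu_percent'])
    ]
    snapshot.sort(key=lambda item: item[1], reverse=True)
    return snapshot[:n]
```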
## UI/UX Features

### Dark Theme
Custom dark theme with:
- Background: `#1a1d2e` (main), `#252837` (sidebar)
- Card backgrounds: `rgba(26, 29, 46, 0.7)` with backdrop blur
- Accent colors: Cyan (`#06b6d4`) for primary actions
- Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)

### Responsive Design
- Desktop: Full sidebar navigation
- Mobile: Bottom navigation bar
- Adaptive grid layouts for different screen sizes
- Viewport-aware content scaling

### Real-time Updates
- System metrics update every 2 seconds (configurable via `MONITORING_UPDATE_INTERVAL`)
- Live data binding for all metrics
- Smooth transitions and animations
|
## Enhanced Dashboard Features
|
|
|
|
The dashboard provides comprehensive real-time monitoring specifically designed for AI workload testing:
|
|
|
|
### Primary Monitoring Sections:
|
|
- **GPU Performance**: Large circular progress for GPU load, VRAM usage bar, temperature & power draw
|
|
- **CPU & Memory**: Dual circular progress with detailed specs and frequency info
|
|
- **Ollama Service**: Live status, version, and grid display of active models with metadata
|
|
- **Storage & Network**: Disk usage bars and real-time network I/O monitoring
|
|
- **Process Monitoring**: Live table of top processes with CPU%, memory usage, and status
|
|
- **System Information**: OS details, uptime, load average, hardware specifications
|
|
|
|
### Header Enhancements:
|
|
- **Critical Metrics Badges**: GPU load, VRAM usage, system RAM, disk space with live updates
|
|
- **Active Models Tooltip**: Detailed grid showing running models with context length, size, VRAM usage
|
|
- **Live Status Indicators**: Ollama service status with version information
|
|
|
|
## NiceGUI Patterns
- **Plugin-Based Routing**: Tools auto-register their routes with sub-page support
- **Context Pattern**: Shared monitor access via `tool.context` for all plugins
- **BasePage Pattern**: Consistent tool page structure with `BasePage(AsyncColumn)`
- **Data Binding**: Reactive UI updates with `bind_text_from()` and `bind_value_from()`
- **Async Components**: `niceguiasyncelement` framework with `@ui.refreshable` decorators
- **Timer Updates**: 2-second intervals for real-time monitoring data
- **Dark Mode**: Comprehensive dark theme with custom metric colors
## Environment Variables
Configured in `.env`:
- `MONITORING_UPDATE_INTERVAL`: Update frequency in seconds (default: 2)
- `APP_PORT`: Web server port (default: 8080; use 8081 for testing)
- `APP_TITLE`: Application title
- `APP_STORAGE_SECRET`: Session storage encryption key
- `APP_SHOW`: Auto-open browser on startup
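A sample `.env` (values illustrative; the secret must be replaced):

```bash
MONITORING_UPDATE_INTERVAL=2
APP_PORT=8081
APP_TITLE="AI Model Testing Platform"
APP_STORAGE_SECRET=replace-with-a-random-secret
APP_SHOW=false
```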
## Testing & Development
- Run on port 8081 to avoid conflicts: `APP_PORT=8081 uv run python src/main.py`
- Monitor GPU detection in the console logs
- Check Ollama connectivity at startup
- Use browser DevTools for WebSocket debugging
## Current Route Structure

### Core Application Routes:
- `/` - Comprehensive system monitoring dashboard
- `/ollama` - Advanced model manager (download, test, create, manage)
- `/settings` - Application configuration and monitoring intervals

### Plugin System Routes (Auto-Generated):
- `/example-tool` - Example tool demonstrating plugin capabilities
- `/example-tool/settings` - Tool-specific settings page
- `/example-tool/history` - Tool-specific history page
- **Dynamic Discovery**: Additional tool routes are auto-discovered from the `src/tools/` directory

### External Integrations:
- Direct link to Open WebUI for advanced model interactions
## Tool Development Guide

### Quick Start:
1. Create the `src/tools/my_tool/` directory
2. Add `tool.py` with a class inheriting from `BaseTool`
3. Define a routes dictionary mapping paths to page classes
4. Create page classes inheriting from `BasePage`
5. The tool automatically appears in the sidebar and its routes are registered

### Advanced Features:
- **Context Access**: Access system monitors via `self.tool.context.system_monitor`
- **Sub-routing**: Multiple pages per tool (main, settings, config, etc.)
- **Enable/Disable**: Control tool visibility via the `enabled` property
- **Live Data**: Bind to real-time system metrics and Ollama status
## Future Enhancements
- Local AI model testing capabilities that prioritize privacy and security
- Tools for testing model behaviors that external providers might restrict
- Advanced local prompt engineering and safety testing frameworks
- Private data processing and analysis tools using local models
- Additional testing capabilities as needs are discovered through usage