# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Context
This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as an AI model testing environment.
### Core Purpose
A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.
### Main Features

- **Comprehensive System Monitoring** - Real-time resource tracking for AI workloads
  - Live dashboard with GPU, CPU, memory, disk, and network monitoring
  - Process monitoring with real-time top-processes display
  - Enhanced header with critical metrics (GPU load, VRAM, RAM, disk space)
  - Detailed tooltips showing active Ollama models
- **Model Manager** - Complete Ollama model management interface
  - Download, delete, create, and test models
  - Support for Hugging Face models via Ollama pull syntax
  - Rich model metadata display with size, quantization, and context length
  - Quick in-app chat testing interface
- **Plugin-Based Tool System** - Extensible framework for AI testing tools
  - Auto-discovery of tools from the `src/tools/` directory
  - Each tool can have multiple sub-pages with routing
  - Tools have access to system monitors via `ToolContext`
  - Enable/disable tools via a simple property override
- **External Integrations** - Quick access to related services
  - Direct link to Open WebUI for advanced model interactions
## Development Commands

### Running the Application
```bash
# Install dependencies
uv sync

# Run the development server (use port 8081 for testing, as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by the main instance
uv run python src/main.py
```
### Dependency Management
```bash
# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync
```
## Architecture Overview

### Technology Stack
- **Package Manager**: uv (version 0.8.17)
- **UI Framework**: NiceGUI (async web framework based on FastAPI/Vue.js)
- **Python Version**: 3.13+
- **Ollama API**: Running on localhost:11434
- **Dependencies**:
  - `nicegui` - Main UI framework
  - `niceguiasyncelement` - Custom async component framework (from git)
  - `psutil` - System monitoring
  - `httpx` - Async HTTP client for the Ollama API
  - `python-dotenv` - Environment configuration
### Project Structure
```
src/
├── main.py                       # Entry point, NiceGUI app configuration with all routes
├── pages/                        # Core page components
│   ├── dashboard.py              # Comprehensive system monitoring dashboard
│   └── ollama_manager.py         # Ollama model management interface (AsyncColumn)
├── components/                   # Reusable UI components
│   ├── header.py                 # Enhanced header with critical metrics and tooltips
│   ├── sidebar.py                # Navigation sidebar with auto-populated tools
│   ├── bottom_nav.py             # Mobile bottom navigation
│   ├── ollama_downloader.py      # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py  # Model creation component (AsyncCard)
│   └── ollama_quick_test.py      # Model testing component (AsyncCard)
├── tools/                        # Plugin system for extensible tools
│   ├── __init__.py               # Auto-discovery and tool registry
│   ├── base_tool.py              # BaseTool and BasePage classes, ToolContext
│   └── example_tool/             # Example tool demonstrating the plugin system
│       ├── __init__.py
│       └── tool.py               # ExampleTool with main, settings, history pages
├── utils/                        # Utility modules
│   ├── gpu_monitor.py            # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py         # Comprehensive system resource monitoring
│   ├── ollama_monitor.py         # Ollama status and active-models monitoring
│   └── ollama.py                 # Ollama API client functions
└── static/                       # Static assets (CSS, images)
    └── style.css                 # Custom dark theme styles
```
### Key Design Patterns
- **Plugin Architecture**: Extensible tool system with auto-discovery
  - Tools are auto-discovered from the `src/tools/` directory
  - Each tool inherits from `BaseTool` and defines routes for sub-pages
  - Tools can be enabled/disabled via a simple property override
  - Sub-route support: tools can have multiple pages (main, settings, etc.)
- **Async Components**: Uses the custom `niceguiasyncelement` framework
  - `BasePage(AsyncColumn)` for consistent tool page structure
  - `AsyncCard` base classes for complex components
  - All tool pages inherit from `BasePage` to eliminate boilerplate
- **Context Pattern**: Shared resource access via `ToolContext`
  - `ToolContext` provides access to system monitors from any tool
  - Global context initialized in main.py and accessible via `tool.context`
  - Clean separation between tools and system resources
- **Bindable Dataclasses**: Monitor classes use `@binding.bindable_dataclass` (see the sketch after this list)
  - Real-time UI updates with 2-second refresh intervals
  - `SystemMonitor`, `GPUMonitor`, and `OllamaMonitor` for live data
- **Enhanced Header**: Critical metrics display with detailed tooltips
  - GPU load, VRAM usage, system RAM, and disk space badges
  - Active-models tooltip with detailed model information
  - Clean metric formatting with proper units
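The bindable-dataclass pattern can be sketched as follows. This is a minimal illustration, not the repository's actual `SystemMonitor`; `DemoMonitor` and its fields are hypothetical stand-ins.

```python
# Minimal sketch of the bindable-dataclass monitor pattern; DemoMonitor and
# its fields are illustrative stand-ins, not the repository's SystemMonitor.
import psutil
from nicegui import binding, ui


@binding.bindable_dataclass
class DemoMonitor:
    cpu_percent: float = 0.0
    memory_percent: float = 0.0

    def refresh(self) -> None:
        # Sample current resource usage; bound UI elements update automatically.
        self.cpu_percent = psutil.cpu_percent()
        self.memory_percent = psutil.virtual_memory().percent


monitor = DemoMonitor()
ui.timer(2.0, monitor.refresh)  # the dashboard's 2-second refresh interval
ui.label().bind_text_from(monitor, 'cpu_percent', lambda v: f'CPU: {v:.0f}%')
```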
### Component Architecture

#### Monitor Classes (Supporting AI Testing)
- **SystemMonitor**: Tracks system resources during AI model testing
  - CPU usage during model inference
  - Memory consumption by loaded models
  - Disk I/O for model loading
  - Process statistics for Ollama and GPU processes
- **GPUMonitor**: Critical for AI workload monitoring
  - Auto-detects AMD/NVIDIA GPUs
  - Tracks GPU usage during model inference
  - Memory usage by loaded models
  - Temperature monitoring during extended testing
  - Power draw under AI workloads
- **OllamaMonitor**: Core service monitoring
  - Ollama service status and version
  - Currently loaded/active models
  - Real-time model state tracking
#### UI Components

- **MetricCircle**: Small circular progress indicator with icon
- **LargeMetricCircle**: Large circular progress for primary metrics
- **ColorfulMetricCard**: Action cards with gradient backgrounds
- **Sidebar**: Navigation menu with the following structure:
  - Main: Dashboard, System Overview
  - Tools: Censor (content filtering)
  - Bottom: Model Manager, Settings
- **Header**: Top bar with system status indicators

**Ollama-Specific Components (AsyncCard-based):**

- **OllamaDownloaderComponent**: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
- **OllamaModelCreationComponent**: Custom model creation from a Modelfile
- **ModelQuickTestComponent**: Interactive model testing interface
## Ollama Integration

The Ollama API client (`src/utils/ollama.py`) provides async functions:

- `status()`: Check whether Ollama is online and get its version
- `available_models()`: List installed models with detailed metadata
- `active_models()`: Get currently loaded/running models
- `delete_model()`: Remove a model
- `model_info()`: Get detailed model information and the Modelfile
- `stream_chat()`: Stream chat responses
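As a rough illustration (not the repository's exact implementation), two of these functions might wrap the Ollama HTTP API with `httpx` like this; the return shapes and error handling are assumptions based on Ollama's documented endpoints:

```python
# Hedged sketch of the async Ollama client; error handling and return shapes
# are assumptions, not the repository's exact implementation.
import httpx

OLLAMA_URL = 'http://localhost:11434'


async def status() -> str | None:
    """Return the Ollama version string, or None if the service is offline."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(f'{OLLAMA_URL}/api/version')
            response.raise_for_status()
            return response.json().get('version')
    except httpx.HTTPError:
        return None


async def available_models() -> list[dict]:
    """List installed models with their metadata from /api/tags."""
    async with httpx.AsyncClient() as client:
        response = await client.get(f'{OLLAMA_URL}/api/tags')
        response.raise_for_status()
        return response.json().get('models', [])
```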
## Tools Plugin System

The application features an extensible plugin system for AI testing tools.

### Creating a New Tool

1. Create the tool directory: `src/tools/my_tool/`
2. Create the tool class: `src/tools/my_tool/tool.py`

```python
from typing import Dict, Callable, Awaitable

from nicegui import ui
from tools.base_tool import BaseTool, BasePage


class MyTool(BaseTool):
    @property
    def name(self) -> str:
        return "My Tool"

    @property
    def description(self) -> str:
        return "Description of what this tool does"

    @property
    def icon(self) -> str:
        return "build"  # Material icon name

    @property
    def enabled(self) -> bool:
        return True  # Set to False to disable

    @property
    def routes(self) -> Dict[str, Callable[[], Awaitable]]:
        return {
            '': lambda: MainPage().create(self),
            '/settings': lambda: SettingsPage().create(self),
        }


class MainPage(BasePage):
    async def content(self):
        # Access system monitors via context
        cpu_usage = self.tool.context.system_monitor.cpu_percent
        active_models = self.tool.context.ollama_monitor.active_models

        # Your tool UI here
        ui.label(f"CPU: {cpu_usage}%")


class SettingsPage(BasePage):
    async def content(self):
        ui.label("Tool settings go here")
```

### Tool Features

- **Auto-discovery**: Tools are automatically found and loaded
- **Sub-routes**: Tools can have multiple pages (`/`, `/settings`, `/history`, etc.)
- **Context Access**: Access to system monitors via `self.tool.context`
- **Enable/Disable**: Control tool visibility via the `enabled` property
- **Consistent Layout**: `BasePage` handles the standard layout structure

### AI Model Testing Features
- **Model Discovery & Management**:
  - Browse and pull models from the Ollama library
  - Support for Hugging Face models via Ollama syntax
  - Rich metadata display (size, quantization, parameters, format)
  - Time tracking for model versions
- **Testing Capabilities**:
  - Quick chat interface for immediate model testing
  - Model information and Modelfile inspection
  - Custom model creation from Modelfiles
  - Real-time resource monitoring during inference
- **Testing Tools**:
  - Censor tool for output-filtering analysis
  - Extensible framework for adding new testing tools
API endpoints at `http://localhost:11434/api/`:

- `/api/version`: Get Ollama version
- `/api/tags`: List available models
- `/api/pull`: Download models
- `/api/delete`: Remove models
- `/api/generate`: Generate text
- `/api/chat`: Chat completion
- `/api/ps`: List running models
- `/api/show`: Show model details
## System Monitoring

### GPU Monitoring Strategy

The application uses a hierarchical approach to GPU monitoring (see the sketch below):

- **NVIDIA GPUs** (via `nvidia-smi`):
  - Temperature, usage, memory, and power draw
  - CUDA version and driver info
  - Multi-GPU support
- **AMD GPUs** (multiple fallbacks):
  - Primary: `rocm-smi` for full metrics
  - Fallback: the `/sys/class/drm` filesystem
  - Reads hwmon for temperature data
  - Supports both server and consumer GPUs
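The fallback order can be sketched as below; `detect_gpu_backend` is an illustrative helper, and the actual `gpu_monitor.py` may structure detection differently.

```python
# Illustrative sketch of the hierarchical GPU detection order described above;
# the actual gpu_monitor.py may differ.
import shutil
from pathlib import Path


def detect_gpu_backend() -> str:
    """Pick the best available GPU metrics source, in priority order."""
    if shutil.which('nvidia-smi'):
        return 'nvidia-smi'  # NVIDIA: temperature, usage, memory, power
    if shutil.which('rocm-smi'):
        return 'rocm-smi'    # AMD primary: full metrics
    if any(Path('/sys/class/drm').glob('card*/device/hwmon')):
        return 'sysfs'       # AMD fallback: read hwmon for temperature
    return 'none'
```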
### CPU & System Monitoring
- Real-time CPU usage and per-core statistics
- Memory (RAM and swap) usage
- Disk usage and I/O statistics
- Network traffic monitoring
- Process tracking with top processes by CPU/memory
- System uptime and kernel information
## UI/UX Features

### Dark Theme
Custom dark theme with:

- Background: `#1a1d2e` (main), `#252837` (sidebar)
- Card backgrounds: `rgba(26, 29, 46, 0.7)` with backdrop blur
- Accent colors: Cyan (`#06b6d4`) for primary actions
- Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)
### Responsive Design
- Desktop: Full sidebar navigation
- Mobile: Bottom navigation bar
- Adaptive grid layouts for different screen sizes
- Viewport-aware content scaling
### Real-time Updates

- System metrics update every 2 seconds (configurable via `MONITORING_UPDATE_INTERVAL`)
- Live data binding for all metrics
- Smooth transitions and animations
### Enhanced Dashboard Features
The dashboard provides comprehensive real-time monitoring specifically designed for AI workload testing:
**Primary Monitoring Sections:**
- **GPU Performance**: Large circular progress for GPU load, VRAM usage bar, temperature & power draw
- **CPU & Memory**: Dual circular progress with detailed specs and frequency info
- **Ollama Service**: Live status, version, and grid display of active models with metadata
- **Storage & Network**: Disk usage bars and real-time network I/O monitoring
- **Process Monitoring**: Live table of top processes with CPU%, memory usage, and status
- **System Information**: OS details, uptime, load average, hardware specifications
**Header Enhancements:**
- **Critical Metrics Badges**: GPU load, VRAM usage, system RAM, disk space with live updates
- **Active Models Tooltip**: Detailed grid showing running models with context length, size, and VRAM usage
- **Live Status Indicators**: Ollama service status with version information
## NiceGUI Patterns

- **Plugin-Based Routing**: Tools auto-register their routes with sub-page support
- **Context Pattern**: Shared monitor access via `tool.context` for all plugins
- **BasePage Pattern**: Consistent tool page structure with `BasePage(AsyncColumn)`
- **Data Binding**: Reactive UI updates with `bind_text_from()` and `bind_value_from()`
- **Async Components**: `niceguiasyncelement` framework with `@ui.refreshable` decorators (see the sketch after this list)
- **Timer Updates**: 2-second intervals for real-time monitoring data
- **Dark Mode**: Comprehensive dark theme with custom metric colors
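The `@ui.refreshable` polling pattern referenced above looks roughly like this; the sketch polls the repository's `active_models()` client function, whose exact return shape is assumed here.

```python
# Sketch of the @ui.refreshable polling pattern; the shape of the value
# returned by active_models() is an assumption.
from nicegui import ui
from utils.ollama import active_models

models: list = []


@ui.refreshable
def model_list() -> None:
    # This section is torn down and rebuilt on every model_list.refresh().
    if not models:
        ui.label('No models loaded')
    for model in models:
        ui.label(str(model))


model_list()


async def poll() -> None:
    models[:] = await active_models()
    model_list.refresh()


ui.timer(2.0, poll)  # matches the 2-second monitoring interval
```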
## Environment Variables

Configured in `.env`:

- `MONITORING_UPDATE_INTERVAL`: Update frequency in seconds (default: 2)
- `APP_PORT`: Web server port (default: 8080; use 8081 for testing)
- `APP_TITLE`: Application title
- `APP_STORAGE_SECRET`: Session storage encryption key
- `APP_SHOW`: Auto-open browser on startup
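A minimal sketch of reading these variables with `python-dotenv`; only the interval and port defaults are documented above, so the `APP_TITLE` and `APP_SHOW` defaults below are assumptions.

```python
# Hedged sketch of environment configuration; only the interval and port
# defaults are documented, the other defaults are assumptions.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

UPDATE_INTERVAL = float(os.getenv('MONITORING_UPDATE_INTERVAL', '2'))
APP_PORT = int(os.getenv('APP_PORT', '8080'))
APP_TITLE = os.getenv('APP_TITLE', 'AI Model Testing')       # assumed default
APP_SHOW = os.getenv('APP_SHOW', 'false').lower() == 'true'  # assumed default
```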
## Testing & Development

- Run on port 8081 to avoid conflicts: `APP_PORT=8081 uv run python src/main.py`
- Monitor GPU detection in console logs
- Check Ollama connectivity at startup
- Use browser DevTools for WebSocket debugging
## Current Route Structure

**Core Application Routes:**

- `/` - Comprehensive system monitoring dashboard
- `/ollama` - Advanced model manager (download, test, create, manage)
- `/settings` - Application configuration and monitoring intervals

**Plugin System Routes (Auto-Generated):**

- `/example-tool` - Example tool demonstrating plugin capabilities
- `/example-tool/settings` - Tool-specific settings page
- `/example-tool/history` - Tool-specific history page
- **Dynamic Discovery**: Additional tool routes are auto-discovered from the `src/tools/` directory

**External Integrations:**

- Direct link to Open WebUI for advanced model interactions
## Tool Development Guide

**Quick Start:**

1. Create a `src/tools/my_tool/` directory
2. Add `tool.py` with a class inheriting from `BaseTool`
3. Define a routes dictionary mapping paths to page classes
4. Create page classes inheriting from `BasePage`
5. The tool automatically appears in the sidebar and its routes are registered

**Advanced Features:**

- **Context Access**: Access system monitors via `self.tool.context.system_monitor`
- **Sub-routing**: Multiple pages per tool (main, settings, config, etc.)
- **Enable/Disable**: Control tool visibility via the `enabled` property
- **Live Data**: Bind to real-time system metrics and Ollama status (see the sketch below)
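Putting these together, a tool page bound to live data might look like the following sketch; the monitor attribute names (`cpu_percent`, `active_models`) follow the patterns shown earlier but may differ from the actual monitor classes.

```python
# Hedged sketch of live data binding inside a tool page; monitor attribute
# names (cpu_percent, active_models) are assumed from the patterns above.
from nicegui import ui
from tools.base_tool import BasePage


class StatusPage(BasePage):
    async def content(self):
        system = self.tool.context.system_monitor
        # Live-updating label bound to a bindable-dataclass monitor field.
        ui.label().bind_text_from(system, 'cpu_percent',
                                  lambda v: f'CPU while testing: {v:.0f}%')
        ollama = self.tool.context.ollama_monitor
        ui.label().bind_text_from(ollama, 'active_models',
                                  lambda m: f'{len(m)} model(s) loaded')
```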
## Future Enhancements
- Local AI model testing capabilities that prioritize privacy and security
- Tools for testing model behaviors that external providers might restrict
- Advanced local prompt engineering and safety testing frameworks
- Private data processing and analysis tools using local models
- Additional testing capabilities as needs are discovered through usage