CLAUDE.md (ArchGPUFrontend, 2025-09-18)

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Context

This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as a local AI model testing environment.

Core Purpose

A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.

Main Features:

  1. Comprehensive System Monitoring - Real-time resource tracking for AI workloads

    • Live dashboard with GPU, CPU, memory, disk, and network monitoring
    • Process monitoring with real-time top processes display
    • Enhanced header with critical metrics (GPU load, VRAM, RAM, disk space)
    • Detailed tooltips showing active Ollama models
  2. Model Manager - Complete Ollama model management interface

    • Download, delete, create, and test models
    • Support for Hugging Face models via Ollama pull syntax
    • Rich model metadata display with size, quantization, context length
    • Quick in-app chat testing interface
  3. Plugin-Based Tool System - Extensible framework for AI testing tools

    • Auto-discovery of tools from src/tools/ directory
    • Each tool can have multiple sub-pages with routing
    • Tools have access to system monitors via ToolContext
    • Enable/disable tools via simple property override
  4. External Integrations - Quick access to related services

    • Direct link to Open WebUI for advanced model interactions

Development Commands

Running the Application

# Install dependencies
uv sync

# Run the development server (use port 8081 for testing as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by main instance
uv run python src/main.py

Dependency Management

# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync

Architecture Overview

Technology Stack

  • Package Manager: uv (version 0.8.17)
  • UI Framework: NiceGUI (async web framework based on FastAPI/Vue.js)
  • Python Version: 3.13+
  • Ollama API: Running on localhost:11434
  • Dependencies:
    • nicegui - Main UI framework
    • niceguiasyncelement - Custom async component framework (from git)
    • psutil - System monitoring
    • httpx - Async HTTP client for Ollama API
    • python-dotenv - Environment configuration

Project Structure

src/
├── main.py              # Entry point, NiceGUI app configuration with all routes
├── pages/               # Core page components
│   ├── dashboard.py     # Comprehensive system monitoring dashboard
│   └── ollama_manager.py # Ollama model management interface (AsyncColumn)
├── components/          # Reusable UI components
│   ├── header.py        # Enhanced header with critical metrics and tooltips
│   ├── sidebar.py       # Navigation sidebar with auto-populated tools
│   ├── bottom_nav.py    # Mobile bottom navigation
│   ├── ollama_downloader.py # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py # Model creation component (AsyncCard)
│   └── ollama_quick_test.py # Model testing component (AsyncCard)
├── tools/               # Plugin system for extensible tools
│   ├── __init__.py      # Auto-discovery and tool registry
│   ├── base_tool.py     # BaseTool and BasePage classes, ToolContext
│   └── example_tool/    # Example tool demonstrating plugin system
│       ├── __init__.py
│       └── tool.py      # ExampleTool with main, settings, history pages
├── utils/               # Utility modules
│   ├── gpu_monitor.py   # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py # Comprehensive system resource monitoring
│   ├── ollama_monitor.py # Ollama status and active models monitoring
│   └── ollama.py        # Ollama API client functions
└── static/              # Static assets (CSS, images)
    └── style.css        # Custom dark theme styles

Key Design Patterns

  1. Plugin Architecture: Extensible tool system with auto-discovery

    • Tools are auto-discovered from src/tools/ directory
    • Each tool inherits from BaseTool and defines routes for sub-pages
    • Tools can be enabled/disabled via simple property override
    • Sub-routes support: tools can have multiple pages (main, settings, etc.)
  2. Async Components: Uses custom niceguiasyncelement framework

    • BasePage(AsyncColumn) for consistent tool page structure
    • AsyncCard base classes for complex components
    • All tool pages inherit from BasePage to eliminate boilerplate
  3. Context Pattern: Shared resource access via ToolContext

    • ToolContext provides access to system monitors from any tool
    • Global context initialized in main.py and accessible via tool.context
    • Clean separation between tools and system resources
  4. Bindable Dataclasses: Monitor classes use @binding.bindable_dataclass

    • Real-time UI updates with 2-second refresh intervals
    • SystemMonitor, GPUMonitor, OllamaMonitor for live data
  5. Enhanced Header: Critical metrics display with detailed tooltips

    • GPU load, VRAM usage, system RAM, disk space badges
    • Active model tooltip with detailed model information
    • Clean metric formatting with proper units
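The "proper units" formatting mentioned above can be sketched as a small helper. This is a hypothetical illustration, not the app's actual formatter:

```python
def fmt_bytes(n: float) -> str:
    """Format a byte count with binary units, as used for VRAM/RAM/disk badges."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024

print(fmt_bytes(17_179_869_184))  # 16.0 GiB
```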

Component Architecture

Monitor Classes (Supporting AI Testing)

  • SystemMonitor: Tracks system resources during AI model testing

    • CPU usage during model inference
    • Memory consumption by loaded models
    • Disk I/O for model loading
    • Process statistics for Ollama and GPU processes
  • GPUMonitor: Critical for AI workload monitoring

    • Auto-detects AMD/NVIDIA GPUs
    • Tracks GPU usage during model inference
    • Memory usage by loaded models
    • Temperature monitoring during extended testing
    • Power draw under AI workloads
  • OllamaMonitor: Core service monitoring

    • Ollama service status and version
    • Currently loaded/active models
    • Real-time model state tracking
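The monitor pattern above can be sketched with a plain dataclass. In the app itself these classes are decorated with NiceGUI's @binding.bindable_dataclass so bound widgets refresh automatically; the field names here are illustrative, not the real API:

```python
import os
import time
from dataclasses import dataclass

# Stdlib stand-in for the app's monitor classes (hypothetical fields).
@dataclass
class SystemMonitorSketch:
    cpu_load_1m: float = 0.0
    updated_at: float = 0.0

    def update(self) -> None:
        # os.getloadavg() is POSIX-only; the real monitor uses psutil
        self.cpu_load_1m = os.getloadavg()[0]
        self.updated_at = time.time()

monitor = SystemMonitorSketch()
monitor.update()  # in the app, a timer calls this every 2 seconds
print(f"1-min load: {monitor.cpu_load_1m:.2f}")
```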

UI Components

  • MetricCircle: Small circular progress indicator with icon
  • LargeMetricCircle: Large circular progress for primary metrics
  • ColorfulMetricCard: Action cards with gradient backgrounds
  • Sidebar: Navigation menu with updated structure:
    • Main: Dashboard, System Overview
    • Tools: Censor (content filtering)
    • Bottom: Model Manager, Settings
  • Header: Top bar with system status indicators

Ollama-Specific Components (AsyncCard-based):

  • OllamaDownloaderComponent: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
  • OllamaModelCreationComponent: Custom model creation from Modelfile
  • ModelQuickTestComponent: Interactive model testing interface

Ollama Integration

The Ollama API client (src/utils/ollama.py) provides async functions:

  • status(): Check if Ollama is online and get version
  • available_models(): List installed models with detailed metadata
  • active_models(): Get currently loaded/running models
  • delete_model(): Remove a model
  • model_info(): Get detailed model information and Modelfile
  • stream_chat(): Stream chat responses

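As a rough illustration of what these calls do under the hood (the real client is async and uses httpx; this synchronous stdlib sketch only mirrors the endpoints, and the example response is indicative):

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/api"

def _get(endpoint: str) -> dict:
    # src/utils/ollama.py uses httpx.AsyncClient; this sketch hits the
    # same endpoints synchronously with the standard library.
    with urllib.request.urlopen(f"{OLLAMA_BASE}/{endpoint}", timeout=5) as resp:
        return json.load(resp)

def status() -> dict:
    return _get("version")  # e.g. {"version": "..."} when Ollama is up

def available_models() -> list[dict]:
    return _get("tags").get("models", [])
```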
Tools Plugin System

The application features an extensible plugin system for AI testing tools.

Creating a New Tool

  1. Create the tool directory: src/tools/my_tool/
  2. Create the tool class: src/tools/my_tool/tool.py

```python
from typing import Awaitable, Callable, Dict

from nicegui import ui

from tools.base_tool import BaseTool, BasePage


class MyTool(BaseTool):
    @property
    def name(self) -> str:
        return "My Tool"

    @property
    def description(self) -> str:
        return "Description of what this tool does"

    @property
    def icon(self) -> str:
        return "build"  # Material icon name

    @property
    def enabled(self) -> bool:
        return True  # Set to False to disable

    @property
    def routes(self) -> Dict[str, Callable[[], Awaitable]]:
        return {
            '': lambda: MainPage().create(self),
            '/settings': lambda: SettingsPage().create(self),
        }


class MainPage(BasePage):
    async def content(self):
        # Access system monitors via context
        cpu_usage = self.tool.context.system_monitor.cpu_percent
        active_models = self.tool.context.ollama_monitor.active_models

        # Your tool UI here
        ui.label(f"CPU: {cpu_usage}%")
```

Tool Features:

  • Auto-discovery: Tools are automatically found and loaded
  • Sub-routes: Tools can have multiple pages (/, /settings, /history, etc.)
  • Context Access: Access to system monitors via self.tool.context
  • Enable/Disable: Control tool visibility via the enabled property
  • Consistent Layout: BasePage handles the standard layout structure

AI Model Testing Features:

  • Model Discovery & Management:

    • Browse and pull models from Ollama library
    • Support for HuggingFace models via Ollama syntax
    • Rich metadata display (size, quantization, parameters, format)
    • Time tracking for model versions
  • Testing Capabilities:

    • Quick chat interface for immediate model testing
    • Model information and Modelfile inspection
    • Custom model creation from Modelfiles
    • Real-time resource monitoring during inference
  • Testing Tools:

    • Censor tool for output filtering analysis
    • Extensible framework for adding new testing tools

API endpoints at http://localhost:11434/api/:

  • /api/version: Get Ollama version
  • /api/tags: List available models
  • /api/pull: Download models
  • /api/delete: Remove models
  • /api/generate: Generate text
  • /api/chat: Chat completion
  • /api/ps: List running models
  • /api/show: Show model details

System Monitoring

GPU Monitoring Strategy

The application uses a hierarchical approach for GPU monitoring:

  1. NVIDIA GPUs (via nvidia-smi):

    • Temperature, usage, memory, power draw
    • CUDA version and driver info
    • Multi-GPU support
  2. AMD GPUs (multiple fallbacks):

    • Primary: rocm-smi for full metrics
    • Fallback: /sys/class/drm filesystem
    • Reads hwmon for temperature data
    • Supports both server and consumer GPUs
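The sysfs fallback can be sketched roughly as follows. Paths follow the /sys/class/drm layout described above; the exact hwmon file names vary by driver, so treat this as an assumption-laden illustration:

```python
from pathlib import Path

def amd_gpu_temps(drm_root: str = "/sys/class/drm") -> dict[str, float]:
    """Sketch of the hwmon fallback: read GPU temperatures from sysfs.

    Values in temp*_input files are millidegrees Celsius. Returns an
    empty dict on machines without a matching GPU.
    """
    temps: dict[str, float] = {}
    for hwmon in Path(drm_root).glob("card*/device/hwmon/hwmon*"):
        temp_file = hwmon / "temp1_input"
        if temp_file.exists():
            card = hwmon.parent.parent.parent.name  # e.g. "card0"
            temps[card] = int(temp_file.read_text()) / 1000  # 45000 -> 45.0
    return temps

print(amd_gpu_temps())  # empty dict if no GPU exposes hwmon data
```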

CPU & System Monitoring

  • Real-time CPU usage and per-core statistics
  • Memory (RAM and swap) usage
  • Disk usage and I/O statistics
  • Network traffic monitoring
  • Process tracking with top processes by CPU/memory
  • System uptime and kernel information
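The top-processes tracking above relies on psutil in the application itself; reading /proc directly (Linux only) just illustrates where the numbers come from. A minimal stdlib sketch:

```python
import os

def top_processes_by_rss(limit: int = 5) -> list[tuple[int, str, int]]:
    """Return (pid, name, rss_bytes) for the biggest resident processes."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    procs = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/statm") as f:
                rss_pages = int(f.read().split()[1])  # resident pages
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
        except (OSError, ValueError, IndexError):
            continue  # process exited or is unreadable
        procs.append((int(pid), name, rss_pages * page_size))
    return sorted(procs, key=lambda p: p[2], reverse=True)[:limit]

for pid, name, rss in top_processes_by_rss():
    print(f"{pid:>7}  {name:<20}  {rss / 2**20:8.1f} MiB")
```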

UI/UX Features

Dark Theme

Custom dark theme with:

  • Background: #1a1d2e (main), #252837 (sidebar)
  • Card backgrounds: rgba(26, 29, 46, 0.7) with backdrop blur
  • Accent colors: Cyan (#06b6d4) for primary actions
  • Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)

Responsive Design

  • Desktop: Full sidebar navigation
  • Mobile: Bottom navigation bar
  • Adaptive grid layouts for different screen sizes
  • Viewport-aware content scaling

Real-time Updates

  • System metrics update every 2 seconds (configurable via MONITORING_UPDATE_INTERVAL)
  • Live data binding for all metrics
  • Smooth transitions and animations

Enhanced Dashboard Features

The dashboard provides comprehensive real-time monitoring specifically designed for AI workload testing:

Primary Monitoring Sections:

  • GPU Performance: Large circular progress for GPU load, VRAM usage bar, temperature & power draw
  • CPU & Memory: Dual circular progress with detailed specs and frequency info
  • Ollama Service: Live status, version, and grid display of active models with metadata
  • Storage & Network: Disk usage bars and real-time network I/O monitoring
  • Process Monitoring: Live table of top processes with CPU%, memory usage, and status
  • System Information: OS details, uptime, load average, hardware specifications

Header Enhancements:

  • Critical Metrics Badges: GPU load, VRAM usage, system RAM, disk space with live updates
  • Active Models Tooltip: Detailed grid showing running models with context length, size, VRAM usage
  • Live Status Indicators: Ollama service status with version information

NiceGUI Patterns

  • Plugin-Based Routing: Tools auto-register their routes with sub-page support
  • Context Pattern: Shared monitor access via tool.context for all plugins
  • BasePage Pattern: Consistent tool page structure with BasePage(AsyncColumn)
  • Data Binding: Reactive UI updates with bind_text_from() and bind_value_from()
  • Async Components: niceguiasyncelement framework with @ui.refreshable decorators
  • Timer Updates: 2-second intervals for real-time monitoring data
  • Dark Mode: Comprehensive dark theme with custom metric colors

Environment Variables

Configured in .env:

  • MONITORING_UPDATE_INTERVAL: Update frequency in seconds (default: 2)
  • APP_PORT: Web server port (default: 8080, use 8081 for testing)
  • APP_TITLE: Application title
  • APP_STORAGE_SECRET: Session storage encryption key
  • APP_SHOW: Auto-open browser on startup
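Putting these together, a .env for local testing might look like the following; everything except the documented defaults (interval 2, port 8081 for testing) is a placeholder value, not the project's real configuration:

```shell
# .env — placeholder values except where the docs state defaults
MONITORING_UPDATE_INTERVAL=2
APP_PORT=8081                  # 8080 is usually taken by the main instance
APP_TITLE="Arch GPU Frontend"  # placeholder title
APP_STORAGE_SECRET=change-me   # placeholder; use a real random secret
APP_SHOW=false                 # set to true to auto-open the browser
```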

Testing & Development

  • Run on port 8081 to avoid conflicts: APP_PORT=8081 uv run python src/main.py
  • Monitor GPU detection in console logs
  • Check Ollama connectivity at startup
  • Use browser DevTools for WebSocket debugging

Current Route Structure

Core Application Routes:

  • / - Comprehensive system monitoring dashboard
  • /ollama - Advanced model manager (download, test, create, manage)
  • /settings - Application configuration and monitoring intervals

Plugin System Routes (Auto-Generated):

  • /example-tool - Example tool demonstrating plugin capabilities
  • /example-tool/settings - Tool-specific settings page
  • /example-tool/history - Tool-specific history page
  • Dynamic Discovery: Additional tool routes auto-discovered from src/tools/ directory

External Integrations:

  • Direct link to Open WebUI for advanced model interactions

Tool Development Guide

Quick Start:

  1. Create src/tools/my_tool/ directory
  2. Add tool.py with class inheriting from BaseTool
  3. Define routes dictionary mapping paths to page classes
  4. Create page classes inheriting from BasePage
  5. Tool automatically appears in sidebar and routes are registered

Advanced Features:

  • Context Access: Access system monitors via self.tool.context.system_monitor
  • Sub-routing: Multiple pages per tool (main, settings, config, etc.)
  • Enable/Disable: Control tool visibility via enabled property
  • Live Data: Bind to real-time system metrics and Ollama status

Future Enhancements

  • Local AI model testing capabilities that prioritize privacy and security
  • Tools for testing model behaviors that external providers might restrict
  • Advanced local prompt engineering and safety testing frameworks
  • Private data processing and analysis tools using local models
  • Additional testing capabilities as needs are discovered through usage