# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Context
This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as an AI model testing environment.
### Core Purpose
A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.
### Main Features

- **Comprehensive System Monitoring** - Real-time resource tracking for AI workloads
  - Live dashboard with GPU, CPU, memory, disk, and network monitoring
  - Process monitoring with real-time top-processes display
  - Enhanced header with critical metrics (GPU load, VRAM, RAM, disk space)
  - Detailed tooltips showing active Ollama models
- **Model Manager** - Complete Ollama model management interface
  - Download, delete, create, and test models
  - Support for Hugging Face models via Ollama pull syntax
  - Rich model metadata display with size, quantization, and context length
  - Quick in-app chat testing interface
- **Plugin-Based Tool System** - Extensible framework for AI testing tools
  - Auto-discovery of tools from the `src/tools/` directory
  - Each tool can have multiple sub-pages with routing
  - Tools have access to system monitors via `ToolContext`
  - Enable/disable tools via a simple property override
- **External Integrations** - Quick access to related services
  - Direct link to Open WebUI for advanced model interactions
## Development Commands

### Running the Application
```bash
# Install dependencies
uv sync

# Run the development server (use port 8081 for testing, as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by the main instance
uv run python src/main.py
```
### Dependency Management
```bash
# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync
```
## Architecture Overview

### Technology Stack
- **Package Manager**: uv (version 0.8.17)
- **UI Framework**: NiceGUI (async web framework based on FastAPI/Vue.js)
- **Python Version**: 3.13+
- **Ollama API**: Running on localhost:11434
- **Dependencies**:
  - `nicegui` - Main UI framework
  - `niceguiasyncelement` - Custom async component framework (from git)
  - `psutil` - System monitoring
  - `httpx` - Async HTTP client for the Ollama API
  - `python-dotenv` - Environment configuration
### Project Structure
```
src/
├── main.py                       # Entry point, NiceGUI app configuration with all routes
├── pages/                        # Core page components
│   ├── dashboard.py              # Comprehensive system monitoring dashboard
│   └── ollama_manager.py         # Ollama model management interface (AsyncColumn)
├── components/                   # Reusable UI components
│   ├── header.py                 # Enhanced header with critical metrics and tooltips
│   ├── sidebar.py                # Navigation sidebar with auto-populated tools
│   ├── bottom_nav.py             # Mobile bottom navigation
│   ├── ollama_downloader.py      # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py  # Model creation component (AsyncCard)
│   └── ollama_quick_test.py      # Model testing component (AsyncCard)
├── tools/                        # Plugin system for extensible tools
│   ├── __init__.py               # Auto-discovery and tool registry
│   ├── base_tool.py              # BaseTool and BasePage classes, ToolContext
│   └── example_tool/             # Example tool demonstrating the plugin system
│       ├── __init__.py
│       └── tool.py               # ExampleTool with main, settings, history pages
├── utils/                        # Utility modules
│   ├── gpu_monitor.py            # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py         # Comprehensive system resource monitoring
│   ├── ollama_monitor.py         # Ollama status and active-models monitoring
│   └── ollama.py                 # Ollama API client functions
└── static/                       # Static assets (CSS, images)
    └── style.css                 # Custom dark theme styles
```
### Key Design Patterns
- **Plugin Architecture**: Extensible tool system with auto-discovery
  - Tools are auto-discovered from the `src/tools/` directory
  - Each tool inherits from `BaseTool` and defines routes for sub-pages
  - Tools can be enabled/disabled via a simple property override
  - Sub-route support: tools can have multiple pages (main, settings, etc.)
- **Async Components**: Uses the custom `niceguiasyncelement` framework
  - `BasePage(AsyncColumn)` for consistent tool page structure
  - `AsyncCard` base classes for complex components
  - All tool pages inherit from `BasePage` to eliminate boilerplate
- **Context Pattern**: Shared resource access via `ToolContext`
  - `ToolContext` provides access to system monitors from any tool
  - Global context initialized in main.py and accessible via `tool.context`
  - Clean separation between tools and system resources
- **Bindable Dataclasses**: Monitor classes use `@binding.bindable_dataclass` (see the sketch after this list)
  - Real-time UI updates with 2-second refresh intervals
  - `SystemMonitor`, `GPUMonitor`, and `OllamaMonitor` for live data
- **Enhanced Header**: Critical metrics display with detailed tooltips
  - GPU load, VRAM usage, system RAM, and disk space badges
  - Active-models tooltip with detailed model information
  - Clean metric formatting with proper units
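The bindable-dataclass pattern can be sketched as follows. This is a minimal illustration, not the repository's actual `SystemMonitor`; `DemoMonitor` and its fields are hypothetical stand-ins.

```python
# Minimal sketch of the bindable-dataclass monitor pattern; DemoMonitor and
# its fields are illustrative stand-ins, not the repository's SystemMonitor.
import psutil
from nicegui import binding, ui


@binding.bindable_dataclass
class DemoMonitor:
    cpu_percent: float = 0.0
    memory_percent: float = 0.0

    def refresh(self) -> None:
        # Sample current resource usage; bound UI elements update automatically.
        self.cpu_percent = psutil.cpu_percent()
        self.memory_percent = psutil.virtual_memory().percent


monitor = DemoMonitor()
ui.timer(2.0, monitor.refresh)  # the dashboard's 2-second refresh interval
ui.label().bind_text_from(monitor, 'cpu_percent', lambda v: f'CPU: {v:.0f}%')
```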
### Component Architecture

#### Monitor Classes (Supporting AI Testing)
- **SystemMonitor**: Tracks system resources during AI model testing
  - CPU usage during model inference
  - Memory consumption by loaded models
  - Disk I/O for model loading
  - Process statistics for Ollama and GPU processes
- **GPUMonitor**: Critical for AI workload monitoring
  - Auto-detects AMD/NVIDIA GPUs
  - Tracks GPU usage during model inference
  - Memory usage by loaded models
  - Temperature monitoring during extended testing
  - Power draw under AI workloads
- **OllamaMonitor**: Core service monitoring
  - Ollama service status and version
  - Currently loaded/active models
  - Real-time model state tracking
#### UI Components

- **MetricCircle**: Small circular progress indicator with icon
- **LargeMetricCircle**: Large circular progress for primary metrics
- **ColorfulMetricCard**: Action cards with gradient backgrounds
- **Sidebar**: Navigation menu with the following structure:
  - Main: Dashboard, System Overview
  - Tools: Censor (content filtering)
  - Bottom: Model Manager, Settings
- **Header**: Top bar with system status indicators

**Ollama-Specific Components (AsyncCard-based):**

- **OllamaDownloaderComponent**: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
- **OllamaModelCreationComponent**: Custom model creation from a Modelfile
- **ModelQuickTestComponent**: Interactive model testing interface
## Ollama Integration

The Ollama API client (`src/utils/ollama.py`) provides async functions:

- `status()`: Check whether Ollama is online and get its version
- `available_models()`: List installed models with detailed metadata
- `active_models()`: Get currently loaded/running models
- `delete_model()`: Remove a model
- `model_info()`: Get detailed model information and the Modelfile
- `stream_chat()`: Stream chat responses
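As a rough illustration (not the repository's exact implementation), two of these functions might wrap the Ollama HTTP API with `httpx` like this; the return shapes and error handling are assumptions based on Ollama's documented endpoints:

```python
# Hedged sketch of the async Ollama client; error handling and return shapes
# are assumptions, not the repository's exact implementation.
import httpx

OLLAMA_URL = 'http://localhost:11434'


async def status() -> str | None:
    """Return the Ollama version string, or None if the service is offline."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(f'{OLLAMA_URL}/api/version')
            response.raise_for_status()
            return response.json().get('version')
    except httpx.HTTPError:
        return None


async def available_models() -> list[dict]:
    """List installed models with their metadata from /api/tags."""
    async with httpx.AsyncClient() as client:
        response = await client.get(f'{OLLAMA_URL}/api/tags')
        response.raise_for_status()
        return response.json().get('models', [])
```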
## Tools Plugin System

The application features an extensible plugin system for AI testing tools.

### Creating a New Tool

1. Create the tool directory: `src/tools/my_tool/`
2. Create the tool class: `src/tools/my_tool/tool.py`

```python
from typing import Dict, Callable, Awaitable

from nicegui import ui
from tools.base_tool import BaseTool, BasePage


class MyTool(BaseTool):
    @property
    def name(self) -> str:
        return "My Tool"

    @property
    def description(self) -> str:
        return "Description of what this tool does"

    @property
    def icon(self) -> str:
        return "build"  # Material icon name

    @property
    def enabled(self) -> bool:
        return True  # Set to False to disable

    @property
    def routes(self) -> Dict[str, Callable[[], Awaitable]]:
        return {
            '': lambda: MainPage().create(self),
            '/settings': lambda: SettingsPage().create(self),
        }


class MainPage(BasePage):
    async def content(self):
        # Access system monitors via context
        cpu_usage = self.tool.context.system_monitor.cpu_percent
        active_models = self.tool.context.ollama_monitor.active_models

        # Your tool UI here
        ui.label(f"CPU: {cpu_usage}%")


class SettingsPage(BasePage):
    async def content(self):
        ui.label("Tool settings go here")
```

### Tool Features

- **Auto-discovery**: Tools are automatically found and loaded
- **Sub-routes**: Tools can have multiple pages (`/`, `/settings`, `/history`, etc.)
- **Context Access**: Access to system monitors via `self.tool.context`
- **Enable/Disable**: Control tool visibility via the `enabled` property
- **Consistent Layout**: `BasePage` handles the standard layout structure

### AI Model Testing Features
- **Model Discovery & Management**:
  - Browse and pull models from the Ollama library
  - Support for Hugging Face models via Ollama syntax
  - Rich metadata display (size, quantization, parameters, format)
  - Time tracking for model versions
- **Testing Capabilities**:
  - Quick chat interface for immediate model testing
  - Model information and Modelfile inspection
  - Custom model creation from Modelfiles
  - Real-time resource monitoring during inference
- **Testing Tools**:
  - Censor tool for output-filtering analysis
  - Extensible framework for adding new testing tools
API endpoints at `http://localhost:11434/api/`:

- `/api/version`: Get Ollama version
- `/api/tags`: List available models
- `/api/pull`: Download models
- `/api/delete`: Remove models
- `/api/generate`: Generate text
- `/api/chat`: Chat completion
- `/api/ps`: List running models
- `/api/show`: Show model details
## System Monitoring

### GPU Monitoring Strategy

The application uses a hierarchical approach to GPU monitoring (see the sketch below):

- **NVIDIA GPUs** (via `nvidia-smi`):
  - Temperature, usage, memory, and power draw
  - CUDA version and driver info
  - Multi-GPU support
- **AMD GPUs** (multiple fallbacks):
  - Primary: `rocm-smi` for full metrics
  - Fallback: the `/sys/class/drm` filesystem
  - Reads hwmon for temperature data
  - Supports both server and consumer GPUs
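The fallback order can be sketched as below; `detect_gpu_backend` is an illustrative helper, and the actual `gpu_monitor.py` may structure detection differently.

```python
# Illustrative sketch of the hierarchical GPU detection order described above;
# the actual gpu_monitor.py may differ.
import shutil
from pathlib import Path


def detect_gpu_backend() -> str:
    """Pick the best available GPU metrics source, in priority order."""
    if shutil.which('nvidia-smi'):
        return 'nvidia-smi'  # NVIDIA: temperature, usage, memory, power
    if shutil.which('rocm-smi'):
        return 'rocm-smi'    # AMD primary: full metrics
    if any(Path('/sys/class/drm').glob('card*/device/hwmon')):
        return 'sysfs'       # AMD fallback: read hwmon for temperature
    return 'none'
```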
### CPU & System Monitoring
- Real-time CPU usage and per-core statistics
- Memory (RAM and swap) usage
- Disk usage and I/O statistics
- Network traffic monitoring
- Process tracking with top processes by CPU/memory
- System uptime and kernel information
## UI/UX Features

### Dark Theme
Custom dark theme with:

- Background: `#1a1d2e` (main), `#252837` (sidebar)
- Card backgrounds: `rgba(26, 29, 46, 0.7)` with backdrop blur
- Accent colors: Cyan (`#06b6d4`) for primary actions
- Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)
### Responsive Design
- Desktop: Full sidebar navigation
- Mobile: Bottom navigation bar
- Adaptive grid layouts for different screen sizes
- Viewport-aware content scaling
### Real-time Updates

- System metrics update every 2 seconds (configurable via `MONITORING_UPDATE_INTERVAL`)
- Live data binding for all metrics
- Smooth transitions and animations
### Enhanced Dashboard Features
The dashboard provides comprehensive real-time monitoring specifically designed for AI workload testing:
**Primary Monitoring Sections:**
- **GPU Performance**: Large circular progress for GPU load, VRAM usage bar, temperature & power draw
- **CPU & Memory**: Dual circular progress with detailed specs and frequency info
- **Ollama Service**: Live status, version, and grid display of active models with metadata
- **Storage & Network**: Disk usage bars and real-time network I/O monitoring
- **Process Monitoring**: Live table of top processes with CPU%, memory usage, and status
- **System Information**: OS details, uptime, load average, hardware specifications
**Header Enhancements:**
- **Critical Metrics Badges**: GPU load, VRAM usage, system RAM, disk space with live updates
- **Active Models Tooltip**: Detailed grid showing running models with context length, size, and VRAM usage
- **Live Status Indicators**: Ollama service status with version information
## NiceGUI Patterns

- **Plugin-Based Routing**: Tools auto-register their routes with sub-page support
- **Context Pattern**: Shared monitor access via `tool.context` for all plugins
- **BasePage Pattern**: Consistent tool page structure with `BasePage(AsyncColumn)`
- **Data Binding**: Reactive UI updates with `bind_text_from()` and `bind_value_from()`
- **Async Components**: `niceguiasyncelement` framework with `@ui.refreshable` decorators (see the sketch after this list)
- **Timer Updates**: 2-second intervals for real-time monitoring data
- **Dark Mode**: Comprehensive dark theme with custom metric colors
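The `@ui.refreshable` polling pattern referenced above looks roughly like this; the sketch polls the repository's `active_models()` client function, whose exact return shape is assumed here.

```python
# Sketch of the @ui.refreshable polling pattern; the shape of the value
# returned by active_models() is an assumption.
from nicegui import ui
from utils.ollama import active_models

models: list = []


@ui.refreshable
def model_list() -> None:
    # This section is torn down and rebuilt on every model_list.refresh().
    if not models:
        ui.label('No models loaded')
    for model in models:
        ui.label(str(model))


model_list()


async def poll() -> None:
    models[:] = await active_models()
    model_list.refresh()


ui.timer(2.0, poll)  # matches the 2-second monitoring interval
```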
## Environment Variables

Configured in `.env`:

- `MONITORING_UPDATE_INTERVAL`: Update frequency in seconds (default: 2)
- `APP_PORT`: Web server port (default: 8080; use 8081 for testing)
- `APP_TITLE`: Application title
- `APP_STORAGE_SECRET`: Session storage encryption key
- `APP_SHOW`: Auto-open browser on startup
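A minimal sketch of reading these variables with `python-dotenv`; only the interval and port defaults are documented above, so the `APP_TITLE` and `APP_SHOW` defaults below are assumptions.

```python
# Hedged sketch of environment configuration; only the interval and port
# defaults are documented, the other defaults are assumptions.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

UPDATE_INTERVAL = float(os.getenv('MONITORING_UPDATE_INTERVAL', '2'))
APP_PORT = int(os.getenv('APP_PORT', '8080'))
APP_TITLE = os.getenv('APP_TITLE', 'AI Model Testing')       # assumed default
APP_SHOW = os.getenv('APP_SHOW', 'false').lower() == 'true'  # assumed default
```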
## Testing & Development

- Run on port 8081 to avoid conflicts: `APP_PORT=8081 uv run python src/main.py`
- Monitor GPU detection in console logs
- Check Ollama connectivity at startup
- Use browser DevTools for WebSocket debugging
## Current Route Structure

**Core Application Routes:**

- `/` - Comprehensive system monitoring dashboard
- `/ollama` - Advanced model manager (download, test, create, manage)
- `/settings` - Application configuration and monitoring intervals

**Plugin System Routes (Auto-Generated):**

- `/example-tool` - Example tool demonstrating plugin capabilities
- `/example-tool/settings` - Tool-specific settings page
- `/example-tool/history` - Tool-specific history page
- **Dynamic Discovery**: Additional tool routes are auto-discovered from the `src/tools/` directory

**External Integrations:**

- Direct link to Open WebUI for advanced model interactions
## Tool Development Guide

**Quick Start:**

1. Create a `src/tools/my_tool/` directory
2. Add `tool.py` with a class inheriting from `BaseTool`
3. Define a routes dictionary mapping paths to page classes
4. Create page classes inheriting from `BasePage`
5. The tool automatically appears in the sidebar and its routes are registered

**Advanced Features:**

- **Context Access**: Access system monitors via `self.tool.context.system_monitor`
- **Sub-routing**: Multiple pages per tool (main, settings, config, etc.)
- **Enable/Disable**: Control tool visibility via the `enabled` property
- **Live Data**: Bind to real-time system metrics and Ollama status (see the sketch below)
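Putting these together, a tool page bound to live data might look like the following sketch; the monitor attribute names (`cpu_percent`, `active_models`) follow the patterns shown earlier but may differ from the actual monitor classes.

```python
# Hedged sketch of live data binding inside a tool page; monitor attribute
# names (cpu_percent, active_models) are assumed from the patterns above.
from nicegui import ui
from tools.base_tool import BasePage


class StatusPage(BasePage):
    async def content(self):
        system = self.tool.context.system_monitor
        # Live-updating label bound to a bindable-dataclass monitor field.
        ui.label().bind_text_from(system, 'cpu_percent',
                                  lambda v: f'CPU while testing: {v:.0f}%')
        ollama = self.tool.context.ollama_monitor
        ui.label().bind_text_from(ollama, 'active_models',
                                  lambda m: f'{len(m)} model(s) loaded')
```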
## Future Enhancements
- Local AI model testing capabilities that prioritize privacy and security
- Tools for testing model behaviors that external providers might restrict
- Advanced local prompt engineering and safety testing frameworks
- Private data processing and analysis tools using local models
- Additional testing capabilities as needs are discovered through usage