CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Context

This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as a local AI model testing environment.

Core Purpose

A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.

Main Features:

  1. Model Manager - Complete Ollama model management interface

    • Download, delete, create, and test models
    • Support for Hugging Face models via Ollama pull syntax
    • Rich model metadata display
    • Quick in-app chat testing
  2. System Monitoring - Resource tracking for AI workloads

    • Real-time GPU monitoring (AMD/NVIDIA) to track model performance
    • CPU and memory usage during model inference
    • System metrics dashboard
  3. AI Testing Tools:

    • Censor - Text content filtering/censoring tool for testing AI outputs
    • Additional testing tools to be added as needed
  4. Settings - Application configuration and refresh intervals

Development Commands

Running the Application

# Install dependencies
uv sync

# Run the development server (use port 8081 for testing as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by main instance
uv run python src/main.py

Dependency Management

# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync

Architecture Overview

Technology Stack

  • Package Manager: uv (version 0.8.17)
  • UI Framework: NiceGUI (async web framework based on FastAPI/Vue.js)
  • Python Version: 3.13+
  • Ollama API: Running on localhost:11434
  • Dependencies:
    • nicegui - Main UI framework
    • niceguiasyncelement - Custom async component framework (from git)
    • psutil - System monitoring
    • httpx - Async HTTP client for Ollama API
    • python-dotenv - Environment configuration

Project Structure

src/
├── main.py              # Entry point, NiceGUI app configuration with all routes
├── pages/               # Page components (inheriting NiceGUI elements)
│   ├── dashboard.py     # Main dashboard with system metrics
│   ├── ollama_manager.py # Ollama model management interface (AsyncColumn)
│   ├── system_overview.py # System information page
│   └── welcome.py       # Welcome/landing page
├── components/          # Reusable UI components
│   ├── circular_progress.py # Circular progress indicators
│   ├── header.py        # App header with live status
│   ├── sidebar.py       # Navigation sidebar with updated menu structure
│   ├── bottom_nav.py    # Mobile bottom navigation
│   ├── ollama_downloader.py # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py # Model creation component (AsyncCard)
│   └── ollama_quick_test.py # Model testing component (AsyncCard)
├── utils/               # Utility modules
│   ├── gpu_monitor.py   # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py # System resource monitoring
│   ├── ollama_monitor.py # Ollama status monitoring (bindable dataclass)
│   └── ollama.py        # Ollama API client functions
└── static/              # Static assets (CSS, images)
    └── style.css        # Custom dark theme styles

Key Design Patterns

  1. Async Components: Uses custom niceguiasyncelement framework for async page/component construction
    • AsyncColumn, AsyncCard base classes for complex components
    • OllamaManagerPage(AsyncColumn) for full page async initialization
    • Async component dialogs with await component.create() pattern
  2. Bindable Dataclasses: Monitor classes use @binding.bindable_dataclass for reactive data binding
    • SystemMonitor, GPUMonitor, OllamaMonitor for real-time data updates
  3. Environment Configuration: All app settings are managed via a .env file loaded with python-dotenv
  4. Centralized Routing: All routes defined in main.py with layout creation pattern
  5. Real-time Updates: Timer-based updates every 2 seconds for all monitor instances
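The bindable-dataclass monitor pattern above can be sketched roughly as follows. The real monitors use NiceGUI's @binding.bindable_dataclass so field changes propagate to bound UI elements; this stand-in uses a plain dataclass and stdlib calls so it runs without NiceGUI installed. The field and method names here are illustrative assumptions, not the project's actual API.

```python
# Illustrative sketch of the monitor pattern: a dataclass holds the
# current readings and an update() method refreshes them; in the app,
# a timer calls update() every 2 seconds and bound UI elements react.
# Names are assumptions; the real class uses @binding.bindable_dataclass.
import os
from dataclasses import dataclass, field


@dataclass
class SystemMonitorSketch:
    load_avg_1m: float = 0.0
    cpu_count: int = field(default_factory=lambda: os.cpu_count() or 1)

    def update(self) -> None:
        # Called periodically by the app's timer (every 2 s by default).
        self.load_avg_1m = os.getloadavg()[0]


monitor = SystemMonitorSketch()
monitor.update()
print(monitor.load_avg_1m)
```

In the application, the same instance is shared with UI components via bind_text_from()/bind_value_from(), so a single timer-driven update() refreshes every widget that observes the dataclass.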

Component Architecture

Monitor Classes (Supporting AI Testing)

  • SystemMonitor: Tracks system resources during AI model testing

    • CPU usage during model inference
    • Memory consumption by loaded models
    • Disk I/O for model loading
    • Process statistics for Ollama and GPU processes
  • GPUMonitor: Critical for AI workload monitoring

    • Auto-detects AMD/NVIDIA GPUs
    • Tracks GPU usage during model inference
    • Memory usage by loaded models
    • Temperature monitoring during extended testing
    • Power draw under AI workloads
  • OllamaMonitor: Core service monitoring

    • Ollama service status and version
    • Currently loaded/active models
    • Real-time model state tracking

UI Components

  • MetricCircle: Small circular progress indicator with icon
  • LargeMetricCircle: Large circular progress for primary metrics
  • ColorfulMetricCard: Action cards with gradient backgrounds
  • Sidebar: Navigation menu with updated structure:
    • Main: Dashboard, System Overview
    • Tools: Censor (content filtering)
    • Bottom: Model Manager, Settings
  • Header: Top bar with system status indicators

Ollama-Specific Components (AsyncCard-based):

  • OllamaDownloaderComponent: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
  • OllamaModelCreationComponent: Custom model creation from Modelfile
  • ModelQuickTestComponent: Interactive model testing interface

Ollama Integration

The Ollama API client (src/utils/ollama.py) provides async functions:

  • status(): Check if Ollama is online and get version
  • available_models(): List installed models with detailed metadata
  • active_models(): Get currently loaded/running models
  • delete_model(): Remove a model
  • model_info(): Get detailed model information and Modelfile
  • stream_chat(): Stream chat responses

AI Model Testing Features:

  • Model Discovery & Management:

    • Browse and pull models from Ollama library
    • Support for HuggingFace models via Ollama syntax
    • Rich metadata display (size, quantization, parameters, format)
    • Modified-at timestamps for model versions
  • Testing Capabilities:

    • Quick chat interface for immediate model testing
    • Model information and Modelfile inspection
    • Custom model creation from Modelfiles
    • Real-time resource monitoring during inference
  • Testing Tools:

    • Censor tool for output filtering analysis
    • Extensible framework for adding new testing tools

API endpoints at http://localhost:11434/api/:

  • /api/version: Get Ollama version
  • /api/tags: List available models
  • /api/pull: Download models
  • /api/delete: Remove models
  • /api/generate: Generate text
  • /api/chat: Chat completion
  • /api/ps: List running models
  • /api/show: Show model details
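The streaming endpoints (/api/chat, /api/generate, /api/pull) return newline-delimited JSON: one object per line, with "done": true on the final line. A stream_chat() implementation can assemble the reply as sketched below; the sample lines are illustrative, not captured server output.

```python
# Assemble an assistant reply from an Ollama chat NDJSON stream.
# Each line is a JSON object; content chunks live under
# message.content, and the final object carries "done": true.
import json


def assemble_chat_stream(lines):
    """Concatenate message content from an Ollama chat NDJSON stream."""
    parts = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


sample = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"message": {"role": "assistant", "content": "lo!"}, "done": false}',
    '{"message": {"role": "assistant"}, "done": true}',
]
print(assemble_chat_stream(sample))  # → Hello!
```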

System Monitoring

GPU Monitoring Strategy

The application uses a hierarchical approach for GPU monitoring:

  1. NVIDIA GPUs (via nvidia-smi):

    • Temperature, usage, memory, power draw
    • CUDA version and driver info
    • Multi-GPU support
  2. AMD GPUs (multiple fallbacks):

    • Primary: rocm-smi for full metrics
    • Fallback: /sys/class/drm filesystem
    • Reads hwmon for temperature data
    • Supports both server and consumer GPUs
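The hierarchical detection above can be sketched in a few lines: prefer nvidia-smi, then rocm-smi, then fall back to the /sys/class/drm filesystem. The function name and return values here are illustrative, not the gpu_monitor.py API.

```python
# Sketch of the GPU-backend fallback chain: nvidia-smi first,
# then rocm-smi, then the DRM sysfs tree for AMD consumer GPUs.
import os
import shutil


def detect_gpu_backend() -> str:
    if shutil.which("nvidia-smi"):
        return "nvidia-smi"
    if shutil.which("rocm-smi"):
        return "rocm-smi"
    if os.path.isdir("/sys/class/drm"):
        # card*/device plus hwmon entries supply AMD temperature data.
        return "sysfs"
    return "none"


print(detect_gpu_backend())
```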

CPU & System Monitoring

  • Real-time CPU usage and per-core statistics
  • Memory (RAM and swap) usage
  • Disk usage and I/O statistics
  • Network traffic monitoring
  • Process tracking with top processes by CPU/memory
  • System uptime and kernel information

UI/UX Features

Dark Theme

Custom dark theme with:

  • Background: #1a1d2e (main), #252837 (sidebar)
  • Card backgrounds: rgba(26, 29, 46, 0.7) with backdrop blur
  • Accent colors: Cyan (#06b6d4) for primary actions
  • Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)

Responsive Design

  • Desktop: Full sidebar navigation
  • Mobile: Bottom navigation bar
  • Adaptive grid layouts for different screen sizes
  • Viewport-aware content scaling

Real-time Updates

  • System metrics update every 2 seconds (configurable via MONITORING_UPDATE_INTERVAL)
  • Live data binding for all metrics
  • Smooth transitions and animations

NiceGUI Patterns

  • Data Binding: Use bind_text_from() and bind_value_from() for reactive updates
  • Page Routing: Navigation via ui.navigate.to(route) with centralized route handling
  • Async Components: Custom niceguiasyncelement framework for complex async initialization
    • AsyncColumn.create() for async page construction
    • AsyncCard.create() for dialog components
    • @ui.refreshable decorators for dynamic content updates
  • Timer Updates: app.timer() for periodic data refresh (2-second intervals)
  • Dialog Patterns: Modal dialogs with await dialog for user interactions
  • Component Layout: create_layout(route) pattern for consistent page structure
  • Dark Mode: Forced dark mode with custom CSS overrides

Environment Variables

Configured in .env:

  • MONITORING_UPDATE_INTERVAL: Update frequency in seconds (default: 2)
  • APP_PORT: Web server port (default: 8080, use 8081 for testing)
  • APP_TITLE: Application title
  • APP_STORAGE_SECRET: Session storage encryption key
  • APP_SHOW: Auto-open browser on startup
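Reading these settings can be sketched as below. The app loads them with python-dotenv's load_dotenv(); this dependency-free stand-in reads os.environ directly with the documented defaults. The APP_TITLE fallback string is an assumption, not the project's actual default.

```python
# Sketch of .env-driven settings with the documented defaults.
# The real app calls dotenv.load_dotenv() first; here we read
# os.environ directly. The APP_TITLE fallback is an assumption.
import os


def get_settings() -> dict:
    return {
        "update_interval": float(os.environ.get("MONITORING_UPDATE_INTERVAL", "2")),
        "port": int(os.environ.get("APP_PORT", "8080")),
        "title": os.environ.get("APP_TITLE", "ArchGPUFrontend"),
        "show": os.environ.get("APP_SHOW", "false").lower() == "true",
    }


settings = get_settings()
print(settings["port"])
```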

Testing & Development

  • Run on port 8081 to avoid conflicts: APP_PORT=8081 uv run python src/main.py
  • Monitor GPU detection in console logs
  • Check Ollama connectivity at startup
  • Use browser DevTools for WebSocket debugging

Current Route Structure

From main.py routing:

  • / - Dashboard (system metrics for monitoring AI workloads)
  • /system - System Overview (detailed resource information)
  • /ollama - Model Manager (primary interface for AI model testing)
  • /censor - Censor tool (AI output filtering/testing)
  • /settings - Settings (refresh intervals, app configuration)

Placeholder Routes (may be repurposed for AI tools):

  • /processes - Reserved for future AI tools
  • /network - Reserved for future AI tools
  • /packages - Reserved for future AI tools
  • /logs - Reserved for future AI tools
  • /info - Reserved for future AI tools

Future Enhancements

  • Enhanced model chat interface with conversation history
  • Model performance benchmarking tools
  • Batch testing capabilities for multiple models
  • Output comparison tools between different models
  • Integration with more AI model formats
  • Advanced prompt testing and optimization tools
  • Model fine-tuning interface