# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Context
This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as an AI model testing environment featuring:
### Core Purpose
A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.
### Main Features

- **Model Manager** - Complete Ollama model management interface
  - Download, delete, create, and test models
  - Support for Hugging Face models via Ollama pull syntax
  - Rich model metadata display
  - Quick in-app chat testing
- **System Monitoring** - Resource tracking for AI workloads
  - Real-time GPU monitoring (AMD/NVIDIA) to track model performance
  - CPU and memory usage during model inference
  - System metrics dashboard
- **AI Testing Tools**
  - Censor - Text content filtering/censoring tool for testing AI outputs
  - Additional testing tools to be added as needed
- **Settings** - Application configuration and refresh intervals
## Development Commands
### Running the Application
```bash
# Install dependencies
uv sync

# Run the development server (use port 8081 for testing, as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by the main instance
uv run python src/main.py
```
### Dependency Management
```bash
# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync
```
## Architecture Overview
### Technology Stack
- **Package Manager**: uv (version 0.8.17)
- **UI Framework**: NiceGUI (async web framework based on FastAPI/Vue.js)
- **Python Version**: 3.13+
- **Ollama API**: Running on localhost:11434
- **Dependencies**:
  - `nicegui` - Main UI framework
  - `niceguiasyncelement` - Custom async component framework (from git)
  - `psutil` - System monitoring
  - `httpx` - Async HTTP client for the Ollama API
  - `python-dotenv` - Environment configuration
### Project Structure
```
src/
├── main.py                       # Entry point, NiceGUI app configuration with all routes
├── pages/                        # Page components (inheriting NiceGUI elements)
│   ├── dashboard.py              # Main dashboard with system metrics
│   ├── ollama_manager.py         # Ollama model management interface (AsyncColumn)
│   ├── system_overview.py        # System information page
│   └── welcome.py                # Welcome/landing page
├── components/                   # Reusable UI components
│   ├── circular_progress.py      # Circular progress indicators
│   ├── header.py                 # App header with live status
│   ├── sidebar.py                # Navigation sidebar with updated menu structure
│   ├── bottom_nav.py             # Mobile bottom navigation
│   ├── ollama_downloader.py      # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py  # Model creation component (AsyncCard)
│   └── ollama_quick_test.py      # Model testing component (AsyncCard)
├── utils/                        # Utility modules
│   ├── gpu_monitor.py            # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py         # System resource monitoring
│   ├── ollama_monitor.py         # Ollama status monitoring (bindable dataclass)
│   └── ollama.py                 # Ollama API client functions
└── static/                       # Static assets (CSS, images)
    └── style.css                 # Custom dark theme styles
```
### Key Design Patterns
- **Async Components**: Uses the custom `niceguiasyncelement` framework for async page/component construction (see the sketch after this list)
  - `AsyncColumn`, `AsyncCard` base classes for complex components
  - `OllamaManagerPage(AsyncColumn)` for full-page async initialization
  - Async component dialogs with the `await component.create()` pattern
- **Bindable Dataclasses**: Monitor classes use `@binding.bindable_dataclass` for reactive data binding
  - `SystemMonitor`, `GPUMonitor`, `OllamaMonitor` for real-time data updates
- **Environment Configuration**: All app settings are managed via the `.env` file and loaded with python-dotenv
- **Centralized Routing**: All routes defined in `main.py` with a layout creation pattern
- **Real-time Updates**: Timer-based updates every 2 seconds for all monitor instances
## Component Architecture
### Monitor Classes (Supporting AI Testing)
- `SystemMonitor`: Tracks system resources during AI model testing (see the sketch after this list)
  - CPU usage during model inference
  - Memory consumption by loaded models
  - Disk I/O for model loading
  - Process statistics for Ollama and GPU processes
- `GPUMonitor`: Critical for AI workload monitoring
  - Auto-detects AMD/NVIDIA GPUs
  - Tracks GPU usage during model inference
  - Memory usage by loaded models
  - Temperature monitoring during extended testing
  - Power draw under AI workloads
- `OllamaMonitor`: Core service monitoring
  - Ollama service status and version
  - Currently loaded/active models
  - Real-time model state tracking
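A minimal sketch of how such a monitor can be wired up as a NiceGUI bindable dataclass, using real `psutil` calls; the class and field names here are illustrative, not the repository's actual attributes:

```python
import psutil
from nicegui import binding


@binding.bindable_dataclass
class SimpleSystemMonitor:
    """Illustrative stand-in for the project's SystemMonitor."""

    cpu_percent: float = 0.0
    memory_percent: float = 0.0

    def update(self) -> None:
        # Intended to be called periodically (e.g. from app.timer);
        # UI elements bound to these fields refresh automatically.
        self.cpu_percent = psutil.cpu_percent(interval=None)
        self.memory_percent = psutil.virtual_memory().percent
```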
### UI Components
- `MetricCircle`: Small circular progress indicator with icon (sketched below)
- `LargeMetricCircle`: Large circular progress for primary metrics
- `ColorfulMetricCard`: Action cards with gradient backgrounds
- `Sidebar`: Navigation menu with updated structure:
  - Main: Dashboard, System Overview
  - Tools: Censor (content filtering)
  - Bottom: Model Manager, Settings
- `Header`: Top bar with system status indicators
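A hedged sketch of how such a metric circle can be composed from NiceGUI's built-in `ui.circular_progress`; the real `MetricCircle` lives in `src/components/circular_progress.py` and may differ:

```python
from nicegui import ui


def metric_circle(label: str, icon: str, value: float) -> None:
    # Rough stand-in for MetricCircle: a small circular progress ring
    # with an icon in the middle and a caption underneath.
    with ui.column().classes('items-center'):
        with ui.circular_progress(value=value, min=0, max=100, show_value=False):
            ui.icon(icon)
        ui.label(label).classes('text-xs')


# Example usage: metric_circle('CPU', 'speed', 42.0)
```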
**Ollama-Specific Components** (AsyncCard-based):

- `OllamaDownloaderComponent`: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
- `OllamaModelCreationComponent`: Custom model creation from a Modelfile
- `ModelQuickTestComponent`: Interactive model testing interface
## Ollama Integration
The Ollama API client (`src/utils/ollama.py`) provides async functions (one is sketched after this list):
- `status()`: Check if Ollama is online and get the version
- `available_models()`: List installed models with detailed metadata
- `active_models()`: Get currently loaded/running models
- `delete_model()`: Remove a model
- `model_info()`: Get detailed model information and the Modelfile
- `stream_chat()`: Stream chat responses
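As an illustration, a `status()`-style helper might look like the following `httpx` sketch (the return shape is an assumption; see `src/utils/ollama.py` for the real implementation):

```python
import httpx

OLLAMA_URL = 'http://localhost:11434'


async def status() -> dict:
    """Return {'online': bool, 'version': str | None} for the local Ollama server."""
    try:
        async with httpx.AsyncClient(timeout=2.0) as client:
            response = await client.get(f'{OLLAMA_URL}/api/version')
            response.raise_for_status()
            return {'online': True, 'version': response.json().get('version')}
    except httpx.HTTPError:
        return {'online': False, 'version': None}
```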
**AI Model Testing Features:**

- **Model Discovery & Management**:
  - Browse and pull models from the Ollama library
  - Support for HuggingFace models via Ollama syntax
  - Rich metadata display (size, quantization, parameters, format)
  - Time tracking for model versions
- **Testing Capabilities**:
  - Quick chat interface for immediate model testing
  - Model information and Modelfile inspection
  - Custom model creation from Modelfiles
  - Real-time resource monitoring during inference
- **Testing Tools**:
  - Censor tool for output filtering analysis
  - Extensible framework for adding new testing tools
API endpoints at `http://localhost:11434/api/`:

- `/api/version`: Get Ollama version
- `/api/tags`: List available models
- `/api/pull`: Download models
- `/api/delete`: Remove models
- `/api/generate`: Generate text
- `/api/chat`: Chat completion (see the streaming sketch below)
- `/api/ps`: List running models
- `/api/show`: Show model details
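`/api/chat` streams newline-delimited JSON when `stream` is true; here is a hedged sketch of consuming it, mirroring the documented `stream_chat()` (the repository's implementation may differ):

```python
import json
from collections.abc import AsyncIterator

import httpx


async def stream_chat(model: str, messages: list[dict]) -> AsyncIterator[str]:
    """Yield content chunks from Ollama's streaming /api/chat endpoint."""
    payload = {'model': model, 'messages': messages, 'stream': True}
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream('POST', 'http://localhost:11434/api/chat', json=payload) as response:
            response.raise_for_status()
            async for line in response.aiter_lines():
                if not line:
                    continue
                chunk = json.loads(line)  # one JSON object per line
                if content := chunk.get('message', {}).get('content'):
                    yield content
```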
## System Monitoring
### GPU Monitoring Strategy
The application uses a hierarchical approach for GPU monitoring (sketched after this list):

- **NVIDIA GPUs** (via `nvidia-smi`):
  - Temperature, usage, memory, power draw
  - CUDA version and driver info
  - Multi-GPU support
- **AMD GPUs** (multiple fallbacks):
  - Primary: `rocm-smi` for full metrics
  - Fallback: the `/sys/class/drm` filesystem
    - Reads hwmon for temperature data
    - Supports both server and consumer GPUs
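A simplified sketch of this detection hierarchy; the real `gpu_monitor.py` goes further and parses per-metric output, but the probing order could look like this:

```python
import shutil
from pathlib import Path


def detect_gpu_backend() -> str:
    """Pick a monitoring backend in the documented priority order."""
    if shutil.which('nvidia-smi'):
        return 'nvidia-smi'  # NVIDIA: full metrics incl. CUDA/driver info
    if shutil.which('rocm-smi'):
        return 'rocm-smi'    # AMD primary path
    if Path('/sys/class/drm').exists():
        return 'sysfs'       # AMD fallback: read hwmon for temperatures
    return 'none'
```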
### CPU & System Monitoring
- Real-time CPU usage and per-core statistics
- Memory (RAM and swap) usage
- Disk usage and I/O statistics
- Network traffic monitoring
- Process tracking with top processes by CPU/memory
- System uptime and kernel information
## UI/UX Features
### Dark Theme
Custom dark theme with:

- Background: `#1a1d2e` (main), `#252837` (sidebar)
- Card backgrounds: `rgba(26, 29, 46, 0.7)` with backdrop blur
- Accent colors: Cyan (`#06b6d4`) for primary actions
- Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)
### Responsive Design
- Desktop: Full sidebar navigation
- Mobile: Bottom navigation bar
- Adaptive grid layouts for different screen sizes
- Viewport-aware content scaling
### Real-time Updates
- System metrics update every 2 seconds (configurable via `MONITORING_UPDATE_INTERVAL`)
- Live data binding for all metrics
- Smooth transitions and animations
## NiceGUI Patterns
- **Data Binding**: Use `bind_text_from()` and `bind_value_from()` for reactive updates (several of these patterns are combined in the sketch after this list)
- **Page Routing**: Navigation via `ui.navigate.to(route)` with centralized route handling
- **Async Components**: Custom `niceguiasyncelement` framework for complex async initialization
  - `AsyncColumn.create()` for async page construction
  - `AsyncCard.create()` for dialog components
  - `@ui.refreshable` decorators for dynamic content updates
- **Timer Updates**: `app.timer()` for periodic data refresh (2-second intervals)
- **Dialog Patterns**: Modal dialogs with `await dialog` for user interactions
- **Component Layout**: `create_layout(route)` pattern for consistent page structure
- **Dark Mode**: Forced dark mode with custom CSS overrides
## Environment Variables
Configured in `.env`:
- `MONITORING_UPDATE_INTERVAL`: Update frequency in seconds (default: 2)
- `APP_PORT`: Web server port (default: 8080, use 8081 for testing)
- `APP_TITLE`: Application title
- `APP_STORAGE_SECRET`: Session storage encryption key
- `APP_SHOW`: Auto-open browser on startup
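These are typically read at startup with `python-dotenv`; a sketch along these lines (default values below are placeholders where the docs don't state them):

```python
import os

from dotenv import load_dotenv

load_dotenv()  # read .env from the current working directory

PORT = int(os.getenv('APP_PORT', '8080'))
UPDATE_INTERVAL = float(os.getenv('MONITORING_UPDATE_INTERVAL', '2'))
TITLE = os.getenv('APP_TITLE', 'AI Model Tester')  # placeholder default
SHOW_BROWSER = os.getenv('APP_SHOW', 'false').lower() == 'true'
```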
## Testing & Development
- Run on port 8081 to avoid conflicts: `APP_PORT=8081 uv run python src/main.py`
- Monitor GPU detection in console logs
- Check Ollama connectivity at startup
- Use browser DevTools for WebSocket debugging
## Current Route Structure
From `main.py` routing:
- `/` - Dashboard (system metrics for monitoring AI workloads)
- `/system` - System Overview (detailed resource information)
- `/ollama` - Model Manager (primary interface for AI model testing)
- `/censor` - Censor tool (AI output filtering/testing)
- `/settings` - Settings (refresh intervals, app configuration)
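A hedged sketch of the centralized routing plus `create_layout(route)` pattern; the layout internals and page bodies are assumptions, not the repository's actual code:

```python
from nicegui import ui


def create_layout(route: str) -> None:
    # Sketch of the documented create_layout(route) pattern; the real version
    # builds the Header/Sidebar components from src/components/.
    with ui.header():
        ui.label('AI Model Tester')  # placeholder title
    with ui.left_drawer():
        for label, target in (('Dashboard', '/'), ('Model Manager', '/ollama')):
            ui.link(label, target)


@ui.page('/')
def dashboard() -> None:
    create_layout('/')
    ui.label('Dashboard content')


@ui.page('/ollama')
def model_manager() -> None:
    create_layout('/ollama')
    ui.label('Model Manager content')


ui.run()
```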
**Placeholder Routes** (may be repurposed for AI tools):

- `/processes` - Reserved for future AI tools
- `/network` - Reserved for future AI tools
- `/packages` - Reserved for future AI tools
- `/logs` - Reserved for future AI tools
- `/info` - Reserved for future AI tools
## Future Enhancements
- Enhanced model chat interface with conversation history
- Model performance benchmarking tools
- Batch testing capabilities for multiple models
- Output comparison tools between different models
- Integration with more AI model formats
- Advanced prompt testing and optimization tools
- Model fine-tuning interface