# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Context
This is a NiceGUI-based web platform for testing and managing AI models through Ollama on Arch Linux systems with GPU support. The application serves as an AI model testing environment featuring:
### Core Purpose
A streamlined interface for testing AI models locally, managing Ollama models, and running various AI-related testing tools.
### Main Features

- **Model Manager** - Complete Ollama model management interface
  - Download, delete, create, and test models
  - Support for Hugging Face models via Ollama pull syntax
  - Rich model metadata display
  - Quick in-app chat testing
- **System Monitoring** - Resource tracking for AI workloads
  - Real-time GPU monitoring (AMD/NVIDIA) to track model performance
  - CPU and memory usage during model inference
  - System metrics dashboard
- **AI Testing Tools**
  - Censor - Text content filtering/censoring tool for testing AI outputs
  - Additional testing tools to be added as needed
- **Settings** - Application configuration and refresh intervals
## Development Commands
### Running the Application
```bash
# Install dependencies
uv sync

# Run the development server (use port 8081 for testing, as 8080 is usually occupied)
APP_PORT=8081 uv run python src/main.py

# Default port (8080) - usually already in use by the main instance
uv run python src/main.py
```
### Dependency Management
```bash
# Add a new dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Update dependencies
uv sync
```
## Architecture Overview
### Technology Stack
- **Package Manager**: uv (version 0.8.17)
- **UI Framework**: NiceGUI (async web framework based on FastAPI/Vue.js)
- **Python Version**: 3.13+
- **Ollama API**: Running on localhost:11434
- **Dependencies**:
  - `nicegui` - Main UI framework
  - `niceguiasyncelement` - Custom async component framework (from git)
  - `psutil` - System monitoring
  - `httpx` - Async HTTP client for the Ollama API
  - `python-dotenv` - Environment configuration
### Project Structure
```
src/
├── main.py                       # Entry point, NiceGUI app configuration with all routes
├── pages/                        # Page components (inheriting NiceGUI elements)
│   ├── dashboard.py              # Main dashboard with system metrics
│   ├── ollama_manager.py         # Ollama model management interface (AsyncColumn)
│   ├── system_overview.py        # System information page
│   └── welcome.py                # Welcome/landing page
├── components/                   # Reusable UI components
│   ├── circular_progress.py      # Circular progress indicators
│   ├── header.py                 # App header with live status
│   ├── sidebar.py                # Navigation sidebar with updated menu structure
│   ├── bottom_nav.py             # Mobile bottom navigation
│   ├── ollama_downloader.py      # Ollama model downloader component (AsyncCard)
│   ├── ollama_model_creation.py  # Model creation component (AsyncCard)
│   └── ollama_quick_test.py      # Model testing component (AsyncCard)
├── utils/                        # Utility modules
│   ├── gpu_monitor.py            # GPU monitoring (AMD/NVIDIA auto-detect)
│   ├── system_monitor.py         # System resource monitoring
│   ├── ollama_monitor.py         # Ollama status monitoring (bindable dataclass)
│   └── ollama.py                 # Ollama API client functions
└── static/                       # Static assets (CSS, images)
    └── style.css                 # Custom dark theme styles
```
### Key Design Patterns
- **Async Components**: Uses the custom `niceguiasyncelement` framework for async page/component construction (see the sketch after this list)
  - `AsyncColumn`, `AsyncCard` base classes for complex components
  - `OllamaManagerPage(AsyncColumn)` for full-page async initialization
  - Async component dialogs with the `await component.create()` pattern
- **Bindable Dataclasses**: Monitor classes use `@binding.bindable_dataclass` for reactive data binding
  - `SystemMonitor`, `GPUMonitor`, `OllamaMonitor` for real-time data updates
- **Environment Configuration**: All app settings are managed via the `.env` file and loaded with python-dotenv
- **Centralized Routing**: All routes defined in `main.py` with a layout creation pattern
- **Real-time Updates**: Timer-based updates every 2 seconds for all monitor instances
## Component Architecture
### Monitor Classes (Supporting AI Testing)
- `SystemMonitor`: Tracks system resources during AI model testing (see the sketch after this list)
  - CPU usage during model inference
  - Memory consumption by loaded models
  - Disk I/O for model loading
  - Process statistics for Ollama and GPU processes
- `GPUMonitor`: Critical for AI workload monitoring
  - Auto-detects AMD/NVIDIA GPUs
  - Tracks GPU usage during model inference
  - Memory usage by loaded models
  - Temperature monitoring during extended testing
  - Power draw under AI workloads
- `OllamaMonitor`: Core service monitoring
  - Ollama service status and version
  - Currently loaded/active models
  - Real-time model state tracking
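A minimal sketch of how such a monitor can be wired up as a NiceGUI bindable dataclass, using real `psutil` calls; the class and field names here are illustrative, not the repository's actual attributes:

```python
import psutil
from nicegui import binding


@binding.bindable_dataclass
class SimpleSystemMonitor:
    """Illustrative stand-in for the project's SystemMonitor."""

    cpu_percent: float = 0.0
    memory_percent: float = 0.0

    def update(self) -> None:
        # Intended to be called periodically (e.g. from app.timer);
        # UI elements bound to these fields refresh automatically.
        self.cpu_percent = psutil.cpu_percent(interval=None)
        self.memory_percent = psutil.virtual_memory().percent
```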
### UI Components
- `MetricCircle`: Small circular progress indicator with icon (sketched below)
- `LargeMetricCircle`: Large circular progress for primary metrics
- `ColorfulMetricCard`: Action cards with gradient backgrounds
- `Sidebar`: Navigation menu with updated structure:
  - Main: Dashboard, System Overview
  - Tools: Censor (content filtering)
  - Bottom: Model Manager, Settings
- `Header`: Top bar with system status indicators
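A hedged sketch of how such a metric circle can be composed from NiceGUI's built-in `ui.circular_progress`; the real `MetricCircle` lives in `src/components/circular_progress.py` and may differ:

```python
from nicegui import ui


def metric_circle(label: str, icon: str, value: float) -> None:
    # Rough stand-in for MetricCircle: a small circular progress ring
    # with an icon in the middle and a caption underneath.
    with ui.column().classes('items-center'):
        with ui.circular_progress(value=value, min=0, max=100, show_value=False):
            ui.icon(icon)
        ui.label(label).classes('text-xs')


# Example usage: metric_circle('CPU', 'speed', 42.0)
```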
**Ollama-Specific Components** (AsyncCard-based):

- `OllamaDownloaderComponent`: Model downloading with progress tracking (supports HF models via Ollama's pull syntax)
- `OllamaModelCreationComponent`: Custom model creation from a Modelfile
- `ModelQuickTestComponent`: Interactive model testing interface
## Ollama Integration
The Ollama API client (`src/utils/ollama.py`) provides async functions (one is sketched after this list):
- `status()`: Check if Ollama is online and get the version
- `available_models()`: List installed models with detailed metadata
- `active_models()`: Get currently loaded/running models
- `delete_model()`: Remove a model
- `model_info()`: Get detailed model information and the Modelfile
- `stream_chat()`: Stream chat responses
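As an illustration, a `status()`-style helper might look like the following `httpx` sketch (the return shape is an assumption; see `src/utils/ollama.py` for the real implementation):

```python
import httpx

OLLAMA_URL = 'http://localhost:11434'


async def status() -> dict:
    """Return {'online': bool, 'version': str | None} for the local Ollama server."""
    try:
        async with httpx.AsyncClient(timeout=2.0) as client:
            response = await client.get(f'{OLLAMA_URL}/api/version')
            response.raise_for_status()
            return {'online': True, 'version': response.json().get('version')}
    except httpx.HTTPError:
        return {'online': False, 'version': None}
```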
**AI Model Testing Features:**

- **Model Discovery & Management**:
  - Browse and pull models from the Ollama library
  - Support for HuggingFace models via Ollama syntax
  - Rich metadata display (size, quantization, parameters, format)
  - Time tracking for model versions
- **Testing Capabilities**:
  - Quick chat interface for immediate model testing
  - Model information and Modelfile inspection
  - Custom model creation from Modelfiles
  - Real-time resource monitoring during inference
- **Testing Tools**:
  - Censor tool for output filtering analysis
  - Extensible framework for adding new testing tools
API endpoints at `http://localhost:11434/api/`:

- `/api/version`: Get Ollama version
- `/api/tags`: List available models
- `/api/pull`: Download models
- `/api/delete`: Remove models
- `/api/generate`: Generate text
- `/api/chat`: Chat completion (see the streaming sketch below)
- `/api/ps`: List running models
- `/api/show`: Show model details
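`/api/chat` streams newline-delimited JSON when `stream` is true; here is a hedged sketch of consuming it, mirroring the documented `stream_chat()` (the repository's implementation may differ):

```python
import json
from collections.abc import AsyncIterator

import httpx


async def stream_chat(model: str, messages: list[dict]) -> AsyncIterator[str]:
    """Yield content chunks from Ollama's streaming /api/chat endpoint."""
    payload = {'model': model, 'messages': messages, 'stream': True}
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream('POST', 'http://localhost:11434/api/chat', json=payload) as response:
            response.raise_for_status()
            async for line in response.aiter_lines():
                if not line:
                    continue
                chunk = json.loads(line)  # one JSON object per line
                if content := chunk.get('message', {}).get('content'):
                    yield content
```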
## System Monitoring
### GPU Monitoring Strategy
The application uses a hierarchical approach for GPU monitoring (sketched after this list):

- **NVIDIA GPUs** (via `nvidia-smi`):
  - Temperature, usage, memory, power draw
  - CUDA version and driver info
  - Multi-GPU support
- **AMD GPUs** (multiple fallbacks):
  - Primary: `rocm-smi` for full metrics
  - Fallback: the `/sys/class/drm` filesystem
    - Reads hwmon for temperature data
    - Supports both server and consumer GPUs
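A simplified sketch of this detection hierarchy; the real `gpu_monitor.py` goes further and parses per-metric output, but the probing order could look like this:

```python
import shutil
from pathlib import Path


def detect_gpu_backend() -> str:
    """Pick a monitoring backend in the documented priority order."""
    if shutil.which('nvidia-smi'):
        return 'nvidia-smi'  # NVIDIA: full metrics incl. CUDA/driver info
    if shutil.which('rocm-smi'):
        return 'rocm-smi'    # AMD primary path
    if Path('/sys/class/drm').exists():
        return 'sysfs'       # AMD fallback: read hwmon for temperatures
    return 'none'
```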
### CPU & System Monitoring
- Real-time CPU usage and per-core statistics
- Memory (RAM and swap) usage
- Disk usage and I/O statistics
- Network traffic monitoring
- Process tracking with top processes by CPU/memory
- System uptime and kernel information
## UI/UX Features
### Dark Theme
Custom dark theme with:

- Background: `#1a1d2e` (main), `#252837` (sidebar)
- Card backgrounds: `rgba(26, 29, 46, 0.7)` with backdrop blur
- Accent colors: Cyan (`#06b6d4`) for primary actions
- Metric colors: Purple (CPU), Green (Memory), Orange (GPU), Cyan (Temp)
### Responsive Design
- Desktop: Full sidebar navigation
- Mobile: Bottom navigation bar
- Adaptive grid layouts for different screen sizes
- Viewport-aware content scaling
### Real-time Updates
- System metrics update every 2 seconds (configurable via `MONITORING_UPDATE_INTERVAL`)
- Live data binding for all metrics
- Smooth transitions and animations
## NiceGUI Patterns
- **Data Binding**: Use `bind_text_from()` and `bind_value_from()` for reactive updates (several of these patterns are combined in the sketch after this list)
- **Page Routing**: Navigation via `ui.navigate.to(route)` with centralized route handling
- **Async Components**: Custom `niceguiasyncelement` framework for complex async initialization
  - `AsyncColumn.create()` for async page construction
  - `AsyncCard.create()` for dialog components
  - `@ui.refreshable` decorators for dynamic content updates
- **Timer Updates**: `app.timer()` for periodic data refresh (2-second intervals)
- **Dialog Patterns**: Modal dialogs with `await dialog` for user interactions
- **Component Layout**: `create_layout(route)` pattern for consistent page structure
- **Dark Mode**: Forced dark mode with custom CSS overrides
## Environment Variables
Configured in `.env`:
- `MONITORING_UPDATE_INTERVAL`: Update frequency in seconds (default: 2)
- `APP_PORT`: Web server port (default: 8080, use 8081 for testing)
- `APP_TITLE`: Application title
- `APP_STORAGE_SECRET`: Session storage encryption key
- `APP_SHOW`: Auto-open browser on startup
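These are typically read at startup with `python-dotenv`; a sketch along these lines (default values below are placeholders where the docs don't state them):

```python
import os

from dotenv import load_dotenv

load_dotenv()  # read .env from the current working directory

PORT = int(os.getenv('APP_PORT', '8080'))
UPDATE_INTERVAL = float(os.getenv('MONITORING_UPDATE_INTERVAL', '2'))
TITLE = os.getenv('APP_TITLE', 'AI Model Tester')  # placeholder default
SHOW_BROWSER = os.getenv('APP_SHOW', 'false').lower() == 'true'
```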
## Testing & Development
- Run on port 8081 to avoid conflicts: `APP_PORT=8081 uv run python src/main.py`
- Monitor GPU detection in console logs
- Check Ollama connectivity at startup
- Use browser DevTools for WebSocket debugging
## Current Route Structure
From `main.py` routing:
- `/` - Dashboard (system metrics for monitoring AI workloads)
- `/system` - System Overview (detailed resource information)
- `/ollama` - Model Manager (primary interface for AI model testing)
- `/censor` - Censor tool (AI output filtering/testing)
- `/settings` - Settings (refresh intervals, app configuration)
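A hedged sketch of the centralized routing plus `create_layout(route)` pattern; the layout internals and page bodies are assumptions, not the repository's actual code:

```python
from nicegui import ui


def create_layout(route: str) -> None:
    # Sketch of the documented create_layout(route) pattern; the real version
    # builds the Header/Sidebar components from src/components/.
    with ui.header():
        ui.label('AI Model Tester')  # placeholder title
    with ui.left_drawer():
        for label, target in (('Dashboard', '/'), ('Model Manager', '/ollama')):
            ui.link(label, target)


@ui.page('/')
def dashboard() -> None:
    create_layout('/')
    ui.label('Dashboard content')


@ui.page('/ollama')
def model_manager() -> None:
    create_layout('/ollama')
    ui.label('Model Manager content')


ui.run()
```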
**Placeholder Routes** (may be repurposed for AI tools):

- `/processes` - Reserved for future AI tools
- `/network` - Reserved for future AI tools
- `/packages` - Reserved for future AI tools
- `/logs` - Reserved for future AI tools
- `/info` - Reserved for future AI tools
## Future Enhancements
- Enhanced model chat interface with conversation history
- Model performance benchmarking tools
- Batch testing capabilities for multiple models
- Output comparison tools between different models
- Integration with more AI model formats
- Advanced prompt testing and optimization tools
- Model fine-tuning interface