generated readme
This commit is contained in:
178
README.md
178
README.md
@@ -1,2 +1,178 @@
|
||||
# CensorBot
|
||||
# 🔒 CensorBot
|
||||
|
||||
A secure data sanitization tool for IT service companies that automatically detects and censors sensitive customer information using AI.
|
||||
|
||||
## Overview
|
||||
|
||||
CensorBot is a Python application that helps protect customer privacy by automatically identifying and replacing sensitive information with placeholders. It uses a small, efficient LLM (like DeepSeek) to process text locally, ensuring that sensitive data never leaves your control before being sent to external AI services.
|
||||
|
||||
## Features
|
||||
|
||||
- 🛡️ **Automatic Detection** - Identifies names, emails, phone numbers, addresses, SSNs, and more
|
||||
- 🔄 **Real-time Processing** - Stream-based censoring for immediate feedback
|
||||
- 🎯 **High Accuracy** - AI-powered detection understands context, not just patterns
|
||||
- 💼 **Enterprise Ready** - Designed for IT service companies handling customer data
|
||||
- 🌐 **Web Interface** - Clean, intuitive UI built with NiceGUI
|
||||
- 📝 **30+ Test Examples** - Comprehensive test suite covering various scenarios
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Python 3.8+
|
||||
- [uv](https://github.com/astral-sh/uv) package manager
|
||||
- An OpenAI-compatible API endpoint (e.g., DeepSeek, local LLM)
|
||||
|
||||
### Installation
|
||||
|
||||
1. Clone the repository:
|
||||
```bash
|
||||
git clone https://github.com/yourusername/CensorBot.git
|
||||
cd CensorBot
|
||||
```
|
||||
|
||||
2. Install dependencies:
|
||||
```bash
|
||||
uv sync
|
||||
```
|
||||
|
||||
3. Configure environment variables:
|
||||
```bash
|
||||
cp .env.example .env
|
||||
# Edit .env with your API credentials
|
||||
```
|
||||
|
||||
4. Run the application:
|
||||
```bash
|
||||
uv run python src/main.py
|
||||
```
|
||||
|
||||
5. Open your browser to `http://localhost:8080`
|
||||
|
||||
## Configuration
|
||||
|
||||
Create a `.env` file with the following variables:
|
||||
|
||||
```env
|
||||
# LLM Backend Configuration
|
||||
BACKEND_BASE_URL=https://api.deepseek.com # Your LLM API endpoint
|
||||
BACKEND_API_TOKEN=your-api-token-here # API authentication token
|
||||
BACKEND_MODEL=deepseek-chat # Model to use for censoring
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
1. **Paste Text**: Copy your text containing sensitive customer information into the input field
|
||||
2. **Process**: Click "Censor Data" to automatically detect and replace sensitive information
|
||||
3. **Copy Result**: Use the censored text safely with any external AI service
|
||||
|
||||
### What Gets Censored
|
||||
|
||||
- Personal names
|
||||
- Email addresses
|
||||
- Phone numbers
|
||||
- Physical addresses
|
||||
- Social Security Numbers
|
||||
- Credit card numbers
|
||||
- Bank account numbers
|
||||
- Driver's license numbers
|
||||
- Passport numbers
|
||||
- Medical record numbers
|
||||
- IP addresses
|
||||
- Usernames and passwords
|
||||
- Company names (in customer context)
|
||||
- Dates of birth
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
CensorBot/
|
||||
├── src/
|
||||
│ ├── main.py # Main application with NiceGUI interface
|
||||
│ ├── prompt.md # System prompt for the censoring LLM
|
||||
│ └── lib/
|
||||
│ └── llm.py # LLM integration module
|
||||
├── examples/ # 30+ test cases with various sensitive data
|
||||
│ ├── 01_customer_support.txt
|
||||
│ ├── 02_medical_record.txt
|
||||
│ └── ...
|
||||
├── .env.example # Environment variables template
|
||||
├── pyproject.toml # Project dependencies
|
||||
└── CLAUDE.md # AI assistant instructions
|
||||
```
|
||||
|
||||
## Development
|
||||
|
||||
### Running Tests
|
||||
|
||||
Test the censoring with example files:
|
||||
```bash
|
||||
# The application loads a random example on startup
|
||||
uv run python src/main.py
|
||||
```
|
||||
|
||||
### Adding Dependencies
|
||||
|
||||
```bash
|
||||
uv add <package-name>
|
||||
```
|
||||
|
||||
### Project Commands
|
||||
|
||||
```bash
|
||||
# Install dependencies
|
||||
uv sync
|
||||
|
||||
# Run the application
|
||||
uv run python src/main.py
|
||||
|
||||
# Format code (if configured)
|
||||
uv run black src/
|
||||
|
||||
# Type checking (if configured)
|
||||
uv run mypy src/
|
||||
```
|
||||
|
||||
## Security Considerations
|
||||
|
||||
- **Local Processing**: Use a local or self-hosted LLM for maximum security
|
||||
- **No Data Storage**: CensorBot doesn't store any processed text
|
||||
- **API Security**: Keep your API tokens secure and never commit them
|
||||
- **HTTPS Only**: Use HTTPS for API communications
|
||||
- **Regular Updates**: Keep dependencies updated for security patches
|
||||
|
||||
## Use Cases
|
||||
|
||||
- **IT Support Tickets**: Sanitize customer tickets before using AI for solutions
|
||||
- **Documentation**: Remove sensitive data from technical documentation
|
||||
- **Training Data**: Prepare datasets for ML training without privacy concerns
|
||||
- **Compliance**: Meet GDPR, HIPAA, and other privacy regulations
|
||||
- **Knowledge Base**: Create sanitized versions of customer interactions
|
||||
|
||||
## Contributing
|
||||
|
||||
Contributions are welcome! Please feel free to submit a Pull Request.
|
||||
|
||||
1. Fork the repository
|
||||
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
|
||||
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
|
||||
4. Push to the branch (`git push origin feature/AmazingFeature`)
|
||||
5. Open a Pull Request
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under the MIT License - see the LICENSE file for details.
|
||||
|
||||
## Acknowledgments
|
||||
|
||||
- Built with [NiceGUI](https://nicegui.io/) for the web interface
|
||||
- Powered by [uv](https://github.com/astral-sh/uv) for fast Python package management
|
||||
- AI censoring via OpenAI-compatible APIs
|
||||
|
||||
## Support
|
||||
|
||||
For issues, questions, or suggestions, please open an issue on GitHub.
|
||||
|
||||
---
|
||||
|
||||
**⚠️ Important**: This tool is designed to help protect privacy but should not be the only measure. Always review censored output and follow your organization's data protection policies.
|
||||
|
||||
Reference in New Issue
Block a user