Project Summary
GameEdge Intelligence is a comprehensive sentiment analysis and customer segmentation platform for sports betting and fantasy sports, built with enterprise-grade architecture and advanced ML capabilities. It combines BERT-based transformers with traditional ML fallback, RFM analysis, clustering, churn prediction, and real-time analytics over large-scale sports and transaction datasets.
Technical deep dive
The training and demo pipelines draw on three large datasets: 1.6 million pre-labeled tweets for sentiment, 525k+ transaction records, and 500k+ sports matches with odds data.
Full-stack architecture
The platform splits cleanly between a Python FastAPI backend with async ML operations and a Next.js 14 frontend using App Router, TypeScript, Tailwind CSS, and Shadcn/ui. PostgreSQL with SQLAlchemy ORM stores customer profiles, segments, and analytics aggregates. WebSockets deliver live sentiment feeds to the dashboard. Docker Compose orchestrates development and production-like environments with environment-specific configuration files.
Backend module layout
- app/api/ — REST endpoints for sentiment, segmentation, analytics, and data-pipeline control
- app/core/ — Configuration, security settings, and database session management
- app/models/ — SQLAlchemy ORM models for customers, transactions, and sentiment events
- app/services/ — Business logic separating HTTP transport from ML and analytics rules
- app/ml/ — Model training, inference pipelines, and monitoring hooks
ML pipeline design
- Multi-model sentiment — BERT transformers via Hugging Face with scikit-learn fallback for resilience
- Customer segmentation — RFM analysis, clustering algorithms, and churn prediction models
- Automated preprocessing — Feature engineering, train/validation splits, and model versioning
- Real-time inference — Async FastAPI handlers with thread-pool or worker isolation for CPU/GPU-bound steps
- Drift and performance monitoring — Hooks for retraining triggers when sentiment or churn distributions shift
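The fallback and thread-pool ideas in the list above can be sketched together. This is a minimal illustration, not the project's implementation: `primary_scorer` and `keyword_scorer` are hypothetical stand-ins for the transformer and the traditional model, and `score_with_fallback` and `analyze` are names invented for this example. Only the standard library is used.

```python
import asyncio
from typing import Callable, Dict, Sequence

# A scorer maps text to a sentiment label. Both scorers below are
# hypothetical stand-ins, not the project's actual models.
Scorer = Callable[[str], str]


def primary_scorer(text: str) -> str:
    """Stand-in for a transformer model; always fails to simulate an outage."""
    raise RuntimeError("transformer backend unavailable")


def keyword_scorer(text: str) -> str:
    """Stand-in for a lightweight traditional-ML fallback."""
    lowered = text.lower()
    if any(word in lowered for word in ("win", "great", "love")):
        return "positive"
    if any(word in lowered for word in ("lose", "bad", "hate")):
        return "negative"
    return "neutral"


def score_with_fallback(text: str, scorers: Sequence[Scorer]) -> Dict[str, str]:
    """Try each scorer in order and return the first successful result."""
    last_error = None
    for scorer in scorers:
        try:
            return {"label": scorer(text), "model": scorer.__name__}
        except Exception as exc:  # any model failure moves on to the next scorer
            last_error = exc
    raise RuntimeError("all sentiment scorers failed") from last_error


async def analyze(text: str) -> Dict[str, str]:
    """Async entry point: run the blocking inference call in a worker thread."""
    return await asyncio.to_thread(
        score_with_fallback, text, (primary_scorer, keyword_scorer)
    )


if __name__ == "__main__":
    print(asyncio.run(analyze("Great win for the home team")))
```

Running inference through `asyncio.to_thread` (available from Python 3.9) keeps the event loop free to serve other requests while a model call blocks. A real deployment might prefer a process pool or a separate worker for GPU-bound models.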
Data pipeline architecture
scripts/manage_data_pipeline.py provides CLI control over pipeline status, synthetic data generation, Kaggle dataset downloads, and per-dataset transforms. The data pipeline API exposes GET /api/v1/data-pipeline/status, POST endpoints for run, synthetic, and transform operations, and pipeline metadata for operators. Synthetic generation supports configurable user counts, so the platform can be tested without external dependencies. Real dataset ingestion requires Kaggle credentials and must respect each dataset's licensing constraints.
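To make the "configurable user counts" idea concrete, here is a small sketch of seeded synthetic customer generation. The record fields, value ranges, and the `generate_customers` function are assumptions made for illustration and do not reflect the schema used by scripts/manage_data_pipeline.py.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SyntheticCustomer:
    # Hypothetical fields; the real schema is not shown in this document.
    customer_id: int
    favorite_sport: str
    bets_placed: int
    total_staked: float


def generate_customers(user_count: int, seed: int = 42) -> List[SyntheticCustomer]:
    """Generate `user_count` reproducible synthetic customers."""
    rng = random.Random(seed)  # local RNG so results do not depend on global state
    sports = ["football", "basketball", "baseball", "hockey"]
    customers = []
    for customer_id in range(1, user_count + 1):
        bets = rng.randint(1, 200)
        # Each bet stakes between 1 and 50 currency units.
        total = round(sum(rng.uniform(1.0, 50.0) for _ in range(bets)), 2)
        customers.append(
            SyntheticCustomer(customer_id, rng.choice(sports), bets, total)
        )
    return customers


if __name__ == "__main__":
    for customer in generate_customers(3):
        print(customer)
```

A fixed seed makes test fixtures repeatable, which helps when comparing segmentation or churn results across runs.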
Real-time analytics
WebSocket /ws/live-sentiment streams sentiment events to the frontend for live monitoring dashboards. Recharts and D3.js render time-series sentiment trends, segment distributions, and churn risk heatmaps. The dark-theme UI targets sports betting operators who need high-contrast operational views during live events.
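A dashboard consuming this stream typically needs a running aggregate rather than raw events. The sketch below keeps a rolling average over the most recent messages. The JSON payload shape (a numeric `score` field) is an assumption, since the event format of /ws/live-sentiment is not documented here, and `RollingSentiment` is a hypothetical helper.

```python
import json
from collections import deque
from typing import Deque


class RollingSentiment:
    """Rolling mean over the last `window` sentiment scores."""

    def __init__(self, window: int = 100) -> None:
        # deque with maxlen drops the oldest score automatically when full
        self._scores: Deque[float] = deque(maxlen=window)

    def add_message(self, raw: str) -> float:
        """Parse one JSON message (assumed to carry a 'score') and update."""
        event = json.loads(raw)
        self._scores.append(float(event["score"]))
        return self.average

    @property
    def average(self) -> float:
        if not self._scores:
            return 0.0
        return sum(self._scores) / len(self._scores)


if __name__ == "__main__":
    tracker = RollingSentiment(window=2)
    for message in ('{"score": 1.0}', '{"score": 0.0}', '{"score": -1.0}'):
        print(tracker.add_message(message))
```

The same class could sit behind a WebSocket receive loop on either the server or the client side; only the message parsing would change.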
Security and deployment
- JWT_SECRET for authenticated API access in production configurations
- DATABASE_URL and REDIS_URL environment variables for managed service wiring
- ML_MODEL_PATH for versioned artifact deployment separate from application code
- docker-compose.prod.yml for production container orchestration
- CI/CD workflows under .github/ for automated test and build gates
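One way to wire the environment variables listed above is to load them once at startup and fail fast if any are missing. This is a sketch under the assumption that all four variables are required in production. The `Settings` class and `load_settings` function are hypothetical names, not taken from app/core/.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names come from the list above; treating all four as required
# is an assumption made for this sketch.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "JWT_SECRET", "ML_MODEL_PATH")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    jwt_secret: str
    ml_model_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, raising if any variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        jwt_secret=env["JWT_SECRET"],
        ml_model_path=env["ML_MODEL_PATH"],
    )
```

Passing the mapping as a parameter keeps the loader easy to test without touching the real process environment.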
API surface summary
- POST /api/v1/sentiment/analyze — Text sentiment scoring with model selection
- GET /api/v1/customers/segments — Retrieve customer segment definitions and metrics
- POST /api/v1/customers/segment — Create or update segmentation rules
- GET /api/v1/analytics/dashboard — Aggregated KPIs for executive and operator views
- GET /api/v1/predictions/churn — Churn probability scores with feature contributions
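As a rough illustration of calling these endpoints, the sketch below builds requests with the standard library. The base URL assumes a local uvicorn server on its default port, and the JSON body fields (`text`, `model`) are guesses at the request schema, which this document does not specify.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: local uvicorn default port


def build_sentiment_request(text: str, model: str = "bert") -> Request:
    """Build a POST request for the sentiment endpoint (body fields are hypothetical)."""
    body = json.dumps({"text": text, "model": model}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/api/v1/sentiment/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_segments_request() -> Request:
    """Build a GET request for the customer segments endpoint."""
    return Request(url=f"{BASE_URL}/api/v1/customers/segments", method="GET")


if __name__ == "__main__":
    request = build_sentiment_request("Great win for the home team")
    print(request.get_method(), request.full_url)
```

With the backend running, `urllib.request.urlopen(request)` would send the call. Production configurations would also need an authorization header, since the API is protected with JWT.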
Documentation and extension
The repository includes docs for API reference, data pipeline operations, ML pipeline guides, deployment, and contributing. Extension paths include federated learning for multi-book operators, real-time odds integration APIs, responsible-gaming compliance modules, and feature stores for cross-sport customer intelligence.
Key Features & Capabilities
- Multi-model sentiment engine with BERT transformers and traditional ML fallback
- Customer segmentation via RFM analysis, clustering algorithms, and churn prediction
- Real-time analytics with WebSocket live sentiment monitoring
- Advanced ML pipeline: automated training, feature engineering, and model monitoring
- Modern dark-theme UI with sports betting aesthetic and responsive design
- Production-ready Docker containers, environment configs, and CI/CD workflows
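The RFM analysis named in the list above reduces each customer's history to three numbers: recency (days since the last transaction), frequency (number of transactions), and monetary value (total spend). The sketch below computes them with the standard library. The `Transaction` fields and the thresholds in `label_segment` are illustrative assumptions, not the platform's rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Transaction:
    # Hypothetical minimal transaction record for this example.
    customer_id: str
    day: date
    amount: float


def rfm_table(transactions: List[Transaction], today: date) -> Dict[str, Dict[str, float]]:
    """Aggregate recency (days), frequency (count), and monetary (sum) per customer."""
    table: Dict[str, Dict[str, float]] = {}
    for tx in transactions:
        row = table.setdefault(
            tx.customer_id,
            {"recency_days": float("inf"), "frequency": 0, "monetary": 0.0},
        )
        # Recency is measured from the most recent transaction.
        row["recency_days"] = min(row["recency_days"], (today - tx.day).days)
        row["frequency"] += 1
        row["monetary"] += tx.amount
    return table


def label_segment(row: Dict[str, float]) -> str:
    """Toy rule-based labels; the thresholds are arbitrary examples."""
    if row["recency_days"] <= 7 and row["frequency"] >= 2:
        return "active"
    if row["recency_days"] > 30:
        return "at_risk"
    return "regular"
```

In practice the three values are often binned into quantile scores or fed to a clustering algorithm rather than labeled with fixed thresholds.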
Tech Stack & Components
Getting Started
1. Backend setup
Python 3.9+, Node.js 18+, Docker, and PostgreSQL are required. The activation command shown is for Windows; on macOS or Linux, use source venv/bin/activate instead.
cd backend
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
2. Frontend and database
Configure environment files from the .env.example templates. Run these commands from the repository root.
cp backend/.env.example backend/.env
cp frontend/.env.example frontend/.env
docker-compose up -d postgres
cd frontend && npm install
3. Run development servers
Start the backend and frontend in separate terminals.
# Backend (from backend/, with the virtual environment active)
uvicorn main:app --reload
# Frontend (from frontend/)
npm run dev
Frequently asked questions
- What is GameEdge Intelligence?
  A sports betting and fantasy sports analytics platform combining BERT-based sentiment analysis with traditional ML fallback, RFM customer segmentation, clustering, churn prediction, and real-time WebSocket sentiment feeds, built on FastAPI, PostgreSQL, and Next.js 14.
- What datasets power GameEdge Intelligence?
  The platform is designed around large-scale data: 1.6 million pre-labeled tweets for sentiment, 525k+ transaction records, 500k+ sports matches with odds, 50k+ synthetic customer profiles, and 100k+ simulated betting transactions. scripts/manage_data_pipeline.py controls ingestion and synthetic generation.
- How does GameEdge handle real-time analytics?
  WebSocket /ws/live-sentiment streams sentiment events to the Next.js dashboard. Recharts and D3.js render time-series trends, segment distributions, and churn risk views. The backend uses async FastAPI with isolated ML inference for CPU/GPU-bound model steps.
- What ML models does GameEdge Intelligence use?
  Hugging Face BERT transformers handle primary sentiment, with a scikit-learn fallback for resilience. Customer segmentation uses RFM analysis, clustering algorithms, and churn prediction models with automated preprocessing, train/validation splits, and monitoring hooks for distribution drift.
- How is GameEdge Intelligence deployed?
  Docker Compose supports development and production-like stacks (docker-compose.prod.yml). Environment variables configure DATABASE_URL, JWT_SECRET, ML_MODEL_PATH, and REDIS_URL. CI/CD workflows under .github/ provide automated test and build gates.
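The distribution-drift monitoring mentioned in the answers above can be illustrated with the Population Stability Index (PSI), a common drift measure. The source does not say which metric the platform uses, so PSI is a stand-in chosen for this sketch, and the 0.25 threshold in `should_retrain` is a widely quoted rule of thumb rather than a project setting.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence


def distribution(labels: Iterable[str], categories: Sequence[str], eps: float = 1e-6) -> List[float]:
    """Share of each category, floored at eps so the log ratio stays finite."""
    counts = Counter(labels)
    total = sum(counts[c] for c in categories)
    if total == 0:
        return [eps for _ in categories]
    return [max(counts[c] / total, eps) for c in categories]


def psi(expected: Iterable[str], actual: Iterable[str], categories: Sequence[str]) -> float:
    """PSI = sum over categories of (actual - expected) * ln(actual / expected)."""
    e = distribution(expected, categories)
    a = distribution(actual, categories)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(score: float, threshold: float = 0.25) -> bool:
    """Hypothetical retraining trigger based on a fixed PSI threshold."""
    return score > threshold


if __name__ == "__main__":
    labels = ("positive", "negative", "neutral")
    baseline = ["positive"] * 50 + ["negative"] * 50
    live = ["positive"] * 90 + ["negative"] * 10
    print(should_retrain(psi(baseline, live, labels)))
```

Each PSI term is non-negative, so the score is zero only when the two distributions match and grows as the live label mix moves away from the training baseline.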
