MoodAI Player – AI-Powered Emotion Detection Music System Using Raspberry Pi
by Aneeq Khan in Circuits > Raspberry Pi
Have you ever wished your devices could understand how you feel instead of simply waiting for commands?
The MoodAI Player is an AI-powered smart device that detects a user’s emotions in real time and automatically responds through music and ambient lighting. Using a webcam, a Raspberry Pi 5, and a custom-trained YOLO computer vision model, the system recognizes emotional states such as happiness, sadness, anger, fear, neutrality, and sleepiness. Once an emotion is detected, the system automatically selects suitable music and changes the LED strip lighting effects to create a more personalized and responsive experience.
The idea behind this project originated from a simple observation: modern devices are emotionally unaware. Even when users feel stressed, tired, or overwhelmed, they must manually search for music or adjust their environment. MoodAI Player attempts to bridge this gap by creating a system that understands facial expressions and reacts automatically.
This project brings together Artificial Intelligence, Computer Vision, a Raspberry Pi, emotion-based music, RGB lighting, and a custom laser-cut wooden case. It was built for the CTAI program at Howest, and it shows how AI can make the way people and devices interact feel more natural and human.
Supplies
Electronics
- Raspberry Pi 5 (8GB)
- UGREEN USB Webcam (Model 65857)
- LCD1602 Display (Freenove Kit)
- WS2812B RGB LED Strip (120 LEDs)
- Creative Pebble V2 Speaker
- USB-to-3.5mm Audio Adapter
- Push Buttons (x5)
- Jumper Wires
- 32GB MicroSD Card
- External 5V Power Supply for the LED Strip
Enclosure Materials
- 4mm Birch Plywood
- Wood Glue
Software
- Raspberry Pi OS
- Python 3.12
- YOLO (Ultralytics)
- Roboflow
- Gradio
- Docker
- PostgreSQL
- OpenCV
Tools
- Laser Cutter
- Soldering Iron
- Wire Cutter
- Adobe Illustrator
- MakerCase
Bill of Materials (BOM) — Total Cost: ~€290.38
1. Raspberry Pi 5 (8GB) + power supply + 32GB microSD — €174.95
2. USB Webcam (UGREEN) — €26.49
3. WS2812B RGB LED Strip (120 LEDs) — €12.00
4. Jumper Wires — €9.99
5. Multiplex 4mm Birch Plywood (x3) — €12.00
6. AC Power Supply 5V — €16.00
7. LCD1602 Display — €8.95
8. USB Speaker (Creative Pebble V2) — €24.00
9. Push Buttons (x5) — €6.00
Raspberry Pi Setup
The Raspberry Pi 5 acts as the central controller for the MoodAI Player. It handles emotion detection, music playback, LED control, LCD communication, and interaction with the database.
Model training was performed on a separate laptop with an NVIDIA RTX 4050 GPU. The Raspberry Pi only runs the trained model for real-time inference.
Install Raspberry Pi OS:
1. Download Raspberry Pi OS
2. Flash the image to a MicroSD card using Raspberry Pi Imager
3. Configure Wi-Fi and SSH
4. Boot the Raspberry Pi
Create a Python Virtual Environment:
During development, Python 3.14 caused compatibility issues with Gradio and several dependencies. To ensure stable operation, the project uses Python 3.12 inside a virtual environment.
sudo apt update
sudo apt upgrade -y
python3.12 -m venv moodai-env
source moodai-env/bin/activate
Install Dependencies:
pip install ultralytics
pip install gradio
pip install opencv-python-headless
pip install psycopg2-binary
Hardware Verification:
Before continuing, verify that:
- Webcam is detected correctly
- LCD1602 functions properly
- Speaker outputs sound
- GPIO buttons respond correctly
- PostgreSQL container is accessible
- YOLO model loads successfully
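Part of the checklist above can be automated. The following is a minimal sketch, not the project's own test code: the helper names `check_webcam`, `check_model`, and `run_checks` are hypothetical, and camera index 0 and the `best.pt` path are assumptions.

```python
from typing import Callable, Dict


def check_webcam(index: int = 0) -> bool:
    """Return True if one frame can be read (camera index 0 is an assumption)."""
    import cv2  # imported lazily so run_checks works even without OpenCV

    cap = cv2.VideoCapture(index)
    try:
        ok, _frame = cap.read()
        return bool(ok)
    finally:
        cap.release()


def check_model(path: str = "best.pt") -> bool:
    """Return True if Ultralytics can load the weights (path is an assumption)."""
    from ultralytics import YOLO

    YOLO(path)
    return True


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every check and treat any exception as a failed check."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results
```

On the Raspberry Pi, `run_checks({"webcam": check_webcam, "yolo model": check_model})` returns a dictionary of pass/fail flags. Checks for the LCD, speaker, buttons, and database could be added in the same way.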
Hardware Wiring
The MoodAI Player combines multiple hardware components that work together to create an interactive user experience.
Webcam:
The UGREEN USB webcam continuously captures facial images and provides input to the emotion detection model. Connect it directly via USB.
LCD1602 Display (I2C):
LCD Pin → Raspberry Pi
GND → GND
VCC → 5V
SDA → SDA
SCL → SCL
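Once the display is wired, text has to fit its two rows of 16 characters. The sketch below shows one possible way to prepare the two rows before sending them to the LCD. The function names and the "Mood:" layout are assumptions, and the actual write to the display depends on the LCD driver used.

```python
LCD_COLUMNS = 16  # an LCD1602 shows 2 rows of 16 characters


def fit_row(text: str, width: int = LCD_COLUMNS) -> str:
    """Truncate or pad text so it fills exactly one LCD row."""
    return text[:width].ljust(width)


def build_screen(emotion: str, song: str) -> tuple[str, str]:
    """Row 1 holds the detected emotion, row 2 the song title (assumed layout)."""
    return fit_row(f"Mood: {emotion}"), fit_row(song)
```

Padding every row to the full width overwrites leftovers from a previous, longer message without clearing the whole display.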
RGB LED Strip (WS2812B):
LED Pin → Connection
5V → External 5V power supply
GND → GND (shared between the Raspberry Pi and the external supply)
Data (MOSI) → GPIO10
⚠️ Raspberry Pi 5 Compatibility Note:
Most WS2812B tutorials target the Raspberry Pi 4 and older. Common LED libraries such as rpi_ws281x do not work reliably on the Raspberry Pi 5 because its GPIO pins are handled by the new RP1 I/O controller. To solve this, use the rpi5-ws2812 library and connect the strip through the SPI interface using GPIO10 (MOSI). This provides stable, real-time lighting effects.
Emotion-to-Color Mapping:
Happy → Yellow
Neutral → Cool White
Sad → Blue
Fear → Light Purple
Anger → Red
Sleepy → Orange
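In code, the mapping above can be kept in a single lookup table. This is an illustrative sketch: the RGB values are assumptions chosen to match the color names, and `color_for` is a hypothetical helper. The call that hands the tuple to the LED driver is left out because it depends on the rpi5-ws2812 API.

```python
# RGB values are illustrative assumptions; only the color names come from the mapping.
EMOTION_COLORS = {
    "happy": (255, 200, 0),      # yellow
    "neutral": (200, 220, 255),  # cool white
    "sad": (0, 0, 255),          # blue
    "fear": (180, 130, 255),     # light purple
    "anger": (255, 0, 0),        # red
    "sleepy": (255, 100, 0),     # orange
}


def color_for(emotion, brightness=1.0):
    """Return the RGB tuple for an emotion, scaled by a brightness in [0, 1].

    Unknown labels fall back to the neutral color.
    """
    r, g, b = EMOTION_COLORS.get(emotion.lower(), EMOTION_COLORS["neutral"])
    level = max(0.0, min(1.0, brightness))
    return (int(r * level), int(g * level), int(b * level))
```

Scaling the brightness in software is one way to limit the current drawn by a 120-LED strip.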
Push Button GPIO Pins:
Power / Start System → GPIO16
Previous Song → GPIO12
Next Song → GPIO25
Volume Down → GPIO23
Volume Up → GPIO24
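The pin assignments above could be bound to handlers with gpiozero, as in the sketch below. It assumes each button connects its GPIO pin to GND (gpiozero's default pull-up wiring), because the write-up does not state the wiring polarity. The `attach_buttons` helper and the action names are hypothetical.

```python
BUTTON_PINS = {
    "power": 16,
    "previous": 12,
    "next": 25,
    "volume_down": 23,
    "volume_up": 24,
}


def attach_buttons(handlers, pins=BUTTON_PINS):
    """Create one gpiozero Button per action and bind its press handler.

    `handlers` maps the action names in BUTTON_PINS to callables.
    """
    from gpiozero import Button  # lazy import: only needed on the Raspberry Pi

    buttons = {}
    for action, pin in pins.items():
        button = Button(pin, bounce_time=0.05)  # 50 ms debounce, an assumed value
        button.when_pressed = handlers[action]
        buttons[action] = button
    return buttons  # keep a reference so the buttons stay active
```

A script that only waits for button presses can call `signal.pause()` after `attach_buttons(...)` to stay alive.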
Audio Output:
The Creative Pebble V2 speaker connects via a USB-to-3.5mm audio adapter and plays the emotion-based music selected by the system.
Creating the Emotion Dataset
A custom dataset was created specifically for this project using images captured with a webcam. All images were collected and annotated manually.
Dataset Statistics:
- Total Images: 3,166
- Total Annotations: 3,203
- Classes: 6
Emotion Classes:
Happy → 573 annotations
Neutral → 572 annotations
Sad → 558 annotations
Fear → 520 annotations
Anger → 507 annotations
Sleepy → 473 annotations
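The per-class figures add up to 3,203, which matches the annotation total rather than the image total. A quick tally like the one below is a simple sanity check on class balance. The numbers come from the list above, while the variable names are only illustrative.

```python
CLASS_COUNTS = {
    "Happy": 573,
    "Neutral": 572,
    "Sad": 558,
    "Fear": 520,
    "Anger": 507,
    "Sleepy": 473,
}

total = sum(CLASS_COUNTS.values())
shares = {name: count / total for name, count in CLASS_COUNTS.items()}
# Ratio between the most and least represented class
imbalance = max(CLASS_COUNTS.values()) / min(CLASS_COUNTS.values())

for name, share in shares.items():
    print(f"{name:<8}{share:6.1%}")
print(f"total={total}, largest/smallest={imbalance:.2f}")
```

The largest class has roughly 1.2 times as many examples as the smallest one, so the dataset is fairly balanced.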
Data Collection Process:
Photos for each emotion were taken directly with a webcam. Each image was then manually annotated in Roboflow and exported in YOLO format for training. This ensured all training data was original and collected specifically for the MoodAI Player project.
Annotation Process:
1. Upload images to Roboflow
2. Draw bounding boxes around faces
3. Assign emotion labels
4. Review annotations
5. Export in YOLO format
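In the YOLO export, each image gets a text file with one line per bounding box: a class index followed by the box center, width, and height, all normalized to the range 0 to 1. The sketch below converts such a line back to pixel coordinates, which is handy when reviewing annotations. The function name is hypothetical.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert 'class xc yc w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    x1, y1 = xc - w / 2, yc - h / 2  # top-left corner
    return int(class_id), round(x1), round(y1), round(x1 + w), round(y1 + h)


print(yolo_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480))
# (0, 240, 120, 400, 360)
```

Which class index corresponds to which emotion is defined by the exported dataset file, so no mapping is assumed here.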
The complete source code and the .ai files for laser cutting the enclosure are available on GitHub: https://github.com/KhanMuhammadAneeqShakeel/MoodAI_Player
The dataset is publicly available on Roboflow.com under the project name: T5
Training the YOLO Model
The AI component of MoodAI Player is based on a YOLO object detection model trained to recognize facial emotions.
Training Environment & Configuration:
Model → YOLO11n
GPU → NVIDIA RTX 4050 Laptop GPU
Annotation Platform → Roboflow
Operating System → Windows
Epochs → 100
Patience → 20
Mosaic → 0.0
Workers → 0
mAP@50 → 0.858
Training Workflow:
1. The dataset was uploaded and managed in Roboflow
2. The dataset was exported in YOLO format
3. YOLO11 was trained using Python
4. The best model weights were exported as best.pt
5. The trained model was transferred to Raspberry Pi 5 for real-time inference
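With the Ultralytics Python API, the training step can be sketched as follows. The hyperparameters are the ones listed in the configuration. The `data.yaml` path, the use of the pretrained `yolo11n.pt` checkpoint as a starting point, and the `train` wrapper itself are assumptions rather than the project's actual script.

```python
TRAIN_ARGS = {
    "data": "data.yaml",  # assumed path to the dataset file from the Roboflow export
    "epochs": 100,
    "patience": 20,       # stop early after 20 epochs without improvement
    "mosaic": 0.0,        # mosaic augmentation disabled
    "workers": 0,         # avoids the dataloader errors seen on Windows
}


def train(weights="yolo11n.pt", **overrides):
    """Fine-tune a YOLO11n checkpoint with the settings above."""
    from ultralytics import YOLO  # lazy import: needs the ultralytics package

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

By default, Ultralytics writes the best checkpoint to a `weights/best.pt` file inside its run directory. That is the file copied to the Raspberry Pi.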
Performance:
The final model achieved an mAP@50 of 0.858, providing reliable real-time facial emotion detection while remaining lightweight enough to run efficiently on Raspberry Pi hardware.
Note on Workers=0:
- During training on Windows, setting workers to any value above 0 caused dataloader errors. Setting workers=0 resolved this issue without affecting model performance.
Software Architecture and Dashboard
One of the most important parts of the MoodAI Player is the software ecosystem that connects the AI model, hardware components, and user interface.
Backend Components:
The project uses an OOP Python architecture with dedicated classes for each hardware component:
- EmotionDetector — runs YOLO inference on webcam frames
- MoodVoter — collects votes over a 4-second window and decides the winning emotion
- LCDDisplay — controls the I2C LCD1602
- LEDController — controls the WS2812B LED strip via SPI
- MusicPlayer — handles pygame-based audio playback
- MoodAIPlayer — main orchestrator class
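To illustrate the voting idea, here is a minimal sketch of a majority vote over a 4-second window. It is not the project's implementation: only the class name and the window length come from the list above, while the method name and the use of explicit timestamps are assumptions made to keep the example testable.

```python
from collections import Counter


class MoodVoter:
    """Collects per-frame predictions and picks the most frequent one per window."""

    def __init__(self, window_seconds=4.0):
        self.window_seconds = window_seconds
        self._votes = Counter()
        self._window_start = None

    def add_vote(self, emotion, timestamp):
        """Record a vote; return the winner once the window has elapsed, else None."""
        if self._window_start is None:
            self._window_start = timestamp  # first vote opens a new window
        self._votes[emotion] += 1
        if timestamp - self._window_start < self.window_seconds:
            return None
        winner = self._votes.most_common(1)[0][0]
        self._votes.clear()
        self._window_start = None
        return winner
```

In a detection loop, each frame's predicted label would be passed in together with `time.monotonic()`. Voting over several seconds keeps a single misclassified frame from switching the music.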
Database:
A PostgreSQL database stores detected emotions, timestamps, playback history, and system events. All data is accessible through the Gradio dashboard.
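As a rough illustration of how a detection could be stored, the sketch below uses psycopg2 with a single table. The table name, column names, and helper functions are hypothetical, and the project's real schema may differ.

```python
from datetime import datetime, timezone

# Table and column names are hypothetical, not the project's actual schema.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS detections (
    id SERIAL PRIMARY KEY,
    detected_at TIMESTAMPTZ NOT NULL,
    emotion TEXT NOT NULL,
    confidence REAL NOT NULL,
    song TEXT
)
"""
INSERT_SQL = (
    "INSERT INTO detections (detected_at, emotion, confidence, song) "
    "VALUES (%s, %s, %s, %s)"
)


def detection_row(emotion, confidence, song=None, when=None):
    """Build the parameter tuple for INSERT_SQL."""
    when = when or datetime.now(timezone.utc)
    return (when, emotion, round(float(confidence), 3), song)


def log_detection(dsn, emotion, confidence, song=None):
    """Insert one detection; dsn is a libpq connection string."""
    import psycopg2  # lazy import: needs psycopg2-binary and a running database

    conn = psycopg2.connect(dsn)
    try:
        with conn:  # commits on success, rolls back on error
            with conn.cursor() as cur:
                cur.execute(CREATE_TABLE_SQL)
                cur.execute(INSERT_SQL, detection_row(emotion, confidence, song))
    finally:
        conn.close()
```

Passing values as query parameters instead of formatting them into the SQL string avoids injection problems with user-supplied song names.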
Gradio Dashboard Tabs:
- Home — Welcome screen and project overview
- About — How the system works and hardware list
- Operating — Start/stop detection, set confidence threshold, select music source
- Playlists — Create playlists, add songs per emotion, upload your own audio
- Songs — Browse and play any song manually
- Data — Live stats, mood bar chart, listening time donut chart
- Debugging — Test LCD, LED strip, camera, and buttons individually
- History — Last 50 detections with emotion, song, and LED color
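The tab layout can be reproduced with Gradio's Blocks API. The sketch below only creates placeholder tabs and is not the project's dashboard: `build_dashboard` is a hypothetical name, and the slider range and default value are assumptions.

```python
TAB_NAMES = [
    "Home", "About", "Operating", "Playlists",
    "Songs", "Data", "Debugging", "History",
]


def build_dashboard():
    """Return a Gradio Blocks app with one placeholder tab per dashboard section."""
    import gradio as gr  # lazy import: needs the gradio package

    with gr.Blocks(title="MoodAI Player") as demo:
        for name in TAB_NAMES:
            with gr.Tab(name):
                gr.Markdown(f"## {name}")
                if name == "Operating":
                    # Slider range and default value are assumptions
                    gr.Slider(minimum=0.1, maximum=1.0, value=0.5, step=0.05,
                              label="Confidence threshold")
    return demo
```

Calling `build_dashboard().launch(server_name="0.0.0.0")` on the Pi would make the page reachable from other devices on the same network.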
The complete source code is available on GitHub:
https://github.com/KhanMuhammadAneeqShakeel/MoodAI_Player
Designing and Building the Enclosure
The enclosure was designed using MakerCase and refined in Adobe Illustrator before laser cutting.
Dimensions: 290mm × 210mm × 260mm
Material: 4mm Birch Plywood
Design Goals:
- Protect all electronics
- Hide and organize cables
- Improve overall aesthetics
- Simplify transportation
- Provide easy maintenance access
Manufacturing Process:
1. Create box structure in MakerCase
2. Export SVG files
3. Refine SVG files in Adobe Illustrator (add holes for camera, buttons, speaker, LCD)
4. Laser cut all panels
5. Sand all edges smooth
6. Assemble with wood glue
Safety Features:
- No exposed wiring
- All electronics protected inside enclosure
- Organized cable routing
The complete .ai files for laser cutting are available on GitHub:
https://github.com/KhanMuhammadAneeqShakeel/MoodAI_Player
Final Assembly and Demonstration
After testing each subsystem independently, all components were integrated into the final prototype.
Final Workflow:
1. User sits in front of the webcam
2. Webcam captures facial images
3. YOLO detects the user's facial emotion
4. The system selects a suitable music playlist
5. The RGB LED strip updates its color
6. The LCD1602 displays the detected emotion and song name
7. The detection event is stored in the database
8. Music playback starts automatically
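Steps 4 to 8 of this workflow can be sketched as one function that receives the winning emotion and calls each component in turn. This is a simplified illustration under stated assumptions: the playlist contents are placeholders, and `pick_song`, `handle_emotion`, and the `actions` dictionary are hypothetical names rather than the project's API.

```python
import random

# Hypothetical playlists; file names are placeholders, not the project's songs.
PLAYLISTS = {
    "happy": ["happy_1.mp3", "happy_2.mp3"],
    "sad": ["sad_1.mp3"],
}


def pick_song(emotion, playlists, rng=random):
    """Choose a random song for the emotion, or None if no playlist exists."""
    songs = playlists.get(emotion, [])
    return rng.choice(songs) if songs else None


def handle_emotion(emotion, playlists, actions, rng=random):
    """Run the post-detection steps in the order listed above.

    `actions` maps step names to callables supplied by the caller, for
    example the LED, LCD, database, and audio components.
    """
    song = pick_song(emotion, playlists, rng)
    actions["set_led"](emotion)
    actions["show_lcd"](emotion, song)
    actions["log"](emotion, song)
    if song is not None:
        actions["play"](song)
    return song
```

Passing the components in as callables keeps this logic testable on a laptop without any hardware attached.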
Final Features:
✓ Real-time emotion detection
✓ AI-powered music recommendation
✓ RGB mood lighting
✓ LCD status display
✓ PostgreSQL database backend
✓ Gradio web dashboard
✓ Raspberry Pi 5 deployment
✓ Custom laser-cut enclosure
✓ Physical button controls
✓ User-configurable confidence threshold
Future Improvements:
- Spotify integration
- Multiple-user support
- Voice interaction
- Larger and more diverse datasets
- Cloud synchronization
The completed MoodAI Player demonstrates how artificial intelligence, embedded systems, and emotion-aware interfaces can be combined to create more natural and personalized interactions between people and technology.