Bit-Vision: See Binary Think Decimal
by saiyamhowest in Circuits > Raspberry Pi
Bit-Vision is an interactive educational device that teaches binary numbers using computer vision and artificial intelligence. Users display a binary value using hand gestures, and the system automatically recognizes the gesture, converts it into decimal form, and displays the result in real time.
Unlike traditional learning methods, Bit-Vision lets students and beginners visualize binary conversions, turning them into an engaging, hands-on experience through AI-powered interaction.
Supplies
Electronics
- Raspberry Pi 5 (65 euro)
- Logitech C270 USB Webcam (22 euro)
- SSD1306 OLED Display (7 euro)
- 2 × RGB LEDs (5.5 euro)
- Push Button (1.8 euro)
- Active Buzzer (1.5 euro)
- Breadboard (5.4 euro)
- Jumper Wires (7 euro)
- 5V Power Supply (16 euro)
- Pi T-cobbler (9 euro)
Software
- Python 3.11
- MediaPipe (hand keypoint detection)
- OpenCV
- Gradio
- PostgreSQL
- Docker
- XGBoost
Material
- Enclosure made from 3 mm ABS
Total Cost: 181 euros
OVERVIEW
Binary numbers are the foundation of modern computing, but many students struggle to visualize them.
Bit-Vision addresses this challenge by allowing users to represent binary digits with their fingers. A camera captures the hand gesture, an AI model interprets the finger positions, and the system instantly converts the gesture into a decimal number.
This creates an intuitive bridge between abstract binary concepts and physical interaction.
Features
✅ Real-time hand gesture recognition
✅ Binary to decimal conversion
✅ OLED display feedback
✅ RGB status indicators
✅ Push-button hardware control
✅ Wireless Gradio dashboard
✅ PostgreSQL detection history
✅ Raspberry Pi standalone operation
✅ Safe shutdown menu system
Understanding the Problem
Most students first encounter binary numbers through textbooks or lectures.
A binary number such as:
1101
must mentally be converted into:
13
For beginners, this process often feels disconnected from real-world interaction.
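As a worked example of the conversion above, each bit is multiplied by a power of two according to its position, so 1101 is 1×8 + 1×4 + 0×2 + 1×1 = 13. A minimal Python sketch of that arithmetic follows; the function name `binary_to_decimal` is illustrative and not part of the Bit-Vision code.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to its decimal value by positional weights."""
    total = 0
    # The rightmost bit has weight 2**0, the next 2**1, and so on.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        total += int(bit) * (2 ** position)
    return total


print(binary_to_decimal("1101"))  # 8 + 4 + 0 + 1 = 13
```

Python's built-in `int("1101", 2)` gives the same result; the loop is spelled out only to show the steps a learner performs mentally.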
I wanted to create a system where binary numbers could be learned visually and physically rather than memorized.
This led to a simple question:
What if a student could show a binary number using their fingers and have a computer recognize it instantly?
That question became the foundation of Bit-Vision.
Building the Dataset
To train the model, I needed examples of every binary configuration.
A custom data collection tool was created to:
- Capture webcam frames
- Extract hand landmarks
- Label the gesture
- Save the feature vectors
Hundreds of samples were collected for each binary combination.
The dataset included variations in:
- Hand orientation
- Distance from camera
- Lighting conditions
- User positioning
This helped improve robustness during training.
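The labelling and saving steps of such a collection tool could be sketched as below. This is not the actual tool: it assumes each captured hand arrives as 21 (x, y, z) tuples from the hand tracker, that labels are strings of 0s and 1s, and that samples are stored as CSV rows. The function names and the file layout are hypothetical.

```python
import csv
import io


def flatten_landmarks(landmarks):
    """Turn 21 (x, y, z) tuples into one flat list of 63 numbers."""
    return [coord for point in landmarks for coord in point]


def write_sample(csv_file, label, landmarks):
    """Append one labelled sample as a CSV row: label first, then 63 coordinates."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    if not label or set(label) - {"0", "1"}:
        raise ValueError("label must be a string of 0s and 1s")
    row = [label] + [f"{value:.6f}" for value in flatten_landmarks(landmarks)]
    csv.writer(csv_file).writerow(row)
    return len(row)


# Usage with an in-memory buffer and a dummy hand; a real tool would
# open a file on disk and pass landmarks from the camera pipeline.
buffer = io.StringIO()
dummy_hand = [(0.1, 0.2, 0.3)] * 21
write_sample(buffer, "01101", dummy_hand)
```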
Choosing the AI Pipeline
Several approaches were considered:
- Image classification
- CNN-based recognition
- Object detection
- Hand landmark detection
After experimentation, MediaPipe Hand Tracking was selected because it provides 21 highly accurate hand landmarks in real time.
Instead of training directly on camera images, I extracted landmark coordinates and used them as machine learning features.
Initially, the model was trained only on the normalized coordinates, excluding the thumb. After further analysis, the thumb was added back because its coordinates and position also play a role in forming the gestures.
The final feature set combines 63 normalized coordinates (21 landmarks × 3 axes), 5 fingertip positions, and 5 finger angles, for a total of 73 features, which were then used according to their feature importance.
Working from landmarks rather than raw images dramatically reduced computation while improving consistency.
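One way the 73-value feature vector could be assembled is sketched below. The landmark indices (wrist = 0, fingertips = 4, 8, 12, 16, 20) follow MediaPipe's hand model. The rest is assumed for illustration and may differ from the project's code: normalizing relative to the wrist and scaling by the wrist-to-middle-knuckle distance, treating each "fingertip position" as the tip's distance from the wrist, and measuring each finger angle at the joint after the finger's base.

```python
import math

WRIST = 0
MIDDLE_BASE = 9
FINGERTIPS = [4, 8, 12, 16, 20]
# (base joint, next joint, tip) for thumb, index, middle, ring, pinky.
FINGER_JOINTS = [(1, 2, 4), (5, 6, 8), (9, 10, 12), (13, 14, 16), (17, 18, 20)]


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _length(v):
    return math.sqrt(sum(c * c for c in v))


def _angle(a, b, c):
    """Angle in radians at point b, between the segments b-a and b-c."""
    v1, v2 = _sub(a, b), _sub(c, b)
    n1, n2 = _length(v1), _length(v2)
    if n1 == 0 or n2 == 0:
        return 0.0
    cosine = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    return math.acos(max(-1.0, min(1.0, cosine)))


def extract_features(landmarks):
    """Build 63 normalized coordinates + 5 tip distances + 5 angles = 73 values."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    wrist = landmarks[WRIST]
    # Assumed normalization: wrist at the origin, hand size scaled to about 1.
    scale = _length(_sub(landmarks[MIDDLE_BASE], wrist)) or 1.0
    normalized = [tuple(c / scale for c in _sub(p, wrist)) for p in landmarks]
    coords = [c for point in normalized for c in point]
    tip_distances = [_length(normalized[t]) for t in FINGERTIPS]
    angles = [_angle(normalized[a], normalized[b], normalized[c])
              for a, b, c in FINGER_JOINTS]
    return coords + tip_distances + angles
```

Because every value is relative to the wrist and the hand's own size, the same gesture yields similar features whether the hand is near or far from the camera.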
The pipeline became:
Camera
↓
MediaPipe
↓
21 Hand Landmarks
↓
Feature Extraction
↓
Machine Learning Model
↓
Binary Prediction
Training the Machine Learning Model
Several machine learning approaches were evaluated.
After testing multiple algorithms, XGBoost delivered the best balance between:
- Accuracy
- Speed
- Raspberry Pi performance
The model was trained on the features extracted from the hand landmarks.
After training:
- The model was exported
- Stored locally
- Loaded directly on the Raspberry Pi
This allowed all inference to run completely offline.
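Once a loaded classifier returns one probability per gesture class, turning that into a displayed result is a small post-processing step. The sketch below is illustrative only: it assumes the class index equals the decimal value (so 32 classes for five fingers) and uses a hypothetical confidence threshold of 0.6, below which the result is treated as "no detection".

```python
def decode_prediction(probabilities, min_confidence=0.6):
    """Pick the most likely class and express it as binary and decimal.

    probabilities: one score per class, where the class index is assumed
    to be the decimal value of the gesture.
    Returns None when the best score is below min_confidence.
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = probabilities[best]
    if confidence < min_confidence:
        return None
    # Number of bits needed to write the largest class index.
    width = max(1, (len(probabilities) - 1).bit_length())
    return {
        "binary": format(best, f"0{width}b"),
        "decimal": best,
        "confidence": confidence,
    }


scores = [0.0] * 32
scores[13] = 0.9
scores[5] = 0.1
print(decode_prediction(scores))
# {'binary': '01101', 'decimal': 13, 'confidence': 0.9}
```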
Database Integration
Once detections were working, I wanted the system to maintain a history of interactions.
A PostgreSQL database was added with three tables: users, sessions, and detections.
The database ERD is shown in the image.
Each detection stores:
- Binary value
- Decimal value
- Confidence score
- Detection timestamp
- Detection image
This transformed Bit-Vision from a simple detector into a data-driven learning platform.
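A rough sketch of how the three tables might relate is given below. To stay runnable without a database server it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, so the column types are SQLite's. The column names, the users → sessions → detections relationship, and storing the image as a file path are all assumptions, since the actual ERD is only available as an image.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE sessions (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    started_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE detections (
    id INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL REFERENCES sessions(id),
    binary_value TEXT NOT NULL,
    decimal_value INTEGER NOT NULL,
    confidence REAL NOT NULL,
    detected_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    image_path TEXT
);
"""


def create_database(path=":memory:"):
    """Create the three tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_detection(conn, session_id, binary_value, confidence, image_path=None):
    """Store one detection; the decimal value is derived from the binary string."""
    cursor = conn.execute(
        "INSERT INTO detections "
        "(session_id, binary_value, decimal_value, confidence, image_path) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_id, binary_value, int(binary_value, 2), confidence, image_path),
    )
    conn.commit()
    return cursor.lastrowid
```

The timestamp is filled in by the database default, so the caller only supplies the values the detector actually produces.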
Hardware Assembly
OLED Display
Displays:
- Startup screen
- Detection results
- IP address
- Shutdown menu
RGB LEDs
RGB 1
- Blue = System Booting
- Green = System Ready
- Red = System Shutdown
RGB 2
- Green = Ready
- Blue = Processing
- Green Flash = Successful Detection
- Red Flash = No Detection
Push Button
- Short Press → Run Detection
- Long Press → Open Menu
- Shutdown Confirmation
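Telling a short press from a long press comes down to measuring how long the button was held. The sketch below shows that logic on its own, without any GPIO library; the 1.0 second threshold and the names `classify_press` and `ACTIONS` are assumptions, not values taken from the project.

```python
LONG_PRESS_SECONDS = 1.0  # assumed threshold; tune to taste

# Mapping from press type to the behaviour described above.
ACTIONS = {"short": "run_detection", "long": "open_menu"}


def classify_press(pressed_at, released_at, long_press_seconds=LONG_PRESS_SECONDS):
    """Return "long" if the button was held at least long_press_seconds, else "short"."""
    duration = released_at - pressed_at
    if duration < 0:
        raise ValueError("release time is earlier than press time")
    return "long" if duration >= long_press_seconds else "short"


print(ACTIONS[classify_press(10.0, 10.2)])  # run_detection
print(ACTIONS[classify_press(10.0, 11.5)])  # open_menu
```

On the device, the two timestamps could come from `time.monotonic()` recorded in the button's press and release handlers.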
Menu Actions
- Show IP Address
- Shutdown Device
Assembly View
Creating the Web Dashboard
Although the OLED was useful, a larger interface was needed for analysis and demonstrations.
Gradio was selected because it allowed rapid creation of a browser-based dashboard.
The dashboard provides:
- Live camera feed
- Detection results
- Detection history
- Confidence metrics
- Database statistics
- System diagnostics
This made the project much easier to demonstrate and evaluate.
Making the System Wireless
Initially, development was performed through an Ethernet connection.
Later, the Raspberry Pi was configured to operate entirely over Wi-Fi.
The OLED displays the device IP address at startup.
Any device connected to the same network can access: http://<raspberry-pi-ip>:7862
No monitor, keyboard, or Ethernet cable is required.
Only power is needed.
This transformed Bit-Vision into a truly standalone embedded system.
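Finding the address to show on the OLED can be done with the standard library alone. The sketch below is one common approach and not necessarily what Bit-Vision uses: it opens a UDP socket "towards" a public address (no packet is sent) and reads back the local address chosen by the operating system. Port 7862 comes from the URL above; the function names and the 8.8.8.8 probe address are illustrative.

```python
import socket

DASHBOARD_PORT = 7862  # port used in the dashboard URL above


def get_local_ip():
    """Return this machine's LAN address, or 127.0.0.1 if no route is available."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # connect() on a UDP socket sends nothing; it only selects a route.
            sock.connect(("8.8.8.8", 80))
            return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"


def dashboard_url(ip, port=DASHBOARD_PORT):
    """Build the address a phone or laptop on the same network would open."""
    return f"http://{ip}:{port}"


print(dashboard_url(get_local_ip()))
```

For the dashboard to be reachable from other devices, the web server also has to listen on all interfaces rather than only on localhost.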
Final System Architecture
Reliability Improvements
Several engineering challenges emerged during development.
Camera Latency
The webcam occasionally returned stale frames.
This was solved by:
- Reducing OpenCV buffer size
- Maintaining the latest frame cache
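The latest-frame cache can be sketched as a background thread that keeps reading from the camera and remembers only the newest frame, so detection never works on a stale one. The class below is a hypothetical illustration: it accepts any object with a `read()` method returning `(ok, frame)`, which is the shape of OpenCV's `VideoCapture.read()`, and the class name is not taken from the project.

```python
import threading
import time


class LatestFrameReader:
    """Continuously read frames in the background and keep only the newest."""

    def __init__(self, capture):
        self._capture = capture  # any object with read() -> (ok, frame)
        self._lock = threading.Lock()
        self._frame = None
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()
        return self

    def _loop(self):
        while self._running:
            ok, frame = self._capture.read()
            if ok:
                with self._lock:
                    self._frame = frame  # overwrite: older frames are dropped
            else:
                time.sleep(0.01)  # avoid a busy loop if the camera stalls

    def latest(self):
        """Return the most recent frame, or None if nothing has been read yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join(timeout=1.0)
```

With OpenCV, the capture object would be a `cv2.VideoCapture`, and the buffer size can be requested with `capture.set(cv2.CAP_PROP_BUFFERSIZE, 1)`, although whether that setting is honoured depends on the capture backend.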
GPIO Integration
Hardware controls needed to work independently from the Gradio interface.
A dedicated GPIO handling system was implemented to ensure:
- Button presses
- OLED updates
- LED control
could operate even when the dashboard was open.
Safe Shutdown
Sudden power removal can corrupt storage.
A shutdown menu was added so that the device is powered off in controlled steps:
- Open the menu
- Select shutdown
- Confirm shutdown
- Execute safe power-off
The OLED provides shutdown feedback before power is removed.
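The shutdown flow above can be modelled as a small state machine driven by button presses. The sketch below is an assumption-heavy illustration: the menu items come from the Menu Actions list, but the way presses navigate the menu (short press moves to the next item or cancels, long press selects or confirms) and the returned display strings are hypothetical. The power-off action is injected as a callback so the logic can be tried without shutting anything down.

```python
MENU_ITEMS = ["Show IP Address", "Shutdown Device"]


class ShutdownMenu:
    """States: IDLE -> MENU -> CONFIRM -> OFF, driven by "short"/"long" presses."""

    def __init__(self, power_off):
        self.state = "IDLE"
        self.index = 0
        self._power_off = power_off  # called once shutdown is confirmed

    def handle(self, press):
        """Process one press and return the text to show on the display."""
        if self.state == "IDLE":
            if press == "long":
                self.state, self.index = "MENU", 0
                return "menu: " + MENU_ITEMS[self.index]
            return "run detection"
        if self.state == "MENU":
            if press == "short":  # assumed: short press moves to the next item
                self.index = (self.index + 1) % len(MENU_ITEMS)
                return "menu: " + MENU_ITEMS[self.index]
            if MENU_ITEMS[self.index] == "Shutdown Device":
                self.state = "CONFIRM"
                return "confirm shutdown?"
            self.state = "IDLE"
            return "show ip"
        if self.state == "CONFIRM":
            if press == "long":  # assumed: long press confirms
                self.state = "OFF"
                self._power_off()
                return "shutting down"
            self.state = "IDLE"
            return "cancelled"
        return "off"
```

On the Raspberry Pi, the callback could run a command such as `sudo shutdown -h now` through `subprocess.run`, after the display has shown its final message.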
Conclusion
Bit-Vision began as a simple idea for teaching binary numbers through hand gestures. Through multiple iterations involving computer vision, machine learning, embedded systems, databases, networking, and user interface design, it evolved into a complete standalone educational platform.
The project demonstrates how AI can be used not only for automation, but also as a tool for making fundamental computing concepts more engaging, interactive, and accessible.