Bit-Vision: See Binary Think Decimal

by saiyamhowest in Circuits > Raspberry Pi

bitvision_heroo (1).png

Bit-Vision is an interactive educational device that teaches binary numbers using computer vision and artificial intelligence. Users display a binary value using hand gestures, and the system automatically recognizes the gesture, converts it into decimal form, and displays the result in real time.

Unlike traditional learning methods, Bit-Vision lets students and beginners visualize binary conversions and turns them into an engaging, hands-on experience through AI-powered interaction.

Supplies

Screenshot 2026-06-16 151546.png
Screenshot 2026-06-16 133339.png
Screenshot 2026-06-16 133242.png
Screenshot 2026-06-17 191632.png
Screenshot 2026-06-16 133447.png
WhatsApp Image 2026-06-10 at 13.31.17.jpeg
Screenshot 2026-06-16 133417.png
Screenshot 2026-06-16 133358.png
rgb-led-module-2.jpg
Screenshot 2026-06-16 151942.png
Screenshot 2026-06-16 134220.png
Screenshot 2026-06-16 134204.png
FBQ0ST0MPX9BJU3.jpg

Electronics

  1. Raspberry Pi 5 (65 euro)
  2. Logitech C270 USB Webcam (22 euro)
  3. SSD1306 OLED Display (7 euro)
  4. 2 × RGB LEDs (5.5 euro)
  5. Push Button (1.8 euro)
  6. Active Buzzer (1.5 euro)
  7. Breadboard (5.4 euro)
  8. Jumper Wires (7 euro)
  9. 5V Power Supply (16 euro)
  10. Pi T-cobbler (9 euro)

Software

  1. Python 3.11
  2. MediaPipe (hand keypoint detection)
  3. OpenCV
  4. Gradio
  5. PostgreSQL
  6. Docker
  7. XGBoost

Material

  1. 3 mm ABS sheet for the enclosure


Total Cost: 181 euros

Overview

FV2T4APMQIJAD4N.png
9ec3724361d84a06b5216c6ddae4b10c.jpg
iso.png
FZOLH6PMQIALBRN.png

Binary numbers are the foundation of modern computing, but many students struggle to visualize them.

Bit-Vision addresses this challenge by allowing users to represent binary digits with their fingers. A camera captures the hand gesture, an AI model interprets the finger positions, and the system instantly converts the gesture into a decimal number.
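The last step of that chain, turning recognized finger states into a number, is small enough to sketch. The snippet below assumes the recognizer yields one raised/lowered flag per finger and that the first flag is the most significant bit. The function name and the bit order are illustrative assumptions, not taken from the Bit-Vision source.

```python
def fingers_to_decimal(fingers_up):
    """Turn per-finger raised/lowered flags into (binary string, decimal value).

    Assumption: fingers_up[0] is the most significant bit.
    """
    if not fingers_up:
        raise ValueError("need at least one finger state")
    bits = "".join("1" if up else "0" for up in fingers_up)
    return bits, int(bits, 2)


# Four fingers: raised, raised, lowered, raised
print(fingers_to_decimal([True, True, False, True]))  # ('1101', 13)
```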

This creates an intuitive bridge between abstract binary concepts and physical interaction.

Features

✅ Real-time hand gesture recognition

✅ Binary to decimal conversion

✅ OLED display feedback

✅ RGB status indicators

✅ Push-button hardware control

✅ Wireless Gradio dashboard

✅ PostgreSQL detection history

✅ Raspberry Pi standalone operation

✅ Safe shutdown menu system

Understanding the Problem

ChatGPT Image Jun 18, 2026, 01_01_18 PM (2).png

Most students first encounter binary numbers through textbooks or lectures.

A binary number such as:

1101

must mentally be converted into:

13
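The mental step is a weighted sum: each digit is multiplied by a power of two that depends on its position. A small Python sketch (the helper name is made up for illustration) writes that sum out:

```python
def explain_binary(bits):
    """Return (decimal value, written-out positional expansion) for a bit string."""
    if not bits or any(b not in "01" for b in bits):
        raise ValueError("expected a non-empty string of 0s and 1s")
    n = len(bits)
    total = 0
    terms = []
    for i, b in enumerate(bits):
        weight = 2 ** (n - 1 - i)      # leftmost digit carries the largest weight
        total += int(b) * weight
        terms.append(f"{b}*{weight}")
    return total, " + ".join(terms)


value, steps = explain_binary("1101")
print(f"1101 = {steps} = {value}")  # 1101 = 1*8 + 1*4 + 0*2 + 1*1 = 13
```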

For beginners, this process often feels disconnected from real-world interaction.

I wanted to create a system where binary numbers could be learned visually and physically rather than memorized.

This led to a simple question:

What if a student could show a binary number using their fingers and have a computer recognize it instantly?

That question became the foundation of Bit-Vision.

Building the Dataset

dataset (1) (1).png

To train the model, I needed examples of every binary configuration.

A custom data collection tool was created to:

  1. Capture webcam frames
  2. Extract hand landmarks
  3. Label the gesture
  4. Save the feature vectors

Hundreds of samples were collected for each binary combination.
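The label-and-save steps could look roughly like the sketch below. It assumes each sample is stored as one CSV row, with the binary label first and the feature values after it. The file layout, column names, and function names are assumptions made for illustration; the actual collection tool may store its data differently.

```python
import csv
from pathlib import Path


def append_sample(csv_path, label, features):
    """Append one labelled feature vector, writing a header if the file is new."""
    path = Path(csv_path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["label"] + [f"f{i}" for i in range(len(features))])
        writer.writerow([label] + [f"{v:.6f}" for v in features])


def count_per_label(csv_path):
    """Count stored samples per label, to check that every gesture is covered."""
    counts = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["label"]] = counts.get(row["label"], 0) + 1
    return counts
```

Checking the per-label counts during collection makes it easy to spot binary combinations that still need more samples.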

The dataset included variations in:

  1. Hand orientation
  2. Distance from camera
  3. Lighting conditions
  4. User positioning

This helped improve robustness during training.

Choosing the AI Pipeline

Screenshot 2026-06-09 000111.png
Screenshot 2026-06-18 122342.png

Several approaches were considered:

  1. Image classification
  2. CNN-based recognition
  3. Object detection
  4. Hand landmark detection

After experimentation, MediaPipe hand tracking was selected because it provides 21 accurate hand landmarks in real time.

Instead of training directly on camera images, I extracted landmark coordinates and used them as machine learning features.

Initially, the model was trained only on the normalized coordinates, excluding the thumb. After further analysis the thumb was added back, because its coordinates and position also play a role in forming the gesture.

In the final version, 63 normalized coordinates (21 landmarks × 3 axes), 5 fingertip positions, and 5 finger angles together make up 73 features, which were then used according to their feature importance.

This dramatically reduced computation while improving consistency.
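To make the 63 + 5 + 5 = 73 breakdown concrete, here is a minimal sketch of one way such a vector could be built. The landmark indices follow MediaPipe's hand model (0 is the wrist; 4, 8, 12, 16, 20 are the fingertips). Everything else is an assumption for illustration: normalization is done relative to the wrist, "fingertip position" is taken as the tip's distance from the wrist, and each finger angle is measured at the joint after the finger base. The real feature definitions may differ.

```python
import math

TIPS = (4, 8, 12, 16, 20)                       # thumb .. pinky fingertips
# (base joint, next joint, tip) per finger, used for the angle feature
JOINTS = ((1, 2, 4), (5, 6, 8), (9, 10, 12), (13, 14, 16), (17, 18, 20))


def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def _norm(v):
    return math.sqrt(sum(c * c for c in v))


def _angle(a, b, c):
    """Angle in radians at point b, formed by the segments b-a and b-c."""
    v1, v2 = _sub(a, b), _sub(c, b)
    denom = _norm(v1) * _norm(v2)
    if denom == 0:
        return 0.0
    cos = sum(p * q for p, q in zip(v1, v2)) / denom
    return math.acos(max(-1.0, min(1.0, cos)))


def extract_features(landmarks):
    """Build a 73-value vector from 21 (x, y, z) landmarks."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    wrist = landmarks[0]
    centered = [_sub(p, wrist) for p in landmarks]
    scale = max(_norm(p) for p in centered) or 1.0   # avoid division by zero
    normalized = [tuple(c / scale for c in p) for p in centered]

    coords = [c for p in normalized for c in p]                    # 63 values
    tip_distances = [_norm(normalized[i]) for i in TIPS]           # 5 values
    angles = [_angle(normalized[a], normalized[b], normalized[c])  # 5 values
              for a, b, c in JOINTS]
    return coords + tip_distances + angles
```

Centering on the wrist and dividing by the largest distance makes the vector independent of where the hand sits in the frame and how far it is from the camera.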

The pipeline became:

Camera → MediaPipe → 21 Hand Landmarks → Feature Extraction → Machine Learning Model → Binary Prediction
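As a rough sketch, the stages can be wired together as plain callables, which keeps each one testable on its own. The detector below uses MediaPipe's legacy Solutions API (`mp.solutions.hands.Hands`) and assumes the installed MediaPipe version still ships it. The function names and the confidence threshold are illustrative assumptions, not the project's actual code.

```python
def run_pipeline(frame, detect, featurize, classify):
    """Frame -> landmarks -> features -> (binary string, decimal), or None."""
    landmarks = detect(frame)
    if landmarks is None:            # no hand found in this frame
        return None
    bits = classify(featurize(landmarks))
    return bits, int(bits, 2)


def make_mediapipe_detector():
    """Build a detect(frame_bgr) callable on top of MediaPipe Hands.

    Imports are local so the rest of the sketch runs without the libraries.
    """
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands(
        static_image_mode=True,
        max_num_hands=1,
        min_detection_confidence=0.5,   # assumed threshold
    )

    def detect(frame_bgr):
        # MediaPipe expects RGB input, OpenCV delivers BGR
        result = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
        if not result.multi_hand_landmarks:
            return None
        hand = result.multi_hand_landmarks[0]
        return [(lm.x, lm.y, lm.z) for lm in hand.landmark]

    return detect
```

With stub stages the flow can be exercised without a camera, for example `run_pipeline(None, lambda f: [(0, 0, 0)] * 21, lambda lm: [0.0] * 73, lambda x: "0101")` returns `("0101", 5)`.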

Training the Machine Learning Model

Screenshot 2026-06-18 115436.png
Screenshot 2026-06-18 115423.png
Screenshot 2026-06-18 115328.png

Several machine learning approaches were evaluated.

After testing multiple algorithms, XGBoost delivered the best balance between:

  1. Accuracy
  2. Speed
  3. Raspberry Pi performance

The model was trained on the extracted landmark features.

After training:

  1. The model was exported
  2. Stored locally
  3. Loaded directly on the Raspberry Pi

This allowed all inference to run completely offline.
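The train, export, and load cycle might look like the sketch below, which uses XGBoost's scikit-learn style `XGBClassifier`. The hyperparameters, the file name, and the helper names are placeholders chosen for illustration, not the values used in this project. It also assumes a recent XGBoost release in which a loaded JSON model can be used for `predict_proba` directly.

```python
def encode_labels(labels):
    """Map binary-string labels to contiguous class ids (0..k-1) for XGBoost."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


def train_and_export(features, labels, model_path="bitvision_xgb.json"):
    """Fit a classifier and save it; returns the class list needed for decoding."""
    import numpy as np
    from xgboost import XGBClassifier   # local import: only needed for training

    y, classes = encode_labels(labels)
    model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
    model.fit(np.asarray(features, dtype=float), np.asarray(y))
    model.save_model(model_path)
    return classes


def load_and_predict(feature_row, classes, model_path="bitvision_xgb.json"):
    """Load the saved model and return (binary string, confidence)."""
    import numpy as np
    from xgboost import XGBClassifier

    model = XGBClassifier()
    model.load_model(model_path)
    probabilities = model.predict_proba(np.asarray([feature_row], dtype=float))[0]
    best = int(np.argmax(probabilities))
    return classes[best], float(probabilities[best])
```

The class list returned by `train_and_export` has to be stored alongside the model file, since the model itself only knows numeric class ids.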

Database Integration

erd.png
Screenshot 2026-06-18 083033.png
Screenshot 2026-06-18 083024.png
Screenshot 2026-06-18 082956.png
Screenshot 2026-06-18 082942.png

Once detections were working, I wanted the system to maintain a history of interactions.

A PostgreSQL database was added with three tables: users, sessions, and detections.

The database ERD is shown in the image above.

Each detection stores:

  1. Binary value
  2. Decimal value
  3. Confidence score
  4. Detection timestamp
  5. Detection image

This transformed Bit-Vision from a simple detector into a data-driven learning platform.
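A possible shape for these tables is sketched below. Only the table names and the five detection fields come from the description above. Every other column, the key layout, and the helper class are assumptions and may not match the ERD. The SQL is kept in Python strings so it can be passed to a PostgreSQL driver such as psycopg2, whose `cursor.execute(sql, params)` accepts `%s` placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS users (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sessions (
    id         SERIAL PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    started_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS detections (
    id            SERIAL PRIMARY KEY,
    session_id    INTEGER NOT NULL REFERENCES sessions(id),
    binary_value  TEXT NOT NULL,
    decimal_value INTEGER NOT NULL,
    confidence    REAL NOT NULL,
    detected_at   TIMESTAMPTZ NOT NULL,
    image         BYTEA
);
"""

INSERT_SQL = (
    "INSERT INTO detections "
    "(session_id, binary_value, decimal_value, confidence, detected_at, image) "
    "VALUES (%s, %s, %s, %s, %s, %s)"
)


@dataclass
class Detection:
    session_id: int
    binary_value: str
    confidence: float
    image: bytes = b""

    def as_params(self, now=None):
        """Parameter tuple for INSERT_SQL; the decimal value is derived from the bits."""
        detected_at = now or datetime.now(timezone.utc)
        return (self.session_id, self.binary_value, int(self.binary_value, 2),
                self.confidence, detected_at, self.image)
```

Deriving the decimal value from the binary string at insert time keeps the two columns from drifting apart.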

Hardware Assembly

internal (1).png
WhatsApp Image 2026-06-15 at 03.35.14.jpeg
oledsample.png
WhatsApp Image 2026-06-15 at 04.32.45.jpeg

OLED Display

Displays:

  1. Startup screen
  2. Detection results
  3. IP address
  4. Shutdown menu

RGB LEDs

RGB 1

  1. Blue = System Booting
  2. Green = System Ready
  3. Red = System Shutdown

RGB 2

  1. Green = Ready
  2. Blue = Processing
  3. Green Flash = Successful Detection
  4. Red Flash = No Detection

Push Button

  1. Short Press → Run Detection
  2. Long Press → Open Menu
  3. Shutdown Confirmation

Menu Actions

  1. Show IP Address
  2. Shutdown Device


Assembly View

ChatGPT Image Jun 8, 2026, 02_57_43 AM.png

Creating the Web Dashboard

Screenshot 2026-06-18 114025.png
Screenshot 2026-06-18 114138.png
Screenshot 2026-06-18 114052.png
Screenshot 2026-06-18 114035.png
Screenshot 2026-06-18 114116.png

Although the OLED was useful, a larger interface was needed for analysis and demonstrations.

Gradio was selected because it allowed rapid creation of a browser-based dashboard.

The dashboard provides:

  1. Live camera feed
  2. Detection results
  3. Detection history
  4. Confidence metrics
  5. Database statistics
  6. System diagnostics

This made the project much easier to demonstrate and evaluate.

Making the System Wireless

Initially, development was performed through an Ethernet connection.

Later, the Raspberry Pi was configured to operate entirely over Wi-Fi.

The OLED displays the device IP address at startup.

Any device connected to the same network can access: http://<raspberry-pi-ip>:7862

No monitor, keyboard, or Ethernet cable is required.

Only power is needed.

This transformed Bit-Vision into a truly standalone embedded system.
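Two pieces make this work: finding the Pi's address so it can be shown on the OLED, and binding Gradio to all network interfaces on port 7862. The sketch below shows one common way to do both. The helper names and the placeholder interface (an image input with text output) are assumptions; the real dashboard is more elaborate.

```python
import socket

DASHBOARD_PORT = 7862   # port quoted in the text


def get_local_ip():
    """Best-effort LAN address of this machine; falls back to loopback."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects
        # the outgoing interface, whose address we then read back.
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def dashboard_url(ip, port=DASHBOARD_PORT):
    """URL that other devices on the same network can open."""
    return f"http://{ip}:{port}"


def launch_dashboard(predict_fn):
    """Serve a minimal Gradio interface on every network interface."""
    import gradio as gr   # local import: only needed on the device

    demo = gr.Interface(fn=predict_fn, inputs=gr.Image(), outputs="text")
    # "0.0.0.0" makes the server reachable from other devices, not just localhost
    demo.launch(server_name="0.0.0.0", server_port=DASHBOARD_PORT)
```

Without `server_name="0.0.0.0"`, Gradio listens on localhost only and the dashboard would not be reachable from a phone or laptop.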

Final System Architecture

Engineering System Architecture Diagram.png

Reliability Improvements

Several engineering challenges emerged during development.

Camera Latency

The webcam occasionally returned stale frames.

This was solved by:

  1. Reducing OpenCV buffer size
  2. Maintaining the latest frame cache
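Both fixes can be sketched together: ask OpenCV for a one-frame buffer, and run a background thread that keeps overwriting a single "latest frame" slot so detection never reads an old image. The class and function names are illustrative assumptions. Note that `CAP_PROP_BUFFERSIZE` is only honoured by some capture backends, which is why the cache is still useful.

```python
import threading
import time


def open_camera(index=0):
    """Open a webcam and request the smallest possible internal buffer."""
    import cv2   # local import so the cache class works without OpenCV

    capture = cv2.VideoCapture(index)
    capture.set(cv2.CAP_PROP_BUFFERSIZE, 1)   # ignored by some backends
    return capture


class LatestFrameReader:
    """Continuously drains a capture object and keeps only the newest frame."""

    def __init__(self, capture):
        self._capture = capture        # anything with read() -> (ok, frame)
        self._lock = threading.Lock()
        self._frame = None
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()
        return self

    def _loop(self):
        while self._running:
            ok, frame = self._capture.read()
            if not ok:
                time.sleep(0.01)       # back off briefly on a failed read
                continue
            with self._lock:
                self._frame = frame    # overwrite: older frames are dropped

    def latest(self):
        """Return the most recent frame, or None if nothing was read yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join(timeout=1.0)
```

On the device this would be used as `reader = LatestFrameReader(open_camera()).start()`, with `reader.latest()` called whenever the button triggers a detection.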

GPIO Integration

Hardware controls needed to work independently from the Gradio interface.

A dedicated GPIO handling system was implemented so that the following keep working even while the dashboard is open:

  1. Button presses
  2. OLED updates
  3. LED control

Safe Shutdown

Sudden power removal can corrupt storage.

A shutdown menu was added so the device is powered off in a controlled sequence:

  1. Open the menu
  2. Select shutdown
  3. Confirm shutdown
  4. Execute safe power-off

The OLED provides shutdown feedback before power is removed.
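The final power-off step could be implemented along these lines. The sketch assumes the standard `sudo shutdown -h now` command and takes the display and command runner as parameters so the logic can be tried without actually halting the machine. The function name and the messages are illustrative.

```python
import subprocess

SHUTDOWN_CMD = ["sudo", "shutdown", "-h", "now"]


def safe_shutdown(confirmed, show=print, run=subprocess.run):
    """Show feedback, then halt the system only if the user confirmed.

    show: callable that displays a message (for example an OLED writer)
    run:  callable that executes the command (swappable for testing)
    """
    if not confirmed:
        show("Shutdown cancelled")
        return False
    show("Shutting down...")       # feedback appears before power is removed
    run(SHUTDOWN_CMD, check=False)
    return True
```

Passing a stub for `run` gives a dry run: `safe_shutdown(True, run=lambda cmd, **kw: print(cmd))` prints the command instead of executing it.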

Conclusion

bitvision_hero_solo.png
bitmodel.png

Bit-Vision began as a simple idea for teaching binary numbers through hand gestures. Through multiple iterations involving computer vision, machine learning, embedded systems, databases, networking, and user interface design, it evolved into a complete standalone educational platform.

The project demonstrates how AI can be used not only for automation, but also as a tool for making fundamental computing concepts more engaging, interactive, and accessible.