AI Fretboard Trainer: Detect Guitar Notes With Computer Vision
by Joran_Thienpont in Circuits > Raspberry Pi
Hello, my name is Joran Thienpont, and I'm a first-year student at Howest Kortrijk, where I study CTAI (Creative Tech and Artificial Intelligence). For this project I combined artificial intelligence with my interest in music. Many beginning guitar players, including me, struggle to map a note on sheet music to a place on the fretboard and often don't know whether they played the correct note. My goal was to build a system that detects which string and fret a player is pressing and gives feedback on the detected note.
The AI fretboard trainer runs on a Raspberry Pi. It uses a camera and a custom-trained YOLOv26 model to detect finger positions on the fretboard of the guitar. To determine the fret, the player first goes through a short calibration process in which reference points are recorded, and from these the software can work out which fret is being played. The system can either display the note that is played or generate an exercise for the user. Notes are displayed on the OLED and LEDs give feedback.
So without further ado, let me guide you through the process from start to finish.
Supplies
Electronics:
- Raspberry Pi 5
- USB Camera
- OLED Display (I2C)
- Rocker Switch
- 4 LEDs (1 green, 2 yellow, 1 red)
- Resistors
- Breadboard
- Jumper Wires
Development tools:
- Roboflow: For annotating pictures
- MakerCase: To get a premade case design
- Inkscape: To make adjustments to the case
- Visual Studio Code
Prepare the Raspberry Pi
Install Raspberry Pi OS
Download and install the Raspberry Pi Imager. Select Raspberry Pi OS (64-bit), choose your microSD card and open Advanced Options to configure the Wi-Fi. Insert the card, connect the camera and power, and complete the initial setup.
Once booted, enable the interfaces you need:
Open the configuration tool (sudo raspi-config), go to Interface Options and enable SSH, I²C, and SPI.
- SSH is needed for remote development
- I²C for the OLED display
- SPI for the RFID reader
To connect VS Code:
- Install the Remote-SSH extension.
- Select Remote-SSH: Connect to Host.
- Enter your username and the Pi's IP address in the form username@pi-ip-address.
Once connected you can edit and run code directly on the Pi from your laptop.
Set up a virtual environment so the dependencies are isolated:
Creating the Dataset
I created my own dataset for this project because every guitar setup is different.
To minimize the number of pictures I had to take and to get a more reliable model, I used a camera mount clamped to the headstock of the guitar, as you can see in the picture above. This allowed me to take pictures from roughly the same angle every time.
Each guitar string is also wrapped in a different color, which makes it easier for the model to tell the strings apart.
When taking the images, I alternated between three fingers, each in three different positions, for every fret on every string.
Annotating Images
After collecting all the images, I uploaded them to Roboflow and manually annotated every image.
My classes were:
- string1_pressed
- string2_pressed
- string3_pressed
- string4_pressed
- string5_pressed
- string6_pressed
- nut
The nut class was added so fret positions can be calculated more easily, but more on this later.
If you want a very reliable model, take pictures in different lighting conditions, with different hand and finger positions, and move the camera arm now and then. I took pictures up to and including the 12th fret, because the further down the neck you go, the harder it becomes to detect the string and fret. I ended up with 1649 images, which gave me a very reliable model.
Training the Model
After labeling the dataset, you can train the model in a Jupyter notebook with the code snippet Roboflow gives you after labeling. I used YOLOv26 for training, which is made to run relatively fast on the Raspberry Pi, especially the nano and small versions. I strongly recommend training the model on a computer with a good GPU.
Here is an example of how to train a model in a notebook:
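The original notebook cell is not reproduced here, so the following is only a minimal sketch of what such a cell could look like with the Ultralytics Python package. The function name, the data.yaml path, and the 640 px image size are assumptions, and base_weights has to be the pretrained checkpoint name for the variant you want, as listed in the Ultralytics documentation.

```python
def train_fretboard_model(data_yaml: str, base_weights: str,
                          epochs: int = 50, batch: int = 8, imgsz: int = 640):
    """Fine-tune a pretrained YOLO checkpoint on the annotated fretboard dataset.

    data_yaml    -- path to the data.yaml file of the exported Roboflow dataset
    base_weights -- pretrained checkpoint to start from (hypothetical, user supplied)
    imgsz        -- training image size; 640 is an assumption, not the author's setting
    """
    # Imported inside the function so the cell can be defined
    # even before the ultralytics package is installed.
    from ultralytics import YOLO

    model = YOLO(base_weights)  # load the pretrained weights
    results = model.train(data=data_yaml, epochs=epochs, batch=batch, imgsz=imgsz)
    metrics = model.val()       # evaluate on the validation split
    return results, metrics
```

The defaults mirror the 50 epochs and batch size 8 mentioned in this step. Changing them per run is an easy way to compare training settings.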
I experimented with:
- YOLOv26 nano
- YOLOv26 small
- different numbers of epochs
- different batch sizes
- different augmentation settings
To judge whether a model was good, I looked closely at the confusion matrix and the recall, precision and F1 score graphs. It is important to try different training settings for your model, and if the results are still not good, consider adding more pictures to your dataset.
The best performing model was:
- YOLOv26 small
- 50 epochs
- Batch size 8
Which gave the following results:
- Precision: 0.9823
- Recall: 0.9865
- mAP50: 0.9908
- mAP50-95: 0.6280
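As a small worked example, the F1 score at this operating point can be derived from the precision and recall above, since F1 is their harmonic mean. This is a sketch for a single pair of values. The F1 graph produced during training shows the score across confidence thresholds, which this does not reproduce.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the best model, as reported in this step.
f1 = f1_score(0.9823317700874202, 0.986463921695271)
print(f"F1: {f1:.4f}")
```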
How the System Determines the Fret
Detecting the string that is played gets us halfway to detecting the exact note. The fret is determined in software with some not-so-fun math.
Calibration
At startup, the user holds a finger on six known positions (frets 3, 5 and 12 on both the 1st and 6th string) for 3 seconds. The nut is detected automatically by the model and gives us another 2 reference points. This gives us 8 points around which we can base the grid.
Filling in missing frets
Only frets 0, 3, 5 and 12 are measured directly. The rest are estimated using this formula:
- P = A + t * (B - A)
Here A and B are the positions of the two neighboring calibration frets on either side, and t is a precalculated ratio from the equal temperament formula, which describes how the spacing between frets gets smaller and smaller as you go further down the neck.
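The interpolation above can be sketched as follows. This is a minimal illustration, not the code from the repository. It assumes the standard equal temperament rule that fret n sits at a fraction 1 - 2^(-n/12) of the scale length from the nut, and it ignores any perspective distortion in the camera image. All function names and the example pixel coordinates are made up.

```python
def distance_from_nut(fret: int) -> float:
    """Distance from the nut to a fret, as a fraction of the scale length."""
    return 1.0 - 2.0 ** (-fret / 12.0)


def fret_ratio(fret: int, lower: int, upper: int) -> float:
    """Ratio t: how far `fret` lies between calibration frets `lower` and `upper`."""
    d_lo, d_hi = distance_from_nut(lower), distance_from_nut(upper)
    return (distance_from_nut(fret) - d_lo) / (d_hi - d_lo)


def fill_frets(calibrated: dict[int, tuple[float, float]]) -> dict[int, tuple[float, float]]:
    """Estimate the missing frets on one string with P = A + t * (B - A)."""
    known = sorted(calibrated)
    filled = dict(calibrated)
    for lower, upper in zip(known, known[1:]):
        (ax, ay), (bx, by) = calibrated[lower], calibrated[upper]
        for fret in range(lower + 1, upper):
            t = fret_ratio(fret, lower, upper)
            filled[fret] = (ax + t * (bx - ax), ay + t * (by - ay))
    return filled


# Made-up pixel coordinates for the calibrated frets 0, 3, 5 and 12 on one string.
example = {0: (40.0, 200.0), 3: (180.0, 204.0), 5: (255.0, 206.0), 12: (450.0, 212.0)}
for fret, (x, y) in sorted(fill_frets(example).items()):
    print(fret, round(x, 1), round(y, 1))
```

Because t only depends on the fret numbers, the ratios can be computed once and reused for every string and every calibration.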
Finding the pressed fret
To find the fret, an imaginary line is drawn from the nut to the 12th fret. Every fret position and the finger position are then measured as distances along that line. The fret closest to the finger wins.
Note: see RPi/models/fret_detector.py for how this is done in code.
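For readers who want the idea without opening the repository, the projection step could look roughly like this. It is a simplified sketch under the assumption that fret and finger positions are 2D pixel coordinates. The function names and coordinates are hypothetical and are not taken from fret_detector.py.

```python
import math


def project_onto_line(point, start, end) -> float:
    """Distance of `point` along the line start -> end, measured in pixels from start."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("start and end must be different points")
    # Dot product with the unit vector of the line.
    return ((point[0] - start[0]) * dx + (point[1] - start[1]) * dy) / length


def nearest_fret(finger, fret_points: dict) -> int:
    """Return the fret whose projection is closest to the finger's projection.

    `fret_points` maps fret numbers to (x, y) and must contain frets 0 and 12.
    """
    start, end = fret_points[0], fret_points[12]
    finger_d = project_onto_line(finger, start, end)
    return min(
        fret_points,
        key=lambda fret: abs(project_onto_line(fret_points[fret], start, end) - finger_d),
    )


# Made-up fret positions along one string and a detected finger position.
frets = {0: (0, 0), 1: (56, 0), 2: (109, 0), 3: (159, 0), 12: (500, 0)}
print(nearest_fret((150, 8), frets))
```

Reducing everything to one distance along the line means a finger that sits slightly above or below the string still maps to the right fret.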
Wiring and Testing Hardware
Once the AI model was working and gave good predictions, I started building the hardware.
All the components were wired on a breadboard and connected to the Raspberry Pi.
The components for my project are:
- OLED display
- Toggle switch
- Rocker switch
- LEDs
Important note: If you buy the same OLED display I did from AliExpress, the display may be physically labelled as SSD1309 but actually require the SH1106 driver. Use the luma.oled library with sh1106.
You can find the code and test files for all hardware components in RPi/hardware and RPi/tests.
OLED Display
The OLED display is the main feedback screen and our most important component.
To control the OLED I used the luma library, which made displaying the stave and note easy.
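As a rough starting point, drawing text with luma.oled and the sh1106 driver can look like the sketch below. The I2C port 1 and address 0x3C are assumptions (common defaults for these modules), so check your own wiring. The function name is hypothetical, and the stave drawing of the real project is not shown.

```python
def show_text_on_oled(text: str, i2c_port: int = 1, i2c_address: int = 0x3C) -> None:
    """Draw a line of text in the top-left corner of an SH1106 OLED."""
    # Imported inside the function because luma.oled is only needed on the Pi.
    from luma.core.interface.serial import i2c
    from luma.core.render import canvas
    from luma.oled.device import sh1106

    serial = i2c(port=i2c_port, address=i2c_address)  # assumed bus and address
    device = sh1106(serial)

    # Everything drawn inside the canvas block is flushed to the display on exit.
    with canvas(device) as draw:
        draw.text((0, 0), text, fill="white")
```

Calling show_text_on_oled("E4") on the Pi is a quick way to confirm that the display responds to the sh1106 driver before building the full note rendering.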
The OLED has two modes that the user can switch between with the toggle switch:
Free play mode: The detected string and fret are looked up in a dictionary that maps every string and its 12 frets to a note name and its position on a musical staff. The note is drawn on the OLED in sheet music notation in real time.
Practice mode: When an RFID tag is scanned, a session starts. The system gives the player a note and a string to play it on. After the player presses a string, the system compares the detected note to the target note and shows correct or wrong on the screen.
There is also a song mode that can be started from the Gradio web interface. The player can practice Twinkle Twinkle Little Star.
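The lookup and comparison used by the modes above can be illustrated with a small sketch. It assumes standard tuning (E A D G B E) and only produces note names. The staff positions from the real dictionary are left out, and all names here are hypothetical. The octave numbers are sounding pitches in scientific pitch notation, while guitar music is usually written an octave higher.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# MIDI numbers of the open strings in standard tuning (string 6 is the low E).
OPEN_STRING_MIDI = {6: 40, 5: 45, 4: 50, 3: 55, 2: 59, 1: 64}


def note_name(string: int, fret: int) -> str:
    """Name of the note on a given string and fret, for example 'E2'."""
    midi = OPEN_STRING_MIDI[string] + fret
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Dictionary covering the open string and frets 1 to 12 on every string.
NOTE_TABLE = {(s, f): note_name(s, f) for s in OPEN_STRING_MIDI for f in range(13)}


def check_attempt(target_note: str, target_string: int,
                  detected_string: int, detected_fret: int) -> bool:
    """Practice mode check: right string and right note."""
    detected_note = NOTE_TABLE.get((detected_string, detected_fret))
    return detected_string == target_string and detected_note == target_note


print(NOTE_TABLE[(5, 3)], check_attempt("C3", 5, 5, 3))
```

Building the table once at startup keeps the per-frame work down to a single dictionary lookup.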
Optimizing Performance
Running AI models directly on a Raspberry Pi can be challenging.
Initially, I used the original PyTorch model, but performance was limited to approximately one frame per second for the small model.
To improve performance, I converted the model to the NCNN format:
After conversion, the frame rate increased substantially and the application became much more responsive.
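The conversion itself is a short step with the Ultralytics package. The following is a minimal sketch, assuming the trained weights are saved as a .pt file. The function names and paths are hypothetical.

```python
def export_to_ncnn(weights_path: str) -> str:
    """Export trained PyTorch weights to NCNN and return the exported model path."""
    # Imported inside the function so it can be defined without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)  # for example the best.pt file from training
    return model.export(format="ncnn")


def load_ncnn_model(exported_dir: str):
    """Load the exported NCNN model for inference on the Pi."""
    from ultralytics import YOLO

    # The task is passed explicitly because exported models do not carry it reliably.
    return YOLO(exported_dir, task="detect")
```

The loaded NCNN model is then called on frames in the same way as the original PyTorch model, so the rest of the detection code does not need to change.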
Gradio Interface
The Gradio interface runs on the Pi and is accessible from any browser on the same network at http://your-pi-ip:7860. It has four pages: About, Live Data, Operating, and Debug.
The real-time detection loop is driven by gr.Timer, which fires on every tick, runs inference, updates the LEDs and OLED, and pushes the annotated camera frame back to the browser. The tick rate is adjustable via a target FPS slider.
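A stripped-down version of such a timer-driven loop might look like the sketch below. It is not the project's actual interface. The callback run_detection_step is a hypothetical function that grabs a frame, runs inference, updates the hardware, and returns the annotated image. The slider range and default of 5 FPS are assumptions.

```python
def tick_interval(target_fps: float) -> float:
    """Seconds between timer ticks for a target frame rate (at least 1 FPS)."""
    return 1.0 / max(float(target_fps), 1.0)


def build_demo(run_detection_step):
    """Build a Gradio page whose timer calls `run_detection_step` on every tick."""
    # Imported inside the function so the helper above works without gradio installed.
    import gradio as gr

    with gr.Blocks() as demo:
        frame = gr.Image(label="Annotated camera frame")
        fps = gr.Slider(minimum=1, maximum=15, value=5, step=1, label="Target FPS")
        timer = gr.Timer(value=tick_interval(5))

        # Each tick runs one detection step and pushes the result to the browser.
        timer.tick(fn=run_detection_step, outputs=frame)
        # Moving the slider replaces the timer interval.
        fps.change(fn=lambda f: gr.Timer(value=tick_interval(f)), inputs=fps, outputs=timer)
    return demo
```

On the Pi, something like build_demo(my_step).launch(server_name="0.0.0.0", server_port=7860) would make the page reachable from other devices on the network.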
The Operating page lets you start song mode, which triggers the OLED to step through the note sequence. The Debug page lets you check hardware state and test the OLED and LEDs. There is also a shutdown button to power off the Pi cleanly from the browser.
Set Up the Database and API
All data from practice mode is stored, such as which notes the user had to play and whether they played them correctly. Everything is stored in a PostgreSQL database running in Docker.
Database tables:
users(user_id, name, rfid_tag)
sessions(session_id, user_id, start_time, end_time, duration)
attempts(attempt_id, session_id, expected_note, detected_note, string_detected, fret_detected, correct, confidence)
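To make the relationships between these tables concrete, here is a small sketch that recreates them with Python's built-in sqlite3 module. SQLite is used only as a stand-in so the example runs without a PostgreSQL server. The column types, constraints, and sample rows are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    rfid_tag TEXT UNIQUE
);
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    start_time TEXT,
    end_time   TEXT,
    duration   REAL
);
CREATE TABLE attempts (
    attempt_id      INTEGER PRIMARY KEY,
    session_id      INTEGER NOT NULL REFERENCES sessions(session_id),
    expected_note   TEXT,
    detected_note   TEXT,
    string_detected INTEGER,
    fret_detected   INTEGER,
    correct         INTEGER,
    confidence      REAL
);
"""


def create_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def session_accuracy(conn: sqlite3.Connection, session_id: int):
    """Share of correct attempts in a session, or None if there are no attempts."""
    row = conn.execute(
        "SELECT AVG(correct) FROM attempts WHERE session_id = ?", (session_id,)
    ).fetchone()
    return row[0]


# Made-up sample data: one user, one session, two attempts (one correct).
conn = create_demo_db()
conn.execute("INSERT INTO users VALUES (1, 'demo', 'TAG-0001')")
conn.execute("INSERT INTO sessions VALUES (1, 1, '2025-01-01T10:00', '2025-01-01T10:05', 300)")
conn.executemany(
    "INSERT INTO attempts VALUES (?, 1, ?, ?, ?, ?, ?, ?)",
    [(1, "C3", "C3", 5, 3, 1, 0.93), (2, "E2", "F2", 6, 1, 0, 0.88)],
)
print(session_accuracy(conn, 1))
```

Each attempt points to a session and each session to a user, so statistics per user or per session reduce to simple joins and aggregates.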
The FastAPI backend follows a models/repositories/routers structure. All database interaction happens through repository classes, and the routers expose the REST endpoints. You can view the auto-generated API documentation at http://your-pi-ip:8000/docs.
Designing the Enclosure
After the software and hardware were working correctly, I designed the enclosure that holds everything together.
I used MakerCase to get the basic box design and selected the sliding-lid box for easy access to the hardware and electronics. MakerCase saved a lot of time because you can just enter the dimensions you need.
Then I imported the .svg file I got from MakerCase into Inkscape for further customisations. I added holes for the OLED, LEDs, toggle switch and rocker switch.
Building the Case
Once the design was finished, the panels were laser-cut. I ended up using 3 mm ABS, which is a strong type of plastic. I assembled the box with two-component epoxy glue, which works very well for this material. I added extra holes for the power and camera cables with a multitool.
Then I added the electronics to the case. I mainly used glue and tape to keep everything in place.
Final Software Setup and Autostart
To make everything start automatically when the power is plugged in, you have to add a systemd service file.
This service starts app.py, which is where everything is linked together.
GitHub and Code
All code is available on GitHub. Feel free to do with it as you like.