AI Fretboard Trainer: Detect Guitar Notes With Computer Vision

by Joran_Thienpont in Circuits > Raspberry Pi

the image.png

Hello, my name is Joran Thienpont and I'm a first-year student at Howest Kortrijk, where I study CTAI (Creative Tech and Artificial Intelligence). For this project I combined artificial intelligence with my interest in music. Many beginning guitar players, including me, struggle to map a note on sheet music to a place on the fretboard and often don't know whether they played the correct note. My goal was to make a system that can detect which string and fret a player is pressing and give feedback on the detected note.


The AI fretboard trainer is powered by a Raspberry Pi. It uses a camera and a custom-trained YOLOv26 model to detect finger positions on the fretboard of the guitar. To determine the fret, the player first goes through a short calibration process in which reference points are recorded; from these the software can work out which fret is being played. The system can either display the note that is played or generate an exercise for the user. Notes are displayed on the OLED and the LEDs give feedback.


So without further ado let me guide you through the process from start to finish.

Supplies

Electronics:


  1. Raspberry Pi 5
  2. USB Camera
  3. OLED Display (I2C)
  4. Rocker Switch
  5. 4 LEDs (green, 2 yellow, red)
  6. Resistors
  7. Breadboard
  8. Jumper Wires
  9. Toggle Switch
  10. RFID Reader (SPI) and tags


Tools for developing:

  1. Roboflow: For annotating pictures
  2. MakerCase: To generate a pre-made case design
  3. Inkscape: To make adjustments to the case
  4. Visual Studio Code: For editing and running code on the Pi

Prepare the Raspberry Pi

Gemini_Generated_Image_hpn0nbhpn0nbhpn0.png

Install Raspberry Pi OS


Download and install the Raspberry Pi Imager. Select Raspberry Pi OS (64-bit), choose your microSD card and open Advanced Options to configure the Wi-Fi. After writing the image, insert the card into the Pi, connect the camera and power, and complete the initial setup.

Once booted, enable the interfaces you need:


sudo raspi-config

Go to Interface Options and enable SSH, I²C, and SPI.

  1. SSH is needed for remote development
  2. I²C for the OLED display
  3. SPI for the RFID reader

To connect VS Code:

  1. Install the Remote-SSH extension.
  2. Select Remote-SSH: Connect to Host.
  3. Enter your username and the Pi's IP address:


joran@192.168.1.120

Once connected you can edit and run code directly on the Pi from your laptop.

Set up a virtual environment so the dependencies are isolated:


python3 -m venv ~/.venv
source ~/.venv/bin/activate


Creating the Dataset

camera_mount.jpg
strings.jpg

I created my own dataset for this project because every guitar setup is different.


To minimize the number of pictures I had to take and to get a more reliable model, I used a camera mount clamp attached to the headstock of the guitar, as you can see in the picture above. This allowed me to take pictures from roughly the same angle every time.


Each guitar string is also wrapped in a different color. This makes it easier for the model to distinguish between the strings.


For every fret on every string, I took pictures with three different fingers, each in three different positions.


Annotating Images

roboflow_annotate.png
robo annov2.png

After collecting all the images, I uploaded them to Roboflow and manually annotated every image.


My classes were:

  1. string1_pressed
  2. string2_pressed
  3. string3_pressed
  4. string4_pressed
  5. string5_pressed
  6. string6_pressed
  7. nut


The nut class was added so fret positions can be calculated more easily, but more on this later.
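At inference time the class names have to be turned back into string numbers. The following is a minimal sketch, assuming the class names are exactly the ones listed above; the helper name `parse_class` is hypothetical and not taken from the project code.

```python
import re
from typing import Optional

# Class names as listed above; the order in the real data.yaml may differ.
CLASS_NAMES = [f"string{n}_pressed" for n in range(1, 7)] + ["nut"]

_STRING_RE = re.compile(r"^string([1-6])_pressed$")


def parse_class(name: str) -> Optional[int]:
    """Return the string number (1-6) for a 'stringN_pressed' class.

    Returns None for the 'nut' class, which is a reference point rather
    than a pressed string. Raises ValueError for unknown class names.
    """
    if name == "nut":
        return None
    match = _STRING_RE.match(name)
    if match is None:
        raise ValueError(f"unknown class name: {name!r}")
    return int(match.group(1))
```

For example, `parse_class("string3_pressed")` gives the string number 3, while `parse_class("nut")` gives `None` so the caller can treat it as a calibration reference.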


If you want a very reliable model, take pictures in different lighting conditions, with different hand and finger positions, and move the camera arm now and then. I took pictures up to and including the 12th fret, because the further down the neck you go, the harder it becomes to detect the string and fret. I ended up with 1649 images, which gave me a very reliable model.

Training the Model

confusion_matrix.png
val_batch1_pred.jpg

After labeling the dataset, you can train the model in a Jupyter notebook using the code snippet Roboflow provides once labeling is done. I used YOLOv26 for training, which is made to run relatively fast on devices like the Raspberry Pi, especially the nano and small versions. I strongly recommend training the model on a computer with a good GPU.


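Here is an example of how to train a model in a notebook:

from ultralytics import YOLO


def main():
    # Start from the pretrained small checkpoint.
    model = YOLO(model="yolo26s.pt")
    # Train on the dataset exported from Roboflow.
    model.train(data="./Guitar-Fretboard-Trainer-11/data.yaml", epochs=50, imgsz=640, verbose=True, batch=8)
    # Evaluate on the validation split, then export the trained weights.
    model.val()
    model.export()


# The guard matters when data-loader workers are started as separate processes.
if __name__ == '__main__':
    main()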


I experimented with:

  1. YOLOv26 nano
  2. YOLOv26 small
  3. different numbers of epochs
  4. different batch sizes
  5. different augmentation settings


To decide whether a model was good, I looked closely at the confusion matrix and the recall, precision and F1 score graphs. It is important to try different training settings for your model; if the results are still not good, consider adding more pictures to your dataset.


The best performing model was:

  1. YOLOv26 small
  2. 50 epochs
  3. Batch size 8

Which gave the following results:

  1. Precision: 0.9823
  2. Recall: 0.9865
  3. mAP50: 0.9908
  4. mAP50-95: 0.6280
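The F1 score mentioned earlier is not listed, but it follows directly from precision and recall as their harmonic mean. A small sketch to compute it from the reported values (the function name is my own):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Validation metrics reported above, rounded to four decimals.
precision = 0.9823
recall = 0.9865

print(f"F1 = {f1_score(precision, recall):.4f}")
```

With these values the F1 score comes out at roughly 0.984, in line with the high precision and recall.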




How the System Determines the Fret

code_math.png

Detecting the string that is played gets us halfway to detecting the exact note. The exact fret is determined in software with some not-so-fun math.


Calibration

At startup, the user holds a finger on six known positions (frets 3, 5 and 12 on both the 1st and 6th string) for 3 seconds each. The nut is detected automatically by the model and gives us another 2 reference points. This gives us 8 points on which to base the grid.
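Detections jitter a little from frame to frame, so one plausible way to turn a 3-second hold into a single reference point is to average the centers of all boxes seen during the hold. This is a sketch under that assumption; the project code may do it differently (for example with a median), and the function names and the `(x1, y1, x2, y2)` box format are my own choices.

```python
from statistics import fmean
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    """Center of a bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def average_reference_point(boxes: Iterable[Box]) -> Point:
    """Average the box centers collected while the finger was held still."""
    centers = [box_center(b) for b in boxes]
    if not centers:
        raise ValueError("no detections collected during the hold")
    return (fmean(c[0] for c in centers), fmean(c[1] for c in centers))
```

Calling `average_reference_point` once per calibration position gives the six finger reference points; the two nut points can be collected the same way from the nut detections.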


Filling in missing frets

Only frets 0, 3, 5 and 12 are measured directly; the rest are estimated using this formula:

  1. P = A + t * (B - A)

Where A and B are the positions of the two neighboring calibration frets on either side, P is the estimated position of the fret in between, and t is a pre-calculated ratio from the equal temperament formula, which describes how the spacing between frets gets smaller and smaller as you go further down the neck.
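To make the ratio t concrete: in equal temperament, fret n sits at a fraction 1 - 2^(-n/12) of the scale length from the nut, so t can be taken as the position of fret n relative to its two calibration neighbors. The sketch below assumes this is how t is derived; the function names are hypothetical and the real implementation lives in the project repository. It also treats the image as undistorted, which is only an approximation for a camera looking down the neck.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def fret_fraction(n: int) -> float:
    """Distance from the nut to fret n as a fraction of the scale length."""
    return 1.0 - 2.0 ** (-n / 12.0)


def interpolation_ratio(n: int, a: int, b: int) -> float:
    """Ratio t that places fret n between calibration frets a < n < b."""
    return (fret_fraction(n) - fret_fraction(a)) / (fret_fraction(b) - fret_fraction(a))


def fill_missing_frets(measured: Dict[int, Point]) -> Dict[int, Point]:
    """Estimate every fret between the measured ones with P = A + t * (B - A)."""
    frets = sorted(measured)
    result = dict(measured)
    for a, b in zip(frets, frets[1:]):
        ax, ay = measured[a]
        bx, by = measured[b]
        for n in range(a + 1, b):
            t = interpolation_ratio(n, a, b)
            result[n] = (ax + t * (bx - ax), ay + t * (by - ay))
    return result
```

Given points for frets 0, 3, 5 and 12 on one string, `fill_missing_frets` returns points for all frets 0 to 12 on that string. Note that fret 12 lands exactly halfway along the scale length, which is why it makes a convenient calibration point.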


Finding the pressed fret

To find the fret, an imaginary line is drawn from the nut to the 12th fret. Every fret position and the finger position are then measured as distances along that line. The fret closest to the finger wins.
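Measuring a distance along a line comes down to a dot product with the line's direction. A minimal sketch of that idea, assuming the fret positions for one string are available as a dictionary from fret number to pixel coordinates (the names are hypothetical; see the repository for the actual code):

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def project_onto_line(p: Point, start: Point, end: Point) -> float:
    """Signed distance of p along the line start -> end, in pixels."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = (dx * dx + dy * dy) ** 0.5
    if length == 0:
        raise ValueError("start and end must be different points")
    return ((p[0] - start[0]) * dx + (p[1] - start[1]) * dy) / length


def nearest_fret(finger: Point, fret_points: Dict[int, Point]) -> int:
    """Return the fret whose projected position is closest to the finger."""
    nut, twelfth = fret_points[0], fret_points[12]
    finger_d = project_onto_line(finger, nut, twelfth)
    return min(
        fret_points,
        key=lambda n: abs(project_onto_line(fret_points[n], nut, twelfth) - finger_d),
    )
```

Because only the distance along the line is compared, a finger that sits slightly above or below the string still maps to the right fret.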


Note: you can go to RPi/models/fret_detector.py to see how this is done in code.

Wiring and Testing Hardware

breadboardv1.jpg
breadboardV2.jpg

Once the AI model was working and gave good predictions, I started building the hardware.


All the components were wired on a breadboard and connected to the Raspberry Pi.


The components for my project are:


  1. OLED display
  2. Toggle switch
  3. Rocker switch
  4. LEDs


Important note: If you buy the same OLED display I did from AliExpress, the display may be physically labelled as SSD1309 but actually require the SH1106 driver. Use the luma.oled library with sh1106.


You can find the code and test files for all hardware components in RPi/hardware and RPi/tests.

OLED Display

oled_in_boxv2.JPG
notes_fret.jpg

The OLED display is the main feedback screen and our most important component.


To control the OLED I used the luma.oled library, which made displaying the stave and note easy.
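As an illustration of what drawing a stave with luma.oled can look like, here is a rough sketch. It assumes a 128x64 display, the SH1106 driver and I2C address 0x3C (a common default, so check your own module); the layout numbers and function names are made up for this example and are not the project's actual drawing code.

```python
def stave_line_ys(top: int = 16, spacing: int = 8) -> list:
    """Y coordinates of the five stave lines, top line first."""
    return [top + i * spacing for i in range(5)]


def note_center_y(step: int, top: int = 16, spacing: int = 8) -> float:
    """Y of a note head; step 0 is the bottom line, each step is half a space up."""
    bottom = top + 4 * spacing
    return bottom - step * spacing / 2


def draw_stave_and_note(draw, step: int, width: int = 128) -> None:
    """Draw a stave and one note head on a PIL ImageDraw-like object."""
    for y in stave_line_ys():
        draw.line((0, y, width - 1, y), fill="white")
    cy = note_center_y(step)
    cx, r = width // 2, 3
    draw.ellipse((cx - r, cy - r, cx + r, cy + r), fill="white")


def show_on_oled(step: int) -> None:
    """Render on real hardware; needs luma.oled and an SH1106 on I2C."""
    # Imported lazily so the helpers above work without the hardware stack.
    from luma.core.interface.serial import i2c
    from luma.core.render import canvas
    from luma.oled.device import sh1106

    device = sh1106(i2c(port=1, address=0x3C))  # address is an assumption
    with canvas(device) as draw:
        draw_stave_and_note(draw, step, width=device.width)
```

Keeping the geometry in plain functions makes it possible to check the layout on a laptop before running `show_on_oled` on the Pi. Ledger lines, clefs and accidentals are left out to keep the sketch short.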


The OLED has two modes that the user can switch between with the toggle switch:


Free play mode: The detected string and fret are looked up in a dictionary that maps every string and its 12 frets to a note name and its position on a musical staff. The note is drawn on the OLED in sheet music notation in real time.
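Such a lookup table does not have to be typed by hand. The sketch below builds one from the open-string pitches, assuming standard tuning (E A D G B E) and that string 1 is the high E string; it only covers note names, not the staff positions that the project's dictionary also stores, and the names used here are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# MIDI numbers of the open strings in standard tuning (assumption).
# String 1 is the high E string, string 6 the low E string.
OPEN_STRING_MIDI = {1: 64, 2: 59, 3: 55, 4: 50, 5: 45, 6: 40}


def note_name(string: int, fret: int) -> str:
    """Sounding note for a string/fret pair, e.g. (5, 3) -> 'C3'."""
    if string not in OPEN_STRING_MIDI:
        raise ValueError(f"string must be 1-6, got {string}")
    if not 0 <= fret <= 12:
        raise ValueError(f"fret must be 0-12, got {fret}")
    midi = OPEN_STRING_MIDI[string] + fret
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


# Every string with its open note (fret 0) and frets 1-12.
NOTE_TABLE = {(s, f): note_name(s, f) for s in OPEN_STRING_MIDI for f in range(13)}
```

These are sounding pitches. Guitar music is conventionally written one octave higher than it sounds, so a staff-position lookup would need to take that into account.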


Practice mode: When an RFID tag is scanned, a session starts. The system gives the player a note and a string to play it on. After the player presses a string, the system compares the detected note to the target note and shows correct or wrong on the screen.
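The core of practice mode is a pick-then-compare loop. A minimal sketch, assuming the notes are available as a dictionary from (string, fret) to note name; the tiny table and all names below are illustrative only.

```python
import random
from dataclasses import dataclass


@dataclass
class Attempt:
    expected_note: str
    detected_note: str
    correct: bool


def pick_target(note_table: dict, rng: random.Random) -> tuple:
    """Pick a random (string, fret) target and return it with its note name."""
    string, fret = rng.choice(sorted(note_table))
    return string, fret, note_table[(string, fret)]


def check_attempt(expected_note: str, detected_note: str) -> Attempt:
    """Compare the detected note with the target note."""
    return Attempt(expected_note, detected_note, expected_note == detected_note)


# Tiny example table; the real one covers all six strings.
example_table = {(6, 0): "E2", (6, 1): "F2", (5, 3): "C3"}
string, fret, target = pick_target(example_table, random.Random())
result = check_attempt(target, detected_note="C3")
print(f"Play {target} on string {string}: {'correct' if result.correct else 'wrong'}")
```

Passing in a `random.Random` instance keeps the exercise generator easy to test, since a seeded instance always produces the same sequence of targets.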


There is also a song mode that can be started from the Gradio web interface. The player can practice Twinkle Twinkle Little Star.


Optimizing Performance

ncnn.png

Running AI models directly on a Raspberry Pi can be challenging.

Initially, I used the original PyTorch model, but performance was limited to approximately one frame per second for the small model.

To improve performance, the model was converted to NCNN format:

yolo export model=best_model.pt format=ncnn

After conversion, the frame rate increased considerably and the application became much more responsive.
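If you want to measure the difference on your own Pi, a simple timing loop is enough. This is a sketch, not the benchmark I ran: `measure_fps` and `compare_models` are hypothetical helpers, and the folder name `best_model_ncnn_model` assumes the export command writes its output next to `best_model.pt` using the usual Ultralytics naming.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], object], n_runs: int = 20, warmup: int = 3) -> float:
    """Average frames per second of run_once over n_runs calls."""
    for _ in range(warmup):  # warm-up calls are not timed
        run_once()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed if elapsed > 0 else float("inf")


def compare_models(image_path: str) -> None:
    """Print FPS for the PyTorch and NCNN models (requires ultralytics)."""
    from ultralytics import YOLO  # imported lazily; not needed for measure_fps

    # Second entry: assumed name of the folder created by the NCNN export.
    for weights in ("best_model.pt", "best_model_ncnn_model"):
        model = YOLO(weights)
        fps = measure_fps(lambda: model(image_path, verbose=False))
        print(f"{weights}: {fps:.1f} FPS")
```

Run `compare_models` with a representative frame from your own camera, since image content and resolution both influence inference time.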

Gradio Interface

image (4).jpg

The Gradio interface runs on the Pi and is accessible from any browser on the same network at http://your-pi-ip:7860. It has four pages: About, Live Data, Operating, and Debug.

The real-time detection loop is driven by gr.Timer, which fires on every tick, runs inference, updates the LEDs and OLED, and pushes the annotated camera frame back to the browser. The tick rate is adjustable via a target FPS slider.

live_timer = gr.Timer(value=1.0 / ctx.cfg.default_target_fps, active=True)

live_timer.tick(
    fn=inference.dispatch_gpio_inference,
    inputs=[inp.input_source, inp.latest_capture_frame, ...],
    outputs=[inp.input_preview, model.guitar_count_output, ...],
    show_progress="hidden",
)
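The link between the FPS slider and the timer can be sketched as follows. This is a stripped-down, hypothetical demo rather than the project's interface: the counter stands in for the inference step, and the clamping limits of 1 and 30 FPS are my own assumption.

```python
def fps_to_interval(target_fps: float, min_fps: float = 1.0, max_fps: float = 30.0) -> float:
    """Convert a target FPS into a timer interval in seconds, clamped to a safe range."""
    fps = max(min_fps, min(max_fps, float(target_fps)))
    return 1.0 / fps


def build_demo():
    """Build a small Gradio app whose timer interval follows a slider."""
    import gradio as gr  # imported lazily; fps_to_interval works without it

    with gr.Blocks() as demo:
        fps_slider = gr.Slider(minimum=1, maximum=30, value=5, step=1, label="Target FPS")
        ticks = gr.Number(value=0, label="Ticks")
        timer = gr.Timer(value=fps_to_interval(5), active=True)
        # Each tick increments the counter; the real app runs inference here.
        timer.tick(lambda n: n + 1, inputs=ticks, outputs=ticks)
        # Returning a new gr.Timer updates the interval of the existing one.
        fps_slider.change(
            lambda f: gr.Timer(value=fps_to_interval(f)),
            inputs=fps_slider,
            outputs=timer,
        )
    return demo
```

Clamping the value keeps a slider position of zero from causing a division by zero and stops the Pi from being asked for more frames than it can deliver.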

The Operating page lets you start song mode, which triggers the OLED to step through the note sequence. The Debug page lets you check hardware state and test the OLED and LEDs. There is also a shutdown button to power off the Pi cleanly from the browser.

Set Up the Database and API

tables.png

All data from practice mode is stored, such as which notes the user had to play and whether they were played correctly. Everything is stored in a PostgreSQL database running in Docker.


Database tables:

users(user_id, name, rfid_tag)

sessions(session_id, user_id, start_time, end_time, duration)

attempts(attempt_id, session_id, expected_note, detected_note, string_detected, fret_detected, correct, confidence)
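To show how these tables fit together, here is a small runnable sketch. It uses SQLite from the Python standard library purely so it runs without a database server; the project itself uses PostgreSQL. The column types, constraints and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Column names follow the tables above; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    rfid_tag TEXT UNIQUE
);
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    start_time TEXT,
    end_time   TEXT,
    duration   REAL
);
CREATE TABLE attempts (
    attempt_id      INTEGER PRIMARY KEY,
    session_id      INTEGER NOT NULL REFERENCES sessions(session_id),
    expected_note   TEXT,
    detected_note   TEXT,
    string_detected INTEGER,
    fret_detected   INTEGER,
    correct         INTEGER,
    confidence      REAL
);
"""


def session_accuracy(conn: sqlite3.Connection, session_id: int) -> float:
    """Fraction of correct attempts in one session."""
    row = conn.execute(
        "SELECT AVG(correct) FROM attempts WHERE session_id = ?", (session_id,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up demo rows: one user, one session, two attempts.
conn.execute("INSERT INTO users (name, rfid_tag) VALUES (?, ?)", ("demo", "TAG-1"))
conn.execute("INSERT INTO sessions (user_id) VALUES (1)")
conn.executemany(
    "INSERT INTO attempts (session_id, expected_note, detected_note, "
    "string_detected, fret_detected, correct, confidence) VALUES (1, ?, ?, ?, ?, ?, ?)",
    [("C3", "C3", 5, 3, 1, 0.93), ("E2", "F2", 6, 1, 0, 0.88)],
)
print(f"accuracy: {session_accuracy(conn, 1):.0%}")
```

The same queries carry over to PostgreSQL with minor changes, mainly the placeholder style and proper timestamp and boolean column types.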


The FastAPI backend follows a models/repositories/routers structure. All database interaction happens through repository classes, and the routers expose the REST endpoints. You can view the auto-generated API documentation at http://your-pi-ip:8000/docs.

Designing the Enclosure

Case.png
ink.png

After the software and hardware were working correctly, I designed the enclosure that holds everything together.


I used MakerCase to get the basic box design and selected the sliding lid box for easy access to the hardware and electronics. MakerCase saved a lot of time because you can just input the dimensions you need.


Then I imported the .svg file I got from MakerCase into Inkscape for further customisations. I added holes for the OLED, LEDs, toggle switch and rocker switch.

Building the Case

elec in case.jpg
work case.jpg

Once the design was finished, the panels were laser-cut. I ended up using 3 mm ABS, which is a strong type of plastic. I assembled the box with two-component epoxy glue, which works really well for this material. I added extra holes for the power and camera cables with a multitool.


Then I added the electronics to the case. I mainly used glue and tape to keep everything in place.

Final Software Setup and Autostart

boot.JPG

To make everything start automatically when the power is plugged in, you have to add a systemd service file.


[Unit]
Description=ProjectOne Gradio app
After=network.target
Wants=network.target

[Service]
ExecStart=/home/student/.venv/bin/python -u /home/student/2025-2026-ProjectOne-CTAI-ThienpontJoran/RPi/app.py
WorkingDirectory=/home/student/2025-2026-ProjectOne-CTAI-ThienpontJoran/RPi
StandardOutput=journal
StandardError=journal
Restart=always
User=student

[Install]
WantedBy=multi-user.target


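This service basically starts app.py, which is where everything is linked together. Adjust the paths and the User entry to match your own username, save the file under /etc/systemd/system/, and enable it with sudo systemctl enable followed by the name of the service file.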


GitHub and Code

All code is available on GitHub. Feel free to do with it as you like.