DeskGuardian: an AI Desk Tray That Tells You What You Forgot

Honestly this whole project started because I keep walking away from my desk without my keys or my USB stick, and then I waste five minutes searching for something that was sitting right in front of me the whole time. So I wanted to build something that just looks at my desk and tells me, before I get up, whether the important stuff is actually there.

DeskGuardian is a little AI-powered "landing pad" for your desk. You drop your everyday items in the tray, a webcam looks down at them, and a YOLO object-detection model running on a Raspberry Pi figures out which of my five items are present: keys, wallet, USB stick, earbuds and a wristwatch. A 16x2 LCD shows the current status, a green/red LED tells you instantly if anything required is missing, and a buzzer beeps at you if you try to leave something behind. Everything is controlled from a small Gradio web interface, and every scan is logged in a PostgreSQL database so I can even re-export the photos as a fresh training dataset later.

What you'll find in this Instructable: how I collected and labelled my own dataset, trained the YOLO model, wired and soldered the electronics (LEDs, buzzer, LCD, button), designed and 3D-printed the enclosure, set up the Docker/FastAPI/PostgreSQL backend, and built the Gradio interface. Basically everything you'd need to rebuild it yourself.

It's not a life-changing invention, but it's a genuinely useful little gadget and it taught me a ton about computer vision, GPIO hardware and getting all of that to actually work together on a Pi. Let's get into it.

Supplies

Here's everything I used. Prices below are indicative: they're roughly what I paid in Belgium (in €) and a lot of it came from a Freenove Raspberry Pi starter kit, so your prices will vary. The full, exact list with supplier links and real prices is in my BOM (see the attached BOM Excel / PDF in the downloads at the end).

Parts (Bill of Materials)

Raspberry Pi 5 Model B (8GB), qty 1, €130.66: the brain, runs YOLO and the backend. (Reichelt; also Gotron)
Freenove Project Kit for Raspberry Pi, qty 1, €59.99: project board with the LCD, LEDs, resistors, buzzer, button and wiring used for the status display and feedback. (store.freenove.com)
HD USB webcam (2K, 4MP, with privacy cover), qty 1, €39.95: mounted on top, looks down at the tray. (bol.com)
Breadboard + electronics starter kit, qty 1, €25.05: breadboard, jumper wires and extra LEDs/resistors for prototyping the wiring. (bol.com)
Ethernet cable (CAT6A, 1.5 m), qty 1, €6.49: wired network connection to the Pi. (amazon.com.be)
PLA Matte filament, light grey, 1 kg spool (about 500 g used), qty 1, €22.99: for the 3D-printed enclosure. (Bambu Lab EU store)
microSD card (32 GB) with Raspberry Pi OS, qty 1, €36.29: storage and operating system. (Kiwi Electronics)
USB-C power supply, official Raspberry Pi 27W USB-C PD, qty 1, €12.95: powers the Pi 5 (it needs a 5V/5A supply). (raspberrypi.com; also Farnell BE)

Total cost: €334.37.

Note: the LCD, LEDs, resistors, buzzer, button and jumper wires referenced in the build steps below are part of the Freenove kit and the breadboard/electronics starter kit, so they aren't listed as separate purchases. See the attached BOM for the exact supplier links and alternatives.

Tools

Soldering iron + solder (to replace the breadboard wiring with permanent connections)
A 3D printer (I used a Bambu Lab A1) + slicer software
A laptop/PC for training the model, ideally with an NVIDIA GPU. I trained on my laptop (Ryzen + GeForce RTX), and having the GPU made training a lot faster than it would have been on the CPU alone
Wire cutters/strippers, small screwdriver, hot glue / tape

Collect & Label the Dataset

Standard image classification (one label per whole image) wasn't allowed for this project, and it also wouldn't make sense here: I needed the model to find multiple objects in one frame. So this is an object-detection dataset with bounding boxes.

I took all my own photos (no copyrighted or pre-labelled data) of the five item classes from lots of different angles, all in good lighting, since a dark desk is impossible to scan anyway. I aimed for at least 100 annotations per class:

Keys
Wallet
USB stick
Earbuds
Wristwatch

Then I labelled every image by drawing bounding boxes around each object and exporting in YOLO format. Tip from experience: vary the background, the position and which items are present/absent, otherwise the model "cheats" and just learns your desk instead of the objects.
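To make "YOLO format" concrete: each label file has one line per box, written as a class id followed by the box centre and size, all normalised to the image dimensions. The snippet below is a minimal sketch of that conversion plus a per-class counter for checking the 100-annotations target. The class order in CLASSES and the helper names are assumptions for illustration; the real ids are whatever your labelling tool exported.

```python
# Assumed class order; the real mapping comes from the labelling tool's export.
CLASSES = ["keys", "wallet", "usb_stick", "earbuds", "wristwatch"]


def to_yolo_line(class_name, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    class_id = CLASSES.index(class_name)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def count_annotations(label_lines):
    """Count boxes per class name from an iterable of YOLO label lines."""
    counts = {name: 0 for name in CLASSES}
    for line in label_lines:
        if line.strip():  # skip blank lines
            counts[CLASSES[int(line.split()[0])]] += 1
    return counts


# A wallet covering the middle of a 640x480 photo.
line = to_yolo_line("wallet", (160, 120, 480, 360), (640, 480))
print(line)  # 1 0.500000 0.500000 0.500000 0.500000
print(count_annotations([line, "0 0.5 0.5 0.1 0.1"]))
```

Running the counter over all label files before training is a quick way to spot a class that is still short of the target.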

Train the YOLO Model

I trained a YOLO nano detection model (YOLO26n). I picked the nano variant on purpose because it's small and fast enough to run inference on a Raspberry Pi without crawling. Input size is 640x640, and I trained on my laptop in Python.

Rough workflow:

Split the labelled data into train/validation sets.
Train the nano model for a number of epochs and watch the loss / mAP curves.
Export the best weights as best.pt.
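The first step, splitting the data, is easy to get subtly wrong (for example, a different split on every run makes results hard to compare). Below is a minimal sketch of a reproducible split using only the standard library. The 80/20 ratio, the seed and the function name are assumptions, not values taken from the project.

```python
import random


def split_dataset(image_names, val_fraction=0.2, seed=42):
    """Shuffle image names reproducibly and split them into (train, val)."""
    names = sorted(image_names)          # stable starting order
    random.Random(seed).shuffle(names)   # seeded, so the split is repeatable
    n_val = max(1, int(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


images = [f"img_{i:03d}.jpg" for i in range(100)]
train, val = split_dataset(images)
print(len(train), len(val))  # 80 20
```

Each image's label file has to move with it, so in practice you would apply the same split to the matching .txt files.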

The trained .pt file is what actually gets shipped to the Pi. You do not train on the Pi; that would take forever.

Export the Model to the Raspberry Pi

Once I was happy with the model, I copied the trained weights over to the Pi into the RPi/ai/ folder. From there the inference code loads best.pt and runs detection on frames coming from the webcam.

# on the laptop, copy the trained model to the Pi
scp best.pt pi@<pi-ip-address>:~/DeskGuardian/RPi/ai/

That's the moment it stops being "a model on my laptop" and becomes part of the actual device. Pretty satisfying.

Wire Up the Electronics

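Now the hardware. DeskGuardian has three LEDs, a buzzer, a button and the LCD, all on the Pi's GPIO. Here's which pin does what (this is the pinout I used with RPi.GPIO). One caveat if you rebuild this on a Pi 5: the original RPi.GPIO package doesn't support the Pi 5's I/O chip, so you'll probably need the rpi-lgpio drop-in replacement, which keeps the same import.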

Green LED (GPIO 17): all required items were detected on the last scan.
Red LED (GPIO 27): something required is missing.
Blue LED (GPIO 22): toggled manually from the Debugging tab, indicates the live YOLO preview is active.
Active buzzer (GPIO 12): alert tone when a required item is missing.
Push button (GPIO input): start a scan or trigger a safe shutdown.
16x2 LCD (I2C, SDA/SCL): shows status and the detected items.

Each LED gets its own 330 ohm resistor. I built and tested the whole thing on a breadboard first to make sure the logic was right before committing to solder. Definitely do this: it saved me a lot of pain.
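The green/red/buzzer behaviour described in the pin list boils down to one comparison: which required items were not detected. Here is a minimal sketch of that decision as a pure function, so it can be tested on a laptop without any GPIO attached. The function name and the use of sets are assumptions; only the pin numbers come from the list above.

```python
GREEN_LED, RED_LED, BUZZER = 17, 27, 12  # GPIO numbers from the pin list


def outputs_for_scan(required, detected):
    """Return (pin_states, missing) for one scan.

    required: item names marked as required
    detected: item names the model found in the frame
    """
    missing = sorted(set(required) - set(detected))
    all_present = not missing
    states = {
        GREEN_LED: all_present,     # green only when nothing required is missing
        RED_LED: not all_present,   # red as soon as one required item is absent
        BUZZER: not all_present,    # buzzer follows the red LED
    }
    return states, missing


states, missing = outputs_for_scan({"keys", "wallet"}, {"keys", "earbuds"})
print(missing)  # ['wallet']
print(states)   # {17: False, 27: True, 12: True}
```

On the device, each entry in states would then be written to its pin (for example with GPIO.output(pin, state) after the pins are set up as outputs). Items that are detected but not required, like the earbuds here, don't affect the result.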

Solder Everything (Goodbye Breadboard)

For the final build I removed the breadboard completely and soldered all the wiring (the LEDs, the push button). This makes the whole thing reliable and, more importantly, lets it all hide inside the enclosure.

The Raspberry Pi itself lives in a closed compartment behind the rear panel, so from the outside you don't see a single wire.

Design & 3D-Print the Enclosure

I designed a 3D-printed enclosure and printed it in matte grey PLA on a Bambu Lab A1. The shape is basically two parts:

An open-top tray at the front: this is the "landing pad" where you drop your items so the camera can see them from above.
A rear panel / box that holds the LCD and the 3-LED cluster on the front face, with a closed compartment behind it for the Raspberry Pi and all the electronics. The side has a grid of ventilation holes so the Pi doesn't cook.

The webcam clips onto the top of the rear box and tilts down over the tray. I went for a clean, square design so it sits nicely on the corner of a desk and looks tidy rather than like a pile of wires.

Set Up the Database & Backend (Docker + FastAPI + PostgreSQL)

Every scan gets logged so I can review history and even rebuild a dataset later. The backend runs in Docker and has three pieces: PostgreSQL, pgAdmin and a FastAPI service.

The database has three tables:

required_items: the five classes, each with an is_required flag. This doubles as a settings table: only items you mark as "required" will trigger an alert when missing.
scan_sessions: one row per scan (timestamp, what triggered it, and the raw JPEG of the frame).
detections: one row per item per scan (present/absent, confidence, and the bounding-box x/y/w/h). That's 5 rows per session.
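To show how the three tables fit together, here is a minimal sketch using Python's built-in sqlite3 with an in-memory database as a stand-in for PostgreSQL. The table names come from the list above, but every column name, type and the log_scan helper are assumptions for illustration; the real schema is in the repo.

```python
import sqlite3

CLASSES = ["keys", "wallet", "usb_stick", "earbuds", "wristwatch"]  # assumed names

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL container
conn.executescript("""
CREATE TABLE required_items (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    is_required INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE scan_sessions (
    id INTEGER PRIMARY KEY,
    scanned_at TEXT NOT NULL,
    trigger_source TEXT NOT NULL,
    frame_jpeg BLOB
);
CREATE TABLE detections (
    id INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL REFERENCES scan_sessions(id),
    item_id INTEGER NOT NULL REFERENCES required_items(id),
    present INTEGER NOT NULL,
    confidence REAL,
    x REAL, y REAL, w REAL, h REAL
);
""")
conn.executemany("INSERT INTO required_items (name) VALUES (?)", [(c,) for c in CLASSES])


def log_scan(conn, trigger_source, frame_jpeg, found):
    """Store one scan plus one detection row per class.

    found maps item name -> (confidence, x, y, w, h) for detected items only.
    """
    cur = conn.execute(
        "INSERT INTO scan_sessions (scanned_at, trigger_source, frame_jpeg) "
        "VALUES (datetime('now'), ?, ?)",
        (trigger_source, frame_jpeg),
    )
    session_id = cur.lastrowid
    items = conn.execute("SELECT id, name FROM required_items").fetchall()
    for item_id, name in items:
        hit = found.get(name)
        # Absent items still get a row, with NULL confidence and box.
        row = (session_id, item_id, int(hit is not None)) + (hit or (None,) * 5)
        conn.execute(
            "INSERT INTO detections "
            "(session_id, item_id, present, confidence, x, y, w, h) "
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            row,
        )
    return session_id


sid = log_scan(conn, "button", b"\xff\xd8fake-jpeg", {"keys": (0.91, 0.4, 0.5, 0.2, 0.1)})
rows = conn.execute(
    "SELECT COUNT(*), SUM(present) FROM detections WHERE session_id = ?", (sid,)
).fetchone()
print(rows)  # (5, 1): five rows for the session, one item present
```

Writing a row for absent items too is what makes the "5 rows per session" rule hold, and it keeps history queries simple because every scan has the same shape.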

Spinning it up is just:

cd RPi/api
cp .env.example .env  # fill in a real password
docker compose up -d --build

Because the DATABASE_URL uses the Compose service name postgres as the host, it resolves automatically inside the Docker network, with no extra config needed. The neat part: the stored JPEG frames can be bundled by the FastAPI /scans/export/zip endpoint into a ready-to-label YOLO dataset for retraining the model later.
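As a rough idea of what a dataset export like this can involve, the sketch below bundles stored frames into an in-memory zip with the usual images/ and labels/ folders. It is not the project's endpoint code: the function name, the file naming and the choice to write label files from stored boxes are all assumptions, and the real /scans/export/zip implementation may differ.

```python
import io
import zipfile


def build_dataset_zip(scans):
    """Bundle scans into an in-memory zip with images/ and labels/ folders.

    scans: iterable of (scan_id, jpeg_bytes, label_lines), where label_lines
    are YOLO-format strings ("class x_center y_center width height").
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for scan_id, jpeg_bytes, label_lines in scans:
            stem = f"scan_{scan_id:06d}"  # hypothetical naming scheme
            archive.writestr(f"images/{stem}.jpg", jpeg_bytes)
            archive.writestr(f"labels/{stem}.txt", "\n".join(label_lines) + "\n")
    return buffer.getvalue()


payload = build_dataset_zip([(1, b"\xff\xd8fake-jpeg", ["0 0.5 0.5 0.2 0.1"])])
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
print(names)  # ['images/scan_000001.jpg', 'labels/scan_000001.txt']
```

Building the archive in memory means nothing has to be written to the Pi's SD card before the bytes are sent back as a download.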

Build the Gradio Interface

The whole thing is controlled from a Gradio web app with four tabs:

About & onboarding: title, a short description, setup instructions (where to put the camera, which objects are supported) and a colour-coded grid of the 5 item classes.
Data: current desk status per item, a bar chart of how often each object has been detected, and the full scan-history table.
Operating: a "Scan Now" button (or the physical button) that runs the camera + YOLO. You get a live camera feed, which switches to the annotated result with bounding boxes for 5 seconds after each scan, plus a present/missing panel and the required-items checklist that drives the LED/buzzer logic.
Debugging: a toggle for the blue debug LED, a "send test message to the LCD" button, the raw YOLO inference log, the live annotated preview, DB/network status, and the shutdown button.

You can start/stop a scan either from Gradio or with the physical button, so it works headless too.
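The "live feed, then annotated result for 5 seconds" behaviour on the Operating tab is a small piece of timing logic that is independent of Gradio itself. A minimal sketch, assuming the app keeps the time of the last scan and the last annotated frame around (the function and variable names are hypothetical):

```python
ANNOTATED_SECONDS = 5.0  # how long the annotated result stays on screen


def frame_to_show(now, last_scan_time, live_frame, annotated_frame):
    """Pick the annotated result for 5 s after a scan, otherwise the live feed."""
    if last_scan_time is not None and annotated_frame is not None:
        if 0 <= now - last_scan_time < ANNOTATED_SECONDS:
            return annotated_frame
    return live_frame


# Three seconds after a scan the annotated frame wins; after six it's live again.
print(frame_to_show(103.0, 100.0, "live", "annotated"))  # annotated
print(frame_to_show(106.0, 100.0, "live", "annotated"))  # live
```

In a real loop, now would come from something like time.monotonic(), so that system clock changes can't stretch or shorten the 5 seconds.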

Final Assembly

Last step: put it all together. The soldered components and the Pi go into the rear compartment, the LCD and LEDs press into the cut-outs on the front panel, the button goes within reach, and the webcam clips on top and gets aimed down at the tray. Close it up and the only things you can touch are the tray, the LCD, the LED cluster and the button. No exposed wiring, nothing trailing across the desk.

How to Use DeskGuardian

Power on the Pi. The LCD shows the status.
Open the Gradio interface (or just use the physical button).
On the Operating tab, tick which of the five items are "required" for you today.
Drop your items into the tray and hit Scan Now (or the button).
The camera + YOLO run, the annotated frame pops up for 5 seconds, and you instantly get:
Green LED = everything required is there.
Red LED + buzzer = something's missing, go grab it before you leave!
Every scan is saved to the database so you can review your history on the Data tab.

Conclusion, Downloads & What I'd Improve

And that's DeskGuardian! It started as "I'm tired of forgetting my keys" and ended up being a full little vision system: camera, custom-trained YOLO model, GPIO hardware feedback, a database, a Docker backend and a web interface, all packed into a 3D-printed box. The thing I'm most happy about is that it actually works end-to-end and looks clean on the desk.

What I'd improve next time

More training data in different lighting so it's even more robust.
More item classes (phone, glasses…).
A phone notification when something's missing, not just the buzzer.

Downloads & links

Full source code: https://github.com/howest-mct/2025-2026-ProjectOne-CTAI-DebrabandereTristan
BOM (Excel/PDF): attached separately, has exact prices and supplier links
Trained model: best.pt (in RPi/ai/ in the repo)
3D files: the enclosure STL files

Thanks for reading! If you build your own version I'd love to see it. Tristan

Downloads

side_wall_ready_v2.stl
side_wall_final.stl
panel_06_top.stl
panel_03_rear_wall.stl
panel_02_front_wall.stl
panel_01_floor.stl
BOM_bill-of-materials.pdf