OCR for Spectacle

2025-08-20

The Story

This story begins a couple of months ago when I was working and needed to copy some text from a picture someone had sent me.

I already knew about OCR, and there are hundreds of websites and applications that do exactly that. However, once I had to OCR more than a few images, the process quickly became cumbersome: download picture → upload picture → copy text → paste text!

Yes, it works — but it’s a lot of wasted time. So, naturally, as any sane person would do in my position, I decided the wisest use of my time would be to spend the next couple of hours writing and testing a script that lets me take a screenshot and instantly get the text from it.

Linux to the Rescue

One of the best things about Linux is that if you don’t like how something works — or want to change it — you usually can!

I’m currently running Fedora 42 KDE and using the Spectacle app to take screenshots. Spectacle is a fantastic tool on its own, but the feature that makes this all work is its ability to export screenshots directly to another program.

The Script

Prerequisites:

To get this script running, you need tesseract, ImageMagick (which provides the magick command), wl-copy (Wayland) or xclip (X11), and notify-send. I won’t go into detail on how to install them, but you can almost certainly grab them using your distro’s¹ package manager.
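
If you want to confirm that everything is in place before continuing, a small check along these lines can help. It is only a sketch: the helper name have is made up for this example, and the command names are the ones the script calls.

```bash
#!/usr/bin/env bash
# have is a hypothetical helper: it succeeds if the command is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

missing=""
for cmd in tesseract magick notify-send; do
    have "$cmd" || missing="$missing $cmd"
done

# Only one clipboard tool is needed: wl-copy (Wayland) or xclip (X11).
if ! have wl-copy && ! have xclip; then
    missing="$missing wl-copy/xclip"
fi

if [ -n "$missing" ]; then
    echo "Missing:$missing"
else
    echo "All prerequisites found"
fi
```

Anything listed as missing is what you still need to install from your package manager.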

Once that’s done, you’re good to go — just follow the instructions below.

Create the OCR script, save it in a folder of your choosing, and make it executable with chmod +x.

#!/usr/bin/env bash

IMAGE="$1"
CR=$(printf '\r')

# Select language packs for Tesseract.
# The variable is named OCR_LANG because LANG is the shell's locale variable.
OCR_LANG="eng" # Default is English; use e.g. "eng+deu" for English and German.

# Temp files for the upscaled image and for Tesseract's error messages
RESIZED="/tmp/ocr_resized_$$.png"
OCR_ERR="/tmp/ocr_error_$$.log"

# Cleanup function to remove the temp files and the original image
cleanup() {
    rm -f "$RESIZED" "$OCR_ERR"
    rm -f "$IMAGE"
}
trap cleanup EXIT

if [[ ! -f "$IMAGE" ]]; then
    notify-send -i dialog-error "OCR Error" "No image file received"
    exit 1
fi

# Upscale the image for better OCR
magick "$IMAGE" -resize 400% "$RESIZED"

# Perform OCR; stderr goes to a file so warnings do not end up in the text
OCR_OUTPUT=$(tesseract --psm 6 -l "$OCR_LANG" "$RESIZED" - 2>"$OCR_ERR")
OCR_STATUS=$?

if [[ $OCR_STATUS -ne 0 ]]; then
    notify-send -i dialog-error "OCR Error" "Tesseract failed: $(<"$OCR_ERR")"
    exit 1
fi

# Append a carriage return to every line (CRLF line endings)
TEXT=$(printf '%s\n' "$OCR_OUTPUT" | sed "s/\$/${CR}/")

# Copy to clipboard (wl-copy on Wayland, xclip on X11)
if command -v wl-copy &>/dev/null; then
    printf '%s' "$TEXT" | wl-copy
elif command -v xclip &>/dev/null; then
    printf '%s' "$TEXT" | xclip -selection clipboard
else
    notify-send -i dialog-error "OCR Error" "No clipboard tool found"
    exit 1
fi

# Notify success
notify-send -i edit-paste "OCR" "Text copied to clipboard"
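
The line-ending step in the script appends a carriage return to every line, so the text on the clipboard ends up with CRLF line endings. The sketch below isolates that step so you can see what it does to a sample string. The helper name to_crlf is made up for this illustration and is not part of the script.

```bash
#!/usr/bin/env bash
CR=$(printf '\r')

# to_crlf is a hypothetical helper name: it reads text on stdin and
# appends a carriage return to the end of every line.
to_crlf() {
    sed "s/\$/${CR}/"
}

# od -c prints each character, so the added \r shows up before every \n
printf 'first line\nsecond line\n' | to_crlf | od -c
```

If you would rather keep plain LF line endings, this sed step is the one to remove.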

Create a desktop file and associate it with PNG images.

mkdir -p ~/.local/share/applications
# I use micro, but you can use nano or vim
micro ~/.local/share/applications/text-ocr.desktop

Paste the following into the file:

[Desktop Entry]
Name=Extract Text
# Change the path to wherever you have saved your script
Exec=sh -c "nohup /home/kaloian/Apps/text_ocr.sh %f >/dev/null 2>&1 &"
MimeType=image/png;
Icon=scanner
Terminal=false
Type=Application
Categories=Utility;
StartupNotify=false

Update the desktop database so the new MIME-type association is picked up:

update-desktop-database ~/.local/share/applications
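
If you would rather not edit the file by hand, the last two steps can be scripted. This is a rough sketch and not the method used above: the function name write_desktop_entry and its two arguments are invented for this example, and the entry itself is copied from the desktop file shown earlier. It assumes the script path contains no spaces.

```bash
#!/usr/bin/env bash
# write_desktop_entry is a hypothetical helper.
#   $1: full path to the OCR script
#   $2: directory for the desktop file (normally ~/.local/share/applications)
write_desktop_entry() {
    script_path="$1"
    app_dir="$2"

    mkdir -p "$app_dir"
    cat > "$app_dir/text-ocr.desktop" <<EOF
[Desktop Entry]
Name=Extract Text
Exec=sh -c "nohup $script_path %f >/dev/null 2>&1 &"
MimeType=image/png;
Icon=scanner
Terminal=false
Type=Application
Categories=Utility;
StartupNotify=false
EOF

    # Refresh the MIME cache only if the tool is installed
    if command -v update-desktop-database >/dev/null 2>&1; then
        update-desktop-database "$app_dir"
    fi
}
```

You would then call it with your own locations, for example write_desktop_entry "$HOME/Apps/text_ocr.sh" "$HOME/.local/share/applications", where the script path is a placeholder for wherever you saved yours.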

Update

After I posted this article on Reddit, there seemed to be some interest in the script, so I created a GitHub repo. If you want the latest version, download it directly from there.

Demo

¹ Linux distribution