OCR for Spectacle

The Story

This story begins a couple of months ago when I was working and needed to copy some text from a picture someone had sent me.

I already knew about OCR, and there are hundreds of websites and applications that do exactly that. However, once I had to OCR more than a few images, the process quickly became cumbersome: download picture → upload picture → copy text → paste text!

Yes, it works — but it’s a lot of wasted time. So, naturally, as any sane person would do in my position, I decided the wisest use of my time would be to spend the next couple of hours writing and testing a script that lets me take a screenshot and instantly get the text from it.

Linux to the Rescue

One of the best things about Linux is that if you don’t like how something works — or want to change it — you usually can!

I’m currently running Fedora 42 KDE and using the Spectacle app to take screenshots. Spectacle is a fantastic tool on its own, but the feature that makes this all work is its ability to export screenshots directly to another program.

The Script

Prerequisites:

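To get this script running, the main thing you need to install is tesseract, along with the language data for every language you want to recognize (the script below uses eng, bul, and deu). The script also calls magick (ImageMagick), notify-send, and either wl-copy or xclip, so make sure those are available too. I won’t go into detail on how to install them, but you can almost certainly grab them using your distro’s [1] package manager.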

Once that’s done, you’re good to go — just follow the instructions below.
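Before moving on, it can help to confirm that everything the script calls is actually on your PATH. The following is a minimal sketch, not part of the original script; `missing_commands` is a hypothetical helper name, and the list of commands is simply read off the script below.

```shell
# Hypothetical helper: print each command from the argument list
# that is not found on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Commands the OCR script calls unconditionally.
missing=$(missing_commands tesseract magick notify-send)

# One clipboard tool is enough: wl-copy (Wayland) or xclip (X11).
if [ -n "$(missing_commands wl-copy)" ] && [ -n "$(missing_commands xclip)" ]; then
  missing="${missing:+$missing
}wl-copy or xclip"
fi

if [ -n "$missing" ]; then
  printf 'Missing:\n%s\n' "$missing"
else
  echo "All required commands found."
  # Show the language packs Tesseract can see; the script uses eng, bul, and deu.
  tesseract --list-langs
fi
```

If a language you need is absent from the `tesseract --list-langs` output, install its language data through your package manager before running the OCR script.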

Create the OCR script and save it in a folder of your choosing.

#!/usr/bin/env bash

IMAGE="$1"
CR=$(printf '\r')

# Cleanup function to remove temp and original image
cleanup() {
  rm -f "$RESIZED"
  rm -f "$IMAGE"
}
trap cleanup EXIT

if [[ ! -f "$IMAGE" ]]; then
  notify-send -i dialog-error "OCR Error" "No image file received"
  exit 1
fi

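# Resize for better OCR; stop early if ImageMagick fails
RESIZED="/tmp/ocr_resized_$$.png"
if ! magick "$IMAGE" -resize 400% "$RESIZED"; then
  notify-send -i dialog-error "OCR Error" "ImageMagick could not resize the image"
  exit 1
fi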

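# Perform OCR on the selected languages. Keep stderr in a separate file so
# Tesseract's warnings do not end up in the copied text.
OCR_ERR="/tmp/ocr_err_$$.log"
OCR_OUTPUT=$(tesseract --psm 6 -l eng+bul+deu "$RESIZED" - 2>"$OCR_ERR")
OCR_STATUS=$?

if [ $OCR_STATUS -ne 0 ]; then
  notify-send -i dialog-error "OCR Error" "Tesseract failed: $(cat "$OCR_ERR")"
  rm -f "$OCR_ERR"
  exit 1
fi
rm -f "$OCR_ERR"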

# Append a carriage return to every line so the copied text uses CRLF line endings
TEXT=$(printf '%s\n' "$OCR_OUTPUT" | sed "s/\$/${CR}/")

# Copy to clipboard
if command -v wl-copy &>/dev/null; then
  printf '%s' "$TEXT" | wl-copy
elif command -v xclip &>/dev/null; then
  printf '%s' "$TEXT" | xclip -selection clipboard
else
  notify-send -i dialog-error "OCR Error" "No clipboard tool found"
  exit 1
fi

# Notify success
notify-send -i edit-paste "OCR" "Text copied to clipboard"
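The least obvious step in the script is the line-ending one: the sed expression appends a carriage return to every line, so the text that lands on the clipboard has CRLF line endings. Here is a small sketch for trying that step in isolation; `to_crlf` is a hypothetical helper name that does not appear in the script itself.

```shell
# Hypothetical helper: append a carriage return to every line read from
# stdin, using the same sed expression as the OCR script.
to_crlf() {
  cr=$(printf '\r')
  sed "s/\$/${cr}/"
}

# od -c makes the otherwise invisible \r characters visible.
printf 'first line\nsecond line\n' | to_crlf | od -c
```

If you only ever paste into applications that expect plain LF line endings, you could drop this step from the script and copy the Tesseract output directly.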

Create a desktop file and associate it with the image/png MIME type.

mkdir -p ~/.local/share/applications
# I use micro, but you can use nano or vim 
micro ~/.local/share/applications/text-ocr.desktop
[Desktop Entry]
Name=Extract Text
# Change the path to wherever you have saved your script
Exec=sh -c "nohup /home/kaloian/Apps/text_ocr.sh %f >/dev/null 2>&1 &"
MimeType=image/png;
Icon=scanner
Terminal=false
Type=Application
Categories=Utility;
StartupNotify=false
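If you would rather not edit the path by hand, the same desktop entry can be generated from the shell. This is only a sketch: `desktop_entry` is a hypothetical helper, `$HOME/Apps/text_ocr.sh` is an assumed location, and the script path is assumed to contain no spaces, since it is placed unquoted inside the Exec line.

```shell
# Hypothetical helper: print the desktop entry for the script path given
# as $1. %f is left as-is; it is replaced with the screenshot's file path
# when the entry is launched.
desktop_entry() {
  cat <<EOF
[Desktop Entry]
Name=Extract Text
Exec=sh -c "nohup $1 %f >/dev/null 2>&1 &"
MimeType=image/png;
Icon=scanner
Terminal=false
Type=Application
Categories=Utility;
StartupNotify=false
EOF
}

# Preview the entry. To install it, redirect the output, for example:
#   desktop_entry "$HOME/Apps/text_ocr.sh" > ~/.local/share/applications/text-ocr.desktop
desktop_entry "$HOME/Apps/text_ocr.sh"
```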

Update the desktop database so the new MIME type association is picked up:

update-desktop-database ~/.local/share/applications
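To check that the association was registered, you can look at the mimeinfo.cache file that update-desktop-database writes into the same directory. The sketch below assumes the file names used in this post; `is_registered` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if desktop file $3 is listed for MIME type $2
# in the cache file $1 (lines look like "image/png=a.desktop;b.desktop;").
is_registered() {
  grep "^$2=" "$1" | grep -qF "$3"
}

cache="$HOME/.local/share/applications/mimeinfo.cache"
if [ -f "$cache" ] && is_registered "$cache" image/png text-ocr.desktop; then
  echo "text-ocr.desktop is registered for image/png"
else
  echo "Not registered yet; check the desktop file and re-run update-desktop-database"
fi
```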

Demo

(Demo image: OCR)

  1. Linux distribution