Linux: Convert a png file into a pdf file

Important: I have only tested with png files.

Another important note: you have to install the programs tesseract and pandoc for this script.

What is tesseract?

A commercial quality OCR engine originally developed at HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated by UNLV. It was open-sourced by HP and UNLV in 2005.

The website of tesseract.

What is pandoc?

This package provides a command-line executable that uses the pandoc library to convert between markup formats. For pdf output please also install pandoc-pdf or weasyprint.

What is pandoc-pdf?

This package pulls in the TeXLive latex package collection needed by pandoc to generate pdf output using pdflatex. To use --latex-engine=xelatex or lualatex, install texlive-collection-xetex or texlive-collection-luatex respectively. Newer pandoc versions call this option --pdf-engine.

The website about pandoc

First check whether tesseract and pandoc are already installed on your computer.
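One way to run this check, as a minimal sketch: `command -v` reports whether a program can be found on the PATH. The helper name `check_program` is made up for this example.

```shell
#!/bin/bash
# check_program is a hypothetical helper: it reports whether the
# given program can be found on the PATH.
check_program() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 is installed"
    else
        echo "$1 is NOT installed"
    fi
}

check_program tesseract
check_program pandoc
```

If both are found, `tesseract --version` and `pandoc --version` show which versions you have.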

Hint: On Fedora, tesseract and pandoc are in the updates repository.

To check whether you have the updates repository, look for its repository file. The correct name is "fedora-updates.repo".

It should have been set up during the installation of Fedora.
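A small sketch of that check. It assumes that the repository files live in /etc/yum.repos.d, which is the usual place on Fedora but is not stated above. The helper name `check_updates_repo` is made up for this example.

```shell
#!/bin/bash
# Assumption: Fedora keeps its repository files in /etc/yum.repos.d.
# check_updates_repo is a hypothetical helper; another directory can be
# passed as the first argument.
check_updates_repo() {
    local repo_dir=${1:-/etc/yum.repos.d}
    if [ -f "$repo_dir/fedora-updates.repo" ]; then
        echo "fedora-updates.repo found"
    else
        echo "fedora-updates.repo missing"
    fi
}

check_updates_repo
```

Alternatively, `dnf repolist` lists the repositories that are currently enabled.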

Then install the programs tesseract and pandoc with your package manager.
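On Fedora the installation could look like the sketch below. The package name tesseract-langpack-deu (German language data, which the script needs for `-l deu`) is an assumption on my side, so check it against your repositories. The sketch only prints the command so that you can review it before you run it yourself.

```shell
#!/bin/bash
# Packages named in this article, plus the German language data.
# Assumption: the language data is packaged as tesseract-langpack-deu.
packages="tesseract tesseract-langpack-deu pandoc pandoc-pdf"

# install_command is a hypothetical helper: it prints the install
# command for review instead of running it.
install_command() {
    echo "sudo dnf install $packages"
}

install_command
```

After the installation, `tesseract --list-langs` should list deu.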

You also need a png file, for example one that you have created with a screenshot tool.

On Fedora I use the program gnome-screenshot, because I use the GNOME desktop.

gnome-screenshot is in the repository „fedora“

What is gnome-screenshot?

gnome-screenshot lets you take pictures of your screen.

If everything is installed, you can use the following script.

Script

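#!/bin/bash

clear
echo "Extract the text of a screenshot into a pdf"
echo

# Ask for the directory that contains the png file
echo -n "Enter the path to the directory that contains the file: "
read -r pfad

if [ -d "$pfad" ]; then
    echo "$pfad exists"
    cd "$pfad" || exit 1
else
    echo "$pfad does not exist"
    exit 1
fi

# Ask for the png file inside that directory
echo -n "Enter the file name: "
read -r filename

if [ -f "$filename" ]; then
    echo "$filename exists"
else
    echo "$filename does not exist"
    exit 1
fi

echo -n "What is the name of the output file (without .pdf)? "
read -r output

# OCR the image with the German language data; tr joins the recognized
# lines into one line of text (xargs would stop at an unmatched quote)
tesseract -l deu "$filename" stdout | tr -s '[:space:]' ' ' > 12.txt

# Convert the text to pdf and remove the temporary file
pandoc 12.txt -o "$output.pdf"
rm 12.txt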
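If you do not want to answer the questions every time, the same steps can be wrapped in a function that takes its input as arguments. This is only a sketch that I have not tested against real screenshots. The function names `flatten_text`, `default_output` and `png_to_pdf` are made up for this example, and `mktemp --suffix` assumes GNU coreutils.

```shell
#!/bin/bash
# Join the recognized lines into one line of text, much like the
# pipe stage in the script does.
flatten_text() {
    tr -s '[:space:]' ' '
}

# Derive "scan.pdf" from "/some/dir/scan.png".
default_output() {
    local base
    base=$(basename "$1")
    echo "${base%.*}.pdf"
}

# Usage: png_to_pdf IMAGE [LANGUAGE] [OUTPUT]
png_to_pdf() {
    local image=$1
    local lang=${2:-deu}
    local output=${3:-$(default_output "$1")}
    local tmp
    # A unique temporary file instead of the fixed name 12.txt
    tmp=$(mktemp --suffix=.txt) || return 1
    tesseract "$image" stdout -l "$lang" | flatten_text > "$tmp" &&
        pandoc "$tmp" -o "$output"
    local status=$?
    rm -f "$tmp"
    return "$status"
}

# Only do something when the script is called with arguments.
if [ "$#" -ge 1 ]; then
    png_to_pdf "$@"
fi
```

Saved under a name of your choice, for example png2pdf.sh, a call like `./png2pdf.sh screenshot.png deu result.pdf` would write result.pdf into the current directory.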