Linux: pdftotext – a tool for reading PDF files in Bash (here on Fedora)

If you want to read a PDF file in Bash, on the Linux command line, you don't need a graphical program. The command pdftotext is enough.

It is not a standalone package, though: it ships as part of the poppler-utils collection, which has to be installed first.

What are the poppler-utils?

poppler-utils is a collection of helpful command-line tools for working with PDF files.

Where to find the poppler-utils:

On Fedora, poppler-utils is available in the updates repository.

Installation of the poppler-utils in Fedora
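
On Fedora the package can be installed with sudo dnf install poppler-utils. The short sketch below first checks whether pdftotext is already on the PATH and only prints the install hint if it is missing. The function name check_pdftotext is made up for this example; it is not part of poppler-utils.

```shell
# Report whether pdftotext is available; print an install hint if it is not.
# check_pdftotext is a hypothetical helper name used only in this sketch.
check_pdftotext() {
    if command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext found: $(command -v pdftotext)"
    else
        echo "pdftotext not found, install it with: sudo dnf install poppler-utils"
        return 1
    fi
}

check_pdftotext || true
```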

Which programs does this package contain?

You can list the files in the package with the following command:

sven@fedora:~$ dnf repoquery -l poppler-utils
Updating and loading repositories:
Repositories loaded.
/usr/bin/pdfattach
/usr/bin/pdfdetach
/usr/bin/pdffonts
/usr/bin/pdfimages
/usr/bin/pdfinfo
/usr/bin/pdfseparate
/usr/bin/pdfsig
/usr/bin/pdftocairo
/usr/bin/pdftohtml
/usr/bin/pdftoppm
/usr/bin/pdftops
/usr/bin/pdftotext
/usr/bin/pdfunite

The list includes pdftotext, which is the command we need for this example. To see which packages provide a file with that name, you can ask dnf:

dnf provides */pdftotext

bash-completion-1:2.16-1.fc41.noarch : Programmable completion for Bash
Repo : @System
Matched From :
Filename : /usr/share/bash-completion/completions/pdftotext

poppler-utils-24.08.0-2.fc41.x86_64 : Command line utilities for converting PDF files
Repo : @System
Matched From :
Filename : /usr/bin/pdftotext

bash-completion-1:2.16-1.fc41.noarch : Programmable completion for Bash
Repo : updates
Matched From :
Filename : /usr/share/bash-completion/completions/pdftotext

poppler-utils-24.08.0-2.fc41.x86_64 : Command line utilities for converting PDF files
Repo : updates
Matched From :
Filename : /usr/bin/pdftotext

bash-completion-1:2.13-2.fc41.noarch : Programmable completion for Bash
Repo : fedora
Matched From :
Filename : /usr/share/bash-completion/completions/pdftotext

Why pdftotext also appears in bash-completion

The output of dnf provides */pdftotext lists both poppler-utils and bash-completion. This is a crucial detail, because the two packages provide different files:

  • poppler-utils provides the actual executable program at /usr/bin/pdftotext.
  • bash-completion provides the autocompletion script at /usr/share/bash-completion/completions/pdftotext.

This means that pressing the Tab key will suggest pdftotext and its options, which is a big help. For example, if you type pdf and press Tab, bash-completion will show you all commands that begin with pdf.

Here is an example on my computer:

sven@fedora:~$ pdf [tab] [tab] [tab] …
pdf2dsc pdfatfi pdfdetach pdffonts pdfimages pdfjadetex pdflatex-dev pdfseparate pdf-stapler pdftk pdftohtml pdftops pdfunite pdf2ps pdfattach pdfetex pdfgrep pdfinfo pdflatex pdfroff pdfsig pdftex pdftocairo pdftoppm pdftotext pdfxmltex

As you can see, the list also contains pdftotext.

How does pdftotext work in the Bash?

I created a PDF file from my own blog and named it "GitHub_Reisen_und_IT.pdf".

To list only the PDF files in the current directory, type:

ls *.pdf

The pattern *.pdf matches only names that end in .pdf, so other files and folders in the directory are not listed.

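The command

pdftotext PDFfile.pdf -

The - after the file name is important, and it must be a plain hyphen (-), not a longer typographic dash. Without it, pdftotext prints nothing to the terminal; it writes the text to a file named PDFfile.txt next to the PDF instead.

The - is not special to Bash itself. By convention, many command-line programs, pdftotext among them, accept - in place of a file name to mean standard output (stdout). The text is therefore sent to the terminal.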
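
Because the text goes to stdout, it can also be piped into other tools. As a small sketch, the helper below searches the text of a PDF for a word with grep. The function name pdf_search, the file name PDFfile.pdf and the search word are placeholders for this example.

```shell
# Search the text of a PDF for a word (case-insensitive, with line numbers).
# Usage: pdf_search WORD FILE.pdf   (pdf_search is a hypothetical helper name)
pdf_search() {
    pdftotext "$2" - | grep -i -n -- "$1"
}

# Example call; PDFfile.pdf and the search word are placeholders:
# pdf_search Fedora PDFfile.pdf
```

The line numbers refer to lines of the extracted text, not to pages of the PDF.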

For a large PDF file, it is useful to pipe the output into a pager such as less:

pdftotext PDFfile.pdf - | less
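
For large files it can also help to extract only a few pages. The following sketch uses the pdftotext options -f (first page), -l (last page) and -layout (keep the original physical layout). The function name pdf_pages and the file name are placeholders for this example.

```shell
# Print only a page range of a PDF, keeping the physical layout.
# Usage: pdf_pages FILE.pdf FIRST LAST   (pdf_pages is a hypothetical helper name)
pdf_pages() {
    pdftotext -f "$2" -l "$3" -layout "$1" -
}

# Example call; PDFfile.pdf is a placeholder:
# pdf_pages PDFfile.pdf 1 2 | less
```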

Conclusion:

With pdftotext you have a powerful tool for reading PDF files in Bash. If you want to learn about further options, read the manual page of pdftotext:

man pdftotext
