Textract installation hints
Textract needs Poppler to extract text from PDFs.
Windows
- Download a binary release from github.com/oschwartz10612 (the latest one is usually a good choice). This example uses Release-22.01.0-0.zip.
- Extract the archive Release-22.01.0-0.zip.
- Copy the folders from poppler-22.01.0\Library into C:\Program Files\Poppler.
- The directory structure should then look something like this:
C:\Program Files\Poppler
\bin
\include
\lib
\share
- Add C:\Program Files\Poppler\bin to your system PATH.
- Try it with a filecontent example rule.
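
Before trying a rule, it can help to confirm that the folders were copied and that the PATH change took effect. The following is a minimal sketch, not part of Textract itself. It assumes the install location used in the steps above and assumes that `pdftotext` (one of the command-line tools shipped in Poppler's `bin` folder) is the tool to look for. The helper names `missing_dirs` and `find_tool` are hypothetical.

```python
import shutil
from pathlib import Path

# Assumption: the install location used in the steps above.
POPPLER_ROOT = Path(r"C:\Program Files\Poppler")
# The four folders copied from poppler-22.01.0\Library.
EXPECTED_DIRS = ("bin", "include", "lib", "share")


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected subfolder names that do not exist under root."""
    return [name for name in expected if not (Path(root) / name).is_dir()]


def find_tool(name="pdftotext"):
    """Return the full path of a tool found on PATH, or None if it is absent."""
    return shutil.which(name)


if __name__ == "__main__":
    missing = missing_dirs(POPPLER_ROOT)
    if missing:
        print(f"Missing under {POPPLER_ROOT}: {', '.join(missing)}")
    else:
        print(f"All expected folders found under {POPPLER_ROOT}")

    tool = find_tool()
    if tool is None:
        print("pdftotext was not found on PATH; check the PATH entry for the bin folder")
    else:
        print(f"pdftotext found at {tool}")
```

Run the script in a newly opened terminal, since a change to the system PATH is typically not picked up by terminals that were already open.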