Get Text from PDF

📄

Gets text from the provided PDF file.
Input: Rich text, File, Image, URL, Evernote note, PDF, Text
Result: Text, Rich text


Information

Parameters

Get [Text/Rich Text] from PDF [Document]

Page Header Text: [Page Header]
Page Footer Text: [Page Footer]
Combine Pages [Toggle]

Input

Folder, Text, Rich text, URL, File, Image, Evernote note, PDF

Result

Text, Rich text

Comments

Get Text from PDF follows in the footsteps of Extract Text from Image, though it does less fancy work along the way, since a PDF already contains the actual text data in the document itself. In impact and expectation, though, it provides a similar benefit: it lets users grab all the words from a format that’s not as accessible and use them in the course of their Shortcuts.
With Get Text from PDF, the data in the “portable document format” actually becomes a lot more portable, especially since the action has a “Rich Text” option that tries to preserve the integrity of the text so that inline links and formatting still come through.
Plus, users can specify the page Header and Footer text so that those are ignored when the data is scraped. Finally, the pages can be combined (or not) as needed, producing either one large block of text or a distinct list of each page’s contents.
In practice, this action lets you copy and paste all the data out of a PDF, act on it in all sorts of ways (like formatting that Rich Text as HTML or Markdown), and then immediately create a new multi-page document from it or pass it to any other app that hooks into Shortcuts.
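
To make the Page Header Text, Page Footer Text, and Combine Pages parameters more concrete, here is a rough Python sketch of the behavior described above. It is not how Shortcuts implements the action: the function name `extract_text` and the matching rule (dropping a page's first or last line only when it exactly equals the header or footer text) are assumptions made for illustration, and the page contents are made-up sample data.

```python
def extract_text(pages, header="", footer="", combine=True):
    """Hypothetical model of the action's parameters (not the Shortcuts implementation)."""
    cleaned = []
    for page in pages:
        lines = page.splitlines()
        # Assumption: the header is the first line of a page and must match exactly.
        if header and lines and lines[0].strip() == header:
            lines = lines[1:]
        # Assumption: the footer is the last line of a page and must match exactly.
        if footer and lines and lines[-1].strip() == footer:
            lines = lines[:-1]
        cleaned.append("\n".join(lines).strip())
    # Combine Pages on: one block of text. Off: a list with one item per page.
    return "\n".join(cleaned) if combine else cleaned


# Made-up sample pages standing in for the text of a two-page PDF.
pages = [
    "Quarterly Report\nRevenue grew this quarter.\nConfidential",
    "Quarterly Report\nCosts stayed flat.\nConfidential",
]

combined = extract_text(pages, header="Quarterly Report", footer="Confidential")
separate = extract_text(pages, header="Quarterly Report", footer="Confidential", combine=False)
```

With Combine Pages on, `combined` is a single string. With it off, `separate` is a list holding one string per page, which is the shape you would loop over with Repeat with Each in a shortcut.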

Details

Score
4/5
Group
Date Added
07/06/2021
Stage
Type
Identifier
is.workflow.actions.gettextfrompdf


