Monday, January 27, 2014

Mac 101: Use Automator to extract text from PDFs

Here are some tips for working on a Mac in an efficient, user-friendly way. I am sure all of us have tried copying and pasting PDF content into a Word or Pages document. Trust me, the results are quite disenchanting: the document ends up looking like a kid's job, with no formatting, messed-up lines and diagrams, and photos all gone awry. What a mess! Then the cleanup begins, and you can spend five times as long, if not more, putting the document back in order, especially if it is a long one. I have since learnt that there is a simple and efficient way of doing this, and yes, it is a Mac application. This magic wand is called Automator.

What is Automator? It is a user-friendly application that ships with the Mac and lets you automate repetitive tasks. The idea is simple: you create a workflow, a chain of actions that produces the end result, so you do not have to go through the whole set of steps by hand every single time.

Here is how you do it.

1. On your Mac, open the Applications folder and launch Automator. A prompt asks you to select a document type. Select Workflow and press Enter. This is critical, as it lets you create the chain of actions mentioned earlier. Next, click Files & Folders in the far-left column and select “Ask for Finder Items”. Drag this action into the space on the right labelled “Drag actions or files here to build your workflow”. This completes step 1.

2. Click PDFs in the far-left column and select “Extract PDF Text” from the second column. This is the second link in your chain. Drag “Extract PDF Text” into the space on the right where you are building the workflow. You now have a complete workflow.

3. In the Extract PDF Text box of the workflow, select Rich Text instead of Plain Text. This retains the formatting of the text itself. Now choose a location for the extracted file, give it a name, and save. Make sure to save as an application and not as a workflow.

Finally, open your new Automator application and select the PDF you want to extract text from. A new rich text document is created, from which you can copy and paste the contents into your preferred word processor. Though the Mac has always trailed Windows in terms of usage, a large part of the online community still swears by it as a great platform, and Automator only helps prove them right.
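If you would rather skip the copy-and-paste at the end, the rich text file can also be converted directly with textutil, a command-line tool that comes with the Mac. Here is a minimal Python sketch of that idea. The function names and the file name in the usage note are my own and purely illustrative, and the output is assumed to go next to the source file.

```python
import shutil
import subprocess
from pathlib import Path

# Output formats accepted by textutil's -convert option.
SUPPORTED_FORMATS = {
    "txt", "html", "rtf", "rtfd", "doc", "docx", "wordml", "odt", "webarchive",
}


def build_textutil_command(rtf_path, out_format="docx"):
    """Return the textutil argument list that converts rtf_path to out_format."""
    if out_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {out_format}")
    source = Path(rtf_path)
    # Assumption: write the result beside the source, with a new extension.
    target = source.with_suffix("." + out_format)
    return ["textutil", "-convert", out_format, str(source), "-output", str(target)]


def convert(rtf_path, out_format="docx"):
    """Run the conversion. This only works on a Mac, where textutil is installed."""
    if shutil.which("textutil") is None:
        raise RuntimeError("textutil not found; this sketch needs a Mac")
    subprocess.run(build_textutil_command(rtf_path, out_format), check=True)
```

For example, convert("Extracted Text.rtf") should leave an "Extracted Text.docx" in the same folder, assuming that is what you named the file in step 3.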

Well, I am sure one of your most long-standing PDF issues is solved now.
