The following is a transcript of the UiPath Anchor Base tutorial posted above.
UiPath Anchor Base Example
I did a previous tutorial on how to do some basic text extraction without the Anchor Base activity as an introduction, but if you ever start doing serious data extraction from your PDFs, you absolutely must use the UiPath Anchor Base Activity with the AnchorPosition property. That’s the only way to really make text extraction work with multiple PDF files from different vendors over an extended timeline where the structure of the file might change.
So that’s what I want to talk about here. In a subsequent UiPath Anchor Base tutorial, I’m going to talk about doing data extraction on tabular data, which this is not. But if you put the two tutorials together, I think you’ll really know how to work with PDF files and UiPath.
I’m going to kick this off by creating a new process project, called PDFExtract.
Let me show you the PDF that I’m going to pull some data out of.
You’ll notice that I’ve got the invoice number and the date clearly visible. I’ve also got a total.
I want to pull that data out of this PDF file using the UiPath Anchor Base activity.
Now, one thing I should point out: there’s tabular data on here. The mailing address is tabular.
The line items here in the orders section are tabular.
And then there are these individual properties. These aren’t tabular. These are name-value pairs, such as the invoice number and the date.
If you want to pull out tabular data, you have to do that through data scraping. I’ve got a separate UiPath tutorial on that.
If you want to pull out name-value pairs, the best way to do it is with the UiPath Anchor Base activity and the AnchorPosition property. So that’s what I’m going to demonstrate here.
I’m going to pull out the invoice number, the date and the total, and I’m going to do it all through an anchor activity.
I need to actually get the name and folder of the invoice. It’s in a folder called orders. So I’m just going to copy the path to the file and then paste that name in.
Kicking off that process. We need to attach the Adobe PDF window to this process so that we can interact with the invoice.
Now what I want to do is grab some information. I want to grab that invoice number, the invoice date and the invoice total. I’m gonna grab that information. I’m gonna need some variables. I’m going to create those right off the bat.
Make sure you’ve got those variables declared.
And then the first thing I’m going to grab is this invoice number.
Now the invoice number is going to change from time to time. So I want the process to look under where it says invoice number, which will always be there, and give me the value associated with it. And you do that by adding in something called an Anchor Base activity. I’m going to drop that in here.
Anchor Base vs Find Elements
The Anchor Base activity has two parts to it. One is the anchor, and the other is the text that you want to get based on that anchor.
To fill in the anchor, it asks what element you’re looking for. So click that button and say, I’m looking for something relative to the invoice number. It says, okay, that’s cool. Then it asks what you want to do once you’ve found that thing. I want to get some text that’s associated with it, so it says, show me what text you want. And again, I’ll click on that link and I’ll say, well, it’s actually the text that’s right underneath it.
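To make the anchor-and-target idea concrete, here is a minimal Python sketch of that kind of lookup. It is not how UiPath implements the Anchor Base activity; the `TextBox` type, the `get_text_below` helper, the coordinates and the sample values are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextBox:
    """A piece of text on the page with the position of its top-left corner."""
    text: str
    x: float  # distance from the left edge of the page
    y: float  # distance from the top edge of the page (larger means lower)


def get_text_below(boxes: List[TextBox], anchor_label: str,
                   x_tolerance: float = 20.0) -> Optional[str]:
    """Return the text of the closest box directly underneath the anchor label."""
    anchor = next((b for b in boxes if b.text == anchor_label), None)
    if anchor is None:
        return None  # the anchor label is not on this page
    # Candidates sit lower on the page and roughly in the same column.
    below = [b for b in boxes
             if b.y > anchor.y and abs(b.x - anchor.x) <= x_tolerance]
    if not below:
        return None
    return min(below, key=lambda b: b.y - anchor.y).text


# Hypothetical layout: each label sits directly above its value.
page = [
    TextBox("Invoice Number", 400, 100),
    TextBox("INV-0001", 400, 120),
    TextBox("Date", 520, 100),
    TextBox("2020-01-15", 520, 120),
    TextBox("Total", 400, 600),
    TextBox("99.50", 400, 620),
]

invoice_number = get_text_below(page, "Invoice Number")
invoice_date = get_text_below(page, "Date")
invoice_total = get_text_below(page, "Total")
```

The label is the stable part of the page and the value is whatever happens to sit underneath it, which is why the lookup keeps working when the value changes from one invoice to the next.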
Now it looks like everything is running swimmingly.
One thing I will say is this: the Anchor Base tries to guess what direction you’re interested in, and it doesn’t always guess right.
That anchor, invoice number, is above the string that I want. So what you should always do is explicitly state in your Anchor Base where that anchor is positioned. In this case, invoice number is positioned above the data that I want. So for the UiPath AnchorPosition, you say that the anchor is above the element that we’re after. Not on top: on top means actually in the same cell, while top means just above it.
There are two options there that are kind of similar, on top and top. We want top.
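Here is a rough Python sketch of why stating the position explicitly is safer than letting a tool guess. Everything in it is an assumption for illustration: the `AnchorSide` enum, the `Box` type, both helper functions and the coordinates are hypothetical, and none of it is UiPath’s actual logic.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class AnchorSide(Enum):
    """Where the anchor sits relative to the value (hypothetical names)."""
    TOP = "top"    # the anchor is above the value
    LEFT = "left"  # the anchor is to the left of the value


class Box(NamedTuple):
    text: str
    x: float  # horizontal centre of the text
    y: float  # vertical centre of the text (larger means lower on the page)


def nearest_any_direction(boxes: List[Box], anchor: Box) -> Box:
    """Guessing: pick whichever other box is closest, in any direction."""
    others = [b for b in boxes if b != anchor]
    return min(others, key=lambda b: abs(b.x - anchor.x) + abs(b.y - anchor.y))


def nearest_on_side(boxes: List[Box], anchor: Box, side: AnchorSide,
                    tolerance: float = 30.0) -> Optional[Box]:
    """Explicit: only consider boxes on the stated side of the anchor."""
    if side is AnchorSide.TOP:
        # Anchor on top, so the value is lower down in the same column.
        candidates = [b for b in boxes
                      if b.y > anchor.y and abs(b.x - anchor.x) <= tolerance]
        return min(candidates, key=lambda b: b.y - anchor.y, default=None)
    # Anchor on the left, so the value is further right on the same row.
    candidates = [b for b in boxes
                  if b.x > anchor.x and abs(b.y - anchor.y) <= tolerance]
    return min(candidates, key=lambda b: b.x - anchor.x, default=None)


# Hypothetical layout: the "Date" label is closer to the anchor than the value is.
anchor = Box("Invoice Number", 100, 100)
boxes = [
    anchor,
    Box("Date", 160, 100),
    Box("INV-0001", 100, 170),
    Box("2020-01-15", 160, 170),
]

guessed = nearest_any_direction(boxes, anchor)
explicit = nearest_on_side(boxes, anchor, AnchorSide.TOP)
```

In this made-up layout the guess lands on the neighbouring "Date" label because it is the closest text, while the explicit top setting lands on the value underneath the anchor.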
This is going to give us our PDF invoice number, and if you’re interested, you can even write that out. So you can say, ‘Hey, the invoice number is,’ and then plus invoice number, and that should print out what the invoice number is.
And then we can follow these steps right through and get the invoice date. And so again, that’s going to be exactly the same, just with a different anchor. So we grab the Anchor Base activity, throw it on, and go through the process of associating those two elements with the fields that I want. I’m running out of space here on the page, but you can see what the following steps will be.
And so the last thing that I need to do here is specify that the UiPath AnchorPosition is top. That gets me through the Anchor Base thing. And then I guess I can always write the line out there and say, ‘Hey, the total is,’ plus invoice total.
Well, all of this looks good. I would like to make sure I’ve got things in double quotes. I often forget to do that. Make sure that each Get Text is pointing to a variable. Why is there no value there? That Get Text should point to invoice number. So it looks like I did mess something up there. Make sure that you’ve got that specified. So the Get Text there goes to invoice number.
This Get Text goes to invoice date, and then this Get Text goes to invoice total. And you know what? I never printed out the invoice date. Gee, that wasn’t very thoughtful of me.
So I’m just going to add a Write Line after the invoice date: ‘Date is,’ plus invoice date. Okay. And that looks pretty handsome.
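As a rough Python analogue of those checks, the sketch below refuses to print anything if one of the three values was never filled in, and builds each output line by joining a quoted label to a variable with a plus. The `build_output_lines` function and the sample values are hypothetical; in the workflow itself this is done with Get Text outputs and Write Line activities.

```python
from typing import List


def build_output_lines(invoice_number: str, invoice_date: str,
                       invoice_total: str) -> List[str]:
    """Build the three output lines, failing loudly if any value is empty."""
    values = {
        "invoice_number": invoice_number,
        "invoice_date": invoice_date,
        "invoice_total": invoice_total,
    }
    # An empty value usually means an extraction step was never wired to its variable.
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise ValueError("No value extracted for: " + ", ".join(missing))
    # Labels go in double quotes and are joined to the variables with a plus.
    return [
        "The invoice number is " + invoice_number,
        "Date is " + invoice_date,
        "The total is " + invoice_total,
    ]


# Made-up sample values, not the ones from the invoice in the video.
for line in build_output_lines("INV-0001", "2020-01-15", "99.50"):
    print(line)
```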
Well, that doesn’t look too visually appealing, but it gives you a bit of an idea there of what’s going on. We kick off the process and load the PDF. We associate it with Acrobat Reader and then, based on anchors, we get the invoice number, the invoice date and the invoice total.
Okay, well I’m going to run this and see what happens.
Oh, there’s a validation error. It looks like I’ve got an underscore instead of a plus there. Silly me. Okay. Save that again. And then run.
Example of Anchor Base UiPath Activity
It loads up the file. It does the PDF extract, and then here you can see down at the bottom, the invoice number is a one eight zero one. The date is zero one 2020. The total is one 15 and that maps exactly to everything I’ve got on the form.
And there you go. That’s how you use the UiPath Anchor Base activity to perform UiPath PDF data extraction.
If you enjoyed this tutorial, head over to theserverside.com. I’ve got all sorts of tutorials and articles over there about enterprise software development. If you’re interested in my personal antics, you can follow me on Twitter @cameronmcnz and subscribe on YouTube.