Fill PDF Form Fields 5x Faster With Power Automate Desktop

Fill PDF Form Fields 5x Faster With Power Automate Desktop

I’ve found a better way to fill pdf forms in Power Automate Desktop than a typical UI automation. A UI automation uses keyboard and mouse clicks to select text input fields on the screen and write in values. The problem for pdf files is Power Automate Desktop cannot detect their text inputs and you must resort to writing a complicated series of keyboard commands to slowly jump to each field and write in the value.

By running a PowerShell script in Power Automate Desktop instead we drastically simplify the code and fill the pdf action amazingly fast since keyboard and mouse are not used. In this article I will show you how to fill a pdf form with Power Automate Desktop.

Note: the PDF document and other files used in this tutorial can be found in my Power Automate Desktop Github repository.


Table Of Contents:
• Introduction: The Manufacturer's Certificate Of OriginChoose A Fillable PDF FormDownload The iTextSharp DriverDownload The BouncyCastle DriverGet The PDF Text Input Field Names Using PowerShellWrite Values Into PDF Fields Using PowerShellBuild A CSV File With Multiple Lines Of InformationRead Multiple Lines Of Data From A CSV FileSet The Fillable PDF and Driver LocationsLoop Over Each Row In The CSV TableFill A PDF File In Power Automate Desktop Using PowerShellRun The Desktop Flow To Generate Multiple PDFs




Introduction: The Manufacturer’s Certificate Of Origin

Employees at an automaker generate a manufacturer’s certificate of origin for each vehicle produced. The certificate of origin is created by filling-in a PDF template with information and then saving it in read-only mode.




Choose A Fillable PDF Form

Begin by selecting a PDF form for Power Automate Desktop to fill-in. If you would like to follow along with this tutorial the Manufacturer’s Certificate of Origin fillable PDF can be found in my Github repository.



Move the fillable PDF file into a folder where Power Automate Desktop can access it.



In this example the path to the PDF file is shown below. We will need to use the path later on desktop flow to tell Power Automate Desktop where the file is stored.

C:\RPA\FillPDF\CertificateOfOrigin.PDF




Download The iTextSharp Driver

Power Automate Desktop does not have any actions to fill-in a PDF file. However, we can write some custom code and use the Run A PowerShell Script action to fill-in PDFs instead. This method requires us to download two free drivers from the nuget.org website. The 1st driver is called iTextSharp. Go to the nuget.org webpage for iTextSharp and get the package.



Open Windows Explorer and navigate to the folder where iTextSharp downloaded to. Change the file extension from .nupkg to .zip. We are doing this so we can extract the driver from the package.



Locate the itextsharp.dll file inside the package then copy it…



…and paste it into the same folder as the certificate of origin




Download The BouncyCastle Driver

The iTextSharp driver has a dependency on another driver called BouncyCastle. We must download it as well. Go to the nuget.org webpage for BouncyCastle and get the package.



Rename the file extension of the package from .nukpg to .zip



Look for the file called BouncyCastle.Crypto.dll inside the package. Then copy it…



…and paste it into the same folder as the Certificate Of Origin and the iTextSharp driver.




Get The PDF Text Input Field Names Using PowerShell

Now that we have a fillable PDF file and downloaded the necessary drivers we can start writing our script. Our first objective is to use PowerShell to identify the name of each text input field in the PDF. Once we have the names of each text input we write another script to fill them in with values.

We will write our script using the Windows PowerShell ISE app which is installed by default on every Windows 10 and Windows 11 machine. ISE stands for Integrated Scripting Environment. With the Windows PowerShell ISE app you can input commands directly into the console or write scripts, press play to execute them and see the result happen all-at-once.

Note: if you are new to PowerShell I highly recommend checking out this excellent series of beginner tutorials on Youtube by Shane Young.




When the Windows PowerShell ISE app opens click on the New Script icon.



Now we can write some PowerShell code in the right-side of the app to find all of the field names in the Manufacturer’s Certificate Of Origin PDF.

Note: the left-side of the shows the code that was executed once the play button is clicked. We haven’t done this part yet. I wanted to show the end result in advance so you know what to expect if this is your first-time using PowerShell.





Copy and paste this code into the Windows PowerShell ISE script editor. This code opens the fillable PDF, gets the name of every input field, displays the field name in the console and then closes the PDF file. Make sure your drivers and PDF file are stored at location in the script or else we will receive an error.

# Import drivers
Add-Type -Path "C:\rpa\fillpdf\itextsharp.dll"
Add-Type -Path "C:\rpa\fillpdf\BouncyCastle.Crypto.dll"

# Variables for PDF location
$PdfFileInput = "C:\rpa\fillpdf\CertificateOfOrigin.pdf"

# Create PDF Reader Objects
$PdfReader = New-Object iTextSharp.text.pdf.PdfReader($PdfFileInput)

# Display List Of PDF Fields
$PdfFields = $PdfReader.AcroFields.Fields
echo $PdfFields

# Close PDF
$PdfReader.Close()



Press play to run the script. When we do this the PDF text input field names appear in the console.



Save the script as FindPdfFieldNames.ps1 and keep it in a safe place in case we want to re-use it another day.




Write Values Into PDF Fields Using PowerShell

The next step is to write a PowerShell script that will open the PDF, fill-in each field, set the field to read-only so no further edits can be made and them close the PDF file. We want to make sure the PowerShell script runs successfully before using it inside Power Automate. Therefore, we will only generated one filled-in PDF file here with hardcoded field values.



Copy and paste this PowerShell script into the editor.

# Import drivers
Add-Type -Path "C:\rpa\fillpdf\itextsharp.dll"
Add-Type -Path "C:\rpa\fillpdf\BouncyCastle.Crypto.dll"

# Variables for PDF locations
$PdfFileInput = "C:\rpa\fillpdf\CertificateOfOrigin.pdf"
$PdfFileOutput = "C:\rpa\fillpdf\CertificateOfOrigin_1HGCG1659WA030328.pdf"


# Create PDF Reader & Stamper Objects
$PdfReader = New-Object iTextSharp.text.pdf.PdfReader($PdfFileInput)
$PdfStamper = New-Object iTextSharp.text.pdf.PdfStamper($PdfReader, [System.IO.File]::Create($PdfFileOutput))

# Fill PDF Fields With These Values
$PdfFields = @{
 IssueDate = "5/1/2022"
 InvoiceNumber = "103743"
 VehicleIDNumber = "1HGCG1659WA030328"
 ModelYear = "2019"
 Model = "D4500"
 Manufacturer = "Company Name"
}

# Fill Each PDF Field And Set To Read-Only
ForEach ($PdfField in $PdfFields.GetEnumerator()) {
    $PdfStamper.AcroFields.SetField($PdfField.Key, $PdfField.Value)
    $PdfStamper.AcroFields.SetFieldProperty($PdfField.Key, "setfflags", [iTextSharp.text.pdf.PdfFormField]::FF_READ_ONLY, 0)
}

# Close PDF
$PdfStamper.Close()
$PdfReader.Close()



When we press play a new PDF is created at the path defined in the PdfFileOutput variable. Save the script as PdfFill.ps1 and close the Windows PowerShell ISE. We are now done testing our script.




Build A CSV File With Multiple Lines Of Information

Our PowerShell script only produced a single PDF. But we will build our Power Automate Desktop flow to generate multiple PDFs based on data from a CSV file.



Open Windows Notepad and copy & paste these values into a new document.

IssueDate, InvoiceNumber,VehicleIDNumber,ModelYear,Model,Manufacturer 
"5/1/2022","103743","3C63D3GLXCG791583","2021","M4500","Awesome Car Co."
"5/1/2022","103744","JTHBB1BA2B2596944","2022","S3500","Awesome Car Co."
"5/1/2022","103745","1FTEW1CM3BF564277","2021","M4500","Awesome Car Co."
"5/2/2022","103746","NM0KS9BN5BT974992","2022","R3300","Awesome Car Co."
"5/2/2022","103747","2LMHJ5AT5EB108349","2022","S3500","Awesome Car Co."



Save the file as a csv (comma separated values) and place it in the same folder as the other files we’ve created.




Read Multiple Lines Of Data From A CSV File

After much preparation we are finally able to start building our desktop flow. Open Power Automate Desktop and create a new flow called Fill PDF Document.



Add a set variable action with the path to the CSV file.



This is the CSV file path we’ve used in this tutorial.

C:\RPA\FillPDF\inputfile.csv



Then insert a read from CSV file action. Make sure the first line contains column names toggle is on and the separator properties is set to comma.



Use this variable in the file path.

%CsvPath%





The first two steps of the flow should look like this. Press the play button to test them out and make sure the read from CSV step returns a result.




Set The Fillable PDF and Driver Locations

Next we’ll create a few variable to store the fillable PDF and driver locations. Set a new variable called ItextSharpDriverPath…



…and use this path as the value.

C:\RPA\FillPDF\itextsharp.dll




Set another new variable called BouncyCastleDriverPath…



…and use this location as the value.

C:\RPA\FillPDF\BouncyCastle.Crypto.dll



Then, create one more variable called PdfFormInputPath…



…and use this file path as the value.

C:\RPA\FillPDF\CertificateOfOrigin.pdf




Loop Over Each Row In The CSV Table

We want to write one PDF for each row found in the CSV table. Insert a new For Each action into the desktop flow.



Iterate over the variable returned by the read a CSV file action.

%CSVTable%



The desktop flow will look like this once the for each action is added.

At the start of the loop we need to define a unique name for each PDF file name to be output.



Set a new variable called PDFFormOutputPath and use this code to construct a file name which includes the vehicle identification number. Notice that we are able to join multiple text strings together using the + sign operator.

%'C:\\RPA\\FillPDF\\CertificateOfOrigin_' + CurrentItem['VehicleIDNumber'] + '.pdf'%




Fill A PDF File In Power Automate Desktop Using PowerShell

All of our work to this point has been leading to running a PowerShell script in Power Automate Desktop which fills a PDF and saves it in read-only mode. Since we’ve done all of the hard-stuff in advance this step is relatively easy.



Copy & paste this code into the run PowerShell script action. Notice that all of the hard-coded values have been replaced with variables from our desktop flow. Also pay special attention to the syntax we’ve used to do that. All of the string variables are enclosed in quotation-marks as well as the % symbols we are used to in Power Automate Desktop.

# Import drivers
Add-Type -Path "%ItextSharpDriverPath%"
Add-Type -Path "%BouncyCastleDriverPath%"

# Variables for PDF locations
$PdfFileInput = "%PdfFormInputPath%"
$PdfFileOutput = "%PdfFormOutputPath%"

# Create PDF Reader & Stamper Objects
$PdfReader = New-Object iTextSharp.text.pdf.PdfReader($PdfFileInput)
$PdfStamper = New-Object iTextSharp.text.pdf.PdfStamper($PdfReader, [System.IO.File]::Create($PdfFileOutput))

$PdfFields = @{
 IssueDate = "%CurrentItem['IssueDate']%"
 InvoiceNumber = "%CurrentItem['InvoiceNumber']%"
 VehicleIDNumber = "%CurrentItem['VehicleIDNumber']%"
 ModelYear = "%CurrentItem['ModelYear']%"
 Model = "%CurrentItem['Model']%"
 Manufacturer = "%CurrentItem['Manufacturer']%"
}

ForEach ($PdfField in $PdfFields.GetEnumerator()) {
    $PdfStamper.AcroFields.SetField($PdfField.Key, $PdfField.Value)
    $PdfStamper.AcroFields.SetFieldProperty($PdfField.Key, "setfflags", [iTextSharp.text.pdf.PdfFormField]::FF_READ_ONLY, 0)
}

# Close PDF
$PdfStamper.Close()
$PdfReader.Close()



Run PowerShell script was the final action we needed to create. The finished desktop top flow should look like this.




Run The Desktop Flow To Generate Multiple PDFs

Press the play button in Power Automate Desktop and watch the flow in action as it creates multiple PDF files. In our example we instructed the flow to output the files in the same location as our other tutorial files. In a real-world setting it would be better to write the PDF files to a folder named output to keep things more organized.



When we open one of the PDF files we can see the values filled-in and the form is in read-only mode.





Questions?

If you have any questions about Fill PDF Form Fields 5x Faster With Power Automate Desktop please leave a message in the comments section below. You can post using your email address and are not required to create an account to join the discussion.

Matthew Devaney

Subscribe
Notify of
guest
8 Comments
Oldest
Newest
Inline Feedbacks
View all comments
R.Ali
14 days ago

Great article once again Matthew. I have played with iTextSharp quite a bit on PowerShell and C# for projects. One thing that must be kept in my mind when using AcroFields is that PDF forms are essentially of two types. AcroFields will work with old PDF forms whereas the new PDF forms based on XML is called XPDF will not work with the AcroFields object.Therefore, it is crucial to ensure that the form itself is compatible as well.

R.Ali
13 days ago

I mistyped the type of form (it is not XPDF but XFA PDF forms in case someone wants to search it further. Older forms are called AcroForms).

Because XFA is just XML so you can use any XML package, PowerShell native capabilities, or .NET XML classes that let you work with XML packages. The actual XML schema can be found on the Adobe website if you search for it on the Web.

iTextSharp, from when I worked it last, did provide XFA manipulation capabilities.

One of the major problems that made me get into this was that the multiple lines of text field in the PDF Form (AcroForm) would not gracefully expand to accommodate text if it extended the boundary of the control. The text would just shrink in font size (even with auto expand the output was not very desirable for a form at an enterprise level). After some research, I found that XFA PDF forms would resolve it and they did.

It would be great to know how Encodian does as you have suggested in response to one of the comments below. Encodian does have a tried and test way for a product that they sell so it would be great to figure out how Encodian handles the different type, or if they work with the PDF specification natively.

Thomas van der Waard
Thomas van der Waard
13 days ago
Reply to  R.Ali

Encodian uses Aspose (www.aspose.com)

R.Ali
12 days ago

Thank you Thomas.

Thomas van der Waard
Thomas van der Waard
14 days ago

Would there also be a way to achieve this entirely in the cloud. Without PA Desktop unattended and the use of third party connectors.
Something like that the fillable PDF form is stored in OneDrive or SharePoint, grab it from there and populate this with data and store somewhere in the cloud again. Should I then consider creating an Azure Function? Or any other ideas?

George Winters
George Winters
12 days ago

Bravo!