Document Classification With Power Automate And Azure AI
Power Automate can classify documents with the help of Azure AI. In document classification, AI reviews a document and determines what type of document it is. The result also includes a confidence score. Once a document is classified we can tag it with the correct document type in SharePoint. We could also use the result to choose the correct data extraction AI model for that document.
Table of Contents
• Introduction: The Document Library With Automated Document Classification
• Open Azure AI Document Intelligence Studio
• Setup A New Project For The Custom Classification Model
• Label Sample Invoices With The Document Type
• Train The Document Classification Model
• Test The Document Classification Model
• Create A SharePoint Document Library For Document Classification
• Trigger A Power Automate Flow When A File Is Added
• Classify A Document Using The HTTP Action
• Obtain The Endpoint And The Subscription Key
• Get The Document Classification Result
• Update The Document Type And Confidence Score File Properties
• Run The Power Automate Flow To Classify A Document
Introduction: The Document Library With Automated Document Classification
Workers at an insurance company receive several types of documents and upload them to a SharePoint library. Once a document is uploaded, Azure AI Document Intelligence classifies it and writes its document type and confidence score to the file’s metadata in SharePoint.
Open Azure AI Document Intelligence Studio
Go to Azure AI Document Intelligence Studio and choose the custom classification model option. Then select Create a project.
Setup A New Project For The Custom Classification Model
To start a new project we must enter several details into a project setup wizard. On the Enter project details screen name the project Form Recognizer Tutorial.
Choose an Azure subscription, resource group and Form Recognizer Resource. If none exist, create new ones. Select the API version 2023-07-31 (3.1 General Availability).
Pick an existing storage account and blob container or create new ones. Leave the folder path blank.
Press the Create Project button to exit the project setup wizard.
The new project will appear in the My Projects menu. Open the Form Recognizer Tutorial Project.
Label Sample Invoices With The Document Type
The Form Recognizer Tutorial project opens to the Label data screen. On this screen we must upload a set of invoices. Download the invoices in this GitHub repository for use in this tutorial. Then drag and drop Contoso Invoices #1-5 into the file upload area.
Add a new document type named Contoso Invoice.
Then select each invoice one by one in the drag and drop area and select Contoso Invoice from the dropdown menu that appears. When we do this the invoice name appears under the document type in the right-side menu.
Also upload Adatum invoices #1-5, add a new type called Adatum Invoice, and label each document so it appears in the right side menu.
Train The Document Classification Model
Now that we have labelled all of the invoices it's time to train our document classification model. Select the Train button in the upper right corner of the Label data screen.
Assign the Model ID aibuilderinvoices to the classification model. Check the confirmation box and press the Train button.
After a few moments the aibuilderinvoices model appears in the Models tab.
Test The Document Classification Model
We can test the document model before we use it in Power Automate. Go to the Test tab and upload the invoice file named Adatum6.pdf. Press the Run analysis button. The document is correctly classified as an Adatum Invoice with a confidence score of 41.70%.
Create A SharePoint Document Library For Document Classification
The automation we will build requires a SharePoint document library. Create a new library named FormRecognizerTutorial with the following columns:
- Name (text)
- Document Type (text)
- Confidence Score (number)
Trigger A Power Automate Flow When A File Is Added
When an invoice is uploaded to a document library we want to start an automated flow to classify the document. Create a new Power Automate flow and select SharePoint – When a file is created (properties only) as the trigger.
Then add a SharePoint – Get file content using path action to get the file we will perform document classification on.
Classify A Document Using The HTTP Action
Document classification is not included as a standard action in Power Automate. To use the classification model we created in Document Intelligence Studio we need to add an HTTP action and choose the POST method.
For reference, see the Azure AI Document Intelligence REST API documentation for classifying a document.
Use the following URI. Input the modelName as aibuilderinvoices. We will learn how to find the proper value for endpoint in a moment.
Apply the following Headers. We will also learn where to find the Key shortly.
Write this code in the Body of the HTTP request to send the file content in Base 64 format. Alternatively, we could supply a link to a document and replace base64Source with urlSource.
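As a rough illustration of the URI, headers and body described above, here is a minimal Python sketch of the same request. It assumes the v3.1 REST route shape (`formrecognizer/documentClassifiers/{modelName}:analyze`) and the `Ocp-Apim-Subscription-Key` header, so verify both against the API documentation. The endpoint and key are placeholders, and `build_classify_request` is a hypothetical helper, not part of any SDK.

```python
import base64
import json

API_VERSION = "2023-07-31"  # the API version selected in the project wizard


def build_classify_request(endpoint: str, key: str, model_name: str, file_bytes: bytes) -> dict:
    """Return the method, URI, headers and body for the classify call (hypothetical helper)."""
    base = endpoint.rstrip("/")
    uri = (
        f"{base}/formrecognizer/documentClassifiers/"
        f"{model_name}:analyze?api-version={API_VERSION}"
    )
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,  # assumed header name for the subscription key
    }
    # The file content travels as a Base64 string in the base64Source property.
    body = json.dumps({"base64Source": base64.b64encode(file_bytes).decode("ascii")})
    return {"method": "POST", "uri": uri, "headers": headers, "body": body}


request = build_classify_request(
    "https://example-resource.cognitiveservices.azure.com/",  # placeholder endpoint
    "<subscription-key>",  # placeholder key
    "aibuilderinvoices",
    b"%PDF-1.7 sample bytes",  # stand-in for the SharePoint file content
)
print(request["uri"])
```

In the flow, the same three pieces go into the URI, Headers and Body fields of the HTTP action. To send a link instead of file content, the body would carry urlSource in place of base64Source.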
Obtain The Endpoint And The Subscription Key
We need the endpoint and the subscription key for use in the HTTP action. Go to portal.azure.com and search for Document Intelligences.
Choose the FormRecognizerTutorial resource we set up earlier in the tutorial.
Go to the Keys and Endpoint tab. Copy the endpoint into the URI of the HTTP action and the key into its Headers.
Get The Document Classification Result
When we run the HTTP POST action to classify the document it does not immediately return a response. Instead, it places the document into a processing queue. We must retrieve the result in another action by using the Result ID.
Create a new Data Operations – Compose action to store the Result ID.
Use this code in the Compose action.
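To show the idea behind this step outside of Power Automate, here is a small Python sketch. It assumes the POST response includes an Operation-Location header whose last path segment is the Result ID. The sample URL and ID are made up for illustration, and `result_id_from_operation_location` is a hypothetical helper.

```python
from urllib.parse import urlparse


def result_id_from_operation_location(operation_location: str) -> str:
    """Return the last path segment of the Operation-Location URL (assumed to be the Result ID)."""
    path = urlparse(operation_location).path  # the query string is not part of the path
    return path.rstrip("/").rsplit("/", 1)[-1]


# Made-up header value in the assumed shape.
sample_header = (
    "https://example-resource.cognitiveservices.azure.com/formrecognizer/"
    "documentClassifiers/aibuilderinvoices/analyzeResults/"
    "3b31320d-8bab-4f88-b19c-2322a7f11034?api-version=2023-07-31"
)
result_id = result_id_from_operation_location(sample_header)
print(result_id)
```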
Add a Schedule – Delay action to the flow. Make the flow wait 10 seconds for the document to be processed.
Then add another HTTP action to the flow but this time make it a GET action to retrieve the result.
For reference, see the Azure AI Document Intelligence REST API documentation for retrieving a classification result.
Use the following code in the URI field. Notice that the Result ID is supplied to the analyze results endpoint.
Supply the same Headers as the HTTP POST action.
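The delay-then-GET pattern can be sketched in Python as follows. This is a variation, not the flow itself: rather than a single fixed 10 second wait, it repeats the wait until the service reports a terminal status. The result route shape (`analyzeResults/{resultId}`) and the status values are assumptions to check against the API documentation, and `fetch` is a stand-in for the real HTTP GET so the example runs without a network.

```python
import time
from typing import Callable

API_VERSION = "2023-07-31"


def build_result_uri(endpoint: str, model_name: str, result_id: str) -> str:
    """Build the GET URI for the classification result (assumed route shape)."""
    base = endpoint.rstrip("/")
    return (
        f"{base}/formrecognizer/documentClassifiers/{model_name}"
        f"/analyzeResults/{result_id}?api-version={API_VERSION}"
    )


def wait_for_result(
    fetch: Callable[[], dict],
    delay_seconds: float = 10.0,
    max_attempts: int = 6,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Wait, call fetch(), and repeat until the status is succeeded or failed."""
    for _ in range(max_attempts):
        sleep(delay_seconds)
        result = fetch()
        if result.get("status") in ("succeeded", "failed"):
            return result
    raise TimeoutError("classification result was not ready in time")


# Stand-in for the HTTP GET: the first call is still running, the second is done.
responses = iter([{"status": "running"}, {"status": "succeeded"}])
final = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

In Power Automate, a comparable loop could be built with a Do until control around the Delay and HTTP GET actions, at the cost of a slightly longer flow.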
Update The Document Type And Confidence Score File Properties
The HTTP GET action returns the document type, the confidence score and other useful information about the document. We want to write the document type and the confidence score to the file’s metadata in SharePoint.
Add two Data Operations – Compose actions to the flow.
Use this code in the first action to get the Document Type.
Write this code in the second action to get the Confidence Score.
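For readers who want to see where these two values sit in the response, here is a Python sketch. It assumes the result JSON places classified documents under `analyzeResult.documents`, each with `docType` and `confidence` properties. The sample payload is hand-written for illustration and only mirrors the test run earlier in the article; `top_classification` is a hypothetical helper.

```python
from typing import Tuple


def top_classification(result: dict) -> Tuple[str, float]:
    """Return the docType and confidence of the highest-confidence document."""
    documents = result["analyzeResult"]["documents"]
    best = max(documents, key=lambda doc: doc["confidence"])
    return best["docType"], best["confidence"]


# Hand-written sample in the assumed response shape.
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "documents": [
            {"docType": "Adatum Invoice", "confidence": 0.417},
        ]
    },
}
doc_type, confidence = top_classification(sample_result)
print(doc_type, f"{confidence:.2%}")
```

The confidence arrives as a fraction between 0 and 1, so it needs to be multiplied by 100 if the SharePoint column should show a percentage.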
Finally, add a SharePoint – Update file properties action to the flow and supply the document type and confidence score.
Run The Power Automate Flow To Classify A Document
We are now done building the flow. Perform a test run on the flow and place the document Adatum Invoice #6 in the document library.
After a few moments the document is updated with a document type and a confidence score.