
Using Cognitive Services and DocMan for automatic Document Recognition

   Words by CRM Consultancy

   on 05/10/2018 15:00:00

Tracking Metadata for your Documents in SharePoint is invaluable: it allows you to see key facts about a Document at a glance, control Document Workflow and make searching for Documents quicker and easier.

On the other hand, remembering to manually set the Metadata for a Document in CRM or Word is a chore.

And as with any chore, the extra effort means it often gets skipped, and the benefits are lost.

However, Azure Cognitive Services allows us to read the Content of a Document on upload. DocMan can then use this to flag key facts about the Content and pre-populate the Metadata for the Document. This removes the manual intervention and helps automate the link between CRM and SharePoint.

STEP 1 – Define our Cognitive Services Handler

First of all, we browse to the DocMan Configuration Area in Dynamics and add a new Cognitive Services Handler:

Viewing the DocMan Configuration Area in Dynamics 365

Here we supply the Type of Handler we are adding, our Azure Cognitive Services API Key and the Region we are invoking in the Microsoft Cloud.

Configuring our new Handler to connect Azure Cognitive Services to DocMan
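
To make the three settings concrete, here is a minimal sketch of how an API Key and Region are typically turned into a Cognitive Services endpoint and request header. The `HandlerConfig` class and the handler type label are hypothetical names for illustration only. The sketch does not describe how DocMan stores or uses these values internally, and it assumes the common regional host pattern (some Azure resources use a custom subdomain instead).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlerConfig:
    """Hypothetical stand-in for the three values entered on the Handler form."""
    handler_type: str  # illustrative label, e.g. "Text"
    api_key: str       # Azure Cognitive Services subscription key
    region: str        # Azure region name, e.g. "westeurope"

    def endpoint(self) -> str:
        # Assumed regional host pattern for Cognitive Services.
        return f"https://{self.region}.api.cognitive.microsoft.com"

    def headers(self) -> dict:
        # Azure reads the subscription key from this request header.
        return {"Ocp-Apim-Subscription-Key": self.api_key}


config = HandlerConfig(handler_type="Text", api_key="<your-api-key>", region="westeurope")
print(config.endpoint())
```

If the Region does not match the one the API Key was issued for, Azure will reject the request, so these two values need to be entered as a pair.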

STEP 2 – Specify how the Handler should match Uploads to a Content Type

We can add Rules to our Handler that define what Content to look for in a Document or Image, and which Content Type to set on upload when that Content is found.

This Ruleset then governs the automatic population of the SharePoint Content Type.

The most obvious example is to add a new Rule and set the following:

Cognitive Term (to look for): Proposal

Cognitive Operation: Find

Cognitive Strength: 5

What is Cognitive Strength? A Document will often mention several Terms, and we do not want an Invoice being uploaded with the Content Type Contract just because the Invoice Description references 'Contract' once. So each Rule carries a Cognitive Strength that defines how strongly each occurrence of its Term counts. This is then compared against any other Content Type Rules we have in place when determining the final Content Type for the Document.
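
The idea can be sketched in a few lines. This is an illustration only, and it assumes a simple scheme where each occurrence of a Term adds its Strength to the score of the matching Content Type and the highest total wins. DocMan's actual scoring is not documented here and may differ. The `Rule` class and `pick_content_type` function are hypothetical names.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    term: str          # Cognitive Term to look for
    strength: int      # Cognitive Strength: weight of each occurrence
    content_type: str  # Content Type this Rule counts towards


def pick_content_type(text, rules):
    """Return the Content Type with the highest weighted score, or None."""
    scores = defaultdict(int)
    for rule in rules:
        # Count whole-word, case-insensitive occurrences of the Term.
        pattern = r"\b" + re.escape(rule.term) + r"\b"
        hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
        scores[rule.content_type] += hits * rule.strength
    # On a tie, the Content Type of the earliest Rule is kept.
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None


rules = [
    Rule(term="Invoice", strength=5, content_type="Invoice"),
    Rule(term="Contract", strength=5, content_type="Contract"),
]
text = "Invoice 1042. Invoice date 05/10/2018. Services delivered under contract."
print(pick_content_type(text, rules))
```

Here the single mention of 'contract' scores 5, while the two mentions of 'Invoice' score 10, so the Document is classified as an Invoice.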

Defining the Rule that identifies the Document as a Proposal based on the Content

This completes our Handler and tells it how to identify the Content Type from the Content in the Document (or Image if using the Image Processor, or Handwriting if using the OCR Processor).

Viewing the List of Rules for our Handler to identify the SharePoint Content Type of a Document on upload

STEP 3 – Attach the Handler to our intended Entity in CRM

We work with various Entities in Dynamics, and we can control which Handler(s) process uploads to the Document Area for each Entity.

So we add the Entity or Entities to the Handler definition to specify that an upload to this Entity will invoke this Handler and the Rules for the Handler:

Defining the behaviour of the Case Entity in DocMan and ensuring we have our New Handler attached to implement our Rules on Document Upload

At this point, we can also add further Rules to the Handler, Content Type or Entity to specify how Content found or scored in the Document is mapped to Fields in CRM or Metadata in SharePoint. Each Rule can be scoped to a Content Type or Entity, or the scope can be left blank so that the Rule is always invoked for the Handler.
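
The scoping described above can be pictured as a simple filter. The following is a sketch under our own assumptions, not DocMan's implementation: `MappingRule`, `applicable_rules` and the "Case Reference" field are hypothetical, and a blank scope is modelled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MappingRule:
    source: str                         # what to take from the Document Content
    target_field: str                   # CRM Field or SharePoint Metadata column
    content_type: Optional[str] = None  # None = not limited to a Content Type
    entity: Optional[str] = None        # None = not limited to an Entity


def applicable_rules(rules, content_type, entity):
    """Keep Rules whose scope is blank or matches this upload."""
    return [
        r for r in rules
        if r.content_type in (None, content_type) and r.entity in (None, entity)
    ]


rules = [
    MappingRule("invoice amount", "Invoice Amount", content_type="Invoice"),
    MappingRule("case reference", "Case Reference", entity="Case"),
    MappingRule("data classification", "GDPR Data Classification"),
]
for rule in applicable_rules(rules, content_type="Proposal", entity="Case"):
    print(rule.target_field)
```

For a Proposal uploaded to a Case, the Invoice-only Rule is skipped, while the Case-scoped Rule and the unscoped Rule both apply.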

STEP 4 – See it in action

We can then browse to our Entity in Dynamics and upload a Document or Image to see this in action.

So if we take our Case Entity as an example: we drag 3 Documents into the Documents Panel of the Case, and then:

  • The Documents are uploaded to SharePoint
  • Cognitive Services reads the Content from each Document
  • DocMan interprets this Content into a Content Type, Metadata and any updates to CRM
  • The Documents are stored in SharePoint with the right Content Type and Metadata automatically populated
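
The order of these steps can be sketched as a small pipeline. Everything here is a stand-in for illustration: the `store` dictionary plays the part of the SharePoint library, and `fake_read` and `fake_interpret` are hypothetical placeholders for Cognitive Services and DocMan's Rules respectively.

```python
def process_upload(name, data, store, read_content, interpret):
    """Run one dropped file through the four steps in order."""
    # 1. Upload the Document.
    store[name] = {"data": data, "content_type": None, "metadata": {}}
    # 2. Read the Content from the Document.
    text = read_content(data)
    # 3. Interpret the Content into a Content Type and Metadata.
    content_type, metadata = interpret(text)
    # 4. Store the result against the uploaded Document.
    store[name].update(content_type=content_type, metadata=metadata)
    return store[name]


# Stand-ins for the real services, purely for illustration.
def fake_read(data):
    return data.decode("utf-8")


def fake_interpret(text):
    content_type = "Invoice" if "invoice" in text.lower() else "Document"
    return content_type, {"Length": len(text)}


library = {}
result = process_upload("inv-1042.txt", b"Invoice for services", library, fake_read, fake_interpret)
print(result["content_type"], result["metadata"])
```

Passing the reader and interpreter in as functions keeps the ordering of the steps separate from whichever services actually perform them.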

Our Documents uploaded to the Case with the Content Type and Metadata populated automatically – as if by magic!

In the above example, we can see how the Invoice Amount and GDPR Data Classification flags have also been automatically populated on upload of the relevant document – this is done by adding additional Rules to the Content Type to populate other Fields from the Document Content.

We will see this in more detail in the next DocMan Article on Automatic Metadata Population.

