FAQs for Document Matching

Frequently asked questions for DataSnipper's Document Matching

Are you able to add documents later and re-run Document Matching?

Yes. Add the documents to the relevant folders in your Document Organizer, then re-run Document Matching. Note that DataSnipper overwrites any data in the output columns, so any edits you have made to those columns will be lost.

Can you perform multiple Document Matching tests in one tab?

Yes. Overwrite the first Document Matching setup with the new one. Provided the new output is written to a different location, you will not lose any of the work from the first run. You cannot, however, save a template that contains more than one Document Matching setup.

Does Document Matching take the name of the supporting documents into account?

Document Matching can match on the name of the file. When it does, it creates a snip on a blank area of the document to serve as the reference.

How does Document Matching find the correct item to snip (i.e. if there are multiple positive matches)?

DataSnipper builds a unique search combination from your inputs and attempts to snip the items that give the best collective match. The ranking requires that the snips be on the same document and favors snips that are as close together as possible.
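The two ranking rules described above (same document, snips as close together as possible) can be illustrated with a short Python sketch. This is not DataSnipper's implementation. The hit format, the function name best_combination, and the choice to count one page as 1000 position units are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical hit format: (document name, page number, vertical position on page)
PAGE_HEIGHT = 1000  # assumed units per page, only used to compare distances


def best_combination(hits_per_input):
    """Pick one hit per input, all on the same document, with the smallest spread."""
    best, best_spread = None, None
    for combo in product(*hits_per_input):
        documents = {doc for doc, _, _ in combo}
        if len(documents) != 1:
            continue  # rule 1: every snip must be on the same document
        positions = [page * PAGE_HEIGHT + y for _, page, y in combo]
        spread = max(positions) - min(positions)  # rule 2: closer together is better
        if best_spread is None or spread < best_spread:
            best, best_spread = combo, spread
    return best


# Two inputs (e.g. invoice number and amount), each found in two documents
hits = [
    [("inv1.pdf", 1, 100), ("inv2.pdf", 1, 120)],
    [("inv1.pdf", 3, 400), ("inv2.pdf", 1, 150)],
]
print(best_combination(hits))
```

In this example both documents contain both inputs, but the hits in inv2.pdf sit on the same page close to each other, so that combination is preferred. A combination spread over several pages of one document is still valid; it simply ranks lower than a tighter one.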

If the supporting documents (e.g. invoices) are 2-3 pages long, will Document Matching still pick up the correct inputs (e.g. let’s say the amount is located on the third page of the document)?

Yes, in general. Document Matching uses an algorithm that ranks multiple hits, and one of the main ranking criteria is how close together all the snips are on the page.

How should you review the results of Document Matching?

Document Matching outputs exact matches (or approximate matches, if advanced settings have been used) as Text Snips. Each Text Snip is cross-referenced to the source of the extraction, so the user can select it and view the supporting reference. The user should therefore review each snip to verify that the supporting evidence is appropriate.

Are you able to incorporate a rounding / variance allowance in Document Matching?

Yes. In step 3 of the Document Matching setup, select the three dots next to any input or output to view its advanced settings. For inputs, you can allow a variance (threshold deviation), expressed either as a value or as a percentage difference.
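As a rough illustration of what a value or percentage allowance means, the following Python sketch accepts a found amount when it deviates from the expected amount by no more than an absolute value or a percentage. The function name within_tolerance and its parameters are hypothetical and do not correspond to DataSnipper settings.

```python
def within_tolerance(expected, found, abs_tol=0.0, pct_tol=0.0):
    """Return True if found deviates from expected by at most abs_tol or pct_tol percent."""
    difference = abs(expected - found)
    if difference <= abs_tol:
        return True  # within the absolute (value) allowance
    # Percentage allowance is measured relative to the expected amount
    return expected != 0 and difference <= abs(expected) * pct_tol / 100


print(within_tolerance(100.00, 100.04, abs_tol=0.05))  # rounding difference
print(within_tolerance(1000, 1010, pct_tol=2))         # 1% off, 2% allowed
print(within_tolerance(1000, 1030, pct_tol=2))         # 3% off, 2% allowed
```

With both allowances left at zero, only an exact amount is accepted.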

What is the difference between Partial Matching and Fuzzy Text Matching?

Partial matching: Matches on parts of the input. It does not account for spelling differences and abbreviations. Example: Matching “Jenny” to “Jenny Lam”.

Fuzzy matching: Matches similar results, including spelling differences and abbreviations. Example: Matching “Jenni Lem” to “Jenny Lam”.
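The difference can be sketched in Python using the standard library's difflib. This is only an illustration of the two ideas, not how DataSnipper implements them. The function names, the word-level definition of a partial match, and the 0.75 similarity threshold are assumptions.

```python
from difflib import SequenceMatcher


def partial_match(search_input, candidate):
    """Partial: at least one whole word of the input appears verbatim in the candidate."""
    candidate_words = candidate.lower().split()
    return any(word in candidate_words for word in search_input.lower().split())


def fuzzy_match(search_input, candidate, threshold=0.75):
    """Fuzzy: the two full strings are similar enough (threshold is an assumed value)."""
    ratio = SequenceMatcher(None, search_input.lower(), candidate.lower()).ratio()
    return ratio >= threshold


print(partial_match("Jenny", "Jenny Lam"))      # part of the text, spelled identically
print(partial_match("Jenni Lem", "Jenny Lam"))  # no word is spelled identically
print(fuzzy_match("Jenni Lem", "Jenny Lam"))    # similar overall despite misspellings
```

Under these assumed rules, the partial check tolerates missing words but not misspellings, while the fuzzy check tolerates misspellings but expects the whole input to be present.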

When would a user need to use Partial Matching?

Partial matches are useful when you have a longer input for which no exact match can be found in the supporting materials. In this case it may be useful to look for partial matches, in case portions of the input are present in the document. Example: Searching for “Jenny Lam”, which does not exist, when “Jenny” is present on the document. The partial hit is still useful to extract.

Why is strict matching the default if Partial Matching is beneficial?

Although partial matching is useful when exact matches cannot be found, an exact match is still the best-case scenario and is therefore the default. If the user is willing to accept a higher risk of invalid matches in exchange for flexibility, they can turn on partial matching.

How do Partial Match and Fuzzy Text Match interact together?

Partial matching together with Fuzzy matching: Matches on parts of the input and accounts for spelling differences and abbreviations. Example: Matching “Jenni” to “Jenny Lam”.
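One way to picture the combination is a word-by-word similarity check, sketched below in Python with the standard library's difflib. The function name partial_fuzzy_match, the word-level comparison, and the 0.75 threshold are assumptions for illustration and do not describe DataSnipper's actual algorithm.

```python
from difflib import SequenceMatcher


def partial_fuzzy_match(search_input, candidate, threshold=0.75):
    """True if any word of the input is similar to any word of the candidate."""
    for input_word in search_input.lower().split():
        for candidate_word in candidate.lower().split():
            similarity = SequenceMatcher(None, input_word, candidate_word).ratio()
            if similarity >= threshold:  # assumed cut-off for "similar enough"
                return True
    return False


print(partial_fuzzy_match("Jenni", "Jenny Lam"))
print(partial_fuzzy_match("Smith", "Jenny Lam"))
```

Here “Jenni” is only part of the name and is also misspelled, so it needs both behaviors at once: the word-level comparison supplies the partial aspect and the similarity ratio supplies the fuzzy aspect.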