Data Matching | Maillie LLP

Data Matching

Most organizations are reactive when it comes to fraud; however, there are steps an organization can take to identify and address potential fraud risks proactively. Many frauds can be detected in a timely manner by “drilling down” into the financial data to analyze suspicious transactions. Specialized data analytics software can be used to discover indicators of fraud.


Some of the hardest-to-detect frauds are created by collusion, defined as a conspiracy by two or more persons to cheat or deceive others. These methods of fraud are difficult to detect because most internal control systems are based on the principle of preventing a single individual from committing fraud. Collusion can result in the bypass of internal controls such as approvals and authorizations, often making segregation of duties meaningless.

Establishing a link between two individuals, such as two employees, two vendors, or a vendor and an employee, can help an organization identify previously unknown relationships that could be red flags for potential collusion. Although it is nearly impossible to obtain enough data about your employees and vendors to establish all potential links and relationships, readily available data that can be used for this purpose includes names, addresses, phone numbers and bank account information.
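As a rough illustration of linking on the more structured fields, the sketch below looks for phone numbers or bank account numbers that appear in both an employee list and a vendor list. The record layout, the field names (`id`, `phone`, `bank_account`) and the sample values are hypothetical, and the sketch assumes numeric account numbers; it is not the method of any particular data analysis product.

```python
from collections import defaultdict

def digits_only(value):
    """Keep only digits so '(555) 010-2000' and '555.010.2000' compare equal."""
    return "".join(ch for ch in value if ch.isdigit())

def find_shared_values(employees, vendors, fields=("phone", "bank_account")):
    """Return (field, employee_id, vendor_id) for every value shared by both lists."""
    links = []
    for field in fields:
        # Index employees by the cleaned-up value of this field.
        index = defaultdict(list)
        for emp in employees:
            key = digits_only(emp.get(field, ""))
            if key:
                index[key].append(emp["id"])
        # Look up each vendor's value in the employee index.
        for ven in vendors:
            key = digits_only(ven.get(field, ""))
            for emp_id in index.get(key, []):
                links.append((field, emp_id, ven["id"]))
    return links

# Hypothetical sample data for illustration only.
employees = [
    {"id": "E1", "phone": "(555) 010-2000", "bank_account": "000111222"},
    {"id": "E2", "phone": "555-010-3000", "bank_account": "000333444"},
]
vendors = [
    {"id": "V1", "phone": "555.010.2000", "bank_account": "000999888"},
    {"id": "V2", "phone": "555-010-4000", "bank_account": "000333444"},
]

links = find_shared_values(employees, vendors)
print(links)
```

Phone and bank account numbers have few legitimate spelling variations once punctuation is removed, so an exact comparison is usually enough for them. Names and addresses are harder to compare.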

Using a complete vendor and employee list, data analysis software can perform sophisticated matches to identify any potential links between the data sets. Although the match process does not sound inherently complex, variances in spelling and abbreviation, which are especially prevalent in names and addresses, make traditional exact matching difficult. For example, one record might abbreviate a street address with “Ave” while another spells out “Avenue”, causing the pair of addresses not to match in a traditional matching attempt.
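The abbreviation problem can be shown in a few lines. The sketch below compares two spellings of the same address, first exactly and then after a simple clean-up step. The abbreviation table and the `normalize_address` helper are illustrative assumptions, not a complete postal standard; a real table would need care with ambiguous tokens such as "St", which can mean "Street" or "Saint".

```python
import re

# Illustrative abbreviation table (an assumption, far from complete).
ABBREVIATIONS = {"ave": "avenue", "blvd": "boulevard", "st": "street", "rd": "road"}

def normalize_address(address):
    """Lowercase, replace punctuation with spaces, and expand a few abbreviations."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)

a = "322 Hometown Ave, Troy, MI 48945"
b = "322 Hometown Avenue, Troy, MI 48945"

print(a == b)                                        # False: exact match fails
print(normalize_address(a) == normalize_address(b))  # True: same address after clean-up
```

Normalization of this kind only handles the variations someone thought to list in advance. Misspellings and split words still slip through.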

To solve this problem, data analysis software provides a process called probabilistic record linkage, usually referred to as “fuzzy matching”, which involves determining how statistically close two sets of data are to each other. Data pairs with probabilities over a certain threshold are considered matches. In theory, fuzzy matching will assign a highly significant relationship link between two sets of data if the only difference is in spelling or abbreviation, yet still exclude data that have little or no relationship to each other. Here are some examples of fuzzy matching probability results:


Fuzzy Match Probability

Pair 1
Rodriques, Manny | 322 Hometown Ave, Troy, MI 48945
Rodriquez, Manilo Jr. | 322 Home Town Avenue, Troy, MI 48946

Pair 2
Morgan, Elizabeth | 1183 Grandview Boulevard, Portland, ME 18907
Morgan, Beth | 1183 Grantville Blvd, Portland, ME 18907
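To give a feel for threshold-based matching, the sketch below scores the example pairs above with `difflib.SequenceMatcher` from the Python standard library. This is a simple string-similarity ratio standing in for the idea. It is not a true probabilistic record linkage model, and it is not the algorithm used by any specific data analysis software. The 0.8 threshold and the helper names are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0.0-1.0 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def is_match(a, b, threshold=0.8):
    """Treat a pair as a potential link when its score reaches the threshold."""
    return similarity(a, b) >= threshold

# Addresses and names taken from the example pairs in the text.
addr_1a = "322 Hometown Ave, Troy, MI 48945"
addr_1b = "322 Home Town Avenue, Troy, MI 48946"
addr_2a = "1183 Grandview Boulevard, Portland, ME 18907"
addr_2b = "1183 Grantville Blvd, Portland, ME 18907"

pairs = [
    ("Rodriques, Manny", "Rodriquez, Manilo Jr."),
    (addr_1a, addr_1b),
    ("Morgan, Elizabeth", "Morgan, Beth"),
    (addr_2a, addr_2b),
    (addr_1a, addr_2a),  # unrelated addresses, expected to score low
]

for left, right in pairs:
    score = similarity(left, right)
    print(f"{score:.2f}  match={is_match(left, right)}  {left!r} vs {right!r}")
```

Where the threshold sits is a judgment call. A lower value surfaces more potential links for review but also more false positives, while a higher value may miss nicknames such as "Beth" for "Elizabeth".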

It is important to remember that, even if a link is established between two individuals within the organization’s data, it does not automatically mean collusion is present. Establishing links in the data only alerts management to a relationship requiring further investigation to determine whether an actual conflict is present. Many organizations employ related individuals in different capacities that do not involve supervisory relationships and would rarely create a conflict. Companies may also decide to contract with a vendor that is owned by a family member of an employee, as long as the employee is not involved in the purchasing decision. Each organization should establish its own set of rules for what relationships it will and will not tolerate and then attempt to identify any relationships that violate these rules.

Maillie LLP can help you use data analytics, including fuzzy data matching, to analyze your financial data for potential red-flag indicators of fraud. Contact us today for more information.
