Which Search Process Should I Use?
To search the database for duplicate constituents, you can use the Full Search and Incremental Search Processes found on the Duplicates page in Constituents, or you can create a custom SQL Server Integration Services (SSIS) search package. Consider the pros and cons of both methods to determine which is better for your organization.
Pros:
-
Because these searches are built into the program, they do not require special technical knowledge to configure and customize. The options provided in the program allow you to easily select which fields to match on, which types of constituents to include, and how to filter the results. You can also adjust the match confidence thresholds. Some of these options are also available with the SSIS package, but others must be set up manually or are not possible to use at all.
Cons:
-
These searches rely on the soundex algorithm to create groups of constituents to compare. Soundex is based on English pronunciation so it may not be a good fit for organizations that have lots of constituents with non-English names.
Furthermore, there may be a small percentage of duplicates that are not found through the full or incremental search processes because misspellings that are phonetically dissimilar are not recognized by the soundex phonetic algorithm. Typically, this is not a problem because most misspellings do not change the beginning phonetic sound of the name. But in some cases, it could be an issue. For example, two constituent records with the same address and the last names “Tanner” and “Manner” will not be flagged as duplicates because the phonetic sound of the last name is too dissimilar. If you encounter this issue frequently, you may consider using the SSIS search processes instead of, or in addition to, the full and incremental searches. The SSIS search processes do not use the soundex component.
-
The full and incremental searches run on the application and database servers. The additional load on the application server may result in decreased performance for other processes running at the same time.
-
The search algorithm only finds matches based on key names and constituents of the same type. For example, it would not match an Individual with an Organization.
Pros:
-
You can select which fields to match on, even including those that are not available in the full and incremental search processes.
-
The search algorithm can match constituents of different types. For example, it can match an Individual with an Organization.
-
The SSIS package uses a fuzzy matching algorithm that is not language-dependent like soundex. This method may provide better results for your organization. For more information about the benefits of SSIS fuzzy matching, see http://msdn.microsoft.com/msdnmag/issues/05/09/SQLServer2005/default.aspx.
-
The SSIS package runs on the database server so there is no additional load to the application server.
Cons:
-
The SSIS package requires significant technical knowledge to implement. You must know how to set up and schedule SSIS packages, and how to use SQL Server.