Adding a duplicate search rule
Glossary Item Box
Introduction
Deduplication process uses the rules created as procedures and stored in bpm’online. When a rule is being executed, it populates the ContactDuplicateSearchResult table with a list of duplicates. Search results are grouped by rules and displayed on the duplicate page. Learn more about this functionality in the “Finding and merging duplicates” documentation.
To add a duplicate rule:
1. Add a column to the object schema (if needed). The column value will be used for the search of duplicates.
2. Add the stored search procedure to the application database.
3. Register the stored procedure as a new rule.
Case description
When launching the duplicate search process, the contacts with the same [Taxpayer ID] column values should be considered duplicates and should be displayed as search result.
Case implementation algorithm
1. Adding a field whose value will be used for the duplicate search
Since the [Taxpayer ID] field is not available on a standard contact edit page, add the field to the page (e.g., via a section wizard). For more information about adding fields to edit pages see the “Adding a new field to the edit page” article.
The new field properties:
- [Title] – “Taxpayer ID”
- [Name in DB] – “UsrInn”
2. Adding the stored search procedure to the application database
Add the stored duplicate search procedure by the [Taxpayer ID] field to the database. To do so, execute the following SQL script:
-- Verifying if the stored tsp_FindContactDuplicateByInn procedure is available. IF NOT OBJECT_ID('[dbo].[tsp_FindContactDuplicateByInn]') IS NULL BEGIN -- Deleting the stored procedure. DROP PROCEDURE [dbo].[tsp_FindContactDuplicateByInn]; END; GO -- Creating the stored procedure. CREATE PROCEDURE [dbo].[tsp_FindContactDuplicateByInn] ( -- This table parameter is only rendered if a new contact is stored. -- Contains the new contact data. -- If duplicate global search process is launched, the rendered parameter contains no data. @parsedConfig CreatingObjectInfo READONLY, -- Unique identifier of user who launched the duplicate search. @sysAdminUnit UNIQUEIDENTIFIER, -- Identifier of the current rule from the [ContactDuplicateSearchResult] table. -- This identifier is created after a rule is registered in the system. @ruleId UNIQUEIDENTIFIER ) AS BEGIN -- Receiving the quantity of records from the accepted table for defining the duplicate global search launch. DECLARE @parsedConfigRowsCount INT = (SELECT COUNT(*) FROM @parsedConfig); -- Creating temporary table with contact data for the search. CREATE TABLE #searchContact ( [UsrInn] INT, [SortDate] DATETIME ); -- In case of global search, the temporary table is populated with data. IF @parsedConfigRowsCount = 0 BEGIN -- Adding data for duplicate search to the temporary table. INSERT INTO #searchContact ([UsrInn], [SortDate]) -- Query for contact data selection. SELECT -- The Taxpayer ID columns of contact modification date are selected. [UsrInn], MAX([ModifiedOn]) FROM [Contact] -- Grouping by fields is added to enable using qauntity verification. GROUP BY [UsrInn] -- The table is populated only if more than one contact is available. HAVING COUNT(*) > 1; END; -- Populating the table of results. INSERT INTO [ContactDuplicateSearchResult] ([ContactId], [GroupId], [RuleId], [SysAdminUnitId]) SELECT -- Contact duplicate identifier. [vr].[Id], -- Group numbering. DENSE_RANK() OVER (ORDER BY [vr].[SortDate] DESC, [vr].[UsrInn]), -- Rule identifier. @ruleId RuleId, -- Identifier of user who launched the duplicate search process. @sysAdminUnit FROM ( -- Subquery populating the duplicatae table. SELECT -- Contact identifier. [v].[Id], --Contact Taxpayer ID. [v].[UsrInn], -- Date of sorting. [r].[SortDate] -- Tables providing data. FROM [Contact] [v], #searchContact r -- The rule defining that contacts are duplicates. WHERE [v].[UsrInn] = [r].[UsrInn] -- Groupoing of search results. GROUP BY [v].[UsrInn], [r].[SortDate], [v].[Id] ) [vr]; END; GO
NOTE
Sometimes you can come across the "Cannot resolve the collation conflict between "Cyrillic_General_CI_AS" and "Cyrillic_General_CI_AI" in the equal to operation" error. To fix it, specify the necessary COLLATE when creating the table column.
CREATE TABLE #searchContact ([Name] NVARCHAR(128) COLLATE Cyrillic_General_CI_AI, [BirthDate] DATETIME, [SortDate] DATETIME );
3. Registering the stored procedure as a new rule
To register the stored procedure as a new duplicate search rule, add the corresponding record to the DuplicatesRule table. To do so, execute the following SQL script:
-- Veriable that stores the UId column value of the Contact schema. DECLARE @ContactUId UNIQUEIDENTIFIER; -- Receives the UId сcolumn value of the Contact schema. Set @ContactUId = (SELECT TOP 1 SysSchema.UId FROM SysSchema WHERE SysSchema.Name = 'Contact' AND SysSchema.ExtendParent = 0); -- Adds a new rule to the system. INSERT INTO DuplicatesRule ([IsActive], [ObjectId], [ProcedureName], [Name]) VALUES (1, @ContactUId, 'tsp_FindContactDuplicateByInn', 'Дубли контактов. ИНН');
After you update the application page and clear the cache, you will have a new rule in the list of duplicate search rules (Fig. 1).
Fig. 1. Duplicate search rule by the [Taxpayer ID] field