Creatio development guide
This documentation is valid for Creatio version 7.14.0. We recommend using the newest version of Creatio documentation.

Adding a duplicate search rule

Glossary Item Box

Introduction

Deduplication process uses the rules created as procedures and stored in bpm’online. When a rule is being executed, it populates the ContactDuplicateSearchResult table with a list of duplicates. Search results are grouped by rules and displayed on the duplicate page. Learn more about this functionality in the “Finding and merging duplicates” documentation.

To add a duplicate rule:

1. Add a column to the object schema (if needed). The column value will be used for the search of duplicates.

2. Add the stored search procedure to the application database.

3. Register the stored procedure as a new rule.

Case description

When launching the duplicate search process, the contacts with the same [Taxpayer ID] column values should be considered duplicates and should be displayed as search result.

Case implementation algorithm

1. Adding a field whose value will be used for the duplicate search

Since the [Taxpayer ID] field is not available on a standard contact edit page, add the field to the page (e.g., via a section wizard). For more information about adding fields to edit pages see the “Adding a new field to the edit page” article.

The new field properties:

  • [Title] – “Taxpayer ID”
  • [Name in DB] – “UsrInn”

2. Adding the stored search procedure to the application database

Add the stored duplicate search procedure by the [Taxpayer ID] field to the database. To do so, execute the following SQL script:

-- Verifying if the stored tsp_FindContactDuplicateByInn procedure is available.
IF NOT OBJECT_ID('[dbo].[tsp_FindContactDuplicateByInn]') IS NULL
BEGIN
    -- Deleting the stored procedure.
    DROP PROCEDURE [dbo].[tsp_FindContactDuplicateByInn];
END;
GO
-- Creating the stored procedure.
CREATE PROCEDURE [dbo].[tsp_FindContactDuplicateByInn] (
    -- This table parameter is only rendered if a new contact is stored.
    -- Contains the new contact data.
    -- If duplicate global search process is launched, the rendered parameter contains no data.
    @parsedConfig CreatingObjectInfo READONLY,
    -- Unique identifier of user who launched the duplicate search.
    @sysAdminUnit UNIQUEIDENTIFIER,
    -- Identifier of the current rule from the [ContactDuplicateSearchResult] table.
    -- This identifier is created after a rule is registered in the system.
    @ruleId UNIQUEIDENTIFIER
)
AS
BEGIN
    -- Receiving the quantity of records from the accepted table for defining the duplicate global search launch.
    DECLARE @parsedConfigRowsCount INT = (SELECT COUNT(*) FROM @parsedConfig);
    -- Creating temporary table with contact data for the search.
    CREATE TABLE #searchContact (
        [UsrInn] INT,
        [SortDate] DATETIME
    );
    -- In case of global search, the temporary table is populated with data.
    IF @parsedConfigRowsCount = 0
    BEGIN
        -- Adding data for duplicate search to the temporary table.
        INSERT INTO #searchContact ([UsrInn], [SortDate])
        -- Query for contact data selection.
        SELECT
            -- The Taxpayer ID columns of contact modification date are selected.
            [UsrInn],
            MAX([ModifiedOn])
        FROM [Contact]
        -- Grouping by fields is added to enable using qauntity verification.
        GROUP BY [UsrInn]
        -- The table is populated only if more than one contact is available.
        HAVING COUNT(*) > 1;
    END;
    
    -- Populating the table of results.
    INSERT INTO [ContactDuplicateSearchResult] ([ContactId], [GroupId], [RuleId], [SysAdminUnitId])
    SELECT
        -- Contact duplicate identifier.
        [vr].[Id],
        -- Group numbering.
        DENSE_RANK() OVER (ORDER BY [vr].[SortDate] DESC, [vr].[UsrInn]),
        -- Rule identifier.
        @ruleId RuleId,
        -- Identifier of user who launched the duplicate search process.
        @sysAdminUnit
    FROM (
        -- Subquery populating the duplicatae table.
        SELECT
            -- Contact identifier.
            [v].[Id],
            --Contact Taxpayer ID.
            [v].[UsrInn],
            -- Date of sorting.
            [r].[SortDate]
        -- Tables providing data.
        FROM [Contact] [v], #searchContact r
        -- The rule defining that contacts are duplicates.
        WHERE [v].[UsrInn] = [r].[UsrInn]
        -- Groupoing of search results.
        GROUP BY [v].[UsrInn], [r].[SortDate], [v].[Id]
    ) [vr];
END;
GO

NOTE

Sometimes you can come across the "Cannot resolve the collation conflict between "Cyrillic_General_CI_AS" and "Cyrillic_General_CI_AI" in the equal to operation" error. To fix it, specify the necessary COLLATE when creating the table column.

CREATE TABLE #searchContact ([Name] NVARCHAR(128) COLLATE Cyrillic_General_CI_AI, [BirthDate] DATETIME, [SortDate] DATETIME );

3. Registering the stored procedure as a new rule

To register the stored procedure as a new duplicate search rule, add the corresponding record to the DuplicatesRule table. To do so, execute the following SQL script:

-- Veriable that stores the UId column value of the Contact schema.
DECLARE @ContactUId UNIQUEIDENTIFIER;

-- Receives the UId сcolumn value of the Contact schema.
Set @ContactUId = (SELECT TOP 1 SysSchema.UId FROM SysSchema
WHERE SysSchema.Name = 'Contact' AND SysSchema.ExtendParent = 0);

-- Adds a new rule to the system.
INSERT INTO DuplicatesRule ([IsActive], [ObjectId], [ProcedureName], [Name]) VALUES
  (1, @ContactUId, 'tsp_FindContactDuplicateByInn', 'Дубли контактов. ИНН');

After you update the application page and clear the cache, you will have a new rule in the list of duplicate search rules (Fig. 1).

Fig. 1. Duplicate search rule by the [Taxpayer ID] field

© bpm'online 2002-2019.

Did you find this information useful?

How can we improve it?