Creatio development guide
This documentation is valid for Creatio version 7.16.0. We recommend using the newest version of Creatio documentation.

Bulk duplicate search service

Bulk duplicate search is a third-party service for bulk deduplication of Creatio section records.

Duplicate records may appear in Creatio whenever users add new records to system sections. Finding and merging duplicates helps maintain the quality of your data in any Creatio section.

Introduction

The bulk duplicate search service requires the global search service, which is based on ElasticSearch, to be set up and configured. Learn more about the global search service in the “Global Search Service” article.

Creatio implements the following duplicate search modes:

  • Bulk duplicate search – checks the entire database for duplicates. It is launched manually or automatically.
  • Duplicate search when saving a record – checks for duplicates of a particular record. It runs automatically when a new record is added and saved in a section.

Additionally, you can manually merge any records in a section, even if they were not flagged as duplicates. This option is available for all system sections. By default, duplicate search is available in the [Accounts], [Contacts] and [Leads] sections. Creatio runs the duplicate search according to pre-configured rules. You can customize the out-of-the-box duplicate search rules for contacts, accounts, and leads, and create custom rules for any Creatio section, including custom sections.
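
To illustrate what a rule-based duplicate search does conceptually, here is a minimal Python sketch. It is not Creatio's implementation: the record shape, the `find_duplicates` helper, and the normalization step (collapse whitespace and lowercase) are assumptions made for this example only.

```python
from collections import defaultdict


def normalize(value):
    """Collapse whitespace and lowercase so trivially different spellings match."""
    return " ".join(str(value).split()).lower()


def find_duplicates(records, rule_fields):
    """Group record ids whose rule_fields values match after normalization.

    records: list of dicts with an "id" key (hypothetical shape).
    rule_fields: field names that make up one duplicate search rule.
    Returns a list of id groups, each with at least two members.
    """
    groups = defaultdict(list)
    for record in records:
        values = [record.get(field) for field in rule_fields]
        # Skip records where any field of the rule is missing or empty.
        if any(v is None or str(v).strip() == "" for v in values):
            continue
        key = tuple(normalize(v) for v in values)
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


contacts = [
    {"id": 1, "name": "John Best", "email": "j.best@example.com"},
    {"id": 2, "name": "john  best", "email": "J.Best@example.com "},
    {"id": 3, "name": "Mary King", "email": "m.king@example.com"},
]
print(find_duplicates(contacts, ["name", "email"]))  # [[1, 2]]
```

Running the rule against the whole list corresponds to the bulk mode. Checking one new record against the existing ones corresponds to the search on save.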

Bulk duplicate search is pre-enabled in Creatio applications deployed in the cloud. For applications deployed on-site, set up and configure the global search service before you enable the bulk duplicate search service.

To connect bulk duplicate search to Creatio, take the following steps:

  1. Set the value of the [Deduplication service api address] system setting. Learn more about system settings in the “The [System settings] section” article.
  2. Set up the [Duplicates search] operation permissions. Learn more about managing access permissions in the “Managing object operation permissions” article.
  3. Run the SQL script that enables the bulk duplicate search features in Creatio (BulkESDeduplication, ESDeduplication, Deduplication). Learn more about working with additional features in the “Feature Toggle. Mechanism of enabling and disabling functions” article.
  4. Restart the Creatio application.
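
The steps above can be expressed as a simple pre-restart checklist. The Python sketch below is only an illustration: the `check_setup` function, its arguments, and the example address are hypothetical, and the sketch does not read anything from a real Creatio instance. Only the three feature names come from this article.

```python
from urllib.parse import urlparse

# Feature names listed in step 3 of this article.
REQUIRED_FEATURES = ("BulkESDeduplication", "ESDeduplication", "Deduplication")


def check_setup(service_url, enabled_features, has_permission):
    """Return a list of problems; an empty list means the checklist passes.

    All three arguments are supplied by the caller (hypothetical inputs).
    """
    problems = []
    parsed = urlparse(service_url or "")
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("[Deduplication service api address] is not a valid URL")
    if not has_permission:
        problems.append("[Duplicates search] operation permission is not granted")
    for feature in REQUIRED_FEATURES:
        if feature not in enabled_features:
            problems.append(f"feature {feature} is not enabled")
    return problems


# Example address is a placeholder, not a documented default.
for problem in check_setup("http://dedup.example.local", {"Deduplication"}, True):
    print(problem)
```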

More information about enabling the bulk duplicate search service is available in the “Set up bulk duplicate search” article.

Bulk duplicate search service schema

The bulk duplicate search service consists of the following components:

  • RabbitMQ – message broker.
  • ElasticSearch – search engine.
  • Redis – data store used for caching to speed up operations.
  • MongoDB – document-oriented DBMS.
  • WebAPI – web service for communicating with the main Creatio application.
  • Data Service – internal service for communicating with the MongoDB component.
  • Duplicates Search Worker – duplicate search component.
  • Duplicates Deletion Worker – targeted duplicate deletion component.
  • Duplicates Confirmation Worker – component for grouping and filtering the detected duplicates based on their uniqueness.
  • Duplicates Cleaner – component for clearing the duplicates.
  • Deduplication Task Worker – component for setting the deduplication task.
  • Deduplication Preparation Worker – component for preparing the deduplication process. This component generates queries for duplicate search according to the rules.
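
The Duplicates Confirmation Worker is described above as grouping the detected duplicates. One common way to turn pairwise matches into groups is a union-find (disjoint-set) structure. The Python sketch below shows that technique under the assumption that matches arrive as pairs of record ids. It does not reflect the worker's actual code, and the `group_pairs` name is hypothetical.

```python
def group_pairs(pairs):
    """Merge pairwise duplicate matches into groups using union-find."""
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    # Collect every id under its root, then return sorted groups.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


print(group_pairs([(1, 2), (2, 3), (4, 5)]))  # [[1, 2, 3], [4, 5]]
```

Records 1 and 3 never matched directly, but both matched record 2, so all three end up in one group.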

The working principles of the bulk duplicate search service are presented in Figure 1.

Fig. 1. The operation scheme of the bulk duplicate search service

Bulk duplicate search service scalability

In large projects, you can scale the bulk duplicate search service through database clustering. Learn more about ElasticSearch clustering in the official ElasticSearch documentation.

Bulk duplicate search service compatibility with Creatio products

The bulk duplicate search service is available in several versions: 1.0-1.5 and 2.0. Each version is compatible with all Creatio products of version 7.14 and up.
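
The compatibility rule can be checked programmatically. The following Python sketch compares version numbers as integer tuples, because a plain string comparison would wrongly rank "7.9" above "7.14". The `is_compatible` helper is hypothetical and encodes only the 7.14 threshold stated above.

```python
MIN_CREATIO_VERSION = (7, 14)  # threshold taken from this article


def parse_version(text):
    """Turn a dotted version string such as "7.16.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(creatio_version):
    """Return True if the Creatio version is 7.14 or later."""
    return parse_version(creatio_version)[:2] >= MIN_CREATIO_VERSION


print(is_compatible("7.16.0"))  # True
print(is_compatible("7.9.2"))   # False
```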

Bulk duplicate search service deployment options

You can deploy the bulk duplicate search service on-site and in the cloud.

On-site applications require a preliminary setup of the global search service. Learn more in the “Set up global search” article. To set up the bulk duplicate search service, you need a server (a physical or virtual machine) that meets specific system requirements. The system requirements for the server are listed in the “Set up bulk duplicate search” article. Both the global search server and the bulk duplicate search server must run Linux with Docker installed. You can find the list of supported Linux distributions in the Docker documentation.
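
Before deployment, you might verify the two basic prerequisites (Linux and Docker) on each server. The Python sketch below is one way to do it using only the standard library. The `preflight` function is hypothetical, and it only checks that a `docker` executable is on the PATH, not that the Docker daemon is running or that the server meets the full system requirements.

```python
import platform
import shutil


def preflight(get_system=platform.system, which=shutil.which):
    """Report whether the host runs Linux and has a docker executable on PATH.

    The two parameters exist so the checks can be replaced in tests.
    """
    return {
        "linux": get_system() == "Linux",
        "docker": which("docker") is not None,
    }


result = preflight()
for check, passed in result.items():
    print(f"{check}: {'ok' if passed else 'missing'}")
```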

We recommend that you install the most up-to-date version of the bulk duplicate search service.

© Creatio 2002-2020.
