Merge Studio provides an automated way to deduplicate entity records. Your configuration determines which records are considered duplicates, which record is the winning record, and what happens to the field values on the winning and losing records after the merge is completed. Merge Studio operates on the unified data set, and any duplicates that are merged within Syncari will be removed from your connected systems. As a result, all your end systems will have a common view of the data, which is especially useful when combined with Data Unification.
Prerequisites
Setup - Access Merge Studio
You can access Merge Studio by double-clicking the Syncari core node when viewing a draft of your Entity Pipeline.
Alternatively, you can view your Merge Studio configuration by single-clicking the Syncari node on your Entity Pipeline and then clicking the Merge Studio tab on the right side panel.
Step 1: Find Duplicates
The first step of the Merge Studio configuration lets you enable a Report Only option and set up the conditions for finding duplicates.
Report Only: Enable the Report Only option if you would like to test your Merge Studio rules without actually merging records. See Viewing Merge Studio Transactions for more information about how to review the transaction data that is generated with the Report Only option enabled.
Highly Recommended: Set this option to True before publishing your pipeline.
Max Allowed Duplicates: This takes a number as input. The merge will not proceed if more duplicates are found than this setting allows. In that case, the merge transaction will still be logged in report-only mode, and a notification will be sent stating that the number of duplicates exceeded this setting.
Skip When: When the skip-when filter matches with an incoming record, the find duplicates process will skip that particular record. The skip-when filter accommodates tokens as well as fields on the left-hand side (LHS).
Find Duplicates: Build your condition to find duplicates within the Syncari entity that match the incoming record.
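The way these options interact can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Syncari's implementation: `MergeDecision`, `decide`, the action names, and the `source` field are all hypothetical, and in practice the Skip When filter would run before duplicates are looked up at all.

```python
from dataclasses import dataclass


@dataclass
class MergeDecision:
    action: str           # hypothetical values: "skip", "none", "report", "merge"
    notify: bool = False  # True when the duplicate cap was exceeded


def decide(record, duplicates, *, report_only, max_allowed, skip_when):
    """Hypothetical gate applied before any merge is attempted."""
    if skip_when(record):
        # Skip When matched: this record is not checked for duplicates.
        return MergeDecision("skip")
    if not duplicates:
        # Nothing to merge.
        return MergeDecision("none")
    if len(duplicates) > max_allowed:
        # Over the cap: log in report-only mode and notify, never merge.
        return MergeDecision("report", notify=True)
    if report_only:
        # Report Only: log what would happen without merging.
        return MergeDecision("report")
    return MergeDecision("merge")


is_test = lambda rec: rec.get("source") == "test"  # assumed skip-when filter
decision = decide(
    {"domain": "acme.com"},
    ["dup-1", "dup-2", "dup-3"],
    report_only=False,
    max_allowed=2,
    skip_when=is_test,
)
print(decision)  # MergeDecision(action='report', notify=True)
```

Note that exceeding the cap produces the same "report" outcome as Report Only, plus a notification, which mirrors the behavior described for Max Allowed Duplicates.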
In the example screenshot below we use the Account Domain to find a duplicate. A duplicate will be found if the incoming record's Domain matches an existing Syncari record's Domain.
You can set up the condition logic to be as simple or as complex as you need with AND/OR operators by clicking the + Add Condition link.
You can also set up the Find Duplicates condition to search for duplicates in multiple steps by adding additional groups of conditions. This allows you to have a hierarchy of conditions that run in a given order. If condition #1 doesn't find a duplicate, Syncari will try condition #2, and so on. If the right-hand side is left blank, that numbered condition will be disregarded. Each numbered condition can be moved to a different position by dragging it or by clicking the up/down arrows next to the number.
Note: When you create a filter condition for a match, the condition could evaluate to true during the sync simply because a value is blank. By default, for the operators starts with, contains, equals, and equals ignore case, Syncari does not treat a blank value as a match.
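The ordered fallback between numbered conditions, together with the blank-value rule, can be sketched as follows. This is a simplified model, not Syncari's code: records are plain dictionaries, each numbered condition is reduced to a list of fields that must all match (AND), the comparison is assumed to be a case-insensitive equals, and the function names are hypothetical.

```python
def is_blank(value):
    """Treat None and empty or whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def group_matches(incoming, candidate, fields):
    """All fields in the group must match; a blank value never matches."""
    for field in fields:
        a, b = incoming.get(field), candidate.get(field)
        if is_blank(a) or is_blank(b):
            return False
        if str(a).lower() != str(b).lower():
            return False
    return True


def find_duplicates(incoming, existing, condition_groups):
    """Try each numbered condition in order; stop at the first with matches."""
    for fields in condition_groups:
        if not fields:
            # A condition with nothing configured is disregarded.
            continue
        matches = [c for c in existing if group_matches(incoming, c, fields)]
        if matches:
            return matches
    return []


existing = [
    {"id": 1, "domain": "acme.com", "name": "Acme"},
    {"id": 2, "domain": "", "name": "Globex"},
]
incoming = {"domain": "", "name": "globex"}

# Condition #1 (domain) finds nothing because blank never matches blank,
# so condition #2 (name) is tried next.
print(find_duplicates(incoming, existing, [["domain"], ["name"]]))
```

Without the blank-value rule, the incoming record above would match record 2 on an empty Domain, which is exactly the false positive the note describes.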
Click the Next button to proceed to the next step.
Step 2: Select Winner
The second step of the Merge Studio configuration defines how the winning record will be selected. You can set one or more conditions to be evaluated in a specific order to determine a winner. When setting multiple conditions, the option for "Progressive Selection" will control how each condition will be used to find a winner.
Progressive Selection
If Enabled, the Select Winner step will progressively try to select a winner according to the list of conditions. If more than one duplicate record matches a condition, all records that meet the current condition will be evaluated on the next condition. If no single record can be selected as the winner after the last condition on the list is evaluated, the incoming record will be selected as the winner (also known as the most recently updated record).
In the example configuration shown above, if there are 3 accounts to be merged and 2 of the 3 accounts have an Account Status equal to "Customer", the 2 "Customer" accounts will move on to be evaluated on the next condition, and the record with the highest Score will be selected as the winner. If both records happen to have the same Score, no configured condition has produced a single winner, so the incoming record will be selected as the winner.
If Disabled, the conditions will be evaluated in the order they are defined. If any condition results in a tie between multiple records, the incoming record will be selected as the winner and no subsequent conditions will be evaluated.
In the example configuration shown above, if both accounts have an Account Status equal to "Customer", the condition will result in a tie and the incoming record will be selected as the winner. The 2nd condition on the list, which is configured to select the Score with the Highest Value, will not be evaluated.
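The difference between the two modes can be sketched in Python. This is a minimal model under assumptions, not Syncari's implementation: each condition is represented as a hypothetical function that returns the subset of records satisfying it, the incoming record is assumed to be part of the pool, and a condition that matches no record is assumed to simply be passed over.

```python
def select_winner(incoming, duplicates, conditions, progressive):
    """Pick a winner among the incoming record and its duplicates."""
    candidates = [incoming] + list(duplicates)
    for condition in conditions:
        matched = condition(candidates)
        if len(matched) == 1:
            return matched[0]      # a single record satisfies the condition
        if len(matched) > 1:
            if not progressive:
                return incoming    # tie: stop, later conditions are ignored
            candidates = matched   # tie: carry the tied records forward
        # Assumption: if nothing matched, move on to the next condition.
    return incoming                # no single winner after the last condition


def status_is_customer(records):
    return [r for r in records if r["status"] == "Customer"]


def highest_score(records):
    top = max(r["score"] for r in records)
    return [r for r in records if r["score"] == top]


incoming = {"id": 1, "status": "Prospect", "score": 90}
duplicates = [
    {"id": 2, "status": "Customer", "score": 50},
    {"id": 3, "status": "Customer", "score": 70},
]
rules = [status_is_customer, highest_score]

print(select_winner(incoming, duplicates, rules, progressive=True)["id"])   # 3
print(select_winner(incoming, duplicates, rules, progressive=False)["id"])  # 1
```

With Progressive Selection enabled, the two "Customer" records are compared on Score and record 3 wins. With it disabled, the tie on the first condition ends the evaluation and the incoming record wins.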
Setting Conditions
You can define the winner based on a Record Level Selection or a Field Level Selection.
Record Level Selection
Choose the "Record" option if you want to select the winner based on one of these conditions:
- Earliest Created Record
- Earliest Updated Record
- Most Complete Record
- Most Recently Created Record
- Most Recently Updated Record
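As a rough illustration, the record-level options above could be modeled like this. The field names `created_at` and `updated_at` are hypothetical, and "most complete" is assumed here to mean the record with the most non-blank field values; the product may define it differently. Python's `min` and `max` return the first record encountered when several are tied.

```python
from datetime import datetime


def completeness(record):
    """Count non-blank field values (assumed meaning of 'most complete')."""
    return sum(1 for value in record.values() if value not in (None, ""))


# Hypothetical mapping from each option to a function that picks one record.
RECORD_SELECTORS = {
    "Earliest Created Record": lambda recs: min(recs, key=lambda r: r["created_at"]),
    "Earliest Updated Record": lambda recs: min(recs, key=lambda r: r["updated_at"]),
    "Most Complete Record": lambda recs: max(recs, key=completeness),
    "Most Recently Created Record": lambda recs: max(recs, key=lambda r: r["created_at"]),
    "Most Recently Updated Record": lambda recs: max(recs, key=lambda r: r["updated_at"]),
}

records = [
    {"id": "a", "created_at": datetime(2021, 1, 5), "updated_at": datetime(2023, 6, 1),
     "phone": "", "industry": None},
    {"id": "b", "created_at": datetime(2022, 3, 9), "updated_at": datetime(2022, 4, 2),
     "phone": "555-0100", "industry": "Software"},
]

for option, pick in RECORD_SELECTORS.items():
    print(option, "->", pick(records)["id"])
```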
Field Level Selection
Choose "Field Level Selection" if you want to select a winner based on a field value.
In this example, the Select Winner rule is configured to select the winning account record when the "Account Type" equals "Customer".
Click the Next button to proceed to the next step after defining your Select Winner criteria.
Step 3: Merge Records
This step is where you configure the merge policy for how field values from the losing record(s) are handled. You have the option to define a policy for the record overall and also the option to define individual field policies.
Default Merge Policy: Once a winner is selected, use this policy to describe how to use values in losing records. Note that multiple losing records can merge with a single winning record.
Default Override Policy:
Define when the Default Merge Policy should be applied.
Field Level Merge Policies:
This is an optional setting. Define merge and override policies at the field level. When defined, the Field Level Merge Policies take precedence over the Default Merge Policy and Default Override Policy.
This is the last step for configuring your Merge Studio settings. Click the Finish button to save the configuration. Note that the Merge Studio settings do not go live until you publish your pipeline.
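The precedence between the default and field-level policies can be sketched as follows. The policy names used here (`keep_winner`, `take_loser`, `always`, `if_winner_blank`) are invented for illustration and are not the options offered in Merge Studio; the sketch only shows that a field-level policy, when present, replaces the default for that field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    merge: str     # hypothetical: "keep_winner" or "take_loser"
    override: str  # hypothetical: "always" or "if_winner_blank"


def is_blank(value):
    return value is None or str(value).strip() == ""


def merge_records(winner, losers, default, field_policies=None):
    """Merge loser values into a copy of the winner, field by field."""
    field_policies = field_policies or {}
    merged = dict(winner)
    fields = set(winner).union(*(set(loser) for loser in losers))
    for field in sorted(fields):
        # A field-level policy takes precedence over the default policy.
        policy = field_policies.get(field, default)
        if policy.merge == "keep_winner":
            continue
        # Use the first non-blank value found among the losing records.
        loser_value = next(
            (l[field] for l in losers if not is_blank(l.get(field))), None
        )
        if loser_value is None:
            continue
        if policy.override == "always" or is_blank(merged.get(field)):
            merged[field] = loser_value
    return merged


winner = {"name": "Acme", "phone": "", "industry": "Software"}
losers = [{"name": "ACME Inc", "phone": "555-0100", "industry": "Tech"}]
default = Policy(merge="take_loser", override="if_winner_blank")
per_field = {"industry": Policy(merge="take_loser", override="always")}

print(merge_records(winner, losers, default, per_field))
# {'name': 'Acme', 'phone': '555-0100', 'industry': 'Tech'}
```

In this sketch the default policy only fills in blanks on the winner, so `name` is kept and `phone` is filled, while the field-level policy on `industry` overwrites the winner's value regardless.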
Final Step: Publish Your Pipeline
The Merge Studio settings you configured will be a part of the Draft version of the Pipeline. The settings will not go live until you publish your pipeline.
Keep in mind that once your Merge Studio rules are published, Syncari will merge records in your connected destination Synapses. You can publish the Merge Studio settings with the "Report Only" option enabled in Step 1 if you want to review how the merges will run before going live.
Congratulations! You will now be able to merge duplicate records within your Unified Data and at your connected Synapses.