In this article, we discuss the factors that affect Sync time and provide tips and recommendations to optimize your Sync.
Sync Cycle
A Sync cycle is the process in which Syncari detects changes in the source synapses, queries the changed data, and runs that data through the Pipeline. See Sync Cycle for more details.
Each Sync cycle typically reads up to 2000 records per source synapse, and these records are run through the Pipeline.
Factors affecting Sync time
Multiple factors influence the time taken by a Sync cycle. Below are a few.
- Number of sources in the pipeline
- Number of Mapped Fields in a Pipeline
- Functions and Actions in the pipeline
- Merge Studio
- Synapses
Number of sources in the pipeline
In each Sync cycle, Syncari reads up to 2000 records from each source in the pipeline. An increase in the number of sources can increase the number of records processed in each Sync cycle, and thus the time needed to complete one cycle. Typically, it is not possible to reduce Sync time here, as all the sources may be needed in the Pipeline.
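The relationship between source count and per-cycle volume can be written down as simple arithmetic. The sketch below assumes the 2000-record figure is a fixed upper bound per source; the function name is made up for illustration and is not part of Syncari.

```python
# Upper bound per source per cycle, as stated in this article.
RECORDS_PER_SOURCE_PER_CYCLE = 2000


def max_records_per_cycle(source_count: int) -> int:
    """Return the most records one Sync cycle could read for a pipeline."""
    if source_count < 0:
        raise ValueError("source_count must not be negative")
    return source_count * RECORDS_PER_SOURCE_PER_CYCLE


print(max_records_per_cycle(1))  # 2000
print(max_records_per_cycle(2))  # 4000
```

A pipeline with two sources can therefore process up to twice as many records per cycle as a single-source pipeline, which is why its cycles tend to run longer.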
Number of Mapped Fields in a Pipeline
A Sync cycle takes longer as more fields are mapped in the pipeline. The slowdown may become noticeable only beyond 50 mapped fields.
Reference Fields
Reference fields are fields that refer to a different Syncari Entity through an ID field. When source reference fields are mapped in a Pipeline, Syncari tries to resolve the corresponding Entity record. Take, for example, the Account ID field in the Contact entity, which refers to the Account entity. When a source record is processed, Syncari tries to find the Account record that corresponds to the incoming Account ID. This resolution can be slow. If the resolution is not possible, Syncari tries to resolve the Account ID in the Account pipeline, which can slow down the Account pipeline, especially if a large number of unresolved references exist. Avoid mapping Reference fields that are not needed.
Functions and Actions in the pipeline
The number of Functions and Actions in a pipeline can affect the duration of a Sync cycle, and different functions and actions affect it to varying degrees. The functions and actions that can have a significant impact on Sync time are discussed below, along with tips to boost performance. Note that you may not be able to implement every suggestion in your pipeline; the recommendations are provided as general guidelines.
Functions
Lookup Functions
Lookup functions like Lookup Syncari Record, Update Syncari Record, and Attach Record, and aggregation functions like Count and Sum, query the Syncari database based on the provided filter conditions. These queries can be time consuming and affect Sync time.
Recommendation
- Minimize the use of these functions. If feasible, use them after a decision node. This can reduce the number of times the functions are processed.
- If Lookup Syncari Record is used in the Pipeline with the same Entity and the same filter conditions in multiple places, consider storing the result in a Temporary variable and using the variable in the different parts of the pipeline.
- Ensure the right-hand side of the Filter Condition resolves to a valid value. If that is not possible, consider adding an additional condition with the "Is Not Empty" operator.
Let's look at an example. If the token on the right-hand side (RHS) of the condition can resolve to an empty value, the function can be quite slow.
Consider adding another condition that checks that the value, for example Account Phone, is not empty.
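The caching and "Is Not Empty" recommendations can be sketched outside Syncari in plain Python. Everything here is hypothetical: `fake_query` stands in for a Lookup Syncari Record call, and the dictionary plays the role of a Temporary variable. It is meant only to show why skipping empty values and reusing results reduces the number of queries.

```python
from typing import Callable, Dict, Optional


def make_guarded_lookup(
    query: Callable[[str], Optional[dict]]
) -> Callable[[Optional[str]], Optional[dict]]:
    """Wrap a slow lookup with an emptiness guard and a result cache."""
    cache: Dict[str, Optional[dict]] = {}

    def lookup(phone: Optional[str]) -> Optional[dict]:
        # Equivalent of the "Is Not Empty" condition: skip the query entirely.
        if phone is None or not phone.strip():
            return None
        key = phone.strip()
        # Equivalent of a Temporary variable: reuse the earlier result.
        if key not in cache:
            cache[key] = query(key)
        return cache[key]

    return lookup


calls = []


def fake_query(phone: str) -> Optional[dict]:
    """Hypothetical stand-in for the slow lookup; records each invocation."""
    calls.append(phone)
    return {"phone": phone, "name": "Acme"}


lookup = make_guarded_lookup(fake_query)
lookup("")          # guarded, no query
lookup(None)        # guarded, no query
lookup("555-0100")  # one query
lookup("555-0100")  # served from the cache
print(len(calls))   # 1
```

Four lookups result in a single query: two are skipped because the value is empty, and one is answered from the stored result.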
Lookup External Record
The Lookup External Record function allows you to run a lookup directly on an external system such as Salesforce or a database. Because the lookup happens on an external system, outside Syncari, it can be slow. If the lookups or queries run on a database, ensure that the right database indexes exist for the query.
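One way to check whether a lookup query is backed by an index is to inspect its query plan. The sketch below uses an in-memory SQLite database purely as a stand-in; the table, column, and index names are invented, and your own database will have its own syntax for showing query plans.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO contacts (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup_sql = "SELECT id, name FROM contacts WHERE email = ?"
params = ("user42@example.com",)

plan_before = query_plan(conn, lookup_sql, params)  # full table scan
conn.execute("CREATE INDEX idx_contacts_email ON contacts (email)")
plan_after = query_plan(conn, lookup_sql, params)   # search using the index

print(plan_before)
print(plan_after)
```

Before the index exists, the plan reports a scan of the whole table; afterwards it names `idx_contacts_email`, which means the filter column used by the lookup is indexed.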
Decision
Recommendation
- Prefer one complex filter over breaking a complex filter into multiple simple filters.
- Connect Decision nodes sequentially instead of arranging them in parallel. This keeps the logic simple and ensures the pipeline is not executed along parallel paths, which slows the Sync.
Enrichment Functions
Enrich Company and Enrich Person
These functions look up company or person information from enrichment services like Clearbit and ZoomInfo. Because these calls are made to external APIs for each record, they can be slow. Syncari caches information about a given company or person for 30 days. This means that if the same company or person needs to be looked up again within 30 days, Syncari does not hit the external API. This saves you the API cost and reduces Sync time. See Enrichment Functions for details.
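The effect of a 30-day cache can be illustrated with a small time-to-live cache. This is only a sketch of the general technique, not a description of how Syncari implements its cache; `EnrichmentCache` and `fake_enrich_company` are hypothetical names, and a fake clock is used so the example runs instantly.

```python
import time
from typing import Callable, Dict, Tuple

THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60


class EnrichmentCache:
    """Cache enrichment results and refetch them once they are too old."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float = THIRTY_DAYS_SECONDS,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, domain: str) -> dict:
        now = self._clock()
        entry = self._entries.get(domain)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no external API call
        value = self._fetch(domain)
        self._entries[domain] = (now, value)
        return value


api_calls = []


def fake_enrich_company(domain: str) -> dict:
    """Hypothetical stand-in for an external enrichment API call."""
    api_calls.append(domain)
    return {"domain": domain, "employees": 250}


fake_now = [0.0]
cache = EnrichmentCache(fake_enrich_company, clock=lambda: fake_now[0])

cache.get("example.com")          # day 0: calls the API
fake_now[0] = 29 * 24 * 60 * 60
cache.get("example.com")          # day 29: served from the cache
fake_now[0] = 31 * 24 * 60 * 60
cache.get("example.com")          # day 31: entry expired, calls the API again
print(len(api_calls))             # 2
```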
Reference Data
Reference Data is an efficient alternative to the Lookup Syncari Record function when the entity being looked up rarely changes and the changes can be uploaded manually to Reference Data. See Using Reference Data for details.
Actions
Typically, Actions (both built-in Actions and Custom Actions) are executed sequentially for each record, and because most Actions make API calls outside of Syncari, they can be slow. Some Actions support batch mode, and these can be significantly faster than Actions that are not batched. One downside of batched Actions is that the output of the Action cannot be used in subsequent nodes in the Pipeline.
The following Actions are batched:
- Add To Salesforce Campaign
- Convert Salesforce Lead
- Add To Marketo List
- Add To Marketo Program
- Add To HubSpot List
- Create External Record (Batching is optional)
- Wait
Custom Actions can also be designed to execute in batch mode. See Custom Action for more details.
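To see why batching matters, compare how many outbound calls are needed with and without it. The sketch below is generic Python, not Syncari code, and the batch size of 200 is an assumption chosen for illustration; actual batch sizes depend on the Action and the destination API.

```python
from typing import Iterator, List, Sequence


def batches(records: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


record_ids = [f"rec-{i}" for i in range(2000)]

sequential_calls = len(record_ids)                       # one call per record
batched_calls = sum(1 for _ in batches(record_ids, 200))  # one call per batch

print(sequential_calls, batched_calls)  # 2000 10
```

With these assumed numbers, one Sync cycle of 2000 records needs 10 outbound calls instead of 2000.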
Merge Studio
Merging has two parts: merging records in Syncari, and writing the result of the merge operation to the destination synapses. Both can make the pipeline slow, especially for a large volume of merges.
Merge in Syncari
Finding duplicates is the first step of the merge operation in Syncari, and also the most time consuming. In this step, the filter conditions defined in the configuration are run against the records in Syncari for that entity. For each record being processed in the pipeline, the filter conditions are evaluated sequentially until a match is found.
Writing merges to destination synapses can be slow because of API limits or the lack of a native merge operation. When there is no native merge operation, Syncari creates the winning entity and deletes the losing entity, which can make the operation even slower.
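The sequential evaluation described above can be pictured as trying each rule in order and stopping at the first match. This is a simplified sketch: the rule names, the record layout, and the in-memory scan are all assumptions made for illustration, whereas Syncari runs its filter conditions against its own database.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
Rule = Tuple[str, Callable[[Record, Record], bool]]


def same_email(a: Record, b: Record) -> bool:
    """Match on a non-empty, case-insensitive email."""
    return bool(a.get("email")) and a.get("email", "").lower() == b.get("email", "").lower()


def same_name(a: Record, b: Record) -> bool:
    """Match on a non-empty, case-insensitive name."""
    return bool(a.get("name")) and a.get("name", "").lower() == b.get("name", "").lower()


def find_duplicate(
    incoming: Record, existing: List[Record], rules: List[Rule]
) -> Optional[Tuple[str, Record]]:
    """Try each rule in order and return the first (rule name, record) match."""
    for rule_name, matches in rules:
        for candidate in existing:
            if matches(incoming, candidate):
                return rule_name, candidate  # later rules are never evaluated
    return None


existing_records = [
    {"id": "1", "email": "bob@example.com", "name": "Bob Ray"},
    {"id": "2", "email": "jane@example.com", "name": "J. Doe"},
]
rules = [("same_email", same_email), ("same_name", same_name)]

result = find_duplicate(
    {"email": "Jane@Example.com", "name": "Jane Doe"}, existing_records, rules
)
print(result[0], result[1]["id"])  # same_email 2
```

Records that match no rule are the most expensive in this model, because every rule has to be evaluated before the search gives up.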
Synapses
Custom Synapse
Pipelines that use a custom synapse as either a source or a destination tend to be slower than ones that use a standard synapse. Custom Synapses are deployed outside the Syncari system, in a secure and sandboxed Google Cloud Function environment. This means that when Syncari queries a custom synapse, the request first goes to the cloud function and from there to the end system. This additional hop may affect Sync time, though not significantly.
Sync Rate Limit
Users can configure a Sync Rate Limit for a given Synapse. This limits the number of downstream write operations performed for the synapse per Sync cycle. It can slow down the pipeline, because it limits the number of records processed in each cycle.
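As a rough way to reason about the effect of a rate limit, the number of cycles needed to write a backlog is the backlog divided by the per-cycle limit, rounded up. The figures below are invented for illustration, and the function is not part of Syncari.

```python
import math


def cycles_to_drain(pending_writes: int, limit_per_cycle: int) -> int:
    """Return how many Sync cycles are needed to write a backlog of records."""
    if limit_per_cycle < 1:
        raise ValueError("limit_per_cycle must be at least 1")
    if pending_writes < 0:
        raise ValueError("pending_writes must not be negative")
    return math.ceil(pending_writes / limit_per_cycle)


print(cycles_to_drain(10_000, 2000))  # 5
print(cycles_to_drain(10_000, 500))   # 20
```

Under these assumptions, lowering the limit from 2000 to 500 writes per cycle makes the same backlog take four times as many cycles.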
Batch Support
Certain synapses do not support batch read or write operations, or do not support certain operations in batch mode. For example, the HubSpot synapse supports batched write operations when updating existing records, but not when creating new records. This can significantly slow down the pipeline. Find out which operations do not support batch mode for the Synapses you are using.
Resync
You may notice that a Resync is slightly slower than a normal Sync. This can happen for a few reasons.
- An initial Resync is done when there are no records for that Entity in Syncari. Internal processing such as running Merges, logging transactions, and writing to the destination therefore takes place for every record, which may not always happen during a normal Sync. This can make a Resync slower than a Sync for the same number of records.
- A partial Resync for a date range may take more time than you would expect. Syncari still brings in all the data for the date range and processes it through the pipeline, even if most of the records already exist in Syncari.
Sync Benchmarks
In this section, we look at Sync times for some common pipeline use cases. Note that the numbers below were measured on sample pipelines, and the Sync duration in your pipelines may vary.
| Use Case | Synapse | Time Per Sync Cycle (seconds) | Throughput (Records/Hour) |
| --- | --- | --- | --- |
| Simple Pipeline with no transformations | Salesforce as Source and Destination | 83 | 86,000 |
| Simple Pipeline with Merge Studio configuration | Salesforce as Source and Destination | 91 | 79,000 |
| Pipeline with Lookup Syncari Record function | Salesforce as Source and Destination | 101 | 71,000 |
| Pipeline with two sources and Attach Record function and Merge configuration | Salesforce and HubSpot as Sources and Salesforce as Destination | 411 (4000 records processed per cycle) | 35,000 |
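The throughput column follows from the cycle time and the number of records per cycle. The sketch below assumes 2000 records per cycle for the single-source rows (the per-source limit mentioned earlier) and the 4000 records noted for the two-source row. Comparing the results against the table suggests the published figures are rounded down to the nearest thousand.

```python
SECONDS_PER_HOUR = 3600


def records_per_hour(records_per_cycle: int, seconds_per_cycle: float) -> int:
    """Convert one cycle's volume and duration into hourly throughput."""
    if seconds_per_cycle <= 0:
        raise ValueError("seconds_per_cycle must be positive")
    return round(records_per_cycle / seconds_per_cycle * SECONDS_PER_HOUR)


print(records_per_hour(2000, 83))   # 86747
print(records_per_hour(2000, 91))   # 79121
print(records_per_hour(2000, 101))  # 71287
print(records_per_hour(4000, 411))  # 35036
```

You can apply the same formula to cycle times observed in your own pipelines to estimate how long a given volume of records will take to sync.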