Describe Method

Rok Kovač

Updated May 29, 2024 16:17

The describe method is used to create the Synapse schema inside Syncari for the end system we want to synchronise with. Syncari can handle seeded schemas, dynamic schemas, and a combination of both. When invoked, the method receives a DescribeRequest argument, which contains a list of entity names that Syncari expects to get the schema for.

The method is invoked in two situations: when the synapse is first set up (once it’s tested and activated) or a schema refresh is requested, and while the pipelines are running. In the first case the DescribeRequest will be empty, since the system either doesn’t know the existing schema yet or the user wants to initiate a refresh. In the second case the DescribeRequest will include the specific entities; this call is made with every sync cycle, by default once a minute. The following sections describe the three most commonly used schema configurations.
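
As a rough illustration of the two invocation modes, the sketch below resolves which entities a describe call should return schemas for. FakeDescribeRequest and resolve_entities are hypothetical stand-ins written for this example and are not part of the Syncari SDK; the only assumption carried over from the text is that the request exposes an entities list that may be empty or missing.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class FakeDescribeRequest:
    """Hypothetical stand-in for DescribeRequest; only the entities list is modelled."""
    entities: Optional[List[str]] = field(default_factory=list)


def resolve_entities(request: FakeDescribeRequest, known_entities: Iterable[str]) -> List[str]:
    """Return the entity names a describe call should produce schemas for."""
    if not request.entities:
        # Setup or schema refresh: nothing was requested, so describe everything.
        return list(known_entities)
    # Sync cycle: only the requested entities are described.
    return list(request.entities)
```

An empty request therefore expands to every known entity, while a request made during a sync cycle is passed through unchanged.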

Seeded Schema

In a seeded schema use case, the entire schema is defined as a dict that includes all the entities with their respective attributes. For this use case you can use the boilerplate below to return the schema on each call.

def describe(self, desc_request: DescribeRequest) -> List[Schema]:
    entities = desc_request.entities
    # An empty request means "describe everything" (initial setup or schema refresh).
    if not entities:
        entities = entity_schemas.keys()
    # Skip any requested entity that the seeded schema does not define.
    return [Schema.parse_obj(entity_schemas[entity]) for entity in entities if entity in entity_schemas]

To support the boilerplate method, you also need to define a schema dict named entity_schemas that includes all the entities. Below is an example of a static schema:

entity_schemas = {
    'contact': {
        'apiName':'contact',
        'displayName':'Contact',
        'attributes':[
            {'apiName':'id', 'dataType':'string', 'isIdField':True, 'displayName':'Id'},
            {'apiName':'firstName', 'dataType':'string', 'displayName':'First Name'},
            {'apiName':'lastName', 'dataType':'string', 'displayName':'Last Name'},
            {'apiName':'companyName', 'dataType':'string', 'displayName':'Company Name'},
            {'apiName':'email', 'dataType':'string', 'displayName':'Email'},
            {'apiName':'dateAdded', 'dataType':'datetime', 'displayName':'Created At','isCreatedAtField':True,'isSystem':True},
            {'apiName':'dateUpdated', 'dataType':'datetime', 'displayName':'Updated At','isUpdatedAtField':True, 'isWatermarkField':True,'isSystem':True}
        ]
    },
    'account': {
        'apiName':'account',
        'displayName':'Account',
        'attributes':[
            {'apiName':'id', 'dataType':'string', 'isIdField':True, 'displayName':'Id'},
            {'apiName':'name', 'dataType':'string', 'displayName':'Name'},
            {'apiName':'dateAdded', 'dataType':'datetime', 'displayName':'Created At','isCreatedAtField':True,'isSystem':True},
            {'apiName':'dateUpdated', 'dataType':'datetime', 'displayName':'Updated At','isUpdatedAtField':True, 'isWatermarkField':True,'isSystem':True}
        ]
    }
}

Hybrid Schema

As opposed to a fully seeded schema, the hybrid schema approach is used when the end system has predefined entities but also supports a schema API that can fetch custom fields and / or entities. Here you append custom fields as attributes to existing entities and, if available, append custom entities to the schema. When the end system uses a different set of field data types, consider adding a translation dict to ensure that types are mapped correctly between the end system and Syncari.

Please note that when you append fields to an existing seeded schema, the seeded schema should be method scoped and not global scoped. A global dict would be mutated on every call and shared between synapse setups, so this is required to avoid field cross-pollination across them.
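
One way to keep the working schema method scoped is to treat a module-level dict purely as a read-only template and copy it on every call. The sketch below is illustrative only: SEEDED_TEMPLATE, build_schema, and the custom field in the usage note are hypothetical names, not part of the Syncari SDK.

```python
import copy
from typing import Dict, List

# Hypothetical read-only template; it is never mutated directly.
SEEDED_TEMPLATE = {
    'contact': {
        'apiName': 'contact',
        'displayName': 'Contact',
        'attributes': [
            {'apiName': 'id', 'dataType': 'string', 'isIdField': True, 'displayName': 'Id'},
        ],
    },
}


def build_schema(custom_fields: Dict[str, List[dict]]) -> dict:
    """Return a fresh schema dict with custom fields appended per entity."""
    # deepcopy so the nested 'attributes' lists are not shared between calls.
    schema = copy.deepcopy(SEEDED_TEMPLATE)
    for entity, fields in custom_fields.items():
        if entity in schema:
            schema[entity]['attributes'].extend(fields)
    return schema
```

Calling build_schema({'contact': [{'apiName': 'vip', 'dataType': 'boolean', 'displayName': 'VIP'}]}) returns a schema with the extra field, while SEEDED_TEMPLATE and any later call stay unaffected. A shallow copy would not be enough here, because the attribute lists would still be shared.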

Below is an example of a method where you would modify an existing schema.

def describe(self, desc_request: DescribeRequest) -> List[Schema]:
    requested = desc_request.entities or []
    # Build the schema first: it is method scoped, so its keys are only known after this call.
    entity_schemas = self.__get_custom_fields(requested)
    # An empty request means "describe everything".
    entities = requested or entity_schemas.keys()
    return [Schema.parse_obj(entity_schemas[entity]) for entity in entities if entity in entity_schemas]

def __get_custom_fields(self, entities):
    # Seeded part of the schema, rebuilt on every call so appended fields never leak between setups.
    entity_schemas = {
        'contact': {
            'apiName':'contact',
            'displayName':'Contact',
            'attributes':[
                {'apiName':'id', 'dataType':'string', 'isIdField':True, 'displayName':'Id'},
                {'apiName':'firstName', 'dataType':'string', 'displayName':'First Name'},
                {'apiName':'lastName', 'dataType':'string', 'displayName':'Last Name'},
                {'apiName':'companyName', 'dataType':'string', 'displayName':'Company Name'},
                {'apiName':'email', 'dataType':'string', 'displayName':'Email'},
                {'apiName':'dateAdded', 'dataType':'datetime', 'displayName':'Created At','isCreatedAtField':True,'isSystem':True},
                {'apiName':'dateUpdated', 'dataType':'datetime', 'displayName':'Updated At','isUpdatedAtField':True, 'isWatermarkField':True,'isSystem':True}
            ]
        },
        'account': {
            'apiName':'account',
            'displayName':'Account',
            'attributes':[
                {'apiName':'id', 'dataType':'string', 'isIdField':True, 'displayName':'Id'},
                {'apiName':'name', 'dataType':'string', 'displayName':'Name'},
                {'apiName':'dateAdded', 'dataType':'datetime', 'displayName':'Created At','isCreatedAtField':True,'isSystem':True},
                {'apiName':'dateUpdated', 'dataType':'datetime', 'displayName':'Updated At','isUpdatedAtField':True, 'isWatermarkField':True,'isSystem':True}
            ]
        }
    }

    # An empty list means every seeded entity should be enriched with its custom fields.
    for entity in (entities or entity_schemas.keys()):
        if entity not in entity_schemas:
            continue
        resp = self.client.get(entity)
        resp_json = resp.json()
        for field in resp_json:
            entity_schemas[entity]['attributes'].append({'apiName': field['apiName'], 'displayName': field['displayName'], 'dataType': field['dataType']})
    return entity_schemas

Dynamic Schema

In this use case everything is dynamic. Both the list of entities and their fields are discoverable via the API, and the synapse code itself holds no static schema information.

Below is an example of an implementation where both the list of entities and their fields are fetched dynamically.

def describe(self, desc_request: DescribeRequest) -> List[Schema]:
    entities = desc_request.entities
    # An empty request means "describe everything", so ask the API for the entity list.
    if not entities:
        entities = self.__get_entities()
    return [Schema.parse_obj(self.__get_schema_for_entity(entity)) for entity in entities]

def __get_entities(self):
    resp = self.client.get('entities', headers=self.__authentication())
    resp_json = resp.json()
    entity_objects = resp_json.get('data').get('attributes')
    entity_names = entity_objects.keys()
    return entity_names

def __get_schema_for_entity(self, entity):
    entity_api_name = entity
    entity_display_name = entity.replace("_", " ")
    resp = self.client.get(f'fields/{entity_api_name}', headers=self.__authentication())
    resp_json = resp.json()
    fields = resp_json.get('data').get('attributes')
    schema_fields = []
    for k, v in fields.items():
        api_name = k
        display_name = k.capitalize().replace('_', ' ')
        # A field is nillable exactly when the end system does not mark it as required.
        nillable = not v.get('required')
        # Unknown end-system types fall back to 'string' so describe never crashes on them.
        field = {'apiName': api_name, 'dataType': self.mapping_dict.get(v.get('type'), "string"), 'displayName': display_name, 'nillable': nillable}
        if v.get('type') == "id":
            field['isIdField'] = True
        if k == 'date_entered':
            field['isCreatedAtField'] = True
            field['isSystem'] = True
        if k == 'date_modified':
            field['isUpdatedAtField'] = True
            field['isSystem'] = True
            field['isWatermarkField'] = True
        schema_fields.append(field)
    schema = {
        "apiName" : entity_api_name,
        "displayName" : entity_display_name,
        "attributes" : schema_fields
    }
    return schema

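    # Translation dict from end-system field types to Syncari data types.
    # As an instance attribute it has to be assigned inside a method, for example __init__ (placement assumed).
    self.mapping_dict = {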
        'id' : 'string',
        'Address': 'complex',
        'Checkbox': 'boolean',
        'Currency': 'decimal',
        'Date': 'date',
        'Datetime': 'datetime',
        'Decimal': 'decimal',
        'Dynamic': 'string',
        'DropDown': 'picklist',
        'Float': 'double',
        'HTML': 'string',
        'IFrame': 'string',
        'Image': 'string',
        'Integer': 'integer',
        'MultiSelect': 'picklist',
        'Phone': 'string',
        'Radio': 'boolean',
        'Relate': 'reference',
        'TextArea': 'string',
        'URL': 'string',
        'TextField': 'string',
        'WYSIWYG': 'string'
    }

Best Practices

The Syncari platform supports the following field data types: 'boolean', 'decimal', 'double', 'reference', 'picklist', 'string', 'datetime', 'timestamp', 'integer', 'date', 'object', 'child', 'password', 'complex'. In a dynamic schema use case the end system's type names might not align with these, so a translation dictionary needs to be added to the Synapse so that the describe call can map types on the fly. We also recommend using 'string' as the default mapping value in case a mapping was missed or the API adds a new type later on. This prevents describe calls from crashing unnecessarily.
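
A minimal sketch of such a translation step is shown below. The set of supported types is taken from the list above; TYPE_MAPPING, its end-system type names, and to_syncari_type are hypothetical and would need to be replaced with the names your end system actually returns.

```python
# Data types listed above as supported by the Syncari platform.
SUPPORTED_TYPES = {
    'boolean', 'decimal', 'double', 'reference', 'picklist', 'string', 'datetime',
    'timestamp', 'integer', 'date', 'object', 'child', 'password', 'complex',
}

# Hypothetical end-system type names; replace with the names your API returns.
TYPE_MAPPING = {
    'Checkbox': 'boolean',
    'Currency': 'decimal',
    'Datetime': 'datetime',
    'DropDown': 'picklist',
}


def to_syncari_type(source_type, default='string'):
    """Translate an end-system type name, falling back to the default when unmapped."""
    mapped = TYPE_MAPPING.get(source_type, default)
    # Guard against typos in the mapping itself.
    return mapped if mapped in SUPPORTED_TYPES else default
```

With this helper, to_syncari_type('Checkbox') yields 'boolean', while a type the mapping has never seen falls back to 'string' instead of raising an error.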

Syncari also allows additional metadata to be added to fields, for example whether they are required or read only. The full list of available attribute flags can be found here. Please note that each entity should have one field with isIdField set to True, which is the unique identifier Syncari uses to distinguish between records, and one field with isWatermarkField set to True. The watermark field is usually the lastModified or lastUpdated date. If the entity has no lastModified date, you can add a fabricated one and set the value of lastModified on each Record to the current timestamp in epoch milliseconds.
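
The two rules above can be checked before a schema is returned. The helpers below are a sketch under that assumption; find_schema_problems and current_epoch_millis are hypothetical names and not part of the Syncari SDK.

```python
import time
from typing import List


def find_schema_problems(entity_schema: dict) -> List[str]:
    """Return human-readable problems; an empty list means the flags look fine."""
    attributes = entity_schema.get('attributes', [])
    problems = []
    for flag in ('isIdField', 'isWatermarkField'):
        # Count the attributes that set this flag to a truthy value.
        count = sum(1 for attr in attributes if attr.get(flag))
        if count != 1:
            problems.append(f"{entity_schema.get('apiName')}: expected 1 {flag}, found {count}")
    return problems


def current_epoch_millis() -> int:
    """Current time in epoch milliseconds, usable as a fabricated lastModified value."""
    return int(time.time() * 1000)
```

Running find_schema_problems over every entity during development is a cheap way to catch a missing watermark field before the synapse is activated.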

When developing a synapse, we recommend starting with a seeded schema. This allows for faster testing and prototyping within the production environment. Keep in mind that any changes to key values after moving to a dynamic schema could result in additional work within Syncari, since field pipelines and tokens might need to be adjusted.

Be sure to use Schema.parse_obj() if you define a seeded schema as a dictionary or build the fetched schema as a dict. Alternatively, it’s possible to create an instance of the Schema class directly.

When an entity is read only, you can set the 'readOnly' field on the entity to True. This prevents end users from configuring destination-side nodes in the UI, which trigger the create, update and delete methods. If you have a bi-directional entity but need certain fields to be read only, set the 'updateable' and 'initializable' flags on those fields to False. This prevents the field from being used on the destination side of a field pipeline.
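
The difference between the entity-level and field-level settings can be sketched as follows. The flag names come from the paragraph above; the audit_log entity, the score field, and the make_field_read_only helper are hypothetical examples.

```python
def make_field_read_only(attribute: dict) -> dict:
    """Return a copy of a field definition that cannot be written to from Syncari."""
    return {**attribute, 'updateable': False, 'initializable': False}


# Entity level: the whole entity is read only.
audit_log = {
    'apiName': 'audit_log',  # hypothetical entity
    'displayName': 'Audit Log',
    'readOnly': True,
    'attributes': [
        {'apiName': 'id', 'dataType': 'string', 'isIdField': True, 'displayName': 'Id'},
    ],
}

# Field level: the entity stays bi-directional, only this field is locked.
score = make_field_read_only(
    {'apiName': 'score', 'dataType': 'integer', 'displayName': 'Score'}
)
```

The helper returns a new dict rather than mutating its argument, which keeps it safe to use on attributes taken from a shared seeded schema.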

Example

In the examples below you can see part of Pipedrive’s seeded schema and the describe method. The latter uses a common pattern we recommend: check whether the incoming DescribeRequest includes any entities. If it does, the method proceeds with them; otherwise it takes all the keys of the entity_schemas dictionary (in a fully dynamic schema setup, this could be replaced with an API call response). It then processes each entity, checking whether the entity has a dynamic schema available (it is listed in entity_fields_endpoint) or is a static one. Once every entity has been processed, it returns the full schema of the requested entities to the platform.

Example of a seeded schema

entity_schemas = {
    'user': {
        'apiName': 'user',
        'displayName': 'User',
        'pluralName': 'users',
        'description': 'Represents user schema',
        'readOnly': True,
        'attributes': [
            {'apiName':'id', 'displayName':'ID', 'dataType':'integer', 'isIdField':True, 'nillable':False, 'updateable':False, 'isSystem':True, 'unique':True},
            {'apiName':'name', 'displayName':'Name', 'dataType':'string'},
            {'apiName':'default_currency', 'displayName':'Default Currency', 'dataType':'string'},
            {'apiName':'locale', 'displayName':'Locale', 'dataType':'string'},
            {'apiName':'lang', 'displayName':'Language', 'dataType':'integer'},
            {'apiName':'email', 'displayName':'Email', 'dataType':'string'},
            {'apiName':'phone', 'displayName':'Phone', 'dataType':'string'},
            {'apiName':'activated', 'displayName':'Is Activated', 'dataType':'boolean'},
            {'apiName':'last_login', 'displayName':'Last Login', 'dataType':'datetime'},
            {'apiName':'created', 'displayName':'Created', 'dataType':'datetime', 'nillable':False, 'updateable':False, 'isSystem':True},
            {'apiName':'modified', 'displayName':'Last Modified', 'dataType':'datetime', 'isWatermarkField':True, 'nillable':False, 'updateable':False, 'isSystem':True},
            {'apiName':'signup_flow_variation', 'displayName':'Signup Flow Variation', 'dataType':'string'},
            {'apiName':'has_created_company', 'displayName':'Has Created Company', 'dataType':'boolean'},
            {'apiName':'is_admin', 'displayName':'Is Admin', 'dataType':'boolean'},
            {'apiName':'active_flag', 'displayName':'Is Active', 'dataType':'boolean'},
            {'apiName':'timezone_name', 'displayName':'Timezone Name', 'dataType':'string'},
            {'apiName':'timezone_offset', 'displayName':'Timezone Offset', 'dataType':'string'},
            {'apiName':'role_id', 'displayName':'Role ID', 'dataType':'integer'},
            {'apiName':'icon_url', 'displayName':'Icon URL', 'dataType':'string'},
            {'apiName':'is_you', 'displayName':'Is You', 'dataType':'boolean'}
        ]
    },
    # Remaining entities are omitted from this excerpt.
}

Example Method

def describe(self, desc_request: DescribeRequest) -> List[Schema]:
    schemas = []
    entities = desc_request.entities
    # An empty request means "describe everything".
    if not entities:
        entities = entity_schemas.keys()
    for entity in entities:
        if entity in entity_fields_endpoint:
            # The entity has a fields endpoint, so its schema is fetched dynamically.
            schemas.append(self.__get_entity_schema(entity))
        else:
            # Static entity: use the seeded definition as is.
            schemas.append(Schema.parse_obj(entity_schemas[entity]))
    return schemas
