Which datatypes are supported?
All datatypes not specified here are assumed to be string types
Redshift Datatype | Syncari Datatype | Notes |
boolean,bool | boolean | |
tinyint | boolean | |
double, real,float, numeric,number,decimal | number | |
timestamp, timestamp with time zone, timestamp without time zone, timestamp_ltz, timestamp_ntz, timestamp_tz | timestamp | Timestamps without TZ information are considered to be in the timezone id set in the Synapse configuration. If not set, it is assumed to be UTC |
integer, int, bigint | integer | |
date | date | |
datetime | datetime | |
varchar,char | string |
What does Syncari require to access data in tables?
To sync with database tables, Syncari needs to be able to identify each individual record and know if that record was modified since the last sync. To ensure that happens, every database table that you connect to Syncari must include:
- A primary key that uniquely identifies each record. The primary key must be required for each record.
- A DATETIME or TIMESTAMP field that identifies when the record was created or updated. Syncari uses the field as a watermark to identify when the record was last updated. This must be populated upon creation and must be populated at the current date and time of modification.
Why are duplicates created in Redshift when a primary key and/or uniqueness is defined for a given column?
Uniqueness, primary key, and foreign key constraints are informational only; they are not enforced by Amazon Redshift when you populate a table. To prevent this, make sure you setup your pipeline with either merge studio rules to deduplicate and/or unification rules to prevent the sources from creating duplicate records in Syncari.
Reference the Redshift documentation.