
How Data Sync Works

Enterprise-grade, log-based Change Data Capture (CDC) technology reads database transaction logs in real time, compressing change data by up to 90% for efficient cloud replication.

  • Log-based CDC: real-time capture
  • 90% data compression: network efficiency
  • Zero impact on production databases

Log-Based Change Data Capture

Unlike traditional query-based replication, Data Sync reads the database transaction log files directly—capturing every INSERT, UPDATE, and DELETE in real-time without impacting database performance.

Transaction Log Reading

How Transaction Logs Work

1

Database Writes to Log

Every database transaction (INSERT, UPDATE, DELETE) is first written to a transaction log file before being committed to the database.

2

Data Sync Reads the Log

Our CDC agent continuously monitors and reads these log files, parsing transaction records in real-time without querying the production database.

3

Change Events Created

Each transaction is converted into a structured change event with before/after values, timestamp, and transaction metadata.

4

Compressed & Replicated

Changes are compressed (up to 90% reduction) and streamed to cloud destinations in near real-time, preserving transactional consistency.
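The four steps above can be sketched as a minimal data structure. The field names below are illustrative, not Data Sync's actual event schema:

```python
from dataclasses import dataclass, field
from typing import Any, Optional
import time

@dataclass
class ChangeEvent:
    """One parsed transaction-log record (illustrative fields)."""
    operation: str                      # "INSERT", "UPDATE", or "DELETE"
    table: str
    before: Optional[dict[str, Any]]    # None for INSERTs
    after: Optional[dict[str, Any]]     # None for DELETEs
    commit_scn: int                     # log position (e.g. Oracle SCN)
    timestamp: float = field(default_factory=time.time)

# A log record like "SCN: 12345678 | Operation: UPDATE" becomes:
event = ChangeEvent(
    operation="UPDATE",
    table="users",
    before={"status": "inactive"},
    after={"status": "active"},
    commit_scn=12345678,
)
assert event.operation == "UPDATE" and event.after["status"] == "active"
```

Keeping both before and after images is what allows downstream targets to apply UPDATEs and DELETEs correctly and in commit order.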

Source Database (Oracle, SQL Server, SAP HANA)
  Transaction: UPDATE users SET status='active'
    ↓
Transaction Log File (archived transaction records)
  SCN: 12345678 | LSN: 0x00000034 | Operation: UPDATE
    ↓
Data Sync CDC Agent (reads, compresses & streams)

Zero Database Impact

No queries run against production databases—log reading has minimal CPU/memory overhead.

Real-Time Capture

Sub-second latency from database commit to change event delivery to the cloud.

Complete Data Fidelity

Captures all operations including DELETEs and schema changes, with exact ordering preserved.

90% Data Compression Technology

Our intelligent compression algorithms reduce network bandwidth requirements by up to 90%, enabling cost-effective replication of large enterprise datasets to the cloud.

Compression Pipeline

Raw transaction data: 100 GB uncompressed → 10 GB compressed (90% reduction)

  • 90% bandwidth saved
  • 10x faster transfer
  • 85% cost reduction

Multi-Layer Compression Strategy

Data Sync employs multiple compression techniques optimized for database change data:

Columnar Compression

Similar values in column batches are compressed together using dictionary encoding and run-length encoding.
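A toy sketch of those two techniques, assuming a batch of values from one column (the code is illustrative, not Data Sync's implementation):

```python
def dictionary_encode(values):
    """Replace repeated values with small integer codes plus a lookup table."""
    table, codes = {}, []
    for v in values:
        codes.append(table.setdefault(v, len(table)))
    return list(table), codes

def run_length_encode(codes):
    """Collapse runs of identical codes into [code, run_length] pairs."""
    runs = []
    for c in codes:
        if runs and runs[-1][0] == c:
            runs[-1][1] += 1
        else:
            runs.append([c, 1])
    return runs

# A column batch from many change events often repeats the same few values.
statuses = ["active"] * 4 + ["inactive"] * 2 + ["active"]
table, codes = dictionary_encode(statuses)
runs = run_length_encode(codes)
assert table == ["active", "inactive"]
assert runs == [[0, 4], [1, 2], [0, 1]]   # 7 strings shrink to 3 pairs
```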

Delta Encoding

Only the changed columns are transmitted, not entire rows, drastically reducing payload size for UPDATEs.
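Delta encoding for an UPDATE can be sketched as a column-wise diff of the before and after row images (a simplified illustration):

```python
def column_delta(before: dict, after: dict) -> dict:
    """Keep only the columns whose value actually changed."""
    return {col: val for col, val in after.items() if before.get(col) != val}

before = {"id": 42, "status": "inactive", "email": "a@example.com", "score": 17}
after  = {"id": 42, "status": "active",   "email": "a@example.com", "score": 17}

delta = column_delta(before, after)
assert delta == {"status": "active"}   # one column transmitted instead of four
```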

LZ4 Stream Compression

A high-speed algorithm (LZ4, Snappy, or gzip) compresses data streams with minimal CPU overhead and sub-millisecond latency.
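The effect is easy to demonstrate on repetitive change-event payloads. This sketch uses Python's standard-library zlib (gzip-style DEFLATE) as a stand-in for LZ4, with the lowest level favoring speed:

```python
import zlib

# Highly repetitive change-event payload, typical of transaction-log batches.
payload = b'{"op":"UPDATE","table":"users","after":{"status":"active"}}' * 1000

compressed = zlib.compress(payload, level=1)  # level 1 trades ratio for speed
ratio = 1 - len(compressed) / len(payload)

assert zlib.decompress(compressed) == payload   # lossless round trip
assert ratio > 0.9                              # >90% reduction on this data
```

Real-world ratios depend on the data; repetitive structured change records are close to the best case.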

Schema-Aware Optimization

Data types are encoded efficiently—integers use variable-length encoding, strings use dictionary compression.
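Variable-length integer encoding can be sketched as LEB128-style encoding, where small values need fewer bytes (an illustration of the idea, not Data Sync's wire format):

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low bits first.
    The high bit of each byte signals whether more bytes follow."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set
        else:
            out.append(byte)
            return bytes(out)

assert len(encode_varint(7)) == 1       # vs. 8 bytes for a fixed-width int64
assert len(encode_varint(300)) == 2
assert len(encode_varint(2 ** 40)) == 6
```

Since most IDs, counters, and small numeric columns fit in one or two bytes, this compounds with the stream compression above.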

Result: A typical 1TB SAP HANA database replicates to the cloud using only ~100GB of network bandwidth per full sync.

Data Flow & Type Mapping

Automated schema discovery and intelligent data type mapping ensure seamless replication from enterprise databases to modern cloud platforms.

Source: SAP HANA → CDC Agent: Data Sync (parse log, map types, compress 90%, stream) → Target: Snowflake

  VARCHAR(100)  → VARCHAR(100)
  DECIMAL(18,2) → NUMBER(18,2)
  TIMESTAMP     → TIMESTAMP_NTZ
  BLOB          → BINARY

Automated Type Mapping Examples

Source Database | Source Type  | Target Platform | Target Type
SAP HANA        | VARCHAR(255) | Snowflake       | VARCHAR(255)
Oracle          | NUMBER(18,2) | BigQuery        | NUMERIC(18,2)
SQL Server      | DATETIME2    | Databricks      | TIMESTAMP
PostgreSQL      | JSONB        | Snowflake       | VARIANT
MySQL           | TINYINT(1)   | BigQuery        | BOOL
SAP HANA        | BLOB         | Databricks      | BINARY

Type mappings are detected and configured automatically during initial schema discovery; custom mappings can be defined for special cases.
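A lookup with custom overrides could be sketched as follows. The table below is a subset of the examples above; real discovery also handles precision, scale, and nullability:

```python
# Keys: (source_platform, source_type). Values: per-target type names.
TYPE_MAP = {
    ("SAP HANA", "VARCHAR(255)"):  {"Snowflake": "VARCHAR(255)"},
    ("Oracle", "NUMBER(18,2)"):    {"BigQuery": "NUMERIC(18,2)"},
    ("SQL Server", "DATETIME2"):   {"Databricks": "TIMESTAMP"},
    ("PostgreSQL", "JSONB"):       {"Snowflake": "VARIANT"},
    ("MySQL", "TINYINT(1)"):       {"BigQuery": "BOOL"},
    ("SAP HANA", "BLOB"):          {"Databricks": "BINARY"},
}

def map_type(source_db, source_type, target, overrides=None):
    """Resolve a target column type; custom mappings win over defaults."""
    if overrides and (source_db, source_type) in overrides:
        return overrides[(source_db, source_type)]
    return TYPE_MAP[(source_db, source_type)][target]

assert map_type("PostgreSQL", "JSONB", "Snowflake") == "VARIANT"
assert map_type("MySQL", "TINYINT(1)", "BigQuery",
                overrides={("MySQL", "TINYINT(1)"): "INT64"}) == "INT64"
```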

Technical Architecture

Enterprise-grade architecture designed for reliability, scalability, and security at every layer.

CDC Agent Layer

  • Deployed on-premises or in private cloud near source databases
  • Reads transaction logs using native database APIs
  • Minimal resource footprint (2-4 CPU cores, 4-8GB RAM)
  • High availability with automatic failover

Stream Processing

  • In-memory event buffering and batching
  • Real-time data transformation and enrichment
  • Schema evolution and type conversion
  • Guaranteed exactly-once delivery semantics
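The buffering and batching step can be sketched as a size-triggered micro-batcher (thresholds and the delivery stub are illustrative, not Data Sync's actual settings):

```python
class EventBatcher:
    """Buffer change events in memory and flush them in fixed-size batches."""

    def __init__(self, max_batch=3):
        self.max_batch = max_batch
        self.buffer = []
        self.flushed = []   # stands in for delivery to the cloud target

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.max_batch:
            self.flush()

    def flush(self):
        """Hand off the buffered events as one batch and reset the buffer."""
        if self.buffer:
            self.flushed.append(list(self.buffer))
            self.buffer.clear()

b = EventBatcher(max_batch=3)
for scn in range(7):
    b.add({"scn": scn})
b.flush()  # drain the tail batch
assert [len(batch) for batch in b.flushed] == [3, 3, 1]
```

A production batcher would also flush on a time threshold so low-traffic tables still see sub-second latency.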

Cloud Delivery

  • Native connectors for Snowflake, Databricks, BigQuery
  • Optimized bulk loading with COPY/MERGE operations
  • Automatic retry with exponential backoff
  • End-to-end encryption (TLS 1.3)

Security & Compliance

Built with enterprise security standards from the ground up. All data is encrypted in transit and at rest, with comprehensive audit logging.

SOC 2 Type II · GDPR Compliant · HIPAA Ready · ISO 27001

End-to-End Encryption

TLS 1.3 for all data in transit

Role-Based Access

Granular permissions & SSO

Audit Logging

Complete activity tracking

Private Networking

VPC peering & PrivateLink

Ready to Modernize Your Data Infrastructure?

Experience log-based CDC with 90% compression. Get your enterprise data flowing to the cloud in minutes, not months.

Trusted by enterprise data teams worldwide

No Production Impact
Sub-Second Latency
Enterprise Support