We are looking for a highly skilled Data Engineer with experience in building and managing data systems. You will design and maintain scalable data pipelines, enabling efficient data ingestion, storage, and processing across various platforms, while also optimizing data for analytical workloads.
Key Responsibilities:
• Create and manage scalable ETL/ELT pipelines, ensuring efficient data flow into SQL Server,
ClickHouse, and MinIO
• Develop CDC (Change Data Capture) pipelines using Kafka Connect, Debezium, and Schema
Registry
• Architect and maintain data storage solutions using MinIO for object storage and ClickHouse for
analytical queries, ensuring optimal performance for large-scale datasets.
• Implement and optimize real-time and batch data processing workflows using Kafka, Apache
Flink, and SQL Server
• Develop and manage OLAP systems using ClickHouse and SQL Server to support high-performance analytical queries
• Implement and enforce best practices for data security, integrity, and governance
• Partner with data scientists, analysts, and stakeholders to ensure that data infrastructure meets
the needs of business-critical analytics.
• Use Apache Airflow to orchestrate and automate data workflows, ensuring the reliability and
scalability of data pipelines
• Collaborate with Data team members to ensure data quality and availability
Qualifications:
• Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
• Experience:
o 3+ years of experience in data engineering.
o Experience in building and managing OLAP systems for large-scale analytical workloads.
o Experience in working with both SQL and NoSQL databases.
o Advanced Python and SQL programming skills for data processing and automation.
• Skills:
o Expertise in designing and maintaining data lakes, lakehouses, warehouses, and OLAP systems,
particularly with SQL Server, ClickHouse, and MinIO
o Experience with system architecture, optimization, and performance tuning
o Experience with ETL/ELT development using Python and orchestration tools like Apache
Airflow
o Deep knowledge of Kafka, Kafka Connect, and Schema Registry
o Experience with ETL/ELT pipelines and CDC (Change Data Capture) implementations.
o Experience with Spark or PySpark
o Familiarity with CI/CD pipelines and version control for managing data engineering
projects
o Familiarity with data governance, security, and compliance best practices
o Experience with Docker for containerization and Kubernetes for orchestration
o Working knowledge of monitoring/alerting using Grafana and Prometheus
o Strong problem-solving skills and the ability to work effectively in a collaborative
environment
Preferred Qualifications:
• Experience in database administration (DBA).
• Familiarity or experience with AI and ML
Submit your resume to Digikala