Salary: Not specified
The AWS Data Engineer will design, develop, and optimize data pipelines and manage ETL/ELT workflows for both structured and unstructured data. They will leverage various AWS services to implement data lake and data warehouse architectures while ensuring data quality and security.
Candidates should have strong programming skills in Python and PySpark, along with proficiency in SQL. Experience with AWS services and distributed systems is essential, as well as knowledge of ETL frameworks and data modeling techniques.
Job Summary
We are looking for an experienced AWS Data Engineer with strong expertise in Python and PySpark to design, build, and maintain large-scale data pipelines and cloud-based data platforms. The ideal candidate will have hands-on experience with AWS services, distributed data processing, and implementing scalable solutions for analytics and machine learning use cases.

Key Responsibilities
· Design, develop, and optimize data pipelines using Python, PySpark, and SQL (an illustrative sketch follows below).
· Build and manage ETL/ELT workflows for structured and unstructured data.
· Leverage AWS services (S3, Glue, EMR, Redshift, Lambda, Athena, Kinesis, Step Functions, RDS) for data engineering solutions.
· Implement data lake/data warehouse architectures and ensure data quality, consistency, and security.
· Work with large-scale distributed systems for real-time and batch data processing.
· Collaborate with data scientists, analysts, and business stakeholders to deliver high-quality, reliable data solutions.
· Develop and enforce data governance, monitoring, and best practices for performance optimization.
· Deploy and manage CI/CD pipelines for data workflows using AWS tools (CodePipeline, CodeBuild) or GitHub Actions.

Required Skills & Qualifications
· Strong programming skills in Python and hands-on experience with PySpark.
· Proficiency in SQL for complex queries, transformations, and performance tuning.
· Solid experience with the AWS cloud ecosystem (S3, Glue, EMR, Redshift, Athena, Lambda, etc.).
· Experience working with data lakes, data warehouses, and distributed systems.
· Knowledge of ETL frameworks, workflow orchestration (Airflow, Step Functions, or similar), and automation.
· Familiarity with Docker, Kubernetes, or containerized deployments.
· Strong understanding of data modeling, partitioning, and optimization techniques.
· Excellent problem-solving, debugging, and communication skills.
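For illustration only, the sketch below shows the general shape of the batch pipelines described in the responsibilities above: a PySpark job that reads raw JSON from an S3 data lake, applies basic cleansing and aggregation, and writes partitioned Parquet back to a curated zone for engines such as Athena or Redshift Spectrum. The bucket paths, column names (order_id, order_ts, amount, region), and table layout are hypothetical assumptions, not part of this posting.

```python
# Minimal sketch of a daily batch ETL job; paths and columns are hypothetical.
# Real pipelines would add schema enforcement, data-quality checks, and
# orchestration (e.g. Airflow or Step Functions).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

# Read raw JSON events from the data lake (assumed path).
raw = spark.read.json("s3://example-data-lake/raw/orders/")

# Cleanse and transform: drop malformed rows, derive a partition column,
# and aggregate to a daily grain per region.
daily = (
    raw.filter(F.col("order_id").isNotNull())
       .withColumn("order_date", F.to_date("order_ts"))
       .groupBy("order_date", "region")
       .agg(
           F.count("*").alias("order_count"),
           F.sum("amount").alias("total_amount"),
       )
)

# Write partitioned Parquet to the curated zone for downstream querying.
(
    daily.write
         .mode("overwrite")
         .partitionBy("order_date")
         .parquet("s3://example-data-lake/curated/orders_daily/")
)

spark.stop()
```

In practice a job like this would typically run on AWS Glue or EMR, with partitioning by date chosen to keep Athena and Redshift Spectrum scans cheap; the same pattern extends to streaming sources (Kinesis) with structured streaming.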
This job posting was last updated on 8/28/2025