Job Description Overview

Skills:
Databricks, AWS, SQL, Python, PySpark, Data Pipelines, API Development, Cloud Technologies, Data Integration, Troubleshooting

Location:
Remote

Experience:
6 years
We are seeking a talented Data Engineer with experience in Databricks, AWS, and Python to join our growing data team. In this role, you will be responsible for designing, developing, and maintaining scalable data pipelines and workflows that efficiently manage and process large datasets. You will work with cutting-edge tools like PySpark and AWS services to ensure seamless data integration, transformation, and storage. If you are passionate about working with cloud technologies, optimizing data workflows, and driving innovation in data engineering, this is an excellent opportunity for you.
Key Responsibilities:
As a Data Engineer, you will be responsible for:
Data Pipeline Development:
- Design, develop, and maintain robust data pipelines using Python and PySpark on Databricks to process and transform large datasets efficiently.
- Implement data workflows to ensure seamless data extraction, transformation, and loading (ETL) from source systems to target storage.
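The extract-transform-load shape described above can be sketched in plain Python; a production pipeline would use PySpark DataFrames on Databricks, but the staging of the steps is the same. The field names and the quarantine behavior here are illustrative assumptions, not part of the posting.

```python
# Minimal ETL sketch in plain Python. In production, extract/transform/load
# would be PySpark DataFrame reads, transformations, and writes; the field
# names ("user_id", "amount") are hypothetical.
import csv
import io

def extract(source: io.StringIO) -> list[dict]:
    """Read raw rows from a CSV source system."""
    return list(csv.DictReader(source))

def transform(rows: list[dict]) -> list[dict]:
    """Normalize types and drop rows that fail basic validation."""
    out = []
    for row in rows:
        try:
            out.append({"user_id": int(row["user_id"]),
                        "amount": round(float(row["amount"]), 2)})
        except (KeyError, ValueError):
            continue  # in practice, route bad rows to a quarantine table
    return out

def load(rows: list[dict], target: list) -> None:
    """Append transformed rows to the target store (a list stands in here)."""
    target.extend(rows)

raw = io.StringIO("user_id,amount\n1,19.991\n2,notanumber\n3,5.5\n")
warehouse: list[dict] = []
load(transform(extract(raw)), warehouse)
print(warehouse)  # the malformed row for user 2 is filtered out
```

The same three-stage split keeps each step independently testable, which matters when pipelines are monitored and retried in orchestration tools.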
Data Workflow Optimization:
- Work with AWS services like S3, Glue, and Redshift to manage and optimize data storage and processing workflows.
- Continuously monitor and improve the performance and scalability of data pipelines to handle large-scale datasets.
SQL Querying & Data Analysis:
- Write and optimize complex SQL queries for data extraction, transformation, and analysis.
- Ensure data quality and integrity through validation checks and troubleshooting of data pipeline issues.
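A small sketch of the kind of extraction query and validation check described above, using SQLite so it is self-contained; the table and column names are hypothetical, and equivalent SQL would run against Redshift.

```python
# Illustrative SQL extraction plus a simple data-quality check.
# SQLite stands in for the warehouse; "orders" and its columns are
# assumptions for the sake of the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'acme', 120.0), (2, 'acme', 80.0),
                              (3, 'globex', NULL), (4, 'globex', 45.0);
""")

# Aggregate extraction: total spend per customer, ignoring NULL amounts.
totals = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount IS NOT NULL
    GROUP BY customer
    ORDER BY customer
""").fetchall()
print(totals)  # [('acme', 200.0), ('globex', 45.0)]

# Validation check: count rows with missing amounts before loading downstream.
(null_count,) = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE amount IS NULL").fetchone()
print(null_count)  # 1
```

Running a null/duplicate count like this before a load step is a common, cheap guardrail against silently propagating bad data.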
API Integration:
- Collaborate with cross-functional teams to integrate APIs that make data more accessible to other teams.
- Assist in the development and integration of APIs for seamless data exchange across platforms.
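As one sketch of what consuming such an API can look like, the pagination loop below walks a paged endpoint until it is exhausted. The response shape (`records` / `next_page`) is a hypothetical assumption, and `fake_fetch` stands in for a real HTTP call (e.g. via `urllib` or `requests`).

```python
# Hypothetical sketch of pulling all records from a paginated data API.
# fake_fetch simulates GET /records?page=N; its payload shape is assumed.
from typing import Callable

def fake_fetch(page: int, page_size: int = 2) -> dict:
    """Stand-in for an HTTP GET returning one page of JSON records."""
    data = [{"id": i} for i in range(1, 6)]
    start = (page - 1) * page_size
    chunk = data[start:start + page_size]
    more = start + page_size < len(data)
    return {"records": chunk, "next_page": page + 1 if more else None}

def fetch_all(fetch: Callable[[int], dict]) -> list[dict]:
    """Follow next_page links until the API reports no more pages."""
    records, page = [], 1
    while page is not None:
        payload = fetch(page)
        records.extend(payload["records"])
        page = payload["next_page"]
    return records

rows = fetch_all(fake_fetch)
print(len(rows))  # 5
```

Keeping the fetch function injectable, as here, makes the pagination logic testable without network access.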
Troubleshooting & Issue Resolution:
- Troubleshoot issues related to data pipelines, ensuring smooth and uninterrupted data processing and storage operations.
- Resolve data quality issues and optimize performance through debugging, testing, and implementing fixes.
Collaboration & Continuous Improvement:
- Work closely with other data engineers, data scientists, and business teams to deliver high-quality, actionable data insights.
- Contribute to improving the team's data engineering practices and workflows.