This job is expired.

Lead Data Engineer - Python/Azure Databricks, Pune

Last update 2024-12-02
Expires 2024-12-01
ID #2452719777
Free
India, Maharashtra, Pune
Modified November 14, 2024

Description

If you are passionate about building state-of-the-art data platforms and powering the next generation of compute and AI applications, we'd love to hear from you.

This is an exciting opportunity to apply your expertise in distributed computing frameworks and make a significant impact as we push the boundaries of supply chain planning and, eventually, the adoption of Gen AI.

JOB BRIEF: We are seeking an experienced Data Engineer to join our team and lead the development of a cutting-edge data platform.

The platform will leverage distributed computing frameworks such as Apache Spark, Databricks, and Snowflake to enable near-real-time supply chain planning and, eventually, advanced analytics and data insights through the adoption of Generative AI (Gen AI) technologies across our product base.

KEY RESPONSIBILITIES: As a Lead Data Engineer, the candidate would be responsible for:

- Design and build a highly scalable, fault-tolerant data platform optimized for distributed computing and large-scale data processing

- Implement data pipelines and ETL/ELT processes using distributed computing frameworks to efficiently ingest, transform, and load massive datasets from various sources

- Leverage cloud data platforms to enable seamless data sharing, near-zero maintenance, and fast analytics on structured and semi-structured data

- Collaborate with data scientists, machine learning engineers, and software developers to understand data requirements and build solutions to power Gen AI applications

- Optimize distributed computing jobs and queries for maximum performance and cost efficiency

- Implement data governance, security, and compliance best practices

- Provide guidance on distributed computing architecture and mentor junior data engineers

QUALIFICATIONS

- 5 years of experience as a Data Engineer working with big data technologies.

- Strong proficiency in SQL, Python programming and data modeling techniques.

- Deep expertise in distributed computing principles and frameworks (e.g., Apache Spark), including SQL, streaming, and optimizing jobs for scale and efficiency.

- Hands-on experience with developing and deploying distributed computing applications using cloud-based platforms (e.g., AWS EMR, Azure HDInsight, or equivalent).

- Strong understanding of cloud data platform architectures and best practices for ELT/ETL, data sharing, and query optimization (e.g., AWS Athena, AWS Glue, Azure Synapse Analytics, or equivalent).

- Experience enabling application engineers to build applications leveraging the data platform through APIs and abstractions.

- Experience with orchestration frameworks like Apache Airflow and data streaming technologies like Kafka.

- Experience building and optimizing data pipelines for machine learning applications.

- Knowledge of data modeling, data warehousing, and schema design.

- Familiarity with public cloud platforms such as AWS, Azure, or GCP.

- Excellent problem-solving and communication skills.

- Bachelor's or Master's degree in Computer Science (preferred), Engineering, or a related field.

EXPERIENCE: 5 Years (ref:hirist.tech)

Job details:

Job type: Full time
Contract type: Permanent
Salary type: Monthly
Occupation: Lead Data Engineer - Python/Azure Databricks
