Job Description:

Responsibilities:

  • Build data pipelines to enable training of LLMs
  • Work with technical and non-technical stakeholders to collect requirements, integrate with data systems and deliver large-scale datasets
  • Operate in a complex multi-cloud data lakehouse, building end-to-end integrated workflows
  • Work with privacy, security, and other policy stakeholders to design, review and implement solutions to automate compliant ETL for downstream teams

Requirements:

  • Availability for long-term business travel, spending most of the year in the U.S., working in the customer’s office
  • Strong understanding of machine learning principles, especially in the context of LLMs
  • 5+ years of experience in big data engineering
  • Proficiency in Java, Scala, Spark
  • Proficiency in PySpark; Go (Golang) preferred
  • Experience building and leveraging large-scale data infrastructure: MapReduce/Hadoop, cloud-native deployments
  • Experience with alerting, monitoring, and remediation automation in a large-scale distributed environment
  • BS/BA or equivalent degree in computer science or a similar field (preferred)

We offer:

  • Opportunity to work on bleeding-edge projects
  • Work with a highly motivated and dedicated team
  • Competitive salary
  • Flexible schedule
  • Benefits package – medical insurance, sports
  • Corporate social events
  • Professional development opportunities
  • Well-equipped office

