Lead Data Engineer with AI experience

Jobgether · India

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Lead Data Engineer with AI experience based in India.

This role sits at the core of modern AI and data transformation initiatives, building the foundational infrastructure that powers next-generation intelligent systems. You will design and operate scalable data pipelines, retrieval systems, and ML/LLMOps frameworks that enable advanced AI applications, including conversational agents, RAG systems, and predictive models. The work spans both classical data engineering and cutting-edge AI infrastructure, requiring strong architectural thinking and hands-on execution.
You will collaborate with cross-functional engineering and AI teams to translate reference architectures into production-grade systems that are reliable, scalable, and efficient.
Your contributions will directly influence the performance, accuracy, and scalability of AI-driven products used in real-world enterprise environments.
The role offers exposure to agentic systems, semantic data layers, and advanced retrieval architectures at scale.
It is a highly technical and impact-driven position where engineering excellence and AI innovation intersect.

Accountabilities

Data Pipeline Engineering: Build, optimize, and maintain robust batch and streaming data pipelines using modern cloud-native tools such as Snowflake, PySpark, Delta Lake, and Kafka, ensuring reliability, scalability, and performance.
RAG & Retrieval Infrastructure: Design and implement end-to-end retrieval systems including embedding pipelines, vector databases, hybrid search, chunking strategies, and ranking mechanisms to optimize AI context relevance.
Semantic & Knowledge Layer Development: Develop ontologies, entity mappings, and knowledge graphs while maintaining semantic contracts, metadata systems, and lineage tracking for AI and ML use cases.
ML/LLMOps Enablement: Support ML and LLM lifecycle workflows including dataset curation, feature engineering, model evaluation, experiment tracking, and production monitoring.
Agentic Data Systems: Build APIs, context stores, and tool interfaces that enable autonomous agents, including observability for reasoning traces, tool calls, and contextual outputs.
Governance & Data Quality: Implement robust data governance frameworks including RBAC, PII handling, schema validation, data quality monitoring, and compliance-ready audit logging systems.

Requirements

This role requires a highly experienced data engineering professional with strong cloud, distributed systems, and AI infrastructure expertise. The ideal candidate combines deep technical execution with architectural thinking and hands-on experience building production-grade AI-enabled data systems.

7+ years of experience in data engineering with strong exposure to cloud-based data platforms.
2+ years of experience building production AI/ML or LLM-related data infrastructure at scale.
Strong expertise in Python, SQL, PySpark, Snowflake, Delta Lake, Kafka, and Spark Structured Streaming.
Hands-on experience with vector databases, embedding pipelines, and retrieval systems in production RAG environments.
Solid understanding of MLOps practices including MLflow, CI/CD for ML systems, and automated evaluation frameworks.
Strong knowledge of data governance, security, compliance, and data quality frameworks.
Experience working with cloud ecosystems such as AWS or Azure and containerized environments (Docker, Kubernetes).
Familiarity with AI/LLM tooling such as LangChain, LlamaIndex, OpenAI/Claude/Bedrock APIs, and FastAPI is a plus.
Strong problem-solving mindset with the ability to design scalable systems and operate in fast-moving AI environments.

Benefits

Competitive compensation package aligned with experience and market standards
Remote-friendly or hybrid work flexibility depending on team structure
Opportunity to work on cutting-edge AI, LLM, and agentic systems
Exposure to global engineering teams and enterprise-scale AI transformation projects
Health, insurance, and wellness benefits (as per policy and location)
Learning and development support for advanced AI and data engineering skills
Access to modern cloud-native and AI-first technology stacks
Collaborative, engineering-driven culture focused on innovation and impact

Data & ML pay context

Based on 1,357 disclosed Data & ML salaries on RoleSuite, roles in this category pay a median of $165K/year, with most offers between $128K and $209K (10th–90th percentile: $106K–$246K).

