Data Engineer

Jobgether · UK

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Data Engineer based in the United Kingdom.

This role focuses on rebuilding trust in a complex, regulated data environment where existing pipelines are not yet reliable, reproducible, or fully validated. You will be responsible for transforming a newly centralized data lake into a robust, analytics-ready foundation that supports downstream data science and risk modelling use cases. Working within a regulated credit and lending context, you will design and enforce strong data quality, lineage, and governance standards across multiple source systems. The role requires deep hands-on engineering across AWS, Spark, and modern data tooling, with a strong emphasis on correctness, auditability, and reproducibility. You will collaborate closely with data science and engineering stakeholders to define harmonized data models and prepare feature-ready datasets. This is a high-impact foundational role where your work directly enables reliable decision-making in a financial risk environment.

Accountabilities:

Rebuild and validate data pipelines to ensure full reproducibility of reporting and descriptive statistics across all datasets
Profile, reconcile, and harmonize heterogeneous source schemas across multiple business entities into a unified data model
Design and implement dbt-based data models (staging, intermediate, and marts) with strong testing and validation layers
Develop and maintain data quality frameworks using tools such as Great Expectations and dbt tests to enforce reliability
Build and implement entity resolution and record linkage logic across fragmented customer and account datasets
Ensure robust anonymization and pseudonymization processes that meet regulatory and compliance requirements
Optimize large-scale Spark-based processing jobs, including partitioning strategies, file formats, and cost-efficient compute usage
Orchestrate production-grade pipelines using tools such as Airflow or AWS Step Functions
Deliver clean, documented, and feature-ready datasets for downstream data science and risk modelling teams
Create clear technical documentation and runbooks to support operational handover and long-term maintainability

Requirements:

4+ years of professional experience in data engineering with strong exposure to large-scale AWS and Spark environments
Advanced proficiency in SQL and Python for data processing and transformation at scale
Strong experience with AWS data services including S3, Glue, Athena, Redshift, EMR, and orchestration tools
Proven experience building and maintaining data models using dbt or similar frameworks
Hands-on experience with data quality, validation, and testing frameworks such as Great Expectations
Strong understanding of data governance, lineage, and reproducibility in production environments
Experience with entity resolution, deduplication, or record linkage across multiple data sources
Familiarity with anonymization and pseudonymization techniques in regulated environments
Experience working in regulated industries such as BFSI, healthcare, or government is highly valued
Ability to work independently or as a lead engineer within a small, fast-moving delivery team
Strong written and verbal communication skills in English, with the ability to document and explain complex systems clearly

Benefits:

Competitive compensation package aligned with experience and impact
Remote-friendly working arrangements within Europe
Opportunity to work on a high-impact, regulated data transformation project
Exposure to modern AWS data architecture and large-scale Spark processing environments
Direct collaboration with data science and engineering leadership on meaningful analytics use cases
Strong autonomy in shaping data foundations and engineering standards
Opportunity to build robust, production-grade systems from an early-stage data estate
International, collaborative environment with distributed teams

Data & ML pay context

Based on 1,433 disclosed Data & ML salaries on RoleSuite, roles in this category pay a median of $162K/year, with most offers between $127K and $203K (10th–90th percentile: $105K–$245K).

