Data Engineer, Specialist

Vanguard · Malvern, PA

    This role provides the foundational data layer for enterprise risk visibility and Responsible AI at scale. By delivering accurate, integrated, and timely data products, you help ensure Vanguard can monitor AI systems, track inventory, and manage supplier egress risk with confidence and precision. As a Data Engineer, you will design, develop, and maintain data pipelines and data products that enable enterprise insights, monitoring, and governance across Supplier Egress, the Enterprise GenAI Monitoring Function, and the GenAI Inventory.

    This role focuses on executing ETL (Extract, Transform, Load) processes and integrating data from disparate systems, tools, and platforms to create reliable, scalable, and high-quality data assets. These data products support critical business capabilities including third-party risk visibility, AI system monitoring, inventory tracking, and governance reporting. 

     

    Role Context (Why This Role Exists) 

    This role sits at the intersection of three critical enterprise capabilities: 

    1. Supplier Egress & Insights 

    • Build datasets that unify data across internal systems and third-party integrations 

    • Enable visibility into where and how sensitive data (e.g., PII) is shared externally 

    2. Enterprise GenAI Monitoring Function 

    • Enable ingestion, transformation, and aggregation of monitoring metrics, alerts, and system performance data 

    • Contribute to standardized, scalable monitoring datasets used for governance and incident management 

    3. GenAI Inventory 

    • Build and maintain datasets supporting the enterprise inventory of AI systems (e.g., MGM, ServiceNow integrations) 

    • Enable tracking of system attributes (risk tier, ownership, compliance status, lifecycle stage) 

    • Support reconciliation and alignment across multiple systems of record 

    Key responsibilities  

    • Integrate data from multiple enterprise systems (e.g., monitoring platforms, inventory systems, third-party tools)  

    • Translate requirements into queries, datasets, and reports 

    • Support both batch and near real-time data processing use cases  

    • Ensure pipelines are scalable, performant, and resilient 

    • Implement data validation, cleansing, and transformation logic to ensure accuracy and completeness 

    • Translate use cases (e.g., monitoring dashboards, inventory reporting) into data models and pipelines 

    • Support iterative development and refinement of data products 

     

    Required Skills 

    • SQL for querying, transformation, and reporting

    • Experience with Python or ETL tools for data preparation

    • Familiarity with Data Platforms and reporting tools 

    • Data quality and data validation implementations 

    • Familiarity with version control and deployment pipelines  

    An informational session will be held on 6/3 at 12:00 PM

    Microsoft Teams meeting  

    Meeting ID: 273 136 031 135 663  

    Passcode: 72g4gJ2U  

     

    Special Factors

    Sponsorship

    Vanguard is not offering visa sponsorship for this position.

    About Vanguard

    At Vanguard, we don't just have a mission—we're on a mission.

    To work for the long-term financial wellbeing of our clients. To lead through products and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.

    How We Work

    Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.

    Data & ML pay context

    Based on 1,491 disclosed Data & ML salaries on RoleSuite, the role pays a median of $161K/year, with most offers between $127K and $200K (10th–90th percentile: $102K–$245K).
