Data Engineer, Specialist
This role enables the foundational data layer for enterprise risk visibility and Responsible AI at scale. By delivering accurate, integrated, and timely data products, you help ensure Vanguard can monitor AI systems, track inventory, and manage supplier egress risk with confidence and precision. As a Data Engineer, you will design, develop, and maintain data pipelines and data products that enable enterprise insights, monitoring, and governance across Supplier Egress, the Enterprise GenAI Monitoring Function, and the GenAI Inventory.
This role focuses on executing ETL (Extract, Transform, Load) processes and integrating data from disparate systems, tools, and platforms to create reliable, scalable, and high-quality data assets. These data products support critical business capabilities including third-party risk visibility, AI system monitoring, inventory tracking, and governance reporting.
Role Context (Why This Role Exists)
This role sits at the intersection of three critical enterprise capabilities:
1. Supplier Egress & Insights
Build datasets that unify data across internal systems and third-party integrations
Enable visibility into where and how sensitive data (e.g., PII) is shared externally
2. Enterprise GenAI Monitoring Function
Enable ingestion, transformation, and aggregation of monitoring metrics, alerts, and system performance data
Contribute to standardized, scalable monitoring datasets used for governance and incident management
3. GenAI Inventory
Build and maintain datasets supporting the enterprise inventory of AI systems (e.g., MGM, ServiceNow integrations)
Enable tracking of system attributes (risk tier, ownership, compliance status, lifecycle stage)
Support reconciliation and alignment across multiple systems of record
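As a rough illustration of what reconciliation across systems of record can involve, the sketch below compares two inventories keyed by a shared identifier. The `InventoryRecord` fields, the `reconcile` function, and the sample identifiers are all hypothetical; they are not taken from any Vanguard system or from the tools named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryRecord:
    # Hypothetical attributes; real systems of record will differ.
    system_id: str
    risk_tier: str
    owner: str


def reconcile(source_a, source_b):
    """Compare two lists of records keyed by system_id.

    Returns the ids present in only one source and the ids whose
    attributes disagree between the two sources.
    """
    a = {r.system_id: r for r in source_a}
    b = {r.system_id: r for r in source_b}
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "mismatched": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


if __name__ == "__main__":
    a = [InventoryRecord("ai-1", "high", "team-x"),
         InventoryRecord("ai-2", "low", "team-y")]
    b = [InventoryRecord("ai-2", "medium", "team-y"),
         InventoryRecord("ai-3", "low", "team-z")]
    print(reconcile(a, b))
```

In practice this kind of comparison is often written in SQL as a full outer join; the Python version is shown only to keep the example self-contained.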
Key Responsibilities
Integrate data from multiple enterprise systems (e.g., monitoring platforms, inventory systems, third-party tools)
Translate requirements into queries, datasets, and reports
Support both batch and near real-time data processing use cases
Ensure pipelines are scalable, performant, and resilient
Implement data validation, cleansing, and transformation logic to ensure accuracy and completeness
Translate use cases (e.g., monitoring dashboards, inventory reporting) into data models and pipelines
Support iterative development and refinement of data products
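For candidates wondering what validation, cleansing, and transformation logic might look like day to day, here is a minimal sketch under stated assumptions. The function name `validate_and_clean` and the field names `system_id` and `risk_tier` are illustrative only and do not describe an actual Vanguard pipeline.

```python
def validate_and_clean(rows, required=("system_id", "risk_tier")):
    """Trim string values and split rows into valid and rejected sets.

    A row is rejected when any required field is missing or empty after
    trimming. Rejected rows keep their original form plus a reason, so
    they can be reported rather than silently dropped.
    """
    valid, rejected = [], []
    for row in rows:
        # Cleansing step: strip surrounding whitespace from string values.
        cleaned = {k: v.strip() if isinstance(v, str) else v
                   for k, v in row.items()}
        # Validation step: every required field must be present and non-empty.
        missing = [field for field in required if not cleaned.get(field)]
        if missing:
            rejected.append((row, "missing: " + ", ".join(missing)))
        else:
            valid.append(cleaned)
    return valid, rejected
```

Keeping rejected rows alongside a reason is one possible design choice; it supports the completeness checks mentioned above by making data gaps visible in reporting.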
Required Skills
SQL for querying, transformation, and reporting
Experience with Python or ETL tools for data preparation
Familiarity with data platforms and reporting tools
Experience implementing data quality and data validation checks
Familiarity with version control and deployment pipelines
An informational session will be held on 6/3 at 12:00 PM
Microsoft Teams meeting
Meeting ID: 273 136 031 135 663
Passcode: 72g4gJ2U
Special Factors
Sponsorship
Vanguard is not offering visa sponsorship for this position.
About Vanguard
At Vanguard, we don't just have a mission—we're on a mission.
To work for the long-term financial wellbeing of our clients. To lead through products and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.
How We Work
Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.
Data & ML pay context
Based on 1,491 disclosed Data & ML salaries on RoleSuite, the role pays a median of $161K/year, with most offers between $127K and $200K (10th–90th percentile: $102K–$245K).