Data Architect (Managed Services)

AvePoint · Singapore

ROLE OVERVIEW

We are looking for a seasoned Data Architect with deep expertise in enterprise data warehousing, ETL/ELT pipeline development, and business intelligence. You will be responsible for the end-to-end design and implementation of our data architecture — from data ingestion and cleansing to dimensional modeling, warehouse construction, and BI dashboard delivery. You will also establish robust data governance frameworks including data catalogs, lineage tracking, and security audit systems. This role requires strong hands-on capability combined with architectural vision, and prior experience in complex industries such as finance, international trade, or manufacturing is highly valued.

KEY RESPONSIBILITIES

Data Acquisition, Cleansing & Integration

Design and implement enterprise-wide data collection strategies across heterogeneous source systems (ETRM, ERP, CRM, external APIs, flat files, and databases).
Develop robust data cleansing and standardization pipelines to ensure data accuracy, consistency, and completeness across all data sources.
Build data integration frameworks to consolidate data from disparate systems into a unified enterprise data warehouse, handling schema evolution and data drift.
Implement Change Data Capture (CDC) mechanisms and incremental data loading strategies to ensure near real-time data freshness.

Data Warehouse Architecture & Modeling

Lead the design and construction of the enterprise data warehouse (EDW) using industry-proven methodologies (Inmon, Kimball, or Data Vault).
Apply dimensional modeling expertise to design star schemas, snowflake schemas, and constellation models optimized for analytical query performance.
Architect data marts for specific business domains (finance, sales, supply chain, operations) with clear separation of concerns.
Select and implement appropriate data warehouse technologies based on workload characteristics: batch analytics (Hive, Spark SQL), real-time analytics (ClickHouse, Apache Doris, StarRocks), or MPP databases (Greenplum, Vertica).
Design layered data architectures (ODS → DWD → DWS → ADS / Bronze → Silver → Gold) with clear data flow and transformation logic at each layer.

ETL/ELT Pipeline Development

Design, build, and maintain scalable ETL/ELT pipelines using Apache Spark, Flink, Hive, MapReduce, or proprietary ETL tools (Informatica, Talend, Kettle).
Implement workflow orchestration using Apache Airflow, DolphinScheduler, Azkaban, or Oozie to schedule, monitor, and manage data pipelines.
Establish data quality frameworks: automated validation rules, anomaly detection, data profiling, and quality scorecards to ensure pipeline reliability and data trustworthiness.
Build comprehensive monitoring and alerting systems for pipeline health, data freshness, and SLA compliance.
Optimize pipeline performance through partitioning, bucketing, indexing, and query tuning strategies.

Business Intelligence & Reporting

Develop enterprise BI reporting systems and interactive dashboards using tools such as Apache Superset, FineBI, Tableau, Power BI, or Quick BI.
Design executive dashboards, operational reports, and self-service analytics interfaces tailored to business stakeholders across finance, trading, operations, and management.
Implement metrics frameworks (KPIs, OKRs) and ensure data definitions are standardized, documented, and consistently applied across all reports.
Work directly with business units to understand analytical requirements and translate them into effective data models and visualizations.

Data Governance & Standards

Design and implement enterprise data governance frameworks from the ground up, including data ownership policies, stewardship roles, and governance workflows.
Build a centralized data catalog (using Apache Atlas, DataHub, Amundsen, or similar) documenting all data assets, business glossaries, and metadata.
Implement data lineage analysis systems to track data flow from source to consumption, enabling impact analysis and root cause diagnosis.
Establish data security and audit frameworks: access control (RBAC/ABAC), data masking, encryption policies, sensitive data discovery, and compliance audit trails.
Define and enforce data standards: naming conventions, data dictionaries, master data management (MDM) rules, and data quality SLAs.
Ensure regulatory compliance with data protection laws (GDPR, PIPL, industry-specific regulations) through governance controls and documentation.

REQUIRED QUALIFICATIONS

Bachelor's degree or above in Computer Science, Statistics, Mathematics, Information Systems, or a related technical field.
3+ years of professional experience in data development, data engineering, or data architecture roles, with a proven track record of building production data warehouse systems.
Expert-level proficiency in SQL (complex queries, window functions, CTEs, query optimization) and Python for data processing and pipeline development.
Mastery of data warehouse modeling theories and methodologies: dimensional modeling (Kimball), normalized modeling (Inmon), and Data Vault 2.0. Demonstrated experience designing star schemas, snowflake schemas, and aggregate tables for complex business domains.
Deep domain experience in at least one complex industry: financial services (banking, securities, insurance), international trade (import/export, supply chain, logistics), or manufacturing (production planning, quality control, industrial IoT). Understanding of industry-specific data entities, business processes, and analytical requirements.
Strong hands-on experience with ETL/ELT tools and frameworks: Apache Spark (batch and streaming), Apache Flink, Hive, MapReduce, Sqoop, DataX, SeaTunnel, or commercial ETL platforms (Informatica PowerCenter, Talend, IBM DataStage).
Proficiency in workflow orchestration: Apache Airflow, DolphinScheduler, Azkaban, Oozie, or equivalent scheduling and monitoring platforms.
Production experience with at least one major data warehouse/analytics database: ClickHouse, Apache Doris, StarRocks, Hive, Greenplum, Vertica, Teradata, or Snowflake.
Strong hands-on experience with BI and visualization tools: Apache Superset, FineBI, Tableau, Power BI, Quick BI, or similar enterprise BI platforms.
Deep expertise in Data Governance with demonstrated ability to build governance systems from scratch: data catalog implementation, metadata management, automated lineage tracking (using OpenLineage, Marquez, or similar), data quality monitoring, and security audit frameworks.
Solid understanding of relational and analytical databases: PostgreSQL, MySQL, Oracle, SQL Server, Greenplum — including performance tuning, indexing strategies, and query optimization.
Experience with big data ecosystem components: HDFS, YARN, ZooKeeper, Kafka, and Hadoop distributions (CDH, HDP, Apache).

PREFERRED QUALIFICATIONS

Experience with master data management (MDM) platforms and methodologies.
Familiarity with data mesh or data fabric architectural patterns for large-scale decentralized data environments.
Experience with data versioning and time-travel capabilities in modern data warehouses.
Domain knowledge of energy trading and market risk management.
Professional certifications: CDMP (Certified Data Management Professional), DAMA-DMBOK, or vendor-specific data platform certifications.

Any personal data you share with us during the application process will be processed strictly in compliance with applicable data protection laws and our Privacy Notice.

Data & ML pay context

Based on 1,573 disclosed Data & ML salaries on RoleSuite, the role pays a median of $162K/year, with most offers between $127K and $202K (10th–90th percentile: $103K–$246K).
