Mid-Level Data Engineer (Pessoa Engenheiro de Dados Pleno)
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Mid-Level Data Engineer based in Brazil.
This role is an opportunity for a data professional who wants to work at the intersection of data engineering, AI, and modern cloud architecture. You will be responsible for building and evolving scalable data pipelines that power analytics, machine learning, and generative AI solutions. The position plays a key role in ensuring data reliability, governance, and accessibility across structured and unstructured sources. You will collaborate closely with data scientists, ML engineers, and business teams to enable data-driven decision-making at scale. The environment is fast-paced, collaborative, and highly focused on innovation, with strong exposure to AI-native architectures such as RAG and vector databases. Your work will directly impact the efficiency, intelligence, and scalability of next-generation data products.
Accountabilities:
- Design, build, and maintain scalable batch and streaming data pipelines, ensuring high performance, reliability, and cost efficiency across data workflows.
- Develop and optimize ETL/ELT processes, ensuring data quality, integrity, and consistency across multiple systems and sources.
- Build and maintain data infrastructure supporting AI and Machine Learning pipelines, including structured and unstructured data processing.
- Implement and improve data governance practices, ensuring secure, well-documented, and reusable data assets for analytics and AI use cases.
- Work closely with Data Science and ML Engineering teams to enable efficient data consumption for predictive models and LLM-based solutions.
- Monitor, troubleshoot, and optimize data pipelines and queries, continuously improving performance and reducing operational costs.
Requirements:
- Solid experience as a Data Engineer (approximately 2–4 years), with a strong background in building and maintaining data pipelines in production environments.
- Strong proficiency in SQL (query optimization, modeling, and performance tuning) and Python for data manipulation (e.g., Pandas, PySpark).
- Experience with cloud platforms such as AWS, GCP, or Azure, and data warehouse solutions like BigQuery, Redshift, or Snowflake.
- Hands-on experience with data orchestration tools such as Apache Airflow.
- Knowledge of relational and NoSQL databases, as well as APIs and system integrations.
- Experience working with unstructured data (text, documents, images) and vector databases (e.g., Pinecone, Milvus, Weaviate, pgvector) for semantic search and RAG architectures.
- Familiarity with NLP concepts, embeddings, and modern AI/ML data pipelines.
- Strong communication skills, with the ability to translate technical concepts into business-oriented insights.
- Proactive, ownership-driven mindset with strong problem-solving, adaptability, and collaboration skills.
Benefits:
- Competitive compensation with PJ (contractor) model
- Flexible and remote-first work environment
- Health plan, dental plan, telemedicine, and life insurance
- Multibenefits card (Flash) for flexible usage
- Gympass access for wellness and fitness
- Paid rest period and birthday day off
- Training and development programs (Academia X)
- Structured career growth and continuous learning opportunities
- Collaborative and innovation-driven culture focused on AI and data excellence
Data & ML pay context
Based on 1,491 disclosed Data & ML salaries on RoleSuite, the role pays a median of $161K/year, with most offers between $127K and $200K (10th–90th percentile: $102K–$245K).