Machine Learning Engineer, Trust & Safety
Responsibilities
Build, deploy, and maintain end-to-end machine learning models that detect policy-violating actors, remove harmful content, and keep users safe.
Own medium-sized projects end-to-end (from problem scoping and data collection to training, deployment, monitoring, and iteration).
Contribute to scalable inference and data pipelines (e.g., Spark, Kubernetes), including preprocessing, batch and real-time inference, and post-processing components.
Apply standardized performance metrics, testing protocols, and evaluation processes to measure model effectiveness and identify risks.
Continuously assess and refine deployed models using user feedback, business impact signals, and emerging policy and ethical considerations.
Collaborate closely with Data Scientists, Data Engineers, Product Managers, Backend Engineers, and the AI Platform team to ship coordinated user safety improvements.
Stay current on advances in AI/ML, particularly LLMs and evaluation methods, and apply appropriate techniques to detection and user safety problems.
What We're Looking For
Strong programming skills: Proficiency in Python and SQL, comfort with at least one ML stack (e.g., PyTorch, Hugging Face Transformers), and a working understanding of data pipelines.
Domain expertise: Solid understanding of machine learning, deep learning, and emerging AI techniques. Track record of building, debugging, and fine-tuning ML models for user-facing products. Experience with content classification, fraud detection, or related Trust & Safety problems is a plus.
System design & architecture: Experience training and deploying ML models in production. Working understanding of distributed computing for inference and training.
AI application: Hands-on experience with AI agents, RAG, structured outputs, and function/tool calling, plus sound judgment about when to reach for an agent or LLM versus a classical ML approach.
Evaluation frameworks for ML and LLM systems: Comfort designing and running evaluations to measure model effectiveness, fairness, and safety across both classical ML and LLM systems. Familiarity with golden datasets, LLM-as-judge patterns, precision/recall/F1, false positive rates, hallucination rates, and offline/online metrics.
Cloud and data platform proficiency: Hands-on with at least one cloud environment (GCP, AWS, or Azure). Familiarity with Databricks, Ray, or Kubeflow is a plus.
Data engineering knowledge: Comfortable handling large datasets, including cleaning, preprocessing, and storage, and contributing to batch and streaming pipelines orchestrated with tools like Databricks or Argo.
Project ownership: Demonstrated ability to drive medium-sized projects to completion, identify and unblock yourself on technical issues, and ship measurable outcomes with limited oversight.
Collaboration and communication skills: The ability to work effectively in a team and communicate complex ideas clearly to individuals from diverse technical and non-technical backgrounds.
Strong written communication: The ability to convey complex ideas and technical knowledge through clear, well-structured writing.
2+ years of experience as a machine learning engineer, applied scientist, or data scientist (depending on education).
1+ years of experience applying end-to-end machine learning models in an industry setting — including data collection, model training, deployment, and monitoring.
Experience integrating or evaluating LLMs in real-world applications with appropriate baseline metrics and evaluation methodologies is a plus.
Familiarity with at least one ML infrastructure component (feature store, training environment, model serving, observability, workflow orchestrator).
Previous exposure to Trust & Safety, fraud detection, content classification, or compliance is preferred but not required.
Demonstrated use of modern AI tooling in your day-to-day workflow (e.g., Cursor, Claude Code, Codex).
A degree in computer science, engineering, or a related field (or equivalent practical experience).