Updated 2026-06-29 20:00 UTC · © 2025–2026 RoleSuite

Member of Technical Staff (Data Scientist, Evals)

Perplexity · San Francisco

Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and our specialized data sources. We aim to use the latest models as they are released, but the intelligence frontier is a jagged one, and popular benchmarks do not effectively cover our use cases. In this role, you will build specialized evals to improve answer quality across Perplexity, covering search-based LLM answers and other scenarios popular with our users.

Responsibilities

  • Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness

  • Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality

  • Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices

  • Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements

  • Operate within a small, high-impact team where your evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve answer quality

Qualifications

  • PhD or MS in a technical field or equivalent experience

  • 4+ years of experience in data science or machine learning

  • Strong proficiency in Python and SQL (expected to write production-grade code)

  • Experience building within a modern cloud data stack, specifically AWS and Databricks

  • Comfortable with agentic coding workflows and using AI-assisted development tools to iterate faster

Preferred Qualifications

  • 1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups

  • Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale

  • A strong research background, with experience applying research methods to real-world ML problems

  • Experience defining evaluation metrics (e.g., factual consistency, hallucination rate, retrieval precision) and building ground truth datasets

Data & ML pay context

Based on 1,494 disclosed Data & ML salaries on RoleSuite, Data & ML roles pay a median of $161K/year, with most offers between $127K and $200K (10th–90th percentile: $102K–$245K).

This posting lists $200K–$300K, above the $161K market median.
