Updated 2026-07-04 01:00 UTC · © 2025–2026 RoleSuite

Product Data Scientist — AI Evaluation & Quality

PNL · Berlin

About Finom

Finom is a European tech startup headquartered in Amsterdam, and we’re on a journey towards revolutionizing the financial landscape for entrepreneurs worldwide. Our mission is to develop an all-in-one financial B2B solution that integrates banking functions, accounting, financial management, and invoicing into a seamless, mobile-first platform.

We recently closed a €115 million Series C equity round (around $133 million), bringing our total funding to approximately $346 million. This significant investment follows a $105 million growth funding round from General Catalyst, a long-term backer since 2021 known for supporting companies like Airbnb, HubSpot, KAYAK, and Stripe.

Finom's platform goes beyond traditional banking, offering invoicing and a growing suite of features, including AI-enabled accounting, aiming to simplify financial management for entrepreneurs. We're actively expanding our reach across key EU markets like Germany, France, the Netherlands, Italy, and Spain.

At Finom, we’re not just redefining the entrepreneurial experience; we’re empowering our employees to make a real difference. Your work matters, and your impact extends far beyond product metrics. We nurture innovation and an inspiring work environment where bold ideas thrive.

Maintaining our start-up spirit, we prioritize thorough research, swift implementation of solutions, and ensuring that every effort we make benefits our users, employees, partners, and, of course, our business.

  • You'll join the AI Team, the group driving all AI products and technology at Finom
  • We build and ship AI across the company: AI financial co-pilot, voice agent, and internal AI-powered processes
  • Our belief: your AI agent is only as good as your eval loop; we can only build AI as good as the evals we run on it
  • Your mission: own that eval loop across every AI product we ship, from pre-launch quality gates to post-launch monitoring and continuous improvement
  • You'll work directly with our AI Quality lead, Igor Kolodkin
  • You'll collaborate closely with AI engineers, Product, and domain experts across the company
  • Core stack: Databricks, DeepEval, Claude Code

What You Will Be Doing

  • Own and extend our offline eval suite across products — datasets (capability + regression), judges, metrics
  • Build and maintain online quality dashboards: resolution rate, CSAT, thumbs up/down, LLM-as-judge signals, error rate, latency
  • Close the production feedback loop: mine failure patterns from real traffic → turn them into regression cases → propose fixes to Product and domain experts
  • Harden methodology: judge stability, non-determinism handling
  • Translate numbers into decisions – weekly syncs, clear trade-offs, no dashboards for their own sake

Must-Haves

  • Python and SQL — you can build an analysis end-to-end
  • Solid foundation in statistics — sampling, hypothesis testing, variance, understanding what a noisy metric is
  • Analytical mindset — you start from the business question, not from the tool
  • 3+ years in analyst / data scientist roles, at least one in a product context

Nice-to-Haves

  • Experience in quality analytics for ML systems — ranking, recommendations, classification, etc.
  • Hands-on experience evaluating LLM applications (RAG, agents, tool use, judges)
  • Experience building LLM agents; side projects, toy builds, and personal experiments all count

How we work: one thing we mean seriously

  • AI-assisted coding is our default authoring environment, not a bonus
  • Claude Code is our main tool — you'll reach for it for SQL, Python, analyses, dashboards, and internal scripts
  • We're looking for analysts who are already curious and fluent with AI coding — or genuinely excited to become fluent fast
  • We care about what you ship and how clearly you think
  • If this idea excites you rather than worries you, you'll feel at home here

Data & ML pay context

Based on 1,462 disclosed Data & ML salaries on RoleSuite, the role pays a median of $162K/year, with most offers between $127K and $204K (10th–90th percentile: $102K–$245K).
