Machine Learning Evaluation Specialist

Posted 3 days ago by Alignerr

Negotiable
Undetermined
Remote
London, England, United Kingdom

Summary: The Machine Learning Evaluation Specialist role seeks domain experts with strong machine learning backgrounds to design complex evaluation problems that challenge state-of-the-art AI systems. Candidates will leverage their specialized research expertise to create nuanced problems that require deep domain understanding. The position involves collaboration with a global team and offers flexibility in hours and work arrangements. This role is pivotal in shaping the evaluation and improvement of next-generation AI models.

Key Responsibilities:

  • Propose complex, original machine learning problems rooted in your domain of expertise
  • Design evaluation tasks that require advanced domain knowledge beyond standard ML pipelines
  • Draw from your own research experience to craft problems that would challenge a highly capable LLM
  • Define clear problem statements, evaluation criteria, and gold-standard solutions
  • Assess AI-generated ML solutions for correctness, creativity, and methodological rigor
  • Document problem difficulty, required domain knowledge, and expected failure modes
  • Collaborate asynchronously with a global team of researchers and engineers

Key Skills:

  • Graduate-level expertise (MS or PhD preferred) in a scientific or technical domain that intersects with machine learning
  • Strong working knowledge of ML methods — model selection, feature engineering, evaluation metrics, and pipeline design
  • Deep familiarity with active research problems in your field
  • Ability to identify where general ML knowledge falls short and specialized domain insight becomes critical
  • Experience publishing or conducting original research is highly valued
  • Excellent written communication — able to articulate complex problems clearly and precisely
  • Self-motivated and comfortable working independently on intellectually demanding tasks

Salary (Rate): £400.00/hour

City: London

Country: United Kingdom

Working Arrangements: remote

IR35 Status: undetermined

Seniority Level: undetermined

Industry: IT

Detailed Description From Employer:

About The Role

We're looking for domain experts with strong machine learning backgrounds to design challenging ML evaluation problems that test the boundaries of state-of-the-art AI systems. You'll draw on your specialized research expertise to craft problems that go beyond textbook knowledge: the kind of challenges that require deep, nuanced domain understanding to solve correctly. Your work directly shapes how we measure and improve the next generation of AI models.

Organization: Alignerr

Type: Hourly Contract

Compensation: $200–$400/hour

Location: Remote

Commitment: 10–40 hours/week

Example Domains (Not Exhaustive)

  • Computational biology, genomics, or bioinformatics
  • Climate science and environmental modeling
  • Medical imaging and healthcare ML
  • Materials science and computational chemistry
  • Astrophysics and signal processing
  • Natural language processing for low-resource or specialized corpora
  • Robotics, control theory, or reinforcement learning in complex environments
  • Financial modeling and quantitative analysis

Why Join Us

  • Work at the frontier of AI evaluation and safety research
  • Collaborate with top research labs pushing the boundaries of what AI can do
  • Leverage your hard-earned domain expertise in a high-impact, meaningful way
  • Full autonomy, flexible schedule, and global collaboration
  • Potential for ongoing work, contract extension, and deeper research involvement
  • Build your profile as a contributor to cutting-edge AI development

Application Process (Takes 10–15 min)

  • Submit your resume highlighting your domain expertise and ML experience
  • Complete a short screening assessment
  • Project matching and onboarding

PS: Our team reviews applications daily. Please complete your application steps to be considered for this opportunity.