This range is provided by Mercor. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.
Base pay range
$100.00/hr - $120.00/hr
About The Job
Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.
Position Details
Position: AI Task Evaluation & Statistical Analysis Specialist
Type: Contract
Compensation: $100–$120/hour
Location: Remote
Role Responsibilities
- Conduct comprehensive statistical failure analysis to identify patterns in AI agent failures across task components such as prompts, rubrics, and templates.
- Perform root cause analysis to determine whether failures stem from task design, rubric clarity, file complexity, or agent limitations.
- Analyze performance variations across finance sub-domains, file types, and task categories to enhance understanding of AI model performance.
- Create dashboards and reports to highlight failure clusters, edge cases, and improvement opportunities.
- Recommend improvements to task design, rubric structure, and evaluation criteria based on statistical findings.
- Present insights to data labeling experts and technical teams to foster collaboration and drive improvements.
Qualifications
Must-Have
- Statistical Expertise: Strong foundation in statistical analysis, hypothesis testing, and pattern recognition.
- Programming: Proficiency in Python (pandas, scipy, matplotlib/seaborn) or R for data analysis.
- Data Analysis: Experience with exploratory data analysis and deriving actionable insights from complex datasets.
- AI/ML Familiarity: Understanding of LLM evaluation methods and quality metrics.
- Tools: Comfortable working with Excel, data visualization tools (Tableau/Looker), and SQL.
Preferred
- Experience with AI/ML model evaluation or quality assurance.
- Background in finance or willingness to learn finance domain concepts.
- Experience with multi-dimensional failure analysis.
- Familiarity with benchmark datasets and evaluation frameworks.
- 2-4 years of relevant experience.
Application Process (Takes 20–30 mins to complete)
- Upload resume
- AI interview based on your resume
- Submit form
Resources & Support
- For details about the interview process and platform information, please check:
- For any help or support, reach out to:
PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.
Seniority level
Not Applicable
Employment type
Part-time
Job function
Analyst
Industries
Software Development