Staff AI/ML Engineer, MyHealthTeam

MyHealthTeam
Location: Not Available

Job description:

Join us to redefine the patient experience

MyHealthTeam builds communities for people living with chronic and rare conditions. We reach millions of people each month, and we’re investing deeply in AI to help members find the right support, content, and recommendations — safely, responsibly, and at scale.

We’re looking for a Staff AI/ML Engineer who is product‑minded, hands‑on, and excited to ship production ML/LLM systems that turn messy, real‑world text and behavioral data into reliable experiences for patients.

What you get to do every day

Build end‑to‑end ML/LLM features from problem definition → data → modeling → evaluation → deployment → monitoring.
Develop LLM applications with retrieval and tool use (e.g., RAG, orchestration/workflows, structured extraction) to deliver trustworthy consumer health experiences.
Convert unstructured text (posts, comments, messages, search queries) into structured signals (topics, entities, intent, sentiment, safety flags) using a mix of classical NLP and modern LLMs.
Create and maintain data pipelines for training, inference, evaluation, and analytics (batch and/or streaming as needed).
Design evaluation systems that measure quality and safety: offline metrics, golden datasets, human review workflows, and alignment between offline results and online A/B tests.
Implement production guardrails to reduce harm and misinformation risk (policy constraints, refusal behavior, citations/attribution when appropriate, red‑teaming, monitoring, and incident response).
Set up monitoring for model + system health (latency, cost, drift, regressions, quality metrics).
Partner closely with the Product, Engineering, and Data teams and clinical/subject‑matter experts to validate outputs and define what “correct” means for sensitive, health‑adjacent use cases.
(Staff scope) Lead architecture and technical direction for applied AI across the organization; mentor engineers; establish best practices and reusable platforms.

Examples of problems you might work on

Personalized recommendations for communities, posts, resources, or next‑best actions
Safer content understanding: detection of misleading/high‑risk health claims, escalation workflows
Search and discovery improvements using embeddings, hybrid retrieval, and ranking
Summarization and structuring of long threads into navigable insights (with safety constraints)
Member intent understanding from behavioral + text signals

Must‑have qualifications

8+ years building and shipping production ML systems (or equivalent experience with demonstrable impact)
Strong Python skills and experience with ML/LLM libraries and tooling (e.g., Hugging Face ecosystem, LangChain/LangGraph, or equivalent)
Proven ability to design production‑grade pipelines (training/inference/eval) and operate models in real systems (monitoring, rollbacks, incident handling)
Solid grounding in ML fundamentals (NLP, deep learning, statistical reasoning, evaluation)
Experience with MLOps best practices: versioning, reproducibility, CI/CD, model registry patterns, feature/data management, and infrastructure collaboration
Experience working with large‑scale data using Databricks/Spark or equivalent distributed processing
Strong product and stakeholder instincts: you can translate ambiguous business needs into measurable ML outcomes

Nice‑to‑have qualifications

Experience building RAG and retrieval systems: vector databases, hybrid search, ranking, recommendation, query understanding
Experience in healthcare or regulated environments, including privacy‑by‑design, auditability, and safety reviews (HIPAA/PHI familiarity a plus)
Experience with streaming/clickstream data, experimentation platforms, and causal/measurement thinking
Ability to prototype end‑to‑end experiences (e.g., Streamlit, Gradio, lightweight frontends)
Experience designing LLM safety systems: red‑teaming, adversarial testing, prompt injection mitigation, output filtering, human‑in‑the‑loop review

Some tools we use

Databricks/Spark for distributed processing
Redshift and BI tools (Looker/Tableau) for analytics
Terraform for infrastructure‑as‑code; Airflow for orchestration; GitHub Actions for CI/CD
AWS (including Bedrock) and a mix of private and open‑weight models (including fine‑tunes where appropriate)
Experimentation tooling (A/B testing) and internal UX analytics tools
AI‑assisted coding tools (e.g., Cursor, Copilot, Claude/OpenAI tooling)

Working model

The MyHealthTeam Engineering Team is hybrid. This role requires in‑person time at our office at One Post Plaza in San Francisco, typically two days per week.

Application materials

Submit a resume and answer the following three short questions:

Tell us why MyHealthTeam’s mission matters to you.
Give one example of a production ML/LLM system you shipped and how you evaluated it.
Confirm that you can meet the hybrid San Francisco work requirement.
