Yofy
$100k - Senior LLM Engineer - AI Startup - Remote
Yofy European
Job Description

    We’re excited to present a unique opportunity to join a hyper-growth startup at the intersection of AI companionship, video generation, and expressive identity.

    Having scaled to 30M+ users in just 18 months, this startup is among the 15 largest AI companies worldwide.

    With a team pushing the boundaries of what’s possible in AI, the company is now hiring a Senior Machine Learning Engineer to architect and optimize the large language models (LLMs) that power meaningful conversations for millions of users globally.

    With a team of 60 innovators, they’re seeking an expert to enhance their inference pipeline, scale it to 10M+ daily messages, and reduce infrastructure costs by 40%.

    This is a high-impact role where you’ll drive cutting-edge AI companionship technology, focusing on low-latency, multilingual conversational systems.


    Core Responsibilities
    • LLM Optimization: Fine-tune and optimize LLMs using vLLM/TensorRT-LLM, reducing latency and memory usage through quantization, distillation, and distributed training.
    • Scalable Inference Pipeline: Audit and enhance the vLLM setup, optimize the KV cache, and scale infrastructure to handle 10M+ daily messages while cutting latency by 50%.
    • Custom Kernel Development: Write and debug custom CUDA/Triton kernels to maximize GPU efficiency for large-scale LLM inference.
    • Feature Development: Build innovative features like multi-character chats, gamification, and multilingual support, collaborating with teams on audio, image, and video modalities.
    • Dataset Management: Create and maintain diverse datasets for training and improving LLM performance, ensuring robust conversational quality.
    • Cross-Functional Collaboration: Work with co-founders, web engineers, and DevOps to align technical strategies with user-focused product goals.

    What They Require
    • 2+ years of production Python engineering with expertise in LLM inference optimization at scale, serving 1M+ daily users.
    • Deep expertise in the vLLM framework and hands-on experience with quantization, distillation, and distributed training (FSDP, DeepSpeed, or Accelerate).
    • Strong understanding of LLM architectures (transformers, attention mechanisms, tokenizers, and post-training techniques).
    • Proven ability to identify and resolve compute/memory bottlenecks using industry-standard profiling workflows.
    • Comfort working in a fast-paced, high-ambiguity environment with a focus on delivering measurable performance improvements.
    • NSFW: Comfortable building products based on uncensored models and content.

    What They Value
    • Performance Expertise: A track record of optimizing LLM inference pipelines with measurable latency/throughput improvements.
    • AI Curiosity: Passion for pushing the boundaries of AI companionship, including multilingual systems and specialized use cases like multi-turn conversations or erotic roleplay.
    • Open-Source Impact: Contributions to frameworks like vLLM, Hugging Face Transformers, or Triton, showcasing your expertise.
    • Collaborative Mindset: Strong communication skills, fluent in English, with the ability to work seamlessly across teams to align technical and product goals.
    • Doer Mentality: A goal-oriented, humble approach, balancing execution, planning, and strategy while thriving in a fast-moving startup.

    Compensation & Perks
    • Competitive Salary: $70K–$100K + equity
    • Flexible Work: Fully remote role, async-friendly, and optimized for European time zones.

    If you’re passionate about scaling LLM systems, building innovative features, and working on cutting-edge companionship technology, we’d love to hear from you.

NOTE: Please mention Fuchsjobs as the source of your application.
Job Information
  • Type: Full-time
  • Work model: Remote
  • Category: Development & IT
  • Experience: Senior
  • Employment: Salaried employee
  • Publication date: 30 Aug 2025
  • Location: European