INFUSE

Semantic Backend Engineer (Contract, Remote)

INFUSE Munich

Job description:

INFUSE is committed to complying with applicable data privacy and security laws and regulations.

INKHUB is ingesting 10 million raw PDFs to build the internet’s richest catalog of marketing‑grade B2B content: tagged, summarized, and searchable by topic, company, or intent. We’re looking for an applied ML engineer to own the semantic ingestion pipeline, from raw PDFs to tagged, summarized, and embedded assets.

Hiring Process

  • Review application against job requirements.
  • Invite to submit a video interview; may be followed by a test or short project.
  • Interview with the hiring manager or team, in person or via Zoom.
  • Decision and offer conversation.

What You’ll Do

  • Own the ETL pipeline from raw PDFs (S3-ingested) to structured resources.
  • Finalize summarization and classification flow using open‑source models with GPT‑4o fallback.
  • Apply filtering logic (≤3 years old, ≤100 pages, etc.) to enforce resource quality.
  • Map each asset to the specific topic taxonomy (10+ per topic across ~9,000 topics).
  • Generate dense embeddings using sentence-transformers.
  • Load and query embeddings using Milvus or pgvector.
  • Implement “freshness” logic to identify and index only new or updated content based on file diffing, crawl timestamp, or document hash.
  • Build a QA/evaluation harness: format compliance, drift monitoring.
  • Expose /v1/semantic-search via FastAPI, with filtering and rank fusion.
  • Collaborate closely with Tech Lead on UX integration and snippet generation.
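
To make a few of the bullets above concrete, here is a minimal sketch of the quality filter, hash-based freshness check, and rank fusion, using only the Python standard library. It is illustrative, not the INKHUB implementation: the `PdfAsset` record and its field names are hypothetical, "≤3 years" is approximated as 1,095 days, and reciprocal rank fusion with k = 60 is assumed as one common way to do "rank fusion" (the posting does not name a method).

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta

MAX_AGE_DAYS = 3 * 365  # "≤3 years old", approximated as 1,095 days (assumption)
MAX_PAGES = 100         # "≤100 pages"


@dataclass
class PdfAsset:
    """Hypothetical record for one ingested PDF; field names are illustrative."""
    key: str
    content: bytes
    published: date
    pages: int


def passes_quality_filter(asset: PdfAsset, today: date) -> bool:
    """Keep only assets that are recent enough and short enough."""
    age = today - asset.published
    return age <= timedelta(days=MAX_AGE_DAYS) and asset.pages <= MAX_PAGES


def content_hash(content: bytes) -> str:
    """SHA-256 digest of the raw bytes, used as the document hash."""
    return hashlib.sha256(content).hexdigest()


def needs_indexing(asset: PdfAsset, seen_hashes: dict[str, str]) -> bool:
    """True when the asset is new or its bytes changed since the last run."""
    return seen_hashes.get(asset.key) != content_hash(asset.content)


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

In a real pipeline the `seen_hashes` mapping would live in a database rather than in memory, and the fused lists would typically come from a vector search and a keyword search over the same filter.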

Your Toolbox

  • Python, PyTorch, sentence-transformers, and OpenAI APIs or similar pretrained LLMs.
  • FastAPI, Milvus or pgvector, PyPDF/Tika, Airflow or Lambda for orchestration.
  • Docker, GPU scheduling, Athena/Redshift SQL.

You Might Be a Fit If…

  • You’ve built ML pipelines that touched real users, not just notebooks.
  • You’ve worked on semantic search, embeddings, or large‑scale tagging.
  • You’ve wrestled with unstructured data and love turning chaos into clarity.
  • You like working fast, iterating with feedback, and tracking metrics that matter.

Why This Role Matters

Your models decide what gets found, how it’s tagged, and which content and companies stand out. You’ll help define relevance and freshness for over a million resources and 50,000+ company pages – ensuring INKHUB stays ahead of the curve.

Seniority level

Entry level

Employment type

Contract

Industries

Construction, Software Development, and IT Services and IT Consulting


Job information

  • Publication date: 02 Feb 2026
  • Location: Munich
  • Type: Full-time
  • Work model: On-site
  • Category:
  • Experience: 2+ years
  • Employment status: Employee
