Document Split with Convolutional Neural Networks
Location Not Available
Job Description:
    Introduction

    Hi, we just founded a company in Germany that specializes in document recognition for tax statements. We face multiple challenges and already have a team of 4 developers on board. We implement the different tasks in an agentic setup and now want to explore an additional approach to splitting a long document that contains multiple invoices.


    Problem Statement

    The input is one large document, e.g. a PDF of 100 pages. The output is a table with two columns: column 1 is the page number and column 2 gives the probability that this page is the start of a new document. We already have some algorithms in place but would also like to assess the performance of a convolutional neural network. When I asked GPT-5, I got the following suggestion, which sounds reasonable to me: https://chatgpt.com/share/68a1b3b4-e9ec-8000-a3ca-3e2e9ee44d38
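
    To make the output format concrete, here is a minimal sketch of the two-column table and of how per-page probabilities could be turned into document ranges. The function names, the column headers, the 0.5 threshold and the example scores are all assumptions made for illustration, and CSV from the standard library stands in for the Excel file.

```python
import csv
import io


def write_split_table(probabilities, out):
    """Write one row per page: 1-based page number and start-of-document probability."""
    writer = csv.writer(out)
    writer.writerow(["page", "p_new_document"])  # hypothetical column names
    for page_number, p in enumerate(probabilities, start=1):
        writer.writerow([page_number, f"{p:.3f}"])


def split_ranges(probabilities, threshold=0.5):
    """Group pages into (first_page, last_page) ranges.

    A new range starts whenever a page's probability reaches the threshold.
    Page 1 always starts a range, whatever its score.
    """
    ranges = []
    for page_number, p in enumerate(probabilities, start=1):
        if page_number == 1 or p >= threshold:
            ranges.append([page_number, page_number])
        else:
            ranges[-1][1] = page_number
    return [tuple(r) for r in ranges]


# Made-up scores for a 6-page PDF (illustration only, not model output).
example = [0.98, 0.03, 0.10, 0.91, 0.05, 0.77]
buffer = io.StringIO()
write_split_table(example, buffer)
print(buffer.getvalue())
print(split_ranges(example))
```

    Keeping the raw probabilities in the table and applying the threshold only afterwards leaves the choice of cut-off open for later tuning.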


    Task Description

    Your task is to use the attached data: a sample document with nearly 200 pages and the corresponding "true" output (listing all the first pages). We want up to three developers to each do a small project in Python that takes any PDF as input and produces the corresponding Excel table as output. You should train a CNN model, in whatever way you see fit, that generates the split likelihoods by comparing the visual appearance of the pages. You can use local LLMs, but due to data privacy concerns it is not possible to use ChatGPT or Gemini. This task is a test of your skills, and delivering a first version is sufficient.
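
    As a starting point for training, the list of "true" first pages can be converted into per-page labels and into pairs of adjacent pages, which is one possible unit for a model that compares visuals. The sketch below assumes the ground truth is a list of 1-based page numbers; the function names and the 8-page example are hypothetical, and the CNN itself is not shown.

```python
def page_labels(num_pages, first_pages):
    """Return a 0/1 label per page (index 0 is page 1): 1 if the page starts a document."""
    starts = set(first_pages)
    out_of_range = sorted(p for p in starts if not 1 <= p <= num_pages)
    if out_of_range:
        raise ValueError(f"first pages outside 1..{num_pages}: {out_of_range}")
    return [1 if page in starts else 0 for page in range(1, num_pages + 1)]


def adjacent_pairs(labels):
    """Build (previous_page, page, label) triples for pages 2..N.

    The label says whether `page` starts a new document, so a pairwise model
    would see the images of both pages and predict that label.
    """
    return [(page - 1, page, labels[page - 1]) for page in range(2, len(labels) + 1)]


# Made-up ground truth for an 8-page document (illustration only).
labels = page_labels(8, [1, 4, 7])
pairs = adjacent_pairs(labels)
positives = sum(label for _, _, label in pairs)
print(labels)
print(pairs)
print(f"{positives} of {len(pairs)} pairs are document boundaries")
```

    Counting the positive pairs up front shows how imbalanced the classes are, which matters when only one sample document is available.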


    Proposal Requirements

    Please send us a high-level description of how you would approach the problem. We will review the proposals over the next few days and arrange 3-5 calls with the authors of the most promising solutions.


    Selection Process and Compensation

    In those calls we will decide for each candidate whether the developer gets a 100 USD project to deliver a prototype of the implementation. We will then most likely have three developers code a version 1. After that we will choose the best prototype and continue with that developer on this task and possibly on other tasks in the overall project.


    Closing

    We are looking forward to your suggestions.

    Regards, Christian
Job Information
  • Type:

    Full-time
  • Work model:

    Remote
  • Category:

    Development & IT
  • Experience:

    Experienced
  • Employment type:

    Freelance
  • Publication date:

    18 Aug 2025
  • Location:
