Data Extraction Specialist / Document Parsing Engineer

We are looking for a skilled Data Extraction Engineer / Document Automation Developer to help build a system that automatically extracts information from medical and immunization records and stores it in a structured database.
You’ll be working with diverse document formats (PDFs, scans, electronic forms), using OCR and NLP-based tools to process them.

Your responsibilities:

  • Analyze medical document templates to identify key fields
  • Develop pipelines for automatic extraction of fields such as:
    • Student Full Name
    • Date of Birth
    • Sex/Gender
    • Vaccination Information (e.g. Type, Date, Dose, Clinic, Manufacturer)
    • Next Due Dates (calculated from context or dose intervals)
  • Build and integrate document processing flows using OCR or NLP tools such as Tesseract, Azure Form Recognizer, Amazon Textract, etc.
  • Validate and store extracted data in a structured database
  • Ensure accuracy and integrity of parsed data
  • Collaborate with development and QA teams to integrate the solution into the main platform
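
To give a sense of the extraction work described above, here is a minimal sketch in Python (chosen only for brevity; the role itself is .NET / C#). It assumes OCR has already produced plain text and pulls a few fields out with regular expressions. The record layout, the field labels, the `extract_record` helper, and the interval values in `ASSUMED_INTERVAL_DAYS` are all hypothetical placeholders for illustration, not a real immunization schedule or the project's actual design.

```python
import re
from dataclasses import dataclass, field
from datetime import date, datetime, timedelta
from typing import List, Optional

# Hypothetical placeholder intervals (days until the next dose).
# A real system would take these from the applicable guidelines.
ASSUMED_INTERVAL_DAYS = {"MMR": 28, "HepB": 28}

# Assumed label formats; real templates will differ per document type.
NAME_PATTERN = re.compile(r"Student Name:\s*(.+)")
DOB_PATTERN = re.compile(r"Date of Birth:\s*(\d{2}/\d{2}/\d{4})")
VACCINE_PATTERN = re.compile(
    r"^(?P<type>[A-Za-z]+)\s+dose\s+(?P<dose>\d+)\s+(?P<date>\d{2}/\d{2}/\d{4})$",
    re.MULTILINE,
)


@dataclass
class VaccinationEntry:
    vaccine_type: str
    dose: int
    given_on: date
    next_due: Optional[date] = None  # None when no interval is known


@dataclass
class StudentRecord:
    full_name: Optional[str]
    date_of_birth: Optional[date]
    vaccinations: List[VaccinationEntry] = field(default_factory=list)


def parse_date(text: str) -> date:
    """Parse an MM/DD/YYYY date string (assumed input format)."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def extract_record(ocr_text: str) -> StudentRecord:
    """Extract name, date of birth, and vaccination rows from OCR text."""
    name_match = NAME_PATTERN.search(ocr_text)
    dob_match = DOB_PATTERN.search(ocr_text)
    record = StudentRecord(
        full_name=name_match.group(1).strip() if name_match else None,
        date_of_birth=parse_date(dob_match.group(1)) if dob_match else None,
    )
    for row in VACCINE_PATTERN.finditer(ocr_text):
        given_on = parse_date(row.group("date"))
        interval = ASSUMED_INTERVAL_DAYS.get(row.group("type"))
        record.vaccinations.append(
            VaccinationEntry(
                vaccine_type=row.group("type"),
                dose=int(row.group("dose")),
                given_on=given_on,
                # Next due date = date given + assumed interval, if known.
                next_due=given_on + timedelta(days=interval) if interval else None,
            )
        )
    return record
```

In practice, fields that fail to match would be flagged for manual review before anything is written to the database, which is where the validation and integrity responsibilities come in.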

 

Requirements:

  • Hands-on experience with OCR tools: Tesseract, Amazon Textract, Azure Form Recognizer, Google Document AI
  • Strong knowledge of .NET / C#, especially for parsing documents and data transformation
  • Understanding of NLP techniques and regular expressions
  • Experience transforming unstructured data into structured formats
  • Solid understanding of databases and data formats (e.g. SQL, JSON)

 

Nice to have:

  • Experience in the healthcare or education domain
  • Familiarity with ML/AI-based approaches to document processing
  • English proficiency for reading documentation and written communication

 

What we offer:

  • Opportunity to work on a meaningful healthcare-related product
  • Potential for long-term in-house cooperation after the initial project
  • Flexible working hours
  • Support and mentorship from a technical team
  • Comfortable office in Cherkasy, or remote work option
  • Internal training and English language development

The job ad is no longer active
