Data Extraction Specialist / Document Parsing Engineer

About the company

VReal Soft is a Ukrainian IT company with 7 years of experience building custom software solutions, mobile apps, and web platforms for clients in the US, Israel, and Europe. We are currently opening a position for a Data Extraction Specialist, combining project work in the healthcare domain with the opportunity for long-term in-house collaboration.

 

Position Summary

We are looking for a skilled Data Extraction Specialist / Document Parsing Engineer to help build a system that automatically extracts information from medical and immunization records and stores it in a structured database.
You’ll be working with diverse document formats (PDFs, scans, electronic forms), using OCR and NLP-based tools to process them.

 

Your responsibilities:

  • Analyze medical document templates to identify key fields
  • Develop pipelines for automatic extraction of fields such as:
    • Student Full Name
    • Date of Birth
    • Sex/Gender
    • Vaccination Information (e.g. Type, Date, Dose, Clinic, Manufacturer)
    • Next Due Dates (calculated from context or intervals)
       
  • Build and integrate document processing flows using OCR or NLP tools such as Tesseract, Azure Form Recognizer, Amazon Textract, etc.
  • Validate and store extracted data in a structured database
  • Ensure accuracy and integrity of parsed data
  • Collaborate with development and QA teams to integrate the solution into the main platform
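
To give candidates a feel for the task, here is a minimal sketch of the field extraction and "Next Due Dates" steps listed above. It is written in Python for brevity (the role itself is .NET / C#), and everything specific in it is an assumption for illustration: the OCR line layout, the `parse_dose_line` helper, the `DoseRecord` type, and the interval values, which are placeholders and not clinical guidance.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import Optional

# Hypothetical layout of one OCR text line, e.g. "MMR dose 1 03/15/2023".
# Real records vary widely; this pattern is an assumption for the sketch.
DOSE_LINE = re.compile(
    r"(?P<vaccine>[A-Za-z]+)\s+dose\s+(?P<dose>\d+)\s+(?P<given>\d{2}/\d{2}/\d{4})",
    re.IGNORECASE,
)

# Placeholder intervals in days, keyed by lowercase vaccine name.
# Illustrative values only, NOT clinical guidance.
ASSUMED_INTERVAL_DAYS = {"mmr": 28, "hepb": 30}


@dataclass
class DoseRecord:
    vaccine: str
    dose: int
    given: date
    next_due: Optional[date]


def parse_dose_line(line: str) -> Optional[DoseRecord]:
    """Extract one dose record from a text line, or return None if it does not parse."""
    match = DOSE_LINE.search(line)
    if match is None:
        return None
    try:
        # Validation step: reject strings that look like dates but are not.
        given = datetime.strptime(match.group("given"), "%m/%d/%Y").date()
    except ValueError:
        return None
    vaccine = match.group("vaccine")
    interval = ASSUMED_INTERVAL_DAYS.get(vaccine.lower())
    # Next due date is derived from the interval table when one is known.
    next_due = given + timedelta(days=interval) if interval is not None else None
    return DoseRecord(vaccine, int(match.group("dose")), given, next_due)
```

In a real pipeline the input lines would come from an OCR service such as Tesseract or Amazon Textract, and the resulting records would be validated further before being written to the database.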
     

Requirements:

  • Hands-on experience with OCR tools:
    Tesseract, Amazon Textract, Azure Form Recognizer, Google Document AI
  • Strong knowledge of .NET / C#, especially for parsing documents and data transformation
  • Understanding of NLP techniques and Regular Expressions
  • Experience transforming unstructured data into structured formats
  • Solid understanding of databases and storage formats (e.g. JSON, SQL)
     

Nice to have:

  • Experience in the healthcare or education domain
  • Familiarity with ML/AI-based approaches to document processing
  • English proficiency for reading documentation and written communication
     

What we offer:

  • Opportunity to work on a meaningful healthcare-related product
  • Potential for long-term in-house cooperation after the initial project
  • Flexible working hours
  • Support and mentorship from a technical team
  • Comfortable office in Cherkasy, or remote work option
  • Internal training and English language development
     
Published 11 June