Jobs Dnipro

  • 71 views · 4 applications · 23d

    Data Extraction Specialist / Document Parsing Engineer

    Ukraine · 3 years of experience · Upper-Intermediate

    About the company

    VReal Soft is a Ukrainian IT company with 7 years of experience building custom software solutions, mobile apps, and web platforms for clients in the US, Israel, and Europe. We are opening a position for a Data Extraction Specialist that combines project work in the healthcare domain with the opportunity for long-term in-house collaboration.

    Position Summary

    We are looking for a skilled Data Extraction Engineer / Document Automation Developer to help build a system that automatically extracts structured information from medical and immunization records and stores it in a database.
    You’ll work with diverse document formats (PDFs, scans, electronic forms) and process them with OCR and NLP-based tools.

    Your responsibilities:

    • Analyze medical document templates to identify key fields
    • Develop pipelines for automatic extraction of fields such as:
      • Student Full Name
      • Date of Birth
      • Sex/Gender
      • Vaccination Information (e.g. Type, Date, Dose, Clinic, Manufacturer)
      • Next Due Dates (calculated from context or intervals)
    • Build and integrate document processing flows using OCR or NLP tools such as Tesseract, Azure Form Recognizer, or Amazon Textract
    • Validate and store extracted data in a structured database
    • Ensure accuracy and integrity of parsed data
    • Collaborate with development and QA teams to integrate the solution into the main platform

    Requirements:

    • Hands-on experience with OCR tools:
      Tesseract, Amazon Textract, Azure Form Recognizer, Google Document AI
    • Strong knowledge of .NET / C#, especially for parsing documents and data transformation
    • Understanding of NLP techniques and regular expressions
    • Experience transforming unstructured data into structured formats
    • Solid understanding of databases and storage formats (e.g. JSON, SQL)

    Nice to have:

    • Experience in the healthcare or education domain
    • Familiarity with ML/AI-based approaches to document processing
    • English proficiency for reading documentation and written communication

    What we offer:

    • Opportunity to work on a meaningful healthcare-related product
    • Potential for long-term in-house cooperation after the initial project
    • Flexible working hours
    • Support and mentorship from a technical team
    • Comfortable office in Cherkasy, or remote work option
    • Internal training and English language development