Experience

For the past 2.5 years I have worked at US IT startups as a Data Scientist (NLP), working with trade and crime data.
- NER model built with the Prodigy package (data preparation, labeling, modeling, error analysis, iteration)
- Geocoder (Google APIs, Pelias)
- Supervised models (XGBoost, BiLSTM, transformers) for free-text comparison
- Unsupervised models / custom systems for free text clustering
- Extensive work with tabular data (SQL)
- Cloud services (AWS, Databricks)

Skills

Python, Machine Learning, SQL, Data Science, Pandas, NumPy, Jupyter Notebook, Git, scikit-learn, Statistics, Matplotlib, English, PyTorch, NLP, Jira, Excel, AWS, Recurrent Neural Networks, Data Analysis, Scientific Research

Highlights

- Developed an end-to-end NER pipeline, using Prodigy for annotation, that reached human-level performance, improving company data quality and user experience and significantly increasing customer satisfaction.
- Implemented a geocoding pipeline from scratch using Google’s places and addresses APIs, and created a voting system within the pipeline to select the best of multiple geocoding API outputs.
- Combined the two projects above to create an ML-driven system that takes news (media articles) as input and outputs the location (lat-lon pair), time, and actors of a crime event.
- Spent a significant amount of time developing tools for automatic generation of data insights and statistics, making the user experience more pleasant and informative.

Preferred language

Ukrainian, English



$2000 / mo

  • Ukraine, Lviv
  • 3 years of experience
  • English: Advanced/Fluent
  • Remote work
  • Office
  • Published 28 March 2024
  • Typically replies in: 3 days
  • Response rate 72%