Senior Data Scientist
About the project:
isitetech is an innovative startup focused on transforming the oil and gas industry through advanced data solutions. Our core product is a web application built on Google App Engine, complemented by an iPad app. Together they collect, analyze, and visualize data from oil facilities.
This platform empowers our clients to make informed, data-driven decisions, optimizing their operations and improving efficiency. We operate with a distributed, global team, with our head office in Dallas and development hubs in Ukraine (Kharkiv, Kyiv), Spain, and Poland. Our philosophy emphasizes efficiency and flexibility, with a focus on results-oriented work and communication primarily via Slack, rather than rigid daily calls.
Requirements:
- 5+ years of experience applying data science and machine learning techniques
- Python programming with the classic scientific stack (numpy, pandas, scipy, scikit-learn, pytorch, lightgbm, matplotlib or any other visualization library)
- Understanding of classic machine learning algorithms and theory (supervised/unsupervised learning techniques, metrics and visualizations for interpreting model performance, time-series analysis, anomaly detection, recommendation models, hyperparameter optimization)
- NLP (NLP techniques, GenAI, LLM fine-tuning)
- Basic understanding of statistics and optimization
- Data manipulation and cleaning skills (filling missing entries, filtering dataframes, transforming columns)
- Prior experience with Python web frameworks (FastAPI, Flask, or similar) is a plus, but not critical
- Containerization technologies (Docker)
- Cloud solutions (any of AWS, GCP, Azure)
- Understanding of prompt engineering or experience with building scripts or applications around LLMs
- Proficiency with SQL (Postgres, Snowflake) and NoSQL (Elasticsearch; MongoDB optional)
- Monitoring services
Responsibilities:
- Work on time-series forecasting problems with real, complex data. Implement features, interpret their importance, train models, and incorporate them into the production pipeline.
- Work on LLM-based solutions that use techniques such as Contextual RAG (Retrieval-Augmented Generation), LLM-as-a-judge, semantic search, chatbot services, etc.
- Create anomaly detection models and clustering methods for non-stationary, “dirty”, semi-organized data.
Required languages:
- English: B2 (Upper Intermediate)
- Ukrainian: Native