Senior Data Engineer (ML/NLP project) (offline)

Company description

Zetico is a digital consultancy on a mission to accelerate innovation and build sustainable software for the long term. We believe in developing software products that matter and have a significant impact.

This time, we have partnered with CB Insights, a leading market intelligence platform from New York, and are looking for an experienced data engineer to join our team.

Job description

As a data engineer working with CB Insights, you will be a core part of a strong team, building data pipelines and robust infrastructure enabling effective data processing at scale.

You will build products that use natural language processing and machine learning models and make them run efficiently with large amounts of data to enable a smooth experience for their clients and in-house intelligence units.

We’re looking for engineers who know, through hard-won practical experience, how to build maintainable and testable data pipelines and infrastructure. We want people who love solving problems and are willing to take on hard ones.

Responsibilities

- Engineer efficient, adaptable, and scalable data pipelines that power our data products
- Design and build efficient ETL and ML infrastructure for unstructured textual data sets and various other types of data sources
- Take a prototype of a data product built with NLP and/or machine learning models and make it run reliably in production
- Monitor and maintain existing data products running in production, including identifying when models need to be retrained
- Design and implement internal tools that make this data processing infrastructure easily accessible to and usable by other software developers
- Develop clean solutions that are well-engineered, maintainable, tested, and delivered on time
- Participate in code reviews and sprint planning, help identify problems, and share knowledge with your colleagues

Examples of tasks you’ll be dealing with

- Turn an ML pipeline created by a data scientist into a production-ready pipeline. Our DevOps team provides the template that all pipelines need to conform to, which makes it easier to dockerize and monitor jobs. As a side effect, you would suggest improvements to the templates
- Code reviews: our data scientists are strong Python coders, but we are always looking for ways to do things better. The engineer would become an integral part of our code review process and raise code and architecture quality
- Get data from an external vendor's API and figure out how to persist it on our side so that it is easy for everyone else to use
- Find the memory leak in a production job

Required skills & experience

- 3+ years of professional software/data engineering experience using Python, SQL and, ideally, at least one statically typed language (Go, Java, Scala)
- Knowledgeable about data modeling, data storage techniques, data warehousing and general data architecture
- Experience with engineering data pipelines to capture, store and process unstructured data
- Excellent written and verbal communication skills
- Excellent problem solving and analytical skills
- Believer in Lean and Agile values and principles for building software
- Proficiency developing in a Mac/Linux environment
- Technology stack: Python, Go, SQL, NoSQL, Spark, Hadoop
- Helpful Humble Human

Bonus points

- Experience with AWS services (RDS, S3, SQS, Redshift, Spectrum, Glue)
- Experience building and maintaining a Hadoop or Spark cluster and other related tools in the big data ecosystem
- You’re a contributor to open-source software, or run your own projects

What’s in it for you

- Opportunity to join an international startup that is a leader in its space
- Competitive compensation
- An open-minded work environment with minimal hierarchy
- Paid vacation and sick leave
- Remote-friendly culture
- Internal tech talks
- Learning & development benefits

About Zetico

Zetico is an engineering & product consultancy. We partner with ambitious tech companies, from startups to enterprises, to craft elegant solutions to the toughest software problems.

Our team combines years of experience in digital services with a specific focus on the JavaScript ecosystem. We leverage TypeScript, Node.js, Go, React, GraphQL, AWS and more to develop world-class software that is built to last.

Among our clients are companies like The Mozilla Foundation, Powtoon, vCita, DreaMed Diabetes, Loola.tv and others.

Company website:
https://zetico.io

DOU company page:
https://jobs.dou.ua/companies/zetico/

The job ad is no longer active
Job unpublished on 5 December 2021
