Machine Learning / Data Pipeline Engineer

Vorstella via Stack Overflow

San Francisco, CA

Oct 10th 2018

About Us

Vorstella is an AI platform that automatically manages large-scale distributed systems like Cassandra, Hadoop, Spark and Kafka for large companies. Founded by ex-DataStax engineers, we've designed some of the largest distributed system deployments in the world and we wanted to make this technology accessible to everyone. You shouldn't need 3 years of experience to feel comfortable running a new system at scale. We take the guesswork out of using new technology and let you focus on building your applications.

Who we're looking for

We're looking for someone who is probably 60% engineer, 40% machine learning. A little more street fighter, a little less ivory tower. Someone who can write production-quality code, solve engineering problems, and knows enough ML to find good-enough solutions. Someone who's creative and can solve problems without always reaching for the ML hammer. Sometimes we use rules, sometimes we use ML, sometimes we need to ask the user better questions.

What you'll be working on

You'll be working on the machine learning pipeline and models. We've got multiple signals, both synthesized and raw, being fed into root-cause analysis, database tuning algorithms and cost optimization. These models feed data to the UI/API, which presents the next best action to the end user. We're always looking for a solution that gets us to good outcomes as quickly as possible. Sometimes it's basic, sometimes we're pushing beyond the boundaries of what's published.

Our stack

Our deployment target is Docker and Kubernetes. On the frontend we use React/Redux. Back-end services are written in Go, with the machine learning code written in Python. Our continuous integration system is CircleCI, and we use GitHub for all our code. We're multi-cloud with deployments currently in Google and AWS.


  • Writing code that manages some of the largest distributed systems in the world. We work with customers that have hundreds of thousands of servers.
  • You'll be a senior member of the team, with strong input over large swathes of infrastructure as well as product ownership for core products and features.
  • Writing code and working with a team that's pushing the cutting edge of what's possible. You'll be working with some of the world's experts in optimization and distributed systems.
  • The ability to publish and contribute to multiple popular open source projects.