Senior Site Reliability Engineer
Skillshare via Stack Overflow
New York, NY
Apr 14th 2019
As a Senior Site Reliability Engineer at Skillshare, you'll play a key role in balancing our current operations with building for the future. We're scaling quickly and are excited to bring someone on board who can help us proactively tackle the resulting challenges, both in day-to-day operations and in anticipating those further out. This role is an exciting blend of Infrastructure and DevOps, which means opportunity for impact across the board. We'll look to your strategic expertise, reliable execution, and sound judgment to improve and maintain our infrastructure, and to create increasingly smooth processes for our engineers as we grow the platform. You'll be joining a team that's passionate about technology and helping pave the way for building products together that we're proud of. We're excited to meet you.
What you'll do:
- Improve, monitor and maintain our infrastructure
- Ensure site uptime and performance
- Maintain and improve development and QA environments
- Work with web developers to improve tooling for initiatives like unit testing, deployment processes, etc.
- Proactively prep and train developers for improvements or updated workflows
- Quickly and proactively resolve developer issues
- Support the platform team in building a new application platform on Node.js
- Make strategic recommendations and improvements to our application and infrastructure security
What you'll need to be successful:
- Experience building and supporting cloud-based web infrastructure with AWS
- Docker experience (Kubernetes experience is a plus)
- Continuous integration and deployment experience (preferably with CircleCI)
- Knowledge of relational databases and queueing systems (we use MySQL, Redshift, Redis)
- Experience with application monitoring and alerting systems (we use New Relic and Datadog)
- Understanding of web infrastructure: load balancing, high availability configurations, disaster recovery, DNS configuration, security best practices, etc.
- Working knowledge of software engineering practices
- Strong communication skills – you're a natural collaborator and can report out to stakeholders at all levels
- Ability to balance strategy and execution
Why you want this job:
- Impact: You'll play a key role in shaping the long-term direction of our infrastructure and developer processes.
- Growth: Our team is small, so you'll have room to wear a lot of hats and take on more responsibility over time.
- Our mission: We are building a learning ecosystem for the new economy and changing millions of lives for the better.
- Our team: We have a passionate, smart team that is a lot of fun to work with.
- Your life: We take pride in our flexibility. Need flexible hours, or want to work a day or two remotely? No problem. We trust you to do what you need to do.