Systems Reliability Engineer (SRE)

Qntfy via Stack Overflow
Development

Jul 25th 2018


Qntfy is looking for a talented and motivated SRE to join our ops team. You will be responsible for deploying, configuring, and maintaining the core systems and services that our software and business depend on. We need someone who is interested in designing sustainable, best-in-class infrastructure and reliability processes.

We move quickly and are not beholden to any single technology, but we do have favorites. An ideal candidate will have experience with, or the ability to quickly pick up, tools like Mesos/Marathon, Kubernetes, Ansible, and Docker. As an SRE at Qntfy, you will have the freedom and responsibility to recommend and implement core architectural changes in support of our long-term technological vision. As for our stack, we manage both on-premises and AWS deployments and are looking to increase our use of Kubernetes.

Responsibilities:

  • Help to determine production standards alongside software engineers from day 0.
  • Communicate with peers, customers, and partners to foster cooperation and development.
  • Design and implement the systems to support major new features for our platform.
  • Translate team needs into technical requirements and produce stable solutions.
  • Effectively estimate time to implement solutions.
  • Plan, execute, maintain, and improve infrastructure.
  • Debug, automate, and monitor operations.
  • Record incident-response postmortems and make them available.

Qualifications:

  • BS or Master's degree in Computer Science/Engineering, related degree, or equivalent experience.
  • 3+ years of experience with DevOps, SysAdmin, and/or datacenter operations.
  • Ability to architect and deploy services to support distributed systems while maintaining flexibility and high-quality documentation.
  • Strong work ethic and a passion for problem solving.

Preferred Qualifications:

  • 3+ years of experience working with Kubernetes, Docker, and/or Mesos/Marathon.
  • 3+ years of experience working with public cloud infrastructure and tooling.
  • Experience provisioning new systems in a reproducible and maintainable fashion (including the use of technologies like Ansible, Terraform, and Kops).
  • High level of proficiency with Linux systems and services.
  • Strong understanding of security best practices and their implementations.
  • Experience with scripting languages.

U.S. Citizenship Required
