About this job
python, kubernetes, terraform, linux, amazon-web-services
Please do not apply for this role if you are not physically located in Europe, Africa or the Middle East (UTC-1 to UTC+3). While this is a remote position, we cannot consider candidates who are not based in these regions. You can find a detailed explanation in our Recruitment FAQs.
At Hotjar, we’re building Behavior Analytics software for businesses selling online. We make it easy for them to go beyond traditional web analytics and understand what users are really doing on their site. As part of our Engineering team, you’ll work on the exciting challenges that come with large-scale web traffic and analytics.
We work in an agile and highly collaborative environment, 100% remotely, and challenge the norms of traditional business leadership. Our focus is on true transparency and respect.
We’re looking for enthusiastic and resourceful Site Reliability Engineers who are passionate about building great products and who love being part of a team, whilst enjoying the autonomous nature of working remotely.
The Hotjar infrastructure processes more than 16,500 API requests per second, delivers over a billion pieces of static content every week, and hosts databases well into terabyte-size ranges. Come join us in handling the woes of a large monolith, standardizing our microservices platform, and keeping us online, fast and reliable.
About the job:
- Reporting into our SRE Team Lead, be part of an agile team setting up and maintaining the infrastructure that powers our applications and services on AWS.
- Design, deploy, and maintain tools and services to support a robust infrastructure.
- Collaborate with and mentor engineers in product teams to help them bring up new microservices.
- Ensure all necessary monitoring, alerting and backup solutions are in place, using SLOs to guide prioritization and put reliability front and center.
- Dive into large codebases, and don’t be afraid of programming more than a few lines of Bash.
- Spend a small amount of your time dealing with incidents and internal change requests. This is not a service-desk or incident-only position: the vast majority of your time will be spent creating and optimizing our tools and infrastructure, not firefighting.
Requirements:
- Strong working knowledge of AWS and Kubernetes.
- 3+ years of professional experience with systems administration / cloud infrastructure / DevOps. Ideally you’ve worked on large codebases before.
- Experience with Python, shell scripting, Terraform, Linux, and robust CI/CD pipelines.
- Not required, but experience with any of the following is considered an asset: PostgreSQL, Elasticsearch, Redis, Traefik.
- Desire to work in a respectful, transparent, and collaborative work environment, following Hotjar’s company values, culture and ways of working.
- Must submit to a background check, confidentially processed by our third-party provider.
You enjoy programming (not just quick scripts!) and are excited to have a mix of developing code and infrastructure in your day.