SafetyCulture

Staff Engineer - Observability

Posted 10 Days Ago
Hybrid
Sydney, New South Wales
Senior level
As a Staff Site Reliability Engineer at SafetyCulture, you will design and support the Observability platform, collaborate across teams to define service objectives, and drive a culture of incident management while addressing reliability and performance issues within microservices.
The summary above was generated by AI


SafetyCulture is a global technology company that is helping to transform workplaces around the world. After witnessing the tragedy of workplace incidents as a private investigator, SafetyCulture Founder Luke Anear recruited a team to help him develop a mobile solution for frontline workers. What we have created is a market-leading workplace operations platform that helps give teams the knowledge, tools, and confidence they need to meet higher standards, work safely, and improve every day. 


SafetyCulture is among the fastest-growing tech companies in Australia. Its bold ambition is to reach 100 million users worldwide by 2032. Opportunities to be part of a journey like this do not come around often. 


What do Staff Engineers at SafetyCulture do?

Staff Engineers at SafetyCulture are the champions of our platform's reliability. As a Staff Engineer, you’ll be empowered to make complex architectural decisions, solve cross-domain challenges, and drive cultural change across engineering.

What will you be doing?

  • Design, develop and support our Observability platform (Prometheus, Loki and Grafana Cloud);
  • Work across engineering teams to define Service Level Objectives;
  • Instrument Go microservices using OpenTelemetry;
  • Write and maintain Go modules providing fundamental capabilities to our applications (e.g. tracing and logging);
  • Drive a culture of Incident Management focused on how we learn and improve from incidents;
  • Engage with teams across Engineering on reliability and performance issues;
  • Educate teams on, and drive, the observability mandate across the organisation.

The successful applicant will:

  • Work closely with our wider engineering team to understand what reliability metrics will enable them to prioritise production stability, and where additional instrumentation will assist in diagnosing complex issues. You’ll be a go-to expert on using our observability platform to uncover the root cause of problems.
  • Manage and scale an observability platform that ingests millions of metric series and terabytes of trace and log data, and provide an opinionated stance on reliability through curated dashboards and SLOs.
  • Evolve our existing Incident Management process and tooling, enabling all of our engineering teams to mitigate, learn and drive improvements when things go wrong.
  • Partner closely with the Engineering Leadership Team to define and share key reliability findings based on production telemetry and incident reviews.
  • Develop capabilities and tooling that enable our engineering community to clearly understand how their production services are running, and how they can diagnose where performance and reliability issues come from.
  • Collaborate with engineering teams across SafetyCulture to help them instrument their services, understand their observability telemetry, and diagnose complex problems within our microservice architectures. You’ll have opportunities to directly contribute to reliability improvements, and to grow your passion within the SRE space.

You will have experience in:

  • Operating Observability platforms at scale.
  • Providing strong technical leadership in SRE concepts.
  • Applying best practices across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
  • Designing and building complex software and systems at scale.

Your professional background will include:

  • Tertiary degree in Computer Science or related technical field, or equivalent practical experience.
  • 8+ years of relevant software development experience, plus mentorship experience.
  • A solid understanding of monitoring, logging, tracing, and observability instrumentation.
  • Experience working with observability platforms like Grafana / Datadog / New Relic / Honeycomb.
  • A solid background in SRE concepts like SLOs.
  • Experience in defining and driving a culture of Incident Management.
  • Proven experience of working on complex and large-scale projects that require high-level technical skills, creativity, and leadership.
  • Proficiency with one or more general-purpose programming languages, including but not limited to: C#, Golang, C++, Python, Java, TypeScript, Scala.

What Do I Get Access To When Working at SafetyCulture?

  • Equity with high growth potential, and a competitive salary.
  • Hybrid working; we encourage you to create the best work blend while working from your home and the local SafetyCulture office.
  • Access to professional and personal training and development opportunities.
  • Participation in hackathons, workshops, and lunch & learn sessions.
  • Community involvement, open source work, attending talks and events, and experimenting with new technologies.

What are the office benefits?

  • In-house Culinary Crew serving up daily breakfast, lunch, and snacks.
  • Barista coffee machine, craft beer on tap, boutique wines, and a range of non-alcoholic beverages.
  • Quarterly celebrations and team events.
  • Table tennis, board games, book library, and pet-friendly office.

We’re committed to building inclusive teams and cultivating a sense of belonging so our people can bring their whole authentic selves to work each day. We seek to make reasonable adjustments throughout our recruitment process to create an even playing field for all candidates. Thanks to the tireless efforts of the entire SafetyCulture team, we’ve built an incredible culture which has seen us recognized as a Best Place to Work in Australia, the US, and the UK.


Even if you don't meet every requirement listed in the ad, please consider applying for this role. We prioritise inclusion and value individuals with potential over a checklist of qualifications. Don't rule yourself out—hit that apply button if this job resonates with you.


You can find out more about life at SafetyCulture via YouTube, Twitter, Instagram, and LinkedIn.


To all recruitment agencies, we do not accept resumes or partnership opportunities. Please do not forward resumes to SafetyCulture or any of our employees. We are not responsible for any fees associated with unsolicited resumes.

Top Skills

C#
C++
Go
Java
Python
Scala
TypeScript
