CrowdStrike

Sr. IT Monitoring Engineer/Site Reliability Engineer (Shift: 12PM-9PM IST) (Remote)

Posted 8 Days Ago
Remote or Hybrid
16 Locations
Senior level
Responsible for designing and maintaining monitoring solutions for IT infrastructure, focusing on reliability and automation, and participating in incident management.
The summary above was generated by AI

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role 

The CrowdStrike Information Technology team is looking for a skilled Sr. IT Monitoring Engineer/Site Reliability Engineer (SRE) to join our IT Operations team. In this role, you will be responsible for designing, implementing, and maintaining monitoring solutions that ensure the reliability, availability, and performance of our critical IT infrastructure and applications. You will work at the intersection of operations and development, applying software engineering principles to operations tasks while focusing on system reliability and automation. This position requires a proactive approach to identifying and resolving issues before they impact business operations, as well as participating in on-call rotations to address incidents when they occur.

What You’ll Need

  • 5+ years of experience with enterprise monitoring tools (Prometheus, LogicMonitor, Datadog, ThousandEyes, Zscaler Digital Experience (ZDX))

  • Strong proficiency in scripting languages (Python, Bash, PowerShell) for automation

  • Experience with log management platforms (ELK stack, Splunk, LogScale)

  • Working knowledge of cloud services monitoring (AWS CloudWatch, GCP)

  • Experience with application performance monitoring (APM), digital experience monitoring (DEM) and infrastructure monitoring

  • Knowledge of SRE principles, SLOs, error budgets, and incident management

  • Experience with automated alerting, remediation workflows, and CI/CD pipeline monitoring

  • Familiarity with Infrastructure as Code (Terraform, Ansible) and containerization (Docker, Kubernetes)

  • Strong incident triage, root cause analysis, and documentation skills

  • Experience participating in on-call rotations and emergency response

What You'll Do

Monitoring and Reliability

  • Design and maintain comprehensive monitoring solutions across infrastructure and applications

  • Configure appropriate alerting thresholds to ensure timely response to potential issues

  • Define and track SLOs and error budgets for critical services

  • Create and maintain dashboards providing real-time visibility into system health

  • Conduct regular reviews of system reliability and recommend improvements

Incident Management and Operations

  • Participate in the on-call rotation to respond to alerts and incidents

  • Lead incident response efforts and conduct thorough post-incident reviews

  • Document incidents, resolutions, and lessons learned

  • Develop and refine incident response procedures to improve MTTR

  • Implement proactive monitoring to detect potential issues before they impact users

Automation and Collaboration

  • Develop scripts and automation to streamline monitoring tasks and reduce manual effort

  • Create self-healing systems that can automatically remediate common issues

  • Integrate monitoring tools with other operational systems

  • Work closely with development, infrastructure, and security teams

  • Provide guidance on monitoring best practices and observability

  • Maintain comprehensive documentation for monitoring systems and procedures

Continuous Improvement

  • Stay current with industry trends in monitoring and site reliability engineering

  • Analyze monitoring data to identify patterns and improvement opportunities

  • Implement metrics to track the effectiveness of monitoring processes

  • Contribute to the evolution of the organization's monitoring strategy

Preferred Qualifications

  • SRE, cloud platform, or monitoring tool certifications

  • ITIL Foundation certification

  • Bachelor's degree in Computer Science, Information Technology, or related field

Shift timings: 12PM-9PM IST

Benefits of Working at CrowdStrike:

  • Remote-friendly and flexible work culture

  • Market leader in compensation and equity awards

  • Comprehensive physical and mental wellness programs

  • Competitive vacation and holidays to recharge

  • Paid parental and adoption leave

  • Professional development opportunities for all employees regardless of level or role

  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections

  • Vibrant office culture with world-class amenities

  • Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

Top Skills

Ansible
AWS CloudWatch
Bash
Datadog
Docker
ELK Stack
GCP
Kubernetes
LogicMonitor
LogScale
PowerShell
Prometheus
Python
Splunk
Terraform
ThousandEyes
Zscaler Digital Experience

CrowdStrike Sydney, New South Wales, AUS Office

Sydney, Sydney, Australia
