Site Reliability Engineer Training Course

Hands-on SRE fundamentals for DevOps and cloud engineers.

Duration: 5 Days

Level: Foundation

Format: Instructor-Led

Certification: SRESchool

About this course

This 5-day instructor-led training course provides a comprehensive foundation in Site Reliability Engineering. You will learn the core SRE principles, practice with real tools, and build the skills needed to operate and improve reliable production systems. The course prepares you for the Certified Site Reliability Engineer (CSRE) certification.

Prerequisites

  • Basic understanding of Linux and command-line tools
  • Familiarity with cloud computing concepts (AWS, GCP, or Azure)
  • Some experience with software development or operations
  • No prior SRE experience required

Who should attend

  • DevOps Engineers transitioning to SRE
  • Cloud Engineers building reliability skills
  • Operations Engineers moving into SRE
  • Software Engineers expanding into reliability work
  • System Administrators moving to cloud/SRE
  • IT professionals establishing SRE fundamentals

What you will learn

  • Explain SRE principles and their relationship to DevOps and software engineering
  • Define SLOs, SLIs, and SLAs for production services
  • Implement monitoring, logging, and observability using modern tooling
  • Manage incidents effectively from detection to postmortem
  • Identify and automate toil to improve operational efficiency
  • Apply SRE practices in real production environments

5-Day Course Agenda

Day 1: Foundations of Site Reliability Engineering

  • Introduction to Site Reliability Engineering
  • SRE principles and the SRE book
  • SRE vs DevOps: key differences
  • Understanding Service Level Objectives (SLOs), SLAs, and SLIs
  • Error budgets and how to use them
  • SRE culture and the reliability mindset

Day 2: Monitoring, Observability, and Alerting

  • The three pillars of observability: metrics, logs, traces
  • Monitoring vs observability
  • Building effective dashboards
  • Alert design: reducing noise and improving signal
  • On-call runbooks and escalation paths
  • Hands-on: Setting up Prometheus + Grafana monitoring

Day 3: Incident Management and Root Cause Analysis

  • Incident lifecycle: detection, response, resolution, review
  • Roles in incident response: incident commander, comms lead
  • Root cause analysis techniques
  • Writing blameless postmortems
  • Incident management tools and workflows
  • Hands-on: Incident simulation exercise

Day 4: Automation, Toil Reduction, and Infrastructure as Code

  • Defining and measuring toil
  • Automation frameworks for SRE
  • Infrastructure as Code with Terraform
  • Configuration management best practices
  • Runbook automation
  • Hands-on: Automating a common operational task

Day 5: Scaling, Capacity Planning, and SRE in Your Organization

  • Capacity planning fundamentals
  • Designing fault-tolerant systems
  • Load testing and performance optimization
  • Implementing SRE practices in your organization
  • Real-world case studies
  • CSRE certification exam preparation and Q&A

Hands-on Labs

  • Setting up Prometheus metrics collection
  • Building a Grafana dashboard for service monitoring
  • Configuring alerting rules and on-call notifications
  • Incident simulation tabletop exercise
  • Writing a blameless postmortem
  • Automating a toil-heavy operational task
  • Infrastructure as Code with Terraform basics

Tools Covered

  • Prometheus
  • Grafana
  • PagerDuty
  • Terraform
  • Kubernetes
  • Datadog (overview)
  • Slack
  • Jira/Opsgenie
  • Bash scripting

Career outcomes

  • Site Reliability Engineer
  • Production Engineer
  • Reliability Engineer
  • Observability Engineer
  • Platform Engineer
  • DevOps Engineer

Course FAQs

The Site Reliability Engineer Training Course is a 5-day instructor-led training program.

The course is available in instructor-led, virtual, and on-site (corporate) formats.

You need a basic understanding of Linux and command-line tools, familiarity with cloud computing concepts (AWS, GCP, or Azure), and some experience with software development or operations. No prior SRE experience is required.

Yes. This course is the recommended preparation for the Certified Site Reliability Engineer (CSRE) certification.

Ready to enroll in the Site Reliability Engineer Training Course?

Contact us to join an upcoming batch or request a private session for your team.