sre.yaml reference
Every field in the sre.yaml schema explained.
Full example
service: payments-api
team: platform-engineering
slos:
  - name: availability
    target: 99.9
    window: 30d
    indicator:
      metric: http_requests_total
      good_filter: 'status!~"5.."'
error_budget:
  burn_rate_alerts:
    - rate: 14.4
      severity: critical
      remediate: scale-up
      notify:
        slack: "#incidents"
        pagerduty: your-routing-key
    - rate: 6.0
      severity: warning
      notify:
        slack: "#sre-warnings"
runbooks:
  scale-up:
    mode: auto
    steps:
      - kubectl scale deploy/payments --replicas=+2
      - wait: 60s
      - assert: availability > 99.9
oncall:
  provider: pagerduty
  escalation_minutes: 10
  notify_slack: "#sre-incidents"
dashboards:
  provider: grafana
  auto_generate: true

Field reference
| Field | Type | Required | Description |
|---|---|---|---|
| service | string | yes | Name of your service. Used in notifications and dashboards. |
| team | string | yes | Team responsible for this service. |
| slos | list | yes | List of SLO definitions. At least one is required. |
| slos[].name | string | yes | Name of the SLO. |
| slos[].target | float | yes | Target percentage. Must be between 0 and 100. |
| slos[].window | string | yes | Rolling window for the SLO calculation, for example 30d. |
| error_budget.burn_rate_alerts | list | no | Burn rate thresholds that trigger actions. |
| burn_rate_alerts[].rate | float | yes | Burn rate multiplier. For example, 14.4 is a critical rate: at that pace a 30-day error budget is gone in about 2 days. |
| burn_rate_alerts[].severity | string | yes | critical or warning. |
| burn_rate_alerts[].remediate | string | no | Name of the runbook to execute. |
| runbooks | map | no | Named runbooks with executable steps. |
| runbooks[].mode | string | no | auto (execute immediately) or semi-auto (post to Slack for approval). |
| runbooks[].steps | list | yes | Steps to execute in order: shell commands, plus wait and assert steps as in the example. |
| oncall.provider | string | no | On-call provider. Currently: pagerduty. |
| dashboards.auto_generate | bool | no | Auto-generate a Grafana dashboard from the SLO definitions. |
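
The rules in the table above can be checked mechanically before a config is committed. The following is a minimal sketch in Python, not the tool's own validator: validate_config and days_to_exhaustion are hypothetical helper names, the config is assumed to be already parsed into a dict, and error_budget is assumed to sit at the top level as the field paths suggest. Treating the 0 to 100 range as inclusive, and requiring remediate to name a runbook defined under runbooks, are also assumptions.

```python
from typing import Any

# Allowed values taken from the field reference table.
ALLOWED_SEVERITIES = {"critical", "warning"}
ALLOWED_MODES = {"auto", "semi-auto"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    errors: list[str] = []

    # Required top-level strings.
    for key in ("service", "team"):
        if not isinstance(cfg.get(key), str) or not cfg[key]:
            errors.append(f"{key}: required string")

    # slos is a required list with at least one entry.
    slos = cfg.get("slos")
    if not isinstance(slos, list) or not slos:
        errors.append("slos: at least one SLO is required")
        slos = []
    for i, slo in enumerate(slos):
        if not isinstance(slo, dict):
            errors.append(f"slos[{i}]: must be a mapping")
            continue
        for key in ("name", "window"):
            if not isinstance(slo.get(key), str):
                errors.append(f"slos[{i}].{key}: required string")
        target = slo.get("target")
        # Assumption: the 0..100 bounds are inclusive.
        if not _is_number(target) or not 0 <= target <= 100:
            errors.append(f"slos[{i}].target: must be a number between 0 and 100")

    # runbooks is an optional map; steps is required inside each runbook.
    runbooks = cfg.get("runbooks") or {}
    for name, runbook in runbooks.items():
        if "mode" in runbook and runbook["mode"] not in ALLOWED_MODES:
            errors.append(f"runbooks.{name}.mode: must be auto or semi-auto")
        if not isinstance(runbook.get("steps"), list):
            errors.append(f"runbooks.{name}.steps: required list")

    # burn_rate_alerts is optional; rate and severity are required per alert.
    alerts = (cfg.get("error_budget") or {}).get("burn_rate_alerts") or []
    for i, alert in enumerate(alerts):
        prefix = f"burn_rate_alerts[{i}]"
        if not _is_number(alert.get("rate")):
            errors.append(f"{prefix}.rate: required number")
        if alert.get("severity") not in ALLOWED_SEVERITIES:
            errors.append(f"{prefix}.severity: must be critical or warning")
        # Assumption: remediate must name a runbook defined in this file.
        if "remediate" in alert and alert["remediate"] not in runbooks:
            errors.append(f"{prefix}.remediate: unknown runbook {alert['remediate']!r}")

    return errors


def days_to_exhaustion(window_days: float, rate: float) -> float:
    """Days until the error budget is spent at a constant burn rate."""
    return window_days / rate
```

The second helper spells out the arithmetic behind the rate field: a burn rate of 1 spends the budget over exactly one window, so with window: 30d a rate of 14.4 leaves 30 / 14.4, or roughly 2.08 days, and a rate of 6.0 leaves 5 days.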