About Me
Site Reliability Engineer & Reliability Advocate
I’m Steve McGhee, a veteran Site Reliability Engineer (SRE) and Reliability Advocate at Google. With over 20 years of experience in the industry, I’ve spent more than a decade at Google building and maintaining some of the most reliable systems on the internet, including Android, YouTube, and Google Cloud.
My journey in SRE began in the early days of the discipline. I was part of the original team that wrote the monitoring for the first-ever Android launch. Since then, I’ve focused on bridging the gap between theoretical reliability concepts and practical engineering implementation.
After a brief “summer vacation” working on cloud migration and architecture at a smaller enterprise, I rejoined Google to help external organizations adopt SRE principles. I currently serve as part of the Google Cloud Incident Response Core Team, helping our largest customers architect for high availability and sustainable operations.
You can download my latest resume here.