A Day in the Life of an SRE | Tiago Dias Generoso
Today we have Tiago Dias Generoso sharing his #SRE Story. I came across Tiago’s blog post on Observability — Tooling Decision Guide, and a lot of its ideas resonated with me. So I decided to see if Tiago would be interested in sharing his story with us. And here he is!
Tiago, please introduce yourself. How did you come into the SRE-verse?
Hello, I am Tiago from Brazil. I joined IBM in 2009 and worked in different positions around monitoring, as a system administrator, a network administrator and a system architect.
From 2017 onwards, we started a big transformation in the organisation with respect to observability tooling. That’s when I actively moved into the SRE and observability world. I was leading a team of around 50 people that worked on standardising tooling, helping other internal teams in IBM adopt observability practices, and evaluating new products through proofs of concept.
In 2021, I became part of Kyndryl as it spun off from IBM. These days I am focusing more on integrating OpenTelemetry into our applications.
Where do you think teams struggle the most in their Observability journey? Is it in the instrumentation phase, or is it in the phase where you want to make sense of the observed data and take decisions based on it?
Because of auto-instrumentation in tools such as OpenTelemetry, most teams can get started with instrumentation. But what to do with all of that data? The real struggle happens in contextualising the SLIs and KPIs for the application or the service and mapping them to system reliability. It also needs some awareness of the business context, so in my experience teams need time to come up with those signals for their service.
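To make the idea of mapping an SLI to reliability a little more concrete, here is a minimal sketch in Python. It is not taken from Tiago's setup: the names `RequestWindow`, `availability_sli` and `error_budget_remaining`, and the 99.9% target used in the example, are all assumptions chosen for illustration. It computes an availability SLI from request counts and shows how much of the error budget implied by an SLO target is left.

```python
from dataclasses import dataclass


@dataclass
class RequestWindow:
    """Request counts for one measurement window (hypothetical shape)."""
    total: int  # all requests seen in the window
    good: int   # requests the team counts as "good" (e.g. successful and fast enough)


def availability_sli(window: RequestWindow) -> float:
    """Fraction of good requests; an empty window is treated as fully healthy."""
    if window.total == 0:
        return 1.0
    return window.good / window.total


def error_budget_remaining(window: RequestWindow, slo_target: float) -> float:
    """Share of the error budget still unspent (1.0 = untouched, below 0 = overspent)."""
    allowed_bad = (1.0 - slo_target) * window.total
    actual_bad = window.total - window.good
    if allowed_bad == 0:
        # No budget at all (empty window or a 100% target).
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Assumed example numbers: 100,000 requests, 50 of them bad, 99.9% SLO.
    window = RequestWindow(total=100_000, good=99_950)
    print(f"SLI: {availability_sli(window):.4%}")
    print(f"Error budget remaining: {error_budget_remaining(window, 0.999):.1%}")
```

What counts as a "good" request is exactly the business-context decision discussed above; the arithmetic is the easy part.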
You lead a remote team of 50 people managing Observability for other distributed teams. How has your experience been leading such a distributed team?
Yeah. My team members are spread across the USA and Europe. I always thought that the occasional coffee break in an office environment is where new ideas come out in the open :) But most of my team members have been working remotely for a long time. It involves a lot of meetings, though. And because the team is distributed across geographies, we have to be mindful of each other’s time zones. We use GitHub and Jira heavily for project management and documentation, which helps.
Do you have any specific work gear for working remotely?
Yeah, I have a separate screen and keyboard. I also use two separate notebooks to track my notes across projects. I moved into a leadership role recently, so I have to attend a lot of meetings and take notes, and having two notebooks helps. I have also set up a dedicated office space in my home.
What does your typical day look like now that you have moved into a leadership role?
The first thing I do these days is take a look at my emails to understand if anyone needs me because, as I mentioned, our team members are spread across continents. So the first half of the day mostly goes into following up with people, making sure they have everything they need, project planning and so on.
The second half is where I do most of my hands-on work: taking stock of a few important applications, doing research and active coding.
You help teams with their observability journey. Where do you find information about what’s new in the industry?
I like Medium a lot because people share their opinions and experiences there, which is different from just reading the documentation. I also follow a few newsletters around SRE and DevOps, such as SRE Weekly and SRE Brasil.
You are very active on your blog. How do you find topics to write about and what is your writing process?
I pick up topics from my daily work, where someone is asking me questions or I need to learn something to solve a problem. I focus heavily on adding visualisation using images because I think that makes the flow clear to the reader. I create all those diagrams myself because I want to make sure people understand what I am writing. I use Grammarly to fix the grammar. Identifying the topic is the most difficult part.
What is an important trait that someone should have to become a better SRE?
Curiosity - the most important thing.
Being a generalist across a breadth of topics such as networking and TCP/IP, and a specialist in a few of them.
Having patience, because debugging is hard and time-consuming.
Ability to negotiate with other teams.
Technical things can be mastered, but soft skills are hard, and I think they play a major role in doing a better job as an SRE.
Metrics vs Traces?
I really like traces because they help correlate metrics and logs by providing the context between them. I use the visualisation of traces a lot to debug problems in applications.
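As a rough illustration of what "providing context" can mean in practice, the Python sketch below groups spans and log records by a shared trace ID, so that everything belonging to one request can be read together. This is only a sketch under assumptions: the dictionary shapes, the `trace_id` field name and the `group_by_trace` helper are hypothetical and do not represent any particular tool's API or Tiago's own tooling.

```python
from collections import defaultdict


def group_by_trace(spans, logs):
    """Bucket spans and log records by trace ID (hypothetical record shapes)."""
    by_trace = defaultdict(lambda: {"spans": [], "logs": []})
    for span in spans:
        by_trace[span["trace_id"]]["spans"].append(span)
    for log in logs:
        trace_id = log.get("trace_id")
        if trace_id is not None:  # logs without a trace ID cannot be correlated
            by_trace[trace_id]["logs"].append(log)
    return dict(by_trace)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    spans = [
        {"trace_id": "t1", "name": "GET /checkout", "duration_ms": 840},
        {"trace_id": "t1", "name": "db.query", "duration_ms": 790},
        {"trace_id": "t2", "name": "GET /health", "duration_ms": 3},
    ]
    logs = [
        {"trace_id": "t1", "level": "ERROR", "message": "query timeout"},
        {"level": "INFO", "message": "service started"},
    ]
    for trace_id, items in group_by_trace(spans, logs).items():
        print(trace_id, len(items["spans"]), "spans,", len(items["logs"]), "logs")
```

With the records grouped this way, a slow span and the error log emitted during the same request end up side by side, which is the kind of correlation a trace view gives you visually.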
We are almost out of time. So, one last question: if you were not an SRE, what would you be?
I like to architect solutions and plan strategies, so I think I would be a system architect.
Do you play any strategy games?
I used to play Age of Empires, but these days I play Peteca, a local sport that is very popular in Brazil and is similar to volleyball. That’s how I recharge myself.
Thanks a lot, Tiago, for taking the time to share your SRE story with us. Folks can reach out to Tiago on LinkedIn.
Readers - If you want to be featured on SRE Stories or nominate someone, please submit this form. You don’t have to have the SRE title to share your story. Let’s learn from each other 😊