“Cloud Confessions” Uncover Changing APM Realities for Tech Pros — SolarWinds TechPod 025

The SolarWinds® Cloud Confessions 2020 survey explores how extensively technology professionals are using APM tools, whether on-premises or for SaaS-based application management, and how they monitor these environments. One of the things the survey revealed this year was the introspection our colleagues in tech are doing about the changing nature of tech work; the relationship of IT to the business; and how our experience should have taught us by now to expect disruption in these moments of significant change, no matter how carefully we try to formulate our plans. This episode is brought to you by the SolarWinds APM Suite. Simplify your full-stack monitoring for hybrid, Azure, AWS, and other cloud-native IT environments and gain valuable performance insights across user experience, applications, services, and infrastructure.

Guest

We’re Geekbuilt.® Developed by network and systems engineers who know what it takes to manage today's dynamic IT environments, SolarWinds has a deep connection to…
Leon Adato

Host

Leon Adato is a Head Geek™ and technical evangelist at SolarWinds, and is a Cisco® Certified Network Associate (CCNA), MCSE and SolarWinds Certified Professional (he…

Episode Transcript

Leon: The SolarWinds Cloud Confessions 2020 survey explores how extensively technology professionals are using APM tools, whether on-premises or for SaaS-based application management, and how they monitor these environments. It also showcases areas where tech pros feel confident in their APM tools and strategies, as well as challenges and avenues for building confidence. I’m Leon Adato and joining me today on SolarWinds TechPod is Adam Bertram, aka Adam the Automator. In our last conversation, Adam and I dug deep into the implications the report had regarding APM, but there was so much more for us to discuss that we’re back for a whole other episode. Adam, thank you for joining us again.

Adam: Hey Leon, I’m glad to be back again.

Leon: Okay, so some people know you from your website or your podcast or the other work that you do, but for those folks in the listening audience who are hearing you for the first time, can you take a moment for some shameless self-promotion. Tell us a little bit about yourself and where they can find more of your work.

Adam: Sure. I’m Adam Bertram, as you said, also known as Adam the Automator since I’m a big fan of automation. You can find me at my blog. I blog a lot and it has a lot of guest authors on adamtheautomator.com and you can also find me on the Twitters @ADBertram.

Leon: Great, and just to round things out, again, my name is Leon Adato. I’m a Head Geek at SolarWinds and that actually is my job title. Greatest job on earth. You can find me on the Twitters, as the youngins say, @LeonAdato. You can also find me on the SolarWinds user community called THWACK.com. My username there is AdatoLE. And I want to start off with a quote from the survey that gets to the heart of, I think, the self image and self perception of IT practitioners. It says IT pros are, “confident in their ability to manage and monitor applications on-prem, in hybrid environments, and in the cloud. This confidence mostly sits with their ability to troubleshoot.” And the other quote, “Troubleshooting and monitoring as the top two areas where tech pros have the most confidence is consistent with last year’s findings. In 2019, troubleshooting application issues was the number one activity tech pros spent their time on, with 48% of respondents choosing this as the top three tasks.”

Leon: So obviously you get good at the stuff you do a lot, even if you hate doing it, which was clear from the 2019 survey, that it was one of the top three tasks and they wished they could stop doing it. But it also speaks to the maturity of monitoring not being a continuum, but actually a set of interlocking abilities where if you’re strong in one area, you compensate for a weakness in another. IT is used to troubleshooting, but I want to talk about what this says to you. I mean this was a survey about APM and yet troubleshooting comes to the top about this is how we use our APM and this is what we do. I think there’s something about that that speaks to the heart of IT pros. What was your take on that?

Adam: Yeah, I definitely think so, too. You hit a good point, to where you think people get good at it even if they don’t like to do what they’re doing. And I think that’s how IT is with troubleshooting. I mean essentially it’s what they know. They know how to fix stuff, but it’s hard to really get in that mindset of proactive engagement, proactive things. Because going back up, I like to go back to some of the stories that I’ve had in my career. There was a position that I was at where the management was concerned, they didn’t see the forest for the trees. They were concerned about closed tickets. They measured success of each individual, not only on the support side, they did the majority on the support side, but also on the engineering side to where they were measuring customer success, they define customer success as number of tickets closed.

Leon: Right.

Adam: How long people spend on tickets, how fast would they address these things. And that set the incentive of well, then all I have to do is close tickets. And us being brilliant people and brilliant IT engineers, we said okay, well let’s just kind of game the system.

Leon: Years of playing Halo and Quake and Doom teach you how to game systems really fast.

Adam: Exactly. If my bonus is getting measured on how many tickets I close, well watch out. I’m going to close a lot of tickets. Well obviously, that doesn’t necessarily mean anything because you could be closing 50 tickets and not really help anyone. But at the end of the day, they realize that a closed ticket is a success. It’s a win. Not only that, if they do actually help somebody, I close a ticket, that’s a win, versus a long form project.

Leon: Right. I love the fact that some ticket systems have a mass close button and I keep on thinking that is the worst idea ever to put into a system, is, “Close 37,000 tickets all at once.” Don’t let them do that. If you have 37,000 tickets, you have a different problem to solve. But okay, all right. But you’re absolutely right. You’re going to get what you measure and you better make sure that you’re measuring what you really want.

Adam: Yep. They were just fixing the symptom rather than the core sickness, disease, I guess if you want to call it that.

Leon: Right. I think it also speaks to an attitude where IT is still a cost center. And so what do you do? Well, I close tickets because that doesn’t provide value, but we didn’t expect you to provide value. And you had a comment as we were preparing for this conversation about how DevOps is really changing that attitude a lot.

Adam: Yeah, it definitely has. Just years and years ago when I first started out, you think of Ops or IT, what’s it called, or now some people just call it IT Ops. It’s a cost center. It keeps the lights on. I think that ingrained nature of “I just don’t want this server to go down. I don’t care what happens on it. I just want to make sure that the lights stay on. I want to make sure that it’s still available through my monitoring. I want to see all green.” That’s a big thing. I want to see all green on my dashboard. That’s more of a cost center thinking mentality. But like you said, what we were talking about earlier was, with DevOps, that’s really one tenet of DevOps.

Adam: The kind of culture perspective is really turning IT from a cost center to a value add for the business, rather than thinking, “Oh, it’s just IT. We’re going to stick them down in a basement. Oh, they’re just costing us money.” Now if you combine Dev and Ops together with software developers, maybe you’re an eCommerce company, maybe you are developing applications for your customers. At that point, if Dev and Ops can work together and build a single product, IT and Dev are not completely separate. They provide value. We’re actually making money off of the products and the services that IT provides.

Leon: Right. It takes your mentality from, “I’m good at what I do after it’s broken, like that’s where my strength is. Or maybe shortly before it breaks, but that’s where I’m good,” to “I’m good at making things better. I’m good at proactively thinking about an existing system and where can I improve this to make someone’s life easier, better, to make a process move more smoothly, to remove friction.” That’s a very different mindset for, I’ll say, traditional ops trained IT members. But it’s an important place to go because the report really does bear out this whole thing about getting good at troubleshooting.

Leon: Just to quote again, 78% of tech pros report spending less than 10% of their time proactively optimizing their environment versus reactively maintaining it. In 2019, 77% of the respondents reported spending the same amount of time on proactive optimization, and that sentence is kind of hard to follow. What it’s saying is that last year and this year, IT pros spent less than 10% of their time improving the environment. They spent most of their time troubleshooting. And again, I think that there is a conversation to be had about how can we, as IT practitioners, move ourselves into the frame of mind of making things better? How do I get good? How do I build the muscle of looking at the world, my world, through the lens of what can I improve?

Adam: Yeah. I think a lot of this goes back to automation, because automation provides you with… I could go on and on about automation, but essentially, automation automates all that stuff, all the boring stuff, lots of the things that you would traditionally spend a lot of time on. I think that should really give people more time to be more proactive. And going back to that previous company that I was talking about before, they really didn’t have time to be proactive on a lot of this stuff because they weren’t automating more. I think if we can leverage more automation, that can take care of lots and lots of the daily traditional Ops skills, or not skills, traditional Ops tasks that have been around forever. At that point, if we can give IT more time, they can definitely spend more time being more proactive, keeping a closer eye on monitoring tools, on various tools that allow them to see more and to understand the various services that they’re monitoring.

Leon: Yeah, I think that you can start on a very personal level that we all, as IT pros, have things that we do repetitively that we have to, whether it’s signing our name to the bottom of an email, or filling out a particular set of information over and over again, or creating ticket information. Macros is really what I’m getting at, that we can start to think about our world in terms of macros and mini scripts and things. How can I do this thing that I do over and over again with a couple of keystrokes as opposed to a really long set of commands? And if you get into that mindset and you say, “Wow, this thing used to take me three minutes or five minutes or whatever, and now I can do it in a couple of clicks of a button.” Then you start to use that experience to look at the world around you, to look at the things that happen outside.

Leon: Again, I have spent 20 years in the monitoring space and so I look at things like what is the most populous ticket in the queue? Back to tickets, right? But not because they got closed the most, but what is the ticket, especially the automatic ticket, the one that results from monitoring. What is the one that you see the most of? Because if I see the most of something and I can build a script or routine or response that automatically deals with even 20%, 30% of those tickets and causes them never to happen, or to automatically reverse themselves.

Leon: As an example, disc full. If you’re in an environment of any kind of moderate size, two, three, four thousand devices, on up to 10,000, 30,000 devices, you’re going to get a lot of disc full alerts. They just happen all the time. Discs get full. It’s an imperfect world. Screws fall out of doors. That’s what happens, as Bender said in that famous movie. But what do you do when you get a disc full? Well, most people, if you’re like me, check the temp directory, clear the temp directory, clear old log files. We clear that out and seven out of 10 times, that was the problem. Just too many log files. The disc is fine. Clear it out, move on.

Leon: Well, what if you wrote a script that did that? What if you wrote a script that as a response to the disc full incident automatically went back to that machine and cleared out the temp files and all that stuff, and then if the problem did not persist, tickets reversed, not closed. Closing tickets is for humans. Closing tickets, to say this problem is gone now, requires a human to look at, but it doesn’t require a human to look at it at three o’clock in the morning. It requires a human to look at it in their time at an appropriate time. So you can reduce, in my experience, I was at a company, we removed 70% of the disc full incidents, the tickets, just by implementing that one script. It resulted in the equivalent of three quarters of one staff person’s time for the year.
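
The disc full response described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used at that company: the function names, the seven-day age cutoff, the 90% threshold, and the "reverse"/"escalate" labels are all hypothetical, and a real version would be wired into whatever alert actions the monitoring tool provides.

```python
import os
import shutil
import time


def percent_used(path):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def clear_old_files(directory, max_age_days=7, now=None):
    """Delete regular files directly inside `directory` older than the cutoff.

    Returns the number of files removed. Subdirectories are left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = 0
    for entry in os.scandir(directory):
        is_plain_file = entry.is_file(follow_symlinks=False)
        if is_plain_file and entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.remove(entry.path)
            removed += 1
    return removed


def respond_to_disk_full(mount_point, cleanup_dirs, threshold_pct=90.0):
    """Clean up, re-check, and say what should happen to the ticket.

    "reverse" means the condition cleared and a human can review it later;
    "escalate" means the disk is still full and someone needs to look now.
    """
    removed = sum(clear_old_files(d) for d in cleanup_dirs if os.path.isdir(d))
    still_full = percent_used(mount_point) >= threshold_pct
    return {
        "files_removed": removed,
        "ticket_action": "escalate" if still_full else "reverse",
    }
```

The design choice mirrors the point made above: the script never closes the ticket. It only reports whether the condition cleared, so a person can confirm at a reasonable hour.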

Adam: I’ve done the same as well.

Leon: Yeah, that’s a lot of time, but you have to look at the world in that particular kind of way.
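
Looking at the world that way can start with the question raised earlier: which monitoring-generated ticket shows up most often? A minimal sketch, assuming tickets can be exported as simple records; the field names `source` and `type` and the sample data are invented for illustration and do not come from any particular ticketing system.

```python
from collections import Counter

# Hypothetical export from a ticket queue; the fields are illustrative only.
tickets = [
    {"id": 1, "source": "monitoring", "type": "disk_full"},
    {"id": 2, "source": "monitoring", "type": "service_down"},
    {"id": 3, "source": "human", "type": "password_reset"},
    {"id": 4, "source": "monitoring", "type": "disk_full"},
    {"id": 5, "source": "monitoring", "type": "cpu_high"},
    {"id": 6, "source": "monitoring", "type": "disk_full"},
    {"id": 7, "source": "monitoring", "type": "service_down"},
]


def top_automatic_tickets(records, n=3):
    """Return the n most frequent ticket types among monitoring-created tickets."""
    counts = Counter(r["type"] for r in records if r["source"] == "monitoring")
    return counts.most_common(n)


if __name__ == "__main__":
    for ticket_type, count in top_automatic_tickets(tickets):
        print(f"{ticket_type}: {count}")
```

Whatever lands at the top of that list is the first candidate for an automated response.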

Adam: Yeah. I mean you have to think of monitoring as the first stage. That’s not the complete life cycle of a problem. The monitoring just will discover the problem. At that point, it’s either a human who is going to do it, or if you define all the patterns like you did, where is the temp directory run… The one that I had written was “Run the disc cleanup utility.” If you can define the pattern well enough, you can convert that pattern into a script and then, using the monitoring tool, it can actually discover and remediate at the same time.

Leon: Right. And this does not require knuckles dragging on the ground programming skills. You do not have to be God’s gift to C# or C++ to be able to do this. Just a little bit of scripting skills, cause honestly it’s the commands you would’ve been running yourself.

Adam: Yep. And especially with the PowerShell community, there’s thousands and thousands and thousands of scripts out there already that will probably do what you needed to do anyway.

Leon: Right. Now those little scripting things lead up to a big word. A word that I think is a little bit daunting for some of us who are new to the cloud and hybrid IT, and that’s orchestration. So could you take a minute and talk about orchestration from that standpoint? What is orchestration from the standpoint of “I want to save myself time on repetitive tasks and I want to have more time to deal with the things that require my big human brain”?

Adam: Yep. Think of automation and orchestration as automation being the components of an orchestration framework, if you will. So talking generally, think of maybe an automation as that script that you had written to clean out temporary files on one or more servers. That can be really classified as an automation script. You can have scripts that could be to create a new Active Directory user. You can have a script to bring up a virtual machine. You can have scripts to do all kinds of things. But at the end of the day, this is one thing from the PowerShell arena where I primarily focus on is people will, a team will, come up with a lot of these scripts and solve a lot of great problems like this. But at the end of the day you have to manage all those things, all those scripts.

Adam: And this is where it kind of comes down into the orchestration space, which is essentially calling, managing all that stuff, all as one kind of a unit. So taking us back to an example, so let’s say that you have a script to create. I guess onboarding would be a good example of orchestration, because an onboarding process has lots of different steps. You have create an AD user, create a home folder, or create a soft phone account; it reaches out to all these different services. And you could sort of think of that as if you have a tool that’s saying onboard user Joe Smith, click. That is orchestrating all that automation, all those different scripts to do those individual tasks. So essentially orchestration is considered kind of an umbrella, kind of a platform or a management service in a very general sense of managing all of those automation scripts and all those routines and tasks.
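
The onboarding example can be sketched as one small orchestrating function that calls individual automation steps in order. Everything here is hypothetical: the three step functions only return a message rather than touching Active Directory, a file server, or a phone system, and a real orchestration platform does far more (scheduling, credentials, retries, logging).

```python
def create_ad_user(user):
    """Stand-in for a script that would create a directory account."""
    return f"directory account created for {user}"


def create_home_folder(user):
    """Stand-in for a script that would provision a home folder."""
    return f"home folder created for {user}"


def create_softphone_account(user):
    """Stand-in for a script that would register a soft phone account."""
    return f"soft phone account created for {user}"


ONBOARDING_STEPS = [create_ad_user, create_home_folder, create_softphone_account]


def onboard(user, steps=ONBOARDING_STEPS):
    """Run each automation step in order and stop at the first failure.

    Returns a list of (step_name, status, detail) tuples so a person can see
    exactly how far the onboarding got.
    """
    results = []
    for step in steps:
        try:
            results.append((step.__name__, "ok", step(user)))
        except Exception as exc:
            results.append((step.__name__, "failed", str(exc)))
            break
    return results


if __name__ == "__main__":
    for name, status, detail in onboard("Joe Smith"):
        print(f"{name}: {status} ({detail})")
```

Each step stays a plain, independently testable script; the "onboard user Joe Smith, click" experience is just the thin layer that calls them as one unit.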

Leon: Right. I like the image. So my father was in the Cleveland Orchestra for 46 years and New York Philharmonic before that. So the idea of an orchestra brings up very visceral memories for me, but they do have a conductor at the head of that orchestra, 120 people all playing different instruments in different parts and the conductor, I think a lot of people when they see an orchestra playing think that that guy’s just waving his hands. No, no, no. He is actually telling each individual player how loud, how soft, how fast, and he’s giving very intricate instructions with those subtle hand movements. And I like the idea of an orchestration tool saying to each script, no, no, not you yet. Wait a second. You, more, more, more, more, more. Run again, run again, run again. Okay, good. Now you.

Leon: And the idea that that’s what it’s doing is each individual part, the script is an instrument and that the orchestrator is really taking a look at what’s happening in that moment. What are the acoustics in the hall? How is this piece playing? How is the soloist working? Whatever. And telling the entire orchestra how to respond to those inputs so that it creates a successful outcome in the end. So that’s just an interesting way, and I liked the choice of word that they chose for that.

Adam: Yeah, dependencies is also a big thing. I just wanted to mention that. An orchestrator knows the dependencies between each of those. Does this script run first? Does this script need to run? Now this script needs to run. The individual scripts don’t know that, but the conductor or the orchestration layer would.
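
One common way to give an orchestration layer that knowledge is to declare which scripts depend on which and compute a safe run order with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the script names and the dependency map are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key lists the scripts that must finish
# before it is allowed to run.
DEPENDENCIES = {
    "create_ad_user": set(),
    "create_home_folder": {"create_ad_user"},
    "create_softphone_account": {"create_ad_user"},
    "send_welcome_email": {"create_home_folder", "create_softphone_account"},
}


def run_order(dependencies):
    """Return the scripts in an order that respects every declared dependency.

    Raises graphlib.CycleError if two scripts depend on each other.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, script in enumerate(run_order(DEPENDENCIES), start=1):
        print(position, script)
```

The individual scripts stay unaware of each other. Only the dependency map, which belongs to the orchestration layer, knows that the account must exist before the home folder is created.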

Leon: So, this next quote from the survey is so important that I’m just reading it and we’ll dive in from there. And the quote goes like this: “Tech pros are collecting these business metrics, but there’s a need to bridge the gap between business metrics collected and tech pros confidence in their ability to communicate performance to the business.” And what this says to me is that they actually have the tea leaves in their cup, but they can’t read them, or that they see the metrics and they can explain them to other technical people, but they can’t talk to the business. And I have spent a lot of time over the last several years talking to IT folks about their need to learn to speak business-ese as a second language. Not as a first language. No one is asking you to take the hair, or the hair you have left, and puff it up and make it pointy, like the pointy haired boss. No one’s asking you to do that. But you do need to learn to speak business as a second language and be able to translate that, or else you’re not going to be successful. So I just, I wanted to get your thoughts on that as well.

Adam: Yep. I’ve, as I’m sure you have, I’ve come across this a lot in my career and traditionally, us geeks, we don’t speak business. We speak PowerShell or we speak whatever-

Leon: Perl, Perl. We speak Perl. No, no one speaks Perl anymore. I’m like the only one left. But anyway, keep going. Yes.

Adam: Maybe a little bit. Maybe a small portion per whole, no. But yeah, they don’t really speak business because they don’t really know… They’re so focused on their job and exactly what they’re doing. Their boss may come to them with a request saying, “Okay, we need to deploy a bunch of these laptops,” for example. In this day and age, I need to deploy a lot of laptops because people are going to be working from home. Well, the IT person may immediately think, “Oh crap, I got to do all of this work and now I have to deploy these laptops.” They don’t really understand the reasoning, the why, behind that. And I think a lot of IT professionals need to maybe ask the manager, “Well, can I get an idea of why? Or what is the bigger picture?”

Adam: Because a lot of us think like, “Well, we’re just going to focus on exactly what we’re doing now and I don’t want to take the time to do all this, deploy all these laptops to… Now I have to deal with users and all of this sort of stuff.” But if they would understand the why, then that would help, at least for me. I come back to, if I understand the why of this, I feel a lot better about it. I feel like you have a sense of purpose. If you’re taking this back to the business value, this goes back to also a big thing of DevOps, which is about business value. Well, that server that you’re spinning up or that laptop you’re deploying to a user. Well, if you do that, then the business will make more money, which then will continue to give you a job, which then will maybe give you a bonus, or it will relate to you. You’ll feel like you’re contributing to something greater than just building a laptop. And I think the whole DevOps thing kind of brings in that soft, cushy stuff of let’s work more as a team and really understand, not only what you’re doing, but why you’re doing it at the end of the day.

Leon: Yeah. I had an experience early on in my career that not a lot of people have. It wasn’t unique, it was just a conversation that doesn’t occur a lot. I was working for a manufacturing company that made circuit boards and it was toward the end of my shift and one of the zebra printers in the factory area was down. And zebra because they print bar codes, that’s why they’re called zebra printers. And it was down and I’m like, “Ugh, okay.” I had 15 minutes left in my shift. I took a look at it and was like, “I’m going to need to come back tomorrow and work on this one.” And the person who was in charge, the manager of the line, said, “Not a problem. If you’ve got to go home and pick up your kids, I totally get it. But let me just explain to you that these boards can’t ship until that zebra printer comes back online and every one of these boards represents $12,000 worth of profit to us.”

Leon: And I turned around and I’m looking at seven racks that each had 120 boards in them. And he said, “Yeah, we can’t box these and ship these until that printer’s back online. So do what you need to do. But just know that each one of those boards is $12,000 worth of profit.” I’m like, “I’ll make a phone call and be right back.” And I didn’t get yelled at, nobody called me into their office, “Do you realize what you cost us?” None of that. This person very calmly said, “Look, we all have a life and we get it, but you need to understand the financial decision you are making, even though you think all you’re doing is working on a printer.”

Adam: Yep.

Leon: And I think that DevOps, because of its focus on, again, the user experience side of things and the metrics that aren’t just hardware metrics, the metrics of how many transactions, how many customers, how many up-sells, how many cross-sells, all those intangibles. Because DevOps focuses on being able to capture and relate those, that the DevOps community, the DevOps culture, has instilled in IT folks a better understanding. But that understanding is there. I think the message I’d like for people who are listening to know is that no matter what level of IT you’re working at, your decisions are financial decisions. Your decisions have an impact on the company and you just may not know it. Sometimes that’s because it’s not communicated, but a lot of times it’s because you haven’t taken the time to figure it out or find out. But the information’s there. It just may be in a language you’re not comfortable with speaking. So anyway, I just wanted to share that experience with you and with everyone. You had mentioned that if you do have a group of people who are either not interested or not so good at learning to speak business-ese, that there is a way around it, that a team lead is another option. So can you explain that?

Adam: Yeah, as we were talking, I was mentioning a lot of us IT people, we don’t really speak business-ese by default. It’s definitely not our default. And one thing that I have seen some success on in previous positions was the concept of a team lead. A team lead, they’re not necessarily your manager. You don’t report to them. They are essentially the liaison, I guess, between the hardcore geeks that are just getting the job done and that’s just what they enjoy. They don’t enjoy the business aspect at all, versus the team lead, in some of these organizations, where they like the kind of management aspect, to some degree. They like leading. They’re usually more senior. And at that point they understand they can be that liaison or that conversion between “I need to roll out this application because it’s going to make us X number of dollars,” to convert that to, “Okay now team we have to roll out X number of servers and deploy this and configure this.” They’re kind of the liaison.

Adam: And they’re not necessarily a manager because at some degree you have to have that leader aspect. You have the manager, the traditional support, the traditional hierarchy that organizations follow, but more of a team lead that’s more senior or likes that management, leadership style. I think it’s a really good way to go about that as well.

Leon: Hey look, in IT, we know all about data transforms. We know about converting from this database platform or that database platform. And what you’re talking about is just another data transform.

Adam: Yeah, too bad we don’t have one of those language converter kits on our self. The necklaces that some of the doctors have. The CIO can come to you and speak into it and out will come geek.

Leon: Right, exactly. Hey look, if anybody needs a startup idea, there’s one for you, is to translate. I know that online language classes have Klingon and Valyrian, so they probably will have a geek curriculum coming out soon.

Leon: Adam, thank you so much for taking the time. This has been an amazing series of conversations. One more time, if people wanted to find out more about you, where can they find you and your stuff?

Adam: Sure. They can find me at adamtheautomator.com or you can hit me up on the Twitters @ADBertram.

Leon: Fantastic. And if you’re looking for the data, more of the deep dive into the Cloud Confessions report, you can find that at solarwinds.com/cloudconfessions. Adam, thank you again.

Adam: Yep. Thank you.