Confused about much-hyped DevOps? Curious if developers’ monitoring tools are different than those made for operations? Wondering if there are hidden cloud tools in the Orion® Platform modules you already have? As always, SolarWinds Lab™ is here to help. In a SolarWinds Lab first, Head Geeks™ Thomas LaRock and Patrick Hubbard travel to Las Vegas for AWS re:Invent to interview technology pros and SolarWinds customers about how much or even *if* they’re happily using DevOps and cloud technologies in production. Next, Head Geek Leon Adato and Patrick bring the conversation back to the Lab to untangle hybrid IT hype and reveal recent dev and automation features in NPM and SAM you might not know about, like container monitoring. Leon even presents an Orion Platform insider’s how-to for managing containers, microservices, and more. If you’ve ever wondered if you must buy into the hype and adopt a lot of alien processes just to up your automation game, this is the episode for you.
Welcome back to SolarWinds Lab. So glad to see you back again.
And this episode’s going to be a little bit different because we’re going to do something new.
Like actually get out of the studio and shoot live at a conference.
Not just any conference but Amazon’s AWS re:Invent. Some say it’s the largest event of its kind, held by the largest single infrastructure provider on Earth.
Okay, first, Tom says that while also talking about how fast Microsoft Azure is growing. And second, our surveys in the live chat from THWACKcamp 2018 say that maybe a quarter of IT is actually on cloud.
Yeah, so there’s just a little bit of hype.
A little hype? And I’m the first one to admit that underneath the buzzword monolith there’s actually valuable stuff there.
Ah buzzword monolith. I like what you did there. I mean it’s meta but not actually GNU.
Okay it’s not anything like GNU and no one is trying to shame you if you’re not all in on GNU.
You know that’s actually kind of true, right? Like there’s a lot of people that are trying to tell you that you really just don’t count if you’re not like full DevOps.
Right. And the point is more about understanding observability than setting up a Kanban board.
How can you say that DevOps is hype and then invoke observability?
Because one of them is unicorn farts and the other one is actual monitoring.
So that’s what this episode is going to be about? We’re going to do DevOps versus IT Ops versus unicorn farts versus monitoring.
Yeah, pretty much. See, you and Tom are going to toddle off to Las Vegas, talk to our customers and actual people doing actual IT things, and see if we’re really that far behind.
Piece of cake.
But then you have to come back here and then we take what they’re doing, how they actually are using a subset of DevOps tools alongside the main dashboards for keeping their infrastructure online.
Ah so DevOps for the rest of us.
Yes it’s FestivOps.
Well it’s the season.
Right. And you know what we ought to do? Okay so let’s get real first. You go to AWS and ask actual admins, not the cool kids, if they’re actually doing DevOps.
All right, then what are you going to do?
I’m going to wait right here ’cause I did booth duty last year. And then when you come back, you can admit what we already know.
So then whatever I come back with, you’re going to show them how to use it in the real world by real admins using SolarWinds products?
All right, but first we really ought to talk about what DevOps is and isn’t. I mean we made the joke before that it is pretty overloaded with hype.
So let’s start with how do we get to this place where once upon a time it was a great idea and now it’s really confusing.
Like most things. It got caught up in the buzzword bingo and a few executives got really excited about it and they wanted to buy a box of DevOps and sprinkle it all over everything.
And it’s just not that. It never was that.
It’s that age-old story and I hate to use OpenFlow as an example, right? But I mean, it was the idea of technology that started with us, it started with geeks. It started with people who were actually applying technology in the field. Some enlightened developers and some enlightened and eager operations professionals got together and said, hey, I think we can collaborate and do something cool here. And to your point, yeah, then vendors jumped on it and made it into this thing of hey if you buy these products with these check boxes then ta-da!
You’re doing DevOps.
You’re doing DevOps.
Yay! And we used to have a word for it, a very technical word. It was called teamwork.
Okay so the terminology’s a little bit different when we talk about tooling. They tend to kind of fall into three categories, right?
Right. So there’s metrics, there’s logging.
And tracing, right. And then sometimes you can talk about sort of user experience too as sort of that fourth pillar, but when you hear people talk about the three pillars of observability those are the three things that they’ll talk about. And it’s interesting that one of the things that we talk about is we say metrics not monitoring. So how are they different?
So, metrics can come from anywhere.
Monitoring is typically used to refer to poll-based. Now I realize that we’re getting really pedantic about that, like oh that word doesn’t mean that thing. It means whatever, I mean like, just use words, right? But understand that when you’re talking to somebody who’s really deep in the DevOps culture and perhaps fanboy-ing a little bit, that metrics has a particular connotation right now.
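Editor's illustration: the poll-versus-push distinction drawn above can be sketched in a few lines of Python. This is a minimal sketch, not how any SolarWinds product works; the class names `PollingMonitor` and `MetricsSink`, the metric names, and the values are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, str, float]  # (timestamp, metric name, value)


class PollingMonitor:
    """Poll-based: the monitor decides when to ask each target for a value."""

    def __init__(self) -> None:
        self.targets: Dict[str, Callable[[], float]] = {}
        self.samples: List[Sample] = []

    def add_target(self, name: str, read_value: Callable[[], float]) -> None:
        self.targets[name] = read_value

    def poll_once(self) -> None:
        # One polling cycle: visit every registered target and record its value.
        now = time.time()
        for name, read_value in self.targets.items():
            self.samples.append((now, name, read_value()))


class MetricsSink:
    """Push-based: any source can emit a value whenever it has one."""

    def __init__(self) -> None:
        self.samples: List[Sample] = []

    def emit(self, name: str, value: float) -> None:
        self.samples.append((time.time(), name, value))


# Hypothetical usage: the monitor pulls, the application pushes.
monitor = PollingMonitor()
monitor.add_target("db.connections", lambda: 42.0)
monitor.poll_once()

sink = MetricsSink()
sink.emit("checkout.latency_ms", 118.0)  # emitted from application code
sink.emit("checkout.latency_ms", 131.0)
```

In the polling case the collector owns the schedule and the list of targets. In the push case the data can, as the speakers put it, come from anywhere.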
Right, well and the other thing too that you mention there is data from wherever. And I think when you hear people talk about the word observability and you know we did make a joke about it, but observability really is also a way of kind of thinking about it differently. A refreshed view. Because for so long products have been really mature. If I point something that is designed to monitor DB2.
It is going to light up a dashboard with everything that I need for that database instance.
I mean it’s not going to tell me about sort of query level performance tuning, but in terms of the platform itself, or Exchange or any number of applications the infrastructure itself certainly on the networking side or virtualization management platforms. Monitoring is great at going out and– When we say the word monitoring we’re thinking about a thing that you turn on and it pulls all that data in.
But it’s not necessarily a thing like hey can you light up the dashboard out of log data.
Yeah. And I call it the “not me” line. For a long time in at least my career as a monitoring engineer, there was this, in my head, I had this “not me” line. Okay, so me was that monitoring. Collecting data from all sorts of sources using all sorts of interesting techniques and stuff but when someone said log file aggregation, that was in the “not me” territory. And when they said tracing, I was like oh the coders, the programmers, they care about tracing that’s across the “not me” line. And the “not me” line now has moved so that observability now includes those things to create a whole coherent, cohesive view of everything through the stack from top to bottom, but you’re just able to accommodate a wider range of data sources.
And it’s interesting that you call it that. I kind of refer to that same line as the fast versus meticulous line, right? So what I mean by that is, how quickly do I need data and how quickly am I willing to create access to those metrics and then dispose of them. So for example, if I am monitoring an on-prem application that is going to run for years, not for 15 seconds or five minutes or however long that process is going to run.
I will invest some time to make sure that I can set up, from an observability perspective, a set of monitoring poll-based metrics that are super comprehensive and then I make sure that thing is running everyday.
I don’t make a lot of changes to it. I may add a few metrics here or there and I may adjust my dashboards and hopefully you are all spending a ton of time adjusting your alerts because too many alerts is not the policy of SolarWinds. Let’s say that again it is not the policy of SolarWinds. If you have not seen what alerts–
Any of the things.
Alerts How do I Hate Thee–
How I Hate Thee, from THWACKcamp this year, or any number of probably six other episodes, please go back into our catalog.
We’ll have all of them in the show notes, trust me.
Yeah and just go to labs.solarwinds.com, expand that page and search on alerts and you’ll see a whole bunch of them all pop up.
Mm-hmm. But the other thing with DevOps is I tend to think of how quickly can I get metrics data. Because I’m going to start with a bench test, right? Because a lot of times especially if I’m redesigning an application, forget something that’s brand new, but I’m taking an existing application, I have robust monitoring, now all of a sudden I have little to no monitoring. How quickly can I, in my initial tests, achieve parity with what I had before, but then once it’s deployed into the unknown abyss of Ops, again if we sort of–you know what we ought to do let’s take a couple roles here, right? It’s a little bit of a gag, you know, that I sort of play this overly excited developer, automation-focused engineer.
It’s a gag? I thought it was like that’s…
Once upon a time a very long time ago, but listen, I know a thing or two about Ops.
And then of course, you are always the voice of the sage, dependable, quality-focused engineer who’s thinking about keeping the lights on in applications seven years after the Dev team hands it over to Ops. Right?
But when we go back to what we were talking about before about, you know, where did DevOps come from and the original intents. So much of it was collaborative. Right?
And it was almost inviting as opposed to so much of that adversarial communication that you hear on teams about I don’t want to do DevOps or this is too hard or this is a massive culture change or we don’t have budget for that or all the things that are a part of that.
I hate to play, I hate to invoke the “back in the day,” but back in the day we just worked to get it out the door.
Right and it was about inviting collaboration. So for example, operations said to developers, hey you have some tools that could allow us to actually have more time on the weekends without our pagers.
‘Cause back then it was pagers, right?
And then a lot of times for developers, operations said hey you know what you need to improve the quality of service over time? You need a feedback loop where we’re going to take actual real data coming from real operations and constantly feed that back into the development process.
Well there’s also and again this wasn’t any sort of movement or whatever. It just sort of happened naturally in healthy organizations. I almost said healthy organisms which is almost the same. That the developers would work with the operations folks and say how do, you know, how do we help each other? Oh you know what this operation stuff keeping the system up, I can do things to affect that. I can do things to make it better this operation stuff isn’t so bad. Like you said, you started as a Dev, but you found that operations was maybe hard but not impossible and not incomprehensible. And then the operations folks looked at what the Devs were doing and maybe didn’t say oh I want to code a multi-tier, you know, library infused thing but they developed a sense of code. They developed an appreciation for what was happening and said you know what, some of these techniques I can use and I can be aware of and that is, I think, at the heart of DevOps even before it got a name or The Phoenix Project came out in print or whatever it is.
So interesting you would mention that. So where do you think that… where did it become a problem? Where did people suddenly become fearful of it? Instead of just saying oh these are just some additional tools that we would use in particular circumstances?
When it became a thing trademark registered all rights reserved. Like when it became like “DevOps!” that’s when people started to like, you know, I don’t need another SOA in my life. Okay enough of that. It’s time for you to get over to AWS and we’ll find out if you really know what you’re talking about.
Awesome, do I get to use the door?
No, no door for you.
Welcome to re:Invent. And welcome Patrick for finally getting here.
Leon didn’t let me use the door. It’s a transporter.
The door is the magical place and it’s also a good place to have a nap.
That’s actually really true. So why are we here?
We’re here, I’ll tell you why we’re here, not just us but everybody in general. Three things, right? Three things that I’ve noticed. One, people want metrics. They want data. They want the ability, and this is two, the ability to log all of those metrics.
Right and then three, they want to do analysis on the logs that have those metrics. You know what I call that?
Let me guess.
Observability. And observability is one of those terms from that thing that people like to believe in, that sorcery called DevOps.
DevOps, yes. And so DevOps is a thing right? You can buy that?
Yeah, sure. I buy DevOps off the shelf at my grocery store. Don’t you?
It’s not market-y at all? It’s not overloaded?
It is totally a marketing term for something that has existed probably throughout human history. DevOps really isn’t a thing. It’s more of a process.
It’s more of a process. Well especially I mean like our shirts, right? More Dev, less Ops. It’s interesting the way I’ve had so many comments on the shirt and everyone seems to be taking it the same way which is basically automation is good. It’s been good for 5,000 years. The wheel was good. Steam was good.
I never thought of the wheel as automation, but sure.
Okay but electricity was good, right? The idea of robots are good. The idea being automation eliminates drudgery and toil and that is the whole point of what DevOps is supposed to be about. It’s not eliminating operations. It’s getting rid of the drudgery so that you can focus on actually making a difference and accomplishing your goals.
Okay, so, this is a DevOps show then?
It’s a developers show maybe kind of no. What do you think?
Is it a cloud show?
Sort of, yeah.
You’re getting closer. What do you think?
I think what you’re looking at is the world’s largest infrastructure conference. Think about everything that you have here. You have people who it’s all about the data. So you have people and companies here focused on the storage of your data, the migration of your data, the ability to report on that data, the high availability, disaster recovery.
Transformation. Everything you need to do and of course all the JSON. JSON’s everywhere.
Well you included JSON so we’re now JSON compliant.
But all that data is hosted in AWS. They are providing the infrastructure. They are on track to become essentially the world’s largest MSP.
That’s true. Because what do MSPs do?
They back up your data, right? They restore your data when you need it. They give you tools to solve problems, performance insights, they help you with analytics. What else?
Well some of them will, you know, reset your printer but I mean for the most part it sounds like what you’re really saying though or maybe I’ll take a huge jump.
Does it mean that cloud is essentially data as infrastructure? And that all the rest of this is just plumbing and tools that enable that?
Wow. Data as infrastructure.
Like it’s not thinking about the physicality of infrastructure. It’s not servers and chassis, it’s really distilling it back down to the thing that actually matters, which is your data.
Right. And what company would want your data more than, say, Amazon? Maybe Microsoft. These are the companies that want your data because if they have your data they have your business, right?
That’s absolutely right.
And that’s why we’re here.
Yeah well and it’s kind of interesting too because I mean DevOps is obviously an overloaded term.
And so much of it, I was thinking about talking to a customer the other day and he was talking about how there’s, you know, disagreements between teams and they don’t trust each other and maybe they don’t get along. And he’s coming from sort of a traditional on-prem IT environment which is, you know, all about cost savings and SLA and the rest of it. And they’re being tasked with innovation goals and a lot of other things and so they’re trying to incorporate… They’ve basically been told to get DevOps by the end of 2019, right? And you don’t do that.
And so they’re trying to pick up best practices. But I was thinking about it I’m like, and I don’t think everything goes back to an Apollo analogy but if you think about the Apollo F1 engine, right? There’s that myth that they can’t rebuild the Saturn V because they lost all of the blueprints. Which of course is completely ridiculous. NASA has all of the details of everything about that program, but literally they pulled a bunch of F1 engines out of storage to take a look at them for the SLS program, with the idea being, well, maybe we’ll do this instead of SRB boosters, right? And the conclusion was we can’t build them. And the reason we can’t build them is not that we don’t have the blueprints, but there was, you know, there was skill. There was special knowledge that the welders and the other engineers who built them had, there were notes on paper, and those were lost a long time ago, right? So the solution was the new kids at NASA went and redesigned it and, with a lot of 3-D printed technology, were able to dramatically reduce the part count and basically built a design for an engine that you could build today with today’s technology, which would eliminate the problem of not having that engine, right?
Now bear with me.
I’m bearing with you.
In terms of the skepticism, a younger engineer at NASA might say, oh all you old-timers, you know, you didn’t bother to document your code, right? They effectively didn’t keep those notes around and prevented us from having the ability to rebuild the engine now.
But those old-timers would say, okay hold on a second, we managed to send humans to the moon 50 years ago, where’s that engine that you designed? Just show me one of those, right.
Get off my lawn.
Right. So that dichotomy of we have a great idea and we have a great plan, sort of the Dev focus, along with engineers and operations who were like, well we actually do it.
We actually provide the technology and the tools and the services that allow our businesses to continue to operate. So does that work for you? ‘Cause I know you’re a space guy.
That works for me, but that’s what I meant by saying one word: process. It’s a process. It’s a way for you to essentially do business. To do IT. To do it properly. To have teams that are integrated. Now, I’m old enough to remember when we called DevOps synergy. And we just said we need more synergy around here, right?
That’s about the same amount of marketing.
[laughs] But those were the words because we just needed the teams to be integrated and to be in sync with each other and to understand, to know what is it that this group needs now or will need in a week from now. And to have people all at the table understanding what everybody’s doing. Right-hand, left-hand all that type of stuff. All the same jargon we’ve heard for years. But DevOps just encapsulates.
All of that. And it’s really about the process. How is your team, how is your company working? Efficiently or not? And DevOps is that all one big word that just sort of describes all of it to me. That’s how I take it.
Well do you feel like really it comes down to an opportunity to collaborate based on tools, right? That the idea that there were these two separate camps with wildly different technology. I mean you look at the adoption of, you know, a lot of existing SolarWinds customers who are now monitoring a lot of containers, right? That was something they got. They’re dealing with containers and Kubernetes and Mesos. Not just like, you know, single standalone Docker instances. That was going to be DevOps. That was going to be these special new teams, and instead, it’s a commodity technology and anything that’s commodity immediately is about cost savings, which pushes it into traditional IT and it ends up being business challenges that need to be solved and the tools that address them. So I feel like there’s an opportunity for Dev and Ops to really focus on collaborating on tools, right? If I need to drive a nail, I’m thinking about the hammer. If I’m digging a hole, it’s about using a shovel. It’s like, it’s not so much that there’s this great philosophy that has to be adopted, and I’m not discounting Agile and I’m absolutely not discounting the need to really think about culture change. Really to instill the idea of feedback loops is something that takes a lot of effort from both sides. Developers to make sure that they’re baking monitoring and metrics into applications before deployment to make it easier for Ops, but then Ops committing to make sure that they’re sending those metrics back into the development process to improve quality over time.
But I mean is it ultimately like it might be the tools that you use and common tools that work for both groups that’s supposed to meet?
But the tools you’re talking about are just something that helps with part of the process. So, for example, containers are a great way to do test-driven development. It’s like the way to do test-driven development, right? But that’s just a thing you would’ve already been doing that just took you forever and containers just make it a little bit easier. So it’s not, is it part tools? I guess you could say that. But you know what it really is? To me, it’s about if you’re going to be data-driven or not. Are you truly going to be data-driven? Are you going to do things and you’re going to let the data dictate and help drive decisions. Or are you still going to operate on a hunch, right? You just spin up all these containers, you’re going to do a deployment. And let’s say you do something, you deploy something 10 times; it succeeds seven. Seven out of 10 times, it failed three times. Is that a success?
Okay, but now you’re getting into like error rate metrics and error budgets and sort of things.
No, no, what I’m getting into is actually if it’s something being pushed to every desktop, would you want 30% of all desktops to have a failure and be calling IT?
As long as it’s only the executive desktop upgrades that fail, I don’t think you’d have a problem.
Right! So here’s the thing, you would look at somebody and say, no, no, we need like something closer to 100% success rate. This is something user experience, but if the guy in charge of IT says ah I got a feeling we’ll be okay. Let’s just go ahead on Friday and deploy this. So it’s part tools but it’s also part being a data-driven culture. And I think that’s a huge part of DevOps in general is all these different teams have their own area of focus and it’s all about their parts of their data and are they truly letting the data help them make better decisions.
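Editor's illustration: the "seven out of 10" question above is simple arithmetic, and writing it down makes the data-versus-hunch point concrete. This is a minimal sketch. The 99% target and the fleet of 1,000 desktops are assumptions chosen for illustration; neither figure comes from the conversation.

```python
def success_rate(succeeded: int, attempted: int) -> float:
    """Fraction of attempts that succeeded."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return succeeded / attempted


def meets_target(succeeded: int, attempted: int, target: float) -> bool:
    """Compare the observed success rate against an agreed target."""
    return success_rate(succeeded, attempted) >= target


def expected_failures(fleet_size: int, succeeded: int, attempted: int) -> int:
    """Project the observed failure rate onto a larger rollout."""
    failure_rate = 1.0 - success_rate(succeeded, attempted)
    return round(fleet_size * failure_rate)


# The example from the conversation: 10 deployments, 7 succeed.
rate = success_rate(7, 10)
ok_to_ship = meets_target(7, 10, target=0.99)        # hypothetical 99% target
help_desk_calls = expected_failures(1000, 7, 10)     # hypothetical fleet size
```

With these assumed numbers, a 70% success rate misses the target and projects to roughly 300 failing desktops out of 1,000, which is the kind of evidence that should outweigh "I got a feeling we'll be okay."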
Well but isn’t that part of the misdirection around AI and ML? I mean the whole idea about, you’re talking about bias right?
Decision bias, conclusion bias. So you’re saying data helps eliminate human bias.
So we’re being told that, oh no, AI and ML are the only way to get past that. But really those are both completely dependent on data analysis. So, right?
Data and humans. Humans to write the code.
Humans and data. So basically you’re saying that one of the primary goals for anyone, regardless of how they come to operations, whether they’re cloud focused or they’re on-prem, is to really start admitting that they might have everything that they need in the data that they’re already responsible for to start looking to the data and to not go just based on hunch.
I mean like one of the ones that I think is absolutely true and especially talking to customers here, is how often do we think like what are the top ten problems in your infrastructure? And now you could– immediately you’re going to say, I can list the systems, I can list the people, resources, whatever else, budget, but if you looked at the data, would it really be those 10? Would it maybe be 10 other ones that you didn’t even think about?
No, my first would be the data itself. That’d be number one, the data. Data’s never right.
Data’s never right?
I didn’t go to school to become a data janitor, it just sort of happened. Nobody goes to school to be a data janitor, but yet we end up, how much data janitorial services have you done over the years?
Oh I’d probably say, what, 85, 90% of touching data has been janitorial.
Exactly, exactly. That’s where everybody spends the bulk of their time.
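Editor's illustration: the "top ten problems" question and the data janitor complaint above fit together in one short sketch. It counts incidents per system after a small cleanup step. The record layout (a dict with a `system` key) and the sample values are hypothetical; a real ticketing or alerting export would look different.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_problems(incidents: Iterable[dict], n: int = 10) -> List[Tuple[str, int]]:
    """Rank systems by incident count, after basic cleanup of the names."""
    counts = Counter(
        record["system"].strip().lower()   # janitorial step: normalize names
        for record in incidents
        if record.get("system")            # janitorial step: skip blank entries
    )
    return counts.most_common(n)


# Hypothetical incident records with the usual inconsistencies.
incidents = [
    {"system": "Billing-DB"},
    {"system": "billing-db "},
    {"system": "vpn"},
    {"system": None},
    {"system": "billing-db"},
]
ranking = top_problems(incidents, n=2)
```

Without the normalization step the three billing database records would be counted as three different systems, and the ranking would be wrong before any analysis started.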
So one of the things I like is this is actually the second year that Lab has been able to come to AWS, right?
And so what are we doing here this year?
Well, what we have to do is we have to work. Not just our booth, but we need to walk the floor and we need to figure out, you know, what are the challenges people have? What are the solutions on the floor here that are meant to help people?
Talk to you guys.
We have to talk to you guys. So I mean, we’ve got some work to do. That’s what we’re doing here.
All right, so, we’re going to have a little fun? We’re going to go by the booth, talk to a couple of customers?
Fun? There’s no fun here. There’s no fun at re:Invent.
This is a serious–
No fun at all.
This is a serious business, Patrick.
There’s no boondoggle at all.
We’re working hard.
All right, so welcome to the booth and you seem to have survived your date with the unicorn.
So you promised me it was a unicorn, but it turned out to be robo-kitty, and I rode robo-kitty like a boss.
And now you’re sitting in a chair for some reason.
Yes I’m sitting. For no related reason whatsoever.
Yeah, well. So hey, let’s talk about what it’s like here in this booth. ‘Cause it’s a little bit different at this show than maybe where we normally see you at Cisco Live! or Microsoft Ignite or maybe–
VMworld certainly. Or certainly at a SWUG.
But the conversations here are a little bit different and we just had one that was I think the best one of the show.
Not just because it came from a really unexpected place, but the spirit of what he was saying really to me captured what DevOps is all about. So this is a DBA.
Who was talking about using tools and some new techniques to really bring the Ops team, the Dev team, into operations, right?
So what were the two things– What was driving that for him?
Well I remember this thing that he said that he had a goal of, wanting to get the tools, the right tools to them that they would want to use.
Instead of forcing a tool on them and saying this is the direction, this is what we’re going to go. He goes no, no, let’s figure out what it is they want and need.
And give it to them.
Because then they would adopt it and feel invited.
What I loved about it was his spirit, I mean you talk about the culture change of DevOps, was inviting those teams in and instead of it being developers pushing themselves onto Ops which never works.
It was someone in Ops, and you know this, there’s a little bit of a reputation among database administrators as being a little focused on their–
Or that. But here’s someone that literally from the inside of the inside of the onion. Right?
Is the one who’s reaching out, pulling team members in, and he was doing it for a couple of reasons. So there’s that: he was using the tool selection as the grease to make that happen.
But he was doing it because one, it made his job easier to get those teams involved, but also he was acting as a force multiplier because he knew the data. He’s the database person. To your point before about it is data as infrastructure, but he’s using his knowledge of the data to then, and the importance of data, to get them to collaborate around a database. The core of what is traditional operations to really improve the delivery of services to his organization. It was amazing and I’m hoping that we can bring maybe more conversations from him in the future.
He really was that spirit of what DevOps really is about.
Is using development to streamline operations not replace operations, but to get the ugly stuff out of the way so that you can really focus on doing what you need to do.
That’s right. Drive the business.
Driving the business. So real quick, we just want to show you a couple of things where there is a difference in some of the conversations that we had about what if you’re a DevOps team or certainly a developer you’re thinking about and certainly if you’re operating in the cloud, how the views that you use among the tools are a little bit different. So let me show you just a couple of them real quick. Okay so first of all, dashboards. We live and die by our dashboards. Dashboards are a little bit different when you’re at cloud scale, right? And it’s a combination of infrastructure and events and a whole lot of other stuff that, you know, don’t necessarily go together. It’s basically apples and oranges for days, right? At enormous scale. So, for example, here’s a dashboard. This is a Slingbot production dashboard so this is things like API submissions, consumer messages, here we got a Fetcher Jobs Processed. You know, these are specific to an application that is a cloud-native application.
But it’s got some other resources that actually span into hybrid, right? So when you look at the sorts of elements just the diversity of data that are coming from that are a part of the dashboards that you’re using they’re not focused sort of at that host level or this is a VM or this is a network interface. It’s a combination of all of those.
So the second thing that happens a lot is you normally start thinking about services a little bit differently, right? You don’t necessarily know where they’re running. You don’t have as much of a deterministic view to be able to say, yes, I know all the elements of this app. So what ends up happening is host definition becomes a little different view. Is a host the physical host or is it containers that are running on that host? And then the other thing too is you end up with a different view of infrastructure in terms of problems, right? Like heat maps of problems. Like here we’re looking at a lot of hot trace data where this is based on a lot of error reporting, right? So almost that definition of infrastructure can change by the minute. It’s not this is my core network and then these are the services that support applications and virtualization. Instead it’s where are we in the moment. And you start your drill down sort of from the problem view.
The other thing that’s a little bit different is troubleshooting is really different, right? Especially like, here’s an application this is like a booking service, right? So this is a horizontally scaled application. There’s the front-end app and then the main contributor is this database behind it. But in this case, I don’t know where they are and they change hosts from minute to minute. So this is where tracing comes in and that’s something that is really new if you’re used to working more with traditional IT processes and on-prem. And so for them, they start looking at queries and it becomes more about outliers, right? So in this case, these are transactions that are flowing through the system. And so I start looking for things in this heat map like, well, what’s going on here? What is this one problem in this transaction? So this is a call to a booking service. I’m going to click on that call. And you start thinking instead about all the different layers. So this is an app that’s built on a MongoDB Spring. It’s a Java-based application and it runs on Rails. And you end up drilling into calls. So it ends up being, more to your point about databases, that would typically be am I optimizing this query? More how many times does that query run? So optimization around data is a little bit different. And when you start looking at what’s going on in a database, being able to drill in and do where tracing normally would be, you’re looking at the queries that are coming back as a part of, let’s say, Database Performance Analyzer. In this case, the trace itself is where you’re getting the details that were part of, in this case, the database lookup or maybe it’s the URL parameters that are coming through the front-end or it’s a part of a memcache or other memory caching service. Watching the steps, step-to-step, to try to figure out what’s driving those weird outliers at high volume.
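Editor's illustration: the troubleshooting flow described above (find the outlier transaction, then ask how many times a query ran inside it) can be sketched with plain Python. This is a minimal sketch under stated assumptions, not the data model of the product shown in the demo. The `Span` fields, the "three times the median" rule, and the sample data are all hypothetical, and summing span durations assumes spans within a trace do not overlap or nest.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class Span:
    trace_id: str       # one transaction flowing through the system
    layer: str          # e.g. "web", "cache", "database"
    operation: str      # e.g. a normalized query or URL
    duration_ms: float


def trace_durations(spans: List[Span]) -> Dict[str, float]:
    """Total time per trace (simplification: spans are treated as sequential)."""
    totals: Dict[str, float] = {}
    for span in spans:
        totals[span.trace_id] = totals.get(span.trace_id, 0.0) + span.duration_ms
    return totals


def outlier_traces(spans: List[Span], factor: float = 3.0) -> List[str]:
    """Traces whose total time exceeds `factor` times the median trace."""
    totals = trace_durations(spans)
    if not totals:
        return []
    typical = median(totals.values())
    return sorted(t for t, total in totals.items() if total > factor * typical)


def query_counts(spans: List[Span], trace_id: str) -> Counter:
    """How many times each database operation ran inside one trace."""
    return Counter(
        s.operation for s in spans
        if s.trace_id == trace_id and s.layer == "database"
    )


# Hypothetical booking-service traces: t3 runs the same lookup 40 times.
spans = [
    Span("t1", "web", "GET /booking", 20.0),
    Span("t1", "database", "find booking", 5.0),
    Span("t2", "web", "GET /booking", 22.0),
    Span("t2", "database", "find booking", 6.0),
    Span("t3", "web", "GET /booking", 25.0),
] + [Span("t3", "database", "find booking", 6.0) for _ in range(40)]

slow = outlier_traces(spans)
repeats = query_counts(spans, "t3")
```

In the sample data each individual query is fast, so tuning the query itself would not help. The outlier comes from the same lookup running 40 times in one transaction, which matches the point that the question becomes "how many times does that query run?"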
And nothing you’ve shown here right now says to me hosts or server or instance of a database.
It’s just showing the layers of, you know, the function ’cause that’s all you care about. The infrastructure is provided by somebody else.
So you don’t care about that infrastructure anymore.
And they won’t let you care anymore.
They won’t let you care either.
You can’t care.
It’s parts flying in formation. But in all of these cases when we’re talking to customers who are having a great experience, who will say yes we’re doing DevOps, they are, going back to the conversation a second ago, collaborating on the tool selection so that they both enjoy that experience, they have a common vocabulary about being able to look at telemetry data across applications, right? Everyone benefits from that. The other thing that’s different is there’s also kind of different users. Like if you’re a marketing person, if you care about like an external website that’s being hosted in cloud, for example. Especially when it’s pushed out to Edge and there’s a whole bunch of local optimization to reduce latency, being able to look at that from like a meta level for the web app– a lot of people are Pingdom customers here, I know a lot of you are using Pingdom, what’s interesting is that if you’re thinking about this a little bit more about development that next step is not hey I’m the CMO, is my app up? It’s what’s the experience? Like are API calls performing as well as my users are experiencing on their mobile device?
A lot of granularity there to be able to figure out how that’s working. And things like well, you know, what’s the experience by site. Oh, what is this?
It’s just my blog.
It’s your blog. Well how’s your blog doing?
You know what? I have a page speed score of 81.
Is that good? So you’re a B minus?
I’m a B minus and I could get better.
Well why do you have a…
Part of the reason here is I can see that the header page is a fairly large file.
Your wife takes fantastic pictures of you.
It is a beautiful photo.
It is a fantastic photo.
But it’s too big.
It’s too big, so what are you going to do?
So I’ve got to have her give me something at a slightly lower resolution. I’ve got to reduce that file size. But for me, I mean, this is the stuff you care about for the end user experience. Now it only took a couple of seconds to load the page, but half that time was just loading that image. And for me, and living in the U.S., I can deal with that. I don’t think it’s a problem. But you know what? Some countries don’t have the best networks.
Okay but hosting this page, you’re Ops for yourself for this site, right?
But if you were taking it down to the level of caring about how am I going to fix this problem, you’re a developer now.
You’re doing HTML, “himatel,” and you’re now talking about optimizing that image, one of the assets for or maybe using…
I could use a CDN in order to get there faster too.
You could use a CDN. But now you’re thinking like a developer.
So again, it’s that bridge between improving quality of service requires thinking about it more to kind of a component level. Or the discrete data that it takes to fix that. So again, data-driven, right?
Data to determine the size of that file.
Purely data-driven. I know what to work on next. The data tells me what to do.
The next thing, of course, logs are really important.
And one of the things you were telling us over and over again and folks that we’ve talked to here is, oh no, no, no, there’s nothing to poll. There is no infrastructure except giant fire hoses of logs that get spit at me that contain the data that I need dashboards made out of.
This has been the most popular thing I’ve talked about this week. Everybody has metrics, they want logs of metrics, and they want the analytics of the logs of those metrics.
Imagine a NOC view made of nothing but log-based data.
That would be cool. So they’ll use views here like this is data coming out of CloudTrail that’s pulled together into what would look like a traditional dashboard. Beyond that, you’ll start to do things like, you know, for a lot of them that first Dev step is well how do I derive data from that? Because it’s not, maybe it’s JSON and it’s ready to go, maybe it’s an unformatted data type or it’s XML or it’s something else. So one of the things that they’ll spend a lot of time doing is actually creating derived fields from that data that they can take action on. Down here like automatically created tags, so multi-dimensional tagging becomes a big part of what you’re doing. Like hey, I want to know all of my front-end issues or errors that are coming regardless of the system. So there’s a little bit more RegEx, there’s a little bit more definition that’s a part of that, but the tools are discovery tools in a slightly different way. Like, is RegEx Dev?
Oh RegEx is just… there are words I would use to describe RegEx.
Words that we can’t say on SolarWinds Lab.
I’m not sure I would classify it as Dev, but then again I’m not sure I wouldn’t.
It’s hard to say.
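For a sense of what "derived fields" and tags built with RegEx can look like, here is a minimal sketch. The log line layout, the field names, and the tagging rules are assumptions made up for illustration; this is not how Loggly or any other product parses logs.

```python
import re

# Hypothetical layout for an unstructured log line; real formats will differ.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<service>[\w-]+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def derive_fields(line):
    """Turn one raw log line into named fields plus tags, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    tags = set()
    # Illustrative tagging rules: tag by tier and by severity, regardless of system.
    if fields["service"].startswith("frontend"):
        tags.add("front-end")
    if fields["level"] in ("ERROR", "FATAL"):
        tags.add("error")
    fields["tags"] = sorted(tags)
    return fields


sample = "2018-11-28T10:15:02Z frontend-web ERROR upstream timeout after 30s"
print(derive_fields(sample))
```

Once every line carries tags like these, "all of my front-end errors regardless of the system" becomes a filter on two tags rather than a search across hosts.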
All right. Well and then the last thing that they’re definitely talking about, is the need to be able to grab logs wherever they may be.
That’s right. Absolutely.
Right, if you have things like a container that only runs for five minutes, and it has a terrible error that you needed to catch.
But also to be able to look at things like log volume, log rate.
The Live Tail.
The number of people who are using Papertrail is just delightful. And there is a free tier by the way, so if you want to play with this or you’re kind of a Raspberry Pi or now Arduino person, I’m doing more Arduino stuff these days. Being able to capture the events that are coming out of devices, containers, wherever they are is a big part of what operations cares about. And so going back again to that conversation of being collaborative, sort of the spirit of what DevOps actually can be to get beyond the marketing hype. Being able as operations to specify, to say, hey, here’s a line of code. If you would please drop that into this application. Or here’s the logging definition that should be a part of a container definition. So that as the orchestrator scales it up or my Beanstalk deployment is going to deploy that, it’s automatically going to ship my logs off to the SaaS endpoint and figure out what to do with that so that I in operations can sort it out later. Again, it’s a chance to get past that I’ve been waiting on Dev for a week, this thing keeps failing, and it’s on me to deal with the heat from executives. Again, it’s that great opportunity to sort of use tools as a common way of connecting people. So yeah, just magic on toast.
Magic, magic is a great word to describe this week, especially for me. So I don’t know if you saw all the keynotes, but watch the replays of the keynotes. The keynotes were very heavily focused on data.
Very much on data. And this is my biggest takeaway for the week, we talked is this a DevOps show? Is this a cloud show? And no, it’s an infrastructure show, but you know, it’s not DevOps, it’s DataOps. Everything here, everybody, every vendor, is all focused on data for one reason or another. For us, we’re trying to help people get the right data so they can make better decisions, right? So we have a tool like Loggly. That’s why people are coming and they’re looking at it and they’re going I have logs everywhere. What am I supposed to do with it? Oh, we can fix that for you, right?
We know what the pain point is, but it’s all–
The problems are solvable.
Yes they’re solvable.
Whether it’s us or anyone else.
And the conversation I’ve been having with people, it’s always been about this data thing and how do I do this thing. And we can help with that. So it has been magical because it’s very much focused on data operations. You know, DataOps is a thing.
And unicorn magic.
And unicorn magic or robo-kitty magic. Whatever we want to call it.
So robo-kitty’s still magic?
Robo-kitty is magic.
So hopefully you noticed that the conversations that we’re having, that everyone is having here, are really the same thing. There isn’t Dev and there isn’t Ops, and there isn’t cloud and there isn’t on-prem, they’re delineations that are about data.
All about the data.
But beyond that, there are problems to solve. And it’s really about the tools people are choosing and their approach to get started.
Yeah. And who’s hosting your infrastructure? I mean, we’ve had a lot of people, they come and they say what do you offer as a service? I don’t want to host this stuff anymore.
I want you to do all that work for me. I just want the data, I want the analysis on it, and I want to be able to make better decisions. That’s why we’re here, right? We’re not here for a cloud show. We’re not here for, it’s not a DevOps show. It’s not a developer conference. You know what the people are here for? It’s because of the data. They all have different data needs and the data has to go from here to there but it’s all about data. It’s all about getting access to the right data, getting analytics on that data, and then we’re going to help you make truly a data-driven decision in order to have a positive impact on yourself, your team, your business. That’s what this event is all about.
So it’s letting the data inform, or the problems based around managing that data, inform the selection of the tools that you’re using and the approaches that you take.
Absolutely. And you know this has been fun. I really appreciate you having me on Lab. I always enjoy coming out to Lab, but I got a unicorn to tame. And you know what? I heard unicorn bacon is kind of tasty. So…
Right? Wasn’t that cool?
It was. It was. So what I think we saw was that people were, I mean, there was a lot of the buzzword and stuff like that but they really wanted–
Lot of buzzwords.
[laughs] Right. But they really wanted to demystify it. They really wanted to get down to how do I get my job done everyday.
Yeah and they had demystified it. And I think that’s really the point right? Is that hopefully that you saw people solving, other admins solving the same kind of problems that are, well not exactly the same, but slightly different versions of the same problems that you’ve been solving for a really long time. Maybe with different nomenclature added on top of it. And then the other thing was that they are in most cases, what a year or two out.
Year or two right.
Away from having and doing traditional on-prem monitoring or maybe hybrid IT with a little bit more, but that these tools were, to your point, demystified.
It didn’t seem to be that different once they kind of grokked it and said, oh, I get it now.
I’m going to reconcile errors with log messages. Ah, this is useful.
And it’s just the work of the work. It’s just the regular work that I did, maybe with slightly different tools, maybe with slightly different mental concepts, but that’s it. And that’s what I want to do now that you’re back. I want to just talk about how you get work done everyday.
So that’s what we’re going to do. I’m going to be the decoder ring. And you’re going to talk about the work that they’ve been doing for a really long time well, and then I will occasionally remind them of when we see things that we heard people in Las Vegas talking about from a DevOps perspective.
And/or an observability perspective and how it actually maps to the things that they’re doing and that this really is the tools that they’ve been using for a long time are a jumping-off place.
That a lot of people eventually go to when you get dragged there by the technology decisions that your companies are making.
Right. So I want to start off with containers. Now we did–
If there’s anything that’s DevOps-y it’s got to be containers.
It’s got to be containers. But it’s still, again, the work. The things that you do are still the normal work. And the things that you know how to do. That mental process is still the mental process. So we did an episode on SolarWinds Lab not too long ago about containers. I’m not going to define containers because in your daily work you don’t define containers. You don’t walk around the cube saying well a container is an ephemeral object that, no, no, no I just want to know how to do. And I’m starting with AppStack. Yes, AppStack. So I want to start by reminding everyone that in THWACKcamp 2018 we were able to show people that containers are included in SAM, it’s right there, and I’m looking at AppStack and there it is, look at that. It’s a layer right there.
Amazing that a container could actually be something that sits on top of a resource queue along with everything else that you’re used to looking at maybe if you’re familiar with virtualization that it should be considered the same way?
Well not only that, but when I click on this and we understand how AppStack works, right? That’s not something new, but what I want to show is that by focusing, putting the focus on containers that, look there’s still a host associated with it, there’s still datastores, there’s still volumes, there’s still, these things called containers are not magical unicorn farts, okay? They really are just things. They might be slightly different. Although for those of us who’ve been around awhile, I might call them an LPAR, but just because they look very familiar to me. But okay, so that’s the first point I want to make. The second thing I want to do, is take a look at what you can see with this. So let’s look at the details for that container.
Okay, so there we are. And we can see that there’s some statistics here. I’m looking at the last 24 hours. What’s happening here? We actually don’t care because it’s SolarWinds Lab. We’re doing a demo, that’s not the point. The point is is that when you’re getting work done, you have the metrics at your fingertips. You also have the MiniStack view that shows you that this container is not a magical unicorn fart in the sky. It actually exists within the same context of all the other things.
But the difference here is that when we’re looking at this in AppOptics, the container monitoring in AppOptics, that was a combination of infrastructure monitoring provided by an agent.
Right? Along with tracing data that was actually being used to consolidate the various horizontally scaled components of that application into one overall view of that application. But in this case, you’re taking what is normally the AppStack data that’s coming from the association of different modules polling data.
And what in the case of, almost like Virtualization Manager would be using, what the API monitoring against vCenter, right? To ask it about the VMs that were running within the purview of vCenter.
In this case you’re talking, SAM is talking to the orchestrator for Mesos, Docker, or Kubernetes, Docker Swarm or Kubernetes, and getting that same information. So it’s a slightly different architectural approach and this synthesis of how we know what the components of that application are, that’s a little bit of a different technology to assemble them. But the view ends up being very, very similar.
Right. And I’m going to stand over here on this side of the desk and say don’t care. Really. And like it’s a very DevOps regular Ops Ops. You know the fact is is that you’re right. We’re getting this information a different way and we’re going to look at that in a little bit, but the fact is that it’s my day-to-day job. I don’t care where it’s coming from. I know I can get the data. I know I can get my work done. That’s the point. And even more so that there’s a server associated. Okay. There’s this LAB-NOC- center CON01, right? And I want to take a look at that server. I want to see what that device is doing, whatever that thing is. So there I am looking at it. If I scroll down, there’s the containers right here. They’re the ones that are part of my environment. If you don’t want to say infrastructure ’cause there’s negative connotations to the infrastructure, it means routers and switches, whatever. I don’t care. I’m not going to be pedantic about it. But these things, containers, are part of my world that I have to care about from that Ops-y Dev side of the world. But that’s not all.
Because we’ve been laughing about alerts and things like that, but I want to show you that, again, because this is part of the integrated system, here I am building a new alert. And I think a lot of us traditional folks who came from regular NOC-based monitoring are thinking oh containers, well how am I supposed to do anything with them? Well, I’ve shown you that we can get the data, you can get the data where you want it, but what about the alerting? Well if it’s there, it’s there. So normally you think about things like nodes, but if I scroll down here, there are containers. Containers are an object. They’re collecting–
A first-class object.
Exactly. And if I want to see what fields or what elements, I’m going to Browse All, I’m even going to play around with it, so there I can look at things like the health and I can look at the HealthStatus and I can look at the RestartCount. How many times has this container bounced? Maybe a lot, maybe none, who knows. And whether that matters or not isn’t for us to define here today on SolarWinds Lab. It’s for you to know that you’re able to get that information and it can be made relevant in the context of everything else that’s going on. Whether it’s an application aspect or an infrastructure aspect or whatever it is that the restart count may be one of those causal factors you want to consider, or the state or the status.
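To make the alerting idea concrete, here is a minimal sketch of a condition built on the two fields mentioned above, HealthStatus and RestartCount. It is plain Python for illustration only, not the Orion alert engine; the threshold, the "healthy" status value, and the container names are assumptions.

```python
def should_alert(container, max_restarts=3, healthy_value="healthy"):
    """Flag a container that keeps bouncing or reports an unhealthy status."""
    too_many_restarts = container["RestartCount"] > max_restarts
    unhealthy = container["HealthStatus"] != healthy_value
    return too_many_restarts or unhealthy


# Hypothetical container records using the field names shown in the demo.
containers = [
    {"Name": "booking-api", "HealthStatus": "healthy", "RestartCount": 0},
    {"Name": "booking-db", "HealthStatus": "healthy", "RestartCount": 12},
    {"Name": "frontend", "HealthStatus": "unhealthy", "RestartCount": 1},
]

for c in containers:
    if should_alert(c):
        print(f"ALERT: {c['Name']} restarts={c['RestartCount']} health={c['HealthStatus']}")
```

Whether a restart count of 3 or 30 matters is a decision for your environment; the point is only that a container field can sit in a condition the same way a node field does.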
Okay but that brings up the point of thinking again in terms of areas of the business or including the business in the conversation in a way that you might not.
Like what’s the number one complaint a lot of times for our customers when they go to cloud is?
Too expensive. I’m spending too much money. It was supposed to be cheaper [laughs].
Yeah, it was supposed to be cheaper. Okay well the first symptom of that would be am I just doing lift and shift?
Did I take what used to be a VM and I pushed it into a container and now I’ve been running that container for a year?
So if I want to use my time created and do a quick report and say, wow, you know 75% of my containers have been running for more than a month? And the team said we were going to do this horizontally distributed demand-based provisioning? We’re not doing that.
Yeah reports your CIO might actually want.
Right so again.
Out of monitoring.
But that’s a business metric effectively, right?
So how long am I actually using this if my business goal is to move to a dynamically provisioned resource model and I am not doing that I can measure the effectiveness of my business approach and how the technology and the decisions that I’m making about the approach actually fit with that.
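The "time created" report described here boils down to simple arithmetic. A minimal sketch, assuming you can export each container's creation time from your monitoring data; the timestamps and the 30-day cutoff below are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def long_running_share(created_times, now, max_age=timedelta(days=30)):
    """Return the fraction of containers created more than `max_age` ago."""
    if not created_times:
        return 0.0
    old = sum(1 for created in created_times if now - created > max_age)
    return old / len(created_times)


# Hypothetical creation times for four containers, relative to a fixed "now".
now = datetime(2019, 1, 31, tzinfo=timezone.utc)
created = [
    now - timedelta(days=400),  # looks like a lifted-and-shifted VM
    now - timedelta(days=90),
    now - timedelta(days=45),
    now - timedelta(hours=2),   # actually provisioned on demand
]

share = long_running_share(created, now)
print(f"{share:.0%} of containers have run for more than 30 days")
```

If the stated goal is demand-based provisioning and this number stays high, that is evidence of lift and shift, and it comes straight out of monitoring data.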
Yes and that is again the business, you’re thinking as a monitoring engineer, you’re thinking about the needs of the business and you’re instrumenting and you’re monitoring that. Okay, so, one more thing I want to show. So here I am back at the server.
And I’m going to go to Performance Analyzer. It’s one of my favorite little quick hits, quick links, and we’re opening up a PerfStack view.
Lots of PerfStack.
Which is delightful, but once again, container and if I want to take that container that I’ve been talking about and I want to see what statistics can I get. Well, I have status and events. And I have statistics, what can I do? Well let me just throw, just for fun, the CPU and the memory right there, and now I’m able to once again combine the thing that is in my environment called containers, which is a new thing, but it’s not an alien thing. And I can look at it across all the other elements. Whether it’s part of this server or part of other parts of the infrastructure, to see how they’re interacting.
Ah. Well let me show you how to set up the AppOptics agent on that environment if you were going to monitor it using AppOptics.
Okay. The mouse, she is yours.
Awesome. All right, so remember before when we saw this when we were at AWS. I’m going to click on the infrastructure link here and I’ve just got two VMs that are being monitored here, right?
So I’m going to add another one. Well how do I do that? I’m going to come back here to home. I’m going to click on Add Host. And this is going to look awfully familiar. All right, so I am using your favorite.
So I click on Ubuntu and it’s going to give me the command that I’m going to use. I’m just going to copy that to my clipboard and then come down here and we need to actually switch over and connect to an instance and for this let’s…
Look at that! Solar-PuTTY!
Exactly. So I’m going to log in to this system. This is running in AWS. I’m going to paste this in because I’m just going to right-click of course there’s no cut paste.
Middle click, double click. No we’re not going to be, we don’t judge.
Do I want to install the AppOptics agent? Well yes I do. Do a little time compression.
And we’re done, right? The magic of time compression. So right here it’s telling us where it should appear. It’s telling us what the hostname is going to be. And that it succeeded. And now if I come back over here to my AppOptics portal, tap done. Look at that.
Ah. It’s on my host list. That’s pretty easy right? It’s going to take it a minute before we start getting metrics.
But I’ve gone ahead and added that and I can actually get my details from that system.
And it will start sending me data right away. If this was a part of a container environment, it would also ask, hey do you want to start monitoring containers in this environment? And then the agent would take care of adding additional monitoring to be able to poll container metrics as well. All right Leon, in all of that, are you sure that you’re not just a little bit making my case that expanding your view, being a little bit broader about what you intend to actually monitor, the data that you’re looking for is at least a toe into the water of the observability pool?
Okay not really. What I’m just trying to remind folks is that if they keep in mind the needs of the business and let that drive what and how they monitor, instead of monitoring all the things and flinging alerts everywhere, that the business is going to like them more.
Well they are going to like them more. And they’re probably going to let you, I don’t know, be a little bit more aggressive about, oh I don’t know, culture change. And so if you were changing your culture. To start thinking outside of the box, about where you can actually get data, about data that you would actually juxtapose, for example, data the business cares about, data that shows that customers are happy along with infrastructure, if you get people excited about culture change, then why not take the step and really think about DevOps in the way that you see it written about. That you see books about. About adopting a lot of the practices of DevOps and really inviting, truly inviting, developers into operations.
Okay, so but what did we hear from people in Vegas? What did they think?
Well the Vegas folks say that there’s a lot of tools and approaches out there. Certainly it’s interesting that they de-hype them at least as much as we do. And then your demos showed that what they were talking about really does apply regardless of the types of tools that we’re using. I mean there’s some demarcations, and there’s some divisions about the specifics about whether something is a container or it’s a microservice. Whether it’s something you would manage in AppOptics or you would be able to do it in SAM directly, but they look, they sound like the same kinds of problems just with a little bit different nomenclature from the audience that’s actually using them.
Right and I also showed real live tracing in Orion. I showed log aggregation on-prem and by cutting down on alerts, you did that by thinking about events differently. I just didn’t pour the magical rainbow DevOps sauce on it.
I feel like you’re making fun of me.
Okay well fine. We’ll wrap this up. We hope that this slightly unusual format for Lab was really helpful and we want your feedback as always. Because the truth is that the moving parts of hybrid IT, including but not limited to DevOps and observability, is somewhere between all in and mm-mm no, we’re not going to ever, right? And we as usual in technology are stuck right in the middle.
But that’s the point. We’re always stuck in the middle. IT is always stuck in the middle. We’re the ones who have to figure out not if an app can be built or if it’s going to be cool in the first six months, we have to figure out how to keep apps running for years. You, the members of THWACK, are the audience that Amazon, Microsoft, and Google and everybody else are talking to. And your comments are really helpful.
That’s absolutely right. And so if you’re not with us live, and of course how do we know, how do they know if they’re with us live?
Because there’s that chatty box over there.
That’s right. And you’re talking to us live and we’re talking about who knows what. If you don’t see that box, well then swing by our home page which is lab.solarwinds.com. Check out past episodes, but most of all, check on the schedule for upcoming episodes and be with us live next time.
Okay, so that’s it, right? We had AWS attendee goodness, containers, and microservices, and tracing, oh my! And all of that in Orion. And we untangled, hopefully.
The DevOps at least a bit and where SolarWinds is in terms of DevOps tools.
That actually feels like an awful lot. And we’re sorry that you didn’t get to go to AWS.
It is an awful lot and I am not sorry that I missed AWS.
Really. How do your feet feel?
They’re still pretty numb.
Exactly. Okay so for SolarWinds Lab, I’m Leon Adato.
And I’m Patrick Hubbard. Keep your feedback coming and we’ll see you next time.