
“You’re not the boss of me!” is something we expect to hear from petulant children, not our IT systems. And yet IT practitioners often find themselves in those same kinds of power struggles with their systems and platforms. In this episode of SolarWinds Lab, Head Geeks™ Leon Adato and Thomas LaRock, and Senior Product Marketing Manager Jared Hensle (joined by special guests Karlo Zatylny and Serena Chou) will share techniques you can use to show stubborn technologies (automated multi-step network and application maps, containers, vSANs, and server configurations) who is the REAL boss.
Episode Transcript
You know what I find interesting about Batman?
Everything's interesting about Batman.
Yeah, 'cause he's Batman.
Well, no, it's because there's only one Bat-Signal. Imagine if Batman had to respond to a thousand lights in the sky.
Well then, he'd have to be something even greater than Batman.
Maybe like a monitoring engineer?
Yes. [claps]
That's my point. I'm sure other people have tried to put up fake Bat-Signals over the years or just call them for help, but Batman's able to narrow things down and focus on the one thing he has to do at that moment. In other words, he controls the environment.
I see your goal, where you're going with this. As enterprise systems grow, so does the complexity of monitoring, alerting, and reporting. You hit a tipping point where you're not in control of your monitoring solution anymore. Your monitoring solution is in control of you.
But this isn't going to be one of those, like, your alerts suck episodes, right?
No, no, no, no. This is an episode about how to take back control of that environment, how to show that system who's the boss.
And we take back control by giving you visibility into those complex solutions.
Alright, I think we need to get a little control over this episode right now, so let's kick it off officially. Welcome to SolarWinds Lab, I'm Leon Adato.
I'm Thomas LaRock.
And I'm Jared Hensle, Senior Product Marketing Manager. What I'd like to show is how visibility can give you control of your environment.
Alright, so I think we should start off with mapping. Let's take a quick look at the mapping that at least some of the folks have seen already and then we'll get into things that we've enhanced.
Okay.
Yeah, this is the map that came out earlier this year where we allowed you to put your network devices in there and some of your application stuff in here. So, this map right now, we can see, it's using network connectivity to show your core router and errors and discards to another router, anything along those lines and you can, pretty much this is in our demo, publicly available right now. You can use it, try it today, and basically map out your, at least in this case, your network.
Right, and the big deal here is that it's automatic. At no time did anybody have to select devices and throw them on or anything like that. It's all pre-done because of all the information we're already gathering about all these devices.
Yeah, I think what's important is all this stuff is in context, and that's what this does. It shows you A and B are talking or maybe it's A and C where you're not having to sit there and go dig through some Excel spreadsheet that you had A, this is connected to this or some archaic Visio diagram or something like that. This is live and changing as your environment's changing.
Right, and this is nice. It's network, it's nice. But we're systems guys here, I think we can own that. Do we have a systems example that we've shown before?
Yeah, so what this also does is extend over to systems, so here we can see that we've got an Exchange server. Basically, we started mapping this out in Q1 with our ADM technology. Basically, puts netstat on the box, basically start seeing what it's talking to and that's how it's getting, hey I've got 10 milliseconds of latency, and I'm talking to this Exchange server or this Active Directory server. Anything along those lines. It actually extends beyond just your Microsoft talking to one another, your software. It actually can start extending out to your hardware, so here I can see that I've got a hypervisor, in this case it's Hyper-V, and I can see that the VM is on this host and that it's actually down to the cluster. So, I actually can see with the dotted line, it gives me kind of that visual virtual infrastructure, not your software connections.
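A minimal sketch of the "see what the box is talking to" idea, assuming connection data in the style of `netstat -an` output. The sample text, addresses, and the `established_peers` helper are hypothetical and only illustrate the concept; this is not how ADM itself is implemented.

```python
from collections import Counter

# Hypothetical sample in the style of `netstat -an` output;
# the addresses and ports are made up for illustration.
SAMPLE = """\
  TCP    10.0.0.5:49712    10.0.0.20:389     ESTABLISHED
  TCP    10.0.0.5:49713    10.0.0.20:389     ESTABLISHED
  TCP    10.0.0.5:49800    10.0.0.30:1433    ESTABLISHED
  TCP    10.0.0.5:135      0.0.0.0:0         LISTENING
"""

def established_peers(netstat_text):
    """Count established TCP connections per (remote_ip, remote_port)."""
    peers = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected fields: protocol, local address, remote address, state.
        if len(fields) != 4 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        remote_ip, _, remote_port = fields[2].rpartition(":")
        peers[(remote_ip, int(remote_port))] += 1
    return peers

peers = established_peers(SAMPLE)
for (ip, port), count in sorted(peers.items()):
    print(f"{ip}:{port} -> {count} connection(s)")
```

Each distinct remote endpoint becomes a candidate edge on a dependency map; listening sockets are skipped because they do not identify a peer.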
Right, but that's all been there, so--
Yeah, that's present day. In SolarWinds fashion, we took it, made it better. What I'm going to introduce you to is the next version of Maps coming out. Right here, I've got a node, maps-dc-02. You can go to the maps just by getting to the right-hand side, left-hand side, and selecting Maps. We've now enhanced it where you can actually get into Maps multiple ways. What I want to do is I can actually click into the vital stats, I've actually got it open over here. I've got, I drilled into the C:\ drive. So, I'm now looking at the C:\ drive on maps-dc-02. Again, I actually have the Maps icon over here so I can click it and now it's going to, just like AppStack did, it put me in context to where I'm starting. Now I'm looking at the C:\ drive on here and I actually can start seeing some of the infrastructure that it's connected to.
Now the C:\ drive is actually a child of the host machine?
Correct.
You're able to see the sub-elements mapped as well.
[Jared] Correct. I'm going to flip this, just kind of get it more lined up a little bit differently. That's actually new to the change, or, no, change layout was there where we can change that. I can come in here and now I can see, I actually have a vSAN device in there. I can see I've got all the other devices. Let me select that so I can see it's part of a cluster and other host that it's on. But what's great about this is now I can actually use these nodes as pivot points.
[Leon] Okay.
[Jared] So, here I clearly see, hey, green dots, green dots, everything's good. But I see a vSAN, that definitely looks bad. So, I can select it, come over here, and I can actually show the alerts that are on that device. I've got a couple new options over here. I've got Related, Connected, Alerts, and then the actual Recommendations for a virtualization point of view. So, I can come over here, though I want to go back to the vSAN again, and I'm actually going to show all of the virtual machines. I actually can now use it and start plotting out the other VMs that are connected to it. I'm just going to select a couple random VMs and hit apply. Now, I've got the vSAN all set in. Oh, look at this. I've got VMs connected to it. I can zoom in and basically keep changing my focus, adding related things to it, basically creating a map of my infrastructure.
I could actually pivot from this vSAN to this device, select more related devices, and eventually build a map of as much of my environment as I wanted to, just by linking up and using, and so this is a term that was new for me, the ancestors, and the descendants, and the dependencies.
Correct.
Which you actually see those icons along the top.
Right over here, yeah.
[Leon] So there we can see that that device has 128 descendants, it has got three ancestors, and it's got 20 different dependencies.
Yeah, and the dependencies is using that ADM technology that we came out with in the prior quarter. It is software, it's looking at who you're talking to, so that's what's great is that you may not even know that your Active Directory server is talking to your SQL Server, or it's talking to this or it's giving permissions to that. So that you're really able to control that map and build it with no preconceived notions of what, how things are connected or talking.
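To make the ancestor and descendant terms concrete, here is a small sketch that assumes a simple parent-to-children hierarchy in which every object has at most one parent. The `CHILDREN` data and both helper functions are hypothetical; they are not taken from the product.

```python
# Hypothetical parent -> children relationships; names are illustrative only.
CHILDREN = {
    "cluster-01": ["host-a", "host-b"],
    "host-a": ["maps-dc-02"],
    "host-b": [],
    "maps-dc-02": ["C:\\"],
    "C:\\": [],
}

def descendants(node, children=CHILDREN):
    """Return every object below `node` in the hierarchy."""
    found = set()
    stack = list(children.get(node, []))
    while stack:
        current = stack.pop()
        found.add(current)
        stack.extend(children.get(current, []))
    return found

def ancestors(node, children=CHILDREN):
    """Return the chain of parents above `node`, nearest first.

    Assumes each object has at most one parent.
    """
    parents = {child: parent for parent, kids in children.items() for child in kids}
    chain = []
    while node in parents:
        node = parents[node]
        chain.append(node)
    return chain

print(ancestors("C:\\"))
print(sorted(descendants("cluster-01")))
```

Dependencies are a separate relationship (who talks to whom) and would need their own edge list rather than this containment hierarchy.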
Right, we're helping do the discovery for you. So, once you're done, you close the window, and then you have to come back and rebuild it again, time and time again, right?
Not so fast.
Okay.
That's what's great about this is you can actually come in here, and you've added your elements, added your nodes, and you've got it saved the way you want, now you can save it as a group. Obviously, we've always had groups before.
[Leon] Uh-uh.
But the problem with groups is you were just looking at node names. You could filter a little bit, but you'd have to know that A and B are talking to X and Y. Where here, you come in here and do your dependencies and mapping, and now you don't need to know that. This discovered it for you and you could save it as a group, just come in here, hit save and save it as a name, and now you've created a custom group.
What I'm hearing about this is that you can, like I said, you can come in, you can place all the objects that you want by finding who's connected to who, save it, and then you can do things like apply a SAM template to the group so now that you create a group called "my customer ordering app" or whatever which has all the databases, all of the infrastructure, all the elements, and I can remove the ones I'm not interested in.
Correct. You could go, if you wanted to do something, you could scale it out to a domain controller. But if you go that next level past the domain controller, you probably get a whole bunch of fluffy noise, printers, copy machines, other things you don't care about, so you can take it to where you determine your edge is and then you contain how far out you go and you create your group.
Yeah, that's what I was going to say. You create your group with specific elements and objects that you want, and then use it for reporting and alerting and so on.
Absolutely.
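The "decide where your edge is" step can be sketched as a hop-limited walk over a connection graph. This is only an illustration under assumed data: the `LINKS` graph, the node names, and `build_group` are hypothetical.

```python
from collections import deque

# Hypothetical "talks to" graph; node names are illustrative only.
LINKS = {
    "order-app": ["sql-01", "dc-01"],
    "sql-01": ["dc-01"],
    "dc-01": ["printer-7", "copier-2"],
    "printer-7": [],
    "copier-2": [],
}

def build_group(start, links, max_hops):
    """Collect every node reachable from `start` within `max_hops` hops."""
    group = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # this node sits on the edge; do not expand past it
        for neighbor in links.get(node, []):
            if neighbor not in group:
                group.add(neighbor)
                queue.append((neighbor, hops + 1))
    return group

print(sorted(build_group("order-app", LINKS, max_hops=1)))
print(sorted(build_group("order-app", LINKS, max_hops=2)))
```

With one hop the group stops at the domain controller; a second hop pulls in the printer and copier, which is the noise described above.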
Now we had mentioned earlier, this is all information that SolarWinds has already been collecting, right? What I love about this as a data guy, somebody that's collected a lot of data over the years. I often tell people, 'cause they always have questions like what data should I be collecting, and usually a good starting point discussion is, if you collect it, could you put it into a graph of some type? Could you visualize that data, even a bar chart? Can your data communicate what it's trying to tell with a visualization? What I love about this is that we are building those visualizations now. So, you don't have to be writing these own reports and telling me what's all related. This map is going to tell the story for you.
Exactly.
No, I think it's meaningful data. If you have data that just is out there and it's not connected to anything and it doesn't mean anything, then what's the point of collecting?
It's just noise.
Yeah, but if you're collecting and saying, hey this, when this does this, or this metric hits that, this happens, this cascades, now you've got data to show people. Now you've got data to act or work with other departments to actually fix things.
Right, that is fantastic.
Hey guys, I thought Leon was going to be in the show today.
Yeah, I thought, I don't know where he went.
I haven't seen him all day.
I am standing right, [slams desk]. Fine, okay. [thuds]
Ah, there you are.
Better?
Oh, there you are.
Much better.
Welcome.
Hey, contain yourself.
Alright, fine.
Well, actually containers are why I'm here today.
What does container have to do with SolarWinds?
At SolarWinds, we've been adding some new functionality and we've been talking about containers for awhile, and we're here to show you about containerization and how it applies to your environment.
I think a lot of our audience, they've obviously heard of containers, but can we take a minute and just talk about what they are?
Sure, containers, you can think of as one of two things, either a really fat process or a really thin VM. [laughs] What we're looking at is a way to take virtualization into a world where we have very small virtual things that we can now deploy and reuse and really scale and put into our environments to take greater control of our applications.
Okay, but one of the hurdles I've heard people talk about running into is just that they're not always there?
Yeah, containers can have this ephemerality where they come and go as they please and that's really one of the advantages to containers is that because they are stateless, they're able to come up, do what they need to do, and then leave when they're not needed anymore. So, if you have a website that typically goes up and down throughout the day where you have a lot of visitors at some point in time and then in the evening you get fewer visitors, you don't need to keep all of that compute and all of that power available all the time. You can scale containers up as you need them, and then as your traffic goes down, they can disappear and save yourself on some resources.
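As a rough illustration of scaling with traffic, the sketch below computes how many containers a given request rate would need. The per-container capacity, the traffic figures, and the `replicas_needed` helper are all assumptions made for the example; real orchestrators use their own scaling rules.

```python
import math

def replicas_needed(requests_per_second, capacity_per_container, minimum=1):
    """Return the container count needed to serve the given load.

    `capacity_per_container` is an assumed requests-per-second figure
    that one container can handle.
    """
    return max(minimum, math.ceil(requests_per_second / capacity_per_container))

# Hypothetical traffic over a day: quiet overnight, busy at midday.
traffic = {"03:00": 40, "09:00": 420, "13:00": 950, "22:00": 120}
for hour, rps in traffic.items():
    print(hour, replicas_needed(rps, capacity_per_container=100))
```

The count rises with daytime load and falls back to the minimum overnight, which is the resource saving described above.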
I was going to say, so your typical monitoring of availability and uptime where it's 24/7 minimal CPU and RAM usage, those rules don't apply there.
No, absolutely not. When it comes to containers, what you really want is to maximize your resources so CPU and memory, when you're monitoring containers, becomes something that you want to use a lot of. So, a container should be using a lot of CPU, it should be using a lot of memory in order to make sure that you're actually getting what you're paying for.
But I can see actually a more basic issue which is our standard model at SolarWinds is to inventory to scan the whole network, to discover all the objects and their sub elements and things like that. And then they're in the list, so to speak, and if they disappear from the list, that's bad, or it's part of a process of specific thoughtful decommission. It sounds like that's not going to fit into it.
It does fit into it because with SolarWinds, we really have products like Server & Application Monitor where application is king. With containers, application is king. You don't really care if a container comes up or comes down or lives or dies, what you care about is the state of your application. Is it there? Are you serving the webpages that you're expecting? Is your database there scaling at the numbers that you expect? You care about the application, you don't really care about the infrastructure as much.
Huh. Okay, so what can you show us?
Alright, I've got up our new container management page where we're able to look at the different container environments that we have up and running. So far today, in my environment I have both a Docker application and a Kubernetes cluster. I'll walk everybody through how to add a new environment into this so that we can see what it actually takes and what's going on behind the scenes. As a note, everything that I add already has to be a node inside of Orion, so I've already added a node in Orion where my Docker environment is running and so I'll be adding that into our environment today.
So, is that node part of SAM, is that what you're saying?
Yeah, container monitoring today ships with both Server & Application Monitor and VMAN, so if you have either of those products you're able to come in and do your, or add container monitoring to your environment. I've created a special username and password within Orion that I'm using for my container monitoring and that allows me to come in and select my server, and easily get the script which we're going to use to deploy container monitoring to the service. Here's the script that actually gets used to deploy the Docker monitoring to my environment and we are using containers to monitor containers so what this does is it actually creates a set of containers in your environment to do the monitoring through. I've copied that script to my clipboard, I come in here. Now we run that script and we're able to then go and this is now downloading all of the latest containers that we need in order to do the monitoring. Now you've seen it's gone and downloaded the different containers and now it will start communicating back to Orion. So, we're actually being more of like an agent type of deployment where we've deployed containers to the server and they'll now talk back to Orion, pushing all of the container data to Orion.
What screen are you using here?
Oh, I'm sorry, this is Solar-PuTTY. This is a great new tool that we offer for free from SolarWinds and it allows you to SSH and do a variety of things into different environments and as we get more Linux-friendly, this is a useful tool for us to go in and look at our different environments and--
You said amazing words just now. As we get more Linux-friendly. I'm just going to bask in them.
I didn't find those words amazing. I just found them to be words.
Yes. [laughs] So here you can see that I can come in and run normal Docker commands and this is through one of my favorite tools, Solar-PuTTY.
That's just you actually, using PuTTY in the back end and we built on it?
Correct, it's just a way for me to save different configurations and easily have a tabular environment of different SSH windows, so I'm able to then come in here, have multiple tabs open and just be able to quickly tab between different environments that I'm SSHing to and be in my favorite command line land.
Got it.
Let's go back to Orion. Now you see that I've added the new Docker environment that you saw me type in earlier. It takes a couple minutes for us to gather all the initial state information about the different containers in that environment and as that starts to come up, we'll see the containers start to appear inside of this Docker environment. I have set up a fairly popular WordPress instance with MySQL in the background, so when we get to the point where we're seeing the containers in here, we'll see an actual deployment of WordPress on this server.
When containers spin up and when they die, how long is that data saved for? Does it go away immediately, or is there, or I see value in keeping that information saying, hey it was spun up Saturday, it's Tuesday, but I can give you some historical context.
Excellent question. As we come in here and we look at a node details page, we've added a new summary resource for containers themselves that is searchable and we're able to look at all the containers that are running in my environment, but also some of the containers that have been running in my environment over the past seven days. The default answer to your question is seven days because a lot of times in container environments that are ephemeral, what's going on is containers are spinning up and spinning down and using resources, but you're not always watching it. Nobody's sitting here staring at their node details page forever. But you might end up in a situation where you're running into problems and maybe it happened over the weekend or overnight when you weren't actually staring at your Orion page, but you want to be able to come in and see what was going on so you're able to look at historical perspective and see what was going on, what was running when there is a problem, and what was actually being utilized at that point in time.
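A minimal sketch of a seven-day history window, assuming each container record carries a stop time (or `None` while it is still running). The record layout, the dates, and `visible_containers` are hypothetical and only illustrate the retention idea.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # default window mentioned above

def visible_containers(records, now):
    """Return names of containers that are running or stopped within the window."""
    cutoff = now - RETENTION
    return [
        r["name"]
        for r in records
        if r["stopped_at"] is None or r["stopped_at"] >= cutoff
    ]

# Hypothetical records; dates are made up for illustration.
now = datetime(2024, 1, 9, 12, 0)
records = [
    {"name": "wordpress", "stopped_at": None},
    {"name": "mysql", "stopped_at": None},
    {"name": "beta-web", "stopped_at": datetime(2024, 1, 6, 3, 0)},
    {"name": "old-batch", "stopped_at": datetime(2023, 12, 30, 0, 0)},
]
print(visible_containers(records, now))
```

The container that stopped three days earlier is still listed, so a weekend problem can be reviewed on Tuesday; the one that stopped ten days earlier has aged out.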
I think also from a DevOps perspective, because you do a lot of A/B testing, because as you have new versions of things, you roll them in and you simply spin up a single container, if something goes, not even catastrophically wrong, if just there's bad performance or a bad reaction on the customer side, and someone wants to go back and say three days ago we, our sales dipped like that, what was going on? Oh, I can see that we rolled that beta code and then we immediately pulled it back again and we know which containers those were pushed out to. So, we're able to go back historically and correlate those actions from the dev side and also from effectively the operation side.
Yeah, exactly. That's one of the beauties of our PerfStack environment, is that if you come into the performance analysis piece of Orion, you're able to go in and look at your environments and see what happened historically. Here I have an environment that I have an application running on that has the TCP port monitor and the response time to my actual server that's hosting my containerized environment. But then I also have CPU utilization for the different containers that have been running throughout the history of this deployment. So, I'd be able to trace back and see what containers were running and add them to the performance analysis and actually see the results of different points in time throughout history. We're also excited about having containers available on maps. We're able to come in here and on a node details page, you have the mapping capabilities, but containers are now authorized children of your nodes, so you're able to come in here and see containers on the maps as well, and that way you're able to then drive out some container environments where you want to have a map specific to a specific deployment and come in here and build up a map and see what you're able to do.
And once again, if you know that certain, from what we're talking about before with mapping, now you can start to build out a real picture of that highly elastic, highly available environment and say this is all of it.
Yeah, 'cause that's what I was going to say was from a systems admin point of view, I know these services make Exchange or Blackberry or SQL. A container, I don't know, so if I'm able to create that map and save it as Karlo's special website or XYZ ERP server, I don't have to have the tribal knowledge then going back along how many containers do I need to look at, what were they called, right there, it's grouped together.
Exactly. Each container obviously has its own details page as well, and that gives us a lot of great information about the containers themselves. What command was used to start up that container, the different metrics, details about the image, what's running, and then all of the different environment variables that are going into running that container. We get a fairly robust list of information about the container itself and then of course, looking at the container performance and our lovely AppStack or mini-stack telling us the container information and what server it's on. All of this information just to give you that additional layer of data that you need for when you have containers, what's actually going on, and this is great information as a software developer, that I would need to come in and look, and say alright, what's going on in the container? What command did you actually use to run it? Do I need to change my configuration? Do I need to change my YAML file in order to make this run more efficiently?
Will these containers, will this only monitor on-premises or on-premises and cloud containers?
Right now, our deployments are centered on servers that you own, so it doesn't really matter where the containers are deployed per se as long as you have access to the server like I was able to SSH into the server and deploy the containers that way. If that was an EC2 instance, or if it's something local on my own private cloud, I'm able to do it that way. It really doesn't matter where it is as long as there's network availability and then we're able to talk back from that server to the Orion server, which we're able to do from AWS, Azure, Google Cloud, any of those are able to set up the right networking to get us back and you're able to then monitor containers wherever you have them deployed. Again, application is king when it comes to monitoring containers, and so when I have my specific deployment out there and running, I am really concerned about container restarts. Again, containers are a new type of infrastructure that you need to monitor in a way that is different than the traditional methodologies. It's not just you're deploying it like a virtual machine that's going to live forever. These containers are servicing an application and one of the side effects of that is a container can stop and need to be restarted. We have orchestrators like Mesos, Kubernetes, Docker Swarm that are taking care of your containers as you've configured them, but when something happens, the orchestrators are programmed to go in and restart containers, or do whatever necessary to make sure that the health of your application is what you expect it to be. Restart count in my case is something that I really watch carefully in this specific environment. Containers are first-class citizens within the Orion infrastructure, so alerts, reports, maps, AppStack, PerfStack, all the wonderful features that you know and love with Orion, containers are first-class citizens and behave accordingly.
Here I've created an alert for my container, looking at the restart count and in my case, I'm looking for anything greater than five. My containers are fairly low restart counters so I don't get a lot of churn on them, so I'm just looking for a slight bump. Just a little bit more than five and that would make me worry that something's wrong with my configuration, perhaps I just made a deployment of a container that keeps restarting, maybe the environment variables aren't set up properly that it's not able to pull in the right information, or maybe it's got a dependency downstream that it keeps wanting to communicate to and it's just not able to find it. Really what you see with the alerts is containers are first-class citizens. We're treating them as important things that you need to monitor within Orion, so alerts, reports, everything that you need with container monitoring.
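The restart-count condition can be sketched as a simple threshold check. The threshold of five comes from the discussion above; the container names, counts, and `containers_to_alert` helper are hypothetical and do not represent the Orion alert engine.

```python
RESTART_THRESHOLD = 5  # alert on anything greater than five, as described above

def containers_to_alert(restart_counts, threshold=RESTART_THRESHOLD):
    """Return, in sorted order, the containers whose restart count exceeds the threshold."""
    return sorted(
        name for name, count in restart_counts.items() if count > threshold
    )

# Hypothetical restart counts for illustration.
restart_counts = {"wordpress": 2, "mysql": 6, "worker": 5}
print(containers_to_alert(restart_counts))
```

A count of exactly five does not fire because the condition is strictly "greater than"; a container that climbs past it points at a likely configuration or dependency problem.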
It's amazing. I think as somebody who doesn't work a lot with containers, this also helps me get my head around what they are without having to go through a million tutorials and spin a bunch of stuff up or whatever it is. Now I really have a much better sense of what they're doing, and also that within my corporate organization, I'm able to support them. So, thanks for coming.
Yeah, no problem. Pleasure being here.
Serena, thank you so much for joining us on Lab. I know that you are really busy getting things ready for the release. Serena is one of my go-to people whenever I want to talk about virtualization, and storage, and things like that. She really knows a lot. She agreed to come on, but I'm wondering, what are you going to show us today?
I'm here to show you how VMAN has added support for VMware's virtual SAN solution.
Okay, for myself, I think I need to take a couple steps backward and just say, I'll just get it out there, what's a vSAN?
A vSAN is software-defined storage, or a construct.
More nouns and verbs?
You can think of your physical SAN, you put all of your files on this physical SAN, take that and have software create that same thing to store all your files, do all of your RAID, that's what vSAN does. And VMware has provided a solution that's really good at it.
So essentially, I was going to say, essentially, we broke storage up over here, compute and CPU over here, vSAN is now taking peanut butter and jelly, putting it back together?
Yes, it simplified it, it makes it easier, so essentially you basically just go, here's the new host to my vSAN cluster, and now you have more storage. You don't have to worry about compatibility, you don't have to worry about adding all of the things. It's just there, it simplifies your life and it's just part of making sure that you have enough storage for all your applications and virtual machines.
In my case, being the database server guy, what I love about vSANs is the ability to simply apply policies in order to change the RAID configuration levels on the fly. I don't need to talk to anybody and beg for a RAID 10 for a tempdb, I can just say, you know what, every time I build a database server, tempdb goes on the T drive and the T drive will always be RAID 10, and it's enforced with a policy, and done. Software-defined storage.
Nice.
It is nice.
Does that create issues of ownership?
Depends on who you ask. If you ask me, I know who's in charge, right?
Right.
I know who's responsible. I would tell you that ultimately whoever's responsible for the recovery of the hardware is ultimately in charge. In this case, vSAN is really about storage. It's software-defined storage, so who administers the software? Maybe the virtualization admin does, but who's in charge of the storage in terms of recovery? If I change from RAID 10 to RAID 0, that's going to affect how recovery's handled, most likely. In other words, I have a potential to lose data that I may not have otherwise thought about. If the storage guy ultimately is responsible for that recovery, then I would tell you, he's ultimately responsible for the vSAN. But I know some people feel the virtualization admin is in charge of the vSAN. It's like saying, is a DBA in charge of the server? No, he's just in charge of a piece of software in the server. The virtualization admin is in charge of a piece of software that's on top--
Absolutely, but you would not, I would build the server for you, get Windows prepped and then hand the server off to you. I think the same rules kind of apply with vSAN as your virt admin would build it up and then hand the ownership to you.
That's right, it's a team effort.
Got it, and I think that's what we discovered when we were talking to customers, right?
Absolutely, virtual admins are kind of in a strange place in terms of their job and what they're responsible for. This, who owns the vSAN is definitely a challenge that comes up in these more siloed, traditional IT setups. These virt admins are being challenged to say, hey you might not know that much about storage. I know we asked you some of this whenever we ask you to separate our datastores, but now it's even more demanding upon their expertise, they have to go in, they have to set it up, they have to work with the storage admin to get those storage policies in place, and then they have to keep an eye on it. The thing is, our virtual admins, their virtualization environments are expanding, they're growing in scale, so they don't want to sit there and look at a web console and just constantly say, hey is this running out of storage? Because storage is so huge now, it's so cheap that really that could just be like watching water boil. At this point, they're at a point in their career where alerts, events, those types of triggers to let them know when critical things are happening, like when you're about to hit 80% perhaps on your usage of your vSAN, that's something that you need to get an alert for so you can then work with your storage admin to do something about it. Add another host to your vSAN cluster, or consider maybe switching up the configuration of the vSAN entirely.
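A small sketch of the capacity trigger mentioned here, assuming the 80% figure as the default threshold. The capacity numbers and the `capacity_alert` function are hypothetical and only show the arithmetic behind such an alert.

```python
def capacity_alert(used_tb, total_tb, threshold_pct=80.0):
    """Return (should_alert, used_percent) for a datastore or cluster.

    `threshold_pct` defaults to the 80% example given above.
    """
    if total_tb <= 0:
        raise ValueError("total capacity must be positive")
    used_pct = 100 * used_tb / total_tb
    return used_pct >= threshold_pct, round(used_pct, 1)

# Hypothetical capacities in terabytes.
print(capacity_alert(42, 50))
print(capacity_alert(30, 50))
```

When the first value is true, that is the cue to work with the storage admin, for example by adding a host to the vSAN cluster.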
Got it. It sounds like our conversations with virtualization admins sort of affected the direction we took with this set of capabilities.
Absolutely.
How do we get into it?
So, for VMAN, one of the things that we noticed is that once you actually set up vSAN, your job isn't to just sit there and watch it and look at it every day. You want this to be integrated into your everyday workflows; you want to be able to just do what you normally do and see vSAN data along the way. We took that into account in the feature that we provided in VMAN. You can see that by the fact that there isn't anything specifically screaming at you, vSAN, here's a vSAN overview tab, or here's a whole bunch of vSAN. You'll see hints of it sprinkled all over the place, because we recognize that there are places where you expect to see this data. You don't need to go out of your way to another siloed page that's not related to anything else to get that data. Here we are. We've got the VM summary page right here. There's a couple places that you'll see hints of vSAN. One is that if any health checks are triggered on your vSAN, we will integrate those alarms and checks and show you that the virtual SAN cluster is having some problems, and give you some information about that, so you can see the disk health here is in a failed state. There's something wrong here with this vSAN. This is one entry point where you can say, okay, well, this is a vSAN, and as I hover over it, you can see that vSAN is enabled, and you can see the overall cluster capacity. That's important information to know, 'cause again you want to keep an eye on the ratio of used capacity versus what's left. We also have some other ways to get into this. Search is available if you want it, but as part of your normal flow, you can also look at your entire inventory, and I happen to know that this is a vSAN cluster. I'm just going to jump into it.
Here, again, you see all of the default data, but then sprinkled amongst your normal workflow is some data about your used capacity. We will also detect if you're using deduplication and compression on all-flash storage. If that's the case, then we will give you some extra information here. This is also fully supported for our AppStack feature and mini-AppStack. Here, again, you don't see anything that screams vSAN, I'm a vSAN. Our virtualization admins have told us pretty clearly that there's a couple places where they expect to see this data. It's either at the vSAN cluster level, because they've configured it for business use cases, or at the actual datastore level. If you look at the vSphere client, that's also where they're going to expect to see that data, so we basically did the same thing. Here we can see that our vSAN datastore is having some really bad issues. There's some problems, it's running out of capacity, and this is something that a virt admin is going to have to go deal with immediately. 'Cause otherwise their virtual machines are going to have resyncing due to the failover. Currently, I know that this vSAN datastore is showing you that it's not in a good state, but we don't actually have the ability to show you the resyncing because it's happening too fast to show for this demo.
We have really good synchronization. However, I think we've got a screen, so we'll put that up on the screen here for a second, but that would be what the resync would look like if you didn't have such amazing systems that we're lucky enough to have.
Our network is awesome. Really good tools.
Fabulous.
Another important thing to note is that we've also put in these top 10 virtual guests, because it's nice to know that your vSAN is running out of capacity, but what you really care about is, how is it affecting my VMs? What are the top VMs that are showing latency according to vSAN? We show you right here, and you can put this on any custom views that you have, alongside anything you want. We're going to jump right over into the storage here. The nice thing again is that even though we're not trying to overwhelm you with an overview of vSAN, we do call out that this is performance data being reported by the vSAN for this cluster. You can do correlation here to see that something's going on in this time frame. What's also nice is, I did mention that vSAN is integrated into AppStack. This is also integrated into Performance Analyzer. So, if you click here on this hyperlink, you can see that we're going to preload a Performance Analyzer project for you. This just gets you started right away with a template containing important vSAN information. You can open this up, and you can see that we have all the relationships of everything that's applicable and related to this vSAN. It's a simple matter of just opening up, clicking on some statistics, putting them alongside, and you're off to the races. What's even nicer is that you can save something like this as a performance project, and at that point, you can put it on any views that you want. If I were to go to my summary page, which you probably wouldn't do, but if you wanted to, you could. All you have to do is do a simple search here and you can see that the project that I saved is available, and I can drag and drop it. And now I have my vSAN data on my custom views.
[Thomas] That's nice.
Yeah, that's been something that a lot of folks have been asking for when we talk to them at conventions or at the SolarWinds User Groups. They've really asked for the PerfStack data to be more portable and able to be placed in different areas for use. That's fantastic.
Okay, so the only thing left to show you after that amazing PerfStack revelation that we just had there is that we know that only showing data at the cluster level is not the complete picture. We do have data available for you at the datastore level as well, and this is where you can see the used capacity breakdown. Because when you look at the full amount of storage that's being taken up by the vSAN, there is a fair amount of overhead, because it is software. Software is actually running this, maintaining it, and doing all of the resync activity, and that's important information for you to know. There might be a failover happening, there might be resyncs that are occurring, and in that case, you're going to want to look at, well, how much of this is actually just the overhead of having software do all of the storage management for me? And that information is at your fingertips.
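[Editor's note] The used capacity breakdown described here comes down to simple arithmetic. The sketch below is only an illustration: the function name, the field names, and the sample numbers are all assumptions, not values or APIs from VMAN or vSAN.

```python
def capacity_breakdown(total_tb, vm_data_tb, overhead_tb):
    """Split a datastore's capacity into used, overhead, and free portions.

    All inputs are in terabytes. Names and units are hypothetical.
    """
    used_tb = vm_data_tb + overhead_tb
    if total_tb <= 0 or used_tb > total_tb:
        raise ValueError("capacity figures are inconsistent")
    return {
        # Share of the whole datastore that is consumed.
        "used_pct": round(100 * used_tb / total_tb, 1),
        # Share of the consumed space that is software overhead.
        "overhead_pct_of_used": round(100 * overhead_tb / used_tb, 1) if used_tb else 0.0,
        "free_tb": round(total_tb - used_tb, 2),
    }


# Hypothetical datastore: 20 TB total, 14 TB of VM data, 2 TB of overhead.
print(capacity_breakdown(20.0, 14.0, 2.0))
```

Separating overhead from VM data makes it easier to tell whether a shrinking free pool is caused by workload growth or by storage management activity such as resyncs.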
Wow. Slick. Again, I think that for folks who are familiar with vSAN, this is going to be a really welcome addition, but for accidental vSAN administrators, people who might have it thrust upon them, it at least gives you an idea of what it is and how to deal with it at that introductory level, at that entry level. Thank you for joining us again. We really appreciate it.
Thank you for having me.
So, I have to say that for this last demonstration, we talk about configuration management all the time and configurations and things like that for our network devices, but configuration on server devices isn't something that I thought we were going to see here ever. Yeah.
We did it. It's kind of like my wife, you have to ask a couple times, but eventually we got there. I'd like to introduce Server Configuration Monitor to everybody.
Alright.
Okay.
What it does is use an agent, actually the Orion agent, so right now it works on Windows boxes. Basically, we're pulling the configuration from a machine, software, hardware, registry, things like that, and then reporting it in the Orion Platform, so you get all the Orion goodness. But seeing is believing, so why don't we give it a whirl. Right now, I'm on a server called SCMDEMO and we can see that I've got all these items to chart my configuration. Hardware, IIS, registry, a couple things along those lines.
I just want to point out that there's actually a new item on the left-hand nav bar, right?
Correct, yes. If you are on a node and you are monitoring its configuration, you would see Server Configuration on the left side, just like you see Maps now. We're starting to use that left menu a lot more for new features. If you're at a server node, you can come in here and you'll see Server Configuration if you're monitoring it. From this screen, we can see all the things we're monitoring: hardware, IIS, registry. Why don't we start in software inventory?
Okay.
Okay.
I can just open this up and I can see that I'm monitoring the firmware, operating system, server information, and installed software. If I click installed software, I can see that in my baseline, I did not have any backup software on there, and on the new system, I've now installed SolarWinds Backup. So, I'm now backing up that server and I can actually tell the version, when it was installed, things like that.
Uh-huh.
That's awesome.
What also is nice about this is it highlights and takes you directly to the part that's changed. It no longer gives you thousands of lines or hundreds of lines--
[Leon] I was just noticing that, that down at the bottom of the screen here, it says, plus 42 unchanged lines.
It basically condenses those, and now I can come down here and see that C++ was added, or things like that. It's really great that you're not having to scroll and scroll and scroll. It's like, hey, here's the data you want, we're going to bubble it up to the top. If you feel like looking through the other stuff, be our guest.
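[Editor's note] The condensed view described here, showing only the changed lines plus a count of unchanged ones, can be approximated with Python's standard difflib module. This is a sketch of the idea, not how Server Configuration Monitor is implemented; the function name and the sample inventory lines are assumptions.

```python
import difflib


def condensed_diff(baseline, current):
    """Return (changed_lines, unchanged_count) for two lists of config lines."""
    changes = []
    unchanged = 0
    for line in difflib.ndiff(baseline, current):
        tag = line[:2]
        if tag in ("+ ", "- "):
            changes.append(line)      # keep additions and removals
        elif tag == "  ":
            unchanged += 1            # count, but do not display
        # "? " hint lines from ndiff are ignored
    return changes, unchanged


# Hypothetical software inventory before and after a change.
baseline = ["Microsoft .NET Framework", "7-Zip"]
current = baseline + ["SolarWinds Backup"]

changes, unchanged = condensed_diff(baseline, current)
for line in changes:
    print(line)
print(f"+{unchanged} unchanged lines")
```

Counting unchanged lines rather than printing them keeps the reader's attention on the delta while still signalling how much was left out.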
Uh-huh.
It's wonderful.
What's really great about this, from talking to customers in UX sessions, is that they were using the software inventory in two ways. One is in conjunction with patching: hey, I'm going to patch on the weekend and I expect to see Service Pack 2 installed on 10 machines Monday morning when this runs. If I see seven, what happened with the other three? The other is unauthorized changes.
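[Editor's note] The patching check described here, expecting an update on ten machines and finding it on seven, is a set comparison between the hosts you expected and the inventory each host reports. A minimal sketch follows; the host names, the inventory structure, and the function name are hypothetical.

```python
def hosts_missing_package(expected_hosts, inventories, package):
    """Return the hosts whose reported inventory lacks the expected package."""
    return sorted(
        host for host in expected_hosts
        if package not in inventories.get(host, set())
    )


# Ten hypothetical servers; only the first seven report the update.
expected = [f"srv{n:02d}" for n in range(1, 11)]
inventories = {host: {"Service Pack 2"} for host in expected[:7]}

print(hosts_missing_package(expected, inventories, "Service Pack 2"))
```

A host that reports no inventory at all is treated as missing the package, which is usually the safer assumption after a patch window.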
Uh-huh.
It could be unauthorized but harmless, like, hey, you installed WinZip 'cause you've got to uncompress some things, that's okay. Or, hey, somehow malware got on a machine, something that was really unauthorized and shouldn't have happened anyway. This is really great to help answer the question of what's changed. I can't tell you the number of times it worked Friday night, and then Monday morning. [chuckles]
I'm going to say, every Friday, that's the number of times. So, like 52 times a year, every Friday. That's what happened.
That sounds about right. Once again, we've got our software audit report pretty well covered. We've also got a couple other use cases here, where we've got our special system files, so I can look at a hosts file, which is great, and I think I've got that loaded right up here. Here I'm looking at my hosts file, so we can parse files and show you the contents. XML, text files, INI files, we can actually pull that out, so we can see that this server had entries for a new web server pointing to 10.54.1.1 and an accounting server, and they were removed. At this point, hopefully those are in our DNS server. If not, this server's going no comprende when trying to find the new web server. It's just not resolving anything.
And of course, on the malware side, one of the primary things to look for is new entries being added to the hosts file to redirect you to some place that you didn't think you were going. You can immediately detect that.
This is great, this is another example. It will monitor text files, XML files, anything along those lines, so it's great that it's showing you the changes within that file. Again, it's compressed 16 lines that I don't care about. This is what's changed.
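[Editor's note] Detecting added or removed hosts file entries, as discussed here, amounts to parsing each version into name-to-address pairs and comparing them. The sketch below assumes the conventional hosts file layout (an IP address followed by one or more names, with `#` starting a comment); the host names are made up, and only the 10.54.1.1 address comes from the demo.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            entries[name] = ip
    return entries


def diff_hosts(old_text, new_text):
    """Report entries added, removed, or repointed between two versions."""
    old, new = parse_hosts(old_text), parse_hosts(new_text)
    return {
        "added": {n: ip for n, ip in new.items() if n not in old},
        "removed": {n: ip for n, ip in old.items() if n not in new},
        "changed": {n: (old[n], new[n])
                    for n in old.keys() & new.keys() if old[n] != new[n]},
    }


# Hypothetical before/after contents of a hosts file.
before = "10.54.1.1 newweb\n10.54.1.2 accounting  # finance app\n"
after = "# entries removed\n"
print(diff_hosts(before, after))
```

The "changed" bucket is the one to watch for the malware case mentioned above, since a repointed name keeps resolving but sends traffic somewhere else.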
Just drive to it. I know we can go back to the node page, but I'm just, I'm intrigued by that item at the top, the configuration comparison. What's that about?
I can click in there, and basically it takes the previous screen but gives me a bunch of filters over here, so I can go and look at certain profiles. I can look at my hardware profile, so I can open this up and I can see the different things in the hardware. I can look at either drivers or memory modules. This is great if you're updating PERC drivers, video drivers, NIC drivers, anything along those lines. Those changes are so mundane that if somebody makes one, it's not the first thing I think of when I'm asking why I'm getting packet loss. Let me go check the 3Com driver; maybe somebody updated it, or something along those lines.
On the hardware side, the thing I like about that is when there's a module failure, a total failure, so now it's just disappeared from the motherboard's perspective, this is where I can pick it up and I'm not troubleshooting things that are non-issues.
Initially, when I saw the hardware side, I'm thinking, who's really pulling graphic cards out and putting graphic cards in at random? But really--
You have no idea.
Outside of that, Leon, in a virtual environment these things are really dynamic. In VMAN, we're giving you recommendations to change memory, to change CPU, all the time. Again, here's another spot to see that, hey, I had memory changes. In this case, it was a physical box, but I can see that memory was added. In a virtual environment, you could see this changing all the time as recommendations come up that, hey, you've over-provisioned or under-provisioned or anything like that. You're able to see that right here.
So, did it default to this baseline, it's like the first time it ran?
No, baseline is something you set. By default there is no baseline; you come in and define one. So, if you have a Windows box, you install it, you put this on there right away, and you apply a service pack and update, update, update, you'll see all those changes in there. But when you've got every available update that you're comfortable with, then you would tag that as a baseline. Then, going forward two weeks, months, whatever it is, you've got that special baseline. That's just a tag, essentially saying, hey, this is when the thing worked perfectly, and here's how it has deviated from there. You can always change your two timeframes. You can compare Monday to Tuesday, Monday to Wednesday, anything like that, but your baseline is that special one, like, hey, this is when it was perfect. This was ideal. So, you can always use that.
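[Editor's note] The baseline behavior described here (no baseline by default, a baseline is just a tag on one snapshot, and any two snapshots can still be compared) can be modeled in a few lines. This is a conceptual sketch only; the class, method, and setting names are assumptions and do not reflect the product's internals.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConfigHistory:
    """Hypothetical store of labeled configuration snapshots."""
    snapshots: dict = field(default_factory=dict)  # label -> {setting: value}
    baseline_label: Optional[str] = None           # no baseline by default

    def add(self, label, config):
        self.snapshots[label] = dict(config)

    def tag_baseline(self, label):
        if label not in self.snapshots:
            raise KeyError(label)
        self.baseline_label = label  # the baseline is only a tag

    def compare(self, old_label, new_label):
        """Return {setting: (old, new)} for every setting that differs."""
        old, new = self.snapshots[old_label], self.snapshots[new_label]
        return {key: (old.get(key), new.get(key))
                for key in sorted(old.keys() | new.keys())
                if old.get(key) != new.get(key)}

    def drift(self, label):
        """Compare a snapshot against the tagged baseline."""
        if self.baseline_label is None:
            raise ValueError("no baseline has been tagged yet")
        return self.compare(self.baseline_label, label)


history = ConfigHistory()
history.add("Monday", {"memory_gb": 16, "service_pack": 2})
history.add("Tuesday", {"memory_gb": 32, "service_pack": 2})
history.tag_baseline("Monday")
print(history.drift("Tuesday"))
```

Keeping the baseline as a pointer rather than a copy means retagging is cheap, and ad hoc comparisons such as Monday to Wednesday use the same `compare` path.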
So, we've looked at software, we've looked at file changes, we've looked at hardware changes, but there's one, as a SysAdmin, there's one change that really tells you when something is horrible, which is registry.
Registry, yes. We can come in here and I can see any registry changes. This is really great, and again, it's collapsing the unchanged stuff. What's ironic is that in a bunch of the sessions we've done with people, they're even using this to monitor their Orion server. Hey, I'm going to do an update, and it may change my five users to 10 users or something like that, where they have a key in here. This is really great for that use case, or for any other software that makes a minor change when it updates. Or again, malware: if it adds some sort of browser extension or anything like that, you're going to pick it up here, where it could get totally skipped over by an antivirus program. Anything like that is really important.
So, what's the, what do we need for this? Obviously, it's part of, it's a new product, a new install. Agents, no agents?
It does use the standard Orion agent. Right now, it works on Windows boxes, so you can basically deploy that agent and have it report in. It is standalone, so you don't need another module to work with it. But because it is in Orion, you do get all the greatness of PerfStack and things like that. Actually, if I go back over here, I'm going to go back to the actual box itself. Again, come over here, I'm going to go to the summary screen, I'm going to go to the PerfStack for it, and this is where it works really well in conjunction with other SolarWinds products. It's going to show you over here, actually down at the bottom, here it is, server configuration. You can click it and it's going to give you a brief description on the right-hand side once it loads up. Actually, if I hit the correct timeframe, I can see the changes and I can drill in there. This is great when you pair this with SAM. Knowing again, hey, it worked Friday, what changed on Saturday or Sunday, now it's broken on Monday, what are the performance metrics I'm looking at Monday? Are we looking at website response time, CPU, RAM? Putting these in conjunction, making these products work better together, gives you that visibility to really look at all the products together.
Again, there's lots of changes that happen on a machine all the time. Is the event that I'm looking at correlated with any of these changes?
Yeah, and with some, it is what it is. If you have a Spectre/Meltdown patch, you should apply that. It is what it is. You're going to have to go to management and say, hey look, it caused CPU to do this, it caused IOPS to do that, there's really no way around it. Or, hey, we made a change because we're testing out a new program, or we put new code into production, things like that that you can undo and correlate back. It could be good, bad, or indifferent.
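[Editor's note] The "it worked Friday, it's broken Monday" question is a time-window lookup: given when the problem showed up, list the configuration changes recorded shortly before it. The sketch below illustrates that correlation with the standard datetime module; the event list, the dates, and the 72-hour window are assumptions, not PerfStack behavior.

```python
from datetime import datetime, timedelta


def changes_before(incident_time, changes, window=timedelta(hours=72)):
    """Return the changes recorded within `window` before the incident.

    `changes` is a list of (timestamp, description) tuples.
    """
    start = incident_time - window
    return [c for c in changes if start <= c[0] <= incident_time]


# Hypothetical change log and a Monday-morning incident.
change_log = [
    (datetime(2023, 12, 20, 14, 0), "NIC driver updated"),
    (datetime(2024, 1, 6, 2, 0), "Security patch installed"),
]
incident = datetime(2024, 1, 8, 9, 0)

for when, what in changes_before(incident, change_log):
    print(when.isoformat(), what)
```

A match in the window is a lead, not proof; as the discussion notes, a change can turn out to be good, bad, or indifferent for the metric in question.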
This is really exciting. I can't wait to hear what everyone thinks about it, if you have a chance to kick the tires. We know that this is the first release, and we always love to see how people are playing around with it and take it from there. Again, we'll be listening to everyone's reactions and seeing where we develop it and grow it from there.
I've sat on a lot of UX phone calls and MVP calls, and it's been really fun to see people use this, play with it, get their hands on it, and come up with use cases, so I'm really excited to get it out there. You guys have asked, and we finally delivered. I'm glad it's out there, and I'm excited to see how it grows.
Perfect.
Thank you.
Holy observability. I feel like I have greater control over my environment already.
That's right, chum. All it takes is a little visibility.
And the right tools for your utility belt.
For SolarWinds Lab, I'm Bacon Man.