In this episode, Head Geeks™ Patrick Hubbard and Leon Adato dig deep through the copious pile of live chat topic suggestions, SWUG™ requests, support questions, and popular engineer how-to sessions to create a session that is truly a command performance. They’ll cover everything from basic view customizations to virtualization automation, and even hit on configuring the new SAML 2.0 compliant SSO features of the Orion® Platform. We know this is one Lab episode you won’t want to miss, because it’s what you asked for in the first place!
You know, we were just thinking about it, and there are a ton of ways that customers can interact with us to get insight on using SolarWinds tools more effectively. For example, learning all the secret features of their Orion installation or optimization.
Well yeah, and you can call or email support, or you can set up a session with one of our engineers. And how many episodes of Lab have we done on topics like streamlining alerts, building custom SAM templates, using the SDK, and NCM compliance reports?
Sure, there is that, but there is so much more. You know, the convention booth can sometimes seem like an extension of the support desk itself, and so on this episode, in addition to answering questions on those topics that we haven’t covered before, but you ask about over and over again, we’re going to go deep into the request inbox, and also show view customization tricks, customizing PerfStack, adding SSL certs to Orion, or setting up the new SAML 2.0 support for SSO integration.
We also got a bunch of specific questions about challenges with VM monitoring, setting up cloud integration, and creating custom device pollers. A heck of a lot of design discussions really do happen at SWUG, and it gives us a chance to really hear your questions about how you want to use these products.
Right, but there’s even more. There’s the THWACK forums, and I know you’ve practically done mini design sessions on social media. That doesn’t even begin to mention the live chat during THWACKcamp and Lab.
Right, and it’s why somewhere in every episode we always remind you to visit our homepage, which is lab.solarwinds.com, and leave feedback and ask questions. This is one of those episodes where we get to look back on a year’s worth of topic ideas, and just move them to the right of the board.
It’s a Kanban joke.
Don’t worry about it. But seriously, the feedback that you leave on our homepage, your private messages, and support comments are so valuable, and you can also sign up for reminders, so that next time you can be with us live, and ask your questions.
That makes me feel like we better, for better or worse, get this session started.
Because we have a server ton of questions to get through, and we need to get started.
I have dropped a server ton on my foot before. We’ll get on with it. Welcome to another episode of SolarWinds Lab, I am Patrick Hubbard.
And I’m Leon Adato.
I really like that we’re going to have some time this episode. We always try to pack as much how-to as possible, especially in the screen, and not as much talking, although we did kind of rap a lot with the DevOps episode. [laughing] But we try to get as much as we possibly can in, so when we look through the list of topics of your suggestions and questions, we treat it as a jigsaw puzzle.
How many can we get in where we show you, start to finish, a whole lot of answers to a whole lot of questions that you have. This one’s a little bit different because these are some of the ones that maybe take a little bit longer, like for example, we’re going to show how to create a view.
Right, a view-
Creating a view is actually simple in itself, but how do you assign it to a user so that they only see the certain widgets you want them to see or a group or whatever. There are just things that have a couple of moving parts that you have to dig into, and we’re going to take it, like you said, step by step.
So what else are we going to see?
I mentioned we’re going to do SSL certificates, whether it’s a wild card or-
Whoa, whoa, whoa, we’re going to put an SSL cert on an Orion Platform server?
Yes, or if it’s a self-sign cert, we get a lot of questions from people. Relatively simple, just have to walk through it step by step. We’re going to do universal device pollers, which again, very basic element, aspect of the NPM environment, but we’re going to walk through it.
Is there ever a limit to the total number of questions about the UNDP?
No. There’s always something more there. It’s one of the perennial favorites on THWACK also.
Because you can do anything with it.
Right. There’s a few things though, that did come up often when I talked to the support group and the SEs, that we’re not going to cover today. For example, we always get lots of questions on alerts, and we always get a lot of questions on the SDK, and there’s show notes, or actually right now on the screen we’re just sort of rolling through all the resources for some of those questions that come up perennially, but there are plenty of resources in the last 12 months that cover those.
Right, or just expand the more episode link on the bottom of this page. If you want to know about how to knock out spurious alerting, there’s what, four episodes?
There’s at least seven resources between eBooks and episodes and things like that.
Right, so those are all out there. But this episode is going to give us a chance to really walk through a couple of these that take a little bit of time. We’re going to be in screen for the entire episode, but bear with us because a lot of these are things like: “Hey where’s the edit button?” or “What is that magical last step to re-title a view page?” right? So we’re going to walk through all of those step-by-step, we will refer to other things where there maybe is a little bit more detail or something that we’ve covered in the past, but this is fresh content that is actually coming from this year’s suggestions.
Fantastic! So here we go. So I want to start off with something we get asked about a lot, especially in the convention booth, because security is everybody’s responsibility. How do I sign on securely into my Orion environment?
Ooh so this is going to be how do I apply an SSL certificate? And how do I integrate with SSO?
Right, which is sort of the new sexy hotness. I mean we’re really sort of excited about this.
Well I think of it as the new delegating hotness right? Because if you have to deal with individual accounts, it makes it really hard to get somebody onboarded, and using Orion.
Exactly. So first one is SSL. The other thing about these is that they are non-events. Once you know how to set them up in Orion, you’ll realize “Oh it’s just like that. It’s not a big deal.”
Yeah there’s a lot of automation and not a lot of sparkle. It just sort of happens.
So what I want to show first is actually the certificate itself. So here I am. I’m on the Orion poller, it’s running IIS, I go down to Server Certificates, and there you can see Leon’s Friendly Certificate.
That looks great! So that’s a server role certificate installed on this machine? And so that would either be like, if you only had one primary poller, it would be on that instance; if you had multiple web servers it would be on each one of those.
Correct. And it is obviously self signed.
It could be a wild card certificate, it doesn’t matter how you’re using it, it’s supported no matter what.
So the only, your mileage may vary thing, is if it’s self signed you may not get a nice green padlock in the bar, but you’re at least going to be able to ensure that you are using encrypted connections to–
Right you are using that SSL protocol. So there we are. And that’s it. When I go into the Orion Configuration Wizard, I just want to adjust the website, I don’t need to update the database, I don’t need to update my services, right here I do not want to skip the website binding, I will enable HTTPS, and look at that! There’s Leon’s Friendly Certificate! There might be some other ones, I could actually generate another self-signed right from there and say “Next.” And yup I know it’s going to bind, and say “Next,” and it’s going to go through the rest of the process.
And it’s an example of how many steps there are in the installer and the configuration wizard here that really are magic. Right?
It’s doing a ton for you. Now if you’ve done this manually, if you’ve applied certificates to an IIS instance, it’s not that hard, it takes a few manual steps, but again this is the kind of thing that the configuration wizard was designed to automate, so it happens repeatably every single time you do it.
Exactly. So this is really exciting. The configuration wizard is sort of grinding all the bits together and everything. I can’t wait for it to finish.
OK. We can wait for it to finish, so let’s use the same time here, and actually show us how to implement SSO, because that I think is an even bigger request. You don’t technically need to have SSL set up for SSO, although why wouldn’t you? So let’s take a look at how that works.
Well, here we have an environment that’s been set up. And you’ll notice that along with regular login, there’s log in with Okta, which is this SSO provider we set up for this demo.
For this demo, OK.
Right. And we used to joke about that right? In order to use Orion with SSO, first easy step, set up an SSO provider. No big deal!
Because if you get past that, all the rest of this is really easy.
Piece of cake. It actually is. So I’m just going to put in Jeremy’s password here, he’s very trusting, he told me his password. Rather than log in, I’m going to log in with Okta, it takes me to the Okta log in,
And if I had logged in already somewhere else, this would already know that I was, but it doesn’t–
If you checked the Remember me box, when you did it last time, maybe a little two-factor, whatever it took to–
Right, exactly. There we go. And yes, this is the first time I’m connecting from this machine, it knows that, so hey yeah sure. Remember me. So he’s logged in. Regular old user in this environment, this dev environment, no special privileges, you notice that he doesn’t even have the Settings, All Settings option right here.
But it does show his fully qualified name up here in the corner, so that you could tell that that’s the logon that you’re using.
Precisely. Now that’s what it looks like from a user logging in. But what does it look like when you want to set it up?
Yes, the easy part.
Right. So we’re going to log in as administrator.
With no password, because this is all about secure.
Right, exactly. Security. Destiny is like “No!” OK. So to set this up, you go to Settings, All Settings, and down with Manage Users is SAML Configuration. Now obviously we have this already filled in.
Now this was added 12.4 right?
Correct. Yeah. So we have this all set up, but we can go through the screens anyway. A lot of this information you’re going to get from the SSO provider, and that’s exactly what I want to go through now.
And it’s going to be specific to your environment, based on the SSO platform that you’re using.
Correct. So the first thing it’s going to want to know is, the Orion Web Console external URL. What’s it asking here? It’s asking “After the user has signed on, where do you want to drop them?” It could be your additional web server, it could be one of the external ones, it just depends on–
You could have a load balancer with a domain name.
Precisely. So that’s why this would.. It does auto-populate to the Orion poller, but you might want to put something else there, depending on how you’re working it. Next screen also auto-populates again, what’s the audience URI, Uniform Resource Identifier, and the SSO service URLs. This is all still specific to the Orion poller. But then this information, the provider name, the target URL, and of course the certificate itself, that is going to be what comes from your SSO provider. So you get that off of the, in this case, in the case of the demo, the Okta system will give you that information.
But if you’ve ever set up SSO, these are standard fields that you’re going to fill out every single time.
The only difference here is going to be the redirect URL after successful logon, back to the Orion server front-end.
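Editor's note: the fields walked through above can be sanity-checked before they are pasted into the SAML Configuration page. The following is a minimal sketch, not the Orion Platform's own validation. The class name, field names, and example URLs are hypothetical and only mirror the labels read out in the demo.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class SamlSettings:
    # Hypothetical field names; they mirror the labels read out in the demo.
    orion_external_url: str   # where users land after a successful logon
    audience_uri: str         # auto-populated on the Orion side
    idp_name: str             # provider name, from the SSO provider
    idp_sso_target_url: str   # target URL, from the SSO provider
    idp_certificate_pem: str  # signing certificate, from the SSO provider


def _is_http_url(value: str) -> bool:
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def validate(settings: SamlSettings) -> list[str]:
    """Return a list of readable problems; an empty list means nothing obvious."""
    problems = []
    for label in ("orion_external_url", "idp_sso_target_url"):
        if not _is_http_url(getattr(settings, label)):
            problems.append(f"{label} is not an http(s) URL")
    if not settings.audience_uri.strip():
        problems.append("audience_uri is empty")
    if not settings.idp_name.strip():
        problems.append("idp_name is empty")
    pem = settings.idp_certificate_pem.strip()
    if not (pem.startswith("-----BEGIN CERTIFICATE-----")
            and pem.endswith("-----END CERTIFICATE-----")):
        problems.append("idp_certificate_pem does not look like a PEM certificate")
    return problems
```

A check like this only catches copy-and-paste slips. Whether the values actually work is still decided by the test step on the configuration page itself.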
Right. Once you’ve saved those settings, you actually can test it out, so that you don’t have to go through things and say “OK test my configuration.” It’ll try it out. If there’s any errors.. OK so it says that you’re not using OrionGroups, I’ll talk about that in a second, what that is. But otherwise it runs clean. Or not. You work the errors like you work any problem, and that’s how you set it up. So let’s say you got through that process, right? How do I set up users? That’s the next thing. So I’ve got my SSO working, I’ve got my test button, is now coming through green, happy happy. Users and groups. Go back down to user accounts, manage my accounts, I want to create a new account, and now you see that I have two additional options. SAML individual account, SAML group account. So let’s just run this down. We have Orion individual accounts, so those ones that are unique to the Orion environment. We’ve got Active Directory individual accounts. I just want to pull their AD account. Or Active Directory groups, right? And then we have the same thing for SAML. We have SAML individual accounts, I just want to put one user in, or SAML group accounts.
Well and the whole point of group accounts, really is why you take advantage of single sign-on in the first place, or AD integration, because hopefully you are all using groups to apply view permissions, or even custom views or something else, so that every time you add a new user, actually any time a new user in a participating group chooses to use the Orion Platform, they are going to be able to log in without any customization from you, and just start using it. So it makes it much easier to delegate access to people who are already in an existing group.
Precisely. Now there’s one more question that we get asked a lot, which is “Oh, I have a user who has some elevated privileges, he or she is also part of a group. So which one takes effect?”
I bet that Learn More link explains all this too.
Yeah it does. And the answer is going back to your list of users, whichever one happens first in the list. So if there was a group here, if that group appeared higher up and Jeremy was part of that group, the permissions he would get would be part of the group first. If the individual account appears first, that is the set of rights that’s going to matter. So the order in which those groups appear, is the order in which–
They’re going to be evaluated.
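Editor's note: the "first one in the list wins" rule described here can be sketched in a few lines. This is an illustration of the ordering logic only, under the assumption that the list is evaluated strictly top to bottom. The class and function names are hypothetical and are not part of any SolarWinds API.

```python
from dataclasses import dataclass, field


@dataclass
class AccountEntry:
    name: str
    kind: str  # "individual" or "group"
    rights: dict = field(default_factory=dict)


def effective_rights(username, user_groups, ordered_accounts):
    """Return the rights of the first list entry that matches the user."""
    for entry in ordered_accounts:
        if entry.kind == "individual" and entry.name == username:
            return entry.rights
        if entry.kind == "group" and entry.name in user_groups:
            return entry.rights
    return None  # no entry in the list matches this user


accounts = [
    AccountEntry("NetOps", "group", {"customize_views": False}),
    AccountEntry("jeremy", "individual", {"customize_views": True}),
]
# The group appears first, so its rights apply even though Jeremy
# also has an individual account further down the list.
print(effective_rights("jeremy", {"NetOps"}, accounts))
```

Reordering the two entries flips the result, which is why the position of a group in the account list matters.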
So regular deny.
Mm hmm. Right, the only logical way. So do you think the SSL certificate is done baking yet?
There’s only one way to check.
All right so we’re going to log in and see how it goes. All right so here we are, https://10.110.69.72. Here we go, fingers crossed and connection’s not private, but that means that I cannot proceed–
Because you’ve chosen to be unsafe and not sign it.
Yes I am.
You should probably do that.
I wanted a certificate that said Leon’s Friendly Certificate.
Because this makes everybody open a help desk ticket.
I know it does, but it was what I had… OK yes. You should use a regular signed certificate, you’re absolutely right. All right? I admit it. OK, so there we go. SSL certificates, single sign-on certificates, secure, happy, and relatively easy to set up.
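Editor's note: the browser warning shown in the demo can also be reproduced from a script, which is handy for checking a web server after the Configuration Wizard has finished. The sketch below uses only Python's standard `ssl` and `socket` modules. The function name is hypothetical, and the host is whatever address your own web console answers on.

```python
import socket
import ssl


def check_https(host, port=443, timeout=5.0):
    """Try a fully verified TLS handshake and describe the outcome."""
    context = ssl.create_default_context()  # verifies chain and host name
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return f"trusted ({tls.version()})"
    except ssl.SSLCertVerificationError as err:
        # Raised for self-signed certificates, unknown issuers, name mismatches.
        return f"untrusted: {err.verify_message}"
    except OSError as err:
        # Covers refused connections, timeouts, and other TLS failures.
        return f"unreachable: {err}"
```

Calling `check_https("10.110.69.72")` against the demo server would be expected to come back as untrusted, because the connection is encrypted but the self-signed certificate does not chain to a trusted issuer.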
So securing Orion Platform 101. That’s what that looks like. SSL and SSO, and of course make sure you check out the Customer Success Center, which is support.solarwinds.com and there’s how-tos that will walk you through all of that. The second thing then is, the whole point is you’re trying to make it easy to get people onboard, right? So then you should let them sprawl into a million individual accounts.
Absolutely.. Not! Not! Not only that, but when they go on, one view is perfect for everybody, like one size fits all.
OK so, we’re going to combine in this case, the sign-on access along with the group definition that we saw before, with custom views, and we’ll create those from start to finish.
Right. And again, it’s not difficult, there’s a couple of moving parts to it, but the idea here is that when somebody logs in, either a group, a member of a group, or an individual, or a customer, or whatever it is, they’re going to be able to have a view that fits what they want. Either it has the metrics right up front that they need, or it’s limited to only the devices that they want, or whatever, that’s the whole concept behind custom views.
So you can combine views and view limitations, and be able to really create those specific views that let people get their work done without having to scroll through all of the data that’s available.
Right. Now one thing that people like to do is just be able to change what they see on the screen here. So I’m just going to go over, I’m going to click this little pencil button to customize the page.
What if you can’t find that pencil?
Everyone doesn’t have the pencil?
No not everyone has it.
No everyone does not have the pencil. It’s a great question. All right I’m going to get out of here, I’m going to say I’m done editing. So if you don’t see that pencil, it means that there’s one particular permission you need. So either you or the administrator of Orion has to go into Manage Accounts, and I’m going to edit, oh, we’ll edit the Knights Who Say “Ni,”
Yes they need edit permissions, definitely.
They definitely need to be able to change the screen, and the permission that you’re talking about is customize, allow account to customize views. That is the permission you need to be able to see the pencil. That’s it! And the nice part is, it doesn’t give you permission to change, to add nodes, remove nodes, move things around, you know, that’s it. It’s a pretty easy one. All right so here we are back in the view, I have allow account to customize views, I can click on the pencil. And now everything becomes a little bit gray, it’s kind of like you’re in that weird time in Lord of the Rings where you’re invisible, but not really, and stuff like that, except you don’t have to worry about the dark lord coming at you.
But also don’t worry about the data, just focus on the layout.
Exactly. Exactly. So if I want the map not here, I want to move it over to this column over here, I can do that, you can see it resizes a little bit. Some resources move better than others. I can get rid of resources I don’t like. I don’t want to have this Get Started with Cloud Monitoring, why you wouldn’t want that resource, I don’t know, but I’m going to get rid of that. Yes remove. And so on and so forth. You can move things around. I can add widgets, I can add interfaces here. All of that.
Well, and the other thing that’s really helpful, I mean this is great. Like this Pending Approval List, so this is coming out of NCM, right? So it’s normally not displaying. It’s displaying in this Edit View, so those ones are not hidden, but we try not to show a bunch of resources that don’t have data.
That are empty, right.
So when you go to edit mode, if you’ve ever asked “Well I thought I had a resource, is it data that’s not there at the resource?” when you go into edit mode, it will pull all of the widgets, the resources that are defined, even ones that don’t have data.
Right. Now one of the widgets that gets asked about a lot lately, is the PerfStack widget. So I want to take a minute and put that on. So we’re going to add a widget, all right.. And I can sort through here, I can go look but I’m not even going to do that, I’m just going to type PerfStack.
I would say this is one of the most common requested how-tos that we got all year. Especially at Cisco Live. You all really wanted to know how to do this.
Now there’s one key trick. One weird trick that you need. I had to build the PerfStack widget. I had to go into PerfStack and save it as an existing object, as an existing saved PerfStack, before it would appear here.
That’s because it doesn’t exist in PerfStack, unless it’s something along that URL, or it’s the saved ID for that PerfStack.
And it’s a fair question, for people who come up to us in the booth, or whatever, and they say “But I don’t see that resource.” Some resources are blank, you put them on the page then you configure them. In the case of PerfStack that’s not the case. You have to create the PerfStack, save it, and now I can drag it on here. And I’m going to put it right there at the top.
And while you do that, I was going to say the other reason for that is, sometimes Orion will do things like this to really make it easy to prevent trouble later on right? So if you’ve seen us talk about PerfStack, and again, this is one of those things where there have been many episodes where we have talked about PerfStack, one of the most important aspects of it is that left-hand side with the data explorer, where the fields have come over right? So the graphical nature of creating PerfStacks, you could maybe somehow do it inside of a resource, but it really wouldn’t be a lot of fun, so this is one thing where it’s reminding you, like “Hey if you’re building one of these really rich custom resources in PerfStack view, do it over here on the PerfStack page, and then just add it.” And it will just save you trouble later on, instead of trying to jam it into one little column on that page.
Precisely. OK so I’m done adding widgets. There it is. I’m done editing, there it is. And there we go. I have a custom view. Now that I just changed for everybody. [laughs] you having fun with that?
I just love [laughter]
So that’s for everybody. Everybody who uses this view, which happens to be the default Orion view, the default Orion summary view, I just changed it for everybody. So that’s something to be aware of, is that when you give somebody the ability to change it, if they’re using a generic view, they’re going to change, if they move things around it’s not personal.
So that’s because you modified the default view.
If you had created a custom view and made those changes, the default view would still remain OK.
Correct. And that’s exactly what we’re going to do now. Now I want to set up, based on this default view, I want to create another view that then I can apply to a particular user, or a group. Whether that is an Active Directory group, or a SAML group, or whatever, whatever, whatever. OK? So to do that I want to go to Settings, All Settings, and I’m going to Manage Views. Now here I have my list of views, I do have to know which view it is that I’m editing, like I said I already knew that this was the Orion custom view. So I scroll down here, it’s all alphabetical.
And so again some of these views are summary views, where they’d be summary resources, sort of overall performance, and then some of them are like node details views, or specific views, like the ones for a container that we saw a minute ago, or like a QoE views or something else.
Right and if you’re watching this and you’re thinking, “But how do I know which ones they are?” give me a second, I’m going to show you how to know which view is which. So I’m going to do this. I’m going to copy the view, because I’m going to use that as the basis. I could create a completely new view, but that’s a little tedious for what we’re doing.
Copy and modify, cut and paste is a good thing.
Right. So immediately people scroll down to the Orion view, and we get a lot of panicky phone calls, “Where’d it go? I made a copy! I expected to see it right there.” Uh uh. We’re alphabetically sorting, and there is copy of Orion summary home, so now I can edit that view.
Bring in a number and then underscore.
Because that’s what I do. That’s just my thing. If you ever find a SolarWinds installation that says 01.. You know that probably Leon was there at some point. So 01_Lab New Summary View. Also important, and I forget this as often as other people do, click update.
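Editor's note: the reason the copy "disappears" and the reason a numeric prefix helps both come down to sort order. Here is a small illustration, assuming the Manage Views list behaves like a plain case-insensitive alphabetical sort. The function name is hypothetical and the view names are examples taken from the demo.

```python
def view_list_order(view_names):
    """Sort view names the way a plain case-insensitive A-Z list would."""
    return sorted(view_names, key=str.casefold)


views = [
    "Orion Summary Home",
    "Copy of Orion Summary Home",
    "Application Summary",
]
# Rename the copy with a numeric prefix, as done in the demo.
renamed = [
    "01_Lab New Summary View" if name.startswith("Copy of") else name
    for name in views
]

print(view_list_order(views))    # the copy sorts under "C", not next to "Orion"
print(view_list_order(renamed))  # digits sort ahead of letters
```

With the prefix, the custom view lands at the top of the list instead of somewhere in the middle.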
Yeah if you click done down here it won’t pick it up.
It will not. It will still be called “Copy of summary view.”
But it’s also because what it basically does, it changes the name and then it reloads the page so that it wires up all of the controls here for editing the individual resources that are assigned.
Now I could go through here. I could add resources to this screen, which is fairly easy to manage, you can add new elements and things like that, or I can go into preview mode. And if I go into preview mode, I’m looking at this new view, and now I can do the same thing I just showed you before. We can go into edit mode, I’m going to get rid of the PerfStack widget, because the group that is going to have this, doesn’t want to see PerfStack. So I’m going to get rid of that. I’m going to get rid of the support devices, just because. I’m going to get rid of the map, and I’m going to move the pending approval list over here, even though it’s empty for the data that I have. And I’m also going to move end of support devices there. There’s the changes I want to make. OK. Done editing. Now that’s it. That’s all I have to do. Believe it or not, I’ve actually been moved to another tab, the original is still here and I can hit done, and all the changes that I made before, those are captured.
And the other thing that you ask a lot about, is how do you enable left-hand navigation, for really complex pages, and then how do you turn on NOC views. The easiest way, there’s a checkbox right up here, for enable left-hand navigation, this one even is saying “Wow you’ve got an awful lot that you might want to do.” so if you want to add custom tabs to a page, that can jump out to a particular view, you can do that too. It’s assigning it to that view. And then right down here on the bottom left-hand corner, NOC view. So that’ll create a NOC view where it’s going to automatically roll through the different views.
Slide show style.
Slide show style.
Mm hmm. All right so that’s what I want to do. You’ll also notice that when I refresh the page, I hit enable left-hand navigation, then you saw the items in column three change from what I had changed before. Little bit of a panic moment when you come back to this screen is “Oh it’s not there what do I do?”
It’s the refresh.
It’s just the refresh, exactly. So I hit done. Now I’ve created the view, but how do I assign it to a user or a group? That takes us back to our user permissions. So back here at users, I’m going to edit Knights Who Say “Ni,” this user needs to see that special view that I just created, so I’m going to go back into edit.
They need to see another hedge.
They need a hedge [laughs] that’s exactly.. And a little herring, and all of that. So scrolling down, which views get used for which.. So for example, Orion Summary Home. Again I went back and said “How do I know which view is being used for what?” Well now when I look at the user I can see.
And it says “This view is displayed immediately after the user logs in.”
Right. So if I wanted that one I could go to.. 01 Now this list happens to not be alphabetical. I know, you’re welcome to talk to the UX team about how things enumerate and stuff like that, but.. And while you talk to them ask them about, Dark Theme. Just saying.
But I think this is actually the default ones come first, and then the customs come after that.
That could be. Yeah, absolutely. So there, I can set that. But if you ever want to know, what is the default summary view for other things, in each of the sections, for Server & Application Monitor, for Cloud Monitoring Settings, each of those has your different views. So the application summary view is using the application summary, and so on.
And it’s interesting right? Because at SWUGs, we get a lot of feedback from you, where you’re using these custom views and groups together for things like different types of teams, right? So that if you are a part of the database team, you get a customized database overview page. If you’re part of the networking team, you’re going to have a different logon. If apps or virtualization.. That it’s a great way to make sure that when you’re inviting people on to your server, they feel like it’s personalized just for them.
Right, and also you can combine account limitations and view limitations. So an account limitation is where when you log in with this account, or group, you only see devices of this type, devices with this custom property, etc. That’s one thing.
That’s across all your pages, that’s the limit of what you’re going to be able to see.
Reports, everything is limited to that, but you can also set up views. Same view, but one view that shows your storage arrays, and one view that shows your virtual machines, and one view.. It’s the same view, but you’re only showing certain device types, or certain categories or what have you. You can use those in combination as well. There’s another really simple, but again, a few steps, have to walk through it once thing that we get asked about a lot, and that’s universal device pollers.
It just never gets old.
It doesn’t. Well they’re insanely useful. I have a special one that I want to bring up in here in a minute. But before we go in, I want to clarify one thing, which is the difference between a universal device poller, and a custom poller. OK now custom pollers came in in NPM version 12.1, they’ve been around for quite a while, however, custom pollers are when you have a standard statistic that you want to collect on. CPU, RAM, what have you, but instead of using the object ID that we are using that’s built in to NPM, you want to use a separate one. Maybe you don’t like the way that we’re calculating RAM, a lot of articles on THWACK about that, or maybe you want to use a different CPU metric that the vendor is providing special because of that kind of hardware or whatever. That is what a custom poller is for. A custom poller lets you reassign the CPU, the memory, the operating system, the device type, polling from a different object ID.
But from a subset of elements, not just anything that you can get to with an OID.
Correct. Exactly. And you don’t need to set up a universal device poller, which we’re going to do in a minute, you don’t need that to do a custom poller.
Because you do a custom poller on the web interface.
Right. They’re two separate things. So now that I’ve clarified that, I want to look at universal device pollers. So one of the first questions we get when it comes to universal device pollers is “Where do I find these OIDs? Where do I find these object IDs?”
Well they’re all in Google.
Yes everything’s in Google.
Everything is in Google. Yeah I mean, that’s how I usually find them, because I know the vendor, I usually know the piece of hardware that I’m trying to connect to, and I have an idea about what I need to monitor, what piece of data I’m trying to pull back. And so that’s just a great way to just search for that based on this criteria, and it will take you to a page that very often will have details about what that thing actually is returning, how it is.. Is it table data that comes back, is it individual poll element? And so that really sets that up to make it easy to configure. But really? I end up using the SNMP Walk executable a lot. Because a lot of times I’m connecting to something really really weird, that may or may not have a page, and so I may need to just explore that thing in real time, walk through the MIB.
The entire set of output, right? And just for those people who aren’t familiar with it, it literally asks for every possible numeric combination of OID that could answer, and it says “I’ve got a response from this one, this one, this one, this one, this one.”
It’s a ping sweep for MIBs.
It is exactly that. And what’s nice is if you know you’re expecting a particular kind of value, it’s going to be this word, it’s going to be this number, it’s going to be this range, you know what to look for. It outputs into a simple text file, you can search through it that way.
And gives you sample values.
Right. Exactly. So that you can say “Oh it’s this one!” or “It’s not that one, it’s this other one.” So that’s some of the ways that you can find it. Now I want to set up a couple here, just to walk through the process. The first one I’m going to do really, really manually. We’re going to create a new universal device poller here. So in this case I know the number. I’m not even sure–
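For anyone who wants to search a saved walk outside the tool, here is a minimal Python sketch of the "look for the value you expect" step. It assumes the text file uses net-snmp style lines (`<oid> = <TYPE>: <value>`); the SNMP Walk executable mentioned in the session may format its output differently, and the sample lines and function names below are made up for illustration.

```python
import re

# Assumed line format (net-snmp style): "<oid> = <TYPE>: <value>".
# Other walk tools may write their text output differently.
LINE_RE = re.compile(
    r"^\.?(?P<oid>\d+(?:\.\d+)+)\s*=\s*(?P<type>[\w-]+):\s*(?P<value>.*)$"
)


def parse_walk(text):
    """Return a list of (oid, type, value) tuples from walk output text."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((match["oid"], match["type"], match["value"].strip('"')))
    return rows


def find_values(rows, needle):
    """Return the rows whose value contains the text we expect to see."""
    needle = needle.lower()
    return [row for row in rows if needle in row[2].lower()]


# Hypothetical sample output, not captured from a real device.
SAMPLE = """\
.1.3.6.1.2.1.1.5.0 = STRING: "lab-router"
.1.3.6.1.2.1.25.4.2.1.2.1 = STRING: "System Idle Process"
.1.3.6.1.2.1.25.4.2.1.2.4 = STRING: "System"
"""

if __name__ == "__main__":
    for oid, kind, value in find_values(parse_walk(SAMPLE), "idle"):
        print(oid, kind, value)
```

Searching for a word, number, or range you already expect narrows a long walk down to a handful of candidate OIDs, which is the same "Oh it’s this one!" check described above.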
You got a very specific thing that you’re looking for.
Yeah, exactly. By the way, in the THWACKcamp 2018 session How to Monitor Like a Network Engineer When You’re a SysAdmin, Destiny goes through the whole concept of what these numbers are and what they mean, so I’m not going to belabor this Lab with it. You can reference that, that’s in the show notes. So I’m going to go to 1.3.6.1.2.1.25.4.2.1 and you can see as I’m doing this that it is giving me the name of it. This is the HR software run name. I could have done a lookup of that, browsed the MIB tree if I wanted to do that. But it just fills it in for me. Sometimes you’ll do it and it will come up with nothing, which means this is not one that we have in our MIB database, and that’s OK also. Even if it’s not in the SolarWinds MIB database, we still can poll values and get numbers back.
And the other question we get is “How do I get custom MIBs added to the MIB database, so I don’t have to go and manually add them?” And the answer is, if you go to support.solarwinds.com and click on NPM, there’s a lesson on how to submit them, and then we roll them into releases periodically.
About once a month. Yeah, almost. It happens really frequently. So here it is. There’s the value, I’m going to leave everything else the same. I’m going to put it in a different group. So the group is also another one. So in here I want to put it into Lab group, like that.
You’re creating a new group.
Creating a new group, which actually appears over there on the left. And I want to test it on something, so I’m going to go test it on a Windows box.
Since you know exactly what it is, I’m assuming that you know that it won’t necessarily poll data from everything.
It’s going to poll some very specific kinds of data, which is kind of fun. So I’ll use.. That’s fine in the magic db.. Back, and I’m going to hit test. Give it a second to poll those values.. And there we go. And this is an interesting one, System Idle Process, system service host, what this is, what all of the items in this MIB block are, is the SNMP version of Task Manager.
OK, the running tasks.
I can get the running tasks, the amount of memory they’re using, the PID of them, and so on, and I can actually re-create that on my Orion page, on the node details page, strictly from SNMP if I wanted to, which is kind of cool. Now … also a little esoteric.
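To make the "SNMP version of Task Manager" idea concrete, here is a small Python sketch that regroups table-style results into one record per process. It relies on the standard Host Resources MIB layout, where hrSWRunIndex and hrSWRunName are columns 1 and 2 under hrSWRunEntry (1.3.6.1.2.1.25.4.2.1) and the trailing number identifies the row. The sample values and the function name are hypothetical, and this is not how Orion builds the table internally.

```python
from collections import defaultdict

# hrSWRunEntry from the standard Host Resources MIB; columns hang beneath it.
HR_SW_RUN_ENTRY = "1.3.6.1.2.1.25.4.2.1"
COLUMNS = {"1": "index", "2": "name"}  # hrSWRunIndex, hrSWRunName


def build_rows(varbinds):
    """Group (oid, value) pairs into one dict per table row, keyed by row index."""
    rows = defaultdict(dict)
    prefix = HR_SW_RUN_ENTRY + "."
    for oid, value in varbinds:
        if not oid.startswith(prefix):
            continue  # not part of this table
        parts = oid[len(prefix):].split(".", 1)
        if len(parts) != 2:
            continue  # no row index after the column number
        column, row_index = parts
        if column in COLUMNS:
            rows[row_index][COLUMNS[column]] = value
    return dict(rows)


# Hypothetical poll results, shaped like the test output shown in the session.
SAMPLE = [
    ("1.3.6.1.2.1.25.4.2.1.1.1", "1"),
    ("1.3.6.1.2.1.25.4.2.1.2.1", "System Idle Process"),
    ("1.3.6.1.2.1.25.4.2.1.1.4", "4"),
    ("1.3.6.1.2.1.25.4.2.1.2.4", "System"),
    ("1.3.6.1.2.1.1.5.0", "lab-server"),  # unrelated OID, ignored
]

if __name__ == "__main__":
    for row_index, row in build_rows(SAMPLE).items():
        print(row_index, row)
```

Each row index ties the columns for one running process together, which is why this kind of poller comes back as table data rather than a single value.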
A little. But I think this whole episode... never mind. [laughs]
So it is. However, there’s one that we get asked about a lot, actually two that we get asked about a lot. Battery and temperature.
So I want to take a second and just look at one of the normal ones. Like I said, battery and temperature. So in the OIDs we actually have some of the standard ones, and one of them is the Cisco fan state, supply, temperature status value. Temperature status value, so it’s already in here. So now what do I do? If it’s already in, how do I get it working on things? I right-click, and I choose Assign. That takes me right to the place I was just a second ago, as if I had just created it. In this case I’m going to assign it to this 2821, and once again I want to test it.
You can test it against multiple values.
I could do bunches of them. And it comes back 24, that’s a really cold router.
We’re over-clocking that, so it’s liquid helium cooling.
Oh excellent! No it’s not!
It’s a frost problem.
It is not a frost problem. It’s in Celsius! That’s why.
Oh well then we’ve got to convert that.
We do. And so there’s a way to do that. So first thing you should know, is that when you have a more regular OID, you just apply it, you hit finish, it’s applied, it begins collecting data and so on. However in this case, this isn’t the number we want, I need to do that conversion.
Our office in Cork would like that.
They would appreciate it, and yet I still have a hard time knowing, is that bursting-into-flame hot? Or just kind of warm? So in this case we want to create a transform. So I’m going to open up my transform results, little bit of an informational about what this is. I’m going to call it labtempchange. Notice that I keep on putting underscores, and it keeps on bouncing me back to the beginning, you can’t do underscores, you can’t do special characters. You can do capitals. LabTempChange, I could give it a description, the group I want to put it in. I want to put it in the Leon Linux group, just because I feel like it. The polling interval is the default, which is five minutes, or I can set this one to be higher or lower. I’m going to leave it at the default. And here we go with the formula, and again, if you’re not familiar with this, you might have a bit of a panicky moment. But you don’t have to.
Because there’s a link right there, and there’s some tools to help you do this.
Mmm hmmm. First thing I’m going to do is add a function. And we actually have built-in functions. It’s worth taking a few minutes and exploring them. I want to convert Celsius to Fahrenheit, there’s my CtoF function, and if I click in the middle and say “Add a poller” Cisco environment monitor temperature status value, I stick that in there, next. Now I just created a new conversion… 75.2!
So the answer is “Room temperature.”
Mmm hmm. It’s just fine, for my data center we’re within the boundaries of that. So those universal device pollers, again there’s a few moving parts to it, but once you’ve gone through it once, you realize how amazingly simple and amazingly powerful this feature is.
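The arithmetic behind that transform is the standard Celsius-to-Fahrenheit formula, F = C × 9/5 + 32. The Python sketch below only reproduces the math for the 24 degree reading from the demo; the function name `c_to_f` is made up here, and how the built-in CtoF function is implemented inside the product is not shown in the session.

```python
def c_to_f(celsius):
    """Convert a Celsius reading to Fahrenheit using F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


if __name__ == "__main__":
    raw_reading = 24  # the value the Cisco temperature poller returned in the demo
    print(round(c_to_f(raw_reading), 1))  # 75.2
```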
And the whole point of this then, is going to be “How do I display that?” right? And we’ve done a couple of Lab episodes talking about custom values, so that is, you can generate charts, you can throw those into tables, there are resources that will do that for you, and it’s really easy to do that.
Right. On this screen here, do you want to display results on your Orion website? In some cases you don’t. In some cases you want to collect the value, but you’re not going to display it on the page.
Yeah like get to know the data first, before you promote it to the front page.
No, because you’re using it for an alert, or you’re using it to feed another one into a transform, or whatever. [exasperated sigh] Anyway, yes I want to display it. Do I want it in a chart, or a gauge like those little speedometers, or do you want it in a table? Notice that some of these are not selectable because that data is not table-based data.
But then also where do you want to put it?
And what page do you want to put it on? Exactly. So I can select even the custom views that I created before. I can specify where that’s going to appear.
And it’s one more reason that you want to get started with custom views first, right? Because that really is the foundation here. In this case and some other places, where you’re assigning data that you’re generating, here from the UnDP, the page is already there. You’re not going to have to go back and add it again later. You can add this as a custom resource, but in this case, you’re essentially pushing that resource to the page. So by having the page there first, it just gives it a place to land.
Exactly. And just like we saw at the very beginning, where resources were blank or invisible because they had no data, there’s that little check box at the bottom, “Do not show this poller if it is not assigned.” Click Finish, and you’re ready to go. And you can keep assigning it now. I only assigned it to one device here, but I can assign it to more now.
You know what we ought to do last? You are always asking questions about what you can do with alerts on virtual machines, or containers, or cloud resources, right? So this is not everything you can do. I think the questions are typically more specific, about “what can I do in terms of automation, and how much of it do I actually have to code up?” right? So how much of it might be a script action, versus something that’s a built-in action inside of alerts. So let’s walk through a really simple example, and then we’ll kind of extend it into a couple of other areas. So the first one is, let’s just do one that is a stupid simple alert for something that you would never do, which is, I don’t know, automatically bounce a VM every time its memory hits a certain point. So imagine that you know that there’s a process in there with a leak, you are trying to get it fixed, it can’t, and so rather than you logging in and doing it all the time, even though it’s in production, you’re just going to go ahead and set a threshold, and let that restart itself.
Just bounce it all the time. Because you know, what’s a bouncing server among friends?
Do not do this for real. But we’ll use this as an example. And also don’t use default out-of-the-box alerts. Those are there for you to learn. And we’ve done Lab episodes before to talk about that, we’ve done THWACKcamp sessions on that. So again, alerts, like “How do I Hate Thee?” The point is the ones that ship out of the box are to learn how to use them. To get something on and get some experience. But spend some time on these, because one, you can do a lot more with them, and two, you can cut down a lot on alerts that would otherwise go to a folder and will not be of use to you. So for this first example, let’s do one for virtualization, right? So here we are inside the manage alert view, which you get to, again, through Settings, All Settings. Or if you’re linking an alert, there’s a link there too, right? But I’m just going to say “Add new alert,” and we’ll build one from scratch. And I know this is a lot of steps, but it actually is pretty straightforward, right? So the first one here is “Do not try this at home,” and “Do not try this at home.” I always put in descriptions for views, alert definitions, anything like the UnDP, because you will eventually come back and not remember what that was.
Yeah. Future you will thank you.
That’s right. So this is going to be, again, based on the severity of the alert that I want it to raise, and then the next thing I’m going to set is my trigger condition, right? So that’s the thing that’s going to fire the alert. The first thing? What am I going to set that alert on? So when you were talking about things like containers, or VMs or UCS tunnels, when you scroll down here, this is going to be based on the modules that you have. So for example, if I want to do this based on, oh I don’t know, a virtual machine. It is now going to pull up a different set of objects and conditions that I can apply. So the first one is, well, do I want to apply this to all of the objects in my environment? And no, I don’t. Because that would just be bad. So what I’m going to do is say if–
Convert all the servers.
You reboot anytime the memory pegs for just a microsecond. Just reboot the production systems, especially databases. If you can, reboot your–
Tom. I can hear him running in here to beat you senseless now.
Mm hmm. So if the name of this virtual machine is equal to “BadVMbad,” then we’re going to fire this, right? So then the actual... This is going to limit the objects that we’re firing on. So the next thing is the actual trigger condition, right? So we’re going to trigger on the virtual machine, in this case I’m going to say, if the... Hmm, you know, I don’t see the one here. So let’s do a quick search for memory. Search for mem. We’ll just do this by percentage, right? If it comes up, so we’ll just come down here and say percent memory used, on this node, and I’ll say select is greater than... 90%, and then I’m going to set how long I want this condition to exist, right? Because if it peaks for a minute, that’s cool or whatever. And it shows me here on the trigger alert actions, with the little sine wave and the clock, to tell me that that’s what’s going to trigger it. And you know what? I’m OK with this taking 30 minutes. Right? Because I’ve been watching this thing for weeks, I know that it takes it a while to go bad, and sometimes it magically collects garbage, and it’s all fine. But this is when I manually have to go in and intervene. So then I’m just going to say “Next.” And then we’re going to figure out what my reset condition is. Now on this one, I don’t want a reset condition. I want this thing to fire, to stop, and then make sure that I go in and manually reconcile it. Right? Because it’s still... It restarted and I’d like to know. Or maybe I want to look at the number of counts or restarts for that, right? So I’m just going to say there’s just no reset action. This can’t be reset except by manual means. Is it enabled all the time? Well, maybe I want to specify like when I’m, oh I don’t know, asleep versus not. So I can do that. So then the next thing, of course, is the trigger action. So what are we going to do based on that action? And this is where this gets a whole lot more powerful.
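The "greater than 90% for 30 minutes" condition can be pictured as a simple check over polled samples. The Python sketch below is only an illustration of that idea under assumed five-minute samples; the function name and the sample data are hypothetical, and it does not claim to match how the alert engine evaluates conditions internally.

```python
from datetime import datetime, timedelta


def sustained_breach(samples, threshold, duration):
    """Return the time the alert would fire, or None if it never does.

    samples: list of (timestamp, percent_memory_used), ordered by time.
    Fires at the first sample where the value has stayed above the
    threshold continuously for at least `duration`.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # start of the current run above threshold
            if timestamp - breach_start >= duration:
                return timestamp
        else:
            breach_start = None  # dropped back below threshold; the clock restarts
    return None


if __name__ == "__main__":
    start = datetime(2019, 1, 1, 2, 0)
    # Hypothetical percent-memory-used readings, one every five minutes.
    readings = [85, 92, 95, 88, 91, 93, 94, 96, 97, 95, 98]
    samples = [(start + timedelta(minutes=5 * i), v) for i, v in enumerate(readings)]
    print(sustained_breach(samples, threshold=90, duration=timedelta(minutes=30)))
```

The short spike early in the sample data never fires because the reading dips back under 90 before the 30 minutes are up, which is the "if it peaks for a minute, that’s cool" behavior described above.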
Right. And in this case, a whole lot more stupid. Because we’re rebooting the server. But it’s OK. You do you. That’s all.
If you were not customizing the message that you were sending as a part of that, you definitely should. We’ve done whole Lab episodes on that. They start with clicking on the insert variable tool here, to be able to pull those out. For common ones, there are a ton of things that you can do. Customer Success Center has lots and lots of help on this. But I’m going to leave it default. And so the next thing I’m going to do is I’m going to assign an action, right? So I’m going to say “Add action.” Because this is coming from a VM-based alert, some of the actions that are going to come up here are based on VMs. So instead of you having to pick from a very, very long list, especially if you have lots and lots of modules and stuff, it’s going to go ahead and narrow those down for you. So in this case, oh I don’t know, what, we’re going to reboot?
That was what you said you wanted to do.
I still think that’s a terrible idea.
Yeah and we’re going to click Configure Action.
Now if we’re talking about something like a container, or a cloud resource, or anything else, it would be really similar in that that list of actions would actually be kind of pre-curated to make it easy to pick the ones that are going to apply. Where these get a lot more interesting, is if you look at something like system configuration management right? So in this case, this is data that’s coming from Server Configuration Monitor, right? So SCM. And I’m looking at configuration changes, and it would probably be nice to get an alert if we drifted away from our baseline config on a server. So if we look up here, oh what’s this? What’s this HG Destiny machine?
She may have beaten us to the punch!
Yes, probably did. And you can bet it’s going to be a security configuration, and there it is, right? So if I want a… This was an initial snapshot right? So it’s telling me that there was a change. But if I want to get an alert on that, what does that look like? We would go through the same steps that we saw the last time, only when you get to the alert section, it’s going to look a little bit different. And probably the easiest way to get to this, because the alert action list is pretty long, is just to use search right?
It’s comprehensive, right? So I’m going to search for config here, and what do we see? Server configuration differs from baseline. And the reason I pulled this one is that this one is sort of pre-configured. And you’ll see some of these that are the sort of default, out-of-the-box alert where you can only change so many things, especially where, you know, that config is like a diff value, so it’s not sort of a regular polled value that you would use the regular property explorer for. So in this case it’s going to preset some of those, but then it’s also going to let me set the trigger actions. And of course in this case, this one is, you know, basically just create a log message and send me an alert, but you, being Leon, definitely do not like to deal with too many alerts, or at least like to have a little bit more sophisticated escalation policy.
I never want to have a human to do something that a computer could have done first.
Yes. And you also have a great habit of naming things what they are, so they do what they say and they say what they do. So if we take a look at your multi-step action alert here, we’re going to start of course, with the alert properties, how often the condition needs to exist before it fires, and we’ll just jump over here straight to trigger actions, because again, that’s what happens when the alert fires.
Right. And it doesn’t really matter what the trigger is for the purpose of this conversation, and it’s got multiple actions it does.
And look what you’ve done here right? You’ve got one, two, three, four. Four levels of escalation. So what are you trying to do here? Because you’ve set wait times in between, so what’s happening?
So first stage, immediately after the alert triggers, and we see that it’s really a problem, the first thing it’s going to do is run this program called restart server.pl, because it’s a Perl script, because of course I wrote it in Perl. So it’s going to try that. And let’s say that that doesn’t fix the problem. That problem, whatever it is, persists for another ten minutes. The next thing it’s going to do is a vMotion. It’s going to actually move that virtual machine over to another ESX host. And if that doesn’t work, then the next thing it’s going to do is actually reboot the machine, because obviously moving it didn’t work. Now rebooting makes a little bit more sense.
And finally, if that doesn’t solve it, ten minutes later, now we’re going to call on the big guns, now we wake up the human at two o’clock in the morning, because only a human has the comprehension and understanding.
Sev one or two ticket and open the ticket.
So basically what you did here, was you took my super simple example of, let’s just automatically reboot that, and you’re actually using different actions that are available on that VM right?
So just kind of, we’re going to tickle it and see if we can get it to behave, we’re going to restart the process within it, and then if that doesn’t work, last case we’ll reboot it, and along the way, each one of those actions as it fires, is going to leave a trail so that we can go back and audit later to figure out what steps actually remediated that to get it working again.
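The four-level escalation described here can be sketched as an ordered list of steps, each with a wait time measured from the moment the alert triggers. The Python below is a hypothetical model only: the step names paraphrase the conversation, the ten-minute spacing is assumed for every step (the session states it for only some of them), and nothing here calls a real Orion or VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    delay_minutes: int  # minutes after the trigger before this step runs
    action: str


# Hypothetical ladder modeled on the four levels described in the session.
# The even 10-minute spacing is an assumption for illustration.
LADDER = [
    Step(0, "run restart script"),
    Step(10, "vMotion to another host"),
    Step(20, "reboot the VM"),
    Step(30, "open ticket and page a human"),
]


def actions_fired(minutes_active, ladder=LADDER):
    """Return the actions that have run if the alert stayed active this long."""
    return [step.action for step in ladder if step.delay_minutes <= minutes_active]


if __name__ == "__main__":
    for minutes in (0, 12, 25, 45):
        print(minutes, actions_fired(minutes))
```

Because each step only runs while the alert is still active, a problem fixed by the first script never reaches the reboot or the 2 a.m. page, and the list of fired actions doubles as the audit trail mentioned above.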
Exactly. OK, so we covered that. We did that. We talked about that one. We talked about that.
Yeah VM actions right exactly. So that’s where we are.
OK well that’s a lot, but that isn’t even close to everything.
No it’s not. I’m willing to bet a few more came up in chat also while we were doing this.
OK so does that mean that we’re going to have to do another one of these? Is it from the mailbox, from the mailbag, from the inbox?
And from the inbox.
OK, so it’s sort of a from-the-inbox session. And we’ll do another one soon.
Right. And I think they’ll be looking forward to it. As always please let us know what you’d like to see on upcoming episodes right here at lab.solarwinds.com OK? All right, that’s a wrap. For SolarWinds Lab, I’m Leon Adato.
And I’m Patrick Hubbard, and thanks for watching.