In this episode, Destiny Bertucci and Patrick Hubbard cover the best ways to manage your monitoring system based on scalability and performance. Learn how to create and maintain a security process within your organization, starting with a few basic security best practices.
Hello and welcome to another episode of SolarWinds Lab, Breaking up with Bad Habits: Monitoring, Security, and Orion Don’ts!
Right, sometimes it’s really hard to let go of bad habits. Especially for monitoring and security, where everything is mostly okay every day and you’re just kind of, you know, thinking about distractions and it’s just kind of going along.
Well, it sounds like you’re talking about relations in general, right? Everything just kind of seems to be going along fine and then you end up in a surprise fight over a couple of bad behaviors. You know, kind of roommate stuff.
Exactly. If you could make the team more efficient and happier with just a couple small but important changes, you’d do it, right?
Yeah. Except I think I see where you’re going here. You’re dovetailing into Leon’s THWACKcamp session on how to love your monitoring.
Actually, it was “Rekindling the Spark,” but yeah. That and Runbooks.
Oh, that’s true. And Runbooks, you had me at “hello.”
All right, so many times we put things off and don’t even realize how badly we’re addicted to monitoring incorrectly. We have to put a stop to this, especially since we’re now past the New Year’s resolution period. First, let’s get a plan started, and break up with those bad habits in monitoring. What are some of the top improvements that you can make right now?
Well, I know I certainly have a few that I don’t want to let go of. So after we cover that, security seems to be the other thing that we say that we’re going to fix every year. This is another new year and then we still end up with bad habits galore. So I think there is a way of approaching that. You know, we can think about standardizing security needs, creating baselines, and then setting some realistic objectives.
Right, and of course we’ll show you where to fix bad habits, even in Orion as well.
Well, nobody has bad Orion habits. Okay, well I’ve been known to take a shortcut from time to time.
But that’s why you’re an IT geek. Just like our audience, right? So you want to have that time to play, too. That’s completely natural. However, we will help them with the top monitoring security don’ts, then you show them the kind of fun you can have making free time with better habits.
Oh, like a reward. Like integrating Orion with Alexa, for example.
Sure. But does that mean you’re going to talk about SWIS again?
Dude, that is a terrible habit.
Okay, so bad monitoring habits. Let’s start with that. None of us have any bad habits at all ever, no shortcuts. There’s never a million overlapping things that we have to take care of and we forget to do the basic stuff. You’re making a face.
So in my 11 years of just working for a monitoring company, I would have to say that, even myself, and some very highly-trained people with monitoring, pretty much everyone has at least one bad monitoring habit.
Okay, give me just a super easy one that people forget about.
Let’s see, a really easy one to forget about is having numerous accounts that you’ve set up. Maybe you’ve changed programs, moved on, never got rid of them, but they’re active accounts with information, right?
Mm, so you recommend that people check out the accounts?
Yes, so let me show you how easy it is, and we’re going to use Orion as an example. But something that everybody here should realize is that this is with any monitoring system. We don’t want you to feel like we’re just, you know, telling you this because it’s in Orion. This is monitoring for anybody.
These are don’ts that apply to everyone.
Right. So what we’re going to do is we’re going to show you, because obviously, we know this platform that we’re going to do, but you’ll get the gist of it. So no matter what you’re doing, you’ll have the best habits.
All right, so when we go into here. I’m going to go into Settings, I’m going to scroll down to where I can actually use the user accounts and I’m going to hit the account list. Now the reason why I’m handling the account list in our, obviously with our product, is because account list is going to list them all out there so that I can actually see everything across them, versus just managed accounts with the names and, of course, things that are happening with them.
So that’s why I’m moving straight to the account list. So it sets them apart and it goes across here and as you see, I put one on here, ‘Why Me,’ right?
So there’s no expiration on any of these four accounts that we have in here.
So then, on top of it, we see that this Guest account, they don’t have any of these rights that are coming through here. But these do have them, so the ‘Why Me’ for example, is already lined up to be an Admin. They have everything available, but it’s never logged in.
Mmhmm, that one we can probably get rid of. It’s got the created date, right?
Right. So it’s something that’s never been logged in as, nobody’s been using it, but yet we know that it’s enabled, we can see that, we know that it has Admin rights. Why is that there? So that’s my ‘Why Me’ account, right? So you would literally be able to go across here and be like, all right, what’s being used? Last time it’s been logged in. And I’ve been actually on some of the people’s actual monitoring servers and you’ll see things that haven’t been logged in for like four, five years.
And sometimes those employees are not there.
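The account review described here can be sketched in a few lines of Python. This is only an illustration, not Orion's own tooling: the `Account` record and its field names (`enabled`, `is_admin`, `last_login`) are hypothetical stand-ins for whatever your monitoring system exports from its account list, and the 365-day idle cutoff is an arbitrary assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Account:
    # Hypothetical fields; map them to whatever your account list exports.
    name: str
    enabled: bool
    is_admin: bool
    last_login: Optional[datetime]  # None means the account has never logged in


def audit_accounts(accounts: List[Account], now: datetime,
                   max_idle_days: int = 365) -> List[str]:
    """Return one finding per enabled account that was never used or is stale."""
    cutoff = now - timedelta(days=max_idle_days)
    findings = []
    for acct in accounts:
        if not acct.enabled:
            continue  # already disabled, nothing to review
        role = "admin" if acct.is_admin else "non-admin"
        if acct.last_login is None:
            findings.append(f"{acct.name}: enabled {role} account, never logged in")
        elif acct.last_login < cutoff:
            findings.append(
                f"{acct.name}: enabled {role} account, idle since {acct.last_login:%Y-%m-%d}"
            )
    return findings
```

An enabled admin account with no login at all, like the 'Why Me' account in the demo, would be the first thing such a check reports. Disabling first and deleting later, as suggested in the episode, keeps the review reversible.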
Right, and the other thing that’s really cool about this view is, when you go into the regular accounts, a lot of people never click on this.
I’m just amazed when we talk to people, is that the main dialog is designed for making changes. This is just to read it, but when you get into all the tabs, especially when you have multiple modules installed, you have to expand them, right? Because there are detailed settings on each one of them, but here they’re just listed out. So you can just go side by side, and when there’s actually more accounts, it’ll scroll across. But it’s a really easy way to see that or to compare and say, “What is an Admin?” Or “What is a Guest?” or “What do Geeks do?”
And I’m just going to ask you, so if you’re using Windows authentication, what do you see in this view?
So if you’re seeing– you’ll actually see the Windows account and it’ll tell you the same things, last login, things of that nature.
Or you’ll see the role.
You’ll see the role of what should they have.
And that’s the thing where this view is particularly useful, because if you’re using Windows authentication, and you’re binding to roles, that’s really where you get into that kind of security sprawl. Where it was a temporarily assigned role, you did it the right way, right? You weren’t using a—you weren’t cheating. It wasn’t a bad habit, but then you assigned it to somebody, or maybe created an extra role, just for an edge case and then just forgot about it and you never come back.
Test case, right? So there’s tons of like different test scenarios, cases that are out there on a configuration side if you have NCM, for example. Well, that has access to a lot of devices, credentials, things like that. But you’re testing a scenario and you’re trying to test a server, maybe limiting down a team or things of that nature. So it’s a lot easier.
Limiting down a team. That could work a couple of different ways, right?
But yeah, so I just love this view for this. This is a great way to do it and a lot of fun for guys who have never seen it before.
Right. Now, what I like about this for Orion especially is that when you’re coming into here, if you’re going into the manage accounts and you need to like trim these out or you need to turn them off, or maybe you just want to disable them until you can figure out, can I cut this? Is this something that needs to happen? You can actually, from this screen, click, since this account is enabled, I can actually click on it from here without going into the account and actually make it not enabled any more. So I disable it by doing this.
Yeah, and there’s a key down at the bottom that actually shows you what these different statuses mean, and you can do that for sub-elements too. So this is just a really handy screen for everything.
And it’s quick. So for New Year’s resolutions–and you’re out here and we’re trying to get you on track. And a lot of people just don’t want to go in and out of the accounts and do the audit, right? Use the account list and you’re able to come across, make the audits directly from here without going individually, make comparisons so you’re not going out of one screen into the next and then finding out when you refresh that it’s reverting some of the other ones. So it’s a quick and easy way to get a hold of it and create a new good habit, by using the account list itself to be able to manage and kind of micromanage some of those accounts. And turn them on and off when you don’t need them.
Okay, bad habit number one: Leaving dead accounts out there. So occasionally review them using this view to make sure that you don’t have unused accounts or accounts with too many privileges. Okay, bad habit number two: Something around maybe killing people with alerts and not softly.
Right. So if you have a whole bunch of alerts that you’re not paying attention to– say I created an alert that said, ‘If a device is down, alert me.’ Well then, you have one that says, ‘Well, if a device is down in this department, alert me here.’ ‘If a device is down here, then alert me here.’
That’s three alerts.
Yeah, so there’s three different alerts that people are having to set up, or they’re thinking that they have to.
Or that you are getting.
So something that you can do is use custom properties like email, and then we’ll show you how you can actually do that.
And we’ve covered custom properties at SWUGs. We can talk about that. And also, there’s a THWACKcamp session out there, you should check that out too. We’ll throw the link up there.
So what I want to do is, I’m going to go to ‘Manage Alerts’ and what I’m going to do at this moment is just kind of look at one that’s already created. So this is in a component down. I’m going to edit this alert. I’m going to go to the ‘Trigger Actions.’ On this email, I’m actually going to edit it, and as you can see what we did, is we created— these are custom properties. Which, like you said, we’ve gone over these before.
These are custom properties: DefaultEmailTo, Default CC, Default BCC.
And the goal here, again, is to be able to change these values without kicking off the rule to however many people might receive it.
Glad you said that. That is a goal of it too. Because you’re able to change your email addresses on your other devices.
So that’s another bad habit right here, that we’ve eliminated, is hard coding our emails into alerts.
Yeah, so we’re trying to give you these really great habits now. By using these, you’ve already created this alert and this is done for the devices that are down. Now, what makes this handy, and a great habit, is that now that it is set up, I don’t have to go back into this when I’m adding devices. I don’t have to go back in here and adjust it when, say, we’ve added a new team. Maybe a new section of your company is coming into monitoring, and they have their own separate email address, so as you’re putting these servers online, you’re able to just fill in the custom properties, or import and export them, so that you’re able to do this on a mass scale.
So bad habit: Hard coding those values into every single alert; stop doing that.
Exactly. And I wanted to show you guys what that looks like because we’ve had it at SWUGs too, people wanting to know: “Well, what do you mean that you can just put this in the actual devices themselves?” All right, so I’m just going to pick any node and I’m going to hit ‘Edit Properties.’ I don’t have to go into the Custom Property Editor because it’s already been created on the backside. So I’m going to hit ‘Edit Properties’ on this node. And I’m going to scroll down, and you’re going to see that now we have the default email that we were talking about earlier. So I can actually put an address in here, just like you would with anything else, and then I can add a comma and add more to it, however you’re wanting to do this. So from now on, I can actually import or export my custom properties, update my email addresses, do whatever I need to do. I’m never changing the alert. I already have one alert that is a node down, and that’s the one that triggers, instead of having like four or five of those same alerts, which take up resources and hardware time and put load on your database, because it’s constantly querying and checking. So we eliminate all of that by actually going through here and having the email itself on one alert. So think about that. You have one for your nodes, one for your components, one for interfaces, things like that, for your downs. But get out of the bad habit of “I’m just going to create what they wanted.” Do a little bit of work, add the custom property, and then you don’t have to touch that alert again and you don’t have to recreate it for different groups.
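The idea of one alert whose recipients come from a per-node custom property can be sketched like this. It is a simplified model in Python, not the Orion alert engine: the property name `DefaultEmailTo` comes from the demo, while the node records, the fallback address, and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical node records: each node carries its own custom properties.
NODES: Dict[str, Dict[str, str]] = {
    "core-sw-01": {"DefaultEmailTo": "netops@example.com"},
    "app-srv-07": {"DefaultEmailTo": "appteam@example.com, oncall@example.com"},
    "lab-rtr-02": {},  # property not filled in yet
}

FALLBACK = ["monitoring-admins@example.com"]  # assumed catch-all address


def recipients_for(node: str, prop: str = "DefaultEmailTo") -> List[str]:
    """Resolve recipients from the node's comma-separated custom property."""
    raw = NODES.get(node, {}).get(prop, "")
    addresses = [a.strip() for a in raw.split(",") if a.strip()]
    return addresses or FALLBACK


def node_down_alert(node: str) -> str:
    """A single 'node down' alert definition; only the recipients vary per node."""
    return f"To: {', '.join(recipients_for(node))} | Subject: Node {node} is down"
```

Adding a team then means filling in a property on its nodes, not cloning the alert, so the number of alert definitions stays at one per object type.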
Awesome. All right, so that’s fixing an alert bad habit. Consolidating alerts. Let’s think about another bad habit, which is ignoring resources on the main machine.
Okay. That is always a good one for me, especially coming from the support side. Because with upgrades along the way and new additions and adding things within there, that’s when it kind of gets out of control, right? Because we’re upgrading, we’re busy, we’re just trying to get the features, maybe we’re trying to get things added in there. But it’s still on the same hardware. It’s still on the same thing that you created four years ago.
Unless you’re on virtual and you’re adding them to there. But you have to add them, right? You have to kind of give it a health check to get into there.
And I think one of the easiest ways to do it, and I’ll show you a quick trick. You know every time I mention this–this is funny– people will say, “Oh really? I didn’t know we did that.” Customer Success Center, if you haven’t been out there, is really, really cool. Customer Success, click right here. And then this is a bunch of training and guides and how-tos, but it’s all broken up so you’re not reading. I think, what are we up to? 19 hundred pages on the admin guide? You’re not going to do that. This one is broken down by product and it’s really easy to fix this. And whether it’s us or anybody else, look for these, more and more companies offer these. And it’s a handy way to get started. So what I do is every single time I do a major upgrade, especially since I have some instances that I use for testing that I’ve had around for seven, eight years now. Believe it or not, because I don’t want you to be the only one who’s doing that. I clicked on ‘NPM’ first, and then I click on ‘Getting Started.’ And there’s a page right down here– so these are getting started guides. Plan your SolarWinds NPM Production Deployment. Because this is a guide that’s really designed for folks that may be new to NPM and they maybe set up everything with the SQL Express server on one machine, and it’s not running really well. So this is to get them to have a production-ready system. But it’ll actually walk you through the guide the same as if you’re making that transition, but we effectively ought to review this every year. There’s a couple reasons. One, we tend to monitor more over time. And we should, it’s the whole point of having discipline about the way that you monitor. But then the other thing is that the applications themselves change on the requirements, right? Like NPM has become more efficient at polling over time. So it needs less resources. 
Now if you combine more efficient products or maybe more modules, now with more monitoring or changing monitoring, maybe switch from primarily network monitoring to application component monitoring–
Or both. Untangling that mix to figure out what the requirements would be if you were doing a new install from scratch can be kind of a trick. So here on this page, it breaks your deployments down. Small, medium and large. It’ll give you an example of the licensed tiers that you would typically be using or the number of elements that you’d be monitoring, and then what’s actually recommended, and in this case, it’ll even break it down for you. So the question is, “Oh, do you virtualize it?” Sure you do. And here’s the differences between if you run it on hardware or you run it virtually. So this is available for all the products. It’s updated fairly often, whenever there’s a change in the way the resources work. So definitely check out that page. I mean that is a great bad habit to get out of, which is never reviewing this. And the good habit is to go and review it.
That’s right. Now something that I really like about that page, especially though, is that if you have more than one product on it, NPM is the one that sits on the platform for Orion, right? So we want to look at that as a general basis, because that’s usually what they start off with using. However, when you start adding things like NetFlow, and Server Application Monitor especially, these have a lot of different types of monitoring capabilities. You bring in things like WMI, you’re bringing in extra Syslog, you’re bringing in NetFlows that are coming to the different pollers, you’re doing things. So like he was showing you, this breaks it down for the different components. Make sure you look at each one of the modules, not just picking one–each one of them. And you want to go with the highest tier of recommendation because you have that tier. You have that module. A lot of times people will just think, “Well, I have NPM. I’m just going to look at it. I’m going to get it started. I’m done. Now I can just apply all these modules on top of it.” You’ve got to look at their individual planning guides so that you have a smoother and much better upgrade process, or even just a better environment, right? A monitoring environment that you’re not getting frustrated with because you’re under-resourced for what you need.
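The "go with the highest tier" rule is easy to express in code. In this Python sketch the small/medium/large ordering comes from the planning guide as described in the episode; the per-module tiers passed in are hypothetical values you would look up yourself in each module's guide.

```python
from typing import Dict

# Ordering from the planning guide's small / medium / large split.
TIER_ORDER = ["small", "medium", "large"]


def required_tier(per_module: Dict[str, str]) -> str:
    """Return the largest tier recommended by any installed module."""
    if not per_module:
        raise ValueError("no modules given")
    # An unknown tier name raises ValueError via list.index.
    return max(per_module.values(), key=TIER_ORDER.index)
```

For example, `required_tier({"NPM": "small", "NetFlow": "large", "SAM": "medium"})` returns `"large"`: the server is sized for the most demanding module, not for the one installed first.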
Okay, so that leads us to another bad habit. Which is rather than do an analysis of my polling, I’m just going to change settings and see if I get better performance. And there’s definitely a good habit to eliminate that bad habit. And that is to actually look at your polling statistics.
Right, and what we’ve found out, especially being at some of the SWUGs, is that a lot of people didn’t even know that was available, that we’re actually already doing it. Now like we said from the beginning, most monitoring platforms that you’re using do this, so you need to figure out where this is. But we’re going to show you here, just so you can kind of get a grasp of how much we do on the backside to try to help you guys have great habits. So we’re actually going to go into ‘Settings, All Settings.’ We’re going to scroll down to ‘Details’ and we’re going to go into ‘Polling Engines.’
From here, you’re going to see the last database update, so this is good for like daily things. If you have a question about is it polling? Am I missing polling? Is there something? Because if you are low on resources, you can get behind on pollings and you’re not making those. So as a technical tip I guess you could say, you can come here and say, “When was the last time this has physically touched the database and been writing?” And it was 17 seconds ago. So then, you see Polling Completion. You always want this above 98%. So it hangs out sometimes around 99; you’ll see it do 100, 99, 100.
That could be a reachability problem or something that’s rebooting, or something else where you’re just not going to get poll back. That’s normal.
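The "above 98%" rule of thumb for polling completion can be written as a small check. This Python sketch assumes you have collected recent completion percentages yourself; the function name and the sample list are hypothetical, and only the 98% floor comes from the episode.

```python
from typing import List

COMPLETION_FLOOR = 98.0  # percent; the threshold mentioned in the episode


def completion_ok(samples: List[float], floor: float = COMPLETION_FLOOR) -> bool:
    """True if there is at least one sample and every sample is above the floor."""
    return bool(samples) and all(s > floor for s in samples)
```

A series such as `[100, 99, 100]` passes, which matches the normal wobble described here, while a single dip to 97.5 fails and is worth a look at resources or reachability.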
Well and these statistics are also in the database, right? So if you’re having a huge, like just a reflux of NetFlow or things like that that are coming into your database and we’re polling it back and forth, it’s going to be a little bit– you know you’re going to see that 99% there.
And then you’re going to get your volume, interface, node and element counts here. So again, when we saw in the planning guide, small, medium or large, that’ll tell you right there.
Exactly. That element count is key. Because that’s the things which we’re monitoring, especially with NetFlow. And that’s the device counts that they like to talk about. And a lot of people used to have a problem with that when they’re like, “What’s the device and the interfaces?” Quick and easy way, even if you have an eval and you’re trying to figure out what is the best version for me to go to, and what’s the size factor?
In your eval go here, and you’re able to see exactly what you’re monitoring at that moment– the element count of which one that it is, so that you make sure you fit where you’re supposed to be at.
Right, and then that brings us to rates. And this is the thing that gets us really, really– this is really useful because it’s a guide in the form of a percent.
Yes, definitely. And so the 10% is the maximum rate at this time. So it’s definitely underutilized as a poller. What this means, though, is that when you start seeing this get into about 80% you’ll see a warning here. So it’ll say, “Hey, warning, just letting you know. You’re 80% of the polling capacity of this poller.”
Of this poller on this instance.
On this instance. So if we had extra pollers actually on here, you would see the polling statistics obviously for the main and then for the additionals. So when you’re doing that, what’s great is that once you get to 80 and you add another poller on, you load balance them off.
Or upgrade the capability of the system that’s hosting that polling.
Right. You can also do that. This allows you to know beforehand, before you have the users telling you that they’re having a problem. Or the users saying something’s just not right; the reports are taking a while to come back in; this is running a little sluggish. Put this on your list, even if you’re not doing a lot of additions. If you check it once a month, you’re starting a good habit of verifying this page. And this also helps you to know if things are being added that you didn’t know about, like if you have other people that have that ability; it keeps everybody on track and aware of what’s going on. Set up some things that can actually send you an alert on this page, to let you know if something has changed. There’s many ways that you can use this page to help you.
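A monthly check of polling rates against the 80% warning level might look like the following Python sketch. The 80% figure is the one quoted in the episode; the poller names and the function are hypothetical, and the rates are assumed to be percentages you have read off the polling engines page or exported yourself.

```python
from typing import Dict, List

WARN_AT = 80  # percent of polling capacity at which the episode says a warning appears


def pollers_needing_attention(rates: Dict[str, float],
                              warn_at: float = WARN_AT) -> List[str]:
    """Return the names of pollers at or above the warning level, busiest first."""
    hot = [(rate, name) for name, rate in rates.items() if rate >= warn_at]
    return [name for rate, name in sorted(hot, reverse=True)]
```

Any poller this returns is a candidate for one of the two remedies mentioned: add another poller and rebalance, or give the hosting system more capacity.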
Awesome, all right. So, polling statistics: Bad habit not to look at them. Check them out once in a while. I like your ‘looking at them once a month’ recommendation. I think it’s pretty good. Okay, next bad habit. What? Databases are infinite capability and you never have to look at them, and they’re just a utility that sits in a closet and they never break.
That would be funny [laughs] but no.
That would be funny except it is an actual bad habit.
So something that I do especially with my background, is I monitor this page. I literally monitor this page with WPM.
So, I’ll actually monitor this page and know. And as soon as things start increasing, I’m wanting to, with my SAM, I actually monitor my database, and with DPA.
Right? So you have to monitor your monitor. So a bad habit is just letting it kind of go on itself. You need to be able to monitor your monitor, so that you’re alerted to things that are happening. So by doing that, we can actually look at the database settings in here, so we have more data that we can grab.
All right, let’s look at that.
Okay, so I’m going to go to ‘All Settings.’ I’m going to scroll down to that same area where we were before, and I’m going to hit ‘Database Details.’ Now what this is going to tell me–which is really great if you’re working with support or something, too– is that this has your version, your OS, everything that you’re needing there. It’ll actually show you your network elements, your nodes, your interfaces. Now why this is important is because this is a rollup number. So if you did have extra pollers or things like that, they would tell you individually so that you can adjust and load balance based upon your pollers. What your database details page is going to do is roll all of that up and tell you as a whole, my database is holding this information.
And that’s why I actually like this. And for a lot of our customers maybe that don’t have a lot– I mean we have some with over a dozen remote pollers and hundreds of thousands of elements– in this case, if you don’t, you’re just maybe a single poller or maybe two, this is the page that I actually go to get that number. Because I really care, mostly kind of day-to-day, about the overall number. The other thing it lets me do is I can see when the last time that maintenance ran. And that is definitely something that is a bad habit. We just assume that databases don’t need to be maintained or maybe that Orion maintenance doesn’t need to run, and it does. And it’s nice, not nice, it is important to be able to check on that even if you don’t have an alert set up for that.
Right, and something else: when you’re first installing and you have the basic default settings out-of-the-box, this is what these are currently. But if you adjusted these, maybe you took it from seven and moved it to 10, or you moved it to 15, and you’re like, “Man, I just don’t understand why this report’s coming back so slow? I’m just trying to get an event count.” Well, if you look up here, because this is a total based upon the statistics rollups that you retain, this is what we have in those areas. So for your events, for the amount of time that you’re keeping them, which is seven days here, they already have 321 thousand events that are being stored. They delete after that on events, after seven days, but if you increase that to 10, this is how you’re able to figure these out. Right? You’re starting to see the impact of many sources of storage.
Multiply it by about a third.
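The back-of-the-envelope estimate can be made explicit. This Python sketch assumes events arrive at a steady rate, so the stored count scales linearly with the retention window; the function name is hypothetical, and the 321,000 events over seven days are the figures read off the screen in the demo.

```python
def estimate_stored_events(current_count: int, current_days: int, new_days: int) -> int:
    """Scale the stored-event count linearly with the retention window."""
    if current_days <= 0 or new_days <= 0:
        raise ValueError("retention must be positive")
    return round(current_count * new_days / current_days)


print(estimate_stored_events(321_000, 7, 10))   # 458571
print(estimate_stored_events(321_000, 7, 365))  # 16737857
```

Going from 7 to 10 days is a factor of 10/7, so roughly 459 thousand events instead of 321 thousand. Keeping a whole year at the same rate would mean nearly 17 million rows, which is why a simple event count can suddenly get slow.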
Right. So it’s a way for you especially if you’re first monitoring, it’s for you to get a grasp about, because you don’t really recognize the events when you first get into it and so you’re just kind of like, “Ah, that sounds great.” Or, “I want to keep it for a whole year.”
Okay, so bad habit: Just extending storage intervals without considering the impact.
Exactly. So you have to be able to know how to see what the impact is, and where you can check on it as well.
Awesome. Okay, security. Bad habits galore. There’s a million reasons for it. We won’t get into it today. A lot of it’s funding and attention and a lot of other things. And unfortunately, a lot of stuff rolls downhill on security. So there’s a great opportunity to address some bad habits there. And you’ve suggested sort of a three-prong approach to that. The first one is to standardize your security needs. That is figure out what bad habits you’re not willing to tolerate. Like the absolute no-goes, right? Create security baselines. In this case, what bad habits do you regularly get into with security? And then the last one, of course, is to set realistic objectives, right? Like what good habits can we get into? Or at least, what do we need to do? What systems can we put in place, or policies can we put in place, that will not lead to bad habits?
Right. And something that I like to call it is ‘a security gut check.’ Right? And it’s a hard question that we need to look at with our teammates, not by yourself. You should do this with your team. And say, “Do we have a security policy in place?” I do not mean it’s one that everybody knows of, one that somebody passed down. Do you have a book somewhere that says, “This is the policies that we have.” And on top of that, “This is the reaction we have afterwards.”
So bad habit number one: Don’t write anything down. [Laughing] You’ve got to address that. It’s got to be on paper. You have to be able to hand it to somebody and say, “This is our policy.”
You should be able to do that so you have a consistent policy; one that is ‘wrote down’ and also you’re able to make changes. Obviously, I always say this, that security is a very fluid dance. It changes all the time. We notice that especially in the media. So there’s things that we’re going to see that are going to come and go, and things are going to need to be addressed. You can go back to a central point with the policy that has been made, and you’re able to adjust and make changes. And it’s easily referenceable. And it’s easy to train up on, between everybody.
I call that kind of your ‘pre-assessment.’ Right? Ask yourself that hard question. My second part of it is like, that general baseline. What is important to you for security? And you can’t look at it just from an IT standpoint. I feel like IT needs to address that. Because you have your phones, everybody has phones that they can access. Usually they have email for work on there. There’s applications now on your phones that you can access back to the applications that you’re using. It’s no longer just IT and what we have to do, and it’s security. No, it’s global. I don’t care if somebody just came in, and they’re going to be running the phone system. They need to know a proper procedure for what they need to lay.
I was going to say, this is something we’ve talked about before. And increasingly you guys are beginning to talk about it. Which is sort of that idea of not just Biz-Dev-Ops, but like, Biz-Sec-Ops. The idea of including the business in security decisions, because ultimately, security comes down to risk. Measured not–well, incarceration is an extreme example– but mostly real damage to the brand or the business. And then the cost of deploying or achieving security. Those are business decisions. Those are not technology decisions. And a lot of times, that’s placed on us in IT. They expect us to be able to sort that all out and quite frankly, that’s not our job. We deploy technology. We deliver security. But if we don’t have a budget that works, or we don’t understand maybe when there’s multiple, competing, top-level priorities, then how are you going to triage that? How are you going to decide which ones are most important? That comes from the business.
Yes. And something I like that you said was the incarceration part, right? Because if you have that book and you have a policy built upon it- if there needs to be a legal action, you have a source and a grounds for legal action at that point. If you do not have a written law-type policy, it’s very hard to actually take legal action to even somebody being malicious, you know, internally or maybe it’s somebody even accidentally. But if you have a policy and somebody has to check off and say, “I understand these policies but I still did this,” you have a basis. And you have a quick and easy way to take legal action, or even an HR issue, so that you’re able to have that ball rolling. And it’s not an on-going issue that is just a ‘he said, she said.’
Well, and you can flip that around, too. The other side of that incarceration, like from a Sarbanes-Oxley (“SOX”) violation, right? Where the senior executives are actually in peril. I mean, not too many of them get locked up. But still, they’re kind of concerned about it. By having the business involved in creating that document, they can establish, they can set for IT and for you clearly, these are the things that keep our executives up at night. So then, when you decide, “Where am I going to focus?” Well, that’s going to be one of the ones right at the top. And you’re not going to spend a lot of time doing emails or having meetings, trying to say: I’ve got to articulate these four very difficult security issues to someone who is non-technical. I could read that on the document instead. This is going to get my CEO locked up. I’m going to address that first.
Right. And for your objectives, that’s where you start– the risk. And then you need to set objectives for each department. What is your objective for this actual policy? You know, is it IT related? Is it social engineering related? What do they handle? You need to know what they have in place, what they deal with so you can figure out the objective for the security policy. Now something that we have, and I actually have a THWACK post on this too, that will go in kind of a detail on this. Like we said before, this is not just SolarWinds. This is anything, right? So my background being security, I’ve actually put on THWACK– and we’ll put the link in here– that will actually show you how to run through a step-by-step assessment, a pre-check flight, an assessment–a team analysis and a plan that will set through there. So it’s something that will keep and get you on there. Especially the beginning of the New Year, you need to do that gut check and you need to just no longer ignore security. You need to at least start it. And I show you how to start it, and how to even go all the way through.
So eliminate–Break up with a bad habit of not writing things down. Put it on paper and have a policy. Okay, what else?
All right. So then, we need to go into team analysis. Team analysis is one of those things that kind of gets swept under the rug. Someone decides, “I’m going to be the security person, and I’m going to set up all of this information, and now I’m going to pass it out and let you guys know what it is.”
But you only see what you get. You don’t see what everybody gets.
Right. Well if I’m the only one that’s doing the security policy, and I only take care of say the application side of the business– which would never happen. But if I had that… [Laughing]
None of us wears multiple hats ever.
But if I was doing that and I’m the one that set the security policy up, what about the network?
So if I don’t involve a team of my experts, I’ve hired them, or someone has hired them for me, they have tasks that they have to do. They have a completely different set of angles on security. And you need that. You need the different angles of security. You need to have that team analysis. You need to have them sit down. You need to brainstorm. Because maybe somebody on their off time, they’re on Twitter, they’re on something and they’re like seeing these ideas and like, ah, what I would do is this. That’s valuable.
So the bad habit there is don’t invite security people to meetings because they’re annoying and they slow things down. So get over that habit.
So what’s a good way of bringing security experts in, but at least putting some amount of control into that meeting? Is it about setting objectives with them?
Or defining how you expect them to contribute?
So any time that I was ever brought in from a security standpoint, my first thing, and everybody has their own opinions, but my first thing was, “What do you guys do in your department? What do you access?” And they’re like, “Well, we use this, this, this.” I’m like, “All right. Do you have a phone?” “Well, yeah.” I ask the questions and then usually once they start realizing I’m just trying to figure out things, they’re like, “Oh, well you know we also access, you know, if we’re out here.” If I get called in, then I’ll also like look up on my phone using Google, dadadadada. And then I send this and I’m like, “Okay. We have to look at that from a security standpoint.” And it’s not–and I always want to tell them, “Well, I’m just making sure we’re not getting you in trouble. And that you’re able to access these, right?” So you want to make sure that you’re not hindering them from a security point, and you want to be their friend. Like, I’m always trying to make sure that they understand I am there to help them, just not hurt themselves or the company.
Right. I kind of made the relationship joke at the beginning but you kind of hit that with ‘one and done.’ I think that’s another bad habit. Especially when you’re talking about policy, is saying, “Yeah, let’s book about three hours and we’ll just get all of this done.” Okay, well first of all, you don’t know what you don’t know.
You don’t know how much time it’s going to take. But also, who’s going to say ‘yes’ to that meeting? How are you going to make that a regular meeting? You’re not.
So to your point about ‘one and done,’ get past the habit, get out of the habit, break the habit of ‘one and done’ giant meetings about security policy. Invite them into smaller conversations. Manage the agenda of that. You know, like 30 minutes, or do a stand-up. Or maybe you do a little DevOps and you can actually get your [mumbles] board out. But break that up into smaller conversations so that everybody is focused on one task, one element of security. Instead of these giant ‘boil-the-earth’ security processes that go nowhere.
And then you don’t do it again for another year.
And then my last tip that I have for anybody that’s doing any kind of a security gut check, or trying to implement anything, is it’s not a ‘one and done’ as you said, Like, I created a policy and now I’m done.
So especially with the increase in security breaches and hack attacks, and reverse engineering on top of social engineering, which is high, a lot of the impacts you’re seeing out there right now come from small things that could have been prevented. You know, it’s basic things.
So what I like to say is, when you have your book, when you start it or go back to it, at a bare minimum every six months you need to have a team meeting and be like, “Hey, is this still viable?” I mean, it’s not viable if it covers something that you don’t even have anymore, like an application that’s not there, or it’s been switched or changed. You need to make sure that it’s viable every six months. Then you also need to make sure that anything new that’s come out is covered. So it’s always fluid. Check it at least every six months, as kind of a renewal, right? It’s just one of those things where you need to keep going back to it, and not just set it somewhere and let it collect dust.
So what I hear you saying is that last security bad habit is don’t stop trying on your policies for size.
In other words, don’t just write them and throw them out somewhere on a share drive or in your document management system, and expect that other people are going to be able to use it. Start small with that. And put it on and wear it around and see how it works with other teams. And then add or modify as you go. But don’t sit down and say– don’t fall into that bad habit of saying, “I can’t even start on documenting this, or creating this policy, because I found this template online and it looks like I have to do 17 pages for each one. And I’ve calculated using their recommendation, or their regulatory compliance standard, that this is going to be 10,576 pages of documentation.” Don’t do that! So your recommendation there is to break the habit of setting an expectation for grandiose, giant policies and instead start with something a little bit smaller. And then take it for a drive, try it on with other teams, to see if it actually is helpful or not.
Correct. Because if you’re doing something, it’s better than nothing. All right, so if we’re able to solve all those bad habits with the monitoring and the security, we might have a little bit more room to have some more good habits, right?
Yeah, to create some more good habits. And one of those things that is a good habit– you know, we forget that automation is a great habit. And it actually helps eliminate a lot of bad habits. So I thought I would show you a quick example. This is something that comes from SWUG. I mean, how many of those did we do this year?
I think about 12.
Yeah, we did about a dozen, including London, so that makes us international. [Congratulating each other] Okay, so one of the things that you guys asked for several times at SWUG was if it’s possible to integrate the Orion platform with Alexa using SWIS. So I’m going to show you how to do that. Now it’s just going to be a quick two-minute demo. I’m not going to spend a lot of time on this. This is not going to be an Alexa skills programming demo. There’s plenty of stuff out there, and I will put a link up to the code that I’m using here, so you can try it. But the point that I wanted to make here is don’t let the lack of a development environment stop you from playing with APIs, or learning about the way that you can actually automate the SolarWinds platform using its API. So this is a great example, because this and AWS are going to take care of all of the back end for us. So we’re just going to write a little script, and we’ll just focus on the interaction with SolarWinds.
Sounds good to me.
Okay, so right here I’ve got a machine sitting out on AWS. This has Orion installed; so this has SAM and NPM and then it’s got DPA and a couple of other things on it. So let me show you that these things are talking to each other so that you will believe me. I’m going to take her off of mute here. Okay, so, Alexa, ask Virtual Geek to list Orion accounts.
There are two accounts on the Orion server. Admin is enabled. Last login was Thursday, January 5th at 2:51 p.m. Guest is disabled and has never logged in.
Okay, now you can see I logged in right here, right? 3:02. So I’m going to log out; log back in. Okay, so I logged back in. Alexa, ask Virtual Geek to list Orion accounts.
There are two accounts on the Orion server. Admin is enabled. Last login was Thursday, January 5th at 3:02 p.m. Guest is disabled–
Alexa, stop. Right. So what it did there was interact with the accounts table, or actually the accounts object, using SWIS. It’s basically just pulling a list and then speaking it back to us using Alexa. So let me show you how I did that. The one trick here is it needs to be able to get to your SWIS interface. It’s got to get to the Orion server or it’s not going to work. So in this case, it’s running here in AWS. The way I did it internally was I set up a little VPN tunnel for it, but there’s a bunch of other ways that you can do that. So that’s not that big a deal. So here’s how this thing works. Okay, first of all, we looked at the Orion instance; we do need one to talk to. I recommend installing the SDK because it’s got all the samples. But I didn’t really need to install anything on it in particular. I do need to give it access. So I’m going to go over here to my AWS instance that’s hosting it. You need to make sure that your instance is actually exposing it. And the way that I did that was I went down here, and for its inbound rules you’ll notice that I did set up a rule. So 17778, that’s the interoperability port for SWIS, is enabled. And in this case, and this is a terrible habit, I’ve got it open to the world. I would not do that. It would be way better to restrict it to your internal IP. So that’s the first thing; you need to expose the interface for SWIS. And then the second thing is you’re going to do a little work inside of Alexa. So the way that you’re going to do that is create a developer account, if you don’t already have one. Just Google that and you’ll find it. You’re going to come over here to Alexa, and it’s going to ask you what you want to create. So you’re going to say, “I want to get started with creating a custom skill.”
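As a rough illustration of what "getting to the SWIS interface" on port 17778 means, here is a minimal Python sketch that builds, but does not send, a SWIS JSON query request. The host name, user name, and password are placeholders, and the helper name `build_swis_query_request` is made up for this example; the code shown in the episode is not reproduced here.

```python
import base64
import json
import urllib.request

SWIS_PORT = 17778  # the SWIS port opened in the inbound rule described above


def build_swis_query_request(host, username, password, swql):
    """Build (but do not send) an HTTPS POST carrying a SWQL query to SWIS."""
    url = f"https://{host}:{SWIS_PORT}/SolarWinds/InformationService/v3/Json/Query"
    body = json.dumps({"query": swql}).encode("utf-8")
    # SWIS accepts HTTP basic authentication with an Orion account.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder values only; nothing is sent over the network here.
request = build_swis_query_request(
    "orion.example.com",
    "admin",
    "placeholder-password",
    "SELECT AccountID, Enabled FROM Orion.Accounts",
)
print(request.get_method(), request.full_url)
```

Actually sending it would mean passing the request to `urllib.request.urlopen`, and the server's certificate would need to be trusted by the client. Keeping the inbound rule limited to known addresses, as recommended above, matters because this endpoint accepts Orion credentials.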
Now you can do home automation skills where basically you don’t have to say the app name. Like in this case I said, “Ask Virtual Geek”; that’s the application, the skill that it’s calling, and then it adds stuff to the end of that. But there’s sign-in requirements and some other stuff that’ll kind of slow you down. So this is an easier way to do this. This is just a custom skill, and we’ll click ‘Edit’ on this, and here’s the way that the skill is defined. You basically give it a name, and down here, this Invocation Name, that’s the name that you’re going to call, and whether or not it has fields. And again, the Quick Start Guide will walk you through all of this piece of it. But then you have your Interaction Model. And this one does a little bit more, because this one is also running my Christmas tree, and we were having some fun at our Lab 50 episode with asking whether Kong rocks or not. So these things are called intents. So one of them was GetOrionAccounts. So you’re basically giving it a list of things that it will go and do, and then you pair that up with a list of utterances that should map to them.
Okay? So again, that’s covered in the documentation. But once you have that, basically what it does is, when you say something to Alexa, it uses the name of the app, then uses the utterances to map what you said to an intent, and then it sends a chunk of JSON off somewhere. But here’s where it gets really cool. In here it’ll say, what do you want to connect that to? Well, you’re going to configure what that talks to. And one of the things that it’s probably going to talk to is an ARN for a Lambda function. And the cool thing is, when you say Lambda function, it will create it for you and do all the linking and everything else. It sets up all the security rules for you so you don’t have to mess with that. You can still call out to your own webhooks if you want, but don’t do that. So again, I literally clicked on New Lambda Function and it creates my new Lambda function, and then it finally will take you to a test page where you can experiment. And it’s cool because, like, I’ve still got my Christmas tree hooked up, so I’ll say, ‘Ho, ho, ho,’ which is what you’ve asked it for, and it’s going to say, ‘Ah, ho, ho, ho is true.’ So this is basically the response, which is in the form of JSON, and then this is what Alexa sent my function. So again, this is real easy to debug, because it’s all pretty much plain text. In this case, for ‘list Orion accounts,’ and it’s actually pretty forgiving in terms of spelling and everything else, here’s what it came back with, right? So this again is live. So it said there’s two accounts on the Orion server, admin enabled, blah, blah, blah. So that’s basically what came back. So that’s mapping that interface between the device itself or the test UI account, and then the chunk of code that actually runs. And again, I didn’t have to set up anything except for making SWIS available, and I’ll show that in a second. So all of this, totally free tier, just set it up and run.
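To make the pairing of intents and utterances concrete, here is a toy Python sketch. It holds one intent and a couple of sample utterances as plain data and looks up which intent a phrase belongs to. The utterance wording and the `match_intent` helper are assumptions for illustration; this is not Alexa's configuration format, and Alexa's own matching is far more forgiving than an exact string comparison.

```python
# Hypothetical interaction model, held as plain Python data for illustration.
INTERACTION_MODEL = {
    "invocation_name": "virtual geek",
    "intents": {
        # intent name -> sample utterances that should map to it
        "GetOrionAccounts": [
            "list orion accounts",
            "list the orion accounts",
        ],
    },
}


def match_intent(spoken_text, model=INTERACTION_MODEL):
    """Return the intent whose sample utterance matches exactly, else None."""
    # Normalize case and collapse repeated whitespace before comparing.
    normalized = " ".join(spoken_text.lower().split())
    for intent_name, utterances in model["intents"].items():
        if normalized in utterances:
            return intent_name
    return None


print(match_intent("List Orion accounts"))
```

The point of the data shape is that the skill never sees raw audio or free text: by the time your code runs, the phrase has already been reduced to an intent name such as GetOrionAccounts.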
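The "chunk of JSON" in each direction can be sketched as follows. This Python example uses a trimmed-down request (real requests carry more fields, such as session details) and builds a plain-text reply; the function names `intent_name` and `plain_text_reply` are made up for this sketch.

```python
import json

# Trimmed-down example of what Alexa sends for a custom-skill intent.
SAMPLE_REQUEST = json.loads("""
{
  "version": "1.0",
  "request": {
    "type": "IntentRequest",
    "intent": {"name": "GetOrionAccounts", "slots": {}}
  }
}
""")


def intent_name(event):
    """Return the intent name from an IntentRequest, or None for other request types."""
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        return None
    return request.get("intent", {}).get("name")


def plain_text_reply(text, end_session=True):
    """Build the JSON structure Alexa expects back for spoken plain text."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


print(intent_name(SAMPLE_REQUEST))
print(json.dumps(plain_text_reply("There are two accounts on the Orion server."), indent=2))
```

Because both sides are plain JSON, pasting a request into the test page and reading the reply is usually enough to debug a skill like this.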
So I don’t have to set up an environment, and it runs everywhere.
So something neat about this to me is like, especially for somebody who’s not really programming, or somebody who doesn’t do this a lot, when you see that, it shows you the before and after and how it inputs it. So you’re able to actually make a correlation of what you’re doing. So to me, I’m like for kid-wise, if the kid was messing with this and like trying to learn and do something, this is great, right. Like it’s going to spark their interest and has like an interaction.
And it’s the ultimate sandbox.
You cannot break the cloud. Well, you can break your cloud apps. But in terms of the container itself, you can’t. So the great thing is, you set it up and then throw it away. And the free tier for a lot of this dev makes it really, really easy. So I’ll give you another example then of how this actually works, right? So that’s the Alexa piece. Literally, that’s it. I mean, the only thing that I’m configuring is that one skill. So then, the next thing is the code that actually runs. So the way to think about that: I’m back here again in my AWS console, and this time we’re going to look at Lambda. So remember, that’s the serverless, run-code-in-the-cloud component. Usually it’s used for glue between different services. But in this case, it’s running as a standalone endpoint for the webhooks that are coming from the Alexa unit. Okay, so I’ve got a couple of functions here. One of them is this Virtual Geek skill, which is the one that does Christmas trees and all of that other stuff. And that will distract you guys because it’s got a bunch of other APIs for other services. So what I’m going to post is this one, this Alexa SWIS skill. Why did I put two Ss in “SWIS”? [Patrick sighs]
So it’s getting the account ID, the enabled status, and then a whole lot of other things, you know, you’ll recognize– any of you who’ve looked at the Orion database will recognize most of these columns. You’ve done this for years, right?
And it’s selecting that from Orion.Accounts. And then I’ve got a couple of other things here. I am doing “ORDER BY LastLogin,” so I don’t have to do it in the code. And how many rows? I’m limiting this to 10. But I’m doing that so I can use WITH TOTALROWS, so that I can also confirm, you know, when she says, “You have two.” It’s a way of cheating. I don’t have to iterate through them to figure out how many there are in the result set. There’s a couple things here where it’s logging out to console. I’m also logging this out to Librato, because then it’s a little bit more interactive than using CloudWatch. But that’s something else to play with and that’s for another day. But then basically, you can see where it starts to break apart the response, right? Because again, this is JSON coming back. So it reads “response.totalRows.” I’m setting that. So I say, “There are,” then the total, “accounts on the Orion server.” Well, that’s half of what the response is. Here’s the other one. Here’s an iterator, right? This is a “for” loop. I’m going through my response results, and again, when you trace it you’ll see the results that come back: is it enabled or not, the account ID, last login, blah, blah, blah. And then when it’s done, it takes it and says, “sendAlexaReply.” And the Alexa reply is what goes out, the event that basically says, “I’m done talking to you,” and whether or not to stop the connection. Like the skills where you keep talking to her and she keeps listening. ‘True’ basically says, “I’m done, I want to cut it off.” With ‘false,’ it’ll stay lit and then you don’t have to go through the process of saying the event name again. And then the rest of this is all just error handling. Here’s ‘sendAlexaReply’; that’s where it builds that structure. Again, that’s just JSON, so super-duper simple. And then that’s it. Down here, this is boilerplate event stuff. Don’t worry about that. Any how-to will walk you through that.
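The query and the loop described above can be approximated in Python like this. The column list, the sample response, and the function name `accounts_speech` are assumptions: the real response from SWIS would carry timestamps and whatever types the Orion.Accounts entity actually uses, and the code shown on screen is not reproduced here.

```python
# SWQL along the lines described in the walkthrough; the exact column list is an assumption.
ACCOUNTS_SWQL = (
    "SELECT TOP 10 AccountID, Enabled, LastLogin "
    "FROM Orion.Accounts "
    "ORDER BY LastLogin "
    "WITH TOTALROWS"
)

# Hypothetical, simplified response: a row count plus a list of result rows.
SAMPLE_RESPONSE = {
    "totalRows": 2,
    "results": [
        {"AccountID": "Admin", "Enabled": True, "LastLogin": "Thursday, January 5th at 3:02 PM"},
        {"AccountID": "Guest", "Enabled": False, "LastLogin": None},
    ],
}


def accounts_speech(response):
    """Turn a query response into the sentence Alexa should speak."""
    total = response["totalRows"]  # no need to count the rows ourselves
    verb, noun = ("is", "account") if total == 1 else ("are", "accounts")
    parts = [f"There {verb} {total} {noun} on the Orion server"]
    for row in response["results"]:
        status = "enabled" if row["Enabled"] else "disabled"
        if row["LastLogin"]:
            parts.append(f"{row['AccountID']} is {status}")
            parts.append(f"Last login was {row['LastLogin']}")
        else:
            parts.append(f"{row['AccountID']} is {status} and has never logged in")
    return ". ".join(parts) + "."


print(accounts_speech(SAMPLE_RESPONSE))
```

The resulting string is what would be handed to the reply builder, together with a flag saying whether to end the session or keep listening.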
But you can see where [laughs] that’s for the promo. Remember when we said, you have 126 active alerts? I just created that as an event. But the main thing right here is ‘getOrionAccounts.’ Remember that was that intent that we had from the Alexa config. Basically, it’s saying, “When you see that, call that method. Go get the Orion information and terminate when you’re done with the result.” So again, I’ll post this. But the thing that I wanted you to see was this is the SWQL that’s getting called. And here’s where you’re passing it back. So if you think about all the things that you could do with SWIS on the Orion server, including rebooting servers or making changes through Virtualization Manager for example, through the API– it is really, really powerful and pretty easy to do.
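The "when you see that intent, call that method" step is a small dispatch table. Here is a minimal Python sketch; the handler body is a stand-in that returns canned speech instead of querying SWIS, and the names `INTENT_HANDLERS` and `dispatch` are hypothetical.

```python
def get_orion_accounts(event):
    """Stand-in for the real handler, which would query SWIS and build the speech."""
    # Returns (speech text, end the session when done?)
    return "There are two accounts on the Orion server.", True


# Map each intent name from the skill configuration to the function that serves it.
INTENT_HANDLERS = {
    "GetOrionAccounts": get_orion_accounts,
}


def dispatch(event):
    """Route an incoming Alexa event to its handler, with a fallback for unknown intents."""
    intent = event.get("request", {}).get("intent", {})
    handler = INTENT_HANDLERS.get(intent.get("name"))
    if handler is None:
        return "Sorry, I don't know how to do that.", True
    return handler(event)


print(dispatch({"request": {"intent": {"name": "GetOrionAccounts"}}}))
```

Adding another capability then means adding one intent to the skill and one entry to the table. For anything that changes state, such as rebooting a server, it would be sensible to require a confirmation step before acting.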
Super powerful. So my deal is that even for people who don’t code, you’re able to use this and go back and forth and be able to figure it out. So you can use this, have some fun with it, and kind of get a little bit more geeked out on your own time when you clean up all the other habits. But this stuff’s pretty cool. I mean, like to learn everything that goes through here, it’s just kind of like it’s a template for you to rather dip your toes into the programming side with something interactive that kind of lets you know back and forth.
These are good habits. They get you excited about breaking bad habits.
Mmhmm. All right, now that was really cool. And I know we tease you about playing with the SolarWinds APIs, but honestly, it’s amazing that you even find the time.
Okay, first of all you apparently find a lot of time, too. You’re the only one of the five of us that has multi-gigabit fiber to your house.
Isn’t that normal?
No, and we’re somewhat envious of that. The second thing is that automation really may be the ultimate cure for bad habits. So often, we make decisions that are shortcuts and then we promise that we’re going to go back and fix them, and we never do. And the next thing, you know, that’s a bad habit that’s eating up time from the entire team that’s blocking other important work.
But if you take the time to automate tasks you’re going to repeat, then you save time and headaches in the long run.
That’s right. I mean APIs are really a form of discipline, right? And what relationship doesn’t benefit from a little bit of discipline and maybe some mindfulness.
True. Well I hope this inspired you all to break up with some bad habits and get back on track for the New Year.
Yeah, and also make sure you check out the THWACKcamp session on monitoring good behaviors that we talked about before, if you haven’t seen it. We’ll put the link in the description of this video. Also, we’ll post that in the live chat right now because you’re probably chatting with us. Of course, if you’re not chatting with us live, you’re missing out on these live SolarWinds events.
And you can fix that bad habit by visiting our homepage, lab.solarwinds.com and sign up for reminders for the next event. So this was awesome. Are we about ready to wrap this up?
Let’s wrap this thing up. I’m Patrick Hubbard.
And I’m Destiny Bertucci, and thanks for watching SolarWinds Lab.