
Episode Transcript
Welcome to SolarWinds Lab. I'm Patrick Hubbard. I'm Leon Adato. I'm Kong Yang. And I'm Rob Johnson. Rob, welcome back to SolarWinds Lab. Thank you. So, I guess since Rob is here, we're going to go over security tips and tricks? Yeah, that's right. We did a LEM episode with Rob a couple of months back, and we really only scratched the surface. So, today... "All your lab are belong to us?" Klaatu Barada Nikto? [Zapping] All right, what the... [Beeping] Is going on around here? I have to say we've been hacked. Are you suggesting, however improbable and coincidental it may be, that we're getting hacked on the day that we're shooting a security episode? I'm shocked, shocked I tell you! Guys, we run our lab on a VM. I'll just restore from our gold image. Nice. Yeah. Oh! Seriously, buddy, thanks. So, someone decided to hack us while we're filming an episode of Labs? Well, actually, probably not. The reality is they probably hacked us weeks ago, and then set everything in place, and it was just scheduled to trigger about now. So, we've probably been hacked for a long time, and just didn't know it? More or less, yeah. That's the way it usually happens. Home Depot, Target, Sony—you name it. They were all hacked long before anyone figured it out. By the time it becomes visible, the damage is usually done. Right, so that's what we're here to discuss today. If you're working at any decent-sized company... Or even a company, or even if it's not decent size, but it's still interesting in some way. Okay, fair enough. It's likely that you have already been hacked, and you just don't even know it yet. Well Leon, that's a pretty huge accusation to make as a sweeping statement. Well, it is, but I'm not the one who said it. You see, at the January World Economic Forums, Cisco CEO John Chambers said, [beeping] "There are two types of companies: those who have been hacked, and those who don't yet know that they've been hacked." 
If you've ever been to a security convention, you know that one of the events is usually a game of capture the flag. Right. Participants bring a laptop, plug into the edge of a completely unknown network, and then their job is to go in and find a file. Usually it's a file sitting on a machine somewhere, and it's got some information in it. And they've been told what it looks like in advance, so they know what they're looking for. Right, they know there's a file called "find me," or something like that, and inside the file is a hex code, or a GUID, or something like that. And... Always with the GUID joke. Always, and you know, when they mail in that information, they have effectively captured the flag. But what people need to understand is this happens in the real world. You see, to get into an actual hacker forum, you know, on top of finding it at all, you don't just show up and come up with some really cool, you know, hacker name. You've got to prove that you've got the chops to be in there. Now, at higher levels, proof means having government secrets, or social security numbers, or credit card numbers, or anything like that. But at the low levels, and I've got to stress, this is the low levels, the beginner levels, what happens is you get there and they give you a list of companies, you pick one, and capture the flag. You see, the people who are running the boards have already pre-hacked those companies and placed a file somewhere on the network. And so, the cost of entry is sending that information in. And needless to say, none of those people, even at the lowest levels, have the word "l33t" in any permutation of their username. Right, exactly. So what this means is that, you know, the companies have been pre-hacked, and forum members have hacked into at least one of those companies themselves. They haven't done anything, necessarily. They haven't damaged anything, or stolen information yet. 
Maybe, right, they just need to prove they have the skills to get in there. It sounds like you're saying that people should just go on the assumption that they've been hacked. Right? And work as if they're repelling an attack versus going on defense to protect against outside attacks. Pretty much, pretty much. For the record, I'm not saying abandon any best practices or defensive techniques, but assuming that a hack is already in place or already under way adds a certain level of urgency. It changes the way you look at things. Okay, so, I'm going to go check in the data center to make sure that our hack hasn't been any worse than we think it might have already been. And I'm also going to try to get ahead of the management email freak-out that always happens. Why don't you guys talk to them about how you can handle a situation like this better? And ideally, how to put some proactive controls in place to make sure that it doesn't happen in the first place. Right, like, don't click on spear phishing links. Yeah, that would be a great place to start. And actually, if you have any really juicy examples of security events in your enterprise, why don't you share them in the chat window right over here? Although, you might not want to mention the company names. Just a suggestion. Of course, if you don't see the live chat window over here, it means that you're not chatting with us at the live event, so you want to make sure you join us. And the easiest way to do that is visit us at our home page, which is lab.solarwinds.com. Make sure you sign up for show reminders, give us suggestions for content you want to see in future episodes in general, and just chat with us. So, I'm going to take off, and I'll catch you guys later. [Electronic zapping] Okay, so, where do we start? Realistically, you start by educating yourself about five years ago. 
Okay, fair enough, but if people find themselves in a situation like this, they're a little under the gun, so it's not time to run to the library like Buffy the Vampire Slayer. True, but if you do have time for a little light reading, the Cisco Guide to Harden Cisco Devices isn't a bad start. That is a good one. I think that Mav Turner mentioned that in THWACKcamp last year. And I get that security is part of a commitment to lifelong learning that we all have as IT pros. But again, I want to focus on the here and now. We are under attack now. So, what can we do now? I know this sounds crazy, but how about unplugging your internet? Okay. If you're really under attack, sever the connection to the outside world and get to work. Mmhm. Ask Sony if it's worth a few hours of downtime. Good, okay. And in our case, our assumption is that we've already been hacked, so we're trying to take some evasive maneuvers, but we're not setting the ship's self-destruct timer just yet. Well, this takes a little preparation, but keeping your eye out for good tools and having them on hand is a good step. For example, I remember last month that the US Army made their security tool, Dshell, available as open source. Right. Having things like that pre-installed on a flash drive for emergencies can shorten your time to getting things back in working order. Good. Leon, didn't you mention something at THWACKcamp about fixing network configs? That's right, okay, using NCM. So, tell you what, let's dig into that now. So, to check our configs, and hopefully correct them if there's anything wrong, we use NCM. Now, this is a regular installation of NPM and NCM. So, you go to the configs tab, and over to compliance reports. And as I mentioned back at THWACK Camp, there are a variety of compliance reports— everything from Cisco security audits, and CSP, to HIPAA, and SOX. 
I'm going to go to the Sarbanes-Oxley, the SOX report, just for a moment, and this takes the existing configs that we have been collecting on a regular basis (that's NCM's stock-in-trade, that's what it does), and it scans them on a regular basis to find out where there are problems. And you can see that there's one glaring error here that I have: most of my devices do not have the enable password encrypted. Now, if I'm a network guy, I know exactly what that means. Oh, my gosh, I can't believe I left that, and hopefully all your devices aren't red. It's hopefully just one or two. But in this case, I'm not a network guy, I don't really know what it is, so you can actually click on that X, and it will tell you. Well, I'm looking for the pattern, service password-encryption, and it was not found. Got you. And what this does is, it makes sure that the password that you're using to log in is encrypted. So, I'll explain why in a moment, but I'm going to highlight that, and I'm actually just going to copy it, because I have a choice. I can execute some kind of command on that box to fix it right now, or I can actually do it for all the nodes that are in violation. I'm just going to do one. You want to be careful, you want to test, and then test again. Most people have a decent-sized environment, it's not just one or two routers; it's, you know, a couple dozen or something, so you can afford to test on one or two before you roll it out wholesale. But here, click on execute, and it gives me a window where I can do some script actions. I'm going to go into configuration mode, and immediately, this is why I copy and paste it; I don't have to worry about fat fingering. Okay, that's the line it's looking for. I'm going to exit out of configuration mode; I'm going to write memory because I am confident about what I'm doing. At this point, all I have to do is execute script. It uses the credentials for that box. 
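As an aside on the compliance check just described: the idea of scanning collected configs for a required line can be sketched in a few lines of Python. This is only an illustration of the pattern-matching idea, not how NCM implements its reports. The device names and config text are made up, and the remediation list simply mirrors the steps spoken in the walkthrough (enter configuration mode, paste the line, exit, write memory).

```python
import re


def find_violations(configs, required_pattern):
    """Return the sorted names of devices whose config lacks the required line."""
    rule = re.compile(required_pattern, re.MULTILINE)
    return sorted(name for name, text in configs.items() if not rule.search(text))


# Hypothetical collected configs, keyed by device name.
configs = {
    "router-a": "hostname router-a\nservice password-encryption\n",
    "router-b": "hostname router-b\nno service password-encryption\n",
    "switch-c": "hostname switch-c\n",
}

# Anchoring at the start of the line keeps the "no ..." form from matching.
PATTERN = r"^service password-encryption\s*$"

# The fix described in the walkthrough, as the script lines to send.
REMEDIATION = [
    "configure terminal",
    "service password-encryption",
    "exit",
    "write memory",
]

violations = find_violations(configs, PATTERN)
print("Devices in violation:", violations)
```

Testing the remediation on one device first, then re-collecting configs and re-scanning, matches the test-then-roll-out advice given in the demo.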
I don't have the credentials, which means that users, you know, network engineers and whoever who don't have the actual login to the box can still be given— can be tasked with doing this. It's running, and in a minute, it's going to finish. And here we go; I can see the results, which will tell me if it ran correctly. In this case it did, no problem. And so, that one is fixed. Again, test, test again, make sure things are okay, but then you can roll this script out to all the devices that are having a problem. At this point, I would go back to my compliance reports. Now, people panic, and I'm mentioning this just for people who aren't as familiar with NCM. At this point, you see that the issue hasn't been resolved here. Well, that's because I hadn't collected another config, and I hadn't re-read it, or re-scanned it. So, if you roll out a bunch of fixes, and you don't see the changes immediately, it's because you have to do that, you know, the config collection, and another scan, but it will be fixed. So, it's pretty easy to do that. Now, as long as we're in the portal, I want to show one other thing. It's not NCM this time; it's NPM. With NPM 11.5, you have, starting with 11, you have the ability to use deep packet inspection to find out the applications that are running on your network. And that sounds nice, but in 11, you had to actually say which applications you were looking for. With 11.5, it auto detects. There are 1,200 applications that it will automatically detect. All you need is a span port—or, if you're watching on particular server, you just put it on there, and it will tell you what it finds. Now, again, the whole point of this is that you're able to find out whether it's the application that's slow or the network is slow. That's what deep-packet inspection is for. 
But a secondary benefit of that is that you automatically see which kinds of... wait a minute, how come I have something using Google Voice, or, why do I have traffic going to this site, or that site? It's automatically going to pick it up. We've auto-categorized a lot of these applications as being questionable, or even risky traffic, so you can use that in your environment, you know, already, and that comes built into NPM. It's funny how the tools that you already own, in some cases, network performance monitoring, help with security as well. Exactly. So, Kong, you were really fast to restore our lab session after the thing earlier. Can you tell everyone a little bit more about how they could leverage their virtualization environment to thwart a hack that's in progress? Absolutely. It starts with a great base. So, there are four simple steps to get that great base, best practice-wise, right? Number one: segregate your access to your virtualized environment. So, leverage RBAC for that. Number two: prevent VM sprawl. Number three: incorporate logging and monitoring for VM-to-VM traffic. Okay. And lastly, and most importantly, lock down and monitor the folders that contain your VM file sets. If you do that, you're off to a great start already. And we had that in place, so for me, it was just identifying the host and the VM that was causing our issues, right? And I leveraged LEM and Patch Manager, and Virtualization Manager to do so. Once I did that, I SSH'd in, identified the PID associated with that VM, killed that PID, and then restored the VM from a good, golden backup that we had. And that was it. Wow. Great, so you sort of hinted at it, but I think we're dancing around the elephant in the room here. Or should we say ‘L-LEM-Phant’? Right. I was wondering when you guys were going to get around to LEM. Now, LEM is one of the bigger tools that you'll invest in, but the time you spend implementing that tool will definitely pay off in the long run. 
Okay, we'll take a look. Okay, so, going on the premise that we have been hacked. One of the first things that you're going to do is use proxy that you have to investigate that issue, right? To do some forensics, to do some root-cause analysis. Well, the Log & Event Manager, LEM, is an excellent product for something like that because, for obvious reasons, right, it's aggregating all your logs into a central location. Okay, so what we want to do is use that log data, and one quick way to do it is jump into the explore area here, and just run a blank search. So, like, right here, I've got 10 minutes worth of data, but I can come over here, and I can expand that out to whatever timeframe I need. So, it might be, especially if you figure out you're hacked, you might start with a few minutes, and then incrementally go back in time if you're not finding what you need, right? So, what we'll do here is we'll jump in, and we'll— we do a blank search, and the cool thing is that LEM summarizes all of the events that happened within that timeframe. So, I can see, like right here, file creates. So, one of the things we're going to focus on here in a minute is monitoring file access, monitoring file information, right? But it goes beyond that. I can see changes, I can see services, and processes, log-on failures, so that's attempted access. So, all of these events can be used in your investigation. And just to be clear, this is aggregating not just on a single machine, but it's taking it from all across your environment, all the points that you're collecting logs from. So even if it's one login failure on box A and one login failure on box B, you can see that this is a trend. "Oh my gosh, they're skipping across my environment." Exactly. Like a stone. Exactly, and the cool thing is LEM normalizes the data. So, when you're talking about exactly that, when you have multiple systems, they all have their own language, right? LEM makes it a common language by normalization. 
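To make the normalization idea concrete, here is a minimal Python sketch. It is not LEM's actual parser. The host names, the log lines, and the event-type labels ("LogonFailure", "FileAccess") are assumptions chosen for illustration; the point is only that differently worded messages from different systems get mapped to one common event type, so they can be summarized across the whole environment.

```python
from collections import Counter

# Assumed markers for a few well-known message formats, mapped to a common label.
MARKERS = [
    ("EventID=4625", "LogonFailure"),             # Windows failed logon
    ("Failed password for", "LogonFailure"),      # OpenSSH sshd
    ("SEC_LOGIN-4-LOGIN_FAILED", "LogonFailure"), # Cisco IOS login failure
    ("EventID=4663", "FileAccess"),               # Windows object access attempt
]


def normalize(host, raw):
    """Map one raw log line to a common event record."""
    for marker, event_type in MARKERS:
        if marker in raw:
            return {"host": host, "event": event_type}
    return {"host": host, "event": "Unknown"}


# Hypothetical raw logs collected from three different kinds of systems.
raw_logs = [
    ("box-a", "EventID=4625 An account failed to log on. Account=admin"),
    ("box-b", "sshd[812]: Failed password for admin from 10.0.0.5 port 22"),
    ("rtr-1", "%SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: admin]"),
    ("box-a", "EventID=4663 An attempt was made to access an object."),
]

events = [normalize(host, raw) for host, raw in raw_logs]
summary = Counter(e["event"] for e in events)
failure_hosts = sorted({e["host"] for e in events if e["event"] == "LogonFailure"})

print(dict(summary))
print("Hosts with logon failures:", failure_hosts)
```

One failure per box looks harmless on each machine; the combined view is what shows someone skipping across the environment.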
So when you look up a log-on failure, you can see all the log-on failures, so you'll know what was touched, what that user did throughout the entire network. Did they touch a router? Did they touch a switch? Did they get on to a server? You know, did they make a change somewhere? You can easily figure that out. So, for the example here, I see those log-on failures. Well, let's take a look at that. So, I'm going to double click on that log-on failure. It adds it automatically up to my search bar right here. Then I can hit the search button again, and now it's going to drill down into those events within that timeframe, right? So, quickly comes in, I can see it was an admin account that was being used, right? I see the actual system that the admin account sourced from, so I know this IP address was the source machine, right? And then I can look at other activity, like here is the system they accessed right here. All right, so now I've got a source, an account name, and a destination. All right, and then I can just continue to drill down from there. All right, so that's one way to do it. So, let's say, for example, we want to look at everything that the admin had done in that period of time. So, now we'll switch gears a little bit. We'll come over here, and we'll toggle to a text-based search. And so, now, I can type in admin. Right, or administrator, or admin star, or whatever I need there. Right. Then go in, switch my timeframe. I'll do that last day--hit the search button. Now, I can immediately look at the log results if I want to. Right? But most of the time, that's going to be overwhelming. So, what you want to do is look how LEM summarizes it over here. So, I can see there were over 6,000 logons. All right, I see policy modifications. That's a light bulb going on right there, right? That's one of the things that I want to investigate. I see a software install, all right? File create. 
These are all things that could prompt further investigation, but the advantage is that LEM summarizes it, makes it much easier for me to just jump and dig down into those details. So, what I usually tell people that I instruct using LEM is start simple, all right? Don't make it more complicated than it is. Obviously, a lot of these hacking attempts can be complicated, but detecting them and determining root cause is typically much more simple. All right, so you want to look at, think of the simplest thing that could go wrong to allow access, or to allow a compromise, and start there. So, like, in this case, I just type in the username. Then what'll happen is, is that will trigger more detailed investigation, then you'll eventually kind of figure out what's going on. Makes perfect sense. Now, what I like to always say is, is, when we talk about where we can be more proactive with security, right? So, we've been hacked. We found out what happened, right? Maybe it's access. What we want to do is turn that into notifications. So, we want to leverage LEM's correlation capabilities to notify us ahead of the game. So, if we see that activity again, even if it's benign, we can still go in and say, let's check that out, all right? It's an alert we need to review. And that's a good point, because we are going on the assumption now that we've been hacked, but the hackers, who are still in our environment, don't know that we're onto them necessarily. So, this gives us a chance. This is a tool that they wouldn't necessarily be watching for off to the side, and therefore, we can say, well, I've seen that pattern, let me wait for it to happen again. And now you actually know the thief is in the house, not just that they came and took something, and left again. Exactly, exactly. So it's levering what you've learned, or leveraging what you've learned from investigating an attack to use it, you know, to identify more behaviors in your network. 
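The step from investigation to notification can be sketched as a simple threshold rule. The Python below is an illustration under stated assumptions, not LEM's correlation engine: the event tuples, the three-failures-in-sixty-seconds threshold, and the account names are all hypothetical. It alerts when one account produces a given number of logon failures inside a sliding time window.

```python
from collections import defaultdict, deque


def correlate(events, threshold, window_seconds):
    """Return (account, timestamp) alerts for repeated logon failures.

    events: iterable of (timestamp_seconds, account, event_type) tuples.
    An alert fires when `threshold` failures for one account fall within
    `window_seconds`; the account's counter then resets.
    """
    recent = defaultdict(deque)
    alerts = []
    for ts, account, event_type in sorted(events):
        if event_type != "LogonFailure":
            continue
        window = recent[account]
        window.append(ts)
        # Drop failures that are older than the window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            alerts.append((account, ts))
            window.clear()
    return alerts


# Hypothetical normalized events: (seconds, account, event type).
events = [
    (0, "admin", "LogonFailure"),
    (20, "admin", "LogonFailure"),
    (30, "admin", "PolicyModify"),
    (45, "admin", "LogonFailure"),
    (50, "kong", "LogonFailure"),
    (400, "admin", "LogonFailure"),
]

alerts = correlate(events, threshold=3, window_seconds=60)
print(alerts)
```

The same shape works for any pattern learned during an investigation: swap the event type and tune the threshold to the environment.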
All right, so to do that, we come over here to build, and then rules, and you can see we've got hundreds of rules that are prebuilt. If you go under security best practices, you will see a bunch of rules that focus specifically on that, as well as other rules that are broken down into other categories. So, these are the rules that you go in, many of them are already configured with notification, but keep in mind that you can design your own correlation rules as well, because, you know, everybody's network is different. So, you might tweak existing ones or create your own to actually help you focus on particular behaviors. That's fantastic. And again, LEM is a big tool; it's a big investment in terms of learning it. It's a big departure for people who are used to our NPM... Right. ...SAM kind of tool set. However, I think what you've shown the last time you were here, and again, this time, is that it's really not so completely overwhelming to get into. And you can have good results, and you can find good information right off the bat once it's there. Exactly. Usually what I'll tell people is, understanding that you're in a busy IT environment, understanding that you typically wear multiple hats, spend an hour a week, an hour a week with this product, and typically within a four-to-six week timeframe, you'll really start to get an understanding. Break it down into segments and figure things out from there. So, the last thing we'll cover, and probably one of the more important things, nowadays especially, is in order for people to compromise a system, there's typically some sort of change that has to happen. That might be, you know, change in, like, SAM, or what are they using now? It's not even SAM; it's the Active Directory, some change in Active Directory. SID is what I'm looking for. Ah. So, a SID file, or going and making a change to a system file, or something like that that would allow somebody to come in through a back door, or just gain access to that system. 
So, file integrity monitoring is pretty popular nowadays, because it's so important to do that. And that is also integrated into LEM, right? So, if you deploy our agent out to operating systems that you want to cover, particularly Windows, you can come in and enable the file integrity-monitoring piece, all right? So, if I go into connectors here. And then again, like other areas of the Log & Event Manager, there are built-in templates that you can take advantage of that are already monitoring the critical system files of any server. So, we'll go in and we'll type in FIM here. All right, you can see we've got a couple. We can monitor registry or file, all right? So, then all I have to do is go in, select new, and you'll see there's DNS server, proxy server, and start-up programs: some of the common things that you want to monitor on any given server. All right, but you can also go in and enable additional specific files that you want it to monitor. Maybe a specific directory, maybe it's a sensitive share that contains information, you can add that as well. And this is one of the things when Home Depot first announced that they had been breached with the Backoff malware, and you know, when you looked at the announcement of how it worked, it was very simple. It was just a couple of files that got dropped on the machine, and a couple of connections out to some relatively well-known ports. But it became a little insidious to find because the files were dropped somewhere in the user's directory. Right. Some account within users. Well, scanning users slash star for files that match a star pattern is a little bit tricky when you're trying to do that in privileged file structures with a PowerShell script or whatever. So, here, this is just a much easier way to get there. 
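The core idea behind file integrity monitoring can be sketched in Python as a hash baseline plus a comparison. This is a generic illustration, not how the LEM agent works; the file names and contents are made up, and a temporary directory stands in for a monitored folder such as a user directory.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Return {relative_path: sha256_hex} for every file under root."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as fh:
                digests[os.path.relpath(path, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def diff(baseline, current):
    """Compare two snapshots and report created, deleted, and modified files."""
    return {
        "created": sorted(set(current) - set(baseline)),
        "deleted": sorted(set(baseline) - set(current)),
        "modified": sorted(
            p for p in baseline.keys() & current.keys() if baseline[p] != current[p]
        ),
    }


with tempfile.TemporaryDirectory() as root:
    # Hypothetical monitored file, captured in the baseline.
    with open(os.path.join(root, "hosts"), "w") as fh:
        fh.write("127.0.0.1 localhost\n")
    baseline = snapshot(root)

    # Simulate an intrusion: one file altered, one file dropped.
    with open(os.path.join(root, "hosts"), "a") as fh:
        fh.write("10.9.9.9 updates.example\n")
    with open(os.path.join(root, "dropper.bin"), "wb") as fh:
        fh.write(b"\x00")

    changes = diff(baseline, snapshot(root))

print(changes)
```

Pointing a check like this at a small set of sensitive paths, rather than every file on every machine, is what keeps the false positives down.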
You end up thinking that if Home Depot had this on their point-of-sales systems—we're not saying this goes on every single machine in your 10,000-machine environment, but point-of-sales systems are really sort of an integral area for you to protect, that this could have been avoided. Absolutely, yeah. Targeting what you want to monitor, targeting what you want to be alerted on, will do a couple of things for you. One, significantly reduce false positives, and two, that's the data that you want to protect, right? So, why not put something in place to help you do so? Okay, so you saw some pretty interesting things that we did with LEM, right? But what we really want to do is complete the tool set. So, that will involve getting an idea of what's happening at the end-point level, right? Now, LEM's going to give you a portion of the forensic capability there. Being able to go back, to see what files were touched, to see, you know, if somebody deleted something, or somebody made changes to a system. You know, it's also really helpful at the perimeter and the server level. Where Patch Manager comes into play is that it can give you an idea of what is on those end points. So, most people think of Patch Manager as, okay, I've got to go and patch vulnerabilities, right? It goes beyond that. There are some really exhaustive inventory capability in there that can let you know what firmware, network ports that are open, and you know, other aspects of the end point that can lead to you figuring out where that, you know, hack—where they got in. Right. What system did that take advantage of? So, we're going to do that. So, let me show you kind of a few tips and tricks here that we can take advantage of and patch. Okay, so, continuing on the lines of, you know, forensics analysis, or investigation, we talked about LEM, right? How you have that capability to go in through logs. 
Well, another way to help you with forensics, and also to help you prepare for, you know, assuming that you're always hacked, is to understand what's on the systems that you're monitoring. Now, Patch Manager is really good for that because of its ability to really inventory the systems that you have. And it goes beyond just what patches were installed. That's typically what people think about when they're talking about patching. I ran a vulnerability assessment, I have a list of vulnerabilities that I have to patch, I have to compare that to the systems that I have in place and what they have installed, and then push out patches. Well, Patch Manager can really extend your resources in an investigation as well, because you can schedule and/or do ad hoc inventories of all the systems that you're monitoring, and really get an understanding of what's installed. You can look at firmware, you can look at software, you can look at network ports that are open on the system, any firewall configurations, like operating system firewall, any other software that's installed, along with any of the vulnerabilities that, you know, you have to address on the systems, all right? So, within Patch Manager here, it sits on your WSUS environment, it pulls in the existing WSUS reporting options that you have there, but gives you much more extensive reporting beyond that. And if you look over here under configuration management reports, you can see how it breaks it down into different categories, right? So, I've got network information. This speaks to those firewall settings; it does things like a netstat report so you can look at other kinds of network configuration options that are on there, the network connections. All right, so it'll scan that system and tell you what network connections are available, right? And then if you go down, it looks at other things, right? So, I can look at GPO. 
I can do a GPO check, a policy check on my systems to make sure that the Active Directory policies that I've put in place to protect those systems are actually in place. So, it's a way for you to figure out, where was there a back door or some sort of opening that could be used to compromise a system? Now end points are becoming more and more important in networks, though, because people are understanding the power of insider abuse. And we actually did a couple of surveys, just recently, where there was kind of neck and neck, the need to monitor the external access, which is important, right? But also the need to monitor internal. The insider abuse story, right? So, we're hearing a lot more activity around that type of security event. So, Patch Manager can help you find out what happened inside those systems. So, if I come over here, I've got a couple of reports. Okay, so this one's just a simple update status report. So, this is your report that you use to find out if your systems are vulnerable at any time, right? So, I can run one for now, but it keeps a history. So, like in LEM, I can run this report for, let's say the last month, or something like that, and I can find out if there was a patch that was not installed right on there, and that could have been where I was vulnerable. Okay, and the cool thing about patch is that you can take action right away. So, if you see that, I can then go in and highlight these systems, right, just select what I need, right click, or use the management tools up here and push those updates right away. So, you can correct on the fly if necessary, as well. Another handy tool within patch manager. Another report that I have here is my network connection report. Now, you can see I ran a scan for four or five systems right here, and really only one came back with a connection, and it's a share, right? And you can see that this, and it tells you, resource remembered tells me that is a share that has been activated on that system. 
So, every time they boot up, it's always going to reconnect to that share. Then over here, it shows exactly where the share is going to, 2008r264, so why am I going to get an operating system? Am I trying to do something, am I pirating operating systems in my... you know? So, these are some of the things that you can investigate with Patch Manager. It will tell you that. Right, and again, the ability to go back in history, especially because of the way that some of these, you know, Trojans and malware are working, with random, occasional connections out to a place, out to a share, dropping off a keylogger file or whatever it is. So, you know, the hard part for a lot of security professionals or IT pros in general is, well, if I wasn't watching it at that moment, I wouldn't know. Well, this is watching it at that moment. We can go back and say, yeah, look, every six hours, it's phoning home to the mothership, dropping off its payload, and then it's disconnecting, but you've still got a record of it. And I think that really goes into an important mantra that I like to preach when it comes to security, which is that number one, security is everybody's job, right? And number two, security needs to be in every aspect of your network policies and procedures. Right, so everybody who's involved with IT needs to be thinking about security in everything that they do. So, and then lastly, use the tools that are out there to make that job quite a bit easier. And tools like this can help you because they help you centralize, they help you kind of unify the environment, and then they help you respond to security incidents that much quicker. It seems like you can detect the signal from the noise, you know, the signal being the abnormalities, much quicker with a tool like Patch Manager and LEM. Yeah, definitely. Yup, exactly. [Electronic zapping] All right, corporate systems are good. The damage was just limited to the lab. 
Yeah, if we didn't have the back-up configs and the VM best practices in place, we would have been out for days instead of minutes. Oh, definitely. So, any clue as to who did this? Well, when we were scripting out this episode... You guys do that? We bounced around a few ideas about who the villain was going to be. A teenager trying to impress his almost girlfriend and almost starting a nuclear war? Hey, I would start a nuclear war for Ally Sheedy. Shall we play a game? Or a shadowy figure wearing a hoodie, typing away? Yeah, because that's how I dress whenever I'm hacking. Chris Hemsworth? Actually, Lawrence was the stunt double in that movie. Or a semi-aggressive nation trying to get back at us? You know, my vote was for Prince Humperdinck of Guilder. Yeah, you killed my server. Prepare to die. Actually, I like the idea of Humperdinck kind of Photoshopped onto a guy in a gray jumpsuit. You know, but we realized, after going through that exercise, that the threat is just so real and so common for most enterprises that we just didn't want to be glib. Plus, by the time all this... [Zapping] happens, the people responsible are so long gone, you may never really know. Right. Nor is it your first order of business to find out, at that point. Right. Your first order of business is probably getting your first business order back online. Right. Exactly, the best time to deal with your company getting hacked is before it becomes obvious that it happened, which is right now. Right, and it really is obvious after the Sony intrusion, but this is everyone's issue. I mean, it's not just the security and the audit wonks that worry about it. And I really believe that a comprehensive monitoring plan in accordance with actual security policy can really go a long way to prevent this sort of thing in the future. Right, as we said at the top of the show, you have probably already been hacked. You just don't know it. 
Right, and your job is to make sure that management understands this so everyone can get to working on the solution before it hits the news. So, let's get to it. All right, let's get to that. For SolarWinds Lab, I'm Patrick Hubbard. I'm Leon Adato. I'm Kong Yang. And I'm Rob Johnson. Thanks for watching SolarWinds Lab.