It’s not every day that SolarWinds releases an entirely new module for the Orion Platform, especially one dedicated to that unsung hero admin tool: log management. In this episode, join product manager Jamie Hynds and Head Geeks Patrick Hubbard and Leon Adato as they walk you through how to use the new module, review best practices to get the most out of syslogs, and of course cover how-to’s for deployment management and configuration. We’ll also review the rest of the SolarWinds logging products, discuss their differences, and compare how each operates. This is the first SolarWinds Lab to tackle logging.
Hello and welcome back to SolarWinds Lab and with us today all the way from Cork, Ireland, is Jamie Hynds. Jamie is the Product Manager who owns something new, something that we haven’t seen in a really long time and that is a brand-new module for the Orion Platform.
Hey, guys, it’s great to be back.
Yeah, and so, we had you here in Austin for like, what? A year and then you decided to go back home.
Correct. The Guinness was calling me, folks. But great to be back in town in Austin here again for the Tex-Mex barbecue and finally a bit of sunshine.
Yeah, and speaking of sunshine, back again with us, Leon Adato.
Yeah, so, there’s just a lot going on. We’re recording a lot of episodes of Lab, there’s a few other things that we’re working on that you’re going to see more about coming up soon. And of course, when I heard that we were talking about logging, I love all things logging and so I had to get on set and be part of this.
Yeah, you care about logging and also, we’re working on THWACKcamp, so…
Yeah, so, we asked Jamie to come by the Lab today because the new module is Log Manager and he is the guru and so, we’re not calling this LMO or something else, right?
Correct, it’s LM for Log Manager. Nice and short. So, Log Manager is essentially a unified view of log monitoring with net performance data in the Orion console. So, something you guys have asked for a long time is improved log management within Orion and that’s exactly what Log Manager delivers.
Yeah, and so that’s why we figured that you’d probably want to hear about that today. So, let’s do this. First, it is completely new. And although there are going to be a bunch of how-to’s in the Customer Success Center and hopefully, you are all using the Customer Success Center. You remember, Google “SolarWinds Success” and then how-to’s on just about anything you can think of. We know that you really do care about logging and are going to want to see how to configure this.
Right. They’ve never seen it and are going to want to know how to configure it, use searching, and understand the log data they’re looking at.
Mmhmm. Well the second part of that, though is we’re going to do something that we’ve never done on Lab before and I know, Leon, you and I have talked about that. Which is we’re going to tangle– untangle all 11 logging tools that SolarWinds offers.
11. I just…
Yeah, okay, well maybe 12 now with Log Manager?
Ugh, right, I mean. Okay, okay. I think I can actually defend that number and what they’re all used for.
Okay. We’re going to try to do that. Now, we’re not going to spend all the time talking and certainly, there’s no reason to show all 12, but at this point, there are more logging tools than any other category and we get so many questions from you in person when we see you at events, or at SWUGs about what’s the best tool for the best challenge and so hopefully, this is going to be a chance for you for the first time to kind of see all of those together and get a sense for what works better. ‘Cause in a lot of ways, logging can be the most personal of all network monitoring tools.
Well you do have one other secret, don’t you?
And that would be the new add-on, SolarWinds APM powered by AppOptics.
Yeah, that’s kind of a mouthful, right? But that’s basically application performance monitoring for distributed applications in Orion, right? So, it’s, you take all the tracing infrastructure goodness from the AppOptics cloud monitor, but then integrate it into Orion via SAM. So, it’s not quite a module, but it’s certainly another extension.
That’s really cool.
It is very cool. And it’s integrated into the Orion Platform, which is really the big deal. But the question in my mind is does it really count as part of the logging category?
Well, if you think of log messages as events you might want metrics on, then yes.
Yeah. So then that would be 13 logger, loggish products.
Okay, you two. How ’bout we talk about it for less than five minutes at the end and let them tell us if they think it’s a 13th logging tool or not.
Okay, let’s do that. So, we’ll start with number one. What is Log Manager for Orion and how do you configure it? It would be easy to say that this is just the existing Orion Platform logger just on steroids.
Incorrect, sir. [laughter] This is…
Built up from the ground up, this is a brand-new product.
But you can see why we would get a little bit confused, I mean, you know, Leon and I have been talking about, especially, he’s just got back from a SWUG where they were actually live-tweeting basically the revelation that the Orion Platform now supports what’s basically four X more powerful and you can do 400,000 elements.
Right, 100 more additional pollers. I mean, it’s just, you know, it’s scaled up everywhere, so…
And it’d be easy to say, “So the scalability’s just improved,” but you’re saying that’s not true. This is something different.
This is something different, yes. So, like, from, you know, why another 13th, as you said, log management tool? So, you know, what we’ve, from talking to you guys, we’ve seen this, you know, log management and network performance data are often separate tools. So, you could have a great tool like NPM triggering alerts and identifying issues but then having to drill into your log data might require another SolarWinds tool. So, you don’t have that single pane of glass and unified view between log management data and network performance data. So that’s kind of what Log Manager for Orion aims to do is to allow you very quickly to drill in from your, say, network monitoring tool down to that log data to identify an issue that could be affecting performance or availability and sort that problem out quicker, essentially.
So, you know, log data has a more real-time nature so, if you’re relying on those SNMP polling intervals, so you have a gap of visibility, you know, every two minutes or whatever it may be, so log data can give you more visibility during those poll intervals and maybe identify issues that you simply can’t get with SNMP.
And that was something that, you know, we, Leon and I have sort of talked about before on Lab about whether, you know, sort of what is opinion and what is truth, right? So, if you’re polling every some number of seconds or minutes, you’re getting an opinion on the performance of a thing. But if you’re actually looking at the log events that are coming from those systems, you know what the truth is about it. Or at least, the truth that didn’t get lost in transfer.
The current truth, yeah.
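To make the polling-gap point concrete, here is a minimal Python sketch. The numbers are hypothetical (a 20-second interface flap and a 120-second poll interval), and the function name is invented for illustration; it only shows that a short event can fall entirely between two polls while the log messages still record it.

```python
def polls_inside_window(start, end, poll_interval, horizon):
    """Return the poll times (in seconds) that land inside [start, end)."""
    poll_times = range(0, horizon + 1, poll_interval)
    return [t for t in poll_times if start <= t < end]


# Hypothetical flap: the interface is down from t=130s to t=150s.
FLAP_START, FLAP_END = 130, 150

# Polling every 120 seconds over ten minutes: does any poll see the flap?
hits = polls_inside_window(FLAP_START, FLAP_END, poll_interval=120, horizon=600)
print("polls that saw the flap:", hits)

# The device's own log messages carry timestamps for both transitions.
log_events = [(130, "Interface down"), (150, "Interface up")]
print("log events during the flap:",
      [msg for t, msg in log_events if FLAP_START <= t <= FLAP_END])
```

With these made-up values, no poll lands inside the flap, but both log events do.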
Yeah, so what is the… Where do most people start with this? What’s the first thing that they would see when they have it running?
Good. So, let’s say for example, I’m in my Orion dashboard here and I notice a problem, let’s say, on a Cisco device. So, I can see all my usual metrics here and performance data I pull. If I drill into the Node Details page, you’ll view all the information that you always get when you have Network Performance Monitor. However, you’ll notice on the management resource here, you’ll see an Analyze Logs button. So, let’s see, you’ll see, maybe as an example here a hardware component is in warning or critical state. But I’ve no information as to, you know, why is that or what’s causing that or what’s going on right now in this device that could be triggering that alert. So now, I can drill down directly from here thanks to the Analyze Logs button and that’ll then take me into Log Manager for Orion, and show me the log data just for that one specific node. And I can immediately see there’s something happening here and it’s a syslog coming from that device to tell me the fan has a rotation error. So, SNMP might come back and say there’s been a failure. However, you can see now with your log data, you can see very quickly what that exact error message is and how often it’s occurring.
And also, exactly when it started.
Exactly. So, you know, for that Node Details view directly into your log data so if there is a fire that needs to be put out, and your job is to, you know, identify and troubleshoot or remediate that issue, you can very quickly get from your Node Details right down to your log data with one click. So, if I come into the Log Viewer here, worth noting to get to Log Manager for Orion, it’s like any other Orion module. You’ll see Home, Network, Application, Configs, et cetera, and now with Log Manager for Orion…
It’s going to add a new sub-element for Log.
Correct. So, so when I go to Analyze Logs, it shows me logs for that particular node. I can come into Log Viewer and it’ll then show me logs across all my various different devices.
So, the difference here was in the first example, it was basically filtering out those log entries that were related specifically to raising that event.
In this case, it’ll be everything for that system.
And it’s very interesting because it’s very similar to the way that we’re structuring PerfStack, which is that you can look at the entire performance stack and pull in other elements or you can go on the Node Details page and specifically see the performance elements of that item, so I like the way that we’re keeping it consistent. You know, two means of getting to where we want to go with it.
Exactly. So, in my Log Manager for Orion Log Viewer view, you could be sending Syslogs and traps from, you know, a huge number of them, of devices…
Yeah, ’cause this is, this supports what? Just under a hundred million a day.
Yep. So, and worth noting it’s node-based as well, so you know, a lot of log management tools license based on volume. So, Log Manager for Orion is similar to say, NCM or LEM. Very easy to calculate your license, it’s down to number of nodes you have. So, on the left-hand side, we have some great out-of-the-box filters to help you drill in on the logs that are important to you or to an important issue, we’ll say. So, let’s say I want to drill in and I just want to see maybe warning messages. So, each Syslog that comes in is going to have the severity, I can quickly see my warnings, and I can again, see, you know, is it a, an issue with the fan here, or maybe someone’s tried to log on to a device, and I can also, on the right-hand side here, you’ll notice the event details, which will actually show you more information. So obviously, there’s only so much information we can show across the main screen here. Click on any event and you can view more information there on the entry details.
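For readers wondering where a Syslog severity comes from: in the standard syslog format, each message starts with a priority value in angle brackets, and the severity is that value modulo 8 (the facility is the value divided by 8). The Python sketch below decodes it. The sample message and the function name are made up for illustration, and this is not a description of how Log Manager parses messages internally.

```python
import re

# Standard syslog severity names, indexed by severity code 0-7.
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(line):
    """Return (facility_code, severity_name), or None if there is no PRI."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, SEVERITIES[severity]


# Hypothetical message: PRI 188 = facility 23 (local7) * 8 + severity 4.
sample = "<188>Oct 11 22:14:15 switch01 fan rotation error"
print(decode_pri(sample))
```

Filtering on "warning" in a viewer then amounts to keeping entries whose decoded severity is 4.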
So now there’s some normalization going on here behind the scenes, right? Because if this was an IIS log, for example, that was coming in in a warned state, you would still also see that it was in a warned state. So, is Log Manager deciding here across two different message types that it’s still a warning and putting in the right category?
So, a tool like Log & Event Manager does say, full-on normalization where we provide lots of out-of-the-box parsers for you to fully normalize those events. With Log Manager for Orion, we do some level of normalization, in terms of parsing out, let’s say the date and the time, and the severity, as you said, and we’ll show a severity indicator on the left-hand side. However, we don’t do the same level of normalization we do compared to LEM. However, a user can tag events. So, you know, obviously, logs are quite, let’s call them texty by nature so, it can be a bit of a challenge to scroll through all these various different logs and try and pick out key words, et cetera, but we have the ability here to tag events. So, if I say, maybe I can see Configure from Console. That matches a config tag. Not only can you tag, but you can also color-code the tag. So, if I want to say, let’s say a hardware failure is a red tag. I could say, hardware failure, red tag, and when I’m looking through those millions and millions of events, I can then very quickly hone in on that particular tag that could be of interest.
And so, a lot of times, that’ll be context-based, right? Like, instead of saying what might be a perfectly good status for a message which would be, “Admin logged on to Server A from Workstation B.” It should actually be red, warning. Somebody logged on as an admin from that system, right? So that’s what that tag… That’s what that tag system is for so that you can actually extend the specificity, get better context information on those messages.
Right, and one of the things I’ve, when I’ve talked to folks about it is that the logging tools also have to have that visual component because a lot of times, it does end up on the NOC view. It does end up being something that people are keeping an eye on. As much as I always talk about, you know, you can’t hire more eyeballs to stare at screens, the fact is, is that logging data, especially is something that people are very interactive with a lot of times.
So, you know, that use case there, you mentioned, Patrick, is kind of more, maybe security focused. So, worth mentioning that Log Manager for Orion is kind of primarily geared towards IT Ops. So, finding the issue that’s affecting performance or availability. But say, for an authentication event, let’s say you’re on your Node Details page, you can see something’s going on, I can then drill in very quickly, and as you can see on the screen here, there were logins to that device. So, Admin has logged on to that device. Maybe they’ve made a config change, or changed an ACL, or something along those lines that has had a knock-on effect somewhere. So, you can see who’s authenticated to the device, maybe have they made a config change, and that’s after raising a ticket, or raising an alert in NPM, whatever the case may be. So, I can very quickly come in here, let’s say look for configs, and I can see as a result of that login, I can then see that Admin made a change. He maybe brought down an interface, and that’s causing havoc somewhere on my network and I need to put out that fire. So, let’s say, another good example, actually is if a user is making changes on a device and you want that real-time visibility of what’s going on in that device, you can see Live Mode on the top right here. So, what Live Mode is, it’s a near-real-time stream of log data as it occurs.
So, it’s Live Tail?
Pretty much. So, let’s say I come into Live Mode here. You’ll notice the time stamps on the log will start to update. So, I can see all my various logs come in as they occur. Again, getting that near real-time visibility compared to SNMP polling. However, if you’re sending Syslogs and traps from a huge number of devices, it can be quite hard to decipher, you know, trying to find that needle in a haystack, as you well know, Leon. So, for that reason, we have included the ability to kind of filter that log data and also perform searches on that log data. So, using that example earlier on, let’s say the fan, was there an issue on the fan? I now want to look for logs that include the keyword “fan,” and I can now see as my screen updates, I can see I’m now focused on logs that just contain that keyword. I can see they’re coming from, let’s say this device here, I want to click on this particular device and only show me messages that contain the word “fan” from that IP address and then I can see in this case, it’s a service recommended.
Right, so just to, just to clarify, we’re looking at real-time incoming messages, we are filtering them based on a particular set of criteria, and we’re also searching that filtered criteria, and it’s all happening in near-real-time?
Exactly. All with the goal of identifying issues affecting performance or availability.
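The combination described here (a live stream, narrowed by a source filter and a keyword search) can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's implementation; the entry fields `source_ip` and `message`, the sample addresses, and the function name are all assumptions.

```python
def live_filter(stream, keyword=None, source_ip=None):
    """Yield entries from a stream as they arrive, applying optional filters."""
    for entry in stream:
        if source_ip is not None and entry["source_ip"] != source_ip:
            continue
        if keyword is not None and keyword.lower() not in entry["message"].lower():
            continue
        yield entry


# Hypothetical incoming entries; in practice this would be an endless stream.
incoming = [
    {"source_ip": "10.0.0.1", "message": "Fan 1 rotation error"},
    {"source_ip": "10.0.0.2", "message": "Admin logged in"},
    {"source_ip": "10.0.0.2", "message": "FAN tray: service recommended"},
    {"source_ip": "10.0.0.1", "message": "Interface up"},
]

for entry in live_filter(iter(incoming), keyword="fan", source_ip="10.0.0.2"):
    print(entry["source_ip"], entry["message"])
```

Because `live_filter` is a generator, it handles each entry as it arrives instead of waiting for the whole stream, which is what makes a near-real-time view possible.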
So, you might have noticed on the top of the screen here as well we have this really cool looking chart. So, with the chart, apart from looking great, it serves a number of purposes.
That’s a great looking chart!
That’s a pretty chart. We love our pretty charts.
So, in here, I can easily, first of all, identify, you know, how many logs are occurring. So, let’s say you’ve typed in a particular keyword, or event ID, or error number, et cetera, and you want to see, you know, when that occurred. Or let’s say there was an alert that triggered in NPM, and I know that triggered at, let’s say, around 10:50 ish. I can then come in here, drill into my timeframe…
And get the correspond–
And then get the corresponding logs. And from there, I could then see, you know, look at who logged on to the device, did they make the config change? What happened on my device? And then from there, I can see how many events occurred, and I can also refine my timeframe. So, if I know a user called me up at let’s say, exactly 10:51. What happened at 10:51? Drill in here, and I can get even more granular to see exactly what happened.
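The chart interaction boils down to two operations: counting entries per time bucket, then narrowing to a window of interest. Here is a rough Python sketch, assuming each entry carries a `time` field; the dates and messages are invented sample data.

```python
from collections import Counter
from datetime import datetime


def count_per_minute(entries):
    """Count entries per minute, which is what a log volume chart plots."""
    return Counter(e["time"].replace(second=0, microsecond=0) for e in entries)


def in_window(entries, start, end):
    """Return the entries whose timestamp falls in [start, end)."""
    return [e for e in entries if start <= e["time"] < end]


# Hypothetical entries around an alert that fired at roughly 10:50.
entries = [
    {"time": datetime(2018, 1, 1, 10, 50, 5), "message": "Admin logged in"},
    {"time": datetime(2018, 1, 1, 10, 51, 10), "message": "Configured from console"},
    {"time": datetime(2018, 1, 1, 10, 51, 42), "message": "Interface down"},
    {"time": datetime(2018, 1, 1, 10, 53, 0), "message": "Fan rotation error"},
]

for minute, count in sorted(count_per_minute(entries).items()):
    print(minute.strftime("%H:%M"), count)

# Drill in to exactly 10:51, the minute the user called.
for e in in_window(entries, datetime(2018, 1, 1, 10, 51), datetime(2018, 1, 1, 10, 52)):
    print(e["time"].time(), e["message"])
```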
We never have to go to the logs after the fact. Hours later or days later, that never happens.
Yep. So that’s, that visualization can again, apart from searching and filtering can help you find that needle in the haystack even quicker with that chart interaction.
Right, and this is something that again, we hear about from technicians, you know, from IT pros a lot is that the whole story, you know, initially, they get the device is down. Alright, it’s down, but what was happening? Well, it got hot. How do you know? Well, because the temperature in the closet where they stuck the router is hot. But, the real process, the real story was that, like you saw, the fan rotations were impacted. Not the fan went down, but the fan rotations were impacted. And then the temperature started to rise, and then it hit a threshold, and then it did a solid reboot, and then you have an entire story, and you can begin to go back and fix it. But the process of getting all those little discrete pieces of information was cumbersome in some cases. So that’s, you know, this just makes it so much easier to again, interact with that data.
Well there is something else here that I really do like. Which is– and, of course, I’m going to just be really obnoxious and you know, bring out a Raspberry Pi and stick it down on the table. Because you know, we’re going to have to do that.
He just happens to have them here.
You know. But, so, let’s just say this is an IoT device, right? So, I’m doing log aggregation and what I really need to be able to do is tail. So, I’ve got Docker, I’ve got the containers that are running on it, the application itself, there’s stuff that’s going on in AWS, these are AWS IoT and Azure devices so, in that case, I’m pulling a lot of things together. Now typically, with the existing Orion poller, I would be… I could definitely look at my Syslog messages, right? But in terms of being able to see them in real time, flip back over here for a second to the tab for Papertrail. And you can see that what’s happening here, this is basically coming from this guy’s Pi light down here, right? So, he’s playing through a whole list of different categories, but it’s also pulling out the collectee, General Control, all of the Docker itself, everything is being aggregated to this point. Now, I’m used to being able to search dynamically. Like, if I wanted to actually say, “Hey, just give me the ones that are my continuous deployment pipeline events, plus the play events of the different color cycles that it’s going through,” this idea that oh, I have millions of events that all get condensed into one flow that I can then execute queries on and interrogate and find those needles, but when I look at it in Log Manager, this is something that I wouldn’t typically have used the Orion Platform for. Right? Like, if I got to really high volumetric data, I would typically say, “I’m going to put that somewhere else,” I’d maybe build my own elastic search, I’m going to do something else that really takes that. So, what we’re hearing from customers is it’s not just Syslog anymore. 
It’s all of these systems are generating so many messages, that it really does take something with a different set of capabilities, like being able to search in real time, you’re going to be spending a certain amount of time configuring logs or identifying data, and the longer it takes to make that search, the slower it is to actually get to the answer. So, in terms of like, rapid, root cause identification, and troubleshooting, that was one of the things that through, actually, the early Alpha, for those of you who are part of our Usabili-buddy program, and then certainly, through the Beta, that that was one of the things that you focused on the most which is making sure that that focus is on interactivity and being able to use it as a discovery engine as much as anything else. So, would you say we recommend that they really get to know searching?
We’ve kind of made it as easy as possible, to be honest, Patrick. There isn’t a need for you know, a big learning curve to learn complex queries, et cetera. We’re leveraging SQL full text search here to make search very performant and easy to use. So again, to get you down to that log data you need very quickly.
So, this is RegEx, this is Wild Card, what’s the, what’s the spec?
So, we do basic pattern matching with Wild Cards too. For RegEx use cases, I’ll show you the rules in a second and from there, you can actually look for RegEx patterns and then tag an event from there to make it very easy to identify issues that match a particular RegEx pattern and tag from there.
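To show the difference between the two styles of matching, here is a small Python sketch using the standard library's `fnmatch` (shell-style wildcards) and `re` (regular expressions). The patterns and messages are invented, and the product's own wildcard and RegEx syntax may differ from Python's.

```python
import fnmatch
import re

# Hypothetical log messages.
messages = [
    "Fan 2 failed",
    "Fan 1 rotation error",
    "Admin logged in from 10.0.0.5",
    "Fan tray inserted",
]

# Wildcard matching: '*' stands for any run of characters.
wildcard_hits = [
    m for m in messages if fnmatch.fnmatchcase(m.lower(), "*fan*fail*")
]

# A regular expression can be more specific: a fan number followed by
# one of two failure phrases.
pattern = re.compile(r"fan \d+ (failed|rotation error)", re.IGNORECASE)
regex_hits = [m for m in messages if pattern.search(m)]

print(wildcard_hits)
print(regex_hits)
```

The wildcard is quicker to type for ad hoc searches, while the regular expression suits a saved rule that has to catch several variants of the same problem.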
So, all the time I spent learning Regular Expressions, it’s not wasted. I can still…
You old die-hards can still. You can still…
I have credibility I have to maintain, so.
No, no, no. It’s stupid admin tricks. Like, you’d want to actually show somebody your RegEx and then they just have to take your word for it that the data that they’re looking at is correct ’cause there’s no way they can decipher.
So, that’s a good, a good point, guys. So, I’m going to drill into the Rule Configuration within Log Manager. So, with the Rules, as you can see, we do provide some out of the box and just looking for some events that might be of interest so when you configure devices to start sending Syslogs and traps to us that we can maybe tag certain events that might be of interest to you. However, there is also the ability to create your own rules. So, if I say, let’s create a new rule. I still love that fan use case we spoke about a minute ago, so let’s say we look at hardware failure. So, if I say, “Hardware failure,” and I want to maybe tag that event so I can assign whatever name I want to this rule. In my conditions, I could then say maybe only look for specific sources. So, I could say, you know, only if it’s coming from a particular polling engine. Or maybe the machine type, it’s a particular model, or maybe it’s a particular vendor you want to focus in on then.
And then these are based on parsed values that are a part of that incoming event?
Correct. From there then, you can look for specific entries. So, let’s say I want to look at, you can see the severity, community, VarBinds, and traps, which I know you know and love, Leon. So, let’s say I look at, want to look in the message and I want to see if it contains, we saw that log a minute ago had like, “fan failed” in the message. So, if I come in here and I can see the message contains that particular string, I want to perform an action on that log data. As you can see, I can have and/or conditions too, so you can make this a bit more complex if you wish. Come in here, and I can then say I want to add an action. So, to assign a tag, it’s as easy as tag the entry. And from there, I can then say, I want to select an existing tag or I can create my own tags from scratch. So, I can define new tag, and then assign my own custom color to that so I can then identify that once it appears. So, you can leverage some of the pre-defined tags, tags you’ve created yourself already, or create tags from scratch. Worth noting as well, you can assign multiple tags to a, to a log. So, let’s say you want to assign maybe the location of the device, the person responsible if a particular issue occurs, whatever use case you have for that particular log, you can assign multiple tags and multiple colors.
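The rule structure walked through above (a name, conditions joined by and/or, and an action that assigns one or more colored tags) can be modeled roughly like this in Python. The class, field names, and sample values are all hypothetical; this is a sketch of the concept, not the Log Manager rule engine.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    conditions: list                  # callables: entry -> bool
    match_all: bool = True            # True = AND the conditions, False = OR
    tags: list = field(default_factory=list)  # (tag name, color) pairs

    def matches(self, entry):
        results = (cond(entry) for cond in self.conditions)
        return all(results) if self.match_all else any(results)


def apply_rules(entry, rules):
    """Return every tag assigned to the entry by the matching rules."""
    tags = []
    for rule in rules:
        if rule.matches(entry):
            tags.extend(rule.tags)
    return tags


# Hypothetical rule: a fan failure on a particular vendor's hardware.
hardware_failure = Rule(
    name="Hardware failure",
    conditions=[
        lambda e: e["vendor"] == "Cisco",
        lambda e: "fan failed" in e["message"].lower(),
    ],
    tags=[("hardware failure", "red"), ("datacenter-1", "blue")],
)

entry = {"vendor": "Cisco", "message": "Fan failed on slot 1"}
print(apply_rules(entry, [hardware_failure]))
```

Returning a list of tags mirrors the point that a single log entry can carry several tags, each with its own color.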
Is this the first time on the Orion Platform we’ve done multi-dimensional tagging?
I believe so, but don’t quote me. [laughter]
Alright. He didn’t say that.
So, there is also an additional action worth noting called Flag for Discard. So obviously, as I like to say, you know, logs are noisy by nature so you could have a device sending lots of logs to us, but for some reason, you know, they could be sending lots of logs that aren’t of particularly high value. They could be just you know, debug logs or devices that really don’t help you solve any problems. So, for that, you can again, have a pattern match in your rule, and if that occurs, just flag it for discard. You don’t want to eat up your storage or even impact performance by sending a huge number of not-high-value logs, if that makes sense. So, you can flag for discard there also.
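The idea behind a discard rule is to drop low-value messages before they reach storage. Here is a minimal Python sketch, assuming the rule is a list of regular expressions; the pattern, sample messages, and function name are made up for illustration.

```python
import re

# Hypothetical discard rule: anything that looks like a debug message.
DISCARD_PATTERNS = [re.compile(r"\bdebug\b", re.IGNORECASE)]


def split_for_storage(messages, discard_patterns=DISCARD_PATTERNS):
    """Return (kept_messages, discarded_count) after applying discard rules."""
    kept = []
    discarded = 0
    for message in messages:
        if any(p.search(message) for p in discard_patterns):
            discarded += 1
        else:
            kept.append(message)
    return kept, discarded


incoming_messages = [
    "DEBUG: entering poll loop",
    "Fan 1 rotation error",
    "debug: heartbeat ok",
    "Configured from console by admin",
]

kept, discarded = split_for_storage(incoming_messages)
print(kept, discarded)
```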
Now of course, when I see Flag for Discard, that does make me think of my dear friend, our dear friend, Kiwi Syslog.
Near and dear to my heart.
Right? I think most of us use it as a filtering mechanism. And this is the first time that I think a lot of the attention for the team was you were getting a lot of feedback that they wanted to just be able to do that in one place, right? So, even though Kiwi is fantastic and I can still think of a million great use cases for it, this is a way of maybe just eliminating that step so it also removes the remote configuration aspect.
Exactly, but worth noting you can still actually use Kiwi with Log Manager so if you wanted to still use your existing Kiwi, send logs to Kiwi, and then forward logs from Kiwi to here while preserving the hostname or node name is worth mentioning as well. You have to preserve that. And you can then use Kiwi as a filtration layer and send that to Log Manager and view that information in Orion.
I was going to say, so most of the customers that evaluated this, added it into an existing Orion installation, but you were saying that this actually can run standalone as well. So, it might be, you could just use this for collection independent of any of the other modules.
Correct. So, worth, you know we mentioned Kiwi Syslog, at this point, it’s probably worth mentioning another free tool, actually called Windows Event Log Forwarder.
Which, I know I’ve mentioned Syslogs and traps a few times with Log Manager for Orion, so SolarWinds has a great free tool called Windows Event Log Forwarder, which converts your Windows events to Syslog, which can then be consumed by Log Manager for Orion. So, if you have, you know, a bunch of servers you want to collect Windows event logs from, you can install SolarWinds Event Log Forwarder on those devices and then send them to us so your filtration, searching, and charting can be done.
Well, and that’s a really… I’d be interested to know in the chat, let us know how many of you are using that tool because it’s great. You can actually specify which of the Windows logs are exported. So, instead of sending everything, which will… I cannot imagine what you would do with all that, but maybe you just want security events or you want application errors, right? It’s a really handy way to do that. But then the other one too, that’s a free tool that I’ve seen some of you have had questions about before and that’s the Event Log Consolidator.
There’s actually a very vibrant, passionate community of folks who love that tool.
Right, so. Okay, so that now brings us to a total of five. Right? So, let me ask you this question as long as we’re talking about this. And that is… Maybe it’s a debate question, I’m not sure. Is NetFlow logging? Is it reporting or is it logging?
It… walks like a duck, and it quacks like a duck, and it configures like a duck in a lot of different ways.
And it streams like logging and it has no changes once you collect it in the database like most logging actually does. So, I’m going to actually argue that NetFlow is logging, especially in the way that it affects the database. And so that’s the other thing that I really want to talk about here, which is there is a new database dependency if you choose to install this module. And that is you’re going to need Microsoft SQL Server 2016. So, why 2016? What does it offer that is required here? And I think many of you who just watched the search example already figured out the answer, but…
Sure, so. We’ve built a dedicated log database. So, it’ll be separate from your Orion database. So, in Syslogs and traps we do today, there are two tables for Syslogs and traps, we just store all the Syslogs in the Orion database.
So, with Log Manager for Orion, you now have a separate database that’s designed specifically for log data. It leverages the latest and greatest SQL technology, such as columnstore indexes, and full text search, and the schema’s designed specifically with log data in mind.
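As a rough intuition for why a full-text index helps with log search, here is a toy inverted index in Python. It is a swapped-in teaching example, not how SQL Server implements full-text search or columnstore indexes; the function names and sample messages are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(messages):
    """Map each word to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, text in enumerate(messages):
        for token in tokenize(text):
            index[token].add(msg_id)
    return index


def search(index, query):
    """Return the ids of messages that contain every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


log_messages = [
    "Fan 1 rotation error",
    "Admin logged in from console",
    "Fan 2 failed, service recommended",
    "Configured from console by admin",
]

index = build_index(log_messages)
print(sorted(search(index, "fan")))
print(sorted(search(index, "admin console")))
```

A keyword lookup touches only the index entries for the query words rather than scanning every stored message, which is the general reason indexed text search stays responsive as log volume grows.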
And just to be clear, because I think a lot of people hear separate database and they hear different things is, we don’t mean a separate database server. We don’t mean a separate device like the NetFlow Flow Storage processor, we’re not talking about that. We’re just talking about a separate set of tables that specifically deals with the logging data that isn’t the same set of tables that is your network node statistics information.
I am so glad we’re talking about this. Yes. But it is a lot like Flow Storage in that the reason that a while back, NetFlow moved to a FastBit columnstore proprietary table was so that we could also split it off when we got much better performance out of it but too, we were able to split it off and let them actually scale that if that was a major part of it. So, in this case, it’s continuing that, but this is also the same technology that underlies the new logging mechanism for NetFlow. Because NetFlow is also based on columnstore data tables in SQL Server. So, it’s a way of unifying all of those storage elements back around SQL Server so that you can let go of Flow Storage for NetFlow, you don’t have another specific type of Flow Storage– or a type of event storage for Log Manager, it’s all using the same database technology so it’s a way of really simplifying your install.
Jamie, you’ve talked about security a lot here, right? Especially in terms of finding security-related events in Log Manager. When would that, not necessarily not be enough, but when there is… When would there be a time for a tool that’s actually focused on security? That’s actually SIEM?
Good, so, Log & Event Manager is a good fit for customers who are maybe in heavily regulated industries that need to be compliant with let’s say, HIPAA, PCI, ISO, SOX, et cetera that need a tool to monitor both from a security, visibility, security posture use case, and also for compliance reporting. That’s when a tool like Log & Event Manager comes into play.
Or where heavy normalization or categorization or maybe multi-event kick-off as a result of one particular action really needs to come into play, that’s where you would actually look at this.
Not to mention the integration of actual security actions along with the incoming log data. Whether it’s the USB control or, you know, any other responses based on that incoming information.
Exactly. So, if we look at, we’ll say, let’s say, as you mentioned there, Leon, if you wanted to look at something like detaching USB devices. So LEM includes a correlation engine which examines each and every event log as it comes in and from there, you can take actions. So that, that active response technology where you can block USB devices, block IP addresses, disable users, et cetera… very much security-focused use cases.
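A correlation engine of this kind inspects each event as it arrives and fires an action when a pattern is met. The Python sketch below shows one very simple pattern: several failed logins for the same user inside a time window. The event names, thresholds, and the returned action string are all hypothetical, and this is not LEM's actual rule syntax or API.

```python
from collections import defaultdict, deque


class FailedLoginCorrelator:
    """Fire an action after `threshold` failed logins within `window_seconds`."""

    def __init__(self, threshold=3, window_seconds=60):
        self.threshold = threshold
        self.window = window_seconds
        self.recent = defaultdict(deque)  # user -> timestamps of recent failures

    def observe(self, timestamp, user, event_type):
        """Examine one event; return an action string or None."""
        if event_type != "login_failed":
            return None
        times = self.recent[user]
        times.append(timestamp)
        # Forget failures that have aged out of the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()
            return f"disable user {user}"
        return None


correlator = FailedLoginCorrelator(threshold=3, window_seconds=60)
events = [
    (0, "bob", "login_failed"),
    (10, "bob", "login_failed"),
    (15, "alice", "login_ok"),
    (20, "bob", "login_failed"),
]
actions = [correlator.observe(*event) for event in events]
print(actions)
```

In a real deployment the returned action would be handed to whatever component blocks the IP address, disables the account, or detaches the USB device.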
Now I don’t want to say that we just incorporated LEM into this discussion to actually mention that we’ve kind of been listening to you guys, but we sort of have because we want to show you a surprise.
Cool, without further ado. [laughter]
So, on the THWACK forums, we’ve seen, you know, a lot of talk, and lot of requests for a move to a more Orion-esque look and feel. HTML5 interface.
And a move away from Flash.
Yeah, Flash is not okay for security, right?
I was trying not to drop an F-bomb. [laughter] So, as you can see, this new LEM events console allows you to interact with your log data in terms of the real-time view, your filters that are in the old Flash console are on the left-hand side, so that filtering is still there. And your great searching is there included too. So…
So, this is basically the events tab that you would normally see in LEM but it’s represented through this new interface. So, this gives you a… This is not a full re-implementation in Angular yet, this is not all HTML5, but it is the beginning of making a lot of these views available in the new format. And hopefully, you can kind of see where LEM is going and I know the Product Manager has been saying for a long time, I promise, we’re going to fix the UI and get away from Flash, so definitely check this out. This is the beginning of that and we really want your feedback as a part of this process.
And I also think that doing this helps a lot of folks. I know a lot of folks that I talk to at SWUGs and also in the convention floor who talk about, you know, I got LEM because my audit team told me I had to and I implemented it because they told me I had to, but I’m not really sure where to go with it. And I think the consistency of interface and the updates are going to make it a lot easier for folks to see how to start to do some of those initial tasks to get moving on it more than just checking off a box for the audit team.
Exactly. So, like, in here in the LEM Events Console, for those who are familiar, if you wanted to do like a historical search, you’d have to come into the Monitor section here, you can see all the latest say, a thousand events, then I want to view that historically, I’ve got to come into the In-Depth section, and I then got to view, you know, more information there, you’re kind of jumping between screens. So, with the Events Console, we’ve made some improvements there too in that you can see your most recent data here in the let’s call it the live view. However, I can then come in here, let’s do a search for Admin. It’ll show me my latest events.
Ah, sure does look like the exact same workflow that you were using with Log Manager just a second ago.
Isn’t it funny how we like to reuse code like every other good development…
Well that’s what, that’s what they ask for.
Give us one way. Explain one way to do it and it should work the same way across all the products.
So, as I was saying. [laughter] Historical search is also possible in the Events Console. It’s not just your, your latest one thousand events and then back into LEM, this is the historical search and you can actually have multi-context too. So, let’s say I search for admin, and I now want to add an additional field to that, let’s say I want to look at a logon event, I can now do a search, and it’ll now show me various different, kind of keywords and multi-criteria. So, you know, certainly, it’s, we’re getting great feedback so far, so, you know, on THWACK please let me know your thoughts and feedback on this Events Console and we’re just happy to discuss.
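To make the multi-criteria search a little more concrete, here is a small Python sketch of narrowing a set of events by one keyword and then a second. It assumes that adding a term narrows the results (an AND match) and that matching is case-insensitive; neither detail is confirmed in the episode, and the sample events are invented.

```python
from typing import List


def search(events: List[str], *terms: str) -> List[str]:
    """Return events that contain every search term (case-insensitive)."""
    lowered = [term.lower() for term in terms]
    return [e for e in events if all(t in e.lower() for t in lowered)]


# Invented sample events for illustration only.
events = [
    "User admin logon succeeded from 10.0.0.5",
    "User admin changed firewall policy",
    "User guest logon failed from 10.0.0.9",
]

# First search on a single keyword, then add a second criterion.
print(search(events, "Admin"))
print(search(events, "Admin", "logon"))
```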
You’re just responsible for all the log things.
I’m the log guy. [laughter]
So talk, let’s talk about, go back to Log Manager for a second and talk about how it’s installed. Because I think that’s going to be the first question that they’re going to have which is does this run next to existing logging? Does it replace it? How is it set up? How is it configured?
Sure, so. As it’s a, an Orion module built from the ground up, it’s installed on your Orion server, essentially. So, it uses the latest Orion installer, where you can you know, install NPM, SAM, all the other great products. And you’ll then see an option there for Log Manager for Orion. Worth noting, if you’re on, let’s say, an older version of NPM, and you need a new version of NPM to consume Log Manager, we’ll automatically update that as part of the Orion installer too. So, it’s really easy to get.
And that’s something that’s just built in to the Orion installer itself is to get everybody on a common platform for modules in the correct update order before you apply something new on top of it.
And checking all your prerequisites and everything we’ve worked really hard…
And it solves the snowflake problem, right? Which is that if you’ve been running an Orion Platform server with a bunch of modules, right? NPM, and NetFlow, and a bunch of other ones for 10 years, and maybe you’ve had a hotfix, or maybe you sort of waited a while and then got caught up on released versions. It solves that problem of every single install being its own unique thing and so, it’s gotten to the point where you are almost all, I think, on a common code base. And so what we’re hearing a lot from support, or even in upgrades and Betas, is that you’re saying things seem to be really stable and it’s a reliable way of making sure that you have the right bits on disk. So, even if you have no other takeaway from today, it’s this: just go upgrade. Just run the installer. Especially if you haven’t done it in a while. I guess we’ve been on this current smart installer now for about a year and a half? And I think at Cisco Live last year, we talked to a few of you who were still like, two years behind. There are so many reasons you need to upgrade, but the installer, I cannot say enough, this is one of the main reasons that you want to upgrade: it will solve that problem for you.
Exactly. And in terms of Log Manager, we kind of hold your hand through the whole process, let you know you need a SQL database and the other bits and pieces we require so it makes it really easy to walk you through that installation process too.
So, it’s going to ask, “Do you want it on the main server? Do you want it on a different server?” Basically, you just point it in the installer and it’ll figure out where to put it?
Correct. And then once you upgrade, it’s going to disable the old Syslog and trap viewers that you have been using up to now, and so they’ll still be available on the Orion box, but it just won’t be processing any new data. And then we’ll configure the database, so we’ll deploy our new Log Manager for Orion database, you select your database server like you would with any other Orion module, and from there, it’s a matter of just configuring your devices to send logs and traps to us. So, it’s pretty, pretty painless in terms of consuming Log Manager for Orion and getting it configured and installed.
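For anyone who wants to test that last step, pointing a source at the collector, a minimal Python sketch using the standard library's `logging.handlers.SysLogHandler` might look like the following. The collector hostname is a placeholder, UDP port 514 is the conventional syslog port and an assumption about your setup, and the message layout in `build_message` is deliberately simplified (no timestamp). The priority arithmetic, facility times eight plus severity, comes from the syslog standard.

```python
import logging
import logging.handlers

# Numeric codes from the syslog standard: facility local4, severity warning.
LOCAL4 = 20
WARNING = 4


def syslog_priority(facility: int, severity: int) -> int:
    """PRI value that prefixes every syslog message: facility * 8 + severity."""
    return facility * 8 + severity


def build_message(facility: int, severity: int, host: str, tag: str, text: str) -> str:
    """Simplified syslog-style line (timestamp omitted for brevity)."""
    return f"<{syslog_priority(facility, severity)}>{host} {tag}: {text}"


def make_logger(collector_host: str, port: int = 514) -> logging.Logger:
    """Logger that forwards records to a syslog collector over UDP.

    collector_host is a placeholder for wherever your collector listens.
    """
    handler = logging.handlers.SysLogHandler(
        address=(collector_host, port),
        facility=logging.handlers.SysLogHandler.LOG_LOCAL4,
    )
    logger = logging.getLogger("demo-device")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


print(build_message(LOCAL4, WARNING, "switch-01", "demo", "interface Gi0/1 flapping"))
```

Calling `make_logger("your-collector-hostname").warning("test message")` would then send a single test event, which is a quick way to confirm the collector is receiving before you reconfigure real devices.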
Painless is awesome.
One question I’ve actually got quite a bit recently is around retention settings.
Ah, yes, that was where we were going next.
So, in terms of retention settings, you’ll notice on the Settings page here there’s now Log Manager for Orion settings. And within here, I can now see my retention periods. So, by default, we have it set at seven days. So, as I mentioned, for the kind of troubleshooting and identification of problems that we hear about from talking to you guys on THWACK, in the forums, and at SWUGs, you don’t typically need to store those troubleshooting IT ops logs for several years. Again, that’s more of a compliance use case a la LEM. So, seven days is your default retention period, however, this can be increased very easily here. And from there, you can then, you know, store up to I think about a year’s worth of log data.
Because the goal here is to retain lots and lots of data. Not necessarily for a very long time, but make it really easy to sort and search on detail as opposed to being, keeping specific records forever.
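A retention period like the seven-day default boils down to a cutoff date: anything older than the cutoff is dropped. The Python sketch below illustrates only that arithmetic. It is not how Log Manager implements retention internally, and the sample events and the fixed "now" timestamp are arbitrary values chosen for the example.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

DEFAULT_RETENTION_DAYS = 7  # default mentioned in the episode

LogEvent = Tuple[datetime, str]  # (timestamp, message)


def prune(events: List[LogEvent], now: datetime,
          retention_days: int = DEFAULT_RETENTION_DAYS) -> List[LogEvent]:
    """Keep only events whose timestamp falls inside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [event for event in events if event[0] >= cutoff]


# Arbitrary fixed clock so the example is repeatable.
now = datetime(2018, 6, 15, 12, 0)
events = [
    (now - timedelta(days=1), "link up"),
    (now - timedelta(days=6), "config changed"),
    (now - timedelta(days=10), "fan failure"),
]

print([message for _, message in prune(events, now)])
print(len(prune(events, now, retention_days=30)))
```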
And one of the things is, I mean, seven days was the default previously with the old Syslog and trap management, but we were cautioning people not to do that because the Syslog and trap messages were in the same database and they were taking up a lot of room and it was slowing down database performance. Now, again, with the database improvements, that’s no longer the case. And that was one of the things that I heard a lot from folks is that, but I need to keep this information long term, what do I do? And we talked about offloading it, and copying the database, and replicating, and all sorts of stuff. None of that is necessary anymore.
Well, that’s the lesson of NetFlow, right? NetFlow started out as this sort of adjunct tool that many of you were using to get just a little bit of detail off of maybe a couple of routers or switches. And then very soon, you were asking, “Well, we want to go over 35,000 flows per second, would you please do that?”
How many terabytes can you handle? [laughter]
And so all of a sudden, it went from being this small subset of data in the main Orion Platform database to most of it, in some cases, you know, 80 or 90%. So, this prevents that from happening in the first place because you’ll always be thinking about I’m going to do high performance logging, I would expect to have it actually not affect the performance of my primary monitoring dashboard.
Okay, okay, so, fine. I’m going to give this one to you, Patrick, if you want to make the case for the 13th logging tool, I know that Jamie has said that we have, you know, five minutes, so, go.
So, APM for Orion powered by AppOptics, right? Okay, the reason that I bring this up, not just because you’ve been asking for this, those of you who have been using AppOptics, and especially after we started talking about it at the end of the year at re:Invent. But, wanted to talk about the differences of sort of what’s in and what’s out. ‘Cause before we were talking about SNMP or being able to query or filter on text, so the next piece of it would be getting event data. Like, creating a dashboard from nothing but log data. Which is different, completely different than like, infrastructure polling, right? So, let me just give you a sense of how this works. So, in this interface right here, this is a view of, it looks a lot like an AppInsight view, right? Inside of SAM. But, this view is powered by something a little bit different. Instead of polling to get this information, these details are actually coming from IIS logs.
Right? So, the IIS logs themselves about where the connections are coming from, where the requests are coming from, the data that’s actually driving the AppStack data and that’s actually breaking all of these performance metrics out is coming in and it’s being parsed out of the log events that are coming from IIS and are being sent into the AppOptics engine, but it’s integrated into the Orion dashboard. And the reason I wanted to, to mention that was although I do like looking at an aggregated tail, what is really helpful is to be able to do something with that and that brings us to the last question that they are asking, which is well then, what is Loggly? Right? So, you can sort of think about it as this dashboard… This is another way of looking at the Pis. These are custom metrics, right? So, these are metrics that are actually coming from log events ’cause I’m logging to… I’m sending them also to AppOptics native here. But what it’s really doing is it’s parsing these things out. So, it’s automatically able to go and look at for example, this JSON chunk–I’ll bring this one up here a bit. So, in this case, all of them are looking at and then parsing data out, right? So, in this case, you can see that there’s a bunch of JSON in this log event that’s coming into Loggly, and it’s actually looking at it, and then breaking it out, and coming up with derived fields. So, I can add a bunch of those, but what it really lets me do is let me build dashboards from that log data. So, in this case, this is one for an application running in AWS, right? So, all down here are all of the components that make up this composite application, right? They are each sending log events, and then those logs are being parsed out to give us the performance metrics that are actually driving these dashboards where polling would be completely impossible. 
And so, the goal there is to be able to combine what would be custom events along with infrastructure, along with security, and finally the AppOptics APM, for example, application traced data or something else to actually be able to monitor all of those things, not just the ones that you could poll or are limited by Syslog.
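The idea of parsing JSON out of log events into derived fields, and then building metrics from those fields, can be sketched in a few lines of Python. This is only an illustration of the technique, not Loggly's or AppOptics' parser. The field names (`service`, `status`, `duration_ms`) and the sample events are made up for the example.

```python
import json
from collections import defaultdict
from typing import Dict, List, Optional

# Invented log lines; the last one shows a non-JSON event being skipped.
raw_events = [
    '{"service": "checkout", "status": 200, "duration_ms": 120}',
    '{"service": "checkout", "status": 500, "duration_ms": 900}',
    '{"service": "search", "status": 200, "duration_ms": 40}',
    "not json at all",
]


def derive_fields(line: str) -> Optional[dict]:
    """Parse one log line into a dict of fields, or None if it is not a JSON object."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None


def average_duration(lines: List[str]) -> Dict[str, float]:
    """Average duration_ms per service, computed purely from log events."""
    totals = defaultdict(lambda: [0, 0])  # service -> [sum, count]
    for line in lines:
        fields = derive_fields(line)
        if fields is None or "service" not in fields or "duration_ms" not in fields:
            continue
        bucket = totals[fields["service"]]
        bucket[0] += fields["duration_ms"]
        bucket[1] += 1
    return {service: total / count for service, (total, count) in totals.items()}


print(average_duration(raw_events))
```

The resulting per-service numbers are the kind of value a dashboard widget could chart, with no polling of the application involved.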
I’m getting lots of ideas for Log Manager for Orion here, Patrick. [laughs]
Well, yes. And I know that this is tough for you because you’re like, “Mmhmm, yeah, I know.”
He’s making the face. It’s the face. You always have to look, watch us for that face.
Yeah, what’s your favorite page on THWACK? I believe it’s SolarWinds, What Are We Working On Now? And now there’s an entry for Log Manager, right?
Ah, that’s awesome. Okay, Jamie, so one question for you here, and I suspect that Leon is going to dogpile on here because he’s giving me that look like he’s reading my mind yet again. And if I look at this, this looks to me like more of the sort of high volume search and analysis that you get with cloud-based aggregation log products. And it seems to have a bunch of features that are actually pulled together from several different log products.
Right, and it’s important to know that this is version 1.0, right? SolarWinds has a long history of evolving products as a give and take. Right? I mean, I think that’s what we’re looking at.
Yeah, so you know, as you said, Patrick, it’s influenced by a number of different things but one of the big influences is you guys, so, from, you know, as a Product Manager, talking to you guys on THWACK, and at SWUGs, and various other means, you know, we’ve heard that the challenges you guys face for logs, you know, the security compliance challenges are different from the IT ops challenges, and the want and demand for a log management tool in Orion has been huge.
And the specifics of how search should work, and expected performance, and all of that is coming back, is a part of that feedback process.
Exactly. And as Leon said, you know, this is a 1.0 of Log Manager for Orion, we have lots of great and exciting plans. But you know, certainly, on THWACK, I’ll be there a lot as will other folks who will be only too happy to discuss you know, what you’d like to see in the product, what your thoughts are, what we’ve built today, and so please, please reach out and be as interactive as possible on THWACK. We can’t wait to see what you think of Log Manager for Orion.
Well, Jamie thank you so much for coming all the way across the pond and being here to talk with us and to talk with everyone about logging. You know, you’ve been asking about logging on the chat, and forums, and in conventions when we see you for quite a while. You know, it’s interesting to me that a lot of the questions that we get are the very general what’s the recommended way for me to manage my logging within SolarWinds?
Right, Leon. And as we saw, it’s not so much that there’s one best way or even the best tool. There are lots of great ways. It’s more important you think about what you’re trying to log, what your environment is like, and then letting that determine what tools you use.
Yeah, so. Would it be fair to kind of break it out into maybe four categories based on size? So, sort of that first category would be maybe just a little bit of logging, right? So that it’d be either a Kiwi or maybe even the tools that come in the toolset. The second category would be use-specific, like security-focused, right? So that’d be Log & Event Manager, and as we saw, the interface for it has been updated and there’s going to be a lot more that we have to talk about soon. And then the third would be general logging, right? Ideally, kind of built into a single dashboard. And so that would be either the logging that comes built in with the Orion Platform, so that’s NPM, NetFlow, IPAM, NCM, SAM, or in this case, now Log Manager, which when you install it, replaces that log engine with a new high-performance engine based on SQL Server 2016. And then the last category would be logging or log-based metrics that are provided as a service. So, for that it’d be Loggly, AppOptics, and Papertrail.
Yep. That’s pretty close. Of course, there are cases where you might choose differently. For example, your Raspberry Pi project is IoT and cloud-based with zero traditional infrastructure. For that, hosted logging is better for access. But a similar app connected to an on-prem data center like industrial IoT, Log Manager for Orion might be a better choice.
Right, well hopefully this Lab has been helpful for you and of course, we’d love to answer any questions that you have. So, if you’re watching us live, just pop your questions in over here on the chat box and we’re going to hang out for a while and chat with ya. And if you’re watching this on Replay– they’d be naughty to be watching this on Replay, but if you are, then swing by our home page which is lab.solarwinds.com and you can start a THWACK conversation, tell us what you’d like to see in upcoming episodes, and of course, check the schedule so that you can be with us live next time.
And of course, as a Product Manager, I’m in the THWACK forum all the time for LEM and our new Log Manager for Orion module, and would love to get your feedback. Try it out in your environment and let us know what you think. The feedback from the Beta was amazing, but we’re really looking forward to seeing what you think.
Well, and of course, the SolarWinds Lab viewers have already figured out that they have extra pull when it comes to new features just based on what they talk about during the live program. But, you already knew that.
Good. Let’s wrap this up so we can chat with you all online. I’m Jamie Hynds.
I’m Leon Adato.
And I’m Patrick Hubbard and thanks again for watching SolarWinds Lab.