How to Plan for Major Incidents in ITSM

Major Incident Management and Preparedness

A “major incident” can be a nightmare for any business; that much is widely agreed. In practice, however, it is harder to find consensus on what counts as a major incident and how to respond to one effectively. Ironically, reaching that agreement is itself one of the necessary steps in major incident management. Do you know what a “major incident” is, and how your organization should handle the scenarios that may constitute one? Read on to learn more about how to deal with major incidents, as well as the importance of having a plan in place.

When an Incident Constitutes a Major Incident

Panic, a flood of calls to customer service, management in crisis mode: the hallmarks of a major incident are pretty difficult to miss. But according to ITIL and ISO 20000, what denotes a “major incident” is strictly defined, as is the required response. At its essence, what denotes a major incident must be agreed upon by both management and IT. A major incident requires its own procedure, separate from ongoing efforts to address the problem(s) that led to the incident. This procedure requires defining responsibility and reviewing what led to the incident, as well as how to prevent it from occurring in the future. The primary distinction between incidents and problems is that problems lead to incidents, and addressing problems reduces how often incidents occur.

Preparing for Major Incident Management

Advance planning can make a tremendous difference when it comes time for major incident management. Putting together a major incident team or management group, or at the very least identifying an Incident Manager, is a great first step. Defining what denotes a major incident for your organization is also key, as is defining how major incidents will be identified and communicated to the appropriate departments throughout your company. Designing and implementing agreed-upon emergency procedures is another key advance measure. Finally, designing and implementing an incident management test scenario, with regular practice exercises, is an integral part of preparing for major incident management.
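One way to make an agreed definition concrete is to write it down as explicit criteria that the service desk can apply the same way every time. The following is a minimal Python sketch under stated assumptions: the names `MajorIncidentCriteria`, `Incident`, and `is_major`, as well as the thresholds and service names, are hypothetical placeholders, and the real values must come from the agreement between management and IT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MajorIncidentCriteria:
    # Hypothetical thresholds; real values must be agreed by management and IT.
    min_users_affected: int = 500
    critical_services: frozenset = frozenset({"payments", "email"})


@dataclass(frozen=True)
class Incident:
    service: str
    users_affected: int
    service_down: bool


def is_major(incident: Incident, criteria: MajorIncidentCriteria) -> bool:
    """Return True when the incident meets any agreed major-incident criterion."""
    # Criterion 1: a service designated as critical is completely down.
    if incident.service_down and incident.service in criteria.critical_services:
        return True
    # Criterion 2: the number of affected users reaches the agreed threshold.
    return incident.users_affected >= criteria.min_users_affected


criteria = MajorIncidentCriteria()
outage = Incident(service="payments", users_affected=40, service_down=True)
slow_wiki = Incident(service="wiki", users_affected=25, service_down=False)
```

With these illustrative values, `is_major(outage, criteria)` evaluates to true because a critical service is down, while `is_major(slow_wiki, criteria)` evaluates to false. Keeping the criteria in one place also makes them easy to revisit after each practice exercise.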

Shifting Roles in a Major Incident Scenario

As mentioned above, one key aspect of major incident preparedness is forming and training an incident team or teams to execute agreed-upon major incident protocols. Depending on the size of your organization, this could be a single Incident Manager or, in large organizations, several incident management teams. When a major incident occurs, all team members should have clearly defined roles and responsibilities while the incident is being resolved. A post-incident plan is also necessary to make sure that the problem was handled appropriately and that steps are put in place to prevent it from recurring. Roles to consider assigning include: Incident Manager, Root Cause Analyst (or Problem Manager), and Major Incident Investigator (or Investigation Team). Additionally, protocols need to be established for the service desk team, service level or account managers, and other teams during a major incident to ensure that communication around service interruptions and customer issues is handled according to established policy.
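A simple readiness check can confirm that each of the roles named above has someone assigned before an incident occurs. The sketch below is illustrative only: the `REQUIRED_ROLES` tuple mirrors the roles listed in this article, while the `unassigned_roles` helper and the roster contents are hypothetical.

```python
# Roles taken from the list above; adjust to match your own agreed protocol.
REQUIRED_ROLES = (
    "Incident Manager",
    "Root Cause Analyst",
    "Major Incident Investigator",
)


def unassigned_roles(roster: dict) -> list:
    """Return the required roles that have no person assigned in the roster."""
    # A role counts as unassigned if it is missing or mapped to an empty value.
    return [role for role in REQUIRED_ROLES if not roster.get(role)]


# Hypothetical roster with one role left unfilled.
roster = {
    "Incident Manager": "A. Example",
    "Root Cause Analyst": "B. Example",
    "Major Incident Investigator": "",
}
gaps = unassigned_roles(roster)
```

Here `gaps` contains only "Major Incident Investigator", which flags the gap to fill before the next practice exercise.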

Major Incident Management in the Wake of a Serious Issue

Following a major incident, it can be tempting to return to business as usual as quickly as you can. That temptation should be resisted. While restoring optimal service levels quickly is ideal, three follow-up activities are equally crucial: problem management in the wake of the incident, analysis of the organization’s response, and revisiting your incident management test scenario for improvements.
Tim Lawes
Tim Lawes serves as Manager of Solutions Engineering, ITSM at SolarWinds. ITIL 4 certified, he brings 10+ years of training and consulting experience in the…