Hands-On ICT Continuity Management
How ICT continuity planning can improve your resilience
The potential for large-scale absenteeism resulting from the so-called
'swine' flu pandemic has led many businesses to carry out extensive
pandemic planning to ensure that customer-facing and revenue-generating
services can be maintained with reduced staff numbers. However, the ICT
services that underpin almost every organisation's core business
processes are equally important, and it is easy to forget that these
services depend on the human touch to keep operating at the desired
levels.
It is imperative, when planning for staff absence, that an organisation
factors in the level of human interaction required to operate and
maintain its ICT services. This article examines common interactive
tasks such as day-to-day operations, backup and recovery processing and
break/fix activities, and the impact that a reduction in staff may have
on them.
Back office ICT continuity
The vast majority of organisations have prepared specific 'flu-related'
contingency plans and arrangements to manage the potential large-scale
absence of staff and ensure the continuity of their critical
business processes. Absenteeism may occur for a variety of reasons, such
as a corporate "stay away" policy, a significantly reduced public
transport system making travel to and from work difficult, illness or a
need to care for children or sick dependants.
The primary focus for most companies is to ensure that customer-facing and
revenue-generating services can be maintained despite reduced staff
numbers. As a result, pandemic business continuity plans tend to focus
on two key aspects: firstly, the welfare of staff and secondly, the
provision of 'front line' services. While the continuity of these two
components is essential, plans often overlook one critical element in
their client-facing activities - the back office functions that support
them. The ICT operation is an integral part of these activities. It is
easy to forget that even the most automated 'hands off' ICT operation
requires some degree of human interaction. In truth, many organisations
probably underestimate the level of hands-on support required in a solid
ICT continuity plan.
Keeping the technology running - ICT continuity during a major disruption
It is not uncommon for organisations to have some degree of remote access
capability for their ICT systems. If this is the case then it is likely
that remote working is featured somewhere in the company's business
continuity plan.
In many organisations, however, remote access is implemented to provide a
remote working capability for a limited number of staff, with only a
small percentage of these using the service at any one time. The
subscription ratio for remote access varies from 2:1 to 8:1 depending on
which set of statistics is used. Using the higher value, there could
be as many as eight subscribers per available line. Normally, this is
not an issue, as most users can get access when required. However, all
that changes when large numbers of staff cannot get to the office, for
example because of inclement weather or major transport
disruptions, both of which were experienced in the UK in February 2009.
When this situation occurs, the remote access service can become
severely oversubscribed.
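To make the arithmetic concrete, here is a minimal Python sketch of the oversubscription effect. The figure of 400 subscribers and the two demand levels are illustrative assumptions, not data from any particular organisation; only the 8:1 ratio comes from the range quoted above.

```python
def remote_access_shortfall(subscribers: int, subscribers_per_line: int,
                            staff_needing_access: int) -> dict:
    """Compare available remote access lines with concurrent demand.

    subscribers_per_line is the subscription ratio, e.g. 8 for 8:1.
    All inputs are hypothetical planning figures.
    """
    lines = subscribers // subscribers_per_line
    shortfall = max(0, staff_needing_access - lines)
    return {"lines": lines, "demand": staff_needing_access, "shortfall": shortfall}


# A normal day: only a small share of subscribers connect at once.
normal = remote_access_shortfall(400, 8, staff_needing_access=40)

# A disrupted day: most staff cannot reach the office and try to connect.
disrupted = remote_access_shortfall(400, 8, staff_needing_access=300)

print(normal)
print(disrupted)
```

With these assumed figures the service copes comfortably on a normal day, but on the disrupted day 250 people are left without a connection.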
Avoiding denial of remote access
Our experience has shown that there is often little or no planning when it
comes to the most productive use of the limited remote access
capability. It's basically a 'free for all', with staff vying for
connectivity. This is inconvenient for short periods of disruption, but
may have serious repercussions if the large-scale displacement of staff
extends over longer periods, especially if remote working is a
potential response option for your organisation.
For effective business continuity planning, an organisation must define the
criteria for using the remote access capability. The plan should
clearly identify who can have remote access, when they can have it and
for how long. This will ensure the appropriate people and the functions
they perform are given the necessary priority.
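One way to picture such criteria is as a simple priority list. The sketch below is an illustration only: the names, functions, priority tiers and session limits are hypothetical, and a real plan would define them through the organisation's own business impact analysis.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    name: str        # hypothetical member of staff
    function: str    # business function they perform
    priority: int    # 1 = most critical function
    max_hours: int   # longest session the plan permits


def allocate_lines(requests, lines_available):
    """Grant the limited lines to the highest-priority requests first."""
    ranked = sorted(requests, key=lambda r: (r.priority, r.name))
    return ranked[:lines_available], ranked[lines_available:]


requests = [
    AccessRequest("Asha", "Payments processing", priority=1, max_hours=8),
    AccessRequest("Carla", "Marketing", priority=3, max_hours=2),
    AccessRequest("Ben", "ICT operations", priority=1, max_hours=8),
    AccessRequest("Dev", "Customer service", priority=2, max_hours=4),
]

granted, waiting = allocate_lines(requests, lines_available=2)
print("Granted:", [r.name for r in granted])
print("Waiting:", [r.name for r in waiting])
```

The point of the sketch is that the decision is made in advance and written into the plan, rather than left to whoever happens to connect first.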
ICT on the front line
It is a safe assumption that without their ICT capabilities, many
organisations would at best be inconvenienced or at worst be left unable
to provide any front-line services. It is essential therefore that
companies know exactly what their key ICT services are and what the
'must have' human interactions are that keep them running. Gaining a
clear understanding of just how hands-on the ICT operation is, and of the
activities involved, will enable you to factor these elements into
your continuity planning.
Organisations must consider not only the tasks performed by their own staff but also
any specialised services provided by suppliers or outsourcing
partners, and seek to ascertain just how resilient these suppliers are in
the face of a major disruption to their operation. They should also be
aware that too often just one member of the ICT department is the key
'knowledge holder' or expert on a particular application, system or
service. How much do the ICT department and the organisation rely upon
the skills and knowledge of just one person?
When evaluating ICT continuity, companies must also consider how the loss of
a particular individual within their ICT team would affect the ongoing
performance of their ICT services should an incident occur, and how
critical that individual is to delivering front-line services.
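A skills register makes this kind of dependency easy to spot. The following sketch assumes a hypothetical mapping of services to the people able to support them; the service and staff names are invented for illustration.

```python
def single_knowledge_holders(skills):
    """Return services that only one person can support.

    skills maps a service name to the set of people who can support it.
    """
    return {
        service: next(iter(people))
        for service, people in skills.items()
        if len(people) == 1
    }


# Hypothetical skills register for a small ICT department.
skills = {
    "Payroll system": {"Sam"},
    "Email": {"Sam", "Priya"},
    "Backup scheduler": {"Lee"},
}

at_risk = single_knowledge_holders(skills)
for service, person in at_risk.items():
    print(f"{service}: relies solely on {person}")
```

Any service that appears in the result is a candidate for cross-training or documented procedures before a disruption occurs.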
Common hands-on ICT activities
Regardless of an organisation's size or the nature of its business, there are
several common activities that all organisations perform and that require a
degree of hands-on involvement. Some of these could be performed remotely
whilst others will need access to the ICT facility.
There are the basic systems monitoring and operations functions necessary to
keep the services running. These are usually carried out directly at the ICT
facility or via a remote operations bridge. But what about the other
functions the operations team may perform?
Continuity of ICT housekeeping and data backups
Let's consider housekeeping and, more specifically, backups. Backups serve two main
purposes: firstly, to recover data after operational errors such
as data corruption (however caused) and user error; and secondly, for
disaster recovery. Organisations should find out how their
backups are performed and how much automation is deployed. Let's assume
that backups are written to tape and that operations staff need to
manually load and unload tapes. If they are unable to do this, what is
the impact on backup processing?
In some companies, the disaster recovery backups are shipped to an
off-site facility. What would the impact be if the courier service or
storage agents were unable to collect the tapes and transport them on
your behalf? Also, what about the tapes that should have been returned
to replenish the tape library? How many days' worth of 'scratch' tapes
are available to allow backup processing to continue? You should also
ask whether your tape storage company has a pandemic plan to ensure
that its service commitments to you are maintained.
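The scratch tape question is simple arithmetic, sketched below in Python. The figures of 20 scratch tapes and 6 tapes per nightly backup are assumptions chosen for illustration.

```python
def scratch_tape_runway(scratch_tapes: int, tapes_per_night: int) -> int:
    """Number of complete nightly backups the scratch pool can support
    if no tapes are returned from off-site storage."""
    if tapes_per_night <= 0:
        raise ValueError("tapes_per_night must be positive")
    return scratch_tapes // tapes_per_night


# Hypothetical figures: 20 scratch tapes on site, 6 used per nightly backup.
nights = scratch_tape_runway(scratch_tapes=20, tapes_per_night=6)
print(f"Backups can continue for {nights} nights without tape returns")
```

Under these assumptions, backup processing would stall after three nights unless tapes are returned or retention is reduced.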
ICT maintenance and break/fix arrangements
Let's now consider the maintenance and break/fix arrangements for the
organisation. Whilst it is not a major issue if maintenance tasks are
delayed, the same cannot be said for the response to break/fix callouts.
How resilient are the ICT services that support the front-line
activities? Would the service continue, albeit in a reduced capacity, if
a component failed or would the service fail? Again, what are the
continuity arrangements of your break/fix suppliers?
These are just some of the questions that will help identify vulnerabilities during your ICT continuity planning.
What steps can you take towards effective ICT continuity planning?
- It is imperative that you conduct an audit to identify all the hands-on
ICT activities and establish how critical they are to the overall ICT
service. Don't ignore any activity; those that seem mundane are often
the most critical.
- Examine the ICT infrastructure for any key single points of failure.
Consider implementing a greater degree of resilience. This will provide
two benefits: firstly, it will reduce the likelihood that a component
failure will cause a service interruption; and secondly, it will improve
the overall long-term robustness of the service.
- Establish a temporary backup and recovery strategy. If tapes are at a
premium then consider reducing the number of generations retained. Is it
really necessary to keep seven or ten days' worth of backups, or is
three days' retention acceptable in the short term?
- Consider creating procedures for non-ICT staff, and training them, to perform
some of the tasks if required. The tape management and handling process
is a good example: this will at least enable backup processing to
continue, especially for disaster recovery.
- Liaise with all your suppliers and ensure they have adequate continuity
arrangements in place. Don't let their lack of resilience create
difficulties for you.
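As a rough illustration of the temporary retention step above, the following Python sketch estimates how many tapes a shorter retention period would free for reuse. The figure of 6 tapes per generation is an assumption, and the sketch ignores tapes held off-site that cannot be retrieved during the disruption.

```python
def tapes_in_rotation(generations: int, tapes_per_generation: int) -> int:
    """Tapes tied up by keeping the given number of backup generations."""
    return generations * tapes_per_generation


def tapes_freed(current_generations: int, temporary_generations: int,
                tapes_per_generation: int) -> int:
    """Tapes released for reuse by temporarily retaining fewer generations."""
    before = tapes_in_rotation(current_generations, tapes_per_generation)
    after = tapes_in_rotation(temporary_generations, tapes_per_generation)
    return max(0, before - after)


# Hypothetical figures: drop from ten generations to three, 6 tapes each.
freed = tapes_freed(current_generations=10, temporary_generations=3,
                    tapes_per_generation=6)
print(f"Reducing retention frees {freed} tapes")
```

Whether three days' retention is acceptable remains a business decision; the calculation only shows what the trade-off buys.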
The devil's in the detail
I hope this has highlighted the need to consider the human element of ICT
service continuity. As ever, the devil is in the detail, and
overlooking this operational-level human dependency could exacerbate an
already challenging situation when that 'key' person falls ill or can't get
to work.
Dominic Cockram is the founder and Managing Director of Steelhenge Consulting
Ltd, specialists in business continuity and crisis management. Special
thanks to Carl Bradbury MBCI, MBCS, senior ICT continuity consultant at
Steelhenge, for his extensive contribution to this article.
Article Source: http://EzineArticles.com/expert/Dominic_Cockram/320296
Article Source: http://EzineArticles.com/3516962
~ (By Dominic Cockram).