Hands-On ICT Continuity Management

How ICT continuity planning can improve your resilience

The potential for large-scale absenteeism resulting from the so-called 'swine' flu pandemic has led many businesses to carry out extensive pandemic planning to ensure that customer-facing and revenue-generating services can be maintained with reduced staff numbers. However, the ICT services that underpin almost every organisation's core business processes are equally important, and it is easy to forget that those services depend on the human touch to keep operating at the desired levels.

It is imperative, when planning for staff absence, that an organisation factors in the level of human interaction required to operate and maintain its ICT services. This article examines common interactive tasks such as day-to-day operations, backup and recovery processing, and break/fix activities, and the impact that a reduction in staff may have on these.

Back office ICT continuity
The vast majority of organisations have prepared specific 'flu-related' contingency plans and arrangements to manage the potential large-scale absence of staff and to ensure the continuity of their critical business processes. Absenteeism may arise for a variety of reasons, such as a corporate "stay away" policy, a significantly reduced public transport system making travel to and from work difficult, illness, or a need to care for children or other sick dependents.

The primary focus for most companies is to ensure that customer-facing and revenue-generating services can be maintained despite reduced staff numbers. As a result, pandemic business continuity plans tend to focus on two key aspects: firstly, the welfare of staff and secondly, the provision of 'front line' services. While the continuity of these two components is essential, plans often overlook one critical element of their client-facing activities - the back office functions that support them. The ICT operation is an integral part of these functions. It is easy to forget that even the most automated 'hands off' ICT operation requires some degree of human interaction. In truth, many organisations probably underestimate the level of hands-on support that a solid ICT continuity plan must account for.

Keeping the technology running - ICT continuity during a major disruption
It is not uncommon for organisations to have some degree of remote access capability for their ICT systems. If this is the case then it is likely that remote working is featured somewhere in the company's business continuity plan.

In many organisations, however, remote access is implemented to provide a remote working capability for a limited number of staff, with only a small percentage of these using the service at any one time. The subscription rate for remote access varies from 2:1 to 8:1, depending on which set of statistics is used. Using the higher value, there could be as many as eight subscribers per available line. Normally this is not an issue, as most users can get access when required. However, all that changes when large numbers of staff cannot get to the office, for example because of inclement weather or major transport disruptions, both of which were experienced in the UK in February 2009. When this situation occurs, the remote access service can become severely oversubscribed.
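The arithmetic behind oversubscription is simple enough to sketch. The Python below is a minimal illustration only: the function name and the figures (400 subscribers, 240 staff needing access at once) are hypothetical assumptions, not numbers taken from any survey.

```python
def remote_access_shortfall(subscribers, subscribers_per_line, concurrent_demand):
    """Return how many users cannot connect at the same time.

    subscribers          -- staff enrolled for remote access (assumed figure)
    subscribers_per_line -- the subscription ratio, e.g. 8 for 8:1
    concurrent_demand    -- staff who need a connection at the same moment
    """
    if subscribers_per_line <= 0:
        raise ValueError("subscribers_per_line must be positive")
    # Lines actually provisioned for the enrolled population.
    lines = subscribers // subscribers_per_line
    # Anyone beyond the number of lines is left without access.
    return max(0, concurrent_demand - lines)


# Hypothetical example: 400 subscribers at 8:1 gives 50 lines.
normal_day = remote_access_shortfall(400, 8, 40)    # demand fits within 50 lines
snow_day = remote_access_shortfall(400, 8, 240)     # 240 staff stuck at home
print(normal_day, snow_day)
```

On the assumed figures, a normal day leaves nobody waiting, while the disruption scenario leaves 190 people unable to connect.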

Avoiding denial of remote access
Our experience has shown that there is often little or no planning when it comes to making the most productive use of the limited remote access capability. It is basically a 'free for all', with staff vying for connectivity. This is inconvenient for short periods of disruption, but may have serious repercussions if the large-scale displacement of staff extends over longer periods, especially if remote working is a potential response option for your organisation.

For effective business continuity planning, an organisation must define the criteria for using the remote access capability. The plan should clearly identify who can have remote access, when they can have it and for how long. This will ensure the appropriate people and the functions they perform are given the necessary priority.
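One way to picture such criteria is as a simple priority ranking. The sketch below is an assumption about how a plan might be expressed, not a description of any particular remote access product; the `AccessRequest` fields, the priority scale and the example names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    user: str
    function: str      # the business function the user performs
    priority: int      # 1 = most critical (hypothetical scale)
    max_hours: int     # how long the plan allows this user to stay connected


def allocate_lines(requests, lines_available):
    """Grant lines to the most critical requests first; the rest must wait."""
    # Sort by priority, then by user name so the outcome is deterministic.
    ranked = sorted(requests, key=lambda r: (r.priority, r.user))
    return ranked[:lines_available], ranked[lines_available:]


requests = [
    AccessRequest("carol", "marketing", 3, 2),
    AccessRequest("alice", "payments operations", 1, 8),
    AccessRequest("bob", "customer support", 2, 4),
]
granted, waiting = allocate_lines(requests, lines_available=2)
print([r.user for r in granted], [r.user for r in waiting])
```

With two lines available, the two most critical functions are connected and the third request waits until a line is released.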

ICT on the front line
It is a safe assumption that without their ICT capabilities, many organisations would at best be inconvenienced or at worst be left unable to provide any front-line services. It is essential therefore that companies know exactly what their key ICT services are and what the 'must have' human interactions are that keep them running. Gaining a clear understanding of just how hands-on the ICT operation is, and of the activities involved, will enable you to factor these elements into your continuity planning.

Organisations must consider not only the tasks performed by their own staff but also any specialised tasks performed by suppliers or outsourcing partners, and seek to ascertain just how resilient these suppliers are in the face of a major disruption to their operation. They should also be aware that too often just one member of the ICT department is the key 'knowledge holder' or expert on a particular application, system or service. How much do the ICT department and the organisation rely upon the skills and knowledge of just one person?

When evaluating ICT continuity, companies must also consider how the loss of a particular individual within their ICT team will affect the ongoing performance of their ICT services should an incident occur, and how critical they are in terms of delivering front-line services.
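A quick way to surface this dependency is to list who can support each service and flag those with only one name against them. The following is a minimal sketch; the service names, staff names and the `single_knowledge_holders` helper are hypothetical and would be replaced by the results of your own audit.

```python
def single_knowledge_holders(skills_by_service):
    """Return {service: person} for every service only one person can support."""
    return {
        service: people[0]
        for service, people in skills_by_service.items()
        if len(people) == 1
    }


# Hypothetical audit data: service -> staff able to operate or repair it.
skills_by_service = {
    "payroll system": ["dave"],
    "email": ["alice", "bob"],
    "order processing": ["erin"],
    "backup operations": ["alice", "dave", "frank"],
}

at_risk = single_knowledge_holders(skills_by_service)
print(at_risk)
```

Any service that appears in the result depends on a single individual and is a candidate for cross-training or documented procedures.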

Common hands-on ICT activities
Regardless of an organisation's size or the nature of its business, there are several common activities that require a degree of hands-on work. Some of these can be performed remotely, whilst others need access to the ICT facility.

There are the basic systems monitoring and operations functions necessary to keep the services running. These are usually carried out directly at the ICT facility or via a remote operations bridge. But what about the other functions the operations team may perform?

Continuity of ICT housekeeping and data backups
Let's consider housekeeping and, more specifically, backups. Backups serve two main purposes: firstly, to recover data after operational errors such as data corruption (however caused) and user error; and secondly, for disaster recovery. Organisations should find out how their backups are performed and how much automation is deployed. Let's assume that backups are written to tape and that operations staff need to load and unload tapes manually. If they are unable to do this, what is the impact on backup processing?

In some companies, the disaster recovery backups are shipped to an off-site facility. What would the impact be if the courier service or storage agents were unable to collect the tapes and transport them on your behalf? Also, what about the tapes that should have been returned to replenish the tape library? How many days' worth of 'scratch' tapes are available to allow backup processing to continue? You should also ask whether your tape storage company has a pandemic plan to ensure its service commitments to you are maintained.
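The scratch tape question reduces to a small calculation. The sketch below assumes, purely for illustration, that each night's backup consumes a fixed number of tapes and that no tapes are returned from off-site storage during the disruption; the function name and figures are hypothetical.

```python
def scratch_tape_runway_days(scratch_tapes, tapes_per_night):
    """Number of complete nightly backups possible before scratch tapes run out."""
    if tapes_per_night <= 0:
        raise ValueError("tapes_per_night must be positive")
    # Integer division: a partial set of tapes cannot complete a full backup.
    return scratch_tapes // tapes_per_night


# Hypothetical figures: 20 scratch tapes on site, 3 tapes used per night.
print(scratch_tape_runway_days(20, 3))  # 6
```

If the runway is shorter than the disruption you are planning for, backup processing will stop unless tapes are returned or retention is reduced.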

ICT maintenance and break/fix arrangements
Let's now consider the maintenance and break/fix arrangements for the organisation. Whilst it is not a major issue if maintenance tasks are delayed, the same cannot be said for the response to break/fix callouts. How resilient are the ICT services that support the front-line activities? Would the service continue, albeit in a reduced capacity, if a component failed or would the service fail? Again, what are the continuity arrangements of your break/fix suppliers?
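To illustrate the component failure question, the sketch below models a service as a set of components, each with a count of redundant instances, and lists those with no spare. The component names and the two-instance threshold are assumptions made for the example; a real assessment would also need to consider shared dependencies such as power and network paths.

```python
def single_points_of_failure(instances_by_component):
    """Return the components whose failure would stop the service outright."""
    return sorted(
        name
        for name, count in instances_by_component.items()
        if count < 2  # fewer than two instances means no spare to fail over to
    )


# Hypothetical front-line service and its supporting components.
order_service = {
    "web server": 3,
    "database server": 1,
    "firewall": 2,
    "tape library": 1,
}

print(single_points_of_failure(order_service))
```

Components in the result are the ones for which a delayed break/fix callout would translate directly into a service outage.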

These are just some of the questions that will help identify vulnerabilities during your ICT continuity planning.

What steps can you take towards effective ICT continuity planning?

- It is imperative that you conduct an audit to identify all the hands-on ICT activities and establish how critical they are to the overall ICT service. Don't ignore any activity; those that seem mundane are often the most critical.

- Examine the ICT infrastructure for any key single points of failure. Consider implementing a greater degree of resilience. This will provide two benefits: firstly, it will reduce the likelihood that a component failure will cause a service interruption; and secondly, it will improve the overall long-term robustness of the service.

- Establish a temporary backup and recovery strategy. If tapes are at a premium then consider reducing the number of generations retained. Is it really necessary to keep seven or ten days' worth of backups, or is three days' retention acceptable in the short term?

- Consider creating procedures for, and training, non-ICT staff to perform some of the tasks if required. The tape management and handling process is a good example. This will at least enable backup processing to continue, especially for disaster recovery.

- Liaise with all your suppliers and ensure they have adequate continuity arrangements in place. Don't let their lack of resilience create difficulties for you.
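As an illustration of the temporary retention step above, the following sketch estimates how many tapes are released back to the scratch pool when retention is shortened. It assumes one backup generation per day and a fixed number of tapes per generation; both assumptions, and the function name, are hypothetical.

```python
def tapes_freed_by_shorter_retention(current_days, temporary_days, tapes_per_generation):
    """Tapes returned to the scratch pool when fewer generations are kept."""
    if temporary_days >= current_days:
        return 0  # retention was not actually reduced
    return (current_days - temporary_days) * tapes_per_generation


# Hypothetical figures: drop from seven days to three, 3 tapes per generation.
print(tapes_freed_by_shorter_retention(7, 3, 3))  # 12
```

The trade-off is that older recovery points are no longer available, so a shortened retention period should be agreed in advance and reversed once normal operations resume.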

The devil's in the detail
I hope this has highlighted the need to consider the human element of ICT service continuity. As ever, the devil is in the detail, and overlooking this operational-level human dependency could exacerbate an already challenging situation when that 'key' person falls ill or can't get to work.

Dominic Cockram is the founder and Managing Director of Steelhenge Consulting Ltd, specialists in business continuity and crisis management. Special thanks to Carl Bradbury MBCI, MBCS, senior ICT continuity consultant at Steelhenge, for his extensive contribution to this article.

Article Source: http://EzineArticles.com/expert/Dominic_Cockram/320296


Article Source: http://EzineArticles.com/3516962





(By Dominic Cockram)
