The Continuity Program Methodology for Higher Ed can help you build and improve your institution’s continuity program. The methodology outlines where to begin and how to move forward in the planning process. Continuity program development follows four distinct phases: discover, define, develop, and demonstrate. Each phase feeds the next, except that phases 3 and 4 alternate: you will move back and forth between developing and demonstrating as the continuity program matures.
Creating a new program can be a daunting task; read on to learn more about applying this methodology at your institution. Building a viable and resilient continuity program is best approached in phases.
Phase 1, discover, is where you assess your institution’s business functions and determine their criticality. The best way to achieve this is to conduct a Business Impact Analysis (BIA). A BIA is an important initial step in building a continuity program because it allows you to identify the critical functions of your institution and the critical applications that support those functions.
The BIA will help you identify the recovery time objectives (RTOs) for the business functions by first determining the quantitative and qualitative loss potential during an adverse event. Once you have compiled loss potential information, you can determine the most critical functions of the institution and see more clearly which high-priority functions to focus on first as you build out the continuity program.
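As a rough illustration of how compiled BIA data can be turned into a priority list, the sketch below ranks functions by RTO, then by qualitative impact, then by estimated financial loss. The function names, the dollar figures, the 1 to 5 impact scale, and the ordering rule are all hypothetical assumptions for illustration; your institution's BIA may weigh these factors differently.

```python
from dataclasses import dataclass


@dataclass
class BusinessFunction:
    name: str
    daily_financial_loss: float  # quantitative loss estimate in dollars per day (hypothetical)
    qualitative_impact: int      # 1 (minor) to 5 (severe); an assumed scale
    rto_hours: int               # recovery time objective recorded in the BIA


def prioritize(functions):
    """Return functions most critical first.

    Assumed ordering rule: shortest RTO, then highest qualitative
    impact, then largest daily financial loss.
    """
    return sorted(
        functions,
        key=lambda f: (f.rto_hours, -f.qualitative_impact, -f.daily_financial_loss),
    )


# Hypothetical sample data, not taken from any real BIA.
functions = [
    BusinessFunction("Alumni newsletter", 500.0, 1, 336),
    BusinessFunction("Student billing", 250_000.0, 5, 8),
    BusinessFunction("Course registration", 100_000.0, 5, 24),
    BusinessFunction("Campus housing assignments", 40_000.0, 4, 24),
]

ranked = prioritize(functions)
for rank, f in enumerate(ranked, start=1):
    print(f"{rank}. {f.name} (RTO {f.rto_hours} h)")
```

Sorting on a tuple keeps the rule easy to read and change; a weighted score would work as well if your BIA team prefers a single number per function.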
It is also important to identify the RTOs and recovery point objectives (RPOs) for the applications that support critical functions. Not all applications are equally critical to core functions, so some can have a less aggressive RTO. For example, if the institution's billing application goes down just before a new semester, it will likely need an aggressive RTO.
Recovery Time Objective = The period of time within which systems, applications, or functions must be recovered after a disastrous event.
Recovery Point Objective = The maximum amount of data loss that the business unit can tolerate as the result of a disaster and recovery. This is measured in time, e.g., 3 days of data can be lost and recreated.
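To make the two definitions concrete, here is a minimal sketch that checks a backup schedule against an RPO and an estimated recovery time against an RTO. It assumes the worst-case data loss equals the interval between backups (a failure just before the next backup runs) and ignores failed or slow backups; the function names and sample durations are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # Simplifying assumption: a failure just before the next backup
    # loses everything written since the previous one.
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the backup schedule keeps worst-case data loss within the RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


def meets_rto(estimated_recovery_time: timedelta, rto: timedelta) -> bool:
    """True if the estimated time to restore service is within the RTO."""
    return estimated_recovery_time <= rto


# Hypothetical application with a 3-day RPO and an 8-hour RTO.
rpo = timedelta(days=3)
rto = timedelta(hours=8)

print("Daily backups meet RPO:", meets_rpo(timedelta(days=1), rpo))
print("Weekly backups meet RPO:", meets_rpo(timedelta(days=7), rpo))
print("6-hour restore meets RTO:", meets_rto(timedelta(hours=6), rto))
```

With a 3-day RPO, daily backups pass the check and weekly backups do not, which is the kind of gap a BIA is meant to surface.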
The BIA also identifies potential risks. Knowing the regional risks (e.g., hurricane, flooding, earthquake) and the local risks (e.g., a research server that is not backed up, or a department without cross-trained individuals to support a critical function) allows the department to create focused action items, so that the risks can be mitigated through pre-event planning.
After understanding which functions are the most critical through the BIA process, you can turn your attention to recovery and mitigation. Recovery and mitigation strategies help each department understand what needs to be done to return to providing services to internal and external customers in the case of an adverse event. Recovery and mitigation strategies should be considered for anything that can be lost, including the loss of facilities, data, people, computing environment, and key suppliers.
The following questions are helpful in creating mitigation and recovery strategies:
1) What will the department do (prior to an event) to eliminate or mitigate risks?
2) How will the department recover from the risks that remain if an event occurs?
For example, if a potential risk is losing a significant amount of research data, your mitigation strategy may be to adequately back up the data offsite, preferably in the cloud, prior to the emergency. If the risk is losing a key staff member, your recovery strategy may be to have a cross-trained individual take temporary responsibility for the duties of the lost staff member.
After collecting data in phase 1 and creating recovery and mitigation strategies in phase 2, you’ll create a business continuity plan in phase 3. As you analyze the collected data and begin to structure and write your plan, think of continuity in the following steps: respond, relocate, recover, resume, and return. These five Rs will help you build a complete and comprehensive plan.
Respond is the “emergency management” phase and is all about surviving the emergency; it’s about human safety. Evacuating a burning building, meeting your team members at a predetermined location, determining if anyone is missing, initial communication, and working with first responders (police, fire department, etc.) are all included here.
Relocate involves moving displaced students in housing, or moving essential employees to an alternate work location if their normal location is unavailable. This may be the local library, another campus building, or working from home, but for a smooth recovery it is best to choose the location and make the arrangements before an emergency occurs.
Recover is the phase where getting back to business begins. Staff in alternate locations get acclimated to their new environment and ensure they have what is needed to perform their functions. The computing environment is restored, and data is tested for integrity. If some applications, data, or people are unavailable, workarounds are most likely in use.
Resume is getting back to business as usual. Essential employees are conducting their functions, perhaps from an alternate location; the computing environment is available, data is current, and normal business activities are occurring. The key is to get to this phase as quickly as possible and with as little interruption to the institution’s internal and external customers as possible.
Return involves returning to the normal (original or new) location(s), with the functions active and being completed by the right people. Everyone is back to doing their jobs as designed. Returning may take days, weeks, or, in some cases such as a pandemic, many months or more.
Finally, in phase 4 you will plan training to test the continuity plans in place. Exercising the plan with a tabletop exercise (TTX) is a vital activity in continuity planning. This is done by walking through a disaster scenario with the recovery team and answering specific questions about what the team would do if people were missing, the building burned down, the media were asking students for comments, and so on. The exercise should meet a defined set of objectives.
The plans should be updated to incorporate findings after every exercise, and action items that are viable, actionable, and have measurable times to complete should be created for any discovered gaps.
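One way to keep post-exercise action items measurable is to record each one with an owner and a due date and to check regularly for overdue items. The sketch below is a minimal illustration under that assumption; the field names, owners, and dates are hypothetical, and a spreadsheet or ticketing tool would serve the same purpose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    gap: str           # gap discovered during the exercise
    action: str        # what will be done to close it
    owner: str         # hypothetical responsible party
    due: date          # measurable time to complete
    done: bool = False


def overdue(items, today):
    """Return open action items whose due date has passed."""
    return [item for item in items if not item.done and item.due < today]


# Hypothetical findings from a tabletop exercise.
items = [
    ActionItem("Research server is not backed up",
               "Set up offsite backups", "IT operations", date(2025, 2, 1)),
    ActionItem("No cross-trained staff for payroll",
               "Cross-train a second staff member", "HR", date(2025, 6, 1)),
    ActionItem("Alternate work location not arranged",
               "Sign agreement with campus library", "Facilities",
               date(2025, 1, 15), done=True),
]

for item in overdue(items, today=date(2025, 3, 1)):
    print(f"OVERDUE: {item.action} (owner: {item.owner}, due {item.due})")
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the check repeatable when reviewing past exercises.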
Higher Ed continuity has a cyclical life cycle. Changes occur daily, and new risks are continuously introduced, but the process itself does not need to be complicated. Keep each phase simple. Starting with the most critical departments or functions and working through the phases will build resilience and foster knowledge, and it may earn greater sponsorship as you move on to the remaining functions and departments.
Empowered with this knowledge, evaluate your institution's continuity program maturity with this Higher Ed Continuity Program Maturity Assessment and use the results to continue improving your program.