Increasingly, colleges and universities are expanding their offerings to include courses, and often entire degree programs, that are available to students whenever they have time to participate and wherever they happen to be. As institutions of higher learning expand their presence in online education and distance learning, there is an implicit expectation that courses and related services will be available 24/7/365. Meeting this expectation requires these institutions to pay closer attention to disaster recovery and business continuity, areas that have frequently been pushed to the rear when funding priorities were established.
System Outages – So what?
While brick-and-mortar colleges and universities have increased their focus on disaster recovery, risk assessment, and business continuity planning for core administrative systems (e.g., payroll) and operational systems (e.g., email, campus web site, phone systems), the unavailability of such systems has historically had little impact on student access to class, on the ability of students to collaborate on coursework, or on their ability to interact with professors. As a result, even when system outages have occurred, the impact on day-to-day classroom operations has been relatively minor.
The situation is, and has been, quite different for online programs, particularly for institutions whose offerings are exclusively online. Having taught classes for a major online university, I know first-hand that even brief system outages can have a significant impact on both students and instructors. System and network outages wreak havoc on the ability of instructors and students to maintain an active presence in the classroom, which is a core requirement. Outages also inhibit student access to class assignments and related reference material, and they limit the ability of students to collaborate on group assignments.
While few will question the potential for major weather events, utility failures, and other phenomena (e.g., cyber-terrorism) to impact operations and system availability, seemingly minor events (e.g., application system and network component failures) and even the failure of third-party solutions can also have a disastrous impact on day-to-day classroom activities. During my online teaching experience, brief outages would sometimes require adjustments to class assignments and schedules to compensate for the negative impact on students’ ability to meet course requirements. Extended outages required the administration to cancel and reschedule courses or to credit students for the course fee.
With universities relying more and more on systems for course delivery and support, even brick-and-mortar institutions are seeing courses impacted by system outages. This is happening at a time when instructional systems continue to receive less attention in disaster recovery planning than back-office or administrative systems.
The time has come for colleges and universities to focus on the development of robust recovery and business continuity plans and capabilities. It is only by doing so that institutions can limit the impact of both minor and major outages.
Business Continuity/Disaster Recovery Planning
My proposition is that colleges and universities offering online courses and programs must reassess disaster recovery and business continuity priorities and plans. In doing so, they must recognize that simply deploying technology (e.g. redundant infrastructure and system failover capabilities to provide high availability) is not enough. While technology is clearly important, it is but one of the requirements for effective disaster recovery and business continuity.
1. Revisit potential risks and their implications: Institutions with sizeable online offerings must reassess risks and recovery requirements and determine if current recovery plans enable them to meet critical recovery requirements.
2. Refine disaster recovery and business continuity processes: Processes and related documentation must be comprehensive and clearly defined (e.g., backup and recovery details, logistics for equipment replacement, location of and access to the recovery facility). The assumption that all team leads and subject matter experts will be available and in place during recovery efforts seldom holds, so the documentation must be detailed enough for others to follow.
3. Assess related operational processes: As most outages are caused by planned procedures that go wrong, process reviews should extend to standard operating processes (e.g., review monitoring processes to ensure capacity trends are identified and dealt with before they become major issues).
4. Test, test and test again: The importance of testing can’t be overstated. Recovery point objectives (RPOs) and recovery time objectives (RTOs) are seldom met after a disaster if they haven’t been achieved during testing.
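To make the testing point concrete, here is a minimal sketch of how the results of a recovery test might be compared against RTO and RPO targets. Everything in it is an assumption for illustration: the RecoveryTest record, the evaluate function, the timestamps, and the four-hour and six-hour targets are hypothetical and do not come from any particular institution or tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTest:
    """Timestamps recorded during one recovery test (hypothetical fields)."""
    last_good_backup: datetime   # most recent recoverable copy of the data
    outage_start: datetime       # moment the simulated outage began
    service_restored: datetime   # moment the service was usable again


def evaluate(test: RecoveryTest, rto: timedelta, rpo: timedelta) -> dict:
    """Compare measured downtime and data-loss window with the targets."""
    # Downtime is measured from outage start to restored service (vs. RTO).
    downtime = test.service_restored - test.outage_start
    # Work done after the last good backup would be lost (vs. RPO).
    data_loss_window = test.outage_start - test.last_good_backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Illustrative values only: a test of a course delivery system.
test = RecoveryTest(
    last_good_backup=datetime(2024, 1, 15, 1, 0),
    outage_start=datetime(2024, 1, 15, 9, 0),
    service_restored=datetime(2024, 1, 15, 12, 30),
)
result = evaluate(test, rto=timedelta(hours=4), rpo=timedelta(hours=6))
print(result["rto_met"], result["rpo_met"])  # prints: True False
```

In this made-up run, service returns within the four-hour RTO, but the last good backup is eight hours old, so the six-hour RPO is missed. A gap like this is far better discovered in a test than after a real outage.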
The ability to quickly restore classroom environments and related services will be well received by both students and instructors and will have a positive impact on retention and revenue.