Case Study (System Recovery Plan)

Using the Omega Case Study, create the system contingency plan template for the Production SAP system. The analysis paper must be three to five pages long and must conform to APA standards. At least two authoritative, outside references are required (anonymous authors or Web pages are not acceptable). These should be listed on the last page, titled “References.”
BACKGROUND
Omega Research is a rapidly growing research and consulting firm. They have a single main office located in Reston, VA and three small branch offices located in San Diego, CA, Salem, OR, and Kansas City, MO. Omega is not currently involved in e-commerce or business-to-business relationships.
Two weeks ago, Omega experienced a significant loss of proprietary data (estimated value $550,000.00) that was stored electronically in an Oracle database in their main office in Reston. The data was unrecoverable and backups were not being routinely maintained, so no restoration was possible. Although he has no hard evidence, Omega’s CTO believes that the loss resulted from deliberate deletion of files by a systems administrator from the Kansas City office who had been “let go” several weeks prior to the loss. Needless to say, the CTO has been tasked to “get things under control.”
You have been hired as a consultant to develop a comprehensive plan for improving the company’s recovery posture in order to prevent future outages of Omega’s critical systems and network resources. Your guidance and observations will eventually be used to develop a long-term procedural and policy solution for Omega Research. The CTO has stepped up to the plate and made the commitment to do whatever it takes to address these issues.

Baseline Network Infrastructure
Omega leverages AT&T Managed Internet Services for each of its office locations.
Omega owns and manages the border routers for each of their office sites.
Offices in Reston, San Diego, and Kansas City receive full T-1 service.
The Salem office receives 256k fractional T-1 (F-T1) circuit service.

Systems

Business processes provided by AIX Environment

Financial
Reporting
Data Warehouse

LAN

Vendor: IBM
Services: Tape Library, TSM Server
Address: 522 South Rd, Poughkeepsie, NY 12601
Phone: 214 451-7747
Contacts: Steve Barretta

Vendor: SunGard
Services: Recovery services for server environment
Address: 401 N Broad St., Philadelphia, PA
Phone: 877 456-3966, 215 351-1300
Contacts: Don Meltin (Test Coord.), Jack Fabrianni (Acct. Rep), Lincoln Balducci (Resource Coord.)

BASELINE ARCHITECTURE
Local Area Architecture (Reston Office)
AIX Environment
Perimeter protection is provided by a screening router configured for dynamic packet filtering using reflexive Access Control Lists (ACLs).
Remote access is provided to employees while at home or on travel through a PPTP VPN and dial-up RAS offered by a Microsoft Windows NT 4.0 Server ®.
All servers in the Reston office have been centrally located in a data center. The Reston data center is secured by a 5-keypunch combination lock that must be used to gain access to the room. That combination is shared with all IT personnel and is infrequently rotated.
The data center is controlled for humidity through HVAC purification.
The data center is controlled for temperature with isolated HVAC services.
The data center is not on a raised floor to control static electricity.
The data center does not have a site-wide UPS. Each server and piece of network equipment has its own mini-UPS.
Internal Omega E-mail is supported by a Microsoft Exchange ® 2000 mail server running on a Microsoft Windows ® 2000 Server. Omega has installed an SMTP mail gateway to support Internet mail exchange.
Omega is the registered owner of com and maintains a DNS Server at the Reston facility for name resolution supporting Omega users and to allow Internet access to publicly accessible information (web and e-mail).
Web hosting services are provided on a Microsoft Windows ® 2000 Server running Internet Information Services (IIS).
X.500 directory services are available through Active Directory, although their implementation is relatively immature – they are operating in a mixed environment.
Server and client o/s environments have not been routinely patched.
Reston office printers are all network connected.
The IT Department is responsible for management of the networks and networked resources at the Reston facility. They manage more than 170 workstations and 6 servers performing the functions previously described.
Client machines run Microsoft Windows ® 95, 98, NT Workstation 4.0, and 2000; Mac operating systems include OS 8 and OS X Panther.
Productivity applications have not been standardized. Some user communities enjoy Corel OfficeSuite ® while others appreciate Microsoft Office ®. There are various editions of these packages installed on client machines.

BASELINE ARCHITECTURE
Local Area Architecture (San Diego Office)
The San Diego office is essentially a mirror of the network architecture provided at the Reston facility.
Differences:
o San Diego does not host a web server.
o San Diego does not support VPN or RAS connections.
o There are fewer employees working out of the west coast office. The local IT staff consists of one engineer who manages all networks and networked resources within the San Diego office.
o There are fewer than 50 client machines in San Diego, with configurations similar to those at the main office.
o All servers have been located in a spare office in San Diego.
There is not a controlled access restriction like in the main center.
The office is not controlled for temperature, humidity, or static.
There are no redundant power supplies.
BASELINE ARCHITECTURE
Local Area Architecture (Salem Office)
Salem is a small site with only 30 workstations configured in much the same way as the rest of the company.
Salem supports a single combined shared file and print server hosted on a Microsoft Windows ® NT 4.0 Server.
Mail services are obtained through the San Diego office, using mailboxes set up on the San Diego Exchange Server.
There are no publicly available networked resources at the Salem office.
Remote access to Salem’s infrastructure is provided to mobile and home employees using VPN client to gateway connectivity.
Salem has an IT staff of one engineer that manages all networks and networked resources at this site.
All servers have been located in a spare office in Salem.
There is not a controlled access restriction like in the main center.
The office is not controlled for temperature, humidity, or static.
There are no redundant power supplies.

BASELINE ARCHITECTURE

Local Area Architecture (Kansas City Office)
Kansas City is very similar in size to the Salem office with the exception that Kansas City runs a Microsoft Exchange ® 2000 server for mail services.
Kansas City has a local system administrator for support.
All servers have been located in a spare office in Kansas City.
There is not a controlled access restriction like in the main center.
The office is not controlled for temperature, humidity, or static.
There are no redundant power supplies.

CONSIDERATIONS
Networking and Systems Administration
Access to any site LAN automatically guarantees access to the entire WAN. This means that user accounts authenticated in the Salem office have immediate access to resources in San Diego, Kansas City, and Reston.
User accounts and access restrictions are independently managed by each office’s system engineer. There is not a common user policy – rules concerning how passwords are created and enforced, cycled, and aged, along with lockout, user account retention, and so on, are created and maintained per office.
There is no formal backup and disaster recovery policy at any site. Off-site backup rotation happens only at the Reston office. Salem currently performs DASD-to-DASD backups without tape copies being made.
The local system administrators at the satellite offices take all direction from the central office and are not authorized to make boundary router changes. They do not have authority to change anything without central IT approval. They have no site-specific budget, yet they have full accountability for their LANs.
All machines run antivirus software although local IT staff infrequently maintains their definition files and relies on user intervention to perform file updates. No machine has spyware protection.
There is no dedicated program for training employees on avoiding threats like, say,
Firewall logs, host packet analysis, application logs, event and error logs are generally ignored across the board.

Business Requirements
The organization is growing rapidly in spite of recent events.
Their strength is in developing business within the local market and providing on-site consulting services. The research end of the business is the well-spring from which they draw their competitive edge, but Omega is realizing that consolidating the research workforce adds synergy to their efforts, and reduces unnecessary overhead.
They plan to continue down that road. As a result, local sites will expand their consulting workforce and research will continue to be consolidated at the Reston and San Diego facilities. As this trend continues to develop, access to the research data stored at the east and west coast facilities becomes critical. Additionally, they cannot afford a similar loss of proprietary information as was recently …and they know it could have been much worse.
Known Environmental Risks
The San Diego office is located in a 20-year earthquake zone. Once every 20 years, it is estimated that an earthquake of 6.0 or greater on the Richter scale will strike the facility, likely causing damage to the facility and computer equipment; management assumes losses to computer assets could be estimated at 20%. As a countermeasure, the company has purchased insurance with an $18,000.00 annual premium that increases 5% every year.
The Reston office is located in a 500-year flood zone. Once every 500 years, it is estimated that a flood will strike the facility, likely causing damage to the facility and computer equipment; management assumes losses to computer assets could be estimated at 40%. The company has opted to not purchase insurance. Annual premiums would run approximately $25,000.
The Kansas City office suffers a significant tornado event once every five years. When the tornado hits, severe electrical disruption affects the equipment and the office suffers 10% losses on computer assets. The company pays $14,000 in annual insurance premiums.
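The three risk statements above can be compared with a conventional annualized loss expectancy (ALE) calculation. The sketch below is illustrative only: the case study does not say which asset figure management would use, so it assumes the Appendix A book-value totals for each site, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteRisk:
    site: str
    asset_value: float           # ASSUMPTION: Appendix A book-value total
    exposure_factor: float       # fraction of computer assets lost per event
    years_between_events: float  # one event expected per this many years
    annual_premium: float        # quoted or paid insurance premium

    def single_loss_expectancy(self) -> float:
        # SLE = asset value x exposure factor
        return self.asset_value * self.exposure_factor

    def annualized_loss_expectancy(self) -> float:
        # ALE = SLE x annual rate of occurrence (1 / years between events)
        return self.single_loss_expectancy() / self.years_between_events


SITES = [
    SiteRisk("San Diego", 167_700, 0.20, 20, 18_000),
    SiteRisk("Reston", 167_700, 0.40, 500, 25_000),
    SiteRisk("Kansas City", 4_044_583, 0.10, 5, 14_000),
]

if __name__ == "__main__":
    for s in SITES:
        print(f"{s.site}: ALE ${s.annualized_loss_expectancy():,.2f} "
              f"vs premium ${s.annual_premium:,.2f}")
```

Under these assumptions the ALE is about $1,677 for San Diego, about $134 for Reston, and about $80,892 for Kansas City. The figures cover computer assets only, so they are a starting point for comparing premiums rather than a full loss estimate.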
Appendix A.
Balance Sheet

Reston
Asset                     Book Value    Actual Value
Networking Equipment          81,290          17,250
Server Equipment              45,690           9,450
Workstation Equipment         27,390           4,309
Peripherals                   13,330               -
TOTAL:                       167,700          31,009

Kansas City
Asset                     Book Value    Actual Value
Networking Equipment          12,700          11,900
Server Equipment           4,009,250       3,400,000
Workstation Equipment         18,200          13,400
Peripherals                    4,433               -
TOTAL:                     4,044,583       3,425,300

Salem
Asset                     Book Value    Actual Value
Networking Equipment           4,300               -
Server Equipment               3,600               -
Workstation Equipment          7,200             500
Peripherals                    4,433               -
TOTAL:                        19,533             500

San Diego
Asset                     Book Value    Actual Value
Networking Equipment          81,290          17,250
Server Equipment              45,690           9,450
Workstation Equipment         27,390           4,309
Peripherals                   13,330               -
TOTAL:                       167,700          31,009

Appendix B.
The Business Impact Interviews
Bill Hermann – We are a service-based company and our ability to take in and book cash is critical. Without solid cash flow, our expenses increase exponentially in a very short period of time. In addition, our cash position, which I monitor through the SAP system, allows us to manage our treasury and short-term funding. I would estimate that within two days we would have to borrow money, which could increase our costs and overhead.

Tiffany Sabers – The I.T. organization is in a period of transition when it comes to recoverability. Implementation of SAP was very expensive, time-consuming, and drawn out. We have built in a level of redundancy to sustain production should any number of things fail within the data center itself. However, we are not in as good a shape as we should be to protect the organization should the entire data center become unavailable for any significant period of time. Several factors come into play when considering the recovery of a central system such as SAP. First, the availability of the technology we’ve chosen at our recovery vendor has been a challenge, to say the least. SunGard needs to acquire and fund the appropriate IBM servers that we use to run the SAP application. Secondly, there are four terabytes of production data that need to be recovered from tape once a disaster is declared. The recovery activity using the current tape library technology on the floor is estimated to take 3 to 4 days, barring any problems. For tape to be a viable option going forward, we need to upgrade to higher-speed, higher-density devices and media to meet the needs of the business, which is another capital expense. I think we all knew and accepted the risk of having to retool with the implementation of SAP. Now that time has come, and this exercise is crucial to determine the proper recovery strategy and technology to meet the business needs.
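
The tape restore estimate can be turned into a rough sustained-throughput requirement. This is a minimal sketch, taking the production data volume as four terabytes (treated here as an assumption) and ignoring tape mounts, verification, and database recovery steps; the function name is hypothetical.

```python
def required_throughput_mb_per_s(data_tb: float, window_hours: float) -> float:
    """Sustained restore rate (decimal MB/s) needed to move data_tb within window_hours."""
    total_bytes = data_tb * 1e12   # decimal terabytes to bytes
    seconds = window_hours * 3600
    return total_bytes / seconds / 1e6


DATA_TB = 4.0  # ASSUMPTION: production data volume read as four terabytes

if __name__ == "__main__":
    # 72 to 96 hours is the current estimate; 48 hours is the point at which
    # the finance interviews expect borrowing to begin.
    for label, hours in (("4 days", 96), ("3 days", 72), ("2 days", 48)):
        rate = required_throughput_mb_per_s(DATA_TB, hours)
        print(f"{label}: {rate:.1f} MB/s sustained")
```

Comparing the rate for a candidate recovery window against the rated speed of a proposed tape or disk technology gives a quick feasibility check before a formal recovery time objective is set.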

John Sampolous – I agree with Bill that our finance structure is key. Since we don’t make anything physical, our business model relies on our cash position. I will say, though, that without finance information available we may begin borrowing on the second day of an outage. The way the SAP system works, without current data we will be a day behind at the start of business on the second day. We’re certainly capable of maintaining business function but will begin to lose $3-$500,000 per day in interest alone. The bottom line is that the treasury function, which is maintained via a finance module within SAP, is critical from our standpoint.

Linda Okonieski – From a purely operational standpoint, we are dead in the water if we can’t get to our schedules and billing information for the persons in the field. We generate a quarter million dollars in revenue on a daily basis through our service organization. So if there is a hard fail of the SAP system, we stand to have issues in two functional areas. The first and most obvious is that if we cannot invoice our clients in a timely manner, our cash flow will diminish significantly at the end of the first week. The second concern is longer-term and related to legal and contractual ramifications if we could not return to business as usual as quickly as possible. In our business, customer confidence and brand value are priceless and need to be protected. So if we are unable to recover quickly, we could very well lose future business, which could affect the viability of the company.

Nate Brown – Linda hit the nail on the head: we need to ensure that we have the right people in the field generating income through billable hours, and we need to continue to collect for their work. So I would say the schedule and billing within the SAP system ranks very high for me. And to add to Linda’s last point, customer confidence is how we’ve been able to maintain a preferred vendor status with most of the companies where we do business, so any chink in the armor could cost us a significant amount of business.

Sandy Ales – Without access to the SAP system we can’t sell services we can’t deliver. Most of our customers rely on us to be able to find and supply the appropriate consultant/resources as quickly as possible. Since we are one of several preferred service providers, we will begin to lose new contracts and renewals to our competition. Our reliance on up-to-date information affects 30 to 40% of our short-term contracts and our ability to compete for longer-term assignments for our higher-value personnel. Since we converted from our old system last year, we have become completely reliant on the SAP application.

Tyler Amdahl – We have built in on-site redundancy for the SAP system, but we are still negotiating a new contract with SunGard services for a recovery configuration at the hot site. Given the amount of data that is involved with the SAP system, we are looking at a minimum recovery time of 12 to 16 hours.

Rachid Chad – The SAP system is designed/architected for failover capability. Unfortunately, the production system implementation is currently around $14 million. There is no economy of scale for full redundancy or real-time failover. There are several options worth considering if we are to meet the recovery time objectives that we all agree to. I can say that they will not be cheap, so we will need to understand the costs related to an outage from the business perspective to enable us to construct the proper recovery strategy.

Reyes Emme – If you were to ask the employees, they would rank getting their paychecks on time as the number one priority. However, the fact is that by self-insuring our payroll funding for a week to 10-day period, we could provide estimated payroll and then rectify any issues once we’re back up and running. We in HR also have our long-term concerns should an outage extend for more than a few days and begin to affect our brand value. The reason, to be quite honest, is that we attract the best consultants partly based on their perception of our technical abilities as an organization.

Fionna O’Connor – The audit and compliance areas are not affected in the short term should an outage occur. However, timing is everything. Should the outage occur during the close of SOX testing or the ramp-up of financial reporting to the board, we could have issues with the regulators as well.

Jackson Davis – We have an all-in situation with the SAP system. We are completely reliant on the system’s availability for day-to-day operation. The risk we have with a prolonged outage is that we will begin to incur penalties on our accounts payable, since we have been able to migrate to a just-in-time payment practice. I am also concerned that we may not have the proper documentation to operate manually should the system be unavailable. I think that however this exercise turns out, several of our departments need to go back and design some contingency plans should the data center be unavailable to us. The penalties for late payment would be 10% of $100,000 per day.
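
The dollar figures quoted in these interviews can be combined into a rough cumulative exposure estimate for a business impact analysis. The sketch below is an illustration, not a finding of the case study: it treats the quarter million dollars of daily revenue as revenue at risk, uses the late-payment penalty of 10% of $100,000 per day, and leaves the daily borrowing cost as a caller-supplied parameter because the interest figure in the interviews is ambiguous. The function and parameter names are hypothetical.

```python
def outage_exposure(days: int,
                    daily_revenue_at_risk: int = 250_000,
                    daily_late_penalty: int = 10_000,  # 10% of $100,000 per day
                    daily_borrowing_cost: int = 0,     # ASSUMPTION: supplied by the analyst
                    borrowing_starts_on_day: int = 2) -> int:
    """Cumulative dollar exposure after a given number of full outage days."""
    total = 0
    for day in range(1, days + 1):
        total += daily_revenue_at_risk + daily_late_penalty
        # Finance expects borrowing to begin on the second day of an outage.
        if day >= borrowing_starts_on_day:
            total += daily_borrowing_cost
    return total


if __name__ == "__main__":
    for days in (1, 2, 3, 4):
        print(f"Day {days}: ${outage_exposure(days):,}")
```

With borrowing cost left at zero, the exposure is $780,000 after three days and $1,040,000 after four, which is the window quoted for the current tape recovery. A figure like this is one input for weighing the cost of a faster recovery strategy against the cost of an outage.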

[System Name]
Security Categorization: High
[Organization Name]
Information System Contingency Plan (ISCP)
Version [#]
[Date]
Prepared by
[Organization Name]
[Street Address]
[City, State, and Zip Code]

Plan Approval
Provide a statement in accordance with the agency’s contingency planning policy to affirm that the ISCP is complete and has been tested sufficiently. The statement should also affirm that the designated authority is responsible for continued maintenance and testing of the ISCP. This statement should be approved and signed by the system designated authority. Space should be provided for the designated authority to sign, along with any other applicable approving signatures. Sample language is provided below:
As the designated authority for {system name}, I hereby certify that the information system contingency plan (ISCP) is complete and that the information contained in this ISCP provides an accurate representation of the application, its hardware, software, and telecommunication components. I further certify that this document identifies the criticality of the system as it relates to the mission of the {organization name}, and that the recovery strategies identified will provide the ability to recover the system functionality in the most expedient and cost-beneficial method in keeping with its level of criticality.
I further attest that this ISCP for {system name} will be tested at least annually. This plan was last tested on {insert exercise date}; the test, training, and exercise (TT&E) material associated with this test can be found {TT&E results appendix or location}. This document will be modified as changes occur and will remain under version control, in accordance with {organization name}’s contingency planning policy.
{System Owner Name}                                                                         Date
{System Owner Title}
Introduction
Information systems are vital to {Organization’s} business processes; therefore, it is critical that services provided by {system name} are able to operate effectively without excessive interruption. This Information System Contingency Plan (ISCP) establishes comprehensive procedures to recover {system name} quickly and effectively following a service disruption.
1.1 Background
This {system name} Information System (IS) Contingency Plan (CP) establishes procedures to recover {system name} following a disruption. The following recovery plan objectives have been established:
Maximize the effectiveness of contingency operations through an established plan that consists of the following phases:
Activation and Notification phase to activate the plan and determine the extent of damage;
Recovery phase to restore {system name} operations; and
Reconstitution phase to ensure that {system name} is validated through testing and that normal operations are resumed.
Identify the activities, resources, and procedures to carry out {system name} processing requirements during prolonged interruptions to normal operations.
Assign responsibilities to designated {organization name} personnel and provide guidance for recovering {system name} during prolonged periods of interruption to normal operations.
Ensure coordination with other personnel responsible for {organization name} contingency planning strategies. Ensure coordination with external points of contact and vendors associated with {system name} and execution of this plan.
1.2 Scope
This ISCP has been developed for {system name}, which is classified as a High-Impact system, in accordance with Federal Information Processing Standards (FIPS) 199 – Standards for Security Categorization of Federal Information and Information Systems. Procedures in this ISCP are for High-Impact systems and designed to recover {system name} within {RTO hours}. This plan does not address replacement or purchase of new equipment, short-term disruptions lasting less than {RTO hours}, or loss of data at the onsite facility or at the user-desktop levels.
1.3 Assumptions
The following assumptions were used when developing this ISCP:
{System name} has been established as a High-Impact System, in accordance with FIPS 199.
Alternate processing sites and offsite storage are required and have been established for this system.
Current backups of the system software and data are intact and available at the offsite storage facility in {City, State}.
Alternate facilities have been established at {City, State} and are available if needed for relocation of {system name}.
The {system name} is inoperable at the {organization name} computer center and cannot be recovered within {RTO hours}.
Key {system name} personnel have been identified and trained in their emergency response and recovery roles; they are available to activate the {system name} Contingency Plan.
Additional assumptions as appropriate.
The {system name} Contingency Plan does not apply to the following situations:
Overall recovery and continuity of business operations. The Business Continuity Plan (BCP) and Continuity of Operations Plan (COOP) address continuity of business operations.
Emergency evacuation of personnel. The Occupant Emergency Plan (OEP) addresses employee evacuation.
Any additional constraints and associated plans should be added to this list.
CONTINGENCY PLANNING GUIDE FOR FEDERAL INFORMATION SYSTEMS (DRAFT)
Concept of Operations
The Concept of Operations section provides details about {system name}, an overview of the three phases of the ISCP (Activation and Notification, Recovery, and Reconstitution), and a description of roles and responsibilities of {Organization’s} personnel during a contingency activation.
2.1 System Description
NOTE: Information for this section should be available from the system’s System Security Plan (SSP) and can be copied from the SSP. Provide a general description of system architecture and functionality. Indicate the operating environment, physical location, general location of users, and partnerships with external organizations/systems. Include information regarding any other technical considerations that are important for recovery purposes, such as backup procedures.
2.2 Overview of Three Phases
This ISCP has been developed to recover the {system name} using a three-phased approach. This approach ensures that system recovery efforts are performed in a methodical sequence to maximize the effectiveness of the recovery effort and minimize system outage time due to errors and omissions.
The three system recovery phases are:
Activation and Notification Phase – Activation of the ISCP occurs after a disruption or outage that may reasonably extend beyond the RTO established for a system. The outage event may result in severe damage to the facility that houses the system, severe damage or loss of equipment, or other damage that typically results in long-term loss.
Once the ISCP is activated, system owners and users are notified of a possible long-term outage, and a thorough outage assessment is performed for the system. Information from the outage assessment is presented to system owners and may be used to modify recovery procedures specific to the cause of the outage.
Recovery Phase – The Recovery phase details the activities and procedures for recovery of the affected system. Activities and procedures are written at a level that an appropriately skilled technician can recover the system without intimate system knowledge. This phase includes notification and awareness escalation procedures for communication of recovery status to system owners and users.
Reconstitution Phase – The Reconstitution phase defines the actions taken to test and validate system capability and functionality. This phase consists of two major activities: validating successful recovery and deactivation of the plan.
During validation, the system is tested and validated as operational prior to returning operation to its normal state. Validation procedures may include functionality or regression testing, concurrent processing, and/or data validation. The system is declared recovered and operational by system owners upon successful completion of validation testing.
Deactivation includes activities to notify users of system operational status. This phase also addresses recovery effort documentation, activity log finalization, incorporation of lessons learned into plan updates, and readying resources for any future recovery events.

2.3 Roles and Responsibilities
The ISCP establishes several roles for {system name} recovery and/or recovery support. Persons or teams assigned ISCP roles have been trained to respond to a contingency event affecting {system name}.
Describe each team and role responsible for executing or supporting system recovery. Include responsibilities for each team/role, leadership roles, and coordination with other recovery teams, as applicable. At a minimum, a role should be established for a system owner or business unit point of contact, a recovery coordinator, and a technical recovery point of contact.
Leadership roles should include an ISCP Director, who has overall management responsibility for the plan, and an ISCP Coordinator, who is responsible to oversee recovery effort progress, initiate any needed escalations or awareness communications, and establish coordination with other recovery teams as appropriate.
Activation and Notification
The Activation and Notification Phase defines initial actions taken once a {system name} disruption has been detected or appears to be imminent. This phase includes activities to notify recovery personnel, conduct an outage assessment, and activate the ISCP. At the completion of the Activation and Notification Phase, {system name} ISCP staff will be prepared to perform recovery measures to restore system functions.
3.1 Activation Criteria and Procedure
The {system name} ISCP may be activated if one or more of the following criteria are met:
The type of outage indicates {system name} will be down for more than {RTO hours};
The facility housing {system name} is damaged and may not be available within {RTO hours}; and
Other criteria, as appropriate.
The following persons or roles may activate the ISCP if one or more of these criteria are met:
Establish one or more roles that may activate the plan based on activation criteria. Authorized persons may include the system or business owner, or the operations point of contact (POC) for system support.
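The activation criteria above can be expressed as a simple decision check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the 48-hour RTO in the usage lines is a placeholder rather than a value from the plan, and any "other criteria" the organization adds would need their own conditions.

```python
from typing import Optional


def should_activate_iscp(rto_hours: float,
                         expected_outage_hours: float,
                         facility_damaged: bool = False,
                         facility_restore_hours: Optional[float] = None) -> bool:
    """Return True if either documented activation criterion is met."""
    # Criterion 1: the outage is expected to last longer than the RTO.
    outage_exceeds_rto = expected_outage_hours > rto_hours
    # Criterion 2: the facility is damaged and may not be available within the RTO.
    # An unknown restore time is treated conservatively as "may not be available".
    facility_unavailable = facility_damaged and (
        facility_restore_hours is None or facility_restore_hours > rto_hours
    )
    return outage_exceeds_rto or facility_unavailable


if __name__ == "__main__":
    RTO_HOURS = 48  # ASSUMPTION: placeholder for {RTO hours}
    print(should_activate_iscp(RTO_HOURS, expected_outage_hours=72))
    print(should_activate_iscp(RTO_HOURS, expected_outage_hours=4))
```

The check only supports the decision; the authority to activate still rests with the persons or roles named in the plan.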
3.2 Notification
The first step upon activation of the {system name} ISCP is notification of appropriate business and system support personnel. Contact information for appropriate POCs is included in {Contact List Appendix name}.
For {system name}, the following method and procedure for notifications are used:
Describe established notification procedures. Notification procedures should include who makes the initial notifications, the sequence in which personnel are notified (e.g., system owner, technical POC, contingency plan coordinator, business unit or user unit POC, and recovery team POC) and the method of notification (e.g., email blast, call tree, automated notification system, etc.).

3.3 Outage Assessment
Following notification, a thorough outage assessment is necessary to determine the extent of the disruption, any damage, and expected recovery time. This outage assessment is conducted by {name of recovery team}. Assessment results are provided to the ISCP Coordinator to assist in the coordination of the recovery of {system name}.
Outline detailed procedures to include how to determine the cause of the outage; identification of potential for additional disruption or damage; assessment of affected physical area(s); and determination of the physical infrastructure status, IS equipment functionality, and inventory. Procedures should include notation of items that will be needed to be replaced and estimated time to restore service to normal operations.
Recovery
The Recovery Phase provides formal recovery operations that begin after the ISCP has been activated, outage assessments have been completed (if possible), personnel have been notified, and appropriate teams have been mobilized. Recovery Phase activities focus on implementing recovery strategies to restore system capabilities, repair damage, and resume operational capabilities at the original or new permanent location. At the completion of the Recovery Phase, {system name} will be functional and capable of performing the functions identified in the plan.
4.1 Sequence of Recovery Activities
The following activities occur during recovery of {system name}:
Modify the following list as appropriate for the selected system recovery strategy:
Identify recovery location (if not at original location);
Identify required resources to perform recovery procedures;
Retrieve backup and system installation media;
Recover hardware and operating system (if required); and
Recover system from backup and system installation media.
4.2 Recovery Procedures
The following procedures are provided for recovery of {system name} at the original or established alternate location. Recovery procedures are outlined per team and should be executed in the sequence presented to maintain an efficient recovery effort.
Provide general procedures for the recovery of the system from backup media. Specific keystroke-level procedures may be provided in an appendix. If specific procedures are provided in an appendix, a reference to that appendix should be included in this section. Teams or persons responsible for each procedure should be identified.
4.3 Escalation Notices/Awareness

Provide appropriate procedures for escalation notices during recovery efforts. Notifications during recovery include problem escalation to leadership and status awareness to system owners and users. Teams or persons responsible for each escalation/awareness procedure should be identified.
Reconstitution
Reconstitution is the process by which a recovered system is tested to validate system capability and functionality. During Reconstitution, recovery activities are completed and normal system operations are resumed. If the original facility is unrecoverable, the activities in this phase can also be applied to preparing a new permanent location to support system processing requirements. This phase consists of two major activities – validating successful recovery and deactivation of the plan.
5.1 Concurrent Processing
High-impact systems are not required to have concurrent processing as part of the validation effort. If concurrent processing does occur for the system prior to making it operational, procedures should be inserted here. Procedures should include length of time for concurrent processing, processing information on both concurrent systems, and validating information on the new permanent system.
For high-impact systems without concurrent processing, this section may either be removed or the following may be used:
In concurrent processing, a system operates at two separate locations concurrently until there is a level of assurance that the recovered system is operating correctly. {System name} does not have concurrent processing as part of validation. Once the system has been tested and validated, it will be placed into normal operations.
5.2 Validation Data Testing
Validation data testing is the process of testing and validating recovered data to ensure that data files or databases have been recovered completely. The following procedures will be used to determine that the recovered data is complete and current to the last available backup:
Provide procedures for testing or validation of recovered data to ensure that data is correct and up to date. This section may be combined with the Functionality Testing section if procedures test both the functionality and data validity. Teams or persons responsible for each procedure should be identified. An example of a validation data test for a high-impact system would be to log into the system database and check the audit logs to determine that all transactions and updates are current. Detailed data test procedures may be provided in Appendix E, System Validation Test Plan.
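The audit-log check described above could be sketched as follows. This is only an illustration, assuming the audit log can be exported as a list of transaction records both from the last backup and from the recovered system; the `AuditEntry` type, the function names, and the sample data are hypothetical and not part of the template.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEntry:
    """One row from a (hypothetical) audit log export."""
    transaction_id: int
    committed_at: datetime


def missing_transactions(backup_log, recovered_log):
    """Return IDs present in the backup's audit log but absent after recovery."""
    recovered_ids = {entry.transaction_id for entry in recovered_log}
    return sorted(
        entry.transaction_id
        for entry in backup_log
        if entry.transaction_id not in recovered_ids
    )


def is_current_to_backup(backup_log, recovered_log):
    """True when every transaction captured by the last backup was recovered."""
    return not missing_transactions(backup_log, recovered_log)


# Illustrative data only: three transactions in the last backup.
backup_log = [
    AuditEntry(101, datetime(2024, 1, 5, 9, 0)),
    AuditEntry(102, datetime(2024, 1, 5, 9, 30)),
    AuditEntry(103, datetime(2024, 1, 5, 10, 15)),
]
recovered_log = backup_log[:2]  # simulate a restore that lost the last transaction

print(missing_transactions(backup_log, recovered_log))  # [103]
print(is_current_to_backup(backup_log, recovered_log))  # False
```

A non-empty result from `missing_transactions` would indicate that the recovered data is not complete to the last available backup and that the restore should be repeated or investigated before the system is declared recovered.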
5.3 Validation Functionality Testing
Validation functionality testing is the process of verifying that recovered {system name} functionality has been tested, and the system is ready to return to normal operations.
Provide system functionality testing and validation procedures to ensure that the system is operating correctly. This section may be combined with the Data Testing section if procedures test both the functionality and data validity. Teams or persons responsible for each procedure should be identified. An example of a functional test for a high-impact system may be logging into the system and running a series of operations as a test or real user to ensure that all parts of the system are operating correctly. Detailed functionality test procedures may be provided in Appendix E, System Validation Test Plan.
5.4 Recovery Declaration
Upon successfully completing testing and validation, the {designated authority} will formally declare recovery efforts complete, and that {system name} is in normal operations. {System name} business and technical POCs will be notified of the declaration by the ISCP Coordinator.
5.5 Notifications (users)
Upon return to normal system operations, {system name} users will be notified by {role} using predetermined notification procedures (e.g., email, broadcast message, phone calls, etc.).
5.6 Cleanup
Cleanup is the process of cleaning up or dismantling any temporary recovery locations, restocking supplies used, returning manuals or other documentation to their original locations, and readying the system for a possible future contingency event.
Provide any specific cleanup procedures for the system, including preferred locations for manuals and documents and returning backup or installation media to its original location.
5.7 Offsite Data Storage
It is important that all backup and installation media used during recovery be returned to the offsite data storage location. The following procedures should be followed to return backup and installation media to its offsite data storage location.
Provide procedures for returning retrieved backup or installation media to its offsite data storage location. This may include proper logging and packaging of backup and installation media, preparing for transportation, and validating that media is securely stored at the offsite location.
5.8 Data Backup
As soon as reasonable following recovery, the system should be fully backed up and a new copy of the current operational system stored for future recovery efforts. This full backup is then kept with other system backups. The procedures for conducting a full system backup are:
Provide appropriate procedures for ensuring that a full system backup is conducted within a reasonable time frame, ideally at the next scheduled backup period. This backup should go offsite with the other media described in Section 5.7.
5.9 Event Documentation
It is important that all recovery events be well-documented, including actions taken and problems encountered during the recovery effort, and lessons learned for inclusion and update to this ISCP. It is the responsibility of each recovery team or person to document their actions during the recovery effort, and to provide that documentation to the ISCP Coordinator.
Provide details about the types of information each recovery team member is required to provide or collect for updating the ISCP with lessons learned. Types of documentation that should be generated and collected after a contingency activation include:
Activity logs (including recovery steps performed and by whom, the time the steps were initiated and completed, and any problems or concerns encountered while executing activities);
Functionality and data testing results;
Lessons learned documentation; and
After Action Report.
Event documentation procedures should detail responsibilities for development, collection, approval, and maintenance.
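One way to capture the activity-log fields listed above (step performed, by whom, start and completion times, and problems encountered) is a small record structure. The sketch below is illustrative only; the `ActivityLogEntry` class, its field names, and the sample entries are assumptions rather than anything the template prescribes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ActivityLogEntry:
    """One recovery step as it might be recorded by a team member."""
    step: str
    performed_by: str
    started_at: datetime
    completed_at: datetime
    problems: list = field(default_factory=list)

    def duration(self) -> timedelta:
        """Elapsed time between initiating and completing the step."""
        return self.completed_at - self.started_at


def entries_with_problems(log):
    """Select entries that should feed the lessons-learned review."""
    return [entry for entry in log if entry.problems]


# Illustrative entries only.
log = [
    ActivityLogEntry(
        "Retrieve backup media", "Recovery Team Lead",
        datetime(2024, 1, 5, 8, 0), datetime(2024, 1, 5, 9, 30),
    ),
    ActivityLogEntry(
        "Restore database from backup", "Recovery Team Member",
        datetime(2024, 1, 5, 9, 30), datetime(2024, 1, 5, 13, 0),
        problems=["First tape unreadable; used duplicate copy"],
    ),
]

for entry in entries_with_problems(log):
    print(entry.step, entry.duration(), entry.problems)
```

Recording start and completion times per step also gives the ISCP Coordinator actual restore durations to compare against the estimates in the plan.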
5.10 Deactivation
Once all activities have been completed and documentation has been updated, the {designated authority} will formally deactivate the ISCP recovery effort. Notification of this declaration will be provided to all business and technical POCs.
SUGGESTED APPENDICES
ISCP appendices included should be based on system and plan requirements. The following appendices are recommended:
APPENDIX A PERSONNEL CONTACT LIST
Provide contact information for each person with a role or responsibility for activation or implementation of the ISCP, or coordination with the ISCP. For each person listed, at least one office and one non-office contact number is recommended.
{System name} ISCP Key Personnel
| Key Personnel | Contact Information | |
| ISCP Director | Work | Insert number |
| Insert Name and Title | Home | Insert number |
| Insert Street Address | Cellular | Insert number |
| Insert City, State, and Zip Code | Email | Insert email address |
| ISCP Director – Alternate | Work | |
| | Home | |
| | Cellular | |
| | Email | |
| ISCP Coordinator | Work | |
| | Home | |
| | Cellular | |
| | Email | |
| ISCP Coordinator – Alternate | Work | |
| | Home | |
| | Cellular | |
| | Email | |
| Recovery Team – Team Lead | Work | |
| | Home | |
| | Cellular | |
| | Email | |
| Recovery Team – Team Member | Work | |
| | Home | |
| | Cellular | |
| | Email | |
APPENDIX B VENDOR CONTACT LIST
Contact information for all key maintenance or support vendors should be included in this appendix. Contact information, such as emergency phone numbers, contact names, contract numbers, and contractual response and onsite times should be included.
APPENDIX C DETAILED RECOVERY PROCEDURES
This appendix includes the detailed recovery procedures for the system, which may include items such as:
Keystroke-level recovery steps;
System installation instructions from tape, CD, or other media;
Required configuration settings or changes;
Recovery of data from tape and audit logs; and
Other system recovery procedures, as appropriate.
If the system relies totally on another group or system for its recovery (such as a mainframe system), information provided should include contact information and locations of detailed recovery procedures for that supporting system.
APPENDIX D ALTERNATE PROCESSING PROCEDURES
This section should identify any alternate manual or technical processing procedures available that allow the business unit to continue some processing of information that would normally be done by the affected system. Examples of alternate processes include manual forms processing, input into workstations to store data until it can be uploaded and processed, or queuing of data input.
APPENDIX E SYSTEM VALIDATION TEST PLAN
This appendix includes system acceptance procedures that are performed after the system has been recovered and prior to putting the system into full operation and returning it to users. The System Validation Test Plan may include the regression or functionality testing conducted prior to implementation of a system upgrade or change.

An example of a system validation test plan:
Once the system has been recovered, the following steps will be performed to validate system data and functionality:
| Procedure | Expected Results | Actual Results | Successful? | Performed by |
| At the Command Prompt, type in sysname | System Log-in Screen appears | | | |
| Log in as user testuser, using password testpass | Initial Screen with Main Menu shows | | | |
| From Menu – select 5- Generate Report | Report Generation Screen shows | | | |
| – Select Current Date Report; – Select Weekly; – Select To Screen | Report is generated on screen with last successful transaction included | | | |
| – Select Close | Report Generation Screen shows | | | |
| – Select Return to Main Menu | Initial Screen with Main Menu shows | | | |
| – Select Log-Off | Log-in Screen appears | | | |
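The columns of the sample test plan above (Procedure, Expected Results, Actual Results, Successful?, Performed by) could also be tracked programmatically. The following is a minimal sketch, assuming results are recorded as free text and compared case-insensitively; the `ValidationStep` class and `plan_passed` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationStep:
    """One row of the validation test plan."""
    procedure: str
    expected: str
    actual: Optional[str] = None
    performed_by: Optional[str] = None

    def record(self, actual: str, performed_by: str) -> None:
        """Fill in the Actual Results and Performed by columns."""
        self.actual = actual
        self.performed_by = performed_by

    @property
    def successful(self) -> Optional[bool]:
        """None until the step is performed, then a case-insensitive match."""
        if self.actual is None:
            return None
        return self.actual.strip().lower() == self.expected.strip().lower()


def plan_passed(steps) -> bool:
    """The plan passes only when every step was performed and matched."""
    return all(step.successful is True for step in steps)


steps = [
    ValidationStep("At the Command Prompt, type in sysname",
                   "System Log-in Screen appears"),
    ValidationStep("Log in as user testuser, using password testpass",
                   "Initial Screen with Main Menu shows"),
]

steps[0].record("System Log-in Screen appears", "Recovery Team Member")
print(plan_passed(steps))  # False: the second step has not been performed yet

steps[1].record("Initial screen with main menu shows", "Recovery Team Member")
print(plan_passed(steps))  # True
```

Treating an unperformed step as a failure of the overall plan is a deliberate choice in this sketch: recovery should not be declared complete while any validation step remains blank.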
APPENDIX F ALTERNATE STORAGE, SITE, AND TELECOMMUNICATIONS
This appendix provides information for alternate storage, alternate processing site, and alternate telecommunications for the system. Alternate storage, site, and telecommunications information is required for high-impact systems, per NIST SP 800-53 Rev. 3. Refer to NIST SP 800-53 Rev. 3 for details on control specifics. Information that should be provided for each area includes:
Alternate Storage:
City and state of alternate storage facility, and distance from primary facility;
Whether the alternate storage facility is owned by the organization or is a third-party storage provider;
Name and points of contact for the alternate storage facility;
Delivery schedule and procedures for packaging media to go to alternate storage facility;
Procedures for retrieving media from the alternate storage facility;
Names and contact information for those persons authorized to retrieve media;
Alternate storage configuration features that facilitate recovery operations (such as keyed or card reader access by authorized retrieval personnel);
Any potential accessibility problems to the alternate storage site in the event of a widespread disruption or disaster;
Mitigation steps to access alternate storage site in the event of a widespread disruption or disaster;
Types of data located at alternate storage site, including databases, application software, operating systems, and other critical information system software; and
Other information as appropriate.
Alternate Processing Site:
City and state of alternate processing site, and distance from primary facility;
Whether the alternate processing site is owned by the organization or is a third-party site provider;
Name and points of contact for the alternate processing site;
Procedures for accessing and using the alternate processing site, and access security features of alternate processing site;
Names and contact information for those persons authorized to go to alternate processing site;
Type of alternate processing site, and equipment available at site;
Alternate processing site configuration information (such as available power, floor space, office space, telecommunications availability, etc.);
Any potential accessibility problems to the alternate processing site in the event of a widespread disruption or disaster;
Mitigation steps to access alternate processing site in the event of a widespread disruption or disaster;
SLAs or other agreements of use of alternate processing site, available office/support space, set up times, etc.; and
Other information as appropriate.
Alternate Telecommunications:
Name and contact information of alternate telecommunications vendors;
Geographic locations of alternate telecommunications vendors facilities (such as central offices, switch centers, etc.);
Contracted capacity of alternate telecommunications;
SLAs or other agreements for implementation of alternate telecommunications capacity;
Information on alternate telecommunications vendor contingency plans;
Names and contact information for those persons authorized to implement or use alternate telecommunications capacity; and
Other information as appropriate.

APPENDIX G DIAGRAMS (SYSTEM AND INPUT/OUTPUT)
NOTE: Information for this section should be available from the system’s System Security Plan (SSP) and can be copied from the SSP. Include any system architecture, input/output, or other technical or logical diagrams that may be useful in recovering the system. Diagrams may also identify information about interconnection with other systems.

APPENDIX H SYSTEM INVENTORY
Provide the hardware and software inventory for the system. Inventory information should include type of server or hardware on which the system runs, processors and memory requirements, storage requirements, and any other pertinent details. The software inventory should identify the operating system (including service pack or version levels) and any other applications necessary to operate the system, such as database software.
APPENDIX I INTERCONNECTIONS TABLE
NOTE: Information for this section should be available from the system’s System Security Plan (SSP) and can be copied from the SSP. This appendix includes information on other systems that directly interconnect or exchange information with the system. Interconnection information should include the type of connection, information transferred, and contact person for that system.
If the system does not have any direct interconnections, then this appendix may be removed, or the following statement may be used:
{System name} does not directly interconnect with any other systems.
APPENDIX J TEST AND MAINTENANCE SCHEDULE
All ISCPs should be reviewed and tested at least yearly or whenever there is a significant change to the system. Provide information and a schedule for the testing of the system. For High-Impact Systems, a yearly full functional test is required. The full functional test should include all ISCP points of contact and be facilitated by an outside or impartial observer. A formal test plan is developed prior to the functional test, and test procedures are developed to include key sections of the ISCP, including the following:
Notification Procedures;
System recovery on an alternate platform from backup media;
Internal and external connectivity; and
Restoration to normal operations.
Results of the test are documented in an After Action Report, and Lessons Learned are developed for updating information in the ISCP.
NOTE: Full functional tests of systems normally are failover tests to the alternate locations, and may be very disruptive to system operations if not planned well. Other systems located in the same physical location may be affected by or included in the full functional test. It is highly recommended that several functional tests be conducted and evaluated prior to conducting a full functional (failover) test.
Examples of functional tests that may be performed prior to a full functional test include:
Full notification and response of key personnel to recovery location;
Recovery of a server or database from backup media; and
Setup and processing from a server at an alternate location.

The following is a sample of a yearly test and maintenance schedule for a high-impact system:
| Step | Date Due by | Responsible Party | Date Scheduled | Date Held |
| Identify failover test facilitator. | March 1 | ISCP Coordinator | | |
| Determine scope of failover test (include other systems?). | March 15 | ISCP Coordinator, Test Facilitator | | |
| Develop failover test plan. | April 1 | Test Facilitator | | |
| Invite participants. | July 10 | Test Facilitator | | |
| Conduct functional test. | July 31 | Test Facilitator, ISCP Coordinator, POCs | | |
| Finalize after action report and lessons learned. | August 15 | ISCP Coordinator | | |
| Update ISCP based on lessons learned. | September 15 | ISCP Coordinator | | |
| Approve and distribute updated version of ISCP. | September 30 | ISCP Director, ISCP Coordinator | | |
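A schedule like the sample above can be checked for overdue steps with a few lines of code. This is a sketch only: the `ScheduleStep` class, the `overdue_steps` function, the calendar year, and the sample "Date Held" value are all assumptions made for illustration, and only the first three steps of the sample schedule are shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ScheduleStep:
    """One row of the test and maintenance schedule."""
    name: str
    due: date
    date_held: Optional[date] = None  # None until the step has taken place


def overdue_steps(schedule, today):
    """Names of steps that are past their due date and have not been held."""
    return [
        step.name
        for step in schedule
        if step.date_held is None and step.due < today
    ]


YEAR = 2024  # assumption: the sample schedule gives month and day only

schedule = [
    ScheduleStep("Identify failover test facilitator.",
                 date(YEAR, 3, 1), date_held=date(YEAR, 2, 26)),
    ScheduleStep("Determine scope of failover test.", date(YEAR, 3, 15)),
    ScheduleStep("Develop failover test plan.", date(YEAR, 4, 1)),
]

print(overdue_steps(schedule, today=date(YEAR, 3, 20)))
# ['Determine scope of failover test.']
```

The ISCP Coordinator could run such a check periodically so that slipped steps are escalated well before the functional test date.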
APPENDIX K ASSOCIATED PLANS AND PROCEDURES
NOTE: Information for this section should be available from the system’s System Security Plan (SSP) and can be copied from the SSP. ISCPs for other systems that either interconnect or support the system should be identified in this appendix. The most current version of the ISCP, location of ISCP, and primary point of contact (such as the ISCP Coordinator) should be noted.
APPENDIX L BUSINESS IMPACT ANALYSIS
The Business Impact Analysis results should be included in this appendix.
APPENDIX M DOCUMENT CHANGE PAGE
Modifications made to this plan since the last printing are as follows:
Record of Changes
| Page No. | Change Comment | Date of Change | Signature |
| | | | |
