Wednesday, July 9, 2014

Perimeter Security Devices Evaluation - Part 1

Firewall (Next Generation Firewall, UTM) Selection Criteria – Part 1

Firewalls and other perimeter defense appliances are the most essential security devices in today's IT infrastructures. Almost every security administrator has a favorite vendor when it comes to selecting the appropriate device for their environment. I have even seen people defend vendors and products the way they would support a sports team.

I have long thought about the criteria to consider when choosing an appropriate perimeter defense device, and this article is the result. I am not going to criticise or praise any specific product or vendor; I will just try to present the points that should be considered. One should know that the right perimeter device for an SMB may (and must) differ from the right one for a large enterprise, and the sector in which the company operates is another important factor.

So, let’s begin naming those criteria one by one.


We all know that the days when firewalls were used merely as IP and port filtering (ACL) devices and session state checkers are long over. Everybody now expects more from these devices: efficiencies on many fronts, starting with cost and effort.

But how far should a firewall go with functionalities?

Today most firewall products offer additional functionalities such as Intrusion Prevention, Web Content Filtering (Web Proxy), Remote Connectivity Gateway (SSL VPN), Data Loss Prevention, Malware and Spyware Protection, Security Event Management, Endpoint Protection Console Services and Bot Protection. The list may include other capabilities depending on the vendor, but these are the most common ones.

While SMBs would like to have all-in-one solutions, and they may be right in that approach, special care must be taken here. Licensing is the keyword in the selection of many security products, and firewalls are no exception. SMBs with low information security maturity levels really should not opt for too many capabilities, as doing so will serve little purpose other than increasing their operational expenditure budgets. I believe the minimum set of functionalities to select should be Next Generation Firewall, Intrusion Prevention, Web Proxy and SSL VPN. Administrators should also be careful not to put all the functionalities in one box, even if they have cluster configurations.

Leaving security intelligence for several functions to just one vendor is another risk to be aware of.

For larger enterprises, the pros and cons of selecting a multipurpose perimeter security device are more obvious. On one hand, having the smallest possible number of devices to manage is very important at a time when collecting logs from a large number of different sources is a real burden for security administrators, not to mention the cost and licensing advantages.

However, up to what point can we put different functionalities into the same basket?
Vendors all have different hardware designs for their appliances, and even a single vendor proposes very different products for different segments. Having multiple functionalities means that a problem in one function may have damaging effects on the others: even if the processes are isolated, in most vendors' products the hardware is not (and the cost goes up when hardware isolation exists). You may have to sacrifice certain capabilities in favor of others in some situations, which is a situation most people really do not want.

Briefly, putting too much in the same basket has its risks, and each enterprise should make its own decision on the subject. In a world where there are really no standard metrics for measuring the computing power of these appliances, this is the truth security administrators must face.

My experience on this subject is that more and more enterprises should migrate to security products that have hardware isolation between functionalities, starting with separating the management plane from the data plane. I believe many security administrators have at least once faced that annoying situation where they could not access their devices during a serious problem, just because the data plane was too busy and consuming all the resources. In today's world, resource exhaustion attacks are still a major unresolved issue, and until they are resolved, processing cycles must be allocated and consumed very carefully.

No matter the size of the company, some functions should be considered very carefully in the selection process. In a company where the data classification scheme, data roles (owner, custodian, user, etc.) and responsibilities are not clear, Data Loss Prevention should not be added to the list. Basic clear-text filtering options are present in most products on the market and can be used for basic needs whenever necessary.

Choosing Endpoint Protection Management functionality on the appliance may be wise for an enterprise of up to 50 employees. Beyond that number, it is wiser to use a separate system for that need than to consume very precious resources that could be devoted to more important security functions.

A final observation on perimeter security devices concerns the Next Generation Firewall capabilities, which basically come down to application recognition (including applets and modules) and domain integration (user recognition). These are the musts of such appliances in today's world and must not be taken lightly. Even the biggest vendors with considerable market share perform these functions only on paper, which actually means they do not perform them in a stable manner, or simply pretend to. Many products claim to recognize thousands of applications, while in fact those thousands consist of assorted social media modules and old chat programs. The most needed and business-critical applications (ERP, CRM, databases, productivity suites) are not recognized, so they give you no insight into your business-critical data flows.

Monday, May 5, 2014

CISSP - Passed exam in 2 months and at the 1st attempt

My journey to the CISSP certification started almost a year ago with my wish to prove my experience and competency in Information Security.

My first decision was to go for the CISM certification, which, as its name suggests, is for people who aim to be Information Security Managers. I did my research, joined ISACA and the local chapter, and ordered the book. Once I had the book in hand I was very enthusiastic to see what was inside, and my first disappointment dates back to that moment. It was a 250-page book, very dry and not really written for a person coming from a technical background. During the first week, my record was 4 pages before falling asleep on the book, and that for 6 days in a row. Worse, after that week I had retained almost nothing of what I had read so far, so I decided to take things a bit easier.

When I shared my experiences with a colleague, he told me about CISSP, which seemed to be a more complete program, covering both the technical aspects of Information Security and Information Security Management. When I read more about CISSP, I was convinced that giving it the first chance would be a better idea, and I very slowly started reading about it.

An interesting series of events made me leave the company I was working for, creating the ideal environment to spend more time on my personal development, including the certifications for which I had not been able to spare time and focus. So I started studying seriously for the CISSP at the beginning of March 2014.
At the beginning, finding the right study resources was very important. There are tons of resources for both studying and practice tests, and if you are like me, meaning you always want to be 150% sure of what you are doing, that can be very confusing. The Techexams.Net forum on CISSP was a great place to follow, as many people who were about to take the exam, or had already passed it, shared their experiences there, creating a very nice community. I would definitely suggest following that forum if you need guidance or a second opinion (even if you do not need anything, go and read! :) )

After reading people’s comments, I decided to study from the famous All in One (AIO) 6th Edition book by Shon Harris. Many people find that book overly detailed, dry and full of bad humor (I personally would not be that harsh), but like it or not, the book explains everything you need for the CISSP simply enough for a five-year-old (that's the way I like it; a shame perhaps, but I own it). I can even say that although many people insist the best resource for the CISSP is naturally ISC2's Official Guide, the bible of the CISSP is the AIO book. You can definitely count on it and use that book alone to pass the exam… if only you have plenty of time and that much interest in detail, neither of which I had.
At that moment Eric Conrad's CISSP Study Guide 2nd Edition came as a lifesaver: it is half the length of the AIO and a lot easier to read, with very nice examples (I will never forget the Object Oriented Programming concepts thanks to one perfect example). That book helped me cover all 10 domains in a month.

So by the beginning of April I was finally able to test myself with practice tests covering questions from all domains, which gave me an overall picture and encouraged me to take several full-length practice tests in a very short period. Basically, what I did during April was at least one full-length test per day, followed by an evaluation of the wrong answers to understand where I went wrong. Surprisingly, from beginning to end, no matter which test engine I used, I was constantly scoring in the 79-81% range. By mid-April I was confident enough to schedule the test for the last week of April or the first week of May. Some domains were really tough for me because I had no specific work experience in them (almost nobody can have it all, I believe), and the reasoning behind some answer choices was neither logical nor clear. You will meet such situations quite often. My advice is to trust the judgment built from your own real-life experience and the voice of common sense in such situations, rather than what book A or book B says (there are surprisingly many points on which they disagree, which is also why using more than two books is distracting and confusing).

Be careful when selecting the date and the test center, because, as in my situation, you may plan to take the exam next week without making any arrangements (of course, making the arrangements is a bit scary; the exam costs 520 euros or US dollars, which is nothing to joke about), only to find out that the only test center in your city is fully booked for the next 2 months.

So I finally made up my mind and booked the test in the nearest city for the first week of May. No matter how much you study, after reading the experiences of many different people with totally different backgrounds, you will never feel sure enough without having taken the exam once.
Many people who took the exam say that a managerial (high-level) mindset is essential for dealing with the questions and that most of the questions are long and scenario based… I respect them, but that is not what I think. First of all, in my short working experience of 10 years, I have never seen a manager interested in that level of technical detail; most of the material in the exam is, in fact, very technical. Secondly, in my opinion, the scenario-based questions are not like the ones in the AIO book; they are short, easy to grasp, and cover only a limited part of the exam, so relax.

Finally, I went to the nearest city (500 km away) and sat the exam. After 20 questions I already knew I was going to pass, because it was not much different from what I had seen in the practice tests. I was already composing this blog entry in my head during the exam, which I would not suggest to anyone :). It took me 4 hours to answer the 250 questions in one pass and 30 more minutes to revise 25 flagged questions, including the breaks. One's ability to handle questions decreases greatly over the course of the exam; in the last 30 questions or so, I remember reading the same short question 3 or 4 times. If your proctor is not an annoying person, take a break of at least 5 minutes every hour. This is even more valid for those whose native language is not English; the time to answer a question literally increases exponentially. My right eye was wrecked after carefully staring at the screen for almost 5 hours, and I still had a 5-hour night drive back home.
So, to sum up, if someone asked me for a prescription for passing the exam, I would say:

  1. AIO 6th Edition from Shon Harris (Do all the tests in the book)
  2. CISSP Study Guide 2nd Edition from Eric Conrad (Again do all the tests)

At least the equivalent of 150 hours in dedicated mode, including the reading sessions (another clue :) )

  1. Software included in AIO book (Total Tester, the best resource IMHO)
  2. Eric Conrad’s 2 Free Full Length Tests
  3.’s paid practice tests (Just because everyone else does it IMHO)
  4. McGraw Hill’s free practice tests

Know the cornerstone concepts that one sees everywhere and every day, such as Business Impact Analysis, System Development Life Cycle, BCP/DRP, Incident Response Plan, and Risk Analysis and Assessment. Do not memorize the steps, but know the flow, the logical order.

Memorize the Orange Book levels :)

I wish you all good luck in your quest to pass the exam, and I will definitely keep my blog alive with Information Security related articles that can be useful for CISSP preparation.

Wednesday, April 23, 2014

CISSP - EAP Protocols

Questions about EAP are annoying ones, at least for me, and not many people really seem to know the differences between the variants. Furthermore, even the official CISSP guide from ISC2 does not say much about them. However, you may meet many questions about them in different practice tests.

I will try to give you the essentials so that you know enough about them and can discover more if you wish.

First of all, the Extensible Authentication Protocol (EAP) methods were created for the 802.1X standard, which aims to provide identity-based authentication services. In a secure network environment, both the client that wants to connect to the corporate network and the network authentication server should properly authenticate each other.
When we speak about mutual authentication, the best way to do it is with digital certificates and the use of a Public Key Infrastructure (PKI). Both client and server present their digital certificates to each other for authentication, and sometimes use these certificates to build an SSL tunnel to exchange more information.

EAP-TLS (Transport Layer Security) requires both the client and the authentication server to use digital certificates for authentication. This method is laborious and expensive, as it requires a lot of effort to properly manage the certificates, mostly on the client side. If a client certificate is not renewed correctly or the certificate store is not properly managed, clients may end up having problems connecting to the network. And because many network administrators are not very interested in PKI, troubleshooting is also painful.

EAP-TTLS (Tunneled TLS) eases the problems that EAP-TLS creates by eliminating the client-side certificates. The server-side certificate is used to establish a secure SSL tunnel between the client and the authentication server, and the authentication information is exchanged over this tunnel. This method is of course less secure than EAP-TLS, but it is also much easier to configure and maintain.

EAP-PEAP (Protected EAP) works in much the same way as EAP-TTLS, which is why it is confusing for me and, I believe, many others. After the secure tunnel is established using the server certificate, a second method such as EAP-TLS or EAP-MSCHAPv2 (Microsoft's flavor of EAP) is used to carry the authentication information.

These are all the methods given in the official guide. There are of course other protocols, such as LEAP (Cisco's first EAP flavor, now considered insecure and no longer used), EAP-MD5 (sends authentication information hashed with MD5, much less secure than those mentioned above) and EAP-MSCHAPv2 (just an inner authentication method used inside the first three, authenticating with Active Directory credentials), but these seem to be considered non-essential. Knowing this much about them is good enough for general knowledge and the exam.
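As a quick illustration of how these methods differ on the client side, here is a minimal wpa_supplicant configuration sketch; the SSID, identities, passwords and certificate paths are hypothetical, and a real deployment needs values matching your own PKI and RADIUS setup:

```ini
# EAP-TLS: the client must present its own certificate.
network={
    ssid="corp-wifi"                      # hypothetical SSID
    key_mgmt=WPA-EAP
    eap=TLS
    identity=""
    ca_cert="/etc/certs/ca.pem"           # validates the server certificate
    client_cert="/etc/certs/alice.pem"    # client-side certificate (the costly part)
    private_key="/etc/certs/alice.key"
}

# EAP-PEAP: server certificate only; credentials go through the tunnel.
network={
    ssid="corp-wifi"
    key_mgmt=WPA-EAP
    eap=PEAP
    identity="alice"
    password="s3cret"
    ca_cert="/etc/certs/ca.pem"           # still validate the server
    phase2="auth=MSCHAPV2"                # inner authentication method
}
```

An EAP-TTLS profile would look like the PEAP block with eap=TTLS and a phase2 inner method of its own, which shows concretely how close the two designs are.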

I know Aaron Woland from Cisco Networkers events; he is one of the people who designed Cisco's famous ISE product and periodically speaks about identity-based networking concepts and AAA at events. You can find a more detailed explanation on his blog following this link : 

Monday, April 14, 2014

CISSP - Between the lines facts on Access Control

The Access Control domain is considered one of the top 5 domains of the CISSP CBK and must be given close attention. In this domain, too, there are concepts that the average IT professional is pretty unfamiliar with and which must be well understood to obtain the certification. Markup languages and their uses are perhaps the best example of such concepts.

A subject is an active entity and an object is a passive entity.

Permission refers to the access granted for an object, such as read, create, edit and delete.
Right refers to the ability to take an action on an object, e.g. modify the system time.

Privilege = permission + right

A directive access control is deployed to direct, confine, or control the actions of subjects to force or encourage compliance with security policies.

A cognitive password is usually a series of questions about facts or predefined responses that only the subject should know. For example: what is your mother's maiden name?

DAC is also referred to as identity-based access control because access is granted to subjects based on their identity.

A DAC model is implemented using access control lists (ACLs) on objects. It does not offer a centrally controlled management system because owners can alter the ACLs on their objects at will. Access to objects is easy to change, especially when compared to the static nature of mandatory access controls.

Within a DAC environment, users' privileges can easily be suspended while they are on vacation, resumed when they return, or terminated when they leave.

Administrators centrally administer non-discretionary access controls and can make changes that affect the entire environment.

In a non-DAC model, access does not focus on user identity. Instead, a static set of rules governing the whole environment is used to manage access. Non-DAC systems are centrally controlled and easier to manage (although less flexible). Rule-based access controls and lattice-based access controls are both considered non-discretionary.

Subjects under lattice-based access controls acquire a least upper bound and a greatest lower bound of access to labeled objects based on their assigned lattice positions. A common example of a lattice-based access control is a mandatory access control.

A mandatory access control (MAC) system relies upon the use of classification labels. Each classification label represents a security domain, or a realm of security. A security domain is a collection of subjects and objects that share a common security policy.

Mandatory access controls are often considered to be non-discretionary controls because they are lattice based. However, the CISSP CIB lists them separately.

An expansion of this access control method is known as need to know. Subjects with specific clearance levels are granted access to resources only if their work tasks require such access.

Mandatory access control is prohibitive rather than permissive, and it uses an implicit deny philosophy. If access is not specifically granted, it is forbidden. It is generally recognized as being more secure than DAC, but it isn't as flexible or scalable.

A distinguishing factor between MAC and rule-based access controls is that MAC controls have labels while the non-discretionary rule-based access controls do not use labels.

Objects have security labels (or sensitivity labels), subjects have clearances.

A capability table specifies the access rights a certain subject possesses pertaining to specific objects. A capability table is different from an ACL because the subject is bound to the capability table, whereas the object is bound to the ACL.

An access control matrix is a table of subjects and objects indicating what actions individual subjects can take upon individual objects. This type of access control is usually an attribute of DAC models. The access rights can be assigned directly to the subjects (capabilities) or to the objects (ACLs).
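The relationship between the matrix, capability tables and ACLs can be sketched in a few lines; the subjects, objects and rights below are hypothetical:

```python
# A tiny access control matrix: rows are subjects, columns are objects.
matrix = {
    "alice": {"report.doc": {"read", "write"}, "payroll.db": {"read"}},
    "bob":   {"report.doc": {"read"},          "payroll.db": set()},
}

def capabilities(subject):
    """A row of the matrix: the capability list bound to a subject."""
    return {obj: rights for obj, rights in matrix[subject].items() if rights}

def acl(obj):
    """A column of the matrix: the ACL bound to an object."""
    return {subj: row[obj] for subj, row in matrix.items() if row.get(obj)}

def allowed(subject, obj, right):
    """Check a single access: is this right in the matrix cell?"""
    return right in matrix.get(subject, {}).get(obj, set())
```

The same cell of the matrix answers both questions: slicing by row gives what a subject can do (capabilities), slicing by column gives who can touch an object (its ACL).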

A meta-directory gathers the necessary information from multiple sources and stores it in one central directory. This provides a unified view of all users' digital identity information throughout the enterprise.

A virtual directory plays the same role and can be used instead of a meta-directory. The difference between the two is that the meta-directory physically has the identity data in its directory, whereas a virtual directory does not and points to where the actual data reside.

Web portal functions are parts of a website that act as a point of access to information. A portal presents information from diverse sources in a unified manner.

A web portal is made up of portlets, which are pluggable user-interface software components that present information from other systems. A portlet is an interactive application that provides a specific type of web service functionality.

XML is a common language used to exchange information.

Security Assertion Markup Language (SAML) is an XML-based language that is commonly used to exchange authentication and authorisation (AA) information between federated organisations. It is often used to provide SSO capabilities for browser access.

When there is a need to allow a user to log in once and gain access to different and separate web-based applications, the actual authentication data has to be shared between the systems maintaining those web applications, securely and in a standardized manner. This is the role that SAML plays. It is an XML standard that allows authentication and authorization data to be exchanged between security domains.

The Service Provisioning Markup Language (SPML) allows for the exchange of provisioning data between applications, which could reside in one organization or many. SPML allows for the automation of user management (account creation, amendments, revocation) and access entitlement configuration related to electronically published services across multiple provisioning systems. This markup language allows for the integration and interoperation of service provisioning requests across various platforms. When a new employee is hired at a company, that employee usually needs access to a wide range of systems, servers, and applications. Setting up new accounts on each and every system, properly configuring access rights, and then maintaining those accounts throughout their lifetimes is time-consuming, laborious, and error-prone. What if the company has 20,000 employees and thousands of network resources that each employee needs various access rights to? This opens the door for confusion, mistakes, vulnerabilities, and a lack of standardization. SPML allows for all these accounts to be set up and managed simultaneously across the various systems and applications. SPML is made up of three main entities: the Re-questing Authority (RA), which is the entity that is making the request to set up a new account or make changes to an existing account; the Provisioning Service Provider (PSP), which is the software that responds to the account requests; and the Provisioning Service Target (PST), which is the entity that carries out the provisioning activities on the requested system.

Transmission of SAML data can take place over different protocol types, but a common one is the Simple Object Access Protocol (SOAP). SOAP is a specification that outlines how information pertaining to web services is exchanged in a structured manner. It provides the basic messaging framework that allows users to request a service and, in exchange, have the service made available to them. Let's say you need to interact with your company's customer relationship management (CRM) system, which is hosted and maintained by the vendor, for example You would log in to your company's portal and double-click a link for Salesforce. Your company's portal takes this request and your authentication data, packages them in a SAML format, and encapsulates that data in a SOAP message. This message is then transmitted over an HTTP connection to the Salesforce vendor site.

The use of web services in this manner also allows for organizations to provide service oriented architecture (SOA) environments. An SOA is a way to provide independent services residing on different systems in different business domains in one consistent manner. For example, if your company has a web portal that allows you to access the company's CRM, an employee directory, and a help-desk ticketing application, this is most likely being provided through an SOA. The CRM system may be within the marketing department, the employee directory may be within the HR department, and the ticketing system may be within the IT department, but you can interact with all of them through one interface.

Extensible Access Control Markup Language (XACML) is used to define access control policies within an XML format, and it commonly implements role-based access controls. It helps provide assurances to all members in a federation that they are granting the same level of access to different roles.

Diameter supports a wide range of protocols, including traditional IP, Mobile IP, and Voice over IP (VoIP). Because it supports extra commands, it is becoming popular in situations where roaming support is desirable, such as with wireless devices and smart phones.

Key steps in risk management are as follows:
  • Identifying assets
  • Identifying threats
  • Identifying vulnerabilities

After identifying and prioritizing assets, an organization attempts to identify any possible threats to the valuable systems. Threat modelling refers to the process of identifying, understanding, and categorizing potential threats. A goal is to identify a potential list of threats to these systems and to analyze the threats.

Access aggregation refers to collecting multiple pieces of non-sensitive information and combining (aggregating) them to learn sensitive information. Reconnaissance attacks are access aggregation attacks.

A birthday attack focuses on finding collisions. It is named after a statistical phenomenon known as the birthday paradox: if there are 23 people in a room, there is a roughly 50 percent chance that at least two of them will share a birthday.

Birthday attacks are mitigated by using hashing algorithms with a sufficient number of bits to make collisions computationally infeasible. There was a time when MD5 (with its 128-bit digest) was considered collision free. However, computing power continues to improve, and MD5 is no longer considered safe against collisions. SHA-2 can produce digests of up to 512 bits and is considered safer against birthday attacks and collisions, at least for now.
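The paradox itself is easy to verify with the complement rule: compute the probability that all n birthdays are distinct and subtract it from 1.

```python
from math import prod

def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    # Complement: probability that all n birthdays are distinct.
    p_distinct = prod((days - i) / days for i in range(n))
    return 1 - p_distinct

# 23 people is the first group size where the probability crosses 50%.
print(round(p_shared_birthday(22), 3))  # 0.476
print(round(p_shared_birthday(23), 3))  # 0.507
```

The same square-root effect is why an n-bit hash offers only about n/2 bits of collision resistance, which is the whole point of the birthday attack.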

A drive-by download is malware that installs itself without the user's knowledge when the user visits a website. Drive-by downloads take advantage of vulnerabilities in browsers or plug-ins.

Network Segregation, perimeter security, control zone and cabling are physical controls.

Extended TACACS (XTACACS) separates authentication, authorization and accounting processes.

Employing a password generator is a bad idea as users will write down difficult passwords somewhere.

Two factor authentication is better than biometric authentication alone.

In Windows environments, administrators can use a Syskey utility that encrypts the database storing the passwords with a locally stored system key.

Signature dynamics is a method that captures the electrical signals when a person signs a name. Keystroke dynamics captures electrical signals when a person types a certain phrase.

A passphrase is a sequence of characters that is longer than a password and, in some cases, takes the place of a password during an authentication process. The user enters this phrase into an application, and the application transforms the value into a virtual password, making the passphrase the length and format that is required by the application.
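One way an application might perform that passphrase-to-virtual-password transformation is with a key-derivation function. A sketch using Python's standard library follows; the salt, iteration count and output length are illustrative choices, not a recommendation:

```python
import hashlib

def virtual_password(passphrase: str, salt: bytes = b"app-salt",
                     length: int = 16) -> str:
    """Derive a fixed-length 'virtual password' from a passphrase of any length."""
    derived = hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt,
                                  100_000, dklen=length)
    return derived.hex()

# Whatever the passphrase length, the output format is fixed (32 hex chars here).
print(len(virtual_password("correct horse battery staple")))  # 32
```

A real application would use a random per-user salt; the point here is only that the application, not the user, controls the length and format of the stored value.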

A memory card holds information but cannot process information. A smart card holds information and has the necessary hardware and software to actually process that information.

Two types of contactless smart cards are available: hybrid and combi. The hybrid card has two chips, with the capability of utilizing both the contact and contactless formats. A combi card has one microprocessor chip that can communicate to contact or contactless readers.

ISO/IEC standard for Smart Cards is ISO/IEC 14443.

Attackers often delete audit logs that hold this incriminating information. Deleting specific incriminating data within audit logs is called scrubbing.

CISSP - Between the lines notes about Telecommunications Security

The Telecommunications and Network Security domain is one of the largest domains in the CISSP CBK. Even people with a significant level of experience in network operations may find many points they miss in their daily work.

In this blog entry, rather than explaining the facts most people already know, I have tried to summarize the little points that may have escaped many people's attention so far. These points may be lifesaving when answering questions. So let's start with some information about the Session Layer, the layer to which network and security people perhaps pay the least attention.

When two applications need to communicate or transfer data between themselves, a connection may need to be set up between them. The session layer is responsible for establishing a connection between the two applications, maintaining it during the transfer of data, and controlling the release of this connection. The session layer works in three phases: connection establishment, data transfer, and connection release.

Session layer protocols control application-to-application communication, whereas the transport layer protocols handle computer-to-computer communication. For example, if you are using a product that is working in a client/server model, in reality you have a small piece of the product on your computer (client portion) and the larger piece of the software product is running on a different computer (server portion). The communication between these two pieces of the same software product needs to be controlled, which is why session layer protocols even exist. Session layer protocols take on the functionality of middleware, which allows software on two different computers to communicate.

Session layer protocols provide interprocess communication channels, which allow a piece of software on one system to call upon a piece of software on another system without the programmer having to know the specifics of the software on the receiving system. The programmer of a piece of software can write a function call that calls upon a subroutine. The subroutine could be local to the system or be on a remote system. If the subroutine is on a remote system, the request is carried over a session layer protocol. The result that the remote system provides is then returned to the requesting system over the same session layer protocol. This is how RPC works.
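As a toy illustration of that flow (not any particular enterprise RPC stack), Python's built-in XML-RPC modules let one process call a function exposed by another as if it were local:

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# The "remote" subroutine that the serving system exposes.
def add(a, b):
    return a + b

# Bind to an ephemeral localhost port and serve in the background.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The caller invokes add() through a proxy as if it were local; the
# protocol (XML-RPC over HTTP here) carries the call and the result.
proxy = ServerProxy(f"http://127.0.0.1:{port}/")
result = proxy.add(2, 3)
server.shutdown()
print(result)  # 5
```

Note that nothing in this exchange authenticates the caller, which is exactly the weakness the next paragraph describes.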

One security issue common to RPC (and similar interprocess communication software) is the lack of authentication or the use of weak authentication. Secure RPC can be implemented, which requires authentication to take place before two computers located in different locations can communicate with each other. Authentication can take place using shared secrets, public keys, or Kerberos tickets. Session layer protocols need to provide secure authentication capabilities.

RPC and similar distributed computing calls usually only need to take place within a network; thus, firewalls should be configured so this type of traffic is not allowed into or out of a network.

Some protocols that work at the session layer are SQL, NetBIOS, NFS, and RPC.

The main protocols that work at layer 4 are TCP, UDP, SSL, TLS and SPX.

ICMP and IGMP are Layer 3 protocols.

RARP, PPP, PPTP, L2TP, SLIP, ATM, Ethernet, Token Ring and FDDI are Layer 2 protocols.

ISDN, DSL and SONET are Layer 1 protocols.

Port numbers 0 to 1023 are called well-known ports and can be used only by privileged system or root processes.

Registered ports are 1024 to 49151, which can be registered with ICANN for a particular use.

Dynamic ports are 49152 to 65535 and are available to be used by any application on an “as needed” basis.
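The three port ranges above can be captured in a small helper. This is a sketch; the function name is made up for illustration.

```python
def port_category(port: int) -> str:
    """Classify a TCP/UDP port into the IANA-defined ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    if port <= 1023:
        return "well-known"   # privileged system/root processes only
    if port <= 49151:
        return "registered"   # can be registered for a particular use
    return "dynamic"          # available to any application as needed

print(port_category(80))     # -> well-known
print(port_category(8080))   # -> registered
print(port_category(50000))  # -> dynamic
```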

The SYN proxy is a piece of software that resides between the sender and receiver and only passes TCP traffic on to the receiving system if the TCP handshake process completes successfully.

If an attacker can correctly predict the TCP sequence numbers that two systems will use, then she can create packets containing those numbers and fool the receiving system into thinking that the packets are coming from the authorized sending system. She can then take over the TCP connection between the two systems, which is referred to as TCP session hijacking.

802.1AE (MACSec) defines a security infrastructure to provide data confidentiality, data integrity, and data origin authentication. Where a VPN connection provides protection at the higher networking layers, MACSec provides hop-by-hop protection at layer 2.

The 802.1AR standard specifies unique per-device identifiers (DevID, implemented with PKI certificates) and the management and cryptographic binding of a device (router, switch, access point) to its identifiers.

DHCP packet types are Discover, Offer, Request and Acknowledgment (DORA) in their order.

In environments that require extensive security, wires are encapsulated within pressurized conduits so if someone attempts to access a wire, the pressure of the conduit will change, causing an alarm to sound and a message to be sent to the security staff.

With CSMA/CA (collision avoidance), a system sends out a message indicating to all other systems that it is going to put data on the line before transmitting; with CSMA/CD (collision detection), a system listens to the wire, transmits when the line appears free, and detects and recovers from collisions by backing off and retransmitting.

Token Ring uses a token-passing technology with a star-configured topology. Each computer is connected to a central hub, called a Multistation Access Unit (MAU). Token ring operates either at 4 or 16 Mbps.

FDDI has a data transmission speed of up to 100 Mbps and is usually used as a backbone network. FDDI also provides fault tolerance by offering a second counter-rotating fiber ring. The primary ring has data traveling clockwise and is used for regular data transmission. The second ring transmits data in a counterclockwise fashion and is invoked only if the primary ring goes down.

Copper Distributed Data Interface (CDDI) can work over UTP cabling. Whereas FDDI would be used more as a MAN, CDDI can be used within a LAN.

Devices that connect to FDDI rings fall into one of the following categories:
  • Single-attachment station (SAS) Attaches to only one ring (the primary) through a concentrator
  • Dual-attachment station (DAS) Has two ports and each port provides a connection for both the primary and the secondary rings
  • Single-attached concentrator (SAC) Concentrator that connects an SAS device to the primary ring
  • Dual-attached concentrator (DAC) Concentrator that connects DAS, SAS, and SAC devices to both rings

Loki is actually a client/server program used by hackers to set up back doors on systems and uses ICMP packets to carry control traffic.

The Ping of Death attack is based upon the use of oversized ICMP packets. If a system does not know how to handle ICMP packets larger than the maximum legal IP packet size of 65,535 bytes, it can become unstable and freeze or crash.

In a Smurf attack, the attacker sends an ICMP ECHO REQUEST packet with a spoofed source address (the victim’s address) to the victim’s network broadcast address. This means that each system on the victim’s subnet receives an ICMP ECHO REQUEST packet. Each system then replies to that request with an ICMP ECHO REPLY packet sent to the spoofed address provided in the packets—which is the victim’s address.

A Fraggle attack works on the same principle as Smurf, but Fraggle uses the UDP protocol while Smurf uses the ICMP protocol. Both are denial-of-service attacks.

In a teardrop attack, malformed fragments are created by the attacker; once they are reassembled, they can cause the victim system to become unstable.

Within DNS servers, DNS namespaces are split up administratively into zones. One zone may contain all hostnames for the marketing and accounting departments, and another zone may contain hostnames for the administration, research, and legal departments. The DNS server that holds the files for one of these zones is said to be the authoritative name server for that particular zone. A zone may contain one or more domains, and the DNS server holding those host records is the authoritative name server for those domains.

The primary and secondary DNS servers synchronize their information through a zone transfer. After changes take place to the primary DNS server, those changes must be replicated to the secondary DNS server. It is important to configure the DNS server to allow zone transfers to take place only between the specific servers. Unauthorized zone transfers can take place if the DNS servers are not properly configured to restrict this type of activity.

DNSSEC implements PKI and digital signatures, which allows DNS servers to validate the origin of a message to ensure that it is not spoofed and potentially malicious.

Organizations should implement split DNS, which means a DNS server in the DMZ handles external hostname-to-IP resolution requests, while an internal DNS server handles only internal requests. This helps ensure that the internal DNS has layers of protection and is not exposed by being “Internet facing.” The internal DNS server should only contain resource records for the internal computer systems, and the external DNS server should only contain resource records for the systems the organization wants the outside world to be able to connect to.

Cybersquatters are individuals who register prominent or established names, hoping to sell them later to real-world businesses that need those names to establish their online presence (also called domain grabbing).

IMAP provides all the functionalities of POP, but has more capabilities. If a user is using POP, when he accesses his mail server to see if he has received any new messages, all messages are automatically downloaded to his computer. Once the messages are downloaded from the POP server, they are usually deleted from that server, depending upon the configuration. POP can cause frustration for mobile users because the messages are automatically pushed down to their computer or device and they may not have the necessary space to hold all the messages. This is especially true for mobile devices that can be used to access e-mail servers. This is also inconvenient for people checking their mail on other people’s computers.

POP is commonly used for Internet-based e-mail accounts (Gmail, Yahoo!, etc.), while IMAP is commonly used for corporate e-mail accounts.

E-mail spoofing is a technique used by malicious users to forge an e-mail to make it appear to be from a legitimate source.

If source routing is allowed, the packets contain the necessary information within them to tell the bridge or router where they should go. The packets hold the forwarding information so they can find their way to their destination without needing bridges and routers to dictate their paths.

External devices and border routers should not accept packets with source routing information within their headers, because that information will override what is laid out in the forwarding and routing tables configured on the intermediate devices. Source routing can be used by attackers to get around certain bridge and router filtering rules.

A phreaker is a phone hacker.

Main characteristics of different types of firewalls are:
  • Packet Filters: (Network Layer) Look at source and destination addresses, ports, and services requested. Routers using ACLs to filter traffic are a typical example.
  • Application-level Proxy: (Application Layer) Looks deep into packets and makes granular access control decisions. Requires one proxy per protocol.
  • Circuit-level Proxy: (Session Layer) Looks only at the packet header information. It protects a wider range of protocols and services than an application-level proxy, but does not provide the detailed level of control available with ALPs.
  • Stateful: (Network Layer) Looks at the state and context of packets. Keeps track of each connection using a state table.
  • Kernel Proxy: (Application Layer) Faster because processing is done in the kernel. One network stack is created for each packet.
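The state-table idea behind stateful filtering can be sketched in a few lines. This is a toy model with hypothetical addresses and ports, not a real firewall: outbound connections create entries, and inbound packets are admitted only if they match a tracked connection.

```python
# Toy stateful-firewall state table, keyed by (src, sport, dst, dport).
state_table = set()

def record_outbound(src, sport, dst, dport):
    # An outbound connection attempt creates a state-table entry.
    state_table.add((src, sport, dst, dport))

def inbound_allowed(src, sport, dst, dport):
    # Inbound traffic is allowed only as a reply to a tracked connection,
    # i.e. it matches a reversed tuple in the state table.
    return (dst, dport, src, sport) in state_table

record_outbound("10.0.0.5", 51515, "198.51.100.7", 443)
print(inbound_allowed("198.51.100.7", 443, "10.0.0.5", 51515))  # -> True
print(inbound_allowed("198.51.100.9", 443, "10.0.0.5", 51515))  # -> False
```

Real stateful firewalls also track TCP flags, sequence numbers, and timeouts; the set membership check above only illustrates the core lookup.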

Characteristics of application-level proxy firewalls:
  • Each protocol that is to be monitored must have a unique proxy.
  • Provides more protection than circuit-level proxy firewalls.
  • Require more processing per packet and thus are slower than a circuit-level proxy firewall.

Characteristics of circuit-level proxy firewalls:
  • Do not require a proxy for each and every protocol.
  • Do not provide the deep-inspection capabilities of an application layer proxy.
  • Provide security for a wider range of protocols.
SOCKS is an example of a circuit-level proxy gateway that provides a secure channel.

A system is considered a bastion host if it is a highly exposed device that is most likely to be targeted by attackers.

Firewall rules that should be implemented are as follows:
  • Silent rule :  Drop “noisy” traffic without logging it. This reduces log sizes by not responding to packets that are deemed unimportant.
  • Stealth rule :  Disallows access to firewall software from unauthorized systems.
  • Cleanup rule : Last rule in rule-base that drops and logs any traffic that does not meet preceding rules.
  • Negate rule: Used instead of the broad and permissive “any rules.” Negate rules provide tighter permission rights by specifying what system can be accessed and how.
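These rule types can be illustrated with a toy first-match rule-base evaluator. The addresses, ports, and data structure below are invented for illustration; real firewalls express the same ideas in vendor-specific rule syntax.

```python
FIREWALL_IP = "203.0.113.1"  # hypothetical address of the firewall itself

rules = [
    # Silent rule: drop "noisy" NetBIOS traffic without logging it.
    {"name": "silent",  "match": lambda p: p["dport"] in (137, 138),
     "action": "drop", "log": False},
    # Stealth rule: disallow access to the firewall itself.
    {"name": "stealth", "match": lambda p: p["dst"] == FIREWALL_IP,
     "action": "drop", "log": True},
    # Negate rule: instead of a broad "any" rule, permit only HTTPS
    # to one specific server.
    {"name": "negate",  "match": lambda p: p["dst"] == "203.0.113.10"
                                           and p["dport"] == 443,
     "action": "accept", "log": False},
    # Cleanup rule: last rule, drops and logs anything not matched above.
    {"name": "cleanup", "match": lambda p: True,
     "action": "drop", "log": True},
]

def evaluate(packet):
    # First-match semantics: rule order matters, as in real rule-bases.
    for rule in rules:
        if rule["match"](packet):
            return rule["name"], rule["action"]

print(evaluate({"dst": "203.0.113.10", "dport": 443}))  # -> ('negate', 'accept')
print(evaluate({"dst": "203.0.113.1",  "dport": 22}))   # -> ('stealth', 'drop')
```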

A reverse proxy server sits on the network that fulfills clients’ requests; thus, it handles traffic entering its network on behalf of the servers behind it. The reverse proxy can also carry out load balancing, encryption acceleration, security filtering, and caching.

On a smaller scale, companies may choose to implement tarpits, which are similar to honeypots in that they appear to be easy targets for exploitation.

Vishing is voice phishing, carried out through phone calls, voice mail messages, and similar channels.

The main protocols that make up the IPsec suite and their basic functionality are as follows:
  • Authentication Header (AH) provides data integrity, data origin authentication, and protection from replay attacks.
  • Encapsulating Security Payload (ESP) provides confidentiality, data-origin authentication, and data integrity.
  • Internet Security Association and Key Management Protocol (ISAKMP) provides a framework for security association creation and key exchange.
  • Internet Key Exchange (IKE) provides authenticated keying material for use with ISAKMP.

AH and ESP can be used separately or together in an IPsec VPN configuration.
  • PPTP is used when a PPP connection needs to be extended through an IP-based network.
  • L2TP is used when a PPP connection needs to be extended through a non IP-based network.
  • IPsec is used to protect IP-based traffic and is commonly used in gateway to gateway connections.
  • SSL VPN is used when a specific application layer traffic type needs protection. 
The three core deficiencies with WEP are the use of static encryption keys, the ineffective use of initialization vectors, and the lack of packet integrity assurance.

LEAP (Lightweight Extensible Authentication Protocol) is a Cisco-proprietary protocol released before 802.1X was finalized. LEAP has significant security flaws and should not be used.

EAP-TLS (EAP-Transport Layer Security) uses PKI, requiring both server-side and client side certificates. EAP-TLS establishes a secure TLS tunnel used for authentication. EAP-TLS is very secure due to the use of PKI, but is complex and costly for the same reason. The other major versions of EAP attempt to create the same TLS tunnel without requiring a client-side certificate.

EAP-TTLS (EAP-Tunneled Transport Layer Security) simplifies EAP-TLS by dropping the client-side certificate requirement, allowing other authentication methods (such as password) for client-side authentication. EAP-TTLS is thus easier to deploy than EAP-TLS, but less secure when omitting the client-side certificate.

PEAP (Protected EAP) is similar to (and may be considered a competitor to) EAP-TTLS, including not requiring client-side certificates.

802.11b uses DSSS; 802.11a uses OFDM and works in the 5 GHz frequency band. Working at a higher frequency means a device’s signal cannot cover as wide a range.

802.16 is WiMAX.

Bluejacking and Bluesnarfing are Bluetooth attacks.

War driving is an attack in which one or more people walk or drive around with a wireless device equipped with the necessary software, intending to identify APs and break into them.

The protocol field of the IP packet dictates what protocol the IP packet is using. TCP=6, ICMP=1, UDP=17, IGMP=2
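In an IPv4 header, the protocol field sits at byte offset 9. A quick sketch using a fabricated header (the values below are the IANA assignments listed above; the header itself is fake, padded with zeros):

```python
# IANA-assigned values for the IPv4 header's protocol field.
IP_PROTOCOLS = {1: "ICMP", 2: "IGMP", 6: "TCP", 17: "UDP"}

# A fabricated minimal 20-byte IPv4 header carrying UDP (protocol 17):
# 9 zero bytes, the protocol byte at offset 9, then 10 more zero bytes.
fake_header = bytes(9) + bytes([17]) + bytes(10)

proto_number = fake_header[9]
print(IP_PROTOCOLS[proto_number])  # -> UDP
```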

Dedicated point-to-point protocols are HDLC and PPP.
Packet-switched protocols are X.25, Frame Relay and ATM.
Circuit-switched protocols are ISDN and Leased Line.

HDLC provides higher throughput and supports full-duplex transmission compared to SDLC. IBM mainframe systems used SDLC.

DSL is considered an “always on” technology.

Footprinting is a method used by an attacker to learn information about a victim before carrying out scanning and probing activity.

802.2 is LLC and 802.3 is MAC. LLC communicates with Layer 3.

SNMP community string is a password a manager uses to request data from the agent.

In the PPP protocol, LCP establishes, configures, and maintains the connection, while NCPs are used for network layer protocol configuration and authentication.

Tuesday, March 25, 2014

CISSP - Disaster Recovery and Business Continuity

Business Continuity and Disaster Recovery Overview

The goal of disaster recovery is to minimize the effects of a disaster or disruption. It means taking the necessary steps to ensure that the resources, personnel, and business processes are able to resume operation in a timely manner. This is different from continuity planning, which provides methods and procedures for dealing with longer-term outages and disasters. The goal of a disaster recovery plan is to handle the disaster and its ramifications right after the disaster hits; the disaster recovery plan is usually very information technology (IT)–focused.

A disaster recovery plan (DRP) is carried out when everything is still in emergency mode, and everyone is scrambling to get all critical systems back online. A business continuity plan (BCP) takes a broader approach to the problem. It can include getting critical systems to another environment while repair of the original facilities is under way, getting the right people to the right places during this time, and performing business in a different mode until regular conditions are back in place. It also involves dealing with customers, partners, and shareholders through different channels until everything returns to normal.

In most situations the company is purely focused on getting back up and running, thus focusing on functionality. If security is not integrated and implemented properly, the effects of the physical disaster can be amplified as hackers come in and steal sensitive information.

Business continuity and disaster recovery planning is an organization’s last line of defense. When all other controls have failed, BCP/DRP is the final control that may prevent drastic events such as injury, loss of life, or failure of an organization.

An additional benefit of BCP/DRP is that an organization that forms a business continuity team and conducts a thorough BCP/DRP process is forced to view the organization’s critical processes and assets in a different, often clarifying light. Critical assets must be identified and key business processes understood. Risk analysis conducted during a BCP/DRP plan can lead to immediate mitigating steps.

The business continuity plan is an umbrella plan that includes multiple specific plans, most importantly the disaster recovery plan.

One point that can often be overlooked when focusing on disasters and their associated recovery is to ensure that personnel safety remains the top priority.

Disruptive events and disasters that justify the preparation of a BCP and DRP can be summarized as follows:
  • Human errors and omissions
  • Natural disasters
  • Electrical and power problems
  • Temperature and humidity failures
  • Warfare, terrorism, and sabotage
  • Financially motivated attackers
  • Personnel shortages and unavailability
  • Pandemics and diseases
  • Strikes
  • Communication failures

DRP/BCP Preparation

Steps to prepare a BCP/DRP are:

  • Project Initiation
  • Scope the Project
  • Business Impact Analysis
  • Identify Preventive Controls
  • Recovery Strategy
  • Plan Design and Development
  • Implementation, Training, and Testing
  • BCP/DRP Maintenance

Project Initiation

  1. Develop the continuity planning policy statement. Write a policy that provides the guidance to develop a BCP, and that assigns authority to the necessary roles to carry out the tasks.
  2. Conduct the business impact analysis (BIA). Identify critical functions and systems and allow the organization to prioritize them based on necessity. Identify vulnerabilities and threats, and calculate risks.
  3. Identify preventive controls. Once threats are recognized, identify and implement controls and countermeasures to reduce the organization’s risk level in an economical manner.
  4. Develop recovery strategies. Formulate methods to ensure systems and critical functions can be brought online quickly.
  5. Develop the contingency plan. Write procedures and guidelines for how the organization can still stay functional in a crippled state.
  6. Test the plan and conduct training and exercises. Test the plan to identify deficiencies in the BCP, and conduct training to properly prepare individuals on their expected tasks.
  7. Maintain the plan. Put in place steps to ensure the BCP is a living document that is updated regularly.

The most critical part of establishing and maintaining a current continuity plan is management support. Management must be convinced of the necessity of such a plan. Therefore, a business case must be made to obtain this support. The business case may include current vulnerabilities, regulatory and legal obligations, the current status of recovery plans, and recommendations. Management is mostly concerned with cost/benefit issues, so preliminary numbers need to be gathered and potential losses estimated. A cost/benefit analysis should include shareholder, stakeholder, regulatory, and legislative impacts, as well as those on products, services, and personnel.

BCP/DRP project manager

The BCP/DRP project manager is the key point of contact (POC) for ensuring that a BCP/DRP not only is completed but also is routinely tested. This person needs to have business skills, to be extremely competent, and to be knowledgeable with regard to the organization and its mission, in addition to being a good manager and leader in case there is an event that causes the BCP or DRP to be implemented. In most cases, the project manager is the POC for every person within the organization during a crisis.

BCP/DRP team

The BCP/DRP team is comprised of those personnel who will have responsibilities if or when an emergency occurs. Before identification of the BCP/DRP personnel can take place, the continuity planning project team (CPPT) must be assembled. The CPPT is comprised of stakeholders within an organization and focuses on identifying who would need to play a role if a specific emergency event were to occur. This includes people from the HR section, public relations (PR), IT staff, physical security, line managers, essential personnel for full business effectiveness, and anyone else responsible for essential functions.

The people who develop the BCP should also be the ones who execute it. (If you knew that in a time of crisis you would be expected to carry out some critical tasks, you might pay more attention during the planning and testing phases.)

The BCP policy supplies the framework for and governance of designing and building the BCP effort. The policy helps the organization understand the importance of BCP by outlining BCP’s purpose. It provides an overview of the principles of the organization and those behind BCP, and the context for how the BCP team will proceed.

Scope of the Project

There are a number of questions to be asked and answered. For instance, is the team supposed to develop a BCP for just one facility or for more than one facility? Is the plan supposed to cover just large potential threats (hurricanes, tornadoes, floods) or deal with smaller issues as well (loss of a communications line, power failure, Internet connection failure)? Should the plan address possible terrorist attacks and other manmade threats? What is the threat profile of the company? Then there is the question of resources: what personnel, time allocation, and funds is management willing to commit to the BCP program overall?

Basically, the scope of the project is the answer to these and other questions. Senior executives, not BCP managers and planners, should make these kinds of decisions.

Conduct Business Impact Analysis (BIA)

The primary goal of the BIA is to determine the maximum tolerable downtime (MTD) for each specific IT asset. This will directly impact which disaster recovery solution is chosen.

The BIA is comprised of two processes: identification of critical assets, and comprehensive risk assessment.

Critical asset identification can be done using a table like the one below.

IT Asset      | User Group Affected | Business Process Affected                               | Business Impact
E-Mail System | Office Employees    | Financial group communications with executive committee | Mild impact; the financial group can also use public e-mail

A typical example of a DRP-oriented risk assessment can be seen in the table below.

Risk Assessment Finding                | Threat                                                       | Business Impact                                        | Countermeasure
Servers are hosted in an unlocked room | Access to the server room by unauthorized people             | Could potentially bring several business services down | Install a PIN-code-based electronic lock system (risk reduced)
Client computers lack security patches | Malware can infect computers or a DoS-type attack can happen | Clients cannot reach ERP applications                  | Update the OS (risk eliminated)

Maximum Tolerable Downtime (MTD) is one of the most important terms and should be very well understood. It describes the total time a system can be inoperable before an organization is severely impacted; in other words, it is the maximum time available to complete recovery and reconstitution. Maximum tolerable downtime is comprised of two metrics: the recovery time objective (RTO) and the work recovery time (WRT).

MTD is also known as maximum allowable downtime (MAD), maximum tolerable outage (MTO), and maximum acceptable outage (MAO).

The recovery point objective (RPO) is the amount of data loss or system inaccessibility (measured in time) that an organization can withstand.

The recovery time objective (RTO) describes the maximum time allowed to recover business or IT systems. RTO is also called the systems recovery time.

Work recovery time (WRT) describes the time required to configure a recovered system.

Maximum tolerable downtime (MTD) consists of two elements: the systems recovery time and the work recovery time.

Therefore, MTD = RTO + WRT
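The relationship MTD = RTO + WRT can serve as a simple acceptance test for a proposed recovery strategy. The helper function and the numbers below are hypothetical, purely for illustration:

```python
def meets_mtd(rto_hours: float, wrt_hours: float, mtd_hours: float) -> bool:
    # A recovery strategy is acceptable only if the time to recover the
    # systems (RTO) plus the time to resume work on them (WRT) fits
    # within the maximum tolerable downtime.
    return rto_hours + wrt_hours <= mtd_hours

print(meets_mtd(8, 4, 24))   # -> True   (12 hours of downtime, 24 tolerated)
print(meets_mtd(20, 8, 24))  # -> False  (28 hours exceeds the 24-hour MTD)
```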

Mean time between failures (MTBF) quantifies how long a new or repaired system will run before failing. It is typically generated by a component vendor and is largely applicable to hardware as opposed to applications and software.

The mean time to repair (MTTR) describes how long it will take to recover a specific failed system. It is the best estimate for reconstituting the IT system so that business continuity may occur.
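Although not stated above, MTBF and MTTR are commonly combined into a standard steady-state availability estimate, MTBF / (MTBF + MTTR). A quick sketch with illustrative numbers:

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    # Standard steady-state availability estimate: the fraction of time
    # the system is up, given its average run time between failures and
    # its average repair time.
    return mtbf_hours / (mtbf_hours + mttr_hours)

# 999 hours of running per 1 hour of repair -> "three nines" availability.
print(round(availability(999.0, 1.0), 3))  # -> 0.999
```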

Minimum operating requirements (MOR) describe the minimum environmental and connectivity requirements in order to operate computer equipment.

Identify Preventive Controls

One of the important advantages of BCP/DRP preparation is the early detection of vulnerabilities that can be eliminated by applying simple preventive controls. Applying these controls helps the DRP team focus better on critical areas.

Recovery Strategy

Based on the parameters defined during the BIA phase, such as MTD, RTO, RPO, and MTTR, a suitable recovery strategy can be defined for the organization.

The recovery strategy must consider supply chain management, telecommunications management, and utility management during the decision phase. It must be well understood that, in many disaster recovery efforts, procuring systems and other equipment, building a new system room from scratch, and providing connectivity to DR sites can take longer than usual for several reasons; this can be very risky unless the organization opts for a cold site strategy.

Different types of strategies, as a function of cost and provided availability, can be seen in the scheme below.

Recovery Strategies

Redundant site

A redundant site is an exact production duplicate of a system that has the capability to seamlessly operate all necessary IT operations without loss of services to the end user of the system. A redundant site receives data backups in real time so that in the event of a disaster the users of the system have no loss of data. It is a building configured exactly like the primary site and is the most expensive recovery option because it effectively more than doubles the cost of IT operations.

Hot site

A hot site is a location to which an organization may relocate following a major disruption or disaster. It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers. The hot site will have all necessary hardware and critical applications data mirrored in real time. A hot site will have the capability to allow the organization to resume critical operations within a very short period of time—sometimes in less than an hour.

Warm site

A warm site has some aspects of a hot site but it will have to rely upon backup data in order to reconstitute a system after a disruption. It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers.

Because of the extensive costs involved with maintaining a hot or redundant site, many organizations will elect to use a warm site recovery solution. These organizations will have to be able to withstand an MTD of at least 1 to 3 days in order to consider a warm site solution. The longer the MTD is, the less expensive the recovery solution will be.

Cold site

A cold site is the least expensive recovery solution to implement. It does not include backup copies of data, nor does it contain any immediately available hardware. After a disruptive event, a cold site will take the longest amount of time of all recovery solutions to implement and restore critical IT services for the organization. Organizations using a cold site recovery solution will have to be able to withstand a significantly long MTD—usually measured in weeks, not days.

Reciprocal agreement

Reciprocal agreements are bidirectional agreements between two organizations in which one organization promises another organization that it can move in and share space if it experiences a disaster. It is documented in the form of a contract written to gain support from outside organizations in the event of a disaster. They are also referred to as mutual aid agreements (MAA), and they are structured so that each organization will assist the other in the event of an emergency.

Mobile sites

Mobile sites are “datacenters on wheels,” towable trailers that contain racks of computer equipment, as well as HVAC, fire suppression, and physical security. They are a good fit for disasters such as a datacenter flood, where the datacenter is damaged but the rest of the facility and surrounding property are intact.

Subscription services

Some organizations outsource their BCP/DRP planning and/or implementation by paying another company to perform those services. This effectively transfers part of the risk to the service provider.

Related Plans

Continuity of operations plan (COOP)

The continuity of operations plan (COOP) describes the procedures required to maintain operations during a disaster. This includes transfer of personnel to an alternative disaster recovery site, and operations of that site.

Business recovery plan

The business recovery plan (BRP), also known as the business resumption plan, details the steps required to restore normal business operations after recovering from a disruptive event. This may include switching operations from an alternative site back to a (repaired) primary site. The business recovery plan picks up when the COOP is complete.

Continuity of support plan

The continuity of support plan focuses narrowly on support of specific IT systems and applications. It is also called the IT contingency plan.

Cyber incident response plan

The cyber incident response plan (CIRP) is designed to respond to disruptive cyber events, including network-based attacks, worms, computer viruses, Trojan horses, etc.

Occupant emergency plan

The occupant emergency plan (OEP) provides the “response procedures for occupants of a facility in the event of a situation posing a potential threat to the health and safety of personnel, the environment, or property.”

Crisis management plan

The crisis management plan (CMP) is designed to provide effective coordination among the managers of the organization in the event of an emergency or disruptive event. The CMP details the actions management must take to ensure that life and safety of personnel and property are immediately protected in case of a disaster.

Crisis communications plan

A critical component of the crisis management plan is the crisis communications plan which communicates to staff and the public in the event of a disruptive event. All communication with the public should be channeled via senior management or the public relations team.

Call trees

A key tool leveraged for staff communication by the crisis communications plan is the call tree, which is used to quickly communicate news throughout an organization without overburdening any specific person. The call tree works by assigning each employee a small number of other employees they are responsible for calling in an emergency event. The call tree continues until all affected personnel have been contacted.
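A call tree is essentially a small fan-out structure in which each person is responsible for only a few calls. A toy sketch with invented names:

```python
# Hypothetical call tree: each caller notifies a small set of employees,
# so no single person carries the whole notification burden.
call_tree = {
    "BCP manager": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Bob": ["Erin"],
}

def notify(person, notified=None):
    # Walk the tree: record this person, then have them "call" each of
    # the employees assigned to them, until all leaves are reached.
    notified = notified if notified is not None else []
    notified.append(person)
    for callee in call_tree.get(person, []):
        notify(callee, notified)
    return notified

print(notify("BCP manager"))
# -> ['BCP manager', 'Alice', 'Carol', 'Dave', 'Bob', 'Erin']
```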

Automated call trees

Automated call trees automatically contact all BCP/DRP team members after a disruptive event. Third-party BCP/DRP service providers may provide this service. The automated tree is populated with team members’ primary phone, cellular phone, pager, email, and/or fax.

Executive succession planning

Organizations must ensure that there is always an executive available to make decisions during a disaster. Executive succession planning determines an organization’s line of succession. Executives may become unavailable due to a variety of disasters, ranging from injury and loss of life to strikes, travel restrictions, and medical quarantines.

Backups and Availability

Beyond the backup methods discussed in more detail in the Operations Security domain, a few related concepts deserve mention.

Hard Copy

After evaluating the BIA, some organizations may choose to rely on hard copies: during the disaster recovery period, the organization continues its business operations on paper.

Tape rotation methods

A common tape rotation method is first-in, first-out (FIFO). Assume you are performing full daily backups and have 14 rewritable tapes total. FIFO means that you will use each tape in order and cycle back to the first tape after the 14th is used. This ensures that 14 days of data are archived. The downside of this plan is that you only maintain 14 days of data.

Grandfather–father–son (GFS) addresses this problem. There are 3 sets of tapes: 7 daily tapes (the son), 4 weekly tapes (the father), and 12 monthly tapes (the grandfather). Once per week a son tape graduates to father. Once every 5 weeks, a father tape graduates to grandfather. After running for a year, this method ensures that daily backup tapes are available for the past 7 days, weekly tapes for the past 4 weeks, and monthly tapes for the past 12 months.
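The graduation rule above can be expressed as a simple scheduling function. This is only a sketch under stated assumptions: daily full backups, a 7-day week, and every 5th weekly tape promoted to monthly, matching the 7/4/12 rotation described in the text.

```python
def gfs_tape_set(day: int) -> str:
    """Classify which tape set receives the full backup on a given
    1-based backup day, under the GFS rotation sketched above."""
    if day % 7 != 0:
        return "son"            # daily tapes, reused each week
    week = day // 7
    if week % 5 != 0:
        return "father"         # end-of-week tape, retained 4 weeks
    return "grandfather"        # every 5th weekly graduates to monthly

# Day 1 is a daily, day 7 ends the first week, day 35 ends the 5th week.
print([gfs_tape_set(d) for d in (1, 7, 14, 35)])
```

The point of the exercise is that retention depth grows (days, weeks, months) while the total tape count stays small.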

Remote journaling

A database journal contains a log of all database transactions. Journals may be used to recover from a database failure. Assume that a database checkpoint (snapshot) is saved every hour. If the database loses integrity 20 minutes after a checkpoint, it may be recovered by reverting to the checkpoint and then applying all subsequent transactions described by the database journal.
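The checkpoint-plus-replay recovery described above can be reduced to a few lines. In this toy sketch the "database" is a dict, the checkpoint a snapshot of it, and the journal a list of key/value writes made after the checkpoint; all names and values are invented for illustration.

```python
# Hourly snapshot of the database state (the checkpoint).
checkpoint = {"balance_alice": 100, "balance_bob": 50}

# Transactions journaled after the checkpoint was taken.
journal = [("balance_alice", 80), ("balance_bob", 70)]

def recover(checkpoint: dict, journal: list) -> dict:
    """Revert to the checkpoint, then replay every journaled
    transaction in order to rebuild the current state."""
    db = dict(checkpoint)          # restore the snapshot
    for key, value in journal:     # apply subsequent writes in order
        db[key] = value
    return db

print(recover(checkpoint, journal))
```

With remote journaling, the journal is transmitted offsite, so this replay can be performed even if the primary site is lost.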

Database shadowing

Database shadowing uses two or more identical databases that are updated simultaneously. The shadow databases can exist locally, but it is best practice to host one shadow database offsite. The goal of database shadowing is to greatly reduce the recovery time for a database implementation. Database shadowing allows faster recovery when compared with remote journaling.
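The contrast with journaling is that a shadow needs no replay at all: every write is applied to the primary and the shadow(s) at the same time. A minimal illustration, with an invented class name and in-memory dicts standing in for real databases:

```python
class ShadowedStore:
    """Toy store that applies each write to a primary copy and to
    one or more shadow copies simultaneously."""

    def __init__(self, n_shadows: int = 1):
        self.primary = {}
        self.shadows = [dict() for _ in range(n_shadows)]  # e.g. one offsite

    def write(self, key, value):
        self.primary[key] = value
        for shadow in self.shadows:    # same update, applied everywhere
            shadow[key] = value

    def failover(self) -> dict:
        """Promote a shadow; it is already current, so no replay of a
        journal is required and recovery time is minimal."""
        return self.shadows[0]

store = ShadowedStore()
store.write("order_1042", "shipped")
print(store.failover())
```

This is why the text notes that shadowing recovers faster than remote journaling: the cost is paid on every write rather than at recovery time.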

Software escrow

Vendors who have developed products on behalf of other organizations might well have intellectual property concerns about disclosing the source code of their applications to customers. A common middle ground between these two entities is for the application development company to allow a neutral third party to hold the source code. This approach is known as software escrow. If the development organization goes out of business or otherwise violates the terms of the software escrow agreement, the third party holding the escrow will provide the source code and other information to the purchasing organization.

DRP Testing, Training and Awareness

There are some important concepts that should be known about DRP testing.

DRP review

The DRP review is the most basic form of initial DRP testing, and is focused on simply reading the DRP in its entirety to ensure completeness of coverage. This review is typically performed by the team that developed the plan.


Checklist

Checklist (also known as consistency) testing lists all necessary components required for successful recovery and ensures that they are, or will be, available if a disaster occurs. The checklist test is often performed concurrently with the structured walkthrough or tabletop testing as a solid first testing threshold.

Structured walkthrough/tabletop

Another test that is commonly completed at the same time as the checklist test is that of the structured walkthrough, which is also often referred to as a tabletop exercise. The goal is to allow individuals to thoroughly review the overall approach.

Simulation test/walkthrough drill

A simulation test, also called a walkthrough drill (not to be confused with structured walkthrough), goes beyond talking about the process and actually has teams carry out the recovery process. The team must respond to a simulated disaster as directed by the DRP.

Parallel processing

This type of test is common in environments where transactional data is a key component of the critical business processing. Typically, this test involves recovering critical components at an alternative computing facility and then restoring data from a previous backup. Note that regular production systems are not interrupted. Organizations that are highly dependent upon mainframe and midrange systems will often employ this type of test.

Partial and complete business interruption

Arguably, the most high fidelity of all DRP tests involves business interruption testing; however, this type of test can actually be the cause of a disaster, so extreme caution should be exercised before attempting an actual interruption test. The business interruption style of testing will have the organization actually stop processing normal business at the primary location and instead leverage the alternative computing facility.

DRP/BCP Maintenance

It is recommended to repeat BCP/DRP tests at least once a year. To be able to do so, all the documents mentioned so far must be kept up to date and reviewed by all DRP/BCP team members. To maintain a complete record of changes, the DRP/BCP process must be integrated with the organization’s change management process.

DRP/BCP Mistakes

Common BCP/DRP mistakes include:
  • Lack of management support
  • Lack of business unit involvement
  • Improper (often narrow) scope
  • Inadequate telecommunications management
  • Inadequate supply chain management
  • Lack of testing
  • Lack of training and awareness
  • Failure to keep the BCP/DRP plan up to date

Specific DRP/BCP Frameworks

NIST 800-34

ISO/IEC 27031

BS 25999

The Business Continuity Institute (BCI) 2008 Good Practice Guidelines