Super Bowl Sunday 1999 was anything but super in New York City, and not only because the Broncos had knocked the Jets out of championship contention just 14 days earlier. For about an hour on Jan. 31, callers to 911 looking for help heard a busy signal instead of an emergency operator. One person died of a heart attack, and inquiries are being made into what effect, if any, the outage had in this case.
In the wake of more recent incidents, concerns about the system continue to appear in the press. Both the public and public safety agencies have relied on 911 for more than 30 years, and that reliance is well founded. Over one quarter of a million 911 calls are placed in the United States every day, and the overwhelming majority of them connect correctly. But what happens when they don’t? How frequently does 911 fail, and for what reasons? Can anything be done to make the system more reliable? How can emergencies be handled when 911 is out of service? In other words, what happens when the emergency number goes down?
what the caller needs
To answer these questions, it must first be noted that 911 is, in reality, just a telephone number. Although special care is given to improve its serviceability, 911 depends on many of the same resources as a standard telephone number. In order to dial 911 from a conventional telephone, a caller must have dial tone and a working telephone device. This means there are no interruptions in the cable that connects the caller with the telephone company central office, that the central office is functioning properly and that the caller has plugged an operating telephone into his or her end of the line.
The reverse is true for 911 to be able to answer – the central office must properly route the call, the circuits between the public safety answering point (PSAP) and the central office must be intact, and the PSAP equipment must be working. While this may sound complicated, it is only the beginning.
A caller who is located in another community or a different part of town from the 911 facility is likely to be served by a different central office, and perhaps even by a different telephone company. This means that in order for the call for help to be completed, both central offices must be operational, and the connection between the two facilities must be functioning properly.
Enhanced 911, which provides automatic number identification (ANI) and automatic location identification (ALI), adds further dependencies. Because the data used to populate the address and telephone number displays is usually stored in a remote database, every call in an enhanced 911 system generates a request to retrieve the appropriate information. The telephone company computers that maintain ANI and ALI serve large areas, and may not even be located in the same state as the caller. If the link to this database, the master street address guide (MSAG), fails, the call will still be connected, but the telecommunicator will not know its origin.
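The chain of dependencies described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any real telephone network: the class name `CallPath`, its field names, and the function `call_outcome` are hypothetical. It shows the point made in the text, that every voice link must work for the call to be answered, while loss of the ANI/ALI data link degrades the call rather than blocking it.

```python
from dataclasses import dataclass


@dataclass
class CallPath:
    """Hypothetical model of the links a 911 call depends on."""
    dial_tone: bool = True              # caller's line and telephone
    caller_central_office: bool = True  # office serving the caller
    interoffice_link: bool = True       # connection between central offices
    psap_central_office: bool = True    # office serving the PSAP
    psap_trunks: bool = True            # circuits from central office to PSAP
    psap_equipment: bool = True         # answering equipment at the PSAP
    ali_database_link: bool = True      # link to the remote ANI/ALI data


def call_outcome(path: CallPath) -> str:
    """Return how a call would end up, given which links are working."""
    voice_links = [
        path.dial_tone,
        path.caller_central_office,
        path.interoffice_link,
        path.psap_central_office,
        path.psap_trunks,
        path.psap_equipment,
    ]
    # A single broken voice link is enough to stop the call.
    if not all(voice_links):
        return "failed"
    # The data link is separate: voice still arrives, location does not.
    if not path.ali_database_link:
        return "connected without location"
    return "connected with location"
```

For example, `call_outcome(CallPath(ali_database_link=False))` returns `"connected without location"`, while a break in any one voice link returns `"failed"`.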
when everything works
When all these things happen flawlessly, and most times they do, both voice and data are delivered to the PSAP. But in order to be completed, the call still must be routed through the premises equipment to be answered. This equipment can consist of a variety of devices such as an in-house telephone switch, ANI/ALI controller, automatic call distributor, or systems that integrate telephones and computers.
Despite the complexity inside the dispatch center, many 911 outages result from failures in the field that affect large segments of the public network. Perhaps the most notable of these incidents occurred in Hinsdale, IL, in May 1988. A fire in the central office of this suburban Chicago community cut off all telephone service to 500,000 customers. Recovery took the better part of a month.
While catastrophes such as this may be rare, there have been other well-documented problems. A November 1993 failure in a New York City central office affected 100,000 customers for about two hours. Unfortunately, one of these customers was the Emergency Medical Services answering point. Telephone service was lost, including tie lines to the main 911 center and the connection between dispatch consoles and the radio system. In 1997, a central office outage suspended service to 14,000 people in Massachusetts. In 1998, a Longmont, CO, man died during a change in area code that caused a 911 failure. Apparently the database that routes emergency calls was not immediately updated. Colorado saw yet another fatality that fall when contractors twice severed the connection between Douglas and Elbert counties. Although an infant died during one of the outages, an autopsy found that a quicker response would not have helped.
All told, in 1998, Colorado experienced 54 disruptions of 911 service and had seen 30 more by July 1999. The state Public Utilities Commission subsequently released preliminary findings that voiced concerns over aging equipment, single points of failure, and insufficient backup power. Also in 1999, Topeka, KS, lost service to 44,000 lines, including 911. Austin, TX, and Prescott, AZ, are among dozens of towns whose emergency number fell victim to the backhoe as telephone cables were inadvertently excavated. This lends credence to a study of more than 70 outages involving 911 systems, which concluded that unreported or careless digging was the most common cause of disruption.
It should also be noted that not all cable cuts are unintentional. Dallas and several surrounding communities lost 911 when someone, as an act of vandalism or attempted theft, sawed through a major telephone company trunk. Los Angeles, Salt Lake City and Washington, D.C. can be added to a number of major metropolitan areas that have lived through 911 failures. In the case of our nation’s capital, an electrical fire in the dispatch center was the culprit. Suffolk County, NY, experienced three outages in 1997 due to problems with internal equipment.
Calgary, Alberta, joins the list, but for a different reason. In 1996, a hailstorm knocked out 911 service. In fairness, however, it should also be noted that aside from localized outages, 911 held up well during the San Francisco earthquake of 1989. Even when the lights flickered in the communications center, the telephones continued to ring and be answered.
monitoring outages
The Federal Communications Commission (FCC) has been actively addressing the issue of major telephone outages and 911 failures. In 1992, the FCC adopted regulations that required the reporting of incidents that affected 50,000 customers for a half-hour or more. The following year saw the issuance of Network Reliability: A Report to the Nation, which contained suggestions on how additional safeguards could be established. By 1994, other requirements had been added, such as the reporting of fires which affected 1,000 or more lines and any PSAP problems that involved more than 25% of their trunks. The threshold for major events had already been lowered from 50,000 to 30,000 customers.
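As a rough illustration, the reporting thresholds cited above can be written as a simple check. This is a sketch built only from the figures in this article, not a statement of the FCC rules themselves. The function and parameter names are hypothetical, and it assumes that the half-hour duration still applied after the customer threshold was lowered to 30,000, and that the customer threshold means "at least" that many.

```python
def is_reportable(
    customers_affected: int = 0,
    duration_minutes: float = 0,
    fire_lines_affected: int = 0,
    psap_trunks_down: int = 0,
    psap_trunks_total: int = 0,
) -> bool:
    """Check an incident against the thresholds cited in the article."""
    # Major outage: 30,000 customers for a half-hour or more
    # (assumption: the 30-minute duration was unchanged).
    major_outage = customers_affected >= 30_000 and duration_minutes >= 30

    # Fire affecting 1,000 or more lines.
    fire = fire_lines_affected >= 1_000

    # PSAP problem involving more than 25% of its trunks.
    psap_problem = (
        psap_trunks_total > 0
        and psap_trunks_down / psap_trunks_total > 0.25
    )

    return major_outage or fire or psap_problem
```

Under these assumptions, a PSAP that loses three of its eight trunks would qualify, while one that loses two of eight (exactly 25%) would not.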
In 1996, a focus group of the Network Reliability Council, composed of industry and government leaders, issued findings and recommendations for improvement. In addition to the safeguards built into the original design, many of these improvements are already in place. Calls to 911 are routed to dedicated lines segregated from the public network. These special trunks that carry 911 calls are typically “red flagged” by the telephone provider, alerting service personnel to the criticality of their mission.
Route diversity is another common tactic, whereby a cable path is maintained in more than one direction. Similar to laying a second supply line, route diversity allows for a continuance of communication should the primary cable fail. This diversity is applied not only to local 911 networks, but to the provision of enhanced 911 data, as well.
Modern telephone switches are computerized and have hot standby backup processors that activate immediately upon loss of the primary. Data regarding alternate PSAPs may be stored for use in case the primary answering point fails, and rollovers established for the redirect of calls if a center becomes overloaded. ANI/ALI computers may actually have their backup systems located in segregated facilities in adjoining states. To minimize disruptions caused by digging, “one call” numbers have been instituted where contractors can easily determine the location of all buried utilities.
Central offices are commonly linked by looped fiber optic cable to prevent isolation, and remote central offices can be programmed to forward 911 calls to local numbers in case this link is broken. Telephone companies operate network operations centers (NOCs), where outages can be monitored and managed. Steps have also been taken within the networks to minimize logjams caused by high volume telephone events.
While significant strides have been made, nothing is perfect. In Florida, an agency diverted 911 calls to an alternate center when its primary facility was evacuated. A telephone company technician not familiar with the situation cleared the “busy” on the lines and routed them back. And, in a true case of irony, a “one call” center in the state of Washington was silenced because contractors installing sewers cut its telephone service on two separate occasions.
know the system
Although it is the responsibility of the local telephone company to provide service, public safety administrators must also play a role in the prevention of, and recovery from, 911 failures. The first step is to have a clear understanding of the way your local system operates. Basic design and features may be specified by state standards, and each state handles 911 regulations individually. Determine methods of reporting trouble, and know what information is required.
The days of one telephone company, one notification are long gone. Service may be provided by several local exchange carriers (LECs) and the PSAP equipment may be owned by the local government itself. Identify responsibilities and develop trouble-shooting plans for on-duty personnel so that a quick diagnosis of what’s broken and who will fix it can be made.
Establish a backup PSAP where calls can be routed in case your primary fails. This transfer is usually accomplished by one of two methods: a software switch in the telephone company central office or a hardware switch in the answering point itself.
In the first instance, a computerized record of emergency redirects is maintained by the local telephone company. This program is activated upon proper notification by the PSAP. Because human intervention is required, the transfer can take a few minutes to accomplish.
A more immediate method is through the use of a hardware switch. Think of this as an electronic gated wye, which can divert the flow of calls from one facility to another. Although more instantaneous than the central office solution, it is of no use when a failure occurs between the central office and the public safety facility.
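The trade-off between the two transfer methods can be sketched as a small decision rule. This is a simplified illustration under stated assumptions, not a procedure from any telephone company: the names `FailurePoint`, `TransferMethod` and `choose_transfer` are hypothetical, and only two failure locations are modeled. It encodes the one constraint given above, that a hardware switch at the answering point cannot help when the break lies between the central office and the PSAP.

```python
from enum import Enum


class FailurePoint(Enum):
    """Where the trouble is (hypothetical, simplified categories)."""
    PSAP_FACILITY = "psap_facility"      # equipment failure or evacuation
    CO_TO_PSAP_LINK = "co_to_psap_link"  # circuits between office and PSAP


class TransferMethod(Enum):
    HARDWARE_SWITCH = "hardware_switch"            # at the answering point
    CENTRAL_OFFICE_REDIRECT = "central_office_redirect"  # software, at the CO


def choose_transfer(failure: FailurePoint, has_hardware_switch: bool) -> TransferMethod:
    """Pick the faster method when it can work, otherwise fall back."""
    # The hardware switch is faster, but calls must first reach the PSAP,
    # so it is useless if the link from the central office is down.
    if has_hardware_switch and failure is not FailurePoint.CO_TO_PSAP_LINK:
        return TransferMethod.HARDWARE_SWITCH
    # The central office redirect requires notification and human
    # intervention, so it can take a few minutes to take effect.
    return TransferMethod.CENTRAL_OFFICE_REDIRECT
```

A trouble-shooting plan built on this idea would have on-duty personnel identify where the failure lies first, since that determines which transfer method is available.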
Just as transfer methods differ, so too do the types of backup centers. Some opt to switch calls to nearby operating PSAPs, while others choose to maintain secondary emergency locations. Here too there is some diversity, with setups ranging from a complete mirror image of the original to a few telephone jacks located in a spare room in city hall.
Regardless of your situation, it is important to regularly test both your backup equipment and your personnel. Conduct both theoretical and practical exercises, and involve the entire communications staff. Time spent in familiarization will pay multiple dividends during an actual emergency. Part of your plan must also reflect the readiness of your backup facility. A fully redundant site is like a completely equipped reserve apparatus; once the crew is transferred it can go into service. However, in most cases, some equipment must also be transferred. Develop a checklist of needed items and a ready means of transport.
A review should also be made of your primary PSAP with regard to protection against failure. All telephone devices that rely on electrical power must be connected to emergency circuits. Provide manual transfer switches on generators as an extra measure of reliability. Redundant processors should be used on all computerized devices, and incoming lines should be distributed across different circuit cards.
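The advice to spread incoming lines across different circuit cards can be illustrated with a short sketch. The function names and the round-robin scheme are assumptions made for illustration; actual assignment depends on the equipment in use. The point is simply that when lines are distributed evenly, the loss of one card removes only a share of the incoming lines rather than all of them.

```python
def distribute_trunks(trunks: list[str], cards: list[str]) -> dict[str, list[str]]:
    """Assign trunks to circuit cards in round-robin order."""
    if not cards:
        raise ValueError("at least one circuit card is required")
    assignment: dict[str, list[str]] = {card: [] for card in cards}
    for index, trunk in enumerate(trunks):
        # Cycle through the cards so no single card carries every line.
        assignment[cards[index % len(cards)]].append(trunk)
    return assignment


def trunks_surviving(assignment: dict[str, list[str]], failed_card: str) -> list[str]:
    """List the trunks still in service after one card fails."""
    return [
        trunk
        for card, card_trunks in assignment.items()
        if card != failed_card
        for trunk in card_trunks
    ]
```

With six trunks spread over three cards, for instance, the failure of any one card would leave four trunks in service.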
Of special importance is protection from lightning. Lightning is listed as the primary cause for PSAP equipment damage and failure. Proper grounding and overcurrent protection of both electrical and telephone wires are required.
Another area of concern is Y2K compliance. With the coming of the new millennium, it is imperative to verify that your equipment is ready to transition into the next century. When examining your readiness to face a 911 or telephone outage, remember that standard operating procedures may not be applicable.
A widespread failure will disable not only 911 but also all conventional telephone circuits. This includes alarms as well as computer and radio tie lines. Any remote radio sites that rely upon telephone control, as well as your hotline to the local gas and electric company, will be useless. Notification lists that depend on calling personnel at home or dialing a pager number will be similarly affected. And while the media will be needed to disseminate emergency information to the public, conventional fax machines will not be working. For this reason, a supply of wireless telephones and a cellular fax machine are valuable additions to any dispatch center. Keep in mind, however, that wireless communication is limited by its connection to the standard network: during a widespread outage most, if not all, of your local telephones will not be working, so both parties must have wireless devices for a call to go through. This widespread unavailability of telephones within the community must also be addressed. Until the public network is restored, additional patrols will be required, and citizens should be advised to report emergencies in person to manned fire stations, police precincts, and other government buildings having contact with the dispatch center.
While telephone failures that affect 911 are not everyday occurrences, they do represent a serious potential threat to community safety. Like mass-casualty incidents, hazardous materials spills and large industrial fires, they require a level of preparedness that may be tested only on rare occasions. But just as we must be prepared to handle the mass-casualty incident, respond to the hazmat spill and combat the large industrial fire, we must be ready to control what happens when the emergency number goes down.
Barry Furey, a Firehouse® correspondent, is executive director of the Knox County, TN, Emergency Communications District. He is an ex-chief of the Valley Cottage, NY, Fire Department, ex-deputy chief of the Harvest, AL, Volunteer Fire Department and a former training officer for the Savoy, IL, Fire Department.