Safety Data Resources

Task B3-3: Identify CMF Research Needs—Safety Data Resources

This document identifies safety databases that could be used to help accomplish the following tasks, through other Federal Highway Administration (FHWA) or partner efforts, related to crash modification factor (CMF) development and advancement.

  1. Identify and prioritize current CMF research needs (i.e., those already proposed—a near-term goal).
  2. Identify, prioritize, and coordinate future CMF research that will yield more reliable CMFs and may be more cost effective than current practices (a mid- to long-term goal). The relevant questions for future research needs include:
    1. What resources are available and how can they be used?
    2. What parties can be involved?
    3. What tools are available and do better ones exist or can improvements be made to existing tools?
    4. What are the methodological needs and what efforts are needed or underway to meet those needs?
  3. Support and advance innovation in safety countermeasures to further reduce crash fatalities and severe injuries associated with prioritized safety needs.
  4. Identify the current FHWA efforts and emerging statistical methodologies (e.g., those discussed at the recent DCMF Task B2 Technical Experts Meeting) that may support current needs, identify appropriate stakeholders that could be involved in promoting this effort, and determine priority research needs that have not been identified.

The following databases are relevant to supporting the four tasks listed above:

  • Fatality Analysis Reporting System (FARS).
  • General Estimates System (GES).
  • Crashworthiness Data System (CDS).
  • National Motor Vehicle Crash Causation Survey (NMVCCS).
  • Crash Injury Research and Engineering Network (CIREN).
  • Motor Carrier Management Information System (MCMIS).
  • Federal Transit Administration (FTA) National Transit Database (NTD).
  • National EMS Information System (NEMSIS).
  • Second Strategic Highway Research Program (SHRP2) Naturalistic Driving and Roadway Databases.
  • National Park Service Service-wide Traffic Accident Reporting System (STARS).
  • Highway Safety Information System (HSIS).

Tables 1 – 4 summarize these databases, including the aspects of each database that are critical to Task B3: the sponsoring agency, data coverage, data years, data availability, and database content. The last row of each table identifies the applicability to Tasks A – D, which correspond to items 1 – 4 in the list above. The results of this task will serve as a starting point for future efforts.

Table 1 Summary of National Crash Databases
Databases summarized: FARS, GES, CDS, and NMVCCS.
Who houses and maintains the data?
  • FARS: National Automotive Sampling System (NASS); directed by the National Center for Statistics and Analysis (NCSA), which is a component of Policy and Operations in the National Highway Traffic Safety Administration (NHTSA).
  • GES: NASS; directed by NCSA, a component of Policy and Operations in NHTSA.
  • CDS: NASS; directed by NCSA, a component of Policy and Operations in NHTSA.
  • NMVCCS: NASS; directed by NCSA, a component of Policy and Operations in NHTSA.
What is the spatial coverage of the data?
  • FARS: All qualifying fatal crashes within the 50 States, the District of Columbia, and Puerto Rico.
  • GES: Obtained from 60 geographic sites that reflect the geography, roadway mileage, population, and traffic density of the United States; approximately 400 police jurisdictions included in the sampling.
  • CDS: Obtained from 24 geographic sites that reflect the geography, roadway mileage, population, and traffic density of the United States.
  • NMVCCS: Sample of crashes in 24 primary sampling units (PSUs), centered on large cities/counties/metro areas; includes cities and counties in AL, AZ, CA, CO, FL, IL, IN, MD, MI, NE, NJ, NY, NC, PA, TN, TX, and WA.
What years of data are in the database?
  • FARS: 1975 to 2012.
  • GES: 1988 to 2012.
  • CDS: 1979 to 2012.
  • NMVCCS: January 2005 to December 2007.
What is the general availability of the data?
  • FARS: FTP site: ftp://ftp.nhtsa.dot.gov/fars/
  • GES: FTP site: ftp://ftp.nhtsa.dot.gov/GES/
  • CDS: FTP site: ftp://ftp.nhtsa.dot.gov/NASS/
  • NMVCCS: FTP site: ftp://ftp.nhtsa.dot.gov/NASS/NMVCCS/
How are the data collected? How are the data coded?
  • FARS: Cooperative agreement with an agency in each State to provide information on fatal crashes in the State in a standard format; data are collected, coded, and submitted into the database. The data are coded for crash variables, vehicle variables, and person variables.
  • GES: Data collectors make weekly, biweekly, or monthly visits to selected police agencies and randomly sample about 50,000 police accident reports (PARs) each year; approximately 90 data elements coded; for privacy reasons, no personal information or specific crash location is coded.
  • CDS: Twenty-four research teams at PSUs study between 3,000 and 5,000 crashes a year involving passenger cars, light trucks, vans, and utility vehicles; investigators obtain data from selected police agencies and crash sites, study all available evidence, interview crash victims, and review medical records; more than 600 elements coded; for privacy reasons, no personal information or specific crash location is coded.
  • NMVCCS: Investigators visited crash locations while first responders were still onsite; reconstructed each crash by collecting all available data and interviewing witnesses; identified the critical precrash event, the critical reason for the crash event, and other associated factors; over 500 elements coded.
Does the database include all crashes for the coverage area (i.e., the population) or just a portion of the crashes (i.e., a sample)?
  • FARS: Includes the population of crashes with a fatal outcome; a fatality is defined as the death of an individual within 30 days of a crash due to injuries sustained in the crash.
  • GES: Includes only a portion of crashes, sampled randomly from 60 geographic sites and some 400 police agencies across the United States.
  • CDS: Includes only a portion of crashes, sampled randomly from 24 geographic sites across the United States.
  • NMVCCS: Sample of crashes from each PSU.
How are crash severity levels defined?
  • FARS: KABCO.
  • GES: KABCO.
  • CDS: KABCO and sometimes Abbreviated Injury Scale.
  • NMVCCS: KABCO, plus "died prior to crash" and "unknown if injured."
What is the vehicle type coverage?
  • FARS: All vehicle types.
  • GES: All vehicle types.
  • CDS: Crashes involving at least one light vehicle (under 10,000 lbs).
  • NMVCCS: Crashes involving at least one light vehicle (under 10,000 lbs).
If data are just a sample, how was the sampling done?
  • FARS: NA.
  • GES: (1) Selection of primary sampling units (PSUs); (2) selection of police jurisdictions; (3) selection of crashes.
  • CDS: (1) Selection of primary sampling units (PSUs); (2) selection of police jurisdictions; (3) selection of crashes.
  • NMVCCS: A six-hour sampling time period (between 6 a.m. and midnight) was selected each week, then divided into sampling days, with a tendency to maximize the probability of observing a crash during the selected sampling periods.
If just a sample, what (if any) guidance is given to incorporate the sampling procedure into data analysis?
  • FARS: NA.
  • GES: A national weight has been added to the file for each PAR and is called "WEIGHT." This weight is the product of the inverse of the probabilities of selection at each of the three stages in the sampling process.
  • CDS: Data are weighted to represent all police-reported motor vehicle crashes occurring in the United States during the year involving passenger cars, light trucks, and vans that were towed due to damage.
  • NMVCCS: A comprehensive weighting procedure makes the NMVCCS sample nationally representative; it consists mainly of two phases: the design weight and its appropriate adjustment.
To which Tasks (A – D) is the database applicable?
  • FARS:
    A: Prioritize current CMF research needs based on magnitude of fatalities.
    B: Prioritize future CMF research based on magnitude of fatalities.
    C: Support and advance innovation in safety countermeasures by demonstrating the magnitude of related fatalities.
    D: Determine priority research needs that have not been identified based on magnitude of fatalities and related factors.
  • GES:
    A: Prioritize current CMF research needs based on magnitude and severity of crashes.
    B: Prioritize future CMF research based on magnitude and severity of crashes.
    C: Support and advance innovation in safety countermeasures by demonstrating the magnitude and severity of related crashes.
    D: Determine priority research needs that have not been identified based on magnitude and severity of crashes and related factors.
  • CDS and NMVCCS:
    C: Support and advance innovation in safety countermeasures by identifying the underlying crash-contributing factors related to light vehicle crashes.
    D: Determine priority research needs that have not been identified based on the investigation of crash-contributing factors.
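The GES weighting guidance in Table 1 describes each record's national weight as the product of the inverse selection probabilities at the three sampling stages. The following Python sketch works through that arithmetic. The class, field names, and probabilities are hypothetical and are not taken from the GES files, which already supply the computed weight in the WEIGHT variable.

```python
from dataclasses import dataclass


@dataclass
class SampledReport:
    """One sampled police accident report (hypothetical fields)."""
    p_psu: float           # probability the PSU was selected (stage 1)
    p_jurisdiction: float  # probability the police jurisdiction was selected (stage 2)
    p_crash: float         # probability the crash report was selected (stage 3)
    injury_crash: bool     # illustrative attribute used to estimate a total


def national_weight(report: SampledReport) -> float:
    """Product of the inverse selection probabilities at the three stages."""
    return (1.0 / report.p_psu) * (1.0 / report.p_jurisdiction) * (1.0 / report.p_crash)


def weighted_total(reports, predicate) -> float:
    """Estimate a national count by summing the weights of matching reports."""
    return sum(national_weight(r) for r in reports if predicate(r))


# Made-up selection probabilities, for illustration only.
reports = [
    SampledReport(0.5, 0.25, 0.1, injury_crash=True),
    SampledReport(0.5, 0.5, 0.2, injury_crash=False),
    SampledReport(0.25, 0.5, 0.1, injury_crash=True),
]

estimated_injury_crashes = weighted_total(reports, lambda r: r.injury_crash)
print(estimated_injury_crashes)
```

Each sampled report stands in for as many crashes as its weight, so unweighted counts from a sampled database generally should not be compared directly with counts from a census database such as FARS.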
 
Table 2 Summary of National Crash Databases
Databases summarized: CIREN, MCMIS, and STARS.
Who houses and maintains the data?
  • CIREN: NHTSA.
  • MCMIS: Federal Motor Carrier Safety Administration (FMCSA).
  • STARS: National Park Service (NPS).
What is the spatial coverage of the data?
  • CIREN: Sample of crashes collected by CIREN teams, which consist of three medical centers and three engineering centers in Washington, Wisconsin, Virginia, Maryland, and Alabama.
  • MCMIS: All qualifying crashes involving motor carriers with USDOT numbers within the 50 States, the District of Columbia, and Puerto Rico.
  • STARS: All motor vehicle collisions that occur within National Park Service jurisdiction.
What years of data are in the database?
  • CIREN: 1996 to 2011.
  • MCMIS: 1989 to present.
  • STARS: 1990 to 2005.
What is the general availability of the data?
  • CIREN: Online: http://www.nhtsa.gov/CIREN
  • MCMIS: Available to the general public for a fee through the MCMIS Data Dissemination Program; a formal request is needed.
  • STARS: No direct access online; a formal request is needed.
How are the data collected? How are the data coded?
  • CIREN: Each Center collects detailed crash and medical data on about 50 crashes per year. Personal and location identifiers and highly sensitive medical information have been removed from the public files to protect patient confidentiality; 650 National Automotive Sampling System (NASS) Crashworthiness Data System (CDS) data elements and 250 medical and injury data elements coded.
  • MCMIS: Quarterly update from field offices through SAFETYNET, CAPRI, and other sources. The data are coded for crash variables, census variables, and inspection variables. Inspections are conducted at the roadside by State personnel under the Motor Carrier Safety Assistance Program (MCSAP).
  • STARS: Obtained from the Motor Vehicle Accident Report. The data are coded for crash variables.
Does the database include all crashes for the coverage area (i.e., the population) or just a portion of the crashes (i.e., a sample)?
  • CIREN: Includes only crashes with serious injury.
  • MCMIS: Includes only reported crashes involving commercial motor carriers (truck and bus) and hazardous material shippers.
  • STARS: All reported crashes.
How are crash severity levels defined?
  • CIREN: ISS/MAIS Scale.
  • MCMIS: National Governors’ Association crash thresholds. Injury crashes: person injured is immediately taken to a medical facility. Tow-away crashes: at least one vehicle is towed from the scene as a result of disabling damage suffered in the crash.
  • STARS: Fatal, Injury, PDO.
What is the vehicle type coverage?
  • CIREN: All vehicle types.
  • MCMIS: Trucks, buses, passenger cars, and light trucks with United States Department of Transportation numbers or HAZMAT placard.
  • STARS: All vehicle types.
If data are just a sample, how was the sampling done?
  • CIREN: Admission to a participating CIREN Center. Severely injured and transported to a Level 1 trauma center. Injury required: (1) at least one AIS3+ injury, (2) AIS2 injury in two different AIS body regions, (3) significant particular injury to a lower extremity (AIS2). Vehicle model no older than 6 years. Restraint: (1) frontal crash: air bag and/or belt required, (2) side impact: unbelted is acceptable, (3) rollover: ejected occupants are excluded.
  • MCMIS: NA.
  • STARS: NA.
If just a sample, what (if any) guidance is given to incorporate the sampling procedure into data analysis?
  • CIREN: None.
  • MCMIS: NA.
  • STARS: NA.
To which Tasks (A – D) is the database applicable?
  • CIREN:
    General: Conduct research related to vehicles, occupants, and nonmotorized road users involved in a crash (e.g., identify motor vehicle design features that offer maximum occupant protection).
    C: Support and advance innovation in safety countermeasures to further reduce crash fatalities and severe injuries associated with prioritized safety needs.
    D: Determine priority research needs that have not been identified.
  • MCMIS:
    General: Support and evaluate motor carrier safety programs and regulations.
    C: Support and advance innovation in motor carrier-related safety countermeasures to further reduce crash fatalities and severe injuries associated with prioritized safety needs.
    D: Determine priority research needs related to motor carriers that have not been identified.
  • STARS:
    General: Support and evaluate NPS safety programs and regulations.
    C: Support and advance innovation in safety countermeasures to further reduce crash fatalities and severe injuries associated with prioritized safety needs.
    D: Determine priority research needs that have not been identified.

Note: the NPS STARS database may have limited potential for the DCMF project and future efforts to advance CMF development.
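
The CIREN sampling criteria in Table 2 read like a screening rule, which the following Python sketch expresses as a predicate. It assumes that the three injury conditions are alternatives (any one suffices) and that the trauma center, vehicle age, injury, and restraint requirements must all hold; the table does not state this explicitly. All class, field, and function names are hypothetical, and the sketch is not CIREN's actual enrollment procedure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Injury:
    body_region: str  # AIS body region label, e.g. "thorax" (hypothetical labels)
    ais: int          # AIS severity score
    significant_lower_extremity: bool = False  # hypothetical flag for condition (3)


@dataclass
class Occupant:
    injuries: List[Injury]
    at_level1_trauma_center: bool
    vehicle_age_years: int
    crash_type: str  # "frontal", "side", or "rollover"
    belted: bool
    air_bag: bool
    ejected: bool


def meets_injury_criterion(injuries: List[Injury]) -> bool:
    """Assumed reading: any one of the three listed injury conditions suffices."""
    has_ais3_plus = any(i.ais >= 3 for i in injuries)
    regions_with_ais2 = {i.body_region for i in injuries if i.ais >= 2}
    lower_extremity = any(i.significant_lower_extremity and i.ais >= 2 for i in injuries)
    return has_ais3_plus or len(regions_with_ais2) >= 2 or lower_extremity


def meets_restraint_criterion(o: Occupant) -> bool:
    if o.crash_type == "frontal":
        return o.belted or o.air_bag   # air bag and/or belt required
    if o.crash_type == "side":
        return True                    # unbelted is acceptable
    if o.crash_type == "rollover":
        return not o.ejected           # ejected occupants are excluded
    # The table lists no rule for other crash types, so the sketch refuses to guess.
    raise ValueError(f"no restraint rule listed for crash type {o.crash_type!r}")


def is_candidate(o: Occupant) -> bool:
    return (
        o.at_level1_trauma_center
        and o.vehicle_age_years <= 6
        and meets_injury_criterion(o.injuries)
        and meets_restraint_criterion(o)
    )
```

Because cases enter the database through these filters rather than through probability sampling, the table reports no weighting guidance for CIREN.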
Table 3 Summary of Other National Databases
Databases summarized: NTD, NEMSIS, and SHRP2.
Who houses and maintains the data?
  • NTD: Federal Transit Administration (FTA).
  • NEMSIS: NHTSA Office of Emergency Medical Services.
  • SHRP2: Transportation Research Board (TRB).
What is the spatial coverage of the data?
  • NTD: National transit-related reportable incidents.
  • NEMSIS: National repository for EMS data. As of 2012, 42 States and territories contribute to the dataset.
  • SHRP2: The naturalistic driving study (NDS) data and roadway information database (RID) are based on data gathered in six States (Florida, Indiana, New York, North Carolina, Pennsylvania, and Washington).
What years of data are in the database?
  • NTD: 2002 to 2013.
  • NEMSIS: 2008 to 2012.
  • SHRP2: 2010 to 2013.
What is the general availability of the data?
  • NTD: Online: http://www.ntdprogram.gov/ntdprogram/data.htm
  • NEMSIS: Online request: http://www.nemsis.org/reportingTools/requestNEMSISData.html
  • SHRP2: To be determined.
How are the data collected? How are the data coded?
  • NTD: The system derives data from transit providers, States, or Metropolitan Planning Organizations (MPOs) that are recipients and beneficiaries of grants. There are 55 data fields that are collected from six different forms for safety and security.
  • NEMSIS: The NEMSIS project was developed to help States collect more standardized elements and eventually submit the data to a national emergency medical services (EMS) database.
  • SHRP2: The Naturalistic Driving Study (NDS) data were collected by instrumenting vehicles to record vehicle location, forward radar, vehicle control positions, and video of the forward roadway and of the driver’s face and hands. Crash investigations were conducted after certain crashes to gather more detailed data. The RID contains new roadway data gathered by automated data collection vehicles and existing data provided by agencies (i.e., State DOTs, MPOs, and counties). The roadway data include roadway inventory information, crash histories, traffic, weather, roadway improvements, work zones, safety laws, and enforcement campaigns.
Does the database include all crashes for the coverage area (i.e., the population) or just a portion of the crashes (i.e., a sample)?
  • NTD: The database includes transit-related reportable incidents. Not all incidents are considered reportable: an incident that is not related to and does not affect revenue operations is considered nonreportable.
  • NEMSIS: Events submitted by States do not necessarily represent all EMS events occurring within the State.
  • SHRP2: The naturalistic driving study (NDS) database includes detailed data on more than 5.8 million trips, 33 million travel miles, and 1.4 million driving hours from more than 3,100 participants of various ages across the country. The database represents continuous data from all trips taken by volunteer participants over one to two years. The RID contains approximately 12,500 centerline miles of newly collected roadway data, and the existing agency data cover more than 200,000 centerline miles.
How are crash severity levels defined?
  • NTD: Incidents, injuries, fatalities.
  • NEMSIS: Possible injury (yes/no).
  • SHRP2: Unknown.
What is the vehicle type coverage?
  • NTD: Transit vehicles, including the following modes: Automated Guideway, Commuter Bus, Cable Car, Demand Response, Demand Response-Taxi, Ferryboat, Inclined Plane, Heavy Rail, Jitney, Light Rail, Motor Bus, Monorail/Guideway, Monorail, Público, Bus Rapid Transit, Streetcar Rail, Trolleybus, Aerial Tramway, Vanpool, and Hybrid Rail.
  • NEMSIS: All vehicle types.
  • SHRP2: Passenger vehicles.
If data are just a sample, how was the sampling done?
  • NTD: NA.
  • NEMSIS: States vary in the criteria used to determine the types of EMS events submitted to the NEMSIS dataset.
  • SHRP2: Six locations were selected in the United States to represent geographic diversity and to provide a range of driver, vehicle, and roadway conditions.
If just a sample, what (if any) guidance is given to incorporate the sampling procedure into data analysis?
  • NTD: NA.
  • NEMSIS: No.
  • SHRP2: No.
To which Tasks (A – D) is the database applicable?
  • NTD:
    General: United States’ primary source of transit system information and statistics. Investigate transit-related crashes, including the injuries and fatalities by type and mode.
    C: Support and advance innovation in transit-related safety countermeasures to further reduce fatalities and severe injuries associated with prioritized safety needs.
    D: Determine priority research needs related to transit that have not been identified.
  • NEMSIS:
    General: Evaluate patient and EMS system outcomes.
  • SHRP2:
    General (Note: the following list provides examples of potential uses of SHRP2 data):
      • Understand the contributing and causal factors in crashes.
      • Understand how the driver interacts with and adapts to the vehicle, traffic, roadway characteristics, traffic control devices, and the environment.
      • Identify the relationship between crashes, conflicts, and crash surrogates.
      • Formulate exposure-based risk measures using surrogate measures.
      • Investigate the potential for new countermeasures related to the design of the roadway and vehicles as well as public policy and enforcement.
      • Enhance driver training programs to demonstrate appropriate and inappropriate driver behavior.
      • The RID provides a model for developing linked datasets for asset management purposes.
Table 4 Summary of Seven HSIS Databases
States summarized: California, Illinois, Maine, Minnesota, North Carolina, Ohio, and Washington.
Who houses and maintains the data?
  • All seven States: University of North Carolina Highway Safety Research Center (HSRC) under contract with the Federal Highway Administration (FHWA).
What is the spatial coverage of the data?
  • All seven States: Statewide.
What years of data are in the database?
  • California: 1991 to present (data typically lag by 1–2 years).
  • Illinois: 1985 to present (data typically lag by 1–2 years).
  • Maine: 1985 to present (data typically lag by 1–2 years).
  • Minnesota: 1985 to present (data typically lag by 1–2 years).
  • North Carolina: 1991 to present (data typically lag by 1–2 years).
  • Ohio: 1997 to present (data typically lag by 1–2 years).
  • Washington: 1993 to present (data typically lag by 1–2 years); 1997 and 1998 crash data are not included.
What is the general availability of the data?
  • All seven States: Data can be provided via different media (CD-ROM, FTP, email, etc.). The data can be requested by filling out an HSIS data request form online at the HSIS Web site.
How are the data collected? How are the data coded?
  • California: Annually derived from the California TASAS (Traffic Accident Surveillance and Analysis System). The data are coded for crash variables, roadway variables, intersection variables, interchange variables, and traffic variables.
  • Illinois: Annually derived from the Illinois safety information system, which includes a number of data edits and quality checks. The data are coded for crash variables, roadway variables, interchange variables, curve/grade variables, and traffic variables.
  • Maine: Annually derived from Maine TINIS (Transportation Integrated Network Information System). The data are coded for crash variables, roadway variables, intersection variables, interchange variables, and traffic variables.
  • Minnesota: Annually derived from the Minnesota data system. The data are coded for crash variables, roadway variables, intersection variables, interchange variables, and traffic variables.
  • North Carolina: Annually derived from an Oracle database on the "NCDMV" system. Before 2000, it was derived from the "MERGE" system. The data are coded for crash variables, roadway variables, and traffic variables.
  • Ohio: Annually derived from the Ohio data system. The data are coded for crash variables, roadway variables, curve/grade variables, and traffic variables.
  • Washington: Annually derived from the Washington TRIPS system. The data are coded for crash variables, roadway variables, interchange variables, curve/grade variables, and traffic variables.
Does the database include all crashes for the coverage area or just a portion of the crashes (i.e., a sample)?
  • All seven States: All reported crashes, primarily on the State-maintained system. This varies slightly by State.
How are crash severity levels defined?
  • California: KABCO/five-point scale, plus error/other codes.
  • Illinois: KABCO, plus: not coded, error codes.
  • Maine: KABCO, plus: unknown, error/other codes.
  • Minnesota: KABCO, plus: not applicable, unknown if injured.
  • North Carolina: KABCO, plus: unknown.
  • Ohio: KABCO.
  • Washington: Nine-point scale.
What is the vehicle type coverage?
  • California: All vehicle types, distinguished by vehicle make, type, and model year.
  • Illinois, Maine, Minnesota, North Carolina, Ohio, and Washington: All vehicle types.
If data are just a sample, how was the sampling done?
  • All seven States: NA.
If just a sample, what (if any) guidance is given to incorporate the sampling procedure into data analysis?
  • All seven States: NA.
To which Tasks (A – D) is the database applicable?
General: The HSIS database has numerous general applications, as do many of the databases listed in this document.
A: Prioritize current CMF research needs based on the magnitude and severity of crashes at specific locations (e.g., curves, intersections, segments, etc.).
B: Prioritize future CMF research needs based on the magnitude and severity of crashes at specific locations (e.g., curves, intersections, segments, etc.).
C: Support and advance innovation in safety countermeasures to further reduce crash fatalities and severe injuries associated with prioritized safety needs.
D: Determine priority research needs that have not been identified based on the investigation of crashes and crash severity at specific locations (e.g., curves, intersections, segments, etc.).
Updated: Monday, December 2, 2019