Suggested Citation: "10 Data Management." National Academies of Sciences, Engineering, and Medicine. 2025. Leveraging Existing Traffic Signal Assets to Obtain Quality Traffic Counts: A Guide. Washington, DC: The National Academies Press. doi: 10.17226/29214.

CHAPTER 10. DATA MANAGEMENT

Signal detection data serves as the foundation for generating accurate volume count information. These data can originate directly from sensors, such as video recordings or radar logs, or from devices and software that compile sensor data, such as signal controllers or outputs from advanced traffic signal management software. Regardless of the source, it is essential to ensure reliable data transmission to a storage location, maintain data accuracy, and process the information effectively for end-user applications.

This chapter explores the data formats and collection methods necessary for traffic monitoring. It also addresses storage requirements, describes procurement considerations, and provides best practices for logging and maintaining volume data to support informed decision-making and system optimization.

STORAGE FORMAT

The storage format for detection data affects several aspects of volume data management. Data can typically be stored directly from the sensor, which gives users flexibility but also poses challenges related to storage capacity and efficient data use. Advanced technologies like video and LiDAR generate large data volumes during collection periods, leading to substantial storage requirements, particularly when collecting data to calculate the annual average daily traffic (AADT) of a facility. To address these challenges, data can be condensed and stored in formats that capture only detector channel activations or summarize traffic counts for the facility.

This section explores three common data storage formats, as follows:

  • Raw data: This format involves storing information directly as generated by the sensor, such as text files, video files, or proprietary formats specific to the sensor. While this method preserves the most detail, it requires significant storage space and specialized tools to interpret the data.
  • Detection records: These represent data stored via traffic signal controllers, capturing detector channel activations in a high-resolution data log. Traffic signal controllers compliant with NTCIP 1202 v03 must support this type of logging. Users can define the operational data format, such as the Indiana traffic signal hi-resolution data logger enumerations (Li et al., 2020). This format strikes a balance between detail and storage efficiency.
  • Aggregated data: This approach involves post-processing raw data into a simplified format, such as workbooks or datasets accessible via web interfaces. The infrastructure operator or a signal performance metrics provider typically aggregates the data. Third-party vendors often enhance aggregated data by including additional performance metrics, such as those provided in an automated traffic signal performance measures (ATSPM) package.

By carefully selecting the appropriate storage format based on operational needs and available resources, agencies can optimize data usability and minimize storage challenges while ensuring accurate and accessible volume information. Table 11 presents strengths and weaknesses of different volume data storage formats.

Table 11. Strengths and Weaknesses of Different Data Storage Formats.

Raw Data from Sensor
  Strengths:
    • Data are useful for other tasks like surveillance or sensor troubleshooting.
    • Many sensors can store data directly, making this storage method easier than others.
  Weaknesses:
    • May require more storage space than alternate storage formats.
    • Information must be converted into volume information via post-processing.
    • Requires some knowledge of the sensor's configuration and location (e.g., direction pointed or lane position).
    • Some storage media (like video) may be subject to legal requests that the program may not want to accommodate. Not storing the data in this format is one way to handle such requests.

Traffic Signal Controller (TSC) High-Resolution Data
  Strengths:
    • Compact storage medium.
    • Contains a record of every detection trigger from the traffic signal controller.
    • Can contain other valuable information on traffic signal performance, like phase changes and preemptions.
  Weaknesses:
    • Information must be converted into volume information via post-processing.
    • Requires some knowledge of the sensor's configuration and location (e.g., direction pointed or lane position).
    • Users must understand the data format configuration to interpret the data, making this medium less intuitive.

Aggregate Count
  Strengths:
    • Volume data by date and time are available as a direct output from the software that processes the detection data for performance measures.
    • Some packages may be able to include additional metrics alongside the volumes.
  Weaknesses:
    • Files may contain little additional data beyond volume and direction of travel.
    • Using a third-party subscription and data storage likely involves recurring costs.

Although the choice of storage format depends on user needs, this guidebook recommends using a high-resolution data format for several reasons. High-resolution data capture records of all events in a descriptive format, such as text or comma-separated files. This approach provides detailed volume and operational data while maintaining a significantly smaller file size compared to raw video or LiDAR data.

The high-resolution data format is versatile, supporting immediate use for operational analysis and calculation of signal performance measures. Additionally, this format ensures that data remain accessible for future analyses when agency resources and priorities allow. Many ATSPM software tools are compatible with high-resolution data, particularly those adhering to the Indiana traffic signal high-resolution data logger enumerations (Li et al., 2020).

By adopting this format, agencies retain the ability to collect, process, and analyze signal performance measures. The data can also be aggregated for generating traffic volume counts, enabling both real-time and retrospective analysis while supporting future migration to advanced ATSPM systems.
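As a rough illustration of how a high-resolution log could be aggregated into counts, the sketch below tallies detector activations per channel in 15-minute bins. It is a minimal example under stated assumptions, not a prescribed method: the CSV column names (timestamp, event_code, event_param) are hypothetical, and the use of event code 82 for "detector on" reflects a common reading of the Indiana enumerations that should be verified against the controller's documentation. Activations equal vehicle counts only for channels configured as count detectors.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumption: 82 is the "detector on" event code; confirm for your controller.
DETECTOR_ON = 82

def count_activations(csv_text, bin_minutes=15):
    """Count detector-on events per (bin start, detector channel).

    bin_minutes is assumed to divide 60 evenly (e.g., 5, 15, 60).
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["event_code"]) != DETECTOR_ON:
            continue  # skip detector-off, phase, and other events
        ts = datetime.fromisoformat(row["timestamp"])
        # Floor the timestamp to the start of its bin.
        bin_start = ts.replace(minute=ts.minute - ts.minute % bin_minutes,
                               second=0, microsecond=0)
        counts[(bin_start, int(row["event_param"]))] += 1
    return counts

# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = """timestamp,event_code,event_param
2025-03-04T07:00:01.200,82,5
2025-03-04T07:00:01.900,81,5
2025-03-04T07:03:12.000,82,5
2025-03-04T07:14:59.900,82,6
2025-03-04T07:15:00.100,82,5
2025-03-04T07:15:02.000,1,2
"""

counts = count_activations(SAMPLE_LOG)
for (bin_start, channel), n in sorted(counts.items()):
    print(bin_start.isoformat(), "channel", channel, "count", n)
```

Converting channel counts into approach or movement volumes additionally requires the channel-to-lane mapping for the intersection.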

COMMUNICATIONS AND STORAGE

Effective transmission and storage of detection data are crucial for accurate volume measurement and related metrics. A reliable communication network and adequate storage resources are fundamental to this process. These needs are interconnected since the communication medium directly affects the required storage resources and their location within the network. Communication systems serve as the backbone for transferring detection data from field equipment to the agency, making them vital for successful volume measurement at intersections.

Two key parameters of traffic signal management communications are latency and reliability. Latency measures the delay between transmitting and receiving data, while reliability describes whether the transmitted data arrive complete, uncorrupted, and as intended. The choice of communication medium (such as Ethernet, fiber optics, wireless, or cellular networks) affects both reliability and latency. Modern traffic signal systems predominantly use fiber optic communication technology. Compared to the traditional copper wiring used in earlier systems, fiber optics can transmit significantly more data while preserving signal quality over much greater distances. This advancement allows for more efficient and reliable communication within traffic management networks. In some cases, such as isolated intersections without fiber infrastructure, cellular communication can make more sense. In these cases, the data are transmitted through a cellular router using a data plan the agency selects. Figure 41 shows images of a sample cellular router and fiber optic converter that can be used in a signal cabinet.

Two labeled devices are shown for cabinet communication. Image (a) displays a cellular router with several Ethernet ports and indicator lights. Yellow and blue network cables are connected to the front panel. Image (b) shows a fiber optic converter housed inside a metal cabinet. The converter has status indicators and ports with multiple yellow fiber optic cables and one gray cable plugged into it. Both devices are used to transmit and manage data from traffic system components.
Figure 41. Communication Tools for Cabinets: (a) Cellular Router, and (b) Fiber Optic Converter.

The network bandwidth, or the average data transfer rate, plays a significant role in accommodating the chosen volume data storage method. Bandwidth capabilities vary by medium, with fiber optics offering higher and more consistent bandwidth compared to cellular networks. Cellular bandwidth may fluctuate due to network load or contractual terms. Agencies storing data centrally or in the cloud benefit from higher bandwidth for faster data transfer. In contrast, edge storage (on-site in the traffic cabinet) requires less bandwidth since data can be transferred as needed or during off-peak periods, but it demands additional hardware and a more complex storage setup.
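To make the bandwidth trade-off concrete, the following back-of-the-envelope sketch estimates how long one day of data would take to transfer over a given link. All file sizes and link rates below are hypothetical placeholders chosen for illustration, not measurements from this guide, and the calculation ignores protocol overhead and fluctuations in available bandwidth.

```python
def transfer_seconds(file_megabytes, link_megabits_per_second):
    """Idealized transfer time: size in megabits divided by link rate."""
    if link_megabits_per_second <= 0:
        raise ValueError("link rate must be positive")
    return file_megabytes * 8 / link_megabits_per_second

# Hypothetical inputs; replace with sizes from a pilot collection
# and rates from the agency's own network or data plan.
daily_sizes_mb = {"high-resolution log": 5, "raw video": 20_000}
link_rates_mbps = {"cellular": 5, "fiber": 1000}

for data_label, size_mb in daily_sizes_mb.items():
    for link_label, rate in link_rates_mbps.items():
        seconds = transfer_seconds(size_mb, rate)
        print(f"{data_label} over {link_label}: {seconds:,.2f} s")
```

With inputs of this kind, the comparison usually shows why compact formats suit lower-bandwidth links, while raw sensor data may be better kept at the edge or moved during off-peak periods.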

Edge storage reduces communication demands but involves greater hardware investment and on-site data retrieval logistics. Central or cloud storage simplifies hardware requirements but incurs recurring costs and requires robust communication systems. Cloud storage also requires efficient communication for both uploading data from the field and accessing it remotely. Figure 42 illustrates a typical cloud network. In a central server-based network, the field devices for a given traffic management center (TMC) communicate directly with the TMC through the communication media, without the cloud acting as a storage location in the middle. Agencies must weigh these trade-offs when selecting a storage approach.

A cloud-based communication network links multiple traffic management centers (TMCs) and field devices. Five TMCs, each with rows of computer monitors, surround a central cloud symbol. Each TMC is connected to the cloud by a jagged lightning-shaped line, indicating data transmission. Various field devices, including vertical cabinets and vehicle-mounted systems, are also connected to the cloud. Each field device exchanges data with the central cloud, enabling communication between remote sensing hardware and central control facilities. The setup represents a distributed, cloud-enabled traffic monitoring and control system.
Figure 42. Illustration of a Cloud Communications Environment (Balke et al., 2023).

Best practices for data collection and storage are:

  • Data conversion at the edge: Sensors should log data in the desired format—whether high-resolution logs or aggregate counts—before transmission. High-resolution data minimize bandwidth needs compared to raw video or LiDAR data, making them a practical choice for central or cloud storage.
  • Pilot study for storage needs: Agencies should perform trial data collection (e.g., a week of volume data) to estimate storage requirements. This process helps determine the resources needed under the agency's data retention policy, typically 3–5 years.
  • Regular data collection: Agencies should automate the transfer of data from edge devices to a central server or cloud storage. Data should be organized systematically, with folders labeled by intersection ID, name, and date. Metadata, such as sensor types and channel mappings, should be stored alongside the data.
  • Backup systems: To safeguard against data loss, agencies should maintain two separate storage locations—one can be cloud-based, but at least one should be a physical backup.
  • Timestamp accuracy: Accurate timestamps are critical for volume data. Agencies should check clock synchronization twice a year, ideally aligned with daylight saving time transitions.
  • Processing tools: For high-resolution data, agencies should develop scripts to extract and analyze volumes efficiently. For aggregate counts, tools provided by third-party vendors often simplify data retrieval and analysis.
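The pilot-study recommendation above can be turned into a simple projection. The sketch below scales the size of a trial collection to a full retention period; the function name, the example inputs, and the default of two stored copies (reflecting the backup recommendation) are illustrative assumptions rather than values from this guide.

```python
def projected_storage_gb(pilot_bytes, pilot_days, intersections,
                         retention_years, copies=2):
    """Scale a pilot collection at one intersection to the full program.

    Assumes every intersection produces data at about the pilot's daily
    rate, which may not hold where detector counts or traffic differ.
    """
    if pilot_days <= 0:
        raise ValueError("pilot_days must be positive")
    bytes_per_day = pilot_bytes / pilot_days
    total_bytes = (bytes_per_day * 365 * retention_years
                   * intersections * copies)
    return total_bytes / 1e9  # decimal gigabytes

# Hypothetical example: a 7-day pilot produced 70 MB at one intersection.
estimate = projected_storage_gb(
    pilot_bytes=70_000_000, pilot_days=7,
    intersections=100, retention_years=5,
)
print(f"Projected storage: {estimate:,.0f} GB")
```

Running the pilot at a few intersections of different sizes, rather than one, would give a more defensible per-day rate.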

By following these recommendations, agencies can promote reliable communication, efficient storage, and accurate management of volume data, improving traffic signal and system performance analyses.
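One possible way to organize archived files by intersection ID, name, and date, with metadata stored alongside the data, is sketched below. The folder layout, the metadata.json file name, and the metadata fields are all hypothetical choices made for illustration; agencies would adapt them to their own conventions.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

def archive_dir(root, intersection_id, name, day):
    """Build <root>/<id>_<name>/<YYYY>/<MM>/<DD> for one intersection-day."""
    safe_name = "_".join(name.split())  # avoid spaces in folder names
    return (Path(root) / f"{intersection_id}_{safe_name}"
            / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}")

def write_metadata(folder, metadata):
    """Create the folder if needed and store metadata next to the data."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "metadata.json"
    path.write_text(json.dumps(metadata, indent=2))
    return path

# Hypothetical intersection and channel mapping, for illustration only.
example_metadata = {
    "sensor_type": "inductive loop",
    "channel_map": {"5": "northbound through, lane 1",
                    "6": "northbound through, lane 2"},
}

with tempfile.TemporaryDirectory() as tmp:
    folder = archive_dir(tmp, "1042", "Main St at 3rd Ave", date(2025, 3, 4))
    meta_path = write_metadata(folder, example_metadata)
    restored = json.loads(meta_path.read_text())
    print(folder.relative_to(tmp).as_posix())
```

Keeping the channel mapping in the same folder as the logs means a later analyst can interpret the data even if the detector configuration has since changed.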

QUALITY AND TROUBLESHOOTING

Errors in volume measurements can manifest in various ways, often influenced by the specific detection technology used. For instance, video, radar, and LiDAR technologies, which monitor a stretch of roadway, are prone to occlusion, where larger vehicles block the detection of smaller ones. In contrast, detection methods like inductive loops and magnetometers, which detect vehicles directly above the sensor, are less susceptible to occlusion but may face challenges such as power loss or failures in data transmission to the controller. To ensure data accuracy, agencies should strengthen their procurement process and regularly review and validate the volume data collected by their systems.

Procurement Considerations

To ensure vendors provide effective and reliable solutions, agencies should consider incorporating the following requirements into procurement specifications:

  • Require vendors to provide third-party validation results or independent performance evaluations under real-world conditions.
  • Specify acceptable accuracy thresholds for key metrics such as turning movement counts (TMCs), classification, and volume detection.
  • Define performance expectations under various environmental conditions (e.g., nighttime, inclement weather, high traffic congestion).
  • Require documentation of vendor-recommended calibration procedures.
  • Ensure vendors provide guidelines on proper sensor placement to maximize detection accuracy.
  • Consider requiring an on-site verification process before final system acceptance.
  • Establish performance-based contracts that link payments to ongoing system accuracy and uptime.
  • Require vendors to offer training for agency staff on system calibration and troubleshooting.
  • Include a requirement for periodic recalibration and system diagnostics as part of maintenance contracts.

Agencies should also collaborate with peer agencies that have deployed similar systems to understand their experiences with different vendors, and participate in regional or national working groups focused on traffic detection technologies.

Initial Validation

When starting to collect volume data, agencies should dedicate extra time during the initial weeks to verify that data are being recorded and transmitted as expected. This process includes:

  • Confirming that the correct file types are saved.
  • Ensuring files are the expected size.
  • Checking for sufficient data to calculate volumes.
  • Comparing signal volumes against benchmark volumes collected from manual counts, videos, vendor-reported data, or other reliable sources of volume data.

This early validation period helps establish a baseline for expected traffic volumes, which can assist in ongoing monitoring and troubleshooting.
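The benchmark comparison in the list above can be expressed as a percent error per movement. The sketch below is one simple way to do this; the 10 percent tolerance and the sample counts are hypothetical, and agencies would substitute the accuracy thresholds they have adopted.

```python
def percent_error(signal_count, benchmark_count):
    """Signed percent difference of the signal count from the benchmark."""
    if benchmark_count <= 0:
        raise ValueError("benchmark count must be positive")
    return 100.0 * (signal_count - benchmark_count) / benchmark_count

def movements_out_of_tolerance(pairs, tolerance_pct=10.0):
    """Return {movement: error} where |error| exceeds the tolerance.

    pairs maps a movement label to (signal_count, benchmark_count).
    The default tolerance is an illustrative assumption.
    """
    flagged = {}
    for movement, (signal, benchmark) in pairs.items():
        error = percent_error(signal, benchmark)
        if abs(error) > tolerance_pct:
            flagged[movement] = error
    return flagged

# Hypothetical counts for the same one-hour period.
sample_pairs = {
    "northbound through": (412, 400),
    "southbound left": (61, 80),
}
flagged = movements_out_of_tolerance(sample_pairs)
for movement, error in flagged.items():
    print(f"{movement}: {error:+.2f}% vs. benchmark")
```

Percent error can be unstable for very low-volume movements, so some agencies may prefer an absolute difference or a combined criterion in those cases.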

Ongoing Data Quality Checks

Once the system is operational, agencies should:

  • Check for new data uploads at expected intervals. If no new data are found, troubleshoot to confirm the detection system is powered and operational. Automated processes for saving and flagging missing data can streamline this task.
  • Look for unexpected spikes or drops in traffic volumes, which may indicate sensor errors. Examples include:
    • High traffic volumes during off-peak hours.
    • Lack of directional flow variations during expected periods (e.g., rush hour).
    • Sudden, unexplained changes in volumes.
  • Such anomalies may signal the need for recalibration, sensor replacement, or adjustments to the detection system.
  • Compare volume patterns against known traffic conditions, such as special events or typical peak hours. If the data lack these expected variations, further investigation is warranted.
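The checks above lend themselves to automation. The following sketch shows two simple routines: one that lists expected time bins with no data, and one that flags counts far above or below a baseline. The 15-minute bin, the factor-of-two threshold, and the sample values are illustrative assumptions, and a production check would likely use baselines by day of week and season.

```python
from datetime import datetime, timedelta

def missing_intervals(timestamps, start, end, step=timedelta(minutes=15)):
    """Return expected bin starts in [start, end) that have no record."""
    seen = set(timestamps)
    missing = []
    t = start
    while t < end:
        if t not in seen:
            missing.append(t)
        t += step
    return missing

def flag_anomalies(observed, baseline, max_ratio=2.0):
    """Label bins as 'spike' or 'drop' relative to the baseline count.

    A bin is a spike if observed > max_ratio * baseline and a drop if
    observed < baseline / max_ratio. Bins without a positive baseline
    are skipped because no ratio can be formed.
    """
    flags = {}
    for key, count in observed.items():
        expected = baseline.get(key)
        if not expected:
            continue
        ratio = count / expected
        if ratio > max_ratio:
            flags[key] = "spike"
        elif ratio < 1 / max_ratio:
            flags[key] = "drop"
    return flags

# Hypothetical data: bin starts received, and hourly counts by hour of day.
received = [datetime(2025, 3, 4, 7, 0), datetime(2025, 3, 4, 7, 15),
            datetime(2025, 3, 4, 7, 45)]
gaps = missing_intervals(received, datetime(2025, 3, 4, 7, 0),
                         datetime(2025, 3, 4, 8, 0))
flags = flag_anomalies(observed={2: 180, 8: 590, 17: 210},
                       baseline={2: 20, 8: 600, 17: 650})
print("missing bins:", [t.isoformat() for t in gaps])
print("flags:", flags)
```

A flag is only a prompt for investigation, since special events or incidents can produce genuine spikes and drops.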

System Improvements and Audits

To enhance accuracy, agencies should:

  • Reconfigure sensors to address recurring errors.
  • Develop algorithms to correct observed inaccuracies in volume data.
  • Conduct periodic audits by cross-referencing system-generated volumes with data from independent sources, such as tube counters or manual counts. Discrepancies should prompt recalibration or adjustments to the signal assets.
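As one example of a simple correction algorithm, the sketch below derives a multiplicative adjustment factor from paired audit counts and applies it to a system-generated volume. This is an illustrative approach, not one prescribed by this guide: it assumes the detection error is roughly proportional to volume, which should be checked against the audit data before use, and all sample counts are hypothetical.

```python
def adjustment_factor(audit_pairs):
    """Ratio of total independent counts to total system counts.

    audit_pairs is a list of (system_count, independent_count) tuples
    covering the same locations and time periods.
    """
    system_total = sum(system for system, _ in audit_pairs)
    reference_total = sum(reference for _, reference in audit_pairs)
    if system_total == 0:
        raise ValueError("system counts sum to zero; factor undefined")
    return reference_total / system_total

def adjusted_volume(system_count, factor):
    """Apply the factor and round to a whole number of vehicles."""
    return round(system_count * factor)

# Hypothetical audit: the system undercounts by about 10 percent.
audit = [(90, 100), (180, 200), (270, 300)]
factor = adjustment_factor(audit)
print(f"factor = {factor:.3f}")
print("adjusted volume for a system count of 450:", adjusted_volume(450, factor))
```

If the factor varies widely between audit periods, a single correction is probably inappropriate and the sensor configuration itself likely needs attention.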

URBAN VERSUS RURAL TRAFFIC COUNT DATA COLLECTION

Traffic count data collection methods must account for the differing operational and environmental conditions in urban and rural settings. While the goal of obtaining accurate signal volumes remains the same, key distinctions in infrastructure, detection technologies, data transmission, and maintenance practices must be considered to ensure successful implementation across diverse locations.

Urban Traffic Count Data Collection

Urban environments present unique challenges, primarily due to high traffic volumes, complex intersections, and multimodal interactions. Considerations for urban traffic data collection include:

  • High density and congestion: Data collection must accommodate frequent vehicle stops, slow-moving traffic, and congestion-related occlusion in sensor-based detection systems (e.g., video or radar).
  • Multimodal integration: Urban areas require methods for capturing pedestrian, bicycle, and transit vehicle data alongside motorized traffic. Advanced sensors, such as AI-enhanced video analytics, are often necessary to differentiate between users.
  • Interference and infrastructure constraints: Signal interference from buildings, tunnels, and other urban infrastructure may affect GPS-dependent or wireless data transmission systems. Careful sensor placement and data redundancy strategies help mitigate this issue.
  • Real-time data needs: Urban traffic management often relies on real-time traffic flow data for adaptive signal control and congestion management. This necessitates high-bandwidth, low-latency communication networks, typically supported by fiber optics.
  • Frequent maintenance and calibration: Higher traffic volumes and wear on infrastructure require more frequent sensor maintenance and recalibration to sustain accuracy. Agencies should establish protocols for periodic sensor audits and adjustments.

Rural Traffic Count Data Collection

Rural traffic monitoring differs significantly due to lower traffic volumes, longer roadway segments, and limited infrastructure. Key considerations include:

  • Low volume and seasonal variability: Rural roads experience lower daily traffic but may see seasonal fluctuations due to tourism, agricultural activity, or weather conditions. Longer data collection periods may be needed to capture representative trends.
  • Extended roadway segments: Due to greater distances between intersections, data collection often focuses on corridor-level trends rather than intersection-level movements. Portable count stations are commonly used.
  • Limited power and communication infrastructure: Many rural areas lack direct power sources and fiber optic connectivity. Agencies may need to rely on solar-powered detection systems, cellular communication, or local storage with periodic data retrieval.
  • Environmental challenges: Extreme weather conditions, such as heavy snowfall or flooding, can impact detection accuracy and require robust, weather-resistant equipment. Sensors should be selected and positioned to minimize environmental disruptions.
  • Poor lighting: Limited street lighting in rural areas can affect the performance of video-based and infrared detection systems, requiring alternative detection methods or additional lighting solutions.
  • Cost and resource constraints: Budgetary limitations often restrict the deployment of high-tech solutions in rural areas. Agencies should prioritize cost-effective, low-maintenance detection systems and explore regional partnerships to share data collection resources.

By addressing these urban and rural differences, agencies can tailor their traffic count data collection strategies to improve accuracy and efficiency.
