
This chapter summarizes key project findings in relation to the research questions addressed in NCHRP 07-30, as detailed in NCHRP Web-Only Document 406: Methods for Assigning Short-Duration Traffic Volume Counts to Adjustment Factor Groups to Estimate AADT.
The NCHRP 07-30 research team sought to determine the most and least effective assignment methods. Findings revealed that cluster analysis is the most effective method for creating homogeneous adjustment factor groups. Some clustering algorithms, such as k-prototypes and hierarchical clustering, accept both continuous and categorical variables, which helps, to some extent, to create clusters that share common characteristics. Ultimately, however, analysts have to manually review and refine the produced clusters or employ other statistical or machine learning methods, such as decision trees and support vector machines, to assign counts to clusters. Support vector machines are slightly more effective than decision trees in assigning counts to clusters, but they have a more complex structure.
Further, grouping CCSs by functional class and rural/urban area type tends to produce slightly less accurate AADT estimates than cluster analysis, but the development of CCS groups and the assignment of counts to groups are easy and straightforward. Annualizing counts using probe-based adjustment factors is a promising method that can potentially produce more accurate AADT estimates than traditional methods as long as the penetration rate of the probe data is sufficient and the correlations between probe-based and actual adjustment factors derived from CCSs are strong (>0.85).
Among all methods examined in NCHRP 07-30, not factoring counts is the least effective approach. Other methods that exhibit lower effectiveness include assigning counts to individual CCSs using decision trees, developing clusters using inputs with limited or no temporal adjustment factors, and using probe-based adjustment factors developed from raw probe data with low penetration rates.
The research team also sought to determine the most and least important assignment attributes. Findings showed that the most important attributes used in decision trees and support vector machines are the 8,760 hourly factors by day of year, followed by the 2,016 hourly factors developed by month and day of week. The most important attributes used to develop clusters are the 84 monthly day-of-week factors, followed by the 12 monthly factors. In contrast, attributes with comparatively lower importance, particularly when they are used alone in a model, include Census variables (e.g., population density) and traffic volumes (e.g., 12 monthly volumes, 365 daily volumes).
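To make the attribute counts concrete, the following minimal sketch derives the 12 monthly factors and the 84 monthly day-of-week factors (12 months × 7 days) from one year of daily volumes at a single CCS. It assumes that each factor is the ratio of AADT to the corresponding average daily volume and that AADT is the simple average of the daily volumes. Both are simplifying assumptions for illustration, and the daily volumes are synthetic.

```python
from collections import defaultdict
from datetime import date, timedelta

def temporal_factors(daily):
    """daily: dict mapping datetime.date -> daily volume for one CCS.

    Assumption: factor = AADT / average daily volume of the period,
    with AADT taken as the simple mean of all daily volumes.
    """
    aadt = sum(daily.values()) / len(daily)
    by_month = defaultdict(list)
    by_month_dow = defaultdict(list)
    for day, volume in daily.items():
        by_month[day.month].append(volume)
        # weekday(): Monday = 0 ... Sunday = 6
        by_month_dow[(day.month, day.weekday())].append(volume)
    monthly = {m: aadt / (sum(v) / len(v)) for m, v in by_month.items()}
    monthly_dow = {k: aadt / (sum(v) / len(v)) for k, v in by_month_dow.items()}
    return aadt, monthly, monthly_dow

# Synthetic year of data: volumes grow through the year and drop on weekends.
daily = {}
start = date(2023, 1, 1)
for offset in range(365):
    day = start + timedelta(days=offset)
    base = 1000 + 20 * day.month
    daily[day] = base * (0.8 if day.weekday() >= 5 else 1.0)

aadt, monthly, monthly_dow = temporal_factors(daily)
print(len(monthly), len(monthly_dow))  # 12 monthly and 84 monthly day-of-week factors
```

In this sketch, a factor above 1.0 marks a period with below-average traffic, so a count taken in that period is scaled up when annualized.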
Examination of the efficacy of functional classification versus volume factor groups revealed that when CCSs from all functional classes are considered, all four methods—functional classification, functional class combined with (rural/urban) area type, five volume groups, and ten volume groups—result in similar AADT accuracy. FC_RU and FC perform slightly better than volume groups in lower functional classes, FC6 and FC7. This finding can be attributed to the fact that in functional classification, group adjustment factors are developed exclusively from CCSs belonging to FC6 and FC7. In contrast, volume groups may contain CCSs from all seven functional classes, making the groups more variable and less effective. Volume groups are suitable when CCSs are unavailable on lower functional classes, and therefore, functional classification cannot be applied.
Another relevant finding is that dividing five volume groups into ten volume groups does not enhance AADT accuracy. Therefore, in most cases, five or fewer volume groups should be preferred over a high number of volume groups.
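The difference between the grouping schemes can be illustrated with a small sketch. The CCS records, the volume breakpoints, and the use of a simple mean as the group adjustment factor are all hypothetical choices made for illustration; they are not values or procedures taken from NCHRP 07-30.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical CCS records:
# (station, functional class, area type, AADT, one adjustment factor)
ccs = [
    ("A", 1, "R", 42000, 0.93),
    ("B", 1, "U", 98000, 0.97),
    ("C", 6, "R", 1800, 0.84),
    ("D", 6, "R", 2500, 0.86),
    ("E", 7, "R", 400, 0.80),
    ("F", 7, "U", 2200, 0.95),
]

# Hypothetical AADT upper bounds that split stations into five volume groups.
VOLUME_BREAKS = [1000, 5000, 20000, 50000]

def volume_group(aadt):
    """Return the index (0-4) of the volume group that contains this AADT."""
    for index, upper in enumerate(VOLUME_BREAKS):
        if aadt < upper:
            return index
    return len(VOLUME_BREAKS)

def group_factors(records, key):
    """Average the member factors of each group defined by key(record)."""
    members = defaultdict(list)
    for record in records:
        members[key(record)].append(record[4])
    return {group: mean(factors) for group, factors in members.items()}

by_fc = group_factors(ccs, key=lambda r: r[1])
by_fc_ru = group_factors(ccs, key=lambda r: f"{r[1]}{r[2]}")
by_volume = group_factors(ccs, key=lambda r: volume_group(r[3]))
```

Here volume group 1 (AADT from 1,000 to under 5,000) mixes two FC6 stations with an urban FC7 station, whereas the FC and FC_RU groups draw only on stations of the same class. This is the mechanism the text cites for the slightly lower effectiveness of volume groups on FC6 and FC7.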
The research team examined the anticipated improvement in AADT accuracy using cluster analysis and found that based on the average results across all states, cluster analysis tends to produce significantly more homogeneous clusters than the traditional grouping methods, though the improvement in AADT accuracy is not as pronounced. For instance, as shown in Table 4, the overall within-group variability (WACV) of seven clusters is around 6.4 percent, while that of the seven FCs is much higher (11.8 percent). The MAPE of cluster analysis performed assuming that the cluster membership of counts is known was 6.7 percent, whereas the corresponding error from functional classification was higher, at 9.3 percent. The MAPE obtained from using support vector machines to assign weekday counts to clusters was 7.8 percent.
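For reference, the MAPE values quoted throughout this chapter can be reproduced with a few lines of code. The sketch below assumes the standard definition of mean absolute percentage error, with the AADT observed at a CCS as the reference value; the sample numbers are invented.

```python
def mape(estimated, actual):
    """Mean absolute percentage error, in percent.

    estimated: AADT estimates derived from annualized short-duration counts
    actual:    reference AADT values observed at the same locations
    """
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(e - a) / a for e, a in zip(estimated, actual)]
    return 100.0 * sum(errors) / len(errors)

# Invented example: three validation sites.
estimated_aadt = [950, 2100, 480]
actual_aadt = [1000, 2000, 500]
print(round(mape(estimated_aadt, actual_aadt), 2))  # 4.67
```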
Note that the results varied from one state to another. For example, in some states, clustering improved AADT accuracy over traditional methods by more than 70.0 percent, whereas in other states, no improvement was observed. The decision to use cluster analysis over traditional methods should be made by practitioners based on state-specific results, goals, and available resources.
The research team sought to determine the expected increase/reduction in AADT accuracy when combining two or more assignment methods and found that this aspect depends on the methods to be combined. Combining two or more methods may result in a small increase in AADT accuracy or may slightly reduce the accuracy but mitigate individual method limitations. For example, across all states and years, factor groups developed by combining functional classification with rural/urban area type yielded AADT estimation errors that were slightly lower, by around 0.5 percent, than those of groups developed only by FC. In contrast, combining cluster analysis with decision trees tended to reduce the AADT accuracy of M10 by 1.0–2.0 percent, depending on the number of clusters created, but facilitated the assignment process, making it data-driven. Another example is that when functional classification is combined with cluster analysis, the latter tends to become slightly less effective, but the produced clusters are better defined.
Some clustering methods, such as the k-prototypes partitioning algorithm and hierarchical clustering, can handle both continuous and categorical variables, which, to some extent, aid in improving the definition of clusters by incorporating some assignment characteristics. Clustering may be a good starting point for identifying similar patterns and determining factor groups. However, findings from this project and previous research studies have shown that there is no statistical or machine learning method nor a set of clustering variables that can automatically and easily produce clusters that are well-defined from a practical perspective without modifying them further. Ultimately, analysts have to review and refine the produced clusters manually. However, there may be cases where some of the clusters cannot be easily defined but have unique traffic patterns that need to be kept as separate factor groups. In this case, one option is to employ other statistical or machine learning methods, such as decision trees, to assign counts to clusters in a data-driven manner.
The research team found that the most important attributes used to develop clusters are the 84 monthly day-of-week factors, followed by the 12 monthly factors. Other variables that can potentially be used to better define the produced clusters include the roadway functional class, area type, land use characteristics (if available), and geographical coordinates of the CCSs. Census variables and distance or proximity variables can also be used but may not be as effective as the attributes stated above.
Decision trees, support vector machines, discriminant analysis, random forests, gradient boosting, and artificial neural networks are examples of statistical and machine learning classification methods that can be used to assign counts to existing clusters or individual CCSs. This research project applied and validated decision trees and support vector machines and found that the latter are slightly more effective than decision trees but have a more complex structure. The validation also revealed that assigning counts to clusters using decision trees and support vector machines results in more accurate AADT estimates than assigning them to individual CCSs within the same functional class.
NCHRP 07-30 researchers sought to determine the expected AADT accuracy of counts factored using adjustment factors from higher functional classes in the case of low-volume roads. Findings showed that for FC6 and FC7, the average AADT estimation errors of weekday counts annualized using adjustment factors from FC5 (M6) or from 5R and 5U (M7) exceed 13 percent. In comparison, other traditional methods, such as functional classification, yield more accurate AADT estimates (MAPE = 11.8 percent) than M6 and M7. For rural local roads (7R), counts annualized using factors from 6R (M9) result in MAPEs of 11.3 percent, which are slightly higher than those (10.6 percent) obtained from functional classification.
The research team aimed to answer the fundamental question of whether it is worth factoring counts, and if so, which assignment method should be used and what the anticipated level of improvement in AADT accuracy is. Related results showed that it is indeed worth factoring counts; in fact, among all methods examined in this project, not factoring counts yielded the least accurate AADT estimates, with average errors ranging from 11.5 percent (for counts taken on Tuesday–Thursday) to 19.1 percent (for counts conducted on Saturdays and Sundays). Moreover, simple traditional methods such as functional classification and functional class combined with rural/urban code are expected to improve the AADT accuracy of weekday counts by approximately 26 percent and 30 percent, respectively. More advanced methods, such as cluster analysis combined with support vector machines, are expected to improve the AADT accuracy of weekday and weekend counts by more than 37 percent and 50 percent, respectively. Further, annualizing counts using probe-based segment-specific adjustment factors that are strongly correlated with actual adjustment factors can considerably increase the AADT accuracy of unfactored counts.
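The effect of factoring on a single count can be shown with a minimal worked example. It assumes that a 24-hour count is annualized by multiplying it by one group adjustment factor and that an unfactored count is used directly as the AADT estimate. All numbers are hypothetical and chosen only to show the mechanics.

```python
def annualize(count, factor=1.0):
    """Estimate AADT from a 24-hour count; factor=1.0 means no factoring."""
    return count * factor

def percent_error(estimate, actual):
    """Absolute percentage error of one AADT estimate."""
    return 100.0 * abs(estimate - actual) / actual

true_aadt = 1000.0      # hypothetical AADT known from a CCS
saturday_count = 780.0  # hypothetical 24-hour count taken on a Saturday
group_factor = 1.25     # hypothetical Saturday factor for the assigned group

unfactored_error = percent_error(annualize(saturday_count), true_aadt)
factored_error = percent_error(annualize(saturday_count, group_factor), true_aadt)
print(unfactored_error, factored_error)  # 22.0 2.5
```

The weekend count understates the annual average, so using it unfactored produces a large error, while the group factor removes most of it. The size of the gain in practice depends on how well the assigned group represents the count location.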
Ideally, agencies would operate CCSs on low-volume roads. This project found that if CCSs are not available on local roads (FC7), one alternative is to use group adjustment factors from 6R and 6U to annualize counts taken on 7R and 7U, respectively. A second option is to use volume factor groups. A promising option is to use segment-specific probe-based adjustment factors, provided that the latter are strongly correlated (>0.85) with actual adjustment factors stemming from CCS data. The higher the correlations, the higher the anticipated accuracy of AADT estimates derived from counts factored using probe-based factors.
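The choice between probe-based factors and a fallback can be sketched as a simple screening step. The sketch assumes that the correlation is a Pearson coefficient computed at locations where both probe-based and CCS-based factors exist, and it applies the 0.85 threshold stated above. The function names and the seven day-of-week factors are hypothetical.

```python
from math import sqrt

CORRELATION_THRESHOLD = 0.85  # threshold stated in the text

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def choose_factor_source(probe_factors, ccs_factors, threshold=CORRELATION_THRESHOLD):
    """Return 'probe' if probe factors track CCS factors closely, else 'group'."""
    r = pearson(probe_factors, ccs_factors)
    return ("probe" if r > threshold else "group"), r

# Hypothetical day-of-week factors (Sunday through Saturday) at a CCS site.
ccs_factors = [1.25, 0.98, 0.95, 0.94, 0.93, 0.88, 1.15]
strong_probe = [1.22, 0.97, 0.95, 0.94, 0.93, 0.90, 1.12]
weak_probe = [1.00, 1.10, 0.90, 1.05, 0.95, 1.00, 1.02]

strong_choice, strong_r = choose_factor_source(strong_probe, ccs_factors)
weak_choice, weak_r = choose_factor_source(weak_probe, ccs_factors)
```

With these inputs the first probe dataset passes the screen and the second falls back to group factors (for example, those of 6R and 6U, or volume factor groups).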
Probe data can be a viable data source for calculating temporal adjustment factors. However, as of the publication of this guide, it is more difficult to develop accurate axle-correction factors due to the limited availability of probe data for specific vehicle classes. The analysis in NCHRP 07-30 revealed that the effectiveness of probe-based adjustment factors in annualizing counts depends on several interconnected factors, including the following:
Though exceptions may occur, generally speaking, the higher the penetration rate of the probe data, the stronger the correlation between probe and actual factors, which in turn results in more accurate AADT estimates developed from counts annualized using probe-based factors.
The results from the validation of probe-based factoring showed that the accuracy of probe-based AADT estimates developed using probe data from only one of the three vendors (Vendor A) was comparable to that obtained from traditional methods. In the other two cases (Vendor B and Vendor C), the AADT estimation errors were higher than those obtained from the no-factoring method. The probe data of Vendor A had an average penetration rate of 5.8 percent, and the Pearson correlation coefficient between the best-performing set of probe-based adjustment factors (seven annual day-of-week factors) and actual adjustment factors was around 0.82. The corresponding penetration rates and correlations obtained for Vendors B and C were lower or significantly lower than those of Vendor A.
There may be cases where the penetration rates are high, but the correlations and, thereby, the AADT accuracy are low. For instance, the 12 monthly probe-based adjustment factors developed for medium- and heavy-duty trucks using probe data from Vendor C were highly ineffective (MAPE ≈ 42.7 percent), having weak correlations (≈0.32) with actual adjustment factors, despite the relatively high penetration rates (≈28 percent). The latter are likely a data artifact because Vendor C rounds raw probe trip counts to the nearest 1,000. This rounding leads to increased penetration rates but weak correlations and introduces significant AADT estimation errors.
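The rounding effect can be reproduced with synthetic numbers. The sketch below assumes that the penetration rate is the ratio of probe trips to CCS volume and that a monthly factor is the annual average volume divided by the monthly volume. The truck volumes and trip counts are invented and do not come from any vendor dataset.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))

def monthly_factors(volumes):
    """Assumed definition: annual average volume divided by each month's volume."""
    average = sum(volumes) / len(volumes)
    return [average / v for v in volumes]

def round_to_1000(trips):
    """Round half up to the nearest 1,000 trips."""
    return int(trips / 1000 + 0.5) * 1000

# Invented monthly truck volumes at a CCS and raw probe trips (about 5 percent).
ccs_trucks = [10400, 11200, 11800, 13900, 16800, 22300,
              30100, 26200, 17800, 14100, 11500, 10700]
raw_trips = [520, 550, 600, 680, 850, 1100, 1520, 1300, 900, 700, 580, 530]
rounded_trips = [round_to_1000(t) for t in raw_trips]

actual = monthly_factors(ccs_trucks)
r_raw = pearson(monthly_factors(raw_trips), actual)
r_rounded = pearson(monthly_factors(rounded_trips), actual)
pen_raw = sum(raw_trips) / sum(ccs_trucks)
pen_rounded = sum(rounded_trips) / sum(ccs_trucks)
print(f"raw: r={r_raw:.2f}, penetration={pen_raw:.1%}")
print(f"rounded: r={r_rounded:.2f}, penetration={pen_rounded:.1%}")
```

In this example, eleven of the twelve months round to exactly 1,000 trips. The apparent penetration rate rises because most counts are rounded up, while the seasonal signal is almost entirely lost, so the correlation with the actual factors drops sharply.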
The research team sought to determine the accuracy of AADT estimates derived from SDCs that have been annualized using probe-based factors. Table 19 shows the average penetration rates, correlations, and AADT estimation errors (MAPE) produced using probe data from different vendors and states.
These results hold true only for the three states and the probe datasets that were analyzed in NCHRP 07-30. Different results may be obtained using data from different vendors or from the same vendor but different states.
The researchers found that as long as probe-based factors are strongly correlated with actual adjustment factors, they can be used to factor counts from lower roadway functional classes. The correlation analysis performed using Vendor A’s seven annual day-of-week factors revealed that the correlations increased from higher to lower functional classes; however, a larger sample size is needed to draw more robust conclusions for Vendor A’s data. The AADT estimation errors slightly increased in the lower functional classes, but they remained less than 10 percent. This increase is expected, though, because the MAPE generally tends to increase as the volumes decrease. For this reason, more room for error is typically allowed in the lower functional classes. The main advantages of probe-based factors over other grouping and assignment methods include (1) the ability to develop probe-based adjustment factors separately for every roadway segment of the entire network, (2) the elimination of the need to create factor groups, and (3) the elimination of the need to assign counts to groups.
Table 19. Penetration rates, correlations, and AADT accuracy of probe-based adjustment factors (M18).
| Vendor (State) | APR* | Correlation with Actual Factors: 7 Day of Week | Correlation with Actual Factors: 12 Monthly | Correlation with Actual Factors: 84 Monthly Day of Week | Correlation with Actual Factors: 365 Daily | AADT MAPE: 7 Day of Week | AADT MAPE: 12 Monthly | AADT MAPE: 84 Monthly Day of Week | AADT MAPE: 365 Daily |
|---|---|---|---|---|---|---|---|---|---|
| Vendor A (TX) | 5.83% | 0.817 | 0.788 | 0.795 | 0.844 | 8.0% | 11.5% | 9.6% | 9.9% |
| Vendor B (OH) | 0.39% | 0.549 | 0.509 | 0.399 | 0.311 | 11.3% | 36.0% | 38.1% | 54.9% |
| Vendor C (MN) | 2.86% | N/A | 0.325 | N/A | N/A | N/A | 24.0% | N/A | N/A |
*Average Penetration Rate
In general, higher penetration rates tend to more effectively capture temporal changes in traffic volumes, leading to stronger correlations between probe-based factors and actual adjustment factors. This in turn increases the accuracy of probe-based AADT estimates developed from annualized counts. However, many factors may affect the penetration rates, the correlations, and thus the AADT accuracy, such as the types of probe data being used and their representativeness of the entire population. If the probe data only capture a specific subset of the traffic or are biased toward certain conditions or a specific area, they may not be fully representative of the entire population over time, leading to discrepancies between estimated and actual AADT values.
Another challenge that can lead to poor correlations and AADT accuracy is vendors obtaining and adding new raw probe data from new sources to increase the penetration rate of their raw probe data and the accuracy of the data products that they develop. This ultimately may affect the temporal traffic patterns captured through raw probe data, particularly if the dates on which new data sources are added are unknown or are not taken into consideration when probe-based adjustment factors are developed. Something similar can happen when existing sources of probe data are discontinued or abandoned by a vendor.