Life-Cycle Decisions for Biomedical Data: The Challenge of Forecasting Costs (2020)

Chapter: Appendix C: Identifying Salary Ranges for Jobs Relevant to the Data Life Cycle

Suggested Citation: "Appendix C: Identifying Salary Ranges for Jobs Relevant to the Data Life Cycle." National Academies of Sciences, Engineering, and Medicine. 2020. Life-Cycle Decisions for Biomedical Data: The Challenge of Forecasting Costs. Washington, DC: The National Academies Press. doi: 10.17226/25639.

C

Identifying Salary Ranges for Jobs Relevant to the Data Life Cycle

The main text identifies certain job descriptions with associated salary ranges, from L (low) to VH (very high). This appendix identifies possible job titles and associated salary ranges observed in workplace and occupational surveys conducted by the Bureau of Labor Statistics (BLS, 2019a).

DATA USED

Occupational Employment Statistics

Collection methods, estimation methodology, and coverage are described in BLS (2019b). The committee downloaded the data from https://www.bls.gov/oes/special.requests/oesm18nat.zip on October 30, 2019. From the downloaded data, national_M2018_dl.xlsx was used.

Occupational Information Network

The Occupational Information Network (O*NET) database is a U.S. Department of Labor–sponsored database developed by the National Center for O*NET Development.1 It provides standardized descriptions of hundreds of occupations within the U.S. economy and comprises worker attributes and job characteristics. Information is collected using a two-stage design, supplemented by analyst ratings, as follows:

  • A statistically random sample of businesses expected to employ workers in the targeted occupations is identified.
  • A random sample of workers in those occupations within those businesses is selected. Data are collected by surveying these job incumbents, each of whom receives one of three standardized questionnaires on occupation characteristics, assigned at random. Additional questions cover tasks and demographic information.
  • Abilities and skills information is developed by occupational analysts using the updated information from incumbent workers (National Center for O*NET Development, 2019a).
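The two sampling stages described above can be sketched in a few lines of Python. This is a minimal illustration only, not the O*NET procedure itself: the sampling frame, the sample sizes, the questionnaire labels, and the function name `two_stage_sample` are all hypothetical, and simple random sampling is assumed at both stages.

```python
import random

# Hypothetical labels for the three standardized questionnaires.
QUESTIONNAIRES = ("Q1", "Q2", "Q3")

# Hypothetical sampling frame: business -> incumbents in the target occupation.
frame = {
    "Business A": ["a1", "a2", "a3"],
    "Business B": ["b1", "b2"],
    "Business C": ["c1", "c2", "c3", "c4"],
}


def two_stage_sample(businesses, n_businesses, n_workers, seed=0):
    """Return (business, worker, questionnaire) triples.

    Stage 1 draws businesses at random; stage 2 draws workers within each
    selected business; each worker is then assigned one questionnaire at random.
    """
    rng = random.Random(seed)
    selected = rng.sample(sorted(businesses), n_businesses)  # stage 1
    assignments = []
    for business in selected:
        workers = businesses[business]
        k = min(n_workers, len(workers))
        for worker in rng.sample(workers, k):  # stage 2
            assignments.append((business, worker, rng.choice(QUESTIONNAIRES)))
    return assignments
```

Because each incumbent answers only one of the three questionnaires, a full occupational profile has to be assembled from the responses of several workers in the same occupation.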

___________________

1 See https://www.onetonline.org/, accessed August 12, 2020.


A data dictionary (National Center for O*NET Development, 2019b) provides additional information.

Version 23.2 of the data (National Center for O*NET Development, 2019c)2 was used by the committee to determine salaries. Both Occupation Data.xlsx and Alternate Titles.xlsx were used.

METHODS

Mapping Job Titles to Standard Occupational Classifications

O*NET is structured around the Standard Occupational Classification (SOC; BLS, 2019c). The committee’s main text has a normative list of job descriptions based on data management as practiced at university libraries. These descriptions may not match the reported standard occupation titles. The O*NET data provide a long but not exhaustive list of alternate job titles for specific occupations (Alternate Titles.xlsx). Each normative job title was matched against both the standard occupation titles and the alternate titles via probabilistic matching, using the Jaro-Winkler distance (Winkler, 1990) as implemented in the R package fuzzyjoin (Robinson, 2019). All reasonable matches (d < 0.05) were kept to obtain a list of similar occupations and their SOC codes.
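To make the matching step concrete, the following is a minimal Python sketch of a Jaro-Winkler distance and a threshold filter. It is not the committee's code, which used R and fuzzyjoin. The prefix weight `p = 0.1`, the lowercasing of titles, the function names, and the three candidate titles are assumptions made for illustration; the appendix states only the threshold d < 0.05.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are within this many positions of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_distance(s1: str, s2: str, p: float = 0.1) -> float:
    """Distance = 1 - similarity, with a bonus for a shared prefix (max 4 chars).

    p = 0.1 is an assumed prefix weight, not a value taken from the appendix.
    """
    sim = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return 1.0 - (sim + prefix * p * (1.0 - sim))


def fuzzy_match(query, candidates, max_dist=0.05):
    """Keep (title, soc) candidates whose distance to the query is below max_dist."""
    q = query.lower()
    return [(title, soc) for title, soc in candidates
            if jaro_winkler_distance(q, title.lower()) < max_dist]


# Illustrative candidates only; not the full Alternate Titles.xlsx list.
candidates = [("Librarian", "25-4011"), ("Librarians", "25-4021"),
              ("Historian", "19-3093")]
matches = fuzzy_match("Librarian", candidates)
```

With these inputs, `matches` keeps "Librarian" and "Librarians" and drops "Historian", whose distance is well above the threshold. A threshold as tight as 0.05 admits little more than plural forms and small spelling variations.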

Mapping SOC into Salary Ranges

Occupational Employment Statistics reports, for each SOC code, a salary range for both annual salary and hourly wages, characterized by the 25th percentile, the median, and the 75th percentile. The annual salary distributions were attached to each of the identified occupations (Table C.1), and these statistics were then collapsed to a triplet for each normative job description (Table C.2): the minimum of all observed 25th percentiles, the median of all observed medians, and the maximum of all observed 75th percentiles. No weights were applied. An alternative implementation might use employment shares to create weighted statistics. Reliability statistics were not computed, as the resulting table is meant to be indicative, not precise.
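The collapsing rule can be illustrated with a short Python sketch. The function name `collapse` and the data layout are assumptions for illustration; the input figures are the Informatician and Software Engineer rows of Table C.1, and the results agree with the corresponding rows of Table C.2.

```python
from statistics import median


def collapse(rows):
    """Collapse (p25, median, p75) tuples into one unweighted triplet:
    minimum of the 25th percentiles, median of the medians,
    maximum of the 75th percentiles.
    Returns None when no occupation was matched (an assumption of this sketch).
    """
    if not rows:
        return None
    return (
        min(r[0] for r in rows),
        median(r[1] for r in rows),
        max(r[2] for r in rows),
    )


# Annual salaries (2018, $) taken from Table C.1.
informatician = [
    (68_730, 88_740, 113_460),   # Computer Systems Analysts
    (66_410, 90_270, 117_070),   # Information Technology Project Managers
]
software_engineer = [
    (91_650, 118_370, 149_470),  # Computer and Information Research Scientists
    (79_340, 103_620, 130_460),  # Software Developers, Applications
    (85_610, 110_000, 139_550),  # Software Developers, Systems Software
]

informatician_range = collapse(informatician)
software_engineer_range = collapse(software_engineer)
```

Because the lower bound is a minimum and the upper bound a maximum, the resulting range widens as more occupations are matched to a job description, which is why broad descriptions such as Project Manager span a much wider interval than any single occupation.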

RESULTS

Table C.1 lists the annual salaries, as of 2018, by job title (median, and the 25th and 75th percentiles), for all occupations identified as having names similar to the normative descriptions in Chapter 2. Blank salaries (“NA”) indicate that no occupation code could be found in O*NET based on the normative description. Table C.2 lists the ranges, as defined above, for each of the normative descriptions (Chapter 2), based on the underlying occupations identified. Table C.3 lists the statistics associated with each of the salary categories, from low to very high. While the categories were defined ex ante, based on the experience of members of the committee, they match up well with observed median salaries in 2018.

FULL CODE AND DATA

The code and data underlying this appendix, including an exhaustive list of the committee’s edits (inclusions and exclusions) to the list of occupations, are available at https://github.com/labordynamicsinstitute/job-description-and-wages.


TABLE C.1 Annual Salaries (2018) by Job Title for Occupations

| Job Title | Title | SOC | Alternative Title | 25th Percentile ($) | Median Income ($) | 75th Percentile ($) |
|---|---|---|---|---|---|---|
| Researcher | Industrial Ecologists | 19-2041 | Researcher | 53,580 | 71,130 | 94,590 |
| Researcher | Anthropologists | 19-3091 | Researcher | 48,020 | 62,410 | 80,230 |
| Researcher | Historians | 19-3093 | Researcher | 40,670 | 61,140 | 85,700 |
| Researcher | Biofuels/Biodiesel Technology and Product Development Managers | 11-9041 | Scientist | 112,400 | 140,760 | 173,180 |
| Researcher | Mathematicians | 15-2021 | Scientist | 73,490 | 101,900 | 126,070 |
| Researcher | Chemical Engineers | 17-2041 | Scientist | 81,900 | 104,910 | 133,320 |
| Researcher | Nanosystems Engineers | 17-2199 | Scientist | 69,890 | 96,980 | 126,200 |
| Researcher | Manufacturing Engineering Technologists | 17-3029 | Scientist | 47,500 | 63,200 | 80,670 |
| Researcher | Biologists | 19-1020 | Scientist | 56,730 | 77,550 | 103,540 |
| Researcher | Biochemists and Biophysicists | 19-1021 | Scientist | 64,230 | 93,280 | 129,950 |
| Researcher | Bioinformatics Scientists | 19-1029 | Scientist | 60,250 | 79,590 | 98,040 |
| Researcher | Medical Scientists, Except Epidemiologists | 19-1042 | Scientist | 59,580 | 84,810 | 118,040 |
| Researcher | Chemists | 19-2031 | Scientist | 56,290 | 76,890 | 103,820 |
| Researcher | Hydrologists | 19-2043 | Scientist | 61,280 | 79,370 | 100,090 |
| Researcher | Remote Sensing Scientists and Technologists | 19-2099 | Scientist | 75,830 | 107,230 | 136,930 |
| Researcher | Geographers | 19-3092 | Scientist | 63,270 | 80,300 | 96,980 |
| Data Librarian | Librarians | 25-4021 | NA | 46,130 | 59,050 | 74,740 |
| Data Librarian | Library Science Teachers, Postsecondary | 25-1082 | Librarian | 56,550 | 71,560 | 90,550 |
| Data Librarian | Archivists | 25-4011 | Librarian | 38,090 | 52,240 | 71,250 |
| Metadata Librarian | Librarians | 25-4021 | NA | 46,130 | 59,050 | 74,740 |
| Metadata Librarian | Library Science Teachers, Postsecondary | 25-1082 | Librarian | 56,550 | 71,560 | 90,550 |
| Metadata Librarian | Archivists | 25-4011 | Librarian | 38,090 | 52,240 | 71,250 |
| Records Management Specialist | Librarians | 25-4021 | NA | 46,130 | 59,050 | 74,740 |
| Records Management Specialist | Library Science Teachers, Postsecondary | 25-1082 | Librarian | 56,550 | 71,560 | 90,550 |
| Records Management Specialist | Archivists | 25-4011 | Librarian | 38,090 | 52,240 | 71,250 |
| Curator | Curators | 25-4012 | NA | 39,580 | 53,780 | 72,830 |
| Curator | Archivists | 25-4011 | NA | 38,090 | 52,240 | 71,250 |
| Curator | Archeologists | 19-3091 | Curator | 48,020 | 62,410 | 80,230 |
| Research Domain Curator | Biofuels/Biodiesel Technology and Product Development Managers | 11-9041 | Scientist | 112,400 | 140,760 | 173,180 |
| Research Domain Curator | Mathematicians | 15-2021 | Scientist | 73,490 | 101,900 | 126,070 |
| Research Domain Curator | Chemical Engineers | 17-2041 | Scientist | 81,900 | 104,910 | 133,320 |
| Research Domain Curator | Nanosystems Engineers | 17-2199 | Scientist | 69,890 | 96,980 | 126,200 |
| Research Domain Curator | Manufacturing Engineering Technologists | 17-3029 | Scientist | 47,500 | 63,200 | 80,670 |
| Research Domain Curator | Biologists | 19-1020 | Scientist | 56,730 | 77,550 | 103,540 |
| Research Domain Curator | Biochemists and Biophysicists | 19-1021 | Scientist | 64,230 | 93,280 | 129,950 |
| Research Domain Curator | Bioinformatics Scientists | 19-1029 | Scientist | 60,250 | 79,590 | 98,040 |
| Research Domain Curator | Medical Scientists, Except Epidemiologists | 19-1042 | Scientist | 59,580 | 84,810 | 118,040 |
| Research Domain Curator | Chemists | 19-2031 | Scientist | 56,290 | 76,890 | 103,820 |
| Research Domain Curator | Climate Change Analysts | 19-2041 | Scientist | 53,580 | 71,130 | 94,590 |
| Research Domain Curator | Hydrologists | 19-2043 | Scientist | 61,280 | 79,370 | 100,090 |
| Research Domain Curator | Remote Sensing Scientists and Technologists | 19-2099 | Scientist | 75,830 | 107,230 | 136,930 |
| Research Domain Curator | Anthropologists | 19-3091 | Scientist | 48,020 | 62,410 | 80,230 |
| Research Domain Curator | Geographers | 19-3092 | Scientist | 63,270 | 80,300 | 96,980 |
| Research Domain Project Manager | Biofuels/Biodiesel Technology and Product Development Managers | 11-9041 | Scientist | 112,400 | 140,760 | 173,180 |
| Research Domain Project Manager | Mathematicians | 15-2021 | Scientist | 73,490 | 101,900 | 126,070 |
| Research Domain Project Manager | Chemical Engineers | 17-2041 | Scientist | 81,900 | 104,910 | 133,320 |
| Research Domain Project Manager | Nanosystems Engineers | 17-2199 | Scientist | 69,890 | 96,980 | 126,200 |
| Research Domain Project Manager | Manufacturing Engineering Technologists | 17-3029 | Scientist | 47,500 | 63,200 | 80,670 |
| Research Domain Project Manager | Biologists | 19-1020 | Scientist | 56,730 | 77,550 | 103,540 |
| Research Domain Project Manager | Biochemists and Biophysicists | 19-1021 | Scientist | 64,230 | 93,280 | 129,950 |
| Research Domain Project Manager | Bioinformatics Scientists | 19-1029 | Scientist | 60,250 | 79,590 | 98,040 |
| Research Domain Project Manager | Medical Scientists, Except Epidemiologists | 19-1042 | Scientist | 59,580 | 84,810 | 118,040 |
| Research Domain Project Manager | Chemists | 19-2031 | Scientist | 56,290 | 76,890 | 103,820 |
| Research Domain Project Manager | Climate Change Analysts | 19-2041 | Scientist | 53,580 | 71,130 | 94,590 |
| Research Domain Project Manager | Hydrologists | 19-2043 | Scientist | 61,280 | 79,370 | 100,090 |
| Research Domain Project Manager | Remote Sensing Scientists and Technologists | 19-2099 | Scientist | 75,830 | 107,230 | 136,930 |
| Research Domain Project Manager | Anthropologists | 19-3091 | Scientist | 48,020 | 62,410 | 80,230 |
| Research Domain Project Manager | Geographers | 19-3092 | Scientist | 63,270 | 80,300 | 96,980 |
| Informatician | Computer Systems Analysts | 15-1121 | NA | 68,730 | 88,740 | 113,460 |
| Informatician | Information Technology Project Managers | 15-1199 | IT Specialist | 66,410 | 90,270 | 117,070 |
| Data Wrangler | Information Technology Project Managers | 15-1199 | IT Specialist | 66,410 | 90,270 | 117,070 |
| Education Specialist | Health Educators | 21-1091 | Education Specialist | 39,800 | 54,220 | 74,660 |
| Education Specialist | Special Education Teachers, Secondary School | 25-2054 | Education Specialist | 48,630 | 60,600 | 77,820 |
| Education Specialist | Instructional Coordinators | 25-9031 | Education Specialist | 49,280 | 64,450 | 82,860 |
| Communication Specialist | Public Relations Specialists | 27-3031 | Communication Specialist | 44,490 | 60,000 | 81,550 |
| Software Engineer | Computer and Information Research Scientists | 15-1111 | Software Engineer | 91,650 | 118,370 | 149,470 |
| Software Engineer | Software Developers, Applications | 15-1132 | Software Engineer | 79,340 | 103,620 | 130,460 |
| Software Engineer | Software Developers, Systems Software | 15-1133 | Software Engineer | 85,610 | 110,000 | 139,550 |
| IT Security Specialist | Security Management Specialists | 13-1199 | NA | 52,200 | 70,530 | 94,890 |
| IT Systems Engineer | Computer and Information Systems Managers | 11-3021 | NA | 110,110 | 142,530 | 180,190 |
| IT Systems Engineer | Information Technology Project Managers | 15-1199 | IT Specialist | 66,410 | 90,270 | 117,070 |
| IT Project Manager | Computer and Information Systems Managers | 11-3021 | NA | 110,110 | 142,530 | 180,190 |
| IT Project Manager | Information Technology Project Managers | 15-1199 | IS/IT Project Manager | 66,410 | 90,270 | 117,070 |
| Project Manager | Construction Managers | 11-9021 | Project Manager | 70,670 | 93,370 | 123,720 |
| Project Manager | Architectural and Engineering Managers | 11-9041 | Project Manager | 112,400 | 140,760 | 173,180 |
| Project Manager | Managers, All Other | 11-9199 | Project Manager | 75,460 | 107,480 | 143,230 |
| Project Manager | Information Technology Project Managers | 15-1199 | Project Manager | 66,410 | 90,270 | 117,070 |
| Project Manager | Environmental Engineers | 17-2081 | Project Manager | 66,590 | 87,620 | 112,230 |
| Project Manager | Wind Energy Engineers | 17-2199 | Project Manager | 69,890 | 96,980 | 126,200 |
| Project Manager | Environmental Restoration Planners | 19-2041 | Project Manager | 53,580 | 71,130 | 94,590 |
| Project Manager | Social Science Research Assistants | 19-4061 | Project Manager | 35,450 | 46,640 | 60,830 |
| Project Manager | Remote Sensing Technicians | 19-4099 | Project Manager | 37,940 | 49,670 | 63,340 |
| Project Manager | Technical Directors/Managers | 27-2012 | Project Manager | 48,520 | 71,680 | 110,350 |
| Project Manager | Intelligence Analysts | 33-3021 | Project Manager | 57,560 | 81,920 | 107,000 |
| Senior Staff | NA | NA | NA | NA | NA | NA |
| Policy Specialist | NA | NA | NA | NA | NA | NA |
| Administrative Staff | First-Line Supervisors of Office and Administrative Support Workers | 43-1011 | NA | 42,750 | 55,810 | 71,550 |
| Administrative Staff | Executive Secretaries and Executive Administrative Assistants | 43-6011 | NA | 46,530 | 59,340 | 74,460 |
| Administrative Staff | Secretaries and Administrative Assistants, Except Legal, Medical, and Executive | 43-6014 | NA | 28,930 | 36,630 | 46,230 |
| Administrative Staff | Business Operations Specialists, All Other | 13-1199 | Administrative Assistant | 52,200 | 70,530 | 94,890 |
| Administrative Staff | Billing and Posting Clerks | 43-3021 | Administrative Assistant | 31,870 | 37,800 | 46,350 |
| Administrative Staff | New Accounts Clerks | 43-4141 | Administrative Assistant | 30,300 | 35,800 | 42,050 |
| Administrative Staff | Medical Secretaries | 43-6013 | Administrative Assistant | 29,580 | 35,760 | 43,200 |
| Facilities Manager | General and Operations Managers | 11-1021 | Facilities Manager | 65,650 | 100,930 | 157,120 |
| Facilities Manager | Administrative Services Managers | 11-3011 | Facilities Manager | 71,850 | 96,180 | 127,100 |
| Facilities Manager | Property, Real Estate, and Community Association Managers | 11-9141 | Facilities Manager | 41,210 | 58,340 | 85,120 |
| Facilities Manager | First-Line Supervisors of Housekeeping and Janitorial Workers | 37-1011 | Facilities Manager | 31,020 | 39,940 | 52,280 |
| Facilities Manager | First-Line Supervisors of Office and Administrative Support Workers | 43-1011 | Facilities Manager | 42,750 | 55,810 | 71,550 |
| Facilities Manager | First-Line Supervisors of Mechanics, Installers, and Repairers | 49-1011 | Facilities Manager | 51,430 | 66,140 | 83,980 |
| Facilities Manager | Maintenance and Repair Workers, General | 49-9071 | Facilities Manager | 29,560 | 38,300 | 50,100 |
| Data Scientist | Computer and Information Research Scientists | 15-1111 | Data Scientist | 91,650 | 118,370 | 149,470 |

NOTE: IT, information technology; SOC, Standard Occupational Classification.

TABLE C.2 Salary Ranges for Job Classifications as Defined in Chapter 2

| Job Title | 25th Percentile ($) | Median Salary ($) | 75th Percentile ($) |
|---|---|---|---|
| Administrative Staff | 28,930 | 37,800 | 94,890 |
| Communication Specialist | 44,490 | 60,000 | 81,550 |
| Curator | 38,090 | 53,780 | 80,230 |
| Data Librarian | 38,090 | 59,050 | 90,550 |
| Data Scientist | 91,650 | 118,370 | 149,470 |
| Data Wrangler | 66,410 | 90,270 | 117,070 |
| Education Specialist | 39,800 | 60,600 | 82,860 |
| Facilities Manager | 29,560 | 58,340 | 157,120 |
| Informatician | 66,410 | 89,505 | 117,070 |
| IT Project Manager | 66,410 | 116,400 | 180,190 |
| IT Security Specialist | 52,200 | 70,530 | 94,890 |
| IT Systems Engineer | 66,410 | 116,400 | 180,190 |
| Metadata Librarian | 38,090 | 59,050 | 90,550 |
| Policy Specialist | Inf | NA | -Inf |
| Project Manager | 35,450 | 87,620 | 173,180 |
| Records Management Specialist | 38,090 | 59,050 | 90,550 |
| Research Domain Curator | 47,500 | 80,300 | 173,180 |
| Research Domain Project Manager | 47,500 | 80,300 | 173,180 |
| Researcher | 40,670 | 79,945 | 173,180 |
| Senior Staff | Inf | NA | -Inf |
| Software Engineer | 79,340 | 110,000 | 149,470 |

TABLE C.3 Statistics Across Each of the Salary Categories (used in Chapter 2 of this report)

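| Relative Salary | 25th Percentile ($) | Median Salary ($) | 75th Percentile ($) | N | Missing |
|---|---|---|---|---|---|
| Low | 28,930 | 37,800 | 94,890 | 7 | 0 |
| Medium | 29,560 | 61,505 | 173,180 | 34 | 0 |
| High | 40,670 | 80,300 | 180,190 | 50 | 1 |
| Very High | 52,200 | 103,620 | 180,190 | 10 | 1 |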

REFERENCES

BLS (Bureau of Labor Statistics). 2019a. Occupational Employment Statistics. Data set. Bureau of Labor Statistics, OES Program. https://www.bls.gov/oes/home.htm.

BLS. 2019b. Survey Methods and Reliability Statement for the May 2018 Occupational Employment Statistics Survey. Bureau of Labor Statistics, OES Program. https://www.bls.gov/oes/current/methods_statement.pdf.

BLS. 2019c. 2018 Standard Occupational Classification System. https://www.bls.gov/soc/2018/major_groups.htm.

National Center for O*NET Development. 2019a. O*NET Data Collection Overview. https://www.onetcenter.org/dataCollection.html.

National Center for O*NET Development. 2019b. O*NET® 23.2 Database. Data Dictionary. O*NET Resource Center. https://www.onetcenter.org/dl_files/database/db_23_2_dictionary.pdf.

National Center for O*NET Development. 2019c. O*NET® Database Release 23.2. Data set. O*NET Resource Center. https://www.onetcenter.org/db_releases.html.

Robinson, D. 2019. Fuzzyjoin: Join Tables Together on Inexact Matching. https://github.com/dgrtwo/fuzzyjoin.

Winkler, W.E. 1990. String comparator metrics and enhanced decision rules in the Fellegi-Sunter Model of Record Linkage. Proceedings of the Section on Survey Research Methods, American Statistical Association, 354-359. https://files.eric.ed.gov/fulltext/ED325505.pdf.
