Methodology & Data Documentation
Comprehensive documentation of data sources, variable definitions, and analytical methods.
Data Sources
Primary Sources
1. Public University Salary Records (1987-2025)
Obtained through public records requests (filed under state open-records laws, commonly referred to as FOIA requests) and public salary databases. Covers faculty at public institutions across multiple states.
Coverage: ~180 institutions • Observations: 1.2M+ • Update frequency: Annual
2. IPEDS (Integrated Postsecondary Education Data System)
Institutional characteristics, enrollment, faculty counts, and aggregate salary data. Collected annually by the National Center for Education Statistics.
Coverage: All accredited institutions • Years: 1987-2025 • Source: NCES
3. Scopus Publication & Citation Data
Research productivity metrics including publications, citations, and collaboration patterns. Matched to faculty records using name disambiguation algorithms.
Coverage: 75% match rate • Years: 1996-2025 • Source: Elsevier Scopus
Supplementary Data
- Bureau of Labor Statistics (BLS) - CPI for inflation adjustment
- Carnegie Classification - Institutional classifications
- American Community Survey (ACS) - Outside option wages
- University financial reports - Endowment and revenue data
Variable Dictionary
Individual Characteristics
| Variable | Description | Unit | Source |
|---|---|---|---|
| employee_id | Unique identifier for each faculty member | Text | Public Records |
| name | Faculty member name (anonymized in public dataset) | Text | Public Records |
| position | Academic rank (Assistant, Associate, Full Professor) | Categorical | Public Records |
| field | Academic field/discipline | Categorical | University Records |
| year_phd | Year PhD was awarded | Year | Scopus/CV Data |
| experience | Years since PhD | Years | Calculated |
Compensation Variables
| Variable | Description | Unit | Source |
|---|---|---|---|
| base_salary | Annual base salary | USD | Public Records |
| total_compensation | Total compensation including benefits | USD | Public Records |
| real_salary | Inflation-adjusted salary (2019 dollars) | 2019 USD | Calculated (CPI) |
| bonus | Performance bonuses and additional payments | USD | Public Records |
| benefits | Value of fringe benefits | USD | Public Records |
Institutional Characteristics
| Variable | Description | Unit | Source |
|---|---|---|---|
| unitid | IPEDS unit identification number | Numeric | IPEDS |
| institution_name | Name of institution | Text | IPEDS |
| public_private | Public or private institution | Binary | IPEDS |
| state | State location | Text | IPEDS |
| enrollment | Total student enrollment | Count | IPEDS |
| endowment | University endowment size | USD Millions | IPEDS |
| research_ranking | Carnegie classification/ranking | Categorical | Carnegie |
Research Productivity
| Variable | Description | Unit | Source |
|---|---|---|---|
| publications | Number of publications | Count | Scopus |
| citations | Total citations | Count | Scopus |
| h_index | H-index measure | Numeric | Scopus |
| top_journal_pubs | Publications in top journals | Count | Scopus/Manual |
Analytical Methodology
Data Cleaning & Processing
- Remove duplicate records and resolve naming inconsistencies
- Filter to full-time tenure-track and tenured faculty only
- Exclude administrators with joint appointments
- Normalize salary data (9-month to 12-month equivalent where applicable)
- Winsorize extreme outliers (top/bottom 0.5%)
- Adjust for inflation using CPI-U (base year 2019)
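As a rough illustration of two of the steps above, the sketch below annualizes short-contract salaries and winsorizes at the 0.5% tails. The function names, the proportional `12 / contract_months` scaling, and the nearest-rank quantile rule are assumptions made for this example; the documentation does not specify the exact conversion factor or quantile method used.

```python
def annualize(salary, contract_months):
    """Scale a short-contract salary to a 12-month equivalent.

    Assumption: simple proportional scaling (a 9-month salary is multiplied by 12/9).
    """
    if contract_months >= 12:
        return salary
    return salary * 12 / contract_months


def winsorize(values, tail=0.005):
    """Clamp values to the [tail, 1 - tail] quantile range (nearest-rank rule)."""
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    k = int(tail * (n - 1))      # number of positions trimmed on each side
    lo, hi = ordered[k], ordered[n - 1 - k]
    return [min(max(v, lo), hi) for v in values]


# Usage on made-up data: 1,000 salaries, with the extremes pulled in to the cutoffs.
salaries = [float(s) for s in range(1, 1001)]
clean = winsorize(salaries)
twelve_month = annualize(90_000, contract_months=9)
```

Winsorizing keeps every observation but caps its value, so the sample size is unchanged, unlike trimming.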
Metric Construction
real_salary = base_salary / cpi_index * 100
All salaries are expressed in 2019 dollars using the CPI-U index, scaled so that the 2019 value equals 100
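A minimal sketch of this adjustment follows. The CPI values are placeholders, not actual BLS figures, and the `rebase` helper is a hypothetical name; it only shows how an index published on another base could be rescaled so that 2019 equals 100 before the formula is applied.

```python
def rebase(raw_cpi, base_year=2019):
    """Rescale a CPI series so that the base year equals 100."""
    base = raw_cpi[base_year]
    return {year: value / base * 100 for year, value in raw_cpi.items()}


def real_salary(base_salary, year, cpi_index):
    """Convert a nominal salary into base-year dollars."""
    return base_salary / cpi_index[year] * 100


# Placeholder index values for illustration only.
raw_cpi = {2009: 200.0, 2019: 250.0}
cpi_index = rebase(raw_cpi)                      # 2019 now equals 100
salary_2009_in_2019_dollars = real_salary(80_000, 2009, cpi_index)
```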
experience = current_year - year_phd
Years of experience calculated from PhD award date
field_premium = ln(salary_field) - ln(salary_baseline)
Log wage differential controlling for rank, experience, and institution
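The sketch below computes the raw version of this differential as a difference in mean log salaries. It is an illustration only: it does not apply the controls for rank, experience, and institution mentioned above, and the function name and sample figures are made up.

```python
import math
from statistics import mean


def field_premium(salaries_field, salaries_baseline):
    """Difference in mean log salary between a field and the baseline group."""
    log_field = [math.log(s) for s in salaries_field]
    log_base = [math.log(s) for s in salaries_baseline]
    return mean(log_field) - mean(log_base)


# Made-up salaries for two groups.
premium = field_premium([150_000, 170_000], [100_000, 110_000])
percent_gap = (math.exp(premium) - 1) * 100   # log points converted to a percentage
```

A log differential is only approximately a percentage difference; exponentiating gives the exact ratio of geometric mean salaries.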
Statistical Models
Our primary specifications use fixed effects panel regression models:
log(salary_it) = α + β₁·field_i + β₂·rank_it + β₃·experience_it +
β₄·experience²_it + γ_u + δ_t + ε_it
where i indexes faculty members, t indexes years, γ_u are fixed effects for the university u employing faculty member i in year t, and δ_t are year fixed effects
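To show what the university and year fixed effects do, the sketch below applies the two-way within transformation (subtracting university and year means) and then fits a one-regressor OLS slope. It is not the estimation code behind the results: it assumes a balanced panel with one observation per university-year cell, uses a single regressor instead of the full specification, and runs on synthetic data. All names are hypothetical.

```python
from collections import defaultdict


def two_way_within_slope(rows):
    """OLS slope of y on x after removing university and year means.

    rows: (university, year, x, y) tuples from a balanced panel with
    exactly one observation per university-year cell.
    """
    rows = list(rows)

    def demean(col):
        by_u, by_t = defaultdict(list), defaultdict(list)
        for r in rows:
            by_u[r[0]].append(r[col])
            by_t[r[1]].append(r[col])
        mean_u = {u: sum(v) / len(v) for u, v in by_u.items()}
        mean_t = {t: sum(v) / len(v) for t, v in by_t.items()}
        grand = sum(r[col] for r in rows) / len(rows)
        return [r[col] - mean_u[r[0]] - mean_t[r[1]] + grand for r in rows]

    x_tilde, y_tilde = demean(2), demean(3)
    sxx = sum(a * a for a in x_tilde)
    sxy = sum(a * b for a, b in zip(x_tilde, y_tilde))
    return sxy / sxx


# Synthetic panel: log salary depends on x with slope 0.02, plus additive
# university and year effects that the transformation should remove.
panel = []
for u in range(3):
    for t in range(4):
        x = float(u * t)                       # varies within university and year
        y = 10.0 + 0.02 * x + 0.1 * u + 0.03 * t
        panel.append((u, t, x, y))

beta = two_way_within_slope(panel)
```

With unbalanced data or several regressors, simple demeaning is no longer sufficient, and a regression package with explicit fixed-effect support would be the more appropriate tool.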
Data Quality & Limitations
Coverage Limitations
- Limited to public universities in most states (private institutions underrepresented)
- Earlier years (pre-2000) have sparser coverage
- Some states exempt faculty salaries from public disclosure
Data Quality Notes
- Name disambiguation for publication matching has a success rate of roughly 75%
- 9-month vs 12-month salary conversions may introduce measurement error
- Benefits data are available only for a subset of institutions and years
Representativeness
Despite these limitations, the dataset covers approximately 65% of all tenure-track faculty at research universities in the United States, which to our knowledge makes it the most comprehensive publicly available academic salary database.
How to Cite This Work
@article{academic_wage_2025,
title={Faculty Compensation in Higher Education: A Comprehensive Analysis},
author={Author, First and Author, Second},
journal={Journal of Higher Education},
year={2025},
url={https://yourwebsite.com}
}
Data License: This dataset is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to share and adapt the data with appropriate attribution.