Data Submission Guide¶
Version 1.1.0 (updated 2026-02-09)
- Updated version of documentation to clarify new standards
- Serogrouping for meningococcus: Report only at the state/reporting jurisdiction level as reporting at smaller geographies would likely lead to data suppression; report separately from age.
- Age Groups: Reported at the state/reporting jurisdiction level; Combined the <1 year age groups (currently 0-6 months and 6-12 months) for current diseases (measles, pertussis, meningococcus) into a single “<1 year” category.
- Removed “YTD” value as a valid option for time_unit.
- Removed monthly aggregations; only weekly aggregation of cases by MMWR week for all diseases.
- New value uses implemented:
total,unknown,unspecifiedhave specified meaning and uses,NAis only valid ifgeo_name = "international resident".
This guide provides detailed specifications for health departments participating in the US Disease Tracker project to contribute reportable disease count data. To date, this project is seeking data from jurisdictions already sending case notification data through NNDSS.
Overview¶
- Reportable Data Specifications
- Data Elements
- Data Suppression
- Data Format
- Metadata Requirements
- Validation
Reportable Data Specifications¶
Time Period¶
Start Date: December 29, 2024 (start of MMWR week 1, 2025)
End Date: Through present
Time Aggregation¶
| Disease | Weekly |
|---|---|
| Measles | ✓ |
| Pertussis | ✓ |
| Invasive Meningococcal Disease | ✓ |
Confirmation Status¶
Required case confirmation status by disease:
| Disease | Confirmation Status |
|---|---|
| Measles | Confirmed only |
| Pertussis | Confirmed and probable combined |
| Invasive Meningococcal Disease | Confirmed and probable combined |
Required Data Aggregations¶
Disease counts for different breakdowns are collected separately (e.g., age, sub-jurisdiction, disease subtype, etc.). We call these versions of the data the "data aggregations". Each data aggregation should provide complete data from December 29, 2024 through the present, with each having equivalent total counts over time (e.g., cumulative measles cases in the age group aggregation must equal cumulative measles cases in the sub-jurisdiction aggregation). See Case Classification by Time below for details on assignment of cases to time periods. Only the data aggregations listed below will be accepted; new aggregations will be added as needed for new demographics, diseases, conditions or outcomes, etc.
Measles
- Cases × week × sub-jurisdiction unit (county, planning area, sub-state region, etc.)
- Cases × week × age group × jurisdiction (state, DC, NYC, or territory)
Pertussis
- Cases × week × sub-jurisdiction unit
- Cases × week × age group × jurisdiction
Invasive Meningococcal Disease
- Cases × week × sub-jurisdiction unit
- Cases × week × age group × jurisdiction
- Cases × week × serogroup × jurisdiction
Reporting Frequency¶
Data should be reported weekly during non-emergency periods. For each data submission, complete data for all diseases should be included, even if there are no updates.
Note: During large outbreaks or public health emergencies, more frequent updates may be requested to improve situational awareness.
Case Classification by Time¶
Cases should be classified in time according to a hierarchical date algorithm.
Recommended: Use the Calculated Case Counting Date (CCCD)
See CSTE Data Standardization Guidelines for details.
The CCCD employs a hierarchy and assigns the case to the earliest of:
1. Symptom onset date
2. Clinical diagnosis date
3. Earliest specimen collection date associated with a positive lab result
4. Earliest result date for a positive lab result
5. Date first received by a public health agency
6. Date entered/record initiated
Alternative: If CCCD is not implemented, use a similar hierarchical algorithm or an existing case classification date such as Event Date in your system.
Required: Provide metadata on the algorithm used by your jurisdiction.
Time Period Assignment¶
- Weekly counts: Classify by MMWR week (see MMWR week table)
Data Lags and Incompleteness¶
- Jurisdictions should share all cases as soon as they are adjudicated as confirmed or probable, and are ready for public release
- Data from recent weeks may be incomplete
- The project team will clearly indicate provisional data through:
- Dashed lines on epidemic curves
- Asterisks and notes detailing data completeness limitations
- The project team will not censor data reported by jurisdictions
- All data will be displayed as reported
- The project team will work with jurisdictions to ensure completeness details are understood and portrayed correctly
Geographic Assignment¶
Cases should be included in aggregated counts according to their place of residence, in accordance with standard epidemiologic practice in the US (see CSTE Position Statement 11-SI-04).
Sub-jurisdiction Reporting¶
Sub-jurisdiction level reporting (below state, territory, or city level) is optimal to maximize usefulness for preparedness and response.
- Each jurisdiction should decide the geographic unit to use and provide a list of geographic units as metadata
- Individual jurisdictions will work with the project team to determine geographic granularity
Data Elements¶
For complete field definitions, data types, valid values, and detailed validation rules, see the Data Technical Specifications.
Summary of Required Fields¶
All data submissions must include the following types of information:
- Time fields: When the cases occurred (report period dates and time unit)
- Disease fields: What disease is being reported, case confirmation status, and outcome
- Geographic fields: Where the cases occurred (jurisdiction and geographic unit)
- Count field: Number of cases for this combination
- Demographic fields: Age group (for age-stratified aggregations)
Summary of Optional Fields¶
- disease_subtype: For meningococcal serogroup reporting.
For detailed specifications of each field including exact field names, data types, and valid value sets, see the Data Technical Specifications.
Age Groups¶
Age groups are defined to be relevant to both disease epidemiology and vaccine schedules. The same age groupings are used for all diseases to simplify visualizations.
| Age Group | Description |
|---|---|
| <1 y | From birth up to but not including 1 year birthday |
| 1-4 y | From 1 year birthday up to but not including 5 year birthday |
| 5-11 y | From 5 year birthday up to but not including 12 year birthday |
| 12-18 y | From 12 year birthday up to but not including 19 year birthday |
| 19-22 y | From 19 year birthday up to but not including 23 year birthday |
| 23-44 y | From 23 year birthday up to but not including 45 year birthday |
| 45-64 y | From 45 year birthday up to but not including 65 year birthday |
| >=65 y | From 65 year birthday and older |
| total | Total counts, all ages |
| unknown | Counts of individuals with unknown age |
| unspecified | Aggregated counts of individuals from age groups being suppressed |
Important
Age groups will only be shared and displayed at the jurisdiction level, not at sub-jurisdiction level, unless otherwise agreed to by individual jurisdictions.
International Residents¶
International residents (residents of countries outside the US but who had a case identified in that jurisdiction) can be included in reported data but should be:
- Designated as "international resident" using:
geo_name = "international resident"geo_unit = "NA"- Excluded from jurisdiction total counts and age group stratifications
- Excluded from displayed totals, epidemic curves, etc. for the jurisdiction
Data Suppression¶
Small Count Suppression¶
Jurisdictions should work with the project team to ensure visualized data do not risk reidentification of individual patients. In general, jurisdictions can leverage their existing policies regarding suppression of small numerators or where underlying populations are small enough to risk reidentification. All data should have suppression applied by the jurisdiction prior to submission, in accordance with their institutional policies. To ensure clear understanding of the data and transparency, jurisdictions should share applicable documentation of small count suppression policies with the project team if possible. Specific reporting rules are to be followed when data are suppressed to limit uncertainty and incompleteness in the data.
Handling Suppressed Data¶
To ensure total counts add to 100% of cases:
- All cases that cannot be assigned to specific values (i.e., counties, age groups) should be aggregated in an "unspecified" category for that variable.
- All other non-suppressed cases should be assigned to their appropriate values.
- Allocation of suppression rules and "unspecified" aggregation must be performed prior to data transfer/submission.
- Suppression rules should be shared with the project team for accurate description/footnoting.
Example
A jurisdiction with a suppression rule that requires suppression of any count <5, records 31 total cases of measles during a week. County A has 1 case, County B has 3 cases, County C and D have 12 and 15 cases, respectively. For the county-level aggregation of these counts, these should be reported as:
- Include a row for each
geo_name = "County C"andgeo_name = "County D"withcount = 12andcount = 15. - Include a row with
geo_name = "unspecified"and a value incountthat sums all suppressed county counts for this week. In this casecount = 4. - Do not include a row for
geo_name = "County A"orgeo_name = "County B".
Measles Exception¶
For measles, the project team recommends not suppressing data as a default and releasing any cases that have already been publicly released. This is to ensure comparability with counts being produced through various "web-scraping" efforts that exist, which often capture cases reported individually through press releases. Individual requests to suppress data can be discussed with the project team.
Data Format¶
No Zero Reporting¶
- Report only non-zero counts (i.e., do not include rows in data where the count equals 0)
- For example, if an age group had no reported outcome during a timeframe, no entry is required (e.g., if age group >= 65 has 0 cases of measles for that week for a state, do not include a row with
count = 0forage_group = ">=65") - If a jurisdiction had no reported counts of any disease or outcome during a timeframe, they should still submit the data, even if it has no new additional rows since the prior week's submission. This will enable us to limit missingness and uncertainty in the data.
- The database system will automatically add 0s at higher spatial aggregations
File Format¶
Data should be submitted in CSV format following the standard template structure.
Template and Example Files:
- Data submission template - Empty template with correct field structure
- Example data file - Sample data demonstrating proper format
- Jurisdiction specification example - Example format for sharing sub-jurisdiction geographies, particularly for jurisdictions reporting geographies other than county.
File Submission Requirements:
- Submit a file with all incident disease counts since December 29, 2024
- Each submission should include all updates to current and prior data, with new rows for new time periods
- Files will be date-stamped by the system for version control
- Changes to prior observation counts will be taken as revisions
File Naming: Jurisdictions should name files following the pattern:
disease_tracking_report_{jurisdiction}_{report_date}.csv
disease_tracking_report_CA-SIMULATED-EXAMPLE_2026-02-09.csv
See the Data Transfer Guide for technical details on data submission methods and the Data Technical Specifications for complete field requirements.
Metadata Requirements¶
Jurisdictions should provide accompanying metadata using the Jurisdiction Reporting Metadata Template. For specifics on metadata, see Data Technical Specifications. Metadata required includes:
- Algorithm used to classify cases in time (e.g., CCCD or alternative)
- List of substate geographic units used by jurisdiction
- Data suppression policies and rules
- Any jurisdiction-specific notes or caveats
Validation¶
All submitted data must meet validation requirements to ensure data quality and consistency.
What is validated:
- Field formats and data types
- Values are within acceptable ranges and valid value sets
- Logical consistency across fields
- Required fields are present
Resources:
- Data Technical Specifications - Complete field definitions and valid value sets
- Data dictionary (CSV) - Reference table of all fields and valid values
- Validation Rules - Comprehensive validation requirements
Values submitted must align with valid value sets. Values not in alignment may result in validation errors. The Data Transfer Guide describes what happens when validation succeeds or fails.
Questions¶
For questions about data submission requirements, contact the project team.