2022 NATIONAL AMBULATORY MEDICAL CARE SURVEY HEALTH CENTER (NAMCS HC) COMPONENT
TECHNICAL DOCUMENTATION
For Public Use Data File

Division of Health Care Statistics
National Center for Health Statistics

National Ambulatory Medical Care Survey Health Center (NAMCS HC) Component

Overview

Summary

This document provides detailed information and guidance for users of the 2022 National Ambulatory Medical Care Survey Health Center (NAMCS HC) Component public use data file. As a principal source of information on health care utilization in the United States, the NAMCS HC Component collects visit data from a nationally representative sample of U.S. federally qualified health centers (FQHCs) and FQHC look-alikes through electronic health record (EHR) data submission. The 2022 NAMCS HC Component is conducted by the National Center for Health Statistics (NCHS) and is a member of the National Health Care Surveys, a family of surveys that measure health care utilization across a variety of health care providers and settings.

Section 1 of this document includes information on the scope of the survey, the data sources, and the confidentiality protections related to the data. Section 2 contains details on the sampling process, data collection procedures, and weighting methodology used to produce national estimates on health care utilization. Section 3 provides information on the number of sampled health centers that were eligible to participate in the NAMCS HC Component and submitted data in 2022. Section 4 details the contents of the 2022 NAMCS HC Component public use data file and the edits used in the creation of the file. Section 5 contains an explanation of the procedures used to accurately produce variance estimates. NCHS presentation standards for proportions, counts, and rates, and their relation to NAMCS HC Component data, are discussed in Section 6, and the data analysis guidelines are provided in Section 7.
Section 8 provides information on item missingness, and Section 9 provides a comparison of frequencies between the NAMCS HC Component public use and restricted use data files. Section 10 provides a list of preferred reporting items for complex sample survey analysis. Section 11 provides further information on the availability of NAMCS HC Component restricted use data files in NCHS and Federal Research Data Centers. Appendix A provides unweighted frequencies for selected variables included on the public use data file.

Suggested Citation

Technical Documentation: National Center for Health Statistics. Division of Health Care Statistics. 2022 National Ambulatory Medical Care Survey Health Center (NAMCS HC) Component Public Use Data File Documentation, May 2024. Hyattsville, Maryland.

National Ambulatory Medical Care Survey Health Center (NAMCS HC) Component Data File: National Center for Health Statistics. Division of Health Care Statistics. 2022 National Ambulatory Medical Care Survey Health Center (NAMCS HC) Component public use data file. 2024. Hyattsville, Maryland.

Contact Information

Data users can find the latest information about the NAMCS HC Component on our website, at: https://www.cdc.gov/nchs/ahcd/namcs_index.htm. Data users with questions about the public use data file may email ambcare@cdc.gov or call 301-458-4600. A response to data user inquiries is generally provided in 1-2 business days.

The National Center for Health Statistics has an ambulatory health care data listserv, where updates and information about the most recent ambulatory care data (including the NAMCS HC Component) are sent out. Details on how to subscribe to the NCHS Listserv for ambulatory health care data can be found at: https://www.cdc.gov/nchs/ahcd/ahcd_listserv.htm.
Contents

Section 1 About the National Ambulatory Medical Care Survey Health Center Component
  Section 1.1 Background
  Section 1.2 Data Sources
  Section 1.3 Data Confidentiality
Section 2 Methodology
  Section 2.1 Brief Overview
  Section 2.2 Health Center Frame and Sample Design
  Section 2.3 NAMCS HC Component Public Use Data File Sample Design
  Section 2.4 Data Collection Procedures
  Section 2.5 Weighting
Section 3 Sample Size, Eligibility, and Response Rate
Section 4 Data Processing
  Section 4.1 Diagnosis Data
  Section 4.2 Patient Age
  Section 4.3 Patient Sex
  Section 4.4 Patient Race and Hispanic Ethnicity
  Section 4.5 Patient Marital Status
  Section 4.6 Visit Month and Day
Section 5 Standard Errors and Variance Estimation
  Section 5.1 Subpopulation Analysis: Subsetting Data
Section 6 Presentation Standards
Section 7 Data Analysis Guidance
  Section 7.1 Visit Weight
  Section 7.2 Guidance on Weight Normalization
    Section 7.2.1 Normalization Example
    Section 7.3.1 Normalization Example Code
  Section 7.3 SAS SUDAAN Survey Procedures
    Section 7.3.1 NEST Statement Variables
  Section 7.4 SAS Survey Procedures
  Section 7.5 R Survey Procedures
  Section 7.6 Stata Survey Procedures
Section 8 Survey Content
  Section 8.1 Demographic Item Missingness Rate
  Section 8.2 Diagnosis Item Missingness Rate
Section 9 Data Comparison
  Section 9.1 Public Use Data Files and Restricted Use Data File
Section 10 Preferred Reporting Items for Complex Sample Survey Analysis (PRICSSA) Checklist for the 2022 NAMCS HC Component Public Use Data File
Section 11 Research Data Center
Section 12 References
Appendix A Unweighted frequencies for health center visits

Section 1 About the National Ambulatory Medical Care Survey Health Center Component

Section 1.1 Background

The National Ambulatory Medical Care Survey Health Center (NAMCS HC) Component is an annual survey that provides data on health care utilization at health centers in the United States. As a part of NAMCS, the National Center for Health Statistics (NCHS) began collecting data from health centers in 2006. A separate sample of health centers was drawn in 2012 for NAMCS. In 2021, NCHS redesigned the NAMCS HC Component to collect visit data from electronic health records (EHRs) from participating health centers for the entire calendar year.

The NAMCS HC Component collects data on health center visits including information on diagnoses and patient demographics. The survey aims to provide health trends and outcomes of the U.S. population’s utilization of health centers in the following ways:

• Provide nationally representative, accurate, and reliable health care data for health centers in the United States.
• Answer key questions of interest to health care professionals, researchers, and policy makers about health care quality, use of resources, and disparities of services to population subgroups.
• Monitor national trends in health care topics for which health centers play an important role, such as mental health and substance use-related care, maternal and child health, and HIV-related care.
• Contribute to a stronger public health foundation that helps address current and future public health threats.

In 2022, the entire sample included 324 federally qualified health centers (FQHCs) and FQHC look-alikes in the 50 U.S. states and the District of Columbia that used an EHR system. Out of the entire sample, 104 health centers were included in the primary sample and 220 health centers made up the reserve sample. Ultimately, 255 health centers were contacted and 64 health centers agreed to participate and provided visit data from their EHRs. Out of the 64 responding health centers, 26 continued participation from 2021 and 38 health centers were newly recruited in 2022. For more detailed information regarding the sample frame, see Section 2.2.

Overall, 5,640,370 health center visits were collected from the 64 responding health centers. Of these, 282,017 health center visits were selected to create the 2022 NAMCS HC Component public use data file.

Section 1.2 Data Sources

The NAMCS HC Component receives data from EHR systems. Participating health centers submit EHR data, which contain an unlimited number of medical diagnosis and procedure codes, laboratory and medication data, and unstructured clinical notes. However, the public use data file includes only diagnosis variables and demographic information.
The NAMCS HC Component accepts EHR data in the format of HL7 CDA® R2 Implementation Guide: National Health Care Surveys Release 1, DSTU Release 1.2 - US Realm (http://www.hl7.org/implement/standards/product_brief.cfm?product_id=385). However, some EHR vendors are not able to format their data in the HL7 CDA format as specified in the National Health Care Surveys Implementation Guide. Health centers using those vendors were instead able to submit their EHR data as custom extracts, which contained many (but not all) of the data elements extracted via the above implementation guide.

Section 1.3 Data Confidentiality

NCHS and its agents take the security and confidentiality of the NAMCS HC Component public use data file very seriously. Strict laws have been implemented to establish minimum Federal standards for safeguarding the privacy of individually identifiable health information. Assurance of confidentiality is provided to all health centers according to Section 308(d) of the Public Health Service Act [42 United States Code 242m(d)]. Strict procedures according to Section 3572 of the Confidential Information Protection and Statistical Efficiency Act (44 U.S.C. 3561-3583) are utilized to prevent disclosure of personally identifiable information in NAMCS HC Component data. All information that could identify a participating health center is confidential, is seen only by persons associated with the NAMCS HC Component, and is not disclosed or released to others for any other purpose.

Prior to the release of the public use data file, NCHS conducts extensive disclosure risk analysis to minimize the chance of inadvertent disclosure. As a result, selected characteristics and/or data elements may have been omitted or masked on the public use data file to minimize the potential risk of disclosure. Masking was performed in such a way as to cause minimal impact on the data. See Section 4: Data Processing for more information on which data elements in the public use data file were impacted.
The protocol for the NAMCS HC Component has been approved by the NCHS Research Ethics Review Board since the survey’s establishment (2006).

Section 2 Methodology

Section 2.1 Brief Overview

The 2022 NAMCS HC Component used a national probability sample of health centers to collect data on visits to develop the public use data file. The 2022 NAMCS HC Component public use data file sample was designed to allow for nationally representative estimates of visits at health centers in the United States.

Section 2.2 Health Center Frame and Sample Design

The 2022 NAMCS HC Component identified a targeted universe of FQHCs and FQHC look-alikes in the 50 U.S. states and the District of Columbia that provide direct ambulatory care and use an EHR system at one or more delivery sites. Health centers that were fully or partially funded by the Health Resources and Services Administration (HRSA) were considered for inclusion. Health centers were deemed ineligible if they:

• Did not have an EHR system
• Did not provide healthcare services to the general U.S. population, or only provided care to special institutionalized populations, such as those in prisons, nursing homes, homeless shelters, etc.
• Only provided dental services
• Were located on a military installation or outside of the 50 U.S. states and the District of Columbia

To create the sampling frame and draw the sample, NCHS worked with HRSA to use a nationally representative database that contains a list of all health centers in the United States. The database contained 1,482 health centers for the 2022 NAMCS HC Component. To create the sampling frame from this database, ineligible health centers were removed. This included 64 health centers that did not meet the inclusion criteria described above and 149 health centers that were included in the 2021 sample. This process yielded a sampling frame of 1,269 eligible health centers (1,482 - 64 - 149).
In 2021, a stratified random sample of 50 FQHCs and FQHC look-alikes was drawn as the primary sample, along with a reserve sample of 100 health centers. The 2022 NAMCS HC Component sample was initially expanded by adding 60 health centers to the 50 health centers from the 2021 sample, resulting in 110 FQHCs and FQHC look-alikes making up the 2022 NAMCS HC Component sample. However, only 54 of the added health centers were ultimately fielded because of budget constraints: six randomly selected health centers were removed from the sample in four strata. In 2022, an additional 120 health centers were selected for the reserve sample (Williams et al., 2023).

Ultimately, 255 health centers were contacted to participate in the 2022 NAMCS HC Component, which includes 64 respondents and 191 eligible non-respondents. The 64 participating health centers include 26 health centers from the 2021 sample and 38 health centers from the 2022 sample. Weighting was conducted to produce health center-level and visit-level estimates. Data were collected for 100% of visits from the sampled health centers via EHR submission.

Section 2.3 NAMCS HC Component Public Use Data File Sample Design

While the NAMCS HC Component restricted use data file includes every health center (HC) visit record submitted to the NAMCS HC Component for the survey year, the 2022 NAMCS HC Component public use data file consists of a 5% sample of NAMCS HC Component visit data. This 5% sample of NAMCS HC Component records was selected for the public use data file instead of the full listing of records to decrease disclosure risk and increase efficiency for data users when conducting statistical analyses. In 2022, the NAMCS HC Component collected 5,640,370 visit records.

Stratified systematic sampling was used to select the public use data file sample of health center visits.
A targeted number of records was determined by taking 5% of the total health center visit records (n=282,017). The sampling interval was the inverse of the percent of submitted EHRs targeted for inclusion in the subsample. The sampling interval used to select the public use data file records in the 2022 NAMCS HC Component was 1/0.05, or 20.

Within each estimation stratum, participating health centers were randomly ordered. Within each health center, visits were then sorted by the following variables:

Visit Week > Day of Week

Once sorted, visits were serially numbered in each estimation stratum. Next, from the ordered array of HC records in stratum $h$, visits were selected for the public use data file sample if the assigned “array sequential” numbers were the nearest integer greater than or equal to:

$R_h + \mathrm{Int(EHR)}_h \times k$

Where:
$R_h$ = random number between 0 and $\mathrm{Int(EHR)}_h$
$k = 0, 1, 2, 3, \ldots$
$\mathrm{Int(EHR)}_h$ = sampling interval

Section 2.4 Data Collection Procedures

In 2022, health centers submitted EHR data in one of two ways, either directly from the health center’s EHR system or as a custom extract, as mentioned above in Section 1.2. Once data were collected, several steps were required for data processing. Specifications for checking, configuring, and transmitting the data files were developed by NCHS. Once NCHS received the data files, they were processed to harmonize data from the two data sources. All records from participating health centers’ EHRs were brought into the restricted database, and those records were then collapsed so that a given patient could only have one record (called a visit in the PUF) per day at a given health center.

Section 2.5 Weighting

Weighting was conducted to produce health center-level and visit-level estimates, and to account for sampling probabilities and nonresponse.
Only visit-level weights are included in the public use data file, and users are only able to produce visit-level estimates with this file.

Health center-level data were collected via self-completed forms from participating health centers. All 2022 health center visits were collected from the sampled health centers via electronic files from their EHR systems. Participating health centers submitted data for all visits that occurred during the 2022 calendar year. While the 2022 NAMCS HC Component restricted use data file includes all (100%) of the visit records sent, the public use data file includes a 5% sample of those records, as described in Section 2.3.

All health center visit data collected for 2022 were used to develop weights. To produce visit-level weights, health center-level weights were first developed and smoothed. The visit-level weights were then developed for the restricted use file, which includes all visits from participating health centers. These visit weights were formulated as the final health center weight multiplied first by the health center’s actual annual number of visits made for medical care and then by a partial nonresponse adjustment factor. Visit weights for all visits were then smoothed before they were finalized.

Because the public use data file only contains a 5% sample of all visits submitted in the 2022 NAMCS HC Component, visit weights for visits included on the public use data file were adjusted accordingly. This ensures that weighted estimates from the restricted use file and the public use data file sum to approximately the same total number of weighted visits at health centers in the survey year.

Variance estimation procedures for weighted estimates are described further in Section 5, with coding examples in Section 7 and comparisons of weighted estimates between the restricted and public use data files in Section 9.
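The systematic selection rule described in Section 2.3 can be sketched in a few lines of Python. This is an illustration only, not NCHS production code: the function name, the toy stratum names and record counts, and the fixed random seed are all assumptions, and the sketch assumes records have already been sorted and serially numbered from 1 within each stratum.

```python
import math
import random


def systematic_sample(n_records, interval, rng):
    """Return the 1-based serial numbers picked by systematic sampling.

    A serial number is selected when it equals the nearest integer greater
    than or equal to R + interval * k, for k = 0, 1, 2, ...
    """
    # Random start R in (0, interval]; 1 - random() avoids a start of exactly 0.
    start = interval * (1.0 - rng.random())
    selected = []
    k = 0
    while True:
        serial = math.ceil(start + interval * k)
        if serial > n_records:
            break
        selected.append(serial)
        k += 1
    return selected


rng = random.Random(2022)  # fixed seed so the illustration is repeatable
interval = 20              # 1 / 0.05, the interval stated in Section 2.3

# Hypothetical strata with toy record counts (not taken from the survey).
stratum_sizes = {"stratum_A": 1000, "stratum_B": 250}
sample = {name: systematic_sample(n, interval, rng)
          for name, n in stratum_sizes.items()}

for name, picks in sample.items():
    print(name, "selected", len(picks), "of", stratum_sizes[name])
```

Because every stratum uses the same interval, each stratum contributes roughly 5% of its records, which is why the public use file ends up close to, but not exactly at, 5% of all submitted visits.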
Section 3 Sample Size, Eligibility, and Response Rate

All 255 health centers that were contacted for participation were eligible to participate in the survey. Ultimately, 64 health centers participated in the 2022 NAMCS HC Component, yielding a response rate of 25.1%. As described in Section 2.2, 54 health centers in 2022 were added to the 50 health centers selected in 2021, for a total of 104 health centers in the 2022 NAMCS HC Component sample. Against this target of recruiting 104 health centers to participate in the 2022 NAMCS HC Component, 64 (61.5% of the target) ultimately agreed.

A health center was considered a full respondent if it provided data for at least six months of the survey year. Of the 64 participating health centers that were included in the 2022 NAMCS HC restricted use data file, all provided at least six months of data. Therefore, all health centers were selected to create the public use data file. From the 64 health centers, 5% of all records were selected for the public use data file. Overall, 282,017 health center visits were selected. Table 3.1 presents the number of health centers, visits, and response rates for the 2022 NAMCS HC Component.

Table 3.1 Number of health centers, visits, and unweighted response rates, NAMCS HC Component, 2022

TOTAL                       Health Centers    Visits       Unweighted Response Rate* (%)
Restricted Use Data File    64                5,640,370    25.1
Public Use Data File        64                282,017      N/A

Note: N/A is not applicable.
*Response rate was calculated using the American Association for Public Opinion Research (AAPOR) Response Rate 1 formula. The percentage is the number of eligible respondents and partial respondents (N=64) divided by the number of eligible respondents, partial respondents, eligible non-respondents, and not contacted health centers (N=255).
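The percentages quoted in this section follow directly from the counts given above. The short check below recomputes them; the variable names are illustrative only, and the counts are the ones stated in the text.

```python
contacted = 255            # eligible health centers contacted
respondents = 64           # health centers that provided visit data
target = 50 + 54           # 2021 primary sample plus 2022 additions = 104

restricted_visits = 5_640_370
public_use_visits = 282_017

response_rate = 100 * respondents / contacted
share_of_target = 100 * respondents / target
public_use_share = 100 * public_use_visits / restricted_visits

print(f"Unweighted response rate: {response_rate:.1f}%")      # 25.1%
print(f"Share of recruitment target: {share_of_target:.1f}%")  # 61.5%
print(f"Public use share of visits: {public_use_share:.1f}%")  # 5.0%
```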
Section 4 Data Processing

The data included in the public use data file underwent additional processing to prepare them for release. Suppression rules such as masking were applied for some records to protect patient confidentiality. Other items were either top-coded or bottom-coded in accordance with NCHS confidentiality requirements; this is noted for specific data items outlined in this section. Imputation was not conducted for data elements with missing values prior to creation of the 2022 NAMCS HC Component public use data file.

Section 4.1 Diagnosis Data

In the 2022 NAMCS HC Component, diagnosis data from participating health centers were submitted in three different diagnosis coding systems: International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM); International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM); and SNOMED Clinical Terms (SNOMED CT). In the creation of a harmonized and integrated database, the ICD-9-CM and SNOMED CT diagnosis codes were translated to ICD-10-CM, where applicable. Translation from ICD-9-CM and SNOMED CT to ICD-10-CM was the only modification to the diagnosis codes.

On the public use data file, medical diagnosis codes were limited to ICD-10-CM diagnosis codes. An ICD-10-CM code can have a maximum of 7 characters and is organized by chapters from A to Z. For the 2022 NAMCS HC Component public use data file, ICD-10-CM codes have been truncated to four characters to minimize disclosure risks. While the codes have been truncated, the diagnosis codes are never updated or revised to a different code that would result in a change to the original diagnosis for a visit. To maintain integrity of the data, any codes that appear to be invalid are kept as is. Duplicate 4-character ICD-10-CM codes were removed for each unique visit on the public use data file.
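As a rough illustration of the truncation and de-duplication described above, the following sketch processes the diagnosis codes of a single visit. It is not the NCHS processing code. It assumes codes are stored without the decimal point, the function name and sample codes are hypothetical, the `max_codes` parameter mirrors the DX1 through DX30 limit described in this section, and the further two-character truncation of rare diagnoses is not modeled.

```python
def prepare_diagnoses(codes, keep_chars=4, max_codes=30):
    """Truncate each code, drop duplicates (first occurrence wins), cap the list."""
    seen = set()
    result = []
    for code in codes:
        short = code[:keep_chars]  # codes that look invalid are kept as is
        if short and short not in seen:
            seen.add(short)
            result.append(short)
    return result[:max_codes]


# Hypothetical visit with five submitted codes (stored without decimal points).
visit_codes = ["E119", "E1165", "I10", "E119", "J45909"]
print(prepare_diagnoses(visit_codes))  # ['E119', 'E116', 'I10', 'J459']
```

Note that two distinct full-length codes can collapse to the same 4-character code after truncation, which is why de-duplication is applied after truncation rather than before.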
Although visits collected from health center EHR systems could have had an unlimited number of diagnosis records, diagnosis codes were limited to 30 unique codes per visit (variables DX1 through DX30) in the public use data file, which captured 96.6% of diagnoses recorded at visits included on the public use data file. Rarity of diagnoses was assessed and those deemed rare were truncated to two characters.

At least one diagnosis code is listed in 62.0% of all visits. Six health centers did not provide any condition codes that could be translated to ICD-10-CM, and therefore do not have any visits that include at least one condition code in DX1-DX30. Of the 58 health centers that provided any codes that translated to ICD-10-CM, 74.4% of their visits have at least one diagnosis code in the public use data file.

Section 4.2 Patient Age

Patient age is present for all visits in the 2022 NAMCS HC Component public use data file. Age was top-coded at the 99.5th percentile; thus, visits by patients aged 88 and older were top-coded to 88 years.

Section 4.3 Patient Sex

Patient sex is missing in 0.1% of records on the 2022 NAMCS HC Component public use data file.

Section 4.4 Patient Race and Hispanic Ethnicity

Patient race is missing from 24.5% of records on the 2022 NAMCS HC Component public use data file. Eleven health centers are missing patient race for all visit records. Excluding the 11 health centers with complete missingness, 17.7% of visits are missing patient race.

Patient ethnicity is missing from 12.9% of records on the 2022 NAMCS HC Component public use data file. Ten health centers are missing patient ethnicity for all visit records. Excluding the 10 health centers with complete missingness, 6.6% of visits are missing patient ethnicity.

Section 4.5 Patient Marital Status

Marital status of patients is included in the public use data file but is missing from 20.9% of records overall.
Ten health centers are missing marital status from all visit records. For the remaining 54 health centers, marital status is missing from 15.2% of visits.

Section 4.6 Visit Month and Day

Exact dates are not provided on the NAMCS HC Component public use data file. Instead, only the month and day of the week of health center visits are provided.

Section 5 Standard Errors and Variance Estimation

Standard error is primarily a measure of the sampling variability that occurs by chance because only a sample of health centers is in the NAMCS HC Component, rather than the entire universe of health centers. Standard errors and other measures of sampling variability are best determined by using a statistical software package that takes into account the sample designs of surveys to produce such measures. See Section 7 for further guidance on how to apply weights and calculate standard errors to generate national estimates.

Section 5.1 Subpopulation Analysis: Subsetting Data

For data users who may have a subpopulation of interest, such as a particular age group or sex, a domain analysis must be performed, also known as a subgroup or subpopulation analysis. For some variance estimation methods, the entire set of data containing the appropriate weights for a particular survey year must be used to obtain the correct variance estimates. Therefore, it is not recommended to drop observations from the dataset when subsetting data, as it may affect variance estimation. Instead, the estimation procedure must indicate which records are in the subgroup of interest. For example, when examining female patients aged 35 and over, the entire dataset of examined individuals (both male and female patients of all reported ages) must be read into the statistical software program.
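To illustrate why records outside the subgroup are flagged rather than dropped, the sketch below computes a weighted subgroup total and its standard error with a textbook with-replacement ("ultimate cluster") formula on toy data. It is a simplified illustration under stated assumptions, not the variance method NCHS uses: the field names (`stratum`, `psu`, `weight`, `sex`, `age`) and all values are hypothetical, and survey software should be used for real analyses.

```python
import math
from collections import defaultdict


def domain_total_and_se(records, in_domain):
    """Weighted domain total and standard error (ultimate-cluster formula).

    Records outside the domain contribute zero but stay in the calculation,
    so sampling units with no domain members still count in their stratum.
    """
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        flag = 1.0 if in_domain(r) else 0.0
        psu_totals[r["stratum"]][r["psu"]] += r["weight"] * flag

    total, variance = 0.0, 0.0
    for psus in psu_totals.values():
        z = list(psus.values())
        n = len(z)
        total += sum(z)
        if n > 1:
            mean = sum(z) / n
            variance += n / (n - 1) * sum((v - mean) ** 2 for v in z)
    return total, math.sqrt(variance)


# Toy data: one stratum, three sampling units; unit 3 has no women aged 35+.
records = [
    {"stratum": 1, "psu": 1, "weight": 10.0, "sex": "F", "age": 40},
    {"stratum": 1, "psu": 1, "weight": 10.0, "sex": "M", "age": 50},
    {"stratum": 1, "psu": 2, "weight": 20.0, "sex": "F", "age": 36},
    {"stratum": 1, "psu": 2, "weight": 5.0, "sex": "F", "age": 20},
    {"stratum": 1, "psu": 3, "weight": 15.0, "sex": "M", "age": 30},
]


def is_target(r):
    return r["sex"] == "F" and r["age"] >= 35


# Recommended: keep every record and flag the subgroup.
full_total, full_se = domain_total_and_se(records, is_target)

# Not recommended: drop records first, then analyze what is left.
dropped = [r for r in records if is_target(r)]
drop_total, drop_se = domain_total_and_se(dropped, lambda r: True)

print(full_total, round(full_se, 2))  # 30.0 17.32
print(drop_total, round(drop_se, 2))  # 30.0 10.0
```

Both approaches give the same point estimate, but dropping records removes the third sampling unit from the variance calculation and understates the standard error in this toy example.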
The STAT and DOMAIN statements in the SAS survey procedure, SUBPOPN in SAS-callable SUDAAN, or comparable statements in other programs (subset in R; subpop or over in Stata) must be used to indicate the subgroup of interest (i.e., females aged 35 and over). Depending on the specifications of a data user’s statistical software of choice, an indicator variable created by the data user prior to the procedure may facilitate the identification of the subgroup in the procedure statements.

Section 6 Presentation Standards

Data users should be aware of the reliability of survey estimates, particularly smaller estimates. NCHS has published standards for the assessment of reliability and presentation of proportions (or percentages) (https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf) and for the presentation of rates and counts (https://www.cdc.gov/nchs/data/series/sr_02/sr02-200.pdf). For presentation or publication of count estimates using data from the NAMCS HC Component, we recommend visit estimates be rounded to the nearest thousand. These presentation standards apply to products published by NCHS. If, according to the presentation standards, an estimate is not reliable, data users should examine the confidence interval carefully before using the estimate.

Section 7 Data Analysis Guidance

The following section provides an overview of how data users can derive visit estimates and compute variances to produce standard errors, using statistical software tools such as SAS, R, and Stata. For the NAMCS HC Component public use data file, SAS-callable SUDAAN software procedures are used for survey analysis; however, SAS/STAT software procedures beginning with SURVEY may also be used. R relies on the “survey” package to conduct survey data analysis, whereas Stata uses the “svy” command.
SAS/SUDAAN, R, and Stata users can use these procedures to conduct statistical analysis on data from the 2022 NAMCS HC Component public use data file.

Additionally, this section provides guidance on normalizing visit weights to account for complete missingness for analytic variables of interest. The guidance provides data users a framework for normalizing weights for data analysis. Data users should always investigate whether any variables of interest have complete missingness at health centers in the 2022 NAMCS HC Component public use data file.

Section 7.1 Visit Weight

The visit weight is a critical component in the process of producing estimates from sample data and its use should be clearly understood by all data users. The statistics contained on the public use data file reflect only a sample of visits (a 5% sample of the NAMCS HC Component data collected from participating health centers), not a complete count of all visits that occurred in the United States. Each health center visit record in the public use data file represents one patient visit in the sample of 282,017 visits. To obtain national estimates from the 5% sample, each record is assigned an inflation factor called the “visit weight” (variable VISWT in the public use data file). By summing the visit weights in the VISWT variable across the 282,017 health center visits for 2022, the data user can obtain the estimated total of 109,087,913 health center visits (standard error of 19,896,515 health center visits) made in the United States in 2022.

Note that estimates of health center visits produced from the 2022 NAMCS HC Component public use data file may differ somewhat from those estimates produced from the 2022 NAMCS HC Component restricted use data file. This is because of adjustments required for the public use data files as part of the disclosure risk mitigation process. Certain variables were masked on some records for confidentiality purposes.
Other variables were top and/or bottom coded in accordance with NCHS confidentiality requirements. The table in Section 9 compares aggregate unweighted and weighted data for selected variables between the 2022 NAMCS HC Component public use data file and restricted use data file.

Section 7.2 Guidance on Weight Normalization

Some health centers did not provide certain data elements for any of their visits in the 2022 data year. In certain situations, some health centers needed to produce custom extracts of their records to conform with the format needed for processing as specified in the HL7 CDA Implementation Guide (IG). Therefore, not all data elements were required of health centers providing custom extracts. In other situations, even for health centers providing data via the IG, certain variables were incomplete for all visits at specific health centers. Regardless of the reason for missingness, data users must identify health centers that have complete missingness for specific analytic variable(s) of interest, and exclude those health centers’ visits from analysis. Additionally, if certain health centers’ visits must be excluded, users must normalize the weight variable (VISWT) so that the sum of weights of visits in the analysis is equal to the sum of weights of all visits in the 2022 NAMCS HC Component public use data file.

Steps for a complete case analysis:

1. Identify health centers to be included in your analysis:
   a. Identify variable(s) required for your analysis.
   b. Identify health centers that are missing values at ALL visits for at least one variable of interest from Step 1a.
   c. Exclude all visits from the health centers identified in Step 1b as having complete missingness for at least one variable of interest.

NOTE: This process does not eliminate all missingness; rather, it eliminates complete missingness of a specific variable for a specific health center.
Health centers that are included may still have some visits with missing information for the variables of interest, but this process removes visits at health centers that did not provide any information for variables of interest.

2. Normalize weights if only a subset of health centers’ visits is included:
   a. Calculate the sum of weights for all visits in the public use data file. In 2022, the sum of weights (VISWT) is 109,087,913.
   b. Calculate the sum of weights for visits at health centers to be included in your analysis.
   c. Calculate the normalization factor [X] by dividing the sum of weights for all visits in the survey (from Step 2a) by the sum of weights for visits in your analysis (from Step 2b). The value of X from this calculation is the factor you will use to normalize your weights.
      i. X = [sum of all visit weights] / [sum of visit weights in your analysis]
         1. NOTE: X will always be greater than 1 when at least one health center is excluded.
   d. Create a new weight variable for visits in your analysis by multiplying the original weight variable by your normalization factor (X).
      i. NEW_WT = VISWT * X
   e. Use NEW_WT for your analysis in place of VISWT.

NOTE: If you add or remove variables from your analysis, or you develop a new research question and analysis, you must conduct these steps again to ensure that you: 1) capture visits from health centers providing data on your variables of interest, and 2) normalize those visits’ weights accordingly.
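The normalization steps above can be sketched in SAS so that the factor is computed and applied without retyping values by hand. This is a minimal illustration under stated assumptions, not the NCHS implementation: the data set toy_visits, its values, the macro variables exclude_hc, sum_total, sum_subset, and norm_factor, and the single excluded health center are all hypothetical. Only the variable names HCID_S, VISWT, and RACE and the NEW_WT formula come from this document. The sketch keeps every record and flags the analytic subset instead of deleting rows, on the assumption that the full file may still be wanted when variances are estimated.

```sas
/* Toy data for illustration only: made-up health center IDs and weights,
   NOT values from the NAMCS HC public use data file. */
data toy_visits;
    input HCID_S VISWT RACE;
    datalines;
1 100 1
1 150 2
2 200 .
2 250 .
3 300 1
;
run;

/* Step 1 (hypothetical): health center 2 is missing RACE at every visit.
   For a real analysis, take the list from Table 7.1, separated by commas. */
%let exclude_hc = 2;

/* Steps 2a-2b: sum of all weights and sum of weights in the analytic subset.
   FORMAT=BEST32. keeps PROC SQL from shortening large sums when it writes
   them into macro variables. */
proc sql noprint;
    select sum(VISWT) format=best32. into :sum_total trimmed
        from toy_visits;
    select sum(VISWT) format=best32. into :sum_subset trimmed
        from toy_visits
        where HCID_S not in (&exclude_hc);
quit;

/* Step 2c: normalization factor X, kept unrounded */
%let norm_factor = %sysevalf(&sum_total / &sum_subset);
%put NOTE: sum_total=&sum_total sum_subset=&sum_subset X=&norm_factor;

/* Steps 2d-2e: create NEW_WT and flag the visits that belong to the analysis */
data toy_analysis;
    set toy_visits;
    NEW_WT = VISWT * &norm_factor;
    if HCID_S not in (&exclude_hc) then include = 1;
    else include = 2;
run;

/* Check: the sum of NEW_WT over included visits should match sum_total,
   apart from floating-point rounding. */
proc sql;
    select sum(NEW_WT) format=best32. as sum_new_wt
        from toy_analysis
        where include = 1;
quit;
```

To adapt the sketch, replace toy_visits with the name under which the public use data file was read in and set exclude_hc to the HCID_S values that apply to the variables in the analysis.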
Table 7.1 Variables that contain health centers with complete missingness in the 2022 NAMCS HC public use data file

Variable Name   Variable Description                   HCID_S to exclude
DX1-DX30        Diagnoses 1-30                         22, 26, 42, 46, 60, 62
ETHNICITY       Patient Hispanic ethnicity             4, 11, 12, 18, 20, 23, 25, 30, 47, 63
MARITAL         Marital status                         4, 11, 12, 18, 20, 23, 25, 30, 47, 63
RACE            Patient race                           4, 11, 12, 18, 20, 23, 25, 29, 30, 47, 63
RACERETH        Combined race and ethnicity variable   4, 11, 12, 18, 20, 23, 25, 29, 30, 47, 63

Section 7.2.1 Normalization Example

The example below shows how estimates of visits with a mental health disorder, by race, differ when the weights in the 2022 NAMCS HC public use data file are normalized and when they are not. It illustrates how to normalize weights when assessing complete missingness for two variables on the public use data file (DX1 and RACE).

Before following the steps for a complete case analysis, it is helpful to assess the unweighted and weighted number of visits for all 64 health centers in the public use data file, as shown in Table 7.2. There are 282,017 visits in the public use data file representing 109,087,913 health center visits.

Table 7.2 Weighted and unweighted number of visits in the 2022 NAMCS HC Component public use data file

             Visits at all health centers (N=64)
Unweighted   282,017
Weighted     109,087,913

In this example, assume the user wants to assess visits with a first-listed diagnosis (DX1) of a mental health disorder, stratified by race (RACE), using the 2022 NAMCS HC Component public use data file. For the purposes of this example, a mental health disorder was classified as any ICD-10-CM code in the Mental, Behavioral and Neurodevelopmental disorders chapter (F01-F99).
Please note that in this public use data file, when DX1 is missing, all DX1-DX30 variables will be missing, so whether assessing first-listed or any-listed diagnosis, we only need to assess complete missingness for DX1.

First, the user must identify all health centers that have complete missingness in either the race (RACE) or first-listed diagnosis (DX1) variables (or both) from Table 7.1 above. In 2022, 17 health centers have complete missingness in the DX1 or RACE variables. Health centers 22, 26, 42, 46, 60, and 62 are missing DX1 at all visits. Health centers 4, 11, 12, 18, 20, 23, 25, 29, 30, 47, and 63 are missing RACE at all visits. Therefore, 47 health centers make up the subset of data to analyze first-listed mental health diagnoses by race.

The normalization factor X should be calculated by dividing the sum of all visit weights (109,087,913) by the sum of visit weights from the 47 health centers included in this example (74,065,859). The normalization factor is (109,087,913/74,065,859) or approximately 1.47. The normalization factor is used to create a new weight variable, which for this example is calculated as NEW_WT=VISWT*(1.47).

After calculating the normalization factor and creating a new weight variable, the data user should apply the new visit weight variable to the subset of visits at the 47 health centers included in this example. The total sum of weights in the analytic subset (sum of NEW_WT at health center visits to be included) should be equal to the total sum of weights for all visits at all health centers in the NAMCS HC public use data file, as shown in Table 7.2.

At the 47 health centers identified for inclusion in this example, we identified visits with a first-listed mental health ICD-10-CM diagnosis and race information. We then produced unweighted and weighted estimates (using the normalized NEW_WT variable) of visits with a first-listed mental health diagnosis at health centers in 2022.
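The arithmetic behind the factor can be written out as a check. The figures are the ones quoted in this example and in Table 7.3. The notation is introduced here only for illustration: w_i is the VISWT value on visit record i, and A is the set of visits at the 47 included health centers. The four-decimal value of X is computed from the rounded totals printed in this section.

```latex
\[
X \;=\; \frac{\sum_{i} w_i}{\sum_{i \in A} w_i}
  \;=\; \frac{109{,}087{,}913}{74{,}065{,}859} \;\approx\; 1.4729,
\qquad
\sum_{i \in A} X\, w_i \;=\; X \sum_{i \in A} w_i \;=\; \sum_{i} w_i \;=\; 109{,}087{,}913 .
\]
\[
1.47 \times 74{,}065{,}859 \;\approx\; 108{,}876{,}813 \;\neq\; 109{,}087{,}913,
\qquad
7{,}888{,}090 \times X \;\approx\; 11{,}617{,}975 .
\]
```

If the printed totals are taken at face value, multiplying by the two-decimal value 1.47 would leave the normalized weights about 211,000 short of the full-file total, so it seems safer to carry the unrounded ratio into NEW_WT. The last product uses the non-normalized weighted numerator from Table 7.3 and reproduces the normalized weighted numerator reported there.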
These estimates are detailed in Table 7.3 for users to replicate. Please note, normalization of weights at the subset of visits to be included only impacts the weighted numerator and weighted denominator estimates; the unweighted counts and weighted percentage will not change in the same subset of visits due to weight normalization.

Table 7.3 Visits with a first-listed mental health diagnosis, with race and diagnosis information, in the 2022 NAMCS HC public use data file

                           Overall             Non-Normalized subset                      Normalized subset
Analysis                   Overall Data File   Subset without Normalization implemented   Subset with Normalization correctly implemented
Number of health centers   64                  47                                         47
Unweighted numerator       23,642              21,151                                     21,151
Unweighted denominator     282,017             211,452                                    211,452
Weight used                VISWT               VISWT                                      NEW_WT (1)
Weighted numerator         8,958,206           7,888,090                                  11,617,975
Weighted denominator       109,087,913         74,065,859                                 109,087,913
Weighted Percent (SE)      8.21 (1.69)         10.65 (1.58)                               10.65 (1.58)

(1) As described in Section 7.2.1, NEW_WT = VISWT * 1.47, where 1.47 is the calculated normalization factor.

In the first column of Table 7.3, the data are neither subset nor analyzed with a normalized visit weight. The weighted numerator underestimates the weighted number of visits with a first-listed mental health diagnosis and race, which also results in an underestimated weighted percent. In the second column, the data are subset to exclude health centers with complete missingness, but the normalized visit weight is not used. This further underestimates the weighted number of visits with a first-listed mental health diagnosis. Additionally, because the second column uses the subset of health centers with a non-normalized visit weight, the weighted denominator does not add up to the total number of visits in the public use data file.
The last column displays the correct analysis using the subset of health centers and the normalized weight variable. Using a subset of health centers and normalizing their weights produces a higher weighted numerator than either using all health centers with the non-normalized weight or using the subset of health centers with the non-normalized weight. In the overall analysis in Table 7.3, visits at health centers with complete missingness for diagnosis data are automatically treated as non-mental health visits, even though there is not enough information to discern whether there was a mental health diagnosis. Consequently, the overall weighted numerator is an undercount of visits with a first-listed mental health diagnosis at health centers in the United States.

In short, normalizing weights may produce different estimates when analyzing the 2022 NAMCS HC Component public use data file, depending on the number of health centers that are included in the analysis. Without excluding health centers with complete missingness and subsequently normalizing visit weights, data users will underreport counts and rates for their analysis of interest. Data users should consider the full scope of their research question to make decisions on the subset of health centers to include, and how normalizing visit weights will impact the calculation of estimates. Data users should reference Table 7.1 to ensure the correct health centers are excluded in their analysis when normalizing weights in a complete case analysis.

Section 7.3.1 Normalization Example Code

For further assistance in implementing normalization on the 2022 NAMCS HC Component public use data file, the following SAS code replicates the normalization example described in Section 7.2.1.
*STEP 1;
*Identify the variables of interest for your analysis;
*Research Question: First listed diagnoses of mental health by race;
*Variables needed: DX1, RACE;
*In this example you will need to subset the data where DX1 is missing or RACE is missing according to Table 7.1;
*DX1 is missing where HCID_S in (22, 26, 42, 46, 60, 62);
*RACE is missing where HCID_S in (4, 11, 12, 18, 20, 23, 25, 29, 30, 47, 63);

*STEP 2;
*Calculate two sums: 1. the sum of weights at all HCs in the original datafile and 2. the sum of weights at HCs to be included in your analysis;

*1. Overall sum of weights;
proc sql;
  create table sum_total as
  select sum(viswt) as sum_total
  from /*[full datafile]*/;
quit;
proc print data=sum_total; run;

*2. Subset sum of weights;
proc sql;
  create table sum_subset as
  select sum(viswt) as sum_subset
  from /*[full datafile]*/
  where HCID_S not in (4, 11, 12, 18, 20, 22, 23, 25, 26, 29, 30, 42, 46, 47, 60, 62, 63);
quit;
proc print data=sum_subset; run;

*STEP 3;
*Create two new variables for your analysis: 1. a normalized weight, using sum_total and sum_subset calculated in STEP 2 and 2. an inclusion indicator that flags records at health centers not excluded in STEP 2;
data /*new_datafile*/;
  set /*[full datafile]*/;
  new_wt=viswt*(/*[value of sum_total]/[value of sum_subset]*/);
  if HCID_S not in (4, 11, 12, 18, 20, 22, 23, 25, 26, 29, 30, 42, 46, 47, 60, 62, 63) then include=1;
  else include=2;
  if "F01"