We conducted a community-based survey to collect and examine social determinants of health and their association with obesity prevalence among a sample of Hispanics and non-Hispanic whites living in a rural community in the Southeastern United States. To ensure a balanced sample of both ethnic groups, we designed an area-stratified random sampling procedure involving three stages: (1) division of the sampling area into non-overlapping strata based on Hispanic household proportion using GIS software; (2) random selection of the designated number of Census blocks from each stratum; and (3) random selection of the designated number of housing units (i.e., survey participants) from each Census block.
Utilizing a standardized area-based randomized sampling approach allowed us to recruit an ethnically balanced sample while conducting door-to-door surveys in a rural, community-based study. Future community-based research should consider integrating area-based randomized sampling using tools such as GIS, particularly when trying to reach disparate populations.
Sampling for cross-sectional survey studies can be probability-based or non-probability-based. Probability-based sampling (e.g., random sampling) requires a defined population in which each possible unit has a known probability of being selected. Non-probability sampling methods (e.g., convenience sampling) have no known inclusion probabilities, producing bias and unbalanced sample representation [6,7,8,9,10,11,12,13,14]. Simple random sampling can also pose a problem for studies conducting research in minority populations. Because this method draws from the whole population of interest, it often under-represents minority groups. Stratified random sampling increases sample representativeness by dividing the study population into strata based on characteristics that are of interest to the researcher. Random samples are then drawn from each stratum to ensure adequate sampling of all groups. This approach reduces sampling bias, allows researchers to estimate within- and between-strata outcomes, and improves the accuracy of results [15, 16].
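The contrast between simple and stratified random sampling can be illustrated with a minimal Python sketch. The population below is hypothetical (10% minority, 90% majority) and is not drawn from the study data; the function names are ours and serve only as an illustration.

```python
import random
from collections import Counter


def simple_random_sample(population, n, rng):
    """Draw n units from the whole population, each with equal probability."""
    return rng.sample(population, n)


def stratified_random_sample(population, key, n_per_stratum, rng):
    """Group units into strata with `key`, then draw an independent
    random sample of up to n_per_stratum units from each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


rng = random.Random(0)

# Hypothetical population: 100 minority units and 900 majority units.
population = [("minority", i) for i in range(100)] + \
             [("majority", i) for i in range(900)]

srs = simple_random_sample(population, 100, rng)
strat = stratified_random_sample(population, lambda u: u[0], 50, rng)

# Group counts in each sample.
srs_counts = Counter(group for group, _ in srs)
strat_counts = Counter(group for group, _ in strat)
print("simple random:", dict(srs_counts))
print("stratified:   ", dict(strat_counts))
```

Under simple random sampling the minority count varies from draw to draw around its population share, whereas the stratified draw fixes the number of units taken from each group by design.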
Once the number of blocks from each group was determined, the CASPER toolkit developed by the CDC was used to generate random samples. We used an add-on program developed for ArcGIS by the CDC to generate random samples from a polygon layer that represents the sampling area and the non-overlapping clusters within it. In our study, the four strata served as the sampling areas and Census blocks as the non-overlapping clusters, with selection accounting for the number of housing units within each cluster. The random sampling procedure was repeated four times, once for each stratum. Figure 2 shows the 44 random blocks selected from the entire study area using this approach.
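The block and housing-unit selection stages can be sketched in plain Python. This is not the CDC toolkit or its algorithm: it assumes that "accounting for the number of housing units" means selecting blocks with probability proportional to their housing-unit counts, and all block identifiers, counts, and sample sizes below are hypothetical.

```python
import random


def select_blocks(housing_units_by_block, n_blocks, rng):
    """Sequentially draw distinct blocks; at each draw, the selection
    probability is proportional to housing units among remaining blocks.
    Blocks with no housing units are excluded up front."""
    remaining = {b: n for b, n in housing_units_by_block.items() if n > 0}
    chosen = []
    for _ in range(min(n_blocks, len(remaining))):
        ids = list(remaining)
        weights = [remaining[b] for b in ids]
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen


def select_housing_units(n_units_in_block, n_select, rng):
    """Draw housing-unit indices (0-based) within one block without replacement."""
    k = min(n_select, n_units_in_block)
    return sorted(rng.sample(range(n_units_in_block), k))


rng = random.Random(42)

# Hypothetical strata: block id -> number of housing units.
strata = {
    "stratum_1": {"A1": 40, "A2": 15, "A3": 0, "A4": 60, "A5": 25},
    "stratum_2": {"B1": 10, "B2": 35, "B3": 20, "B4": 55},
    "stratum_3": {"C1": 80, "C2": 12, "C3": 30, "C4": 0, "C5": 45},
    "stratum_4": {"D1": 22, "D2": 18, "D3": 50, "D4": 9},
}

# Repeat the procedure once per stratum, as in the text.
selection = {}
for name, blocks in strata.items():
    chosen = select_blocks(blocks, 3, rng)
    selection[name] = {b: select_housing_units(blocks[b], 5, rng) for b in chosen}

for name, picks in selection.items():
    print(name, picks)
```

Drawing within each stratum separately guarantees that every stratum contributes its designated number of blocks, regardless of how housing units are distributed across the study area.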
Census blocks selected for recruitment. Map of the 44 census block groups randomly selected in Albertville, AL using an area-stratified random sampling approach. Blue outline indicates block group selected. Map developed using licensed ArcGIS software.
GIS has been used in recent population-based studies to facilitate community-based research, for example by targeting areas for program planning or ensuring random sampling of survey respondents. This method has been particularly useful in rural, developing countries [20,21,22, 29]. Defar et al. used GIS methods to conduct a cross-sectional survey in Ethiopia on maternal and child health care utilization, using a two-stage process similar to that of the current study, while Wampler et al. used GIS to facilitate the random selection of households in specific areas of Haiti for water quality research. Akin to the results here, a study that compared simple random sampling to stratified sampling by zip code and census tract found that area-based stratified sampling ensured a higher representativeness of Hispanic residents in audits of tobacco retailers in an urban area. In the public health realm, Lafontaine et al. developed a spatial random sampling method to conduct neighborhood built environment audits and concluded that this approach was more cost- and time-effective. Likewise, the approach used here allowed us to recruit our Hispanic sample more efficiently.
It is important to note that the number of blocks selected for randomization and recruitment was chosen on the basis of feasibility but was nonetheless arbitrary. While this resulted in a balanced sample for our study, this result will likely not translate to other settings. Since stratification by design results in subgroups that are over- or under-represented compared to the overall population, taking the actual population weights of each census tract into account when selecting blocks would have been more appropriate. Since the ultimate goal in sampling is to select a study sample that is representative of the population, applying population sampling weights and using model-based approaches such as raking prior to analysis are essential. Raking adjusts the sampling weights by forcing the survey totals to match proportions in the known population.
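A minimal sketch of raking (iterative proportional fitting) is given below in Python. It assumes two categorical margins, ethnicity and sex; the sample counts and population target proportions are hypothetical and are not values from this study. Weights start at 1 and are rescaled one margin at a time until the weighted sample proportions match the targets.

```python
def rake(records, margins, n_iter=100, tol=1e-9):
    """Return one weight per record so that weighted category proportions
    match the target proportions in `margins` for every field.

    records: list of dicts holding categorical fields
    margins: {field: {category: target_proportion}}, proportions sum to 1
    """
    weights = [1.0] * len(records)
    total = float(len(records))
    for _ in range(n_iter):
        max_change = 0.0
        for field, targets in margins.items():
            # Current weighted total for each category of this field.
            sums = {}
            for rec, w in zip(records, weights):
                sums[rec[field]] = sums.get(rec[field], 0.0) + w
            # Rescale so each category's weighted total hits its target.
            for i, rec in enumerate(records):
                factor = targets[rec[field]] * total / sums[rec[field]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:
            break
    return weights


# Hypothetical balanced sample: 50 Hispanic and 50 non-Hispanic respondents.
cells = [("Hispanic", "F", 35), ("Hispanic", "M", 15),
         ("non-Hispanic", "F", 25), ("non-Hispanic", "M", 25)]
records = [{"ethnicity": e, "sex": s} for e, s, n in cells for _ in range(n)]

# Hypothetical population proportions (assumed for illustration only).
margins = {
    "ethnicity": {"Hispanic": 0.30, "non-Hispanic": 0.70},
    "sex": {"F": 0.50, "M": 0.50},
}

weights = rake(records, margins)


def weighted_share(field, category):
    """Weighted proportion of the sample falling in one category."""
    num = sum(w for r, w in zip(records, weights) if r[field] == category)
    return num / sum(weights)


print(round(weighted_share("ethnicity", "Hispanic"), 4))
print(round(weighted_share("sex", "F"), 4))
```

After raking, a deliberately balanced sample such as ours can be analyzed with weights that restore the population's ethnic composition rather than the 50/50 split imposed by the design.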
Overall, we developed a standardized area-based randomized sampling protocol that allowed us to successfully recruit an ethnically balanced sample while conducting door-to-door community surveys. Minimizing selection bias in community-based surveys can be difficult; however, advances in technological tools such as GIS provide novel approaches to address these biases. Based on our results, we advocate the integration of area-based randomized sampling in future community-based research, particularly when trying to reach disparate populations.
Estimates of the true prevalence of COVID-19 in a population can be made by random sampling and pooling of RT-PCR tests. Here I use simulations to explore how experiment sample size and degrees of sample pooling impact precision of prevalence estimates and potential for minimizing the total number of tests required to get individual-level diagnostic results.