Supplemental Lesson: Creating Table 1, Table 2, and References

Applied Epidemiologic Reporting in Quarto

Overview

In this lesson, you will learn how to create Table 1 (Descriptive Statistics), Table 2 (Analytic Results), and properly format references in a reproducible Quarto workflow.

These are essential components of epidemiologic manuscripts and will be required for your final project.


Learning Objectives

By the end of this lesson, students will be able to:

  • Create a publication-style Table 1
  • Create a regression-based Table 2
  • Interpret tables in plain language
  • Export clean tables from R
  • Add and format references in Quarto

Required Packages

library(tidyverse)
library(gtsummary)
library(gt)
library(broom)

Example Dataset

library(NHANES)

df <- NHANES %>%
  select(Age, Gender, BMI, BPSysAve, Diabetes) %>%
  drop_na()
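
Because drop_na() removes every row that has a missing value in any selected column, it can help to count what is lost before building tables. The sketch below uses base R only and a small made-up data frame (toy is hypothetical, not NHANES); the column names mirror the lesson, but the values are invented for illustration.

```r
# Hypothetical toy data (not NHANES) with a few missing values
toy <- data.frame(
  Age      = c(34, 51, NA, 62, 45),
  BMI      = c(27.1, NA, 31.4, 29.8, 24.3),
  Diabetes = factor(c("No", "Yes", "No", NA, "No"))
)

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy))

# Rows that are complete on every column (the rows drop_na() would keep)
n_complete <- sum(complete.cases(toy))
n_dropped  <- nrow(toy) - n_complete

print(missing_per_column)
cat("Rows kept:", n_complete, " Rows dropped:", n_dropped, "\n")
```

Running the same two checks on your own data frame before drop_na() shows how much of the sample the complete-case restriction removes.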

Table 1: Descriptive Statistics

Table 1 summarizes the characteristics of your study population.

Step 1: Create Table 1

table1 <- df %>%
  tbl_summary(
    by = Diabetes,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  ) %>%
  add_p()

table1
Characteristic   No, N = 7,749¹   Yes, N = 733¹   p-value²
Age              39 (20)          59 (15)         <0.001
Gender                                            0.078
    female       3,922 (51%)      346 (47%)
    male         3,827 (49%)      387 (53%)
BMI              27 (7)           33 (8)          <0.001
BPSysAve         117 (17)         128 (19)        <0.001
¹ Mean (SD); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

Interpretation of Table 1

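  • Continuous variables (Age, BMI, BPSysAve): mean (SD) within each diabetes group
  • Categorical variables (Gender): n (%) within each diabetes group
  • p-values: test for a difference between the two groups; as the table footnote shows, add_p() used the Wilcoxon rank sum test for continuous variables and Pearson’s chi-squared test for categorical variables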
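
To see what each cell of Table 1 represents, the summaries can be reproduced by hand. This is a minimal base R sketch on a made-up data frame (toy is hypothetical, not NHANES); it is meant to show the arithmetic behind mean (SD), n (%), and a rank-based p-value, not to reproduce how gtsummary computes them internally.

```r
# Hypothetical toy data (not NHANES); values are made up for illustration
toy <- data.frame(
  Diabetes = factor(c("No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes")),
  Gender   = factor(c("female", "male", "female", "male",
                      "female", "male", "male", "male")),
  BMI      = c(24, 26, 28, 22, 31, 35, 29, 33)
)

# Continuous variable: mean (SD) within each Diabetes group
bmi_mean <- as.vector(tapply(toy$BMI, toy$Diabetes, mean))
bmi_sd   <- as.vector(tapply(toy$BMI, toy$Diabetes, sd))
bmi_cell <- sprintf("%.0f (%.0f)", bmi_mean, bmi_sd)
names(bmi_cell) <- levels(toy$Diabetes)

# Categorical variable: n and column percentage within each Diabetes group
counts <- table(toy$Gender, toy$Diabetes)
pcts   <- prop.table(counts, margin = 2) * 100

# Wilcoxon rank sum test comparing BMI between the two groups
bmi_p <- wilcox.test(BMI ~ Diabetes, data = toy)$p.value

print(bmi_cell)
print(counts)
print(round(pcts))
print(bmi_p)
```

Note that the percentages are column percentages: they sum to 100% within each diabetes group, which matches the layout produced by by = Diabetes.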

Export Table 1

table1 %>%
  as_gt() %>%
  gt::gtsave("table1.html")

Table 2: Regression Results

Table 2 presents results from regression models.

Step 1: Fit Model

model <- lm(BMI ~ Age + Gender + Diabetes, data = df)
summary(model)

Call:
lm(formula = BMI ~ Age + Gender + Diabetes, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-18.350  -4.699  -1.120   3.416  54.564 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 24.102908   0.182482 132.084   <2e-16 ***
Age          0.077325   0.003742  20.663   <2e-16 ***
Gendermale   0.014369   0.144809   0.099    0.921    
DiabetesYes  3.921536   0.267997  14.633   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.661 on 8478 degrees of freedom
Multiple R-squared:  0.09376,   Adjusted R-squared:  0.09344 
F-statistic: 292.4 on 3 and 8478 DF,  p-value: < 2.2e-16

Step 2: Create Table 2

table2 <- tbl_regression(model) %>%
  bold_labels()

table2
Characteristic   Beta   95% CI        p-value
Age              0.08   0.07, 0.08    <0.001
Gender
    female
    male         0.01   -0.27, 0.30   >0.9
Diabetes
    No
    Yes          3.9    3.4, 4.4      <0.001
Abbreviation: CI = Confidence Interval

Interpretation of Table 2

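  • Coefficients (Beta) show direction and magnitude: for example, each additional year of age is associated with a 0.08 kg/m² higher BMI, holding gender and diabetes status constant
  • Confidence intervals show the precision of each estimate; an interval that includes 0, such as the one for gender (-0.27, 0.30), is consistent with no association
  • p-values show the strength of statistical evidence against a coefficient of 0
  • Rows without estimates (female, No) are the reference categories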
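
The three columns of a regression table can also be pulled directly from a fitted model. The sketch below uses the built-in mtcars data purely as a stand-in for the lesson's model, and base R functions only; the object names (fit, results) are illustrative.

```r
# Built-in mtcars data, used only as a stand-in for the lesson's model
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Beta column: point estimates
est <- coef(fit)

# 95% CI columns: lower and upper confidence limits
ci <- confint(fit, level = 0.95)

# p-value column: taken from the coefficient table in summary()
pvals <- summary(fit)$coefficients[, "Pr(>|t|)"]

results <- data.frame(
  term      = names(est),
  beta      = round(est, 2),
  ci_low    = round(ci[, 1], 2),
  ci_high   = round(ci[, 2], 2),
  p_value   = signif(pvals, 2),
  row.names = NULL
)

print(results)
```

Comparing a data frame like this against the tbl_regression() output is a quick way to confirm that the formatted table reports what you expect.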

Optional: Logistic Regression

log_model <- glm(Diabetes ~ Age + BMI + Gender,
                 data = df,
                 family = binomial)

tbl_regression(log_model, exponentiate = TRUE)
Characteristic   OR     95% CI        p-value
Age              1.06   1.05, 1.06    <0.001
BMI              1.10   1.09, 1.11    <0.001
Gender
    female
    male         1.42   1.20, 1.67    <0.001
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
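
Setting exponentiate = TRUE reports odds ratios instead of log-odds coefficients. The sketch below shows that transformation by hand with base R, using the built-in mtcars data as a stand-in. It uses Wald intervals from confint.default() as a simplifying assumption; gtsummary may compute its intervals with a different method, so the limits need not match exactly.

```r
# Built-in mtcars data as a stand-in: model the odds of a manual transmission
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Coefficients from glm() are on the log-odds scale
log_odds <- coef(fit)

# Wald 95% confidence limits, also on the log-odds scale (an assumption here)
wald_ci <- confint.default(fit, level = 0.95)

# Exponentiate estimates and limits to get odds ratios
or_table <- data.frame(
  term      = names(log_odds),
  OR        = exp(log_odds),
  ci_low    = exp(wald_ci[, 1]),
  ci_high   = exp(wald_ci[, 2]),
  row.names = NULL
)

# The intercept is not an odds ratio of interest, so drop it
or_table <- or_table[or_table$term != "(Intercept)", ]

print(or_table)
```

An odds ratio above 1 indicates higher odds of the outcome per one-unit increase in the predictor, and an odds ratio below 1 indicates lower odds; a confidence interval that excludes 1 corresponds to p < 0.05.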

Export Table 2

table2 %>%
  as_gt() %>%
  gt::gtsave("table2.html")

Writing About Tables

Example:

“Mean BMI was higher among participants with diabetes than among those without (33 vs. 27 kg/m²; p < 0.001). In the adjusted logistic regression model, age (OR 1.06; 95% CI: 1.05, 1.06) and BMI (OR 1.10; 95% CI: 1.09, 1.11) were significantly associated with diabetes status.”
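
In a reproducible workflow, the numbers in a sentence like this can be generated from the model object rather than typed by hand, so the text stays in sync if the data change. This is a minimal base R sketch using the built-in mtcars data as a stand-in; the wording of the sentence and the object names are illustrative only.

```r
# Built-in mtcars data as a stand-in for the lesson's model
fit <- lm(mpg ~ wt, data = mtcars)

# Pull the estimate, confidence limits, and p-value for one term
est <- coef(fit)[["wt"]]
ci  <- confint(fit)["wt", ]
p   <- summary(fit)$coefficients["wt", "Pr(>|t|)"]

# Report very small p-values as a threshold rather than many decimals
p_text <- if (p < 0.001) "p < 0.001" else sprintf("p = %.3f", p)

# Build the results sentence from the model output
sentence <- sprintf(
  "Each additional 1,000 lb of weight was associated with a %.1f change in miles per gallon (95%% CI: %.1f, %.1f; %s).",
  est, ci[[1]], ci[[2]], p_text
)

cat(sentence, "\n")
```

In a Quarto document, the same object can be placed in running text with inline R code, so the sentence is rebuilt every time the document renders.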


Adding References in Quarto

Step 1: Create a .bib File

Example references.bib:

@misc{nhanes,
  title  = {National Health and Nutrition Examination Survey},
  author = {{Centers for Disease Control and Prevention}},
  year   = {2023}
}
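
Software can be cited the same way. As a sketch, base R can write its own BibTeX entry to a file; the file name packages.bib and the citation key rcore below are hypothetical choices, not names the lesson requires.

```r
# Get the citation for R itself and give it a citation key
# ("rcore" is a hypothetical key chosen for this example)
r_citation <- citation()
r_citation$key <- "rcore"

# Convert the citation to BibTeX and write it to a .bib file
# ("packages.bib" is a hypothetical file name)
bib_file <- "packages.bib"
writeLines(toBibtex(r_citation), bib_file)

# Show what was written
cat(readLines(bib_file), sep = "\n")
```

The entry can then be cited in text as [@rcore], provided the file is also listed under bibliography in the YAML header or its contents are copied into references.bib.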

Step 2: Add to YAML

bibliography: references.bib  # path is relative to the .qmd file
csl: apa.csl                  # the CSL style file must be saved alongside the .qmd file

Step 3: Cite in Text

Example:

This dataset comes from NHANES [@nhanes].


Step 4: References Section

Add this heading at the end of the document:

## References

Quarto will automatically generate the reference list beneath it when the document is rendered.


Key Takeaways

  • Table 1 describes your sample
  • Table 2 presents analytic results
  • Use gtsummary for clean outputs
  • Always interpret results in plain language
  • Use Quarto citation tools for references

Practice Activity

  1. Create Table 1 using your project dataset
  2. Create Table 2 using a regression model
  3. Export both tables
  4. Write 2–3 sentences interpreting each table
  5. Add at least one reference using .bib

Conclusion

Tables and references are essential for scientific communication. Mastering these skills will help you produce clear, professional, and reproducible public health reports.