Supplemental Lesson: Creating Table 1, Table 2, and References

Applied Epidemiologic Reporting in Quarto

Overview

In this lesson, you will learn how to create Table 1 (Descriptive Statistics), Table 2 (Analytic Results), and properly format references in a reproducible Quarto workflow.

These are essential components of epidemiologic manuscripts and will be required for your final project.

Learning Objectives

By the end of this lesson, students will be able to:

Create a publication-style Table 1
Create a regression-based Table 2
Interpret tables in plain language
Export clean tables from R
Add and format references in Quarto

Required Packages

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(gtsummary)

library(gt)

library(broom)

Example Dataset

library(NHANES)

df <- NHANES %>%
  select(Age, Gender, BMI, BPSysAve, Diabetes) %>%
  drop_na()

Table 1: Descriptive Statistics

Table 1 summarizes the characteristics of your study population, usually stratified by the exposure or outcome of interest (here, diabetes status).

Step 1: Create Table 1

table1 <- df %>%
  tbl_summary(
    by = Diabetes,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  ) %>%
  add_p()

table1

Characteristic    No, N = 7,749¹    Yes, N = 733¹    p-value²
Age               39 (20)           59 (15)          <0.001
Gender                                               0.078
    female        3,922 (51%)       346 (47%)
    male          3,827 (49%)       387 (53%)
BMI               27 (7)            33 (8)           <0.001
BPSysAve          117 (17)          128 (19)         <0.001
¹ Mean (SD); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

Interpretation of Table 1

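Continuous variables: reported as mean (SD) within each diabetes group
Categorical variables: reported as n (%), where the percentage is computed within each diabetes group (column percent)
p-values: compare the No and Yes groups. Note that add_p() chose the Wilcoxon rank sum test for continuous variables, which compares distributions rather than the means shown in the table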
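
To see where a p-value in Table 1 comes from, the Gender comparison can be reproduced by hand. This is a minimal sketch: it assumes add_p() ran an uncorrected Pearson chi-squared test (as the table footnote suggests), and it re-enters the counts from Table 1 instead of reading the data. The object names are illustrative.

```r
# Counts copied from Table 1 (rows: gender, columns: diabetes status)
gender_counts <- matrix(
  c(3922, 3827,   # No diabetes: female, male
    346,  387),   # Diabetes:    female, male
  nrow = 2,
  dimnames = list(Gender = c("female", "male"),
                  Diabetes = c("No", "Yes"))
)

# Pearson chi-squared test without continuity correction
# (assumed here to match what add_p() ran)
gender_test <- chisq.test(gender_counts, correct = FALSE)

gender_test$p.value  # should be close to the 0.078 shown in Table 1
```

If the result had not matched, that would be a cue to check which test add_p() actually selected before describing it in your methods section.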

Export Table 1

table1 %>%
  as_gt() %>%
  gt::gtsave("table1.html")

Table 2: Regression Results

Table 2 presents results from regression models.

Step 1: Fit Model

model <- lm(BMI ~ Age + Gender + Diabetes, data = df)
summary(model)


Call:
lm(formula = BMI ~ Age + Gender + Diabetes, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-18.350  -4.699  -1.120   3.416  54.564 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 24.102908   0.182482 132.084   <2e-16 ***
Age          0.077325   0.003742  20.663   <2e-16 ***
Gendermale   0.014369   0.144809   0.099    0.921    
DiabetesYes  3.921536   0.267997  14.633   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.661 on 8478 degrees of freedom
Multiple R-squared:  0.09376,   Adjusted R-squared:  0.09344 
F-statistic: 292.4 on 3 and 8478 DF,  p-value: < 2.2e-16

Step 2: Create Table 2

table2 <- tbl_regression(model) %>%
  bold_labels()

table2

Characteristic    Beta    95% CI         p-value
Age               0.08    0.07, 0.08     <0.001
Gender
    female        —       —
    male          0.01    -0.27, 0.30    >0.9
Diabetes
    No            —       —
    Yes           3.9     3.4, 4.4       <0.001
Abbreviation: CI = Confidence Interval

Interpretation of Table 2

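Coefficients (Beta) show direction and magnitude: the expected difference in BMI per one-unit increase in a continuous predictor, or relative to the reference level (shown with a dash in the table) for a categorical predictor, holding the other variables constant
Confidence intervals show the precision of each estimate; a 95% CI that excludes 0 corresponds to p < 0.05
p-values show the strength of evidence against the null hypothesis of no association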
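
As a worked check on how these three quantities relate, the confidence interval for diabetes can be rebuilt from the estimate and standard error in the summary(model) output. The values below are copied from that output; the variable names are illustrative.

```r
# Estimate and standard error for DiabetesYes, copied from summary(model)
beta_diabetes <- 3.921536
se_diabetes   <- 0.267997
df_resid      <- 8478   # residual degrees of freedom

# 95% CI = estimate +/- (t critical value * standard error)
t_crit   <- qt(0.975, df = df_resid)
ci_lower <- beta_diabetes - t_crit * se_diabetes
ci_upper <- beta_diabetes + t_crit * se_diabetes

round(c(estimate = beta_diabetes, lower = ci_lower, upper = ci_upper), 1)
```

In plain language: holding age and gender constant, participants with diabetes had a BMI about 3.9 units higher on average, and the interval excludes 0, consistent with p < 0.001.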

Optional: Logistic Regression

log_model <- glm(Diabetes ~ Age + BMI + Gender,
                 data = df,
                 family = binomial)

tbl_regression(log_model, exponentiate = TRUE)

Characteristic    OR      95% CI        p-value
Age               1.06    1.05, 1.06    <0.001
BMI               1.10    1.09, 1.11    <0.001
Gender
    female        —       —
    male          1.42    1.20, 1.67    <0.001
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
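
Each odds ratio is for a one-unit change in the predictor, which is often too small a step to be meaningful for age or BMI. Because the model is linear on the log-odds scale, a per-unit OR can be rescaled by raising it to a power. A small sketch using the rounded values from the table above, so the results are approximate; the helper name rescale_or is hypothetical.

```r
# Rescale an odds ratio from a 1-unit change to a k-unit change.
# Effects add on the log-odds scale, so OR_k = OR_1 ^ k.
rescale_or <- function(or_per_unit, k) {
  exp(k * log(or_per_unit))
}

rescale_or(1.06, 10)  # OR for a 10-year difference in age
rescale_or(1.10, 5)   # OR for a 5-unit difference in BMI
```

For exact values, rescale the unrounded coefficients from coef(log_model) rather than the rounded table entries.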

Export Table 2

table2 %>%
  as_gt() %>%
  gt::gtsave("table2.html")

Writing About Tables

Example:

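“Mean BMI was higher among participants with diabetes than among those without (33 vs. 27 kg/m²; p < 0.001). In the adjusted logistic regression model, older age (OR 1.06 per year; 95% CI 1.05, 1.06) and higher BMI (OR 1.10 per kg/m²; 95% CI 1.09, 1.11) were each associated with higher odds of diabetes.”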
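
Typing numbers into prose by hand invites transcription errors. One option is to build the reported values in code. A minimal base R sketch, using the Diabetes row of Table 2 as stand-in inputs; format_estimate is a hypothetical helper, not part of gtsummary.

```r
# Format an estimate and its 95% CI as text for a results sentence
format_estimate <- function(estimate, lower, upper, digits = 1) {
  fmt <- paste0("%.", digits, "f (95%% CI %.", digits, "f, %.", digits, "f)")
  sprintf(fmt, estimate, lower, upper)
}

# Stand-in values taken from Table 2 (Diabetes: Yes vs. No)
format_estimate(3.9, 3.4, 4.4)
```

In a Quarto document, a call like this can sit in inline R code so the sentence updates when the data change. gtsummary also provides inline_text() for pulling results directly from a table object.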

Adding References in Quarto

Step 1: Create a `.bib` File

Example references.bib:

@misc{nhanes,
  title={National Health and Nutrition Examination Survey},
  author={{CDC}},
  year={2023}
}

Step 2: Add to YAML

bibliography: references.bib
csl: apa.csl

Step 3: Cite in Text