Baseline Characteristics Table • esubdemo

Baseline Characteristics

Following the ICH E3 guidance, we need to summarize critical demographic and baseline characteristics of the patients in Section 11.2, Demographic and Other Baseline Characteristics.

There are many R packages that can efficiently summarize baseline information. The table1 R package is one of them.

library(esubdemo)


library(table1)
library(r2rtf)
library(haven)
library(dplyr)
library(tidyr)
library(stringr)
library(tools)

As in previous chapters, we first read the adsl dataset, which contains all the information required for the baseline characteristics table.

adsl <- read_sas("data-adam/adsl.sas7bdat")

For simplicity, we only analyze SEX, AGE and RACE in this example using the table1 R package. More details of the table1 R package can be found in the package vignettes.

The table1 R package directly creates an HTML report.

ana <- adsl %>%
  mutate(
    SEX = factor(SEX, c("F", "M"), c("Female", "Male")),
    RACE = toTitleCase(tolower(RACE))
  )

tbl <- table1(~ SEX + AGE + RACE | TRT01P, data = ana)
tbl

	Placebo (N=86)	Xanomeline High Dose (N=84)	Xanomeline Low Dose (N=84)	Overall (N=254)
SEX
Female	53 (61.6%)	40 (47.6%)	50 (59.5%)	143 (56.3%)
Male	33 (38.4%)	44 (52.4%)	34 (40.5%)	111 (43.7%)
Age
Mean (SD)	75.2 (8.59)	74.4 (7.89)	75.7 (8.29)	75.1 (8.25)
Median [Min, Max]	76.0 [52.0, 89.0]	76.0 [56.0, 88.0]	77.5 [51.0, 88.0]	77.0 [51.0, 89.0]
RACE
Black or African American	8 (9.3%)	9 (10.7%)	6 (7.1%)	23 (9.1%)
White	78 (90.7%)	74 (88.1%)	78 (92.9%)	230 (90.6%)
American Indian or Alaska Native	0 (0%)	1 (1.2%)	0 (0%)	1 (0.4%)
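
In the table above, the SEX and RACE rows show the raw variable names while the AGE row shows "Age". One plausible explanation, not verified here, is that AGE keeps a label attribute from the SAS file while factor() and toTitleCase() return new vectors without it. The table1 package provides label() to set the row header explicitly. The sketch below uses a small made-up data frame (toy, with invented values) instead of adsl.

```r
library(table1)

# Toy data standing in for adsl (hypothetical values, not study data)
toy <- data.frame(
  TRT01P = c("Placebo", "Placebo", "Active", "Active"),
  SEX = factor(c("F", "M", "F", "M"), c("F", "M"), c("Female", "Male")),
  AGE = c(70, 65, 81, 77)
)

# label() sets the row header that table1() prints for each variable
label(toy$SEX) <- "Sex"
label(toy$AGE) <- "Age (years)"

tbl_toy <- table1(~ SEX + AGE | TRT01P, data = toy)
tbl_toy
```

The same assignments could be applied to ana before calling table1() if consistent row headers are wanted in the final report.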

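The code below converts the output into a data frame that contains only ASCII characters, as recommended by regulatory agencies. It replaces every non-breaking space (code point 160) in the cells and column names with a regular space. tbl_base is used as input for r2rtf to create the final report.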

tbl_base <- tbl %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(across(
    everything(),
    ~ str_replace_all(.x, intToUtf8(160), " ")
  ))

names(tbl_base) <- str_replace_all(names(tbl_base), intToUtf8(160), " ")
tbl_base

## # A tibble: 11 × 5
##    ` `                         Placebo `Xanomeline Hi…` `Xanomeline Lo…` Overall
##    <chr>                       <chr>   <chr>            <chr>            <chr>  
##  1 ""                          "(N=86… "(N=84)"         "(N=84)"         "(N=25…
##  2 "SEX"                       ""      ""               ""               ""     
##  3 "  Female"                  "53 (6… "40 (47.6%)"     "50 (59.5%)"     "143 (…
##  4 "  Male"                    "33 (3… "44 (52.4%)"     "34 (40.5%)"     "111 (…
##  5 "Age"                       ""      ""               ""               ""     
##  6 "  Mean (SD)"               "75.2 … "74.4 (7.89)"    "75.7 (8.29)"    "75.1 …
##  7 "  Median [Min, Max]"       "76.0 … "76.0 [56.0, 88… "77.5 [51.0, 88… "77.0 …
##  8 "RACE"                      ""      ""               ""               ""     
##  9 "  Black or African Americ… "8 (9.… "9 (10.7%)"      "6 (7.1%)"       "23 (9…
## 10 "  White"                   "78 (9… "74 (88.1%)"     "78 (92.9%)"     "230 (…
## 11 "  American Indian or Alas… "0 (0%… "1 (1.2%)"       "0 (0%)"         "1 (0.…
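
To confirm that the cleaning step leaves only ASCII characters, a quick check can be written in base R. This is a minimal sketch: the vector x and the helper is_ascii() are hypothetical names introduced for illustration and are not part of the packages used above.

```r
# Hypothetical cell values containing non-breaking spaces (code point 160)
nbsp <- intToUtf8(160)
x <- c(paste0(nbsp, nbsp, "Female"), "53 (61.6%)")

# Replace each non-breaking space with a regular space
x_clean <- gsub(nbsp, " ", x, fixed = TRUE)

# TRUE for each string whose characters all have code points below 128
is_ascii <- function(s) {
  vapply(s, function(z) all(utf8ToInt(z) < 128L), logical(1), USE.NAMES = FALSE)
}

is_ascii(x)        # first element contains non-ASCII characters
is_ascii(x_clean)  # all elements are ASCII after cleaning
```

A similar check could be applied column by column to tbl_base before passing it to r2rtf.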

Next, we define the format of the output. We highlight items that were not discussed in previous chapters.

text_indent_first and text_indent_left are used to control the indentation of text. They are helpful when you need to control the white space of a long entry that wraps onto several lines, for example, “American Indian or Alaska Native” in the table.
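
Before the full table, the effect of these two arguments can be tried on a small stand-alone example. This is only a sketch: the data frame toy and the temporary output file are made up for illustration, and the indent values are copied from this chapter. To our understanding, r2rtf expresses these values in twips (1/1440 of an inch).

```r
library(r2rtf)

# Hypothetical two-row table with one long label that is likely to wrap
toy <- data.frame(
  label = c("American Indian or Alaska Native", "White"),
  n = c("1 (0.4%)", "230 (90.6%)")
)

# Write to a temporary file so the project folder is not touched
out <- tempfile(fileext = ".rtf")

body <- rtf_body(
  toy,
  col_rel_width = c(1, 1),
  text_justification = c("l", "c"),
  text_indent_first = -240, # first-line indent
  text_indent_left = 180    # left indent of the paragraph
)

write_rtf(rtf_encode(body), out)
```

Opening the file in a word processor and varying the two values is a quick way to see how wrapped lines shift relative to the first line.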

colheader1 <- paste(names(tbl_base), collapse = "|")
colheader2 <- paste(tbl_base[1, ], collapse = "|")
rel_width <- c(2.5, rep(1, 4))

tbl_base[-1, ] %>%
  rtf_title(
    "Participant Baseline Characteristics",
    "(All Participants Randomized)"
  ) %>%
  rtf_colheader(colheader1,
    col_rel_width = rel_width
  ) %>%
  rtf_colheader(colheader2,
    border_top = "",
    col_rel_width = rel_width
  ) %>%
  rtf_body(
    col_rel_width = rel_width,
    text_justification = c("l", rep("c", 4)),
    text_indent_first = -240,
    text_indent_left = 180
  ) %>%
  rtf_encode() %>%
  write_rtf("tlf/tlf_base.rtf")
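
The column header strings passed to rtf_colheader() above are ordinary character strings with cells separated by "|". The sketch below rebuilds them from made-up inputs (col_names and n_row are hypothetical stand-ins for names(tbl_base) and the first row of tbl_base) and shows the share of the table width implied by a relative width vector.

```r
# Hypothetical column names and first row, mimicking tbl_base
col_names <- c(" ", "Placebo", "Xanomeline High Dose")
n_row <- c("", "(N=86)", "(N=84)")

# One header string per header row, with "|" separating the cells
header1 <- paste(col_names, collapse = "|")
header2 <- paste(n_row, collapse = "|")
header1 # " |Placebo|Xanomeline High Dose"
header2 # "|(N=86)|(N=84)"

# Relative widths: the first column is 2.5 times as wide as each of the others
rel_width <- c(2.5, rep(1, 2))
round(rel_width / sum(rel_width), 3) # 0.556 0.222 0.222
```

Using the same rel_width vector for both header rows and the body, as in the chapter's code, keeps the columns aligned.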

In conclusion, the procedure to generate a demographic and baseline characteristics table is summarized as follows.

Step 1: Read the data set.
Step 2: Use table1::table1() to get the baseline characteristics table.
Step 3: Convert the output from Step 2 into a data frame that contains only ASCII characters.
Step 4: Define the format of the RTF table using the r2rtf R package.