Following the ICH E3 guidance, we need to summarize critical demographic and baseline characteristics of the patients in Section 11.2, Demographic and Other Baseline Characteristics.
There are many R packages that can efficiently summarize baseline information. The table1 R package is one of them.
library(esubdemo)
library(table1)
library(r2rtf)
library(haven)
library(dplyr)
library(tidyr)
library(stringr)
library(tools)
As in previous chapters, we first read the adsl dataset, which contains all the information required for the baseline characteristics table.
adsl <- read_sas("data-adam/adsl.sas7bdat")
For simplicity, we only analyze SEX, AGE and RACE in this example using the table1 R package. More details of the table1 R package can be found in the package vignettes.
The table1 R package directly creates an HTML report.
ana <- adsl %>%
  mutate(
    SEX = factor(SEX, c("F", "M"), c("Female", "Male")),
    RACE = toTitleCase(tolower(RACE))
  )
tbl <- table1(~ SEX + AGE + RACE | TRT01P, data = ana)
tbl
| | Placebo (N=86) | Xanomeline High Dose (N=84) | Xanomeline Low Dose (N=84) | Overall (N=254) |
|---|---|---|---|---|
| SEX | | | | |
| Female | 53 (61.6%) | 40 (47.6%) | 50 (59.5%) | 143 (56.3%) |
| Male | 33 (38.4%) | 44 (52.4%) | 34 (40.5%) | 111 (43.7%) |
| Age | | | | |
| Mean (SD) | 75.2 (8.59) | 74.4 (7.89) | 75.7 (8.29) | 75.1 (8.25) |
| Median [Min, Max] | 76.0 [52.0, 89.0] | 76.0 [56.0, 88.0] | 77.5 [51.0, 88.0] | 77.0 [51.0, 89.0] |
| RACE | | | | |
| Black or African American | 8 (9.3%) | 9 (10.7%) | 6 (7.1%) | 23 (9.1%) |
| White | 78 (90.7%) | 74 (88.1%) | 78 (92.9%) | 230 (90.6%) |
| American Indian or Alaska Native | 0 (0%) | 1 (1.2%) | 0 (0%) | 1 (0.4%) |
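To make the categorical cells in the table concrete, the following base R sketch recomputes "n (%)" cells on a small made-up data frame. The data values and the helper name n_pct() are hypothetical, and this is not how table1 computes its output internally. It only illustrates the arithmetic behind a cell such as 53 (61.6%), which is 53 of the 86 participants in that column.

```r
# Toy data (hypothetical values, not taken from adsl)
toy <- data.frame(
  TRT01P = c("Placebo", "Placebo", "Placebo", "Active", "Active"),
  SEX = c("F", "M", "F", "M", "M")
)

# Same recoding idea as in the chapter: codes become readable labels
toy$SEX <- factor(toy$SEX, levels = c("F", "M"), labels = c("Female", "Male"))

# Hypothetical helper: format counts as "n (pct%)" within each column
n_pct <- function(x, group) {
  n <- table(x, group)                     # counts by level and group
  pct <- prop.table(n, margin = 2) * 100   # column percentages
  out <- sprintf("%d (%.1f%%)", as.vector(n), as.vector(pct))
  matrix(out, nrow = nrow(n), dimnames = dimnames(n))
}

cells <- n_pct(toy$SEX, toy$TRT01P)
cells
```

The percentages use the column total as the denominator, which matches the layout of the table where each treatment group is a column.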
The code below converts the output into a data frame that contains only ASCII characters, as recommended by regulatory agencies. tbl_base is used as input for r2rtf to create the final report.
tbl_base <- tbl %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(across(
    everything(),
    ~ str_replace_all(.x, intToUtf8(160), " ")
  ))
names(tbl_base) <- str_replace_all(names(tbl_base), intToUtf8(160), " ")
tbl_base
## # A tibble: 11 × 5
## ` ` Placebo `Xanomeline Hi…` `Xanomeline Lo…` Overall
## <chr> <chr> <chr> <chr> <chr>
## 1 "" "(N=86… "(N=84)" "(N=84)" "(N=25…
## 2 "SEX" "" "" "" ""
## 3 " Female" "53 (6… "40 (47.6%)" "50 (59.5%)" "143 (…
## 4 " Male" "33 (3… "44 (52.4%)" "34 (40.5%)" "111 (…
## 5 "Age" "" "" "" ""
## 6 " Mean (SD)" "75.2 … "74.4 (7.89)" "75.7 (8.29)" "75.1 …
## 7 " Median [Min, Max]" "76.0 … "76.0 [56.0, 88… "77.5 [51.0, 88… "77.0 …
## 8 "RACE" "" "" "" ""
## 9 " Black or African Americ… "8 (9.… "9 (10.7%)" "6 (7.1%)" "23 (9…
## 10 " White" "78 (9… "74 (88.1%)" "78 (92.9%)" "230 (…
## 11 " American Indian or Alas… "0 (0%… "1 (1.2%)" "0 (0%)" "1 (0.…
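The character produced by intToUtf8(160) is the non-breaking space (U+00A0), which is not part of ASCII. The base R sketch below shows the same replacement on a single made-up string and checks the result. The string and the helper name is_ascii() are hypothetical and only serve as an illustration.

```r
# Non-breaking space, code point 160 (U+00A0)
nbsp <- intToUtf8(160)

# Hypothetical cell text that contains a non-breaking space
cell <- paste0("53", nbsp, "(61.6%)")

# Hypothetical helper: TRUE when every code point of a string is below 128
is_ascii <- function(x) {
  vapply(x, function(s) all(utf8ToInt(s) < 128L), logical(1), USE.NAMES = FALSE)
}

is_ascii(cell)   # the non-breaking space makes this FALSE

# Replace the non-breaking space with a regular space
clean <- gsub(nbsp, " ", cell, fixed = TRUE)
is_ascii(clean)  # TRUE after the replacement
clean
```

A check of this kind could be run on every column of tbl_base before the table is passed to r2rtf.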
Next, we define the format of the output. We highlight items that were not discussed previously. text_indent_first and text_indent_left control the indentation of text. They are helpful when you need to control the white space of long text that wraps across lines, for example “AMERICAN INDIAN OR ALASKA NATIVE” in the table.
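The indent values are easier to reason about once converted to inches. The sketch below assumes the indent arguments are expressed in twips, the RTF length unit (1 inch = 1440 twips); please confirm the unit in the r2rtf documentation. The helper names twips_to_inches() and inches_to_twips() are hypothetical.

```r
# Hypothetical helpers, assuming indents are given in twips (1 inch = 1440 twips)
twips_to_inches <- function(twips) twips / 1440
inches_to_twips <- function(inches) round(inches * 1440)

twips_to_inches(180)    # left indent of 180 twips, 0.125 inch
twips_to_inches(-240)   # first-line indent of -240 twips, about -0.167 inch
inches_to_twips(0.25)   # a quarter inch expressed in twips, 360
```

A negative first-line indent combined with a positive left indent typically produces a hanging indent, so the wrapped lines of a long label start further to the right than its first line.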
colheader1 <- paste(names(tbl_base), collapse = "|")
colheader2 <- paste(tbl_base[1, ], collapse = "|")
rel_width <- c(2.5, rep(1, 4))

tbl_base[-1, ] %>%
  rtf_title(
    "Participant Baseline Characteristics",
    "(All Participants Randomized)"
  ) %>%
  rtf_colheader(colheader1,
    col_rel_width = rel_width
  ) %>%
  rtf_colheader(colheader2,
    border_top = "",
    col_rel_width = rel_width
  ) %>%
  rtf_body(
    col_rel_width = rel_width,
    text_justification = c("l", rep("c", 4)),
    text_indent_first = -240,
    text_indent_left = 180
  ) %>%
  rtf_encode() %>%
  write_rtf("tlf/tlf_base.rtf")
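The two column header strings are plain text in which "|" separates the columns. The base R sketch below rebuilds them from a small stand-in for tbl_base so the intermediate values can be inspected. The data frame toy and its values are hypothetical, and the sketch stops before any r2rtf call.

```r
# Toy stand-in for tbl_base (hypothetical values, fewer columns)
toy <- data.frame(
  label = c("", "SEX", "  Female"),
  Placebo = c("(N=86)", "", "53 (61.6%)"),
  Overall = c("(N=254)", "", "143 (56.3%)"),
  stringsAsFactors = FALSE
)
names(toy)[1] <- " "   # the first column has a blank name, as in tbl_base

# First header row: the column names joined by "|"
colheader1 <- paste(names(toy), collapse = "|")

# Second header row: the first data row, which holds the "(N=...)" counts
colheader2 <- paste(toy[1, ], collapse = "|")

# The table body is everything except that first row
body <- toy[-1, ]

# One relative width per column: a wide label column, equal widths otherwise
rel_width <- c(2.5, rep(1, ncol(toy) - 1))

colheader1
colheader2
```

Deriving the length of rel_width from ncol() rather than hard-coding it keeps the widths in step with the number of columns if treatment groups are added or removed.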
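In conclusion, the procedure to generate a demographic and baseline characteristics table is summarized as follows.
Step 1: Read the adsl dataset into R.
Step 2: Use table1::table1() to get the baseline characteristics table.
Step 3: Convert the output from Step 2 into a data frame that contains only ASCII characters.
Step 4: Define the format of the RTF table and write the final report using r2rtf.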