Analysis Population

Following the ICH E3 guidance, we need to summarize which patients were included in each efficacy analysis in Section 11.1, Data Sets Analysed.

library(esubdemo)
library(haven) # Read SAS data
library(dplyr) # Manipulate data
library(tidyr) # Manipulate data
library(r2rtf) # Reporting in RTF format

The first step is to read the relevant datasets into R. For the analysis population table, all of the required information is stored in the ADSL dataset. We can use the haven package to read the dataset.

adsl <- read_sas("data-adam/adsl.sas7bdat")

We illustrate how to prepare a report dataset for a simplified analysis population table using the variables below:

  • USUBJID: Unique subject identifier
  • ITTFL: Intent-to-treat population flag
  • EFFFL: Efficacy population flag
  • SAFFL: Safety population flag

adsl %>%
  select(USUBJID, ITTFL, EFFFL, SAFFL) %>%
  head(4)
## # A tibble: 4 × 4
##   USUBJID     ITTFL EFFFL SAFFL
##   <chr>       <chr> <chr> <chr>
## 1 01-701-1015 Y     Y     Y    
## 2 01-701-1023 Y     Y     Y    
## 3 01-701-1028 Y     Y     Y    
## 4 01-701-1033 Y     Y     Y
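
Before counting, it can help to confirm that the flags behave as expected. The sketch below is a minimal check on a toy data frame (adsl_toy and its values are made up for illustration, not study data). It assumes the flags are coded "Y"/"N" and that there is one record per participant.

```r
# Toy stand-in for ADSL; the values are illustrative, not study data.
adsl_toy <- data.frame(
  USUBJID = c("S01", "S02", "S03", "S04"),
  ITTFL = c("Y", "Y", "Y", "Y"),
  EFFFL = c("Y", "N", "Y", "Y"),
  SAFFL = c("Y", "Y", "Y", "Y"),
  stringsAsFactors = FALSE
)

flags <- c("ITTFL", "EFFFL", "SAFFL")

# One record per participant is assumed.
stopifnot(anyDuplicated(adsl_toy$USUBJID) == 0)

# Each flag should contain only "Y" or "N"; a missing value fails the check.
flag_ok <- vapply(
  adsl_toy[flags],
  function(x) all(x %in% c("Y", "N")),
  logical(1)
)
stopifnot(all(flag_ok))

# Number of participants with flag == "Y", per population
n_yes <- colSums(adsl_toy[flags] == "Y")
n_yes
```

The same checks could be run on adsl itself by replacing adsl_toy with adsl.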

Analysis Code

With the helper function count_by, we can easily prepare the report dataset:

# Derive a randomization flag
adsl <- adsl %>% mutate(RANDFL = "Y")

pop <- count_by(adsl, "TRT01PN", "RANDFL",
  var_label = "Participants in Population"
) %>%
  select(var_label, starts_with("n_"))
pop1 <- bind_rows(
  count_by(adsl, "TRT01PN", "ITTFL",
    var_label = "Participants included in ITT population"
  ),
  count_by(adsl, "TRT01PN", "EFFFL",
    var_label = "Participants included in efficacy population"
  ),
  count_by(adsl, "TRT01PN", "SAFFL",
    var_label = "Participants included in safety population"
  )
) %>%
  filter(var == "Y") %>%
  select(var_label, starts_with("npct_"))
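
To make the shape of the result less opaque, here is a minimal sketch of what a helper like count_by might do. The function count_by_sketch, its argument names, and the toy data are hypothetical. It is not the implementation used above, and the number formatting is only inferred from the printed output.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for count_by(); written for illustration only.
count_by_sketch <- function(data, grp_name, var_name, var_label = var_name) {
  data %>%
    # Work with fixed column names inside the helper
    transmute(grp = .data[[grp_name]], var = .data[[var_name]]) %>%
    # Count participants per group and level of the variable
    count(grp, var) %>%
    # Denominator: all participants in the group
    group_by(grp) %>%
    mutate(tot = sum(n)) %>%
    ungroup() %>%
    mutate(
      pct = formatC(100 * n / tot, digits = 1, width = 5, format = "f"),
      n = formatC(n, width = 4, format = "d"),
      npct = paste0(n, " (", pct, ")")
    ) %>%
    # One row per level, one set of columns per group: n_<grp>, pct_<grp>, npct_<grp>
    pivot_wider(
      id_cols = var,
      names_from = grp,
      values_from = c(n, pct, npct)
    ) %>%
    mutate(var_label = var_label)
}

# Toy data; the values are illustrative, not study data.
toy <- data.frame(
  USUBJID = sprintf("S%02d", 1:6),
  TRT01PN = c(0, 0, 0, 54, 54, 54),
  EFFFL = c("Y", "Y", "N", "Y", "Y", "Y")
)

res <- count_by_sketch(toy, "TRT01PN", "EFFFL",
  var_label = "Participants included in efficacy population"
)
res %>% filter(var == "Y") %>% select(var_label, starts_with("npct_"))
```

The wide layout, with one column per treatment group, is what lets the later steps filter on var == "Y" and pick columns by prefix.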

Now we combine the individual rows into a single table for reporting. tbl_pop is used as the input for r2rtf to create the final report.

# Rename n_* to npct_* so that the rows of pop and pop1 share the same columns
names(pop) <- gsub("^n_", "npct_", names(pop))
tbl_pop <- bind_rows(pop, pop1)

tbl_pop %>% select(var_label, npct_0)
## # A tibble: 4 × 2
##   var_label                                    npct_0        
##   <chr>                                        <chr>         
## 1 Participants in Population                   "  86"        
## 2 Participants included in ITT population      "  86 (100.0)"
## 3 Participants included in efficacy population "  79 ( 91.9)"
## 4 Participants included in safety population   "  86 (100.0)"
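
As a quick check on how a cell is built, the placebo efficacy entry can be reproduced by hand. This sketch assumes the denominator is the number of participants randomized to the arm (86) and that the widths match the printed output. Both are inferences from the table above rather than documented behavior.

```r
n_eff <- 79L   # efficacy population, placebo arm (from the table above)
n_rand <- 86L  # participants randomized to placebo (from the table above)

# Percentage with one decimal place, padded to width 5
pct <- formatC(100 * n_eff / n_rand, digits = 1, width = 5, format = "f")

# Count padded to width 4, followed by the percentage in parentheses
npct <- paste0(formatC(n_eff, width = 4, format = "d"), " (", pct, ")")
npct
```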

Next, we define the format of the output.

rel_width <- c(2, rep(1, 3))
colheader <- " | Placebo | Xanomeline Low Dose | Xanomeline High Dose"
tbl_pop %>%
  # Table title
  rtf_title(
    "Participants Accounting in Analysis Population",
    "(All Participants Randomized)"
  ) %>%
  # First row of column header
  rtf_colheader(colheader,
    col_rel_width = rel_width
  ) %>%
  # Second row of column header
  rtf_colheader(" | n (%) | n (%) | n (%)",
    border_top = "",
    col_rel_width = rel_width
  ) %>%
  # Table body
  rtf_body(
    col_rel_width = rel_width,
    text_justification = c("l", rep("c", 3))
  ) %>%
  # Encoding RTF syntax
  rtf_encode() %>%
  # Save to a file
  write_rtf("tlf/tbl_pop.rtf")
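
Because r2rtf splits a column header string on "|", a mismatch between the number of header cells and the length of col_rel_width is an easy mistake to make. The sketch below is a small base R sanity check that could be run before building the table. It is a suggestion, not part of the r2rtf workflow itself.

```r
rel_width <- c(2, rep(1, 3))
colheader <- " | Placebo | Xanomeline Low Dose | Xanomeline High Dose"
subheader <- " | n (%) | n (%) | n (%)"

# Number of cells implied by the "|" separators in a header string
n_cells <- function(header) {
  length(strsplit(header, "|", fixed = TRUE)[[1]])
}

# Both header rows should have one cell per relative width
stopifnot(n_cells(colheader) == length(rel_width))
stopifnot(n_cells(subheader) == length(rel_width))
```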

In conclusion, the procedure for generating the population summary table shown in the example above is as follows:

  • Step 1: Read data into R, i.e., adsl.
  • Step 2: Bind the rows of counts and percentages for the ITT population, the efficacy population, and the safety population. These counts are generated by the helper function count_by.
  • Step 3: Format the output from Step 2 with r2rtf.