Mastering the Art of Data Summarization with gtsummary: Changing the Number of Tests to Adjust for with add_q()

What is gtsummary?

Before we dive into the nitty-gritty of add_q(), let’s quickly cover what gtsummary is all about. gtsummary is an R package for creating publication-ready summary tables from your data. With gtsummary, you can summarize variables, compare groups with statistical tests, and present the results in a clean, readable table. The package is built around the concept of “summary tables,” which provide a concise and informative overview of your data.

Introducing add_q()

The add_q() function adds a column of q-values, that is, p-values adjusted for multiple testing, to a gtsummary table that already contains p-values (for example, one created with add_p()). By default, add_q() applies the false discovery rate correction (method = "fdr", the Benjamini-Hochberg procedure). However, this might not always be the most appropriate approach for your specific use case, so add_q() lets you choose any adjustment method supported by stats::p.adjust().

Why adjust for multiple testing?

When you perform many tests on the same data, the chance of at least one false positive rises quickly with the number of tests. To mitigate this issue, p-values are adjusted to account for the number of tests performed. This adjustment, also known as multiplicity correction, keeps the overall error rate (the family-wise error rate or the false discovery rate, depending on the method) at the chosen level α.

Think of it like this: imagine you’re performing 20 independent tests on your data, each with a 5% chance of producing a false positive when there is no real effect. Naively, you might expect the overall false positive rate to remain at 5%. However, the probability of at least one false positive is much higher, around 64% (1 - (1 - 0.05)^20)!
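The arithmetic above is easy to check in R. This is a minimal sketch that assumes the tests are independent and that every null hypothesis is true; the variable names are only illustrative.

```r
# Family-wise error rate: probability of at least one false positive
# across m independent tests, each run at level alpha.
alpha <- 0.05
m <- 20
fwer <- 1 - (1 - alpha)^m
round(fwer, 3)
# [1] 0.642

# The same calculation for several numbers of tests
tests <- c(1, 5, 10, 20, 50)
data.frame(tests = tests, fwer = round(1 - (1 - alpha)^tests, 3))
```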

Using add_q() to change the number of tests to adjust for

Now that we’ve covered the importance of multiplicity correction, let’s dive into the syntax and usage of add_q(). The add_q() function takes two main arguments:

  • x: A gtsummary table that already contains a p-value column, typically created by calling add_p() first.
  • method: The method used to adjust for multiple testing, passed to stats::p.adjust(). Options are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", and "none". The default is "fdr".

Note that add_q() has no argument for the number of tests: it adjusts for every p-value in the table.

Let’s explore an example using the mtcars dataset, which comes bundled with R.

library(gtsummary)
library(dplyr)

mtcars %>% 
  select(mpg, cyl, disp, hp) %>% 
  tbl_summary(
    by = cyl
  ) %>% 
  add_p() %>%              # compute a p-value for each variable
  add_q(method = "fdr")    # add q-values adjusted across those p-values

In this example, we’re creating a summary table of the mtcars dataset, grouped by the number of cylinders (cyl). add_p() compares mpg, disp, and hp across the cylinder groups, and add_q() then adjusts the resulting three p-values using the False Discovery Rate (FDR) method.

Common adjustment methods

Here are some of the most commonly used adjustment methods available in add_q():

  1. holm: The Holm-Bonferroni method, which controls the family-wise error rate. It is more conservative than the FDR methods below.
  2. BH: The Benjamini-Hochberg method, which controls the False Discovery Rate (FDR). Note the uppercase spelling that p.adjust() expects.
  3. fdr: An alias for the Benjamini-Hochberg method, and the default adjustment method in add_q().
  4. none: No adjustment for multiple testing. Use with caution, as this can lead to an inflated false positive rate.
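To see how these methods differ, it can help to call stats::p.adjust() directly, since that is the function add_q() relies on. The sketch below uses made-up p-values purely for illustration; they do not come from any dataset in this article.

```r
# Hypothetical p-values from five tests (invented for illustration)
p <- c(0.001, 0.012, 0.03, 0.04, 0.2)

# Adjust the same p-values with several methods side by side
adjusted <- data.frame(
  p          = p,
  bonferroni = p.adjust(p, method = "bonferroni"),
  holm       = p.adjust(p, method = "holm"),
  BH         = p.adjust(p, method = "BH"),
  fdr        = p.adjust(p, method = "fdr")
)
adjusted

# "fdr" is an alias for "BH", so the two columns agree
identical(adjusted$BH, adjusted$fdr)
# [1] TRUE
```

For these values, the BH column is never larger than the holm column, which in turn is never larger than the bonferroni column.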

Customizing add_q() for advanced use cases

In some scenarios, you might need more control than a single call to add_q() provides. add_q() applies one adjustment method to all p-values in the table, but because it relies on stats::p.adjust(), the same calculation is easy to reproduce and customize.

One common use case is comparing adjustment methods on the same set of tests. The method argument takes a single method, not a vector, so call add_q() once per method on a table that already has p-values.

library(gtsummary)
library(dplyr)

# Build the table and its p-values once
tbl <- mtcars %>% 
  select(mpg, cyl, disp, hp) %>% 
  tbl_summary(
    by = cyl
  ) %>% 
  add_p()

# One method per call to add_q()
tbl_holm <- tbl %>% add_q(method = "holm")
tbl_fdr  <- tbl %>% add_q(method = "fdr")

In this example, we build the table and its p-values once, then create two versions of it, one with Holm-adjusted q-values and one with FDR-adjusted q-values. Applying different methods to different subsets of tests is not something add_q() does on its own; for that, the p-values have to be adjusted outside the table with p.adjust().
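Adjusting different subsets of tests with different methods can be sketched in base R with stats::p.adjust(). Everything in this example is hypothetical: the p-values, the family labels, and the method_for mapping are assumptions made for illustration, not part of gtsummary.

```r
# Hypothetical p-values grouped into two families of tests
results <- data.frame(
  variable = c("mpg", "disp", "hp", "wt", "qsec"),
  family   = c("primary", "primary", "secondary", "secondary", "secondary"),
  p        = c(0.004, 0.03, 0.01, 0.02, 0.6)
)

# Assumed mapping from each family to its adjustment method
method_for <- c(primary = "holm", secondary = "fdr")

# Adjust each family separately, using only that family's p-values
results$q <- NA_real_
for (fam in names(method_for)) {
  rows <- results$family == fam
  results$q[rows] <- p.adjust(results$p[rows], method = method_for[[fam]])
}
results
```

Each family is adjusted only for the number of tests it contains (two and three here), so decide in advance which tests belong together.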

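Advanced customization of the number of tests

add_q() always adjusts for the number of p-values in the table. If the table shows only some of the tests you ran, you might want to adjust for a larger number. p.adjust() has an n argument for this, and one way to use it is to overwrite the q.value column that add_q() creates, via modify_table_body().

library(gtsummary)
library(dplyr)

mtcars %>% 
  select(mpg, cyl, disp, hp) %>% 
  tbl_summary(
    by = cyl
  ) %>% 
  add_p() %>% 
  add_q(method = "fdr") %>% 
  # Recompute q.value, adjusting for 10 tests instead of the 3 shown
  modify_table_body(
    ~ .x %>% mutate(q.value = p.adjust(p.value, method = "fdr", n = 10))
  )

In this example, add_q() creates the q-value column, and modify_table_body() then recomputes it with n = 10, so the three p-values in the table are adjusted as if 10 tests had been performed. The value 10 is only an illustration, and n must be at least the number of p-values in the table. This approach depends on the p.value and q.value column names in the table body, so check the result against your installed version of gtsummary.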
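The effect of the number of tests can be seen directly with stats::p.adjust(), whose n argument sets how many comparisons to adjust for. The p-values below are hypothetical, and the choice of 10 tests is an assumption made for illustration.

```r
# Hypothetical p-values for the three tests shown in a table
p <- c(0.01, 0.02, 0.04)

# Default: adjust for the 3 p-values supplied
q_default <- p.adjust(p, method = "fdr")

# Adjust as if 10 tests were run, 7 of which are not shown
q_ten <- p.adjust(p, method = "fdr", n = 10)

rbind(q_default, q_ten)
```

Raising n can only make the adjusted values larger, which is why the documentation for p.adjust() advises setting it only when you know how many tests were really performed.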

Conclusion

In this article, we’ve explored the add_q() function from the gtsummary package, learning how to choose the adjustment method and how to change the number of tests to adjust for with the help of p.adjust(). By mastering add_q(), you’ll be able to create more accurate and reliable summary tables, while maintaining control over the multiplicity correction process.

Remember, with great power comes great responsibility. Use add_q() wisely, and always consider the implications of multiplicity correction on your results.

Summary of add_q() arguments

Argument | Description
x        | A gtsummary table that already contains a p-value column
method   | The method used to adjust for multiple testing, passed to stats::p.adjust() (default "fdr")

We hope this article has empowered you to take your data summarization skills to the next level. Happy summarizing!

Frequently Asked Questions

Get the inside scoop on adjusting the number of tests with add_q() from gtsummary package!

What is the purpose of add_q() function in gtsummary package?

The add_q() function in the gtsummary package adds a q-value column to a gtsummary table. q-values are p-values adjusted for multiple testing, so the table must already contain p-values, for example from add_p(). add_q() does not run any statistical tests of its own.

How do I change the number of tests to adjust for with add_q()?

add_q() has no argument for the number of tests; it adjusts for all p-values in the table. To adjust for a different number, recompute the q-values with the `n` argument of stats::p.adjust(). For example, `p.adjust(p, method = "fdr", n = 6)` adjusts the p-values in `p` for 6 tests. The value of `n` must be at least the number of p-values supplied.

What is the default number of tests adjusted for with add_q()?

By default, add_q() adjusts for the number of p-values in the table. There is no separate argument to set: a table with three p-values is adjusted for three tests.

Can I use add_q() to perform multiple tests and adjust for multiple comparisons?

Yes. Use add_p() to perform the tests, then add_q() to adjust the resulting p-values for multiple comparisons. The test for each variable is chosen with the `test` argument of add_p(), for example `add_p(test = all_continuous() ~ "wilcox.test")`, and add_q() adjusts for however many p-values the table ends up with.

What types of tests can I perform with add_q()?

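add_q() does not perform tests itself; it adjusts p-values that are already in the table. The tests are chosen in add_p(), which supports t-tests, Wilcoxon rank-sum tests, Kruskal-Wallis tests, ANOVA, chi-squared tests, and more, and also accepts custom test functions. add_q() works on regression tables as well, such as those produced by tbl_uvregression().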
