Olaf Borghi - Teaching Myself: Bayesian Multiple and Multivariate Linear Regression

# use groundhog to make code maximally reproducible
if (!require("groundhog", quietly = TRUE)) {
  install.packages("groundhog")
}
library("groundhog")

# use groundhog to install and load packages - we may not use all of them and some consider it bad practise
# to still load them, but all of them are nice to have and could be useful
pkgs <- c("here",         # Path management
          "tidyverse",    # ggplot, dplyr, %>%, and friends
          "easystats",    # bayestestR, performance, effectsize and friends
          "psych",        # Descriptive statistics and other tools
          "brms",         # Bayesian modeling through Stan
          "tidybayes",    # Integration of Bayesian models in tidy + ggplot workflow
          "broom",        # Convert model objects to data frames
          "broom.mixed",  # Convert brms model objects to data frames
          "marginaleffects", # Compute marginal effects
          "tinytable",    # Lightweight package to create tables
          "hrbrthemes",   # Additional ggplot themes
          "extrafont",    # Additional fonts for plots etc
          "ggdist",       # Special geoms for posterior distributions
          "ggrepel",      # Automatically position labels
          "gghalves",     # Special half geoms
          "patchwork"     # Combine ggplot objects
          )

groundhog.library(pkgs, "2024-03-01") # change to a recent date

Teaching Myself: Bayesian Multiple Linear Regression

This blog article builds on the analyses from the last blog post on Bayesian linear regression. We will use part of the same data, but this time investigate the effects of multiple predictor variables simultaneously. You may have noticed how we plotted the association of self-reported and behavioral self-control with authoritarianism, but in our regression only included self-reported self-control as a predictor. So let’s go one step further!

Meet the data

This time, I only use open data from Zmigrod et al. (2021). This dataset includes cognitive and personality factors derived from exploratory factor analysis (EFA) of a large number of behavioral cognitive tasks and personality questionnaires from the Self-Regulation Ontology, and in addition to that, several measures of ideology. Zmigrod et al. (2021) use EFA to obtain three factors of ideology. I can highly suggest to read the article, both the analytic approach and the results are very interesting - You can find it here. In this blog post I will skip the factor analysis of the ideology measures, and directly investigate the relative associations of the cognitive factors with self-reported measures of ideology, thus taking a bit of a different analytic approach. We will investigate the associations of five cognitive factors

Caution, Temporal Discounting, Perceptual Processing Time, Speed of Evidence Accumulation, and Strategic Information Processing

With the score on three ideology questionnaires that measure

Conservatism, Authoritarianism, and Dogmatism

And we will control for the demographic variables

Gender, Age, Education and Income

Let’s select the data that we are interested in here.

# select variables of interest
dat <- data %>%
  select(1, 3:11, 24, 32, 33) %>%
  rename(ID = SubjectN) 

str(dat)
## tibble [334 × 13] (S3: tbl_df/tbl/data.frame)
##  $ ID              : num [1:334] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Age             : num [1:334] 22 33 28 31 29 37 38 35 43 39 ...
##  $ Edu             : num [1:334] 4 3 3 1 5 7 5 5 4 4 ...
##  $ Gender          : num [1:334] 0 0 0 0 1 0 1 1 1 1 ...
##  $ Income          : num [1:334] 3 3 2 2 4 1 4 2 3 2 ...
##  $ SpeededIP       : num [1:334] -1.779 -2.342 -1.905 -1.356 -0.576 ...
##  $ StrategicIP     : num [1:334] -1.568 -1.703 0.734 -1.111 -0.586 ...
##  $ Percep          : num [1:334] -0.046 -0.63 1.172 -0.592 0.352 ...
##  $ Caution         : num [1:334] 0.228 -0.579 -1.217 -0.03 1.19 ...
##  $ Discount        : num [1:334] 0.488 1.637 1.597 1.213 -0.21 ...
##  $ Conservatism    : num [1:334] 67 120 79 82 101 62 63 69 84 75 ...
##  $ Authoritarianism: num [1:334] 3 1 3 8 8 4 1 4 5 6 ...
##  $ Dogmatism       : num [1:334] 44 38 17 41 29 42 11 23 28 19 ...

I want to standardize the variables again. However, as we have categorical and binary predictors, to be able to compare the sizes of effects, this time we standardize all continuous variables by 2 SD. This puts them on a scale that is comparable with binary or dummy-coded predictors. For more information on this procedure, please see these articles / blog posts by Andrew Gelman https://statmodeling.stat.columbia.edu/2009/07/11/when_to_standar/ http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf Let’s first change the data type of Gender.


d <- dat %>% 
  mutate(Gender = as.factor(Gender))

str(d)
## tibble [334 × 13] (S3: tbl_df/tbl/data.frame)
##  $ ID              : num [1:334] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Age             : num [1:334] 22 33 28 31 29 37 38 35 43 39 ...
##  $ Edu             : num [1:334] 4 3 3 1 5 7 5 5 4 4 ...
##  $ Gender          : Factor w/ 2 levels "0","1": 1 1 1 1 2 1 2 2 2 2 ...
##  $ Income          : num [1:334] 3 3 2 2 4 1 4 2 3 2 ...
##  $ SpeededIP       : num [1:334] -1.779 -2.342 -1.905 -1.356 -0.576 ...
##  $ StrategicIP     : num [1:334] -1.568 -1.703 0.734 -1.111 -0.586 ...
##  $ Percep          : num [1:334] -0.046 -0.63 1.172 -0.592 0.352 ...
##  $ Caution         : num [1:334] 0.228 -0.579 -1.217 -0.03 1.19 ...
##  $ Discount        : num [1:334] 0.488 1.637 1.597 1.213 -0.21 ...
##  $ Conservatism    : num [1:334] 67 120 79 82 101 62 63 69 84 75 ...
##  $ Authoritarianism: num [1:334] 3 1 3 8 8 4 1 4 5 6 ...
##  $ Dogmatism       : num [1:334] 44 38 17 41 29 42 11 23 28 19 ...

Alright, now let’s scale the continuous predictor variables to have a mean of 0 and a sd of 0.5 and the continuous outcome variables to have a mean of 0 and a SD of 1.


d <- d %>% 
  mutate(across(c(SpeededIP, StrategicIP, Percep, Caution, Discount, Age, Edu, Income), 
                ~ (.x - mean(.x, na.rm = TRUE)) / (2 * sd(.x, na.rm = TRUE))),
         across(c(Conservatism, Authoritarianism, Dogmatism), scale))


d_des <- describe(d)
d_des %>% 
  select(mean, sd)
##                    mean   sd
## ID               167.50 96.6
## Age                0.00  0.5
## Edu                0.00  0.5
## Gender*            1.51  0.5
## Income             0.00  0.5
## SpeededIP          0.00  0.5
## StrategicIP        0.00  0.5
## Percep             0.00  0.5
## Caution            0.00  0.5
## Discount           0.00  0.5
## Conservatism       0.00  1.0
## Authoritarianism   0.00  1.0
## Dogmatism          0.00  1.0

As I am curious, I create a second dataframe in which I use the more common approach to scale continuous predictor and outcome variables to have a mean of 0 and a sd of 1. To make effects comparable to a binary predictor, we use effect coding for the variable gender (see https://statmodeling.stat.columbia.edu/2009/06/09/standardization/)


d1sd <- d %>% 
  mutate(across(where(is.numeric), scale))

contrasts(d1sd$Gender) <- contr.sum(levels(d1sd$Gender))

d1sd_des <- describe(d1sd) 
d1sd_des %>% 
  select(mean, sd)
##                  mean  sd
## ID               0.00 1.0
## Age              0.00 1.0
## Edu              0.00 1.0
## Gender*          1.51 0.5
## Income           0.00 1.0
## SpeededIP        0.00 1.0
## StrategicIP      0.00 1.0
## Percep           0.00 1.0
## Caution          0.00 1.0
## Discount         0.00 1.0
## Conservatism     0.00 1.0
## Authoritarianism 0.00 1.0
## Dogmatism        0.00 1.0

I am very curious. Technically, not only gender, but also education is a categorical variable, and I want to treat it as such. Let’s look at how many numbers of observation we have for each level of education. Some factor levels have very few observations that do not all for comparisons. So let’s collapse them into a few, less levels.

df <- dat %>%
  mutate(EduFactor = case_when(
    Edu %in% c(1, 2) ~ "High school (and lower)",
    Edu %in% c(3, 4) ~ "Below Bachelor",
    Edu %in% c(5, 6, 7) ~ "Bachelor, Master, Doctorate",
    TRUE ~ NA_character_  # This line is optional, handles unexpected cases
  ))  %>%
  mutate(EduFactor = factor(EduFactor, levels = c("High school (and lower)", "Below Bachelor", "Bachelor, Master, Doctorate")),
          Gender = as.factor(Gender)) %>% 
  mutate(across(where(is.numeric), scale))

To make the effects of different input variables comparable, let’s standardize numerical variables and use effect coding for categorical variables again.


# Gender
contrasts(df$Gender) <- contr.sum(levels(df$Gender))
contrasts(df$Gender)
##   [,1]
## 0    1
## 1   -1

# Education
# Define the contrast matrix
contrast_edu <- matrix(c(1, -1, -1,
                        -1,  1, -1,
                        -1, -1, -1),
                      byrow = TRUE,  
                      ncol = 3)      
contrasts(df$EduFactor) <- contrast_edu
contrasts(df$EduFactor)
##                             [,1] [,2]
## High school (and lower)        1   -1
## Below Bachelor                -1    1
## Bachelor, Master, Doctorate   -1   -1

This coding ensures that each level’s effects are considered relative to the overall mean (similar to traditional effect coding), but with adjustments to balance the variance attributed to each level in the prior predictive distribution, by replacing 0s with -1s. I am not an expert on this so please see this for some literature on why this should work:https://discourse.mc-stan.org/t/symmetric-weakly-informative-priors-for-categorical-predictors/22188/5 Create yet another dataframe in which I use usual dummy-coding

df_dummy <- dat %>%
  mutate(EduFactor = case_when(
    Edu %in% c(1, 2) ~ "High school (and lower)",
    Edu %in% c(3, 4) ~ "Below Bachelor",
    Edu %in% c(5, 6, 7) ~ "Bachelor, Master, Doctorate",
    TRUE ~ NA_character_  # This line is optional, handles unexpected cases
  ))  %>%
  mutate(EduFactor = factor(EduFactor, levels = c("High school (and lower)", "Below Bachelor", "Bachelor, Master, Doctorate")),
          Gender = as.factor(Gender)) %>% 
  mutate(across(where(is.numeric), scale))

Let’s plot the distributions of our outcome variables next.


vars1_density<- df %>%
  select(c(Conservatism, Authoritarianism, Dogmatism)) %>%
  pivot_longer(., cols = c(Conservatism, Authoritarianism, Dogmatism), 
               names_to = "Variable", values_to = "Value") %>%
  ggplot(aes(x = Value, fill = factor(Variable))) + 
  geom_density(aes(alpha = 0.6)) + 
  labs(x="Standardised values (mean = 0, sd = 1)", y=element_blank(),
       title="Distributions of outcome variables") +
  scale_fill_ipsum() +
  scale_color_ipsum() +
  guides(color = "none", alpha = "none") +
  theme_ipsum(plot_title_size = 14) + 
  theme(legend.position = "bottom",
        legend.title = element_blank())

vars1_density

Let’s also plot the distributions of the predictor variables.


vars1_density<- df %>%
  select(SpeededIP, StrategicIP, Percep, Caution, Discount) %>%
  pivot_longer(., cols = c(SpeededIP, StrategicIP, Percep, Caution, Discount), 
               names_to = "Variable", values_to = "Value") %>%
  ggplot(aes(x = Value, fill = factor(Variable))) + 
  geom_density(aes(alpha = 0.4)) + 
  labs(x="Standardised values (mean = 0, sd = 1)", y=element_blank(),
       title="Distributions of predictor variables") +
  scale_fill_ipsum() +
  scale_color_ipsum() +
  guides(color = "none", alpha = "none") +
  theme_ipsum(plot_title_size = 14) + 
  theme(legend.position = "bottom",
        legend.title = element_blank())

vars1_density

Have a first visual look of the association between the variables.


# predictor and outcome variables
predictors <- c("SpeededIP", "StrategicIP", "Percep", "Caution", "Discount")
outcomes <- c("Conservatism", "Authoritarianism", "Dogmatism")

# list to store sub-lists of plots (one list per outcome variable)
plots_lists <- vector("list", length(outcomes))
names(plots_lists) <- outcomes

for (outcome in outcomes) {
  # Initialize an empty list to store plots for this particular outcome
  plots <- list()
  
  for (predictor in predictors) {
    # Generate the plot for this predictor-outcome pair
    plot <- ggplot(df, aes_string(x = predictor, y = outcome)) +
      geom_point(size = 1) +
      geom_smooth(method = "lm") +
      labs(x = predictor, y = outcome,
           title = paste(predictor, "and", outcome)) +
      scale_fill_ipsum() +
      scale_color_ipsum() +
      guides(color = "none", alpha = "none") +
      theme_ipsum(plot_title_size = 12)
      
    # Add the plot to the list of plots
    plots[[predictor]] <- plot
  }
  
  # Combine all plots for this outcome into a single patchwork object
  plots_lists[[outcome]] <- reduce(plots, `+`)
}


print(plots_lists[["Conservatism"]])


print(plots_lists[["Authoritarianism"]])


print(plots_lists[["Dogmatism"]])

Now that we have a first descriptive impression of what the relationships may look like, let’s start with the first model.

Frequentist multiple linear regression: The associations between cognitive factors and political conservatism

This time, I will also fit a linear regressions using the frequentist lm() function. We will use the results from this to make comparisons to the Bayesian approach. We start the following model: Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + Edu + Income and will compare the coefficients for the dataset in which all continuous input variables were scaled by 2 SD with the one in which they were scaled by 2 SD.


f_m1_con <- lm(Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + Edu + Income, data = d)

summary(f_m1_con)
## 
## Call:
## lm(formula = Conservatism ~ SpeededIP + StrategicIP + Percep + 
##     Caution + Discount + Gender + Age + Edu + Income, data = d)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4766 -0.6220  0.0195  0.6279  2.8572 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.10477    0.07402    1.42   0.1579    
## SpeededIP    0.10288    0.12156    0.85   0.3980    
## StrategicIP -0.48601    0.12043   -4.04  6.9e-05 ***
## Percep       0.12898    0.10987    1.17   0.2413    
## Caution      0.28971    0.10779    2.69   0.0076 ** 
## Discount     0.18185    0.11156    1.63   0.1041    
## Gender1     -0.18396    0.10648   -1.73   0.0851 .  
## Age          0.30968    0.11767    2.63   0.0089 ** 
## Edu          0.00537    0.11191    0.05   0.9618    
## Income       0.08150    0.10856    0.75   0.4534    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.91 on 309 degrees of freedom
##   (15 observations deleted due to missingness)
## Multiple R-squared:  0.164,  Adjusted R-squared:  0.14 
## F-statistic: 6.73 on 9 and 309 DF,  p-value: 8.14e-09

We can see that there is a statistically significant effect (with \(\alpha = 0.05\)) for StrategicIP and Caution, as well as Age. We will not go into detail for now, but let’s also fit the same regression on the data that was standardized to have a SD = 1.


f_m2_con <- lm(Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + Edu + Income, data = d1sd)

summary(f_m2_con)
## 
## Call:
## lm(formula = Conservatism ~ SpeededIP + StrategicIP + Percep + 
##     Caution + Discount + Gender + Age + Edu + Income, data = d1sd)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4766 -0.6220  0.0195  0.6279  2.8572 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.01279    0.05104    0.25   0.8022    
## SpeededIP    0.05144    0.06078    0.85   0.3980    
## StrategicIP -0.24300    0.06021   -4.04  6.9e-05 ***
## Percep       0.06449    0.05494    1.17   0.2413    
## Caution      0.14485    0.05390    2.69   0.0076 ** 
## Discount     0.09092    0.05578    1.63   0.1041    
## Gender1      0.09198    0.05324    1.73   0.0851 .  
## Age          0.15484    0.05884    2.63   0.0089 ** 
## Edu          0.00268    0.05596    0.05   0.9618    
## Income       0.04075    0.05428    0.75   0.4534    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.91 on 309 degrees of freedom
##   (15 observations deleted due to missingness)
## Multiple R-squared:  0.164,  Adjusted R-squared:  0.14 
## F-statistic: 6.73 on 9 and 309 DF,  p-value: 8.14e-09

Alright, makes sense. If we scale the input variables by 2 SD, the estimates are exactly two times as large, a change of 2 SD of the predictor reflect a change of X SD in the outcome. If we scale the input by one SD, a change of 1 SD of the predictor variables reflect a change of X SD in the outcome. Let’s check what happens when we scale the input variables by 1 SD, but set up sum-coding for the categorical variables.


f_m3_con <- lm(Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income, data = df)

summary(f_m3_con)
## 
## Call:
## lm(formula = Conservatism ~ SpeededIP + StrategicIP + Percep + 
##     Caution + Discount + Gender + Age + EduFactor + Income, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4816 -0.6105  0.0379  0.6197  2.8478 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.0125     0.0788   -0.16   0.8744    
## SpeededIP     0.0512     0.0609    0.84   0.4006    
## StrategicIP  -0.2465     0.0605   -4.08  5.8e-05 ***
## Percep        0.0638     0.0550    1.16   0.2467    
## Caution       0.1408     0.0542    2.60   0.0099 ** 
## Discount      0.0959     0.0559    1.72   0.0871 .  
## Gender1       0.0877     0.0537    1.63   0.1037    
## Age           0.1561     0.0589    2.65   0.0085 ** 
## EduFactor1   -0.0244     0.0775   -0.31   0.7534    
## EduFactor2   -0.0334     0.0605   -0.55   0.5810    
## Income        0.0356     0.0547    0.65   0.5165    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.911 on 308 degrees of freedom
##   (15 observations deleted due to missingness)
## Multiple R-squared:  0.165,  Adjusted R-squared:  0.138 
## F-statistic: 6.07 on 10 and 308 DF,  p-value: 1.93e-08

We achieved the same thing! But its actually a bit easier to interpret effects now. So I think I actually prefer this way and will stick with it for now. We scale our input variables by 1 SD, and use sum-coding for our binary predictor variables. Now that we cleared this - let’s finally get to the point of this blog post.

Bayesian multiple linear regression

Alright, let’s do the same but in the Bayesian framework. For now, we will stick again and investigate the associations of the cognitive factors with political conservatism while controlling for the effects of the confounding variables.

Selecting priors

I already scaled all variables of interest. This will make it a bit simpler to select priors. I will use weakly informative priors for our slope and intercept parameters (see https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations). The selected prior is \(N(0,1)\). Let’s plot to prior to see what kind of an effect it may have on our parameters.


colors <- ipsum_pal()(1)

prior(normal(0, 1)) %>% 
  parse_dist() %>% 
  
  ggplot(aes(xdist = .dist_obj, y = prior)) + 
  stat_halfeye(.width = 0.683, p_limits = c(.001, .999), 
               slab_fill = colors[1], slab_alpha = 0.8) +
  scale_y_discrete(NULL, breaks = NULL, expand = expansion(add = 0.1)) +
  labs(title = "Prior N(0,1)", x = "Parameter Space", y = "") +
  coord_cartesian(xlim = c(-4, 4)) +
  scale_x_continuous(breaks = seq(-4, 4, by = 1)) +
  theme_ipsum() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

We can see that in the the plot for \(N(0,1)\), more density is around 0. As we expect our parameter values to be between -1 and 1, this gives more weight to parameter values closer to zero, and less weight to parameter values that are far from zero. Let’s check how our priors are set by default in brms.

get_prior(Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income, data = df)
## Warning: Rows containing NAs were excluded from the model.
##                 prior     class        coef group resp dpar nlpar lb ub       source
##                (flat)         b                                              default
##                (flat)         b         Age                             (vectorized)
##                (flat)         b     Caution                             (vectorized)
##                (flat)         b    Discount                             (vectorized)
##                (flat)         b  EduFactor1                             (vectorized)
##                (flat)         b  EduFactor2                             (vectorized)
##                (flat)         b     Gender1                             (vectorized)
##                (flat)         b      Income                             (vectorized)
##                (flat)         b      Percep                             (vectorized)
##                (flat)         b   SpeededIP                             (vectorized)
##                (flat)         b StrategicIP                             (vectorized)
##  student_t(3, 0, 2.5) Intercept                                              default
##  student_t(3, 0, 2.5)     sigma                                    0         default

By default brms sets non-informative flat priors for our slope parameters. We thus have to change this, and can do so directly when fitting the model.

Fitting and summarising the first model

m1.1 <- brm(Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income, 
          data = df, 
          prior = c(prior(normal(0, 1), class = "Intercept"), 
                    prior(normal(0, 1), class = "b")),
          seed = bayes_seed,
          file = "02_fits/fit_m1.1")

We get a warning message:

Warning: Rows containing NAs were excluded from the model.

We are not happy with this, but for now we will deal with it later. Check the results. We can use either summary(m1.1) or print(m1.1) to do so.

summary(m1.1, prob = 0.95)  # prob = 0.95 is the default
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income 
##    Data: df (Number of observations: 319) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept      -0.01      0.08    -0.18     0.15 1.00     6802     3047
## SpeededIP       0.05      0.06    -0.07     0.17 1.00     5675     3118
## StrategicIP    -0.25      0.06    -0.36    -0.13 1.00     5877     3118
## Percep          0.06      0.06    -0.05     0.17 1.00     7396     2862
## Caution         0.14      0.05     0.04     0.25 1.00     7209     3474
## Discount        0.10      0.06    -0.01     0.21 1.00     6027     3021
## Gender1         0.09      0.05    -0.02     0.19 1.00     7069     3145
## Age             0.16      0.06     0.03     0.27 1.00     5988     3108
## EduFactor1     -0.03      0.08    -0.18     0.12 1.00     6084     3451
## EduFactor2     -0.03      0.06    -0.15     0.09 1.00     6409     3374
## Income          0.04      0.06    -0.07     0.14 1.00     6764     3353
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.92      0.04     0.85     0.99 1.00     8581     3207
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Use the tools from bayestestR to have another look at the posterior. We also define a region of practical equivalence ranging from [-0.05; 0.05]. More on this later.

m1.1_posteriors <- describe_posterior(m1.1, 
                                      centrality = "median",
                                      dispersion = FALSE,
                                      ci = 0.95,
                                      ci_method = "hdi",
                                      test = c("rope"),
                                      rope_range = c(-0.05, 0.05),
                                      rope_ci = 1)

m1.1_posteriors <- as.data.frame(m1.1_posteriors)
m1.1_posteriors %>% 
  select(Parameter, Median, CI_low, CI_high, ROPE_Percentage) %>% 
  tt(.)

tinytable_l7z3lhr43xynmp5512k1

Parameter	Median	CI_low	CI_high	ROPE_Percentage
b_Intercept	-0.0143	-0.1656	0.1552	0.4522
b_SpeededIP	0.0527	-0.0732	0.1666	0.4313
b_StrategicIP	-0.2467	-0.3595	-0.1274	0.0005
b_Percep	0.0634	-0.0483	0.1683	0.3787
b_Caution	0.1406	0.0337	0.2435	0.0447
b_Discount	0.0951	-0.0185	0.1993	0.1970
b_Gender1	0.0876	-0.0262	0.1872	0.2405
b_Age	0.1562	0.0383	0.2743	0.0410
b_EduFactor1	-0.0254	-0.1785	0.1241	0.4492
b_EduFactor2	-0.0338	-0.1503	0.0874	0.5148
b_Income	0.0359	-0.0713	0.1443	0.5463

Plot the conditional effects - for now just plot those of strategic information processing, this predictor seems to be largest in size. Let’s borrow some code (as often) from great open resources. First load some fancy colors from the beyonce package.

# devtools::install_github("dill/beyonce")
library(beyonce)

bp <- beyonce_palette(41, n = 9, type = "continuous")
bp

Let’s gather the draws from our model as a dataframe.

w_draws <- as_draws_df(m1.1)
head(w_draws)
## # A draws_df: 6 iterations, 1 chains, and 14 variables
##   b_Intercept b_SpeededIP b_StrategicIP b_Percep b_Caution b_Discount b_Gender1 b_Age
## 1       0.125      -0.125         -0.22   0.0597      0.21     -0.013     0.058 0.127
## 2      -0.070       0.193         -0.24   0.0461      0.14      0.182     0.164 0.210
## 3      -0.067       0.156         -0.28   0.0358      0.14      0.186     0.137 0.164
## 4       0.050       0.055         -0.27   0.1670      0.15      0.068     0.149 0.055
## 5      -0.063       0.033         -0.31  -0.0078      0.15      0.113     0.044 0.242
## 6      -0.085       0.109         -0.32   0.0486      0.19      0.070     0.065 0.204
## # ... with 6 more variables
## # ... hidden reserved variables {'.chain', '.iteration', '.draw'}

Let’s first replicate the plot from above.


# how many posterior lines would you like?
n_lines <- 200

df %>% 
  ggplot(aes(x = StrategicIP, y = Conservatism)) +
  geom_abline(data = w_draws %>% slice(1:n_lines),
              aes(intercept = b_Intercept, slope = b_StrategicIP, group = .draw),
              color = bp[2], linewidth = 1/4, alpha = 1/4) + 
  geom_point(alpha = 1/2, color = bp[5]) +
  labs(title = "Conditional association between StrategicIP and Conservatism",
       caption = (paste("Data with", n_lines, "credible regression lines")),
       x = "Strategic",
       y = "Conservatism") +
  coord_cartesian(xlim = c(-5, 5),
                  ylim = c(-3, 3)) +
  theme_ipsum(plot_title_size = 12,
              axis_title_size = 10,
              axis_title_face = "bold") +
  theme(panel.grid.minor = element_blank()) 
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family not found in
## Windows font database
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database
## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

Let’s also plot the conditional association between Age and Conservatism. As we standardized the variable Age, but Age in years is more intuitive for interpretations, let’s rescale the draws first.


# make a function to rescale slope estimates
make_beta_1 <- function(zeta_1, sd_x, sd_y) {
  zeta_1 * sd_y / sd_x
}

# this is not super elegant but I add Conservatism scaled to the first data as in df I only have the scaled variable - note for future do not change orig variable
data <- data %>% 
  mutate(Conservatism_z = scale(Conservatism))

# calculate the sd of x and y
sd_x <- sd(data$Age, na.rm = T)
sd_y <- sd(data$Conservatism_z, na.rm = T)

# rescale the posterior draws of Age
w_draws <-
  w_draws %>% 
  mutate(b_Age_orig = make_beta_1(zeta_1 = b_Age,
                                  sd_x   = sd_x,
                                  sd_y   = sd_y))

w_draws %>% 
  select(b_Age, b_Age_orig) %>% 
  head(.) %>% 
  tt(.)
## Warning: Dropping 'draws_df' class as required metadata was removed.

tinytable_fpwsovgd5fuxtv5lu5je

b_Age	b_Age_orig
0.126837715268519	0.0149471688486139
0.210043509129804	0.0247525413862275
0.16379642037851	0.0193025611271303
0.0548954843510674	0.00646914895845259
0.241954146956206	0.0285130450396435
0.203877101602445	0.0240258621464944

Now we are ready to create a prettier plot from our manual draws for Age.


# how many posterior lines would you like?
n_lines <- 200

data %>% 
  ggplot(aes(x = Age, y = Conservatism_z)) +
  geom_abline(data = w_draws %>% slice(1:n_lines),
              aes(intercept = b_Intercept, slope = b_Age_orig, group = .draw),
              color = bp[2], linewidth = 1/4, alpha = 1/4) + 
  geom_point(alpha = 1/2, color = bp[5]) +
  labs(title = "Conditional association between Age and Conservatism",
       caption = (paste("Data with", n_lines, "credible regression lines")),
       x = "Age",
       y = "Conservatism") +
  coord_cartesian(xlim = c(18, 65),
                  ylim = c(-3, 3)) +
  theme_ipsum(plot_title_size = 12,
              axis_title_size = 10,
              axis_title_face = "bold") +
  theme(panel.grid.minor = element_blank()) 
## Warning: Removed 9 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database

## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font family not
## found in Windows font database
## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, : font family
## not found in Windows font database

Let’s also plot the conditional effects for our factor variables and see how those plots look like. For this we use the inbuilt function conditional_effects().


c_eff_m1.1 <- conditional_effects(m1.1)

c_eff_m1.1_plot_gen <- plot(c_eff_m1.1, 
                        points = T,
                        point_args = list(size = 1, alpha = 1/4, width = .05, height = .05, color = "black"),
                        plot = FALSE)[[6]] +
  scale_color_ipsum() +
  scale_fill_ipsum() +
  theme_ipsum()


c_eff_m1.1_plot_edu <- plot(c_eff_m1.1, 
                        points = T,
                        point_args = list(size = 1, alpha = 1/4, width = .05, height = .05, color = "black"),
                        plot = FALSE)[[8]] +
  scale_color_ipsum() +
  scale_fill_ipsum() +
  theme_ipsum()
  
c_eff_m1.1_plot_gen + c_eff_m1.1_plot_edu

Let’s use bayesplot functions to plot the posterior distributions of the parameters.


mcmc_plot(m1.1, 
          type = "areas") +
  theme_ipsum()

We can also manually create a similar plot using draws from the posterior distribution. For simplicity, Here we use gather_draws from the tidybayes package - this automatically returns a long dataframe suitable for plotting in ggplot.


# gather draws in a long format
draws_m1.1 <- m1.1 %>% 
  gather_draws(`b_.*`, regex = TRUE)  %>% 
  filter(.variable %in% c("b_SpeededIP", "b_StrategicIP", "b_Percep", "b_Caution", "b_Discount")) %>%
  mutate(.variable = case_when(
    .variable == "b_SpeededIP" ~ "Speed of Evidence Accumulation",
    .variable == "b_StrategicIP" ~ "Strategic Information Processing",
    .variable == "b_Percep" ~ "Perceptual Processing Time",
    .variable == "b_Caution" ~ "Caution",
    .variable == "b_Discount" ~ "Discount",
    TRUE ~ .variable # keeps the original value if none of the conditions are met
  ))  %>%
  mutate(.variable = factor(.variable, levels = c(
    "Speed of Evidence Accumulation",
    "Strategic Information Processing",
    "Perceptual Processing Time",
    "Caution",
    "Discount"
  )))

ggplot(draws_m1.1, aes(x = .value, y = fct_rev(.variable), fill = .variable)) +
  geom_rect(aes(xmin = -0.05, xmax = 0.05, ymin = -Inf, ymax = Inf), 
            color = "transparent", fill = bp[9], alpha = 0.05) +
  geom_vline(xintercept = 0, alpha = 0.2) +
  stat_halfeye(slab_alpha = 0.8,
               .width = c(0.9, 0.95), point_interval = "median_hdi") +
  scale_fill_manual(values = beyonce_palette(41, n = 7, type = "continuous")) +
  guides(fill = "none", slab_alpha = "none", alpha = "none") +
  labs(title = "Associations of Cognitive Factors with Political Conservatism", 
       x = "Posterior Distribution", 
       y = "Variable",
       caption = "90% and 95% credible intervals shown in black.") +
  scale_x_continuous(breaks = seq(-0.50, 0.50, by = 0.1)) +
  coord_cartesian(xlim = c(-0.45, 0.45)) +
  theme_ipsum(plot_title_size = 14,
              axis_title_size = 12,
              axis_title_face = "bold") +
  annotate("text", x = 0, y = 0.6, label = "ROPE", 
            color = "black", size = 5)

Interpretation of the findings

Let’s make some sense of the findings. The model indicates that there is a high probability given the data that Strategic Information Processing (Median = -0.25 , 95% CI [-0.36; -0.13]) is negatively associated with Conservatism, and that in turn, Caution (Median = 0.14 , 95% CI [0.03; 0.24]) is positively associated with Conservatism.

Checking the model

Prior check

Let’s check if the prior that we used for the slope and intercept was informative or, as intended, only weakly informative. We can use functionality from bayestestR for this.


prior_summary(m1.1)
##                 prior     class        coef group resp dpar nlpar lb ub       source
##          normal(0, 1)         b                                                 user
##          normal(0, 1)         b         Age                             (vectorized)
##          normal(0, 1)         b     Caution                             (vectorized)
##          normal(0, 1)         b    Discount                             (vectorized)
##          normal(0, 1)         b  EduFactor1                             (vectorized)
##          normal(0, 1)         b  EduFactor2                             (vectorized)
##          normal(0, 1)         b     Gender1                             (vectorized)
##          normal(0, 1)         b      Income                             (vectorized)
##          normal(0, 1)         b      Percep                             (vectorized)
##          normal(0, 1)         b   SpeededIP                             (vectorized)
##          normal(0, 1)         b StrategicIP                             (vectorized)
##          normal(0, 1) Intercept                                                 user
##  student_t(3, 0, 2.5)     sigma                                    0         default
check_prior(m1.1)
##        Parameter Prior_Quality
## 1    b_Intercept uninformative
## 2    b_SpeededIP uninformative
## 3  b_StrategicIP uninformative
## 4       b_Percep uninformative
## 5      b_Caution uninformative
## 6     b_Discount uninformative
## 7      b_Gender1 uninformative
## 8          b_Age uninformative
## 9   b_EduFactor1 uninformative
## 10  b_EduFactor2 uninformative
## 11      b_Income uninformative

Ok - as intended our priors were uninformative.

Posterior predictive check

Let’s also check how well our model performs in generating data that is similar to our observed data. We call this a Posterior predictive check (PPC).

pp_check(m1.1, ndraws = 100)

Our model seems to capture the shape of the observed data well.

pp_check(m1.1, ndraws = 100, type ='error_scatter_avg', alpha = .1)

Trace plot

Commonly, we look at the trace plots, i.e., the estimated values across each iteration for a chain. See this nice practical guide by Michael Clark for some details. We will follow his code in several steps below. What should the trace plot look like? Ideally it should look like a series from a normal distribution.

qplot(x = 1:1000, y = rnorm(1000), geom = 'line') +
  theme_ipsum()

Now we can also plot the trace plot from our model, using nice functionality from bayesplot.

mcmc_plot(m1.1, type = 'trace') +
  theme_ipsum()

This looks good too.

Time to go Multivariate

Ok - but what if we want to investigate the effects of multiple predictors on multiple outcomes at the same time? brms allows us to run a multivariate multiple regression! So we can do just that. We will keep it a bit brief here, but may delve into this a bit more in future blog posts. Start with formulating our model. We want to investigate the relative controlled associations between the five Cognitive Factors and Conservatism, Authoritarianism and Dogmatism at the same time.

f_mv1 <- 
  bf(mvbind(Conservatism, Authoritarianism, Dogmatism) 
     ~ SpeededIP + StrategicIP + Percep + Caution + Discount + 
       Gender + Age + EduFactor + Income) +
  set_rescor(TRUE)

Let’s have a look at the default priors.

get_prior(data = df,
          family = gaussian,
          formula = f_mv1)
## Warning: Rows containing NAs were excluded from the model.
##                    prior     class        coef group             resp dpar nlpar lb ub
##                   (flat)         b                                                    
##                   (flat) Intercept                                                    
##                   lkj(1)    rescor                                                    
##                   (flat)         b                   Authoritarianism                 
##                   (flat)         b         Age       Authoritarianism                 
##                   (flat)         b     Caution       Authoritarianism                 
##                   (flat)         b    Discount       Authoritarianism                 
##                   (flat)         b  EduFactor1       Authoritarianism                 
##                   (flat)         b  EduFactor2       Authoritarianism                 
##                   (flat)         b     Gender1       Authoritarianism                 
##                   (flat)         b      Income       Authoritarianism                 
##                   (flat)         b      Percep       Authoritarianism                 
##                   (flat)         b   SpeededIP       Authoritarianism                 
##                   (flat)         b StrategicIP       Authoritarianism                 
##  student_t(3, -0.1, 2.5) Intercept                   Authoritarianism                 
##     student_t(3, 0, 2.5)     sigma                   Authoritarianism             0   
##                   (flat)         b                       Conservatism                 
##                   (flat)         b         Age           Conservatism                 
##                   (flat)         b     Caution           Conservatism                 
##                   (flat)         b    Discount           Conservatism                 
##                   (flat)         b  EduFactor1           Conservatism                 
##                   (flat)         b  EduFactor2           Conservatism                 
##                   (flat)         b     Gender1           Conservatism                 
##                   (flat)         b      Income           Conservatism                 
##                   (flat)         b      Percep           Conservatism                 
##                   (flat)         b   SpeededIP           Conservatism                 
##                   (flat)         b StrategicIP           Conservatism                 
##     student_t(3, 0, 2.5) Intercept                       Conservatism                 
##     student_t(3, 0, 2.5)     sigma                       Conservatism             0   
##                   (flat)         b                          Dogmatism                 
##                   (flat)         b         Age              Dogmatism                 
##                   (flat)         b     Caution              Dogmatism                 
##                   (flat)         b    Discount              Dogmatism                 
##                   (flat)         b  EduFactor1              Dogmatism                 
##                   (flat)         b  EduFactor2              Dogmatism                 
##                   (flat)         b     Gender1              Dogmatism                 
##                   (flat)         b      Income              Dogmatism                 
##                   (flat)         b      Percep              Dogmatism                 
##                   (flat)         b   SpeededIP              Dogmatism                 
##                   (flat)         b StrategicIP              Dogmatism                 
##     student_t(3, 0, 2.5) Intercept                          Dogmatism                 
##     student_t(3, 0, 2.5)     sigma                          Dogmatism             0   
##        source
##       default
##       default
##       default
##       default
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##       default
##       default
##       default
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##       default
##       default
##       default
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##       default
##       default

Again, we want to set the priors for the intercepts and slopes to \(N(0, 1)\). Let’s fit the model.

mv1 <- brm(data = df,
           family = gaussian,
           formula = f_mv1,
           prior = c(prior(normal(0, 1), class = "Intercept"),
                     prior(normal(0, 1), class = "b")),
          seed = bayes_seed,
          file = "02_fits/fit_mv1")

Let’s have a look at the output.

summary(mv1)
##  Family: MV(gaussian, gaussian, gaussian) 
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity
##          mu = identity; sigma = identity 
## Formula: Conservatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income 
##          Authoritarianism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income 
##          Dogmatism ~ SpeededIP + StrategicIP + Percep + Caution + Discount + Gender + Age + EduFactor + Income 
##    Data: df (Number of observations: 319) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##                              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Conservatism_Intercept          -0.01      0.08    -0.17     0.14 1.00     3468     2921
## Authoritarianism_Intercept       0.02      0.08    -0.15     0.18 1.00     3954     3213
## Dogmatism_Intercept              0.04      0.08    -0.12     0.21 1.00     4130     3532
## Conservatism_SpeededIP           0.05      0.06    -0.07     0.17 1.00     4090     3362
## Conservatism_StrategicIP        -0.24      0.06    -0.37    -0.12 1.00     3945     2937
## Conservatism_Percep              0.06      0.05    -0.05     0.17 1.00     4061     2961
## Conservatism_Caution             0.14      0.06     0.03     0.25 1.00     4148     3692
## Conservatism_Discount            0.10      0.06    -0.02     0.21 1.00     4386     3381
## Conservatism_Gender1             0.09      0.05    -0.02     0.19 1.00     4612     3568
## Conservatism_Age                 0.16      0.06     0.04     0.27 1.00     3767     3305
## Conservatism_EduFactor1         -0.03      0.08    -0.18     0.13 1.00     3561     3135
## Conservatism_EduFactor2         -0.03      0.06    -0.15     0.09 1.00     3489     2839
## Conservatism_Income              0.04      0.06    -0.07     0.14 1.00     4206     3325
## Authoritarianism_SpeededIP       0.10      0.07    -0.03     0.23 1.00     3671     3293
## Authoritarianism_StrategicIP    -0.11      0.07    -0.24     0.02 1.00     3674     3107
## Authoritarianism_Percep          0.01      0.06    -0.10     0.13 1.00     4530     3466
## Authoritarianism_Caution         0.07      0.06    -0.05     0.19 1.00     4071     3241
## Authoritarianism_Discount        0.13      0.06     0.01     0.25 1.00     4400     3208
## Authoritarianism_Gender1        -0.09      0.06    -0.20     0.03 1.00     4212     3202
## Authoritarianism_Age             0.07      0.06    -0.05     0.19 1.00     3676     3390
## Authoritarianism_EduFactor1      0.05      0.08    -0.12     0.21 1.00     3715     3523
## Authoritarianism_EduFactor2     -0.06      0.06    -0.18     0.07 1.00     4004     3529
## Authoritarianism_Income          0.04      0.06    -0.08     0.16 1.00     3776     3050
## Dogmatism_SpeededIP             -0.10      0.07    -0.23     0.02 1.00     4151     3242
## Dogmatism_StrategicIP           -0.09      0.07    -0.22     0.03 1.00     4011     3179
## Dogmatism_Percep                 0.05      0.06    -0.07     0.16 1.00     4177     3449
## Dogmatism_Caution                0.06      0.06    -0.06     0.18 1.00     4558     3399
## Dogmatism_Discount               0.06      0.06    -0.06     0.17 1.00     4534     3168
## Dogmatism_Gender1               -0.01      0.06    -0.13     0.10 1.00     4428     3170
## Dogmatism_Age                   -0.01      0.06    -0.14     0.11 1.00     4063     3184
## Dogmatism_EduFactor1             0.07      0.08    -0.09     0.23 1.00     4118     3439
## Dogmatism_EduFactor2            -0.03      0.07    -0.16     0.09 1.00     3827     3292
## Dogmatism_Income                -0.01      0.06    -0.13     0.10 1.00     4682     3298
## 
## Family Specific Parameters: 
##                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_Conservatism         0.92      0.04     0.85     0.99 1.00     4289     3204
## sigma_Authoritarianism     0.98      0.04     0.91     1.06 1.00     4699     3296
## sigma_Dogmatism            0.99      0.04     0.92     1.07 1.00     4893     2919
## 
## Residual Correlations: 
##                                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## rescor(Conservatism,Authoritarianism)     0.42      0.05     0.33     0.51 1.00     4369
## rescor(Conservatism,Dogmatism)            0.32      0.05     0.22     0.42 1.00     4756
## rescor(Authoritarianism,Dogmatism)        0.31      0.05     0.21     0.41 1.00     4094
##                                       Tail_ESS
## rescor(Conservatism,Authoritarianism)     2693
## rescor(Conservatism,Dogmatism)            3229
## rescor(Authoritarianism,Dogmatism)        2905
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

mv1_posteriors <- describe_posterior(mv1, 
                                     centrality = "median",
                                     dispersion = FALSE,
                                     ci = 0.95,
                                     ci_method = "hdi")
## Warning: Multivariate response models are not yet supported for tests `rope` and
## `p_rope`.

mv1_posteriors <- as.data.frame(mv1_posteriors)
mv1_posteriors %>% 
  select(Parameter, Median, CI_low, CI_high) %>% 
  mutate(across(where(is.numeric), ~round(.x, 2))) %>% 
  tt(.)

tinytable_g6ujc0zkdzva1r2m37t5

Parameter	Median	CI_low	CI_high
b_Conservatism_Intercept	-0.01	-0.16	0.15
b_Conservatism_SpeededIP	0.05	-0.07	0.17
b_Conservatism_StrategicIP	-0.24	-0.37	-0.13
b_Conservatism_Percep	0.06	-0.05	0.17
b_Conservatism_Caution	0.14	0.04	0.25
b_Conservatism_Discount	0.09	-0.02	0.20
b_Conservatism_Gender1	0.09	-0.02	0.19
b_Conservatism_Age	0.16	0.04	0.27
b_Conservatism_EduFactor1	-0.02	-0.18	0.12
b_Conservatism_EduFactor2	-0.04	-0.15	0.09
b_Conservatism_Income	0.04	-0.07	0.14
b_Authoritarianism_Intercept	0.02	-0.15	0.18
b_Authoritarianism_SpeededIP	0.10	-0.02	0.23
b_Authoritarianism_StrategicIP	-0.11	-0.23	0.02
b_Authoritarianism_Percep	0.01	-0.10	0.13
b_Authoritarianism_Caution	0.07	-0.04	0.19
b_Authoritarianism_Discount	0.13	0.01	0.25
b_Authoritarianism_Gender1	-0.09	-0.19	0.03
b_Authoritarianism_Age	0.07	-0.05	0.19
b_Authoritarianism_EduFactor1	0.05	-0.11	0.22
b_Authoritarianism_EduFactor2	-0.06	-0.18	0.07
b_Authoritarianism_Income	0.04	-0.07	0.16
b_Dogmatism_Intercept	0.04	-0.12	0.21
b_Dogmatism_SpeededIP	-0.10	-0.23	0.03
b_Dogmatism_StrategicIP	-0.09	-0.22	0.03
b_Dogmatism_Percep	0.05	-0.07	0.16
b_Dogmatism_Caution	0.06	-0.05	0.19
b_Dogmatism_Discount	0.06	-0.06	0.17
b_Dogmatism_Gender1	-0.01	-0.13	0.10
b_Dogmatism_Age	-0.02	-0.14	0.11
b_Dogmatism_EduFactor1	0.07	-0.08	0.24
b_Dogmatism_EduFactor2	-0.03	-0.16	0.09
b_Dogmatism_Income	-0.01	-0.13	0.10

We cannot directly check the percentage in the ROPE using the describe_posterior function for multivariate models yet - it will throw a warning. But we can use the rope function from the bayesTestR package.

rope(mv1, 
     range = list(Conservatism = c(-0.05, 0.05), 
                  Authoritarianism = c(-0.05, 0.05), 
                  Dogmatism = c(-0.05, 0.05)), 
     ci = 1, ci_method = "HDI")
## Possible multicollinearity between b_Authoritarianism_EduFactor1 and
##   b_Authoritarianism_Intercept (r = 0.73), b_Dogmatism_EduFactor1 and
##   b_Dogmatism_Intercept (r = 0.72). This might lead to inappropriate results. See
##   'Details' in '?rope'.
## # Proportion of samples inside the ROPE.
## ROPE with depends on outcome variable.
## 
## Parameter                    | inside ROPE |    ROPE width
## ----------------------------------------------------------
## Conservatism_Intercept       |     47.20 % | [-0.05, 0.05]
## Conservatism_SpeededIP       |     44.52 % | [-0.05, 0.05]
## Conservatism_StrategicIP     |      0.00 % | [-0.05, 0.05]
## Conservatism_Percep          |     38.88 % | [-0.05, 0.05]
## Conservatism_Caution         |      5.42 % | [-0.05, 0.05]
## Conservatism_Discount        |     20.00 % | [-0.05, 0.05]
## Conservatism_Gender1         |     23.50 % | [-0.05, 0.05]
## Conservatism_Age             |      3.95 % | [-0.05, 0.05]
## Conservatism_EduFactor1      |     46.92 % | [-0.05, 0.05]
## Conservatism_EduFactor2      |     50.42 % | [-0.05, 0.05]
## Conservatism_Income          |     53.60 % | [-0.05, 0.05]
## Authoritarianism_Intercept   |     44.22 % | [-0.05, 0.05]
## Authoritarianism_SpeededIP   |     20.35 % | [-0.05, 0.05]
## Authoritarianism_StrategicIP |     16.25 % | [-0.05, 0.05]
## Authoritarianism_Percep      |     58.45 % | [-0.05, 0.05]
## Authoritarianism_Caution     |     34.85 % | [-0.05, 0.05]
## Authoritarianism_Discount    |     10.22 % | [-0.05, 0.05]
## Authoritarianism_Gender1     |     25.00 % | [-0.05, 0.05]
## Authoritarianism_Age         |     33.92 % | [-0.05, 0.05]
## Authoritarianism_EduFactor1  |     38.92 % | [-0.05, 0.05]
## Authoritarianism_EduFactor2  |     41.58 % | [-0.05, 0.05]
## Authoritarianism_Income      |     51.58 % | [-0.05, 0.05]
## Dogmatism_Intercept          |     39.48 % | [-0.05, 0.05]
## Dogmatism_SpeededIP          |     20.18 % | [-0.05, 0.05]
## Dogmatism_StrategicIP        |     24.77 % | [-0.05, 0.05]
## Dogmatism_Percep             |     46.95 % | [-0.05, 0.05]
## Dogmatism_Caution            |     37.57 % | [-0.05, 0.05]
## Dogmatism_Discount           |     40.30 % | [-0.05, 0.05]
## Dogmatism_Gender1            |     60.52 % | [-0.05, 0.05]
## Dogmatism_Age                |     54.55 % | [-0.05, 0.05]
## Dogmatism_EduFactor1         |     33.88 % | [-0.05, 0.05]
## Dogmatism_EduFactor2         |     49.88 % | [-0.05, 0.05]
## Dogmatism_Income             |     58.98 % | [-0.05, 0.05]

Let’s also create the same plot as before.


draws_mv1 <- mv1 %>%  
  gather_draws(`b_.*`, regex = TRUE) %>% 
  # for the plot I want to remove the intercepts
  filter(!str_detect(.variable, "Intercept")) %>%
  # create a new variable that captures which outcome
  mutate(outcome = case_when(
    str_detect(.variable, "Conservatism") ~ "Conservatism",
    str_detect(.variable, "Authoritarianism") ~ "Authoritarianism",
    str_detect(.variable, "Dogmatism") ~ "Dogmatism")) %>% 
  # create a new variable that captures which predictor
  mutate(predictor = case_when(
    str_detect(.variable, "SpeededIP") ~ "Speed of Evidence Accumulation",
    str_detect(.variable, "StrategicIP") ~ "Strategic Information Processing",
    str_detect(.variable, "Percep") ~ "Perceptual Processing Time",
    str_detect(.variable, "Caution") ~ "Caution",
    str_detect(.variable, "Discount") ~ "Discount")) %>% 
  # create a new variable that captures which confounder
  mutate(confounder = case_when(
    str_detect(.variable, "Age") ~ "Age",
    str_detect(.variable, "Gender1") ~ "Gender",
    str_detect(.variable, "EduFactor1") ~ "Education Level 1",
    str_detect(.variable, "EduFactor2") ~ "Education Level 2",
    str_detect(.variable, "Income") ~ "Income")) %>% 
  # set them as factors with specific levels
  mutate(outcome = factor(outcome, levels = c(
    "Conservatism",
    "Authoritarianism",
    "Dogmatism"))) %>% 
  mutate(predictor = factor(predictor, levels = c(
    "Speed of Evidence Accumulation",
    "Strategic Information Processing",
    "Perceptual Processing Time",
    "Caution",
    "Discount"))) %>% 
  mutate(confounder = factor(confounder, levels = c(
    "Age",
    "Gender",
    "Education Level 1",
    "Education Level 2",
    "Income"))) 

glimpse(draws_mv1)
## Rows: 120,000
## Columns: 8
## Groups: .variable [30]
## $ .chain     <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ .iteration <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20…
## $ .draw      <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20…
## $ .variable  <chr> "b_Conservatism_SpeededIP", "b_Conservatism_SpeededIP", "b_Conservati…
## $ .value     <dbl> -0.02523, -0.02746, 0.11269, -0.01573, 0.07271, 0.04645, -0.00398, 0.…
## $ outcome    <fct> Conservatism, Conservatism, Conservatism, Conservatism, Conservatism,…
## $ predictor  <fct> Speed of Evidence Accumulation, Speed of Evidence Accumulation, Speed…
## $ confounder <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…

draws_mv1 %>% 
  # drop NA (= predictors that I will not need) 
  drop_na(predictor) %>%
  ggplot(aes(x = .value, y = fct_rev(predictor), fill = predictor)) +
  geom_rect(aes(xmin = -0.05, xmax = 0.05, ymin = -Inf, ymax = Inf), 
            color = "transparent", fill = bp[9], alpha = 0.05) +
  geom_vline(xintercept = 0, alpha = 0.8) +
  stat_halfeye(slab_alpha = 0.8,
               .width = c(0.9, 0.95), point_interval = "median_hdi") +
  scale_fill_manual(values = beyonce_palette(41, n = 7, type = "continuous")) +
  guides(fill = "none", slab_alpha = "none", alpha = "none") +
  labs(title = "Associations of Cognitive Factors with Ideological Attitudes", 
       x = "Posterior Distribution", 
       y = "Predictor Variable",
       caption = "90% and 95% credible intervals shown in black.") +
  scale_x_continuous(breaks = seq(-0.50, 0.50, by = 0.1)) +
  coord_cartesian(xlim = c(-0.45, 0.45)) +
  theme_ipsum(plot_title_size = 16,
              axis_title_size = 14,
              axis_title_face = "bold") +
  annotate("text", x = 0, y = 0.6, label = "ROPE", 
            color = "black", size = 4) +
  facet_wrap(~ outcome, nrow=2)

Very nice! From the plot we can directly infer that Strategic Information Processing is the most important predictor of Conservatism - this is also the overall most relevant effect! Our cognitive predictors do a worse job at predicting Authoritarianism or Dogmatism.

What about our demographic predictor variables?

draws_mv1 %>% 
  # drop NA (= predictors that I will not need) 
  drop_na(confounder) %>%
  ggplot(aes(x = .value, y = fct_rev(confounder), fill = confounder)) +
  geom_rect(aes(xmin = -0.05, xmax = 0.05, ymin = -Inf, ymax = Inf), 
            color = "transparent", fill = bp[9], alpha = 0.05) +
  geom_vline(xintercept = 0, alpha = 0.8) +
  stat_halfeye(slab_alpha = 0.8,
               .width = c(0.9, 0.95), point_interval = "median_hdi") +
  scale_fill_manual(values = beyonce_palette(8, n = 5, type = "continuous")) +
  guides(fill = "none", slab_alpha = "none", alpha = "none") +
  labs(title = "Associations of Demographic Factors with Ideological Attitudes", 
       x = "Posterior Distribution", 
       y = "Predictor Variable",
       caption = "90% and 95% credible intervals shown in black.") +
  scale_x_continuous(breaks = seq(-0.50, 0.50, by = 0.1)) +
  coord_cartesian(xlim = c(-0.45, 0.45)) +
  theme_ipsum(plot_title_size = 16,
              axis_title_size = 14,
              axis_title_face = "bold") +
  annotate("text", x = 0, y = 0.6, label = "ROPE", 
            color = "black", size = 4) +
  facet_wrap(~ outcome, nrow=2)

Very neat! We see that there is a relatively high probability that Age is positively associated with Conservatism - this appears to be the most relevant effect again.

Checking the model

The model did not throw any warnings, so I will let you check convergence etc. by yourself if you want to.

But let’s have a final look at the posterior predictive check:

pp_check(mv1, ndraws = 100, resp="Conservatism")

pp_check(mv1, ndraws = 100, resp="Authoritarianism")

It seems like the model does a bit worse at capturing Authoritarianism. This may be as Authoritarianism is actually a ordinal variable here that can only take the values 1, 2, 3, 4.

pp_check(mv1, ndraws = 100, resp="Dogmatism")

Indirect associations?

One thing that I did not talk about is that brms also allows to test indirect effects. Before conducting the study, we may have expected that Age is indirectly associated with Conservatism via cognitive abilities. In a causal analysis this would be called mediation, but for here I just want to say: Before you do something like this, try to make yourself aware of the conclusions that you can draw from such analyses in observational data. I may delve a bit deeper into this in a future blog post, but for now here are three reads I can highly recommend:

That’s a Lot to Process! Pitfalls of Popular Path Models by Rohrer et al. (2022)
Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data, again by Rohrer (2018)
And as many people criticize statistical models that test for indirect associations in cross-sectional data, here a bit of a different view on it.

Sources and Resources

I am not a statistician, so please take all of what I wrote and coded here with a lot of caution. However, there are many great resources out there if you want to delve deeper into the topic - I am just getting started with this, but here are a few potentially helpful links.

Doing Bayesian Data Analysis in brms and the tidyverse: A brms and tidyverse version by Solomon Kurz of the classic book by Kruschke Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan.
Practical Bayes Part I and Practical Bayes Part II: Great resources to dealing and avoiding common problems in Bayesian models by Micheal Clark
Bayesian Basics: General overview of how to run Bayesian models in R
bayestestR: Vignettes on basic Bayesian models and their interpretation
Stan Wiki: Prior Choice: Go to reference for selecting priors
Course Handouts for Bayesian Data Analysis Class by Mark Lai

And there are of course many many more resources, this is just to get started with some.