In this chapter, you will learn how to report your hypothesis testing results (including visualizations) to communicate what your model says about your hypothesis.
1 Reporting your hypothesis testing results
In an upcoming section, we will discuss communicating your hypothesis testing (including results) more formally - i.e. the text and visuals you will use to write your own scientific reports and papers. In the current chapter, we will gather many of the pieces needed for that task.
Remember that testing your hypothesis means looking for evidence of an effect of each predictor on your response. To report your model, you want to communicate this evidence, including:
your best-specified model(s),
how well your model explains your response, and
your modelled effects.
You will report your best-specified model by reporting the terms - the predictors and any interactions - that are in your model, and any from your starting model that were removed during model selection.
You will report your modelled effects - i.e. the effect of each predictor on your response. Here you are interested in reporting:
how big the effect of a predictor is on your response,
how certain you are that there is an effect of the predictor on your response (is the effect bigger than zero?), and
(for categorical predictors) whether the effects differ among the levels of the predictor.
Remember that model effects are captured in your coefficients.
2 Introducing the examples
In this chapter, we will report model results for four example best-specified models that showcase a range of different hypotheses.
If you are re-reading this chapter with your own hypothesis you are testing, look for the example that shares the most structure with your own hypothesis.
The examples are:
Example 1: Resp1 ~ Cat1 + 1
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Example 3: Resp3 ~ Cont4 + 1
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
where
Resp# is your response variable.
Cat# indicates a categorical predictor
Cont# indicates a continuous predictor
Note that each example is testing a different hypothesis by fitting a different model to a different data-set. The four examples are not related to one another.
Note also that the examples all involve a normal error distribution assumption. I will point out when your process would be different for a different error distribution assumption (e.g. Gamma, Poisson, or binomial).
Finally, note that I am not going through all the steps that got us to these example best-specified models, but assume we went through the previous steps (Responses, Predictors, Hypothesis, Starting model, Model validation and Hypothesis testing) as outlined in this Handbook to choose each best-specified model.
Example 1: Resp1 ~ Cat1 + 1
For example #1, assume your best-specified model says that your response variable (Resp1) is explained by a categorical predictor (Cat1):
Resp1 ~ Cat1 + 1
Your best-specified model for example #1 is in an object called bestMod1:
bestMod1
Call: glm(formula = Resp1 ~ Cat1 + 1, family = gaussian(link = "identity"),
data = myDat1)
Coefficients:
(Intercept) Cat1Sp2 Cat1Sp3
20.955 5.877 -4.510
Degrees of Freedom: 99 Total (i.e. Null); 97 Residual
Null Deviance: 7083
Residual Deviance: 5210 AIC: 687.1
and the dredge() table you used to pick your bestMod1 is in dredgeOut1:
dredgeOut1
Global model call: glm(formula = Resp1 ~ Cat1 + 1, family = gaussian(link = "identity"),
data = myDat1)
---
Model selection table
(Intrc) Cat1 df logLik AICc delta weight
2 20.95 + 4 -339.547 687.5 0.00 1
1 21.79 2 -354.905 713.9 26.42 0
Models ranked by AICc(x)
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
For Example #2, assume your best-specified model says that your response variable (Resp2) is explained by two categorical predictors (Cat2 & Cat3) as well as the interaction between the predictors:
Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Your best-specified model for example #2 is in an object called bestMod2:
Example 3: Resp3 ~ Cont4 + 1
For Example #3, assume your best-specified model says that your response variable (Resp3) is explained by one continuous predictor (Cont4):
Resp3 ~ Cont4 + 1
Your best-specified model for Example #3 is in an object called bestMod3:
bestMod3
Call: glm(formula = Resp3 ~ Cont4 + 1, family = gaussian(link = "identity"),
data = myDat3)
Coefficients:
(Intercept) Cont4
-226.9 260.1
Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
Null Deviance: 276100000
Residual Deviance: 64930000 AIC: 1628
that was fit to data in myDat3:
str(myDat3)
'data.frame': 100 obs. of 2 variables:
$ Resp3: num 3668 2774 4594 1094 2100 ...
$ Cont4: num 13.93 12.15 15.4 4.87 6.64 ...
and the dredge() table you used to pick your bestMod3 is in dredgeOut3:
dredgeOut3
Global model call: glm(formula = Resp3 ~ Cont4 + 1, family = gaussian(link = "identity"),
data = myDat3)
---
Model selection table
(Intrc) Cont4 df logLik AICc delta weight
2 -226.9 260.1 3 -811.080 1628.4 0.00 1
1 2358.0 2 -883.457 1771.0 142.63 0
Models ranked by AICc(x)
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
For Example #4, assume your best-specified model says that your response variable (Resp4) is explained by one categorical predictor (Cat5) and one continuous predictor (Cont6) as well as the interaction between the predictors:
Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
Your best-specified model for Example #4 is in an object called bestMod4:
3 Reporting your best-specified model
This section is similar regardless of your model structure (e.g. error distribution assumption). The examples below all assume a normal error distribution assumption but you can use the process presented here for models of any structure.
Reporting your best-specified model means reporting the terms - the predictors and any interactions - that are in your model. It is good practice to present the model along with the results from the model selection. In this way, you can include multiple best-specified models if there is evidence that more than one might be useful. Depending on your hypothesis and results, you may want to present all models within ∆AICc < 2 of the best-specified model, or all models with any Akaike weight, or simply all models.
Remember from the Hypothesis Testing chapter that you can also use the output from the dredge() function to report evidence for how you picked the best-specified model.1
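If you want to present only the better-supported candidate models, one option is to pull out the subset of the dredge() table close to the top model. Here is a minimal sketch using the dredgeOut1 object from Example 1 (the delta cutoff of 2 is just the convention discussed above):
library(MuMIn)                 # dredge(), sw() and friends live in MuMIn
subset(dredgeOut1, delta < 2)  # keep only models within 2 AICc units of the best-specified model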
Let’s take a look at how you do this with our examples:
Example 1: Resp1 ~ Cat1 + 1
Global model call: glm(formula = Resp1 ~ Cat1 + 1, family = gaussian(link = "identity"),
data = myDat1)
---
Model selection table
(Intrc) Cat1 df logLik AICc delta weight
2 20.95 + 4 -339.547 687.5 0.00 1
1 21.79 2 -354.905 713.9 26.42 0
Models ranked by AICc(x)
For Example 1, you will report that your best-specified model is Resp1 ~ Cat1 + 1, i.e. that there is evidence that Cat1 explains variability in Resp1. This was chosen as the best-specified model as it had the lowest AICc and the next highest rank model had a ∆AICc of 26.42 (i.e. ∆AICc > 2). The best-specified model had an Akaike weight of 1.
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
For Example 2, you will report that your best-specified model is Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1, i.e. that there is evidence that Cat2 and Cat3 explain variability in Resp2, and that there is an interaction between the effects of Cat2 and Cat3 on your response - i.e. the effect of Cat2 on Resp2 depends on the value of Cat3.
This was chosen as the best-specified model as it had the lowest AICc and the next highest rank model had a ∆AICc of 6.55 (i.e. ∆AICc > 2). The best-specified model had an Akaike weight of 0.964.
Example 3: Resp3 ~ Cont4 + 1
Global model call: glm(formula = Resp3 ~ Cont4 + 1, family = gaussian(link = "identity"),
data = myDat3)
---
Model selection table
(Intrc) Cont4 df logLik AICc delta weight
2 -226.9 260.1 3 -811.080 1628.4 0.00 1
1 2358.0 2 -883.457 1771.0 142.63 0
Models ranked by AICc(x)
For Example 3, you will report that your best-specified model is Resp3 ~ Cont4 + 1, i.e. that there is evidence that Cont4 explains variability in Resp3.
This was chosen as the best-specified model as it had the lowest AICc and the next highest rank model had a ∆AICc of 142.63 (i.e. ∆AICc > 2). The best-specified model had an Akaike weight of 1.
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
For Example 4, you will report that your best-specified model is Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1. This model says that there is evidence that Cat5 and Cont6 explain variability in Resp4, and that there is an interaction between Cat5 and Cont6 - i.e. the effect of Cont6 on Resp4 depends on the value of Cat5.
This was chosen as the best-specified model as it had the lowest AICc and the next highest rank model had a ∆AICc of 19.39 (i.e. ∆AICc > 2). The best-specified model had an Akaike weight of 1.
4 Reporting how well your model explains your response
This section is similar regardless of your model structure (e.g. error distribution assumption). The examples below all assume a normal error distribution assumption but you can use the process presented here for models of any structure.
In this section you will report
how much variation (deviance) in your response is explained by your model
how important each of your predictors is in explaining that deviance
4.1 how much variation (deviance) in your response is explained by your model
Recall that your whole motivation for pursuing statistical modelling was to explain variation in your response. Thus, it is important that you quantify how much variation in your response your model is able to explain.
Note that we will discuss this as “explained deviance” rather than “explained variation”. This is because the term “variance” is associated with models where the error distribution assumption is normal, whereas deviance is a more universal term.
When you have a normal error distribution assumption and a linear shape assumption, you can capture the amount of explained deviance simply by comparing the variation in your response (i.e. the starting variation before you fit your model - the null variation) vs. the variation in your model residuals (i.e. the remaining variation after you fit your model - the residual variation) as the \(R^2\):
\[
R^2 = \frac{\text{null variation} - \text{residual variation}}{\text{null variation}} = 1 - \frac{\text{residual variation}}{\text{null variation}}
\]
From this equation, you can see how, if your model is able to explain all the variation in the response, the residual variation will be zero, and \(R^2 = 1\). Alternatively, if the model explains no variation in the response the residual variation equals the null variation and \(R^2 = 0\).
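As a quick sketch (assuming the bestMod1 object from Example 1), you can compute this \(R^2\) directly from the null and residual deviance that every glm object stores:
# R^2 = 1 - residual deviance / null deviance (normal error distribution, identity link)
1 - bestMod1$deviance / bestMod1$null.deviance
# equivalently: 1 - deviance(bestMod1) / bestMod1$null.deviance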
For models with other error distribution and shape assumptions, you need another way of estimating the goodness of fit of your model. You can do this through estimating a pseudo \(R^2\).
There are many different types of pseudo \(R^2\).2 One useful pseudo \(R^2\) is called the Likelihood Ratio \(R^2\) or \(R^2_{LR}\). The \(R^2_{LR}\) compares the likelihood of your best-specified model to the likelihood of the null model:
\[
R^2_{LR} = 1 - \exp\!\left(-\frac{2}{n}\left(\log\mathcal{L}(\text{model}) - \log\mathcal{L}(\text{null})\right)\right)
\]
where \(n\) is the number of observations, \(\log\mathcal{L}(\text{model})\) is the log-likelihood of your model, and \(\log\mathcal{L}(\text{null})\) is the log-likelihood of the null model. The \(R^2_{LR}\) is the type of pseudo \(R^2\) that shows up in your dredge() output when you add extra = "R^2" to the dredge() call. You can calculate \(R^2_{LR}\) by hand, read it from your dredge() output, or estimate it using r.squaredLR() from the MuMIn package:
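For instance, here is a minimal sketch (assuming the bestMod3 object from Example 3) that calculates \(R^2_{LR}\) by hand from the formula above and then compares it to the r.squaredLR() output:
library(MuMIn)                       # for r.squaredLR()
nullMod3 <- update(bestMod3, . ~ 1)  # the null (intercept-only) model
n <- nobs(bestMod3)                  # number of observations
1 - exp(-2/n * (as.numeric(logLik(bestMod3)) - as.numeric(logLik(nullMod3)))) # R^2_LR by hand
r.squaredLR(bestMod3)                # the same value, plus an adjusted version as an attribute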
Note here that two values of \(R^2_{LR}\) are reported. The adjusted pseudo \(R^2\) given here under attr(,"adj.r.squared") has been scaled so that \(R^2_{LR}\) can reach a maximum of 1, similar to a regular \(R^2\)3.
Note that the two estimates are similar. One nice feature of the \(R^2_{LR}\) is that it is equivalent to the regular \(R^2\) when your model assumes a normal error distribution and a linear shape, so you can use \(R^2_{LR}\) for a wide range of models.
Aside: what is a good \(R^2\)?
Recall that your goal in hypothesis testing is to explain the variation in your response - preferably as much variation as possible! So it is natural to hope for a high \(R^2\), but this is a trap: the results of your hypothesis testing4 are valuable whether you are able to explain 3% or 93% of the variation in your response. Remember:
You are progressing science either way - helping the field learn about mechanisms and design future experiments.
Remaining unexplained variation points to potential limitations in measuring predictor effects and/or other predictors (not in your model) that may play a role.
For context, here is a plot of R^2 values from a meta-analysis of biology studies:
4.2 how important is each of your predictors in explaining that variation
When you have more than one predictor in your model, you may also want to report how relatively important each predictor is to explaining deviance in your response. This is also called “partitioning the explained deviance” to the predictors or “variance decomposition”.
Aside: what we won’t be doing
To get an estimate of how much deviance in your response one particular predictor explains, you may be tempted to compute the explained deviance (\(R^2\)) estimates of models fit to data with and without that particular predictor. Let’s try with our Example 4:
If you want to get an estimate as to how much response deviance the Cont6 predictor explains, you might compare the \(R^2\) of a model with and without the Cont6 predictor.
Let's compare model #4 (that includes Cont6) and model #2 (that doesn't include Cont6):
R2.mod4 <- (dredgeOut4$`R^2`[2]) # model #4 is the second row in dredgeOut4
R2.mod2 <- (dredgeOut4$`R^2`[3]) # model #2 is the third row in dredgeOut4
diffR2 <- R2.mod4 - R2.mod2      # find estimated contribution of Cont6 to explained deviance
diffR2
[1] 0.02429225
So it looks like 2.4% of the variability in Resp4 is explained by Cont6.
But what if instead you chose to compare model #3 (that includes Cont6) and model #1 (that doesn’t include Cont6):
R2.mod3 <- (dredgeOut4$`R^2`[4]) # model #3 is the fourth row in dredgeOut4
R2.mod1 <- (dredgeOut4$`R^2`[5]) # model #1 is the fifth row in dredgeOut4
diffR2 <- R2.mod3 - R2.mod1      # find estimated contribution of Cont6 to explained deviance
diffR2
[1] 0.08480009
Now it looks like 8.5% of the variability in Resp4 is explained by Cont6! Quite a different answer! Your estimates of the contribution of Cont6 to the explained response deviance don't agree because of collinearity among your predictors - more on this in the section on Model Validation.
There are a few options you can use to get a robust estimate of how much each predictor is contributing to explained deviance in your response.
One option for partitioning the explained deviance when you have collinearity among your predictors is hierarchical partitioning. Hierarchical partitioning estimates the average independent contribution of each predictor to the total explained variance by considering all models in the candidate model set5. This method is beyond the scope of the course but see the rdacca.hp package for an example of how to do this.
Another method (that we will be using) for estimating the importance of each term (predictor or interaction) in your model is by again looking at the candidate model set ranking made by dredge(). Here you can measure the importance of a predictor by summing up the Akaike weights for any model that includes a particular predictor. The Akaike weight of a model compares the likelihood of the model scaled to the total likelihood of all the models in the candidate model set. The sum of Akaike weights for models including a particular predictor tells you how important the predictor is in explaining the deviance in your response. You can calculate the sum of Akaike weights directly with the sw() function in the MuMIn package:
sw(dredgeOut4)
Cat5 Cont6 Cat5:Cont6
Sum of weights: 1 1 1
N containing models: 3 3 1
Here we can see that all model terms (the predictors Cat5 and Cont6 as well as the interaction) are equally important in explaining the deviance in Resp4 (they appear in all models that have any Akaike weight).
Let’s look at these two steps
how much deviance in your response is explained by your model
how important each of your predictors is in explaining that deviance
with our examples:
Example 1: Resp1 ~ Cat1 + 1
how important each of your predictors is in explaining that deviance
With Example 1, you have only one predictor (Cat1), and so this predictor is responsible for all of the explained deviance in your response (Resp1). You can see that it appears in all models with any Akaike weight in the output of the sw() function from the MuMIn package.
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
how important each of your predictors is in explaining that deviance
sw(dredgeOut2)
Cat2 Cat3 Cat2:Cat3
Sum of weights: 1.00 1.00 0.96
N containing models: 3 3 1
Here you can see that Cat2 and Cat3 are equally important in explaining the deviance in Resp2 (they appear in all models that have any Akaike weight), while the interaction term between Cat2 and Cat3 is less important (only appearing in one model with Akaike weight, though this is the top model).
Example 3: Resp3 ~ Cont4 + 1
how much deviance in your response is explained by your model
R2 <- r.squaredLR(bestMod3)
R2
[1] 0.764852
attr(,"adj.r.squared")
[1] 0.764852
The best-specified model explains 76.5% of the deviance in Resp3.
how important each of your predictors is in explaining that deviance
dredgeOut3
Global model call: glm(formula = Resp3 ~ Cont4 + 1, family = gaussian(link = "identity"),
data = myDat3)
---
Model selection table
(Intrc) Cont4 df logLik AICc delta weight
2 -226.9 260.1 3 -811.080 1628.4 0.00 1
1 2358.0 2 -883.457 1771.0 142.63 0
Models ranked by AICc(x)
sw(dredgeOut3)
Cont4
Sum of weights: 1
N containing models: 1
With Example 3, you have only one predictor (Cont4), and so this predictor is responsible for all of the explained deviance in your response (Resp3). You can see that it appears in all models with any Akaike weight in the output of the sw() function from the MuMIn package.
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
how much deviance in your response is explained by your model
R2 <- r.squaredLR(bestMod4)
R2
[1] 0.855657
attr(,"adj.r.squared")
[1] 0.8567834
The best-specified model explains 85.6% of the deviance in Resp4.
how important each of your predictors is in explaining that deviance
Cat5 Cont6 Cat5:Cont6
Sum of weights: 1 1 1
N containing models: 3 3 1
Here you can see that Cat5, Cont6 and the interaction Cat5:Cont6 are all equally important in explaining the deviance in Resp4 (they appear in all models that have any Akaike weight).
5 Reporting your modelled effects
Remember that testing your hypothesis means looking for evidence of effects of your predictors on your response. By testing your hypothesis, you determined that there was evidence that the effects of your predictors on your response are non-zero (Popovic et al. 2024).
These effects are presented as estimates of the coefficients in your model (e.g. a slope) which are called parameters once they have been estimated.
For categorical predictors, the coefficient represents how much your response changes (increases or decreases) when you change from one level of the category to another. For numeric predictors, the coefficient represents how much your response changes (increases or decreases) when you increase your predictor by one unit. More on this below!
Here you will gather the information to communicate visually and quantitatively the magnitude and direction of your effects and determine (in some cases) where effects are similar or different from one another, answering questions like:
how much does a change (effect) in your predictor change your response?
is that change (effect) positive (your response increases) or negative (your response decreases)?
is that change (effect) similar across levels of your categorical predictor?
You can report your modelled effects in a number of ways6:
Visualizing your model effects: Here, you will make plots of your model effects including the effects themselves, uncertainty around the effects and your observations. These plots will help communicate your modelled effects as well as how well your model fits your observations.
Giving examples of your modelled effects: Here, you will use your model to predict the value of your response (with uncertainty) to illustrate how the response changes when your predictors change.
Quantifying your model effects: Here, you will report (in numbers) the magnitude and direction of your modelled effects along with the uncertainty. Exactly how you do that will depend on the structure of your model (e.g. the error distribution assumption). We will go through a number of examples here. You will also look for evidence of whether modelled effects of a categorical predictor differ across all levels.
We will focus on options #1 and #3 here, as giving examples of your modelled effects is covered7 in the section on predicting.
5.1 Visualizing your model effects
This section is similar regardless of your model structure (e.g. error distribution assumption). The examples below all assume a normal error distribution assumption but you can use the process presented here for models of any structure.
Here, you will visualize how each predictor is affecting the response by drawing how your modelled fitted values (the estimates of your response made by the model) change with your predictors, along with a measure of uncertainty around those fitted values. You also want to include your data (actual observations of your response and predictors) so you can start to communicate how well your model captures your hypothesis, and what amount of deviance in your response is explained by your model.
There are a lot of different R packages that make it easy to quickly visualize your model. We will focus on two methods that will allow you to make quick visualizations of your model and/or customize the figures as you would like.
visualizing using the visreg package: This will let you get a quick look at your model object with a limited amount of customization.
visualizing "by hand": Here, "by hand" is a bit of a silly description as R will be doing the work for you. What I mean by "by hand" is that you will be building the plot yourself by querying your model object. This method is very flexible and will lead you to a fully customizable visualization. This process involves i) choosing the values of your predictors at which to make estimates of your fitted values, ii) using predict() to estimate your response variable at those values of your predictors, and iii) using those model estimates to plot your model fit. Aside: this method is also what is used to make predictions from your model - more in the Predicting section.
Below are visualizations for each of our examples. Two further tips:
In a few of the examples, we use ggplot2's position_dodge() function. Be careful here - it can be useful for avoiding overlapping data points when predictors are categorical BUT it can actually change the apparent location of data points if used with a numeric predictor.
Note that we do not use the geom_smooth() function in the ggplot2 package. The fits provided by geom_smooth() are not the same as the model fits directly from your best-specified model so we avoid this here (e.g. geom_smooth() will fit a separate model for each level of your categorical predictor negating the hypothesis testing that allows us to make comparisons across the predictor levels).
Example 1: Resp1 ~ Cat1 + 1
Visualizing with the visreg package
library(visreg)  # load visreg package for visualization
library(ggplot2) # load ggplot2 package for visualization
visreg(bestMod1,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cat1",      # predictor on x-axis
       #by = ...,          # if you want to include a 2nd predictor plotted as colour
       #breaks = ...,      # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 3rd predictor
       #overlay = TRUE,    # to plot as overlay or panels when there is a 2nd (by) predictor
       #rug = FALSE,       # to turn off the rug. The rug shows you where you have positive (top axis) and negative (bottom axis) residuals
       gg = TRUE) +        # to plot as a ggplot; with ggplot, you will have more control over the look of the plot
  ylab("Response, (units)") + # change y-axis label
  xlab("Cat1, (units)") +     # change x-axis label
  theme_bw()                  # change ggplot theme
Note the small dashes at the top and bottom of the plot. This is called the "rug". visreg uses the rug to give an indication of where your observed data lie - a dash on the bottom axis means you have a data point at that location that would be a negative residual relative to the model fit; a dash on the top axis means you have a data point at that location that would be a positive residual relative to the model fit. This is useful, but not the same as actually including your observations on the plot. Unfortunately, due to limitations of the visreg package, you cannot easily add your observations onto plots where the x-axis is a categorical predictor. But that's ok, because there are other options…
Visualizing by hand
To plot by hand, you will
first make a data frame containing the value of your predictors at which you want to plot effects on the response:
# Set up your predictors for the visualized fit
forCat1 <- unique(myDat1$Cat1) # every value of your categorical predictor
# create a data frame with your predictors
forVis <- expand.grid(Cat1 = forCat1) # expand.grid() function makes sure you have all combinations of predictors
Next, you will use the predict() function8 to get the model estimates of your response variable at those values of your predictors:
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod1,         # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate
ilink <- family(bestMod1)$linkinv   # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
Finally, you will use these model estimates to make your plot:
library(ggplot2)
ggplot() +                  # start ggplot
  geom_point(data = myDat1, # add observations to your plot
             mapping = aes(x = Cat1, y = Resp1),
             position = position_jitter(width = 0.1)) + # control position of data points so they are easier to see on the plot
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat1, y = Fit, ymin = Lower, ymax = Upper),
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis,  # add the modelled fit to your plot
             mapping = aes(x = Cat1, y = Fit),
             shape = 22,     # shape of point
             size = 3,       # size of point
             fill = "white", # fill colour for point
             col = 'black') + # outline colour for point
  ylab("Resp1, (units)") +   # change y-axis label
  xlab("Cat1, (units)") +    # change x-axis label
  theme_bw()                 # change ggplot theme
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Visualizing with the visreg package
As you have more than one predictor in Example 2, there are a number of ways you can visualize your effects.
Here is an example of a visualization with Cat2 on the x-axis and the effects of Cat3 plotted as separate panels:
library(visreg)  # load visreg package for visualization
library(ggplot2) # load ggplot2 package for visualization
visreg(bestMod2,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cat2",      # predictor on x-axis
       by = "Cat3",        # if you want to include a 2nd predictor plotted as colour
       #breaks = ...,      # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 3rd predictor
       overlay = FALSE,    # to plot as overlay or panels when there is a 2nd (by) predictor
       #rug = FALSE,       # to turn off the rug. The rug shows you where you have positive (top axis) and negative (bottom axis) residuals
       gg = TRUE) +        # to plot as a ggplot; with ggplot, you will have more control over the look of the plot
  ylab("Response, (units)") + # change y-axis label
  xlab("Cat2, (units)") +     # change x-axis label
  theme_bw()                  # change ggplot theme
# tip: we can't include our observations on this plot due to the limitations of the visreg package when plotting with a categorical predictor on the x-axis
And here is an example of a visualization with Cat2 on the x-axis and the effects of Cat3 overlayed on the plot as different colours for each category (level) of Cat3:
visreg(bestMod2,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cat2",      # predictor on x-axis
       by = "Cat3",        # if you want to include a 2nd predictor plotted as colour
       #breaks = ...,      # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 3rd predictor
       overlay = TRUE,     # to plot as overlay or panels when there is a 2nd (by) predictor
       #rug = FALSE,       # to turn off the rug. The rug shows you where you have positive (top axis) and negative (bottom axis) residuals
       gg = TRUE) +        # to plot as a ggplot; with ggplot, you will have more control over the look of the plot
  ylab("Response, (units)") + # change y-axis label
  xlab("Cat2, (units)") +     # change x-axis label
  theme_bw()                  # change ggplot theme
# tip: we can't include our observations on this plot due to the limitations of the visreg package when plotting with a categorical predictor on the x-axis
Note the small dashes at the top and bottom of the plot. This is called the "rug". visreg uses the rug to give an indication of where your observed data lie - a dash on the bottom axis means you have a data point at that location that would be a negative residual relative to the model fit; a dash on the top axis means you have a data point at that location that would be a positive residual relative to the model fit. This is useful, but not the same as actually including your observations on the plot. Unfortunately, due to limitations of the visreg package, you cannot easily add your observations onto plots where the x-axis is a categorical predictor. But that's ok, because there are other options…
Visualizing by hand
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat2 <- unique(myDat2$Cat2) # every level of your categorical predictor
forCat3 <- unique(myDat2$Cat3) # every level of your categorical predictor
# create a data frame with your predictors
forVis <- expand.grid(Cat2 = forCat2, Cat3 = forCat3) # expand.grid() function makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod2,         # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate
ilink <- family(bestMod2)$linkinv   # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() +                  # start ggplot
  geom_point(data = myDat2, # add observations to your plot
             mapping = aes(x = Cat2, y = Resp2, col = Cat3),
             position = position_jitterdodge(jitter.width = 0.75, dodge.width = 0.75)) + # control position of data points so they are easier to see on the plot
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat2, y = Fit, ymin = Lower, ymax = Upper, col = Cat3),
                position = position_dodge(width = 0.75), # dodge so error bars line up with the points
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis,  # add the modelled fit to your plot
             mapping = aes(x = Cat2, y = Fit, fill = Cat3),
             shape = 22,     # shape of point
             size = 3,       # size of point
             col = 'black',  # outline colour for point
             position = position_dodge(width = 0.75)) + # dodge so points line up with the error bars
  ylab("Resp2, (units)") + # change y-axis label
  xlab("Cat2, (units)") +  # change x-axis label
  labs(fill = "Cat3, (units)", col = "Cat3, (units)") + # change legend titles
  theme_bw()               # change ggplot theme
Example 3: Resp3 ~ Cont4 + 1
Visualizing with the visreg package
Notice how you can include the gg = TRUE argument to plot this as a ggplot type figure. This allows you to add your data onto the visualization of your model.
library(visreg)  # load visreg package for visualization
library(ggplot2) # load ggplot2 package for visualization
visreg(bestMod3,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cont4",     # predictor on x-axis
       #by = ...,          # if you want to include a 2nd predictor plotted as colour
       #breaks = ...,      # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 3rd predictor
       #overlay = TRUE,    # to plot as overlay or panels when there is a 2nd (by) predictor
       #rug = FALSE,       # to turn off the rug. The rug shows you where you have positive (top axis) and negative (bottom axis) residuals
       gg = TRUE) +        # to plot as a ggplot; with ggplot, you will have more control over the look of the plot
  geom_point(data = myDat3, # add your observations to the plot
             mapping = aes(x = Cont4, y = Resp3)) +
  ylab("Response, (units)") + # change y-axis label
  xlab("Cont4, (units)") +    # change x-axis label
  theme_bw()                  # change ggplot theme
# Note: you CAN include your observations on your plot with visreg when you have a continuous predictor on the x-axis
Note the small dashes at the top and bottom of the plot. This is called the "rug". visreg uses the rug to give an indication of where your observed data lie - a dash on the bottom axis means you have a data point at that location that would be a negative residual relative to the model fit; a dash on the top axis means you have a data point at that location that would be a positive residual relative to the model fit.
Visualizing by hand
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCont4 <- seq(from = min(myDat3$Cont4), to = max(myDat3$Cont4), by = 1) # a sequence of numbers in your continuous predictor range
# create a data frame with your predictors
forVis <- expand.grid(Cont4 = forCont4) # expand.grid() function makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictor
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod3,         # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate
ilink <- family(bestMod3)$linkinv   # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() +                  # start ggplot
  geom_point(data = myDat3, # add observations to your plot
             mapping = aes(x = Cont4, y = Resp3)) +
  geom_line(data = forVis,  # add the modelled fit to your plot
            mapping = aes(x = Cont4, y = Fit),
            linewidth = 1.2) + # control thickness of line
  geom_line(data = forVis,  # add uncertainty to your plot (upper line)
            mapping = aes(x = Cont4, y = Upper),
            linewidth = 0.8, # control thickness of line
            linetype = 2) +  # control style of line
  geom_line(data = forVis,  # add uncertainty to your plot (lower line)
            mapping = aes(x = Cont4, y = Lower),
            linewidth = 0.8, # control thickness of line
            linetype = 2) +  # control style of line
  ylab("Resp3, (units)") + # change y-axis label
  xlab("Cont4, (units)") + # change x-axis label
  theme_bw()               # change ggplot theme
# another option making use of geom_ribbon()
ggplot() +                  # start ggplot
  geom_point(data = myDat3, # add observations to your plot
             mapping = aes(x = Cont4, y = Resp3)) +
  geom_ribbon(data = forVis, # add uncertainty to your plot (shaded band)
              mapping = aes(x = Cont4, ymin = Lower, ymax = Upper),
              alpha = 0.3) + # control the transparency, between 0 (transparent) and 1 (opaque)
  geom_line(data = forVis,   # add the modelled fit to your plot
            mapping = aes(x = Cont4, y = Fit),
            linewidth = 1.2) + # control thickness of line
  ylab("Resp3, (units)") + # change y-axis label
  xlab("Cont4, (units)") + # change x-axis label
  theme_bw()               # change ggplot theme
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
Visualizing with the visreg package
As you will see in the section on quantifying your model effects below, there is no evidence of an effect of Cont6 on Resp4 when Cat5 = Farm - i.e. the coefficient (slope) is not different from zero. In such cases, you would typically not include that effect on the plot (the red line in the next figure), but I include it here to illustrate the visualization techniques.
A plot with Cont6 on the x-axis:
library(visreg)  # load visreg package for visualization
library(ggplot2) # load ggplot2 package for visualization
visreg(bestMod4,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cont6",     # predictor on x-axis
       by = "Cat5",        # predictor plotted as colour
       #breaks = 3,        # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 4th predictor
       overlay = TRUE,     # to plot as overlay or panels
       rug = FALSE,        # to turn off the rug
       gg = TRUE) +        # to plot as a ggplot
  geom_point(data = myDat4, # add your observations to the plot
             mapping = aes(x = Cont6, y = Resp4, col = Cat5)) +
  #ylim(0, 60) +             # adjust the y-axis limits
  ylab("Response, (units)") + # change y-axis label
  xlab("Cont6, (units)") +    # change x-axis label
  theme_bw()                  # change ggplot theme
A plot with Cat5 on the x-axis: Notice how the continuous predictor is represented by a few representative levels, shown as different colours on the plot. Here you've asked for three levels with breaks = 3. Note that you cannot include your observations on the visreg plot when the x-axis predictor is categorical.
visreg(bestMod4,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cat5",      # predictor on x-axis
       by = "Cont6",       # predictor plotted as colour
       breaks = 3,         # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 4th predictor
       overlay = TRUE,     # to plot as overlay or panels
       rug = FALSE,        # to turn off the rug
       gg = TRUE) +        # to plot as a ggplot
  ylab("Response, (units)") + # change y-axis label
  xlab("Cat5, (units)") +     # change x-axis label
  labs(fill = "Cont6, (units)", col = "Cont6, (units)") + # change legend titles
  theme_bw()                  # change ggplot theme
You can also specify at which levels the breaks should occur with the breaks = ... argument. For example, you can ask visreg to plot the modelled effects when Cont6 = 400 and Cont6 = 600:
visreg(bestMod4,           # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cat5",      # predictor on x-axis
       by = "Cont6",       # predictor plotted as colour
       breaks = c(400, 600), # control the values at which the colour predictor is plotted
       #cond = ...,        # if you want to include a 4th predictor
       overlay = TRUE,     # to plot as overlay or panels
       rug = FALSE,        # to turn off the rug
       gg = TRUE) +        # to plot as a ggplot
  ylab("Response, (units)") + # change y-axis label
  xlab("Cat5, (units)") +     # change x-axis label
  labs(fill = "Cont6, (units)", col = "Cont6, (units)") + # change legend titles
  theme_bw()                  # change ggplot theme
Note that you cannot include your observations on the visreg plot when the x-axis predictor is categorical.
Visualizing by hand
Here’s an example with Cont6 on the x-axis:
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat5 <- unique(myDat4$Cat5) # all levels of your categorical predictor
forCont6 <- seq(from = min(myDat4$Cont6), to = max(myDat4$Cont6), by = 1) # a sequence of numbers in your continuous predictor range
# create a data frame with your predictors
forVis <- expand.grid(Cat5 = forCat5, Cont6 = forCont6) # expand.grid() function makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod4,         # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate
ilink <- family(bestMod4)$linkinv   # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() +                  # start ggplot
  geom_point(data = myDat4, # add observations to your plot
             mapping = aes(x = Cont6, y = Resp4, col = Cat5)) +
  geom_line(data = forVis,  # add the modelled fit to your plot
            mapping = aes(x = Cont6, y = Fit, col = Cat5),
            linewidth = 1.2) + # control thickness of line
  geom_line(data = forVis,  # add uncertainty to your plot (upper line)
            mapping = aes(x = Cont6, y = Upper, col = Cat5),
            linewidth = 0.4, # control thickness of line
            linetype = 2) +  # control style of line
  geom_line(data = forVis,  # add uncertainty to your plot (lower line)
            mapping = aes(x = Cont6, y = Lower, col = Cat5),
            linewidth = 0.4, # control thickness of line
            linetype = 2) +  # control style of line
  ylab("Resp4, (units)") + # change y-axis label
  xlab("Cont6, (units)") + # change x-axis label
  labs(fill = "Cat5, (units)", col = "Cat5, (units)") + # change legend titles
  theme_bw()               # change ggplot theme
# another option making use of geom_ribbon()
ggplot() +                  # start ggplot
  geom_point(data = myDat4, # add observations to your plot
             mapping = aes(x = Cont6, y = Resp4, col = Cat5)) +
  geom_ribbon(data = forVis, # add uncertainty to your plot (shaded band)
              mapping = aes(x = Cont6, ymin = Lower, ymax = Upper, fill = Cat5),
              alpha = 0.3) + # control the transparency, between 0 (transparent) and 1 (opaque)
  geom_line(data = forVis,   # add the modelled fit to your plot
            mapping = aes(x = Cont6, y = Fit, col = Cat5),
            linewidth = 1.2) + # control thickness of line
  ylab("Resp4, (units)") + # change y-axis label
  xlab("Cont6, (units)") + # change x-axis label
  labs(fill = "Cat5, (units)", col = "Cat5, (units)") + # change legend titles
  theme_bw()               # change ggplot theme
Here’s an example with Cat5 on the x-axis:
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat5 <- unique(myDat4$Cat5) # all levels of your categorical predictor
forCont6 <- c(400, 600)        # two example values in your continuous predictor range
# create a data frame with your predictors
forVis <- expand.grid(Cat5 = forCat5, Cont6 = forCont6) # expand.grid() function makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod4,         # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate
ilink <- family(bestMod4)$linkinv   # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() +                  # start ggplot
  geom_point(data = myDat4, # add observations to your plot
             mapping = aes(x = Cat5, y = Resp4, col = Cont6),
             position = position_jitterdodge(jitter.width = 0.75, dodge.width = 0.75)) + # control position of data points so they are easier to see on the plot
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat5, y = Fit, ymin = Lower, ymax = Upper, col = Cont6),
                position = position_dodge(width = 0.75), # dodge so error bars line up with the points
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis,  # add the modelled fit to your plot
             mapping = aes(x = Cat5, y = Fit, fill = Cont6),
             shape = 22,     # shape of point
             size = 3,       # size of point
             col = 'black',  # outline colour for point
             position = position_dodge(width = 0.75)) + # dodge so points line up with the error bars
  ylab("Resp4, (units)") + # change y-axis label
  xlab("Cat5, (units)") +  # change x-axis label
  labs(fill = "Cont6, (units)", col = "Cont6, (units)") + # change legend titles
  theme_bw()               # change ggplot theme
Aside: Plotting in 3D
You might notice that many of the plots above communicate three dimensions (one response + two predictors) in a two-dimensional plot. There are other ways of making three-dimensional plots in R, e.g. using the visreg2d() function in the visreg package:
visreg2d(fit = bestMod4,     # your model
         xvar = "Cont6",     # what to plot on the x-axis
         yvar = "Cat5",      # what to plot on the y-axis
         scale = "response") # make sure fitted values (colours) are on the scale of the response
or “by hand” using the geom_raster() function in the ggplot2 package:
# Set up your predictors for the visualized fit
forCont6 <- seq(from = min(myDat4$Cont6), to = max(myDat4$Cont6), by = 0.1) # e.g. a sequence of Cont6 values
forCat5 <- unique(myDat4$Cat5) # every value of your categorical predictor
# create a data frame with your predictors
forVis <- expand.grid(Cont6 = forCont6, Cat5 = forCat5) # expand.grid() function makes sure you have all combinations of predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod4,         # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate
ilink <- family(bestMod4)$linkinv   # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
# create your plot
ggplot() + # start your ggplot
  geom_raster(data = forVis, aes(x = Cont6, y = Cat5, fill = Fit)) + # add the 3 dimensions as a raster
  geom_point(data = myDat4, aes(x = Cont6, y = Cat5, colour = Resp4)) # add your data
EXTRA: As you can see, these plots represent the third dimension using colour. You can also make true three-dimensional plots in R with the plotly package. These plots are interactive, which makes them more useful than static 3D plots. Click on the plot and move your mouse to rotate it!
library(plotly) # load the plotly package
plot_ly(data = forVis,        # the data with your model predictions (made above)
        x = ~Cont6,           # what to plot on the x axis
        y = ~Cat5,            # what to plot on the y axis
        z = ~Fit,             # what to plot on the z axis
        type = "scatter3d",   # type of plot
        mode = "markers") %>% # type of plot
  add_markers(data = myDat4, x = ~Cont6, y = ~Cat5, z = ~Resp4) # add your data
5.2 Quantifying your model effects
How you quantify your model effects varies with your model structure. If it is your first time reading this, read through the examples that present models assuming a normal error distribution assumption. This will help you understand why we are reporting modelled effects in this way. Then, (if relevant) look at the section on the error distribution assumption you are interested in for your model.
In this section, you will learn how to report your modelled effects in numbers. In general, this section will involve answering:
What are your modelled effects (the magnitude, direction and uncertainty of your coefficient estimates)?
If you also have a categorical predictor with more than two levels, you will also want to answer:
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
Let’s examine each of these in turn:
What are your modelled effects (with uncertainty)?
Recall from the text above that your model effects describe the change in your response given a change in your predictor. These effects are given in the estimates of your coefficients.
When you have a numeric predictor, the coefficient gives the amount of change in the response that is caused by a unit change in your numeric predictor. When the coefficient is positive, you have an increase in your response with a unit change in your predictor. When the coefficient is negative, you have a decrease in your response with a unit change in your predictor. When the coefficient is zero, you have no change in your response for a unit change in your predictor, and you would conclude that there is no evidence that your predictor explains variability in your response.
For linear models, information about the effect of a numeric predictor is captured in the coefficient (slope) of the linear fit, (Expand the text box below to see why), and this effect is the same across the entire range of your predictor (the definition of a linear relationship). When you have a numeric predictor in your best-specified model, you can conclude that there is evidence that it is explaining variation in your response.
the math
Here's the model equation for a generalized linear model:
\[
g(\mu_i) = \beta_1 \cdot Pred_i + \beta_0, \qquad Resp_i \sim F(\mu_i)
\]
where
\(g()\) is the link function,
\(\mu_i\) is the mean fitted value, dependent on \(Pred_i\),
\(\beta_1\) is the coefficient of \(Pred_i\),
\(Pred_i\) is the value of your predictor,
\(\beta_0\) is the intercept,
\(Resp_i\) is the value of the response variable, and
\(F\) represents some error distribution assumption.
As above, the effect of a continuous predictor on your response is the change in your fitted value (\(g(\mu_i)\)) for a unit change in your predictor (\(Pred_i\)):
\[
\begin{align}
g(\mu|Pred+1) - g(\mu|Pred) &= (\beta_1 \cdot (Pred+1) + \beta_0) - (\beta_1 \cdot Pred + \beta_0) \\
&= \beta_1 \cdot Pred + \beta_1 + \beta_0 - \beta_1 \cdot Pred - \beta_0 \\
&=\beta_1
\end{align}
\] For linear models with a normal error distribution assumption (i.e. when you are using link = "identity"), the link function \(g()\) is simply the identity, so you can read the effects of your model directly from the model coefficients (\(\beta_1\), \(\beta_0\)). Conversely, when you have an error distribution assumption that is not normal (i.e. a link function that isn't "identity"), you will need to convert your model coefficients to quantify your modelled effects on the response scale.
This is because of the difference between the “link” scale and “response” scale for generalized linear models.
The link scale is where the model fitting happens, and where the evidence for your modelled effects is gathered.
The response scale is back in the real world - these are the actual effects in your response you would expect to see with a change in your predictor.
Recall that you can see the default link functions associated with each error distribution function with ?family:
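You can also query the link directly from a fitted model or from the family functions themselves. A small sketch (the bestMod1 object is from Example 1 above; the other calls just print the default link for each family):
family(bestMod1)$link # link used by your fitted model ("identity" for the examples here)
gaussian()$link       # "identity"
poisson()$link        # "log"
binomial()$link       # "logit"
Gamma()$link          # "inverse"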
When you have a categorical predictor, the coefficient is the intercept of the linear fit and the effect of the predictor on your response is captured in how that intercept changes as you move from one level of your category to another. So if you have a categorical predictor with 3 categories (levels), you will have three coefficients: one coefficient that describes the mean response level for one category, and two coefficients that describe how this mean changes when you change categories. A positive coefficient for level B means that there is a higher mean response value when your predictor is level B vs. level A. A negative coefficient for level B means that there is a lower mean response value when your predictor is level B vs. level A. A coefficient for level B that is zero means that there is no evidence of a difference in your response value when your predictor is level B vs. level A.
Finally, you need to report the uncertainty around your modelled effects so your peers know how much evidence there is for these modelled effects. There are a number of ways to do this. Two common ones are i) reporting standard errors (SE), which are a measure of uncertainty of the average modelled effect, and ii) reporting the 95% confidence intervals around your modelled effects, which tell you the range your modelled effects would be expected to fall into (with 95% probability) if you redid your experiment. We will focus on the 95% confidence intervals as they tend to be more generally interpretable across different types of models.
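For models like the examples here, a quick way to get 95% confidence intervals on the coefficient (link) scale is confint(). A one-line sketch using bestMod1 from Example 1 (for glm objects this profiles the likelihood, so the intervals may differ slightly from ±1.96 × SE):
confint(bestMod1) # 95% confidence intervals for each model coefficient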
The meanings behind your coefficients outlined above hold regardless of your model structure, BUT your interpretation of what the effects mean in the real world will vary based on e.g. your error distribution assumption (see the text box above for an explanation). In the examples below, you will learn how to quantify your modelled effects and uncertainty estimates for models with a normal error distribution assumption. After this, there are sections below to show you how to do the same for models with other error distribution assumptions.
For a quick summary, here is how the method depends on your modelled error distribution assumption:
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
When you have a categorical predictor in your best-specified model, you know there is evidence of an effect for at least one of your levels (categories). You will also need to determine the evidence for the effects among your factor levels: where they differ from zero and where they differ from each other. The processes outlined below can be applied to models of any structure.
Models with a normal error distribution assumption:
Example 1: Resp1 ~ Cat1 + 1
Recall that Example 1 contains only one predictor and it is categorical:
Resp1 ~ Cat1 + 1
What are your modelled effects (with uncertainty)?
In the chapter on your Starting Model, you found your coefficients in the “Estimate” column of the summary() output of your model:
coef(summary(bestMod1)) # extract the coefficients from summary()
Interpreting the coefficients from this output takes practice - especially for categorical predictors because of the way that R treats categorical predictors in regression. With R’s “dummy coding”, one level of the predictor (here Sp1) is incorporated into the intercept estimate (21) as the reference level. The other coefficients in the Estimate column represent the change in modelled response when you move from the reference level (Sp1) to another level.
The modelled response when Cat1 = Sp2 is 5.9 units higher than this reference level, or 21 + 5.9 = 26.9 units.
The modelled Resp1 when Cat1 = Sp3 is 4.5 units lower than the reference level, or 21 - 4.5 = 16.5 units.9
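As a small sketch of that arithmetic (assuming the coefficient names R creates for bestMod1, as shown in the output above), you can reconstruct the modelled mean for each level directly from coef():
cf <- coef(bestMod1)              # named vector of coefficients
cf["(Intercept)"]                 # modelled mean Resp1 for the reference level (Sp1): ~21
cf["(Intercept)"] + cf["Cat1Sp2"] # modelled mean Resp1 for Sp2: ~21 + 5.9
cf["(Intercept)"] + cf["Cat1Sp3"] # modelled mean Resp1 for Sp3: ~21 - 4.5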
So all the information you need is in this summary() output, but it is not easy to see immediately. An easier way is to use the emmeans10 package, which helps you by reporting the modelled effects directly11. In our case, you use the emmeans package to get the mean value of the response at each level of the predictor. For categorical predictors, you do this with the emmeans() function:
library(emmeans) # load the emmeans package
emmOut <- emmeans(object = bestMod1, # your model
                  specs = ~ Cat1,    # your predictor
                  type = "response") # report coefficients on the response scale
emmOut
So now you have a modelled value of your response for each level of your categorical predictor - this captures the effect of your categorical predictor on your response. You also need to report uncertainty around this effect, and you can do this by reporting the 95% confidence limits (lower.CL and upper.CL) also reported by the emmeans() function.
When Cat1 is Sp1, Resp1 is estimated to be 21 units (95% confidence interval: 18.3 to 23.6 units).
When Cat1 is Sp2, Resp1 is estimated to be 26.8 units (95% confidence interval: 24.5 to 29.2 units).
When Cat1 is Sp3, Resp1 is estimated to be 16.4 units (95% confidence interval: 13.8 to 19.1 units).
Note that this is the same information as in the summary() output, just easier to read.
Note also that two types of uncertainty are reported here. SE stands for the standard error of the estimate, and is a measure of uncertainty of the average modelled effect. The lower.CL and upper.CL columns give the 95% confidence limits of the estimate - the range the modelled (mean) effect would be expected to fall into (with a 95% probability) if you redid your experiment.
Finally, note that you can also get a quick plot of the effects by handing the emmeans() output to plot().
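For example, a minimal call (using the emmOut object created above):
plot(emmOut) # plots the modelled effect (and 95% confidence limits) for each level of Cat1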
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
Since you have a categorical predictor with more than two levels, you will need to determine the magnitude and direction of the effect of each level of Cat1 on Resp1 - i.e. are the coefficients across factor levels significantly different from one another?
To get evidence about how each level affects your response, you need to test which effects differ from one another among the categories (levels) of your categorical predictor. This is done using multiple comparisons, i.e. you will compare the modelled effect of each level of your categorical predictor vs. every other level of your categorical predictor to determine which are different from each other (called pairwise testing).
This is done by testing the null hypothesis that the modelled effects of each pair of category levels are the same, typically rejecting the null hypothesis when P < 0.05. Remember that the P-value is the probability of observing a value at least as big as the one you observed even though the null hypothesis is true. In this case, the value you are observing is the difference between the coefficients estimated for two levels of your predictor, so the P-value is the probability of getting a difference at least as big as the one you observed even though there is actually no difference between the coefficients (the null hypothesis is true).
A couple of things to note about multiple comparison testing:
Multiple comparison testing on a categorical predictor should only be done after your hypothesis testing has given you evidence that the predictor has an effect on your response. That is, you should only do a multiple comparison test on a predictor if that predictor is in your best-specified model. As this is a test done after your hypothesis testing, it is called a post-hoc test.
Multiple comparison testing can be a problem because you are essentially repeating a hypothesis test many times on the same data (i.e. are the effects of Sp1 different than those of Sp2? are the effects of Sp1 different than those of Sp3? are the effects of Sp2 different than those of Sp3?…). These repeated tests mean there is a high chance that you will find a difference in one test purely due to random chance, vs. due to there being an actual difference. The multiple comparison tests you will perform have been formulated to correct for this increased error.
Multiple comparison testing is very simple with the emmeans package. It just requires you to hand the output from emmeans() to a function called pairs():
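A call along these lines (using the emmOut object created above, with the fdr adjustment that matches the output below) gives you the pairwise tests:
pairs(emmOut, # the emmeans() output
      adjust = "fdr") # multiple comparison adjustment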
contrast estimate SE df t.ratio p.value
Sp1 - Sp2 -5.88 1.77 97 -3.313 0.0019
Sp1 - Sp3 4.51 1.86 97 2.423 0.0173
Sp2 - Sp3 10.39 1.77 97 5.856 <.0001
P value adjustment: fdr method for 3 tests
The output shows the results of the multiple comparison (pairwise) testing. The values in the p.value column tell you the results of the hypothesis test comparing the coefficients between the two levels of your categorical predictor. For example, for Sp1 vs. Sp2, there is a 0.19% (P = 0.0019) probability of getting a difference in coefficients at least as big as 26.8 - 21 = 5.8 even though the null hypothesis (no difference) is true. This value is small enough that you can say you have evidence that the effect of Cat1 on Resp1 is different for these two levels (Sp1 and Sp2).
Based on a threshold p-value of 0.05, we can see that:
There is evidence that the value of Resp1 when Cat1 is Sp1 is different (lower) than that when Cat1 is Sp2 as P = 0.0019 is less than P = 0.05.
There is a little evidence that the value of Resp1 when Cat1 is Sp1 is different (higher) than that when Cat1 is Sp3, as P = 0.0173 is less than P = 0.05 (but it is close!).
There is strong evidence that the value of Resp1 when Cat1 is Sp2 is different (higher) than that when Cat1 is Sp3, as P < 0.0001 is smaller than P = 0.05.
Note that the P-values have been adjusted via the “false discovery rate” (fdr) method (Verhoeven, Simonsen, and McIntyre 2005), which corrects for the fact that we are making multiple comparisons.
Note that you can also get the results from the pairwise testing visually by handing the output of pairs() to plot().
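For example, a minimal sketch (again using the emmOut object created above):
plot(pairs(emmOut, adjust = "fdr")) # plots each pairwise difference with its 95% confidence limits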
Note that the difference between Resp1 when Cat1 is Sp1 vs. Sp3 comes close to overlapping zero. This is reflected in the comparatively high P-value (P = 0.0173).
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Example 2: Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Recall that Example 2 contains two predictors and both are categorical:
Resp2 ~ Cat2 + Cat3 + Cat2:Cat3 + 1
What are your modelled effects (with uncertainty)?
Again, for categorical predictors, there is a coefficient estimated for each level of the predictor. If there is one or more interactions among predictors, there will be one coefficient for each combination of levels among predictors. Let’s look at the summary() output of your model to understand better:
coef(summary(bestMod2)) # extract the coefficients from summary()
Comparing this output to that of Example 1 shows many more estimated coefficients for Example 2. This is because we have one coefficient for each level of each predictor, as well as coefficients for each combination (the interaction term) of levels of the predictors.
Again, it takes a little practice to read the coefficients from the summary() output. For Example 2:
The modelled prediction for Resp2 when Cat2 is TypeA and Cat3 is Treat1 is 365 units (the intercept).
The modelled prediction for Resp2 when Cat2 is TypeB and Cat3 is Treat1 is 365 + 37 = 402 units.
The modelled prediction for Resp2 when Cat2 is TypeA and Cat3 is Treat2 is 365 - 44 = 321 units.
The modelled prediction for Resp2 when Cat2 is TypeC and Cat3 is Treat2 is 365 - 76 - 44 + 74 = 319 units.
As above, we can use the emmeans package to more easily see these coefficients:
emmOut <- emmeans(object = bestMod2, # your model
                  specs = ~ Cat2 + Cat3 + Cat2:Cat3, # your predictors
                  type = "response") # report coefficients on the response scale
emmOut
This is a hard table to navigate as every possible combination of the levels across predictors is compared. This may not be what you want. Instead, you can look for differences of one predictor based on a value of another. For example:
forComp <- pairs(emmOut, # emmeans output
                 adjust = 'fdr', # multiple comparison adjustment
                 simple = "Cat2") # contrasts within this categorical predictor
forComp
Here you can more easily see the contrasts, and the adjustment to the P-value will be limited to what you actually need. Based on a threshold P-value of 0.05, you can see that the effects at some combinations of your predictors are not statistically different from each other.
For example, the coefficient (predicted Resp2) when Cat3 = Treat1 and Cat2 = TypeA is 365 and this is 37 lower than the coefficient when Cat3 = Treat1 and Cat2 = TypeB, but there is no evidence that the difference has any meaning as P = 0.26 for this comparison.
On the other hand, some combinations of your predictors are statistically different from each other. For example, a comparison of modelled Resp2 when Cat3 = Control and Cat2 = TypeB (548) vs. Cat3 = Control and Cat2 = TypeD (215) shows that they differ by 332 and that there is strong evidence that this difference is meaningful as P < 0.0001.
Note that you could also organize this with contrasts by Cat3 instead:
forComp <- pairs(emmOut, # emmeans output
                 adjust = 'fdr', # multiple comparison adjustment
                 simple = "Cat3") # contrasts within this categorical predictor
forComp
Cat2 = TypeA:
contrast estimate SE df t.ratio p.value
Treat1 - Treat2 43.630 36.4 88 1.198 0.2341
Treat1 - Control -64.116 30.3 88 -2.116 0.0557
Treat2 - Control -107.745 31.9 88 -3.374 0.0033
Cat2 = TypeB:
contrast estimate SE df t.ratio p.value
Treat1 - Treat2 0.921 39.3 88 0.023 0.9814
Treat1 - Control -145.366 30.9 88 -4.711 <.0001
Treat2 - Control -146.287 39.3 88 -3.719 0.0005
Cat2 = TypeC:
contrast estimate SE df t.ratio p.value
Treat1 - Treat2 -29.979 40.1 88 -0.748 0.6799
Treat1 - Control -45.038 28.1 88 -1.605 0.3363
Treat2 - Control -15.060 36.4 88 -0.414 0.6799
Cat2 = TypeD:
contrast estimate SE df t.ratio p.value
Treat1 - Treat2 -22.391 41.4 88 -0.541 0.8276
Treat1 - Control 7.609 34.8 88 0.218 0.8276
Treat2 - Control 30.000 34.8 88 0.861 0.8276
P value adjustment: fdr method for 3 tests
Note that you can also get the results from the pairwise testing visually by handing the output of pairs() to plot().
Example 3: Resp3 ~ Cont4 + 1
Example 3: Resp3 ~ Cont4 + 1
Recall that Example 3 contains one predictor and the predictor is continuous:
Resp3 ~ Cont4 + 1
What are your modelled effects (with uncertainty)?
For a continuous predictor, the effect is captured in one coefficient that describes the change in the response for a unit change in the continuous predictor. For models with a normal error distribution assumption, this is communicated as the slope.
Let’s look at the summary() output for Example 3:
coef(summary(bestMod3)) # extract the coefficients from summary()
Estimate Std. Error t value Pr(>|t|)
(Intercept) -226.9131 166.09360 -1.366176 1.750105e-01
Cont4 260.0574 14.56593 17.853818 1.438227e-32
This tells you that for a unit change in Cont4, Resp3 increases by 260 units (± a standard error of 15 units).
Note that this same information is in the emmeans package. Because Cont4 is a continuous predictor, you need the emtrends() function, which provides the 95% confidence interval on the estimate:
trendsOut <- emtrends(bestMod3, # your model
                      specs = ~ Cont4 + 1, # your predictors
                      var = "Cont4") # your continuous predictor
trendsOut
Getting a sense for coefficients - normal error distribution assumption
Here’s a little thought experiment that will help you get a better sense for your model coefficients.
First, a visual reporting of the modelled effects:
library(visreg) # load the visreg package
library(ggplot2) # load the ggplot2 package
visreg(bestMod3, # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cont4", # predictor on x-axis
       #by = ..., # if you want to include a 2nd predictor plotted as colour
       #breaks = ..., # if you want to control how the colour predictor is plotted
       #cond = ..., # if you want to include a 3rd predictor
       #overlay = TRUE, # to plot as overlay or panels when there is a colour predictor
       #rug = FALSE, # to turn off the rug. The rug shows you where you have positive (appearing on the top axis) and negative (appearing on the bottom axis) residuals
       gg = TRUE) + # to plot as a ggplot; with ggplot, you will have more control over the look of the plot
  geom_point(data = myDat3,
             mapping = aes(x = Cont4, y = Resp3)) +
  ylab("Response, (units)") + # change y-axis label
  xlab("Cont4, (units)") + # change x-axis label
  theme_bw() # change ggplot theme
If we want to quantitatively report these effects, we are looking for one number that describes how much Resp3 changes for a unit change in Cont4. We can explore this with the predict() function.
First, pick a value of Cont4 at which to make an estimate of Resp3:
t1 <- 1 # put any number here
The Resp3 estimated for that t1 is:
## Step One: Define the predictor values for your response prediction
forCont4.1 <- data.frame(Cont4 = t1)
## Step Two: Make your response prediction
myPred.1 <- predict(object = bestMod3, # your model
                    newdata = forCont4.1, # the values of the predictors at which to make the prediction
                    type = "response", # to make the prediction on the response scale. IMPORTANT if you have a non-normal error distribution assumption
                    se.fit = TRUE) # to include a measure of uncertainty around the prediction
myPred.1$fit
1
33.14433
Now let’s find Resp3 estimated for t2 = t1 + 1 (i.e. one unit more):
t2 <- t1 + 1
## Step One: Define the predictor values for your response prediction
forCont4.2 <- data.frame(Cont4 = t2)
## Step Two: Make your response prediction
myPred.2 <- predict(object = bestMod3, # your model
                    newdata = forCont4.2, # the values of the predictors at which to make the prediction
                    type = "response", # to make the prediction on the response scale. IMPORTANT if you have a non-normal error distribution assumption
                    se.fit = TRUE) # to include a measure of uncertainty around the prediction
myPred.2$fit
1
293.2017
The effect of Cont4 on Resp3 can be expressed as the difference between the two numbers:
# Effect of Cont4 on Resp is
myPred.2$fit - myPred.1$fit
1
260.0574
If you tried this for a number of different t1 values you would notice that the effect is constant.
This slope describes the absolute change in Resp3 from a one unit change in Cont4.
Compare this to your modelled coefficients:
summary(bestMod3)
Call:
glm(formula = Resp3 ~ Cont4 + 1, family = gaussian(link = "identity"),
data = myDat3)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -226.91 166.09 -1.366 0.175
Cont4 260.06 14.57 17.854 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 662598.5)
Null deviance: 276143773 on 99 degrees of freedom
Residual deviance: 64934654 on 98 degrees of freedom
AIC: 1628.2
Number of Fisher Scoring iterations: 2
So for a normal error distribution assumption the effect of a continuous predictor on the response is the absolute change in the response for a unit change in the predictor. And this can be read directly from the model coefficients as the slope of the model fit (\(\beta_1\)).
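For example, you can read this slope straight from the fitted model object (bestMod3, used above):
coef(bestMod3)["Cont4"] # the slope, beta_1: ~ 260 units of Resp3 per unit change in Cont4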
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
This question only applies to categorical predictors, of which there are none in Example 3.
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
Example 4: Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
Recall that Example 4 contains two predictors, one is categorical and one is continuous:
Resp4 ~ Cat5 + Cont6 + Cat5:Cont6 + 1
What are your modelled effects (with uncertainty)?
The effects estimated for this model will include values of the response (Resp4) at each level of the categorical predictor (Cat5) as well as slopes describing the change in the response when the continuous predictor (Cont6) changes, and a different slope will be estimated for each level in Cat5 (this is because of the interaction in the model).
As above, you can take a look at your modelled coefficients with the summary() output:
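A minimal call (its numeric output is not reproduced here; bestMod4 is the model object used in the emmeans() code below):
coef(summary(bestMod4)) # extract the coefficients from summary()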
The model prediction of Resp4 when Cat5 is Farm and Cont6 is 0 is 98.34 units (the intercept).
The model prediction of Resp4 when Cat5 is Urban and Cont6 is 0 is 98.34 - 0.24 = 98.1 units.
The model prediction of Resp4 when Cat5 is Wild and Cont6 is 0 is 98.34 - 0.87 = 97.5 units.
The slope of the relationship between Cont6 and Resp4 when Cat5 is Farm is -0.0029.
The slope of the relationship between Cont6 and Resp4 when Cat5 is Urban is -0.0029 + 0.015 = 0.0121.
The slope of the relationship between Cont6 and Resp4 when Cat5 is Wild is -0.0029 + 0.031 = 0.0281.
Again, interpreting the coefficients from the summary() output is tedious and not necessary: You can use the emmeans package to give you the modelled response for each level of the categorical predictor (Cat5) directly:
emmOut <- emmeans(object = bestMod4, # your model
                  specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                  type = "response") # report coefficients on the response scale
emmOut
Note that emmeans() sets our continuous predictor (Cont6) to the mean value of Cont6 (510 units). We can also control this with the at = argument:
emmOut <- emmeans(object = bestMod4, # your model
                  specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                  type = "response", # report coefficients on the response scale
                  at = list(Cont6 = 0)) # control the value of your continuous predictor at which to make the coefficient estimates
emmOut
By setting Cont6 = 0 in the at argument, you get the intercepts - i.e. the modelled Resp4 when Cont6 = 0 for each level of Cat5, which is what is reported in the summary() output.
Similarly, you can get the emmeans package to give you the slope coefficients for the continuous predictor (Cont6) using the emtrends() function, rather than interpreting them from the summary() output:
trendsOut <- emtrends(bestMod4, # your model
                      specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                      var = "Cont6") # your continuous predictor
trendsOut
there is no evidence of an effect of Cont6 on Resp4 when Cat5 = Farm (P = 0.44).
there is evidence for a positive effect of Cont6 on Resp4 when Cat5 = Urban. The effect is 0.012, with a 95% confidence interval of 0.0039 to 0.0205.
there is evidence for a positive effect of Cont6 on Resp4 when Cat5 = Wild. The effect is 0.028, with a 95% confidence interval of 0.0182 to 0.0377.
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
Since you have a categorical predictor with more than two levels, you will want to report where the effects among levels differ from one another.
Note that the effects in Example 4 include slope coefficients (associated with the continuous predictor) and intercept coefficients (associated with the categorical predictor). Because the slope and intercept are dependent on one another (when the slope changes, the intercept will naturally change), you need to first check whether the slope coefficients vary across your categorical levels. Only if they don’t vary (i.e. the modelled effects produce lines that are parallel across your categorical levels) would you go on to test whether the intercept coefficients vary across your categorical levels.
You can find out which slopes (i.e. the effect of Cont6 on Resp4) are different across the levels of Cat5 using emtrends() with a request for pairwise testing:
forCompSlope <- pairs(trendsOut, # emmeans object
                      adjust = "fdr", # multiple comparison adjustment
                      simple = "Cat5") # contrasts within this categorical predictor
forCompSlope
Here you see that the slope estimates (the effect of Cont6 on Resp4) for all levels of Cat5 are significantly different from one another (0.0001 ≤ P ≤ 0.017).
Note that you can also get the results from the pairwise testing visually by handing the output of pairs() to plot().
Again: only if all the slopes were similar would you go on to test whether the levels of your categorical predictor (Cat5) have significantly different coefficients (intercepts) from each other. You would do this with pairs() on the output from emmeans() (the emmOut object), as sketched below.
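As a minimal sketch of that follow-up test (only run this if the slopes had been similar; it uses the emmOut object created above):
pairs(emmOut, # emmeans output
      adjust = "fdr", # multiple comparison adjustment
      simple = "Cat5") # contrasts within this categorical predictor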
Models with a poisson error distribution assumption:
Background
For models with a poisson error distribution assumption, you will typically start with a link function that is the natural logarithm or link = "log". You need to take this link function into consideration when you report your modelled effects.
If you are using link = "log" (as with a poisson error distribution, and sometimes also a Gamma error distribution - more on this to come), you get your coefficient on the response scale by taking e to the coefficient on the link scale (with the R function exp()). This coefficient is called the rate ratio and it tells you the % change in the response for a unit change in your predictor.
For a continuous predictor, the effect is captured in one coefficient that describes the change in the response for a unit change in the continuous predictor.
For models with a normal error distribution assumption, this was simply a slope.
For models using a log link (including many models with a poisson error distribution assumption), this is communicated as the rate ratio or the % change in the response for a unit change in your predictor. This allows you to communicate the effect of the continuous predictor while accounting for the curve in the relationship (see visualization of model effects above). This curve occurs because the model is not allowed to go below zero to be consistent with a poisson error distribution assumption.
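As a minimal sketch of this conversion (beta1 here is a placeholder value, not a coefficient from the examples in this chapter):
beta1 <- 0.2 # a slope estimated on the link (log) scale
RR <- exp(beta1) # rate ratio on the response scale, ~ 1.22
(RR - 1) * 100 # ~ 22% increase in the response per unit change in the predictor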
the math
But why do we convert coefficients from the link to response scale with \(e^x\) when link = "log"?
A reminder that we can present our linear model like this:
\[
\begin{align}
g(\mu_i) &= \beta_1 \cdot Pred_i + \beta_0 \\
Resp_i &\sim F(\mu_i)
\end{align}
\]
where
\(g()\) is the link function
\(\mu_i\) is the mean fitted value, dependent on \(Pred_i\)
\(\beta_1\) is the coefficient of \(Pred_i\)
\(Pred_i\) is the value of your predictor
\(\beta_0\) is the intercept
\(Resp_i\) is the value of the response variable
\(F\) represents some error distribution assumption
If you have a poisson error distribution assumption (\(F\) = poisson), the link function \(g()\) is \(log_e\):
\[
log_e(\mu_i) = \beta_1 \cdot Pred_i + \beta_0
\]
The emmeans() function will convert the coefficients (intercepts) from the link to the response scale. You can ask for this with the type = "response" argument.
In contrast, the emtrends() function does not convert the coefficients (slopes) to represent effects on the response scale. This is because emtrends() is reporting the slope of a straight line - the trend line on the link scale. But the line isn’t straight on the response scale.
We need to convert from the link to the response scale by hand when we use emtrends() (remember, the coefficients for categorical predictors are also being converted from the link to the response scale - it is just that R is doing it for us in the emmeans() function when we add the argument type = "response"). How we make the conversion, and interpret the result, will depend on our error distribution assumption.
Here we will go through examples of reporting for models with a poisson error distribution assumption. If you want more details on why you are using the methods below, check the section on “Models with a normal error distribution assumption” (Section 5.2.3) for context.
Examples
Here I have made new examples where the error distribution assumption is poisson. This changes the response variables, indicated by Resp1.p, Resp2.p, etc. to be integers ≥ 0.
Example 1: Resp1.p ~ Cat1 + 1
Example 1: Resp1.p ~ Cat1 + 1
Recall that Example 1 contains only one predictor and it is categorical:
Resp1.p ~ Cat1 + 1
The best-specified model (now with poisson error distribution assumption) is:
bestMod1.p
Call: glm(formula = Resp1.p ~ Cat1 + 1, family = poisson(link = "log"),
data = Dat)
Coefficients:
(Intercept) Cat1StB Cat1StC
6.15524 -0.04869 0.09948
Degrees of Freedom: 49 Total (i.e. Null); 47 Residual
Null Deviance: 139
Residual Deviance: 40.01 AIC: 446
and the dredge() table you used to pick your bestMod1.p is in dredgeOut1.p
dredgeOut1.p
Global model call: glm(formula = Resp1.p ~ Cat1 + 1, family = poisson(link = "log"),
data = Dat)
---
Model selection table
(Intrc) Cat1 df logLik AICc delta weight
2 6.155 + 3 -219.983 446.5 0.00 1
1 6.164 1 -269.477 541.0 94.55 0
Models ranked by AICc(x)
And here’s a visualization of the model effects:
# Set up your predictors for the visualized fit
forCat1 <- unique(myDat1.p$Cat1) # every value of your categorical predictor
# create a data frame with your predictors
forVis <- expand.grid(Cat1 = forCat1) # expand.grid() function makes sure you have all combinations of predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod1.p, # the model
                  newdata = forVis, # the predictor values
                  type = "link", # here, make the predictions on the link scale and we'll translate them back below
                  se.fit = TRUE) # include uncertainty estimate
ilink <- family(bestMod1.p)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit) # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
library(ggplot2)
ggplot() + # start ggplot
  geom_point(data = myDat1.p, # add observations to your plot
             mapping = aes(x = Cat1, y = Resp1.p),
             position = position_jitter(width = 0.1)) + # control position of data points so they are easier to see on the plot
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat1, y = Fit, ymin = Lower, ymax = Upper),
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis, # add the modelled fit to your plot
             mapping = aes(x = Cat1, y = Fit),
             shape = 22, # shape of point
             size = 3, # size of point
             fill = "white", # fill colour for point
             col = 'black') + # outline colour for point
  ylab("Resp1.p, (units)") + # change y-axis label
  xlab("Cat1, (units)") + # change x-axis label
  theme_bw() # change ggplot theme
What are your modelled effects (with uncertainty)?
For categorical predictors, you can get the coefficients on the response scale directly with the emmeans() function when you specify type = "response":
library(emmeans) # load the emmeans package
emmOut <- emmeans(object = bestMod1.p, # your model
                  specs = ~ Cat1 + 1, # your predictors
                  type = "response") # report coefficients on the response scale
emmOut
Cat1 rate SE df asymp.LCL asymp.UCL
StA 471 6.54 Inf 459 484
StB 449 4.32 Inf 440 457
StC 520 5.89 Inf 509 532
Confidence level used: 0.95
Intervals are back-transformed from the log scale
Notice that the column reported is “rate”: these are the coefficients converted back onto the response scale.
Note also that two types of uncertainty are reported here. SE stands for the standard error of the estimate, and is a measure of uncertainty of the average modelled effect. The asymp.LCL and asymp.UCL columns give the 95% confidence limits of the estimate - the range the modelled (mean) effect would be expected to fall into (with a 95% probability) if you redid your experiment.
From this output you see:
When Cat1 is StA, Resp1.p is estimated to be 471 (95% confidence interval: 459 to 484 units).
When Cat1 is StB, Resp1.p is estimated to be 449 (95% confidence interval: 440 to 457 units).
When Cat1 is StC, Resp1.p is estimated to be 520 (95% confidence interval: 509 to 532 units).
Finally, note that you can also get a quick plot of the effects by handing the emmeans() output to plot().
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
Here you can compare the effects of the levels of Cat1 on Resp1.p to see if the effects differ between the levels.
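As in Example 1, a call along these lines (using the emmOut object for bestMod1.p created above) gives you the pairwise tests shown below:
pairs(emmOut, # the emmeans() output
      adjust = "fdr") # multiple comparison adjustment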
contrast ratio SE df null z.ratio p.value
StA / StB 1.050 0.0177 Inf 1 2.880 0.0040
StA / StC 0.905 0.0162 Inf 1 -5.552 <.0001
StB / StC 0.862 0.0128 Inf 1 -9.968 <.0001
P value adjustment: fdr method for 3 tests
Tests are performed on the log scale
The output shows the results of the multiple comparison (pairwise) testing.
The values in the p.value column tell you the results of the hypothesis test comparing the coefficients between the two levels of your categorical predictor.
For example:
there is evidence that Resp1.p is significantly larger when Cat1 = StA than when Cat1 = StB (P = 0.004). Resp1.p is expected to be ~ 5 ± 2% bigger when Cat1 = StA than when Cat1 = StB.
there is evidence that Resp1.p is significantly smaller when Cat1 = StA than when Cat1 = StC (P < 0.0001). Resp1.p is expected to be ~ 9 ± 2% smaller when Cat1 = StA than when Cat1 = StC.
Note that you can also get the results from the pairwise testing visually by handing the output of pairs() to plot().
Example 2: Resp2.p ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Example 2: Resp2.p ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Recall that Example 2 contains two predictors and both are categorical:
Resp2.p ~ Cat2 + Cat3 + Cat2:Cat3 + 1
The best-specified model (now with poisson error distribution assumption) is:
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat2 <- unique(myDat2.p$Cat2) # every level of your categorical predictor
forCat3 <- unique(myDat2.p$Cat3) # every level of your categorical predictor
# create a data frame with your predictors
forVis <- expand.grid(Cat2 = forCat2, Cat3 = forCat3) # expand.grid() function makes sure you have all combinations of predictors
#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictor
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod2.p, # the model
                  newdata = forVis, # the predictor values
                  type = "link", # here, make the predictions on the link scale and we'll translate them back below
                  se.fit = TRUE) # include uncertainty estimate
ilink <- family(bestMod2.p)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit) # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() + # start ggplot
  geom_point(data = myDat2.p, # add observations to your plot
             mapping = aes(x = Cat2, y = Resp2.p, col = Cat3),
             position = position_jitterdodge(jitter.width = 0.75, dodge.width = 0.75)) + # control position of data points so they are easier to see on the plot
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat2, y = Fit, ymin = Lower, ymax = Upper, col = Cat3),
                position = position_dodge(width = 0.75), # control position of error bars so they are easier to see on the plot
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis, # add the modelled fit to your plot
             mapping = aes(x = Cat2, y = Fit, fill = Cat3),
             shape = 22, # shape of point
             size = 3, # size of point
             col = 'black', # outline colour for point
             position = position_dodge(width = 0.75)) + # control position of points so they are easier to see on the plot
  ylab("Resp2.p, (units)") + # change y-axis label
  xlab("Cat2, (units)") + # change x-axis label
  labs(fill = "Cat3, (units)", col = "Cat3, (units)") + # change legend title
  theme_bw() # change ggplot theme
What are your modelled effects (with uncertainty)?
As above, we can use the emmeans package to see the effects of the predictors on the scale of the response:
emmOut <- emmeans(object = bestMod2.p, # your model
                  specs = ~ Cat2 + Cat3 + Cat2:Cat3, # your predictors
                  type = "response") # report coefficients on the response scale
emmOut
Cat2 Cat3 rate SE df asymp.LCL asymp.UCL
TypeA Control 15.20 1.740 Inf 12.14 19.03
TypeB Control 10.86 1.250 Inf 8.67 13.59
TypeC Control 9.67 1.800 Inf 6.72 13.91
TypeD Control 4.25 1.030 Inf 2.64 6.84
TypeA Treat 10.38 0.894 Inf 8.77 12.29
TypeB Treat 4.00 0.707 Inf 2.83 5.66
TypeC Treat 3.62 0.673 Inf 2.52 5.22
TypeD Treat 4.50 1.500 Inf 2.34 8.65
Confidence level used: 0.95
Intervals are back-transformed from the log scale
So now we have a modelled value of our response for each level of our categorical predictors. For example:
The modelled prediction for Resp2.p when Cat2 is TypeA and Cat3 is Control is 15.2 units (95% confidence interval: 12.1 - 19.0 units).
The modelled prediction for Resp2.p when Cat2 is TypeC and Cat3 is Control is 9.7 units (95% confidence interval: 6.7 - 13.9 units).
Finally, note that you can also get a quick plot of the effects by handing the emmeans() output to plot().
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
As with Example 1, you can also find out which combinations of predictor levels are leading to statistically different model predictions in Resp2.p:
forComp <- pairs(emmOut, # emmeans output
                 adjust = "fdr", # multiple testing adjustment
                 simple = "Cat2") # contrasts within this categorical predictor
forComp
Here you can more easily see the contrasts. Contrast ratios similar to 1 mean that the predictor levels lead to a similar value of the response.
Based on a threshold P-value of 0.05, you can see that the effects at some combinations of your predictors are not statistically different from each other. For example, the coefficient (predicted Resp2.p) when Cat3 = Control and Cat2 = TypeB (10.9) is 1.12 times (or 12%) more than the predicted Resp2.p when Cat3 = Control and Cat2 = TypeC (9.7), but there is no evidence that the difference has any meaning as P = 0.59 for this comparison.
On the other hand, some combinations of your predictors are statistically different from each other. For example, the modelled Resp2.p when Cat3 = Control and Cat2 = TypeB (10.9) is 2.6 times (or 160%) higher than when Cat3 = Control and Cat2 = TypeD (4.2), and there is strong evidence that this difference is meaningful as P < 0.0014.
Note that you can also get the results from the pairwise testing visually by handing the output of pairs() to plot().
Example 3: Resp3.p ~ Cont4 + 1
Example 3: Resp3.p ~ Cont4 + 1
Recall that Example 3 contains one predictor and the predictor is continuous:
Resp3.p ~ Cont4 + 1
The best-specified model (now with poisson error distribution assumption) is:
bestMod3.p
Call: glm(formula = Resp ~ Cont4 + 1, family = poisson(link = "log"),
data = myDat3.p)
Coefficients:
(Intercept) Cont4
1.577204 0.006674
Degrees of Freedom: 49 Total (i.e. Null); 48 Residual
Null Deviance: 74.74
Residual Deviance: 62.96 AIC: 256.7
that was fit to data in myDat3.p:
str(myDat3.p)
'data.frame': 50 obs. of 2 variables:
$ Cont4 : num 67.5 61.9 71.5 116.6 55.9 ...
$ Resp3.p: int 12 8 4 12 0 2 4 10 8 7 ...
and the dredge() table you used to pick your bestMod3.p is
dredgeOut3.p
Global model call: glm(formula = Resp ~ Cont4 + 1, family = poisson(link = "log"),
data = myDat3.p)
---
Model selection table
(Intrc) Cont4 df logLik AICc delta weight
2 1.577 0.006674 2 -126.339 256.9 0.00 0.992
1 2.069 1 -132.230 266.5 9.61 0.008
Models ranked by AICc(x)
And here’s a visualization of the model effects:
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCont4 <- seq(from = min(myDat3.p$Cont4), to = max(myDat3.p$Cont4), by = 1) # a sequence of numbers in your continuous predictor range
# create a data frame with your predictors
forVis <- expand.grid(Cont4 = forCont4) # expand.grid() function makes sure you have all combinations of predictors
#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictor
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod3.p, # the model
                  newdata = forVis, # the predictor values
                  type = "link", # here, make the predictions on the link scale and we'll translate them back below
                  se.fit = TRUE) # include uncertainty estimate
ilink <- family(bestMod3.p)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit) # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() + # start ggplot
  geom_point(data = myDat3.p, # add observations to your plot
             mapping = aes(x = Cont4, y = Resp3.p)) +
  geom_line(data = forVis, # add the modelled fit to your plot
            mapping = aes(x = Cont4, y = Fit),
            linewidth = 1.2) + # control thickness of line
  geom_line(data = forVis, # add uncertainty to your plot (upper line)
            mapping = aes(x = Cont4, y = Upper),
            linewidth = 0.8, # control thickness of line
            linetype = 2) + # control style of line
  geom_line(data = forVis, # add uncertainty to your plot (lower line)
            mapping = aes(x = Cont4, y = Lower),
            linewidth = 0.8, # control thickness of line
            linetype = 2) + # control style of line
  ylab("Resp3.p, (units)") + # change y-axis label
  xlab("Cont4, (units)") + # change x-axis label
  theme_bw() # change ggplot theme
What are your modelled effects (with uncertainty)?
trendsOut <- emtrends(bestMod3.p,
                      specs = ~ Cont4 + 1, # your predictors
                      var = "Cont4") # your predictor of interest
trendsOut
Note that these are the same estimates as we find in the summary() of your model object
coef(summary(bestMod3.p))
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.577203770 0.155306248 10.155443 3.133832e-24
Cont4 0.006674037 0.001934204 3.450534 5.594789e-04
This is an estimate of the modelled effect on the link scale. We need to take our log link function into account to report the modelled effect on the response scale.
Since you have a poisson error distribution assumption, you can convert the estimate made by emtrends() on to the response scale with:
trendsOut.df <- data.frame(trendsOut) # convert trendsOut to a dataframe
RR <- exp(trendsOut.df$Cont4.trend) # get the coefficient on the response scale
RR
[1] 1.006696
We can also use the same conversion on the confidence limits of the modelled effect, for the upper:
RR.up <- exp(trendsOut.df$asymp.UCL) # get the upper confidence interval on the response scale
RR.up
[1] 1.01052
and lower 95% confidence level:
RR.low <- exp(trendsOut.df$asymp.LCL) # get the lower confidence interval on the response scale
RR.low
[1] 1.002887
This tells you that for a unit change in Cont4, Resp3.p increases by 1.007 times (95% confidence interval: 1.003 to 1.011), which is an increase of 0.67%.
Getting a sense for coefficients - poisson error distribution assumption
Here’s a little thought experiment that will help you get a better sense for your model coefficients when modelling with a poisson error distribution.
First, a visual reporting of the modelled effects:
visreg(bestMod3.p, # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cont4", # predictor on x-axis
       #by = ..., # if you want to include a 2nd predictor plotted as colour
       #breaks = ..., # if you want to control how the colour predictor is plotted
       #cond = ..., # if you want to include a 3rd predictor
       #overlay = TRUE, # to plot as overlay or panels when there is a colour predictor
       #rug = FALSE, # to turn off the rug. The rug shows you where you have positive (appearing on the top axis) and negative (appearing on the bottom axis) residuals
       gg = TRUE) + # to plot as a ggplot; with ggplot, you will have more control over the look of the plot
  geom_point(data = myDat3.p,
             mapping = aes(x = Cont4, y = Resp3.p)) +
  ylab("Response, (units)") + # change y-axis label
  xlab("Cont4, (units)") + # change x-axis label
  ylim(0, 20) +
  theme_bw() # change ggplot theme
If we want to quantitatively report these effects, we are looking for one number that describes how much Resp3 changes for a unit change in Cont4. We can explore this with the predict() function.
First, pick a value of Cont4 at which to make an estimate of Resp3:
t1 <- 10 # put any number here
The Resp3 estimated for that t1 is:
## Step One: Define the predictor values for your response prediction
forCont4.1 <- data.frame(Cont4 = t1)
## Step Two: Make your response prediction
myPred.1 <- predict(object = bestMod3.p, # your model
                    newdata = forCont4.1, # the values of the predictors at which to make the prediction
                    type = "response", # to make the prediction on the response scale. IMPORTANT if you have a non-normal error distribution assumption
                    se.fit = TRUE) # to include a measure of uncertainty around the prediction
myPred.1$fit
1
5.175542
Now let’s find Resp3 estimated for t2 = t1 + 1 (i.e. one unit more):
t2 <- t1 + 1
## Step One: Define the predictor values for your response prediction
forCont4.2 <- data.frame(Cont4 = t2)
## Step Two: Make your response prediction
myPred.2 <- predict(object = bestMod3.p, # your model
                    newdata = forCont4.2, # the values of the predictors at which to make the prediction
                    type = "response", # to make the prediction on the response scale. IMPORTANT if you have a non-normal error distribution assumption
                    se.fit = TRUE) # to include a measure of uncertainty around the prediction
myPred.2$fit
1
5.2102
The effect of Cont4 on Resp3 can be expressed as the difference between the two numbers:
# Effect of Cont4 on Resp is
myPred.2$fit - myPred.1$fit
1
0.03465728
If you try this for a number of different t1 values, you’ll notice that the number isn’t stable but varies with the values of t1 and t2 you choose.
Instead, we can express the effect of Cont4 on Resp3 as the ratio of the two numbers (called the rate ratio):
myPred.2$fit/myPred.1$fit
1
1.006696
This rate ratio (RR) describes the % change in Resp3.p from a one unit change in Cont4, and it is stable whatever t1 and t2 values we choose.
Compare this rate ratio to your modelled coefficients:
summary(bestMod3.p)
Call:
glm(formula = Resp ~ Cont4 + 1, family = poisson(link = "log"),
data = myDat3.p)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.577204 0.155306 10.155 < 2e-16 ***
Cont4 0.006674 0.001934 3.451 0.000559 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 74.743 on 49 degrees of freedom
Residual deviance: 62.962 on 48 degrees of freedom
AIC: 256.68
Number of Fisher Scoring iterations: 4
coef(summary(bestMod3.p))["Cont4", "Estimate"]
[1] 0.006674037
and note that the rate ratio is the same as exp(coef):
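For example, using the coefficient extracted just above:
exp(coef(summary(bestMod3.p))["Cont4", "Estimate"]) # = 1.006696, the same rate ratio reported above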
So for a poisson error distribution assumption (or Gamma with log link) the effect of a continuous predictor on the response is described as the rate ratio (RR). This describes the % change in the response for a unit change in the predictor. This is estimated by raising \(e\) to the coefficient: \(e^{\beta_1}\)
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
This question only applies to categorical predictors, of which there are none in Example 3.
Example 4: Resp4.p ~ Cat5 + Cont6 + Cat5:Cont6 + 1
And here’s a visualization of the model effects:
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat5 <- unique(myDat4.p$Cat5) # all levels of your categorical predictor
forCont6 <- seq(from = min(myDat4.p$Cont6), to = max(myDat4.p$Cont6), by = 1) # a sequence of numbers in your continuous predictor range
# create a data frame with your predictors
forVis <- expand.grid(Cat5 = forCat5, Cont6 = forCont6) # expand.grid() function makes sure you have all combinations of predictors
#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictor
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod4.p, # the model
                  newdata = forVis, # the predictor values
                  type = "link", # here, make the predictions on the link scale and we'll translate them back below
                  se.fit = TRUE) # include uncertainty estimate
ilink <- family(bestMod4.p)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit) # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() + # start ggplot
  geom_point(data = myDat4.p, # add observations to your plot
             mapping = aes(x = Cont6, y = Resp4.p, col = Cat5)) +
  geom_line(data = forVis, # add the modelled fit to your plot
            mapping = aes(x = Cont6, y = Fit, col = Cat5),
            linewidth = 1.2) + # control thickness of line
  geom_line(data = forVis, # add uncertainty to your plot (upper line)
            mapping = aes(x = Cont6, y = Upper, col = Cat5),
            linewidth = 0.4, # control thickness of line
            linetype = 2) + # control style of line
  geom_line(data = forVis, # add uncertainty to your plot (lower line)
            mapping = aes(x = Cont6, y = Lower, col = Cat5),
            linewidth = 0.4, # control thickness of line
            linetype = 2) + # control style of line
  ylab("Resp4.p, (units)") + # change y-axis label
  xlab("Cont6, (units)") + # change x-axis label
  labs(fill = "Cat5, (units)", col = "Cat5, (units)") + # change legend title
  theme_bw() # change ggplot theme
What are your modelled effects (with uncertainty)?
The process for estimating effects of categorical vs. continuous predictors is different.
For categorical predictors (Cat5), you can use emmeans() to give you the effects on the response scale directly:
emmOut <- emmeans(object = bestMod4.p, # your model
                  specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                  type = "response") # report coefficients on the response scale
emmOut
Cat5 Cont6 rate SE df asymp.LCL asymp.UCL
StA 73.5 107.1 2.32 Inf 102.7 112
StB 73.5 95.2 2.61 Inf 90.2 100
StC 73.5 109.7 2.70 Inf 104.5 115
Confidence level used: 0.95
Intervals are back-transformed from the log scale
When Cat5 is StA, Resp4.p is estimated to be 107 units (95% confidence interval: 103 to 112 units).
When Cat5 is StB, Resp4.p is estimated to be 95 units (95% confidence interval: 90 to 100 units).
When Cat5 is StC, Resp4.p is estimated to be 110 units (95% confidence interval: 105 to 115 units).
Note that these estimates are made at the mean value of Cont6 (73.5 units), as shown in the output above.
For continuous predictors (Cont6), you need to use the emtrends() function and then convert the coefficients to the response scale:
trendsOut <- emtrends(bestMod4.p,
                      specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                      var = "Cont6") # your predictor of interest
trendsOut
Since you have a poisson error distribution assumption, you can convert the estimate made by emtrends() on to the response scale with the exp() function:
trendsOut.df <- data.frame(trendsOut) # convert trendsOut to a dataframe
RR <- exp(trendsOut.df$Cont6.trend) # get the coefficient on the response scale
RR
[1] 1.004084 1.001458 1.008005
You can also use the same conversion on the confidence limits of the modelled effect, for the upper:
RR.up <- exp(trendsOut.df$asymp.UCL) # get the upper confidence interval on the response scale
RR.up
[1] 1.005716 1.004309 1.009791
and lower 95% confidence level:
RR.low <- exp(trendsOut.df$asymp.LCL) # get the lower confidence interval on the response scale
RR.low
[1] 1.0024542 0.9986158 1.0062224
Let’s organize these numbers so we can read the effects easier:
RRAll <- data.frame(Cat5 = trendsOut.df$Cat5, # include info on the Cat5 level
                    RR = exp(trendsOut.df$Cont6.trend), # the modelled effect as a rate ratio
                    RR.up = exp(trendsOut.df$asymp.UCL), # upper 95% confidence level of the modelled effect
                    RR.down = exp(trendsOut.df$asymp.LCL)) # lower 95% confidence level of the modelled effect
RRAll
This tells you that for a unit change in Cont6, Resp4.p
increases by 1.004 times (an increase of 0.41%) when Cat5 is StA (95% confidence interval: 1.002 to 1.006).
increases by 1.001 times (an increase of 0.15%) when Cat5 is StB (95% confidence interval: 0.999 to 1.004).
increases by 1.008 times (an increase of 0.8%) when Cat5 is StC (95% confidence interval: 1.006 to 1.01).
Note that when Cat5 is StB, the 95% confidence interval for the effect of Cont6 on Resp4.p (the rate ratio) includes 1. When the rate ratio is 1, there is no effect of the continuous predictor on the response. See Section 5.2.4 for more explanation on this.
You can test for evidence that the effects of Cont6 on Resp4.p are significant with:
test(trendsOut) # get test if coefficient is different than zero.
Note that these tests are done on the link scale but can be used for reporting effects on the response scale.
From the results, you can see that:
there is evidence that there is an effect of Cont6 on Resp4.p when Cat5 = StA. (i.e. the slope on the link scale is different than zero, and the rate ratio on the response scale is different than 1; P < 0.0001).
there is no evidence that there is an effect of Cont6 on Resp4.p when Cat5 = StB. (i.e. the slope on the link scale is not different than zero, and the rate ratio on the response scale is not different than 1; P = 0.32).
there is evidence that there is an effect of Cont6 on Resp4.p when Cat5 = StC. (i.e. the slope on the link scale is different than zero, and the rate ratio on the response scale is different than 1; P < 0.0001).
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
You can also find out which effects of Cont6 on Resp4.p are different from one another using pairs() on the emtrends() output:
forCompRR <- pairs(trendsOut, # emmeans object
                   adjust = "fdr", # multiple comparison adjustment
                   simple = "Cat5") # contrasts within this categorical predictor
forCompRR
Cont6 = 73.5:
contrast estimate SE df z.ratio p.value
StA - StB 0.00262 0.00167 Inf 1.568 0.1170
StA - StC -0.00390 0.00123 Inf -3.180 0.0022
StB - StC -0.00652 0.00171 Inf -3.814 0.0004
P value adjustment: fdr method for 3 tests
Remember that you need to convert any predictor coefficients to the response scale when reporting.
The table above tells you that:
There is no evidence the effect of Cont6 on Resp4.p differs when Cat5 is StA vs. StB (P = 0.12).
There is evidence that the effect of Cont6 on Resp4.p is different when Cat5 is StA vs. StC (P = 0.0022). The effect of Cont6 on Resp4.p is exp(-0.0039) = 0.9961076 times as big when Cat5 is StA vs. StC.
There is evidence that the effect of Cont6 on Resp4.p is different when Cat5 is StB vs. StC (P = 0.0004). The effect of Cont6 on Resp4.p is exp(-0.00652) = 0.9935012 times as big when Cat5 is StB vs. StC.
Only if all the slopes were similar would you want to test whether the levels of your categorical predictor (Cat5) have significantly different coefficients (intercepts) from each other, with pairs() on the output from emmeans() (the emmOut object).
Models with a Gamma error distribution assumption:
Background
For models with a Gamma error distribution assumption, you will typically start with a link function that is the inverse or link = "inverse". You need to take this link function into consideration when you report your modelled effects.
For a model where link = "inverse", the interpretation of the coefficients on the response scale is less clear. In this case, you would use your model to make predictions of your response and report the effects by describing these changes in predictions. We will cover making predictions in the Predicting section.
However, you can also use link = "log" for your models with a Gamma error distribution assumption. Using “log” will let you use the reporting methods in the section on “Models with a poisson error distribution assumption”(Section 5.2.4). So, if you have a model with a Gamma error distribution assumption, use link = "log" for modelling, and follow the methods given above (Section 5.2.4).
Models with a binomial error distribution assumption:
Background
For models with a binomial error distribution assumption, you will typically start with a link function that is the logit function or link = "logit". You need to take this link function into consideration when you report your modelled effects.
The logit function is also called the “log-odds” function as it is equal to the logarithm of the odds.
Models with a binomial error distribution assumption model the probability of an event happening. This might be presence (vs. absence), mature (vs. immature), death (vs. alive), etc., but in all cases this is represented as the probability of your response being 1 (vs. 0).
If \(p\) is the probability of an event happening (e.g. response being 1), the probability an event doesn’t happen is \(1-p\) when an event can only have two outcomes. The odds are the probability of the event happening divided by the probability it doesn’t happen, or:
\(\frac{p}{1-p}\)
The logit function (or log-odds function) is then:
\(log_e(\frac{p}{1-p})\)
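As a quick worked example (using an arbitrary probability, not one from the models in this chapter), if \(p = 0.75\):
\[
\frac{p}{1-p} = \frac{0.75}{0.25} = 3, \qquad log_e(3) \approx 1.10
\]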
Jargon alert! Note that a GLM using the logit link function is also called logistic regression.
Alternate link functions for binomially distributed data are probit and cloglog, but the logit link function is by far the one most used as the coefficients that result from the model are the easiest to interpret. But, as always, you should choose the link function that leads to the best-behaved residuals for your model.
Similar to reporting a rate ratio when your link = "log" (Section 5.2.4), you report the coefficients of a model with a binomial error distribution assumption (and link = "logit") on the response scale by raising e to the power of the coefficients estimated on the link scale.
When you have a continuous predictor, \(e\) is raised to the slope (estimated on the link scale) to give you an odds ratio (OR) on the response scale. OR - 1 gives you the proportional (%) change in odds for a unit change in your continuous predictor (a minimal sketch follows this list).
When you have a categorical predictor, \(e\) is raised to the intercept of each level of your predictor (estimated on the link scale) to give you the odds associated with that predictor level on the response scale.
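As a minimal sketch of this conversion for a continuous predictor (beta1 here is a placeholder value, not a coefficient from the examples in this chapter):
beta1 <- 0.7 # a slope estimated on the link (log-odds) scale
OR <- exp(beta1) # odds ratio on the response scale, ~ 2.01
(OR - 1) * 100 # ~ 101% increase in the odds per unit change in the predictor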
Here is the math linking the coefficient to the odds ratio:
the math
Given a binomial model, \[
\begin{align}
log_e(\frac{p_i}{1-p_i}) &= \beta_1 \cdot Cov_i + \beta_0 \\
Resp_i &\sim binomial(p_i)
\end{align}
\]
where \(p_i\) is the probability of success and \(1-p_i\) is the probability of failure, and the odds are \(\frac{p_i}{1-p_i}\).
Then the odds ratio (OR) - the change in odds due to a one unit change in the predictor - is:
\[
OR = \frac{e^{\beta_1 \cdot (Cov_i + 1) + \beta_0}}{e^{\beta_1 \cdot Cov_i + \beta_0}} = e^{\beta_1}
\]
The emmeans() function will convert the coefficients (intercepts) from the link to the response scale. You can ask for this with the type = "response" argument.
In contrast, the emtrends() function does not convert the coefficients (slopes) to represent effects on the response scale. This is because emtrends() is reporting the slope of a straight line - the trend line on the link scale - but the line isn’t straight on the response scale.
You need to convert from the link to the response scale by hand when you use emtrends(). How you make the conversion, and interpret the result, will depend on your error distribution assumption.
Here we will go through examples of reporting for models with a binomial error distribution assumption. If you want more details on why you are using the methods below, check the section on “Models with a normal error distribution assumption” (Section 5.2.3) for context.
Examples
Here I have made new examples where the error distribution assumption is binomial. This changes the response variables, indicated by Resp1.b, Resp2.b, etc., to be values of either 0 or 1.
Example 1: Resp1.b ~ Cat1 + 1
Example 1: Resp1.b ~ Cat1 + 1
Recall that Example 1 contains only one predictor and it is categorical:
Resp1.b ~ Cat1 + 1
The best-specified model (now with a binomial error distribution assumption) is:
bestMod1.b
Call: glm(formula = Resp1.b ~ Cat1 + 1, family = binomial(link = "logit"),
data = myDat1)
Coefficients:
(Intercept) Cat1StB Cat1StC
-0.6931 -1.8718 1.6094
Degrees of Freedom: 49 Total (i.e. Null); 47 Residual
Null Deviance: 68.03
Residual Deviance: 51.43 AIC: 57.43
and the dredge() table you used to pick your bestMod1.b is in dredgeOut1.b
dredgeOut1.b
Global model call: glm(formula = Resp1.b ~ Cat1 + 1, family = binomial(link = "logit"),
data = myDat1)
---
Model selection table
(Intrc) Cat1 R^2 df logLik AICc delta weight
2 -0.6931 + 0.2825 3 -25.714 57.9 0.00 0.998
1 -0.3228 0.0000 1 -34.015 70.1 12.16 0.002
Models ranked by AICc(x)
And here’s a visualization of the model effects:
# Set up your predictors for the visualized fit
forCat1 <- unique(myDat1.b$Cat1) # every value of your categorical predictor
# create a data frame with your predictors
forVis <- expand.grid(Cat1 = forCat1) # expand.grid() function makes sure you have all combinations of predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod1.b, # the model
                  newdata = forVis, # the predictor values
                  type = "link", # here, make the predictions on the link scale and we'll translate them back below
                  se.fit = TRUE) # include uncertainty estimate
ilink <- family(bestMod1.b)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale
forVis$Fit <- ilink(modFit$fit) # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame
library(ggplot2)
ggplot() + # start ggplot
  geom_point(data = myDat1.b, # add observations to your plot
             mapping = aes(x = Cat1, y = Resp1.b)) +
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat1, y = Fit, ymin = Lower, ymax = Upper),
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis, # add the modelled fit to your plot
             mapping = aes(x = Cat1, y = Fit),
             shape = 22, # shape of point
             size = 3, # size of point
             fill = "white", # fill colour for point
             col = 'black') + # outline colour for point
  ylab("probability Resp1.b = 1") + # change y-axis label
  xlab("Cat1, (units)") + # change x-axis label
  theme_bw() # change ggplot theme
What are your modelled effects (with uncertainty)?
For categorical predictors, you can get the coefficients on the response scale directly with the emmeans() function29 when you specify type = “response”. These are the probabilities (not odds or odds ratio) that Resp1.b = 1 at each level of your categorical predictor:
library(emmeans) # load the emmeans package
emmOut <- emmeans(object = bestMod1.b, # your model
                  specs = ~ Cat1,      # your predictors
                  type = "response")   # report coefficients on the response scale
emmOut
Cat1 prob SE df asymp.LCL asymp.UCL
StA 0.3333 0.1220 Inf 0.14596 0.594
StB 0.0714 0.0688 Inf 0.00996 0.370
StC 0.7143 0.0986 Inf 0.49239 0.866
Confidence level used: 0.95
Intervals are back-transformed from the logit scale
Notice that the column reported is prob: these coefficients are already converted back onto the response scale (as probabilities).
Note also that two types of uncertainty are reported here. SE stands for the standard error of the estimate, and is a measure of uncertainty around the average modelled effect. The asymp.LCL and asymp.UCL columns give the 95% confidence limits of the estimate - the range within which you can be 95% confident the true probability that Resp1.b is 1 lies for each level of Cat1.
From this output you see:
When Cat1 is StA, the probability that Resp1.b is 1 is estimated to be 0.33 (95% confidence interval: 0.15 to 0.59). This is equivalent to odds (\(\frac{p}{(1-p)}\)) of 0.5.
When Cat1 is StB, the probability that Resp1.b is 1 is estimated to be 0.07 (95% confidence interval: 0.01 to 0.37). This is equivalent to odds (\(\frac{p}{(1-p)}\)) of 0.08.
When Cat1 is StC, the probability that Resp1.b is 1 is estimated to be 0.71 (95% confidence interval: 0.49 to 0.87). This is equivalent to odds (\(\frac{p}{(1-p)}\)) of 2.5.
Finally, note that you can also get a quick plot of the effects by handing the emmeans() output to plot().
where do these probabilities and odds come from?
The emmeans package provides an easy way of seeing the effects of Cat1 on Resp1.b on the response scale, but you can also get this information from the summary() output, which reports the coefficients on the link (logit) scale.
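A minimal sketch (the values shown as comments are the coefficients printed for bestMod1.b above; summary(bestMod1.b) reports the same estimates along with their standard errors and tests):
coef(bestMod1.b) # coefficients on the link (logit) scale
# (Intercept)     Cat1StB     Cat1StC
#     -0.6931     -1.8718      1.6094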
You can convert the coefficients to odds with the exp() function, for example:
When Cat1 is StA (the reference level, captured by the intercept), the odds that Resp1.b is 1 are estimated to be exp(coef(summary(bestMod1.b))[1]) = 0.5. And you can get the probabilities directly with the plogis() function (plogis(coef(summary(bestMod1.b))[1]) = 0.333).
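To get the probabilities for the other levels, first add that level's coefficient to the intercept. A minimal sketch, assuming R's default treatment contrasts (StA as the reference level):
plogis(coef(bestMod1.b)["(Intercept)"])                               # probability when Cat1 = StA, ~0.333
plogis(coef(bestMod1.b)["(Intercept)"] + coef(bestMod1.b)["Cat1StB"]) # probability when Cat1 = StB, ~0.071
plogis(coef(bestMod1.b)["(Intercept)"] + coef(bestMod1.b)["Cat1StC"]) # probability when Cat1 = StC, ~0.714
These match the prob column of the emmeans() output above.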
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
Here you can compare the effects of the levels of Cat1 on Resp1.b to see if the effects differ between the levels.
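The output below comes from handing the emmeans() output to pairs(). A sketch of the call (mirroring the pairs() calls used in the later examples; the fdr adjustment matches the "fdr method for 3 tests" line in the output):
forComp <- pairs(emmOut,         # the emmeans() output from above
                 adjust = "fdr") # multiple comparison adjustment
forComp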
contrast odds.ratio SE df null z.ratio p.value
StA / StB 6.5000 7.6300 Inf 1 1.595 0.1107
StA / StC 0.2000 0.1460 Inf 1 -2.204 0.0413
StB / StC 0.0308 0.0352 Inf 1 -3.041 0.0071
P value adjustment: fdr method for 3 tests
Tests are performed on the log odds ratio scale
The output shows the results of the multiple comparison (pairwise) testing.
The values in the p.value column tell you the results of the hypothesis test comparing the coefficients between the two levels of your categorical predictor.
Note the odds.ratio column here. Comparing the effects of Cat1 on the probability of Resp1.b being 1 is done by comparing the odds of Resp1.b being 1 under each level of Cat1.
For example:
there is no evidence that the probability of Resp1.b being 1 is different when Cat1 = StA vs. when Cat1 = StB (P = 0.1107).
there is some evidence that the probability of Resp1.b being 1 is different when Cat1 = StA than when Cat1 = StC (P = 0.0413). The odds ratio is 0.2.
there is strong evidence that the probability of Resp1.b being 1 is different when Cat1 = StB than when Cat1 = StC (P = 0.0071). The odds ratio is 0.0308.
where did this odds ratio come from?
This is the ratio of the odds that Resp1.b is 1 when Cat1 = StA vs. the odds that Resp1.b is 1 when Cat1 = StC.
For example, you can compute the odds that Resp1.b is 1 when Cat1 = StA and when Cat1 = StC directly from the probabilities reported by emmeans() above, and then take their ratio.
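A minimal sketch of that calculation (oddsStA and oddsStC are illustrative object names; the probabilities come from the emmeans() output above):
oddsStA <- 0.3333 / (1 - 0.3333) # odds that Resp1.b = 1 when Cat1 = StA, ~0.5
oddsStC <- 0.7143 / (1 - 0.7143) # odds that Resp1.b = 1 when Cat1 = StC, ~2.5
oddsStA / oddsStC                # ~0.2, the StA / StC odds ratio reported by pairs() above
Example 2: Resp2.b ~ Cat2 + Cat3 + Cat2:Cat3 + 1
Recall that Example 2 contains two categorical predictors and their interaction:
Resp2.b ~ Cat2 + Cat3 + Cat2:Cat3 + 1
The best-specified model (now with a binomial error distribution assumption) is in bestMod2.b and was fit to data in myDat2.b. And here's a visualization of the model effects: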
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat2 <- unique(myDat2.b$Cat2) # every level of your categorical predictor
forCat3 <- unique(myDat2.b$Cat3) # every level of your categorical predictor

# create a data frame with your predictors
forVis <- expand.grid(Cat2 = forCat2, Cat3 = forCat3) # expand.grid() makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod2.b,       # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate

ilink <- family(bestMod2.b)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale

forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() + # start ggplot
  geom_point(data = myDat2.b, # add observations to your plot
             mapping = aes(x = Cat2, y = Resp2.b, col = Cat3),
             position = position_jitterdodge(jitter.width = 0.75, dodge.width = 0.75)) + # control position of data points so they are easier to see on the plot
  geom_errorbar(data = forVis, # add the uncertainty to your plot
                mapping = aes(x = Cat2, y = Fit, ymin = Lower, ymax = Upper, col = Cat3),
                position = position_dodge(width = 0.75), # control position so the error bars are easier to see on the plot
                linewidth = 1.2) + # control thickness of errorbar line
  geom_point(data = forVis, # add the modelled fit to your plot
             mapping = aes(x = Cat2, y = Fit, fill = Cat3),
             shape = 22,    # shape of point
             size = 3,      # size of point
             col = 'black', # outline colour for point
             position = position_dodge(width = 0.75)) + # control position so the points are easier to see on the plot
  ylab("probability Resp2.b = 1") + # change y-axis label
  xlab("Cat2, (units)") +           # change x-axis label
  labs(fill = "Cat3, (units)", col = "Cat3, (units)") + # change legend title
  theme_bw()                        # change ggplot theme
What are your modelled effects (with uncertainty)?
As above, we can use the emmeans package to see the effects of the predictors on the scale of the response:
emmOut <- emmeans(object = bestMod2.b,              # your model
                  specs = ~ Cat2 + Cat3 + Cat2:Cat3, # your predictors
                  type = "response")                 # report coefficients on the response scale
emmOut
Cat2 Cat3 prob SE df asymp.LCL asymp.UCL
StA Control 0.462 0.0523 Inf 0.362 0.564
StB Control 0.521 0.0585 Inf 0.407 0.632
StC Control 0.441 0.0515 Inf 0.344 0.543
StA Treat 0.436 0.0561 Inf 0.331 0.547
StB Treat 0.422 0.0521 Inf 0.325 0.526
StC Treat 0.840 0.0423 Inf 0.739 0.907
Confidence level used: 0.95
Intervals are back-transformed from the logit scale
So now you have a modelled value of your response for each level of your categorical predictors. For example:
When Cat2 is StA and Cat3 is Treat, the probability that Resp2.b is 1 is estimated to be 0.44 (95% confidence interval: 0.33 to 0.55).
When Cat2 is StB and Cat3 is Control, the probability that Resp2.b is 1 is estimated to be 0.52 (95% confidence interval: 0.41 to 0.63).
Finally, note that you can also get a quick plot of the effects by handing the emmeans() output to plot().
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
As with Example 1, you can also find out which combinations of predictor levels are leading to statistically different model predictions in Resp2.b:
forComp <- pairs(emmOut,          # emmeans output
                 adjust = "fdr",  # multiple comparison adjustment
                 simple = "Cat2") # contrasts within this categorical predictor
forComp
Cat3 = Control:
contrast odds.ratio SE df null z.ratio p.value
StA / StB 0.789 0.2490 Inf 1 -0.751 0.6791
StA / StC 1.087 0.3220 Inf 1 0.282 0.7781
StB / StC 1.377 0.4320 Inf 1 1.019 0.6791
Cat3 = Treat:
contrast odds.ratio SE df null z.ratio p.value
StA / StB 1.057 0.3300 Inf 1 0.179 0.8582
StA / StC 0.147 0.0573 Inf 1 -4.925 <.0001
StB / StC 0.139 0.0530 Inf 1 -5.183 <.0001
P value adjustment: fdr method for 3 tests
Tests are performed on the log odds ratio scale
Based on a threshold P-value of 0.05, you can see that the effects at some combinations of your predictors are not statistically different from each other. For example, there is no evidence that the probability of Resp2.b being 1 when Cat3 = Control and Cat2 = StA (0.46)30 is different from that when Cat3 = Control and Cat2 = StB (0.52) (P = 0.68).
On the other hand, some combinations of your predictors are statistically different from each other. For example, there is strong evidence that the probability of Resp2.b being 1 when Cat3 = Treat and Cat2 = StA (0.44) is less than that when Cat3 = Treat and Cat2 = StC (0.84) (P < 0.0001).
Note that you can also get the results from the pairwise testing visually by handing the output of pairs() to plot().
Example 3: Resp3.b ~ Cont4 + 1
Example 3: Resp3.b ~ Cont4 + 1
Recall that Example 3 contains one predictor and the predictor is continuous:
Resp3.b ~ Cont4 + 1
The best-specified model (now with a binomial error distribution assumption) is:
bestMod3.b
Call: glm(formula = Resp3.b ~ Cont4 + 1, family = binomial(link = "logit"),
data = myDat3.b)
Coefficients:
(Intercept) Cont4
-8.1519 0.1086
Degrees of Freedom: 299 Total (i.e. Null); 298 Residual
Null Deviance: 415.8
Residual Deviance: 199.1 AIC: 203.1
that was fit to data in myDat3.b:
str(myDat3.b)
'data.frame': 300 obs. of 2 variables:
$ Cont4 : num 57.7 53.2 79.7 35.1 72.2 ...
$ Resp3.b: int 1 0 1 0 0 1 1 1 1 0 ...
and the dredge() table you used to pick your bestMod3.b is
dredgeOut3.b
Global model call: glm(formula = Resp3.b ~ Cont4 + 1, family = binomial(link = "logit"),
data = myDat3.b)
---
Model selection table
(Intrc) Cont4 R^2 df logLik AICc delta weight
2 -8.15200 0.1086 0.5144 2 -99.533 203.1 0.00 1
1 0.04001 0.0000 1 -207.884 417.8 214.68 0
Models ranked by AICc(x)
And here’s a visualization of the model effects:
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCont4 <- seq(from = min(myDat3.b$Cont4), to = max(myDat3.b$Cont4), by = 1) # a sequence of numbers in your continuous predictor range

# create a data frame with your predictors
forVis <- expand.grid(Cont4 = forCont4) # expand.grid() makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictor
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod3.b,       # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate

ilink <- family(bestMod3.b)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale

forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() + # start ggplot
  geom_point(data = myDat3.b, # add observations to your plot
             mapping = aes(x = Cont4, y = Resp3.b)) +
  geom_line(data = forVis, # add the modelled fit to your plot
            mapping = aes(x = Cont4, y = Fit),
            linewidth = 1.2) + # control thickness of line
  geom_line(data = forVis, # add uncertainty to your plot (upper line)
            mapping = aes(x = Cont4, y = Upper),
            linewidth = 0.8, # control thickness of line
            linetype = 2) +  # control style of line
  geom_line(data = forVis, # add uncertainty to your plot (lower line)
            mapping = aes(x = Cont4, y = Lower),
            linetype = 2) +  # control style of line
  ylab("probability Resp3.b = 1") + # change y-axis label
  xlab("Cont4, (units)") +          # change x-axis label
  theme_bw()                        # change ggplot theme
The model fit line may seem a little strange as the data are binary (can only be 0 or 1) but the model fit line goes through the space in between 0 and 1. This model fit line tells you how the probability of Resp3.b being 1 changes as Cont4 changes.
What are your modelled effects (with uncertainty)?
trendsOut <- emtrends(bestMod3.b,
                      specs = ~ Cont4 + 1, # your predictors
                      var = "Cont4")       # your predictor of interest
trendsOut
Note that this is the same estimate as you find in the summary() output of your model object:
coef(summary(bestMod3.b))
Estimate Std. Error z value Pr(>|z|)
(Intercept) -8.1518996 0.89160247 -9.142976 6.075711e-20
Cont4 0.1085946 0.01160302 9.359159 8.037377e-21
This is because emtrends() returns coefficients on the link scale.
To report the coefficients on a response scale, you need to consider your link function. Since you have a binomial error distribution assumption, you can convert the estimate made by emtrends() to the response scale with:
trendsOut.df <- data.frame(trendsOut) # convert trendsOut to a dataframe
OR <- exp(trendsOut.df$Cont4.trend)   # get the coefficient on the response scale
OR
[1] 1.11471
You can also use the same conversion on the confidence limits of the modelled effect, for the upper:
OR.up <- exp(trendsOut.df$asymp.UCL) # get the upper confidence interval on the response scale
OR.up
[1] 1.140351
and lower 95% confidence level:
OR.low <- exp(trendsOut.df$asymp.LCL) # get the lower confidence interval on the response scale
OR.low
[1] 1.089646
This coefficient - called the odds ratio (Section 5.2.6) - tells you that for a unit change in Cont4, the odds that Resp3.b is 1 are multiplied by 1.115 (95% confidence interval: 1.09 to 1.14), which is an increase in the odds of Resp3.b being 1 of about 11.5%.
where did this odds ratio come from?
You can get a better sense of the odds ratio (vs. odds vs. probabilities) by estimating it directly from the modelled probability that Resp3.b is 1:
Let’s find the probability of Resp3.b = 1 when Cont4 is 60:
pCont4.60 <- predict(bestMod3.b,                       # model
                     newdata = data.frame(Cont4 = 60), # predictor values for the prediction
                     type = "response",                # give prediction on the response scale
                     se.fit = TRUE)                    # include uncertainty
pCont4.60 <- pCont4.60$fit
pCont4.60
1
0.1629792
So there is a 16.3% probability of Resp3.b being 1 when Cont4 is 60.
The odds associated with Resp3.b being 1 when Cont4 is 60 are given by the probability of presence divided by the probability of absence.
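A sketch of that step (the object name oddsCont4.60 is the one used in the odds ratio calculation below):
oddsCont4.60 <- pCont4.60 / (1 - pCont4.60) # probability of presence / probability of absence
oddsCont4.60 # ~0.1947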
Now let’s find the probability of Resp3.b being 1 when Cont4 is 61 (one unit more):
pCont4.61 <- predict(bestMod3.b,                       # model
                     newdata = data.frame(Cont4 = 61), # predictor values for the prediction
                     type = "response",                # give prediction on the response scale
                     se.fit = TRUE)                    # include uncertainty
pCont4.61 <- pCont4.61$fit
pCont4.61
1
0.1783404
So there is a 17.8% probability of Resp3.b being 1 when Cont4 is 61.
The odds associated with Resp3.b being 1 when Cont4 is 61 are given by the probability of presence divided by the probability of absence.
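And the matching sketch for Cont4 = 61 (again, oddsCont4.61 is the object name used below):
oddsCont4.61 <- pCont4.61 / (1 - pCont4.61) # probability of presence / probability of absence
oddsCont4.61 # ~0.2170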
Now the odds ratio is found by taking the ratio of the odds when your predictor (Cont4) changes by one unit:
oddsCont4.61/oddsCont4.60
1
1.11471
Note that this is identical to the calculation of the odds ratio from the coefficient above:
OR
[1] 1.11471
and you interpret this as an OR - 1 = 0.115, or 11.5%, increase in the odds of Resp3.b being 1 with a unit change of Cont4.
Getting a sense for coefficients - binomial error distribution assumption
Here’s a little thought experiment that will help you get a better sense for your model coefficients.
First, a visual reporting of the modelled effects:
library(visreg) # load the visreg package
visreg(bestMod3.b,         # model to visualize
       scale = "response", # plot on the scale of the response
       xvar = "Cont4",     # predictor on x-axis
       #by = ...,          # if you want to include a 2nd predictor plotted as colour
       #breaks = ...,      # if you want to control how the colour predictor is plotted
       #cond = ...,        # if you want to include a 3rd predictor
       #overlay = TRUE,    # to plot as overlay or panels when there is a 2nd predictor
       #rug = FALSE,       # to turn off the rug. The rug shows you where you have positive (top axis) and negative (bottom axis) residuals
       gg = TRUE) +        # to plot as a ggplot; with ggplot you have more control over the look of the plot
  geom_point(data = myDat3.b,
             mapping = aes(x = Cont4, y = Resp3.b)) +
  ylab("Response, (units)") + # change y-axis label
  xlab("Cont4, (units)") +    # change x-axis label
  theme_bw()                  # change ggplot theme
If we want to quantitatively report these effects, we are looking for one number that describes how much the probability of Resp3.b being 1 changes for a unit change in Cont4. We can explore this with the predict() function.
First, pick a value of Cont4 at which to make an estimate of Resp3.b:
t1 <-100# put any number here
The Resp3.b estimated for that t1 is:
## Step One: Define the predictor values for your response prediction
forCont4.1 <- data.frame(Cont4 = t1)

## Step Two: Make your response prediction
myPred.1 <- predict(object = bestMod3.b,  # your model
                    newdata = forCont4.1, # the values of the predictors at which to make the prediction
                    type = "response",    # make the prediction on the response scale. IMPORTANT if you have a non-normal error distribution assumption
                    se.fit = TRUE)        # include a measure of uncertainty around the prediction
myPred.1$fit
1
0.937471
Now let’s find Resp3.b estimated for t2 = t1 + 1 (i.e. one unit more):
t2 <- t1 + 1

## Step One: Define the predictor values for your response prediction
forCont4.2 <- data.frame(Cont4 = t2)

## Step Two: Make your response prediction
myPred.2 <- predict(object = bestMod3.b,  # your model
                    newdata = forCont4.2, # the values of the predictors at which to make the prediction
                    type = "response",    # make the prediction on the response scale. IMPORTANT if you have a non-normal error distribution assumption
                    se.fit = TRUE)        # include a measure of uncertainty around the prediction
myPred.2$fit
1
0.9435423
How does Resp3.b differ when Cont4 is t1 vs. t2?
myPred.2$fit-myPred.1$fit
1
0.006071321
Try this for a number of different t1 values. What do you notice? The number isn’t stable but varies with the values of t1 and t2 we choose.
What if we express the effect of Cont4 on the probability of Resp3.b being 1 as the ratio of the two numbers:
myPred.2$fit/myPred.1$fit
1
1.006476
This number is still not stable. Instead, we need to compare the odds that Resp3.b is 1 when Cont4 is t1 vs. t2.
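A minimal sketch of that comparison, using the two predictions made above (odds.1 and odds.2 are illustrative names):
odds.1 <- myPred.1$fit / (1 - myPred.1$fit) # odds that Resp3.b = 1 when Cont4 = t1
odds.2 <- myPred.2$fit / (1 - myPred.2$fit) # odds that Resp3.b = 1 when Cont4 = t2
odds.2 / odds.1 # ~1.1147 whatever t1 you choose - this stable number is the odds ratio, exp(coef(bestMod3.b)["Cont4"])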
So for a binomial error distribution assumption with a logit link, the effect of a continuous predictor on the response is described as the odds ratio (OR). This describes the multiplicative change in the odds that the response is 1 for a unit change in the predictor (OR - 1 gives the % change). It is estimated by raising \(e\) to the coefficient: \(e^{\beta_1}\)
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
This question only applies to categorical predictors, of which there are none in Example 3.
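Example 4: Resp4.b ~ Cat5 + Cont6 + Cat5:Cont6 + 1
Recall that Example 4 contains one categorical predictor (Cat5), one continuous predictor (Cont6), and their interaction:
Resp4.b ~ Cat5 + Cont6 + Cat5:Cont6 + 1
The best-specified model (now with a binomial error distribution assumption) is in bestMod4.b and was fit to data in myDat4.b. And here's a visualization of the model effects: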
#### i) choosing the values of your predictors at which to make a prediction
# Set up your predictors for the visualized fit
forCat5 <- unique(myDat4.b$Cat5) # all levels of your categorical predictor
forCont6 <- seq(from = min(myDat4.b$Cont6), to = max(myDat4.b$Cont6), by = 1) # a sequence of numbers in your continuous predictor range

# create a data frame with your predictors
forVis <- expand.grid(Cat5 = forCat5, Cont6 = forCont6) # expand.grid() makes sure you have all combinations of predictors

#### ii) using `predict()` to use your model to estimate your response variable at those values of your predictors
# Get your model fit estimates at each value of your predictors
modFit <- predict(bestMod4.b,       # the model
                  newdata = forVis, # the predictor values
                  type = "link",    # make the predictions on the link scale; we'll translate them back below
                  se.fit = TRUE)    # include uncertainty estimate

ilink <- family(bestMod4.b)$linkinv # get the conversion (inverse link) from the model to translate back to the response scale

forVis$Fit <- ilink(modFit$fit)                          # add your fit to the data frame
forVis$Upper <- ilink(modFit$fit + 1.96 * modFit$se.fit) # add your uncertainty to the data frame, 95% confidence intervals
forVis$Lower <- ilink(modFit$fit - 1.96 * modFit$se.fit) # add your uncertainty to the data frame

#### iii) use the model estimates to plot your model fit
library(ggplot2) # load ggplot2 library
ggplot() + # start ggplot
  geom_point(data = myDat4.b, # add observations to your plot
             mapping = aes(x = Cont6, y = Resp4.b, col = Cat5)) +
  geom_line(data = forVis, # add the modelled fit to your plot
            mapping = aes(x = Cont6, y = Fit, col = Cat5),
            linewidth = 1.2) + # control thickness of line
  geom_line(data = forVis, # add uncertainty to your plot (upper line)
            mapping = aes(x = Cont6, y = Upper, col = Cat5),
            linewidth = 0.4, # control thickness of line
            linetype = 2) +  # control style of line
  geom_line(data = forVis, # add uncertainty to your plot (lower line)
            mapping = aes(x = Cont6, y = Lower, col = Cat5),
            linewidth = 0.4, # control thickness of line
            linetype = 2) +  # control style of line
  ylab("probability Resp4.b = 1") + # change y-axis label
  xlab("Cont6, (units)") +          # change x-axis label
  labs(fill = "Cat5, (units)", col = "Cat5, (units)") + # change legend title
  theme_bw()                        # change ggplot theme
What are your modelled effects (with uncertainty)?
The process for estimating effects of categorical vs. continuous predictors is different.
For categorical predictors (Cat5), you can use emmeans() to give you the effects on the response scale directly:
emmOut <- emmeans(object = bestMod4.b,                     # your model
                  specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                  type = "response")                       # report coefficients on the response scale
emmOut
Cat5 Cont6 prob SE df asymp.LCL asymp.UCL
StA 198 0.951 0.0346 Inf 0.819 0.988
StB 198 0.964 0.0178 Inf 0.907 0.986
StC 198 0.438 0.0941 Inf 0.269 0.622
Confidence level used: 0.95
Intervals are back-transformed from the logit scale
This output tells you that:
When Cat5 is StA and Cont6 is 198 units31, the probability that Resp4.b is 1 is estimated to be 0.95 (95% confidence interval: 0.82 to 0.99).
When Cat5 is StB and Cont6 is 198 units32, the probability that Resp4.b is 1 is estimated to be 0.96 (95% confidence interval: 0.91 to 0.99).
When Cat5 is StC and Cont6 is 198 units33, the probability that Resp4.b is 1 is estimated to be 0.44 (95% confidence interval: 0.27 to 0.62).
Note that emmeans() sets your continuous predictor (Cont6) to the mean value of Cont6 (198 units). You can also control this with the at = argument in the emmeans() function:
emmOut <- emmeans(object = bestMod4.b,                     # your model
                  specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                  type = "response",                       # report coefficients on the response scale
                  at = list(Cont6 = 150))                  # control the value of your continuous predictor at which to make the coefficient estimates
emmOut
Cat5 Cont6 prob SE df asymp.LCL asymp.UCL
StA 150 8.93e-03 0.009230 Inf 1.17e-03 0.06500
StB 150 5.53e-01 0.078500 Inf 3.99e-01 0.69695
StC 150 6.27e-05 0.000142 Inf 7.00e-07 0.00524
Confidence level used: 0.95
Intervals are back-transformed from the logit scale
For continuous predictors (Cont6), you need to use the emtrends() function and then convert the coefficients to the response scale:
trendsOut <- emtrends(bestMod4.b,
                      specs = ~ Cat5 + Cont6 + Cat5:Cont6 + 1, # your predictors
                      var = "Cont6")                           # your predictor of interest
trendsOut
Since you have a binomial error distribution assumption, you can convert the estimates made by emtrends() onto the response scale with the exp() function:
trendsOut.df <- data.frame(trendsOut) # convert trendsOut to a dataframe
OR <- exp(trendsOut.df$Cont6.trend)   # get the coefficients on the response scale
OR
[1] 1.172328 1.065611 1.215576
You can also use the same conversion on the confidence limits of the modelled effect, for the upper:
OR.up <- exp(trendsOut.df$asymp.UCL) # get the upper confidence interval on the response scale
OR.up
[1] 1.250737 1.089631 1.325976
and lower 95% confidence level:
OR.low <- exp(trendsOut.df$asymp.LCL) # get the lower confidence interval on the response scale
OR.low
[1] 1.098834 1.042120 1.114368
Let’s organize these numbers so we can read the effects easier:
ORAll <- data.frame(Cat5 = trendsOut.df$Cat5,              # include info on the Cat5 level
                    OR = exp(trendsOut.df$Cont6.trend),    # the modelled effect as an odds ratio
                    OR.up = exp(trendsOut.df$asymp.UCL),   # upper 95% confidence level of the modelled effect
                    OR.down = exp(trendsOut.df$asymp.LCL)) # lower 95% confidence level of the modelled effect
ORAll
Cat5 OR OR.up OR.down
1 StA 1.172328 1.250737 1.098834
2 StB 1.065611 1.089631 1.042120
3 StC 1.215576 1.325976 1.114368
This tells you that, for a unit change in Cont6, the odds that Resp4.b is 1:
increase by 17.2% when Cat5 is StA (95% confidence interval: 9.9% to 25.1%).
increase by 6.6% when Cat5 is StB (95% confidence interval: 4.2% to 9.0%).
increase by 21.6% when Cat5 is StC (95% confidence interval: 11.4% to 32.6%).
You can test for evidence that the effects of Cont6 on Resp4.b are significant (different than zero) with:
test(trendsOut) # test whether each coefficient is different than zero
Note that these tests are done on the link scale but can be used for reporting effects on the response scale.
From the results, you can see that the effects of Cont6 on Resp4.b at all levels of Cat5 are significant (P < 0.0001).
Are the modelled effects different than one another across categorical levels? (only needed for categorical predictors with more than two levels)
You can also find out which effects of Cont6 on Resp4.b are different from one another using pairs() on the emtrends() output:
forCompOR <- pairs(trendsOut,       # emtrends output
                   adjust = "fdr",  # multiple comparison adjustment
                   simple = "Cat5") # contrasts within this categorical predictor
forCompOR
Cont6 = 198:
contrast estimate SE df z.ratio p.value
StA - StB 0.0954 0.0349 Inf 2.732 0.0094
StA - StC -0.0362 0.0553 Inf -0.655 0.5124
StB - StC -0.1317 0.0458 Inf -2.876 0.0094
P value adjustment: fdr method for 3 tests
The table above tells you that:
There is evidence that the effect of Cont6 on Resp4.b is different when Cat5 is StA vs. StB (P = 0.009). The effect of Cont6 on Resp4.b is exp(0.0954) = 1.1 times as big when Cat5 is StA vs. StB (i.e. the effect of Cont6 on Resp4.b is 10% more when Cat5 is StA vs. StB)
There is no evidence the effect of Cont6 on Resp4.b differs when Cat5 is StA vs. StC (P = 0.512).
There is evidence that the effect of Cont6 on Resp4.b is different when Cat5 is StB vs. StC (P = 0.009). The effect of Cont6 on Resp4.b is exp(-0.1317) = 0.88 times as big when Cat5 is StB vs. StC (i.e. the effect of Cont6 on Resp4.b is 12% less when Cat5 is StB vs. StC)
Only if all the slopes34 were similar would you want to test whether the levels of your categorical predictor (Cat5) have significantly different coefficients (intercepts) from each other, using pairs() on the output from emmeans() (the emmOut object).
Aside: how does this relate to the summary() output?
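A brief sketch of how the two connect (assuming R's default treatment contrasts, with StA as the reference level of Cat5): summary(bestMod4.b) reports the slopes on the link (logit) scale, with the Cont6 row giving the slope for the reference level and the interaction rows giving differences in slope relative to it, so you can recover the emtrends() estimates by hand:
cf <- coef(bestMod4.b)            # link-scale coefficients from the model
cf["Cont6"]                       # slope of Cont6 when Cat5 = StA (the reference level)
cf["Cont6"] + cf["Cat5StB:Cont6"] # slope of Cont6 when Cat5 = StB
cf["Cont6"] + cf["Cat5StC:Cont6"] # slope of Cont6 when Cat5 = StC
# exponentiating each of these gives the odds ratios reported from emtrends() above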
Up next
In the next chapter, we will discuss how you can use your model to make predictions.
References
Arel-Bundock, Vincent, Noah Greifer, and Andrew Heiss. 2024. “How to Interpret Statistical Models Using marginaleffects for R and Python.” Journal of Statistical Software 111 (9): 1–32. https://doi.org/10.18637/jss.v111.i09.
Popovic, Gordana, Tanya Jane Mason, Szymon Marian Drobniak, Tiago André Marques, Joanne Potts, Rocío Joo, Res Altwegg, et al. 2024. “Four Principles for Improved Statistical Ecology.” Methods in Ecology and Evolution 15 (2): 266–81. https://doi.org/10.1111/2041-210x.14270.
with some options being more or less helpful depending on your model structure↩︎
also see the “by hand” visualization method below↩︎
you will be using the predict() function again in the Predicting chapter↩︎
note that the p-values reported in the coefficient table are the result of a test of whether the coefficient associated with Sp2 is different than that of Sp1 (p = 0.001), and whether the effect of Sp3 is different than that of Sp1 (p = 0.017), but a comparison of the effects of Sp2 vs. Sp3 is missing. We will discuss this further along in these notes.↩︎
another option is the marginaleffects package, where marginal effects represent the average effect - the average difference in your predicted response across a change in your categorical predictor level or a unit change in your numeric predictor(Arel-Bundock, Greifer, and Heiss 2024)↩︎
note that the summary() output becomes easier to read if you force the starting model not to have an intercept with: startMod<-glm(formula = Resp1 ~ Cat1 - 1, data = myDat, family = gaussian(link="identity")). This is fine to do here where you only have categorical predictors but causes problems when you start having continuous predictors in your model as well. So we won’t be practicing this in class and instead will use the very flexible and helpful emmeans package to help us report coefficients from our statistical models.↩︎
“post hoc” is a Latin phrase meaning “after the event”↩︎
when the P-value is very low, R reports it simply as <.0001↩︎
note that now you get to assess the difference between coefficients for Sp2 and Sp3 which was missing in the summary() output above↩︎
The emmeans package is very flexible and has a lot of options as to how to make these corrections depending on your needs. Plenty more information is available here↩︎
also called the incidence rate ratio, incidence density ratio or relative risk↩︎
a reminder that the number e that the function exp() uses as its base is Euler’s number. It (along with the natural logarithm, log()) is used to model exponential growth or decay, and can be used here when you want models where the response can not be below zero.↩︎
remember, the coefficients for categorical predictors are also being converted from the link to the response scale. It is just that R is doing it for us in the emmeans() function when we add the argument type = "response".↩︎