Updated 25.03.2023
# where is this current script located?
current_dir <- dirname(rstudioapi::getSourceEditorContext()$path)
# set working directory to current script location
setwd(current_dir)
# check out the result
getwd()
## [1] "/Users/nokelaip/CharveLEAD Dropbox/Petri Nokelainen/home1_cloud/home1/homepage TUNI/public_html/mmx9120/2_basic_statistics"
# you may also use RStudio menu: Session - Set Working Directory - To Source File location
# read in the data, this time we will use a shorter name
dfttu_wide <- read.table("dfttu_wide_cross_complete_final2.txt", header=TRUE, sep= "\t")
Course homepage:
https://homepages.tuni.fi/petri.nokelainen/mmx9120/
R session 1:
Rmd Statistics_R_01.Rmd
HTML Statistics_R_01.html
R session 2:
Rmd Statistics_R_02.Rmd
HTML Statistics_R_02.html
R session 3 (this file):
Rmd Statistics_R_03.Rmd
HTML Statistics_R_03.html
We will answer research question 1 by calculating bivariate
correlations (r) between the three variables.
We will use three methods: Pearson, Spearman and Kendall.
# select three variables from the whole data frame for correlation analysis
dfttu_wide_corr1 <- dplyr::select(dfttu_wide, # define data frame
autonomy, competence, score) # define variables
# produce correlations
# pearson
stats::cor(dfttu_wide_corr1, # stats::cor refers to a base package "stats" that has a function "cor" (no need to install "stats")
use="complete.obs",
method="pearson") # parametric: continuous variables, sensitive to outliers and non-normality
## autonomy competence score
## autonomy 1.00000000 0.75414993 -0.01419534
## competence 0.75414993 1.00000000 0.06565707
## score -0.01419534 0.06565707 1.00000000
# spearman
stats::cor(dfttu_wide_corr1,
use="complete.obs",
method="spearman") # non-parametric: ordinal, interval and ratio variables
## autonomy competence score
## autonomy 1.000000000 0.70924315 0.004388977
## competence 0.709243149 1.00000000 0.077963659
## score 0.004388977 0.07796366 1.000000000
# kendall
stats::cor(dfttu_wide_corr1,
use="complete.obs",
method="kendall") # non-parametric: similar to spearman, more robust with small samples
## autonomy competence score
## autonomy 1.0000000000 0.5894390 0.0003040474
## competence 0.5894389832 1.0000000 0.0551966010
## score 0.0003040474 0.0551966 1.0000000000
# note that there are no p-values, instead the effect sizes are determined from the correlation values
# by Cohen's (1988) "rules":
# small r > |.10|
# medium r > |.30|
# large r > |.50|
# Pearson and Spearman produce similar results, Kendall is more conservative (lower correlations)
# competence and autonomy are strongly positively correlated
# score is not correlated with competence or autonomy
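The Cohen (1988) cutoffs listed above can be wrapped in a small helper function. This is a minimal sketch; `interpret_r` is our own hypothetical helper, not part of any package:

```r
# classify the magnitude of a correlation by Cohen's (1988) rules
# (interpret_r is a hypothetical helper, not from any package)
interpret_r <- function(r) {
  a <- abs(r)
  if (a > .50) "large"
  else if (a > .30) "medium"
  else if (a > .10) "small"
  else "negligible"
}
interpret_r(0.754)   # competence ~ autonomy: "large"
interpret_r(-0.014)  # autonomy ~ score: "negligible"
```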
As the output shows, there is a strong positive correlation between competence and autonomy: when the values of competence get higher (or lower), the values of autonomy tend to follow. There is a negative correlation between autonomy and mathematics achievement score, but it is very small; a negative correlation indicates that as autonomy gets higher values, the mathematics score goes down. Note that there is also a small positive correlation between competence and score.
Next we present these correlations in an APA-formatted table.
# produce APA-style correlation table btw Competence, Autonomy and Score
sjPlot::tab_corr(dfttu_wide_corr1, # df may ONLY contain above mentioned 3 variables
corr.method = "pearson",
# corr.method = "spearman",
# corr.method = "kendall",
var.labels = c("Autonomy","Competence","Score"), # displayed in the table instead of var names
CSS = list(
css.thead = 'border-top: 1px solid; font-style:normal; font-weight:normal;',
css.caption = 'font-weight:normal;',
css.firsttablecol = 'font-style:normal;',
# do not display summary text, set color to white:
css.summary = '+border-bottom: 0px solid; color:#FFFFFF; font-style:normal;'),
triangle = "lower", # "upper", default is full corr matrix (lower+upper)
digits = 2, # APA-style needs 2 decimals
show.p = FALSE, # if "TRUE", ***, **, * are presented
p.numeric = FALSE,
fade.ns = FALSE)
|  | Autonomy | Competence | Score |
|---|---|---|---|
| Autonomy |  |  |  |
| Competence | 0.75 |  |  |
| Score | -0.01 | 0.07 |  |

Computed correlation used pearson-method with listwise-deletion.
# file = "dfttu_Table1_correlations.html") # writes this correlation table into file
Now we leave the dull academic style behind and produce some fancier plots from the same analysis.
# install.packages("corrplot")
library(corrplot)
## corrplot 0.92 loaded
# copy correlation matrix into "df_cor" for plotting
df_cor <- stats::cor(dfttu_wide_corr1, use="complete.obs", method="pearson")
# produce several views on correlations ...
cor_circle <- corrplot(df_cor,
method = "circle",
type = "lower",
bg="lightgray"
)
cor_pie <- corrplot(df_cor,
method = "pie",
type = "lower",
bg="lightgray"
)
cor_color <- corrplot(df_cor,
method = "color",
type = "lower",
bg="lightgray"
)
cor_number <- corrplot(df_cor,
method = "number",
type = "lower",
bg="lightgray"
)
Visualizations of association are sometimes needed, so here we create views into the data.
# 2-way plotting
# create plot of competence ~ autonomy
dfttu_corr_plot1 <- ggplot(data = dfttu_wide, aes(x = autonomy, y = competence)) +
geom_smooth(method = "lm") +
geom_jitter(alpha = 0.3, color = "black") +
labs(title = " ", x = "Autonomy", y = "Competence")
# create plot of competence ~ score
dfttu_corr_plot2 <- ggplot(data = dfttu_wide, aes(x = score, y = competence)) +
geom_smooth(method = "lm") +
geom_jitter(alpha = 0.2, color = "black") +
labs(title = " ", x = "Math Score", y = "")
# display plots in a single row
grid.arrange(dfttu_corr_plot1,dfttu_corr_plot2, nrow = 1)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
And as we do not wish to limit ourselves to two dimensions, we also visualize the effect of gender and of mathematics achievement (classified into three groups).
# 3-way plotting
# color data by gender
ggplot(data = dfttu_wide, aes(x = autonomy, y = competence, color = gender_chr)) +
geom_point(alpha = .5, size = 2) +
geom_smooth(method = "lm", aes(group = 1), color = "black") +
labs(title = " ", x = "Autonomy", y = "Competence") +
scale_color_manual(name="Gender",
labels = c("Female", "Male"),
values = c("Female"="red", "Male"="blue"))
## `geom_smooth()` using formula = 'y ~ x'
# color data by score3class_chr
ggplot(data = dfttu_wide, aes(x = autonomy, y = competence, color = score3class_chr)) +
geom_point(alpha = .5, size = 2) +
geom_smooth(method = "lm", aes(group = 1), color = "black") +
labs(title = " ", x = "Autonomy", y = "Competence") +
scale_color_manual(name="Achievement",
breaks = c("Low score", "Medium score", "High score"), # set the legend order explicitly so the labels below match
labels = c("Low", "Medium", "High"),
values = c("Low score"="darkred", "Medium score"="darkorange", "High score"="darkblue"))
## `geom_smooth()` using formula = 'y ~ x'
Jamovi is an easy-to-use graphical environment for R code. So, let’s use Jamovi to do the same analyses as above.
library(jmv)
##
## Attaching package: 'jmv'
## The following objects are masked from 'package:psych':
##
## pca, reliability
library(jmvconnect)
# Jamovi: Regression - Correlation matrix
# "vanilla" version
jmv::corrMatrix(
data = dfttu_wide,
vars = vars(autonomy, competence, score))
##
## CORRELATION MATRIX
##
## Correlation Matrix
## ----------------------------------------------------------------------
## autonomy competence score
## ----------------------------------------------------------------------
## autonomy Pearson's r —
## df —
## p-value —
##
## competence Pearson's r 0.7541499 —
## df 206 —
## p-value < .0000001 —
##
## score Pearson's r -0.0141953 0.0656571 —
## df 206 206 —
## p-value 0.8387407 0.3460747 —
## ----------------------------------------------------------------------
Results of the bivariate correlations are the same, so let’s see what kind of plots Jamovi has to offer:
library(jmv)
library(jmvconnect)
# Jamovi: Regression - Correlation matrix
# "bells and whistles" version
# better run this in the console ...
jmv::corrMatrix(
data = dfttu_wide,
vars = vars(autonomy, competence, score),
spearman = TRUE,
kendall = TRUE,
n = TRUE,
ci = TRUE,
plots = TRUE,
plotDens = TRUE,
plotStats = TRUE)
##
## CORRELATION MATRIX
##
## Correlation Matrix
## --------------------------------------------------------------------------
## autonomy competence score
## --------------------------------------------------------------------------
## autonomy Pearson's r —
## df —
## p-value —
## 95% CI Upper —
## 95% CI Lower —
## Spearman's rho —
## df —
## p-value —
## Kendall's Tau B —
## p-value —
## N —
##
## competence Pearson's r 0.7541499 —
## df 206 —
## p-value < .0000001 —
## 95% CI Upper 0.8073597 —
## 95% CI Lower 0.6887738 —
## Spearman's rho 0.7092431 —
## df 206 —
## p-value < .0000001 —
## Kendall's Tau B 0.5894390 —
## p-value < .0000001 —
## N 208 —
##
## score Pearson's r -0.0141953 0.0656571 —
## df 206 206 —
## p-value 0.8387407 0.3460747 —
## 95% CI Upper 0.1220816 0.1999126 —
## 95% CI Lower -0.1499469 -0.0710184 —
## Spearman's rho 0.0043890 0.0779637 —
## df 206 206 —
## p-value 0.9498321 0.2629966 —
## Kendall's Tau B 0.0003040 0.0551966 —
## p-value 0.9951707 0.2731602 —
## N 208 208 —
## --------------------------------------------------------------------------
So far we have investigated "isolated" bivariate correlations, but now we wish to investigate whether a third variable (a so-called "z factor") has something to do with a correlation. In this case, we check the effect of the mathematics score on the correlation between competence and autonomy (by removing its effect).
# previously we saw that the Spearman correlation btw competence and autonomy was
x <- dfttu_wide_corr1[2] # competence
y <- dfttu_wide_corr1[1] # autonomy
stats::cor(x,y,
use="complete.obs",
method="spearman") # r = 0.7092431
## autonomy
## competence 0.7092431
# how might a "z factor" (= math score) have influenced that association?
# we know better if we "remove" its effect ...
# install.packages("ggm")
library(ggm)
ggm::pcor(c("competence", "autonomy", # correlation btw competence and autonomy
"score"), # controlled by math ability, that is, effect of "score" is removed
var(dfttu_wide_corr1)) # r = 0.756791
## [1] 0.756791
# it seems that the person's math ability ("score") slightly masks the association: when it is controlled for (removed), the correlation between "competence" and "autonomy" grows stronger!
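To see what "removing the effect" of a control variable means, the partial correlation can also be computed by hand: correlate the residuals of each focal variable after regressing the control variable out of both. A sketch with the built-in mtcars data, standing in for our own variables:

```r
# partial correlation of mpg and wt, controlling for hp, via residuals
res_x <- residuals(lm(mpg ~ hp, data = mtcars))
res_y <- residuals(lm(wt  ~ hp, data = mtcars))
cor(res_x, res_y)

# the same value from the textbook formula
r_xy <- cor(mtcars$mpg, mtcars$wt)
r_xz <- cor(mtcars$mpg, mtcars$hp)
r_yz <- cor(mtcars$wt,  mtcars$hp)
(r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
```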
Same analysis with Jamovi …
library(jmv)
library(jmvconnect)
# Jamovi: Regression - Partial correlation
jmv::corrPart(
data = dfttu_wide,
vars = vars(autonomy, competence), # correlation btw autonomy and competence
controls = score, # controlled for math test score
spearman = TRUE,
kendall = TRUE)
##
## PARTIAL CORRELATION
##
## Partial Correlation
## -------------------------------------------------------------
## autonomy competence
## -------------------------------------------------------------
## autonomy Pearson's r —
## p-value —
## Spearman's rho —
## p-value —
## Kendall's Tau B —
## p-value —
##
## competence Pearson's r 0.7567910 —
## p-value < .0000001 —
## Spearman's rho 0.7110722 —
## p-value < .0000001 —
## Kendall's Tau B 0.5903222 —
## p-value < .0000001 —
## -------------------------------------------------------------
## Note. controlling for 'score'
This question calls for a method that can compare means of two groups: the t-test.
First we look how the t-test is computed in RStudio.
# we will test here only competence ~ gender
#-----------------------------------
# First, check normality of the data
# if the data is normally distributed, we may use parametric t-test (student's or Welch's t-test)
# if it is not, we need to use non-parametric method (Mann-Whitney U-test)
library(ggpubr)
ggqqplot(dfttu_wide$competence) # observed values of competence seem to follow the theoretical line of normality
# tests like Shapiro-Wilk are too sensitive to sample size (we have too much data here)
shapiro.test(dfttu_wide$competence) # full data: p-value = 6.648e-06 (p should be > .05)
##
## Shapiro-Wilk normality test
##
## data: dfttu_wide$competence
## W = 0.95716, p-value = 6.648e-06
set.seed(2022)
shapiro.test(sample(dfttu_wide$competence,100)) # 100 random participants: p-value = 0.005239 <- getting closer ...
##
## Shapiro-Wilk normality test
##
## data: sample(dfttu_wide$competence, 100)
## W = 0.9616, p-value = 0.005239
set.seed(2022)
shapiro.test(sample(dfttu_wide$competence,50)) # 50 random participants: p-value = 0.3295 <- now the test shows normality!
##
## Shapiro-Wilk normality test
##
## data: sample(dfttu_wide$competence, 50)
## W = 0.97385, p-value = 0.3295
# so we trust the graphical display more and say that the data is normally distributed
#-------------------------------------------------------------
# Second, test if the variances of females and males are equal
# null hypothesis = variances are equal
# if the variances are equal, the p-value is greater than .05
# Bartlett
# parametric test, assumes normality
bartlett.test(competence ~ gender_chr, data = dfttu_wide) # p-value = 0.13 <- variances are equal
##
## Bartlett test of homogeneity of variances
##
## data: competence by gender_chr
## Bartlett's K-squared = 2.2929, df = 1, p-value = 0.13
# Levene
# parametric test, does not assume normality
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## The following object is masked from 'package:purrr':
##
## some
## The following object is masked from 'package:psych':
##
## logit
leveneTest(competence ~ gender_chr, data = dfttu_wide) # p-value = 0.14 <- variances are equal
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 2.1509 0.144
## 206
# Fligner
# non-parametric test, does not assume normality
fligner.test(competence ~ gender_chr, data = dfttu_wide) # p-value = 0.11 <- variances are equal
##
## Fligner-Killeen test of homogeneity of variances
##
## data: competence by gender_chr
## Fligner-Killeen:med chi-squared = 2.5986, df = 1, p-value = 0.107
# all three tests agreed that the variances of females and males are equal
# we test here if females and males differ by their experience of competence
# Student's t-test
# as we saw earlier, the variances of females and males are equal
# that allows us to use student's t-test that is a parametric test
t.test(competence ~ gender_chr,
data = dfttu_wide,
paired = FALSE, # two independent populations (gender) are compared
var.equal = TRUE) # equal variances are assumed
##
## Two Sample t-test
##
## data: competence by gender_chr
## t = -3.4328, df = 206, p-value = 0.0007221
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -0.4810452 -0.1300659
## sample estimates:
## mean in group Female mean in group Male
## 3.694444 4.000000
# Results of independent samples t-test (student's) showed that the male students experienced their competence to be on a higher level (M = 4.00) than the female students did (M = 3.69), t(206) = -3.4328, C.I. = -0.48 - -0.13, p < 0.001.
# We store the results into variables later used in the inline text:
ttest <- t.test(competence ~ gender,
data = dfttu_wide,
paired = FALSE, # two independent populations (gender) are compared
var.equal = TRUE)
dfttu_ttest_mean_females <- ttest$estimate[1]
dfttu_ttest_mean_males <- ttest$estimate[2]
dfttu_ttest_statistic <- ttest$statistic
dfttu_ttest_df <- ttest$parameter
dfttu_ttest_p <- ttest$p.value
dfttu_ttest_CIlower <- ttest$conf.int[1]
dfttu_ttest_CIupper <- ttest$conf.int[2]
# statistical significance is not that important without reporting the effect size
# typical indicator of ES is Cohen's d:
# small > .2 (58% of females below male's average competence)
# medium > .5 (69% of females below male's average competence)
# large > .8 (79% of females below male's average competence)
# very large > 1.4 (92% of females below male's average competence)
cohen.d(competence ~ gender, data = dfttu_wide)
## Call: cohen.d(x = competence ~ gender, data = dfttu_wide)
## Cohen d statistic of difference between two means
## lower effect upper
## competence 0.22 0.52 0.82
##
## Multivariate (Mahalanobis) distance between groups
## [1] 0.52
## r equivalent of difference between two means
## competence
## 0.23
# The effect size (Cohen's d) of this result was at the medium level (d = .52; 95% C.I. = .22 - .82).
# store ES values into variables for inline text:
effsize <- cohen.d(competence ~ gender, data = dfttu_wide)
dfttu_d <- effsize$cohen.d[2] # es value (d)
dfttu_d_CIlower <- effsize$cohen.d[1] # es value lower confidence interval
dfttu_d_CIupper <- effsize$cohen.d[3] # es value upper confidence interval
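The percentages attached to the d cutoffs listed earlier are Cohen's U3, the proportion of the lower-scoring group falling below the higher-scoring group's mean. Assuming normal distributions with equal variances, it is simply pnorm(d), so the "rules" can be reproduced directly (a sketch; `u3` is our own helper name):

```r
# Cohen's U3: proportion of one group below the other group's mean, for a given d
# (assumes normal distributions with equal SDs)
u3 <- function(d) pnorm(abs(d))
round(u3(c(0.2, 0.5, 0.8, 1.4)), 2)  # 0.58 0.69 0.79 0.92, matching the comments above
```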
# Welch's t-test
# what if the variances of females and males had not been equal?
# then we may use the parametric Welch's t-test, which does not assume equal variances
t.test(competence ~ gender,
data = dfttu_wide,
paired = FALSE,
var.equal = FALSE) # we indicate here that the variances are not equal
##
## Welch Two Sample t-test
##
## data: competence by gender
## t = -3.2234, df = 102.89, p-value = 0.001698
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.4935588 -0.1175523
## sample estimates:
## mean in group 0 mean in group 1
## 3.694444 4.000000
# results are quite identical to student's t-test results:
# Results of independent samples t-test (Welch's) showed that the male students experienced their competence to be on a higher level (M = 4.00) than the female students did (M = 3.69), t(102.89) = -3.2234, C.I. = -0.49 - -0.12, p = 0.002.
# Mann-Whitney U-test
# what if the normality assumption does not hold and we wish to use a non-parametric test?
# M-W U only assumes similar distributions for the compared groups, so let's see if they are alike
# first we produce a histogram of female's competence:
hist(dfttu_wide$competence[dfttu_wide$gender==0],main='Histogram for Females',xlab='Competence')
# followed by a histogram of male's competence:
hist(dfttu_wide$competence[dfttu_wide$gender==1],main='Histogram for Males',xlab='Competence')
# they look similar enough to carry on ...
# we use the wilcox.test function, which runs the Mann-Whitney U test by default for independent samples
wilcox.test(competence ~ gender,
data = dfttu_wide)
##
## Wilcoxon rank sum test with continuity correction
##
## data: competence by gender
## W = 3380.5, p-value = 0.002567
## alternative hypothesis: true location shift is not equal to 0
# Results of independent samples Mann-Whitney U test showed that the male students experienced their competence to be on a higher level (Md = 4.00) than the female students did (Md = 3.75), W = 3380.5, p = 0.003.
# We store the results into variables later used in the inline text:
Utest <- wilcox.test(competence ~ gender,
data = dfttu_wide)
dfttu_Utest_median_females <- median(dfttu_wide$competence[which(dfttu_wide$gender==0)])
dfttu_Utest_median_males <- median(dfttu_wide$competence[which(dfttu_wide$gender==1)])
dfttu_Utest_statistic <- Utest$statistic
dfttu_Utest_p <- Utest$p.value
Next, we’ll look into Jamovi:
# run all the assumption checks and tests above with a single code:
jmv::ttestIS( # IS stands for "independent samples"
formula = competence ~ gender,
data = dfttu_wide,
vars = competence,
welchs = TRUE, # default is student's t-test, this adds also Welch's t-test
mann = TRUE, # and Mann-Whitney U test
norm = TRUE, # check normality with Shapiro-Wilk et al. tests?
qq = TRUE, # display Q-Q plots for normality check?
eqv = TRUE, # check equal variances with e.g. Levene's test?
meanDiff = TRUE, # display mean difference?
ci = TRUE, # display confidence intervals?
effectSize = TRUE, # display Cohen's d (effect size)?
desc = TRUE) # display means and medians of females/males?
##
## INDEPENDENT SAMPLES T-TEST
##
## Independent Samples T-Test
## ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
## Statistic df p Mean difference SE difference Lower Upper Effect Size
## ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
## competence Student's t -3.432778 206.0000 0.0007221 -0.3055556 0.08901116 -0.4810452 -0.1300659 Cohen's d -0.5179921
## Welch's t -3.223377 102.8860 0.0016976 -0.3055556 0.09479363 -0.4935588 -0.1175523 Cohen's d -0.5013719
## Mann-Whitney U 3380.500 0.0025670 -0.2500494 -0.4999422 -1.256787e-5 Rank biserial correlation 0.2598796
## ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
## Note. Hₐ: μ₀ ≠ μ₁
##
##
## ASSUMPTIONS
##
## Normality Test (Shapiro-Wilk)
## ----------------------------------------
## W p
## ----------------------------------------
## competence 0.9721658 0.0003883
## ----------------------------------------
## Note. A low p-value suggests a
## violation of the assumption of
## normality
##
##
## Homogeneity of Variances Test (Levene's)
## ----------------------------------------------------
## F df df2 p
## ----------------------------------------------------
## competence 3.112431 1 206 0.0791790
## ----------------------------------------------------
## Note. A low p-value suggests a violation of
## the assumption of equal variances
##
##
## Group Descriptives
## ---------------------------------------------------------------------------------
## Group N Mean Median SD SE
## ---------------------------------------------------------------------------------
## competence 0 63 3.694444 3.750000 0.6559966 0.08264780
## 1 145 4.000000 4.000000 0.5590170 0.04642383
## ---------------------------------------------------------------------------------
# Results are naturally the same as before ..
Report the results
Results of the independent samples t-test (Student’s) showed that the male students experienced their competence to be on a higher level (M = 4.00) than the female students did (M = 3.69), t(206) = -3.433, C.I. = -0.48 - -0.13, p < 0.001. The effect size (Cohen’s d) of this finding was at a medium level (d = 0.52; 95% C.I. = 0.22 - 0.82).
Add here results regarding autonomy and gender …
Results of Mann-Whitney U test showed that the male students experienced their competence to be on a higher level (Md = 4.00) than the female students did (Md = 3.75), W = 3380.500, p = 0.003.
Now that we have three groups (low, medium, high) of mathematical achievement, we need to use the analysis of variance (ANOVA) instead of the t-test (which compares two groups).
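In fact, the t-test is the two-group special case of one-way ANOVA: with two groups, F equals t². A quick sketch with the built-in mtcars data (standing in for our own variables):

```r
# with two groups, one-way ANOVA reproduces Student's t-test: F = t^2
tt <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)
av <- summary(aov(mpg ~ factor(am), data = mtcars))[[1]]
unname(tt$statistic)^2   # t squared
av[1, "F value"]         # F from the ANOVA table: same value
```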
First, we’ll describe the data with tables and plots.
# display descriptive table of three groups (0=low, 1=medium, 2=high) in math score
library(dplyr)
dplyr::group_by(dfttu_wide, score3class_chr) %>%
dplyr::summarise(
count = n(),
mean = mean(competence, na.rm = TRUE),
sd = sd(competence, na.rm = TRUE)
)
## # A tibble: 3 × 4
## score3class_chr count mean sd
## <chr> <int> <dbl> <dbl>
## 1 High score 70 3.97 0.588
## 2 Low score 67 3.81 0.526
## 3 Medium score 71 3.94 0.684
# alternative way
library(psych)
describeBy(dfttu_wide$competence, dfttu_wide$score3class_chr) # give this command in the Console panel
##
## Descriptive statistics by group
## group: High score
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 70 3.97 0.59 4 4 0.37 2.5 5 2.5 -0.38 0.11 0.07
## ------------------------------------------------------------
## group: Low score
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 67 3.81 0.53 3.75 3.81 0.37 2.75 5 2.25 -0.03 -0.36 0.06
## ------------------------------------------------------------
## group: Medium score
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 71 3.94 0.68 4 3.96 0.74 2.5 5 2.5 -0.31 -0.69 0.08
# it seems that the low math scoring participants have the lowest competence ratings ...
# this plot visualizes the desc table
library("ggpubr")
ggboxplot(dfttu_wide, x = "score3class_chr", y = "competence",
color = "score3class_chr", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
ylab = "Competence", xlab = "Mathematical achievement")
# we may further ask if the gender has something to do with this?
# descriptives
dplyr::group_by(dfttu_wide, score3class_chr, gender_chr) %>%
dplyr::summarise(
count = n(),
mean = mean(competence, na.rm = TRUE),
sd = sd(competence, na.rm = TRUE)
)
## `summarise()` has grouped output by 'score3class_chr'. You can override using
## the `.groups` argument.
## # A tibble: 6 × 5
## # Groups: score3class_chr [3]
## score3class_chr gender_chr count mean sd
## <chr> <chr> <int> <dbl> <dbl>
## 1 High score Female 26 3.80 0.625
## 2 High score Male 44 4.07 0.548
## 3 Low score Female 16 3.73 0.487
## 4 Low score Male 51 3.84 0.540
## 5 Medium score Female 21 3.54 0.792
## 6 Medium score Male 50 4.11 0.560
# there are fewer females, and their competence scores are lower than the male scores in all three achievement groups
# boxplot of competence by score and gender
ggplot(dfttu_wide, aes(x = score3class_chr, y = competence, fill = gender_chr)) +
geom_boxplot(width = 1) +
labs(title="Development of competence by gender and mathematical achievement",
subtitle=" ",
x="Mathematical achievement",
y="Competence")
# lineplot
ggline(dfttu_wide, x = "score3class_chr", y = "competence",
add = c("mean_se", "jitter"),
order = c("High score", "Medium score", "Low score"),
ylab = "Competence", xlab = "Math achievement")
It seems that high and medium achieving participants have somewhat
higher competence scores than low achieving participants. Is this a
statistically significant finding?
Let’s use analysis of variance to find out:
\[ \operatorname{competence} = \alpha + \beta_{1}(\operatorname{score3class\_chr}_{\operatorname{Low\ score}}) + \beta_{2}(\operatorname{score3class\_chr}_{\operatorname{Medium\ score}}) + \epsilon \]
# to find that out, we run analysis of variance
dfttu_competence_anova_RQ3 <- aov(competence ~ score3class_chr, data = dfttu_wide)
# and display the results
summary(dfttu_competence_anova_RQ3)
## Df Sum Sq Mean Sq F value Pr(>F)
## score3class_chr 2 0.91 0.4540 1.243 0.291
## Residuals 205 74.87 0.3652
It seems that the difference in competence between the three math achievement groups is not statistically significant.
Nevertheless, we perform pairwise comparisons between the means of these three groups (high, medium, low).
# Tukey Honest Significant Differences test compares the means of the three math achievement groups
TukeyHSD(dfttu_competence_anova_RQ3)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = competence ~ score3class_chr, data = dfttu_wide)
##
## $score3class_chr
## diff lwr upr p adj
## Low score-High score -0.15442431 -0.3982784 0.08942982 0.2954089
## Medium score-High score -0.03123742 -0.2715561 0.20908130 0.9494280
## Medium score-Low score 0.12318688 -0.1198260 0.36619972 0.4564531
We learn that the greatest difference is between the low and high scoring groups (a negative value, as the high score group mean is subtracted from the low score group mean), followed by the difference between the low and medium scoring groups (a positive value, as the low score group mean is subtracted from the medium score group mean). There is not much difference between the medium and high scoring groups (the negative value indicates that the high scoring group has the highest average competence). None of these pairwise comparisons is statistically significant, as every "p adj" value is higher than .05. The same conclusion can be drawn from the signs of the lower ("lwr") and upper ("upr") end points of the 95% confidence intervals: a result is statistically significant only if the sign stays the same. That is not the case here; all intervals run from negative ("lwr") to positive ("upr") and thus contain zero.
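The sign rule can also be checked programmatically. A sketch with the built-in PlantGrowth data (standing in for our groups): a pairwise difference is significant exactly when its family-wise confidence interval excludes zero, i.e. when "lwr" and "upr" share the same sign.

```r
# Tukey HSD on the built-in PlantGrowth data (three groups, like our achievement classes)
tuk <- TukeyHSD(aov(weight ~ group, data = PlantGrowth))$group
excludes_zero <- sign(tuk[, "lwr"]) == sign(tuk[, "upr"])  # TRUE when zero is not in the CI
cbind(round(tuk, 3), significant = excludes_zero)
# the sign check agrees with the adjusted p-values
all(excludes_zero == (tuk[, "p adj"] < .05))
```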
Now it is up to you to investigate whether autonomy differs statistically significantly between the three math achievement groups …
# repeating this analysis with Jamovi produces similar result
jmv::anovaOneW(
formula = competence ~ score3class_chr,
data = dfttu_wide,
welchs = FALSE,
fishers = TRUE,
desc = TRUE,
descPlot = TRUE,
eqv = TRUE)
##
## ONE-WAY ANOVA
##
## One-Way ANOVA (Fisher's)
## -----------------------------------------------------
## F df1 df2 p
## -----------------------------------------------------
## competence 1.243138 2 205 0.2906428
## -----------------------------------------------------
##
##
## Group Descriptives
## ------------------------------------------------------------------------------
## score3class_chr N Mean SD SE
## ------------------------------------------------------------------------------
## competence High score 70 3.967857 0.5881090 0.07029247
## Low score 67 3.813433 0.5264649 0.06431793
## Medium score 71 3.936620 0.6836330 0.08113231
## ------------------------------------------------------------------------------
##
##
## ASSUMPTION CHECKS
##
## Homogeneity of Variances Test (Levene's)
## -----------------------------------------------------
## F df1 df2 p
## -----------------------------------------------------
## competence 2.525996 2 205 0.0824657
## -----------------------------------------------------
Report the results
Results of the analysis of variance showed that there was no difference in competence between the three mathematics achievement groups. Although the high and medium achieving participants had higher competence scores than the low achieving group, the difference was not statistically significant.
Add here results regarding autonomy and mathematics score …
# Kruskal-Wallis H test is a non-parametric alternative to ANOVA when the assumptions are not met
dfttu_competence_kw_RQ3 <- kruskal.test(competence ~ score3class_chr, data = dfttu_wide)
dfttu_competence_kw_RQ3
##
## Kruskal-Wallis rank sum test
##
## data: competence by score3class_chr
## Kruskal-Wallis chi-squared = 3.3736, df = 2, p-value = 0.1851
# data: competence by score3class_chr
# Kruskal-Wallis chi-squared = 3.3736, df = 2, p-value = 0.1851
# result is similar to ANOVA: no stat sig difference between the three math achievement groups
# pairwise comparisons are made with the Wilcoxon rank sum test
pairwise.wilcox.test(dfttu_wide$competence, dfttu_wide$score3class_chr,
p.adjust.method = "BH")
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: dfttu_wide$competence and dfttu_wide$score3class_chr
##
## High score Low score
## Low score 0.23 -
## Medium score 0.93 0.24
##
## P value adjustment method: BH
# High score Low score
# Low score 0.23 -
# Medium score 0.93 0.24
# also the pairwise comparisons produce a similar result: all values are above .05 (no statistically significant differences between the three groups' competence scores)
Results of Kruskal-Wallis H test showed that there were no statistically significant differences in competence between the three mathematical achievement groups.
This time the RQ is addressed with a predictive model in which changes in the dependent variable ("y" or "DV": "Competence") are predicted by two independent variables ("x" or "IV": "Autonomy", "Score"). This type of question is quite often addressed with linear regression analysis.
\[ \operatorname{competence} = \alpha + \beta_{1}(\operatorname{autonomy}) + \beta_{2}(\operatorname{score}) + \epsilon \]
We will start with RStudio … and first fit a simpler model:
\[ \operatorname{competence} = \alpha + \beta_{1}(\operatorname{autonomy}) + \epsilon \]
# fit the Model 1 and store the result into object
dfttu_competence_regression_RQ4 <- lm(competence ~ autonomy, data = dfttu_wide)
# display summary of the results
summary(dfttu_competence_regression_RQ4)
##
## Call:
## lm(formula = competence ~ autonomy, data = dfttu_wide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.06832 -0.25486 0.05572 0.24418 0.93168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.06061 0.17491 6.064 6.27e-09 ***
## autonomy 0.75193 0.04562 16.482 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3983 on 206 degrees of freedom
## Multiple R-squared: 0.5687, Adjusted R-squared: 0.5666
## F-statistic: 271.7 on 1 and 206 DF, p-value: < 2.2e-16
# Residuals:
# Min 1Q Median 3Q Max
# -1.06832 -0.25486 0.05572 0.24418 0.93168
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 1.06061 0.17491 6.064 6.27e-09 ***
# autonomy 0.75193 0.04562 16.482 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.3983 on 206 degrees of freedom
# Multiple R-squared: 0.5687, Adjusted R-squared: 0.5666
# F-statistic: 271.7 on 1 and 206 DF, p-value: < 2.2e-16
Intercept is the predicted value of competence (y or DV) when autonomy (predictor) has a value of zero. It is not an interpretable estimate as the autonomy values range from 1 to 5. Before proceeding, we change the values of autonomy (and also competence) to have a range of 0-4.
# change autonomy and competence values
dfttu_wide$autonomy <- dfttu_wide$autonomy - 1
dfttu_wide$competence <- dfttu_wide$competence - 1
# fit the Model 1 once again and store the result into object
dfttu_competence_regression_RQ4 <- lm(competence ~ autonomy, data = dfttu_wide)
# display summary of the results
summary(dfttu_competence_regression_RQ4)
##
## Call:
## lm(formula = competence ~ autonomy, data = dfttu_wide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.06832 -0.25486 0.05572 0.24418 0.93168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.81254 0.13007 6.247 2.36e-09 ***
## autonomy 0.75193 0.04562 16.482 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3983 on 206 degrees of freedom
## Multiple R-squared: 0.5687, Adjusted R-squared: 0.5666
## F-statistic: 271.7 on 1 and 206 DF, p-value: < 2.2e-16
Now the intercept tells us that the average level of competence is 0.81 (on the 0-4 range) when participants have no feeling of autonomy (value zero, "totally disagree").
To produce a more interpretable intercept, we will center the predictors (subtract the grand mean from each individual’s value). After this, value zero represents the average value.
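Before centering the course data, the idea can be shown with a minimal sketch on a toy vector (made-up values, not the course data):

```r
# grand mean centering and scaling of a toy vector
x <- c(1, 2, 3, 4, 5)

x_cgm   <- x - mean(x)              # centered: mean becomes 0
x_cgmsd <- (x - mean(x)) / sd(x)    # centered and scaled: mean 0, sd 1

mean(x_cgm)   # 0
sd(x_cgm)     # unchanged, same as sd(x)
sd(x_cgmsd)   # 1
```

Centering shifts the values so that zero marks the average; scaling additionally divides by the standard deviation so that predictors on very different ranges become comparable.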
# check the names in the df
dput(names(dfttu_wide))
## c("id", "group", "gender", "autonomy", "competence", "deep",
## "organised", "score", "score3class_chr", "score3class", "autonomy_chr",
## "gender_chr", "group_chr", "index")
# c("id", "group", "gender", "autonomy", "competence", "deep",
# "organised", "score", "score3class_chr", "score3class", "autonomy_chr",
# "gender_chr", "group_chr", "index")
# center and scale
library(magrittr) # for the %>% operator
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
##
## set_names
## The following object is masked from 'package:tidyr':
##
## extract
## The following object is masked from 'package:texreg':
##
## extract
library(dplyr) # sym
varlist <- c("autonomy","score") # we'll center also the score here as it will be later used in Model 2
for (varname in varlist) {
cgm <- paste0(varname,".cgm") # center grand mean
cgmsd <- paste0(varname,".cgmsd") # cgm scaled (M = 0, SD = 1)
dfttu_wide <- dfttu_wide %>% # pipe needs library(magrittr)
# Centering Grand Mean (CGM)
dplyr::mutate(
!!sym(cgm) := !!sym(varname) - mean(!!sym(varname), na.rm = T),
!!sym(cgmsd) := (!!sym(varname) - mean(!!sym(varname), na.rm = T)) / sd(!!sym(varname), na.rm = T)
# !!sym(cgmsd) := scale(!!sym(varname), center = T, scale = T) # optional way
)
}
dput(names(dfttu_wide))
## c("id", "group", "gender", "autonomy", "competence", "deep",
## "organised", "score", "score3class_chr", "score3class", "autonomy_chr",
## "gender_chr", "group_chr", "index", "autonomy.cgm", "autonomy.cgmsd",
## "score.cgm", "score.cgmsd")
# c("id", "group", "gender", "autonomy", "competence", "deep",
# "organised", "score", "score3class_chr", "score3class", "autonomy_chr",
# "gender_chr", "group_chr", "index", "autonomy.cgm", "autonomy.cgmsd",
# "score.cgm", "score.cgmsd")
# now we have centered (.cgm) and centered+scaled (.cgmsd) versions of autonomy and score
# scaling is a good option when the value ranges of the predictors are very different, as they are here:
range(dfttu_wide$autonomy)
## [1] 1.25 4.00
# [1] 1.25 4.00
range(dfttu_wide$score)
## [1] 0 100
# [1] 0 100
The following figures show a) the original (raw), b) the centered (grand mean subtracted) and c) the scaled (M = 0, SD = 1) versions of the "autonomy" variable: histograms with densities (magenta) and means (blue).
# --------------
# raw "autonomy"
var <- dfttu_wide$autonomy
# histogram
hist(var,
main = "a) raw autonomy",
xlab = "autonomy",
freq = F # to scale hist and density
)
# add density line
lines(density(var), col ="magenta", lwd = 2)
# add mean line
abline(v = mean(var), col ="blue", lwd = 2)
# -------------------
# centered "autonomy"
var <- dfttu_wide$autonomy.cgm
# histogram
hist(var,
main = "b) centered autonomy",
xlab = "autonomy",
freq = F # to scale hist and density
)
# add density line
lines(density(var), col ="magenta", lwd = 2)
# add mean line
abline(v = mean(var), col ="blue", lwd = 2)
# ------------------------------
# centered and scaled "autonomy"
var <- dfttu_wide$autonomy.cgmsd
# histogram
hist(var,
main = "c) centered and scaled autonomy",
xlab = "autonomy",
freq = F # to scale hist and density
)
# add density line
lines(density(var), col ="magenta", lwd = 2)
# add mean line
abline(v = mean(var), col ="blue", lwd = 2)
The figures show that the shape of the distribution remains the same; centering and scaling only shift and rescale the x-axis.
# fit the Model 1 with centered autonomy
dfttu_competence_regression_RQ4_cgm <- lm(competence ~ autonomy.cgm, data = dfttu_wide)
# display summary of the results
summary(dfttu_competence_regression_RQ4_cgm)
##
## Call:
## lm(formula = competence ~ autonomy.cgm, data = dfttu_wide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.06832 -0.25486 0.05572 0.24418 0.93168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.90745 0.02762 105.28 <2e-16 ***
## autonomy.cgm 0.75193 0.04562 16.48 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3983 on 206 degrees of freedom
## Multiple R-squared: 0.5687, Adjusted R-squared: 0.5666
## F-statistic: 271.7 on 1 and 206 DF, p-value: < 2.2e-16
# Residuals:
# Min 1Q Median 3Q Max
# -1.06832 -0.25486 0.05572 0.24418 0.93168
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 2.90745 0.02762 105.28 <2e-16 ***
# autonomy.cgm 0.75193 0.04562 16.48 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Now (using the grand mean centered predictor) we can say that the average level of competence is 2.91 (on the 0-4 range) when autonomy is at its average value.
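This is no coincidence: in a simple linear regression with a grand mean centered predictor, the intercept equals the mean of the DV. A minimal sketch with R's built-in mtcars data (not the course data):

```r
# with a centered predictor the intercept equals the mean of the DV
fit_raw <- lm(mpg ~ wt, data = mtcars)
fit_cgm <- lm(mpg ~ I(wt - mean(wt)), data = mtcars)

coef(fit_raw)[1]   # intercept at wt = 0 (not an observable car weight)
coef(fit_cgm)[1]   # intercept at average wt: equals mean(mtcars$mpg)
mean(mtcars$mpg)   # 20.09062
```

The same logic explains why the intercept above (2.90745) matches the mean of competence in the data.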
Autonomy is a statistically significant positive predictor of competence, F(1,206) = 271.7, p < .001. This model explains approximately 56.7% (Adjusted R^2 = 0.5666) of the variance in the DV (competence). That is quite a lot. What happens if we add the mathematics score as a predictor?
# fit the model with score
dfttu_competence_regression2_RQ4 <- lm(competence ~ autonomy + score, data = dfttu_wide)
dfttu_competence_regression2_RQ4_cgm <- lm(competence ~ autonomy.cgm + score.cgm, data = dfttu_wide)
# display summary of the results
summary(dfttu_competence_regression2_RQ4_cgm)
##
## Call:
## lm(formula = competence ~ autonomy.cgm + score.cgm, data = dfttu_wide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.12502 -0.24645 0.00879 0.22578 0.90144
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9074519 0.0274969 105.738 <2e-16 ***
## autonomy.cgm 0.7530085 0.0454252 16.577 <2e-16 ***
## score.cgm 0.0015871 0.0009467 1.676 0.0952 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3966 on 205 degrees of freedom
## Multiple R-squared: 0.5746, Adjusted R-squared: 0.5704
## F-statistic: 138.4 on 2 and 205 DF, p-value: < 2.2e-16
# Residuals:
# Min 1Q Median 3Q Max
# -1.12502 -0.24645 0.00879 0.22578 0.90144
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 2.9074519 0.0274969 105.738 <2e-16 ***
# autonomy.cgm 0.7530085 0.0454252 16.577 <2e-16 ***
# score.cgm 0.0015871 0.0009467 1.676 0.0952 .
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.3966 on 205 degrees of freedom
# Multiple R-squared: 0.5746, Adjusted R-squared: 0.5704
# F-statistic: 138.4 on 2 and 205 DF, p-value: < 2.2e-16
Score is not an important predictor, as the R^2 (explained variance in the DV) remains almost the same (Adjusted R-squared = 0.5704, interpretation is 57.0%).
We do not have to take this on faith, as there is a dedicated test for comparing two "competing" models: anova().
# we compare (simpler) Model 1 with a more complex Model 2
anova(dfttu_competence_regression_RQ4_cgm,dfttu_competence_regression2_RQ4_cgm)
## Analysis of Variance Table
##
## Model 1: competence ~ autonomy.cgm
## Model 2: competence ~ autonomy.cgm + score.cgm
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 206 32.681
## 2 205 32.239 1 0.44199 2.8105 0.09518 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Model 1: competence ~ autonomy.cgm
# Model 2: competence ~ autonomy.cgm + score.cgm
# Res.Df RSS Df Sum of Sq F Pr(>F)
# 1 206 32.681
# 2 205 32.239 1 0.44199 2.8105 0.09518 .
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
What does the output tell us?
Note that Model 1 is simpler (one IV) than Model 2 (two IVs).
The "Res.Df" column stands for Residual Degrees of Freedom:
n of observations - n of independent variables - 1
Model 1: 208 - 1 - 1 = 206.
Model 2: 208 - 2 - 1 = 205.
The "Df" column is the difference between the two: 206 - 205 = 1.
"RSS" is the Residual Sum of Squares, a measure of the discrepancy between the data and the model (smaller values indicate a tighter fit between the data and the model).
# https://rinterested.github.io/statistics/anova_of_OLS_models.html
RSS_fit1 <- sum((dfttu_wide$competence - fitted(dfttu_competence_regression_RQ4_cgm))^2)
RSS_fit1
## [1] 32.68113
# [1] 32.68113 # matches the output of ANOVA for Model 1
RSS_fit2 <- sum((dfttu_wide$competence - fitted(dfttu_competence_regression2_RQ4_cgm))^2)
RSS_fit2
## [1] 32.23915
# [1] 32.23915 # matches the output of ANOVA for (more complex) Model 2
"Sum of Sq" is calculated as:
RSS Model 1 - RSS Model 2 = 32.68113 - 32.23915 = 0.44198.
Model 1 has a larger RSS, so it fits the data a bit worse than Model 2. But does this small difference (0.44198) really matter? One way to decide is to look at the p-value: if it is smaller than .05, we conclude that the difference matters. In our case the value (0.09518) is larger than .05, so we may continue using the simpler Model 1 (competence ~ autonomy).
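The F statistic that anova() reports can be reproduced by hand from the values above:

```r
# reproducing the anova() F statistic from the printed RSS values:
# F = ((RSS1 - RSS2) / df difference) / (RSS2 / residual df of Model 2)
RSS1 <- 32.68113   # Model 1 (206 residual df)
RSS2 <- 32.23915   # Model 2 (205 residual df)

F_manual <- ((RSS1 - RSS2) / (206 - 205)) / (RSS2 / 205)
F_manual   # approximately 2.8104 (anova() prints 2.8105 from the unrounded RSS values)
```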
There are also other tools for comparing models. The Bayesian Information Criterion (BIC) penalizes extra parameters more heavily and therefore tends to prefer simpler models. Below we see that Model 1 with only one predictor (competence ~ autonomy) gets a smaller (= preferable) BIC value than the more complex Model 2 (competence ~ autonomy + score).
The Akaike Information Criterion (AIC), however, gives a smaller (= preferable) value to the more complex Model 2 (competence ~ autonomy + score), so by this criterion the added predictor improves the model's predictive value.
# BIC of the simpler model: competence ~ autonomy
BIC(dfttu_competence_regression_RQ4_cgm)
## [1] 221.3371
# [1] 221.3371 # smaller than Model 2, so BIC prefers Model 1
# BIC of the more complex model: competence ~ autonomy + score
BIC(dfttu_competence_regression2_RQ4_cgm)
## [1] 223.8424
# [1] 223.8424
# AIC of the simpler model: competence ~ autonomy
AIC(dfttu_competence_regression_RQ4_cgm)
## [1] 211.3245
# [1] 211.3245
# AIC of the more complex model: competence ~ autonomy + score
AIC(dfttu_competence_regression2_RQ4_cgm)
## [1] 210.4923
# [1] 210.4923 # smaller than Model 1, so AIC prefers Model 2
# so, BIC favors the simpler model while AIC prefers the more complex one ...
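The different preferences follow from the penalty terms: AIC = -2·logLik + 2·k, while BIC = -2·logLik + log(n)·k, where k counts the estimated parameters. With n = 208, log(208) ≈ 5.34 > 2, so BIC punishes the extra predictor harder. A sketch verifying these identities on R's built-in mtcars data (not the course data):

```r
# AIC and BIC share the fit term and differ only in the complexity penalty
fit <- lm(mpg ~ wt, data = mtcars)
ll  <- logLik(fit)
k   <- attr(ll, "df")   # parameters: coefficients + residual standard deviation
n   <- nobs(fit)

AIC(fit) - (-2 * as.numeric(ll) + 2 * k)        # 0
BIC(fit) - (-2 * as.numeric(ll) + log(n) * k)   # 0
```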
fit1 <- dfttu_competence_regression_RQ4_cgm
fit2 <- dfttu_competence_regression2_RQ4_cgm
mu_competence <- mean(dfttu_wide$competence) # Mean competence
TSS <- sum((dfttu_wide$competence - mu_competence)^2) # Total sum of squares
MSS_fit1 <- sum((fitted(fit1) - mu_competence)^2) # Variation accounted for by model
MSS_fit2 <- sum((fitted(fit2) - mu_competence)^2) # Variation accounted for by model
RSS_fit1 <- sum((dfttu_wide$competence - fitted(fit1))^2) # Error sum of squares fit1
RSS_fit2 <- sum((dfttu_wide$competence - fitted(fit2))^2) # Error sum of squares fit2
Frac_RSS_fit1 <- RSS_fit1 / TSS # % Variation secondary to residuals fit1
Frac_RSS_fit2 <- RSS_fit2 / TSS # % Variation secondary to residuals fit2
R.sq_fit1 <- 1 - Frac_RSS_fit1 # % Variation secondary to Model fit1
R.sq_fit2 <- 1 - Frac_RSS_fit2 # % Variation secondary to Model fit2
n <- nrow(dfttu_wide) # Number of subjects or observations
Constraints <- 1 # Constraints imposed or difference in IVs fit2 vs. fit1
UnConstrained <- 2 # Independent variables unconstrained model (fit2)
F_value <- (R.sq_fit2 - R.sq_fit1) * (n - UnConstrained - 1) / ((1 - R.sq_fit2) * Constraints)
F_value
## [1] 2.810463
# [1] 2.810463 # matches ANOVA output
Residuals represent the "error" (unexplained variance) in the model: how far are the observations from the predicted (linear) slope?
# residuals should look "random" and centered around zero
plot(dfttu_competence_regression2_RQ4$residuals,
pch = 16, # filled dot, value "17" would produce triangles ...
col = "darkred")
# and they do!
# let's see more ..
# create a new data frame that contains only competence and autonomy columns
dfReg <- dfttu_wide[c("competence","autonomy")]
# plot the observed data
plot(dfReg, pch = 16, col = "darkblue") #Plot the results
# and add a regression line
abline(dfttu_competence_regression2_RQ4)
## Warning in abline(dfttu_competence_regression2_RQ4): only using the first two of
## 3 regression coefficients
# in the first session, we conducted multivariate normality tests to get rid of 27 participants (outliers)
# did that pay off?
# read in the original data that has 27 rows more (235) than the current data (208)
dfttu_wide_original <- read.table("dfttu_wide_cross_complete.txt", header=TRUE, sep= "\t")
# create a new data frame that contains only competence and autonomy columns
dfReg_original <- dfttu_wide_original[c("competence","autonomy")]
# plot the observed data
plot(dfReg_original, pch = 16, col = "darkblue", main = "Original data (n=235)")
# fit regression with the original data
dfttu_competence_regression_RQ4_original <- lm(competence ~ autonomy, data = dfttu_wide_original)
summary(dfttu_competence_regression_RQ4_original)
##
## Call:
## lm(formula = competence ~ autonomy, data = dfttu_wide_original)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.85360 -0.27838 0.04685 0.29685 1.34549
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.25632 0.16971 7.403 2.42e-12 ***
## autonomy 0.69909 0.04521 15.465 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.489 on 233 degrees of freedom
## Multiple R-squared: 0.5065, Adjusted R-squared: 0.5044
## F-statistic: 239.2 on 1 and 233 DF, p-value: < 2.2e-16
# Residuals:
# Min 1Q Median 3Q Max
# -1.85360 -0.27838 0.04685 0.29685 1.34549
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 1.25632 0.16971 7.403 2.42e-12 ***
# autonomy 0.69909 0.04521 15.465 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.489 on 233 degrees of freedom
# Multiple R-squared: 0.5065, Adjusted R-squared: 0.5044
# F-statistic: 239.2 on 1 and 233 DF, p-value: < 2.2e-16
# Note: This original data (n=235, without multivariate normality testing) has a lower R^2 of 50.4%
# compared to the data that was cleaned from the outliers (n=208). The intercept also differs
# (1.25632 vs. 0.81254 in the cleaned model), though part of the difference comes from
# rescaling the cleaned data to the 0-4 range
# and add a regression line
abline(dfttu_competence_regression_RQ4_original)
# plot also the previous regression
plot(dfReg, pch = 16, col = "darkblue", main = "Cleaned data (n=208)")
abline(dfttu_competence_regression_RQ4)
The regression line runs in the same direction as the observations with both the "original" (n=235) and the "cleaned" (n=208) data.
But how about the math achievement ("score") as a predictor of competence?
# -----------------------------
# first, with the original data
dfReg_original2 <- dfttu_wide_original[c("competence","score")]
# plot the observed data
plot(dfReg_original2, pch = 16, col = "darkblue", main = "competence ~ score, original data (n=235)")
# fit regression with the original data
dfttu_competence_regression_RQ4_original2 <- lm(competence ~ score, data = dfttu_wide_original)
summary(dfttu_competence_regression_RQ4_original2)
##
## Call:
## lm(formula = competence ~ score, data = dfttu_wide_original)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.46133 -0.32506 0.07747 0.38839 1.30492
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.695083 0.086576 42.68 <2e-16 ***
## score 0.002924 0.001556 1.88 0.0614 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6909 on 233 degrees of freedom
## Multiple R-squared: 0.01494, Adjusted R-squared: 0.01071
## F-statistic: 3.534 on 1 and 233 DF, p-value: 0.06138
# Residuals:
# Min 1Q Median 3Q Max
# -2.46133 -0.32506 0.07747 0.38839 1.30492
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 3.695083 0.086576 42.68 <2e-16 ***
# score 0.002924 0.001556 1.88 0.0614 .
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.6909 on 233 degrees of freedom
# Multiple R-squared: 0.01494, Adjusted R-squared: 0.01071
# F-statistic: 3.534 on 1 and 233 DF, p-value: 0.06138
# Note: It is clear that the math score is not a good predictor of competence!
# and add a regression line
abline(dfttu_competence_regression_RQ4_original2, col = "red")
# -----------------------------
# second, with the cleaned data
# create a new data frame that has only competence and score
dfReg2 <- dfttu_wide[c("competence","score")]
# plot the observed data
plot(dfReg2, pch = 16, col = "black", main = "competence ~ score, cleaned data (n=208)") #Plot the results
# fit lin reg model
dfttu_competence_regression3_RQ4 <- lm(competence ~ score, data = dfttu_wide)
# and add a regression line
abline(dfttu_competence_regression3_RQ4, col = "red")
We can see that there is no linear association between competence and the math score.
The next question is whether autonomy and the math score have an interaction effect (i.e., whether the effect of autonomy on competence depends on the level of the math score).
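In equation form, the interaction model adds a product term to the earlier regression:
\[ \operatorname{competence} = \alpha + \beta_{1}(\operatorname{autonomy}) + \beta_{2}(\operatorname{score}) + \beta_{3}(\operatorname{autonomy} \times \operatorname{score}) + \epsilon \]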
# we add an interaction term by using "*" symbol btw autonomy and score (instead of "+")
dfttu_competence_regression2_interaction_RQ4 <- lm(competence ~ autonomy * score, data = dfttu_wide)
# display summary of the results
summary(dfttu_competence_regression2_interaction_RQ4)
##
## Call:
## lm(formula = competence ~ autonomy * score, data = dfttu_wide)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.13613 -0.24311 0.01938 0.24222 0.89563
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.961584 0.270047 3.561 0.00046 ***
## autonomy 0.671925 0.094314 7.124 1.76e-11 ***
## score -0.003008 0.004778 -0.629 0.52976
## autonomy:score 0.001638 0.001669 0.981 0.32774
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3966 on 204 degrees of freedom
## Multiple R-squared: 0.5766, Adjusted R-squared: 0.5703
## F-statistic: 92.59 on 3 and 204 DF, p-value: < 2.2e-16
# # we add an interaction term by using "*" symbol btw autonomy and score (instead of "+")
# dfttu_competence_regression2_interaction_RQ4_cgm <- lm(competence ~ autonomy.cgm * score.cgm, data = dfttu_wide)
#
# # display summary of the results
# summary(dfttu_competence_regression2_interaction_RQ4_cgm)
There is no interaction effect, as the "autonomy:score" term is not statistically significant ("Pr(>|t|)" = 0.32774, i.e., > .05), while autonomy alone remains a statistically significant predictor of competence ("Pr(>|t|)" = 1.76e-11, i.e., < .001).
Jamovi produces similar results:
jmv::linReg(
data = dfttu_wide,
dep = competence,
covs = vars(autonomy, score),
blocks = list(
list(
"autonomy",
"score",
c("autonomy", "score"))),
refLevels = list(),
r = FALSE,
r2 = FALSE,
r2Adj = TRUE,
aic = TRUE,
bic = TRUE,
norm = TRUE,
qqPlot = TRUE,
resPlots = TRUE)
## there are higher-order terms (interactions) in this model
## consider setting type = 'predictor'; see ?vif
##
## LINEAR REGRESSION
##
## Model Fit Measures
## ------------------------------------------------
## Model Adjusted R² AIC BIC
## ------------------------------------------------
## 1 0.5703453 211.5133 228.2010
## ------------------------------------------------
##
##
## MODEL SPECIFIC RESULTS
##
## MODEL 1
##
## Model Coefficients - competence
## -----------------------------------------------------------------------------
## Predictor Estimate SE t p
## -----------------------------------------------------------------------------
## Intercept 0.961583768 0.270046776 3.5608045 0.0004601
## autonomy 0.671925202 0.094313558 7.1243755 < .0000001
## score -0.003007682 0.004778327 -0.6294425 0.5297637
## autonomy:score 0.001637634 0.001669300 0.9810306 0.3277397
## -----------------------------------------------------------------------------
##
##
## ASSUMPTION CHECKS
##
## Normality Test (Shapiro-Wilk)
## -----------------------------
## Statistic p
## -----------------------------
## 0.9896875 0.1418734
## -----------------------------
Report the results
Research question 4 was addressed with linear regression where participants' self-assessed competence (DV) was regressed on their self-assessed autonomy (IV1) and their score from a mathematics test (IV2). Results showed that autonomy was a statistically significant positive predictor of competence while mathematical achievement was not related to competence, F(2,205) = 138.4, p < .001. This model predicted approximately 57.0% (Adjusted R^2 = 0.5704) of the variance in the DV (competence). No interaction effect between autonomy and mathematical achievement was found.
Course homepage:
https://homepages.tuni.fi/petri.nokelainen/mmx9120/