# Play the chunk above and this one to get the data into your Console
View(Friendly)
?Friendly
Many teachers and other educators are interested in understanding how to best deliver new content to students. In general, they have two choices of how to do this.
A study was performed to determine whether the Meshed or the Before approach to delivering content improves memory recall.
On paper, there is no obvious reason why one of these two approaches should produce different results from the other. With that in mind, the null hypothesis is that there is no difference between the results of the Before and Meshed groups.
\[H_0: \text{difference in medians of Before and Meshed approaches = 0}\] \[H_a: \text{difference in medians of Before and Meshed approaches} \neq 0\]
By performing a Wilcoxon rank sum test, we can see whether there is sufficient evidence to reject the null hypothesis. The test will be performed at a significance level of \(\alpha = 0.05\) (a 0.95 confidence level).
wilcox.test(Friendly$correct[Friendly$condition == 'Meshed'], Friendly$correct[Friendly$condition == 'Before']) %>%
  pander(caption="Wilcoxon rank sum test with continuity correction (there are ties in the data)")
Test statistic | P value | Alternative hypothesis |
---|---|---|
38 | 0.378 | two.sided |
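To make the test statistic reported above more concrete, here is a minimal sketch of how R's `wilcox.test` arrives at it: the two samples are ranked together (ties receive midranks), the ranks of the first sample are summed, and the smallest possible rank sum is subtracted. The vectors `x` and `y` are hypothetical scores chosen for illustration; they are not the Friendly data.

```r
# Hypothetical scores for illustration only (NOT the Friendly data)
x <- c(30, 36, 36, 38, 40)
y <- c(24, 37, 39, 39, 40)

# Rank all observations together; tied values share the average rank
r  <- rank(c(x, y))
n1 <- length(x)

# W = (sum of ranks of the first sample) - n1 * (n1 + 1) / 2
W_manual <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

# wilcox.test warns that an exact p-value is unavailable with ties;
# the warning is suppressed here because only the statistic is compared
res <- suppressWarnings(wilcox.test(x, y))
W_builtin <- unname(res$statistic)

isTRUE(all.equal(W_manual, W_builtin))
```

Because the statistic depends on which sample is listed first, swapping the two arguments changes W but leaves the two-sided p-value unchanged.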
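The notches on the box plot below mark an approximate 95% confidence interval around each group's median. They act as a rough visual test: if the notches of two boxes do not overlap, the medians are likely to differ significantly, while overlapping notches suggest no significant difference.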
ggplot(Friendly, aes(x = condition, y = correct, fill = condition)) +
  geom_boxplot(fill = c('cyan3', 'cyan', 'gray'), color = 'black',
               notch = TRUE, width = 0.4) +
  geom_dotplot(binaxis = 'y', position = 'dodge', stackdir = 'center',
               binwidth = 0.5) +
  scale_fill_manual(values = c("cyan", "cyan3", "lightgray")) +
  labs(title = "Recalling Words Test",
       x = "Testing Methods",
       y = "Final Score") +
  theme_light() +
  theme(legend.position = 'none')
We can immediately see that scores under SFR are noticeably lower than under the first two methods; however, the Before and Meshed groups overlap heavily, so the same can’t be said when comparing those two methods with each other.
favstats(correct ~ condition, data = Friendly)[,-10] %>%
pander(caption= "Box-Plot Statistics")
condition | min | Q1 | median | Q3 | max | mean | sd | n |
---|---|---|---|---|---|---|---|---|
Before | 24 | 37.25 | 39 | 39.75 | 40 | 36.6 | 5.337 | 10 |
Meshed | 30 | 36 | 36.5 | 38.75 | 40 | 36.6 | 3.026 | 10 |
SFR | 21 | 25 | 27 | 38.5 | 39 | 30.3 | 7.334 | 10 |
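The notch limits drawn on the box plot can be approximated from the quartiles in the table above. The sketch below uses the formula documented for `geom_boxplot`, median ± 1.58 · IQR / √n; the helper `notch_limits` is a hypothetical name introduced here for illustration, and the inputs are copied from the table rather than recomputed from the data.

```r
# Hypothetical helper: approximate notch limits from summary statistics
# using median +/- 1.58 * IQR / sqrt(n)
notch_limits <- function(q1, med, q3, n) {
  half_width <- 1.58 * (q3 - q1) / sqrt(n)
  c(lower = med - half_width, upper = med + half_width)
}

# Values taken from the favstats table
before <- notch_limits(q1 = 37.25, med = 39,   q3 = 39.75, n = 10)
meshed <- notch_limits(q1 = 36,    med = 36.5, q3 = 38.75, n = 10)

# Two intervals overlap when each one starts before the other ends
notches_overlap <- unname(before["lower"] <= meshed["upper"] &&
                          meshed["lower"] <= before["upper"])

print(round(before, 2))
print(round(meshed, 2))
print(notches_overlap)
```

Under these assumptions the Before and Meshed notches overlap, which agrees with the Wilcoxon test not finding a significant difference between the two groups.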
Let’s look at our test results:
The Wilcoxon test resulted in a p-value of 0.378, which is greater than our significance level of \(\alpha = 0.05\). Thus there is insufficient evidence to reject the null hypothesis; we cannot conclude that the Before and Meshed learning methods differ. Their median values are fairly similar: \(\text{Meshed} = 36.5\) words correctly recalled, \(\text{Before} = 39\) words correctly recalled. Beyond that, the plot and summary statistics suggest that both methods outperform the SFR method \(\text{(median}=27\text{ words correctly recalled)}\), although that comparison was not formally tested here. On this evidence, at least one of them should be preferred in learning models over simply randomizing the order of learning material.