In search for the most informative data for feedback generation: Learning analytics in a data-rich context
Introduction
Learning analytics provide institutions with opportunities to support student progression and to enable personalised, rich learning (Bienkowski et al., 2012, Oblinger, 2012, Siemens et al., 2013, Tobarra et al., 2014). With the increased availability of large datasets, powerful analytics engines (Tobarra et al., 2014), and skillfully designed visualisations of analytics results (González-Torres, García-Peñalvo, & Therón, 2013), institutions may be able to use the experience of the past to create supportive, insightful models of primary (and perhaps real-time) learning processes (Rienties et al., submitted for publication, Baker, 2010, Stiles, 2012). According to Bienkowski et al. (2012, p. 5), “education is getting very close to a time when personalisation will become commonplace in learning”, although several researchers (García-Peñalvo et al., 2011, Greller and Drachsler, 2012, Stiles, 2012) indicate that most institutions may not be ready to exploit the variety of available datasets for learning and teaching.
Many learning analytics applications use data generated from learner activities, such as the number of clicks (Siemens, 2013, Wolff et al., 2013), learner participation in discussion forums (Agudo-Peregrina et al., 2014, Macfadyen and Dawson, 2010), or (continuous) computer-assisted formative assessments (Tempelaar, Heck, Cuypers, van der Kooij, & van de Vrie, 2013; Tempelaar, Kuperus et al., 2012; Wolff et al., 2013). User behaviour data are frequently supplemented with background data retrieved from learning management systems (LMS) (Macfadyen & Dawson, 2010) and other student admission systems, such as accounts of prior education (Arbaugh, 2014, Richardson, 2012, Tempelaar et al., 2012). For example, in one of the first learning analytics studies, focused on 118 biology students, Macfadyen and Dawson (2010) found that some LMS variables (number of discussion messages posted, assessments finished, and mail messages sent), but not all (e.g., time spent in the LMS), were useful predictors of student retention and academic performance.
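To make the flavour of such predictive modelling concrete, the sketch below regresses final grades on the three kinds of LMS activity counts mentioned above, using ordinary least squares. All numbers and the grade variable are invented for illustration; they are not data from the Macfadyen and Dawson study.

```python
import numpy as np

# Hypothetical LMS trace counts for six students (illustrative only):
# discussion messages posted, assessments finished, mail messages sent.
X = np.array([
    [12, 5, 3],
    [ 4, 2, 1],
    [20, 6, 5],
    [ 8, 4, 2],
    [15, 5, 4],
    [ 2, 1, 0],
], dtype=float)
y = np.array([72.0, 55.0, 88.0, 64.0, 79.0, 41.0])  # invented final grades

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of multiple correlation R between fitted and observed grades.
fitted = A @ coef
R = np.corrcoef(fitted, y)[0, 1]
print(round(R, 3))
```

With activity counts that track performance this closely, R is high; on real course data the fit is of course far weaker, which is precisely why the choice of data sources matters.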
Buckingham Shum and Deakin Crick (2012) propose a dispositional learning analytics infrastructure that combines learning activity generated data with learning dispositions, values and attitudes measured through self-report surveys, which are fed back to students and teachers through visual analytics. For example, longitudinal studies in motivation research (Järvelä, Hurme, & Järvenoja, 2011; Rienties, Tempelaar, Giesbers, Segers, & Gijselaers, 2012) and students’ learning approaches (Nijhuis, Segers, & Gijselaers, 2008) indicate strong variability in how students learn over time in face-to-face settings (e.g., becoming more focussed on deep learning rather than surface learning), depending on the learning design, teacher support, tasks, and learning dispositions of students. Indeed, in a study amongst 730 students, Tempelaar, Niculescu, et al. (2012) found that positive learning emotions contributed positively to becoming an intensive online learner, while negative learning emotions, like boredom, contributed negatively to learning behaviour. Similarly, in an online community of practice of 133 instructors supporting EdD students, Nistor et al. (2014) found that self-efficacy (and expertise) of instructors predicted online contributions.
However, a combination of LMS data with intentionally collected data, such as self-report data stemming from student responses to surveys, is an exception rather than the rule in learning analytics (Buckingham Shum and Ferguson, 2012, Greller and Drachsler, 2012, Macfadyen and Dawson, 2010, Tempelaar et al., 2013). In our empirical contribution, focusing on a large-scale module in introductory mathematics and statistics, we aim to provide a practical application of such an infrastructure based on combining longitudinal learning and learner data. In collecting learner data, we opted to use three validated self-report surveys firmly rooted in current educational research, covering learning styles (Vermunt, 1996), learning motivation and engagement (Martin, 2007), and learning emotions (Pekrun, Goetz, Frenzel, Barchfeld, & Perry, 2011). This operationalisation of learning dispositions closely resembles the specification of cognitive, metacognitive and motivational learning factors relevant for the internal loop of informative tutoring feedback (e.g., Narciss, 2008, Narciss and Huth, 2006). For learning data, data sources are used from more common learning analytics applications, constituting both data extracted from an institutional LMS (González-Torres et al., 2013, Macfadyen and Dawson, 2010) and system track data extracted from the e-tutorials used for practicing and formative assessments (e.g., Tempelaar et al., 2013; Tempelaar, Kuperus, et al., 2012; Wolff et al., 2013). The prime aim of the analysis is predictive modelling (Baker, 2010, Sao Pedro et al., 2013), with a focus on the role that each of the 100+ predictor variables from the several data sources can play in generating timely, informative feedback for students.
Learning analytics
A broad goal of learning analytics is to apply the outcomes of analysing data gathered by monitoring and measuring the learning process (Buckingham Shum and Ferguson, 2012, Siemens, 2013). A vast body of research on student retention (Credé and Niehorster, 2012, Marks et al., 2005, Richardson, 2012) indicates that academic performance can be reasonably well predicted by a range of demographic, academic integration, social integration, psycho-emotional and social factors, although most
Research questions
While an increasing body of research is becoming available on how students’ usage of and behaviour in LMS influence academic performance (e.g., Arbaugh, 2014, Macfadyen and Dawson, 2010, Marks et al., 2005, Wolff et al., 2013), how the use of e-tutorials or other formats of blended learning affects performance (e.g., Lajoie & Azevedo, 2006), and how feedback based on learning dispositions stimulates learning (Buckingham Shum & Deakin Crick, 2012), to the best of our knowledge no study has looked at
Results
As the aim of this study is predictive modelling in a rich data context, we focus the reporting on the coefficient of multiple correlation, R, of the several prediction models. Although the ultimate aim of prediction modelling is often the comparison of explained variation, which is based on the square of the multiple correlation, we opted for using R itself, to allow for more detailed comparisons between alternative models. Values for R are documented in Table 1 for prediction models
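As a minimal sketch of such model comparisons, the snippet below computes the multiple correlation R for two nested prediction models on simulated data. The variable names (disposition, mastery) and all distributional choices are assumptions for illustration, not the study's actual predictors or effect sizes.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200

# Simulated data: a hypothetical disposition score available at the start
# of the module, and an e-tutorial mastery score accumulated during it.
disposition = rng.normal(size=n)
mastery = 0.6 * disposition + rng.normal(scale=0.8, size=n)
exam = 0.3 * disposition + 0.7 * mastery + rng.normal(scale=0.5, size=n)

def multiple_R(y, *predictors):
    """R between observed y and its OLS fit on the given predictors."""
    A = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.corrcoef(A @ coef, y)[0, 1]

R_base = multiple_R(exam, disposition)           # dispositions only
R_full = multiple_R(exam, disposition, mastery)  # add e-tutorial data
print(round(R_base, 2), round(R_full, 2))
```

Because the models are nested, R for the full model can never fall below R for the base model; the size of the gain is what signals the added predictive value of the extra data source.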
Discussion
In this empirical study into predictive modelling of student performance, we investigated several different data sources to explore the potential of generating informative feedback for students and teachers using learning analytics: data from registration systems, entry test data, students’ learning dispositions, BlackBoard tracking data, tracking data from two e-tutorial systems, and data from systems for formative, computer-assisted assessments. In line with recommendations by Agudo-Peregrina
Conclusion
The generation of timely feedback based on early performance predictions and early signalling of underperformance are crucial objectives in many learning analytics applications. The added value of data sources for such applications will therefore depend on the predictive power of the data, the timely availability of the data, and the uniqueness of information in the data. In this study, we integrated data from many different sources and found evidence for strong predictive power of data from
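As a minimal sketch of early signalling of underperformance, the snippet below flags students whose predicted final score falls below a pass mark. The student identifiers, predicted scores, and the cut-off are all invented for illustration; a real application would derive the predictions from a fitted model and calibrate the threshold.

```python
# Hypothetical predicted final scores from an early prediction model.
predicted_scores = {"s01": 72.5, "s02": 48.0, "s03": 66.0, "s04": 39.5}
PASS_MARK = 55.0  # assumed cut-off for illustration

# Flag students predicted to underperform, so feedback can be targeted early.
at_risk = sorted(sid for sid, score in predicted_scores.items()
                 if score < PASS_MARK)
print(at_risk)  # → ['s02', 's04']
```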
Acknowledgement
The project reported here has been supported and co-financed by the Dutch SURF-foundation as part of the Learning Analytics Stimulus program.
References (50)
- et al. Can we predict success from log data in VLEs? Classification of interactions for learning analytics and their relation with performance in VLE-supported F2F and online learning. Computers in Human Behavior (2014)
- Data mining for education. International Encyclopedia of Education (2010)
- et al. Perceived openness of learning management systems by students and teachers in education and technology courses. Computers in Human Behavior (2014)
- et al. Human-computer interaction in evolutionary visual software analytics. Computers in Human Behavior (2013)
- et al. Cognitive, metacognitive and motivational perspectives on preflection in self-regulated online learning. Computers in Human Behavior (2014)
- et al. Mining LMS data to develop an “early warning system” for educators: A proof of concept. Computers & Education (2010)
- et al. Fostering achievement and motivation with bug-related tutoring feedback in a computer-based training for written subtraction. Learning and Instruction (2006)
- et al. The extent of variability in learning strategies and students’ perceptions of the learning environment. Learning and Instruction (2008)
- et al. Participation in virtual academic communities of practice under the influence of technology acceptance and community factors. A learning analytics application. Computers in Human Behavior (2014)
- et al. Measuring emotions in students’ learning and performance: The Achievement Emotions Questionnaire (AEQ). Contemporary Educational Psychology (2011)
- The role of academic motivation in Computer-Supported Collaborative Learning. Computers in Human Behavior
- Beyond threaded discussion: Representational guidance in asynchronous collaborative learning environments. Computers & Education
- How achievement emotions impact students’ decisions for online learning, and what precedes those emotions. Internet and Higher Education
- Analyzing the students’ behavior and relevant topics in virtual learning communities. Computers in Human Behavior
- System, scholar, or students? Which most influences online MBA course effectiveness? Journal of Computer Assisted Learning
- Aligning assessment with long-term learning. Assessment & Evaluation in Higher Education
- Social learning analytics. Journal of Educational Technology & Society
- An overview of learning analytics. Teaching in Higher Education
- Adjustment to college as measured by the student adaptation to college questionnaire: A quantitative review of its structure and relationships with correlates and consequences. Educational Psychology Review
- A study of the relationship between student social networks and sense of community. Journal of Educational Technology & Society
- Opening learning management systems to personal learning environments. Journal of Universal Computer Science
- Translating learning into numbers: A generic framework for learning analytics. Journal of Educational Technology & Society
- Visible learning: A synthesis of over 800 meta-analyses relating to achievement