Learning analytics and epistemic arrogance in higher and distance education

It is not a good time in higher and distance education to say: “I don’t know.” The use of technology is increasingly changing the higher education landscape; we face unprecedented changes in funding regimes; the private for-profit higher education sector is growing fast; and the increasing diversity of our student profiles and levels of preparedness and literacy is making it almost impossible to predict and to plan based on our predictions. As higher and distance education institutions respond to this constant flux, we rely more and more on analysis and data to get a grip on what to do next. The belief that knowing the past will help us to understand the future is as old as humanity, but there is now a proliferation of initiatives to harvest and analyse ‘real-time’ data to increase our understanding (and often, control). We harvest more and more data from students, faculty and operational systems and processes, trying to get one step ahead and to reduce the number of surprises and unforeseen changes in our context.

I firmly believe that knowing more and having access to more data, both historical and real-time, may help us to understand and respond to the various challenges our faculty, students and systems face. Knowing more and having more data is, however, not a magic bullet, and it often results in an epistemic arrogance that makes us even more prone to the unforeseen and the unexpected. Considering what higher education has, up to now, done with what we already know, the promise which ‘big data’ holds to help us understand and plan better may not be realized. Often what we already know about our students is left unused, or used in “bang-bang” approaches as we shoot at noises we hear in the dark.

Higher education is increasingly living “under the sword of data” (Wagner & Ice, 2012, p.40) and we celebrate learning analytics as “the new black” (Booth, 2012, p.52). Learning analytics proposes that if we know more about our students by harvesting the different data trails they leave on course management and student information systems (Oblinger, 2012), as well as on systems external to higher education (e.g. Twitter, Facebook, etc.), we may be able to understand student behaviour better in order to determine their, and our, next moves. Higher education institutions will then be able to suggest courses and actions to students in much the same way that Amazon suggests books you are likely to buy based on an analysis of your past behaviour. Most institutions also use analytics to identify at-risk students early enough to ensure that institutions do not waste resources on students who will, in any event, not make the grade (literally). If these ‘at risk’ students are allowed to register, they are put in appropriate streams according to algorithms harvesting a range of data from primary and secondary school histories, demographic data and, increasingly, the data trails and digital records students leave on social networking sites. There are claims that we should have analytics producing 360 degree views of students (Crow, 2012), providing us with flawless ways to prevent wasting resources (on both the students’ and the institution’s side).

As I reflected on knowing too little, doing too little with what we already know, and our commitment to know even more about our students, I could not help but think of the work of Nassim Nicholas Taleb (2007). According to him, the human mind suffers from three ailments, the “triplet of opacity”, which encompasses:

  1. “The illusion of understanding, or how everyone thinks he knows what is going on in the world that is more complicated (or random) than they realize;
  2. The retrospective distortion, or how we can assess matters only after the fact, as if they were in a rearview mirror…
  3. The overvaluation of factual information and the handicap of authoritative and learned people, particularly when they create categories – when they ‘Platonify’” (Taleb, 2007, p.8)

We often believe that the more we know, the more we will understand, but the causal link between knowing more and understanding does not necessarily exist. We assume that by categorizing what we know, we prove our understanding, but Taleb (2007) warns that “Categorizing is necessary for humans, but it becomes pathological when the category is seen as definitive, preventing people from considering the fuzziness of boundaries, let alone revising their categories” (p.15). There is a real danger that our obsession with categorizing students as Generation X, Y and Z, or as “Millennials” or as ‘at risk’, reduces the complexities within different cohorts of students, resulting in homogeneous Platonic categories which provide us with the ‘evidence’ to plan interventions and predict future behaviour. Our seeming obsession with predictive modelling and categorization speaks of the “weight of the epistemic arrogance of the human race” (Taleb, 2007, p.17).

As we gather more data, the number of possible variables impacting on student retention and success increases, and yet we look for linear relationships, forgetting that “linear relationships are truly the exception” (Taleb, 2007, p.89). Humans, and specifically higher education administrators, are “explanation-seeking animals who tend to think that everything has an identifiable cause and grab the most apparent one as the explanation” (Taleb, 2007, p.119). Taleb does not suggest that there are no causal links between variables, but he suggests that we should use the word ‘because’ “sparingly” and “with care” (2007, p.120). While there is a need to plan and strategize after making sense of the highly dynamic higher education landscape where flux is the new normal, we should be careful of the danger of epistemic arrogance which “bears a double effect: we overestimate what we know, and underestimate uncertainty” (Taleb, 2007, p.140). Knowing the limitations of our predictions and knowledge does not result in paralysis and an inability to plan and predict, but rather allows us to “…plan while bearing in mind such limitations. It just takes guts” (Taleb, 2007, p.157; emphasis added).

In contrast to epistemic arrogance and the “pretence of knowledge”, Taleb proposes “epistemic humility”, which he describes as follows: “Think of someone heavily introspective, tortured by the awareness of his (sic) own ignorance. He lacks the courage of the idiot, yet has the rare guts to say ‘I don’t know’. He does not mind looking like a fool or, worse, an ignoramus. He will not commit, and he agonizes over the consequences of being wrong. … This does not necessarily mean that he lacks confidence, only that he holds his own knowledge to be suspect” (Taleb, 2007, p.190).

In closing: I started this post by stating that it is not a good time in higher education to say “I don’t know.” As “explanation-seeking animals” we constantly look for causal links in the belief that, somehow, life and the lives of our students are more predictable than they really are. We harvest more and more data, looking for causes and links in our endeavour to manage students’ learning more effectively. We re-engineer big processes, make big scientific claims based on big data and celebrate the potential of big data mining and analyses in a continuous ritual of epistemic arrogance. We state our claims with the courage of an idiot and banish those who dare to question or who hold their own (and others’) knowledge to be suspect.

A good place to realise the huge potential of learning analytics in higher and distance education is with a strong dose of “epistemic humility.”

[Image retrieved from http://www.dataminingtechniques.net/, 3 September 2012]


Booth, M. (2012). Learning analytics: the new black. EDUCAUSE Review, July/August, 52-53.

Crow, M.M. (2012). No more excuses. EDUCAUSE Review, July/August, 14-22.

Oblinger, D.G. (2012). Let’s talk analytics. EDUCAUSE Review, July/August, 10-13.

Taleb, N. (2007). The black swan: the impact of the highly improbable. London, UK: Penguin Books.

Wagner, E., & Ice, P. (2012). Data changes everything: delivering on the promise of learning analytics in higher education. EDUCAUSE Review, July/August, 33-42.

About opendistanceteachingandlearning

Research professor in Open Distance and E-Learning (ODeL) at the University of South Africa (Unisa). Interested in teaching and learning in networked and open distance and e-learning environments. I blog in my personal capacity and the views expressed in the blog do not reflect or represent the views of my employer, the University of South Africa (Unisa).

16 Responses to Learning analytics and epistemic arrogance in higher and distance education

  1. Laura says:

    Thank you for this excellent post, and for making this case in so articulate a manner. There are no easy answers or quick fixes, and the future surprises us repeatedly! (PS I love the image you used)

  2. Thanks Laura! The more I reflect on student success and retention, the more I realise how little we know, how little we understand… Maybe that is a good place to start? 🙂

  5. You’ve touched on so many interesting ideas in this post. I find it difficult to choose just one to comment on.

    I recognize the potential for learning analytics to help institutions improve resource allocation and student outcomes. With all the recent focus on self-directed learners, I wonder if there are ways to allow students to analyze their own learning trajectory.

    • Thanks Andrew. You make a very important point. It is crucial that we don’t see learning analytics only as harvesting information from students. We should also get students engaged, not only in providing informed consent for the information we use, but also in contributing information that can help both parties, namely students and the institution (there are more, but for now let me focus on these two), to use appropriate information to make more informed choices. It is therefore not only institutions that need to know more about ‘them’; they also need to know more about us and about themselves. If you are interested, have a look at the paper I co-authored with George Subotzky in 2011 as well as a wonderful paper on student-centered learning analytics authored by Kruse & Pongsajapan (see references below).

      Kruse, A., & Pongsajapan, R. (2012). Student-centered learning analytics. Retrieved from https://cndls.georgetown.edu/m/documents/thoughtpaper-krusepongsajapan.pdf

      Subotzky, G., & Prinsloo, P. (2011). Turning the tide: a socio-critical model and framework for improving student success in open distance learning at the University of South Africa. Distance Education, 32(2), 177-193. http://dx.doi.org/10.1080/01587919.2011.584846

  6. Pamela Ryan says:

    Sometimes it’s best just to sit and think! Good post Paul. Provocative enough to make me pause.

  9. Shana says:

    I agree with previous posts – many interesting ideas. I particularly like the “triplet of opacity” – very true. My humble (epistemically humble?) comments: as with all statistics, one should be mindful of the context and assumptions before drawing conclusions. And remember that correlation does not imply causation. Perhaps the problem of “reductionism” gets worse as the volume of data increases because we want to ‘find explanations’ and cope with information overload.

  10. Thanks Shana – you make an interesting point that “the problem of ‘reductionism’ gets worse as the volume of data increases because we want to ‘find explanations’ and cope with information overload.” I think as we harvest more data from the data trails our students and faculty leave, the number of interdependencies increases, making it very difficult (if not impossible) to determine the ‘weight’ of each variable in isolation from the others. The other danger is to make assumptions based on too little or incomplete data, for example school leaving marks, or proficiency in the language of tuition. While these are important indicators, when we see them in isolation or as permanent, we lose the plot.

    • Shana says:

      I agree – trying to infer from data something that is not there is dangerous. I think the danger is similar in research: there is a tendency to rely too heavily on quantitative positivist research but alternative approaches (interpretive, qualitative) can supplement and strengthen the research because quantitative approaches cannot reveal the whole story. Perhaps we should advocate a mixed methods approach? Or does that make it even more complex and overwhelming?

  11. Sachin says:

    Thank you for presenting an interesting perspective on Learning Analytics. In the last three months, I have attended two HE conferences, and Learning Analytics has certainly been the hot topic of discussion.

    The challenge and pressure of recruiting, engaging and retaining students is driving higher education providers to look into new ways to improve education delivery. Even though most institutions would like to explore (or exploit) data to understand student behaviour and their learning styles, doing so has its own challenges. The footprints (digital or otherwise) of students are so scattered that the notion of capturing a 360 degree view seems unachievable. Okay, granted that the data from online learning environments (LMS, digital library, lecture capture systems, e-portfolios, etc.) can be captured and analysed; but how do you capture student data from all the “offline” learning that happens? Another controversy is around privacy and ethical implications in the management and use of student data.

    This emerging area of research has a long way to go but it will be interesting to follow future developments.

    • Thanks for engaging with my reflections. I agree with you that despite the current excitement and hype about the potential of learning analytics, there are still a number of assumptions and other issues that we need to resolve or at least understand. For example:
      * We can only harvest digital footprints, and although these footprints may provide us with information regarding non-digital experiences (e.g. postings on Facebook or Twitter), we are still a long way from making sense of the total student experience, digital and non-digital.
      * Do we really want to have a 360 degree view of our students? Like you said, the issues around privacy and ethical implications are many and much of the debate bypasses these issues.
      * My biggest concern about the potential of learning analytics is that it provides data on only one of the players in a student’s journey. Our research in my context shows that organizational inefficiencies and macro-societal impacts on both students and the institution are major factors in students’ success. I would therefore be encouraged to have a more complete view of student and institutional interactions to determine possible support and interventions.

      Currently many institutions are simply paralyzed or do not have the capacity to act on what they already know. Often institutional responses to student data are scattered rather than integrated, and follow a ‘bang, bang’ approach of shooting at noises we hear in the dark.
