Decolonising the collection, analyses and use of student data: A tentative exploration/proposal


Voices from the Global South* (*I know the term is contentious) increasingly demand to not only be recognised in the extremely uneven and skewed terrain of knowledge production and dissemination, but to actively take part and contest and reshape knowledge claims. I would like to use this blog to tentatively interrogate the potential of a decolonising lens on the collection, analyses and use of student data.

Disclaimer 1: I am intensely aware of the impact of my race and gender in thinking about student data through a decolonising lens. My race, gender and the fact that I write this blog in English should make me uncomfortable and I am. Whether my inherent complicity in notions of white superiority precludes me from taking part in the debate is for you, as reader, to decide. I constantly grapple with the intersectionalities of my gender, race and settler identity as an African. In the field of learning analytics, as the measurement, collection, analysis and use of student data, this blog is a fundamentally and intentionally incomplete attempt to map a decolonising lens on learning analytics.

Disclaimer 2: I acknowledge that notions of postcolonialism, decoloniality and coloniality are subjects of serious intellectual pursuit and my grasp of the different overlaps and differences/nuances is, for now, basic. I do accept, however, that coloniality is a reality and that we need to “better understand the nexus of knowledge, power, and being that sustain an endless war on specific bodies, cultures, knowledges, nature and peoples” (Maldonado-Torres, Outline of ten theses on coloniality and decoloniality, 26 October, 2016).

Disclaimer 3: I have a suspicion that the collection, analysis and use of student data overlaps with other discourses and practices of surveillance and digital redlining. As such, a decolonising lens on learning analytics overlaps with and needs to take into account these discourses.

A month ago at the annual conference of the South African Association for Institutional Research (SAAIR) researchers from the Southern African region reflected on the role of institutional research in the extremely volatile South African higher education context with its increasing student demands for free higher education (#FeesMustFall) and demands to decolonise curricula. In my presentation I asked “How is it possible that the #FeesMustFall #RhodesMustFall campaigns caught higher education institutions relatively (or totally?) unprepared despite everything that we already know about our students?” (emphasis added); “Is it possible that the writing was on the wall but that we, for whatever reason, decided to ignore the message? Or did not understand the message?” and “What did we not know that would have prepared us for the disruption and destruction we faced over the last 18 months?”

Excursus: A lot of my research focused and still focuses on the ethical and privacy implications of learning analytics, and in my preparation for this conference it started to dawn on me how our collection, analysis and use of student data are informed by particular ideological and political agendas. This was the beginning of my discomfort and reflection.

I had (and still have) the nagging thought that our samples, variables and the tools we use to collect, analyse and use student data in higher education are shaped by the liberal and neoliberal social imaginaries of higher education, of the ‘educated subject.’ If we accept that data collection, analysis and use are political acts and serve declared and hidden assumptions about the purpose of higher education and the masters it serves, what are the implications for learning analytics? In a follow-up discussion during that conference I became aware of my increasing discomfort with our uncritical if not blasé approach to the collection, analysis and use of student data – without ever questioning the social imaginary informing our choice of variables, the hidden assumptions informing the proxies we use to define ‘effective’ teaching and learning, our emphasis on what our students lack and their deficiencies that prevent them from fitting in, and our seemingly nonchalant responses to the collateral damage of our analytics and interventions. During the conference I raised the question: “What does a decolonised and decolonising collection, analysis and use of student data look like?” Following the question there were a few awkward laughs and one or two responses implying that I may have lost my senses, or didn’t I know that data are raw and the collection of data is neutral…

I could not sleep that night as I wrestled with the thought of what a decolonised and decolonising approach to the collection, analysis and use of student data would look like. Already in that presentation I had thought aloud about how our collection and use of student data seem to disregard the entrenched, inter-generational structural inequalities in South African society. We collect student data as if students start their studies with a clean slate, a tabula rasa, and as if they have not been affected by generations of discrimination and disenfranchisement. We seem to blatantly disregard the fact that most of our students have limited loci of control over where they study, where and how long they can access the Internet, how many prescribed books they can buy. We ignore the epistemic violence integral to much of our curricula. We somehow believe that (more) grit and a growth mindset are the answer to their pathogenic vulnerability. And when you add to this the belief by government that education, on its own, can rectify generations of injustice and inequality, then higher education institutions select and collect data that provide us with information on how to move students quicker through the system to increase our return-on-investment.

As my thoughts on what a decolonised/decolonising approach to the collection, analysis and use of student data would look like were taking shape, I was forced to reflect on the question “How does a South African perspective differ from other perspectives in the world? What difference does a postcolonial and post-apartheid context make in how we view the ethical implications of the collection, analysis and use of student data?”

In the South African context we’ve been down this road before, during apartheid, when individuals were classified according to some arbitrary classifications of race – white, black, coloured, and Indian. Four categories. Categories based on the curliness of your hair. The shape of your nose. The colour of your skin. There were also many people who somehow did not fit clearly into one category but who were categorised regardless of their ‘ill-fit’.

These classifications had immense consequences for many generations since.

Your category determined where you were allowed to live. What schools you had access to. The age at which you were allowed to start school. The curricula prescribed for the schools. The universities you had access to. The job opportunities. The loans and insurance you had access to.  Your risk profile for defaulting on loans, for getting HIV, for being in possession of drugs, for having friends and family who are in jail.

All based on you fitting into an arbitrary category. Categories that were informed by white superiority. Categories that were needed to ensure that we protect racial purity (WTF). Categories that ensured that education for white kids received much more funding, had access to better resources and better curricula and better job opportunities and better loan schemes and better universities and better lives.  And I was part of this. I was white.

The effects of these classifications have been felt and will be felt for many generations to come. Many of the assumptions and effects of these classifications became institutionalised and formed the basis for a massive set of laws and regulations. While many of these laws and institutionalised forms of racism and discrimination have been changed, it will take generations to address the effects of these structural inequalities and injustices. And yet we continue to use students’ home addresses and school experiences as variables if not determinants for access to higher education? We still charge a one-size-fits-all registration fee? We use variables such as the number of logins and contributions to discussion forums, where the language of tuition is a settler language, to predict students’ success. WTF.

In the broader discourses on the collection, analysis and use of data – those who are on the receiving end of discriminatory practices and bias are often unheard, redlined and often excluded from access to the criteria being used to make decisions. The sources used to collect the data, the biases and assumptions of those who collected and analysed the data, the algorithms and decisions made in the analyses of the data – all of these disappear into a ‘black box’ – inaccessible, and not accountable to anyone, not even the user of the analysis at a particular moment in time.

So a contextualised view of the ethical implications of the collection, analysis and use of student data has to account for addressing the structural inequalities of the past, and ensuring that issues of race, gender, home addresses, credit records, criminal records and school completion marks are not used to predict potential and/or to exclude individuals from reaching their potential.

A decolonising lens on the collection, analysis and use of student data cannot ignore how colonialism

  • Stole the dignity and lives of millions based on arbitrary criteria and beliefs about meritocracy supported by asymmetries of power
  • Extracted value in exchange for bare survival
  • Objectified humans as mere data points and information in the global, colonial imaginary
  • Controlled the movement of millions based on arbitrary criteria such as race, cultural grouping and risk of subversion

How dare we collect data like schooling backgrounds, and home addresses, and parental income as if there is no history to these data?

How do we collect, analyse and use student data recognising that their data are not indicators of their potential, merit or even necessarily engagement but the results of the inter-generational impact of the skewed allocation of value and resources based on race, gender and culture?

A decolonising lens on the collection, analysis and use of student data therefore has to

  • Acknowledge the lasting, inter-generational effects of colonialism and apartheid
  • Collect, analyse and use student data with the aim of addressing these effects and historical and arising tensions between ensuring quality, sustainability and success
  • Critically engage with the assumptions surrounding data, identity, proxies, consequences and accountability
  • Respond to institutional character, context and vision
  • Consider the ethical implications of the purpose, the processes, the tools, the staff involved, the governance and the results of the collection, analysis and use of student data


I acknowledged that this blog is a fundamentally and intentionally incomplete attempt to map a decolonising lens on learning analytics. I acknowledged my complicity and my own discomfort in attempting to take part in this discourse. How are the purpose of our collection, analysis and use of student data, our tools, our samples and our variables still informed by a colonial social imaginary of control and ‘the educated subject’?

I hope this blog starts a conversation.

I close with a poem by Abhay Xaxa –

I am not your data, nor am I your vote bank,

I am not your project, or any exotic museum object,

I am not the soul waiting to be harvested,

Nor am I the lab where your theories are tested,

I am not your cannon fodder, or the invisible worker,

or your entertainment at India habitat centre,

I am not your field, your crowd, your history,

your help, your guilt, medallions of your victory,

I refuse, reject, resist your labels,

your judgments, documents, definitions,

your models, leaders and patrons,

because they deny me my existence, my vision, my space,

your words, maps, figures, indicators,

they all create illusions and put you on pedestal,

from where you look down upon me,

So I draw my own picture, and invent my own grammar,

I make my own tools to fight my own battle,

For me, my people, my world, and my Adivasi self!

About opendistanceteachingandlearning

Research professor in Open Distance and E-Learning (ODeL) at the University of South Africa (Unisa). Interested in teaching and learning in networked and open distance and e-learning environments. I blog in my personal capacity and the views expressed in the blog do not reflect or represent the views of my employer, the University of South Africa (Unisa).

20 Responses to Decolonising the collection, analyses and use of student data: A tentative exploration/proposal

  1. sensor63 says:

    Excellent. Drawing our own pictures, maps, singing our own songs, writing our own stories…

  2. francesbell says:

    I have read your post quickly and look forward to re-reading but wanted to share my first thought.
    I am very ignorant of South African cultures beyond observing apartheid and its fall from a distance. One source that made me think was this one that seemed to reveal a way of including ‘different knowledges’ beyond western canonical knowledge. I don’t know if this helps and will try to address your post more directly after a second reading.

  3. Maha Bali says:

    I just realized I hadn’t commented here. This point really struck me:
    “those who are on the receiving end of discriminatory practices and bias are often unheard, redlined and often excluded from access to the criteria being used to make decisions”

    I suggest maybe it’s not just ACCESS that’s the problem. Not just access to data but also access to decision making. And so maybe each individual should have agency on how and when to give their data and how to interpret it and how those affect decisions that they themselves are at least involved in making, if not fully agent in making them.

    What would be the Paulo Freire or bell hooks approach to learning analytics? If the data is collectible, how do we raise the consciousness of learners so that they figure out ways of using it that may benefit THEM? Give them opportunities to recognize what may be hindering their learning or promoting it, and to analyze the realities behind the numbers and act based on what they learn? This of course is still problematic if the collectors of data are the institution/colonizer because the categories chosen are external to the learner. But what if it were different and learners had choice in every step of the way? It reeks of self-monitoring that Foucault may loathe… But I think it’s more of a self-consciousness, a self-awareness that can be fostered in community to help learners benefit from each other’s reflections.
    It occurs to me that much ethnographic and critical research that isn’t participatory has the same issues. So maybe I am calling for a participatory learning analytics approach or a broader thing that encompasses all kinds of learning. A kind of auto-reflection on own learning and learning data – be it analytics or qualitative data. What do you think?

    • I totally agree with your comments Maha – great stuff. Also see the comments by Frances and Gabi below and my comments on their comments 🙂

      I love your proposal/question “What would be the Paulo Freire or bell hooks approach to learning analytics?” There is an article (or more than one) in this question. So far critical approaches to learning analytics have been on the margins and we really need to stand up and raise our hands.

  4. francesbell says:

    I’m back and have re-read your post and Maha’s interesting ideas about participation (the link I shared was partially about the nature of participation and how epistemology relates to that).
    I have read some very interesting work by Richard Edwards (a critical education researcher) on inscrutable infrastructures and I think that LA systems are an example of this. I remember studying Expert Systems on an Artificial Intelligence course in the late 1980s, and how an explanatory interface was deemed to be important to explain to us humans how the machine had arrived at its decision. Of course, even then people were thinking critically about how we understand information. But where are we now? We consume vast quantities of information with little idea of its provenance or the succession of assumptions and errors that might have been encoded with it or even made it more or less likely that we will see it in the first place based on what some algorithm knows about us from other invisible sources.
    I am a naive scholar of intersectionality (currently writing a post about responses to media bias) so I am really learning from how you have framed this post with your ethical perspective, your current knowledge and your acknowledged privileges. Thank you.

    • Wow Frances – I did not notice your comment. Thanks for the engagement and the link. I totally agree with your statement “We consume vast quantities of information with little idea of its provenance or the succession of assumptions and errors that might have been encoded with it or even made it more or less likely that we will see it in the first place based on what some algorithm knows about us from other invisible sources.” Also see the comments from Gabi below. 🙂

    • Maha Bali says:

      Is it ok if I add something here? My undergraduate thesis was based on a neural network (I assume Frances and tech folks know this stuff but will explain in case some don’t know it). Neural networks are black boxes. They’re algorithms that learn from data and evolve into something that has made certain complex connections you can’t explain to humans. Just as our own brain’s neural networks are probably more complex. So some current adaptive systems have hardwired algorithms that say if A then B else C. But a smart adaptive system is different than artificial intelligence in that it isn’t humanly comprehensible. Amazon’s algorithm is pretty hardwired. U bought this book, I will recommend more by same author or books bought by others who bought it. Google search not so much. It learns from ur behavior. I think.
      Some tech folks will say neural networks are therefore also “neutral” – no one tells algorithms how to behave. But that’s only half true. You still get to choose what data to feed the neural network to learn from. Which variables. Which dataset. Choice of dataset for “training” a network can create a biased network.

      All of which is to ask: are the learning analytics of LMSs currently hardwired or neural networks? Either way who chose which data to collect and what manipulation to do to it? I also suspect Facebook has a neural network algorithm but parts of it are hardwired (if u like someone’s posts often enough u see them more regularly) and some more subtle.

      Thoughts? Answers?

      • francesbell says:

        So interesting Maha – my Masters project was in another form of machine learning – Classifier Systems which are rule-based genetic algorithms. I think of it as learning to critique machine learning in a practical context.
        I think Google uses a combination of machine learning and human intervention in its algorithms – changing them daily. The algorithms will rely on signals such as PageRank (how pages are linked) and people working in SEO are constantly trying to game the algorithm, which is partially secret.
        Facebook’s algorithm is also partially secret but I did learn a bit more about the Edgerank algorithm (governs and filters our news feeds) when researching for our 3rd rhizo14 paper. We can observe that we miss a lot of content and we might assume that’s inevitable in a stream but we have to remember that what we are more likely to see (about 30% of what’s available) is determined by what Facebook thinks is ‘good’ for its major goal of generating advertising revenue. So we are more likely to see newer posts with multimedia attachments from people who are closer to us.
        My most recent experience of an LMS is Blackboard 3 years ago and the analytic data and functionality was pretty rubbish then. The major problem with Blackboard was the antiquated architecture that had survived despite the policy of acquisitions that could be seen in its ragbag of functionality. 2 broader points are relevant.
        1. The LMS will be based more on what its vendor thinks will sell than its apparent purpose to support teaching and learning.
        2. The initial context for its development (North American HE) will have had the most significant impact on things that are hard to change, e.g. architecture. Assumptions from this context will be hard-wired into the architecture even if the software is configurable. So you may have choices but from a constrained set. An example from the version I used was that when you clicked a highlighted user name, you got an email dialogue rather than the user page you might have expected. This couldn’t be changed because the architecture pre-dated social media whilst user expectations have changed.

      • Maha – thanks again for sharing your thoughts and provoking further discussions. As far as I know, there are no examples (yet) of neural networks/smart adaptive systems used in learning analytics but I suspect it may not be long? I quite like the proposal by John Danaher about the different possibilities of human-algorithmic accountability and decision making in this blog post: Danaher, J. (2015). How might algorithms rule our lives? Mapping the logical space of algocracy. [Web log post]. Retrieved from

  5. Gabi Witthaus says:

    Thanks for another deeply thought-provoking post Paul. The metrics used for evaluation in any system tell us a lot about the values of the decision-makers – and the organisational culture that is likely to flow from those values. Maha’s suggestion of moving towards a more participatory approach to the use of student data might be a really good starting point to help a university evolve towards a more participatory culture in a holistic sense.

    About 20 years ago, I was asked to evaluate an adult literacy programme on a mine in South Africa. Although I was contracted as a supposed ‘neutral’ external evaluator, I was tasked with ensuring that both management and the trade union agreed with my findings. This was potentially going to be difficult, as there was tangible hostility between the two parties in my first joint meeting with them. With some trepidation, I got on with the job, using Lincoln & Guba’s (1989) ‘4G Evaluation’ as my guide – the process involved me facilitating ongoing negotiation between the different ‘stakeholders’ in the mine about what was to be evaluated, how the evaluation was to be done, and what the results meant. In this instance, there was political will from the highest level within the organisation to do this, and ultimately we did collaboratively construct a set of findings that everyone agreed on. I wonder whether such an approach would be transferable to the challenges in higher education described in this post?

    • Gabi – thanks so much for sharing your thoughts and for the reference. I think one of the dilemmas in/of learning analytics is the potential (and great possible harm) of real-time interventions to support and ‘personalise’ learning. There is also increasing reliance on algorithms and real-time automatic and autonomous decision making which, per se, excludes students from contributing context or data to possibly contest or enrich the analysis. This means that we will have to/must find a way to include students in the initial design of learning analytics’ initiatives and meet with the ‘affected’ students on a regular basis and/or provide them with mechanisms to alert us to problems with our assumptions and analyses.

    • francesbell says:

      I tried to respond to this yesterday Gabi but lost my response 😦
      From my work in Information Systems, I can relate to the need for management support, and I feel sure that your human agency played an important part. So I am sure that your approach could make a useful contribution but may not be able to fix problems that are contextual to SA society and HE culture or embedded in a technology framework (see comments above).
      Also, I would say that whatever LMS and Student Records Systems are chosen and then configured, the ‘official’ systems are not the only relevant ones in mapping a decolonising lens. All systems that collect personal data (and it’s increasingly difficult to find ones that don’t) are relevant in some way or another. Students will be socialising, organising, learning, etc. on platforms that algorithmically influence what they see and what they do. These platforms are way beyond the control of educators and students but I don’t think that means we should treat them as a given. We can critique them, challenge them and I hope education plays a part in this – directly by engaging in it and indirectly by helping students in becoming critical and effective in their critique.
      For me Paul was saying in his post that SA HE missed a trick in ignoring what it should have known about their contexts. Let’s not ignore relevant sociotechnical contexts.
      P.S. if ever you or Paul was moved to read the article I linked in my first comment, I’d love to hear what you thought. But as the paper comes with a Deleuze & Guattari warning, I will quite understand if you don’t.

  6. Hello Paul, my name is Leigh, I’m an Australian (white, male, settler) also troubled by the unchallenged rise of “learning analytics” and the #academiccapitalism of the neoliberal technocracy that pervades universities here and too many elsewheres.

    I wanted to suggest a minor calibration to your description, “A lot of my research focused and still focuses on the ethical and privacy implications in learning analytics”. I think you need to include the word ‘power’ if not replace ‘privacy’ with it. I offer you this presentation I made, not to self promote, but to try and establish a connection with you, in the lovely world of critical practice…

    Data and power. YouTube:

  7. Reblogged this on Grassroots Education and commented:
    Well worth considering in all educational settings.

  8. Pingback: Learning analytics: the good, the bad and the ugly – LONDON eLEARNING READING GROUP
