outside the library of the school I taught in, Nov 2015

I am a research fellow at the Research on Improving Systems of Education (RISE) Programme. At RISE, I synthesise research about teachers and about the accountability relationships between teachers, schools, and education authorities across the seven RISE country research teams. I am based at the University of Oxford’s Blavatnik School of Government.

I recently completed a PhD in education at the University of Cambridge. My thesis research, supervised by Panayiotis Antoniou and Ricardo Sabates, looked at teacher accountability policy and sociocultural context across countries. My empirical sources were multi-country statistical data from PISA, TIMSS, TALIS, the World Values Survey, and Hofstede’s IBM study, as well as field interviews with teachers in Finland and Singapore.

Prior to starting the PhD in 2016, I spent two years teaching English in a high-need secondary school in Malaysia, and then worked at the Penang Institute, a think tank. For more background, my (rather outdated) CV is available here, and my current research and writing for RISE are available here.

If you’re wondering how to pronounce my given name, Yue-Yi, it’s 悦义. If, like me, you can’t read Mandarin, you can approximate my name by saying the acronym for the United Arab Emirates – U.A.E.

If anything on this blog is useful or interesting to you, I’d love to hear from you, whether in a comment or via email.

Book chapter: Trust and teacher accountability in Finland and Singapore

I am happy to report that a chapter that I wrote based on my thesis fieldwork, Contrasting Approaches, Comparable Efficacy? How macro-level trust influences teacher accountability in Finland and Singapore, has been published in the book Trust, Accountability and Capacity in Education System Reform: Global Perspectives in Comparative Education, edited by Melanie Ehren and Jacqueline Baxter.

I first met Melanie when we presented on the same panel at the EARLI SIG 18 & 23 conference in 2018, and she generously reached out to ask if I would like to contribute to their edited volume. As a PhD student who felt rather green, I deeply (and nervously!) appreciated the opportunity to work with her.


While Finland and Singapore both enjoy the global educational limelight due to their successful school systems, they differ considerably in their approaches to teacher accountability. Finland’s light-touch teacher accountability system focuses on setting standards at the point of entry to the teaching profession, whereas Singapore uses a comprehensive, tiered and competitive performance management system that deploys promotions and performance bonuses to manage the processes and outputs of teacher practice in schools. In this chapter, I use interviews with 24 Finnish and Singaporean teachers to explore the differences between these distinct approaches to teacher accountability – and to account for their disparate but apparently successful pathways. I argue that these disparate approaches share an underlying principle: each model of teacher accountability is compatible with the macrosystem in which it is embedded. Thus, teachers regard the accountability instruments as legitimate, enabling the instruments to favourably influence teacher motivation and practice. Specifically, public trust in Finland’s education system is distributed throughout each level of the system, with teachers enjoying high generalised trust. This is compatible with an accountability approach that gives teachers considerable autonomy over their daily work. In contrast, public trust in Singapore’s education system is concentrated on the Ministry of Education. This institutionally focused trust supports – and is supported by – a teacher accountability system that gives the managers considerable influence over teacher practice.

Note that the distribution of trust is far from being the only way to slice the sociocultural differences between Finland and Singapore. Sociocultural context is so fascinatingly complex. Elsewhere, I’ve framed the teacher accountability-relevant sociocultural differences between Finland and Singapore as being related to different mental models of motivation (in my thesis) and different visions of what education is and should be (in a RISE blog). What it boils down to is whether the teacher accountability instruments in question are meaningful and persuasive (rather than just bureaucratic or coercive) to the teachers who experience them in their specific day-to-day contexts.

The chapter is available for open-access download here.

For anyone who reads 300-page PDFs for fun

My PhD thesis is now freely accessible in the Cambridge repository, under the title Teacher accountability policy and sociocultural context: A cross-country study focusing on Finland and Singapore.

For a more digestible version, here’s a blog in which I use the RISE framework to summarise some findings from my interviews with teachers in Finland and Singapore. I have yet to write any summaries of the statistical analysis in my thesis — but I plan to convert parts of the thesis into a few journal articles, one of which will focus on the stats.

In the meantime, I hope all is well with you and with the people you care about, amid the ongoing uncertainty of the pandemic. Take care, friends.

Finishing the thesis, starting a job

A very quick update: I submitted my thesis last week! I’ll still need to go through the viva (i.e. oral defense) and to complete any corrections recommended during the viva before this PhD is wrapped up, but I’m so grateful to be at this point.

Also, later this week I’ll be starting a job at the Research on Improving Systems of Education (RISE) Programme. I’m very excited.

On another note, here’s my favourite typo among the many that emerged during the thesis revision process:

Next, I discuss how and why teacher accountability is pivotal to teacher accountability.

Conference presentation: how my field interviews complement the statistical analysis

Last month (yes, this post is overdue), I presented part of my thesis at the Comparative and International Education Society’s (CIES) 2019 conference in San Francisco. My presentation was part of a panel about methodologies for looking at cross-cultural aspects of the teaching profession. I talked about how cross-country statistical analysis and interviews with Finnish and Singaporean teachers play different roles in testing the theoretical framework that I’ve been developing for mapping the intended outcomes of teacher accountability instruments.

The presentation slides are available here. And here’s a teaser:


Besides getting helpful feedback at the panel and during a dissertation mentoring workshop with other PhD students who are researching education policy, I really enjoyed the other sessions. CIES had a staggering number of sessions, and I would have gladly cloned myself so that I could listen to more of them. Some of the sessions that I’m still thinking about are:

  • Marjolein Camphuijsen’s co-authored paper about test-based accountability in Norway — given that one of my fieldwork sites was another Nordic country, it was fascinating to hear about the inextricable Nordic-ness in the justification and design of Norway’s accountability policies.
  • This panel on educational accountability, which gave me SO much to think about for my thesis project.
  • A panel on Western and Confucian philosophies of learning — it’s never felt quite right to me when people talk about how the typical East Asian has more of a growth mindset than the typical Westerner. It’s true that many East/Southeast Asians invest a lot of effort into studying, but I’ve heard far too many Malaysian and Singaporean adults berate students for being bodoh (i.e. stupid), and really mean it; which seems very much like a fixed concept of intelligence. Jin Li’s presentation on this panel helped me to see that the crux isn’t whether Malaysians/Singaporeans have a fixed or flexible concept of intelligence — but rather that the nature of intelligence is irrelevant to the Confucian-influenced concept of learning. Unlike the dominant Western model of learning, which focuses on the mind, the Confucian model focuses on virtue — on diligent effort and self-perfection rather than on mastering an understanding of the world.*
  • And this panel on the politics of education in developing countries, which made me miss my comparative politics days.

Another bonus of attending CIES (besides the loveliness of the Bay Area) was getting to catch up with old friends. I stayed with an old friend from college and met up with two other college friends, none of whom I’d seen since I graduated. At the conference itself, I also got to catch up with a friend from the Teach For All Community of Practice in education policy — and we marvelled at the fact that we’ve been lucky enough to spend time together wandering a Peruvian market, sweating in a Helsinki sauna, and hunting for budget lunches in San Francisco’s financial district. Here’s to more satisfying conferences!


*Li’s Cultural Foundations of Learning: East and West had been on my reading list for months before the conference. I finally read it this week, and it was a great read: lots of empirical evidence from developmental psychology, alongside primary philosophical texts and literary references, and side-by-side comparisons that didn’t privilege either the Western or the Eastern model over the other. It’s always nice to find frameworks that can discuss sociocultural difference without assigning an implicit (or explicit) hierarchy of values to the different units in question.

Announcing the launch of Dialog Pendidikan!

Earlier this month, some friends and I launched Dialog Pendidikan (i.e. Education Dialogue), an English-Malay bilingual platform and Facebook page for discussing education issues in Malaysia.

Discussions about Malaysian education take place in numerous venues, both online and offline. But many of these discussions are monolingual — which can result in ethnoreligious and socioeconomic homogeneity that doesn’t represent the richness of Malaysia’s population. (For example, English-medium discussions tend to favour the educational priorities of urban, upwardly mobile families.) So the hope of Dialog Pendidikan is to connect some of these voices on the same platform, while throwing relevant educational research into the mix.

Dialog Pendidikan has been in the making for quite a long time. I’ve felt strongly about linguistic echo chambers in Malaysia since 2010, when I wrote my undergrad thesis on a short-lived policy for English-medium science and maths instruction in Malaysian schools. And I’ve wanted to establish a multilingual online platform for bringing together different Malaysian voices since my time as a secondary school teacher.*

Then I hit on the idea for Dialog Pendidikan in the summer of 2017, during the two weeks when I was commuting from Cambridge to Colchester for an Essex Summer School course on multilevel modelling.** That autumn, I got as far as assembling a preliminary team and consulting a friend with a long track record in communications work.*** But then I decided that it would be prudent to put the project on hold until I finished my thesis fieldwork.

And now, here we are. I’ve put together a small team of lovely people from my TFM cohort, learned enough CSS to have the two language versions of our posts show up in parallel columns on desktop browsers (e.g. here), and we’re now three posts into our first monthly discussion. This month, we’re looking at streaming, in the sense of ability grouping between classes. We’ve had an introductory post talking about recent policy changes in Malaysia and educational research from elsewhere, a post showcasing snippets of comments that readers contributed to the first post, and an interview with Danial Rahman, who’s done education work in both the public and private sectors. Next week, we’ll publish a summary of the month’s discussion.

We’re still very much working to extend our reach and to make the discussions increasingly inclusive, but we’ve received some very encouraging feedback. And I’m optimistic that, with time, Dialog Pendidikan will be a genuinely useful contribution to educational discourse in Malaysia. Semoga pendidikan di Malaysia semakin membina kebolehan dan membuka peluang untuk anak-anak kita. (i.e. May education in Malaysia increasingly build capabilities and open up opportunities for our children.)


*While I was teaching, the idea was to start a programme called Jom Share Stories — a blog where the students in my school could write stories and translate each others’ pieces into different Malaysian languages, under the mentorship of some professional writers. I got as far as creating a logo and a proposal slide deck, but didn’t get any farther because I had enough on my plate during the second year of my Teach For Malaysia fellowship. If anyone’s keen to develop this idea, you’re more than welcome to it! :D

**I don’t think it was the multilevel modelling that prompted thoughts of a bilingual discussion platform. More likely, it was the publication of the PISA 2015 technical report draft chapter on data adjudication, which happened at the same time — because I was poking at the chapter to figure out why Malaysia had botched its PISA sample. So it’s very likely that the quality of our educational research and discourse was on my mind.

***When I spoke to the communications expert in December 2017, they advised me to establish relationships with people in Malaysian online media who had run into issues with government censorship and/or shutdowns. That was on my to-do list for the project — until, in May 2018, we had our first change in government since Malaysia gained independence. The new government is far from perfect (like all governments), but it’s exciting and heartening to work on Dialog Pendidikan without the looming spectre of getting shut down for criticising government policy. (Probably.)


Finishing my fieldwork, and musing about culture

I walk up to the ticket counter at the Alvar Aalto Museum in Jyväskylä, Finland. The staff member on duty has long hair, tattoos, a black T-shirt, and glasses.

  • Me: I’m a student, but my university is in another country. Can I still get the ticket discount?
  • Him: Yup, it’s for everyone.
  • Me: That’s nice. The train system only gives the discount to Finnish students. Which I guess makes sense.
  • Him: No, it doesn’t, actually.
  • Me: I’ll show you my student card.
  • Him: No need, I believe you.

I’m back in Cambridge after conducting field interviews with teachers in Singapore throughout July, and in Finland throughout September. In between, I also presented the results of my statistical analysis at the EARLI SIG 18&23 Conference in Groningen, the Netherlands — where I learned a lot and met many lovely people who are doing fascinating research on accountability, evaluation, and improvement in education.

It was a real privilege to spend time travelling and speaking with teachers — and also catching up with old friends in Singapore, as well as visiting exciting new places in Finland, and meeting more lovely researchers at the University of Tampere’s EduKnow research group, through Jaakko Kauko’s generous willingness to host me as a visitor there throughout September. But it’s also good to be back with my husband in our flat.

As I analyse and write up this research project over the coming year, one thing I’ll be thinking about is how to define sociocultural context, i.e. one of the two central constructs in this project. For the other central construct, i.e. teacher accountability instruments, I have a pretty coherent definition that hasn’t changed much over the past year. In contrast, I’ve struggled to define sociocultural context. From the beginning, it’s been clear that I should avoid defining culture in ways that are either too deterministic (cf. the colonial civilising mission) or too relativistic (cf. Lee Kuan Yew’s “Asian values” justification of authoritarianism). But it’s been much less clear where I should go from there.

One thing I really hope to do in this thesis is to illustrate the wonderful and/or frustrating within-country variegation in both teacher accountability policy and sociocultural context — variegation that is often neglected in zoomed-out discussions of both education policy and culture. Of course, some stereotypes do reflect (and caricature) real contextual elements. For example, I was inordinately amused by my exchange with the Aalto Museum staff member, because it seemed to encapsulate so many “typically Finnish” things (a love of heavy metal! egalitarianism! trust! forthright conversation! thoughtful architecture and museums!).

But the stereotypes don’t always match reality. For example, I was expecting many of my Finnish interview participants to talk about sisu. According to both Finnish and foreign sources, sisu — a kind of inner determination to do what has to be done, whatever the obstacles — is fundamental to Finnish culture. But only one of my twelve Finnish participants mentioned sisu at all. I ran this past a Finn who wasn’t in my interview sample, and they said that sisu isn’t something that they personally identify with, although perhaps older generations identify with it more. But it’s somehow become integral to Finland’s international image.

From a research standpoint, I’m grateful that I had the opportunity to conduct field interviews, which showed me some disconnects between rhetoric and reality that I otherwise would have glossed over. But it feels like this raises the stakes for how I choose to define sociocultural context.

(From a cutesy standpoint, I’m slightly disappointed — because one prominent characteristic of Singaporean culture is kiasu-ism, a fear of losing out or falling behind that typically manifests as aggressive competitiveness, and “Kiasu and sisu” would have made a fun section heading. Kiasu-ism is legit, as I well know from my four years of secondary school in Singapore, and as multiple Singaporean interview participants mentioned. But of course analytical rigour takes priority over snappy chapter headings.)

I’ve been asking around for ideas on how to conceptualise culture/sociocultural context in my thesis. I’ve read bits of cultural psychology and new institutionalism, and some have suggested that I look at Alfred Schütz’s homunculi or Judith Butler’s performative acts. I like Joseph Maxwell’s notion of culture in A realist approach for qualitative research (2012), as “a system of individuals’ conceptual/meaningful structures (minds) found in a given social system, and is not intrinsically shared, but participated in” (p. 28) — but I’m not sure whether this gives me the traction that I want for discussing policy-relevant aspects of sociocultural context, in language that policymakers won’t shy away from.

Here’s how I’m currently thinking about culture/sociocultural context (and clearly I also need to decide if/how I’m going to distinguish between those two):

It’s like the music playing on a dance floor. The music doesn’t cause people to move, but it does guide how they choose to move, whether consciously or subconsciously. If you’re unfamiliar with the music, you can try to dance along, but, unless you’re highly adaptable and talented, it’ll probably be evident that your moves aren’t quite the same as the veterans’. You can choose to dance a style contraindicated by the music, or not to dance at all — but this will set you apart from the in-crowd, who might respond to you less favourably. And whether or not you dance, and whatever style or level of competence you’re dancing with, you’re aware that there’s music in the background.

Of course, this illustration only goes so far (e.g. what about the hearing-impaired? what about the possibility of leaving the room? who chooses the playlist?), but I don’t yet know how to move from this dimly lit metaphor to a lucid definition. Incidentally, I landed on this dance-floor music metaphor when I was thinking about Geert Hofstede’s conceptualisation of culture in Culture’s consequences (2001) as “the collective programming of the mind that distinguishes the members of one group or category of people from another” (p. 9) — which sounds a bit too mechanistic to me, but which makes sense in that Hofstede was a long-time employee of IBM, where he developed the survey dataset that his research is based on. (That said, it would be false to assume, based on my choice of metaphor, that I am a competent dancer. Hahaha.)

If you have any thoughts about possible directions or theoretical references that might help me define sociocultural context in a way that allows for both granularity and policy-relevant rigour, please do let me know in the comments. Many thanks!

(Very belated) update on 28 Jan 2019: I’ve been working with a definition of sociocultural context as dominant patterns of behaviour and values in a given social system that influence people’s interactions with their environments. This definition draws on Maxwell’s conceptualisation of culture, quoted above; and it’s also compatible with Markus and Kitayama’s (2010) work in cultural psychology: “Culture is not a stable set of beliefs or values that reside inside people. Instead, culture is located in the world, in patterns of ideas, practices, institutions, products, and artifacts” (p. 422). So far I’m happy with this definition (although I could probably come up with something more elegant than “behaviour and values”) — and if that changes, as I work through my interview data and write up my analyses, I’ll modify it accordingly.

Also, a couple of weeks after writing this post, I had coffee with a Finn in my peer group (i.e. not from the wartime generation) who said that she does identify with sisu; and that perhaps it wasn’t mentioned in most of the interviews because sisu is associated with striving against adversity, whereas most Finns would associate schools with calmness and order. Sisu won’t feature prominently in any of my thesis writeups — but this is another reminder about the importance of circumspection and of checking the validity of my interpretations with as many informed people as possible.

Conference presentation: the theoretical framework for my thesis

[Figure: part of the accountability mechanisms framework]

Last Thursday, I presented on the theoretical framework that I’ve developed for my thesis research (part of which is shown on the right) at the student-led Kaleidoscope Conference in my faculty.

I got valuable feedback on this framework after the presentation, including some input from Christine Salmen from the University of Vienna, who is further along than me in the process of writing a theoretically informed mixed-methods PhD thesis on accountability policy reform.

My presentation slides are available here. The slides are light on text, but do let me know in the comments or send me an email if you would like to hear more.

Op-ed: Did the Education Ministry influence our PISA 2015 results?

The OECD released the PISA 2015 Technical Report chapter on data adjudication at the end of last month, and the report says that Malaysia’s replacement schools were higher-performing than the non-responding initially selected schools that they replaced.

I wrote a piece about that, and about Malaysia’s participation in PISA 2015 more generally, for The Malaysian Insight—and I promise that it has less jargon than the preceding sentence. Read it here. Alternatively, here’s a version with the citation links:

Did the Education Ministry influence our PISA 2015 results?

by Hwa Yue-Yi

Like many other Malaysians, I want our country to have a good education system. So when the results of the Programme for International Student Assessment (PISA) 2015 were released last December, I was curious to see whether our 15-year-olds had developed more skills in reading, maths, and science than their peers in previous PISA rounds.

But, like many other Malaysians, I was disappointed. Not because of our average PISA scores – which had gone up – but because we weren’t included among other countries in the main database of PISA results. Apparently, we had been excluded because only half of the schools that had originally been chosen to take part in PISA 2015 had actually taken the test.

Within each PISA country, schools are randomly chosen to participate in the assessment, after considering certain school characteristics, such as the size of each school and whether it is urban or rural. The goal is to get a balanced, accurate picture of student learning across the whole country. Each selected school is also paired with a backup school. If one of the originally selected schools doesn’t want to take part, the backup school will be asked to participate instead. But when many originally selected schools drop out of PISA, it’s hard to tell if the results represent the country accurately.

In Malaysia, after including the backup schools, our weighted PISA 2015 response rate increased from 51% to 98%. However, at the end of last month, the OECD released an official report stating that Malaysia’s backup schools “had a significantly better result, on a national examination, than the non-responding schools in the original sample”.

So did the Ministry of Education strategically ask higher-performing backup schools to take the test, so that our PISA 2015 results would look better?

Looking at the evidence

To answer that question, let’s start with what the Ministry has said. Shortly after the results came out, the Ministry promised to release a report with full details on why we were excluded from the main PISA database.

Eight months later, no report has been released. However, in a March 2017 parliamentary reply to Tony Pua, the Ministry said that the 51% initial response rate was due to PISA 2015 being conducted using computers (unlike previous PISA rounds, which used paper-and-pencil tests). Because of this, some students were unfamiliar with computer-based tests and didn’t record their answers properly, and there were technical issues with data loss.

But the evidence suggests that computers probably weren’t to blame. And it also doesn’t seem likely that our low initial response rate was just a coincidence.

  1. The previous two times we participated in PISA, our weighted initial response rates were above 99%. This is also true of all five times we have participated in the TIMSS international assessments – including TIMSS 2015, which took place just six months before PISA 2015. Moreover, principals and teachers in Malaysia, as civil servants, usually obey government directives. It would be very surprising if half of the Malaysian school principals who were told to administer PISA simply refused.
  2. In PISA 2015, the Netherlands also had an initial response rate in the “unacceptable” range – although their 63% was higher than our 51%. But they were included in the main PISA database because national exam data showed that their PISA results probably weren’t biased. In contrast, the data submitted by our government showed that our PISA 2015 backup schools had higher national exam scores than expected, so our results may have been biased.
  3. On average, it’s likely – though unfair – that schools with better exam results also have better computer equipment, and that students in these schools are more familiar with computer-based tests. But it’s a lot less likely that (a) almost half of our initially selected schools faced computer-related difficulties in conducting PISA; but (b) such computer-related problems affected almost none of the backup schools, which had been selected using the same randomised process as the initially selected schools.
  4. The Ministry invested a lot of time and money to prepare for PISA – certainly enough to detect and solve computer issues. The PISA test was held in April 2015, but the Ministry had formed a committee on TIMSS and PISA by December 2013. Mock PISA tests were held as early as May 2014. In March 2015, the Ministry reported that students had been given PISA-style exercises to familiarise themselves with the test, and that teachers had been trained to conduct the computer-based test. In the weeks leading up to the test, students attended PISA training camps in hotels. Also, the OECD provided each country with a diagnostic programme several months in advance, so that PISA administrators could check if each participating school had adequate computers.
  5. PISA 2015 did not require fancy new computers – Windows XP was enough. Also, the test was delivered using USB sticks, so it did not need an internet connection.

All of this casts some doubt on what the Ministry has said about the problems with our PISA 2015 participation.

Measuring up

But why might the Education Ministry want to influence our PISA results?

Since our subpar TIMSS 2011 and PISA 2012 results were revealed, the government has been under tremendous pressure to improve our performance in international student assessments. As a result, the Malaysia Education Blueprint 2013–2025 uses these exams as the benchmark of educational quality: the aspiration is to be among the top third of countries participating in PISA and TIMSS.

But this might not be the best benchmark for Malaysian schools. For one thing, our ranking in PISA and TIMSS depends on which other countries decide to participate that year. If, for whatever reason, a lot of high-performing countries decide to drop out of TIMSS 2019, our relative ranking could rise, even if our average score doesn’t change.
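The point about relative rankings can be illustrated with a small Python sketch. All country names and scores here are invented for illustration (they are not TIMSS or PISA data), and `rank_of` is a hypothetical helper rather than anything the assessments themselves use.

```python
def rank_of(country, scores):
    """Return the 1-based rank of `country` when scores are sorted from high to low."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(country) + 1


# Invented scores for nine hypothetical participants (not real assessment data).
all_participants = {
    "A": 560, "B": 545, "C": 530, "D": 515, "E": 500,
    "X": 470, "F": 455, "G": 440, "H": 425,
}

# The same round, but the four highest scorers have decided not to take part.
dropouts = {"A", "B", "C", "D"}
fewer_participants = {
    name: score for name, score in all_participants.items() if name not in dropouts
}

# Country X's score is 470 in both cases; only its rank changes.
print(rank_of("X", all_participants))    # rank among nine participants
print(rank_of("X", fewer_participants))  # rank among the remaining five
```

In this made-up example, Country X moves from sixth place to second without its score changing at all, which is why a rank-based target says little about whether students are actually learning more.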

If we want to benchmark our education system against international assessments, it would make more sense to use PISA proficiency levels, which are consistent from year to year. For example, we could say that, in PISA 2021, we want 80% of Malaysian 15-year-olds to reach at least Proficiency Level 3 for science, which means that they can identify evidence supporting a scientific claim and construct explanations in complex situations.

Moving beyond exam obsessions

Most Malaysians would agree that our education system is too exam-oriented. To its credit, the Education Ministry has been trying to address this, by introducing coursework elements to the PT3 and the STPM, and by initiating consultations on whether the UPSR should be abolished.

But in emphasising our PISA and TIMSS rankings, we are choosing to worry about even more exams.

This is especially sad because PISA and TIMSS are not meant to be that type of exam. They are not the sort of test where you must do as well as you can because it will affect your future. PISA and TIMSS give national-level results, not results for individual students or schools. And it’s hard to think of any ways in which our national future would be directly affected by these results. It’s unlikely that multinational companies would use PISA rankings to decide whether to invest in Malaysia – they might look at graduate skill levels instead. Some Malaysians may migrate to other countries so that their children can attend better schools, but our TIMSS and PISA scores are just one data point among many indications that our education system is struggling.

Instead, PISA and TIMSS are more like the weekly quizzes that you might have taken in school, so that both you and your teacher have an accurate picture of what you know, and what you still need to work on. Similarly, these international student assessments are meant to help education systems understand what their students are good at (analysing literary texts? applying maths principles to everyday problems?) and what they can do to improve. Increasingly, it looks like one way our Education Ministry can improve is by embodying the integrity and responsibility that the national curriculum espouses.

Learning lessons, old and new

To practise integrity and responsibility myself, I’d like to highlight some evidence proving that the Ministry is innocent of a suspicion that had been raised earlier about our PISA results. When the PISA data were released last December, I noticed that 30% of students who took the PISA test were from fully residential schools (Sekolah Berasrama Penuh). This looked suspicious because less than 3% of students are enrolled in these asramas, which usually admit students based on their UPSR or PMR/PT3 results – so asrama students would probably get higher PISA scores than average.

However, when I later emailed the OECD to ask about this, they confirmed that the oversampling of asrama students was intentional, and that it was balanced out in all PISA calculations so that it did not bias our average results. It’s likely that the Education Ministry requested this oversampling to get a higher-resolution picture of asrama students’ performance – which is a legitimate reason, especially when the Education Blueprint discusses plans to internationally benchmark Malaysia’s education programmes for gifted children.
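For readers curious about how oversampling can be "balanced out", here is a minimal sketch in Python. It is only an illustration of the general idea of survey weighting, not the OECD's actual procedure, which is considerably more elaborate. The 30% sample share comes from the data described above; the 3% population share is rounded for simplicity, and the group scores of 500 and 400 are entirely hypothetical numbers chosen to make the arithmetic easy to follow.

```python
def weighted_mean(scores, weights):
    """Return the mean of scores, with each score counted according to its weight."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Assumed population shares (3% is a rounded, illustrative figure).
population_share = {"asrama": 0.03, "other": 0.97}

# Hypothetical sample of 100 students: 30 asrama students scoring 500,
# and 70 other students scoring 400. These scores are made up.
sample = [("asrama", 500.0)] * 30 + [("other", 400.0)] * 70

# Share of each group within the sample itself.
sample_share = {
    group: sum(1 for g, _ in sample if g == group) / len(sample)
    for group in population_share
}

# Each student's weight is population share divided by sample share,
# so oversampled groups count for less and undersampled groups for more.
weights = [population_share[g] / sample_share[g] for g, _ in sample]
scores = [score for _, score in sample]

unweighted = sum(scores) / len(scores)    # pulled upwards by the oversampling
weighted = weighted_mean(scores, weights)  # matches the population mix

print(round(unweighted, 1), round(weighted, 1))
```

In this made-up example the unweighted average is 430, while the weighted average is 403, which is exactly what you would get from a population that is 3% asrama students and 97% other students. The weighting removes the upward pull of the extra asrama students while still leaving a large enough asrama sample to study that group on its own.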

So in this process of looking at our PISA 2015 results, I have re-learned the lesson that I must be unbiased in seeking an accurate picture of Malaysian education. As a citizen and researcher, I must hold the government accountable for how it uses our national resources to prepare our children for the future – but I must also give credit where it is due. And as a former SMK teacher, I know that there are many hardworking teachers and Ministry officials who work sacrificially for our children’s futures, and that it can be dispiriting when everyone seems pessimistic about our schools.

I hope that the Ministry reveals more evidence about our participation in PISA 2015, as they promised in December. And perhaps this evidence will show that the bias from the backup schools was unintentional and unavoidable. But more than that, I hope that the Ministry will learn all that they can from this PISA 2015 process, for the sake of the millions of children whose education they are entrusted with.

Translation: Lant Pritchett membincangkan cabaran dalam pembaharuan sistem pendidikan

Terjemahan daripada Lant Pritchett, December 2015, “Creating Education Systems Coherent for Learning Outcomes: Making the Transition from Schooling to Learning”, RISE Working Paper, m/s 7–8. Dalam petikan ini, Pritchett membincangkan cabaran dalam pembaharuan sistem pendidikan, dan betapa pentingnya kesepaduan sistem. Tema lain dalam kertas kerjanya ini termasuk beza antara persekolahan (schooling) dan pembelajaran (learning). Cadangan dan pembetulan bahasa dialu-alukan.

Inilah soalan yang sukar dijawab: “Terdapat sebilangan negara yang telah berusaha selama 50–60 tahun untuk meluaskan pendidikan sekolah rendah kepada semua, tetapi keberhasilan pendidikannya terlalu rendah. Bagaimanakah keadaan menjadi begitu teruk?”

Lebih mencabar lagi, semua jawapan yang mudah kepada soalan sukar “Bagaimanakah keadaan menjadi begitu teruk?” ini tidak dapat menjadi penjelasan umum. Hampir semua teori tentang unsur utama dalam penambahbaikan sekolah telah pun disangkal oleh bukti tentang pembelajaran di tempat lain. (Sudah tentu, kebanyakan teori ini juga terbukti dampaknya di tempat-tempat tertentu, dan teori-teori ini juga mewakili ciri-ciri sistem pendidikan yang berprestasi tinggi.) Yakni, jika si pemerhati mendapati bahawa pembelajaran di sesuatu tempat terlalu teruk, dan pelajar di situ tiada buku teks, jawapan yang mudah ialah: “Pembelajaran tidak berjaya kerana murid tiada buku teks.” Namun, sekiranya kajian menunjukkan bahawa membekalkan buku teks di suatu tempat yang lain tidak dapat meningkatkan tahap pembelajaran yang teruk di sana, maka jawapan mudah ini ternyatalah bukan jawapan yang dapat dipakai umum. Penaakulan yang sama, berdasarkan kajian pro dan kajian kontra, juga menyangkal teori-teori lain, sama ada tentang meningkatkan gaji guru, mengurangkan bilangan pelajar dalam kelas, mengetatkan syarat untuk pentauliahan guru, menambahkan geran pembiayaan pendidikan, dan sebagainya. Faktor-faktor ini (saiz kelas, buku teks, gaji guru, dsb.) memang terbukti kepentingannya dalam sesetengah sistem lain, terutamanya sistem yang sudah mantap. Meskipun begitu, ini tidak bermaksud bahawa kekurangan faktor-faktor tersebut dapat memberi penjelasan lengkap tentang punca prestasi rendah di sesebuah negara.

Biar kita ambil kereta sebagai perbandingan. Berdasarkan bukti empirik yang kukuh, kita semua memahami bahawa sebuah kereta dapat bergerak lebih jauh apabila diisi dengan lebih banyak minyak. Ini memang benar untuk kereta yang berfungsi. Namun begitu, jika sistem gear kereta saya telah rosak, mengisi minyak tidak akan menambahkan jarak yang dilalui—walaupun motornya mungkin berjalan lebih lama. Sekiranya unsur-unsur sesuatu sistem tidak membentuk satu keseluruhan yang bersepadu dan berfungsi, adalah mustahil untuk meramal kesan daripada penambahan salah satu unsurnya.

Secara tentatif, jawapan (panjang) kepada soalan sukar tadi, “Bagaimanakah keadaan menjadi begitu teruk?” ialah: “Sistem pendidikan yang dibina di banyak negara, biasanya melalui kerajaan, tidak direka sebagai (atau tidak berkembang menjadi) sistem yang bersepadu ke arah keberhasilan pembelajaran yang tinggi untuk semua.” Sistem-sistem ini mempunyai sasaran utama yang lain, seperti perluasan akses kepada pendidikan. Kerap kali, sebahagian daripada mereka yang menganggotai sistem tersebut memang mengharapkan keberhasilan pembelajaran yang tinggi, tetapi sistem secara keseluruhan tidak pernah koheren untuk pembelajaran.

[original text]

The hard question is: “How is it that some countries are 50-60 years into pursuing universal primary education as a goal, yet have learning outcomes that are so awful?”

Worse, all the easy answers to the hard question of “How can it be this bad?” are ruled out as general explanations. Pretty much everything everyone believes is the key element of better schools has, by now, been rigorously disproved to have an impact on student learning somewhere. Of course, many of these same notions have also been rigorously proven to have an impact on student learning somewhere else, and are characteristics of well-performing education systems. That is, if one observed that learning outcomes of students were awful and that they lacked textbooks, then the easy answer suggests itself: “Learning is bad because kids lack textbooks.” But if studies show that, in places where learning outcomes are bad, adding textbooks doesn’t make things better, this obviously cannot be the answer as to why things are so bad. The same logic applies to better teacher pay, small class sizes, teachers with more formal qualifications, larger block grants, etc. Even if it is the case that these same factors (class size, textbooks, teacher pay, and so on) are proven to matter in some, often well-functioning, systems, this doesn’t mean the causal explanation of poor performance in a given country or region is a lack of these proximate determinants or simple policy elements.

An analogy is a car. We all have the empirically well-honed intuition that a car can go further with more gas than with less gas. This is because for cars that work, it is true. But if I have a car whose transmission has failed then adding more gas will not add to miles travelled—even if it allows the motor to run for longer.

If the system does not add up to a functional whole, the causal impact of augmenting individual elements is completely unpredictable.

The tentative (long) answer to the hard question “How can it be so bad?” is: “Systems of education were built up in many countries, primarily within governments, that were never actually designed (or emerged) as systems coherent to the purpose of producing uniformly high learning outcomes.” These systems had other, often desirable, objectives, like expansion of access. They often had learning as one objective of at least some actors in the system, but the system was never coherent for learning.

—Lant Pritchett, December 2015, “Creating Education Systems Coherent for Learning Outcomes: Making the Transition from Schooling to Learning”, RISE Working Paper, pp. 7–8. See also the many other working papers from RISE (Research on Improving Systems of Education).