For TeachLab’s ninth Failure to Disrupt Book Club we look back at Justin’s live conversation with regular Audrey Watters and special guest Candace Thille, director of Learning Science at Amazon and former researcher and faculty member at Stanford University and at Carnegie Mellon. Together they discuss Chapter 8, The Toxic Power of Data and Experiment.
“It wasn't just that they didn't know how to use the educational technology. It was their belief about their role as a learner and their belief about her role as an instructor. And so just like you talked about many times in your book, the technology can't do it. The human interactions are what really drive how the technology gets used.” -Candace Thille
Resources and Links
Watch the full Book Club webinar here!
Check out Justin Reich’s new book, Failure To Disrupt!
Join our self-paced online edX course: Becoming a More Equitable Educator: Mindsets and Practices
Produced by Aimee Corrigan and Garrett Beazley
Recorded and mixed by Garrett Beazley
Justin Reich: From the home studios of the Teaching Systems Lab at MIT, this is TeachLab, a podcast about the art and craft of teaching. I'm Justin Reich. This is the ninth episode of our book club series about my new book, Failure To Disrupt: Why Technology Alone Can't Transform Education. This week, we're talking about chapter eight, The Toxic Power of Data and Experiments. And once again, I've got Audrey Watters here talking about it with me. Audrey, thanks for joining us.
Audrey Watters: Thank you for having me.
Justin Reich: So this chapter, The Toxic Power of Data and Experiment, tries to argue that one of the cool things about online learning environments is that we can systematically vary learner experience. We can have some people try one thing and a similar group of people try another, collect a bunch of data about their experiences, and see which one works better. And through that process we can kind of incrementally improve what we're doing.
The two problems with that are that it turns out parents don't like the idea of having their children be experimental subjects in learning environments, and that hoovering up vast amounts of data about people's behavior in learning environments seems like a creepy surveillance kind of thing. And in fact, in many circumstances, it is a creepy surveillance kind of thing. Audrey, what was your sense of what Candace brought to this conversation about this dilemma?
Audrey Watters: I thought that Candace shared some really interesting insights. One of the most important ones, I think, was when she talks about how students make their own meaning from the technology or from the instruction, and you have to make sure that they're making the meaning that you want them to make. I thought that was really powerful for thinking about the way in which things can look as though they're designed well and thoughtfully, but you can sometimes miss the mark, and students sort of infer their own, incorrect lessons from it.
Justin Reich: Yeah, with these learning systems, there are students or participants who are doing activities. Those activities generate data, and the data are a bunch of ones and zeros, or they're a bunch of weird JSON strings that say things like, "Play video, start at second 00437." And then researchers, instructors, and designers parse that data. They say, "Oh, I think what the students meant was X." But all the time we find that that is not at all what the student meant. Their meaning is very much obscured, because one of the things about online and distance learning is that we're often not next to each other. We see the results of people's actions, but we don't watch them do the actions. We don't listen to them talk or think as they do the actions. And that's what we're going to try to get into here.
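As a rough illustration of the kind of raw event data Justin describes, a hypothetical clickstream log entry might be parsed like this. The field names and the inference rule here are invented for illustration; they aren't drawn from any particular platform:

```python
import json

# A hypothetical clickstream event of the kind learning platforms log.
# Field names are illustrative, not from any specific system.
raw_event = '{"event_type": "play_video", "video_id": "intro_stats_01", "start_second": 437}'

event = json.loads(raw_event)

# The researcher's inference step: mapping a low-level action to a guess
# about learner intent. This mapping is exactly where meaning can be lost.
if event["event_type"] == "play_video" and event["start_second"] > 0:
    inference = "learner re-watched part of the video"
else:
    inference = "learner started the video from the beginning"

print(inference)
```

The fragility Justin points to lives in that `if` statement: a seek to second 437 might mean re-watching, or confusion, or an accidental click, and the log alone cannot distinguish them.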
Let's hear what Candace had to say. Great, welcome everyone to week nine of the Failure to Disrupt Book Club. We're looking now at chapter eight, The Toxic Power of Data and Experiment. Very grateful again to have Audrey Watters with us. And then also very happy to welcome Candace Thille. Candace Thille is the director of Learning Science at Amazon, and she's a former researcher and faculty member at Stanford University and at Carnegie Mellon. She was one of the leads of the Open Learning Initiative project, and has done some very innovative work in developing new online learning experiences, in spreading and sharing new online learning experiences, in creating sort of norms and values and communities around this kind of research, and in publishing some really important pieces as well. So Candace, thanks so much for joining us.
Candace Thille: Thank you. Pleasure to be here.
Justin Reich: So Candace, we like to get people to introduce themselves by talking about their kind of EdTech moment. Is there some piece from your personal history as a student or a teacher where education technology really captured your attention and interest?
Candace Thille: Yes. Can I give two?
Justin Reich: Yes. Yeah. Take the floor.
Candace Thille: A positive and a sort of mind shaper.
Justin Reich: Great.
Candace Thille: So the positive one was when I first started the Open Learning Initiative at Carnegie Mellon University, and we had created this introductory statistics course online. We made it openly available to faculty all over the country to use to teach introductory statistics. And another faculty member and I, at the time, also did little faculty workshops to help faculty learn how to use the software in a way that would actually support their teaching and their students' learning.
And there was a faculty member at a community college in California who was very gung ho about it and took the OLI statistics course. And on the first day of class showed her students how to log into it and told them as we had recommended that they have their students work through the module between that class and their next class. And then she could look at the instructor dashboard to see where students were struggling and where students were getting it. And then she could use that to guide the way she supported her students. So she was very excited to do this. And so the class was all great. Everyone knew how to log on. They all knew how to use it. They all went off.
And she looked at the instructor dashboard right before class, and what she saw was nothing. No insights, no learner interactions. Basically the students hadn't really done anything. So she went into class and asked the students. They were like, "Well, we didn't really understand this. Could you explain this? Could you explain that?" So she did. And very quickly, the class went right back into the traditional way of the students sitting there taking notes while she explained everything.
So this happened two times in a row. And so she called me and said, "I don't know what to do. It's not working. There's something wrong." And so we problem-solved it and realized that it wasn't just that they didn't know how to use the educational technology. It was their belief about their role as a learner and their belief about her role as an instructor. And so just like you talked about many times in your book, the technology can't do it. The human interactions are what really drive how the technology gets used.
So we talked about how she could shift that. And what I suggested to her first was, "If you can get a computer lab, do your next class in a computer lab." Because the students need to practice learning in this different way and building their own competence and their own ability to essentially teach themselves, or support their own learning, using these resources.
So she got a computer lab. There weren't enough chairs for each student to have a computer, so she had two students on each computer. It wasn't our design, but it actually ended up being way better. So then she just said, "Okay, today we're going to learn how to make box plots. So I want everyone to go to this part in the course and work through the material and essentially figure out how to build a box plot. And if you struggle, talk to the person next to you; you guys can figure it out together." And she just circulated around, answering questions about how to use the software, not questions about how to build a box plot.
Then she brought the class's attention back up to the front and said, "Okay, now I want to build a box plot. Who can tell me how to build a box plot?" And she essentially had the students walk her through what they had, in theory, just learned from working through the material. And then her feedback on that was not just, "Great, that's how to build a box plot," but, "See, you were also able to learn it to the point where you could teach me how to do it. Now let's look at histograms."
Also, I should say her class was an evening class, a seven-to-ten class, three hours long. And so she broke up the class by having them, at least for the next couple of sessions, just work through the material and then teach her as a group how to do what it was they had just learned. And then after she'd done that a few times, when it looked like people could do it, she said, "Okay, next time before you come to class, work through this next module. And if you have challenges doing it, you already have a built-in study buddy, the person that you sat next to at the computer." So then through the rest of the class, that's the way she did it.
Now, the reason that was impactful to me was that at the end of the class, she had one of her students essentially bring in a video camera and videotape people's feedback on how the class was for them. And she sent me, and this tells you how long ago it was, she sent me the video on a CD. And it made me cry, because the students were saying things like, "Well, this is the third time I've taken introductory statistics, but it's the first time I took it the OLI way. And now I'm teaching my friends what a P value is."
And what struck me about that was not that they learned statistics, but that they were then positioned in authority with respect to the subject. They weren't victimized by statistics anymore. They saw it as a tool for them to use to make sense out of the world. And that experience is part of what actually compelled me to keep working on the OLI Project for as many years as I did. But I would also say to designers out there to always be thinking about how the software or the EdTech that you're developing positions the learner. Does it position the learner as a passive recipient? Or does it position the learner as someone with authority? And that's always going to be a tension in designing EdTech, I think.
Justin Reich: Now what was your negative story?
Candace Thille: Oh, my negative story was again at Carnegie Mellon, where we were doing an intro economics course. We created essentially a little application that allowed people to draw supply and demand curves and then find a point of equilibrium. So, we'd done this great little thing and used it in the teaching of economics. The course was a series of economic experiments where the learners were essentially economic agents. They would engage in sort of buying and selling textbooks or cars, or monopolies and cartels. So they actually acted as economic agents, and they got assigned different roles with different price points and so on. And then after the experiment closed over three days, they would get the data from the experiment and analyze it to learn the principles and the concepts, and see if the principles and concepts explained the actual behavior that they themselves had exhibited.
Justin Reich: Cool.
Candace Thille: So that's the way we did the microeconomics course, and that of course was really cool. The supply and demand curve thing, though: what we found was that we were getting all of this feedback that students couldn't plot supply and demand curves. Especially on the supply side, the curve was always wrong. They'd get the demand curve fine, but not the supply curve. And so I dug into the data to kind of see how this could be, and I couldn't see what was wrong, just that they were always getting the supply side wrong.
So I interviewed some students about, "How did you do this?" And a student said, "Well, I knew that I had to make the label point at the origin and then draw the curve from the origin out." And I'm like, "The label point? There's no such thing as a label point." But in fact, when we had created our little app, in order to distinguish the supply curve from the demand curve, we had taken the very first point and put a label on it, so they'd know whether they were working on the supply curve or the demand curve. And they had made meaning out of that.
And in the test environment, the label was not at the origin point on the supply curve; it was on the last point. And so they meticulously moved that point all the way back to the origin and then plotted the points from that point forward. So they had drawn an absolutely accurate supply curve; it was just, from the computer's perspective, the inverse of the curve. So it scored it wrong. So, that was the negative thing. The automated scoring told us that nobody understood how to draw a supply curve, but it wasn't that. It was the interface design.
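A minimal sketch of the scoring bug Candace describes might look like this. The point values and function names are hypothetical, not from the OLI application; the point is that a grader comparing plotted points in entry order rejects a curve whose points were entered back-to-front, even though the drawn line is identical:

```python
# Hypothetical supply curve: points from the origin outward.
expected = [(0, 0), (1, 2), (2, 4), (3, 6)]

# The same curve, plotted back-to-front, as the students did after
# dragging the labeled point back to the origin.
student = list(reversed(expected))

def score_order_sensitive(answer, key):
    # Compares the sequence of clicks, which is how the grader behaved.
    return answer == key

def score_order_insensitive(answer, key):
    # Compares the set of plotted points: the curve itself, not the clicks.
    return sorted(answer) == sorted(key)

print(score_order_sensitive(student, expected))    # marked wrong
print(score_order_insensitive(student, expected))  # the curve is correct
```

The order-insensitive check is one of several fixes a designer might choose; the broader lesson is that the grader was scoring an artifact of the interface, not the students' understanding.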
Justin Reich: It was some artifact of the user experience that you had created that led your students astray.
Candace Thille: And it led them astray both in that, and also in that they then had a concept of a "label point," which was meaningless. So the learning there is that students will make meaning out of what you give them, and you have to monitor your data collection. It gives you insight into whether they're making the meaning that you want them to be making.
Justin Reich: I think that's a terrific lead-in to this week's discussion about the toxic power of data and experiment. You had created this interface for a series of purposes. It was generating data that you could draw inferences from. And there was something about the mismatch of meaning between what you intended and what students discovered that led to this confusion that needed to be debugged and made sense of. That's a perfect story to tie us in with.
So Audrey or Candace, it would be great to hear from you, as we have in past weeks. Just what was your sense of what the toxic power of data and experiment is all about? What does the chapter intend to do? And then, where does it fall short, or where are there different stories that we want to tell? Maybe Audrey, we should let you go first for a bit and bring us into the conversation.
Audrey Watters: Yeah, I thought that this chapter actually... You packed a lot into this chapter. I appreciated the way in which you actually reflected on your own role as a researcher, with the kinds of experiments that you've done with data at scale. I thought it was really effective, and I think showed the humility that one does not always see in ed tech when it comes to the kinds of pronouncements that we can know.
I thought that the important things from this chapter were to recognize that data is a toxic asset, or that data can be a toxic asset, and that it's also important to know that students are compelled to use these technologies. And I think this actually ties a bit to the story that Candace just told: the idea that you introduce of contextual integrity, that students have a sense of privacy. They have their own meaning that they attach to certain circumstances, which makes them assess whether or not, in this case, their privacy was violated in ways that the researcher might not have thought about, because they weren't paying attention to the context in which the students might be interpreting the experiment.
Justin Reich: Good. So just in the same way that Candace realized some feature of her design wasn't being interpreted in the way she intended, we can imagine that the same issues could happen not just with understanding, but with issues of privacy, with issues of surveillance, with issues of autonomy. A researcher like me could think, oh, everybody must think it's fine to do X. And then students, parents, people who are intermediaries can look at the exact same setups and circumstances and go, no, no, no, that's deeply unethical, or that's problematic for these reasons. Our interpretation of what we see is shaped by our position and background and so forth.
Good. And we should go into some of those key stories and key examples, but Candace, what were some of the things that you took away? And in particular, what were some of the things that you thought were missed, or that you want to argue against, or that we should delve into further?
Candace Thille: Okay, great. So I really liked the chapter and I liked the way you spoke to some of the successes, but you also acknowledged that there has not yet been a big, compelling success that would sort of swing the pendulum saying it's worth the risks that you outlined.
Justin Reich: What would you put on that list, Candace? If you were to make your list of your top, whatever number it is, two or three or four most important contributions from online learning research and learning analytics research, what are the things where you would say, look, of course we should keep doing it. Look at these things that we've done in the past.
Candace Thille: Well, I do think that... I actually think that the... Not to-
Justin Reich: Not to toot my own horn, but one of them is yours.
Candace Thille: But I actually think that the study we did with the OLI Statistics course was pretty compelling. I actually thought that the Carnegie Mellon study was more compelling than the larger study.
Justin Reich: Just to clarify, initially, there's a randomized controlled trial that's happening at Carnegie Mellon.
Candace Thille: Oh, yeah. So I can explain that. So we built this OLI. So OLI stands for Open Learning Initiative. And for those of you who are not familiar with that, it was a project that was funded by the William and Flora Hewlett Foundation in 2002, right after they funded MIT OpenCourseWare.
And the idea was to use what we know about human learning from all the work that had been done in intelligent tutoring systems at Carnegie Mellon to inform the design of introductory-level college courses. And we started with introductory statistics, because everyone knows that everyone just wants to learn statistics. It's such a natural human drive. And so, we made this statistics course openly available out on the web. And we first started collecting data from the interactions in that course, because we realized we would never see the learners.
So how did we know whether or not the course was actually supporting someone to learn statistics? So before we released it out onto the web, we wanted to do some studies to test was it effective? Did it actually help learners learn statistics? So we did a randomized control trial at Carnegie Mellon with the introductory students at Carnegie Mellon that were taking first year statistics.
My registrar would not let me randomly assign students to condition. I don't know why, but they wouldn't.
Justin Reich: We'll go back to the registrar later, but keep going.
Candace Thille: But what they did let me do was say, we're running a study to test the effectiveness of this statistics course. Do you want to be part of the study? And of the students who volunteered to be part of the study, we randomly assigned them to condition. And the two conditions were: the students in the control condition took the intro stat course the way everyone always took it at Carnegie Mellon, which was three lectures and one discussion section a week, with a textbook, for 15 weeks.
The students in the treatment condition took the OLI course in place of the textbook, met with the instructor twice a week, and finished the course in eight weeks instead of 15, with the same three midterms and final. And we also did the CAOS test, which was somebody else's research instrument that studied statistical literacy. We did pre- and post-tests on both treatment and control. And the upshot was that the students in the OLI course not only completed the course in half the time, but their performance on the CAOS post-test was 18% greater than the students in the traditional-
Justin Reich: CAOS is an acronym for a standardized-
Candace Thille: I can-
Justin Reich: We'll put what the CAOS acronym stands for in the show notes. It's some kind of test [crosstalk 00:22:11], it's not on the subject of chaos.
Candace Thille: No, no, no, no, no. CAOS was the acronym for the group that was studying introductory statistical literacy. So anyway-
Justin Reich: So things work at Carnegie Mellon. And then your critics say something along the lines of, well, those Carnegie Mellon kids, of course they're perfectly happy to learn online. They're a bunch of techie nerds. So then there's this follow-up study in which they try to do it in settings that are much more like the rest of higher education.
Candace Thille: That's correct. So Bill Bowen, who was the former president of Princeton and former president of the Mellon Foundation, was essentially asked to review a book that had a chapter written about OLI. And so he called the president of Carnegie Mellon and said, "What is this thing? This sounds like it could possibly be a solution to Baumol's cost disease. But I want to know if it just works at Carnegie Mellon or if it works in other contexts."
So he got his friend from the University of Maryland system, Brit Kirwan, and said, "I want to run a study in your system." And so he and his team ran the second study. We weren't involved in it at all. They were looking at whether they could see the same level of acceleration at community colleges, state colleges, and universities in Maryland. They couldn't get the same study design, which we can talk about, but they were able to show about a 20% savings in time across those institutions.
Justin Reich: But the one part of the study design that is the same is that if you opt into the study, you're randomly assigned to a regular statistics class or an OLI statistics class. And the upshot was that the people in the treatment condition got the same grade, but spent about 20% less time on all activities related to the course.
Candace Thille: Yes.
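The assignment procedure the two studies share, volunteers randomly split between a control and a treatment condition, can be sketched like this. The student IDs and the random seed are invented for illustration; they aren't from either study:

```python
import random

# Hypothetical volunteers who opted into the study.
volunteers = ["s01", "s02", "s03", "s04", "s05", "s06"]

rng = random.Random(42)     # seeded only so this sketch is reproducible
shuffled = volunteers.copy()
rng.shuffle(shuffled)

half = len(shuffled) // 2
control = shuffled[:half]    # traditional course: lectures plus textbook
treatment = shuffled[half:]  # OLI course in place of the textbook

print(sorted(control), sorted(treatment))
```

Because assignment happens only among volunteers, the comparison between conditions is randomized, but, as the conversation notes, the volunteers themselves may not represent all students.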
Justin Reich: Yeah. That's great. I wonder if we should go back to that registrar, because in some ways that registrar is really at the fulcrum of some of the debates in this chapter. You go to the registrar and say, "I think I have a way of teaching statistics, which is just as good as the existing method. Might even be better than the existing method."
And one of the things that we know about teaching and learning in higher education, and in lots of other contexts, is that faculty in many places have enormous academic freedom. So there can be quite high levels of variation in teaching from one year to the next. If a teacher just went to their class and said, "I want to do some radical new way of teaching and learning this year, different from last year," no registrar or other agent at CMU is likely to stop them.
And so, every time a student signs up for a class, they're getting variation between one version or another. And to some extent you propose, well, let's systematically evaluate that variation. Just like sometimes you get randomly assigned to [inaudible] for certain kinds of things, or one teacher might be better than another, let's randomly assign you to this online learning condition or this non-online-learning condition and just see which one works better. I think people who have a sort of bent towards research kind of say, look, there's variation in the system no matter what. There's variation between last year and this year, between this instructor and that instructor.
The only difference with the experiment is that this is variation that we can actually tell whether or not one thing is better than the other. What's wrong with that argument?
Candace Thille: Yes. And this is the place where, when you say what did I think you missed in the book: you call out that argument really well. So you got that. What I think you missed is that the other power of this technology is fundamentally changing the relationship between research and practice. As you rightly call out, and as you just described, faculty pretty much every semester or every quarter might change something about how they teach.
And they're doing that because they believe that this new insight or this new way of teaching is going to improve the learning outcomes for their students, even if they don't express it that way. And so then, they try something and it works or it doesn't work, but the only person, or the only group, that benefits from that is them in their class. And so, you have all of these people who are building up observations and refining the way they're teaching something based on those observations over a long period of time.
But what this technology enables us to do is share the benefits of those structured observations across a whole bunch of faculty, instead of just one. And that's what I see as the real power. When you said that we haven't seen any huge sort of sea changes in the way teaching happens, you also made, I think, the great point that the way change is going to happen is going to be incremental and continuous. I agree. But what the technology can allow is for that incremental and continuous change to be accelerated by benefiting from all the little experiments that all the faculty are doing, whether they're explicitly calling them experiments or not.
Justin Reich: Yeah, that learning can be instantiated in a software product in a way that it would otherwise be lost, or is lost, every semester when individual faculty get better in their own classrooms but don't transmit that knowledge to their colleagues or build it into materials, those kinds of things.
Candace Thille: Yes, and it's not just instantiated. The mistake that we made in OLI on this, initially, was that we controlled the authoring environment. We didn't have a WYSIWYG authoring tool the way edX did; all of those courses were sort of written in authoring languages.
Justin Reich: So basically, only computer programmers who are familiar with the system or who were hired by OLI and then a tiny handful of super nerds elsewhere can actually change and modify these things.
Candace Thille: Yeah, not even super nerds elsewhere. Only the people in my team. So we ended up with this situation where we had multiple faculty using the OLI statistics course, and they wanted to say, "I don't like the way you teach sampling distribution. I think I teach it better. How do I put my sampling distribution way of activity into this course, so my students have a seamless experience." And the only way at that time that we could do that is we said, "Oh, great, come work with us. We will put it in. And then you can have it in your course. And then if it works, then we can try and tell the other faculty that are using the OLI statistics course about this other cool thing you did and ask them do they want it in their course?"
But we were a total bottleneck in that way. And instead, what we need is an environment, and edX tried to get to this with Studio, allowing faculty to essentially write their own courses without a central thing controlling it. The problem there, when I went to Stanford and used Open edX, was this: I could put the OLI statistics course in Open edX, faculty could use it, and they could use Studio to make a modification to it. But then there was no central way of discovering whether that modification actually supported their students better or not. And even if it had, how do you share it with somebody else that's using the course? So that's the piece of the infrastructure that was missing, and to some extent is still missing: an infrastructure that supports faculty in engaging in this continuous contribution to research and practice.
Justin Reich: Audrey, what do you think about? And we can talk about this too, but let's go back to the registrar again. So the registrar says no experimentation unless people opt in. What's right or what's wrong about that position or what's right or what's wrong about that question?
Audrey Watters: I was listening to what you and Candace were saying, and one of the questions that I had reading this was: are the kinds of changes that faculty and teachers make in their lessons year over year, or even day after day, really experiments? In our use of language they do feel like it; people are making changes. But I'm not sure that they're necessarily scientific in that kind of a way. I do think that people recognize when things don't work or when they work well, and then they decide to repeat or replicate it as much as they can the next time around. But I'm not sure there is that same level of experimentation, where things are thought about and adjusted, that we mean when we talk about an experiment. So I don't think that what faculty do when they make changes in their courses are experiments. Although, maybe it's just a fuzzy line that some-
Candace Thille: No, I think that's a good point, Audrey. And I guess what's interesting to me, having been a faculty member, and, what do I say, some of my best friends are still faculty members, is that they can have these two almost different brains: I'm in my research mode, and so I'm trying to discover new knowledge, and I have a very structured approach that I know to take to do that well. And then I shut that off when I go into teaching mode, and my goal in teaching is just to tell people stuff. And so what I tried to do, both in OLI and at Stanford, was to connect those two parts of the faculty brain and say, "Bring the same inquiry into your teaching that you have in your research."
And I would tell you, honestly, if I had to say what was the biggest success of the first few years of OLI at Carnegie Mellon, I would say it was that we had a weekly meeting of the OLI faculty. That means if you were working on logic, or statistics, or economics, or French, or biology, or chemistry, I mean, we were doing many different subjects, and if you were the lead faculty working on an OLI course, you had to come to the OLI faculty meeting. And what those meetings were, were explorations and conversations about how do I get my students to get this concept. And it was an amazing cross-fertilization: the economics faculty talking with the chemistry faculty, abstracting out the idea of proportional reasoning, and how do we get students to get the ideas behind proportional reasoning.
And then we had the learning researchers there, talking about what do we know about human learning? And it really fostered this conversation around understanding learning as research. And they did very much get into this... They were like kids in a candy store, where they'd say, "Oh, I think people will get this chemistry concept better if we do this instead of this." And we could put that into OLI, get the data back, and they would be like, "Oh, it worked," or it didn't work. And so it is, to some extent, shifting the mindset of how you're engaged in your practice. Yeah.
Justin Reich: Now one of the things that we've talked about in this conversation, too, is that in order to discover whether systematic variation is happening in productive or counterproductive ways, and in order to give students and their educators feedback about how students are doing, we have to collect all kinds of data about people's performance. And one of the things that online learning systems do, for a couple of reasons, is try to exhaustively track what participants in the system are doing. Some of that is for research. Some of that is for learning. Some of that is actually just for simple debugging. In many education technology products, the data that's collected has as its primary purpose figuring things out if the system breaks, rather than doing research or other kinds of things with it. And one of the consequences of that is that the degree to which we can collect data about human experiences in schooling situations is growing dramatically.
And certainly this is being accelerated right now in the pandemic, where teachers and students are watching each other, but they're watching each other in these new video-recorded kinds of ways that might've been used in a limited way before, but are growing now. What kinds of... And I try to argue that there are tradeoffs around this. To me, that's what's so compelling about Bruce Schneier's metaphor of a toxic asset: it seems really cool, the kinds of research that we can do with this data and the learning that we can do to help learners. But it also seems like having a giant behavioral dossier of students is not a good idea. So how do we reconcile those two things? Candace, how do you think about reconciling that in your work?
Candace Thille: So, you brought up many things to reconcile.
Justin Reich: Pick one of them.
Candace Thille: One of them is to be very clear about the, and this goes back to the assessment chapter too, what's the purpose? If assessment is simply collecting evidence to be able to make a claim, which is what you're trying to do with assessment, then it's the same thing with data collection. What's the purpose for collecting that data? That should dictate what data you do and don't collect, and who has access to it. And I found, when I was reading the chapter, that sometimes you were munging those things together. So, for example, say you're collecting data from the students solely for the purposes of supporting their learning. And when I'm talking about learning in this case, let's make it really simple: the learner's knowledge state is in one state, and you're trying to help them put it into a different state. And you can't ever directly see their knowledge state.
That's something we always have to infer. So what kinds of activities do you have the learner engage in, in order to make an inference about their current state, and also whether the learning process is actually supporting them to get to their desired state, or what they need to get to their desired state? So if I'm collecting learner data for that purpose, then probably a lot of that click-stream data isn't material. What's material are the actions the learner is taking that give me insight into their current knowledge state relative to the desired state. Does that make sense?
Justin Reich: Mm-hmm (affirmative).
Candace Thille: And so then I take that information, and I have an appropriate knowledge modeling algorithm that takes that information, for the purpose of modeling it to get the best estimate, so that I can give guidance to the learner, or to the instructor who is helping the learner. That'd be a very different reason for collecting the data than the reason of summatively assessing whether the learner could... whether I could predict a learner would do something, in order to award goodies or not-goodies to that learner. And I think that having environments where it's very clear that the purpose of the environment and the data collection is to support learning, and not essentially to assess for summative knowledge, that distinction's really important.
The only thing you're really assessing, in that first case, is where the learner's current state is, for the purpose of helping the learner to move states, or assessing the effectiveness of the environment in supporting learners to move states. That's a very different data collection regime than collecting data to be able to make a statement about the individual learner that then gives them access, or not, to other things.
Justin Reich: So one way we could expand on that is that people who engineer these systems could consistently make those kinds of commitments. There could be professional norms around these kinds of things. You can say, look, if you have a title like instructional designer, or learning scientist, or learning engineer, you have an obligation to collect data aligned with these principles but not with those principles, or at least you have an obligation to communicate to people what that data collection is for. And if you held those norms and communicated those norms to your partners, to your students, to the public, then you could generate a kind of trust that would allow that data collection to continue.
Candace Thille: So the problem, and I think this is a failure of a basic understanding of the difference between exploratory data analysis and inference. To make an inference, you have to have... Well, I won't get into my little geekiness there, but I think this is the difference. What's the first rule you teach people in intro stat courses? Association is not causation. But they're correlated. It's a little nerdy statistics joke.
But in any case, so yes, Audrey, and yes, Justin. That's why I keep using the word experiment, even though that's not really the right word. It's more, as one of the people in the chat put it, like an engineering design experiment. I don't think we're trying to create grand theory about human learning. I think it would be a mistake to try to say we're going to have grand theory, like gravity, that's going to explain everything. That reductionist approach doesn't fit. We need to take an adaptive management approach in terms of a philosophy of research. More like the kind of research one does in biology, where you're talking about emergent, complex systems. That's the theory of research we need to be grounding our learning research in. Not something that's mechanistic, like physics.
Justin Reich: Candace, the way I imagine it, there's this startup culture, which is about collecting all the data you can and doing everything you want with it. You're proposing an amendment, which I'm very sympathetic to, which is, as professionals, to put constraints on our own behavior around those things. And you're making two kinds of arguments that I think are important. One kind of argument is a relational, ethical one. It says, "Look, if we want our students to be able to trust us, we have to work in their interests. So let's limit our data collection to things that would help them."
But a second one is that we will actually do better science and better engineering with these constraints in place. That the myth of the infinite pool of data, from which insight magically appears through machine learning algorithms, is a dead end, and there's more productive science, more productive engineering, more productive learner support that we can do without it. My read of the last decade definitely supports that claim. There was widespread hope with massive open online courses, with learning management systems, that once we started hoovering up these vast swaths of data, all of these new insights would pour out. And I just don't think that view has been validated at all.
But to the extent that there have been new useful insights, it's because people have said, "Let's try these particular designs; let's collect these particular data to see if they're working." And we may have a much longer, more productive relationship with the public, with learners, with these institutions, and do better science at the same time, if we apply these kinds of professional restraints. Not because anybody makes us, but because we as... There's a relatively small group of people who have the privilege to be able to build this software, to do research on it, and so forth. Anyway, I find the arguments that you're making very compelling, and they seem like important discussions to be had. Not just among the people who build these things, but, as you've said, with all the kinds of partners who help us make these learning systems work.
Candace Thille: Yes. I agree completely.
Justin Reich: Still a technocratic argument, but it's a restraint on the elites.
Candace Thille: Yeah, but with the added dimension, which I don't know if I failed to communicate. I know you push back against this idea of democratizing education, and I agree with all of your reasons for that. What I'm talking about is democratizing education research. Putting the tools for building new knowledge about how to teach effectively back into the hands of the practitioners, who are working with the researchers.
It's like if you think about medicine. I've thought about this as an analogy, and it may be wrong. If I'm a doctor, and I'm trying to make a treatment decision for a patient who's presenting with a particular illness, I read the literature and I see what randomized controlled trials have done, to give me guidance on which of two treatments to select.
Unfortunately, the variables that they controlled for in those trials are not controlled for in my patient. My patient has all the things that they controlled away, so I can't really make a good decision just by looking at the research on these two treatment choices. But I have to make a decision, because I have to treat my patient. So I make a decision, and I might even rationalize why I made that decision as opposed to the other one, but that gets documented. And then, some place, there's going to be an outcome for that patient.
And while that doesn't meet the standard of a randomized controlled trial, it is a piece of evidence about what decision was made and what happened. And you have doctors all over the world essentially doing that. Then, the next time a doctor has to make that decision, and the research literature doesn't say which one to choose, they have this whole other source: there were 20 doctors who made that decision, with patients whose variables are most like my patient's, and this was the outcome of their decisions. And so I can use that to help guide my decision. And I think that we could get faculty participating in the same community-based research activity.
Justin Reich: That's great. I think so too. Audrey, do you have final thoughts about the toxic power of data and experiment, or the things that we missed in this conversation that we should make sure to bring up as we'd come to the close?
Audrey Watters: In one place you talk about, I think it was about K-12 school districts, that you didn't think they had the expertise to deal with data management, and that you didn't think they would develop that expertise. I agree with what Candace was saying as well, but I wonder how we help develop this expertise. Not just in managing data, but how do we help develop this expertise in school districts? And particularly in cash-strapped, resource-strapped school districts. How do we practically get districts, get schools, there?
Justin Reich: I think one component of it can be connected to what Candace is talking about, which is that one of the resources schools have is an inexhaustibly curious faculty. They're exhaustingly busy, and they're exhausted, but they're inexhaustibly curious. And so one bottom-up starting point is to say that the people who are building these tools and the people who are implementing them in schools have obligations, as part of their practice, to think about democratizing these processes. Not democratizing as in exporting resources from elite universities to the far-flung corners of the world, but through making the people who purchase, use, lend, and borrow our resources meaningful partners in what we're doing.
And one could imagine that, in any school system that's using Khan Academy, Khan Academy could publicize, "Hey, three times a year, we're going to have a... Here's the research that we're doing, and here's what we're doing with your data. Come learn about it." And I don't know how many of the hundreds of thousands of math teachers in the United States would sign up, but a bunch of them would. And those folks who volunteered would come to have some more expertise. There's a lot of complexity here. But one starting point is to say that we're asking the people who have power over these tools and machines to engage in meaningful conversations with the people who are stakeholders in different kinds of ways. That's a good start.
And it'll be good for two reasons. One is that we'll get better ideas and better perspectives. Going back to Candace's example of the mismatch between her and her team's idea and design, and what her students saw on the other end, we can sand down some of those communication mishaps. But then, more importantly, I think if people in the field of education technology are really committed to hosting those kinds of conversations, it's through those bottom-up conversations that trust is built. And the trust will be built both because we're transparent about what we're doing, and because most schools will tell us when we're doing stupid things, and we'll stop. So those are two mechanisms that I can imagine.
Well, this has been a very rich conversation. Thanks so much to all of you who joined us online. Audrey Watters, thanks as always for joining me. Candace Thille, thank you so much for coming in and being here. It was great to be able to get your perspective and your wisdom on these topics.
Candace Thille: Thanks. This was really fun.
Justin Reich: We have one more conversation left, a conclusion, a wrap up. We've got Kevin Gannon and Cathy Davidson coming in along with Audrey to help us combine some of these themes together. And I hope many of you will come and join us for that conversation. It'll be the Monday before Thanksgiving. So we'll wrap up before the holidays. Really wonderful to have you all here and have a great afternoon and rest of your week. Thanks everyone.
Audrey Watters: Thanks everyone. Bye-bye.
Justin Reich: Thanks for joining us for the Failure to Disrupt Book Club on chapter eight, The Toxic Power of Data and Experiments, with Candace Thille. Thanks, Candace, for joining us, along, as always, with Audrey Watters. I'm Justin Reich. Thanks for listening to TeachLab. I hope you enjoyed that conversation. Be sure to subscribe to TeachLab to get future episodes on how educators from all walks of life are tackling distance learning during COVID-19.
Be sure to check out the new book, Failure to Disrupt: Why Technology Alone Can't Transform Education, available from booksellers everywhere. You can find out more at failuretodisrupt.com. That's failuretodisrupt.com. Also this month, you can join me and Vanderbilt Professor Rich Milner in a free, self-paced online course for educators, Becoming a More Equitable Educator: Mindsets and Practices. Through inquiry and practice, you'll cultivate a better understanding of yourself and your students, gain new resources to help all students thrive, and develop an action plan to work in your community to advance the lifelong work of equitable, anti-racist teaching.
You can find the link to this edX course in our show notes, where you can enroll now. The course will run until August 26, 2021, and you can join on your own or with colleagues any time and complete the course at your own pace, for free. This episode of TeachLab was produced by Aimee Corrigan and Garrett Beazley, recorded and sound mixed by Garrett Beazley. Stay safe, until next time.