In last month's column, I discussed how the issue of correlation vs. causation jumpstarted a conversation about data literacy, and I will continue that discussion in this month’s column. The term “data literacy” applies to the overlapping skillsets of statistical literacy (learning how to read statistics), information literacy (finding, using, and applying information), and the ability to “obtain and manipulate data” (Schield 2004). As reading and writing infographics, charts, graphs, and visualizations become ubiquitous in mainstream and scholarly media and in secondary classrooms, students must learn to critically examine the data sets and data visualizations they encounter.
RULES OF THUMB
The following offers data-related rules of thumb that librarians, educators, and students can quickly bring to their inquiry projects:
It’s healthy for students to talk back to data.
Many of my graduate students confessed earlier this year that until they were college upperclassmen, they had always operated under the assumption that while fiction could be dissected and evaluated, textbooks and articles should be taken as unquestionably true. Their brave confession reminds us that we need students, beginning in high school, to see themselves as participants in scholarly conversations. As our ACRL colleagues point out in the third draft of the Framework for Information Literacy for Higher Education, “Scholarship is a conversation” (2014, 11). That conversation begins as students interact with resources, not when they submit a final product for consideration by classmates or colleagues online.
Teaching Tip: Librarians can model their “talking back” using sticky notes, margin notes, highlighting, online annotations, and oral think-alouds to demonstrate how students can interrogate ideas and arguments.
Data is not neutral.
For children, numbers bring a certain reassurance. But when humans are doing the counting, human fallibility and choice can affect which numbers come up and what they mean. Consider a state in which students have shown a double-digit increase or decrease in proficiency on a state test over the course of a single year. Our first instinct—and, sadly, that of many reporters—is to conclude that our students are dramatically better or worse than in the previous year. But remember, behind data lies humanity. Perhaps the state adjusted the cut scores (the benchmark score a student must reach in order to be considered proficient in the subject area). Perhaps there was a new test, and the new questions were easier or harder. Probe beyond the captions or, in the case of scholarly research, read beyond the abstract to look at what kinds of data were collected and how they were processed.
Teaching Tip: Encourage students to ask, “What data was used? What is missing? What else might account for this finding?”
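For classrooms with coding tools at hand, the cut-score point can be made concrete in a few lines of Python. All numbers below are fictional, chosen only to show that the same test scores can produce very different "percent proficient" headlines depending on where the benchmark is set:

```python
# Ten fictional student test scores on the same exam.
scores = [55, 58, 61, 63, 64, 66, 70, 74, 78, 85]

def percent_proficient(scores, cut_score):
    """Share of students at or above the proficiency benchmark."""
    passing = sum(1 for s in scores if s >= cut_score)
    return 100 * passing / len(scores)

# Same students, same scores -- only the cut score changes.
print(percent_proficient(scores, 60))  # 80.0 percent "proficient"
print(percent_proficient(scores, 65))  # 50.0 percent "proficient"
```

A headline could truthfully report either a proficiency rate of 80 percent or 50 percent; nothing about the students changed.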
Consider sample size.
Imagine a paper abstract that states a reading incentive program raised achievement by an entire grade level within just a few months. Sounds miraculous, thinks the student, who decides to cite the abstract and skip reading the article. This decision means the student never learns that only five students were studied. The program worked for those five students, but will it work for others? The small sample size makes it difficult to extrapolate the finding to other populations with confidence.
One underappreciated section of a professional article is the “Limitations of the Study.” Here the scholars identify anything that might limit the applicability of their findings; in many cases, sample size, or the number of people who were studied, is a factor. Why might small-sample studies be published at all? When an academic or laboratory researcher begins a new area of study, they may conduct preliminary research for which they have yet to obtain funding. To keep out-of-pocket costs low, they start with a small study to explore their new hypotheses. They can use those results later to apply for larger funding and expand their testing with more people and a more diverse population. Small sample sizes don’t negate the findings of a study; they just limit the degree to which we can assume the findings would hold for other students.
Teaching Tip: Numerous studies now confirm that few students read or cite beyond the first pages of a study. Show students how the “Limitations” section can reveal important context with which to weigh the author’s findings.
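Why are small samples so unreliable? A short simulation can show students how much a five-student study can swing compared to a five-hundred-student study drawn from the same population. The effect size and spread below are fictional, picked only for illustration:

```python
import random

random.seed(42)  # fixed seed so the demonstration is repeatable

# Fictional "true" population: reading gains average 0.2 grade levels,
# with student-to-student variation (standard deviation 0.5).
def sample_mean(n):
    """Average gain reported by one study of n randomly drawn students."""
    return sum(random.gauss(0.2, 0.5) for _ in range(n)) / n

small = [sample_mean(5) for _ in range(1000)]    # 1,000 tiny studies
large = [sample_mean(500) for _ in range(1000)]  # 1,000 well-powered studies

# Tiny studies routinely report results far from the true average;
# large studies cluster tightly around it.
spread_small = max(small) - min(small)
spread_large = max(large) - min(large)
print(spread_small, spread_large)
```

The five-student studies scatter widely, so a "miraculous" result from one of them tells us little, while the large studies all land near the true effect.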
Heads-up for selection bias.
Imagine that a high school senior wants to know whether parents prefer LittleBits or LEGO® pieces. She sends a request to a LEGO® Facebook group asking for their participation in her survey. Needless to say, the survey results show overwhelming preference for LEGO®.
Selection bias occurs when the people who participate in a study, observation, or research project are not representative of the greater population being studied. This can happen when only a sub-section of a population volunteers or only a partial population is recruited to participate.
Teaching Tip: To illustrate selection bias, ask students to consider that many campus medical studies fifty years ago were conducted with local students. How might the fact that most college students were male then impact how physicians understand health for men? For women?
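The LEGO® survey scenario above can also be simulated. In this sketch, the population split and fan-group makeup are invented: half of all parents prefer each product, but the Facebook fan group the student surveys skews heavily toward LEGO®:

```python
import random

random.seed(0)  # fixed seed so results are repeatable

# Fictional population of 10,000 parents, split 50/50.
population = ["LEGO"] * 5000 + ["littleBits"] * 5000

# Fictional LEGO Facebook group: 90% of members prefer LEGO.
fan_group = ["LEGO"] * 900 + ["littleBits"] * 100

def survey(pool, n):
    """Percent of n randomly chosen respondents who prefer LEGO."""
    responses = random.sample(pool, n)
    return 100 * responses.count("LEGO") / n

pop_result = survey(population, 200)  # representative sample: near 50
fan_result = survey(fan_group, 200)   # biased recruitment pool: near 90
print(pop_result, fan_result)
```

The survey method is identical in both cases; only the recruitment pool differs, and that difference drives the result.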
Watch for conditional language.
Advertisers and headline writers often soften claims with words like “up to,” “as many as,” or “may help”: the number is memorable, but the hedge means the actual claim is weaker than it first sounds.
Teaching Tip: As a mini-lesson or homework assignment prior to a larger research project, have students browse a collection of commercials (http://www.ispot.tv/browse) and identify three commercials that use data or statistics but frame them with conditional language that makes the claims softer than they first appear.
Watch for wonky graphs.
When we learn to read graphs in math class, we are taught that the y-axis (the vertical axis) begins at zero and moves up from there. But in many contemporary graphics, stretching, truncating, or otherwise manipulating the y-axis can present technically accurate information while building a biased visual argument.
Study Figures 1 and 2 showing two graphs related to a proposed school millage. How likely is the millage to pass according to Figure 1? Figure 2? (Note: data is fictional.)
The answer, of course, is that both graphs show the same data (only the y-axis labels are different). Visualizations like Figure 1 are created when someone wants to use data creatively to make an argument. Look quickly, and it seems that the community is split over the millage. The data is simultaneously accurate and misleading.
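A little arithmetic makes the trick visible. Using fictional millage poll numbers like those in the figures (52 percent “yes,” 48 percent “no”), we can compute how many times taller the “yes” bar looks than the “no” bar depending on where the y-axis starts:

```python
# Fictional poll results: 52% "yes," 48% "no."
yes, no = 52, 48

def apparent_ratio(a, b, baseline):
    """How many times taller bar a *looks* than bar b
    when the y-axis starts at `baseline` instead of zero."""
    return (a - baseline) / (b - baseline)

honest = apparent_ratio(yes, no, 0)   # axis starts at zero
trick = apparent_ratio(yes, no, 45)   # axis truncated at 45

print(round(honest, 2))  # 1.08 -- the bars look nearly equal
print(round(trick, 2))   # 2.33 -- "yes" appears to dwarf "no"
```

Both graphs plot the same 52 and 48, but truncating the axis more than doubles the apparent gap.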
Teaching Tip: Do an image search for “obamacare graph.” Find two graphs that appear to show similar data represented differently. Try to identify what makes this possible. Some possibilities include manipulating the y-axis (vertical) of a graph, such as starting the numbering from a number other than zero, having very small gaps (e.g., .0001, .0002) or very large distances (e.g., 4,000,000 and 8,000,000) between numbers, or using colors that subtly signal danger (red) or health (green). Though scatologically named, WTF Visualizations (http://viz.wtf/) collects misleading examples of visualized data that make great classroom discussion artifacts.
As you learn more about how your students interpret data during research, you will see entry points where mini-lessons can deepen and clarify their understanding of the kinds of data they are facing.