Note that this site is currently in version 1.0.0-alpha. Some functionality may be limited.

Data Literacies

What is data? What counts as data? These are questions we will explore throughout the workshop.

Data is foundational to nearly all digital projects and often helps us to understand and express our ideas and narratives. Hence, in order to do digital work, we should know how data is captured, constructed, and manipulated. In this workshop we will discuss the basics of research data in terms of its material, transformation, and presentation. We will also engage with the ethical dimensions of what it means to work with data, from collection to visualization to representation.

In this workshop, you will:

  • Become familiar with the specific requirements of “high quality data”
  • Know the stages of data analysis
  • Learn about ethical issues around working with different types of data and analysis
  • Understand the difference between proprietary and open data formats

Last updated: September 15, 2021

Before you get started

This section introduces some steps to take before you get started with this workshop. For instance, there may be workshops that we suggest you engage with first, required or recommended software to install, and files from external sources to download.


This is a list of workshops that we suggest you engage with before you get started with this one. They are listed here because they cover central concepts or tools that you may need in order to digest the information presented in this workshop.

Introduction to the Command Line (Required)

This workshop makes reference to concepts from the Command Line workshop, and having some knowledge about how to use the command line will be central for anyone who wants to learn about how to handle and process data and data analysis.



Why am I learning this? Why does it matter? How will it help my project? Learning new digital skills is an investment of your valuable time, so it is reasonable to want to know—essentially—what will I get out of taking this workshop? The materials below help situate the skills you are about to learn within a larger context of how they are used, by whom, and to what ends.

Ethical considerations

Digital tools and the skills required to use them are part of our culture and, therefore, never neutral. Digital humanists and social scientists consider the ethical challenges and responsibilities of the tools and methods that they use. The following materials are designed to introduce you to issues you may want to consider as you learn this new skill and decide how to integrate it into your own research and teaching.

Big data projects often require sharing datasets across different individuals and teams. In addition, to ensure that our work is reproducible and accountable, we may also feel inclined to share the data collected. As such, figuring out how to share such data is crucial in the project planning stage.

Consider how you may use differential privacy as a strategy against re-identification. The 2020 US Census, for example, used this strategy to address privacy concerns.
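
To make the idea concrete, here is a minimal sketch of one common differential privacy technique, the Laplace mechanism, applied to a simple count. It is an illustration only and not the Census Bureau's actual system. The function name private_count, the ages list, and the epsilon value are assumptions made up for this example, and the sketch assumes each person contributes at most one record.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Assumes each person contributes at most one record, so adding or
    removing one person changes the true count by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Laplace noise with scale 1/epsilon, drawn as the difference of two
    # exponential samples with rate epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Made-up ages, for illustration only.
ages = [23, 35, 41, 29, 62, 57, 33, 48]
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5,
                      rng=random.Random(0))
print(f"Noisy count of people aged 40 or older: {noisy:.2f}")
```

A smaller epsilon adds more noise, which gives stronger privacy protection but a less accurate count. Choosing that trade-off is an ethical decision as much as a technical one.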

Data and data analysis are not free from bias. Data does not emerge from a magic black box; it is contextually driven. As we think about automating the analysis of “big” data, we have to be aware of the “hidden” biases that get reproduced.

De-identified information can be reconstructed from piecemeal data found across different sources. When we consider what we are doing with the data we have collected, we also need to think about the possible re-identification of our participants.

Readings before you get started

The readings listed below situate what you are about to learn in cultural contexts, such as a particular humanities or social science field, the information or computer sciences, or popular discourse. The purpose of the readings is to provide a theoretical framework you can use to contextualize how you intend to use the skill or tool introduced in this workshop.

Big? Smart? Clean? Messy? Data in the Humanities
In Big? Smart? Clean? Messy? Data in the Humanities, Christof Schöch discusses what data means in the humanities and the necessity of “smart big data.”
Bit by Bit: Social Research in the Digital Age
Bit by Bit: Social Research in the Digital Age, a book by Matthew Salganik, approaches data and social research from a computational social science perspective. It also discusses the idea of “readymade” and “custommade” data alongside ethics.
Ten Simple Rules for Responsible Big Data Research
Ten Simple Rules for Responsible Big Data Research explores some guidelines for addressing complex ethical issues that arise in any research project.

Projects related to Data Literacies

The following are sample projects that use the skill or tool (either implicitly or explicitly) that you are about to learn. Some skills that are foundational may seem not to lead to a specific project goal that you have in mind. You might be surprised to learn that the following projects depend on the skills learned in this workshop.

Data for Public Good
Data for Public Good is a semester-long collaborative project led by CUNY graduate students. Each semester, a different public-interest dataset is explored to present information that is useful and informative to a public audience.
SAFElab
SAFElab, led by Dr. Desmond U. Patton, uses computational and social work approaches to understand the mechanisms of violence and to work on prevention of and intervention in violence that occurs in neighborhoods and on social media.

Datasets related to Data Literacies

See the introduction to what datasets are and what they do in our frontmatter section.

Meet your instructor

Di is currently a PhD candidate in Critical Social/Personality Psychology at CUNY, The Graduate Center (GC). They are also a GC Digital Initiatives Digital Fellow. Broadly, their work is on understanding the relationality between systems of oppression and the individual. They are interested in identities as discourses, and the ties between transnationalism and diasporas. Currently, they are working on several projects that explore identities as discourses, including in alt-right spaces, K-pop/Hallyu, and the experiences of queer and trans Asian (American) people in the US.

As a GC Digital Initiatives Digital Fellow, they are also interested in understanding what ethics means within computational social science, digital humanities, and public humanities projects. They are also invested in bridging gaps in technology literacy, especially within underserved communities.