1. Data is Foundational
In this workshop we will discuss the basics of research data in terms of material, transformation, and presentation. We will also discuss the ethical issues that arise in data collection, cleaning, and representation. Because everyone has a different approach to and understanding of data and ethics, this workshop also includes multiple sites for discussion to help us think through what data literacies mean within our projects and in broader applications.
What Constitutes Research Data?
The quotes below offer a variety of perspectives on research data from different stakeholders. Including these different approaches is meant to suggest that there is no singular, definitive definition of research data; what counts depends on multiple factors, including your project considerations.
Material or information on which an argument, theory, test or hypothesis, or another research output is based.
What constitutes such data will be determined by the community of interest through the process of peer review and program management. This may include, but is not limited to: data, publications, samples, physical collections, software and models.
Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues.
The short answer is that we can’t always trust empirical measures at face value: data is always biased, measurements always contain errors, systems always have confounders, and people always make assumptions
Broadly, research data can be understood as the materials or information necessary to reach your conclusion, but what these materials and information are depends on your project.
Forms of Data
There are many ways to represent data, just as there are many sources of data. What can you, or do you, count as data? Here’s a small list of possibilities:
- Non-digital text (lab books, field notebooks)
- Digital texts or digital copies of text
- Statistical analysis (SPSS, SAS, R)
- Scientific sample collections
- Data visualizations
- Computer code
- Standard operating procedures and protocols
- Protein or genetic sequences
- Artistic products
- Curriculum materials (e.g. course syllabi)
- Spreadsheets
- Audio
- Video
- Computer Aided Design/CAD
- Databases
- Geographic Information Systems (GIS) and spatial data
- Digital copies of images
- Web files
- MATLAB files & 3D Models
- Metadata & Paradata
- Collection of digital objects acquired and generated during research
Adapted from: Georgia Tech
Challenges for lesson 1
Assignment: Challenge: Forms of Data
These are some (most!) of the shapes your research data might transform into.
- What are some forms of data you use in your work?
- What about forms of data that you produce as your output? Perhaps there are some forms that are typical of your field.
- Where do you usually get your data from?
- As I am currently exploring discourses across various social media ecosystems, I tend to extract/scrape data that comes through as JSON files, a text-based file type often used to structure large data sets. Sometimes the data also comes in other formats, such as CSV or XLS.
- Oftentimes the outputs are statistical analyses and various data visualizations. This is also pretty common in my field of psychology.
- I can get them from large databases like pushshift.io or scrape certain social media outlets, such as Twitter, directly.
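The sample answers above mention data arriving as JSON and sometimes as CSV. As a rough illustration of how one form of data can be transformed into another, here is a minimal Python sketch that converts a small JSON array into CSV text using only the standard library. The sample records, their field names (`id`, `author`, `text`, `likes`), and the helper name `json_records_to_csv` are all hypothetical; real scraped data is usually nested and would need more cleaning than this.

```python
import csv
import io
import json

# Hypothetical sample: two scraped posts stored as JSON text.
# The second record is missing the "likes" field on purpose.
raw_json = """
[
  {"id": 1, "author": "user_a", "text": "First post", "likes": 3},
  {"id": 2, "author": "user_b", "text": "Second post"}
]
"""


def json_records_to_csv(json_text, fieldnames):
    """Convert a JSON array of flat objects into CSV text.

    Missing fields become empty cells; fields not listed in
    `fieldnames` are silently dropped.
    """
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        restval="",             # value used when a record lacks a field
        extrasaction="ignore",  # skip keys that are not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


csv_text = json_records_to_csv(raw_json, ["id", "author", "text", "likes"])
print(csv_text)
```

Note that even this small transformation involves choices: the missing `likes` value becomes an empty cell rather than a zero, which is exactly the kind of cleaning decision this workshop asks you to reflect on.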
Research data can be defined as: (Select all that apply)