Note that this site is currently in version 1.0.0-alpha. Some functionality may be limited.

1. Data is Foundational

In this workshop we will discuss the basics of research data in terms of material, transformation, and presentation. We will also discuss the ethical issues that arise in data collection, cleaning, and representation. Because everyone approaches and understands data and ethics differently, this workshop also includes multiple sites for discussion to help us think through what data literacies mean within our projects and in broader applications.

What Constitutes Research Data?

The quotes below offer a variety of perspectives on research data from different stakeholders. These different approaches are included to suggest that there is no singular, definitive definition of research data; what counts depends on multiple factors, including your project considerations.

Material or information on which an argument, theory, test or hypothesis, or another research output is based.

Queensland University of Technology. Manual of Procedures and Policies. Section 2.8.3.

What constitutes such data will be determined by the community of interest through the process of peer review and program management. This may include, but is not limited to: data, publications, samples, physical collections, software and models.

Marieke Guy

Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues.

OMB Circular A-110, Subpart C, section 36, (d) (i)

The short answer is that we can’t always trust empirical measures at face value: data is always biased, measurements always contain errors, systems always have confounders, and people always make assumptions.

Angela Bassa

Broadly, research data can be understood as the materials or information necessary to reach your conclusion, but what those materials and information are depends on your project.

Forms of Data

There are many ways to represent data, just as there are many sources of data. What can you, or do you, count as data? Here’s a small list of possibilities:

  • Non-digital text (lab books, field notebooks)
  • Digital texts or digital copies of text
  • Statistical analysis (SPSS, SAS, R)
  • Scientific sample collections
  • Data visualizations
  • Computer code
  • Standard operating procedures and protocols
  • Protein or genetic sequences
  • Artistic products
  • Curriculum materials (e.g. course syllabi)
  • Spreadsheets (e.g. .xlsx, .numbers, .csv)
  • Audio (e.g. .mp3, .wav, .aac)
  • Video (e.g. .mov, .mp4)
  • Computer Aided Design/CAD (.cad)
  • Databases (e.g. .sql)
  • Geographic Information Systems (GIS) and spatial data (e.g. .shp, .dbf, .shx)
  • Digital copies of images (e.g. .png, .jpeg, .tiff)
  • Web files (e.g. .html, .asp, .php)
  • Matlab files & 3D Models (e.g. .stl, .dae, .3ds)
  • Metadata & Paradata (e.g. .xml, .json)
  • Collection of digital objects acquired and generated during research

Adapted from: Georgia Tech
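
One way to take stock of the forms of data in a project is to group its files by extension. The following is a minimal sketch in Python, not a definitive tool: the mapping covers only a few entries from the list above, and the function name `classify_files` and the sample file names are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePath

# A partial mapping based on the list above; extend it for your own project.
FORMS_BY_EXTENSION = {
    ".xlsx": "Spreadsheets",
    ".numbers": "Spreadsheets",
    ".csv": "Spreadsheets",
    ".mp3": "Audio",
    ".wav": "Audio",
    ".aac": "Audio",
    ".mov": "Video",
    ".mp4": "Video",
    ".png": "Digital copies of images",
    ".jpeg": "Digital copies of images",
    ".tiff": "Digital copies of images",
    ".xml": "Metadata & Paradata",
    ".json": "Metadata & Paradata",
}


def classify_files(filenames):
    """Group file names by the form of data their extension suggests."""
    groups = defaultdict(list)
    for name in filenames:
        # Compare extensions case-insensitively (".PNG" and ".png" match).
        extension = PurePath(name).suffix.lower()
        form = FORMS_BY_EXTENSION.get(extension, "Unclassified")
        groups[form].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
example_files = ["interviews.mp3", "survey_results.csv", "field_notes.txt", "map.PNG"]
inventory = classify_files(example_files)
for form, names in inventory.items():
    print(f"{form}: {names}")
```

An extension is only a hint. A .json file, for instance, may hold the data itself rather than metadata, so an inventory like this is a starting point for discussion, not a classification to trust at face value.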

Challenges for lesson 1

Challenge: Forms of Data

These are some (most!) of the shapes your research data might transform into.

  1. What are some forms of data you use in your work?
  2. What about forms of data that you produce as your output? Perhaps there are some forms that are typical of your field.
  3. Where do you usually get your data from?

  1. As I am currently exploring discourses on various social media ecosystems, I tend to extract or scrape data that comes through as JSON files, a text-based file type that is often used to structure large data sets. Sometimes the data also comes in other forms, such as CSV or XLS files.
  2. Oftentimes my outputs are statistical analyses and various data visualizations. This is also pretty common in my field of psychology.
  3. I can get my data from large databases, or scrape certain social media outlets, such as Twitter, directly.
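
To illustrate how the same records can move between two of the forms mentioned in this answer, here is a minimal sketch using only Python's standard library. The records are made up for illustration (they are not real social media data), and the field names `user`, `text`, and `likes` are assumptions.

```python
import csv
import io
import json

# Made-up records in JSON form; real scraped data will have different fields.
raw_json = """
[
  {"user": "a", "text": "first post", "likes": 3},
  {"user": "b", "text": "second post", "likes": 10}
]
"""

records = json.loads(raw_json)  # a list of dictionaries, one per record

# Write the same records as CSV text (in memory, so no file is created).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["user", "text", "likes"],
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()
print(csv_text)
```

JSON can nest records inside records, while CSV is a flat table of rows and columns. Real scraped data often needs to be flattened before it fits a spreadsheet, and that step is itself a transformation of the data worth documenting.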



Research data can be defined as:

(Select all that apply)
