Data wrangling practices and collaborative interactions with aggregated data

Shiyan Jiang, Jennifer Kahn

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


Data visualization technologies are powerful tools for telling evidence-based narratives about oneself and the world. This paper contributes to the literature on data science education by examining the sociotechnical practices of data wrangling, the strategies for selecting and managing large, aggregated datasets to produce a model and story. We examined the learning opportunities related to data wrangling practices by investigating youth’s talk-in-interaction as they assembled models and stories about family migration using interactive data visualization tools and large socioeconomic datasets. We first identified ten sociotechnical practices that characterize youth’s interaction with tools and collaboration in data wrangling. We then propose four categories of activities that describe patterns of learning related to these practices: addressing missing data, understanding data aggregation, exploring social or historical events that constitute the formation of data patterns, and varying data visual encoding for storytelling. Understanding these practices and activities is important for supporting future data science education opportunities that facilitate learning and discussion about scientific and socioeconomic issues. This study also sheds light on how the family migration modeling context positions the youth as having agency and authority over the data, and it contributes to the design of CSCL environments that tackle the challenges of data wrangling.

Original language: English (US)
Pages (from-to): 257-281
Number of pages: 25
Journal: International Journal of Computer-Supported Collaborative Learning
Issue number: 3
State: Published - Sep 1 2020


Keywords

  • Data visualization
  • Data wrangling
  • Family migration
  • Modeling
  • Sociotechnical practices
  • Storytelling

ASJC Scopus subject areas

  • Education
  • Human-Computer Interaction

