Schedule
Coursera is the primary source of schedule information for this class.
Note: To access the slides you may need to enable Box for your NetID.
| Module | Topic | Slides |
|---|---|---|
| | Course Overview | |
| 1 | Introduction to Data Curation | |
| | Introduction to Data Curation | |
| | Data Curation Universe | |
| | Data Lifecycle Models | |
| | The Curated Data Lake | |
| | Curation Profile: GHCN | |
| | Curation Profile: Common Crawl | |
| 2 | Ethics, Law, and Policy | |
| | Introduction to Ethics, Laws, and Policies | |
| | Research and Data Ethics | |
| | Privacy Laws | |
| | De-Identification Methods | |
| | Intellectual Property Laws | |
| | AI Law and Data Curation | |
| | Curation Profile: Census ACS | |
| 3 | Data Abstractions - Relations | |
| | Data Models | |
| | The Problem | |
| | The Relational Model | |
| | How is the Relational Model Implemented? | |
| | Abstraction, Indirection & Data Independence | |
| | Relational Model and Curation Activities | |
| | Tidy Data | |
| 4 | Data Abstractions - Trees | |
| | Text and Documents | |
| | The Problem | |
| | The Solution: (1) Descriptive Markup | |
| | The Solution: (2) Trees | |
| | Why The Solution Works | |
| | Implementing The Solution with XML and JSON | |
| 5 | Data Abstractions - Ontologies | |
| | The Problem: Connecting Data to Information | |
| | The Solution: Ontologies | |
| | An ER/Ontology Example: FRBR | |
| | Implementing Ontologies in RDF/RDFS | |
| | Practical Ontologies with JSON-LD | |
| 6 | Data Integration | |
| | Data Cleaning, Data Integration | |
| | Managing Heterogeneity | |
| | Schema Integration | |
| | Schema Integration: an example | |
| | Example: The Curated Data Lake | |
| 7 | Data Concepts | |
| | What is data? A first attempt | |
| | The Identity Problem | |
| | Some Ontological Analysis | |
| | A Way Forward: Roles and Types | |
| | An Ontology for Data Concepts | |
| | What is data? | |
| 8 | Metadata | |
| | What is Metadata? | |
| | Metadata Schemas | |
| | Common Metadata Ambiguities | |
| | How Does Metadata Support Data Curation? | |
| | Metadata in Practice | |
| 9 | Identity | |
| | Why is Identification Important? | |
| | What Are We Identifying? | |
| | How Do We Identify? | |
| | Canonicalization | |
| | Identifiers and Identifier Systems | |
| 10 | Preservation | |
| | Introduction to Data Preservation Challenges | |
| | What is Data Preservation? | |
| | The Preservation Integration Parallels | |
| | Standard Data Preservation Strategies | |
| | Two Data Preservation Standards | |
| 11 | Standards | |
| | Standards and Standards Organizations | |
| | Some Standard Standards Maneuvers | |
| | Compatibility | |
| | Standards Organizations | |
| 12 | Workflow, Provenance and Reproducibility | |
| | Workflow | |
| | Provenance | |
| | Workflow Systems | |
| | Provenance Standards | |
| | Computational Reproducibility | |
| 13 | Data Practices | |
| | Data Practices | |
| | What’s Going on in the Lab? | |
| | Data Sharing | |
| | Data Reuse | |
| | Trends in Data Curation Research | |
| 14 | Fall Break | |
| 15 | Communication | |
| | Communication and Data Curation | |
| | Information Overload | |
| | Limited Access to Research | |
| | Research Integrity | |
| | Beyond the PDF | |
| 16 | Course Review | |