SORTEE Workshop - Integrating SQL into R analytical workflows using dbplyr
Event description
In this workshop, we will discuss why you might consider a relational database to store environmental data. We will go over how to insert data into and retrieve data from a database using R and DuckDB. We will focus on how to use the R dbplyr package to integrate databases into tidyverse-focused analytical workflows.
This workshop will cover these main points:
- Introduce the concepts of a database and discuss why you might want to have or use one
- How to integrate the use of a database into an R analytical workflow
- Hands-on exercise using dbplyr and how it can be used to learn some SQL basics
The esteemed trainers leading this workshop are Julien Brun and Greg Janée from the University of California, Santa Barbara.
In advance of this workshop, we encourage you to make sure you have R and RStudio installed on your machine, with the tidyverse and dbplyr packages already installed.
Speakers' bios
Julien Brun is an Earth & Environmental Sciences Research Facilitator in the Research Data Services department at the UCSB Library. He was previously a Senior Data Scientist at the National Center for Ecological Analysis & Synthesis (NCEAS). Julien helps researchers to make their research more collaborative and reproducible. Julien is also a Lecturer for the Bren Master of Environmental Data Science (MEDS) program at UCSB.
Julien's scientific expertise is in ecohydrology, Earth observation techniques (remote sensing and GIS), and process-based modeling. Prior to conducting his PhD on the ecohydrological impacts of tropical cyclones in the Southeastern US, Julien conducted several projects on land cover change, vegetation monitoring, and disaster mapping for governmental and international institutions.
Julien is also co-leading the development of two R packages: lterdatasampler, which creates ready-to-use teaching datasets for environmental data science training, and metajam, which helps scientists download data and metadata in a user-friendly format from the DataONE federation of data repositories.
Greg Janée is founder and director of the Research Data Services department at the UCSB Library. His areas of expertise include research data management, data curation, digital preservation, digital library research and development, repository and storage architectures, geospatial data processing, and metadata standards development and mapping. At UCSB he has led or participated in numerous digital library-related projects, beginning with the Alexandria Digital Library (ADL). At the California Digital Library he was principal developer of the EZID persistent identifier service. Prior to working in academia, he worked in industry as a software engineer on both research projects and commercial products.
Greg is a certified Carpentry instructor and regularly teaches Python, R, SQL, version control, and other introductory data science workshops. His experience with relational database systems includes data modeling and database design, database administration, programming language integration, and query optimization.