This FREE workshop is sponsored by the University of Hawai’i Information Technology Services Cyberinfrastructure group, the Texas Advanced Computing Center, and Hawai’i EPSCoR.
This workshop is designed for researchers working with genomics data and teaches basic concepts, skills, and tools. No prior computational experience is required.
The focus will be on data management and analysis for genomics research. We will cover metadata organization, data organization, connecting to and using advanced computing, the command line for sequence quality control and bioinformatics workflows, and R for data analysis and visualization. We will not teach any particular bioinformatics tool; instead, we will teach the foundational skills that allow you to conduct any analysis and to analyze the output of a genomics pipeline. By the end of the workshop, learners should be able to manage and analyze data more effectively and to apply the tools and approaches directly to their ongoing research.
Participants should bring their laptops and plan to participate actively. Laptops will need a terminal application for accessing compute resources. Linux and Mac machines have one already; for Windows machines we recommend the free home version of MobaXterm. We will also be using Jupyter notebooks. At the moment Microsoft is offering free Azure-hosted Jupyter notebooks, so please create a Microsoft account prior to the workshop if you do not have one already.
One dataset will be used throughout the workshop. We will start by introducing the dataset and the steps we’ll go through for analysis.
In this workshop we’re using data from the Blount et al. 2012 paper from Dr. Richard Lenski’s Long-Term Evolution Experiment.
Start Time | Topic |
---|---|
9:00am | Introduction |
9:10am | The Dataset |
9:20am | Metadata and Data Organization |
9:30am | Introduction to the command line |
10:00am | Break - coffee and light refreshments provided |
10:15am | Introduction to the command line, continued |
11:00am | Data Wrangling |
Noon | Lunch - on your own |
1:00pm | Introduction to R |
2:15pm | Break - coffee and light refreshments provided |
2:30pm | Agave/ToGo |
4:00pm | End |
https://public.etherpad-mozilla.org/p/UH-bioworkshop
Metadata and Data Organization
Introduction to the command line