Bioinformatics Genomics Data Workshop

Date: April 12, 2017 9am-4pm

Location: Information Technology Center Room 105A/B, UH Manoa

Presenters: John Fonner (TACC), Sean Cleveland (UH), Joe Stubbs (TACC), Rion Dooley (TACC), Ron Merrill (UH), David Schanzenbach (UH), and Jennifer Geis (UH)


This FREE workshop is sponsored by the University of Hawai’i Information Technology Services Cyberinfrastructure group, the Texas Advanced Computing Center, and Hawai’i EPSCoR.

This workshop is designed for researchers working with genomics data and teaches basic concepts, skills, and tools. No prior computational experience is required.

The focus will be on data management and analysis for genomics research. We will cover metadata organization, data organization, connecting to and using advanced computing, the command line for sequence quality control and bioinformatics workflows, and R for data analysis and visualization. We will not teach any particular bioinformatics tool; instead, we will teach the foundational skills that allow you to conduct any analysis and interpret the output of a genomics pipeline. By the end of the workshop, learners should be able to manage and analyze data more effectively and apply the tools and approaches directly to their ongoing research.

Participants should bring their laptops and plan to participate actively. Laptops will need a terminal application for accessing compute resources: Linux and Mac machines have one already; for Windows machines we recommend the free Home Edition of MobaXterm. We will also be using Jupyter notebooks. At the moment Microsoft is offering free Azure-hosted Jupyter notebooks, so please create a Microsoft account prior to the workshop if you do not already have one.

The workshop is at capacity and is no longer accepting participants. Please be on the lookout for future opportunities.

Workshop structure

One dataset will be used throughout the workshop. We will start by introducing the dataset and the steps we’ll go through for analysis.

Dataset

In this workshop we’re using data from the Blount et al. 2012 paper from Dr. Richard Lenski’s Long-Term Evolution Experiment.

Workshop Timeline

Start Time Topic
9:00am Introduction
9:10am The Dataset
9:20am Metadata and Data Organization
9:30am Introduction to the command line
10:00am Break - coffee and light refreshments provided
10:15am Command line continued
11:00am Data Wrangling
Noon Lunch - on your own
1:00pm Introduction to R
2:15pm Break - coffee and light refreshments provided
2:30pm Agave/ToGo
4:00pm End

Etherpad

https://public.etherpad-mozilla.org/p/UH-bioworkshop

Workshop Materials

Metadata and Data Organization

Introduction to the command line

Data wrangling and processing

R for data analysis and visualization

Agave/ToGo