Python Data Analysis with Pandas and Matplotlib Workshop
Date: April 17-18, 2018 9am-4:30pm
Location: Information Technology Center Room 105A/B, UH Manoa
Presenters: Mahdi Belcaid (HDSI), Sean Cleveland (UH), Ron Merrill (UH), David Schanzenbach (UH), and Jennifer Geis (UH)
This FREE workshop is sponsored by the Hawai’i Data Science Institute, the University of Hawai’i Information Technology Services Cyberinfrastructure group, and Hawai’i EPSCoR.
This workshop focuses specifically on the Python skills necessary for data analysis – as opposed to software development – and introduces some of the libraries that have made Python a popular alternative for working with data at any scale.
Takeaways:
By the end of this workshop, students will be able to:
- Work with the Pandas library to conduct essential data analysis tasks such as reading, exploring, filtering, and summarizing data.
- Slice, shape and pivot tables.
- Implement calculations on rows, columns, and tables.
- Use split-apply-combine to summarize data.
- Merge, concatenate and filter data from multiple sources.
- Visualize data using Matplotlib.
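As a rough preview of the kinds of tasks listed above, here is a minimal sketch in Pandas. The tables, column names, and values are invented for illustration and are not taken from the workshop materials.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
measurements = pd.DataFrame({
    "site": ["A", "A", "B", "B", "C"],
    "year": [2016, 2017, 2016, 2017, 2017],
    "rainfall_mm": [120.0, 95.0, 210.0, 180.0, 60.0],
})
sites = pd.DataFrame({
    "site": ["A", "B", "C"],
    "island": ["Oahu", "Maui", "Oahu"],
})

# Filter: keep only rows with at least 90 mm of rainfall.
wet = measurements[measurements["rainfall_mm"] >= 90]

# Split-apply-combine: split by site, apply the mean, combine into one Series.
mean_by_site = measurements.groupby("site")["rainfall_mm"].mean()

# Pivot: one row per site, one column per year (missing combinations become NaN).
by_year = measurements.pivot_table(index="site", columns="year", values="rainfall_mm")

# Merge: attach the island of each site from the second table.
merged = measurements.merge(sites, on="site", how="left")

# Summarize the merged data: total rainfall per island.
total_by_island = merged.groupby("island")["rainfall_mm"].sum()

print(mean_by_site)
print(by_year)
print(total_by_island)
```

With Matplotlib installed, a call such as `mean_by_site.plot(kind="bar")` would draw the per-site means as a bar chart.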
Participants should bring their laptops and plan to participate actively. Laptops will need a web browser for accessing the Jupyter notebook resources.
Required Software Installation
Installing the Python environment with Anaconda
Python is a popular language for research computing, and great for general-purpose programming as well. Installing all of its research packages individually can be a bit difficult, so we recommend Anaconda, an all-in-one installer. Regardless of how you choose to install it, make sure you install Python version 3.6. We will extensively use the Jupyter programming environment that runs in a web browser. For this to work you will need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, are not).
Installation Instructions for Windows
- Browse to http://continuum.io/downloads
- Download the Python installer for Windows
- Install Python 3.6 using all of the defaults for installation, except make sure to check “Make Anaconda the default Python”
Installation Instructions for Mac OS X
- Browse to http://continuum.io/downloads
- Download the Python 3.6 installer for OS X
- Install using all of the defaults for installation
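After installing, a quick check along the following lines can confirm that the Python version and the main workshop libraries are available. This is only a sketch: the helper name `check_environment` is hypothetical, and the package list is an assumption based on the libraries named in this announcement.

```python
import importlib
import sys


def check_environment(packages=("pandas", "matplotlib")):
    """Map each package name to its version string, or None if it cannot be imported."""
    report = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Some modules do not expose __version__; report "unknown" for those.
            report[name] = getattr(module, "__version__", "unknown")
    return report


# The workshop asks for Python 3.6, so flag anything older.
print("Python", sys.version.split()[0])
print("Python 3.6 or newer:", sys.version_info >= (3, 6))

for name, version in check_environment().items():
    print(name, version if version else "NOT INSTALLED")
```

The script can be run from a terminal or pasted into a Jupyter notebook cell. Any package reported as not installed suggests the Anaconda installation did not complete or a different Python is being used.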
Schedule
TUESDAY APRIL 17
- 9AM BEGIN WORKSHOP
- 10:30AM BREAK
- 10:45AM RESUME
- NOON LUNCH
- 1PM RESUME
- 2:30PM BREAK
- 2:45PM RESUME
- 4PM STOP FOR THE DAY
WEDNESDAY APRIL 18
- 9AM BEGIN WORKSHOP
- 10:30AM BREAK
- 10:45AM RESUME
- NOON LUNCH
- 1PM RESUME
- 2:30PM BREAK
- 2:45PM RESUME
- 4PM STOP FOR THE DAY
Workshop Materials
Tuesday
Preliminaries.ipynb https://bit.ly/2H5N9Xl
Introduction_to_Python.ipynb https://bit.ly/2vuWAP1
Intro_to_pandas.ipynb https://bit.ly/2J3Y3xp
Plotting_and_visualization.ipynb https://bit.ly/2EW4Erk
Exploring_data.ipynb https://bit.ly/2J3Y4RZ
Missing_values.ipynb https://bit.ly/2voIHl6
Data Files: ALL DATA FILES
ZIP OF FILES: Once unzipped, all the files are in the “data” folder.
Wednesday
Grouping Dataframes https://bit.ly/2Ha9lE3
Merging Joining Data https://bit.ly/2JXCgZN
Plotting with Seaborn https://bit.ly/2HyqjLC
Survey
Please fill out the demographic survey.