Data Wrangling in Python: Introduction to the Pandas Library
Description
Prerequisites: Participants should be familiar with basic programming concepts, including variable assignment, data types, function calls, and installing packages or libraries. Introductory experience in Python or R will be especially helpful for this workshop.
This beginner-to-intermediate workshop from Johns Hopkins Data Services introduces the pandas library, a popular Python library for data cleaning, data wrangling, and data analysis. In this interactive class, participants will use Jupyter Notebook and Python code to import, understand, and prepare a dataset for further analysis or visualization. By the end of this workshop, participants will be able to:
- Identify and use the two primary data structures of the pandas library: Series and DataFrame
- Apply functions from the pandas library to explore and analyze a dataset, including 1) handling missing data, 2) filtering and sorting data, 3) grouping data, and 4) calculating basic summary statistics
- Find documentation for the pandas library to troubleshoot errors and apply new functions to analyze a dataset
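For a rough sense of what these objectives look like in practice, here is a minimal sketch. The toy data, column names, and variable names are hypothetical and are not the workshop's actual dataset or exercises; they only illustrate one common way to approach each step with pandas.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "species": ["cat", "dog", "cat", "dog", "bird"],
    "weight_kg": [4.0, 20.0, None, 30.0, 0.5],
})

# Selecting a single column from a DataFrame yields a Series.
weights = df["weight_kg"]

# 1) Handle missing data: drop rows where the weight is missing.
clean = df.dropna(subset=["weight_kg"])

# 2) Filter and sort: keep animals heavier than 1 kg, heaviest first.
heavy = clean[clean["weight_kg"] > 1].sort_values("weight_kg", ascending=False)

# 3) Group: mean weight for each species.
mean_by_species = clean.groupby("species")["weight_kg"].mean()

# 4) Basic summary statistics (count, mean, std, min, quartiles, max).
summary = clean["weight_kg"].describe()

print(heavy)
print(mean_by_species)
print(summary)
```

In a Jupyter notebook, running `help(pd.DataFrame.dropna)` or typing `pd.DataFrame.dropna?` in a cell displays the built-in documentation for a function, which is one way to start troubleshooting an error.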
Who can attend?
- Faculty
- Staff
- Students