Big data exploration and object-oriented programming with Python

  • 26 February - 01 March 2024
  • Wageningen
  • Methodology
  • 1.2 ECTS

Python is a dynamic, readable language and a popular platform for many kinds of numerical work, from simple one-off scripts to large, complex software projects. This workshop is aimed at people who already have a basic knowledge of Python and are interested in using the language to explore and visualize large datasets and to write more complex programs using object-oriented programming techniques.

The workshop will use examples and exercises drawn from various aspects of environmental and climate sciences, along with a variety of different datasets. We will focus on using the pandas and seaborn packages for data manipulation and visualization, and on using parts of the standard library to write custom classes and integrate them with the rest of the language.

The course starts with the core concepts of object-oriented programming: classes, instances, methods versus functions, constructors, and magic methods. We will then introduce some more advanced ideas, looking at the concepts behind them, where they are useful, and the details of how they work in Python. These include inheritance and class hierarchies, method overriding, superclasses and subclasses, polymorphism, composition, and multiple inheritance.

Next, we look at the main difference between working with core Python objects and working with pandas. We will then turn our attention from data analysis to data visualization. We'll start with an overview of the seaborn package, then dive straight into the core chart types for looking at distributions and relationships. We will also survey very common chart types like strip plots, box plots and bar plots, along with less common types like swarm, violin and boxen plots. We will finish the learning part of the course by looking at real-life datasets and seeing what tools pandas gives us to overcome possible difficulties. We will also look at some best practices for making sure that code runs quickly, and some options for what to do when code is too slow.
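As a rough illustration of the object-oriented ideas named in the course description, the following minimal sketch shows a class with a constructor, a method, two magic methods, a subclass that overrides a method, and polymorphism. It is not taken from the course material: the class names `Station` and `CoastalStation`, the attribute `tide_m`, and all of the readings are hypothetical and chosen only to fit the environmental theme.

```python
class Station:
    """A hypothetical weather station holding temperature readings (deg C)."""

    def __init__(self, name, readings):
        # Constructor: runs once when an instance is created.
        self.name = name
        self.readings = list(readings)

    def mean(self):
        # A method: a function that operates on one instance's data.
        return sum(self.readings) / len(self.readings)

    def __len__(self):
        # Magic method: makes the built-in len(station) work.
        return len(self.readings)

    def __repr__(self):
        # Magic method: controls how an instance is displayed.
        return f"{type(self).__name__}({self.name!r}, n={len(self)})"

    def describe(self):
        return f"{self.name}: mean {self.mean():.1f} degC"


class CoastalStation(Station):
    """Subclass (illustrative only) that adds a tide level in metres."""

    def __init__(self, name, readings, tide_m):
        # Reuse the superclass constructor, then add the extra attribute.
        super().__init__(name, readings)
        self.tide_m = tide_m

    def describe(self):
        # Method overriding: extend the superclass behaviour.
        return f"{super().describe()}, tide {self.tide_m:.2f} m"


# Made-up example data.
stations = [
    Station("Inland", [11.0, 13.0, 15.0]),
    CoastalStation("Coast", [9.0, 10.0], 0.45),
]

# Polymorphism: the same call dispatches to the right describe() per class.
summaries = [s.describe() for s in stations]
for line in summaries:
    print(line)
```

The loop at the end calls `describe()` without checking which class each object belongs to; that is the practical benefit of overriding a method in a subclass instead of branching on type.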

  • Wageningen Institute for Environment and Climate Research (WIMEK)
  • 10-15 persons