GeoPandas introduction

If you are a data analyst or data scientist working with Python, you are probably already familiar with pandas. If you are moving into spatial data analytics, you should check out GeoPandas. It extends pandas' capabilities by adding geometric types and spatial operations. In this post I will focus on two aspects of GeoPandas: visualization and simple spatial analytics.

Let's start by setting up the notebook output area so that we can see the entire map without vertical scrollbars.

The data used in this post is open source and comes from the envirosolutions web page. Like pandas, GeoPandas lets you easily read and write data, here in spatial formats such as Shapefile, GeoJSON and PostGIS. This is mostly handled by Fiona, which is used here much like Shapely (for geometry storage and spatial analytics) and matplotlib (for data visualization).

The first way to display geometry is with matplotlib.pyplot, which was already discussed in this post: Matplotlib: Lines, bars, pie charts and some other stuff
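A static plot can be sketched roughly as follows. The admin polygon and the road segment are placeholder geometries (assumptions, not the post's data), and the `Agg` backend is selected only so the snippet also runs outside a notebook; in Jupyter the figure is rendered inline.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical layers: one admin polygon and one road, both in WGS84.
admin = gpd.GeoDataFrame(
    {"name": ["A"]}, geometry=[box(20.5, 52.0, 21.5, 52.5)], crs="EPSG:4326"
)
roads = gpd.GeoDataFrame(
    {"road_id": [1]},
    geometry=[LineString([(20.9, 52.2), (21.1, 52.3)])],
    crs="EPSG:4326",
)

# Draw both layers on the same axes: admin outline first, roads on top.
fig, ax = plt.subplots(figsize=(8, 8))
admin.plot(ax=ax, facecolor="none", edgecolor="black")
roads.plot(ax=ax, color="red", linewidth=1.5)
ax.set_title("Main roads")
```

Passing the same `ax` to each `plot` call is what stacks the layers in one figure.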

Another option for displaying geometry is an interactive map based on folium/Leaflet.js. Here you can easily display your geometry on top of an underlying layer (e.g. a map or aerial imagery).

Now let's do something with our data. The first task is to calculate the total road length per administrative unit (admin) in Poland (stats are calculated only for the main roads). The original coordinate reference system of our data is WGS84, and since we want the stats in meters, let's project the data to UTM. GeoPandas can help here by estimating the applicable UTM zone for your data.
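A minimal sketch of the reprojection, assuming a single illustrative road segment near Warsaw: `estimate_utm_crs` picks a UTM zone from the layer's extent, and `to_crs` reprojects into it.

```python
import geopandas as gpd
from shapely.geometry import LineString

# Hypothetical road segment in WGS84 (lon, lat), located in central Poland.
roads = gpd.GeoDataFrame(
    {"road_id": [1]},
    geometry=[LineString([(20.9, 52.2), (21.1, 52.3)])],
    crs="EPSG:4326",
)

# Let GeoPandas estimate a suitable UTM zone from the data's bounds.
utm_crs = roads.estimate_utm_crs()

# Reproject; coordinates and lengths are now expressed in meters.
roads_utm = roads.to_crs(utm_crs)

print(utm_crs.to_epsg())  # around 21°E this is UTM zone 34N
print(roads_utm.length)
```

For a layer covering all of Poland, a single UTM zone is an approximation, since the country spans more than one zone.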

Below is a simple function that calculates our stats. Let's go through what happens there step by step:

  • spatial join between road layer and admin layer
  • calculating length in meters (and then we divide by 1000 to get kilometers)
  • group by admin and sum length
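The three steps above can be sketched as follows. The function name `road_km_per_admin`, the `name` column and the sample geometries are assumptions made for illustration, not the original code. Both layers are expected to be in the same projected CRS with meter units.

```python
import geopandas as gpd
from shapely.geometry import LineString, box


def road_km_per_admin(roads, admin, admin_col="name"):
    """Sum road length in kilometers per admin unit (hypothetical helper)."""
    # 1. Spatial join: attach the admin attribute to every intersecting road.
    joined = gpd.sjoin(
        roads, admin[[admin_col, "geometry"]], how="inner", predicate="intersects"
    )
    # 2. Length in meters (projected CRS), converted to kilometers.
    joined["length_km"] = joined.geometry.length / 1000
    # 3. Group by admin and sum the lengths.
    return joined.groupby(admin_col)["length_km"].sum()


# Hypothetical sample data in UTM zone 34N (EPSG:32634), units in meters.
admin = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[
        box(500000, 5700000, 510000, 5710000),
        box(510000, 5700000, 520000, 5710000),
    ],
    crs="EPSG:32634",
)
roads = gpd.GeoDataFrame(
    {"road_id": [1, 2, 3]},
    geometry=[
        LineString([(501000, 5701000), (504000, 5701000)]),  # 3 km, in A
        LineString([(502000, 5702000), (502000, 5706000)]),  # 4 km, in A
        LineString([(512000, 5703000), (515000, 5707000)]),  # 5 km, in B
    ],
    crs="EPSG:32634",
)

stats = road_km_per_admin(roads, admin)
print(stats)
```

Note that with a plain spatial join, a road crossing an admin boundary is counted at its full length in every unit it intersects. If that matters, the roads would need to be split at the boundaries first, for example with `gpd.overlay`.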

The second task is to create a new layer called bridge. We will find the intersections between rivers and roads, and at those points create a 20 m long bridge that overlays the road geometry. To achieve this, we create a function with the following logic:

  • Dissolve the road and river geometry to get just one multipart element per layer, then intersect those layers to get the intersection points
  • For the intersection points, create a GeoDataFrame with a 10 m buffer
  • Clip the roads geometry with the buffers to get bridge geometry overlying the roads
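A sketch of that logic is shown below. The function name `build_bridges`, the `half_length` parameter and the sample geometries are hypothetical. The sketch assumes both layers share a projected CRS in meters and that roads and rivers cross at points rather than running along each other.

```python
import geopandas as gpd
from shapely.geometry import LineString


def build_bridges(roads, rivers, half_length=10):
    """Cut a bridge segment out of the roads around each river crossing."""
    # 1. Dissolve each layer into a single (multi)geometry and intersect them.
    road_geom = roads.dissolve().geometry.iloc[0]
    river_geom = rivers.dissolve().geometry.iloc[0]
    crossings = road_geom.intersection(river_geom)

    # Split a possible MultiPoint into one row per crossing point.
    points = gpd.GeoDataFrame(geometry=[crossings], crs=roads.crs).explode(
        index_parts=False
    )

    # 2. Buffer each crossing; a 10 m radius gives a roughly 20 m long bridge.
    buffers = gpd.GeoDataFrame(
        geometry=points.geometry.buffer(half_length), crs=roads.crs
    )

    # 3. Clip the roads with the buffers so bridges follow the road geometry.
    return gpd.clip(roads, buffers)


# Hypothetical sample data in UTM zone 34N: one road crossing one river.
roads = gpd.GeoDataFrame(
    {"road_id": [1]},
    geometry=[LineString([(500000, 5700000), (500100, 5700000)])],
    crs="EPSG:32634",
)
rivers = gpd.GeoDataFrame(
    {"river_id": [1]},
    geometry=[LineString([(500050, 5699950), (500050, 5700050)])],
    crs="EPSG:32634",
)

bridges = build_bridges(roads, rivers)
print(bridges.length)
```

Because the buffer is a polygon approximating a circle, a bridge on a road that is not axis-aligned can come out marginally shorter than 20 m.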

The last step is to store our bridges GeoDataFrame as a shapefile.

That was just a small sample of GeoPandas capabilities. In a nutshell, instead of doing your spatial analytics in GIS tools, you can use this powerful library to embed your spatial work in a larger Python project and automate it.
You can download this code from my GitHub