Although it is much more convenient to use software dedicated for GIS, like ArcGIS or QGIS, for spatial data visualization, but ability to display spatial data within your code (especially if you are working with notebooks) might be very handy. Currently there are tens of geo-spatial python libraries, and here you can find a nice overview. I just took one, which seems to be relatively easy for simple map display use cases. My choice was Folium.
In most of the cases, functionality that I need is to display some vector data on the map or highlight some location with using marker, so let’s start from some simple task.
Titanic, this tragedy was investigated by almost each Data Scientist beginner (here you can find my Kaggle Titanic project). By now, it is clear, that the best chance to survive had kinds in the first class, but that is not a topic of this post. Here, we will display Titanic track before it sank.
In this example I will use vector data stored in GeoJSON format, which looks like this:
#GeoJSON data format
"""{ "type": "LineString",
"coordinates": [
[-1.422394023261522,50.915326603354451],[-1.2069179833123727,50.761806565381434],...
...,[-47.474265452339353,44.197223433400623],[-49.93827045144009,41.749277801296536]
]
}"""
import folium
from folium import plugins
import pandas as pd
%matplotlib inline
#create interactive Leaflet maps object
map = folium.Map(location=[48, -25], tiles='Mapbox Bright', zoom_start=4)
#add vector (in GeoJson format) data to the map
map.add_child(folium.GeoJson(open('./data/titanic_test.json',encoding = "utf-8-sig").read()))
#add markers
folium.Marker([50.892239,-1.3981697], popup='Southampton<br><i>10 April 1912</i>').add_to(map)
folium.Marker([49.646042,-1.618031], popup='Cherbuorg<br><i>10 April 1912</i>').add_to(map)
folium.Marker([51.853955,-8.2997997], popup='Cobh<br><i>11 April 1912</i>').add_to(map)
folium.Marker(location=[41.726931,-49.948253], popup='Titanic Site<br><i>15 April 1912</i>'\
,icon=folium.Icon(color='red')).add_to(map)
map
Like I said, in most of the cases, functionality presented above is what I need, but just to have better feeling what we can do with Folium…
Air pollution, this is quite popular topic (probably because I live in one of the most polluted cities in Europe). Let’s see if that is true. I have downloaded air quality annual statistics from European Environment Agency (EEA) home page and I will display mean PM10 Benzo(a)pyrene 2015 volume in 2015 measured in multiple European cities.
## Load csv "The annual aggregated air quality values" to pandas dataframe
air_q_data =pd.read_csv('./data/data_air_q.csv', sep=',', header=0)
air_q_data.head()
# Select only PM10 Benzo(a)pyrene 2015 value data for display
air_q_data_print =air_q_data[(air_q_data.ReportingYear==2015)&
(air_q_data.Pollutant=='Benzo(a)pyrene in PM10 (aerosol)')&\
(air_q_data.AggregationType=='Annual mean / 1 calendar year')]
# get station location, PM10 value and convert it to list
y=air_q_data_print['SamplingPoint_Latitude'].tolist()
x=air_q_data_print['SamplingPoint_Longitude'].tolist()
v=air_q_data_print['AQValue'].tolist()
t =list(zip(y,x,v))
#create interactive Leaflet maps object
map = folium.Map(location=[48, 8], tiles='Stamen Terrain', zoom_start=4)
#add HeatMap to map object
map.add_child(plugins.HeatMap(t, radius = 12))
Hmm, indeed it doesn’t look very optimistic. This large yellow area in Poland, this is where I live.
Python code and data set for this post might be downloaded from my github.