Normalized difference vegetation index (NDVI)
This dataset is a compilation of openly available, remotely sensed NDVI (normalized difference vegetation index) data for British Columbia (BC). NDVI is a measure of greenness calculated from spectrometric data in two specific bands (red and near-infrared) and is regularly used as a measure of ecosystem productivity.
Formally, NDVI is defined as the ratio $(\eta_{\rm NIR}-\eta_{\rm red})/ (\eta_{\rm NIR}+\eta_{\rm red})$, where $\eta_{\rm NIR}$ and $\eta_{\rm red}$ are the values of the reflectance in the near-infrared and in the red bands, respectively. NDVI always falls between $-1$ and $+1$ and can highlight the following features:
Feature | Reflectance characteristics | NDVI range |
---|---|---|
Dense forest canopy, e.g. in the Amazon | dark in red and bright in NIR | close to +1 |
Dense vegetation | dark in red and bright in NIR | 0.6 – 0.8 |
Sparse vegetation (shrub and grassland) | brighter in NIR | 0.2 – 0.3 |
Dry land with nothing growing | almost equal reflectance | 0 – 0.1 |
Snow, glaciers, clouds | high reflectance in red, somewhat lower NIR reflectance | -0.5 – 0 |
Open water | low reflectance in red, almost no NIR reflectance | close to -1 |
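As a quick illustration of the definition above, here is a minimal Python sketch that computes NDVI from a pair of reflectances. The function name `ndvi` and the sample reflectance values are made up for illustration and are not taken from the dataset.

```python
def ndvi(nir: float, red: float) -> float:
    """Return (nir - red) / (nir + red) for non-negative reflectances."""
    total = nir + red
    if total == 0:
        raise ValueError("NDVI is undefined when both reflectances are zero")
    return (nir - red) / total


# Illustrative (NIR, red) reflectance pairs; hypothetical values, not from the dataset
samples = {
    "dense vegetation": (0.50, 0.08),
    "dry bare land": (0.30, 0.27),
    "open water": (0.005, 0.08),
}

for label, (nir, red) in samples.items():
    print(f"{label}: NDVI = {ndvi(nir, red):+.2f}")
```

Because both reflectances are non-negative, the result stays between $-1$ and $+1$, as stated above.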
You can find more about the NDVI on Wikipedia and in 5 Things To Know About NDVI.
Contest dataset
What makes the current dataset unique is that – while the mean NDVI has been calculated since the 1970s – this dataset is one of the first attempts to map the variance in NDVI over space and time. Both the mean NDVI and its variance provided here were produced by a BC-scale hierarchical GAM (generalized additive model) over a multi-GB raw dataset so that – for a given location and time – the mean and the variance are informed by data close in time or space.
The Contest dataset contains 5,935,736 points and 53 time steps. The points are not connected, i.e. they do not form a smooth surface. The 53 time steps are uniformly spaced throughout 2022, from Jan-01 (first step) to Dec-31 (last step).
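If you need a calendar date for each time step (for example, to label animation frames), the uniform spacing can be worked out directly. The sketch below assumes zero-based step indices and exactly uniform spacing between Jan-01 and Dec-31; the variable names are illustrative.

```python
from datetime import date, timedelta

N_STEPS = 53
first, last = date(2022, 1, 1), date(2022, 12, 31)

# 364 days split into 52 equal intervals gives one step per week
spacing = (last - first) / (N_STEPS - 1)

# Assumed zero-based indexing: step 0 is Jan-01, step 52 is Dec-31
step_dates = [first + i * spacing for i in range(N_STEPS)]

for i in (0, 1, N_STEPS - 1):
    print(i, step_dates[i].isoformat())
```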
Data are provided in two formats: VTK and compressed CSV. Each format is self-contained – use one of them depending on which data description you like best (no need to use both).
- In the VTK files, each point is placed in 3D space using its Cartesian coordinates ($x$, $y$, $z$). On top of each point, we store three variables: the mean NDVI ($\mu$), its variance ($\sigma^2$), and elevation in km.
- In the compressed CSV format, each row corresponds to a data point with longitude, latitude, elevation in km, two horizontal coordinates ($x_{\rm alb}$, $y_{\rm alb}$) in the Albers equal-area conic projection, the mean NDVI ($\mu$), and its variance ($\sigma^2$).
Downloading the data
To start playing with this dataset, you can download only the first time step, but for a production-quality animation you will need all 53 time steps.
VTK format
File | Size | MD5 checksum |
---|---|---|
First time step | 138M | df82fda21a542d64255fde9d31856051 |
All 53 time steps (gzip-compressed file) | 7.1G | 91d3daa7cb75dd124981c80ee3cd74b3 |
Compressed CSV format
File | Size | MD5 checksum |
---|---|---|
First time step | 139M | 172176097e1c66cf415b8c22da6dbe94 |
All 53 time steps (gzip-compressed file) | 7.1G | c7b67777bad4d2b45345db753d4e0963 |
After you download the files, you can compare each file's MD5 checksum against the value listed above to confirm that the download succeeded.
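One way to run this check from Python is sketched below using the standard `hashlib` module. The helper names `md5sum` and `verify` are illustrative, and the file name in the usage comment is an assumption about what your downloaded file is called.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5sum(path) == expected.strip().lower()


# Usage (file name is an assumption; use the name of your downloaded file):
# verify("step000.csv.gz", "172176097e1c66cf415b8c22da6dbe94")
```

Reading in chunks keeps memory use small even for the 7.1G archives.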
Loading the data in ParaView
Each data format can be loaded easily into ParaView. When loading from CSV, you have to pass points through the Table To Points filter.
Please note that when you load points, you will normally not see them (they are infinitely small points!), but you can render data by:
- using the Point Gaussian representation, or
- using Glyphs, or
- triangulating or projecting data onto a mesh (uniform or not).
You can easily manipulate data inside ParaView with the Programmable Filter. For example, assuming you have read the data from the compressed CSV format, a new filter with Output Type = Same as Input and the following Python code inside the filter

```python
import numpy as np

# Point coordinates are assumed to hold (longitude, latitude, elevation in km),
# as set in the Table To Points filter
npoints = inputs[0].Points.shape[0]
lon = np.radians(inputs[0].Points[:,0])
lat = np.radians(inputs[0].Points[:,1])

points = vtk.vtkPoints()
radius = 6371  # mean Earth radius in km
for i in range(npoints):
    r = radius + inputs[0].Points[i,2]  # distance from the Earth's centre in km
    x = r * np.cos(lon[i]) * np.cos(lat[i])
    y = r * np.sin(lon[i]) * np.cos(lat[i])
    z = r * np.sin(lat[i])
    points.InsertNextPoint(x,y,z)

output.SetPoints(points)
# Carry the NDVI mean and variance over to the new points
output.PointData.append(inputs[0].PointData['mu'], 'mu')
output.PointData.append(inputs[0].PointData['sigma2'], 'sigma2')
```

will create a new set of points that are mapped into 3D space using their longitude, latitude, and elevation. To learn more about ParaView's Programmable Filter, watch our January 2021 webinar.
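To sanity-check the spherical-to-Cartesian mapping outside ParaView, the same formulas can be written as a small standalone function. This is only a sketch: the name `to_cartesian` is illustrative, and it assumes a spherical Earth with the same 6371 km radius used in the filter.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same mean radius as in the filter code


def to_cartesian(lon_deg, lat_deg, elevation_km=0.0):
    """Map longitude/latitude (degrees) and elevation (km) to x, y, z in km."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    r = EARTH_RADIUS_KM + elevation_km
    x = r * math.cos(lon) * math.cos(lat)
    y = r * math.sin(lon) * math.cos(lat)
    z = r * math.sin(lat)
    return x, y, z


# A point on the equator at the prime meridian lies on the +x axis
print(to_cartesian(0.0, 0.0))
```

The distance of each returned point from the origin equals the radius plus the elevation, which is a handy check when adapting the code.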
NDVI colour map
If you like, you can use the blue-to-brown-to-green NDVI colour map covering the values from $-1$ to $+1$.
Loading the data in Python
To read VTK files in Python, you can use the official VTK Python library, as well as a number of 3rd-party libraries, e.g. meshio.
The compressed CSV files can be read directly with Pandas:

```python
import pandas as pd

data = pd.read_csv('step000.csv.gz')
print(data.shape)
print(data.columns)
```

and then exported to NumPy or xarray.
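If you would rather not load a whole time step into memory, a compressed CSV file can also be streamed row by row with the standard library alone. The sketch below is one possible approach, not part of the contest tooling: the helper name `column_mean` is illustrative, and the column name `mu` in the usage comment is an assumption based on the array names used in the ParaView filter, so check the actual header first.

```python
import csv
import gzip


def column_mean(path, column):
    """Stream a gzip-compressed CSV and return the mean of one numeric column."""
    total, count = 0.0, 0
    with gzip.open(path, "rt", newline="") as f:
        for row in csv.DictReader(f):
            total += float(row[column])
            count += 1
    if count == 0:
        raise ValueError("the file contains no data rows")
    return total / count


# Usage (file and column names are assumptions; check the CSV header first):
# print(column_mean("step000.csv.gz", "mu"))
```

Note that any NaN entries in the column would make the result NaN, so filter them out if the file contains missing values.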
Reference
N. Pettorelli, S. Ryan, T. Mueller, N. Bunnefeld, B. Jędrzejewska, M. Lima, K. Kausrud (2011): The Normalized Difference Vegetation Index (NDVI): unforeseen successes in animal ecology. Climate Research 46, 15-27.
Acknowledgments
Data courtesy of Michael Noonan and Stefano Mezzini from the University of British Columbia at Okanagan.