Neurospatio: Harnessing Neural Networks for 2D non-linear interpolation based on IoT-sensor data
Introduction
Popular geostatistical methods predict the values of a spatial property of a surface based on spatial dependence. Kriging, the best linear unbiased predictor, relies on covariance functions and Gaussian processes and provides adequate results in many cases. However, for non-linear dependencies, including non-Gaussian data, Kriging is not always optimal. Moreover, it can be difficult to keep 2D interpolation maps of dependent spatial properties consistent with each other, because Kriging prediction does not take auxiliary data sources into account.
In this article, I introduce Neurospatio, a Python library created for complex 2D interpolation scenarios that demonstrates promising results in non-linear cases.
The Neural Network Architecture
The heart of Neurospatio is an artificial neural network with 4 fully connected layers (300, 150, 75, and 1 neurons from input to output) and the ReLU activation function. Dropout layers are located between the 1st and 2nd layers and between the 2nd and 3rd layers. Batch normalization is applied to the inputs of the 2nd and 4th layers. In addition to dropout, L1 and L2 regularization is applied to the 3rd layer during training to prevent overfitting.
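To make the layer layout more concrete, here is a minimal NumPy sketch of an inference-time forward pass through a network with the layer sizes described above. It is not Neurospatio's actual code. The two-feature (X, Y) input, the random weights, the linear output layer, and the helper names (init_params, batch_norm, forward) are assumptions made for illustration. Dropout and the L1/L2 penalty only act during training, so they are left out, and batch normalization is simplified to use the statistics of the current batch.

```python
import numpy as np

LAYER_SIZES = [300, 150, 75, 1]  # neurons per layer, as described in the text


def init_params(n_inputs, rng):
    """Random weights and zero biases for each fully connected layer."""
    params = []
    fan_in = n_inputs
    for fan_out in LAYER_SIZES:
        w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        params.append((w, np.zeros(fan_out)))
        fan_in = fan_out
    return params


def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch (no learned scale or shift)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)


def forward(x, params):
    """Inference-time pass; dropout is inactive here, so it is omitted."""
    (w1, b1), (w2, b2), (w3, b3), (w4, b4) = params
    h = np.maximum(x @ w1 + b1, 0.0)  # layer 1 + ReLU
    h = batch_norm(h)                 # batch norm on the input of layer 2
    h = np.maximum(h @ w2 + b2, 0.0)  # layer 2 + ReLU
    h = np.maximum(h @ w3 + b3, 0.0)  # layer 3 + ReLU (regularized in training)
    h = batch_norm(h)                 # batch norm on the input of layer 4
    return h @ w4 + b4                # layer 4: assumed linear output


rng = np.random.default_rng(0)
params = init_params(n_inputs=2, rng=rng)  # assumption: raw (X, Y) as input
grid = np.array([[i, j] for i in range(10) for j in range(10)], dtype=float)
predictions = forward(grid, params)
print(predictions.shape)  # (100, 1)
```

With untrained random weights the numbers themselves are meaningless; the sketch only shows how a batch of grid points flows through the four layers to one predicted value per point.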
Example
In the following example, we create a 2D array of points and a 1D array of values of the same length as a small sample data set. Then we initialize the SpLearner class, passing the points and values along with the type of interpolation (SpreadOp.CENTROIDS). Finally, we generate a new grid of (X, Y) points and call the execute method for prediction, as shown below.
import numpy as np
from neurospatio.learner2D import SpLearner, SpreadOp


def example1():
    # training data set: 2D points with their values
    points = np.array([
        [2, 8],  # X, Y coordinates
        [8, 10],
        [10, 2]
    ])
    values = np.array([19.39, 17.18, 20.95])
    # create SpLearner with radial interpolation
    learner = SpLearner(points, values, spread_op_flag=SpreadOp.CENTROIDS, n_epochs=600)
    # generate a grid of points to predict on
    grid = np.array([[i, j] for i in range(10) for j in range(10)])
    # [
    #   [0, 0], ...
    #   [9, 9]
    # ]
    predicted_values = learner.execute(grid_points=grid)
    # Example of prediction (values may differ from run to run)
    # [
    #   [17.96],
    #   ...
    #   [16.27]
    # ]
    return predicted_values
Experiments
To demonstrate 2D prediction, we picked the Cook Agronomy Farm data set (Figure 2), gathered by ECH2O-TE and 5TE sensors since 2007. In addition to soil water content readings, the data collection includes soil temperature readings (on an hourly/daily basis), annual crop histories, a digital elevation model, Bt horizon maps, seasonal apparent electrical conductivity, soil texture, and soil bulk density (here you may find the full description). In our work, without loss of generality, we focus on volumetric water content and temperature readings at the 0.3 m level, which are somewhat correlated. For simplicity, we took a snapshot of the daily data as of September 1st, 2013.
First of all, we built a 2D prediction of the water volume at each point of the area, using radial spreading across the whole field, as shown in Figure 3.
According to the sensors, water volume and temperature are only weakly correlated (~0.2). However, in this practical case, this level of dependency is enough to build a temperature map complementary to the spatial distribution of water volume (Figure 4).
Further, we made another set of prediction maps that take into account a specific direction in which the values spread. For example, Figure 5 demonstrates a strong shift in the south-eastern direction.
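For readers who want to check such a dependency on their own readings, the Pearson correlation coefficient can be computed in a few lines. The sketch below is only an illustration: the water and temperature arrays are synthetic placeholders, not the Cook Agronomy Farm data, and the helper name pearson is an assumption.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient of two equally long 1D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean()  # center both series
    b_c = b - b.mean()
    return float((a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c)))


# Synthetic placeholder readings (NOT the real sensor data)
rng = np.random.default_rng(42)
water = rng.uniform(0.15, 0.35, size=40)
temperature = 18.0 + 2.0 * water + rng.normal(0.0, 0.6, size=40)

r = pearson(water, temperature)
print(round(r, 2))
```

The result matches np.corrcoef(water, temperature)[0, 1]; the explicit version is shown only to make the formula visible.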
learner = SpLearner(points, values * 100, spread_op_flag = SpreadOp.HORIZONTAL_FORWARD | SpreadOp.VERTICAL_FORWARD, n_epochs=10000)
The SpreadOp flags HORIZONTAL_FORWARD, HORIZONTAL_BACKWARD, VERTICAL_FORWARD, and VERTICAL_BACKWARD can be combined to define the orientation of the interpolation.
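The way such flags combine with the | operator can be illustrated with Python's standard enum.Flag. The DemoSpreadOp class below is a hypothetical stand-in written for this sketch; how Neurospatio actually defines SpreadOp is not shown in this article and may differ.

```python
from enum import Flag, auto


class DemoSpreadOp(Flag):
    """Hypothetical stand-in; the real SpreadOp may be defined differently."""
    CENTROIDS = auto()
    HORIZONTAL_FORWARD = auto()
    HORIZONTAL_BACKWARD = auto()
    VERTICAL_FORWARD = auto()
    VERTICAL_BACKWARD = auto()


# Combine two directions into a single flag value
combined = DemoSpreadOp.HORIZONTAL_FORWARD | DemoSpreadOp.VERTICAL_FORWARD

# Membership tests show which directions are enabled
has_h_forward = DemoSpreadOp.HORIZONTAL_FORWARD in combined
has_v_backward = DemoSpreadOp.VERTICAL_BACKWARD in combined
print(has_h_forward, has_v_backward)  # True False
```

Each member occupies its own bit, so any subset of directions fits into one value that can be passed as a single argument.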
Benchmark testing
We picked the following non-linear, non-Gaussian spatial function to compare the ANN approach with Kriging:
First, data sets of 100 random (X, Y) points were generated using a uniform spatial distribution. Next, the data was split into train and test parts with a 0.33 ratio.
Numerous experiments showed that Neurospatio results are superior to conventional Kriging predictions with various configurations, as presented in Table 1.
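The sampling and splitting step can be sketched with NumPy alone. This is an illustration under assumptions, not the original benchmark script: the [0, 10) extent of the field, the fixed seed, and the reading of 0.33 as the test share are all assumed. The target values are not computed here, since they would come from the benchmark function.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# 100 (X, Y) points drawn uniformly; the [0, 10) extent is an assumption
points = rng.uniform(0.0, 10.0, size=(100, 2))

# Shuffle the indices, then hold out a 0.33 share for testing
test_ratio = 0.33  # assumed to be the test fraction
indices = rng.permutation(len(points))
n_test = int(round(test_ratio * len(points)))
test_idx, train_idx = indices[:n_test], indices[n_test:]

train_points, test_points = points[train_idx], points[test_idx]
print(train_points.shape, test_points.shape)  # (67, 2) (33, 2)
```

Shuffling before slicing keeps both parts uniformly spread over the field, so the test points are not clustered in one corner.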
Further development of the Neurospatio library includes the following features:
- Integration with regional anisotropic vector maps,
- Support of masks,
- Use of the KNN approach for even more precise results.
You may find the Neurospatio code on GitHub.