Demonstration of LSTM Networks for Identification of Nonlinear Dynamic Systems with Python

Tom Simpson, Chair of Structural Mechanics and Monitoring, ETH Zurich.

This tutorial is derived from work carried out in the following publication: Machine Learning Approach to Model Order Reduction of Nonlinear Systems using Autoencoder and LSTM Networks

Table of Contents

LSTM Networks


LSTM neural networks are a powerful tool for modelling nonlinear sequence data. They are a form of recurrent neural networks specifically designed to better deal with long term effects. Being a kind of RNN, LSTMs work by making repeated predictions using the same network and weights but by passing the activation values of the network after prediction at $t=i$ as additional inputs for the prediction at $t=i+1$. In a traditional system ID sense this is conceptually similar to an AR model passing previous output values. In an LSTM for example, the "endogenous" input which we pass between steps is the memory state $h_i$. This memory state is of a dimension which can be chosen as a hyper-parameter, with a larger dimension giving more expressive power, similarly to a larger Auto-regressive value. It is noteworthy however, that the way in which this internal state is updated is also learned during training, as such, the LSTM can learn dependencies from arbitrarily long lengths within the data. The LSTM cell uses a series of "Gate functions", these control how the memory state of the model develops over time. The memory state is then passed as output from the LSTM at which point it is typical to stack a dense layer on this for the final output prediction. This allows the hidden state to be of greater dimension than the predicted output. A nice and more thorough description of LSTMs is given in : https://colah.github.io/posts/2015-08-Understanding-LSTMs/.

LSTMcell.png

To demonstrate how to use an LSTM network for such a system ID problem, we take a dataset from a series of experiments carried out at Los Alamos national lab. The experiment involves a 3 story shear frame structure excited by ground motion applied by a shaker, nonlinearity is introduced to the system with a bumper between the 2nd and 3rd storeys. For each experimental setup we have a known input acceleration time history applied to the base and uni-axial accelerations are measured at each floor of the structure.

losalamos.jpg

For the given experimental setup, the dataset contains 50 repeated runs with different random time series used for the ground motion. The system ID problem consists of using the 1 dimensional forcing input to predict the 3 dimensional acceleration time series response at each of the floors. As such we use a certain number of the 50 repeats for training and test the performance of the neural network on predicting response for some unseen repeats.

Import all the modules and functions used

  • Tensorflow for modelling (version 2.6.0)
  • Matplotlib imported for plotting
  • Numpy used for importing and processing data
  • glob used for linux type file name searching
  • time used for timing simulations
In [3]:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras import optimizers
import matplotlib.pyplot as plt
from math import floor
import numpy as np
from glob import glob
import time
print(tf.version.VERSION)
2.6.0

Load and Normalise Data

We have separate files containing base excitations and response accelerations for each of the 50 experiment repeats. We first find a list of file paths for the input and response files. We then load each of these files as numpy arrays into lists in which each list element is one such array. Additionally, we pad the front of each array with some zeros,

We take the first 25 experiment repeats for training, note we keep the training runs as separate arrays for now to aid in processing the data to train the LSTM. We then find max and min values in the training datasets and then apply a -1 to 1 normalisation on both the forcing and acceleration datasets. Normalisation can have a very big effect on the efficiency of training with an LSTM.

In [93]:
Forcefiles=glob('**/LANL_state_12_base_accel_M_*.csv',recursive=True)
Forcefiles.sort()
Encodedfiles = filelist = glob('**/LANL_state_12_original_accelerations_M_*.csv',recursive=True)
Encodedfiles.sort()

FF = []
for file in Forcefiles:
    ff = np.loadtxt(open(file,"rb"), delimiter=",").reshape((-1,1))
    ff = np.concatenate((np.zeros((120,1)),ff),axis=0)
    FF.append(ff)

Ftrain = FF[:25] # First 5 taken for training

nT = FF[0].shape[0] # Number of sample points in each experimental run

Fn = np.concatenate((Ftrain),axis=0) #Fn create only to find normalising max and min values
Fmax = Fn.max(axis=0,keepdims=True)
Fmin = Fn.min(axis=0,keepdims=False)


YY = []
for file in Encodedfiles:
    yy = np.loadtxt(open(file,"rb"), delimiter=",")
    yy = np.concatenate((np.zeros((120,3)),yy),axis=0)
    YY.append(yy)

Ytrain = YY[:25]

Yn = np.concatenate((Ytrain),axis=0)
Ymax = Yn.max(axis=0,keepdims=True)
Ymin = Yn.min(axis=0,keepdims=False)

FFN = Ftrain
for i in range(len(FFN)):
  FFN[i] = 2*(FFN[i]-Fmin)/(Fmax-Fmin)-1

YYN = Ytrain
for i in range(len(YYN)):
  YYN[i] = 2*(YYN[i]-Ymin)/(Ymax-Ymin)-1

Create Training Dataset

Creating a training dataset is quite involved for LSTM networks. Recurrent neural networks are trained using the back-propagation through time algorithm (BPTT), whilst we can theoretically propagate errors through infinite time steps, in practice this doesn't work well even for LSTM networks. As such, we have to use truncated BPTT which for each training example only makes forward and backward runs for a limited number of time steps. As such, we must create a dataset of training input samples in which each sample is a matrix of shape $(look\_back, n\_features)$ in which look_back is the length of each sequence used in the TBPTT training algorithm, this also is the limit of the longest relationship which can be learned by the network. The second dimension n_features corresponds to the dimension of the input vector at each time point, in this case, 4, one value of the forcing input at this time point and 3 acceleration values of the output from the previous time step. The full input dataset for the training is as such a 3D tensor of shape $(n\_samples, look\_back, n\_features)$. The output data on the other hand simply takes the conventional shape $(n\_samples, f\_features)$ with feature dimension being 3 of the 3 predicted accelerations.

Also it is noteworthy that in this case we explicitly add the acceleration response at time $t=i$ as an exogenous input for the prediction at $t=i+1$, in other words we include an autoregression of order 1. This is not the usual way to setup an RNN, normally the internal state is the only thing propagated between time steps, it was found however, that explicitly adding the previous output value leads to more favourable training.

In [94]:
DDN = []
for i in range(len(YYN)):
  DDN.append(np.concatenate((FFN[i],YYN[i]),axis=1))
look_back = 120

def create_dataset(dataset, look_back=look_back):
    X, Y = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), :]
        X.append(a)
        Y.append(dataset[i + look_back, 1:])
    return np.array(X), np.array(Y)

XY=[]
for i in range(len(DDN)):
    XY.append(create_dataset(DDN[i]))

xtrain, ytrain = zip(*XY)
xtrain = list(xtrain)
ytrain = list(ytrain)
x_t = np.concatenate(xtrain,axis=0)
y_t = np.concatenate(ytrain,axis=0)
x_train = x_t[:floor(0.8*x_t.shape[0]),:]
y_train = y_t[:floor(0.8*x_t.shape[0]),:]
x_val = x_t[floor(0.8*x_t.shape[0]):,:]
y_val = y_t[floor(0.8*x_t.shape[0]):,:]

Create and Train Network

Firstly we define some training settings for the network. The callbacks: early stopping and checkpoint combined, are used to stop the training early once the validation loss is no longer decreasing and to save the weights of the training epoch which achieved the lowest validation error. We also defined an Adam optimiser which uses a learning rate which decreases during training.

The LSTM model is then defined, we define a model which consists of a single LSTM layer with a 15 dimensional hidden state, stacked on this is a fully-connected layer of dimension 3. Note the hidden layer dimension is a hyperparameter to be chosen where as the dense layer dimension is dictated by the problem, in this case 3 dimensions for the 3 predicted accelerations. The model is fit for 100 epochs with possible early stopping and the best weighs reloaded after training.

In [99]:
earlystopping = EarlyStopping(monitor='val_loss', min_delta=1e-9, patience = 20)
checkpoint = ModelCheckpoint('.lstm_wts.hdf5', verbose=1, monitor='val_loss',save_best_only=True, mode='auto')
callbacksList = [earlystopping, checkpoint]
lr = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=5e-3,
    decay_steps=1000,
    decay_rate=0.96)
Adopt= optimizers.Adam(learning_rate=lr, beta_1=0.9, beta_2=0.999, clipnorm=1)

lstm = Sequential()
lstm.add(LSTM(15,stateful=False, input_shape=(x_train.shape[1],x_train.shape[2])))
lstm.add(Dense(3,activation='linear'))
lstm.compile(optimizer=Adopt, loss='mean_squared_error')
lstm.load_weights(filepath='.lstm_wts.hdf5')
lstm.fit(x_train,y_train, epochs=100, batch_size=1024,verbose=1,validation_data=(x_val,y_val), callbacks=callbacksList)
lstm.load_weights(filepath = '.lstm_wts.hdf5')
Epoch 1/100
       160/160 [==============================] - 80s 489ms/step - loss: 6.1714e-05 - val_loss: 4.8676e-05

       Epoch 00001: val_loss improved from inf to 0.00005, saving model to .lstm_wts.hdf5
       Epoch 2/100
       160/160 [==============================] - 86s 534ms/step - loss: 4.6682e-05 - val_loss: 4.6414e-05

       Epoch 00002: val_loss improved from 0.00005 to 0.00005, saving model to .lstm_wts.hdf5
       Epoch 3/100
       160/160 [==============================] - 72s 454ms/step - loss: 4.7035e-05 - val_loss: 4.5530e-05

       Epoch 00003: val_loss improved from 0.00005 to 0.00005, saving model to .lstm_wts.hdf5
       Epoch 4/100
       160/160 [==============================] - 63s 395ms/step - loss: 4.6229e-05 - val_loss: 4.6543e-05

       Epoch 00004: val_loss did not improve from 0.00005
       Epoch 5/100
       160/160 [==============================] - 62s 387ms/step - loss: 4.7023e-05 - val_loss: 4.6413e-05

       Epoch 00005: val_loss did not improve from 0.00005
       Epoch 6/100
       160/160 [==============================] - 64s 397ms/step - loss: 4.6342e-05 - val_loss: 4.7555e-05

       Epoch 00006: val_loss did not improve from 0.00005
       Epoch 7/100
       160/160 [==============================] - 79s 494ms/step - loss: 4.6276e-05 - val_loss: 4.8812e-05

       Epoch 00007: val_loss did not improve from 0.00005
       Epoch 8/100
       160/160 [==============================] - 69s 433ms/step - loss: 4.5982e-05 - val_loss: 4.5415e-05

       Epoch 00008: val_loss improved from 0.00005 to 0.00005, saving model to .lstm_wts.hdf5
       Epoch 9/100
       160/160 [==============================] - 51s 320ms/step - loss: 4.5419e-05 - val_loss: 4.6047e-05

       Epoch 00009: val_loss did not improve from 0.00005
       Epoch 10/100
       160/160 [==============================] - 70s 439ms/step - loss: 4.5746e-05 - val_loss: 4.6633e-05

       Epoch 00010: val_loss did not improve from 0.00005
       Epoch 11/100
       160/160 [==============================] - 68s 421ms/step - loss: 4.6286e-05 - val_loss: 4.6084e-05

       Epoch 00011: val_loss did not improve from 0.00005
       Epoch 12/100
       160/160 [==============================] - 52s 326ms/step - loss: 4.5707e-05 - val_loss: 4.7994e-05

       Epoch 00012: val_loss did not improve from 0.00005
       Epoch 13/100
       160/160 [==============================] - 60s 373ms/step - loss: 4.5136e-05 - val_loss: 4.5411e-05

       Epoch 00013: val_loss improved from 0.00005 to 0.00005, saving model to .lstm_wts.hdf5
       Epoch 14/100
       160/160 [==============================] - 61s 381ms/step - loss: 4.5826e-05 - val_loss: 4.6069e-05

       Epoch 00014: val_loss did not improve from 0.00005
       Epoch 15/100
       160/160 [==============================] - 88s 549ms/step - loss: 4.5117e-05 - val_loss: 4.6637e-05

       Epoch 00015: val_loss did not improve from 0.00005
       Epoch 16/100
       160/160 [==============================] - 74s 463ms/step - loss: 4.4799e-05 - val_loss: 4.7166e-05

       Epoch 00016: val_loss did not improve from 0.00005
       Epoch 17/100
       160/160 [==============================] - 82s 512ms/step - loss: 4.5527e-05 - val_loss: 4.7788e-05

       Epoch 00017: val_loss did not improve from 0.00005
       Epoch 18/100
       160/160 [==============================] - 70s 435ms/step - loss: 4.5604e-05 - val_loss: 4.6398e-05

       Epoch 00018: val_loss did not improve from 0.00005
       Epoch 19/100
       160/160 [==============================] - 77s 475ms/step - loss: 4.5282e-05 - val_loss: 4.5696e-05

       Epoch 00019: val_loss did not improve from 0.00005
       Epoch 20/100
       160/160 [==============================] - 79s 494ms/step - loss: 4.4904e-05 - val_loss: 4.5071e-05

       Epoch 00020: val_loss improved from 0.00005 to 0.00005, saving model to .lstm_wts.hdf5
       Epoch 21/100
       160/160 [==============================] - 67s 418ms/step - loss: 4.4890e-05 - val_loss: 4.6524e-05

       Epoch 00021: val_loss did not improve from 0.00005
       Epoch 22/100
       160/160 [==============================] - 69s 427ms/step - loss: 4.4716e-05 - val_loss: 4.6540e-05

       Epoch 00022: val_loss did not improve from 0.00005
       Epoch 23/100
       160/160 [==============================] - 63s 396ms/step - loss: 4.4419e-05 - val_loss: 4.6141e-05

       Epoch 00023: val_loss did not improve from 0.00005
       Epoch 24/100
       160/160 [==============================] - 67s 418ms/step - loss: 4.4573e-05 - val_loss: 4.6624e-05

       Epoch 00024: val_loss did not improve from 0.00005
       Epoch 25/100
       160/160 [==============================] - 58s 361ms/step - loss: 4.4414e-05 - val_loss: 4.5793e-05

       Epoch 00025: val_loss did not improve from 0.00005
       Epoch 26/100
       160/160 [==============================] - 64s 401ms/step - loss: 4.4368e-05 - val_loss: 4.8849e-05

       Epoch 00026: val_loss did not improve from 0.00005
       Epoch 27/100
       160/160 [==============================] - 65s 405ms/step - loss: 4.4701e-05 - val_loss: 4.5641e-05

       Epoch 00027: val_loss did not improve from 0.00005
       Epoch 28/100
       160/160 [==============================] - 81s 503ms/step - loss: 4.4763e-05 - val_loss: 4.7370e-05

       Epoch 00028: val_loss did not improve from 0.00005
       Epoch 29/100
       160/160 [==============================] - 78s 491ms/step - loss: 4.4478e-05 - val_loss: 4.5468e-05

       Epoch 00029: val_loss did not improve from 0.00005
       Epoch 30/100
       160/160 [==============================] - 77s 482ms/step - loss: 4.4031e-05 - val_loss: 4.4986e-05

       Epoch 00030: val_loss improved from 0.00005 to 0.00004, saving model to .lstm_wts.hdf5
       Epoch 31/100
       160/160 [==============================] - 73s 454ms/step - loss: 4.4480e-05 - val_loss: 4.6913e-05

       Epoch 00031: val_loss did not improve from 0.00004
       Epoch 32/100
       160/160 [==============================] - 86s 536ms/step - loss: 4.4068e-05 - val_loss: 4.6027e-05

       Epoch 00032: val_loss did not improve from 0.00004
       Epoch 33/100
       160/160 [==============================] - 79s 489ms/step - loss: 4.4420e-05 - val_loss: 4.5140e-05

       Epoch 00033: val_loss did not improve from 0.00004
       Epoch 34/100
       160/160 [==============================] - 87s 546ms/step - loss: 4.4077e-05 - val_loss: 4.8810e-05

       Epoch 00034: val_loss did not improve from 0.00004
       Epoch 35/100
       160/160 [==============================] - 71s 446ms/step - loss: 4.4012e-05 - val_loss: 4.6567e-05

       Epoch 00035: val_loss did not improve from 0.00004
       Epoch 36/100
       160/160 [==============================] - 66s 413ms/step - loss: 4.4425e-05 - val_loss: 4.6512e-05

       Epoch 00036: val_loss did not improve from 0.00004
       Epoch 37/100
       160/160 [==============================] - 86s 533ms/step - loss: 4.4189e-05 - val_loss: 4.4450e-05

       Epoch 00037: val_loss improved from 0.00004 to 0.00004, saving model to .lstm_wts.hdf5
       Epoch 38/100
       160/160 [==============================] - 72s 448ms/step - loss: 4.4213e-05 - val_loss: 4.5718e-05

       Epoch 00038: val_loss did not improve from 0.00004
       Epoch 39/100
       160/160 [==============================] - 46s 291ms/step - loss: 4.3892e-05 - val_loss: 4.3782e-05

       Epoch 00039: val_loss improved from 0.00004 to 0.00004, saving model to .lstm_wts.hdf5
       Epoch 40/100
       160/160 [==============================] - 48s 301ms/step - loss: 4.3690e-05 - val_loss: 4.4632e-05

       Epoch 00040: val_loss did not improve from 0.00004
       Epoch 41/100
       160/160 [==============================] - 43s 267ms/step - loss: 4.4024e-05 - val_loss: 4.5999e-05

       Epoch 00041: val_loss did not improve from 0.00004
       Epoch 42/100
       160/160 [==============================] - 49s 307ms/step - loss: 4.3783e-05 - val_loss: 4.5963e-05

       Epoch 00042: val_loss did not improve from 0.00004
       Epoch 43/100
       160/160 [==============================] - 47s 290ms/step - loss: 4.3663e-05 - val_loss: 4.5433e-05

       Epoch 00043: val_loss did not improve from 0.00004
       Epoch 44/100
       160/160 [==============================] - 52s 321ms/step - loss: 4.3677e-05 - val_loss: 4.4587e-05

       Epoch 00044: val_loss did not improve from 0.00004
       Epoch 45/100
       160/160 [==============================] - 55s 341ms/step - loss: 4.3649e-05 - val_loss: 4.4794e-05

       Epoch 00045: val_loss did not improve from 0.00004
       Epoch 46/100
       160/160 [==============================] - 43s 268ms/step - loss: 4.3486e-05 - val_loss: 4.5249e-05

       Epoch 00046: val_loss did not improve from 0.00004
       Epoch 47/100
       160/160 [==============================] - 50s 312ms/step - loss: 4.4074e-05 - val_loss: 4.7264e-05

       Epoch 00047: val_loss did not improve from 0.00004
       Epoch 48/100
       160/160 [==============================] - 58s 364ms/step - loss: 4.3849e-05 - val_loss: 4.5369e-05

       Epoch 00048: val_loss did not improve from 0.00004
       Epoch 49/100
       160/160 [==============================] - 54s 339ms/step - loss: 4.3597e-05 - val_loss: 4.5770e-05

       Epoch 00049: val_loss did not improve from 0.00004
       Epoch 50/100
       160/160 [==============================] - 51s 320ms/step - loss: 4.3370e-05 - val_loss: 4.6590e-05

       Epoch 00050: val_loss did not improve from 0.00004
       Epoch 51/100
       160/160 [==============================] - 65s 408ms/step - loss: 4.3286e-05 - val_loss: 4.6297e-05

       Epoch 00051: val_loss did not improve from 0.00004
       Epoch 52/100
       160/160 [==============================] - 43s 265ms/step - loss: 4.3536e-05 - val_loss: 4.5532e-05

       Epoch 00052: val_loss did not improve from 0.00004
       Epoch 53/100
       160/160 [==============================] - 51s 321ms/step - loss: 4.2989e-05 - val_loss: 4.4061e-05

       Epoch 00053: val_loss did not improve from 0.00004
       Epoch 54/100
       160/160 [==============================] - 50s 313ms/step - loss: 4.3173e-05 - val_loss: 4.6013e-05

       Epoch 00054: val_loss did not improve from 0.00004
       Epoch 55/100
       160/160 [==============================] - 48s 297ms/step - loss: 4.3549e-05 - val_loss: 4.4636e-05

       Epoch 00055: val_loss did not improve from 0.00004
       Epoch 56/100
       160/160 [==============================] - 56s 348ms/step - loss: 4.3642e-05 - val_loss: 4.5760e-05

       Epoch 00056: val_loss did not improve from 0.00004
       Epoch 57/100
       160/160 [==============================] - 47s 294ms/step - loss: 4.3401e-05 - val_loss: 4.5608e-05

       Epoch 00057: val_loss did not improve from 0.00004
       Epoch 58/100
       160/160 [==============================] - 54s 340ms/step - loss: 4.2965e-05 - val_loss: 4.4262e-05

       Epoch 00058: val_loss did not improve from 0.00004
       Epoch 59/100
       160/160 [==============================] - 50s 315ms/step - loss: 4.2870e-05 - val_loss: 4.5898e-05

       Epoch 00059: val_loss did not improve from 0.00004
     

Create Predictive Network

We then create a second model which we use for prediction. This is done as in the training model we use a stateless LSTM whereby we pass sequences of a certain length to the network, this is what we referred to as the AR order previously. In this case the model expects to predict on tensors of size (1,120,6). For the testing however, since we chose to pass the previous predicted outputs as exogenous inputs, we have to make predictions one at a time on tensors of shape (1,1,6) whilst passing the internal state between predictions. As such we create a stateful LSTM of the same architecture and change the expected input shape. We then load the weights of the trained network into this network.

In [126]:
lstm.load_weights('.lstm_wts.hdf5')
lstmPred = Sequential()
lstmPred.add(LSTM(15,stateful=True,batch_input_shape=(1,1,x_train.shape[2]),unroll=True))
lstmPred.add(Dense(3,activation='linear'))
lstmPred.set_weights(lstm.get_weights())
#lstmPred.save_weights('prettygood.hdf5')
lstmPred.load_weights('prettygood.hdf5')

We then take one of the datasets not used for training and apply the correct normalisation. We can then create our prediction loop. For each prediction we give as input the concatenation of the forcing at time i and the predicted response from the previous step. Since we have here used a stateful LSTM network, this state value is propagated by itself. However, as it is a stateful network we must be mindful to reset this state to initialisation before beginning the predictions.

In [214]:
Ftest = FF[-1]
Ytest = YY[-1]
FtestN = 2*(Ftest-Fmin)/(Fmax-Fmin)-1
YtestN = 2*(Ytest-Ymin)/(Ymax-Ymin)-1
datasetTest=np.concatenate((FtestN,YtestN),axis=1)
dataTest = np.reshape(datasetTest,(-1,1,datasetTest.shape[1]))
lstmPred.reset_states()
@tf.function
def makePred(lstmPred,Ti):
    return lstmPred(Ti)
zz1=[]
x0 = dataTest[0:1,:,:]
print(dataTest.shape)
print(type(dataTest))
t = time.time()
for i in range(0,8000):
        zzz = makePred(lstmPred,x0)
        zz1.append(zzz)
        zzz = np.reshape(zzz,(1,1,3))
        x0 = np.concatenate((dataTest[i+1:i+2,:,0:1],zzz),axis=2)
print(time.time()-t)
z = np.squeeze(np.concatenate(zz1,axis=0))
(8312, 1, 4)
<class 'numpy.ndarray'>
10.957023620605469

Plot Results

We then compare the predicted response to the ground truth response for the testing example we chose. We can see that the initial performance of the LSTM network is not very good...

In [211]:
fs = 320
ts = 1/fs
t = np.arange(0,8000*ts,ts)
groundTruth = YtestN[1:,:]


plt.figure(figsize=(10,5))
fig, ax = plt.subplots(3,1, figsize=(12,8))
for i in range(3):
    ax[i].plot(t[0:350],(groundTruth[0:350,i]),label='True')
    ax[i].plot(t[0:350],z[0:350,i],'--',label='Pred')
    ax[i].legend(loc='upper left')
plt.tight_layout()
plt.show()
<Figure size 720x360 with 0 Axes>

However, the performance outside of the transient regime is much better! To fully capture transient behaviour, it may be useful to consider more powerful networks and or more training examples.

In [212]:
plt.figure(figsize=(10,5))
fig, ax = plt.subplots(3,1, figsize=(12,8))
for i in range(3):
    ax[i].plot(t[1000:1350],(groundTruth[1000:1350,i]),label='True')
    ax[i].plot(t[1000:1350],z[1000:1350,i],'--', label='Pred')
    ax[i].legend(loc='upper left')
plt.tight_layout()

plt.show()
<Figure size 720x360 with 0 Axes>

We then calculate the normalised mean squared error of prediction for all 25 of the testing datasets. We take each in turn, apply the normalisation then repeat the prediction procedure before calculating and plotting the percentage NMSE.

In [217]:
from sklearn.metrics import mean_squared_error
def nmse(y, yhat):
    Error = 0
    for i in range(y.shape[1]):
        ms = mean_squared_error(y[:,i],yhat[:,i])
        nms = ms / np.mean(y[:,i]**2)
        Error = Error+nms
    return Error/y.shape[1]

err = []
for i in range(26,50):
    Ftest = FF[i]
    Ytest = YY[i]
    FtestN = 2*(Ftest-Fmin)/(Fmax-Fmin)-1
    YtestN = 2*(Ytest-Ymin)/(Ymax-Ymin)-1
    datasetTest=np.concatenate((FtestN,YtestN),axis=1)
    dataTest = np.reshape(datasetTest,(-1,1,datasetTest.shape[1]))
    lstmPred.reset_states()

    zz1=[]
    x0 = dataTest[0:1,:,:]
    t = time.time()
    for i in range(0,8000):
            zzz = makePred(lstmPred,x0)
            zz1.append(zzz)
            zzz = np.reshape(zzz,(1,1,3))
            x0 = np.concatenate((dataTest[i+1:i+2,:,0:1],zzz),axis=2)

    z = np.squeeze(np.concatenate(zz1,axis=0))
    err.append(100*nmse(z[:8000],YtestN[1:8001,:]))


We can see that the performance of the network is pretty consistent across all 25 testing datasets, with a couple of examples in which the error is slightly higher.

In [185]:
plt.figure(figsize=(8,5))
plt.plot(err)
plt.ylabel('NMSE %')
plt.xlabel('Test Dataset #')
plt.show()

Summary

LSTMs can be a powerful method for modelling time series but aren't often demonstrated in the context in which we would use them in system ID. We've shown here how LSTMs can be used for modelling time series response of nonlinear dynamic systems. The training of these networks can be quite involved but they can also be very powerful for significantly nonlinear systems. For any questions or help with specific implementation please email me: simpson@ibk.baug.ethz.ch