Applying a neural network to a 1D PPG signal

In this post, I jot down some useful things to keep in mind when testing a model with Python and Keras, for newbies like me.

First, segmenting the data with Python.

Second, organizing the data to fit the built-in model in Keras.

Third, my experience training the model on data preprocessed with MinMaxScaler.

1. Segment Data with Python

In Python, the easy and usual way to import data is the `read_csv` function from the pandas package.

Data files often come as .txt, .csv, or .xls, but I prefer to save and load .csv files because I can explore, verify, and manipulate them confidently in Excel.

Let's start by reading a CSV file named "InvertPhaseLong2.csv" as below:

from pandas import read_csv

dataset = read_csv('drive/My Drive/InvertPhaseLong2.csv', header=0)

header=0 means the first row of the file holds the column names; if the file has no header row, set it as header=None.
The dataset is returned as a pandas DataFrame, which gives us some useful checking functions,
such as .head() to read the first several rows of the table, or .describe() to get
basic statistics of the data, including the count, mean, std, min, and max.
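
As a quick illustration of these checks, here is a minimal sketch on a tiny in-memory table. The column names and values are made up for the example; they are not taken from InvertPhaseLong2.csv.

```python
from io import StringIO

from pandas import read_csv

# Hypothetical two-column CSV text standing in for a real file
csv_text = "ch0,ch1\n10,100\n20,200\n30,300\n40,400\n"

# header=0: the first row holds the column names
with_header = read_csv(StringIO(csv_text), header=0)

# header=None: every row is treated as data, including the first one
without_header = read_csv(StringIO(csv_text), header=None)

print(with_header.head(2))     # first two data rows
print(with_header.describe())  # count, mean, std, min, quartiles, max per column
print(with_header.shape, without_header.shape)
```

Note that with header=None the row of names is counted as a data row, so the second table has one row more than the first.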

To segment the data, we first get the values from the dataset with the command below:

data = dataset.values

data is now an ndarray, and we can start segmenting it into sections with whatever window length we wish.
Below is an example of a function that segments the data into many parts and stores them in a list:

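def load_data(filename):
  # read the CSV (first row is the header) and return the values as an ndarray
  dataset = read_csv(filename, header=0)
  return dataset.values

def segment_data(signal, distance, overlap):
  # cut the signal into windows of `distance` samples;
  # each new window starts distance*overlap samples after the previous one
  step = max(1, int(distance * overlap))
  i = 0  # start at the first sample
  s = list()
  while i + distance <= len(signal):  # keep only full-length windows
    s.append(signal[i:i + distance])
    i = i + step
  return s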

data = load_data('drive/My Drive/InvertPhaseLong2.csv')  
ppg = segment_data(data[:,0],70,0.5)
ppg1 = segment_data(data[:,2],70,0.5)
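
To see what this kind of windowing produces, the sketch below runs the same idea on a toy signal of 10 samples. The helper name sliding_windows is hypothetical and written only for this illustration; keeping full-length windows only is an assumption on my side.

```python
import numpy as np

def sliding_windows(signal, distance, overlap):
    """Hypothetical helper: windows of `distance` samples, start advancing by distance*overlap."""
    step = max(1, int(distance * overlap))
    starts = range(0, len(signal) - distance + 1, step)
    return [signal[i:i + distance] for i in starts]

toy = np.arange(10)  # 0, 1, ..., 9
windows = sliding_windows(toy, distance=4, overlap=0.5)
for w in windows:
    print(w)
```

With 10 samples, a window of 4, and a step of 2, this gives 4 windows, and each window shares 2 samples with the previous one.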

To display the values of some segments, use matplotlib as below:

from matplotlib import pyplot
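
A plotting call could look like the sketch below. The two segments here are synthetic sine and cosine waves of 70 samples, used only as stand-ins for real PPG segments, and the output file name is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
from matplotlib import pyplot
import numpy as np

# Synthetic stand-ins for two segments of 70 samples (not real PPG data)
t = np.arange(70)
segments = [np.sin(2 * np.pi * t / 35), np.cos(2 * np.pi * t / 35)]

fig, ax = pyplot.subplots()
for k, seg in enumerate(segments):
    ax.plot(seg, label="segment %d" % k)  # one line per segment
ax.set_xlabel("sample index")
ax.set_ylabel("amplitude")
ax.legend()
fig.savefig("segments.png")  # in an interactive session, pyplot.show() works too
```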


2. Fit the data to the model

The raw values are very large, so feeding them directly to a machine learning model can cause numerical overflow.
sklearn provides several classes to rescale data, like StandardScaler() or MinMaxScaler().
In this example, I use MinMaxScaler() to scale the data into the range [0, 1].

Because there are two datasets, In-Phase and Invert-Phase, we combine them into a single training dataset with its labels.

import numpy as np

# labels: 1 for the first group of data, 0 for the second group
y1 = np.asarray([1] * len(ppg))
y2 = np.asarray([0] * len(ppg1))
y1.shape, y2.shape
# Concatenate data and labels in the same order
X = np.concatenate((ppg, ppg1))
y = np.concatenate((y1, y2))
Here, the concatenate() function lets us combine the data; y1 and y2 are the output labels, with 1 for in-phase and 0 for invert-phase.
np.asarray() converts data from a list to an ndarray.
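
The same labelling and stacking can be checked on toy arrays. This is only a sketch: the array names and sizes below are made up, with 3 segments in one group and 2 in the other.

```python
import numpy as np

# Toy stand-ins: 3 "in-phase" and 2 "invert-phase" segments of 4 samples each
in_phase = np.ones((3, 4))
invert_phase = np.zeros((2, 4))

y_in = np.ones(len(in_phase), dtype=int)        # label 1 for in-phase
y_inv = np.zeros(len(invert_phase), dtype=int)  # label 0 for invert-phase

X_toy = np.concatenate((in_phase, invert_phase))  # stacks rows: (3 + 2, 4)
y_toy = np.concatenate((y_in, y_inv))             # labels in the same order

print(X_toy.shape, y_toy.shape)
print(y_toy)
```

The first dimension of the data and of the labels must agree, since each segment needs exactly one label.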

The classifier is a 1D-CNN model built with Keras, summarized below:
Layer (type)                 Output Shape              Param #   
conv1d_122 (Conv1D)          (None, 68, 64)            256       
conv1d_123 (Conv1D)          (None, 66, 64)            12352     
dropout_61 (Dropout)         (None, 66, 64)            0         
max_pooling1d_61 (MaxPooling (None, 33, 64)            0         
flatten_61 (Flatten)         (None, 2112)              0         
dense_122 (Dense)            (None, 100)               211300    
dense_123 (Dense)            (None, 2)                 202       
Total params: 224,110
Trainable params: 224,110
Non-trainable params: 0
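
The parameter counts in this summary can be checked by hand. The sketch below assumes a kernel size of 3 for both Conv1D layers and a pool size of 2, which I infer from the output shapes (70 to 68 to 66 to 33) and not from the original code.

```python
# Assumed hyperparameters, inferred from the shapes in the summary
kernel_size, filters, n_features = 3, 64, 1
segment_len, dense_units, n_classes = 70, 100, 2

# Conv1D parameters = kernel_size * input_channels * filters + filters (biases)
conv1 = kernel_size * n_features * filters + filters
conv2 = kernel_size * filters * filters + filters

# Each unpadded convolution shortens the sequence by kernel_size - 1;
# max pooling with pool size 2 then halves it
length = (segment_len - 2 * (kernel_size - 1)) // 2
flat = length * filters

# Dense parameters = inputs * units + units (biases)
dense1 = flat * dense_units + dense_units
dense2 = dense_units * n_classes + n_classes

total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, flat, dense1, dense2, total)
```

Under these assumptions the numbers agree with the table: 256, 12352, 2112, 211300, 202, and 224110 in total.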
The input data for this model must have the shape (number_of_segment, size_of_segment, number_of_feature).
To find the number of segments, check the shape attribute of X and y:

X.shape, y.shape
(1178, 70), (2278,)

The output shows the size as (number_of_segment, size_of_segment), without the number_of_feature Nf.
In our case we use a single channel, so Nf = 1; for multi-channel data, replace it with the suitable number of channels.
Because of this, we need to reshape the input data to match the model.
Two functions are used: reshape() from numpy and to_categorical() from Keras, as shown below:

from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
# Xscaled comes from scaling(X), defined in section 3
# split the dataset into train (67%) and test (33%) sets
X_train, X_test, y_train, y_test = train_test_split(Xscaled, y, test_size=0.33, random_state=1)
# split the test part in half again to get a validation set
X_test, X_val, y_test, y_val = train_test_split(X_test, y_test, test_size=0.5, random_state=1)

# add the feature axis: (number_of_segment, size_of_segment, 1)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
X_val = X_val.reshape((X_val.shape[0], X_val.shape[1], 1))
# one-hot encode the labels
y_train = to_categorical(y_train, num_classes=2)
y_test = to_categorical(y_test, num_classes=2)
y_val = to_categorical(y_val, num_classes=2)

Here, to train the model, we also split the data into train, validation, and test sets with train_test_split from sklearn.
to_categorical maps the outputs [0] and [1] to the vectors [1 0] and [0 1] respectively.
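
The effect of the reshape and of the one-hot encoding can be seen on a toy example. To keep the sketch independent of Keras, I use np.eye as a stand-in for to_categorical; it is not the Keras function itself, only a NumPy way to get the same kind of vectors.

```python
import numpy as np

X_toy = np.arange(12, dtype=float).reshape(3, 4)  # 3 segments of 4 samples
y_toy = np.array([0, 1, 1])

# Add a trailing feature axis: (segments, samples) -> (segments, samples, 1)
X_3d = X_toy.reshape((X_toy.shape[0], X_toy.shape[1], 1))

# One-hot encoding with plain NumPy: row k of the identity matrix encodes class k
y_onehot = np.eye(2)[y_toy]

print(X_3d.shape)
print(y_onehot)
```

The data keeps its values and only gains one axis, while label 0 becomes [1, 0] and label 1 becomes [0, 1].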

3. Scaling data the right way with MinMaxScaler()

By default, MinMaxScaler() scales each column independently, that is, across segments rather than within a segment. Assume segment0 and segment1 are
[a00 a01 a02] and [a10 a11 a12] respectively.
The scaling is then computed within the groups [(a00, a10), (a01, a11), (a02, a12)].
In our case, we want to scale the data within each segment rather than across segments, which we can do by
transpose-MinMaxScaler-transpose as in the code below:

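from sklearn.preprocessing import MinMaxScaler

# scale each segment (row) to [0, 1] on its own
def scaling(signal):
  scaler = MinMaxScaler()
  s_trans = signal.transpose()             # segments become columns
  Xscaled = scaler.fit_transform(s_trans)  # MinMaxScaler works column by column
  Xscaled = Xscaled.transpose()            # back to (number_of_segment, size_of_segment)
  return Xscaled

Xscaled = scaling(X)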
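
The difference between the two ways of scaling is easy to see on two toy segments with very different amplitudes. This is a small sketch with made-up numbers, not the PPG data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two toy segments (rows) with very different amplitudes
X_toy = np.array([[1.0, 2.0, 3.0],
                  [10.0, 30.0, 20.0]])

# Default behaviour: each column (sample position) is scaled across segments
per_column = MinMaxScaler().fit_transform(X_toy)

# Transpose trick: each row (segment) is scaled on its own
per_segment = MinMaxScaler().fit_transform(X_toy.T).T

print(per_column)
print(per_segment)
```

With the default call, the first segment collapses to all zeros and the second to all ones, because the scaler only compares the two segments at each sample position. After the transpose trick, each segment spans 0 to 1 on its own and keeps its shape.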

The full code is available on my GitHub:


