Monday, 18 February 2019

[Tutorial] ADLs recognition using machine learning and deep learning: taking Shoulder Physiotherapy Exercise Recognition as an example

Introduction

Learning and recognizing human activities, e.g. activities of daily living (ADLs), is very useful when building a pervasive home monitoring system, and ADLs are also important indicators of both cognitive and physical well-being in healthy and ill people. ADL recognition has many applications, for example: 1) allowing computing systems to proactively assist users with their tasks; 2) providing more information about past activities for medical diagnosis; 3) assisting patients with chronic impairments, supporting personal fitness training and rehabilitation, and encouraging people to adopt a healthy lifestyle; 4) keeping young children away from dangerous areas (e.g. stove, balcony); 5) changing the game experience (e.g. the Microsoft Kinect).

Population ageing has recently become a global phenomenon, driven by longer life expectancy and lower birth rates. The ageing population and the changing structure of the population will bring both opportunities and challenges for the economy, services and society at national and local levels. Society is therefore paying more attention to healthcare for older people, and patient care also remains a focus of attention. To provide better and more timely care for both groups, researchers are leveraging different sensors, e.g. WiFi, UWB, inertial sensors and cameras, to detect people's ADLs. This is known as human activity recognition (HAR).

In this post, you will see:
1) What the process of HAR, namely the HAR chain, looks like.
2) How to recognize ADLs using machine learning and deep learning algorithms.
The experiment data and source code used here come from the seglearn package and the shoulder physiotherapy paper listed in the Reference section.

HAR chain

Generally, a sensor-based HAR chain includes:
  1. Data Collection
  2. Data Segmentation
  3. Feature Normalisation or Scaling 
  4. Feature Extraction, Feature Selection (if necessary)
  5. Classifier Selection
  6. Evaluation
Next, I will show you the details of each step using our example.

Experiment setup

Twenty healthy adult subjects with asymptomatic shoulders and no prior shoulder surgery were recruited and provided informed consent for participation in this study. The subjects' mean age was 28.9 (range 19-56). There were 14 male and 6 female subjects. Fifteen subjects were right-hand dominant, and five were left-hand dominant.

Under the supervision of an orthopedic surgeon, each subject performed 20 repetitions of seven shoulder exercises bilaterally. The sensor used here is an Apple Watch located on the wrist of the subjects' dominant hands. The exercises performed are elements of an evidence-based rehabilitation protocol for full-thickness atraumatic rotator cuff tears (Kuhn et al. 2013) and included:
  1. pendulum (PEN)
  2. abduction (ABD)
  3. forward elevation (FEL)
  4. internal rotation (IR)
  5. external rotation (ER)
  6. trapezius extension (TRAP)
  7. upright row (ROW)

Data Collection

The 6-axis raw sensor data consists of total acceleration a = [ax, ay, az] and rotational velocity ω = [ωx, ωy, ωz], measured in the coordinate frame of the watch. No further preprocessing or filtering was applied to the raw data. The sensor data was acquired from the active extremity using an Apple Watch (Series 2 & 3) with the PowerSense app, sampling at fs = 50 Hz.
Fig 1 pendulum (PEN)
Fig 2 abduction (ABD)
Fig 3 forward elevation (FEL)
Fig 4 internal rotation (IR)
Fig 5 external rotation (ER)
Fig 6 trapezius extension (TRAP)
Fig 7 upright row (ROW)

Data Preprocessing: data segmentation

The raw sensor data was segmented using overlapping fixed-length sliding windows W for each of the six sensor signals. The 3D temporal signal tensor ϕ is produced from the set of windows and has shape (N, L·fs, 6), where N is the number of windows and L is the window length in seconds, so each window holds L·fs samples. An exercise label was attributed to a window from the ground truth annotation when the exercise was performed for the entirety of that window.
Fig 7 sketch for data segmentation 
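
The sliding-window segmentation described above can be sketched in a few lines of NumPy. This is only an illustration, not the code used in the study: the function name sliding_windows, the 2-second window, the 50 % overlap and the random recording are all assumptions made for the example.

```python
import numpy as np


def sliding_windows(series, width, overlap=0.5):
    """Split one (n_samples, n_channels) recording into overlapping windows.

    Returns an array of shape (n_windows, width, n_channels).
    """
    series = np.asarray(series)
    # Number of samples the window advances each time.
    step = max(1, int(round(width * (1.0 - overlap))))
    starts = range(0, len(series) - width + 1, step)
    windows = [series[s:s + width] for s in starts]
    if not windows:
        # Recording shorter than one window: no segments.
        return np.empty((0, width, series.shape[1]))
    return np.stack(windows)


# Hypothetical recording: 10 s of 6-axis data sampled at 50 Hz.
fs = 50
recording = np.random.default_rng(0).normal(size=(10 * fs, 6))

# 2 s windows (100 samples) with 50 % overlap, i.e. a step of 50 samples.
segments = sliding_windows(recording, width=2 * fs, overlap=0.5)
print(segments.shape)
```

Stacking the windows of all recordings along the first axis gives the (N, L·fs, 6) tensor; in the code later in this post, seglearn performs this step inside the pipeline.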


Feature Normalisation or Scaling

Because the raw values of different features can span very different ranges, the objective functions of some machine learning algorithms do not work properly without normalization; for example, distance-based classifiers end up dominated by the feature with the largest range. Feature scaling also helps gradient descent converge much faster.

Commonly used normalization methods include rescaling, standardization and scaling to unit length.
Rescaling: the simplest method; it rescales each feature to the range [0, 1] or [−1, 1].
Standardization: subtracts the mean of each feature and divides by its standard deviation, so that each feature has zero mean and unit variance.
Scaling to unit length: scales the components of a feature vector so that the complete vector has length one.
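
As a rough illustration of the three methods, here is a minimal NumPy sketch. The function names and the tiny example matrix are made up for this post; in the pipelines below, scikit-learn's StandardScaler does the standardization.

```python
import numpy as np


def rescale(X):
    """Min-max rescaling of each feature (column) to [0, 1]."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    # Constant columns would give a zero span, so divide by 1 instead.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X - x_min) / span


def standardize(X):
    """Zero mean and unit variance for each feature (column)."""
    std = X.std(axis=0)
    std = np.where(std > 0, std, 1.0)
    return (X - X.mean(axis=0)) / std


def unit_length(X):
    """Scale each sample (row) to Euclidean length one."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms = np.where(norms > 0, norms, 1.0)
    return X / norms


# Hypothetical feature matrix: 3 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

print(rescale(X))
print(standardize(X))
print(unit_length(X))
```

In practice, the minimum, maximum, mean and standard deviation should be estimated on the training data only and then reused for the test data.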

Feature Extraction and Feature Selection

A feature mapping F(W), comprised of typical HAR statistical and heuristic features, was computed to define the feature space for the classifiers. An identical set of univariate features was computed for each signal vector in each segment: mean, variance (σ^2), standard deviation (σ), maximum, minimum, skewness, kurtosis, mean crossings (ζ), mean spectral energy (ξ), and a 4-bin histogram. There are of course other features as well, such as FFT amplitude and frequency, zero crossing rate, mean crossing rate and mean of gradient.
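
To make the feature list concrete, here is a small NumPy sketch that computes these features for one signal window and then for a whole 6-axis segment. The exact definitions, especially of mean spectral energy and the histogram normalisation, are my assumptions and may differ from the ones used in the paper or in seglearn's FeatureRep.

```python
import numpy as np


def signal_features(w):
    """Feature vector for one 1-D signal window (one sensor axis)."""
    w = np.asarray(w, dtype=float)
    mean, std = w.mean(), w.std()
    centered = w - mean
    # Standardised 3rd and 4th moments; a flat signal gets 0 for both.
    skew = np.mean(centered ** 3) / std ** 3 if std > 0 else 0.0
    kurt = np.mean(centered ** 4) / std ** 4 - 3.0 if std > 0 else 0.0
    # Mean crossings: sign changes of the mean-removed signal.
    signs = np.sign(centered)
    crossings = np.sum(signs[:-1] * signs[1:] < 0)
    # Mean spectral energy: average squared FFT magnitude (assumed definition).
    energy = np.mean(np.abs(np.fft.rfft(w)) ** 2)
    # 4-bin histogram, normalised to fractions of the window.
    hist, _ = np.histogram(w, bins=4)
    stats = [mean, w.var(), std, w.max(), w.min(), skew, kurt, crossings, energy]
    return np.concatenate([stats, hist / len(w)])


def segment_features(segment):
    """Concatenate the per-axis features of one (width, n_axes) segment."""
    return np.concatenate([signal_features(segment[:, k])
                           for k in range(segment.shape[1])])


f = signal_features([1.0, -1.0, 1.0, -1.0])
print(f[7])  # mean crossings of the alternating signal

segment = np.random.default_rng(0).normal(size=(100, 6))  # hypothetical window
print(segment_features(segment).shape)
```

With 13 values per axis and 6 axes, each segment is described by 78 features in this sketch.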

We could compute as many features as possible, but recognition accuracy does not increase linearly with the number of features. We therefore need to keep the most informative features, for both efficiency and accuracy. Common methods include principal component analysis (PCA) and linear discriminant analysis (LDA), which project the features onto a smaller number of dimensions.
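
As an illustration of how PCA reduces the feature dimension, here is a minimal sketch built on NumPy's SVD. The helper names and the synthetic 78-dimensional feature matrix (generated from 3 hidden factors) are assumptions for the example; with real data you would more likely use a library implementation such as scikit-learn's PCA.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the feature mean, the top principal axes and their variance share."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]


def pca_transform(X, mean, components):
    """Project the centred samples onto the principal axes."""
    return (X - mean) @ components.T


# Hypothetical feature matrix: 200 segments x 78 features that really
# depend on only 3 underlying factors, plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 78))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 78))

mean, components, explained = pca_fit(X, n_components=3)
Z = pca_transform(X, mean, components)
print(Z.shape, explained.sum())
```

Here three components keep almost all of the variance, so the classifier can work with 3 inputs instead of 78. Unlike PCA, LDA also uses the class labels when choosing the projection.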

The feature extraction and feature selection introduced here are mainly for classical machine learning (ML) algorithms, which require features to be extracted manually. With the rise of deep learning, many researchers prefer deep neural networks because they learn features automatically, and these learned features may capture more information that is useful for recognition.
Keep in mind, though, that ML and deep learning both have their strengths, and deep learning is not suitable for every circumstance. When there are not enough samples, ML is more likely to achieve good accuracy.

Classification  

Commonly used classifiers include, for ML, decision tree, k-nearest neighbor (k-NN), naive Bayes, support vector machine (SVM) and random forest (RF), and, for deep learning, DNN, RNN, LSTM, RNN+LSTM and so on. In this post, we will use both ML and deep learning algorithms to make a comparison.
Given the limited space, I am not going to discuss these classifiers in depth. You can refer to outside resources if you are interested.

Evaluation

Similarly, there are multiple indicators for evaluation: accuracy, precision, recall and F-measure.

  • Accuracy - Accuracy is the most intuitive performance measure: the ratio of correctly predicted observations to the total number of observations. One may think that a model with high accuracy is the best model. Accuracy is a great measure, but only when the dataset is symmetric, i.e. the numbers of false positives and false negatives are almost the same. Otherwise, you have to look at other measures to evaluate the performance of your model.
  • Precision - Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.
  • Recall (Sensitivity) - Recall is the ratio of correctly predicted positive observations to all observations that actually belong to the positive class.
  • F1 score - The F1 score is the harmonic mean of precision and recall, so it takes both false positives and false negatives into account: F1 = 2*(precision*recall)/(precision+recall). It is also called the F score or the F measure. Put another way, the F1 score conveys the balance between precision and recall.
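
A quick worked example may help here. The sketch below computes precision, recall and F1 from the counts of true positives, false positives and false negatives for a single class; the counts are invented for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts for one class (e.g. PEN versus all other exercises):
# 8 windows correctly labelled PEN, 2 wrongly labelled PEN, 8 PEN windows missed.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(precision, recall, f1)
```

Precision is 8/10 = 0.8 and recall is 8/16 = 0.5, so F1 = 2*(0.8*0.5)/(0.8+0.5) = 8/13 ≈ 0.615, which sits closer to the weaker of the two values.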

Algorithms Implementation

Note that the code used here is written in Python and is based on the seglearn package; see the seglearn link in the Reference section for the source code.

The most convenient way to install it is with pip install seglearn, as shown below:
Fig 8 pip install seglearn

  •  Linear SVM classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

from seglearn.datasets import load_watch
from seglearn.pipe import Pype
from seglearn.transform import FeatureRep, PadTrunc

# load the data
data = load_watch()
X = data['X']
y = data['y']

# create a feature representation pipeline with PadTrunc segmentation
# the time series are between 20-40 seconds
# this truncates them all to the first 5 seconds (sampling rate is 50 Hz)

pipe = Pype([('trunc', PadTrunc(width=250)),
             ('features', FeatureRep()),
             ('scaler', StandardScaler()),
             ('svc', LinearSVC())])

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=True,
                                                    random_state=42)

pipe.fit(X_train, y_train)
score = pipe.score(X_test, y_test)

print("N series in train: ", len(X_train))
print("N series in test: ", len(X_test))
print("N segments in train: ", pipe.N_train)
print("N segments in test: ", pipe.N_test)
print("Accuracy score: ", score)

  •  Random forest classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

from seglearn.datasets import load_watch
from seglearn.pipe import Pype
from seglearn.transform import FeatureRep, PadTrunc

# load the data
data = load_watch()
X = data['X']
y = data['y']

# create a feature representation pipeline with PadTrunc segmentation
# the time series are between 20-40 seconds
# this truncates them all to the first 5 seconds (sampling rate is 50 Hz)

pipe = Pype([('trunc', PadTrunc(width=250)),
             ('features', FeatureRep()),
             ('scaler', StandardScaler()),
             ('RF', RandomForestClassifier())])

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=True,
                                                    random_state=42)

pipe.fit(X_train, y_train)
score = pipe.score(X_test, y_test)

print("N series in train: ", len(X_train))
print("N series in test: ", len(X_test))
print("N segments in train: ", pipe.N_train)
print("N segments in test: ", pipe.N_test)
print("Accuracy score: ", score)

  •  k-NN classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

from seglearn.datasets import load_watch
from seglearn.pipe import Pype
from seglearn.transform import FeatureRep, PadTrunc

# load the data
data = load_watch()
X = data['X']
y = data['y']

# create a feature representation pipeline with PadTrunc segmentation
# the time series are between 20-40 seconds
# this truncates them all to the first 5 seconds (sampling rate is 50 Hz)

pipe = Pype([('trunc', PadTrunc(width=250)),
             ('features', FeatureRep()),
             ('scaler', StandardScaler()),
             ('KNN', KNeighborsClassifier(n_neighbors=7))])

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=True,
                                                    random_state=42)

pipe.fit(X_train, y_train)
score = pipe.score(X_test, y_test)

print("N series in train: ", len(X_train))
print("N series in test: ", len(X_test))
print("N segments in train: ", pipe.N_train)
print("N segments in test: ", pipe.N_test)
print("Accuracy score: ", score)
  •  Convolution and RNN (LSTM) combination classification
from keras.layers import Dense, LSTM, Conv1D
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import train_test_split

from seglearn.datasets import load_watch
from seglearn.pipe import Pype
from seglearn.transform import SegmentX


def crnn_model(width=100, n_vars=6, n_classes=7, conv_kernel_size=5,
               conv_filters=10, lstm_units=10):
    input_shape = (width, n_vars)
    model = Sequential()
    model.add(Conv1D(filters=conv_filters, kernel_size=conv_kernel_size,
                     padding='valid', activation='relu', input_shape=input_shape))
    model.add(Conv1D(filters=conv_filters, kernel_size=conv_kernel_size,
                     padding='valid', activation='relu'))
    model.add(LSTM(units=lstm_units, dropout=0.1, recurrent_dropout=0.1))
    model.add(Dense(n_classes, activation="softmax"))

    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])

    return model


# load the data
data = load_watch()
X = data['X']
y = data['y']

# create a segment learning pipeline
# each series is cut into sliding windows of `width` samples (2 seconds at 50 Hz)
width = 100

pipe = Pype([('seg', SegmentX(width=width, order='C')),
             ('crnn', KerasClassifier(build_fn=crnn_model, epochs=8, batch_size=256, verbose=0))])
# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

pipe.fit(X_train, y_train)
score = pipe.score(X_test, y_test)

print("N series in train: ", len(X_train))
print("N series in test: ", len(X_test))
print("N segments in train: ", pipe.N_train)
print("N segments in test: ", pipe.N_test)
print("Accuracy score: ", score)

  • Evaluation
This part is left for you to implement. Remember that we use the F-score as the evaluation indicator, while the code above reports accuracy.
Run the code above to see which algorithm performs best, and see what else you can find.
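
If you want a starting point, here is one possible sketch of a multi-class F-score: it builds a confusion matrix from true and predicted labels and averages the per-class F1 scores (macro F1). The function names and the toy labels are assumptions for illustration; obtaining the per-segment predictions from the pipelines above is not shown.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def macro_f1(cm):
    """Return the unweighted mean of the per-class F1 scores, and the scores."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as the class but actually another one
    fn = cm.sum(axis=1) - tp  # belonging to the class but predicted as another
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); classes with no samples at all get 0.
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return f1.mean(), f1


# Toy example with 3 classes (the exercise dataset has 7).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
score, per_class = macro_f1(cm)
print(cm)
print(per_class, score)
```

Looking at the per-class scores as well as the average shows which exercises are confused with each other, which a single accuracy number hides.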

Conclusion

In this post, I introduced the HAR chain and showed how to use machine learning and deep learning algorithms to recognize 7 classes of activities, using the dataset from the Shoulder Physiotherapy Exercise Recognition work.

The mathematical principles of machine learning and deep learning are complicated, so to stay on topic I did not explain these algorithms in depth. However, if you want to innovate in this area, I recommend you spend some time seriously learning their principles.

Back to our topic: human activity recognition is a hot topic in recent research. If you share this interest with us, we can explore it in more depth together.

Reference

seglearn: https://dmbee.github.io/seglearn/install.html
Shoulder Physiotherapy Exercise Recognition: Machine Learning the Inertial Signals from a Smartwatch: https://arxiv.org/pdf/1802.01489.pdf
