Biomedical research and clinical data are often collected on the same sample of data at different points in time. These data are called “longitudinal data.” (See the definition by BLS.) When performing supervised learning (e.g., SVM) on data of this kind, the impact of time-varying correlation of the features on the outcomes / predictions may be blurred. In order to smoothing out the temporal effect of the features, changes to the original learning algorithms are necessary.

In a study conducted by Center for Information Technology (CIT), and National Institutes on Aging (NIA) in National Institutes of Health (NIH), with some clinical data as the training data, a longitudinal support vector regression (LSVR) algorithm was presented, and shown to outperform other machine learning methods. [Du et. al. 2015] Their results were published in IEEE BIBM (Bioinformatics and Biomedicine) conference. Their work is adapted from an earlier work by Chen and Bowman. [Chen & Bowman 2011] The dataset is a longitudinal, because it contains N patients with p features, taken at T points in time.

Traditional support vector regression (SVR) is to solve the following optimization problem:

$\text{min} \frac{1}{2} || \mathbf{w} ||^2$,

where $\mathbf{w}$ is a hyperplane surface, under the constraints:

$y_i - \mathbf{w^T} \Phi (\mathbf{x}_i) - b \leq \epsilon$,
$\mathbf{w^T} \Phi (\mathbf{x}_i) + b - y_i \leq \epsilon$.

However, in LSVR, the data points are more complicated. For each patient s, its features at time t is given by a vector $\mathbf{x}_s^{(t)}$. The first goal of LSVR is to assign each patient a T-by-p matrix $\tilde{\mathbf{x}}_s = \left[ \mathbf{x}_s^{(1)}, \mathbf{x}_s^{(2)}, \ldots, \mathbf{x}_s^{(T)} \right]^T$, and a T-by-1 vector $\tilde{y}_s =\left[ y_s^{(1)}, y_s^{(2)}, \ldots, y_s^{(T)} \right]^T$, with an unknown parameter vector $\mathbf{\beta} = \left[ 1, \beta_1, \beta_2, \ldots, \beta_{T-1} \right]^T$ such that the constraints becomes:

$\tilde{y}_i^T \beta - \langle \mathbf{w^T}, \mathbf{\tilde{x}_i}^T \beta \rangle - b \leq \epsilon + \xi_i$,
$\langle \mathbf{w^T}, \mathbf{\tilde{x}_i}^T \beta \rangle+ b -\tilde{y}_i^T \beta \leq \epsilon + \xi_i^{*}$,

where $\xi_i$‘s are additional regularization parameters. The parameters $\beta$‘s can be found by iteratively quadratic optimization. The constraints are handled with Lagrangian’s multipliers.

For details, please refer to [Du et. al. 2015]. This way decouples, or smoothes out, the temporal covariation within the patients. A better prediction can be made.