---
title: "Dynamic prediction in functional concurrent regression with sparse functional covariates"
author: "Leroux A, Xiao L, Crainiceanu C, Checkley W"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Tutorial on how to fit models and perform prediction using fcr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


```{r global_options, include=FALSE}
rm(list=ls()) ### To clear namespace
library(knitr)
opts_chunk$set(echo=TRUE, eval=FALSE)
```

This vignette is intended to show users how to fit functional concurrent regression models and then perform 
dynamic prediction using the *fcr* package. 
The analysis here mirrors the data application in Leroux et. al (2017).

The *fcr* package was created to fit functional concurrent models when the functional response and predictor are irregularly measured. 
 We note that regularly spaced functional concurrent regression models can be fit using this package, but if dynamic prediction is not 
 of interest and the data are regularly spaced,  
 the *refund* package likely provides substantial computational gains over the *fcr* package.

# Model

The *fcr()* function is a wrapper for *mgcv::gam()*/*mgcv::bam()* and *face::face.sparse()*. 
The mean model specification syntax of *fcr()* is that of *mgcv::gam()*. 
This function fits models of the form


$$
Y_ij = f_0(t_{ij}) + \beta_1X_{i} + f_2(t_{ij})Z_{ij} + f_3(Z_{ij}) + f_4(Z_{ij}, t_{ij}) + \cdots + b_i(t_{ij}) + \epsilon_{ij} 
$$
Where $b_i(t_{ij})$ is denotes a random function intercept
and $\epsilon_{ij}$ are iid normally distributed noise. 
We assume the functional random intercept and white noise terms are independent. 
The *fcr()* function does not currently allow non-identity link functions for the response.


# CONTENT data

The *fcr* package contains
simulated child growth data. 
Below, we present a model fitting procedure and show how to 
perform dynamic prediction, and plot the results.

In this data, we have two irregularly measured growth indicators, height for age (haz) and weight for age (waz), as well as a time-invariant covariate 
indicating the sex of the child (Male). 
We allow for time varying coefficients on both the gender covariate, and concurrently measured waz.
More precisely, our model is

$$
\text{HAZ}_{ij} = f_0(t_{ij}) + f_1(t_{ij})\text{Male}_i + f_2(t_{ij})WAZ_{ij} + b_i(t_{ij}) + \epsilon_{ij}
$$

In this data, both $HAZ$ and $WAZ$ are measured with noise, and so the first step is to smooth 
the noisy $WAZ$ covariate. Here, we smooth using functional principal 
component analysis via the *face* package.


## Fit the model

```{r content_setup}
library(fcr)
## split the data into training and test data
data_full <- content
data      <- subset(data_full, include == 1)  ## data used in model fitting
data_test <- subset(data_full, include == 0)  ## out-of-sample data used in dynamic prediction
rm(list="data_full")

k <- 12 # number of interior knots for fpca (results in k + 3 basis functions)

## Specify the functional domain where we want to make predictions.
## We need to add in the times of observation for the test data
tnew <- sort(unique(c(data$argvals, data_test$argvals)))

###########################################
## Step 1: Smooth time-varying covariate ##
###########################################
dat.waz <- data.frame("y" = data$waz, "subj" = data$subj, argvals = data$argvals)
fit.waz <- face.sparse(dat.waz, newdata = dat.waz, knots = k, argvals.new = tnew)
data$wazPred <- fit.waz$y.pred
```


Once we have the smooth functional covariate, we are ready to fit our model.

```{r fit_fcr}
#####################
## Step 2: Fit fcr ##
#####################

K <- 15 # dimenson of smooth for time varying coefficients

## The mean model in order of the formula presented below:
##     E[Y_{ij}] = f_0(t_{ij}) + f_1(t_{ij})Male_i + f_2(t_{ij})\tilde{WAZ}_{ij}
## note that we do not specify any random effects. That is done internally by fcr()
fit <- fcr(formula = Y ~ s(argvals, k=K, bs="ps") + 
                         s(argvals, by=Male, k=K, bs="ps") + 
                         s(argvals, by=wazPred, bs="ps"),
           argvals = "argvals", subj="subj", data=data, use_bam=TRUE, argvals.new=tnew,
           face.args = list(knots=k, pve=0.99))

```


## Visualize the data


Below we plot the data. The panels in the top row correspond to the 
estimated $f_0,f_1,f_2$. 
The bottom three panels correspond features of the estimate covariance function.

```{r visualize,fig.height=8,fig.width=16,fig.align='center'}
#####################
## Examine results ##
#####################

## plot coefficients
par(mfrow=c(2,3),las=1)
for(k in 1:3) plot(fit,select=k)
# plot covariance function
plot(fit, plot.covariance=TRUE)
```

![The top three panels present the estimated coefficient functions $f_0, f_1, f_2$ along with pointwise 95% confidence intervals. The figures are by default created using *plot.gam()*. The bottom three panels visualize the features of the covariance function.](visualize-1.png)


## Prediction

### Dynamic prediction in a functional concurrent model

Dynamic prediction in a functional concurrent model is complicated by the fact that we 
require a value for the functional covariate at the time where we wish to dynamically predict the 
outcome of interest. 

The solution proposed in Leroux et. al (2017) is to use fpca to dynamically predict the covariate values, then 
use those predictions as our covariate in the functional concurrent model. As a result, for each new 
set of dynamic predictions, the functional covariate must be re-estimated. 

Suppose we wise to make dynamic predictions for a subject for $t > t_m$ for some $t, t_m < t_{\text{max}}$ where 
$t_{\text{max}}$ is the maximum value of the functional domain used in model fitting.
The procedure can for dynamic prediction can be summarized in the following steps 

1. Create a new data frame (**newdata**) with all available data for the subject up to $t_m$
2. Insert new rows of subject data into **newdata** with **NA** values for the functional covariate and response. 
   The functional domain variable in these rows should indicate the $t$ for which dynamic predictions are desired.
3. Use fpca (or another prediction procedure) to smooth and dynamically predict the functional covariate at observed times and 
    "unobserved" times, respectively. Do not use this subject's data beyond $t_m$ (if available) to make these predictions. 
4. If subjects were used in data fitting, re-assign their subject identifiers to a value not used in model fitting. 
5. Supply fitted fcr object and **newdata** to *predict.fcr()*

Note that step (4) above is required by the *fcr()* function to distinguish dynamic prediction from 
"in-sample" prediction.
We provide an example of both in-sample and dynamic prediction in the code below. 

### Dynamic versus in-sample prediction in *fcr*

The *predict.fcr()* function allows for both in-sample and dynamic prediction. 
For in-sample prediction, the estimated BLUPs from *mgcv::bam()*/*mgcv::gam()* are used. 
Only subjects included in model fitting can be used for in-sample prediction. 

In contrast, dynamic prediction is performed on subjects not included in model fitting as indicated by the "subj" 
identifier. Note that dynamic predictions for subjects that were included in model fitting can be 
retrieved by assigning them a new subject id in the "newdata" argument to *predict.fcr()*.

The difference between dynamic predictions and in-sample predictions manifests in two ways. 

First, suppose we observe data on subject $i$ at times $1,2,\ldots,10$ and fit the model on this data. 
If we supply subject $i$'s data for times $1,2,3$ and **NA** values at times $4,\ldots,10$, in-sample prediction 
will provide the same predictions as if we had included all times $1,\ldots,10$ in the new data. 
In contrast, dynamic prediction for this subject will re-estimate this subjects' BLUPs using only the available 
data ($t=1,2,3$). Generally, these two predictions will not be equal.

Second, suppose we're in the scenario mentioned above, but we supply all the subjects' data to the predict function. 
The predicted values will be the same (up to a negligible difference that arises from the differing matrix algebra 
operations used), but the standard errors will differ. This is because the standard errors calculated for in-sample 
prediction take into account the estimated covariance of the random effects with fixed effects (see *?predict.gam* for more details). 
Dynamic predictions, in contrast, assume independence between random effects and fixed effects.

We illustrate these differences in the code below. 


```{r prediction}
## do in-sample and dynamic prediction for subject 1
data_pred <- subset(data, subj==1)

## make two separate dataframes for dynamic prediction
## and create make new subject id not used in model fitting
data_pred_dyn      <- data_pred_dyn2      <- data_pred
data_pred_dyn$subj <- data_pred_dyn2$subj <- data_pred_dyn$subj + 10000

## make "observed data" beyond t=0.5 NA
data_pred_dyn$Y[data_pred_dyn$argvals >= 0.5]   <- NA
data_pred_dyn$waz[data_pred_dyn$argvals >= 0.5] <- NA

## calculate dynamic waz scores
data_dyn_waz         <- data.frame("y" = data_pred_dyn$waz,
                                   "subj" = data_pred_dyn$subj,
                                   "argvals" = data_pred_dyn$argvals)
data_pred_dyn$wazPred <- predict(fit.waz, newdata=data_dyn_waz)$y.pred


## perform predictions
in_samp <- predict(fit, newdata=data_pred, se.fit=TRUE)
dyn     <- predict(fit, newdata=data_pred_dyn)
dyn2    <- predict(fit, newdata=data_pred_dyn2)
```


```{r prediction_plot, echo=FALSE}

## plot the predictions
ylims <- c(-1.5,0.75)
tplot <- data_pred$argvals
par(mfrow=c(1,1))

plot(tplot,in_samp$insample_predictions$fit,xlab="t",ylab="haz",
     type='l',xlim=c(0,1),ylim=ylims,main="Dynamic vs In-Sample Predictions")
lines(tplot,lty=2,
      in_samp$insample_predictions$fit + 1.96*in_samp$insample_predictions$se.fit)
lines(tplot,lty=2,
      in_samp$insample_predictions$fit - 1.96*in_samp$insample_predictions$se.fit)

lines(tplot,lty=1,col='red',
      dyn$dynamic_prediction$fitted.values$y.pred)
lines(tplot,lty=2,col='red',
      dyn$dynamic_prediction$fitted.values$y.pred + 1.96*dyn$dynamic_prediction$fitted.values$se.fit)
lines(tplot,lty=2,col='red',
      dyn$dynamic_prediction$fitted.values$y.pred - 1.96*dyn$dynamic_prediction$fitted.values$se.fit)

lines(tplot,lty=1,col='blue',
      dyn2$dynamic_prediction$fitted.values$y.pred)
lines(tplot,lty=2,col='blue',
      dyn2$dynamic_prediction$fitted.values$y.pred + 1.96*dyn2$dynamic_prediction$fitted.values$se.fit)
lines(tplot,lty=2,col='blue',
      dyn2$dynamic_prediction$fitted.values$y.pred - 1.96*dyn2$dynamic_prediction$fitted.values$se.fit)

abline(v=0.5,lty=2,lwd=2,col='grey')
legend("top",c("In-sample","Dynamic (all data)","Dynamic (up to t=0.5)","", "Predicted Curve","95% CI"),
       col=c("black","blue","red","black","black","black"),
       lty=c(1,1,1,NA,2,1),lwd=2,bty="n",ncol=2)

axis(1,0.5,labels=expression(t[m]==0.5))

```

![ ](prediction_plot-1.png)


### Perform dynamic prediction on the test data set

Below we show how to perform dynamic prediction for all subjects in the test dataset using data 
up to $t_m = 0.5$. 
We also supplement the dataset with **NA** values on a fine grid over the range of the functional domain 
to show how the procedure results in smooth fitted curves and pointwise confidence/prediction intervals. 


```{r dyn_pred_test}
## perform dynamic prediction for test subjects
## on the last 1/2 of the functional domain
data_dyn <- data_test
data_dyn$Y[data_dyn$argvals >= 0.5]   <- NA
data_dyn$waz[data_dyn$argvals >= 0.5] <- NA

## remove smoothed waz -- dyanmic predictions calculated shortly
data_dyn$wazPred <- NULL


set.seed(1012341)
ids <- sample(unique(data_dyn$subj), 4, replace = FALSE)


data_plot <- c()
for(i in 1:length(ids)){
        tmp <- subset(data_dyn, subj %in% ids[i])
        ut_tmp <- tnew[!tnew %in% tmp$argvals & tnew > min(tmp$argvals)]
        n_ut <- length(ut_tmp)
        empty_dat <- data.frame("Y" = rep(NA,n_ut),
                                "Ytrue" = rep(NA,n_ut),
                                "X" = rep(NA,n_ut),
                                "waz.true"=rep(NA,n_ut),
                                "waz"=rep(NA,n_ut),
                                "Male" = rep(tmp$Male[1], n_ut),
                                "argvals"=ut_tmp,
                                "subj"=rep(tmp$subj[1],n_ut),
                                "include"=rep(0,n_ut))
        tmp <- rbind(tmp, empty_dat)
        tmp <- tmp[order(tmp$argvals),]
        
        data_plot <- rbind(data_plot, tmp)
        rm(list=c("tmp","ut_tmp","n_ut","empty_dat"))
}

data_dyn <- data_plot
rm(list=c("data_plot"))

## get dynamic waz predictions
data_dyn_waz     <- data.frame("y" = data_dyn$waz, "subj" = data_dyn$subj, "argvals" = data_dyn$argvals)
data_dyn$wazPred <- predict(fit.waz, newdata=data_dyn_waz)$y.pred

preds <- predict(fit, newdata=data_dyn)$dynamic_predictions
```


```{r plot_test_dyn}
par(mfrow=c(2,2),las=1)
for(i in 1:length(ids)){
        inx <- which(data_dyn$subj==ids[i])
        inx2 <-which(data_test$subj==ids[i])
        
        yl  <- range(c(preds$fitted.values$y.pred[inx] + 
                               rep(c(1,-1),each=length(inx))*1.96*preds$fitted.values$se.fit.p[inx],
                       data_test$Y[inx2]))

        plot(preds$data$argvals[inx], preds$fitted.values$y.pred[inx],type='l',
             main = paste("Subject",i),xlab="Functional Domain", ylab="Response",
             xlim=c(0,1), ylim=yl)
        lines(preds$data$argvals[inx], lty=2,
              preds$fitted.values$y.pred[inx] - 1.96*preds$fitted.values$se.fit[inx],col='blue')
        lines(preds$data$argvals[inx], lty=2,
              preds$fitted.values$y.pred[inx] + 1.96*preds$fitted.values$se.fit[inx],col='blue')
        
        lines(preds$data$argvals[inx],  lty=2,
              preds$fitted.values$y.pred[inx] - 1.96*preds$fitted.values$se.fit.p[inx],col='red')
        lines(preds$data$argvals[inx],  lty=2,
              preds$fitted.values$y.pred[inx] + 1.96*preds$fitted.values$se.fit.p[inx],col='red')
        axis(1,0.5,labels=expression(t[m]))
        
        points(data_test$argvals[inx2],data_test$Y[inx2])
        
        abline(v=0.5,col='grey',lty=2)
        if(i == 4){
                legend("top",c("Observed Data","fcr() Prediction","95% Confidence Interval","95% Prediction Interval"),
                       col=c("black","black","blue","red"),bty="n",pch=c(1,NA,NA,NA), lty = c(NA,1,2,2))
        }
}
```

![Dynamic prediction for four subjects. The solid black line represents the predicted curve, dashed blue and red lines indicate 95% confidence and prediction intervals, respectively. Predictions use only the data up to $t_m = 0.5$ to make predictions.](plot_test_dyn-1.png)