Influence of systolic blood pressure on coronary heart disease
Source:vignettes/Framingham_heart_study.Rmd
Framingham_heart_study.Rmd
In this tutorial, we re-do the analysis from two previous
publications that use the same data set. The data set contains the heart
disease status for a number of patients, along with repeated
measurements of their systolic blood pressure and their smoking status.
The data is also included in the package, simply type
?framingham
for more details.
head(framingham)
#> disease sbp1 sbp2 smoking
#> 1 0 0.043556319 -0.24144440 0
#> 2 0 -0.088249125 -0.03871185 1
#> 3 0 0.063634980 0.02971774 1
#> 4 0 -0.331791943 -0.44704230 1
#> 5 0 0.006502333 -0.09760337 1
#> 6 0 0.124884993 0.09793482 0
The two publications are Bayesian analysis of measurement error models using INLA, Muff et al (2015) and Reverse attenuation in interaction terms due to covariate measurement error, Muff & Keller (2015).
First example: A logistic regression model with repeated measurements
Error types | Likelihood | Response | Covariate with error | Other covariate(s) |
---|---|---|---|---|
Classical | Binomial | disease |
sbp1 , sbp2
|
smoking |
The data and model in this example was also used in Muff et al (2015), so more
information on the measurement error model can be found there. The model
is identical, this example just shows how it can be implemented in
inlamemi
.
In this example, we fit a logistic regression model for whether or
not a patient has heart disease, using systolic blood pressure (SBP) and
smoking status as covariates. SBP is measured with error, but we have
repeated measurements, and so we would like to feed both measurements of
SBP into the model. This can be done easily in the inlamemi
package.
The formula for the main model of interest will be and the formula for the imputation model will be
In addition, we of course also have the classical measurement error model that describes the actual error in the SBP measurements, and since we have repeated measurements we actually have two: where are the measurement error terms.
We can then call the fit_inlamemi
function directly with
the above formulas for the model of interest and imputation model. Also
note the repeated measurements argument, which must be set to
TRUE
to ensure that the model is specified correctly. We
give the precision for the error term of the measurement error model a
prior, and the error term for the imputation model a prior. By default
in R-INLA, the fixed effects are given Gaussian priors with mean
and precision
.
We re-assign the precisions to be
,
but keep the means at 0 (therefore they are not specified in the
control.fixed
argument).
framingham_model <- fit_inlamemi(formula_moi = disease ~ sbp + smoking,
formula_imp = sbp ~ smoking,
family_moi = "binomial",
data = framingham,
error_type = "classical",
repeated_observations = TRUE,
prior.prec.classical = c(100, 1),
prior.prec.imp = c(10, 1),
prior.beta.error = c(0, 0.01),
initial.prec.classical = 100,
initial.prec.imp = 10,
control.fixed = list(
prec = list(beta.0 = 0.01,
beta.smoking = 0.01,
alpha.0 = 0.01,
alpha.smoking = 0.01)))
Once the model is fit we can look at the summary.
summary(framingham_model)
#> Formula for model of interest:
#> disease ~ sbp + smoking
#>
#> Formula for imputation model:
#> sbp ~ smoking
#>
#> Error types:
#> [1] "classical"
#>
#> Fixed effects for model of interest:
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> beta.0 -2.3615605 0.2704869 -2.8934484 -2.3610451 -1.8325971 -2.3610396
#> beta.smoking 0.3993941 0.2994675 -0.1875007 0.3992733 0.9869779 0.3992731
#>
#> Coefficient for variable with measurement error and/or missingness:
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> beta.sbp 1.89742 0.5598976 0.8403958 1.882701 3.043096 1.817019
#>
#> Fixed effects for imputation model:
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> alpha.sbp.0 0.01454298 0.01858279 -0.02190741 0.01454298 0.05099338 0.01454298
#> alpha.sbp.smoking -0.01958476 0.02156436 -0.06188353 -0.01958476 0.02271401 -0.01958476
#>
#> Model hyperparameters (apart from beta.sbp):
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> Precision for sbp classical model 75.90183 3.690514 68.89237 75.81344 83.41967 75.63985
#> Precision for sbp imp model 19.89455 1.234091 17.55057 19.86501 22.40742 19.82450
plot(framingham_model)
For a comparison, we can also fit a βnaiveβ model, that is, a model that ignores the measurement error in SBP. For this model, we will take an average of the two SBP measurements and use that as the SBP variable.
framingham$sbp <- (framingham$sbp1 + framingham$sbp2)/2
naive_model <- inla(formula = disease ~ sbp + smoking,
family = "binomial",
data = framingham)
naive_model$summary.fixed
#> mean sd 0.025quant 0.5quant 0.975quant mode kld
#> (Intercept) -2.3547258 0.2692636 -2.8824728 -2.3547258 -1.8269788 -2.3547258 0
#> sbp 1.6686897 0.4937443 0.7009686 1.6686897 2.6364107 1.6686897 0
#> smoking 0.3972557 0.2992211 -0.1892069 0.3972557 0.9837183 0.3972557 0
Then we can compare the estimated coefficients from both models.
naive_result <- naive_model$summary.fixed
rownames(naive_result) <- c("beta.0", "beta.sbp", "beta.smoking")
naive_result$variable <- rownames(naive_result)
me_result <- rbind(summary(framingham_model)$moi_coef[1:6],
summary(framingham_model)$error_coef[1:6])
me_result$variable <- rownames(me_result)
results <- dplyr::bind_rows(naive = naive_result, me_adjusted = me_result, .id = "model")
ggplot(results, aes(x = mean, y = model, color = variable)) +
geom_point() +
geom_linerange(aes(xmin = `0.025quant`, xmax = `0.975quant`)) +
facet_grid(~ variable, scales = "free_x") +
theme_bw()
Second example: Heteroscedastic measurement error and interaction between error variable and error free variable
Error types | Likelihood | Response | Covariate with error | Other covariate(s) |
---|---|---|---|---|
Classical (with interaction effects) | Binomial | disease |
sbp1 , sbp2
|
smoking |
In this example, we artificially increase the measurement error for the smoking group in order to study a model with homoscedastic measurement error. The model in this example is identical to that in Muff & Keller (2015)
framingham2 <- framingham
n <- nrow(framingham2)
set.seed(1)
framingham2$sbp1 <- framingham$sbp1 +
ifelse(framingham2$smoking == 1, rnorm(n, 0, 0.117), 0)
framingham2$sbp2 <- framingham$sbp2 +
ifelse(framingham2$smoking == 1, rnorm(n, 0, 0.117), 0)
# Or:
#framingham2 <- read.table("../data-raw/fram_data_case2.txt", header=T)
#names(framingham2) <- c("disease", "sbp1", "sbp2", "smoking")
#framingham2$sbp <- (framingham2$sbp1 + framingham2$sbp2)/2
error_scaling <- c(1/(1+9*framingham2$smoking), 1/(1+9*framingham2$smoking))
# Homoscedastic ME modeled as homoscedastic ME (correct)
framingham_model2.1 <- fit_inlamemi(formula_moi = disease ~ sbp:smoking,
formula_imp = sbp ~ smoking,
family_moi = "binomial",
data = framingham,
error_type = "classical",
repeated_observations = TRUE,
prior.prec.classical = c(100, 1),
prior.prec.imp = c(10, 1),
prior.beta.error = c(0, 0.01),
initial.prec.classical = 100,
initial.prec.imp = 10,
control.fixed = list(
prec = list(beta.0 = 0.01,
beta.smoking = 0.01,
alpha.0 = 0.01,
alpha.smoking = 0.01)))
# Heteroscedastic ME modeled as heteroscedastic ME (correct)
framingham_model2.2 <- fit_inlamemi(formula_moi = disease ~ sbp:smoking,
formula_imp = sbp ~ smoking,
family_moi = "binomial",
data = framingham2,
error_type = "classical",
repeated_observations = TRUE,
classical_error_scaling = error_scaling,
prior.prec.classical = c(100, 1),
prior.prec.imp = c(10, 1),
prior.beta.error = c(0, 0.01),
initial.prec.classical = 100,
initial.prec.imp = 10,
control.fixed = list(
prec = list(beta.0 = 0.01,
beta.smoking = 0.01,
alpha.0 = 0.01,
alpha.smoking = 0.01)))
# Heteroscedastic ME modeled as homoscedastic ME (incorrect)
framingham_model2.3 <- fit_inlamemi(formula_moi = disease ~ sbp:smoking,
formula_imp = sbp ~ smoking,
family_moi = "binomial",
data = framingham2,
error_type = "classical",
repeated_observations = TRUE,
prior.prec.classical = c(100, 1),
prior.prec.imp = c(10, 1),
prior.beta.error = c(0, 0.01),
initial.prec.classical = 100,
initial.prec.imp = 10,
control.fixed = list(
prec = list(beta.0 = 0.01,
beta.smoking = 0.01,
alpha.0 = 0.01,
alpha.smoking = 0.01)))
framingham_model2.5 <- fit_inlamemi(formula_moi = disease ~ sbp:smoking,
formula_imp = sbp ~ smoking,
family_moi = "binomial",
data = framingham2,
error_type = "classical",
repeated_observations = TRUE,
classical_error_scaling = c(rep(10^(-12), 2*n)),
prior.prec.classical = c(100, 1),
prior.prec.imp = c(10, 1),
prior.beta.error = c(0, 0.01),
initial.prec.classical = 100,
initial.prec.imp = 10,
control.fixed = list(
prec = list(beta.0 = 0.01,
beta.smoking = 0.01,
alpha.0 = 0.01,
alpha.smoking = 0.01)))
plot(framingham_model2.1)
plot(framingham_model2.2)
plot(framingham_model2.3)
plot(framingham_model2.5)
framingham_model2.1$summary.hyperpar
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> Precision for the Gaussian observations[2] 75.926745 3.691419 68.9229967 75.835774 83.453911 75.651639
#> Precision for the Gaussian observations[3] 19.899106 1.235852 17.5768533 19.860949 22.440894 19.785229
#> Beta for beta.smokingsbp -1.550385 1.368556 -4.1568580 -1.579337 1.229324 -1.706516
#> Beta for beta.sbp 3.018276 1.192594 0.6152684 3.036915 5.310360 3.117344
framingham_model2.2$summary.hyperpar
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> Precision for the Gaussian observations[2] 157.132846 7.224567 143.347610 156.981638 171.786833 156.7113407
#> Precision for the Gaussian observations[3] 22.107856 1.647459 19.052409 22.043246 25.535423 21.9076225
#> Beta for beta.smokingsbp -1.321276 1.812718 -5.541954 -1.081464 1.323648 0.1774153
#> Beta for beta.sbp 3.475219 1.546659 1.236920 3.262221 7.082559 2.1712585
framingham_model2.3$summary.hyperpar
#> mean sd 0.025quant 0.5quant 0.975quant mode
#> Precision for the Gaussian observations[2] 45.173090 2.188453 40.9850002 45.131480 49.599358 45.072754
#> Precision for the Gaussian observations[3] 19.188177 1.278490 16.8159894 19.138396 21.847723 19.023909
#> Beta for beta.smokingsbp -1.556081 1.567443 -4.3841723 -1.633877 1.760159 -2.006828
#> Beta for beta.sbp 3.224484 1.369344 0.3427539 3.287009 5.715341 3.584224
framingham_model2.5$summary.hyperpar
#> mean sd 0.025quant 0.5quant 0.975quant
#> Precision for the Gaussian observations[2] 7.411258e+02 27.2247363 688.7973955 7.406868e+02 795.9749930
#> Precision for the Gaussian observations[3] 1.012133e+01 3.2153919 5.0781874 9.696652e+00 17.5970850
#> Beta for beta.smokingsbp 4.122008e-04 0.2990914 -0.5888122 5.530046e-04 0.5888194
#> Beta for beta.sbp 5.987715e-04 0.2823957 -0.5553716 6.067459e-04 0.5565229
#> mode
#> Precision for the Gaussian observations[2] 7.399408e+02
#> Precision for the Gaussian observations[3] 8.921509e+00
#> Beta for beta.smokingsbp 1.134864e-03
#> Beta for beta.sbp 6.396683e-04