The Ballot and the Bazaar

May 8, 2017, 11:55 am R intermediate gastric emptying nonlinear fit nlme blog-header-image

Gastric emptying can be recorded by scintigraphy or by MRI techniques. Scintigraphy returns times series data of the meal volume, MRI gives the total content of the stomach, which includes meal and secretion. To quantify secretion and emptying in clinical studies, non-linear curves are fitted to the time series, which can give strange results. When the ballot fails, the bazaar comes to the rescue.

Gastric volume time series - or short: gastric emptying data

Single linexp curve fit with nls

Single curves fits to gastric emptying data can be computed with function nls from the base package, or, more conveniently, with nlsList from package nlme.

The linexp function is defined as follows (Fruehauf et al. 2011):

linexp = function(minute, v0, tempt, kappa){
  v0 * (1 + kappa * minute / tempt) * exp(-minute / tempt)
}

A first shot at linexp

All non-linear models must be fed with start values to initialize the iterative approximation process. The result of the fit may depend on start values.

start = c(v0 = 400, tempt = 100, kappa = 1.8)
d_nls <- suppressWarnings(nlsList(
  vol ~ linexp(minute, v0, tempt, kappa) | record_id,
  data = d, start = start))

   record  v0 tempt   kappa sigma
1     P01 523    95    0.11    31
2     P02 364    34    1.48    21
3     P03  NA    NA      NA    NA
4     P04 443   139   -0.99    16
5     P05 522   190   -0.70    26
6     P06 464   278   -1.89    25
7     P07 491   811   -5.00    30
8     P08 436 15570 -120.60    21
9     P09  NA    NA      NA    NA
10    P10 425    45    0.94    12
11    P11 443    47    0.94    19
12    P12  NA    NA      NA    NA
13    P13  NA    NA      NA    NA
14    P14 384    44    1.50    11
15    P15 436    79   -0.16    11

Column sigma shows the residual standard deviation — confusingly called standard error in the R summary — and can be used as an indicator for fit quality. No convergence was achieved for 4 records: P03, P09, P12, P13. One coefficient is utterly nonsense: P08, but most negative values of kappa are also erratic.

Why this went wrong: The linexp function was introduced to fit curves with a pronounced initial overshoot due to secretion (Fruehauf et al. 2011), and it gives reasonable values for most record with fat-heavy meals. For almost-linear records without such an overshoot, the exponential function and the linear function term could each explain the curve form alone; when both are forced to work out a compromise, the tug-of war between the two ends miserably. Nevertheless: The residual standard deviation for P08 is smaller than that of all other fits presented in the following, even if only by a small amount.

It's the start value, stupid - sometimes

To be fair: the results are so bad because a start value for kappa = 1.8 was chosen. Let's try again with a more reasonable start value for kappa:

start = c(v0 = 400, tempt = 100, kappa = 0.8)
d_nls_1 <-  suppressWarnings(nlsList(
  vol ~ linexp(minute, v0, tempt, kappa) | record_id,
  data = d, start = start))

   record  v0 tempt kappa sigma
1     P01 523    95  0.11    31
2     P02 364    34  1.48    21
3     P03  NA    NA    NA    NA
4     P04 443   139 -0.99    16
5     P05 516    61  0.56    25
6     P06 447    36  0.92    21
7     P07 468    43  1.05    25
8     P08 418    36  1.08    22
9     P09  NA    NA    NA    NA
10    P10 425    45  0.94    12
11    P11 443    47  0.94    19
12    P12  NA    NA    NA    NA
13    P13  NA    NA    NA    NA
14    P14 384    44  1.50    11
15    P15 436    79 -0.16    11

With a start value of kappa = 0.8, the number of failed convergences remains the same, but there are still some negative estimate of kappa, which are quite certainly wrong.

Fitting log-transformed parameters

One simple way out is to softly restrict kappa to positive values by estimating log(kappa) and log(tempt). "Softly" means that getting close to zero gets more and more difficult. Some authors, e.g. Parker et al. (2016), have hard-limited parameters by imposing fixed numeric constraints; such discontinuities can lead to erratic behaviour of the minimization algorithm and should be avoided.

start_log = c(v0 = 400, logtempt = log(80), logkappa = log(0.8))
linexp_log = function(minute, v0, logtempt, logkappa) {
  tempt = exp(logtempt)
  v0*(1 + exp(logkappa) * minute / tempt) * exp(-minute / tempt)
}
d_nls_log  =   suppressWarnings(nlsList(
  vol ~ linexp_log(minute, v0, logtempt, logkappa) | record_id,
  data = d, start = start_log, warn.nls = TRUE))

   record  v0 logtempt logkappa kappa tempt sigma
1     P01 523      4.6   -2.213  0.11    95    31
2     P02 364      3.5    0.393  1.48    34    21
3     P03  NA       NA       NA    NA    NA    NA
4     P04 440      3.6   -0.579  0.56    36    17
5     P05 516      4.1   -0.580  0.56    61    25
6     P06 447      3.6   -0.083  0.92    36    21
7     P07 468      3.8    0.052  1.05    43    25
8     P08 418      3.6    0.081  1.08    36    22
9     P09  NA       NA       NA    NA    NA    NA
10    P10 425      3.8   -0.057  0.94    45    12
11    P11 443      3.9   -0.057  0.94    47    19
12    P12  NA       NA       NA    NA    NA    NA
13    P13  NA       NA       NA    NA    NA    NA
14    P14 384      3.8    0.406  1.50    44    11
15    P15  NA       NA       NA    NA    NA    NA

Now 5 suspects do not converge: P03, P09, P12, P13, P15. However, kappa and tempt are in a reasonable numerical range and do not scatter widely.

The friendly face of the power exponential

The power exponential function (powexp, Elashoff, Reedy, and Meyer (1982)) is the classical alternative to the linexp function when your gastric emptying curves do not have an initial overshoot due to secretion, but start flat in the limit. This model curve does not suffer from the correlation between tempt and beta that results in erratic single-curve fits for linexp, and it can follow details of the curve's tail better. Data from scintigraphy by design do not have an overshoot, so powexp is the method of choice if no comparison with MRI gastric emptying data is required.

However, when some curves have an overshoot and you need to assess the degree of overshoot, there is no way around using linexp.

powexp = function(minute, v0, tempt, beta) {
  v0 * exp(-(minute / tempt) ^ beta)
}

start = c(v0 = 400, tempt = 100, beta = 0.8)
d_pnls <- nlsList(
  vol ~ powexp(minute, v0, tempt, beta) | record_id,
  data = d, start = start
)

   record  v0 tempt beta sigma
1     P01 525   106 0.97    31
2     P02 386    84 1.95    24
3     P03 493    63 0.84    25
4     P04 437    60 1.18    17
5     P05 514   101 1.15    25
6     P06 451    75 1.38    22
7     P07 478    94 1.45    26
8     P08 423    79 1.61    22
9     P09 447    81 1.03    22
10    P10 429    94 1.42    12
11    P11 442    98 1.55    19
12    P12 394    93 0.85    17
13    P13 494    62 0.96    13
14    P14 404   105 2.32    11
15    P15 438    67 0.99    11

The golden approach: Mixed-model fit

Single curve fits are often used because Matlab license is so cheap for universities, and the straightforeward method in Matlab is the single curve fit. It produces output where nls surrenders to NA; or the analyst resorts to exotic functions with additional parameters (Parker et al. 2016) which often make things worse.

The alternative is not to fit single curves one-by-one, but using common knowledge of all curves in a study to stay within reasonable limits. Keywords you can use to google are "nonlinear mixed model", "shrinkage", "hierarchical model", "borrowing strength" or "population fit"; the latter term, common in pharmacology, is somewhat misleading because it could give the impression that all data are put into the kettle and stirred (= pooled), which is not the case. Shamelessly stealing from Eric S Raymond and the Open Source movement, I call this the "Bazaar", and the single fits the "Ballot":

From <http://menne-biomed.de/gastempt/NGM09Beamer.pdf>

d_nlme = nlme(vol~linexp(minute, v0, tempt, kappa),
     data = d,
     fixed = v0 + tempt + kappa ~1,
     random = v0 + tempt + kappa ~ 1,
     groups = ~record_id,
     start = start,
     control = nlmeControl( pnlsTol = 0.05))
cf_nlme =  coef(d_nlme) %>%  
  rownames_to_column("record")

The function nlme in package nlme is well documented with examples in Pinheiro and Bates (2000). To succesfully fit gastric emptying curves, increasing pnlsTol from its default value of 0.001 is almost always necessary, which is not well documented in the book and on in the web.

   record  v0 tempt kappa
1     P01 507    55  0.72
2     P02 388    42  1.00
3     P03 468    51  0.28
4     P04 448    49  0.21
5     P05 505    54  0.71
6     P06 461    50  0.48
7     P07 476    51  0.79
8     P08 435    47  0.65
9     P09 444    48  0.55
10    P10 431    47  0.86
11    P11 446    48  0.89
12    P12 387    42  0.76
13    P13 477    52  0.24
14    P14 404    44  1.34
15    P15 432    47  0.40

Isn't it a beauty, the above table, when you compare it with the first shot?

There are no outliers both in kappa and in tempt. It cannot always be guaranteed that the process converges for studies with extreme outliers; in case of failure, increasing pnlsTol can help. You may also consider pdDiag in the random term to tame ping-ponging nlme; for details, see the online documentation or the book. A somewhat more complex alternative that to my knowledge always works uses Stan; see this blog item with examples for ¹³C breath test time series.

Why the fuzz?

The figure below shows the the meal volume data overlaid by fitted curves. Most curves overlap, but take into account that some linexp records did not converge. The linexp function with bad start values (red) had a tendency to interpret curves as linear, best seen in P06 to P08, and is quite successfull explaining the data: see column sigma in the table of coefficients.

The mixed-model fits with nlme do not always follows the data as good as single fits. This is the price to be paid for the fewer degrees of freedom that results in shrinkage

Population fits are a compromise between "follow the data of one record" and "follow the concept behind all records". Stable estimates are traded against flexibility.

Take home

If you need an estimate of half-time t50, all models that converge are equivalent, even those that give exotic estimates of parameters.
You could also use a smoothing spline to estimate t50 when records are long enough to cross the half-emptying line. When some record do not cross the 50% volume, you must use a model base fit.
When the power exponential function is adequate, you may use it to fit single curves, but even for the good-humoured power exponential, mixed-model fits give more stable result. In case you are are Matlab-junky from the astronophysics department: no excuse to hide from mixed models, check this.
When you need kappa as a measure of the amount of secretion, you must use the linexp function with the mixed-model approach. kappa is an erratic parameter when estimated from single curves.
If you need the area-under-curve AUC integrated over the whole duration of the curve extrapolated to infinity, you must use linexp, because the AUC over the power exponential function is not always defined. If you insist to compute the AUC over the duration of your record because it is the only descriptor of curves you know: please think it over. This is a no-sense measure and should be banned.
Don't try to extend the two 3-parameter functions with linear tails or similar gimmicks. Using the single-fit approach, it is a criminal act to use parameters from such a fit, because these are erratic. Even if you get results, every parameter you add is difficult to interpret, and between-group p-values are most certainly invalid. If you feel like adding gimmicks, first rule out that the simpler model really would not work.
With mixed-model fits or Stan, it is possible to obtain valid estimates with additional parameters, for example for meals with strong lipid layering. This is advanced stuff for reserved another blog.
In clinical routine, you may have to fit a single curve from a patient. If you want to use the power of borrowing strength to get stable results, add the patient's data to a set that has converged before, and run the procedure on the concatenated set; you can try it out with this Shiny web app. This mix-in approach applies a prior from the population data on your patient's record, and gives results similar to a Bayesian fit with a prior from the fit's posterior; ouff... scrap this, I will try to explain this in a separate blog.

Package gastempt

You can find some pre-packed functions for the methods discussed here in the gastempt package; the examples do not require that the package is loaded. Package features can be demonstrated and tested with your own data with the Shiny web app in the package. The app can be installed on your computer, or tested here.

if (require(gastempt))
  devtools::install_github("dmenne/gastempt")
gastempt::run_shiny()

Zipped source files and data for this blog item

References

Elashoff, J. D., T. J. Reedy, and J. H. Meyer. 1982. “Analysis of Gastric Emptying Data.” Gastroenterology 83 (6): 1306–12.

Fruehauf, H., D. Menne, M. A. Kwiatek, Z. Forras-Kaufman, E. Kaufman, O. Goetze, M. Fried, W. Schwizer, and M. Fox. 2011. “Inter-Observer Reproducibility and Analysis of Gastric Volume Measurements and Gastric Emptying Assessed with Magnetic Resonance Imaging.” Neurogastroenterol Motil 23 (9). Division of Gastroenterology; Hepatology, University Hospital Zurich, Switzerland Menne Biomed, Tuebingen, Germany.: 854–61. doi:10.1111/j.1365-2982.2011.01743.x.

Parker, H. L., E. Tucker, C. L. Hoad, A. Pal, C. Costigan, N. Hudders, A. Perkins, et al. 2016. “Development and Validation of a Large, Modular Test Meal with Liquid and Solid Components for Assessment of Gastric Motor and Sensory Function by Non-Invasive Imaging.” Neurogastroenterol Motil 28 (4). Department of Gastroenterology, St. Claraspital, Basel, Switzerland.: 554–68. doi:10.1111/nmo.12752.

Pinheiro, Jose C., and Douglas M. Bates. 2000. Mixed-Effects Models in S and S-Plus. Springer.

Next Post Previous Post