Response modeling methodology (RMM) is a general platform for statistical modeling of a linear or nonlinear relationship between a response variable (dependent variable) and a linear predictor (a linear combination of predictors/effects/factors/independent variables), often denoted the linear predictor function. It is generally assumed that the modeled relationship is monotone convex (delivering a monotone convex function) or monotone concave (delivering a monotone concave function). However, many non-monotone functions, such as the quadratic function, are special cases of the general model.
RMM was initially developed as a series of extensions to the original inverse Box–Cox transformation:
\begin{align}
y &= (1 + \lambda z)^{1/\lambda}, & \lambda &\ne 0;\\
y &= e^{z}, & \lambda &= 0,
\end{align}

where z is a standard normal variate.
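As a minimal sketch, the transformation's behavior across λ can be illustrated in Python (the function name is illustrative, not from the source): λ = 1 yields a linear model, while λ → 0 recovers the exponential case.

```python
import math

def inverse_box_cox(z, lam):
    """Inverse Box-Cox transformation: y = (1 + lam*z)**(1/lam) for
    lam != 0, with y = exp(z) as the lam -> 0 limiting case."""
    if lam == 0:
        return math.exp(z)
    return (1.0 + lam * z) ** (1.0 / lam)

# lam = 1 gives the linear model y = 1 + z; as lam -> 0 the
# transformation approaches the exponential model y = exp(z).
print(inverse_box_cox(0.5, 1.0))    # 1.5
print(inverse_box_cox(0.5, 1e-9))   # ~exp(0.5)
```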
Extensions to the inverse Box–Cox transformation were developed by Shore (2001a[1]) and denoted Inverse Normalizing Transformations (INTs). They were applied to model monotone convex relationships in various engineering areas, mostly to model physical properties of chemical compounds (Shore et al., 2001a, and references therein). Once it was realized that INT models may be perceived as special cases of a much broader general approach for modeling nonlinear monotone convex relationships, the new Response Modeling Methodology was initiated and developed (Shore, 2005a,[2] 2011[3] and references therein).
The RMM model expresses the relationship between a response, Y (the modeled random variable), and two components that deliver variation to Y: the linear predictor (LP),

\eta = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k,

where X_1, …, X_k are regressor-variables and \beta_0, …, \beta_k are parameters; and normal errors, delivering random variation to Y.
The basic RMM model describes Y in terms of the LP, two possibly correlated zero-mean normal errors, ε1 and ε2 (with correlation ρ and standard deviations σε1 and σε2, respectively) and a vector of parameters (Shore, 2005a, 2011):
W = \log(Y) = \mu + \left(\frac{\alpha}{\lambda}\right)\left[(\eta + \varepsilon_1)^{\lambda} - 1\right] + \varepsilon_2,
where ε1 represents uncertainty (measurement imprecision or otherwise) in the explanatory variables (included in the LP), in addition to the uncertainty associated with the response (ε2). Expressing ε1 and ε2 in terms of standard normal variates, Z1 and Z2, respectively, having correlation ρ, and conditioning Z2 | Z1 = z1 (Z2 given that Z1 equals a given value z1), we may write in terms of a single error, ε:
\begin{align}
\varepsilon_1 &= \sigma_{\varepsilon_1} Z_1; \quad \varepsilon_2 = \sigma_{\varepsilon_2} Z_2;\\[4pt]
\varepsilon_2 &= \sigma_{\varepsilon_2}\rho z_1 + (1 - \rho^2)^{1/2}\sigma_{\varepsilon_2} Z = d z_1 + \varepsilon,
\end{align}
where Z is a standard normal variate, independent of both Z1 and Z2, ε is a zero-mean error and d is a parameter. From these relationships, the associated RMM quantile function is (Shore, 2011):
w = \log(y) = \mu + \left(\frac{\alpha}{\lambda}\right)\left[(\eta + cz)^{\lambda} - 1\right] + dz + \varepsilon,
or, after re-parameterization:
w = \log(y) = \log(M_Y) + \left(\frac{a\eta^{b}}{b}\right)\left\{\left[1 + \left(\frac{c}{\eta}\right)z\right]^{b} - 1\right\} + dz + \varepsilon,
where y is the percentile of the response (Y), z is the respective standard normal percentile, ε is the model's zero-mean normal error with constant variance σ, a, b, c and d are parameters, and M_Y is the response median (z = 0), dependent on the values of the parameters and on the value of the LP, η:
\log(M_Y) = \mu + \left(\frac{a}{b}\right)\left[\eta^{b} - 1\right] = \log(m) + \left(\frac{a}{b}\right)\left[\eta^{b} - 1\right],
where μ (or m) is an additional parameter.
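To make the re-parameterized quantile function concrete, here is a minimal Python sketch (the function name and the illustrative parameter values are assumptions, not from the source); the error term ε is omitted, so the function returns the noise-free percentile, and setting z = 0 recovers the median M_Y:

```python
import math

def rmm_quantile(z, eta, m, a, b, c, d):
    """Noise-free RMM quantile: log(y) = log(M_Y)
    + (a*eta**b/b)*((1 + (c/eta)*z)**b - 1) + d*z,
    with log(M_Y) = log(m) + (a/b)*(eta**b - 1)."""
    log_median = math.log(m) + (a / b) * (eta ** b - 1.0)
    bracket = (1.0 + (c / eta) * z) ** b - 1.0
    return math.exp(log_median + (a * eta ** b / b) * bracket + d * z)

# At z = 0 both z-dependent terms vanish and the quantile
# reduces to the median M_Y.
median = rmm_quantile(0.0, eta=4.0, m=1.2, a=0.3, b=0.5, c=0.05, d=0.02)
upper = rmm_quantile(1.0, eta=4.0, m=1.2, a=0.3, b=0.5, c=0.05, d=0.02)
```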
If it may be assumed that cz ≪ η, the above RMM quantile function can be approximated by:
w = \log(y) = \log(M_Y) + \left(\frac{a\eta^{b}}{b}\right)\left[\exp\left(\frac{bcz}{\eta}\right) - 1\right] + dz + \varepsilon.
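Under the stated condition cz ≪ η, the exact bracket is close to its exponential approximation, since [1 + (c/η)z]^b = exp{b log[1 + (c/η)z]} ≈ exp(bcz/η). A quick numerical check, with arbitrarily chosen illustrative values:

```python
import math

# Exact bracket vs. its exponential approximation when c*z << eta.
b, c, eta = 0.5, 0.05, 10.0
gaps = []
for z in (-2.0, -1.0, 1.0, 2.0):
    exact = (1.0 + (c / eta) * z) ** b - 1.0
    approx = math.exp(b * c * z / eta) - 1.0
    gaps.append(abs(exact - approx))
print(max(gaps))   # small, since c*z/eta is at most 0.01 here
```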
The parameter “c” cannot be “absorbed” into the parameters of the LP (η), since “c” and the LP are estimated in two separate stages (as expounded below).
If the response data used to estimate the model contain values that change sign, or if the lowest response value is far from zero (for example, when data are left-truncated), a location parameter, L, may be added to the response so that the expressions for the quantile function and for the median become, respectively:
w = \log(y - L) = \log(M_Y - L) + \left(\frac{a\eta^{b}}{b}\right)\left\{\left[1 + \left(\frac{c}{\eta}\right)z\right]^{b} - 1\right\} + dz + \varepsilon;
\log(M_Y - L) = \mu + \left(\frac{a}{b}\right)\left[\eta^{b} - 1\right].
As shown earlier, the inverse Box–Cox transformation depends on a single parameter, λ, which determines the final form of the model (whether linear, power or exponential). All three models thus constitute mere points on a continuous spectrum of monotonic convexity, spanned by λ. This property, whereby different known models become mere points on a continuous spectrum spanned by the model's parameters, is denoted the Continuous Monotonic Convexity (CMC) property. The latter characterizes all RMM models, and it allows the basic “linear-power-exponential” cycle (underlying the inverse Box–Cox transformation) to be repeated ad infinitum, allowing ever more convex models to be derived. Examples of such models are the exponential-power model and the exponential-exponential-power model (see explicit models expounded further on). Since the final form of the model is determined by the values of the RMM parameters, the data used to estimate the parameters also determine the final form of the estimated RMM model (as with the inverse Box–Cox transformation). The CMC property thus grants RMM models high flexibility in accommodating the data used to estimate the parameters. References given below display published comparisons between RMM models and existing models; these comparisons demonstrate the effectiveness of the CMC property.
Ignoring RMM errors (that is, omitting the terms cz, dz, and ε in the percentile model), we obtain the following RMM models, presented in increasing order of monotone convexity:
\begin{align}
&\text{linear:} & y &= \eta & &(\alpha = 1,\ \lambda = 0);\\[5pt]
&\text{power:} & y &= \eta^{\alpha} & &(\alpha \ne 1,\ \lambda = 0);\\[5pt]
&\text{exponential-linear:} & y &= k\exp(\eta) & &(\alpha \ne 1,\ \lambda = 1);\\[5pt]
&\text{exponential-power:} & y &= k\exp(\eta^{\lambda}) & &(\alpha \ne 1,\ \lambda \ne 1;\ k \text{ is a non-negative parameter}).
\end{align}
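A small sketch (the helper name is an assumption, not from the source) shows how the error-free core model, log(y) = μ + (α/λ)(η^λ − 1), collapses to these special cases as λ and α vary, with λ → 0 recovering the power model continuously:

```python
import math

def rmm_core(eta, mu, alpha, lam):
    """Error-free RMM response: log(y) = mu + (alpha/lam)*(eta**lam - 1),
    with the lam -> 0 limit log(y) = mu + alpha*log(eta)."""
    if lam == 0:
        return math.exp(mu) * eta ** alpha   # power (linear when alpha == 1)
    return math.exp(mu + (alpha / lam) * (eta ** lam - 1.0))

# lam = 0, alpha = 1: linear; lam = 0, alpha != 1: power;
# lam = 1: exponential-linear; lam not in {0, 1}: exponential-power.
linear = rmm_core(3.0, 0.0, 1.0, 0)          # y = eta = 3.0
power = rmm_core(3.0, 0.0, 0.7, 0)           # y = 3.0**0.7
near_power = rmm_core(3.0, 0.0, 0.7, 1e-8)   # continuity across lam -> 0
```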
Adding two new parameters, by substituting for η (in the percentile model) the expression

\exp\left[\left(\frac{\beta}{\kappa}\right)\left(\eta^{\kappa} - 1\right)\right],

repeats the “linear-power-exponential” cycle and delivers models of stronger monotone convexity:
\begin{align}
&\text{exponential-power:} & y &= k\exp(\eta^{\lambda}) & &(\alpha \ne 1,\ \lambda \ne 1,\ \beta = 1,\ \kappa = 0,\ \text{restoring the former model});\\[6pt]
&\text{exponential-exponential-linear:} & y &= k_1\exp[k_2\exp(\eta)] & &(\alpha \ne 1,\ \lambda \ne 1,\ \beta = 1,\ \kappa = 1);\\[6pt]
&\text{exponential-exponential-power:} & y &= k_1\exp[k_2\exp(\eta^{\kappa})] & &(\alpha \ne 1,\ \lambda \ne 1,\ \beta = 1,\ \kappa \ne 1).
\end{align}
This series of monotone convex models, presented in hierarchical order on the “Ladder of Monotonic Convex Functions” (Shore, 2011), is unbounded from above. However, all models are mere points on a continuous spectrum, spanned by the RMM parameters. Note also that numerous growth models, like the Gompertz function, are exact special cases of the RMM model.
The k-th non-central moment of Y is (assuming L = 0; Shore, 2005a, 2011):
\operatorname{E}\left(Y^{k}\right) = (M_Y)^{k}\operatorname{E}\left\{\exp\left\{\left(\frac{k\alpha}{\lambda}\right)\left[(\eta + cZ)^{\lambda} - 1\right] + (kd)Z\right\}\right\}.
Expanding Y^k, as given on the right-hand side, into a Taylor series around zero in terms of powers of Z (the standard normal variate), and then taking expectation on both sides, while assuming that cZ ≪ η so that η + cZ ≈ η, an approximate simple expression for the k-th non-central moment, based on the first six terms in the expansion, is:
\operatorname{E}\left(Y^{k}\right) \cong (M_Y)^{k}\, e^{\,k\alpha\left(\eta^{\lambda} - 1\right)/\lambda}\left\{1 + \frac{1}{2}(kd)^{2} + \frac{1}{8}(kd)^{4}\right\}.
An analogous expression may be derived without assuming cZ ≪ η; the result is more accurate, but lengthy and cumbersome. Once cZ in the above expression is neglected, Y becomes a log-normal random variable (with parameters that depend on η).
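As a numerical sanity check (with illustrative values), note that once cZ is neglected the Z-dependent part of the moment reduces to E[exp(kdZ)] = exp((kd)²/2) for standard normal Z, and the bracketed series 1 + (kd)²/2 + (kd)⁴/8 is just the Taylor expansion of that log-normal factor:

```python
import math

# With cZ neglected, E[exp(k*d*Z)] = exp((k*d)**2 / 2) for standard
# normal Z; the series 1 + (kd)**2/2 + (kd)**4/8 approximates it.
k, d = 2, 0.1
exact_factor = math.exp((k * d) ** 2 / 2.0)
series_factor = 1.0 + (k * d) ** 2 / 2.0 + (k * d) ** 4 / 8.0
print(exact_factor, series_factor)   # close for small k*d
```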
RMM models may be used to model random variation (as a general platform for distribution fitting) or to model systematic variation (analogously to generalized linear models, GLM).
In the former case (no systematic variation, namely, η = constant), the RMM quantile function is fitted to known distributions. If the underlying distribution is unknown, the RMM quantile function is estimated using available sample data. Modeling random variation with RMM is addressed and demonstrated in Shore (2011 and references therein).
In the latter case (modeling systematic variation), RMM models are estimated assuming that variation in the linear predictor (generated via variation in the regressor-variables) contributes to the overall variation of the modeled response variable (Y). This case is addressed and demonstrated in Shore (2005a, 2012 and relevant references therein). Estimation is conducted in two stages. First, the median is estimated by minimizing the sum of absolute deviations (of the fitted model from sample data points). In the second stage, the two remaining parameters (not estimated in the first stage, namely, c and d) are estimated. Three estimation approaches are presented in Shore (2012): maximum likelihood, moment matching and nonlinear quantile regression.
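The first stage can be sketched as follows: a crude grid search over hypothetical parameter values stands in for the nonlinear optimization an actual implementation would use, and only the median model is fitted, by minimizing the sum of absolute deviations:

```python
import math, random

# Simulate data from a known median model, then recover the median
# parameters by minimizing the sum of absolute deviations (SAD).
rng = random.Random(1)
true_m, true_a, true_b = 1.0, 0.5, 1.0
data = []
for _ in range(400):
    eta = rng.uniform(1.0, 5.0)
    w = math.log(true_m) + (true_a / true_b) * (eta ** true_b - 1.0)
    data.append((eta, w + rng.gauss(0.0, 0.1)))   # noisy log-response

def sad(m, a, b):
    """Sum of absolute deviations of the fitted median from the data."""
    return sum(abs(w - math.log(m) - (a / b) * (eta ** b - 1.0))
               for eta, w in data)

# Stage 1: crude grid search for the median parameters (m, a, b);
# stage 2 (not shown) would then estimate c and d.
grid = [(m, a, b) for m in (0.8, 1.0, 1.2)
                  for a in (0.3, 0.5, 0.7)
                  for b in (0.5, 1.0, 1.5)]
best = min(grid, key=lambda p: sad(*p))
print(best)
```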
As of 2021, RMM literature addresses three areas:
(1) Developing INTs and later the RMM approach, with allied estimation methods;
(2) Exploring the properties of RMM and comparing RMM effectiveness to other current modeling approaches (for distribution fitting or for modeling systematic variation);
(3) Applications.
Shore (2003a[5]) developed Inverse Normalizing Transformations (INTs) in the first years of the 21st century and applied them to various engineering disciplines, like statistical process control (Shore, 2000a,[1] b,[6] 2001a,[7] b,[8] 2002a[9]) and chemical engineering (Shore et al., 2002[10]). Subsequently, as the new Response Modeling Methodology (RMM) emerged and developed into a full-fledged platform for modeling monotone convex relationships (ultimately presented in a book, Shore, 2005a), RMM properties were explored (Shore, 2002b,[11] 2004a,[12] b,[13] 2008a,[14] 2011), estimation procedures were developed (Shore, 2005a, b,[15] 2012), and the new modeling methodology was compared to other approaches, both for modeling random variation (Shore 2005c,[16] 2007,[17] 2010;[18] Shore and A’wad 2010[19]) and for modeling systematic variation (Shore, 2008b[20]).
Concurrently, RMM was applied to various scientific and engineering disciplines and compared to current models and modeling approaches practiced therein: for example, chemical engineering (Shore, 2003b;[21] Benson-Karhi et al., 2007;[22] Shacham et al., 2008;[23] Shore and Benson-Karhi, 2010[24]), statistical process control (Shore, 2014;[25] Shore et al., 2014;[26] Danoch and Shore, 2016[27]), reliability engineering (Shore, 2004c;[28] Ladany and Shore, 2007[29]), forecasting (Shore and Benson-Karhi, 2007[30]), ecology (Shore, 2014), and the medical profession (Shore et al., 2014; Benson-Karhi et al., 2017[31]).