### Introduction

### Multicollinearity

_{1} and X_{2}. In other words, exact collinearity occurs if one variable determines the other (e.g., X_{1} = 100 − 2X_{2}). If such a relationship exists among more than two explanatory variables (e.g., X_{1} = 100 − 2X_{2} + 3X_{3}), the relationship is defined as multicollinearity. Under multicollinearity, more than one explanatory variable is determined by the others. However, collinearity or multicollinearity does not need to be exact to be present: a strong, though inexact, linear relationship is enough to produce significant collinearity or multicollinearity. The coefficient of determination (*R*^{2}) is the proportion of the variance in a response variable that is predicted by the regression model built upon the explanatory variable(s). However, the *R*^{2} from a multiple linear regression model in which one explanatory variable serves as the response and the remaining explanatory variables serve as predictors can also be used to measure the extent of multicollinearity among the explanatory variables: *R*^{2} = 0 represents the complete absence of multicollinearity between explanatory variables, while *R*^{2} = 1 represents exact multicollinearity between them. The removal of one or more explanatory variables involved in exact multicollinearity does not cause any loss of information from a multiple linear regression model.
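This diagnostic can be sketched directly: regress one explanatory variable on the rest and inspect the resulting *R*^{2}. The snippet below is a minimal illustration on synthetic data (the data, the seed, and the helper `r_squared` are assumptions for illustration, not from the source).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(10, 2, n)
x3 = rng.normal(5, 1, n)
# X1 is almost determined by X2 and X3: strong, but not exact, multicollinearity
x1 = 100 - 2 * x2 + 3 * x3 + rng.normal(0, 0.5, n)

def r_squared(y, X):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

# R^2 of X1 regressed on X2 and X3 is close to 1,
# signalling that X1 is nearly a linear function of the other two
r2 = r_squared(x1, np.column_stack([x2, x3]))
print(round(r2, 3))
```

With an exact relationship (noise term removed), the same computation would return *R*^{2} = 1.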

### Variance Inflation Factor

Because the variance of an estimated regression coefficient is proportional to the variance inflation factor 1 ⁄ (1 − *R*^{2}), which increases with *R*^{2} (0 ≤ *R*^{2} ≤ 1), *R*^{2} = 0 (complete absence of multicollinearity) minimizes the variance of the regression coefficient of interest, while *R*^{2} = 1 (exact multicollinearity) makes this variance infinite (Fig. 1). The reciprocal of the variance inflation factor (1 − *R*^{2}) is known as the tolerance. If the variance inflation factor exceeds 5 to 10, or equivalently the tolerance falls below 0.2 to 0.1 (*R*^{2} = 0.8 to 0.9), multicollinearity exists. Although the variance inflation factor helps to determine the presence of multicollinearity, it cannot identify the explanatory variables that are causing it.
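The variance inflation factor and tolerance for each explanatory variable can be computed as sketched below, again on assumed synthetic data (the data and the `vif` helper are illustrative, not from the source); each variable is regressed on the remaining ones and VIF = 1 ⁄ (1 − *R*^{2}) is evaluated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.9 * x2 + 0.1 * rng.normal(size=n)  # x1 strongly collinear with x2
X = np.column_stack([x1, x2, x3])

def vif(X):
    """Variance inflation factor of each column of X."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

vifs = vif(X)
tolerances = 1.0 / vifs  # tolerance = 1 - R^2
# x1 and x2 exceed the VIF cut-off of 5-10 (tolerance below 0.2-0.1); x3 does not
print(np.round(vifs, 2), np.round(tolerances, 3))
```

Note that, consistent with the text, a large VIF flags *that* multicollinearity exists, but not *which* of the intercorrelated variables should be considered the culprit.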

### Condition Number and Condition Index

*λ*). Eigenvalues close to 0 indicate the presence of multicollinearity, in which the explanatory variables are highly intercorrelated and even small changes in the data lead to large changes in the regression coefficient estimates. The square root of the ratio between the maximum eigenvalue (λ_{max}) and each eigenvalue (λ_{1}, λ_{2}, … , λ_{k}) is referred to as the condition index:

Condition index_{i} = √(λ_{max} ⁄ λ_{i}), i = 1, 2, … , k
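A minimal sketch of this computation on assumed synthetic data (the data, seed, and column scaling are illustrative assumptions): eigenvalues of the scaled cross-product matrix X′X are extracted, and each condition index is the square root of the largest eigenvalue divided by that eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)  # nearly collinear pair
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2, x3])  # design matrix with intercept

Xs = X / np.linalg.norm(X, axis=0)       # scale columns to unit length
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)  # eigenvalues of the scaled X'X
cond_index = np.sqrt(eigvals.max() / eigvals)

# the largest condition index corresponds to the near-zero eigenvalue
# produced by the almost collinear pair x1, x2
print(np.round(np.sort(cond_index)[::-1], 1))
```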

### Variance Decomposition Proportion

### Strategies to Deal with Multicollinearity

### Numerical Example

*Liver regeneration rate* = 232.797 + 0.221 × *PVV ⁄ GW* − 0.050 × *PSV ⁄ GW* + 0.690 × *EDV ⁄ GW* + 4.083 × *HVV ⁄ GW* − 0.905 × *GW ⁄ SLV* − 37.594 × *GRWR* (*R*^{2} = 0.682, P < 0.001)

*R* = 0.886) between them (Table 2). However, the second strongest correlation, between PSV/GW and EDV/GW (*R* = 0.841; Table 2), does not seem to cause multicollinearity in the multiple linear regression model. Although the variance inflation factor of PSV/GW (4.948) is lower than 5, it is very close to that threshold. In addition, only one of their variance decomposition proportions corresponding to the condition index of 11.938 is over 0.8. However, if the cut-off value of the variance decomposition proportion for the diagnosis of multicollinearity is set to 0.3 according to the work of Liao et al. [6], the two explanatory variables are multicollinear. Therefore, excluding GW/SLV from the regression model is justified, but whether PSV/GW should be removed from the model is less clear.

*Liver regeneration rate* = 209.393 + 0.392 × *PVV ⁄ GW* + 1.006 × *EDV ⁄ GW* + 4.410 × *HVV ⁄ GW* − 68.832 × *GRWR* (*R*^{2} = 0.669, P < 0.001)

The slight decrease in the coefficient of determination (*R*^{2}) from 0.682 to 0.669 indicates a negligible loss of information.
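The same phenomenon, where dropping an explanatory variable that is nearly determined by the others barely changes the model's *R*^{2}, can be demonstrated on synthetic data (the data, seed, and `fit_r2` helper below are assumptions for illustration; they do not reproduce the liver regeneration dataset).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 - x2 + 0.05 * rng.normal(size=n)  # x3 is nearly redundant given x1, x2
y = 2 * x1 + x2 + rng.normal(size=n)

def fit_r2(y, X):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

r2_full = fit_r2(y, np.column_stack([x1, x2, x3]))     # model with x3
r2_reduced = fit_r2(y, np.column_stack([x1, x2]))      # x3 removed
# the two R^2 values are nearly identical: removing the redundant
# variable costs almost no explanatory power
print(round(r2_full, 3), round(r2_reduced, 3))
```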