Possible statistical problems of the original paper titled "Predictors of difficult intubation": a bad start
Statistics plays a very important role in an original scientific article.
It not only supports the authors' conclusions but also lends credibility to the paper.
Additionally, statistics, which has its basis in mathematics, is essential in modern science. It therefore provides a concrete basis for an original scientific article in medicine.
Basically, there are two kinds of tests of the null hypothesis (H0) in medical statistics. One is a test to show a difference, whereas the other is a test to demonstrate equivalence (a test for no difference) between the control and experimental groups.
For the difference test, a null hypothesis (H0) must be stated. The null hypothesis usually asserts that there is no difference between the two groups and that, if a difference is nevertheless observed, it originates from sampling variance [1].
To ensure that the two sampled groups do not differ at baseline, the variance arising from sampling should be minimized. In a clinical study, this is achieved by random sampling. As the sample size increases under random sampling, the differences in demographic data between the groups become smaller. This means that the two groups should not differ in any factor that could affect the results of the experiment [1].
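The effect of sample size on baseline balance can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population (age drawn from a normal distribution with mean 50 and standard deviation 15), the function name `mean_age_gap`, and the number of trials are all hypothetical and are not taken from any study discussed here.

```python
import random
import statistics


def mean_age_gap(n_per_group, trials=200, seed=0):
    """Average absolute difference in mean age between two randomly formed groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    gaps = []
    for _ in range(trials):
        # Hypothetical population: age ~ Normal(50, 15). This is an assumption.
        ages = [rng.gauss(50, 15) for _ in range(2 * n_per_group)]
        rng.shuffle(ages)  # random allocation into two groups
        group_a = ages[:n_per_group]
        group_b = ages[n_per_group:]
        gaps.append(abs(statistics.fmean(group_a) - statistics.fmean(group_b)))
    return statistics.fmean(gaps)


small = mean_age_gap(10)    # 10 subjects per group
large = mean_age_gap(1000)  # 1000 subjects per group
print(f"mean age gap, n=10 per group:   {small:.2f} years")
print(f"mean age gap, n=1000 per group: {large:.2f} years")
```

Under these assumptions, the average gap in mean age shrinks as the groups grow, which is the sense in which randomization with a larger sample reduces baseline differences.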
The next step is to calculate a probability (P value): the probability of obtaining a difference between the groups at least as large as the one observed if the null hypothesis were true. If it is below 5% (P < 0.05), such a difference would occur very seldom when there is truly no difference between the groups. Hence, it is reasonable to reject the null hypothesis, and we can say that there is a statistically significant difference between the groups [1].
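One way to make the meaning of the P value concrete is a permutation test, which estimates how often a difference at least as large as the observed one arises when the group labels are assigned at random. The following is a minimal sketch, not the method of any paper cited here; the function name `permutation_p_value` and the two sets of measurements are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation P value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Under H0 the group labels are exchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical measurements, invented for illustration only.
control = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]
treated = [13.0, 12.9, 13.4, 12.8, 13.1, 13.3, 12.7, 13.2]

p = permutation_p_value(control, treated)
print(f"P = {p:.4f}; reject H0 at the 5% level: {p < 0.05}")
```

With these invented data the two groups do not overlap, so very few random relabelings reproduce a difference as large as the observed one and the estimated P value falls well below 0.05.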
It is now time to review the paper "Predictors of difficult intubation defined by the intubation difficulty scale (IDS): predictive value of 7 airway assessment factors" by Seo et al. [2]. The aim of the paper is to find the preoperative airway risk factors that predict difficult intubation more efficiently.
The two groups are divided according to the intubation difficulty score, which was measured by the experimenter who performed the endotracheal intubation. The demographic differences between the two groups are shown in Table 3 of the aforementioned paper. Three factors (age, intubation duration, and lowest SaO2 level) already differ significantly between the two groups. From this information, what kind of null hypothesis could be built? To state a null hypothesis, there should be no differences in the demographic data between the two groups. If the groups differ in their demographic data, it is very hard to rule out the influence of those factors on the result.
Additionally, as the paper itself shows, there is a significant relationship between the criterion that divides the population into two groups and the results themselves. There are many clinical tests for predicting a possible airway risk, and these tests are widely accepted as useful in predicting possible difficulty in endotracheal intubation. Most of them appear in this paper. Considering the link between the criterion by which the two groups are divided and the experimental results, it is easy to infer that the results will differ greatly between the two groups.
Finally, Table 5 shows that the total airway score is the most useful factor in predicting difficult intubation, followed by the upper lip biting test. However, most tests produce both false positive and false negative results. Therefore, it is very hard to say that one test is better than another solely because of its high sensitivity. Additionally, the authors should explain why they used multiple logistic regression analysis in Table 5. Odds ratios are rarely used to assess the usefulness of a clinical test.
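The point about sensitivity can be illustrated with the standard measures derived from a 2 x 2 table. The sketch below uses invented counts for two hypothetical tests ("A" and "B"); none of the numbers come from Seo et al. [2], and the function name `diagnostic_summary` is an assumption made for illustration.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 table of test result vs. outcome."""
    return {
        "sensitivity": tp / (tp + fn),        # true positives among difficult cases
        "specificity": tn / (tn + fp),        # true negatives among easy cases
        "ppv": tp / (tp + fp),                # positive predictive value
        "npv": tn / (tn + fn),                # negative predictive value
        "odds_ratio": (tp * tn) / (fp * fn),  # diagnostic odds ratio
    }


# Hypothetical counts: 20 difficult and 180 easy intubations for each test.
test_a = diagnostic_summary(tp=18, fp=60, fn=2, tn=120)
test_b = diagnostic_summary(tp=14, fp=10, fn=6, tn=170)

for name, result in (("A", test_a), ("B", test_b)):
    line = ", ".join(f"{key}={value:.2f}" for key, value in result.items())
    print(f"test {name}: {line}")
```

In this invented example, test A has the higher sensitivity, yet it produces many more false positives and has a lower positive predictive value than test B, so sensitivity alone does not settle which test is more useful.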