Discussion of the statistics of the NEDU study on the redistribution of decompression stop time from shallow to deep stops


The fact that one or both profiles did not behave according to pre-trial predictions is a legitimate outcome of the study: it does not invalidate it in any way.

Simon M


According to this list of "10 Elements of Good Experimental Design", a good experiment:


* Includes a control for comparison
* Can be reproduced by other scientists to give similar results


The VVAL18 model profile (A1) was that control. But the control in this test proved to be neither a reliable nor a repeatable baseline.


VVAL18 was a contemporary and well-used model within NEDU testing. It has the backing and calibrations of the NEDU's extensive man-tested database, and was an extension of the previous (Thalmann) model and work. It was used to make the new Mk rebreather tables.


The A1 profile is an ordinary, simple ascent that was reportedly used exactly as the model intended. The depth/time profile is within the norms of previous NEDU testing and development, and within established table test limits, calibrations, and saved data points.

I point out that a condition of this test was that both profiles have the same run time and the same predicted risk (iso-risk). To achieve that, they manipulated the A2 profile to fit it into the same envelope as the A1 profile.
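As a toy sketch of the run-time half of that constraint (the schedules below are invented for illustration, not the trial's A1/A2; matching predicted risk would additionally require a pDCS model), redistributing stop time under a fixed total just moves minutes between depths:

# Toy sketch of the 'same run time' constraint. These schedules are
# invented for illustration and are NOT the trial's A1/A2 profiles.
shallow = {6: 100, 9: 30, 12: 20}          # depth (m) -> stop minutes
deep    = {6: 50, 9: 40, 12: 30, 21: 30}   # minutes pushed toward deeper stops

assert sum(shallow.values()) == sum(deep.values())   # equal total stop time
print("total stop minutes:", sum(shallow.values()))  # -> 150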


So why did the observed injury rate of the VVAL18 model profile stray so far off course, beyond its tested/predicted pDCS range? Was the baseline defective in some way, or was the prediction off for some reason? This absence of a reliable baseline calibration in the test needs to be explained.
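For readers who want to sanity-check a claim like this one, an exact binomial test shows how consistent an observed DCS count is with a model-predicted pDCS. All numbers in this sketch are illustrative placeholders, not figures from the report:

# Sketch: is an observed DCS count consistent with a model-predicted pDCS?
# All numbers below are illustrative placeholders, not report values.
from scipy.stats import binomtest

predicted_pdcs = 0.05   # hypothetical model prediction for the schedule
observed_dcs = 3        # hypothetical observed DCS cases
man_dives = 192         # hypothetical number of man-dives

result = binomtest(observed_dcs, man_dives, predicted_pdcs)
print(f"observed incidence: {observed_dcs / man_dives:.3f}")
print(f"exact binomial p-value vs prediction: {result.pvalue:.3f}")
# A small p-value would mean the observed rate is hard to reconcile with
# the prediction; a large one means the data are consistent with it.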


The question remains: if the NEDU cannot reproduce the baseline control data points (as shown in this test), then the experiment was not the iso-risk comparison they say it is.

 
Well you can start with the fact that all of those hypothetical profiles were still skewed to mandate a 174-minute total dive time.

So, you believe that shorter decompressions are safer than longer decompressions?

Simon M
 
Regarding the test being cut short, for complete accuracy, the report describes that the trial was stopped because the data satisfied one of the protocol's conditions (i.e. the interim one-sided Fisher test with α = 0.05 gave p=0.0489), so according to this criterion, it had to stop.
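For anyone who wants to see the mechanics of that test, here is a minimal sketch in Python. The counts used are the final incidences published in the 2011 report (3 DCS in 192 man-dives on A1, 11 in 198 on A2; they are not quoted in this thread), not the exact interim counts behind the quoted p = 0.0489, so the printed value will differ slightly:

from scipy.stats import fisher_exact

# One-sided Fisher exact test of DCS incidence, deep stops (A2) vs shallow (A1).
# Counts are the final incidences published in the 2011 report; the interim
# analysis that stopped the trial used the counts available at that time.
a1_dcs, a1_ok = 3, 192 - 3      # A1 (shallow stops): 3 DCS in 192 man-dives
a2_dcs, a2_ok = 11, 198 - 11    # A2 (deep stops): 11 DCS in 198 man-dives

table = [[a2_dcs, a2_ok],
         [a1_dcs, a1_ok]]

# alternative='greater' asks whether A2's incidence exceeds A1's
_, p_value = fisher_exact(table, alternative='greater')
print(f"one-sided p = {p_value:.4f}")   # below the 0.05 stopping threshold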
 
The DMOs were not blinded to the study, and were aware of the profile each subject had dived before making their assessments.

As has been pointed out earlier in this thread, it is unlikely this introduced systematic bias, and if it did, such biases usually favour the beliefs of the people conducting the study, which in this case were that deep stops would be better.

The A1 baseline test data, was ...just short of triggering an automatic test rejection.

Is it your new approach to simply keep repeating stuff even if it is demonstrably wrong?

The low rejection criterion was there to prevent futility. That is, if both profiles fell below 3% DCS incidence, the trial was not powered (big enough) to show a difference between profiles with such a low incidence of DCS.
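A rough simulation illustrates why a sub-3% incidence makes the trial futile. The group size and the incidence pairs below are illustrative assumptions, not the protocol's values:

import numpy as np
from scipy.stats import fisher_exact

# Simulated power of a one-sided Fisher test at alpha = 0.05. The group
# size and the incidence pairs are illustrative, not protocol values.
rng = np.random.default_rng(0)

def power(p1, p2, n=200, sims=2000, alpha=0.05):
    """Fraction of simulated trials in which p2 > p1 is detected."""
    hits = 0
    for _ in range(sims):
        k1 = rng.binomial(n, p1)    # DCS cases on the control schedule
        k2 = rng.binomial(n, p2)    # DCS cases on the test schedule
        _, p = fisher_exact([[k2, n - k2], [k1, n - k1]], alternative='greater')
        hits += p < alpha
    return hits / sims

print("both low (1% vs 2.5%):", power(0.010, 0.025))   # weak power
print("higher   (3% vs 7%):  ", power(0.030, 0.070))   # substantially better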

As David said when he corrected you on RBW:

"We had stopping rules if both schedules had unexpectedly high or low risk, which were likely to result in severe DCS or an inconclusive result, respectively. We never came close to these (the figure presented in an earlier post is misinterpreted)".

The study report also clarifies this:

These rules were to limit exposure of divers to the risk of severe DCS and to limit the potentially inconclusive testing of two low-risk dive profiles.

The test was cut short because of the differences in the two data points and, potentially, to avoid the pending rejection to come.

Ah! "Differences in the two data points"! What you really mean is 'a difference in the DCS rates between the two profiles'. Still, a slight change in position as the lights go on and certainly different to:

It's my opinion the test stopped early to salvage what they could from an expensive test procedure that was about to be scrapped.

Therefore, the often-repeated claim by some - "...that the IRB cut the test because of an excess injury rate" - is false.

Well, despite your use of quotation marks, no one actually put it exactly that way, Ross, but nevertheless it is not false. The IRB mandated that further testing and episodes of DCS (that is, further injuries) after the question was answered to a pre-defined level of statistical significance at the midpoint analysis would constitute unnecessary exposure of subjects to injury. The trial would be terminated at that point to avoid excess injury. Read what Joseph said above. The protocol conditions he mentions were negotiated with the IRB. Read what David told you. Read the protocol. How many times do you want to hear it?

You were actually the one who introduced the term "excess injury" to this thread. And in response I put context around your words as follows:

Dr Simon Mitchell:
Excess injury in this context (and from the perspective of the IRB) meant any further cases of DCS beyond those required to achieve an answer to the question. Thus, if the trial had continued after the experimental end point had been reached, that would constitute excessive injury in the eyes of the IRB.

This is exactly what happened, and it is not false.

Simon M
 
This reminds me of reruns of the Andy Griffith Show. It's sort of amusing, for a while, to look back at how people were portrayed back then. But in 2018, Mr. Griffith has been gone for nearly six years, the youngest cast member ("Ronnie" Howard) is himself 64 years of age and a prominent film director, and there are only so many reruns you can watch before some of the characters become so tiresome that you want to change the channel whenever the show pops up.

Dr. Mitchell is possessed of seemingly infinite patience, and we are lucky to have him as a member. Simon, thank you for all the time you spend here, particularly on this topic, which, like the Andy Griffith Show, seems to rerun over and over and...
 
So, you believe that shorter decompressions are safer than longer decompressions?

Simon M

In the case of A2....yes.
 
Regarding the test being cut short, for complete accuracy, the report describes that the trial was stopped because the data satisfied one of the protocol's conditions (i.e. the interim one-sided Fisher test with α = 0.05 gave p=0.0489), so according to this criterion, it had to stop.

Hi Joseph,

Yes, it does say that... written after the test was done. We have no guarantee that was a decision point or value before the test. And even if it was a real planning point, its purpose was to act as a guard rail to avoid the impending rejection, and it highlights that the profiles were not an equal match in the first place.

What happened during the test? All persons involved were acutely aware of the running totals and where they were headed. No sane person would let such an expensive undertaking run off the rails.

Like I said, I think they salvaged this test from imminent rejection, and the reason quoted above is the excuse.

 
Yes, it does say that... written after the test was done. We have no guarantee that was a decision point or value before the test. And even if it was a real planning point, its purpose was to act as a guard rail to avoid the impending rejection, and it highlights that the profiles were not an equal match in the first place.

With any study, I can only discuss/comment on what the report states (in this case, with regard to the adopted protocol); otherwise I'd be making an assumption.
 
According to this list of "10 Elements of Good Experimental Design", a good experiment:

* Includes a control for comparison
* Can be reproduced by other scientists to give similar results

The VVAL18 model profile (A1) was that control. But the control in this test proved to be neither a reliable nor a repeatable baseline.

So the A1 profile can't be replicated? Also, Ross, when discussing science, let's please not quote experimental design guidelines from a Middle School web page.
 
With any study, I can only discuss/comment on what the report states (in this case, with regard to the adopted protocol); otherwise I'd be making an assumption.

This study was reported three times. In the original version, as reported at the 2008 UHMS decompression workshop, the experimental method makes NO mention of the complex midpoint test. Here is the page below. The only criterion shown for stopping the test is the 3 to 7% rejection boundaries.

[Attached image: IMAG0004.JPG - scanned page of the experimental method from the 2008 workshop report]
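For readers curious how 3 to 7% rejection boundaries can work in practice, here is a generic sketch - not the NEDU protocol's actual decision rule - that stops when an exact Clopper-Pearson interval for a schedule's running incidence clears a boundary:

from scipy.stats import beta

# Generic sketch of 3%/7% stopping boundaries using exact Clopper-Pearson
# intervals. This is NOT the NEDU protocol's actual decision rule.
def clopper_pearson(k, n, conf=0.95):
    """Exact binomial confidence interval for k events in n trials."""
    a = 1 - conf
    lo = 0.0 if k == 0 else beta.ppf(a / 2, k, n - k + 1)
    hi = 1.0 if k == n else beta.ppf(1 - a / 2, k + 1, n - k)
    return lo, hi

def boundary_check(k, n, low=0.03, high=0.07):
    lo, hi = clopper_pearson(k, n)
    if hi < low:
        return "stop: incidence credibly below 3% (futility)"
    if lo > high:
        return "stop: incidence credibly above 7% (excess risk)"
    return "continue testing"

print(boundary_check(1, 300))    # hypothetical running totals
print(boundary_check(30, 250))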


The elaborate midpoint test limits were only added in the 2011 report. So which version do we believe? The original technical version, or the version re-written for public consumption?

Note that this study had already served its purpose for the USN back in 2008, when they decided on the new modified VVAL18 for deco in rev 6 of their dive manual (2008). So what purpose does it serve to publish an old test that has been re-formatted and elaborated for public use in 2011? You can see why we should be skeptical of new reasons added to old reports.

 