Info: Don't use AI (like ChatGPT) for planning a dive


Oh, yeah, this is TOTALLY fun! Didn't mean to imply otherwise.

But the most important thing that we should ALWAYS keep in mind is that an LLM (as opposed to other AI applications) is designed NOT to give accurate information, but to CONFIDENTLY give information with excellent linguistic style, better than what most humans are capable of.

And that's the dangerous combination: a lack of underlying accuracy paired with a surface polish that leads anyone without their own expert knowledge to assume the content is correct.

Pretty much the same thing happened when websites came out. People would look at a well-designed site and just assume that if it looked good, it was probably correct.

[Attached image: ChatGPT vs Consultants.jpg]
 
I sense that several in this thread disagree, but I think this is a fun exploration of AI/LLMs and how you can and cannot use them currently: how they can be helpful, and how they can lead you astray. I appreciate those who modeled the dives to show, as expected, that the provided plan was not good.

However, it is interesting that it does a great job of showing what a dive plan could look like. In other words, it provides an overview of what a decompression dive might look like, as long as you don't look too closely. It gives the 'shape' of a dive plan and sorta uses the right words. The issue, of course, is that the plan is completely made up, and no actual calculations are being performed. To be fair, from the beginning it said that the plan was just an example and that actual dive plans may vary.

I went back to my Claude chat and made a few more prompts. Thought I would share in case you are interested:

My prompt: […]

Claude: […]

My next prompt: […]

Claude: […]

So Claude wouldn't dive its own plan. As I said earlier, I would never use the current AI/LLMs to generate 'facts', only to edit text and generate broad concepts. It's like a person who has read every textbook and manual in the world but has no actual experience with anything. Perhaps it's self-evident to some, but this dive plan example provides a good illustration.

I also wonder if I could use Claude to make a Python script to calculate deco times, but I do not have the energy for that.

I mused with GPT-4-based Matlab and Python scripts for both the Bühlmann compartment model and Wienke's RGBM. It was frustrating, to say the least; most of the time was spent at "I am so sorry for the confusion, here is...", and of course every time it crashed and restarted the thread it gave a completely different answer, utter crap at that. Although GPT has some nice capabilities for writing and checking small pieces of code, I wouldn't use it for more than having fun at the results it produces.
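For the curious, the core compartment math is actually simple enough to sketch by hand. Here's a minimal Python sketch of Haldanean tissue loading on a square profile; the half-times are the familiar first few Bühlmann ZH-L16 nitrogen values, but treat every number as illustrative rather than as a validated model (it ignores descent/ascent time and alveolar water vapor):

```python
import math

# Minimal sketch of Haldanean tissue loading on a square (constant-depth)
# profile. Half-times are the familiar first few Buhlmann ZH-L16 nitrogen
# values; all numbers are illustrative, not a validated model.

HALF_TIMES_MIN = [5.0, 8.0, 12.5, 18.5, 27.0]  # N2 half-times (minutes)
FN2 = 0.79          # nitrogen fraction in air
P_SURFACE = 1.0     # surface pressure, bar

def tissue_tensions(depth_m: float, time_min: float) -> list[float]:
    """N2 tension (bar) in each compartment after time_min at depth_m."""
    p_amb = P_SURFACE + depth_m / 10.0   # ~1 bar per 10 m of seawater
    p_insp = FN2 * p_amb                 # inspired N2 pressure at depth
    p0 = FN2 * P_SURFACE                 # start saturated at the surface
    tensions = []
    for t_half in HALF_TIMES_MIN:
        k = math.log(2) / t_half
        # Haldane equation: exponential approach toward inspired pressure
        tensions.append(p_insp + (p0 - p_insp) * math.exp(-k * time_min))
    return tensions

if __name__ == "__main__":
    for t_half, p in zip(HALF_TIMES_MIN, tissue_tensions(30.0, 25.0)):
        print(f"{t_half:5.1f} min compartment: {p:.3f} bar N2")
```

That much an LLM can usually regurgitate; it's the bookkeeping across a whole multi-level, multi-gas plan where it falls apart.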
 
I agree about the problem. Their responses 'sound' correct, but you cannot trust the details, and therefore there is no way to distinguish right from wrong. It is essentially hallucinating the whole thing; it just hallucinates correctly much of the time.

This is how it works in my world/career also. If I ask for anything detailed in an area where I am a content expert, it provides broad concepts but falls apart with the details. I was curious about how it would respond to the dive plan, and the result is consistent. It's also true for both ChatGPT and Claude. I wondered if Claude would do better, but obviously not.

I have learned that if it does math, you can see it calculating, often actually showing the Python code. But even then it's wrong sometimes. I have uploaded data sets and asked it to do statistics and graphs (ChatGPT), and it is usually but not always correct. I have had better luck asking it to write code for me, which I run separately.

They predict GPT-5 this summer or fall, so it will be interesting to test it again.
 
3. Oxygen Toxicity Units (OTU) formula:
- OTU = (PPO2 - 0.5) × Time (in minutes) / 30
This is, in a nutshell, why you should never take anything produced by an LLM at face value.

Let's take a look at some actual formulas for OTU. This is from an Erik Baker paper that was originally hosted on Shearwater's site. It's been removed, but it's still available via the Wayback Machine. There's a discussion of it, along with an even more complex formula, at 'A few thoughts on Oxygen Toxicity' on The Theoretical Diver.

[Attached image: otu.jpg, showing the OTU formulas from Baker's paper]
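For reference, the single-exposure OTU formula from the Baker paper, as I recall it (verify against the links above), is a power law, not anything linear:

\[
\mathrm{OTU} = t \left( \frac{P_{\mathrm{O}_2} - 0.5}{0.5} \right)^{5/6},
\qquad P_{\mathrm{O}_2} > 0.5\ \text{atm},\ t\ \text{in minutes}
\]

So Claude's linear "(PPO2 − 0.5) × Time / 30" doesn't even have the right shape, let alone the right constants.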
 
Claude didn't even get this one right. Pretty sad.

Had it gotten it right, that would have been by chance, not by design.

They are currently trying to offload more precise tasks to non-LLM, non-AI engines. But without knowing what triggers such an offload, the utility will be small: you won't know whether the result comes from the LLM, which sounds right without any guarantee, or from a more precise engine that has correctness guarantees, whatever its other flaws may be.

Note that you have the same issues with humans; certification is one mechanism intended to address them.
 
I think that machine learning approaches (i.e. AI, but not the LLM version of AI) have tremendous potential to advance dive safety and decompression planning.

Since there is no ethical way of prospectively generating stress or injury data in human subjects, we are stuck using retrospective data. There are animal models, but these are expensive, and there is no guarantee that those will map well to human divers.

Deterministic models start with an assumption that a given degree of supersaturation is unsafe, an algorithm is developed, and then over time the model is refined by evidence of actual injury rates. Probabilistic models start with the data (depth, time, injury rate, other modifying factors) and derive a rule from that dataset.

We could benefit from a uniform project to accurately and comprehensively capture real world data. That could become an AI training dataset, which could go a long way towards improving the tools that we use to minimize decompression injury.
 
I asked Google's Gemini a simple question: what is the NDL for a dive to 30 m? The answer is complete nonsense, padded with some non-liability crap about how one should get training, blah blah blah:

"No-decompression limits (NDLs) for recreational diving are typically not provided for depths exceeding 21 meters (70 feet) due to safety concerns. It's highly recommended to consult decompression tables or a dive computer for dives deeper than 21 meters, as these tools account for factors like depth, time, and ascent rate to provide a more personalized and safe NDL.
For dives within the recreational diving limit (typically 40 meters), dive computers are preferred over decompression tables due to their ability to factor in real-time changes in depth and ascent rate. If you're planning a dive, it's crucial to have proper training and certification for the intended depth and to always follow recommended dive safety practices."
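For contrast, here's roughly where a real NDL comes from: it's the time until the leading tissue compartment reaches its surfacing M-value. A single-compartment Python sketch, where the half-time and M-value are hypothetical, picked only so the example runs; real models track 16+ compartments and take the most limiting one:

```python
import math

# Illustrative single-compartment NDL: the time until the compartment's N2
# tension reaches its surfacing M-value. The M-value here is hypothetical;
# this is not a dive-planning tool.

FN2 = 0.79
P_SURFACE = 1.0  # bar

def ndl_minutes(depth_m, half_time_min, m0_bar):
    """Return the no-decompression limit in minutes, or None if unlimited."""
    p_amb = P_SURFACE + depth_m / 10.0
    p_insp = FN2 * p_amb            # inspired N2 at depth
    p0 = FN2 * P_SURFACE            # starting tension, saturated at surface
    if p_insp <= m0_bar:
        return None                 # tension never exceeds the M-value
    k = math.log(2) / half_time_min
    # Solve the Haldane equation for the time the tension hits m0_bar
    return -math.log((m0_bar - p_insp) / (p0 - p_insp)) / k

if __name__ == "__main__":
    # A 20-minute compartment with an assumed surfacing M-value of 2.0 bar
    # gives roughly 20 minutes at 30 m -- at least the right ballpark.
    print(f"{ndl_minutes(30.0, 20.0, 2.0):.1f} min")
```

Twenty-ish minutes is in line with what the usual tables give for 30 m, which makes Gemini's "no NDLs beyond 21 meters" answer all the more absurd.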
 
Getting the data would be nice. But once you have it, there are few enough parameters that there is no obvious advantage to using AI instead of standard statistical methods such as regression analysis.
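To make that concrete, here's a minimal sketch of the kind of standard approach I mean: plain logistic regression over exposure records. Everything here is synthetic and the field choices (depth, bottom time) are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical exposure records: max depth (m), bottom time (min), and a
# binary DCS outcome. The outcome is synthesized here just so the example
# runs; real data would come from dive logs plus medical follow-up.
depth = rng.uniform(10, 50, size=1000)
time_ = rng.uniform(10, 60, size=1000)
risk = 1.0 / (1.0 + np.exp(-(0.05 * depth + 0.03 * time_ - 5.0)))
dcs = rng.binomial(1, risk)

# Plain logistic regression: P(DCS) as a function of depth and time.
X = np.column_stack([depth, time_])
model = LogisticRegression().fit(X, dcs)

# Predicted probability for a 30 m / 25 min exposure:
print(model.predict_proba([[30.0, 25.0]])[0, 1])
```

With only a handful of parameters like this, a fitted regression is transparent and testable in a way a black-box model is not.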

For dive computers, you do have to keep in mind that you need to end up with an algorithm that can be executed by a small device and is valid across the entire population of divers, with no more than a couple of user-controllable tweaks.

For treatment of DCS, you can assume unlimited computer power, but you won't be able to use black-box outputs from an AI model to treat patients. If you can't reduce the results to a standard algorithm that can be tested, you'll never get approval from the FDA or whoever is in charge of liability exposure for your practice.
 

Not sure that's true. The advantage of big-data approaches and machine learning is that sometimes you come up with associations that would never have been found by human beings looking at numbers and applying standard mathematical techniques.

But we are getting pretty far from my area of expertise, so maybe I'm wrong.
 
