I used Claude Code to get a second opinion on my MRI

Posted by engmarketer | 2 hours ago | 194 comments

sxg 2 hours ago

I'm a radiologist but can't really weigh in without seeing the full 3D MRI dataset. Regarding this point:

> They performed shockwave therapy on my shoulder even though a recent clinical practice guideline says clinicians should not use or recommend shockwave therapy for rotator-cuff tendinopathy without calcification; I was told during ultrasound that there was no calcification.

Ultrasound isn't a great way to assess for calcification. It'll find large calcifications but easily miss small ones. A plain radiograph would be more helpful, but the MRI may have revealed it as well. Either way, shockwave therapy isn't harmful in the absence of calcification--it's just not helpful.

Edit: when a radiology report says something isn't present, there's always an implicit caveat that the finding isn't present within the context of the modality and images obtained. So an ultrasound report can state there are no calcifications while a plain radiograph can report the presence of calcifications without being inconsistent. Obviously very confusing to patients and people unfamiliar with medical jargon, but clarifying this in reports would make them sound even more qualified, "hedgey", and annoying to read than they already are.

AceJohnny2 an hour ago

> There's something incredibly peaceful about being in the hands of an expert you trust. [...] AI can absolutely shatter that feeling in an uncomfortable way [...] but I don't know if I can fully trust AI either.

This really is key. We know we can't trust the AI, but at the same time we're also more comfortable asking the AI for clarifications or confronting it. Not having a time-bound appointment or paying by the hour helps a lot. But even then, more information doesn't necessarily help!

I once brought my 11-year-old car, a Civic with 150k miles, to multiple garages. I figured I'd play the "second opinion" game to correlate what the garages recommended to decide on what needed to be done...

I got 3 completely unrelated recommendations, including one that I knew was invalid. I felt worse off than when I started!

The solution to uncertain information isn't more information, which the AI can certainly provide, it's better information, and AI cannot currently provide that.

hennell 2 hours ago

Personally, my favourite feature of the new AI world is not when I use it directly but when one of my managers uses it to try to fix a problem, then hands me their findings, and I have to defend my process to someone who understands neither my process, their suggested solution, nor often the problem they're solving in the first place.

throwforfeds an hour ago

I've seen a lot of friends and family members almost immediately get offered surgery for shoulder pain. It's just often the default for people that do surgeries for a living.

I also had a pretty painful shoulder issue at one point, where the pain just wasn't subsiding for months. I tried massages and acupuncture as I didn't want to do surgery, but it wasn't helping at all. The thing that fixed it for me was just really focusing on doing pull-ups. I couldn't do them at all when I started, so I began with dead hangs and scapular pull-ups, eventually progressing to regular pull-ups, and then training with a "grease-the-groove" method once I could get a few per set. I stopped the training schedule once I was getting in around 17 pull-ups per set, and now just do 6 sets of about 7-8 pull-ups, spaced throughout the day, 3x per week. I'll also do some shoulder mobility drills [1].

Whenever I get lazy about keeping up with them, the discomfort inevitably starts coming back, but it goes away once I get back to strengthening.

[1] https://www.youtube.com/watch?v=vP8YmmRMz6I

idopmstuff 13 minutes ago

> It might seem obvious to coders, but the difference between Claude Code and Claude.ai's chat is enormous, even if those two run the same model.

In my experience, Claude Code is vastly better for doing tasks, writing code, etc., but Claude.ai is better for analysis and high-level planning. When I'm working on a new project, I've started using the latter to do the initial planning, get feedback and draw up a spec, which then goes to Claude Code.

For this project, I probably would've done something similar - use CC to get whatever you need out of the image files, but have Claude.ai do the actual review/diagnosing.

Either way, I often think about how far behind most of the world is in really understanding AI. The overwhelming majority of people would never guess that you get vastly different outcomes from the exact same model in a different harness (tbf most people don't know what a harness is). I spend hours every day using AI for a broad range of tasks and still feel like I know a fraction of what there is to know. I haven't even tried the new GLM model (or really any of the open source Chinese ones of the most recent generation). With so many people thinking that the free version of ChatGPT is SOTA AI, a lot of folks are in for a very rude awakening at some point soon.

nostrebored an hour ago

I don’t understand the negative reactions. Medical care as it exists requires the doctor and patient to have their brains switched on. I’ve almost never had a problem where a doctor provides me with a diagnosis and I go about my day. Most of the times that I have, I’ve been confident about the problem and known what I needed. The doctor was a barrier to accessing care.

Dr. GPT is a good brainstorming tool. It helps synthesize information in a way that primary texts don’t. But it does force you to say “that doesn’t make sense”.

I do think that people saying “doctors don’t know the state of the art” have a weaker case. If you think about it in terms of token density during pretraining and how post training datasets are constructed, I think it would take us a very long time to adapt to any fundamental shifts. If we have forgotten how to cure scurvy, how many journal articles would it take before we adapt to a discovery?

rasmus1610 an hour ago

As a radiologist I have found Claude and ChatGPT to be absolutely terrible at MRI and I would not trust them one bit. They have their merits if you need to research stuff that is more text-based, but radiological images are just something they cannot interpret well enough (yet).

jeswin 2 hours ago

I would not trust AI on images. But I once had ChatGPT tell me that an MRI report was very likely to be incorrect based on the text, and offered a different diagnosis. Since it was semi insisting, I visited another doctor who made me do a retest. Long story short, ChatGPT was correct.

Again, this is just one single person's experience. So not worth much.

ricardobayes 2 hours ago

That might be doctors' new nightmare: people who second-guess everything with AI. Previously it was "google your symptoms".

dwa3592 21 minutes ago

Was it 2016 when Geoff Hinton said that radiology was a dead career?

Well, we now have the best model of our time (trillions of $$$ of investments) telling us something completely different (and wrong) from a human expert. I would really like someone to call out Dario, Sam, and Elon on these things and hear their explanations, but alas, a man can only dream.

linsomniac an hour ago

~2 years ago I used ChatGPT "deep research" to investigate a chronic sinus infection I'd been fighting for ~3 years. After seeing 3 GPs and 3 visits with an ENT, I fed all the observations I had into the AI. In particular, I couldn't get the ENT to explain why he visually saw, via a scope, evidence of allergic reaction in my sinuses, but then later concluded, after an allergy test, that it couldn't be treated via allergy medication. I asked this question a few times and he just never answered.

ChatGPT surfaced an NIH study that concluded that 20% of people have allergic reactions that are isolated to a body location, which shoulder "skin prick" testing may not reveal. I asked him about that and he said "that's not how allergies work". Full stop. He was unwilling to even look at the study.

He prescribed a CPAP and regular nebulizer treatments. Side story: the CPAP place sent me an SMS message that I couldn't tell apart from a phishing attempt, and when I reached out to ask who they were, they never replied.

So I decided: Let me just try taking a second-gen allergy tablet every day and see what happens.

My sinus infections have gone away. Previously I was getting a major sinus infection at least quarterly. Maybe he's right that allergies don't work that way, but allergy tablets have absolutely solved my problem. Which I'm thankful for because I tried a CPAP for a solid month a few years ago and I just could not get used to it, and was sleeping like crap.

hectdev 35 minutes ago

My only issue with this was that the restriction "Do not look at any data outside of our working folder" prevents the tool from doing what it does best. I would have given it access to PubMed to pull the latest research on the subject and validate.

I wouldn't consider Claude itself to be the tool that does a job like this, but the tool that pulls in the best data and gives a supported suggestion. And then I'd go through a number of iterations on where it failed, to hone its assessment.

dazhbog an hour ago

You should always be getting a second or third opinion from real doctors for matters like surgeries, radiology, etc.

One doctor's diagnosis + an LLM is gonna throw you off. You need more data points.

lycos 11 minutes ago

I'm surprised about the 266 MB of DICOM images, I've never had an MRI but my CT results are generally between 1-2GB (zipped) and I always assumed an MRI would have more data, guess I was wrong about that!
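
For a rough sense of scale, here is a back-of-the-envelope sketch of raw pixel data size. The matrix sizes, bit depth, and slice counts below are illustrative assumptions (not taken from the article or from any particular scanner), and real DICOM files add headers and may be compressed.

```python
def uncompressed_mib(rows, cols, bits_per_pixel, slices):
    """Raw pixel data size in MiB, ignoring DICOM headers and compression."""
    bytes_total = rows * cols * (bits_per_pixel // 8) * slices
    return bytes_total / (1024 * 1024)


# Hypothetical study shapes, chosen only for illustration.
# Assumption: a 256x256 MRI matrix and a 512x512 CT matrix, both 16-bit.
mri_mib = uncompressed_mib(rows=256, cols=256, bits_per_pixel=16, slices=2000)
ct_mib = uncompressed_mib(rows=512, cols=512, bits_per_pixel=16, slices=2000)

print(f"MRI: {mri_mib:.0f} MiB, CT: {ct_mib:.0f} MiB, ratio: {ct_mib / mri_mib:.0f}x")
```

Under these assumptions a CT slice carries four times the pixel data of an MRI slice, so a smaller MRI export is at least plausible. Actual studies vary widely in matrix size and slice count.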

gaolei8888 a minute ago

I have already done that several times, and I found the comments from ChatGPT/Claude to be absolute bullshit.

rapatel0 13 minutes ago

You shouldn’t expect frontier models to work on medical imaging. There is much more that goes into building a medical imaging product. First and foremost is data. Medical imaging datasets are not prevalent on the public internet at the scale necessary for good performance on medical imaging tasks, especially MRI. Also, the labels are super noisy. This is completely different from asking for general medical reasoning, which is derived more from papers, public standards, and textbooks. Text exists at the right scale but images don’t.

eqvinox an hour ago

> My hope is that in a couple of model generations, we'll trust AI to review MRIs the way we trust it to proofread our emails.

https://www.nature.com/articles/d41586-026-01947-1

I've started asking my doctors whether they use AI, and if they say yes, I look for another one.

TSiege 2 hours ago

Always worth a share for this scenario. It's not clear if LLMs are capable of doing actual analysis on medical imaging. For details see this article https://futurism.com/artificial-intelligence/frontier-models...

> As detailed in a new, yet-to-be-peer-reviewed paper, a team of researchers at Stanford University found that frontier AI models readily generated “detailed image descriptions and elaborate reasoning traces, including pathology-biased clinical findings, for images never provided.”

> In other words, the AI models happily came up with answers to questions about a supposedly accompanying image — even if the researchers never even showed it an image.

> As opposed to hallucinations, which involve AI models arbitrarily filling in the gaps within a logical framework, the team coined a new term for the phenomenon: “mirage reasoning.”

> The effect “involves constructing a false epistemic frame, i.e., describing a multi-modal input never provided by the user and basing the rest of the conversation on that, therefore changing the context of the task at hand,” the researchers wrote in their paper.

> The damning findings suggest AI models cheat by diving into the data they were given — and coming up with the rest based on probability, even if it’s almost entirely conjecture.

lucfranken an hour ago

Why wouldn’t you, as a doctor, run the images through a certified, compliant LLM as standard practice? The cost can’t be the obstacle, and then you can see if you get any new ideas from it: whether it’s just wrong, or whether it spotted a little detail you missed.

The LLM doesn’t need to be leading or whatever, but then you can have a conversation with the patient. If their ChatGPT report differs, that can be analyzed as well.

It feels like the time constraint of the 15m doctor sessions is the thing. But if prepared immediately after the scan then why not?

There is always time needed to factor in new developments and innovations, and that’s fine. Just blindly moving work from human to LLM is wrong. But learning from and testing all the AI tools that keep arriving won’t be a waste. There will be more and more tools in those processes outside of human judgement, so it’s better to improve the workflows now to be able to test and plug in new models and systems when they are ready.

LogicFailsMe an hour ago

I did the same exercise here with medical reports and CT scans for a friend's cancer diagnosis, and I got ahead of the oncologists in predicting that my friend was about to be cured. Spoilers: yep, cancer free now.

And well, yes, I have the appropriate life science degrees to navigate clinical trial reports and research publications, and that was likely indispensable for steering Claude Code where it went, so the radiologist's caution is merited here. But it's just not amateur hour for me to do this; it's 2 decades of academic research in my rearview mirror.

Aeolun 2 hours ago

I would not use Claude to get a second opinion on anything that’s an image.

darepublic an hour ago

I would like it if we could have a site where you submit your MRI and doctor commenters anonymously post their opinions. In general, I want a forum where, when people come with questions for which there are varying opinions, we don't just have people leave their 2c and then jet. The thread persists, duplicated ideas get merged, erroneous statements get purged, and gradually we refine shining truth.

intoXbox 2 hours ago

Radiologists very often have to weigh up different theories and guidelines based on the symptoms. The certainty of their diagnosis is their added value, or if they don’t know, they will tell you why.

An AI telling you it could be X or Y because theory ABC… is the academic answer and a luxury clinicians don’t have. AI doesn’t give you what you want. I don’t see any added value in using generic AI models for this.

jochem9 2 hours ago

Right now the article reads as "AI can play doctor if you give it MRI scans".

If the author actually went for a second opinion (maybe bringing along the AI to let it explain its findings), then the article could read as "AI did MRI analysis and proved my doctor wrong" (or: "AI did MRI analysis and failed").

skybrian 2 hours ago

Getting an actual second opinion seems like the next step?

Gareth321 15 minutes ago

I have had terrible experiences with medical professionals. Especially the experienced/senior/specialists. First, they just don't have the time to thoroughly research my medical history. Second, they are often arrogant and resistant to any kind of critical questions. They have an apparently unwavering belief that they are correct. In fairness, they probably usually are, but they are not infallible, and they are at their weakest when it comes to the edge cases.

AI is completely without ego, and can process all my medical records in minutes. In truth, even today, I would rather have an AI analyse my records.

VladVladikoff 2 hours ago

Hey OP, my wife had a subscap tear and went through with surgery. Recovery was ROUGH; she couldn’t use that arm at all for almost two months. It’s amazing how much this can cripple a person. We don’t realize how much we use both our hands in our daily lives until one is gone, even for basic stuff like cooking, bathing, etc. If you can avoid surgery, you should. Try doing the Buckburger 12 (spelling?) shoulder physiotherapy regimen. You’ll need to even if you get surgery, but this can help with tendinopathy. Also try to identify what is causing the repetitive stress and cut back on that activity.

mootothemax an hour ago

Can any LLM give you the rough pixel coordinates of an item it identifies in an image?

I found that while Claude, GPT etc could describe an image, there was no way to link the description back to specific pixels in the image itself. Not even to a bounding box or segment.

terzioglubaris an hour ago

Hey, glad you did that. I did the exact same thing last week, but the radiologist's interpretation and Claude's interpretation were pretty much the same! You want my doctor's number? lol

davikr an hour ago

You can try sending basic chest radiographs to GPT and it'll fail at interpretation. I'd be wary of premature conclusions.

mistic92 2 hours ago

I have used Gemini 3.1 Pro through CLI to analyze my DICOM images. It gave me the same diagnosis as the radiologists. But it was just an interesting test.

fabioz 2 hours ago

I wouldn't trust anything from Claude here image-wise (maybe it's reasonable for a 2nd opinion on the report itself and the treatment). But also, in cases where there is something serious, go to at least 2 different doctors, and if they have different opinions, go to a 3rd for a deciding vote, besides doing your own research (it's not that uncommon for hard cases to be badly diagnosed).

cityofdelusion 30 minutes ago

It's very interesting how people trust LLMs in domains they know little about.

Instead, it is my experience with LLMs in a domain that I know very well that makes me skeptical of their performance across the board. I find issues in code review multiple times a day with their output, and they are explicitly and extensively trained on this use case, unlike with the MRI data. Sometimes I veer into other domains I have decent knowledge about (construction, carpentry, landscaping) and LLMs disappoint me there as well.

I suppose Gell-Mann amnesia is a universal human quirk and not restricted to just the news.

quacked an hour ago

The thing that annoys me about AI discourse is that AI is a mathematical technique of rapidly increasing efficacy, and yet everyone personifies it. It would help if every time someone said "AI" they supplemented "a mathematical method where extensions onto a very large corpus of information are statistically simulated".

It's not true that "AI makes mistakes" or "ChatGPT is sycophantic". It's just that sometimes the simulated extensions to the training material are accurate, and sometimes they're not.

lutusp 25 minutes ago

> There's something incredibly peaceful about being in the hands of an expert you trust. You don't have to worry anymore and can let them guide you through the process.

> AI can absolutely shatter that feeling in an uncomfortable way ...

I see this as a field report in a time of fundamental transition, from a world without AI, to one that accommodates/incorporates AI. For this to happen, AI will need to become more trustworthy. As for the U.S. medical system, it can't get much worse.

I recently had a similar experience (meaning walking a fence between old and new methods), where I was told I could get an appointment with a human medical practitioner in nine months. So, to resolve my anxiety I consulted AI and got an instant diagnosis, one that was later confirmed by the inaccessible medics.

Being a born skeptic I wasn't going to act on AI's diagnosis, I just wanted to know what was going on, resolve some uncertainty. Another advantage: an AI chatbot doesn't say, "Wait, you're on Medicare? Hmm. See you in nine months."

Don't take this as an endorsement of AI's diagnostic abilities -- it's way too soon for that. In my case it was a slam dunk, about a condition I knew nothing about.

neilv 2 hours ago

This could be a starting point for consulting a different human expert for a second opinion (e.g., specific questions to ask about), but I wouldn't put much trust in Claude alone on this.

IME, on an almost daily basis, claude.ai and Claude Code are confidently wrong about something, and use polished language to assert nonsense.[*]

If it's doing that on something easy, like factual knowledge available in text on the Internet, or programming code that can be inspected easily and follows well-known rules, and I can tell, because I understand those things... then there's no way I'm going to assume that Claude doesn't also BS when it comes to someone else's field. Especially not a field that requires some of the smartest people to go through a decade of training, just to get started in the field.

[*] And if I confront Claude with its mistakes, eventually it apologizes, and acts as if it's learned something, again mimicking word patterns it's heard real people use and mean, without meaning any of it. I wonder whether the AI user experience would be better, if LLM-ish interfaces weren't implicitly created in the image of fake-it-till-you-make-it overconfident performative sociopathic techbros.

late2part 2 hours ago

If you have 2 clocks you have none.

simianwords 2 hours ago

Everyone is talking about how doctors know better or have some context that is not shown here.

But are you all forgetting that they literally injected the author with a homeopathic drug?

Between that and Claude sometimes hallucinating, it’s probably worth encouraging patients to always get a second opinion.

Kapura an hour ago

I asked a bird about my father's potential prostate cancer. It gave extremely good advice.