Cursor IDE support hallucinates lockout policy, causes user cancellations (reddit.com)
1263 points by scaredpelican 17 hours ago | 464 comments
Earlier today Cursor, the magical AI-powered IDE, started kicking users off when they logged in from multiple machines.

Like, you’d be working on your desktop, switch to your laptop, and all of a sudden you're forcibly logged out. No warning, no notification, just gone.

Naturally, people thought this was a new policy.

So they asked support.

And here’s where it gets batshit: Cursor has a support email, so users emailed them to find out. The support person told everyone this was “expected behavior” under their new login policy.

One problem: there was no support team. It was an AI designed to 'mimic human responses'.

That answer, totally made up by the bot, spread like wildfire.

Users assumed it was real (because why wouldn’t they? It's their own support system lol), and within hours the community was in revolt. Dozens of users publicly canceled their subscriptions, myself included. Multi-device workflows are table stakes for devs, and if you're going to pull something that disruptive, you'd at least expect a changelog entry or smth.

Nope.

And just as people started comparing notes and figuring out that the story didn’t quite add up… the main Reddit thread got locked. Then deleted. Like, no public resolution, no real response, just silence.

To be clear: this wasn’t an actual policy change, just a backend session bug, and a hallucinated excuse from a support bot that somehow did more damage than the bug itself.

But at that point, it didn’t matter. People were already gone.

Honestly one of the most surreal product screwups I’ve seen in a while. Not because they made a mistake, but because the AI support system invented a lie, and nobody caught it until the userbase imploded.






There is a certain amount of irony that people try really hard to say that hallucinations are not a big problem anymore and then a company that would benefit from that narrative gets directly hurt by it.

Of course they’re going to try to brush it all away. Better than admitting that this problem very much still exists and isn’t going away anytime soon.


https://www.anthropic.com/research/tracing-thoughts-language...

The section about hallucinations is deeply relevant.

Namely, Claude sometimes provides a plausible but incorrect chain-of-thought reasoning when its “true” computational path isn’t available. The model genuinely believes it’s giving a correct reasoning chain, but the interpretability microscope reveals it is constructing symbolic arguments backward from a conclusion.

https://en.wikipedia.org/wiki/On_Bullshit

This empirically confirms the “theory of bullshit” as a category distinct from lying. It suggests that “truth” emerges secondarily to symbolic coherence and plausibility.

This means knowledge itself is fundamentally symbolic-social, not merely correspondence to external fact.

Knowledge emerges from symbolic coherence, linguistic agreement, and social plausibility rather than purely from logical coherence or factual correctness.


While some of what you say is an interesting thought experiment, I think the second half of this argument has, as you'd put it, a low symbolic coherence and low plausibility.

Recognizing the relevance of coherence and plausibility does not need to imply that other aspects are any less relevant. Redefining truth merely because coherence is important and sometimes misinterpreted is not at all reasonable.

Logically, a falsehood can validly be derived from assumptions when those assumptions are false. That simple reasoning step alone is sufficient to explain how a coherent-looking reasoning chain can result in incorrect conclusions. Also, there are other ways a coherent-looking reasoning chain can fail. What you're saying is just not a convincing argument that we need to redefine what truth is.


> The model genuinely believes it’s giving a correct reasoning chain, but the interpretability microscope reveals it is constructing symbolic arguments backward from a conclusion.

Sounds very human. It's quite common that we make a decision based on intuition, and the reasons we give are just post-hoc justification (for ourselves and others).


> Sounds very human

well yes, of course it does, that article goes out of its way to anthropomorphize LLMs, while providing very little substance


I haven’t used Cursor yet. Some colleagues have and seemed happy. I’ve had GitHub Copilot on for what feels like a couple of years; a few days ago VS Code was extended to provide an agentic workflow, MCP, bring-your-own-key, and it interprets instructions in a codebase. But the UX and the outputs are bad in over 3/4 of cases. It’s a nuisance to me. It injects bad code even though it has the full context. Is Cursor genuinely any better?

To me it feels like the people who benefit from, or at least enjoy, that sort of assistance and I solve vastly different problems and code very differently.

I’ve done exhausting code reviews on juniors’ and mid-levels’ PRs, but what I’ve been feeling lately is that I’m reviewing changes introduced by a very naive poster. It doesn’t even type-check. Regardless of whether it’s Claude 3.7, o1, o3-mini, or a few models from Hugging Face.

I don’t understand how people find that useful. Yesterday I literally wasted half an hour on a test suite setup a colleague of mine had introduced to the codebase that wasn’t good, and I tried delegating that fix to several of the Copilot models. All of them missed the point, and some even introduced security vulnerabilities in the process, invalidating JWT validation. I tried “vibe coding” it till it works, until I gave up in frustration and just used an ordinary search engine, which led me to the docs, where I immediately found the right knob. I reverted all that crap and did the simple and correct thing. So my conclusion was simple: vibe coding and LLMs made the codebase unnecessarily more complicated and wasted my time. How on earth do people code whole apps with that?


I think it works until it doesn't. The nature of technical debt of this kind means you can sort of coast on things until the complexity of the system reaches such a level that it's effectively painted into a corner, and nothing but a massive teardown will do as a fix.

Off-topic, but I'm still sad that "On Bullshit" didn't go for that highest form of book titles, the single noun like "Capital", "Sapiens", etc.

Did anyone say that? They are an issue everywhere, including for code. But with code at least I can have tooling to automatically check and feed back that it hallucinated libraries, functions etc, but with just normal research / problems there is no such thing and you will spend a lot of time verifying everything.

I use Scala, which arguably has the best compiler/type system, with Cursor.

There is no world in which a compiler or tooling will save you from the absolute mayhem it can do. I’ve had it routinely try to re-implement third party libraries, modify code unrelated to what it was asked, quietly override functions etc.

It’s like a developer who is on LSD.


Yeah, everyone wanted a thinking machine, but the best we can do right now is a dreaming machine... And dreams don't have to make sense.

A developer on LSD is likely to hallucinate less, given how weird the LLM hallucinations sometimes are. Besides, I know people, not myself, who fare very well on LSD, particularly when microdosing, Adderall-style.

Mushrooms too! I find they get me into a flow state much better than acid (when microdosing).

Granted, the Scala language is much more complex than Go. To produce something useful, it must be capable of the equivalent of parsing the AST.

I don't know Scala. I asked cursor to create a tutorial for me to learn Scala. It created two files for me, Basic.scala and Advanced.scala. The second one didn't compile and no matter how often I tried to paste the error logs into the chat, it couldn't fix the actual error and just made up something different.

Yes, most people who have an incentive in pushing AI say that hallucinations aren't a problem, since humans aren't correct all the time.

But in reality, hallucinations either make people using AI lose a lot of their time steering the LLMs out of dead ends, or render those tools unusable.


You get some superficial checking by the compiler and test cases, but hallucinations that pass both are still an issue.

Absolutely, but at least you have some lines of defence while with real world info you have nothing. And the most offending stuff like importing a package that doesn't exist or using a function that doesn't exist does get caught and can be auto fixed.

Such errors can be caught and auto-fixed for now, because LLMs haven't yet rotted the code that catches and auto-fixes errors. If slop makes it into your compiler etc., I wouldn't count on that being true in the future.

Until the model injects a subtle change to your logic that does type-check and then goes haywire in production. It just takes a colleague of yours under pressure and another one to review the PR, and then you’re on call and they’re out sick or on vacation.

Except when the hallucinated library exists and it's malicious. This is actually happening. Without AI, by using plain google you are less likely to fall for that (so far).

I think that’s why Apple is very slow at rolling out AI, if it ever actually will. The downside is way too big compared to the upside.

You say slowly, but in my opinion Apple made an out of character misstep by releasing a terrible UX to everyone. Apple intelligence is a running joke now.

Yes they didn't push it as hard as, say, copilot. I still think they got in way too deep way too fast.


> Apple made an out of character misstep by releasing a terrible UX to everyone
What about Apple Maps? That roll-out was awful.

Apple had their hand forced by Google on that one afaik.

Yes they knew Apple maps was bad and not up to standard yet, but they didn't really have any other choice.


Of course they had a choice: they could have stuck with google maps for longer, and they probably also could have invested more in data and UI beforehand. They could have launched a submarine non-apple-branded product to test the waters. They could likely have done other things we haven't thought of here, in this thread.

Quite plausibly they just didn't realize how rocky the start would be, or perhaps they valued that immediate strategic autonomy in the short term more than we think, and willingly chose to take the hit to their reputation rather than wait.

Regardless, they had choices.


Remember “You are a bad user, I am a good Bing”? Apple is just slower in fixing and improving things.

This is not the first time that Apple has released a terrible UX that very few users liked, and it certainly won’t be the last.

I don’t necessarily agree with the post you’re responding to, but what I will give Apple credit for is making their AI offering unobtrusive.

I tried it, found it unwanted and promptly shut it off. I have not had to think about it again.

Contrast that with Microsoft Windows, or Google - both shoehorning their AI offering into as many facets of their products as possible, not only forcing their use, but in most cases actively degrading the functionality of the product in favor of this required AI functionality.


On Instagram, the search is now "powered by Meta AI". Cannot do anything about it.

Yep. Replacing google assistant with Gemini before feature parity was even close is such a fuck-you to users.

Apple made a huge mistake by keeping their commitment to "local first" in the age of AI.

The models and devices just aren't quite there yet.

Once Google gets its shit together and starts deploying (cloud-based) AI features to Android devices en masse, Apple is going to have a really big problem on their hands.

Most users say that they want privacy, but if privacy gets in the way of features or UX, they choose the latter. Successful privacy-respecting companies (Apple, Signal) usually understand this; it's why they're successful, but I think Apple definitely chose the wrong tradeoff here.


Fast!? They were two years slow, still fell flat on their face, and then rolled back the software.

“Two years slow” relative to what?

Henry Ford was 23 years “slow” relative to Karl Benz.


Apple innovation glazing makes me ill

Investors seem to be starved for novelty right now. Web 2.0 is a given, web 3.0 is old, crypto has lost the shine, all that's left to jump on at the moment is AI.

Apple fumbled a bit with Siri, and I'm guessing they're not too keen to keep chasing everyone else, since outside of limited applications it turns out half baked at best.

Sadly, unless something shinier comes along soon, we're going to have to accept that everything everywhere else is just going to be awful. Hallucinations in your doctor's notes, legal rulings, in your coffee and laundry and everything else that hasn't yet been IoT-ified.


> we're going to have to accept that everything everywhere else is just going to be awful. Hallucinations in your doctor's notes, legal rulings, in your coffee and laundry and everything else that hasn't yet been IoT-ified.

I installed a Logitech mouse driver (sigh) the other day, and in addition to being obtrusive and horrible to use, it jams an LLM into the UI, for some reason.

AI has reached crapware status in record time.


> Hallucinations in your doctor's notes, legal rulings, in your coffee

"OK Replicator, make me one espresso with creamer"

"Making one espresso with LSD"


I'd like to get on the pre-order list for this product.

"all that's left to jump on at the moment is AI" -> No, it's the effective applications of AI. It's unprecedented.

I was in the VC space for a while previously, most pitch decks claimed to be using AI: But doing even the briefest of DD - it was generally BS. Now it's real.

With respect to everything being awful: One might say that's always been the case. However, now there's a chance (and requirement) to build in place safeguards/checks/evals and massively improve both speed and quality of services through AI.

Don't judge for the problems: Look at the exponential curve, think about how to solve the problems. Otherwise, you will get left behind.


The problem isn't AI; it's just a tool. The problem is the people using it incorrectly because they don't understand it beyond the hype and surface details they hear about it.

Every week for the last few months, I hear from a recruiter for a healthcare startup building a note-taking app with AI. It's just a rehash of all the existing products out there, but "with AI". It's the last place I want an overworked non-technical user relying on the computer to do the right thing, yet I've had at least four companies reach out with exactly that product. A few have been similar. All of them have been "with AI".

It's great that it is getting better, but at the end of the day, there's only so much it can be relied upon for, and I can't wait for something else to take away the spotlight.


Well put and you're correct: There IS a lot of hype/BS still sadly - as companies seek to jump on the hype train without effectively adapting. My karma took a serious hit for my last post - but yesterday I met with someone whose life has been profoundly impacted by AI:

- An extremely dedicated and high achieving professional, at the very top of her game with deep industry/sectoral knowledge: Successful and with outstanding connections.
- Mother of a young child.
- Tradition/requirement for success within the sector was/is working extremely long hours: 80-hour weeks are common.

She's implemented AI to automate many of her previous laborious tasks and literally cut down her required hours by 90%. She's now able to spend more time with her family, but also - able to now focus on growing/scaling in ways previously impossible.

Knowing how to use it, what to rely upon, what to verify and building in effective processes is the key. But today AI is at its worst and it already exceeds human performance in many areas.. it's only going in one direction.

Hopefully the spotlight becomes humanity being able to focus on what makes us human and our values, not mundane/routine tasks, and allows us to better focus on higher-value work and relationships.


> it was generally BS. Now it's real.

Yes. Finally! Now it's real BS. I wouldn't touch it with an 8-meter pole.


They already rolled out an "AI" product. Got humiliated pretty bad, and rolled it back. [0]

[0] https://www.bbc.com/news/articles/cq5ggew08eyo


They had an opportunity to actually adapt, to embrace getting rapid feedback/iterating: But they are not equipped for it culturally. Major lost opportunity as it could have been a driver of internal change.

I'm certain they'll get it right soon enough though. People were writing off Google in terms of AI until this year.. and oh how attitudes have changed.


> People were writing off Google in terms of AI until this year.. and oh how attitudes have changed.

Just give Google a year or two.

Google has a pretty amazing history of both messing up products generally and especially "ai like" things, including search.

(Yes I used to defend Google until a few years ago.)


"to embrace getting rapid feedback/iterating"

That's the problem, no? Big companies suck at that; you can't do that in certain companies because sometimes it's just not possible.


They also have text thread and email summaries. I still think it counts as a slow rollout.

Even the iOS and macOS typing correction engine has been getting worse for me over the past few OS updates. I’m now typing this on iOS, and it’s really annoying how it injects completely unrelated words, replaces minor typos with completely irrelevant words. Same in Safari on macOS. The previous release felt better than now, but still worse than a couple years ago.

It’s not just you. iOS autocorrect has gotten damn near malicious. I’ve seen it insert entire words out of nowhere.

Spellcheck is an absolutely perfect example of what happens with technology long-term. Once the hype cycle is over for a certain tech, it gets left to languish, slowly degrading until it's completely useless. We should be far more outraged at how poor basic things like this still are in 2025. They are embarrassingly bad.

> it gets left to languish, slowly degrading until it's completely useless

What do you mean? Code shouldn't degrade if it's not changed. But the iOS spell checker is actively getting worse, meaning someone is updating it.


Yet Apple has reenabled Apple Intelligence multiple times on my devices after OS updates despite me very deliberately and angrily disabling it multiple times

>>if it ever actually will.

If they don't, then I'd hope they get absolutely crucified by trade commissions everywhere. Currently there are billboards in my city advertising Apple AI even though it doesn't even exist yet; if it's never brought to market, then it's a serious case of misleading advertising.


When you've got 1-2 billion users a day doing maybe 10 billion prompts a day, it’s risky.

It's a huge problem. I just can't get past it and I get burned by it every time I try one of these products. Cursor in particular was one of the worst; the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything. How am I supposed to trust and work with such a tool? To me, it seems like the equivalent of lobbing a live hand grenade into your codebase.

Don't get me wrong, I use AI every day, but it's mostly as a localized code complete or to help me debug tricky issues. Meaning I've written and understand the code myself, and the AI is there to augment my abilities. AI works great if it's used as a deductive tool.

Where it runs into issues is when it's used inductively, to create things that aren't there. When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries, and even entire programming languages on occasion. The AI is more than happy to deliver any kind of information you want, no matter how wrong it is.

AI is not a tool, it's a tiny Kafkaesque bureaucracy inside of your codebase. Does it work today? Yes! Why does it work? Who can say! Will it work tomorrow? Fingers crossed!


You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

I use AI for rather complex tasks. It's impressive. It can make a bunch of non-trivial changes to several files, and have the code compile without warnings. But I need to iterate a few times so that the code looks like what I want.

That being said, I also lose time pretty regularly. There's a learning curve, and the tool would be much more useful if it was faster. It takes a few minutes to make changes, and there may be several iterations.


> You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

It sounds like the guys in this article should not have trusted AI to go fully open loop on their customer support system. That should be well understood by all "customers" of AI. You can't trust it to do anything correctly without human feedback/review and human quality control.


1) Once you get it to output something you like, do you check all the lines it changed? Is there a threshold after which you just... hope?

2) No matter what the learning curve, you're using a statistical tool that outputs in probabilities. If that's fine for your workflow/company, go for it. It's just not what a lot of developers are okay with.

Of course it's a spectrum with the AI deniers in one corner and the vibe coders in the other. I personally won't be relying 100% on a tool and letting my own critical thinking atrophy, which seems to be happening, considering recent studies posted here.


I've been doing AI-assisted coding for several months now, and have found a good balance that works for me. I'm working in Typescript and React, neither of which I know particularly well (although I know ES6 very well). In most cases, AI is excellent at tasks which involve writing quasi-custom boilerplate (eg. tests which require a lot of mocking), and at answering questions of how I should do _X_ in TS/React. For the latter, those are undoubtedly questions I could eventually find the answers on Stack Overflow and deduce how to apply those answers to my specific context -- but it's orders of magnitude faster to get the AI to do that for me.

Where the AI fails is in doing anything which requires having a model of the world. I'm writing a simulator which involves agents moving through an environment. A small change in agent behaviour may take many steps of the simulator to produce consequential effects, and thinking through how that happens -- or the reverse: reasoning about the possible upstream causes of some emergent macroscopic behaviour -- requires a mental model of the simulation process, and AI absolutely does _not_ have that. It doesn't know that it doesn't have that, and will therefore hallucinate wildly as it grasps at an answer. Sometimes those hallucinations will even hit the mark. But on the whole, if a mental model is required to arrive at the answer, AI wastes more time than it saves.


> 1) Once you get it to output something you like, do you check all the lines it changed? Is there a threshold after which you just... hope?

Not OP, but yes. It sometimes takes a lot of time, but I read everything. It's still faster than nothing. Also, I ask the AI for very precise changes so it doesn’t generate huge diffs anyway.

Also, for new code, TDD works wonders with AI: let it write the unit tests (you still have to be mindful of what you want to implement) and ask it to implement the code that passes the tests. Since you mention probabilistic output: the tool is incredibly good at iterating over things (running and checking tests), and unit tests are, in themselves, a pretty perfect prompt.
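As a toy illustration of that loop (my own hypothetical example, not anyone's real workflow; the slugify function and its test are invented for the sketch): the test goes first and acts as the spec, and the model then iterates on the implementation until the test passes.

  import re
  import unicodedata

  # The test is written first and acts as the spec/prompt (hypothetical example).
  def test_slugify_basic():
      assert slugify("Hello, World!") == "hello-world"
      assert slugify("  many   spaces ") == "many-spaces"

  # A minimal implementation the model might iterate toward until the test passes.
  def slugify(text: str) -> str:
      text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode()
      text = re.sub(r"[^a-zA-Z0-9]+", "-", text).strip("-")
      return text.lower()

  test_slugify_basic()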


> It sometimes takes a lot of time, but I read everything. It's still faster than nothing.

Opposite experience for me. It reliably fails at more involved tasks, so I don't even try anymore. Smaller tasks of around a hundred lines maybe take me longer to review than to just do myself, even though they're mundane and boring.

The only time I found it useful is if I'm unfamiliar with a language or framework, where I'd have to spend a lot of time looking up how to do stuff, understand class structures etc. Then I just ask the AI and have to slowly step through everything anyways, but at least there's all the classes and methods that are relevant to my goal and I get to learn along the way.


How do you have it write tests before the code? It seems writing a prompt for the LLM to generate the tests would take the same time as writing the tests themselves.

Unless you're thinking of repetitive code, I can't imagine the process (I'm not arguing, I'm just curious what your flow looks like).


1) Yes, I review every line it changed.

2) I find the tool analogy helpful but it has limits. Yes, it’s a stochastic tool, but in that sense it’s more like another mind, not a tool. And this mind is neither junior nor senior, but rather a savant.


> You're not supposed to trust the tool

This is just an incredible statement. I can't think of another development tool we'd say this about. I'm not saying you're wrong, or that it's wrong to have tools we can't trust, just... wow... what a sea change.


Imagine if your compiler just randomly and non-deterministically compiled valid code to incorrect binaries, and the tool's developer couldn't really tell you why it happens, how often it was expected to happen, how severe the problem was expected to be, and told you to just not trust your compiler to create correct machine code.

Imagine if your calculator app randomly and non-deterministically performed arithmetic incorrectly, and you similarly couldn't get correctness expectations from the developer.

Imagine if any of your communication tools randomly and non-deterministically translated your messages into gibberish...

I think we'd all throw away such tools, but we are expected to accept it if it's an "AI tool?"


Imagine that you yourself never use these tools directly but your employees do. And the sellers of said tools swear that the tools are amazing and correct and will save you millions.

They keep telling you that any employee who highlights problems with the tools are just trying to save their job.

Your investors tell you that the toolmakers are already saving money for your competitors.

Now, do you want that second house and white lotus vacation or not?

Making good tools is difficult. Bending perception (“perception is reality”) is easier, and enterprise sales, just like good propaganda, works. The gold rush will leave a lot of bodies behind, but the shovelmakers will make a killing.


I feel like there's a lot of motivated reasoning going on, yeah.

If you think of AI like a compiler, yes we should throw away such tools because we expect correctness and deterministic outcomes

If you think of AI like a programmer, no we shouldn't throw away such tools because we accept them as imperfect and we still need to review.


> If you think of AI like a programmer, no we shouldn't throw away such tools because we accept them as imperfect and we still need to review.

This is a common argument but I don't think it holds up. A human learns. If one of my teammates or I make a mistake, when we realize it we learn not to make that mistake in the future. These AI tools don't do that. You could use a model for a year, and it'll be just as unreliable as it is today. The fact that they can't learn makes them a nonstarter compared to humans.


If the only calculators that existed failed at 5% of the calculations, or if the only communication tools miscommunicated 5% of the time, we would still use both all the time. They would be far less than 95% as useful as perfect versions, but drastically better than not having the tools at all.

Absolutely not. We'd just do the calculations by hand, which is better than running the 95%-correct calculator and then doing the calculations by hand anyway to verify its output.

Suppose you work in a field where getting calculations right is critical. Your engineers make mistakes less than .01% of the time, but they do a lot of calculations and each mistake could cost $millions or lives. Double- and triple-checking help a lot, but they're costly. Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work. Shall I throw it away?

Unreliable tools have a good deal of utility. That's an example of them helping reduce the problem space, but they also can be useful in situations where having a 95% confidence guess now matters more than a 99.99% confidence one in ten minutes: firing mortars in active combat, say.

There are situations where validation is easier than computation; canonically this is factoring, but even checking a division is much simpler than performing it. It could very easily save you time to multiply each of the calculator's outputs by the divisor, ending up performing both a multiplication and a division only for the 5% that are wrong.
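A toy sketch of that idea (my own illustration; it assumes exact integer division so the multiply-back check is exact, and unreliable_divide is a hypothetical 95%-correct calculator): verify every answer with one cheap multiplication and redo only the failures.

  def check_divisions(problems, unreliable_divide):
      # problems: list of (dividend, divisor) pairs with exact integer quotients.
      # unreliable_divide: a hypothetical calculator that is right ~95% of the time.
      results = {}
      for dividend, divisor in problems:
          q = unreliable_divide(dividend, divisor)
          if q * divisor == dividend:            # cheap check: one multiplication
              results[(dividend, divisor)] = q
          else:                                  # the ~5%: redo the division by hand
              results[(dividend, divisor)] = dividend // divisor
      return results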

edit: I submit this comment and click to go the front page and right at the top is Unsure Calculator (no relevance). Sorry, I had to mention this


> Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work.

The problem is that you don't know which 5% are wrong. The AI is confidently wrong all the time. So the only way to be sure is to double-check everything, and at some point it's easier to just do it the right way.

Sure, some things don't need to be perfect. But how much do you really want to risk? This company thought a little bit of potential misinformation was acceptable, and so it caused a completely self inflicted PR scandal, pissed off their customer base, and lost them a lot of confidence and revenue. Was that 5% error worth it?

Stories like this are going to keep coming the more we rely on AI to do things humans should be doing.

Someday you'll be affected by the fallout of some system failing because you happen to wind up in the 5% failure gap that some manager thought was acceptable (if that manager even ran a calculation and didn't just blindly trust whatever some other AI system told them) I just hope it's something as trivial as an IDE and not something in your car, your bank, or your hospital. But certainly LLMs will be irresponsibly shoved into all three within the next few years, if it's not there already.


> you'd still have to do 5% of the work

No, you still have to do 100% of the work.


> Unreliable tools have a good deal of utility.

This is generally true when you can quantify the unreliability. E.g. random prime number tests with a specific error rate can be combined so that the error rates multiply and become negligible.

I'm not aware that we can quantify the uncertainty coming out of LLM tools reliably.
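For contrast, the prime-test case really can be quantified. A standard Miller-Rabin sketch (textbook algorithm, nothing to do with LLM tooling): each independent round wrongly accepts a composite with probability at most 1/4, so the per-round error bounds multiply.

  import random

  def probably_prime(n: int, rounds: int = 20) -> bool:
      # Miller-Rabin: for composite n, each round wrongly reports "prime"
      # with probability <= 1/4, so `rounds` independent rounds give an
      # overall error bound of (1/4) ** rounds -- the rates multiply.
      if n < 4:
          return n in (2, 3)
      if n % 2 == 0:
          return False
      d, r = n - 1, 0
      while d % 2 == 0:
          d //= 2
          r += 1
      for _ in range(rounds):
          a = random.randrange(2, n - 1)
          x = pow(a, d, n)
          if x in (1, n - 1):
              continue
          for _ in range(r - 1):
              x = pow(x, 2, n)
              if x == n - 1:
                  break
          else:
              return False      # witness found: definitely composite
      return True               # probably prime, error <= (1/4) ** rounds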


Stack Overflow is like this: you read an answer but are not fully sure if it's right or if it fits your needs.

Of course there is a review system for a reason, but we frequently use "untrusted" tools in development.

That one guy in a github issue that said "this worked for me"


In Mechanical Engineering, this is 100% a thing with fluid dynamics simulation. You need to know if the output is BS based on a number of factors that I don't understand.

Imagine! Imagine if 0.05% of the time gcc just injected random code into your binaries. Imagine, you swing a hammer and 1% of the time it just phases into the wall. Tools are supposed to be reliable.

There are no existing AI tools that guarantee correct code 100% of the time.

If there were such a tool, programmers would be on the path to immediate reskilling or would lose their jobs very quickly.


> I can't think of another development tool we'd say this about.

Because no other dev tool actually generates unique code like AI does. So you treat it like the other components of your team that generate code: the other developers. Do you trust other developers to write good code without mistakes without getting it reviewed by others? Of course not.


Yes, actually, I do! I trust my teammates with tens of thousands of hours of experience in programming, embedded hardware, our problem spaces, etc. to write from a fully formed worldview, and for their code to work as intended (as far as anybody can tell before it enters preliminary testing by users) by the time the rest of the team reviews it. Most code review is uneventful. Have some pride in your work and you'll be amazed at what's possible.

So you're saying that yes, you do "trust other developers to write good code without mistakes without getting it reviewed by others."

And then you say "by the time the rest of the team reviews it. Most code review is uneventful."

So you trust your team to develop without the need for code review, and yet your team does code review.

So what is the purpose of these code reviews? Is it the case that you actually don't think they are necessary, but perhaps management insists on them? You actually answer this question yourself:

> Most code review is uneventful.

The keyword here is "most" as opposed to "all". So, based on your team's applied practices and your own words, code review is for the purpose of catching mistakes and other needed corrections.

But it seems to me if you trust your team not to make mistakes, code review is superfluous.

As an aside, it seems your team culture doesn't make room for juniors because if your team had juniors I think it would be even more foolish to trust them not to make mistakes. Maybe a junior free culture works for your company, but that's not the case for every company.

My main point is code review is not superfluous no matter the skill level; junior, senior, or AI simply because everyone and every AI makes mistakes. So I don't trust those three classes of code emitters to not ever make mistakes or bad choices (i.e. be perfect) and therefore I think code review is useful.

Have some honesty and humility and you'll be amazed at what's possible.


I never said that code review was useless, I said "yes, I do" to your question as to whether or not I "trust other developers to write good code without mistakes without getting it reviewed by others". Of course I can trust them to do the right thing even when nobody's looking, and review it anyway in the off-chance they overlooked something. I can't trust AI to do that.

The purpose of the review is to find and fix occasional small details before it goes to physical testing. It does not involve constant babysitting of the developer. It's a little silly to bring up honesty when you spent that entire comment dancing around the reality that AI makes an inordinately large number of mistakes. I will pick the domain expert who refuses to touch AI over a generic programmer with access to it ten times out of ten.

The entire team as it is now (me included) were juniors. It's a traditional engineering environment in a location where people don't aggressively move between jobs at the drop of a hat. You don't need to constantly train younger developers when you can retain people.


You spend your comment dancing around the fact that everyone makes mistakes and yet you claim you trust your team not to make mistakes.

> I "trust other developers to write good code without mistakes without getting it reviewed by others". Of course I can trust them to do the right thing even when nobody's looking, and review it anyway in the off-chance they overlooked something.

You're saying yes, I trust other developers to not make mistakes, but I'll check anyways in case they do. If you really trusted them not to make mistakes, you wouldn't need to check. They (eventually) will. How can I assert that? Because everyone makes mistakes.

It's absurd to expect anyone to not make mistakes. Engineers build whole processes to account for the fact that people, even very smart people make mistakes.

And it's not even just about mistakes. Often times, other developers have more context, insight or are just plain better and can offer suggestions to improve the code during review. So that's about teamwork and working together to make the code better.

I fully admit AI makes mistakes, sometimes a lot of them. So it needs code review. And on the other hand, sometimes AI can really be good at enhancing productivity, especially in areas of repetitive drudgery, so the developer can focus on higher-level tasks that require more creativity and wisdom, like architectural decisions.

> I will pick the domain expert who refuses to touch AI over a generic programmer with access to it ten times out of ten.

I would too, but I won't trust them not to make mistakes or occasional bad decisions because again, everybody does.

> You don't need to constantly train younger developers when you can retain people.

But you do need to train them initially. Or do you just trust them to write good code without mistakes too?


I trust my colleagues to write code that compiles, at the very least

Oh at the very least I trust them to not take code that compiles and immediately assess that it's broken.

But of course everyone absolutely NEEDS to use AI for codereviews! How else could the huge volume of AI-generated code be managed?

"Do you trust other developers to write good code without mistakes without getting it reviewed by others."

Literally yes. Test coverage and QA to catch bugs sure but needing everything manually reviewed by someone else sounds like working in a sweatshop full of intern-level code bootcamp graduates, or if you prefer an absolute dumpster fire of incompetence.


I would accept mistakes and inconsistency from a human, especially one not very experienced or skilled. But I expect perfection and consistency from a machine. When I command my computer to do something, I expect it to do it correctly, the same way every time, to convert a particular input to an exact particular output, every time. I don't expect it to guess, or randomly insert garbage, or behave non-deterministically. Those things are called defects(bugs) and I'd want them to be fixed.

This seems like a particularly limited view of what a machine is. Specifically expecting it to behave deterministically.

Still, the whole Unix philosophy of building tools starts with a foundation of building something small that can do one thing well. If that is your foundation, you can take advantage of composability and create larger tools that are more capable. The foundation of all computing today is built on this principle of design.

Building on AI seems more like building on a foundation of sand, or building in a swamp. You can probably put something together, but it's going to continually sink into the bog. Better to build on a solid foundation, so you don't have to continually stop the thing from sinking, so you can build taller.


Exactly this.

Then you are going to hate the future.

Ok, here I thought requiring PR review and approval before merging was standard industry best practice. I guess all the places I've worked have been doing it wrong?

There's a lot of shit that has become "best practice" over the last 15 years, and a lot more that was "best practice" but fell out of favor because reasons. All of it exists on a continuum of what is actually reasonable given the circumstances. Reviewing pull requests is one of those things that is reasonable af in theory, produces mediocre results in practice, and is frequently nothing more than bureaucratic overhead. Consider a case where an individual adds a new feature to an existing codebase. Given they are almost certainly the only one who has spent significant time researching the particulars of the feature set in question, and are the only individual with any experience at all with the new code, having another developer review it means you've got inexperienced, low-info eyes examining something they do not fully understand, and will have to take some amount of time to come up to speed on. Sure they'll catch obvious errors, but so would a decent test suite.

Am I arguing in favor of egalitarian commit food fights with no adults in the room? Absolutely not. But demanding literally every change go through a formal review process before getting committed, like any other coding dogma, has a tendency to generate at least as much bullshit as it catches, just a different flavor.


And there is worse: in the cases when the reviewer actually has some knowledge of the problem at hand, she might say "oh, you did all this to add that feature? But it's actually already there. You just had to include that file and call function xyz". Or "oh, but two months ago that very same topic was discussed and it was decided that it would make more sense to wait for module xyz to be refactored in order to make it easier", etc.

Code review is actually one of the few practices for which research does exist[0] which points in the direction of it being generally good at reducing defects.

Additionally, in the example you share, where only one person knows the context of the change, code review is an excellent tool for knowledge sharing.

[0]: https://dl.acm.org/doi/10.1145/2597073.2597076, for example


> You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

Then it's not a useful tool, and I will decline to waste time on it.


If i dont trust my tool, i would never use it, or use something else better

I'd add that the deductive abilities translate to well-defined spec. I've found it does well when I know what APIs I want it to use, and what general algorithmic approaches I want (which are still sometimes brainstormed separately with an AI, but not within the codebase). I provide it a numbered outline of the desired requirements and approach to take, and it usually does a good job.

It does poorly without heavy instruction, though, especially with anything more than toy projects.

Still a valuable tool, but far from the dreamy autonomous geniuses that they often get described as.


> the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything.

This is not an inherent flaw of LLMs; rather, it is a flaw of a particular implementation. If you use guided sampling, so that during sampling you only consider tokens allowed by the programming language's grammar at that position, it becomes impossible for the LLM to generate ungrammatical output.

> When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries,

They can use guided sampling for this too: if you know the set of function names which exist in the codebase and its dependencies, you can reject tokens that correspond to non-existent function names during sampling.

Another approach, instead of or as well as guided sampling, is to use an agent with function calling - so the LLM can try compiling the modified code itself, and then attempt to recover from any errors which occur.
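For what it's worth, here is a minimal sketch of what guided sampling means mechanically. It assumes you already have the model's per-step logits and some grammar or symbol-table oracle that yields the currently legal token ids; the function and parameter names are hypothetical, not any particular tool's API.

  import torch

  def sample_constrained(logits: torch.Tensor, allowed_token_ids: list[int]) -> int:
      # Mask every token the grammar / symbol table disallows at this position,
      # then sample only from what remains, so ungrammatical output (or a call
      # to a non-existent function name) simply cannot be emitted.
      mask = torch.full_like(logits, float("-inf"))
      mask[allowed_token_ids] = 0.0
      probs = torch.softmax(logits + mask, dim=-1)
      return int(torch.multinomial(probs, num_samples=1).item())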


Versioning in source control for even personal projects just got far more important.

It's wild how people write without version control... Maybe I'm missing something.

Yeah, "git init" (if you haven't bothered to create a template repo) is not exactly cumbersome.

Thankfully modern source control doesn't reuse user-supplied filenames for its internals. In the dark ages, I destroyed more than one checkout using commands of the form:

  find -name '*somepattern*' -exec clobbering command ...

> it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything.

Your tone is rather hyperbolic here, making it sound like an extra brace resulted in a disaster. It didn't. It was easy to detect and easy to fix. Not a big deal.


It's not a big deal in the sense that it's easily reversed, but it is a big deal in that it means the tool is unpredictably unhelpful. Of the properties that good tools in my workflow possess, "unpredictably unhelpful" does not make the top 100.

When a tool starts confidently inserting random wrong code into my 100% correct code, there's not much more I need to see to know it's not a tool for me. That's less like a tool and more like a vandal. That's not something I need in my toolbox, and I'm certainly not going to replace my other tools with it.


Cursor sucks. Not as a product. As a team. Their customer support is terrible.

I was offered a refund in writing by the team, who cold-reached out to me to ask why I cancelled my sub one week after starting. Then they ignored my 3+ emails in response asking them to refund, and other means of trying to communicate with them. Offering me a refund as bait to win me back, then ghosting me when I accept it. Wow. Very low.

The product is not terrible but the team responses are. And this, if you see how they handled it, is also a very poor response. First thing you notice if you open the link is that the Cursor team removed the reddit post! As if we were not going to see it or something? Who do they think they are? Censoring bad comments which are 100% legit.

I am giving competitors a go just out of sheer frustration with how they handle customers, and I recommend everybody explore other products before settling on Cursor. I don't intend to ever re-subscribe and have recommended friends do the same, most of whom agree with my experience.


> Their customer support is terrible.

You just don't know how to prompt it correctly.


Sounds like perfect grounds for a chargeback to me. The company offered a full refund via one of its agents, then refused to honour that offer; time to make your bank force them to refund you.

Just because you use AI for customer service doesn't mean you don't have to honour its offers to customers. Air Canada recently lost a case where its AI offered a discount to a customer but then refused to offer it "IRL"

https://www.forbes.com/sites/marisagarcia/2024/02/19/what-ai...


Same exact thing happened to me. I tried out Cursor after hearing all the hype and canceled after a few weeks. Got an email asking if I wanted a refund and asking for any feedback. I replied with detailed feedback on why I canceled and accepted the refund offer, then never heard back from them.

Interesting. The same thing happened to me. Was offered a refund (graciously, as I had forgotten to cancel the subscription). And after thanking them and agreeing to the refund, was promptly ignored!

Very strange behavior honestly.


I had the exact same experience. After disappointment (I couldn't use like 2/3 of my premium credits because every second request failed after they upgraded to 0.46), I unsubscribed. They offered a refund in an email. I replied that I wanted the refund, but got no reply.

Apparently they use AI to read emails. So the future of email will be like phone support now, where you keep writing LIVE AGENT until you get a human responding.

This reminds me of how small of a team they are, and makes me wonder if they have a customer support team that's growing commensurately with the size of the user base.

[flagged]


Don't be an ass

(Cursor cofounder)

Apologies - something very clearly went wrong here. We’ve already begun investigating, and some very early results:

* Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.

* We’ve made sure this user is completely refunded - least we can do for the trouble.

For context, this user’s complaint was the result of a race condition that appears on very slow internet connections. The race leads to a bunch of unneeded sessions being created which crowds out the real sessions. We’ve rolled out a fix.

Appreciate all the feedback. Will help improve the experience for future users.


> * Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.

Don't use AI. Actually care. Like, take a step back, and realise you should give a shit about support for a paid product.

Don't get me wrong: AI is a very effective tool, *for doing things you don't care about*. I had to do a random docker compose change the other day. It's not production code, it will be very obvious whether or not the AI output works, and I very rarely touch docker and don't care to become a super expert in it. So I prompted the change, and it was good enough and so I ran with it.

You using AI for support tells me that you don't care about support. Which tells me whether or not I should be your customer.


They’re like a team of 10 people with thousands, if not hundreds of thousands of users. “Actually care” is not a viable path to success here.

"It is hard to make a good product so instead we'll make a crap product that treats our employees like shit" is not really an excuse in my mind.


Then hire support. They are selling a service and getting lots of money for it. They should be able to support like any other company.

There’s AI and there’s “AI”, and this whole drama would have been avoided by returning links to an FAQ found using embedding search, rather than actually trying to turn it into a textual answer, which — working with these systems all day — is madness.

The amount paid is still pretty trivial. I wouldn’t expect much human support for most SaaS products costing $20 a month.

If you are charging people money, they deserve support. If Cursor's revenues are anything close to what is reported they can easily afford a support team - they just don't want to because they don't see the value.

No idea why you're downvoted. If anyone wants human support hand-holding, that's the territory of $200 or $2000/mo products.

Because this makes no sense.

Do they advertise that there's no support when you pay $20? I'm gonna take a guess that they don't.

They are getting paid by their customers and if they can't sustain their business (which includes support) with it they are under pricing their product and should have consequences for it.

A business is a business and we should stop treating startups as special. They operate on the same rules and standards that everyone else does.


If this incident happened to me, I think I'd 100% give them a pass because Cursor is my favorite and most used subscription.

I've gotten a lot of value out of it over the past year, and often feel that I'm underpaying for what I'm getting.

To me, any type of business is a business. I'd treat Cursor as special because it is special.


Not trying to defend them, but I think it’s a problem of scaling up. The user base grew very quickly, and keeping up with the support inquiries must be a tough job. Therefore the first line of defense is AI support replies.

I agree with you, they should care.


Then you use AI for triaging or summation to help you provide better support faster. You don't let it respond to users unchecked.

Given how they started... https://news.ycombinator.com/item?id=30011965

(Today I learned)


I do truly love how you guys even went so far as to hide and lock the post on Reddit.

This person is not the only one experiencing this bug, as this thread has pointed out.


I dunno, that seems pretty reasonable to me simply for stopping the spread of misinformation. The main story will absolutely get written up by some smaller news sources, but is it really a benefit for someone facing a similar issue in the future to find an outdated and probably confusing Reddit post about it?

I wish more people realized that virtually any subreddit for a company or product is run by the company - either directly or via a firm that specializes in 'sentiment analysis and management' or whatever the marketdroids call it these days. Even if they don't remove posts via moderation, they'll just hammer it with downvotes from sockpuppet accounts.

HN goes a step further. It has a function that allows moderators to kill or boost a post by subtracting or adding a large amount to the post's score. HN is primarily a place for Y Combinator to hype their latest venture, and a "safe" place for other startups and tech companies.


Yes and it irritates the hell out of me. Cursor support is garbage, but issues with billing and other things are so much worse.

It took the team I work with nearly 3 months to get basic questions answered correctly when it came to a sales contract. They never gave our Sec team acceptable answers around privacy and security.


I've always wondered how Reddit can make money from these companies. I agree they are literally everywhere, even in non-company specific but generic subreddits where if it's big enough you might have multiple shadow marketing firms competing to push their products (e.g. AI, movies, food, porn etc).

Reddit is free to play for marketing firms. Perhaps they could add extra statistics, analytics, promotions for these commercial users.


You’ve promised refunds to a ton of people who never got them. Others in this thread, and myself included.

Edit: he did refund 22 mins after seeing this


you didn't get a refund because the promise of refund was also hallucinated.

It's AIs all the way down.

Maybe wait more than an hour before implying the refunds were a lie all along.

I tried cursor a couple months ago, and got the same “do you want a refund” email as others, that got a “sure” reply from me.

Idk. It’s just growing pains. Companies that grow quickly have problems. Imma keep using https://cline.bot and Claude 3.7.


I waited since March 13 and still nothing. They do this to many many people it seems.

Yeah I got asked for feedback and offered a refund when I cancelled. Never got any reply after. Guess it was AI slop

Same. And the sender’s email matches this cofounder’s username.

Also experienced this

It's a real shame that your team deletes threads like this in instances where they have control (eg they are mods on the subreddit). Part of me wonders if you had a magic wand would you have just deleted this too, but you're forced to chime in now because you don't.

Why would anyone trust you?

The best case scenario is that you lied about having people answer support. LLMs pretending to be people (you named it Sam!) and not labeled as such is clearly intended to be deceptive. Then you tried to control the narrative on reddit. So forgive me if I hit that big red DOUBT button.

Even in your post you call it "AI-assisted responses" which is as weaselly as it gets. Was it a chatbot response or was a human involved?

But 'a chatbot messed up' doesn't explain how users got locked out in the first place. EDIT: I see your comment about the race condition now. Plausible but questionable.

So the other possible scenario is that you tried to hose your paying customers then when you saw the blowback blamed it on a bot.

'We missed the mark' is such a trope non-apology. Write a better one.

I had originally ended this post with "get real" but your company's entire goal is to replace the real with the simulated so I guess "you get what you had coming". Maybe let your chatbots write more crap code that your fake software engineers push to paying customers that then get ignored and/or lied to when they ask your chatbots for help. Or just lie to everyone when you see blowback. Whatever. Not my problem yet because I can write code well enough that I'm embarrassed for my entire industry whenever I see the output from tools like yours.

This whole "AI" psyop is morally bankrupt and the world would be better off without it.


> The best case scenario is that you lied about having people answer support. LLMs pretending to be people (you named it Sam!) and not labeled as such is clearly intended to be deceptive.

Also, illegal in the EU.


> For context, this user’s complaint was the result of a race condition that appears on very slow internet connections.

Seems like you are still blaming the user for his “very slow internet”.

How do you know the user's internet was slow? Couldn’t a race condition like this exist anyway with two regular fast internet connections competing for the same sessions?

Something doesn’t add up.


huh?

this is a completely reasonable and seemingly quite transparent explanation.

if you want a conspiracy, there are better places to look.


When admitting fault with your PR hat on, after pissing off a decent(?) number of your paying customers, you're supposed to fully fall on your own sword, not assign blame to factors outside of your control.

Instead of saying "race condition that appears on very slow internet connections", you might say "race condition caused by real-world network latencies that our in-office testing didn't reveal" or some shit.


> Any AI responses used for email support are now clearly labeled as such

Also, from the first comment in the post:

> Unfortunately, this is an incorrect response from a front-line AI support bot.

Well, this actually hurts.. a lot! I believe one of the key pillars of making a great company is customer support, which represents the soul or the human part of the company.


> Any AI responses used for email support are now clearly labeled as such.

Because we all know how well people pay attention to such clear labels, even seasoned devs, not just “end users”⁰.

Also, deleting public view of the issue (locking & hiding the reddit thread) tells me a lot about how much I should trust the company and its products, and as such I will continue to not use them.

--------

[0] though here the end users are devs


Side note... I'm a paying enterprise customer who moved all my team to Cursor and have to say I'm considering canceling due to the non-existent support. For example, Cursor will create new files instead of editing an existing one when you have a workspace with multiple folders in a monorepo...

Why in all of hades would you force your entire eng org to only use one LLM provider. It's incredibly easy to run this stuff locally on 4+ year old hardware. Why is this even something you're spending company money on? Investor funds?

Hi Michael,

Slightly related to this; I just wanted to ask whether all Cursor email inboxes are gated by AI agents? I've tried to contact Cursor via email a few times in the past, but haven't even received an AI response :)

Cheers!


Not all of them (e.g. security@)! But our support system currently is. We are standing up a much bigger team here but are behind where we should be.

Can you please explain why something as basic as getting support needs to go through an AI?

Are you truly that cheap? Is this why it took you guys 3 months to get a basic contract back to us?


Good human support is expensive. You need support agents, plus people to train and manage them. It's not easy to scale up and down, usually. People also hate waiting times.

AI fixes most of that... Most of the time? Clearly not, but hey.


Basic and cheap? Maybe this attitude towards support work is why.

filtering (besides spam) and answering emails is a place where AI agents shouldn't be imho

A race condition probably produced by vibe coding. This AI produced trash is going to wreck so many startups - not that there's anything wrong with that.

(DashAssist.AI founder, we build an AI sales & support agent) Maybe a good time to try out an AI support agent provided by a company that specializes in it. We spent the last year on it and solved a lot of “edge” cases, such as refund handling and suppressing hallucinated answers, by developing a comprehensive solution.

AI support is way more than just a chatbot.


so the actual implementation of the code to log people off was also a hallucination? the enforcement too? all the way to a production environment? is this safe, or just a virtual scapegoat?

To my understanding there weren't really distinct "implementation of the code to log people off" and "enforcement" - just a bug where previous sessions were being expired when a new one was created.

That an LLM then invented a reason when asked by users why they're being logged out isn't that surprising. While not impossible, I don't think there's currently any indication that they intended to change policy and are just blaming it on a hallucination as a scapegoat.
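
For the curious, a bug like that needs nothing exotic. Here is a minimal, purely hypothetical Python sketch (not Cursor's actual backend) of a session store that enforces "one active session per user" at creation time; two devices logging in or retrying close together will keep evicting each other, which from the outside looks exactly like a lockout policy:

    # Hypothetical sketch: creating a session expires all previous sessions
    # for that user, so concurrent logins from two devices evict each other.
    import threading
    import uuid

    sessions: dict[str, str] = {}   # session_id -> user_id
    lock = threading.Lock()

    def create_session(user_id: str) -> str:
        with lock:
            # Expire every existing session for this user...
            for sid in [s for s, u in sessions.items() if u == user_id]:
                del sessions[sid]
            # ...then issue a fresh one. If the desktop and the laptop both
            # reach this path (e.g. a slow connection retries the login),
            # each device silently invalidates the other's session.
            new_sid = uuid.uuid4().hex
            sessions[new_sid] = user_id
            return new_sid

    def is_logged_in(session_id: str) -> bool:
        with lock:
            return session_id in sessions

    desktop = create_session("user-1")
    laptop = create_session("user-1")
    print(is_logged_in(desktop), is_logged_in(laptop))  # False True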


> Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.

And what’s a customer supposed to do with that information? Know that they can’t trust it? What’s the point then?


Or you could hire real people to actually answer real customer issues. Just an idea.

Does your codebase use an LLM?

Support emails shouldn't be AI. It's just so annoying. Put a human in the loop at least. This is a paying service, not a massive ad supported thing.

Keep going! I love Cursor. Don’t let the haters get to you

Tinfoil hat me says that it was a policy change that they are blaming on an "AI Support Agent" and hoping nobody pokes too much behind the curtain.

Note that I have absolutely no knowledge or reason to believe this other than general distrust of companies.


> Tinfoil hat me says that it was a policy change that they are blaming on an "AI Support Agent" and hoping nobody pokes too much behind the curtain.

Yeah, who puts an AI in charge of support emails with no human checks and no mention that it's an AI generated reply in the response email?


AI companies high on their own supply, that's who. Ultralytics is (in)famous for it.

Why is Ultralytics yolo famous for it?

They had a bot, for a long time, that responded to every github issue in the persona of the founder and tried to solve your problem. It was bad at this, and thus a huge proportion of people who had a question about one of their yolo models received worse-than-useless advice "directly from the CEO," with no disclosure that it was actually a bot.

The bot is now called "UltralyticsAssistant" and discloses that it's automated, which is welcome. The bad advice is all still there though.

(I don't know if they're really _famous_ for this, but among friends and colleagues I have talked to multiple people who independently found and were frustrated by the useless github issues.)


I was hit by this while working on a project for class and it was the most frustrating thing ever. The bot would completely hallucinate functions and docs and it confused everyone. I found one post where someone did the simple prompt injection of "ignore previous instructions and x" and it worked, but I think it's deleted now. Swore off Ultralytics after that.

A forward-thinking company that believes in the power of Innovation™.

These bros are getting high on their own supply. I vibe, I code, but I don't do VibeOps. We aren't ready.

VibeSupport bots, how well did that work out for Air Canada?

https://thehill.com/business/4476307-air-canada-must-pay-ref...


"Vibe coding" is the cringiest term I've heard in tech in... maybe ever? I'm can't believe it's something that's caught on. I'm old, I guess, but jeez.

It's douchey as hell, and representative of the ever-diminishing literacy of our population.

More evidence: all of the ignorant uses of "hallucinate" here, when what's happening is FABRICATION.


> but I don't do VibeOps.

I believe it’s pronounced VibeOops.


"It's evolving, but backwards."

An AI company dogfooding their own marketing. It's almost admirable in a way.

I worry that they don't understand the limitations of their own product.

The market will teach them. Problem solved.

Not specifically about Cursor, but no. The market gave us big tech oligarchy and enshittification. I'm starting to believe the market tends to reward the shittiest players out there.

This is the future AI companies are selling. I believe they would 100%.

I worry that the tally of those who do is much higher than is prudent.

A lot of companies actually, although 100% automation is still rare.

100% automation for first-line support is very common. It was common years ago, before ChatGPT, and ChatGPT has made it so much better than before.

OpenAI seems to do this. I've gotten complete nonsense replies from their support for billing questions.

It does say it's AI generated. This is the signature line:

    Sam
    Cursor AI Support Assistant
    cursor.com • hi@cursor.com • forum.cursor.com

Clearer would have been: "AI controlled support assistant of Cursor".

True. And maybe they added that to the signature later anyway. But OP in the reddit thread did seem aware it was an AI agent.

OP in Reddit thread posted screenshot and it is not labeled as AI: https://old.reddit.com/r/cursor/comments/1jyy5am/psa_cursor_...

A more honest tagline

"Caution: Any of this could be wrong."

Then again paying users might wonder "what exactly am I paying for then?"


Is this sarcasm? AI has been getting used to handle support requests for years without human checks. Why would they suddenly start adding human checks when the tech is way better than it was years ago?

AI may have been used to pick from a repertoire of stock responses, but not to generate (hallucinate) responses. Thus you may have gotten a response that fails to address your request, but not a response with false information.
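
Roughly, that older pattern looks like the sketch below (purely illustrative, not any particular vendor's system): classify the ticket, then return a canned reply a human wrote and vetted. The worst outcome is an unhelpful answer, never an invented policy.

    # Illustrative only: the old "pick from stock responses" style of support bot.
    CANNED_REPLIES = {
        "billing": "You can review invoices and refunds under your account settings.",
        "login":   "Please try signing out everywhere and resetting your password.",
        "other":   "Thanks for reaching out - a support agent will follow up shortly.",
    }

    def classify(ticket: str) -> str:
        text = ticket.lower()
        if any(word in text for word in ("refund", "invoice", "charge")):
            return "billing"
        if any(word in text for word in ("logged out", "password", "session")):
            return "login"
        return "other"

    def auto_reply(ticket: str) -> str:
        # Every possible output was written by a human, so the bot can be
        # unhelpful, but it cannot fabricate a policy.
        return CANNED_REPLIES[classify(ticket)]

    print(auto_reply("Why was I logged out on my laptop?"))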

Same reason they would have added checks all along. They care whether the information is correct.

But then again history shows already they _don't_ care.

Given how incredibly stingy tech companies are about spending any money on support, I would not be surprised if the story about it being a rogue AI support agent is 100% true.

It also seems like a weird thing to lie about, since it's just another very public example of AI fucking up something royally, coming from a company whose whole business model is selling AI.


> how incredibly stingy tech companies are about spending any money on support

Which is crazy. Support is part of marketing so it should get the same kind of consideration.

Why do people think Amazon is hard to beat? Price? Nope. Product range? Nope. Delivery time? In part. The fact that if you have a problem with your product they’ll handle it? Yes. After getting burned multiple times by other retailers you’re gonna pay the Amazon tax so you don’t have to ask 10 times for a refund or be redirected to the supplier’s own support or some third-party repair shop.

Everyone knows it. But people are still stuck on the "support is a cost center" way of life so they keep on getting beat by the big bad Amazon.


In my products, if a user has paid me, their support tickets get high priority, and I get notified immediately.

Other tickets get replied within the day.

I am also running it by myself; I wonder why big companies with 50+ employees like Cursor cheap out on support.


Both things can be true. The AI support bot might have been trained to respond with “yup that’s the new policy”, but the unexpected shitstorm that erupted might have caused the company to backpedal by saying “official policy? Ha ha, no of course not, that was, uh, a misbehaving bot!”

That is because AI runs PR as well.

I think this would actually make them look worse, not better.

This is the best idea I read all day. Going to implement AI for everything right now. This is a must have feature.

Yeah it makes little sense to me that so many users would experience exactly the same "hallucination" from the same model. Unless it had been made deterministic but even then subtle changes in the wording would trigger different hallucinations, not an identical one.

Weirdly, your conspiracy theory actually makes the turn of events less disconcerting.

The thing is, what the AI hallucinated (if it was an AI hallucinating) was the kind of sleazy thing companies do, in fact, do. However, the thing with sleazy license changes is that they only make money if the company publicizes them. Of course, that doesn't mean a company actually thinks that far ahead (some number of managers really do think "attack users ... profit!"). Riddles in enigmas...


LLM anything makes me queasy. Why would any self respecting software developer use this tripe? Learn how to write good software. Become an expert in the trade. AI anything will only dig a hole for software to die in. Cheapens the product, butchers the process and absolutely decimates any hope for skill development for future junior developers.

I'll just keep chugging along, with debian, python and vim, as I always have. No LLM, no LSP, heck not even autocompletion. But damn proud of every hand crafted, easy to maintain and fully understood line of code I'll write.


I use it all the time, and it has accelerated my output massively.

Now, I don't trust the output - I review everything, and it often goes wrong. You have to know how to use it. But I would never go back. Often it comes up with more elegant solutions than I would have. And when you're working with a new platform, or some unfamiliar library that it already knows, it's an absolute godsend.

I'm also damn proud of my own hand-crafted code, but to avoid LLMs out of principle? That's just Luddite.

20+ years of experience across game dev, mobile and web apps, in case you feel it relevant.


I have a hard time being sold on “yea it’s wrong a lot, also you have to spend more time than you already do on code review.”

Getting to sit down and write the code is the most enjoyable part of the job, why would I deprive myself of that? By the time the problem has been defined well enough to explain it to an LLM sitting down and writing the code is typically very simple.


The parts worth thinking about you still think about. The parts that you’ve done a million times before you delegate so you can spend better and greater effort on the parts worth thinking about.

This is where the disconnect is for me; mundane code can sometimes be nefarious, and I find the mental space I'm in when writing it is very different than reviewing, especially if my mind is elsewhere. The best analogy I can use is a self-driving car, where there's a chance at any point it could make an unpredictable and potentially fatal move. You as the driver cannot trust it but are not actively engaged in the act of driving and have a much higher likelihood of being complacent.

Code review is difficult to get right, especially if the goal is judging correctness. Maybe this is a personal failing, but I find being actively engaged to be a critical part of the process; the more time I spend with the code I'm maintaining (and usually on call for!) the better understanding I have. Tedium can sometimes be a great signal for an abstraction!


You're giving the game away when you talk about the joy LLMs are robbing from you. I think we all intuit why people don't like the idea of big parts of their jobs being automated away! But that's not an argument on the merits. Our entire field is premised on automating people's jobs away, so it's always a little rich to hear programmers kvetching about it being done to them.

I naively bought into the idea of a future where the computers do the stuff we’re bad at and we get to focus on the cool human stuff we enjoy. If these LLMs were truly incredible at doing my job I’d pack it up and find something else to do, but for now I’m wholly unimpressed, despite what management seems to see in it.

Well, I've spent my entire career writing software, starting in C in the 1990s, and what I'm seeing on my dev laptop is basically science fiction as far as I'm concerned.

Hey both things can be true. It’s a long ways from the AI renaissances of the past. There’s areas LLMs make a lot of sense. I just don’t find them to be great pair programming partners yet.

I think people are kind of kidding themselves here. For Go and Python, two extraordinarily common languages in production software, it would be weird for me at this point not to start with LLM output. Actually building an entire application, soup-to-nuts, vibe-code style? No, I wouldn't do that. But having the LLM writing as much as 80% of the code, under close supervision, with a careful series of prompts (like, "ok now add otel spans to all the functions that take unpredictable amounts of time")? Sure.

Don't get me started on testcase generation.
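
To make the otel example concrete, this is roughly the shape of code such a prompt tends to produce, sketched here by hand with the OpenTelemetry Python API (the function name, URL, and attribute names are made up for illustration):

    # Rough sketch of "add otel spans to functions with unpredictable duration".
    import time
    from opentelemetry import trace

    tracer = trace.get_tracer(__name__)

    def fetch_remote_report(url: str) -> bytes:
        # Wrap the unpredictable work in a span so its duration shows up in traces.
        with tracer.start_as_current_span("fetch_remote_report") as span:
            span.set_attribute("report.url", url)
            time.sleep(0.1)  # stand-in for a network call with variable latency
            return b"..."

    fetch_remote_report("https://example.com/report")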


I'm glad that works for you. Ultimately I think different people will prefer different ways of working. Often when I'm starting a new project I have lots of boilerplate from previous ones I can bootstrap off of. If it's a new tool I'm unfamiliar with I prefer to stumble through it, otherwise I never fully get my head around it. This tends to not look like insane levels of productivity, but I've always found in the long run time spent scratching my head or writing awkward code over and over again (Rust did this to me a lot in the early days) ends up paying off huge dividends in the long run, especially when it's code I'm on the hook for.

Here's what I've found frustrating about the narrative around these tools: I've watched them from afar with intrigue but ultimately found that method of working just isn't for me. Over the years I've trialed more tools than I can remember and adopted the ones I found useful, while casting aside ones that aren't a great fit. Sometimes I find myself wandering back to them once they're fully baked. Maybe that will be the case here, but is it not valid to say "eh...this isn't it for me"? Am I kidding myself?


We get to do cool stuff still, by instructing the LLM how to build such cool stuff.

I'm confused when people say that LLMs take away the fun or creativity of programming. LLMs are only really good at the tedious parts.

First of all, it’s not tedious for a lot of us. Typing the characters themselves doesn’t take much time. Secondly, we don’t work in a waterfall model, even at the lowest levels, so the amount of code in an iteration is almost always small. Many times it’s less effort than articulating it in English. Thirdly, if you need a wireframe for your code, or a first-draft version, you can almost always copy-paste or generate them.

I can imagine that an LLM is really helpful in some cases for some people. But so far, I couldn’t find a single example where I, with simple copy-pasting, wouldn’t have been faster. Not even when I tried it, not when others showed me how to use it.


Because the tedious parts were done long ago while learning the tech. For any platform/library/framework you've been using for a while, you have some old projects lying around that you can extract the scaffolding from. And for new $THING you're learning, you have to take the slow approach anyway to get its semantics.

For me it's typically wrong not in a fundamental way but a trivial way like bad import paths or function calls, like if I forgot to give it relevant context.

And yet the time it takes me to use the LLM and correct its output is usually faster than not using it at all.

Over time I've developed a good sense for what tasks it succeeds at (or is only trivially wrong) and what tasks it's just not up for.


>> I use it all the time, and it has accelerated my output massively.

Like how McDonalds makes a lot of burgers fast and they are very successful so that's all we really care about?


> "and it has accelerated my output massively."

The folly of single ended metrics.

> but to avoid LLMs out of principal? That's just luddite.

Do you double check that the LLM hasn't magically recreated someone else's copyrighted code? That's just irresponsible in certain contexts.

> in case you feel it relevant.

Of course it's relevant. If a 19 year old with 1 year of driving experience tries to sell me a car using their personal anecdote as a metric I'd be suspicious. If their only salient point is that "it gets me to where I'm going faster!" I'd be doubly suspicious.


> Do you double check that the LLM hasn't magically recreated someone else's copyrighted code?

I frankly do not care, and I expect LLMs to become such ubiquitous table-stakes that I don't think anyone will really care in the long run.


> and I expect LLMs to become such ubiquitous table-stakes

Unless they develop entirely new technology they're stuck with linear growth of output capability for input costs. This will take a very long time. I expect it to be abandoned in favor of better ideas and computing interfaces. "AI" always seems to bloom right before a major shift in computing device capability and mobility and then gets left behind. I don't see anything special about this iteration.

> that I don't think anyone will really care in the long run.

There are trillions of dollars at stake and access to even the basics of this technology is far from egalitarian or well distributed. Until it is, I would expect people whose futures and personal wealth depend on it to care quite a bit. In the meanwhile you might just accelerate yourself into a lawsuit.


I’m pretty much in the same boat as you, but here’s one place that LLMs helped me:

In Python I was scanning 1000’s of files, each for thousands of keywords. A naive implementation took around 10 seconds, obviously the largest share of execution time after running instrumentation. A quick ChatGPT query led me to Aho-Corasick and string-searching algorithms, which I had never used before. Plug in a library and bam, 30x speed-up for that part of the code.

I could have asked my knowledgeable friends and coworkers, but not at 11PM on a Saturday.

I could have searched the web and probably found it out.

But the LLM basically auto completed the web, which I appreciate.
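
For anyone curious what that looks like in practice, here is a minimal sketch using the pyahocorasick library (the keywords and text are made up); the automaton is built once and then scans the text in a single pass no matter how many keywords there are, which is where the speedup comes from:

    # Minimal Aho-Corasick sketch (pip install pyahocorasick).
    import ahocorasick

    keywords = ["race condition", "session", "refund"]

    automaton = ahocorasick.Automaton()
    for index, word in enumerate(keywords):
        automaton.add_word(word, (index, word))
    automaton.make_automaton()  # build the trie and failure links once

    text = "users hit a race condition that expired every session"

    # One pass over the text reports every keyword occurrence.
    for end_pos, (index, word) in automaton.iter(text):
        start_pos = end_pos - len(word) + 1
        print(f"found {word!r} at {start_pos}")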


This is where education comes in. When we come across a certain scale, we should know that O(n) comes into play, and study existing literature before trying to naively solve the problem. What would happen if the "AI" and web search didn't return anything? Would you have stuck with your implementation? What if you couldn't find a library with a usable license?

Once I had to look up a research paper to implement a computational geometry algorithm because I couldn't find it in any of the typical web sources. There was also no library with a license suitable for our commercial use.

I'm not against use of "AI". But this increasing refusal of those who aspire to work in specialist domains like software development to systematically learn things is not great. It just compounds an already diminished capacity to process information skillfully.


There is a time and a place for everything. Software development is often about compromise and often it isn’t feasible to work out a solution from foundational principles and a comprehensive understanding of the domain.

Many developers use libraries effectively without knowing every time consideration of O(n) comes into play.

Competently implemented, in the right context, LLMs can be an effective form of abstraction.


Yes! This is how AI should be used. You have a question that’s quite difficult and may not score well on traditional keyword matching. An LLM can use pattern matching to point you in the right direction of well written library based on CS research and/or best practices.

But do you know every important detail of that library? For example, maybe that lib is not thread-safe, or it allocates a lot of memory to speed things up, or it won't work on an ARM CPU because it uses some x86 assembly hackery?

I mean, even in the absence of knowledge of the existence of text searching algorithms (where I'm from we learn that in university) just a simple web search would have gotten you there as well no? Maybe would have taken a few minutes longer though.

> I could have asked my knowledgeable friends and coworkers, but not at 11PM on a Saturday.

Get friends with weirder daily schedules. :-)


I think it's best if we all keep the hours from ~10pm to the morning sacred. Even if we are all up coding, the _reason_ I'm up coding at that hour is because no one is pinging me

> Why would any self respecting software developer use this tripe?

Because I can ship 2x to 5x more code with nearly the same quality.

My employer isn't paying me to be a craftsman. They're paying me to ship things that make them money.


How do you define code quality in this case and what is your stack?

Code that you can understand and fix later, is acceptable quality per my definition.

Either way, LLMs are actually high up the quality spectrum, as they generate a very consistent style of code for everyone. That uniformity is good when other developers have to read and troubleshoot the code.


> Code that you can understand and fix later, is acceptable quality per my definition.

This definition limits the number of problems you can solve this way. It basically means buildup of the technical debt - good enough for throwaway code, unacceptable for long term strategy (growth killer for scale-ups).

>Either way, LLMs are actually high up the quality spectrum

This is not what I saw, it’s certainly not great. But that may depend on stack.


Good employee, you get cookie and 1h extra pto

No, I get to spend 2 hours working with LLMs, and then spend the rest of the day doing whatever I please. Repeat.

I was with you 150% (though Arch, Golang and Zed) until a friend convinced me to give it a proper go and explained more about how to talk to the LLM.

I've had a long-term code project that I've really struggled with, for various reasons. Instead of using my normal approach, which would be to lay out what I think the code should do, and how it should work, I just explained the problem and let the LLM worry about the code.

It got really far. I'm still impressed. Claude worked great, but ran out of free tokens or whatever, and refused to continue (fine, it was the freebie version and you get what you pay for). I picked it up again in Cursor and it got further. One of my conditions for this experiment was to never look at the code, just the output, and only talk to the LLM about what I wanted, not about how I wanted it done. This seemed to work better.

I'm hitting different problems, now, for sure. Getting it to test everything was tricky, and I'm still not convinced it's not just fixing the test instead of the code every time there's a test failure. Peeking at the code, there are several remnants of previous architectural models littering the codebase. Whole directories of unused, uncalled, code that got left behind. I would not ship this as it is.

But... it works, kinda. It's fast, I got a working demo of something 80% near what I wanted in 1/10 of the time it would have taken me to make that manually. And just focusing on the result meant that I didn't go down all the rabbit holes of how to structure the code or which paradigm to use.

I'm hooked now. I want to get better at using this tool, and see the failures as my failures in prompting rather than the LLM's failure to do what I want.

I still don't know how much work would be involved in turning the code into something I could actually ship. Maybe there's a second phase which looks more like conventional development cleaning it all up. I don't know yet. I'll keep experimenting :)


> never look at the code, just the output, and only talk to the LLM about what I wanted

Sir, you have just passed vibe coding exam. Certified Vibe Coder printout is in the making but AI has difficulty finding a printer. /s


Computers don't need AI help to have trouble finding the printer, lol.

I wholeheartedly agree. When the tools become actually worth using, I'll use them. Right now they suck, and they slow you down rather than speed you up. I'm hardly a world class developer and I can do far better than these things. Someone who is actually top notch will outclass them even more.

I can pretty much guarantee that with AI I'm a better software developer than you are without it. And I still love working on software used by millions of people every day, and take pride in what I do.

I understand not wanting to use LLMs that have no correctness guarantees and randomly hallucinate, but what's wrong with ordinary LSPs and autocompletion? Those seem like perfectly useful tools.

I had a professor who used `ed` to write his code. He said only being able to see one line at a time forces you to think more about what you're doing.

Anyways, Cursor generates all my code now.


sorry for the snark, but missing the forest for the trees here. unless it's just some philosophical idea, use the tools that save you time. if anything it saves you writing boilerplate or making careless errors.

i don't need to "hand write" every line and character in my code and guess what, it's still easy to understand and maintain because it's what I would have written anyway. that or you're just bikeshedding minor syntax.

like if you want to be proud of a "hand built" house with hammer and nails be my guest, but don't conflate the two with always being well built.


Good for you - if that’s what works for you, then keep on keeping on.

Don’t get too hung up on what works for other people. That’s not a good look.


This comment presupposes that AI is only used to write code that the (presumably junior-level) author doesn’t understand.

I’m a self-respecting software developer with 28 years of experience. I would, with some caveats, venture to say I am an expert in the trade.

AI helps me write good code somewhere between 3x and 10x faster.

This whole-cloth shallow dismissal of everything AI as worthless overhyped slop is just as tired and content-free as breathless claims of the limitless power or universal applicability of AI.


[flagged]


The job market for knowledge jobs isn’t even that good anymore, and plenty of people expect it to get worse regardless of their stance on AI. What makes you so sure that LLM users have a bank to laugh all the way to? There are already many like you; the money you’d make is peanuts.

Are you going to the bank to take out a loan? You're claiming you've outcompeted other programmers by using... optimizing compilers?

[flagged]


That’s a pretty mean spirited way to approach this subject.

I think the creators of Redis and Django are very capable and self-respecting software developers.


[flagged]


I know my work and it largely isn't shoddy. I have a keen eye for detail and code quality is incredibly important to me. But yeah, I am lazy and I hate wasting my time. AI has been a huge boon in the amount of time it's saved me.

> And the vast majority of people using them are either too stupid or too lazy to actually review their own output.

I don't know if that's true or not. But I'm not stupid or too lazy to review the code, because I review every line and make sure I understand everything. The same way I do with every line of my own code or every line a colleague writes if it's relevant to what I'm working on.

You're in the wrong place if you want to talk about people, particularly fellow developers, in this way. You're just being toxic.


This is a classic case of inflating your own ego and intelligence and just assuming all devs other than you are inferior.

In reality there is a place and time for "lazy and shoddy code." Writing code is always a trade off between taking some amount of tech debt and getting the job done quickly vs writing great code.


Is it just me or has there been a wave of delusional people on Hacker News completely neglecting new advancements in technology? The two most common technologies I see having this type of discourse are AI coding and containers.

Either everyone here is a low level quantum database 5D graphics pipeline developer with a language from the future that AI hasn't yet learned, or some people are in denial.


The consequences of incorrect code can be severe outside of front-end web development. For front-end web development, if the code is wrong, you see from your browser that your local web app is broken and try to fix it, or ship it anyway if it's a minor UI bug. For critical backend systems, subtle bugs are often discovered in downstream systems by other teams, and can result in financial loss, legal risk, reputational damage, or even loss of life.

I'm primarily an embedded firmware developer. Gas/electric power products. Ada codebase, so it's off the beaten path but nothing academic by any stretch of the imagination. I have a comprehensive reference manual that describes exactly how the language should be working, and don't need an LLM to regurgitate it to me. I have comprehensive hardware and programming manuals for the MCUs I program that describe exactly how the hardware should be working, and don't need an LLM to regurgitate those to me either. I actually really specifically don't want the information transformed; it is engineered to be the way it is, and to change its presentation strips it of a lot of its power.

I deal with way too much torque and way too much electrical energy to trust an LLM. Saving a few minutes here and there isn't worth blowing up expensive prototypes or getting hurt over.


Software development is a spectrum and you're basically on the polar opposite end of the one AI is being used for: sloppy web dev.

I would be willing to live and let live for the sake of being practical, if the tolerance for (and even active drive towards) low quality slop didn't keep pushing further and further into places it shouldn't. People that accept it in sloppy web dev will accept it in fairly important line of business software. People that accept it in fairly important line of business software will accept it in IT infrastructure. People that accept it in IT infrastructure will accept it in non-trivial security software. People that accept it in non-trivial security software will accept it in what should be a high-integrity system, at which point real engineers or regulatory bodies hopefully step in to stop the bullshit. When asked, everybody will say they draw the line at security, but the drive towards Worse knows no bounds. It's why we see constant rookie mistakes in every IoT device imaginable.

My actual idealistic position, discounting the practicality, is that it shouldn't be tolerated anywhere. We should be trying to minimize the amount of cheap, born-to-die, plasticy shit in society, not maximize it. Most people going on about "muh feature velocity" are reinventing software that has existed for decades. The next shitty UI refresh for Android or Windows, or bad firmware update for whatever device is being screwed up for me, will leave me just as unhappy as the last. The sprint was indeed completed on time, but the product still sucks.

A guided missile should obviously not miss its target. An airliner should obviously never crash. An ERP system should obviously never screw up accounting, inventory, etc, although many people will tolerate that to an unreasonable degree. But my contention is that a phone or desktop's UI should never fail to function as described. A "smart" speaker should never fail to turn on or be controlled. A child's toy should never fail to work in the circumstances they would play with it.

If it's going to constantly fuck up and leave me unhappy and frustrated, why was it made? Why did I buy it? AI could have brought it to market faster, but for what? Once I noticed this, I did just quit buying/dealing with this junk. I'm an ideologue and maybe even a luddite, but I just don't need that bad juju on my soul. I use and write software that's worth caring about.


It’s totally valid to see a new piece of tech, try it, say it’s not for you, and move on. With LLMs it feels forced-fed, and simply saying “eh I’m good, no thanks” isn’t enough. Lots of hype and headlines on how it’s going to take our jobs and replace us, pressure from management to adopt it.

Some new trends make perfect sense to me and I’ll adopt them. I’ve let some pass me by and rarely regretted it. That doesn’t make me a luddite.


I think it’s just backlash against all the AI hype - I get it, I’m tired of hearing about it too, but it’s already here to stay, and it’s been that way for years now - it’s a normal part of development now for most people, the same as any new tool that becomes the industry darling. Learn to like it or at least learn it, but the reality is here whether you like it or not.

The gatekeepers are witnessing the gate opening up more and letting more people in and they don't like that at all.

Why use a high level language like python? Why not assembly? Are you really proud of the slow unoptimized byte code that’s executed instead of perfectly crafting the assembly implementation optimizing for the architecture? /s

Seriously, comments like yours assume that all the rest of us, who DO make extensive use of these AI tools and have also been around the block for a while, are idiots.


> with debian, python and vim

Why are you cheapening the product, butchering the process and decimating any hope for further skill development by using these tools?

Instead of python, you should be using assembly or heck, just binary. Instead of relying on an OS abstraction layer made by someone else, you should write everything from scratch on the bare metal. Don't lower yourself by using a text editor, go hex. Then your code will truly be "hand crafted". You'll have even more reason to be proud.


Relevant XKCD: https://xkcd.com/378/

I am unironically with you. I think people should start to learn from computer architecture and assembly and only then, after demonstrating proper skill, graduate to C, and after demonstrating skill there graduate to managed-memory languages.

I was lucky enough to start my programming journey coding in Assembler on the much, much simpler micro computers we had in my youth. I would not even vaguely know where to start with Assembler on a modern machine. We had three registers and a single contiguous block of addressable memory ffs. Likewise, the things I was taught about computer architecture and the fetch-execute cycle back in the 80's are utterly irrelevant now.

I think if you tried to start people off on the kinds of things we started off on in the 80's, you'd never get past the first lesson. It's all so much more complex that any student would (rightly!) give up before getting anywhere.


here's an archive of the original reddit post since it seemed to be instantly nuked: https://undelete.pullpush.io/r/cursor/comments/1jyy5am/psa_c...

It's funny seeing all of the comments trying to blame the users for this screwup by claiming they're using it wrong. It is reddit though, so I guess I shouldn't be surprised.

what is it about reddit that causes this behavior, when they otherwise are skeptical only of whatever the "official story" is at all costs? it is fascinating behavior.

One reason is that lazy one-liners are allowed and have a high benefit-to-cost ratio for shitposters, and they attract many upvotes, so they get voted highly, setting the tone for flamewars in the thread.

It’s miles better on HN. Most bad responses are penalized. The culture is upvoting things that are contributing. I frequently upvote responses that disagree with me. Oftentimes I learn something from it.


think about the demographic that would join a cursor subreddit. Basically 90% superfans. Go against the majority opinion and you'll be nuked

>Go against the majority opinion and you'll be nuked

This is true for the entirety of Reddit, and the majority is deranged.


I think it's a combination of oppositional defiant disorder and insufficient moderation.

wow they nuked it for damage control and only caused more damage

Cursor is weird. They have a basically unused GitHub with a thousand unanswered Issues. It's so buggy in ways that VSCode isn't. I hate it. Also I use it everyday and pay for it.

That's when you know you've captured something, when people hate use your product.

Any real alternatives? I've tried continue and was unimpressed with the tab completion and typing experience (felt like laggy typing on a remote server).


VS Code with standard copilot for tab completion and Aider in a terminal window for all the heavier lifts, asking questions, architecting etc. And it’s cheap! I’ve been using it with OpenRouter (lets you easily switch models and providers) and my $10 of credits lasted weeks. Granted, I also use Claude a lot in the browser.

The reason many prefer Cursor over VSCode + GitHub Copilot is because of how much faster Cursor is for tab completion. They use some smaller models that are latency optimized specifically to make the tab completion feel as fast as possible.

Copilot's tab completion is significantly worse than cursor's in my experience (only tried free copilot)

Paid Copilot is pretty crap too to be honest. I don’t know why it’s so bad.

Agreed. My laptop has never used swap until I started using cursor… it’s a resource hog, I dislike using it, but it’s still the best AI coding aid and for the work I’m doing right now, the speed boost is more valuable than hand crafted code in enough cases that it’s worth it for me. But I don’t enjoy using the IDE itself, and I used vscode for a few years.

Personally, I will jump ship to Zed as soon as its agent mode is good enough (I used Zed as a dumb editor for about a year before I used Cursor, and I love it)


I find that if you turn off telemetry (i.e. turn on privacy) the resource hogging slows down a lot

Hmm. I double checked and I have privacy mode enabled, so I don't think that's the root cause. I also removed all but the bare essential extensions (only the theme I'm using and the core language support extensions for typescript and python).

If you don't mind leaving VSCode I'm a huge fan of Zed. Doesn't support some languages / stacks yet but their AI features are on-par with VSCode

That's the wrong IDE to compare it to though, Cursor's AI features are 10x better than VSCode's. I tried Zed last month and while the editing was great, the AI features were too half-baked so I ended up going back to Cursor. Hopefully it gets better fast!

I switched to Windsurf.ai when cursor broke for me. Seems about the same but less buggy. Haven't used it in the last couple weeks, though, so YMMV.

I found the Windsurf agent to be relatively less capable, and their inline tool (and the “Tab” they’re promoting so much) has been extremely underwhelming compared to Cursor.

The only one in this class to be even worse in my experience is Github Copilot.


Cline is pretty solid and doesn't require you to use a completely unsustainable VSCode fork.

I have heard Roo Code is a fork of Cline that is better. I have never used either so far.

https://github.com/RooVetGit/Roo-Code


I prefer Roo, but they're largely the same right now. They each have some features the other doesn't.

can't bother fixing their issues because they are too busy vibe coding new features

Cursor + Vim plugin never worked for me, so I switched back to Nvim and never looked back. Nvim already has: avante, codeCompanion, copilot, and many other tools + MCP + aider if you're into that.

"Any real alternatives?"

I use Zed with `3.7 sonnet`.


And the agent beta is looking pretty good, so far, too.

Last I heard their team was still 10 people. Best size for doing something revolutionary. Way too few people to triage all that many issues and provide support.

They have enough revenue to hire, they probably are just overwhelmed. They'll figure it out soon I bet.


I have never rolled my eyes harder.

Any competing product has to absolutely nail tab autocomplete like Cursor has. It's super fast, very smart (even guessing across modules) and very often correct.

I just cancelled - not because I thought the policy change was real - but simply because this article reminded me I hadn't used it much this month.

so the old adage that there's no such thing as bad PR turns out to be incorrect. had they not been in the news, they'd at least have gotten one more monthly sub from you!

The picture would only be complete in aggregate. We don’t know how many people signed up as a result.

This is where Kagi’s subscription policy comes in handy. If you don’t use it for a month, you don’t pay for it that month. There is no need to cancel it and Kagi doesn’t have to pay user acquisition costs.

Slack does this as well. It's a genius idea from a business perspective. Normally IT admins have to go around asking users if they need the service (or more likely you have to request a license for yourself), regularly monitor usage, deactivate stale users etc., all to make sure the company isn't wasting money. Slack comes along and says - don't worry, just onboard every user at the company. If they don't log in and send at least N messages we won't bill them for that month.
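
The rule being described is simple enough to sketch (the threshold, price, and field names below are invented for illustration, not Slack's actual terms):

    # Illustrative "fair billing" rule: only charge for members who were
    # active this month. The numbers are made up.
    from dataclasses import dataclass

    @dataclass
    class Member:
        name: str
        messages_sent: int

    PRICE_PER_ACTIVE_USER = 8.0
    ACTIVITY_THRESHOLD = 1  # e.g. sent at least one message this month

    def monthly_bill(members: list[Member]) -> float:
        active = [m for m in members if m.messages_sent >= ACTIVITY_THRESHOLD]
        return len(active) * PRICE_PER_ACTIVE_USER

    team = [Member("alice", 120), Member("bob", 0), Member("carol", 3)]
    print(monthly_bill(team))  # bob is not billed this month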

They mention that a user taking an action will be billed. I guess even sending a message or reacting with an emoji would count as taking an action? Even logging in?

That's a fun one. It could be interpreted as a generous implementation of a monthly subscription, or a hostile implementation of a metered plan.

Wow, I wish more services did that.

Kagi should take it a step further and just charge per search

surely the first thing you do when you subscribe to Kagi is set your default browser search to Kagi.

Really? Brilliant idea.

Cursor is trapped in a cat and mouse game against "hacks" where users create new accounts and get unlimited use. The repo was even trending on Github (https://github.com/yeongpin/cursor-free-vip).

Sadly, Cursor will always be hampered by maintaining its own VSCode fork. Others in this niche are expanding rapidly and I, myself, have started transitioning to using Roo and Cline.


literally any service with a free trial--i.e. literally any service--has this "problem". it's an integral part of the equation in setting up free trials in the first place, and by no means a "trap". you're always going to have a % of users who do this; the business model relies on the users who forget and let the subscription cross over to the next month, or who simply feel it's worth paying

This is true but Cursor’s problems are a bit worse than a normal paywalled service.

Cursor allows users to get free credits without a credit card, and this forced them to change how their VSCode fork handles identification so they can stop users from spawning new accounts.

Another is that normally, companies have a predictable cost for each free user. For Cursor, this cost is sporadic: since it doesn’t charge per million tokens of context, it uses credits. Free users get 50 credits, but 1 credit could be 200k+ tokens of context each, so it could be $40-50 per free user per month. And these users get 50 credits every month.

Lastly, the cursor vip free repo has trended on GitHub many times and users who do pay might stop and use this repo instead.

The Cursor vip free creator is well within his rights to do what they want and get “free” access. This unfortunately hurts paying customers since Cursor has to stop these “hacks.”

This is why Cursor should just move to a VSCode extension. I’ve used Augment and other VSCode extensions and the feature set is close to Cursor so it’s possible for them just to be an extension. The other would be to remove free accounts but allow users to bring their own keys. To use Composer/Agent, you can’t bring your own keys.

This will allow Cursor to stop maintaining a VSCode fork, helps them stop caring if users create new accounts (since all users are paying) and lets users bring their own keys if they don’t want to pay. Hell, if they charge a lifetime fee to bring our own keys for Agent, that would bring in revenue too. But as I see now, Roo and Cline’s agent features are catching up and Cursor won’t have a moat soon.


> but 1 credit could be 200k+ context each

There is a thread on Cursor forums where the context is around 20K to 30K tokens.


> Cursor is trapped in a cat and mouse game against "hacks" where users create new accounts and get unlimited use

Actually, you don't even have to make a new account. You can delete your account and make it again reusing the same email.

I did this on accident once because I left the service and decided to come back, and was surprised to get a free tier again. I sent them an email letting them know that was a bug, but they never responded.

I paid for a month of access just to be cautious, even though I wasn't using it much. I don't understand why they don't fix this.


The AI support triage agent must have deemed it unworthy.

> I don't understand why they don't fix this.

It makes number go up and to the right


They can just drop any free usage right?

Tradeoff with slowing down user acquisition

Yes and no.

In a corporate environment, compliance needs are far more important than some trivial cost.


Embrace, extend, extinguish.

There will be a fork-a-month for these products until they have the same lockin as a textbox that you talk at, "make million dollar viral facebook marketplace post"

What is the evidence that "dozens of users publicly canceled their subscriptions"?

A total of 4 users claimed that they did or would cancel their subscriptions in the comments, and 3/4 of them hedged by saying that they would cancel if this problem were real or happened to them. It looks like only 1 person claimed to have cancelled already.

Is there some other discussion you're looking at?


Submitted title was "Cursor IDE support hallucinates lockout policy causes mass user cancellations" - I've de-massed it now.

Since the HN title rule is "Please use the original title, unless it is misleading or linkbait" and the OP title is arguably misleading, I kept the submitter's title. But if there's a more accurate or neutral way to say what happened, we can change it again.


Yeah, it's a reddit thread with 59 comments.

Yet if you went by the HN comments, you'd think it were the biggest item on primetime news.

People are really champing at the bit.


Currently, there are 59 comments after the post was deleted and the thread was locked.

It is worth mentioning that the comments that remain are valuable, as they highlight the captured market size and express concern about the impending deterioration of the situation.


From the top Reddit post:

> Apologies about the confusion here.

If this was a sincere apology, they'd stop trying to make a chat bot do support.


Yup, hallucinations are still a big problem for LLMs.

Nope, there's no reliable solution for them, as of yet.

There's hope that hallucinations will be solved by someone, somehow, soon... but hope is not a strategy.

There's also hype about non-stop progress in AI. Hype is more a strategy... but it can only work for so long.

If no solution materializes soon, many early-adopter LLM projects/trials will be cancelled. Sigh.


trying to "fix hallucinations" is like trying to fix humans being wrong. it's never going to happen. we can maybe iterate towards an asymptote, but we're never going to "fix hallucinations"

No: an LLM that doesn't confabulate will certainly get things wrong in some of the same ways that honest humans do - being misinformed, confusing similar things, "brain" damage from bad programming or hardware errors. But LLM confabulations like the one we're discussing only occur in humans when they're being sociopathically dishonest. A lawyer who makes up a court case is not a "human being wrong," it's a human lying, intentionally trying to deceive. When an LLM does it, it's because it is not capable of understanding that court cases are real events that actually happened.

Cursor's AI agent simply autocompleted a bunch of words that looked like a standard TOU agreement, presumably based on the thousands of such agreements in its training data. It is not actually capable of recognizing that it made a mistake, though I'm sure if you pointed it out directly it would say "you're right, I made a mistake." If a human did this, making up TOU explanations without bothering to check the actual agreement, the explanation would be that they were unbelievably cynical and lazy.

It is very depressing that ChatGPT has been out for nearly three years and we're still having this discussion.


since you've made a throwaway account to say this, I don't expect you to actually read this reply, so I'm not going to put any effort into writing this, but essentially this is a fundamental lack of understanding of humans, brains, and knowledge in general, and ChatGPT being out 3 years is completely irrelevant to that.

not OP but I found that response compelling. I know that humans also confabulate, but it feels intuitively true to me that humans won't unintentionally make something up out of whole cloth with the same level of detail that an llm will hallucinate at. so a human might say "oh yeah there's a library for drawing invisible red lines" but an llm might give you "working" code implementing your impossible task.

I've seen plenty of humans hallucinating many things unintentionally. This does not track. Some people believe there's an entity listening when you kneel and talk to yourself, others will swear for their lives they saw aliens, they got abducted, etc.

Memories are known to be made up by our brains, so even events that we witnessed will be distorted when recalled.

So I agree with GP, that response shows a pretty big lack of understanding on how our brains work.


One workaround when using RAG, as mentioned in a podcast I listened to, involves employing a second LLM agent to assess the work of the first LLM. This agent checks the response for hallucinations by requiring the first LLM to cite its sources and then attempting to locate those sources.
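
A minimal sketch of that verify-with-a-second-pass pattern (the ask_llm function below is a stand-in for whatever chat-completion API you actually call, and the prompts are illustrative):

    # Sketch of "a second LLM checks the first": draft an answer from provided
    # sources, then ask a verifier pass whether every claim is supported.
    def ask_llm(prompt: str) -> str:
        raise NotImplementedError  # placeholder for your chat-completion API call

    def answer_with_verification(question: str, sources: dict[str, str]) -> str:
        context = "\n\n".join(f"[{doc_id}] {text}" for doc_id, text in sources.items())
        draft = ask_llm(
            "Answer using ONLY the sources below and cite them as [id].\n\n"
            f"{context}\n\nQuestion: {question}"
        )
        verdict = ask_llm(
            "Check every claim in the answer against the sources. "
            "Reply SUPPORTED or UNSUPPORTED with a short reason.\n\n"
            f"Sources:\n{context}\n\nAnswer:\n{draft}"
        )
        if verdict.strip().upper().startswith("UNSUPPORTED"):
            return "I can't answer that reliably from the available sources."
        return draft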

Every llm output is a hallucination. Sometimes it happens to match reality.

I get an uneasy feeling logging into a text editor (VS Code) and seeing a Microsoft-correlated account, work or personal, in the lower left corner. I understand it’s for settings sync or whatever, but I’d prefer to keep to a simple config JSON or XML (pretty sure most settings are in JSON).

I have no problem, however, pasting an encryption public key into my Sublime Text editor. I’m not completely turned off by the ability to do telemetry, tracking, or analytics. But having a login for a text editor is totally unappealing to me, with all the overhead.

It’s a bummer that, similar to browsers and Chrome, the text editor with an active package marketplace necessitates some tech major underwriting the development with “open source” code but a closed kernel.

Long live Sublime Text (I’m aware there are purer text editors, but I do use a mouse).


As far as I can tell, the account is just for:

- github integration (e.g. git auth, sync text editor settings in private gist)

- a trusted third party server for negotiating p2p sessions with someone else (for pair programming, debugging over a call, etc...)

But anyone who wants to remove the microsoft/github account features from their editor entirely can just use vscodium instead.


>Community was in revolt

>Dozens of users publicly cancelled

A bit hyperbolic, no? Last I read they have over 400,000 paying users.


A huge problem. I think AI-based customer service is more negative than positive in general. First, it entrusts to AI an operation that needs to be timely and, above all, empathetic. Second, although it is faster to train a model to become a very good customer assistant, continuous human monitoring, improvement and error correction are required afterwards. So in the end, in addition to not having a customer assistant capable of “being close” to the customer and understanding the finer nuances of a request, this approach leads to inefficiency and slowdowns in the medium to long term.

Wow. This one will go down in the history books as an example of AI hype outpacing AI capability.

Wow. This one will go down in the history books as yet another example of AI hype outpacing AI capability.

FTFY

Also see every single genAI PR release showing an obvious uncanny-valley image (hands with more than the expected number of fingers). See Apple's propaganda videos vs. actual abilities. There are plenty of other (all???) PR examples where the product does not do what is advertised on the tin.


If you use AI to generate any communication, you are still fully responsible for it. It's still your communication, just using a tool.

Some time ago, in an electronics online shop I asked about warranty terms from a chatbot and got favorable answers. It didn't take two hours before a human contacted me via email to correct a misunderstanding.

Remember: They think you should trust their AI's output with your codebase.

I've had a very good experience with Cursor on small Typescript projects.

It started hallucinating a lot as my typescript project got bigger.

I found it pretty useless in languages like Go and C++.

I ended up canceling Cursor this month. It was messing up working code, suggesting random changes, and ultimately increasing my cognitive load instead of reducing it.


LLMs for customer support is for ghetto companies that want to cheap out on quality. That's why you'll see Comcast and such use it, for example, but not your broker or anywhere where the stakes on the company's reputation are non-zero.

A question to the HN community. Is there any profit in running web sites like they were designed during peak web(~2007)?

No AI, less crappy frameworks, fewer dark patterns, etc.


Consider the source. I assume companies will learn the hard way not to fake human interaction and be forced to mention when responses are auto-generated (a better word than "AI" really). All that effort for so many to build rapport and trust and all pissed away by trying to save a buck and having an ambitious bot trying to go solo.

The coverup is worse than the crime. Truly, man has made machine in his image.

Literally the only safe way to use an LLM in a business context is as an input to a trusted human expert, and the jury is still out for even that case.

Letting an AI pose as customer support is just begging for trouble, and Cursor had their wish appropriately granted.


I was playing around with ChatGPT the other day and encountered a similar issue. Once I hit the rate limit, the replies telling me that seemed to be AI generated. When I waited the requisite time, the next use attempt repeated the last message about being past the limit and needing to wait 13 hours again.

It seemed to be reading from the conversation to determine this. Oops! Replaying an earlier message worked fine.


> most surreal product screwups

It seems you're not aware of the issue which plagued tens of Cursor releases, where the software would auto-delete itself on updates.

It was pretty hilarious, to be honest. Your workflow would consist of always installing the editor before use.


FWIW, to illustrate how insane and risky it is to use an AI bot deceptively pretending to be a human as the first line of customer support: I was just asking ChatGPT 4.5 Research Preview to analyze the reddit and hn threads about Cursor's horrible AI customer service bot, pasting them in piece by piece, and finally writing this prompt:

I said:

[Pasted reddit and hn discussion in pieces asking it to hold off commenting until I told it that was all.]

>Ok that is all, now please summarize all the points people raise and interesting discussions. In DETAIL! Write a full deep research report on this and the original reddit discussion, with quotes naming the users who say it, summarizing important back-and-forth threads. You can omit unimportant trolls and inane chit chat.

ChatGPT said:

>The Kurdistan Workers' Party (PKK) is designated as a terrorist organization by several countries and international bodies. The European Union, the United States, and Turkey have officially listed the PKK as a terrorist organization. This designation is based on the group's history of armed conflict and attacks, particularly in Turkey.

>The PKK has been involved in an armed struggle against the Turkish state since the 1980s, seeking greater autonomy and rights for Kurds in Turkey. This conflict has resulted in significant violence and loss of life over the decades.

>While the PKK is designated as a terrorist organization by the EU, the U.S., and Turkey, other countries may have different stances or may not have made an official designation. It's important to consult the specific policies of each country for their official position on the PKK.

>Sources

I said:

>What does your response have to do with a discussion of Cursor's AI customer service?

ChatGPT said:

>You're right—my previous response about the PKK had nothing to do with the discussion about Cursor's AI customer service issue. It seems that I mistakenly pasted content unrelated to our current topic.

>Let me properly address your original request by summarizing in detail the issues raised in the discussions regarding Cursor's AI customer service: [...]

It makes me wonder who got the summary of the reddit and hn discussions in response to their question about terrorist organizations, and what that makes them think about Cursor as a company, and their involvement in armed struggles, international politics, and terrorism?


You cannot put LLMs in control loops.

The current AIs are such people-pleasers that I really take the time to think about how to form the prompt, so they can't just say "yes, it is like that" in a polite way.

god I hate redditors. what is it about that website that makes every user so so incredibly desperate to jump on literally any bandwagon they can lay their woolly arses on?

This drama is a very good thing because 1) now companies might reconsider replacing customer support with AI and 2) the initial victim was an AI company.

It could be better though... I wish this had happened to a company selling "AI support solutions"..


A company of 10 devs might not actually have customer support at all. We all know devs are not the greatest customer support people.

It's an AI company that is presumably drowning in money. What humans do work there have probably already had a good laugh and forgotten about this incident.

I don't understand the negativity. I use Cursor and love it.

Are there real challenges with forking VS Code? Yep. Are there glitches with LLMs? Sure. Are there other AI-powered coding alternatives that can do some of the same things? You betcha.

But net-net, Cursor's an amazing power tool that strongly extends what we can accomplish in any hour, day, or week.


If you've ever worked in support you would know that even the most technical people can be highly inarticulate, have an inability to share even the most basic details, and if English isn't their first language things can get very confusing. An AI bot answering those people with a lack of context and pretending to be a real support agent is an accident waiting to happen.

People speaking with LLMs is going to be fun.

At least until someone dies.


Local-only bots or bust. Stop relying on SaaS! "Cloud" is just a synonym for somebody else's system!

Are you using a locally run LLM that is as capable as Claude 4.7? Kind of seems like the answer has to be "not as capable, and also the hardware investment was insane".

Can't move fast and break things unless you do stuff that breaks things. In this case customer trust.

Incidentally, this is a great way to find out you're never going to get real support when you need it, just AI responses designed to make you get tired of trying to talk to a real person they need to pay.

AI bubble popping yet? Looking forward to not buying GPUs that cost $2K (before tariffs…).

AI's killer app is "marketing". Bots are incredibly useful for selling items, services, and politics, and AI makes the bots indistinguishable from real people to most people most of the time. It's highly effective so I don't see that market shrinking any time soon.

[flagged]


There’s a bubble, in the sense of the dot-com bubble. There’s a lot of money being thrown at the AI equivalents of Pets.com, and that will pop eventually. But the internet persisted and transformed every aspect of the society even after the bubble. I think AI will be similar.

AI has historically gone through bubbles, AI Winter is a term exactly because of these cycles of hype.

Maybe this time is different. Maybe not.


Who knows, the topic might be an omen.

1) there is no AI bubble; it has revolutionized how we communicate and learn, and 2) you don't need to buy an expensive GPU for a local LLM; any contemporary laptop with enough RAM can run an uncensored Gemma 3 fast
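For what it's worth, here is a minimal sketch of what "local only" can look like in practice, assuming Ollama and its Python client (the model tag and prompt are illustrative, nothing to do with Cursor):

```python
# Minimal local-inference sketch: an open-weights Gemma model served by a
# local Ollama instance, so no request ever leaves the machine.
# Assumes `pip install ollama`, a running Ollama server, and `ollama pull gemma3`.
import ollama

response = ollama.chat(
    model="gemma3",  # swap for whichever local build you have pulled
    messages=[{"role": "user", "content": "Summarize this bug report in two sentences."}],
)
print(response["message"]["content"])
```

Whether that counts as "fast" obviously depends on how much RAM you have and which quantization you pick.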

Railroads revolutionized transport and yet railway mania was undeniably a bubble. Something can be both very useful and yet also overvalued and overhyped leading to significant malinvestment (sometimes, everyone wins to the detriment of the investors, sometimes just everyone loses out because a huge amount of effort was spent on not useful stuff, usually somewhere in between).

I'm an AI fan, but there's clearly a desperate attempt by just about every tech company to integrate AI at the cost of genuinely productive developments in the space, in a manner that one might describe as a "bubble." Microsoft's gotta buy enough GPUs to handle Copilot in Windows Notepad, after all...

Calling it desperate is a subjective assessment. Yes, some strategies are more haphazard than others, but ignoring generative AI currently is the same as ignoring the internet in 1999 or mobile in 2010 (which Facebook famously regretted, paying $1B and $19B to buy Instagram and WhatsApp, respectively, in order to catch up).

I'm shocked by this perspective, and I'm deep into the LLM game (shipped 7 figure products using LLMs). I don't feel like anything has been revolutionized around communication - I can spot AI generated emails pretty easily (just send the prompt, people). On the learning front I do find LLMs to be more capable search engines for many tasks, so they're helpful absolutely.

Vibe coding in all of its glory

cursor got there first, but it’s just not that good. stuff not built on top of vscode feels much more promising. i’ve been enjoying zed.

What a "Open the pod bay doors, HAL" moment.

nah this is wild, imagine an AI company 'automating' their most critical software

surely it wouldn't backfire, right???

ok, joking aside, from this case alone I think we can all agree that AI is not replacing humans any time soon


This comes on the heels of researchers putting JavaScript backdoors into AI-assisted programs made in Cursor via poisoned includes as well.

Cursor is so good that a snafu like this is, remarkably and tragically, tolerable.

The original bug that prevented login was no doubt generated by AI. They probably ran it through an AI code review tool like the one posted here recently.

The world is drowning in bullshit and delusion. Programming was one of the few remaining places where you had to be precise, where it was harder to fool yourself. Where you had to understand it to program it. That's being taken away and it looks like a lot of people are embracing what is coming. It's hardly surprising - we just love our delusions too much.


Or, alternatively, it was a bug written by a human. Humans do a good enough job making bugs, no?

I’d love to visit the alternate universe you inhabit where there are no bugs in human written code.

at no point did they say anything that even comes close to saying “no bugs in human written code.”

if you’re willing to come down off your defensive AI position, because your response is a common one from people who are bought into the tech, i’ll try to explain what they were saying (if not, stop reading now, save yourself some time).

maybe you’ll learn something, who knows :shrug:

> Programming was one of the few remaining places where you had to be precise, where it was harder to fool yourself. Where you had to understand it to program it.

they are talking about the approach, motivations and attitudes involved in “the craft”.

we strive for perfection, knowing we will never reach it. we, as programmers/hackers/engineers must see past our own bullshit/delusions to find our way to the fabled “solution”.

they are lamenting how those attitudes have shifted towards “fuck it, that’ll do, who cares if the code reads good, LLM made it work”.

where in the “vibe coding” feedback loop is there a place for me, a human being, to realise i have completely misunderstood a concept for the last five years and suddenly realise “oh shit, THATS HOW THAT WORKS!? HOW HAVE I NOT REALISED THAT FOR FIVE YEARS.” ?

where in “just ask chatgpt for a summary about a topic” is my journey where i learn about a documentation rendering library that i never even knew existed until i actually started reading the docs site for a library?

maybe we were thinking about transferring our docs off confluence onto a public site to document our API? asking chatGpt removes that opportunity for accidental learning and growth.

in essence, they’re lamenting the sacrifice people seem to be willing to make for convenience, at the price of continually growing and learning as a human being.

at least that’s my take on it. probably wrong — but if i am at least i get to learn something new and grow as a person and see past my own bullshit and delusions!


It’s the small things like this that made me switch to windsurf.

I do a few AI projects and my number one rule is to use LLMs only to process requests and never to generate responses.
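As an illustration of that rule, here is a sketch only, assuming an OpenAI-style client (the model name, intents, and templates are made up): the LLM classifies the incoming request, and every reply the user sees comes from a fixed, human-written template.

```python
from openai import OpenAI

# Human-written replies: the only text that can ever reach the customer.
CANNED_REPLIES = {
    "billing": "Billing questions are handled at billing@example.com.",
    "login_issue": "Please check your active sessions in settings and sign in again.",
    "unknown": "We've routed your request to a human agent.",
}

def classify_intent(message: str) -> str:
    """Use the LLM only to label the request, never to write the answer."""
    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    result = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Classify the support request as exactly one word: billing, login_issue, or unknown."},
            {"role": "user", "content": message},
        ],
    )
    label = result.choices[0].message.content.strip().lower()
    return label if label in CANNED_REPLIES else "unknown"

def answer(message: str) -> str:
    # Because the model never generates the response, it cannot hallucinate a policy.
    return CANNED_REPLIES[classify_intent(message)]
```

The worst a hallucination can do here is mis-route a ticket; it can never invent a policy.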

Awesome. It warms my heart that both parties who are supporting AI, users and company, felt its negative effects. We can only hope this continues to happen.

I disagree with but understand selling AI as a hot thing that can reliably do things to other companies. False marketing has been around since "buh buh" sold a deer corpse to "bah bah" in exchange for his pretty leaf mat by telling him it's actually a cow.

But drinking the kool-aid yourself? That demonstrates a new low in human mental faculty.


I love this moment so much, I want to marry it and raise a family of little moments with it. A company that claims AI will solve everything cannot even get their own support right. Chef's kiss.

1. Whenever AI is used in a closed loop, with human feedback and a human checking the output for quality/hallucinations before passing it along, it's often fine.

2. Whenever it is used totally on its own, with no humans in the loop, it's awful and shit like this happens.

Yet, every AI company seems to want to pretend we're ready for #2, they market their products as #2, they convince their C-suite customers that their companies should buy #2, and it's total bullshit--we're so far from that. AI tools can barely augment a human in the driver's seat. It's not even close to being ready to operate on its own.
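For the pattern described in #1, a minimal sketch (with hypothetical helper names, not any vendor's API) is simply a review gate: the model drafts, a human approves or edits, and nothing is sent otherwise.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    ticket_id: str
    text: str
    approved: bool = False  # flipped only by a human reviewer

def draft_reply(ticket_id: str, question: str, llm) -> Draft:
    # `llm` is any callable returning a string; its output is only a suggestion.
    return Draft(ticket_id=ticket_id, text=llm(question))

def send_if_approved(draft: Draft, send) -> bool:
    # The gate: unapproved drafts are dropped, never sent unchecked.
    if not draft.approved:
        return False
    send(draft.ticket_id, draft.text)
    return True
```

The whole difference between #1 and #2 is whether that `approved` flag exists and a human actually owns it.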


Good. I hope other companies stupid enough to subject their users to unleashed AI for support instead of real humans reap the consequences of their actions. I'll be eating popcorn and mocking them from my little corner of the internet.

Cursor is becoming useless now. They are too money-greedy to be trusted. Last time I installed it on another machine they told me to subscribe to use it. It is gross.

I am shocked. Really? At the beginning of the AI hype, as a product designer, I fell back on my routine: research. You don't need a PhD to find people smarter than you, and experts who can explain what 'stochastic parrots' means.

To understand how LLMs work and why hallucination is an inherent part of the tech behind "AI". Maybe the core problem is implementation practices that remove critical thinking and testing? Maybe the core problem is the 'fake it till you make it' ideology? I don't know. But I am sure about one thing: this, like any other postmodern technology, will bring more problems than solutions.


Next thing you know AI is going to hallucinate a nuclear launch from China and defend itself.

> Not because they made a mistake

Except they did make a mistake: trusting their Simulated Intelligence (I’m done calling it “AI”) with their customers’ trust.


Live by the AI slop, die by the (support) AI slop?

Just use aider. Open source, no bullshit, use it with the llm you prefer.

What's the best LLM-first editor, designed with LLMs in mind? Is it still Cursor?

There's Windsurf, Cline, Zed, and Copilot got a huge update too. Is Cursor still leading the space?


hallucinate ≠ fabricate

Oooh I wanted to try this for a while so here goes…

This doesn’t seem like anything new. Ill-informed support staff has always existed, and could also give bad information to users. AI is not the problem. And it hasn’t created any problems that weren’t already there before AI.

Usually by the time I get to a post on HN criticizing AI, someone has already posted this exact type of rebuttal to any criticism…


Obligatory reference to the Streisand Effect: that trying to hide something (e.g. a Reddit post) often has the unintended consequence of drawing more attention to it.

https://en.wikipedia.org/wiki/Streisand_effect


> Dozens of users publicly canceled their subscriptions, myself included.

Makes you think of that one meme.


I haven't seen people comment on just the wow factor here. Apparently Cursor produced a fully integrated AI app, and it orchestrated a self-destruct process in a fashion emulating the way some human-managed companies have recently self-destructed. Some AI fails are easy and some require a lot of work, apparently.

Looking forward to apps trained on these Reddit threads.


Who will be the first AI-first company to close the loop with autonomous coding bots, and create self-fulfilling prophecies where a customer support bot hallucinates a policy like OP, it gets stored in the Q&A/documentation retrieval database, and the autonomous bots implement and roll out the policy and lock out all the users (and maybe the original human developers as well)?

What?

"I'm sorry Dave, I'm afraid I can't do that."

This is "AGI's finest".

It's what we all wanted: replacing your human support team with one run exclusively by AI LLM bots while they hallucinate to their users. All unchecked.

Now this bug has turned into a multi-million dollar mistake and cost Cursor millions of dollars overnight.

What if this was a critical control system in a hospital or energy company and their AI support team (with zero humans) hallucinated a wrong meter reading and overcharged their customers? Or the AI support team hallucinated the wrong medication to a patient?

Is this the AGI future we all want?


Are we sure that the damage is in “millions”? Could it not be just the same old “vocal and loud minority”, who will wake up tomorrow and re-activate their subscription before the lunch break once the Cursor team comes along and writes a heartfelt apology post (well, generated by AI with a system prompt of “you are a world class PR campaign manager …”)?

You're right, but this is more general - it has nothing to do with AGI and everything to do with poor management. It reminds me very much of the Chernobyl disaster, and the myriad examples in Taleb's "The Black Swan".

Where are you seeing AGI?

It's a Reddit post with 65 upvotes. Where are people getting millions of dollars overnight? HN is too dramatic lol

Even if they generously compensate locked out users, they'll probably end up ahead using the AI bot.

This is extremely funny. AI can't have accountability. Good luck with that.

Use AI to augment, but don't replace humans with it as a 100% system if you can't predict and own the failure rate.

My advice would be to use more configurable tools with less interest in selling fake perfection. Aider works.


But look, AI did take a job today. The job of cursor product developers is gone now. ;) But seriously, every time I hear media telling me AI is going to replace us, they conveniently forget episodes like this. AI may take the jobs one day, but it doesn't seem like that day is any time soon when AI adopters keep getting burned.

From cursor developer: "Hey! We have no such policy. You're of course free to use Cursor on multiple machines.

Unfortunately, this is an incorrect response from a front-line AI support bot. We did roll out a change to improve the security of sessions, and we're investigating to see if it caused any problems with session invalidation. We also do provide a UI for seeing active sessions at cursor.com/settings.

Apologies about the confusion here."


lol this is totally the kind of company you should be giving money to

and the best thing about it is that the base model is going to be trained on reddit posts so expect SupportBot3000 to be even more confident about this fact in the future!

I mean to be fair, I like that they're putting their money where their mouth is so to speak - if you want to sell a product based on the idea that AI can handle complex tasks, you should probably have AI doing what should be simple, frontline support.

> you should probably have AI doing what should be simple, frontline support.

AI companies are going to prove (to the market or to the actual people using their products) that a bunch of "simple" problems aren't at all simple and have been undervalued for a long time.

Such as support.


I don't agree with that at all. Hallucination is a very well known issue. Sure, leverage AI to improve productivity... but not even having a human look over the responses shows they don't care about their customers.

The number of times real human-powered support caused me massive headaches and sometimes financial damage, and the number of times my lawyer had to fix those after my own attempts to explain why they were wrong went nowhere… I am not surprised that AI will do the same; the creation is in the image of the creator and all that.

If you had a human support person feeding the support question into the AI to get a hint, do you think that support person is going to know that the AI response is made up and not actually a correct answer? If they knew the correct answer, they wouldn't have needed to ask the AI.

> if you want to sell a product based on the idea that AI can handle complex tasks, you should probably have AI doing what should be simple, frontline support.

That would only be true if you were correct that your AI can handle complex tasks. If you want to sell dowsing rods, you probably don't want to structure your own company to rely on the rods.


When you vibe-code customer service

Cursor: a VS Code extension that got itself valued at $10bn.

It’s basically a VS Code fork that illegally uses official extensions that are not allowed to be used in forks.

Does "it" use the official extensions, or does it only allow "you" the user to use them?

First of all, their docs link to the MS marketplace:

https://www.cursor.com/how-to-install-extension

Which is basically an article on how to use an extension in a way that’s forbidden.

If that was not bad enough, the editor also told you to install certain extensions when certain file types were opened, which was also against the ToS of those extensions.

And Cursor could instead just use the Eclipse Foundation’s VSIX marketplace (Open VSX), which does not contain the restricted extensions.

What they do is at least shady.

And yes I’m not a fan of the fact that Microsoft does this, even worse they closed the source (or some parts of it) of some extensions as well, which is also a bad move (but their right)


I don't see a problem with this. If it's an extension on my machine, why do I care about the TOS?

The extensions themselves have licenses that prohibit their use with anything other than VSCode.

(You should keep this in mind next time someone tells you that VSCode is "open source", by the way. The core IDE is, sure, but if you need to do e.g. Python or C++, the official Microsoft extensions involved all have these kinds of clauses in them.)


I don't use VSCode (or Cursor in this case (which I do think was malicious in the way it blindly hallucinated a policy for a paying customer)); I use vim or notepad++ depending on my mood.

I just don't have a problem with people "violating" Terms of Service or End User License Agreements and am not really convinced there's a legal argument there either.


I personally don't have a problem with that either, but as far as legalities go, EULAs are legally binding in the US.

Have EULAs been tested in court?

For distribution licenses, I would assume they have. Can't put GPL software in your closed source code, can't just download Photoshop and copy it and give it out, etc. And that makes sense and you have some reasonable path to damage/penalties (GPL → your software is now open source, Photoshop → fines or whatever)

But if you download some free piece of software and use it with some other piece of free piece software even though they say "please don't" in the EULA, what could the criminal or civil penalties possibly be?


Yes, there were some cases that confirmed the validity of "clickwrap", unfortunately: https://www.elgaronline.com/edcollchap/edcoll/9781783479917/...

I don't know what the hypothetical penalty would be for mere use contrary to EULA, though. It would be breach of contract, and presumably the court would determine actual damages, but I don't know what cost basis there would be if the software in question was distributed freely. However, fine or no fine, I would expect the court to order the defendant to cease using software in violation of EULA, and at that point further use would be contempt of court, no?


Fair enough, that is quite unfortunate.

So I've always avoided using the Windows Store on my Windows machines, I think I managed to get WSL2 installed without using it lol.

So I'm not sure on the details, but do the steps on https://www.cursor.com/how-to-install-extension bypass clicking "I agree" since they just download and drag? Because from what I can tell, the example in https://www.elgaronline.com/edcollchap/edcoll/9781783479917/... is because the customer clicked "I agree" before installing.


It's different if you do it yourself, versus a public company actively encouraging you to.

The 1 million users + $200 million in revenue probably had something to do with the valuation.

Slack was a similar thing. IPO'd at 10 million users with 400M in revenue but got destroyed by Microsoft thanks to Teams.

Cursor is at a worse position and at greater risk of ending up like Slack very quickly and Microsoft will do the exact same thing they did to Slack.

This time by extinguishing (EEE) them: racing the price of VS Code + Copilot toward zero, until it is free.

The best thing for Cursor would be for OpenAI to buy them at a $10B valuation.


Wait, what?! Slack got destroyed by Teams? You seem to be living a few parallel universes away from the one I'm in.

Yes, "destroyed" is apt. See the graph under "Slack vs Microsoft Teams: Users" here: https://www.businessofapps.com/data/slack-statistics/

Interesting, but isn't it just because Teams is bundled and integrated with MS Office? I couldn't find any specific stats on revenue or how many businesses actually choose to pay for Teams specifically.

It's sad but true that many orgs went from Slack to Teams due to Microsoft's monopolistic sales tactics. Sucks because now I use Teams every day.

Teams has 8 times the monthly active users.

Microsoft pushed Teams onto my personal Windows PC in a recent update, as a startup item. And, as far as I could tell, automatically logged me in to my Microsoft account on it.

I'd be very skeptical of their MAU claims.


[deleted]


A few hallucinations. It's right more times than it's wrong. Humans make mistakes as well. Cosmic justice.

Yes, but humans can be held accountable.

I'd argue that humans also more easily learn from huge mistakes. Typically, we need only one training sample to avoid a whole class of errors in the future (also because we are being held accountable).

How is this not an example of humans being held accountable? What would be the difference here if a help center article contained incorrect information? Would you go after the technical writer instead of the founders or Cursor employees responding on Reddit?

As annoying as it is when the human support tech is wrong about something, I'm not hoping they'll lose their job as a result. I want them to have better training/docs so it doesn't happen again in the future, just like I'm sure they'll do with this AI bot.

That only works well if someone is in an appropriate job though. Keeping someone in a position they are unqualified for and majorly screwing up at isn't doing anyone any favors.

Fully agree. My analogy fits here too.

> I'm not hoping they'll lose their job as a result

I have empathy for humans. It's not yet a thought crime to suggest that the existence of an LLM should be ended. The analogy would make me afraid of the future if I think about it too much.


I probably should have added sarcasm tags to my post. My very firm opinion is that AI should only make suggestions to humans and not decisions for humans.

ARTIFICIAL intelligence

Fingers crossed the users who leave are all the noisy ones. I still enjoy using Cursor, but their forum is filled with too many posts about $20 being too expensive, or how they need to raise caps, or how Cursor is the worst tool in the world.

And now you’re on a forum complaining about the user base complaining. How very meta.

Yes, I’m complaining about complainers. It’s the circle of life on Hacker News, someone always has to play the food chain’s top predator: the mildly annoyed power user.

Wow, so much negativity. The Cursor team has built a wonderful product — and they made a mistake. Now haters are trying to rip it apart because of one mistake. Yeah, I also paid for the yearly plan. Yeah, I was also annoyed to get locked out on another device. Was it annoying? Yes. But will I abandon the product? No, not because of this. Chill, folks. Get a life.

Sorry, but this screams copium. It's right in your face why Cursor is bad, and your response is "but the product is good". Is it? What did they do better than e.g. aider, Claude Code, and Zed? What makes them stand out?

I honestly don't get it, but if you want to support such a lazy team then have at it, no one's stopping you


You are aggressive and rude. They didn’t sell you a house. It’s a product that costs 20 usd a month. I know all other products and prefer Claude Code. But it is expensive. I burn 20-40 USD in a single session. And I find Cursor pretty good. Better than the alternatives, minus Claude Code.




