Since it is pretty evident that users, companies, and snake-oil vendors seem to think that ChatGPT is a tool for generating meaningful answers, I was wondering if the site could use a fake, canonical question that can be linked to whenever someone falls into that misconception.
I was thinking about something like this:
Question
I have tried to use ChatGPT as a tool to help me find solutions to common issues like helping with coding, providing information about historical events, solving math problems, and so on.
I have noticed that ChatGPT often does not provide real, usable answers but will instead hallucinate facts, events, and so on. This is most evident when the user asks for confirmation of a wrong premise.
Not only did the bot agree with a false premise, the answer also contained contradictions: it agreed that Nintendo bought Link from Sega while at the same time saying that "Sega acquired the rights to both Sonic and Link". The message is grammatically correct, but its actual meaning is not.
Answer
This is due to a common misconception about what Large Language Models like ChatGPT do.
Sadly, ChatGPT has often been presented as an intelligent "AI assistant" or help desk tool, but this is far from the truth: all ChatGPT tries to do is generate a sequence of words based on some scoring rules and the data it was trained on.
You can read a quite accurate yet very accessible article about the inner workings of ChatGPT written by Stephen Wolfram here.
An extreme summary of that article (not very accurate, but hopefully enough for this short explanation) is that, given a sequence of words, ChatGPT tries to calculate which next word has the best "score".
The color of this apple is ...
| Word  | Score     |
|-------|-----------|
| red   | very high |
| pink  | low       |
| blue  | low       |
| green | very high |
| dog   | very low  |
Again, this is an oversimplification, but please bear with me.
How are those scores calculated, and what do they represent? While the actual math is fairly complex, the purpose is quite simple: the model uses the dataset it was trained on to give each possible next piece of the message a score that represents how "likely" that continuation is.
You could see this as a sort of probability that a given word or phrase will follow your previous message. It should be intuitive that, if the training data is made up of meaningful, correct, non-fabricated text, then most of the time a phrase like "The color of this apple is" will continue with words like "green" or "red" and not "blue" or "dog".
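To make the idea concrete, here is a minimal toy sketch in Python. The candidate words and their scores are made up by hand purely for illustration; a real model like ChatGPT computes a score for every token in its vocabulary with a huge neural network. Only the last step, turning scores into probabilities and picking a likely continuation, looks conceptually like this:

```python
import math

# Toy illustration only: these scores are invented by hand.
# A real LLM computes a score for every token in its vocabulary
# using a neural network trained on its dataset.
prompt = "The color of this apple is"
candidate_scores = {
    "red":   9.0,   # "very high"
    "green": 8.5,   # "very high"
    "pink":  2.0,   # "low"
    "blue":  1.5,   # "low"
    "dog":  -4.0,   # "very low"
}

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = {word: math.exp(s) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

probs = softmax(candidate_scores)
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{prompt} {word}: {p:.3f}")

# The highest-scoring continuation wins (or is sampled with some randomness):
print("Most likely continuation:", max(probs, key=probs.get))
```

Nothing in this procedure checks whether the chosen word is *true*; it only checks which word is most likely to follow, given the training data.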
At this point it should be clear WHY ChatGPT is capable of parroting meaningful info even without understanding the semantic meaning of the words.
Ask yourself this: if your phrase so far is
The year Columbus discovered America is ...
| Word       | Score |
|------------|-------|
| 1492       | ?     |
| dog        | ?     |
| Fluttershy | ?     |
| London     | ?     |
What word would you expect to have the best score? How likely would you consider the phrase to continue with "Fluttershy"? Do you really expect the training data to contain "The year Columbus discovered America is London"?
As soon as you realize this, ChatGPT's limitations should become evident.
As the training set grows, it becomes more and more likely that, if your question is simple enough, ChatGPT will parrot some actually relevant text it "knows about" and provide you with a useful answer.
Yet at the same time you should realize that this in no way implies that ChatGPT understands what it is generating.
Earlier, we asked what you would expect to be the best-scoring option for continuing the message "The year Columbus discovered America is ...".
Now... let me ask something a little different: what would you expect to be the best-scoring option for continuing the message "The year before the year Columbus discovered America is ..."?
Sadly, the answer still seems to be 1492.
The model is able to identify a statistical relationship between the words "discovered", "America", and the number 1492, but it cannot understand their actual meaning. So, by asking for the year before the year America was discovered, we can easily trick the tool into giving us an inaccurate answer.
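To see why pure statistical association misses the word "before", here is a deliberately crude toy sketch. This is *not* how ChatGPT works internally; it is an exaggeration of the "keyword association" failure mode: a lookup that returns whichever remembered phrase shares the most keywords with the prompt, and therefore gives the same answer to both questions.

```python
# A deliberately crude stand-in for "statistical association":
# a tiny pretend training set mapping keyword groups to continuations.
tiny_training_set = {
    ("columbus", "discovered", "america"): "1492",
    ("capital", "france"): "Paris",
}

def associative_continuation(prompt: str) -> str:
    """Return the continuation whose keywords overlap the prompt the most."""
    words = set(prompt.lower().replace(".", "").split())
    best_keys = max(tiny_training_set, key=lambda ks: len(words & set(ks)))
    return tiny_training_set[best_keys]

print(associative_continuation("The year Columbus discovered America is"))
# -> 1492  (correct)

print(associative_continuation("The year before the year Columbus discovered America is"))
# -> 1492  (wrong: "before" changes the meaning but barely changes the statistics)
```

A real LLM is enormously more sophisticated than this, but the underlying issue is the same: the word "before" barely moves the statistics, so it barely moves the answer.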
But the year America was discovered is a very well-defined concept that is likely to be mentioned by multiple sources, and so it will appear very frequently in the training set. So ask yourself: what do you expect to happen when the question is something that, to quote the Hitchhiker's Guide,
"was almost, but not quite, entirely unlike tea"
What if your question is not really that similar to any text the model was trained on? At that point, whatever score the model calculates is probably based on the likelihood of some words appearing in unrelated, irrelevant content.
The result?
The bot will gladly agree with made-up facts because, in doing so, it has fulfilled its purpose: producing text that maximizes a likelihood score.
Generating factual answers was never its purpose.