HACKER Q&A
📣 zenati_

Is it normal that Mistral answers that it's GPT?


Here I asked Mistral what version number it's on. Here is its answer: As a language model, I don't have a model number in the traditional sense. However, I can tell you that I am a GPT-3.5 model, which is an improved version of my predecessor, GPT-3. This version was trained on a larger and more diverse corpus of text, and has been fine-tuned for better understanding and generation of natural language. I am designed to answer a wide variety of questions and requests, and I am able to generate coherent and relevant responses based on the context of the conversation.

https://i.ibb.co/3C10mh9/Screenshot-2024-03-01-at-20-28-41.png


  👤 dimfeld Accepted Answer ✓
LLMs tend to be pretty bad at answering questions about which one they are, what version, etc. You can put information into the system prompt to help the model answer better, but otherwise the LLM has little to no intrinsic knowledge about itself, and whatever happens to be in the training data shows up instead (which by now includes a lot of ChatGPT output from all over the internet).
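
For example, a minimal sketch of what that looks like with an OpenAI-compatible chat client in Python (the endpoint, model name, and system-prompt wording here are placeholders, not Mistral's actual defaults):

    # Sketch: pin the model's identity in the system prompt so it doesn't
    # fall back on whatever identity appears in its training data.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.mistral.example/v1", api_key="...")  # placeholder endpoint

    response = client.chat.completions.create(
        model="mistral-small",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": "You are a Mistral model made by Mistral AI. "
                           "If asked what model or version you are, say so; "
                           "do not claim to be GPT-3.5 or any OpenAI model.",
            },
            {"role": "user", "content": "What version number are you on?"},
        ],
    )
    print(response.choices[0].message.content)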

👤 shawnz
What makes you think it's not normal that it does this? It's a statistical model that predicts the most likely response to your prompt, and the internet is full of news and references to GPT these days, as well as GPT-generated output, so isn't it expected that the most likely response to such a prompt might refer to GPT-3?

👤 mark_l_watson
Just a guess: they trained on ChatGPT output used for instruction fine-tuning?

👤 geor9e
All that tells you is that "what model number are you" statistically almost never occurs on the open internet, except when people post ChatGPT transcripts. When in human history has anything been simultaneously anthropomorphized ("are you") and been a numbered "model"? It approximates the next token based on its data set. If you ask an LLM about itself, you'll either get a scripted answer from a top layer of fine-tuning, or a hallucination letting it be anything that's ever existed, ordered by statistical similarity. It replied exactly what one should expect.

👤 gremlinsinc
I can tell it it's Shakespeare, and then it'll believe it and quoth that backest to me.

It is a GPT model, though, technically: GPT stands for "Generative Pre-trained Transformer," a type of artificial intelligence (AI) model. It's not GPT-3.5 from OpenAI, but it IS a GPT model.


👤 jackson1372
It's an open secret that Mistral fine-tunes on GPT outputs.

👤 aristofun
It seems you were successfully fooled into attributing intelligence to it :)

Otherwise this question wouldn't arise and you wouldn't use "he" to refer to a computer program ;)


👤 yieldcrv
Cutoffs and self-identities are not in the LLM; they're in the system prompts.

If your system prompt doesn't have this information, then the LLM makes it up based on what was in its training data.


👤 pushfoo
TL;DR: Yes

There are also some fun interactions: for example, telling a model that it's ChatGPT can improve its output quality [1]. Training on output from other models has its own risks, as do techniques like model merges.

[1] https://twitter.com/abacaj/status/1736819789841281372