That said, it does "complete" or "predict" a next word or continuation phrase, so it can be thought of as belonging to the same basic concept of "how to complete what was given"; but so does Vim's crude control-N/P completion, which just finds tokens that start with the characters you've already typed.
Not long after, I was working for another startup that was developing CNN models for classification. We were watching fastText and BERT when they came out, and one thing that impressed me was the subword features, which keep the benefits of words but don't fail catastrophically in the out-of-dictionary case.
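To make that concrete, here's a minimal sketch of fastText-style subword features: a word is represented by its character n-grams, so an unseen word still gets a usable vector built from pieces the model has seen. The bucket count, dimension, and hashing here are illustrative placeholders, not fastText's actual internals.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    padded = f"<{word}>"  # boundary markers, as fastText does
    grams = []
    for n in range(n_min, n_max + 1):
        grams += [padded[i:i + n] for i in range(len(padded) - n + 1)]
    return grams

# Hypothetical embedding table keyed by hashed n-grams.
DIM, BUCKETS = 100, 2_000_000
rng = np.random.default_rng(0)
table = rng.normal(size=(BUCKETS, DIM)).astype(np.float32)

def word_vector(word):
    idx = [hash(g) % BUCKETS for g in char_ngrams(word)]
    return table[idx].mean(axis=0)  # average the subword vectors

print(word_vector("transformer").shape)      # a known word
print(word_vector("transformerized").shape)  # an out-of-dictionary word still gets a vector
```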
Now the transformer models are pretrained by masking out random words from the input, which is better than "predict the next word" because it is bidirectional; the unidirectional approach causes all sorts of problems. (In text generation, for instance, an LSTM starts out in a very small part of the state space and decides whether its patient has cancer not by starting from a latent state the way the real patient did, or the way the person writing the case report did, but because of the first few letters that were randomly chosen.)
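Here's what the masked-word objective looks like in practice, using the Hugging Face transformers fill-mask pipeline (this assumes the library is installed and will download bert-base-uncased on first use; the sentence is just a made-up example):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# The model sees context on BOTH sides of the blank,
# unlike a left-to-right "predict the next word" model.
for candidate in fill("The biopsy showed the patient had [MASK] cancer."):
    print(candidate["token_str"], round(candidate["score"], 3))
```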
Somebody might fine-tune that kind of model to do information extraction or classification, and I've got a pretty good picture of how the pretraining of a transformer works and how to fine-tune it for those tasks.
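The fine-tuning step is roughly: bolt a randomly initialized classification head onto the pretrained encoder and train the whole thing on labeled examples. A hedged sketch, with a toy two-example "dataset" and placeholder labels standing in for a real task:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # new, randomly initialized head

texts = ["invoice total is $120", "meet me at noon"]  # toy labeled data
labels = torch.tensor([1, 0])

batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
out = model(**batch, labels=labels)  # loss = cross-entropy from the new head
out.loss.backward()
optimizer.step()                     # one illustrative gradient step
```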
The chatbots are trained with reinforcement learning from human feedback (RLHF) to play a character: to be helpful, and to stay agreeable even when they refuse to write Hitler speeches. That's another stage of training past "autocomplete".
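The core of that extra stage is a preference model: given a "chosen" and a "rejected" reply to the same prompt, train a reward model so the chosen one scores higher, then tune the chatbot against that reward. This is only a sketch of the pairwise-preference loss; the tiny scoring network and random features below are stand-ins, not any lab's actual setup.

```python
import torch
import torch.nn.functional as F

reward_model = torch.nn.Sequential(  # placeholder for an LM with a scalar reward head
    torch.nn.Linear(768, 256), torch.nn.ReLU(), torch.nn.Linear(256, 1))

def preference_loss(chosen_feats, rejected_feats):
    r_chosen = reward_model(chosen_feats)
    r_rejected = reward_model(rejected_feats)
    # Bradley-Terry style objective: push the chosen reply's score above the rejected one's.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

chosen = torch.randn(8, 768)    # fake embeddings for 8 human comparisons
rejected = torch.randn(8, 768)
loss = preference_loss(chosen, rejected)
loss.backward()                 # the chatbot policy is later optimized against this reward
```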