Can LLMs be trained to operate robots?

Question

As a complete non-expert, I'm curious what the SOTA is. OpenAI and others have demonstrated multimodal LLMs where you can prompt with a picture and text and get code as output. How feasible is it to have that code control a robot arm? Is it limited by the training set, or by something more fundamental?

As an example, consider a simple setup of a robot arm and a commodity camera. An operator wearing a VR headset sees through the camera, and the VR controllers move the arm. The operator receives an instruction ("arrange all items by colour") and carries it out; the recorded session is saved and added to the training set. Would a big enough training set be able to produce an instruction-following robot arm?
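To make the setup concrete, here is a rough sketch of what one saved session might look like as training data. It is only an illustration under my own assumptions: the names `Step`, `Demonstration`, and `to_training_pairs` are hypothetical, the camera frame is a placeholder byte string, and the action is assumed to be a simple tuple of arm deltas derived from the VR controller.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Step:
    frame: bytes                # placeholder for one camera image (assumption)
    action: Tuple[float, ...]   # assumed arm command taken from the VR controller


@dataclass
class Demonstration:
    instruction: str            # e.g. "arrange all items by colour"
    steps: List[Step] = field(default_factory=list)

    def record(self, frame: bytes, action: Tuple[float, ...]) -> None:
        """Append one (camera frame, operator action) pair to the session."""
        self.steps.append(Step(frame, action))


def to_training_pairs(demo: Demonstration):
    """Flatten a session into ((instruction, frame), action) examples."""
    return [((demo.instruction, s.frame), s.action) for s in demo.steps]


# One operator session, saved and added to the training set.
training_set = []
demo = Demonstration("arrange all items by colour")
demo.record(b"frame-0", (0.1, 0.0, 0.0))
demo.record(b"frame-1", (0.0, 0.2, 0.0))
training_set.extend(to_training_pairs(demo))
```

Each example pairs what the model would see (the instruction and the current camera frame) with what the operator did at that moment, which is the input/output shape the question is asking about.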

PaulHoule · Accepted Answer

It is a huge research area: https://arxiv.org/search/?query=llm+robots&searchtype=all&so...