Do you think this could be possible? I can’t think of a straightforward way; the data that is available doesn’t seem to allow for training on labeled machine code, assembly, or compilation specs.
Could compiled languages with labeled data be turned into assembly or machine code at scale and used for training? Or what about disassembling programs in other languages that come with labeled data?
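To make the "turn labeled code into a lower-level form at scale" idea concrete, here is a minimal sketch. It uses Python's standard `dis` module, so what it produces is CPython bytecode, not real machine code or assembly; it is only a stand-in for the compile-and-pair pipeline I'm asking about. The `SAMPLES` list, the `lower` and `build_pairs` helpers, and the record layout are all made up for illustration.

```python
import dis
import types

# Hypothetical labeled samples: (natural-language label, source code).
SAMPLES = [
    ("add two numbers", "def add(a, b):\n    return a + b\n"),
    ("largest of two numbers", "def largest(a, b):\n    return a if a > b else b\n"),
]


def lower(source):
    """Compile source text and return the opcode names of the function it defines."""
    namespace = {}
    exec(compile(source, "<sample>", "exec"), namespace)
    # Pick the first plain Python function defined by the sample.
    func = next(v for v in namespace.values() if isinstance(v, types.FunctionType))
    # Opcode names differ between CPython versions, so treat them as opaque tokens.
    return [ins.opname for ins in dis.get_instructions(func)]


def build_pairs(samples):
    """Attach the lowered form to each labeled sample (assumed record layout)."""
    return [
        {"label": label, "source": source, "ops": lower(source)}
        for label, source in samples
    ]


if __name__ == "__main__":
    for record in build_pairs(SAMPLES):
        print(record["label"], "->", " ".join(record["ops"]))
```

For actual machine code you would swap `lower` for a call to a real compiler or disassembler, which I haven't sketched here. The point is just that the label attached to the source carries over to the lowered form for free.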
Could an understanding of how machine code works be imparted to a model using just assembly? I know they are basically the same thing, but these models are pretty rigid.
Is there some current constraint that would prevent a model from understanding machine code functionally and human language at the same time?
The underlying curiosity is: can computers maybe one day “understand” “CPU + memory” on a physical, mathematical level?
Theory of mind, and whether language models can understand themselves at a higher level, is nearly impossible to even define; I think they can't at all yet. A CPU is rather simple to describe compared to an organic brain with all its protein folding, microtubules, etc., but no one has a definite idea of sentience that can be proven. We probably could, however, give a model the ability to literally “account for/understand” its own discrete mathematical capabilities.
Because that's gonna get you one. Don't open the damn box. For all our sakes.