HACKER Q&A
📣 bckr

Can subject-predicate-object triples suffice for an AGI knowledge base?


Pointers to prior work would be much appreciated. At some point someone is going to build an AI that knows things.

RAG does not appear to be enough.

Maybe it will be as conceptually simple as a knowledge base glued to an LLM.

First, a natural language question is converted into a knowledge-base query. Next, the knowledge-base response is generated. Then, the natural language response is generated, conditioned on the facts at hand.
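
A minimal sketch of that loop in Python, using SPARQL as the KB query language and stubbing out the two LLM calls (llm_to_sparql and llm_answer are hypothetical stand-ins for prompted model calls):

  # Pipeline sketch; requires: pip install rdflib
  from rdflib import Graph

  g = Graph()
  g.parse(data="""
  @prefix : <http://example.org/> .
  :BobB a :Cat ; :weightInKg 4.1 .
  """, format="turtle")

  def llm_to_sparql(question: str) -> str:
      # Step 1 (stubbed): an LLM prompted to translate NL -> SPARQL.
      return ("SELECT ?w WHERE { <http://example.org/BobB> "
              "<http://example.org/weightInKg> ?w }")

  def llm_answer(question: str, facts: list) -> str:
      # Step 3 (stubbed): an LLM conditioned on the retrieved facts.
      return f"Answering {question!r} from facts: {facts}"

  question = "How much does Bob B weigh?"
  facts = [row.w.toPython() for row in g.query(llm_to_sparql(question))]  # step 2
  print(llm_answer(question, facts))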

Why would this not work?


  👤 PaulHoule Accepted Answer ✓
Simple triples like

  :BobB a :Cat .
  :BobB rdfs:label "Bob B" .
  :BobB :weightInKg 4.1 .
aren't sufficient for situations where things change over time, people have different opinions, etc. You can write something like

   [
      :subject :BobB ;
      :predicate :weightInKg ;
      :object 4.1 ;
      :measuredBy :PaulHoule ;
      :measuredWhen "2024-07-16Z"^^xsd:date ;
      :comment "Estimate by visual observation because Bob B wouldn't let me pick him up"
   ] .
but people don't usually do that. I wrote a lot about this stuff in a technical report for ISO that is about to be published but, unfortunately, will be another ISO document that costs 133 Swiss francs. Probably the most advanced system for common-sense logic was Cyc

https://en.wikipedia.org/wiki/Cyc

which didn't change the world as people had hoped. It's usually not that hard to build a model for changes over time or other commonsense details for a limited domain (e.g. "Where was package 888310-313 at 3:42 PM last Thursday?"), but it's a very difficult problem to do generally.
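
For what it's worth, here is a rough sketch of that limited-domain case in Python with rdflib, using an invented vocabulary (:package, :atLocation, :observedAt) for time-stamped observations:

  # Time-stamped observations for a narrow domain.
  # Requires: pip install rdflib
  from rdflib import Graph

  g = Graph()
  g.parse(data="""
  @prefix : <http://example.org/> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  [] :package :pkg888310-313 ; :atLocation :DepotA ;
     :observedAt "2024-07-11T09:05:00Z"^^xsd:dateTime .
  [] :package :pkg888310-313 ; :atLocation :Truck42 ;
     :observedAt "2024-07-11T13:20:00Z"^^xsd:dateTime .
  """, format="turtle")

  # "Where was it at 3:42 PM?" = the latest observation at or before then.
  q = """
  PREFIX : <http://example.org/>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
  SELECT ?loc ?t WHERE {
    ?obs :package :pkg888310-313 ; :atLocation ?loc ; :observedAt ?t .
    FILTER (?t <= "2024-07-11T15:42:00Z"^^xsd:dateTime)
  }
  ORDER BY DESC(?t) LIMIT 1
  """
  for row in g.query(q):
      print(row.loc, row.t)   # -> :Truck42, the most recent sighting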


👤 verdverm
AGI requires much more than knowing facts: things like planning, and spending different amounts of time "thinking" depending on the query or task.

That being said, pairing LLMs with graph-based knowledge bases is an active area of research, and I suspect it will improve outcomes.


👤 compressedgas
That's how every single natural language QA system worked before LLMs.

For example, START https://start.csail.mit.edu/ works that way.

Now LLMs can do the natural-language-to-formal-language conversion in both directions. LLMs are good at translation, though not so good at exact recall. People are already doing this, though it goes by the name 'function calling'.
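
A sketch of what that looks like: the KB lookup is exposed to the model as a tool in the JSON-Schema style several chat APIs use (query_kb here is a hypothetical helper):

  # Exposing a KB lookup to an LLM as a "tool".
  kb_tool = {
      "name": "query_kb",
      "description": "Run a SPARQL SELECT query against the knowledge base.",
      "parameters": {
          "type": "object",
          "properties": {
              "sparql": {"type": "string", "description": "A SPARQL SELECT query"},
          },
          "required": ["sparql"],
      },
  }

  def query_kb(sparql: str) -> list:
      # Hypothetical: execute against your triple store, return bound rows.
      raise NotImplementedError

  # Given the schema, the model emits a call such as
  #   {"name": "query_kb", "arguments": {"sparql": "SELECT ..."}}
  # The application runs it and feeds the rows back, so the final answer
  # is grounded in exact KB results rather than the model's recall.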


👤 austin-cheney
So the W3C went through this about 25 years ago.

* RDF for triples

* OWL for modeling upon those triples

* FOAF for graphing and serializing relationships

The technology is already there, works well, and is generally very simple. The challenge, though, is actually using it.
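
For example, here is a minimal sketch of modeling on top of triples, using rdflib plus the owlrl reasoner (the :Cat/:Animal vocabulary is invented for illustration):

  # Declare :Cat a subclass of :Animal and let an OWL-RL reasoner infer
  # that :BobB is an :Animal.  Requires: pip install rdflib owlrl
  from rdflib import Graph
  from owlrl import DeductiveClosure, OWLRL_Semantics

  g = Graph()
  g.parse(data="""
  @prefix : <http://example.org/> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

  :Cat rdfs:subClassOf :Animal .
  :BobB a :Cat .
  """, format="turtle")

  DeductiveClosure(OWLRL_Semantics).expand(g)   # materialize inferences

  print(g.query("""
  PREFIX : <http://example.org/>
  ASK { :BobB a :Animal }
  """).askAnswer)   # -> True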

These concepts are higher order, so the developer using them has to be really smart. I mean actually smart, not pretend smart using some framework bullshit or a package manager to do the heavy lifting. So that alone eliminates some 80% of developers.

Also, there is no immediate gratification in higher-order data modeling. It takes a fair amount of effort to build something worthy of the technology, and then it's just some abstract tool used in some other related business/product. That eliminates another 10-15% of developers. That doesn't leave many people capable of doing the work, and those who remain have to be connected enough to find someone to fund their self-learning and building effort.


👤 mindcrime
Agree with @PaulHoule that basic RDF-triple-style KBs probably aren't sufficient in and of themselves. Buuuttttt... you can "cheat" a bit and use reification, so that triples can themselves become the subject of another triple. That lets you create predicates like "bayesianPrior" or "strengthOfBelief" or "beliefHeldSinceDate" that you can then wire into your reasoning system to enable Bayesian reasoning, modal logics, temporal logics, etc. Given all of that, can you represent everything you'd need for AGI? Weeeelll... a qualified "maybe" seems appropriate here.
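
Concretely, classic rdf:Statement reification looks something like this in rdflib (just a sketch; :strengthOfBelief is a made-up annotation property):

  # The weight triple becomes a resource we can annotate.
  # Requires: pip install rdflib
  from rdflib import BNode, Graph, Literal, Namespace
  from rdflib.namespace import RDF

  EX = Namespace("http://example.org/")
  g = Graph()

  stmt = BNode()                                     # the reified triple
  g.add((stmt, RDF.type, RDF.Statement))
  g.add((stmt, RDF.subject, EX.BobB))
  g.add((stmt, RDF.predicate, EX.weightInKg))
  g.add((stmt, RDF.object, Literal(4.1)))
  g.add((stmt, EX.strengthOfBelief, Literal(0.9)))   # annotate the triple

  print(g.serialize(format="turtle"))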

The thing is, we don't have AGI, so nobody can say with absolute certainty what we need to enable it. A complete answer to that question would be tantamount to developing AGI, which would make the whole discussion moot. But I think it suffices to say that there is "some there, there" in using semantic KBs in conjunction with LLMs to enable richer experiences with "more intelligent" behavior. Will it get to AGI? No idea. But it's something.

There's still a bit of "stuff" to question, though. For example, how will an AGI handle paraconsistent logic? That is, logic that deals with contradictions better than classical logic, where a contradiction in the premises allows literally anything to be "proven". Humans deal with (apparent) contradictions more or less seamlessly most of the time, albeit with varying degrees of cognitive dissonance when holding mutually contradictory beliefs as true. As long as we don't have to collapse our beliefs down to the rigour of classical logic, we can usually live with contradictory beliefs. Making a machine deal gracefully with that stuff is still an open area of research.

Anyway... if you are interested in pursuing this further, one specific project I'm familiar with (but not directly involved with) is PR-OWL or "Probabilistic OWL" which adds Bayesian probability to OWL.

https://www.pr-owl.org/

