HACKER Q&A
📣 ibegtin

Tools/research to identify semantic types of data in databases?


The idea is that data is not just integers, strings, floats, and so on, but also carries a semantic type that identifies what it means. For example, a value could be a postcode, a country name, a business registry number, and so on.

I know of several open-source PII detection tools that can connect to SQL databases and detect personal data, and some commercial ETL/data prep tools can identify semantic types.

Is there any deep research on this topic and an advanced tool to work with semantic types?


  👤 MaknMoreGtnLess Accepted Answer ✓
> Is there any deep research on this topic and an advanced tool to work with semantic types?

Uhm, there's an approach called Domain-Driven Design that requires you to think about your types semantically.

IMHO semantics are always a higher-level abstraction built on top of primitive types (integers/strings/floats and so on).
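
To make that concrete, here's a rough sketch of what "semantic type on top of a primitive" can look like in code, in the spirit of a DDD value object. Everything here is made up for illustration: the `Postcode` and `CountryName` classes, and especially the postcode regex, which is a simplified UK-style pattern and not a real validator.

```python
import re
from dataclasses import dataclass

# Assumption: a simplified UK-style postcode pattern, for illustration only.
# Real postcode validation is more involved and country-specific.
_POSTCODE_RE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$")


@dataclass(frozen=True)
class Postcode:
    """Hypothetical semantic type: a plain str that is known to be a postcode."""
    value: str

    def __post_init__(self) -> None:
        normalized = self.value.strip().upper()
        if not _POSTCODE_RE.match(normalized):
            raise ValueError(f"not a postcode: {self.value!r}")
        # The dataclass is frozen, so bypass the frozen __setattr__ to normalize.
        object.__setattr__(self, "value", normalized)


@dataclass(frozen=True)
class CountryName:
    """Hypothetical semantic type: a plain str that is known to be a country name."""
    value: str

    def __post_init__(self) -> None:
        if not self.value.strip():
            raise ValueError("country name must not be empty")


def is_postcode(raw: str) -> bool:
    """Return True if the raw string can be wrapped as a Postcode."""
    try:
        Postcode(raw)
        return True
    except ValueError:
        return False
```

The primitive is still just a string underneath, but `Postcode("SW1A 1AA")` and `CountryName("SW1A 1AA")` are different things and never compare equal, which is the whole point of attaching semantics to the value.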

That said, unless there's already a well-defined ontology in place, semantic types are usually very ad hoc and can even change from one module to another within the same ecosystem.