Here on HN, every submitted link has a site associated with it, shown to the right of the title in parentheses. Usually it's just the domain name, but sometimes it includes part of the path: for a submission like https://github.com/rails/rails/pull/12345, the site would be github.com/rails.
How is this implemented? Is there a list somewhere of the sites to treat differently (like github.com or wordpress.com), and if so, is that list publicly available? If not, is there a similar list that someone maintains that I could use for a side project?
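To make the question concrete, here is a rough sketch of the behavior I mean. It is only my guess at the shape of the logic, not how HN actually does it: the PATH_SEGMENTS_BY_HOST table, its entries, and the site_for function are all made up for illustration. It also keeps the full hostname (minus a leading "www.") and does not try to reduce subdomains to a registrable domain.

```python
from urllib.parse import urlsplit

# Hypothetical table: for these hosts, keep this many leading path segments
# as part of the "site". The entries are guesses, not HN's real list.
PATH_SEGMENTS_BY_HOST = {
    "github.com": 1,
    "gitlab.com": 1,
    "twitter.com": 1,
}


def site_for(url: str) -> str:
    """Return a display 'site' for a URL: the host, plus path segments for listed hosts."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Hosts not in the table keep zero path segments, i.e. just the host.
    keep = PATH_SEGMENTS_BY_HOST.get(host, 0)
    segments = [s for s in parts.path.split("/") if s][:keep]
    return "/".join([host, *segments])


print(site_for("https://github.com/rails/rails/pull/12345"))
print(site_for("https://www.example.com/some/article"))
```

What I'm really after is the data for that table: which hosts need special handling, and how many path segments to keep for each.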