HACKER Q&A
📣 throwawayapples

What is the Hacker News database structure? Does it use tables or NoSQL?


HN always seems pretty fast and is probably well designed from a data structure standpoint.

It seems that each comment and submission is assigned a monotonically increasing number. Do submissions and comments share the same numeric space?

How is the content hierarchy stored? Is it a single large table with (for example) a structure that looks like id, parentID, contenttype, content?

Is the data stored in an OLAP SQL database, on disk as flat files, in DBM files, etc.?

What kind of performance is required? Is the database distributed across more than one machine?


  👤 eesmith Accepted Answer ✓
The old version, written in Arc and mirrored at https://github.com/wting/hackernews/blob/5a3296417d23d1ecc90..., uses the file system as a database.

https://github.com/wting/hackernews/blob/5a3296417d23d1ecc90... shows the monotonically increasing number:

  (def new-item-id ()
    (evtil (++ maxid*) [~file-exists (+ storydir* _)]))
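For readers who don't know Arc, here is a rough Python re-expression of that function as I read it: keep incrementing the counter until no file named after the candidate id exists in the story directory. The `story_dir` and `max_id` parameters are stand-ins for the Arc globals `storydir*` and `maxid*`; passing them in rather than mutating a global is my choice for the sketch, not something the original does.

```python
import os
import tempfile


def new_item_id(story_dir: str, max_id: int) -> int:
    """Return the first id above max_id that has no file named after it.

    Assumption: each item is a file in story_dir whose name is its id,
    which is what the file-exists check in the Arc code suggests.
    """
    while True:
        max_id += 1
        if not os.path.exists(os.path.join(story_dir, str(max_id))):
            return max_id


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Pretend items 1 and 2 have already been saved.
        for name in ("1", "2"):
            open(os.path.join(d, name), "w").close()
        print(new_item_id(d, 0))  # prints 3
```

The file-existence check means an id is skipped if a file with that name is already on disk, even when the in-memory counter is behind.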

It's used for stories:

  (def create-story (url title text user ip)
    (newslog ip user 'create url (list title))
    (let s (inst 'item 'type 'story 'id (new-item-id) 
                       'url url 'title title 'text text 'by user 'ip ip)

and for comments:

  (def create-comment (parent text user ip)
    (newslog ip user 'comment (parent 'id))
    (let c (inst 'item 'type 'comment 'id (new-item-id)
                       'text text 'parent parent!id 'by user 'ip ip)

I have no idea how that code compares to the current implementation. For all I know, it's completely different.
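
To tie this back to the question, here is a minimal Python sketch of what the two excerpts imply for the old Arc version: stories and comments are the same kind of record, drawn from one id counter, and a comment points at its parent by id. The field names (`type`, `id`, `url`, `title`, `text`, `parent`, `by`, `ip`) come from the excerpts. Everything else is an assumption for illustration: the `ItemStore` class, the JSON file format, and the `load` and `root_of` helpers are hypothetical, and the excerpts are cut off before the point where an item would be saved.

```python
import json
import os
import tempfile


class ItemStore:
    """Hypothetical one-file-per-item store with a single shared id counter."""

    def __init__(self, story_dir):
        self.story_dir = story_dir
        self.max_id = 0

    def _path(self, item_id):
        return os.path.join(self.story_dir, str(item_id))

    def new_item_id(self):
        # Same idea as the Arc new-item-id: bump the counter until the name is free.
        while True:
            self.max_id += 1
            if not os.path.exists(self._path(self.max_id)):
                return self.max_id

    def _save(self, item):
        # Assumption: JSON on disk; the Arc code has its own on-disk format.
        with open(self._path(item["id"]), "w") as f:
            json.dump(item, f)
        return item

    def create_story(self, url, title, text, user, ip):
        return self._save({"type": "story", "id": self.new_item_id(),
                           "url": url, "title": title, "text": text,
                           "by": user, "ip": ip})

    def create_comment(self, parent, text, user, ip):
        return self._save({"type": "comment", "id": self.new_item_id(),
                           "text": text, "parent": parent["id"],
                           "by": user, "ip": ip})

    def load(self, item_id):
        with open(self._path(item_id)) as f:
            return json.load(f)

    def root_of(self, item):
        # Follow parent ids upward until reaching an item without a parent.
        while "parent" in item:
            item = self.load(item["parent"])
        return item


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = ItemStore(d)
        story = store.create_story("https://example.com", "Example", "", "alice", "127.0.0.1")
        reply = store.create_comment(story, "First reply", "bob", "127.0.0.1")
        nested = store.create_comment(reply, "Nested reply", "carol", "127.0.0.1")
        print(story["id"], reply["id"], nested["id"])  # prints 1 2 3
        print(store.root_of(nested)["title"])          # prints Example
```

In this sketch the hierarchy is just the `parent` field, which is close to the id, parentID, contenttype, content layout guessed at in the question, only stored as files rather than as rows in a table.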