HACKER Q&A
📣 anigbrowl

How would you spatialize higher dimensional data?


I wrangle a lot of social media datasets, which could be anything from an email database to an old 4chan archive. I'm primarily interested in the metadata network dynamics rather than the content (although I might mine it for hashtags or length/complexity, but I am not very interested in NLP). Sometimes it's just a few thousand items, sometimes orders of magnitude larger.

Typically when I represent this data visually I use force-directed layouts or tree/radial ones if it's definitely hierarchical. And I'm pretty good at the visualization side - filtering, community detection, backbone extraction, layer decomposition, motif identification, time windowing and so on. I enjoy starting out with a giant hairball and translating it into something intuitively comprehensible and explicable. I also experiment a lot with other dataviz paradigms - treemaps, hyperbolic geometry and so on.

But one thing that bothers me about many layout algorithms is that they're so arbitrary; once you go past a certain level of complexity you can get quite different outcomes from the same dataset and starting procedures. And it's difficult to represent agents and their activity on the same graph, eg influencers and their tweets. I am constantly trying to think of ways to map the raw data to Cartesian space or on the interior of a sphere, not unlike a planetarium. But I have some kind of cognitive or knowledge-based block to understanding this, which I can't seem to get around.

How would you spatialize a complex network - eg suppose you had unlimited access to the HN API and plugged it into a particle system or an FPS game engine?

I'm more interested in creating a vivid, explorable impression than scientific modeling. I just struggle with the concept of locality because (in this example) any HN user could reply to or vote/flag any other as there's no 'travel time' between threads or users. And without a coherent spatial metaphor, I can't think of good mappings to 2 or 3 dimensions of space and another of time.

Thanks for your help!


👤 sargstuff
Warning: some stories are not safe for work! Longform data journalism stories (aka visual statistics case studies): https://pudding.cool/

Small correlation of book covers (~5,000): https://pudding.cool/2019/07/book-covers/



👤 sargstuff
Pick your statistical software package & chart to generate code for! https://chartmaker.visualisingdata.com

👤 sargstuff
Gephi [1] is a software suite to visualize social media datasets.

Demo / feature pages [2][3] show different approaches to visualizing different social media datasets.

Uncertain about current status of software support/development.

-----------------------

[1] https://en.wikipedia.org/wiki/Gephi

[2] http://www.martingrandjean.ch/gephi-introduction/

[3] Gephi releases page (GitHub): https://github.com/gephi/gephi/releases


👤 gus_massa
Locality could mean making a lot of comments in the same threads. Note that this will place close together both people who agree a lot and people who disagree a lot.
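
To make that notion of locality concrete, here is a minimal sketch that turns (user, thread) pairs into edge weights, where the weight is the number of threads two users both commented in. The function name `co_comment_weights` and the toy data are hypothetical, and this is only one possible way to define closeness, not an established HN metric.

```python
from collections import defaultdict
from itertools import combinations


def co_comment_weights(comments):
    """Map (user, thread_id) pairs to {(user_a, user_b): shared thread count}."""
    users_by_thread = defaultdict(set)
    for user, thread in comments:
        # A set ignores repeat comments by the same user in one thread.
        users_by_thread[thread].add(user)

    weights = defaultdict(int)
    for users in users_by_thread.values():
        # Sorting gives each unordered pair one canonical key.
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy data: (user, thread_id)
comments = [
    ("alice", 1), ("bob", 1), ("alice", 1),
    ("alice", 2), ("bob", 2), ("carol", 2),
]
weights = co_comment_weights(comments)
```

The weights say nothing about agreement or disagreement, only about shared activity, which matches the caveat above.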

I'd try using the 2nd and 3rd eigenvectors of the Laplacian of the graph. I've heard they have a lot of good properties, but I've never used them, so I'm not sure how tricky it is to get a nice graph. Something like https://math.stackexchange.com/questions/3853424/what-does-t...
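
A rough sketch of the eigenvector idea, assuming an undirected graph, the unnormalized Laplacian L = D - A, and eigenvectors ordered by ascending eigenvalue (so the "2nd and 3rd" are the ones after the constant vector). The function name `spectral_layout` and the toy graph are made up for illustration; a real dataset would likely need sparse matrices and some care with disconnected components.

```python
import numpy as np


def spectral_layout(adjacency, dims=2):
    """Return one coordinate row per node from low-order Laplacian eigenvectors."""
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A  # L = D - A
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(laplacian)
    # Skip column 0 (constant vector for a connected graph); keep the next `dims`.
    return eigenvectors[:, 1:1 + dims]


# Hypothetical toy graph: two triangles joined by a single edge (2-3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

coords = spectral_layout(A)  # shape (6, 2): x, y per node
```

On this toy graph the first coordinate puts the two triangles on opposite sides of zero, which is the kind of locality a layout would hopefully preserve.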


👤 clusterhacks
This is a super-interesting question.

I don't think it is quite the answer to your question, but I have seen combined heatmaps/dendrograms that feel like a possibility for this. Maybe the heatmap deals with activity between nodes and the dendrogram handles the community cluster relationships? I see them often in bioinformatics.

Ex: https://en.wikipedia.org/wiki/Dendrogram#/media/File:Heatmap...


👤 ByersReason
https://arxiv.org/abs/1806.09823 - you can use this to find nearest neighbors to reduce the number of elements you are considering (or maybe not). Then you need: https://en.wikipedia.org/wiki/Dimensionality_reduction
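
As one example of the dimensionality reduction step only, here is a minimal PCA sketch via SVD that projects a feature matrix down to 3 coordinates. PCA is just one of the techniques on that Wikipedia page, picked here for brevity; the function name `pca_project` and the random feature matrix are hypothetical stand-ins for real per-user or per-item features.

```python
import numpy as np


def pca_project(X, dims=3):
    """Project rows of X onto their first `dims` principal components."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:dims].T


# Hypothetical data: 200 items with 10 numeric features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 10))
features[:, 0] *= 10  # give one feature much more spread than the rest

points = pca_project(features)  # shape (200, 3), usable as x, y, z
```

The resulting columns could be fed to a particle system or game engine as positions, with the caveat that a linear projection can flatten structure that a nonlinear method might keep.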

👤 sargstuff
Perhaps view the data at different locality scales in different windows?

See figure 2 (the subspace explorer) in https://www.cs.uic.edu/~wilkinson/Publications/visualpattern...


👤 anigbrowl
Many great new avenues of inquiry here - I thought the thread had died for lack of interest. Thanks to everyone!

👤 sargstuff
On the 'more animated' side of things : https://hypertools.readthedocs.io/en/latest/

**

different visual demonstrations : https://www.datavis.ca/gallery/delights.php

Edward Tufte has looked at various ways to present data : https://www.amazon.com/Edward-R-Tufte/e/B000APET3Y/ref=aufs_... / https://www.edwardtufte.com/tufte/

**

Make the quantum leap and use Feynman diagrams to construct the network ( https://www.ias.edu/ideas/2009/arkani-hamed-oconnell-feynman... )

**

So, you could take the idea of the scatter plot matrix in https://www.cs.uic.edu/~wilkinson/Publications/sorting.pdf

and use the various plots to literally build a picture, the way Excel cells are used as a "picture": https://www.youtube.com/watch?v=UBX2QQHlQ_I

and

https://jpg.space/mmmatto/exhibition/Mathcastles-%3E-Sandcas...

Hypercastle explorer for terraforms : https://www.youtube.com/watch?v=1jD6F_6_Yak

Although, that kind of starts to get into coding. ;)

https://direct.mit.edu/leon/article-abstract/48/4/375/45993/...

http://lightpattern.info/

https://esolangs.org/wiki/ObjectArt

https://www.dangermouse.net/esoteric/piet.html

**

Obligatory HN 'lisp' link : https://web.archive.org/web/20110504131632/http://sas.uwater...