Wed Jan 04 2012

Dark data, and how frustrating it is that we can't see the forest for the trees

It sucks that this is 2012 and there’s still no good way to organize and collect the things we do, share and like (online or otherwise). In the last hour I added things to Instapaper, Pinboard, Gimme Bar, and a notebook I carry. I took photos on my iPhone that I will one day move to a computer, then to an external hard drive, and then eventually forget about. I like collecting things because it gives me the sense that one day, someday, I’ll be able to look back and connect the dots. There’s meaning in the things that I’ve collected, but that meaning is lost without a tool to capture it.

There’s certainly meaning in the more than 70,000 songs I played on Last.fm, the things I tweeted, the images I saved on FFFFOUND and Gimme Bar, the books I’ve read, the bookmarks I added to Pinboard (and before that, Delicious). I know the meaning is in there, lurking beneath the surface of a seemingly infinite number of services; it’s a huge graph of the things I care about and the invisible connections between them. It is huge, I am certain, but I still can’t see it.

I’m calling this “dark data” because, much like dark matter, we know it exists even if we can’t grasp it. Dark matter is roughly 83% of the matter in the universe and we can’t see it. Dark data is everything we do and we still can’t see it.

I’ve been thinking about this problem of dark data for a while, sketching solutions and putting some into code (certainly not enough code). But I’m not there yet, and sadly, services like Gimme Bar, Bundlr and Evernote are not there yet either. There’s a ton of potential in making meaning out of all our photos, our tweets, the songs we’ve listened to and the articles we’ve shared. Building tools to see the forest for the trees - now there’s something worth doing.