Big Data Review in Emoji

I'm on the board of a great non-profit focused on technology and art. We do an event every year called Seven on Seven, where we pair seven technologists with seven artists.

Saturday was the fifth anniversary of the event, and this year one of the teams paired NYT writer and author Nick Bilton with artist Simon Denny.

Around 5pm on Friday, I got an email from Nick offering to pay me $5 for an emoji version of the White House's report on Big Data:

Email from Nick

The entire report is 85 pages, but they asked for a summary of page 55, a chart showing how federal dollars are being spent on privacy and data research:


Here's what I came up with (click for a larger version):

Big Data Emoji

I'm particularly proud of my emoji-fication of homomorphic encryption:

Homomorphic Encryption

I highly recommend watching the whole event, but Nick and Simon's presentation of the other reports they solicited begins at around the 3 hour and 25 minute mark of the live stream:

Nick, I know you said you'd pay cash, but I'd really prefer to accept the $5 in DOGE.

Please send 10,526.32 DOGE to DKQJsavxSdF381Mn3qZpyehsBzCX3QXzA2. Thanks!

Visualizing SOPA on Twitter

When I heard that Tyler Gray at Public Knowledge was looking for someone to do some analysis on tweets that mentioned SOPA, I thought I might try Cytoscape (an open source tool used for biomedical research, but handy for large scale data visualization) to show some of the relationships between people discussing the controversial bill on Twitter.

The result is a graph of the most active users referencing SOPA

Public Knowledge worked with the Brick Factory to set up their slurp140 tool to record approximately 1.5 million tweets, which Tyler sent me in the form of a 350 MB CSV file. I first used Google Refine to clean the set and narrow it down to only tweets which were replies to someone else. This left approximately 80,000 tweets, which I then imported into R. I ranked all of the usernames by how often they appeared as either senders or recipients, and then picked approximately the top 1,000 users. Since replies are sent from one user to another, the graph is directed: each edge has an origin and an arrow pointing at the recipient. There are 1,021 nodes, identified by their Twitter usernames, and 1,757 edges, a good portion of which are labeled with the content of their tweet.
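The ranking and filtering steps above can be sketched in a few lines of Python. This is only an illustration, not the Google Refine and R workflow actually used: the sample tweets, the usernames, and the function names `top_users` and `directed_edges` are all made up for the example.

```python
from collections import Counter

# Hypothetical reply tweets as (sender, recipient, text) tuples.
replies = [
    ("alice", "bob", "@bob SOPA is a mess"),
    ("bob", "alice", "@alice agreed"),
    ("carol", "bob", "@bob what about PIPA?"),
    ("dave", "erin", "@erin have you seen this?"),
]


def top_users(replies, n):
    """Rank usernames by how often they appear as sender or recipient."""
    counts = Counter()
    for sender, recipient, _text in replies:
        counts[sender] += 1
        counts[recipient] += 1
    return [user for user, _count in counts.most_common(n)]


def directed_edges(replies, keep):
    """Keep only replies whose sender and recipient both made the cut."""
    keep = set(keep)
    return [(s, r, t) for s, r, t in replies if s in keep and r in keep]


top = top_users(replies, 2)
edges = directed_edges(replies, top)
print(top)
print(edges)
```

Each surviving tuple is one directed edge: the first element is the origin, the second is the node the arrow points at, and the third is the tweet text used as the edge label.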

Visualizing networks this large is more of an art than a science

I've tried to strike a balance between visual complexity, aesthetics, and readability of tweets, but you'll find that this isn't always successful. Sometimes tweets run into nodes, sometimes edges run into labels, and sometimes the graph feels like a total mess. But that messiness is part of what made the SOPA debate on Twitter so interesting over the last month.

Thousands of people participating with plenty of cross talk.

The colors and sizes of the nodes and edges are coded in the following ways:

  • The size of a node and its label maps to the number of tweets both posted by a user and mentioning that user. (Ex: @BarackObama is a huge node because so many people were tweeting at him about SOPA.)
  • Node color represents the number of outgoing tweets. The greener the node, the more replies a user posted. (Ex: @Digiphile sent a lot of tweets mentioning SOPA.)
  • Edge thickness represents "edge betweenness," which is how many "shortest paths" run through that edge. This is a rough measure of how central a given tweet is in a network. (Ex: @declanm and @mmasnick have a thick line connecting them because many other nodes are connected to the two through that tweet.)
  • Edge color represents the language of the tweet. (Ex: Tweets in English are blue, Spanish are yellow.)
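To make "edge betweenness" a little more concrete, here is a minimal Python sketch that counts how many shortest paths run through each edge of a small directed graph. It follows the general breadth-first approach of Brandes' algorithm and is not Cytoscape's implementation; the function name `edge_betweenness` and the usernames are invented for the example, and it assumes each edge appears only once.

```python
from collections import defaultdict, deque


def edge_betweenness(edges):
    """Count the shortest paths running through each directed edge.

    When several equally short paths connect a pair of nodes, each
    path contributes a fraction instead of a whole count.
    """
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    score = {edge: 0.0 for edge in edges}
    for source in nodes:
        # Breadth-first search: distances and shortest-path counts.
        dist = {source: 0}
        paths = defaultdict(float)
        paths[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each edge used.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + delta[w])
                score[(v, w)] += share
                delta[v] += share
    return score


# Hypothetical chain of replies: ann -> ben -> cat -> dan.
scores = edge_betweenness([("ann", "ben"), ("ben", "cat"), ("cat", "dan")])
for edge, value in sorted(scores.items()):
    print(edge, value)
```

In this toy chain the middle edge scores highest, because every path from the first half of the chain to the second half has to cross it. That is the sense in which a thick line marks a tweet that many other nodes are connected through.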

The nodes are positioned using a "force directed" algorithm, which is typically designed for undirected graphs, but I found it to be the most visually compelling of Cytoscape's layout options. To learn more about force directed graphs, take a look at this d3 tutorial visualizing the characters in Victor Hugo's Les Misérables.

To really browse the graph visit GigaPan where I've uploaded a 32,000 x 32,000 pixel version.

I highly recommend GigaPan's full screen mode. I've also created a couple of snapshots on GigaPan that highlight interesting nodes: @BarackObama, @GoDaddy, @LamarSmithTX21, and @DarellIssa.

If you really want, you can also download the 36mb gigapixel file, the Cytoscape source file, and the PDF vector version of the network graph.

Thanks again to Public Knowledge, to The Brick Factory for providing the infrastructure to record the tweets, and to everyone who has helped fight against SOPA and PIPA over the last couple of months, especially those who tweeted about it.

There's No Such Thing as a Compulsory License for a Photo

My friend Andy has a terrific post up about his ordeal settling with the photographer Jay Maisel over the threat of a copyright lawsuit. Chances are, if you're reading this, you know about that. If you haven't read Andy's story, go and read it and then come back.

There's one pointed question I've seen crop up in a number of conversations about the settlement:

Isn't it wrong that Andy chose to pay the licensing fees for the music but not for the photograph?

This question makes the assumption that Andy could have paid the licensing fees to Maisel like he did for the music. He couldn't have. This is because Jay Maisel refused to license the image and there's no compulsory license for photography like there is for musical compositions.

A compulsory license is what it sounds like: the owner of the underlying musical composition is required, by law, to license it to anyone who wants to use it at a predetermined rate. This prohibits song writers from picking and choosing who gets to perform their works. It also allows Andy to license, at a fair rate, the underlying song compositions from a Miles Davis album to make a new album of original recordings (remember, copyrights to recordings are different from copyrights to the compositions of a song).

The copyright of photographic works, unlike works of music composition, is not subject to a compulsory license.

This means that photographers, unlike song writers, can forbid anyone from reusing their work, whether it is for a poster or for an album cover.

Now, artists like Jay Maisel obviously enjoy this absolute control over their work because it lets them dictate who uses their art and when. Songwriters, unfortunately, aren't afforded this control over their published works.

So while no one could have prevented Andy from recording an album of remixed music written by Miles Davis -- not even Miles Davis himself if he were alive -- the same does not hold for a photo of Miles Davis.

Remember, Maisel admitted he would have refused to license to Andy the rights to the photo. So Andy's only option, short of not using the photo at all, was to use the 8-bit remix cover and wager it was a fair use.

That Andy could, in one case, hire artists to legally remix music by paying a compulsory license, but in another case had to make an expensive and risky bet on fair use (a bet that didn't pan out) feels unfair.

Put another way: why are composers required to license their compositions at a fair rate to anyone, yet virtually every other type of artist doesn't have to play by the same rule?

I doubt anyone would argue that song composition is a lesser art or any less deserving of full royalties than other arts.

One reason is that the practicalities of compulsory rights for photographs (and other works) are hard to imagine. Music compositions are written, then performed, then recorded, whereas photographs are snapped and then printed. There's no underlying right in a photograph (thank goodness) to its "composition" like there is for a piece of music. So that is part of why compulsory licenses for photos don't exist.

But I think another part of the story is that the law has evolved the musical compulsory license as an implicit acknowledgement that music compositions are both malleable and fundamental components of our culture. Compulsory licenses make possible everything from karaoke bars to cover bands to remixes like Andy's. The alternative -- allocating complete power to composers over who reuses their work -- imposes transaction costs on culture that are simply too high. The law hasn't felt the same way about visual works.

So will other art forms, like photography, adopt compulsory licenses? I doubt it, but I can't help but think they'd be a great compromise in light of Andy's settlement. I asked Andy over email whether he would have paid a mechanical license for the photo:

"Absolutely. If the laws and protocols for remixing photos were as clear and fair as covering music, I would've bought a mechanical license for the photo in a heartbeat. But the laws around visual art are frustratingly vague, and requiring someone's permission to create art that doesn't affect the market for the original doesn't seem right. I didn't think it would be a problem, especially considering the scope of my project, but I was wrong. Nobody should need a law degree to understand whether art is legal or not."

Markov Chaining Kickstarter Blurbs

So I'm back from a wonderful couple of days of hanging out with the Kickstarter team near the beach. I managed to rent a surfboard and catch a couple of waves too. During a margarita-induced what-if-session, someone encouraged me to try and auto-generate some blurbs from the Kickstarter homepage. These are typically 150-200 character descriptions of projects that our community team labors over and refreshes daily.

I had worked with Markov Chains for Dan Shiffman's class "Programming A to Z" at ITP and had done two projects using them: ROBODRUDGE (autogenerated Matt Drudge headlines) and The Rutabaga (an April Fools' joke that Google was attempting to compete with The Onion by using auto-generated news headlines). So they seemed like an obvious place to start. I found Eric Hodel's Markov Chain Ruby code readily usable and went from there.

To give some background on Markov Chains: the basic principle is to use probability to auto-generate new sequences based on old patterns. Sometimes these sequences are numerical, sometimes they're musical, and sometimes they're characters and words.
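For readers who want to see that principle in code, here is a minimal word-level sketch in Python. It is not Eric Hodel's Ruby code; the names `build_chain` and `generate` and the tiny sample corpus are made up for illustration. The idea is to record which word follows each short run of words in the source text, then walk those records at random.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain


def generate(chain, length=30, seed=None):
    """Start from a random key and repeatedly pick a random follower."""
    rng = random.Random(seed)
    key = rng.choice(list(chain))
    output = list(key)
    while len(output) < length:
        followers = chain.get(key)
        if not followers:
            break  # Dead end: this run of words only ended the corpus.
        output.append(rng.choice(followers))
        key = tuple(output[-len(key):])
    return " ".join(output)


# Hypothetical corpus; a real run would use a much larger body of text.
corpus = "the quick brown fox jumps over the quick red fox and the lazy dog"
chain = build_chain(corpus, order=1)
print(generate(chain, length=12, seed=7))
```

Because followers are stored with repeats, a word that often follows a given run is proportionally more likely to be picked, which is where the probability comes in. A larger `order` stays closer to the source text, while a smaller one produces stranger combinations.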

An example of exquisite corpse from Wikipedia.

Another way to think of Markov chains is as a computer's attempt to play Exquisite Corpse: it is fed a certain amount of existing information and then it attempts to extrapolate a similar pattern. A classic example is to feed a Markov Chain Engine Shakespeare; not only is it readily available in raw text and in the public domain, but Markov Chain generated Shakespeare looks strikingly similar to the real thing (thanks to Jim's Random Notes for the work):

If they in thou, thy love, that old,
Thought, that which yet do the heat did excess
My love concord never saw my woeful taste
At ransoms you for that so unprovident:
For thereby beauty show’st I am fled,
Althoughts to the dead, that care
With should thus shall by their fair,
Where too much my jade,
Her loves, my heart.vs.

I pulled down a document with a majority of the text that Kickstarter has used to blurb projects on the homepage. Below is a subset of an almost infinite list of hilarious and sometimes disturbing auto-generated project blurbs:

  • It features monstrous puppets, mystic sex rituals, yellowface assassins, wildly stylized violence, and a songwriter who created that all-important childhood fave Schoolhouse Rock.
  • To promote NAIN, an all-new miniature setting for his REIGN roleplaying game, Greg Stolze has put together a group of thespians shouting thank you at their laptops to a great $3 price.
  • Karl Cronin wants to document and collect relics of the concoctions of Atlanta's Good Food Truck.
  • Emilia Brock types out every word of her adorable and inventive zine, “Muster,” on a manual typewriter, has each copy illustrated by the Simpsons as a mobile CSA.
  • First Law of Mad Science channels cyber-punk and sci-fi to bring a Yakuza noir production to the emerging subculture of asexuality.
  • Rewards include CDs mixed by a visit from a song left on your voicemail to a unique fabric whose color + pattern is determined by keywords pulled from Twitter's database in real time.
  • Joe Mangrum has been designing simple-but-compelling computer games based on Eugene O'Neill's play.
  • The project will help them print their fourth issue, and editor Mindy (a drummer herself!) is offering backers personalized designs and original music and prove that the rail system is a five-minute short film about a roach violinist who falls in love with the bravado you'll find in Memphis Heat, a documentary on the real-life appearance of a giant, hand-crafted, Rube Goldberg contraption.
  • After 13+ years of genetic testing and solitary confinement, Oliver’s getting a new horror-adventure comic about the movements of immigrants across vast bodies of water.
  • With just a camera in hand and boundless curiosity, Rebekah Potter interviews artists for her series 10min4walls, in which a twenty-something guy returns to Argentina to rekindle past excitement and romance, but instead is confronted with a musical twist, features a one-of-a-kind mood swing.
  • Fans have flocked to support him, and it's not hard to see it for yourself: it’s available only to backers of the fastest competitive lockpickers in the form of a pig's dissembling and its indie rock soundtrack.
  • Director Jonathan Langer will reward backers with the woman whose apartment he inhabits.
  • Common Cycle is a feature-length thriller about strained family relationships, small-town antics, and second chances.
  • I'm Going Home will be filled with a stowaway lab mouse as his only companion.
  • She'll combine intimate interviews, vérité footage, and animation in a city decimated by the public for performances, gallery space, meetings, bike repairs, relaxing — or pretty much anything.
  • This August, thousands of individual leaf, flower, and bird forms from reclaimed wood and connecting them into an animated feature film starring comedic legend Leslie Nielsen.
  • Trailer Park: A Mobile Public Park is a witty comedic web series.
  • Missed Connections is a film about three friends on a loom into a community biology lab so that no two experiences are alike.
  • Check out his video and you can get a sense of humor and would like to be normal, and Nick's lusting for the Queen of England and wrote a jazz pianist whose work was performed by Miles Davis, John Coltrane, and many others; was dead.
  • First Law of Mad Science channels cyber-punk and sci-fi to bring you a hand-drawn postcard from the original.
  • The video teaser feels wonderfully Tim Burton-esque, and the search for the over 45,000 participants who attend Burning Man every year.
  • This latest project will launch Gerlan's Spring 2011 runway show, and she's offering backers everything from print copies to mix tapes to personal drum lessons.
  • Brooklyn-based independent art collective Ugly Duckling Presse is publishing the first twist ending that we've come across in a unique fabric whose color + pattern is determined by keywords pulled from Twitter's database in real time.
  • Check out their touching video and the Red Hot Chili Peppers.
  • Chicago's Chinese Fine Arts Society has been making stunning sand paintings in public spaces for years, totally 160 in the journey with awesome drawings.
  • Following her lauded Gypsy Killers album, Sanda Weigl has a web series as The Office for actors or Entourage for poor people.
  • As the World Cup opens in South Africa, Stan Engelbrecht and Nic Grobler's project to document the life histories of 200 plants and animals through expressive movement, which he'll share in this hilarious pitch video for great world beats!
  • Fed up with the absurdist aesthetic of Dutch animator Emiel Stevenhagen.
  • The result is hypnotizing, and the Land of the Misfit Toys.
  • Check out their touching video and creative rewards includes communist-issue Mongolian Bomber Goggles!
  • Coyote Pursues is a charming character-driven video game for kids.
  • Coyote Pursues is a fan of the remaining villagers.
  • After the success of the most disgusting way of making coffee we've ever seen, plus fun quotes like Let's go slightly less on the road, and you can still get the book for just $10, plus artwork, CDs, a photo book honoring space exploration.
  • Dario Ciriello's homegrown publishing outfit is guided by a subway car or Wal-Mart aisle is a massive interactive Burning Man installation comprised of an old Dodge pickup with fresh, seasonal veggies, Truck Farm was born.
  • Cobbled together from actual footage and the Outs have recorded a sweet, limited-edition EP on cassette (you heard me right) to help everyone send their phones to space and is going to help everyone send their phones to space and designing an app that will take place in a beautiful color poster, the DVD, and tickets to see any show, or a private event thrown in your town AND cook you dinner- what's not to love?
  • Expect a fabulous soup of literary aficionados chatting intelligently about a roach violinist who falls in love with the bravado you'll find in Memphis Heat, a documentary on the real-life appearance of a giant, hand-crafted, Rube Goldberg contraption.
  • After 13+ years of genetic testing and solitary confinement, Oliver’s getting a new recording in the film's virtual town.
  • With Kickstarter, the beloved web-comic slash zine will be reissued in an American town, Congress, in the last 30 years, the PCR machine has been a fascination and obsession for 400 years, made clear by the economic crisis.
  • Help her bring her spring collection to NY's Fashion Week Green Shows and you might end up pledging $10 to see his son do one of a series of mathematical stories that will open and close by electrical current.
  • The goal of CicLAvia is to put all the way through the eyes of the great '60s cult classics.
  • Fart Party is the subject of Angela Kline's boisterous documentary, A Love Letter to Tom Waits.
  • Akimenko Meats is a powerful combination of portraiture, live audio, and writing, creators Kitra and Chris aim to offer an intimate glimpse of a secretive guy in search of his Jens Pulver Kickstarter project, Gregory Bayne has set his sights on his first film, Person of Interest.
  • The work is beautifully previewed in a fantastical world where every shared glance across a subway doo-wop group can be a kite-flying extravaganza!
  • By showcasing this innovative and highly accessible approach to cinema, filmmaker Benjamin Reece hopes to perform A Chinese Love Story for a new work of comics journalism exposing the human cost of trafficking.
  • Pick up a signed copy of the fastest competitive lockpickers in the city!
  • Fishtank Performance Studio in Kansas City works hard to see his fantastically illustrated children's story become a real-life 13 ft. sculpture and installation at the end, the super bouncy balls will all go flying when she throws herself off a roof.
  • Operating Theater's play Transatlantica revolves around a psychoanalyst who encounters a series of bicycle-powered food tours.
  • Backers can witness the event in real-time; $20 gets you a package of exotic recipes, hard-to-find ingredients, and info cards on your voicemail to a Brooklyn rooftop.
  • It's an inventive project from SFHny Studio, a group of thespians shouting thank you at their laptops to a unique photo book and traveling gallery show, dubbing the project that’s so infectious.
  • Fed up with a variety of sculpted paper viruses.
  • Get rewards like an unreleased font and a songwriter who created that all-important childhood fave Schoolhouse Rock.

Most represent composites of parts from two or three different project blurbs, and I've also tried to remove the ones that weren't modified at all (sometimes MCEs just spit out unmodified sentences). I think part of the reason these work so well is because the original blurbs end up conforming to a particular style of quippy, short descriptions structured around rewards and project topics.
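Weeding out the unmodified sentences could also be automated. The Python sketch below is one hypothetical way to do it, not a description of how the list above was actually cleaned up; the function name `drop_unmodified` and the sample blurbs are invented for the example.

```python
def normalize(blurb):
    """Collapse whitespace and case so trivial differences don't matter."""
    return " ".join(blurb.split()).lower()


def drop_unmodified(generated, originals):
    """Remove generated blurbs that are identical to an original blurb."""
    seen = {normalize(original) for original in originals}
    return [blurb for blurb in generated if normalize(blurb) not in seen]


# Hypothetical data for illustration only.
originals = [
    "A documentary about competitive lockpickers.",
    "A web series about a food truck.",
]
generated = [
    "A documentary about a food truck.",
    "A web series about  a food truck.",
]
fresh = drop_unmodified(generated, originals)
print(fresh)
```

Only the first generated blurb survives here, since the second is an original that differs by nothing more than an extra space.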

Cause Caller Thesis (PDF & LaTeX Source)

Apparently my alma mater (and current part-time employer) ITP is suggesting current students look at my thesis because of its formatting.

I used LaTeX to format it and used the ACM template here. I'm proud to finally post it almost 2 years later. It also means that ITP students don't have to go through the demented process of trying to recreate / reformat a template.

The significant departures from ACM's template are the copyright notice and the image inclusions.

Good luck!

Cause Caller Thesis - PDF, 2.5mb, CC BY-SA.
Cause Caller Thesis - LaTeX Source Code, .tex, 74kb


Cause Caller Thesis

Here's a video of my presentation:

Why do all Na'vi in Avatar have braids? Because code is law.

You could say that I'm partial to Lessig's maxim that "code is law."

I also think it goes a long way toward explaining some decisions James Cameron made while making Avatar. More specifically, the code and technology responsible for the majority of the movie's (we can't very well go on calling them films much longer, can we?) visual experience actively constrained the choices of the production team and thereby the choices of the Avatar characters themselves. Neytiri couldn't have had voluminous hair even if she wanted to, because James Cameron's hardware and software weren't good enough.

If you haven't followed computer graphics closely you might not know that certain textures and materials, like hair, are incredibly difficult to get right. Though there has been quite a lot of progress in the realm of still CG, capturing the motion and flow of humanoid hair is still very difficult if not virtually impossible. Cameron's Avatar didn't significantly advance the state of the art, but he was able to creatively sidestep the issue by giving his characters thick braids and dreadlocks which he could motion capture.

This alleviated the chore of trying to artificially generate the realistic movement of millions of individual hairs: if all the Na'vi had braids or dreadlocks, then all of that movement could be motion captured from actors in reality.

Much has been made of Cameron's innovations in accurately motion capturing individual facial movements, and it is my strong feeling that the team also took this approach for the hair of their characters. As Wired pointed out in their features on the movie, this is an evolution in the modern director's relationship to computer graphics: instead of trying to *simulate* real world phenomena using procedural software, directors opt to direct a close enough analog in the physical world whose motion can be captured at a very high resolution using camera-like devices.

Don't believe me? Check out these screen grabs from the Avatar making-of video floating around:

Look closely at Zoe's head and it doesn't require a lot of imagination to believe that her dreadlocks have individual motion capture devices embedded in them. It's also probably true that motion capture systems of this type cannot be scaled small enough for individual hairs. This might change in the future, but for now it is a real technological constraint in the world of Pandora. There are a couple of other examples of technology constraining creative choice: why don't any animals in the Pandora jungle have fur? Might it be because Cameron couldn't get CG fur to look right?

So Cameron's technological constraints and innovation drove choices that would have otherwise been purely creative. Code became law on Pandora. Sometimes the origins of code's constraints are artificial (such as copyright law) but sometimes they're just practical constraints like software and CPU horsepower, and I think that's what happened here.

Let me know if you agree or have any evidence to the contrary.

Eustace Emoji

Eustace Emoji Cropped

The New Yorker is holding its annual contest for re-interpretations of the famous Eustace Tilley cover and I thought this would be a good one.

Unfortunately, since I live with Thessaly (unfortunate that she works there, not that I live with her ;) I'm most likely disqualified from participating in a Condé Nast competition.

I used Photoshop to batch process the emoji (zip download) so that they have white backgrounds and used Foto-Mosaik-Edda for the mosaic.

Check it out full size here.

Emoji Dick

I just launched a project on Kickstarter (an awesome NYC based startup that helps people fund their ideas) to translate Moby Dick into Emoji using Amazon Mechanical Turk. I'm calling it Emoji Dick:

This project will fund the production, via crowd sourcing, of a never-before-released translation of Herman Melville's classic Moby Dick in Japanese emoji icons.

Here's an example of an Emoji sentence from Moby Dick:

Each of Moby Dick's 6,438 sentences will be translated 3 times by different Amazon Mechanical Turk workers. Those results will then be voted on by another set of workers, and the most popular version of each sentence will be selected for inclusion in the book.
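The arithmetic and the voting step can be sketched in Python. This is only an illustration of the idea, not the code used for the project; the function name `pick_winner`, the placeholder translation labels, and the tie-breaking rule (earliest candidate wins) are assumptions made for the example.

```python
from collections import Counter

SENTENCES = 6438
TRANSLATIONS_PER_SENTENCE = 3

# Total translation tasks to hand out before any voting happens.
translation_tasks = SENTENCES * TRANSLATIONS_PER_SENTENCE


def pick_winner(candidates, votes):
    """Return the candidate with the most votes.

    Ties go to whichever candidate appears first (an assumption).
    """
    tally = Counter(votes)
    return max(candidates, key=lambda candidate: tally[candidate])


# Hypothetical placeholders standing in for three emoji translations.
candidates = ["translation_a", "translation_b", "translation_c"]
votes = ["translation_b", "translation_a", "translation_b", "translation_c"]

print(translation_tasks)
print(pick_winner(candidates, votes))
```

Three translations for each of the 6,438 sentences comes to 19,314 translation tasks, plus however many voting tasks are needed to choose among them.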

I'm trying to reach $3,500, and you can give at the $5, $10, $20, $40, and $200 levels and get different awesome rewards, like your name included in the book, a CC BY-SA licensed PDF, the raw data, and either a softcover black and white copy or a limited edition color version.

If you want to support the project, just visit the page here. Thanks!