Archive for the ‘Technology’ Category

Markov Chaining Kickstarter Blurbs

Monday, August 30th, 2010

So I’m back from a wonderful couple of days of hanging out with the Kickstarter team near the beach. I managed to rent a surfboard and catch a couple of waves too. During a margarita-induced what-if-session, someone encouraged me to try and auto-generate some blurbs from the Kickstarter homepage. These are typically 150-200 character descriptions of projects that our community team labors over and refreshes daily.

Since I had worked with Markov Chains in Dan Shiffman’s class “Programming A to Z” at ITP and had done two projects using them, ROBODRUDGE (autogenerated Matt Drudge headlines) and The Rutabaga (an April Fools joke that Google was attempting to compete with The Onion by using auto-generated news headlines), they seemed like an obvious place to start. I found Eric Hodel’s Markov Chain Ruby code readily usable and went from there.

To give some background on Markov Chains: the basic principle is to use probability to auto-generate new sequences based on old patterns. Sometimes these sequences are numerical, sometimes they’re musical, and sometimes they’re characters and words.
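For the word case, the idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Ruby code used for the blurbs; the function names `build_chain` and `generate` and the `order` parameter are made up for this example.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        # Repeats are kept on purpose: a follower seen twice is twice as likely to be picked.
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=30, seed=None):
    """Start from a random state and repeatedly pick one of its recorded followers."""
    rng = random.Random(seed)
    state = rng.choice(list(chain.keys()))
    output = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:  # dead end: this state only occurred at the very end of the text
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)
```

A higher `order` sticks closer to the source text, while a lower one produces stranger combinations; which setting reads best is a matter of taste and corpus size.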

An example of exquisite corpse from Wikipedia.

Another way to think of Markov chains is as a computer’s attempt to play Exquisite Corpse: it is fed a certain amount of existing information and then it attempts to extrapolate a similar pattern. A classic example is to feed a Markov Chain Engine Shakespeare; not only is it readily available in raw text and in the public domain, but Markov Chain generated Shakespeare looks strikingly similar to the real thing (thanks to Jim’s Random Notes for the work):

If they in thou, thy love, that old,
Thought, that which yet do the heat did excess
My love concord never saw my woeful taste
At ransoms you for that so unprovident:
For thereby beauty show’st I am fled,
Althoughts to the dead, that care
With should thus shall by their fair,
Where too much my jade,
Her loves, my heart.

I pulled down a document with a majority of the text that Kickstarter has used to blurb projects on the homepage. Below is a subset of an almost infinite list of hilarious and sometimes disturbing auto-generated project blurbs:

  • It features monstrous puppets, mystic sex rituals, yellowface assassins, wildly stylized violence, and a songwriter who created that all-important childhood fave Schoolhouse Rock.
  • To promote NAIN, an all-new miniature setting for his REIGN roleplaying game, Greg Stolze has put together a group of thespians shouting thank you at their laptops to a great $3 price.
  • Karl Cronin wants to document and collect relics of the concoctions of Atlanta’s Good Food Truck.
  • Emilia Brock types out every word of her adorable and inventive zine, “Muster,” on a manual typewriter, has each copy illustrated by the Simpsons as a mobile CSA.
  • First Law of Mad Science channels cyber-punk and sci-fi to bring a Yakuza noir production to the emerging subculture of asexuality.
  • Rewards include CDs mixed by a visit from a song left on your voicemail to a unique fabric whose color + pattern is determined by keywords pulled from Twitter’s database in real time.
  • Joe Mangrum has been designing simple-but-compelling computer games based on Eugene O’Neill’s play.
  • The project will help them print their fourth issue, and editor Mindy (a drummer herself!) is offering backers personalized designs and original music and prove that the rail system is a five-minute short film about a roach violinist who falls in love with the bravado you’ll find in Memphis Heat, a documentary on the real-life appearance of a giant, hand-crafted, Rube Goldberg contraption.
  • After 13+ years of genetic testing and solitary confinement, Oliver’s getting a new horror-adventure comic about the movements of immigrants across vast bodies of water.
  • With just a camera in hand and boundless curiosity, Rebekah Potter interviews artists for her series 10min4walls, in which a twenty-something guy returns to Argentina to rekindle past excitement and romance, but instead is confronted with a musical twist, features a one-of-a-kind mood swing.
  • Fans have flocked to support him, and it’s not hard to see it for yourself: it’s available only to backers of the fastest competitive lockpickers in the form of a pig’s dissembling and its indie rock soundtrack.
  • Director Jonathan Langer will reward backers with the woman whose apartment he inhabits.
  • Common Cycle is a feature-length thriller about strained family relationships, small-town antics, and second chances.
  • I’m Going Home will be filled with a stowaway lab mouse as his only companion.
  • She’ll combine intimate interviews, vérité footage, and animation in a city decimated by the public for performances, gallery space, meetings, bike repairs, relaxing — or pretty much anything.
  • This August, thousands of individual leaf, flower, and bird forms from reclaimed wood and connecting them into an animated feature film starring comedic legend Leslie Nielsen.
  • Trailer Park: A Mobile Public Park is a witty comedic web series.
  • Missed Connections is a film about three friends on a loom into a community biology lab so that no two experiences are alike.
  • Check out his video and you can get a sense of humor and would like to be normal, and Nick’s lusting for the Queen of England and wrote a jazz pianist whose work was performed by Miles Davis, John Coltrane, and many others; was dead.
  • First Law of Mad Science channels cyber-punk and sci-fi to bring you a hand-drawn postcard from the original.
  • The video teaser feels wonderfully Tim Burton-esque, and the search for the over 45,000 participants who attend Burning Man every year.
  • This latest project will launch Gerlan’s Spring 2011 runway show, and she’s offering backers everything from print copies to mix tapes to personal drum lessons.
  • Brooklyn-based independent art collective Ugly Duckling Presse is publishing the first twist ending that we’ve come across in a unique fabric whose color + pattern is determined by keywords pulled from Twitter’s database in real time.
  • Check out their touching video and the Red Hot Chili Peppers.
  • Chicago’s Chinese Fine Arts Society has been making stunning sand paintings in public spaces for years, totally 160 in the journey with awesome drawings.
  • Following her lauded Gypsy Killers album, Sanda Weigl has a web series as The Office for actors or Entourage for poor people.
  • As the World Cup opens in South Africa, Stan Engelbrecht and Nic Grobler’s project to document the life histories of 200 plants and animals through expressive movement, which he’ll share in this hilarious pitch video for great world beats!
  • Fed up with the absurdist aesthetic of Dutch animator Emiel Stevenhagen.
  • The result is hypnotizing, and the Land of the Misfit Toys.
  • Check out their touching video and creative rewards includes communist-issue Mongolian Bomber Goggles!
  • Coyote Pursues is a charming character-driven video game for kids.
  • Coyote Pursues is a fan of the remaining villagers.
  • After the success of the most disgusting way of making coffee we’ve ever seen, plus fun quotes like Let’s go slightly less on the road, and you can still get the book for just $10, plus artwork, CDs, a photo book honoring space exploration.
  • Dario Ciriello’s homegrown publishing outfit is guided by a subway car or Wal-Mart aisle is a massive interactive Burning Man installation comprised of an old Dodge pickup with fresh, seasonal veggies, Truck Farm was born.
  • Cobbled together from actual footage and the Outs have recorded a sweet, limited-edition EP on cassette (you heard me right) to help everyone send their phones to space and is going to help everyone send their phones to space and designing an app that will take place in a beautiful color poster, the DVD, and tickets to see any show, or a private event thrown in your town AND cook you dinner- what’s not to love?
  • Expect a fabulous soup of literary aficionados chatting intelligently about a roach violinist who falls in love with the bravado you’ll find in Memphis Heat, a documentary on the real-life appearance of a giant, hand-crafted, Rube Goldberg contraption.
  • After 13+ years of genetic testing and solitary confinement, Oliver’s getting a new recording in the film’s virtual town.
  • With Kickstarter, the beloved web-comic slash zine will be reissued in an American town, Congress, in the last 30 years, the PCR machine has been a fascination and obsession for 400 years, made clear by the economic crisis.
  • Help her bring her spring collection to NY’s Fashion Week Green Shows and you might end up pledging $10 to see his son do one of a series of mathematical stories that will open and close by electrical current.
  • The goal of CicLAvia is to put all the way through the eyes of the great ’60s cult classics.
  • Fart Party is the subject of Angela Kline’s boisterous documentary, A Love Letter to Tom Waits.
  • Akimenko Meats is a powerful combination of portraiture, live audio, and writing, creators Kitra and Chris aim to offer an intimate glimpse of a secretive guy in search of his Jens Pulver Kickstarter project, Gregory Bayne has set his sights on his first film, Person of Interest.
  • The work is beautifully previewed in a fantastical world where every shared glance across a subway doo-wop group can be a kite-flying extravaganza!
  • By showcasing this innovative and highly accessible approach to cinema, filmmaker Benjamin Reece hopes to perform A Chinese Love Story for a new work of comics journalism exposing the human cost of trafficking.
  • Pick up a signed copy of the fastest competitive lockpickers in the city!
  • Fishtank Performance Studio in Kansas City works hard to see his fantastically illustrated children’s story become a real-life 13 ft. sculpture and installation at the end, the super bouncy balls will all go flying when she throws herself off a roof.
  • Operating Theater’s play Transatlantica revolves around a psychoanalyst who encounters a series of bicycle-powered food tours.
  • Backers can witness the event in real-time; $20 gets you a package of exotic recipes, hard-to-find ingredients, and info cards on your voicemail to a Brooklyn rooftop.
  • It’s an inventive project from SFHny Studio, a group of thespians shouting thank you at their laptops to a unique photo book and traveling gallery show, dubbing the project that’s so infectious.
  • Fed up with a variety of sculpted paper viruses.
  • Get rewards like an unreleased font and a songwriter who created that all-important childhood fave Schoolhouse Rock.

Most are composites of parts from two or three different project blurbs, and I’ve also tried to remove the ones that weren’t modified at all (sometimes Markov Chain Engines just spit out unmodified sentences). I think part of the reason these work so well is that the original blurbs conform to a particular style of quippy, short descriptions structured around rewards and project topics.
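Weeding out the unmodified sentences can be automated. The sketch below is one possible approach, assuming the original blurbs and the generated ones are both available as lists of strings; `normalize` and `drop_verbatim` are hypothetical helper names, and I did this filtering by hand rather than with this code.

```python
def normalize(sentence):
    """Lowercase and collapse whitespace so trivial differences don't hide a copy."""
    return " ".join(sentence.lower().split())


def drop_verbatim(generated, originals):
    """Keep only generated blurbs that do not match an original blurb exactly."""
    seen = {normalize(s) for s in originals}
    return [g for g in generated if normalize(g) not in seen]
```

This only catches exact copies; a blurb that differs from its source by a single word would still get through.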

Thoughts on Verizon and Google

Thursday, August 12th, 2010

In early 2007 I attended a talk at Fordham Law School by William Barr, the former US Attorney General and current Verizon General Counsel and Executive Vice President. The premise of his talk was that regulation, of the network neutrality kind, would only hurt technological innovation in the broadband and Internet space.

A lot has changed since then, and now that Google and Verizon have struck a deal purportedly threatening the openness of the future of the web, I thought I’d revisit some of my thoughts from that night as well as muse about what this deal might mean and why it’s happening now.

During his lecture Barr attempted to point out that there had never been an instance of a telecommunications company violating the terms of network neutrality, so why would they begin now? Out of nowhere, from behind me, someone shouted “What about Madison River?” That person was Tim Wu, whom I didn’t personally know at the time, but who would later become a friend of mine. Tim had interrupted Barr to remind him about Madison River, where a local telecom had blocked VoIP connections for broadband subscribers because the telephone company didn’t want to compete with inexpensive internet telephony. It was precisely the kind of violation of network neutrality that Barr was claiming had never happened. Barr dismissed Madison River as an isolated incident which didn’t represent the overall policy of non-discrimination by the telecom industry.

Later in the lecture, Barr tried to envision an industry closely regulated by the FCC in order to uphold network neutrality. This would be a world that Barr thought no one would want: innovation would peter out as businesses would face a high barrier to entry in the form of regulations. Conversely, if corporations had the opportunity to really invest in research and development without the fear of future regulatory action, then they might come up with services and tech that would be even better than TCP/IP. Barr believed that it was naive for us to blindly accept that TCP/IP was the best we were going to get for transferring data and communications over a network. Who is to say Verizon or AT&T couldn’t come up with a better protocol? TCP/IP has plenty of performance issues (real-time synchronous voice communication was a huge challenge), so why not let Verizon innovate at the protocol level, and sure, maybe they’d prioritize some kind of traffic, but it would be for the benefit of technological innovation. Just think of all the potentially amazing applications they could come up with if the FCC just left the innovation to Verizon’s R&D lab instead of the open internet and the public!

Just say no to walled gardens.

During the question and answer period, I asked Barr why he thought that consumers wanted more walled gardens of content, and whether it was wise to assume the market was going to support another set of AOLs, Compuserves and Prodigys. He replied that of course consumers wanted better content: video on handheld devices was going to be the future and the telecoms were going to be the only companies who could deliver it. I insisted that consumers only really want the internet in their pockets and that he was kidding himself if he thought a curated walled garden on a handset would be nearly as appealing as an actual functional web browser (something no mobile company had delivered yet).

In a sense we were both wrong and we were both right. Consumers did want mobile video on demand, but they also wanted the entire open web in a functional experience.

Prior to Barr’s lecture Verizon had announced a half-baked partnership with YouTube which would offer limited and selected versions of YouTube videos for watching on handheld devices. Then, a couple of months later, Steve Jobs announced the iPhone, which would have even greater support for YouTube. Verizon was banking on curated portals inside hobbled handsets, and Apple had just bet the farm on the touchscreen and a mobile Safari browser. We know who won this battle. Does anyone ever talk about watching YouTube on their three-year-old cell phone any more? Does anyone even remember the partnership?

Why neither Verizon nor any of the other telecoms ever fully invested in a serious mobile browsing experience is best explained by their general hostility to the open web. The big telecoms have always loathed the net, whether it was manifest in an engineering snobbery towards the “dumbness” of TCP/IP or the fact that the net worked best when it treated their products not like products at all but like common utilities, something no company wants. So it has never been surprising that the telecommunications industry never bothered to create a real mobile browsing experience; they were more eager to strike Big Deals with Exclusive Providers of Proprietary Content than to supply an actual connection to the open web.

Steve Jobs, to his credit, saw the opportunity to serve consumers what they really wanted, and he and Apple have since been handsomely rewarded for creating a mobile browsing experience worth using. Google’s choice to freely offer Android was a brilliant bit of strategy: all of the telecommunications firms and handset manufacturers were panicking and desperate to compete with Apple’s iPhone, so why not supply them with what they wanted?

So now Verizon and Google are making an uneasy deal behind the FCC’s back and trying to reassure the FCC and the public that they’re really doing it in the name of technological innovation. Think about all the applications that could exist if we didn’t have to rely on the Internet! Healthcare Monitoring! The Smart Grid! Advanced educational services! Incredible entertainment and gaming options! These are all ghosts of walled gardens past and there’s no reason to believe that a competitive startup can’t supply these exact services over the open web.

The wireless component of the Google/Verizon deal is the biggest wild card and the most controversial aspect of their joint policy proposal. The two companies argue that the principles of network neutrality shouldn’t apply in the wireless space. I couldn’t agree less. The telecoms have demonstrated very little capacity for innovation in the wireless space in the last 15 years (why is it so hard to develop SMS applications? why is Google Voice such a pain to reconfigure as my voicemail? etc.), so why would we trust them now?

Ultimately, why shouldn’t the principles of common carriage and network neutrality apply to the wireless space? Because it’s too difficult? Too expensive? I don’t buy it. What the wireless space needs now is faster and cheaper TCP/IP service and a more open application infrastructure. Negotiating one-off deals for new channels and services will only remind us of Compuserve circa 1999.

Lessig, Crawford and Wu have a good post about the proposal, but also read Jonathan Zittrain’s thoughts on it here too.

Duplicate Windows 7 Commercials Show Why Software Patents are a Bad Idea

Sunday, May 9th, 2010

Part of Microsoft’s aggressive Windows 7 TV advertising campaign revolves around pairing feature ideas with tongue-in-cheek reenactments of how those ideas occurred to “real” users. The “real” user retells how and where they came up with the concept and then demonstrates that, hey, Microsoft thought it was a good idea too and hey, look at that, it’s now in Windows 7! Clearly Microsoft is finally listening to its users (as opposed to Windows Vista).

Anyway, these faux testimonial-reenactments never struck me as particularly sincere, and after being subjected to one just a couple of minutes ago, I realized that I had seen basically the same commercial with another actor claiming to have thought of the same feature. So I went and double checked on YouTube, and indeed, there are two commercials with two totally different men (with different names) claiming to have thought of the “Aero Snap” feature in Windows 7. The former one is the original US one, and the latter one is the UK version.

So, who came up with the idea? For the sake of the argument, let’s interpret it in the most generous way possible: two independent, real people named Jake and Ramin came up with the same idea and Microsoft chose to implement it. How cool.

But wait, wouldn’t Microsoft probably own a patent on the Aero Snap feature? Sure enough, they do. It’s actually a lot broader and more powerful than simply snapping windows, but Microsoft applied for and received the patent in 2005.

And now they have two commercials with two people claiming to have come up with the same idea by themselves. Just imagine if one of those users hadn’t submitted the idea to Microsoft, but had merged it into a free software project at the same time. (It turns out that KDE, a free software desktop environment, has long had such a feature.)

In this generous interpretation Microsoft has implicitly created an argument against patents: independent and simultaneous discovery of inventions. Who do you give the patent to, Jake or Ramin? This is actually a hugely interesting area of contemporary research, and there’s been lots of work done to demonstrate that new ideas are almost never new. Kevin Kelly has a good post about it here.

Unfortunately, there’s a more likely and cynical explanation for the duplicate commercials: someone at Microsoft “discovered” the concept (here’s a MS blog post discussing its development and effectively taking credit for it), and then they did two or more sets of commercials with different demographically-appealing actors claiming credit for the features.

Cause Caller Thesis (PDF & LaTeX Source)

Friday, April 9th, 2010

Apparently my alma mater (and current part-time employer) ITP is suggesting current students look at my thesis because of its formatting.

I used LaTeX to format it and used the ACM template here. I’m proud to finally post it almost 2 years later. It also means that ITP students don’t have to go through the demented process of trying to recreate / reformat a template.

The significant departures from ACM’s template are the copyright notice and the image inclusions.

Good luck!

Cause Caller Thesis – PDF, 2.5 MB, CC BY-SA.
Cause Caller Thesis – LaTeX Source Code, .tex, 74 KB

Here’s a video of my presentation:

SQL Query for Confidence Intervals and The Adjusted Wald Method

Monday, April 5th, 2010

Suppose you want to compare two different websites and how many users of each site have bothered to fill out their e-mail address. But you have one problem: Site A only has 6 users, while Site B has 636 users. Does it make sense to compare the completion rate between the two sites? How confident can you be when comparing such different size samples?

As many people who work on usability studies will tell you, the small sample size of Site A constrains the conclusions you’re allowed to draw from your results, but the sample size of Site B is large enough to give you good confidence about most results you want to draw from its population.

So let’s assume 4 out of the 6 users of Site A have filled out their e-mail address, and that 424 out of the 636 users of Site B have. These numbers represent exactly the same ratio: both sites have a little more than 66.66% of their users who have completed the task. But can we really trust the comparison and say that they have equal e-mail-completion rates?

Statisticians measure this trustworthiness with something called a “confidence interval” and it’s very important when looking at trials or polls, especially when comparing two trials that have different sample sizes. As the sample size grows larger, this interval shrinks.

Now there are a number of ways to estimate the confidence interval for a trial, but the one that seems to be the best for small sizes (such as half a dozen users) is something called the Adjusted Wald method. The site Measuring Usability has a fantastic calculator that will generate these numbers for you and has a great explanation of them, but since I’ve found myself doing a lot of work in SQL, I created a generalized query I can drop into other queries to give me the results in the command line.

Here it is:

-- 1.96 is the z-score for 95% confidence; (1.96*1.96) is z squared.
SELECT
  n.c as "Total",
  r.c as "Successes",
  CONCAT(ROUND((r.c/n.c)*100,2),'%') as "Raw Ratio",
  -- margin of error around the adjusted proportion
  CONCAT('±',ROUND(1.96 * SQRT((( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) ))*(1-(( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) )))/(n.c + (1.96*1.96)))*100,2),'%') as "Confidence Interval",
  -- adjusted proportion minus the margin
  CONCAT(ROUND((( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) ))*100 - 1.96 * SQRT((( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) ))*(1-(( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) )))/(n.c + (1.96*1.96)))*100,2),'%') as "Lower Limit",
  -- adjusted proportion plus the margin
  CONCAT(ROUND((( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) ))*100 + 1.96 * SQRT((( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) ))*(1-(( r.c + (1.96*1.96)/2 ) / (n.c + (1.96*1.96) )))/(n.c + (1.96*1.96)))*100,2),'%') as "Upper Limit"
FROM
  (SELECT COUNT(*) as c FROM users WHERE users.email IS NOT NULL) as r,
  (SELECT COUNT(*) as c FROM users) as n;

This query uses two derived tables (the one named n represents the total number of users on the site and the one named r represents the users who have something in their e-mail address field) and will return 6 columns.

The first column will be the total number of users in your users table. The second will be the number whose e-mail address is not null, and the third column is the raw percentage.

The “Confidence Interval” column represents the margin of error (the ± half-width of the confidence interval) as calculated using the Adjusted Wald method, based on the walkthrough at Measuring Usability. Finally, the “Lower Limit” and “Upper Limit” columns are the simple result of subtracting and adding that margin to the adjusted proportion. The Adjusted Wald method adds roughly two successes and two failures to the sample before computing the proportion, which is why the limits are not centered exactly on the raw ratio. The confidence interval assumes 95% confidence (you can choose 99% if you want, just replace the occurrences of 1.96 with 2.58, but using a z-score of 1.96 for 95% confidence is very common) and the units of the results are in percent (hence the 100 multiplier).

So this is what the results look like for a query which returns 4 out of 6 user trials:

Total: 6
Successes: 4
Raw Ratio: 66.67%
Confidence Interval: ±30.59%
Lower Limit: 29.57%
Upper Limit: 90.75%

And 424 out of 636 trials:

Total: 636
Successes: 424
Raw Ratio: 66.67%
Confidence Interval: ±3.66%
Lower Limit: 62.91%
Upper Limit: 70.22%
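If you’d like to sanity-check the arithmetic outside the database, the same calculation fits in a few lines of Python. This is a minimal sketch of the Adjusted Wald formula as the query computes it; the function name `adjusted_wald` and its return order are my own choices for the example.

```python
import math


def adjusted_wald(successes, total, z=1.96):
    """Return (lower, upper, margin) as fractions, using the Adjusted Wald method."""
    z2 = z * z
    n_adj = total + z2                     # adjusted sample size
    p_adj = (successes + z2 / 2) / n_adj   # adjusted proportion
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - margin, p_adj + margin, margin


for successes, total in [(4, 6), (424, 636)]:
    low, high, margin = adjusted_wald(successes, total)
    print(f"{successes}/{total}: ±{margin:.2%}, from {low:.2%} to {high:.2%}")
```

Run against the two samples, it should line up with the limits listed above; passing `z=2.58` gives the 99% version.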

Notice how, because of the small sample size, the first trial’s results can swing between 29.6% and 90.8%! That’s a huge margin of error.

But in the case of Site B, with 636 users, there’s only a swing of 3.66% in either direction, meaning that the margin of error could only get the probability down to 62.9%.

If we assume the worst-case scenario, meaning Site A’s trial lands on its lower bound and Site B’s lands on its upper bound, then there’s a gap of more than 40 percentage points (29.6% vs. 70.2%) between the two sites’ statistics despite the ratio of users being identical!

That’s not terribly exciting news if you’re working with small numbers, but I hope my little query gets you on slightly more solid footing when comparing results of trials.

Good luck and please let me know if you have any questions (or suggestions — this is my first attempt at writing a stats post, so I’m sure there’s room for improvement/bug fixes).

Why do all Na’vi in Avatar have braids? Because code is law.

Sunday, January 10th, 2010

You could say that I’m partial to Lessig‘s maxim that “code is law.”

I also think it goes a long way to explaining some decisions James Cameron made while making Avatar. More specifically, the code and technology responsible for the majority of the movie’s (we can’t very well go on calling them films much longer, can we?) visual experience actively constrained the choices of the production team and thereby the choices of the Avatar characters themselves. Neytiri couldn’t have had voluminous hair even if she wanted to, because James Cameron’s hardware and software weren’t good enough.

If you haven’t followed computer graphics closely you might not know that certain textures and materials, like hair, are incredibly difficult to get right. Though there has been quite a lot of progress in the realm of still CG, capturing the motion and flow of humanoid hair is still very difficult if not virtually impossible. Cameron’s Avatar didn’t significantly advance the state of the art, but he was able to creatively sidestep the issue by giving his characters thick braids and dreadlocks which he could motion capture.

This alleviated the chore of trying to artificially generate the realistic movement of millions of individual hairs: if all the Na’vi had braids or dreadlocks, then all of that movement could be motion captured from actors in reality.

Much has been made of Cameron’s innovation in developing accurate motion capture for individual facial movements, and it is my strong feeling that the team also took this approach for the hair of their characters. As Wired pointed out in their features on the movie, this is an evolution in the modern director’s relationship to computer graphics: instead of trying to *simulate* real world phenomena using procedural software, directors opt to direct a close enough analog in the physical world whose motion can be captured at a very high resolution using camera-like devices.

Don’t believe me? Check out these screen grabs from the Avatar making-of video floating around:

Look closely at Zoe’s head and it doesn’t require a lot of imagination to believe that her dreadlocks have individual motion capture devices embedded in them. It’s also probably true that motion capture systems of this type cannot be scaled small enough for individual hairs. This might change in the future, but for now it is a real technological constraint in the world of Pandora. There are a couple of other examples of technology constraining creative choice: why don’t any animals in the Pandora jungle have fur? Might it be because Cameron couldn’t get CG fur to look right?

So Cameron’s technological constraints and innovation drove choices that would have otherwise been purely creative. Code became law on Pandora. Sometimes the origins of code’s constraints are artificial (such as copyright law) but sometimes they’re just practical constraints like software and CPU horsepower, and I think that’s what happened here.

Let me know if you agree or have any evidence to the contrary.

Emoji Dick

Monday, September 21st, 2009

I just launched a project on Kickstarter (an awesome NYC-based startup that helps people fund their ideas) to translate Moby Dick into Emoji using Amazon Mechanical Turk. I’m calling it Emoji Dick:

This project will fund the production, via crowd sourcing, of a never-before-released translation of Herman Melville’s classic Moby Dick in Japanese emoji icons.

Here’s an example of an Emoji sentence from Moby Dick:

Each of Moby Dick’s 6,438 sentences will be translated 3 times by different Amazon Mechanical Turk workers. Those results will then be voted on by another set of workers, and the most popular version of each sentence will be selected for inclusion in the book.
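The selection step can be sketched roughly like this in Python. It only illustrates the counting, not the actual Mechanical Turk pipeline; the names below are hypothetical, and the tie-breaking rule (first candidate to reach the top count wins) is an assumption, since the project description doesn’t say how ties are handled.

```python
from collections import Counter

SENTENCES = 6438                # sentences in Moby Dick, per the project description
TRANSLATIONS_PER_SENTENCE = 3   # each sentence is translated by 3 different workers

# Total number of translation tasks to post.
total_translation_tasks = SENTENCES * TRANSLATIONS_PER_SENTENCE


def pick_winner(ballots):
    """Given one candidate id per ballot, return the candidate with the most votes.

    Ties go to whichever tied candidate appeared first (an assumption).
    """
    counts = Counter(ballots)
    winner, _ = counts.most_common(1)[0]
    return winner


# Hypothetical ballots for one sentence with candidate translations "a", "b" and "c".
print(pick_winner(["b", "a", "b", "c", "b"]))
```

With three translations per sentence, that works out to 19,314 translation tasks before any voting starts.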

I’m trying to reach $3,500, and you can give at the $5, $10, $20, $40, and $200 levels and get different awesome rewards, like your name included in the book, a CC BY-SA licensed PDF, the raw data, and either a softcover black and white copy or a limited edition color version.

If you want to support the project, just visit the page here. Thanks!

Fighting iPhone App Store Stockholm Syndrome with Easter Eggs

Saturday, August 29th, 2009

Some iPhone app store developers are beginning to suffer from Stockholm syndrome and are now sympathizing and fighting on behalf of their captor, known as the iPhone approval process.

From Wikipedia’s article on Stockholm Syndrome:

Stockholm syndrome is a psychological response sometimes seen in abducted hostages, in which the hostage shows signs of loyalty to the hostage-taker, regardless of the danger or risk in which they have been placed.

And just as Patty Hearst picked up a machine gun to rob a bank while being held captive by the Symbionese Liberation Army, these developers are attacking the sane programmers trying to save them.

Here’s a guest post on TechCrunch in which Matt Galligan, CEO of an iPhone app development shop, calls out Yelp for not abiding by Apple’s rules:

Call it sneaky, call it clever, but I call it deceit. Apple has put forth specific guidelines, and “rules” around their app development, and while I don’t always agree, it’s the reality of how we must work with them for now. Yelp hid their easter egg behind shaking the device, which isn’t always the most intuitive action to take on an app that contains some maps and lists. As a result, the unsanctioned Augmented Reality view was gone from Apple’s radar.

Why is Galligan chastising Yelp? Sure, he acknowledges, the app store may act badly sometimes, but hey, rules are rules, right?

Wrong. He should be commending Yelp for putting their app’s approval on the line by risking Apple’s wrath. Yelp must have one of the most popular free apps in the iPhone app store, so it is quite a risk to release it with functionality purposely hidden from Apple.

But it’s the right kind of risk; it’s gutsy, offers a new whiz-bang feature, and asserts Yelp’s right to develop whatever features they want outside the scrutiny of their captor. These are values that all developers need more of when creating iPhone applications.

And if, as Galligan predicts, Yelp’s risk forces the App Store approval process to spend more time digging through source to discover undocumented functionality using forbidden (Gasp!) API calls, then maybe it will demonstrate to Apple that it’s just not worth treating your developers like hostages, and they’ll dismantle the approval process entirely.

Apple now has such strict control over the development process that some developers have clearly lost the ability to think for themselves. That means we have to find every opportunity to encourage them to fight against their captor’s tyranny.

That means encouraging risks like Yelp’s and developing more Easter eggs for iPhone apps.

So if you’re reading this and are also currently developing an iPhone app, think about including an Easter Egg that might rankle Apple. You won’t be ruining it for the rest of us; you’ll be chipping away at the wall of Apple’s tyranny over developers.

Regarding Public Disclosure of Private Fact on Social Networks

Monday, June 29th, 2009

A quick update about the Facebook governance post I wrote a while ago where I wondered whether disclosing private facts about yourself on your Facebook page would constitute “public disclosure of private facts” and thereby prevent you from claiming invasion of privacy should a friend disclose something they discovered on your semi-private profile:

… American law prevents me from disclosing private facts about Alice that are not news worthy. However, if Alice had disclosed such private facts in a public space (perhaps in front of a large audience), I can pass on the facts to others and even publish them.

But what if Alice discloses her private fact on her Facebook profile? It remains private in the sense that only I and her friends can see it by logging into Facebook’s private service, but it is also arguably public in the sense that I and her friends are also an audience. Does it matter how many friends she has? What privacy settings did she have in place?

Through a Slashdot post, I just stumbled across a case that hinged on a very similar fact pattern, Moreno v. Hanford Sentinel. The judge decided that since a teenager wrote a post on her MySpace blog revealing facts she believed (and now regretfully wishes) were private, she could not claim a breach of privacy under the doctrine.

The judge astutely points out that since the teenager’s MySpace page and blog were publicly available to “anyone with a computer and Internet connection”, they couldn’t be considered private even if she believed her actual audience to be tiny. But this leaves open the question of whether using Facebook’s privacy settings would create a particular level of security that would classify the profile and facts as “private.”

Obviously, details about actions and relationships matter a great deal in determining whether privacy has been breached and whether certain disclosures are public “enough” to negate a plaintiff’s privacy claim. But what is still interesting to me is whether certain technical choices a user can make on Facebook are substantial enough to shift a profile from being public to being private in the eyes of the law.

As Lessig argues, code is law, but in this case, we might be able to see it the other way around: Facebook’s code could amount to sufficient law.

RT @mecredis RANT RANT

Wednesday, June 10th, 2009

First, if you don’t like Twitter (I know, this blog is becoming a Twitter fan page, but hey, it’s my blog, right?) don’t read this post. It’ll just annoy you, so consider this your fair warning.

Last night I finally figured out how to change the setting in Tweetie for the iPhone that lets me post RT’s instead of via’s. The setting was buried in “Advanced -> Experimental -> RT-gurgitationability”, an obviously spiteful placement and label.

This means that my retweets look like this:

RT @creativecommons: June’s CC Salon NYC / @OpenVideo Conf Pre-Party: http://bit.ly/jAk1b Facebook RSVP: http://bit.ly/qJU3b

instead of this:

June’s CC Salon NYC / @OpenVideo Conf Pre-Party: http://bit.ly/jAk1b Facebook RSVP: http://bit.ly/qJU3b (via @creativecommons)

Why would Tweetie make it so difficult to use the RT convention over their suggested via convention?

The answer seems to be rooted in a minority view held by the creator of Tweetie. He doesn’t think the RT form is “cool” and thinks that it discourages people from “thinking for themselves”.

Or something.

The points raised against RT followed by my thoughts:

I don’t know how to reply to this. Is the @ symbol in e-mail cool? It’s a convention, get over it.

So what? A massive amount of human creation is “me too”; there’s no reason to discourage this on a software level. Let people filter out the “me-too’ers” using their own agency and following habits. You’re not going to suddenly encourage people to be more original by breaking your own software and bucking a convention.

There are plenty of people that I stopped following on Twitter because their output consisted only of RT’s, and I agree, they were spammy. But again, hiding a useful feature because you think it’s going to decrease spam is naive at best, and fascist at worst.

More importantly, however, there’s value in verbatim copying: you preserve the tone and the meaning of the source. How should I retweet something that Shaq says, if I want my followers to see it, supposing they don’t already follow him? Am I supposed to rewrite Shaq’s words? The curious way in which Shaq interprets the English language on Twitter is one of the best reasons to follow him. Rewriting Shaq’s tweets would kill the meaning, and so would linking to them.

I also fail to see how the claim that all retweets should be rewritten or linked differs from the claim that all journalists must rewrite and link quotes from their sources. The point is making the actual quote available in their words, right now, not through a link, and not through your lens.

I actually have sympathy for this, to a certain extent. Many friends were confused by RT when joining Twitter, but they asked questions and discovered the meaning. Same with e-mail.

You’re making my point for me!

One final point against Tweetie’s suggested convention: when you use (via @ … ) you’re adding 3 unnecessary characters compared to RT, which are precious when faced with Twitter’s 140-character limit.
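If you want to check the character math yourself, here is a rough Python sketch. The `rt_form` and `via_form` helpers and the sample text are made up for illustration. The exact saving depends on whether you put a colon after the handle in the RT form, so the sketch computes both.

```python
def rt_form(handle, text, colon=True):
    """Build an RT-style retweet; the colon after the handle is optional."""
    separator = ": " if colon else " "
    return f"RT @{handle}{separator}{text}"


def via_form(handle, text):
    """Build a via-style retweet with the attribution at the end."""
    return f"{text} (via @{handle})"


# Hypothetical sample tweet; any handle and text give the same differences.
handle = "creativecommons"
text = "CC Salon NYC"

# Extra characters the via form costs compared to each RT variant.
extra_vs_rt_with_colon = len(via_form(handle, text)) - len(rt_form(handle, text))
extra_vs_rt_without_colon = len(via_form(handle, text)) - len(
    rt_form(handle, text, colon=False)
)
print(extra_vs_rt_with_colon, extra_vs_rt_without_colon)
```

Against "RT @handle: text" the via form costs 2 extra characters, and against the colon-free "RT @handle text" it costs 3. Either way, via is the longer convention.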

Anyway, at the end of the day, the behavior of Tweetie’s developer represents a strong argument for software freedom. If you could view the source, modify it, and distribute a new version, you could just fork the project and “fix” the bug instead.

I suppose this is what I get for using closed source software. Too bad Tweetie works better than the open source clients.


Creative Commons License

Fred Benenson's Blog by Fred Benenson is licensed under a Creative Commons Attribution 3.0 United States License.