Author: Fred

  • Introducing: Code Canary

    If you’ve got any kind of engineering background, you’re probably spending most of your waking hours enmeshed in a possibly unhealthy relationship with a coding agent like Claude Code. I know I am. This also means you know models like Claude have good days and bad days. Sometimes these issues stem from service interruptions as APIs get overloaded or there’s some kind of backend issue preventing them from working at all.

    Other times it just seems like the model is being dumb and it’s hard to discern whether it’s the task you’ve given it or something deeper at play¹.

    The most maddening part is that it’s very hard to tell whether it’s you or your hastily vibe-coded codebase causing the difficulties, or whether something is actually wrong on the provider’s end. Without data, it’s impossible to know.

    Anthropic has this data, but we don’t.

    You’ve probably seen Claude Code ask you for feedback on how it’s doing:

    The problem with this survey is that the data gets sent to Anthropic and we never get to see it aggregated, so we have no idea if Claude is actually having a bad day or if it’s just our perception.

    This simple survey gave me an idea: what if we ran the same kind of survey, but through a decentralized data collection tool?

    Enter Code Canary, my humble attempt at creating a distributed data collection platform for analyzing the quality of coding agents in real time.

    Code Canary is a lightweight, open feedback system that lets developers rate their AI coding sessions and publishes the results as a public, continuously-updated comparison dashboard.

    Here’s how it works:

    1. You install a hook. For Claude Code, it’s a TaskCompleted hook — a single shell command that fires when your coding session ends. It takes about 30 seconds to set up.
    2. You rate your session. After each session, a small prompt asks: Did the agent complete the task? How was the code quality? How many corrections did you have to make? A quick 10-second interaction.
    3. Your rating is anonymized and aggregated. No code leaves your machine. No prompts are shared. Just structured metadata: which tool, what type of task, what language, and your rating (see the sketch below).
    4. The dashboard updates. At codecanary.net, the aggregated data powers a public leaderboard that anyone can explore — sliced by language, task type, codebase size, and more.
    More graphs coming soon!
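
    For the curious, here’s roughly what that anonymized payload could look like. This is a hypothetical sketch – the endpoint URL and field names are illustrative, not Code Canary’s actual schema:

    import json
    import urllib.request

    rating = {
        "tool": "claude-code",    # which agent you rated
        "task_type": "refactor",  # rough category of the task
        "language": "python",     # language of the codebase
        "completed": True,        # did the agent finish the job?
        "quality": 4,             # code quality, on a 1-5 scale
        "corrections": 2,         # manual fixes you had to make
    }

    # Note what’s absent: no code, no prompts, no identifiers.
    req = urllib.request.Request(
        "https://codecanary.net/api/ratings",  # assumed endpoint
        data=json.dumps(rating).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)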

    This isn’t just about buggy models

    As developers become more dependent on coding agents, we’re going to need an independent source of truth about how they’re actually behaving. Benchmarks are fine, but they measure a model’s performance under laboratory conditions. Even worse, models are frequently over-optimized for individual evaluation frameworks and can behave vastly differently in situ.

    For example, SWE-bench runs models against a curated set of GitHub issues. HumanEval tests function completion in isolation. These are useful, controlled experiments — but they measure performance in a vacuum, not in your codebase and not based on your lived experience as a developer.

    Meanwhile, the people with the best data on how these tools actually perform — working developers — have no systematic way to share what they know.

    The Canary in the Code Mine

    I named this project Code Canary after the canaries in coal mines that served as early warning systems. In practice, this meant sacrificing the lives of a lot of tiny birds: lethal gases (mainly carbon monoxide) would overcome them well before humans, giving miners time to reach safety. Code Canary works on a similar idea, but thankfully with zero avian deaths: when a tool’s quality starts slipping — maybe after a model update, maybe after a rushed release — the developers using it every day will know first. Their ratings are the earliest, most reliable signal available.

    Join the Beta

    I’m launching Code Canary in beta today. Here’s how to get it running:

    1. Install the hook for your preferred AI coding tool and start rating your sessions:
      curl -sL https://codecanary.net/install | bash
    2. Explore the dashboard at codecanary.net to see early results
    3. Share your setup — the more developers who contribute ratings, the more useful the data becomes for everyone
    4. Read the FAQ if you’re curious about how it works – Code Canary never sees your code, and data is anonymized before being publicly displayed.

    As more people use Code Canary, the data will become more useful.

    I can’t wait to start building some really sophisticated analytics, so I hope you check it out and contribute some votes!


    1. One interesting theory I’ve heard is that the model providers will purposely degrade performance before they release a new model by reallocating GPUs away from inference and onto model training. By decreasing the number of GPUs available for inference, the theory goes, they have to quantize models (e.g. add lossiness, like turning down the bitrate of an MP3), which means a decrease in reasoning quality and model performance.
  • Repurposing Claude Code for Better Spotify Recommendations

    I built a Claude Code skill that creates Spotify playlists from natural language. You describe what you want — “70s Ethiopian jazz fusion,” “ambient music that sounds like it was recorded in a cathedral,” “deep cuts from the golden age of hip-hop” — and Claude talks to Spotify’s API and generates a personalized playlist in your Spotify account, informed by your actual listening history.

    Spotify’s Recommendations Are Fine

    OK let’s be more generous – Spotify’s recommendation algorithms are actually pretty good. I’ve discovered all kinds of new genres and artists since I became a member in 2020. I no longer hate World Music; I just wasn’t listening to enough afropsychedelica. But the more time I spend on Spotify, the harder it feels to find really good music that the rest of the world hasn’t already discovered via the same correlation matrices.

    Case in point: I think we’ve all had the experience of sitting in a restaurant and realizing that a song we thought we were so cool for finding on Spotify was actually in heavy rotation on everyone else’s playlists as well. (I’m looking at you: Fantastic Man by William Onyeabor.)

    In an effort to get some more interesting human-curated music in my life, I participate in a Music League with my friends. It’s basically a weekly playlist competition based on a single prompt.

    Our most recent prompt was more or less about this:

    We Shall Overcome (Da Algoriddim)
    Search Spotify for obscure and it will make an “obscure mix” for you. It will be….fine. But go to a track you find especially interesting, make it make a “radio” off this track, then you start getting somewhere good. Enter to the best/most intriguing track off THIS playlist. Then do the same thing again. Points go to things truly new to you and (hopefully) everyone else.

    After trying this well-intentioned directive, I quickly found myself running into a familiar feeling: I’ve seen a lot of these songs before.

    But since I spend most of my waking hours interacting with Claude, I wondered if I could get Claude to help me out with this instead of futzing around in Spotify. That became the inspiration for my playlist-builder skill:

    Your Music Library Is Invisible to Spotify

    One key thing I wanted to incorporate that Spotify could never handle: I have a large MP3 collection I regularly listen to. Hundreds of gigabytes of meticulously organized, carefully curated tunes that I’ve painstakingly preserved across hard drives over the years. This collection represents two decades of musical taste that Spotify knows nothing about. When Spotify recommends music to me, it’s working with maybe 40% of my actual listening history. The other 60% – all the music I listened to from my first MP3 player, through college, through… well, 2020, when I signed up for Spotify – is invisible to it algorithmically.

    The playlist-builder skill fixes this. It reads a simple index of my MP3 archive along with my Spotify liked songs, then uses Claude to build a combined taste profile. From there it’s simply a matter of asking for recommendations. Claude has the musical knowledge to understand that if I own a bunch of Mulatu Astatke records and a bunch of Khruangbin records, I might like Mdou Moctar or Bombino — connections that require cultural context, not just collaborative filtering.
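
    For what it’s worth, the archive half of that index is the easy part. Here’s a minimal sketch of how it could be built (illustrative, not the skill’s actual code, and it assumes the archive is organized as Artist/Album/Track.mp3):

    from pathlib import Path

    def index_mp3s(root):
        """Yield one 'Artist / Album / Track' line per file in the archive."""
        for path in sorted(Path(root).rglob("*.mp3")):
            yield " / ".join(path.relative_to(root).with_suffix("").parts)

    # Write the index that Claude reads alongside the Spotify liked songs, e.g.:
    # Path("mp3_index.txt").write_text("\n".join(index_mp3s("/Volumes/Music")))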

    Claude Code for Non-Code Projects

    Using Claude Code for things that aren’t software engineering is not just my hobby, it’s also my passion. The tool is designed for code, but the underlying capabilities — reading files, understanding context, executing scripts, maintaining state across a workflow — make it a surprisingly good platform for creative or research projects.

    My playlist builder is a good example. The actual Python script is ~290 lines of stdlib-only code that talks to the Spotify API. The interesting part is everything Claude does before the script runs: reading your listening history, understanding the genre you’re describing, reasoning about what specific tracks would fit, balancing familiarity with discovery, and knowing enough about music history to make connections across decades and continents. The script is plumbing. Claude is the curator.
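
    To give a feel for how thin that plumbing is, here’s a minimal sketch of the stdlib-only approach (not the actual script; you’d still need an OAuth token with playlist scopes, and the function names are mine):

    import json
    import urllib.request

    API = "https://api.spotify.com/v1"

    def spotify_post(path, token, payload):
        """POST JSON to the Spotify Web API and return the parsed response."""
        req = urllib.request.Request(
            f"{API}{path}",
            data=json.dumps(payload).encode(),
            headers={"Authorization": f"Bearer {token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def create_playlist(token, user_id, name, track_uris):
        """Create a private playlist, add the picks, and return its URL."""
        playlist = spotify_post(f"/users/{user_id}/playlists", token,
                                {"name": name, "public": False})
        spotify_post(f"/playlists/{playlist['id']}/tracks", token,
                     {"uris": track_uris})
        return playlist["external_urls"]["spotify"]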

    There’s something satisfying about the workflow. You type a sentence describing a vibe. Claude thinks for a moment, then produces a paragraph explaining its curatorial approach — why it chose certain artists, how it’s balancing eras, what sonic thread connects the selections. Then the script runs, tracks appear in your Spotify, and you hit play. The whole thing takes about two minutes.

    Free-Form Genre Description Is Underrated

    One thing that’s become clear from using this: the natural language interface feels better for music discovery than searching for songs you already know, browsing genre tags, or using Spotify’s Radio feature. Music taste is idiosyncratic and contextual in ways that fixed taxonomies can’t capture.

    Try describing what you want to a genre picker: “I want something that sounds like a 1970s French film soundtrack but with more percussion and maybe some dub influence.” You can’t click your way to that. But you can say it to Claude, and it knows exactly what you mean — it’ll pull from Serge Gainsbourg, Manu Dibango, Lee “Scratch” Perry, and Stereolab, and the resulting playlist will feel coherent in a way that no algorithmic “mood blend” achieves.

    The Measurement Problem

    All that said, I have no idea if Claude’s playlists are objectively better than Spotify’s. “Better” in music recommendation is deeply subjective, and the only way to rigorously test it would be large-scale controlled experiments — A/B testing playlist sources across thousands of users, measuring skip rates, save rates, repeat listens, and long-term retention. I can’t run that experiment.

    What I can say is that Claude’s playlists feel better. They have more range, and there’s more intelligence in how they’re built. They also come with interesting anecdotes about the music itself – though I suppose you have to trust that those aren’t hallucinations.

    If Spotify ever opened up their engagement data through an API, it would be a fascinating experiment: take 1,000 users, give half of them Claude-curated playlists and half Spotify’s algorithmic ones for the same prompt, and compare the engagement metrics after a month. My guess is Claude would win on satisfaction and discovery but lose on raw play-through rates, because Spotify is very good at picking songs you won’t skip — which isn’t the same thing as picking songs that expand your taste.

    What’s Actually Happening Here

    The interesting meta-observation is that this skill works because Claude has absorbed a huge amount of music criticism, history, and cultural context from its training data. It understands genre genealogies, regional scenes, label rosters, producer relationships, and era-specific sounds in a way that’s genuinely useful for curation. It’s not doing collaborative filtering (“users who liked X also liked Y”). It’s doing something closer to what a knowledgeable record store clerk does: understanding what you’re asking for, drawing on deep domain knowledge, and making recommendations that respect your taste while pushing its boundaries.

    That’s a fundamentally different approach to recommendation, and one that the major streaming platforms can’t easily replicate, since they rely on collaborative filtering. It means they’re recommending what people typically listen to, not necessarily what will broaden or challenge your horizons.

    The skill is open source if you want to try it yourself. You’ll need a Spotify developer account and Python 3, but no other dependencies.

  • 221 Cannon is Not For Sale

    Like most people, I’ve had my identity stolen once or twice in my life. It’s annoying, but thankfully I’ve avoided some of the more catastrophic outcomes when criminals begin impersonating you.

    These days, however, it seems like someone is really trying to change that: a scammer has now tried to impersonate me multiple times in a six-figure land deal in my hometown of Wilton, CT. So while I usually use this blog to write about finding weird things on the internet, it’s now time for a story about something weird on the internet finding me.

    The story begins with my brother Alexander and me purchasing a small parcel of vacant land at 221 Cannon Road in Wilton, Connecticut in 2015. It’s been over 10 years since we purchased it, and we have never listed it for sale. Nor do we have plans to sell it.

    And yet, three different real estate agents have now contacted us to let us know that someone has been impersonating us and attempting to sell our property out from under us.

    The first time it happened, it was pretty upsetting, but now that it’s happened another two times, I figured it was time to write a blog post about it.

    The First Attempt (March 2024)

    In March 2024, I received an email from a real estate attorney in Wilton, asking if I was the “Fred Benenson” who co-owned property in town with an “Ed Benenson.” He explained that a realtor at a major brokerage had been working with someone claiming to be us, and that there was already an offer on the table. The attorney was doing his due diligence before representing the sellers — and something didn’t add up.

    I replied within minutes: Neither of us had spoken to anyone about selling the property. It was pretty concerning.

    The realtor had been contacted through Zillow by someone claiming to be me. They’d had a phone conversation — she noted the person had a “middle European” accent — and the scammer had provided accurate details about the property, including its exact acreage. The impostor gave her the email address [email protected] and the phone number (516) 828-0305. He also provided a fake email for my brother: [email protected]. Notice the subtle misspelling — “Benenson” without the second “n” in the email, and the hyphenated “out-look.com” domain.

    She had walked the property, taken drone photos, pulled comps, and listed it for a price well above what we paid for it. The property had been live on dozens of real estate websites for days before anyone caught it. A builder had already submitted a full-price cash offer.

    The scammer had even e-signed a purchase agreement.

    When the attorney requested identification before closing, the impostor provided a New York State driver’s license. It had my father’s name (which I share with him) and his correct date of birth and home address. But the photo was of a complete stranger.

    I have no idea who that guy on the license is, but it’s definitely not my Dad. The license wouldn’t fool anyone who knew my father, but it didn’t need to – in a transaction conducted entirely by email and text message, with a closing the scammer would never actually attend, the ID just needed to look plausible enough to keep things moving forward. (Though if you look closely at the signature, it’s clearly not written by hand.)

    How It Was Caught

    The attorney deserves most of the credit here. He told me this was the second time in nine months he’d encountered this exact scheme on vacant land in Wilton – his policy is that he won’t represent owners of vacant land without independently verifying ownership. That’s what led him to track me down, and that’s what stopped the sale.

    The realtor was an innocent victim in this too. She’d done her job by walking the property, pulling comps, etc., all in good faith. When I initially suggested (perhaps unfairly) that this felt like lead generation, the attorney took me aside and vouched for her. I’m glad I listened to him!

    I apologized to her, and she graciously forwarded me all of her text message exchanges with the scammer. Reading through them was fascinating. The impostor was responsive, polite, and generally knew the right things to say. But there were tells: slightly awkward phrasing (“Hi good morning”), declining a for-sale sign (“No I don’t think that will be necessary”), and a general reluctance to engage in any way that might require showing up in person.

    Going to the FBI

    After gathering everything I could — the fake ID, the realtor’s text messages, the scammer’s email addresses and phone number, and the attorney’s notes from a prior similar case — I contacted the FBI field office in Connecticut. They directed me to “walk it in” to the office in New York City.

    The experience was, frankly, underwhelming. The FBI wouldn’t let me submit any of our documentation. Instead, they required me to write out the entire complaint by hand on a single piece of paper and hand it to the guard. He made some calls while I waited, and by the end he seemed at least somewhat interested. He gave me the standard line: expect to hear from someone within 2-3 weeks, if at all.

    I never heard from anyone.

    The attorney, meanwhile, checked with his title company about recording an affidavit on the land records — something that would alert any future buyer or title searcher that the property had been targeted by fraud, and provide our verified contact information.

    It’s Happening Again (February 2026)

    I thought this was behind us. Then, this past week, nearly two years later, I was contacted by two more real estate agents, both reaching out to warn me that someone was once again trying to sell 221 Cannon Road.

    The first was an agent in Wilton who reached out via Instagram DM, of all places — it was the only way he could find to contact me. He explained that his team had received an inquiry to list 221 Cannon Road and had sent paperwork to “Fred and Alex” the night before to sign. But he’d done something smart: he’d noticed he had a mutual friend with my brother, and when he asked that friend about the situation, the friend flagged that the conversation with “Alex” didn’t sound right.

    “I had a really bad feeling it wasn’t,” he told me.

    The second agent, a woman at Berkshire Hathaway, sent a carefully worded email explaining that she’d been contacted by someone claiming to have authority to sell our property, but that “several standard verification steps raised concerns” and she chose not to proceed. She reached out purely as a courtesy to let us know.

    Which is about when I decided I should write something about this. Not only because it’s a fascinating scam that seems to be getting more common, but because I figured this post might show up when the next broker does research on the address.

    Vacant Land Fraud

    This type of scam targets a very specific vulnerability: vacant land has no occupants to notice a for-sale sign, no neighbors who’d immediately recognize something is wrong, and closings often happen remotely.

    Here’s how it works:

    1. The scammer identifies vacant land through public records or Zillow. They look for parcels that are owned free and clear (no mortgage), haven’t changed hands recently, and are in desirable areas.
    2. They contact a real estate agent through a platform like Zillow, posing as the owner. They know the property details because that information is publicly available.
    3. They communicate primarily through text and email, avoiding in-person meetings. They provide fake identification if asked.
    4. They agree to whatever price the agent suggests (because they don’t actually own the property, any sale is pure profit).
    5. They push for a quick closing and attempt to direct proceeds to an account they control.
    6. If questioned, they disappear. The scammer who targeted us in 2024 simply stopped responding once the attorney asked for an in-person closing.
    7. If they get far enough, they pocket the earnest money deposit, which would have been significant in my case.

    A similar scheme in nearby Fairfield wasn’t caught in time: someone had a $1.5 million home built on land they didn’t own without the actual owners knowing.

    What You Can Do

    If you own vacant land there are a couple of things you can do, but the most effective is probably to put a fraud / no-authority notice on the public record. Call your County Recorder / Register of Deeds and ask how to record one of these (names vary by state):
    • Owner Affidavit
    • Affidavit of Fact
    • Notice of Non-Authority to Convey
    • Fraud Alert / Title Alert Notice
    • Statement of Ownership / Anti-Fraud Notice

    You can also set up Google Alerts for your address, and you’ll be notified if it appears online.

    Finally, and this certainly isn’t for everyone, you can make yourself easily findable online. One reason the attorney was able to verify ownership quickly in 2024 was that it’s fairly easy to google me. If you own property, make sure there’s some way for a diligent attorney or agent to reach the real you.

    The Property Is Not For Sale

    In case it isn’t clear, 221 Cannon Road is not for sale. It has never been for sale. If you are a real estate professional who has been contacted about listing or purchasing this property, please reach out to me directly.

  • SendGrid isn’t emailing you about ICE or BLM. It’s a phishing attack.

    For the past several months, I’ve been receiving and then ignoring a steady stream of concerning emails from SendGrid, the popular email delivery service owned by Twilio that I use for sending emails from Breadwinner. I’d see some weird API error notification, log in to my SendGrid account, check that everything was working properly, and then delete the email. I didn’t pay too close attention to them until I saw a couple of very strange ones.

    Today, I received this one implying SendGrid was going to be adding a “Support ICE” button to all emails sent through their platform:


    If you’ve been paying any attention at all to US politics, you’ll know how insidiously provocative this would be if it were a real email.

    But it isn’t. It’s a phishing email. If you use SendGrid, or have ever used it, you might be getting these too.

    This phishing campaign is a fascinating example of how sophisticated social engineering has become. The Nigerian 419 scams of yore have given way to carefully crafted messages, aimed at professionals, designed to exploit the American political consciousness.

    The opt-out buttons are the trap.

    The Attack

    Here’s how it works: hackers compromise SendGrid customer accounts (through credential stuffing, password reuse, the usual methods). Once they have access, they can send emails through SendGrid’s infrastructure, which means the emails pass all the standard authentication checks (SPF, DKIM) that your spam filter uses to determine legitimacy. The emails look real because, technically, they are real SendGrid emails, sent via SendGrid’s platform and riding on a customer’s sending reputation – they’re just sent by the wrong people from the wrong domains.

    They’re likely working from a list of SendGrid customers, which lets them target only people who have used the service before.

    Security researchers at Netcraft dubbed this “Phishception” back in 2024: attackers using SendGrid to phish SendGrid users, creating a self-perpetuating cycle where each compromised account can be used to compromise more accounts.

    This has been going on for years. Brian Krebs wrote about it in 2020. And yet here we are.

    The Lures

    What’s changed, or at least what I’ve noticed recently, is the political sophistication of the bait. The attackers aren’t just sending “your account is suspended” emails (though they do that too). They’re sending messages designed to provoke a strong emotional reaction that compels you to click.

    Here are some I’ve received:

    The LGBT Pride Footer

    From: [email protected]

    This one claims SendGrid’s CEO “James Mitchell” (not a real person) came out as gay, and to show support, SendGrid is adding a pride-themed footer to all emails. “We understand this may not be right for everyone,” it helpfully notes, offering a “Manage Preferences” button.

    Note the opt-out. If you support LGBTQ+ rights, you might ignore this. But if you don’t? You’re clicking that button immediately.

    The Black Lives Matter Theme

    From: [email protected]

    For “one week,” all emails will feature a commemorative theme honoring George Floyd and the Black Lives Matter movement. This change applies “platform-wide to all users.”

    Again: “If you prefer not to participate, you can opt out below.”

    Note the sender domain: nellions.co.ke, a Kenyan domain. This is a compromised SendGrid customer account being used to send phishing emails to American targets about American political issues.

    The ICE Support Initiative

    From: [email protected]

    This one arrived just this morning. SendGrid is supposedly adding a “Support ICE” donation button to the footer of every email sent through their platform, “in response to recent events” and “as part of our commitment to supporting U.S. Immigration and Customs Enforcement.”

    The timing here is notable: these hackers are reading the news.

    The Spanish Language Switch

    From: [email protected]

    And then there’s this one, which is just absurd: “Your language preference has been successfully changed to Spanish. All emails sent via the API will now be formatted in Spanish.”

    This one is less politically charged and more “wait, what? I didn’t do that” – just enough anxiety to get you to click.

    The Classic Account Termination

    From: [email protected]

    And of course, they still do the classics: “Your account has been terminated for misusing sending guidelines.”

    The Pattern

    Look closely at those sender addresses again at the top of the Gmail message:

    • drummond.com
    • nellions.co.ke
    • theraoffice.com
    • nutritionsociety.org
    • myplace.co

    None of these are sendgrid.com. They’re all legitimate businesses whose SendGrid accounts have been compromised. When these emails hit your inbox, they pass authentication because they really were sent through SendGrid, just not by SendGrid.

    Who’s Behind This?

    The political sophistication on display here (BLM, LGBTQ+ rights, ICE, even the Spanish language switch playing on immigration anxieties) suggests someone with a deep understanding of American cultural fault lines.

    We know that state actors have invested heavily in understanding and exploiting these divisions. Russian active measures campaigns have been documented doing exactly this kind of work: identifying wedge issues and creating content designed to inflame both sides. North Korea has demonstrated similar sophistication in their social engineering operations by targeting academics and foreign policy experts.

    I’m not saying this is a state actor necessarily – the economic value of exploiting SendGrid’s formidable email infrastructure is most likely the appeal here. Similarly, this could just as easily be a domestic operation run by someone who’s extremely online and knows which culture war buttons to push. But I think the skill set required (technical ability to compromise accounts at scale plus cultural fluency in American politics) is notable.

    Can This Be Fixed?

    Honestly? I don’t know.

    SendGrid has known about this problem for years. Twilio (SendGrid’s parent company) has talked about requiring two-factor authentication for all customers, but implementation has been slow. The fundamental issue is that SendGrid’s business model depends on making it easy for legitimate businesses to send email at scale. Anything that adds friction for good actors also adds friction for bad actors, but the bad actors are more motivated to work around it.

    Meanwhile, the attackers only need one thing: access to SendGrid customer accounts. As long as people reuse passwords and don’t enable 2FA, there will be a steady supply of compromised accounts. It’s a bit of a hydra problem: cut off one head, another grows behind it.

    Protecting Yourself

    If you’re a SendGrid customer: enable two-factor authentication immediately. Use a unique password. Check your account for unauthorized API keys or sender identities.

    If you’re just receiving these emails: don’t click anything. The links go to fake SendGrid login pages that will steal your credentials in real-time as they actually validate your password against SendGrid’s API and even capture your 2FA codes.

    A Filter Hack

    For Gmail users, you can create a filter to automatically delete SendGrid impersonation emails that don’t come from legitimate SendGrid domains:

    1. Go to Settings → Filters and Blocked Addresses → Create new filter
    2. In the “From” field, enter: -from:sendgrid "sendgrid" -from:sendgrid.com -from:twilio.com
    3. In the “Has the words” field, enter: sendgrid
    4. Click “Create filter” and select “Delete it”

    This will catch emails that mention SendGrid and have SendGrid in the sender name but aren’t actually from SendGrid. It’s not perfect, but it helps.
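
    If you filter mail somewhere other than Gmail, the same heuristic is easy to sketch in code: flag anything that talks about SendGrid but wasn’t actually sent from a SendGrid or Twilio domain. Here’s a rough sketch under those assumptions, using only the Python standard library (remember, SPF/DKIM authenticate the sending domain, not the brand named in the message):

    import email
    from email.utils import parseaddr

    LEGIT_DOMAINS = {"sendgrid.com", "sendgrid.net", "twilio.com"}

    def looks_like_impersonation(raw_message: bytes) -> bool:
        """True if the message mentions SendGrid but isn't from a legit domain."""
        msg = email.message_from_bytes(raw_message)
        _, addr = parseaddr(msg.get("From", ""))
        domain = addr.rpartition("@")[2].lower()
        mentions = "sendgrid" in (msg.get("Subject", "") + " " + addr).lower()
        return mentions and not any(
            domain == d or domain.endswith("." + d) for d in LEGIT_DOMAINS
        )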

    Have You Gotten These?

    I’m curious what other variations are out there. If you’ve received SendGrid phishing emails (especially weird or politically-charged ones) leave a comment or reach out. The more examples we document, the easier it is for people to recognize these when they land in their inbox.

    And if you work at Twilio/SendGrid and want to explain what’s being done about this: I’m all ears.

  • The Art and Science of Counting Pixels: Behind the Math of the game Figment

    “What’s the fastest and most accurate way to calculate the exact area of pink in this design?”

    That was the question my friend Alex Hague asked me when he was developing the game Figment for his tabletop company CMYK Games.

    It’s a deceptively simple request, but answering it formally or even technically is quite difficult.

    Before we try to answer it, here’s a little more on one of CMYK’s newest games: Figment is a brilliantly simple and fun game that challenges players to estimate the area of shapes and then use those estimates to sort a set of cards. The twist is that when a card appears to have more of one color than another, it is sometimes merely an optical illusion exploiting our visual system’s difficulty estimating the area of arbitrary shapes.

    This is the perceptual challenge that makes Figment tricky (and therefore fun to play), but it also meant CMYK needed an objective “ground truth” for each color on every card in order to make the game fair. And with dozens of cards, that meant we needed an automated way to calculate the exact percentages for each card’s answer key.

    Alex explained to me that what had initially seemed like a straightforward design task and something easy to prototype had turned out to be a fiendishly annoying problem. He approached me because he knows I love these types of problems and wondered whether it could be solved with some computational creativity.

    And indeed, as soon as he explained it to me, I jumped at the opportunity to start puzzling through how we might do it algorithmically.

    The challenge behind the game

    Before we get to the mathy stuff, let’s go over the rules for playing Figment. 

    In each round of Figment, players are tasked with ranking cards based on one of the 4 main colors of the game – simply laying out 5 cards ranked by which ones have the “most” of the round’s arrow color.

    Sounds simple, right? Well, it turns out it’s pretty tricky in practice which is what makes the game fun.

    The back of each card reveals each color’s percentage relative to the entire surface area of the card including the white background. 

    The task Alex and his design team were struggling with was systematically calculating the exact percentage of the card face that each color occupies. This was crucial because the entire point of the game is to rank the cards by their color percentages – if the numbers were wrong, the game wouldn’t be fair or fun.

    Alex needed a method that was:

    • Accurate to a couple of decimal places: If one card were 25.51% blue and another 25.49% blue, they needed to be rounded properly
    • Consistent across all cards: Checking the values manually in Illustrator turned out to be a total chore and sometimes yielded different results. Online tools were no better.
    • Verifiable: We wanted the process to be data-driven so we could easily audit the entire deck in a spreadsheet
    • Automated: Card designs sometimes changed, and being able to quickly re-process all 100 designs in one go would be ideal

    You’d think there would be good tools for this, but they’re harder to use than you’d expect.

    Our first instinct was to look for existing tools that could batch-process the color analysis. Adobe Illustrator offers one-off estimates of the area of vector shapes when a single color is selected, but seeing as Figment ships with 100 distinct cards (some with hideously complex designs), we wanted to avoid the manual, error-prone labor that approach required, especially considering it would have involved compiling the results into a separate spreadsheet by hand. Automating Illustrator to “pick” the color areas automatically also seemed like a nightmare, if it was even possible.

    A computational method would mean implementing an automated way to mathematically derive the surface area of each color. This approach would also have the added benefit that we could apply exactly the same algorithm to each card systematically.

    If you try digging around the web for this kind of tool, you’ll find there are apps and online services that will analyze images for their color distribution, but we found them to be consistently unreliable when compared to Illustrator and not much more efficient either. Even more annoyingly, we found ourselves getting occasionally inconsistent results from the same tools with the same card, which was very frustrating.

    This kind of thing actually really matters for a game like Figment.

    Without a rigorously tested ground truth for each color on each card, we knew shipping the game would feel sketchy at best and dishonest at worst. In particular, it’d be problematic for the really close calls where players were forced to choose between two cards with 6% pink vs. 5% pink, for example. A rounding error of 0.5% could determine the entirety of a game’s outcome. 

    If you’ve ever participated in a particularly competitive board game session, you’ll know how important it is to have a definitively, objectively correct outcome for a given rule. 

    All this meant that I was very excited to see if I could figure out a comprehensive computational approach using mathematics.

    I initially considered calculating the area of the vector shapes directly. In theory, you could compute the exact area of each shape from its geometry using pure math (e.g. the area of a circle is given by the simple formula A = πr²).
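
    For a single well-behaved polygon, that direct computation is easy. The shoelace formula, for instance, gives an exact area from the vertices alone:

    def polygon_area(vertices):
        """Shoelace formula: exact area of a simple (non-self-intersecting)
        polygon, given its (x, y) vertices in order."""
        area = 0.0
        n = len(vertices)
        for i in range(n):
            x1, y1 = vertices[i]
            x2, y2 = vertices[(i + 1) % n]
            area += x1 * y2 - x2 * y1
        return abs(area) / 2.0

    # polygon_area([(0, 0), (4, 0), (4, 3)]) == 6.0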

    Sadly this approach quickly falls apart when dealing with complex, overlapping shapes.

    The more I thought about it, the more I realized that this must be an established problem in mathematics, similar to the coastline paradox:

    It’s not hard to see how the coastline paradox applies here: if the same shape has different perimeters depending on the units you measure it with, then you’ll end up with different areas!

    The challenge with Figment is similar, and it turns out this is a non-trivial problem in higher math. There’s even a field dedicated to it called measure theory!

    Here’s a relevant excerpt from a Mathematics Stack Exchange thread discussing the difficulty of calculating the area of an arbitrary shape:

    This happens to be a surprisingly difficult problem. In 1890 Karl Hermann Amandus Schwarz (1843-1921) published an example that showed the accepted definition of surface area gives an infinite area for a cylinder by showing there exists a sequence of inscribed polyhedra that converge uniformly to a cylinder such that the areas of the polyhedra diverge to infinity. 

    Thankfully I didn’t need a PhD in mathematics to get any further – some basic problem solving landed me with a really solid solution. 

    The Pixel-Counting Solution

    The original designs of Figment’s cards, created and stored as vector graphics in Adobe Illustrator, are about the closest thing to pure shapes inside a computer’s mind. But remember: because virtually all of these shapes are irregular, they don’t have easily computable areas. So while the computer can render them, it doesn’t actually know their area!

    How would you estimate the area of the pink line?

    That’s the beguiling paradox that’s at the heart of the Figment game mechanic: we can perceive the boundaries and edges of these shapes both digitally and visually, but neither our minds nor our computers have a precise way to determine their area. 

    However, I realized that the key to computing their areas could rely on rasterization, the process of turning a vector image into a pixel-based raster image. 

    In order for a shape to be displayed on your computer’s LCD screen or saved as a static raster image, it must undergo a transformation from its theoretical representation of lines, curves, and fills into individual pixels. So whether your computer is displaying a vector shape on an LCD screen, saving it as a PNG, or printing it on an inkjet printer, the same underlying process is at work: fill up a grid of pixels in such a way that it resembles the underlying shape defined in the vector file, and let our eyes do the rest. The original information of the vector image, defined by splines and shapes, gets discarded in favor of a finite set of pixels on a grid with limited resolution.

    I realized that we could piggyback off this foundational process in computer science to solve the problem for Figment, as once we had a pixel representation of each card, we could merely count the number of pixels per color, something standard image processing libraries can all do. 

    Just count the pixels per color, and then you’ll have your answer, right?


    Unfortunately, while this was the right path to the correct solution, it turned out to be a more difficult problem than meets the eye.

    Here’s why: while vector images offer a ton of value to designers, they must always eventually go through a rasterization process to make them look appealing to human eyes and not just computer systems. This rasterization step introduces a whole set of ambiguous colors via something called anti-aliasing.

    Anti-aliasing is its own topic in computer graphics but here’s a quick explanation:

    Anti-aliasing smooths jagged edges by creating intermediate colors between shapes, making edges look smoother to the human eye but creating a gradient of ‘in-between’ colors that complicate pixel counting.

    Anti-aliasing initially became a huge problem for this approach as it results in a bunch of “in between” colors on the border of edges:

    How would you count the pixels in this cropped portion of a card? Inevitably, you’d end up with some tough calls when it comes to these edge pixels – they’re essentially indeterminate colors for the purposes of playing Figment. 

    This posed a problem for the naive pixel-counting approach, as we’d end up with a lot of colors that were similar to the main colors but weren’t actually the game’s primary colors. These indeterminate, lost pixels added up to single-digit percentages, which in a game that comes down to small differences could be a huge issue.

    Put another way, if you had to look at each of these pixels and answer the question: which pixels “belong” to Figment’s pink color, you’d have a hell of a time.

    However, thinking about it this way gave me the clue I needed for the solution: I could use a simple distance metric, charting a line in RGB color space between each indeterminate pixel and each of our five colors.

    The Figment color closest to the ambiguous pixel (i.e. the one with the shortest line) would be the color the pixel “belonged to”.

    Here’s how I did it in code – it’s an algorithm that effectively snaps each pixel to the closest of our five target colors (the four game colors plus the white background) using Euclidean distance in RGB space:

    def get_color_name(rgb_triplet, target_colors):
        min_distance = float('inf')
        closest_color = None
        for color_name, target_rgb in target_colors.items():
            # Squared Euclidean distance between the pixel and the target color
            # in 3D RGB space (treating R, G, B as x, y, z coordinates).
            # No square root needed: squared distances compare the same way.
            dist = sum((a - b) ** 2 for a, b in zip(rgb_triplet, target_rgb))
            if dist < min_distance:
                min_distance = dist
                closest_color = color_name
        return closest_color

    This ensured that light pink pixels were counted as pink, slightly off-green pixels were counted as green, and so on.

    It also meant that no pixels were left behind – every pixel got assigned to one and only one of the five main colors.

    Put really simply, you can think of this algorithm as “rounding” the pixels to their nearest Figment color by using geometry in a 3D color cube. Neat!
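
    With get_color_name in hand, the full pipeline is only a few more lines. Here’s a sketch of what it looks like end-to-end (assuming Pillow is installed and the cards are exported as PNGs – the RGB values below are placeholders, not CMYK’s actual palette):

    from collections import Counter
    from PIL import Image

    TARGET_COLORS = {
        "pink":   (233, 75, 140),   # placeholder values
        "blue":   (45, 110, 220),
        "green":  (60, 170, 90),
        "yellow": (245, 200, 40),
        "white":  (255, 255, 255),
    }

    def color_percentages(path):
        """Snap every pixel to its nearest target color and tally percentages."""
        pixels = Image.open(path).convert("RGB").getdata()
        counts = Counter(get_color_name(p, TARGET_COLORS) for p in pixels)
        total = sum(counts.values())
        return {name: round(100 * n / total, 2) for name, n in counts.items()}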

    The Results

    The algorithm performed perfectly. For each card, we derived precise percentages that we could verify visually and mathematically. The whole deck was processed in under a minute, generating a spreadsheet with color percentages accurate to two decimal places.

    We were even able to use these percentages to rank cards by difficulty, with cards having more evenly distributed colors being harder to guess than those dominated by one color.
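
    One straightforward way to quantify “evenly distributed” – a sketch of the idea, not necessarily how we actually scored the deck – is the Shannon entropy of a card’s color distribution: the more evenly the percentages are split, the higher the entropy, and the harder the card.

    from math import log2

    def difficulty(percentages):
        """Shannon entropy of a {color: percent} dict; higher means harder."""
        probs = [p / 100 for p in percentages.values() if p > 0]
        return -sum(p * log2(p) for p in probs)

    # difficulty({"pink": 25, "blue": 25, "green": 25, "white": 25}) == 2.0
    # difficulty({"pink": 97, "blue": 1, "green": 1, "white": 1})  ≈ 0.24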

    Open-Sourcing the Tool

    I’m excited to open-source my pixel counting tool for others to use and modify. Whether you’re developing a game, analyzing designs, or just curious about color distribution in images, the core algorithm is now available on GitHub.

    Feel free to adapt it to your needs—maybe you need to analyze more colors, process different file types, or integrate it into a larger workflow.

    All the pixels that are fit to count

    Sometimes the simplest approach is the most effective, but it can take a while to whittle a problem down to its core before you get there.

    This project also reminded me that game design is often as much about solving technical problems as it is about creating engaging gameplay. The pixel-counting algorithm might be invisible to players, but it’s woven into every guess they make and every point they score.

    If you’re curious about the algorithm or want to try it yourself, check out the code on GitHub.

    And if you’re interested in seeing the results in action, grab a copy of Figment and start guessing those color percentages!

  • Getting Caught on the Inside

    When you’re surfing and get caught in the whitewater after a big wave, the feeling can be terrifying – you’re “caught on the inside.” I love this phrase because it conveys a feeling all humans have felt some version of: things get a little intense, your body is burning oxygen, and you lose track of which way is up. This is how I’ve begun to reflect on my time this summer attempting to launch my own VC fund: staring down the barrel of raising 7 or 8 figures of other people’s money became a bit overwhelming, if not intimidating.

    I found myself procrastinating by spending a lot of time writing code with the help of AI. And eventually I realized that maybe I should be riding this wave myself, not as an investor, but as a builder.

    My git contributions have benefited significantly from AI

    If you’ve followed my career at all, you’ll know I’ve always been a builder at heart, and the more time I spent organizing the preliminary work for a fund, the clearer that vein of my identity became.


    The Builder-Investor

    In late 2024, I stepped away from my role as a General Partner at TwentyTwo Ventures after helping Katey close out the final batch of investments in our Fund VI. We had deployed more than $15 million into dozens of really exciting Y Combinator startups, and it was exhilarating to be supporting founders right at the beginning of their journey once again, after having worked at Kickstarter and Y Combinator.

    And even though I came to it with a decent understanding of VC and lots of experience with past investments, my time at TwentyTwo really helped me cement my understanding of seed capital from the inside of a small seed-stage fund. It was also so refreshing to be back in the saddle of the startup world alongside some really exciting ideas and great founders. But I felt myself getting a little antsy, and after some encouragement from friends and family, the natural next step felt obvious: raise my own fund.

    I spent the first half of 2025 exploring that idea with a trusted friend and a handful of incredible advisors. We made a deck, refined a thesis, tried to agree on a name, and generated a ton of interest from my network.

    A few months in, I hosted a dinner with potential LPs and advisors that galvanized me. The conversation jumped from AI’s impact on labor markets and its possible cultural implications, to the future of human creativity and what we really want from technology. The mood in the room was obvious: with the advent of LLMs, we’ve arrived at a spooky inflection point in human history – we’re simultaneously surrounded by an undeniable wave of opportunity but also wary about what is about to change.

    I didn’t quite realize it at the time, but this wave had begun to pull me under.

    I’ll never forget the first time I started up Claude Code in early 2025 and kicked off rehabilitating an old project. A week or two later I looked up and realized I could build things in a week that would have taken 3-4 developers a month. Like many other people over the last year, this drive overtook me. I started pushing out side projects left and right: a cat breed analyzer, a prototype of a 15th anniversary edition of Emoji Dick translated by Claude, a number of open source libraries, and countless other quick hack projects that seemed to just keep coming.

    The most engrossing project has been a digital version of the tabletop game Monikers for my dear friends over at CMYK games. Though I had never written an iOS app before, I built it in Swift. In only a couple of months, I’ve shipped a stable beta to a dozen play-testers, built on top of a robust Rails backend. I’ve even incorporated a complex translation management engine that handles internationalization for thousands of cards across a dozen languages, from Thai to Hebrew. 

    So, that same week as our pre-fund dinner, when I found myself losing track of time and shipping things at an embarrassing pace, my partner Louise said something that hit home:

    You LOVE it, you love building things, just admit it.

    She was right.


    Money, what is it good for (especially now)?

    Amidst all the code projects and VC conversations, I began to get the sense that AI had begun shifting the ground under venture capital itself.

    While trying to distill our best guess at what the next generation of companies would look like for the fund, I wanted to focus on:

    • Companies using AI to exploit a vertical their founders were experts in;
    • Products that differentiated on something other than shipping software fast – cultural savviness, human factors, hardware, etc.;
    • Teams that understood the new bottlenecks, like the exhaustion of publicly available data to train on.

    But we kept returning to the same question: when AI tools can almost replace a full engineering team, what exactly is the VC model solving for? If founders can validate ideas with $50K instead of $500K – and can reach profitability – what’s the venture money value prop?

    Capital used to be a scarce resource, but even in this post-ZIRP world, money and frothy valuations feel abundant for founders, and MVPs can now be built with minimal headcount. Founders who could get funded by anyone are choosing their investors based on everything except the check size – domain expertise, distribution channels, talent networks. And while I was confident I could bring value across those dimensions, it became clear that AI is accelerating venture capital’s transformation from a financial product into an even thinner service business.

    For someone who’s happiest in build mode, it started to feel strange to focus on investing during the most exciting moment for builders in a generation.


    A pivot, of sorts

    I’m not abandoning investing and I don’t think VC is going anywhere anytime soon. There are lots of crazy ideas out there that need a lot of money quickly, and VC will always be great for that. I just realized I didn’t want to be fighting these headwinds while missing out on doing what I enjoy most.

    So the good news is that I’m going to continue backing exceptional founders shipping great ideas and I’m lucky enough to be able to do it using my own capital. But I’m going to be doing it as someone who’s building alongside them, not as someone managing other people’s money through a traditional fund structure.

    I’m still figuring it out, but this will probably be some combination of angel investing, syndicating deals to friends who had expressed interest in my fund, partner-investing in projects where I can contribute technically, and building products that demonstrate a thesis rather than just funding it.


    What’s Next

    For the foreseeable future I want to be heads-down building the kinds of products I can’t stop thinking about – the technically complex learn-as-you-go and creatively weird projects that only someone with my particular blend of interests would pursue. The kind of stuff that pulls you in for hours and helps you understand a new corner of the universe. 

    Over the last year, sometimes this has meant developing games, other times it has been complex financial modeling software. I’ve even gotten into modeling power demand curves for a friend’s energy company. It’s exhilarating and I’m totally hooked.

    In terms of investing, I’m most interested in strategically backing founders attacking problems I can get viscerally excited about. But I also love the weird stuff that breaks my brain in the best way – projects at the intersection of art and technology, tools that shouldn’t exist but do, gutsy movies, and whatever else grabs me.

    To those who were ready to back our fund – thank you. I’d love to bring you into specific deals as they arise. To those building weird and/or great things at this moment – let’s talk. The builder-investor model might not be for everyone, but for those of us who can’t stop shipping and have the resources to do it, it feels like the only honest way forward.

    Look around, this couldn’t be a more exciting time to build things – I’ll see you on the inside.


    Playa Marbella, Costa Rica



  • Big Data Review in Emoji

    I’m on the board of Rhizome.org, a great non-profit focused on technology and art. We do an event every year called Seven on Seven where we pair seven technologists with seven artists.

    Saturday was the fifth anniversary of the event, and this year one of the teams paired NYT writer and author Nick Bilton with artist Simon Denny.

    Around 5pm on Friday, I got an email from Nick offering to pay me $5 for an emoji version of the White House’s report on Big Data:

    Email from Nick

    The entire report is 85 pages, but they asked for a summary of page 55, a chart showing how federal dollars are being spent on privacy and data research:


    Here’s what I came up with (click for a larger version):

    Big Data Emoji

    I’m particularly proud of my emoji-fication of homomorphic encryption:

    Homomorphic Encryption

    I highly recommend watching the whole event, but Nick and Simon’s presentation of the other reports they solicited begins around the 3 hour and 25 minute mark of the live stream:

    Nick, I know you said you’d pay cash, but I’d really prefer to accept the $5 in DOGE.

    Please send 10,526.32 DOGE to DKQJsavxSdF381Mn3qZpyehsBzCX3QXzA2. Thanks!

  • Proving Rancid filmed “Time Bomb” at Kickstarter’s Old HQ

    A month or so ago, Andy Baio was watching 120 Minutes and thought he recognized the wall moulding in Rancid’s Time Bomb video:

    Here’s the video:

    Rancid was playing on the 4th Floor of 155 Rivington, Kickstarter’s former home on the Lower East Side.

    Andy sent it around to everyone at Kickstarter and it was obvious — it had to be the building we had spent the last 4 years in.

    I watched the video repeatedly, and knew it too, but I wanted proof.

    So I started digging through my photos of the place.

    When we first moved into the 4th floor of 155 Rivington, a decision was made to take down some of the drywall on the west wall:

    REAS / 155 Rivington
    The wall on the 4th floor after we removed the sheetrock in February 2012

    Underneath we discovered graffiti that must have been at least 15 years old, probably older. My colleagues Alex and Tieg recognized the H₂O H/C/N/Y tag as belonging to H₂O, a punk rock band started in NYC in 1995. Ensign, in the upper right, was also a local hardcore punk band, so they must have been there too.

    Other tags were easily Googleable. One was Lord Ezec, a.k.a. Danny Diablo, a famous underground recording artist who is still very active in the scene (@dannydiablo on Twitter):

    Lord Ezec

    Danny was Sick of It All‘s roadie around the same time, which explains the “Alleyway Crew” tag, the story of which is referenced in this biography of the band:

    Numerous Sick of It All fans have tattoos of the “Alleyway Dragon”, the band’s official logo. The Dragon is from a sheet of Greg Irons flash. It is not, as some people have claimed, a misappropriated gang symbol, but then the Alleyway Crew was never a gang to begin with. It was, and is, a group of friends. The dragon is a symbol of friendship as well as a way that members would relate who was hanging out at a particular gathering. The “Alleyway” is in a school yard in Flushing, Queens, where the band and all of their friends would gather.

    The other acronyms, such as C.C.C., D.M.S., and SNS, are too hard to Google, but I’m sure if you were in the scene at the time you’d recognize them.

    The biggest piece, however, was the big white REAS tag.

    REAS was a well-known NYC graffiti artist who hit tons of spots in Manhattan and around the city:


    Rumor has it REAS is still painting, and he even recently collaborated with the well-known artist KAWS on a vinyl toy:

    KAWS / REAS Toy

    Something told me that if I looked hard enough in the Rancid video, REAS’ tag might be there.

    Sure enough, around 1 minute and 17 seconds in, it makes an appearance.

    I overlaid my photo onto a screen grab from the video and turned the match into an animation:

    So barring an explanation involving REAS and a time machine, I’d say this is proof that Rancid shot their video at Kickstarter HQ.

    If you want to learn more about the making of the video, check out Flaming Pablum’s post from October 2013 on it here.

  • Visualizing CitiBike Share Station Data

    CitiBike Share launched yesterday. I finally got my fob activated, but since it’s raining I haven’t had a chance to take the bikes out for a spin, so I decided to take their data for a spin instead.

    Jer RT’d Chris Shiflett’s link to the CitiBike Share JSON data, asking for a “decent visualization” within half an hour:

    So I fired up R and, after a couple of minutes of JSON-munging, threw together this graph plotting the number of available bikes at each station against that station’s total number of docks:


    Not surprisingly, there’s a positive correlation between the number of docks a station has and the number of bikes available there.

    But some of the outliers represent popular stations with few bikes available, places where CitiBike Share might consider adding capacity. For example, East 33rd Street and 1st Avenue had 60 total docks but 0 available bikes.

    In contrast, stations like Park Place & Church Street (middle of the graph) lie on the identity line, meaning close to (or exactly) 100% of their docks have bikes available. They may be examples of over-provisioned stations.

    I also colored each station’s name by its latitude as a very rough proxy for how “downtown” the station is. This glosses over the fact that Brooklyn has a downtown distinct from what people normally consider NYC’s downtown, but it’s interesting to note that some uptown stations (lighter blue) appear to cluster toward the right of the graph, indicating uptown stations have been granted more total docks overall. More space in midtown, I guess.

    I’m not proud of the R code I used to hack this together, but then again I only spent about 10 minutes on it:
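    For the curious, here’s a minimal sketch of that kind of ten-minute hack. The feed URL and field names are assumptions based on the launch-era station JSON, not necessarily the exact script I ran:

    library(jsonlite)
    library(ggplot2)

    # Pull the station feed (URL and field names assumed: stationBeanList,
    # availableBikes, totalDocks, stationName, latitude)
    stations <- fromJSON("http://citibikenyc.com/stations/json")$stationBeanList

    ggplot(stations, aes(x = totalDocks, y = availableBikes,
                         label = stationName, colour = latitude)) +
      geom_text(size = 2) +                    # plot station names, not points
      geom_abline(slope = 1, intercept = 0,
                  linetype = "dashed") +       # identity line: every dock holds a bike
      labs(x = "Total docks", y = "Available bikes")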

  • The Data Behind My Ideal Bookshelf

    Thessaly La Force recently published a book with the artist Jane Mount called “My Ideal Bookshelf.” In it, Thessaly interviews over 100 people and Jane paints their bookshelves:

    The books that we choose to keep – let alone read – can say a lot about who we are and how we see ourselves. In My Ideal Bookshelf, dozens of leading cultural figures share the books that matter to them most; books that define their dreams and ambitions and in many cases helped them find their way in the world. Contributors include Malcolm Gladwell, Thomas Keller, Michael Chabon, Alice Waters, and Tony Hawk among many others.

    As I observed Jane and Thessaly compile the book over the last year, I couldn’t help but think about all the fun opportunities I could have exploring the data behind the shelves.

    Contributor Neighbors

    Each of the 101 contributors Thessaly interviewed picked as many books as they thought represented their ideal bookshelf, and I knew some of them would pick identical books.

    So what would a taste graph linking contributors to each other using the books on their shelves look like?

    Previously, I had worked with Cytoscape to render network graphs, but this seemed like a good opportunity to make something interactive, and a perfect first project for really using d3. I can’t wait to do more with it.

    Hover over each node to see the contributor’s neighbors.

    The layout of the graph is computed using d3’s force-directed layout implementation. Each node represents a contributor’s shelf and is colored by the contributor’s profession.

    Each active link is then colored by the neighbor’s profession. The nodes at the center of the graph have the most neighbors and exert the most pull over the rest of the graph. Try clicking and dragging a node to get a good feel for its centrality.

    Genres and Professions

    click for larger
    The distribution of contributors’ professions skews heavily toward writers, so the professions aren’t evenly represented: 41 of the 101 bookshelves came from writers, and there were 35 unique professions represented in total.

    click for larger

    In the graph above, I included only books chosen by professions represented by two or more contributors. Each circle is sized proportionally to that genre’s share of the books chosen by the contributor’s profession.

    For example, fiction books make up the plurality — 31% — of books chosen by writers. Similarly, 55% of the books chosen by photographers were classified as photography.

    This next graph is a violin plot showing the distribution of page counts of the books chosen by each profession:

    click for larger
    Excluding the Oxford English Dictionary, which comes in at 22,000 pages, the legal scholars (Larry Lessig & Jonathan Zittrain) had the highest average page count at 475, and all the books chosen by the two photographers came in under 500 pages.
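    For what it’s worth, a plot like this is nearly a one-liner in ggplot2. A sketch, with a hypothetical data frame and column names:

    library(ggplot2)

    # books: one row per chosen book; profession and pageCount are hypothetical names
    ggplot(subset(books, pageCount < 22000),   # drop the 20-volume OED outlier
           aes(x = profession, y = pageCount)) +
      geom_violin() +
      coord_flip()                             # long profession names read better horizontally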

    Year vs. Page Count

    Taking it a step further, here is almost every book (again, excluding the OED), plotted by the year it was published (x-axis) against the log10 of its page count (y-axis).

    The size of each point represents the number of ratings the book had accumulated on Google Books, and its color represents where the contributor placed it in their shelf order.

    click for larger
    The darker the point, the farther to the left of the shelf the book was placed. Conversely, small teal dots represent books with relatively few ratings that were placed toward the right of the contributor’s shelf.
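    For reference, here’s roughly how that encoding maps onto ggplot2. A sketch; the data frame and column names are hypothetical:

    library(ggplot2)

    # books: one row per chosen book; year, pageCount, ratingsCount, and
    # shelfPosition are hypothetical column names
    ggplot(subset(books, pageCount < 22000),     # again, excluding the OED
           aes(x = year, y = pageCount)) +
      geom_point(aes(size = ratingsCount,        # sized by Google Books ratings
                     colour = shelfPosition)) +  # colored by shelf order
      scale_y_log10() +
      labs(x = "Year published", y = "Page count (log scale)")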

    Unfortunately, not all the books had publishing dates that made sense — Google reported the dates of Shakespeare’s works anywhere from 1853 to 1970 to 2010, and when would you say the Bible was published? 1380? 1450?

    Excusing these erroneous points, I think the graph still works — check out the cluster in the upper left: popular early-20th-century books chosen by writers, with page counts between 200 and 500.

    Summary Stats

    • Number of shelves: 101
    • Number of Books Chosen: 1,645
    • Unique Books According to Google’s API: 1,431
    • Average number of books chosen: 16.28
    • Average Page Count: 381.2
    • Average Year of Publication: 1992
    • Top 5 Chosen Books:
      1. Lolita (chosen by 8 contributors)
      2. Moby Dick (chosen by 7 contributors)
      3. Jesus’ Son (chosen by 5 contributors)
      4. The Wind-Up Bird Chronicle (chosen by 5 contributors)
      5. Ulysses (chosen by 5 contributors)
    • Top 5 Authors:
      1. William Shakespeare (10 different books)
      2. Ernest Hemingway (7 different books)
      3. Graham Greene (7 different books)
      4. Anton Chekhov (6 different books)
      5. Edith Wharton (6 different books)
    • Contributor with the most books: James Franco
    • Contributor with the most shared books: James Franco
    • Longest Book: The Oxford English Dictionary, Second Edition: Volume XX* chosen by Stephin Merritt, 22,000 pages.
    • Shortest Book: Pac-Mastery: Observations and Critical Discourse by C.F. Gordon chosen by Tauba Auerbach, 12 pages.

    * Jane was only able to paint one volume of Stephin’s OED for his shelf, but the authors agreed it could stand as a synecdoche for his choice of the entire edition.

    Overall…

    The data I cobbled together from My Ideal Bookshelf is far from perfect, but I think it does a good job of illustrating some of the larger themes and relationships behind the book.

    For example, the two legal scholars tended to pick some of the longest books in the set on average, and professionals tend to pick books related to their jobs (check out the large proportion of photography books chosen by photographers). Also: if you’re James Franco and pick a ton of books, you’re gonna have a lot of neighbors in a network graph.

    I also discovered how skewed the dataset was towards the choices of writers — I jumped in expecting a diverse set of contributors which might be useful for representing an ideal ideal bookshelf — but that would have been a difficult case to make when 40% of the contributors were from one profession.

    This might sound like an obvious observation (and something I should have known, having spent so much time thinking about the book), but it wasn’t something I really observed until looking at a simple histogram of their professions. So remember: it’s always worth thinking critically about whether your samples are representative of the underlying distribution, and simple exploratory data analysis can really help you there.

    Bigger picture, I think this skew demonstrates the nature of coming up with an ideal list of anything: no matter who you ask, the task is essentially a subjective one. Here, it’s biased towards the network Thessaly and Jane were able to tap to make the book.

    Cleaning and Reconciling the Data

    While creating the book, Thessaly and Jane carefully compiled an organized spreadsheet listing each contributor and their chosen books, but I knew there could be subtle typos here and there. These typos could throw off the larger analysis: if two entries for the same book differed even slightly in title or punctuation, aggregating on title would be problematic. I also wanted additional data about the books that Thessaly and Jane didn’t record, such as the year each book was published or its number of pages.

    So I figured I’d kill two birds with one stone and look for an API I could automatically query with each title from the spreadsheet, getting back a best guess at the “true” book it represented. The API would also, hopefully, return a lot of useful metadata I could use down the line.

    It turns out the Google Books API is the perfect tool for the job. I could send it a book title and get back the book Google thought I was looking for. This let me lean heavily on Google’s excellent search technology to reconcile titles that might have had typos, while also retrieving each book’s metadata. I could have used something like Levenshtein distance to find titles in the original dataset that were close to one another, but that wouldn’t have gotten me any additional metadata.
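    My actual script was written in Ruby (more on that below), but the heart of the reconciliation is a single search call per title. Here’s a minimal sketch of the idea in R; the endpoint is Google’s public volumes search, and the fields I pull out here are just examples:

    library(httr)
    library(jsonlite)

    # Ask Google Books for its best guess at a (possibly typo'd) title and
    # return a few metadata fields. Missing fields come back NULL, so coerce
    # them to NA.
    lookup_book <- function(title) {
      resp <- GET("https://www.googleapis.com/books/v1/volumes",
                  query = list(q = title, maxResults = 1))
      info <- fromJSON(content(resp, as = "text"),
                       simplifyVector = FALSE)$items[[1]]$volumeInfo
      grab <- function(field) if (is.null(info[[field]])) NA else info[[field]]
      data.frame(google_title  = grab("title"),
                 publishedDate = grab("publishedDate"),
                 pageCount     = grab("pageCount"),
                 stringsAsFactors = FALSE)
    }

    lookup_book("Moby Dick")   # Google's best guess reconciles variant titles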

    A quick side note for copyright nerds: the Google Books API played into the recent HathiTrust case, and I’d like to imagine use cases like this were part of the reasoning behind declaring Google’s use of copyrighted materials a fair use.

    The Google Books API allows 1,000 queries a day, but since the list contained thousands of titles, I had to write in and ask for a quota extension. Thanks to whoever at Google granted it — I now get to hit the API 10,000 times a day, which was enough to iterate on the script that compiled the data.

    Not surprisingly, Google’s API returns a ton of metadata: everything from a thumbnail of the book’s cover, to the number of pages, to the medium in which it was originally published, to its ISBN-10 and ISBN-13 … the list goes on. I tried to choose fields that I knew would be interesting to aggregate on, but also ones that would help me uniquely identify the books.

    One particular piece of metadata that was often missing was the book’s genre — only 28% of the books returned from Google had category information. Another option would have been to set up a Mechanical Turk task asking humans to determine the books’ genres. This kind of book ontology is actually a very difficult and somewhat subjective problem. Just think of how complicated the Dewey Decimal system is.

    Finally, not all data is created equal — I’ve manually corrected a handful of incorrect classifications from Google in cases where the search clearly returned the right book, but it’s certainly possible that not every book was recognized or reconciled properly.

    The tools

    Aside from d3 for the interactive plot at the top of this post, I used R and ggplot2 to create the static graphs.

    The script I used to query the Google Books API was written in Ruby; it exported the data to a CSV, which I then loaded into MySQL and Google Docs to manually review and spot-check.

    Here’s the query I used to generate the data necessary for the force-directed graph:

    -- Self-join the shelf table on the reconciled Google Books ID to find
    -- every pair of contributors who chose the same book; the inequality
    -- filter drops self-matches (each shared book still appears twice,
    -- once in each direction)
    SELECT
      ideal_bookshelf_one.contributor_id AS source,
      ideal_bookshelf_one.google_title   AS book,
      ideal_bookshelf_two.contributor_id AS target
    FROM
      ideal_bookshelf AS ideal_bookshelf_one
      JOIN ideal_bookshelf AS ideal_bookshelf_two
        ON ideal_bookshelf_one.google_book_id = ideal_bookshelf_two.google_book_id
    WHERE
      ideal_bookshelf_one.contributor_id != ideal_bookshelf_two.contributor_id;

    Sequel Pro’s “Copy as JSON” was extremely helpful here — it took relatively little effort to munge the SQL results into an array of nodes and links required by d3’s force layout.
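    The shape d3 wants is just two arrays: nodes, and links whose source/target point into the nodes array (zero-based, in the v3-era force layout). Here’s a sketch of that munging in R, with a hypothetical filename for the Sequel Pro export:

    library(jsonlite)

    # pairs.json: the "Copy as JSON" export of the query above (name assumed)
    rows <- fromJSON("pairs.json")

    # Build a node per contributor and convert each (source, target) pair
    # into zero-based indices into the nodes array
    ids   <- sort(unique(c(rows$source, rows$target)))
    nodes <- data.frame(name = ids)
    links <- data.frame(source = match(rows$source, ids) - 1,
                        target = match(rows$target, ids) - 1)

    writeLines(toJSON(list(nodes = nodes, links = links)), "graph.json")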

    If you liked this post…

    Pick up a copy of Thessaly and Jane’s book today!