Fetching Patreon Data

Patreon is a delight to scrape. Actually, scraping is the wrong word for it – the frontend of Patreon is a React application that calls a number of very sensibly designed JSON endpoints. Call the same endpoints and you get delightfully clean JSON that exactly matches what gets displayed on the site.

A disclaimer – this is undocumented as far as I can tell. The publicly documented API (JS Implementation) is targeted at creators and provides access to private information only visible to them. All of this could change at any point.

I was all ready to parse HTML, but the page source turned out to contain a beautiful JS object holding all the data needed to display most pages.

"data": {
  "attributes": {
    "created_at": "2016-04-30T13:58:22+00:00",
    "creation_name": "Entertainment",
    "display_patron_goals": false,
    "earnings_visibility": null,
...

Even better, at the tail end of the long object is the call to fetch just the JSON:

"links": {
  "self": "https://api.patreon.com/campaigns/355645"
}
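A minimal sketch of what calling that endpoint might look like in Python, using only the standard library. The URL pattern comes from the "self" link above; the function names, the User-Agent header, and the timeout are my own assumptions, and since the endpoint is undocumented it may not respond this way (or at all).

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.patreon.com"


def campaign_url(campaign_id):
    """Build the campaign URL in the same shape as the "self" link."""
    return f"{API_ROOT}/campaigns/{campaign_id}"


def parse_campaign(payload):
    """Return the attributes dict from a decoded campaign response.

    Assumes the "data" -> "attributes" nesting seen in the page source.
    """
    return payload["data"]["attributes"]


def fetch_campaign(campaign_id, timeout=10):
    """Fetch and decode one campaign (hypothetical helper, not tested live)."""
    # Assumption: a browser-like User-Agent is sent; it may not be required.
    request = Request(campaign_url(campaign_id),
                      headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Something like `parse_campaign(fetch_campaign(355645))` would then give a plain dict with keys such as `created_at` and `creation_name`.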

So as long as I can get the ID of a campaign, I can get all the information about it in an easily processed format. Browsing the explore pages with a bit of network monitoring reveals calls to URLs like the following:

https://api.patreon.com/explore/category/12?include=creator.null&fields[user]=full_name,image_url,url&fields[campaign]=creation_name,patron_count,pledge_sum,is_monthly,earnings_visibility&page[count]=20&json-api-version=1.0

Inspection of the different category pages reveals the number after ‘/category/’ runs from 1 through 14, and 99 for the ‘All’ category. This way, I can fetch all the top campaigns then use the campaign API to retrieve detailed information.
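As a rough sketch, the explore URLs for every category could be generated like this. The query parameters are copied from the URL above and the category IDs from the range just described; the helper names are mine, and `campaign_ids` assumes the response follows the usual JSON API layout (a top-level "data" list of resources with "type" and "id"), which I have not shown here and which may not match the real payload.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.patreon.com"

# Categories 1 through 14, plus 99 for "All".
CATEGORY_IDS = list(range(1, 15)) + [99]

# Query parameters copied from the observed explore request.
EXPLORE_PARAMS = {
    "include": "creator.null",
    "fields[user]": "full_name,image_url,url",
    "fields[campaign]": ("creation_name,patron_count,pledge_sum,"
                         "is_monthly,earnings_visibility"),
    "page[count]": "20",
    "json-api-version": "1.0",
}


def explore_url(category_id):
    """Build the explore URL for one category."""
    # safe="[]," keeps brackets and commas unescaped, as in the observed URL.
    query = urlencode(EXPLORE_PARAMS, safe="[],")
    return f"{API_ROOT}/explore/category/{category_id}?{query}"


def campaign_ids(payload):
    """Collect campaign IDs from a decoded explore response.

    ASSUMPTION: "data" is a list of resources carrying "type" and "id".
    """
    return [item["id"] for item in payload.get("data", [])
            if item.get("type") == "campaign"]


all_explore_urls = [explore_url(cid) for cid in CATEGORY_IDS]
```

Each ID collected this way could then be passed to the campaign endpoint to retrieve the detailed record.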

An interesting note – the data structures reveal a lot about how the site has been designed and where complexity can be added later – multiple campaigns per user, links between campaigns, etc.

Full code for my scraper is after the break – I’ll be diving into analysis next.

Quick Analysis – Patreon Funding

Inspired by the launch yesterday of the Patreon funding campaign for Movies with Mikey, a movie analysis YouTube channel, I’ve performed some rudimentary analysis of how Patreon donors fit into donation tiers.

Movies with Mikey Patreon

A quick primer – Patreon allows creators to collect donations from supporters on an ongoing basis, as opposed to a one-time engagement as with Kickstarter. Donations can be by month or by produced work. While various donation levels provide perks or recognition, Patreon tends to be more focused on “support” than on perks compared to other funding platforms. As a result, tiers tend to be less about “value for money” and therefore more interesting to analyze.

Patreon, as with Twitch, limits the publicly available information to summary figures, but that is still enough to assess the donation breakdown given some reasonable assumptions. For Movies with Mikey (MWM), we are given the total donations and how those are broken down by tier.

While we would like to know the specific distribution of donations, we can estimate the average donation per tier to get an idea of how donations break down. One caveat is that there is some missing data – the sum of donors in tiers only adds up to 744, so 39 supporters or about 5% didn’t select a tier and we have no idea where they fall. To analyze the breakdown, I made a simple spreadsheet to allow easy estimation of donation amounts.

This estimate comes in 1.8% below the reported total, which given the ‘missing’ donations suggests a fair degree of accuracy. There are a few hypotheses that come out of this:

  • Donors give the minimum to be in a given tier.
  • There are likely a few high-end donations that pull the average of the top tier up.
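The spreadsheet arithmetic behind this is simple enough to sketch in a few lines. The tier figures below are made up purely for illustration and are not the MWM numbers; only the 744 and 39 donor counts come from the text above. The function names are my own.

```python
def estimate_total(tiers):
    """Sum donors * assumed average donation over (donors, average) pairs."""
    return sum(donors * average for donors, average in tiers)


def relative_error(estimate, reported):
    """Signed error of the estimate relative to the reported total."""
    return (estimate - reported) / reported


def missing_share(donors_in_tiers, donors_without_tier):
    """Fraction of all supporters who did not select a tier."""
    return donors_without_tier / (donors_in_tiers + donors_without_tier)


# HYPOTHETICAL tiers: (number of donors, assumed average donation).
# Assuming every donor gives exactly the tier minimum.
example_tiers = [(100, 1.0), (50, 5.0), (10, 25.0)]
example_reported_total = 625.0  # also hypothetical

example_estimate = estimate_total(example_tiers)
example_error = relative_error(example_estimate, example_reported_total)

# From the figures in the text: 744 donors in tiers, 39 without one.
mwm_missing = missing_share(744, 39)
```

A negative `example_error` means the tier-minimum assumption undershoots the reported total, which is the direction you would expect if some donors give more than their tier requires or sit outside the tiers entirely.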

I’d stress that this is exploratory work at the moment that uncovered some reasonable hypotheses. Confirming them and making any recommendations will entail seeing whether this pattern holds for other Patreon campaigns, which I hope to do in the next few days by looking across categories and sizes of campaigns.

One last comment I also plan on returning to is the nature of sites that provide summary statistics. By limiting the information available, they are creating a market for third-party scrapers and losing control over the data. For Patreon, there is Graphtreon and Kickstarter has Kicktraq. I’m torn on how the platforms and their users should feel about these, particularly as personal finances are often involved, but I’ll delve more into the issue later.