Another cross-post from my blog at http://preludeinteractive.com.
Social Actions recently started releasing their corpus of volunteer, donation, and other opportunities (social actions) as a nightly dump. I think that's super! I've started setting up some analytics for the data, as I think there is a lot of wisdom and science to be found within. The eventual goal is to do some fancy topic modeling using the data, but since I accidentally stomped the file I was starting with, I decided to try a few simpler things first.
First, I did some simple averages across the database.
- 12.7886 hits per action. Not too shabby!
- The average action specifying a goal amount (typically US dollars) is 77% achieved. Neato!
- The average action was updated 10.33 days ago.
- The average action was created 218.56 days ago.
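For anyone who wants to reproduce figures like these, here is a minimal sketch of the arithmetic. It is not the code I actually ran, and the field names (`hits`, `goal_amount`, `goal_completed`, `created_at`, `updated_at`) and the two sample records are made up for illustration, so map them to whatever the dump really contains.

```python
from datetime import datetime
from statistics import mean

# Hypothetical records: these field names are assumptions, not the dump's schema.
actions = [
    {"hits": 10, "goal_amount": 100.0, "goal_completed": 50.0,
     "created_at": datetime(2009, 1, 1), "updated_at": datetime(2009, 6, 1)},
    {"hits": 20, "goal_amount": None, "goal_completed": None,
     "created_at": datetime(2009, 3, 1), "updated_at": datetime(2009, 6, 11)},
]


def summarize(actions, now):
    """Return simple averages over a list of action records."""
    # Only actions that specify a goal amount count toward the goal average.
    with_goal = [a for a in actions if a["goal_amount"]]
    return {
        "mean_hits": mean(a["hits"] for a in actions),
        "mean_goal_fraction": (
            mean(a["goal_completed"] / a["goal_amount"] for a in with_goal)
            if with_goal else None
        ),
        "mean_days_since_update": mean((now - a["updated_at"]).days for a in actions),
        "mean_days_since_creation": mean((now - a["created_at"]).days for a in actions),
    }


summary = summarize(actions, now=datetime(2009, 6, 21))
for name, value in summary.items():
    print(name, value)
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, makes it easy to measure ages relative to the date of the dump instead of the date you happen to run the script.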
Why is this useful? If split by action source, you can give the people posting these actions a good idea of the relative health of their data. There might be good reasons why some actions take 20 days to find resolution, while others only take 5. But what are those reasons? Maybe some are novel and interesting and unknown?
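Splitting by action source could look something like the sketch below. Again, this is only an illustration: the `source` and `days_since_update` fields, and the source names, are hypothetical stand-ins rather than anything taken from the real data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; "source" and "days_since_update" are assumed field names.
actions = [
    {"source": "source_a", "days_since_update": 5},
    {"source": "source_a", "days_since_update": 7},
    {"source": "source_b", "days_since_update": 20},
]


def mean_by_source(actions, field):
    """Average a numeric field separately for each action source."""
    groups = defaultdict(list)
    for action in actions:
        groups[action["source"]].append(action[field])
    return {source: mean(values) for source, values in groups.items()}


per_source = mean_by_source(actions, "days_since_update")
for source, value in sorted(per_source.items()):
    print(source, value)
```

The same helper works for any numeric field, so hits or goal completion can be compared across sources the same way.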
Next, intrigued by the average ages of the actions, I made a couple of simple plots of how recently the actions were updated or created.
I think those big spikes correspond to the dates various action sources were added, but I could be wrong. It's also interesting that the long tail of actions goes all the way to 2000+ days old. The Social Actions folks were discussing cutting really old actions out of the database (since the information is probably no longer valid anyway), and I think that might be a good idea.
This exponential decrease is not surprising. The 16-day shift to the right is because I was testing with a dump a little more than 2 weeks old. The really puzzling thing is the big spike around 205 days. What's that all about?
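The counts behind plots like these can be built by bucketing each action's age in days. The sketch below is an assumption-laden illustration, not my plotting code: `age_histogram`, the bin width, and the sample dates are all made up. It also shows where a shift like the one above comes from, since the ages depend entirely on which reference date you subtract from.

```python
from collections import Counter
from datetime import date


def age_histogram(dates, reference, bin_days=5):
    """Count how many dates fall into each age bucket, measured in days."""
    counts = Counter()
    for d in dates:
        age = (reference - d).days
        # Label each bucket by its lower edge: 0, 5, 10, ...
        counts[(age // bin_days) * bin_days] += 1
    return dict(sorted(counts.items()))


# Hypothetical "last updated" dates for three actions.
updated = [date(2009, 6, 1), date(2009, 6, 3), date(2009, 5, 20)]

# Ages measured from the date of the dump itself.
from_dump = age_histogram(updated, reference=date(2009, 6, 5))

# Ages measured 16 days later: every action looks 16 days older.
from_later = age_histogram(updated, reference=date(2009, 6, 21))

print(from_dump)
print(from_later)
```

Using the dump's own date as the reference keeps the youngest actions at zero days, no matter when the analysis is run.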
Sure, the Social Actions folks could have run these figures anytime they wanted, but the great part about them being so open is that they don't have to. Maybe something we uncover can inform the action sources (and ultimately those in need of volunteers, donations, votes, etc.) and allow them to reach people more quickly and efficiently.