Since we released the first version of our app in September last year, we've collected over 55 million activities. This valuable set of activity data is growing incredibly fast, and we strongly believe the value it provides will be instrumental in making our app better. We're trying to make moving fun, and the numbers are encouraging: our users move 75% more after 6 weeks, and we're determined to push that number up even further.
Human translates activity data into simple goals while leaning on game mechanics to influence behavior. But behind the scenes we're collecting massive amounts of raw data. We use aggregated, anonymized activity data to make design decisions and improve our activity detection. Instead of focusing on raw numbers and stats, we wanted to put 55 million activities into perspective. So how do people around the world use Human? Human Cities is our first attempt to translate our data into useful insights.
The inspiration for Human Cities came from some amazing visualizations around the web. These maps by Nikita Barsukov of Endomondo running data and these beautiful renders by Nathan Yau were definitely an inspiration to explore ways to visualize our own data. Be sure to check out the great work by the people behind MapBox too.
Extracting 65 million points
Our infrastructure is entirely built with Amazon Web Services. We use an RDS (MySQL) database to store stats for individual activities. Roughly 10 terabytes of motion and location data for activities currently sit on S3. GPS boundaries for chosen cities allowed us to select the correct JSON files from S3, which were then converted to CSV. A simple multithreaded Python script opened connections to S3 and ingested thousands of files per minute. The terminal quickly becomes your best friend under these circumstances.
The privacy of our users is extremely important to us. We limited our data exports as much as possible and merged files to prevent activities from being connected to individual users. Plotting paths for individual activities could reveal user patterns, which was an important constraint on the final visualizations. We finally combined all location points into one dataset per city.
Our tool of choice to visualize data was R. The first experiments with activities in Amsterdam on a Google map showed the density of our data. Wow - we just weren't expecting a complete street grid when plotting GPS coordinates with such a low opacity (<10%).
The first map-based plots showed the density of our data for several major cities. After some trial & error with different colors, opacity settings, and custom maps, we realized that simply drawing maps based on Human activity alone made for better results; we didn't need a base map after all.
We plotted every city with multiple opacity settings and manually picked the best setting for each city based on the output. The final visualizations for cities like Los Angeles and New York use a 0.025 opacity setting (40 location points per pixel for 100% white), while a 0.1 opacity gave the best results for cities like Vienna, Copenhagen, and Cape Town. By using only a single opacity setting per city, we made sure that it would be easy to compare activity types.
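The "40 points per pixel" figure follows directly from the opacity value, assuming brightness simply accumulates additively until a pixel saturates at white (the reading the numbers above imply). A quick sketch of that arithmetic:

```python
import math

def points_to_saturate(alpha):
    """Overlapping points needed to reach full white, assuming each
    point at opacity `alpha` adds that fraction of brightness until
    the pixel saturates."""
    return math.ceil(1.0 / alpha)

print(points_to_saturate(0.025))  # the Los Angeles / New York setting
print(points_to_saturate(0.1))    # the Vienna / Copenhagen / Cape Town setting
```

So a 0.025 opacity saturates at 40 overlapping points, while 0.1 saturates at just 10 - which is why the lower setting suits dense cities and the higher setting sparser ones.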
Bringing a city to life
Plotting activity data in a static image already provided interesting visuals, but how could we make these shots really come to life? The idea of showing the cadence of a city in a 24-hour time frame was worth an experiment. At this point, the enslaved MacBook was already rendering stills at maximum CPU capacity and our wishlist grew to about 30 cities. This is where AWS really came to the rescue. An extra large EC2 instance based on this RStudio AMI by Louis Aslett doubled our computing power in a little less than an hour.
To make a 30-second video at 24 FPS, you need 720 frames. Based on the timestamp of each location point, we calculated the local time of day and used it to assign the point to one of the 720 frames. Points disappear after being visible for 60 frames (2 hours), which brings the total to 780 frames per city. We rendered these for a small selection of cities.
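The bucketing above can be sketched in a few lines; the function name and the assumption that timestamps arrive as seconds since local midnight are ours, not from the original pipeline:

```python
SECONDS_PER_DAY = 24 * 60 * 60
FRAMES = 720            # 30 s * 24 fps
VISIBLE_FRAMES = 60     # each frame spans 2 minutes, so 60 frames = 2 hours

def frames_for_point(seconds_since_midnight):
    """Map a point's local time of day to the frames where it is drawn.

    The point first appears in the frame covering its timestamp and
    stays visible for VISIBLE_FRAMES frames, which is why a full
    render is 720 + 60 = 780 frames.
    """
    first = seconds_since_midnight * FRAMES // SECONDS_PER_DAY
    return range(first, first + VISIBLE_FRAMES)

# A point recorded at 08:30 local time:
frames = frames_for_point(8 * 3600 + 30 * 60)
print(frames.start, frames.stop)
```

A point recorded at 08:30 lands in frame 255 and fades out 60 frames later, just past the 10:30 mark.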
With the command-line tool ffmpeg, it's easy to render a movie out of a sequence of images:
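A typical invocation looks like the following; the frame filenames and encoder settings here are illustrative, not our exact command:

```shell
# Stitch numbered stills (frame000.png … frame779.png) into a 24 fps H.264 movie
ffmpeg -framerate 24 -i frame%03d.png -c:v libx264 -pix_fmt yuv420p city.mp4
```

The `-pix_fmt yuv420p` flag keeps the output playable in common players and on the web.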
From idea to release in 10 days
Our main objective is and has always been to build an amazing app. We didn't want to shift too much focus from this mission, so we gave ourselves 10 days from start to finish to come up with something "typically Human" that gives a glimpse of the size and breadth of our data set. In those ten days we figured out how to extract & aggregate data, learned the basics of R, and worked out the best way to visualize our data. The moment we realized how powerful these images were, we knew we had to get the details right, and we started preparing for a big marketing push. While stills were being rendered, we started sketching out the site, scripted the movie, and worked on visual design & copy.
Our video was made from 30-second exports of 24 hours of aggregated data per city, edited in Final Cut Pro. While we were exporting data, editing video, and coding the site, we skipped assigning each other tasks and kept all communication in Slack. No emails were exchanged for this project.
After determining the total amount of content and content types, we sketched out a simple design to present our data. Our process started on paper, followed by converting the spartan wireframe to vectors in Sketch. We switched to using Sketch exclusively for all visual design around October of last year. Come to think of it, this project was made in complete absence of Photoshop. In Sketch, we finished a basic visual template for style and typography.
Aesthetically, we elected to keep things simple and let the contrast do the talking. White activity trails over pure black. Typographically, we stuck with our usual visual style which is mostly based on unmodified Proxima Nova applied à la Suisse.
The Human Data site was built using Cactus, which allowed us to quickly set up a simple grid and deploy directly to an Amazon S3 bucket. Using Cactus means a fast setup, so we built the site in little over a weekend. It has great built-in responsive templates that require little modification. All charts are based on aggregated stats per city, built with D3.js.
Prepare for impact
Just before launch we spent some time making sure we maximized our reach. We had already increased the number of cities to make the visuals & ranking more interesting for people around the globe and to provide a hook for local media. Twitter Cards and Facebook OpenGraph tags on every page boosted our visibility on social media. We added a subtle watermark to all high-res images, generated animated GIFs, and shared all content in a public Dropbox folder.
> We visualized 7.5M miles of activity in major cities all across the globe to get an insight into Human activity http://t.co/hEeFxKldwm
>
> — Human (@humandotco) July 2, 2014
- Featured on over 200 (international) media outlets.
- Featured on Business Insider, Engadget, The Next Web, Citylab, many local outlets (for specific city data), and yes... Buzzfeed.
- 55k+ plays on Vimeo in the first week.
- 2000+ shares on Twitter & Facebook linking to cities.human.co.
We're looking forward to sharing more insights into Human activity, but our primary focus remains building the most fun app to get and stay healthy. Follow us on Twitter or like us on Facebook to stay up to date. Ping us if you have any questions.