Ben Gotow-SMS Visualization
The Idea
What can you infer about someone’s social network from their text messaging activity?
A few months ago, I started working on an app that syncs text messages from an Android phone to a desktop client for the Mac. The idea was to decouple text messaging from the phone, enabling the user to have a conversation anywhere and seamlessly transition between messaging on the phone and messaging on a laptop or desktop.
While developing the application, I noticed that the thousands of messages synced by the app revealed interesting trends about my conversational patterns, and it seemed like a perfect data set for a visualization.
The Data
The application downloads all the user’s messages from their phone and stores them in an SQLite database. A text dump from this database formed the data used in the visualization. The format of the text dump is shown below:
6157141096:::Allison:::1:::119:::87
The text is ‘:::’ delimited. The columns are as follows:
- The phone number messaged
- The display name of the user messaged
- The origin of the message (0 = your phone, 1 = theirs)
- The frame number on which the message should appear in the animation. This is calculated by taking the timestamp of when the message was sent or received, subtracting the timestamp of the first message in the animation and dividing by an acceleration factor.
- The length of the text content in the message.
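A minimal sketch of how one line of this dump could be parsed, written in plain Java since Processing sketches are Java underneath. The class name `SmsRecord`, its field names, and the `frameFor` helper are hypothetical and are not taken from the project's source. The timestamp units and the value of the acceleration factor are not stated above, so the helper just takes them as parameters.

```java
import java.util.regex.Pattern;

/** One row of the ":::"-delimited dump. Class and field names are hypothetical. */
final class SmsRecord {
    private static final Pattern DELIMITER = Pattern.compile(Pattern.quote(":::"));

    final String phoneNumber;   // column 1: the phone number messaged
    final String displayName;   // column 2: display name of the user messaged
    final boolean fromContact;  // column 3: origin, 0 = your phone, 1 = theirs
    final int frame;            // column 4: animation frame for the message
    final int length;           // column 5: length of the message text

    SmsRecord(String phoneNumber, String displayName, boolean fromContact, int frame, int length) {
        this.phoneNumber = phoneNumber;
        this.displayName = displayName;
        this.fromContact = fromContact;
        this.frame = frame;
        this.length = length;
    }

    /** Splits a single line on the literal ":::" delimiter and converts each column. */
    static SmsRecord parse(String line) {
        String[] cols = DELIMITER.split(line.trim(), -1);
        if (cols.length != 5) {
            throw new IllegalArgumentException("Expected 5 columns, got " + cols.length + ": " + line);
        }
        int origin = Integer.parseInt(cols[2]);
        if (origin != 0 && origin != 1) {
            throw new IllegalArgumentException("Origin must be 0 or 1: " + line);
        }
        return new SmsRecord(cols[0], cols[1], origin == 1,
                Integer.parseInt(cols[3]), Integer.parseInt(cols[4]));
    }

    /** Frame number as described above: offset from the first message, divided by the acceleration factor. */
    static int frameFor(long timestamp, long firstTimestamp, long accelerationFactor) {
        return (int) ((timestamp - firstTimestamp) / accelerationFactor);
    }

    public static void main(String[] args) {
        SmsRecord r = parse("6157141096:::Allison:::1:::119:::87");
        System.out.println(r.displayName + " frame=" + r.frame + " length=" + r.length);
    }
}
```

A display name that itself contains `:::` would break this split; the sketch assumes that never happens.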
The Visualization
The Discoveries
The biggest discovery along the way was that Processing is pretty cool and very easy to use. I have a lot of experience working with OpenGL and Mac OS X’s Quartz2D APIs, and Processing was a nice surprise. I was able to go from concept to an early working version of the visualization in one afternoon. My one big complaint is that there’s no built-in debugger whatsoever… Coming from a programming background, that’s pretty damn ghetto. I’ve heard you can use Eclipse somehow, so I’ll try that next time.
I was unsure of how to create a graph of interconnected nodes in Processing. I wanted to create one dynamically without advance knowledge of the number of nodes needed, and I didn’t want to write the layout code myself. I thought that using some sort of spring physics model would allow the graph to be self-organizing. I did some searching and found the Traer Physics library, which I dropped into my Processing libraries folder and wired into my sketch by binding each contact object to a node in the physics simulation. That was it. There was much rejoicing.
Each node was added to the physics model as a solid body, and negative attractors (repulsive forces) were added between every pair of nodes so that they spread out evenly. A spring was added between each node and the center ring. This turned out to be a great solution because the resting length of each spring could be adjusted to move its node toward or away from the center. I’d wanted to do this from the start, and using springs meant that this part of the visualization animated smoothly, too.
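The arrangement described here (pairwise repulsion plus one spring per node to the center) can be sketched without any library. The code below is a hand-rolled approximation in plain Java, not the Traer Physics API and not the code from this project. The class `RingLayout`, its constants, and the simple damped integration step are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-rolled spring layout; all names and constants are illustrative assumptions. */
final class RingLayout {
    static final class Node {
        double x, y;        // position relative to the center at (0, 0)
        double vx, vy;      // velocity
        double restLength;  // resting length of this node's spring to the center

        Node(double x, double y, double restLength) {
            this.x = x;
            this.y = y;
            this.restLength = restLength;
        }

        double distanceFromCenter() {
            return Math.hypot(x, y);
        }
    }

    final List<Node> nodes = new ArrayList<>();
    double springStrength = 0.05; // how hard each spring pulls toward its rest length
    double repulsion = 500.0;     // strength of the push between every pair of nodes
    double damping = 0.85;        // fraction of velocity kept per step, so motion settles

    Node addNode(double x, double y, double restLength) {
        Node node = new Node(x, y, restLength);
        nodes.add(node);
        return node;
    }

    /** Advances the simulation one step: compute every force first, then move the nodes. */
    void step() {
        int n = nodes.size();
        double[] fx = new double[n];
        double[] fy = new double[n];
        for (int i = 0; i < n; i++) {
            Node a = nodes.get(i);
            // Spring to the center: pulls inward when stretched, pushes outward when compressed.
            double dist = a.distanceFromCenter();
            if (dist > 1e-9) {
                double pull = -springStrength * (dist - a.restLength);
                fx[i] += pull * a.x / dist;
                fy[i] += pull * a.y / dist;
            }
            // Inverse-square repulsion from every other node spreads the nodes apart.
            for (int j = 0; j < n; j++) {
                if (i == j) continue;
                Node b = nodes.get(j);
                double dx = a.x - b.x;
                double dy = a.y - b.y;
                double d2 = Math.max(dx * dx + dy * dy, 1.0); // clamp so close nodes do not blow up
                double d = Math.sqrt(d2);
                double push = repulsion / d2;
                fx[i] += push * dx / d;
                fy[i] += push * dy / d;
            }
        }
        for (int i = 0; i < n; i++) {
            Node a = nodes.get(i);
            a.vx = (a.vx + fx[i]) * damping;
            a.vy = (a.vy + fy[i]) * damping;
            a.x += a.vx;
            a.y += a.vy;
        }
    }
}
```

In a sketch, `step()` would presumably run once per frame. Changing a node's `restLength` between frames makes it glide toward or away from the center instead of jumping, which is the smooth animation effect described above.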
My original idea was to represent each message sent or received as an arc between nodes. However, I wasn’t sure whether Processing would be able to handle drawing the number of curves required at a decent framerate. With a data set of over 2300 messages, I was pretty sure it would become unworkably slow. Big surprise, it’s Java, and it did. I had to add a shortcut to disable lines so I could rapidly test the visualization.
The Critique
Overall, I’m pretty happy with the visualization. I was able to animate it, and it achieved the initial goal of revealing the social network inferred from your messaging habits. There are a few things I’d like to explore in the data that the visualization doesn’t reveal, though. There’s a lot of data in the actual text contents of the messages that would be fun to look at. How often do people use emoticons? Do you use emoticons more frequently when the person you’re talking to also does? Is there a bimodal distribution in message length that implies that some messages are part of complex multi-message conversations while others are simple “pings”? Answering those questions would require other visualizations, I think, but I’m really curious.
The Code
Dependencies: The Processing applet requires the Traer Physics library.
Example Code Used: The code below draws on a large amount of sample code, from Processing.org and from the documentation of the Traer Physics library. The Processing “Load File 2” example was particularly useful. The code for the wavestream was written from scratch (in a rather ghetto way). I’m still looking for a good library that creates them!
The source code is available here
A note about source data: Unfortunately, the source data for this experiment contains sensitive data including people’s phone numbers and names. I’ll be releasing the SMS synchronizing app for Mac and Android soon and that will allow you to gather and format your own messaging history for visualization.
Strong work, Ben. Great that you posted source code. Great that you experimented with both traditional (timeline) and non-traditional displays, though it would have been interesting to see more connections across the two. For the red/green timeline, it would have been nice to see the signed difference as well, as this would reflect the imbalance in the conversation. We’ll speak about this more tomorrow.
Comments from PiratePad A:
self inquiry excellent
Would be interesting on the timeline to see people flow in and out of your SMS life. I know it’s already inherent here, but maybe highlighted more. I know at one point in my life I was getting 40+ txts a day from one single person, and then it stopped; it would be interesting to see that visually. I have an old XML backup of my texts if you want to play with it.
Love seeing the hand-drawn sketches!
Nice sketches to show process. I like the idea of visualizing who is important to you. People’s pictures would have been almost nicer than the pie graphs. This is really awesome. PLZ let me analyze my texts!
Good to see people still think on paper.
The concept development process looks very well thought out. Very cool visualization and animation. I like the concept of the “center.” Great context on the mouseovers.
I like the little pie charts. So many ways to get different data from this visualization. And it’s all displayed really cohesively. The fading out effect is a really good way to show time passing, and very natural to understand. I think the top graph thing could be placed better/labeled better. Wow! It switches. That’s crazy. Really insightful of you to consider both number of characters and number of messages.
I like the wave stream though because it shows changes in relationships over time. Bringing the element of time here is really important. Ok now that I see this social graph playing over time I really like it. A very nice thing to sit back and watch. You should take the phone numbers out and put this online.
Attractive visualization. I like the chart on the top too.
I’m not sure if red and green are the right colors. Designers—any advice?
– I find the lime-y-ness of the green more annoying, but red/green doesn’t particularly bother me (though one runs the risk of going christmas). complementary color is good idea for this, i think. I’m not into the black background, but that might be a personal preference
– I wonder: could this be done effectively with a single color?
-i don’t know if that would be appropriate — this data lends itself to two colors (or, two patterns, two somethings) given that it’s him and another person.
– Pattern is what I was thinking about
-certainly doable. I find two colors effective, and since he can use color, why not, because the two colors provide an edge of contrast to read easily (as the proportions change in the piecharts, i see it quickly) and I’m not sure if two different patterns in a piechart would show that as effectively, given the playback of the viz.
Is this scrubbable? I want to pause it, reverse it, etc.
This is unbelievably smooth
Very smooth and visual, polished. Good job ;) Maybe some way of differentiating different people?
I like the visual display of communication patterns! Wow, nice to see the disparity between number of characters and number of messages. Have you recorded messages sent to groups? Would be interesting to see little connections between other people that were generated by you..
Concept, technical execution, and chrome: All solid. Nothing was left behind.
Your process is nice, perhaps the only one to show hand-drawn sketches. The visualization looks great; I especially enjoy the animation. It might be nice to have labels or some way of indicating that each point/pie chart represents a different person. That isn’t immediately clear. Never mind, just saw that information revealed when you hover over. Interesting insights into your personal life from this. Impressive execution both technically and visually.
Very nice, good process, interesting data, very interesting for the holder of the data, I like how personal it is.
Comments from PiratePad B:
I do like the alternative wavestream graph idea, from a practical perspective. I often have to search backwards in time to find the number of someone who sent me a text message, whose number isn’t in my contacts list…
Lots of great questions raised by the project. In your online blog post summary, I hope you can share some of the things you learned about your own texting habits. It would be great if you could do this for someone else. I volunteer (Golan).
It is nice to see the conceptual development. I wonder if there is an interesting way that you can include the geographical data, since you have the area code. PeasyCam! Go 3D!
Very intuitive. I immediately know what everything means.
Could use a stacked pie to compare characters vs sms count
Lots of dimensions going on here. Visually interesting and clever way of displaying all that information, but the message might be more clear either with a reduction in the number of parameters being analyzed or with more explanatory text on screen.
Beautiful and super legible.
Very beautiful visualization, and the information contained in each bubble is very clear. Switching between number of messages and characters is a REALLY nice feature – adds a whole other dimension to the information I’m getting out of this.
If this was an iPhone app, or an app that someone could actually run on their phone, I could see it being VERY popular.
– Agreed!
– Thanks, pink.
– You’re welcome, green.
wooooooow. skillz. The timeline is what makes this so effective.
Lovelovelove this! I think it’s a really clear representation of your personal social network, and I think the ability to compare #s of messages is really great. I want this on my phone.
The question about emoticons is pretty great; personally I think they’re addictive (i feel like I’m being rude sometimes if I don’t :D or :) right back at them) but I’d love to see an actual dataset that could prove or disprove the frequency of emoticons.