by Ward Penney @ 11:20 am 26 January 2011

What would you do if 60,000 people saw a UFO? Reports like these are exactly what the National UFO Reporting Center has worked so hard to gather. Since 1981, they have been collecting voluntary UFO sighting reports over the phone, in person and on the web. Forming a record of moments from tens of thousands of people across the US, the data speaks volumes to us today. I decided to analyze the data and generate a visualization in Processing in order to find patterns and determine whether we truly aren’t alone.

The Data

A mass of self-reported sighting instances since the early 1900s, the collection of sightings comes to nearly 100MB indexed in SQLite. The fields are:

  • sighting date,
  • reported date,
  • a one-word description of the “shape” of the craft,
  • the duration, and
  • a description of the sighting.

Some of the fields were missing from a lot of the data. For example, only half of the records had the shape field populated. Almost all had dates, and all had descriptions. I used Ruby on Rails to parse the data into a SQLite database, and began to think about how I might want to visualize it.
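For readers who want to try something similar, here is a minimal sketch of what such a table could look like. It is written in Python rather than the Rails code used for this project, and the table name, column types and sample rows are assumptions made up for illustration. Only the five fields come from the list above.

```python
import sqlite3

# Hypothetical schema: table name and column types are assumptions.
# The columns mirror the five fields listed above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sightings (
        id          INTEGER PRIMARY KEY,
        sighted_at  TEXT,            -- ISO date string, may be NULL
        reported_at TEXT,
        shape       TEXT,            -- missing in roughly half of the records
        duration    TEXT,
        description TEXT NOT NULL
    )
    """
)

# Made-up sample rows, not real reports.
rows = [
    ("1995-07-04", "1995-07-06", "disk", "5 min", "Made-up example row."),
    ("1995-07-04", "1995-07-05", None, None, "Made-up row without a shape."),
]
conn.executemany(
    "INSERT INTO sightings "
    "(sighted_at, reported_at, shape, duration, description) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)
conn.commit()


def shape_coverage(connection):
    """Return the fraction of rows whose shape column is not NULL."""
    total, with_shape = connection.execute(
        "SELECT COUNT(*), COUNT(shape) FROM sightings"
    ).fetchone()
    return with_shape / total if total else 0.0
```

A query like the one in `shape_coverage` is a quick way to check how sparse a field is before deciding whether to build a visualization around it.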


I had several ideas in the beginning. I considered morphing the shapes together to form a Voltron UFO, but I didn’t think that would make sense even if I had the technical ability. I also wanted to somehow connect the top-grossing sci-fi movies to the data, possibly showing “tails” behind the movie posters to represent the “tail” of the movie. The tails would be larger if more sightings were reported following the movie. I finally settled on placing dots on a US map for the sightings, perhaps over time, when our TA pointed out something.

Dan Wilcox found out that Slate.com had done just that! With the same data set! They did one visualization that plotted the sightings geographically as points on a US map. I thought this looked good, but it was not that informative because it generally tracked population. They also did one that had a weekly slider control and displayed prevalent “shapes” as small icons on a US map. This was interesting because it was built over Google Maps, but I don’t think it was informative at all.

So, after a brief data identity-crisis, I decided to just plot a histogram of sighting count per day. When I got the visualization working, I did notice a pattern.

Figure: UFO Infoviz screenshot. UFO sighting count over time. Notice the seasonal spikes.

The data showed seasonal spikes during the summer months. I was also really interested in why some days had so many sightings, so I began googling the dates. A few of the dates were quite revealing: one was an “Earth-grazer”, an asteroid nearly colliding with Earth! Another was a piece of a Chinese satellite falling from orbit and crashing into a house.
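The counting behind a per-day plot, and the hunt for the busiest days, can be sketched in a few lines. This is an illustrative Python version, not the Processing sketch from the project; the function names and the sample dates are made up.

```python
from collections import Counter
from datetime import date


def daily_counts(sighting_dates):
    """Map each calendar day to the number of sightings on that day."""
    return Counter(sighting_dates)


def busiest_days(counts, n=3):
    """Return the n days with the most sightings as (day, count) pairs."""
    return counts.most_common(n)


def monthly_totals(counts):
    """Sum the daily counts per month number (1..12) across all years."""
    totals = Counter()
    for day, count in counts.items():
        totals[day.month] += count
    return totals


# Made-up sample dates, not real reports.
sample = [
    date(1999, 11, 16),
    date(1999, 11, 16),
    date(1999, 11, 17),
    date(2000, 7, 4),
]
counts = daily_counts(sample)
```

The days returned by `busiest_days` are the ones worth searching for, and `monthly_totals` is one simple way to check whether summer months really dominate.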


I added a few features to the visualization. First, you can choose whether you would like to see the data by “sighted_at” or “reported_at” date. The data goes back to 1400, and it is really spread out, so I added date sliders to adjust your beginning and end date. Also, when you hover over a data point, it shows the date in an opaque box. Clicking the box takes you to a Google search for that date.
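The logic behind the date sliders and the click-to-search box is simple enough to sketch outside of Processing. The Python below is only an illustration: the function names are hypothetical, and searching for the ISO form of the date is an assumption about what makes a reasonable query.

```python
from datetime import date
from urllib.parse import urlencode


def in_range(counts, start, end):
    """Keep only the days between start and end, inclusive (the slider window)."""
    return {day: n for day, n in counts.items() if start <= day <= end}


def search_url(day):
    """Build a Google search URL for a date, queried in ISO form (an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": day.isoformat()})


# Made-up daily counts for illustration.
example_counts = {
    date(1995, 7, 4): 12,
    date(1999, 11, 16): 40,
    date(2004, 1, 1): 3,
}
window = in_range(example_counts, date(1995, 1, 1), date(1999, 12, 31))
```

Filtering the data to the slider window before drawing keeps the plot from being squashed by a handful of very old records.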


The GUI controls in Processing came from controlP5, by Andreas Schlegel. SQLite interfacing was provided by SQLibrary for Processing, by Florian Jenett.

What I Would Change

I wish I had filled in the columns below the points, like a true histogram. The axes and their labels were also obscured when using the date sliders. The top left of the graph contains a lot of wasted space.

After I noticed the seasonal spikes, I should have taken some time to create another visualization that was circular and used polar coordinates to show how the summer months yielded many more sightings than the other months.
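As a rough idea of how such a circular layout could work, the sketch below maps each month to an angle and scales the radius by that month's total. It is a Python illustration under assumed conventions (January at angle 0, months proceeding counterclockwise, made-up totals), not something that was built for this project.

```python
import math


def month_angle(month):
    """Angle in radians for month 1..12, with January at 0."""
    return 2 * math.pi * (month - 1) / 12


def polar_points(monthly_totals, max_radius=100.0):
    """Convert {month: total} to {month: (x, y)}, scaling the peak to max_radius."""
    peak = max(monthly_totals.values(), default=0)
    points = {}
    for month, total in monthly_totals.items():
        # Guard against an empty or all-zero data set.
        r = max_radius * total / peak if peak else 0.0
        theta = month_angle(month)
        points[month] = (r * math.cos(theta), r * math.sin(theta))
    return points


# Made-up totals: July has the most sightings in this example.
pts = polar_points({1: 10, 4: 20, 7: 40})
```

With this layout, a summer bulge would show up as the outline stretching toward one side of the circle, which a linear time axis spreads across many separate peaks.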


I learned to really look for existing examples of what you’re trying to do, especially if your dataset is public or accessible. I also learned that in order to include zooming functionality, you need to think about it from the beginning. I achieved something close to this, but I doubt my structure could have zoomed a US map.

Slides and Code

Here are the slides from my presentation.

Here is the code and dataset zipped up on my Dropbox folder. ~100MB.


