Due 3-31 (Image Collection)

For this assignment, you are asked to create an interpretation of an image collection.

Overview of Deliverables

This assignment has three parts, which are due at the beginning of class on Thursday, March 31. The percentages below are suggestions as to how you should plan your effort, and do not reflect grading rubrics. 

  1. Identifying your Subject (~10%)
    You are asked to select, download or scrape a large (5000+) dataset of images. A variety of datasets have been made available to you, such as the ones in this spreadsheet. Many of these datasets have already been helpfully scraped and downloaded for you by our course assistant, Charlotte Stiles.
  2. The Small Warmup (~10%)
    As a small “warmup” for the main assignment, you are asked to generate a high-resolution image that visualizes your entire dataset of images. To do this you will compile and run Gene Kogan’s readymade openFrameworks demo, the ofxTsne Gridding Example. Some more information about t-SNE can be found in our lecture notes.
  3. The Main Part (~80%)
    For the main part of the assignment, you are invited to create an interpretation of the large dataset of images.


  • Your interpretation may be expressive and/or analytic. In either case, your work should reveal some ‘truth’ — poetic, social, visual, historical, or otherwise — about your subject.
  • Your interpretation must be a system you have coded, or the products generated by a system you have coded.
  • It is expected that the majority of students in the class will interpret collections of images. However, it is permissible to use datasets of other media objects that have visual properties (e.g. 3D models, movies, PowerPoint presentations). Please note that support for such investigations may be more limited. If you do elect to use a dataset of an alternative media type, allowances will be made for the additional effort required in producing the ofxTsne visualization.
  • Your subject, the collection of images, must be large. Arbitrarily, I have decided that this means your dataset must have at least 5000 items. If you wish to use a dataset with fewer elements than this, you must obtain a signed note from your parent, guardian, priest or advisor with a written explanation; additionally, this note must be displayed as part of your project documentation. No exceptions will be made for datasets with fewer than 1000 objects.
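
If you want to check a downloaded folder against the size thresholds above, something like the following Python sketch may help. It is only an illustration: the function names `count_images` and `size_status` and the list of file extensions are assumptions made for this example, not part of any course tooling.

```python
from pathlib import Path

# Assumed set of extensions; adjust this for your own collection.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff"}


def count_images(folder):
    """Count files under `folder` (recursively) whose extension looks like an image."""
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def size_status(count, required=5000, hard_minimum=1000):
    """Classify a dataset size against the thresholds stated in this assignment."""
    if count >= required:
        return "ok"
    if count >= hard_minimum:
        return "needs signed note"
    return "too small"
```

For example, `size_status(count_images("my_dataset"))` would report where a (hypothetical) folder named `my_dataset` stands. Counting by file extension is a rough check; it does not verify that each file actually opens as an image.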

Some Possibilities

For the Main Part of the assignment, the kinds of things you might create include, but are not limited to:

  • An interactive or static data visualization
  • [Software which generates] a book
  • [Software which generates] a movie or animation
  • [Software which generates] a print or series of prints
  • An image sequencer, such as a comic strip generator, or a Twitter bot
  • An interactive installation, performance or game
  • Something else

Note that it is not a requirement that you visualize all of the images in your dataset. (That’s what the Warmup exercise was for.) To give an example of what I mean, suppose the purpose of your software is to search through the entire public art collection of the United Kingdom — more than 200,000 images — in order to discover instances of possible plagiarism or influence. Your final output might simply be a few pairs of images.
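
To make the "few pairs of images" idea concrete, here is a minimal Python sketch of one possible approach: rank every pair of images by the distance between their feature vectors, and keep only the closest few. The filenames and three-number feature vectors below are invented for illustration; in practice the vectors would come from whatever analysis you choose, and the function name `closest_pairs` is likewise just an assumption.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_pairs(features, top_k=3):
    """Return the top_k most similar pairs as (distance, name_a, name_b) tuples."""
    scored = [
        (euclidean(features[a], features[b]), a, b)
        for a, b in combinations(sorted(features), 2)
    ]
    scored.sort()  # smallest distance (most similar) first
    return scored[:top_k]


# Hypothetical feature vectors, made up for this example.
features = {
    "painting_001.jpg": [0.10, 0.80, 0.30],
    "painting_002.jpg": [0.11, 0.79, 0.31],
    "painting_003.jpg": [0.90, 0.10, 0.50],
    "painting_004.jpg": [0.88, 0.15, 0.45],
}

for distance, name_a, name_b in closest_pairs(features, top_k=2):
    print(f"{name_a} <-> {name_b}: {distance:.3f}")
```

Note that this brute-force comparison looks at every pair, so its cost grows with the square of the collection size. For 200,000 images that is roughly 20 billion pairs, so a collection of that scale would likely need some kind of nearest-neighbor index or pre-filtering instead.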

If necessary, you may visualize different image collections for the Warmup and Main components of the project. The purpose of the Warmup is (in part) to help you consider the possibilities of your dataset for further investigation.

Some Cautions

There are three main challenges with this assignment. Each of these challenges is interrelated with the other two.

  1. Ideation. Having a good concept, and having a good intuition that the dataset you’ve selected will be suitable for this concept.
  2. Analysis. Understanding what you are able to quantify about an image (e.g. To what extent does it contain the feature I’m interested in? How similar is it to another image?) …and being able to create a system which can analyze and/or detect the things you’re interested in.
  3. Presentation. Having quantified (measured, analyzed, detected, filtered, etc.) the things which interest you about your images, then: understanding how to allow your images to most compellingly tell the stories they have to tell.
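
As one illustration of the two Analysis questions above, the following Python sketch measures a feature ("how red is this image?") and a similarity ("how alike are the colors of these two images?"). It is a deliberately simple sketch under stated assumptions: an image is represented as a plain list of (r, g, b) tuples with values 0-255, which you would obtain from an image library of your choice; the function names and the redness threshold are arbitrary choices for this example. Color histograms are only one of many possible features.

```python
import math


def color_histogram(pixels, bins_per_channel=4):
    """Build a normalized, coarse RGB histogram from (r, g, b) tuples in 0-255."""
    step = math.ceil(256 / bins_per_channel)
    last = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel to a bin, clamping so the index stays in range.
        rb, gb, bb = min(r // step, last), min(g // step, last), min(b // step, last)
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    if count == 0:
        return hist
    return [value / count for value in hist]


def cosine_similarity(a, b):
    """Similarity between two histograms: 1.0 means identical color distribution."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def redness(pixels):
    """Fraction of pixels whose red channel clearly dominates (arbitrary threshold)."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    red = sum(1 for r, g, b in pixels if r > 150 and r > 2 * max(g, b))
    return red / len(pixels)


# Tiny synthetic "images", invented for this example.
red_image = [(200, 20, 20)] * 90 + [(20, 20, 200)] * 10
also_red = [(220, 30, 10)] * 80 + [(10, 30, 220)] * 20
blue_image = [(20, 20, 200)] * 100

print(redness(red_image))
print(cosine_similarity(color_histogram(red_image), color_histogram(also_red)))
print(cosine_similarity(color_histogram(red_image), color_histogram(blue_image)))
```

The two mostly-red images should score as much more similar to each other than either does to the blue one. A histogram ignores where colors appear in the frame, so two very different pictures with the same palette will look identical to it; whether that matters depends on your concept.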

Fine Print (Deliverables)

  • The Warmup t-SNE visualization will be a large image, likely around 10000×10000 pixels and 50MB or larger. A file of this size will not upload to the course website. Please put a copy of it on the course computer (“classy”), in a specially designated folder on the Desktop, and the professor will collect it from there.
  • Using whichever programming languages are necessary or preferable, obtain your dataset and use it to develop a visual representation.
  • Create a blog post for your project. Categorize this with the category, ImageCollection.
  • Upload your code to GitHub, and include a link to your repository in your blog post.
  • In your blog post, please write 150-200 words about your project.
  • In your blog post, include some scans or photos of any sketches, if you have them.
  • In your blog post, include some screenshots of your project. Include a (small!!) version of your t-SNE visualization.
  • You’ll be expected to present your work in class on Thursday, March 31.
  • By Tuesday April 5, you should also have completed documentation of your software in a brief video (1-2 minutes long), preferably with a narration. Please upload this video to YouTube or Vimeo, and embed this video in the blog post.
  • Note that a proposal for your final project will also be due on Tuesday April 5.

Learning Objectives

On completion of this assignment, students will:

  • Be able to demonstrate skills in wrangling, analyzing, and displaying large collections of images.
  • Be able to demonstrate familiarity with the ways in which large image collections are or can be used in art, design, and cultural research.

Some Image Databases of Possible Interest