Work

ilovit-SituatedEye

https://editor.p5js.org/ilovit/sketches/vbDYcAqf7- (you have to train it yourself when you start)

I created a game where the player is protecting themselves from some unknown entities that are trying to gain passage through the window. The game state is communicated to the player entirely through audio, while they interact with the game solely by opening and closing the window.

The player will periodically hear a whistling noise, indicating an oncoming assailant. If they close the window in time, the assailant is stopped with a crash. If they fail to close it in time, the room becomes a bit more "stuffy," indicated by a static noise in the background (this is effectively player health). The room also slowly grows stuffier while the window stays closed, whereas an open window lets it air out. There is currently no lose condition; the static noise just becomes loud and annoying.
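A minimal sketch of how the stuffiness logic described above might look in p5.js, assuming the window classifier sets a windowOpen flag elsewhere and staticNoise is a looping p5.sound file (all names here are illustrative, not taken from the actual sketch):

let stuffiness = 0;      // effectively player health; 0 = fresh air
let windowOpen = true;   // updated elsewhere by the window classifier

function updateRoom() {
  if (windowOpen) {
    stuffiness = max(0, stuffiness - 0.001);   // an open window airs the room out
  } else {
    stuffiness = min(1, stuffiness + 0.0005);  // a closed window slowly gets stuffy
  }
  staticNoise.setVolume(stuffiness);           // louder static = stuffier room
}

function assailantGotThrough() {
  stuffiness = min(1, stuffiness + 0.2);       // failing to close the window in time
}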

I think the basic interaction is fun, but various variables aren't tuned quite right. My original intent was to have the player sitting down doing something else and then having to get up to open and close the window every once in a while, but I couldn't figure out a good way of encouraging the player to sit down (one thought is that this game is meant to be played while trying to do other things), and the sound that indicates an imminent assailant is too short for someone to get up and close the window in time. I should also have some kind of instructions explaining to the player how the whole thing works.

lubar-SituatedEye

The Situated Psychic Eye

I am fascinated by fortune tellers and the idea of a "psychic eye" (I don't buy the "psychic" part one bit), but the elements of incredibly detailed observation and building a reading from the tellee's cues are interesting enough on their own. Using computer vision to create an accurate reader of bodily and verbal cues was ever so slightly out of scope for this piece, but I wanted to continue with the idea of a psychic gaze. So, I created a tarot card reading setup, which incorporates the fun, slightly ridiculous, and somewhat mysterious air of telling the future.

My initial, and continued, intent is to train the computer to recognize all 76 cards. However, as I have yet to find a way to successfully load pre-stored images or upload the images taken, I scaled down slightly to save my sanity, since every time the program is restarted the cards have to be re-scanned.

The setup in a physical space is, I think, critical to creating the air of mysticism, and I pulled a dirty trick when projecting onto a surface by zooming the unnecessary elements out of frame. (This would not be ideal for a final system setup.)

The program itself works beautifully at recognizing the different cards; now I just need to figure out how to save and reload the training so that I can implement the entire deck.
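One possible route to persisting the deck: ml5's feature-extractor classifier exposes save() and load() methods (availability depends on the ml5 version), so a minimal, untested sketch might look like this, assuming classifier = featureExtractor.classification(video) as in the template:

function keyPressed() {
  if (key === 's') {
    classifier.save();  // downloads model.json + the weights file
  } else if (key === 'l') {
    // restore a previously saved model instead of re-scanning every card
    classifier.load('model.json', () => console.log('card model restored'));
  }
}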

Program Link Here

Process:

 

tli-SituatedEye

It's a bird! It's a plane! It's a drawing canvas that attempts to identify whether the thing you're drawing is a bird or a plane.

I am not too happy with the result of this project because it's basically a much worse version of Google Quick Draw. Initially, I wanted to use ml5 to make a model that would attempt to categorize images according to this meme:

I would host this as a website where users who stumble across my web page could upload images to complete the chart and submit it for the model to learn from. However, I ran into quite a few technical difficulties, including but not limited to:

  • submitting images to a database.
  • training the model on newly submitted images.
  • making the model persistent. The inability to save/load the model was the biggest roadblock to this idea.
  • cultivating a good data set in this way.

My biggest priority in choosing a project idea was finding a concept where the accuracy of the model wouldn't obstruct the effectiveness of its delivery, so something as subjective as this meme was a good choice. However, I had to pivot to a much simpler idea that could work in the p5.js web editor due to all the problems that came with the webpage-on-Glitch approach. I wanted to continue with the meme format, but again, issues with loading/submitting images made me pivot to using drawings instead. With the drawing approach, the meme format no longer made sense, hence the transformation of the labels to bird/plane.
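For reference, a minimal sketch of the drawing-canvas classifier, assuming the ml5 feature extractor and that the drawing happens directly on the main p5 canvas; labels, key bindings, and callback shapes (which vary a bit across ml5 versions) are illustrative:

let featureExtractor, classifier;

function setup() {
  createCanvas(400, 400);
  background(255);
  featureExtractor = ml5.featureExtractor('MobileNet', () => console.log('MobileNet ready'));
  classifier = featureExtractor.classification();
}

function draw() {
  if (mouseIsPressed) line(pmouseX, pmouseY, mouseX, mouseY);  // simple drawing
}

function keyPressed() {
  const snapshot = get();  // grab the current drawing as an image
  if (key === 'b') classifier.addImage(snapshot, 'bird');
  if (key === 'p') classifier.addImage(snapshot, 'plane');
  if (key === 't') classifier.train(loss => { if (loss === null) console.log('done training'); });
  if (key === ' ') classifier.classify(snapshot, (err, results) => console.log(results));
}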

I don't have much to say about my Google Quick Draw knockoff besides that I'm mildly surprised by how well it works even with the many flaws of my data set.

Some images from the dataset:


An image of using the canvas:

A video:

zapra – situated eye

Eye tracker drawing

For this project, I wanted to use machine learning to create a device that explored subtle changes in the eyes. Because I used a two-dimensional image regressor, most of my development involved exploring the nuances of how I interacted with the tracker rather than the code itself. While my original intention was a tracker that would allow the user to draw with their eyes, I spent a lot of time experimenting with how to collect data points, my proximity to the camera, and the range and speed of my eye movements.

View code

Process / Early Iterations

Knowing I wanted to make a program that detects eye movements, I experimented with pupil dilation, lying, and smile lines by recording myself as I performed different tasks. While I was interested in the concept of detecting nuanced expressions in people's eyes, I felt the observable changes were too subtle to detect within the scope of this project.

Before adding specific points of reference for the training set, I added samples by clicking in the general vicinity of where I was looking. This helped me while I was starting out but did not produce the most accurate results.
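A minimal sketch of that click-to-label approach, assuming one ml5 feature extractor and regressor per axis (the single-value regressor predicts one number at a time); variable names are illustrative, not from the actual code:

let video, extractorX, extractorY, regX, regY;

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  // one regressor per axis, since each one predicts a single value
  extractorX = ml5.featureExtractor('MobileNet', () => console.log('x model ready'));
  extractorY = ml5.featureExtractor('MobileNet', () => console.log('y model ready'));
  regX = extractorX.regression(video);
  regY = extractorY.regression(video);
}

function mousePressed() {
  // label the current webcam frame with (roughly) where the eyes are looking
  regX.addImage(mouseX / width);
  regY.addImage(mouseY / height);
}

function keyPressed() {
  if (key === 't') {
    regX.train(loss => { if (loss === null) console.log('x trained'); });
    regY.train(loss => { if (loss === null) console.log('y trained'); });
  }
}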

An arc I drew with my feet

Training

For greater precision, and to give other users a way to reproduce the tracker, I created a series of points to use during training, with indicators showing when each point had at least 30 samples.

Setup

My final setup involved a large monitor screen, a precise webcam, and a number keypad to train the program. The larger screen allowed for a greater sense of eye movement for the program to track, and the keypad let me train the set without having to glance at my laptop and disrupt eye movement.

sovid – Situated Eye

Sketch can be found here.

To use:
Toggle the training information by tapping 'z'.
Toggle the hand instructions by tapping 'm'.

For this project, I was interested in creating a virtual theremin where, much like on an actual theremin, the positions that a user's right hand makes control the note on the scale, and a slight wiggle controls the vibrato. I used the adapted Image Classifier to train my program on the hand positions, and looked at a point tracker by Kevin McDonald to track the hand for the vibrato. My main issue was finding good lighting and backgrounds to make the image classifier work reliably - I made a lot of strange sets and stands to make it work, so it's a very location-dependent project.
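A minimal sketch of how the classification results might drive the sound, assuming a p5.Oscillator and that `classifier` and `video` come from the adapted Image Classifier template; the note table, labels, and callback shape are illustrative:

const notes = { hand1: 261.63, hand2: 293.66, hand3: 329.63, hand4: 349.23 };  // C4 D4 E4 F4
let osc;

function setup() {
  osc = new p5.Oscillator('sine');
  osc.start();
  osc.amp(0.5);
}

function gotResult(error, results) {
  if (error) return console.error(error);
  const label = results[0].label;
  if (notes[label]) osc.freq(notes[label], 0.1);  // glide to the new note
  classifier.classify(video, gotResult);          // keep classifying hand positions
}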

lsh-SituatedEye

Last year, a visiting guest in the studio mentioned that they consider many of our interactions with smart assistants quite rude, and that these devices reinforce an attitude of barking commands without giving thanks. I think back to this conversation every so often and ponder to what extent society anthropomorphizes technology. In this project I decided to flip the usual power dynamic of human and computer. The artificial intelligence generally serves our commands and does nothing (other than ping home and record data) when not addressed. Simon Says felt like a fun way to explore this relationship by having the computer give the human commands and chide us when we are wrong. I also deliberately made the gap between commands short, as a way to consider how promptly we expect a response from technology. I would say this project is fun to play - my housemate giggled as the computer told him he was doing the wrong motions - though one may not consider the conceptual meaning of the dynamic during the game. Another issue I ran into during development is that when trained on more than three items, the network's accuracy rapidly declined. In the end, I switched to training a KNN classifier on PoseNet data, which worked significantly better. There are still a few tiny glitches, but the basic app works.
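A minimal sketch of the KNN-on-PoseNet approach, assuming ml5's poseNet and KNNClassifier; the labels and the flattening of keypoints are illustrative, not the actual code:

let video, poseNet, knn, currentPose;

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  poseNet = ml5.poseNet(video, () => console.log('PoseNet ready'));
  poseNet.on('pose', poses => { if (poses.length > 0) currentPose = poses[0].pose; });
  knn = ml5.KNNClassifier();
}

function poseToFeatures(pose) {
  // flatten the keypoint positions into a single feature vector
  return pose.keypoints.map(k => [k.position.x, k.position.y]).flat();
}

function keyPressed() {
  if (!currentPose) return;
  if (key === '1') knn.addExample(poseToFeatures(currentPose), 'hands up');
  if (key === '2') knn.addExample(poseToFeatures(currentPose), 'hands on head');
  if (key === 'c') knn.classify(poseToFeatures(currentPose), (err, result) => {
    if (!err) console.log(result.label);  // does the pose match what Simon said?
  });
}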

New debug view with fewer poses
Old debug view with way too many params

vingu – SituatedEye

I made a surveillance ramen bot that takes a picture when it sees someone take instant ramen out of the pantry and tweets the image on Twitter. I thought it would be interesting to document my housemates' and my instant ramen eating habits, since our pantry is always stocked with it.

I worked backwards, starting with the twitterbot. I used the Twit API and Node.js. (Most of the work was setting up the Twitter account and learning about the command prompt.) Then I added the image capture to the Feature Extractor template. I struggled with connecting the two programs, since one runs in p5.js (client-side) and the other in Node (on my local computer). I tried to call the twitterbot code from the feature extractor code (trying out different modules and programs), but I couldn't get it to work. I opted to make the twitterbot run continuously once I call it from the command prompt; it calls a function every 5 seconds to check if there is a new image to post.
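A minimal sketch of that polling loop on the Node side, assuming the Twit package and a folder that the capture sketch saves images into; the folder name, file handling, and tweet text are illustrative:

const Twit = require('twit');
const fs = require('fs');
const T = new Twit(require('./config'));  // API keys kept in a separate config file

const WATCH_DIR = './captures';
const posted = new Set();

setInterval(() => {
  fs.readdirSync(WATCH_DIR).forEach(file => {
    if (posted.has(file)) return;  // only tweet images we haven't posted yet
    const b64 = fs.readFileSync(`${WATCH_DIR}/${file}`, { encoding: 'base64' });
    T.post('media/upload', { media_data: b64 }, (err, data) => {
      if (err) return console.error(err);
      T.post('statuses/update', {
        status: 'instant ramen spotted',
        media_ids: [data.media_id_string],
      }, () => posted.add(file));
    });
  });
}, 5000);  // check for a new image every 5 seconds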

I made the Twitter account's header and profile look like a food blog / food channel account. I thought it would make a fun visual contrast with the tweets.

code (I didn't run it in the p5.js editor; I ran it locally on my computer)

Some after thoughts:

  • It would have been better if I had finished this earlier, so that there would be more realistic Twitter documentation of me and my housemates; none of my housemates were available/awake by the time I finished.
  • Find a better camera location, so it looks less performative.
  • I should have done samples of holding food that wasn't instant ramen.
  • This can only be run locally from my computer - maybe upload to Heroku?

 

Scrolling through my tester tweets.

iSob-SituatedEye

Even compared to all the other projects, I spent a very long time troubleshooting and down-scoping my idea for this project! My first idea was to train a model to recognize its physical form -- a model interpreting footage of the laptop the code was running on, or webcam footage reflecting the camera (the model's 'eyes') back at it. However, training for such specific situations with so much variability would have required thousands of training images.

Next, I waffled between several other ideas, especially ones using a two-dimensional regressor. I was feeling pretty bad about the whole project because none of my ideas expressed something interesting in a simple but conceptually sophisticated way. I endeavored to get the 2D regressor working (which was its own bag of fun monkeys) and make the program track the point of my pen as I drew.

Luckily, Golan showed me an awesome USB microscope camera! The first thing I noticed when experimenting with this camera was how gross my skin was. There were tiny hairs and dust particles all over my fingers, and a hangnail which I tried to pull off, causing my finger to bleed. Though my finger healed within a few hours, it inspired a project about a deceptively cute vampiric bacterium who is a big fan of fingers.

This project makes use of two regressors (determining the x and y location of the fingertip) and a classifier (to determine whether a finger is present and whether it is bloody). I did not show the training process in my video because it takes a while. If I had more time, I think there is lots of potential for compelling interactions with the Bacterium. I wanted him to provoke some pity and disgust in the viewer while also being very cute.
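A minimal sketch of how the three models might be combined at prediction time, assuming two ml5 regressors (regX, regY) and a classifier trained elsewhere; the labels, the result shape, and the helper functions for the Bacterium are hypothetical, not from the actual sketch:

let fingerX = 0, fingerY = 0, fingerState = 'none';

function predictLoop() {
  regX.predict((err, result) => { if (!err) fingerX = result.value * width; });
  regY.predict((err, result) => { if (!err) fingerY = result.value * height; });
  classifier.classify((err, results) => {
    if (!err) fingerState = results[0].label;  // e.g. 'none', 'finger', or 'bloody'
    predictLoop();                             // keep watching the microscope feed
  });
}

function draw() {
  // hypothetical helpers: the Bacterium chases the fingertip and reacts to blood
  if (fingerState !== 'none') moveBacteriumToward(fingerX, fingerY);
  if (fingerState === 'bloody') feedBacterium();
}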

In conclusion, I spent many hours on this project and tried hard. I really like Machine Learning, so I wanted my piece to be 'better' and 'more'. But I learnt a lot and made an amusing thing so I don't feel unfulfilled by it.

Documentation:

Sketch

meh-SituatedEye

Space Invaders controlled by hand posture - in collaboration with Sanjay

This is an exploration of training our own model for posture detection and applying it to a game. We were inspired by the rock-paper-scissors detection and wanted to do something that also detects gestures, but in a different game scenario. Our first step was to detect both the rotation of the hand and whether you pull the trigger. Using the regression template, we were able to create two axes that separately detect the two features. However, once we combined the detection with Space Invaders, we faced a very low frame rate because of the expensive computation. Currently this is a very crude exploration, and we could be more creative with the application of the shooting gesture. A further development of this game could be optimizing the computation time by moving it offline and trying to save and preload the model. Below are some other game scenarios in which this gesture could be developed:

Link to p5js: https://editor.p5js.org/svsalem/sketches/A8Ao1HcrT
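One possible way to ease the frame-rate problem (not from our sketch, just a sketch of the idea): only run the expensive prediction every few frames and reuse the last result in between, so the game itself keeps running at full speed. Names here are illustrative:

let lastRotation = 0, lastTrigger = 0;

function draw() {
  if (frameCount % 10 === 0) {  // predict roughly 6 times per second at 60 fps
    rotationRegressor.predict((err, r) => { if (!err) lastRotation = r.value; });
    triggerRegressor.predict((err, r) => { if (!err) lastTrigger = r.value; });
  }
  updateSpaceInvaders(lastRotation, lastTrigger);  // game logic runs every frame
}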

 

vikz-SituatedEye

We were really inspired by Google's drawing-based machine learning experiments and the ability to play around with the different types of applications that machine learning has in drawing. In order to iterate as quickly and accurately as possible, we started our explorations by playing around with the whiteboard. We started off playing with the program to see if machine learning was able to detect the difference between written text and drawings. From this, we were also thinking of maybe incorporating mathematical graphs and/or equations as a possible third category - an example that lives between text and drawing.

From our experiments, we saw that the computer could usually distinguish between drawings and text, presumably depending mostly on the text. The drawings differed widely, as we literally began to draw just about everything that first came to mind, whereas the text was definitely more limited in terms of aesthetic and was visually more uniform. However, we came upon an interesting discovery when drawing stars in a linear arrangement. Despite being drawings, they were detected as text because of their linear nature. This propelled us into thinking about the possible implications of using machine learning to detect the differences between languages.

The stars that sparked off our stars.

Our final exploration dealt with detecting the difference between Western and Eastern languages - Western being more letter-based, and Eastern being based on more pictorial characters.

Western Languages

Eastern Languages

Training our model with white background, western text, and eastern text.

We decided to map the result out visually through three colors (a minimal sketch of this mapping follows the list):

  • White indicates that there is no handwritten text being shown. (We fed a series of white background images to train this part.)
  • Blue indicates that it is more likely Western text. (We fed a series of handwritten phrases and words in English, Spanish, French, and Latin to train this part.)
  • Red indicates that it is more likely Eastern text. (We fed a series of handwritten phrases and words in Chinese, Japanese, and Korean to train this part.)
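A minimal sketch of the label-to-color mapping, assuming `classifier` and `video` from the feature-extractor template and the three labels we trained; the exact label strings and callback shape are illustrative:

const classColors = {
  white:   [255, 255, 255],
  western: [0, 0, 255],   // blue
  eastern: [255, 0, 0],   // red
};
let currentLabel = 'white';

function gotResults(error, results) {
  if (error) return console.error(error);
  currentLabel = results[0].label;
  classifier.classify(video, gotResults);  // keep classifying the webcam feed
}

function draw() {
  const c = classColors[currentLabel] || [128, 128, 128];
  background(c[0], c[1], c[2]);
  image(video, 0, 0, 320, 240);
}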

From our results, we've discovered a couple things.

The program is relatively good at detecting a "blank background," though a couple of times, when our paper was shifted, the program recognized it as "Western."

Most importantly, though, the program was very accurate in detecting Western text but significantly less so with Eastern text.

This observation has led us to a couple hypotheses:

  • Our data is lacking. We took about 100 photos each for Western and Eastern text, but this may not have been enough for the machine learning to build a conclusive enough model.
  • The photos we took may also not have been of high enough quality.
  • In our sample data, we only wrote horizontally for Western text, whereas the Eastern text had both horizontal and vertical writing.

Future thoughts...

To test whether the machine learning program could simply tell the difference between Western and Eastern languages, we could do away with the "varied handwriting" completely and use very strict criteria for handwriting style when writing our sample text. When testing the trained program, we would continue to write in that same style for both the Eastern and Western texts. This would help isolate our variables to test the hypotheses above.