Alan

27 Feb 2013

#Listen To Color

Artist Neil Harbisson was born completely color blind, but this device attached to his head turns color into audible frequencies. Instead of seeing a world in grayscale, Harbisson can hear a symphony of color — and yes, even listen to faces and paintings. It is a great idea to transform visual signals into a different wavelength range, which extends the human ability to sense. It is still a little disappointing that the device can only generate sound right now. If it could recognize color, shapes, and even the depth and spatial relationships of objects, and convert them into a real visual signal sent directly to the human brain, that would be revolutionary.
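
As a rough illustration of the color-to-sound idea, here is a minimal sketch that maps a color's hue onto an audible frequency and writes a short tone to a WAV file. The mapping and the 200–800 Hz range are arbitrary choices of mine, not Harbisson's actual sonochromatic scale.

```python
# Toy color-to-sound sketch: map hue to a frequency and render a sine tone.
# The hue-to-frequency mapping here is arbitrary, not Harbisson's scale.
import colorsys
import wave

import numpy as np

def hue_to_frequency(r, g, b, low_hz=200.0, high_hz=800.0):
    """Map hue (0..1) linearly onto an audible frequency range."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return low_hz + h * (high_hz - low_hz)

def write_tone(path, freq_hz, seconds=1.0, rate=44100):
    t = np.linspace(0.0, seconds, int(rate * seconds), endpoint=False)
    samples = (0.5 * np.sin(2 * np.pi * freq_hz * t) * 32767).astype(np.int16)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)       # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(samples.tobytes())

# "Listen" to a pure red pixel.
write_tone("red.wav", hue_to_frequency(255, 0, 0))
```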

#Google Driverless Cars

The driverless car is not a new concept at all, not even as a real implementation. You can trace it back to the automated car experiments by Mercedes in 1995. Thrun was criticized in this talk because he understated the value of other pioneers in this field. The video below is a keynote speech by Ernst Dickmanns with an introduction to automated cars.

One reason for Google to introduce the driverless car is that it is safer than cars driven by humans. This is doubtful and even controversial among people who insist on the freedom of driving. The car of the future must be designed for autonomy from the start, rather than being a car designed to be driven that happens to drive itself automatically.

#Photo-real Digital Face

This project is good, since it generates a highly realistic simulated face which is very close to a human face. However, there are still several deficiencies. Because of the uncanny valley, you can still tell it is a digital face, and the movement in the real-time version looks stiff.

TED Talks on Computer Vision

Michael

27 Feb 2013

I hope you’ll pardon me for doing work that I’m already familiar with and may not be particularly categorized as art… I’ve been in New York since Monday morning, and unfortunately Megabus’s promise of even infrequently functional internet is a pack of lies. My bus gets in at midnight, so to avoid having to stay up too late and risk sleeping through my alarms, I’m doing the draft mostly from memory and correcting later.

Alexei Efros’s Research

Visual Memex

Professor Efros is tackling a variety of research projects that address the fact that we have an immense amount of data at our fingertips in the form of publicly-available images on flickr and other sites, but relatively few ways of powerfully using it.  What I find unique about the main thrust of the research is that it acknowledges that categorization (which is very common in computer vision) is not a goal in and of itself, but is just one simple method for knowledge transfer.  Thus, instead of asking the question “what is this,” we may wish to ask “what is it like or associated with.”  For example, it is very easy to detect “coffee mugs” if you assume a toy world where every mug is identical in shape and color. It is somewhat more difficult to identify coffee mugs if the images contain both the gallon-size vessels you can get from 7-11 and the little ones they use in Taza d’Oro and the weird handmade thing your kid brings home from pottery class. It is more difficult still to actually associate a coffee mug with coffee itself.  In general, I’m attracted to Professor Efros’s work because it gets its power from using the massive amount of data available in publicly-sourced images, and is built upon a variety of well-known image processing techniques.

GigaPan

GigaPan
This may not be a fair example, but I want to share it for those who aren’t familiar. GigaPan is a project out of the CREATE Lab that consists of a robotic pan-tilt camera mount and intelligent stitching software that allows anyone to capture panoramic images with multiple-gigapixel resolution. The camera mount itself is relatively low-cost, and will work with practically any small digital camera with a shutter button. The stitching software is advanced, but the user interface is basic enough that almost anyone is capable of using it. We send these all over the globe and are constantly surprised by the new and unique ways that people find to use them, from teachers in Uganda capturing images of their classroom to paleontologists in Pittsburgh capturing high-resolution macro-panoramas of fossils from the Museum of Natural History. I appreciate this project because at its core, the software is simply a very effective and efficient stitching algorithm packaged with a clever piece of hardware, but it gets its magic from the way in which it is applied to allow people to share their environment and culture.
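
Just to give a feel for what the stitching half involves, here is a minimal sketch using OpenCV's generic panorama Stitcher. This is not GigaPan's own software, and the tile filenames are placeholders.

```python
# Minimal panorama-stitching sketch with OpenCV's generic Stitcher class.
# This is only an illustration of the idea, not GigaPan's stitcher.
import cv2

tiles = [cv2.imread(name) for name in ("tile_01.jpg", "tile_02.jpg", "tile_03.jpg")]

stitcher = cv2.Stitcher_create()          # OpenCV 4.x API
status, panorama = stitcher.stitch(tiles)

if status == 0:                           # 0 means cv2.Stitcher_OK
    cv2.imwrite("panorama.jpg", panorama)
else:
    print("Stitching failed, status code:", status)
```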

Google Goggles

Goggles
Google Goggles is an Android app that allows the user to take a picture of an object in order to perform a Google search. It is unclear to me how much of the computer vision is performed on the phone and how much is performed on Google’s servers, but it is my impression that most of it is done on the phone itself. At the very least, some of the techniques employed involve feature classification and OCR for text analysis. The app does not seem to have found widespread use, but I still find it an interesting direction for the future because it could make QR codes obsolete. Part of me hates to rag on QR codes, because I’ve seen them used cleverly, but I feel like most of the time they simply serve as bait: whenever people see QR codes, they want to scan them regardless of the content, because people love to flex their technology for technology’s sake. I think Google Goggles might be a case where people will use it more naturally than QR codes, since in some instances it is simply easier to search a short bit of text than to take a picture.
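
The OCR piece, at least, is easy to sketch with off-the-shelf parts. The snippet below is not Google's pipeline; it just runs the open-source Tesseract engine (via pytesseract) on a snapshot and turns whatever text it finds into a search URL, with the image filename as a placeholder.

```python
# Rough sketch of the OCR half of a Goggles-style search: recognize text in
# a snapshot and build a search query from it. Not Google's actual pipeline.
import urllib.parse

import pytesseract
from PIL import Image

snapshot = Image.open("snapshot.jpg")            # placeholder image file
text = pytesseract.image_to_string(snapshot).strip()

if text:
    query = urllib.parse.quote_plus(" ".join(text.split()))
    print("https://www.google.com/search?q=" + query)
else:
    print("No text found in the image.")
```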

SamGruber::LookingOutwards::ComputerVision

Pinokio – Shanshan Zhou, Adam Ben-Dror and Joss Doggett

Pinokio is an animatronic lamp which gazes around and responds to the presence of humans in a surprisingly lifelike way. It will even respond to sudden sounds in its environment and resist being turned off. Based upon the video, it would appear that the lamp is a very straightforward application of face detection, which is surprising given the quality of the character shown. The project is of course reminiscent of the Pixar Lamp, though one of the core elements lacking relative to that earlier work is the partner. The ability to just watch two lamps clumsily interact with each other gave the Pixar piece much more character than the single lamp alone can manage.
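
Since the guess above is that the piece rests on plain face detection, here is a minimal sketch of that core using OpenCV's stock Haar-cascade detector on a webcam feed (assuming the opencv-python package, which ships the cascade files). The servo control that would actually aim the lamp is left out, and none of this is the artists' own code.

```python
# Face-detection core a Pinokio-style lamp could be built on: find a face in
# the webcam image and mark its center as the point the lamp would gaze at.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
camera = cv2.VideoCapture(0)

while True:
    ok, frame = camera.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)
    for (x, y, w, h) in faces:
        # The face center is what the lamp would aim its gaze at.
        cv2.circle(frame, (x + w // 2, y + h // 2), 5, (0, 255, 0), -1)
    cv2.imshow("gaze target", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

camera.release()
cv2.destroyAllWindows()
```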

Virtual Ground – Andrew Hieronymi

Virtual Ground is a projected game that is played by two people who try to steer a bouncing ball to light up the floor. I find this project interesting because it seems to run counter to our typical rivalrous impression of a game. However, in terms of computer vision, it is a fairly basic exercise of tracking moving objects that does not seem to push the bounds of the possible. I am immediately curious to see how the game could evolve if more participants entered the playing area, though the presentation of this project seems to suggest that there would not be a meaningful result.
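
As a rough sketch of the kind of moving-object tracking such a floor game needs, the snippet below does background subtraction on a camera feed and boxes any sufficiently large blob. It is purely illustrative, not Hieronymi's implementation, and the area threshold is a made-up number.

```python
# Track players on the floor by subtracting the (learned) background and
# finding large foreground blobs. Illustrative sketch only.
import cv2
import numpy as np

camera = cv2.VideoCapture(0)
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=32)
kernel = np.ones((3, 3), np.uint8)

while True:
    ok, frame = camera.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel, iterations=2)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for contour in contours:
        if cv2.contourArea(contour) < 500:   # ignore small noise blobs
            continue
        x, y, w, h = cv2.boundingRect(contour)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("players", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

camera.release()
cv2.destroyAllWindows()
```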

Can

27 Feb 2013

Reactable

Reactable, the well-known, futuristic-looking instrument, uses OpenCV to detect fiducial markers and generates/controls sounds with them. It’s mostly used by electronic music lovers. (Perhaps the most famous musician who uses Reactable is Björk.)
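
Reactable's own tracker is built around reacTIVision fiducials, but the same idea is easy to sketch with OpenCV's ArUco markers: each detected marker's ID and table position could drive a synth parameter. This is only an illustration (it needs opencv-contrib-python and uses the pre-4.7 module-level aruco API), and the snapshot filename is a placeholder.

```python
# Fiducial-marker detection sketch: find markers in a snapshot of the table
# and report each marker's ID and pixel position.
import cv2

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
frame = cv2.imread("tabletop.jpg")               # placeholder snapshot

corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)
if ids is not None:
    for marker_id, quad in zip(ids.flatten(), corners):
        cx, cy = quad[0].mean(axis=0)            # marker center in pixels
        print(f"marker {marker_id} at ({cx:.0f}, {cy:.0f})")
```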

 

Nosaj Thing – Eclipse / Blue

An amazing modern dance performance by Nosaj Thing, and the visualization of an amazing song. The way they blend the dance moves with the visuals and the way they visualize the sounds are pretty mind-blowing.

 

Orbitone

A life-size version of Reactable, or an ambient music-making tool. What I can’t understand about this project is the timing/positioning. It doesn’t seem very precise to me, and although I think it can be useful for making ambient music, it could be more useful as a toy for kids.

Anna

27 Feb 2013

Good evening. I bring you a brief and cursory ramble (how can it be both brief and a ramble? magic!) about computer-vision-related dazzle-ry. A disclaimer for anyone splitting hairs: most of this post is focused on Augmented Reality, but that tends to involve a good bit of computer vision.

I’d also like to point out that two of the three things below were discovered via this thoughtful commentary on the influence of filmmaker Georges Méliès on the ideas of Augmented Reality. You may be familiar with Méliès if you saw the film ‘Hugo’ — or read the book it was based upon.

Ring-ring-ring-ring-ring-ring….

It’s cellular, modular, interactivodular…

Raffi song aside, this is an awesome demo of everyday objects turned into complex gadgetry via the recognition of gestures— say, opening a laptop, or picking up a phone. It reminds me a lot of the student project about hand-drawn audio controls that graces this website’s homepage, but what I really like about it is the fact that the system relies not on shape recognition, but instead upon such small and seemingly inconsequential human behaviors. Honestly, who even thinks about the way we open our laptops? It’s just something we do, habitually and subconsciously. To be able to harness that strange subliminal action and use it to transform objects into devices is fascinating to me. I’m also interested in the work that went into projecting the sound so that it seems to originate at the banana.

And now, a word on demonic chairs….

KinÊtre

This demo of a man using a Kinect to make a chair jump around isn’t particularly compelling to me as a stand-alone piece, but I thought the article was worth including because of the implications for animation that it proposes. It really makes a whole lot of sense to use the Kinect to create realistic animations. Sure, you can spend a lot of time in Maya making every incremental movement perfect, or you could capture a motion fluidly and organically by actually doing it yourself. It’s a no-brainer, really, and I bet it’s a lot cheaper than those motion capture suits like the ones they put on Andy Serkis in LotR… or Mark Ruffalo in Avengers…

Did somebody say Avengers? Oh that reminds me…

Jarvis is the new watercolors….

Last post, I made the bold claim that watercolors make everything better, a statement that I could be quickly talked out of believing. When it comes to Jarvis (or really anything Marvel related), however, I’m much more likely to stand my ground, for better or worse. This is part of the promotional material for the Iron Man 2 movie a few years ago — an interface that lets you put Iron Man’s helmet on, and also control bits of Tony’s Heads-Up Display via head gestures. Honestly, this looks like something that could be pounded out with very few issues just using FaceOSC and some extra-sparkly sci-fi, Stark-Industries graphics. But even with a technical background, I am still a sucker for sparkly pseudo-science graphics. Sue me! The Marvel marketing machine tends to be pretty clever, if admittedly gimmicky.
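
To back up the claim that the head-gesture part could be pounded out with FaceOSC, here is a rough sketch of one way to read its output: FaceOSC broadcasts the tracked head pose over OSC (by default on port 8338), and a few lines of python-osc can turn the head's yaw into a HUD "swipe". The address, threshold, and mapping here are my guesses, not anything from the Marvel demo.

```python
# Listen for FaceOSC head-pose messages and map big head turns to HUD commands.
# Assumes FaceOSC's default port (8338) and that /pose/orientation carries
# three rotation values; the 0.3 threshold is arbitrary.
from pythonosc import dispatcher, osc_server

def on_orientation(address, *angles):
    if len(angles) < 2:
        return
    yaw = angles[1]                     # treat the second value as left/right turn
    if yaw > 0.3:
        print("HUD: swipe right")
    elif yaw < -0.3:
        print("HUD: swipe left")

router = dispatcher.Dispatcher()
router.map("/pose/orientation", on_orientation)

server = osc_server.ThreadingOSCUDPServer(("127.0.0.1", 8338), router)
print("Listening for FaceOSC on port 8338...")
server.serve_forever()
```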

Elwin

27 Feb 2013

IllumiRoom: Peripheral Projected Illusions for Interactive Experiences // Microsoft Research


Wow! IllumiRoom is a proof-of-concept system from Microsoft Research. It augments the area surrounding a television screen with projected visualizations to enhance the traditional living room entertainment experience. I think this is an excellent and smart implementation using the Kinect and projection. They have taken the confined experience of a TV screen and extended the virtual world into the physical world. I love that they didn’t do a literal extension of the virtual environment, but decided to depict the surroundings with a variety of visual styles (otherwise they could have just used a projector). I can definitely see this system making gaming more engaging and immersive.
 

PhobiAR // HITLabNZ


An advanced interactive exposure therapy system to treat specific phobias, such as the fear of spiders. This system will be based on AR technology, allowing patients to see virtual fear stimuli overlaid on the real world and to interact with the stimuli in real time. I think this is a very interesting usage of Augmented Reality. Even though I know the spider is fake, it still gives me goosebumps seeing how it walks up that person’s hand. I would love to see how effective this treatment is for people who actually have arachnophobia.
 

Leap Motion


Leap Motion represents an entirely new way to interact with your computer. It’s more accurate than a mouse, as reliable as a keyboard and more sensitive than a touchscreen. For the first time, you can control a computer in three dimensions with your natural hand and finger movements. Just look at the accuracy! It’s hard to believe how precise the tracking of the fingers is and how fast it responds. I’m very curious and excited to get my hands on a Leap Motion and test it out for myself. I think this will be the next big thing for designers, similar to what the Kinect was, but this time it’s for your hands/fingers!

Dev

27 Feb 2013

Computer Vision

Reconstructing Rome

Paper for those interested: http://www.cs.cornell.edu/~snavely/publications/papers/ieee_computer_rome.pdf

This was recommended by a friend of mine who is really into CV. This project is amazing: it takes photos from social image sharing sites like Flickr and uses the images, along with their geolocation, to reconstruct 3-D models of Rome’s monuments. Crowdsourcing 3-D imaging. Amazing.
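
The pipeline described in the paper starts by finding corresponding features between pairs of photos before any 3-D geometry is estimated. As a toy illustration of that first step (the real system used SIFT-style features and a full structure-from-motion pipeline with bundle adjustment), here is OpenCV's ORB matcher run on two placeholder photos.

```python
# Match features between two photos of the same monument. Each good match is
# a correspondence that structure-from-motion could later triangulate into a
# 3-D point once camera poses are estimated. Filenames are placeholders.
import cv2

img1 = cv2.imread("colosseum_01.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("colosseum_02.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

print(f"{len(matches)} candidate correspondences between the two photos")
vis = cv2.drawMatches(img1, kp1, img2, kp2, matches[:50], None)
cv2.imwrite("matches.jpg", vis)
```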

Faceoff/FaceAPI

http://torbensko.com/faceoff/

FaceAPI is an API developed to track head motion using your computer’s webcam. This is not too far off from what we have seen with FaceOSC. The cool part is the fact that this API has been integrated into the Source engine. Games developed with the Half-Life 2 engine can use this API to provide head-based gesture control, from realistic 3-D feedback to zooming in when you move closer to the screen. The video goes over this in detail. I simply like the idea of using head interaction in gaming since webcams are so commonplace now.

Battlefield Simulator

https://www.youtube.com/watch?v=eg8Bh5iI2WY

 

Immersive gaming to the MAX. This is by far the most impressive game controller/reality simulator I have seen. It really engages your entire body. I won’t be able to explain everything, but basically CV is used here to map pixels onto the dome based on the user’s body position. Body gestures are also detected via Kinect to trigger game events. On top of that, the game is constantly being scanned to detect when the player gets hurt; when the player is hurt in-game, paintballs are fired, making the pain a reality. Check out the video – it’s long, but worth it.
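
Purely as speculation about how that last piece might work, here is a tiny sketch that watches a captured game frame for the red damage flash and "fires" when enough of the screen turns red. This is not the builders' code; the frame is a placeholder file and the thresholds are made up.

```python
# Speculative sketch: detect a red "damage" flash in a game frame and trigger
# the paintball marker (here, just a print). Thresholds are invented.
import cv2

frame = cv2.imread("game_frame.png")             # placeholder screenshot
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Red hues wrap around 0 in HSV, so combine two ranges.
red = cv2.inRange(hsv, (0, 120, 80), (10, 255, 255)) | \
      cv2.inRange(hsv, (170, 120, 80), (180, 255, 255))

if red.mean() / 255.0 > 0.25:                    # more than 25% of pixels are red
    print("Damage flash detected: fire paintballs!")
```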

 

Patt

27 Feb 2013

Computer Vision 

ALT CTRL by Matt Ruby

Excitebike is one of the three projects in ALT CTRL, a series of works that experiments with how people interact with different, unfamiliar interfaces in digital systems. The game is controlled by the sounds the user makes, picked up by a microphone embedded in the helmet. Instead of a normal hand controller, volume is used to control the gas button and frequency is used to change lanes. The project pushes users out of their comfort zone, steering away from what is considered the norm.
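
A rough sketch of that volume/frequency mapping might look like the following: RMS loudness gates the throttle and the dominant pitch picks a lane. This uses the sounddevice library for microphone input, and the thresholds and lane bands are invented values, not Matt Ruby's.

```python
# Voice-control sketch: louder = gas on, pitch band = lane choice.
import numpy as np
import sounddevice as sd

RATE = 44100
BLOCK = 2048

def control_from_audio(block):
    mono = block[:, 0]
    volume = float(np.sqrt(np.mean(mono ** 2)))           # RMS loudness
    spectrum = np.abs(np.fft.rfft(mono * np.hanning(len(mono))))
    spectrum[0] = 0.0                                      # ignore the DC bin
    pitch_hz = np.fft.rfftfreq(len(mono), 1.0 / RATE)[int(np.argmax(spectrum))]
    gas = volume > 0.02                                    # louder means throttle on
    lane = 0 if pitch_hz < 250 else 1 if pitch_hz < 500 else 2
    return gas, lane

def callback(indata, frames, time, status):
    gas, lane = control_from_audio(indata)
    print(f"gas={'ON ' if gas else 'off'} lane={lane}")

with sd.InputStream(channels=1, samplerate=RATE, blocksize=BLOCK, callback=callback):
    sd.sleep(10000)   # listen for ten seconds
```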

Cubepix by Xavi’s Lab at Glassworks Barcelona

Cubepix is an interactive, real-time projection mapping installation that combines a projector, a Kinect, 8 Arduino boards, openFrameworks, 64 servo motors and 64 cardboard boxes. Users can interact with it to control the motion and illumination of the boxes. This is a project I really like because it integrates software and hardware to create something that exemplifies simplicity. I also like it partly because I am interested in doing a project on projection mapping. I think it would be really fun to play with.

Fabricate Yourself by Karl D.D. Willis

My fascination with 3D-printed objects puts this project at the top of my list. The Kinect is used to capture different poses of people; the depth image is then processed, meshed, and displayed in real time. The models are saved as STL files and printed as 3×3 cm pieces, with dovetail joints added on the sides so the pieces snap together. Ah, it’d be cool to have people do a wave, print them out in 3D, and line a series of them up next to each other around the room.
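
To make the depth-to-print step concrete, here is a minimal sketch that treats a downsampled depth image as a height field, triangulates it, and writes an STL with the numpy-stl library. It skips all the cleanup a real print would need, and the depth image is a placeholder file rather than live Kinect data.

```python
# Turn a depth image into a printable height-field mesh and save it as STL.
import cv2
import numpy as np
from stl import mesh

depth = cv2.imread("kinect_depth.png", cv2.IMREAD_GRAYSCALE)   # placeholder file
depth = cv2.resize(depth, (64, 64)).astype(np.float32) / 255.0 * 10.0  # height in mm

rows, cols = depth.shape
triangles = []
for r in range(rows - 1):
    for c in range(cols - 1):
        # Two triangles per grid cell of the height field.
        p = lambda rr, cc: (float(cc), float(rr), float(depth[rr, cc]))
        triangles.append([p(r, c), p(r + 1, c), p(r, c + 1)])
        triangles.append([p(r + 1, c), p(r + 1, c + 1), p(r, c + 1)])

data = np.zeros(len(triangles), dtype=mesh.Mesh.dtype)
data["vectors"] = np.array(triangles, dtype=np.float32)
mesh.Mesh(data).save("pose.stl")
```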

Dev

27 Feb 2013

Interactivity

Myo

The video for this almost looks too good to be true. It’s a simple armband that measures muscle tension and arm movements using accelerometers and other sensors. From this data, a variety of interactions can be imagined. Users are seen controlling everything from computers to televisions to vehicles. I have always been interested in different forms of off-screen interaction. Something like this is particularly appealing for its broad usage spectrum and its non-intrusive nature.

Oculus Rift

http://www.oculusvr.com/

Commercialized VR! This is really cool. I had heard about Oculus from a friend during CES, where it became very popular. Basically, this is a VR headset which claims to be non-laggy and to provide a huge field of view. The non-lagginess is a huge part of it, since it enables high-frame-rate activities like video games to be realized. John Carmack, creator of Doom, endorses this project, and if it pans out, Doom will be released for the platform. This is super exciting for gamers like me who are always looking at the future of gaming.

Real World In Video Games

http://arstechnica.com/tech-policy/2013/02/montreal-designer-remains-defiant-plans-to-release-new-counter-strike-map/

More gaming-related news. This article talks about a designer who is adamant about recreating real-world areas as video game maps for Counter-Strike. It’s very interesting and hacky, since he turns places people see every day, like the subway, into warzones. Areas that people might have neglected in real life become well known in-game for different reasons (like good cover). Unfortunately, this guy is getting a ton of flak from authorities for doing this, something I think is perfectly legal. How can anyone own the rights to how a place in this world looks?

Marlena

26 Feb 2013

I love the idea of computer vision: it’s an excellent form of sensing that lets people interact with machines in a much more natural and intuitive way than typing or other standard mechanical inputs.

Of course, the most ubiquitous form of consumer computer vision has been made possible by cheap hardware: the Kinect.

Of course there are plenty of games, both those approved by Microsoft and those made by developers and enthusiasts all over the web [see http://www.xbox.com/en-US/kinect/games for some examples], but there are also plenty of cool applications for tools, robotics, and augmented reality.

Here’s a great example of an augmented reality application that uses the Kinect–it tracks the placement of different sized blocks on a table to build a city. It’s a neat project in its ability to translate real objects into a continuous digital model.

Similarly, there is a Kinect hack that allows the user to manipulate Grasshopper files using gestures [see http://www.grasshopper3d.com/video/kinect-grasshopper ]. It is a great prototype for what is probably the next level of interaction: direct tactile feedback between user and device. This particular example lacks a little polish: its feedback isn’t immediate, and there are other minor experience details that could be improved. For an early tactile interface, though, it does a pretty good job. There are plenty of other good projects at http://www.kinecthacks.com/top-10-best-kinect-hacks/

Computer vision is also incredibly important to many forms of semi- or completely autonomous navigation. For example, the CoBot project at CMU uses a combination of mapping and computer vision to navigate the Gates-Hillman Center. [See http://www.cs.cmu.edu/~coral/projects/cobot/ ]. There are a lot of cool things that can be done with autonomous motion, but the implementation is difficult because of the amount of prediction needed to navigate a busy area.

Another great application of computer vision is augmented reality. The projects at http://www.t-immersion.com/projects give a good idea of how much augmented reality work is out there, with everything from face manipulation to driving tiny virtual cars to putting an interface on a blank wall having been implemented in some form. Unfortunately, it is difficult to make augmented reality feel like a completely immersive experience because there is always a disconnect between the screen and the surrounding environment. A good challenge to undertake, perhaps, is how to design the experience so that the flow from screen to environment doesn’t break the illusion for the user. Food for thought.