Friday, October 24, 2014

Gradient Domain Cloning



I'm taking Computational Photography this semester, and we recently had a homework assignment to implement Gradient Domain Cloning. Gradient Domain Cloning is a technique for blending two images so that they fit together seamlessly.

In the past, the naive approach was to find two images with similar backgrounds and simply paste one on top of the other. Using Wikipedia's creepy example of an eye on a hand, this is what would happen:

(images: eye + hand = naive paste; source: Wikipedia)

The result is okay, but generally not great. It turns out, however, that if we analyze the images in the gradient domain (the gradient part of Gradient Domain Cloning), we can automate this process!

What is the gradient domain, you may ask? It can mean several things depending on your field, but for our purposes it means quantifying how pixel values change in the X and Y directions. Considering one color channel at a time, we can estimate the slope in the X direction at a pixel by taking the difference between the values of the pixels to its left and right and dividing by 2 (the distance between those pixels). Repeating the same process with the pixels above and below gives the slope in the Y direction.

(Editor's note: what I'm describing is a generalization of a gradient in the context of image processing. In mathematics, the gradient is a well-defined concept, and Wikipedia has a great article on it.)
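To make this concrete, here is a minimal NumPy sketch of the per-channel gradient described above (the function name and interface are my own, for illustration only):

    import numpy as np

    def image_gradient(channel):
        """Central-difference gradient of one color channel.

        channel: 2-D float array of pixel values.
        Returns (gx, gy): the slopes in the X and Y directions.
        """
        gx = np.zeros_like(channel)
        gy = np.zeros_like(channel)
        # (right neighbor - left neighbor) / 2; border pixels are left at 0
        gx[:, 1:-1] = (channel[:, 2:] - channel[:, :-2]) / 2.0
        # (pixel below - pixel above) / 2
        gy[1:-1, :] = (channel[2:, :] - channel[:-2, :]) / 2.0
        return gx, gy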

Performing this gradient operation on an image and then visualizing the output gives us something similar to edge-detection filters:

(source: Wikipedia)

Now, what would happen if we performed our naive approach from above, but this time with the gradient of each image? Wikipedia comes to our rescue with another creepy eye/hand picture to illustrate the result (Apple: you should patent this!):

(source: Wikipedia)

This brings us one step closer to our goal of automated blending. Since we chose two source images with similar backgrounds (e.g., the eye and the hand have the same skin tone), their gradient images unsurprisingly fit together nicely without any extra blending. The next step, then, is to apply a magical mathematical operation to convert the combined image from the gradient domain back to the original image domain.
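In code, combining the two gradient fields is just a masked selection, reusing NumPy and the image_gradient sketch from above (here gx_fg/gy_fg come from the foreground, gx_bg/gy_bg from the background, and mask is a boolean array that is True wherever the foreground belongs):

    gx = np.where(mask, gx_fg, gx_bg)
    gy = np.where(mask, gy_fg, gy_bg)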

Converting back is not so simple, because we can't just apply the functional inverse of taking a gradient (i.e., integrating in 2D); if we did, we would get back the naive approach's result. We also need to take into account the colors of the foreground and background images so that they match at the seam between the two.

This paper goes into significant depth on how the solution is derived, but suffice it to say, we can reduce the problem to solving a linear system of equations via Poisson's equation (in fact, Prof. Barnes simplified this to equation #7 on the homework assignment page). Once in this form, we can use SciPy's built-in solver for linear systems to solve for the blended image. Below is the result for our eye/hand example:

(source: Wikipedia)
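To make that linear system concrete, here is a minimal per-channel Poisson solve (my own sketch of the standard formulation, not necessarily identical to equation #7 or to the code in my repo). For each masked pixel p, we require 4·f(p) − Σ f(q) over masked neighbors q to equal Σ bg(q) over unmasked neighbors plus Σ (fg(p) − fg(q)) over all four neighbors q; stacking one such equation per masked pixel gives a sparse system that SciPy solves directly:

    import numpy as np
    from scipy.sparse import lil_matrix, csr_matrix
    from scipy.sparse.linalg import spsolve

    def poisson_blend_channel(fg, bg, mask):
        """Solve the discrete Poisson equation for one color channel.

        fg, bg: 2-D float arrays of the same shape (foreground, background).
        mask:   boolean array, True where foreground pixels are cloned.
                Assumed not to touch the image border.
        """
        ys, xs = np.nonzero(mask)
        n = len(ys)

        # Map each masked pixel to its row/column in the linear system.
        idx = -np.ones(mask.shape, dtype=int)
        idx[ys, xs] = np.arange(n)

        A = lil_matrix((n, n))
        b = np.zeros(n)
        # One equation per masked pixel (slow but clear; real code would vectorize).
        for k, (y, x) in enumerate(zip(ys, xs)):
            A[k, k] = 4  # coefficient of the unknown pixel itself
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Guidance term: the foreground's gradient across this edge.
                b[k] += fg[y, x] - fg[ny, nx]
                if mask[ny, nx]:
                    A[k, idx[ny, nx]] = -1   # neighbor is also an unknown
                else:
                    b[k] += bg[ny, nx]       # neighbor is a known background pixel

        result = bg.copy()
        result[ys, xs] = spsolve(csr_matrix(A), b)
        return result

The background terms on the right-hand side are what pin the reconstruction to the background's colors at the seam; that is exactly the constraint plain 2-D integration was missing.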

I've posted my code for this project on GitHub; the project is written in Python.

To create the image at the top of this post, use ron2.jpg as the foreground image, Mona_Lisa.jpg as the background image, and ron_matte.png to define the boundaries of the foreground image. All of these are found in the "imgs" directory on the GitHub page.
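For illustration, tying those files to the sketches above might look roughly like this (a hypothetical driver, not the actual interface of my script; it assumes the foreground and background images are pre-aligned and the same size, and imageio is just an assumed choice of I/O library):

    import numpy as np
    import imageio.v3 as iio  # assumed I/O library, not necessarily what the repo uses

    fg = iio.imread("imgs/ron2.jpg").astype(float)
    bg = iio.imread("imgs/Mona_Lisa.jpg").astype(float)

    matte = iio.imread("imgs/ron_matte.png")
    if matte.ndim == 3:       # collapse an RGB(A) matte to a single channel
        matte = matte[..., 0]
    mask = matte > 128        # white pixels mark the foreground region

    # Blend each color channel independently, then stack them back together.
    result = np.dstack([
        poisson_blend_channel(fg[..., c], bg[..., c], mask)
        for c in range(3)
    ])
    iio.imwrite("blended.png", np.clip(result, 0, 255).astype(np.uint8))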

Happy cloning!

Tuesday, May 6, 2014

Ergodox Keyboard


I spend most of my day staring at a computer screen, so over the past couple of years I've been transitioning my workspace to be more ergonomic. My most recent step toward this goal was building an ErgoDox keyboard.

The ErgoDox is an ergonomic, split keyboard designed to protect the user from repetitive strain injuries (e.g. carpal tunnel). Furthermore, the code and hardware design for the keyboard are completely open source, and the keyboard uses a Teensy microcontroller to interface with the computer. The Teensy can be reprogrammed with any key layout you want and shows up on the computer as a generic USB keyboard. This is especially useful because there are no special drivers to install and no extra software required to remap keys.

Most people buy an ErgoDox keyboard as a kit from the group-buy service Massdrop. I originally planned to do this, but I didn't want to wait several weeks or months for the kit to arrive, so I opted to source the parts on my own. The parts list is available on the ErgoDox website, and Matt Adereth has a great blog post about his experience sourcing the parts. I bought my parts from the same places Matt mentioned, with three deviations: I didn't build a case, I combined a set of rainbow keycaps from WASD Keyboards with a blank modifier keycap set from Signature Plastics, and I chose Cherry MX Brown switches.

At the time I was considering buying a kit from Massdrop, getting the PCB, electronic components, blank keycaps, and a case would have cost ~$240. Sourcing the parts on my own, I was able to get the PCB, electronic components, rainbow keycaps, blank modifier keys, and Cherry MX Brown switches for ~$212.

I held off on buying sheets of acrylic for a case because I wasn't able to secure access to a laser cutter at the time. I don't feel bad about this, though, because I only plan to keep the keyboard at my desk. To keep the diodes on the bottom from shorting and to keep my desk from getting scratched, I put rubber feet (salvaged from an old project) on the bottom of the PCB.

~~~

I'm now a couple of weeks into using the keyboard, and I am very satisfied with the project. I find that I don't notice the keyboard's layout as I use it, which is good, because it means I've memorized it. Currently, I'm using a modified QWERTY layout, but I'm also interested in trying more optimized layouts such as Dvorak or Colemak. The only annoyance I've found as a result of the ErgoDox is that I have trouble adjusting back to traditional keyboards. Specifically, I have a habit of thinking that traditional keyboards have a split spacebar in which each half has a different function. However, this is a small price to pay for pain-free typing when I use the ErgoDox!

(I'm sorry I don't have more photos of the keyboard--I took several during the build process but forgot to import them into my computer before wiping my camera's SD card)

Saturday, April 12, 2014

Dark Souls Watchface

(photo of the finished watchface)

I participated in the Hack.UVA hackathon this weekend, and while there, I created a Dark Souls-themed watchface for my Pebble smartwatch. The watchface displays the time, date, year, Bluetooth connection status, and charging status of the watch.

During normal use, the center of the watchface shows Solaire of Astora (a character from Dark Souls) praising the Sun:

(screenshot: Solaire praising the Sun)

...and when the watch is charging, Solaire is shown resting at a bonfire (a key gameplay element of Dark Souls):

(screenshot: Solaire resting at a bonfire)

Below is a quick video of the transition taking place (I was quite happy to get this working). The underlying Pebble OS can tell when the charging cable is plugged in and trigger an event for the watch to switch images.


The code for this watchface is open source and available for download on my GitHub page. Enjoy!