CSE 599 N1

Assignment 1: Gesture-Based Interaction System

In this assignment, you will build a gesture-based interaction system using the acoustic sensors in your device. The system must identify three different gestures, which you define yourself.

How should it work?

Your system should use the device's speaker to continuously emit a tone (a sine wave at a fixed frequency of your choice). When a user performs a hand gesture, their hand reflects this sound wave, and the reflection varies with the gesture performed. The system should record the reflections using the device's microphone and then process the recording to identify the gesture.
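As a starting point for the emitting side, here is a minimal sketch of generating the tone in Python using only the standard library. The sample rate, the 18 kHz tone frequency, and the helper names make_tone and write_wav are assumptions chosen for illustration, not requirements of the assignment. Actual playback and recording depend on your platform's audio API and are not shown.

```python
import math
import struct
import wave

SAMPLE_RATE = 48_000   # Hz; assumed, check what your hardware supports
TONE_HZ = 18_000       # Hz; assumed tone, must stay below SAMPLE_RATE / 2


def make_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE, amplitude=0.5):
    """Return `duration_s` seconds of a sine wave as floats in [-amplitude, amplitude]."""
    n_samples = int(round(duration_s * sample_rate))
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * n) for n in range(n_samples)]


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Save float samples as 16-bit mono PCM so an audio player can loop the file."""
    frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


# One second holds a whole number of cycles (18000), so the buffer can repeat
# back to back without a discontinuity at the loop point.
tone = make_tone(TONE_HZ, 1.0)
```

Calling write_wav("tone.wav", tone) would produce a file that can be looped in any player while you experiment with the recording side. Choosing a buffer length that holds a whole number of cycles matters because a jump at the loop point would add broadband clicks to the signal you later analyze.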

What processing do you need to implement?

You can detect the gestures using either of the two methods discussed in class.

Signal Processing: You can use signal processing techniques to identify changes in the reflection. One common method is to look for Doppler shifts. Each gesture can create a different pattern of Doppler shifts at the microphone. By identifying the sequence of positive and negative Doppler shifts, you can classify your gestures.
Machine Learning: You can use machine learning classification techniques such as SVM to classify your gestures. In this case, you first identify features that distinguish the gestures. Features could be anything such as amplitude, phase changes, or FFT coefficients. You then train your model by performing each of your gestures multiple times. Once you are satisfied with your model, you can test it.
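For the signal processing route, the sketch below illustrates the idea of classifying by the sequence of positive and negative Doppler shifts. It assumes the standard approximation for a tone reflected off a slowly moving object, where the shift is roughly 2 * v * f / c. The gesture names, the 10 Hz dead zone, and the function names are hypothetical; the per-frame shift values would come from your own spectral analysis.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C


def doppler_shift_hz(tone_hz, hand_velocity_mps, speed_of_sound=SPEED_OF_SOUND):
    """Approximate shift of a tone reflected off a hand moving at `hand_velocity_mps`.

    Positive velocity means the hand moves toward the device. The factor of 2
    appears because the sound travels to the hand and back. The approximation
    holds when the hand is much slower than sound.
    """
    return 2.0 * hand_velocity_mps * tone_hz / speed_of_sound


def shift_signs(shifts_hz, min_shift_hz=10.0):
    """Collapse per-frame shifts into runs of '+' and '-', skipping small shifts."""
    signs = []
    for shift in shifts_hz:
        if abs(shift) < min_shift_hz:
            continue  # treat small shifts as noise (assumed dead zone)
        sign = "+" if shift > 0 else "-"
        if not signs or signs[-1] != sign:
            signs.append(sign)
    return "".join(signs)


# Hypothetical gesture names keyed by sign pattern; define your own three.
GESTURES = {"+": "push", "-": "pull", "+-": "tap"}


def classify(shifts_hz):
    """Map a sequence of per-frame Doppler shifts to a gesture name."""
    return GESTURES.get(shift_signs(shifts_hz), "unknown")
```

With an 18 kHz tone, a hand moving toward the device at 0.5 m/s gives doppler_shift_hz(18000, 0.5) of about 52 Hz, which indicates the frequency resolution your analysis needs. A call such as classify([0, 40, 5, -45, 0]) yields the pattern "+-" and therefore the hypothetical "tap" gesture.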

What device can be used?

The device can be a laptop or a smartphone.

What will be tested?

Grading will be based on a "demo or die" presentation of the system and on the submitted code. You should perform all three gestures, and the system should display the name of each gesture performed on the screen.

Checkpoints and recommended steps:

Write code to access the speaker and microphone on your device. Your system should play a tone continuously in a loop through the speaker while simultaneously recording from the device's microphone. You can use any language.
Stream the microphone data to the processing function. You can use shared buffers, threads, etc. for the implementation. Again, you can use any language or software, such as MATLAB.
Examine the microphone data in MATLAB, Python (matplotlib/IPython), or another data visualization tool of your choice.
Perform the gestures one by one, leaving a sufficient time gap between them. While performing each gesture, use the visualization tool to see how the received signal changes.
In addition to plotting the raw signal, you may need to plot FFTs of the signal to see how it varies for each gesture.
Once you understand the differences between the gestures, write code to identify the gesture automatically.
Your code should work in two stages. For each time period, it should first determine whether a gesture was performed. If one was, it should then classify the gesture.
Functions that could be useful include the FFT and the spectrogram.
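The streaming and two-stage steps above can be sketched as follows, using only the Python standard library. This is one possible arrangement, not a prescribed design: a thread pushes frames into a shared queue, and the consumer first checks for sideband energy around the tone (detection) and then compares the upper and lower sidebands (classification). The Goertzel algorithm stands in for a full FFT by evaluating a few frequencies. All constants, the function names, and the synthetic fake_microphone source are assumptions for illustration; a real system would fill the queue from the audio API's capture callback.

```python
import math
import queue
import threading

SAMPLE_RATE = 48_000                     # Hz, assumed
TONE_HZ = 18_000                         # Hz, assumed tone
FRAME_LEN = 2_400                        # 50 ms per frame, so bins are 20 Hz apart
SIDEBAND_OFFSETS_HZ = (40, 60, 80, 100)  # assumed Doppler shifts to probe
DETECT_RATIO = 0.01                      # assumed threshold; tune on real recordings


def goertzel_power(samples, freq_hz):
    """Power of `samples` at one frequency (Goertzel algorithm, a single-bin DFT)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s_prev, s_prev2 = x + coeff * s_prev - s_prev2, s_prev
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def analyze_frame(frame):
    """Stage 1: was there a gesture? Stage 2: if so, in which direction?"""
    carrier = goertzel_power(frame, TONE_HZ)
    upper = sum(goertzel_power(frame, TONE_HZ + d) for d in SIDEBAND_OFFSETS_HZ)
    lower = sum(goertzel_power(frame, TONE_HZ - d) for d in SIDEBAND_OFFSETS_HZ)
    if carrier == 0.0 or (upper + lower) / carrier < DETECT_RATIO:
        return "none"
    return "toward" if upper > lower else "away"


def synth_frame(shift_hz, echo_amplitude=0.2):
    """Synthetic microphone frame: the direct tone plus one Doppler-shifted echo."""
    w_tone = 2.0 * math.pi * TONE_HZ / SAMPLE_RATE
    w_echo = 2.0 * math.pi * (TONE_HZ + shift_hz) / SAMPLE_RATE
    return [math.sin(w_tone * n) + echo_amplitude * math.sin(w_echo * n)
            for n in range(FRAME_LEN)]


def fake_microphone(frames_out, shifts_hz):
    """Stand-in for a real capture thread: pushes one synthetic frame per shift."""
    for shift in shifts_hz:
        frames_out.put(synth_frame(shift))
    frames_out.put(None)  # sentinel: no more audio


def run(shifts_hz):
    """Consume frames from the shared queue and label each one."""
    frames = queue.Queue(maxsize=8)
    producer = threading.Thread(target=fake_microphone, args=(frames, shifts_hz))
    producer.start()
    labels = []
    while True:
        frame = frames.get()
        if frame is None:
            break
        labels.append(analyze_frame(frame))
    producer.join()
    return labels
```

Calling run([0, 60, -60]) labels an idle frame, a frame with an upward-shifted echo, and a frame with a downward-shifted echo. The synthetic frames place every frequency exactly on an analysis bin; on real recordings the tone and echoes will generally fall between bins, so applying a window function to each frame before analysis is likely needed to keep the tone's spectral leakage from masking the sidebands.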

References

SoundWave
Viband
SoundWave video