Indoor Location Detection

Arezou Pourmir

Abstract: This project is a classification problem: it tries to distinguish specific places from each other. We use acoustic waves sent from the speaker of a mobile device, such as a cell phone or laptop, and then try to extract the reflected components of the signal received at the microphone. The approximated amplitude and delay of each reflected component serve as our feature vectors for distinguishing different places. To examine the proposed method, the classification of three different places is tested. In the training phase the experiments are done at 10 different, randomly chosen points in each room; in the test phase a signal is sent once at a random point in a random room, and then KNN, ML, and MAP are used for classification. The classification accuracy is around 80% to 90%. A simple KNN (K = 5) appears to work better than ML and MAP for this problem. For the distribution approximation I applied a Parzen window to the extracted samples.

1. Introduction:

You are sitting in a conference room and suddenly your cell phone rings loudly. Yes, you forgot to switch it to silent mode. You set it to silent mode. Later you go back home and your friend calls you, but you don't hear it: you forgot to switch the phone back to normal mode. How can we solve this problem? In this project we present a new method for indoor location detection that offers a good solution; this is just one of its applications.

This problem is different from the usual indoor localization problem, in which the goal is to estimate the location of a mobile device in an indoor space such as a mall using existing access points in that space. Ours is a classification problem, while those are mainly estimation problems. We use acoustic signals via the speaker and microphone, whereas other approaches use, for example, camera phones and image-processing techniques [1]. The common difficulty in both cases is that there is no access to a GPS signal.

Our approach is to use acoustic signals. I transmit a known, fixed acoustic waveform and then listen to the reflections of the transmitted signal. All we need is a speaker and a microphone on a mobile device such as a cell phone, with enough processing capability and memory; all of today's smartphones appear adequate for this purpose. Reference [2] is a good, brief source on the properties and behavior of acoustic signals. In [3] the absorption coefficients of different materials are plotted. The low absorption coefficient (high reflection coefficient) of most surfaces in a room, for example brick, concrete, wood, and painted brick or concrete walls, convinced us that studying the reflections of acoustic signals is both feasible and capable of revealing a lot of information about the environment.

The report is organized as follows. Section 2 explains the feature selection and extraction. Section 3 explains the classification based on those extracted features. The results of the real experiments and the conclusions are presented in Sections 4 and 5, respectively.

2. Feature Selection and Extraction:

As mentioned above, we use acoustic signals in our classification problem, and we want to use the reflection structure of each place as its signature. To make the analysis simpler, as in most classification problems, we extract features from the received signals and work on those features, instead of working directly with the complete data, which can be very complicated. The features we work with are the delays of the reflections and their corresponding attenuation factors, where the attenuation factor is the reciprocal of the amplitude of a reflection.

Given this choice of features, the feature extraction part of our problem is related to radar and sonar problems, which perform a similar task: send a specific signal and then analyze the received signal to extract the delayed reflected waveforms. Feature extraction in those settings is simpler in the sense that they do not face the same processing-complexity limits we do, and they can therefore use better, but much more complicated, methods for the signal extraction part [4], [5].

Two different transmit waveforms were tested: a few periods of a sinusoid, and a Golay code [6]. The transmitted waveforms are depicted in Figure 1. The figure shows a 4-bit Golay code in which each bit is sent as one period of a sine wave for +1, or the inverted period for -1. The longer the code, the better it suppresses noise. Golay codes work in pairs and yield the impulse response of the room; the pair of the [+1 +1 -1 +1] code shown in the figure is [+1 +1 +1 -1]. In this work, however, I tested only the single Golay code, handled in the same way as the sine wave.

Figure 1. Two types of transmitted waveforms: (a) five periods of a sinusoid wave; (b) a 4-bit Golay code.
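As a rough illustration, the sketch below generates the two probe waveforms in Python. The sample rate and the helper-function names are assumptions made for the example; the 2 kHz tone, the five sine periods, and the 4-bit Golay code with its complementary pair are taken from the report.

```python
import numpy as np

FS = 44100        # sample rate in Hz (assumption, not stated in the report)
F0 = 2000         # tone frequency in Hz (value given later in the report)

def sine_burst(n_periods: int, fs: int = FS, f0: float = F0) -> np.ndarray:
    """A short burst of n_periods of a sine wave."""
    t = np.arange(int(round(n_periods * fs / f0))) / fs
    return np.sin(2 * np.pi * f0 * t)

def golay_burst(code, fs: int = FS, f0: float = F0) -> np.ndarray:
    """Each code bit (+1/-1) is sent as one period of the sine or its inverse."""
    period = sine_burst(1, fs, f0)
    return np.concatenate([b * period for b in code])

tx_sine = sine_burst(5)                        # Figure 1(a): five periods of a sinusoid
tx_golay = golay_burst([+1, +1, -1, +1])       # Figure 1(b): 4-bit Golay code
tx_golay_pair = golay_burst([+1, +1, +1, -1])  # complementary pair (not used in the tests)
```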

The frequency of each period of the sinusoid wave was selected to be 2 kHz. Increasing this frequency gives higher resolution for the extracted delays; it was kept at 2 kHz to make sure the method works with different mobile devices. After transmitting either of these waveforms and recording the received signal immediately afterwards (to do so, the microphone must be turned on before the transmission), we can extract the reflected waveforms by correlating the original signal with the received one.

After the correlation we must distinguish the peaks (local maxima) corresponding to the delays we are interested in. We proceed as follows. We first take the absolute value of the correlator output. We then need a method to extract the envelope of this signal; this step is essential, because the periodic nature of the signal produces many local maxima that have no physical meaning. To extract the envelope I applied an integrator over every 0.0005 s of the correlator output, which is one period of the sine wave; this eliminates the peaks of the individual sine periods, which carry no information for feature extraction. The largest peaks of the envelope are then selected as the most powerful reflections. The locations of the peaks give the delays, and the reciprocals of their amplitudes give the attenuation factors. In the real tests in Section 4, the 10 most powerful reflections were selected. It is very important to scale all the attenuation factors to the largest one, which is always the first peak heard: the first peak is the original transmitted sound and should be larger than all the reflections.
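A minimal sketch of this feature-extraction chain is given below, assuming the transmitted and received signals are available as NumPy arrays. The greedy peak picking and the blanking of one window around each selected peak are implementation choices of this sketch, not details specified in the report.

```python
import numpy as np

def extract_reflections(rx: np.ndarray, tx: np.ndarray,
                        fs: int = 44100, n_peaks: int = 10):
    """Return (delays_sec, attenuations) of the n_peaks strongest reflections."""
    # Absolute value of the cross-correlation of received and transmitted signals.
    corr = np.abs(np.correlate(rx, tx, mode="full"))[len(tx) - 1:]

    # Envelope: integrate over one period of the 2 kHz tone (0.0005 s) to
    # suppress the meaningless local maxima inside each sine period.
    win = max(1, int(round(0.0005 * fs)))
    env = np.convolve(corr, np.ones(win), mode="same")

    # Greedy peak picking (sketch): repeatedly take the largest sample and
    # blank one window around it so the same reflection is not picked twice.
    env_work = env.copy()
    peaks = []
    for _ in range(n_peaks):
        i = int(np.argmax(env_work))
        peaks.append((i, float(env[i])))
        lo, hi = max(0, i - win), min(len(env_work), i + win)
        env_work[lo:hi] = 0.0

    peaks.sort(key=lambda p: p[0])    # order by arrival time
    t0, a0 = peaks[0]                  # first (direct) arrival, assumed largest
    delays = np.array([(i - t0) / fs for i, _ in peaks])
    # Attenuation factor = reciprocal of the amplitude, scaled to the first peak.
    attenuations = np.array([a0 / max(a, 1e-12) for _, a in peaks])
    return delays, attenuations
```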

Figure 2 shows a sample of the received signal and its correlator output; the transmitted signal was the sine wave described above. The correlator output shows that the method explained above, extracting the peaks (local maxima) of the envelope of the absolute value of the correlator output, works in practice.

Figure 2. A sample of the received signal and its correlator output (a sinusoid wave was transmitted): (a) received signal containing the original signal and the reflected waves; (b) the correlator output.

3. Classification:

After extracting the feature vectors from the received data, the next step is classification. Several methods were tried:

- KNN on the feature vectors
- ML and MAP on the feature vectors
- Working with transformed feature vectors

Each method is briefly explained below, together with how it was applied to this specific problem.

3.1. KNN on the feature vectors:

Two different approaches were tested here. We can gather all the delays of the different rooms, each labeled with its room, and treat each feature as a 1-D variable; or we can work with 2-D feature vectors consisting of a delay and its corresponding attenuation factor.

An important issue that strongly affects performance is the definition of distance. We want to take a vote among the K feature vectors nearest to the feature vector extracted from the room we want to classify. How do we decide which feature vectors are closest? What is the best definition of distance for our problem? We can work with a Hamming distance or with the L2 norm. There are strong reasons for using a Hamming distance: it relieves the effect of missing delays, which can happen because of inaccurate delay extraction or because of changes in the room structure. The L2 norm also has its advantages; for instance, it takes into account the relative strength (in other words, the importance) of the delays, which can be very helpful if the feature extraction is good. The experiments show that the Hamming distance performs much better than the L2 norm.
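One plausible reading of the Hamming-distance KNN is sketched below: delays are quantized to a fixed step, and the distance is the number of unmatched delay bins between two feature vectors. Both the quantization step and this particular distance definition are assumptions, since the report does not spell them out.

```python
import numpy as np
from collections import Counter

DELAY_BIN = 0.0005   # quantization step in seconds (assumption)

def hamming_distance(d1, d2, step: float = DELAY_BIN) -> int:
    """Count delays of one vector that have no quantized match in the other."""
    b1 = set(np.round(np.asarray(d1) / step).astype(int))
    b2 = set(np.round(np.asarray(d2) / step).astype(int))
    return len(b1 ^ b2)   # symmetric difference: unmatched delay bins

def knn_classify(test_delays, train_delays, train_labels, k: int = 5):
    """Vote among the k training feature vectors closest to the test vector."""
    dists = [hamming_distance(test_delays, d) for d in train_delays]
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```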

3.2. ML and MAP on the feature vectors:

Another approach is to first estimate the distribution of the feature vectors. The distribution can be defined in 1-D by considering only the delay, in 2-D by considering the delay and the attenuation factor together, or in higher dimensions by considering the distribution of the whole vector of delays in each test. The last option makes the most sense because of the high correlation between the delays at different points in each place, although it is more complicated. We can write a 1-D feature vector as [d1 d2 d3 ... d10] (assuming the 10 largest reflections in the received signal are kept), or a 2-D feature vector as [(α1, d1) (α2, d2) ... (α10, d10)]. The four possible definitions of the distribution are:

- 1-D delay distribution, f(d): all the delays in a feature vector are assumed independent, which is not true in reality.
- 2-D attenuation-factor and delay distribution, f(α, d): the same independence assumption for the reflections (attenuation factor and delay) as in the previous case.
- n-D delay distribution, f(d1, d2, ..., dn): the joint distribution of the n strongest reflections in the received signal. It captures the correlation between the delays and should work better than the two previous cases, but of course at the expense of more complexity; in addition, the number of samples required for the training phase is roughly n times larger than in the previous cases.
- n-D attenuation-factor and delay distribution, f((α1, d1), (α2, d2), ..., (αn, dn)): the same as the previous case, but each of the n variables is a pair of a delay and its corresponding attenuation factor.

To calculate the distribution in the 1-D case, we applied a Parzen window to the histogram of the extracted delays for each room to obtain a smoother distribution. In this case we are assuming that the delays are independent and that we have one delay distribution per room, so the final probability that a specific feature vector (a vector of delays) belongs to a given trained room is the product of the probabilities of the individual delays in that feature vector. To use maximum likelihood (ML), we select the room with the largest product of probabilities. To use maximum a posteriori (MAP) instead of ML, we multiply the likelihood of each room by the a priori probability of that room, which must be obtained in some way. A sensible scheme is to start with ML and program the mobile device to update the a priori probability of each room automatically after each successful detection. The device can also decide automatically when it can switch to MAP mode, or when it is better to switch back to ML mode, for instance when there are rapid changes in the a priori probabilities, which suggest they are no longer reliable.
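A rough sketch of the 1-D ML/MAP classifier follows. A Gaussian Parzen kernel and its width are assumptions (the report does not state the kernel), and log-likelihoods are summed instead of multiplying raw probabilities, purely to avoid numerical underflow.

```python
import numpy as np

def parzen_pdf(train_delays, h: float = 0.001):
    """Return a function d -> estimated density, using Gaussian kernels of width h."""
    train_delays = np.asarray(train_delays, dtype=float)
    def pdf(d):
        z = (d - train_delays) / h
        return float(np.mean(np.exp(-0.5 * z * z)) / (h * np.sqrt(2 * np.pi)))
    return pdf

def classify_ml_map(test_delays, room_pdfs: dict, priors: dict = None):
    """ML if priors is None, otherwise MAP with the given room priors."""
    best_room, best_score = None, -np.inf
    for room, pdf in room_pdfs.items():
        # Product of per-delay probabilities, computed as a sum of logs.
        log_lik = sum(np.log(pdf(d) + 1e-12) for d in test_delays)
        if priors is not None:
            log_lik += np.log(priors[room])
        if log_lik > best_score:
            best_room, best_score = room, log_lik
    return best_room
```

Here room_pdfs would map each room name to parzen_pdf applied to the delays collected in that room during training, and priors would hold the per-room a priori probabilities when running in MAP mode.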

3.3. Working with the transformed feature vectors:

Considering the structure of our problem, in every test we will probably see the reflections from the six sides of the room (assuming all rooms are rectangular prisms), at whatever point we happen to stand. The values of those delays alone are not very informative, but the sum of the two delays from each pair of opposite sides is the same at every point. This can be a great signature for each room: three fixed sums of pairs of delays. To use this property we can compute, in the training phase, the sums of all pairs of extracted delays for each room, identify the sums that stay fixed across the different tests, and save them as the signature of that room. For detection we select the room whose fixed sums of delays best match those extracted from the test data (a sketch of this procedure is given after Figure 3). If our position is fixed during the test, we obtain a few sums of pairs of delays; we can work with a single test sample, or use more samples to improve the accuracy of the delays and attenuations at that point. If we are moving, we can repeat the test at different points and compare the results to extract the fixed sums of pairs from the test data itself; this step prevents false identification of fixed pair sums, which can happen in the former case because of the lack of information.

Figure 3 shows the distribution of the delays, as scatter plots and histograms, for two different places: our lab and the conference room. The experiments cover only some specific points in each place, but there are obvious differences between the delay distributions; the most obvious is the maximum delay in each case. This property can at least tell us that the room under test is not room x (assuming that for room x the maximum delay was always smaller than the maximum delay seen in the test room). The histograms help us see what is happening at the dense regions of the scatter plots.

Figure 3. Scatter plots and histograms of the approximated delays in two different rooms, at some specific points: (a) scatter plot of the delays in our lab; (b) histogram of the delays in our lab; (c) scatter plot of the delays in the conference room; (d) histogram of the delays in the conference room.
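As referenced in Section 3.3, here is a rough sketch of the pair-sum signature: pair sums are computed from each training feature vector, sums that recur across most training points are kept as the room signature, and at test time each room is scored by how many of its stored sums reappear. The matching tolerance and the recurrence threshold are assumptions of this sketch.

```python
import numpy as np
from itertools import combinations

def pair_sums(delays) -> np.ndarray:
    """Sums of all pairs of delays in one feature vector."""
    return np.array([a + b for a, b in combinations(delays, 2)])

def recurring_sums(training_delay_vectors, tol: float = 0.0005) -> np.ndarray:
    """Keep pair sums that show up (within tol) at most training points."""
    all_sums = [pair_sums(d) for d in training_delay_vectors]
    candidates = np.concatenate(all_sums)
    keep = []
    for s in candidates:
        hits = sum(np.any(np.abs(sums - s) < tol) for sums in all_sums)
        if hits >= 0.8 * len(all_sums):   # recurrence threshold (assumption)
            keep.append(s)
    return np.array(keep)

def score_room(test_delays, room_signature, tol: float = 0.0005) -> int:
    """Number of the room's fixed sums that reappear among the test pair sums."""
    test_sums = pair_sums(test_delays)
    return int(sum(np.any(np.abs(test_sums - s) < tol) for s in room_signature))
```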

4. Experiments and Results:

The test setup is as follows. I selected three rooms to test the proposed method: my room, our living room, and our lab. I did all the experiments with my laptop, treating it as my mobile device; for instance, you can imagine that I train my laptop to control the sound settings and run some programs automatically depending on where I am. During the training phase, for each of these rooms I walked around for one minute, covering the different points where I might later be during the test phase. During that minute, 30 waveforms were transmitted, two seconds apart. The waveform used in these experiments is five periods of the 2 kHz sinusoid wave described earlier. I turned on the microphone before the transmission so as not to miss the signal, which can happen otherwise.

So for each room in the training phase we have 30 tests. The feature extraction method is applied to each test, and the 10 strongest reflections (delays and their amplitudes) are saved on the mobile device, since they are required for the KNN test. To use ML or MAP in the classification, we also need to calculate and save the distribution of the features. The first and second distribution definitions described in Section 3.2 were tested: I worked both with the raw distribution (without any preprocessing) of the quantized delay and with the joint distribution of the quantized delay and quantized attenuation factor, and then applied a Parzen window to smooth those distributions and bring them closer to reality.

KNN with K = 5 gave the best classification rate, around 90%. The ML method, with either the first or the second distribution definition, had a classification rate of around 80% or less. The reliability of these classification rates depends on many factors, for example the environmental noise, changes in the decoration of each room, the training phase, and the test data. But at least these approximate classification rates convinced us that the method is a promising way of doing indoor location detection.

Classification using the transformed feature vectors was also tested. Although this method sounds good, it has several problems in practice. As mentioned before, it needs more accurate feature vectors; given the error of the feature extraction method, plus the error in estimating the start point, the total error appears too large for this method. Its complexity can also be high, in both the training and the test phase.

Another practical problem was extracting the exact start point of the transmission, which affects the accuracy of the extracted features. We might be able to obtain it from the hardware directly, but in our tests I extracted it in software: I used a relative threshold on the amplitude of the correlator output, and the location of the first peak in the correlator output exceeding that threshold is taken as the start point. All the other peaks, which represent the reflections of the original transmitted signal, are measured relative to that start point.
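A minimal sketch of this start-point detection, assuming the threshold is a fixed fraction of the envelope's global maximum (the report does not give the actual threshold value):

```python
import numpy as np

def find_start_index(env: np.ndarray, rel_threshold: float = 0.5) -> int:
    """Index of the first local maximum exceeding rel_threshold * max(env)."""
    thr = rel_threshold * float(np.max(env))
    for i in range(1, len(env) - 1):
        if env[i] >= thr and env[i] >= env[i - 1] and env[i] >= env[i + 1]:
            return i
    # Fallback: no qualifying local maximum found, use the global maximum.
    return int(np.argmax(env))
```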

5. Conclusions and Future work:

We started with a new idea and the question of whether acoustic signals can be used for indoor location detection. We showed that they can, and this idea could be used in future devices. Its applications appear to go beyond location detection, and we are working on them. Relying on acoustic signals, and on just a speaker and a microphone, which can be found in almost all devices of interest, makes the idea broadly applicable.

To improve the performance of the algorithm, I am working on long Golay codes to use in place of the sinusoid waveform, which should considerably increase the accuracy of the feature extraction, and therefore of the classification. They can also give a much better approximation of the start point, using exactly the same method as before. Using the transformed features, although difficult when used alone, appears very useful for improving the results of the other methods; we can add it as a final step, simply to check the results of the other methods, in which case it can be both simple and useful.

References:

[1] N. Ravi, P. Shankar, A. Frankel, A. Elgammal, and L. Iftode, "Indoor localization using camera phones," 7th IEEE Workshop on Mobile Computing Systems and Applications, April 2006.
[2] A very good source on the properties of acoustic signals: http://mysite.du.edu/~jcalvert/waves/soundwav.htm
[3] An online absorption coefficient chart: http://www.sae.edu/reference_material/pages/coefficient%20chart.htm
[4] T. G. Manickam, R. J. Vaccaro, and D. W. Tufts, "A Least-Squares Algorithm for Multipath Time-Delay Estimation," IEEE Trans. on Signal Processing, vol. 42, no. 11, pp. 3229-3233, 1994.
[5] M. Wax and A. Leshem, "Joint Estimation of Time Delays and Directions of Arrival of Multiple Reflections of a Known Signal," IEEE Trans. on Signal Processing, vol. 45, no. 10, pp. 2477-2484, 1997.
[6] J. Abel and D. Berners, "Signal Processing Techniques for Digital Audio Effects," http://ccrma.stanford.edu/courses/424/, 2005.