Long Range Acoustic Classification

Approved for public release; distribution is unlimited.

Authors: Ned B. Thammakhoune, Stephen W. Lang
Sanders, a Lockheed Martin Company
P. O. Box 868
Nashua, New Hampshire 03061-0868

Abstract

This paper introduces the use of dynamic features for robust target recognition of ground vehicles. Most current approaches rely on instantaneous spectral features such as those derived from harmonically related spectral lines. A significant drawback of these approaches is that their reliance on low-amplitude spectral lines (10-20 dB below the dominant line) severely limits classification range. The strongest line is often detectable well before the secondary lines. Dynamic features extracted directly from the strongest spectral line, if they successfully characterize the target, will extend the range of operation several times over.

In this report, a complete experimental evaluation of the effectiveness of dynamic features is conducted. The analysis is performed using a database consisting of approximately two hundred acoustic signatures collected from six unique vehicles. A number of features captured from the dynamic characteristics of the spectral line are evaluated. Classification performance is measured and presented in terms of confusion matrices.

As an additional test of the classifier development tools developed for this task, we added selected instantaneous spectral measurements to the dynamic features and re-tested. We found that the performance of the classifiers using the mixed spectral and dynamic features was excellent, but blind testing of the classifiers that were developed (testing against vehicle runs that were not used during classifier development) showed disappointing results.

Introduction

The primary challenge for ground vehicle classification using acoustic signatures is the search for robust features for class recognition.
In the past, feature design has been driven primarily by the fundamental physics of the engine mechanics, which translates acoustic energy into a series of narrow-band spectral peaks. These harmonically related signal components are directly related to the engine firing rate and track slap. It is therefore natural to classify vehicles using features that relate to the makeup of these harmonic lines, which are usually detected by a Harmonic Line Association (HLA) algorithm. One difficulty these techniques encounter is the low probability of detection of the secondary spectral lines.

It has been shown that the acoustic signature of ground vehicles is nonstationary due to many factors. Some of these dynamics are believed to come from the engine itself and some from the influence of the environment, such as the terrain, atmosphere, and geologic characteristics. In this paper, we investigate means of extracting features from the dynamic aspects of the signals. The application of dynamic features in classification is motivated by the recent success of many speech recognition algorithms.

Our primary objective is to evaluate the classification effectiveness of transient/dynamic features that can be computed by tracking a single spectral line. If successful, this will extend the tactically useful ranges for ground vehicles several times over. We used the ARL ACIDS database and a multivariate Gaussian (MVG) classifier to quantitatively evaluate our features.

[Figure 1] [Figure 2]

Form SF298 Citation Data
Report Date ("DD MON YYYY"): 00001999
Title and Subtitle: Long Range Acoustic Classification
Report Type: N/A
Performing Organization Name(s) and Address(es): Sanders, a Lockheed Martin Company, P. O. Box 868, Nashua, New Hampshire 03061-0868
Distribution/Availability Statement: Approved for public release, distribution unlimited
Document Classification: unclassified
Classification of Abstract: unclassified
Classification of SF298: unclassified
Limitation of Abstract: unlimited
Number of Pages: 6

Feature Design

The primary signal space from which we extracted features is the time-frequency distribution. We examined both the short-time Fourier transform (STFT) and the reduced interference distribution (RID). The RID produces better spectral resolution than the STFT. It uses a single-sided spectrum of the real input signal, obtained by applying the Hilbert transform, which effectively doubles the frequency resolution. In addition, the smoothing effect of its exponential kernel reduces the cross interference among closely spaced spectral peaks. It does, however, introduce significant amplitude distortion. In our application, for features that depend only on the variation of the maximum frequency bin, we used the RID to capture more detail of the spectral variation. For features that depend on the amplitude, we used the STFT as the feature's signal space.

We focused mostly on means of measuring the time-evolving characteristics of the strongest spectral line. Figures 1 and 2 show examples of the RID of two different vehicles under the same driving conditions. They clearly illustrate different rates of change in the maximum frequency of the strongest spectral line. The images in Figures 1 and 2 were locally normalized to enhance the spectral line over the time scale. It is also important to note that all the spectral lines share the same dynamic characteristics over time; thus it is adequate to capture the dynamic behavior from a single line without any loss of information. A list of the features that we extracted is shown in Table 1.
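The idea of tracking the strongest spectral line and measuring how its frequency evolves can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses a plain windowed DFT per frame instead of the RID, the Hann window is an assumption, the function names `track_strongest_line` and `dynamic_features` are hypothetical, and the test signal is synthetic. The 1-second frame and 50 percent overlap follow the values given in the paper's extraction step. The zero-crossing count is taken here as the number of sign changes between consecutive derivative samples, which is one plausible reading of the feature name.

```python
import cmath
import math


def track_strongest_line(signal, fs, frame_s=1.0, overlap=0.5):
    """Return (times, f_max): centre time and peak-bin frequency of each frame."""
    n = int(frame_s * fs)                      # samples per frame
    hop = max(1, int(n * (1.0 - overlap)))     # frame advance in samples
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    times, f_max = [], []
    for start in range(0, len(signal) - n + 1, hop):
        frame = signal[start:start + n]
        mean = sum(frame) / n                  # remove the DC bias of the frame
        frame = [(x - mean) * w for x, w in zip(frame, window)]
        best_bin, best_mag = 1, -1.0
        for k in range(1, n // 2):             # skip DC, stop below Nyquist
            acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                      for i, x in enumerate(frame))
            if abs(acc) > best_mag:
                best_bin, best_mag = k, abs(acc)
        times.append((start + n / 2) / fs)
        f_max.append(best_bin * fs / n)
    return times, f_max


def dynamic_features(times, f_max):
    """Compute a few statistics of F_max(t) and its time derivative."""
    dfdt = [(f2 - f1) / (t2 - t1)
            for f1, f2, t1, t2 in zip(f_max, f_max[1:], times, times[1:])]

    def std(values):
        m = sum(values) / len(values)
        return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

    return {
        "std_f_max": std(f_max),
        "num_positive_dfdt": sum(1 for d in dfdt if d > 0),
        "std_dfdt": std(dfdt),
        "sum_dfdt": sum(dfdt),
        # Assumed definition: sign changes between consecutive derivative samples.
        "zero_crossings_dfdt": sum(1 for a, b in zip(dfdt, dfdt[1:]) if a * b < 0),
    }


# Synthetic tone sweeping from 10 Hz to 20 Hz over 20 s (hypothetical data).
fs = 64
signal = [math.sin(2 * math.pi * (10.0 * t + 0.25 * t * t))
          for t in (i / fs for i in range(20 * fs))]
times, f_max = track_strongest_line(signal, fs)
features = dynamic_features(times, f_max)
```

With 1-second frames the frequency resolution is 1 Hz, so the tracked line moves in 1 Hz steps; a finer distribution such as the RID would resolve slower variations.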
Standard deviation of F_max
Number of positive df_max/dt
Standard deviation of df_max/dt
Standard deviation of da_max/dt over df_max/dt
Sum of df_max/dt
Number of zero crossings of df_max/dt
Sum of df_max/dt / f_max over F_max
Table 1

Feature Extraction and Optimization

This section briefly describes how we systematically extracted features from the acoustic signature. We first removed the DC bias by performing trend removal. We then calculated the STFT and RID time-frequency distributions. The frame size was set to 1 second with 50 percent overlap. Based on the signal-to-noise ratio, we tracked the strongest spectral line and extracted the maximum frequency bin versus time, F(t). From the tracked spectral line, we then captured all the features of interest. Following that, we associated each feature vector with the type of vehicle, environment, and speed using ground truth.

We performed a quick analysis of each feature by inspecting its probability density function (pdf). Figures 3 and 4 show examples of the pdfs of two features. As depicted, class separation is obvious among some classes, while others exhibit considerable overlap. The pdfs also approximate a Gaussian distribution to some degree. The pdf analysis gave us an early indication that this is a complex class boundary problem.

[Figure 3] [Figure 4]

We then considered feature analysis that accounts for feature correlation. We chose a sub-optimal multidimensional feature ranking technique to perform further feature analysis. Ideally, we would prefer the exhaustive search method, in which every combination of M out of N features is tried for the best performance. However, because the number of combinations increases prohibitively with the number of features, that implementation is impractical. We thus resorted to a sub-optimal search method known as "add-on" to find a reasonably good feature subset. The algorithm first evaluates the classification performance of each of the N features independently and selects the single best feature. It then evaluates the performance of the N-1 two-feature subsets that contain that feature, and selects the best. The process repeats in the same manner, each time adding the one feature that maximizes the performance. This method evaluates M(2N+1-M)/2 subsets to reach the best M-feature subset.

     Vehicle type           Classifier output (vehicle number)     Fraction
                             1   2   3   4   5   6   7   8   9     correct
  1  Heavy Track Vehicle    47   6   0   0   1   3   0   4   0      0.77
  2  Heavy Track Vehicle     9  16   0   0   4   4   0   1   2      0.44
  3  Heavy Wheel Vehicle     6   0   0   0   0   2   0   1   0      0.00
  4  Heavy Track Vehicle     6   5   0   3   7   0   0   0   6      0.11
  5  Heavy Wheel Vehicle     6   2   0   0  21   1   0   1   8      0.54
  6  Heavy Wheel Vehicle     3   0   0   0   0  27   0   2   4      0.75
  7  Heavy Wheel Vehicle     2   0   0   0   0   1   0   3   0      0.00
  8  Heavy Track Vehicle    16   0   0   0   0   1   0  15   1      0.45
  9  Heavy Track Vehicle     0   2   0   0   6   6   0   0   7      0.33
Table 2

Classification Performance Analysis

To evaluate the target recognition performance of the optimized feature set, we generated a classification performance ROC. Because of the limited number of target signatures available for each class, we trained and tested the classifier using the single hold-out method to maximize the training set. This minimizes the error due to under-training. We chose the classical multivariate Gaussian classifier as the primary classifier for this analysis. We also performed the same analysis using PNN and NNC classifiers for comparison purposes.

The multivariate Gaussian classifier (MVG) is a classical classifier that assumes a Gaussian distribution of the underlying features. It parameterizes each class by its mean and covariance matrix and assigns a feature vector to the class whose mean is nearest. Its performance degrades if the assumed models are mismatched. The Probabilistic Neural Network (PNN), on the other hand, is a non-parametric neural network classifier that makes no assumptions about the underlying feature distribution.
It uses a Gaussian kernel function with a smoothing coefficient as the activation function for its neurons and classifies by summing the feature vector's distance from all training data. Its performance degrades if the training data are limited.

Table 2 shows the result of target identification for all 9 vehicles. The recognition percentages for vehicles 1 and 6 are the highest, in the 70 percent range. Vehicles 2, 5, 8, and 9 scored between 33 and 54 percent. For vehicles 3, 4, and 7, the very low scores reflect the fact that there were too few acoustic signatures for these classes to be properly trained.

We grouped the vehicles of the same definition together and performed the same classification analysis. The result is shown in Table 3. Similar results were obtained. Again, class 2 scores the lowest because of the small population in its class.

     Class                  Classifier output (counts)     Fraction correct
  1  Heavy track vehicle    61    8    4   24    0          0.63
  2  Heavy wheel vehicle     9    9    0    9    0          0.00
  3  Light track vehicle     4   23    2   18    1          0.48
  4  Light wheel vehicle     5    2   23   12    0          0.54
  5  Heavy track            10    5    5   34    0          0.63
Table 3

  Class     Output (counts)     Fraction correct
  Heavy     174    25            0.87
  Light      41    28            0.41

  Class     Output (counts)     Fraction correct
  Track     129    49            0.72
  Wheel      26    71            0.71
Tables 4 and 5
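The "add-on" search described under Feature Extraction and Optimization can be sketched as a greedy forward selection. The sketch below is illustrative only: `add_on_search` and `toy_score` are hypothetical names, and the additive toy score stands in for the classification performance that the paper uses as its selection criterion. The counter confirms the subset count quoted in the text, since N + (N-1) + ... + (N-M+1) = M(2N+1-M)/2.

```python
def add_on_search(n_features, m, score):
    """Greedy 'add-on' selection.

    score(subset) must return a number where larger is better.
    Returns (selected feature indices in order of selection,
             number of subsets evaluated).
    """
    selected = []
    remaining = list(range(n_features))
    evaluated = 0
    while len(selected) < m:
        best_feature, best_score = None, float("-inf")
        for f in remaining:
            s = score(selected + [f])       # try adding one more feature
            evaluated += 1
            if s > best_score:
                best_feature, best_score = f, s
        selected.append(best_feature)       # keep the best addition
        remaining.remove(best_feature)
    return selected, evaluated


# Hypothetical per-feature utilities; a real score would be classifier accuracy.
utility = [0.2, 0.9, 0.5, 0.7, 0.1, 0.4, 0.3]


def toy_score(subset):
    return sum(utility[f] for f in subset)


n, m = len(utility), 3
selected, evaluated = add_on_search(n, m, toy_score)
```

For N = 7 and M = 3 the search evaluates 7 + 6 + 5 = 18 subsets, compared with 35 three-feature subsets for an exhaustive search; the gap widens quickly as N grows.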

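The single hold-out evaluation of a multivariate Gaussian classifier, and the way a confusion matrix such as Table 2 is tallied, can be sketched as follows. Several things here are assumptions rather than details from the paper: the sketch is restricted to two features so that the covariance inverse can be written out by hand, it assumes equal class priors, it classifies by the Gaussian log-likelihood (Mahalanobis distance plus a log-determinant term), and the data are synthetic. All function names are hypothetical.

```python
import math
import random


def fit_gaussian(rows):
    """Estimate mean and covariance (sxx, sxy, syy) from 2-D feature rows."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def log_likelihood(x, model):
    """Gaussian log-likelihood up to a constant shared by all classes."""
    (mx, my), (sxx, sxy, syy) = model
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mx, x[1] - my
    mahalanobis = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (mahalanobis + math.log(det))


def hold_out_confusion(samples):
    """Leave each sample out in turn, train on the rest, tally true vs predicted."""
    labels = sorted({label for _, label in samples})
    confusion = {t: {p: 0 for p in labels} for t in labels}
    for i, (x, true_label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        models = {c: fit_gaussian([f for f, l in train if l == c]) for c in labels}
        predicted = max(labels, key=lambda c: log_likelihood(x, models[c]))
        confusion[true_label][predicted] += 1
    return confusion


def fraction_correct(confusion):
    """Per-class fraction of samples on the diagonal of the confusion matrix."""
    return {t: row[t] / sum(row.values()) for t, row in confusion.items()}


# Two well-separated synthetic classes (hypothetical data, fixed seed).
rng = random.Random(0)
samples = ([((rng.gauss(0, 1), rng.gauss(0, 1)), "A") for _ in range(20)]
           + [((rng.gauss(10, 1), rng.gauss(10, 1)), "B") for _ in range(20)])
confusion = hold_out_confusion(samples)
scores = fraction_correct(confusion)
```

Each row of `confusion` sums to the number of signatures in that class, so the per-class score is the diagonal count divided by the row total, which is how the last column of Table 2 relates to its counts (for example, 47 of 61 gives 0.77).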
Combined Spectral and Dynamic Features

In this part of the effort, we combined traditional spectral features with the dynamic features described above. The complete list of features is provided in Table 6.

Frequency of loudest tone
Ratio of (frequency of second loudest tone)/frequency of loudest tone
Ratio of (frequency of third loudest tone)/frequency of loudest tone
Ratio of (frequency of third loudest tone)/frequency of second loudest
Ratio of (power of second loudest tone)/power of loudest tone
Ratio of (power of third loudest tone)/power of loudest tone
Ratio of (power of third loudest tone)/power of second loudest
Number of zero crossings of df_max/dt (20 second window) (loudest tone)
Sum of df_max/dt (20 second window) (loudest tone)
Standard deviation of df_max/dt (20 second window) (loudest tone)
Number of zero crossings of df_max/dt (7 second window) (loudest tone)
Sum of df_max/dt (7 second window) (loudest tone)
Number of zero crossings of df_max/dt (7 second window) (loudest tone)
Number of zero crossings of df_max/dt (20 second window) (second loudest tone)
Sum of df_max/dt (20 second window) (second loudest tone)
Standard deviation of df_max/dt (20 second window) (second loudest tone)
Number of zero crossings of df_max/dt (7 second window) (second loudest tone)
Sum of df_max/dt (7 second window) (second loudest tone)
Standard deviation of df_max/dt (7 second window) (second loudest tone)
Ratio of frequency of loudest seismic tone/loudest acoustic tone
Ratio of power in lowest seismic tone/power in loudest seismic tone
Ratio of frequency of lowest seismic tone/frequency of loudest seismic tone
Number of seismic tones that match acoustic tones in frequency
Ratio of frequency of lowest acoustic tone/loudest acoustic tone
Ratio of power in lowest acoustic tone/power in loudest acoustic tone
Ratio of frequency of lowest (harmonic) tone/loudest acoustic tone
Number of acoustic tones in target
Number of seismic tones in target
Frequency of loud harmonic/frequency of loud tone
Power of loud harmonic/power of loud tone
Power of low frequency harmonic/power of loud tone
Frequency of loudest seismic tone
Frequency of loud harmonic
Frequency of low harmonic
Instantaneous spectral width of loudest tone
Average spectral width of loudest tone
Variance of the spectral width of loudest tone
Instantaneous spectral width of second loudest tone
Average spectral width of second loudest tone
Variance of the spectral width of loudest tone
Ratio of spectral width of the loudest and second loudest tones
Ratio of average spectral width of the loudest and second loudest tones
Mean of the absolute value of df/dt for loudest tone
Total acoustic power in the 0-100 Hz band in the direction of the target
Total acoustic power in the 100-200 Hz band in the direction of the target
Broadband acoustic power in the 0-100 Hz band in the direction of the target (tones excluded)
Broadband acoustic power in the 100-200 Hz band in the direction of the target (tones excluded)
Total acoustic power in the 0-67 Hz band in the direction of the target
Total acoustic power in the 67-132 Hz band in the direction of the target
Total acoustic power in the 132-200 Hz band in the direction of the target
Broadband acoustic power in the 0-67 Hz band in the direction of the target (tones excluded)
Broadband acoustic power in the 67-132 Hz band in the direction of the target (tones excluded)
Broadband acoustic power in the 132-200 Hz band in the direction of the target (tones excluded)
Fundamental frequency of the loudest harmonic set
Acoustic power level of the first 8 harmonics of the set (normalized by power of the loudest tone)
Ordered harmonic numbers of the loudest 3 harmonics
Fundamental frequency of the loudest harmonic set (alternate fundamental estimation technique)
Acoustic power level of the first 8 harmonics of the set (alternate technique)
Ordered harmonic numbers of the loudest 3 harmonics (alternate technique)
Number of harmonic sets detected
Table 6

Since we wished to test the utility of seismic features, and we did not have the seismic portion of the ACIDS database, we switched to using our own database, which contains a small number of target runs collected at Aberdeen in December 1998 and at Fort Irwin in February 1999. The tools described earlier were used to analyze these features and to rank them in terms of their utility as classification features.

The initial run showed that the frequency information (frequency of the loudest tone and fundamental frequency of the loudest harmonic set) provided the most valuable features available. After considering this result, we decided that, with only a small number of target runs and a limited number of vehicle speeds, our sampling of frequencies was too limited for frequency to be used as a classifier input.
After excluding the two frequency features, we re-ran the analysis and found that the seismic-related features (ratio of frequency of loudest seismic tone/loudest acoustic tone, ratio of power in lowest seismic tone/power in loudest seismic tone, ratio of frequency of lowest seismic tone/frequency of loudest seismic tone, number of seismic tones that match acoustic tones in frequency, and number of seismic tones in target) were among the top-ranked features. After closer examination, we found that the hardware configuration for the seismic sensor changed dramatically between the Aberdeen and Irwin data collection exercises, and the classifiers were using this difference to distinguish between the US vehicles collected at Aberdeen and the Soviet vehicles from Ft. Irwin. After failing to find a method to compensate for the hardware changes, we decided to exclude these features from subsequent analyses.

The final analysis, with the feature set now pruned to include only the reliable features, yielded a short list of the features that are most valuable for classification:

Ratio of the frequency of the second loudest tone to the loudest tone
Ratio of the powers of the second loudest and loudest tones
Mean df/dt for the second loudest tone (7 second window)
Average width of the second loudest tone
Mean df/dt for the loudest tone
Number of acoustic tones detected
Average spectral width of the loudest tone
Variance of the spectral width of the loudest tone

With these 8 features, the vehicle ID performance was about 75% correct. A blind test was performed using a few runs that were excluded from the data sets used to develop the classifier. The blind test showed that the classifier performance was only about 55% correct. From this, we conclude that the number of vehicle runs in the target database (an average of 3 pass-bys per vehicle type) was insufficient to develop a reliable classifier.
A final test was performed using just the relative power of the first 8 harmonics of the loudest harmonic set that was detected. Using these 8 features, the classifier performance against the train/test set was only about 55%. The performance on the blind set, however, was also 55% correct, from which we conclude that these features are robust in the face of a small training set.

Summary

The search for robust features will continue to be an important area of target recognition for ground vehicles. Different aspects of the signals should be exploited to extract many uncorrelated features for versatility and effectiveness. In this report, our preliminary investigation [1] shows moderate success in using dynamic features alone for target ID across different class category partitions. It is unlikely that these features are highly correlated with HLA-based features, simply because of the way they were extracted. This suggests the possibility of a performance improvement when the two feature sets are combined and optimized for the best combination subset. In the future, more sophisticated methods of extracting dynamic features should be studied.

[1] This material is based upon work supported by the Army Research Laboratory under contract DAAL-01-96-2-0001.