org.oc.ocvolume.dsp
Class featureExtraction

java.lang.Object
  extended by org.oc.ocvolume.dsp.featureExtraction

public class featureExtraction
extends java.lang.Object

last updated on June 15, 2002
description: feature extraction class used to extract mel-frequency cepstral coefficients from input signal
calls: none
called by: volume, train
input: speech signal
output: mel-frequency cepstral coefficient

Author:
Danny Su

Field Summary
protected  fft FFT
          Fast Fourier Transformation
protected static int fftSize
          FFT Size (must be a power of 2)
protected static int frameLength
          Number of samples per frame
protected  double[][] frames
          All the frames of the input signal
protected  double[] hammingWindow
          Hamming window values
protected static double lowerFilterFreq
          lower limit of filter (or 64 Hz?)
 int numCepstra
          Number of MFCCs per frame. Modified 4/5/06 to be a non-final variable - Daniel McEnnis
protected static int numMelFilters
          number of mel filters (SPHINX-III uses 40)
protected static double preEmphasisAlpha
          Pre-Emphasis Alpha (Set to 0 if no pre-emphasis should be performed)
protected static int shiftInterval
          Number of overlapping samples (usually 50% of frame length)
protected static double upperFilterFreq
          upper limit of filter (or half of sampling freq.?)
 
Constructor Summary
featureExtraction()
           
 
Method Summary
 double[] cepCoefficients(double[] f)
          Cepstral coefficients are calculated from the output of the Non-linear Transformation method
calls: none
called by: featureExtraction
 int[] fftBinIndices(double samplingRate, int frameSize)
          calculates the FFT bin indices
calls: none
called by: featureExtraction. 5-3-05 Daniel McEnnis - parameterized sampling rate and frameSize.
protected  void framing(double[] inputSignal)
          performs Frame Blocking to break down a speech signal into frames
calls: none
called by: featureExtraction
protected static double freqToMel(double freq)
          convert frequency to mel-frequency
calls: none
called by: featureExtraction
protected static double log10(double value)
          calculates logarithm with base 10
calls: none
called by: featureExtraction
 double[] magnitudeSpectrum(double[] frame)
          computes the magnitude spectrum of the input frame
calls: none
called by: featureExtraction
 double[] melFilter(double[] bin, int[] cbin)
          Calculate the output of the mel filter
calls: none
called by: featureExtraction
 double[] nonLinearTransformation(double[] fbank)
          the output of mel filtering is subjected to a logarithm function (natural logarithm)
calls: none
called by: featureExtraction
protected static double[] preEmphasis(short[] inputSignal)
          perform pre-emphasis to equalize amplitude of high and low frequency
calls: none
called by: featureExtraction
 double[][] process(short[] inputSignal, double samplingRate)
          takes a speech signal and returns the Mel-Frequency Cepstral Coefficient (MFCC)
calls: fft
called by: volume, train. 5-3-05 Daniel McEnnis - parameterized sampling rate.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

frameLength

protected static final int frameLength
Number of samples per frame

See Also:
Constant Field Values

shiftInterval

protected static final int shiftInterval
Number of overlapping samples (usually 50% of frame length)

See Also:
Constant Field Values

numCepstra

public int numCepstra
Number of MFCCs per frame. Modified 4/5/06 to be a non-final variable - Daniel McEnnis


fftSize

protected static final int fftSize
FFT Size (must be a power of 2)

See Also:
Constant Field Values

preEmphasisAlpha

protected static final double preEmphasisAlpha
Pre-Emphasis Alpha (Set to 0 if no pre-emphasis should be performed)

See Also:
Constant Field Values

lowerFilterFreq

protected static final double lowerFilterFreq
lower limit of filter (or 64 Hz?)

See Also:
Constant Field Values

upperFilterFreq

protected static final double upperFilterFreq
upper limit of filter (or half of sampling freq.?)

See Also:
Constant Field Values

numMelFilters

protected static final int numMelFilters
number of mel filters (SPHINX-III uses 40)

See Also:
Constant Field Values

frames

protected double[][] frames
All the frames of the input signal


hammingWindow

protected double[] hammingWindow
Hamming window values
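
The page does not say how the window values are computed. As a minimal sketch, assuming the common textbook definition w[i] = 0.54 - 0.46 * cos(2*pi*i / (n - 1)), the values could be generated as follows. The class name HammingWindowSketch and the explicit length parameter are hypothetical and not part of this class.

```java
final class HammingWindowSketch {
    /**
     * Returns an n-point Hamming window using the textbook coefficients
     * 0.54 and 0.46 (an assumption; the class may use other values).
     */
    static double[] hammingWindow(int n) {
        double[] w = new double[n];
        if (n == 1) {
            w[0] = 1.0; // single-point window, avoids division by zero below
            return w;
        }
        for (int i = 0; i < n; i++) {
            w[i] = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (n - 1));
        }
        return w;
    }

    public static void main(String[] args) {
        for (double v : hammingWindow(5)) {
            System.out.println(v);
        }
    }
}
```

Each frame would typically be multiplied sample by sample with these values before the FFT, which tapers the frame edges and reduces spectral leakage.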


FFT

protected fft FFT
Fast Fourier Transformation

Constructor Detail

featureExtraction

public featureExtraction()
Method Detail

process

public double[][] process(short[] inputSignal,
                          double samplingRate)
takes a speech signal and returns the Mel-Frequency Cepstral Coefficient (MFCC)
calls: fft
called by: volume, train. 5-3-05 Daniel McEnnis - parameterized sampling rate.

Parameters:
inputSignal - Speech Waveform (16 bit integer data)
samplingRate - Sampling rate of the input signal
Returns:
Mel Frequency Cepstral Coefficients (64 bit floating point data, one array per frame)

fftBinIndices

public int[] fftBinIndices(double samplingRate,
                           int frameSize)
calculates the FFT bin indices
calls: none
called by: featureExtraction. 5-3-05 Daniel McEnnis - parameterized sampling rate and frameSize.

Returns:
array of FFT bin indices
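
The page does not describe how the indices are derived. A minimal sketch, assuming the usual construction (filter edges spaced evenly on the mel scale between the lower and upper filter frequencies, then mapped to the nearest FFT bin), is shown below. The class name, the helper melToFreq, the 2595/700 mel constants, and passing the filter limits and filter count as explicit arguments (instead of reading lowerFilterFreq, upperFilterFreq and numMelFilters) are all assumptions made for illustration.

```java
final class FftBinIndicesSketch {
    // Assumed mel mapping (a common textbook form, not confirmed by this page).
    static double freqToMel(double freq) {
        return 2595.0 * Math.log10(1.0 + freq / 700.0);
    }

    // Inverse of the assumed mapping above (hypothetical helper).
    static double melToFreq(double mel) {
        return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0);
    }

    /**
     * Returns numMelFilters + 2 bin indices: the lower edge, one centre per
     * filter, and the upper edge, evenly spaced on the mel scale.
     */
    static int[] fftBinIndices(double samplingRate, int frameSize,
                               double lowerFreq, double upperFreq,
                               int numMelFilters) {
        int[] cbin = new int[numMelFilters + 2];
        double lowMel = freqToMel(lowerFreq);
        double highMel = freqToMel(upperFreq);
        for (int i = 0; i < cbin.length; i++) {
            double mel = lowMel + (highMel - lowMel) * i / (numMelFilters + 1);
            // Map the frequency in Hz to the nearest FFT bin.
            cbin[i] = (int) Math.round(melToFreq(mel) / samplingRate * frameSize);
        }
        return cbin;
    }

    public static void main(String[] args) {
        int[] cbin = fftBinIndices(16000.0, 512, 0.0, 8000.0, 10);
        System.out.println(java.util.Arrays.toString(cbin));
    }
}
```

Because the mel scale is compressive, the indices are closely spaced at low frequencies and spread out toward the upper limit.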

melFilter

public double[] melFilter(double[] bin,
                          int[] cbin)
Calculate the output of the mel filter
calls: none
called by: featureExtraction
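
The parameters are not documented on this page. As a sketch only, assuming bin is a magnitude spectrum and cbin holds the edge and centre bin indices of triangular filters (as an fftBinIndices-style method might return), the filter bank output could be computed as follows. The class name and the exact slope weighting are assumptions; the actual implementation may weight bins differently.

```java
final class MelFilterSketch {
    /**
     * Applies triangular filters to a magnitude spectrum.
     * cbin must hold numFilters + 2 non-decreasing indices that are valid
     * positions in bin: lower edge, one centre per filter, upper edge.
     */
    static double[] melFilter(double[] bin, int[] cbin) {
        int numFilters = cbin.length - 2;
        double[] fbank = new double[numFilters];
        for (int k = 1; k <= numFilters; k++) {
            int left = cbin[k - 1];
            int centre = cbin[k];
            int right = cbin[k + 1];
            double sum = 0.0;
            // Rising slope: weight grows from 0 at 'left' towards 1 at 'centre'.
            for (int i = left; i < centre; i++) {
                sum += bin[i] * (i - left) / (double) (centre - left);
            }
            // Falling slope: weight 1 at 'centre' down to 0 at 'right'.
            for (int i = centre; i <= right; i++) {
                double w = (right == centre)
                        ? 1.0
                        : (right - i) / (double) (right - centre);
                sum += bin[i] * w;
            }
            fbank[k - 1] = sum;
        }
        return fbank;
    }

    public static void main(String[] args) {
        double[] flat = {1, 1, 1, 1, 1, 1, 1, 1, 1};
        int[] cbin = {0, 2, 4, 6};
        System.out.println(java.util.Arrays.toString(melFilter(flat, cbin)));
    }
}
```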


cepCoefficients

public double[] cepCoefficients(double[] f)
Cepstral coefficients are calculated from the output of the Non-linear Transformation method
calls: none
called by: featureExtraction

Parameters:
f - Output of the Non-linear Transformation method
Returns:
Cepstral Coefficients
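
The transform itself is not spelled out here. A minimal sketch, assuming the discrete cosine transform commonly used for MFCCs, c[i] = sum over j = 1..M of f[j-1] * cos(pi * i / M * (j - 0.5)), is given below. The class name, the zero-based coefficient index, and passing numCepstra as an argument are assumptions; the class may start at a different index or apply scaling.

```java
final class CepstraSketch {
    /**
     * Computes numCepstra cepstral coefficients from log filter-bank
     * energies using an unscaled DCT-II (assumed form).
     */
    static double[] cepCoefficients(double[] f, int numCepstra) {
        int m = f.length;
        double[] cepc = new double[numCepstra];
        for (int i = 0; i < numCepstra; i++) {
            for (int j = 1; j <= m; j++) {
                cepc[i] += f[j - 1] * Math.cos(Math.PI * i / m * (j - 0.5));
            }
        }
        return cepc;
    }

    public static void main(String[] args) {
        double[] logEnergies = {1.0, 1.0, 1.0, 1.0};
        System.out.println(java.util.Arrays.toString(cepCoefficients(logEnergies, 3)));
    }
}
```

With this indexing, coefficient 0 is the plain sum of the inputs, and a flat input yields zero for all higher coefficients, which is why the DCT decorrelates the filter-bank energies.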

nonLinearTransformation

public double[] nonLinearTransformation(double[] fbank)
the output of mel filtering is subjected to a logarithm function (natural logarithm)
calls: none
called by: featureExtraction

Parameters:
fbank - Output of mel filtering
Returns:
Natural log of the output of mel filtering
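
A minimal sketch of this step follows. The natural logarithm is taken from the description above; the lower bound FLOOR is a hypothetical addition (its value of -50 is an assumption) that keeps a zero filter output from producing negative infinity. The page does not say whether the class applies such a bound.

```java
final class LogCompressionSketch {
    // Hypothetical lower bound for the log output; not taken from this page.
    static final double FLOOR = -50.0;

    /**
     * Takes the natural log of each filter-bank output. Inputs are expected
     * to be non-negative; Math.log(0) is -Infinity and is clamped to FLOOR.
     */
    static double[] nonLinearTransformation(double[] fbank) {
        double[] f = new double[fbank.length];
        for (int i = 0; i < fbank.length; i++) {
            double v = Math.log(fbank[i]);
            f[i] = (v < FLOOR) ? FLOOR : v;
        }
        return f;
    }

    public static void main(String[] args) {
        double[] out = nonLinearTransformation(new double[] {Math.E, 1.0, 0.0});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```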

log10

protected static double log10(double value)
calculates logarithm with base 10
calls: none
called by: featureExtraction

Parameters:
value - Number to take the log of
Returns:
base 10 logarithm of the input value
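
The standard library gained Math.log10 in Java 5, after this class was last dated, so a helper of this kind was needed at the time. A sketch of the usual change-of-base computation follows; the class name is hypothetical and the actual method body is not shown on this page.

```java
final class Log10Sketch {
    /** Base 10 logarithm via change of base: log10(x) = ln(x) / ln(10). */
    static double log10(double value) {
        return Math.log(value) / Math.log(10.0);
    }

    public static void main(String[] args) {
        System.out.println(log10(1000.0));
        System.out.println(Math.log10(1000.0)); // built-in equivalent on Java 5+
    }
}
```

The change-of-base result can differ from Math.log10 in the last bits because of floating point rounding.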

freqToMel

protected static double freqToMel(double freq)
convert frequency to mel-frequency
calls: none
called by: featureExtraction

Parameters:
freq - Frequency
Returns:
Mel-Frequency
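
The conversion formula is not stated on this page. A minimal sketch, assuming the widely used mapping mel = 2595 * log10(1 + f / 700) with f in Hz, is shown below. The constants and the class name are assumptions; the class may use a different variant.

```java
final class MelScaleSketch {
    /** Converts a frequency in Hz to mel using the assumed 2595/700 form. */
    static double freqToMel(double freq) {
        return 2595.0 * Math.log10(1.0 + freq / 700.0);
    }

    public static void main(String[] args) {
        System.out.println(freqToMel(0.0));
        System.out.println(freqToMel(1000.0));
        System.out.println(freqToMel(8000.0));
    }
}
```

Under this mapping 1000 Hz lands very close to 1000 mel, and equal steps in mel correspond to progressively wider steps in Hz.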

magnitudeSpectrum

public double[] magnitudeSpectrum(double[] frame)
computes the magnitude spectrum of the input frame
calls: none
called by: featureExtraction

Parameters:
frame - Input frame signal
Returns:
Magnitude Spectrum array
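
To illustrate what a magnitude spectrum is, the sketch below computes it with a direct O(n^2) discrete Fourier transform. This is a deliberate substitution for the fft class held in the FFT field, whose interface is not shown here; the class name is hypothetical, and returning all n bins (rather than only the first half) is an assumption.

```java
final class MagnitudeSpectrumSketch {
    /**
     * Returns |X[k]| for k = 0..n-1, where X is the DFT of the frame.
     * A direct DFT is used for clarity; an FFT gives the same values faster.
     */
    static double[] magnitudeSpectrum(double[] frame) {
        int n = frame.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            mag[k] = Math.sqrt(re * re + im * im);
        }
        return mag;
    }

    public static void main(String[] args) {
        // A cosine with exactly two cycles across eight samples.
        double[] frame = new double[8];
        for (int t = 0; t < frame.length; t++) {
            frame[t] = Math.cos(2.0 * Math.PI * 2 * t / frame.length);
        }
        System.out.println(java.util.Arrays.toString(magnitudeSpectrum(frame)));
    }
}
```

For a real-valued frame the spectrum is symmetric, so the energy of the two-cycle cosine appears at bin 2 and at its mirror, bin 6.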

framing

protected void framing(double[] inputSignal)
performs Frame Blocking to break down a speech signal into frames
calls: none
called by: featureExtraction

Parameters:
inputSignal - Speech signal (passed as double values)
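
Frame blocking can be sketched as follows. Several details are assumptions: the hop between frame starts is taken to be frameLength minus the overlap (reading shiftInterval as the number of overlapping samples, as its field description says), the last frames are zero-padded, and the frames are returned instead of being stored in the frames field. The class name and the explicit parameters are hypothetical.

```java
final class FramingSketch {
    /**
     * Splits a signal into overlapping frames of frameLength samples.
     * Consecutive frames start (frameLength - overlap) samples apart;
     * samples past the end of the signal are filled with zeros.
     */
    static double[][] framing(double[] signal, int frameLength, int overlap) {
        int hop = frameLength - overlap;
        if (hop <= 0) {
            throw new IllegalArgumentException("overlap must be smaller than frameLength");
        }
        int numFrames = (signal.length + hop - 1) / hop; // ceiling division
        double[][] frames = new double[numFrames][frameLength];
        for (int m = 0; m < numFrames; m++) {
            for (int n = 0; n < frameLength; n++) {
                int idx = m * hop + n;
                frames[m][n] = (idx < signal.length) ? signal[idx] : 0.0;
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        double[] signal = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        for (double[] frame : framing(signal, 4, 2)) {
            System.out.println(java.util.Arrays.toString(frame));
        }
    }
}
```

With the usual 50% overlap, the hop and the overlap are the same number of samples, so either reading of shiftInterval gives the same frames.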

preEmphasis

protected static double[] preEmphasis(short[] inputSignal)
perform pre-emphasis to equalize amplitude of high and low frequency
calls: none
called by: featureExtraction

Parameters:
inputSignal - Speech Signal (16 bit integer data)
Returns:
Speech signal after pre-emphasis (returned as double values)
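
A minimal sketch of pre-emphasis, assuming the common first-order filter y[n] = x[n] - alpha * x[n-1] with the first sample passed through unchanged. The class name and the explicit alpha parameter (standing in for preEmphasisAlpha) are assumptions; the page does not show the filter equation.

```java
final class PreEmphasisSketch {
    /**
     * Applies y[n] = x[n] - alpha * x[n-1]. With alpha = 0 the signal is
     * returned unchanged (as doubles), matching the note on preEmphasisAlpha.
     */
    static double[] preEmphasis(short[] inputSignal, double alpha) {
        double[] out = new double[inputSignal.length];
        if (inputSignal.length == 0) {
            return out;
        }
        out[0] = inputSignal[0]; // no previous sample to subtract
        for (int n = 1; n < inputSignal.length; n++) {
            out[n] = inputSignal[n] - alpha * inputSignal[n - 1];
        }
        return out;
    }

    public static void main(String[] args) {
        short[] samples = {100, 100, 0};
        System.out.println(java.util.Arrays.toString(preEmphasis(samples, 0.95)));
    }
}
```

The filter attenuates slowly varying (low frequency) content and leaves rapid changes largely intact, which is how it balances the spectrum before analysis.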