Title: Collection of Matlab resources

1. Audio Processing

  mfcc.m   Computes Mel Frequency Cepstral Coefficients (from Toolbox by Malcolm Slaney)

  extractMFCC.m   Extracts all MFCC vectors for a given audio file (not yet commented)

EXTRACTMFCC: Extract Mel Frequency Cepstral Coefficients from a file
  or an audio vector. This function extracts MFCCs using mfcc.m by
  reading frames of audio data from the file or audio vector.
  Several constants are declared and can be changed:
      windowSize = 256 (audio samples per frame; each frame yields one
      MFCC feature vector)
      rangeRead = 4000 * windowSize (number of audio samples read from
      the file at one time)

  CEPS = EXTRACTMFCC(audiofile) for a wave audio file

  CEPS = EXTRACTMFCC(audioVector, samplingFrequency) for an audio vector
  If samplingFrequency is not passed, a warning is printed and a default
  frequency of 48 kHz is used.
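
  As a rough illustration of how the two constants interact, the sketch below counts frames and read chunks for a hypothetical one-second signal at the default sampling frequency. It assumes non-overlapping frames and drops a trailing partial frame. Whether extractMFCC.m does either is not stated here, so the variable names and the framing policy should be read as assumptions rather than as the function's actual behavior.

```matlab
% Hypothetical framing arithmetic; not taken from extractMFCC.m.
windowSize = 256;                 % samples per frame (from the help text)
rangeRead  = 4000 * windowSize;   % samples read per chunk (from the help text)
fs         = 48000;               % default sampling frequency (from the help text)

audioVector = zeros(fs, 1);       % stand-in signal: one second of silence
numSamples  = numel(audioVector);

% Assumption: frames do not overlap and a trailing partial frame is dropped.
numFrames       = floor(numSamples / windowSize);
numChunks       = ceil(numSamples / rangeRead);
frameDurationMs = 1000 * windowSize / fs;

% Reshape the usable samples so that each column holds one frame.
frames = reshape(audioVector(1:numFrames * windowSize), windowSize, numFrames);

fprintf('%d frames of %.2f ms each, read in %d chunk(s)\n', ...
        numFrames, frameDurationMs, numChunks);
```

  Under these assumptions, each column of frames would be passed to the MFCC computation and would produce one column of CEPS.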

  speakerchange.m   Applies BIC to determine speaker changes in an audio file. Requires audio file (wave) as input. Depends on mfcc.m. Returns table of segmented speakers.

SPEAKERCHANGE: Compute speaker changes in an audio file based on BIC

  T = SPEAKERCHANGE(audiofile) for a wave audio file returns a table of
  speaker changes where rows are segmented speakers and columns are:
      1: beginning of segment in wave samples
      2: end of segment in wave samples
      3: beginning of segment in MFCC CEPS index
      4: end of segment in MFCC CEPS index

  [T, C] = SPEAKERCHANGE(audiofile) returns segment table T, as well as
  extracted cepstral coefficients in matrix C with rows(C) as MFCC
  coefficient dimension and columns(C) as feature vectors over time.

  [T, C, B] = SPEAKERCHANGE(audiofile) returns segment table T, cepstral
  coefficients C, as well as vector B representing the BIC values over
  the entire audio vector.
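
  To read the segment table in seconds rather than samples, columns 1 and 2 can be divided by the sampling frequency. The sketch below uses a made-up table T and an assumed sampling frequency fs; it also assumes that sample indices are 1-based, which the help text does not confirm.

```matlab
% Hypothetical segment table in the four-column layout described above.
% All values are made up for illustration.
fs = 48000;                          % assumed sampling frequency of the wave file
T  = [     1   96000     1   375;
       96001  240000   376   937];

% Assumption: sample indices are 1-based, so sample 1 sits at time 0.
startSec = (T(:, 1) - 1) / fs;       % time offset of the first sample in each segment
endSec   = T(:, 2) / fs;             % time just after the last sample in each segment
durSec   = endSec - startSec;        % segment duration in seconds

for k = 1:size(T, 1)
    fprintf('Segment %d: %.2f s to %.2f s (%.2f s)\n', ...
            k, startSec(k), endSec(k), durSec(k));
end
```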

  Implementation: Alexander Haubold
      3/17/2006:  Added 2nd level (fine) BIC evaluation on top of
                    first level (coarse) BIC evaluation. Precision of
                    speaker change detection increases significantly.

  Paper detailing BIC: S.S. Chen, P.S. Gopalakrishnan. Speaker,
    environment and channel detection and clustering via the Bayesian
    information criterion. Proceedings of the DARPA Broadcast News
    Transcription and Understanding Workshop, Lansdowne, VA, 1998,
    pp. 127-132.
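
  The criterion from the paper cited above can be sketched for a single candidate change point. The following is a minimal illustration on synthetic two-dimensional features, not the code in speakerchange.m. It models each segment as one full-covariance Gaussian and uses a widely used form of the delta-BIC statistic with penalty weight lambda = 1; the exact scaling used by speakerchange.m may differ. The helper names (mlcov, logdet, penalty, dbic) and the test signals are hypothetical.

```matlab
% Sketch of a single-split delta-BIC test; helper names are hypothetical.
% Columns of X are feature vectors over time, as in the CEPS matrix.

% Maximum-likelihood covariance (normalized by N, not N-1).
mlcov   = @(X) ((X - mean(X, 2)) * (X - mean(X, 2))') / size(X, 2);
logdet  = @(X) log(det(mlcov(X)));

% Model-complexity penalty for d-dimensional full-covariance Gaussians.
penalty = @(d, N, lambda) 0.5 * lambda * (d + d * (d + 1) / 2) * log(N);

% Delta-BIC for splitting X after column i. Positive values favor the
% two-model hypothesis, i.e. a change point at i.
dbic = @(X, i, lambda) ...
    0.5 * (size(X, 2) * logdet(X) ...
           - i * logdet(X(:, 1:i)) ...
           - (size(X, 2) - i) * logdet(X(:, i+1:end))) ...
    - penalty(size(X, 1), size(X, 2), lambda);

% Deterministic synthetic features: 200 vectors of dimension 2.
t    = 1:200;
same = [sin(0.7 * t); cos(1.3 * t)];       % one homogeneous "speaker"

changed = same;                            % shift the mean of the second half
changed(:, 101:end) = changed(:, 101:end) + [5; -5];

bicSame    = dbic(same,    100, 1);
bicChanged = dbic(changed, 100, 1);

fprintf('delta-BIC without change: %.2f\n', bicSame);
fprintf('delta-BIC with change:    %.2f\n', bicChanged);
```

  In this sketch the homogeneous signal gives a negative value and the shifted signal a positive one. A full detector would evaluate the statistic at many candidate positions and look for positive maxima.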