Collection of Matlab resources

1. Audio Processing
mfcc.m
Computes Mel Frequency Cepstral Coefficients (from the Auditory Toolbox by Malcolm Slaney).
extractMFCC.m
Extracts all MFCC vectors for a given audio file (not yet commented).
EXTRACTMFCC: Extract Mel Frequency Cepstral Coefficients from a file or an audio vector. This function extracts MFCCs using mfcc.m by reading frames of audio data from the file or audio vector. Several constants are declared and can be changed:
windowSize = 256 (audio samples per frame, each frame resulting in one MFCC feature vector)
rangeRead = 4000 * windowSize (number of audio samples read from the file at one time)
CEPS = EXTRACTMFCC(audiofile) for a wave audio file.
CEPS = EXTRACTMFCC(audioVector, samplingFrequency) for an audio vector. If samplingFrequency is not passed, a warning is printed and a default frequency of 48 kHz is used.
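The framing arithmetic described above can be sketched as follows. This is only an illustration, not the code in extractMFCC.m: it assumes non-overlapping frames and drops a trailing partial frame, and the input signal and all variable names other than windowSize and rangeRead are hypothetical. The call to mfcc.m is left out so the sketch runs on its own.

```matlab
% Sketch only: mimics the chunked framing described above, not extractMFCC.m.
windowSize = 256;                 % audio samples per frame (one MFCC vector each)
rangeRead  = 4000 * windowSize;   % audio samples read from the source at one time

% Hypothetical input: 30 seconds of silence at the 48 kHz default rate.
samplingFrequency = 48000;
audioVector = zeros(30 * samplingFrequency, 1);

numSamples = numel(audioVector);
numFrames  = floor(numSamples / windowSize);   % assumption: partial last frame is dropped
numChunks  = ceil(numSamples / rangeRead);     % number of reads of rangeRead samples

frameCount = 0;
for chunk = 1:numChunks
    first = (chunk - 1) * rangeRead + 1;
    last  = min(chunk * rangeRead, numSamples);
    block = audioVector(first:last);

    % One column per frame; each column would be handed to the MFCC routine.
    framesInBlock = floor(numel(block) / windowSize);
    frames = reshape(block(1:framesInBlock * windowSize), windowSize, framesInBlock);

    frameCount = frameCount + framesInBlock;
end
```

Because rangeRead is a whole multiple of windowSize, chunk boundaries always coincide with frame boundaries, so reading in chunks yields the same frames as processing the whole vector at once.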
speakerchange.m
Applies BIC (Bayesian Information Criterion) to determine speaker changes in an audio file. Requires a wave audio file as input. Depends on mfcc.m. Returns a table of segmented speakers.
SPEAKERCHANGE: Compute speaker changes in an audio file based on BIC
T = SPEAKERCHANGE(audiofile) for a wave audio file returns a table of speaker changes where rows are segmented speakers and columns are:
1: beginning of segment in wave samples
2: end of segment in wave samples
3: beginning of segment in MFCC CEPS index
4: end of segment in MFCC CEPS index
[T, C] = SPEAKERCHANGE(audiofile) returns segment table T, as well as extracted cepstral coefficients in matrix C with rows(C) as MFCC coefficient dimension and columns(C) as feature vectors over time.
[T, C, B] = SPEAKERCHANGE(audiofile) returns segment table T, cepstral coefficients C, as well as vector B representing the BIC values over the entire audio vector.
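As a rough illustration of how the segment table might be consumed, the sketch below converts the first two columns (segment boundaries in wave samples) into seconds. The table values and the sampling rate are made up, and treating the sample indices as 1-based is an assumption; speakerchange.m itself is not called.

```matlab
% Hypothetical segment table with four columns (values are made up).
% Only columns 1 and 2 (segment boundaries in wave samples) are used here.
T = [     1   96000    1  375;
      96001  240000  376  937];

samplingFrequency = 48000;   % assumption: sampling rate of the analysed wave file

% Assumption: sample indices are 1-based, so sample 1 corresponds to t = 0.
startSec    = (T(:, 1) - 1) / samplingFrequency;
endSec      = T(:, 2) / samplingFrequency;
durationSec = endSec - startSec;

for k = 1:size(T, 1)
    fprintf('Segment %d: %.2f s to %.2f s\n', k, startSec(k), endSec(k));
end
```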
Implementation: Alexander Haubold
3/17/2006: Added a 2nd-level (fine) BIC evaluation on top of the first-level (coarse) BIC evaluation. The precision of speaker change detection increases significantly.
Paper detailing BIC: S.S. Chen, P.S. Gopalakrishnan. Speaker, environment and channel change detection and clustering via the Bayesian information criterion. Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, 1998, pp. 127-132.
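For readers unfamiliar with the criterion, here is a minimal sketch of a BIC test for a single candidate change point, using one commonly quoted form of the Delta-BIC with full-covariance Gaussian models. It is not taken from speakerchange.m: the synthetic data, the penalty weight lambda = 1, the sign convention (positive values suggest a change), and all variable names are assumptions.

```matlab
% Sketch of a Delta-BIC test for one candidate change point (not speakerchange.m).
% Rows are feature vectors here; with a matrix C that stores feature vectors in
% columns, one would pass transposed slices such as C(:, a:b)'.
d  = 4;                          % feature dimension (hypothetical)
X1 = randn(300, d);              % synthetic features before the candidate point
X2 = 3 * randn(300, d) + 5;      % synthetic features after it, different statistics
lambda = 1;                      % penalty weight (assumption)

% Log-determinant of the maximum-likelihood covariance (normalised by N).
logDetCov = @(X) log(det(cov(X, 1)));

% Model-complexity penalty: d mean parameters plus d(d+1)/2 covariance parameters.
penalty = @(N, dim) 0.5 * (dim + dim * (dim + 1) / 2) * log(N);

% Delta-BIC: one Gaussian for both segments versus one Gaussian per segment.
deltaBIC = @(A, B) 0.5 * (size(A, 1) + size(B, 1)) * logDetCov([A; B]) ...
    - 0.5 * size(A, 1) * logDetCov(A) ...
    - 0.5 * size(B, 1) * logDetCov(B) ...
    - lambda * penalty(size(A, 1) + size(B, 1), size(A, 2));

deltaChange = deltaBIC(X1, X2);                        % segments differ
deltaSame   = deltaBIC(X1(1:150, :), X1(151:300, :));  % same source split in two

fprintf('Delta-BIC across a real change: %.1f\n', deltaChange);
fprintf('Delta-BIC within one segment:   %.1f\n', deltaSame);
```

Under this convention a positive value favours modelling the two sides separately, which is read as a change point. A detector would evaluate this quantity at many candidate positions and look for maxima above zero.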