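This post translates and annotates the "train the ensemble classifier" part of the MathWorks documentation example Acoustic Scene Recognition Using Late Fusion. Read it alongside the original page.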
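[4] shows that wavelet scattering provides a good representation of acoustic scenes. Define a waveletScattering (Wavelet Toolbox) object. The invariance scale and quality factors were determined through trial and error.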
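    sf = waveletScattering('SignalLength',size(data,1), ... % length of a single sample clip
        'SamplingFrequency',fs, ...
        'InvarianceScale',0.75, ...
        'QualityFactors',[4 1]); % wavelet time scattering object

Convert the audio signal to mono, then call featureMatrix (Wavelet Toolbox) to return the scattering coefficients for the scattering decomposition framework, sf.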
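The two averaging operations described here act on different things: the first averages across channels to get a mono signal, the second averages the scattering coefficients over time. A toy sketch with made-up numbers may make the dimensions easier to follow. The variable names toyStereo, toyMono, toyCoeffs and toyFeatureVector are hypothetical, and toyCoeffs is only a stand-in with a paths-by-time-windows layout, not real featureMatrix output.

```matlab
% Toy stereo clip: 4 samples (rows) x 2 channels (columns). Values are made up.
toyStereo = [1 3; 2 4; 0 2; 5 7];
toyMono = mean(toyStereo,2);            % 4x1: average across the two channels

% Stand-in for a scattering feature matrix: 3 paths (rows) x 4 time windows.
% This is NOT real featureMatrix output, only a matrix with the same layout.
toyCoeffs = [1 2 3 4; 10 10 10 10; 0 2 0 2];
toyFeatureVector = mean(toyCoeffs,2);   % 3x1: average over the time windows

disp(size(toyMono))           % one value per sample
disp(size(toyFeatureVector))  % one value per scattering path
```

Averaging over time collapses each clip to a single fixed-length vector, which is what a classifier such as a discriminant ensemble expects as one observation.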
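    dataMono = mean(data,2); % average each row (across channels) to get a mono signal
    scatteringCoeffients = featureMatrix(sf,dataMono,'Transform','log'); % scattering feature matrix
    featureVector = mean(scatteringCoeffients,2); % average the scattering coefficients over the 10-second clip
    fprintf('Number of wavelet features per 10-second clip = %d\n',numel(featureVector)) % wavelet features per 10-second clip

For the full set of samples, the helper function HelperWaveletFeatureVector performs the steps above. Use a tall array with cellfun and HelperWaveletFeatureVector to parallelize the feature extraction. Extract the wavelet feature vectors for the training and test sets.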
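The helper itself is not listed in this post. A minimal sketch, assuming it does nothing more than wrap the three steps shown earlier (mono downmix, log-transformed featureMatrix, time average), could look like the following. The body is an assumption reconstructed from that description, so check it against the helper file shipped with the original example before relying on it.

```matlab
function features = HelperWaveletFeatureVector(x,sf)
% Sketch of the helper, assumed from the description in the text.
% x  : audio clip, samples x channels
% sf : waveletScattering object (Wavelet Toolbox)
x = mean(x,2);                                     % downmix to mono
features = featureMatrix(sf,x,'Transform','log');  % paths x time windows
features = mean(features,2);                       % average over the clip
end
```

In MATLAB this function has to live in its own file named HelperWaveletFeatureVector.m, or at the end of the script that calls it.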
    scatteringTrain = cellfun(@(x)HelperWaveletFeatureVector(x,sf),train_set_tall,'UniformOutput',false);
    xTrain = gather(scatteringTrain);
    xTrain = cell2mat(xTrain')'; % one row per clip, one column per feature

    scatteringTest = cellfun(@(x)HelperWaveletFeatureVector(x,sf),test_set_tall,'UniformOutput',false);
    xTest = gather(scatteringTest);
    xTest = cell2mat(xTest')'; % one row per clip, one column per feature

Use fitcensemble to create a trained classification ensemble model (ClassificationEnsemble).
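Before fitting, it may help to see what the cell2mat(xTrain')' idiom above does to the gathered features. The sketch below assumes that gather returns an N-by-1 cell array with one column feature vector per clip. The names toyCells and toyX and all values are made up for illustration.

```matlab
% Stand-in for gathered features: 3 clips, each a 4x1 feature vector.
toyCells = {[1;2;3;4]; [5;6;7;8]; [9;10;11;12]};   % 3x1 cell array

% toyCells' is a 1x3 cell, so cell2mat places the vectors side by side (4x3).
% The final transpose yields one row per clip and one column per feature (3x4).
toyX = cell2mat(toyCells')';

disp(size(toyX))
```

The observations-in-rows layout matches what fitcensemble expects for its predictor matrix by default.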
    subspaceDimension = min(150,size(xTrain,2) - 1);
    numLearningCycles = 30;
    classificationEnsemble = fitcensemble(xTrain,adsTrain.Labels, ...
        'Method','Subspace', ...
        'NumLearningCycles',numLearningCycles, ...
        'Learners','discriminant', ...
        'NPredToSample',subspaceDimension, ...
        'ClassNames',removecats(unique(adsTrain.Labels)));

For each 10-second audio clip, call predict to return the labels and the weights, then map them to the corresponding predicted location. Call confusionchart to visualize the accuracy on the test set, and print the average accuracy.
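The average accuracy reported for the test set is the fraction of clips whose predicted label equals the true label, expressed as a percentage. A toy sketch with made-up categorical labels shows the arithmetic. The names trueLabels, predLabels, isCorrect and accuracyPercent are hypothetical.

```matlab
% Made-up true and predicted scene labels for five clips.
trueLabels = categorical({'park';'bus';'park';'metro';'bus'});
predLabels = categorical({'park';'bus';'metro';'metro';'park'});

isCorrect = (trueLabels == predLabels);   % logical column, true where labels match
accuracyPercent = mean(isCorrect)*100;    % 3 of 5 correct

fprintf('Toy accuracy = %0.2f\n',accuracyPercent)
```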
    [waveletPredictedLabels,waveletResponses] = predict(classificationEnsemble,xTest);

    figure
    cm = confusionchart(adsTest.Labels,waveletPredictedLabels,'title','Test Accuracy - Wavelet Scattering');
    cm.ColumnSummary = 'column-normalized';
    cm.RowSummary = 'row-normalized';

    fprintf('Average accuracy of classifier = %0.2f\n',mean(adsTest.Labels==waveletPredictedLabels)*100)

For each 10-second clip, calling predict on the wavelet classifier and on the CNN returns a vector indicating the relative confidence in its decision. Multiply waveletResponses by cnnResponses to create a late fusion system.
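To make the fusion rule concrete, here is a toy sketch with two clips and three classes. It multiplies the per-class scores of two models elementwise and picks the class with the largest product. All names (classesToy, waveletToy, cnnToy, fusedToy, idxToy, fusedLabels) and all score values are made up for illustration.

```matlab
classesToy = categorical({'bus';'metro';'park'});

% Rows are clips, columns are classes in the order of classesToy.
waveletToy = [0.6 0.3 0.1;    % clip 1: wavelet model alone would pick 'bus'
              0.1 0.1 0.8];   % clip 2: wavelet model alone would pick 'park'
cnnToy     = [0.2 0.7 0.1;    % clip 1: CNN alone would pick 'metro'
              0.1 0.5 0.4];   % clip 2: CNN alone would pick 'metro'

fusedToy = waveletToy .* cnnToy;      % elementwise product of the scores
[~,idxToy] = max(fusedToy,[],2);      % index of the best class for each clip
fusedLabels = classesToy(idxToy);

disp(fusedLabels')
```

In this toy case the fused decision follows the CNN for the first clip and the wavelet model for the second, so a confident score from either model can outweigh a lukewarm one from the other. The product only makes sense if the columns of both score matrices are in the same class order.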
    fused = waveletResponses .* cnnResponses;
    [~,classIdx] = max(fused,[],2);

    predictedLabels = classes(classIdx);

Call confusionchart to visualize the fused classification accuracy, and print the average accuracy to the Command Window.
    figure
    cm = confusionchart(adsTest.Labels,predictedLabels,'title','Test Accuracy - Fusion');
    cm.ColumnSummary = 'column-normalized';
    cm.RowSummary = 'row-normalized';

    fprintf('Average accuracy of fused models = %0.2f\n',mean(adsTest.Labels==predictedLabels)*100)