Modeling Musical Emotion Dynamics (Model-Based Analysis of Musical Emotion)


6. Y. E. Kim, E. M. Schmidt, “Modeling musical emotion dynamics with conditional random fields”, in ISMIR, 2011
7. Hanna M. Wallach, “Conditional Random Fields: An Introduction”, 2004
8. Greg Welch and Gary Bishop, “An Introduction to the Kalman Filter”, 2006
Adding the known bias, we get the final estimate as:
Experiment Result
Emotion label preprocessing: gray dots indicate individual second-by-second labels, red ellipses indicate the estimates of the distribution, and blue ellipses indicate the predictions made with the Kalman filter
CRF: Dynamic Programming
The right-hand side can be rewritten as
Using the forward-backward algorithm, define
CRF: Dynamic Programming (Cont.)
And the recursion relations:
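The forward-backward recursions for a linear-chain CRF can be sketched as follows. This is a minimal illustration rather than the presenter's implementation: the representation of the potentials as `unary[t][y]` and `pairwise[a][b]` log-scores, and the function names `forward_backward` and `brute_force_z`, are assumptions made for this example. The forward pass accumulates the normalization factor Z(x) from left to right, the backward pass does the same from right to left, and their product gives the per-position label marginals.

```python
import math
from itertools import product


def forward_backward(unary, pairwise):
    """Return (Z, marginals) for a linear-chain model.

    unary[t][y]    : log-potential of label y at position t (assumed layout)
    pairwise[a][b] : log-potential of moving from label a to label b
    """
    T, K = len(unary), len(unary[0])
    alpha = [[0.0] * K for _ in range(T)]
    beta = [[1.0] * K for _ in range(T)]  # beta at the last position is 1

    # Forward recursion: alpha[t][y] sums over all prefixes ending in y.
    for y in range(K):
        alpha[0][y] = math.exp(unary[0][y])
    for t in range(1, T):
        for y in range(K):
            alpha[t][y] = sum(
                alpha[t - 1][a] * math.exp(pairwise[a][y] + unary[t][y])
                for a in range(K)
            )

    # Backward recursion: beta[t][a] sums over all suffixes following a.
    for t in range(T - 2, -1, -1):
        for a in range(K):
            beta[t][a] = sum(
                math.exp(pairwise[a][b] + unary[t + 1][b]) * beta[t + 1][b]
                for b in range(K)
            )

    z = sum(alpha[T - 1])
    marginals = [
        [alpha[t][y] * beta[t][y] / z for y in range(K)] for t in range(T)
    ]
    return z, marginals


def brute_force_z(unary, pairwise):
    """Sum exp(score) over every label sequence; only feasible for tiny inputs."""
    T, K = len(unary), len(unary[0])
    total = 0.0
    for ys in product(range(K), repeat=T):
        score = sum(unary[t][ys[t]] for t in range(T))
        score += sum(pairwise[ys[t - 1]][ys[t]] for t in range(1, T))
        total += math.exp(score)
    return total
```

The recursion costs O(T K^2) operations for T positions and K labels, whereas the brute-force sum grows as K^T, which is the reason dynamic programming is used here.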
Modeling Musical Emotion Dynamics
Presenter: SHUMIN XU
Arousal-Valence Model
Two-dimensional representation of emotional memory
Arousal: high vs. low energy (e.g. energetic vs. calm)
Valence: positive vs. negative (e.g. happy vs. sad)
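A point in the arousal-valence plane can be read as one of four quadrants. The sketch below is only an illustration: the function name `av_quadrant`, the coordinate range of [-1, 1] on each axis, and the use of 0 as the dividing line are assumptions, not details from the slides.

```python
def av_quadrant(valence: float, arousal: float) -> str:
    """Name the arousal-valence quadrant of a point, splitting each axis at 0.

    Coordinates are assumed to lie in [-1, 1]; 0 counts as the upper half.
    """
    energy = "high arousal" if arousal >= 0 else "low arousal"
    tone = "positive valence" if valence >= 0 else "negative valence"
    return f"{energy}, {tone}"


# Example: an energetic, happy excerpt and a calm, sad one.
print(av_quadrant(valence=0.8, arousal=0.7))
print(av_quadrant(valence=-0.6, arousal=-0.4))
```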
Data Training with Kalman Filtering
Also known as linear quadratic estimation (LQE)
Recursive estimator
Predict: uses the state estimate from the previous time step to produce an estimate of the state at the current time step, known as the a priori state estimate
Conditional Random Fields (CRF)
Definition: For an observation sequence X and a label sequence Y, let $G = (V, E)$ be a graph such that $Y = (Y_v)_{v \in V}$, so that Y is indexed by the vertices of G. Then (X, Y) is a conditional random field when the random variables $Y_v$, conditioned on X, obey the Markov property with respect to the graph: $p(Y_v \mid X, Y_w, w \neq v) = p(Y_v \mid X, Y_w, w \sim v)$, where $w \sim v$ means that w and v are neighbors in G.
Relaxes the independence assumptions considerably
Avoids the label bias problem
Disadvantage:
High complexity
References
1. http://en.wikipedia.org/wiki/Emotion_and_memory
2. Y. E. Kim, E. M. Schmidt, R. Migneco, B. G. Morton, P. Richardson, J. Scott, J. A. Speck and D. Turnbull, “Music emotion recognition: A state of the art review”, in ISMIR, Utrecht, Netherlands, 2010
3. Y. E. Kim, E. M. Schmidt, and D. Turnbull, “Feature selection for content-based, time-varying musical emotion regression”, in ACM MIR, Philadelphia, PA, 2010
Advantages and Limitations
Advantages:
Smooth and robust estimates
Distribution evolves over time
Limitations:
Limited model complexity: unable to cover the wide variance in emotion-space dynamics
All three become darker as time progresses, i.e. the estimate becomes increasingly uncertain
where Z(x) is a normalization factor
Maximum Likelihood Parameter Inference
The log-likelihood is given by:
Differentiating the log-likelihood with respect
4. Y. E. Kim, E. M. Schmidt, “Prediction of time-varying musical mood distributions from audio”, in ISMIR, Utrecht, Netherlands, 2010
5. Y. E. Kim, E. M. Schmidt, “Prediction of time-varying musical mood distributions using Kalman filtering”, in IEEE ICMLA, Washington, D.C., 2010
Conditional Random Fields (Cont.)
Potential functions:
Joint Probability:
Let Rewrite the probability function as:
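For orientation, the standard linear-chain form from Wallach's introduction (reference 7) is sketched below. The symbols $f_j$ (feature functions), $\lambda_j$ (weights), $F_j$ (features summed over positions) and $n$ (sequence length) follow that introduction and are assumptions here; the slide's own notation may differ.

```latex
% Features summed over the n positions of the sequence
F_j(y, x) = \sum_{i=1}^{n} f_j(y_{i-1}, y_i, x, i)

% Conditional probability of a label sequence y given observations x
p(y \mid x, \lambda) = \frac{1}{Z(x)} \exp\Big( \sum_j \lambda_j F_j(y, x) \Big)

% Normalization factor: the same exponential summed over all label sequences y'
Z(x) = \sum_{y'} \exp\Big( \sum_j \lambda_j F_j(y', x) \Big)
```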
CRF: Max Likelihood Parameter Inference
The process x is Markov, i.e., $p(x_{t+1} \mid x_0, \dots, x_t) = p(x_{t+1} \mid x_t)$
Kalman Filtering
First performing the forward recursions:
Kalman filtering (Cont.)
Then performing the backward recursions:
Update: the current a priori prediction is combined with the current observation to refine the state estimate, termed the a posteriori state estimate
State probabilities: state of emotion
Recognition: emotion changes over time
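The predict and update steps of a Kalman filter can be sketched as follows. This is a minimal example under stated assumptions: it tracks a single scalar state (for instance one arousal or valence coordinate on its own), and the names `kalman_step` and `kalman_filter` and the parameters `a` (state transition), `q` (process noise variance), `h` (observation gain) and `r` (observation noise variance) are illustrative and not taken from the slides.

```python
def kalman_step(x_est, p_est, z, a=1.0, q=0.01, h=1.0, r=0.1):
    """One predict/update cycle of a scalar Kalman filter.

    x_est, p_est : previous a posteriori state estimate and its variance
    z            : current observation
    """
    # Predict: propagate the previous estimate to get the a priori estimate.
    x_prior = a * x_est
    p_prior = a * p_est * a + q

    # Update: blend the a priori estimate with the observation.
    gain = p_prior * h / (h * p_prior * h + r)
    x_post = x_prior + gain * (z - h * x_prior)
    p_post = (1.0 - gain * h) * p_prior
    return x_post, p_post


def kalman_filter(observations, x0=0.0, p0=1.0, **params):
    """Run the filter over a sequence; returns a list of (estimate, variance)."""
    x, p = x0, p0
    estimates = []
    for z in observations:
        x, p = kalman_step(x, p, z, **params)
        estimates.append((x, p))
    return estimates


# Example: noisy readings around 0.5 are smoothed into a stable estimate.
readings = [0.55, 0.48, 0.52, 0.47, 0.51, 0.50]
for x, p in kalman_filter(readings):
    print(f"estimate={x:.3f} variance={p:.4f}")
```

A larger `r` relative to `q` makes the filter trust its prediction more and the observations less, which yields smoother estimates.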
Emotion Space Heatmap Prediction
Advantages and Disadvantages
Advantages:
So
Where
CRF in Musical Emotion Recognition
Label: A-V modeled acoustic data
Observation: Mel-frequency cepstral coefficients (MFCC)
Transition probabilities: emotions tend to change smoothly
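The effect of smooth transitions can be illustrated with Viterbi decoding over discretized emotion labels. This is a sketch under assumptions, not the method from the slides: the label indices stand for hypothetical bins along one arousal-valence axis, `unary[t][y]` stands in for a score derived from the acoustic features at time t, and `pairwise[a][b]` penalizes jumps between distant bins.

```python
def viterbi(unary, pairwise):
    """Return the highest-scoring label sequence for a linear chain.

    unary[t][y]    : score of label y at time t (e.g. from acoustic features)
    pairwise[a][b] : score of moving from label a to label b
    """
    T, K = len(unary), len(unary[0])
    score = [list(unary[0])]
    back = []  # back[t-1][y] is the best previous label for y at time t
    for t in range(1, T):
        row, ptr = [], []
        for y in range(K):
            best_prev = max(range(K), key=lambda a: score[t - 1][a] + pairwise[a][y])
            row.append(score[t - 1][best_prev] + pairwise[best_prev][y] + unary[t][y])
            ptr.append(best_prev)
        score.append(row)
        back.append(ptr)

    # Trace back from the best final label.
    y = max(range(K), key=lambda k: score[-1][k])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]


# Three hypothetical bins; one frame weakly prefers a distant bin.
unary = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.5],
         [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
smooth = [[-1.0 * abs(a - b) for b in range(3)] for a in range(3)]
free = [[0.0] * 3 for _ in range(3)]
print(viterbi(unary, smooth))  # jump penalized
print(viterbi(unary, free))    # no transition preference
```

With the jump penalty the isolated outlier frame is overridden by its neighbors; without it, each frame is decoded independently.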
Linear Gauss-Markov model
Statistical Assumptions
Driving noise w and observation noise v are zero-mean Gaussian
w and v are independent of X and Y
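A linear Gauss-Markov model consistent with these assumptions can be written as below. The matrices $A$ (state transition) and $C$ (observation), the noise covariances $Q$ and $R$, and the time index $t$ are notational assumptions for this sketch; the slide's own symbols may differ.

```latex
% State evolution driven by the noise w
x_{t+1} = A x_t + w_t, \qquad w_t \sim \mathcal{N}(0, Q)

% Observation corrupted by the noise v
y_t = C x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R)
```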