Wav2Vec2 Sentiment Analysis Using Shemo Dataset

Overview

In this project, we fine-tuned the Wav2Vec2 model for sentiment analysis on the Shemo dataset, using both voice features and text transcripts. Combining audio and textual data in this way is intended to make emotion recognition more robust.

Team Members

  • Amirreza Vishteh
  • sentiment analysis in speech
  • Iran University of Science and Technology
  • 6/09/2024

Table of Contents

  1. Data Loading
  2. Data Preprocessing
  3. Model Configuration and Preprocessing
  4. Model Definition
  5. Trainer Setup
  6. Results

1. Data Loading

We used the Shemo dataset from Sharif University, which includes .wav audio files paired with corresponding transcripts and emotion labels stored in a JSON file.
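As a minimal sketch of this step, the metadata can be read into a flat list of records. The JSON layout assumed here (utterance IDs as keys, each with "emotion" and "transcript" fields) and the one-file-per-ID naming are assumptions for illustration, as are the function and argument names; adjust them to the actual file.

```python
import json
from pathlib import Path


def load_records(json_path, audio_root):
    """Read the metadata JSON and return one dict per utterance.

    Assumed (hypothetical) layout: {"<utterance_id>": {"emotion": ..., "transcript": ...}}
    with the audio stored as <audio_root>/<utterance_id>.wav.
    """
    with open(json_path, encoding="utf-8") as f:
        meta = json.load(f)

    records = []
    for utt_id, entry in meta.items():
        records.append(
            {
                "id": utt_id,
                "path": str(Path(audio_root) / f"{utt_id}.wav"),
                # A missing transcript becomes an empty string.
                "transcript": entry.get("transcript", ""),
                "emotion": entry["emotion"],
            }
        )
    return records
```

The function only builds paths; it does not check that the audio files exist, which is left to the preprocessing step.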

2. Data Preprocessing

The loaded data was converted into a pandas DataFrame, and each audio path was checked to confirm that the file exists. Rows with missing files were dropped. The dataset was then split into training (80%) and validation (20%) sets using stratified sampling based on emotion labels.
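The cleaning and split described above can be sketched as follows. The column names "path" and "emotion", the function name, and the fixed random seed are assumptions for illustration; the split itself uses scikit-learn's train_test_split with the stratify argument.

```python
import os

import pandas as pd
from sklearn.model_selection import train_test_split


def clean_and_split(records, seed=42):
    """Drop rows whose audio file is missing, then split 80/20 by emotion."""
    df = pd.DataFrame(records)

    # Keep only rows whose audio file is actually on disk.
    df = df[df["path"].map(os.path.exists)].reset_index(drop=True)

    # Stratify on the emotion label so both sets keep the class proportions.
    train_df, val_df = train_test_split(
        df, test_size=0.2, stratify=df["emotion"], random_state=seed
    )
    return train_df.reset_index(drop=True), val_df.reset_index(drop=True)
```

Stratification needs at least two examples per emotion, so very rare classes may have to be merged or removed before the split.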

3. Model Configuration and Preprocessing

We loaded a pre-trained Wav2Vec2 model for Persian speech emotion recognition. Its configuration was customized to set the pooling mode and the label mappings.
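A sketch of how the label mappings and pooling mode might be attached to the configuration is shown here. The helper names are hypothetical, the checkpoint name is left as a parameter because the source does not state it, and pooling_mode is a custom attribute set on the config rather than a built-in Wav2Vec2 option.

```python
def build_label_maps(labels):
    """Return (label2id, id2label) for the distinct labels, in sorted order."""
    names = sorted(set(labels))
    label2id = {name: i for i, name in enumerate(names)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def load_config(model_name, labels, pooling_mode="mean"):
    """Load a pre-trained config and add label mappings plus a pooling mode.

    model_name is whichever Wav2Vec2 checkpoint is being fine-tuned
    (not specified here, so it stays a parameter).
    """
    # Imported lazily so the label helper works without transformers installed.
    from transformers import AutoConfig

    label2id, id2label = build_label_maps(labels)
    config = AutoConfig.from_pretrained(
        model_name, label2id=label2id, id2label=id2label
    )
    # Custom attribute, read later by the classification model's pooling step.
    config.pooling_mode = pooling_mode
    return config
```

Sorting the labels keeps the integer IDs stable across runs, which matters when a saved model is reloaded for evaluation.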

4. Model Definition

We defined a custom Wav2Vec2 model for speech emotion classification, consisting of a feature extractor and a classification head.

In the forward() method, hidden states from Wav2Vec2 were pooled, and the resulting tensor was classified into the target emotion label.
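The pooling and classification step can be illustrated with a small PyTorch module that operates on a tensor of hidden states. This is a sketch under assumptions, not the project's actual model: the class name, layer sizes, dropout rate, and the set of pooling modes are illustrative, and the Wav2Vec2 encoder that would produce the hidden states is omitted.

```python
import torch
import torch.nn as nn


class PooledClassificationHead(nn.Module):
    """Pool hidden states over time, then map them to emotion logits."""

    def __init__(self, hidden_size, num_labels, pooling_mode="mean", dropout=0.1):
        super().__init__()
        if pooling_mode not in {"mean", "sum", "max"}:
            raise ValueError(f"unsupported pooling mode: {pooling_mode}")
        self.pooling_mode = pooling_mode
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def pool(self, hidden_states):
        # (batch, time, hidden) -> (batch, hidden)
        if self.pooling_mode == "mean":
            return hidden_states.mean(dim=1)
        if self.pooling_mode == "sum":
            return hidden_states.sum(dim=1)
        return hidden_states.max(dim=1).values

    def forward(self, hidden_states):
        x = self.dropout(self.pool(hidden_states))
        x = torch.tanh(self.dense(x))
        x = self.dropout(x)
        return self.out_proj(x)  # (batch, num_labels) logits
```

For simplicity the pooling here treats every frame equally, including frames that come from padding; a fuller version would mask padded positions before averaging.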

5. Trainer Setup

We used Hugging Face’s Trainer class to fine-tune the model. A data collator was implemented for dynamic padding, and evaluation metrics (accuracy, F1-score) were set up.
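The two pieces mentioned above can be sketched independently of the Trainer itself. Both functions are illustrative assumptions: pad_batch is a NumPy stand-in for a collator (a real one would typically delegate to the processor's pad method and return PyTorch tensors), the macro averaging for F1 is one possible choice, and the feature keys "input_values" and "label" are assumed names.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score


def pad_batch(features, padding_value=0.0):
    """Pad each waveform to the longest one in this batch (dynamic padding)."""
    max_len = max(len(f["input_values"]) for f in features)
    input_values = np.full((len(features), max_len), padding_value, dtype=np.float32)
    attention_mask = np.zeros((len(features), max_len), dtype=np.int64)
    for i, f in enumerate(features):
        n = len(f["input_values"])
        input_values[i, :n] = f["input_values"]
        attention_mask[i, :n] = 1  # 1 marks real samples, 0 marks padding
    labels = np.array([f["label"] for f in features], dtype=np.int64)
    return {
        "input_values": input_values,
        "attention_mask": attention_mask,
        "labels": labels,
    }


def compute_metrics(eval_pred):
    """Accuracy and macro F1 from an object with .predictions and .label_ids."""
    logits = eval_pred.predictions
    if isinstance(logits, tuple):  # some models return extra outputs
        logits = logits[0]
    preds = np.argmax(logits, axis=-1)
    labels = eval_pred.label_ids
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="macro"),
    }
```

A function with the shape of compute_metrics can be passed to Trainer through its compute_metrics argument. Padding per batch rather than to a global maximum keeps short utterances from being padded to the length of the longest clip in the dataset.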

6. Results

After training the model, we evaluated its performance using accuracy and F1-score.

The final accuracy was 94%, which suggests that combining voice features and text transcripts is effective for sentiment analysis.
