Case Studies
March 13, 2024

Offensive Language Detection in Social Media

The SemEval-2020 Task 12 was an AI competition focusing on the detection of offensive language on social media platforms. Our team leveraged the teacher-student paradigm to transfer knowledge from a larger model to a smaller one. We outperformed 81 teams with a 93.19% F1-score using a model three times smaller than the team who ranked first.

Offensive Language Detection in Social Media

Low-code tools are going mainstream

Purus suspendisse a ornare non erat pellentesque arcu mi arcu eget tortor eu praesent curabitur porttitor ultrices sit sit amet purus urna enim eget. Habitant massa lectus tristique dictum lacus in bibendum. Velit ut viverra feugiat dui eu nisl sit massa viverra sed vitae nec sed. Nunc ornare consequat massa sagittis pellentesque tincidunt vel lacus integer risu.

  1. Vitae et erat tincidunt sed orci eget egestas facilisis amet ornare
  2. Sollicitudin integer  velit aliquet viverra urna orci semper velit dolor sit amet
  3. Vitae quis ut  luctus lobortis urna adipiscing bibendum
  4. Vitae quis ut  luctus lobortis urna adipiscing bibendum

Multilingual NLP will grow

Mauris posuere arcu lectus congue. Sed eget semper mollis felis ante. Congue risus vulputate nunc porttitor dignissim cursus viverra quis. Condimentum nisl ut sed diam lacus sed. Cursus hac massa amet cursus diam. Consequat sodales non nulla ac id bibendum eu justo condimentum. Arcu elementum non suscipit amet vitae. Consectetur penatibus diam enim eget arcu et ut a congue arcu.

Vitae quis ut  luctus lobortis urna adipiscing bibendum

Combining supervised and unsupervised machine learning methods

Vitae vitae sollicitudin diam sed. Aliquam tellus libero a velit quam ut suscipit. Vitae adipiscing amet faucibus nec in ut. Tortor nulla aliquam commodo sit ultricies a nunc ultrices consectetur. Nibh magna arcu blandit quisque. In lorem sit turpis interdum facilisi.

  • Dolor duis lorem enim eu turpis potenti nulla  laoreet volutpat semper sed.
  • Lorem a eget blandit ac neque amet amet non dapibus pulvinar.
  • Pellentesque non integer ac id imperdiet blandit sit bibendum.
  • Sit leo lorem elementum vitae faucibus quam feugiat hendrerit lectus.
Automating customer service: Tagging tickets and new era of chatbots

Vitae vitae sollicitudin diam sed. Aliquam tellus libero a velit quam ut suscipit. Vitae adipiscing amet faucibus nec in ut. Tortor nulla aliquam commodo sit ultricies a nunc ultrices consectetur. Nibh magna arcu blandit quisque. In lorem sit turpis interdum facilisi.

“Nisi consectetur velit bibendum a convallis arcu morbi lectus aecenas ultrices massa vel ut ultricies lectus elit arcu non id mattis libero amet mattis congue ipsum nibh odio in lacinia non”
Detecting fake news and cyber-bullying

Nunc ut facilisi volutpat neque est diam id sem erat aliquam elementum dolor tortor commodo et massa dictumst egestas tempor duis eget odio eu egestas nec amet suscipit posuere fames ded tortor ac ut fermentum odio ut amet urna posuere ligula volutpat cursus enim libero libero pretium faucibus nunc arcu mauris sed scelerisque cursus felis arcu sed aenean pharetra vitae suspendisse ac.

Problem

SemEval-2020 Task 12 was an AI competition held in 2020. The goal was to use AI to identify offensive language in social media.

offensive language detection in social media

The leaderboard of the competition is presented in the following table.

Leaderboard of the competition

Approach

We used only the teacher-student paradigm to transfer knowledge from a large model (Teacher) to a small model (Student).

Result

iSemantics team surpassed 81 teams and scored 93.19% F1-score. We achieved this result using a model 3x smaller than the UHH-LT team, who ranked 1st.

A 3x smaller model means 3x faster decision time and 3x lower running cost when hosting on cloud services.

To learn more about automated data labeling techniques, read the full article here.