Parasecurity Group
About

Info

We began collecting Twitter data on the Russo-Ukrainian War on February 24, 2022. Our main goal is to analyze the trends and topics that dominate this online discourse, and to track user behavior, the suspension of potentially malicious accounts, the sentiment of the text, and any hate speech or propaganda that becomes visible through online social networks (OSNs).

Team

Data collection

The dataset was retrieved using popular hashtags that support either side of the conflict, in order to keep the sentiment balanced. The hashtags were selected and updated during the first week of the conflict, so that the list stayed current and contained only the hashtags most popular among users.
List of all the hashtags we retrieve:
  • #Ukraine
  • #Ukraina
  • #ukraina
  • #Украина
  • #Украине
  • #PrayForUkraine
  • #UkraineRussie
  • #StandWithUkraine
  • #StandWithUkraineNOW
  • #RussiaUkraineConflict
  • #RussiaUkraineCrisis
  • #RussiaInvadedUkraine
  • #WWIII
  • #worldwar3
  • #Война
  • #BlockPutinWallets
  • #UkraineRussiaWar
  • #Putin
  • #Russia
  • #Россия
  • #StopPutin
  • #StopRussianAggression
  • #StopRussia
  • #Ukraine_Russia
  • #Russian_Ukrainian
  • #SWIFT
  • #NATO
  • #FuckPutin
  • #solidarityWithUkraine
  • #PutinWarCriminal
  • #PutinHitler
  • #BoycottRussia
  • #with_russia
  • #FUCK_NATO
  • #ЯпротивВойны
  • #StopNazism
  • #myfriendPutin
  • #UnitedAgainstUkraine
  • #StopWar
  • #ВпередРоссия
  • #ЯМыРоссия
  • #ВеликаяРоссия
  • #Путинмойпрезидент
  • #россиявперед
  • #россиявперёд
  • #ПутинНашПрезидент
  • #ЗаПутина
  • #ПутинВведиВойска
  • #СЛАВАРОССИИ
  • #СЛАВАВДВ
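
As a rough illustration of how a hashtag list like the one above could be turned into search queries, the sketch below packs hashtags into OR-joined query strings under a length limit. The function name build_queries and the max_len value are assumptions made for this example; the source does not describe how its queries were built, and the real limit depends on the API tier in use.

```python
from typing import Iterable, List


def build_queries(hashtags: Iterable[str], max_len: int = 512) -> List[str]:
    """Pack hashtags into OR-joined query strings of at most max_len characters.

    max_len is a placeholder; check the rule-length limit of your API tier.
    """
    queries: List[str] = []
    current: List[str] = []
    seen = set()
    for tag in hashtags:
        if tag in seen:
            # Skip exact duplicates; case and spelling variants are kept.
            continue
        seen.add(tag)
        if len(tag) > max_len:
            raise ValueError(f"hashtag longer than max_len: {tag}")
        candidate = " OR ".join(current + [tag])
        if current and len(candidate) > max_len:
            # Adding this tag would overflow: close the query, start a new one.
            queries.append(" OR ".join(current))
            current = [tag]
        else:
            current.append(tag)
    if current:
        queries.append(" OR ".join(current))
    return queries


# A small subset of the list above, for illustration only.
sample = [
    "#Ukraine", "#Ukraina", "#ukraina", "#Украина",
    "#StandWithUkraine", "#Путинмойпрезидент", "#Путинмойпрезидент",
]
queries = build_queries(sample, max_len=40)
for q in queries:
    print(q)
```

Splitting by length rather than by a fixed count keeps every query valid regardless of how long individual hashtags are.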

Sentiment Model

Sentiment analysis was performed with a pre-trained multilingual XLM-RoBERTa model that supports zero-shot classification with user-defined entities. Thanks to our GPU implementation, this approach can process a large volume of data and provide daily sentiment values. For more details, please visit this link.
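
A minimal sketch of zero-shot sentiment scoring with daily aggregation might look like the following. It is not the group's implementation: the checkpoint name joeddav/xlm-roberta-large-xnli, the two labels, and the "positive minus negative" daily score are all assumptions made for illustration. The Hugging Face pipeline is loaded lazily, and a stand-in classifier with the same output shape is used in the demo so the example runs without downloading a model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed labels; the source does not list the labels it used.
LABELS = ["positive", "negative"]


def load_classifier(model_name: str = "joeddav/xlm-roberta-large-xnli",
                    device: int = -1):
    """Load a Hugging Face zero-shot pipeline (device=0 selects the first GPU).

    The checkpoint name is an assumption, not the one named by the source.
    """
    from transformers import pipeline  # imported lazily; downloads the model
    return pipeline("zero-shot-classification", model=model_name, device=device)


def daily_sentiment(tweets: Iterable[Tuple[str, str]],
                    classify: Callable[..., dict],
                    labels: List[str] = LABELS) -> Dict[str, float]:
    """Average, per day, the score of labels[0] minus the score of labels[1]."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for day, text in tweets:
        result = classify(text, candidate_labels=labels)
        scores = dict(zip(result["labels"], result["scores"]))
        totals[day] += scores[labels[0]] - scores[labels[1]]
        counts[day] += 1
    return {day: totals[day] / counts[day] for day in totals}


def fake_classify(text: str, candidate_labels: List[str]) -> dict:
    """Stand-in that mimics the pipeline's output shape, for the demo only."""
    pos = 0.9 if "peace" in text else 0.2
    scores = {candidate_labels[0]: pos, candidate_labels[1]: 1.0 - pos}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {"sequence": text, "labels": ranked,
            "scores": [scores[label] for label in ranked]}


demo = [
    ("2022-02-24", "peace now"),
    ("2022-02-24", "war"),
    ("2022-02-25", "peace talks"),
]
daily = daily_sentiment(demo, fake_classify)
print(daily)
```

To run against a real model, replace fake_classify with the object returned by load_classifier(); with device=0 the same call runs on a GPU.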

Military Intelligence Pipeline

Military intelligence was gathered with a pre-trained multilingual XLM-RoBERTa model for zero-shot classification, using the label "military". We then applied a custom-trained NER (named-entity recognition) model, built on spaCy's NER model and trained with this dataset, to identify entities such as Location, Person, and Weapon, and filtered the tweets based on Ukrainian locations.
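
The two-stage filter described above could be sketched roughly as follows. Everything specific here is an assumption for illustration: the 0.8 threshold, the "LOCATION" label string, the short location list, and the model_path argument are hypothetical, since the source does not publish its custom model or its settings. The demo uses simple stand-ins for both models so that it runs without any downloads.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Illustrative subset; the source does not publish its location list.
UKRAINIAN_LOCATIONS: Set[str] = {"kyiv", "kharkiv", "mariupol", "odesa", "lviv"}


def military_scorer(classifier) -> Callable[[str], float]:
    """Wrap a Hugging Face zero-shot pipeline into a text -> score function."""
    def score(text: str) -> float:
        result = classifier(text, candidate_labels=["military"], multi_label=True)
        return result["scores"][0]
    return score


def load_ner(model_path: str) -> Callable[[str], List[Tuple[str, str]]]:
    """Return an entity extractor backed by a spaCy pipeline.

    model_path is hypothetical: it stands for a custom-trained model on disk.
    """
    import spacy  # imported lazily
    nlp = spacy.load(model_path)

    def extract(text: str) -> List[Tuple[str, str]]:
        return [(ent.text, ent.label_) for ent in nlp(text).ents]
    return extract


def filter_military(tweets: Iterable[str],
                    military_score: Callable[[str], float],
                    extract_entities: Callable[[str], List[Tuple[str, str]]],
                    threshold: float = 0.8,          # assumed cut-off
                    location_label: str = "LOCATION",  # assumed label name
                    locations: Set[str] = UKRAINIAN_LOCATIONS):
    """Keep tweets scored as military that mention a known Ukrainian location."""
    kept = []
    for text in tweets:
        if military_score(text) < threshold:
            continue
        places = [name for name, label in extract_entities(text)
                  if label == location_label and name.lower() in locations]
        if places:
            kept.append((text, places))
    return kept


# Stand-ins for the two models, for the demo only.
def stub_score(text: str) -> float:
    return 0.95 if "tank" in text.lower() else 0.1


def stub_entities(text: str) -> List[Tuple[str, str]]:
    return [(w.strip(".,"), "LOCATION") for w in text.split() if w[:1].isupper()]


demo = ["Tanks seen near Kharkiv.", "Tanks parade in Paris.",
        "Concert in Kyiv tonight."]
kept = filter_military(demo, stub_score, stub_entities)
print(kept)
```

Running the cheaper classification step first means the NER model only sees tweets that already passed the "military" threshold.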

Contact

Email Us