Insights and Report from the Coordinated Sharing Behavior Detection Conference

Here's a summary of what some of the world's leading experts in the field of multimodal and cross-platform coordinated behavior detection presented at the Coordinated Sharing Behavior Detection Conference, organised by the University of Sheffield, on 29 October 2024, in conjunction with AoIR 2024.

Support, context and enablers

The conference, organised under the aegis of the vera.ai project (specifically, Task 3 of WP4 named "Tackling Coordinated Sharing Behavior with Network Science Methods"), brought together a vibrant array of scholars and experts to tackle the pressing issues of coordinated inauthentic behavior on social media platforms, and to deliberate on existing problems, potential solutions and further research on this topic.

The event was a joint effort of vera.ai and the respective departments of the University of Urbino and Queensland University of Technology. It was made possible by support from the Vigilant project, SoBigData++ and the University of Sheffield.

A shift in focus of research

Since 2016, there has been a shift towards focusing on behavior rather than content to combat disinformation. This strategy avoids the limitations of content-based approaches and protects platforms from accusations of arbitrating truth. The concept of "coordinated inauthentic behavior," introduced by Nathaniel Gleicher of Facebook (now Meta), adds a layer of complexity by privileging the notion of authenticity in social media behavior. While some ambiguity is necessary to combat adversarial tactics aimed at evading detection, the lack of clear definitions poses significant challenges for researchers attempting to detect coordinated behavior on social media.

Scholars have developed open-source software toolkits to detect coordinated behavior, emphasizing the need to identify similar actions, such as sharing the same link or post in a closely timed and repetitive manner. The diverse forms of similarity, varying social media platforms, and evolving user behaviors make setting fixed detection thresholds difficult, often requiring case-by-case handling. The lack of universally recognized thresholds complicates the use of traditional machine learning approaches that rely on labeled datasets. Thus, the event addressed such difficulties by bringing together experts from different continents and contexts who tackled this pressing concern.
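
The core idea described above (the same link shared by different accounts in a closely timed and repetitive manner) can be illustrated with a minimal Python sketch. This is not the implementation of any toolkit presented at the conference; the function name, the 60-second window and the minimum of two repetitions are hypothetical values chosen for illustration, and they are exactly the kind of thresholds that, as noted, usually have to be tuned case by case.

```python
from collections import defaultdict
from itertools import combinations


def coordinated_pairs(shares, window_seconds=60, min_repetitions=2):
    """Find account pairs that repeatedly share the same URL close in time.

    shares: iterable of (account, url, timestamp_in_seconds) tuples.
    window_seconds and min_repetitions are illustrative, hypothetical defaults.
    Returns {(account_a, account_b): number_of_distinct_co_shared_urls}.
    """
    # Group share events by the URL that was shared.
    by_url = defaultdict(list)
    for account, url, ts in shares:
        by_url[url].append((ts, account))

    # For every pair of accounts, collect the URLs they shared within the window.
    co_shared = defaultdict(set)
    for url, events in by_url.items():
        for (ts_a, acc_a), (ts_b, acc_b) in combinations(sorted(events), 2):
            if acc_a != acc_b and abs(ts_b - ts_a) <= window_seconds:
                co_shared[tuple(sorted((acc_a, acc_b)))].add(url)

    # Keep only pairs whose rapid co-sharing was repeated often enough.
    return {
        pair: len(urls)
        for pair, urls in co_shared.items()
        if len(urls) >= min_repetitions
    }
```

Lowering `min_repetitions` or widening `window_seconds` flags more pairs, which shows why a single fixed threshold rarely transfers across platforms and datasets.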

Selected outcomes

One of the main takeaways centered on the need for robust, adaptive detection methods that could address coordination without stifling legitimate expression. Scholars showcased open-source tools designed to identify behaviors — like synchronously sharing identical posts — yet highlighted the challenges in setting consistent detection thresholds across platforms with diverse user bases and content types. The ethical implications of these methods also came to the forefront, particularly regarding accountability and privacy.

As generative AI advances, the need for adaptable detection mechanisms has never been clearer. The conference underscored that while the field has made strides, ongoing collaboration is essential to meet the evolving complexities of social media and ensure fair and effective moderation practices in a dynamic digital landscape.

In the following paragraphs, we share some of the key highlights from the talks and research presented by the individual experts.

Individual presentations and findings

The day began with a keynote speech by Timothy Graham of Queensland University of Technology, Brisbane. Graham’s speech, entitled "The Inauthenticity Paradox", explored the complexities surrounding authenticity and inauthenticity on social media. He emphasized that while platforms benefit from user engagement, they simultaneously profit from Coordinated Inauthentic Behavior (CIB) as it boosts engagement metrics. Graham argued that authenticity, as defined by social media platforms, serves their strategic interests rather than holding any intrinsic meaning. His presentation revealed the paradox of harm and profit, suggesting that inauthenticity is embedded in the platforms' own design architecture: it provides users with certain affordances that shape their sharing behavior, which in turn guides the platforms' monetization strategies.

[Photo gallery: Impressions Part 1. Credit: Bruna Almeida Paroni]

The keynote speech was followed by three sessions of four papers each.

The morning session deliberated on the approaches and challenges of detecting coordinated behavior.

The first paper was presented by Stefano Cresci. Using data from UK/US elections, his paper emphasized the use of dynamic, multimodal, and time-segmented analyses to detect and characterize coordinated online behaviors, underscoring that static methods fall short in capturing network shifts over time. His research team’s findings highlight that dynamic analysis can open new, largely unexplored research directions, while nuanced characterization helps reveal distinctions between various forms of coordination. Although multimodality shows some promise, Cresci noted that deeper understanding is needed to operationalize it effectively for distinguishing patterns within political and informational contexts.

Daniel Thiele then introduced the embedding-based ‘coorsim’ package for detecting multimodal coordinated behaviors, particularly in deceptive or manipulative content. His paper, co-authored with Miriam Milzner, emphasized the need for cross-lingual detection and non-textual modalities, which remain understudied in coordinated manipulation research. Presenting the new R package, which is still a work in progress, Thiele demonstrated its capabilities for detecting suspiciously similar content using embeddings across text and other modalities.
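
To make the general idea of embedding-based similarity detection concrete, here is a minimal Python sketch. It does not reproduce the ‘coorsim’ package or its interface. It assumes that each post has already been turned into an embedding vector by some multilingual or multimodal model (not shown), and the function names and the 0.9 similarity cut-off are hypothetical choices for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_post_pairs(posts, min_similarity=0.9):
    """Flag pairs of posts from different accounts with near-identical embeddings.

    posts: list of (post_id, account, embedding) tuples; the embeddings are
    assumed to be precomputed by an external model.
    min_similarity is an illustrative, hypothetical threshold.
    """
    pairs = []
    for (id_a, acc_a, emb_a), (id_b, acc_b, emb_b) in combinations(posts, 2):
        if acc_a == acc_b:
            continue  # similarity within one account is not coordination
        sim = cosine_similarity(emb_a, emb_b)
        if sim >= min_similarity:
            pairs.append((id_a, id_b, round(sim, 3)))
    return pairs
```

Because the comparison operates on vectors rather than raw text, the same logic applies to translated posts or to image embeddings, provided the underlying model places similar content close together.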

Thereafter, Daniel Angus and his team introduced a six-step framework that integrates multimodal analysis, allowing for the detection of "similitude" in coordinated inauthentic behavior across text, images, and videos. The framework leverages advanced models such as Sentence-BERT, vision transformers, and CLIP (Contrastive Language-Image Pre-training). Applying it to a dataset of 45,000 tweets with 30,629 unique images from the 2020 “Reopen America” protests, Angus illustrated the challenges and potential of multimodal embeddings in text-image detection. Limitations include the scarcity of task-specific datasets, high computational demands, and the complexity of integrating multimodal data.

Luca Rossi, who followed, examined longitudinal coordinated link-sharing behavior by focusing on the temporal aspects of the data. Rossi’s research highlighted the challenges in detecting community structures using conventional methods, such as the Louvain method, which may not reveal overlapping communities effectively. The speaker discussed how multilayer networks, which integrate multiple modalities and time dimensions, facilitate an understanding of coordinated behaviors across various scales. This approach allows for comparisons between, for example, grassroots coordination and inauthentic or even orchestrated activity, offering new insights into the evolving nature of online coordination.

The first post-lunch session shifted the focus to datasets on coordinated sharing behavior in non-Western contexts.

The first speaker was Raquel Recuero, who presented a sociolinguistic analysis of Russian influence on Brazilian Telegram channels. Drawing from a linguistic framework, Recuero highlighted how language and discourse serve as instruments of power. Using the Telegram API, her research team collected data from Brazil-focused channels over the past six months, analyzing 20,167 posts by over 300 users. Recuero observed pro-Russian content in groups discussing topics like weaponry, war, military strategies, politics, and architecture, indicating a deliberate push of alternative narratives favorable to Russia.

The second presenter, Felipe Bonow Soares, focused on the spread of misinformation in Brazilian Facebook buy-and-sell groups. He emphasized that content moderation in non-Western regions is less consistent, allowing misinformation on topics like health and political conspiracies to remain prevalent. Soares suggested that enhanced automated moderation could help address these issues, although current tools are limited in their effectiveness with non-English content.

Aytalina Kulichkina, the third speaker of this part of the conference, examined coordinated behaviors in the context of authoritarian regimes, analyzing the synchronization of pro- and anti-government content in Russia and China. Kulichkina’s research found that both state-sponsored and protest-supporting groups use coordinated behavior to amplify their messages, with state actors often targeting international audiences through English-language content.

Finally, Fabio Giglietto and Giada Marino closed this session with insights from the vera.ai Alerts project, which tracked and analyzed coordinated accounts on Facebook over the period from October 2023 to August 2024. Using CrowdTangle, they identified 7,068 coordinated posts, 10,681 coordinated links, and 2,126 new coordinated accounts, with notable activity in large groups, casino engagement clusters, and Putin-support networks. Their research integrated an LLM-based narrative extraction pipeline developed with Massimo Terenzi, involving data gathering, image-to-text transformation, embedding, dimensionality reduction, and clustering. They identified distinct clusters, such as those focused on casino-related engagement. Challenges emerged, particularly with real-time API workflows, Meta's restrictive data library access, and TikTok’s inconsistent API functionality. These limitations pose significant barriers to real-time, multimodal coordinated content tracking on social media platforms.

[Photo gallery: Impressions Part 2. Credit: Bruna Almeida Paroni]

The final conference session involved talks on cross-platform coordination detection. Detecting cross-platform coordination presents additional challenges due to the varied structures and policies of each platform.

The session started with Ahmad Zareie’s presentation on work that categorized coordination detection into interaction-based and similarity-based methods, highlighting their limitations, such as failing to capture independently active accounts. To address this, he proposed a hierarchical anomaly-based approach, analyzing anomalies across individual, pairwise, and group levels without set thresholds, accounting for behavioral frequency, effort, and temporal patterns.

Nicola Righetti presented CooRTweet, a platform- and content-independent coordination detection R tool co-developed with Paul Balluff, which allows for the analysis of multimodal and cross-platform data using a flexible threshold approach. The tool’s adaptable approach allows researchers to detect coordinated networks on any kind of platform and around any type of content. It also facilitates the analysis of coordinated networks in the context of non-coordinated accounts that might be influenced by coordinated activities.

Jakob Kristensen, in turn, studied cross-platform coordinated link-sharing behaviors using URLs as universal identifiers. His findings indicate that information flow between platforms is often organized around a few primary connections, with Facebook and Telegram showing significant coordinated activity. Kristensen’s work suggests that while cross-platform coordination is limited in scale, it remains influential in the propagation of specific narratives.
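
The idea of treating URLs as universal identifiers across platforms can be sketched in a few lines of Python. This is an illustration only, not Kristensen's method: the normalization rules, the list of tracking parameters, and the definition of a "flow" as going from the platform where a URL first appeared to every platform where it appeared later are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical list of tracking parameters that do not change the target page.
TRACKING_KEYS = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def normalize_url(url):
    """Reduce a URL to a platform-independent key (host + path + cleaned query)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in TRACKING_KEYS and not k.startswith(TRACKING_PREFIXES)
    )
    key = host + parts.path.rstrip("/")
    return key + "?" + urlencode(query) if query else key


def platform_flows(shares):
    """Count, per platform pair, how many URLs appeared on one platform first.

    shares: iterable of (platform, url, timestamp) tuples.
    """
    # Earliest time each normalized URL was seen on each platform.
    first_seen = defaultdict(dict)
    for platform, url, ts in shares:
        seen = first_seen[normalize_url(url)]
        if platform not in seen or ts < seen[platform]:
            seen[platform] = ts

    flows = Counter()
    for seen in first_seen.values():
        ordered = sorted(seen.items(), key=lambda item: item[1])
        origin = ordered[0][0]
        for platform, _ in ordered[1:]:
            flows[(origin, platform)] += 1
    return flows
```

Normalizing before matching matters because the same article is often shared with different tracking parameters on each platform and would otherwise be counted as distinct links.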

In the final paper of the day, Jennifer Stromer-Galley introduced the use of knowledge graphs in detecting coordinated inauthentic behavior, specifically in the context of the U.S. presidential elections. Currently, obtaining grants in the U.S. to study misinformation and disinformation is challenging. Neo4j, a company specializing in knowledge graph technology, offered a grant that Stromer-Galley was eager to utilize for this project. Knowledge graphs pre-encode data: a schema defining how data items should relate to each other is built up front, so that the data is pre-structured rather than having its relationships constructed afterwards. Stromer-Galley’s paper was a stark reminder about addressing the complexities of detecting coordinated activity within the constraints of limited data access.
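
The schema-first principle described above can be illustrated with a small Python sketch. It uses plain Python data structures rather than Neo4j or any other graph database, and the node types, relationship names and class are hypothetical examples, not the schema used in the project.

```python
from collections import defaultdict

# Hypothetical schema: which relationship may connect which node types.
SCHEMA = {
    ("Account", "POSTED", "Post"),
    ("Post", "LINKS_TO", "Url"),
    ("Post", "MENTIONS", "Candidate"),
}


class KnowledgeGraph:
    """Tiny in-memory graph that only accepts edges allowed by its schema."""

    def __init__(self, schema):
        self.schema = set(schema)
        self.node_types = {}
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add_node(self, node_id, node_type):
        self.node_types[node_id] = node_type

    def add_edge(self, subject, relation, obj):
        # The schema is checked at insertion time, so data is pre-structured.
        triple = (self.node_types[subject], relation, self.node_types[obj])
        if triple not in self.schema:
            raise ValueError(f"relationship not allowed by schema: {triple}")
        self.edges[(subject, relation)].add(obj)

    def accounts_linking_to(self, url_id):
        """Accounts that posted at least one post linking to the given URL."""
        accounts = set()
        for (subject, relation), posts in self.edges.items():
            if relation != "POSTED":
                continue
            for post in posts:
                if url_id in self.edges.get((post, "LINKS_TO"), set()):
                    accounts.add(subject)
        return accounts
```

Because every relationship is validated when it is added, a question such as "which accounts pushed the same link" becomes a simple traversal instead of a join that has to be reconstructed after the fact.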

Brief summary and way ahead

To sum up briefly: it was a highly interesting day, with a wide range of work presented and insights shared among researchers in multimodal and cross-platform coordinated behavior detection. More work is surely needed. So is access to relevant data. If any of the above has raised your interest and you want to know more, share your own work, or even cooperate on certain aspects, feel free to reach out to the respective individuals mentioned in the article.

Authors: Bruna Almeida Paroni and Massimo Terenzi (Uni Urbino), with inputs from Kamila Koronska (UvA)

Editors: Anwesha Chakraborty (Uni Urbino) and Jochen Spangenberg (DW)

vera.ai is co-funded by the European Commission under grant agreement ID 101070093, and the UK and Swiss authorities. This website reflects the views of the vera.ai consortium and respective contributors. The EU cannot be held responsible for any use which may be made of the information contained herein.