Taming the bottom half of the internet:

How news outlets manage, discipline, elevate and shape comment sections after the ‘constructive turn’

Team Members

Project facilitators: Ernst van den Hemel, Cedric Waterschoot, Gijs Wijngaard, Liesje van der Linden

Project Members:

Commenting Guidelines: Hannes Cools, Monika Maciuliene

The Guardian: Greta Timaite, Clair Richards, Sandy Rafaela Krambeck

Die Zeit: Bjorn Schiermer, Luca Hammer

El Pais: André Calderolli, André Rodarte, César Santos Silva

Visualisation: Andrea Elena Febres Medina, Priscila Yoshihara, Tim Olbrich

User-based moderation: Harlie Hendrickson


Key Findings

Through our meta-analysis of commenting guidelines, we found a growing focus on constructivity in online comment sections, as opposed to deleting unwanted or toxic comments. Using three media outlets as case studies, we further expanded the perspective on what these companies deem constructive comments. This perspective ranges from the mere absence of toxicity to educational content. Furthermore, we saw that a hybrid approach to moderation is common, i.e. supplementing human moderation with AI tools to filter incoming posts. While the platforms differ in their views on the role of commenting and the definition of constructivity, the turn towards active shaping of the commenting platform is shared among them.

1. Introduction

Readers want to be heard. When reading a news article, people want to share their opinion and engage in discussions about current events. This has been true across different eras, and technological innovation has continually changed the ways in which readers participate in journalism. In the age before the internet, the demand for participation resulted in newspapers publishing letters sent in by readers on all kinds of topics (Santana, 2011). Readers would write their opinion in a letter, and the editors would read the letters, pick the most relevant ones and publish those. However, things changed rapidly with the digital acceleration of journalism brought about by the internet.

The era of ‘participatory journalism’ arose (Domingo et al., 2008). News outlets would not only print their stories, but also publish them online, where any reader was free to engage with the news by leaving a comment in the comment section. This participatory form of journalism, which includes user-generated content, soon became a strongly hyped innovation (Frislich et al., 2019). Especially since deliberation is often viewed as a central aspect of democracy (Manin, 1987), the new online sphere was often thought of as the ideal solution to the practical limits of mass deliberation (Wright & Street, 2007).

However, despite the potential and the initial enthusiasm, the growing number of contributions to the comment sections of news outlets has been accompanied by increasing incivility (Coe et al., 2014; Cheng et al., 2015). This raises the question of to what extent comment sections need to be moderated (Gibson, 2019; Wright, 2009). As some news platforms struggled with moderation, the decision to close comment sections altogether became not uncommon (e.g. Goldberg, 2018; Hoekman, 2016).

What resulted is what we would like to call the constructive turn: news platforms became increasingly aware that only filtering out profanity is not sufficient to keep comment sections useful, constructive and respectful (Delgado, 2019; Yárnoz, 2019). Even though some news outlets still see the comment sections as disconnected from journalistic work, it has become more common for news outlets to feel that they have to take responsibility for comments. This is often done with human-powered moderation (El País, 2018, 2020), outsourcing the comment sections to social media (Sonderman, 2011), or resorting to the assistance of artificial intelligence (Delcambre, 2019; Delgado, 2019).

The aim of our project is to further investigate this constructive turn. For this reason, we focus on the way in which newspapers currently moderate their online discussions and have done so in the past. We compare the commenting guidelines of several news outlets to better understand their own ideas of how to maintain a constructive comment section, and we compare how they operationalize constructivity by looking at the features of their commenting platforms. Furthermore, we conduct several case studies to explore specific comment sections in more depth.

2. Research Questions

Our research questions are as follows:

Comment section guidelines

  • How do online news outlets take responsibility for their comment sections and keep the deliberation inside the comment sections constructive? In other words: what do they encourage and discourage in the commenting guidelines?

  • Can we detect a constructive turn in the way in which online news outlets take responsibility for their commenting sections? E.g. are commenting guidelines becoming increasingly focused on maintaining constructive deliberation instead of only keeping out incivility?

Case studies

  • What are the different ways in which our case study news outlets moderate their comment sections, and how do these different forms of moderation influence the discussions on their platforms?

  • The case study news outlets use human moderators to highlight the most important and constructive contributions to the online discussion. In what way do these highlighted comments stand out from the other comments? What features can we distinguish that cause them to be picked by the moderators to be most constructive?

3. Methodology and initial datasets

3.1 Comment section guidelines

In order to investigate how news outlets aim to keep their comment sections constructive, and how their perspective on this has changed over time, we gathered a dataset containing the commenting guidelines of several English-language newspapers. This dataset includes their guidelines from three moments in time: 2010, 2015 and 2021 (see section 2.1).

Each set of guidelines was coded on what behavior was encouraged and what was discouraged by the news outlet, and what kind of discussion environment the news outlet expects from their commenters in general (e.g. entertaining, healthy, inclusive etc.). The factors on which the encouragements and discouragements were coded can be found in Table 1.

Subsequently, for all of the news outlet guidelines several comparisons were made. On the one hand the guidelines from the different news outlets were contrasted against one another in order to see how the different outlets operationalize constructive deliberation in their comment sections. On the other hand the guidelines from the same news outlets but from a different time period were also contrasted against one another, in order to gain more insight into the way in which the perspective of the news outlet on their discussion platform has changed over time.

What kind of discussion environment do the outlets expect?
Civil; Constructive; Diverse; Enjoyable; Entertaining; Healthy; Inclusive; Insightful; Intelligent; Interactive; Lively; Open-minded; Safe; Unique; Vibrant; Not identified

What is encouraged?
Add value / contribute; appropriate for all ages; ask thoughtful questions; be a good citizen; be brief/concise; be clear; be original; be polite; be relevant; consider impact on others; report others; respect others; respect people's privacy

What is discouraged?
Abusive towards moderation team; adult content (profanity, pornography); caps lock; contains virus or any other harmful software; content encouraging suicide; copyrights / intellectual property; duplicate content; false or misleading information; generalizations; harassment and bullying; hate speech (racist, sexist, homophobic or discriminatory); illegal activities; in contempt of court; insults; interrupt server work; invasion of privacy; misrepresentation / pretending to be someone else; no images; only in English; political content; restrict others from enjoying the site; spam, ads, commercial content, self-promotion; supports terrorism; vulgar language

Table 1: Coding scheme applied to the commenting guidelines.

3.2 Case studies

For each of the case study news outlets we collected information about their moderation practices and created datasets containing either comments from their comment sections or meta information about some of the discussions on their platform (see section 2.2 - 2.4). By combining the gathered information about their moderation practices and the collected datasets, we aimed to find how the different moderation practices influence the discussions on their platforms. In the following subsections we will further elaborate on the different methodologies of each case study.

3.2.1 The Guardian

For our case study on The Guardian we collected information about their moderation practices through online research and by conducting an interview with an editor at The Guardian.

In order to profile and explore political bias in the Guardian Picks, we combined visual analysis in Python with discursive analysis of the comments in the dataset on the monarchy of the United Kingdom. For the profile, four observation categories were created through discursive analysis: political spectrum, presence of criticism in the comment, type of comment, and agreement with the author of the article. To examine political bias, an article considered politically polarizing was annotated on these four categories in order to identify the presence of a multiplicity of opinions in the comments. The annotated subset of the original data contained 571 comments. After annotation, the number of comments per category was compared with the number of editor picks in order to see whether some categories led to a higher number of picks.

Furthermore, we trained a language model on our Guardian datasets to generate comments that could plausibly be selected as a Guardian Pick by the moderators. Our model was based on GPT, a model that predicts the next word in a piece of text given the previous words. We started with a sentence-start token and let the model generate the rest of the text word by word. In this way, our model was capable of generating long comments with an interesting argumentative structure, at first glance almost indistinguishable from a real comment.
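The report does not include the generation code itself; a minimal sketch of this setup, assuming a GPT-2-style checkpoint fine-tuned on the Guardian Picks (the checkpoint path below is hypothetical), might look like this:

```python
# Minimal sketch: sample a comment from a GPT-2-style model fine-tuned on
# Guardian Picks. The checkpoint path "./guardian-picks-gpt2" is hypothetical.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("./guardian-picks-gpt2")

# Start from the beginning-of-sequence token and generate word by word.
input_ids = tokenizer.encode(tokenizer.bos_token, return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=512,                  # the report caps generation at 512 tokens
    do_sample=True,                  # sampling rather than greedy decoding
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```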

Lastly, a mixed-methods approach was used to explore the dataset on climate change. Initially, exploratory data analysis was employed to find out how many comments are Guardian Picks (≈ 0.5%) or were deleted after publication (≈ 2%), what proportion of articles have at least one Guardian Pick (≈ 84%) or at least one deleted comment (≈ 86%), and other statistical information. Then text analysis methods were applied to explore the comments. In particular, bigram wordclouds and topic modelling (LDA) were applied to all the comments and to the Guardian Picks only. The goal was to find out whether any differences emerge in the language/discourse used in the Guardian Picks compared to the whole dataset. Finally, a qualitative approach was used to explore the articles that are not fully moderated (“full moderation” here refers to articles that have both Guardian Picks and deleted comments). This was used to detect cases where no post-moderation takes place (no Guardian Picks and no deleted comments) and to determine what is happening in those comment sections.
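As an illustration of the initial exploration step, a short pandas sketch (the file and column names are our assumptions, not the project's actual code) could compute these proportions as follows:

```python
# Sketch of the initial exploration, assuming a table with one row per comment
# and hypothetical columns 'article_id', 'is_pick', 'is_removed'.
import pandas as pd

comments = pd.read_csv("guardian_climate_comments.csv")  # hypothetical filename

share_picks = comments["is_pick"].mean()       # ≈ 0.5% in our dataset
share_removed = comments["is_removed"].mean()  # ≈ 2%

per_article = comments.groupby("article_id").agg(
    has_pick=("is_pick", "any"),
    has_removed=("is_removed", "any"),
)
print(f"Guardian Picks: {share_picks:.1%} of comments")
print(f"Removed comments: {share_removed:.1%} of comments")
print(f"Articles with at least one pick: {per_article['has_pick'].mean():.0%}")        # ≈ 84%
print(f"Articles with at least one removal: {per_article['has_removed'].mean():.0%}")  # ≈ 86%

# Articles with neither picks nor removals show no visible post-moderation.
no_postmod = per_article[~per_article.has_pick & ~per_article.has_removed]
```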

3.2.2 Die Zeit

To gain insight into the moderation practices at this news outlet, we looked at its so-called netiquette and rules for commenting. These rules consist of three pages: the netiquette (i.e. what a good comment looks like), rules on what not to do, and finally a short insight into how the comment sections are moderated. Further information on the outlet's use of machine learning came from a blog series published during the initial phases of development; the resulting model is aimed at filtering out unwanted, toxic comments. This information can be used to gain insight into Die Zeit's perspective on moderation and what it sees as ‘the role’ of the moderator.

Subsequently, we looked at the metrics that we collected alongside our textual data. These include the percentage of moderated/removed posts and the number of editor's picks. This data can provide insight into the outlet's moderation strategy. We visualise these metrics in the form of networks of user interaction.

A third focus of this case study relates to the reasoning that moderators provide when adjusting or removing a comment. We gathered all of the given explanations and aim to pool them together in a coherent set of moderation choices. We investigate and interpret their moderation strategy further using this set of practices.

Finally, we provide insight into the presence of comment sections over time. Using the article dataset, we derive the ratio of articles with at least one comment. We also look at the trends of articles without comment sections in order to analyse whether the ability to comment on an article follows a pattern related to content or source of the article in question.
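A small sketch of this derivation, assuming an article table with hypothetical columns 'published' and 'n_comments':

```python
# Share of Die Zeit articles with at least one comment, per year.
import pandas as pd

articles = pd.read_csv("zeit_articles.csv", parse_dates=["published"])  # hypothetical file
articles["has_comments"] = articles["n_comments"] > 0

ratio_per_year = articles.groupby(articles["published"].dt.year)["has_comments"].mean()
print(ratio_per_year)
```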

3.2.3 El País

For this case study on El País we collected information about their moderation practices through online research and by conducting an interview with an editor at El País.

To understand changes in the moderation practices at El País we compared all the versions of its guidelines for online comments since 2014. These guidelines were written into the 2014 22nd revision of the Style Book (Libro de Estilo), a book first published in 1977 and now in its 23rd revised edition, which sets the ethical and technical guidelines for El País' editorial work and management; the regulations this book contains for online comments are accessible to online users through a link at the top right corner of every comment section, called Principles and Rules of Participation (Principios y normas de participación).

This particular year was chosen because it is when El País changed its moderation model to the present system and these rules appeared in the comment section for the first time. Since 2010, comments have been submitted by registered users through a proprietary platform, Eskup (El País, 2010; Galán, 2017). The rules consist of a list of 9 principles that state what is valued and what is rejected in the comment section. Further information on the mix of AI and human labour in the moderation strategy came from statements in articles in the news outlet itself (El País, 2015, 2018, 2020; Delgado, 2019; Yárnoz, 2019, 2021). These sources gave us a detailed picture of El País' model of online comment moderation and of the roles of the comments section, users and moderators.

Through a semi-structured interview with an editor and journalist from El País, we sought to clarify some aspects of the evolution of the moderation model and the limitations that the current model presents from the perspective of the interviewee. It is a subjective perspective, of course, but it allowed us to check our interpretations of the statements collected in the above-mentioned articles, namely about who does the moderation labour and how the Style Book plays a central role in defining the communication with the reader and online user.

Furthermore, our case study on El País focused on the observation that comment sections are not enabled for articles on all subjects, and that articles receive very different numbers of comments on the website. We therefore investigated the relation between the subject matter of the articles and the number of comments, in order to see, for example, whether the controversiality of the topic influences the number of comments.

In order to do so, we scraped metadata about the comment sections from the website of El País for three different topics (see section 2.4): ‘Climate change’, a global issue not specific to Spanish readers; ‘covid’, a global issue but with an immediate impact on readers' everyday life; and ‘Catalonia independence’, a national issue and therefore specific to Spanish readers. A sketch of such a scraper is given below.
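The sketch below is illustrative only: the URL list and the HTML selector are assumptions on our part and would need to be verified against the live elpais.com pages.

```python
import requests
from bs4 import BeautifulSoup

# Article URLs collected per topic ('climate change', 'covid',
# 'Catalonia independence'); filled in from the site's topic pages.
article_urls: list[str] = []

records = []
for url in article_urls:
    html = requests.get(url, headers={"User-Agent": "research-scraper"}).text
    soup = BeautifulSoup(html, "html.parser")
    # Hypothetical selector for the comment counter shown next to
    # "¿Y tú qué piensas?"; the real class name must be checked in the page source.
    counter = soup.select_one(".comments-counter")
    n_comments = int(counter.get_text(strip=True)) if counter else 0
    records.append({"url": url, "n_comments": n_comments})
```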

4. Findings

4.1 Comment section guidelines

Our analysis of the comment section guidelines showed a clear turn towards constructivity. Based on the time frames of 2010, 2015 and 2021, we saw in our coding that news outlets increasingly focus on what is encouraged rather than only on what is discouraged: guidelines contain ever more recommendations about what can be in the comment sections, not just restrictions. Below, we outline the findings along the lines of how we coded the dataset (see section 3.1): starting with the discussion environment, followed by our findings on moderation, and concluding with the findings on what is discouraged and what is encouraged in the news outlets' guidelines.

                                                                2010   2015   2021
What kind of discussion environment do media outlets expect?      9     22     47
Is moderation mentioned?                                          28     38     55
What is discouraged?                                             155    186    190
What is encouraged?                                                8     20     72

Table 2: Number of coded excerpts in the commenting guidelines.

Discussion environment

In 2021, the majority of news outlets include more information in their guidelines on what kind of discussion environment they want to create. Most strikingly, we saw five times more mentions than in 2010 of the kind of atmosphere they want in their comment sections. In Figure 1, all of the results regarding the discussion environment are visualised. The characteristics most present in the guidelines are civil (8), enjoyable (7), intelligent (5) and diverse (5). These were mostly absent in the guidelines that were on the websites in 2010, and their appearance can be placed in the larger development of online negativity becoming more present over the last years. In adding these guidelines, the news outlets want to stress ‘what they stand for’, as The Guardian puts it. By adding these characteristics at the beginning of the guidelines, they set the tone for how they expect readers to respond. Being constructive from the start, they also stress what is not allowed and what they do expect commenters to do. In doing so, these news outlets hope to reduce the number of negative comments, and thereby also the workload of the moderation teams.

Figure 1: What kind of discussion environment do news outlets expect?

Moderation

Our analysis of the guidelines of the 20 largest news outlets showed that moderation is more present in the current guidelines than in those of 2010 or 2015. Of course, moderation has always been part of the work of news outlets to some degree, but most of the comments on websites in 2010 were not erased or moderated. In line with this, we noticed that moderation is mentioned more often in 2021 (55 times) than in 2015 (38 times) and 2010 (28 times). Where moderation is included in the guidelines, it is mostly to inform news consumers that their comments may be erased or edited. Extra information on how the process of moderation takes place remains absent, and there is a general lack of transparency on how news outlets filter comments (see also: Koliska and Diakopoulos, 2017). Some may use software such as filtering algorithms or tracking systems to filter out certain words or sentences, but that process remains unclear to the news consumer.

Discouraged and encouraged user actions

As mentioned before, each set of guidelines was labeled on what user behavior was encouraged and what was discouraged by the news outlet. The following sections give an overview of current practices and of the changes over the years.

The analysis shows that media outlets remained more or less stable in the number of discouraged elements (155 in 2010, 186 in 2015 and 190 in 2021). The contents, of course, changed over time. As the years pass, it is clear that the media outlets focus on the problems the Internet is plagued with. For example: harassment and bullying (most prominent in 2021); hate speech (racist, sexist, homophobic or discriminatory) (most prominent in 2021); content encouraging suicide (no mention in 2010 & 2015); false or misleading information (most prominent in 2021); insults (most prominent in 2021); invasion of privacy (most prominent in 2015 & 2021). Other elements, more related to the hygiene of discussion, were prominent too: users were discouraged from using languages other than English, from writing in caps lock, and from pretending to be someone else (misrepresentation). Some discouraged elements were more or less stable over the years, e.g. adult content (profanity, pornography), content violating intellectual property rights, and duplicates. A summary of the results is provided in Table 3 below.

Most prominent changes:

  • Harassment and bullying (most prominent in 2021)
  • Hate speech (racist, sexist, homophobic or discriminatory) (most prominent in 2021)
  • Content encouraging suicide (no mention in 2010 & 2015)
  • False or misleading information (most prominent in 2021)
  • Insults (most prominent in 2021)
  • Invasion of privacy (most prominent in 2015 & 2021)
  • Only in English (most prominent in 2021)
  • Caps lock (no mention in 2010 & 2015; most prominent in 2021)
  • Generalisations (no mention in 2010 & 2015; most prominent in 2021)
  • Misrepresentation / pretending to be someone else (most prominent in 2021)
  • Political content (no mention in 2010 & 2015; most prominent in 2021)
  • Spam, ads, commercial content, self-promotion (most prominent in 2021)
  • Supports terrorism (no mention in 2010 & 2015; most prominent in 2021)

Less focus over the years:

  • Interrupt server work (most prominent in 2015)
  • Contains virus or any other harmful software (most prominent in 2010)
  • Illegal activities (most prominent in 2015)
  • In contempt of court (most prominent in 2015)
  • No images (most prominent in 2015)
  • Restrict others from enjoying the site (most prominent in 2010 & 2015)

Stable elements:

  • Adult content (profanity, pornography)
  • Content violating intellectual property rights
  • Duplicate content

Table 3: Summary of discouraged elements in user-generated content.

The biggest quantitative changes are in what is encouraged in the guidelines. In 2010, the guidelines mentioned what is encouraged only 8 times. In 2015 this number increased to 20, and in 2021 to 72. This might indicate that the media outlets expect much more from their users and treat them as adept members of the community. Over the years, media outlets put focus on being relevant (mentioned 23 times in 2021), reporting others (15), being polite (10) and being a good citizen (8). The reporting aspect is especially interesting, because it shows that users are given tools to help shape constructive discussions themselves. Other, less mentioned elements include: add value/contribute (most prominent in 2021); appropriate for all ages (most prominent in 2021); ask thoughtful questions (no mention in 2010 & 2015); be brief/concise (no mention in 2010 & 2015); be clear (most prominent in 2015 & 2021); be original (most prominent in 2021); consider impact on others (most prominent in 2021, no mention in 2015); and respect others (most prominent in 2021). Figure 2 below summarises the results. The analysis of the identified dimensions shows quite clearly that media outlets expected more constructivity in their users' contributions in 2021.

Figure 2: What is encouraged in user generated content?

4.2 Case studies

4.2.1 The Guardian

Moderation at The Guardian

Our analysis of the moderation practices at The Guardian showed us that The Guardian is a community: a place for serious conversation and the exchange of structured, well-researched perspectives, but also a place for whimsy, wacky fun and snark. As an online manifestation of a model, functional community, The Guardian operates a hybrid system of machine and human moderation, as described in a written statement to the Parliamentary Communications and Digital Committee in April 2021: “the role of Robot Eirene does not replace human moderators, but rather it serves to reduce the volume of comments in our queues and to have high risk comments flagged to the moderation team. With a slightly reduced number of comments in the queues, human moderators are able to watch and pre-moderate more threads without significantly increasing their workload”. This method depends upon a commitment to generally good behaviour from commenters, strong levels of engagement, and faith in readers to quickly report comments that might be in breach of the guidelines. From our conversation with a source at The Guardian we learned that the moderators are a highly trained, closely knit team that often work in pairs. In the same written statement to the UK Parliamentary Communications and Digital Committee, The Guardian stated that “at the heart of our approach to moderating comment sections on The Guardian, is an internal team of experienced moderators, dedicated to improving the experience for readers and staff visiting the comments section below the line on our website.”

Perhaps surprisingly, the editor we interviewed at The Guardian indicated that where previously it was the editors who decided whether or not to open articles for comments, for a number of years now that decision has been made by the moderation team. We were informed that the moderators base their decisions on the availability of staff to perform the moderation and on their experience and understanding of which articles The Guardian commenters prefer to comment upon. The Guardian do not encourage comments on news and factual articles, but rather seek to manage comments on opinion pieces where there is space for nuance, argument, debate and education. During our conversation, our source at The Guardian suggested that they optimistically regard the comments section as a place with the potential for the reasoned debate of contemporary ideas. It is striking that, when speaking about the below-the-line community, our source referred to the readers of The Guardian as a distinct and separate group from the commenters. There is much debate within the moderation and editorial teams as to what the point of the below-the-line comment section actually is. From the interview, we learned that there is a desire to encourage participation and to elicit comments from those who might not ordinarily participate in online commenting. This is driven by the belief that the best way to demonstrate the meaningful and constructive nature of this activity is to show that the comments are read and responded to by the moderators, the writers of the articles, and fellow commenters.

The Guardian Picks were introduced in 2016 with the intention of highlighting and elevating the comments of certain users at the top of the comment section, with the option to jump to the comment in the thread. This elevation of certain comments is open to the criticism that it helps to create a hierarchy of commenters within the comment space. The Guardian uses Guardian Picks as examples of exemplary commenting behaviour. Picks are chosen for their educational value and for the quality of their argument or the manner in which they articulate their points. Guardian Picks do not have to agree with the article, but any disagreement must be expressed in a measured, inoffensive and intelligent way. These terms are vague and woolly, open to a range of entirely subjective interpretations; although the guidelines are clearly written and available on the site, in practice they appear to operate in a way where you find out what they mean only when you breach them. When we discussed the Community Guidelines with our contact at The Guardian, the guidelines were described as being very, very clear and, as such, offering the moderators and commenters a transparent framework that can be referred to at times of uncertainty. Great value is placed upon the transparency of the moderation process, which allows commenters who have had comments removed to appeal the moderation, receive feedback on what motivated the removal, and have the comment reinstated if the moderation is found to be incorrect. Far from operating an authoritarian space, The Guardian seeks to provide engaged, intelligent commenters with a dynamic and safe environment for the exchange of knowledge and ideas.

Profiling political bias

The Guardian Picks profile, drawn from articles related to the UK monarchy over a 6-month period, revealed (1) that moderators tend to highlight comments that are directly associated with the article; (2) that the number of characters in picks is 2.3 times greater than in general comments; (3) that most of the picks contain criticism of someone or something (67%); and, finally, (4) that the picks sit mostly on the left of the political spectrum. A closer look at the general comments section allowed us to infer that this bias derives from the individuals who comment on The Guardian and not from a moderation choice, since the distributions across the political spectrum are the same in both cases (picks and general comments). The attempt to pluralize the comments section was visible in the existence of criticism directed not only at the articles in question, but also at the newspaper itself.
Language model

We let the model trained on our climate change Guardian Picks dataset generate texts with a maximum of 512 tokens. The resulting texts are formulated the same way the Guardian Picks are, but are newly generated and thus never seen before. We see some interesting compositions in these generated sentences: some contain personal stories, while others formulate an opinion. From this we can deduce that most Guardian Picks are also formulated in this way; ergo, to get a Guardian Pick one could use the strategies found in the generated texts. We also see notes of sarcasm in our generated texts. On closer inspection, however, the generated comments are not totally coherent: it can easily be spotted that they were generated by a computer and not typed by a human. Lastly, to test whether our generated Guardian Picks would actually be deemed constructive, we placed a generated comment under a new climate change article. Unfortunately, this comment did not get selected as a Guardian Pick.
Climate change debate

We began by analysing the climate change dataset from a quantitative perspective. Exploratory data analysis methods were employed, and the analysis can be split into three parts: initial exploration; text analysis (wordclouds and LDA); and what we describe as “going further”, two qualitative case studies to better understand moderating practices in the Guardian comment sections.

Firstly, the initial exploration of the dataset started with a table of articles sorted in decreasing order by the number of comments. We paid attention to the articles with the most and the fewest comments. Interestingly, the article with the most comments does not have a single Guardian Pick, while the article that should have one comment does not display any on the webpage (this might have been caused by a problem in the API).

The second step was to explore the Guardian Picks. First, the total number was calculated: the dataset contains 386 Guardian Picks, about 0.5% of all the comments. A table of Guardian Picks sorted in decreasing order was computed; it showed that the article with the most Guardian Picks (9) is at the intersection of technology and climate change. Indeed, not every Guardian article has a Guardian Pick: of the 164 unique articles in the dataset, 138 have at least 1 Guardian Pick, which amounts to ≈ 84% of all the articles.

Finally, blocked/removed comments were analysed. While it is impossible for a researcher to know why a comment has been blocked, it is possible to analyse how many comments have been blocked. We found 1570 blocked comments, about 2% of all the comments, which is ≈ 4 times more than the Guardian Picks. This allows us to assume that The Guardian tries to create a space in the comment sections where good behaviour is a default expectation and, thus, normal. Moreover, just like with the Guardian Picks, not every article has deleted comments; the percentage is slightly higher, though, as ≈ 86% of all the unique articles in the dataset have at least one deleted comment. Furthermore, we tried to understand which comments are being blocked: are they generally “liked” and, thus, approved by other commenters, or not? We found that some blocked comments do receive a lot of support from other commenters; one deleted comment received 112 “likes”, which indicates a potential gap between what commenters consider “right” and “truthful” and what moderators consider a violation of the commenting guidelines. Indeed, this raises a question: who do moderators represent? Following the political bias analysis, moderators seem to support left-leaning comments, yet a more in-depth analysis is required.

After this initial exploration of the dataset, we decided to look at the content of the comments. Reading each and every comment is time-consuming, so it was decided to rely on bigram wordcloud analysis first. Wordclouds can be a useful method to visualise which bigrams (combinations of two words) are the most popular in the dataset and to learn whether any interesting or unexpected phrases emerge. First, a wordcloud of all the comments was generated; Figure 3 shows the top 50 bigrams. The phrase “climate change” was deleted, as it was by far the most popular bigram, which was expected given that the dataset is about climate change. The wordcloud also seems to contain a phrase from the deleted comments (“community standards”).



Figure 3. Bigram wordcloud. Data: all the comments from the Guardian Climate Change dataset
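For reference, the wordcloud step described above can be sketched as follows; `texts` stands for the list of comment bodies, and the library choices (scikit-learn and the wordcloud package) are ours, as the report does not name its tooling:

```python
from sklearn.feature_extraction.text import CountVectorizer
from wordcloud import WordCloud
import matplotlib.pyplot as plt

texts = [
    "climate change is real",        # placeholder comments; replace with the dataset
    "climate change policy debate",
]

# Count all two-word combinations (bigrams) across the comments.
vectorizer = CountVectorizer(ngram_range=(2, 2), stop_words="english")
counts = vectorizer.fit_transform(texts).sum(axis=0).A1
freqs = dict(zip(vectorizer.get_feature_names_out(), counts))
freqs.pop("climate change", None)  # drop the dominant, expected bigram

cloud = WordCloud(width=800, height=400, max_words=50).generate_from_frequencies(freqs)
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```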

Then a wordcloud containing text from only the Guardian Picks was generated (see Figure 4). We wanted to find out whether the language used in Guardian Picks differs from the overall language used in the comments, reasoning that this might indicate what makes comments become Guardian Picks. We cannot spot substantial differences between the wordclouds. Therefore, it is likely that it is not so much about the language (or, to be precise, the vocabulary) but rather about the context in which those phrases are used in the Guardian Picks.


Figure 4. Bigram wordcloud. Data: Guardian picks from the Guardian Climate Change dataset

Finally, to supplement the wordclouds, Latent Dirichlet Allocation (LDA) was employed for topic modelling. Again, LDA was applied to all the comments and to the Guardian Picks only; the results can be seen in Figures 5a and 5b. Topic modelling confirmed our earlier impression that the content (in terms of topics and vocabulary) of the Guardian Picks and of all the comments is similar, and that the real difference thus lies in the context (how topics are addressed and how words are used), for which qualitative in-depth analysis seems the more appropriate approach.


Figure 5a: LDA, Data from all the comments; Figure 5b: LDA, data from picks
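A matching LDA sketch under the same assumptions (`texts` as the input corpus, scikit-learn as an assumed library choice, and a topic count chosen by us, since the report does not state the number used):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

texts = [
    "renewable energy investment",   # placeholder comments; replace with the dataset
    "carbon tax policy debate",
    "renewable energy policy",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(texts)  # document-term matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)

# Print the top terms per topic for inspection.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```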

We finished our text analysis with the conclusion that quantitative text analysis might not be the right method to understand why some comments become Guardian Picks. Due to time limitations it was impossible to go through all the Guardian Picks and read the articles to which the comments respond. Therefore, we asked: what if we look at the comment sections that do not have any Guardian Picks? Are there instances where articles have neither Guardian Picks nor deleted comments? If there are, what kind of article has comment sections with no post-moderation?

We found 9 articles which have neither Guardian Picks nor deleted comments. Carol Rumens' Poem of the week: Grey Natural Light by Katherine Horrex | Books | The Guardian caught our attention. This article received 226 comments, while the other such articles had 10-21 comments. In other words, the comment section was very active but faced no post-moderation. Why? Very likely it is linked to the content of the article, a review of a poem, which attracted an audience of commenters who wanted to immerse themselves in a discussion of literature. Indeed, the 226 comments that appeared below the line revealed a lively conversation between poets and those who appreciate poetry. The comment space contains a number of threads, but the most highly recommended comment has only 10 recommendations, which indicates the niche and distributed nature of the responses that were provoked or inspired by the poem of the week in May 2021. It seems that commenters responded to the poem as they worked their way towards understanding; they responded to one another and shared their experiences and emotions. Some also shared their observations about the changing patterns of the annual migration of birds and wildlife; discussions about the environment are common and appear frequently in the comments. Many of the comments are long and wordy, as people find their way to their argument or take their time to make their point in luxuriously long texts. It is worth noting that not every commenter ‘gets’ or enjoys the poem but, nevertheless, this comment section emerges as an example of a communal space operating online, free of toxicity, menace and aggression. Essentially, this comment space is an example of a certain magic that no quantitative or qualitative analysis could ever hope to explain fully, but which can still be captured when the gaze is directed from the (post-)moderated to the non-(post-)moderated spaces.

The Guardian’s Pick competition

Gently motivated by the spirit of research and competition, we created a comment profile on The Guardian with the username Infinitebunny. Infinitebunny's mission was to have a comment selected as a Guardian Pick. On a practical level this meant reading The Guardian website intently, looking for fresh articles with open comment sections, and then placing suitably mature, friendly and engaging comments that added to the debate in a respectful manner. This presented a number of challenges, since comments can close very quickly after articles are published. Infinitebunny placed carefully worded comments that aspired to follow the guidelines on 17 articles on a range of topics. Success came with a typo-ridden comment on a pre-moderated article about Great Britain's young female Olympic skateboarder; the comment mixed personal experience with a touch of feminism and support for the skateboarder (Half-pipe dreams: girls on the edge of skateboarding glory | Skateboarding | The Guardian).


However, Infinitebunny's time below the line was not entirely without incident, since a poor or misguided interpretation of the guidelines led to one comment placed by Infinitebunny being removed, and some replies to Infinitebunny's comments were removed by moderators as well. Thanks to hastily made screenshots and a record of these exchanges, we can see some examples of comments that are removed in the spirit of community and constructivity. Naturally we can draw no larger conclusions from this, but in the case of Infinitebunny the removed exchanges were benign, to say the least.

4.2.2 Die Zeit

Moderation practices and Netiquette

The primary goal of moderation at Die Zeit is to allow more people to participate in the discussion. In the words of Julia Meyer, Community Team Lead at Die Zeit: “No-one should prevent others from sharing their opinion and knowledge by posting insulting or ostracising comments”. While the community guidelines of Die Zeit, called Netiquette, are similar to those of other publications, they are not prominently linked anywhere: they are neither shown to new users during signup, nor linked around the commenting section or even in the terms of service (AGB). This hints at the trust that Die Zeit puts in its readers.

Another important aspect of their moderation style is how they handle comments that violate their rules. They do something that feels intrusive at first: they edit the comments of their users by removing parts of them. On closer inspection this practice does not reduce the amount of allowed speech, but increases it. Instead of removing the whole comment, the moderation team removes the violating part and leaves the rest of the comment up. Through this practice, even opinions that contain insults or other violations can remain part of the discussion.

In addition to this cautious moderation, Die Zeit is very transparent when moderating comments. While other platforms simply direct users to their guidelines when removing comments, Die Zeit gives commenters specific reasons why a comment was partially or completely removed. Those messages have a specific markup to make them clearly identifiable as moderation actions. We found over 1,500 different reasons for moderation interventions. Some of them contain typos, which suggests they are written by hand rather than picked from templates, and that moderators are encouraged to be as specific as possible.

Most moderation notices end with the initials of the moderator responsible for the intervention. This reduces the distance between commenters and moderators as they can see which moderator moderated which other comments in a thread. In combination with the individualized notices, this makes moderators appear as members of the community instead of anonymous, unapproachable black boxes.

From 2013 to 2018 the number of comments per week at Die Zeit increased from around 10,000 to over 50,000, and it has stayed at that level since.



Reasons for removing comments

Because Die Zeit gives specific reasons for moderation actions, we were able to analyze them. We identified keywords to match the notices to categories from moderation guidelines worldwide (see Appendix). With just 51 keywords we were able to match 99% of the 23,713 notices; a single comment can fall into multiple categories for editing/removal. Nearly half of the notices (45.6%) said that the comment was removed because the comment it referenced had been removed. This means that those comments did not violate any rules themselves, but were removed because they referenced a comment that no longer existed.
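A keyword matcher of this kind can be sketched in a few lines; the keywords below are illustrative stand-ins, not the project's actual list of 51:

```python
# Illustrative keyword matcher for sorting moderation notices into categories.
CATEGORY_KEYWORDS = {
    "referenced comment removed": ["entfernten kommentar", "bezugskommentar"],
    "false or misleading information": ["falschbehauptung", "irreführend"],
    "not relevant": ["themenfern", "off-topic"],
    "insults": ["beleidigung"],
}

def categorize(notice: str) -> list[str]:
    """Return every category whose keywords occur in a moderation notice."""
    text = notice.lower()
    return [category for category, keywords in CATEGORY_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]

# A notice can match several categories, mirroring the observation that a
# single comment can be edited or removed for multiple reasons.
```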



About a quarter of the notices (26.6%) gave as a reason that the comment contained false or misleading information; the comments in that category ranged from polemic to proven lies. Just over ten percent (10.4%) of the comments were not relevant enough, which in most cases meant that the commenters wrote about a topic that had no connection to the article under which they commented. While the other categories concern things that are forbidden according to most guidelines, being relevant is usually part of the wanted section. Less than ten percent (8.6%) of the notices mentioned insults; this is the first category that falls under the label toxic. All other categories were named in less than five percent of the notices: vulgar language (2.2%), generalisations (1.8%), request of the author (1.4%), meta-discussion about commenting (1.2%), duplicate content (0.7%), hate speech (0.6%), spam (0.1%). A single comment was removed because it was written in all caps.

Comment sections over time

Similar to other platforms, Die Zeit highlighted good comments as editor's picks (Redaktionsempfehlung) for several years. But, as can be seen in the illustration above, they stopped doing so. While in 2010 nearly 3% of comments were highlighted, that number went down to 1.5% in 2011 and under 1% in 2012. In 2013 nearly 2% were highlighted, but in the following years the ratio went down again; in 2016, 2017 and 2018 only a tiny percentage was highlighted.

Instead, Die Zeit started to highlight its moderation decisions with the special markup. There had been transparent moderation notices before, but without special markup. Since 2016 the ratio of partly or completely removed comments has been quite stable, at between 6% and 7% of all comments.

4.2.3 El País

Moderation practices

El País offers readers a few ways to express their opinions. There is the section of the Readers' Advocate (Defensor del Lector), an ombudsman to whom readers can write letters or emails with complaints, issues relating to poor service, or breaches of people's rights. Readers can also present their concerns by writing to the director. Then there is the actual comment section, found at the bottom of selected articles under the title What do you think? (¿Y Tú Qué Piensas?).

From our observations, it is possible to say that El País pre-selects the topics for which it opens articles to comments. This is already a kind of moderation, as it indicates whether the paper believes people should discuss a given issue. Since 2014 the Style Book has set the ground for all other documents regulating user commentary on elpais.com. It states:

“EL PAÍS favors the participation of readers, always under a quality requirement that excludes insults, disqualifications and considerations not related to the subject in question. The objective is to offer the reader a platform for debate and discussion.”

To be able to comment, one must complete a registration; users may then vote on each other's comments by pressing thumbs up/thumbs down icons. El País' moderating policy, contrary to that of other news outlets discussed in this report, does not emphasize constructive comments themselves; it highlights users that sustain respectful commenting behaviour. Commenting users can receive a badge of Highlighted user (Usuario destacado), selected by the editorial staff, and their comments carry a distinguishing graphic element to emphasize this: “The contributions of readers that were presented under the name of Open Forum will be identified from now on with the badge "Highlighted User" as they are profiles with a history of participation based on respect and the absence of disqualifications.” (El País, 2015). This badge logic rewards users for sustained ‘good practices’ instead of for the content of the comments by itself.

The status of highlighted users can change and, although we could not access a given user's commenting history, we were able to retrieve some cases where the user lost this ‘badge’.


El País partnered with Google Jigsaw in early 2019 on the Perspective project to decrease toxicity in its comment section. While a user is typing, the system automatically warns them that their comment might not be accepted by moderation. In a news article, the outlet states that the objective is “to raise the quality of the debate that occurs through these comments and to encourage conversation among readers within EL PAÍS platforms.” (Delgado, 2019; El País, 2018).

Alongside the AI, El País also relies on human moderators to filter unwanted comments. This work is done by a company, Interactora, that has worked with El País since 2012 and now does so in articulation with Perspective (Delclós, 2012; El País, 2020). This outsourcing of moderation is made explicit in an article by the Reader's Advocate (Delclós, 2012).

Delgado says El País has a “data warehouse (a system we use for data analysis)” on the users, with which, “alongside the sentiment of the articles, we can now check if certain authors are associated with toxic comments, if the sentiment of the articles influences toxicity, or if some commenters always have high toxicity across all their comments.” It is worth noting that this statement also expresses a concern with the articles themselves, so we can infer that the database serves to tackle unwanted comments as well as problematic news content: ‘toxicity’ seems to be located both in readers' comments and in reporters' articles.

Reading the list of 9 Principles and Rules of Participation gave us some interesting insights. The list has been quite constant over the period of analysis (2014-2021), with no changes in its content. But contrasting these rules with those expressed in the Style Book reveals changes in their perceived importance. The first entry of the list implies that El País' moderation model considers anonymity likely to lead to unwanted behaviour, so to comment “the author must identify himself with his first and last name.” That principle is also stated in an article presenting new guidelines in 2015: “users who wish to contribute their opinions in the space provided for commenting within the articles on elpais.com must be identified with their first and last names”, under the belief that civil behaviour requires “relationships based on full personal identification to which this newspaper does not want to be alien” (El País, 2015).

Notably, this list of principles is taken from the Style Book (pp. 66-67), but with some interesting modifications: two articles are completely new and two were dropped from the online list. The principle of identification of the commenting user used to be further down the list, in 5th place in the Style Book, and is now at the top of the online list, implying that elpais.com gives it more relevance in the web realm of communications. The dropped articles referred to the usage of commentaries across the different media of El País, stating that “comments made in EL PAÍS can be published simultaneously on the main social networks within the expectation to expand the forum to other conversation spaces” and that “photos and videos will be rejected”, the latter maybe due to limitations in bandwidth back in 2014.

The most relevant aspect is the 9th article, entirely new in the online rules, defining the status of highlighted user:

“Users with a comment history based on respect and no disqualifications will be highlighted graphically and will have priority for posting. When a prominent user modifies his profile photo or her personal data, the distinction will be suspended until completing a validation process.” (El País, Principios…)

The Principles and Rules of Participation also state that the “moderation policy will guarantee the quality of the debate, which must be in accordance with the principles of plurality and respect for EL PAÍS contained in its Style Book.” With this purpose, the newspaper states that it “will be very strict in rejecting insulting, xenophobic, racist, homophobic, defamatory or other opinions that are considered unacceptable.” It continues by stating that “disagreement and difference of opinion are basic elements of the debate” and that “insults, personal attacks, discriminations or any expression or content that deviates from the correct channels of discussion have no place in EL PAÍS”. These passages can be taken as the definition El País goes by in terms of ‘constructive’ commenting, although it is not clear what “the correct channels of discussion” are.

In order to better understand the themes that make up the guidelines, we generated two word clouds visualising, by the prominence of words, what is valued and what is censored according to the guidelines. The left cloud is composed of the words describing what is not ‘constructive’, and the right cloud of the words used to describe ‘constructive’ comments. This understanding of ‘constructive’ practices is also addressed by El País' Reader's Advocate, who writes:

“In our hypertechnological and global world, opinions abound and those who like to express them in writing. That explains the more than 13,000 daily comments received by this news outlet that open to this possibility in the different digital editions of EL PAÍS, just over 200 pieces. A difficult flow to moderate that causes friction and complaints from users from whom I frequently receive messages. (…) The problem is the degree of aggressiveness that, in general, prevails in social networks. A little confusion can open the door to intolerable comments, which deteriorate the quality of the dialogue.” (El País, 2015a)

Writing in all caps is also treated as a breach of one of the most-quoted ‘netiquette’ rules. Both the Style Book and the Principles and Rules frame avoiding it as a form of civil behaviour in online text conversation, bluntly stating: “Messages written in capital letters will be rejected.”

We also made attempts to grasp what the Perspective API filters as ‘toxic’ and how this differs according to cultural standards. To test this, we wrote the same sentence in different languages in the API's test feature. We used sentence A, «You are a communist!», and sentence B, «MESSI IS GREAT BUT IS NOT RONALDO», and recorded the toxicity levels detected by the API (the likelihood of being toxic).

                                          English  Spanish  Portuguese  German  French  Italian
A: «You are a communist!»                  51.20%   30.57%      71.21%  57.07%  19.91%   28.87%
B: «MESSI IS GREAT BUT IS NOT RONALDO»     11.52%    2.75%       9.71%  11.54%   3.89%    3.81%

The levels of toxicity detected by the API varied greatly between languages, sometimes by more than 35 percentage points: from a maximum of 71% to a minimum of 20% for sentence A, and from 11.5% down to 3% for sentence B. Notably, German showed comparatively high toxicity likelihoods in both cases.
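We used the interactive test feature, but the same comparison can be reproduced against the Perspective API's REST endpoint; the sketch below assumes a valid API key, and the per-language translations of the test sentences have to be supplied by the caller:

```python
import requests

API_KEY = "YOUR_PERSPECTIVE_API_KEY"
URL = f"https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze?key={API_KEY}"

def toxicity(text: str, language: str) -> float:
    """Return the TOXICITY summary score (0-1) Perspective assigns to `text`."""
    body = {
        "comment": {"text": text},
        "languages": [language],
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=body).json()
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# The sentence is translated into each target language before scoring, e.g.:
print(toxicity("You are a communist!", "en"))
print(toxicity("¡Eres un comunista!", "es"))
```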

Our observations of the comment section suggest that human moderators are likely to intervene only after the AI has flagged a comment. There are occasions where users address the moderators expecting them to correct some issue, but receive no attention.

As a preventive moderation policy, El País explicitly leaves a trace of erased comments, thereby asserting its authority over the conversation space: “when a comment is erased, the space it occupied appears empty. There is no camouflage.” (Delclós, 2012).

Another interesting aspect of El País' moderation system is the interface of the comment section. Even though the guidelines say that debate is encouraged, that goal is not reinforced by the design affordances of the platform, which seem to prioritize individual responses to the article rather than a conversation.
The presence of comment sections on different topics

From a network analysis of the articles we were able to scrape for each of our chosen topics, we realized a few things. The first is that, even though the interface does not necessarily promote debate, people are willing to engage in it. Specifically for the article on the Catalonia issue, the interaction between users was high.

Figure 6: User interaction on El País

The same is not true for the articles on covid and climate change. Here we might take into consideration what we observed in our interviews: editors prefer to open comments on articles that involve nuance or describe shared experiences. The article on covid had some interaction, even if less than the one on Catalonia, probably because, even if it is not exactly a nuanced topic like the political one, it describes a shared experience. The last article, on climate change, had basically no interaction apart from a few exchanges among the same group of users. That might be because climate change is not necessarily a nuanced topic and, for many people, can still be far from personal experience.

4.2.4 Results of the case studies compared

With so much data collected from different platforms, the last step is to visualize it and identify patterns across platforms. This not only makes the data more accessible, but also reveals the bigger picture of how content platforms manage and moderate their users.

One way of moderating users and evaluating their behavior is through language recognition programs: algorithms that rate the toxicity of a comment from 0% toxic up to 100% toxic. To compare toxicity across two news outlets, we looked at the same article, “EU Commission wants to ban gasoline and diesel cars by 2035”, on El País and Die Zeit. The first 43 comments on each platform were rated with the Perspective API to obtain their level of toxicity. The visual model connects both dimensions: the toxicity level on the y-axis and the comments in order of appearance on the x-axis. Surprisingly, Die Zeit and El País show similar variances, as seen in Figure 7, though this exploration must be treated with caution because the dataset used is relatively small. Furthermore, Die Zeit seems to have an overall higher toxicity rating than El País. Reasons for that could be differences in how users are moderated on each platform, or the fact that the Perspective API evaluates German and Spanish in different ways.

Figure 7

Aside from the content, it is interesting to investigate the users behind it. As seen in Figure 7, different groups of users follow different patterns in terms of writing length. How often a user commented is displayed on the x-axis: a user who left only one comment is placed on the far left, while a user with more than 800 comments is found on the far right of the graph. The y-axis shows how long the average comment of that group is. In general, neither the users who post frequently under an article nor those who rarely post write long texts; instead, those in the first third of the x-axis tend to write more. This applies to Die Zeit as well as to The Guardian, although users on The Guardian write significantly less than Die Zeit users. Both platforms allow users to vote on other commenters’ posts, which is reflected in the number of upvotes in the bottom half of the graph. Comparing it to the upper section suggests that those who write longer comments often receive more appreciation from their peers through upvotes.
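
The aggregation behind this analysis can be sketched as follows; the column names are our assumption about the scraped data, not its actual schema.

import pandas as pd

# Assumed schema: one row per scraped comment.
comments = pd.DataFrame({
    "user":    ["a", "a", "b", "c", "c", "c"],
    "text":    ["short", "a much longer comment", "hi", "x", "yy", "zzz"],
    "upvotes": [3, 10, 0, 1, 2, 5],
})

comments["length"] = comments["text"].str.len()
per_user = comments.groupby("user").agg(
    n_comments=("text", "size"),      # posting frequency (x-axis)
    avg_length=("length", "mean"),    # average comment length (y-axis)
    avg_upvotes=("upvotes", "mean"),  # peer appreciation (bottom half of the graph)
)
print(per_user.sort_values("n_comments"))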

Figure 8

Figure 9

This leads to a further investigation of how users interact with each other on these platforms. On Die Zeit, users tend to form bubbles with others (Figure 8). Each dot represents a user; two users are connected when they interacted at least once, meaning one responded to the other’s comment. The network illustrates how users form clusters. These could be clusters per article, i.e. people comment and respond only within the same article. Clusters might also form around interests: one cluster could be the politics section of Die Zeit, while another is formed by users who stay within the culture section. The Guardian shows a totally different picture (Figure 9). Three big clusters make up the network, but users connect across them. Unlike on Die Zeit, commenters on The Guardian might base their interactions more on the behavior of their peers. While users whose comments were blocked appear everywhere, they accumulate in one corner (Figures 10 and 11).
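
A reply network of this kind can be rebuilt from the scraped threads in a few lines; the `replies` pairs below are invented for illustration.

import networkx as nx
from networkx.algorithms import community

# Assumed input: one ("replier", "replied_to") pair per response.
replies = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f")]

G = nx.Graph()
G.add_edges_from(replies)  # users are linked once they have interacted at least once

# Detect the "bubbles" as modularity-based communities.
for i, bubble in enumerate(community.greedy_modularity_communities(G)):
    print(f"cluster {i}: {sorted(bubble)}")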



Figure 10

Figure 11


Figure 12

5. Discussion

Our case studies confirm what the meta-analysis of the guidelines suggested: a focus on constructivity exists, and it can take multiple forms. The Guardian, for example, selects ‘picks’ from the posted comments to shape the debate, while at Die Zeit the moderation team provides reasons for deleting or adjusting a comment.

The case studies show further similarities. All three platforms use a hybrid approach, supplementing human moderation with AI systems: The Guardian developed its ‘Robot Eirene’, the German platform Die Zeit is working on its ‘Robot Zoë’, and El País uses the Perspective API.

Interestingly, while all platforms strive towards constructivity, there seems to be no agreement on what exactly this goal entails. What they do agree on, however, is that moderation ought to be centralized at the platform level, not distributed to the user base.

Comparison to user-based moderation

Slashdot was examined as an idiosyncratic example of a user-based moderation system that succeeds in fostering constructive participation and community engagement. While the site’s overseers reserve the right to remove certain comments (such as bigoted remarks), there is no pre-moderation; instead, the user base is tasked with rating comments on quality metrics such as “underrated,” “overrated,” “insightful,” “informative,” or even “troll.” Comments can be sorted by these scores to the reader’s benefit. Users can also mark others as a “friend” or “foe” and sort comments by those labels, creating networks of possible interactions and histories between users.
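
On the reader’s side, this system reduces to score-based filtering and sorting, which can be sketched as follows (the score range and labels follow Slashdot’s conventions; the comments themselves are invented).

# Each comment carries a community-assigned score (-1 to +5) and a label.
comments = [
    {"author": "anon", "score": -1, "label": "troll"},
    {"author": "kim",  "score": 5,  "label": "insightful"},
    {"author": "lee",  "score": 2,  "label": "informative"},
]

THRESHOLD = 1  # each reader picks their own cut-off

# Hide low-rated comments, then sort the rest so the best-rated come first.
visible = [c for c in comments if c["score"] >= THRESHOLD]
for c in sorted(visible, key=lambda c: c["score"], reverse=True):
    print(f'{c["score"]:+d} ({c["label"]}) by {c["author"]}')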

It is perhaps this range of options that makes Slashdot an excellent ecosystem for commenting. Additionally, the power of moderation lies in the hands of the many rather than the few, making the site more egalitarian than those with top-down moderation systems. Research on the activity and interactions between commenters and moderators, friends and foes, and the anonymous and the named on Slashdot would provide an interesting case study of the effectiveness of community moderation in fostering constructive dialogue.

6. Conclusions

The project has succeeded in documenting findings with regard to three dimensions; these, in turn, open up multiple vistas for future research.

Mapping ‘the constructive turn’

Through harvesting and visualising data from, roughly speaking, the last two decades, the project has documented a significant turn in editorial policy and attitudes towards user-generated content (comments). By mapping this shift in editorial guidelines, the project contributes to our insight into journalistic practice and its relation to societal dynamics such as the rise of post-truth, fake news and inauthentic communication.

Further research can deepen insight into these dynamics: whereas our project used methods such as the Internet Archive’s Wayback Machine to map changes in editorial policy, ample room remains for fine-tuning our knowledge of this turn. One can think of interviews with editors who worked at newspapers at the time, analysis of changing interfaces, and the way in which AI-assisted moderating practices entered the journalistic scene.
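
For reference, retrieving historical guideline pages works along these lines. The sketch uses the Internet Archive’s public availability endpoint; the queried URL is merely an example, not necessarily one we consulted.

import requests

def closest_snapshot(url, timestamp):
    """Return the archived snapshot of `url` closest to `timestamp` (YYYYMMDD)."""
    r = requests.get("https://archive.org/wayback/available",
                     params={"url": url, "timestamp": timestamp})
    r.raise_for_status()
    snap = r.json().get("archived_snapshots", {}).get("closest")
    return snap["url"] if snap else None

# Example: a commenting-guidelines page at two points in time.
page = "https://www.theguardian.com/community-standards"
for ts in ("20120101", "20200101"):
    print(ts, "->", closest_snapshot(page, ts))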

Such understanding of the constructive turn and its implications could be of applied value for debates across a wide variety of disciplines and knowledge domains. Academically, these findings inform media studies, but they also contribute to the computational analysis of online content and to the wider discussion on what constitutes constructivity and online deliberation.

Understanding contemporary hybrid practices of moderation

Our project has explored a variety of hybrid (i.e. human-led, assisted by machine learning algorithms) moderating practices. In particular, we have shown how editorial teams, moderators and AI interfaces collaborate and interact. Even the most careful attempts to have every decision made by an AI interface checked by a human editor can nonetheless be susceptible to bias and to the selective application of censorship. Moreover, editorial teams are aware of this and have, as far as we were able to document, little faith in AI-assisted moderation when it comes to recognizing what constitutes constructivity. In most cases, hybrid systems rely on AI-assisted interfaces to pre-select for toxicity, while human-led moderation is mostly applied to verify acts of censorship and to recognize and promote comments deemed constructive. This human element, illustrated by practices such as editorial picks and the verification of particularly ‘constructive’ comments, is vulnerable to its own biases and errors. Our research has shown that editorial teams themselves actively struggle with how to implement hybrid systems of moderating comments.
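
Schematically, the division of labour we observed reduces to a routing rule like the one below. This is our abstraction of the shared pattern, not any outlet’s actual pipeline, and the thresholds are invented for illustration.

def route(comment, toxicity_score):
    """Route an incoming comment given its model-estimated toxicity (0-1)."""
    if toxicity_score >= 0.9:
        return "reject"        # AI pre-selects clear toxicity
    if toxicity_score >= 0.5:
        return "human review"  # a moderator verifies the act of censorship
    return "publish"           # constructive comments are later hand-picked by humans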

Our findings can thus be of direct interest to editorial teams themselves. Newspapers such as The Guardian, El País and Die Zeit all have an active and ongoing interest in understanding the effects of their own moderation, and the interviews we conducted indicated interest in closer collaboration between academics and journalists.

Effects of moderating practices: The Commenter Her/Himself

Our findings have shown that one effect of the rise of ‘post-truth’ discourse has been that a variety of actors try to tame what has been called ‘the bottom half of the internet’, i.e. the comment section. Our findings indicate that a common tendency among news outlets is to construct, so to speak, a middle part of the internet. The top part is the article and the bottom part is the raw comment section; in between lies the subset of comments deemed constructive (e.g. through moderator picks, algorithmic selection, validated commenters or other means).

This middle space opens up a series of questions we have not yet fully explored. What are the effects of implementing such a middle space? One could easily envision researchers combining diachronic and synchronic approaches to measure the impact of the constructive turn. Does setting up editors’ picks work? Does the promotion of particularly well-behaved commenters influence the ill-behaved commenters, or the debate in which both partake? These findings are of interest to news outlets themselves, but also to developers of moderation systems and practices.

Questions at a more meta level also arise: what are the implications of setting up such a middle space, of taming unruly commenters through top-down selection procedures (by which, one might add, the commenter is drawn closer to the work and interests of the journalists and news outlets themselves)? What does such a space do to larger existing narratives and normative expectations of how to behave on the internet? Is this indicative of a new netiquette? What does it do to larger frameworks of freedom of speech? A large avenue of future research that remained largely unexplored during the Summer School project is the figure of the commenter her/himself. Future research can profit greatly by focusing not just on editorial interests in taming the internet, but also on the bottom-up reactions it engenders.

7. References

Borra, E. (2016). Lippmannian Device. https://wiki.digitalmethods.net/Dmi/ToolLippmannianDevice

Cheng, J., Danescu-Niculescu-Mizil, C., & Leskovec, J. (2015, April). Antisocial behavior in online discussion communities. In Ninth International AAAI Conference on Web and Social Media.

Coe, K., Kenski, K., & Rains, S. A. (2014). Online and uncivil? Patterns and determinants of incivility in newspaper website comments. Journal of Communication, 64(4), 658-679.

Delcambre, A. (2019, May 23). De nouvelles règles pour vos commentaires sur Le Monde.fr [New rules for your comments on Le Monde.fr]. Le Monde. https://www.lemonde.fr/refaire-le-monde/article/2019/05/21/vos-commentaires-sur-le-monde-fr-ce-qui-change_5465126_5330899.html

Delclós, T. (2012). Los filtros de la moderación [The filters of moderation]. El País. https://elpais.com/elpais/2012/03/18/defensor_del_lector/1332060943_133206.html

Delgado, P. (2019). How El País used AI to make their comments section less toxic. Google News Initiative. https://blog.google/outreach-initiatives/google-news-initiative/how-el-pais-used-ai-make-their-comments-section-less-toxic/

Domingo, D., Quandt, T., Heinonen, A., Paulussen, S., Singer, J. B., & Vujnovic, M. (2008). Participatory journalism practices in the media and beyond: An international comparative study of initiatives in online newspapers. Journalism Practice, 2(3), 326-342.

El País (2010). Que és Eskup? [What is Eskup?]. http://eskup.elpais.com/Estaticas/ayuda/quees.html

El País (2014). Libro de estilo [Style book] (22nd rev. ed.). El País. https://colecciones.elpais.com/index.php?controller=attachment&id_attachment=2

El País (2015). EL PAÍS mejora el sistema de comentarios en sus noticias [EL PAÍS improves the comment system on its news]. https://elpais.com/elpais/2015/03/24/actualidad/1427229587_101365.html

El País (2018). Inteligencia artificial para elevar la calidad del debate digital [Artificial intelligence to raise the quality of the digital debate]. https://elpais.com/sociedad/2018/12/17/actualidad/1545081231_439667.html

El País (2020). No, EL PAÍS no censura comentarios críticos con Pedro Sánchez [No, EL PAÍS does not censor comments critical of Pedro Sánchez]. https://elpais.com/elpais/2020/10/03/hechos/1601758726_104919.html

El País (n.d.). Defensor del Lector [Readers’ Ombudsman]. https://elpais.com/noticias/defensor-lector/

El País (n.d.). Principios y normas de participación [Principles and rules of participation]. https://elpais.com/estaticos/normas-de-participacion/

El País (n.d.). El País archive. https://elpais.com/archivo/

Frischlich, L., Boberg, S., & Quandt, T. (2019). Comment sections as targets of dark participation? Journalists’ evaluation and moderation of deviant user comments. Journalism Studies, 20(14), 2014-2033.

Galán, L. (2014). Cuando el remedio es peor que la enfermedad [When the remedy is worse than the disease]. El País. https://elpais.com/elpais/2014/10/25/defensor_del_lector/1414238580_141423.html

Galán, L. (2015). Opinar sí, pero con reglas [Opinions yes, but with rules]. El País. https://elpais.com/elpais/2015/10/22/defensor_del_lector/1445540820_144554.html

Galán, L. (2017). El papel, los vídeos y Eskup [Print, videos and Eskup]. El País. https://elpais.com/elpais/2017/07/01/opinion/1498861426_909572.html

Gibson, A. (2019). Free speech and safe spaces: How moderation policies shape online discussion spaces. Social Media + Society, 5(1), 2056305119832588.

Goldberg, J. (2018, February 2). We want to hear from you. The Atlantic. https://www.theatlantic.com/letters/archive/2018/02/we-want-to-hear-from-you/552170

Hoekman, G. (2016, August 11). Nu.nl stopt met open reacties onder artikelen [Nu.nl stops open comments under articles]. Nu.nl. https://www.nu.nl/blog/4305300/nunl-stopt-met-open-reacties-artikelen.html

Manin, B. (1987). On legitimacy and political deliberation. Political Theory, 15(3), 338-368.

Murphy, J., Hashim, N. H., & O’Connor, P. (2007). Take me back: Validating the Wayback Machine. Journal of Computer-Mediated Communication, 13(1), 60-75.

Santana, A. D. (2011). Online readers’ comments represent new opinion pipeline. Newspaper Research Journal, 32(3), 66-81.

Sonderman, J. (2011, August 18). News sites using Facebook Comments see higher quality discussion, more referrals. Poynter. https://www.poynter.org/reporting-editing/2011/news-sites-using-facebook-comments-see-higher-quality-discussion-more-referrals

Wright, S. (2009). The role of the moderator: Problems and possibilities for government-run online discussion forums. In Online Deliberation: Design, Research, and Practice (pp. 233-242).

Wright, S., & Street, J. (2007). Democracy, deliberation and design: The case of online discussion forums. New Media & Society, 9(5), 849-869.

Yárnoz, C. (2019). El lector gana protagonismo [The reader gains prominence]. El País. https://elpais.com/elpais/2019/10/26/opinion/1572078022_285897.html

Yárnoz, C. (2021). Los lectores rechazan cotos vedados [Readers reject closed preserves]. El País. https://elpais.com/opinion/2021-07-04/los-lectores-rechazan-cotos-vedados.html

8. Appendix

Keyword matching of Die Zeit comment moderation reasons. The categories come from the investigation into international comment section guidelines.

# Mapping from keywords in Die Zeit's moderation notes to guideline categories.
matches = [
    {'keywords': ['community@zeit.de', 'community-redaktion@zeit.de'],
     'category': 'abusive towards moderation team'},
    {'keywords': ['großschreibung'],
     'category': 'caps lock'},
    {'keywords': ['doppel', 'wiederhol', 'mehrfach'],
     'category': 'duplicate content'},
    {'keywords': ['fake', 'falschaussage', 'sachlich', 'polemi', 'quelle', 'beleg',
                  'differenzier', 'konstruktiv', 'argument', 'vergleich', 'relativier',
                  'spekul', 'falschinf', 'gesichert'],
     'category': 'false or misleading information'},
    {'keywords': ['verallgemeinerung', 'pauschal'],
     'category': 'generalizations'},
    {'keywords': ['sexistisch', 'homo', 'rass', 'diskr', 'gewalt', 'verl', 'hetz', 'antisem'],
     'category': 'hate speech (racist, sexist, homophobic or discriminatory)'},
    {'keywords': ['beleidi', 'unterstellung', 'diffamier', 'angriff', 'anfeind', 'spott', 'persö', 'respekt'],
     'category': 'insults'},
    {'keywords': ['spam', 'werb'],
     'category': 'spam, ads, commercial content, self-promotion'},
    {'keywords': ['thema', 'artikelinhalt', 'themen'],
     'category': 'be relevant'},
    {'keywords': ['geschmacklos', 'angemess', 'wortwahl', 'inhalt'],
     'category': 'vulgar language'},
    # Die Zeit's counterpart of the 'Only in English' guideline category:
    # comments must be written in German.
    {'keywords': ['deutsch'],
     'category': 'only in German (Deutsch)'},
    {'keywords': ['beziehen', 'bezug'],
     'category': 'missing reference'},
    {'keywords': ['wunsch'],
     'category': 'request by author'},
]
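
Applied to a moderation note, the list can be queried with a small helper like the following (`categorize` is illustrative, not part of the original notebook).

def categorize(note, rules=matches):
    """Return every category whose keywords occur in a moderation note."""
    note = note.lower()
    return [rule['category'] for rule in rules
            if any(keyword in note for keyword in rule['keywords'])]

# Example: a note asking a user to avoid insults and stay on topic.
print(categorize("Bitte verzichten Sie auf Beleidigungen und bleiben Sie beim Thema."))
# -> ['insults', 'be relevant']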
