The Democratic Perils of Hidden Moderation

Now that we hold a significant part of our discussions on social media platforms, worries and complaints about “shadowbanning”—the act of moderating online speech without notifying its author—have emerged in debates about online speech. This discontent seems to cut across partisan lines. In the United States, Republicans have vociferously accused technology firms of being biased against conservatives. At the other end of the political spectrum, LGBTQ+ advocates, Black Lives Matter supporters, and pro-Palestinian activists have also publicly expressed the worry that their voices are subtly silenced by large platforms such as Facebook, TikTok, Instagram, and Twitter. However, the idea that we should worry about shadowbanning is a controversial one. Indeed, many suspect that shadowbanning is often a figment of social media users’ imaginations. When a post fails to reach its author’s intended audience, the temptation to claim that one has secretly been punished can be strong. Moreover, the concept of shadowbanning is shrouded in ambiguity, as those who use it can’t seem to agree on a definition. Unfortunately, this ambiguity allows digital platforms to deny that they engage in it through a kind of semantic sleight of hand (“Oh, yes, we do moderate content without letting users know, but that’s not what I thought you meant by shadowbanning”). Consequently, some researchers have even proposed that we abandon the concept.

While I have no objection to sunsetting shadowbanning as a term, I believe the phenomenon it describes raises interesting philosophical questions. Is there anything wrong with what I prefer to call hidden content moderation? If so, who is wronged by it, and how? Should social media users have any recourse against it? In what follows, I suggest that hidden moderation is indeed objectionable because it impedes democratic citizens’ ability to fulfill their duty to help establish a just digital public sphere. Even if shadowbanning does not wrong users individually, it fosters a kind of ignorance that makes it harder for the democratic public to tell whether benefits and burdens are distributed fairly amongst members of society. For instance, hidden moderation makes it harder for us to determine whether all political speakers enjoy a fair chance to influence the political process, which is a core requirement of democracy.

Let me first explain what I have in mind when I speak of hidden moderation. A piece of digital content is the object of hidden moderation when (i) a social media platform limits its spread, (ii) because it is deemed undesirable, violative, or nearly violative of moderation policies, (iii) without notifying its author. Here, condition (ii) is noteworthy. In many instances, pieces of content spread poorly simply because recommendation algorithms predict that a large number of users won’t find them engaging. According to my definition, these are not cases of hidden moderation, which always involves the application of penalties by a human moderator or by a machine-learning model used for moderation. By way of contrast, here are some examples of interventions that do count as hidden moderation on my account (sketched more concretely after the list):

  • Not recommending a piece of content: a user posts content, but a moderator ensures that it will not be algorithmically recommended to other users (although it remains searchable). The author is not made aware of this.
  • Removing comments: a user comments on another user’s post, but the comments are visible only to the commenter, who is never notified that this is the case.
  • Hashtag unlinking: a user posts a piece of content under a hashtag, but other users cannot find that content through the hashtag. The author is not made aware of this.
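To make this taxonomy a little more concrete, here is a minimal, purely illustrative sketch of how a platform might record such interventions internally. The data structure and every name in it are my own invention for the purpose of this example; they do not describe any real platform’s systems.

```python
# Purely illustrative sketch: a hypothetical record of hidden moderation
# interventions. All names are invented for this example and do not
# correspond to any real platform's internal systems.
from dataclasses import dataclass, field


@dataclass
class Content:
    content_id: str
    author_id: str
    # Visibility flags that a moderator (or a moderation model) can quietly flip.
    recommendable: bool = True               # eligible for algorithmic recommendation
    comments_visible_to_others: bool = True  # comments shown to users other than their author
    hashtags_indexed: bool = True            # discoverable through its hashtags
    interventions: list = field(default_factory=list)


def apply_hidden_moderation(content: Content, intervention: str) -> None:
    """Limit a piece of content's spread without notifying its author (condition iii)."""
    if intervention == "no_recommendation":
        content.recommendable = False               # first example above
    elif intervention == "hide_comments":
        content.comments_visible_to_others = False  # second example above
    elif intervention == "unlink_hashtags":
        content.hashtags_indexed = False            # third example above
    content.interventions.append(intervention)
    # Crucially, nothing here sends a notification to content.author_id.
```

In every case, the content stays online and looks unchanged to its author; only its reach is quietly curtailed.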

If hidden moderation truly is hidden, how can we even know that it is happening? There are at least three main sources of information about this phenomenon. First, academic research has shed light on this issue. In a recent contribution, Tarleton Gillespie explains that “many social media platforms have quietly begun to identify content that they deem not quite bad enough to remove.” When they do, “the offending content remains on the site, still available if a user can find it directly” but “the platform limits the conditions under which it circulates: whether or how it is offered up as a recommendation or search result, in an algorithmically generated feed, or ‘up next’ in users’ queues.” Second, journalists have conducted investigations into hidden moderation. According to one such investigation, Instagram “heavily demoted nongraphic images of war, deleted captions and hid comments without notification, erratically suppressed hashtags, and denied users the option to appeal when the company removed their comments, including ones about Israel and Palestine, as ‘spam.’” Third, social media platforms often reveal that they engage in hidden moderation when publishing their moderation guidelines. On Facebook’s website, Meta thus explains that it uses “a strategy called remove, reduce, and inform to manage problematic content across the Facebook family of apps.” This involves “reducing the spread of problematic content that does not violate our policies.” Similarly, TikTok has publicly stated that it makes some content ineligible for algorithmic recommendation: “We make ineligible for the FYF, and may also make harder to find in search, certain content that may not be suitable for a broad audience.” Here, FYF stands for For-You Feed, which is TikTok’s main algorithmic feed. In other words, while many moderation interventions are hidden from users who are subject to them, the very fact that platforms engage in hidden moderation is common knowledge. So, what’s the problem?

One possible way to object to hidden moderation is to argue that it wrongs individual users by violating their moral rights. For instance, according to Kate Vredenburgh, people subject to power have a right to explanation grounded in their interest in informed self-advocacy. Accordingly, they should be able to understand what rules apply to them (and how these rules are applied) so that they can (i) comply and avoid penalties and (ii) hold decision-makers accountable to remedy mistakes or unfairness. While Vredenburgh has not examined the implications of the right to explanation for online expression, hidden moderation appears to set back social media users’ interest in informed self-advocacy. After all, users cannot modify their behavior on platforms to avoid penalties if they don’t know that their behavior is deemed violative or nearly violative. They also cannot identify mistakes and unfairness—and then attempt to remedy them—if they don’t know that penalties have been applied to them.

Furthermore, making moderation more transparent would not entail unreasonable costs for social media platforms (which, according to Vredenburgh, is another requirement that must be met for the right to explanation to apply). Trust and Safety professionals can typically see the history of moderation interventions that have been applied to a specific piece of content. If this is so, then platforms already have the practical means to make this history viewable to users or to the public at large.
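As a rough illustration of how small that cost might be, here is a minimal sketch, again using invented names rather than any platform’s actual tooling, of how an existing internal log of interventions could be surfaced to the affected user.

```python
# Illustrative sketch only. Assume Trust and Safety tooling already keeps a
# per-content log of interventions, modeled here as a plain dictionary with
# invented field names.
def moderation_history(content: dict, requester_id: str) -> list[str]:
    """Surface a piece of content's moderation history to its author
    (or, under a broader transparency policy, to any requester)."""
    if requester_id != content["author_id"]:
        return []  # minimal policy: only the author may view the history
    return [f'{content["content_id"]}: {step}' for step in content["interventions"]]


# Usage: the log that moderators can already consult becomes viewable to the user.
example = {
    "content_id": "post-123",
    "author_id": "user-42",
    "interventions": ["no_recommendation", "unlink_hashtags"],
}
print(moderation_history(example, requester_id="user-42"))
# ['post-123: no_recommendation', 'post-123: unlink_hashtags']
```

The engineering involved is modest; the substantive questions concern who should be able to see such a log, a point I return to below.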

Nevertheless, the argument I have just sketched faces important objections. First, some philosophers are quite skeptical that people have a right to explanation. In their view, this right is “either superfluous, impossible to obtain, or not the best way to secure relevant normative goods.” Second, even if such a right exists, Vredenburgh makes it clear that it only applies to what she calls “non-voluntary” institutions. These are institutions that “unavoidably exercise coercive or manipulative power, or distribute justice-relevant goods” and where “the high benefit of participation makes the latter non-voluntary.”

Do social media platforms count as non-voluntary institutions? This suggestion seems plausible to me. After all, philosophers have recently made the case that such platforms exercise coercive power over their users (here and here). Social media sites also distribute at least one justice-relevant good: attention, without which democratic speakers have no chance of reaching their intended audience and influencing the political process. Moreover, political speakers can reap significant rewards from their participation on social media. It is now quite difficult to imagine a political candidate winning office without recourse to large social media platforms. Certainly, such platforms remain voluntary in the sense that at least some people can easily choose not to use them (and thus avoid being subject to their rules). However, for others, the use of social media platforms is difficult to avoid. Lastly, note that some social media platforms are so large that they can impose significant burdens on everyone, including non-users. By way of example, a convinced Luddite who wouldn’t dream of opening a social media account could still be negatively impacted by Facebook or Instagram’s (hypothetical) refusal to combat medical misinformation, especially if she is immunocompromised.

As the reader can tell, I am not convinced that the argument I have just sketched fails. For the sake of the current discussion, however, I will assume that it does. In a nutshell, this is because a second argument is available to those who oppose hidden moderation. This second argument begins with the idea that democratic citizens have a moral duty to help establish just and fair institutions. Simply put, all members of a society—not just those who have to endure significant burdens—have an obligation to help realize justice and fairness. Furthermore, willingly creating and benefiting from a situation that makes it harder for people to fulfill their duty to create a just society amounts to wronging them as moral agents. And this is precisely what social media firms do when they engage in hidden moderation, or so my argument goes.

Specifically, opaque moderation prevents social media users and non-users alike from understanding whether platform power is exercised rightfully and whether benefits are distributed fairly amongst members of society. Consider once more the benefit that consists in having a fair opportunity to influence the democratic decision-making process. In general, hidden moderation prevents the democratic public from determining whether opportunity for political influence is distributed equitably on large digital platforms. Is it the case, for instance, that members of a socially relevant group—say, pro-Palestine activists, Black Lives Matter supporters, LGBTQ+ influencers, or political conservatives—are disproportionately affected by hidden moderation interventions? Currently, it is very hard to tell; it may be that they are, but many of the penalties to which they have been subjected are hidden from view. And when unfairness is invisible, it becomes harder for those who care about justice to combat it. Here, my argument is partly epistemological: when people do not know whether or how an institution is unjust, they cannot organize to try to make it more just. This dynamic applies to hidden moderation, which fosters a kind of ignorance that prevents us from realizing our moral ideals.

Hidden moderation also makes it challenging to determine whether platforms exercise coercive power consistently. This is worrying in a context where, when it comes to consistency, “platforms historically fared poorly, making up policies on the fly.” Considering this fact, there is a significant risk that similar users will be treated differently without good reason. Once again, the democratic public has no way of determining whether some speakers are penalized for posting content that other users post with impunity.

If I am right to suggest that hidden moderation impairs citizens’ ability to evaluate the current state of the digital public sphere, what kind of transparency would better allow them to do so? Should users be notified as soon as a penalty has been applied to their account? Perhaps not. To assess how socially relevant groups are affected by moderation, it might suffice for the public to have access to the results of internal research conducted by social media firms. Sometimes, the questions that investigative journalists take months to answer can easily be resolved by Trust and Safety professionals who have access to internal data. Consider, for instance, the question of whether images of war shared by pro-Palestine activists on Instagram are more often suppressed or flagged than those posted by pro-Israel users. While journalists at The Markup did not find any evidence that this is so, Meta’s employees are in a better position to find such evidence or to conclusively rule out this hypothesis.

Unfortunately, social media employees—even those employed as researchers—have little incentive to conduct the kind of investigation that would reveal, say, that members of a racial or sexual minority are more affected by hidden moderation than other users, especially under Trump’s second administration. It seems quite naïve to believe that large platforms’ leaders will choose to conduct such research when they have demonstrated their willingness to cooperate with Trump’s anti-DEI agenda. And if research only focuses on whether conservatives are unfairly censored, then we won’t get the full picture. In this context, allowing all social media users to see the history of moderation interventions applied to a given post or account might yield better results. It would certainly make it easier for them to identify inconsistencies in moderation operations (“I posted this content, user B posted very similar content, but only my post was flagged. Why?”). However, if tech critics or regulators push for a ban on hidden moderation, some exceptions are likely to be warranted. After all, alerting a 40-year-old user who has repeatedly attempted to “friend” 14-year-old users that penalties have been applied to his account seems like a bad idea. As always in Trust and Safety operations, the practical challenge is to distinguish the cases in which greater transparency will not increase the likelihood that a user breaks the law or violates moderation guidelines from those in which it clearly will.

One final question: are there any signs that the democratic public struggles with the task of identifying unfairness in the digital public sphere? By way of conclusion, I suggest that the enduring moral panic about social media bias counts as one. As is well known, American conservatives frequently accuse digital platforms of unfairly censoring their speech. Although there is some evidence that this is not the case, hidden moderation undoubtedly fuels this panic. If content moderation were perfectly transparent and the public had good data about the fate of posts shared by conservatives and progressives on large platforms, we could settle this debate or, at least, point to statistical evidence when doubtful accusations are made. Currently, we cannot, and social media platforms are largely responsible for this situation. Their leaders have chosen to create a digital reality in which the evidence that would allow us to establish whether content moderation is fair either does not exist or is not accessible to the public. When users speculate about whether their posts are unfairly censored, one option is to request that they stop making these claims in the absence of reliable evidence. Another approach involves creating a situation where these users are sufficiently knowledgeable, thereby rendering speculation unnecessary. We have tried the first to no avail. Let’s try the second.


Étienne Brown

Étienne Brown is an Associate Professor of Philosophy at the University of Ottawa, Canada. His work sits at the intersection of political philosophy and the philosophy of technology, with a current focus on the philosophical implications of content moderation and algorithmic recommendation. He is the founder and co-organizer of PhilMod, a community of academic researchers and technology professionals interested in the ethics of social media platform policy.
