Conspiracists Also Viewed: ‘Problematic’ Networks of Recommendation on Amazon.com
Team Members
Dylan O'Sullivan, Ekaterina Khryakova, Jingyi Wu, Mingzhao Lin, Yingying Chen
Facilitators: Tommaso Elli and Jonathan Gray
Essay Summary
This project examines the relationships between a list of problematic books sold on Amazon and the books recommended alongside them in the “also-viewed” section. The main tools used in the research were the DMI recommendations scraper (Peeters, 2020) and Gephi. We began by manually collecting, on Amazon.com, the “also-viewed” recommendations of the conspiratorial books (problematic status marked as 5) in the Problematic Infomedic Amazon Book list, which was provided by the Infodemic Amazon project team. Doing so allowed us to observe the recommendation algorithms directly on the platform. The DMI recommendations scraper was then used to enlarge the dataset with the results of a direct “COVID” search. Gephi helped us to visualise the data collected from Amazon and allows both researchers and readers to see the networks between different books clearly.
The conspiratorial books formed a network in which a non-conspiratorial book can act as a bridge between two conspiratorial books. That is, one can reach a non-conspiratorial book via the recommendation section of a conspiratorial book and, from the recommendation section of that non-conspiratorial book, find another conspiratorial book.
From the results and the accompanying analysis, readers can develop a better understanding of the recommendation system and algorithm for books sold on Amazon. They can also see which genres of books are most closely related to COVID-19 conspiracy theories.
Research Questions
RQ 1: How easily might one enter a conspiratorial network on Amazon.com, via recommendations or otherwise?
RQ 2: How interconnected are conspiratorial books and genres—nodes and clusters— within the broader conspiratorial network?
RQ 3: What is the connectivity—at an individual and generic level—between conspiratorial and non-conspiratorial books?
Methodology
In this research, we first examined the interface of Amazon.com. We manually collected information from the platform to gain a more comprehensive understanding of its wider infrastructure: namely, the “behaviour” of the platform, the topics and themes of books, and the logistics of the recommendation system “in the field,” amongst other things. Understanding the logic of the recommendation system requires looking at the interface of the platform directly.
Subsequently, we found that the recommendation methods of Amazon can be divided into six categories: “Frequently Bought Together”, “Viewed Also Viewed”, “Bought Also Bought”, “Buy After Viewing”, “More To Explore”, and “Items Related”. We ultimately chose “Viewed Also Viewed” as our core algorithm because it appeared most often across the recommendation system as a whole.
To analyse the connections of conspiracy books, we applied narrative readings of the networks, specifically “exploring associations around single actors” and “detecting key players” (Bounegru et al., 2017). We started from the list of Problematic Infomedic Amazon Books (Appendix 1), which was provided by the Infodemic Amazon Project team, and manually collected on Amazon.com the “Viewed Also Viewed” recommendations of the 12 conspiratorial books in the list (problematic status marked as 5). To avoid the influence of personalization settings and location, our team used the Hoxx VPN Proxy Chrome plugin on Google Chrome (private browser mode) with the VPN set to the United States of America.
Figure 1: The 12 Conspiratorial Seed Books in the Problematic Infomedic Amazon Book list
After finishing the manual data collection, which yielded 628 recorded books, we merged books that appeared more than once, as well as different editions of the same book, to avoid errors and improve graph readability. Ultimately, we were left with 396 books in total.
The second step of the project was to categorise each of the 396 books along four dimensions: its genre (21 genres in total, including “Health & Fitness”, “Christianity”, “Self-Help”, “Illuminati”, “Vaccines”, “QAnon”, “Deep State”, “Covid-19”, “Flat Earth”, “General Conspiracy”, “Culture & Society”, “5G & EMF”, “Aliens”, “Fiction & Biography”, “Eastern Philosophy”, “Science & Technology”, “Obama & Democrats”, “History”, “Trump & MAGA”, “9/11” and “Miscellaneous”), whether it is conspiratorial, whether it appears in the list of Problematic Infomedic Amazon Books, and whether it is a conspiratorial seed book. To do this, three of our group members reviewed each book’s title, summary, category, and publication year one by one. In deciding whether a book is conspiratorial, we followed five principles: nothing happens by accident, nothing is as it seems, everything is connected, the tone or style of conspiracy theories is present, and the book assumes that it is somehow going against received wisdom (Butter and Knight, 2020).
To broaden the core dataset, the DMI recommendations scraper was used to scrape the recommendations of the first 12 books returned by the direct “COVID” query. From this scrape, 189 books from the “Viewed Also Viewed” recommendation lists were added and analysed.
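The merging step was carried out by hand, but its logic can be sketched in code. The following is a minimal illustration only, assuming that repeated records and different editions can be recognised from their titles; the normalisation rules and the sample records are hypothetical and are not the criteria or data used by the team.

```python
import re

# Hypothetical list of edition markers; the team's manual criteria are not documented.
EDITION_WORDS = r"\b(kindle|paperback|hardcover|audiobook|edition)\b"

def normalise_title(title: str) -> str:
    """Reduce a title to a comparison key so that editions collapse together."""
    key = title.lower()
    key = re.sub(r"\(.*?\)", " ", key)      # drop parenthetical notes
    key = re.sub(r"[^a-z0-9 ]", " ", key)   # drop punctuation
    key = re.sub(EDITION_WORDS, " ", key)   # drop edition markers
    return " ".join(key.split())            # collapse whitespace

def merge_records(titles):
    """Group recorded titles by their normalised key."""
    merged = {}
    for title in titles:
        merged.setdefault(normalise_title(title), []).append(title)
    return merged

# Placeholder records, not the project's data.
records = [
    "Coming Apocalypse",
    "Coming Apocalypse (Kindle Edition)",
    "coming apocalypse: Paperback",
    "Vaccine Nation",
]
merged = merge_records(records)
print(len(records), "records merged into", len(merged), "books")
```

A rule set this simple would also merge distinct volumes that differ only in a parenthetical series note, which is one reason a manual check remains useful.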
Figure 2: The First 12 books of the Direct Search “COVID” Query Results (The query was performed in the week between 4 and 8 Jan 2021)
After the data scraping and collection, the third step was to visualise the results. In the view of Bastian et al., “visualisations are useful to leverage the perceptual abilities of humans to find features in network structure and data” (2009, p.361). Therefore, Gephi, “an open source network exploration and manipulation software” (Bastian et al., 2009, p.361), was used to visualise the connections between books. Because it is hard to display the full titles of books in the network graph, the software allowed us to use colours to show which genre each book belongs to. One of the reasons we chose Gephi is the ease with which it imports data from CSV files: once the data is correctly organised, Gephi can automatically generate and display a network graph. In addition, Bastian et al. maintained that “networks can be explored in an interactive way with the visualisation module, it can also be exported as a SVG or PDF file” (2009, p.362). Gephi therefore also gives researchers several options for presenting their data.
Although the data collection and visualisation methods worked well for the project as a whole, the research had several limitations. Firstly, manual data collection was time-consuming: without the Amazon scraper tool, our group took nearly four hours to collect information on the 628 books. Additionally, only one group member was proficient in Gephi, so the other members needed time to learn the tool and made many mistakes in the process. A tool is only efficient when its users know how to employ it. In future studies, researchers should therefore weigh their familiarity with a tool or method against the time needed to learn it.
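As an illustration of what “correctly organised” data might look like, the sketch below writes a node table and an edge table using the column headers that Gephi's spreadsheet import recognises (Id and Label for nodes; Source, Target and Type for edges). The function name, the attribute columns and the placeholder books are our own assumptions, not the files used in the project.

```python
import csv
import io

def to_gephi_csv(recommendations, attributes):
    """Return (nodes_csv, edges_csv) as text.

    recommendations: dict mapping a book to the books shown in its
                     "Viewed Also Viewed" list (hypothetical structure).
    attributes:      dict mapping a book to (genre, is_conspiratorial).
    """
    node_buf, edge_buf = io.StringIO(), io.StringIO()
    nodes = csv.writer(node_buf, lineterminator="\n")
    edges = csv.writer(edge_buf, lineterminator="\n")
    nodes.writerow(["Id", "Label", "genre", "conspiratorial"])
    edges.writerow(["Source", "Target", "Type"])

    seen = set()
    for source, targets in recommendations.items():
        for book in [source, *targets]:
            if book not in seen:  # write each node once
                seen.add(book)
                genre, flag = attributes.get(book, ("Miscellaneous", False))
                nodes.writerow([book, book, genre, flag])
        for target in targets:
            # A recommendation points from the viewed book to the suggested one.
            edges.writerow([source, target, "Directed"])
    return node_buf.getvalue(), edge_buf.getvalue()

# Placeholder data for illustration only.
recs = {"Book A": ["Book B", "Book C"], "Book B": ["Book A"]}
attrs = {"Book A": ("Vaccines", True)}
nodes_csv, edges_csv = to_gephi_csv(recs, attrs)
print(nodes_csv)
print(edges_csv)
```

The extra node columns (here genre and conspiratorial) are imported as node attributes, which is what makes colouring by genre possible inside Gephi.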
Findings
During the four days of exploration and analysis, our research team made a number of findings. For instance, we found that the COVID-19 genre is closely related not only to books on science, health and politics, but also to books on Christianity and the Illuminati. Furthermore, by categorising each book by genre and checking whether it is a conspiratorial book explicitly related to COVID-19 (by reviewing book summaries and publication dates), we discovered three additional conspiratorial books pertaining to COVID-19 that do not feature on the original “Problematic Infomedic Amazon Books” checklist. These were: Liberty or Lockdown; Pandemic, Inc.: 8 Trends Driving Business Growth and Success in the New Economy; and The Date of the Rapture How and When the World Ends (End of World Series Book 3). These general conclusions aside, our more in-depth research revealed three key findings worth expanding upon.
Finding 1:
The foremost finding of our research concerned the accessibility of the network. The graph below (Figure 3) shows conspiratorial books in the recommendation network—at two iterations—for querying “COVID” on Amazon.com (seed books are marked in blue, whilst conspiratorial books are marked in red). ‘Problematic’ books pertaining to COVID-19 exhibit a high degree of centrality—and therefore connectivity—within the network; an attribute represented by the size of the node.
Figure 3: The Network of “COVID” Query Results and Their “Also-Viewed” Recommendation Books
As one can see, the route between the search bar and networks of conspiratorial content is remarkably short. Even amongst the primary query results, two conspiracy books are present (marked purple), whilst a host of others are but a single click away. Such is the adjacency of conspiracy-related content to scientifically (and socially) orthodox content—the centripetal nature of the network, as it were—that an Amazon user can find him- or herself browsing problematic literature even from the most benign of search terms.
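The centrality referred to in this finding can be made concrete with a short sketch. The write-up does not state which centrality measure determined node size, so the example below assumes plain degree centrality (the share of other books a book is directly linked to) and treats recommendation links as undirected; the edge list is a placeholder.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree centrality: number of distinct neighbours divided by (n - 1)."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source]  # register both endpoints as nodes
        neighbours[target]
        if source != target:  # ignore self-links
            neighbours[source].add(target)
            neighbours[target].add(source)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

# Placeholder edges for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
scores = degree_centrality(edges)
for book, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(book, round(score, 2))
```

In this toy network, book A is linked to every other book and would therefore be drawn as the largest node.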
Finding 2:
The images below show the network of books from two different perspectives. Figure 4 visualises the connections between problematic books (coloured in green) and unproblematic books (coloured in red), whilst Figure 5 displays how book genres relate to one another. Based on the analysis of our core dataset of 396 books, 41 books (∼10%) were conspiratorially ‘problematic’ (Figure 4). Interestingly, a number of clusters, such as the one between Coming Apocalypse and Vaccine Nation, show an almost 50/50 split of problematic and unproblematic books. Additionally, reading Figure 4 and Figure 5 together shows that most of the books in this cluster are marked in orange, which means they belong to the vaccines genre.
Figure 4: The Network of Problematic Books and Their “Also-Viewed” Recommendation Books
Figure 5: The Genres of Problematic Books and Their “Also-Viewed” Recommendation Books
Figure 6: The Representation of Genres in Different Colours
Additionally, when one looks at Rise of the New World Order (the topmost seed, marked purple), one finds a cluster dominated by books related to Christianity—a predominantly ‘unproblematic’ genre—which suggests comparatively low levels of connectivity between Christian and problematic networks. Indeed, not all books or genres are highly connected. Though all the seeds have direct connections with one another (there is no seed at more than one degree of separation), not all genres are algorithmically adjacent. To navigate between “5G & EMF” and “QAnon” networks, for example, one must traverse multiple nodes. Moreover, certain books like Peter May’s Lockdown and Dean Koontz’s Thriller, as books of fiction, serve to disconnect the network. These virus-themed books were revived (popularity-wise) as a result of the COVID-19 pandemic whilst bearing little relation to any given ‘problematic’ node or network (written, as they were, pre-pandemic) and therefore produced an algorithmic network (and pathway thereto) of their own. On the flip side, The Answer, a ‘problematic’ book, appears in the recommendations for Green Mask: UN Agenda 21 and The Conspiracy: A Chronological Journey Through Secret Societies and Hidden History, two ‘unproblematic’ books; thereby showing exceptionally high levels of intra-network connectivity.
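The notion of having to “traverse multiple nodes” can be expressed as the fewest recommendation clicks needed to get from one book to another. The following sketch uses a breadth-first search over a hypothetical dictionary of recommendation lists; the data are placeholders and the function name is our own.

```python
from collections import deque

def clicks_between(recommendations, start, goal):
    """Fewest recommendation clicks from start to goal, or None if unreachable."""
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        book, clicks = queue.popleft()
        if book == goal:
            return clicks
        for suggested in recommendations.get(book, ()):
            if suggested not in visited:
                visited.add(suggested)
                queue.append((suggested, clicks + 1))
    return None

# Placeholder recommendation lists for illustration only.
recs = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
print(clicks_between(recs, "A", "D"))
print(clicks_between(recs, "D", "A"))
```

Because recommendations are directional, a path from A to D does not imply a path back from D to A, which is how fiction titles with their own self-contained recommendations can sit apart from the rest of the network.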
Finding 3:
As a whole, the network shows a broad distribution of books of different genres, and therefore a high level of inter-generic connectivity. Interestingly, this seems to be the case for conspiratorial and non-conspiratorial books alike. For example, books related to “Self Help” (marked in light blue) and to the “Illuminati” (marked in dark grey) appear throughout the network, beside various conspiratorial seeds (marked in purple), and can therefore act as gateways or “bridges” between individual conspiratorial seeds and generic clusters, conspiratorial or otherwise. When one looks at the relation between the ‘problematic’ Number Games: 9/11 to Coronavirus and One Hundred Proofs that the Earth is Not a Globe, for example, one finds the pair connected via The Secret Teachings, an ‘unproblematic’ book. This finding illustrates that a customer who searches for a problematic book may reach an unproblematic book via the recommendation section. The reverse also holds: a customer viewing an unproblematic book may jump to the page of a problematic book by clicking on what is recommended.
Figure 7: The Non-Tinfoil Guide to EMFs as a Bridge between EMF*D: 5G, Wi-Fi & Cell Phones and Coming Apocalypse
Similarly, The Non-Tinfoil Guide to EMFs by Nicolas Pineault (marked in the red box) serves as a direct bridge between two seeds: EMF*D: 5G, Wi-Fi & Cell Phones and Coming Apocalypse, thereby revealing that even if one exits a conspiratorial genre, via the algorithmic recommendation of a non-conspiratorial book such as The Non-Tinfoil Guide, one can quickly find oneself browsing an entirely new genre of conspiracies: back within the network, as it were.
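The bridging pattern described above can be detected programmatically. The sketch below is a minimal illustration that treats recommendation links as undirected (an assumption, since the direction of each link is not reported here) and lists every non-seed book that is directly linked to two or more seeds. The first two edges mirror the bridge reported in Figure 7; the third is a placeholder.

```python
def find_bridges(edges, seeds):
    """Map each non-seed book to the seeds it links, keeping books with 2+ seeds."""
    linked_seeds = {}
    for a, b in edges:
        # Check the link in both directions (undirected reading).
        for book, other in ((a, b), (b, a)):
            if book not in seeds and other in seeds:
                linked_seeds.setdefault(book, set()).add(other)
    return {
        book: sorted(found)
        for book, found in linked_seeds.items()
        if len(found) >= 2
    }

seeds = {"EMF*D: 5G, Wi-Fi & Cell Phones", "Coming Apocalypse"}
edges = [
    ("EMF*D: 5G, Wi-Fi & Cell Phones", "The Non-Tinfoil Guide to EMFs"),
    ("The Non-Tinfoil Guide to EMFs", "Coming Apocalypse"),
    ("Coming Apocalypse", "Placeholder Book"),  # hypothetical extra link
]
bridges = find_bridges(edges, seeds)
print(bridges)
```

A book linked to only one seed, such as the placeholder, is not reported, because it offers no route from one seed to another.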
Discussion
Whereas Twitter and Facebook have banned QAnon content from their networks, in part because of its connections to real-world violence (Scott, 2020), Amazon still does not moderate, modulate or manipulate conspiracy theory content in its bookstore, since it has no policies on book content moderation and, in the US, no obligation to omit certain publications (Amazon, 2018). Therefore, if a user searches for “COVID” in the books section of Amazon.com, the first item, COVID-19: The Great Reset (itself a conspiracy book, and marked as a best seller), opens a trapdoor into a network full of conspiracy book recommendations and conspiratorially-minded comments and reviews.
Figure 8: Direct Query Results of “COVID” in Amazon Book Section
Based on our research, we found that books can be conspiracy-related not only through their writers or their readers, but also through algorithmic association, produced by an interplay between recommendation features and user practices on Amazon.com. Writers can freely publish and push books full of conspiratorial content, and readers can post conspiratorial comments in the review section and give high ratings to ‘problematic’ books, whilst Amazon’s own algorithmic engines accentuate and accelerate all of the above, directing browsers and buyers, speedily and smoothly, down the conspiratorial rabbit hole.
Figure 9: The “Viewed Also Viewed” Recommendation list of a Conspiratorial Book
The lack of management of algorithms has turned, and is turning, Amazon into fertile ground for conspiracy theories, because algorithms are “encoded procedures for transforming input data into a desired output” (Gillespie, 2014, p.167). That is to say, the platform will keep recommending information which is “calculated” to be most connected to customers’ interests. Therefore, once a person searches for a conspiratorial book, he or she will continuously receive promotional information from Amazon pertaining to conspiracy theories, coaxing first-time and repeat purchasers of ‘problematic’ material deeper into the network.
Amazon’s hands-off, reluctant approach to modulation and moderation, which has encouraged the spread of conspiratorial books writ large, is nowhere more exposed (and more consequential) than around the topic of COVID-19. Indeed, Romer and Jamieson claimed that “if conspiracy beliefs are associated with mistaken fears about the nature or effects of vaccination, their circulation could undermine the country’s ability to bring COVID-19 to heel” (2020, p.2). Consequently, if a viral conspiracy theory goes unchecked and spreads through both social media and academic spaces without pushback, it will create panic and anxiety across society, making the current situation even worse. When it comes to conspiratorial content, Amazon should be treated as a social media platform (not just as a bookseller) and held to the same standards.
Nevertheless, some steps have been taken. In recent weeks, Amazon decided to remove QAnon books (Weise, 2021), citing policies that prohibit offensive items or other inappropriate content. According to the research of Chandrasekharan and colleagues (2017), however, “banning” clearly reduces conspiracy theory content but may also drive it to darker corners of the internet. Our research shows that most conspiracy theory books are closely interrelated in the recommendation network. Thus, since banning books may not always be appropriate, “quarantining” problematic material (severing it from algorithmic recommendations, etc.) may be a better solution than “deplatforming”: conspiratorial books could still be searched for directly, but they would exist outside of the recommendation infrastructure. They would be given not the death penalty, but rather life imprisonment.
Conclusions
At a macro level, our results confirm and align with pre-existing theories and past research: for creators and consumers of conspiracism, Amazon.com is a one-stop-shop. The algorithmic ecosystem of the platform is predisposed to the purveyance and promotion of provocative, clickable and (literally, in this case) viral content, fact-based or otherwise. At a micro level, our results shine new light on the nuts and bolts of how, via recommendations, nodes form networks and networks form internetworks: politically and socioculturally valent entities, which have emerged (synergistically, dynamically and exponentially) as an outgrowth of the complex meshwork that exists between Amazon consumers, creators and coders. Conspiratorial networks are not a conspiracy, but empirical and measurable, verifiable and verified.
Bibliography
- Amazon. (2018, August 11). Offensive and Controversial Materials. Amazon.com. Retrieved from URL: https://sellercentral.amazon.com/gp/help/external/200164670
- Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: an open source software for exploring and manipulating networks. In Proceedings of the International AAAI Conference on Web and Social Media, Volume 3 (1), pp.361-362.
- Bounegru, L., Venturini, T., Gray, J., & Jacomy, M. (2017). Narrating networks: Exploring the affordances of networks as storytelling devices in journalism. Digital Journalism, Volume 5(6), pp.699-730.
- Butter, M., & Knight, P. (Eds.). (2020). Routledge handbook of conspiracy theories. Routledge.
- Chandrasekharan, E., Pavalanathan, U., Srinivasan, A., Glynn, A., Eisenstein, J., & Gilbert, E. (2017). You can't stay here: The efficacy of reddit's 2015 ban examined through hate speech. Proceedings of the ACM on Human-Computer Interaction, 1(CSCW), pp.1-22.
- Earnshaw, V. A., Eaton, L. A., Kalichman, S. C., Brousseau, N. M., Hill, E. C., & Fox, A. B. (2020). COVID-19 conspiracy beliefs, health behaviors, and policy support. Translational behavioral medicine, Volume 10(4), pp.850-856.
- Gillespie, T. (2014). The relevance of algorithms. In Media technologies: Essays on communication, materiality, and society, pp.167-193.
- Weise, K. (2021, January 11). Amazon says it will remove QAnon products from its store. The New York Times. Retrieved from URL: https://www.nytimes.com/2021/01/11/business/amazon-qanon.html
- Scott, M. (2020, December 22). Conspiracy theories run wild on Amazon. Politico. Retrieved from URL: https://www.politico.eu/article/amazon-qanon-covid19-coronavirusdisinformationconspiracy-theories/
- Romer, D., & Jamieson, K. H. (2020). Conspiracy theories as barriers to controlling the spread of COVID-19 in the US. Social Science & Medicine, 263, 113356, pp.1-8.
Figure
- Figure 1: The Problematic Infomedic Amazon Book List Screenshot. (2021, January 15). The 12 Conspiratorial Seed Books in the Problematic Infomedic Amazon Book list
- Figure 2: Amazon.com Screenshot. (2021, January 07). The First 12 books of the Direct Search “COVID” Query Results. Retrieved from URL: https://www.amazon.com/s?i=stripbooks-intl-ship&k=COVID&ref=glow_cls&refresh=2
- Figure 3: Elli, T. (2021, January 07). The Network of “COVID” Query Results and Their “Also-Viewed” Recommendation Books
- Figure 4: Elli, T. (2021, January 04). The Network of Problematic Books and Their “Also-Viewed” Recommendation Books
- Figure 5: Elli, T. (2021, January 05). The Genres of Problematic Books and Their “Also-Viewed” Recommendation Books
- Figure 6: Gephi Screenshot. (2021, January 07). The Representation of Genres in Different Colours
- Figure 7: Elli, T. (2021, January 07). The Non-Tinfoil Guide to EMFs as a Bridge between EMF*D: 5G, Wi-Fi & Cell Phones and Coming Apocalypse
- Figure 8: Amazon.com Screenshot. (2021, January 07). Direct Query Results of “COVID” in Amazon Book Section. Retrieved from URL: https://www.amazon.com/s?i=stripbooks-intl-ship&k=COVID&ref=glow_cls&refresh=2
- Figure 9: Amazon.com Screenshot. (2021, January 15). The “Viewed Also Viewed” Recommendation list of a Conspiratorial Book. Retrieved from URL: https://www.amazon.com/Number-Games-9-11-Coronavirus/dp/1098329864/ref=sr_1_1?dchild=1&keywords=number+games&qid=1610713409&s=books&sr=1-1
Tool