Lingyun Yue, Joe Zhou
Looking at 10k AI-generated photographs shared on Instagram, we find that more than 60% of the images represent women. In these images, Japanese pop culture has a significant influence, with a focus on youth and specific beauty norms. There is a strong emphasis on women's physical attributes, resulting in a pattern of sexualization and objectification. AI photography with a surreal style also has a notable presence, with a tendency towards dark, cold tones and complementary colors. When reverse-engineering the prompts of emblematic images, we notice a 'toning down' effect, such as surreal creatures becoming realistic figures or sexualized portrayals of women turning into more generic depictions.
Current debates on the role of AI in contemporary visual media culture focus, on the one hand, on questions of creativity and ownership and, on the other, on concerns about truth and the inauthenticity of visual content. These academic and public debates emphasize AI's potential and actual impact on those who produce imagery for a living or, conversely, on the public who view and consume media imagery. With this project, we take a step back and examine AI-generated photography from the bottom up, starting from the oxymoronic hashtag #AIphotography on Instagram. Through AI-specific digital and visual methods (such as machine vision and reverse-engineering prompts) and close reading of image collections, we aim to understand how AI-generated imagery performs its photographic aesthetic.
This research contributes to knowledge of the work that goes into producing generative imagery, allowing us to better understand the creative practices underlying AI’s photographic aesthetic. In doing so, the project also engages with AI in its own right, as a method to uncover and reflect on the photographic ‘status’ of generative imagery.
RQ1: How does AI-generated imagery perform its photographic aesthetic?
RQ2: Looking at thematic subsets, how can we characterize the representation of femininity and the surreal in the realm of AI photography?
RQ3: How can AI be used to study the language of generative photography?
We start from the #AIphotography hashtag to build a collection of 10k AI-generated photos shared on Instagram from September 2022 to September 2023, and zoom in on the top 200 most-engaging images in the set. From there, we focus on two subsets to study the representation of ‘women’ and the realm of the ‘surreal’.
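Since the collection is handled as a tabular export, the engagement ranking can be approximated in a few lines of pandas. This is a minimal sketch, assuming a per-post export with like and comment counts; all file and column names are hypothetical.

```python
# Minimal sketch: rank the 10k #AIphotography posts by engagement and keep the top 200.
# "likes_count" and "comments_count" are assumed column names, not the actual export schema.
import pandas as pd

posts = pd.read_csv("aiphotography_10k.csv")  # hypothetical export of the 10k posts
posts["engagement"] = posts["likes_count"].fillna(0) + posts["comments_count"].fillna(0)
top_200 = posts.sort_values("engagement", ascending=False).head(200)
top_200.to_csv("aiphotography_top200.csv", index=False)
```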
Women Sub-set
From the 10k dataset, we took all the captions and asked ChatGPT to detect the hashtags relating to women. We then filtered the 10k dataset to retain only the posts containing at least one hashtag from this list, resulting in a subset of 6,363 images. From this subset, we selected and downloaded the top 2,500 most interacted-with images.
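The hashtag filter itself is straightforward once the ChatGPT-generated list is available. A minimal sketch, assuming the same hypothetical export as above; the hashtag set shown is only an excerpt of the full list.

```python
# Minimal sketch: keep posts whose caption contains at least one women-related hashtag,
# then rank by engagement. File, column names and the hashtag excerpt are assumptions.
import pandas as pd

posts = pd.read_csv("aiphotography_10k.csv")  # hypothetical export, as above
women_hashtags = {"#beautifulgirls", "#girlsfashion", "#virtualgirfriend"}  # excerpt of the ChatGPT list

def mentions_women(caption: str) -> bool:
    # A post qualifies if its caption contains at least one hashtag from the list.
    tags = {word.lower().strip(".,!?") for word in str(caption).split() if word.startswith("#")}
    return bool(tags & women_hashtags)

posts["engagement"] = posts["likes_count"].fillna(0) + posts["comments_count"].fillna(0)
women_subset = posts[posts["caption"].apply(mentions_women)]
women_top = women_subset.sort_values("engagement", ascending=False).head(2500)
```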
Surreal Sub-set
From the co-hashtag network of the main dataset, we manually selected a list of hashtags related to surrealism and filtered the main dataset (10k) for the 500 most interacted-with ‘surreal’ pictures.
Main Dataset
From the main dataset of 10k Instagram posts, we downloaded the top 200 most interacted-with images. We uploaded the dataset to 4CAT and ran a co-hashtag analysis. Visually exploring the co-hashtag network in Gephi, we realized that the two most prominent clusters contained hashtags referring to two distinct themes: women and the surreal. For each theme, we compiled a list of representative hashtags (to name a few prominent ones: #beautifulgirls, #girlsfashion, and #virtualgirfriend for the women subset, and #popsurrealism, #surrealart, and #surreal for the surreal subset). For the women subset, the hashtags were extracted automatically with the help of ChatGPT, asking it to extract from the captions of the Instagram posts all hashtags referring to women, femininity, and womanhood. For the surreal subset, the list of meaningful hashtags was compiled manually.
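The co-hashtag step was performed in 4CAT, but it can be approximated in Python. A minimal sketch, assuming the hypothetical CSV export used above, that builds a hashtag co-occurrence graph and writes a .gexf file for visual exploration in Gephi.

```python
# Minimal sketch approximating 4CAT's co-hashtag analysis: hashtags that appear in the
# same caption become connected nodes, with edge weights counting co-occurrences.
import itertools
import networkx as nx
import pandas as pd

posts = pd.read_csv("aiphotography_10k.csv")  # hypothetical export, as above
graph = nx.Graph()

for caption in posts["caption"].astype(str):
    tags = sorted({w.lower().strip(".,!?") for w in caption.split() if w.startswith("#")})
    for a, b in itertools.combinations(tags, 2):
        weight = graph.get_edge_data(a, b, {}).get("weight", 0)
        graph.add_edge(a, b, weight=weight + 1)

nx.write_gexf(graph, "cohashtag_network.gexf")  # open in Gephi for exploration and annotation
```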
We used these hashtag lists to filter the 10k dataset and create two distinct sets: the womanhood set, containing the 2,500 most interacted-with posts, and the surreal set, containing the 500 most interacted-with posts.
By manually searching Google for the accounts in the top 200 set, we compiled a collection of interviews with their creators to discover how they work with AI. The list of interviews can be accessed here and will be completed and analyzed in a future sprint.
Women Sub-set
Analysis of Labels and Web Entities
We used Memespector with the Google Vision API to obtain the labels and web entities for the top 2,500 images representing women. We used Gephi to visualize the resulting image-web entity network, showing only the 300 most interacted-with pictures (Fig. 1). The network was then visually explored and manually annotated.
Figure 1: Image-web entity network of the 2,500 most interacted-with images in the women subset. Only the top 300 images are previewed in the network.
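Memespector is a graphical wrapper around the Vision API; the underlying requests look roughly like the sketch below, which assumes a configured Google Cloud credential and uses a hypothetical file path.

```python
# Minimal sketch of the annotations behind the network: label and web-entity detection
# for one downloaded image via the Google Cloud Vision API.
from google.cloud import vision

client = vision.ImageAnnotatorClient()  # assumes GOOGLE_APPLICATION_CREDENTIALS is set

with open("women_subset/image_0001.jpg", "rb") as f:  # hypothetical file path
    image = vision.Image(content=f.read())

labels = client.label_detection(image=image).label_annotations
web_entities = client.web_detection(image=image).web_detection.web_entities

print([label.description for label in labels])
print([entity.description for entity in web_entities])
```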
Analysis of the settings
We were interested in analyzing the kinds of settings and backdrops characterizing this subset: in which contexts do we find AI-generated women? We created a folder with the 2,500 most interacted-with images of the women subset and generated an image grid with Image Sorter (Fig. 2).
Figure 2: Image grid showing the 2,500 most interacted-with images of the womanhood subset, sorted by hue.
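Image Sorter produced the grid itself, but the sorting criterion can be illustrated with a small script. A minimal sketch, assuming the subset is stored as local JPEG files, that orders images by their average hue.

```python
# Minimal sketch: compute an average hue per image and order the files by it,
# approximating the hue-sorted grid. Folder name is hypothetical.
import colorsys
from pathlib import Path
from PIL import Image

def average_hue(path: Path) -> float:
    img = Image.open(path).convert("RGB").resize((32, 32))  # downscale for speed
    hues = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] for r, g, b in img.getdata()]
    return sum(hues) / len(hues)

files_by_hue = sorted(Path("women_subset").glob("*.jpg"), key=average_hue)
```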
Using the image grid, we manually identified clusters of images with similar backgrounds/settings. For each cluster, we manually selected emblematic images and removed the subject using Photoshop’s ‘Generative Fill’ function, which replaces the woman in the picture with generated background (Fig. 3).
Figure 3: Original image (left) and processed image (right).
We visualized the different settings by designing composite images, stacking the emblematic images for each setting (Fig. 4).
Figure 4: Composite images showing stacks of emblematic images for each type of setting. Each stack has been titled to indicate the type of setting.
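The composites were designed manually, but the basic stacking operation can be sketched with Pillow. A minimal sketch, assuming one folder of emblematic images per setting and an arbitrary canvas size; it simply averages the images of a stack into one semi-transparent composite.

```python
# Minimal sketch of a composite "stack": blend all emblematic images of one setting
# into a running average. Folder name and canvas size are hypothetical.
from pathlib import Path
from PIL import Image

paths = sorted(Path("settings/beach").glob("*.jpg"))
size = (768, 960)  # assumed common canvas size
composite = Image.open(paths[0]).convert("RGB").resize(size)

for i, path in enumerate(paths[1:], start=2):
    layer = Image.open(path).convert("RGB").resize(size)
    composite = Image.blend(composite, layer, alpha=1 / i)  # keeps an equal-weight average

composite.save("composite_beach.jpg")
```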
Surreal Sub-set
What constitutes an AI-generated surreal photo? Guided by this question, the subset of the 500 most interacted-with surreal images was analyzed from two perspectives: a qualitative analysis of the content of each image, focusing on the representation of hybrid humans, animals, and machines, and a quantitative analysis of the hue, saturation, and brightness of the images in the set.
Qualitative analysis of content
Subjects were manually extracted from all images in the set and organized in a Venn diagram showing the presence and overlap of elements representing humans, animals, and machines (Fig. 5).
Figure 5: A Venn diagram showing surreal subjects categorized as humans, animals, machines, or any combination thereof.
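The diagram can be reproduced from the manual coding with the matplotlib-venn package. A minimal sketch, using placeholder post IDs rather than the project's actual coding.

```python
# Minimal sketch: a three-set Venn diagram of manually coded subjects.
# The sets below are placeholders, not the sprint's actual coding.
import matplotlib.pyplot as plt
from matplotlib_venn import venn3

humans = {"post_01", "post_02", "post_05"}
animals = {"post_02", "post_03"}
machines = {"post_04", "post_05"}

venn3([humans, animals, machines], set_labels=("Humans", "Animals", "Machines"))
plt.savefig("surreal_venn.png", dpi=300)
```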
Automatic analysis of visual characteristics
The same subset was automatically analyzed with ImageJ, calculating each image's hue, saturation, and brightness and generating three image montages (Fig. 6).
Figure 6: Three image montages showing the distribution of surreal images according to their hue, saturation, and brightness.
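ImageJ computed the measurements, but an equivalent calculation can be sketched in Python. A minimal sketch, assuming local JPEG files, that computes mean hue, saturation, and brightness per image; these values can then order the three montages.

```python
# Minimal sketch of ImageJ-style measurements: mean hue, saturation and brightness per image.
# Folder name is hypothetical.
import colorsys
from pathlib import Path
from PIL import Image

def mean_hsb(path: Path) -> tuple[float, float, float]:
    img = Image.open(path).convert("RGB").resize((32, 32))  # downscale for speed
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in img.getdata()]
    n = len(hsv)
    return (sum(h for h, _, _ in hsv) / n,
            sum(s for _, s, _ in hsv) / n,
            sum(v for _, _, v in hsv) / n)

measures = {p.name: mean_hsb(p) for p in Path("surreal_subset").glob("*.jpg")}
by_hue = sorted(measures, key=lambda name: measures[name][0])  # basis for the hue montage
```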
Reverse-Engineering Prompts
For the main dataset, we reverse-engineered the prompts of the top 200 images using the img2prompt API. From the list of prompts, we automatically extracted and categorized the most frequent words, using ChatGPT as an assistant with this prompt: “I will provide you with a list of words, and you will return a table with the same list in column A and a category chosen from this code book in column B: [person] [verb] [adjective] [style] [object] [other].”
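Both API calls can be sketched as follows. This is a minimal sketch: the Replicate model identifier for img2prompt and the OpenAI model name are assumptions, the file path and word list are placeholders, and a pinned model version may be required in practice.

```python
# Minimal sketch: reverse-engineer a prompt for one image, then ask ChatGPT to
# categorize frequent words using the sprint's code book.
import replicate
from openai import OpenAI

# 1. Reverse-engineer the prompt (model identifier is an assumption; a version tag may be needed).
with open("top200/image_0001.jpg", "rb") as f:  # hypothetical file path
    reverse_prompt = replicate.run("methexis-inc/img2prompt", input={"image": f})

# 2. Categorize the most frequent words with ChatGPT (assumes OPENAI_API_KEY is set).
client = OpenAI()
words = ["woman", "portrait", "cinematic", "standing", "neon"]  # placeholder word list
instruction = (
    "I will provide you with a list of words, and you will return a table with the same "
    "list in column A and a category chosen from this code book in column B: "
    "[person] [verb] [adjective] [style] [object] [other].\n" + ", ".join(words)
)
response = client.chat.completions.create(
    model="gpt-4", messages=[{"role": "user", "content": instruction}]
)
print(reverse_prompt)
print(response.choices[0].message.content)
```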
We then designed a ranked tag-image grid in Google Sheets with the 5 most frequent words per category and the most engaging image per word (Fig. 7).
Figure 7: Ranked tag-image grid showing the most frequent words per category, extracted from the reverse-engineered prompts of the top 200 images with the hashtag #AIphotography.
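The ranking behind the grid reduces to a word count grouped by category. A minimal sketch, with placeholder prompts and a placeholder word-to-category mapping standing in for ChatGPT's table.

```python
# Minimal sketch: count word frequencies across the reverse-engineered prompts and keep
# the five most frequent words per category. Prompts and mapping are placeholders.
from collections import Counter

categories = {"woman": "person", "cinematic": "style", "neon": "adjective"}  # placeholder mapping
prompts = ["a woman in neon light, cinematic", "cinematic portrait of a woman"]  # placeholders

counts = Counter(word.strip(",.").lower() for p in prompts for word in p.split())
top_per_category: dict[str, list[tuple[str, int]]] = {}
for word, n in counts.most_common():
    cat = categories.get(word)
    if cat and len(top_per_category.setdefault(cat, [])) < 5:
        top_per_category[cat].append((word, n))
```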
For the two subsets, we selected two emblematic images to test how AI can be used to study the language of generative photography (Fig. 8). We used the img2prompt API to reverse-engineer the prompts of the images automatically and then used those prompts in Stable Diffusion to generate new images.
Figure 8: Comparison of two emblematic images from the womanhood and surreal subsets (left), their reverse-engineered prompts (center), and the resulting newly generated images (right).
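The regeneration step can be reproduced with the diffusers library. A minimal sketch: the checkpoint ID and the prompt shown are assumptions, since the exact Stable Diffusion model used in the sprint is not documented.

```python
# Minimal sketch: generate a new image from a reverse-engineered prompt with Stable Diffusion.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed checkpoint
).to("cuda")

reverse_engineered_prompt = "a woman standing on a beach at golden hour, cinematic"  # placeholder
image = pipe(reverse_engineered_prompt).images[0]
image.save("regenerated.png")
```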
Lev Manovich (2019). AI Aesthetics. Moscow: Strelka Press.
Lev Manovich and Emanuele Arielli (2023). Artificial Aesthetics: A Critical Guide to AI, Media and Design.
Amanda Wasielewski (2023). Computational Formalism: Art History and Machine Learning. MIT Press.
Amanda Wasielewski (2023). “‘Midjourney Can’t Count’: Questions of Representation and Meaning for Text-to-Image Generators.” IMAGE: Zeitschrift für Interdisziplinäre Bildwissenschaft 37(1): 70–81. https://image-journal.de/wp-content/uploads/2023/05/IMAGE-1614-0885-37-2023-H-71-82.pdf
Joanna Zylinska (2020). AI Art: Machine Visions and Warped Dreams. London: Open Humanities Press. Open access.
Marco Donnarumma (2022). “Against the Norm: Othering and Otherness in AI Aesthetics.” Digital Culture & Society 8(2): 39–66. https://doi.org/10.14361/dcs-2022-080205
Kyle Steinfeld (2023). “Clever little tricks: A socio-technical history of text-to-image generative models.” International Journal of Architectural Computing 21(2): 211–241. https://doi.org/10.1177/14780771231168230
O’Meara, J., & Murphy, C. (2023). Aberrant AI creations: co-creating surrealist body horror using the DALL-E Mini text-to-image generator. Convergence, 29(4), 1070–1096. https://doi.org/10.1177/13548565231185865
Chesher, C., & Albarrán-Torres, C. (2023). The emergence of autolography: the ‘magical’ invocation of images from text through AI. Media International Australia, 0(0). https://doi.org/10.1177/1329878X231193252
Romele, A., & Severo, M. (2023). Microstock images of artificial intelligence: How AI creates its own conditions of possibility. Convergence, 0(0). https://doi.org/10.1177/13548565231199982
Katja de Vries (2020). “You never fake alone. Creative AI in action.” Information, Communication & Society 23(14): 2110–2127. https://doi.org/10.1080/1369118X.2020.1754877
https://arxiv.org/abs/2307.06033
https://www.emerald.com/insight/content/doi/10.1108/LHTN-10-2022-0116/full/html
Additional references and links on the growing popularity and accessibility of generative visual AI tools such as DALL-E and Midjourney; on concerns about their impact on the creative community (AI in art competitions, fear of losing jobs, the ethics of training AI on existing artworks) and on the public at large (deepfakes, actors starting to copyright their faces); and on generative AI seen as just another tool in creatives’ stack:
https://www.blind-magazine.com/stories/ai-generated-images-a-visual-revolution/
https://hbr.org/2022/11/how-generative-ai-is-changing-creative-work
https://www.nytimes.com/2022/09/02/technology/ai-artificial-intelligence-artists.html
https://www.vice.com/en/article/ake53e/ai-art-lawsuits-midjourney-dalle-chatgpt
https://www.theverge.com/2023/1/17/23558516/ai-art-copyright-stable-diffusion-getty-images-lawsuit
https://www.telegraph.co.uk/business/2023/09/17/britain-label-deepfake-pictures-videos-ai-crackdown/
https://www.nytimes.com/2023/04/08/business/media/ai-generated-images.html
https://www.nytimes.com/2022/10/21/technology/ai-generated-art-jobs-dall-e-2.html
https://mashable.com/article/ai-chatgpt-influencer-creator-economy