This process brought to light several areas of friction that enriched the conceptual side of the project as well as its research trajectory. We consider them the most interesting aspects of the project, as they prompted new ideas and critical reflection.
These frictions fall into two categories: logical clashes or nonsense, and ethical frictions. The first category comprises the following decisions:
- Choosing an elusive concept as the object of observation, instead of a well-defined phenomenon.
- Using generative neural networks to produce intuitive knowledge, instead of simply generating synthetic copies.
- Seeing potential where failure would be expected.
- Willingly doing something absurd and rejecting the rational.
All these decisions reject the rules and what society regards as common sense. The freedom that arose from these twists allowed us to make an unexpected leap in our thinking and to imagine speculative uses of the visually ambiguous generated images in surprising contexts, for instance in place of news photography. Orienting towards the process rather than the outcome made it possible to critically analyse the separate steps of training a StyleGAN model, questioning their ethical consequences as well as envisioning future applications that differ from the commercially oriented view of the AI industry. These steps involved the following decisions:
- Training one’s own model instead of using available pre-trained models.
- Building one’s own dataset rather than using an existing (often biased) one.
- Scraping dataset images instead of handpicking them.
- Using licensed images in the dataset instead of producing one’s own footage.
- Replacing the illustrative practice of photojournalism with visually ambiguous generated images.
All of these decisions have advantages and disadvantages, and it is open to debate which of them are ethically correct.
Training one’s own model is essential if an artist wants to define their observed object and understand the logic behind the training. However, such a task is resource-demanding and needs to be justified. We notice that there’s a growing number of meaningless, poorly trained StyleGAN models. A StyleGAN model should be seen as a tool for future use, not a final product. Additionally, if an artist requires full control over the production of such a tool, collecting their own dataset is the best option.
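To make the scale of such a decision concrete, here is a minimal sketch of what launching one’s own training run can look like, assuming NVIDIA’s stylegan2-ada-pytorch repository is used; all paths and hyperparameters are hypothetical placeholders and would need tuning to the dataset at hand:

```python
# A minimal sketch of a self-managed training run, assuming NVIDIA's
# stylegan2-ada-pytorch repository; paths and hyperparameters are
# hypothetical placeholders.
import subprocess

# Pack a folder of already cropped and resized images into the
# dataset format that the training script expects.
subprocess.run([
    "python", "dataset_tool.py",
    "--source=./images_raw",        # hypothetical folder of collected images
    "--dest=./datasets/troubling",  # packed dataset output
], check=True)

# Launch training from scratch. Resuming from a pre-trained
# checkpoint (--resume) would be exactly the kind of shortcut
# discussed above.
subprocess.run([
    "python", "train.py",
    "--outdir=./runs",
    "--data=./datasets/troubling",
    "--gpus=1",
    "--kimg=5000",  # training length; real runs are resource-demanding
    "--snap=10",    # snapshot interval for inspecting intermediate models
], check=True)
```

Even a run this small can occupy a GPU for days, which is precisely the resource demand that needs to be justified.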
By engaging in these expensive, labour-intensive processes, an artist can claim responsibility and ownership over the whole project. There are many layers to this, from the most laborious (manually creating the dataset images from scratch) to all kinds of ‘cheats’ (scraping scripts, borrowed bits of code, existing datasets, etc.). When opting for such shortcuts, it is good to be aware of how they were made, for what purpose, by whom, and so on. In general, the convenience of the popular tools used in AI media synthesis nowadays overshadows the awareness of their context.
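As an illustration of how low the barrier to such shortcuts is, a scraping ‘cheat’ can be as small as the following sketch; the URL list and output folder are hypothetical placeholders, and checking each source’s licence and terms of use remains the responsibility of whoever runs it:

```python
# A minimal sketch of a dataset-scraping script; the URL list and
# output folder are hypothetical placeholders.
import pathlib
import requests

IMAGE_URLS = [
    "https://example.org/archive/img_0001.jpg",
    "https://example.org/archive/img_0002.jpg",
]
OUT_DIR = pathlib.Path("images_raw")
OUT_DIR.mkdir(exist_ok=True)

for url in IMAGE_URLS:
    response = requests.get(url, timeout=30)
    if response.ok:
        # Keep the original filename so each image stays traceable
        # to its source, which matters for later licence checks.
        (OUT_DIR / url.rsplit("/", 1)[-1]).write_bytes(response.content)
```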
Another ethical grey zone concerns the licences required for the found footage used as training material. Using public domain footage or obtaining access to an institution’s data would be preferable, but this is not possible in every situation. It is often unclear how to gain access to visual data or to determine who owns it. It is likely to take years or even decades until governments adapt copyright laws to the advances of AI-driven synthetic media. In addition, it remains unclear whether using licensed footage to train a generative model is actually legally problematic. As the footage serves only as training data and is neither altered nor published, it would be hard to argue that it is used counter to the terms of most licences. The neural network reinterprets the viewed data, in a process akin to looking at an image and grasping its atmosphere, which, some would say, does not violate copyright.
The second aspect, ethical friction, concerns the speculative use of visually ambiguous TroublingGAN images as substitutes for photojournalism that captures specific troubling events. (We have delved into the conceptual underpinning in section 3.) Given the spectacular nature and affective potency of the TroublingGAN visuals, we believe that news photography provides a fitting backdrop for their peculiar visual ambiguity, the disquieting abstraction carrying a continually shifting meaning, and the emotionally resonant remnants of their former visual identity.
Two years into this project, generative AI tools have improved to the point where synthetic visual media attains unprecedented degrees of photorealism. Given the current capability to generate photorealistic images of anything within seconds, it is unsurprising that the prevailing media narrative harbours concerns about the mass fabrication of fake photographic material depicting unreal or alternative events. Despite these fears, the Norwegian section of Amnesty International employed AI-generated images in a social media post on Twitter about the 2021 protests in Colombia (Taylor 2023). Although there was ample photographic documentation of police brutality during the protests, Amnesty International opted for generated content to protect the identities of protesters at risk of police prosecution. The images included a small disclaimer in the bottom left corner, indicating that they were ‘illustrations produced by artificial intelligence’. The tweet stirred significant public anger and sparked a heated debate before it was deleted shortly afterwards. It was only a matter of time before AI-generated images infiltrated mainstream visual communication, but the sensitive context of this particular instance is noteworthy. What prompted the outrage? Was it the non-authentic origin of the footage or its photorealistic nature? Would public reception have been more favourable if the image hadn’t strived for realism?
This controversy exemplifies the friction between abstraction and representation. While photorealistic, the generated image served as an abstraction, ethically sidestepping the exploitation of real events or the genuine suffering of the people captured in the scene for illustrative purposes. However, this intent is not immediately evident (barring the disclaimer), which could mislead viewers and incite scepticism regarding the credibility of an institution that advocates for human rights and often documents troubling events. More pronounced visual ambiguity might mitigate this issue, making the TroublingGAN approach seem pertinent in such scenarios. Further exploration is required to pinpoint the balance between photorealism and abstraction that would allow visual ambiguity to effectively replace illustrative photojournalism. With this controversial connection, we hope to raise open questions about the future applications of synthetic media, thereby encouraging researchers and artists to engage with this topic in their work.