
Artificial Intelligence and ChatGPT

An overview of AI tools, resources, assignment ideas, and more.

Topics

"Garbage In, Garbage Out"

Bias is not an inherently bad thing; we all have it. In our academic writing we attempt to limit the influence of our biases while acknowledging that they exist.

It is important to note that although GenAI models do not hold biases of their own, the data they are trained on, as well as the training process itself, can be highly biased, and their output will reflect that. It is also important to remember that there is no guarantee that what the model generates is accurate; if the information the model is trained on is flawed, that same flaw is replicated, or even amplified, in its output.
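
As a rough illustration of how skewed training data carries through to output, here is a minimal Python sketch. It is a toy frequency counter, not how GenAI models actually work, and the training data, the `train`, `predict`, and `share` functions, and the 90/10 and 80/20 splits are all hypothetical and invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical, deliberately skewed training data: (occupation, pronoun) pairs.
TRAINING_DATA = (
    [("nurse", "she")] * 9 + [("nurse", "he")] * 1
    + [("engineer", "he")] * 8 + [("engineer", "she")] * 2
)


def train(pairs):
    """Count how often each pronoun appears alongside each occupation."""
    counts = defaultdict(Counter)
    for occupation, pronoun in pairs:
        counts[occupation][pronoun] += 1
    return counts


def predict(model, occupation):
    """Return the pronoun seen most often with this occupation."""
    return model[occupation].most_common(1)[0][0]


def share(model, occupation, pronoun):
    """Return the fraction of training examples pairing the two."""
    total = sum(model[occupation].values())
    return model[occupation][pronoun] / total


model = train(TRAINING_DATA)
print(share(model, "nurse", "she"))   # fraction of "she" in the training data
print(predict(model, "nurse"))        # the single answer the toy model gives
print(predict(model, "engineer"))
```

In this sketch, "she" accompanies "nurse" in 90% of the training examples, yet the toy model answers "she" every time it is asked. The skew in the data is not only replicated but amplified, which is the "garbage in, garbage out" pattern described above.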

The videos below provide a good overview as well as an example of constructing a simple model and exploring the biases within.

Technological terms such as "cloud" can make many modern digital services and applications sound very ethereal. However, it's important to remember that there are physical locations where data are stored and complex processes are performed. These data and processing centers can consume staggering amounts of energy, resulting in large amounts of greenhouse gas emissions.

Strubell et al. (2019) found that training a single model can result in nearly five times the carbon dioxide emissions of an American car over its entire lifetime (including its fuel consumption). Below are some additional articles concerned with the environmental impact of computing practices.

In addition to concerns around emissions, Strubell et al. (2019) and Schwartz et al. (2020) also raise concerns around equity in AI research. As the size of the training data grows, so do the processing power required and the associated costs. This limits the ability of many researchers to compete with organizations, such as OpenAI, that have significant financial backers.

For users, the trend toward subscription-based models means that many may be prevented from using the best available tools, as when OpenAI reserved GPT-4 for users with paid subscriptions. This guarding of tools can exacerbate the Matthew Effect: those who have additional resources (money) have access to better tools (proprietary AI tools) and thus a potential advantage over others with fewer resources, causing a self-perpetuating cycle of inequity.

A computer, or in this case a model, cannot be held accountable. 

These tools are trained on large swaths of data that have been pulled from various sources. They use text, images, and audio that others created and own to produce new output. Is it a case of infringement or is it fair use? Who owns the output?

The US Copyright Office has asserted that the very term 'author' is reserved for humans only, so, at least in the US, ChatGPT, Midjourney, Stable Diffusion, and similar tools can't be considered the authors or creators of the work.

Resources