TQE-2026-1-Project-1 – Inter-Rater Reliability Report

task_id	annotator_id	total_spans	subcategory_%	impact_%	span_comments_%
25.0	58.0	16.0	87.5	87.5	68.8
25.0	84.0	19.0	42.1	15.8	52.6
25.0	59.0	17.0	94.1	94.1	0.0
25.0	57.0	21.0	81.0	47.6	0.0
25.0	61.0	48.0	97.9	95.8	97.9
25.0	60.0	5.0	100.0	40.0	0.0
25.0	79.0	23.0	95.7	87.0	87.0

task_id	annotator_id	document_issues	correspond_score	correspond_comment	readable_score	readable_comment
25	58	✓	✓	✓	✓	✓
25	84	✗	✗	✗	✗	✗
25	59	✓	✓	✓	✓	✓
25	57	✓	✓	✓	✓	✓
25	61	✓	✓	✗	✓	✗
25	60	✗	✓	✓	✓	✓
25	79	✗	✗	✗	✗	✗

Statistic	Value
Measurements	7
Median (min)	117.4
Mean (min)	194.8
Q1 – Q3 (min)	58.4 – 212.1
Min (min)	26.3
Max (min)	678.9
Outliers (> 442.5 min)	1 value(s): 678.9

Statistic	Value
Measurements	7
Median (min)	0.0
Mean (min)	239.1
Q1 – Q3 (min)	0.0 – 289.5
Min (min)	0.0
Max (min)	1094.5
Outliers (> 723.8 min)	1 value(s): 1094.5

Annotation Overlap Visualization — All Documents

Each document is shown with annotated spans highlighted. Darker shading indicates higher annotator agreement on a given span.

Annotation Overlap Visualization

Document ID: 25

Number of unique annotators: 7 Total annotations: 149 Maximum overlap: 6 annotators

Generative Artificial Intelligence integrated into the digital ecosystem: A framework for algorithmic governmentality

1\. INTRODUCTION

The development of artificial intelligence (AI) algorithms for language understanding and generation has been a key focus of major technology corporations in both the East and West over the past two decades. This progress has been made possible by massive access to training data and the increasing processing power provided by graphics processing units (GPUs)¹. Thus, there has been significant progress in research into language modeling, moving from statistical models to neural models. Recently, pre-trained language models (PLMs) have been introduced using transformers on large corpora, showing a high capacity for various natural language processing tasks (NLP). Scaling up models has been proven to improve their performance. By increasing the number of parameters, they not only significantly improve in performance, but also acquire special abilities (for example, contextual learning), something that does not happen in smaller models. It is this contextual learning capability that generates a sort of "shift in the tectonic plates" regarding what the new capabilities of platforms that integrate AI into their operation might mean.

1 The acronym GPU – or graphics processing unit – refers to processors designed to handle and accelerate graphics and image processing in devices such as video cards, motherboards, mobile phones, and personal computers. By performing mathematical calculations quickly, it reduces the amount of time it takes for a computer to run multiple programs, making it an essential enabler of emerging and future technologies such as machine learning (ML), AI, and blockchain.

Since the massive release of ChatGPT-3.5 in November 2022 by Open AI, Large Language Models (LLMs) have been the subject of fascination and experimentation beyond the academic and scientific realm. The launch of ChatGPT lowered the barriers to entry for conversational chatbots and other forms of AI, marking the beginning of a new era in which the term "artificial intelligence" became commonplace. Behind this technology are transformers, the architecture that made its development possible. A transformer model is a neural network that learns context and meaning by tracking relationships in sequential data. It applies a set of mathematical, self-aware techniques to detect subtle ways in which data elements in a series influence and depend on each other. LLMs, or large language models, are the key component behind text generation. They consist of transformers pre-trained to predict, given an input text, the next word (or, more precisely, token). Since language models predict one token at a time, generating complete sentences requires a more elaborate approach than simply calling them once. To do this, autoregressive generation is used, a procedure in which the model is called iteratively, using its own previous results as input to continue the sequence and improve the coherence of the text. In these cases, the model generates data sequences — such as text — using its own previous outputs as new inputs, starting with an initial data set. This process is repeated iteratively to produce a complete sequence. The machine learning potential associated with these models is endless and will generate changes in the way we understand our environment and generate knowledge.

Transformers were first introduced in 2017 at the NeurIPS conference through Google's white paper titled "Attention is all you need," kicking off a wave of machine learning advancements that some are calling "Transformer AI". Generative Pre-trained Transformers (GPTs) are natural language models that use deep neural networks to process and generate text. They are designed to understand and produce human language in a similar way to a human being, although their knowledge comes from large data sets. The most widely used GPT, ChatGPT, has over 200 million weekly active users — as of November 2024 — and has a simple interface in which the user (in its "free" version) can access the history of chat interactions from the last 7 days, view sample conversations, or enter text with the prompt (instruction or text used to interact with the AI), either by voice or by keyboard. Some of these organizations and/or companies are dedicated exclusively to strengthening and further training these language models, while others use them as an Application Programming Interface (API) to develop new products and services. In the West, there are currently more than 7,000 companies and organizations developing software, applications, and hardware with integrated AI. The areas in which these companies operate are varied, including: logistics, finance, insurance, credit management, search, resale, education, aerospace and defense, healthcare, customer service, cybersecurity, gaming, waste management, information technology (IT)² and DevOps³, automation, agriculture.

2 IT: digital systems and tools used to process, store, transmit and protect information.

3 DevOps: combines the English words development and operations. It is a set of practices and tools that seeks to improve application development and the release of new software features.

Legend:

1 annotator

2 annotators

3 annotators

4 annotators

5 annotators

6 annotators

TQE-2026-1-Project-1 – Inter-Rater Reliability Report

Annotator Completeness — All Documents

Span-level Completeness (% filled)

Document-level Assessment Completeness

Annotator Timing — All Documents

Lead Time

Review Time

Annotation Overlap Visualization — All Documents

Annotation Overlap Visualization

Document ID: 25

Error Type Distribution — All Documents

Correspondence & Readability Ratings — All Documents

Exact Span Matching — All Documents

Span F1 — Partial Match Agreement — All Documents

Cohen's Kappa — Error Category Agreement — All Documents

task_id	annotator_id	document_issues	correspond_score	correspond_comment	readable_score	readable_comment
25	58	✓	✓	✓	✓	✓
25	84	✗	✗	✗	✗	✗
25	59	✓	✓	✓	✓	✓
25	57	✓	✓	✓	✓	✓
25	61	✓	✓	✗	✓	✗
25	60	✗	✓	✓	✓	✓
25	79	✗	✗	✗	✗	✗

task_id	annotator_id	document_issues	correspond_score	correspond_comment	readable_score	readable_comment
25	58	✓	✓	✓	✓	✓
25	84	✗	✗	✗	✗	✗
25	59	✓	✓	✓	✓	✓
25	57	✓	✓	✓	✓	✓
25	61	✓	✓	✗	✓	✗
25	60	✗	✓	✓	✓	✓
25	79	✗	✗	✗	✗	✗

task_id	annotator_id	document_issues	correspond_score	correspond_comment	readable_score	readable_comment
25	58	✓	✓	✓	✓	✓
25	84	✗	✗	✗	✗	✗
25	59	✓	✓	✓	✓	✓
25	57	✓	✓	✓	✓	✓
25	61	✓	✓	✗	✓	✗
25	60	✗	✓	✓	✓	✓
25	79	✗	✗	✗	✗	✗