
Research Article
On Trusting a Cyber Librarian: How Rethinking Underlying Data Storage Infrastructure Can Mitigate Risks of Automation
@INPROCEEDINGS{10.1007/978-3-030-76426-5_3,
  author={Maria Joseph Israel and Mark Graves and Ahmed Amer},
  title={On Trusting a Cyber Librarian: How Rethinking Underlying Data Storage Infrastructure Can Mitigate Risks of Automation},
  proceedings={Intelligent Technologies for Interactive Entertainment. 12th EAI International Conference, INTETAIN 2020, Virtual Event, December 12-14, 2020, Proceedings},
  proceedings_a={INTETAIN},
  year={2021},
  month={5},
  keywords={Intelligent systems, AI-Human problem, Semantic sentiment analysis, Artificial intelligence, Ethics of AI, Cyber curation of scholarship},
  doi={10.1007/978-3-030-76426-5_3}
}
Authors: Maria Joseph Israel, Mark Graves, Ahmed Amer
Year: 2021
INTETAIN
Springer
DOI: 10.1007/978-3-030-76426-5_3
Abstract
The increased ability of Artificial Intelligence (AI) technologies to generate and parse texts will inevitably lead to more proposals for AI’s use in the semantic sentiment analysis (SSA) of textual sources. We argue that instead of focusing solely on debating the merits of automated versus manual processing and analysis of texts, it is critical to also rethink our underlying storage and representation formats. Further, we argue that accommodating multivariate metadata exemplifies how underlying data storage infrastructure can reshape the ethical debate surrounding the use of such algorithms. In other words, a system that employs automated analysis typically requires manual intervention to assess the quality of its output, and thus demands that we select between multiple competing NLP algorithms. Settling on an algorithm or ensemble is not a decision that has to be made a priori, but when made, it involves implicit ethical considerations. An underlying storage and representation system that allows for the existence and evaluation of multiple variants of the same source data, while maintaining attribution to the individual sources of each variant, would be a much-needed enhancement to existing storage technologies and would facilitate the interpretation of proliferating AI semantic analysis technologies. To this end, we take the view that AI functions as (or acts as an implicate meta-ordering of) the SSA sociotechnical system in a manner that allows for novel solutions for safer cyber curation. This can be done by holding the attribution of source data in a symmetrical relationship to its multiple differing annotations, as coexisting data points within a single publishing ecosystem. In this way, the AI program allows for the annotation of individual and aggregate data by means of competing algorithmic models or varying degrees of human intervention.
We discuss the feasibility of such a scheme, using our own infrastructure model, MultiVerse, as an illustrative example of such a system, and analyse its ethical implications.
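To make the storage idea in the abstract concrete, the following is a minimal Python sketch of a record that keeps several competing annotations of one source text side by side, each with its own attribution. It is an illustration under stated assumptions, not the MultiVerse design: the class names `Annotation` and `SourceRecord`, the method names, and the sample labels and source identifiers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """One variant annotation of a source text (hypothetical structure)."""
    label: str   # e.g. a sentiment label
    source: str  # who or what produced it: an algorithm or a human annotator


@dataclass
class SourceRecord:
    """A source text stored together with all of its competing annotations."""
    text: str
    author: str  # attribution of the source data itself
    annotations: list[Annotation] = field(default_factory=list)

    def annotate(self, label: str, source: str) -> None:
        # Variants are appended, never overwritten, so no single
        # algorithm's output has to be selected a priori.
        self.annotations.append(Annotation(label, source))

    def labels_by_source(self) -> dict[str, list[str]]:
        # Group the stored labels by the source that produced them.
        grouped: dict[str, list[str]] = {}
        for annotation in self.annotations:
            grouped.setdefault(annotation.source, []).append(annotation.label)
        return grouped


# Illustrative data only: two assumed models and one assumed human annotator.
record = SourceRecord(
    text="The service was slow but the staff were kind.",
    author="reviewer-1",
)
record.annotate("negative", source="model-A")
record.annotate("positive", source="model-B")
record.annotate("mixed", source="human:annotator-7")
```

Because every annotation carries its own `source`, a reader or downstream tool can compare the variants, or defer the choice between them, without losing track of where each one came from.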