Research Article
ALGORITHMIC LITERACY: Generative Artificial Intelligence Technologies for Data Librarians
@ARTICLE{10.4108/eetsis.4067,
  author={Alexandre Semeler and Adilson Pinto and Tibor Koltay and Thiago Dias and Arthur Oliveira and Jos\'{e} Gonz\'{a}lez and Helen Beatriz Frota Rozados},
  title={ALGORITHMIC LITERACY: Generative Artificial Intelligence Technologies for Data Librarians},
  journal={EAI Endorsed Transactions on Scalable Information Systems},
  volume={11},
  number={2},
  publisher={EAI},
  journal_a={SIS},
  year={2024},
  month={1},
  keywords={Generative Pre-trained Transformer, Algorithmic Literacy, Python, Open AI, Data Librarian},
  doi={10.4108/eetsis.4067}
}
Alexandre Semeler
Adilson Pinto
Tibor Koltay
Thiago Dias
Arthur Oliveira
José González
Helen Beatriz Frota Rozados
Year: 2024
SIS
EAI
DOI: 10.4108/eetsis.4067
Abstract
INTRODUCTION: Artificial intelligence (AI) is a novel type of library technology. AI technologies and the needs of data librarians are hybrid and symbiotic, because academic libraries must integrate AI technologies into their information and data services. Library services need AI to interpret the context of big data.
OBJECTIVES: In this context, we explore the use of the OpenAI Codex, a deep learning model trained on Python code from repositories, to generate code scripts for data librarians. This investigation examines the practices, models, and methodologies for obtaining code script insights from complex code environments linked to GPT-based AI technologies.
METHODS: The proposed AI-powered method aims to assist data librarians in creating code scripts using Python libraries in the PyCharm integrated development environment, with additional support from the Machinet AI and Bito AI plugins. The process involves collaboration between the data librarian and the AI agent: the librarian provides a natural language description of the programming problem, and the OpenAI Codex generates the solution code in Python.
RESULTS: Five specific web-scraping problems are presented. The scripts demonstrate how to extract data, calculate metrics, and write the results to files.
CONCLUSION: Overall, this study highlights the application of AI in assisting data librarians with code script creation for web-scraping tasks. AI may be a valuable resource for data librarians dealing with big data challenges on the Web. The ability to create Python code with AI is of great value, as AI technologies can help data librarians work with various types of data sources. In data science web-scraping projects, the Python code is produced with a machine-learning model that generates human-like code, which helps create and improve the library service for extracting data from a web collection.
The ability of non-programming data librarians to use AI technologies facilitates their interactions with all types of data sources. The Python programming language has artificial intelligence modules, packages, and plugins, such as the OpenAI Codex, that can script automation and navigation in web browsers, simulating human behaviour on pages by entering passwords, selecting captcha options, collecting data, and creating different collections of datasets to be viewed.
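The abstract describes scripts that extract data, calculate metrics, and write the results to files. A minimal Python sketch of that workflow follows, using only the standard library. It is not one of the five scripts from the article: the sample HTML, the names `LinkExtractor`, `extract_records`, `mean_title_length`, and `write_csv`, and the chosen metric (mean title length) are all assumptions made for illustration.

```python
import csv
from html.parser import HTMLParser

# Hypothetical sample page. A real script would fetch HTML from a web
# collection; an inline string keeps this sketch self-contained.
SAMPLE_HTML = """
<ul>
  <li class="record"><a href="/item/1">Data Curation Basics</a></li>
  <li class="record"><a href="/item/2">Web Archives</a></li>
  <li class="record"><a href="/item/3">Python for Librarians</a></li>
</ul>
"""


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.records.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_records(html):
    """Step 1: extract data from the page."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


def mean_title_length(records):
    """Step 2: calculate a simple metric over the extracted records."""
    if not records:
        return 0.0
    return sum(len(title) for _, title in records) / len(records)


def write_csv(records, handle):
    """Step 3: write the results to an open text file handle."""
    writer = csv.writer(handle)
    writer.writerow(["url", "title"])
    writer.writerows(records)


if __name__ == "__main__":
    records = extract_records(SAMPLE_HTML)
    print(f"{len(records)} records, mean title length "
          f"{mean_title_length(records):.2f}")
    # newline="" is what the csv module documentation recommends for files.
    with open("records.csv", "w", newline="", encoding="utf-8") as f:
        write_csv(records, f)
```

Keeping the three steps in separate functions means that a librarian could ask an AI assistant to regenerate only one of them, for example a different metric, without touching the extraction or output code.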
Copyright © 2024 Author et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium so long as the original work is properly cited.