Collective Intelligence based Endangered Language Revitalisation Systems: Design, Implementation, and Evaluation

Languages are disappearing at an alarming rate; half of the 7105-plus languages spoken today may disappear by the end of this century. When a language becomes extinct, communities lose their cultural identity, the practices tied to that language, and intellectual wealth. The rapid loss of languages motivates this study. We first introduce collective intelligence, endangered languages, and language revitalisation. Secondly, we discuss and explore how to leverage collective intelligence to preserve, curate, discover, learn, share and eventually revitalise endangered languages. Thirdly, we compare and synthesise existing language preservation and learning systems. Subsequently, we outline the research methodology. Finally, we propose the design, implementation and evaluation of the “Save Lingo” and “Learn Lingo” apps for revitalising endangered languages. The systems are instantiated and validated in the context of te reo Māori, Vietnamese and non-Roman script languages such as Arabic, Chinese and Hindi.


Introduction
Language is a living and dynamic phenomenon that plays a significant role in our daily lives. It is essential for the acquisition, accumulation, maintenance, and transmission of human knowledge regarding the natural environment and ways of interacting with it [1]. Language defines a culture, through which people communicate. Currently, languages are at greater risk of extinction than species of animals and plants [2]. Most linguists believe that at least half of the world's 7105 languages are endangered: they are not being learnt as first languages by children, which ultimately leads to the death of those languages. Firstly, to revitalise a language, we need to document it by capturing, curating and preserving various language artefacts such as words, phrases, songs, idioms, stories and dialects for future use. Secondly, the preserved language can be disseminated to and learnt by the wider community. The key to the success of any revitalisation effort is the contribution and collaboration of the native community. Hence, leveraging collective intelligence within the community will significantly help in preserving and learning the language. In the subsequent sections we explore the fundamentals of collective intelligence, endangered languages and language revitalisation efforts.
* Corresponding author. a.mirza@auckland.ac.nz

Collective Intelligence
Intelligence can be defined as the ability to see things, understand and reason at different capacities. It involves one being aware and conscious of individual actions and analysis of creative and critical thinking. In this context, collective intelligence can be described as the ability for different individuals to come together and form a group with the intention to share a common line of thought [3]. Lévy (1997) defines collective intelligence as a form of subjective mobilization, highly individual as well as ethical and cooperative [4].
Collective intelligence can either be beneficial or negative depending on the objectives set by the group. In this digitally enabled generation, the Internet is described as a useful tool that can be used to promote collective intelligence [5]. Currently, many individuals and groups are using the Internet to collectively provide content on various issues defining global trends as well as raising awareness on topics. Search engines and social media platforms are playing a significant role in facilitating these actions.
Collective intelligence rests on three key principles: cooperation, coordination and cognition [6]. Harnessing collective intelligence through these principles enables individuals and groups to solve practical problems. In this research we explore how to leverage concepts of collective intelligence to save endangered languages. Moreover, we design and implement systems that can be used to mitigate the loss of languages and revitalise the affected languages [7].

Endangered Languages
Language is a primary means of interaction among people, whether in person, in writing, over the phone or over the Internet. We are all so-called 'social animals'; communication therefore plays a very important role in our daily living, and language enables a person to express their feelings and opinions. Krauss (1992) estimates that 90% of the world's 7105 languages would become endangered or extinct by the end of this century if no language revitalisation efforts are made. Endangered languages are disappearing at a frightening rate: one language every two weeks [9]. The reasons for language endangerment are complex, but generally it is linked to communities abandoning their minority native languages in favour of a mainstream language that is more economically, politically and socially powerful.
Languages are one of the richest parts of human diversity. Currently there are 7105 living languages spoken by a world population of 6,716,664,407 [10]. The distribution of languages among the world population is heavily skewed. Approximately 79.5% of the world population speaks only 75 languages. In contrast, 3894 smaller languages are spoken by only 0.13% of the world population [1]. The rapid decline of languages highlights the need to revitalise endangered languages for the survival of culture, diversity and knowledge.
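To make the skew in these figures concrete, the following back-of-the-envelope sketch divides each population share by the number of languages in that group. It uses only the numbers quoted above; the function name `average_speakers` is introduced here for illustration, and the results are group averages, not speaker counts for any individual language.

```python
# Figure quoted in the text above [10].
WORLD_POPULATION = 6_716_664_407


def average_speakers(share_of_population: float, language_count: int) -> float:
    """Average number of speakers per language within a group of languages."""
    return WORLD_POPULATION * share_of_population / language_count


# 79.5% of the population speaks 75 languages.
large = average_speakers(0.795, 75)
# 0.13% of the population speaks 3894 smaller languages.
small = average_speakers(0.0013, 3894)

print(f"Average speakers per widely spoken language: {large:,.0f}")
print(f"Average speakers per smaller language: {small:,.0f}")
print(f"Ratio between the two averages: {large / small:,.0f} to 1")
```

On these figures the widely spoken group averages roughly 71 million speakers per language, while the smaller group averages a little over two thousand.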

Language Revitalisation
Language revitalisation aims to reverse the decline of a language and prevent it from becoming endangered or extinct. Linguists have proposed various models for language revitalisation [11,12]. These models include school-based programs (total and partial immersion), children's programs outside the school, adult language programs, documentation and materials development, home-based programs, and language reclamation.
Language preservation is one of the essential tasks when revitalising a language. A language can be documented in multiple formats, including audio and video recordings, scanned images, and written notes. The documented data can then be archived and mobilised into various publications (print and digital). Preservation is highly dependent on the younger generation learning and using their indigenous language [13]. Most language revitalisation models focus on teaching and learning the language. Language learning is an emerging research area known as "Language Acquisition", which spans linguistics and psychology. We will focus on language learning techniques that assist in language revitalisation efforts. Language learning is a key component of language revitalisation, as teaching the future generation their indigenous language will help keep the language alive. Learning a language is dependent on the resources available for that language. Past research suggests that total immersion schools and classrooms have been very successful for language revitalisation [12]. However, emerging technologies provide capabilities that were not available before, which gives rise to new approaches to language revitalisation. In the next section we look at how we can leverage collective intelligence for language revitalisation purposes.

Leveraging Collective Intelligence for Language Revitalisation
One of the most important components of existence is language; it gives people a certain sense of belonging and originality. Language connects people who share cultural values and distinguishes them within the wider world. According to Evans and Levinson (2009), language is the basis for cultural preference and belonging; the two authors note that the diversity offered by languages provides a crucial element in cognitive science. Community-driven projects are the best examples of how collective intelligence connects shared interests. It is important for companies and other corporate bodies to develop programs that connect language to daily work practice to help preserve the endangered language. Incorporating communities' practices and language preferences in the day-to-day running of corporations provides a sustainable step towards language preservation [15].
Language provides deep insight into the culture associated with it. Thus, gathering collective intelligence from various age groups and genders will help preserve, revitalise and pass the language on to future generations. Languages evolve rapidly and, as in biology, only the fittest survive: only commonly used languages will survive the change. The learning and usage of a language can be made possible by harnessing collective intelligence about that language, and the language can be made available to wider society through mobile devices and social media. This will enable the community to access their language anywhere, anytime.
The diagram below illustrates the research dimensions (Figure 1). Research in the past (inner circle) has focused predominantly on the design and implementation of traditional systems to capture and curate languages by experts in limited contexts, focused on standard vocabulary and media. This research (red zone) tries to address these research problems by exploring a crowd sourced approach to harness collective intelligence using ubiquitous systems to capture and curate linguistic diversity and richness anytime, anywhere. Moreover, it allows end users to discover and learn to use the endangered language.

Existing Systems for Language Revitalisation and Learning
Current language revitalisation models and activities include school-based immersion programmes, children's programmes outside school, adult language programmes, documentation and materials development, home-based programmes, and language reclamation [1,11,12,16-18]. However, the focus of these efforts has been on language learning rather than a holistic approach towards language preservation, curation, and usage.
Only a few indigenous communities around the world have started adopting mobile apps to revitalise their language and cultural practices [19-22]. However, the apps are mostly learning-oriented and use already documented information (e.g. Go Vocab, Hika Explorer, Te reo Māori dictionary, Duo Lingo, Ojibway and Saulteaux). Ma! Iwaidja and Duo Lingo allow data capture but do not facilitate the curation process. Moreover, there is limited research on models, frameworks, and/or architectures for the design, implementation and evaluation of collective intelligence based language revitalisation systems. Hence, there is a gap in existing approaches and apps, because they do not facilitate a holistic approach to preserve, curate, and learn to use an endangered language.

Research Methodology
The primary aim of this research is to design and implement a system. The word "Design" means "to create, fashion, execute, or construct according to plan" [16]. Therefore, it is best to discover through design and adapt a multi-methodological approach to conduct this design science research [17]. For this study, Nunamaker's [18] multi-methodological approach for information systems research (ISR) will be adapted to propose and develop various artefacts. Moreover, the criteria for design science artefacts proposed by Nunamaker [18] and Hevner [19] will be followed throughout the study. The adapted multi-methodological approach is a practical way of designing and implementing a system. It consists of four research phases (observation, theory building, systems development and experimentation), as illustrated in Figure 2. The phases have no fixed order, but they are mutually connected to support the creation and validation of a system over multiple iterations. As this research focuses mainly on the design and implementation of a system, the proposed approach will follow the sequence of observation, theory building, systems development, and experimentation. As the research progresses through each phase, the artefacts will be refined and generalised, as depicted in Figure 2. Generalisation of the artefacts is the central focus of this research.

Figure 2. Research Dimensions
Observation: The observation of existing literature and systems helps bring clarity to the research domain. We examined existing academic literature on language revitalisation and reviewed existing applications that are available for indigenous languages. The outcome was a comparison of existing applications available for language revitalisation [20].
Theory Building: This consists of adapting and developing ideas and concepts, and creating conceptual models, processes and frameworks. The proposed theories will help conceptualise a generic system that supports a crowd sourced approach towards language revitalisation for languages including Te Reo Māori, Vietnamese and non-Roman script languages. The outcomes were concepts, models, processes, frameworks and architectures for a crowd sourced, knowledge management driven approach towards language revitalisation [20,21].
Systems Development: The proposed concepts, models, processes and frameworks will enable us to design and implement a holistic crowd sourced knowledge management system to capture, curate, discover and learn Te Reo Māori, which supports dialect variations and media such as words, phrases, imagery, poetry, proverbs and idioms that are common as well as specific to a particular tribe or family [20,21]. This system is described in section 4. The development of the Te Reo Māori revitalisation system will help demonstrate the feasibility of the system for other endangered languages. The outcomes include Save Lingo (a crowd sourced language revitalisation system), the Learn Lingo apps (Flash cards and Hangman), and a refined architecture and implementation.
Evaluation: Once the system is developed, we will adopt various evaluation mechanisms to validate and refine the proposed theories (concepts, models, processes, frameworks and architectures) and to enhance and generalise our systems, namely Save Lingo and Learn Lingo. Development is an iterative process, and the issues identified during evaluation will lead to further refinement or creation of design artefacts. The evaluation plan is described in section 5.

Generalisation: The generalisation of concepts, models, processes, frameworks, architectures and systems is an ongoing process to make each of the artefacts applicable to all languages. The Framework of Common Design Elements, as shown in Figure 4, will be language independent. The initial implementation will be for Te Reo Māori, followed by Vietnamese, and then the system will be generalised to support non-Roman script languages such as Arabic, Chinese and Hindi. This will help us generalise our artefacts to a level at which they are adaptable to the majority of languages.

Design and implementation of collective intelligence based language revitalisation systems
The majority of languages in the world are endangered and rapidly becoming extinct. The aim of this research is to design and implement collective intelligence driven smart mobile apps to preserve and learn endangered languages. We adopted a design science research methodology to help develop concepts, models, processes, frameworks and architectures [17-19,22]. Subsequently, mobile apps will be designed and implemented leveraging the key principles of collective intelligence (cooperation, coordination and cognition) to support vital language revitalisation processes, as illustrated in Figure 3: (1) Capture/Preserve: words, phrases, poems, idioms, and stories as text, audio, images, and video in multiple dialects; (2) Curate: filter and approve captured content; and (3) Learn and use: context-aware dynamic games and apps based on curated data to encourage the use of the endangered language in daily life.
The apps and related concepts, models, processes, and frameworks will be initially designed for preserving and learning te reo Māori, which is the native language of New Zealand. These research artefacts will then be further generalised to help save and learn other languages, including non-Roman script based languages.

Concepts and Processes
This research employs a holistic crowd sourced approach to harness collective intelligence to revitalise endangered languages, as illustrated in Figure 3. This model is created by synthesising concepts from the collective intelligence, knowledge management, and language revitalisation literature [1,4,5,11,23,24]. It has five stages and related processes, namely capture, curate, discover, learn and share. The capture stage allows contributors to create/capture words, phrases, idioms, stories and songs in multiple formats including text, image and video. The curate stage involves language experts moderating and refining the captured data. The discover stage enables the wider community to retrieve knowledge from the dynamic repository. The learn stage allows users to learn the language using interactive games. Lastly, the share stage enables the dissemination of knowledge through social media among the wider community to help promote the use of the language.
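The first three stages described above (capture, curate, discover) can be pictured with a minimal in-memory sketch. This is an illustration under stated assumptions, not the Save Lingo implementation: the names `Record`, `Status` and `Repository`, and all of their fields, are hypothetical and do not come from the published system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    CAPTURED = "captured"   # submitted by a contributor, not yet reviewed
    APPROVED = "approved"   # accepted by a language expert
    REJECTED = "rejected"   # declined by a language expert


@dataclass
class Record:
    # Hypothetical record shape; the real system also stores media and dialects.
    text: str
    kind: str               # e.g. "word", "phrase", "idiom", "story", "song"
    language: str
    contributor: str
    status: Status = Status.CAPTURED


@dataclass
class Repository:
    records: list = field(default_factory=list)

    def capture(self, record: Record) -> Record:
        """Capture stage: any contributor may add a record."""
        self.records.append(record)
        return record

    def curate(self, record: Record, approve: bool) -> None:
        """Curate stage: an expert accepts or rejects a captured record."""
        record.status = Status.APPROVED if approve else Status.REJECTED

    def discover(self, query: str) -> list:
        """Discover stage: search approved records only."""
        needle = query.casefold()
        return [r for r in self.records
                if r.status is Status.APPROVED and needle in r.text.casefold()]


repo = Repository()
kia_ora = repo.capture(Record("kia ora", "phrase", "te reo Māori", "contributor_a"))
aroha = repo.capture(Record("aroha", "word", "te reo Māori", "contributor_b"))
repo.curate(kia_ora, approve=True)

print([r.text for r in repo.discover("kia")])    # approved, so it is found
print([r.text for r in repo.discover("aroha")])  # still uncurated, so it is hidden
```

Keeping uncurated records out of `discover` reflects the role the text gives to language experts: only moderated content reaches learners and the wider community.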

Framework
The generic design elements that make up the Save Lingo framework are depicted in Figure 4. The framework incorporates collective intelligence fundamentals to save endangered languages. The framework elements support the reuse of artefacts for multiple endangered languages. Ninety percent of the artefacts, such as concepts, models, processes, framework, architecture and system, are not tied to a particular language. Only user interface changes at a system level are required to tailor the system to a particular language. The artefacts are key outputs of design science research [18,19]. The first six layers of the framework are standard and well understood in research. However, applying these fundamentals for language revitalisation and learning purposes is novel.
Ubiquitous Information Systems and Devices (UIS&D) are being adopted by many indigenous communities to preserve, maintain and revitalise their language and cultural practices [25,26]. UIS&D refers to systems and devices (tabs, pads, or boards) that are abundantly available without boundaries [27,28]. Adoption of UIS&D is rising significantly among both digital natives and digital immigrants. Ubiquitous devices provide many advantages: flexibility, low cost, mobility, user-friendliness, connectivity and multimedia capabilities. These advantages significantly help in implementing the Save Lingo app, a holistic crowd sourced language preservation system to harness collective intelligence.
The prototypical implementation of Save Lingo was adapted to revitalise te reo Māori, the native language of the Māori population of New Zealand. The Save Lingo system was further generalised and implemented for Vietnamese and non-Roman scripts including Arabic, Hindi and Chinese. The Save Lingo app, as shown in Figure 5, extends fundamentals from social media, knowledge management, and collective intelligence to create a highly interactive platform that allows users to remotely contribute and collaborate towards capturing, curating, discovering and sharing the endangered language. The functionalities are described in a book chapter [20].
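One way to picture the statement that only user interface changes are needed per language is a small per-language profile that the language-independent layers consult. The sketch below is an assumption made for illustration: `LanguageProfile`, `PROFILES` and `text_direction` are hypothetical names and are not taken from the Save Lingo architecture. The language codes are standard ISO 639-1 codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageProfile:
    code: str             # ISO 639-1 language code
    display_name: str     # name shown in the user interface
    right_to_left: bool   # affects screen layout only, not stored data


# Hypothetical profiles for the languages named in the paper.
PROFILES = {
    "mi": LanguageProfile("mi", "te reo Māori", False),
    "vi": LanguageProfile("vi", "Tiếng Việt", False),
    "ar": LanguageProfile("ar", "العربية", True),
    "zh": LanguageProfile("zh", "中文", False),
    "hi": LanguageProfile("hi", "हिन्दी", False),
}


def text_direction(language_code: str) -> str:
    """Return the layout direction the user interface should use."""
    return "rtl" if PROFILES[language_code].right_to_left else "ltr"


for code in PROFILES:
    print(code, PROFILES[code].display_name, text_direction(code))
```

Under this assumption, adding a language means adding one profile entry, while the capture, curate and discover logic stays unchanged.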

Implementation of language learning systems -Learn Lingo
Learning a language can be a difficult task regardless of age or gender. We aim to implement Learn Lingo apps that are user-friendly, dynamic and gamified to engage, retain and help users learn an endangered language. The underlying fundamentals and learning processes, modalities and mechanisms are synthesised from the language acquisition and learning literature [11,29-33], as presented in Table 2. Every individual progresses through various learning stages to learn a new language. Each stage is unique in the modalities and mechanisms applied for learning. There are many learning apps available, including Flash card and Hangman apps, for various languages. Nevertheless, the novel contribution of Save Lingo and Learn Lingo is that the collective intelligence/data of the community, captured and curated using the Save Lingo app, is used as the content for Learn Lingo's Flash cards and Hangman apps.

Flash cards app to support: Observe, Identify, Listening and Speaking
Flash cards are a relatively simple way of learning key words and phrases of a language. The Flash cards app supports the observe, identify, listening and speaking modalities to help users learn the language. The prototypical implementation of the Flash cards app is shown in Figure 6. The app allows users to securely log in and create various flash card sets. A set can be constructed using the data that was captured and curated using Save Lingo.
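Constructing a flash card set from curated data can be sketched as filtering curated entries by category and sampling from them. This is a minimal sketch, assuming a hypothetical `CuratedEntry` shape and a hypothetical `build_flash_card_set` helper; the actual Learn Lingo data model is not described in this paper.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class CuratedEntry:
    # Hypothetical fields for an entry that has already passed curation.
    term: str
    translation: str
    category: str


def build_flash_card_set(entries: list, category: str, size: int,
                         seed: Optional[int] = None) -> list:
    """Pick up to `size` curated entries from one category, in random order."""
    pool = [e for e in entries if e.category == category]
    rng = random.Random(seed)
    return rng.sample(pool, min(size, len(pool)))


ENTRIES = [
    CuratedEntry("kia ora", "hello", "greetings"),
    CuratedEntry("haere mai", "welcome", "greetings"),
    CuratedEntry("whānau", "family", "people"),
    CuratedEntry("tamariki", "children", "people"),
]

cards = build_flash_card_set(ENTRIES, "greetings", size=5, seed=1)
for card in cards:
    print(f"{card.term} -> {card.translation}")
```

Capping the sample at the pool size keeps the helper usable for small categories, which are likely while a community repository is still growing.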

Hangman app to support: Identify, Writing and Reading
The Hangman app supports the identify, writing and reading modalities to help users learn new words of the language. The prototypical implementation of the Hangman app is displayed in Figure 7. The app allows users to securely log in and select a category. The app then randomly selects a curated word from the category chosen by the user.
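The random selection of a curated word, followed by letter-by-letter guessing, can be sketched as below. The class `HangmanRound` and the helper `start_round` are hypothetical names introduced for illustration and are not the Learn Lingo code. The sketch normalises text to Unicode NFC so that a macronised vowel such as "ā" counts as a single letter; it treats each code point as one letter, which is an assumption that would need revisiting for scripts such as Devanagari, where a visible character can span several code points.

```python
import random
import unicodedata


def normalise(text: str) -> str:
    """Compose accents (NFC) and ignore case so guesses compare reliably."""
    return unicodedata.normalize("NFC", text).casefold()


class HangmanRound:
    def __init__(self, word: str, max_misses: int = 6) -> None:
        self.word = normalise(word)
        self.guessed = set()
        self.misses = 0
        self.max_misses = max_misses

    def guess(self, letter: str) -> bool:
        """Record a guess; a new wrong guess costs one miss."""
        letter = normalise(letter)
        if letter not in self.guessed:
            self.guessed.add(letter)
            if letter not in self.word:
                self.misses += 1
        return letter in self.word

    def masked(self) -> str:
        """Show guessed letters; hide the rest; keep spaces and hyphens visible."""
        return " ".join(ch if ch in self.guessed or not ch.isalpha() else "_"
                        for ch in self.word)

    def won(self) -> bool:
        return all(ch in self.guessed for ch in self.word if ch.isalpha())

    def lost(self) -> bool:
        return self.misses >= self.max_misses


def start_round(words_by_category: dict, category: str) -> HangmanRound:
    """Randomly pick a curated word from the category chosen by the user."""
    return HangmanRound(random.choice(words_by_category[category]))


game = HangmanRound("whānau")
game.guess("a")
print(game.masked())   # the plain "a" is revealed, the macronised "ā" is not
game.guess("ā")
print(game.masked())
```

Treating "a" and "ā" as distinct letters is a deliberate choice in this sketch, since vowel length changes meaning in te reo Māori.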

Evaluation
Our research on language revitalisation systems has produced a number of artefacts. These artefacts include conceptual models, processes, frameworks, architectures, and the Save Lingo and Learn Lingo systems. To validate them, we adopted a variety of evaluation methods in our research methodology design. A summary of these evaluation methods is provided in Table 2. To assess the validity of each research artefact, one or more evaluation methods were employed according to the nature and evaluation requirements of that artefact. Table 3 presents a summary of our research artefacts and their selected evaluation methods.

Definition and Models
The definition and models of the holistic crowd sourced approach to harness collective intelligence to revitalise endangered languages were validated mainly by gathering feedback and suggestions from domain and system experts from a variety of disciplines. The feedback received deepened our understanding of language revitalisation and knowledge management systems, which assisted in refining our concepts and models. To support the feedback collection and achieve a thorough understanding, we initially presented the conceptual models to the Māori Language Commission of New Zealand. The key outcome of the discussion was that there is a lot of emphasis on learning languages but not so much on the usage of the languages within society. Hence, we refined our models and incorporated the aspect of learning to use endangered languages rather than just learning a language. Furthermore, during the initial stages of the research, the research outline and preliminary models were presented at multiple University of Auckland seminars. The discussion with academics and domain experts provided useful input for understanding and evaluating the models.

Processes
The key processes to harness collective intelligence to revitalise endangered languages, illustrated in Figure 3, guide the development and application of the Save Lingo and Learn Lingo systems. These processes were validated through five research evaluation methods: prototyping, case study, illustrative scenarios, generalizability and expert evaluation. Different methods helped evaluate the proposed processes from different perspectives. In particular, the prototyping approach focuses on validating the system development phase of the key processes: capture, curate and discover. The case study approach validates the processes by demonstrating their application and generalizability to multiple languages including Te Reo Māori, Vietnamese and non-Roman scripts such as Arabic, Chinese and Hindi. The illustrative scenarios included in the case demonstration reinforce the validity of the Save Lingo processes. The processes were also presented in publications [20,21] to collect valuable feedback from the wider academic and research communities.

Features
The features of the Save Lingo and Learn Lingo systems were evaluated using prototyping, case study, illustrative scenarios, and expert evaluation methods. The development of prototypes and systems allowed us to visualise, interact with, evaluate and refine the features. Some of the key Save Lingo features include: (1) capturing words, idioms, stories and songs in multiple formats including text, image and video; (2) curating captured content by accepting, rejecting or refining the captured data; (3) discovering content from the dynamic repository via search functionality or by navigating categories; (4) sharing content via social media integrations; (5) gamification aspects such as collecting points, badges, and leader board rankings; and (6) bookmarking records for easy access.
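The gamification feature listed above (points, badges and leader board rankings) can be illustrated with a small scoring sketch. The point values, badge thresholds and the `Leaderboard` class are hypothetical and chosen only for illustration; the paper does not specify how Save Lingo scores contributions.

```python
from collections import Counter
from typing import Optional

# Hypothetical point values per contributor action.
POINTS = {"capture": 5, "approved": 10, "share": 2}
# Hypothetical badge thresholds, highest first.
BADGES = [(50, "gold"), (20, "silver"), (10, "bronze")]


class Leaderboard:
    def __init__(self) -> None:
        self.scores = Counter()

    def award(self, user: str, action: str) -> int:
        """Add the points for an action and return the user's new total."""
        self.scores[user] += POINTS[action]
        return self.scores[user]

    def badge(self, user: str) -> Optional[str]:
        """Return the highest badge the user's score qualifies for, if any."""
        score = self.scores[user]
        for threshold, name in BADGES:
            if score >= threshold:
                return name
        return None

    def ranking(self, top: int = 10) -> list:
        """Users ordered by score, highest first."""
        return self.scores.most_common(top)


board = Leaderboard()
board.award("contributor_a", "capture")
board.award("contributor_b", "capture")
board.award("contributor_b", "approved")   # the capture passed curation

print(board.ranking())
print(board.badge("contributor_b"))
```

Awarding extra points when a capture is approved is one possible way to reward quality as well as volume, in line with the curation step the system relies on.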
The features were also evaluated by walking through illustrative scenarios and case study implementations in multiple languages. Furthermore, system features were published [20,21] and demonstrated to system, domain and academic experts. The suggestions and feedback received were highly valuable, as they helped to identify hidden issues and new features that we could incorporate within our systems and use to refine our research artefacts.
To operationalise the concept of a holistic crowd sourced approach to save endangered languages, we proposed frameworks and architectures to guide the design and development of the Save Lingo and Learn Lingo systems. We explain the validation of the frameworks and architectures in the following section.

Framework and Architectures
The holistic crowd sourced approach framework and architectures guide the design and development of the Save Lingo system. The Save Lingo frameworks and architectures are evaluated by three essential methods: architecture analysis, prototyping, and expert evaluation.
To assist in validating the Save Lingo frameworks and architectures, we conducted architecture analysis for each development iteration of the Save Lingo prototype. This method is adopted in our evaluation for three main reasons. Firstly, it allows us to ensure that the unique features of Save Lingo are fully supported and to detect any flaws in the system designs early in a prototype development iteration. Secondly, it enables us to follow good practices for designing the system. Thirdly, it contributes to improving the overall functionality, usability, efficiency and sustainability of Save Lingo.
Our architecture was validated by setting up a set of evaluation scenarios to ensure that Save Lingo features and requirements are implemented. We performed the first round of architecture analysis against the Save Lingo designs before starting the prototype development. Each development iteration deepened our understanding of the system features and requirements of Save Lingo. Accordingly, we improved the system framework and architecture designs in the following development iteration and conducted the analysis again before making any improvements or changes to the prototype.
Furthermore, we validated the proposed Save Lingo frameworks and architectures by implementing a fully functional holistic crowd sourced language revitalisation system (Save Lingo), as described in section 4. This prototype follows the detailed level architecture design and supports all the features of Save Lingo. To identify gaps, we implemented the prototype for multiple languages using a series of scenario-driven illustrations.
Lastly, the Save Lingo framework and architecture designs were continuously changed and improved over time. During this continuous improvement process, we communicated the system designs to the wider academic and research communities in seminars and publications. The valuable feedback obtained from system, domain and academic experts further helped us to refine the Save Lingo frameworks and architectures.

Prototypes
The Save Lingo endangered language revitalisation system and the Learn Lingo Flash cards and Hangman prototypes were implemented based on the proposed models, concepts, processes, frameworks and architectures. System development is an iterative process; prototypes were adjusted each time any alterations were made to the system architectures. The prototype implementation validates the proposed theories of a collective intelligence based endangered language revitalisation system to help save endangered languages.
Because the development of the prototypes plays a significant role in validating the other research artefacts, we adopted eight methods to evaluate the prototypes. These evaluation methods are static analysis, structural testing, functional testing, computer simulations, case study, illustrative scenarios, informed argument, and expert evaluation.
In summary, several mechanisms were used to evaluate the generalizability of this research. This was to ensure that our concepts, processes, framework, architecture, and implementation would be applicable to many languages. We have successfully implemented the system to support te reo Māori, Vietnamese and non-Roman scripts including Arabic, Chinese and Hindi. The system is not limited to the languages mentioned above; it is language independent. Hence, it can be adapted to cater for any written language.

Conclusion
In conclusion, this research leverages the strengths of collective intelligence to address weaknesses in systems that attempt to preserve and/or teach languages. We began by looking at the strengths of collective intelligence, the practical problems of endangered languages, and language revitalisation efforts. Moreover, we explored how collective intelligence can be leveraged for language revitalisation purposes. We then reviewed a variety of language preservation and learning systems. Based on these findings and our previous work on Save Lingo [20], we proposed concepts, models, processes and a framework to help preserve and learn endangered languages. Furthermore, we described the implementation of Save Lingo and of Learn Lingo (Flash cards and Hangman), which are based on the proposed concepts, models, processes and framework, in the context of an endangered language, te reo Māori. We validated and evaluated the generalizability by implementing the concepts, models, processes, framework, architecture and systems for other languages including Vietnamese and non-Roman script languages such as Arabic, Chinese and Hindi. Lastly, the system's functionality can be used to facilitate cooperation, coordination and cognition within the community to revitalise other endangered languages. This crowd-sourced approach will harness collective intelligence to create a central repository for the distribution and revitalisation of indigenous languages, knowledge, values and culture.

EAI
Endorsed Transactions on Context-aware Systems and Applications 09 2016 -03 2017 | Volume 4 | Issue 11 | e disseminating the knowledge among the community.

Figure 3. Framework of generic design elements to preserve and learn endangered languages

Figure 2. Key concepts and processes to harness collective intelligence to revitalise endangered languages


Table 1. Existing systems available for revitalising endangered languages

Table 2. Evaluation Methods

Table 3. Research artefacts and evaluation methods