Toward the Ideal Signing Avatar

The paper discusses ongoing research on the effects of a signing avatar's modeling/rendering features on the perception of sign language animation. It reports a recent study that aimed to determine whether a character's visual style has an effect on how signing animated characters are perceived by viewers. The stimuli of the study were two polygonal characters presenting two different visual styles: stylized and realistic. Each character signed four sentences. Forty-seven participants with experience in American Sign Language (ASL) viewed the animated signing clips in random order via web survey. They (1) identified the signed sentences (if recognizable), (2) rated their legibility, and (3) rated the appeal of the signing avatar. Findings show that while a character's visual style does not have an effect on subjects' perceived legibility of the signs or on sign recognition, it has an effect on subjects' interest in the character. The stylized signing avatar was perceived as more appealing than the realistic one.


Introduction
Computer animation of American Sign Language (ASL) has the potential to remove many of the barriers to education for deaf students because it provides a low-cost and effective means for adding sign language translation to any type of digital content. Like a video of an ASL interpreter, computer animation technology allows for direct communication of ASL in a dynamic visual form that eliminates the need for closed captioning text, symbolic representations of the signs [30], or static sign images.
The benefits of rendering sign language in the form of 3D animations have been investigated by several research groups [2,29,34] and commercial companies [13,33] during the past decade, and the quality of 3D animation of ASL has improved significantly. However, its effectiveness and widespread use are still hindered by two major limitations: low realism of the animated signing (which results in low legibility of the signs) and low avatar appeal. The goal of our research is to advance the state of the art in ASL animation by improving the quality of the signing motions and the appeal of the signing avatars. Our first step toward this objective is to determine whether certain characteristics of a 3D signing character have an effect on the way it is perceived by the viewer.
The objective of the work reported in this paper was to determine whether the character's visual style, and specifically its degree of stylization, has an effect on the legibility of the animated signs and on the viewer's interest in the character. The paper is organized as follows. In section 2 we report recent research on animation of sign language, and in section 3 we present prior studies (by the authors) on perception of sign language animation. Visual style is discussed in section 4; the study and findings are described in section 5. Discussion and future work are included in section 6.

Sign Language Animation
Recent advances in computer graphics and animation are making possible the development of animated signing avatars that show great potential for improving deaf education and sign language communication. Although signing avatars are still a relatively young research area with only two decades of active research, several significant results have been achieved.
Vcom3D [33] was first to reveal the potential of computer animation of ASL, with two commercial products designed to add ASL to media: SigningAvatar® and Sign Smith Studio®. While Vcom3D animation can approximate sentences by ASL signers, individual hand shapes and signing rhythm are often unnatural, and facial expressions do not convey meanings as clearly as a live signer.
In 2005, TERC [32] collaborated with Vcom3D and the National Technical Institute for the Deaf on the SigningAvatar® accessibility software for web activities and resources for two Kids Network units. TERC has also developed a Signing Science Dictionary with the same software [34]. Although both projects have benefited young deaf learners, they have not advanced the state of the art in animation of ASL, as they employed existing Vcom3D animation technology. The Purdue University Animated Sign Language Research Group [1], with the Indiana School for the Deaf, focuses on the development and evaluation of innovative animation-based interactive tools to improve K-6 math/science education for the Deaf (e.g., Mathsigner and SMILE™). The signing avatars in Mathsigner and SMILE improve over previous examples of ASL animation.
Many research efforts target automated translation from written to sign language to give signers with low reading proficiency access to written information in contexts such as education and internet usage. In the U.S., English to ASL translation research systems include those developed by Zhao et al. [36], Grieve-Smith [16] and continued by Huenerfauth [17]. To improve the realism and intelligibility of ASL animation, Huenerfauth is using a data-driven approach based on corpora of ASL collected from native signers [18]. In France, Delorme et al. [10] are working on automatic generation of animated French Sign Language using two systems: one that allows pre-computed animations to be replayed, concatenated and co-articulated (OCTOPUS) and one (GeneALS) that builds isolated signs from symbolic descriptions. Gibet et al. [15] are using data-driven animation for communication between humans and avatars. The Signcom project incorporates an example of a fully data-driven virtual signer, aimed at improving the quality of real-time interaction between humans and avatars. In Germany, Kipp et al. [21] are working on intelligent embodied agents, multimodal corpora and sign language synthesis. Recently, they conducted a study with small groups of deaf participants to investigate how the deaf community sees the potential of signing avatars. Findings from their study showed generally positive feedback regarding acceptability of signing avatars; the main criticism of existing avatars targeted the lack of non-manual components (facial expression, full body motion) and emotional expression. In Italy, Lesmo et al. [23] and Lombardo et al. [24] are working on project ATLAS (Automatic Translation into the Language of Sign), whose goal is the translation from Italian into Italian Sign Language represented by an animated avatar. The avatar takes as input a symbolic representation of a sign language sentence and produces the corresponding animations; the project is currently limited to weather news.
The ViSiCAST project [12], continued by eSIGN [13], provides text-to-sign language animated translation in the United Kingdom.Translation to Greek Sign Language (SL) is pursued by Efthimiou's group [22], to German SL by Bungeroth [6], to Irish SL by Morrissey and Way [27], to Polish SL by Suszczanska [31], and to Taiwanese SL by Chiu et al. [7], to name just a few.
Despite the substantial amount of research and recent advancements, existing sign language animation programs still lack natural characteristics of intelligible signing, resulting in stilted, robot-like, low-appeal signing avatars whose signing motions are often difficult to understand.

Prior Studies on Perception of Signing Avatars
This section presents a brief review of recent studies that investigated the effects of certain avatar modeling/rendering features on perception of ASL animation.
In 2009 Adamo-Villani et al. [4] conducted an experiment that aimed to determine whether the character's geometric model (i.e., segmented vs. seamless) has an effect on how animated signing is perceived by viewers. Additionally, the study investigated whether the geometric model affects perception at varying degrees of linguistic complexity, specifically hand shape complexity. Results of the study showed that the seamless avatar was rated highest for perceived legibility of the signs and sign recognition. Simple hand shapes were rated higher than moderately complex and complex ones, and the interaction between character and hand shape complexity was significant. For the segmented character (more than for the seamless one), ratings decreased as hand shape complexity increased. Findings from this experiment could indicate a preference for seamless, deformable characters over segmented ones, especially in signs with complex hand shapes.
In 2011 Adamo-Villani et al. [3] examined the effect of Ambient Occlusion Shading (AOS) on user perception of American Sign Language (ASL) fingerspelling animations. Seventy-one (71) subjects participated in the study; all subjects were fluent in ASL. The participants were asked to watch forty (40) sign language animation clips representing twenty (20) finger-spelled words. Twenty (20) clips did not show ambient occlusion, whereas the other twenty (20) were rendered using ambient occlusion shading. After viewing each animation, subjects were asked to type the word being finger-spelled and rate its legibility. Findings showed that the presence of AOS significantly improves subjects' perceived legibility of the animated signs, as well as sign recognition.
Another study by Jen and Adamo-Villani [19] explored whether the implementation of a particular non-photorealistic rendering style (e.g., cel shading) in ASL fingerspelling animations could improve their legibility. Sixty-nine (69) subjects (all ASL users) participated in the study. Stimuli included forty animation clips: twenty clips were rendered with cel shading and twenty were rendered with photorealistic rendering with ambient occlusion shading. The cel shaded animations were rated highest for legibility and sign recognition. These results are not surprising if we consider that with standard rendering methods, based on local lighting or global illumination, it may be difficult to clearly depict palm and finger positions because of occlusion problems. It is common for artists of technical drawings or medical illustrations to depict surfaces in a way that is inconsistent with any physically realizable lighting model, but that is specifically intended to bring out surface shape and detail. We believe that the cel shaded renderings with contour lines helped viewers better perceive the palm/finger configuration by highlighting the finger positions and eliminating irrelevant surface details.
In summary, the findings from these three studies suggest that an adequate signing avatar should be a seamless character rendered with cel shading or photorealistic rendering with AOS. Additional studies need to be conducted in order to identify other key modeling and rendering characteristics that make up an effective and engaging animated signing character. The work reported in the paper advances this line of research and brings us one step closer to the ideal signing avatar.

Realistic versus Stylized Characters
In character design, the level of stylization refers to the degree to which a design is simplified and reduced. Several levels of stylization (or iconicity) exist, such as iconic, simple, stylized, and realistic [5]. A realistic character is one that closely mimics reality, and photorealistic techniques are often used. For instance, the body proportions of a realistic character closely resemble the proportions of a real human, the level of geometric detail is high, and the materials and textures are photorealistic [8]. A stylized character often presents exaggerated proportions, such as a large head and large eyes, and simplified painted textures. In general, stylized avatars are easier to model and set up for animation and much less computationally expensive for real-time interaction than realistic avatars. Figure 1 shows examples of realistic and stylized 3D characters.
Both realistic and stylized characters (also called agents) have been used in e-learning environments to teach and supervise. A few researchers have conducted studies on realistic versus stylized agents with respect to interest and engagement effects in users. Welch et al. [35] report a study that shows that pictorial realism increases involvement and the sense of immersion in a virtual environment. Nass et al. [28] suggest that embodied conversational agents should accurately mirror humans and should resemble the targeted user group as closely as possible.
On the other hand, Cissel's work [8] suggests that stylized characters are more effective at conveying emotions than realistic characters. In her study on the effects of character body style (i.e., realistic versus stylized) on user perception of facial emotions, stylized characters were rated higher for intensity and sincerity. McCloud [25] argues that audience interest and involvement are often increased by stylization. This is due to the fact that when people interact, they sustain a constant awareness of their own face, and this mental image is stylized. Thus, it is easier to identify with a stylized character.
In summary, the literature shows no consensus on realistic versus stylized characters with respect to their impact and ability to engage users. In addition, to our knowledge, no studies on the effects of visual style in regard to signing avatars currently exist. This indicates a need for additional research and systematic studies. The work reported in the paper aims to fill this gap.

Description of the Study
The objective of the study was to determine whether the visual style of a character (i.e., stylized versus realistic) has an effect on subjects' perception of the signing avatar. The independent variable for the experiment was the presence of stylization in the signing avatar's visual style. The dependent variables were the ability of the participants to identify the signs, their perception of the legibility of the signed sentences, and their perception of the avatar's appeal.
The hypotheses of the experiment were the following:
H0(1): The presence of stylization in a signing avatar's visual style has no effect on the subjects' ability to recognize the animated signs presented to them.
H0(2): The presence of stylization in a signing avatar's visual style has no effect on subjects' perceived legibility of the signing animations.
H0(3): The presence of stylization in a signing avatar's visual style has no effect on the perceived appeal of the avatar.
Ha(1): The presence of stylization in a signing avatar's visual style affects the subjects' ability to recognize the animated signs presented to them.
Ha(2): The presence of stylization in a signing avatar's visual style affects subjects' perceived legibility of the signing animations.
Ha(3): The presence of stylization in a signing avatar's visual style affects the perceived appeal of the avatar.

Subjects
Forty-seven (47) subjects aged 19-32, twenty-four (24) Deaf, five (5) Hard-of-Hearing, and eighteen (18) Hearing, participated in the study; all subjects were ASL users. Participants were recruited from the Purdue ASL club and through one subject's ASL blog (johnlestina.blogspot.com/). The original pool included fifty-three (53) subjects; however, six (6) participants were excluded from the study because of their limited ASL experience (less than 2 years). None of the subjects had color blindness, blindness, or other visual impairments.

Avatars
The two characters used in the study were modeled, textured, rigged, and animated in Maya 2014. One character, Tom, is a realistic character constructed as one seamless polygonal mesh, with a poly-count of 622,802 triangles and a skeletal deformation system composed of 184 joints; the face is rigged using a combination of blendshapes and joint deformers. To achieve realism, the face/head was set up to convey 64 deformations/movements that correspond to the 64 action units (AU) of the Facial Action Coding System (FACS) [11,31]. The hands have a realistic skin texture with wrinkles, furrows, folds and lines, and are rigged with a skeleton that closely resembles the skeleton of a human hand. The second character, Jason, is a stylized avatar; he is a partially segmented 3D character composed of 14 polygonal meshes with a total poly-count of 107,063 triangles. He is rigged with the same skeletal deformation system as the realistic character and uses the same number of blendshapes and joint deformers for the face. Although the skeletal structures of both characters are identical, Jason presents exaggerated body proportions and unrealistic, stylized textures. For instance, the hands have a solid color texture without realistic skin details; the face is textured with a simple Blinn shader with painted freckles. Figure 2 shows two screenshots of the characters; figure 3 shows close-up shots of their hands and faces.
All signing animations were keyframed on Tom, and the animation data was exported and applied to Jason. Videos of a native signer performing the signs were used as reference footage for the creation of the clips. The signer was actively involved in the animation process. Because both characters have the same skeletal structure, it was possible to retarget the motion from one character to the other. However, as the characters have different body proportions, there are slight differences in the hand shapes and in the position of the arms/hands in relation to the torso and face. According to a native ASL signer who viewed all animations, these differences do not compromise the accuracy of the signs.
(2) Everyone knows about hurricanes, snowstorms, forest fires, floods, and even thunderstorms.
(3) But wait! Nature also has many different powers that are overlooked and people don't know about them.
(4) These can only be described as "FREAKY".
These four sentences (from National Geographic for kids [20]) were chosen because they represent fairly complex sign language discourse. They include one- and two-handed signs, different levels of hand shape complexity, a finger-spelled word (FREAKY), and a variety of non-manual markers. Camera angles and lighting conditions were kept identical for all animations. The animations were created and rendered in Maya 2014 using Mental Ray. Figure 4 shows three frames extracted from the animation featuring the realistic character, and three extracted from the animation featuring the stylized character.

Web Survey
The web survey consisted of an introductory screen with written and signed instructions and eight other screens (one for each animated clip), for a total of nine screens. Each of the eight animation screens included the animated clip, a text box in which the participant entered the signed sentence (if identified), a 5-point Likert scale rating question on perceived legibility (1 = high legibility; 5 = low legibility), and a 5-point Likert scale rating question on perceived avatar appeal (1 = high appeal; 5 = low appeal). The animated sequences were presented in random order and each animation was assigned a random number. Data collection was embedded in the survey; in other words, a program running in the background recorded all subjects' responses and stored them in an Excel spreadsheet. The web survey also included a demographics questionnaire with questions on subjects' age, hearing status and experience in ASL.
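The randomized presentation described above can be illustrated with a minimal sketch. It is not the survey's actual code, which the paper does not describe; the function names, the seed parameter, and the three-digit ID range are assumptions made for illustration only.

```python
import random

# Avatar names come from the paper (Tom = realistic, Jason = stylized);
# the integers 1-4 are placeholders for the four signed sentences.
AVATARS = ("Tom", "Jason")
SENTENCES = (1, 2, 3, 4)


def build_presentation_order(seed=None):
    """Return the eight (avatar, sentence) clips in shuffled order."""
    rng = random.Random(seed)  # a seed is used only to make a run reproducible
    clips = [(avatar, sentence) for avatar in AVATARS for sentence in SENTENCES]
    rng.shuffle(clips)
    return clips


def assign_random_ids(clips, seed=None):
    """Give each clip a distinct random number (hypothetical 3-digit range)."""
    rng = random.Random(seed)
    ids = rng.sample(range(100, 1000), k=len(clips))  # sampled without repeats
    return dict(zip(clips, ids))


if __name__ == "__main__":
    order = build_presentation_order(seed=47)
    labels = assign_random_ids(order, seed=47)
    for clip in order:
        print(labels[clip], clip)
```

Labeling clips with random numbers rather than with the avatar name keeps the condition from being visible in the clip identifier.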

Procedure
Subjects were sent an email containing a brief summary of the research and its objectives, an invitation to participate in the study, and the web address of the survey. Participants completed the online survey using their own computers, and the survey remained active for 2 weeks. It was structured in the following way: the animation clips were presented in randomized order and, for each clip, subjects were asked to (1) view the animation; (2) enter the English text corresponding to the signs in the text box, if recognized, or leave the text box blank, if not recognized; (3) rate the legibility of the animation; and (4) rate the avatar's appeal. At the end of the survey, participants were asked to fill out the demographics questionnaire.

Findings
For the analysis of the subjects' legibility ratings, a paired-samples t-test was used. With four pairs of animations for each of the 47 subjects, there were a total of 188 rating pairs. The mean of the ratings for animations featuring the realistic character was 2.19, and the mean of the ratings for animations featuring the stylized character was 2.21. Using the statistical software SPSS, a probability value of .068 was calculated. At an alpha level of .05, the null hypothesis could not be rejected, and our alternative hypothesis Ha(2) (i.e., stylization has an effect on the user's perceived legibility of the animated signs) was therefore not supported. There is no statistically significant difference between perceived legibility of realistic versus stylized signing avatars.
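As an illustration of the paired comparison described above, the following sketch computes the t statistic for paired ratings using only the Python standard library. It is not the SPSS procedure used in the study; the function name and the example ratings are hypothetical, and the p-value is left out because the standard library provides no t distribution.

```python
import math
import statistics

# 47 subjects x 4 sentence pairs, as reported in the study.
N_RATING_PAIRS = 47 * 4


def paired_t_statistic(ratings_a, ratings_b):
    """Return (t, degrees_of_freedom) for two paired lists of ratings."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical 5-point Likert ratings, NOT data from the study.
    realistic = [2, 3, 2, 1, 3, 2]
    stylized = [2, 2, 3, 1, 2, 2]
    t, df = paired_t_statistic(realistic, stylized)
    print(f"t = {t:.3f}, df = {df}")
```

The resulting t value would then be compared against a t distribution with n - 1 degrees of freedom to obtain the two-sided probability value.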
For the analysis of the subjects' ratings of perceived avatar appeal, a paired-samples t-test was used as well. The mean of the ratings for animations featuring the realistic character was 2.28, and the mean of the ratings for animations featuring the stylized character was 1.36. Using the statistical software SPSS, a probability value of .034 was calculated. At an alpha level of .05, the null hypothesis was rejected and our alternative hypothesis Ha(3) (i.e., stylization has an effect on the subject's interest in the character) was therefore accepted. There is a statistically significant difference between perceived appeal of realistic versus stylized signing avatars. Because lower ratings indicate higher appeal on the scale used, our study shows that subjects perceived the stylized character as more appealing than the realistic one.
For the analysis of the ability of the subjects to recognize the signed sentences, the McNemar test, a variation of the chi-square analysis, was used. Using SPSS once again, a probability value of .062 was calculated. At an alpha level of .05, a relationship between realistic and stylized characters and the subjects' ability to identify the animated signs could not be determined. Our hypothesis Ha(1) (i.e., the presence of stylization in a signing avatar's visual style affects the subjects' ability to recognize the animated signs presented to them) was therefore not supported. There was no statistically significant difference in sign recognition between the two avatars.
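For readers unfamiliar with the McNemar test mentioned above, the sketch below shows its exact (binomial) form, which depends only on the discordant pairs: sentences recognized with one avatar but not with the other. This is an illustrative sketch and not the SPSS computation used in the study; the function name and the example counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs recognized with the first avatar only
    c: pairs recognized with the second avatar only
    Under the null hypothesis each discordant pair is equally likely
    to fall in either cell, so min(b, c) follows Binomial(b + c, 0.5).
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # double the one-sided tail, capped at 1


if __name__ == "__main__":
    # Hypothetical discordant counts, NOT data from the study.
    p = mcnemar_exact(b=4, c=12)
    print(f"p = {p:.4f}", "significant" if p < 0.05 else "not significant")
```

Pairs in which a sentence was recognized with both avatars, or with neither, do not enter the calculation, which is why the test suits a within-subject design like this one.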

Discussion and Future Work
In this paper we have reported a study that explored whether a character's visual style has an effect on subjects' perception of signing avatars. Findings show that whereas sign recognition and perceived legibility of the signs are not affected by the visual style of the character, visual style has a significant effect on the perceived appeal of the avatar. Subjects found the stylized character more appealing than the realistic one. The lower appeal ratings of the realistic character might be due to the "Uncanny Valley" effect described by M. Mori [26]. Mori hypothesized that when animated characters (or robots) look and move almost, but not exactly, like natural beings, they cause a response of revulsion among some observers. For instance, in computer animation several movies have been described by reviewers as giving a feeling of revulsion or "creepiness" as a result of the animated characters looking too realistic. Realistic signing avatars might evoke the same response of rejection as they approach, but fail to attain completely, lifelike appearance and motions.
One limitation of the study was the relatively small sample size, and therefore the difficulty in generalizing the results. Because of the limited number of participants, we can only claim that stylized characters, which are easier to model and much less computationally expensive for real-time interaction, show promise of being effective and engaging signing avatars. To build stronger evidence, additional studies with larger pools of participants of different ages and cultural backgrounds, and in different settings, will be conducted in the future.
The overall goal of our research is to drastically improve the quality of sign language animation. Toward this goal, we will continue to conduct research studies to identify key modeling and rendering characteristics that make up an "ideal signing avatar", i.e., an animated character that is highly effective at conveying the signs and engaging to the viewers.

Figure 3. Clockwise: Rendering of realistic character's hand; polygonal mesh and skeleton of realistic character's hand; rendering of realistic character's face; rendering of stylized character's hand; polygonal mesh and skeleton of stylized character's hand; rendering of stylized character's face

EAI Endorsed Transactions on e-Learning, 04-06 2016, Volume 3, Issue 11, e1. EAI European Alliance for Innovation.

Figure 4. Three frames extracted from the animation featuring the realistic character (top), and three frames extracted from the animation featuring the stylized character (bottom)