2. Usability Testing and Serious Games
Usability is defined in ISO 9241-11 as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use”. This broad definition focuses on having products that allow users to achieve their goals and provides a basis for measuring the usability of different software products. However, digital games are a very specific type of software with unique requirements, and serious games have the additional objective of knowledge discovery through exploratory learning. This poses usability challenges that are specific to serious games.
In this section, we provide an overview of the main techniques for usability testing in general, and then we focus on the specific challenges posed by serious games.
2.1. Usability Testing Methods and Instruments. Usability represents an important yet often overlooked factor that impacts the use of every software product. While usability is often the intended goal when developing a software package, engineers tend to design following engineering criteria, often resulting in products whose functioning seems obvious to the developers, but not to general users, with correspondingly negative results.
There are a variety of methods typically used to assess usability. As described by Macleod and Rengger, these methods can be broadly catalogued as (i) expert methods, in which experienced evaluators identify potential pitfalls and usability issues, (ii) theoretical methods, in which theoretical models of tools and user behaviors are compared to predict usability issues, and (iii) user methods, in which software prototypes are given to end users to interact with.
Among user methods, two main approaches exist: observational analysis, in which a user interacts with the system while the developers observe, and survey-based methods, in which the user fills in evaluation questionnaires after interacting with the system. Such questionnaires may also be used when applying expert methods, and they are typically based on heuristic rules that can help identify potential issues.
There are a number of survey-based metrics and evaluation methodologies for usability testing. The most commonly cited is the System Usability Scale (SUS), because it is simple and relatively straightforward to apply. SUS administers a very short Likert-type questionnaire to users right after their interaction with the system, producing a “usability score” for the system. Another popular and well-supported tool, the Software Usability Measurement Inventory (SUMI), provides detailed evaluations by measuring usability across five different dimensions (efficiency, affect, helpfulness, control, and learnability). In turn, the Questionnaire for User Interaction Satisfaction (QUIS) deals in terms more closely related to the technology (such as system capabilities, screen factors, and learning factors), with attention to demographics for selecting appropriate audiences. Finally, the ISO/IEC 9126 standard is probably the most comprehensive instrument, as described in detail in the work of Jung and colleagues.
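As a concrete illustration, the following minimal Python sketch (the function name and sample responses are ours, purely for illustration) applies the standard SUS scoring rule: odd-numbered items contribute their response minus 1, even-numbered items contribute 5 minus their response, and the sum of contributions is scaled by 2.5 to yield a 0-100 score.

    def sus_score(responses):
        # Standard SUS scoring: ten 1-5 Likert responses.
        # Odd items (1st, 3rd, ...) are positively worded: contribution = response - 1.
        # Even items are negatively worded: contribution = 5 - response.
        if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
            raise ValueError("SUS expects ten responses in the range 1-5")
        contributions = [(r - 1) if i % 2 == 0 else (5 - r)
                         for i, r in enumerate(responses)]
        return 2.5 * sum(contributions)  # scale the 0-40 sum to 0-100

    print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0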
However, many of these metrics suffer from the same weakness: they can yield disparate results when reapplied to the same software package. In addition, it is very common for such questionnaires and methods to focus on producing a usability score for the system rather than on identifying and remediating specific usability issues. Surprisingly, the identification of remediation actions, as well as the prioritization of issues and actions, is often missing from studies and applications.
When the objective is to identify specific issues that may prevent end users from interacting successfully with the system, the most accurate approaches are observational user methods, as they provide direct examples of how end users will use (or struggle to use) the applications. However, observational analysis requires fully functioning prototypes and can involve large amounts of observational data that require processing and analysis. The experts may analyze the interaction directly during the session or, more commonly, rely on video recordings of the sessions to study the interaction. This has also drawn attention to the importance of having more than one expert review each interaction session. As discussed by Boring et al., a single reviewer watching an interaction session has a small likelihood of identifying the majority of usability issues. The likelihood of discovering usability issues may be increased by having more than one expert review each session, but this increased detection comes at the expense of time and human resources during the reviewing process.
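This trade-off can be made quantitative. Assuming each reviewer independently detects a given issue with probability lambda, the widely used model of Nielsen and Landauer estimates the probability that at least one of n reviewers finds it as 1 - (1 - lambda)^n. The short Python sketch below (the 0.30 detection rate is an assumed value for illustration) shows the diminishing returns of adding reviewers.

    def detection_probability(per_reviewer_rate, n_reviewers):
        # Probability that at least one of n independent reviewers
        # detects an issue, given a single-reviewer detection rate.
        return 1.0 - (1.0 - per_reviewer_rate) ** n_reviewers

    for n in (1, 2, 3, 5):
        print(n, round(detection_probability(0.30, n), 2))
    # 1 0.3, 2 0.51, 3 0.66, 5 0.83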
In summary, usability testing is a mature field, with multiple approaches and instruments that have been used in a variety of contexts. All the approaches are valid and useful, although they provide different types of outcomes. In particular, observational user methods seem to be the most relevant when the objective is to identify design issues that may interfere with the user’s experience, which is the focus of this work. However, these methods present issues in terms of costs and the subjectivity of the data collected.
2.2. Measuring Usability in Serious Games. In the last ten years, digital game-based learning has grown from a small niche into a respected branch of technology-enhanced learning. In addition, the next generation of educational technologies considers educational games (or serious games) as an instrument to be integrated into different formal and informal learning scenarios.
Different authors have discussed the great potential of serious games as learning tools. Games attract and maintain young students’ limited attention spans, and can provide meaningful learning experiences to both children and adults through engaging activities that afford deep learning.
However, as games gain acceptance as an effective educational resource, rigorous usability testing of game designs and UIs becomes increasingly necessary. While different research initiatives have looked into how to evaluate the learning effectiveness of these games, the usability of serious games has received less attention in the literature. Designing games for “regular” gamers is relatively straightforward, since games have their own language, UI conventions, and control schemes. However, serious games increasingly address broad audiences that include non-gamers, occasionally resulting in poor experiences because the target audience “does not get the game”.
Designing for broad audiences and ensuring that thorough usability analyses are performed can mitigate these poor experiences. In this context, a recent survey of evaluation techniques for game prototypes by Eladhari and Ollila acknowledges that using off-the-shelf HCI instruments is possible, but reports that these instruments should be adapted to the specific characteristics of games. Along these lines, some existing research efforts adapt heuristic evaluations (experts looking for specific problems) to the particular elements of commercial games. However, the usability metrics and instruments of observational methods are not always appropriate or reliable for games. Most usability metrics are designed for generic productivity tools and therefore focus on aspects such as productivity, effectiveness, and error counts. But games (whether serious or purely for entertainment) are entirely different, focusing more on the process than on the outcome and on enjoyment rather than productivity, and yielding varied results that hamper the consistency of comparisons.
Games engage users by posing actual challenges that require exploratory thinking, experimentation, and observation of the results. Ideally, this cycle is meant to keep users just one step beyond their current skill level (a compelling game), whereas a game that can be easily mastered and played without making mistakes is a boring one. Therefore, usability metrics that reward perfect, “mistake-free” performance (appropriate for productivity applications) will not be suitable for (fun) games. A similar effect can be observed with metrics that assess frustration. Games should be designed as “pleasantly frustrating” experiences that challenge users beyond their skill, forcing them to fail in order to make victory more satisfying. Indeed, the games that deliver this pleasantly frustrating sensation are among the most addictive and compelling. On the other hand, there are also games that frustrate players because of poorly designed UIs. In these cases, the user is still unable to complete the game’s objectives, but the failure is the result of a bad UI or a flawed game concept. Usability metrics for serious games should distinguish in-game frustration from frustration with the game, and consider that “obstacles to achievement” may be desirable while “obstacles to fun” are not.
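As a purely hypothetical sketch of how this distinction could be operationalized during observational analysis (all names and example events below are invented for illustration), reviewers could tag each logged issue as either an obstacle to achievement (intended challenge) or an obstacle to fun (a UI or design flaw):

    from dataclasses import dataclass
    from enum import Enum

    class ObstacleKind(Enum):
        ACHIEVEMENT = "obstacle to achievement"  # intended challenge; may be desirable
        FUN = "obstacle to fun"                  # UI or design flaw; should be fixed

    @dataclass
    class ObservedIssue:
        session_id: str
        timestamp_s: float
        description: str
        kind: ObstacleKind

    # The same failure can receive two very different diagnoses:
    issues = [
        ObservedIssue("s01", 312.0, "player loses the boss fight on the first attempt",
                      ObstacleKind.ACHIEVEMENT),
        ObservedIssue("s01", 340.5, "player cannot find the retry button",
                      ObstacleKind.FUN),
    ]
    for issue in issues:
        print(f"[{issue.kind.value}] at {issue.timestamp_s}s: {issue.description}")

Keeping the two categories separate in the observation log would let designers preserve desirable challenge while prioritizing UI fixes.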
Unfortunately, as game designers will admit, there is no specific formula for fun, and as teachers and educators will admit, triggering active learning is an elusive goal. The usability and effectiveness of productivity tools can be measured in terms of production, throughput, effectiveness, and efficiency. But aspects such as learning impact, engagement, or fun are far more subjective and difficult to measure.
This subjectivity also affects formal usability testing protocols when they are applied to games. As White and colleagues found, when different experts evaluated the same gameplay experience (the same test subject), the results were remarkably different, a problem they attributed to subjective perceptions of what “works” in a game.
In summary, evaluating the usability of games presents unique challenges, requiring metrics and methods designed to account for users’ varied and subjective interactions with games, as well as for their unique exploratory experiences, which should be pleasantly frustrating.