Sound Source Localization for Robots: Based on the NAO Robot


NAO Robot: Voice Interaction and Software Design for a Humanoid Robot

HUNAN UNIVERSITY Graduation Thesis
Title: Voice Interaction and Software Design for a Humanoid Robot
Student: Chen Ming    Student ID: 201208070103    Class: Intelligent Science and Technology, Class 1
School: College of Information Science and Engineering    Supervisor: Li Renfa    Dean: Li Renfa
Date: June 1, 2016

Voice Interaction and Software Design for a Humanoid Robot

Abstract: This thesis describes speech recognition research carried out with the NAO robot, together with the robot's common interactive behaviors.

Speech recognition is a comprehensive technology spanning phonetics, acoustics, linguistics, signal processing, artificial intelligence and other disciplines, and its applications continue to widen.

As a standard robot platform, NAO is used in competitions, education, research and many other settings, so conducting research on it is in line with current research trends.

The first part of the thesis introduces basic methods and knowledge in speech recognition and briefly describes the structure and functions of the NAO robot.

In the theory part, Chapter 3 presents GMM-HMM, i.e., the Gaussian mixture model combined with the hidden Markov model.

Both models are applied to speech recognition in the experiments.

In the speech recognition experiments, the audio stream captured by the NAO robot is processed through audio track separation, filtering, framing, application of a Hamming window, speech feature extraction, and machine-learning training on sample audio streams.
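As a rough illustration of the framing and windowing steps just described, here is a minimal NumPy sketch; the 25 ms frame length and 10 ms hop are common choices, not values taken from the thesis:

import numpy as np

def frame_signal(x, fs, frame_ms=25, hop_ms=10):
    # Split a 1-D audio signal into overlapping frames and apply a
    # Hamming window, mirroring the pre-processing described above.
    frame_len = int(fs * frame_ms / 1000)      # e.g. 400 samples at 16 kHz
    hop = int(fs * hop_ms / 1000)
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    frames = np.stack([x[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    return frames * np.hamming(frame_len)      # ready for FFT / feature extraction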

After the necessary processing, the results are sent from the MATLAB client on the local computer to the server side of Choregraphe, the NAO robot's control software.

The robot then performs the corresponding behavior according to the returned recognition result.

Beyond speech recognition, the thesis also designs several behavior interaction functions for the NAO robot: directed motion, a multi-task parallel dance, and conversation on a given topic.

Directed motion lets the robot move through a specified angle and direction of rotation.

The multi-task parallel dance aggregates several tasks at the behavior layer: frame-by-frame motion design for the head, feet and arms; color changes of the LED groups; and dialogue on a given topic designed according to the QiChat Syntax section of the official Aldebaran Robotics documentation.
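The NAOqi API can launch such tasks in parallel through the `post` attribute of a module proxy. The following is a minimal sketch under stated assumptions: the IP address, joint values and timings are placeholders, not values from the thesis.

from naoqi import ALProxy

ROBOT_IP, PORT = "192.168.1.10", 9559          # hypothetical address
motion = ALProxy("ALMotion", ROBOT_IP, PORT)
leds = ALProxy("ALLeds", ROBOT_IP, PORT)
tts = ALProxy("ALTextToSpeech", ROBOT_IP, PORT)

# ".post" runs each call in its own thread, so the arm motion, the LED
# animation and the speech all execute at the same time, as in the dance.
task_id = motion.post.angleInterpolation(
    ["LShoulderPitch"], [[1.0, 0.0]], [[1.0, 2.0]], True)
leds.post.rasta(2.0)                            # 2-second LED color animation
tts.post.say("Let's dance!")
motion.wait(task_id, 0)                         # block until the arm motion finishes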

Overall, the project and the thesis comprise two main parts: speech recognition and behavior interaction design.

For speech recognition, the NAO robot captures the target audio stream and transfers it to the local computer via FTP for further processing.

Sound Source Localization for Robots Based on the NAO Robot

I. Introduction

1.1 Background
Localization is an important capability for a robot to intelligently perceive its environment.

Sound source localization is an important form of localization: by identifying and locating sound sources, a robot can better perceive and understand its surroundings and thus provide more intelligent services.

1.2 Purpose
This document describes methods and steps for performing sound source localization with the NAO robot, to help developers and researchers implement this capability in practical applications.

II. NAO Overview

2.1 Characteristics
NAO is a humanoid robot with a human-like body structure and a degree of interactive capability.

It has multimodal perception including speech, vision and touch, making it suitable for sound source localization tasks in diverse environments.

2.2 Hardware Requirements
A sound source localization task requires the following hardware:
- A NAO robot
- A microphone array
- A computer or server

III. Principles of Sound Source Localization

3.1 Localization Methods
Sound source localization can be achieved in several ways; common methods include:
- The Interaural Level Difference (ILD) method
- The Interaural Time Difference (ITD) method
- The acoustic beamforming method

3.2 How NAO Localizes Sound
NAO collects the surrounding sound with its microphone array and uses the level and time differences of the sound to judge the direction of the source.

Combined with localization computation and acoustic beamforming techniques, this enables accurate sound source localization.
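For reference, the ITD method admits a compact far-field model. The relation below is a standard textbook result rather than a formula given in this guide: for two microphones a distance d apart, sound speed c, and a source at azimuth \theta,

\Delta t = \frac{d\,\sin\theta}{c}
\qquad\Longrightarrow\qquad
\theta = \arcsin\!\left(\frac{c\,\Delta t}{d}\right)

so measuring the inter-microphone delay directly yields the bearing of the source.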

IV. Implementation Steps for NAO Sound Source Localization

4.1 Preparation
Before starting the localization task, complete the following:
- Set up the NAO development environment
- Connect the microphone array to NAO
- Install and configure a sound source localization library

4.2 Data Acquisition
Use NAO's microphone array to record the surrounding sound, logging the sampling times together with the corresponding audio signals.

4.3 Data Processing
Preprocess the captured audio, including noise removal and filtering.

Then compute the direction and distance of the source from cues such as the level and time differences of the sound.
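A common way to estimate that time difference is generalized cross-correlation with PHAT weighting. The sketch below is a generic NumPy implementation under stated assumptions (16 kHz sampling and a hypothetical 0.1 m microphone spacing), not code from this document:

import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    # Estimate the time delay between two microphone signals with the
    # GCC-PHAT method, a standard TDOA estimator.
    n = sig.size + ref.size
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(fs)                # delay in seconds

# Example: two mics 0.1 m apart, speed of sound c = 343 m/s
# tau = gcc_phat(mic_left, mic_right, fs=16000, max_tau=0.1 / 343)
# azimuth = np.degrees(np.arcsin(np.clip(343 * tau / 0.1, -1, 1)))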

4.4 Presenting the Localization Result
Report the localization result to the user, for example by voice output, on a connected computer's display, or in another form.

V. Appendix
Attachments referenced in this document:
1. NAO development environment installation guide
2. Microphone array connection and configuration guide
3. Sound source localization library installation and configuration guide

VI. Terms and Notes
1. NAO: a programmable humanoid robot developed by SoftBank Robotics (France), widely used in education, entertainment and service applications.

Research and Case Implementation of Multi-Channel Human-Robot Interaction Based on the NAO Robot

Authors: Nie Yanming, Lin Wuhang, Dong Peijie, Liang Hui, Zhang Zhigang, Zhong Menghao
Source: Digital Technology & Application, 2018, No. 04
Abstract: The NAO robot is a programmable humanoid robot that integrates cameras, microphones, touch, ultrasonic and infrared sensors, giving it the ability to comprehensively perceive and interact with its environment.

Based on the NAOqi API, the article implements face recognition and simple tactile perception, uses the iFLYTEK speech cloud and the Turing Robot service for speech recognition and understanding, and combines NAO's vision, hearing and touch to develop multi-channel human-robot interaction cases, laying a foundation for multi-channel interaction research on the NAO robot.

Keywords: human-computer interaction; NAO robot; NAOqi API; multi-channel
CLC numbers: R749.94; TP242    Document code: A    Article ID: 1007-9416(2018)04-0078-03

1 Introduction
Human-computer interaction is the technology that enables dialogue between humans and computers [1].

Multi-channel human-computer interaction has become an emerging interaction paradigm.

An interaction channel is a pathway through which humans and machines interact.

Multi-channel interaction builds on single-channel interaction by fusing vision, touch, hearing, speech and body movement, so that both sides receive timely feedback [2].

As a programmable humanoid robot, NAO (Figure 1) integrates cameras, microphones, touch, ultrasonic and infrared sensors, giving it comprehensive environmental perception and interaction capabilities.

It is widely used in fields such as autism therapy, whole-body motion, multi-agent systems, automation and signal processing [1][2].

2 NAO's Interaction Channels
2.1 Vision
NAO's head carries two cameras, each with a resolution of 640×480 (source [6] quotes an effective pixel count of 9.2 megapixels), and the system delivers 30 frames per second [6].

During human-robot interaction, the cameras capture the video stream for the corresponding processing [7].

2.2 Hearing
Four microphones are installed in the NAO robot; its hearing is implemented through the ALAudioRecorder module in NAOqi.
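A minimal sketch of recording with ALAudioRecorder over the legacy Python NAOqi SDK; the IP address, file path and channel flags are illustrative assumptions (check the NAOqi documentation for the exact channel ordering):

import time
from naoqi import ALProxy

recorder = ALProxy("ALAudioRecorder", "192.168.1.10", 9559)  # hypothetical IP

# Record five seconds from the microphones into a WAV file on the robot;
# the 4-tuple enables or disables individual microphone channels.
recorder.startMicrophonesRecording(
    "/home/nao/recordings/capture.wav", "wav", 16000, (1, 1, 1, 1))
time.sleep(5)
recorder.stopMicrophonesRecording()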

In autonomous-life mode, NAO can also localize sound sources on its own.

2.3 Touch
NAO provides 13 tactile sensors in total: three on the top of the head, three on each hand, and two on each foot.

Human-Robot Interaction Based on Machine Learning: an Application Case of the NAO Humanoid Robot

HUMAN ROBOT INTERACTIONS USING MACHINE LEARNING TECHNIQUES
Heni Ben Amor / David Vogt / Erik Berger, Institute of Computer Science, T.U. Bergakademie Freiberg, Freiberg (Germany)

The Institute of Computer Science at the Technical University Bergakademie Freiberg performs research on various aspects of humanoid and android robotics, virtual reality and computer networks. Research topics include the imitation of human behavior, physical human-robot interaction and the integration of physical robots in virtual reality environments. The institute runs Master's programs in Applied Computer Science and Network Computing in which several robotics-related lectures are taught. State-of-the-art research at the international level is ensured through collaborations with universities in Japan, e.g. the Intelligent Robotics Lab (Prof. Hiroshi Ishiguro) at the University of Osaka. We acquired the NAO robots in the beginning of 2011 for research on interaction learning methods. Currently, one Postdoc and two Master's students are working with NAO.

The objective of this research project is to capture the way humans interact with each other during conversations or during physical interaction, for example when a person hands a bottle over to a second person. Using machine learning techniques we extract information about the interaction that allows the NAO robot to engage in a similar interaction with a human being, e.g. hand over a bottle to a human partner. Imitation learning enables a robot to copy a human behavior seen through some camera or motion capture device. In this way, we think that robots will become more autonomous, and this will also reduce the need for programming or coding. We want to capture the interaction between two humans and would like to replicate this situation afterwards with a robot and a human. In this way, we can make the interaction capabilities of robots more natural and lifelike.

We are also developing simulation and optimization techniques, which allow us to evaluate a new behavior in a virtual environment before using it on the real robot. Using dimensionality reduction algorithms in conjunction with evolutionary algorithms, we are able to optimize each new behavior in the NaoSim simulator. For example, we are currently working on optimizing the walking gait for stability. To initialize this process (the so-called bootstrapping stage) we record a walking gait from a human demonstrator. The walking gait is then applied in simulation on the NAO robot and is then further optimized in the NaoSim simulator. Once a stable variant of the walking gait is found, we replay it using the real NAO robot.

At this point, we are trying to learn an interaction model from previously recorded motions. Then the idea is to find the robot posture that is suitable for a human posture depending on the current situation. Another focus of our research is to increase the vividness of a robot motion, or change the visual appeal. For that, we are using mostly Disney's principles of animation. For example, you can have the robot play back animations in an exaggerated style, or you can have it play back the animation with a different mood like sadness or happiness.

To visualize this process we use a CAVE virtual reality installation. The CAVE allows us to easily embed NAO in a wide variety of different environments. We can put the robot in this environment, try out different situations and see how the robot would respond to each situation. In our motion tracking approach, we are using the Microsoft Kinect camera to get a 3D depth picture of our scene. Based on this information, we detect up to two users and extract their body postures over time. The resulting motions and interactions get optimized for NAO's body configuration within simulation. Optimized motions can then be replayed on the NAO. This allows us to easily create new behaviors.

The NAO robot is an ideal platform for our research. It is probably the most sophisticated commercial humanoid robot available. NAO can perform difficult movements (standing on one leg, dancing, etc.), recognize objects and faces, and comes with high-quality speech synthesis algorithms. For our research it has proven crucial that NAO has an appealing design. The cute appearance, along with the speech synthesis capabilities, allows us to easily initiate conversations and human-robot interactions.

Although we acquired our NAO robots only five months ago, we have been very productive. This is mainly due to the great SDK and the "Choregraphe" software which are included. Using Choregraphe helps (especially students) to make the first steps with NAO and to learn about its capabilities. In our lab, programming is done using the NAO API in C++. Choregraphe and the NaoSim simulator are then used to assess the quality of a newly developed behavior. Additionally, we use a set of in-house and open-source tools for machine learning and motion capturing. This also includes middleware for interfacing the Microsoft Kinect camera.

So far we have developed the main components of our interaction learning method, such as the imitation of the user's movements. Our next goal is to evaluate the approach by performing a human-robot interaction study with a number of human test subjects. The results will be presented at the ICML 2011 Workshop on New Developments in Imitation Learning. We also hope to create a rich set of interactive behaviors for NAO, which would allow him to react in a natural way to different situations.

Robot Touch Sensor Functions: the NAO Humanoid Robot

Abstract
Natural and intuitive interaction is an important part of successful robotics. Most of this interaction is made possible by the robot's sensors: cameras for identifying people and objects, microphones for recognizing words and other sounds, etc. In this paper we will discuss some of the interaction modes that are specific to NAO: the tactile capacitative sensors on his head and hands, his chest button, and the bumpers on his feet.

Related work
Capacitative sensors have many uses, but one of their most common applications is in touch-screens such as those of smart-phones.

Principles
NAO's chest button has several functions. For starters, pressing it once turns NAO on; if the button is held for five seconds then NAO will shut down. When NAO is up and running, one press on the chest button will make him introduce himself and give his current IP address, while two consecutive pushes will stop his motor control loop. The "two-push" function should be used with caution and preferably while holding NAO, as otherwise removing stiffness may cause the robot to fall. One-push, two-push, and three-push can be overridden to trigger other events by using the ALSentinel module, for example if you want to personalize the way your robot says its IP address.

NAO's foot bumpers are simple on/off type buttons. They only have two states: pressed or not pressed. NAO also has several tactile sensors: three on his head, which cover three clearly marked zones (front, center, and back) and are surrounded by LEDs; and three more discreet tactile sensors on the back of each hand. These sensors work by detecting the changes in surface capacitance caused by contact with a conductor, such as a finger. All of these sensors can be programmed to trigger a variety of events.

KEY FEATURE: TACTILE SENSORS

Performances
NAO can not only determine which sensors have been activated, but also (in the case of capacitative sensors) the exact zone that has been touched, allowing for complex interactions and commands.

Limitations
● On/off sensors offer a more limited range of interaction.
● Capacitative sensors only react to the touch of conductive objects such as a hand or a finger (but not a pen, for example).

How does it work?
NAO's foot bumpers and the capacitative sensors on his head and hands can be programmed in two ways:
● In C++ or Python, by creating an event to detect changes to the sensor state, and using this to launch the desired action
● Or simply by using the dedicated boxes in Choregraphe.
NAO's reactions to presses of his chest button can be changed through the ALSentinel module only. This module is in charge of NAO's "vital functions", including checking battery levels and making sure NAOqi is running and motors aren't overheating.

FIGURE 1: Head sensors
FIGURE 2: Close-up of hand

How are people using it?
NAO's buttons and sensors are an important interface for interacting with humans and NAO's environment. Here are a few examples:
● NAO's foot bumpers can be used to detect collisions with low objects.
● If high levels of background noise make voice-recognition impractical, NAO's head sensors can be used to navigate in a menu by touching the back or front sensors to move through the menu, then the central panel to select an option.
● Using NAO's hand sensors, we can easily detect if someone takes one of NAO's hands. This can be used in order to lead the robot by the hand, or to enable NAO to react to assistance in getting up from a sitting position, for example.
As mentioned above, NAO can be programmed to trigger almost any kind of action or processing upon receiving a signal from one or more of these sensors; your imagination is the only limit!

FIGURE 3: Dedicated Choregraphe boxes for NAO's sensors
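As a sketch of the Python event route described above (the broker and module names are arbitrary placeholders; "FrontTactilTouched" is the NAOqi event name for the front head sensor, with Aldebaran's spelling):

import time
from naoqi import ALProxy, ALBroker, ALModule

# A broker is needed so the robot can call back into this process.
broker = ALBroker("touchBroker", "0.0.0.0", 0, "192.168.1.10", 9559)

class TouchReact(ALModule):
    def __init__(self, name):
        ALModule.__init__(self, name)
        self.tts = ALProxy("ALTextToSpeech")
        ALProxy("ALMemory").subscribeToEvent("FrontTactilTouched",
                                             name, "onTouched")

    def onTouched(self, event, value, _):
        if value == 1.0:                 # 1.0 = pressed, 0.0 = released
            self.tts.say("You touched my head")

react = TouchReact("react")              # global name must match module name
while True:
    time.sleep(1)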

Innovative Experimental Design Based on NAO Robot Visual Recognition

Authors: Wang Kexin, Wei Jiurong, Yang Jiarong, Li Cheng, Wang Chen, Shi Yu, Qiang Ning
Source: Science and Technology Innovation Herald, 2021, No. 19
Abstract: Relying on university artificial intelligence courses and addressing the important livelihood issue of garbage classification, this paper designs an innovative experiment on garbage recognition and target grasping based on the NAO robot.

On Choregraphe, NAO's experiment development platform, the video monitor is used to learn image data and build an image database, and the Vision Reco. instruction box is then used for visual recognition.

Based on robot kinematics theory, the grasping motion of NAO toward the target is designed, and NAOMarks are used to process positioning information so that NAO correctly locates the trash can.
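A rough sketch of reading a NAOMark with ALLandMarkDetection over the Python NAOqi SDK; the IP address and subscriber name are placeholders, and the exact layout of the "LandmarkDetected" value should be verified against the NAOqi documentation:

import time
from naoqi import ALProxy

ip = "192.168.1.10"                              # hypothetical robot address
landmark = ALProxy("ALLandMarkDetection", ip, 9559)
memory = ALProxy("ALMemory", ip, 9559)

landmark.subscribe("bin_locator", 500, 0.0)      # run detection every 500 ms
time.sleep(2)
data = memory.getData("LandmarkDetected")
if data and len(data) >= 2 and data[1]:
    # Assumed layout: data[1][i] = [shapeInfo, [markID]], where shapeInfo
    # holds angles (alpha, beta) locating the mark in the camera image.
    shape_info, extra_info = data[1][0][0], data[1][0][1]
    print("NAOMark %d at alpha=%.2f rad" % (extra_info[0], shape_info[1]))
landmark.unsubscribe("bin_locator")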

Teaching practice shows that students take a strong interest in this mode of instruction; it develops their hands-on ability and actively promotes the cultivation of innovative practical thinking.

Keywords: NAO robot; visual recognition; robot motion control; garbage classification
CLC number: TP391    Document code: A    Article ID: 1674-098X(2021)07(a)-0178-06

Innovative Experimental Design Based on NAO Robot Visual Recognition
WANG Kexin, WEI Jiurong, YANG Jiarong, LI Cheng, WANG Chen, SHI Yu, QIANG Ning* (School of Physics and Information Technology, Shaanxi Normal University, Xi'an, Shaanxi Province, 710062 China)
Abstract: Relying on the artificial intelligence course in colleges and universities, combined with the important livelihood problem of garbage classification, this paper designs an innovative experiment of waste identification and target capture based on the NAO robot. The experimental development platform Choregraphe, based on NAO, uses the video monitor to learn image data and establish an image database, and then uses the Vision Reco. instruction box for visual recognition. According to robot kinematics theory, the grasping action of NAO toward the target is designed, and NAOMarks are used to process the positioning information to ensure that NAO can correctly locate the trash can. Teaching practice shows that students have a high interest in this mode of teaching; it cultivates students' practical ability and positively promotes the cultivation of students' innovative practical thinking.
Key Words: NAO robot; Visual recognition; Robot motion control; Garbage classification

Humanoid robots have a human-like appearance, can adapt to human living and working environments, and can replace humans in various tasks; among them, NAO, developed by Aldebaran Robotics, is currently the most widely used humanoid robot in the world [1].

Introduction to the NAO Robot

Whole-Body Motion
NAO's motion model is based on generalized inverse kinematics. When NAO is asked to extend an arm, it bends its torso at the same time, because its arm and leg joints are all taken into account; NAO also stops moving in order to keep its balance.
Fall Manager
The Fall Manager protects the robot when it falls. Its main function is to detect whether the robot's center of mass (CoM) leaves the support polygon, which is determined from the positions of the feet in contact with the ground. When the Fall Manager detects that the robot is about to fall, all motion tasks are terminated, the arms move into a protective position appropriate to the situation, the center of mass is lowered, and motor stiffness is reduced to zero.
Audio
Sound Source Localization
Enabling the robot to interact with humans is one of the main goals of humanoid robotics. The sound source localization feature determines where a sound comes from. When a source near NAO emits a sound, NAO's four microphones receive the sound wave at slightly different times. For example, when someone speaks on NAO's left, the corresponding signal first reaches the microphone on the robot's left, a few milliseconds later the microphones on the forehead and the back of the head, and finally the microphone on the right. These time differences are called interaural time differences (ITD). From these time differences, the current position of the source can be obtained by mathematical computation.
Feature overview: omnidirectional walking, fall manager, sound source localization, ultrasonic sensing, vision, audio, tactile sensors, and program building.
The NAO robot has an endearing appearance, a degree of artificial intelligence and emotional awareness, and can interact with people in a friendly way. Like a real human infant, the robot also has a capacity for learning.

NAO can also infer changes in a person's emotions by learning body language and expressions, and over time it "gets to know" more people and can distinguish their different behaviors and faces.

The NAO Robot

Who is NAO? NAO is a 58 cm tall humanoid robot, small, rounded and instantly lovable! NAO aspires to become an ideal companion for the home.

It can walk, recognize people, listen to people speak, and even hold a conversation! Since its birth in 2006, NAO has kept improving and become ever more likeable; it can entertain people, understands and cares for humans more and more, and may one day become a true friend.

Aldebaran's main goal in developing NAO is to create a companion that accompanies people in their daily lives.

This little artificial sprite will make people's lives better.

It has outstanding interactive abilities and, while being lovable, delivers layer upon layer of surprise and delight.

NAO's journey: NAO has not yet entered the home, but it has already become a shining star in education.

In more than 70 countries, it has entered secondary school and university classrooms for information technology and science courses.

Many students learn programming with NAO in an entertaining, hands-on way.

They write programs that make NAO walk, grasp small objects, and even dance! NAO then won over a large community of developers.

In their eyes, NAO is a powerful, remarkably expressive platform for creating applications, one that turns many ideas into reality, opens up a new frontier of software development, and paves the way for future consumer robots.

NAO may appear in countless homes in the near future! NAO, coming to your family! Speak to NAO and it will respond! Whatever you ask, NAO will comply: for example, you can have NAO teach children the multiplication table, wake you in the morning, watch the house while you are out, or tell you the latest news.

These examples may seem unremarkable, but the special part is that you no longer need a keyboard, computer or mouse: just say the command and NAO responds.

A new member of the family: imagine NAO reading your mood and looking after you, recognizing your family members and calling them by name, knowing your favorite music, dishes or films. That is Aldebaran's current development goal: to make NAO a lively, fun and interactive close companion.

Its humanoid form is endearing; it moves nimbly, feels alive, and communicates and interacts with people, becoming a true new member of the family.

Sound Source Localization
Related Work
Sound source localization has long been studied, and a large number of solutions have been proposed. These methods rest on the same basic principles but are implemented differently and impose different CPU loads. Given NAO's CPU and memory capacity, NAO's sound source localization capability was developed on the basis of the Time Difference of Arrival approach, in order to generate robust and useful output.

Limitations
The quality of NAO's sound source localization depends on the clarity of the sound. Strong background noise naturally reduces the reliability of the module's output. The robot can detect and localize any "loud sound", but cannot by itself distinguish human voices from non-human sounds. Moreover, only one sound source can be localized at a time: if NAO faces several loud sounds simultaneously, the localization module may not operate accurately and will probably determine only the bearing of the loudest source.

Key Feature: Sound Source Localization

Abstract
One of the main uses of a humanoid robot is interaction with humans. This is undoubtedly a demanding research goal that requires many capabilities of the robot. Understanding what others say and answering correctly is certainly an important skill for the robot to master, but before completing such tasks the robot often first needs to be in the right position, so as to make full use of its sensors, and to turn its head in the relevant direction to show the person it is talking with that it is listening or speaking. To solve this problem, the NAO robot has a "sound source localization" capability: as long as the sound it hears is loud enough, it can determine the bearing of the source.

How It Works
The feature is offered to users as a NaoQi module named "ALAudioSourceLocalization", which provides a C++ and Python API (application programming interface) that guarantees precise interaction from Python scripts or other NaoQi modules. In addition, Choregraphe contains two boxes that let users easily add sound source localization to behaviors of their own design:
● The "Sound Loc." box provides the output of the sound localization module (angle and confidence) without making the robot take any further action.
● The "Sound Tracker" box uses these outputs to turn NAO's head toward the place the sound came from.

Research on Sound Source Target Localization Based on a Robot Auditory System
Comparative Analysis and Discussion of Results
Directions for improving the localization algorithm are discussed, such as enlarging the training data set and optimizing the model structure, together with the problems likely to be encountered in real application scenarios and ways to solve them.

Research Results and Contributions
- A sound source target localization method based on a robot auditory system is proposed, achieving high-accuracy, high-stability localization.
- Experiments verify that the method performs well under different environmental conditions.
- Compared with existing methods, it achieves higher localization accuracy and a lower error rate.

Contents: research background and significance; research status and trends at home and abroad; research content and methods; experiments and result analysis; research results and contributions; limitations and outlook; references.

Research Background and Significance
The background covers the rapid development of robotics, the importance of auditory systems, and the applications of sound source target localization.

Advancing Related Fields
The research can also advance related fields such as sound recognition and target tracking, contributing to progress in artificial intelligence.

Technical Innovations
- Machine learning algorithms classify and recognize sound signals, improving localization accuracy.
- Real-time acquisition and processing of sound signals improves the system's response speed.
- Advanced signal processing techniques effectively extract and identify sound signals.

Sound source target localization based on a robot auditory system has substantial practical value and can be used in robot autonomous navigation, intelligent monitoring, rescue and other fields. In intelligent monitoring, it enables real-time acquisition and processing of sound signals, improving a monitoring system's early-warning capability; the method is integrated into the robot's auditory system for practical testing and performance evaluation.

Technical Route
- Collect sound data: gather sound with acquisition equipment and preprocess the data.
- Algorithm design and implementation: design and implement the sound source target localization algorithm based on the theoretical analysis.
- Experimental validation: validate the algorithm on the experimental platform and analyze the results.

Sound Source Localization for Robots Based on the NAO Robot

Abstract
One of the main purposes of having a humanoid robot is to have it interact with people. This is undoubtedly a tough task that implies a fair amount of features. Being able to understand what is being said and to answer accordingly is certainly critical, but in many situations these tasks will require that the robot is first in the appropriate position to make the most out of its sensors, and to let the considered person know that the robot is actually listening/talking to him by orienting the head in the relevant direction. The "Sound Localization" feature addresses this issue by identifying the direction of any "loud enough" sound heard by NAO.

Related work
Sound source localization has long been investigated and a large number of approaches have been proposed. These methods are based on the same basic principles but perform differently and require varying CPU loads. To produce robust and useful outputs while meeting the CPU and memory requirements of our robot, NAO's sound source localization feature is based on an approach known as "Time Difference of Arrival".

Principles
The sound wave emitted by a source close to NAO is received at slightly different times on each of its four microphones. For example, if someone talks to the robot on his left side, the corresponding signal will first hit the left microphones, a few milliseconds later the front and the rear ones, and finally the signal will be sensed on the right microphone (FIGURE 1). These differences, known as ITD, standing for "interaural time differences", can then be mathematically related to the current location of the emitting source. By solving this equation every time a noise is heard, the robot is eventually able to retrieve the direction of the emitting source (azimuthal and elevation angles) from the ITDs measured on the 4 microphones.

FIGURE 1: Schematic view of the dependency between the position of the sound source (a human in this example) and the different distances that the sound wave needs to travel to reach NAO's four microphones. These different distances induce time differences of arrival that are measured and used to compute the current position of the source.

KEY FEATURE: SOUND SOURCE LOCALIZATION

Performances
The angles provided by NAO's sound source localization engine match the real position of the source with an average accuracy of 20 degrees, which is satisfactory in many practical situations. Note that the maximum theoretical accuracy depends on the microphones' spatial configuration and on the sample rate of the measured signal, and is about 10 degrees on NAO. The distance separating NAO from a successfully located sound source can reach several meters depending on the situation (reverberation, background noise, etc.). Once launched, this feature uses 10% of the CPU constantly, and up to 20% for a few milliseconds when the location of a sound is being computed.

Limitations
The performance of NAO's sound source localization is limited by how clearly the sound source can be heard with respect to background noise. Noisy environments naturally tend to decrease the reliability of the module outputs. It will also detect and locate any "loud sound" without being able by itself to filter out sound sources that are not human. Finally, only one sound source can be located at a time. The module can behave in a less reliable manner if NAO faces several loud noises at the same time. He will likely only output the direction of the loudest source.

How does it work?
This feature is available as a NaoQi module named "ALAudioSourceLocalization", which provides a C++ and Python API (application programming interface) that allows precise interactions from a Python script or a NaoQi module. Two boxes in Choregraphe are also available that allow an easy use of the feature inside a behavior:
● The box "Sound Loc." provides the output (angles and level of confidence) of the sound localization module without taking any further actions.
● The box "Sound Tracker" uses these outputs to make NAO's head turn in the appropriate direction.

FIGURE 2: An example of a behavior using the "Sound Tracker" box to orientate NAO's head so that the "Face Recognition" box can actually perform its recognition.

How are people using it?
Here are some possible applications (from the simplest to the more ambitious ones) that can be built from NAO's ability to locate sound sources.
- Using "Sound Source Localization" to have a person enter the camera field of view (as shown in the above example). This allows subsequent vision-based features to work on relevant images (images showing a person, for example). This is consequently of interest for these specific tasks:
● Human Detection, Tracking and Recognition
● Noisy Objects Detection, Tracking and Recognition
- "Sound Source Localization" can be used to strengthen the signal/noise ratio in a specific direction (this is known as Audio Source Separation) and can critically enhance subsequent audio-based algorithms such as:
● Speech Recognition in a specific direction
● Speaker Recognition in a specific direction
- These possible applications can also be mixed together, making NAO's sound source localization the basic block for sophisticated applications such as:
● Remote Monitoring / Security applications (NAO could track noises in an empty flat, take pictures and record sounds in relevant directions, etc.)
● Entertainment applications (by knowing who speaks and understanding what is being said, NAO could easily take part in a great variety of games with humans.)
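A minimal Python sketch of the ALAudioSourceLocalization API described above; the IP address and subscriber name are placeholders, and the assumed layout of the "SoundLocated" event value should be verified against the NAOqi documentation:

import time
from naoqi import ALProxy

ip = "192.168.1.10"                              # hypothetical robot address
loc = ALProxy("ALAudioSourceLocalization", ip, 9559)
memory = ALProxy("ALMemory", ip, 9559)

loc.subscribe("demo")                            # start the localization engine
try:
    while True:
        try:
            data = memory.getData("ALAudioSourceLocalization/SoundLocated")
        except RuntimeError:                     # key absent until a sound is located
            data = None
        if data:
            # Assumed layout: data[1] = [azimuth, elevation, confidence, energy]
            az, el, conf = data[1][0], data[1][1], data[1][2]
            print("Sound at az=%.2f rad, el=%.2f rad (confidence %.2f)"
                  % (az, el, conf))
        time.sleep(0.5)
finally:
    loc.unsubscribe("demo")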

Introduction to the French NAO Robot

NAO is a 57 cm tall programmable humanoid robot.

Its key components are as follows:
· A body with 25 degrees of freedom (DOF), whose key elements are motors and actuators.

· A set of sensors: 2 cameras, 4 microphones, 1 ultrasonic distance sensor, 2 infrared emitters and receivers, 1 inertial board, 9 tactile sensors and 8 pressure sensors.

· Devices for self-expression: a speech synthesizer, LEDs and 2 high-quality loudspeakers.

· A CPU (located in the head) running a Linux kernel and supporting Aldebaran's own proprietary middleware (NAOqi).

· A second CPU (located in the torso).

· A 55 Wh battery that, depending on usage, gives NAO 1.5 hours of autonomy or more.

Building applications for a robot is challenging: applications rest on a large number of advanced, complex technologies such as speech recognition, object recognition and map building.

Applications must be safe and reliable, and must run with limited resources in constrained environments.

The embedded software NAOqi includes a cross-platform distributed robotics framework, fast, secure and reliable, that gives developers a comprehensive foundation for enhancing and improving NAO's functions.

NAOqi makes each algorithm's API available to other algorithms.

Through this software, users can also choose to run modules on NAO or remotely on a computer.

Code can be developed under Windows, Mac or Linux and called from multiple languages, including C++, Python, Urbi and .Net.

Modules built on this framework provide rich APIs for interacting with NAO.
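As an illustration of calling these module APIs remotely over the Python SDK, a minimal sketch (the IP address is a placeholder):

from naoqi import ALProxy

ip = "192.168.1.10"                      # hypothetical robot address
tts = ALProxy("ALTextToSpeech", ip, 9559)
tts.say("Hello, I am NAO")               # the speech runs on the robot

motion = ALProxy("ALMotion", ip, 9559)
motion.wakeUp()                          # stiffen motors and stand up
motion.moveTo(0.2, 0.0, 0.0)             # walk 20 cm straight ahead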

NAOqi addresses the common needs of robotics development: parallelism, resources, synchronization and events.

As in other frameworks, NAOqi contains generic layers.

These generic layers are designed specifically for NAO.

Through NAOqi, different modules (such as motion, audio and video) communicate and coordinate with one another; homogeneous planning is possible, and information is shared with the ALMemory module.

Motion: Omnidirectional Walking
NAO's walk uses a simple dynamic model (the linear inverted pendulum, LIPM) together with quadratic programming.
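For context, the linear inverted pendulum model mentioned here has a standard textbook form (not taken from this text): with the center of mass held at a constant height z_c, the horizontal CoM position x(t) and the zero-moment point p_x(t) satisfy

\ddot{x}(t) = \frac{g}{z_c}\,\bigl(x(t) - p_x(t)\bigr)

and the quadratic program then chooses CoM/ZMP trajectories that track the planned footsteps while keeping the ZMP inside the support polygon.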

Application of the NAO Robot in AI Robotics Teaching at Higher Vocational Colleges

Author: Guo Yue
Source: Science Popular (Teacher Edition), 2019, No. 04
Abstract: Applying Choregraphe, the NAO robot's virtual simulation and graphical programming software, to AI robotics teaching in higher vocational education lets students get acquainted with AI robots, practice programming, use the instruction-box library and run 3D simulation demonstrations. It effectively connects teaching content with workplace skills, improves the safety and fun of teaching, and lays a good foundation for subsequent hands-on robot training.

Keywords: Choregraphe; graphical programming; AI robot; professional teaching
CLC number: TP249    Document code: A    Article ID: 1006-3315(2019)04-126-001

NAO, from Aldebaran Robotics, is currently the best-known humanoid robot in the world and the one most widely used in education and research.

As the official platform designated for the RoboCup, the NAO humanoid robot consists of 25 joints, each controlled by a motor and an electric actuator, and is equipped with cameras, infrared sensors, ultrasonic sensors and inertial devices; NAO interacts with people through its infrared sensors, wireless network, cameras, microphones and speakers.

Users can provide input through these devices.

Status output is conveyed through multiple LEDs and the speakers.

The NAO robot has a rich, powerful function library and can be programmed under Linux, Windows and Mac OS using C++, MATLAB, Python, or the graphical programming environment Choregraphe.

Many universities and research institutions worldwide already use the NAO robot in education and research.

In 2017, the School of Electronic Information at Xianyang Vocational Technical College introduced the NAO platform and applied it to extracurricular activities and skills competitions, aiming to develop students' innovation and research abilities.

Applying Choregraphe to AI robotics teaching satisfies the requirement of teaching AI robotics courses without physical training equipment: virtual simulation builds robot work scenarios and implements tasks such as robot path planning, helping students master AI robotics technology.

Localization Algorithms and Target Recognition Based on the NAO Robot
By Shu Jun, Li Zhikui, Yang Lu, Jiang Mingwei, Wu Ke (Electronic Technology)

Exploring unknown environments with the NAO robot is a comparatively frontier topic in modern research, and accurate target recognition and localization algorithms are among the key technologies for such exploration. Based on the NAO robot, this work analyzes humanoid robot [...] (text truncated in the source).

1 Localization Algorithm Research Based on the NAO Robot
A localization algorithm for the NAO robot must base each decision on the robot's current global position; it cannot rely entirely on an external panoramic camera to supply the relevant information. The NAO robot localizes itself by analyzing image information, currently most often with a particle filter: a Bayesian filtering algorithm that performs probability statistics over localization samples and completes the estimate by iterating the corresponding parameters. Because the filter can fail to converge, making the robot's position unobtainable, many researchers at home and abroad have substantially improved the method. The most critical part of Monte Carlo particle-filter localization is computing the probability of every sample point, with a formula of the form

w_k = w_{k-1} · q_{Xm}(L_{mx})    (2)

where w_{k-1} is the prior probability, L_{mx} the displacement and q_{Xm} the spatial probability; from this formula the NAO localization and recognition algorithm is obtained.

Reported results:

Metric                  Proposed recognition   Traditional method
Recognition rate (%)    95.5                   89.3
Time (ms)               12                     675

References
[1] Huang Chaomei, Yang Maying. Target recognition and localization for mobile robots based on information fusion [J]. Computer Measurement & Control, 2016, 24(11): 192-195.
[2] Li Hexi, Zhang Changliang. Target localization and grasping for the NAO robot based on monocular stereo vision [J]. Journal of Wuyi University (Natural Science Edition), 2017, 31(02): 20-26.
[3] Yang Jie. Red-ball recognition and view-center tracking strategy of the NAO robot [J]. The Guide of Science & Education (Electronic Edition), 2017, 24(16): 159-160.
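To make the Monte Carlo weight update of equation (2) concrete, here is a toy one-dimensional localization sketch; the noise levels and resampling threshold are illustrative assumptions, not values from the article:

import numpy as np

def mcl_update(particles, weights, motion, observation, obs_sigma=0.2):
    # Predict: move every particle by the commanded motion plus noise.
    particles = particles + motion + np.random.normal(0, 0.05, particles.shape)
    # Weight: multiply each prior weight by the observation likelihood,
    # i.e. w_k proportional to w_{k-1} * p(z_k | x_k).
    likelihood = np.exp(-0.5 * ((observation - particles) / obs_sigma) ** 2)
    weights = weights * likelihood
    weights /= weights.sum() + 1e-12
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = np.random.choice(len(particles), len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights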

Target Recognition and Localization Algorithms Based on the NAO Robot

Authors: Bai Xuefeng; Yang Bin
Journal: Journal of Chengdu University of Information Technology
Year (Volume), Issue: 2014, 29(06)
Abstract: Targeting the NAO platform, the most widely used new humanoid robot, and the target recognition algorithm published on its official site, an improved recognition method is proposed that reduces the influence of illumination on target recognition and shortens the reaction time of the closed-loop recognize-localize-track process. The search path uses a three-step strategy of searching in place, walking, and rotating to achieve a 360° full-field search; target color recognition is performed in the YUV color space; target tracking keeps the target at the center of the visual field; and, from the hardware parameters of the fourth-generation NAO robot, a formula relating the head joint's vertical deflection angle to the target distance is derived for fast target localization.
Pages: 5 (625-629)
Affiliation: School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
Language: Chinese
CLC number: TP249
Related literature:
1. Target recognition and self-localization of the NAO robot on the RoboCup standard platform [J], Xu Hengfei; Li Longshu
2. A target recognition method based on the NAO robot [J], Liang Fuxin; Liu Hongbin; Zhang Fulei; Chang Faliang
3. Target localization and grasping for the NAO robot based on monocular stereo vision [J], Li Hexi; Zhang Changliang
4. Localization algorithms and target recognition based on the NAO robot [J], Shu Jun; Li Zhikui; Yang Lu; Jiang Mingwei; Wu Ke
5. Red-ball recognition and localization algorithms based on the NAO robot [J], Chu Guangyao; Chen Zhong; Xu Jiani; Zheng Taotao; Tang Zhike

A Sound Source Localization System Based on a Mobile Robot

where Ins is an experimental threshold set for different echo environments [...]. Typically, after the noise is filtered out with the E-E algorithm, the generalized cross-correlation method can be used to obtain the time delay between the microphones in a noisy environment [...]. With s(t) as in equations (7)-(10), each microphone signal is modeled as the source signal plus noise,

s1(t) = x1(t) + n1(t)    (7)
s2(t) = x2(t) + n2(t)    (8)

and the delay is found from the cross-correlation

R12(τ) = ∫ s1(t) s2(t − τ) dt    (9)

Sharpening the peak of the correlation function (Figure 5) effectively suppresses the influence of noise interference [...]; the peak corresponds to the time delay.
The part of the system responsible for hearing exchanges information with the part responsible for vision; the system then analyzes the path conditions and determines the travel route.

Table 1: Test results [table body not recoverable from the source]

Figure 5: Generalized cross-correlation function curve
References (fragments):
[...] Japan, 1993, 11(6): 868-874.
Davenport W B. Probability and Random Processes [M]. New York: McGraw-Hill, 1970.
Song K, Chen J. Sound direction recognition using a con[...] (truncated)
the influence of echoes [4-...]. In this localization system, once the trigger point of the sound has been determined from the scanned pulse energy, the data can be processed with the EE (eliminating-echo) algorithm. Let s(t) (composed of sound and noise) be the signal collected by the microphone; the echo e(t) can then be computed from equation (3).

Δθ = 45° − arctan([...])    (1)

The distance from the horizontal rotation axis of the robot's head to the sound source is D, given by equation [...].

[CN109889723A] An Audio-Video Data Acquisition System Based on the NAO Robot (Patent)
(57) Abstract: The invention discloses an audio-video data acquisition system based on the NAO robot, which collects audio and video data by controlling the robot's binocular cameras and microphones. It comprises: an audio collection module, which controls the robot to collect audio data, selects the sound channel, and transfers the collected data to a PC for saving at a selectable path; a video collection module, which controls the robot to collect video data, displays the camera image in real time, and transfers the collected data to a PC for saving at a selectable path; an audio-video collection module, which controls the robot to collect audio and video simultaneously, selects the sound channel, displays the camera image in real time, and transfers the collected data to a PC for saving at a selectable path; an audio playback module, which plays back a selected recorded audio file for the user to review; and a video playback module, which plays back a selected recorded video file for the user to review.

(19) State Intellectual Property Office of the People's Republic of China
(12) Invention Patent Application
(21) Application No.: 201910088821.8
(22) Filing date: 2019-01-30
(71) Applicant: Tianjin University, No. 92 Weijin Road, Nankai District, Tianjin 300072
(72) Inventors: Luo Tao, Wei Yifan, Zhao Mankun, Gao Jie, Yu Jian, Yu Mei, Yu Ruiguo, Ying Xiang, Gao Zhongye

