3D MOTION ESTIMATION OF HEAD AND SHOULDERS IN VIDEOPHONE SEQUENCES
Three-Degree-of-Freedom Vehicle Dynamics Model (English)
Three-Degree-of-Freedom Vehicle Dynamics Model

Vehicle dynamics is a crucial aspect of automotive engineering, dealing with the motion of vehicles under the influence of various forces and moments. Among the many dynamic models, the three-degree-of-freedom (3DOF) vehicle dynamics model stands out as a simplified yet effective representation for analyzing vehicle handling characteristics. This model captures the essential dynamics of a vehicle by considering motion in the lateral, longitudinal, and yaw directions.

Lateral Motion: The lateral motion of a vehicle refers to its movement perpendicular to the direction of travel. This motion is primarily influenced by factors such as tire-road interaction forces, steering inputs, and side winds. In the 3DOF model, the lateral motion is described by a lateral displacement variable, which represents the deviation of the vehicle from its straight-ahead path.

Longitudinal Motion: The longitudinal motion of a vehicle corresponds to its movement along the direction of travel. This motion is primarily influenced by factors such as engine torque, braking forces, and rolling resistance. In the 3DOF model, the longitudinal motion is described by a longitudinal velocity variable, which represents the speed of the vehicle along its path.

Yaw Motion: Yaw motion refers to the rotation of a vehicle around its vertical axis, which passes through the vehicle's center of gravity. This motion is influenced by moments generated by tire forces and steering inputs. In the 3DOF model, yaw motion is described by a yaw rate variable, which represents the rate of rotation of the vehicle around its vertical axis.

Model Equations: The 3DOF vehicle dynamics model is described by a set of ordinary differential equations. These equations represent the laws of motion in the lateral, longitudinal, and yaw directions and are typically derived using Newton's laws of motion and principles of moment balance. The lateral motion equation takes into account tire forces, steering inputs, and side winds. The longitudinal motion equation considers factors such as engine torque, braking forces, and rolling resistance. The yaw motion equation incorporates tire forces and steering moments to describe the vehicle's rotational dynamics. (One common written-out form of these equations is sketched at the end of this article.)

Applications: The 3DOF vehicle dynamics model finds applications in various areas of automotive engineering, including vehicle handling analysis, suspension design, and control system development. It can be used to simulate vehicle responses to different driving scenarios, such as cornering, braking, and acceleration. By analyzing the model's responses, engineers can assess vehicle handling characteristics, identify potential issues, and optimize vehicle design. Additionally, the model can be extended to include more complex dynamic effects, such as tire roll dynamics and vehicle rollover stability, to further enhance its predictive capabilities.

Conclusion: The three-degree-of-freedom vehicle dynamics model is a valuable tool for analyzing vehicle handling characteristics and understanding the dynamics of a vehicle under various driving conditions. Its simplicity and effectiveness make it a popular choice for automotive engineering applications, ranging from vehicle design and optimization to control system development. By leveraging this model, engineers can gain insights into vehicle dynamics, improve vehicle performance, and enhance overall safety.
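The article describes the three equations only qualitatively. For illustration, one common way to write a 3DOF (longitudinal-lateral-yaw) model is the planar form below with linear tire forces; the symbols (m, I_z, a, b, C_f, C_r, delta) are introduced here for the sketch and are not defined in the article itself.

$$
\begin{aligned}
m(\dot v_x - v_y r) &= F_x \\
m(\dot v_y + v_x r) &= F_{yf} + F_{yr} \\
I_z\,\dot r &= a\,F_{yf} - b\,F_{yr}
\end{aligned}
\qquad
F_{yf} = -C_f\!\left(\frac{v_y + a r}{v_x} - \delta\right),\quad
F_{yr} = -C_r\,\frac{v_y - b r}{v_x}
$$

Here $v_x$ and $v_y$ are the longitudinal and lateral velocities, $r$ the yaw rate, $\delta$ the front steering angle, $F_x$ the net longitudinal force, $m$ the vehicle mass, $I_z$ the yaw moment of inertia, $a$ and $b$ the distances from the center of gravity to the front and rear axles, and $C_f$, $C_r$ the front and rear cornering stiffnesses.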
3D Modeling Translated Reference Material: Fundamentals of Human Animation
Original text: Fundamentals of Human Animation (from Peter Ratner, 3D Human Modeling and Animation [M]. Wiley, 2003: 243-249)

If you are reading this part, then you have most likely finished building your human character, created textures for it, set up its skeleton, made morph targets for facial expressions, and arranged lights around the model. You have then arrived at perhaps the most exciting part of 3-D design, which is animating a character. Up to now the work has been somewhat creative, sometimes tedious, and often difficult. It is very gratifying when all your previous efforts start to pay off as you enliven your character. When animating, there is a creative flow that increases gradually over time. You are now at the phase where you become both the actor and the director of a movie or play. Although animation appears to be a more spontaneous act, it is nevertheless just as challenging, if not more so, than all the previous steps that led up to it. Your animations will look pitiful if you do not understand some basic fundamentals and principles. The following pointers are meant to give you some direction. Feel free to experiment with them. Bend and break the rules whenever you think it will improve the animation.

SOME ANIMATION POINTERS

1. Try isolating parts. Sometimes this is referred to as animating in stages. Rather than trying to move every part of a body at the same time, concentrate on specific areas. Only one section of the body is moved for the duration of the animation. Then, returning to the beginning of the timeline, another section is animated. By successively returning to the beginning and animating a different part each time, the entire process is less confusing.

2. Put in some lag time. Different parts of the body should not start and stop at the same time. When an arm swings, the lower arm should follow a few frames after that. The hand swings after the lower arm. It is like a chain reaction that works its way through the entire length of the limb.

3. Nothing ever comes to a total stop. In life, only machines appear to come to a dead stop. Muscles, tendons, force, and gravity all affect the movement of a human. You can prove this to yourself. Try punching the air with a full extension. Notice that your fist has a bounce at the end. If a part comes to a stop, such as a motion hold, keyframe it once and then again after three to eight or more keyframes. Your motion graph will then have a curve between the two identical keyframes. This will make the part appear to bounce rather than come to a dead stop.

4. Add facial expressions and finger movements. Your digital human should exhibit signs of life by blinking and breathing. A blink will normally occur every 60 seconds. A typical blink might be as follows:

Frame 60: Both eyes are open.
Frame 61: The right eye closes halfway.
Frame 62: The right eye closes all the way and the left eye closes halfway.
Frame 63: The right eye opens halfway and the left eye closes all the way.
Frame 64: The right eye opens all the way and the left eye opens halfway.
Frame 65: The left eye opens all the way.

Closing the eyes at slightly different times makes the blink less mechanical (a small scripted version of this schedule is shown at the end of this excerpt). Changing facial expressions could be just using eye movements to indicate thoughts running through your model's head. The hands will appear stiff if you do not add finger movements. Too many students are too lazy to take the time to add facial and hand movements. If you make the extra effort for these details you will find that your animations become much more interesting.

5.
What is not seen by the camera is unimportant.If an arm goes through a leg but is not seenin the camera view, then do not bother to fix it. Ifyou want a hand to appear close to the body andthe camera view makes it seem to be close eventhough it is not, then why move it any closer? This also applies to sets. There is no need to buildan entire house if all the action takes place in theliving room. Consider painting backdrops ratherthan modeling every part of a scene.6. Use a minimum amount of keyframes. Toomany keyframes can make the character appearto move in spastic motions. Sharp, cartoonlikemovements are created with closely spacedkeyframes. Floaty or soft, languid motions arethe result of widely spaced keyframes. Ananimationwill often be a mixture of both. Try tolook for ways that will abbreviate the motions.You can retain the essential elements of an animationwhile reducing the amount of keyframesnecessary to create a gesture.7.Anchor a part of the body. Unless your characteris in the air, it should have some part of itselflocked to the ground. This could be a foot, ahand, or both. Whichever portion is on theground should be held in the same spot for anumber of frames. This prevents unwanted slidingmotions. When the model shifts its weight,the foot that touches down becomes locked inplace. This is especially true with walkingmotions.There are a number of ways to lock parts of amodel to the ground. One method is to useinverse kinematics. The goal object, which couldbe a null, automatically locks a foot or hand tothe bottom surface. Another method is to manuallykeyframe the part that needs to be motionlessin the same spot. The character or its limbs willhave to be moved and rotated, so that foot orhand stays in the same place. If you are using forwardkinematics, then this could mean keyframingpractically every frame until it is time tounlock that foot or hand.8.A character should exhibit weight. One of themost challenging tasks in 3-D animation is tohave a digital actor appear to have weight andmass. You can use several techniques to achievethis. Squash and stretch, or weight and recoil,one of the 12 principles of animation discussedin Chapter 12, is an excellent way to give yourcharacter weight.By adding a little bounce to your human, heor she will appear to respond to the force of gravity.For example, if your character jumps up andlands, lift the body up a little after it makes contact.For a heavy character, you can do this severaltimes and have it decrease over time. Thiswill make it seem as if the force of the contactcauses the body to vibrate a little.Secondary actions, another one of the 12principles of animation discussed in Chapter 12,are an important way to show the effects of gravityand mass. Using the previous example of ajumping character, when he or she lands, thebelly could bounce up and down, the arms couldhave some spring to them, the head could tilt forward,and so on.Moving or vibrating the object that comes incontact with the traveling entity is anothermethod for showing the force of mass and gravity.A floor could vibrate or a chair that a personsits in respond to the weight by the seat goingdown and recovering back up a little. Sometimesan animator will shake the camera to indicate theeffects of a force.It is important to take into consideration thesize and weight of a character. Heavy objectssuch as an elephant will spend more time on theground, while a light character like a rabbit willspend more time in the air. 
The hopping rabbithardly shows the effects of gravity and mass.9. Take the time to act out the action. So often, itis too easy to just sit at the computer and trytosolve all the problems of animating a human. Putsome life into the performance by getting up andacting out the motions. This will make the character'sactions more unique and also solve manytiming and positioning problems. The best animatorsare also excellent actors. A mirror is anindispensable tool for the animator. Videotapingyourself can also be a great help.10. Decide whether to use IK, FK, or a blend ofboth. Forward kinematics and inversekinematicshave their advantages and disadvantages. FKallows full control over the motions of differentbody parts. A bone can be rotated and moved to theexact degree and location one desires. The disadvantageto using FK is that when your person hasto interact within an environment,simple movementsbecome difficult. Anchoring a foot to theground so it does not move ischallenging becausewhenever you move the body, the feet slide. Ahand resting on a desk has the same problem.IK moves the skeleton with goal objects suchas a null. Using IK, the task of anchoring feet andhands becomes very simple. The disadvantage toIK is that a great amount of control is packedtogether into the goal objects. Certain posesbecome very difficult to achieve.If the upper body does not require any interactionwith its environment, then consider ablend of both IK and FK. IK can be set up for thelower half of the body to anchor the feet to theground, while FK on the upper body allowsgreater freedom and precision of movements.Every situation involves a different e your judgment to decide which setup fits theanimation most reliably.11.Add dialogue. It has been said that more than90% of student animations that are submitted tocompanies lack dialogue. The few that incorporatespeech in their animations make their workhighly noticeable. If the animation and dialogueare well done, then those few have a greateradvantage than their competition. Companiesunderstand that it takes extra effort and skill to create animation with dialogue.When you plan your story, think about creatinginteraction between characters not only on aphysical level but through dialogue as well.There are several techniques, discussed in thischapter, that can be used to make dialogue manageable.12. Use the graph editor to clean up your animations.The graph editor is a useful tool that all3-D animators should become familiar with. It isbasically a representation of all the objects,lights, and cameras in your scene. It keeps trackof all their activities and properties.A good use of the graph editor is to clean upmorph targets after animating facial expressions.If the default incoming curve in your graph editoris set to arcs rather than straight lines, youwill most likely find that sometimes splines inthe graph editor will curve below a value of zero.This can yield some unpredictable results. Thefacial morph targets begin to take on negativevalues that lead to undesirable facial expressions.Whenever you see a curve bend below a value ofzero, select the first keyframe point to the right ofthe arc and set its curve to linear. A more detaileddiscussion of the graph editor will be found in alater part of this chapter.ANIMATING IN STAGESAll the various components that can be moved on ahuman model often become confusing if you try tochange them at the same time. 
The performancequickly deteriorates into a mechanicalroutine if youtry to alter all these parts at the same keyframes.Remember, you are trying to create human qualities,not robotic ones.Isolating areas to be moved means that you canlook for the parts of the body that have motion overtime and concentrate on just a few of those. For example,the first thing you can move is the body and legs.When you are done moving them around over theentire timeline, then try rotating thespine. You mightdo this by moving individual spine bones or using aninverse kinematics chain. Now that you have the bodymoving around andbending, concentrate on the arms.If you are not using an IK chain to move the arms,hands, andfingers, then rotate the bones for the upperand lower arm. Do not forget the wrist. Fingermovementscan be animated as one of the last parts. Facialexpressions can also be animated last.Example movies showing the same character animatedin stages can be viewed on the CD-ROM asCD11-1 AnimationStagesMovies. Some sample imagesfrom the animations can also be seen in Figure 11-1.The first movie shows movement only in the body andlegs. During the second stage, the spine and headwere animated. The third time, the arms were moved.Finally, in the fourth and final stage, facial expressionsand finger movements were added.Animating in successive passes should simplifythe process. Some final stages would be used tocleanup or edit the animation.Sometimes the animation switches from one partof the bodyleading to another. For example, somewhereduring the middle of an animation the upperbody begins to lead the lower one. In a case like this,you would then switch from animating the lower bodyfirst to moving the upper part before the lower one.The order in which one animates can be a matterof personal choice. Some people may prefer to dofacial animation first or perhaps they like to move thearms before anything else. Following is a summary ofhow someone might animate a human.1. First pass: Move the body and legs.2. Second pass: Move or rotate the spinal bones, neck, and head.3. Third pass: Move or rotate the arms and hands.4. Fourth pass: Animate the fingers.5. Fifth pass: Animate the eyes blinking.6. Sixth pass: Animate eye movements.7. Seventh pass: Animate the mouth, eyebrows,nose, jaw, and cheeks <you can break these upinto separate passes>.Most movement starts at the hips. Athletes oftenbegin with a windup action in the pelvic area thatworks its way outward to the extreme parts of thebody. This whiplike activity can even beobserved injust about any mundane act. It is interesting to notethat people who study martial arts learn that most oftheir power comes from the lower torso.Students are often too lazy to make finger movementsa part of their animation. There are several methodsthat can make the process less time consuming.One way is to create morph targets of the fingerpositions and then use shape shifting to move the variousdigits. Each finger is positioned in an open andfistlike closed posture. For example, the sections ofthe index finger are closed, while the others are left inan open, relaxed position for one morph target. Thenext morph target would have only the ring fingerclosed while keeping theothers open. During the animation,sliders are then used to open and close the fingersand/or thumbs. Another method to create finger movements is toanimate them in both closed and open positions andthen save the motion files for each digit. 
Anytime youanimate the same character, you can load the motionsinto your new scene file. It then becomes a simpleprocess of selecting either the closed or the open positionfor each finger and thumb and keyframing themwherever you desire. DIALOGUEKnowing how to make your humans talk is a crucialpart of character animation. Once you adddialogue,you should notice a livelier performance and a greaterpersonality in your character. At first, dialogue mayseem too great a challenge to attempt. Actually, if youfollow some simple rules, you will find that addingspeech to your animations is not as daunting a taskas one would think. The following suggestions shouldhelp.DIALOGUE ESSENTIALS1. Look in the mirror. Before animating, use amirror or a reflective surface such as that on a CDto follow lip movements and facial expressions.2. The eyes, mouth, and brows change the most.The parts of the face that contain the greatestamount of muscle groups are the eyes, brows,and mouth. Therefore, these are the areas thatchange the most when creating expressions.3. The head constantly moves during dialogue.Animate random head movements, no matterhow small, during the entire animation. Involuntarymotions of the head make a point withouthaving to state it outright. For example, noddingand shaking the head communicate, respectively,positive and negative responses. Leaning thehead forward can show anger, while a downwardmovement communicates sadness. Move thehead to accentuate and emphasize certain statements.Listen to the words that are stressed andadd extra head movements to them.4. Communicate emotions. There are six recognizableuniversal emotions: sadness, anger, joy,fear, disgust, and surprise. Other, more ambiguousstates are pain, sleepiness, passion, physicalexertion, shyness, embarrassment, worry, disdain,sternness, skepticism, laughter, yelling,vanity, impatience, and awe.5. Use phonemes and visemes. Phonemes are theindividual sounds we hear in speech. Rather thantrying to spell out a word, recreate the word as aphoneme. For example, the word computer isphonetically spelled "cumpewtrr." Visemes arethe mouth shapes and tongue positionsemployedduring speech. It helps tremendously to draw achart that recreates speech as phonemes combinedwith mouth shapes <visemes> above orbelow a timeline with the frames marked and thesound and volume indicated.6. Never animate behind the dialogue. It is betterto make the mouth shapes one or two framesbefore the dialogue.7. Don't overstate. Realistic facial movements arefairly limited. The mouth does not open thatmuch when talking.8. Blinking is always a part of facial animation.Itoccurs about every two seconds. Differentemotional states affect the rate of blinking. Nervousnessincreases the rate of blinking, whileanger decreases it.9. Move the eyes. To make the character appear tobe alive, be sure to add eye motions. About 80%of the time is spent watching the eyes and mouth,while about 20% is focused on the hands andbody.10. Breathing should be a part of facial animation.Opening the mouth and moving the headback slightly will show an intake of air, whileflaring the nostrils and having the head nod forwarda little can show exhalation. 
Breathing movements should be very subtle and hardly noticeable.
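Pointer 4 above lays out a frame-by-frame blink schedule. Purely as an illustration of turning that schedule into keyframe data, here is a small Python sketch; the morph-target weight convention (1.0 = fully open, 0.0 = fully closed) and the output format are assumptions, not part of the text.

```python
def blink_keyframes(start=60, eye_open=1.0, eye_closed=0.0):
    """Generate keyframes for the blink schedule in pointer 4: the right eye
    leads the left by one frame so the blink looks less mechanical."""
    half = 0.5 * (eye_open + eye_closed)
    right = [eye_open, half, eye_closed, half, eye_open, eye_open]
    left = [eye_open, eye_open, half, eye_closed, half, eye_open]
    return [{"frame": start + i, "right_eye": r, "left_eye": l}
            for i, (r, l) in enumerate(zip(right, left))]

# Example: keyframes for a blink beginning at frame 60, as in the schedule above.
for key in blink_keyframes():
    print(key)
```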
Survey of 3D Hand Pose Estimation Methods Using Depth Images
小型微型计算机系统Journal of Chinese C o m p u t e r Systems 2021年6月第6期 V o l.42 N o.6 2021深度图像中的3D手势姿态估计方法综述王丽萍、汪成\邱飞岳u,章国道1U浙江工业大学计算机科学与技术学院,杭州310023)2(浙江工业大学教育科学与技术学院,杭州310023)E-mail :690589058@ qq. c o m摘要:3D手势姿态估计是计算机视觉领域一个重要的研究方向,在虚拟现实、增强现实、人机交互、手语理解等领域中具有 重要的研究意义和广泛的应用前景_深度学习技术已经广泛应用于3D手势姿态估计任务并取得了重要研究成果,其中深度图 像具有的深度信息可以很好地表示手势纹理特征,深度图像已成为手势姿态估计任务重要数据源.本文首先全面阐述了手势姿 态估计发展历程、常用数据集、数据集标记方式和评价指标;接着根据深度图像的不同展现形式,将基于深度图像的数据驱动手 势姿态估计方法分为基于简单2D深度图像、基于3D体素数据和基于3D点云数据,并对每类方法的代表性算法进行了概括与 总结;最后对手势姿态估计未来发展进行了展望.关键词:3D手势姿态估计;深度学习;深度图像;虚拟现实;人机交互中图分类号:T P391 文献标识码:A文章编号:1000-1220(2021)06-1227■(»Survey of 3D Hand Pose Estimation Methods Using Depth MapW A N G Li-ping' ,W A N G C h e n g1 ,Q I U Fei-yue1'2,Z H A N G G u o-d a o11 (College of Computer Science and Technology .Zhejiang University of Technology .Hangzhou 310023 ’China)2(College of Education Science and Technology.Zhejiang University of Technology,Hangzhou 310023,China)Abstract:3D han d pose estimation is an important research direction in the field of computer vision .which has essencial research significance and wide application prospects in the fields of virtual reality,a u g m ented reality,h u m a n-c o m p u t e r interaction and sign language understanding. D e e p learning has been widely used in 3D h and pose estimation tasks and has achieved considerable results. A-m o n g t h e m,the depth information contained in the depth image can well represent the texture characteristics of the h and poses,and the depth image has b e c o m e an important data source for han d pose estimation tasks. Firstly,development history,b e n c h m a r k data sets, marking methods and evaluation metrics of hand pose estimation were introduced. After that,according to the different presentation forms of depth maps,the data-driven hand pose estimation methods based on depth images are divided into simple 2D depth m a p based m e t h o d s,3D voxel data based methods and 3D point cloud data based m e t h ods,and w e further analyzed and su m m a r i z e d the representative algorithms of them. A t the en d of this paper,we discussed the development trend of hand pose estimation in the future.K e y w o r d s:3D hand pose estimation;deep learning;depth m a p;virtual reality;human-c o m p u t e r interactioni引言手势姿态估计是指从输人的图像或者视频中精确定位手 部关节点位置,并根据关节点之间的位置关系去推断出相应 的手势姿态.近年来,随着深度学习技术的发展,卷积神经网 络(Convolution Neural N e t w o r k s,C N N)'1-推动了计算机视觉 领域的快速发展,作为计算机视觉领域的一个重要分支,手势 姿态估计技术引起了研究者广泛关注.随着深度学习技术的快速发展和图像采集硬件设备的提 升,基于传统机器学习的手势姿态估计模型逐渐被基于深度 学习的估计模型所取代,国内外众多研究机构相继开展了针 对该领域的学习研究,有效推动了手势姿态估计技术的发展. 
手势姿态估计大赛“H a n d s 2017”[2]和“Ha n ds2019”[3]吸引了国内外众多研究者们参与,综合分析该项赛事参与者提出的 解决方案,虽然不同的方法在计算性能和手势姿态估计精度 上各有差异,但所有参赛者都是使用深度学习技术来解决手 势姿态估计问题,基于深度学习的手势姿态估计已经成为该 领域主流发展趋势.除此之外,潜在的市场需求也是促进手势姿态技术快速 发展的原因之一.手势姿态估计可广泛应用于虚拟现实和增 强现实中,手势作为虚拟现实技术中最重要的交互方式之一, 可以为用户带来更好的沉浸式体验;手势姿态估计还可以应 用于手势识别、机器人抓取、智能手机手势交互、智能穿戴等 场景.由此可见,手势姿态估计技术将给人类的生活方式带来 极大的改变,手势姿态估计技术已成为计算机视觉领域中重 点研究课题,对手势姿态估计的进一步研究具有非常重要的收稿日期:2020-丨1-27收修改稿日期:2021~01-14基金项目:浙江省重点研发计划基金项目(2018C01080)资助.作者简介:王丽萍,女,1964年生,博士,教授,博士生导师,C C F会员,研究方向为计算智能、决策优化,计算机视觉等;汪成,男,1996年生,硕士研究生,研究方向为 计算机视觉、人机交互、虚拟现实;邱飞岳,男,1%5年生,博士,教授,博士生导师,C C F会员,研究方向为智能教育、智能计算、虚拟现实;章国道,男.1988年生,博士研究生,C C F会员,研究方向为计算机视觉、人机交互、过程挖掘.1228小型微型计算机系统2021 年意义.手势姿态估计技术发展至今已取得大量研究成果,有关 手势姿态估计的研究文献也相继由国内外研究者提出.Erol 等人[41第一次对手势姿态估计做了详细的综述,对2007年之 前的手势姿态估计方法进行了分析比较,涉及到手势的建模、面临的问题挑战、各方法的优缺点,并且对未来的研究方向进 行了展望,但该文献所比较的33种方法都是使用传统机器学 习方法实现手势姿态估计,其中只有4种方法使用了深度图 像来作为数据源,且没有讲述数据集、评价标准、深度图像、深 度学习等现如今手势姿态估计主流研究话题;S u p a n c i c等 人[5]以相同的评价指标对13种手势姿态估计方法进行了详 细的对比,强调了数据集的重要性并创建了一个新的数据集;E m a d161对2016年前基于深度图像的手势姿态估计方法做了 综述,该文献也指出具有标记的数据集对基于深度学习的手 势姿态估计的重要性;从2016年-2020年,手势姿态估计技术 日新月异,基于深度学习的手势姿态估计方法相继被提出,Li 等人[7]对手势姿态估计图像采集设备、方法模型、数据集的 创建与标记以及评价指标进行综述,重点指出了不同的图像 采集设备之间的差异对手势姿态估计结果的影响.除了以上 4篇文献,文献[8-12]也对手势姿态估计的某一方面进行了 总结概要,如文献[8]重点讲述了手势姿态估计数据集创建 及标记方法,作者提出半自动标记方法,并创建出了新的手势 姿态估计数据集;文献[9]提出了 3项手势姿态估计挑战任 务;文献[10]对2017年之前的数据集进行了评估对比,指出 了以往数据集的不足之处,创建了数据量大、标记精度髙、手 势更为丰富的数据集“Bighand 2. 2M”;文献[11 ]对2017手 势姿态估计大赛排名前11的方法进行的综述比较,指出了 2017年前髙水准的手势姿态估计技术研究现状,并对未来手 势姿态估计的发展做出了展望.以上所提到的文献是迄今为止手势姿态估计领域较为全 面的研究综述,但这些文献存在一些共同的不足:1)没有讲 述手势姿态估计发展历程;2)对手势姿态估计方法分类不详 细;3)对手势姿态估计种类说明不够明确;4)没有涉及最新 提出的新方法,如基于点云数据和体素数据方法.针对以上存 在的问题,本文在查阅了大量手势姿态估计相关文献基础上,对手势姿态估计方法与研究现状进行了分类、梳理和总结后 得出此文,旨在提供一份更为全面、详细的手势姿态估计研究 综述.本文结构如下:本文第2节介绍相关工作,包括手势姿态估计发展历程、手势姿态估计任务、手势建模、手势姿态估计分类和方法类型;第3节介绍手势姿态估计常用数据集、数据集标记方式和 手势姿态估计方法评价指标;第4节对基于深度图像的手势 姿态估计方法进行详细分类与总结;第5节总结本文内容并 展望了手势姿态估计未来的发展趋势.2相关工作2.1手势姿态估计发展历程手势姿态估计技术的发展经历了 3个时期:基于辅助设 备的手势姿态估计、基于传统机器学习的手势姿态估计和基于深度学习的手势姿态估计,如图1所示.图1手势姿态估计发展历程图Fig.1D ev el op m e nt history of hand pose estimation1) 基于辅助设备的手势姿态估计.该阶段也称为非视觉 手势姿态估计时期,利用硬件传感器设备直接获取手部关节点位置信息.其中较为经典解决方案为Dexvaele等人[13i提出的数据手套方法,使用者穿戴上装有传感器设备的数据手套,通过手套中的传感器直接获取手部关节点的坐标位置,然后根据关节点的空间位置,做出相应的手势姿态估计;W a n g等人[M]使用颜色手套来进行手势姿态估计,使用者穿戴上特制颜色手套来捕获手部关节的运动信息,利用最近颜色相邻法找出颜色手套中每种颜色所在的位置,从而定位手部关节肢体坐标位置.基于辅助设备的手势姿态估计具有一定优点,如具有良好的鲁棒性和稳定性,且不会受到光照、背景、遮挡物等环境因素影响,但昂贵的设备价格、繁琐的操作步骤、频繁的维护校准过程、不自然的处理方式导致基于辅助设备的手势姿态估计技术在实际应用中并没有得到很好地发展[15].2) 基于传统机器学习的手势姿态估计该阶段也称为基于计算机视觉的手势姿态估计时期,利用手部图像解决手势姿态估计问题.在深度学习技术出现之前,研究者主要使用传统机器学习进行手势姿态估计相关的工作,在这一阶段传统机器学习主要关注对图像的特征提取,包括颜色、纹理、方向、轮廓等.经典的特征提取算子有主成分分析(PrincipalC o m p o n e n t A n a l y s i s,P C A)、局部二值模式(Local Binary Patterns ,L B P)、线性判别分析( Linear Discriminant Analysis ,L D A)、基于尺度不变的特征(Scale Invariant Feature Transform, S I FT) 和方向梯度直方图 (Histogram of Oriented Gradi-e n t,H O G)等.获得了稳定的手部特征后,再使用传统的机器学习算法进行分类和回归,常用的方法有决策树、随机森林和支持向量机等.3) 基于深度学习的手势姿态估计.随着深度学习技术的 发展,卷积神经网络大大颠覆了传统的计算机视觉领域,基于深度学习的手势姿态估计方法应运而生.文献[21 ]以深度图像作为输人数据源,通过卷积神经网络预测输出手部关节点的三维坐标;文献[22]利用深度图的二维和三维特性,提出了一种简单有效的3D手势姿态估计,将姿态参数分解为关节点二维热图、三维热图和三维方向矢量场,通过卷积神经网络进行多任务的端到端训练,以像素局部投票机制进行3D图2 21关节点手部模型图F ig . 2 21 joints hand model2.3手势姿态估计分类本小节我们将对目前基于深度学习的手势姿态估计种类 进行说明.从不同的角度以不同的分类策略,可将手势姿态估 计分为以下几种类型:2.3.1 2D /3D 手势姿态估计根据输出关节点所处空间的维度,可将手势姿态估计分 为2D 手势姿态估计和3D 手势姿态估计.2D 手势姿态估计指的是在2D 图像平面上显示关节点 位置,关节点的坐标空间为平面U ,y ),如图3所示;3D 手势 姿态估计指的是在3D 空间里显示关节点位置,关节点的坐 标空间为(x ,y ,z ),如图4所示.图3 2D 手势姿态估计图 图4 3D 手势姿态估计图Fig . 3 2D hand poseF ig . 
4 3D hand poseestim ationestim ation在手势姿态估计的领域中,相较于2D 手势姿态估计,针 对3D 手势姿态估计的研究数量更多,造成这一现象的主要手势姿态估计;文献[23]将体素化后的3D 数据作为3D C N N 网络的输人,预测输出生成的体素模型中每个体素网格是关 节点的可能性;文献[24]首次提出使用点云数据来解决手势 姿态估计问题,该方法首先利用深度相机参数将深度图像转 化为点云数据,再将标准化的点云数据输人到点云特征提取 神经网络提取手部点云数据特征,进而回归出手部关节 点位置坐标.将深度学习技术引人到手势姿态估计任务中,无 论是在预测精度上,还是在处理速度上,基于深度学习手势姿 态估计方法都比传统手势姿态估计方法具有明显的优势,基 于深度神经网络的手势姿态估计已然成为了主流研究趋势. 2.2手势建模手势姿态估计的任务是从给定的手部图像中提取出一组 预定义的手部关节点位置,目标关节点的选择一般是通过参 考真实手部关节点而设定的.根据建模方式的不同,关节点的 个数往往也不同,常见的手部模型关节点个数为14、16、21 等.在手势姿态估计领域,手部模型关节点的个数并没有一个 统一的标准,在大多数手势姿态估计相关的论文和手势姿态 估计常用数据集中,往往采用21关节点的手部模型, 如图2所示.原因为2D 手势姿态估计的应用范围小,基于2D 手势姿态估 计的实际应用价值不大[7],而3D 手势姿态估计可以广泛应 用于虚拟现实、增强现实、人机交互、机器人等领域,吸引了众 多大型公司、研究机构和研究人员致力于3D 手势姿态估计 的研究[29%.由此可见,基于深度图像的3D 手势姿态估计已经成为 手势姿态估计领域主流研究趋势,本文也是围绕深度图像、深 度学习、3D 手势姿态估计这3个方面进行总结叙述.2.3.2R G B/Depth /R G B -D根据输入数据类型的不同,可将手势姿态估计分为:基于R GB 图像的手势姿态估计、基于深度图像的手势姿态估计、基于R G B -D (R G B图像+ D e p t h m a p )图像的手势姿态估计;其中,根据深度图像不同展现形式,将基于深度图像的手势姿 态估计进一步划分为:基于简单2D 深度图像、基于3D 体素 数据、基于3D 点云数据,如图5所示.基于不同数据形式 的手势姿 雜计方m m基于Dqptii Map 深®图 像的手势 姿态估计:@iSDq)th Map深度图多视角深度图 Multi View 体素Volume Voxel点云Point Cloud2D Data3DCNNs基于RGB-D r Dqith Map |图像的手势姿态估计RGB 图人手分割图5手势姿态估计方法分类图F ig . 5 Classification o f hand pose estim ation m ethods2.4方法类型文献[4]根据不同的建模途径和策略,将手势姿态估计 方法划分为模型驱动方法(生成式方法)[31~ ,和数据驱动方 法(判别式方法).研究者结合了模型驱动和数据驱动两种方法的特点,提出混合式方法[3541];在本小节我们将对这3种 手势姿态估计方法类型进行简要概述.2.4.1模型驱动模型驱动方法需要大量的手势模型作为手势姿态估计的 基础.该方法实现的过程为:首先,创建大量符合运动学原理 即合理的手势模型,根据输人的深度图像,选择一个最匹配当 前深度图像的手势模型,提出一个度量模板模型与输入模型 的差异的代价函数,通过最小化代价函数,找到最接近的手势 模型.2.4.2数据驱动数据驱动方法需要大量的手势图像数据作为手势姿态估 计的基础.数据驱动方法所使用的图像数据可以是R G B 图像、深度图像或者是R G B -D 图像中的任意一种或者多种类型 图像相结合.以深度图像为例,基于数据驱动的手势姿态估计 方法可以通过投喂特定标记的手势数据来训练,建立从观察 值到有标记手势离散集之间的直接映射.在这个过程中,根据 手势关节点结果值计算方式的不同,可以将基于数据驱动的Hand PointNet SHPR-Net SO-HandNet Cascade PointNet3D Data基于RGB 图像的 手棘 纖十王丽萍等:深度图像中的3D 手势姿态估计方法综述12291230小型微型计算机系统2021 年手势姿态估计方法进一步分为基于检测和基于回归的方法.2.4.3 混合驱动模型驱动和数据驱动各有优势,模型驱动是基于固定手势模型,手势姿态识别率高;数据驱动基于神经网络,不需要固定手势模型,且对不确定手势和遮挡手势的鲁棒性髙.研究者们结合了两种方法的特点,提出混合式方法解决手势姿态估计问题.常见的混合式手势姿态估计方式有两种:1)先使用模型驱动预估一个手势结果,若预估失败或者预估的结果与手势模型相差较大,则使用数据驱动进行手势姿态估计,在这种方法中,数据驱动只是作为一种备选方案当且仅在模型驱动失败的情况下使用;2)先使用数据驱动预测出一个初始的手势姿势结果,再使用模型驱动对预测的初始手势结果进行优化.3数据集和评价指标数据集对有监督深度学习任务十分重要,对手势姿态估计而言,规模大、标记精度髙、适用性强的手势姿态数据集不仅能提供准确的性能测试和方法评估,还能推进手势姿态估计研究领域的发展.目前常见3D手势姿态估计数据集有:B ig Ha nd2. 
2M[I0),N Y U[42).Dexter l[43i,M S R A14[441,IC V L[451,M S R A15 w,H a n d N e t[47】,M S R C[48],等,其中 I C V L、N Y U 和M S R A15是使用最为广泛的手势姿态估计数据集,常用手势姿态估计数据集相关信息如表1所示.表1手势姿态估计数据集Table 1H a n d pose estimation datasets数据集发布时间图像数量类别数关节数标记方式视角图像尺寸I A S T A R20138703020自动3320 x240 Dexter 12013213715手动2320 x240M S R A1420142400621手动3320x240I C V L2014176041016半自动3320 x240N Y U201481009236半自动3640 x480M S R A15201576375921半自动3640 x480M S R C2015102000122合成3512 x424 HandNet2015212928106自动3320x240 BigHand2.2M 2017 2.2M1021自动3640 x 480F H A D2018105459621半自动1640 x4803.1数据集标记方法Y u a n等人指出创建大规模精准数据集的关键因素是快速、准确的标记方式.常用手势姿态数据集标记方式有四 种:手动标记、半自动标记、自动标记和合成数据标记.手动标 记方法因其耗时耗力且存在标记错误情况,导致使用人工手 动标记的手势数据集规模小,不适合用于基于大规模数据驱 动的手势姿态估计方法;半自动标记方法有两种形式,一种是 先使用人工手动标记2D关节信息,再使用算法自动推断3D 关节信息;另一种是先使用算法自动推断出3D关节信息,再 使用人工手动对标记的3D关节信息进行修正,与全手动标 记方法相比,半自动标记方法具有高效性,适用于创建数据规 模大的数据集.合成数据标记方法指的是使用图形图像应用程序,先基于先验手势模型生成仿真手势图像数据,同时自动标记3D关节信息;与手动标记和半自动标记方法相比,合成数据标记方法无需手工介人,有效提高了数据标记效率,适合于大规模数据集的创建;但不足的是,合成的仿真数据无法全面有效地反映真实手势姿态,合成手势数据集中存在手势扭曲、反关节、关节丢失等不符合运动学规律的手势情形,导致丢失真实手势特征.自动标记方法指的在采集手部图像时,使用外部传感器设备对手势关节进行标记.文献[49]的A S T A R数据集使用带有传感器数据手套对手部关节进行标记;B i g H a n d2.2M数据集采用具有6D磁传感器的图像采集标记系统进行自动标记.3.2评价指标3D手势姿态估计方法的评价指标主要包括:1) 平均误差:在测试集图像中,所有预测关节点的平均 误差距离;以21个手势关节点模型为例,会生成21个单关节点平均误差评测值,对21个单关节点平均误差求均值,得到整个测试集的平均误差.2)良好帧占比率:在一个测试图像帧中,若最差关节点 的误差值在设定的阈值范围内,则认为该测试帧为良好帧,测试集中所有的良好帧之和占测试集总帧数的比例,称为良好帧占比率.其中,第1个评价指标反映的是单个关节点预测精准度,平均误差越小,则说明关节定位精准度越高;第2个评价指标反映的是整个测试集测试结果的好坏,在一定的阈值范围内,单个关节的错误定位将造成其他关节点定位无效,该评价指标可以更加严格反映手势姿态估计方法的好坏.4基于深度图像手势姿态估计方法深度图像具有良好的空间纹理信息,其深度值仅与手部表面到相机的实际距离相关,对手部阴影、光照、遮挡等影响因素具有较高的鲁棒性.基于深度学习和深度图像的手势姿态估计方法属于数据驱动,通过训练大量的数据来学习一个能表示从输人的深度图像到手部关节点坐标位置的映射关系,并依据映射关系预测出每个关节点的概率热图或者直接回归出手部关节点的二维或者三维坐标.在本节中,将深度图像在不同数据形式下的3D手势姿态估计方法分为:1) 直接将深度图像作为简单2D图像,使用2D C N N s进 行3D手势姿态估计.2)将深度图像转换成3D体素数据,使用3D C N N s进行 3D手势姿态估计.3)将深度图像转换成3D点云数据,使用点云特征提取 网络提取手部点云数据特征,从而实现手部关节点定位.4.1基于简单2D深度图像早期C. 
X u等人[50]提出使用随机森林传统机器学习方法直接从手部深度图像中回归出手势关节角度,随着深度学习技术的提出,卷积神经网络在计算机视觉任务中取得了巨大成就,与传统机器学习方法相比具有较大的优势.表2详细列举了基于简单2D深度图像手势姿态估计代表性算法相关信息.其中,受文献[51]启发,T o m p s o n%首次6期王丽萍等:深度图像中的3D 手势姿态估计方法综述1231提出将卷积神经网络应用于手势姿态估计任务中,他们使用 卷积神经网络生成能代表深度图像中手部关节二维概率分布 的热图,先从每幅热图中分别定位出每个关节点的2D 平面 位置,再使用基于模型的逆运动学原理从预估的2D 平面关 节和其对应的深度值估计出关节点三维空间位置.由于手势 复杂多样和手指之间具有高相似性,导致了从热图中预估出 的2D 关节点与真实关节点位置之间可能存在偏差,且当手 部存在遮挡时,深度值并不能很好地表示关节点在三维空间 中的深度信息.针对文献[42]中所存在的问题,G e 等人[52]提 出将手部深度图像投影到多个视图上,并从多个视图的热图 中恢复出手部关节点的三维空间位置,他们使用多视图 C N N s 同时为手部深度图像前视图、侧视图和俯视图生成热 图,从而更精准地定位手关节的三维空间位置.表2基于简单2D 深度图手势姿态估计代表性算法对比 Table2 Com parison of representative algorithmsforhandpose estimation based on2D depth m a p分类算法名称提出时间算法特点平均误差(nun)m j I C V L M S R A 15首次应用C N N ,关ConvNet[42]2014节点二维热图,逆^r e n[55]于简 DeepPrior 单2D Multi-深 V i e w -C N N [52] 度 图 像[54]D e n s e R e g 22]P o s e -R E N [56]J G R -P 20[59]运动学模型.区域集成网络,检2017测关节点三维13.39 7.63 •位置.20178.10 9.50网络.关节点二维热图,2018 多视图 C N N 定位 12.50 - 9.70关节点三维位置.逐像素估计,关节2018 点二维、三维热图,10.20 7.30 7.20单位矢量场.謂迭倾测关节点三u 81 6 79 8 65维位置.漏8 讀 755积网络.O b e r w e g e r 等人使用卷积神经网络直接输出手部关节点三维空间位置,他们认为网络结构对3D 手势姿态估结果 很重要,使用了 4种不同C N N 架构同时预测所有的关节点位 置,通过实验对比得出多尺寸方法对手部关节点位置回归效果更好,同时他们在网络中加入3D 手势姿态先验信息预测 手部关节点位置,并使用了基于C N N 架构的关节点优化网络 对每一个预测的关键点进行更加精准的位置输出;除此之外, 为了进一步提升3D 手势姿态估计的准确性,他们在文献 [21]基础上提出使用迭代优化的方法多次修正手部关节点 位置,对DeepPrior[53]进行改进,提出DeepPrior + + [54]方法, 通过平移、旋转、缩放等方法增强手势姿态估计训练集数据, 以获得更多的可利用信息,并在手势特征提取网络中加人了 残差模块以进一步提升了 3D 手势姿态估计精度.G u o等人[55]提出基于区域集成的卷积神经网络架构 R E N .R E N将卷积层的特征图分成多个局部空间块,并在全连接层将局部特征整合在一起,与之前基于2D 热图、逆运动学约束和反馈回路的手势姿态估计方法相比,R E N 基于单一 网络的方法直接检测出手部关节的三维位置,极大提高了手势姿态估计的性能.然而,R E N 使用统一的网格来提取局部 特征区域,对所有特征都进行同等的处理,这并不能充分获得 特征图的空间信息和具有高度代表性的手势特性.针对该问 题,C h e n 等人[56]提出P o s e -R E N 网络进一步提高手势姿态估 计性能,他们基于R E N 网络预测的手势姿态,将预测的初始 手部姿态和卷积神经网络特征图结合,以提取更优、更具代表 性的手部姿态估计特征,然后根据手部关节拓扑结构,利用树 状的全连接对提取的特征区域进行层次集成,P o s e -R E N 网络 直接回归手势姿态的精准估计,并使用迭代级联方法得到最 终的手势姿态.W a n 等人[22]提出一种密集的逐像素估计的方法,该方法 使用了沙漏网络Hourglass Network-571生成关节点2D 热图和3D热图以及三维单位矢量场,并由此推断出三维手部关节的 位置;他们在文献[58]提出自监督方法,从深度图像中估计3D手势姿态,与以往基于数据驱动的手势姿态估计方法不同的是,他们使用41个球体近似表示手部表面,使用自动标记 的合成手势数据训练神经网络模型,用无标记的真实手势数 据对模型进行了微调,并在网络中采用多视图监督方法以减 轻手部自遮挡对手势姿态估计精度的影响.4.2基于3D 体素数据2D C N N提取的深度图像特征由于缺乏3D 空间信息,不适合直接进行3D 手势姿态估计.将深度图像的3D 体素表示作为3D C N N 的输人,从输入的3D 体素数据中提取关节点 特征,可以更好地捕获手的3D 空间结构并准确地回归手部 关节点3D 手势姿态[60].基于3D 体素数据手势姿态估计流 程如图6所示.基于检测图6基于体素数据手势姿态估计流程图 Fig. 6W o r k f l o w ofhandposeestimationbased o nvoxeldata表3详细列举了基于3D 体素数据手势姿态估计代表性 算法相关信息,其中,G e 等人在文献[61 ]中首次提出使用3DC N N s解决3D 手势姿态估计问题,他们先使用D -T S D F [62]将局部手部图像转换成3D 体素数据表现形式,设计了一个具 有3个三维卷积层、3个三维全连接层的3D 卷积神经网络架 构,用于提取手部体素数据三维特征,并基于提取的三维特征 回归出最终手部关节点三维空间位置;在文献[52]基础上,G e等人[63]提出利用完整手部表面作为从深度图像中计算手势姿态的中间监督,进一步提升了 3D 手势姿态估计精度.M o o n等人[23]指出直接使用深度图像作为2D CN N的输入进行3D 手势姿态估计存在两个严重缺点:缺点1是2D 深 度图像存在透视失真的情况,缺点2是深度图和3D 坐标之 间的高度非线性映射,这种高度非线性映射会直接影响到手 部关节点位置的精准回归.为解决这些问题,他们提出将从深 度图像中进行3D 手势姿态估计的问题,转化为体素到体素。
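Section 4 of the survey separates methods by input form: the raw 2D depth map, a voxelized 3D volume, and a 3D point cloud. The survey gives no code; the following minimal NumPy sketch shows the standard conversions those families rely on, assuming a pinhole camera with intrinsics fx, fy, cx, cy (values not given in the text) and depth in meters with 0 marking missing pixels.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters, 0 = missing) into an (N, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    valid = z > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)

def voxelize(points, grid=32):
    """Binary occupancy grid over the point cloud's bounding cube, the kind of
    input consumed by 3D-CNN-based methods."""
    mins, maxs = points.min(0), points.max(0)
    scale = (maxs - mins).max() + 1e-6
    idx = np.clip(((points - mins) / scale * (grid - 1)).astype(int), 0, grid - 1)
    vox = np.zeros((grid, grid, grid), dtype=np.float32)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vox
```

A PointNet-style method from the families listed above would consume the (N, 3) array directly, while a 3D-CNN method would consume the occupancy grid.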
3D head tracking and pose-robust 2D texture map-based face recognition using a simple ellipsoid model
first type of these approaches is often called multi-view face recognition [1], [6]. Multi-view face recognition is a simple extension of frontal face recognition. It treats the whole face image under a certain pose as one vector in a high-dimensional vector space. And the training is done using multi-view face images and a test image is assumed to be matched to one of the existing head poses. Generally, multi-view based approaches should have view-specific classifiers. Therefore, the training and recognition processes are even more time consuming. The second type of approaches is face recognition across pose [3], [4]. It uses a canonical frontal view for face recognition. This method needs a face alignment process to generate a novel frontal view image. Therefore, various well-known frontal face recognition methods can be easily applied to this type of approaches. In this paper, we adopt the latter approach. If we can register face images into frontal views, the recognition task would be much easier. To align a face image into a canonical frontal view, we need to know the pose information of a human head. Therefore, in this paper, we propose a novel method for modeling a human head as a simple 3D ellipsoid. Also, we present 3D head tracking and pose estimation methods using the proposed ellipsoid model. After recovering full motion of the head, we can register face images with pose variations into stabilized view images which are suitable for frontal face recognition. In other words, both training and test face images are back projected to the surface of a 3D ellipsoid according to their poses computed by 3D motion estimation and registered into stabilized view images. By doing so, simple and efficient frontal face recognition can be carried out in the stabilized texture map space instead of the original input image space. To evaluate the feasibility of the proposed approach using a simple ellipsoid model, 3D head tracking experiments are carried out on 45 image sequences with ground truth from Boston University, and several face recognition experiments are conducted on our laboratory database and the Yale Face Database B by using subspace-based face recognition methods such as PCA, PCA+LDA, and DCV [2], [5], [7]. The rest of the paper is organized as follows. In Section 2, we first introduce how to model a human head as a simple 3D ellipsoid. We apply this novel model to 3D head tracking and motion estimation in Section 3. Section 4 describes how to generate a stabilized texture map from an input face image by using the proposed 3D head model and estimated pose information. In Section 5, we show various experimental results to verify the feasibility of our proposed approach. Finally, we conclude the paper in
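The paper builds a stabilized (pose-normalized) texture map by back-projecting each face image onto the ellipsoid surface using the estimated 3D pose. The exact parameterization is not reproduced in this excerpt, so the sketch below only illustrates the idea: sample the input image at the camera projections of ellipsoid surface points indexed by longitude and latitude. The intrinsic matrix K, the pose (R, t), and the semi-axis values are placeholders, and visibility/occlusion checks are omitted.

```python
import numpy as np
import cv2

def stabilized_texture_map(image, K, R, t, radii=(0.09, 0.12, 0.10), size=(128, 128)):
    """Sample `image` at projections of ellipsoid surface points to obtain a
    pose-normalized texture map. radii = ellipsoid semi-axes in meters (illustrative),
    (R, t) = estimated head pose, K = 3x3 camera intrinsic matrix."""
    H, W = size
    lon = np.linspace(-np.pi / 2, np.pi / 2, W)   # left-right around the face
    lat = np.linspace(-np.pi / 2, np.pi / 2, H)   # chin to forehead
    lon, lat = np.meshgrid(lon, lat)
    a, b, c = radii
    # Ellipsoid surface points in head coordinates (z axis toward the camera at frontal pose).
    pts = np.stack([a * np.cos(lat) * np.sin(lon),
                    b * np.sin(lat),
                    c * np.cos(lat) * np.cos(lon)], axis=-1).reshape(-1, 3)
    cam = pts @ R.T + t          # head coordinates -> camera coordinates
    uvw = cam @ K.T              # project with the intrinsics
    u = (uvw[:, 0] / uvw[:, 2]).reshape(H, W).astype(np.float32)
    v = (uvw[:, 1] / uvw[:, 2]).reshape(H, W).astype(np.float32)
    return cv2.remap(image, u, v, interpolation=cv2.INTER_LINEAR)
```

Because both training and test images are resampled this way, recognition can then be run in the stabilized texture-map space rather than the original image space, as the excerpt describes.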
Bilingual Reading: Why Fingers Will Never Replace the Mouse
As motion-sensing technology keeps improving, expectations for motion controllers have risen ever higher. Hands-on testing, however, shows that this class of product still cannot replace the traditional mouse, and the much-anticipated gesture-sensing device Leap Motion is no exception.
He shot me. After a good eight seconds of flailing, grabbing, and poking at the air above my desk, Frank Welty finally unholstered his sidearm and put me out of my misery. Alas, it was only a game, but I never really stood a chance. My shooter, which in this case was my pointer finger, hadn't hit a damn thing all day.

The game, Fast Iron, is just one of dozens of apps available for the newly launched Leap Motion Controller. A peripheral that lets users control their computer through hand gestures, this device showed plenty of promise when it was announced in May 2012. Now, more than a year later, the $79 product has come to market, and after a week of feeling it out, it's hard to point a finger at what exactly is wrong with gesture-based computing, at least in its current state.

Setting up the Leap Motion Controller was unexpectedly easy. I had imagined having to input measurements like the controller's distance to the screen, or heeding requirements like keeping the device at a certain height, but none of that was necessary. Other than downloading a software suite, the peripheral was more-or-less plug-and-play, the unit powered and connected to my Mac via a USB cable.

The Leap Motion software involved minimal hand-holding, and gave the impression that gesture-based controls would be easy to master, with the sensor seeming to pick up each finger and hand rotation cleanly. But the program's lack of finish caught my eye -- running on my Mac, the software's full-screen capability resized all my other application windows (a huge annoyance), and its graphics looked choppy, almost like they were low-resolution. The device couldn't detect my hands shaking with worry over these concerns, but in hindsight, they were clear indicators of a lack of finish that would plague my experience with the controller.

For example, as I continued pawing through the device's compatible software, poor graphics soon became the least of Leap Motion's problems. The company's Airspace Store is a proprietary app marketplace that sells third-party created software for the device.
Random Forests for Real Time Head Pose Estimation -- algorithm theory and research dataset
头部姿势估计实时随机森林算法(Random Forests for Real Time Head Pose Estimation)数据介绍:Fast and reliable algorithms for estimating the head pose are essential for many applications and higher-level face analysis tasks. We address the problem of head pose estimation from depth data, which can be captured using the ever more affordable 3D sensing technologies available today.关键词:算法,估算,实时,头部姿势,高品质,低质量,algorithms,estimation,real time,head pose,high-quality,low-quality,数据格式:TEXT数据详细介绍:Random Forests for Real Time Head Pose EstimationFast and reliable algorithms for estimating the head pose are essential for many applications and higher-level face analysis tasks. We address the problem of head pose estimation from depth data, which can be captured using the ever more affordable 3D sensing technologies available today.To achieve robustness, we formulate pose estimation as a regression problem. While detecting specific face parts like the nose is sensitive to occlusions, we learn the regression on rather generic face surface patches. We propose to use random regression forests for the task at hand, given their capability to handle large training datasets.In this page, our research work on head pose estimation is presented, source code is made available and an annotated database can be downloaded for evaluating other methods trying to tackle the same problem.Real time head pose estimation from high-quality depth dataIn our CVPR paper Real Time Head Pose Estimation with Random Regression Forests, we trained a random regression forest on a very large, synthetically generated face database. In our experiments, we show that our approach can handle real data presenting large pose changes, partial occlusions, and facial expressions, even though it is trained only on synthetic neutral face data. We have thoroughly evaluated our system on a publicly available database on which we achieve state-of-the-art performance without having to resort to the graphics card. The video shows the algorithm running in real time, on a frame by frame basis (no temporal smoothing), using as input high resolution depth images acquired with the range scanner of Weise et al.CODEThe discriminative random regression forest code used for the DAGM'11 paper is made available for research purposes. Together with the basic head pose estimation code, a demo is provided to run the estimation directly on the stream of depth images coming from a Kinect, using OpenNI. A sample forest is provided which was trained on the Biwi Kinect Head Pose Database.Because the software is an adaptation of the Hough forest code, the same licence applies:By installing, copying, or otherwise using this Software, you agree to be bound by the terms of the Microsoft Research Shared Source License Agreement (non-commercial use only). If you do not agree, do not install copy or use the Software. The Software is protected by copyright and other intellectual property laws and is licensed, not sold.THE SOFTWARE COMES "AS IS", WITH NO WARRANTIES. THIS MEANS NO EXPRESS, IMPLIED OR STATUTORY WARRANTY, INCLUDING WITHOUT LIMITATION, WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, ANY WARRANTY AGAINST INTERFERENCE WITH YOUR ENJOYMENT OF THE SOFTWARE OR ANY WARRANTY OF TITLE OR NON-INFRINGEMENT. THERE IS NO WARRANTY THAT THIS SOFTWARE WILL FULFILL ANY OF YOUR PARTICULAR PURPOSES OR NEEDS. 
ALSO, YOU MUST PASS THIS DISCLAIMER ON WHENEVER YOU DISTRIBUTE THE SOFTWARE OR DERIVATIVE WORKS.NEITHER MICROSOFT NOR ANY CONTRIBUTOR TO THE SOFTWARE WILL BE LIABLE FOR ANY DAMAGES RELATED TO THE SOFTWARE OR THIS MSR-SSLA, INCLUDING DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL OR INCIDENTAL DAMAGES, TO THE MAXIMUM EXTENT THE LAW PERMITS, NO MATTER WHAT LEGAL THEORY IT IS BASED ON. ALSO, YOU MUST PASS THIS LIMITATION OF LIABILITY ONWHENEVER YOU DISTRIBUTE THE SOFTWARE OR DERIVATIVE WORKS.If you do use the code, please acknowledge our papers:Real Time Head Pose Estimation with Random Regression Forests@InProceedings{fanelli_CVPR11,author = {G. Fanelli and J. Gall and L. Van Gool},title = {Real Time Head Pose Estimation with Random Regression Forests}, booktitle = {Computer Vision and Pattern Recognition (CVPR)},year = {2011},month = {June},pages = {617-624}}Real Time Head Pose Estimation from Consumer Depth Cameras@InProceedings{fanelli_DAGM11,author = {G. Fanelli and T. Weise and J. Gall and L. Van Gool},title = {Real Time Head Pose Estimation from Consumer Depth Cameras}, booktitle = {33rd Annual Symposium of the German Association for Pattern Recognition (DAGM'11)},year = {2011},month = {September}}If you have questions concerning the source code, please contact Gabriele Fanelli.Biwi Kinect Head Pose DatabaseThe database was collected as part of our DAGM'11 paper Real Time Head Pose Estimation from Consumer Depth Cameras.Because cheap consumer devices (e.g., Kinect) acquire row-resolution, noisy depth data, we could not train our algorithm on clean, synthetic images as was done in our previous CVPR work. Instead, we recorded several people sitting in front of a Kinect (at about one meter distance). The subjects were asked to freely turn their head around, trying to span all possible yaw/pitch angles they could perform.To be able to evaluate our real-time head pose estimation system, the sequences were annotated using the automatic system of ,i.e., each frame is annotated with the center of the head in 3D and the head rotation angles.The dataset contains over 15K images of 20 people (6 females and 14 males - 4 people were recorded twice). For each frame, a depth image, the corresponding rgb image (both 640x480 pixels), and the annotation is provided. The head pose range covers about +-75 degrees yaw and +-60 degrees pitch. Ground truth is provided in the form of the 3D location of the head and its rotation angles.Even though our algorithms work on depth images alone, we provide the RGB images as well.The database is made available for research purposes only. You are required to cite our work whenever publishing anything directly or indirectly using the data:@InProceedings{fanelli_DAGM11,author = {G. Fanelli and T. Weise and J. Gall and L. Van Gool},title = {Real Time Head Pose Estimation from Consumer Depth Cameras}, booktitle = {33rd Annual Symposium of the German Association for Pattern Recognition (DAGM'11)},year = {2011},month = {September}}Files:Data (5.6 GB, .tgz compressed) Readme fileSample code for reading depth images and ground truthIf you have questions concerning the data, please contact Gabriele Fanelli. 数据预览:点此下载完整数据集。
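The released implementation is C++ (with an OpenNI/Kinect demo) and is not reproduced here. Purely to illustrate the regression-forest idea described above, in which patches of depth data vote for the head's 3D position and rotation, here is a scikit-learn sketch; patch extraction, the paper's binary tests, and its vote-clustering step are simplified away, and the array shapes and tree parameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# X: (n_patches, patch_h * patch_w) flattened depth patches
# y: (n_patches, 6) targets = head center (x, y, z in mm) and yaw, pitch, roll (degrees),
#    the quantities annotated in the Biwi Kinect Head Pose Database.
def train_pose_forest(X, y, n_trees=20):
    forest = RandomForestRegressor(n_estimators=n_trees, max_depth=15, n_jobs=-1)
    forest.fit(X, y)
    return forest

def estimate_pose(forest, patches):
    """Average the per-patch votes into a single 6-DoF estimate
    (the paper clusters the votes instead of averaging them)."""
    votes = forest.predict(patches)   # shape (n_patches, 6)
    return votes.mean(axis=0)
```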
Generating 3D Virtual Human Animation Based on Dual-Camera Capture of Facial Expressions and Human Pose
Journal of Computer Applications, 2021, 41(3): 839-844. 2021-03-10.

LIU Jie, LI Yi*, ZHU Jiangping (College of Computer Science, Sichuan University, Chengdu 610065, China; *corresponding author, e-mail liyi_ws@ )

Abstract: In order to generate a three-dimensional virtual human animation with rich expression and smooth movement, a method for generating three-dimensional virtual human animation based on synchronous capture of facial expression and human pose with two cameras was proposed. Firstly, the Transmission Control Protocol (TCP) network timestamp method was used to realize the time synchronization of the two cameras, and Zhang Zhengyou's calibration method was used to realize the spatial synchronization of the two cameras. Then, the two cameras were used to collect facial expressions and human poses respectively. When collecting facial expressions, the 2D feature points of the image were extracted, and regression on these 2D points was used to compute the Facial Action Coding System (FACS) facial action units in preparation for expression animation. Based on the standard head 3D coordinates and the camera intrinsic parameters, the Efficient Perspective-n-Point (EPnP) algorithm was used to estimate the head pose. After that, the facial expression information was matched with the head pose estimation information. When collecting human poses, the Occlusion-Robust Pose-Map (ORPM) method was used to compute the human pose and output data such as the position and rotation angle of each bone point. Finally, the established 3D virtual human model was used to show the effect of data-driven animation in Unreal Engine 4 (UE4). Experimental results show that this method can capture facial expressions and human poses simultaneously, reaches a frame rate of 20 fps in the experimental tests, and can generate natural and realistic three-dimensional animation in real time.

Key words: dual-camera; human pose; facial expression; virtual human animation; synchronous capture. CLC number: TP391.4; document code: A.

0 Introduction

As virtual reality technology enters everyday life, people place increasingly high demands on how virtual avatars are obtained and how realistic they are, hoping to acquire an avatar with low-cost devices in ordinary daily environments and apply it in virtual environments [1].
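The abstract states that head pose is estimated with EPnP from detected 2D facial feature points, standard 3D head-model coordinates, and the camera intrinsics. A minimal OpenCV version of that single step could look like the following; the landmark arrays and intrinsic values are placeholders rather than the authors' data.

```python
import numpy as np
import cv2

def estimate_head_pose(pts_3d, pts_2d, fx, fy, cx, cy):
    """pts_3d: (N, 3) standard 3D head-model landmarks; pts_2d: (N, 2) detected 2D landmarks.
    Returns the rotation matrix and translation vector of the head with respect to the camera."""
    K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]], dtype=np.float64)
    dist = np.zeros(5)  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(pts_3d.astype(np.float64), pts_2d.astype(np.float64),
                                  K, dist, flags=cv2.SOLVEPNP_EPNP)
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return ok, R, tvec
```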
Unity3D Technique: Root Motion -- How It Works
Body Transform
The Body Transform is the character's center of mass. It is used by the Mecanim retargeting engine and provides the most stable displacement model. The Body Orientation is the average of the lower-body and upper-body orientations, relative to the Avatar's T-Pose. The Body Transform and Orientation are stored in the Animation Clip (using the muscle settings defined in the Avatar). They are the only world-space curves stored in the Animation Clip. Everything else -- muscle curves and IK goals (Hands and Feet) -- is stored relative to the Body Transform.

Root Transform
The Root Transform is the projection of the Body Transform onto the Y plane and is computed at runtime. At every frame, the change in the Root Transform is computed and then applied to the Game Object to make it move. The circle below the character represents the Root Transform.

Animation Clip Inspector
The Animation Clip Editor settings (Root Transform Rotation, Root Transform Position (Y), and Root Transform Position (XZ)) let you control how the Root Transform is projected from the Body Transform. Depending on these settings, some part of the Body Transform may be transferred to the Root Transform. For example, you can decide whether the motion's Y position should be part of the Root Motion (trajectory) or part of the pose (Body Transform); this is known as Baked into Pose.

Root Transform Rotation
Bake into Pose: the orientation stays with the Body Transform (or Pose).
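Unity's actual API is C#; the small Python sketch below only illustrates the arithmetic described above: the root transform is the ground-plane projection of the body transform, its per-frame change is applied to the game object, and "Bake into Pose" decides whether a component stays in the pose rather than in the trajectory. The axis convention (Y up) and function names are assumptions, not Unity internals.

```python
import numpy as np

def root_from_body(body_pos, bake_y=True):
    """Project the body (center-of-mass) position onto the ground plane to get the root position.
    If Y motion is baked into the pose, the root stays at ground height and Y remains in the pose."""
    root = body_pos.copy()
    if bake_y:
        root[1] = 0.0
    return root

def apply_root_motion(object_pos, body_prev, body_curr, bake_y=True):
    """Per frame: compute the change of the root transform and apply it to the game object."""
    delta = root_from_body(body_curr, bake_y) - root_from_body(body_prev, bake_y)
    return object_pos + delta
```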
Research on 3D Human Motion Pose Estimation Methods
Research Content and Methods
Research content: this work studies a deep-learning-based method for estimating 3D human motion poses, aiming to improve the accuracy of human motion analysis and to broaden its range of applications.
Research method: first, collect a large amount of human motion data and build a database containing a rich variety of human poses; second, design and build a deep neural network model that estimates human poses end to end; finally, verify the accuracy and robustness of the proposed method through experiments.

Experimental Setup and Datasets
Purpose: evaluate the performance and accuracy of 3D human motion pose estimation methods and compare the strengths and weaknesses of different methods.
Dataset sources: publicly available datasets are used, including motion pose data captured in real scenes and data from simulated scenes.
Dataset characteristics: the datasets contain different scenes, different human poses, and different action types, making them diverse and challenging.

Model Design: Human Pose Estimation Fusing a Convolutional Neural Network and a Recurrent Neural Network
The convolutional neural network (CNN) extracts spatial features from the images, including the position, orientation, and shape of each body part.
The recurrent neural network (RNN) processes the time-series data and captures how the human pose changes over time.
Fusing the CNN with the RNN combines the spatial features with the temporal information to achieve accurate estimation of 3D human motion poses.
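The slides describe the fused model only at this high level. The following PyTorch sketch is one plausible minimal realization of that description: a small CNN producing per-frame spatial features, a GRU over the frame sequence, and a linear head regressing 3D joint coordinates. The layer sizes, the 17-joint output, and the backbone depth are illustrative assumptions, not the slides' actual network.

```python
import torch
import torch.nn as nn

class CnnRnnPose(nn.Module):
    def __init__(self, n_joints=17, feat_dim=256, hidden=256):
        super().__init__()
        # Per-frame spatial feature extractor (toy CNN; a real system would use a deeper backbone).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Temporal model over the sequence of per-frame features.
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_joints * 3)   # 3D coordinates per joint

    def forward(self, frames):                         # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.rnn(feats)
        return self.head(seq).view(b, t, -1, 3)        # (B, T, n_joints, 3)
```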
Experimental Results
Method 1 (traditional machine learning): accuracy 75%, recall 80%, F1 score 78%.
Method 2 (deep learning): accuracy 85%, recall 82%, F1 score 83%.
Method 3 (multimodal fusion): accuracy 90%, recall 88%, F1 score 89%.

Results Analysis
A Vehicle Virtual Disassembly and Assembly Experiment System Based on Unity 3D and Natural Gesture Interaction
$\lambda_y$, $\lambda_s$ and $\lambda_g$ are the weight vectors for $y$, $s$ and $g$ respectively, and $h(\cdot)$ is the gate function:

$$ h(x) = \left[1 + \exp(-x)\right]^{-1} - 0.5 \qquad (4) $$

In the HCNF-based recognition algorithm, the observation function $\Phi(X, Y, S)$ makes use of the gate function $h(x)$, and $P(X \mid Y)$ can be derived from Eqs. (3) and (4) as follows:

$$ P(X \mid Y) = \frac{1}{Z(X)} \sum_{s} \exp\!\big( \lambda\,\Phi(X, Y, s) + W\,n(X, Y, s) \big) \qquad (5) $$

where $Z(X)$ is the partition (normalization) function, whose expression is
0 Introduction

With the development of virtual reality (VR) technology, how to integrate it into teaching has attracted wide attention from researchers. More and more researchers have used virtual reality technology to build realistic three-dimensional simulated environments and interactive, dynamic teaching scenes [1,2], immersing students in a virtual environment to complete experiments and learning. Since 2017, the Ministry of Education has been vigorously promoting demonstration virtual simulation experimental teaching projects [3]. All kinds of simulation teaching experiments have emerged, greatly advancing the development of experimental teaching.
Test and scoring module: it scores the correctness and standardization of the operation steps to assess the student's mastery. During a test, the prompts for the disassembly and assembly steps are removed, and parts are disassembled and assembled freely under the relevant constraints only; each step carries a corresponding score, mistakes are penalized, and the resulting score is used to gauge the degree of mastery.
[Figure: 3D models used by the system -- (a) formula racing car, (b) tractor, (c) passenger-car axle; panels cover principle teaching and disassembly/assembly teaching; models built in 3ds Max]
Dynamic Reconstruction of Three-Dimensional Character Images Based on Multimedia Technology
LI Tianfeng (School of Computer and Information Engineering, Nanyang Institute of Technology, Nanyang 473004, China). Modern Electronics Technique, 2018, 41(9): 68-71 (4 pages). In Chinese.

Abstract: The traditional three-dimensional (3D) character image dynamic reconstruction method based on self-calibration cannot acquire the 3D position of the characters' motion morphology accurately and has high deviation in its reconstruction results. To address these problems, a 3D character image dynamic reconstruction method based on multimedia technology is put forward, in which a monocular image reconstruction algorithm is used to reconstruct the 3D posture of the human body. For the reconstruction of monocular human movement images taken by a single camera, a 2D limb detection algorithm based on component detectors is used to detect the 2D limbs of the human body according to a tree-shaped body model. A 3D posture reconstruction algorithm based on the image coordinates of the joint points is then used, and an annealing particle filtering algorithm that predicts the joint-point back-projection error tracks the 3D posture of the human body from the 2D detection results, so as to realize 3D character image dynamic reconstruction. The experimental results indicate that the proposed method realizes dynamic reconstruction of 3D character images accurately and has high reconstruction accuracy and efficiency.

Keywords: multimedia technology; 3D character image; 3D posture reconstruction; annealing particle filtering algorithm; reconstruction accuracy; posture tracking. CLC: TN911.73-34; TP18.

0 Introduction: With the rapid development of multimedia technologies such as computer animation and computer vision, accurate dynamic tracking of the 3D posture of a moving human body has important application value in fields such as surveillance, sports, medicine, and film and television production.
Construction and simulation of a computer-vision-based model for recognizing athletes' incorrect movements
基于计算机视觉的运动员错误动作识别模型构建及仿真高亮【摘要】针对当前运动员训练的需求,结合当前的计算机视觉技术,提出一种运动员错误动作识别的模型.对体育运动错误动作三维建模的原理进行分析,然后结合错误动作识别的流程,以南拳中的高空腾飞作为研究对象,用三维坐标构建其动作形式化描述规则,并采用贝叶斯算法对错误动作进行识别.最后对上述的方案进行验证,得到本文构建的贝叶斯算法在检测的准确率等方面都比传统的要高,以此验证了贝叶斯在对南拳错误动作中的有效性.%In view of the needs of current athletes training and combined with the current computer vision technology, a model based on the error recognition of athletes is proposed. The principle of sports error action 3D modeling is analyzed. Combined with the wrong action recognition process, taking high altitude Nanquan as the research object, the 3D coordinates of the athletes are applied to build the formal description of action rules, and the Bayesian algorithm is used to identify the wrong action. Finally the above scheme is verified. The Bayesian algorithm has higher detection accuracy than the traditional ones.【期刊名称】《微型电脑应用》【年(卷),期】2018(034)006【总页数】4页(P59-62)【关键词】计算机视觉;错误动作;模型;仿真;贝叶斯【作者】高亮【作者单位】榆林学院, 榆林 719000【正文语种】中文【中图分类】TP3110 引言近年来,在我国计算机图像处理技术的推动下,计算机视觉特性解析以及图像处理技术被广泛应用于人体结构分析中,能够对人体在运动时各类形体进行解析。
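The abstract above classifies incorrect Nanquan movements with a Bayesian algorithm over 3D-coordinate-based action descriptions, but the exact features and classifier variant are not given. As a hedged illustration only, the snippet below trains scikit-learn's Gaussian naive Bayes on invented joint-angle features; nothing here is taken from the paper.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Hypothetical features: e.g. knee/hip/shoulder angles extracted from 3D coordinates
X_correct = rng.normal(loc=[170, 95, 80], scale=5, size=(50, 3))
X_wrong = rng.normal(loc=[150, 110, 60], scale=8, size=(50, 3))
X = np.vstack([X_correct, X_wrong])
y = np.array([0] * 50 + [1] * 50)            # 0 = correct, 1 = incorrect movement

clf = GaussianNB().fit(X, y)

# Classify a new movement sample
sample = np.array([[155, 108, 62]])
print("incorrect" if clf.predict(sample)[0] == 1 else "correct")
```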
A 3D gesture interaction system based on computer vision
A 3D gesture interaction system based on computer vision. Huo Pengfei. [Journal] 现代电子技术 (Modern Electronics Technique). [Year (Volume), Issue] 2016 (039) 017. [Abstract] With the widespread development of computers, traditional human-computer interaction methods such as the keyboard and mouse can hardly meet users' needs for natural and convenient interaction.
Research on gesture modeling, hand tracking and applications of gesture interaction systems has become a hot trend.
A simplified 2D hand model is proposed that represents the hand as a palm-center point and five fingers. A hand-tracking method based on particle swarm optimization (PSO) is also designed; by modeling the physiological and kinematic constraints of the hand, PSO-based hand tracking on the 2D/3D hand model is realized. The resulting gesture interaction framework is more applicable and extensible; it fuses semantic and feedback information, and improves the robustness of hand tracking and the accuracy of gesture recognition.
%With the wide development of the computer,the keyboard,mouse and other traditional human⁃computer interaction modes are difficult to meet the users′ natural and convenient interaction needs. The study on the applications of gesture modeling, hand tracking and gesture interactive system has become a hotspot. A simplified 2D hand model is proposed in this paper,in which the hand is modeled as the palm point and 5 fingers. A hand tracking method based on particle swarm optimization(PSO) algorithm was designed,and the PSO hand tracking based on 2D/3D hand model was realized by modeling the physiology and kine⁃matics constraint relation of the hand. The framework of this gesture interactive system has applicability and scalability. It fusedthe semantics and feedback information,and improved the robustness of hand tracking and accuracy of gesture recognition.【总页数】5页(P26-29,34)【作者】霍鹏飞【作者单位】南阳师范学院珠宝玉雕学院,河南南阳 473061【正文语种】中文【中图分类】TN911.73-34;TM417【相关文献】1.基于手势识别技术的3D虚拟交互系统 [J], 王粉花;张万书2.基于卷积神经网络的手势识别人机交互系统的设计 [J], 陈壮炼;林晓乐;王家伟;李超3.基于双手手势融合交互系统的方法设计 [J], 缪文南4.基于最邻近算法的三维手势交互系统设计 [J], 陈晓慧;明月;刘罡5.基于轻量级OpenPose改进的幻影机手势交互系统 [J], 谭立行;鲁嘉淇;张笑楠;刘宇红;张荣芬因版权原因,仅展示原文概要,查看原文内容请购买。
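The paper above tracks the hand with a particle swarm optimization (PSO) method constrained by the hand's physiological and kinematic limits. Its cost function and constraint model are not reproduced here, so the following is only a minimal generic PSO loop in Python/NumPy with an invented cost (distance of a 6-D hand state to a fake observation) and simple box bounds standing in for joint limits.

```python
import numpy as np

def pso_track(cost_fn, bounds, n_particles=30, n_iters=50, w=0.7, c1=1.5, c2=1.5,
              rng=np.random.default_rng(0)):
    """Minimal particle swarm optimizer: returns the best hand-state vector found."""
    lo, hi = bounds
    dim = len(lo)
    x = rng.uniform(lo, hi, size=(n_particles, dim))      # particle positions
    v = np.zeros_like(x)                                   # particle velocities
    pbest, pbest_cost = x.copy(), cost_fn(x)
    g = pbest[np.argmin(pbest_cost)]                       # global best
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                         # clipping = crude joint-limit constraint
        cost = cost_fn(x)
        better = cost < pbest_cost
        pbest[better], pbest_cost[better] = x[better], cost[better]
        g = pbest[np.argmin(pbest_cost)]
    return g

# Toy usage: a 6-D hand state (palm x, y and four finger bend angles), fake observation
observed = np.array([0.2, -0.1, 0.5, 0.6, 0.4, 0.3])
cost = lambda states: np.linalg.norm(states - observed, axis=1)
bounds = (np.array([-1, -1, 0, 0, 0, 0]), np.array([1, 1, 1.6, 1.6, 1.6, 1.6]))
print(pso_track(cost, bounds))
```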
A method for emulating the mouse in 3D gesture interaction
一种在3D手势交互中模拟鼠标的方法律睿慜;郝豪杰;陈飞;陈伟;刘渊【摘要】Current gesture interaction technologies are not satisfactory to perform 2D interaction tasks, since they are developed for 3D interactions. To solve this problem, this paper presents a new 3D gesture interaction technique to emulate a mouse function for 2D interactions. Based on the ISO9241-9 standard, a multi-directional tapping test for controlling typical 2D interactions is developed to evaluate the performance by comparing the 3D gesture interaction method with the general 3D gesture interaction method and the mouse interaction method in three aspects:the standard throughput, pointing-selection time and error rate. After that, a survey is conducted by users to evaluate the satisfactory scores of the proposed method. The results show that the proposed method can perform 2D interaction tasks by achieving higher accuracy, better control-ling experiences, and without reducing the throughput rate than the general3D gesture interaction method.%针对当下3D手势交互在执行2D交互任务时用户体验不佳,提出了一种模拟鼠标2D交互方式的3D手势交互技术.根据ISO9241-9标准,设计多方向点选实验,用于比较该3D手势交互技术、一般3D手势交互技术和日常鼠标这三种交互技术在执行典型2D交互任务时的标准吞吐量、指向-选择时间和错误率.以调查问卷的方式评估该交互技术的主观舒适性.结果显示,与一般的手势交互相比,在不降低吞吐率的情况下,该交互技术在执行2D交互任务时可以获得更高的准确率以及更好的舒适度体验.【期刊名称】《计算机工程与应用》【年(卷),期】2017(000)010【总页数】8页(P13-20)【关键词】手势交互;虚拟鼠标;菲兹定律;ISO9241-9标准【作者】律睿慜;郝豪杰;陈飞;陈伟;刘渊【作者单位】江南大学数字媒体学院,江苏无锡 214122;江南大学数字媒体学院,江苏无锡 214122;江南大学数字媒体学院,江苏无锡 214122;江南大学数字媒体学院,江苏无锡 214122;江南大学数字媒体学院,江苏无锡 214122【正文语种】中文【中图分类】TP391手势交互是利用计算机图形学等技术识别人的肢体语言,并转化为命令来操作设备[1]。
Palm reconstruction and implementation based on raster binocular vision
基于光栅双目视觉的手掌重构与实现曹淼龙;李强;姜文彪【期刊名称】《浙江科技学院学报》【年(卷),期】2015(27)1【摘要】Aiming at the characteristics of complicated texture epidermis and elasticity deformation of hand and palm form ,a reconstruction and implementation method for building the model of hand‐shape based on the technology of raster binocular system in vision metrology is proposed . The 3D vision measurement system is employed to calibrate the character parameter ,attribute parameters and relative position of cameras in left and right side .Firstly , by two finished calibration cameras performance ,we acquire data of photographed multi images from the hand‐shape model with raster binocular system and global data optimization registration are accomplished . T hen on this basis , the massive o riginal point‐clouds data are compiled .Finally ,the hand shap model is simulated with numerical control and is also verified by practical processing . T he experimental results demonstrate that the reconstruction model with non‐contact measuring method is of high precision ,good practicality and valuable to be referenced and promoted in application .%针对手部掌形的表皮纹理复杂性和弹性易形变等特点,提出了一种以光栅式双目三维视觉技术对手部掌形模型进行重构及实现的方法。
View-invariant 3D hand gesture trajectory recognition
View-invariant 3D hand gesture trajectory recognition. Zhang Yi; Zhang Shuo; Luo Yuan. [Abstract] A novel view-invariant method for recognizing 3D hand gesture trajectories is proposed. For gesture segmentation, a Kinect sensor captures image depth information, and the palm-center point is located by first locating the start point and then the end point, so that trajectory points are localized automatically and without delay.
An improved centroid distance function is used to represent view-invariant 3D trajectory features, and a hidden Markov model is used to train on and recognize valid trajectories.
Experimental results show that the method is robust to illumination changes and complex backgrounds, and that the average recognition rate for the digits 0-9 reaches 97.7%.
【期刊名称】《电子科技大学学报》【年(卷),期】2014(000)001【总页数】6页(P60-65)【关键词】三维手势轨迹;质心距离函数;轨迹点定位;视角不变【作者】张毅;张烁;罗元【作者单位】重庆邮电大学自动化学院重庆沙坪坝区 400065;重庆邮电大学自动化学院重庆沙坪坝区 400065;重庆邮电大学光电工程学院重庆沙坪坝区400065【正文语种】中文【中图分类】基础科学第 43 卷第 1 期2014年1月电子科技大学学报Joumalof University of Electronic Science and Technologyof China V01.43No.lJan.2014基于视角不变的三维手势轨迹识别张毅 1 ,张烁 1 ,罗元 2 (1 .重庆邮电大学自动化学院重庆沙坪坝区 400065;2 .重庆邮电大学光电工程学院重庆沙坪坝区 400065)【摘要】提出了一种新颖的基于视角不变的三维手势轨迹识别方法,手势分割采用Kinect 传感器获取图像深度信息,通过先定位起始点再定位结束点的方法定位手心点,使手势轨迹点定位有自动无延时的特性。
关键词三维手势轨迹;质心距离函数;轨迹点定位;视角不变中图分类号, IP242.6 文献标志码 A doi:10.3969/j.issn.1001-0548.2014.01.010 View-Invariant3DHandTrajectory-BasedRecognition.--2 ZHANG Yi'.ZHANG Shuo'.and LUOYuan- (l.lnstitute ofAutomation,ChongqingUniversity ofPosts and Telecommunications ShapingbaChongqing 400065;2.InstituteofOptoelectronicsEngineering, ChongqingUniversity ofPostsand Telecommunications ShapingbaChongqing 400065) Abstract Thispaperproposesanovelmethodfor view-invariant3Dhandtrajectory-basedrecognition.The imagedepthinformationin gesturesegmentationis collectedbyusingKinectsensor.View-invariant3Dhand trajectory isrepresentedbyimprovingcentroiddistance function.HiddenMarkovmodelis appliedto train and recognizehandgesture.Experimentresults showthat theproposedmethodis robustundertheconditionof di 丘’ erentilluminationandcomplexbackground.Theillustratedsystemcansuccessfullyrecognizespot tedhand gestures witha97.7%recognitionrate forArabicnumberso t0 9. Keywords 3D gesture trajectory; centroid distance function;gesturespotting; view-invariant手势识别是人机交互中的一种重要手段[1-2]。
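As a hedged sketch of the centroid-distance plus HMM pipeline described in the abstract above (not the authors' code), the snippet below uses the hmmlearn library: one Gaussian HMM is fitted per digit class on centroid-distance feature sequences, and a new trajectory is assigned to the class whose model scores it highest. The feature function and synthetic trajectories are stand-ins.

```python
import numpy as np
from hmmlearn import hmm

def centroid_distance_features(trajectory):
    """Toy view-invariant feature: distance of each 3D point to the trajectory centroid."""
    centroid = trajectory.mean(axis=0)
    return np.linalg.norm(trajectory - centroid, axis=1)[:, None]   # (T, 1)

rng = np.random.default_rng(0)
# Invented training data: a few synthetic trajectories per digit class
train = {d: [rng.random((40, 3)) * (d + 1) for _ in range(5)] for d in range(3)}

models = {}
for digit, trajs in train.items():
    feats = [centroid_distance_features(t) for t in trajs]
    X = np.concatenate(feats)
    lengths = [len(f) for f in feats]
    m = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
    m.fit(X, lengths)
    models[digit] = m

# Recognize a new trajectory: pick the class with the highest log-likelihood
test = centroid_distance_features(rng.random((40, 3)) * 2)
scores = {d: m.score(test) for d, m in models.items()}
print("recognized digit:", max(scores, key=scores.get))
```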
Implementation of motion mapping from the human hand to a dexterous robotic hand (Liu Jie)
文章编号:1002 0446(2003)05 0444 04人手到灵巧手的运动映射实现刘杰,张玉茹,刘博(北京航空航天大学机器人研究所 北京 100083)摘 要:本文研究主从操作中人手到灵巧手的运动映射.提出了一种基于虚拟关节和虚拟手指的关节空间运动映射方法,实现了人手和灵巧手的三维运动仿真.以数据手套为人机接口,在虚拟环境下,通过直观地比较映射效果,验证了映射算法.关键词:主从操作;灵巧手;运动映射中图分类号: T P24 文献标识码: BMOTION MAPPING OF HUMAN HAND TODEXTEROUS ROBOTIC HANDSLIU Jie,ZHANG Yu ru,LIU Bo(Robotics Institu te,Beijing Univer sity of Aeronautics and Astr onautics,Beij ing 100083,China)Abstract:T his paper describes t he implementation of gr asp planning for robot ic dex ter ous hands.Human grasp strateg y is learned through master slave manipulation.T he human hand motion is mapped to the robot hands through datag love.A method of joint space mapping is developed based on virtual jo ints and virtual fing ers.A v irtual envir onment is developed to visualize the motion of the human hand and the robot hands.T he mapping r esult is verified intuitively in the virtual envi ro nment.Keywords:master slave manipulation;dexterous hands;motion mapping1 引言(Introduction)灵巧手的主要功能是抓持和操作物体,抓持是操作的基础.抓持规划要解决的问题是根据抓持任务确定指关节转角的运动规律和指关节力矩的目标值.在对人手的抓持规划未作深刻系统的研究之前,实现自主抓持规划的可能性很小.人手经过长期自然选择进化,获得了可以完成复杂动作的能力,因此,许多机器人灵巧手采用了仿人手的设计策略,在对它们进行运动规划时,也希望利用人的智能,主从操作策略成为解决灵巧手抓持规划的主要途径.其基本思想是由人来决定如何抓持,通过控制灵巧手再现人的意图,使人的智能与机器人的机械运动能力有效结合.例如,美国国家航空航天局(NASA) JPL实验室的Jau利用外骨架机构测量人手关节运动,并映射成灵巧手关节的运动[1];NASA Johnson 空间中心的Farry通过测量人手运动的肌电信号,进行人手与灵巧手之间的运动映射[2];德国宇航研究中心的Fischer,采用数据手套测量人手关节角,利用神经网络确定人手关节与指尖位置的关系,在人手与灵巧手的工作空间上进行映射[3].以主从方式实现灵巧手抓持规划的关键问题之一是如何将人手的抓持转换为灵巧手的抓持,即人手运动向灵巧手的映射.一些学者就此进行了研究[4~8].我们相信,如果可以建立人手到灵巧手的运动映射,在相同的操作环境和操作对象下,拟人手也能完成类似的动作.本文介绍一种运动映射方法的具体实现过程,包括人手运动的测量、人手到灵巧手的运动映射方法、映射算法的验证和映射效果比较等.针对灵巧手第25卷第5期2003年9月机器人 R OBOT V ol.25,No.5 Sept.,2003基金项目:国家自然科学基金:基于6维力传感器的灵巧手控制和操作规划资助项目(59985001);国家教育部博士点基金:人手运动识别与灵巧手的抓持规划资助项目(2000000605)收稿日期:2003-03-10和人手之间的异构映射问题,定义了预抓持手势,利用虚拟关节和虚拟手指的概念,解决了多关节手指向少关节手指、多个手指向一个手指的运动映射问题.2 人手的运动测量(Motion captu re of hu manhand)目前,采集人手运动的设备主要有摄像头和数据手套.对用户来说,利用摄象头输入手势信息是很方便的,但摄像头采集的数据易受光照等环境变化的影响.目前的视觉技术还不能对自动理解提供足够的支持,图像处理计算的速度很难保证在线规划的要求.数据手套是一种有效的手势输入设备,目前被许多系统所用.它一方面可以通过手指上的传感器测量手指的姿势及其变化情况;另一方面,又可通过手腕上的位置跟踪器测定手腕的位置和姿态.与摄像头相比,数据手套采集的数据简洁、准确.数据手套易获取表明手的动作时空特性的特征信息,如手指关节运动信息;另外,数据手套采集的数据不受光照等环境变化的影响.通过对两种输入设备性能的比较,我们选取带有18个传感器的Cy berglove 数据手套,作为人手运动的输入设备.如图1所示,利用该手套可以得到各关节的原始传感数据为:小指MPJ 和PIJ 、无名指MPJ 和PIJ 、中指M PJ 和PIJ 、食指M PJ 和PIJ 、拇指MPJ 和PIJ 、小指与无名指间外展、无名指与中指间外展、中指与食指间外展、拇指外展、拇指旋转度、腕的外展、腕的曲度、掌的弧度,其中M PJ 为根关节转角(Metacarpo phalangeal Joints)、PIJ 为中间关节转角(Proximal In terphalangeal Joint)、DIJ 为指端关节转角(Distal In terphalangeal Joints)[9].图1 Cyberglov e 测量角度示意图Fig.1 Cyberglove capture我们把手指在运动过程中可以达到的空间位置作为衡量人手运动能力的指标,这个指标可以用指尖的空间位置结合手指三个关节的弯曲角度和范围来描述,测量所得运动空间如图2所示.图2 人手的运动空间Fig.2 T he wor kspace of human hand人手在抓取物体时,将根据物体的形状、尺寸、重量和任务来选择抓取姿势,而不是在物体上选择准确的抓取点.因此,我们定义:手尚未与被抓持物体接触前所形成的包罗物体的抓取姿势为预抓持手势.灵巧手要抓住物体,就需要完成类似人手的抓持动作.灵巧手模仿人手的预抓持手势是完成仿人抓持的必要条件.因此,我们研究的重点是预抓持手势的映射问题.在灵巧手完成仿人的预抓持之后,接触物体到抓住物体的过程中,需要根据灵巧手的触觉传感器反馈信号来实现稳定的抓持,这个过程类似于人手借助丰富的触觉信息来完成对物体的抓取.3 运动映射(Motion mapping)运动映射的本质是通过跟踪人手指的运动来控制灵巧手手指的运动,目的是让灵巧手实现类似人手的复杂和多样的操作.3.1 研究对象本文的研究对象是北京航空航天大学机器人研究所自主开发的机器人灵巧手.图3和图4是B H 3和BH4的单指运动简图.BH3是三指9自由度灵巧手,三个手指结构相同,每个手指有三个指节、三个关节和三个自由度.BH 4是四指12自由度仿人手,拇指2个指节、3个自由度,其余三指结构相同,每个手指有三个指节、四个关节和三个自由度,前两关节 3和 4运动耦合.3.2 映射方法人手到灵巧手运动映射的主要方法分为四类:指尖映射、关节映射、关键点映射和基于被操作物体的映射.本文采用关节角映射,这种方法的优点是便445 第25卷第5期刘杰等: 人手到灵巧手的运动映射实现于规划系统和控制系统的结合,因为规划的结果是各个关节转角,它们是手指关节电机的控制目标信号.受到现有驱动器、传感器、材料和加工条件等因素的限制,机器人手还不能做到和人手完全一样,在运动学结构上存在明显的不同,这给运动映射带来了难题.BH 
系列灵巧手和人手的异构给运动映射带来的问题主要有:1)多关节多自由度的人手指到少关节少自由度机器人手的映射.2)多个手指向一个手指的运动映射.其中解决手指运动映射问题是基础.解决第一个问题的方法是关节运动的组合分配,即把人手指的中间关节的运动组合到根关节上,然后映射给BH3的根关节.这里本文定义虚拟关节,把两个物理关节抽象为一个物理上并不存在的关节,该关节的运动功能等效于原来的两个关节的运动组合.图3 BH3单指运动简图Fig.3 BH3finger kinematicstructure图4 BH4单指运动简图Fig.4 BH4finger kinematic structure图5显示了虚拟关节的实现方法.图中 、 和!分别为人手指的根关节转角、中间关节转角和指端关节转角;手指近指节、中指节的指长分别为a,b.∀和! 分别为虚拟手指的关节转角.根据余弦定理可得:cos (∀- )=a 2+(a 1+b 2-2ab cos (180- ))-b 22a a 2+b 2-2ab cos (180- )=a +b cos ( )a 2+b 2+2ab cos ( )(1)∀= +arccos (a +b cos ( )a 2+b 2+2ab cos ( )) (2)由于人手和灵巧手的关节转角范围不一样,所以要求等效转角的范围要在灵巧手关节转角的设计范围内.图5 手指关节运动分析Fig.5 Mo tion analysis of finger measuring angles人手和BH3手的主要运动学区别包括拇指根关节的结构、手掌的结构和手指在手掌上的布局.它们是影响映射的关键因素.人手的拇指、食指对应B H 3两指,将人手的中指、无名指和小指抽象为一个虚拟手指[10],该虚拟手指对应BH3的一个手指.虚拟手指的MPJ 计算如下:vf MP J B H 3=mpjK 1 middleMP J -mpjK 2ringMP J +mpjK 3 pinkieMP J(3)式中middleMPJ 、ringM PJ 、pinkieMPJ 分别代表中指、无名指、小指的根关节转角,mpjK 1、mpjK 2、mpjK 3是权值,mpjK 1、mpjK 2、mpjK 3=1.权值需要根据灵巧手和人手的运动近似程度进行调整.近似程度的评价方法是采用模式识别技术对灵巧手和人手的关节运动数据进行聚类分析,详细内容我们将在另文中进行介绍.在虚拟环境中,通过反复实验,我们选取mpjK 1为0.5、mpjK 1和mpjK 1为0.25.手指对应关系确定之后,根据公式(1)和(2)提供的关节角度等效方法,在考虑人手和灵巧手在指长和关节运动范围的差别基础上,确定对应关节角度的映射.手值得其他关节的角度的映射方法同根关节.人手到B H 3灵巧手的运动映射效果参见图6.在结构上,BH4灵巧手自由度的分配更加接近人手,映射方法相对简单,人手拇指对应BH4拇指,人手食指、中指对应BH4的两个手指,而人手无名指446机 器 人2003年9月和小指抽象为一个虚拟手指,这个虚拟手指的运动映射给BH4的一个手指.同理,虚拟手指vfM PJ BH4等效公式如下:vf MPJ BH 4=mpj K 2 ringMPJ+mp jK 3 p ingkieMP J(4)图6 人手到BH3运动映射Fig.6 M apping human hand to BH3其中mpjK 2+mpjK 3=1.而每个手指关节角度的映射方法同BH3,mpjK 2为0.75,m pjK 3为0.25.映射实现效果参见图7.图7 人手到BH4运动映射F ig.7 M apping human hand to BH44 映射仿真(Mapping results)为了便于观察人手到灵巧手的运动映射效果,验证不同的映射算法,我们开发了基于PC 机的虚拟现实运动映射平台[11],它的体系结构如图8所示.图8 人手到灵巧手运动映射软件体系结构Fig.8 Diagram of software ar chitecture实验时,操作者佩戴经过标定的数据手套与虚拟环境进行交互,用人手的运动数据驱动虚拟人手和虚拟机器人手,比较他们之间运动相似程度.该系统不仅可以直观快速地验证运动映射算法,而且可图9 实验环境场景图Fig.9 T he exper imental system以检验不同的运动学结构设计对手的运动能力的影响,进而根据任务要求设计不同的灵巧手.实验环境场景见图9.BH3和BH4的运动能力和运动映射效果比较参见图10.图10 人手到BH3和BH4灵巧手运动映射Fig.10 M apping from human hand to BH3&BH4(下转第451页)447 第25卷第5期刘杰等: 人手到灵巧手的运动映射实现[4]陈茂华,张华等.智能移动机器人的实验研究[J].南昌航空工业学院学报.2001.[5]Ik eda K,Nozaki T.Fundamental study on a w all climbing robot[A].Proc.of the5th Int.Symp.On Robotics in Construction[C].1998.[6]Jagannathan S,Zhu S Q,L ew i s F L.Path planni ng and control of amobile base w ith nonbolonomic constraints[J].Robotica,1994(12).作者简介:欧光峰(1971 ),男,硕士研究生.研究领域:工业焊接机器人自动控制及智能化.张 华(1964 ),男,教授,博士生导师.研究领域:焊接过程自动控制、工业焊接机器人智能化.刘国平(1964 ),男,副教授,硕士生导师.研究领域:机器人、自动控制等.(上接第447页)我们在实验中发现灵巧手拇指在手掌上的位置和手掌的形状对映射效果影响很大.5 结论和将来的工作(C onclu sio n and fu turew ork s)本文针对灵巧手的抓持规划,实现了基于PC的虚拟现实仿真平台,建立了一个友好的人机环境来模拟和再现人手到灵巧手的运动映射.以B H系列灵巧手为例,验证了运动映射的方法及其映射效果.该虚拟环境既可以用于灵巧手的抓持规划,也可以用于评估设计方案,为改进灵巧手的运动学结构提供依据.在此基础上,我们将进一步探索定量评价运动映射效果的方法.参考文献 (References)[1]Jau, B.Dextrous telemanipulation w ith four fingered hand system[A].IEEE Int.Conf.on Robotics and Automation[C].M i nneapolis,M N,USA:IEEE Press,1996.338-343.[2]Farry K A,e t al.M yoelectric teleoperation of a complex roboti chand[J].IEEE T rans.on Robotics and Automation,1996,12(5): 775-877.[3]Fischer M,et al.Learning techniques in a dataglove based telemanipulation sys tem for th e DLR hand[A].IEE E Int.Conf.Robotics and Automation[C].Leuven,Belgium:IEEE Press,1998.1603-1608.[4]Kang S B,Ikeuchi K.T ow ard automatic robot instruction from percepti on mapping human grasps to manipulator grasps[J].IEEE Trans.on Robotics and Automation,1997,13(1):81-95.[5]Kawasaki H,e t al.Virtual teaching based on hand manipulabilityfor multi fingered robots[A].IEEE Int.Conf.Robotics an d Au tomation[C].Seoul,Korea:IEEE Press,2001.1388-1393. 
[6]Griffin W B,Findley R P,e t al.Calibration an d mapping of a hum an hand for dexterous telemanipulation[A].2000ASM E IM ECE DSC Symposium on H aptic Interfaces[C].2000.1-8.[7]Turner M L.Programming dexterous manipulation by demonstration[D].Stanford,CA:Stanford Universi ty,2001.[8]李继婷,张玉茹,张启先.人手抓持识别与灵巧手的抓持规划[J].机器人,2002,24(6):530-534.[9]Virtual T echnologies[M].Inc.,CyberGlove User s M anual,1997.[10]Iberall T.Human prehension and dexterous robot han ds[J].T heInt.J.Robotics Research,1997,16(3):285-299.[11]Jie Liu,Yuru Zhang.Dataglove based grasp planning for multi fingered robot hand[A],Accepted by Proc.the11th W orld Congress in M echanism and M achine Sci ence[C].T i anjin,China:China M a chinery Press,2003.作者简介:刘 杰(1975 ),男,博士研究生研究领域:模式识别、虚拟现实、机器人多指手运动规划.张玉茹(1959 ),女,博士,教授,博导,研究领域:机器人机构学、医用机器人、人机交互技术.刘 博(1968 ),男,博士研究生研究领域:机器人多指手设计与规划.451第25卷第5期欧光峰等: 履带式智能弧焊机器人焊缝跟踪控制系统。
3D MOTION ESTIMATION OF HEAD AND SHOULDERSIN VIDEOPHONE SEQUENCESMarkus KampmannInstitut für Theoretische Nachrichtentechnik und InformationsverarbeitungUniversität Hannover, Appelstraße 9A, 30167 Hannover, F.R.Germanyemail: kampmann@tnt.uni–hannover.de,WWW: http://www.tnt.uni–hannover.de/~kampmannABSTRACTIn this paper, an approach for 3D motion estima-tion of head and shoulders of persons in video-phone sequences is presented. Since head and shoulders are linked together by the neck, constraints for the motion of head and shoulders exist which are exploited to improve the motion estimation. In this paper, the human neck joint is modelled by a spherical joint between head and shoulders and its 3D position is calculated. Then, rotation and translation of the shoulders are esti-mated and propagated to the head. Finally, a rota-tion of the head around the neck joint is estimated. The presented approach is applied to the video-phone sequences Akiyo and Claire. Compared with an approach without using a neck joint, ana-tomically correct positions of head and shoulders are estimated.1. INTRODUCTIONFor coding of moving images at low bit rates, an object–based analysis–synthesis coder (OBASC) has been introduced [1]. In an OBASC, each real object is described by a model object. A model object is defined by three sets of parameters de-fining its motion, shape and color. In [2], the shape of a model object is represented by a 3D wire-frame. The motion is defined by 6 parameters describing the translation and rotation of the mo-del object in 3D space. The color parameters de-note luminance and chrominance reflectance on the model object surface. Objects may be articu-lated, i.e. may consist of two or more flexibly connected 3D object components. Each object component has its own set of 3D motion, 3D shape and color parameters. All these parameters have to be estimated automatically by image analysis. In the case of typical videophone sequences, the human body in the sequence can be considered as an articulated object and head and shoulders as object components. All three sets of model pa-rameters have to be estimated for these object components. This contribution deals with esti-mating the 3D motion parameters of head and shoulders.In [5], a hierarchical approach for 3D motion estimation of object components is proposed. No constraints for the spatial location of the object components are considered. However, for articu-lated objects like a human body constraints for the spatial location of the object components exist. These constraints can be exploited to improve the motion estimation. By using joints between object components, constraints for the relative motion between the object components are introduced. In [6][7], 2D joint positions in sequent images are estimated and a 3D joint position is calculated. In [8], the 3D joint position is directly estimated using motion parameters from preceding images. Using these joint positions, motion estimation of articulated objects is carried out [6][7][9]. In these algorithms, no a priori knowledge about the object components head and shoulders and the position of the connecting neck joint on a human body is exploited.In this contribution, an algorithm for 3D motion estimation of head and shoulders in videophone sequences is presented which uses a priori know-ledge about the image content of a videophone sequence. Here, the human neck joint is modelled by a spherical joint between head and shoulders. 
The 3D position of the neck joint is calculated using an automatically generated 3D wireframe of the person in the sequence [4] and knowledge about the position of the neck joint on a human body. This 3D wireframe of the person is split into the object components head and shoulders and the motion of head and shoulders is estimated. First, six motion parameters, namely three rotation …

$P''_h = [R_h]\,(P'_h - J') + J'$    (3.4)

with the rotation matrix $[R_h]$. $[R_h]$ is calculated from the rotation angles $R_h = (R_{h,x}, R_{h,y}, R_{h,z})^T$. For motion estimation, the three rotation parameters $R_h$ have to be estimated. Here, the same algorithm [2] as for the estimation of the motion parameters of the shoulders is used.

4. EXPERIMENTAL RESULTS
The described algorithm has been tested using the videophone sequences Akiyo and Claire (CIF, 10 Hz). The face model from Fig. 1 is adapted automatically to the individual face, the 3D wireframe of the person is generated and split into the object components head and shoulders. Afterwards, the 3D motion of head and shoulders is estimated between two succeeding images of the videophone sequences. Here, the proposed 3D motion estimation algorithm is compared with the hierarchical 3D motion estimation algorithm in [5] which does not use a neck joint for motion estimation. Fig. 3 shows original images of the sequences Akiyo and Claire. Fig. 4 and Fig. 6 show results of the proposed motion estimation algorithm, Fig. 5 results of the motion estimation algorithm from [5]. Since no constraints for the spatial location of head and shoulders are used in [5], anatomically impossible positions of head and shoulders are estimated with the algorithm in [5] (Fig. 5). Using the proposed motion estimation algorithm, anatomically correct positions of head and shoulders are estimated (Fig. 6).

5. CONCLUSIONS
In this paper, a new approach for 3D motion estimation of head and shoulders in videophone sequences is presented. First, the 3D neck joint position of the person in the sequence is calculated using an automatically generated 3D wireframe of the person and knowledge about the position of the neck joint on a human body. Then, translation and rotation of the shoulders are estimated and propagated to the head. Finally, a rotation of the head around the neck joint is estimated. Compared with an algorithm without using a neck joint, anatomically correct positions of head and shoulders are estimated.

6. REFERENCES
[1] H.G. Musmann, M. Hötter, J. Ostermann, "Object-oriented analysis-synthesis coding of moving images", Signal Processing: Image Communications, Vol. 3, No. 2, pp. 117–138, November 1989.
[2] J. Ostermann, "Object-based analysis-synthesis coding based on the source model of moving rigid 3D objects", Signal Processing: Image Communications, Vol. 6, pp. 143–161, May 1994.
[3] M. Kampmann, R. Farhoud, "Precise face model adaptation for semantic coding of videophone sequences", Picture Coding Symposium (PCS '97), Berlin, Germany, pp. 675–680, September 1997.
[4] M. Kampmann, "Automatic Generation of 3D wireframes of human persons for video coding", 7. Dortmunder Fernsehseminar, Dortmund, Germany, pp. 105–108, September 1997.
[5] R. Koch, "Dynamic 3-D scene analysis through synthesis feedback control", IEEE T-PAMI, Vol. 15, No. 6, pp. 556–568, June 1993.
[6] R. Holt, A. Netravali, T. Huang and R. Qian, "Determining Articulated Motion from Perspective Views: A Decomposition Approach", IEEE Workshop on Motion of Non-Rigid and Articulated Objects, Austin, Texas, pp. 126–137, Nov. 1994.
[7] J. Webb and J. Aggarwal, "Structure from motion of rigid and jointed objects", Seventh International Joint Conference on Artificial Intelligence, Vancouver, pp. 686–691, August 1981.
[8] G. Martínez, "Analyse-Synthese-Codierung basierend auf dem Modell bewegter dreidimensionaler, gegliederter Objekte", Dissertation, Universität Hannover, 1998.
[9] G. Martínez, "3D Motion Estimation of Articulated 3D Objects for Object-Based Analysis-Synthesis Coding (OBASC)", International Workshop on Coding Techniques for Very Low Bit-rate Video (VLBV'95), Tokyo, Japan, G.1, November 1995.
Fig. 3: Original images (CIF, 10 Hz): (a) Akiyo frame 57, (b) Akiyo frame 91, (c) Claire frame 32, (d) Claire frame 48.
Fig. 4: 3D wireframe over original image for the proposed algorithm: (a) Akiyo frame 57, (b) Akiyo frame 91, (c) Claire frame 32, (d) Claire frame 48.
Fig. 5: 3D wireframe in a side view for the algorithm in [5]: (a) Akiyo frame 57, (b) Akiyo frame 91, (c) Claire frame 32, (d) Claire frame 48.
Fig. 6: 3D wireframe in a side view for the proposed algorithm: (a) Akiyo frame 57, (b) Akiyo frame 91, (c) Claire frame 32, (d) Claire frame 48.
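Eq. (3.4) above rotates the head points about the neck joint J' by the rotation matrix [R_h]. The short NumPy sketch below illustrates exactly that operation on made-up head vertices; the Euler-angle composition order and all coordinates are assumptions for illustration, not values taken from the paper.

```python
import numpy as np

def rotate_about_joint(points, angles_xyz, joint):
    """Apply P'' = [R](P' - J') + J': rotate a set of 3D points about a joint position."""
    rx, ry, rz = angles_xyz
    Rx = np.array([[1, 0, 0], [0, np.cos(rx), -np.sin(rx)], [0, np.sin(rx), np.cos(rx)]])
    Ry = np.array([[np.cos(ry), 0, np.sin(ry)], [0, 1, 0], [-np.sin(ry), 0, np.cos(ry)]])
    Rz = np.array([[np.cos(rz), -np.sin(rz), 0], [np.sin(rz), np.cos(rz), 0], [0, 0, 1]])
    R = Rz @ Ry @ Rx              # composition order is an assumption, not taken from the paper
    return (points - joint) @ R.T + joint

# Toy head vertices and a neck joint position (invented coordinates)
head_points = np.array([[0.00, 0.20, 0.00],
                        [0.05, 0.25, 0.02],
                        [-0.05, 0.25, 0.02]])
neck_joint = np.array([0.0, 0.10, 0.0])
rotated = rotate_about_joint(head_points, (0.0, np.deg2rad(30.0), 0.0), neck_joint)
print(rotated)
```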