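Face Recognition: Translated Foreign-Language Literature
Face Recognition: Teaching Outline of English Technical Vocabulary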
English technical vocabulary for face recognition (English term: Chinese term)

gallery set: 参考图像集
Probe set (= test set): 测试图像集
face rendering
Facial Landmark Detection: 人脸特征点检测
3D Morphable Model: 3D形变模型
AAM (Active Appearance Model): 主动外观模型
Aging modeling: 老化建模
Aging simulation: 老化模拟
Analysis by synthesis: 综合分析
Aperture stop: 孔径光阑
Appearance Feature: 表观特征
Baseline: 基准系统
Benchmarking: 确定基准
Bidirectional relighting: 双向重光照
Camera calibration: 摄像机标定(校正)
Cascade of classifiers: 级联分类器
face detection: 人脸检测
Facial expression: 面部表情
Depth of field: 景深
Edgelet: 小边特征
Eigen light-fields: 本征光场
Eigenface: 特征脸
Exposure time: 曝光时间
Expression editing: 表情编辑
Expression mapping: 表情映射
Partial Expression Ratio Image (PERI): 局部表情比率图
extrapersonal variations: 类间变化
Eye localization: 眼睛定位
face image acquisition: 人脸图像获取
Face aging: 人脸老化
Face alignment: 人脸对齐
Face categorization: 人脸分类
Frontal faces: 正面人脸
Face Identification: 人脸识别
Face recognition vendor test: 人脸识别供应商测试
Face tracking: 人脸跟踪
Facial action coding system: 面部动作编码系统
Facial aging: 面部老化
Facial animation parameters: 脸部动画参数
Facial expression analysis: 人脸表情分析
Facial landmark: 面部特征点
Facial Definition Parameters: 人脸定义参数
Field of view: 视场
Focal length: 焦距
Geometric warping: 几何扭曲
Street view: 街景
Head pose estimation: 头部姿态估计
Harmonic reflectances: 谐波反射
Horizontal scaling: 水平伸缩
Identification rate: 识别率
Illumination cone: 光照锥
Inverse rendering: 逆向绘制技术
Iterative closest point: 迭代最近点
Lambertian model: 朗伯模型
Light-field: 光场
Local binary patterns: 局部二值模式
Mechanical vibration: 机械振动
Multi-view videos: 多视点视频
Band selection: 波段选择
Capture systems: 获取系统
Frontal lighting: 正面光照
Open-set identification: 开集识别
Operating point: 操作点
Person detection: 行人检测
Person tracking: 行人跟踪
Photometric stereo: 光度立体技术
Pixellation: 像素化
Pose correction: 姿态校正
Privacy concern: 隐私关注
Privacy policies: 隐私策略
Profile extraction: 轮廓提取
Rigid transformation: 刚体变换
Sequential importance sampling: 序贯重要性抽样
Skin reflectance model: 皮肤反射模型
Specular reflectance: 镜面反射
Stereo baseline: 立体基线
Super-resolution: 超分辨率
Facial side-view: 面部侧视图
Texture mapping: 纹理映射
Texture pattern: 纹理模式

Rama Chellappa PhD plan: 1. Complete the earlier work on the statistical modelling of fingerprint minutiae.
Face Recognition Paper: Literature Translation (Chinese and English)
Face Recognition Paper: Chinese-English Appendix (Original Text and Translation)

The original text is taken from Thomas David Heseltine BSc. Hons., The University of York, Department of Computer Science, for the qualification of PhD, September 2005: "Face Recognition: Two-Dimensional and Three-Dimensional Techniques".

4 Two-dimensional Face Recognition

4.1 Feature Localization

Before discussing the methods of comparing two facial images, we take a brief look at some of the preliminary processes of facial feature alignment. This process typically consists of two stages: face detection and eye localisation. Depending on the application, if the position of the face within the image is known beforehand (for a cooperative subject in a door access system, for example) then the face detection stage can often be skipped, as the region of interest is already known. Therefore, we discuss eye localisation here, with a brief discussion of face detection in the literature review (section 3.1.1).

The eye localisation method is used to align the 2D face images of the various test sets used throughout this section. However, to ensure that all results presented are representative of the face recognition accuracy and not a product of the performance of the eye localisation routine, all image alignments are manually checked and any errors corrected prior to testing and evaluation.

We detect the position of the eyes within an image using a simple template-based method. A training set of manually pre-aligned images of faces is taken, and each image is cropped to an area around both eyes. The average image is calculated and used as a template.

Figure 4-1 - The average eyes. Used as a template for eye detection.

Both eyes are included in a single template, rather than individually searching for each eye in turn, as the characteristic symmetry of the eyes either side of the nose provides a useful feature that helps distinguish between the eyes and other false positives that may be picked up in the background. However, this method is highly susceptible to scale (i.e. subject distance from the camera) and also introduces the assumption that the eyes in the image appear near horizontal. Some preliminary experimentation also reveals that it is advantageous to include the area of skin just beneath the eyes. The reason is that in some cases the eyebrows can closely match the template, particularly if there are shadows in the eye sockets, but the area of skin below the eyes helps to distinguish the eyes from the eyebrows (the area just below the eyebrows contains the eyes, whereas the area below the eyes contains only plain skin).

A window is passed over the test images and the absolute difference between the window contents and the average eye image shown above is computed. The area of the image with the lowest difference is taken as the region of interest containing the eyes. Applying the same procedure with a smaller template of the individual left and right eyes then refines each eye position.

This basic template-based method of eye localisation, although providing fairly precise localisations when it succeeds, often fails to locate the eyes altogether. However, we are able to improve performance by including a weighting scheme.

Eye localisation is performed on the set of training images, which is then separated into two sets: those in which eye detection was successful, and those in which eye detection failed. Taking the set of successful localisations, we compute the average distance from the eye template (Figure 4-2 top). Note that the image is quite dark, indicating that the detected eyes correlate closely to the eye template, as we would expect.
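The window search described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation: the function names (`average_template`, `window_error`, `locate_eyes`) and the representation of images as lists of pixel rows are assumptions, and the optional `weights` argument merely shows where a per-pixel weighting scheme would be applied to the difference image before summing the total error.

```python
def average_template(patches):
    """Pixel-wise mean of equally sized, pre-aligned eye patches."""
    rows, cols = len(patches[0]), len(patches[0][0])
    count = len(patches)
    return [[sum(p[r][c] for p in patches) / count for c in range(cols)]
            for r in range(rows)]


def window_error(image, template, top, left, weights=None):
    """Sum of (optionally weighted) absolute differences for one window."""
    total = 0.0
    for r, template_row in enumerate(template):
        for c, expected in enumerate(template_row):
            diff = abs(image[top + r][left + c] - expected)
            # Hypothetical weighting: scale each pixel's difference before summing.
            total += diff * (weights[r][c] if weights is not None else 1.0)
    return total


def locate_eyes(image, template, weights=None):
    """Return (top, left) of the window with the lowest total difference."""
    t_rows, t_cols = len(template), len(template[0])
    best_pos, best_err = None, float("inf")
    for top in range(len(image) - t_rows + 1):
        for left in range(len(image[0]) - t_cols + 1):
            err = window_error(image, template, top, left, weights)
            if err < best_err:
                best_pos, best_err = (top, left), err
    return best_pos
```

The same `locate_eyes` call could then be repeated with a smaller single-eye template inside the detected region to refine each eye position. The exhaustive scan is deliberately simple and makes no attempt to handle changes of scale or rotation.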
However, bright points do occur near the whites of the eyes in the upper image, suggesting that this area is often inconsistent, varying greatly from the average eye template.

Figure 4-2 - Distance to the eye template for successful detections (top), indicating variance due to noise, and failed detections (bottom), showing credible variance due to mis-detected features.

In the lower image (Figure 4-2 bottom), we have taken the set of failed localisations (images of the forehead, nose, cheeks, background etc. falsely detected by the localisation routine) and once again computed the average distance from the eye template. The bright pupils surrounded by darker areas indicate that a failed match is often due to the high correlation of the nose and cheekbone regions overwhelming the poorly correlated pupils. Wanting to emphasise the difference of the pupil regions for these failed matches and minimise the variance of the whites of the eyes for successful matches, we divide the lower image values by the upper image values to produce a weights vector, as shown in Figure 4-3. When applied to the difference image before summing a total error, this weighting scheme provides a much improved detection rate.

Figure 4-3 - Eye template weights used to give higher priority to those pixels that best represent the eyes.

4.2 The Direct Correlation Approach

We begin our investigation into face recognition with perhaps the simplest approach, known as the direct correlation method (also referred to as template matching by Brunelli and Poggio [29]), which involves the direct comparison of pixel intensity values taken from facial images. We use the term 'Direct Correlation' to encompass all techniques in which face images are compared directly, without any form of image space analysis, weighting schemes or feature extraction, regardless of the distance metric used.
Therefore, we do not imply that Pearson's correlation is applied as the similarity function (although such an approach would obviously come under our definition of direct correlation). We typically use the Euclidean distance as our metric in these investigations (it is inversely related to Pearson's correlation and can be considered a scale- and translation-sensitive form of image correlation), as this is consistent with the contrast drawn between image space and subspace approaches in later sections.

Firstly, all facial images must be aligned such that the eye centres are located at two specified pixel coordinates, and each image is cropped to remove any background information. These images are stored as greyscale bitmaps of 65 by 82 pixels and, prior to recognition, converted into a vector of 5330 elements (each element containing the corresponding pixel intensity value). Each such vector can be thought of as describing a point within a 5330-dimensional image space. This simple principle can easily be extended to much larger images: a 256 by 256 pixel image occupies a single point in 65,536-dimensional image space and, again, similar images occupy close points within that space. Likewise, similar faces are located close together within the image space, while dissimilar faces are spaced far apart. Calculating the Euclidean distance d between two facial image vectors (often referred to as the query image q and the gallery image g), we get an indication of similarity. A threshold is then applied to make the final verification decision.

$d = \lVert q - g \rVert, \quad (d \le \text{threshold} \Rightarrow \text{accept}), \quad (d > \text{threshold} \Rightarrow \text{reject})$    Equ. 4-1

4.2.1 Verification Tests

The primary concern in any face recognition system is its ability to correctly verify a claimed identity or determine a person's most likely identity from a set of potential matches in a database. In order to assess a given system's ability to perform these tasks, a variety of evaluation methodologies have arisen.
Some of these analysis methods simulate a specific mode of operation (e.g. secure site access or surveillance), while others provide a more mathematical description of data distribution in some classification space. In addition, the results generated from each analysis method may be presented in a variety of formats. Throughout the experiments in this thesis, we primarily use the verification test as our method of analysis and comparison, although we also use Fisher's Linear Discriminant to analyse individual subspace components in section 7 and the identification test for the final evaluations described in section 8.

The verification test measures a system's ability to correctly accept or reject the proposed identity of an individual. At a functional level, this reduces to two images being presented for comparison, for which the system must return either an acceptance (the two images are of the same person) or a rejection (the two images are of different people). The test is designed to simulate the application area of secure site access. In this scenario, a subject will present some form of identification at a point of entry, perhaps a swipe card, proximity chip or PIN. This identifier is then used to retrieve a stored image from a database of known subjects (often referred to as the target or gallery image), which is compared with a live image captured at the point of entry (the query image). Access is then granted depending on the acceptance/rejection decision.

The results of the test are calculated according to how many times the accept/reject decision is made correctly. In order to execute this test we must first define our test set of face images.
Although the number of images in the test set does not affect the results produced (as the error rates are specified as percentages of image comparisons), it is important to ensure that the test set is sufficiently large that statistical anomalies become insignificant (for example, a couple of badly aligned images matching well). Also, the type of images (high variation in lighting, partial occlusions etc.) will significantly alter the results of the test. Therefore, in order to compare multiple face recognition systems, they must be applied to the same test set.

However, it should also be noted that if the results are to be representative of system performance in a real-world situation, then the test data should be captured under precisely the same circumstances as in the application environment. On the other hand, if the purpose of the experimentation is to evaluate and improve a method of face recognition that may be applied to a range of application environments, then the test data should present the range of difficulties that are to be overcome. This may mean including a greater percentage of 'difficult' images than would be expected in the perceived operating conditions, and hence higher error rates in the results produced.

Below we provide the algorithm for executing the verification test. The algorithm is applied to a single test set of face images, using a single function call to the face recognition algorithm: CompareFaces(FaceA, FaceB). This call is used to compare two facial images, returning a distance score indicating how dissimilar the two face images are: the lower the score, the more similar the two face images. Ideally, images of the same face should produce low scores, while images of different faces should produce high scores.

Every image is compared with every other image; no image is compared with itself and no pair is compared more than once (we assume that the relationship is symmetrical).
Once two images have been compared, producing a distance score, the ground truth is used to determine whether the images are of the same person or of different people. In practical tests this information is often encapsulated as part of the image filename (by means of a unique person identifier). Scores are then stored in one of two lists: a list containing scores produced by comparing images of different people and a list containing scores produced by comparing images of the same person. The final acceptance/rejection decision is made by application of a threshold. Any incorrect decision is recorded as either a false acceptance or a false rejection. The false rejection rate (FRR) is calculated as the percentage of scores from the same people that were classified as rejections. The false acceptance rate (FAR) is calculated as the percentage of scores from different people that were classified as acceptances.

    For IndexA = 0 to length(TestSet) - 1
        For IndexB = IndexA + 1 to length(TestSet) - 1
            Score = CompareFaces(TestSet[IndexA], TestSet[IndexB])
            If IndexA and IndexB are the same person
                Append Score to AcceptScoresList
            Else
                Append Score to RejectScoresList

    For Threshold = Minimum Score to Maximum Score:
        FalseAcceptCount, FalseRejectCount = 0
        For each Score in RejectScoresList
            If Score <= Threshold
                Increase FalseAcceptCount
        For each Score in AcceptScoresList
            If Score > Threshold
                Increase FalseRejectCount
        FalseAcceptRate = FalseAcceptCount / length(RejectScoresList)
        FalseRejectRate = FalseRejectCount / length(AcceptScoresList)
        Add plot to error curve at (FalseRejectRate, FalseAcceptRate)

These two error rates express the inadequacies of the system when operating at a specific threshold value. Ideally, both these figures should be zero, but in reality reducing either the FAR or the FRR (by altering the threshold value) will inevitably result in increasing the other.
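As a hedged illustration, the verification test can be written as runnable Python along the following lines. The sketch follows the definitions of FAR and FRR given in the prose above, so each rate is normalised by the number of comparisons of its own kind (different-person comparisons for FAR, same-person comparisons for FRR). The names `verification_scores`, `error_rates` and `error_curve`, and the representation of the test set as `(person_id, image)` tuples, are assumptions made for this example; `compare_faces` stands in for CompareFaces and is supplied by the caller.

```python
from itertools import combinations


def verification_scores(test_set, compare_faces):
    """Compare every unordered pair once and split scores by ground truth.

    test_set is a list of (person_id, image) tuples (an assumed layout).
    """
    same_person, different_people = [], []
    for (id_a, img_a), (id_b, img_b) in combinations(test_set, 2):
        score = compare_faces(img_a, img_b)
        if id_a == id_b:
            same_person.append(score)
        else:
            different_people.append(score)
    return same_person, different_people


def error_rates(same_person, different_people, threshold):
    """Return (FAR, FRR) at one threshold; score <= threshold means accept."""
    false_accepts = sum(1 for s in different_people if s <= threshold)
    false_rejects = sum(1 for s in same_person if s > threshold)
    return (false_accepts / len(different_people),
            false_rejects / len(same_person))


def error_curve(same_person, different_people):
    """Sweep every observed score as a threshold: (threshold, FAR, FRR)."""
    thresholds = sorted(set(same_person) | set(different_people))
    return [(t, *error_rates(same_person, different_people, t))
            for t in thresholds]
```

Using every observed score as a candidate threshold is one simple way to cover the range from the minimum to the maximum score; a fixed step size would work equally well for plotting purposes.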
To describe the full operating range of a particular system, we therefore vary the threshold value through the entire range of scores produced. The application of each threshold value produces an additional (FAR, FRR) pair, which when plotted on a graph produces the error rate curve shown below.

Figure 4-5 - Example error rate curve produced by the verification test.

The equal error rate (EER) can be seen as the point at which FAR is equal to FRR. This EER value is often used as a single figure representing the general recognition performance of a biometric system and allows for easy visual comparison of multiple methods. However, it is important to note that the EER does not indicate the level of error that would be expected in a real-world application. It is unlikely that any real system would use a threshold value such that the percentage of false acceptances was equal to the percentage of false rejections. Secure site access systems would typically set the threshold such that false acceptances were significantly lower than false rejections: unwilling to tolerate intruders at the cost of inconvenient access denials. Surveillance systems, on the other hand, would require low false rejection rates to successfully identify people in a less controlled environment. Therefore we should bear in mind that a system with a lower EER might not necessarily be the better performer towards the extremes of its operating capability.

There is a strong connection between the above graph and the receiver operating characteristic (ROC) curves also used in such experiments. Both graphs are simply two visualisations of the same results, in that the ROC format uses the True Acceptance Rate (TAR), where TAR = 1.0 - FRR, in place of the FRR, effectively flipping the graph vertically. Another visualisation of the verification test results is to display both the FRR and FAR as functions of the threshold value.
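The EER and the ROC form can both be derived from the same two score lists. The sketch below is one possible approach rather than the procedure used in the thesis: it approximates the EER by taking the swept threshold at which FAR and FRR are closest and averaging the two rates there, and it produces ROC points using TAR = 1.0 - FRR. The function names and the choice of candidate thresholds are assumptions.

```python
def far_frr(same_person, different_people, threshold):
    """(FAR, FRR) at one threshold; score <= threshold means accept."""
    far = sum(1 for s in different_people if s <= threshold) / len(different_people)
    frr = sum(1 for s in same_person if s > threshold) / len(same_person)
    return far, frr


def equal_error_rate(same_person, different_people):
    """Approximate EER: the threshold where |FAR - FRR| is smallest."""
    best = None
    for t in sorted(set(same_person) | set(different_people)):
        far, frr = far_frr(same_person, different_people, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    _, threshold, eer = best
    return threshold, eer


def roc_points(same_person, different_people):
    """(FAR, TAR) pairs for the same sweep, with TAR = 1.0 - FRR."""
    points = []
    for t in sorted(set(same_person) | set(different_people)):
        far, frr = far_frr(same_person, different_people, t)
        points.append((far, 1.0 - frr))
    return points
```

With a finite set of scores the two rates rarely cross exactly at a sampled threshold, which is why the sketch reports the midpoint at the closest approach instead of insisting on FAR == FRR.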
A plot of the FRR and FAR against the threshold provides a reference for determining the threshold value necessary to achieve a specific FRR and FAR. The EER can be seen as the point where the two curves intersect.

Figure 4-6 - Example error rate curve as a function of the score threshold.

The fluctuation of these error curves due to noise and other errors is dependent on the number of face image comparisons made to generate the data. A small dataset that allows only a small number of comparisons will result in a jagged curve, in which large steps correspond to the influence of a single image on a high proportion of the comparisons made. A typical dataset of 720 images (as used in section 4.2.2) provides 258,840 verification operations (720 × 719 / 2); hence a drop of 1% in EER represents an additional 2588 correct decisions, whereas the quality of a single image, which takes part in 719 of those comparisons, could cause the EER to fluctuate by up to 0.28%.

4.2.2 Results

As a simple experiment to test the direct correlation method, we apply the technique described above to a test set of 720 images of 60 different people, taken from the AR Face Database [39]. Every image is compared with every other image in the test set to produce a likeness score, providing 258,840 verification operations from which to calculate false acceptance rates and false rejection rates. The error curve produced is shown in Figure 4-7.

Figure 4-7 - Error rate curve produced by the direct correlation method using no image preprocessing.

We see that an EER of 25.1% is produced, meaning that at the EER threshold approximately one quarter of all verification operations carried out resulted in an incorrect classification. There are a number of well-known reasons for this poor level of accuracy. Tiny changes in lighting, expression or head orientation cause the location in image space to change dramatically. Images in face space are moved far apart due to these image capture conditions, despite being of the same person's face.
The distance between images of different people becomes smaller than the area of face space covered by images of the same person, and hence false acceptances and false rejections occur frequently. Other disadvantages include the large amount of storage necessary for holding many face images and the intensive processing required for each comparison, making this method unsuitable for applications involving a large database. In section 4.3 we explore the eigenface method, which attempts to address some of these issues.
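For concreteness, the direct correlation comparison evaluated in this section (Equ. 4-1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: images are assumed to be pre-aligned greyscale images stored as lists of pixel rows, and the names `image_to_vector`, `compare_faces` and `verify` are hypothetical.

```python
import math


def image_to_vector(image):
    """Flatten a greyscale image (list of pixel rows) into one vector."""
    return [pixel for row in image for pixel in row]


def compare_faces(face_a, face_b):
    """Euclidean distance between two images; lower means more similar."""
    q, g = image_to_vector(face_a), image_to_vector(face_b)
    if len(q) != len(g):
        raise ValueError("images must have the same dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, g)))


def verify(query, gallery, threshold):
    """Accept the claimed identity when d <= threshold, reject otherwise."""
    return compare_faces(query, gallery) <= threshold
```

Every comparison touches each pixel of both images, which makes concrete the storage and processing costs noted above: each 65 by 82 pixel image contributes a 5330-element vector that must be held and scanned in full for every gallery comparison.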
人脸识别论文文献翻译中英文_大学论文
人脸识别论文中英文附录(原文及译文)翻译原文来自Thomas David Heselt ine BSc. Hons. The Un iversity of YorkDepartme nt of Computer Scie neeFor the Qualification of PhD. -- September 2005 -《Face Recog niti on: Two-Dime nsio nal and Three-Dime nsional Tech nique》4 Two-dimensional Face Recognition4.1 Feature LocalizationBefore discuss ing the methods of compari ng two facial images we now take a brief look at some at the prelimi nary processes of facial feature alig nment. This process typically con sists of two stages: face detect ion and eye localisati on. Depe nding on the applicati on, if the positi on of the face with in the image is known beforeha nd (for a cooperative subject in a door access system for example) the n the face detect ion stage can ofte n be skipped, as the regi on of in terest is already known. Therefore, we discuss eye localisati on here, with a brief discussi on of face detect ion in the literature review(sect ion 3.1.1).The eye localisati on method is used to alig n the 2D face images of the various test sets used throughout this section. However, to ensure that all results presented are represe ntative of the face recog niti on accuracy and not a product of the performa nee of the eye localisati on rout ine, all image alig nments are manu ally checked and any errors corrected, prior to testi ng and evaluati on.We detect the position of the eyes within an image using a simple template based method. A training set of manually pre-aligned images of faces is taken, and each image cropped to an area around both eyes. The average image is calculated and used as a template.Figure 4-1 - The average eyes. Used as a template for eye detection.Both eyes are in cluded in a sin gle template, rather tha n in dividually search ing for each eye in turn, as the characteristic symmetry of the eyes either side of the no se, provides a useful feature that helps disti nguish betwee n the eyes and other false positives that may be picked up in the background. 
Although this method is highly susceptible to scale(i.e. subject distance from the camera) and also in troduces the assumpti on that eyes in the image appear n ear horiz on tai. Some preliminary experimentation also reveals that it is advantageous to include the area of skin just ben eath the eyes. The reas on being that in some cases the eyebrows can closely match the template, particularly if thereare shadows in the eye-sockets, but the area of skin below the eyes helps to disti nguish the eyes from eyebrows (the area just below the eyebrows con tai n eyes, whereas the area below the eyes contains only plain skin).A window is passed over the test images and the absolute difference taken to that of the average eye image shown above. The area of the image with the lowest difference is taken as the region of interest containing the eyes. Applying the same procedure using a smaller template of the in dividual left and right eyes the n refi nes each eye positi on.This basic template-based method of eye localisati on, although provid ing fairly preciselocalisati ons, ofte n fails to locate the eyes completely. However, we are able to improve performa nce by in cludi ng a weighti ng scheme.Eye localisati on is performed on the set of training images, which is the n separated in to two sets: those in which eye detect ion was successful; and those in which eye detect ion failed. Taking the set of successful localisatio ns we compute the average dista nce from the eye template (Figure 4-2 top). Note that the image is quite dark, indicating that the detected eyes correlate closely to the eye template, as we would expect. 
However, bright points do occur near the whites of the eye, suggesting that this area is often inconsistent, varying greatly from the average eye template.Figure 4-2 -Distance to the eye template for successful detections (top) indicating variance due to noise and failed detections (bottom) showing credible variance due to miss-detected features.In the lower image (Figure 4-2 bottom), we have take n the set of failed localisati on s(images of the forehead, no se, cheeks, backgro und etc. falsely detected by the localisati on routi ne) and once aga in computed the average dista nce from the eye template. The bright pupils surr oun ded by darker areas in dicate that a failed match is ofte n due to the high correlati on of the nose and cheekb one regi ons overwhel ming the poorly correlated pupils. Wanting to emphasise the differenee of the pupil regions for these failed matches and minimise the varianee of the whites of the eyes for successful matches, we divide the lower image values by the upper image to produce a weights vector as show n in Figure 4-3. When applied to the differe nee image before summi ng a total error, this weight ing scheme provides a much improved detect ion rate.Figure 4-3 - Eye template weights used to give higher priority to those pixels that best represent the eyes.4.2 The Direct Correlation ApproachWe begi n our inv estigatio n into face recog niti on with perhaps the simplest approach,k nown as the direct correlation method (also referred to as template matching by Brunelli and Poggio [29 ]) inv olvi ng the direct comparis on of pixel inten sity values take n from facial images. We use the term ‘ Direct Correlation ' to encompass all techniques in which face images are compareddirectly, without any form of image space an alysis, weight ing schemes or feature extracti on, regardless of the dsta nee metric used. 
Therefore, we do not infer that Pears on ' s correlat applied as the similarity fun cti on (although such an approach would obviously come un der our definition of direct correlation). We typically use the Euclidean distance as our metric in these inv estigati ons (in versely related to Pears on ' s correlati on and can be con sidered as a scale tran slati on sen sitive form of image correlati on), as this persists with the con trast made betwee n image space and subspace approaches in later sect ions.Firstly, all facial images must be alig ned such that the eye cen tres are located at two specified pixel coord in ates and the image cropped to remove any backgro und in formati on. These images are stored as greyscale bitmaps of 65 by 82 pixels and prior to recog niti on con verted into a vector of 5330 eleme nts (each eleme nt containing the corresp onding pixel inten sity value). Each corresp onding vector can be thought of as describ ing a point with in a 5330 dime nsional image space. This simple prin ciple can easily be exte nded to much larger images: a 256 by 256 pixel image occupies a si ngle point in 65,536-dime nsional image space and again, similar images occupy close points within that space. Likewise, similar faces are located close together within the image space, while dissimilar faces are spaced far apart. Calculati ng the Euclidea n dista need, betwee n two facial image vectors (ofte n referred to as the query image q, and gallery imageg), we get an indication of similarity. A threshold is then applied to make the final verification decision.d q g (d threshold ? accept) d threshold ? reject ) . Equ. 4-14.2.1 Verification TestsThe primary concern in any face recognition system is its ability to correctly verify aclaimed identity or determine a person's most likely identity from a set of potential matches in a database. In order to assess a given system ' s ability to perform these tasks, a variety of evaluati on methodologies have arise n. 
Some of these an alysis methods simulate a specific mode of operatio n (i.e. secure site access or surveilla nee), while others provide a more mathematical description of data distribution in some classificatio n space. In additi on, the results gen erated from each an alysis method may be prese nted in a variety of formats. Throughout the experime ntatio ns in this thesis, weprimarily use the verification test as our method of analysis and comparison, although we also use Fisher Lin ear Discrim inant to an alyse in dividual subspace comp onents in secti on 7 and the iden tificati on test for the final evaluatio ns described in sect ion 8. The verificati on test measures a system ' s ability to correctly accept or reject the proposed ide ntity of an in dividual. At a fun cti on al level, this reduces to two images being prese nted for comparis on, for which the system must return either an accepta nee (the two images are of the same pers on) or rejectio n (the two images are of differe nt people). The test is desig ned to simulate the applicati on area of secure site access. In this scenario, a subject will present some form of identification at a point of en try, perhaps as a swipe card, proximity chip or PIN nu mber. This nu mber is the n used to retrieve a stored image from a database of known subjects (ofte n referred to as the target or gallery image) and compared with a live image captured at the point of entry (the query image). Access is the n gran ted depe nding on the accepta nce/rejecti on decisi on.The results of the test are calculated accord ing to how many times the accept/reject decisi on is made correctly. In order to execute this test we must first define our test set of face images. 
Although the nu mber of images in the test set does not affect the results produced (as the error rates are specified as percentages of image comparisons), it is important to ensure that the test set is sufficie ntly large such that statistical ano malies become in sig ni fica nt (for example, a couple of badly aligned images matching well). Also, the type of images (high variation in lighting, partial occlusions etc.) will significantly alter the results of the test. Therefore, in order to compare multiple face recog niti on systems, they must be applied to the same test set.However, it should also be no ted that if the results are to be represe ntative of system performance in a real world situation, then the test data should be captured under precisely the same circumsta nces as in the applicati on en vir onmen t. On the other han d, if the purpose of the experime ntati on is to evaluate and improve a method of face recog niti on, which may be applied to a range of applicati on en vir onmen ts, the n the test data should prese nt the range of difficulties that are to be overcome. This may mea n in cludi ng a greater perce ntage of ‘ difficult would be expected in the perceived operati ng con diti ons and hence higher error rates in the results produced. Below we provide the algorithm for execut ing the verificati on test. The algorithm is applied to a sin gle test set of face images, using a sin gle fun cti on call to the face recog niti on algorithm: CompareFaces(FaceA, FaceB). This call is used to compare two facial images, returni ng a dista nce score in dicat ing how dissimilar the two face images are: the lower the score the more similar the two face images. Ideally, images of the same face should produce low scores, while images of differe nt faces should produce high scores.Every image is compared with every other image, no image is compared with itself and no pair is compared more tha n once (we assume that the relati on ship is symmetrical). 
Once two images have been compared, producing a similarity score, the ground-truth is used to determine if the images are ofthe same person or different people. In practical tests this information is ofte n en capsulated as part of the image file name (by means of a unique pers on ide ntifier). Scores are the n stored in one of two lists: a list containing scores produced by compari ng images of differe nt people and a list containing scores produced by compari ng images of the same pers on. The final accepta nce/reject ion decisi on is made by applicati on of a threshold. Any in correct decision is recorded as either a false acceptance or false rejection. The false rejection rate (FRR) is calculated as the perce ntage of scores from the same people that were classified as rejectio ns. The false accepta nce rate (FAR) is calculated as the perce ntage of scores from differe nt people that were classified as accepta nces.For IndexA = 0 to length (TestSet)For IndexB = lndexA+1 to length (T estSet)Score = CompareFaces (T estSet[IndexA], TestSet[IndexB]) If IndexA and IndexB are the same person Append Score to AcceptScoresListElseAppend Score to RejectScoresListFor Threshold = Minimum Score to Maximum Score:FalseAcceptCount, FalseRejectCount = 0For each Score in RejectScoresListIf Score <= ThresholdIncrease FalseAcceptCountFor each Score in AcceptScoresListIf Score > ThresholdIncrease FalseRejectCountFalseAcceptRate = FalseAcceptCount / Length(AcceptScoresList) FalseRejectRate = FalseRejectCount / length(RejectScoresList) Add plot to error curve at (FalseRejectRate, FalseAcceptRate)These two error rates express the in adequacies of the system whe n operat ing at aspecific threshold value. Ideally, both these figures should be zero, but in reality reducing either the FAR or FRR (by alteri ng the threshold value) will in evitably resultin increasing the other. 
To describe the full operating range of a particular system, we vary the threshold value through the entire range of scores produced. The application of each threshold value produces an additional FAR, FRR pair, which when plotted on a graph produces the error rate curve shown below.
Figure 4-5 - Example Error Rate Curve produced by the verification test.
The equal error rate (EER) can be seen as the point at which FAR is equal to FRR. This EER value is often used as a single figure representing the general recognition performance of a biometric system and allows for easy visual comparison of multiple methods. However, it is important to note that the EER does not indicate the level of error that would be expected in a real-world application. It is unlikely that any real system would use a threshold value such that the percentage of false acceptances was equal to the percentage of false rejections. Secure site access systems would typically set the threshold such that false acceptances were significantly lower than false rejections: unwilling to tolerate intruders, at the cost of inconvenient access denials. Surveillance systems, on the other hand, would require low false rejection rates to successfully identify people in a less controlled environment. Therefore we should bear in mind that a system with a lower EER might not necessarily be the better performer towards the extremes of its operating capability.
There is a strong connection between the above graph and the receiver operating characteristic (ROC) curves also used in such experiments. Both graphs are simply two visualisations of the same results, in that the ROC format uses the True Acceptance Rate (TAR), where TAR = 1.0 - FRR, in place of the FRR, effectively flipping the graph vertically. Another visualisation of the verification test results is to display both the FRR and FAR as functions of the threshold value.
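As a rough sketch of how the EER and the ROC form relate to a list of error-rate points, the Python below picks the operating point where FAR and FRR are closest and converts FRR to TAR. The function names and the sample numbers are invented for illustration; they are not results from the thesis.

```python
def equal_error_rate(curve):
    """Estimate the EER from (threshold, FAR, FRR) points.

    With a finite set of thresholds FAR and FRR rarely coincide exactly,
    so this takes the point where they are closest and averages the two.
    """
    threshold, far, frr = min(curve, key=lambda p: abs(p[1] - p[2]))
    return threshold, (far + frr) / 2.0


def to_roc(curve):
    """Convert to ROC points (FAR, TAR), using TAR = 1.0 - FRR."""
    return [(far, 1.0 - frr) for _, far, frr in curve]


# Made-up operating points, for illustration only.
sample_curve = [
    (0.2, 0.00, 0.60),
    (0.4, 0.05, 0.30),
    (0.6, 0.20, 0.20),
    (0.8, 0.50, 0.05),
    (1.0, 1.00, 0.00),
]

eer_threshold, eer = equal_error_rate(sample_curve)
roc = to_roc(sample_curve)
print(f"EER of {eer:.2f} at threshold {eer_threshold}")
```

A secure access system would pick a point to the left of the EER threshold in this list (lower FAR, higher FRR), which is why the EER alone does not describe behaviour at the extremes.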
Plotting FRR and FAR against the threshold provides a reference for determining the threshold value necessary to achieve a specific FRR and FAR. The EER can be seen as the point where the two curves intersect.
Figure 4-6 - Example error rate curve as a function of the score threshold.
The fluctuation of these error curves due to noise and other errors is dependent on the number of face image comparisons made to generate the data. A small dataset that only allows for a small number of comparisons will result in a jagged curve, in which large steps correspond to the influence of a single image on a high proportion of the comparisons made. A typical dataset of 720 images (as used in section 4.2.2) provides 258,840 verification operations (720 × 719 / 2); hence a drop of 1% EER represents an additional 2,588 correct decisions, whereas the quality of a single image, which takes part in 719 of those comparisons, could cause the EER to fluctuate by up to 0.28%.
4.2.2 Results
As a simple experiment to test the direct correlation method, we apply the technique described above to a test set of 720 images of 60 different people, taken from the AR Face Database [39]. Every image is compared with every other image in the test set to produce a likeness score, providing 258,840 verification operations from which to calculate false acceptance rates and false rejection rates. The error curve produced is shown in Figure 4-7.
Figure 4-7 - Error rate curve produced by the direct correlation method using no image preprocessing.
We see that an EER of 25.1% is produced, meaning that at the EER threshold approximately one quarter of all verification operations carried out resulted in an incorrect classification. There are a number of well-known reasons for this poor level of accuracy. Tiny changes in lighting, expression or head orientation cause the location in image space to change dramatically. Images in face space are moved far apart due to these image capture conditions, despite being of the same person's face.
The distance between images of different people becomes smaller than the area of face space covered by images of the same person, and hence false acceptances and false rejections occur frequently. Other disadvantages include the large amount of storage necessary for holding many face images and the intensive processing required for each comparison, making this method unsuitable for applications applied to a large database. In section 4.3 we explore the eigenface method, which attempts to address some of these issues.
4 Two-Dimensional Face Recognition
4.1 Feature Localization
Before discussing the methods of comparing two facial images, we now take a brief look at some of the preliminary processes of facial feature alignment.
Face Recognition (English)
Application
Face Recognition Access Control System
The face recognition access control system is called FaceGate. Whenever one wishes to access a building, FaceGate verifies the person’s entry code or card, then compares his face with its stored “key.” It registers him as being authorized and allows him to enter the building. Access is denied to anyone whose face does not match.
Fundamentals
Step 3) Recognition process
After step 2, the extracted features of the input face are matched against the faces in the database; as the picture shows, the system outputs the result when a match is found.
Application
Face recognition to pay
Alibaba Group founder Jack Ma showed off the technology Sunday during a CeBIT event that would seamlessly scan users’ faces via their smartphones to verify mobile payments. The technology, called “Smile to Pay,” is being developed
Face Recognition: English Essay
Facial Recognition: A Double-Edged Sword in the Digital Age
In the realm of technology, few innovations have garnered as much attention and debate as facial recognition. This technology, which utilizes artificial intelligence to identify individuals by analyzing their facial features, has been hailed for its potential to revolutionize security, enhance user experiences, and streamline various processes. However, it also raises significant concerns about privacy, ethical use, and the potential for abuse.
One of the most prominent uses of facial recognition is in security systems. Airports, for instance, have begun to implement facial recognition gates that can verify passengers' identities swiftly and accurately, reducing wait times and the risk of human error. Similarly, law enforcement agencies have employed this technology to track down criminals, locate missing persons, and prevent crime by identifying individuals in public spaces.
On the commercial side, facial recognition has been integrated into smartphones for user authentication, providing a convenient and secure method of unlocking devices. Retailers are also exploring its use to personalize shopping experiences, recognize loyal customers, and even predict shopping trends based on demographic data.
Despite these benefits, the technology is not without its critics. Privacy advocates argue that the widespread use of facial recognition could lead to a surveillance state where individuals' movements and behaviors are constantly monitored without their consent. There are also concerns about the accuracy of the technology, with studies showing that it can be less effective for certain demographics, potentially leading to false identifications and unjust consequences.
Moreover, the ethical implications of facial recognition are profound. Who has the right to access this technology and how it is used? What safeguards are in place to prevent misuse?
These questions become even more critical when considering the potential for facial recognition to be used in more invasive ways, such as monitoring political dissent or suppressing minority groups.
Regulation is a key component in addressing these concerns. Governments and international bodies must work together to establish clear guidelines on the use of facial recognition technology. This includes ensuring transparency in how the data is collected, stored, and used, as well as implementing strict penalties for misuse.
In conclusion, facial recognition represents a significant leap forward in technological capability. It has the potential to greatly enhance security, efficiency, and personalization in various sectors. However, it also presents a profound challenge to our understanding of privacy and personal freedom. As this technology continues to evolve, it is imperative that we approach its use with caution, guided by a robust ethical framework and stringent regulation to ensure that it serves as a tool for societal benefit rather than a weapon against individual liberty.
Face Recognition: English Technical Vocabulary (Teaching Content)
gallery set: 参考图像集
Probe set = test set: 测试图像集
face rendering
Facial Landmark Detection: 人脸特征点检测
3D Morphable Model: 3D形变模型
AAM (Active Appearance Model): 主动外观模型
Aging modeling: 老化建模
Aging simulation: 老化模拟
Analysis by synthesis: 综合分析
Aperture stop: 孔径光阑
Appearance Feature: 表观特征
Baseline: 基准系统
Benchmarking: 确定基准
Bidirectional relighting: 双向重光照
Camera calibration: 摄像机标定(校正)
Cascade of classifiers: 级联分类器
face detection: 人脸检测
Facial expression: 面部表情
Depth of field: 景深
Edgelet: 小边特征
Eigen light-fields: 本征光场
Eigenface: 特征脸
Exposure time: 曝光时间
Expression editing: 表情编辑
Expression mapping: 表情映射
Partial Expression Ratio Image (PERI): 局部表情比率图
extrapersonal variations: 类间变化
Eye localization: 眼睛定位
face image acquisition: 人脸图像获取
Face aging: 人脸老化
Face alignment: 人脸对齐
Face categorization: 人脸分类
Frontal faces: 正面人脸
Face Identification: 人脸识别
Face recognition vendor test: 人脸识别供应商测试
Face tracking: 人脸跟踪
Facial action coding system: 面部动作编码系统
Facial aging: 面部老化
Facial animation parameters: 脸部动画参数
Facial expression analysis: 人脸表情分析
Facial landmark: 面部特征点
Facial Definition Parameters: 人脸定义参数
Field of view: 视场
Focal length: 焦距
Geometric warping: 几何扭曲
Street view: 街景
Head pose estimation: 头部姿态估计
Harmonic reflectances: 谐波反射
Horizontal scaling: 水平伸缩
Identification rate: 识别率
Illumination cone: 光照锥
Inverse rendering: 逆向绘制技术
Iterative closest point: 迭代最近点
Lambertian model: 朗伯模型
Light-field: 光场
Local binary patterns: 局部二值模式
Mechanical vibration: 机械振动
Multi-view videos: 多视点视频
Band selection: 波段选择
Capture systems: 获取系统
Frontal lighting: 正面光照
Open-set identification: 开集识别
Operating point: 操作点
Person detection: 行人检测
Person tracking: 行人跟踪
Photometric stereo: 光度立体技术
Pixellation: 像素化
Pose correction: 姿态校正
Privacy concern: 隐私关注
Privacy policies: 隐私策略
Profile extraction: 轮廓提取
Rigid transformation: 刚体变换
Sequential importance sampling: 序贯重要性抽样
Skin reflectance model: 皮肤反射模型
Specular reflectance: 镜面反射
Stereo baseline: 立体基线
Super-resolution: 超分辨率
Facial side-view: 面部侧视图
Texture mapping: 纹理映射
Texture pattern: 纹理模式
Rama Chellappa PhD plan: 1. Complete the earlier work on statistical modeling of fingerprint minutiae.
Face Recognition Techniques: Translated Foreign Literature
Document information
Title: Face Recognition Techniques: A Survey
Author: V. Vijayakumari
Source: World Journal of Computer Application and Technology, 2013, 1(2): 41-50
Length: 3,186 words and 17,705 characters in English; 5,317 characters in Chinese
Original text
Face Recognition Techniques: A Survey
Abstract
Face is the index of mind. It is a complex multidimensional structure and needs a good computing technique for recognition. When an automatic system is used for face recognition, computers are easily confused by changes in illumination, variations in pose and changes in the angle of the face. Numerous techniques are being used for security and authentication purposes, including in detective agencies and the military. This survey presents the existing methods in automatic face recognition and indicates ways to further improve performance.
Keywords: Face Recognition, Illumination, Authentication, Security
1. Introduction
Developed in the 1960s, the first semi-automated system for face recognition required the administrator to locate features (such as eyes, ears, nose, and mouth) on the photographs before it calculated distances and ratios to a common reference point, which were then compared to reference data. In the 1970s, Goldstein, Harmon, and Lesk used 21 specific subjective markers such as hair color and lip thickness to automate the recognition. The problem with both of these early solutions was that the measurements and locations were manually computed. The face recognition problem can be divided into two main stages: face verification (or authentication), and face identification (or recognition). The detection stage is the first stage; it includes identifying and locating a face in an image.
The recognition stage is the second stage; it includes feature extraction, where the information important for discrimination is saved, and matching, where the recognition result is given with the aid of a face database.
2. Methods
2.1. Geometric Feature Based Methods
The geometric feature based approaches are the earliest approaches to face recognition and detection. In these systems, the significant facial features are detected, and the distances among them, as well as other geometric characteristics, are combined in a feature vector that is used to represent the face. To recognize a face, first the feature vectors of the test image and of the images in the database are obtained. Second, a similarity measure between these vectors, most often a minimum distance criterion, is used to determine the identity of the face. As pointed out by Brunelli and Poggio, the template based approaches outperform the early geometric feature based approaches.
2.2. Template Based Methods
The template based approaches represent the most popular technique used to recognize and detect faces. Unlike the geometric feature based approaches, the template based approaches use a feature vector that represents the entire face template rather than the most significant facial features.
2.3. Correlation Based Methods
Correlation based methods for face detection are based on the computation of the normalized cross correlation coefficient Cn. The first step in these methods is to determine the location of the significant facial features such as eyes, nose or mouth. The importance of robust facial feature detection for both detection and recognition has resulted in the development of a variety of different facial feature detection algorithms. The facial feature detection method proposed by Brunelli and Poggio uses a set of templates to detect the position of the eyes in an image, by looking for the maximum absolute values of the normalized correlation coefficient of these templates at each point in the test image.
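The template search described above can be sketched in Python as follows. This is only an illustration of normalized cross correlation on small grey-level grids, not Brunelli and Poggio's implementation; the function names ncc and best_match and the toy image are assumptions made for the example.

```python
import math


def ncc(patch, template):
    """Normalized cross correlation coefficient of two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat patch has no defined correlation; treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template):
    """Slide the template over the image; return ((row, col), score) of the
    position with the largest absolute correlation coefficient."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, -1.0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = abs(ncc(patch, template))
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


template = [[9, 1],
            [1, 9]]
# The same pattern appears at row 1, column 2, brighter and with more contrast.
image = [[0, 0, 0, 0, 0],
         [0, 0, 21, 5, 0],
         [0, 0, 5, 21, 0],
         [0, 0, 0, 0, 0]]
position, score = best_match(image, template)
print(position, round(score, 3))
```

Because both patch and template are mean-centred and normalised, the match survives the change in brightness and contrast, but not a change in scale, which is the limitation the multi-scale templates address.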
To cope with scale variations, a set of templates at different scales was used.
The problems associated with scale variations can be significantly reduced by using hierarchical correlation. For face recognition, the templates corresponding to the significant facial features of the test image are compared in turn with the corresponding templates of all of the images in the database, returning a vector of matching scores computed through normalized cross correlation. The similarity scores of different features are integrated to obtain a global score that is used for recognition. Other similar methods that use correlation or higher-order statistics confirmed the accuracy of these methods but also their complexity.
Beymer extended the correlation based approach to a view based approach for recognizing faces under varying orientation, including rotations with respect to the axis perpendicular to the image plane (rotations in image depth). To handle rotations out of the image plane, templates from different views were used. After the pose is determined, the task of recognition is reduced to the classical correlation method, in which the facial feature templates are matched to the corresponding templates of the appropriate view based models using the cross correlation coefficient. However, this approach is computationally very expensive, and it is sensitive to lighting conditions.
2.4. Matching Pursuit Based Methods
Phillips introduced a template based face detection and recognition system that uses a matching pursuit filter to obtain the face vector. The matching pursuit algorithm applied to an image iteratively selects from a dictionary of basis functions the best decomposition of the image by minimizing the residue of the image in all iterations. The algorithm described by Phillips constructs the best decomposition of a set of images by iteratively optimizing a cost function, which is determined from the residues of the individual images.
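The greedy selection step of matching pursuit can be sketched in Python as below. This shows only the generic single-signal algorithm (pick the atom most correlated with the residue, subtract its contribution, repeat), not the filter design described by Phillips; the one-dimensional Haar-like atoms and the function name matching_pursuit are assumptions made for the example.

```python
import math


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition of `signal` over unit-norm atoms.

    Returns (picks, residual), where picks is a list of (atom_index, coefficient).
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Inner product of the current residue with every atom.
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(products[i]))
        coeff = products[k]
        if coeff == 0.0:
            break  # nothing left that the dictionary can explain
        picks.append((k, coeff))
        # Remove the selected atom's contribution from the residue.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[k])]
    return picks, residual


s = 1.0 / math.sqrt(2.0)
# A tiny Haar-like dictionary on four samples (illustrative only).
haar = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [s, -s, 0.0, 0.0],
    [0.0, 0.0, s, -s],
]
signal = [3.0, 3.0, 1.0, 1.0]   # equals 4 * haar[0] + 2 * haar[1]
picks, residual = matching_pursuit(signal, haar, n_iter=3)
print(picks, residual)
```

The residue shrinks at every iteration, which is the quantity the cost function in the text is built from.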
The dictionary of basis functions used by the author consists of two-dimensional wavelets, which give a better image representation than the PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) based techniques, where the images are stored as vectors. For recognition, the cost function is a measure of distances between faces and is maximized at each iteration. For detection, the goal is to find a filter that clusters similar templates together (around the mean, for example), and the cost function is minimized at each iteration. The feature represents the average value of the projection of the templates on the selected basis.
2.5. Singular Value Decomposition Based Methods
The face recognition methods in this section use the general result stated by the singular value decomposition theorem. Z. Hong revealed the importance of using the Singular Value Decomposition (SVD) method for human face recognition by providing several important properties of the singular value (SV) vector, which include: the stability of the SV vector to small perturbations caused by stochastic variation in the intensity image, the proportional variation of the SV vector with the pixel intensities, and the invariance of the SV feature vector to rotation, translation and mirror transformation. These properties of the SV vector provide the theoretical basis for using singular values as image features. In addition, it has been shown that compressing the original SV vector into a low-dimensional space by means of various mathematical transforms leads to higher recognition performance. Among the various dimensionality reducing transformations, the Linear Discriminant Transform is the most popular one.
2.6. The Dynamic Link Matching Methods
The above template based matching methods use a Euclidean distance to identify a face in a gallery or to detect a face from a background. A more flexible distance measure that accounts for common facial transformations is the dynamic link introduced by Lades et al.
In this approach, a rectangular grid is centered on all faces in the gallery. The feature vector is calculated based on Gabor type wavelets, computed at all points of the grid. A new face is identified if the cost function, which is a weighted sum of two terms, is minimized. The first term in the cost function is small when the distance between feature vectors is small, and the second term is small when the relative distance between the grid points in the test and the gallery image is preserved. It is the second term of this cost function that gives the “elasticity” of this matching measure. While the grid of the image remains rectangular, the grid that is “best fit” over the test image is stretched, under certain constraints, until the minimum of the cost function is achieved. The minimum value of the cost function is then used to identify the unknown face.
2.7. Illumination Invariant Processing Methods
The problem of determining functions of an image of an object that are insensitive to illumination changes is considered. An object with Lambertian reflection has no discriminative functions that are invariant to illumination. This result leads the authors to adopt a probabilistic approach in which they analytically determine a probability distribution for the image gradient as a function of the surface's geometry and reflectance. Their distribution reveals that the direction of the image gradient is insensitive to changes in illumination direction. They verify this empirically by constructing a distribution for the image gradient from more than twenty million samples of gradients in a database of 1,280 images of twenty inanimate objects taken under varying lighting conditions. Using this distribution, they develop an illumination insensitive measure of image comparison and test it on the problem of face recognition.
In another method, they consider only the set of images of an object under variable illumination, including multiple, extended light sources, shadows, and color. They prove that the set of n-pixel monochrome images of a convex object with a Lambertian reflectance function, illuminated by an arbitrary number of point light sources at infinity, forms a convex polyhedral cone in R^n, and that the dimension of this illumination cone equals the number of distinct surface normals. Furthermore, the illumination cone can be constructed from as few as three images. In addition, the set of n-pixel images of an object of any shape and with a more general reflectance function, seen under all possible illumination conditions, still forms a convex cone in R^n. These results immediately suggest certain approaches to object recognition. Throughout, they present results demonstrating the illumination cone representation.
2.8. Support Vector Machine Approach
Face recognition is a K class problem, where K is the number of known individuals, whereas support vector machines (SVMs) are a binary classification method. By reformulating the face recognition problem and reinterpreting the output of the SVM classifier, the authors developed an SVM-based face recognition algorithm. The face recognition problem is formulated as a problem in difference space, which models dissimilarities between two facial images. In difference space, face recognition is formulated as a two class problem. The classes are: dissimilarities between faces of the same person, and dissimilarities between faces of different people. By modifying the interpretation of the decision surface generated by the SVM, a similarity metric between faces is generated that is learned from examples of differences between faces. The SVM-based algorithm is compared with a principal component analysis (PCA) based algorithm on a difficult set of images from the FERET database. Performance was measured for both verification and identification scenarios.
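The difference-space reformulation can be illustrated with a short Python sketch that turns a K class set of labelled feature vectors into a two class training set. Training the SVM itself is not shown; the function name difference_space, the element-wise difference and the +1/-1 labels are assumptions made for the example.

```python
from itertools import combinations


def difference_space(samples):
    """Build a two class training set from labelled face feature vectors.

    samples -- list of (person_id, feature_vector)
    Returns (differences, labels): label +1 for a same-person pair,
    -1 for a pair from different people.
    """
    differences, labels = [], []
    for (id_a, vec_a), (id_b, vec_b) in combinations(samples, 2):
        differences.append([a - b for a, b in zip(vec_a, vec_b)])
        labels.append(1 if id_a == id_b else -1)
    return differences, labels


# Toy two-dimensional "face features" for three people (illustrative only).
samples = [
    ("ann", [1.0, 2.0]),
    ("ann", [1.5, 2.5]),
    ("bob", [4.0, 0.0]),
    ("cat", [0.0, 5.0]),
]
differences, labels = difference_space(samples)
print(list(zip(differences, labels)))
```

However many individuals are enrolled, the classifier only ever sees two classes, which is what lets a binary method such as an SVM be applied.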
In the FERET comparison, identification performance is 77-78% for SVM versus 54% for PCA. For verification, the equal error rate is 7% for SVM and 13% for PCA.
2.9. Karhunen-Loeve Expansion Based Methods
2.9.1. Eigen Face Approach
In this approach, the face recognition problem is treated as an intrinsically two-dimensional recognition problem. The system works by projecting face images onto a feature space that represents the significant variations among known faces. These significant features are characterized as the eigenfaces. They are actually the eigenvectors. The goal is to develop a computational model of face recognition that is fast, reasonably simple and accurate in constrained environments. The eigenface approach is motivated by information theory.
2.9.2. Recognition Using Eigen Features
While the classical eigenface method uses the KLT (Karhunen-Loeve Transform) coefficients of the template corresponding to the whole face image, Pentland et al. introduced a face detection and recognition system that uses the KLT coefficients of the templates corresponding to the significant facial features such as eyes, nose and mouth. For each of the facial features, a feature space is built by selecting the most significant “eigenfeatures”, which are the eigenvectors corresponding to the largest eigenvalues of the feature's correlation matrix. The significant facial features were detected using the distance from the feature space and selecting the closest match. The similarity scores between the templates of the test image and the templates of the images in the training set were integrated into a cumulative score that measures the distance between the test image and the training images.
The method was extended to the detection of features under different viewing geometries by using either a view-based eigenspace or a parametric eigenspace.
2.10. Feature Based Methods
2.10.1. Kernel Direct Discriminant Analysis Algorithm
The kernel machine-based discriminant analysis method deals with the nonlinearity of the face patterns’ distribution. This method also effectively solves the so-called “small sample size” (SSS) problem, which exists in most face recognition (FR) tasks. The new algorithm has been tested, in terms of classification error rate performance, on the multiview UMIST face database. Results indicate that the proposed methodology is able to achieve excellent performance with only a very small set of features being used, and its error rate is approximately 34% and 48% of those of two other commonly used kernel FR approaches, the kernel-PCA (KPCA) and the Generalized Discriminant Analysis (GDA), respectively.
2.10.2. Features Extracted from Walshlet Pyramid
A novel Walshlet Pyramid based face recognition technique uses the image feature set extracted from Walshlets applied to the image at various levels of decomposition. Here the image features are extracted by applying the Walshlet Pyramid on the gray plane (the average of red, green and blue). The proposed technique is tested on two image databases having 100 images each. The results show that Walshlet level-4 outperforms the other Walshlets and the Walsh Transform, because the higher level Walshlets give very coarse color-texture features while the lower level Walshlets represent very fine color-texture features, which are less useful for differentiating the images in face recognition.
2.10.3. Hybrid Color and Frequency Features Approach
This correspondence presents a novel hybrid Color and Frequency Features (CFF) method for face recognition.
The CFF method, which applies an Enhanced Fisher Model (EFM), extracts the complementary frequency features in a new hybrid color space for improving face recognition performance. The new color space, the RIQ color space, which combines the component image R of the RGB color space and the chromatic components I and Q of the YIQ color space, displays prominent capability for improving face recognition performance due to the complementary characteristics of its component images. The EFM then extracts the complementary features from the real part, the imaginary part, and the magnitude of the R image in the frequency domain. The complementary features are then fused by means of concatenation at the feature level to derive similarity scores for classification. The complementary feature extraction and feature level fusion procedure applies to the I and Q component images as well. Experiments on the Face Recognition Grand Challenge (FRGC) show that i) the hybrid color space improves face recognition performance significantly, and ii) the complementary color and frequency features further improve face recognition performance.
2.10.4. Multilevel Block Truncation Coding Approach
Multilevel Block Truncation Coding for face recognition uses all four levels of Multilevel Block Truncation Coding for feature vector extraction, resulting in four variations of the proposed face recognition technique. The experimentation has been conducted on two different face databases. The first one is Face Database, which has 1000 face images, and the second one is “Our Own Database”, which has 1600 face images. To measure the performance of the algorithm, the False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR) parameters have been used.
The experimental results have shown that BTC (Block Truncation Coding) Level 4 gives better accuracy than the other BTC levels, at the cost of increased feature vector size.
2.11. Neural Network Based Algorithms
Templates have also been used as input to Neural Network (NN) based systems. Lawrence et al. proposed a hybrid neural network approach that combines local image sampling, a self-organizing map (SOM) and a convolutional neural network. The SOM provides a set of features that gives a more compact and robust representation of the image samples. These features are then fed into the convolutional neural network. This architecture provides partial invariance to translation, rotation, scale and face deformation. In addition, an efficient probabilistic decision based neural network (PDBNN) was introduced for face detection and recognition. The feature vector used consists of intensity and edge values obtained from the facial region of the down-sampled image in the training set. The facial region contains the eyes and nose, but excludes the hair and mouth. Two PDBNNs were trained with these feature vectors, one used for face detection and the other for face recognition.
2.12. Model Based Methods
2.12.1. Hidden Markov Model Based Approach
In this approach, the author utilizes the fact that the most significant facial features of a frontal face, which include hair, forehead, eyes, nose and mouth, occur in a natural order from top to bottom even if the image undergoes small rotations in the image plane or in the plane perpendicular to the image plane. A one-dimensional HMM (Hidden Markov Model) is used for modeling the image, where the observation vectors are obtained from DCT or KLT coefficients.
Given c face images for each subject of the training set, the goal of the training stage is to optimize the parameters of the Hidden Markov Model to best describe the observations, in the sense of maximizing the probability of the observations given the model. Recognition is carried out by matching the test image against each of the trained models. To do this, the image is converted to an observation sequence and then model likelihoods are computed for each face model. The model with the highest likelihood reveals the identity of the unknown face.
2.12.2. The Volumetric Frequency Representation of Face Model
A face model that incorporates both the three-dimensional (3D) face structure and its two-dimensional representation (face images) is explained. This model, which represents a volumetric (3D) frequency representation (VFR) of the face, is constructed using a range image of a human head. Making use of an extension of the projection slice theorem, the Fourier transform of any face image corresponds to a slice in the face VFR. For both pose estimation and face recognition, a face image is indexed in the 3D VFR based on correlation matching in a four-dimensional Fourier space, parameterized over the elevation, azimuth, rotation in the image plane and the scale of faces.
3. Conclusion
This paper discusses the different approaches which have been employed in automatic face recognition. In the geometrical based methods, the geometrical features are selected and the significant facial features are detected. The correlation based approach needs the face template rather than the significant facial features. Singular value vectors and the properties of the SV vector provide the theoretical basis for using singular values as image features. The Karhunen-Loeve expansion works by projecting the face images onto a space that represents the significant variations among the known faces. Eigenvalues and eigenvectors are involved in extracting the features in KLT.
Neural network based approaches are more efficient when the network contains no more than a few hundred weights. The Hidden Markov model optimizes the parameters to best describe the observations, in the sense of maximizing the probability of the observations given the model. Some methods use the features for classification and a few methods use the distance measure from the nodal points. The drawbacks of the methods are also discussed based on the performance of the algorithms used in the approaches. Hence this gives some idea of the existing methods for automatic face recognition.
Chinese translation
Face Recognition Techniques: A Survey
Abstract
Face is the index of mind.
TOEFL Reading Practice: Facial Recognition
Facial recognition
Nowhere to hide
Facial recognition is not just another technology. It will change society
THE human face is a remarkable piece of work. The astonishing variety of facial features helps people recognise each other and is crucial to the formation of complex societies. So is the face’s ability to send emotional signals, whether through an involuntary blush or the artifice of a false smile. People spend much of their waking lives, in the office and the courtroom as well as the bar and the bedroom, reading faces, for signs of attraction, hostility, trust and deceit. They also spend plenty of time trying to dissimulate.
The human face is a masterpiece.
面部特征之纷繁各异令人惊叹,它让人们能相互辨认,也是形成复杂社会群体的关键。
人脸传递情感信号的功能也同样重要,无论是通过下意识的脸红还是有技巧的假笑。
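Reference Translation of Foreign Literature on Face Recognition
(The document contains both the English original and its Chinese translation.)
Translation: A PCA-Based Method for Real-Time Face Detection and Tracking
Abstract: This paper proposes a method for real-time face detection and tracking against complex backgrounds.
The method is based on principal component analysis (PCA).
To detect a face, we first use a skin-color model and some motion information (such as posture, gestures, and eye movement).
PCA is then applied to the candidate regions found in this way to determine the true position of the face.
Face tracking is based on the Euclidean distance, measured in feature space, between the previously tracked face and the most recently detected face.
The camera controller used for face tracking works by driving a pan/tilt platform so that the detected face region stays at the center of the screen.
The method can also be extended to other systems, such as teleconferencing and intruder inspection systems.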
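The tracking rule in the abstract (pick the newly detected face that is closest, in feature space, to the previously tracked face) can be sketched as follows. This is only an illustration under stated assumptions: the function name `match_tracked_face` and the three-dimensional feature vectors are hypothetical, and the feature vectors are assumed to be PCA projections that have already been computed elsewhere.

```python
import numpy as np

def match_tracked_face(prev_features, candidate_features):
    """Return (index, distance) of the candidate nearest to the tracked face.

    prev_features: 1-D feature vector of the previously tracked face.
    candidate_features: 2-D array, one feature vector per detected face.
    Both are assumed to live in the same (e.g. PCA) feature space.
    """
    prev = np.asarray(prev_features, dtype=float)
    cands = np.asarray(candidate_features, dtype=float)
    # Euclidean distance from the tracked face to every candidate
    dists = np.linalg.norm(cands - prev, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])

# Hypothetical feature vectors, for illustration only
prev_face = [0.9, -1.2, 0.3]
detections = [[3.0, 0.5, -2.0], [1.0, -1.0, 0.3], [-0.5, 2.2, 1.1]]
best_index, best_distance = match_tracked_face(prev_face, detections)
```

In a real tracker one would probably also reject the match when `best_distance` exceeds some threshold, so that a different person entering the scene is not mistaken for the tracked face; the abstract does not say how the original method handles this.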
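1. Introduction
Video signal processing has many applications, such as teleconferencing for visual communication and lip-reading systems for people with disabilities.
In many of the systems mentioned above, face detection and tracking is an indispensable component.
This paper deals with real-time tracking of face regions [1-3].
In general, tracking methods can be divided into two classes, depending on the point of view taken.
Some authors divide face tracking into recognition-based tracking and motion-based tracking, while others divide it into edge-based tracking and region-based tracking [4].
Recognition-based tracking is built on object recognition techniques, and the performance of the tracking system is limited by the efficiency of the recognition method.
Motion-based tracking relies on motion detection techniques, which can be divided into optical-flow methods and motion-energy methods.
Edge-based methods track the edges in an image sequence; these edges are usually the boundaries of the main objects.
However, because the tracked object must show clear edge changes under the given color and illumination conditions, these methods are sensitive to changes in color and illumination.
In addition, when the background of an image has strong edges, it is difficult to obtain reliable tracking results.
Much of the current literature on this class of methods derives from the work of Kass et al. on snakes (active contour models) [5].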
Foreign-Language Literature on Face Recognition
Method of Face Recognition Based on Red-Black Wavelet Transform and PCA
Yuqing He, Huan He, and Hongying Yang
Department of Opto-Electronic Engineering, Beijing Institute of Technology, Beijing, P.R. China, 100081
20701170@.cn
Abstract. With the development of the man-machine interface and recognition technology, face recognition has become one of the most important research topics in the field of biometric recognition. PCA (Principal Components Analysis) has been applied to recognition on many face databases and has achieved good results. However, PCA has its limitations: a large volume of computation and low discriminating ability.
In view of these limitations, this paper puts forward a face recognition method based on the red-black wavelet transform and PCA. An improved histogram equalization is used for image pre-processing in order to compensate for illumination. Then the red-black wavelet sub-band that contains the information of the original image is used to extract features and perform matching.
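As a rough illustration of the pre-processing step, the sketch below implements standard global histogram equalization for an 8-bit grayscale image. It is not the "improved" variant the abstract refers to, whose details are not given in this excerpt; the function name `equalize_histogram` and the tiny test image are assumptions made for the example.

```python
import numpy as np

def equalize_histogram(image):
    """Standard global histogram equalization for an 8-bit grayscale image."""
    image = np.asarray(image, dtype=np.uint8)
    # Histogram and cumulative distribution over the 256 gray levels
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest level present
    total = image.size
    if total == cdf_min:               # constant image: nothing to stretch
        return image.copy()
    # Map each gray level so that the output CDF is approximately linear
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]

# A small, dark, low-contrast image (hypothetical values)
dark = np.array([[10, 10, 20], [20, 30, 30]], dtype=np.uint8)
equalized = equalize_histogram(dark)
```

After equalization the gray levels of the example image span the full 0 to 255 range, which is the effect that makes the step useful for compensating uneven illumination before feature extraction.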
English-Language Literature on Face Recognition
A Parallel Framework for Multilayer Perceptron for Human Face Recognition

Debotosh Bhattacharjee, debotosh@
Reader, Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, India.
Mrinal Kanti Bhowmik, mkb_cse@yahoo.co.in
Lecturer, Department of Computer Science and Engineering, Tripura University (A Central University), Suryamaninagar-799130, Tripura, India.
Mita Nasipuri, mitanasipuri@
Professor, Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, India.
Dipak Kumar Basu, dipakkbasu@
Professor, AICTE Emeritus Fellow, Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, India.
Mahantapas Kundu, mkundu@cse.jdvu.ac.in
Professor, Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, India.

Abstract
Artificial neural networks have already shown their success in face recognition and similar complex pattern recognition tasks. However, a major disadvantage of the technique is that it is extremely slow during training for larger classes and hence not suitable for real-time complex problems such as pattern recognition. This is an attempt to develop a parallel framework for the training algorithm of a perceptron. In this paper, two general architectures for a Multilayer Perceptron (MLP) have been demonstrated. The first architecture is All-Class-in-One-Network (ACON), where all the classes are placed in a single network, and the second one is One-Class-in-One-Network (OCON), where an individual single network is responsible for each and every class. Capabilities of these two architectures were compared and verified in solving human face recognition, which is a complex pattern recognition task where several factors affect the recognition performance, like pose variations, facial expression changes, occlusions, and most importantly illumination changes.
Both the structures were implemented and tested for face recognition, and experimental results show that the OCON structure performs better than the generally used ACON in terms of the training convergence speed of the network. Unlike the conventional sequential approach of training neural networks, the OCON technique may be implemented by training all the classes of face images simultaneously.

Keywords: Artificial Neural Network, Network architecture, All-Class-in-One-Network (ACON), One-Class-in-One-Network (OCON), PCA, Multilayer Perceptron, Face recognition.

1. INTRODUCTION
Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze [1]. This proposed work describes how an Artificial Neural Network (ANN) can be designed and implemented over a parallel or distributed environment to reduce its training time. Generally, an ANN goes through three different steps: training of the network, testing of it, and final use of it. The final structure of an ANN is generally found experimentally, which requires a huge amount of computation. Moreover, the training time of an ANN is very large when the classes are not linearly separable and overlap in nature. Therefore, to save computation time and achieve good response time, the obvious choice is either a high-end machine or a system that is a collection of machines with low computational power.
In this work, we consider a multilayer perceptron (MLP) for human face recognition, which has many real-time applications, from automatic daily attendance checking and allowing authorized people to enter highly secured areas to detecting and preventing criminals, and so on.
For all these cases, response time is critical. Face recognition has the benefit of being a passive, nonintrusive system for verifying personal identity. The techniques used in the best face recognition systems may depend on the application of the system.
Human face recognition is, altogether, a very complex pattern recognition problem. The input pattern is not stable, because of different expressions and adornments in the input images. Sometimes distinguishing features appear similar and make the decision very difficult. There are also several other factors that complicate the face recognition task. Some of them are given below.
a) The background of the face image can be a complex pattern or almost the same color as the face.
b) Different illumination levels at different parts of the image.
c) The direction of illumination may vary.
d) Tilting of the face.
e) Rotation of the face through different angles.
f) Presence/absence of a beard and/or moustache.
g) Presence/absence of spectacles/glasses.
h) Changes in expression such as disgust, sadness, happiness, fear, anger, surprise, etc.
i) Deliberate change in the color of the skin and/or hair to deceive the designed system.
From the above discussion it can be claimed that the face recognition problem, along with face detection, is very complex in nature. To solve it, we require a complex neural network, which takes a large amount of time to finalize its structure and to settle its parameters.
In this work, a different architecture has been used to train a multilayer perceptron faster. Instead of placing all the classes in a single network, individual networks are used for each of the classes. Because each network sees fewer samples and fewer conflicts in the assignment of patterns to their respective classes, the latter model appears to be faster than the former.

2. ARTIFICIAL NEURAL NETWORK
Artificial neural networks (ANN) have been developed as generalizations of mathematical models of biological nervous systems.
A first wave of interest in neural networks (also known as connectionist models or parallel distributed processing) emerged after the introduction of simplified neurons by McCulloch and Pitts (1943).
The basic processing elements of neural networks are called artificial neurons, or simply neurons or nodes. In a simplified mathematical model of the neuron, the effects of the synapses are represented by connection weights that modulate the effect of the associated input signals, and the nonlinear characteristic exhibited by neurons is represented by a transfer function. The neuron impulse is then computed as the weighted sum of the input signals, transformed by the transfer function. The learning capability of an artificial neuron is achieved by adjusting the weights in accordance with the chosen learning algorithm. A neural network has to be configured such that the application of a set of inputs produces the desired set of outputs. Various methods exist for setting the strengths of the connections. One way is to set the weights explicitly, using a priori knowledge. Another way is to train the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. The learning situations in neural networks may be classified into three distinct sorts: supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, an input vector is presented at the inputs together with a set of desired responses, one for each node, at the output layer. A forward pass is done, and the errors or discrepancies between the desired and actual response for each node in the output layer are found. These are then used to determine weight changes in the net according to the prevailing learning rule. The term supervised originates from the fact that the desired signals on individual output nodes are provided by an external teacher [3]. Feed-forward networks have already been used successfully for human face recognition.
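The simplified neuron model described in this section (a weighted sum of the inputs passed through a transfer function) can be written down in a few lines. This is a minimal sketch, not the authors' implementation: the logistic sigmoid as transfer function and the optional `bias` term are assumptions, since the text does not name a specific transfer function.

```python
import numpy as np

def sigmoid(z):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(inputs, weights, bias=0.0, transfer=sigmoid):
    """Weighted sum of the input signals, transformed by the transfer function."""
    activation = float(np.dot(weights, inputs) + bias)
    return float(transfer(activation))

# Hypothetical input signals and connection weights
x = np.array([1.0, 0.0, -1.0])
w = np.array([0.5, -0.3, 0.5])
y = neuron_output(x, w)
```

Learning, in this picture, means adjusting `w` (and `bias`) so that the outputs move toward the desired responses supplied by the external teacher.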
Feed-forward means that there is no feedback to the input. Just as human beings learn from mistakes, neural networks can also learn from their mistakes by giving feedback to the input patterns. This kind of feedback would be used to reconstruct the input patterns and make them free from error, thus increasing the performance of the neural networks. Of course, it is very complex to construct such neural networks. These kinds of networks are called auto-associative neural networks. As the name implies, they use back-propagation algorithms. One of the main problems associated with back-propagation algorithms is local minima. In addition, neural networks have issues associated with learning speed, architecture selection, feature representation, modularity, and scaling. Though there are problems and difficulties, the potential advantages of neural networks are vast. Pattern recognition can be done both on conventional computers and with neural networks. Computers use conventional arithmetic algorithms to detect whether the given pattern matches an existing one. It is a straightforward method: it says either yes or no, and it does not tolerate noisy patterns. Neural networks, on the other hand, can tolerate noise and, if trained properly, will respond correctly to unknown patterns. Neural networks may not perform miracles, but if constructed with the proper architecture and trained correctly with good data, they will give amazing results, not only in pattern recognition but also in other scientific and commercial applications [4].

2A. Network Architecture
The computing world has a lot to gain from neural networks. Their ability to learn by example makes them very flexible and powerful. Once a network is trained properly, there is no need to devise an algorithm to perform a specific task, i.e., no need to understand the internal mechanisms of that task.
The neural network architecture generally used is All-Class-in-One-Network (ACON), where all the classes are lumped into one super-network. Hence, implementing such an ACON structure in a parallel environment is not possible. The ACON structure also has disadvantages: the super-network carries the burden of simultaneously satisfying all the error constraints, so the number of nodes in the hidden layers tends to be large. In the All-Classes-in-One-Network (ACON) structure, shown in Figure 1(a), one single network is designed to classify all the classes, whereas in One-Class-in-One-Network (OCON), shown in Figure 1(b), a single network is dedicated to recognizing one particular class. For each class, a network is created with all the training samples of that class as positive examples, called class-one, while the negative examples for that class, i.e., exemplars from other classes, constitute class-two. Thus, this classification problem is a two-class partitioning problem. As far as implementation is concerned, the structure of the network remains the same for all classes and only the weights vary. Because the network remains the same, weights are kept in separate files, and the input image is identified on the basis of its feature vector and the stored weights, applied to the network one by one for all the classes.
Figure 1: (a) All-Classes-in-One-Network (ACON); (b) One-Class-in-One-Network (OCON).
Empirical results confirm that the convergence rate of ACON degrades drastically with network size, because the training of hidden units is influenced by (potentially conflicting) signals from different teachers. If the topology is changed to the One-Class-in-One-Network (OCON) structure, where one sub-network is designated and responsible for one class only, then each sub-network specializes in distinguishing its own class from the others. So, the number of hidden units is usually small.

2B. Training of an ANN
In the training phase the main goal is to utilize the resources as much as possible and speed up the computation. Hence, the computation involved in training is distributed over the system to reduce response time. The training procedure can be given as:
(1) Retrieve the topology of the neural network given by the user,
(2) Initialize the required parameters and the weight vector necessary to train the network,
(3) Train the network as per the network topology and available parameters for all exemplars of different classes,
(4) Run the network with test vectors to test the classification ability,
(5) If the result found in step 4 is not satisfactory, loop back to step 2 to change parameters such as the learning parameter, momentum, number of iterations, or even the weight vector,
(6) If the testing results do not improve by step 5, then go back to step 1,
(7) The best possible (optimal) topology and associated parameters found in step 5 and step 6 are stored.
Although parallel systems are already in use, some problems cannot exploit their advantages because of their inherently sequential execution characteristics. Therefore, it is necessary to find an equivalent algorithm that is executable in parallel.
In the case of OCON, the individual small networks, each with a minimal load and each responsible for a different class (e.g., k classes), can easily be trained on k different processors, and the training time should reduce drastically.
To fit into this parallel framework, the previous training procedure can be modified as follows:
(1) Retrieve the topology of the neural network given by the user,
(2) Initialize the required parameters and the weight vector necessary to train the network,
(3) Distribute all the classes (say k) to the available processors (possibly k) by some optimal process allocation algorithm,
(4) Ensure that the corresponding processors retrieve the exemplar vectors of their respective classes,
(5) Train the networks as per the network topology and available parameters for all exemplars of different classes,
(6) Run the networks with test vectors to test the classification ability,
(7) If the result found in step 6 is not satisfactory, loop back to step 2 to change parameters such as the learning parameter, momentum, number of iterations, or even the weight vector,
(8) If the testing results do not improve by step 7, then go back to step 1,
(9) The best possible (optimal) topology and associated parameters found in step 7 and step 8 are stored,
(10) Store the weights per class, with identification, on more than one computer [2].
During the training of the two topologies (OCON and ACON), we used a total of 200 images of 10 different classes, with different poses and different illuminations. Sample images used during training are shown in Figure 2. We implemented both topologies using MATLAB. When training both topologies, we set the maximum number of epochs (or iterations) to 700000. Training stops if the number of iterations exceeds this limit or the performance goal is met. Here, the performance goal was taken as 10^-6. We performed 10 different training runs for the 10 different classes for OCON and one single training run for ACON covering all 10 classes. For the OCON networks, the performance goal was met in all 10 training cases, and in less time than for ACON.
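The OCON idea, one independently trained "class versus rest" network per class, can be illustrated with a deliberately simplified sketch. The assumptions are significant and worth stating loudly: a single logistic unit stands in for each per-class MLP, the data are synthetic 2-D points rather than face features, and a thread pool is used only to show that the per-class training jobs do not depend on each other. None of the names below (`train_one_class`, `train_ocon`, `predict_ocon`) come from the paper, whose implementation was in MATLAB.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def train_one_class(X, is_member, lr=0.1, epochs=500):
    """Train one binary 'class vs. rest' unit by batch gradient descent.

    A single logistic unit is used here as a stand-in for a per-class MLP.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        z = np.clip(Xb @ w, -30.0, 30.0)            # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        w += lr * Xb.T @ (is_member - p) / len(Xb)
    return w

def train_ocon(X, labels, n_classes):
    """Train one unit per class; the jobs are independent of each other."""
    targets = [(labels == c).astype(float) for c in range(n_classes)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda t: train_one_class(X, t), targets))

def predict_ocon(weights, X):
    """Apply every per-class unit and pick the class with the highest score."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    scores = np.stack([Xb @ w for w in weights], axis=1)
    return scores.argmax(axis=1)

# Synthetic, well-separated 2-D data for three hypothetical classes
rng = np.random.default_rng(0)
centers = np.array([[0.0, 5.0], [-5.0, -3.0], [5.0, -3.0]])
labels = np.repeat(np.arange(3), 20)
X = centers[labels] + 0.3 * rng.standard_normal((60, 2))

weights = train_ocon(X, labels, n_classes=3)
accuracy = float((predict_ocon(weights, X) == labels).mean())
```

Because each call to `train_one_class` touches only its own target vector and weight vector, the jobs could equally be sent to separate processes or machines, which is the property the modified procedure above relies on. Adding a class means training one more unit without retraining the others.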
After completing the training phase for the two topologies, we tested both networks using test images that were not used in training.

2C. Testing Phase
During testing, finding the class in the database with minimum distance does not necessarily stop the testing procedure. Testing is complete only after all the registered classes have been tested. During testing, the following points were taken into account:
(1) The weights of the different classes already available are again distributed over the available computers to test a particular image given as input,
(2) The allocation of tasks to different processors is based on the testing time and the inter-processor communication overhead. The communication overhead should be much less than the testing time for the distribution of testing to succeed, and
(3) It is the weight vector of a class that matters, not the computer that computed it.
The testing of a class can be done on any computer, as the topology and the weight vector of that class are known. Thus, the system can be fault tolerant [2]. For testing, we used a total of 200 images. Of these, 100 images are taken from the same classes that were used during training, and 100 images are from other classes not used during training. For both topologies (ACON and OCON), we chose 20 images for each test: 10 images from the same class used during training, as positive exemplars, and 10 images from other classes of the database, as negative exemplars.

2D. Performance measurement
The performance of this system can be measured using the following parameters:
(1) resource sharing: Terminals that remain idle most of the time can be used as part of this system. Once the weights are finalized, anyone on the network can use them, even without satisfying the optimal testing time criterion. This can be done through the Internet, attempting to end the "tyranny of geography",
(2) high reliability: Here we are concerned with the reliability of the proposed system, not the inherent fault-tolerance property of the neural network. Reliability comes from the distribution of computed weights over the system. If any of the computers (or processors) connected to the network goes down, the system still works. Some applications, such as security monitoring and crime prevention systems, require that the system keep working, whatever the performance may be,
(3) cost effectiveness: If we use several small personal computers instead of high-end computing machines, we achieve a better price/performance ratio,
(4) incremental growth: If the number of classes increases, the complete computation, including the additional complexity, can be completed without disturbing the existing system.
Based on the analysis of the performance of our two topologies, the recognition rates in Table 1 and Table 2 show that OCON achieves a better recognition rate than ACON. The comparison in terms of training time can easily be observed in Figure 3 (Figure 3 (a) to (k)). In the case of OCON, the performance goals met for the 10 different classes are 9.99999e-007, 1e-006, 9.99999e-007, 9.99998e-007, 1e-006, 9.99998e-007, 1e-006, 9.99997e-007, 9.99999e-007 respectively, whereas for ACON it is 0.0100274. Therefore, it is clear that OCON requires less computational time to finalize a network for use.

3. PRINCIPAL COMPONENT ANALYSIS
Principal Component Analysis (PCA) [5] [6] [7] uses the entire image to generate a set of features in both network topologies, OCON and ACON, and does not require the location of individual feature points within the image. We have implemented the PCA transform as a reduced feature extractor in our face recognition system.
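A minimal sketch of PCA used as a reduced feature extractor is given below. It is an illustration only: the function names `fit_pca` and `project` are hypothetical, random arrays stand in for flattened face images, and the eigenvectors of the covariance matrix are obtained through an SVD of the centered data, which is one common way to compute them and not necessarily the way the authors' MATLAB code did.

```python
import numpy as np

def fit_pca(train_images, n_components):
    """Return the mean image and the top principal directions (as rows)."""
    X = np.asarray(train_images, dtype=float)
    mean = X.mean(axis=0)
    # The rows of vt are eigenvectors of the covariance matrix of the
    # centered data, ordered by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(images, mean, components):
    """Project images into the eigenspace to obtain reduced feature vectors."""
    return (np.asarray(images, dtype=float) - mean) @ components.T

# 20 hypothetical flattened 8x8 "images" standing in for face images
rng = np.random.default_rng(0)
train = rng.random((20, 64))
mean, components = fit_pca(train, n_components=5)
features = project(train, mean, components)
```

The resulting `features` rows are what would be fed to the ACON or OCON networks in place of raw pixels. The example keeps 5 components only because the toy data set is tiny; the number of components is a tunable choice.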
In our system, each of the visual face images is projected into the eigenspace created by the eigenvectors of the covariance matrix of all the training images, for both the ACON and OCON networks. We have taken the number of eigenvectors in the eigenspace as 40, because the eigenvalues of the other eigenvectors are negligible in comparison to the largest eigenvalues.

4. EXPERIMENTS RESULTS USING OCON AND ACON
This work has been simulated using MATLAB 7 on a machine with a 2.13 GHz Intel Xeon quad-core processor and 16 GB of physical memory. We have analyzed the performance of our method using the YALE B database, which is a collection of visual face images with various poses and illumination.

4A. YALE Face Database B
This database contains 5760 single-light-source images of 10 subjects, each seen under 576 viewing conditions (9 poses x 64 illumination conditions). For every subject in a particular pose, an image with ambient (background) illumination was also captured. Hence, the total number of images is 5850. The total size of the compressed database is about 1 GB. The 65 (64 illuminations + 1 ambient) images of a subject in a particular pose have been "tarred" and "gzipped" into a single file. There were 47 (out of 5760) images whose corresponding strobe did not go off. These images basically look like the ambient image of the subject in a particular pose. The images in the database were captured using a purpose-built illumination rig. This rig is fitted with 64 computer-controlled strobes.
The 64 images of a subject in a particular pose were acquired at the camera frame rate (30 frames/second) in about 2 seconds, so there is only a small change in head pose and facial expression across those 64 (+1 ambient) images. The image with ambient illumination was captured without a strobe going off. For each subject, images were captured under nine different poses whose relative positions are shown below. Note that pose 0 is the frontal pose. Poses 1, 2, 3, 4, and 5 were about 12 degrees from the camera optical axis (i.e., from pose 0), while poses 6, 7, and 8 were about 24 degrees. Figure 2 shows sample images per subject per pose with frontal illumination. Note that the position of a face in an image varies from pose to pose but is fairly constant within the images of a face seen in one of the 9 poses, since the 64 (+1 ambient) images were captured in about 2 seconds. The acquired images are 8-bit (gray scale), captured with a Sony XC-75 camera (with a linear response function) and stored in PGM raw format. The size of each image is 640 (w) x 480 (h) [9].
In our experiment, we have chosen a total of 400 images. Of these, 200 images are used for training and the other 200 for testing, from 10 different classes. The experiment uses two different networks: OCON and ACON. All the recognition results of the OCON networks are shown in Table 1, and those of the ACON network in Table 2. During training, a total of 10 training runs have been executed for the 10 different classes. We have completed 10 different tests for the OCON network, using 20 images for each experiment. Of those 20 images, 10 are taken from the same class that was used during training and act as positive exemplars, and the remaining 10 are taken from other classes and act as negative exemplars for that class. In the case of OCON, the system achieved a 100% recognition rate for all the classes.
In the case of the ACON network, only one network is used for the 10 different classes. The highest recognition rate we achieved was 100%, but, unlike the OCON network, not for all the classes. For the ACON network, an 88% recognition rate was achieved on average.

Figure 2: Sample images of YALE B database with different Pose and different illumination.

Class      Total number of   Number of images from   Number of images from   Recognition
           testing images    the training class      other classes           rate
Class-1    20                10                      10                      100%
Class-2    20                10                      10                      100%
Class-3    20                10                      10                      100%
Class-4    20                10                      10                      100%
Class-5    20                10                      10                      100%
Class-6    20                10                      10                      100%
Class-7    20                10                      10                      100%
Class-8    20                10                      10                      100%
Class-9    20                10                      10                      100%
Class-10   20                10                      10                      100%
Table 1: Experiments Results for OCON.

Class      Total number of   Number of images from   Number of images from   Recognition
           testing images    the training class      other classes           rate
Class-1    20                10                      10                      100%
Class-2    20                10                      10                      100%
Class-3    20                10                      10                      90%
Class-4    20                10                      10                      80%
Class-5    20                10                      10                      80%
Class-6    20                10                      10                      80%
Class-7    20                10                      10                      90%
Class-8    20                10                      10                      100%
Class-9    20                10                      10                      90%
Class-10   20                10                      10                      70%
Table 2: Experiments results for ACON.

In Figure 3, we show the performance measure and the goal reached during the 10 different training runs of the OCON network and the one training phase of the ACON network. We set the maximum number of epochs to 700000, but for all the OCON networks the performance goal was met before the maximum number of epochs was reached. All the learning rates with the required epochs of the OCON and ACON networks are shown in column two of Table 3.
For the OCON network, combining all the recognition rates gives an average recognition rate of 100%. For the ACON network, the average recognition rate is 88%, i.e., we can say that OCON shows better performance, accuracy, and speed than ACON. Figure 4 presents a comparative study of the ACON and OCON results.

Total no. of iterations            Learning rate (lr)         Class                               Figure        Network used
290556                             lr > 10^-4                 Class 1                             Figure 3(a)   OCON
248182                             lr = 10^-4                 Class 2                             Figure 3(b)   OCON
260384                             lr = 10^-5                 Class 3                             Figure 3(c)   OCON
293279                             lr < 10^-4                 Class 4                             Figure 3(d)   OCON
275065                             lr = 10^-4                 Class 5                             Figure 3(e)   OCON
251642                             lr = 10^-3                 Class 6                             Figure 3(f)   OCON
273819                             lr = 10^-4                 Class 7                             Figure 3(g)   OCON
263251                             lr < 10^-3                 Class 8                             Figure 3(h)   OCON
295986                             lr < 10^-3                 Class 9                             Figure 3(i)   OCON
257019                             lr > 10^-6                 Class 10                            Figure 3(j)   OCON
Highest epoch reached (7,00,000)   Performance goal not met   For all classes (Class 1, ..., 10)  Figure 3(k)   ACON
Table 3: Learning Rate vs. Required Epochs for OCON and ACON.

Figure 3: Training performance. (a) to (j): Class 1 to Class 10 of the OCON network. (k): ACON network for all the classes.
D. Bhattacharjee, M. K. Bhowmik, M. Nasipuri, D. K. Basu & M. Kundu
International Journal of Computer Science and Security (IJCSS), Volume (3): Issue (6)
Figure 4: Graphical Representation of all Recognition Rate using OCON and ACON Network.

The OCON is an obvious choice in terms of speed-up and resource utilization.
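The average recognition rates quoted for the two topologies can be checked directly from the per-class rates reported in Table 1 and Table 2. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Per-class recognition rates (%) as reported in Table 1 (OCON) and Table 2 (ACON)
ocon_rates = [100] * 10
acon_rates = [100, 100, 90, 80, 80, 80, 90, 100, 90, 70]

def mean_rate(rates):
    """Arithmetic mean of the per-class recognition rates."""
    return sum(rates) / len(rates)

ocon_mean = mean_rate(ocon_rates)
acon_mean = mean_rate(acon_rates)
print(f"OCON: {ocon_mean:.1f}%  ACON: {acon_mean:.1f}%")
```

The ACON rates sum to 880 over 10 classes, which gives the 88% average stated in the text, while every OCON class is at 100%.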
The OCON structure makes a neural network most suitable for incremental training, i.e., network upgrading upon adding/removing memberships. One may argue that, compared to the ACON structure, the OCON structure is slow in retrieval time when the number of classes is very large. This is not true because, as the number of classes increases, the number of hidden neurons in the ACON structure also tends to be very large, and therefore ACON is slow. The computation time of both OCON and ACON increases as the number of classes grows, but a linear increase is expected for OCON, whereas it might be exponential for ACON.

5. CONCLUSION
In this paper, two general architectures for a Multilayer Perceptron (MLP) have been demonstrated. The first architecture is All-Class-in-One-Network (ACON), where all the classes are placed in a single network, and the second one is One-Class-in-One-Network (OCON), where an individual single network is responsible for each and every class. Capabilities of these two architectures were compared and verified in solving human face recognition, which is a complex pattern recognition task where several factors affect the recognition performance, like pose variations, facial expression changes, occlusions, and most importantly illumination changes. Both the structures were implemented and tested for face recognition, and experimental results show that the OCON structure performs better than the generally used ACON in terms of the training convergence speed of the network. Moreover, the inherently non-parallel nature of ACON has compelled us to use OCON for a complex pattern recognition task like human face recognition.

ACKNOWLEDGEMENT
The second author is thankful to the project entitled "Development of Techniques for Human Face Based Online Authentication System Phase-I", sponsored by the Department of Information Technology under the Ministry of Communications and Information Technology, New Delhi-110003, Government of India, vide No. 12(14)/08-ESD, dated 27/01/2009, at the Department of Computer Science & Engineering, Tripura University-799130, Tripura (West), India, for providing
Face Recognition
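I. Definitions
1. Face recognition refers specifically to the computer technology that establishes a person's identity by analysing and comparing the visual features of the face.
In the broad sense, face recognition covers the whole range of technologies needed to build a face recognition system, including face image acquisition, face localization, preprocessing, identity verification and identity search. In the narrow sense, it refers to the technology or system that verifies or searches for an identity by means of the face.
Face recognition is an active area of computer technology research. It belongs to biometric recognition, which distinguishes individual organisms (generally people) by their own biological characteristics.
2. LFW
Labeled Faces in the Wild is a well-known face image collection in face recognition research. Its images were collected from Yahoo! News: 13,233 images of 5,749 people, of whom 1,680 have two or more images and 4,069 have only one. Most images were obtained with the Viola-Jones face detector and then cropped to a fixed size; a small number were recovered by hand from false positives.
All images come from real-world scenes (as opposed to laboratory settings), with natural lighting, expressions, poses and occlusions. Most of the people pictured are public figures, which adds further complicating factors such as make-up and spotlights.
An algorithm validated on this dataset is therefore, in principle, closer to real-world use, which also poses a considerable challenge to researchers.
3. FDDB
FDDB, the Face Detection Data Set and Benchmark, is a public database maintained by the computer science department of the University of Massachusetts. It gives researchers worldwide a standard platform for evaluating face detection and covers faces in a wide range of poses in natural settings. As one of the most authoritative face detection benchmarks, FDDB uses 2,845 images containing 5,171 faces from the Faces in the Wild database as its test set, and its published results represent the state of the art in face detection.
4. 300-W: facial landmark localization
5. FRVT
The Face Recognition Vendor Test, run by the US National Institute of Standards and Technology (NIST).
It is a face recognition test that comes closer to real-world applications.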
Face Recognition Techniques: Foreign-Language Literature with Translation
Document information
Title: Face Recognition Techniques: A Survey
Author: V. Vijayakumari
Source: World Journal of Computer Application and Technology, 2013, 1(2): 41-50
Length: 3,186 English words (17,705 characters); 5,317 Chinese characters

Original text
Face Recognition Techniques: A Survey

Abstract
The face is the index of the mind. It is a complex multidimensional structure and needs good computing techniques for recognition. In automatic face recognition systems, computers are easily confused by changes in illumination, variations in pose and changes in the angle of the face. Numerous techniques are used for security and authentication purposes, including in detective agencies and the military. This survey reviews the existing methods in automatic face recognition and outlines ways to further improve performance.
Keywords: Face Recognition, Illumination, Authentication, Security

1. Introduction
Developed in the 1960s, the first semi-automated system for face recognition required the administrator to locate features (such as eyes, ears, nose, and mouth) on the photographs before it calculated distances and ratios to a common reference point, which were then compared to reference data. In the 1970s, Goldstein, Harmon, and Lesk used 21 specific subjective markers such as hair color and lip thickness to automate the recognition. The problem with both of these early solutions was that the measurements and locations were manually computed. The face recognition problem can be divided into two main stages: face verification (or authentication), and face identification (or recognition). The detection stage is the first stage; it includes identifying and locating a face in an image.
The recognition stage is the second stage; it includes feature extraction, where the information important for discrimination is saved, and matching, where the recognition result is given with the aid of a face database.

2. Methods
2.1. Geometric Feature Based Methods
The geometric feature based approaches are the earliest approaches to face recognition and detection. In these systems, the significant facial features are detected, and the distances among them, as well as other geometric characteristics, are combined in a feature vector that represents the face. To recognize a face, the feature vectors of the test image and of the images in the database are first obtained. Second, a similarity measure between these vectors, most often a minimum distance criterion, is used to determine the identity of the face. As pointed out by Brunelli and Poggio, the template based approaches outperform the early geometric feature based approaches.

2.2. Template Based Methods
The template based approaches represent the most popular technique used to recognize and detect faces. Unlike the geometric feature based approaches, they use a feature vector that represents the entire face template rather than only the most significant facial features.

2.3. Correlation Based Methods
Correlation based methods for face detection are based on the computation of the normalized cross-correlation coefficient Cn. The first step in these methods is to determine the location of the significant facial features such as the eyes, nose or mouth. The importance of robust facial feature detection for both detection and recognition has resulted in the development of a variety of facial feature detection algorithms. The method proposed by Brunelli and Poggio uses a set of templates to detect the position of the eyes in an image by looking for the maximum absolute values of the normalized correlation coefficient of these templates at each point in the test image.
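The template search just described can be illustrated with a minimal sketch. It is not the implementation of Brunelli and Poggio: the function names `ncc` and `best_match`, the list-of-rows image format and the exhaustive scan are assumptions made for illustration, and the sketch maximizes the signed coefficient rather than its absolute value.

```python
from math import sqrt

def ncc(patch, template):
    """Normalized cross-correlation coefficient of two equal-size
    grayscale patches given as lists of rows (hypothetical format)."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    # A constant patch has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0

def best_match(image, template):
    """Slide the template over the image and return (score, (row, col))
    of the position with the highest coefficient."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, (0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

Because the coefficient subtracts the mean and divides by the standard deviation of each patch, it is unaffected by uniform brightness and contrast changes, which is one reason it is preferred over a raw pixel difference.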
To cope with scale variations, a set of templates at different scales was used. The problems associated with scale variations can be significantly reduced by using hierarchical correlation. For face recognition, the templates corresponding to the significant facial features of the test image are compared in turn with the corresponding templates of all the images in the database, returning a vector of matching scores computed through normalized cross-correlation. The similarity scores of the different features are integrated to obtain a global score that is used for recognition. Other similar methods that use correlation or higher-order statistics revealed the accuracy of these methods but also their complexity.
Beymer extended the correlation based approach to a view based approach for recognizing faces under varying orientation, including rotations with respect to the axis perpendicular to the image plane (rotations in image depth). To handle rotations out of the image plane, templates from different views were used. After the pose is determined, the task of recognition is reduced to the classical correlation method, in which the facial feature templates are matched to the corresponding templates of the appropriate view based models using the cross-correlation coefficient. However, this approach is computationally expensive and sensitive to lighting conditions.

2.4. Matching Pursuit Based Methods
Phillips introduced a template based face detection and recognition system that uses a matching pursuit filter to obtain the face vector. The matching pursuit algorithm applied to an image iteratively selects from a dictionary of basis functions the best decomposition of the image by minimizing the residue of the image at each iteration. The algorithm described by Phillips constructs the best decomposition of a set of images by iteratively optimizing a cost function, which is determined from the residues of the individual images.
The dictionary of basis functions used by the author consists of two-dimensional wavelets, which give a better image representation than the PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) based techniques, where the images are stored as vectors. For recognition, the cost function is a measure of distances between faces and is maximized at each iteration. For detection, the goal is to find a filter that clusters similar templates together (around the mean, for example), and the cost function is minimized at each iteration. The feature represents the average value of the projection of the templates on the selected basis.

2.5. Singular Value Decomposition Based Methods
The face recognition methods in this section use the general result stated by the singular value decomposition theorem. Z. Hong revealed the importance of using singular value decomposition (SVD) for human face recognition by providing several important properties of the singular value (SV) vector, which include: the stability of the SV vector to small perturbations caused by stochastic variation in the intensity image, the proportional variation of the SV vector with the pixel intensities, and the invariance of the SV feature vector to rotation, translation and mirror transformation. These properties of the SV vector provide the theoretical basis for using singular values as image features. In addition, it has been shown that compressing the original SV vector into a low-dimensional space by means of various mathematical transforms leads to higher recognition performance. Among the various dimensionality-reducing transformations, the Linear Discriminant Transform is the most popular.

2.6. The Dynamic Link Matching Methods
The above template based matching methods use a Euclidean distance to identify a face in a gallery or to detect a face against a background. A more flexible distance measure that accounts for common facial transformations is the dynamic link introduced by Lades et al.
In this approach, a rectangular grid is centered on all faces in the gallery. The feature vector is calculated from Gabor-type wavelets computed at all points of the grid. A new face is identified when the cost function, which is a weighted sum of two terms, is minimized. The first term in the cost function is small when the distance between feature vectors is small, and the second term is small when the relative distance between the grid points in the test and the gallery image is preserved. It is the second term of this cost function that gives the “elasticity” of this matching measure. While the grid of the gallery image remains rectangular, the grid that is the “best fit” over the test image is stretched, under certain constraints, until the minimum of the cost function is achieved. The minimum value of the cost function is then used to identify the unknown face.

2.7. Illumination Invariant Processing Methods
The problem of determining functions of an image of an object that are insensitive to illumination changes is considered. An object with Lambertian reflectance has no discriminative functions that are invariant to illumination. This result leads the authors to adopt a probabilistic approach in which they analytically determine a probability distribution for the image gradient as a function of the surface's geometry and reflectance. Their distribution reveals that the direction of the image gradient is insensitive to changes in illumination direction. They verify this empirically by constructing a distribution for the image gradient from more than twenty million samples of gradients in a database of 1,280 images of twenty inanimate objects taken under varying lighting conditions. Using this distribution, they develop an illumination-insensitive measure of image comparison and test it on the problem of face recognition.
In another method, they consider only the set of images of an object under variable illumination, including multiple, extended light sources, shadows, and color. They prove that the set of n-pixel monochrome images of a convex object with a Lambertian reflectance function, illuminated by an arbitrary number of point light sources at infinity, forms a convex polyhedral cone in $\mathbb{R}^n$ and that the dimension of this illumination cone equals the number of distinct surface normals. Furthermore, the illumination cone can be constructed from as few as three images. In addition, the set of n-pixel images of an object of any shape and with a more general reflectance function, seen under all possible illumination conditions, still forms a convex cone in $\mathbb{R}^n$. These results immediately suggest certain approaches to object recognition. Throughout, they present results demonstrating the illumination cone representation.

2.8. Support Vector Machine Approach
Face recognition is a K-class problem, where K is the number of known individuals, whereas support vector machines (SVMs) are a binary classification method. By reformulating the face recognition problem and reinterpreting the output of the SVM classifier, the authors developed an SVM-based face recognition algorithm. The face recognition problem is formulated in difference space, which models dissimilarities between two facial images. In difference space, face recognition becomes a two-class problem. The classes are: dissimilarities between faces of the same person, and dissimilarities between faces of different people. By modifying the interpretation of the decision surface generated by the SVM, a similarity metric between faces is obtained that is learned from examples of differences between faces. The SVM-based algorithm is compared with a principal component analysis (PCA) based algorithm on a difficult set of images from the FERET database. Performance was measured for both verification and identification scenarios.
The identification performance for SVM is 77-78% versus 54% for PCA. For verification, the equal error rate is 7% for SVM and 13% for PCA.

2.9. Karhunen-Loeve Expansion Based Methods
2.9.1. Eigenface Approach
In this approach, the face recognition problem is treated as an intrinsically two-dimensional recognition problem. The system works by projecting face images onto a feature space that represents the significant variations among known faces. These significant features are characterized as the eigenfaces; they are the eigenvectors of the set of faces. The goal is to develop a computational model of face recognition that is fast, reasonably simple and accurate in constrained environments. The eigenface approach is motivated by information theory.

2.9.2. Recognition Using Eigenfeatures
While the classical eigenface method uses the KLT (Karhunen-Loeve Transform) coefficients of the template corresponding to the whole face image, Pentland et al. introduced a face detection and recognition system that uses the KLT coefficients of the templates corresponding to the significant facial features such as the eyes, nose and mouth. For each of the facial features, a feature space is built by selecting the most significant “eigenfeatures”, which are the eigenvectors corresponding to the largest eigenvalues of the feature's correlation matrix. The significant facial features were detected using the distance from the feature space and selecting the closest match. The similarity scores between the templates of the test image and the templates of the images in the training set were integrated into a cumulative score that measures the distance between the test image and the training images.
The method was extended to the detection of features under different viewing geometries by using either a view-based eigenspace or a parametric eigenspace.

2.10. Feature Based Methods
2.10.1. Kernel Direct Discriminant Analysis Algorithm
The kernel machine-based discriminant analysis method deals with the nonlinearity of the face patterns' distribution. This method also effectively solves the so-called “small sample size” (SSS) problem, which exists in most face recognition tasks. The algorithm has been tested, in terms of classification error rate, on the multiview UMIST face database. Results indicate that the proposed methodology achieves excellent performance with only a very small set of features, and its error rate is approximately 34% and 48% of those of two other commonly used kernel FR approaches, the kernel-PCA (KPCA) and the Generalized Discriminant Analysis (GDA), respectively.

2.10.2. Features Extracted from Walshlet Pyramid
A novel Walshlet Pyramid based face recognition technique uses the image feature set extracted from Walshlets applied to the image at various levels of decomposition. Here the image features are extracted by applying the Walshlet Pyramid to the gray plane (the average of red, green and blue). The proposed technique is tested on two image databases of 100 images each. The results show that Walshlet level-4 outperforms the other Walshlets and the Walsh Transform, because the higher-level Walshlets give very coarse color-texture features while the lower-level Walshlets represent very fine color-texture features, which are less useful for differentiating the images in face recognition.

2.10.3. Hybrid Color and Frequency Features Approach
A novel hybrid Color and Frequency Features (CFF) method has been presented for face recognition.
The CFF method, which applies an Enhanced Fisher Model (EFM), extracts the complementary frequency features in a new hybrid color space to improve face recognition performance. The new color space, the RIQ color space, combines the component image R of the RGB color space with the chromatic components I and Q of the YIQ color space, and it improves face recognition performance markedly because of the complementary characteristics of its component images. The EFM then extracts the complementary features from the real part, the imaginary part, and the magnitude of the R image in the frequency domain. The complementary features are then fused by concatenation at the feature level to derive similarity scores for classification. The complementary feature extraction and feature-level fusion procedure applies to the I and Q component images as well. Experiments on the Face Recognition Grand Challenge (FRGC) show that i) the hybrid color space improves face recognition performance significantly, and ii) the complementary color and frequency features further improve face recognition performance.

2.10.4. Multilevel Block Truncation Coding Approach
The multilevel Block Truncation Coding approach to face recognition uses all four levels of Multilevel Block Truncation Coding for feature vector extraction, resulting in four variations of the proposed face recognition technique. The experiments were conducted on two different face databases. The first is Face Database, which has 1,000 face images, and the second is “Our Own Database”, which has 1,600 face images. The False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR) were used to measure the performance of the algorithm.
The experimental results have shown that BTC (Block Truncation Coding) Level 4 is more accurate than the other BTC levels, at the cost of an increased feature vector size.

2.11. Neural Network Based Algorithms
Templates have also been used as input to neural network (NN) based systems. Lawrence et al. proposed a hybrid neural network approach that combines local image sampling, a self-organizing map (SOM) and a convolutional neural network. The SOM provides a set of features that gives a more compact and robust representation of the image samples. These features are then fed into the convolutional neural network. This architecture provides partial invariance to translation, rotation, scale and face deformation. In addition, an efficient probabilistic decision-based neural network (PDBNN) was introduced for face detection and recognition. The feature vector used consists of intensity and edge values obtained from the facial region of the down-sampled image in the training set. The facial region contains the eyes and nose, but excludes the hair and mouth. Two PDBNNs were trained with these feature vectors, one used for face detection and the other for face recognition.

2.12. Model Based Methods
2.12.1. Hidden Markov Model Based Approach
In this approach, the author utilizes the fact that the most significant facial features of a frontal face, which include the hair, forehead, eyes, nose and mouth, occur in a natural order from top to bottom, even if the image undergoes small rotations in the image plane or in the plane perpendicular to the image plane. A one-dimensional HMM (Hidden Markov Model) is used for modeling the image, where the observation vectors are obtained from DCT or KLT coefficients.
Given c face images for each subject of the training set, the goal of the training stage is to optimize the parameters of the Hidden Markov Model to best describe the observations, in the sense of maximizing the probability of the observations given the model. Recognition is carried out by matching the test image against each of the trained models. To do this, the image is converted to an observation sequence and the model likelihoods are then computed for each face model. The model with the highest likelihood reveals the identity of the unknown face.

2.12.2. The Volumetric Frequency Representation of Face Model
A face model that incorporates both the three-dimensional (3D) face structure and its two-dimensional representation (face images) is explained. This model, which is a volumetric (3D) frequency representation (VFR) of the face, is constructed from a range image of a human head. By an extension of the projection-slice theorem, the Fourier transform of any face image corresponds to a slice in the face VFR. For both pose estimation and face recognition, a face image is indexed in the 3D VFR based on correlation matching in a four-dimensional Fourier space, parameterized over the elevation, azimuth, rotation in the image plane and the scale of faces.

3. Conclusion
This paper discusses the different approaches that have been employed in automatic face recognition. In the geometric feature based methods, the geometrical features are selected and the significant facial features are detected. The correlation based approach needs a face template rather than the significant facial features. Singular value vectors and the properties of the SV vector provide the theoretical basis for using singular values as image features. The Karhunen-Loeve expansion works by projecting the face images onto a space that represents the significant variations among the known faces. Eigenvalues and eigenvectors are involved in extracting the features in the KLT.
Neural network based approaches are more efficient when the network contains no more than a few hundred weights. The Hidden Markov Model optimizes its parameters to best describe the observations, in the sense of maximizing the probability of the observations given the model. Some methods use the features for classification and a few methods use the distance measure from the nodal points. The drawbacks of the methods are also discussed, based on the performance of the algorithms used in the approaches. This survey should therefore give some idea of the existing methods for automatic face recognition.

Chinese translation
Face Recognition Techniques: A Survey
Abstract
The face is the index of the mind.
An English Essay Introducing Face Recognition
## Facial Recognition ##
English Answer:
Facial recognition is a computer-vision technology used to identify or verify a person's identity using their facial characteristics. It is based on the idea that each person's face is unique and can be distinguished from others. The technology works by analyzing the facial features of an individual, such as the shape of their face, the distance between their eyes, the shape of their nose, and the curve of their lips, and comparing them to a database of known faces, or creating a new entry if the face is unrecognized. Facial recognition is a non-invasive and user-friendly technology that can be used in a wide range of applications.

Facial recognition technology has several advantages. Firstly, it is highly accurate and reliable. With advancements in machine learning algorithms and image processing techniques, facial recognition systems can now achieve accuracy rates of over 99% under ideal conditions. Secondly, facial recognition is a non-contact and non-intrusive method, which makes it convenient and user-friendly. Users do not need to touch or interact with any devices, making it a hygienic and efficient solution for identity verification. Finally, facial recognition is a passive and covert technology, meaning that individuals do not need to be aware that their faces are being recognized. This makes it a powerful tool for surveillance and security applications.

Despite its advantages, facial recognition technology also has several disadvantages. One major concern is privacy. Facial recognition systems create and store large databases of facial images, which raises concerns about the misuse of such data. Unauthorized access to these databases could lead to identity theft, stalking, or even discrimination. Another concern with facial recognition is bias. Facial recognition algorithms have been found to be less accurate for certain demographics, such as women and people of color, due to historical biases in the training data.
This can lead to unfair and discriminatory outcomes when using facial recognition for decision-making purposes.

Chinese answer: Facial recognition.
Detecting Faces in Images: A Survey (Chinese version)
2.
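In this section we review methods for detecting faces in a single grayscale or color image. We divide single-image detection into four categories; some methods clearly belong to more than one category, and these are discussed at the end of this section.
1) Knowledge-based methods. These methods encode prior knowledge of what constitutes a typical face. Usually, this knowledge captures the relationships between facial features. These methods are mainly used for face localization.
2) Feature invariant approaches. These algorithms aim to find structural features that persist when the pose, viewpoint or lighting conditions change, and then use them to locate faces. These methods are mainly used for face localization.
3) Template matching methods. Several standard templates of a face are stored to describe either the whole face or separate facial features. Detection is then performed by computing the correlation between the input image and the stored templates. These methods can be used for both face detection and face localization.
4) Appearance-based methods. In contrast to template matching, the templates here are learned from a set of training images, which should cover the representative variations in facial appearance. These methods are mainly used for face detection.

Table 1. Categorization of methods for face detection in a single image (the "Representative works" column did not survive extraction)
Knowledge-based
Feature invariant
  - Facial features
  - Texture
  - Skin color
  - Multiple features
Template matching
  - Predefined face templates
  - Deformable templates
Appearance-based
  - Eigenface
  - Distribution-based
  - Neural network
  - Support vector machine (SVM)
  - Bayes classifier
  - Hidden Markov model (HMM)
  - Information-theoretical approach

With the spread of new information technology and media, more and more efficient and friendly methods for human-computer interaction are being developed that do not rely on traditional devices such as keyboards, mice and displays. Moreover, the steadily falling price/performance ratio of computers and the recent drop in the cost of video equipment suggest that computer vision systems can be deployed in desktop and embedded systems (see [111], [112], [113]). The rapid growth of face processing research rests on the premise that information about a user's identity, state and intent can be extracted from images and that computers can then react accordingly, for example by observing a person's facial expression. Although psychologists, neuroscientists and engineers have studied face and facial expression recognition for more than 20 years, the topic has attracted a great deal of attention in the last five years. Many research prototypes and commercial products have been built with these methods. The first step of any face processing system is to find where the faces are in the image. However, detecting faces in a single image is a challenging task because faces vary in size, location, orientation and pose. Facial expression, occlusion and lighting also change the overall appearance of a face.
We now give a definition of face detection: given an arbitrary image, the goal of face detection is to determine whether there are any faces in the image and, if so, to return the location and extent of each face. The challenges of face detection can be attributed to the following factors: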
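Graduation Project (Thesis): Translation of Foreign-Language Literature
Title: PCA-Based Real-Time Face Detection and Tracking
Major: Measurement and Control Technology and Instrumentation
Student: Liu Mengdan (刘梦丹)
Supervisor: Ren Anhu (任安虎)
Graduation date: 2014.07

PCA-Based Real-Time Face Detection and Tracking
Video signal processing has many applications, such as teleconferencing for visual communication and lip-reading systems for the disabled.
In many of the systems mentioned above, face detection and tracking are indispensable components.
This paper is concerned with real-time tracking of the face region [1-3].
In general, tracking methods can be divided into two classes, depending on the viewpoint taken.
Some authors divide face tracking into recognition-based tracking and motion-based tracking, while others divide it into edge-based tracking and region-based tracking [4].
Recognition-based tracking is built on object recognition techniques, and the performance of the tracking system is limited by the efficiency of the recognition method.
Motion-based tracking relies on motion detection techniques, which can be divided into optical flow methods and motion-energy methods.
Edge-based methods track the edges in an image sequence; these edges are usually the boundaries of the main objects.
However, because the tracked object must show clear edges under the given color and lighting conditions, these methods suffer from variations in color and illumination.
Moreover, when the background of an image contains strong edges, it is difficult to obtain reliable tracking results.
Much of the current literature on this class of methods stems from the work of Kass et al. on snakes [5].
Because the video scene is acquired from a real-time camera and contains many kinds of noise, many systems have difficulty producing reliable face tracking results.
Many recent face tracking studies have run into the problem of background noise and tend to track unverified faces, such as arms and hands.
In this paper we propose a PCA-based real-time face detection and tracking method that detects and tracks a human face using an active camera, as shown in Figure 1.
The method consists of two main steps: face detection and face tracking.
Using two consecutive frames, candidate face regions are first found, and PCA is then used to decide which of them are real face regions.
The verified face is then tracked using an eigen-technique.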
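2. Face detection
This section describes the techniques used to detect faces in the proposed method.
To improve the accuracy of face detection, we combine published techniques such as the skin-color model [1,6] and PCA [7,8].
2.1 Skin-color classification
Detecting skin-color pixels provides a reliable way of detecting and tracking faces.
Because an RGB image obtained from most video cameras contains luminance as well as color, this color space is not the best one for detecting skin-color pixels [1,6].
The luminance can be removed by dividing the three components of a color pixel by the intensity.
The color distribution of human faces is clustered in a small region of the chromatic color space and can be approximated by a 2D Gaussian distribution.
The skin-color model can therefore be approximated by a 2D Gaussian model whose mean and covariance are $m = (\bar{r}, \bar{g})$ and $\Sigma = \begin{pmatrix} \sigma_{rr} & \sigma_{rg} \\ \sigma_{gr} & \sigma_{gg} \end{pmatrix}$, where $\bar{r}$ and $\bar{g}$ are the mean chromatic values of the skin samples. Once the skin-color model has been built, a simple way of locating a face is to match the input image against the model to find the facial color clusters in the image.
Each pixel of the original image is converted into the chromatic color space and then compared with the distribution of the skin-color model.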
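The skin-color classification above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the small regularization term and the squared Mahalanobis threshold `max_dist_sq` are all hypothetical choices, and the training samples would have to come from hand-labelled skin pixels.

```python
def chromaticity(rgb):
    """Remove luminance: map (R, G, B) to normalized (r, g)."""
    R, G, B = rgb
    total = R + G + B
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # black pixel: neutral chromaticity
    return (R / total, G / total)

def fit_skin_model(skin_pixels):
    """Fit the 2D Gaussian: mean m = (r_bar, g_bar) and covariance entries."""
    pts = [chromaticity(p) for p in skin_pixels]
    n = len(pts)
    r_bar = sum(r for r, _ in pts) / n
    g_bar = sum(g for _, g in pts) / n
    # A small constant on the diagonal keeps the covariance invertible
    # (an assumption of this sketch, not part of the source).
    s_rr = sum((r - r_bar) ** 2 for r, _ in pts) / n + 1e-6
    s_gg = sum((g - g_bar) ** 2 for _, g in pts) / n + 1e-6
    s_rg = sum((r - r_bar) * (g - g_bar) for r, g in pts) / n
    return (r_bar, g_bar), (s_rr, s_rg, s_gg)

def mahalanobis_sq(rgb, model):
    """Squared Mahalanobis distance of a pixel to the skin Gaussian."""
    (r_bar, g_bar), (s_rr, s_rg, s_gg) = model
    r, g = chromaticity(rgb)
    dr, dg = r - r_bar, g - g_bar
    det = s_rr * s_gg - s_rg * s_rg
    # Closed-form inverse of the 2x2 covariance matrix.
    return (s_gg * dr * dr - 2.0 * s_rg * dr * dg + s_rr * dg * dg) / det

def is_skin(rgb, model, max_dist_sq=9.0):
    """Label a pixel as skin if it lies close to the model's mean."""
    return mahalanobis_sq(rgb, model) <= max_dist_sq
```

Because only the chromaticity is modelled, a darker or brighter version of the same color maps to the same point, which is the purpose of removing the luminance.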
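2.2 Motion detection
Although skin color is widely used as a feature, it is not sufficient for face detection when skin-like colors appear both in the background and on the person's skin.
Using motion information effectively removes this drawback.
For accuracy, after skin-color classification only the skin-colored regions that contain motion are considered.
As a result, combining motion information with the skin-color model yields a binary image that separates the foreground (face regions) from the background (non-face regions).
This binary image is defined as $M_t(x,y) = 1$ if $|I_t(x,y) - I_{t-1}(x,y)| > \theta_t$ and $(x,y) \in S_t$, and $M_t(x,y) = 0$ otherwise, where $I_t(x,y)$ and $I_{t-1}(x,y)$ are the intensities of pixel $(x,y)$ in the current frame and the previous frame, respectively.
$S_t$ is the set of skin-color pixels in the current frame, and $\theta_t$ is a threshold computed with a suitable thresholding technique [9].
To speed up processing, we simplify the image $M_t$ using morphological operations and connected component analysis.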
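A minimal sketch of the binary image $M_t$ follows. The function name and the list-of-rows representation are assumptions for illustration; the threshold `theta` is passed in rather than estimated, and the morphological clean-up step is omitted.

```python
def motion_skin_mask(curr, prev, skin, theta):
    """Return M_t: 1 where the pixel is skin-colored AND its intensity
    changed by more than theta between the two frames, else 0.

    curr, prev: grayscale frames as lists of rows of intensities.
    skin: same-shaped mask, truthy where the pixel belongs to S_t.
    """
    height, width = len(curr), len(curr[0])
    return [
        [
            1 if skin[y][x] and abs(curr[y][x] - prev[y][x]) > theta else 0
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Requiring both conditions is what removes static skin-colored background and moving non-skin objects at the same time.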
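2.3 Face verification using PCA
Because there are many moving objects, tracking the face alone through the sequence is difficult.
It is also necessary to verify whether a moving object is a face or not.
We use the weight vectors of the candidate regions in the eigenspace for the face verification problem.
To reduce the dimensionality, we project the N-dimensional candidate face image onto a lower-dimensional feature space, which we call the eigenspace or face space [7,8].
In the eigenspace, each feature accounts for a different variation among face images.
To describe the eigenspace briefly, suppose a set of images $I_1, I_2, I_3, \ldots, I_M$, each of which is an N-dimensional column vector, makes up the face space.
The mean of this training set is defined as $A = \frac{1}{M}\sum_{i=1}^{M} I_i$.
A new zero-mean vector is formed for each image as $\phi_i = I_i - A$.
To compute the M orthogonal vectors that best describe the distribution of the face images, the covariance matrix is first computed as $C = \frac{1}{M}\sum_{i=1}^{M}\phi_i\phi_i^T = YY^T$ (4), where $Y = [\phi_1\ \phi_2\ \cdots\ \phi_M]$.
The matrix C is N×N, however, and determining its N-dimensional eigenvectors and N eigenvalues is an intractable task.
Therefore, for computational feasibility, rather than finding the eigenvectors of C, we compute the M eigenvectors $v_k$ and eigenvalues $\lambda_k$ of $Y^TY$, and a basis set is computed as $u_k = Y v_k / \sqrt{\lambda_k}$ (5), where $k = 1, \ldots, M$.
Of these M eigenvectors, the $M'$ most significant ones are selected, namely those with the largest corresponding eigenvalues.
For the M training face images, the feature vectors $W_i = [w_1, w_2, \ldots, w_{M'}]$ are computed as $w_k = u_k^T\phi_i$, $k = 1, \ldots, M'$ (6).
To verify whether a candidate face region is a real face image, the candidate region is also projected into the trained eigenspace using equation (6).
The projected region is verified using the minimum distance of the detected region to the face class and the non-face class, according to equation (7):
$\min(\|W_k^{candidate} - W_{face}\|, \|W_k^{candidate} - W_{nonface}\|)$ (7), where $W_k^{candidate}$ is the k-th candidate face region in the trained eigenspace, $W_{face}$ and $W_{nonface}$ are the centre coordinates of the face class and the non-face class in the trained eigenspace, and $\|\cdot\|$ denotes the Euclidean distance in the eigenspace.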
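The verification steps above can be sketched in plain Python. This is an illustration under assumptions, not the authors' implementation: all function names are hypothetical, images are flat lists of pixel values, and power iteration with deflation stands in for a proper eigen-solver. The key point it shows is that only the small M×M matrix $Y^TY$ is decomposed, never the N×N covariance matrix.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenpairs(S, k, iters=500):
    """Top-k eigenpairs of a symmetric positive semi-definite matrix
    by power iteration with deflation (a simple stand-in solver)."""
    n = len(S)
    S = [row[:] for row in S]
    pairs = []
    for _ in range(k):
        # Non-uniform start: the all-ones vector lies in the null space
        # of Y^T Y for mean-centred data, so it must be avoided.
        v = [float(i + 1) for i in range(n)]
        for _step in range(iters):
            w = [dot(row, v) for row in S]
            norm = sqrt(dot(w, w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in S])
        pairs.append((lam, v))
        S = [[S[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return pairs

def train_eigenspace(images, k):
    """Return the mean image A and up to k basis vectors u_k."""
    M, N = len(images), len(images[0])
    mean = [sum(img[j] for img in images) / M for j in range(N)]
    phis = [[img[j] - mean[j] for j in range(N)] for img in images]
    small = [[dot(phis[i], phis[j]) for j in range(M)] for i in range(M)]  # Y^T Y
    basis = []
    for lam, v in top_eigenpairs(small, k):
        if lam > 1e-9:  # skip numerically empty directions
            # u_k = Y v_k / sqrt(lambda_k), as in equation (5)
            basis.append([sum(v[i] * phis[i][j] for i in range(M)) / sqrt(lam)
                          for j in range(N)])
    return mean, basis

def project(img, mean, basis):
    """Weights w_k = u_k^T (I - A), as in equation (6)."""
    phi = [x - m for x, m in zip(img, mean)]
    return [dot(u, phi) for u in basis]

def class_centre(images, mean, basis):
    ws = [project(img, mean, basis) for img in images]
    return [sum(col) / len(ws) for col in zip(*ws)]

def verify(img, mean, basis, w_face, w_nonface):
    """Nearest class centre in the eigenspace, as in equation (7)."""
    w = project(img, mean, basis)
    d_face = sqrt(sum((a - b) ** 2 for a, b in zip(w, w_face)))
    d_nonface = sqrt(sum((a - b) ** 2 for a, b in zip(w, w_nonface)))
    return "face" if d_face <= d_nonface else "nonface"
```

In practice a linear algebra library would replace `top_eigenpairs`; the hand-written solver is used here only to keep the sketch free of dependencies.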
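3. Face tracking
The face to be tracked in the next image of the sequence is determined from the most recently detected faces using a distance measure in the eigenspace.
To track the face, the Euclidean distance between the feature vector of the tracked face and those of the K most recently detected faces is computed as $obj = \arg\min_k \|W_{old} - W_k\|$, $k = 1, \ldots, K$ (8).
After the face region has been determined, the distance between the centre of the detected face region and the centre of the screen is computed as $dist_t(face, screen) = Face_t(x, y) - Screen(height/2, width/2)$ (9), where $Face_t(x, y)$ is the centre of the face region detected at time t and $Screen(height/2, width/2)$ is the centre of the screen.
Using this distance vector, the duration and orientation of the camera's pan/tilt movement can be controlled.
The camera controller works by driving the pan/tilt platform of the active camera so that the detected face region stays at the centre of the screen.
Table 2 lists the parameters used to control the active camera.
The pan/tilt duration and the camera orientation are expressed in the pseudocode below.
Pseudocode for computing the pan/tilt duration and orientation:
Procedure Duration(x, y)
Begin
  Sig_d = None;
  distance = sqrt(x^2 + y^2);
  IF distance > θ_far THEN Sig_d = Far;
  ELSEIF distance > θ_close THEN Sig_d = Close;
  Return(Sig_d);
End Duration;
Procedure Orientation(x, y)
Begin
  Sig_o = None;
  IF x > θ_x THEN Add "RIGHT" to Sig_o;
  ELSEIF x < -θ_x THEN Add "LEFT" to Sig_o;
  IF y > θ_y THEN Add "UP" to Sig_o;
  ELSEIF y < -θ_y THEN Add "DOWN" to Sig_o;
  Return(Sig_o);
End Orientation;
4. Conclusion
This paper has proposed a PCA-based real-time face detection and tracking method.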
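The proposed method runs in real time and consists of two main parts: face detection and face tracking.
In a video input stream, we first detect the face region using cues such as color, motion information and PCA, and then track the face so that the detected face region is kept at the centre of the screen by an active camera mounted on a pan/tilt platform.
In future work we will develop the method further by extracting facial features from the detected face region for use in a facial activity system.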
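The pan/tilt control logic of Section 3 can be sketched in Python as follows. The function names, the signal strings and the threshold parameters are illustrative assumptions; the sketch also assumes that the far threshold is larger than the close threshold and is therefore tested first, and it follows the pseudocode's convention that a positive y offset means "UP".

```python
from math import hypot

def face_offset(face_centre, width, height):
    """Offset of the detected face centre from the screen centre,
    in the spirit of equation (9). face_centre is an (x, y) pair."""
    fx, fy = face_centre
    return (fx - width / 2, fy - height / 2)

def duration_signal(x, y, theta_close, theta_far):
    """How long to move: 'FAR', 'CLOSE', or None if already centred."""
    distance = hypot(x, y)
    if distance > theta_far:
        return "FAR"
    if distance > theta_close:
        return "CLOSE"
    return None

def orientation_signal(x, y, theta_x, theta_y):
    """Which way to move: a list drawn from RIGHT/LEFT and UP/DOWN."""
    signal = []
    if x > theta_x:
        signal.append("RIGHT")
    elif x < -theta_x:
        signal.append("LEFT")
    if y > theta_y:
        signal.append("UP")
    elif y < -theta_y:
        signal.append("DOWN")
    return signal
```

Keeping duration and orientation separate means the controller can issue no command at all when the face is inside the dead zone around the screen centre, which avoids constant small camera movements.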
译文:PCA-Base Real-Time Face Detection and TrackingSeeing the signal of handles many applications, for example owing to the communication can see the telecommunication meeting that turn, for disable and sick person service of the lips reads the system. In up many systems that mention, the facial examination in person drink to follow to see to can't lack necessarily of constitute the part. In this text, involve the some solid of person a district follows the [1 - 3 ] .By any large, according to follow the angle different, can is divided in to follow the method two types. Reach a the part of people follows person's face is divided into according to identify on the trail of to drink according to act of on the trail of, but other a the part of people then follows person's face is divided into according to edge of on the trail of with on the trail of [that according to district 4].According to the on the trail of that identify is really with the object identifies technique is basal, but follow the function of the system is the restrict of the efficiency to suffer to identify the method. 
According to the on the trail of of the action is a method to depend on to examine the technique in the action, and that technique can be been divided in to see flow( optical flow) with the method that act the — energy( motion - energy).According to the method of the edge used for the edge that follow a picture preface row, but these edges is usually the boundary line of the main object.However, because were musted shine on with the light at the color by the on the trail of object the term descends to display the obvious edge changes, so these methods will fall among the color with the variety that light shine on.In addition, be a background of picture contain very obvious edge,( follow the method) dependable result in very difficult offering.Current this type of method that a lot of cultural heritages all involve come from the Kass et al.In the snake form rate of exchange motion [ 5 the achievement of ]s.Because see the scene of to acquire from included various the noise of varieties solid the hour the resemble the machine of, therefore many systems is very rare to dependable person's face to follow the result.Many latest a research for followings met most problem in background noise, and the research inclines toward person's face that follow has not yet the proof, for example arm with hand.In this text, we put forward a kind of according to PCA solid contemporaries an examination with follow the method, that method is an activity to make use of a,such as figure,1 show resemble machine to examine with identify the person facial.This kind of method from two greatest steps composing:Person an examination with person's face ing two pairs of consecutive frames, examine a person's face candidate for election districts first, combine exploitation PCA technique to judge the real person a district.Then, make use of the characteristic technique( eigen - technique) follow to confirmed person's face.1. 
2. Face Detection

This section introduces the techniques that the proposed method uses to detect a face. To improve detection accuracy, we combine established techniques such as the skin-color model [1, 6] and PCA [7, 8].

2.1 Skin Color Classification

Detecting skin-color pixels provides a reliable way to detect and track faces. Because the RGB images produced by most video cameras encode brightness as well as color, RGB is not the best color space for detecting skin-color pixels [1, 6]. Brightness can be removed by dividing the three components of a color pixel by the intensity. In the resulting chromatic color space, the colors of human faces cluster in a small region and can be approximated by a two-dimensional Gaussian distribution. The skin-color model is therefore approximated by a 2D Gaussian model whose mean $m$ and covariance $\Sigma$ are

$$m = E[x], \qquad \Sigma = E\left[(x - m)(x - m)^T\right] \tag{1}$$

where $x$ is the chromatic color vector of a skin pixel.

Once the skin-color model has been built, a straightforward way to locate a face is to match the input image against the model and find the clusters of face color in the image. Each pixel of the original image is converted into the chromatic color space and then compared with the distribution of the skin-color model.

2.2 Motion Detection

Although skin color is widely used as a feature, it is not suitable for face detection on its own when skin-like colors appear in the background as well as in the human skin regions. Motion information removes this weakness effectively. For accuracy, after skin-color segmentation only the skin-color regions that contain motion are considered. Combining motion information with the skin-color model thus yields a binary image that separates the foreground (face regions) from the background (non-face regions). This binary image is defined as

$$M_t(x, y) = \begin{cases} 1, & |I_t(x, y) - I_{t-1}(x, y)| > \theta_t \text{ and } (x, y) \in S_t \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $I_t(x, y)$ and $I_{t-1}(x, y)$ are the intensities of pixel $(x, y)$ in the current and the previous frame, respectively, $S_t$ is the set of skin-color pixels in the current frame, and $\theta_t$ is a threshold computed with an adaptive thresholding technique [9]. To speed up processing, we simplify the image $M_t$ using morphological operations and connected-component analysis.

2.3 Face Detection Using PCA

Because a scene may contain many moving objects, it is difficult to follow the face itself through the sequence. It is also necessary to verify whether each moving object is a face or a non-face. We use the weight vectors of the candidate regions in eigenspace to address this face verification problem. To reduce the dimensionality, we project each $N$-dimensional candidate face image onto a lower-dimensional feature space, which we call the eigenspace or face space [7, 8]. In the eigenspace, each feature accounts for a different mode of variation among face images.

To sketch the eigenspace, suppose we have a set of training images $I_1, I_2, I_3, \ldots, I_M$, each of which is an $N$-dimensional column vector. The average of the training set is

$$A = \frac{1}{M} \sum_{i=1}^{M} I_i \tag{3}$$

and each image is zero-meaned as $\phi_i = I_i - A$, which gives a new set of vectors. To compute the $M$ orthogonal vectors that best describe the distribution of the face images, we first compute the covariance matrix

$$C = \frac{1}{M} \sum_{i=1}^{M} \phi_i \phi_i^T = \frac{1}{M} Y Y^T, \qquad Y = [\phi_1 \; \phi_2 \; \cdots \; \phi_M]. \tag{4}$$

Because $C$ is an $N \times N$ matrix, determining its $N$-dimensional eigenvectors and $N$ eigenvalues directly is intractable. For computational feasibility, rather than finding the eigenvectors of $C$, we compute the $M$ eigenvectors $v_k$ and eigenvalues $\lambda_k$ of the $M \times M$ matrix $Y^T Y$, and obtain a basis set from

$$u_k = \frac{Y v_k}{\sqrt{\lambda_k}}, \qquad k = 1, \ldots, M. \tag{5}$$

Of these $M$ eigenvectors, the $M'$ most significant ones are selected, namely those with the largest corresponding eigenvalues. For the $M$ training face images, the feature vectors $W_i = [w_1, w_2, \ldots, w_{M'}]$ are computed as

$$w_k = u_k^T \phi_i, \qquad k = 1, \ldots, M'. \tag{6}$$

To verify whether a candidate region is a real face image, the candidate is also projected onto the trained eigenspace using Eq. (6). The projected region is then classified by its minimum distance to the face class and the non-face class:

$$\min\left( \left\| W_k^{candidate} - W_{face} \right\|, \; \left\| W_k^{candidate} - W_{nonface} \right\| \right) \tag{7}$$

where $W_k^{candidate}$ is the $k$-th candidate region in the trained eigenspace, $W_{face}$ and $W_{nonface}$ are the center coordinates of the face class and the non-face class in the trained eigenspace, and $\|\cdot\|$ denotes the Euclidean distance in eigenspace.

3. Face Tracking

Among the most recently detected faces, the face being tracked through the image sequence is identified using a distance measure in eigenspace. To track a face, we compute the Euclidean distance between the feature vector of the tracked face and those of the $K$ most recently detected faces, and select

$$obj = \arg\min_k \left\| W_{old} - W_k \right\|, \qquad k = 1, \ldots, K. \tag{8}$$

After the face region has been identified, the distance between the center of the detected face region and the center of the screen is computed as

$$dist_t(face, screen) = Face_t(x, y) - Screen(height/2, width/2) \tag{9}$$

where $Face_t(x, y)$ is the center of the detected face region at time $t$ and $Screen(height/2, width/2)$ is the center of the screen. Using this distance vector, the orientation and duration of the camera's pan/tilt movement can be controlled. The camera controller works by driving the pan/tilt platform of the active camera so that the detected face region stays at the center of the screen. Table 2 lists the parameters used to control the active camera. The pan/tilt duration and orientation are computed by the following pseudocode:

Procedure Duration(x, y)
Begin
    Sig_d = None;
    Distance = sqrt(x^2 + y^2);
    // Test the larger threshold first (assumes θ_far > θ_close) so that Far is reachable.
    IF Distance > θ_far THEN
        Sig_d = Far;
    ELSEIF Distance > θ_close THEN
        Sig_d = Close;
    Return(Sig_d);
End Duration;

Procedure Orientation(x, y)
Begin
    Sig_o = None;
    IF x > θ_x THEN
        Add "RIGHT" to Sig_o;
    ELSEIF x < -θ_x THEN
        Add "LEFT" to Sig_o;
    IF y > θ_y THEN
        Add "UP" to Sig_o;
    ELSEIF y < -θ_y THEN
        Add "DOWN" to Sig_o;
    Return(Sig_o);
End Orientation;

4. Conclusion

In this paper, we proposed a PCA-based real-time face detection and tracking method. The proposed method runs in real time and consists of two major parts: face detection and face tracking. For an input video stream, we first detect face regions using cues such as skin color, motion information, and PCA. The face is then tracked by controlling the pan/tilt platform of the active camera so that the detected face region stays at the center of the screen. As future work, we will develop the method further by extracting facial features from the detected face region for use in an activity system.
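As an illustration of the skin-color classification step in Section 2.1, the following minimal Python sketch fits a 2D Gaussian to the chromaticities of a few sample pixels and tests other pixels against it. It is not the authors' implementation. The function names, the made-up training pixels, the normalization r = R/(R+G+B), g = G/(R+G+B), the small regularization term `eps`, and the Mahalanobis threshold of 9.0 are all assumptions introduced here; the text above does not specify them.

```python
# Illustrative sketch only: names, sample data, and thresholds are assumptions.

def chromaticity(rgb):
    """Divide out brightness: map (R, G, B) to normalized (r, g)."""
    red, green, blue = rgb
    total = red + green + blue
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # black pixel: treated as neutral (assumption)
    return (red / total, green / total)


def fit_skin_model(skin_pixels, eps=1e-6):
    """Estimate the mean and 2x2 covariance of the skin chromaticities."""
    samples = [chromaticity(p) for p in skin_pixels]
    n = len(samples)
    mean_r = sum(r for r, _ in samples) / n
    mean_g = sum(g for _, g in samples) / n
    # eps keeps the covariance invertible for tiny training sets (assumption).
    var_rr = sum((r - mean_r) ** 2 for r, _ in samples) / n + eps
    var_gg = sum((g - mean_g) ** 2 for _, g in samples) / n + eps
    cov_rg = sum((r - mean_r) * (g - mean_g) for r, g in samples) / n
    return (mean_r, mean_g), (var_rr, cov_rg, var_gg)


def mahalanobis_sq(rgb, model):
    """Squared Mahalanobis distance of a pixel from the skin-color Gaussian."""
    (mean_r, mean_g), (var_rr, cov_rg, var_gg) = model
    r, g = chromaticity(rgb)
    dr, dg = r - mean_r, g - mean_g
    det = var_rr * var_gg - cov_rg * cov_rg
    # Closed-form inverse of the symmetric 2x2 covariance matrix.
    return (var_gg * dr * dr - 2.0 * cov_rg * dr * dg + var_rr * dg * dg) / det


def is_skin(rgb, model, threshold=9.0):
    """Label a pixel as skin if it lies close to the model (hypothetical threshold)."""
    return mahalanobis_sq(rgb, model) <= threshold


# Made-up skin samples, used only to exercise the functions.
training = [(200, 150, 120), (220, 170, 140), (180, 130, 100),
            (210, 160, 125), (190, 140, 115)]
model = fit_skin_model(training)
print(is_skin((200, 150, 120), model))  # a training pixel
print(is_skin((100, 75, 60), model))    # same color at half the brightness
print(is_skin((30, 60, 200), model))    # a saturated blue pixel
```

Because only chromaticity enters the model, the half-brightness pixel (100, 75, 60) is classified the same way as the training pixel (200, 150, 120), which is the brightness invariance that motivates leaving RGB space. In a full pipeline the resulting skin mask would still be combined with the motion cue of Section 2.2 before PCA verification.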