Local maximum-entropy approximation schemes: a seamless bridge between finite elements and m
English Vocabulary for Automation
exploit ['eksplɔit] 利用ferroelectric [,ferəui'lektrik] 铁电的注意e的发音,短促compound ['kɔmpaund] 混合物注意奥的发音capability [,keipə'biliti] 能力性能electromechanicalinherent [in'hiərənt] 内在的固有的regime [rei'ʒi:m] 政权,状态hysteresis [,histə'ri:sis]constitutive ['kɔnstə,tju:tiv] 本质的基本的,制定的,分子的nonlinearity [lini'ærəti]noncentrosymmetric [si'metrik]regulate ['regju,leit]utilize ['ju:tilaiz] 利用accuracy ['ækjurəsi] 正确的,准确的dipole ['daipəul] 偶极子,双极accommodate [ə'kɔmədeit] 提供algorithm ['ælgə,riðəm] 算法specification [,spesifi'keiʃən] 说明,详细ferroelastic [i'læstik]whereas [hwɛər'æz] 而associate [ə'səuʃieit]stress [stres] 应力manifest ['mænifest] 明显的strain [strein] 应变illustrate ['iləstreit]layer['leiə] 层unimorphmodification [,mɔdifi'keiʃən]airfoil ['ɛə,fɔil] 注意重音,机combination [,kɔmbi'neiʃən] 【数】组合[C]robustness [rəu'bʌstnis] 稳健性,鲁棒性metallic [mi'tælik] 金属的;含(或产)金属的asymmetric [,æsi'metrik] 不对称的,不匀称的behaviorpiezoelectric [pai,i:zəui'lektrik] 压电的actuate ['æktjueit] 开动(机器等)ionic[ai'ɔnik] 【物】离子的polarization [,pəulərai'zeiʃən] 产生极性;极化reversible [ri'və:səbl] 可反转的;可逆的alter ['ɔ:ltə] 改变configuration [kən,figju'reiʃən] 组态配置term [tə:m] 把...称为,把...叫做converse [kən'və:s] 【书】交谈,谈话;逆,相反的事物directcoupleatomic [ə'tɔmik] 原子的angstrom ['æŋstrəm] 埃(光谱线波长单位) 一万万分之一厘米polar ['pəulə] 北极的,南极的;极地的multistagedual ['dju:əl] "双倍的;双重的"sensitivity [,sensi'tiviti] 敏感性;感受性saturation [,sætʃə'reiʃən]domain [də'mein] 【数】域; 定义域【物】(磁)畴homogencity [,hɔməudʒe'ni:iti] 同种;同质mitigate ['miti,geit] "使缓和;减轻"deleterious [,deli'tiəriəs] 有害的;有毒的hybrid ['haibrid] 混合源物;合成物;混合词ratio ['reiʃəu] 比;比率;【数】比例stack [stæk]cylindrical [si'lindrikl] 圆柱形的;圆筒状的;attenuate [ə'tenjueit] (使)变细;(使)变小yield[ji:ld] 产生irreversible [,iri'və:səbl] 不可逆的;不能翻转的acoustic [ə'ku:stik] 声学的employ [im'plɔi] 使用利用scale[skeil] "大小;规模[C][U]比率;缩尺[C]"real-timebiological[,baiə'lɔdʒikəl] 生物的;生物学的efficacy ['efikəsi] 效力,功效diminish [di'miniʃ] "减少,减小,缩减"robust control [rəu'bʌst]thermal ['θə:məl]gain [gein] 增益circumvent [,sə:kəm'vent] "环绕;包围"compensator ['kɔmpenseitə] 补偿(或赔偿)者;补偿(或赔偿)物property ['prɔpəti] 
特性,性能,属性transient ['trænʃənt] "瞬态"closure ['kləuʒə]sufficiently [sə'fiʃəntli] "足够地,充分地"implementation [,implimen'teiʃən] 履行;完成stochastic [stə'kæstik] 可能的, 或然的; 【数】随机的homogenizationinitial [i'niʃəl] 开始的summarize ['sʌməraiz] 总结quantify ['kwɔntifai] 为...定量;以数量表示roughly ['rʌfli] 粗糙地categorize ['kætigə,raiz] 使列入...的范畴;将...分类phenomenologicalconstruct [kən'strʌkt] 建造,构成microscopic ['maikrə'skɔpik] 显微镜似的;精微的mesoscopicframework ['freimwə:k] "构造,机构,组织"optimal ['ɔptəməl] 最理想的parameter[pə'ræmitɚ] 参数present ['preznt]tenet ['tenit] 信条;主义;教义;宗旨;原则category ['kætigəri] "种类;部属;类目;范畴"originate [ə'ridʒi,neit] 发源;来自;产生piezoceramic [si'ræmik] 陶器的;制陶艺术的generality[,dʒenə'ræliti] 一般性;普遍性attribute [ə'tribju:t] 属性;特性,特质identify [ai'dentifai] 确认;识别;鉴定,验明faciliatecorrelation [,kɔ:ri'leiʃən] 相互关系,关联compromise ['kɔmprəmaiz] 妥协方案,折衷办法;折衷物quasistatic ['stætik] ['kweisai]formulation [,fɔ:mju'leiʃən] 公式化approach [ə'prəutʃ]aggregate ['ægrigeit] 使聚集local ['ləukəl] 局部的expression [iks'preʃən]internal [in'tə:nəl]entropy ['entrəpi] 熵landscape ['lændskeip]minimizaition [,minimai'zeiʃən] 减缩到最小transitionanhystereticasymptotic 渐近线的approximation [ə,prɔksi'meiʃən] 【数】近似值[kwə'drætik] 数】二次方程式algebraic [,ældʒi'breik] 代数的;代数学的;代数上的negligible['neglidʒəbl] 可以忽略的;crystal ['kristl] 晶体的homogeneous [,hɔmə'dʒi:niəs] 同种的;同质的uniform ['ju:nifɔ:m] "相同的,一致的,均匀的"polverystallinitycoercive [kəu'ə:siv] 强制的;高压的manifestation [,mænifes'teiʃən] 显示,表明;证实correlate ['kɔ:rə,leit]congruency ['kɔŋgruənsi] 适合;一致exhibit [ig'zibit] 表示,显出analogous [ə'næləgəs] 类似的;可比拟的memory alloymethodology [,meθə'dɔlədʒi] 方法论ferroicrelaxation[,ri:læk'seiʃən] 驰豫eliminate [i'limineit] 排除,消除,消灭lookup tablemotivate ['məuti,veit] 给...动机;刺激;激发barium ['bɛəriəm] 钡titanate ['taitəneit] 钛酸盐reiterate [ri:'itəreit] 重做,反复做;重申,反复讲derivative[di'rivətiv] 引出的,导数,微商relaxation[,ri:læk'seiʃən] 松弛,放松tungsten ['tʌŋstən] 【化】钨hydrostatic [,haidrə'stætik] 静液压embrittle [im'britəl] 使(金属)变脆hydrogen ['haidridʒən] 【化】氢[U]excessive [ik'sesiv] 过度的;过分的;极度的capacity [kə'pæsiti] 
容量,容积[U][S]equilibrium [,i:kwi'libriəm] 相称;平衡;均衡gravitational ['grævə'teiʃənəl] (万有)引力的;重力的resultant [ri'zʌltənt] 作为结果的intersect [,intə'sekt] 横断,贯穿;与...交叉gage [geidʒ] 挑战;象征挑战而扔下的手套或帽子等,规格,计器,量尺oblique [əb'li:k] 斜的;倾斜的[Z]infinitesimal [,infini'tesiməl] 极微小的,无限小的determinant [di'tə:minənt] 【数】行列式auxiliary [ɔ:g'ziljəri] 辅助的bisect [bai'sekt] 平分;二等分metallurgy [me'tælədʒi] 冶金学continuum [kən'tinjuəm] 连续统;闭联集designate ['dezigneit] 标出;表明;指定dummy ['dʌmi] 虚位orthogonal [ɔ:'θɔgənl] 直角的;矩形的dilation [dai'leiʃən] 扩张;扩大部分variables ['vɛəriəbl] 变量circuit ['sə:kit] 电路electrical engineering [i'lektrikəl] 电气工程unit ['ju:nit] 单位analysis [ə'næləsis] 分析voltage ['vəultidʒ] 电压current ['kə:rənt] 电流basic circuit element ['elimənt] 基本电路单元summary ['sʌməri] 总结pratical perspective [pə'spektiv] 实用看法current source 电流源electrical perspective [i'lektrikəl]circuit model ['mɔdl] 电路模型Kirchhoff's laws 基尔霍府定律dependent source [di'pendənt] 独立源rear window defroster [di:'frɔ:stə] 后窗除霜器resistor in parallel 并联电阻resistor in series [ri'zistə] 串联电阻voltage-divider circuit ['vəultidʒ] 分压电路current-divider circuit 分流电路wheatstone bridge 会斯顿桥delta-to-wye equivalent circuits ['deltə] 三角形-星形等效电路realistic resistors [riə'listik] 实际电阻terminology [,tə:mi'nɔlədʒi] 术语node-voltage method [nəud] 节点电压方法special cases ['speʃəl] 特例mesh-current method [meʃ] 网格电流方法source transformations [,trænsfə'meiʃən] 源变化maximum ['mæksiməm] 最大的superposition 叠加operational amplifier [,ɔpə'reiʃənəl] 运算放大器terminal ['tə:minl] 终端strain gages [strein] 应力仪inverting-amplifier circuit ['æmplifaiə] 反相放大电路summing-amplifier circuit 加法放大电路noninverting-amplifier circuit 正向放大电路difference-amplifier circuit ['difərəns] 差分放大电路proximity switch [prɔk'simiti] 接近开关inductor [in'dʌktə] 电感器capacitor [kə'pæsitə] 电容器inductance and capacitance [in'dʌktəns] 电感和电容mutual inductance ['mju:tʃuəl] 互感flashing light circuit 闪光电路natural response ['nætʃərəl] 自然响应step response [ri'spɔns] 阶跃响应general solution ['dʒenərəl] 一般解sequential switch [si'kwenʃəl] 时序开关unbounded response [ʌn'baundid] 
无界响应integrating amplifier ['intigreit] 集成放大器ignition circuit [ig'niʃən] 点火电路household distribution ['haushəuld] 家用配电sinusoidal source [,sinə'sɔidl 正选电源phasor 相量sinusoidal response 正弦响应passive circuit elements ['pæsiv] 无源电路元件frequency domain [də'mein] 频域transformer [træns'fɔ:mə] 变压器phasor diagrams ['daiəgræm] 相量图heating appliances [ə'plaiəns] 加热电气instantaneous power [,instən'teiniəs] 瞬时功率average and reactive power [ri'æktiv] 平均无功功率rms value 有效值complex power 复数功率balanced three-phase voltage 平衡三相电压definition of laplace transform [,defi'niʃən] 拉普拉斯变换定义step function ['fʌŋkʃən] 阶跃函数impulse function ['impʌls] 脉冲函数functional transforms ['fʌŋkʃənəl] 功能变换operational transforms [,ɔpə'reiʃənəl] 运算变换inverse transform ['in'və:s] 逆变换poles and zeros of F(s) 极点和零点initial-and final-value theorem ['θiərəm 初值和终值定理s Domain s域convolution integral [,kɔnvə'lu:ʃən] 卷积积分steady-state sinusoidal [,sinə'sɔidl 稳态正弦的impulse function 脉冲函数pushbutton telephone circuit 按键电话电路some preliminary [pri'liminəri]low-pass filter 低通滤波器high-pass filter ['filtə] 高通滤波器bandpass filter 带通滤波器bandreject filter 带阻滤波器bode diagrams 波特图complex poles and zeros 复数零极点bass volume control ['beis]first-order low-pass and high-pass filters 一阶低通滤波器scale[skeil] 标度尺度Op Amp bandpass and bandreject filternarrow-band bandpass and bandreject filter 窄带带通和带阻滤波器Fourier series 傅立叶变换Fourier Coefficient 傅立叶系数Alternative Trigonometric Form ['trigənə'metrik] 备选三角形式periodic function [,piəri'ɔdik] 周期函数Exponential Form [,ekspəu'nenʃəl] 指数函数Amplitude and Phase Spectra 幅值和相谱derivation of the Fourier Transform [,deri'veiʃən] 傅立叶变换微分Convergence of the Fourier Integral 傅立叶积分收敛mathematical properties [,mæθə'mætikl] 数学特性parseval theoremtwo-port parameters 两端口参数interconnected two-port circuitpreliminary steps [pri'liminəri] 初步步骤Cramer's Method 克莱姆方法characteristic determinant [,kæriktə'ristik] 特征行列式numerator determinant ['nju:məreitə] 分子行列式evaluation of a determinant [i,vælju'eiʃən] 行列式估值matrices ['meitrisi:z] 矩阵matrix algebra ['meitriks]['ældʒibrə] 
矩阵代数partitioned matrics [pɑ:'tiʃən] 分块矩阵identity [ai'dentiti] 恒等式adjoint 伴随,附加inverse metrices ['in'və:s] 逆矩阵Complex number 复数notation [nəu'teiʃən] 标记法graphical representation ['græfikəl] 图形表示arithmetic operation [ə'riθmətik] 算术运算useful identities 有用恒等式integer power of a complex number ['intidʒə]roots of a complex number 复数根magnetically coupled coils [mæg'netikli] 磁耦合线圈indeal transformer 理想变压器Decibel 分贝Abbreviated table [ə'bri:vieitid] 缩写表trigonometric identity ['trigənə'metrik] 三角恒等式aptitude ['æptitju:d] 性能applied science 应用科学dominant ['dɔminənt] 优势的satellite communication [kə,mjunə'keʃən] 卫星通讯digital computer 数字计算机diagnostic [,daiəg'nɔstik] 诊断的surgical medical equipment ['sə:dʒikəl]component [kəm'pəunənt]hierarchy ['haiərɑ:ki] 层次结构analysis entail [in'teil]polarity reference system [pəu'læriti]transmit [træns'mit] 发送,传递natural phenomena [fi'nɔminə]['nætʃərəl] 信息pervade [pə:'veid] 流行于,渗透于transportation [,trænspə'teiʃən] 运输calssfication [,klæsəfə'keʃən] 分类distribute information [di'stribjut] 分布transmitter [træns'mitə] 传递者receiver [rɪ'sivɚ]radar system ['reidɑ:]coordinate [kəu'ɔ:dneit] 协调,调节depict [di'pikt] 描述,描写coaxial cable [kəu'æksiəl] 同轴的microwave transmissionbroadcast ['brɔ:dkɑ:st] 广播antenna [æn'tenə] 天线fiber ['faibə] 纤维earphone ['iəfəun] 耳机stage [steidʒ] 舞台,阶段guarantee [,gærən'ti:] 保证supercomputerweather datachemical interaction [,intə'rækʃən] 互相影响organic molecule [ɔ:'gænik]:['mɔlikju:l] 有机分子microcircuitthermodynamic ['θə:məudai'næmik] 热力学regulate process ['regju,leit] 控制,调整flow rate [fləu]refinery [ri'fainəri] 精炼厂fuel-air mixture ['mikstʃə] 混合autopilot ['pailət] 飞行distributehydroelectric ['haidrəui'lektrik] 水力电气的;水力发电的crisscross ['kris,krɔs] 十字形,交叉redundancy [ri'dʌndənsi] 多余,冗余massive quantities ['mæsiv]orbiting weather satellites ['sætəlait] 卫星tomography [təu'mɔgrəfi] X线体层照相术diagnosis [,daiəg'nəusis] 诊断disease and injury [di'zi:z]:['indʒəri]reliablycontinentsophisticatedpilotnavigationairbornerudderwing 
flap
aileron
abound
unpredictable
primarily
diverse
branch
approximate
endeavor
refer to
electromagnetic field theory
cumbersome
prerequisite
assumption
simultaneously
lumped-parameter system
quantitative
propagate
wavelength
dimension
velocity
propagation
define
pertinent
strategy
destination
paraphrase
objective
extraneous
weed out
disposal
bogged down
sketch
verbal
reference
simplified equivalent circuit
analytical tool
anticipate
streamline
instructor's preference
alternative
revisit
drill exercise
get stuck
intuition
hunch
temporarily
magnitude
realizable
validity
original
intuition
recipe
elaborate
guideline
quantitative
multidisciplinary
abbreviated
thermodynamic
luminous
conveniently
prefix
divisible
yield
overview
assess
specification
circuit component
battery
accuracy
behavior
refinement
prototype
refinement
eventually
iterative process
mature circuit
quantitatively
measurable quantities
review
bipolar
electric fluid
differential form
bipolar
electron
conduction
covalent bond
attribute
subdivided
representation
commitment
blank
commitment
denote
plus and minus sign
algebraically
arbitrary
subsequent equations
passive sign convention
adoption
passive
nonelectrical
conveniently
illustrate
expending
absorbing energy
water pump
product
deliver
English Words
Synopsis:概要,大纲
Macroscopic:宏观的,肉眼可见的
Interconnected:连通的,有联系的
Inter-dendritic:枝晶间
Ingots:钢锭,铸块
Globular:球状的
Substantially:实质上,大体上,充分地
Qualitatively:定性地
Bulging:膨胀,凸出,打气,折皱(在连铸中是鼓肚的意思!)
Hydrogen induced cracking:氢致裂纹(HIC)
Correlated to:相互关联
Perform:完成,执行
Bulk concentration:体积浓度
Introduction:引言
Accordingly:因此,相应地
Countermeasure:对策,对抗措施
Equiaxed crystal:等轴晶
Agglomerate:聚合
Permeability:渗透性
Slab:厚板
Plate:薄板
Contraction:收缩,紧缩
Conventional:传统的
Inconsistency:不一致
Susceptibility:敏感性
Resolve:解决,分解
Morphology:形态
Interpret:解释,解读
Areal fraction:面积分数
Quench:淬火
Dendrite tips:枝晶尖端
Specimen:试样,样品
Proportional:成比例的
Coarsening:晶粒粗大
Coalescence:合并,联合
Nevertheless:不过,虽然如此
Planar:二维的,平面的
Cellular:细胞的,多孔的
Interface:界面,接触面
Refer to:适用于
Constant:常量
Approximation:近似值,近似法
Apparatus:仪器,装置
Diagram:图表,图解
Derive from:源自,来自
Longitudinal:纵向的,长度的
Section:截面
Magnification:放大率
schematic:图解的
curvature:弯曲
arrowed:标有箭头的
in essence:本质上,其实
lagged:延迟
radial heat:径向热
transient:短暂的
crucible:坩埚
internal diam:内部直径
chromel alumel thermocouple:镍铬-镍铝热电偶
alumina:氧化铝
agitated ice water:搅拌的冰水
given:考虑到
electropolish:用电解法抛光
transverse:横向的,横断的
metallographic:金相的
diffusion:扩散,传播
coefficient:系数,率
undercooling/supercooling:过冷
interdendritic:枝晶间,树枝晶间的
intragranular:晶内的
granular:颗粒的,粒状的
isotherm:等温线
arc-welded:弧焊
deposit:沉积物,存款
inversely:相反地
geometry:几何学
justification:理由,辩护,认为正当
somewhat:有点
gradient:梯度,倾斜度
recognised:承认,辨别
substitute:代替
exponent:指数
excluding:不包括,将。
English Vocabulary for Statistics
log-log 双对数
log-normal distribution 对数正态分布
longitudinal 纵向的
loss function 损失函数
M
Mahalanobis' generalized distance Mahalanobis广义距离
drop out 脱落例
Durbin-Watson statistic(ratio) Durbin-Watson统计量(比)
E
efficient, efficiency 有效的、有效性
* Engel's coefficient 恩格尔系数
entropy 熵
epidemiology 流行病学
* error 误差
item 项
J
Jackknife 刀切法
K
Kaplan-Meier estimate Kaplan-Meier估计
* Kendall's rank correlation coefficients 肯德尔等级相关系数
Kullback-Leibler information number 库尔贝克-莱布勒信息数
model, -ing 模型(建模)
moment 矩
moving average 移动平均
multicollinear, -ity 多重共线(性)
multidimensional scaling (MDS) 多维标度
multiple answer 重复回答
multiple choice 多重选择
multiple comparison 多重比较
* histogram 直方图
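SCI Journals Accepting Power Systems Submissions, with Comments
Electric Power Systems Research (Switzerland): publishes original papers on power generation, transmission and distribution, and power applications. A high-priced journal.
IEEE TRANSACTIONS ON POWER SYSTEMS (USA): publishes papers on the technical requirements, planning, analysis, reliability, operation, and economics of power systems, including generation, transmission, and distribution systems. The review cycle averages three months.
IEEE Transactions on Smart Grid
IEE PROCEEDINGS-GENERATION TRANSMISSION AND DISTRIBUTION (England)
International Journal of Electrical Power and Energy Systems (England): mainly publishes papers, reviews, and conference reports on theoretical and applied problems in power and energy systems, covering generation and network planning, network theory, dynamics of large and small systems, system control centers, online control, and related topics.
EUROPEAN TRANSACTIONS ON ELECTRICAL POWER (renamed International Transactions on Electrical Energy Systems in 2013): slow to respond to submissions; review cycle unknown.
ELECTRIC POWER COMPONENTS AND SYSTEMS (USA): publishes theoretical and applied research papers on power systems. Topics include solid-state control of electric machines, new types of electric machines, electromagnetic fields and energy converters, power system planning and protection, and reliability and safety.
ELECTRIC MACHINES AND POWER SYSTEMS (USA)
IEE PROCEEDINGS-ELECTRIC POWER APPLICATIONS (England)
IEEE COMPUTER APPLICATIONS IN POWER (USA): publishes research on computer applications in the design, operation, and control of power systems.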
Machine Learning Terminology: English-Chinese Glossary
机器学习专业词汇中英⽂对照activation 激活值activation function 激活函数additive noise 加性噪声autoencoder ⾃编码器Autoencoders ⾃编码算法average firing rate 平均激活率average sum-of-squares error 均⽅差backpropagation 后向传播basis 基basis feature vectors 特征基向量batch gradient ascent 批量梯度上升法Bayesian regularization method 贝叶斯规则化⽅法Bernoulli random variable 伯努利随机变量bias term 偏置项binary classfication ⼆元分类class labels 类型标记concatenation 级联conjugate gradient 共轭梯度contiguous groups 联通区域convex optimization software 凸优化软件convolution 卷积cost function 代价函数covariance matrix 协⽅差矩阵DC component 直流分量decorrelation 去相关degeneracy 退化demensionality reduction 降维derivative 导函数diagonal 对⾓线diffusion of gradients 梯度的弥散eigenvalue 特征值eigenvector 特征向量error term 残差feature matrix 特征矩阵feature standardization 特征标准化feedforward architectures 前馈结构算法feedforward neural network 前馈神经⽹络feedforward pass 前馈传导fine-tuned 微调first-order feature ⼀阶特征forward pass 前向传导forward propagation 前向传播Gaussian prior ⾼斯先验概率generative model ⽣成模型gradient descent 梯度下降Greedy layer-wise training 逐层贪婪训练⽅法grouping matrix 分组矩阵Hadamard product 阿达马乘积Hessian matrix Hessian 矩阵hidden layer 隐含层hidden units 隐藏神经元Hierarchical grouping 层次型分组higher-order features 更⾼阶特征highly non-convex optimization problem ⾼度⾮凸的优化问题histogram 直⽅图hyperbolic tangent 双曲正切函数hypothesis 估值,假设identity activation function 恒等激励函数IID 独⽴同分布illumination 照明inactive 抑制independent component analysis 独⽴成份分析input domains 输⼊域input layer 输⼊层intensity 亮度/灰度intercept term 截距KL divergence 相对熵KL divergence KL分散度k-Means K-均值learning rate 学习速率least squares 最⼩⼆乘法linear correspondence 线性响应linear superposition 线性叠加line-search algorithm 线搜索算法local mean subtraction 局部均值消减local optima 局部最优解logistic regression 逻辑回归loss function 损失函数low-pass filtering 低通滤波magnitude 幅值MAP 极⼤后验估计maximum likelihood estimation 极⼤似然估计mean 平均值MFCC Mel 倒频系数multi-class classification 多元分类neural networks 神经⽹络neuron 神经元Newton’s method ⽜顿法non-convex function ⾮凸函数non-linear feature ⾮线性特征norm 范式norm bounded 有界范数norm constrained 范数约束normalization 归⼀化numerical roundoff errors 
数值舍⼊误差numerically checking 数值检验numerically reliable 数值计算上稳定object detection 物体检测objective function ⽬标函数off-by-one error 缺位错误orthogonalization 正交化output layer 输出层overall cost function 总体代价函数over-complete basis 超完备基over-fitting 过拟合parts of objects ⽬标的部件part-whole decompostion 部分-整体分解PCA 主元分析penalty term 惩罚因⼦per-example mean subtraction 逐样本均值消减pooling 池化pretrain 预训练principal components analysis 主成份分析quadratic constraints ⼆次约束RBMs 受限Boltzman机reconstruction based models 基于重构的模型reconstruction cost 重建代价reconstruction term 重构项redundant 冗余reflection matrix 反射矩阵regularization 正则化regularization term 正则化项rescaling 缩放robust 鲁棒性run ⾏程second-order feature ⼆阶特征sigmoid activation function S型激励函数significant digits 有效数字singular value 奇异值singular vector 奇异向量smoothed L1 penalty 平滑的L1范数惩罚Smoothed topographic L1 sparsity penalty 平滑地形L1稀疏惩罚函数smoothing 平滑Softmax Regresson Softmax回归sorted in decreasing order 降序排列source features 源特征sparse autoencoder 消减归⼀化Sparsity 稀疏性sparsity parameter 稀疏性参数sparsity penalty 稀疏惩罚square function 平⽅函数squared-error ⽅差stationary 平稳性(不变性)stationary stochastic process 平稳随机过程step-size 步长值supervised learning 监督学习symmetric positive semi-definite matrix 对称半正定矩阵symmetry breaking 对称失效tanh function 双曲正切函数the average activation 平均活跃度the derivative checking method 梯度验证⽅法the empirical distribution 经验分布函数the energy function 能量函数the Lagrange dual 拉格朗⽇对偶函数the log likelihood 对数似然函数the pixel intensity value 像素灰度值the rate of convergence 收敛速度topographic cost term 拓扑代价项topographic ordered 拓扑秩序transformation 变换translation invariant 平移不变性trivial answer 平凡解under-complete basis 不完备基unrolling 组合扩展unsupervised learning ⽆监督学习variance ⽅差vecotrized implementation 向量化实现vectorization ⽮量化visual cortex 视觉⽪层weight decay 权重衰减weighted average 加权平均值whitening ⽩化zero-mean 均值为零Letter AAccumulated error backpropagation 累积误差逆传播Activation Function 激活函数Adaptive Resonance Theory/ART ⾃适应谐振理论Addictive model 加性学习Adversarial Networks 对抗⽹络Affine Layer 仿射层Affinity matrix 亲和矩阵Agent 代理 / 智能体Algorithm 算法Alpha-beta 
pruning α-β剪枝Anomaly detection 异常检测Approximation 近似Area Under ROC Curve/AUC Roc 曲线下⾯积Artificial General Intelligence/AGI 通⽤⼈⼯智能Artificial Intelligence/AI ⼈⼯智能Association analysis 关联分析Attention mechanism 注意⼒机制Attribute conditional independence assumption 属性条件独⽴性假设Attribute space 属性空间Attribute value 属性值Autoencoder ⾃编码器Automatic speech recognition ⾃动语⾳识别Automatic summarization ⾃动摘要Average gradient 平均梯度Average-Pooling 平均池化Letter BBackpropagation Through Time 通过时间的反向传播Backpropagation/BP 反向传播Base learner 基学习器Base learning algorithm 基学习算法Batch Normalization/BN 批量归⼀化Bayes decision rule 贝叶斯判定准则Bayes Model Averaging/BMA 贝叶斯模型平均Bayes optimal classifier 贝叶斯最优分类器Bayesian decision theory 贝叶斯决策论Bayesian network 贝叶斯⽹络Between-class scatter matrix 类间散度矩阵Bias 偏置 / 偏差Bias-variance decomposition 偏差-⽅差分解Bias-Variance Dilemma 偏差 – ⽅差困境Bi-directional Long-Short Term Memory/Bi-LSTM 双向长短期记忆Binary classification ⼆分类Binomial test ⼆项检验Bi-partition ⼆分法Boltzmann machine 玻尔兹曼机Bootstrap sampling ⾃助采样法/可重复采样/有放回采样Bootstrapping ⾃助法Break-Event Point/BEP 平衡点Letter CCalibration 校准Cascade-Correlation 级联相关Categorical attribute 离散属性Class-conditional probability 类条件概率Classification and regression tree/CART 分类与回归树Classifier 分类器Class-imbalance 类别不平衡Closed -form 闭式Cluster 簇/类/集群Cluster analysis 聚类分析Clustering 聚类Clustering ensemble 聚类集成Co-adapting 共适应Coding matrix 编码矩阵COLT 国际学习理论会议Committee-based learning 基于委员会的学习Competitive learning 竞争型学习Component learner 组件学习器Comprehensibility 可解释性Computation Cost 计算成本Computational Linguistics 计算语⾔学Computer vision 计算机视觉Concept drift 概念漂移Concept Learning System /CLS 概念学习系统Conditional entropy 条件熵Conditional mutual information 条件互信息Conditional Probability Table/CPT 条件概率表Conditional random field/CRF 条件随机场Conditional risk 条件风险Confidence 置信度Confusion matrix 混淆矩阵Connection weight 连接权Connectionism 连结主义Consistency ⼀致性/相合性Contingency table 列联表Continuous attribute 连续属性Convergence 收敛Conversational agent 会话智能体Convex quadratic programming 凸⼆次规划Convexity 凸性Convolutional neural network/CNN 
卷积神经⽹络Co-occurrence 同现Correlation coefficient 相关系数Cosine similarity 余弦相似度Cost curve 成本曲线Cost Function 成本函数Cost matrix 成本矩阵Cost-sensitive 成本敏感Cross entropy 交叉熵Cross validation 交叉验证Crowdsourcing 众包Curse of dimensionality 维数灾难Cut point 截断点Cutting plane algorithm 割平⾯法Letter DData mining 数据挖掘Data set 数据集Decision Boundary 决策边界Decision stump 决策树桩Decision tree 决策树/判定树Deduction 演绎Deep Belief Network 深度信念⽹络Deep Convolutional Generative Adversarial Network/DCGAN 深度卷积⽣成对抗⽹络Deep learning 深度学习Deep neural network/DNN 深度神经⽹络Deep Q-Learning 深度 Q 学习Deep Q-Network 深度 Q ⽹络Density estimation 密度估计Density-based clustering 密度聚类Differentiable neural computer 可微分神经计算机Dimensionality reduction algorithm 降维算法Directed edge 有向边Disagreement measure 不合度量Discriminative model 判别模型Discriminator 判别器Distance measure 距离度量Distance metric learning 距离度量学习Distribution 分布Divergence 散度Diversity measure 多样性度量/差异性度量Domain adaption 领域⾃适应Downsampling 下采样D-separation (Directed separation)有向分离Dual problem 对偶问题Dummy node 哑结点Dynamic Fusion 动态融合Dynamic programming 动态规划Letter EEigenvalue decomposition 特征值分解Embedding 嵌⼊Emotional analysis 情绪分析Empirical conditional entropy 经验条件熵Empirical entropy 经验熵Empirical error 经验误差Empirical risk 经验风险End-to-End 端到端Energy-based model 基于能量的模型Ensemble learning 集成学习Ensemble pruning 集成修剪Error Correcting Output Codes/ECOC 纠错输出码Error rate 错误率Error-ambiguity decomposition 误差-分歧分解Euclidean distance 欧⽒距离Evolutionary computation 演化计算Expectation-Maximization 期望最⼤化Expected loss 期望损失Exploding Gradient Problem 梯度爆炸问题Exponential loss function 指数损失函数Extreme Learning Machine/ELM 超限学习机Letter FFactorization 因⼦分解False negative 假负类False positive 假正类False Positive Rate/FPR 假正例率Feature engineering 特征⼯程Feature selection 特征选择Feature vector 特征向量Featured Learning 特征学习Feedforward Neural Networks/FNN 前馈神经⽹络Fine-tuning 微调Flipping output 翻转法Fluctuation 震荡Forward stagewise algorithm 前向分步算法Frequentist 频率主义学派Full-rank matrix 满秩矩阵Functional neuron 功能神经元Letter GGain ratio 增益率Game theory 博弈论Gaussian kernel function 
⾼斯核函数Gaussian Mixture Model ⾼斯混合模型General Problem Solving 通⽤问题求解Generalization 泛化Generalization error 泛化误差Generalization error bound 泛化误差上界Generalized Lagrange function ⼴义拉格朗⽇函数Generalized linear model ⼴义线性模型Generalized Rayleigh quotient ⼴义瑞利商Generative Adversarial Networks/GAN ⽣成对抗⽹络Generative Model ⽣成模型Generator ⽣成器Genetic Algorithm/GA 遗传算法Gibbs sampling 吉布斯采样Gini index 基尼指数Global minimum 全局最⼩Global Optimization 全局优化Gradient boosting 梯度提升Gradient Descent 梯度下降Graph theory 图论Ground-truth 真相/真实Letter HHard margin 硬间隔Hard voting 硬投票Harmonic mean 调和平均Hesse matrix 海塞矩阵Hidden dynamic model 隐动态模型Hidden layer 隐藏层Hidden Markov Model/HMM 隐马尔可夫模型Hierarchical clustering 层次聚类Hilbert space 希尔伯特空间Hinge loss function 合页损失函数Hold-out 留出法Homogeneous 同质Hybrid computing 混合计算Hyperparameter 超参数Hypothesis 假设Hypothesis test 假设验证Letter IICML 国际机器学习会议Improved iterative scaling/IIS 改进的迭代尺度法Incremental learning 增量学习Independent and identically distributed/i.i.d. 独⽴同分布Independent Component Analysis/ICA 独⽴成分分析Indicator function 指⽰函数Individual learner 个体学习器Induction 归纳Inductive bias 归纳偏好Inductive learning 归纳学习Inductive Logic Programming/ILP 归纳逻辑程序设计Information entropy 信息熵Information gain 信息增益Input layer 输⼊层Insensitive loss 不敏感损失Inter-cluster similarity 簇间相似度International Conference for Machine Learning/ICML 国际机器学习⼤会Intra-cluster similarity 簇内相似度Intrinsic value 固有值Isometric Mapping/Isomap 等度量映射Isotonic regression 等分回归Iterative Dichotomiser 迭代⼆分器Letter KKernel method 核⽅法Kernel trick 核技巧Kernelized Linear Discriminant Analysis/KLDA 核线性判别分析K-fold cross validation k 折交叉验证/k 倍交叉验证K-Means Clustering K – 均值聚类K-Nearest Neighbours Algorithm/KNN K近邻算法Knowledge base 知识库Knowledge Representation 知识表征Letter LLabel space 标记空间Lagrange duality 拉格朗⽇对偶性Lagrange multiplier 拉格朗⽇乘⼦Laplace smoothing 拉普拉斯平滑Laplacian correction 拉普拉斯修正Latent Dirichlet Allocation 隐狄利克雷分布Latent semantic analysis 潜在语义分析Latent variable 隐变量Lazy learning 懒惰学习Learner 学习器Learning by analogy 类⽐学习Learning rate 学习率Learning Vector Quantization/LVQ 
学习向量量化Least squares regression tree 最⼩⼆乘回归树Leave-One-Out/LOO 留⼀法linear chain conditional random field 线性链条件随机场Linear Discriminant Analysis/LDA 线性判别分析Linear model 线性模型Linear Regression 线性回归Link function 联系函数Local Markov property 局部马尔可夫性Local minimum 局部最⼩Log likelihood 对数似然Log odds/logit 对数⼏率Logistic Regression Logistic 回归Log-likelihood 对数似然Log-linear regression 对数线性回归Long-Short Term Memory/LSTM 长短期记忆Loss function 损失函数Letter MMachine translation/MT 机器翻译Macron-P 宏查准率Macron-R 宏查全率Majority voting 绝对多数投票法Manifold assumption 流形假设Manifold learning 流形学习Margin theory 间隔理论Marginal distribution 边际分布Marginal independence 边际独⽴性Marginalization 边际化Markov Chain Monte Carlo/MCMC 马尔可夫链蒙特卡罗⽅法Markov Random Field 马尔可夫随机场Maximal clique 最⼤团Maximum Likelihood Estimation/MLE 极⼤似然估计/极⼤似然法Maximum margin 最⼤间隔Maximum weighted spanning tree 最⼤带权⽣成树Max-Pooling 最⼤池化Mean squared error 均⽅误差Meta-learner 元学习器Metric learning 度量学习Micro-P 微查准率Micro-R 微查全率Minimal Description Length/MDL 最⼩描述长度Minimax game 极⼩极⼤博弈Misclassification cost 误分类成本Mixture of experts 混合专家Momentum 动量Moral graph 道德图/端正图Multi-class classification 多分类Multi-document summarization 多⽂档摘要Multi-layer feedforward neural networks 多层前馈神经⽹络Multilayer Perceptron/MLP 多层感知器Multimodal learning 多模态学习Multiple Dimensional Scaling 多维缩放Multiple linear regression 多元线性回归Multi-response Linear Regression /MLR 多响应线性回归Mutual information 互信息Letter NNaive bayes 朴素贝叶斯Naive Bayes Classifier 朴素贝叶斯分类器Named entity recognition 命名实体识别Nash equilibrium 纳什均衡Natural language generation/NLG ⾃然语⾔⽣成Natural language processing ⾃然语⾔处理Negative class 负类Negative correlation 负相关法Negative Log Likelihood 负对数似然Neighbourhood Component Analysis/NCA 近邻成分分析Neural Machine Translation 神经机器翻译Neural Turing Machine 神经图灵机Newton method ⽜顿法NIPS 国际神经信息处理系统会议No Free Lunch Theorem/NFL 没有免费的午餐定理Noise-contrastive estimation 噪⾳对⽐估计Nominal attribute 列名属性Non-convex optimization ⾮凸优化Nonlinear model ⾮线性模型Non-metric distance ⾮度量距离Non-negative matrix factorization ⾮负矩阵分解Non-ordinal attribute 
⽆序属性Non-Saturating Game ⾮饱和博弈Norm 范数Normalization 归⼀化Nuclear norm 核范数Numerical attribute 数值属性Letter OObjective function ⽬标函数Oblique decision tree 斜决策树Occam’s razor 奥卡姆剃⼑Odds ⼏率Off-Policy 离策略One shot learning ⼀次性学习One-Dependent Estimator/ODE 独依赖估计On-Policy 在策略Ordinal attribute 有序属性Out-of-bag estimate 包外估计Output layer 输出层Output smearing 输出调制法Overfitting 过拟合/过配Oversampling 过采样Letter PPaired t-test 成对 t 检验Pairwise 成对型Pairwise Markov property 成对马尔可夫性Parameter 参数Parameter estimation 参数估计Parameter tuning 调参Parse tree 解析树Particle Swarm Optimization/PSO 粒⼦群优化算法Part-of-speech tagging 词性标注Perceptron 感知机Performance measure 性能度量Plug and Play Generative Network 即插即⽤⽣成⽹络Plurality voting 相对多数投票法Polarity detection 极性检测Polynomial kernel function 多项式核函数Pooling 池化Positive class 正类Positive definite matrix 正定矩阵Post-hoc test 后续检验Post-pruning 后剪枝potential function 势函数Precision 查准率/准确率Prepruning 预剪枝Principal component analysis/PCA 主成分分析Principle of multiple explanations 多释原则Prior 先验Probability Graphical Model 概率图模型Proximal Gradient Descent/PGD 近端梯度下降Pruning 剪枝Pseudo-label 伪标记Letter QQuantized Neural Network 量⼦化神经⽹络Quantum computer 量⼦计算机Quantum Computing 量⼦计算Quasi Newton method 拟⽜顿法Letter RRadial Basis Function/RBF 径向基函数Random Forest Algorithm 随机森林算法Random walk 随机漫步Recall 查全率/召回率Receiver Operating Characteristic/ROC 受试者⼯作特征Rectified Linear Unit/ReLU 线性修正单元Recurrent Neural Network 循环神经⽹络Recursive neural network 递归神经⽹络Reference model 参考模型Regression 回归Regularization 正则化Reinforcement learning/RL 强化学习Representation learning 表征学习Representer theorem 表⽰定理reproducing kernel Hilbert space/RKHS 再⽣核希尔伯特空间Re-sampling 重采样法Rescaling 再缩放Residual Mapping 残差映射Residual Network 残差⽹络Restricted Boltzmann Machine/RBM 受限玻尔兹曼机Restricted Isometry Property/RIP 限定等距性Re-weighting 重赋权法Robustness 稳健性/鲁棒性Root node 根结点Rule Engine 规则引擎Rule learning 规则学习Letter SSaddle point 鞍点Sample space 样本空间Sampling 采样Score function 评分函数Self-Driving ⾃动驾驶Self-Organizing Map/SOM ⾃组织映射Semi-naive Bayes classifiers 半朴素贝叶斯分类器Semi-Supervised 
Learning 半监督学习semi-Supervised Support Vector Machine 半监督⽀持向量机Sentiment analysis 情感分析Separating hyperplane 分离超平⾯Sigmoid function Sigmoid 函数Similarity measure 相似度度量Simulated annealing 模拟退⽕Simultaneous localization and mapping 同步定位与地图构建Singular Value Decomposition 奇异值分解Slack variables 松弛变量Smoothing 平滑Soft margin 软间隔Soft margin maximization 软间隔最⼤化Soft voting 软投票Sparse representation 稀疏表征Sparsity 稀疏性Specialization 特化Spectral Clustering 谱聚类Speech Recognition 语⾳识别Splitting variable 切分变量Squashing function 挤压函数Stability-plasticity dilemma 可塑性-稳定性困境Statistical learning 统计学习Status feature function 状态特征函Stochastic gradient descent 随机梯度下降Stratified sampling 分层采样Structural risk 结构风险Structural risk minimization/SRM 结构风险最⼩化Subspace ⼦空间Supervised learning 监督学习/有导师学习support vector expansion ⽀持向量展式Support Vector Machine/SVM ⽀持向量机Surrogat loss 替代损失Surrogate function 替代函数Symbolic learning 符号学习Symbolism 符号主义Synset 同义词集Letter TT-Distribution Stochastic Neighbour Embedding/t-SNE T – 分布随机近邻嵌⼊Tensor 张量Tensor Processing Units/TPU 张量处理单元The least square method 最⼩⼆乘法Threshold 阈值Threshold logic unit 阈值逻辑单元Threshold-moving 阈值移动Time Step 时间步骤Tokenization 标记化Training error 训练误差Training instance 训练⽰例/训练例Transductive learning 直推学习Transfer learning 迁移学习Treebank 树库Tria-by-error 试错法True negative 真负类True positive 真正类True Positive Rate/TPR 真正例率Turing Machine 图灵机Twice-learning ⼆次学习Letter UUnderfitting ⽋拟合/⽋配Undersampling ⽋采样Understandability 可理解性Unequal cost ⾮均等代价Unit-step function 单位阶跃函数Univariate decision tree 单变量决策树Unsupervised learning ⽆监督学习/⽆导师学习Unsupervised layer-wise training ⽆监督逐层训练Upsampling 上采样Letter VVanishing Gradient Problem 梯度消失问题Variational inference 变分推断VC Theory VC维理论Version space 版本空间Viterbi algorithm 维特⽐算法Von Neumann architecture 冯 · 诺伊曼架构Letter WWasserstein GAN/WGAN Wasserstein⽣成对抗⽹络Weak learner 弱学习器Weight 权重Weight sharing 权共享Weighted voting 加权投票法Within-class scatter matrix 类内散度矩阵Word embedding 词嵌⼊Word sense disambiguation 词义消歧Letter ZZero-data learning 零数据学习Zero-shot learning 
零次学习Aapproximations近似值arbitrary随意的affine仿射的arbitrary任意的amino acid氨基酸amenable经得起检验的axiom公理,原则abstract提取architecture架构,体系结构;建造业absolute绝对的arsenal军⽕库assignment分配algebra线性代数asymptotically⽆症状的appropriate恰当的Bbias偏差brevity简短,简洁;短暂broader⼴泛briefly简短的batch批量Cconvergence 收敛,集中到⼀点convex凸的contours轮廓constraint约束constant常理commercial商务的complementarity补充coordinate ascent同等级上升clipping剪下物;剪报;修剪component分量;部件continuous连续的covariance协⽅差canonical正规的,正则的concave⾮凸的corresponds相符合;相当;通信corollary推论concrete具体的事物,实在的东西cross validation交叉验证correlation相互关系convention约定cluster⼀簇centroids 质⼼,形⼼converge收敛computationally计算(机)的calculus计算Dderive获得,取得dual⼆元的duality⼆元性;⼆象性;对偶性derivation求导;得到;起源denote预⽰,表⽰,是…的标志;意味着,[逻]指称divergence 散度;发散性dimension尺度,规格;维数dot⼩圆点distortion变形density概率密度函数discrete离散的discriminative有识别能⼒的diagonal对⾓dispersion分散,散开determinant决定因素disjoint不相交的Eencounter遇到ellipses椭圆equality等式extra额外的empirical经验;观察ennmerate例举,计数exceed超过,越出expectation期望efficient⽣效的endow赋予explicitly清楚的exponential family指数家族equivalently等价的Ffeasible可⾏的forary初次尝试finite有限的,限定的forgo摒弃,放弃fliter过滤frequentist最常发⽣的forward search前向式搜索formalize使定形Ggeneralized归纳的generalization概括,归纳;普遍化;判断(根据不⾜)guarantee保证;抵押品generate形成,产⽣geometric margins⼏何边界gap裂⼝generative⽣产的;有⽣产⼒的Hheuristic启发式的;启发法;启发程序hone怀恋;磨hyperplane超平⾯Linitial最初的implement执⾏intuitive凭直觉获知的incremental增加的intercept截距intuitious直觉instantiation例⼦indicator指⽰物,指⽰器interative重复的,迭代的integral积分identical相等的;完全相同的indicate表⽰,指出invariance不变性,恒定性impose把…强加于intermediate中间的interpretation解释,翻译Jjoint distribution联合概率Llieu替代logarithmic对数的,⽤对数表⽰的latent潜在的Leave-one-out cross validation留⼀法交叉验证Mmagnitude巨⼤mapping绘图,制图;映射matrix矩阵mutual相互的,共同的monotonically单调的minor较⼩的,次要的multinomial多项的multi-class classification⼆分类问题Nnasty讨厌的notation标志,注释naïve朴素的Oobtain得到oscillate摆动optimization problem最优化问题objective function⽬标函数optimal最理想的orthogonal(⽮量,矩阵等)正交的orientation⽅向ordinary普通的occasionally偶然的Ppartial 
derivative偏导数property性质proportional成⽐例的primal原始的,最初的permit允许pseudocode伪代码permissible可允许的polynomial多项式preliminary预备precision精度perturbation 不安,扰乱poist假定,设想positive semi-definite半正定的parentheses圆括号posterior probability后验概率plementarity补充pictorially图像的parameterize确定…的参数poisson distribution柏松分布pertinent相关的Qquadratic⼆次的quantity量,数量;分量query疑问的Rregularization使系统化;调整reoptimize重新优化restrict限制;限定;约束reminiscent回忆往事的;提醒的;使⼈联想…的(of)remark注意random variable随机变量respect考虑respectively各⾃的;分别的redundant过多的;冗余的Ssusceptible敏感的stochastic可能的;随机的symmetric对称的sophisticated复杂的spurious假的;伪造的subtract减去;减法器simultaneously同时发⽣地;同步地suffice满⾜scarce稀有的,难得的split分解,分离subset⼦集statistic统计量successive iteratious连续的迭代scale标度sort of有⼏分的squares平⽅Ttrajectory轨迹temporarily暂时的terminology专⽤名词tolerance容忍;公差thumb翻阅threshold阈,临界theorem定理tangent正弦Uunit-length vector单位向量Vvalid有效的,正确的variance⽅差variable变量;变元vocabulary词汇valued经估价的;宝贵的Wwrapper包装分类:。
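English-Chinese Glossary (The Nature of Statistical Learning Theory, translated by Zhang Xuegong)
The Nature of Statistical Learning Theory: English-Chinese glossary. Source: V. N. Vapnik, The Nature of Statistical Learning Theory, translated by Zhang Xuegong, Tsinghua University Press, 2000. Intended for: graduate students of the School of Computer Science and Technology, Nanjing Normal University.
Notice: anyone who uses this glossary in a publication or uploads it to the Internet must first obtain permission from the translator and the publisher.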
AdaBoost algorithm (AdaBoost(自举)算法)163admissible structure (容许结构) 95algorithmic complexity (算法复杂度) 10annealed entropy (退火熵) 55ANOVA decomposition (ANOVA分解) 199a posteriori information (后验信息) 120a priori information (先验信息) 120approximately defined operator (近似定义的算子) 230 approximation rate (逼近速率) 98artificial intelligence (人工智能) 13axioms of probability theory (概率理论的公理) 60back propagation method (后向传播方法) 126basic problem of probability theory (概率论的基本问题) 62basic problem of statistics (统计学的基本问题) 63Bayesian approach (贝叶斯方法) 119Bayesian inference (贝叶斯推理) 34bound on the distance to the smallest risk (与最小风险的距离的界) 77 bound on the values of achieved risk (所得风险值的界) 77bounds on generalization ability of a learning machine (学习机器推广能力的界) 76canonical separating hyperplanes (标准分类超平面) 132capacity control problem (容量控制问题) 116cause-effect relation (因果关系) 9choosing the best sparse algebraic polynomial (选择最佳稀疏多项式)117choosing the degree of polynomial (选择多项式阶数) 116 classification error (分类错误) 19codebook (码本) 106complete (Popper's) nonfalsifiability (完全(波普)不可证伪性) 52 compression coefficient (压缩系数) 107consistency of inference (推理的一致性) 36constructive distribution-independent bound on the rate of convergence (构造性的不依赖于分布的收敛速度界) 69convolution of inner production (内积回旋) 140criterion of nonfalsifiability (不可证伪性判据) 47data smoothing problem (数据平滑问题) 209decision-making problem (决策选择问题) 296decision trees (决策树) 7deductive inference (演绎推理) 47density estimation problem (密度估计问题):parametric(Fisher-Wald) setting(参数化(Fisher-Wald)表示) 20nonparametric setting (非参数表示) 28discrepancy (差异) 18discriminant analysis (判别分析) 24discriminant function (判别函数) 25distribution-dependent bound on the rate of convergence (依赖于分布的收敛速度界) 69distribution-independent bound on the rate of convergence (不依赖于分布的收敛速度界) 69ΔΔ-margin separating hyperplane (间隔分类超平面) 132 empirical distribution function (经验分布函数) 28empirical processes (经验过程) 40empirical risk functional (经验风险泛函) 20empirical risk minimization inductive principle (经验风险最小化归纳原则) 
20ensemble of support vector machines (支持向量机的组合) 163 entropy of the set of functions (函数集的熵) 42entropy on the set of indicator functions (指示函数集的熵) 42 equivalence classes (等价类) 292estimation of the values of a function at the given points (估计函数在给定点上的值) 292expert systems (专家系统) 7ε-insensitivity (ε不敏感性) 181ε-insensitive loss function (ε不敏感损失函数) 181feature selection problem (特征选择问题) 118function approximation (函数逼近) 98function estimation model (函数估计模型) 17Gaussian (高斯函数) 26generalized Glivenko-Cantelli problem (广义Glivenko-Cantelli问题)66generalized growth function (广义生长函数) 85generator random vectors (随机向量产生器) 17Glivenko-Cantelli problem (Glivenko-Cantelli问题) 66growth function (生长函数) 55Hamming distance (汉明距离) 104handwritten digit recognition (手写数字识别) 146hard threshold vicinity function (硬限邻域函数) 103hard vicinity function (硬领域函数) 269hidden Markov models (隐马尔可夫模型) 7hidden units (隐结点) 101Huber loss function (Huber损失函数) 183ill-posed problems (不适定问题): 9solution by variation method (变分方法解) 236solution by residual method (残差方法解) 236solution by quasi-solution method (拟解方法解) 236 independent trials (独立试验) 62inductive inference (归纳推理) 50inner product in Hilbert space (希尔伯特空间中的内积) 140 integral equations (积分方程):solution for exact determined equations (精确确定的方程的解)237solution for approximately determined equations (近似确定的方程的解) 237kernel function (核函数) 27Kolmogorov-Smirnov distribution (Kolmogorov-Smirnov分布) 87 Kulback-Leibler distance (Kulback-Leibler距离) 32Kuhn-Tücker conditions (库恩-塔克条件) 134Lagrangian multiplier (拉格朗日乘子) 133Lagrangian (拉格朗日函数) 133Laplacian (拉普拉斯函数) 277law of large number in the functional space (泛函空间中的大数定律)41law of large numbers (大数定律) 39law of large numbers in vector space (向量空间中的大数定律) 41 Lie derivatives (Lie导数) 20learning matrices (学习矩阵) 7least-squares method (最小二乘方法) 21least-modulo method (最小模方法) 182linear discriminant function (学习判别函数) 31linearly nonseparable case (线性不可分情况) 135local approximation (局部逼近) 104local risk minimization (局部风险最小化) 103locality parameter (局部性参数) 
103loss-function (损失函数):for AdaBoost algorithm (AdaBoost算法的损失函数) 163for density estimation (密度估计的损失函数) 21for logistic regression (逻辑回归的损失函数) 156for pattern recognition (模式识别的损失函数) 21for regression estimation (回归估计的损失函数) 21 madaline(Madaline自适应学习机) 7main principle for small sample size problems (小样本数问题的基本原则) 28maximal margin hyperplane (最大间隔超平面) 131maximum likehood method (最大似然方法) 24McCulloch-Pitts neuron model (McCulloch-Pitts神经元模型) 2 measurements with the additive noise (加性噪声下的测量) 25 metric ε-entropy (ε熵度量) 44minimum description length principle (最小描述长度原则) 104 mixture of normal densities (正态密度的组合) 26National Institute of Standard and Technology (NIST) digit database (美国国家标准技术研究所(NIST)数字数据库) 173neural networks (神经网络) 126non-trivially consistent inference (非平凡一致推理) 36 nonparametric density estimation (非参数密度估计) 27normal discriminant function (正态判别函数) 31one-sided empirical process (单边经验过程) 40optimal separating hyperplane (最优分类超平面) 131overfitting phenomenon (过学习现象) 14parametric methods of density estimation (密度估计的参数方法) 24 partial nonfalsifiability (部分不可证伪性) 51Parzen's windows method (Parzen窗方法) 27pattern recognition problem (模式识别问题) 19perceptron (感知器) 1perceptron's stopping rule (感知器迭代终止规则) 6polynomial approximation of regression (回归的多项式逼近) 116 polynomial machine (多项式机器) 143potential nonfalsifiability (潜在不可证伪性) 53probability measure (概率测度) 59probably approximately correct (PAC) model (可能近似正确(PAC)模型) 13problem of demarcation (区分问题) 49pseudo-dimension (伪维) 90quadratic programming problem (二次规划问题) 133quantization of parameters (参数的量化) 110quasi-solution (拟解) 112radial basis function machine (径向基函数机器) 144random entropy (随机熵) 42radnom string (随机串) 10randomness concept (随机性概念) 10regression estimation problem (回归估计问题) 19regression function (回归函数) 19regularization theory (正则化理论) 9regularized functional (正则化泛函) 9reproducing kernel Hilbert space (再生核希尔伯特空间) 244 residual principle (残差原则) 236rigorous (distribution-dependent) bounds (严格(依赖于分布的)界) 85 risk functional (风险泛函) 18risk 
minimization from empirical data problem (基于经验数据最小化风险的问题) 20robust estimators (鲁棒估计) 26robust regression (鲁棒回归) 26Rosenblatt's algorithm (Rosenblatt算法) 5set of indicators (指示器集合) 73set of unbounded functions (无界函数集合) 77σ-algebra (σ代数) 60sigmoid function (S型(sigmoid)函数) 125small samples size (小样本数) 93smoothing kernel (平滑核) 100smoothness of functions (函数的平滑性) 100soft threshold vicinity function (软阈值领域函数) 103soft vicinity function (软领域函数) 269soft-margin separating hyperplane (软间隔分类超平面) 135spline function (样条函数):with a finite number of nodes (有限结点的样条函数) 194with an infinite number of nodes (无穷多结点的样条函数) 195 stochastic approximation stopping rule (随机逼近终止规则) 34 stochastic ill-posed problems (随机不适定问题) 113strong mode estimating a probability measure (强方式概率度量估计)63structural risk minimization principle (结构风险最小化原则) 94 structure (结构) 94structure of growth function (生长函数的结构) 79supervisor (训练器) 17support vector machines (支持向量机) 137support vectors (支持向量) 134support vector ANOVA decomposition (支持向量ANOVA分解) 199 SVM n approximation of the logistic regression (逻辑回归的SVM n逼近) 155SVM density estimator (SVM密度估计) 246SVM conditional probability estimator (SVM条件概率估计) 257 tails of distribution (分布的尾部) 78tangent distance (切距) 149training set (训练集) 18transductive inference (转导推理) 293Turing-Church thesis (Turing-Church理论) 177two layer neural networks machine (两层神经网络机器) 145two-sided empirical process (双边经验过程) 40U.S. 
Postal Service digit database (美国邮政数字数据库) 173 uniform one-sided convergence (一致单边收敛) 39uniform two-sided convergence (一致双边收敛) 39VC dimension of a set of indictor functions (指示函数集的VC维) 79 VC dimension of a set of real functions (实函数集的VC维) 81VC entropy (VC熵) 44VC subgraph (VC子图) 90vicinal risk minimization method(领域风险最小化) 268vicinity kernel(领域核):273one-vicinal kernel (单领域核) 273two-vicinal kernel (双领域核) 273VRM method (VRM方法):for pattern recognition (模式识别的VRM方法) 273for regression estimation (回归估计的VRM方法) 282for density estimation (密度估计的VRM方法) 284for conditional probability estimation (条件概率估计的VRM方法) 285for conditional density estimation (条件密度估计的VRM方法)286weak mode estimating a probability measure (弱方式概率度量估计)63weight decay procedure (权值衰减过程) 102。
Maximum entropy modeling of species geographic distributions
Ecological Modelling 190 (2006) 231–259

Steven J. Phillips a,∗, Robert P. Anderson b,c, Robert E. Schapire d

a AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932, USA
b Department of Biology, City College of the City University of New York, J-526 Marshak Science Building, Convent Avenue at 138th Street, New York, NY 10031, USA
c Division of Vertebrate Zoology (Mammalogy), American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA
d Computer Science Department, Princeton University, 35 Olden Street, Princeton, NJ 08544, USA

Received 23 February 2004; received in revised form 11 March 2005; accepted 28 March 2005
Available online 14 July 2005

Abstract

The availability of detailed environmental data, together with inexpensive and powerful computers, has fueled a rapid increase in predictive modeling of species environmental requirements and geographic distributions. For some species, detailed presence/absence occurrence data are available, allowing the use of a variety of standard statistical techniques. However, absence data are not available for most species. In this paper, we introduce the use of the maximum entropy method (Maxent) for modeling species geographic distributions with presence-only data. Maxent is a general-purpose machine learning method with a simple and precise mathematical formulation, and it has a number of aspects that make it well-suited for species distribution modeling. In order to investigate the efficacy of the method, here we perform a continental-scale case study using two Neotropical mammals: a lowland species of sloth, Bradypus variegatus, and a small montane murid rodent, Microryzomys minutus. We compared Maxent predictions with those of a commonly used presence-only modeling method, the Genetic Algorithm for Rule-Set Prediction (GARP). We made predictions on 10 random subsets of the occurrence records for both species, and then used the remaining localities for testing. Both algorithms provided
reasonable estimates of the species’ range, far superior to the shaded outline maps available in field guides. All models were significantly better than random in both binomial tests of omission and receiver operating characteristic (ROC) analyses. The area under the ROC curve (AUC) was almost always higher for Maxent, indicating better discrimination of suitable versus unsuitable areas for the species. The Maxent modeling approach can be used in its present form for many applications with presence-only datasets, and merits further research and development.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Maximum entropy; Distribution; Modeling; Niche; Range

∗ Corresponding author. Tel.: +1 973 360 8704; fax: +1 973 360 8871.
E-mail addresses: phillips@ (S.J. Phillips), anderson@ (R.P. Anderson), schapire@ (R.E. Schapire).
0304-3800/$ – see front matter. doi:10.1016/j.ecolmodel.2005.03.026

1. Introduction

Predictive modeling of species geographic distributions based on the environmental conditions of sites of known occurrence constitutes an important technique in analytical biology, with applications in conservation and reserve planning, ecology, evolution, epidemiology, invasive-species management and other fields (Corsi et al., 1999; Peterson and Shaw, 2003; Peterson et al., 1999; Scott et al., 2002; Welk et al., 2002; Yom-Tov and Kadmon, 1998). Sometimes both presence and absence occurrence data are available for the development of models, in which case general-purpose statistical methods can be used (for an overview of the variety of techniques currently in use, see Corsi et al., 2000; Elith, 2002; Guisan and Zimmerman, 2000; Scott et al., 2002). However, while vast stores of presence-only data exist (particularly in natural history museums and herbaria), absence data are rarely available, especially for poorly sampled tropical regions where modeling potentially has the most value for conservation (Anderson et al., 2002; Ponder et
al., 2001; Soberón, 1999). In addition, even when absence data are available, they may be of questionable value in many situations (Anderson et al., 2003). Modeling techniques that require only presence data are therefore extremely valuable (Graham et al., 2004).

1.1. Niche-based models from presence-only data

We are interested in devising a model of a species’ environmental requirements from a set of occurrence localities, together with a set of environmental variables that describe some of the factors that likely influence the suitability of the environment for the species (Brown and Lomolino, 1998; Root, 1988). Each occurrence locality is simply a latitude–longitude pair denoting a site where the species has been observed; such georeferenced occurrence records often derive from specimens in natural history museums and herbaria (Ponder et al., 2001; Stockwell and Peterson, 2002a). The environmental variables in GIS format all pertain to the same geographic area, the study area, which has been partitioned into a grid of pixels. The task of a modeling method is to predict environmental suitability for the species as a function of the given environmental variables.

A niche-based model represents an approximation of a species’ ecological niche in the examined environmental dimensions. A species’ fundamental niche consists of the set of all conditions that allow for its long-term survival, whereas its realized niche is that subset of the fundamental niche that it actually occupies (Hutchinson, 1957). The species’ realized niche may be smaller than its fundamental niche, due to human influence, biotic interactions (e.g., inter-specific competition, predation), or geographic barriers that have hindered dispersal and colonization; such factors may prevent the species from inhabiting (or even encountering) conditions encompassing its full ecological potential (Pulliam, 2000; Anderson and Martínez-Meyer, 2004). We assume here that occurrence localities are drawn from source habitat, rather than sink habitat,
which may contain a given species without having the conditions necessary to maintain the population without immigration; this assumption is less realistic with highly vagile taxa (Pulliam, 2000). By definition, then, environmental conditions at the occurrence localities constitute samples from the realized niche. A niche-based model thus represents an approximation of the species’ realized niche, in the study area and environmental dimensions being considered.

If the realized niche and fundamental niche do not fully coincide, we cannot hope for any modeling algorithm to characterize the species’ full fundamental niche: the necessary information is simply not present in the occurrence localities. This problem is likely exacerbated when occurrence records are drawn from too small a geographic area. In a larger study region, however, spatial variation exists in community composition (and, hence, in the resulting biotic interactions) as well as in the environmental conditions available to the species. Therefore, given sufficient sampling effort, modeling in a study region with a larger geographic extent is likely to increase the fraction of the fundamental niche represented by the sample of occurrence localities (Peterson and Holt, 2003), and is preferable.
In practice, however, the departure between the fundamental niche (a theoretical construct) and realized niche (which can be observed) of a species will remain unknown.

Although a niche-based model describes suitability in ecological space, it is typically projected into geographic space, yielding a geographic area of predicted presence for the species. Areas that satisfy the conditions of a species’ fundamental niche represent its potential distribution, whereas the geographic areas it actually inhabits constitute its realized distribution. As mentioned above, the realized niche may be smaller than the fundamental niche (with respect to the environmental variables being modeled), in which case the predicted distribution will be smaller than the full potential distribution. However, to the extent that the model accurately portrays the species’ fundamental niche, the projection of the model into geographic space will represent the species’ potential distribution.

Whether or not a model captures a species’ full niche requirements, areas of predicted presence will typically be larger than the species’ realized distribution. Due to many possible factors (such as geographic barriers to dispersal, biotic interactions, and human modification of the environment), few species occupy all areas that satisfy their niche requirements. If required by the application at hand, the species’ realized distribution can often be estimated from the modeled distribution through a series of steps that remove areas that the species is known or inferred not to inhabit. For example, suitable areas that have not been colonized due to contingent historical factors (e.g., geographic barriers) can be excluded (Peterson et al., 1999; Anderson, 2003). Similarly, suitable areas not inhabited due to biotic interactions (e.g., competition with closely related morphologically similar species) can be identified and removed from the prediction (Anderson et al., 2002).
Finally, when a species’ present-day distribution is desired, such as for conservation purposes, a current land-cover classification derived from remotely sensed data can be used to exclude highly altered habitats (e.g., removing deforested areas from the predicted distribution of an obligate-forest species; Anderson and Martínez-Meyer, 2004).

There are implicit ecological assumptions in the set of environmental variables used for modeling, so selection of that set requires great care. Temporal correspondence should exist between occurrence localities and environmental variables; for example, a current land-cover classification should not be used with occurrence localities that derive from museum records collected over many decades (Anderson and Martínez-Meyer, 2004). Secondly, the variables should affect the species’ distribution at the relevant scale, determined by the geographic extent and grain of the modeling task (Pearson et al., 2004). For example, using the terminology of Mackey and Lindenmayer (2001), climatic variables such as temperature and precipitation are appropriate at global and meso-scales; topographic variables (e.g., elevation and aspect) likely affect species distributions at meso- and topo-scales; and land-cover variables like percent canopy cover influence species distributions at the micro-scale. The choice of variables to use for modeling also affects the degree to which the model generalizes to regions outside the study area or to different environmental conditions (e.g., other time periods). This is important for applications such as invasive-species management (e.g., Peterson and Robins, 2003) and predicting the impact of climate change (e.g., Thomas et al., 2004).
Bioclimatic and soil-type variables measure availability of the fundamental primary resources of light, heat, water and mineral nutrients (Mackey and Lindenmayer, 2001). Their impact, as measured in one study area or time frame, should generalize to other situations. On the other hand, variables representing latitude or elevation will not generalize well; although they are correlated with variables that have biophysical impact on the species, those correlations vary over space and time.

A number of other serious potential pitfalls may affect the accuracy of presence-only modeling; some of these also apply to presence–absence modeling. First, occurrence localities may be biased. For example, they are often highly correlated with the nearby presence of roads, rivers or other access conduits (Reddy and Dávalos, 2003). The location of occurrence localities may also exhibit spatial auto-correlation (e.g., if a researcher collects specimens from several nearby localities in a restricted area). Similarly, sampling intensity and sampling methods often vary widely across the study area (Anderson, 2003). In addition, errors may exist in the occurrence localities, be it due to transcription errors, lack of sufficient geographic detail (especially in older records), or species misidentification.
Frequently, the number of occurrence localities may be too low to estimate the parameters of the model reliably (Stockwell and Peterson, 2002b). Similarly, the set of available environmental variables may not be sufficient to describe all the parameters of the species’ fundamental niche that are relevant to its distribution at the grain of the modeling task. Finally, errors may be present in the variables, perhaps due to errors in data manipulation, or due to inaccuracies in the climatic models used to generate climatic variables, or interpolation of lower-resolution data. In sum, determining and possibly mitigating the effects of these factors represent worthy topics of research for all presence-only modeling techniques. With these caveats, we proceed to introduce a modeling approach that may prove useful whenever the above concerns are adequately addressed.

1.2. Maxent

Maxent is a general-purpose method for making predictions or inferences from incomplete information. Its origins lie in statistical mechanics (Jaynes, 1957), and it remains an active area of research with an Annual Conference, Maximum Entropy and Bayesian Methods, that explores applications in diverse areas such as astronomy, portfolio optimization, image reconstruction, statistical physics and signal processing. We introduce it here as a general approach for presence-only modeling of species distributions, suitable for all existing applications involving presence-only datasets.
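As a toy illustration of inference from incomplete information (it is not taken from the paper), the Python sketch below considers a six-sided die about which only the average roll is known. It assumes the standard result that the maximum-entropy distribution under a mean constraint has exponential form, and it finds the single weight by bisection. The function names and the target means 3.5 and 4.5 are hypothetical choices made for this example.

```python
import math


def gibbs(lam, values):
    """Distribution with q(x) proportional to exp(lam * x) over the given values."""
    weights = [math.exp(lam * v) for v in values]
    z = sum(weights)  # normalizing constant
    return [w / z for w in weights]


def mean(probs, values):
    """Expected value of `values` under the distribution `probs`."""
    return sum(p * v for p, v in zip(probs, values))


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def maxent_with_mean(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Find the exponential-form distribution whose mean equals target_mean.

    The mean of gibbs(lam, values) increases with lam, so bisection on lam
    converges. The bracket [lo, hi] is an assumption suited to small values.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean(gibbs(mid, values), values) < target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return gibbs((lo + hi) / 2, values)


faces = [1, 2, 3, 4, 5, 6]
uniform = maxent_with_mean(faces, 3.5)  # constraint adds no information
skewed = maxent_with_mean(faces, 4.5)   # constraint favors high faces

print("mean 3.5:", [round(p, 4) for p in uniform], "entropy", round(entropy(uniform), 4))
print("mean 4.5:", [round(p, 4) for p in skewed], "entropy", round(entropy(skewed), 4))
```

With a target mean of 3.5 the constraint carries no extra information and the result is the uniform distribution; a target of 4.5 tilts probability toward the higher faces while keeping the distribution as spread out as the constraint allows, so its entropy is lower than that of the uniform case.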
The idea of Maxent is to estimate a target probability distribution by finding the probability distribution of maximum entropy (i.e., that is most spread out, or closest to uniform), subject to a set of constraints that represent our incomplete information about the target distribution. The information available about the target distribution often presents itself as a set of real-valued variables, called “features”, and the constraints are that the expected value of each feature should match its empirical average (average value for a set of sample points taken from the target distribution). When Maxent is applied to presence-only species distribution modeling, the pixels of the study area make up the space on which the Maxent probability distribution is defined, pixels with known species occurrence records constitute the sample points, and the features are climatic variables, elevation, soil category, vegetation type or other environmental variables, and functions thereof.

Maxent offers many advantages, and a few drawbacks; a comparison with other modeling methods will be made in Section 2.1.4 after the Maxent approach is described in detail. The advantages include the following: (1) It requires only presence data, together with environmental information for the whole study area. (2) It can utilize both continuous and categorical data, and can incorporate interactions between different variables. (3) Efficient deterministic algorithms have been developed that are guaranteed to converge to the optimal (maximum entropy) probability distribution. (4) The Maxent probability distribution has a concise mathematical definition, and is therefore amenable to analysis. For example, as with generalized linear and generalized additive models (GLM and GAM), in the absence of interactions between variables, additivity of the model makes it possible to interpret how each environmental variable relates to suitability (Dudík et al., 2004; Phillips et al., 2004). (5) Over-fitting can be avoided by using
$\ell_1$-regularization (Section 2.1.2). (6) Because dependence of the Maxent probability distribution on the distribution of occurrence localities is explicit, there is the potential (in future work) to address the issue of sampling bias formally, as in Zadrozny (2004). (7) The output is continuous, allowing fine distinctions to be made between the modeled suitability of different areas. If binary predictions are desired, this allows great flexibility in the choice of threshold. If the application is conservation planning, the fine distinctions in predicted relative environmental suitability can be valuable to reserve planning algorithms. (8) Maxent could also be applied to species presence/absence data by using a conditional model (as in Berger et al., 1996), as opposed to the unconditional model used here. (9) Maxent is a generative approach, rather than discriminative, which can be an inherent advantage when the amount of training data is limited (see Section 2.1.4). (10) Maximum entropy modeling is an active area of research in statistics and machine learning, and progress in the field as a whole can be readily applied here. (11) As a general-purpose and flexible statistical method, we expect that it can be used for all the applications outlined in Section 1 above, and at all scales.

Some drawbacks of the method are: (1) It is not as mature a statistical method as GLM or GAM, so there are fewer guidelines for its use in general, and fewer methods for estimating the amount of error in a prediction. Our use of an “unconditional” model (cf. advantage 8) is rare in machine learning. (2) The amount of regularization (see Section 2.1.2) requires further study (e.g., see Phillips et al., 2004), as does its effectiveness in avoiding over-fitting compared with other variable-selection methods (for alternatives, see Guisan et al., 2002). (3) It uses an exponential model for probabilities, which is not inherently bounded above and can give very large predicted values for environmental conditions outside the range present in
the study area. Extra care is therefore needed when extrapolating to another study area or to future or past climatic conditions (for example, feature values outside the range of values in the study area should be “clamped”, or reset to the appropriate upper or lower bound). (4) Special-purpose software is required, as Maxent is not available in standard statistical packages.

1.3. Existing approaches for presence-only modeling

Many methods have been used for presence-only modeling of species distributions, and we only attempt here to give a broad overview of existing methods. Some methods use only presences to derive a model. BIOCLIM (Busby, 1986; Nix, 1986) predicts suitable conditions in a “bioclimatic envelope”, consisting of a rectilinear region in environmental space representing the range (or some percentage thereof) of observed presence values in each environmental dimension. Similarly, DOMAIN (Carpenter et al., 1993) uses a similarity metric, where a predicted suitability index is given by computing the minimum distance in environmental space to any presence record.

Other techniques use presence and background data. General-purpose statistical methods such as generalized linear models (GLMs) and generalized additive models (GAMs) are commonly used for modeling with presence–absence datasets. Recently, they have been applied to presence-only situations by taking a random sample of pixels from the study area, known as “background pixels” or “pseudo-absences”, and using them in place of absences during modeling (Ferrier and Watson, 1996; Ferrier et al., 2002). A sample of the background pixels can be chosen purely at random (sometimes excluding sites with presence records, Graham et al., 2004), or from sites where sampling is known to have occurred or from a model of such sites (Zaniewski et al., 2002; Engler et al., 2004). Similarly, a Bayesian approach (Aspinall, 1992) proposed modeling presence versus a random sample. The Genetic Algorithm for
Rule-Set Prediction (Stockwell and Noble, 1992; Stockwell and Peters, 1999) uses an artificial-intelligence framework called genetic algorithms. It produces a set of positive and negative rules that together give a binary prediction; rules are favored in the algorithm according to their significance (compared with random prediction) based on a sample of background pixels and presence pixels. Environmental-Niche Factor Analysis (ENFA, Hirzel et al., 2002) uses presence localities together with environmental data for the entire study area, without requiring a sample of the background to be treated like absences. It is similar to principal components analysis, involving a linear transformation of the environmental space into orthogonal “marginality” and “specialization” factors. Environmental suitability is then modeled as a Manhattan distance in the transformed space.

As a first step in the evaluation of Maxent, we chose to compare it with GARP, as the latter has recently seen extensive use in presence-only studies (Anderson, 2003; Joseph and Stockwell, 2002; Peterson and Kluza, 2003; Peterson and Robins, 2003; Peterson and Shaw, 2003 and references therein). While further studies are needed comparing Maxent with other widely used methods that have been applied to presence-only datasets, such studies are beyond the scope of this paper.

2. Methods

2.1. Maxent details

2.1.1. The principle

When approximating an unknown probability distribution, the question arises, what is the best approximation? E.T. Jaynes gave a general answer to this question: the best approach is to ensure that the approximation satisfies any constraints on the unknown distribution that we are aware of, and that subject to those constraints, the distribution should have maximum entropy (Jaynes, 1957). This is known as the maximum-entropy principle. For our purposes, the unknown probability distribution, which we denote $\pi$, is over a finite set $X$ (which we will later interpret as the set of pixels in the study area). We refer to the individual
elements of $X$ as points. The distribution $\pi$ assigns a non-negative probability $\pi(x)$ to each point $x$, and these probabilities sum to 1. Our approximation of $\pi$ is also a probability distribution, and we denote it $\hat{\pi}$. The entropy of $\hat{\pi}$ is defined as

\[ H(\hat{\pi}) = -\sum_{x \in X} \hat{\pi}(x) \ln \hat{\pi}(x) \]

where ln is the natural logarithm. The entropy is non-negative and is at most the natural log of the number of elements in $X$. Entropy is a fundamental concept in information theory: in the paper that originated that field, Shannon (1948) described entropy as “a measure of how much ‘choice’ is involved in the selection of an event”. Thus a distribution with higher entropy involves more choices (i.e., it is less constrained). Therefore, the maximum entropy principle can be interpreted as saying that no unfounded constraints should be placed on $\hat{\pi}$, or alternatively,

The fact that a certain probability distribution maximizes entropy subject to certain constraints representing our incomplete information, is the fundamental property which justifies use of that distribution for inference; it agrees with everything that is known, but carefully avoids assuming anything that is not known (Jaynes, 1990).

2.1.2. A machine learning perspective

The maximum entropy principle has seen recent interest in the machine learning community, with a major contribution being the development of efficient algorithms for finding the Maxent distribution (see Berger et al., 1996 for an accessible introduction and Ratnaparkhi, 1998 for a variety of applications and a favorable comparison with decision trees). The approach consists of formalizing the constraints on the unknown probability distribution $\pi$ in the following way. We assume that we have a set of known real-valued functions $f_1, \ldots, f_n$ on $X$, known as “features” (which for our application will be environmental variables or functions thereof). We assume further that the information we know about $\pi$ is characterized by the expectations (averages) of the features
under $\pi$. Here, each feature $f_j$ assigns a real value $f_j(x)$ to each point $x$ in $X$. The expectation of the feature $f_j$ under $\pi$ is defined as $\sum_{x \in X} \pi(x) f_j(x)$ and denoted by $\pi[f_j]$. In general, for any probability distribution $p$ and function $f$, we use the notation $p[f]$ to denote the expectation of $f$ under $p$. The feature expectations $\pi[f_j]$ can be approximated using a set of sample points $x_1, \ldots, x_m$ drawn independently from $X$ (with replacement) according to the probability distribution $\pi$. The empirical average of $f_j$ is $\frac{1}{m} \sum_{i=1}^{m} f_j(x_i)$, which we can write as $\tilde{\pi}[f_j]$ (where $\tilde{\pi}$ is the uniform distribution on the sample points), and use as an estimate of $\pi[f_j]$. By the maximum entropy principle, therefore, we seek the probability distribution $\hat{\pi}$ of maximum entropy subject to the constraint that each feature $f_j$ has the same mean under $\hat{\pi}$ as observed empirically, i.e.

$$\hat{\pi}[f_j] = \tilde{\pi}[f_j], \quad \text{for each feature } f_j \tag{1}$$

It turns out that the mathematical theory of convex duality can be used (Della Pietra et al., 1997) to show that this characterization uniquely determines $\hat{\pi}$, and that $\hat{\pi}$ has an alternative characterization, which can be described as follows. Consider all probability distributions of the form

$$q_\lambda(x) = \frac{e^{\lambda \cdot f(x)}}{Z_\lambda} \tag{2}$$

where $\lambda$ is a vector of $n$ real-valued coefficients or feature weights, $f$ denotes the vector of all $n$ features, and $Z_\lambda$ is a normalizing constant that ensures that $q_\lambda$ sums to 1. Such distributions are known as Gibbs distributions. Convex duality shows that the Maxent probability distribution $\hat{\pi}$ is exactly equal to the Gibbs probability distribution $q_\lambda$ that maximizes the likelihood (i.e., probability) of the $m$ sample points. Equivalently, it minimizes the negative log likelihood of the sample points

$$\tilde{\pi}[-\ln(q_\lambda)] \tag{3}$$

which can also be written $\ln Z_\lambda - \frac{1}{m} \sum_{i=1}^{m} \lambda \cdot f(x_i)$ and termed the “log loss”.

As described so far, Maxent can be prone to overfitting the training data. The problem derives from the fact that the empirical feature means will typically not equal the true means; they will
only approximate them. Therefore the means under $\hat{\pi}$ should only be restricted to be close to their empirical values. One way this can be done is to relax the constraint in (1) above (Dudík et al., 2004), replacing it with

$$|\hat{\pi}[f_j] - \tilde{\pi}[f_j]| \le \beta_j, \quad \text{for each feature } f_j \tag{4}$$

for some constants $\beta_j$. This also changes the dual characterization, resulting in a form of $\ell_1$-regularization: the Maxent distribution can now be shown to be the Gibbs distribution that minimizes

$$\tilde{\pi}[-\ln(q_\lambda)] + \sum_j \beta_j |\lambda_j| \tag{5}$$

where the first term is the log loss (as in (3) above), while the second term penalizes the use of large values for the weights $\lambda_j$. Regularization forces Maxent to focus on the most important features, and $\ell_1$-regularization tends to produce models with few non-zero $\lambda_j$ values (Williams, 1995). Such models are less likely to overfit, because they have fewer parameters; as a general rule, the simplest explanation of a phenomenon is usually best (the principle of parsimony, Occam’s Razor). Note that $\ell_1$ regularization has also been applied to GLM/GAMs, and is called the “lasso” in that context (Guisan et al., 2002 and references therein).

This maximum likelihood formulation suggests a natural approach for finding the Maxent probability distribution: start from the uniform probability distribution, for which $\lambda = (0, \ldots, 0)$, then repeatedly make adjustments to one or more of the weights $\lambda_j$ in such a way that the regularized log loss decreases. Regularized log loss can be shown to be a convex function of the weights, so no local minima exist, and several convex optimization methods exist for adjusting the weights in a way that guarantees convergence to the global minimum (see Section 2.2 for the algorithm used in this study).

The above presentation describes an “unconditional” maximum entropy model. “Conditional” models are much more common in the machine learning literature. The task of a conditional Maxent model is to approximate a joint probability distribution $p(x, y)$ of the
inputs $x$ and output label $y$. Both presence and absence data would be required to train a conditional model of a species’ distribution, which is why we use unconditional models.

2.1.3. Application to species distribution modeling

Austin (2002) examines three components needed for statistical modeling of species distributions: an ecological model concerning the ecological theory being used, a data model concerning collection of the data, and a statistical model concerning the statistical theory. Maxent is a statistical model, and to apply it to model species distributions successfully, we must consider how it relates to the two other modeling components (the data model and ecological model). Using the notation of Section 2.1.2, we define the set $X$ to be the set of pixels in the study area, and interpret the recorded presence localities for the species as sample points $x_1, \ldots, x_m$ taken from an unknown probability distribution $\pi$. The data model consists of the method by which the presence localities were collected. One idealized sampling strategy is to pick a random pixel, and record 1 if the species is present there, and 0 otherwise. If we denote the response variable as $y$, then under this sampling strategy, $\pi$ is the probability distribution $p(x \mid y = 1)$. By applying Bayes’ rule, we get that $\pi$ is proportional to the probability of occurrence, $p(y = 1 \mid x)$, although with presence-only data we cannot determine the constant of proportionality.

However, most presence-only datasets derive from surveys where the data model is much less well-defined than the idealized model presented above. The various sampling biases described in Section 1 seriously violate this data model. In practice, then, $\pi$ (and $\hat{\pi}$) can be more conservatively interpreted as a relative index of environmental suitability, where higher values represent a prediction of better conditions for the species (similar to the relaxed interpretation of GLMs with presence-only data in Ferrier et al. (2002)).

The critical step in formulating the ecological model is defining
a suitable set of features. Indeed, the constraints imposed by the features represent our ecological assumptions, as we are asserting that they represent all the environmental factors that constrain the geographical distribution of the species. We consider five feature types, described in Dudík et al. (2004). We did not use the fourth in our present study, as it may require more data than were available for our study species.

1. A continuous variable $f$ is itself a “linear feature”. It imposes the constraint on $\hat{\pi}$ that the mean of the environmental variable, $\hat{\pi}[f]$, should be close to its observed value, i.e., its mean on the sample localities.
2. The square of a continuous variable $f$ is a “quadratic feature”. When used with the corresponding linear feature, it imposes the constraint on $\hat{\pi}$ that the variance of the environmental variable should be close to its observed value, since the variance is equal to $\hat{\pi}[f^2] - \hat{\pi}[f]^2$. It models the species’ tolerance for variation from its optimal conditions.
3. The product of two continuous environmental variables $f$ and $g$ is a “product feature”. Together with the linear features for $f$ and $g$, it imposes the constraint that the covariance of those two variables should be close to its observed value, since the covariance is $\hat{\pi}[fg] - \hat{\pi}[f]\hat{\pi}[g]$. Product features therefore incorporate interactions between predictor variables.
4. For a continuous environmental variable $f$, a “threshold feature” is equal to 1 when $f$ is above a given
Guide for process control in radiation sterilization (Association for the Advancement of Medical Instrumentation), AAMI TIR 29-2002
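AAMI TIR29:2002, Technical Information Report: Guide for process control in radiation sterilization. AAMI, Association for the Advancement of Medical Instrumentation. Approved 16 July 2002 by the Association for the Advancement of Medical Instrumentation.
Abstract: This technical information report supplements ANSI/AAMI/ISO 11137 with guidance on establishing and specifying dose maps, process qualification, and routine control for radiation sterilization with photons and electron beams.
Although the requirements for bremsstrahlung are similar, there was little experience with the design and operation of bremsstrahlung facilities when this work began.
Bremsstrahlung is therefore not covered by this guide.
Keywords: dose mapping, process qualification, routine processing, dose verification
AAMI Technical Information Report
A technical information report (TIR) is a publication of the AAMI Standards Board that addresses a particular aspect of medical technology.
Although the material presented in a TIR may need further evaluation by experts, releasing the information is valuable because many sectors of the industry need it urgently.
A TIR differs from a standard or a recommended practice, and readers should understand the differences between these documents.
Standards and recommended practices go through a formal process of committee approval in which all relevant comments and viewpoints are collected and resolved; this process is supervised by the AAMI Standards Board and the American National Standards Institute.
A TIR does not go through the same formal approval process as a standard.
However, a TIR is approved for distribution by a technical committee and the AAMI Standards Board.
Another difference is that, although both standards and TIRs are reviewed periodically, a standard must be reaffirmed, revised, or withdrawn, and that action must be formally approved, usually every five or ten years.
For a TIR, AAMI agrees with the technical committee to review the report five years after its publication date (and at that interval thereafter) to check that the information is still relevant and of practical use. If it is no longer useful, the TIR is withdrawn.
A TIR may be developed because it responds better to underlying safety and performance issues than a standard or recommended practice, or because reaching consensus is very difficult or even impossible.
Unlike a standard, a TIR allows differing viewpoints on technical issues.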
电子信息类专业词汇
whatsoever=whatever0switchboard(电话)交换台bipolar(电子)双极的premise(复)房屋,前提cursor(计算机尺的)游标,指导的elapse(时间)经过,消失vaporize(使)蒸发subsystem(系统的)分部,子系统,辅助系统metallic(像)金属的,含金属的,(声音)刺耳的dispatch(迅速)派遣,急件consensus(意见)一致,同意deadline(最后)期限,截止时间tomographic X线体层摄像的alas唉,哎呀cluster把…集成一束,一组,一簇,一串,一群encyclopedia百科全书millionfold百万倍的semiconductor半导体radius半径范围,半径,径向射线half-duplex transmission半双工传输accompaniment伴随物,附属物reservation保留,预定quotation报价单,行情报告,引语memorandum备忘录redundancy备用be viewed as被看作…be regards as被认为是as such本身;照此;以这种资格textual本文的,正文的verge边界variation变化,变量conversion 变化,转化identity标识;标志criterion标准,准则in parallel on并联到,合并到juxtapose并置,并列dialing pulse拨号脉冲wave-guide波导wavelength division multiplexed波分复用baud rate波特率playback播放(录音带,唱片)no greater than不大于update不断改进,使…适合新的要求,更新asymmetric不对称的irrespective不考虑的,不顾的inevitably不可避免的inevitable不可避免的,不可逃避的,必定的segment部分abrasion擦伤,磨损deploy采用,利用,推广应用take the form of采用…的形式parameter参数,参量layer层dope掺杂FET(field effect transistors)场效应管audio recording唱片ultra-high-frequency(UHF)超高频in excess of超过in excess of超过hypertext超文本ingredient成分,因素ingredient成分,组成部分,要素metropolitan-area network(WAN)城域网metropolitan area network(WAN)城域网,城市网络congestion充满,拥挤,阻塞collision冲突extractive抽出;释放出extract抽取,取出,分离lease出租,租约,租界期限,租界物pass on传递,切换transmission传输facsimile传真innovative=innovatory创新的,富有革新精神的track磁道impetus促进,激励cluster簇stored-program control(SPC)存储程序控制a large number of 大量的peal大声响,发出supersede代替supplant代替,取代out-of-band signaling带外信号simplex transmission单工传输monochromatic单色的,单色光的,黑白的ballistic弹道的,射击的,冲击的conductor导体hierarchy等级制度,层次infrastructure底层结构,基础结构geographic地理的,地区的geographically地理上GIS(ground instrumentation system)地面测量系统ground station地面站earth orbit地球轨道extraterrestrial 地球外的,地球大气圈外的Land-sat地球资源卫星rug地毯,毯子ignite点火,点燃,使兴奋electromagnetic电磁的inductive电感arc电弧telephony电话(学),通话dielectric电介质,绝缘材料;电解质的,绝缘的capacitor电容telecommunication电信,无线电通讯scenario电影剧本,方案modem pool调制解调器(存储)池superimposing叠加,重叠pin钉住,扣住,抓住customize定做,定制monolithic独立的,完全统一的aluminize镀铝strategic对全局有重要意义的,战略的substantial多的,大的,实际上的multi-path 
fading多径衰落multi-path多路,多途径;多路的,多途径的multi-access多路存取,多路进入multiplex多路复用multiplex多路复用的degradation恶化,降级dioxide二氧化碳LED(light-emitting-diode)发光二极管evolution发展,展开,渐进feedback反馈,回授dimension范围,方向,维,元scenario方案scenario方案,电影剧本amplifer放大器noninvasive非侵略的,非侵害的tariff费率,关税率;对…征税distributed functional plane(DFP)分布功能平面DQDB(distributed queue dual bus)分布式队列双总线hierarchy分层,层次partition分成segmentation分割interface分界面,接口asunder分开地,分离地detached分离的,分开的,孤立的dispense分配allocate分配,配给;配给物centigrade分为百度的,百分度的,摄氏温度的fractal分形molecule分子,微小,些微cellular蜂窝状的cellular蜂窝状的,格形的,多孔的auxiliary storage(also called secondary storage)辅助存储器decay腐烂,衰减,衰退negative负电vicinity附近,邻近vicinity附近地区,近处sophisticated复杂的,高级的,现代化的high-frequency(HF)高频high definition television高清晰度电视chromium铬annotate给…作注解in terms of根据,按照disclosure公布,企业决算公开public network公用网functionality功能,功能度mercury汞resonator共鸣器resonance共振whimsical古怪的,反复无常的administration管理,经营cursor光标(显示器),游标,指针optical computer光计算机photoconductor光敏电阻optical disks光盘optically光学地,光地wide-area networks广域网specification规范,说明书silicon硅the international telecommunication union(ITU)国际电信联盟excess过剩obsolete过时的,废弃的maritime海事的synthetic合成的,人造的,综合的synthetic合成的,综合性的rational合乎理性的rationalization合理化streamline合理化,理顺infrared红外线的,红外线skepticism怀疑论ring network环形网hybrid混合物counterpart伙伴,副本,对应物electromechanical机电的,电动机械的Robot机器人Robotics机器人技术,机器人学accumulation积累infrastructure基础,基础结构substrate基质,底质upheaval激变,剧变compact disc激光磁盘(CD)concentrator集中器,集线器centrex system集中式用户交换功能系统converge on集中于,聚集在…上lumped element集总元件CAI(computer-aided instruction)计算机辅助教学computer-integrated manufacturing(CIM)计算机集成制造computer mediated communication(CMC)计算机中介通信record记录register记录器,寄存器expedite加快,促进weight加权accelerate加速,加快,促进categorize加以类别,分类in addition加之,又,另外hypothetical假设的rigidly坚硬的,僵硬的compatibility兼容性,相容性surveillance监视surveillance监视retrieval检索,(可)补救verification检验simplicity简单,简明film胶片,薄膜take over接管,接任ruggedness结实threshold界限,临界值with the aid of借助于,用,通过wire 
line金属线路,有线线路coherent紧凑的,表达清楚的,粘附的,相干的compact紧密的approximation近似undertake进行,从事transistor晶体管elaborate精心制作的,细心完成的,周密安排的vigilant警戒的,警惕的alcohol酒精,酒local area networks(LANs)局域网local-area networks(LANs)局域网drama剧本,戏剧,戏剧的演出focus on聚集在,集中于,注视insulator绝缘root mean square均方根uniform均匀的open-system-interconnection(OSI)开放系统互连expire开始无效,满期,终止immunity抗扰,免除,免疫性take…into account考虑,重视…programmable industrial automation可编程工业自动化demountable可拆卸的tunable可调的reliable可靠be likely to 可能,大约,像要videotex video可视图文电视negligible可以忽略的aerial空气的,空中的,无形的,虚幻的;天线broadband宽(频)带pervasive扩大的,渗透的tensile拉力的,张力的romanticism浪漫精神,浪漫主义discrete离散,不连续ion离子force力量;力stereophonic立体声的continuum连续统一体,连续统,闭联集smart灵巧的;精明的;洒脱的token令牌on the other hand另一方面hexagonal六边形的,六角形的hexagon六角形,六边形monopoly垄断,专利video-clip录像剪辑aluminum铝pebble卵石,水晶透镜forum论坛,讨论会logical relationships逻辑关系code book码本pulse code modulation(PCM)脉冲编码调制roam漫步,漫游bps(bits per second)每秒钟传输的比特ZIP codes美国邮区划分的五位编码susceptible(to)敏感的,易受…的analog模拟,模拟量pattern recognition模式识别bibliographic目录的,文献的neodymium钕the european telecommunication standardization institute(ETSI)欧洲电信标准局coordinate配合的,协调的;使配合,调整ratify批准,认可bias偏差;偏置deviate偏离,与…不同spectrum频谱come into play其作用entrepreneurial企业的heuristic methods启发式方法play a …role(part)起…作用stem from起源于;由…发生organic器官的,有机的,组织的hypothesis前提front-end前置,前级potential潜势的,潜力的intensity强度coincidence巧合,吻合,一致scalpel轻便小刀,解剖刀inventory清单,报表spherical球的,球形的distinguish区别,辨别succumb屈服,屈从,死global functional plane(GFP)全局功能平面full-duplex transmission全双工传输hologram全息照相,全息图deficiency缺乏thermonuclear热核的artifact人工制品AI(artificial intelligence)人工智能fusion熔解,熔化diskettes(also called floppy disk)软盘sector扇区entropy熵uplink上行链路arsenic砷neural network神经网络very-high-frequency(VHF)甚高频upgrade升级distortion失真,畸变identification识别,鉴定,验明pragmatic实际的implementation实施,实现,执行,敷设entity实体,存在vector quantification矢量量化mislead使…误解,给…错误印象,引错vex使烦恼,使恼火defy 使落空facilitate使容易,促进retina视网膜compatible适合的,兼容的transceiver收发两用机authorize授权,委托,允许data security数据安全性data independence数据独立data management数据管理database数据库database management 
system(DBMS)数据库管理信息系统database transaction数据库事务data integrity数据完整性,数据一致性attenuation衰减fading衰落,衰减,消失dual双的,二重的transient瞬时的deterministic宿命的,确定的algorithm算法dissipation损耗carbon碳diabetes糖尿病cumbersome讨厌的,麻烦的,笨重的razor剃刀,剃go by the name of通称,普通叫做commucation session通信会话traffic通信业务(量)synchronous transmission同步传输concurrent同时发生的,共存的simultaneous同时发生的,同时做的simultaneous同时发生的,一齐的coaxial同轴的copper铜statistical统计的,统计学的dominate统治,支配invest in投资perspective透视,角度,远景graphics图示,图解pictorial图像的coating涂层,层deduce推理reasoning strategies推理策略inference engine推理机topology拓扑结构heterodyne外差法的peripheral外界的,外部的,周围的gateway网关hazardous危险的microwave微波(的)microprocessor微处理机,微处理器microelectronic微电子nuance微小的差别(色彩等)encompass围绕,包围,造成,设法做到maintenance维护;保持;维修satellite communication卫星通信satellite network卫星网络transceiver无线电收发信机radio-relay transmission无线电中继传输without any doubt无疑passive satellite无源卫星sparse稀少的,稀疏的downlink下行链路precursor先驱,前任visualization显像feasibility现实性,可行性linearity线性度constrain限制,约束,制约considerable相当的,重要的geo-stationary相对地面静止by contrast相反,而,对比起来coorelation相关性mutual相互的mutually相互的,共同的interconnect相互连接,互连one after the other相继,依次minicomputer小型计算机protocol协议,草案protocol协议,规约,规程psycho-acoustic心理(精神)听觉的;传音的channelization信道化,通信信道选择run length encoding行程编码groom修饰,准备virtual ISDN虚拟ISDNmultitude许多,大批,大量whirl旋转preference选择,喜欢avalanche雪崩pursue寻求,从事interrogation询问dumb哑的,不说话的,无声的subcategory亚类,子种类,子范畴orbital眼眶;轨道oxygen氧气,氧元素service switching and control points(SSCPs)业务交换控制点service control points(SCPs)业务控制点service control function(SCF)业务控制功能in concert一致,一齐handover移交,越区切换at a rate of以……的速率in the form of以…的形式base on…以…为基础yttrium钇(稀有金属,符号Y)asynchronous transmission异步传输asynchronous异步的exceptional异常的,特殊的voice-grade音频级indium铟give rise to 引起,使产生cryptic隐义的,秘密的hard disk硬盘hard automation硬自动化by means of用,依靠equip with用…装备subscriber用户telex用户电报PBX(private branch exchange)用户小交换机或专用交换机be called upon to 用来…,(被)要求…superiority优势predominance优势,显著active satellite有源卫星in comparison with与…比较comparable 
to与…可比preliminary预备的,初步的premonition预感,预兆nucleus原子核valence原子价circumference圆周,周围teleprocessing远程信息处理,遥控处理perspective远景,前途constrain约束,强迫mobile运动的,流动的,机动的,装在车上的convey运输,传递,转换impurity杂质impurity杂质,混杂物,不洁,不纯regenerative再生的improve over在……基础上改善play important role in在…中起重要作用in close proximity在附近,在很近underlying在下的,基础的in this respect在这方面entail遭遇,导致presentation赠与,图像,呈现,演示narrowband窄(频)带deploy展开,使用,推广应用megabit兆比特germanium锗positive正电quadrature正交orthogonal正交的quadrature amplitude modulation(QAM)正交幅度调制on the right track正在轨道上sustain支撑,撑住,维持,持续outgrowh支派;长出;副产品dominate支配,统治knowledge representation知识表示knowledge engineering知识工程knowledge base知识库in diameter直径helicopter直升飞机acronym只取首字母的缩写词as long as只要,如果tutorial指导教师的,指导的coin制造(新字符),杜撰fabrication制造,装配;捏造事实proton质子intelligence智能,智力,信息intelligent network智能网intermediate中间的nucleus(pl.nuclei)中心,核心neutrons中子terminal终端,终端设备overlay重叠,覆盖,涂覆highlight重要的部分,焦点charge主管,看管;承载dominant主要的,控制的,最有力的cylinder柱面expert system专家系统private network专用网络transition转变,转换,跃迁relay转播relay转播,中继repeater转发器,中继器pursue追赶,追踪,追求,继续desktop publish桌面出版ultraviolet紫外线的,紫外的;紫外线辐射field字段vendor自动售货机,厂商naturally自然的;天生具备的synthesize综合,合成integrate综合,使完全ISDN(intergrated services digital network)综合业务数字网as a whole总体上bus network总线形网crossbar纵横,交叉impedance阻抗initial最初的,开始的optimum最佳条件appear as作为…出现。
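Core principles of the maximum entropy model

1. Introduction
The maximum entropy model (MEM) is a widely used statistical model with applications in natural language processing, information retrieval, image recognition, and other fields.
This article introduces its core principles.

2. Information entropy
Entropy is a central concept in information theory; it measures the uncertainty of an event or an information source.
Suppose an event has n possible outcomes that occur with probabilities p_1, p_2, ..., p_n. The entropy of the event is then defined as

$$H = -\sum_{i=1}^{n} p_i \log p_i$$

where log denotes the base-2 logarithm, so that H is measured in bits. Outcomes with p_i = 0 contribute nothing to the sum.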
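The entropy definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `entropy` and the example distributions are assumptions made for this sketch, not part of the original article.

```python
import math


def entropy(probs, base=2.0):
    """Shannon entropy H = -sum(p_i * log(p_i)) of a discrete distribution."""
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 are skipped: p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)


# Illustrative distributions (hypothetical examples).
fair_coin = entropy([0.5, 0.5])      # two equally likely outcomes
biased_coin = entropy([0.9, 0.1])    # more predictable, so lower entropy
certain = entropy([1.0, 0.0])        # no uncertainty at all
uniform_four = entropy([0.25] * 4)   # four equally likely outcomes
```

With a fixed number of outcomes, the uniform distribution has the largest entropy and a certain outcome has the smallest, which is the intuition the maximum entropy principle builds on.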
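3. The maximum entropy principle
The maximum entropy principle states that, among all probability distributions that satisfy the known constraints, one should choose the distribution with the largest entropy.
It can be read as "keep the uncertainty as large as possible": assume nothing beyond what is actually known.

4. The maximum entropy model
The maximum entropy model is a classification model built on the maximum entropy principle.
It is similar to classifiers such as logistic regression and naive Bayes, but performs better in some situations; unlike naive Bayes, for example, it does not assume that the features are conditionally independent.

5. Feature functions
In a maximum entropy model we define a set of feature functions that describe the relationship between an input sample and its output label.
A feature function can be any function, as long as it extracts useful information from the input sample and ties it to the output label.

6. Feature expectations
For a feature function f(x, y) we define its feature expectation: the expected value of f(x, y) over all possible combinations of input sample x and output label y, weighted by their probabilities.
In particular, a binary feature function takes the value 1 when the feature holds at (x, y) and 0 otherwise, so its expectation is the probability that the feature holds.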
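The feature expectation described above can be estimated from training data as a simple average. The sketch below assumes a toy data set of (word, tag) pairs and a hypothetical indicator feature; the names `empirical_expectation` and `f_run_is_verb` are illustrative only.

```python
def empirical_expectation(feature, samples):
    """Average value of feature(x, y) over the observed (x, y) pairs."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(feature(x, y) for x, y in samples) / len(samples)


# Hypothetical toy training data: (word, part-of-speech tag) pairs.
samples = [("run", "VERB"), ("run", "NOUN"), ("dog", "NOUN"), ("run", "VERB")]


def f_run_is_verb(x, y):
    """Indicator feature: 1 when the word is 'run' and the tag is VERB, else 0."""
    return 1 if x == "run" and y == "VERB" else 0


# Fraction of training pairs on which the feature fires.
expectation = empirical_expectation(f_run_is_verb, samples)
```

For an indicator feature the empirical expectation is just the relative frequency with which the feature fires in the data, which is the quantity the model's own expectation is later required to match.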
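7. Constraints
A maximum entropy model must satisfy a set of constraints so that it describes the training data accurately.
The constraints are usually kept simple: the expectation of each feature function under the model should equal its empirical expectation on the training data, and the probabilities of all output labels y must sum to 1.

8. The maximum entropy optimization problem
The maximum entropy model can be viewed as an optimization problem: among the distributions that satisfy the constraints, find the one with the largest entropy.
This problem can be solved with the method of Lagrange multipliers.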
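To make the optimization concrete, here is a minimal sketch under stated assumptions: a six-sided die, a single feature f(x) = x, and a hypothetical target mean of 4.5. Applying Lagrange multipliers to the entropy objective with an expectation constraint gives a solution of exponential form, q(x) = exp(λ·f(x)) / Z, so only the multiplier λ has to be found. Plain gradient descent on the dual objective ln Z(λ) − λ·target is used here for simplicity; it is one possible solver, not necessarily the one used in practice, and all function names are illustrative.

```python
import math


def gibbs_distribution(lam, outcomes, feature):
    """q(x) = exp(lam * f(x)) / Z, the exponential form given by the Lagrangian."""
    weights = [math.exp(lam * feature(x)) for x in outcomes]
    z = sum(weights)  # normalizing constant Z
    return [w / z for w in weights]


def fit_maxent(outcomes, feature, target_mean, lr=0.1, steps=2000):
    """Find lam so that the model expectation of the feature matches target_mean."""
    lam = 0.0  # lam = 0 corresponds to the uniform distribution
    for _ in range(steps):
        q = gibbs_distribution(lam, outcomes, feature)
        model_mean = sum(p * feature(x) for p, x in zip(q, outcomes))
        # Gradient of ln Z(lam) - lam * target_mean is (model mean - target mean).
        lam -= lr * (model_mean - target_mean)
    return lam, gibbs_distribution(lam, outcomes, feature)


# Hypothetical example: a die whose observed average roll is 4.5.
faces = [1, 2, 3, 4, 5, 6]
lam, probs = fit_maxent(faces, lambda x: x, target_mean=4.5)
fitted_mean = sum(p * x for p, x in zip(probs, faces))
```

The fitted distribution satisfies the mean constraint while staying as close to uniform as the constraint allows: probabilities rise smoothly toward the higher faces instead of concentrating on one or two of them.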
半导体纳米晶体介电常数的尺寸和成分效应
半导体纳米晶体介电常数的尺寸和成分效应马艳丽; 李明【期刊名称】《《淮北师范大学学报(自然科学版)》》【年(卷),期】2019(040)003【总页数】5页(P12-16)【关键词】介电常数; 半导体纳米晶体; 尺寸和成分效应【作者】马艳丽; 李明【作者单位】淮北师范大学物理与电子信息学院安徽淮北 235000【正文语种】中文【中图分类】O3410 引言由于低维纳米晶体(纳米粒子、纳米线、薄膜)具有不同于相应块体材料的物理化学性能,因此具有广泛的应用价值,从而引起学者们的广泛关注[1].介电常数ε是用来描述单元电荷产生电流的多少,作为一个重要的光电性能,学者们通过理论和实验方法对其进行广泛的研究[2].由于纳米材料约束电子的低屏蔽性,其介电常数小于相应的块体材料,即:ε(D)<ε(∞)[3-5].其中:D是纳米粒子和纳米线的直径、薄膜的厚度,∞则表示相应的块体材料.ε(D)的减小可以提高纳米器件中的电子、空穴和浅层杂质电离的库仑相互作用,并且改善光吸收和传输性能[2].比如:纳米闪存,纳米晶体通常是嵌入到栅极氧化物中作为一个电荷存储节点,纳米晶体的存在会对栅极电容产生影响[6-8].为得到所需性能的器件,首先要理解介电常数的基本原理.为得到所需的光电性能,大多数工作是通过改变尺寸来调整纳米晶体的介电常数,但在小尺寸范围内,尤其当尺寸下降到2~3 nm时,器件将不可避免地出现热稳定性问题[9].为解决小尺寸器件的热稳定性问题,可以用热稳定性高的多元合金[9],多元合金不仅具有相应的单相纳米晶体所具有的基本光电性能,同时还具有高的光致发光性能[10].为了描述ε(D),学者们在理论方面建立不同的模型,模型预测结果与实验结果保持一致,但模型中用到的可调参数限制了模型的应用[2,11].此外,由于对合金介电常数的成分效应研究很少,因此有必要建立一个定量的模型来描述介电常数的尺寸和成分效应.本文中,根据已建立的热力学模型,建立一个没有任何可调参数的模型来预测纳米晶体的介电常数.根据这个模型,对于化合物和合金,介电常数随着尺寸D的减小而减小.此外,通过选择适当的x,可以有效地对合金的ε(x,D)进行调整.通过与实验结果的比较证实模型的有效性,表明该模型可以为光电器件的开发、应用提供有效途径.1 模型根据近自由电子方法,Eg=2|V1|.其中:Eg是决定材料导电性能的带隙,V1是晶体场,取决于原子总数和固体原子间的相互作用[12].作为一级近似,将这种关系扩展到纳米尺寸,可以得到:其中Δ 表示差值,由于V∝Ec[12],Ec是原子结合能. 因此Ec(D)的函数可以表示为[13]:其中:Tm是熔化温度,Svib(∞)是振动熵,R是理想气体常数. 对于半导体,Svib(∞)≈Sm(∞)-R ,其中:Sm(∞)为熔化熵[14],D0是临界直径,此时低维材料中所有原子都位于表面.作为维数d和最近邻原子间距h的函数,D0可表示为[14]:其中d=0,1,2分别表示纳米粒子、纳米线和薄膜.介电常数来源于从价带到导带的电子极化或者电子跃迁过程.这个过程服从能量和动量守恒,并影响半导体的光电响应以及价带电子与激发的导带电子的耦合程度[2].因此,在室温下,半导体的介电常数与带隙Eg是直接相关的. 根据公式(1)~(3)以及的近似关系[2],尺寸依赖的磁化系数χ(D)可以表示为[χ(D)/χ(∞)]={2-[Tm(D)/Tm(∞)]}-2. 
将ε=χ+1扩展到纳米尺寸,ε(D)可表示为对于纳米半导体合金,由于成分x对h(x)和Svib(x)产生影响,随着尺寸D的增加,ε(x,D)随成分的变化由直线变为曲线,表现出非线性关系.根据Fox方程h(x)和Svib(x)可表示为[15]:其中:Svib(0)、Svib(1)、h(0)和h(1)表示x=0或x=1时对应的块体值.表1 模型计算过程中用到的相关参数注aSvib(∞)∝Sm(∞)-R,其中:CdTe、CdSe的Sm(∞)分别是14.91 J/(g-atom·K)[18]、20.37 J/(g-atom·K)[18].CdSe CdTe ε(∞)[17]9.7 10.2 Svib(∞)/(J/(g-atom·K))6.59a 12.06ah[16]/nm 0.263 0.2812 结果与讨论计算中使用的参数如表1所示.图1是根据式(5)预测的CdTe和CdSe纳米粒子、纳米线的ε(D)与Tsu模型以及实验结果的比较.模型预测结果表明,随着尺寸D的减小,表面体积比(A/V)增大,ε(D)减小,模型预测结果和实验结果在整个范围内具有良好的一致性.而且,当纳米线的尺寸D<5 nm以及纳米粒子的尺寸D<10 nm时,ε(D)随着尺寸的变化变化明显;而当尺寸D>10 nm时,ε(D)随着尺寸的变化平缓,直至慢慢接近块体值.由于表面原子具有与内部原子不同的物理特性,随着尺寸的减小,表面体积比和表面原子数增多,因此,在决定纳米晶体的性能时,表面原子起主导作用.Wang等[5]提出介电常数的变化是由于表面的量子点而并不是所有的量子点,而Delerue等[3]认为,介电常数的减小是由表面极化键的断裂导致的,这正好支持Wang等的早期发现.研究表明,纳米晶体尺寸D 的减小导致晶格收缩和结合能减小[2].尽管晶格收缩会使单键能增加,但表面原子的低配位数(存在于表面的断裂建)导致纳米晶体的结合能随着表面原子的增大而减小.因此,配位数的缺失(结合能减小)导致可捕获到的哈密顿总量的改变,使得带隙增大,进而影响电子极化过程[19].根据以上分析以及介电常数和带隙的近似关系,ε(D)随着尺寸D的减小而减小是合理的.从图1还可以看出,纳米线介电常数的尺寸效应弱于纳米粒子.这种差异产生的原因是由于纳米粒子、纳米线的表面体积比分别是6/D、4/D.模型预测结果表明,可以通过改变尺寸来调节纳米合金的介电常数.相反,图1中Tsu的模型仅在D>10 nm时和实验结果存在一致性,而D<10 nm时,Tsu模型与实验结果存在偏差,这是由于Tsu的模型限定ΔEg(D)/Eg(∞)<0.56[11].实际上D<10 nm时,纳米晶体的ΔEg(D)/Eg(∞)值可以大于0.5[2].与Tsu的模型相比,模型预测的CdTe和CdSe纳米粒子、纳米线的ε(D)和实验结果有着良好的一致性.图1 模型预测的CdSe和CdTe的介电常数和Tsu模型以及实验结果的比较□[17]表示CdTe纳米线的实验结果;▼[20]、●[21]、◆[22]表示CdTe纳米粒子的实验结果;■[22]☆[23]表示CdSe纳米粒子的实验结果.图2 模型预测的CdSexTe1-x纳米合金的介电常数■、▲[24]表示实验结果图2是根据式(7)预测的不同尺寸的CdSexTe1-x的ε(x,D)随成分变化与实验结果的比较.从图2可以看出,一方面,对于固定的x,随着尺寸的变化,纳米合金的介电常数与化合物具有相同的变化趋势,即ε(x,D)随着D的减小而减小.另一方面,随着尺寸D的增加,纳米半导体合金的ε(x,D)随成分的变化由线性变成非线性,其介电常数随着尺寸D的增加表现出弯曲行为.D=4.9 nm时,ε(x,D)表现出近似线性关系,而D=14 nm时,ε(x,D)表现出非线性关系,而且随着D的增加弯曲行为越明显.当D增加到大尺寸范围时,比如D=40 nm和D=50 nm,ε(x,40)和ε(x,50)之间的差异很小,其介电常数接近于块体值,表明此时介电常数具有弱的尺寸效应.模型预测和实验结果的一致性证实该模型的有效性,并表明利用Fox方程来确定纳米半导体合金的热力学常数是合理的.值得一提的是,式(5)和式(8)只适用于具有自由表面或位于惰性基体的纳米晶体[24-27].对于通过气相沉积方法来制备的纳米晶体[9]与基底形成非共格、半共格和共格界面,这可能会导致不同的变化趋势,比如对于具有不同界面的纳米晶体,可能产生过冷或过热现象[13].因此,界面效应在以后的工作中会进一步进行讨论.3 
结论通过已建立的熔化温度模型以及Fox方程,建立一个没有任何可调参数的热力学模型来预测半导体化合物和合金的ε(x,D).模型预测结果表明,纳米晶体的ε(x,D)随着尺寸D的减小而减小,纳米半导体合金的ε(x,D)随成分表现出弯曲行为.而且,由于表面体积比的不同,纳米粒子ε(x,D)的尺寸效应强于纳米线.模型预测结果和实验结果一致性表明模型的有效性和普适性,同时该模型为光电器件的开发、应用提供有效指导.参考文献:【相关文献】[1]CANHAM L T.Silicon quantum wire array fabrication by electrochemical and chemical dissolution of wafers[J].Applied Physics Letters,1990,57(10):1046-1048. [2]SUN C Q,SUN X,TAY B,et al.Dielectric suppression and its effect on photoabsorption of nanometric semiconductors[J].Journal of Physics D:Applied Physics,2001,34(15):2359.[3]DELERUE C,LANNOO M,ALLAN G.Concept of dielectric constant for nanosized systems[J].Physical Review B,2003,68(11):115411.[4]YOO H G,FAUCHET P M.Dielectric constant reduction in silicon nanostructures [J].Physical Review B,2008,77(11):115355.[5]WANG L W,ZUNGER A.Pseudopotential calculations of nanoscale CdSe quantum dots[J].Physical Review B,1996,53(15):9579.[6]DE SOUSA J S,PEIBST R,ERENBURG M,et al.Single-electron charging and discharging analyses in Ge-nanocrystal memories[J].IEEE Transactions on Electron Devices,2011,58(2):376-383.[7]PEIBST R,DE SOUSA J S,HOFMANN K.Determination of the Ge-nanocrystal/SiO2matrix interface trap density from the small signal response of charge stored in the nanocrystals[J].Physical Review B,2010,82(19):195415.[8]TIWARI S,RANA F,HANAFI H,et al.A silicon nanocrystals based memory[J].Applied Physics Letters,1996,68(10):1377-1379.[9]ZHU Y F,LANG X Y,JIANG Q.The effect of alloying on the bandgap energy of nanoscaled semiconductor alloys[J].Advanced Functional Materials,2008,18(9):1422-1429.[10]SAKALAUSKA E,REUTERS B,KHOSHROO L R,et al.Dielectric function and optical properties of quaternary AlInGaN alloys[J].Journal of Applied Physics,2011,110(1):013102.[11]TSU R,BABIC D.Doping of a quantum dot[J].Applied Physics Letters,1994,64(14):1806-1808.[12]LANG X,ZHENG W,JIANG Q.Finite-size effect on band structure and photoluminescence of semiconductor nanocrystals[J].IEEE Transactions on 
Nanotechnology,2008,7(1):5-9.[13]JIANG Q,ZHANG Z,LI J.Melting thermodynamics of nanocrystals embedded in a matrix[J].Acta Materialia,2000,48(20):4791-4795.[14]ZHANG Z,ZHAO M,JIANG Q.Melting temperatures of semiconductor nanocrystals in the mesoscopic size range[J].Semiconductor Science and Technology,2001,16(6):33.[15]CHOW T.Molecular interpretation of the glass transition temperature of polymer-diluent systems[J].Macromolecules,1980,13(2):362-364.[16]Web Elements Periodic Table:the periodic table on the web[EB/OL].[2019-05-10].http://.[17]LI J,WANG L W.Band-structure-corrected local density approximation study of semiconductor quantum dots and wires[J].Physical Review B,2005,72(12):125325.[18]REGEL A,GLAZOV V.Entropy of melting of semiconductors[J].Semiconductors,1995,29:405-417.[19]GOH E S,CHEN T,YANG H,et al.Size-suppressed dielectrics of Ge nanocrystals:skin-deep quantum entrapment[J].Nanoscale,2012,4(4):1308-1311.[20]MASUMOTO Y,SONOBE K.Size-dependent energy levels of CdTe quantum dots [J].Physical Review B,1997,56(15):9734.[21]ARIZPE-CHAVEZ H,RAMIREZ-BON R,ESPINOZA-BELTRAN F,et al.Quantum confinement effects in CdTe nanostructured films prepared by the RF sputtering technique[J].Journal of Physics and Chemistry of Solids,2000,61(4):511-518. 
[22]VOSSMEYER T,KATSIKAS L,GIERSIG M,et al.CdS nanoclusters:synthesis,characterization,size dependent oscillator strength,temperature shift of the excitonic transition energy,and reversible absorbance shift[J].The Journal of Physical Chemistry,1994,98(31):7665-7673.[23]GORER S,HODES G.Quantum size effects in the study of chemical solution deposition mechanisms of semiconductor films[J].The Journal of Physical Chemistry,1994,98(20):5338-5346.[24]LI Y,ZHONG H,LI R,et al.High yield fabrication and electrochemical characterization of tetrapodal CdSe,CdTe,and Cd-SexTe1-xnanocrystals[J].Advanced Functional Materials,2006,16(13):1705-1716.[25]ZHONG X,HAN M,DONG Z,et position-tunable ZnxCd1-xSe nanocrystals with high luminescence and stability[J].Journal of the American Chemical Society,2003,125(28):8589-8594.[26]PETROV D,SANTOS B,PEREIRA G,et al.Size and band-gap dependences of the first hyperpolarizability of CdxZn1-xS nanocrystals[J].The Journal of Physical Chemistry B,2002,106(21),5325-5334.[27]SWAFFORD L A,WEIGAND L A,BOWERS M J,et al.Homogeneously alloyed CdSxSe1-xnanocrystals:synthesis,characterization,and composition/size-dependentband gap[J].Journal of the American Chemical Society,2006,128(37):12299-12306.。
遥感专业英语词汇
1.remote sensing used in forestry 林业遥感2.restoration of natural resources 自然资源的恢复3.above ground biomass(AGB)地上生物量4.biogeochemical cycle 生物地球化学循环5.carbon cycle 碳循环6.stand structure 林分结构7.high deforestation rates 森林砍伐率8.carbon emissions 碳排放9.environmental degradation 环境恶化10.biomass estimation 生物量估测11.field data 样地数据12.remotely sensed data 遥感数据13.statistical relationships 统计相关性14.biomass estimation model 生物量估测模型15.stem diameter 径阶16.stem height 枝下高17.Tree height树高18.Primary Forest 原始森林19.Successional Forest 次生林20.Endmember 端元21.canopy shadow冠层阴影22.canopy closure 冠层郁闭度23.sampling strategy 抽样方案24.stratified random 分层随机25.endmembers 端元26.intrinsic dimensionality 固有维数27.phenological changes 物候变化28.Chlorophyll 叶绿素29.Absorption 吸收30.Amplitude 振幅31.spatial frequency 空间频率32.Fourier transformation 傅立叶变化33.Decomposition 分解34.grain gradient 纹理梯度35.allometric model 异速生长模型36.fresh weight 鲜重37.Dry weight 干重38.Multicollinearity 多重共线性39.Overfitting 过度拟合40.successional vegetation classification 次生林分类41.classifier 分类器42.supervised classification监督分类43.unsupervised classification 非监督分类44.fuzzy classifier method 迷糊分类法45.maximum likelihood classification 最大似然法分类46.minimum distance classification 最小距离法分类47.Bayesian classification 贝叶斯分类48.Image analysis 图像分析49.feature extraction 特征提取50.feature analysis 特征分析51.pattern recognition 模式识别52.texture analysis 纹理分析53.ratio enhancement 比例增强54.edge detection 边缘检测55.image enhancement 影像增强56.reference data 参考数据57.auxiliary data 辅助数据58.principal component transformation 主成分变化59.histogram equalization 直方图均衡化60.image segmentation 图像分割61.geometric correction 几何校正62.geometric registration of imagery 几何配准63.radiometric correction 辐射校正64.atmospheric correction 大气校正65.synthetic aperture radar SAR 合成孔径雷达66.digital surface model, DSM 数字高程模型67.neighborhood method 邻近法68.least squares correlation 最小二乘相关69.illuminance of ground 地面照度70.geometric distortion 几何畸变71.mosaic 镶嵌72.pixel 像元73.quackgrass meadow 冰草草甸74.quagmire 沼泽地75.quantitative analysis 定量分析76.quantitative 
interpretation 定量判读77.radar echo 雷达回波78.radar image 雷达图像79.radar image texture 雷达图像纹理80.radiation 辐射81.rain intensity 降雨强度82.random distribution 随机分布83.random error 随机误差84.random sampling 随机抽样85.random variable 随机变量86.rare species 稀有种87.ratio method 比值法88.reafforestation 再造林89.reconnaissance survey 普查90.age structure 年龄结构91.recreation 休养92.afforestation 造林;植林93.recovery 再生94.abandoned land 弃耕地95.absorption 吸收〔作用〕96.climatic factor 气候因子97.reflected image 反射影像98.reforestation 森林更新99.regeneration cutting 更新伐100.regional remote sensing 区域遥感101.relative error 相对误差102.reliability 可靠性103.reversible process 可逆过程104.savanna forest 稀瘦原林105.heterogeneity 土壤差异性106.spectral resolution 光谱分辨率107.areal differentiation 地域分异108.substantial or systematic reproduction 实质性的或系统的繁殖109.initiated 开始110.converted 转变111.successional stages 演替系列112.uncertainties 不确定性113.soil fertility 土壤肥力nd-use history 土地利用历史115.vegetation age 植被年龄116.spatial distribution 空间分布117.field measurements 样地测量118.characteristics 特征119.Saplings 树苗120.primary data 原始数据nd cover 土地覆盖122. 
training sample 训练样本123.spectral signature 光谱特征124.spatial information 空间信息125.texture metrics 纹理度量126.texture measure 纹理测量127.data fusion 数据融合128.sensor 传感器129.multispectral data 多光谱数据130.panchromatic data 全色数据131.radar data 雷达数据132.classification algorithms 分类算法133.parametric 参数134.classification tree analysis 分类树135.K-nearest neighbor K近邻法136.Artifice alneural network (ANN) 神经网络137.per-pixel-based 基于像元的138.environmental features 环境要素139.preprocessing 预处理140.polarization 极化141.resampled 重采样142.image-to-image registration 影像到影像配准143.vegetation types 植被类型144.intensity-hue-saturation 亮度色度饱和度145.Brovey transform Brovey 变换146.Evaluated 评价147.error matrix 混淆矩阵nd use/cover classifation 土地利用/覆盖分类149.Misclassification 误分150.Classification accuracy 分类精度151.producer’s accuracy 生产者精度er’s accuracy 用户精度153.Optical multispectral image 光学多光谱影像154.optical sensor 光学传感器155.fusion techniques 融合技术156.uncertainty analysis 不确定性分析157.data saturation 数据饱和158.Parametric vs nonparametric algorithms 参数非参数算法159.global change 全球变化160.process model–based 基于模型的过程161.empirical model–based 基于经验的模型162.biomass expansion/conversion factor 生物量扩展/转换因子163.hyperspectral sensor 多光谱传感器164.radar data 雷达数据165.belowground biomass 地下生物量166.aboveground biomass 地上生物量167.GIS-based 基于GIS的168.ecosystem models 生态模型169.photosynthesis 光合作用170.anthropogenic effects 人为影响171.homogeneous stands 均一的立地条件172.empirical regression models 经验回归模型173.variables 变量174.subcompartment 小斑175.DBH 胸径176.Spectral features 光谱特征177.Spatial features 空间特征178.Subpixel features 亚像元特征179.Active sensor 主动传感器180.Lidar data 雷达数据181.vegetation indices 植被指数182.biophysical conditions 生物物理条件183.soil fertilities 土壤特征184.near-infrared 近红外185.extracting textures 纹理提取186.mean 均值187.variance 方差188.homogeneity 同质性189.contrast 对比度190.entropy 信息熵191.mature forest 成熟林192.secondary forest 次生林193.nonphotosynthetic vegetation 非光合作用植被194.shade fraction 阴影分量195.soil fraction 土壤分量196.biomass density 生物量密度197.vegetation characteristics 植被特征198.species composition 
树种组成199.growth phase 生长期200.spectral signatures 光谱信息201.moist tropical 热带雨林202.primary data 原始数据203.unstable 不稳定204.soil moisture 土壤水分205.horizontal vegetation structures 水平植被结构206.canopy cover 灌层覆盖度207.canopy height 灌层高度208.regression technique 回归技术209.interferometry technique 干涉技术210.terrain properties 地形要素211.backscattering coefficient 后向散射系数212.canopy elements 灌层要素213.backscattering values 散射值214.coherence of data 数据一致性215.the total coherence of a forest 森林的一致性216.forest transmissivity 森林透射率rge scale biomass 大区域生物量218.Polarization Coherence Tomography 极化相干断层扫描219.filtering methods 滤波方法220.outliers 异常值221.stereo viewing 立体视觉ser return signal 激光反馈信号223.characterizing horizontal 水平特征224.characterizing vertical 垂直特征225.canopy structure 灌层结构226.biomass prediction 生物量预测227.height information 树高信息228.hypothetical example 假设样本229.mean height 平均树高230.univariate model 单变量模型231.metric 度量标准232.biomass accumulation 累计生物量233.categorical variables 绝对变量234.different source data 不同源数据235.DEM data DEM数据236.optimal variables 最佳变量237.expert knowledge 专家知识238.strong correlations 强相关239.weak correlations 弱相关240.stepwise regression analysis 逐步回归分析241.independent variables 独立变量242.Parametric algorithms 参数算法243.nonparametric algorithms 非参数算法244.linear regression models 线性回归模型245.nonlinearly related 非线性相关246.power models 指数模型247.nonlinear models 非线性模型248.random forest 随即森林249.support vector machine (SVM) 支持向量机250.Maximum Entropy 最大熵251.Simulation 仿真252.co-simulation 协同仿真253.normal distribution 正态分布254.spatial configuration 空间结构255.randomly setting 随机设置256.pixel estimation 像素估计257.sample variance 样本方差258.national forest inventory sample plot data 国家森林库存样地数据259.natural deciduous forests 自然落叶森林260.linear relationships 线性关系261.approximation 近似法262.mathematical functions 数学函数263.black-box model 黑箱模型264.iterating training 迭代训练265.root node 根节点266.internal nodes, 内部节点267.recursive partitioning algorithm 逐步分割算法268.stratified 分层269.terminal node 终端节点270.regression tree theory 回归树理论271.split 
分割272.statistical learning algorithm 统计学习算法273.high-dimensional feature space 高维特征空间274.kernel 卷积核275.empirical averages 经验平均值276.subsections 分段277.Accurately estimating 精度评价278.relative errors 相对模糊279.global scales 全球尺度280.root mean square error (RMSE) 均方根误差281.correlation coefficient 相关系数282.systematic sampling 系统抽样283.data collection 数据收集284.subset 子集285.mapping forest biomass /carbon 生物量/碳储量制图286.sequestration 隔离287.forest management and planning 森林管理和规划288.allometric models 异速生长的模型289.representativeness 代表性290.lidar data 激光雷达数据291.vegetation structure gradient 植被结构梯度292.randomly perturbing 随机扰动293.north coordinates 北坐标294.coarser spatial resolution 粗分辨率295.grouping errors 分组误差296.Medium spatial resolution 中分辨影像297.population parameters 人口参数298.Mixed pixels 混合像元299.Mismatch 误差300.high spatial resolution images 高分辨率影像。
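[2021.03.07] A handy tool for reading papers: Zhiyun literature translation, applying for the Baidu Translate API, and a machine learning terminology glossary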
【2021.03.07】看论⽂神器知云⽂献翻译、百度翻译API申请、机器学习术语库最近在看论⽂,因为论⽂都是全英⽂的,所以需要论⽂查看的软件,在macOS上找到⼀款很好⽤的软件叫做知云⽂献翻译知云⽂献翻译界⾯长这样,可以长段翻译,总之很不错百度翻译API申请使⽤⾃⼰的api有两个好处:⼀、更加稳定⼆、可以⾃定义词库,我看的是医疗和机器学习相关的英⽂⽂献,可以⾃定义api申请在上⽅控制台、根据流程申请后可以在这⾥看到⾃⼰的ID和密钥填⼊就可以了⾃定义术语库我看的是机器学习的⽂献,因此在术语库⾥添加,导⼊⽂件(我会把⽂本放在后⾯导⼊后完成,有部分词语不翻译,⽐如MNIST这样的专有词语,就会报错,忽略掉就可以了开启术语库就⾏了机器学习术语库Supervised Learning|||监督学习Unsupervised Learning|||⽆监督学习Semi-supervised Learning|||半监督学习Reinforcement Learning|||强化学习Active Learning|||主动学习Online Learning|||在线学习Transfer Learning|||迁移学习Automated Machine Learning (AutoML)|||⾃动机器学习Representation Learning|||表⽰学习Minkowski distance|||闵可夫斯基距离Gradient Descent|||梯度下降Stochastic Gradient Descent|||随机梯度下降Over-fitting|||过拟合Regularization|||正则化Cross Validation|||交叉验证Perceptron|||感知机Logistic Regression|||逻辑回归Maximum Likelihood Estimation|||最⼤似然估计Newton’s method|||⽜顿法K-Nearest Neighbor|||K近邻法Mahanalobis Distance|||马⽒距离Decision Tree|||决策树Naive Bayes Classifier|||朴素贝叶斯分类器Generalization Error|||泛化误差PAC Learning|||概率近似正确学习Empirical Risk Minimization|||经验风险最⼩化Growth Function|||成长函数VC-dimension|||VC维Structural Risk Minimization|||结构风险最⼩化Eigendecomposition|||特征分解Singular Value Decomposition|||奇异值分解Moore-Penrose Pseudoinverse|||摩尔-彭若斯⼴义逆Marginal Probability|||边缘概率Conditional Probability|||条件概率Expectation|||期望Variance|||⽅差Covariance|||协⽅差Critical points|||临界点Support Vector Machine|||⽀持向量机Decision Boundary|||决策边界Convex Set|||凸集Lagrange Duality|||拉格朗⽇对偶性KKT Conditions|||KKT条件Coordinate ascent|||坐标下降法Sequential Minimal Optimization (SMO)|||序列最⼩化优化Ensemble Learning|||集成学习Bootstrap Aggregating (Bagging)|||装袋算法Random Forests|||随机森林Boosting|||提升⽅法Stacking|||堆叠⽅法Decision Tree|||决策树Classification Tree|||分类树Adaptive Boosting (AdaBoost)|||⾃适应提升Decision Stump|||决策树桩Meta Learning|||元学习Gradient Descent|||梯度下降Deep Feedforward Network (DFN)|||深度前向⽹络Backpropagation|||反向传播Activation Function|||激活函数Multi-layer Perceptron (MLP)|||多层感知机Perceptron|||感知机Mean-Squared Error (MSE)|||均⽅误差Chain Rule|||链式法则Logistic Function|||逻辑函数Hyperbolic 
Tangent|||双曲正切函数Rectified Linear Units (ReLU)|||整流线性单元Residual Neural Networks (ResNet)|||残差神经⽹络Regularization|||正则化Overfitting|||过拟合Data(set) Augmentation|||数据增强Parameter Sharing|||参数共享Ensemble Learning|||集成学习Dropout|||L2 Regularization|||L2正则化Taylor Series Approximation|||泰勒级数近似Taylor Expansion|||泰勒展开Bayesian Prior|||贝叶斯先验Bayesian Inference|||贝叶斯推理Gaussian Prior|||⾼斯先验Maximum-a-Posteriori (MAP)|||最⼤后验Linear Regression|||线性回归L1 Regularization|||L1正则化Constrained Optimization|||约束优化Lagrange Function|||拉格朗⽇函数Denoising Autoencoder|||降噪⾃动编码器Label Smoothing|||标签平滑Eigen Decomposition|||特征分解Convolutional Neural Networks (CNNs)|||卷积神经⽹络Semi-Supervised Learning|||半监督学习Generative Model|||⽣成模型Discriminative Model|||判别模型Multi-Task Learning|||多任务学习Bootstrap Aggregating (Bagging)|||装袋算法Multivariate Normal Distribution|||多元正态分布Sparse Parametrization|||稀疏参数化Sparse Representation|||稀疏表⽰Student-t Prior|||学⽣T先验KL Divergence|||KL散度Orthogonal Matching Pursuit (OMP)|||正交匹配追踪算法Adversarial Training|||对抗训练Matrix Factorization (MF)|||矩阵分解Root-Mean-Square Error (RMSE)|||均⽅根误差Collaborative Filtering (CF)|||协同过滤Nonnegative Matrix Factorization (NMF)|||⾮负矩阵分解Singular Value Decomposition (SVD)|||奇异值分解Latent Sematic Analysis (LSA)|||潜在语义分析Bayesian Probabilistic Matrix Factorization (BPMF)|||贝叶斯概率矩阵分解Wishart Prior|||Wishart先验Sparse Coding|||稀疏编码Factorization Machines (FM)|||分解机second-order method|||⼆阶⽅法cost function|||代价函数training set|||训练集objective function|||⽬标函数expectation|||期望data generating distribution|||数据⽣成分布empirical risk minimization|||经验风险最⼩化generalization error|||泛化误差empirical risk|||经验风险overfitting|||过拟合feasible|||可⾏loss function|||损失函数derivative|||导数gradient descent|||梯度下降surrogate loss function|||代理损失函数early stopping|||提前终⽌Hessian matrix|||⿊塞矩阵second derivative|||⼆阶导数Taylor series|||泰勒级数Ill-conditioning|||病态的critical point|||临界点local minimum|||局部极⼩点local maximum|||局部极⼤点saddle point|||鞍点local minima|||局部极⼩值global minimum|||全局最⼩点convex function|||凸函数weight space 
symmetry|||权重空间对称性Newton’s method|||⽜顿法activation function|||激活函数fully-connected networks|||全连接⽹络Resnet|||残差神经⽹络gradient clipping|||梯度截断recurrent neural network|||循环神经⽹络long-term dependency|||长期依赖eigen-decomposition|||特征值分解feedforward network|||前馈⽹络vanishing and exploding gradient problem|||梯度消失与爆炸问题contrastive divergence|||对⽐散度validation set|||验证集stochastic gradient descent|||随机梯度下降learning rate|||学习速率momentum|||动量gradient descent|||梯度下降poor conditioning|||病态条件nesterov momentum|||Nesterov 动量partial derivative|||偏导数moving average|||移动平均quadratic function|||⼆次函数positive definite|||正定quasi-newton method|||拟⽜顿法conjugate gradient|||共轭梯度steepest descent|||最速下降reparametrization|||重参数化standard deviation|||标准差coordinate descent|||坐标下降skip connection|||跳跃连接convolutional neural network|||卷积神经⽹络convolution|||卷积pooling|||池化feedforward neural network|||前馈神经⽹络maximum likelihood|||最⼤似然back propagation|||反向传播artificial neural network|||⼈⼯神经⽹络deep feedforward network|||深度前馈⽹络hyperparameter|||超参数sparse connectivity|||稀疏连接parameter sharing|||参数共享receptive field|||接受域chain rule|||链式法则tiled convolution|||平铺卷积object detection|||⽬标检测error rate|||错误率activation function|||激活函数overfitting|||过拟合attention mechanism|||注意⼒机制transfer learning|||迁移学习autoencoder|||⾃编码器unsupervised learning|||⽆监督学习back propagation|||反向传播pretraining|||预训练dimensionality reduction|||降维curse of dimensionality|||维数灾难feedforward neural network|||前馈神经⽹络encoder|||编码器decoder|||解码器cross-entropy|||交叉熵tied weights|||绑定的权重PCA|||PCAprincipal component analysis|||主成分分析singular value decomposition|||奇异值分解SVD|||SVDsingular value|||奇异值reconstruction error|||重构误差covariance matrix|||协⽅差矩阵Kullback-Leibler (KL) divergence|||KL散度denoising autoencoder|||去噪⾃编码器sparse autoencoder|||稀疏⾃编码器contractive autoencoder|||收缩⾃编码器conjugate gradient|||共轭梯度fine-tune|||精调local optima|||局部最优posterior distribution|||后验分布gaussian distribution|||⾼斯分布reparametrization|||重参数化recurrent neural network|||循环神经⽹络artificial neural network|||⼈⼯神经⽹络feedforward neural 
network|||前馈神经⽹络sentiment analysis|||情感分析machine translation|||机器翻译pos tagging|||词性标注teacher forcing|||导师驱动过程back-propagation through time|||通过时间反向传播directed graphical model|||有向图模型speech recognition|||语⾳识别question answering|||问答系统attention mechanism|||注意⼒机制vanishing and exploding gradient problem|||梯度消失与爆炸问题jacobi matrix|||jacobi矩阵long-term dependency|||长期依赖clip gradient|||梯度截断long short-term memory|||长短期记忆gated recurrent unit|||门控循环单元hadamard product|||Hadamard乘积back propagation|||反向传播attention mechanism|||注意⼒机制feedforward network|||前馈⽹络named entity recognition|||命名实体识别Representation Learning|||表征学习Distributed Representation|||分布式表征Multi-task Learning|||多任务学习Multi-Modal Learning|||多模态学习Semi-supervised Learning|||半监督学习NLP|||⾃然语⾔处理Neural Language Model|||神经语⾔模型Neural Probabilistic Language Model|||神经概率语⾔模型RNN|||循环神经⽹络Neural Tensor Network|||神经张量⽹络Graph Neural Network|||图神经⽹络Graph Covolutional Network (GCN)|||图卷积⽹络Graph Attention Network|||图注意⼒⽹络Self-attention|||⾃注意⼒机制Feature Learning|||表征学习Feature Engineering|||特征⼯程One-hot Representation|||独热编码Speech Recognition|||语⾳识别DBM|||深度玻尔兹曼机Zero-shot Learning|||零次学习Autoencoder|||⾃编码器Generative Adversarial Network(GAN)|||⽣成对抗⽹络Approximate Inference|||近似推断Bag-of-Words Model|||词袋模型Forward Propagation|||前向传播Huffman Binary Tree|||霍夫曼⼆叉树NNLM|||神经⽹络语⾔模型N-gram|||N元语法Skip-gram Model|||跳元模型Negative Sampling|||负采样CBOW|||连续词袋模型Knowledge Graph|||知识图谱Relation Extraction|||关系抽取Node Embedding|||节点嵌⼊Graph Neural Network|||图神经⽹络Node Classification|||节点分类Link Prediction|||链路预测Community Detection|||社区发现Isomorphism|||同构Random Walk|||随机漫步Spectral Clustering|||谱聚类Asynchronous Stochastic Gradient Algorithm|||异步随机梯度算法Negative Sampling|||负采样Network Embedding|||⽹络嵌⼊Graph Theory|||图论multiset|||多重集Perron-Frobenius Theorem|||佩龙—弗罗贝尼乌斯定理Stationary Distribution|||稳态分布Matrix Factorization|||矩阵分解Sparsification|||稀疏化Singular Value Decomposition|||奇异值分解Frobenius Norm|||F-范数Heterogeneous Network|||异构⽹络Graph Convolutional Network 
(GCN)|||图卷积⽹络CNN|||卷积神经⽹络Semi-Supervised Classification|||半监督分类Chebyshev polynomial|||切⽐雪夫多项式Gradient Exploding|||梯度爆炸Gradient Vanishing|||梯度消失Batch Normalization|||批标准化Neighborhood Aggregation|||邻域聚合LSTM|||长短期记忆⽹络Graph Attention Network|||图注意⼒⽹络Self-attention|||⾃注意⼒机制Rescaling|||再缩放Attention Mechanism|||注意⼒机制Jensen-Shannon Divergence|||JS散度Cognitive Graph|||认知图谱Generative Adversarial Network(GAN)|||⽣成对抗⽹络Generative Model|||⽣成模型Discriminative Model|||判别模型Gaussian Mixture Model|||⾼斯混合模型Variational Auto-Encoder(VAE)|||变分编码器Markov Chain|||马尔可夫链Boltzmann Machine|||玻尔兹曼机Kullback–Leibler divergence|||KL散度Vanishing Gradient|||梯度消失Surrogate Loss|||替代损失Mode Collapse|||模式崩溃Earth-Mover/Wasserstein-1 Distance|||搬⼟距离/EMD Lipschitz Continuity|||利普希茨连续Feedforward Network|||前馈⽹络Minimax Game|||极⼩极⼤博弈Adversarial Learning|||对抗学习Outlier|||异常值/离群值Rectified Linear Unit|||线性修正单元Logistic Regression|||逻辑回归Softmax Regression|||Softmax回归SVM|||⽀持向量机Decision Tree|||决策树Nearest Neighbors|||最近邻White-box|||⽩盒(测试 etc. )Lagrange Multiplier|||拉格朗⽇乘⼦Black-box|||⿊盒(测试 etc. 
)Robustness|||鲁棒性/稳健性Decision Boundary|||决策边界Non-differentiability|||不可微Intra-technique Transferability|||相同技术迁移能⼒Cross-technique Transferability|||不同技术迁移能⼒Data Augmentation|||数据增强Adaboost|||recommender system|||推荐系统Probability matching|||概率匹配minimax regret|||face detection|||⼈脸检测i.i.d.|||独⽴同分布Minimax|||极⼤极⼩linear model|||线性模型Thompson Sampling|||汤普森抽样eigenvalues|||特征值optimization problem|||优化问题greedy algorithm|||贪⼼算法Dynamic Programming|||动态规划lookup table|||查找表Bellman equation|||贝尔曼⽅程discount factor|||折现系数Reinforcement Learning|||强化学习gradient theorem|||梯度定理stochastic gradient descent|||随机梯度下降法Monte Carlo|||蒙特卡罗⽅法function approximation|||函数逼近Markov Decision Process|||马尔可夫决策过程Bootstrapping|||引导Shortest Path Problem|||最短路径问题expected return|||预期回报Q-Learning|||Q学习temporal-difference learning|||时间差分学习AlphaZero|||Backgammon|||西洋双陆棋finite set|||有限集Markov property|||马尔可夫性质sample complexity|||样本复杂性Cartesian product|||笛卡⼉积Kevin Leyton-Brown|||SVM|||⽀持向量机MNIST|||ImageNet|||Ensemble learning|||集成学习Neural networks|||神经⽹络Neuroevolution|||神经演化object recognition|||⽬标识别Multi-task learning|||多任务学习Treebank|||树图资料库covariance|||协⽅差Hamiltonian Monte Carlo|||哈密顿蒙特卡罗Inductive bias|||归纳偏置bilevel optimization|||双层规划genetic algorithms|||遗传算法Bayesian linear regression|||贝叶斯线性回归ANOVA|||⽅差分析Extrapolation|||外推法activation function|||激活函数CIFAR-10|||Gaussian Process|||⾼斯过程k-nearest neighbors|||K最近邻Neural Turing machine|||神经图灵机MCMC|||马尔可夫链蒙特卡罗Collaborative filtering|||协同过滤AlphaGo|||random forests|||随机森林multivariate Gaussian|||多元⾼斯Bayesian Optimization|||贝叶斯优化meta-learning|||元学习iterative algorithm|||迭代算法Viterbi algorithm|||维特⽐算法Gibbs distribution|||吉布斯分布Discriminative model|||判别模型Maximum Entropy Markov Model|||最⼤熵马尔可夫模型Information Extraction|||信息提取clique|||⼩圈⼦conditional random field|||条件随机场CRF|||条件随机场triad|||三元关系Naïve Bayes|||朴素贝叶斯social network|||社交⽹络Bayesian network|||贝叶斯⽹络SVM|||⽀持向量机Joint probability distribution|||联合概率分布Conditional independence|||条件独⽴性sequence analysis|||序列分析Perceptron|||感知器Markov 
Blanket|||马尔科夫毯Hidden Markov Model|||隐马尔可夫模型finite-state|||有限状态Shallow parsing|||浅层分析Active learning|||主动学习Speech recognition|||语⾳识别convex|||凸transition matrix|||转移矩阵factor graph|||因⼦图forward-backward algorithm|||前向后向算法parsing|||语法分析structural holes|||结构洞graphical model|||图模型Markov Random Field|||马尔可夫随机场Social balance theory|||社会平衡理论Generative model|||⽣成模型probalistic topic model|||概率语义模型TFIDF|||词频-⽂本逆向频率LSI|||潜在语义索引Bayesian network|||贝叶斯⽹络模型Markov random field|||马尔科夫随机场restricted boltzmann machine|||限制玻尔兹曼机LDA|||隐式狄利克雷分配模型PLSI|||概率潜在语义索引模型EM algorithm|||最⼤期望算法Gibbs sampling|||吉布斯采样法MAP (Maximum A Posteriori)|||最⼤后验概率算法Markov Chain Monte Carlo|||马尔科夫链式蒙特卡洛算法Monte Carlo Sampling|||蒙特卡洛采样法Univariate|||单变量Hoeffding Bound|||Hoeffding界Chernoff Bound|||Chernoff界Importance Sampling|||加权采样法invariant distribution|||不动点分布Metropolis-Hastings algorithm|||Metropolis-Hastings算法Probablistic Inference|||概率推断Variational Inference|||变量式推断HMM|||隐式马尔科夫模型mean field|||平均场理论mixture model|||混合模型convex duality|||凸对偶belief propagation|||置信传播算法non-parametric model|||⾮参模型Gaussian process|||正态过程multivariate Gaussian distribution|||多元正态分布Dirichlet process|||狄利克雷过程stick breaking process|||断棒过程Chinese restaurant process|||中餐馆过程Blackwell-MacQueen Urn Scheme|||Blackwell-MacQueen桶法De Finetti's theorem|||de Finetti定理collapsed Gibbs sampling|||下陷吉布斯采样法Hierarchical Dirichlet process|||阶梯式狄利克雷过程Indian Buffet process|||印度餐馆过程。
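Many algorithms exist for time-delay estimation; the generalized cross-correlation (GCC) method is the most widely used. GCC computes the cross-power spectrum of the two signals, applies a weighting in the frequency domain to suppress the effects of noise and reflections, and then transforms back to the time domain to obtain the cross-correlation function of the two signals. The location of its peak is the relative time delay between the two signals. The estimation process is shown in Figure 1-7. Let $h_1(n)$ and $h_2(n)$ be the impulse responses from the source signal $s(n)$ to the two microphones; the signals received by the microphones are then

$$x_1(n) = h_1(n) * s(n) + n_1(n) \tag{1.1}$$
$$x_2(n) = h_2(n) * s(n) + n_2(n) \tag{1.2}$$

where $*$ denotes convolution and $n_1(n)$, $n_2(n)$ are the noise terms.

Subspace-based localization techniques derive from modern high-resolution spectral estimation.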
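Subspace techniques are among the most studied, most widely applied, most basic, and most important techniques in array signal processing.
This class of sound-source localization techniques uses the spatial spectrum of the correlation matrix of the received signals: the correlation matrix between microphones is solved to determine the direction angle, from which the position of the source is then determined.
Subspace methods fall into two main categories. The first comprises principal-component methods that use the principal eigenvectors of the array autocorrelation matrix (that is, the signal subspace), such as the AR parametric-model principal-component method and the BT principal-component method. The second is based on the orthogonality of the signal subspace and the noise subspace, and uses the eigenvectors spanning the noise subspace for spectral estimation; these algorithms include multiple signal classification (MUSIC), the Johnson method, the minimum-norm (Min-Norm) method, Root-MUSIC, and estimation of signal parameters via rotational invariance techniques (ESPRIT), among others.
In practice, the correlation matrix underlying the spatial spectrum is unknown and must be estimated from the observed signals by averaging them over some time interval. This requires that the source, the noise, and the estimated parameters remain fixed, and that enough signal averages are available.
Even when these conditions are met, the algorithm is less robust than conventional beamforming to errors in the source and microphone models.
The algorithms currently applied to the localization problem all address the far-field, linear-array case.
Subspace-based localization estimates the correlation matrix between signals by time averaging, which requires the signals to be stationary with fixed parameters. Speech, however, is only short-time stationary and often fails to meet this condition.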
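The generalized cross-correlation procedure described earlier (cross-power spectrum, frequency-domain weighting, inverse transform, peak search) can be sketched in a few lines of NumPy. This is a minimal illustration, not the method of any particular reference: the function name `gcc_delay` and its parameters are hypothetical, and the PHAT weighting is an assumed choice, since the text only says that "a weighting" is applied.

```python
import numpy as np


def gcc_delay(x1, x2, weighting="phat", eps=1e-12):
    """Estimate the delay (in samples) of x2 relative to x1.

    A positive result means x2 lags x1. `weighting` is either "phat"
    (phase transform, an assumed choice) or "none" (plain cross-correlation).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)

    # Zero-pad so the circular correlation equals the linear one.
    n = len(x1) + len(x2) - 1
    nfft = 1 << (n - 1).bit_length()

    X1 = np.fft.rfft(x1, nfft)
    X2 = np.fft.rfft(x2, nfft)

    # Cross-power spectrum of the two signals.
    cross = X2 * np.conj(X1)

    # Frequency-domain weighting: PHAT keeps only the phase.
    if weighting == "phat":
        cross = cross / (np.abs(cross) + eps)

    # Back to the time domain: generalized cross-correlation function.
    cc = np.fft.irfft(cross, nfft)

    # Reorder so lags run from -(len(x1) - 1) to len(x2) - 1.
    max_neg = len(x1) - 1
    max_pos = len(x2) - 1
    cc = np.concatenate((cc[nfft - max_neg:], cc[:max_pos + 1]))
    lags = np.arange(-max_neg, max_pos + 1)

    # The peak location is the relative time delay.
    return int(lags[np.argmax(cc)])
```

Dividing the returned lag by the sampling rate gives the delay in seconds. With `weighting="none"` the function reduces to an ordinary cross-correlation peak search, which is a useful baseline when comparing weightings.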
Game theory, maximum entropy, minimum discrepancy, and robust Bayesian decision theory
The Annals of Statistics, 2004, Vol. 32, No. 4, 1367–1433. DOI 10.1214/009053604000000553. © Institute of Mathematical Statistics, 2004

By Peter D. Grünwald and A. Philip Dawid
CWI Amsterdam and University College London

We describe and develop a close relationship between two problems that have customarily been regarded as distinct: that of maximizing entropy, and that of minimizing worst-case expected loss. Using a formulation grounded in the equilibrium theory of zero-sum games between Decision Maker and Nature, these two problems are shown to be dual to each other, the solution to each providing that to the other. Although Topsøe described this connection for the Shannon entropy over 20 years ago, it does not appear to be widely known even in that important special case.

We here generalize this theory to apply to arbitrary decision problems and loss functions. We indicate how an appropriate generalized definition of entropy can be associated with such a problem, and we show that, subject to certain regularity conditions, the above-mentioned duality continues to apply in this extended context. This simultaneously provides a possible rationale for maximizing entropy and a tool for finding robust Bayes acts. We also describe the essential identity between the problem of maximizing entropy and that of minimizing a related discrepancy or divergence between distributions. This leads to an extension, to arbitrary discrepancies, of a well-known minimax theorem for the case of Kullback–Leibler divergence (the "redundancy-capacity theorem" of information theory).

For the important case of families of distributions having certain mean values specified, we develop simple sufficient conditions and methods for identifying the desired solutions. We use this theory to introduce a new concept of "generalized exponential family" linked to the specific decision problem under consideration, and we demonstrate that this shares many of the properties of standard
exponential families. Finally, we show that the existence of an equilibrium in our game can be rephrased in terms of a "Pythagorean property" of the related divergence, thus generalizing previously announced results for Kullback–Leibler and Bregman divergences.

Received February 2002; revised May 2003.

Supported in part by the EU Fourth Framework BRA NeuroCOLT II Working Group EP 27150, the European Science Foundation Programme on Highly Structured Stochastic Systems, Eurandom and the Gatsby Charitable Foundation. A four-page abstract containing an overview of part of this paper appeared in the Proceedings of the 2002 IEEE Information Theory Workshop [see Grünwald and Dawid (2002)].

AMS 2000 subject classifications. Primary 62C20; secondary 94A17.

Key words and phrases. Additive model, Bayes act, Bregman divergence, Brier score, convexity, duality, equalizer rule, exponential family, Gamma-minimax, generalized exponential family, Kullback–Leibler divergence, logarithmic score, maximin, mean-value constraints, minimax, mutual information, Pythagorean property, redundancy-capacity theorem, relative entropy, saddle-point, scoring rule, specific entropy, uncertainty function, zero–one loss.

1. Introduction. Suppose that, for purposes of inductive inference or choosing an optimal decision, we wish to select a single distribution $P^*$ to act as representative of a class $\Gamma$ of such distributions. The maximum entropy principle [Jaynes (1989), Csiszár (1991) and Kapur and Kesavan (1992)] is widely applied for this purpose, but its rationale has often been controversial [see, e.g., van Fraassen (1981), Shimony (1985), Skyrms (1985), Jaynes (1985), Seidenfeld (1986) and Uffink (1995, 1996)]. Here we emphasize and generalize a reinterpretation of the maximum entropy principle [Topsøe (1979), Walley (1991), Chapter 5, Section 12, and Grünwald (1998)]: that the distribution $P^*$ that maximizes the entropy over $\Gamma$ also minimizes the worst-case expected logarithmic score (log loss).
In the terminology of decision theory [Berger (1985)], $P^*$ is a robust Bayes, or $\Gamma$-minimax, act, when loss is measured by the logarithmic score. This gives a decision-theoretic interpretation of maximum entropy.

In this paper we extend this result to apply to a generalized concept of entropy, tailored to whatever loss function $L$ is regarded as appropriate, not just logarithmic score. We show that, under regularity conditions, maximizing this generalized entropy constitutes the major step toward finding the robust Bayes ("$\Gamma$-minimax") act against $\Gamma$ with respect to $L$. For the important special case that $\Gamma$ is described by mean-value constraints, we give theorems that in many cases allow us to find the maximum generalized entropy distribution explicitly. We further define generalized exponential families of distributions, which, for the case of the logarithmic score, reduce to the usual exponential families. We extend generalized entropy to generalized relative entropy and show how this is essentially the same as a general decision-theoretic definition of discrepancy. We show that the family of divergences between probability measures known as Bregman divergences constitutes a special case of such discrepancies. A discrepancy can also be used as a loss function in its own right: we show that a minimax result for relative entropy [Haussler (1997)] can be extended to this more general case. We further show that a "Pythagorean property" [Csiszár (1991)] known to hold for relative entropy and for Bregman divergences in fact applies much more generally; and we give a precise characterization of those discrepancies for which it holds.

Our analysis is game-theoretic, a crucial concern being the existence and properties of a saddle-point, and its associated minimax and maximin acts, in a suitable zero-sum game between Decision Maker and Nature.

1.1. A word of caution. It is not our purpose either to advocate or to criticize the maximum entropy or robust Bayes approach: we adopt a philosophically neutral stance. Rather, our aim
is mathematical unification. By generalizing the concept of entropy beyond the standard Shannon framework, we obtain a variety of interesting characterizations of maximum generalized entropy and display its connections with other known concepts and results.

The connection with $\Gamma$-minimax might be viewed, by those who already regard robust Bayes as a well-founded principle, as a justification for maximizing entropy, but it should be noted that $\Gamma$-minimax, like all minimax approaches, is not without problems of its own [Berger (1985)]. We must also point out that some of the more problematic aspects of maximum entropy inference, such as the incompatibility of maximum entropy with Bayesian updating [Seidenfeld (1986) and Uffink (1996)], carry over to our generalized setting: in the words of one referee, rather than resolving this problem, we "spread it to a new level of abstraction and generality." Although these dangers must be firmly held in mind when considering the implications of this work for inductive inference, they do not undermine the mathematical connections established.

2. Overview. We start with an overview of our results. For ease of exposition, we make several simplifying assumptions, such as a finite sample space, in this section. These assumptions will later be relaxed.

2.1. Maximum entropy and game theory. Let $\mathcal{X}$ be a finite sample space, and let $\Gamma$ be a family of distributions over $\mathcal{X}$. Consider a Decision Maker (DM) who has to make a decision whose consequences will depend on the outcome of a random variable $X$ defined on $\mathcal{X}$. DM is willing to assume that $X$ is distributed according to some $P \in \Gamma$, a known family of distributions over $\mathcal{X}$, but he or she does not know which such distribution applies. DM would like to pick a single $P^* \in \Gamma$ to base decisions on. One way of selecting such a $P^*$ is to apply the maximum entropy principle [Jaynes (1989)], which advises DM to pick that distribution $P^* \in \Gamma$ maximizing $H(P)$ over all $P \in \Gamma$. Here $H(P)$ denotes the Shannon entropy of
$P$,

$$H(P) := -\sum_{x \in \mathcal{X}} p(x) \log p(x) = E_P\{-\log p(X)\},$$

where $p$ is the probability mass function of $P$. However, the various rationales offered in support of this advice have often been unclear or disputed. Here we shall present a game-theoretic rationale, which some may find attractive.

Let $\mathcal{A}$ be the set of all probability mass functions defined over $\mathcal{X}$. By the information inequality [Cover and Thomas (1991)], we have that, for any distribution $P$, $\inf_{q \in \mathcal{A}} E_P\{-\log q(X)\}$ is achieved uniquely at $q = p$, where it takes the value $H(P)$. That is, $H(P) = \inf_{q \in \mathcal{A}} E_P\{-\log q(X)\}$, and so the maximum entropy can be written as

$$\sup_{P \in \Gamma} H(P) = \sup_{P \in \Gamma} \inf_{q \in \mathcal{A}} E_P\{-\log q(X)\}. \tag{1}$$

Now consider the "log loss game" [Good (1952)], in which DM has to specify some $q \in \mathcal{A}$, and DM's ensuing loss if Nature then reveals $X = x$ is measured by $-\log q(x)$. Alternatively, we can consider the "code-length game" [Topsøe (1979) and Harremoës and Topsøe (2001)], wherein we require DM to specify a prefix-free code $\sigma$, mapping $\mathcal{X}$ into a suitable set of finite binary strings, and to measure his or her loss when $X = x$ by the length $\kappa(x)$ of the codeword $\sigma(x)$.
Thus DM's objective is to minimize expected code-length. Basic results of coding theory [see, e.g., Dawid (1992)] imply that we can associate with $\sigma$ a probability mass function $q$ having $q(x) = 2^{-\kappa(x)}$. Then, up to a constant, $-\log q(x)$ becomes identical with the code-length $\kappa(x)$, so that the log loss game is essentially equivalent to the code-length game.

By analogy with minimax results of game theory, one might conjecture that

$$\sup_{P \in \Gamma} \inf_{q \in \mathcal{A}} E_P\{-\log q(X)\} = \inf_{q \in \mathcal{A}} \sup_{P \in \Gamma} E_P\{-\log q(X)\}. \tag{2}$$

As we have seen, $P$ achieving the supremum on the left-hand side of (2) is a maximum entropy distribution in $\Gamma$. However, just as important, $q$ achieving the infimum on the right-hand side of (2) is a robust Bayes act against $\Gamma$, or a $\Gamma$-minimax act [Berger (1985)], for the log loss decision problem.

Now it turns out that, when $\Gamma$ is closed and convex, (2) does indeed hold under very general conditions. Moreover the infimum on the right-hand side is achieved uniquely for $q = p^*$, the probability mass function of the maximum entropy distribution $P^*$. Thus, in this game between DM and Nature, the maximum entropy distribution $P^*$ may be viewed, simultaneously, as defining both Nature's maximin and (in our view more interesting) DM's minimax strategy. In other words, maximum entropy is robust Bayes. This decision-theoretic reinterpretation might now be regarded as a plausible justification for selecting the maximum entropy distribution. Note particularly that we do not restrict the acts $q$ available to DM to those corresponding to a distribution in the restricted set $\Gamma$: that the optimal act $p^*$ does indeed turn out to have this property is a consequence of, not a restriction on, the analysis.

The maximum entropy method has been most commonly applied in the setting where $\Gamma$ is described by mean-value constraints [Jaynes (1989) and Csiszár (1991)]: $\Gamma = \{P : E_P(T) = \tau\}$, where $T = t(X) \in \mathbb{R}^k$ is some given real- or vector-valued statistic. As pointed out by Grünwald (1998), for such constraints the property (2) is particularly easy to show. By the general theory of exponential
families [Barndorff-Nielsen (1978)], under some mild conditions on $\tau$ there will exist a distribution $P^*$ satisfying the constraint $E_{P^*}(T) = \tau$ and having probability mass function of the form $p^*(x) = \exp\{\alpha_0 + \alpha^{\mathsf T} t(x)\}$ for some $\alpha \in \mathbb{R}^k$, $\alpha_0 \in \mathbb{R}$. Then, for any $P \in \Gamma$,

$$E_P\{-\log p^*(X)\} = -\alpha_0 - \alpha^{\mathsf T} E_P(T) = -\alpha_0 - \alpha^{\mathsf T} \tau = H(P^*). \tag{3}$$

We thus see that $p^*$ is an "equalizer rule" against $\Gamma$, having the same expected loss under any $P \in \Gamma$.

To see that $P^*$ maximizes entropy, observe that, for any $P \in \Gamma$,

$$H(P) = \inf_{q \in \mathcal{A}} E_P\{-\log q(X)\} \le E_P\{-\log p^*(X)\} = H(P^*), \tag{4}$$

by (3).

To see that $p^*$ is robust Bayes and that (2) holds, note that, for any $q \in \mathcal{A}$,

$$\sup_{P \in \Gamma} E_P\{-\log q(X)\} \ge E_{P^*}\{-\log q(X)\} \ge E_{P^*}\{-\log p^*(X)\} = H(P^*), \tag{5}$$

where the second inequality is the information inequality [Cover and Thomas (1991)]. Hence

$$H(P^*) \le \inf_{q \in \mathcal{A}} \sup_{P \in \Gamma} E_P\{-\log q(X)\}. \tag{6}$$

However, it follows trivially from the "equalizer" property (3) of $p^*$ that

$$\sup_{P \in \Gamma} E_P\{-\log p^*(X)\} = H(P^*). \tag{7}$$

From (6) and (7), we see that the choice $q = p^*$ achieves the infimum on the right-hand side of (2) and is thus robust Bayes. Moreover, (2) holds, with both sides equal to $H(P^*)$.

The above argument can be extended to much more general sample spaces (see Section 7). Although this game-theoretic approach and result date back at least to Topsøe (1979), they seem to have attracted little attention so far.

2.2. This work: generalized entropy. The above robust Bayes view of maximum entropy might be regarded as justifying its use in those decision problems, such as discrete coding and Kelly gambling [Cover and Thomas (1991)], where the log loss is clearly an appropriate loss function to use. But what if we are interested in other loss functions? This is the principal question we address in this paper.

2.2.1. Generalized entropy and robust Bayes acts. We first recall, in Section 3, a natural generalization of the concept of "entropy" (or "uncertainty inherent in a distribution"), related to a specific decision problem and loss function facing DM.
The generalized entropy thus associated with the log loss problem is just the Shannon entropy. More generally, let $\mathcal{A}$ be some space of actions or decisions and let $\mathcal{X}$ be the (not necessarily finite) space of possible outcomes to be observed. Let the loss function be given by $L : \mathcal{X} \times \mathcal{A} \to (-\infty, \infty]$, and let $\Gamma$ be a convex set of distributions over $\mathcal{X}$. In Sections 4–6 we set up a statistical game $G^\Gamma$ based on these ingredients and use this to show that, under a variety of broad regularity conditions, the distribution $P^*$ maximizing, over $\Gamma$, the generalized entropy associated with the loss function $L$ has a Bayes act $a^* \in \mathcal{A}$ [achieving $\inf_{a \in \mathcal{A}} L(P^*, a)$] that is a robust Bayes ($\Gamma$-minimax) decision relative to $L$, thus generalizing the result for the log loss described in Section 2.1. Some variations on this result are also given.

2.2.2. Generalized exponential families. In Section 7 we consider in detail the case of mean-value constraints, of the form $\Gamma = \{P : E_P(T) = \tau\}$. For fixed loss function $L$ and statistic $T$, as $\tau$ varies we obtain a family of maximum generalized entropy distributions, one for each value of $\tau$. For Shannon entropy, this turns out to coincide with the exponential family having natural sufficient statistic $T$ [Csiszár (1975)]. In close analogy we define the collection of maximum generalized entropy distributions, as we vary $\tau$, to be the generalized exponential family determined by $L$ and $T$, and we give several examples of such generalized exponential families.
In particular, Lafferty's "additive models based on Bregman divergences" [Lafferty (1999)] are special cases of our generalized exponential families (Section 8.4.2).

2.2.3. Generalized relative entropy and discrepancy. In Section 8 we describe how generalized entropy extends to generalized relative entropy and show how this in turn is intimately related to a discrepancy or divergence function. Maximum generalized relative entropy then becomes a special case of the minimum discrepancy method. For the log loss, the associated discrepancy function is just the familiar Kullback–Leibler divergence, and the method then coincides with the "classical" minimum relative entropy method [Jaynes (1989); note that, for Jaynes, "relative entropy" is the same as Kullback–Leibler divergence; for us it is the negative of this].

2.2.4. A generalized redundancy-capacity theorem. In many statistical decision problems it is more natural to seek minimax decisions with respect to the discrepancy associated with a loss, rather than with respect to the loss directly.
With any game we thus associate a new "derived game," in which the discrepancy constructed from the loss function of the original game now serves as a new loss function. In Section 9 we show that our minimax theorems apply to games of this form too: broadly, whenever the conditions for such a theorem hold for the original game, they also hold for the derived game. As a special case, we reprove a minimax theorem for the Kullback–Leibler divergence [Haussler (1997)], known in information theory as the redundancy-capacity theorem [Merhav and Feder (1995)].

2.2.5. The Pythagorean property. The Kullback–Leibler divergence has a celebrated property reminiscent of squared Euclidean distance: it satisfies an analogue of the Pythagorean theorem [Csiszár (1975)]. It has been noted [Csiszár (1991), Jones and Byrne (1990) and Lafferty (1999)] that a version of this property is shared by the broader class of Bregman divergences. In Section 10 we show that a "Pythagorean inequality" in fact holds for the discrepancy based on an arbitrary loss function $L$, so long as the game $G^\Gamma$ has a value; that is, an analogue of (2) holds. Such decision-based discrepancies include Bregman divergences as special cases. We demonstrate that, even for the case of mean-value constraints, the Pythagorean inequality for a Bregman divergence may be strict.

2.2.6. Finally, Section 11 takes stock of what has been achieved and presents some suggestions for further development.

3. Decision problems. In this section we set out some general definitions and properties we shall require. For more background on the concepts discussed here, see Dawid (1998).

A DM has to take some action $a$ selected from a given action space $\mathcal{A}$, after which Nature will reveal the value $x \in \mathcal{X}$ of a quantity $X$, and DM will then suffer a loss $L(x, a)$ in $(-\infty, \infty]$. We suppose that Nature takes no account of the action chosen by DM. Then this can be considered as a zero-sum game between Nature and DM, with both players moving simultaneously, and DM paying Nature
$L(x, a)$ after both moves are revealed. We call such a combination $G := (\mathcal{X}, \mathcal{A}, L)$ a basic game.

Both DM and Nature are also allowed to make randomized moves, such a move being described by a probability distribution $P$ over $\mathcal{X}$ (for Nature) or $\zeta$ over $\mathcal{A}$ (for DM). We assume that suitable $\sigma$-fields, containing all singleton sets, have been specified in $\mathcal{X}$ and $\mathcal{A}$, and that any probability distributions considered are defined over the relevant $\sigma$-field; we denote the family of all such probability distributions on $\mathcal{X}$ by $\mathcal{P}_0$. We further suppose that the loss function $L$ is jointly measurable.

3.1. Expected loss. We shall permit algebraic operations on the extended real line $[-\infty, \infty]$, with definitions and exceptions as in Rockafellar (1970), Section 4. For a function $f : \mathcal{X} \to [-\infty, \infty]$, and $P \in \mathcal{P}_0$, we may denote $E_P\{f(X)\}$ [i.e., $E_{X \sim P}\{f(X)\}$] by $f(P)$. When $f$ is bounded below, $f(P)$ is construed as $\infty$ if $P\{f(X) = \infty\} > 0$. When $f$ is unbounded, we interpret $f(P)$ as $f^+(P) - f^-(P) \in [-\infty, +\infty]$, where $f^+(x) := \max\{f(x), 0\}$ and $f^-(x) := \max\{-f(x), 0\}$, allowing either $f^+(P)$ or $f^-(P)$ to take the value $\infty$, but not both. In this last case $f(P)$ is undefined, else it is defined (either as a finite number or as $\pm\infty$).

If DM knows that Nature is generating $X$ from $P$ or, in the absence of such knowledge, DM is using $P$ to represent his or her own uncertainty about $X$, then the undesirability to DM of any act $a \in \mathcal{A}$ will be assessed by means of its expected loss,

$$L(P, a) := E_P\{L(X, a)\}. \tag{8}$$

We can similarly extend $L$ to randomized acts: $L(x, \zeta) := E_{A \sim \zeta}\{L(x, A)\}$, $L(P, \zeta) = E_{(X, A) \sim P \times \zeta}\{L(X, A)\}$.

Throughout this paper we shall mostly confine attention to probability measures $P \in \mathcal{P}_0$ such that $L(P, a)$ is defined for all $a \in \mathcal{A}$, and we shall denote the family of all such $P$ by $\mathcal{P}$. We further confine attention to randomized acts $\zeta$ such that $L(P, \zeta)$ is defined for all $P \in \mathcal{P}$, denoting the set of all such $\zeta$ by $\mathcal{Z}$. Note that any distribution degenerate at a point $x \in \mathcal{X}$ is in $\mathcal{P}$, and so $L(x, \zeta)$ is defined for all $x \in \mathcal{X}$, $\zeta \in \mathcal{Z}$.

Lemma 3.1. For all $P \in \mathcal{P}$, $\zeta \in \mathcal{Z}$,

$$L(P, \zeta) = E_{X \sim P}\{L(X, \zeta)\} = E_{A \sim \zeta}\{L(P, A)\}. \tag{9}$$

Proof. When $L(P, \zeta)$ is finite this is just
Fubini's theorem. Now consider the case $L(P,\zeta)=\infty$. First suppose $L\ge 0$ everywhere. If $L(x,\zeta)=\infty$ for $x$ in a subset of $\mathcal{X}$ having positive $P$-measure, then (9) holds, both sides being $+\infty$. Otherwise, $L(x,\zeta)$ is finite almost surely $[P]$. If $E_P\{L(X,\zeta)\}$ were finite, then by Fubini it would be the same as $L(P,\zeta)$. So once again $E_P\{L(X,\zeta)\}=L(P,\zeta)=+\infty$. This result now extends easily to possibly negative $L$, on noting that $L^-(P,\zeta)$ must be finite; a parallel result holds when $L(P,\zeta)=-\infty$. Finally the whole argument can be repeated after interchanging the roles of $x$ and $a$ and of $P$ and $\zeta$.

COROLLARY 3.1. For any $P\in\mathcal{P}$,

$$\inf_{\zeta\in\mathcal{Z}} L(P,\zeta) = \inf_{a\in\mathcal{A}} L(P,a). \tag{10}$$

PROOF. Clearly $\inf_{\zeta\in\mathcal{Z}} L(P,\zeta)\le\inf_{a\in\mathcal{A}} L(P,a)$. If $\inf_{a\in\mathcal{A}} L(P,a)=-\infty$ we are done. Otherwise, for any $\zeta\in\mathcal{Z}$, $L(P,\zeta)=E_{A\sim\zeta}L(P,A)\ge\inf_{a\in\mathcal{A}} L(P,a)$.

We shall need the fact that, for any $\zeta\in\mathcal{Z}$, $L(P,\zeta)$ is linear in $P$ in the following sense.

LEMMA 3.2. Let $P_0,P_1\in\mathcal{P}$, and let $P_\lambda:=(1-\lambda)P_0+\lambda P_1$. Fix $\zeta\in\mathcal{Z}$, such that the pair $\{L(P_0,\zeta),L(P_1,\zeta)\}$ does not contain both the values $-\infty$ and $+\infty$. Then, for any $\lambda\in(0,1)$, $L(P_\lambda,\zeta)$ is finite if and only if both $L(P_1,\zeta)$ and $L(P_0,\zeta)$ are. In this case $L(P_\lambda,\zeta)=(1-\lambda)L(P_0,\zeta)+\lambda L(P_1,\zeta)$.

PROOF. Consider a bivariate random variable $(I,X)$ with joint distribution $P^*$ over $\{0,1\}\times\mathcal{X}$ specified by the following: $I=1,0$ with respective probabilities $\lambda$, $1-\lambda$; and, given $I=i$, $X$ has distribution $P_i$. By Fubini we have

$$E_{P^*}\{L(X,\zeta)\} = E_{P^*}\big[E_{P^*}\{L(X,\zeta)\mid I\}\big],$$

in the sense that, whenever one side of this equation is defined and finite, the same holds for the other, and they are equal. Noting that, under $P^*$, the distribution of $X$ is $P_\lambda$ marginally, and $P_i$ conditional on $I=i$ ($i=0,1$), the result follows.
3.2. Bayes act. Intuitively, when $X\sim P$ an act $a_P\in\mathcal{A}$ will be optimal if it minimizes $L(P,a)$ over all $a\in\mathcal{A}$. Any such act $a_P$ is a Bayes act against $P$. More generally, to allow for the possibility that $L(P,a)$ may be infinite as well as to take into account randomization, we call $\zeta_P\in\mathcal{Z}$ a (randomized) Bayes act, or simply Bayes, against $P$ (not necessarily in $\mathcal{P}$) if

$$E_P\{L(X,\zeta)-L(X,\zeta_P)\}\in[0,\infty] \tag{11}$$

for all $\zeta\in\mathcal{Z}$. We denote by $\mathcal{A}_P$ (resp. $\mathcal{Z}_P$) the set of all nonrandomized (resp. randomized) Bayes acts against $P$. Clearly $\mathcal{A}_P\subseteq\mathcal{Z}_P$, and $L(P,\zeta_P)$ is the same for all $\zeta_P\in\mathcal{Z}_P$.

The loss function $L$ will be called $\Gamma$-strict if, for each $P\in\Gamma$, there exists $a_P\in\mathcal{A}$ that is the unique Bayes act against $P$; $L$ is $\Gamma$-semistrict if, for each $P\in\Gamma$, $\mathcal{A}_P$ is nonempty, and $a,a'\in\mathcal{A}_P\Rightarrow L(\cdot,a)\equiv L(\cdot,a')$. When $L$ is $\Gamma$-strict, and $P\in\Gamma$, it can never be optimal for DM to choose a randomized act; when $L$ is $\Gamma$-semistrict, even though a randomized act can be optimal there is never any point in choosing one, since its loss function will be identical with that of any nonrandomized optimal act.

Semistrictness is clearly weaker than strictness. For our purposes we can replace it by the still weaker concept of relative strictness: $L$ is $\Gamma$-relatively strict if for all $P\in\Gamma$ the set of Bayes acts $\mathcal{A}_P$ is nonempty and, for all $a,a'\in\mathcal{A}_P$, $L(P',a)=L(P',a')$ for all $P'\in\Gamma$.

3.3. Bayes loss and entropy. Whether or not a Bayes act exists, the Bayes loss $H(P)\in[-\infty,\infty]$ of a distribution $P\in\mathcal{P}$ is defined by

$$H(P) := \inf_{a\in\mathcal{A}} L(P,a). \tag{12}$$

It follows from Corollary 3.1 that it would make no difference if the infimum in (12) were extended to be over $\zeta\in\mathcal{Z}$. We shall mostly be interested in Bayes acts of distributions $P$ with finite $H(P)$. In the context of Section 2.1, with $L(x,q)$ the log loss $-\log q(x)$, $H(P)$ is just the Shannon entropy of $P$.

PROPOSITION 3.1. Let $P\in\mathcal{P}$ and suppose $H(P)$ is finite. Then the following hold:

(i) $\zeta_P\in\mathcal{Z}$ is Bayes against $P$ if and only if

$$E_P\{L(X,a)-L(X,\zeta_P)\}\in[0,\infty] \tag{13}$$

for all $a\in\mathcal{A}$.

(ii) $\zeta_P$ is Bayes against $P$ if and only if $L(P,\zeta_P)=H(P)$.

(iii) If $P$ admits some randomized Bayes act, then $P$ also
admits some nonrandomized Bayes act; that is, $\mathcal{A}_P$ is not empty.

PROOF. Items (i) and (ii) follow easily from (10) and finiteness. To prove (iii), let $f(P,a):=L(P,a)-H(P)$. Then $f(P,a)\ge 0$ for all $a$, while $E_{A\sim\zeta_P}f(P,A)=L(P,\zeta_P)-H(P)=0$. We deduce that $\{a\in\mathcal{A}: f(P,a)=0\}$ has probability 1 under $\zeta_P$ and so, in particular, must be nonempty.

We express the well-known concavity property of the Bayes loss [DeGroot (1970), Section 8.4] as follows.

PROPOSITION 3.2. Let $P_0,P_1\in\mathcal{P}$, and let $P_\lambda:=(1-\lambda)P_0+\lambda P_1$. Suppose that $H(P_i)<\infty$ for $i=0,1$. Then $H(P_\lambda)$ is a concave function of $\lambda$ on $[0,1]$ (and thus, in particular, continuous on $(0,1)$ and lower semicontinuous on $[0,1]$). It is either bounded above on $[0,1]$ or infinite everywhere on $(0,1)$.

PROOF. Let $B$ be the set of all $a\in\mathcal{A}$ such that $L(P_\lambda,a)<\infty$ for some $\lambda\in(0,1)$, and thus, by Lemma 3.2, for all $\lambda\in[0,1]$. If $B$ is empty, then $H(P_\lambda)=\infty$ for all $\lambda\in(0,1)$; in particular, $H(P_\lambda)$ is then concave on $[0,1]$. Otherwise, taking any fixed $a\in B$ we have $H(P_\lambda)\le L(P_\lambda,a)\le\max_i L(P_i,a)$, so $H(P_\lambda)$ is bounded above on $[0,1]$. Moreover, as the pointwise infimum of the nonempty family of concave functions $\{L(P_\lambda,a): a\in\mathcal{A}\}$, $H(P_\lambda)$ is itself a concave function of $\lambda$ on $[0,1]$.

COROLLARY 3.2. If for all $a\in\mathcal{A}$, $L(P_\lambda,a)<\infty$ for some $\lambda\in(0,1)$, then for all $\lambda\in[0,1]$, $H(P_\lambda)=\lim\{H(P_\mu):\mu\in[0,1],\mu\to\lambda\}$ [it being allowed that $H(P_\lambda)$ is not finite].

PROOF. In this case $B=\mathcal{A}$, so that $H(P_\lambda)=\inf_{a\in B}L(P_\lambda,a)$. Each function $L(P_\lambda,a)$ is finite and linear, hence a closed concave function of $\lambda$ on $[0,1]$.
This last property is then preserved on taking the infimum. The result now follows from Theorem 7.5 of Rockafellar (1970).

COROLLARY 3.3. If in addition $H(P_i)$ is finite for $i=0,1$, then $H(P_\lambda)$ is a bounded continuous function of $\lambda$ on $[0,1]$.

Note that Corollary 3.3 will always apply when the loss function is bounded.

Under some further regularity conditions [see Dawid (1998, 2003) and Section 3.5.4 below], a general concave function over $\mathcal{P}$ can be regarded as generated from some decision problem by means of (12). Concave functions have been previously proposed as general measures of the uncertainty or diversity in a distribution [DeGroot (1962) and Rao (1982)], generalizing the Shannon entropy. We shall thus call the Bayes loss $H$, as given by (12), the (generalized) entropy function or uncertainty function associated with the loss function $L$.

3.4. Scoring rule. Suppose the action space $\mathcal{A}$ is itself a set $\mathcal{Q}$ of distributions for $X$. Note we are not here considering $Q\in\mathcal{Q}$ as a randomized act over $\mathcal{X}$, but rather as a simple act in its own right (e.g., a decision to quote $Q$ as a description of uncertainty about $X$). We typically write the loss as $S(x,Q)$ in this case and refer to $S$ as a scoring rule or score. Such scoring rules are used to assess the performance of probability forecasters [Dawid (1986)]. We say $S$ is $\Gamma$-proper if $\Gamma\subseteq\mathcal{Q}\subseteq\mathcal{P}$ and, for all $P\in\Gamma$, the choice $Q=P$ is Bayes against $X\sim P$. Then for $P\in\Gamma$,

$$H(P) = S(P,P). \tag{14}$$

Suppose now we start from a general decision problem, with loss function $L$ such that $\mathcal{Z}_Q$ is nonempty for all $Q\in\mathcal{Q}$. Then we can define a scoring rule by

$$S(x,Q) := L(x,\zeta_Q), \tag{15}$$

where for each $Q\in\mathcal{Q}$ we suppose we have selected some specific Bayes act $\zeta_Q\in\mathcal{Z}_Q$. Then for $P\in\mathcal{Q}$, $S(P,Q)=L(P,\zeta_Q)$ is clearly minimized when $Q=P$, so that this scoring rule is $\mathcal{Q}$-proper. If $L$ is $\mathcal{Q}$-semistrict, then (15) does not depend on the choice of Bayes act $\zeta_Q$. More generally, if $L$ is $\mathcal{Q}$-relatively strict, then $S(P,Q)$ does not depend on such a choice, for all $P,Q\in\mathcal{Q}$.

We see that, for $P\in\mathcal{Q}$, $\inf_{Q\in\mathcal{Q}}S(P,Q)=S(P,P)=L(P,\zeta_P)=H(P)$.
In particular, the generalized entropy associated with the constructed scoring rule (15) is identical with that determined by the original loss function $L$. In this way, almost any decision problem can be reformulated in terms of a proper scoring rule.

3.5. Some examples. We now give some simple examples, both to illustrate the above concepts and to provide a concrete focus for later development. Further examples may be found in Dawid (1998) and Dawid and Sebastiani (1999).

3.5.1. Brier score. Although it can be generalized, we restrict our treatment of the Brier score [Brier (1950)] to the case of a finite sample space $\mathcal{X}=\{x_1,\ldots,x_N\}$. A distribution $P$ over $\mathcal{X}$ can be represented by its probability vector $p=(p(1),\ldots,p(N))$, where $p(x):=P(X=x)$. A point $x\in\mathcal{X}$ may also be represented by the $N$-vector $\delta^x$ corresponding to the point-mass distribution on $\{x\}$, having entries $\delta^x(j)=1$ if $j=x$, 0 otherwise. The Brier scoring rule is then defined by

$$S(x,Q) := \|\delta^x-q\|^2 \tag{16}$$

$$= \sum_{j=1}^{N}\{\delta^x(j)-q(j)\}^2 = \sum_j q(j)^2 - 2q(x) + 1. \tag{17}$$

Then

$$S(P,Q) = \sum_j q(j)^2 - 2\sum_j p(j)q(j) + 1, \tag{18}$$

which is uniquely minimized for $Q=P$, so that this is a $\mathcal{P}$-strict proper scoring rule. The corresponding entropy function is (see Figure 1)

$$H(P) = 1 - \sum_j p(j)^2. \tag{19}$$
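As a quick numerical check of the Brier score formulas above, the following Python sketch evaluates the pointwise score, the expected score and the associated entropy on a small example. The function names, the 0-based outcome indexing and the two test distributions are illustrative assumptions, not part of the paper. The last check uses the identity $S(P,Q)-S(P,P)=\|p-q\|^2$, which follows by expanding the expected-score formula and is what makes $Q=P$ the unique minimizer.

```python
def brier_score(x, q):
    """Pointwise Brier score: sum_j q(j)^2 - 2 q(x) + 1, with x a 0-based index."""
    return sum(qj * qj for qj in q) - 2.0 * q[x] + 1.0


def expected_brier(p, q):
    """Expected score S(P, Q) = sum_x p(x) S(x, Q)."""
    return sum(px * brier_score(x, q) for x, px in enumerate(p))


def brier_entropy(p):
    """Generalized entropy H(P) = 1 - sum_j p(j)^2."""
    return 1.0 - sum(pj * pj for pj in p)


# Hypothetical example distributions (assumptions for illustration only).
p = [0.2, 0.3, 0.5]
q = [1 / 3, 1 / 3, 1 / 3]

# Quoting Q instead of the true P costs exactly the squared distance |p - q|^2.
gap = expected_brier(p, q) - expected_brier(p, p)
sq_dist = sum((pj - qj) ** 2 for pj, qj in zip(p, q))

print("H(P)             =", brier_entropy(p))
print("S(P,P)           =", expected_brier(p, p))
print("S(P,Q) - S(P,P)  =", gap)
print("squared distance =", sq_dist)
```

Because the gap equals a squared distance, it is zero only when $q=p$, which is the strict propriety claimed in the text.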
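Introduction to Maximum Entropy Models
The FI algorithm (Feature Induction) addresses the problem of how to select features: features are usually added step by step, and which feature is added at each step depends on the sample data.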
Algorithms
Generalized Iterative Scaling (GIS): (Darroch and Ratcliff, 1972)
Improved Iterative Scaling (IIS): (Della Pietra et al., 1995)

Approximation for calculating feature expectation

$$E_p f_j = \sum_{x} p(x) f_j(x) = \sum_{a\in A,\,b\in B} p(a,b)\, f_j(a,b) = \sum_{a\in A,\,b\in B} p(b)\, p(a\mid b)\, f_j(a,b) \approx \sum_{a\in A,\,b\in B} \tilde{p}(b)\, p(a\mid b)\, f_j(a,b)$$
GIS: setup
Requirements for running GIS:
Obey form of model and constraints:

$$p(x) = \frac{e^{\sum_{j=1}^{k}\lambda_j f_j(x)}}{Z}, \qquad E_p f_j = d_j$$

An additional constraint:

$$\forall x:\quad \sum_{j=1}^{k} f_j(x) = C$$
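The GIS setup can be made concrete with a small Python sketch. Several things here are assumptions rather than content of the slides: the update $\lambda_j \leftarrow \lambda_j + \frac{1}{C}\log(d_j / E_p f_j)$ is the standard GIS step from the literature, $C$ is the constant value of $\sum_j f_j(x)$ that GIS requires, and the event space, the features (including the slack feature that makes the sum constant) and the targets are invented for illustration.

```python
import math
from itertools import product


def model(events, features, lam):
    """p(x) = exp(sum_j lam_j f_j(x)) / Z over a finite event space."""
    weights = [math.exp(sum(l * f(x) for l, f in zip(lam, features)))
               for x in events]
    z = sum(weights)
    return [w / z for w in weights]


def expectations(events, features, p):
    """E_p f_j = sum_x p(x) f_j(x) for every feature j."""
    return [sum(px * f(x) for px, x in zip(p, events)) for f in features]


def gis(events, features, targets, iterations=200):
    """Generalized Iterative Scaling sketch; needs sum_j f_j(x) = C for all x."""
    totals = [sum(f(x) for f in features) for x in events]
    c = totals[0]
    if any(abs(t - c) > 1e-12 for t in totals):
        raise ValueError("GIS needs sum_j f_j(x) to be the same constant for every x")
    lam = [0.0] * len(features)
    for _ in range(iterations):
        current = expectations(events, features, model(events, features, lam))
        # Standard GIS step (assumed): move each lam_j by log(target / current) / C.
        lam = [l + math.log(d / e) / c for l, d, e in zip(lam, targets, current)]
    return lam, model(events, features, lam)


# Hypothetical toy problem: two binary variables (a, b), constraints on E[a], E[b].
events = list(product((0, 1), repeat=2))
features = [
    lambda x: x[0],             # f_1 = a
    lambda x: x[1],             # f_2 = b
    lambda x: 2 - x[0] - x[1],  # slack feature so that f_1 + f_2 + f_3 = 2
]
targets = [0.7, 0.4, 0.9]       # d_3 = 2 - d_1 - d_2

lam, p = gis(events, features, targets)
fitted = expectations(events, features, p)
print("fitted expectations:", fitted)
print("p(a=1, b=1):", p[events.index((1, 1))])
```

With only marginal constraints, the fitted maximum-entropy model factorizes over the two variables, so the probability of (1, 1) should approach the product of the two target marginals.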
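The maximum entropy principle
The maximum entropy principle was proposed by E. T. Jaynes in 1957. Main idea:
When only partial knowledge of an unknown distribution is available, one should choose the probability distribution that agrees with this knowledge and has the largest entropy.
Essence of the principle:
Premise: partial knowledge is given. The most reasonable inference about the unknown distribution is the most uncertain, or most random, inference that is consistent with the known facts. This is the only unbiased choice available: any other choice amounts to adding further constraints and assumptions that cannot be justified from the information at hand.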
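A Guide to English Vocabulary for Information and Computing Science
Abstract. Information and Computing Science is an interdisciplinary programme set against the background of the information field and founded on mathematics, statistics and informatics.
The programme trains senior specialists with strong mathematical modelling skills and computer application skills.
English is an important tool and working language throughout the programme, so mastering the relevant English vocabulary is essential.
This guide provides students of the programme with an English vocabulary list covering the following areas: basic mathematics vocabulary; basic statistics vocabulary; basic information-science vocabulary; common software tool vocabulary; common algorithm and model vocabulary. 1. Basic mathematics vocabulary. Mathematics is the foundation of Information and Computing Science and one of the programme's core subjects.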
在数学方面,该专业主要涉及以下几个分支:数理基础科学离散数学数值分析运筹学图论概率论数理统计以下是这些分支中常用的一些英语词汇:中文英文数理基础科学Foundations of Mathematical Science集合Set元素Element子集Subset并集Union交集Intersection补集Complement空集Empty set集合运算Set operations映射Mapping函数Function逆映射Inverse mapping逆函数Inverse function复合映射Composite mapping复合函数Composite function关系Relation等价关系Equivalence relation序关系Order relation有序集合Ordered set上确界Supremum下确界Infimum极限Limit极限存在性定理Theorem of existence of limit极限唯一性定理Theorem of uniqueness of limit极限运算法则Rules of limit operations连续函数Continuous function连续性定理Theorem of continuity导数Derivative微分中值定理Mean value theorem of differential calculus 洛必达法则L'Hospital's rule泰勒公式Taylor's formula二、基础统计词汇在统计方面,该专业主要涉及以下几个分支:描述统计推断统计回归分析方差分析时间序列分析多元统计分析以下是这些分支中常用的一些英语词汇:中文英文描述统计Descriptive statistics数据Data总体Population样本Sample变量Variable随机变量Random variable分布Distribution频数Frequency频率Frequency rate相对频率Relative frequency累积频率Cumulative frequency频数分布表Frequency distribution table频数直方图Frequency histogram频数折线图Frequency polygon饼图Pie chart散点图Scatter plot箱线图Box plot均值Mean中位数Median众数Mode四分位数Quartile极差Range方差Variance标准差Standard deviation样本协方差Sample covariance样本相关系数Sample correlation coefficient三、基础信息词汇在信息方面,该专业主要涉及以下几个分支:信息论编码理论信息安全信息检索人工智能以下是这些分支中常用的一些英语词汇:中文英文信息论Information theory信息量Information quantity熵Entropy条件熵Conditional entropy互信息Mutual information信道容量Channel capacity编码理论Coding theory编码器Encoder解码器Decoder源编码定理Source coding theorem信道编码定理Channel coding theorem哈夫曼编码Huffman coding汉明距离Hamming distance汉明码Hamming code循环冗余校验码Cyclic redundancy check code (CRC)信息安全Information security加密算法Encryption algorithm解密算法Decryption algorithm对称加密算法Symmetric encryption algorithm非对称加密算法Asymmetric encryption algorithm公钥加密算法Public key encryption algorithm私钥加密算法Private key encryption algorithm数字签名算法Digital signature algorithm四、常用软件工具词汇在软件工具方面,该专业主要涉及以下几个方面:编程语言数据库数学软件统计软件人工智能框架以下是这些方面中常用的一些英语词汇:中文英文编程语言Programming language变量Variable常量Constant数据类型Data type表达式Expression语句Statement函数Function参数Parameter返回值Return 
value类Class对象Object属性Attribute方法Method继承Inheritance多态Polymorphism封装Encapsulation接口Interface抽象类Abstract class数据库Database数据库管理系统Database management system (DBMS)关系型数据库Relational database非关系型数据库Non-relational database表Table中文英文记录Record字段Field主键Primary key外键Foreign key索引Index查询语言Query language结构化查询语言Structured query language (SQL)数学软件Mathematical softwareMATLAB(矩阵实验室)MATLAB (Matrix Laboratory)Mathematica(数学家)Mathematica (Mathematician)Maple(枫树)Maple (Maple Tree)统计软件Statistical softwareR(统计编程语言)R (Statistical programming language)SPSS(统计产品与服务解决方案)SPSS (Statistical Product and Service Solutions)SAS(统计分析系统)SAS (Statistical Analysis System)人工智能框架Artificial intelligence frameworkTensorFlow(张量流)TensorFlow (Tensor Flow)PyTorch(Python火炬)PyTorch (Python Torch)Keras(可拉斯)Keras (Keras)五、常见算法和模型词汇算法和模型是信息与计算科学专业的重要研究对象,也是该专业的核心课程之一。
Chinese-English Glossary of Common Computer Vision Terms
---------------------------------------------------------------最新资料推荐------------------------------------------------------ 计算机视觉常用术语中英文对照计算机视觉常用术语中英文对照(1)人工智能 Artificial Intelligence 认知科学与神经科学Cognitive Science and Neuroscience 图像处理Image Processing 计算机图形学Computer graphics 模式识别 Pattern Recognized 图像表示 Image Representation 立体视觉与三维重建Stereo Vision and 3D Reconstruction 物体(目标)识别 Object Recognition 运动检测与跟踪Motion Detection and Tracking 边缘edge 边缘检测detection 区域region 图像分割segmentation 轮廓与剪影contour and silhouette1/ 10纹理 texture 纹理特征提取 feature extraction 颜色 color 局部特征 local features or blob 尺度 scale 摄像机标定 Camera Calibration 立体匹配stereo matching 图像配准Image Registration 特征匹配features matching 物体识别Object Recognition 人工标注Ground-truth 自动标注Automatic Annotation 运动检测与跟踪 Motion Detection and Tracking 背景剪除Background Subtraction 背景模型与更新background modeling and update---------------------------------------------------------------最新资料推荐------------------------------------------------------ 运动跟踪 Motion Tracking 多目标跟踪 multi-target tracking 颜色空间 color space 色调 Hue 色饱和度 Saturation 明度 Value 颜色不变性 Color Constancy(人类视觉具有颜色不变性)照明illumination 反射模型Reflectance Model 明暗分析Shading Analysis 成像几何学与成像物理学 Imaging Geometry and Physics 全像摄像机 Omnidirectional Camera 激光扫描仪 Laser Scanner 透视投影Perspective projection 正交投影Orthopedic projection3/ 10表面方向半球 Hemisphere of Directions 立体角 solid angle 透视缩小效应 foreshortening 辐射度 radiance 辐照度 irradiance 亮度 intensity 漫反射表面、Lambertian(朗伯)表面 diffuse surface 镜面 Specular Surfaces 漫反射率 diffuse reflectance 明暗模型 Shading Models 环境光照 ambient illumination 互反射interreflection 反射图Reflectance Map 纹理分析Texture Analysis 元素 elements---------------------------------------------------------------最新资料推荐------------------------------------------------------ 基元 primitives 纹理分类 texture classification 从纹理中恢复图像 shape from texture 纹理合成 synthetic 图形绘制 graph rendering 图像压缩 image compression 统计方法 statistical methods 结构方法 structural methods 基于模型的方法 model based methods 分形fractal 自相关性函数autocorrelation function 
熵entropy 能量energy 对比度contrast 均匀度homogeneity5/ 10相关性 correlation 上下文约束 contextual constraints Gibbs 随机场吉布斯随机场边缘检测、跟踪、连接 Detection、Tracking、Linking LoG 边缘检测算法(墨西哥草帽算子)LoG=Laplacian of Gaussian 霍夫变化 Hough Transform 链码 chain code B-样条B-spline 有理 B-样条 Rational B-spline 非均匀有理 B-样条Non-Uniform Rational B-Spline 控制点control points 节点knot points 基函数 basis function 控制点权值 weights 曲线拟合 curve fitting---------------------------------------------------------------最新资料推荐------------------------------------------------------ 内插 interpolation 逼近 approximation 回归 Regression 主动轮廓Active Contour Model or Snake 图像二值化Image thresholding 连通成分connected component 数学形态学mathematical morphology 结构元structuring elements 膨胀Dilation 腐蚀 Erosion 开运算 opening 闭运算 closing 聚类clustering 分裂合并方法 split-and-merge 区域邻接图 region adjacency graphs7/ 10四叉树quad tree 区域生长Region Growing 过分割over-segmentation 分水岭watered 金字塔pyramid 亚采样sub-sampling 尺度空间 Scale Space 局部特征 Local Features 背景混淆clutter 遮挡occlusion 角点corners 强纹理区域strongly textured areas 二阶矩阵 Second moment matrix 视觉词袋 bag-of-visual-words 类内差异 intra-class variability---------------------------------------------------------------最新资料推荐------------------------------------------------------ 类间相似性inter-class similarity 生成学习Generative learning 判别学习discriminative learning 人脸检测Face detection 弱分类器weak learners 集成分类器ensemble classifier 被动测距传感passive sensing 多视点Multiple Views 稠密深度图 dense depth 稀疏深度图 sparse depth 视差disparity 外极epipolar 外极几何Epipolor Geometry 校正Rectification 归一化相关 NCC Normalized Cross Correlation9/ 10平方差的和 SSD Sum of Squared Differences 绝对值差的和 SAD Sum of Absolute Difference 俯仰角 pitch 偏航角 yaw 扭转角twist 高斯混合模型Gaussian Mixture Model 运动场motion field 光流 optical flow 贝叶斯跟踪 Bayesian tracking 粒子滤波 Particle Filters 颜色直方图 color histogram 尺度不变特征转换 SIFT scale invariant feature transform 孔径问题 Aperture problem。
LS-DYNA EFG User Manual
Variable   DX     DY     DZ     ISPLINE   IDILA   IEBT   IDIM   TOLDEF
Type       F      F      F      I         I       I      I      F
Default    1.01   1.01   1.01   0         0       -1     2      0.01

TOLDEF (TOLDEF < 1.0)
  = 0.0 : Lagrangian kernel
  > 0.0 : Semi-Lagrangian kernel
  < 0.0 : Eulerian kernel
Livermore Software Technology Corporation
LS-DYNA
Global control for the activation of Semi-Lagrangian kernel or Eulerian kernel
Variable IGL
STIME IKEN
IEBT
Type
F
F
F
I
I
I
Default
1.01
1.01
1.01
0
0
-1
1.0 ≤ DX, DY, DZ ≤ 2.0 is recommended CPU time increases with support size
IDIM I 2
TOLDEF F
0.01
Some Guidelines for DX, DY and DZ
I y
x
d
rxI rxI
= =
d d
/
2
if IDILA=0 if IDILA=1
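Comparison of the Material Point Method and the Finite Element Method for Two-Dimensional Contact Problems
Comparison of the material point method and the finite element method for two-dimensional contact problems. Lu Jiazheng (卢嘉铮); Yao Xingyu (姚星宇).
[Abstract] For nonlinear mechanics problems, and contact problems in particular, the finite element method often shows inherent weaknesses such as high time cost and results that are hard to bring to convergence. The material point method behaves better in this respect: the contact stress is computed directly from momentum conservation between material points, so in theory the computational cost is lower while precision and accuracy are higher. This paper solves the Hertz contact problem with the material point method, the finite element method and an analytical method, and compares the maximum contact stress obtained by the three. The analysis shows that the material point results are slightly more precise and accurate than the finite element results. The study offers a new approach to computing contact problems and has some engineering value.
[Journal] 科技视界 (Science & Technology Vision)
[Year (volume), issue] 2018(000)005
[Pages] 4 (pp. 12-14, 83)
[Keywords] material point method; finite element method; Hertz contact
[Authors] Lu Jiazheng; Yao Xingyu
[Affiliation] College of Aviation Engineering, Civil Aviation Flight University of China, Guanghan, Sichuan 618307 (both authors)
[Language] Chinese
[CLC classification] O241.82; O35

1 Introduction
Contact problems are strongly nonlinear, and improving the precision and accuracy of computed results has long been a focus of research.
In the 20th century, advances in computing allowed numerical methods such as the finite element method to come into their own, and they were quickly applied to contact problems.
However, contact computation in the finite element method relies mainly on the penalty method. The contact state between the two bodies is unknown, so the contact surfaces must be searched before and after every increment, and the constraints cannot be satisfied exactly. As a result, finite element contact computations often suffer from penetration and failure to converge.
The material point method (MPM), proposed by Sulsky and Chen in 1994 [1], is in essence a meshfree method that uses a dual description based on particles and a grid.
MPM discretizes the material region with particles and uses a background grid to compute spatial derivatives and solve the momentum equations. This avoids mesh distortion and the treatment of convective terms, combines the advantages of the Eulerian and Lagrangian descriptions, and makes the method well suited to problems involving large deformation, impact, fracture and fragmentation.
However, the approximation used in MPM does not have the Kronecker-delta property, so the imposition of boundary conditions remains an unresolved difficulty [1].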
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING
Int. J. Numer. Meth. Engng 2006; 65:2167–2202
Published online 12 December 2005 in Wiley InterScience. DOI: 10.1002/nme.1534

Local maximum-entropy approximation schemes: a seamless bridge between finite elements and meshfree methods

M. Arroyo‡,§ and M. Ortiz∗,†
Graduate Aeronautical Laboratories, California Institute of Technology, Pasadena, CA 91125, U.S.A.

SUMMARY
We present a one-parameter family of approximation schemes, which we refer to as local maximum-entropy approximation schemes, that bridges continuously two important limits: Delaunay triangulation and maximum-entropy (max-ent) statistical inference. Local max-ent approximation schemes represent a compromise, in the sense of Pareto optimality, between the competing objectives of unbiased statistical inference from the nodal data and the definition of local shape functions of least width. Local max-ent approximation schemes are entirely defined by the node set and the domain of analysis, and the shape functions are positive, interpolate affine functions exactly, and have a weak Kronecker-delta property at the boundary. Local max-ent approximation may be regarded as a regularization, or thermalization, of Delaunay triangulation which effectively resolves the degenerate cases resulting from the lack of uniqueness of the triangulation. Local max-ent approximation schemes can be taken as a convenient basis for the numerical solution of PDEs in the style of meshfree Galerkin methods. In test cases characterized by smooth solutions we find that the accuracy of local max-ent approximation schemes is vastly superior to that of finite elements. Copyright © 2005 John Wiley & Sons, Ltd.

KEY WORDS: maximum entropy; information theory; approximation theory; meshfree methods; Delaunay triangulation

1. INTRODUCTION
This paper is concerned with the formulation of approximation schemes that bridge continuously two important limits: Delaunay triangulation and maximum-entropy statistical inference. The resulting basis
functions bear similarities with those obtained from the moving least squares (MLS) method, which are the basis of a number of meshfree methods for the numerical solution of partial differential equations (e.g. References [1–3]; cf. also Reference [4] for a review). Despite some similarities, the approach presented in this paper is distinct from MLS methods and presents a number of important advantages over them.

∗ Correspondence to: M. Ortiz, Graduate Aeronautical Laboratories, California Institute of Technology, Pasadena, CA 91125, U.S.A.
† E-mail: ortiz@
‡ E-mail: marino.arroyo@
§ Current address: LaCàN, Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, Barcelona 08034, Spain.
Contract/grant sponsor: Department of Energy
Contract/grant sponsor: NSF
Received 17 December 2004; Revised 27 April 2005; Accepted 17 August 2005

The approximation schemes are entirely defined by the node set and fall into the general class of convex approximation schemes. These are schemes based on shape functions that are positive and interpolate affine functions exactly. An important property of convex approximation schemes is that they have a weak Kronecker-delta property at the boundary. This property greatly facilitates the imposition of essential boundary conditions and can also be exploited in order to glue together domain patches in a fully conforming way. The positivity of the shape functions endows the approximation schemes with useful properties and structure derived from convex geometry, but makes the construction of high order approximants more involved. Extensions beyond first-order methods will be pursued in subsequent work. Another avenue for defining higher-order approximation schemes is to combine the present approach with the partition of unity method [5, 6].

The specific convex approximation schemes that we investigate represent a compromise, in the sense of Pareto optimality, between two competing objectives: (i) unbiased statistical inference based on the nodal
data; (ii) the definition of local shape functions of least width. Objective (i) is classical in information theory and leads to Jaynes' principle of maximum entropy [7]. In the present context the least biased shape functions, which we call global max-ent approximants, are those that maximize a suitably defined entropy of the approximation scheme. By way of contrast, the most local shape functions, in a sense to be made mathematically precise, are found to be affine shape functions supported on a Delaunay triangulation of the node set. Specifically, we define a one-parameter family of smooth convex approximation schemes, which we refer to as local max-ent schemes, which have global max-ent and Delaunay schemes as limiting cases. In particular, local max-ent approximation schemes subsume simplicial finite elements and the Delaunay triangulation as a special case. Conversely, local max-ent approximation may be regarded as a regularization, or, in analogy to statistical mechanics, a thermalization, of Delaunay interpolation. The level of thermalization is smoothly controlled by a non-negative parameter that can be a function of position. This spatial dependence enables a seamless transition from meshfree-type approximants to finite elements. An important feature of this regularization is that it effectively resolves the degenerate cases resulting from the lack of uniqueness of Delaunay triangulation. Thus, for node sets for which the Delaunay triangulation is not unique our regularization selects a unique generalized Delaunay approximant in the limit, namely, that which maximizes the approximation entropy.

The local max-ent shape functions follow from an unconstrained convex optimization problem at each evaluation point. The size of this problem equals the spatial dimension. In addition, this problem is guaranteed to be solvable on the convex hull of the node set, and its solution is very robust and efficient. Approximants derived from minimization problems have a long tradition and include cubic
splines, thin plate splines, MLS approximants and natural neighbour approximants, to name a few.

Local max-ent approximation schemes can be taken as a convenient basis for the numerical solution of PDEs in the style of meshfree Galerkin methods (cf. e.g. Reference [4] for a recent review of Galerkin meshfree methods) or, in problems governed by a minimum principle, by constrained, or Rayleigh–Ritz, minimization. We illustrate the performance of local max-ent approximation schemes in this type of applications by means of a patch test and two test cases: the standard benchmark problem of a linear elastic built-in cantilever beam loaded at the tip; and the upsetting and extension of a block of compressible neo-Hookean rubber. In both examples we find that the accuracy of local max-ent approximation schemes is vastly superior to that of finite elements, even when the solution cost is carefully factored in.

The structure of the paper is as follows. In Section 2 we begin by establishing the properties of general convex approximation schemes, including a weak Kronecker-delta property at the boundary. In Section 3 we adopt an information-theoretical viewpoint and introduce the notions of entropy of an approximation scheme and global max-ent approximation. The resulting shape functions are of global support and non-interpolating in general. In order to bring these properties under control, in Section 4 we introduce the concept of width of a shape function.
The sum of the widths of the shape functions supplies a measure of the degree of locality of the approximation. We then proceed to introduce the local max-ent approximation schemes by recourse to Pareto optimality. A method for the calculation of the shape functions and some properties of the approximation scheme are presented in this section. Applications to the numerical solution of PDEs are presented in Section 5. Some concluding remarks are finally collected in Section 6.

2. CONVEX APPROXIMATION SCHEMES
All approximation schemes considered in this paper fall within a class that we shall term the class of convex approximation schemes. These convex approximation schemes are characterized by the positivity of the shape functions and by being exact on affine functions. These conditions alone do not determine a unique convex approximation scheme, and most of this paper is devoted to the selection of convex approximation schemes that are optimal according to certain ancillary criteria. However, there are a number of desirable properties that are shared by all convex approximation schemes. In this section, we proceed to enumerate these properties.

2.1. Approximants as coefficients of convex combinations
Consider a set of distinct nodes $X=\{x_a,\ a=1,\ldots,N\}\subset\mathbb{R}^d$, to be referred to as the node set.
Recall [8] that the convex hull of $X$ is the set

$$\operatorname{conv} X = \{x\in\mathbb{R}^d \mid x = X\lambda,\ \lambda\in\mathbb{R}^N_+,\ \mathbf{1}\cdot\lambda = 1\} \tag{1}$$

where $\mathbb{R}^N_+$ is the non-negative orthant, $\mathbf{1}$ denotes the vector of $\mathbb{R}^N$ whose entries are one, and $X$ is the $d\times N$ matrix whose columns are the co-ordinates of the position vectors of the nodes in the node set $X$. Since $X$ is finite, it follows that $\operatorname{conv} X$ is a compact convex polyhedron, or polytope. Let $u:\operatorname{conv} X\to\mathbb{R}$ be a function whose values $\{u_a;\ a=1,\ldots,N\}$ are known on the node set. We wish to construct approximations to $u$ of the form

$$u^h(x) = \sum_{a=1}^{N} p_a(x)\,u_a \tag{2}$$

where the functions $p_a:\operatorname{conv} X\to\mathbb{R}$ will be referred to as shape functions. A particular choice of shape functions defines an approximation scheme. We shall require the shape functions to satisfy the zeroth and first-order consistency conditions:

$$\sum_{a=1}^{N} p_a(x) = 1 \quad \forall x\in\operatorname{conv} X \tag{3a}$$

$$\sum_{a=1}^{N} p_a(x)\,x_a = x \quad \forall x\in\operatorname{conv} X \tag{3b}$$

These conditions guarantee that affine functions are exactly reproduced by the approximation scheme. We note that if $N=d+1$ and the point set is affinely independent, the consistency conditions uniquely determine the shape functions over the corresponding $d$-simplex. By way of contrast, the shape functions are not uniquely determined by the consistency conditions in general when $N>d+1$. In addition, we shall require the shape functions be non-negative, i.e.

$$p_a(x)\ge 0 \quad \forall x\in\operatorname{conv} X,\ a=1,\ldots,N \tag{4}$$

The positivity of the shape functions, together with the partition of unity property, allow us to interpret the shape functions as the coefficients of convex combinations. This viewpoint is common in geometric modelling, e.g. in Bézier and B-Spline techniques [9]. Positive linearly consistent approximants have long been studied in the literature [10]. Recent examples include the natural element method shape functions [11] and subdivision schemes [12]. These methods often present a number of attractive features, such as the related properties of monotonicity, the variation diminishing property (the approximation is
not more 'wiggly' than the data), or smoothness preservation [13], of particular interest in the presence of shocks. Furthermore, they lead to well behaved mass matrices. The positivity restriction is natural in problems where a maximum principle is in force, such as in the heat conduction problem. In the present context, the non-negativity requirement is introduced primarily to enable the interpretation of shape functions as probability distributions.

It follows from (3a), (3b) and (4) that the shape functions at $x\in\operatorname{conv} X$ define a convex combination of vertices which evaluates to $x$. In view of this property we shall refer to non-negative and first-order consistent approximation schemes as convex approximation schemes. Let $p(x)$ denote the vector of $\mathbb{R}^N$ whose components are $\{p_1(x),\ldots,p_N(x)\}$. Then, by virtue of the consistency and non-negativity constraints the domain of $p(x)$, or feasible set, is

$$\mathcal{P}_x(X) = \{p\in\mathbb{R}^N_+ \mid Xp = x,\ \mathbf{1}\cdot p = 1\} \tag{5}$$

Evidently, this set is convex. A first question of interest is whether $\mathcal{P}_x(X)$ is non-empty, i.e.
whether there exist shape functions consistent with the constraints. The following proposition follows directly by comparison of (1) and (5).

Proposition 2.1
The feasible set $\mathcal{P}_x(X)$ is non-empty if and only if $x\in\operatorname{conv} X$.

It follows from the preceding observations that non-negative and linearly consistent approximation schemes can only be defined on $\operatorname{conv} X$. If the node set is large enough, Carathéodory's theorem states that at least $N-d-1$ points in $X$ are not necessary in order to express $x\in\operatorname{conv} X$ as a convex combination of points in $X$. Thus, as expected, convex approximation schemes are not uniquely determined by the node set in general. It is possible to consider domains $\Omega$ which are subsets of $\operatorname{conv} X$. However, for simplicity in the present work we will assume that $\Omega=\operatorname{conv} X$ throughout.

2.2. Behaviour at the boundary
In interpolating schemes such as Lagrangian finite elements the shape functions satisfy the so-called Kronecker-delta property, i.e. $p_a(x_b)=\delta_{ab}$. This property is particularly useful when solving partial-differential equations numerically, since it renders the imposition of essential boundary conditions straightforward. Most meshfree methods, in particular those based on the MLS approximation, lack the Kronecker-delta property, and, consequently, the approximation on the boundary of the domain may depend on the nodal data of interior nodes. These methods experience difficulty in enforcing essential boundary conditions (cf. e.g. Reference [14]). In this section we study the behaviour of general convex approximation schemes at the relative boundary of $\operatorname{conv} X$, $\operatorname{rbd}(\operatorname{conv} X)$, i.e. the boundary of $\operatorname{conv} X$ regarded as a subset of its affine hull. The relative boundary of $\operatorname{conv} X$ coincides with the boundary of $\operatorname{conv} X$ when $\operatorname{aff}(\operatorname{conv} X)=\mathbb{R}^d$.
Here aff denotes the affine hull. In particular, we show that all convex approximation schemes possess a weak Kronecker-delta property at the boundary. This Kronecker-delta property greatly facilitates the imposition of essential boundary conditions, which confers convex approximation schemes a distinct advantage over MLS and other meshfree approximation schemes.

We begin by reviewing a few elementary facts concerning the boundary of polytopes. The faces of the polytope $P=\operatorname{conv} X$ can be characterized as the intersections of $P$ with its supporting hyperplanes, in addition to $P$ itself and $\emptyset$, and are themselves polytopes. An equivalent definition of a face of $P$ is a convex subset $F$ of $P$ such that every closed line segment in $P$ with a relative interior point in $F$ has both endpoints (and hence the entire segment) in $F$ [8]. A proper face of $P$ is one that is neither $P$ nor $\emptyset$. The dimension of a face is the dimension of its affine hull. In particular, the zero-dimensional faces of $P$ are called vertices, coincide with its extreme points, and belong to $X$. We shall denote by $\operatorname{vert} P$ the collection of vertices of the polytope. In addition, $P=\operatorname{conv}(\operatorname{vert} P)$ and, if $F$ is a face of $P$, it follows that $\operatorname{vert} F=\operatorname{vert} P\cap F$. The relative interiors of the proper faces of $P$ are a partition of $\operatorname{rbd} P$, i.e.
they are disjoint and their union is $\operatorname{rbd} P$. The smallest face of $P$ to which $x$ belongs is the contact set of $x$, $C(x)$, and is formally defined as the intersection of $P$ with the intersection of all supporting hyperplanes to $P$ at $x$. Its affine dimension is the facial dimension of $x$. The facial dimension of points interior to $\operatorname{conv} X$ is $d$, while the facial dimension of extreme points is $0$. If $x \in \operatorname{rbd} P$, then $C(x)$ is a proper face of $P$.

Proposition 2.2
Let $p(x)$ define a convex approximation scheme with node set $X$. Let $F$ be a face of $\operatorname{conv} X$ and $x_a \notin F$. Then $p_a = 0$ on $F$.

Proof
Suppose otherwise, i.e. suppose that there is a point $x \in F$ and a convex approximation scheme $p(x)$ such that $p_a(x) \neq 0$. Since

$$x = \sum_{b} p_b(x)\,x_b = \sum_{b \neq a} p_b(x)\,x_b + p_a(x)\,x_a \tag{6}$$

and $x \neq x_a$, it follows that $\sum_{b \neq a} p_b(x) \neq 0$. Consider the closed line segment

$$[0,1] \ni t \longmapsto t\,y + (1-t)\,x_a \in \operatorname{conv} X \tag{7}$$

where

$$y = \frac{1}{\sum_{b \neq a} p_b(x)} \sum_{b \neq a} p_b(x)\,x_b \tag{8}$$

Then $x \in F$ is a relative interior point of the segment, corresponding to $t = 1 - p_a(x)$, and hence the entire segment, including $x_a$, must be contained in $F$, which contradicts the assumption.
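Proposition 2.2 can be checked on a small planar example. The sketch below is an illustration only, not part of the original text: the node set, the helper names (`det3`, `barycentric`, `max_shape_value`) and the enumeration strategy are all assumptions made for this example. It computes the largest value that $p_a(x)$ can take over all convex schemes by enumerating the basic solutions of the constraints, i.e. the convex combinations supported on $d+1 = 3$ affinely independent nodes, in the spirit of Carathéodory's theorem. Since the feasible set is a bounded polytope and $p_a$ is a linear objective, the maximum is attained at one of these basic solutions.

```python
from itertools import combinations


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def barycentric(tri, x):
    """Affine coordinates of x with respect to three 2D points (Cramer's rule).

    Returns None when the three points are collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    m = [[1.0, 1.0, 1.0], [x1, x2, x3], [y1, y2, y3]]
    d = det3(m)
    if abs(d) < 1e-12:
        return None
    rhs = [1.0, x[0], x[1]]
    coords = []
    for k in range(3):
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][k] = rhs[r]  # replace column k by the right-hand side
        coords.append(det3(mk) / d)
    return coords


def max_shape_value(nodes, a, x, tol=1e-12):
    """Largest p_a(x) over all convex schemes on `nodes` (None if x is outside conv X)."""
    best = None
    for subset in combinations(range(len(nodes)), 3):
        lam = barycentric([nodes[i] for i in subset], x)
        if lam is None or min(lam) < -tol:
            continue  # degenerate triple, or not a convex combination
        value = lam[subset.index(a)] if a in subset else 0.0
        best = value if best is None else max(best, value)
    return best


# Hypothetical node set: a triangle plus one interior node.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.25, 0.25)]
x_on_face = (0.3, 0.0)      # lies on the edge F joining nodes 0 and 1
x_interior = (0.25, 0.25)   # interior point of conv X

for a in (2, 3):            # nodes that do not belong to F
    print(a, max_shape_value(nodes, a, x_on_face), max_shape_value(nodes, a, x_interior))
```

On the edge $F$ the upper bound for both nodes outside $F$ collapses to zero, so every convex scheme vanishes there, whereas at the interior point the same shape functions may be strictly positive.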
Remarks

1. If $E$ is the union of an arbitrary collection of faces of $\operatorname{conv} X$ and $x_a \notin E$, then it follows that $p_a = 0$ on $E$.
2. The shape functions corresponding to nodes that belong to $\operatorname{relint}(\operatorname{conv} X)$ vanish on $\operatorname{rbd}(\operatorname{conv} X)$.
3. When approximating a function as in Equation (2), the value of $u^h$ at a face $F$ depends only on the nodal values corresponding to nodes in $X \cap F$. Let $X$ and $Y$ be two node sets such that $\operatorname{conv} X \cap \operatorname{conv} Y$ is a face of both $\operatorname{conv} X$ and $\operatorname{conv} Y$. Then, given a method to select convex approximants, the approximation schemes based on $X$ and $Y$ are conforming (conforming patches).
4. Suppose that a function $u$ defined over $\operatorname{conv} X$ is affine on a face $F$. Then $u = u^h$ over $F$ provided $u_a = u(x_a)$ $\forall x_a \in F \cap X$ (exact interpolation of affine functions on faces).
5. If $x_a$ is an extreme point or vertex of $\operatorname{conv} X$, then $p_b(x_a) = \delta_{ba}$, and consequently, $u_a = u^h(x_a)$ (interpolation at extreme points).
6. Let $x \in \operatorname{rbd}(\operatorname{conv} X)$ with contact set $C(x)$. If $x_a \notin C(x)$, then $p_a(x) = 0$. Thus, choosing a convex approximation scheme in $P_x(X)$ is equivalent to choosing a convex approximation scheme in $P_x(X \cap C(x))$. Note that the latter problem can be formulated in $\operatorname{aff} C(x) - x$, the subspace of $\mathbb{R}^d$ parallel to $C(x)$, whose dimension is the facial dimension of $x$, and involves a reduced node set (reduced face problem).
7. If an $n$-dimensional face contains exactly $n+1$ nodes, then the shape functions on that face are the affine shape functions of the simplex defined by those nodes.

Some of these observations are known in different contexts. For instance, the fact that Bézier curves pass through the end control points is a direct consequence of Proposition 2.2.

2.3. Higher-order consistency

A seemingly natural extension of the convex approximation schemes described in the foregoing would be to impose second and higher-order consistency conditions on the shape functions.
However, these extensions are not straightforward. In order to demonstrate the source of the difficulty we may simply consider the one-dimensional case. The second-order consistency condition then takes the form

$$\sum_{a=1}^{N} p_a(x)\,x_a^2 = x^2 \tag{9}$$

Defining an extended point set $Y = \{(x_a, x_a^2);\ a = 1,\ldots,N\} \subset \mathbb{R}^2$, it follows that finding non-negative and second-order consistent approximation schemes amounts to defining a convex approximation scheme on the set $P_{(x,x^2)}(Y)$. We have seen that this set is non-empty iff $(x, x^2)$ belongs to the set $\operatorname{conv}\{(x_a, x_a^2);\ a = 1,\ldots,N\}$. In the context of the classical problem of moments, namely, the problem of finding a probability distribution given its first moments [15,16], that set is known as the moment space. However, due to the strict convexity of the function $f(x) = x^2$, every convex combination of two or more distinct points $(x_a, x_a^2)$ lies strictly above the parabola, so the condition that $(x, x^2)$ be in the set $\operatorname{conv}\{(x_a, x_a^2);\ a = 1,\ldots,N\}$ holds only when $x$ coincides with a node and cannot be satisfied in general, as illustrated in Figure 1. Consequently, convex approximation schemes cannot satisfy Equation (9). A similar argument applies to higher-order consistency conditions and higher spatial dimensions. This observation notwithstanding, it is possible to extend the methods presented here to construct high-order convex approximants, as will be detailed in forthcoming work.

Figure 1. Illustration of the second-order moment space in 1D.

3. GLOBAL MAX-ENT APPROXIMANTS

In this section we begin by adopting an information-theoretical viewpoint that naturally leads to a canonical choice of convex approximants, namely, those that maximize the entropy of the approximation scheme. In this framework, the problem of approximating a function from nodal data is regarded strictly as a problem of statistical inference, with no regard given to the physical nature of the data or the mathematical character of the governing field equations.
From a strict information-theoretical viewpoint, the overriding concern is to ensure an unbiased inference of the function from the data, i.e. one that is free of systematic errors or artifacts.

3.1. Entropy of a convex approximation scheme

Though the principle of maximum entropy is well-known in information theory and statistical physics, in the present context it may stand a brief review. Consider a random variable which can take values in a set of events $\{A_1, A_2, \ldots, A_n\}$ with probabilities $\{p_1, p_2, \ldots, p_n\}$. The set of events and the associated probabilities

$$A = \begin{pmatrix} A_1 & A_2 & \cdots & A_n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}$$

are jointly called a finite scheme. We now introduce the concept of entropy (uncertainty) of a given finite scheme, following the introductory text by Khinchin [17]. Consider two finite schemes

$$\begin{pmatrix} A_1 & A_2 \\ 0.5 & 0.5 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} A_1 & A_2 \\ 0.99 & 0.01 \end{pmatrix}$$

Evidently, the first scheme carries more uncertainty than the second, for which the outcome is almost certainly $A_1$. The uncertainty associated with a finite scheme can also be interpreted as the amount of information gained by realizing the random variable, thus eliminating completely the uncertainty. Shannon [18] introduced the following measure of uncertainty, or information entropy

$$H(A) = H(p_1, \ldots, p_n) = -\sum_{a=1}^{n} p_a \log p_a \tag{10}$$

with the extension by continuity: $0 \log 0 = 0$. The function $H(A)$ is non-negative, symmetric, continuous, and strictly concave, and possesses a number of properties that are expected of a measure of uncertainty. In particular, $H(p) = 0$ iff one of the probabilities is one and all the others are zero, and $H$ attains its maximum for the probabilities $\{1/n, \ldots, 1/n\}$, which may intuitively be regarded as the most uncertain or random distribution. Furthermore, $H(1/n, \ldots, 1/n) = \log n$, which is an increasing function of $n$. Consequently, adding events adds uncertainty to this most uncertain distribution. However, adding an impossible event does not alter the level of uncertainty, i.e. $H(p_1, \ldots, p_n, 0) = H(p_1, \ldots, p_n)$. Suppose that we
are given two finite schemes,

$$A = \begin{pmatrix} A_1 & A_2 & \cdots & A_n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} B_1 & B_2 & \cdots & B_m \\ q_1 & q_2 & \cdots & q_m \end{pmatrix}$$

The set of events $A_i B_j$, $i = 1, \ldots, n$, $j = 1, \ldots, m$ defines a new finite scheme, called the product scheme $AB$. If $A$ and $B$ are independent, we have that $H(AB) = H(A) + H(B)$, whereas if the schemes are dependent, then $H(AB) = H(A) + H_A(B) \leq H(A) + H(B)$, where $H_A(B)$ denotes the expectation of $H(B)$ in scheme $A$ (cf. Reference [17] for details). The inequality $H_A(B) \leq H(B)$ can be interpreted by saying that the realization of the scheme $A$ can only decrease the uncertainty of another scheme $B$. The axiomatic basis of Shannon's information entropy is well-established in information theory (cf. e.g. Reference [17]).

Within the framework just outlined, the entropy of a convex approximation scheme may be defined as follows. Let $X$ be a node set with $N$ nodes, let $x \in \operatorname{conv} X$, and let $p(x)$ define a convex approximation scheme. Regard the index set $I = \{1, \ldots, N\}$ as a complete system of events. Since the approximation scheme is non-negative and the shape functions add to one, we may regard $\{p_1(x), \ldots, p_N(x)\}$ as the corresponding probabilities and $H(p_1(x), \ldots, p_N(x))$ as the entropy of the corresponding finite scheme.

3.2. Least-biased approximation scheme

An information-theoretical approach to approximation theory can be devised as follows. Equation (3b) is regarded as additional information on the discrete probability distribution $p(x)$, namely that the statistical expectation or average of the random variable $X: I \to \mathbb{R}^d$, which assigns to each index the position vector of the corresponding node, $X(a) = x_a$, is $x$. Consistent with this constraint, there are in general multiple probability distributions $\{p_1(x), \ldots, p_N(x)\}$.
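The non-uniqueness just mentioned, and the entropy properties reviewed in Section 3.1, can be made concrete with a few lines of code. The following is a minimal sketch under assumptions of our own: the three-node one-dimensional node set, the candidate distributions and the helper names `entropy` and `is_feasible` are illustrative and do not come from the original text. It lists three distributions that all satisfy the zeroth- and first-order consistency conditions at the same point and evaluates their Shannon entropy according to Equation (10).

```python
import math


def entropy(p):
    """Shannon entropy of Equation (10), with the convention 0 log 0 = 0."""
    return sum(-q * math.log(q) for q in p if q > 0.0)


def is_feasible(p, nodes, x, tol=1e-12):
    """Check non-negativity, partition of unity and first-order consistency in 1D."""
    mean = sum(q * xa for q, xa in zip(p, nodes))
    return min(p) >= 0.0 and abs(sum(p) - 1.0) < tol and abs(mean - x) < tol


# Hypothetical 1D node set and evaluation point.
nodes = [0.0, 1.0, 2.0]
x = 1.0

# Three distinct convex schemes evaluated at the same point x.
candidates = {
    "interpolating": [0.0, 1.0, 0.0],
    "end nodes only": [0.5, 0.0, 0.5],
    "uniform": [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0],
}

for name, p in candidates.items():
    print(f"{name:15s} feasible={is_feasible(p, nodes, x)} H={entropy(p):.4f}")
```

All three candidates are admissible, and they range from zero entropy (the interpolating choice) to $\log 3$ (the uniform choice), which is the largest value any distribution over three events can have. The constraints alone therefore do not select a scheme; an additional criterion is required.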
The problem of approximating a function from scattered data may now be regarded as a problem of statistical inference. From this standpoint, Equation (2) expresses the expected value $u^h(x)$ of a random variable $U: I \to \mathbb{R}$ defined by $U(a) = u_a$ as determined by the probabilities $\{p_1(x), \ldots, p_N(x)\}$.

Suppose that we require that this process of inference be unbiased, i.e. that it be based solely on the a priori knowledge of the function and free of artifacts or hidden assumptions. According to Jaynes' principle of maximum entropy [7], the least biased probability distribution is that which maximizes entropy subject to all known constraints. Jaynes states that the maximum entropy distribution is '...uniquely determined as the one which is maximally noncommittal with regard to missing information, in that it agrees with what is known, but expresses maximum uncertainty with respect to all other matters'. Thus, from a purely information-theoretical viewpoint, the optimal, or least biased, convex approximation schemes are solutions of the program:

$$\begin{aligned} \text{(ME)} \quad \text{maximize} \quad & H(p) = -\sum_{a=1}^{N} p_a \log p_a \\ \text{subject to} \quad & p_a \geq 0, \quad a = 1, \ldots, N \\ & \sum_{a=1}^{N} p_a = 1 \\ & \sum_{a=1}^{N} p_a x_a = x \end{aligned}$$

It is interesting to note that in the one-dimensional case this problem gives the max-ent solution of the classical problem of moments [19]. Since the information entropy function is strictly concave in its domain $\mathbb{R}^N_+$, the non-negative orthant, and the constraints are affine, (ME) defines a convex optimization problem. The existence and uniqueness of the solution of this program are established by the following proposition.

Proposition 3.1
The program (ME) has a solution iff $x \in \operatorname{conv} X$, in which case the solution is unique.

Proof
If $x \in \operatorname{conv} X$, then by Proposition 2.1 $P_x(X) \neq \emptyset$. In addition, $P_x(X)$ is a closed and bounded subset of $\mathbb{R}^N$ and, therefore, compact. Hence, by the Weierstrass extreme value theorem $-H$ attains its minimum in $P_x(X)$. Since $-H$ is strictly convex in $P_x(X)$ (the restriction of a
strictly convex function to a convex subset) the minimum is unique. Conversely, if $x \notin \operatorname{conv} X$, Proposition 2.1 shows that the feasible set is empty, so (ME) has no solution.

Since program (ME) depends parametrically on $x$, its unique solution $p^0(x)$ is also a function of $x$. We shall refer to the convex approximation scheme defined by $p^0(x)$ as the max-ent approximation scheme. The smoothness of $p^0(x)$ follows as a corollary to Proposition 4.2.

3.3. Examples

Given a point set $X$, the construction of a shape function $p^0_a(x)$ requires solving the problem (ME) for every point $x \in \operatorname{conv} X$. Examples of max-ent schemes in the plane are shown in Figure 2. Figure 2(a) shows a max-ent shape function for a point set consisting of the vertices of a convex pentagon. This example illustrates the Kronecker-delta property of the max-ent shape functions, and the property that the restriction of the max-ent shape functions to the edges of the pentagon is linear. Thus, max-ent approximation schemes provide a basis for constructing conforming elements in the shape of arbitrary convex polyhedra (cf. References [20,21] for recent alternative methods to construct generalized barycentric co-ordinates for polyhedra). In recent independent work, maximum entropy methods have been used to construct barycentric co-ordinates for convex polyhedra, thus defining $C^0$ approximants on polygonal tessellations [22].

The max-ent shape function of an interior node for a larger node set is shown in Figure 2(b).
As expected, the shape function vanishes at the boundary. The support of the shape function is highly non-local and extends to the entire convex hull of the node set. In addition, the value of the shape function at its corresponding node differs greatly from unity. Consequently, the max-ent approximation is far from interpolating in the interior, and results in a very poor fit to the data as illustrated in Figure 2(c).

This example serves to illustrate some of the limitations of global max-ent as a candidate approximation scheme for partial differential equations, namely, its non-local and non-interpolating character. An extension of the max-ent concept that provides control over the degree of locality of the shape functions is developed next.

Figure 2. Examples of max-ent approximation schemes in the plane: (a) shape function for the vertex of a pentagon; (b) shape function for an interior node, illustrating the global character of max-ent approximation schemes; and (c) max-ent approximation, or inference, of a function from scattered data, illustrating the non-interpolating character of max-ent approximation schemes.
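The non-local, non-interpolating behaviour described in Section 3.3 can be reproduced in one dimension with a short script. The sketch below rests on assumptions that are not established in this section: it uses the standard Lagrange-multiplier result that, at a point strictly inside the hull, the solution of (ME) has the exponential form $p_a \propto \exp(\lambda\,(x_a - x))$, and it finds the scalar multiplier $\lambda$ by bisection on the first-order consistency residual, which increases monotonically with $\lambda$. The function name `maxent_1d`, the node set and the choice of bisection are illustrative and are not claimed to be the authors' algorithm.

```python
import math


def maxent_1d(nodes, x):
    """Global max-ent shape functions at x for a 1D node set (illustrative sketch).

    Assumes min(nodes) < x < max(nodes); the multiplier is unbounded on the boundary.
    """
    if not (min(nodes) < x < max(nodes)):
        raise ValueError("x must lie strictly inside the convex hull of the nodes")

    def probs(lam):
        # p_a proportional to exp(lam * (x_a - x)), shifted to avoid overflow.
        exponents = [lam * (xa - x) for xa in nodes]
        shift = max(exponents)
        weights = [math.exp(e - shift) for e in exponents]
        total = sum(weights)
        return [w / total for w in weights]

    def residual(lam):
        # First-order consistency residual: sum_a p_a (x_a - x).
        return sum(p * (xa - x) for p, xa in zip(probs(lam), nodes))

    # Bracket the root, then bisect; residual is monotonically increasing in lam.
    lo, hi = -1.0, 1.0
    while residual(lo) > 0.0:
        lo *= 2.0
    while residual(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return probs(0.5 * (lo + hi))


# Hypothetical node set: five equally spaced nodes.
nodes = [0.0, 1.0, 2.0, 3.0, 4.0]
for x in (2.0, 1.3):
    p = maxent_1d(nodes, x)
    print(x, [round(q, 4) for q in p])
```

At the central node $x = 2$ the symmetry of the node set gives $\lambda = 0$ and uniform weights, so the shape function of that node takes the value $1/5$ at its own location rather than $1$, and every node contributes at every interior point. This is the one-dimensional analogue of the behaviour shown in Figure 2(b) and (c).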