Nondeterministic Discretization of Weights Improves Accuracy of Neural Networks

Marcin Wojnarski

Warsaw University, Faculty of Mathematics, Informatics and Mechanics
ul. Banacha 2, 02-097 Warszawa, Poland
mwojnars@ns.onet.pl

Abstract. The paper investigates a modification of the backpropagation algorithm, consisting of discretization of neural network weights after each training cycle. This modification, aimed at reducing overfitting, restricts the set of possible values of weights to a discrete subset of real numbers, leading to much better generalization abilities of the network. This, in turn, leads to higher accuracy and a decrease in error rate by over 50% in extreme cases (when overfitting is high). Discretization is performed nondeterministically, so as to keep the expected value of a discretized weight equal to its original value. In this way, the global behavior of the original algorithm is preserved. The presented method of discretization is general and may be applied to other machine-learning algorithms. It is also an example of how an algorithm for continuous optimization can be successfully applied to optimization over discrete spaces. The method was evaluated experimentally in the WEKA environment using two real-world data sets from the UCI repository.

Keywords: Generalization, Overtraining, Overfitting, Regularization.

1 Introduction

Multi-layer artificial neural networks [1,2] are well-established tools in machine learning, with proven effectiveness in many real-world problems. However, there are still many tasks in which they perform worse than other machine-learning systems [3]. One of the reasons is that neural networks usually contain thousands of real-valued adaptive parameters, and so they have a strong tendency to get overtrained (overfitted), especially when the size of the training set is not large enough. Thus, methods to improve the generalization abilities of neural networks are necessary.

Several such methods have already been proposed: early stopping [2], the simplest and most commonly used, which consists of finishing the training process when the error on a validation set starts increasing; regularization [4,5,6], based on adding a regularization term to the error function; pruning [4,7], i.e. removing unnecessary weights or neurons during or after training; training with noise [5,8,9], i.e. disturbing training instances in a random way; and weight sharing [10].

This paper introduces a novel method based on discretization of weights. The method is easy to implement and potentially more versatile than existing ones, yet it can lead to a significant improvement in the generalization abilities and accuracy of the neural network. It can also be used in conjunction with the above-mentioned algorithms.

The motivation which underlies the presented method is described in Sect. 2. It is followed by a presentation of the algorithm in Sect. 3 and its experimental assessment in Sect. 4. Finally, Sect. 5 recaps the main points of the paper and presents conclusions.

2 Motivation

Discretization of weights means restricting the set of possible values of neuron weights to a small discrete subset of real numbers. In this way, the decision model represented by a neural network gets simpler and can be described using a smaller number of bits, since every weight may be represented, for instance, by a single byte instead of four or eight bytes. This in turn leads to better generalization abilities.

Theoretical justification of the method is provided by Rissanen's Minimum Description Length (MDL) Principle [11,12], which states that the best way to capture regularities in data and avoid overfitting is to choose a model that has a short description. Thus, a neural network which uses only one byte to represent a weight value is better than a network requiring a 4-byte-long description of every weight, even if the latter has slightly higher accuracy on training data than the former.

A more intuitive justification might be given by considering a system whose accuracy (measured during training on a training set) abruptly decreases when some weight is slightly disturbed, e.g. by 0.01. High accuracy of such a system is probably accidental and will not recur on test data. A system which is insensitive to such small perturbations of weight values would be much more trustworthy.
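To put rough numbers on the description-length argument, the snippet below compares the bits needed to index a weight on an evenly-spaced grid with a standard 32-bit float. The weight range is an assumption made purely for illustration; only the grid spacing γ = 0.1 comes from the paper (Sect. 4).

    import math

    # Illustration of the MDL argument. The weight range below is an
    # assumed value for this example, not a value from the paper;
    # gamma = 0.1 matches the granularity used in the experiments.
    w_min, w_max = -2.0, 2.0
    gamma = 0.1

    n_values = int(round((w_max - w_min) / gamma)) + 1  # 41 permitted values
    bits_discrete = math.log2(n_values)                 # about 5.4 bits per weight
    bits_float = 32                                     # single-precision float

    print(f"{n_values} grid points: {bits_discrete:.1f} bits "
          f"vs {bits_float} bits per weight")

Under these assumptions a discretized weight needs roughly six times fewer bits to describe than a single-precision float, which is the sense in which the discretized network has a shorter description.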
3 The Algorithm

Let us denote by Ω, Ω ⊂ R, the set of permitted values of weights. Assume that Ω is discrete (and usually 0 ∈ Ω). There is a training algorithm A given, e.g. backpropagation, which searches through a family F of models to find the one that achieves (approximately) minimum error rate on training data. Let us denote by FΩ the family of models from F whose weights belong to Ω (so FΩ ⊂ F). The goal is to create a discretized variant of algorithm A, which finds a model in FΩ that minimizes the error rate over FΩ.

The easiest way to do this is to discretize the weights of the model found by algorithm A, by simply rounding them to the closest values from Ω. This procedure is very simple, but it does not provide any control over the accuracy of the discretized model, so it cannot be beneficial for the accuracy of the final model. A better method is to interlace discretization with the training process A by rounding weights every time they are updated. In this case, the process of searching for the best model is restricted to FΩ from the beginning, so it is directed by the accuracy of discretized models and thus is able to find a better model than the previous method.

However, there is still another problem if the process of searching guided by algorithm A moves slowly through the space F, which happens for example in the backpropagation algorithm when the learning rate [1] is small. In this case, simple discretization, meant as deterministic rounding of every weight to the nearest value in Ω, may turn the values of all updated weights back to the values from before the update. Consequently, the searching process guided by the discrete variant of A may easily get stuck in some point of FΩ which is neither a global nor even a local minimum of the error function.

This can be avoided by performing discretization in a nondeterministic way. Let v denote the weight value to be discretized; LΩ(v) is the greatest value in Ω not greater than v; GΩ(v) is the least value in Ω not less than v. Value v is discretized by replacing it nondeterministically with either LΩ(v) or GΩ(v), according to the formula:

    DΩ(v) = LΩ(v)  with probability (GΩ(v) − v) / RΩ(v)
    DΩ(v) = GΩ(v)  with probability (v − LΩ(v)) / RΩ(v) ,        (1)

where DΩ(v) denotes the discretized value of v and RΩ(v) = GΩ(v) − LΩ(v).

The above choice of probabilities makes the following important property hold:

    E(DΩ(v)) = v ,        (2)

i.e. the expected value of a discretized weight is equal to the original value. In this way, discretization may be viewed as adding some zero-mean random fluctuations to weight values, without disturbing the global behavior of the original algorithm. Note, however, that discretization is not equivalent to adding random fluctuations to weight values.

The general structure of the training algorithm which performs discretization of weights is presented in Figure 1.

    for cycle := 1 to number of training cycles do
        pattern := GetNextPattern();
        CalculateResponse(network, pattern);
        UpdateWeights(network);   /* standard algorithm, e.g. backpropagation */
        for each weight in network do
            w := ValueOf(weight);
            d := DΩ(w);           /* nondeterministic discretization of w, Eq. (1) */
            ValueOf(weight) := d;
        end
    end

Fig. 1. Outline of the neural network training algorithm with discretization of weights

Some attention should be paid to the question of what set of permitted values Ω to use. A simple yet efficient choice is to take a set of evenly-spaced numbers containing zero:

    Ω = {kγ : k ∈ Z} ,        (3)

where γ ∈ R is a parameter that controls the granularity of discretization.
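As a concrete rendering of Eq. (1) and the loop of Figure 1, here is a minimal NumPy sketch. It assumes the evenly-spaced grid of Eq. (3); the names discretize, train and update_fn are chosen here for illustration, and update_fn stands in for one standard backpropagation step, since the paper's actual implementation lives inside WEKA.

    import numpy as np

    rng = np.random.default_rng(0)

    def discretize(w, gamma=0.1):
        """Nondeterministic discretization of Eq. (1) on the grid of Eq. (3).

        Each weight is replaced by L(w) = gamma*floor(w/gamma) or by
        G(w) = L(w) + gamma, with probabilities chosen so that the
        expected value of the result equals w, as required by Eq. (2).
        """
        w = np.asarray(w, dtype=float)
        lower = gamma * np.floor(w / gamma)
        p_up = (w - lower) / gamma   # P(round up); 0 for weights already on the grid
        return lower + gamma * (rng.random(w.shape) < p_up)

    def train(weights, patterns, update_fn, gamma=0.1, cycles=1000):
        """Training loop of Figure 1: a standard update, then
        discretization of every weight, once per presented pattern."""
        for cycle in range(cycles):
            pattern = patterns[cycle % len(patterns)]
            weights = update_fn(weights, pattern)  # e.g. one backpropagation step
            weights = discretize(weights, gamma)   # Eq. (1) applied to all weights
        return weights

Property (2) is easy to check empirically: np.mean(discretize(np.full(10**6, 0.137))) stays close to 0.137, whereas deterministic rounding to the nearest grid point would return 0.1 for every entry, which is exactly the systematic bias the nondeterministic scheme avoids.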
4 Experimental Results

The presented modification was applied to the standard backpropagation algorithm [2,1] used for training multilayer neural networks. The modified algorithm was compared with the standard one on two real-world datasets from the UCI [13] machine-learning repository: Labor and Image Segmentation. Experiments were conducted in the WEKA [14] environment, whose implementation of the backpropagation algorithm was extended by the author to handle discretization of weights.

To enable thorough analysis and reliable comparison of the algorithms, different numbers of hidden neurons were tested: 5, 10, 20, 30, 50, 70, 100, 150, 200 and 250. In this way, it was possible to draw final conclusions that were independent of the specific choice of training parameters.

To obtain plausible results, 20 networks were trained for each algorithm and size of hidden layer, using a different (random) split of data into training and test sets each time (percentage split: 50+50% for the Labor data; 25+75% of training and test instances respectively for the Image Segmentation data). Thus, 20 estimates of the error rate on the test set were obtained for each algorithm and size of hidden layer. The mean and standard deviation of these 20 values formed the basis of the subsequent analysis.

Throughout all experiments, the learning rate [1] of the neural networks (which controls the magnitude of updates) was set to 0.1 and each network underwent 20 epochs of training. In the discrete variant of the algorithm, all weights of the networks, in both the hidden and output layers, were discretized in the same way, with granularity γ = 0.1.

4.1 Labor Data

The Labor data set¹ [13,15,16] contains information on final settlements in labor negotiations in Canadian industry. It is composed of 57 instances described by 16 attributes, mixed symbolic and numeric. In the experiment, symbolic attributes were turned into binary ones, which resulted in a data set described by 26 numeric or binary attributes. Then, all attributes were normalized. There were two classes, with 37 and 20 instances respectively. The neural networks created during the experiment consisted of two layers of sigmoidal neurons: a hidden one, of varying size; and an output one, containing two neurons, one for each class.

Results of the experiments are listed in Table 1 and presented graphically in Figure 2.

Table 1. Error rates [%] and their standard deviations for neural networks trained with either the standard or the discrete backpropagation algorithm, evaluated on Labor data.

    No. of hidden neurons | Discrete backpropagation | Standard backpropagation
                        5 | 22.55 ± 9.83             | 35.10 ± 0.63
                       10 | 15.25 ± 7.93             | 35.10 ± 0.63
                       20 | 10.17 ± 5.42             | 34.06 ± 2.28
                       30 |  9.49 ± 4.99             | 28.08 ± 5.61
                       50 |  8.25 ± 5.37             | 26.00 ± 9.45
                       70 |  7.54 ± 4.98             | 22.62 ± 9.30
                      100 |  7.37 ± 3.94             | 18.26 ± 8.48
                      150 |  7.20 ± 3.33             | 21.39 ± 9.29
                      200 |  7.72 ± 4.37             | 21.25 ± 10.85
                      250 |  7.91 ± 4.11             | 21.96 ± 11.67

Fig. 2. Error rates [%] and their standard deviations (vertical bars) for neural networks trained with either the standard (squares) or the discrete (circles) backpropagation algorithm, evaluated on Labor data. Networks with different numbers of hidden neurons (horizontal axis) were checked.

The results show that discretization substantially improves the accuracy of neural networks trained on the Labor data. The lowest error rate obtained with the discretized algorithm (7.20%) is 60% lower than with the standard algorithm (18.26%), which is a huge difference. Moreover, the standard deviation of the error rate among networks with the same size of hidden layer is also significantly lower when discretization is used. These large differences indicate that standard backpropagation highly overtrains on the Labor data and discretization of weights is an efficient way to reduce this overtraining.

¹ ftp:///pub/machine-learning-databases/labor-negotiations
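The means and standard deviations in Tables 1 and 2 follow the repeated random-split protocol described at the start of this section. A schematic sketch of that protocol follows; train_and_test is a hypothetical placeholder for one WEKA training-plus-evaluation run, not an actual WEKA API.

    import numpy as np

    def evaluate(data, labels, train_and_test, n_runs=20, train_frac=0.5, seed=0):
        """Repeat the paper's protocol: each run draws a fresh random
        train/test split and records the test error of one trained network."""
        rng = np.random.default_rng(seed)
        errors = []
        for _ in range(n_runs):
            idx = rng.permutation(len(data))
            n_train = int(train_frac * len(data))  # 0.5 for Labor, 0.25 for Image Segmentation
            tr, te = idx[:n_train], idx[n_train:]
            errors.append(train_and_test(data[tr], labels[tr],
                                         data[te], labels[te]))
        errors = np.asarray(errors)
        return errors.mean(), errors.std()         # the statistics reported in the tables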
4.2 Image Segmentation Data

The Image Segmentation data set² [13,3] contains 2310 instances, described by 19 numeric attributes and uniformly distributed over 7 classes. The attributes were normalized before training. The neural networks created during the experiment consisted of two layers of sigmoidal neurons: a hidden one, of varying size; and an output one, containing 7 neurons, one for each class.

Results of the experiments are listed in Table 2 and presented graphically in Figure 3.

Table 2. Error rates [%] and their standard deviations for neural networks trained with either the standard or the discrete backpropagation algorithm, evaluated on Image Segmentation data.

    No. of hidden neurons | Discrete backpropagation | Standard backpropagation
                        5 | 24.41 ± 3.84             | 37.72 ± 5.94
                       10 | 15.18 ± 3.02             | 27.88 ± 3.94
                       20 | 12.12 ± 1.43             | 19.94 ± 2.26
                       30 | 11.15 ± 1.12             | 19.61 ± 2.10
                       50 | 10.91 ± 1.25             | 17.45 ± 1.50
                       70 | 10.34 ± 0.97             | 16.59 ± 1.21
                      100 | 10.24 ± 0.99             | 16.24 ± 1.29
                      150 | 10.87 ± 1.40             | 16.24 ± 1.68
                      200 | 10.77 ± 1.75             | 16.76 ± 2.11
                      250 | 10.63 ± 1.90             | 17.04 ± 2.41

Fig. 3. Error rates [%] and their standard deviations (vertical bars) for neural networks trained with either the standard (squares) or the discrete (circles) backpropagation algorithm, evaluated on Image Segmentation data. Networks with different numbers of hidden neurons (horizontal axis) were checked.

As in the case of the Labor data, the performance of neural networks on the Image Segmentation data can also be improved by discretization of weights. The best result achieved with discretization (10.24% error rate) is 37% better than the best result of the standard algorithm (16.24%). The improvement is smaller than for the Labor data, probably due to the significantly bigger size of the training set and thus a smaller degree of overfitting.

5 Conclusions

A novel method that reduces overfitting of neural networks has been presented. The method is based on nondeterministic discretization of weights after every training cycle, which restricts the set of possible weight values to a discrete subset of real numbers and enables a much shorter description of the network. This, in turn, improves the generalization abilities and performance of the system, according to the Minimum Description Length principle. Thanks to nondeterminism, there is no risk that discretization would force the training process to stop in some far-from-optimal point of the parameter space. The method was evaluated on two real-world data sets from the UCI repository, exhibiting high effectiveness in preventing neural networks from overfitting: the use of discretization enabled a decrease in error rate by 60% and 37% respectively. Importantly, it was better to use discretization than to decrease the number of hidden neurons, so discretization appeared to be more effective than the most straightforward method of avoiding overtraining.

It should be emphasized that the presented method is potentially very versatile. Although the paper covers only neural networks and the backpropagation algorithm, discretization of the parameters of a model could be applied to many other algorithms and systems, as different as evolutionary algorithms, Bayesian networks or Gaussian mixture models, to name just a few.

There are also two more general conclusions which follow from the presented study. They may seem strange at first sight but have deep consequences. Firstly, it appears that methods of continuous optimization, like gradient descent, which lies at the basis of the backpropagation algorithm, can be successfully applied to optimization over discontinuous (e.g. discrete) spaces as well. Secondly, all decision systems built from data and described by real-valued parameters might probably benefit from
some kind of restriction imposed on possible parameter values. These conclusions definitely need more investigation.

² ftp:///pub/machine-learning-databases/image

Acknowledgement

The author thanks anonymous reviewers for their helpful remarks.

References

1. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs (1995)
2. Ripley, B.D.: Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge (1996)
3. Michie, D., Spiegelhalter, D.J., Taylor, C.C.: Machine Learning, Neural and Statistical Classification. Ellis Horwood, London (1994)
4. Rychetsky, M., Ortmann, S., Glesner, M.: Pruning and regularization techniques for feed forward nets applied on a real world data base. In: Heiss, M. (ed.) International Symposium on Neural Computation, pp. 603–609 (1998)
5. Bishop, C.M.: Training with noise is equivalent to Tikhonov regularization. Neural Computation 7(1), 108–116 (1995)
6. Burger, M., Neubauer, A.: Analysis of Tikhonov regularization for function approximation by neural networks. Neural Networks 16(1), 79–90 (2003)
7. Wojnarski, M.: LTF-C: Architecture, training algorithm and applications of new neural classifier. Fundamenta Informaticae 54(1), 89–105 (2003)
8. Sietsma, J., Dow, R.J.F.: Creating artificial neural networks that generalize. Neural Networks 4(1), 67–79 (1991)
9. Holmström, L., Koistinen, P.: Using additive noise in back-propagation training. IEEE Transactions on Neural Networks 3(1), 24–38 (1992)
10. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
11. Rissanen, J.: Modeling by shortest data description. Automatica 14, 465–471 (1978)
12. Grünwald, P., Myung, I.J., Pitt, M.: Advances in Minimum Description Length. MIT Press, Cambridge (2005)
13. Newman, D.J., Hettich, S., Merz, C.B.: UCI repository of machine learning databases (1998)
14. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
15. Bergadano, F., Matwin, S., Michalski, R.S., Zhang, J.: Measuring quality of concept descriptions. In: EWSL, pp. 1–14 (1988)
16. Bergadano, F., Matwin, S., Michalski, R.S., Zhang, J.: Representing and acquiring imprecise and context-dependent concepts in knowledge-based systems. In: ISMIS, pp. 270–280 (1988)
