Research and Applications of Neural Networks (English-Language Literature)

Introduction to Neural Networks: Abstract (in English)

Abstract: An artificial neural network (ANN) is a mathematical model that processes information in a distributed, parallel fashion, using a structure similar to the synaptic connections of the brain. As a computing model, it consists of many neurons and the connections between them. Each neuron represents a specific output function, called the activation function, and each connection between two neurons carries a weight applied to the signal passing through it. The neuron is the basic and essential building block of an artificial neural network; it comprises a weighted summation, a single-input single-output (SISO) system, and a nonlinear function mapping. A unit can represent different things, such as features, letters, concepts, or other meaningful abstract patterns. The units in a network fall into three categories: input units, output units, and hidden units. Input units receive signals and data from the outside world; output units deliver the system's processing results; hidden units lie between the input and output units and cannot be observed from outside the system. The weight of a connection represents the strength of the connection between neurons. Artificial neural networks adopt mechanisms completely different from those of traditional artificial intelligence and information-processing techniques. They overcome the weaknesses of traditional artificial intelligence in intuitive handling and in processing unstructured information. They are adaptive, self-organizing, and able to learn in real time, and they are widely used in pattern recognition and signal processing.

Paper on the Applications of Neural Networks

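Principles and Applications of Neural Networks

Abstract: Drawing on a review of the relevant literature, this paper summarizes the basic principles and applications of neural networks.

It first introduces the discrete network among Hopfield neural networks and describes the steps for applying it to traffic sign recognition.

As neural networks have developed, their limitations have become increasingly apparent.

To address this, researchers have proposed neural networks combined with other methods.

This paper presents the principle of a BP neural network optimized by a genetic algorithm and its application to identifying the permeability coefficient of dam-foundation rock masses; the principle of the fuzzy neural network and its application to predicting foundation settlement; and finally the principle of the wavelet neural network and its application to electric load forecasting.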

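Keywords: neural network; Hopfield; genetic algorithm; fuzzy neural network; wavelet neural network

Introduction

The Hopfield network and its learning algorithm were first proposed in 1982 by the American physicist J. J. Hopfield, and they opened a new line of research in the development of artificial neural networks.

Using structural features and learning methods different from those of layered neural networks, the Hopfield network simulates the memory mechanism of biological neural networks and has achieved satisfactory results.

The network Hopfield first proposed was a binary neural network whose neurons output only 1 or 0, so it is also called the discrete Hopfield neural network (DHNN).

In the discrete Hopfield network the neurons are binary, so the discrete output values 1 and 0 indicate that a neuron is in the activated or the inhibited state, respectively.

The Hopfield neural network is a type of recurrent neural network and is widely applied in areas such as function optimization and associative memory.

Its operating mechanism differs fundamentally from that of feedforward neural networks, and its dynamics are more complex.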
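As an illustration of the associative memory behaviour described above, here is a minimal sketch of a discrete Hopfield network. The text does not spell out the learning rule, so the Hebbian (outer-product) storage, the asynchronous threshold update, the function names `train_hopfield` and `recall`, and the two example patterns are all assumptions made for illustration. States follow the 1/0 convention of the text and are mapped to +1/-1 internally.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from 0/1 patterns (Hebbian rule, zero diagonal)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        bipolar = [2 * v - 1 for v in pattern]  # map 0/1 to -1/+1
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += bipolar[i] * bipolar[j]
    return weights


def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            bipolar = [2 * v - 1 for v in state]
            field = sum(weights[i][j] * bipolar[j] for j in range(n))
            new_value = 1 if field >= 0 else 0  # 1 = activated, 0 = inhibited
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two hypothetical stored patterns (mutually orthogonal in -1/+1 form).
patterns = [
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
]
weights = train_hopfield(patterns)

noisy = list(patterns[0])
noisy[0] = 1 - noisy[0]  # corrupt one bit of the first pattern
restored = recall(weights, noisy)
print(restored)
```

Starting from the corrupted input, the state settles back onto the stored pattern, which is the sense in which the network acts as a memory. The one-neuron-at-a-time update is a design choice; synchronous updates are also possible but can oscillate.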

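Because of their highly complex nonlinear structure, neural networks have a large number of local extrema. Training with the traditional gradient descent method may therefore converge to a local extremum, which degrades the network's performance or even renders it unusable.

Modern nonlinear optimization methods have since come to the fore. The well-known genetic algorithm in particular has strong global search capability, and the effectiveness of its convergence has been tested extensively in theory and in practice.

Genetic neural networks are therefore an effective way to address global convergence in highly complex problems.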
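The contrast drawn above can be sketched on a toy problem. The example below is not a neural network loss: it uses a one-dimensional function with many local minima (an assumption chosen for illustration), on which plain gradient descent started at x = 3.2 stays in a nearby local minimum, while a simple genetic search samples the whole interval. The function names, population size, mutation scale, and seed are all hypothetical choices.

```python
import math
import random


def f(x):
    """Toy multimodal objective: global minimum at x = 0, local minima near each integer."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))


def df(x):
    """Derivative of f, used by gradient descent."""
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)


def gradient_descent(x, lr=0.001, steps=2000):
    for _ in range(steps):
        x -= lr * df(x)
    return x


def genetic_search(rng, pop_size=40, generations=60, lo=-5.0, hi=5.0):
    """Minimal genetic algorithm: elitist selection, averaging crossover, Gaussian mutation."""
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    history = []  # best objective value seen at each generation
    for _ in range(generations):
        population.sort(key=f)
        history.append(f(population[0]))
        elite = population[: pop_size // 4]  # best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.3)  # crossover plus mutation
            children.append(min(hi, max(lo, child)))
        population = elite + children
    best = min(population, key=f)
    history.append(f(best))
    return best, history


x_gd = gradient_descent(3.2)
best, history = genetic_search(random.Random(0))
print("gradient descent:", x_gd, f(x_gd))
print("genetic search:  ", best, f(best))
```

Because the elite is carried over unchanged, the best value never gets worse from one generation to the next. How close the genetic search gets to the global minimum depends on the seed and settings, so the printed values should be read as one run rather than a guarantee.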

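There is a sharp conflict between the complexity of a system and the precision demanded of it. Intelligent systems such as fuzzy logic, neural networks, and expert control offer effective ways to ease this conflict, but each has various problems when used alone. Researchers have therefore proposed combining them according to their respective strengths and weaknesses, as in the fuzzy neural network discussed in this paper.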

Convolutional Neural Networks and Machine Learning: Translated Foreign Literature (Chinese and English), 2020

English

Prediction of composite microstructure stress-strain curves using convolutional neural networks

Charles Yang, Youngsoo Kim, Seunghwa Ryu, Grace Gu

Abstract

Stress-strain curves are an important representation of a material's mechanical properties, from which important properties such as elastic modulus, strength, and toughness are defined. However, generating stress-strain curves from numerical methods such as the finite element method (FEM) is computationally intensive, especially when considering the entire failure path for a material. As a result, it is difficult to perform high-throughput computational design of materials with large design spaces, especially when considering mechanical responses beyond the elastic limit. In this work, a combination of principal component analysis (PCA) and convolutional neural networks (CNN) is used to predict the entire stress-strain behavior of binary composites evaluated over the entire failure path, motivated by the significantly faster inference speed of empirical models. We show that PCA transforms the stress-strain curves into an effective latent space by visualizing the eigenbasis of PCA. Despite having a dataset of only 10-27% of possible microstructure configurations, the mean absolute error of the prediction is <10% of the range of values in the dataset, when measuring model performance based on derived material descriptors such as modulus, strength, and toughness. Our study demonstrates the potential to use machine learning to accelerate material design, characterization, and optimization.

Keywords: Machine learning, Convolutional neural networks, Mechanical properties, Microstructure, Computational mechanics

Introduction

Understanding the relationship between structure and property for materials is a seminal problem in material science, with significant applications for designing next-generation materials.
A primary motivating example is designing composite microstructures for load-bearing applications, as composites offer advantageously high specific strength and specific toughness. Recent advancements in additive manufacturing have facilitated the fabrication of complex composite structures, and as a result, a variety of complex designs have been fabricated and tested via 3D-printing methods. While more advanced manufacturing techniques are opening up unprecedented opportunities for advanced materials and novel functionalities, identifying microstructures with desirable properties is a difficult optimization problem.

One method of identifying optimal composite designs is by constructing analytical theories. For conventional particulate/fiber-reinforced composites, a variety of homogenization theories have been developed to predict the mechanical properties of composites as a function of volume fraction, aspect ratio, and orientation distribution of reinforcements. Because many natural composites, synthesized via self-assembly processes, have relatively periodic and regular structures, their mechanical properties can be predicted if the load transfer mechanism of a representative unit cell and the role of the self-similar hierarchical structure are understood. However, the applicability of analytical theories is limited in quantitatively predicting composite properties beyond the elastic limit in the presence of defects, because such theories rely on the concept of a representative volume element (RVE), a statistical representation of material properties, whereas the strength and failure are determined by the weakest defect in the entire sample domain.
Numerical modeling based on finite element methods (FEM) can complement analytical methods for predicting inelastic properties such as strength and toughness modulus (referred to as toughness, hereafter), which can only be obtained from full stress-strain curves.

However, numerical schemes capable of modeling the initiation and propagation of curvilinear cracks, such as the crack phase field model, are computationally expensive and time-consuming because a very fine mesh is required to accommodate the highly concentrated stress field near the crack tip and the rapid variation of the damage parameter near the diffusive crack surface. Meanwhile, analytical models require significant human effort and domain expertise and fail to generalize to similar domain problems.

In order to identify high-performing composites in the midst of large design spaces within realistic time-frames, we need models that can rapidly describe the mechanical properties of complex systems and be generalized easily to analogous systems. Machine learning offers the benefit of extremely fast inference times and requires only training data to learn relationships between inputs and outputs, e.g., composite microstructures and their mechanical properties. Machine learning has already been applied to speed up the optimization of several different physical systems, including graphene kirigami cuts, fine-tuning spin qubit parameters, and probe microscopy tuning. Such models do not require significant human intervention or knowledge, learn relationships efficiently relative to the input design space, and can be generalized to different systems.

In this paper, we utilize a combination of principal component analysis (PCA) and convolutional neural networks (CNN) to predict the entire stress-strain curve of composite failures beyond the elastic limit. Stress-strain curves are chosen as the model's target because they are difficult to predict given their high dimensionality.
In addition, stress-strain curves are used to derive important material descriptors such as modulus, strength, and toughness. In this sense, predicting stress-strain curves gives a more general description of composite properties than any combination of scalar material descriptors. A dataset of 100,000 different composite microstructures and their corresponding stress-strain curves is used to train and evaluate model performance. Due to the high dimensionality of the stress-strain dataset, several dimensionality reduction methods are used, including PCA, featuring a blend of domain understanding and traditional machine learning, to simplify the problem without loss of generality for the model.

We will first describe our modeling methodology and the parameters of the finite-element method (FEM) used to generate data. Visualizations of the learned PCA latent space are then presented, along with model performance results.

CNN implementation and training

A convolutional neural network was trained to predict this lower-dimensional representation of the stress vector. The input to the CNN was a binary matrix representing the composite design, with 0's corresponding to soft blocks and 1's corresponding to stiff blocks. PCA was implemented with the open-source Python package scikit-learn, using the default hyperparameters. The CNN was implemented using Keras with a TensorFlow backend. The batch size for all experiments was set to 16 and the number of epochs to 30; the Adam optimizer was used to update the CNN weights during backpropagation.

A train/test split ratio of 95:5 is used; we justify using a smaller test fraction than the standard 80:20 because of the relatively large dataset. With a ratio of 95:5 and a dataset with 100,000 instances, the test set still has enough data points, roughly several thousand, for its results to generalize.
Each column of the target PCA representation was normalized to have a mean of 0 and a standard deviation of 1 to prevent unstable training.

Finite element method data generation

FEM was used to generate training data for the CNN model. Although obtaining the training data is initially compute-intensive, it takes much less time to train the CNN model and even less time to make high-throughput inferences over thousands of new, randomly generated composites. The crack phase field solver was based on the hybrid formulation for the quasi-static fracture of elastic solids and implemented in the commercial FEM software ABAQUS with a user-element subroutine (UEL).

Visualizing PCA

In order to better understand the role PCA plays in effectively capturing the information contained in stress-strain curves, the principal component representation of stress-strain curves is plotted in 3 dimensions. Specifically, we take the first three principal components, which have a cumulative explained variance of ~85%, plot stress-strain curves in that basis, and provide several different angles from which to view the 3D plot. Each point represents a stress-strain curve in the PCA latent space and is colored based on the associated modulus value. It appears that PCA is able to spread out the curves in the latent space based on modulus values, which suggests that this is a useful latent space for the CNN to make predictions in.

CNN model design and performance

Our CNN was a fully convolutional neural network, i.e., the only dense layer was the output layer. All convolution layers used 16 filters with a stride of 1, with a LeakyReLU activation followed by BatchNormalization. The first 3 conv blocks did not have 2D MaxPooling; they were followed by 9 conv blocks which did have a 2D MaxPooling layer, placed after the BatchNormalization layer.
A GlobalAveragePooling layer was used to reduce the dimensionality of the output tensor from the sequential convolution blocks, and the final output layer was a Dense layer with 15 nodes, where each node corresponded to a principal component. In total, our model had 26,319 trainable weights.

Our architecture was motivated by the recent development of, and convergence onto, fully convolutional architectures for traditional computer vision applications, where convolutions are empirically observed to be more efficient and stable for learning than dense layers. In addition, in our previous work, we had shown that CNNs were a capable architecture for learning to predict mechanical properties of 2D composites [30]. The convolution operation is an intuitively good fit for predicting crack propagation because it is a local operation, allowing it to implicitly featurize and learn the local spatial effects of crack propagation.

After applying the PCA transformation to reduce the dimensionality of the target variable, the CNN is used to predict the PCA representation of the stress-strain curve of a given binary composite design. After training the CNN on a training set, its ability to generalize to composite designs it has not seen is evaluated by comparing its predictions on an unseen test set. However, a natural question that emerges is how to evaluate a model's performance at predicting stress-strain curves in a real-world engineering context. While simple scalar metrics such as mean squared error (MSE) and mean absolute error (MAE) generalize easily to vector targets, it is not clear how to interpret these aggregate summaries of performance. It is difficult to use such metrics to ask questions such as “Is this model good enough to use in the real world” and “On average, how poorly will a given prediction be incorrect relative to some given specification”.
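The stated total of 26,319 trainable weights can be cross-checked with a little arithmetic. The text gives 12 convolution blocks (3 without pooling plus 9 with pooling), 16 filters per layer, BatchNormalization after each convolution, and a 15-node Dense output after global average pooling. It does not state the kernel size or the number of input channels; the sketch below assumes 3×3 kernels and a single-channel binary input, and the helper names are hypothetical.

```python
def conv2d_params(in_channels, out_channels, kernel):
    """Weights plus one bias per filter for a square 2D convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


def batchnorm_trainable_params(channels):
    """Trainable scale (gamma) and shift (beta) per channel."""
    return 2 * channels


def dense_params(in_features, out_features):
    return in_features * out_features + out_features


FILTERS = 16          # from the text
N_CONV_BLOCKS = 3 + 9  # from the text
N_OUTPUTS = 15        # from the text: one node per principal component
KERNEL = 3            # assumption: not stated in the text
INPUT_CHANNELS = 1    # assumption: binary design matrix as one channel

total = 0
channels = INPUT_CHANNELS
for _ in range(N_CONV_BLOCKS):
    total += conv2d_params(channels, FILTERS, KERNEL)
    total += batchnorm_trainable_params(FILTERS)
    channels = FILTERS
# LeakyReLU, MaxPooling, and GlobalAveragePooling add no trainable weights.
total += dense_params(FILTERS, N_OUTPUTS)

print(total)
```

Under these two assumptions the count comes out to the figure quoted in the text, which suggests (but does not prove) that the original model used 3×3 kernels on a single-channel input.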
Although being able to predict stress-strain curves is an important application of FEM and a highly desirable property for any machine learning model to learn, it does not easily lend itself to interpretation. Specifically, there is no simple quantitative way to define whether two stress-strain curves are “close” or “similar” with real-world units. Given that stress-strain curves are oftentimes intermediary representations of a composite property that are used to derive more meaningful descriptors such as modulus, strength, and toughness, we decided to evaluate the model in an analogous fashion. The CNN prediction in the PCA latent space representation is transformed back to a stress-strain curve using PCA, and used to derive the predicted modulus, strength, and toughness of the composite. The predicted material descriptors are then compared with the actual material descriptors. In this way, MSE and MAE now have clearly interpretable units and meanings. The average performance of the model, measured as the error between the actual and predicted material descriptor values derived from stress-strain curves, is presented in Table. The MAE for material descriptors provides an easily interpretable metric of model performance and can easily be used in any design specification to provide confidence estimates of a model prediction. When comparing the MAE to the range of values taken on by the distribution of material descriptors, we can see that the MAE is relatively small compared to the range. The MAE compared to the range is <10% for all material descriptors.
Relatively tight confidence intervals on the error indicate that this model architecture is stable, that the model performance is not heavily dependent on initialization, and that our results are robust to different train-test splits of the data.

Future work

Future work includes combining empirical models with optimization algorithms, such as gradient-based methods, to identify composite designs that yield complementary mechanical properties. The ability of a trained empirical model to make high-throughput predictions over designs it has never seen before allows for large parameter space optimization that would be computationally infeasible for FEM. In addition, we plan to explore different visualizations of empirical models in an effort to “open up the black-box” of such models. Applying machine learning to finite-element methods is a rapidly growing field with the potential to discover novel next-generation materials tailored for a variety of applications. We also note that the proposed method can be readily applied to predict other physical properties represented in a similar vectorized format, such as electron/phonon density of states and sound/light absorption spectra.

Conclusion

In conclusion, we applied PCA and CNN to rapidly and accurately predict the stress-strain curves of composites beyond the elastic limit. In doing so, several novel methodological approaches were developed, including using the material descriptors derived from the stress-strain curves as interpretable metrics for model performance and applying dimensionality reduction techniques to stress-strain curves. This method has the potential to enable composite design with respect to mechanical response beyond the elastic limit, which was previously computationally infeasible, and can generalize easily to related problems outside of microstructural design for enhancing mechanical properties.
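The evaluation idea used in the paper, comparing descriptors derived from curves rather than the curves themselves, can be sketched as follows. The paper does not give the exact formulas, so this sketch assumes the modulus is the slope of the first segment, the strength is the peak stress, and the toughness is the area under the curve (trapezoidal rule). The function names and all numbers are hypothetical illustration data, not results from the paper.

```python
def material_descriptors(strain, stress):
    """Derive (modulus, strength, toughness) from a sampled stress-strain curve."""
    modulus = (stress[1] - stress[0]) / (strain[1] - strain[0])  # initial slope
    strength = max(stress)                                       # peak stress
    toughness = sum(                                             # area under the curve
        0.5 * (stress[i] + stress[i + 1]) * (strain[i + 1] - strain[i])
        for i in range(len(strain) - 1)
    )
    return modulus, strength, toughness


def mae_relative_to_range(actual, predicted):
    """Mean absolute error divided by the range of the actual values."""
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / (max(actual) - min(actual))


# Hypothetical curve: loads up, peaks, then fails.
strain = [0.0, 0.01, 0.02, 0.03, 0.04]
stress = [0.0, 2.0, 3.0, 1.0, 0.0]
modulus, strength, toughness = material_descriptors(strain, stress)
print(modulus, strength, toughness)

# Hypothetical actual vs. predicted modulus values for three designs.
actual_modulus = [100.0, 150.0, 200.0]
predicted_modulus = [110.0, 145.0, 190.0]
print(mae_relative_to_range(actual_modulus, predicted_modulus))
```

Expressing the error as a fraction of the descriptor's range is what makes a statement such as "MAE below 10% of the range" directly comparable across modulus, strength, and toughness, which have different units.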

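(Complete version) Application Status of Artificial Neural Networks in Cognitive Science Research: Graduation Project Proposal, Foreign-Literature Translation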

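Undergraduate Graduation Project (Thesis): Foreign-Literature Translation (with original text)
College: College of Mechanical and Control Engineering
Project title: Report on the Application Status of Artificial Neural Networks in Cognitive Science Research
Major (specialization): Automation (Control)
Class:
Student:
Supervisor:
Date:

A Simple Design Method for a Fuzzy Logic Controller for an Underwater Vehicle

K. Ishaque, S. S. Abdullah, S. M. Ayob, Z. Salam (Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM 81310, Skudai, Johor Bahru, Malaysia)

Abstract: The performance of a fuzzy logic controller (FLC) is determined by its inference rules.

In most cases, the more rules an FLC uses, the more accurate its control action becomes.

However, evaluating a large rule set takes considerable computation time, so a practical FLC implementation needs fast, high-performance processing.

This paper describes a simple method for designing a fuzzy logic controller (FLC) for an underwater vehicle, specifically a deep submergence rescue vehicle (DSRV).

The method yields a single-input fuzzy logic controller (SIFLC) by reducing the conventional two-input FLC (CFLC) to a single-input FLC.

The SIFLC simplifies the inference rules and, above all, the tuning of the control parameters.

The controller is evaluated in simulation using the Marine Systems Simulator (MSS) on the MATLAB/Simulink platform.

In the simulation, the DSRV is subjected to wave disturbances.

For the same inputs, the SIFLC gives the same response as the Mamdani-type and Sugeno-type controllers, and the SIFLC requires only very little tuning.

Its execution time is about two orders of magnitude less than that of the CFLC.

Keywords: fuzzy logic controller; signed distance method; single-input fuzzy logic control; underwater vehicle

1 Introduction

An unmanned underwater vehicle is an autonomous, robot-like underwater device that can carry out underwater tasks such as search and rescue operations, surveying, surveillance, inspection, repair, and maintenance.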

Research and Applications of Neural Networks

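A neural network is a computational model that simulates the neurons of the human brain and the interactions between them. It is also a machine learning technique that is widely used in artificial intelligence.

Trained on large amounts of data, a neural network can perform tasks such as classification, prediction, and optimization on complex problems.

I. Basic structure of a neural network

A neural network is built from many neurons, which are linked by connections to form the network.

These connections can take several forms, such as feedforward connections, feedback connections, and adaptive connections.

Each neuron receives multiple input signals, forms their weighted sum, and then computes its own output signal through an activation function (such as the sigmoid or ReLU function).

This output signal is in turn fed to other neurons, forming a multilayer neural network.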
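The structure just described, weighted sum, activation, and outputs feeding the next layer, can be sketched in a few lines. The weights, biases, and layer sizes below are arbitrary illustration values, and the helper name `layer` is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def layer(inputs, weights, biases, activation):
    """Each neuron sums its weighted inputs, adds a bias, and applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


x = [1.0, 2.0]

# Hidden layer: two ReLU neurons.
hidden = layer(x, [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0], relu)

# Output layer: one sigmoid neuron fed by the hidden layer's outputs.
output = layer(hidden, [[1.0, -1.0]], [2.0], sigmoid)

print(hidden)  # [0.0, 2.0]
print(output)  # [0.5]
```

The first hidden neuron receives a negative sum (-1.5) and is switched off by ReLU; the second passes its sum (2.0) through unchanged. The output neuron then sees a net input of 0, for which the sigmoid returns 0.5.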

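II. Applications of neural networks in machine learning

Neural networks are used very widely in machine learning. Common applications include, but are not limited to, the following areas:

1. Image classification: Neural networks can perform tasks such as image classification and object recognition.

For example, a convolutional neural network (CNN) can extract features from an image and assign the image to one of several classes.

2. Speech recognition: Neural networks can recognize human speech, converting it into text (speech-to-text) or into the meaning it is intended to convey (natural language processing, NLP).

3. Autonomous driving: Neural networks can analyze road conditions, detect traffic signs, vehicles, pedestrians, and other elements of the scene, and make driving decisions.

4. Financial forecasting: Neural networks are used to forecast stock price trends, currency movements, and similar quantities to help investors make decisions.

III. History of neural networks

Research on trainable neural networks began in the 1950s.

Early neural networks mainly comprised the perceptron and the adaptive linear element (Adaline).

These networks were used to solve binary classification and linear regression problems.

In the 1980s, the introduction of the backpropagation (BP) algorithm made it feasible to train multilayer neural networks.

The BP algorithm is based on propagating the error backward: the output error is passed back through the network and used to adjust the connection weights between neurons, so that the network classifies better.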
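To make the backward pass concrete, here is a minimal sketch for a network with two inputs, two sigmoid hidden neurons, and one sigmoid output, trained on a single example with squared error. The network size, initial weights, learning rate, and function names are assumptions chosen for illustration. The analytic gradient from backpropagation is compared against a finite-difference estimate, and a few gradient steps are then applied.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    w1, b1, w2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)
    return h, y


def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Return gradients of the loss with respect to (w1, b1, w2, b2)."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    delta_out = (y - target) * y * (1.0 - y)  # error signal at the output neuron
    grad_w2 = [delta_out * hi for hi in h]
    grad_b2 = delta_out
    # Propagate the error backward through the output weights to the hidden neurons.
    delta_hidden = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(w2, h)]
    grad_w1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = list(delta_hidden)
    return grad_w1, grad_b1, grad_w2, grad_b2


def gradient_step(params, grads, lr):
    w1, b1, w2, b2 = params
    g_w1, g_b1, g_w2, g_b2 = grads
    return (
        [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(w1, g_w1)],
        [b - lr * g for b, g in zip(b1, g_b1)],
        [w - lr * g for w, g in zip(w2, g_w2)],
        b2 - lr * g_b2,
    )


params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.3], 0.2)
x, target = [1.0, 0.5], 1.0

# Finite-difference check of one weight, w1[0][1].
eps = 1e-6


def perturbed(delta):
    w1 = [row[:] for row in params[0]]
    w1[0][1] += delta
    return (w1, params[1], params[2], params[3])


numeric = (loss(perturbed(eps), x, target) - loss(perturbed(-eps), x, target)) / (2 * eps)
analytic = backprop(params, x, target)[0][0][1]

loss_before = loss(params, x, target)
trained = params
for _ in range(100):
    trained = gradient_step(trained, backprop(trained, x, target), lr=0.5)
loss_after = loss(trained, x, target)

print(numeric, analytic)
print(loss_before, loss_after)
```

The agreement between the two gradient values is a common sanity check for a hand-written backward pass. Training on a single example is only meant to show the weight adjustment at work; a real classifier would average gradients over a dataset.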

Neural Network Papers

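The following are some important papers related to neural networks:

1. "A Computational Approach to Edge Detection", by John Canny, published in 1986. It proposed an edge detection algorithm (a classical image-processing method rather than a neural network) that is widely used in computer vision.

2. "Backpropagation Applied to Handwritten Zip Code Recognition", by Yann LeCun et al., published in 1989. It applied the backpropagation algorithm to handwritten digit recognition and was pioneering work in image recognition.

3. "Gradient-Based Learning Applied to Document Recognition", by Yann LeCun et al., published in 1998. It introduced LeNet-5, a convolutional neural network for recognizing handwritten digits and characters.

4. "ImageNet Classification with Deep Convolutional Neural Networks", by Alex Krizhevsky et al., published in 2012. It proposed a deep convolutional neural network model (AlexNet) that achieved a major breakthrough in the ImageNet image recognition competition.

5. "Deep Residual Learning for Image Recognition", by Kaiming He et al., published in 2015. It proposed the deep residual network (ResNet), whose residual (shortcut) connections make very deep networks easier to train by addressing the degradation in accuracy observed as depth increases.

6. "Generative Adversarial Networks", by Ian Goodfellow et al., published in 2014. It introduced the generative adversarial network (GAN), a framework that trains a generative model and a discriminative model against each other in a game-theoretic setting, and that is widely used in image generation, augmented reality, and other fields.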

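Textile Engineering and Artificial Neural Networks: Foreign-Literature Material in Chinese and English, Original Text and Translation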

Textile Research Journal, Article

Use of Artificial Neural Networks for Determining the Leveling Action Point at the Auto-leveling Draw Frame

Assad Farooq and Chokri Cherif
Institute of Textile and Clothing Technology, Technische Universität Dresden, Dresden, Germany

Abstract

Artificial neural networks, with their ability to learn from data, have been successfully applied in the textile industry. The leveling action point is one of the important auto-leveling parameters of the drawing frame and strongly influences the quality of the manufactured yarn. This paper reports a method of predicting the leveling action point using artificial neural networks. Various variables affecting the leveling action point were selected as inputs for training the artificial neural networks, with the aim of optimizing the auto-leveling by limiting the leveling action point search range. The Levenberg-Marquardt algorithm is incorporated into the back-propagation to accelerate the training, and Bayesian regularization is applied to improve the generalization of the networks. The results obtained are quite promising.

Key words: artificial neural networks, auto-leveling, draw frame, leveling action point

Neural Network Introduction: Undergraduate Thesis Foreign-Literature Translation and Original Text

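Graduation Project (Thesis): Foreign-Literature Translation
Chinese title of the document: Introduction to Neural Networks
English title of the document: Neural Network Introduction
Source:
Publication date:
School (department):
Major:
Class:
Name:
Student ID:
Supervisor:
Translation date: 2017.02.14

Foreign-literature translation. Note: excerpted from the introductory chapter of Neural Network Introduction.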

History

The history of artificial neural networks is filled with colorful, creative individuals from many different fields, many of whom struggled for decades to develop concepts that we now take for granted. This history has been documented by various authors. One particularly interesting book is Neurocomputing: Foundations of Research by John Anderson and Edward Rosenfeld. They have collected and edited a set of some 43 papers of special historical interest. Each paper is preceded by an introduction that puts the paper in historical perspective.

Histories of some of the main neural network contributors are included at the beginning of various chapters throughout this text and will not be repeated here. However, it seems appropriate to give a brief overview, a sample of the major developments.

At least two ingredients are necessary for the advancement of a technology: concept and implementation. First, one must have a concept, a way of thinking about a topic, some view of it that gives clarity not there before. This may involve a simple idea, or it may be more specific and include a mathematical description. To illustrate this point, consider the history of the heart. It was thought to be, at various times, the center of the soul or a source of heat. In the 17th century medical practitioners finally began to view the heart as a pump, and they designed experiments to study its pumping action. These experiments revolutionized our view of the circulatory system. Without the pump concept, an understanding of the heart was out of grasp.

Concepts and their accompanying mathematics are not sufficient for a technology to mature unless there is some way to implement the system.
For instance, the mathematics necessary for the reconstruction of images from computer-aided tomography (CAT) scans was known many years before the availability of high-speed computers and efficient algorithms finally made it practical to implement a useful CAT system.

The history of neural networks has progressed through both conceptual innovations and implementation developments. These advancements, however, seem to have occurred in fits and starts rather than by steady evolution.

Some of the background work for the field of neural networks occurred in the late 19th and early 20th centuries. This consisted primarily of interdisciplinary work in physics, psychology and neurophysiology by such scientists as Hermann von Helmholtz, Ernst Mach and Ivan Pavlov. This early work emphasized general theories of learning, vision, conditioning, etc., and did not include specific mathematical models of neuron operation.

The modern view of neural networks began in the 1940s with the work of Warren McCulloch and Walter Pitts [McPi43], who showed that networks of artificial neurons could, in principle, compute any arithmetic or logical function. Their work is often acknowledged as the origin of the neural network field.

McCulloch and Pitts were followed by Donald Hebb [Hebb49], who proposed that classical conditioning (as discovered by Pavlov) is present because of the properties of individual neurons. He proposed a mechanism for learning in biological neurons.

The first practical application of artificial neural networks came in the late 1950s, with the invention of the perceptron network and associated learning rule by Frank Rosenblatt [Rose58]. Rosenblatt and his colleagues built a perceptron network and demonstrated its ability to perform pattern recognition. This early success generated a great deal of interest in neural network research. Unfortunately, it was later shown that the basic perceptron network could solve only a limited class of problems.
(See Chapter 4 for more on Rosenblatt and the perceptron learning rule.)

At about the same time, Bernard Widrow and Ted Hoff [WiHo60] introduced a new learning algorithm and used it to train adaptive linear neural networks, which were similar in structure and capability to Rosenblatt's perceptron. The Widrow-Hoff learning rule is still in use today. (See Chapter 10 for more on Widrow-Hoff learning.)

Unfortunately, both Rosenblatt's and Widrow's networks suffered from the same inherent limitations, which were widely publicized in a book by Marvin Minsky and Seymour Papert [MiPa69]. Rosenblatt and Widrow were aware of these limitations and proposed new networks that would overcome them. However, they were not able to successfully modify their learning algorithms to train the more complex networks.

Many people, influenced by Minsky and Papert, believed that further research on neural networks was a dead end. This, combined with the fact that there were no powerful digital computers on which to experiment, caused many researchers to leave the field. For a decade neural network research was largely suspended.

Some important work, however, did continue during the 1970s. In 1972 Teuvo Kohonen [Koho72] and James Anderson [Ande72] independently developed new neural networks that could act as memories. Stephen Grossberg [Gros76] was also very active during this period in the investigation of self-organizing networks.

Interest in neural networks had faltered during the late 1960s because of the lack of new ideas and powerful computers with which to experiment. During the 1980s both of these impediments were overcome, and research in neural networks increased dramatically. New personal computers and workstations, which rapidly grew in capability, became widely available. In addition, important new concepts were introduced.

Two new concepts were most responsible for the rebirth of neural networks.
The first was the use of statistical mechanics to explain the operation of a certain class of recurrent network, which could be used as an associative memory. This was described in a seminal paper by physicist John Hopfield [Hopf82].

The second key development of the 1980s was the backpropagation algorithm for training multilayer perceptron networks, which was discovered independently by several different researchers. The most influential publication of the backpropagation algorithm was by David Rumelhart and James McClelland [RuMc86]. This algorithm was the answer to the criticisms Minsky and Papert had made in the 1960s. (See Chapters 11 and 12 for a development of the backpropagation algorithm.)

These new developments reinvigorated the field of neural networks. In the last ten years, thousands of papers have been written, and neural networks have found many applications. The field is buzzing with new theoretical and practical work. As noted below, it is not clear where all of this will lead us.

The brief historical account given above is not intended to identify all of the major contributors, but is simply to give the reader some feel for how knowledge in the neural network field has progressed. As one might note, the progress has not always been "slow but sure." There have been periods of dramatic progress and periods when relatively little has been accomplished.

Many of the advances in neural networks have had to do with new concepts, such as innovative architectures and training. Just as important has been the availability of powerful new computers on which to test these new concepts.

Well, so much for the history of neural networks to this date. The real question is, "What will happen in the next ten to twenty years?" Will neural networks take a permanent place as a mathematical/engineering tool, or will they fade away as have so many promising technologies?
At present, the answer seems to be that neural networks will not only have their day but will have a permanent place, not as a solution to every problem, but as a tool to be used in appropriate situations. In addition, remember that we still know very little about how the brain works. The most important advances in neural networks almost certainly lie in the future.

Although it is difficult to predict the future success of neural networks, the large number and wide variety of applications of this new technology are very encouraging. The next section describes some of these applications.

Applications

A recent newspaper article described the use of neural networks in literature research by Aston University. It stated that "the network can be taught to recognize individual writing styles, and the researchers used it to compare works attributed to Shakespeare and his contemporaries." A popular science television program recently documented the use of neural networks by an Italian research institute to test the purity of olive oil. These examples are indicative of the broad range of applications that can be found for neural networks. The applications are expanding because neural networks are good at solving problems, not just in engineering, science and mathematics, but in medicine, business, finance and literature as well. Their application to a wide variety of problems in many fields makes them very attractive. Also, faster computers and faster algorithms have made it possible to use neural networks to solve complex industrial problems that formerly required too much computation.

The following note and Table of Neural Network Applications are reproduced here from the Neural Network Toolbox for MATLAB with the permission of The MathWorks, Inc.

The 1988 DARPA Neural Network Study [DARP88] lists various neural network applications, beginning with the adaptive channel equalizer in about 1984.
This device, which is an outstanding commercial success, is a single-neuron network used in long distance telephone systems to stabilize voice signals. The DARPA report goes on to list other commercial applications, including a small word recognizer, a process monitor, a sonar classifier and a risk analysis system.

Neural networks have been applied in many fields since the DARPA report was written. A list of some applications mentioned in the literature follows.

Aerospace: High performance aircraft autopilots, flight path simulations, aircraft control systems, autopilot enhancements, aircraft component simulations, aircraft component fault detectors
Automotive: Automobile automatic guidance systems, warranty activity analyzers
Banking: Check and other document readers, credit application evaluators
Defense: Weapon steering, target tracking, object discrimination, facial recognition, new kinds of sensors, sonar, radar and image signal processing including data compression, feature extraction and noise suppression, signal/image identification
Electronics: Code sequence prediction, integrated circuit chip layout, process control, chip failure analysis, machine vision, voice synthesis, nonlinear modeling
Entertainment: Animation, special effects, market forecasting

Introduction to Neural Networks: Foreign Literature Translation (Chinese and English)


Foreign Literature Translation (English original and Chinese translation)

English original

Neural Network Introduction

1 Objectives

As you read these words you are using a complex biological neural network. You have a highly interconnected set of some 10^11 neurons to facilitate your reading, breathing, motion and thinking. Each of your biological neurons, a rich assembly of tissue and chemistry, has the complexity, if not the speed, of a microprocessor. Some of your neural structure was with you at birth. Other parts have been established by experience.

Scientists have only just begun to understand how biological neural networks operate. It is generally understood that all biological neural functions, including memory, are stored in the neurons and in the connections between them. Learning is viewed as the establishment of new connections between neurons or the modification of existing connections.

This leads to the following question: Although we have only a rudimentary understanding of biological neural networks, is it possible to construct a small set of simple artificial "neurons" and perhaps train them to serve a useful function? The answer is "yes." This book, then, is about artificial neural networks.

The neurons that we consider here are not biological. They are extremely simple abstractions of biological neurons, realized as elements in a program or perhaps as circuits made of silicon. Networks of these artificial neurons do not have a fraction of the power of the human brain, but they can be trained to perform useful functions. This book is about such neurons, the networks that contain them and their training.

2 History

The history of artificial neural networks is filled with colorful, creative individuals from many different fields, many of whom struggled for decades to develop concepts that we now take for granted. This history has been documented by various authors. One particularly interesting book is Neurocomputing: Foundations of Research by James Anderson and Edward Rosenfeld.
They have collected and edited a set of some 43 papers of special historical interest. Each paper is preceded by an introduction that puts the paper in historical perspective.

Histories of some of the main neural network contributors are included at the beginning of various chapters throughout this text and will not be repeated here. However, it seems appropriate to give a brief overview, a sample of the major developments.

At least two ingredients are necessary for the advancement of a technology: concept and implementation. First, one must have a concept, a way of thinking about a topic, some view of it that gives clarity not there before. This may involve a simple idea, or it may be more specific and include a mathematical description. To illustrate this point, consider the history of the heart. It was thought to be, at various times, the center of the soul or a source of heat. In the 17th century medical practitioners finally began to view the heart as a pump, and they designed experiments to study its pumping action. These experiments revolutionized our view of the circulatory system. Without the pump concept, an understanding of the heart was out of grasp.

Concepts and their accompanying mathematics are not sufficient for a technology to mature unless there is some way to implement the system. For instance, the mathematics necessary for the reconstruction of images from computer-aided tomography (CAT) scans was known many years before the availability of high-speed computers and efficient algorithms finally made it practical to implement a useful CAT system.

The history of neural networks has progressed through both conceptual innovations and implementation developments. These advancements, however, seem to have occurred in fits and starts rather than by steady evolution.

Some of the background work for the field of neural networks occurred in the late 19th and early 20th centuries.
This consisted primarily of interdisciplinary work in physics, psychology and neurophysiology by such scientists as Hermann von Helmholtz, Ernst Mach and Ivan Pavlov. This early work emphasized general theories of learning, vision, conditioning, etc., and did not include specific mathematical models of neuron operation.

The modern view of neural networks began in the 1940s with the work of Warren McCulloch and Walter Pitts [McPi43], who showed that networks of artificial neurons could, in principle, compute any arithmetic or logical function. Their work is often acknowledged as the origin of the neural network field.

McCulloch and Pitts were followed by Donald Hebb [Hebb49], who proposed that classical conditioning (as discovered by Pavlov) is present because of the properties of individual neurons. He proposed a mechanism for learning in biological neurons.

The first practical application of artificial neural networks came in the late 1950s, with the invention of the perceptron network and associated learning rule by Frank Rosenblatt [Rose58]. Rosenblatt and his colleagues built a perceptron network and demonstrated its ability to perform pattern recognition. This early success generated a great deal of interest in neural network research. Unfortunately, it was later shown that the basic perceptron network could solve only a limited class of problems. (See Chapter 4 for more on Rosenblatt and the perceptron learning rule.)

At about the same time, Bernard Widrow and Ted Hoff [WiHo60] introduced a new learning algorithm and used it to train adaptive linear neural networks, which were similar in structure and capability to Rosenblatt's perceptron. The Widrow-Hoff learning rule is still in use today. (See Chapter 10 for more on Widrow-Hoff learning.)

Unfortunately, both Rosenblatt's and Widrow's networks suffered from the same inherent limitations, which were widely publicized in a book by Marvin Minsky and Seymour Papert [MiPa69].
Rosenblatt and Widrow were aware of these limitations and proposed new networks that would overcome them. However, they were not able to successfully modify their learning algorithms to train the more complex networks.

Many people, influenced by Minsky and Papert, believed that further research on neural networks was a dead end. This, combined with the fact that there were no powerful digital computers on which to experiment, caused many researchers to leave the field. For a decade neural network research was largely suspended.

Some important work, however, did continue during the 1970s. In 1972 Teuvo Kohonen [Koho72] and James Anderson [Ande72] independently developed new neural networks that could act as memories. Stephen Grossberg [Gros76] was also very active during this period in the investigation of self-organizing networks.

Interest in neural networks had faltered during the late 1960s because of the lack of new ideas and powerful computers with which to experiment. During the 1980s both of these impediments were overcome, and research in neural networks increased dramatically. New personal computers and workstations, which rapidly grew in capability, became widely available. In addition, important new concepts were introduced.

Two new concepts were most responsible for the rebirth of neural networks. The first was the use of statistical mechanics to explain the operation of a certain class of recurrent network, which could be used as an associative memory. This was described in a seminal paper by physicist John Hopfield [Hopf82].

The second key development of the 1980s was the backpropagation algorithm for training multilayer perceptron networks, which was discovered independently by several different researchers. The most influential publication of the backpropagation algorithm was by David Rumelhart and James McClelland [RuMc86]. This algorithm was the answer to the criticisms Minsky and Papert had made in the 1960s.
(See Chapters 11 and 12 for a development of the backpropagation algorithm.)

These new developments reinvigorated the field of neural networks. In the last ten years, thousands of papers have been written, and neural networks have found many applications. The field is buzzing with new theoretical and practical work. As noted below, it is not clear where all of this will lead us.

The brief historical account given above is not intended to identify all of the major contributors, but is simply to give the reader some feel for how knowledge in the neural network field has progressed. As one might note, the progress has not always been "slow but sure." There have been periods of dramatic progress and periods when relatively little has been accomplished.

Many of the advances in neural networks have had to do with new concepts, such as innovative architectures and training. Just as important has been the availability of powerful new computers on which to test these new concepts.

Well, so much for the history of neural networks to this date. The real question is, "What will happen in the next ten to twenty years?" Will neural networks take a permanent place as a mathematical/engineering tool, or will they fade away as have so many promising technologies? At present, the answer seems to be that neural networks will not only have their day but will have a permanent place, not as a solution to every problem, but as a tool to be used in appropriate situations. In addition, remember that we still know very little about how the brain works. The most important advances in neural networks almost certainly lie in the future.

Although it is difficult to predict the future success of neural networks, the large number and wide variety of applications of this new technology are very encouraging. The next section describes some of these applications.

3 Applications

A recent newspaper article described the use of neural networks in literature research by Aston University.
It stated that "the network can be taught to recognize individual writing styles, and the researchers used it to compare works attributed to Shakespeare and his contemporaries." A popular science television program recently documented the use of neural networks by an Italian research institute to test the purity of olive oil. These examples are indicative of the broad range of applications that can be found for neural networks. The applications are expanding because neural networks are good at solving problems, not just in engineering, science and mathematics, but in medicine, business, finance and literature as well. Their application to a wide variety of problems in many fields makes them very attractive. Also, faster computers and faster algorithms have made it possible to use neural networks to solve complex industrial problems that formerly required too much computation.

The following note and Table of Neural Network Applications are reproduced here from the Neural Network Toolbox for MATLAB with the permission of The MathWorks, Inc.

The 1988 DARPA Neural Network Study [DARP88] lists various neural network applications, beginning with the adaptive channel equalizer in about 1984. This device, which is an outstanding commercial success, is a single-neuron network used in long distance telephone systems to stabilize voice signals. The DARPA report goes on to list other commercial applications, including a small word recognizer, a process monitor, a sonar classifier and a risk analysis system.

Neural networks have been applied in many fields since the DARPA report was written.
A list of some applications mentioned in the literature follows.

Aerospace: High performance aircraft autopilots, flight path simulations, aircraft control systems, autopilot enhancements, aircraft component simulations, aircraft component fault detectors
Automotive: Automobile automatic guidance systems, warranty activity analyzers
Banking: Check and other document readers, credit application evaluators
Defense: Weapon steering, target tracking, object discrimination, facial recognition, new kinds of sensors, sonar, radar and image signal processing including data compression, feature extraction and noise suppression, signal/image identification
Electronics: Code sequence prediction, integrated circuit chip layout, process control, chip failure analysis, machine vision, voice synthesis, nonlinear modeling
Entertainment: Animation, special effects, market forecasting
Financial: Real estate appraisal, loan advisor, mortgage screening, corporate bond rating, credit line use analysis, portfolio trading program, corporate financial analysis, currency price prediction
Insurance: Policy application evaluation, product optimization
Manufacturing: Manufacturing process control, product design and analysis, process and machine diagnosis, real-time particle identification, visual quality inspection systems, beer testing, welding quality analysis, paper quality prediction, computer chip quality analysis, analysis of grinding operations, chemical product design analysis, machine maintenance analysis, project bidding, planning and management, dynamic modeling of chemical process systems
Medical: Breast cancer cell analysis, EEG and ECG analysis, prosthesis design, optimization of transplant times, hospital expense reduction, hospital quality improvement, emergency room test advisement
Oil and Gas: Exploration
Robotics: Trajectory control, forklift robot, manipulator controllers, vision systems
Speech: Speech recognition, speech compression, vowel classification, text to speech synthesis
Securities: Market analysis, automatic bond rating, stock trading advisory systems
Telecommunications: Image and data compression, automated information services, real-time translation of spoken language, customer payment processing systems
Transportation: Truck brake diagnosis systems, vehicle scheduling, routing systems

Conclusion

The number of neural network applications, the money that has been invested in neural network software and hardware, and the depth and breadth of interest in these devices have been growing rapidly.

4 Biological Inspiration

The artificial neural networks discussed in this text are only remotely related to their biological counterparts. In this section we will briefly describe those characteristics of brain function that have inspired the development of artificial neural networks.

The brain consists of a large number (approximately 10^11) of highly connected elements (approximately 10^4 connections per element) called neurons. For our purposes these neurons have three principal components: the dendrites, the cell body and the axon. The dendrites are tree-like receptive networks of nerve fibers that carry electrical signals into the cell body. The cell body effectively sums and thresholds these incoming signals. The axon is a single long fiber that carries the signal from the cell body out to other neurons. The point of contact between an axon of one cell and a dendrite of another cell is called a synapse. It is the arrangement of neurons and the strengths of the individual synapses, determined by a complex chemical process, that establishes the function of the neural network.

Some of the neural structure is defined at birth. Other parts are developed through learning, as new connections are made and others waste away. This development is most noticeable in the early stages of life.
For example, it has been shown that if a young cat is denied use of one eye during a critical window of time, it will never develop normal vision in that eye.

Neural structures continue to change throughout life. These later changes tend to consist mainly of strengthening or weakening of synaptic junctions. For instance, it is believed that new memories are formed by modification of these synaptic strengths. Thus, the process of learning a new friend's face consists of altering various synapses.

Artificial neural networks do not approach the complexity of the brain. There are, however, two key similarities between biological and artificial neural networks. First, the building blocks of both networks are simple computational devices (although artificial neurons are much simpler than biological neurons) that are highly interconnected. Second, the connections between neurons determine the function of the network. The primary objective of this book will be to determine the appropriate connections to solve particular problems.

It is worth noting that even though biological neurons are very slow when compared to electrical circuits, the brain is able to perform many tasks much faster than any conventional computer. This is in part because of the massively parallel structure of biological neural networks; all of the neurons are operating at the same time. Artificial neural networks share this parallel structure. Even though most artificial neural networks are currently implemented on conventional digital computers, their parallel structure makes them ideally suited to implementation using VLSI, optical devices and parallel processors.

In the following chapter we will introduce our basic artificial neuron and will explain how we can combine such neurons to form networks. This will provide a background for Chapter 3, where we take our first look at neural networks in action.

Chinese translation

Overview of Neural Networks

1. Objectives

As you read this book, you are using a complex biological neural network.

Applications and Research of Neural Network Models


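In recent years, with the rapid development of artificial intelligence, neural network models have become an important AI technique and have been widely applied in many fields.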

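This article introduces the applications and research of neural network models, covering their basic principles, their applications, and related research directions.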

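I. Basic Principles of Neural Network Models

A neural network model, also called an artificial neural network, is a computational model loosely inspired by the nervous system of the human brain. It consists of many neurons, each connected to other neurons, and it learns to fit and classify data by adapting the weights of those connections.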

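Put simply, a neural network model is a data-driven learning algorithm: it discovers latent regularities and patterns in the data and applies them, and it is both adaptive and able to generalize.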

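The structure of a neural network model can be drawn as a flow diagram with three basic parts: an input layer, hidden layers, and an output layer. The input layer receives the multi-dimensional input data, the hidden layers transform the data as it passes from the input layer to the output layer, and the output layer produces the final prediction.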

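During training, the initial weights of the network are set randomly, and the backpropagation algorithm then adjusts the weights so that the prediction error gradually decreases and the predictions become more accurate.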
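The training procedure described above (random initial weights, then backpropagation to reduce the prediction error) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a method taken from the article: the function name `train_toy_network`, the sigmoid activation, the squared-error loss, the learning rate, and the logical-OR toy data set are all assumptions chosen to keep the example small and reliable.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_toy_network(hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a 2-input, one-hidden-layer, 1-output network by backpropagation.

    Hypothetical example: the toy data set is logical OR.
    Returns the mean squared error before and after training.
    """
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

    # Random initial weights; the last entry of each list is the bias.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x):
        # Input layer -> hidden layer -> output layer.
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
        y = sigmoid(sum(w_out[j] * h[j] for j in range(hidden)) + w_out[hidden])
        return h, y

    def mse():
        return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

    loss_before = mse()
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            # Output error term for a 0.5 * (y - t)^2 loss with a sigmoid output.
            delta_out = (y - t) * y * (1.0 - y)
            # Propagate the error back to each hidden unit (uses the old w_out).
            delta_hidden = [delta_out * w_out[j] * h[j] * (1.0 - h[j])
                            for j in range(hidden)]
            # Gradient-descent updates of all weights and biases.
            for j in range(hidden):
                w_out[j] -= lr * delta_out * h[j]
                w_hidden[j][0] -= lr * delta_hidden[j] * x[0]
                w_hidden[j][1] -= lr * delta_hidden[j] * x[1]
                w_hidden[j][2] -= lr * delta_hidden[j]
            w_out[hidden] -= lr * delta_out
    return loss_before, mse()
```

Calling `train_toy_network()` returns the error before and after training, so the gradual reduction in prediction error can be checked directly. Practical systems would normally rely on a library such as Keras or PyTorch rather than hand-written updates.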

II. Applications of Neural Network Models

In practice, neural network models are already widely used in many fields.

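Some common applications of neural network models are listed below:

1. Image recognition: neural network models can recognize and classify images, for example in face recognition and license plate recognition; closely related models are used for speech recognition.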

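2. Natural language processing: neural network models support machine translation, text classification, sentiment analysis, and similar functions.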

3. Finance: neural network models can be used for stock prediction, fraud detection, and credit risk assessment.

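4. Healthcare: neural network models can process and analyze disease-related information to support diagnosis, disease prediction, and treatment planning.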

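III. Research Directions for Neural Network Models

As neural network models are applied in more and more fields, the related research directions keep expanding.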

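Some currently popular research directions are listed below:

1. Deep learning: deep learning uses neural networks built by stacking many layers of simple neurons, which lets the model learn increasingly abstract representations of the data. It is widely applied to image and speech processing.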

Convolutional Neural Networks and Machine Learning: Foreign Literature Translation (Chinese and English), 2020


Convolutional Neural Networks and Machine Learning: Related Foreign Literature Translation (Chinese and English), 2020

English

Prediction of composite microstructure stress-strain curves using convolutional neural networks

Charles Yang, Youngsoo Kim, Seunghwa Ryu, Grace Gu

Abstract

Stress-strain curves are an important representation of a material's mechanical properties, from which important properties such as elastic modulus, strength, and toughness are defined. However, generating stress-strain curves from numerical methods such as the finite element method (FEM) is computationally intensive, especially when considering the entire failure path for a material. As a result, it is difficult to perform high-throughput computational design of materials with large design spaces, especially when considering mechanical responses beyond the elastic limit. In this work, a combination of principal component analysis (PCA) and convolutional neural networks (CNN) is used to predict the entire stress-strain behavior of binary composites evaluated over the entire failure path, motivated by the significantly faster inference speed of empirical models. We show that PCA transforms the stress-strain curves into an effective latent space by visualizing the eigenbasis of PCA. Despite having a dataset of only 10-27% of possible microstructure configurations, the mean absolute error of the prediction is <10% of the range of values in the dataset, when measuring model performance based on derived material descriptors, such as modulus, strength, and toughness. Our study demonstrates the potential to use machine learning to accelerate material design, characterization, and optimization.

Keywords: Machine learning, Convolutional neural networks, Mechanical properties, Microstructure, Computational mechanics

Introduction

Understanding the relationship between structure and property for materials is a seminal problem in material science, with significant applications for designing next-generation materials.
A primary motivating example is designing composite microstructures for load-bearing applications, as composites offer advantageously high specific strength and specific toughness. Recent advancements in additive manufacturing have facilitated the fabrication of complex composite structures, and as a result, a variety of complex designs have been fabricated and tested via 3D-printing methods. While more advanced manufacturing techniques are opening up unprecedented opportunities for advanced materials and novel functionalities, identifying microstructures with desirable properties is a difficult optimization problem.

One method of identifying optimal composite designs is by constructing analytical theories. For conventional particulate/fiber-reinforced composites, a variety of homogenization theories have been developed to predict the mechanical properties of composites as a function of volume fraction, aspect ratio, and orientation distribution of reinforcements. Because many natural composites, synthesized via self-assembly processes, have relatively periodic and regular structures, their mechanical properties can be predicted if the load transfer mechanism of a representative unit cell and the role of the self-similar hierarchical structure are understood. However, the applicability of analytical theories is limited in quantitatively predicting composite properties beyond the elastic limit in the presence of defects, because such theories rely on the concept of a representative volume element (RVE), a statistical representation of material properties, whereas strength and failure are determined by the weakest defect in the entire sample domain.
Numerical modeling based on finite element methods (FEM) can complement analytical methods for predicting inelastic properties such as strength and toughness modulus (referred to as toughness, hereafter), which can only be obtained from full stress-strain curves.

However, numerical schemes capable of modeling the initiation and propagation of curvilinear cracks, such as the crack phase field model, are computationally expensive and time-consuming because a very fine mesh is required to accommodate the highly concentrated stress field near the crack tip and the rapid variation of the damage parameter near the diffusive crack surface. Meanwhile, analytical models require significant human effort and domain expertise and fail to generalize to similar domain problems. In order to identify high-performing composites in the midst of large design spaces within realistic time-frames, we need models that can rapidly describe the mechanical properties of complex systems and be generalized easily to analogous systems. Machine learning offers the benefit of extremely fast inference times and requires only training data to learn relationships between inputs and outputs, e.g., composite microstructures and their mechanical properties. Machine learning has already been applied to speed up the optimization of several different physical systems, including graphene kirigami cuts, fine-tuning spin qubit parameters, and probe microscopy tuning. Such models do not require significant human intervention or knowledge, learn relationships efficiently relative to the input design space, and can be generalized to different systems.

In this paper, we utilize a combination of principal component analysis (PCA) and convolutional neural networks (CNN) to predict the entire stress-strain curve of composite failures beyond the elastic limit. Stress-strain curves are chosen as the model's target because they are difficult to predict given their high dimensionality.
In addition, stress-strain curves are used to derive important material descriptors such as modulus, strength, and toughness. In this sense, predicting stress-strain curves gives a more general description of composite properties than any combination of scalar material descriptors. A dataset of 100,000 different composite microstructures and their corresponding stress-strain curves is used to train and evaluate model performance. Due to the high dimensionality of the stress-strain dataset, several dimensionality reduction methods are used, including PCA, featuring a blend of domain understanding and traditional machine learning, to simplify the problem without loss of generality for the model.

We will first describe our modeling methodology and the parameters of our finite-element method (FEM) used to generate data. Visualizations of the learned PCA latent space are then presented, along with model performance results.

CNN implementation and training

A convolutional neural network was trained to predict this lower dimensional representation of the stress vector. The input to the CNN was a binary matrix representing the composite design, with 0's corresponding to soft blocks and 1's corresponding to stiff blocks. PCA was implemented with the open-source Python package scikit-learn, using the default hyperparameters. The CNN was implemented using Keras with a TensorFlow backend. The batch size for all experiments was set to 16 and the number of epochs to 30; the Adam optimizer was used to update the CNN weights during backpropagation.

A train/test split ratio of 95:5 is used; we justify using a smaller test fraction than the standard 80:20 split because the dataset is relatively large. With a ratio of 95:5 and a dataset of 100,000 instances, the test set still contains several thousand data points (about 5,000), enough for its results to generalize.
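The 95:5 split, and the per-column normalization of the PCA targets that the text goes on to describe, can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions rather than the authors' pipeline: the helper names `train_test_split_indices` and `standardize_columns` are hypothetical, the shuffle seed is arbitrary, and the sketch standardizes over all rows passed in, whereas a real pipeline would usually compute the statistics on the training set only.

```python
import random
import statistics


def train_test_split_indices(n_samples, test_fraction=0.05, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def standardize_columns(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Guard against constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
```

With `n_samples=100000` and the default `test_fraction`, the split yields 95,000 training indices and 5,000 test indices, which matches the "several thousand" test points mentioned in the text.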
Each column of the target PCA representation was normalized to have a mean of 0 and a standard deviation of 1 to prevent unstable training.

Finite element method data generation

FEM was used to generate training data for the CNN model. Although generating the initial training data is compute-intensive, it takes much less time to train the CNN model and even less time to make high-throughput inferences over thousands of new, randomly generated composites. The crack phase field solver was based on the hybrid formulation for the quasi-static fracture of elastic solids and implemented in the commercial FEM software ABAQUS with a user-element subroutine (UEL).

Visualizing PCA

To better understand the role PCA plays in capturing the information contained in stress-strain curves, the principal component representation of the stress-strain curves is plotted in three dimensions. Specifically, we take the first three principal components, which have a cumulative explained variance of ~85%, plot the stress-strain curves in that basis, and provide several different angles from which to view the 3D plot. Each point represents a stress-strain curve in the PCA latent space and is colored by the associated modulus value. PCA appears to spread out the curves in the latent space according to their modulus values, which suggests that this is a useful latent space for the CNN to make predictions in.

CNN model design and performance

Our CNN was a fully convolutional neural network, i.e., the only dense layer was the output layer. All convolution layers used 16 filters with a stride of 1, with a LeakyReLU activation followed by BatchNormalization. The first 3 conv blocks did not have 2D MaxPooling; they were followed by 9 conv blocks that did have a 2D MaxPooling layer, placed after the BatchNormalization layer.
A GlobalAveragePooling layer was used to reduce the dimensionality of the output tensor from the sequential convolution blocks, and the final output layer was a Dense layer with 15 nodes, where each node corresponded to a principal component. In total, our model had 26,319 trainable weights.

Our architecture was motivated by the recent development of, and convergence onto, fully convolutional architectures for traditional computer vision applications, where convolutions are empirically observed to be more efficient and stable for learning than dense layers. In addition, in our previous work, we showed that CNNs are a capable architecture for learning to predict mechanical properties of 2D composites [30]. The convolution operation is an intuitively good fit for predicting crack propagation because it is a local operation, allowing it to implicitly featurize and learn the local spatial effects of crack propagation.

After the PCA transformation reduces the dimensionality of the target variable, the CNN is used to predict the PCA representation of the stress-strain curve of a given binary composite design. After the CNN is trained on a training set, its ability to generalize to composite designs it has not seen is evaluated on an unseen test set. However, a natural question is how to evaluate a model's performance at predicting stress-strain curves in a real-world engineering context. While simple scalar metrics such as mean squared error (MSE) and mean absolute error (MAE) generalize easily to vector targets, it is not clear how to interpret these aggregate summaries of performance. It is difficult to use such metrics to ask questions such as “Is this model good enough to use in the real world?” and “On average, how far off will a given prediction be relative to some given specification?”
Although predicting stress-strain curves is an important application of FEM and a highly desirable capability for any machine learning model, such a prediction does not easily lend itself to interpretation. Specifically, there is no simple quantitative way to define whether two stress-strain curves are “close” or “similar” in real-world units.

Given that stress-strain curves are often intermediary representations of a composite property, used to derive more meaningful descriptors such as modulus, strength, and toughness, we decided to evaluate the model in an analogous fashion. The CNN prediction in the PCA latent space is transformed back to a stress-strain curve using PCA and used to derive the predicted modulus, strength, and toughness of the composite. The predicted material descriptors are then compared with the actual material descriptors. In this way, MSE and MAE have clearly interpretable units and meanings. The average performance of the model, in terms of the error between the actual and predicted material descriptor values derived from stress-strain curves, is presented in Table. The MAE for material descriptors provides an easily interpretable metric of model performance and can be used in any design specification to provide confidence estimates of a model prediction. Comparing the MAE to the range of values taken by each material descriptor shows that the MAE is relatively small: it is <10% of the range for all material descriptors.
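The descriptor-based evaluation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names, the toy curve, and the exact descriptor definitions (modulus as the initial slope, strength as the peak stress, toughness as the area under the curve) are assumptions, since the excerpt does not state how each descriptor is computed.

```python
def descriptors(strain, stress, n_elastic=3):
    """Return (modulus, strength, toughness) for one stress-strain curve.

    Assumed definitions (not taken from the paper):
      modulus   = least-squares slope through the origin over the first
                  n_elastic points, which are assumed to be elastic
      strength  = maximum stress on the curve
      toughness = area under the curve (trapezoidal rule)
    """
    e0, s0 = strain[:n_elastic], stress[:n_elastic]
    modulus = sum(e * s for e, s in zip(e0, s0)) / sum(e * e for e in e0)
    strength = max(stress)
    toughness = sum(
        0.5 * (stress[i] + stress[i + 1]) * (strain[i + 1] - strain[i])
        for i in range(len(strain) - 1)
    )
    return modulus, strength, toughness


def mae_over_range(actual, predicted):
    """Mean absolute error divided by the range of the actual values."""
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / (max(actual) - min(actual))


# Toy curve with made-up numbers: linear rise to a peak, then failure.
strain = [0.00, 0.01, 0.02, 0.03]
stress = [0.0, 2.0, 4.0, 0.0]
modulus, strength, toughness = descriptors(strain, stress)
print(modulus, strength, toughness)
```

Applying `descriptors` to both the FEM curve and the curve reconstructed from the CNN output, and then `mae_over_range` to each descriptor over the test set, would give the kind of "MAE relative to range" figure the text reports.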
Relatively tight confidence intervals on the error indicate that this model architecture is stable, that model performance is not heavily dependent on initialization, and that our results are robust to different train-test splits of the data.

Future work

Future work includes combining empirical models with optimization algorithms, such as gradient-based methods, to identify composite designs that yield complementary mechanical properties. The ability of a trained empirical model to make high-throughput predictions over designs it has never seen allows for large parameter space optimization that would be computationally infeasible for FEM. In addition, we plan to explore different visualizations of empirical models in an effort to “open up the black box” of such models. Applying machine learning to finite element methods is a rapidly growing field with the potential to discover novel next-generation materials tailored for a variety of applications. We also note that the proposed method can be readily applied to predict other physical properties represented in a similar vectorized format, such as electron/phonon density of states and sound/light absorption spectra.

Conclusion

In conclusion, we applied PCA and CNN to rapidly and accurately predict the stress-strain curves of composites beyond the elastic limit. In doing so, several novel methodological approaches were developed, including using the material descriptors derived from the stress-strain curves as interpretable metrics for model performance and applying dimensionality reduction techniques to stress-strain curves. This method has the potential to enable composite design with respect to mechanical response beyond the elastic limit, which was previously computationally infeasible, and can generalize easily to related problems outside of microstructural design for enhancing mechanical properties.

Chinese translation

Prediction of composite microstructure stress-strain curves based on convolutional neural networks
Charles, Jim, Ryan, Grace

Abstract
Stress-strain curves are an important representation of a material's mechanical properties, from which important properties such as elastic modulus, strength, and toughness can be defined.

Research and Application of BP Neural Networks

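Column editor: Tang Yidong. Artificial Intelligence and Recognition Technology. Computer Knowledge and Technology, Vol. 5, No. 15 (May 2009)

Research and Application of BP Neural Networks
Hou Zhibin, Wen Biteng, Peng Hua, Li Chunhou (Department for Graduate Students, Artillery Academy, Hefei, Anhui 230031)

Abstract: Artificial neural networks are a highly integrated interdisciplinary field. In practical applications, the great majority of neural network models use the BP network and its variants; the BP network is also the core of feedforward networks and embodies the essence of artificial neural networks. This paper introduces the learning process of the BP network and, from a pattern recognition perspective, applies the BP neural network as a classifier for mechanical fault diagnosis.

Keywords: BP neural network; learning process; pattern recognition; rotating machinery; fault diagnosis
CLC number: TP311  Document code: A  Article ID: 1009-3044(2009)15-3982-02

The Study and Application of the BP Neural Network
HOU Zhi-bin, WEN Bi-teng, PENG Hua, LI Chun-hou (Department for Graduate Students of Artillery Academy, Hefei 230031, China)

Abstract: The artificial neural network is a highly integrated interdisciplinary subject. In practical applications, most neural network models adopt the BP network and its variants, which is also the core of the feedforward network and embodies the essential part of neural networks. The paper introduces the learning process of the BP network and uses the BP network as a classifier in pattern recognition for mechanical fault diagnosis.
Key words: BP neural network; learning process; pattern recognition; rotating machinery; fault diagnosis

1 Introduction
The artificial neural network is a highly integrated interdisciplinary subject; its research and development involve neurophysiology, mathematical sciences, information science, computer science, and many other disciplines.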

Fuzzy Neural Networks: Translated Foreign Literature

(This document contains both the English original and the Chinese translation.)

Original text:

Neuro-fuzzy generalized predictive control of boiler steam temperature
Xiangjie LIU, Jizhen LIU, Ping GUAN

ABSTRACT
Power plants are nonlinear and uncertain complex systems. Reliable control of superheated steam temperature is necessary to ensure high efficiency and high load-following capability in the operation of a modern power plant. A nonlinear generalized predictive controller based on a neuro-fuzzy network (NFGPC) is proposed in this paper. The proposed nonlinear controller is applied to control the superheated steam temperature of a 200MW power plant. Experiments on the plant and simulations of the plant show much better performance than the traditional controller.

Keywords: Neuro-fuzzy networks; Generalized predictive control; Superheated steam temperature

1. Introduction
Continuous processes in power plants and power stations are complex systems characterized by nonlinearity, uncertainty and load disturbance. The superheater is an important part of the steam generation process in the boiler-turbine system, where steam is superheated before entering the turbine that drives the generator. Controlling superheated steam temperature is not only technically challenging, but also economically important.

As shown in Fig. 1, the steam generated from the boiler drum passes through the low-temperature superheater before it enters the radiant-type platen superheater. Water is sprayed onto the steam to control the superheated steam temperature in both the low- and high-temperature superheaters. Proper control of the superheated steam temperature is extremely important to ensure the overall efficiency and safety of the power plant. It is undesirable for the steam temperature to be too high, as this can damage the superheater and the high-pressure turbine, or too low, as this lowers the efficiency of the power plant.
It is also important to reduce the temperature fluctuations inside the superheater, as this helps to minimize the mechanical stress that causes micro-cracks in the unit, prolonging the life of the unit and reducing maintenance costs. As the GPC is derived by minimizing these fluctuations, it is among the controllers most suitable for achieving this goal.

The multivariable multi-step adaptive regulator has been applied to control the superheated steam temperature in a 150 t/h boiler, and generalized predictive control has been proposed to control the steam temperature. A nonlinear long-range predictive controller based on neural networks has been developed to control the main steam temperature and pressure, and the reheated steam temperature, at several operating levels. Control of the main steam pressure and temperature based on a nonlinear model that consists of nonlinear static constants and linear dynamics has also been presented.

Fig. 1 The boiler and superheater steam generation process

Fuzzy logic is capable of incorporating human experience via fuzzy rules. Nevertheless, the design of fuzzy logic controllers is somewhat time-consuming, as the fuzzy rules are often obtained by trial and error. In contrast, neural networks not only can approximate nonlinear functions with arbitrary accuracy, but can also be trained from experimental data. The recently developed neuro-fuzzy networks (NFNs) combine the model transparency of fuzzy logic with the learning capability of neural networks. NFNs have been used to develop self-tuning control and are therefore a useful tool for developing nonlinear predictive control. Since an NFN can be considered as a network that consists of several local regions, each of which contains a local linear model, nonlinear predictive control based on the NFN can be devised with the network incorporating all the local generalized predictive controllers (GPCs) designed using the respective local linear models.
Following this approach, the nonlinear generalized predictive controllers based on the NFN, or simply the neuro-fuzzy generalized predictive controllers (NFGPCs), are derived here. The proposed controller is then applied to control the superheated steam temperature of the 200MW power unit. Experimental data obtained from the plant are used to train the NFN model, from which the local GPCs that form part of the NFGPC are then designed. The proposed controller is tested first on a simulation of the process before being applied to control the power plant.

2. Neuro-fuzzy network modelling
Consider the following general single-input single-output nonlinear dynamic system:

$$y(t) = f[\,y(t-1), \ldots, y(t-n'_y),\; u(t-d), \ldots, u(t-d-n'_u+1),\; e(t-1), \ldots, e(t-n'_e)\,] + e(t)/\Delta \qquad (1)$$

where $f[\cdot]$ is a smooth nonlinear function such that a Taylor series expansion exists, $e(t)$ is zero-mean white noise, $\Delta$ is the differencing operator, and $n'_y$, $n'_u$, $n'_e$ and $d$ are respectively the known orders and time delay of the system. Let the local linear model of the nonlinear system (1) at the operating point $o(t)$ be given by the following Controlled Auto-Regressive Integrated Moving Average (CARIMA) model:

$$\bar{A}(z^{-1})\, y(t) = z^{-d} B(z^{-1})\, \Delta u(t) + C(z^{-1})\, e(t) \qquad (2)$$

where $\bar{A}(z^{-1}) = \Delta A(z^{-1})$, $B(z^{-1})$ and $C(z^{-1})$ are polynomials in $z^{-1}$, the backward shift operator. Note that the coefficients of these polynomials are functions of the operating point $o(t)$. The nonlinear system (1) is partitioned into several operating regions, such that each region can be approximated by a local linear model. Since NFNs are a class of associative memory networks with knowledge stored locally, they can be applied to model this class of nonlinear systems. A schematic diagram of the NFN is shown in Fig. 2.

B-spline functions are used as the membership functions in the NFN for the following reasons. First, B-spline functions can be readily specified by the order of the basis function and the number of inner knots.
Second, they are defined on a bounded support, and the output of the basis function is always positive, i.e., $\mu_j^k(x) = 0$ for $x \notin [\lambda_{j-k}, \lambda_j]$ and $\mu_j^k(x) > 0$ for $x \in [\lambda_{j-k}, \lambda_j]$. Third, the basis functions form a partition of unity, i.e.,

$$\sum_j \mu_j^k(x) \equiv 1, \quad x \in [x_{\min}, x_{\max}] \qquad (3)$$

And fourth, the output of the basis functions can be obtained by a recurrence equation.

Fig. 2 Neuro-fuzzy network

The membership functions of the fuzzy variables derived from the fuzzy rules can be obtained by the tensor product of the univariate basis functions. As an example, consider the NFN shown in Fig. 2, which consists of the following fuzzy rules:

IF operating condition $i$ ($x_1$ is positive small, ..., and $x_n$ is negative large), THEN the output is given by the local CARIMA model $i$:

$$\hat{y}_i(t) = a_{i1}\hat{y}_i(t-1) + \cdots + a_{in_a}\hat{y}_i(t-n_a) + b_{i0}\Delta u_i(t-d) + \cdots + b_{in_b}\Delta u_i(t-d-n_b) + e_i(t) + \cdots + c_{in_c} e_i(t-n_c) \qquad (4)$$

or

$$A_i(z^{-1})\,\hat{y}_i(t) = z^{-d} B_i(z^{-1})\,\Delta u_i(t) + C_i(z^{-1})\, e_i(t) \qquad (5)$$

where $A_i(z^{-1})$, $B_i(z^{-1})$ and $C_i(z^{-1})$ are polynomials in the backward shift operator $z^{-1}$, $d$ is the dead time of the plant, $u_i(t)$ is the control, and $e_i(t)$ is a zero-mean independent random variable with a variance of $\delta^2$. The multivariate basis function $a_i(x)$ is obtained by the tensor product of the univariate basis functions,

$$a_i = \prod_{k=1}^{n} \mu_{A_k^i}(x_k), \quad i = 1, 2, \ldots, p \qquad (6)$$

where $n$ is the dimension of the input vector $x$, and $p$, the total number of weights in the NFN, is given by

$$p = \prod_{i=1}^{n} (R_i + k_i) \qquad (7)$$

where $k_i$ and $R_i$ are the order of the basis function and the number of inner knots, respectively. The properties of the univariate B-spline basis functions described previously also apply to the multivariate basis function, which is defined on the hyper-rectangles. The output of the NFN is

$$\hat{y} = \frac{\sum_{i=1}^{p} \hat{y}_i a_i}{\sum_{i=1}^{p} a_i} = \sum_{i=1}^{p} \hat{y}_i a_i \qquad (8)$$

Translation:

Neuro-fuzzy generalized predictive control of boiler steam temperature
Xiangjie LIU, Jizhen LIU, Ping GUAN

Abstract
Power plants are nonlinear and uncertain complex systems.
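For the neuro-fuzzy network in the English original above, the B-spline recurrence, the tensor product of Eq. (6), and the weighted output of Eq. (8) can be sketched as follows. This is a hedged illustration only: it uses the standard Cox-de Boor recurrence, whose indexing (support from `knots[j]` to `knots[j + k]`) differs from the paper's $[\lambda_{j-k}, \lambda_j]$ convention, and the function names, knot vector, and local model outputs are all made up for the example.

```python
from itertools import product
from math import prod


def bspline_basis(j, k, knots, x):
    """Value at x of the j-th B-spline basis function of order k
    (degree k - 1), computed with the Cox-de Boor recurrence."""
    if k == 1:
        return 1.0 if knots[j] <= x < knots[j + 1] else 0.0
    left = right = 0.0
    if knots[j + k - 1] > knots[j]:
        left = ((x - knots[j]) / (knots[j + k - 1] - knots[j])
                * bspline_basis(j, k - 1, knots, x))
    if knots[j + k] > knots[j + 1]:
        right = ((knots[j + k] - x) / (knots[j + k] - knots[j + 1])
                 * bspline_basis(j + 1, k - 1, knots, x))
    return left + right


def memberships(x, k, knots):
    """All univariate membership values for one input variable."""
    return [bspline_basis(j, k, knots, x) for j in range(len(knots) - k)]


def nfn_output(per_input_memberships, local_outputs):
    """Eq. (6): tensor-product activations; Eq. (8): weighted average."""
    activations = [prod(c) for c in product(*per_input_memberships)]
    weighted = sum(y * a for y, a in zip(local_outputs, activations))
    return weighted / sum(activations)


knots = [-1.0, 0.0, 1.0, 2.0, 3.0]       # order 2, usable range [0, 2)
mu1 = memberships(0.5, 2, knots)          # first input variable
mu2 = memberships(1.25, 2, knots)         # second input variable
y_local = [float(i) for i in range(9)]    # made-up local model outputs
y_hat = nfn_output([mu1, mu2], y_local)
print(mu1, mu2, y_hat)
```

With one inner knot and order 2, each input has 1 + 2 = 3 basis functions, so the two-input example has 3 × 3 = 9 weights, in line with Eq. (7). Inside the usable range each membership list sums to 1, which is the partition-of-unity property of Eq. (3).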

Neural Networks: English-Language Literature

ARTIFICIAL NEURAL NETWORK FOR LOAD FORECASTING IN SMART GRID
HAO-TIAN ZHANG, FANG-YUAN XU, LONG ZHOU
Energy System Group, City University London, Northampton Square, London, UK
E-MAIL: abhb@, abcx172@, long.zhou.1@

Abstract: Developing the smart grid is an irresistible trend in the improvement of electric power. It applies a large number of new technologies in power generation, transmission, distribution and utilization to optimize the power configuration and save energy. As one of the key links in making a grid smarter, load forecasting plays a significant role in the planning and operation of the power system. Many methods, such as Expert Systems, Grey System Theory, and Artificial Neural Networks (ANN), are employed in load forecasting. This paper illustrates the application of the ANN to load forecasting based on the practical situation in Ontario Province, Canada.

Keywords: Load forecast; Artificial Neural Network; back propagation training; Matlab

1. Introduction
Load forecasting benefits the power system industry in many respects. As an essential part of the smart grid, highly accurate load forecasting is required to give exact information about power purchasing and generation in the electricity market, to prevent energy from being wasted or abused, to keep the electricity price in a reasonable range, and so on. Factors such as seasonal differences, climate changes, weekends and holidays, disasters and political reasons, operation scenarios of the power plants, and faults occurring on the network lead to changes in load demand and generation.

Since 1990, the artificial neural network (ANN) has been studied for load forecasting. “ANNs are massively parallel networks of simple processing elements designed to emulate the functions and structure of the brain to solve very complex problems”. Owing to these outstanding characteristics, ANNs are among the most competent methods for practical tasks such as load forecasting. This paper concerns the behavior of artificial neural networks in load forecasting. The factors affecting the load demand in Ontario, Canada are analyzed to give an effective way of forecasting load in Ontario.

2. Back Propagation Network
2.1. Background
Because of its outstanding statistical and modeling capabilities, an ANN can deal with nonlinear and complex problems of classification or forecasting. In the problem defined here, the relationship between the input and the target is nonlinear and very complicated, so an ANN is an appropriate method for forecasting the load. To apply an ANN to load forecasting, a network type must be selected, such as Feed-forward Back Propagation, Layer Recurrent, or Feed-forward Time-delay. To date, back propagation is widely used in neural networks; it is a feed-forward network with continuously valued functions and supervised learning. It matches the input data to the corresponding output so as to approximate a function, which is then used to reach an expected goal from previous data presented in the same manner as the input.

2.2. Architecture of back propagation algorithm
Figure 1 shows a single neuron model of the back propagation algorithm.
Generally, the output is a function of the sum of the bias and the weight multiplied by the input. The activation function can be any kind of function; however, the generated output differs. In a feed-forward network, at least one hidden layer before the output layer is generally needed. A three-layer network is selected as the architecture, because this kind of architecture can approximate any function with a few discontinuities. The three-layer architecture is shown in Figure 2.

Figure 1. Neuron model of back propagation algorithm
Figure 2. Architecture of three-layer feed-forward network

Basically, three activation functions are applied in the back propagation algorithm, namely Log-Sigmoid, Tan-Sigmoid, and the Linear Transfer Function. The output range of each function is illustrated in Figure 3.

Figure 3. Activation functions applied in back propagation: (a) Log-sigmoid (b) Tan-sigmoid (c) Linear function

2.3. Training function selection
The training functions used are based on the back propagation approach and are integrated in the Matlab Neural Network Toolbox.

TABLE I. TRAINING FUNCTIONS IN MATLAB'S NN TOOLBOX

3. Training Procedures
3.1. Background analysis
The neural network training is based on the load demand and weather conditions in Ontario Province, Canada, which is located in the south of Canada. According to the weather conditions, Ontario can be divided into three regions: southwest, central and east, and north. The population is concentrated in the southeastern part of the province, which includes two of the largest cities of Canada, Toronto and Ottawa.

3.2. Data Acquisition
The required training data can be divided into two parts: input vectors and output targets. For load forecasting, the input vectors for training include all the information on factors affecting the change in load demand, such as weather information, holidays or working days, faults occurring in the network, and so on. The output targets are the real-time load scenarios, i.e., the demand present at the same time as the input vectors change.

Owing to practical restrictions, this study considers only the weather information and a logical flag for weekdays and weekends as the factors affecting the load status. The factors affecting load changes in this paper are listed below:
(1). Temperature (℃)
(2). Dew point temperature (℃)
(3). Relative humidity (%)
(4). Wind speed (km/h)
(5). Wind direction (10)
(6). Visibility (km)
(7). Atmospheric pressure (kPa)
(8). Logical adjustment of weekday or weekend

Accordingly, weather information from Toronto is used in place of that for the whole of Ontario province. The data was gathered hourly from the historical weather records kept by the weather stations. Load demand data also needs to be gathered hourly and correspondingly. In this paper, 2 years of weather data and load data are collected to train and test the created network.

3.3. Data Normalization
To prevent the simulated neurons from being driven too far into saturation, all of the gathered data needs to be normalized after acquisition. As in a per-unit system, each input and target value is divided by the maximum absolute value of the corresponding factor. Each normalized value lies in the range between -1 and +1, so that the ANN can recognize the data easily. In addition, weekdays are represented as 1 and weekends as 0.

3.4. Neural network creating
The toolbox in Matlab is used for training and simulating the neural network. The layout of the neural network consists of the number of neurons and layers, the connectivity of the layers, the activation functions, the error goal, and so on. The framework and parameters of the network are set according to the practical situation, and the architecture of the ANN can be selected to achieve the optimized result. Matlab is one of the best simulation tools to provide visible windows. A three-layer architecture has been chosen for the simulation, as shown in Figure 2 above. It is adequate to approximate an arbitrary function if the hidden layer has enough nodes.

Because the practical input values range from -1 to +1, the transfer function of the first layer is set to tan-sigmoid, a hyperbolic tangent sigmoid transfer function. The transfer function of the output layer is set to linear, which calculates a layer's output from its net input. The linear output transfer function has one advantage: because linear output neurons let the output take on any value, there is no difficulty in finding the differences between output and target.

The next step is the selection of the neurons and training functions. Generally, Trainbr and Trainlm are the best choices among all of the training functions in the Matlab toolbox.

Trainlm (Levenberg-Marquardt algorithm) is the fastest training algorithm for networks of moderate size. However, its main problem is that it requires the storage of some matrices, which can be large for some problems. When the training set is large, the trainlm algorithm will reduce the memory and always compute the approximate Hessian matrix with n×n dimensions. Another drawback of trainlm is that over-fitting will occur when the number of neurons is too large. Basically, the number of neurons should not be too large when the trainlm algorithm is used in the network.

Trainbr (Bayesian regularization) is a modification of the Levenberg-Marquardt training method that creates networks which generalize well, so that the optimal network architecture can be easily determined. The impact of the effectively used weights and biases of the network can be seen clearly with this algorithm, and the number of effective weights and biases does not change much as the dimension of the network grows. The trainbr algorithm performs best after the network input and output are normalized into the range from -1 to +1. An important point when using trainbr is that the algorithm should not stop until the effective number of parameters has converged. More details are available in the Matlab Neural Network Toolbox.

The number of neurons in the first layer can also be selected to optimize the network so that an expected result can be obtained. Generally speaking, the more complicated the network architecture, the more accurate the output will be, but the more likely an algorithm such as trainlm is to over-fit. In this paper, the number of neurons is 8 for the trainlm algorithm and 30 for the trainbr algorithm.

3.5. Neural network training
Before training, the network needs to be initialized. The network initialization influences not only the final local minimum, but also the speed of convergence, the convergence probability and generalization. The hourly weather conditions in 2007 and the weekday/weekend logic in Ontario are defined as the training input; the hourly load demand changes in 2007 in Ontario are defined as the training target.

The training performances of the trainlm algorithm and the trainbr algorithm are shown in Figures 4 and 5, respectively. As can be seen in these plots, the mean squared error decreases from a large value to a smaller value.

Figure 4. trainlm algorithm performance plot
Figure 5. trainbr algorithm performance plot

For both training algorithms, namely trainbr and trainlm, the procedure will stop when any of the following conditions occurs:
(1). Epochs reach the maximum value
(2). Time reaches the preset value
(3). Goal error is minimized
(4). Gradient has decreased to min_grad
(5). Mu exceeds mu_max
(6). Validation performance has increased more than max_fail times since the last time it decreased (when using validation).

It is easy to see that the trainlm performance plot stopped because it met the error goal, which is set to 0.001; the trainbr performance stopped because the number of validation checks exceeded max_fail.

Figure 6. Training result and training target by trainbr algorithm with 8 neurons

In Figure 6, the training result and the training target are compared to check the performance of the algorithm applied to load forecasting. The training result clearly meets the target in general. A network test simulation should then be made in order to find out the performance on a real problem.

3.6. Neural network simulation
After training, the network must be checked to see whether it can achieve the expectation. Another set of input vectors and demand scenarios is needed to test the network. A comparison needs to be made to check the difference between the test output and the real demand. In this project, the hourly weather conditions in 2008 and the weekday/weekend logic in Ontario are used as the simulation input, and the hourly load demand scenarios in 2008 in Ontario are used as the simulation target. After the simulation, a set of outputs is obtained from the trained neural network. The simulation output and the simulation target are used to compute the mean squared error, which indicates how well the neural network application succeeds. The mean squared error is calculated as:

MSE = Mean(se) / max(test target)

where Mean(se) is the mean value of the difference between the simulation output and the test target, and max(test target) is the maximum value of the test target.

Figure 7. The simulation result of the trainbr algorithm with 8 neurons

Figure 7 shows a sample of the simulation results, using the same network as the training simulation in Figure 6. The green track is the test simulation result and the blue track is the real load demand, which is provided by the electric industry in Ontario. The horizontal axis represents time, and the vertical axis represents the normalized load. The smaller the mean squared error, the better the created neural network performs.

4. Result comparison
Table 2 lists the MSE of the trainoss, trainbr and trainlm algorithms as the number of neurons in the hidden layer increases. Each network has been trained 10 times to achieve the global minimum. It is obvious that trainlm and trainbr perform better than trainoss.

Figures 8 and 9 show two of the best results with different training algorithms and different numbers of neurons. As can be seen in both figures, most of the test simulation results meet the target very well. However, some parts of the simulation did not follow the real target. This may be because of factors that have not been taken into account, such as disasters, failures of the electric network, or national holidays not included in the input vectors.

Figure 8. The simulation result of the trainlm algorithm with 8 neurons
Figure 9. The simulation result of the trainbr algorithm with 10 neurons

Figure 10 compares the two algorithms with the same number of neurons in the networks. The blue track is the test simulation target, the red one is the result simulated by trainlm, and the green one is the result simulated by trainbr. The simulation result of trainbr is much closer to the test target than that of trainlm, although the error goal of trainbr is much larger. The trainbr result follows the general track of the test target, whereas trainlm can only simulate the maximum and minimum value per cycle.

Figure 10. The simulation results of both trainbr and trainlm with 8 neurons

The comparison of trainlm with 8 neurons and 30 neurons is presented in Figure 11. Theoretically, more neurons in the neural network could improve the performance. However, over-fitting could become much more pronounced. In Figure 11, the red track, which represents trainlm with 30 neurons, is much closer to the test target, which is shown in blue. Over-fitting is the main problem and cannot be avoided.

Figure 11. The simulation results of trainlm with 8 neurons and 30 neurons

Judging from the performance, trainbr could be the optimal algorithm for load forecasting by back propagation. It could be argued that the MSE of the trainlm algorithm could be less than that of trainbr as the number of neurons increases. However, a major problem cannot be neglected, namely over-fitting, which could decrease the quality of the simulation.

5. Conclusion
This paper focused on the behavior of different training algorithms for load forecasting by the back propagation algorithm in neural networks. Because it imitates the mode of human thinking, an ANN can learn the relationship between input and output. The same relationship can then be applied in practical situations to find the output from information already known as input. After research, the trainbr algorithm, which is integrated in the Neural Network Toolbox in Matlab, is regarded as one of the best choices for load forecasting. If accurate results are required in forecasting the load, more neurons are needed in the network architecture. On the other hand, over-fitting must be considered to ensure that the network simulates the load situation well. Owing to limited conditions, the input vectors did not take all of the information into account. A few parts of the simulation did not meet the real demand very well, and large squared errors even occurred. If enough information were gathered and the networks were trained more meticulously, better results could be obtained for load forecasting in the smart grid.
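Two of the numerical steps in the load-forecasting paper above, the per-factor normalization of Section 3.3 and the error measure of Section 3.6, can be sketched in plain Python. The paper itself used the Matlab toolbox, so this is only an illustrative translation: the function names and sample values are made up, and reading "se" as the squared error between output and target is an assumption, since the paper's wording is ambiguous.

```python
def normalize_by_max_abs(values):
    """Divide every value of one factor by that factor's maximum
    absolute value, so the result lies between -1 and +1."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)  # all zeros: nothing to scale
    return [v / peak for v in values]


def normalized_mse(output, target):
    """Mean(se) / max(target), with se taken to be the squared error
    (an assumed reading of the paper's definition)."""
    se = [(o - t) ** 2 for o, t in zip(output, target)]
    return (sum(se) / len(se)) / max(target)


# Made-up hourly temperatures in degrees Celsius.
temperature = [-20.0, 5.0, 10.0]
scaled = normalize_by_max_abs(temperature)
print(scaled)

# Made-up normalized load: simulated output against real demand.
error = normalized_mse([0.5, 1.0], [1.0, 1.0])
print(error)
```

Each factor (temperature, humidity, load, and so on) would be scaled separately, and the same scale factor kept so that forecasts can be converted back to physical units.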

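Textile engineering and artificial neural networks: foreign-language literature in Chinese and English, original text and translation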


Textile Research Journal, Article

Use of Artificial Neural Networks for Determining the Leveling Action Point at the Auto-leveling Draw Frame

Assad Farooq and Chokri Cherif
Institute of Textile and Clothing Technology, Technische Universität Dresden, Dresden, Germany

Abstract

Artificial neural networks, with their ability to learn from data, have been successfully applied in the textile industry. The leveling action point is one of the important auto-leveling parameters of the drawing frame and strongly influences the quality of the manufactured yarn. This paper reports a method of predicting the leveling action point using artificial neural networks. Various variables affecting the leveling action point were selected as inputs for training the artificial neural networks, with the aim of optimizing the auto-leveling by limiting the leveling action point search range. The Levenberg-Marquardt algorithm is incorporated into the back-propagation to accelerate the training, and Bayesian regularization is applied to improve the generalization of the networks. The results obtained are quite promising.

Key words: artificial neural networks, auto-leveling, draw frame, leveling action point

Research and Application of Deep Neural Networks


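Deep neural networks (DNNs) are among the most popular artificial intelligence techniques today and among the most promising in many application areas. The mathematical theory and algorithms behind the technique are highly complex, but they give computers considerable intelligence, allowing them to carry out a wide range of data-processing tasks efficiently and with a high degree of automation.

A deep neural network is a neural network model built from multiple layers of neurons. By learning from large amounts of data through its multi-layer structure, it progressively abstracts, generalizes, and classifies that data, which enables advanced capabilities such as image recognition, speech recognition, and natural language processing. As deep learning continues to develop and be applied, the scope of research on and application of deep neural networks keeps widening.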

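I. How deep neural networks work

The operation of a deep neural network can be divided into two phases: training and testing.

In the training phase, the network is trained on a large number of input data samples. Each sample enters at the first layer, passes through several hidden layers in turn, and finally produces an output. Every hidden layer contains many neurons; each neuron receives the outputs of the previous layer, computes their weighted sum, and applies an activation function to it. These operations abstract the input data from low-level features into high-level features. The final output is the network's classification or regression result for the input sample.

In the testing phase, the network receives new test data, feeds it in, and performs inference with the trained model. The outputs of the testing phase can then be used in various application areas.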
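The layer-by-layer computation described above (weighted sum followed by an activation function) can be sketched as follows. This is a minimal illustration only: the sigmoid activation, the 2-3-1 layout, the hand-picked weights, and the names `layer_forward` and `network_forward` are all assumptions made for the example, not details taken from the text.

```python
import math


def sigmoid(z):
    """Logistic activation; one common choice of activation function (assumed here)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each neuron takes the previous layer's outputs, forms a weighted sum, then applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(x, layers):
    """Pass the input through every layer in order and return the final output."""
    out = x
    for weights, biases in layers:
        out = layer_forward(out, weights, biases)
    return out


# Hypothetical 2-3-1 network with hand-picked (untrained) weights.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer: 3 neurons
    ([[0.7, -0.5, 0.2]], [0.05]),                                # output layer: 1 neuron
]
prediction = network_forward([1.0, 2.0], layers)
```

At test time only this forward pass is run; training would additionally adjust the weights and biases.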

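II. Applications of deep neural networks

Deep neural networks are widely used in image recognition, speech recognition, and natural language processing.

Image recognition is one of the most common applications of deep learning. Deep neural networks can classify images, detect objects, recognize faces, and so on. They are also especially important in medical image classification, target recognition for unmanned aerial vehicles, and self-driving cars.

Speech recognition is another very important application of deep learning. It converts human speech into text. Speech recognition based on deep neural networks is widely used in smart speakers, phone assistants, in-car voice control, and speech translation.

Natural language processing is a further application area of deep learning. Deep neural networks can be used for content classification, sentiment analysis, machine translation, speech generation, and many other tasks.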


network is shown as follows:
Figure 1. The standard structure of a typical three-layer feed-forward network (inputs x1 … xn1; hidden-layer outputs y1 … ym; network outputs o1 … on2; input-to-hidden weights w11 … wn1m; hidden-to-output weights v11 … vmn2)
III. IMPROVEMENT OF THE STANDARD BP NEURAL NETWORK ALGORITHM
The standard BP algorithm converges slowly and requires many iterations, both of which harm the rapidity of the control system. In this paper, the learning rate of the standard BP algorithm is improved to accelerate the training of the neural network.
From formula (1), the learning rate η influences the weight adjustment value ΔW(n), and thereby the convergence rate of the network. If the learning rate η is too small, convergence becomes very slow; if η is too large, the excessive weight adjustment causes the convergence process to oscillate around the minimum point. To solve this problem, a momentum term is added to formula (1).
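As a sketch only, assume the classical momentum form ΔW(n) = −η ∂E/∂W(n) + α ΔW(n−1). The momentum coefficient α, the one-dimensional quadratic error E(w) = 0.5·a·w², and the function name `train` are all assumptions for illustration, and the exact form used by the authors may differ. Under those assumptions, the effect of the momentum term on the iteration count can be illustrated as follows:

```python
def train(eta, alpha, a=1.0, w0=1.0, e_max=1e-12, max_iter=10000):
    """Minimise the toy error E(w) = 0.5 * a * w**2 and return the iterations used.

    eta   -- learning rate
    alpha -- momentum coefficient (alpha = 0.0 gives the plain rule of formula (1))
    """
    w = w0
    delta_prev = 0.0  # Delta W(n-1); zero before the first iteration
    for n in range(1, max_iter + 1):
        grad = a * w                                # dE/dW at the current weight
        delta = -eta * grad + alpha * delta_prev    # gradient step plus momentum term
        w += delta
        delta_prev = delta
        if 0.5 * a * w * w < e_max:                 # stop once the error is small enough
            return n
    return max_iter


plain_iters = train(eta=0.01, alpha=0.0)      # no momentum
momentum_iters = train(eta=0.01, alpha=0.9)   # classical momentum
```

With the same small learning rate, the momentum run reaches the error threshold in markedly fewer iterations on this toy problem, because successive steps in the same direction accumulate.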
Yan Li
School of Automation, Northwestern Polytechnical University, Xi’an, China
liyan@
Kairui Zhao
School of Automation, Northwestern Polytechnical University, Xi’an, China
zhaokairui@
II. STRUCTURE AND ALGORITHM OF THE STANDARD BP NEURAL NETWORK
A. Structure of the BP neural network
The standard structure of a typical three-layer feed-forward
For the standard BP algorithm, the formula to calculate the weight adjustment is as follows:
$$\Delta W(n) = -\eta \frac{\partial E}{\partial W(n)} \qquad (1)$$
In formula (1), η represents the learning rate; ΔW(n)
Research and Application on Improved BP Neural Network Algorithm
Rong Xie
School of Automation, Northwestern Polytechnical University, Xi’an, China
xierong2005@
In order to accelerate the convergence speed of the neural networks, the weight adjustment formula needs to be further improved.
Although the BP neural network has a mature theory and wide application, it still has many problems: the convergence rate is slow, many iterations are needed, and the real-time performance is poor. It is necessary to improve the standard BP neural network algorithm to solve these problems and achieve optimal performance.
Xinmin Wang
School of Automation, Northwestern Polytechnical University, Xi’an, China
wxmin@
Abstract—Because the standard BP neural network algorithm requires many iterations and adjusts slowly, improvements are made to it. The momentum term of the weight adjustment rule is improved, making weight adjustment faster and smoother. A simulation of a concrete example allows the iteration counts of the improved BP neural network algorithm to be calculated and compared. Finally, taking a certain type of airplane as the controlled object, the improved BP neural network algorithm is used to design the control law for control command tracking; the simulation results show that the improved algorithm achieves a faster convergence rate and better tracking accuracy.
Figure 2. Flow chart of the standard BP neural network algorithm. Legible elements of the chart: total error E = Σ Ek = Σ Σ ½ eik²; ε(r)p,k (r = 0, 1, 2); stopping test E < Emax; weight updates ωk ← ωk−1 + ηΔωk and vj ← vj−1 + ηΔvj.
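The flow of the standard BP algorithm (forward pass, error computation, weight update, and the stopping test E < Emax) can be sketched roughly as below. This is an illustrative reading, not the authors' implementation: the sigmoid activation, per-sample updates, the bias handled as an extra constant input, the logical-AND training set, and the names `train_bp` and `total_error` are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, V):
    """Return hidden outputs (with a bias entry appended) and network outputs."""
    y = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W]
    y_b = y + [1.0]  # constant 1.0 acts as the bias input of the output layer
    o = [sigmoid(sum(v * yi for v, yi in zip(row, y_b))) for row in V]
    return y_b, o


def total_error(samples, W, V):
    """E = sum over samples and outputs of 0.5 * (d - o)**2."""
    return sum(0.5 * (d - o) ** 2
               for x, target in samples
               for d, o in zip(target, forward(x, W, V)[1]))


def train_bp(samples, n_hidden, eta=0.5, e_max=0.01, max_epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    history = [total_error(samples, W, V)]
    for _ in range(max_epochs):
        for x, target in samples:
            y_b, o = forward(x, W, V)
            # Error signals of the output layer, then of the hidden layer.
            delta_o = [(d - ok) * ok * (1.0 - ok) for d, ok in zip(target, o)]
            delta_y = [y_b[j] * (1.0 - y_b[j]) *
                       sum(delta_o[k] * V[k][j] for k in range(n_out))
                       for j in range(n_hidden)]
            # Weight updates: new weight = old weight + eta * adjustment.
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    V[k][j] += eta * delta_o[k] * y_b[j]
            for j in range(n_hidden):
                for i in range(n_in):
                    W[j][i] += eta * delta_y[j] * x[i]
        history.append(total_error(samples, W, V))
        if history[-1] < e_max:  # stopping test E < Emax
            break
    return W, V, history


# Logical AND; the trailing 1.0 in each input is a bias term (an assumption).
and_samples = [([0.0, 0.0, 1.0], [0.0]), ([0.0, 1.0, 1.0], [0.0]),
               ([1.0, 0.0, 1.0], [0.0]), ([1.0, 1.0, 1.0], [1.0])]
W, V, history = train_bp(and_samples, n_hidden=2)
```

`history` holds the total error before training and after each epoch, so the number of iterations needed to reach a given Emax can be read off directly and compared between variants of the update rule.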
978-1-4244-5046-6/10/$26.00 © 2010 IEEE
Neural networks can be divided into several types according to how their neurons are connected. This paper studies the feed-forward neural network. Because the feed-forward neural network uses error back propagation in the weight training process, it is also known as the back propagation neural network, or BP network for short [2,3]. The BP neural network is a core part of the feed-forward neural network family; it realizes a particular non-linear transformation that maps the input space to the output space.
represents the weight adjustment value of the nth iteration; E(n) represents the error of the nth iteration; W(n) represents the connection weight of the nth iteration.
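The role of the learning rate η in formula (1) can be illustrated with a small sketch. It applies ΔW(n) = −η ∂E/∂W(n) to an assumed toy error E(w) = 0.5·w² (so that ∂E/∂w = w); the error function, the starting weight, and the name `descend` are hypothetical choices for illustration.

```python
def descend(eta, steps=5, w0=1.0):
    """Apply Delta W(n) = -eta * dE/dW(n) to E(w) = 0.5 * w**2 and return the weight trajectory."""
    w = w0
    path = [w]
    for _ in range(steps):
        grad = w              # dE/dw for the toy error
        w += -eta * grad      # formula (1)
        path.append(w)
    return path


slow = descend(eta=0.01)         # too small: the weight barely moves
oscillating = descend(eta=1.9)   # too large: the weight jumps across the minimum each step
```

With η = 0.01 the weight is still close to its starting value after five steps, whereas with η = 1.9 it changes sign on every step, which is the oscillation around the minimum point that an overly large learning rate produces.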
Keywords—improved BP neural network; weight adjustment; learning rate; convergence rate; momentum term
I. INTRODUCTION
The artificial neural network (ANN) was developed on the basis of research into complex biological neural networks. The human brain consists of about 10^11 highly interconnected units called neurons, and each neuron has about 10^4 connections [1]. By imitating biological neurons, neurons can be expressed mathematically; this leads to the concept of the artificial neural network, whose types are defined by the different interconnections of neurons. The use of artificial neural networks is an important area of intelligent control.