Optimization: Least Squares Fitting
Least Squares Fit
Abstract:
The techniques of least squares optimization have their origins in problems of curve fitting, and of finding the best possible solution for a system of linear equations with infinitely many solutions. Curve fitting problems begin with data points (t_1, s_1), ..., (t_n, s_n) and a given class of functions (for example, linear functions, polynomial functions, exponential functions), and seek to identify the function s = f(t) that "best fits" the data points. On the other hand, such problems as finding the minimum distance in geometric contexts or the minimum variance in statistical contexts can often be solved by finding the solution of minimum norm for an underdetermined linear system of equations.
Keywords: Least Squares, Fit, Equations
Text: Suppose that in a certain experiment or study, we record a series of observed values (t_1, s_1), (t_2, s_2), ..., (t_n, s_n) of two variables s, t that we have reason to believe are related by a function s = f(t) of a certain type. For example, we might know that s and t are related by a polynomial function

p(t) = x_0 + x_1 t + x_2 t^2 + \cdots + x_k t^k

of degree at most k, where k is prescribed in advance, but we do not know the specific values of the coefficients x_0, x_1, ..., x_k of p(t). We are interested in choosing the values of these coefficients so that the deviations

|s_i - p(t_i)|, \quad i = 1, 2, \ldots, n,

between the observed value s_i at t_i and the value p(t_i) of p(t) at t_i, are all as small as possible.
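To make this setup concrete, here is a minimal sketch (not from the original text) that evaluates the deviations |s_i - p(t_i)| for a made-up data set and one candidate choice of coefficients; the data values and the coefficients are purely illustrative.

```python
import numpy as np

# Hypothetical observations (t_i, s_i); these values are illustrative only.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
s = np.array([1.1, 1.9, 4.2, 8.8, 17.1])

# One candidate polynomial p(t) = x0 + x1*t + x2*t^2 (k = 2), coefficients guessed.
x = np.array([1.0, 0.0, 1.0])            # (x0, x1, x2)

p_t = x[0] + x[1] * t + x[2] * t**2      # p(t_i) at each observation point
deviations = np.abs(s - p_t)             # |s_i - p(t_i)|, the quantities to be made small

print(deviations)
```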
One reasonable approach to this problem is to minimize the function

φ(x_0, x_1, \ldots, x_k) = \sum_{i=1}^{n} (s_i - p(t_i))^2

over all (x_0, x_1, ..., x_k) in R^{k+1}. Although the use of the "square deviation" (s_i - p(t_i))^2 in place of the raw deviation |s_i - p(t_i)| can be justified purely in terms of the convenience afforded by the resulting differentiability of φ, this choice has some theoretical advantages that will soon become evident. Recall that the customary approach to the minimization of the function φ(x_0, x_1, ..., x_k) is to set the gradient of φ equal to zero and solve the resulting system for x_0, x_1, ..., x_k (cf. Exercise 18 of Chapter 1). This produces the minimizers of φ because φ is a convex function of x_0, x_1, ..., x_k (Why?) and so any critical point of φ is a global minimizer. Our approach to this minimization problem is similar but somewhat more refined. We first observe that the function φ(x_0, x_1, ..., x_k) can be expressed conveniently in terms of the norm on R^n. Specifically, if we set

A = \begin{pmatrix} 1 & t_1 & t_1^2 & \cdots & t_1^k \\ 1 & t_2 & t_2^2 & \cdots & t_2^k \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & t_n & t_n^2 & \cdots & t_n^k \end{pmatrix}, \qquad x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_k \end{pmatrix}, \qquad b = \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{pmatrix},
then

φ(x_0, x_1, \ldots, x_k) = \|Ax - b\|^2.
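A short sketch of this reformulation, continuing the illustrative data above: the matrix A has rows (1, t_i, t_i^2, ..., t_i^k), and the identity φ(x) = \|Ax - b\|^2 is checked numerically against the sum of squared deviations. The variable names A, x, b mirror the text; the data are the made-up values from the previous snippet.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # distinct values t_1, ..., t_n
s = np.array([1.1, 1.9, 4.2, 8.8, 17.1])  # observed values s_1, ..., s_n
k = 2                                      # fit a polynomial of degree at most k

# A has rows (1, t_i, t_i^2, ..., t_i^k); b is the vector of observations.
A = np.vander(t, k + 1, increasing=True)
b = s

x = np.array([1.0, 0.0, 1.0])              # same candidate coefficients as before

phi_norm = np.linalg.norm(A @ x - b) ** 2  # phi(x) = ||Ax - b||^2
phi_sum = np.sum((b - A @ x) ** 2)         # sum of squared deviations
print(phi_norm, phi_sum)                   # the two values agree
```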
Therefore, the gradient and Hessian of φ are given by

\nabla φ(x) = 2A^T A x - 2A^T b, \qquad Hφ(x) = 2A^T A.
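As a quick numerical sanity check (not part of the original derivation), one can compare the closed-form gradient 2A^T A x - 2A^T b against a central-difference approximation of ∇φ; the constant Hessian 2A^T A can likewise be formed directly. The data below are the same illustrative values used earlier.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 1.9, 4.2, 8.8, 17.1])
A = np.vander(t, 3, increasing=True)       # k = 2, so k + 1 = 3 columns
x = np.array([1.0, 0.0, 1.0])

phi = lambda x: np.linalg.norm(A @ x - b) ** 2

grad_formula = 2 * A.T @ A @ x - 2 * A.T @ b            # closed-form gradient

# Central-difference approximation of each partial derivative of phi.
eps = 1e-6
grad_fd = np.array([
    (phi(x + eps * e) - phi(x - eps * e)) / (2 * eps)
    for e in np.eye(len(x))
])

print(np.allclose(grad_formula, grad_fd, atol=1e-4))    # True: the formula matches
hessian = 2 * A.T @ A                                    # constant Hessian of phi
```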
Now here is the pertinent observation: since the numbers t_1, t_2, ..., t_n are distinct values of the independent variable t (and there are at least k + 1 of them), the columns of the matrix A are linearly independent. This means that if Ax = 0 then x = 0, since Ax is simply a linear combination of the column vectors of A. But then, because

\langle Hφ(x) y, y \rangle = 2 \langle A^T A y, y \rangle = 2 \langle Ay, Ay \rangle = 2 \|Ay\|^2 > 0 \quad \text{for all } y \neq 0,

we see that Hφ(x) is positive definite. It follows from (2.3.7) that φ(x) is strictly convex on R^{k+1}, and so φ(x) has a unique global minimizer at the point x* for which ∇φ(x*) = 0. Since ∇φ(x) = -2A^T b + 2A^T A x, we see that the minimizer x* of φ is characterized by the so-called normal equation

A^T A x^* = A^T b.
The matrix A^T A is invertible because it is positive definite, so x* is also given by

x^* = (A^T A)^{-1} A^T b.
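A minimal sketch of computing x* for the illustrative data: solving the normal equation A^T A x* = A^T b directly follows the text, while in floating-point practice a QR/SVD-based routine such as numpy.linalg.lstsq is usually preferred when A is ill conditioned. Both are shown for comparison; the lstsq alternative is standard numerical practice, not something the text prescribes.

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 1.9, 4.2, 8.8, 17.1])
A = np.vander(t, 3, increasing=True)        # degree-at-most-2 fit

# x* from the normal equation A^T A x = A^T b (as in the text).
x_star = np.linalg.solve(A.T @ A, A.T @ b)

# The same minimizer via a numerically robust least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_star)
print(np.allclose(x_star, x_lstsq))          # True
```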
If x^* = (x_0^*, x_1^*, \ldots, x_k^*), then the polynomial

p^*(t) = x_0^* + x_1^* t + \cdots + x_k^* t^k

is called the best least squares (kth-degree polynomial) fit for the given data points (t_1, s_1), ..., (t_n, s_n).
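Putting the pieces together, here is a short end-to-end sketch (again using the made-up data from the earlier snippets) that computes x*, forms p*(t), and evaluates the fit at a new point. The helper name best_lsq_poly is hypothetical, introduced only for this illustration.

```python
import numpy as np

def best_lsq_poly(t, s, k):
    """Return the coefficients (x0*, ..., xk*) of the best least squares
    polynomial fit of degree at most k, via the normal equation."""
    A = np.vander(np.asarray(t, float), k + 1, increasing=True)
    b = np.asarray(s, float)
    return np.linalg.solve(A.T @ A, A.T @ b)

# Illustrative data (same values used throughout these sketches).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
s = [1.1, 1.9, 4.2, 8.8, 17.1]

coeffs = best_lsq_poly(t, s, k=2)                        # (x0*, x1*, x2*)
p_star = lambda u: sum(c * u**j for j, c in enumerate(coeffs))

print(coeffs)
print(p_star(2.5))                                        # value of the fit at t = 2.5
```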