MULTI-VARIABLE OPTIMIZATION NOTES

HARRIS MATH CAMP 2018

1. Identifying Critical Points

Definition. Let f : R^2 → R. Then f has a local maximum at (x_0, y_0) if there exists some disc D around (x_0, y_0) such that f(x, y) ≤ f(x_0, y_0) for every (x, y) ∈ D. Similarly, f has a local minimum at (x_0, y_0) if there exists some disc D around (x_0, y_0) such that f(x, y) ≥ f(x_0, y_0) for every (x, y) ∈ D. Local maxima and local minima are called local extrema.

Suppose that f : R^2 → R and (x_0, y_0) is a local maximum. Then the graph of f intersected with the plane y = y_0 is a curve with a local maximum at x = x_0. So d/dx f(x, y_0) = ∂f/∂x (x_0, y_0) = 0. Similarly, ∂f/∂y (x_0, y_0) = 0. In other words, at all local minima and maxima, ∇f(x_0, y_0) = 0. So once again, to identify local extrema we need only identify points at which ∇f(x_0, y_0) = 0 and then classify them as local minima, local maxima, or non-extrema.

Definition. If ∇f(x_0, y_0) = 0, then (x_0, y_0) is a critical point.

Classifying critical points is more difficult in two (or more) dimensions than in a single dimension, because behavior in different directions can be different. For example, if f(x, y) = x^2 − y^2, then ∇f(0, 0) = 0. But f(x, 0) = x^2 has a minimum at (0, 0), and f(0, y) = −y^2 has a local maximum at (0, 0). Therefore (0, 0) is not a local extremum.

Definition. A saddle point of a function f : R^2 → R is a point (x_0, y_0) such that f(x_0, y_0) is a local minimum in one direction and a local maximum in another.
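This behavior can be checked by machine. A small sketch using sympy (the library choice is mine, not part of the notes): we find the critical points of f(x, y) = x^2 − y^2 by solving ∇f = 0, then restrict f to each axis to see the mixed behavior.

```python
# Sketch with sympy: locate the critical points of f(x, y) = x**2 - y**2
# by solving grad f = 0, then restrict f to each axis.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 - y**2

# Critical points: where both partial derivatives vanish.
critical = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(critical)          # [{x: 0, y: 0}]

# Behavior along the axes shows (0, 0) is not an extremum:
print(f.subs(y, 0))      # x**2   (minimum in the x-direction)
print(f.subs(x, 0))      # -y**2  (maximum in the y-direction)
```

Since the restrictions disagree (a minimum one way, a maximum the other), (0, 0) is the saddle point described in the definition above.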
In this definition, (0, 0) is a saddle point of the function f(x, y) = x^2 − y^2, since in the x-direction (0, 0) is a local minimum, and it is a local maximum in the y-direction.

Notice that a function f : R^2 → R could have a local minimum (or maximum) at (x_0, y_0) in both the x-direction and the y-direction, and still have a saddle point at (x_0, y_0). In particular, imagine that f(x, y) is a saddle with four ruffles (i.e., it is a saddle for a person with four legs) in such a way that the legs are placed one per quadrant. Then f has a local minimum at (0, 0) in both the x- and y-directions, but a local maximum in the direction of the line y = x. Hence (0, 0) is a saddle point.

2. Second Derivative Test

In one dimension, we can classify local minima and maxima by the First Derivative Test, in which we identify the roots of f′(x) and then test the sign of f′ on each complementary interval. This is more difficult in higher dimensions, because ∂f/∂x and ∂f/∂y need not have zeros at the same points, or behave in the same way on the same intervals.

Instead, we will use an analogue of the Second Derivative Test. Recall that in one dimension, we found critical points and then used the behavior of the second derivative to classify them. We will do the same in higher dimensions. The only snag is that we have multiple second derivatives, and we need to look at all of their behaviors simultaneously.
To do this, we create a new function. For f : R^2 → R, define D_f : R^2 → R as

D_f(x, y) = f_xx(x, y) f_yy(x, y) − [f_xy(x, y)]^2.

As a hint of where this comes from, the Hessian matrix is defined as

H_f(x, y) = [ f_xx(x, y)   f_xy(x, y) ]
            [ f_yx(x, y)   f_yy(x, y) ],

which gives a mathematical way of looking at all the second partial derivatives simultaneously. The determinant of this matrix is precisely D_f(x, y), as long as the mixed partials are equal.

Theorem 2.1 (Second Derivative Test, Multivariable). Suppose (a, b) is a critical point of f : R^2 → R.
(1) If D_f(a, b) < 0, then (a, b) is a saddle point.
(2) If D_f(a, b) > 0 and f_xx(a, b) < 0, then (a, b) is a local maximum.
(3) If D_f(a, b) > 0 and f_xx(a, b) > 0, then (a, b) is a local minimum.
If D_f(a, b) = 0, the test is inconclusive.

3. Lagrange Multipliers

In the real world, we may need to optimize a function subject to some constraints. For example, suppose that a company makes two items. The total output of the factory is constrained (by factory space, time, materials, etc.). To optimize the profit of the company, we need to find the maximum of the profit function subject to the constraints of production.

Geometrically, this corresponds to finding an extremum over a subspace of R^2. In the case that our subspace is a line, we could potentially solve the optimization problem as a single-variable function. In particular, suppose that our constraint is y = Ax + B. Then f(x, y) under this constraint is equal to f(x, Ax + B), a function of a single variable. We can maximize this according to the rules of single-variable functions.

On the other hand, suppose that our constraint is a region rather than a single line. Then we could find our extrema by finding all the extrema of the function, restricting our view to those that lie in our region, and then comparing these extrema to all the boundary values.
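The line-constraint substitution described above can be sketched with sympy. The particular objective f = x^2 + y^2 and the line y = x + 2 are illustrative choices of mine, not from the notes:

```python
# Sketch: optimize f subject to a line constraint y = Ax + B by
# substituting the constraint and doing single-variable calculus.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2          # hypothetical objective
line = x + 2             # hypothetical constraint: y = 1*x + 2

g = f.subs(y, line)              # restriction of f to the line
crit = sp.solve(sp.diff(g, x), x)  # solve g'(x) = 0
print(crit)                      # [-1]
print(g.subs(x, crit[0]))        # 2, attained at the point (-1, 1)
```

The restricted function g(x) = x^2 + (x + 2)^2 is a single-variable parabola, so ordinary one-dimensional calculus finishes the problem.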
However, this can be difficult (or even impossible) depending on the complexity of the constraint. But notice that for any constraint g(x, y) = k, f(x, y) has the same relative extrema as the function

F(x, y, λ) = f(x, y) − λ[g(x, y) − k],
where λ is a new variable, called the Lagrange multiplier. To find the extrema of F, compute its partial derivatives

F_x = f_x − λ g_x,   F_y = f_y − λ g_y,   F_λ = −(g − k).

Then solve the equations F_x = F_y = F_λ = 0. In other words, solve

f_x = λ g_x,   f_y = λ g_y,   g = k.

These are the potential extrema of F, and hence also potential extrema of f. Finally, we need only evaluate f at all of these potential extrema. It should be noted that the method of Lagrange multipliers does not tell you whether any constrained extrema exist; it says only where a constrained extremum could occur.

Example. Suppose that we are going to fence in a rectangular field of area 5000 sq. yd. along a river, so we need fencing on only three sides. What is the minimum length of fencing needed?

If x is the length of the field and y the width, then A(x, y) = xy and L(x, y) = x + 2y. We want to minimize L subject to the constraint A(x, y) = 5000. With f = L and g = A,

f_x = 1,   f_y = 2,   g_x = y,   g_y = x.

So our three Lagrange equations are

f_x = 1 = λ g_x = λy,   f_y = 2 = λ g_y = λx,   g = xy = 5000.

Hence λ = 1/y = 2/x, and x = 2y. So xy = 2y^2 = 5000, and y = 50, x = 100.

Example. Find the extrema of f(x, y) = xy subject to x^2 + y^2 = 8.

We have a constraint g(x, y) = x^2 + y^2, so our partial derivatives are

f_x = y,   f_y = x,   g_x = 2x,   g_y = 2y.

The three Lagrange equations give us

y = 2λx,   x = 2λy,   x^2 + y^2 = 8.

Solving these, we get 2λ = y/x = x/y, so x^2 = y^2. In the final equation, we get 2x^2 = 8, so x = ±2 and y = ±2.
Then f(2, 2) = f(−2, −2) = 4, and f(2, −2) = f(−2, 2) = −4. So the maximal value of f(x, y) subject to x^2 + y^2 = 8 is 4, and the minimal value is −4.

If we want to find extrema of a function over a closed region R of R^2, then we need to (1) find any local extrema inside that region, then (2) compare boundary values. To do this, we use the Second Derivative Test to find and classify any critical points in R, then use Lagrange multipliers to test the boundary for extrema, and compare these to the extrema inside R.

Example. Suppose f(x, y) = xy. We want to optimize this over the region D given by x^2 + y^2 ≤ 8 and y ≥ 0.

First, we need to identify extrema within D. To do this we use the Second Derivative Test. So

f_x = y,   f_y = x,   f_xx = f_yy = 0,   f_xy = f_yx = 1.

Then ∇f = (y, x) = (0, 0) only at (0, 0). But D_f(x, y) = −1 < 0 for all (x, y), so this is a saddle point.

Along the straight part of the boundary, y = 0, we have f(x, 0) = 0 for every x, so f is constant there with value f(0, 0) = 0.

On the circular part of the boundary we have the constraint g(x, y) = x^2 + y^2 = 8. Then g_x = 2x and g_y = 2y, so the Lagrange equations give

y = 2λx,   x = 2λy,   x^2 + y^2 = 8.

Solving these for x and y with y ≥ 0, we get x = ±2 and y = 2. Checking these values, we have f(2, 2) = 4 and f(−2, 2) = −4. Comparing 4, −4, and 0, there is a global maximum at (2, 2) and a global minimum at (−2, 2).
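The boundary analysis in the last example can be verified by solving the full Lagrange system with sympy (a sketch; only the setup f = xy, g = x^2 + y^2 = 8, y ≥ 0 comes from the notes):

```python
# Sketch: solve the Lagrange system f_x = lam*g_x, f_y = lam*g_y, g = 8
# for f = x*y on the semicircle boundary of D.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x * y
g = x**2 + y**2

eqs = [sp.diff(f, x) - lam * sp.diff(g, x),   # y - 2*lam*x = 0
       sp.diff(f, y) - lam * sp.diff(g, y),   # x - 2*lam*y = 0
       g - 8]                                 # x**2 + y**2 = 8
sols = sp.solve(eqs, [x, y, lam], dict=True)

# Keep only the candidates in the upper half-plane y >= 0.
pts = sorted((s[x], s[y]) for s in sols if s[y] >= 0)
vals = [f.subs({x: a, y: b}) for a, b in pts]
print(pts)   # [(-2, 2), (2, 2)]
print(vals)  # [-4, 4]
```

The candidate values −4 and 4, together with the constant value 0 along the segment y = 0, confirm the global minimum at (−2, 2) and the global maximum at (2, 2).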