12.2 — Root Finding, Sorting algorithms

Roots of a function

  • A root or zero of a function f(x) is a solution of the equation f(x) = 0

  • For some functions, the equation f(x) = 0 can be solved in closed form, i.e., the solutions can be expressed in terms of functions that we know.

    • f(x) = x^2 - 1 can be factored as (x - 1)(x + 1), so the solutions of f(x) = 0 are x = -1 and x = 1
  • For the functions that cannot be factored or represented in closed form, we must use numerical methods to find their roots.

    • See this post for a discussion on “closed form”.

  • In general, finding roots numerically involves a procedure which generates a sequence of approximations x_1, x_2, ..., x_n until we obtain a value close to the actual root.

  • Note that we will focus on finding the real roots of functions, not the complex roots.

Single and double roots

  • The equation f(x) = 0 has a single root for each x-intercept, i.e. each time the function f(x) crosses the x-axis
  • Double roots (two equal roots) occur when f(x) touches but does not cross the x-axis

For example, f(x) = x^2 has a double root at x = 0

Methods of finding roots

  • All the methods we will see require an initial estimate of the root — either a single number as an estimate or an estimated interval containing the root.

  • A few different ways of estimating an initial root value or interval:

    • If the equation is associated with a physical problem, its physical context may suggest an estimate.
    • A search can be carried out for estimated root values.
    • The function could be plotted (a visual procedure).

We will see the following methods in this lecture:

  • Bisection method
  • Secant method
  • Newton-Raphson method

Intermediate value theorem

(We will assume that all functions are continuous for root finding algorithms.)

  • Let f(x) be a continuous function on the interval [a, b] and let s be a number with f(a) < s < f(b); then there exists at least one x in [a, b] with f(x) = s.

We will use the Intermediate value theorem in some of the root-finding methods.

  • The main idea is that
    • If f(a) and f(b) have opposite signs, e.g. f(a) < 0 < f(b), then there is at least one x in the interval [a, b] such that f(x) = 0, i.e. there exists a root in [a, b]
  • Note that since a double root does not lead to a change in sign, it won’t be detectable using this method.

Checking if an interval contains a root

Given a function f(x) which is continuous on the interval [a, b], if f(a) and f(b) have opposite signs, we have a root in [a, b]

def f(x):
    return x ** 2 - 25

a1, b1 = -4.0, 3.0
print('Is root in [-4.0, 3.0]?', f(a1) * f(b1) < 0)
# False

a2, b2 = 3.0, 10.0
print('Is root in [3.0, 10.0]?', f(a2) * f(b2) < 0)
# True

Example

In the previous graph,

  • Since we have the tan function, which is not continuous, there are sign changes at its discontinuities, e.g. in the interval [1, 2].
  • So the theorem doesn't apply there, and we can't use the root-finding methods in the interval [1, 2].
  • But we can find a root in the interval [6, 7].

Bisection method

  • It is the simplest root-finding algorithm, and it can be applied to any continuous function f(x) on an interval [a, b] where a root exists, i.e. f(a) and f(b) have different signs.

  • The method therefore relies on the Intermediate value theorem.

  • The general idea is to repeat the following

    • Divide the interval into two subintervals, one of which must contain the root.
    • Select the subinterval [a_n, b_n] where f(x) changes sign, i.e. f(a_n) \cdot f(b_n) < 0

Algorithm

  1. Choose an interval [a_0, b_0] where the sign of f(x) changes, i.e. f(a_0) \cdot f(b_0) < 0

  2. Compute f(m_0) for the midpoint m_0 = \frac{a_0 + b_0}{2}

  3. Determine the next subinterval [a_1, b_1] as follows:

    • If f(a_0) \cdot f(m_0) < 0, then [a_1, b_1] = [a_0, m_0]
    • If f(m_0) \cdot f(b_0) < 0, then [a_1, b_1] = [m_0, b_0]
  4. Repeat steps (2) and (3) until the interval [a_n, b_n] is small enough, i.e. its size is less than some value \epsilon

  5. Return the midpoint value m_n = \frac{a_n + b_n}{2}

To visualize this algorithm download visualize_bisection.py from Ed and run it.

Python implementation

Check bisection function in rootfinding.py file.
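
rootfinding.py is not reproduced here, but a minimal sketch of what such a bisection function might look like is given below (the function name matches the lecture reference; the tolerance and iteration cap are assumptions):

def bisection(f, a, b, eps=1e-6, max_iter=100):
    # Assumes f is continuous and f(a), f(b) have opposite signs.
    if f(a) * f(b) >= 0:
        raise ValueError('f(a) and f(b) must have opposite signs')
    for _ in range(max_iter):
        m = (a + b) / 2          # midpoint of the current interval
        fm = f(m)
        if fm == 0:
            return m             # hit the root exactly
        if f(a) * fm < 0:
            b = m                # root lies in [a, m]
        else:
            a = m                # root lies in [m, b]
        if b - a < eps:          # interval is small enough
            break
    return (a + b) / 2

# Example: root of x^2 - 25 in [3, 10], should approach 5
print(bisection(lambda x: x ** 2 - 25, 3.0, 10.0))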

Example

f(x) = x^4 - 6.4x^3 + 6.45x^2 + 20.538x - 31.752

There is a double root at x = 2.1 and a single root at x = 4.

Check Bisection method example in rootfinding_example.py file.

Secant Method

  • A secant line is a straight line joining two points on the graph of a function.

  • The secant method is very similar to the bisection method.

  • Instead of choosing the midpoint, the secant method divides each interval [a, b] at the x-intercept of the secant line connecting the endpoints (a, f(a)) and (b, f(b)).

  • However, unlike the Bisection method, the Secant method does not require that a root be present in the interval [a, b]. So the Secant method may not converge to any number.

Formula for Secant line

  • Let f(x) be a continuous function on a closed interval [a, b] such that f(a) \cdot f(b) < 0

  • The secant line connecting the endpoints (a, f(a)) and (b, f(b)) is given by

    y = \frac{f(b) - f(a)}{b - a}(x - a) + f(a)
  • By setting y = 0 in the above equation, we get the x-intercept of the secant line:

    x = a - f(a) \frac{b - a}{f(b) - f(a)}
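
As a quick numerical check of this formula (not from the lecture files), the snippet below computes the x-intercept of the secant line for f(x) = x^2 - 25 on the interval [3, 10]:

def f(x):
    return x ** 2 - 25

a, b = 3.0, 10.0
# x-intercept of the secant line through (a, f(a)) and (b, f(b))
x = a - f(a) * (b - a) / (f(b) - f(a))
print(x)   # about 4.23, a first approximation to the root at x = 5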

Algorithm for Secant method

  1. Choose an interval [a_0, b_0] where the sign of f(x) changes, i.e. f(a_0) \cdot f(b_0) < 0.
  2. Compute f(x_0), where x_0 is the x-intercept of the secant line between (a_0, f(a_0)) and (b_0, f(b_0)), given by
    x_0 = a_0 - f(a_0) \frac{b_0 - a_0}{f(b_0) - f(a_0)}
  3. Determine the next subinterval [a_1, b_1] as follows:
    • If f(a_0) \cdot f(x_0) < 0, then [a_1, b_1] = [a_0, x_0]
    • If f(x_0) \cdot f(b_0) < 0, then [a_1, b_1] = [x_0, b_0]
  4. Repeat steps (2) and (3) until the interval [a_n, b_n] is small enough or some fixed number of steps have been taken.
  5. Return the value x_n, the x-intercept of the secant line on the n^{th} subinterval.

To visualize this algorithm download visualize_secant.py from Ed and run it.

Python implementation

Check secant function in rootfinding.py file.
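
Again, rootfinding.py is not reproduced here; a minimal sketch of what such a secant function might look like is given below (the function name matches the lecture reference; the stopping criteria are assumptions):

def secant(f, a, b, eps=1e-6, max_iter=100):
    # Assumes f is continuous on [a, b]; convergence is not guaranteed.
    x = a
    for _ in range(max_iter):
        # x-intercept of the secant line through (a, f(a)) and (b, f(b))
        x = a - f(a) * (b - a) / (f(b) - f(a))
        fx = f(x)
        if fx == 0 or abs(b - a) < eps:
            return x
        if f(a) * fx < 0:
            b = x                # sign change in [a, x]
        else:
            a = x                # sign change (if any) in [x, b]
    # Note: the interval may never shrink below eps, so the iteration
    # cap is what guarantees termination here.
    return x

# Example: root of x^2 - 25 starting from [3, 10], should approach 5
print(secant(lambda x: x ** 2 - 25, 3.0, 10.0))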

Check Secant method example in rootfinding_example.py file.

Newton-Raphson method

  • Simple, fast, and best-known method of finding roots.

  • It uses the derivative of the function f(x), i.e. the tangent line.

    • So, it can only be used if the function is differentiable in the given interval.

    • If it is costly to compute the derivative, Newton’s method will be slow.

  • It doesn’t need an interval as an initial estimate, but only one estimate value x_0.

  • We start with an approximation of the root, x_0, and compute successive approximations as follows (a short worked example appears after this list):

    x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}
  • The above formula comes from the Taylor series approximation of f(x_n) given by:

    f(x_n) \approx f(x_{n-1}) + f'(x_{n-1})(x_n - x_{n-1})
    • Setting f(x_n) to 0 and solving for x_n gives the above formula for x_n.
  • This method may not converge. To prevent an infinite loop, we use a fixed number of iterations, and stop if we do not find a root.
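
As a quick illustration of the formula (not from the lecture), take f(x) = x^2 - 2, so f'(x) = 2x, with the initial estimate x_0 = 1:

    x_1 = 1 - \frac{1^2 - 2}{2 \cdot 1} = 1.5, \quad x_2 = 1.5 - \frac{1.5^2 - 2}{2 \cdot 1.5} \approx 1.4167, \quad x_3 \approx 1.4142

which already agrees with \sqrt{2} \approx 1.41421 to four decimal places.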

To visualize this algorithm download visualize_newton.py from Ed and run it.

Python implementation

Check newton_raphson function in rootfinding.py file.
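
A minimal sketch of what such a newton_raphson function might look like (the function name matches the lecture reference; the tolerance and iteration cap are assumptions):

def newton_raphson(f, fprime, x0, eps=1e-6, max_iter=100):
    # Starts from the single estimate x0; convergence is not guaranteed.
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)   # x-intercept of the tangent line at x
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    return None                        # did not converge within max_iter steps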

For Newton-Raphson we also need the first derivative of the function. For the example we have:

f(x) = x^4 - 6.4x^3 + 6.45x^2 + 20.538x - 31.752
f'(x) = 4x^3 - 19.2x^2 + 12.9x + 20.538

Check the example in rootfinding_example.py file.
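
Using the sketched newton_raphson above, a hypothetical run on this example might look like:

def f(x):
    return x ** 4 - 6.4 * x ** 3 + 6.45 * x ** 2 + 20.538 * x - 31.752

def fprime(x):
    return 4 * x ** 3 - 19.2 * x ** 2 + 12.9 * x + 20.538

print(newton_raphson(f, fprime, 5.0))   # should converge quickly to the single root x = 4
print(newton_raphson(f, fprime, 1.0))   # should approach the double root near x = 2.1, but slowly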

Newton-Raphson divergence examples

Although the Newton–Raphson method converges fast near the root, its global convergence characteristics are poor.

Method         | Assumptions                                                   | Convergence                                                                   | Can find double roots?
Bisection      | f is continuous, root exists in the initial interval         | Yes                                                                           | No
Secant         | f is continuous, root need not exist in the initial interval | Not guaranteed, but will converge if a root is close to the initial interval | Not guaranteed
Newton-Raphson | f is continuous and differentiable                           | Not guaranteed, but will converge if the initial estimate is close to a root | Yes, but may converge slowly

Speed of convergence (single root): Newton-Raphson > Secant > Bisection

Root-finding using Scipy

Check the file scipy_rootfinding.py.
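
scipy_rootfinding.py is not reproduced here; a minimal sketch using SciPy's optimize module, applied to the earlier example polynomial, might look like this (the actual file may differ):

from scipy import optimize

def f(x):
    return x ** 4 - 6.4 * x ** 3 + 6.45 * x ** 2 + 20.538 * x - 31.752

def fprime(x):
    return 4 * x ** 3 - 19.2 * x ** 2 + 12.9 * x + 20.538

# Bracketing methods require an interval with a sign change, e.g. [3, 5]
print(optimize.bisect(f, 3.0, 5.0))    # bisection
print(optimize.brentq(f, 3.0, 5.0))    # Brent's method, usually faster

# Newton-Raphson needs only an initial estimate (fprime is optional)
print(optimize.newton(f, x0=5.0, fprime=fprime))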