How to Graph Functions with Bézier Curves

Oct 13, 2024 Wyatt Sheffield Oct 13, 2024

Since Bézier curves are so ubiquitous in computer graphics, they are a natural choice for plotting the graph of a function. At their core, Bézier curves are polynomial in nature, so unless the graph is for a polynomial function with the same degree as our Bézier curve, we are forced to approximate. In this post, I will detail the thought process behind my implementation of a Bézier function plotter.

Notation and Scope

This post assumes basic familiarity with Bézier curves, a solid foundation in single variable calculus and linear algebra, and too much free time.

Code snippets are written in Go, since I was already developing an application in Go that needed to plot arbitrary functions when I began work on this problem.

Unless stated otherwise,

A real interval on which a function is defined is denoted $I$
A subinterval of $I=[a,b]:a,b\in\R$ is indexed with a subscript and has the property $I_k\subseteq I$
If unspecified, an interval is to be considered closed
A sequence of intervals, $\{I_1, I_2, \dots, I_n\}$ , has the properties
- The union of the sequence is an interval such that
  $I_1=[a,b] \,\mathrm{and}\, I_n=[c,d] \implies \bigcup_{j=1}^{n} I_j = [a,d]$
- Subsequent intervals share exactly one point:
  $I_k \cap I_{k+1}=\{q\} \text{ where}$ $q = \max{\{I_k\}} = \min{\{I_{k+1}\}}$
A more compact notation for the maximum element of a countable set, $\max{\{x_1, x_2, \dots, x_n\}}$ , may be written as $\max_{i=1}^{n}{\{x_i\}}$ or $\max\limits_{i=1}^{n}{\{x_i\}}$ .
A control point is a vector in cartesian coordinates, $\P\in\R^2$ .
- For a Bézier curve $\B$ of degree $n$ , there are $n+1$ control points that define $\B$ .
- The control points of $\B$ are indexed beginning at zero such that $\{\P_0, \P_1,\dots,\P_n\}$ uniquely determines $\B$ .
- The $x$ and $y$ components of a control point $\P_k$ are denoted $x_{k}$ and $y_{k}$ respectively.
A Bézier curve is a function $\B:[0,1]\to\R^2$ defined by a set of control points $\{\P_0,\P_1,\dots,\P_n\}$ in the usual way. We may think of $\B$ as the parametric vector equation $\B(t) =\begin{bmatrix}x(t) \\ y(t)\end{bmatrix}$ . Since $t$ always ranges from 0 to 1, and we typically care about the entire curve as $t$ increases from 0 to 1, we will omit the $t$ and simply write $\B$ unless a specific value of $t$ is needed.

A Baseline: Piecewise Linear Interpolation

The easiest way to plot a function $f$ is to use small linear segments. This is achieved by subdividing the interval $I$ into $n$ subintervals. If subinterval $I_k = [a,b]$ then the linear approximation of $f$ on $I_k$ is given by the line segment $\overline{\P_0 \P_1}$ where $\P_0=(a, f(a))$ and $\P_1=(b, f(b))$ . Depending on what machinery is being used to do the actual drawing, it may be useful to have the equation for the entire line passing through these points:

\begin{align} y - y_1 &= m(x - x_1)\\ y - f(a) &= \frac{f(b)-f(a)}{b-a}(x - a) \\ y &= \frac{f(b)-f(a)}{b-a}(x-a)+f(a) \end{align}

The hard truth is that piecewise linear interpolation is often the only sane way to plot an arbitrary curve. Indeed, Bézier curves are most frequently rendered with De Casteljau’s algorithm, using many small timesteps to trace out sufficiently small linear approximations of the Bézier curve such that no sharp angles are visible.

If it’s linear approximations all the way down, then why are we bothering with Béziers? First, Bézier curves are well-studied and well-behaved. Second, we are trying to plot an arbitrary function $f$ . That means that we have no knowledge about $f$ ’s properties. It could be incredibly expensive to compute values of $f$ , and when using the thousands of sample points necessary for a clear picture, the computer may take an inexcusably long time to plot the curve.

Quadratic and cubic Bézier curves are simple enough to plot efficiently, and flexible enough to approximate complex trajectories. If we can get away with fewer sample points than the linear approximation, but still match the detail of the mystery function $f$ , we will have achieved our goal. The approximation will with minimal samples give a reasonable simulacrum of the infinitely many values $f$ may attain.

Initial Failures with Cubics

I began my first attempt with cubic Bézier curves, as I was most familliar with them due to their ubiquity in vector graphics. Following from piecewise linear interpolation, we certainly want the endpoints of our Bézier curve to be exactly coincident with the endpoints of our function on the interval. Just as the accuracy of the linear approximation improves with a higher number of smaller segments, so too will the accuracy of a piecewise Bézier approximation.

In this section, any Bézier curve is considered to be cubic unless stated otherwise, and will be of the form

\B(t) = (1-t)^3\P_0 + 3(1-t)^2t\P_1 + 3(1-t)t^2\P_2 + t^3\P_3

The Goal

With linear approximation, we are limited to matching the function values at the endpoints of an interval. Cubic Bézier curves, being curves, may have a non-constant first derivative. By carefully choosing our control points we may match not only the function value at the endpoints, but also the derivative of the function at the endpoints. Furthermore, since cubic Béziers are polynomials of degree three, they also have a well-defined and easy to compute second derivative!

Being able to match the zeroth (more familliarly just the function value), first, and second derivative gives a highly accurate local approximation. See the graphs below in which the linear, quadratic, and cubic approximations are superimposed over the actual functions to get an idea of how powerful this technique can be.

The Problem

The zeroth derivative is trivial. On an interval $I=[a,b]$ , the endpoints of the bezier approximation are $\P_0 = (a, f(a))$ and $\P_3 = (b, f(b))$ .

The first derivative is more involved but still doable. Recall that $\B(t) = (1-t)^3 \P_0 + 3 t (1-t)^2 \P_1 + 3 t^2 (1-t) \P_2 + t^3 \P_3$ . Then the first derivative is given by

\B'(t) = 3(1-t)^2(\P_1 - \P_0) +6t(1-t)(\P_2-\P_1) + 3t^2(\P_3 - \P_2)

Notice that $\B'(t)$ at the endpoints $t=0$ and $t=1$ are particularly easy to calculate! The middle term is zero in both cases, as are the last and first when $t=0$ and $t=1$ respectively.

\begin{array} ~\B'(0) = 3(\P_1 - \P_0) & & \B'(1) = 3(\P_3 - \P_2) \end{array}

Here we find the first hint of an issue. The derivative we just found, $\B'$ is incompatible with $f'$ for two reasons. First, the codomain of $\B$ , $\R^2$ does not match the codomain of $f$ , $\R$ . Second, $f' = \dv{y}{x}$ while $\B'=\dv{\B}{t}$ . Thankfully, this can be resolved by a straightforward result from multivariable calculus via the chain rule:

\dv{y}{x} = \frac{\dv*{y}{t}}{\dv*{x}{t}}

Hence, in order to match the derivatives of $f$ at the endpoints of our interval $[a,b]$ we wish to solve

f'(a)=\left.\frac{\dv{t}\B_y(t)}{\dv{t}\B_x(t)}\right|_{t=0} =\frac{3(y_{1} - y_{0})}{3(x_{1} - x_{0})} =\frac{y_{1} - y_{0}}{x_{1} - x_{0}}

Since $\P_0$ is known and $f'(a)$ is calculated numerically, we need only solve for $\P_1$ . Examining the above result, we can see that $\P_1$ must lie along the line tangent to $f$ at $x=a$ . Similarly, given $\P_3$ , we can determine that $\P_2$ must lie along the line tangent to $f$ at $x=b$ .

There are two degrees of freedom left: the exact position of $\P_1$ on its constraining tangent line, and the same for $\P_2$ . Matching the second derivative will force these positions and fully constrain the solution. We can find $\B''(t)$ relatively easily,

\B''(t) = 6(1-t)(\P_2 - 2 \P_1 + \P_0) + 6t(\P_3 - 2 \P_2 + \P_1)

Where again the derivatives at zero and one are nice

\begin{array} ~\B''(0) = 6(\P_2 - 2 \P_1 + \P_0) && \B''(1)=6(\P_3 - 2 \P_2 + \P_1) \end{array}

Once again, we have a $\dv[2]{\B}{t}$ , but we want $\dv[2]{y}{x}$ . Unfortunately for us, it is not quite so simple a transformation for the second derivative as it is for the first. Using the formula

\dv[2]{y}{x}=\frac{\dv{t}\left[\dv{y}{x}\right]}{\dv{x}{t}}

we can substitute our previous result for $\dv{y}{x}$ and solve:

\begin{align} \dv[2]{y}{x}&=\frac{\dv{t}\left[\dv{y}{x}\right]}{\dv{x}{t}} \\ &=\frac{\dv{t}\left[\frac{\dv*{y}{t}}{\dv*{x}{t}}\right]}{\dv{x}{t}} \\ &=\frac{\left(\frac{\dv{x}{t}\dv[2]{y}{t} - \dv{y}{t}\dv[2]{x}{t}}{\left(\dv{x}{t}\right)^2}\right)}{\dv*{x}{t}} \\ &=\frac{\dv{x}{t}\dv[2]{y}{t} - \dv{y}{t}\dv[2]{x}{t}}{\left(\dv{x}{t}\right)^3} \end{align}

Rather than work out the entire expression symbolically, we can leverage the fact that we still only really care about the endpoints at $t=0$ and $t=1$ . Using the previously found expressions for the first and second derivatives at these points, we replace the $\P$ terms with either $x$ or $y$ as called for by the above formula.

\begin{align} \left.\dv[2]{y}{x}\B(t)\right|_{t=0} &= \frac{[3(x_1-x_0)][6(y_2 - 2y_1 + y_0)] - [3(y_1-y_0)[6(x_3-2x_1+x_0)]}{[3(x_1-x_0)]^3}\\ \left.\dv[2]{y}{x}\B(t)\right|_{t=1} &= \frac{[3(x_3-x_2)][6(y_3 - 2y_2 + y_1)] - [3(y_3-y_2)[6(x_3-2x_2+x_1)]}{[3(x_3-x_2)]^3}\\ \end{align}

Now, on our interval $I=[a,b]$ , we want the following to be true:

\begin{align} \B(0) &= \P_0 = \colvec{x_0 \\ y_0} = \colvec{a\\f(a)}\\ \B(1) &= \P_3 = \colvec{x_3 \\ y_3} = \colvec{b\\f(b)}\\ f'(a) &= \frac{y_1 - y_0}{x_1 - x_0}\\ f'(b) &= \frac{y_3 - y_2}{x_3 - x_2}\\ f''(a) &= 2\frac{(x_1-x_0)(y_2 - 2y_1 + y_0) - (y_1-y_0)(x_3-2x_1+x_0)}{9(x_1-x_0)^3}\\ f''(b) &= 2\frac{(x_3-x_2)(y_3 - 2y_2 + y_1) - (y_3-y_2)(x_3-2x_2+x_1)}{9(x_3-x_2)^3} \end{align}

The first two equations directly give us the values of $\P_0$ and $\P_3$ . The remaining four equations give us the relationships between the remaining four unknowns: $\P_1 = \colvec{x_1\\y_1}$ and $\P_2 = \colvec{x_2\\y_2}$ . Unfortunately, this is a nonlinear system that I do not know how to solve. I tried automating the process with SymPy’s nonlinsolve, but even after 18 hours no progress was made. I managed to reduce the system to only two unknowns — the lengths of the handles $l_1^2 = \lVert \P_1 - P_0 \rVert^2$ and $l_2^2 = \lVert \P_3 - P_2 \rVert^2$ , but it was still a quadratic nonlinear system that neither SymPy nor I could solve.

 1from sympy import symbols, solve
 2
 3################################################################################
 4#                          First form, four unknowns
 5################################################################################
 6# variables to solve for
 7x1, x2, y1, y2 = symbols('x1, x2, y1, y2', real=True)
 8# x and y components of point a and b, derivatives at those points, and second
 9# derivatives.
10xa, xb, ya, yb, da, db, dda, ddb = symbols('xa, xb, ya, yb, da, db, dda, ddb',
11                                           real=True)
12eq1 = da*x1 - y1 - xa*da + da
13eq2 = y2 - yb*x2 - db + xb*db
14eq3 = (3*dda*(x1-xa)**3)/2 -(x1 - xa)*(y2 - 2*y1 + ya) +(y1 - ya)*(x2 - 2*x1 + xa)
15eq4 = (3*ddb*(xb-x2)**3)/2 -(xb - x2)*(yb - 2*y2 + y1) +(yb - y2)*(xb - 2*x2 + x1)
16system = [eq1, eq2, eq3, eq4]
17print(nonlinsolve(system, [x1, x2, y1, y2]))
18
19################################################################################
20#                          Second form, two unknowns
21################################################################################
22
23
24l1, l2 = symbols('l1, l2', real=True)
25a, b, c, d, k0, k1 = symbols('a, b, c, d, k0, k1', real=True)
26x, y = symbols('x, y', real=True)
27eq1 = (2/(3*(a**3)*(l1**1)))*(a*(y)+b*(x)+l2*(c*b - a*d)) - k0
28eq2 = (2/(3*(c**3)*(l2**1)))*(c*(y)+d*(x)+l1*(c*b - a*d)) - k1
29system = [eq1, eq2]
30print(solve(system, [l1, l2], dict=True))

Naïve Approach with Quadratics

Illegal Control Points

An Intelligent Heuristic-Based Approach

Say we wish to approximate a function $f$ with a quadratic Bézier curve $\B\left(\P_0, \P_1,\P_2\right)$ on a closed interval $I_0=[a, b]$ . As we have previously seen, $\B$ is fully determined by the endpoints of $f$ on the interval $I$ and the intersection of the lines tangent to $\left(a, f(a)\right)$ and $\left(b, f(b)\right)$ . For functions that are approximately linear or quadratic on the interval, $\B$ is a good approximation of $f$ . On the other hand, functions with asymptotes, undefined values, high frequency oscillations, and in general large changes in curvature are poorly approximated.

We can see that naïvely distributing endpoints uniformly along the x-axis is inefficient: large flat areas of functions are appointed the same resolution as highly detailed portions. With the same number of points, we could achieve a more accurate approximation by moving unnecessary points from “smoother” parts of the function to the “noisier” parts where they can do more legwork.

Badness

The goal of my approach is to minimize badness. On the left is an example of an approximation with a high amount of badness; on the right is an approximation with acceptably low badness. We can immediately notice that the distribution of points along the x-axis is uneven for the better approximation: as desired, more points are given to noisier portions of the function.

Though we do not yet know how to quantify badness, we can imagine how it should behave. From there, we can give a proper definition. To that end, take some closed interval $I$ on which $f$ is defined. Then badness is given by some function, beta, $\beta_f:[a, b]\to\R$ such that $\beta_f(I)$ is the badness value of $I$ for the function $f$ . Note that I have made no mention of the curve $\B$ in constructing the badness function. This is because we assume that $\B$ is implicitly defined by the function and interval in question. The two endpoints of $\B$ are $(a, f(a))$ and $(b, f(b))$ , and the control point can be found by intersecting the tangent lines at these endpoints as we have seen.

As we saw from the piecewise linear interpolation approach, more segments tends to give higher accuracy. The same obviously¹ holds for Bézier curves of higher degree. It stands to reason, then, that to achieve a better approximation, we can split a single high-badness Bézier curve into two less-bad Béziers on the same interval.

As we can see from the diagram above, we are now guaranteed to match the function exactly in at least one point on the open interval $(a, b)$ . Specifically, if $I=[a,b]=[a,q]\cup[q,b]$ with $[a,q]\cap[q,b]=\{q\}$ then there exists $t\in[0,1]$ such that $f(x)= \B(t)$ whenever $x=q$ .

Since “badness” is very much open to interpretation, we can select a heuristic or combination of heuristics to approximate the platonic ideal of true, perfect, beautiful badness². For the uninitiated, a heuristic technique relies on making a simple model of a complex problem that is just “good enough”. Heuristics are everywhere in applied mathematics, computer science, and artificial intelligence. Most real-world processes are too complex for a human to grapple with, so heuristics are used to get an approximate solution which can then be refined.

An Algorithm for Choosing Points

Say we wish to better approximate a function with $n$ intervals and that this function has already been approximated across $k \lt n$ intervals. Of these $k$ intervals, choose $I_{\Omega}$ such that $\beta(I_{\Omega}) = \mathop{\max}_{j=1}^{k}{\,\beta(I_j)}$ . That is, we select the interval with the greatest badness. We then subdivide $I_{\Omega}$ into two (or perhaps more) sequential intervals $\{I_{\Omega 1}, I_{\Omega 2}\}$ and define new Bézier curves $\B_{\Omega 1}$ and $\B_{\Omega 2}$ .

The above process is iterated, each time adding an interval by way of subdivision, until the number of points/intervals reaches a sufficient value $n$ . The exact value of $n$ is not known, but we would like it to satisfy $\hat{\beta}(\{I_0, I_1, \dots, I_n\})<\varepsilon$ where $\hat\beta$ is some function that combines the badness of all the intervals and $\varepsilon$ is a predetermined maximum allowable badness.

Implementation

A linked list is a natural choice to represent a sequence of intervals, since it is easy to insert and remove nodes at arbitrary positions - an operation we will perform for each subdivision. I chose to use a doubly-linked list, though a singly linked list would work fine so long as the first node can be found and modified.

Each node of the list represents a closed interval has the following members:

 1type Interval struct {
 2    F       func(float32) float32   // the function of interest
 3    Beg     Point                   // the three control
 4    End     Point                   // points for a quadratic
 5    Ctrl    Point                   // Bézier curve
 6    BegDx   float32                 // Derivative of F at Beg.X
 7    EndDx   float32                 // Derivative of F at End.X
 8    Badness float32                 // computed badness value
 9    Level   int                     // number of times a split has occured
10    Prev    *Interval               // pointer to previous interval
11    Next    *Interval               // pointer to next interval
12    Box     ViewBox                 // dimensions of the window in which we are plotting
13}

There are obvius improvements that can be made for better memory usage (note that Box and F are identical for all intervals in the list), but I prefer to be explicit for clarity.

The primary operation we will perform on an interval is a Split, which will divide it into two or more subintervals. I have written Split such that it can make arbitrarily many cuts, with the default being to split the interval exactly in half.

 1func (iv *Interval) Split(cuts ...float32) ([]*Interval, error) {
 2    if cuts == nil {
 3        cuts = make([]float32, 1)
 4        cuts[0] = (iv.Beg.X + iv.End.X) / 2
 5    } else {
 6        if cuts[0] <= iv.Beg.X {
 7            return nil, fmt.Errorf("%f is outside the range (%f, %f)", cuts[0], iv.Beg, iv.End)
 8        }
 9        if cuts[len(cuts)-1] >= iv.End.X {
10            return nil, fmt.Errorf("%f is outside the range (%f, %f)", cuts[len(cuts)-1], iv.Beg, iv.End)
11        }
12        slices.Sort(cuts)
13    }
14    newIntervals := make([]*Interval, len(cuts)+1)
15    iv.Level += 1
16    for i, x := range cuts {
17        // dc is the mean of dl and dr, derivatives calculated as
18        // an approach from the left and right respectively
19        dc, dl, dr := Deriv(iv.F, x, iv.End.X-iv.Beg.X)
20        a := &Interval{
21            F:     iv.F,
22            Beg:   iv.Beg,
23            End:   f32.Pt(x, iv.F(x)),
24            BegDx: iv.BegDx,
25            EndDx: dc,
26            Prev:  iv.Prev,
27            Next:  iv,
28            Box:   iv.Box,
29            Level: iv.Level,
30        }
31        iv.BegDx = dc
32        iv.Prev = a
33        iv.Beg = a.End
34        if a.Prev != nil {
35            a.Prev.Next = a
36        }
37        if math.IsNaN(float64(dc)) {
38            a.EndDx = dl
39            iv.BegDx = dr
40        }
41        a.CalcCtrlPt()
42        newIntervals[i] = a
43    }
44    iv.CalcCtrlPt()
45    newIntervals[len(cuts)] = iv
46    return newIntervals, nil
47}

I admit to having committed a grave sin: the CalcCtrlPt function not only calculates the control point, but also calculates the badness for the interval. I should pick a better name, but the reasoning will become clear shortly.

Now that we are able to split intervals, we want to be able to find $I_\Omega$ , the interval with greatest badness. A max heap offers an efficient and elegant solution that prevents us from having to traverse the list or perform a sorting operation each time we modify its contents. Since Split returns a slice of freshly chopped intervals, we can push those onto our heap and then pop off the next worst interval. The heap is implemented with Go’s builtin container/heap module:

 1type IntervalHeap []*Interval
 2
 3func (h IntervalHeap) Len() int           { return len(h) }
 4func (h IntervalHeap) Less(i, j int) bool { return h[i].Badness > h[j].Badness } // make it max!
 5func (h IntervalHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
 6
 7func (h *IntervalHeap) Push(x any) {
 8    iv := x.(*Interval)
 9    *h = append(*h, iv)
10}
11
12func (h *IntervalHeap) Pop() any {
13    old := *h
14    n := len(old)
15    x := old[n-1]
16    old[n-1] = nil
17    *h = old[0 : n-1]
18    return x
19}

and the intelligent interval subdivision is accomplished easily:

 1func (v *VectorPlotLine) PlotSmoothSmart(f func(float32) float32, box ViewBox, n int) float32 {
 2    intervals := NewInterval(box, f)
 3    queue := &IntervalHeap{intervals}
 4    heap.Init(queue)
 5    points := 2
 6    for points < n {
 7        iv := heap.Pop(queue).(*Interval)
 8        twins, _ := iv.Split()
 9        for _, s := range twins {
10            heap.Push(queue, s)
11        }
12        points++
13    }
14    ivp := intervals
15    for ivp.Prev != nil {
16        ivp = ivp.Prev
17    }
18    // other stuff...
19}

Heuristics

Length of Handles

Distance From Control Point to Curve

Width of Interval

Difference of Derivatives

Finishing touches

Out of bounds
NaN values

I am a mathematician. I am required by law to use obviously for statements that I do not wish to prove. ↩︎
Beutiful badness is going to be the name of my new experimental bubblegum noise band. I still need a synth player and a sledgehammer swinger. Shoot me an email with your demo tape. ↩︎