An Extension of a Probability Puzzle

Consider a unit square and uniformly pick two points on its perimeter. What is the probability that the line that connects these two points has length less than 1?

I first came across this problem in the accompanying YouTube video and decided to investigate it in a bit more depth. If you have ever wondered about the sum of squares of uniform distributions, or if you are an avid shape-rotator, this one’s for you.

We shall call these points \(a\) and \(b\), the line connecting them \(L\), and the length of \(L\) shall be \(l\). We are interested in the probability that \(l\) is less than 1, written \(P(l<1)\).

The interesting part of the original problem arises when our points are chosen on adjacent sides: when both points are chosen on the same side of the square we find that \(P(l<1)=1\), and when they are chosen on opposite sides of the square \(P(l<1)=0\).

A geometric representation of the problem: a quarter circle centered at the lower left corner of a square, shown with three different radii.

To find \(l\) when \(a\) and \(b\) are on adjacent sides of the square, it is convenient to let these sides be \([0,1]\) on the x-axis and y-axis respectively. We may choose \(a, b \in [0,1]\) uniformly; WLOG let \(a\) be on the x-axis and \(b\) on the y-axis. Then by the Pythagorean theorem, \(l = \sqrt{a^2 + b^2}\). This is closely related to the Cartesian equation of a circle, \(x^2 + y^2 = r^2\), with radius 1 in this case. Thus \(P(l<1)\) is equivalent to the probability that a randomly selected point \((a,b)\) in the unit square lies less than one unit from the origin, as shown in the figure by the three white points. That probability is the ratio of the area of a quarter unit circle inscribed in the unit square to the area of the square, namely \(P(l<1)=\frac{\pi}{4}\).
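This value is easy to sanity-check numerically. A minimal Monte Carlo sketch (variable names are my own, and the estimate carries sampling noise):

```python
from math import hypot, pi
from random import random, seed

seed(0)  # fixed seed for a reproducible estimate

trials = 1_000_000
# a lies on the x-axis, b on the y-axis; count connecting lines shorter than 1
hits = sum(hypot(random(), random()) < 1 for _ in range(trials))
estimate = hits / trials
print(estimate, pi / 4)  # the estimate should be close to pi/4 ~ 0.7854
```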

Extending the Problem

What if we wish to know \(P(l<\frac{1}{2})\), or perhaps \(P(\frac{1}{e} < l < 1.2)\)? What if we want to find the probability for any arbitrary value of \(l\)? This requires us to find the probability density function (pdf) \(f(x)\), and the related cumulative distribution function (cdf) \(F(x)=\int_{-\infty}^{x}f(t) \,dt\).

Computer simulation

The first step was to run a simulation in order to get a feel for the shape of the problem. Rather than looking at the distribution of the length of the connecting line, it seemed easier to look at the square of its length, since that quantity is simply the sum \(a^2 + b^2\), without that pesky square root. The simulation was written in Python:

from random import random

def get_len_adjacent():
    """return the square of the length."""
    x = random()
    y = random()
    return x**2 + y**2

if __name__ == '__main__':
    trials = 1000000
    for _ in range(trials):
        print(get_len_adjacent())

After dumping the output to a file and analyzing the histogram we find a…

A histogram of the simulation results. It appears to be piecewise, with the first part a uniform distribution and the second looking like an exponential decay.

Now what the heck is that distribution? It appears uniform on [0, 1] but then falls off. Clearly I forgot about some quirk of Python’s random() implementation. Let’s try again by first loading up two lists with random numbers and then reading them back for the data.

from random import random

if __name__ == '__main__':
    trials = 1000000
    x = [random() for _ in range(trials)]
    y = [random() for _ in range(trials)]
    for a, b in zip(x, y):
        print(a**2 + b**2)

There. Now let’s take a look at the real histogram of proper data.

Exactly the same distribution as before. Ok. Clearly this distribution is a strange one. Let’s see if statistical analysis provides us with the same result.
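Incidentally, the geometric argument from the original problem already pins down part of this histogram: the fraction of samples with \(a^2+b^2 < 1\) should be \(P(l<1) = \frac{\pi}{4} \approx 0.785\), which is a quick way to rule out any quirk of random(). A small sketch (assuming nothing beyond the standard library):

```python
from math import pi
from random import random, seed

seed(1)  # reproducible run

trials = 1_000_000
# fraction of samples landing in the uniform-looking part of the histogram
frac_below_1 = sum(random()**2 + random()**2 < 1 for _ in range(trials)) / trials
print(frac_below_1, pi / 4)  # the two values should agree to a few decimal places
```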

A Pure Statistical Approach

First we define the independent random variables \(X, Y \sim U(0,1)\), where \(X\) represents a point chosen uniformly on the x-axis interval \([0,1]\), and similarly for \(Y\). We wish to find the distribution of the sum of the squares of these random variables, \(Z = X^2 + Y^2\), namely \(P(Z < z)\) where \(z \in [0,2]\).

Next, we must find the distribution of \(X^2\) and \(Y^2\). The probability density function \(f_X\) and the cumulative distribution function \(F_X\) of the uniformly distributed \(X\) are given by

\[ f_X(x) = \begin{cases} 1 & \quad x \in [0,1] \\ 0 & \quad \text{otherwise} \end{cases} \] \[ F_X(x) = \int_{-\infty}^{x} f_X(t) \, dt = \begin{cases} 0 & \quad x < 0 \\ x & \quad x \in [0,1] \\ 1 & \quad x > 1 \end{cases} \]

Now we may make the substitution \(U = X^2\) and find \[ F_U(u) = P(U \le u) = P(X^2 \le u) = P\left(X \le \sqrt{u}\right) = F_X\left(\sqrt{u}\right) = \begin{cases} 0 & \quad u < 0 \\ \sqrt{u} & \quad u \in [0,1] \\ 1 & \quad u > 1 \end{cases} \] Furthermore, since \(Y\) follows an identical distribution we may make the substitution \(V = Y^2\) and define \(F_V(v)\) similarly. Differentiating the above gives \(f_U(u) = \frac{1}{2\sqrt{u}}\) on the interval \([0,1]\).
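The derived \(F_U(u) = \sqrt{u}\) is easy to check against the empirical cdf of squared samples (a throwaway sketch; the evaluation points chosen here are arbitrary):

```python
from random import random, seed

seed(2)  # reproducible run

trials = 200_000
u_samples = [random()**2 for _ in range(trials)]

# empirical CDF of U = X^2 at a few points, versus the derived F_U(u) = sqrt(u)
checks = {u: sum(s <= u for s in u_samples) / trials for u in (0.04, 0.25, 0.64)}
for u, empirical in checks.items():
    print(u, empirical, u**0.5)
```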

Finally, to find \(P(Z < z) = P(U+V < z) = P(X^2 + Y^2 < z)\), we must find the distribution of the sum \(Z=U+V\). Since \(X\) and \(Y\) are independent, the joint density factors as \(f_{UV}(u,v)=f_U(u)\,f_V(v)\), giving \[ \begin{align} F_Z(z)&=\int_{-\infty}^{\infty}\int_{-\infty}^{z-u}f_{UV}(u,v)\, dv\, du \\ &= \int_0^1 F_V(z-u) f_U(u) \, du \quad ({\small \text{N.B. }} z-u=v)\\ &= \int_0^1 \left(\sqrt{z-u}\right)\left(\frac{1}{2\sqrt{u}}\right) \, du \end{align} \]

This integral is a tricky one, and it is not immediately clear that a sharp cusp will be present as we would expect from the histogram. However, working through it (or more realistically, asking Wolfram Alpha) we find

\[ \begin{align} F_Z(z) &= \int_0^1 \frac{\sqrt{z-u}}{2\sqrt{u}} \, du \\ &= \frac{\pi}{2}z + \sqrt{z-1} - z\arctan{\left(\sqrt{z-1}\right)} \end{align} \] Ignoring for a moment1 that the leading factor should be \(\frac{\pi}{4}\) rather than \(\frac{\pi}{2}\), this cdf matches the cumulative sum of the simulation data essentially perfectly (the \(\chi^2\) statistic was approximately \(7.2 \times 10^{-8}\) with df=39 from 10 million samples). As for the sharp cutoff, we note that the terms \(\sqrt{z-1} - z\arctan{\left(\sqrt{z-1}\right)}\) are purely imaginary on \(z\in [0,1)\) and only contribute to the real value of \(F_Z\) for \(z\in [1,2]\).

A Geometric Approach

A geometric representation of the problem: a quarter circle centered at the lower left corner of a square, shown with three different radii.

Referring back to the diagram, the origin of the distribution becomes more intuitive. As \(z\) increases (that is, as the square of the radius of the circle increases), the area of the square covered by the circle grows linearly with \(z\). Since the desired probability is equivalent to the ratio of the covered area of the square to the area of the entire square, this directly gives us the cumulative distribution function. The area of a quarter circle is \(\frac{\pi r^2}{4} = \frac{\pi z}{4}\) and the area of the square is 1, giving us the cumulative distribution function \(F_Z(z) = \frac{\pi}{4}z\) for \(z \in [0,1)\), as expected.

Once we choose a point in the square such that the radius is larger than one, we have a more complicated shape to deal with. We wish to find the covered area of the square, excluding any parts of the quarter circle that extend beyond the edges of the square.

A diagram showing a quarter circle centered at the corner of a square. The circle's radius is greater than the side length of the square, but less than the diagonal of the square. There are two areas of the quarter circle that do not intersect with the square.

Again, the area of a large quarter circle is easy, so all that remains is finding the area of the two pieces outside the square. By a symmetry argument we can see that these two pieces are not only of equal area, but must also be congruent up to reflection. Translating one of these pieces, we see that the two together comprise a circular segment whose area we can calculate.

A diagram showing that when we move one of these "leftover pieces" we can join them by their shared perpendicular (the "sagitta") to form a circular segment.

Now we can work on deriving the area of the red circular segment.

A focused view of the circular segment we generated, with radius r, apothem d, sagitta s, and chord 2c.

The figure is labelled with radius \(r\), chord length \(2c\), apothem \(d\), sagitta \(s\), and angle \(\theta\). To find the area of the circular segment, we will first find the area of the entire sector, and then subtract twice the area of the magenta right triangle formed by \(d\), \(r\), and \(c\).

Recall that we created the apothem from the original blue square of side length 1, so \(d = 1\). The radius \(r\) is determined by the initial choice of \(x\) and \(y\), so we wish to solve for \(c\) and \(\theta\). By the Pythagorean theorem and some basic trigonometry we find that \(c = \sqrt{r^2 - d^2} = \sqrt{r^2 - 1}\) and \(\theta = \arctan{\left(\frac{c}{d}\right)} = \arctan{\left(\sqrt{r^2 - 1}\right)}\)2.

The area of a circular sector is given by the ratio of the central angle \(\theta\) to a full rotation multiplied by the area of a circle: \(A_{\text{sector}} = \frac{\theta}{2\pi}\pi r^2 = \frac{1}{2}\theta r^2\). In our case, the central angle is \(2\theta\) and the radius is r, giving \(A_{\text{sector}} = \theta r^2\).

From \(A_{\text{sector}}\) we wish to subtract two copies of \(\triangle cdr\). Thus we have \( A_{\text{sector}} - 2A_{\triangle} = A_{\text{sector}} - 2\left(\frac{1}{2}cd\right) = A_{\text{sector}} - c \), since \(d = 1\).

Putting everything together, we arrive at the area of the circular segment \[ A = \theta r^2 - c = r^2 \arctan{(\sqrt{r^2 - 1})} - \sqrt{r^2 - 1} \]
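Since each leftover piece is the part of the quarter circle past one edge of the square, the segment area can be cross-checked as \(2\int_1^r \sqrt{r^2-x^2}\,dx\). A quick numerical comparison (the helper names are my own):

```python
from math import atan, sqrt

def segment_area(r):
    """Closed form: area of the circular segment at distance d = 1 from the centre."""
    c = sqrt(r * r - 1)
    return r * r * atan(c) - c

def segment_area_numeric(r, steps=100_000):
    """Midpoint-rule estimate of 2 * integral from 1 to r of sqrt(r^2 - x^2) dx."""
    h = (r - 1) / steps
    total = 0.0
    for i in range(steps):
        x = 1 + (i + 0.5) * h
        total += sqrt(max(r * r - x * x, 0.0)) * h
    return 2 * total

results = {r: (segment_area(r), segment_area_numeric(r)) for r in (1.1, 1.25, 1.4)}
for r, (exact, numeric) in results.items():
    print(r, exact, numeric)
```

At \(r=\sqrt{2}\) the formula gives \(\frac{\pi}{2}-1\), exactly the quarter circle's area minus the unit square, as it should.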

Finally, we subtract this segment from the quarter circle, leaving us with the intersection of the square and the quarter circle, which is exactly the cumulative probability we desire. \[ \begin{align} A &= \frac{\pi r^2}{4} - \left( r^2 \arctan{\left(\sqrt{r^2 - 1}\right)} - \sqrt{r^2 - 1} \right) \\ &= \frac{\pi}{4}r^2 +\sqrt{r^2 - 1} - r^2 \arctan{\left(\sqrt{r^2 - 1}\right)} \end{align} \] Recall that the random variable \(Z\) is the sum of the squares of \(X\) and \(Y\); thus for any specific instance \(z\) of \(Z\), we may consider \(z\) to be the square of the radius \(r\) used in the geometric approach. Making the substitution \(z = r^2\), we may breathe a sigh of relief as we see that our cumulative distribution function is what we hoped to find: \[ F_{Z}(z) = \frac{\pi}{4}z +\sqrt{z - 1} - z\arctan{\left(\sqrt{z - 1}\right)}, \quad z \in [1,2] \]
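Putting the two regimes together, the full cdf can be compared against the earlier simulation in a few lines (a sketch; the Monte Carlo side carries sampling noise):

```python
from math import atan, pi, sqrt
from random import random, seed

def cdf_z(z):
    """Piecewise cdf of Z = X^2 + Y^2 for independent X, Y ~ U(0, 1)."""
    if z <= 0:
        return 0.0
    if z <= 1:
        return pi / 4 * z  # uniform regime: quarter circle fits inside the square
    if z <= 2:
        s = sqrt(z - 1)  # circle spills past the square's edges
        return pi / 4 * z + s - z * atan(s)
    return 1.0

seed(3)  # reproducible run
trials = 1_000_000
samples = [random()**2 + random()**2 for _ in range(trials)]

comparison = {}
for z in (0.5, 1.0, 1.5, 2.0):
    empirical = sum(s < z for s in samples) / trials
    comparison[z] = (empirical, cdf_z(z))
    print(z, comparison[z])
```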

Conclusion

This is one instance of uncountably many problems in mathematics that benefit from a change of perspective. The analytic solution found using only statistics ended with a particularly nasty integral (see footnote 1), and did not offer much insight into the bizarre nature of the distribution we found. The geometric approach, on the other hand, only required us to draw a circle, move some shapes around, and use high school trigonometry. Furthermore, we can easily see that the boundary where the distribution transitions from a uniform to a nonuniform regime is caused by the circle no longer being fully contained within the square.

I was not able to find much information about the sum of squared uniform distributions in the literature, certainly not available for free on the clearnet. A natural extension of this problem is to examine the behavior of the sum of squares of three or more random variables. That is, given \(X_i \sim U[0,1]\) for all \(i \in \{1,\dots,n\}\), what distribution does \(Y\) follow if \[ Y = \sum_{i=1}^n X_i^2 \quad ?\] The analysis is trivial and left as an exercise to the reader.
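Before reaching for the analysis, one can at least check the first moment by simulation: since \(E[X_i^2] = \frac{1}{3}\), the mean of \(Y\) should be \(\frac{n}{3}\). A sketch for \(n = 3\) (function name is my own):

```python
from random import random, seed

def sample_sum_of_squares(n):
    """One draw of Y = X_1^2 + ... + X_n^2 for independent U(0,1) variables."""
    return sum(random()**2 for _ in range(n))

seed(4)  # reproducible run
trials = 500_000
n = 3
mean_y = sum(sample_sum_of_squares(n) for _ in range(trials)) / trials
print(mean_y)  # E[Y] = n/3, so roughly 1.0 for n = 3
```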


  1. Update: I believe the error is in the setup of the integral rather than in the integration. The substitution \(F_V(z-u) = \sqrt{z-u}\) only holds when \(z-u \in [0,1]\); for \(z > 1\) and \(u < z-1\) we have \(F_V(z-u) = 1\), so the integral must be split at \(u = z-1\). Working through the split integral recovers the \(\frac{\pi}{4}\) factor found in the geometric approach.

  2. Note that we could have just as easily chosen \(\theta = \arccos{\left(\frac{1}{r}\right)}\), but let’s try to match this up as closely as possible to the previous result.