Roots of Polynomials

The Factor Theorem

The factor theorem states that if a non-zero polynomial $f(X) \in F[X]$ has a root at $x \in F$, i.e. $f(x) = 0$, then we can write $f(X) = (X - x) \cdot g(X)$ for some polynomial $g(X)$.

Theorem. The Factor Theorem

Let $f(X) \in F[X]$, with $f \neq 0$, and let $x \in F$ be such that $f(x) = 0$.

Then there exists $g(X) \in F[X]$ such that $f(X) = (X - x) \cdot g(X)$.

We first consider a special case, then use this to prove the general case.

Proof.

Consider the two cases:

Case x = 0. We first show the theorem when $x = 0$, i.e. when $f(0) = 0$; the claim is then that $f(X) = (X - 0) \cdot g(X) = X \cdot g(X)$. To see this, observe that if $f(X) = \sum_{i=0}^{d} c_{i} \cdot X^{i}$ then $f(0) = c_{0}$, and since $f(0) = 0$ we have $c_{0} = 0$. Therefore $f(X)$ is of the form: $f(X) = c_{1} \cdot X + \dots + c_{d} \cdot X^{d} = X \cdot (c_{1} + \dots + c_{d} \cdot X^{d-1})$. If we define $g(X) = c_{1} + \dots + c_{d} \cdot X^{d-1}$, we see that: $f(X) = X \cdot g(X) = (X - 0) \cdot g(X)$, as desired.
Case x $\neq$ 0. Suppose $f(x) = 0$. Define the polynomial $f^{*}(X) = f(X + x)$ and observe that $f^{*}(0) = f(x) = 0$. Therefore, by the “x = 0” case, we conclude that $f^{*}(X) = X \cdot g^{*}(X)$ for some $g^{*}(X) \in F[X]$. Next write: $f(X) = f((X - x) + x) = f^{*}(X - x) = (X - x) \cdot g^{*}(X - x)$

If we define $g (X) = g^{*} (X - x)$ , then we see that:

$f (X) = (X - x) \cdot g^{*} (X - x) = (X - x) \cdot g (X)$
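The factorisation can also be computed directly. The following is a minimal sketch, assuming arithmetic in the prime field with $p = 7$ (an example choice, not fixed by the text) and using synthetic division rather than the shifting argument of the proof; the names `evaluate`, `divide_by_root` and `multiply` are hypothetical helpers introduced only for illustration.

```python
# Illustrative sketch: coefficients live in the prime field F_p with p = 7
# (an assumed example field). coeffs[i] is the coefficient of X^i.
P = 7

def evaluate(coeffs, x, p=P):
    """Evaluate f(x) mod p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def divide_by_root(coeffs, x, p=P):
    """Given f with f(x) = 0, return g such that f(X) = (X - x) * g(X)."""
    assert evaluate(coeffs, x, p) == 0, "x must be a root of f"
    partial = []
    acc = 0
    # Synthetic division: the running Horner values are the coefficients
    # of g (highest degree first); the final value is the remainder f(x).
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
        partial.append(acc)
    partial.pop()      # drop the remainder, which is 0
    partial.reverse()  # back to lowest-degree-first order
    return partial

def multiply(a, b, p=P):
    """Multiply two polynomials mod p (schoolbook multiplication)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

f = [4, 2, 0, 1]              # f(X) = X^3 + 2X + 4, and f(1) = 7 = 0 in F_7
g = divide_by_root(f, 1)      # g(X) = X^2 + X + 3
product = multiply([(-1) % P, 1], g)  # (X - 1) * g(X)
```

Multiplying the quotient back by $(X - 1)$ recovers the coefficients of $f$, which is exactly the statement of the theorem for this example.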

In fact, the factor theorem holds more generally: no part of the proof requires $F$ to be an integral domain, since at no point did we use the fact that $a \cdot b = 0$ implies $a = 0$ or $b = 0$. Therefore, the factor theorem applies to polynomials over any commutative ring, for instance, polynomials over $\mathbb{Z}_{2^{k}}$.

The Fundamental Theorem

Theorem. The Fundamental Theorem

Let $F$ be an integral domain and let $f(X) \in F[X]$ be a non-zero polynomial of degree $d$. Then $f(X)$ has at most $d$ roots in $F$.

Proof.

We prove this using induction over the degree $d$ of $f (X)$ .

Base d = 0. If $d = 0$ , then $f (X) = c$ for some non-zero $c \in F$ . Clearly, $f (X)$ has $0$ roots.
Step d > 0. Let $f(X)$ be a polynomial of degree $d$. If $f(X)$ has $0$ roots in $F$, then we are done. Otherwise, let $x_{0} \in F$ be a root of $f(X)$. Then, using the factor theorem, we can write $f(X) = (X - x_{0}) \cdot g(X)$ for some polynomial $g(X)$ of degree $d - 1$. By the inductive hypothesis, $g(X)$ has at most $d - 1$ roots, and $X - x_{0}$ has exactly $1$ root. Finally, $f(X)$ has at most $d$ roots because $F$ is an integral domain: if $x \in F$ is such that $f(x) = (x - x_{0}) \cdot g(x) = 0$, then either $x - x_{0} = 0$ or $g(x) = 0$. In other words, $x$ must be a root of either $X - x_{0}$ or $g(X)$, of which there are at most $1 + (d - 1) = d$.
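To see why the integral domain assumption matters in the last step, the following sketch counts roots by brute force. It assumes the example polynomial $X^{2} - 1$ and the example moduli $7$ and $8$ (neither is taken from the text); `find_roots` is a hypothetical helper.

```python
def find_roots(coeffs, modulus):
    """Return all x in Z_modulus with f(x) = 0; coeffs[i] is the coefficient of X^i."""
    roots = []
    for x in range(modulus):
        value = sum(c * pow(x, i, modulus) for i, c in enumerate(coeffs)) % modulus
        if value == 0:
            roots.append(x)
    return roots

f = [-1, 0, 1]                 # f(X) = X^2 - 1, degree 2
roots_mod_7 = find_roots(f, 7)  # Z_7 is a field, hence an integral domain
roots_mod_8 = find_roots(f, 8)  # Z_8 has zero divisors, e.g. 2 * 4 = 0
```

Over $\mathbb{Z}_{7}$ the polynomial has the two roots $1$ and $6$, matching the bound. Over $\mathbb{Z}_{8}$ it has the four roots $1, 3, 5, 7$: the factor theorem still applies there, but the root bound does not.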

The theorem also allows us to conclude that if two polynomials $f_{0}(X)$ and $f_{1}(X)$ of degree $d$ agree on more than $d$ points of $F$, then the polynomials must be equal.

Corollary. Equality of Polynomials

Let $f_{0}(X)$ and $f_{1}(X)$ be polynomials of degree $d$ over $F$. Let $x_{0}, \dots, x_{d} \in F$ be distinct elements. Then: $\forall i \in \{0, \dots, d\} . f_{0}(x_{i}) = f_{1}(x_{i})$ $\iff$ $f_{0}(X) = f_{1}(X)$

Proof.

Define $f(X) = f_{0}(X) - f_{1}(X)$. Note that $f(X)$ is a polynomial in $F[X]$ of degree at most $d$, and observe that $f(x_{i}) = 0$ for all $i \in \{0, \dots, d\}$, so $f(X)$ has at least $d + 1$ roots in $F$. By the fundamental theorem, a non-zero polynomial of degree at most $d$ has at most $d$ roots. Therefore, $f(X)$ must be the zero polynomial, i.e. $f_{0}(X) = f_{1}(X)$. The converse direction is immediate.

Schwartz-Zippel Lemma

We can turn the corollary above into a probabilistic check of equality between polynomials. This technique is known as the Schwartz-Zippel lemma and is widely used; we will use it a lot.

Corollary. Schwartz-Zippel

Let $f_{0}(X) \neq f_{1}(X)$ be distinct polynomials of degree $d$ and let $C \subseteq F$ be an arbitrary finite, non-empty subset of the integral domain $F$. Then, for $x$ sampled uniformly at random from $C$: $P_{x \leftarrow C} [f_{0}(x) = f_{1}(x)] \leq \frac{d}{|C|}$

Proof.

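Note that the statement is vacuous if $|C| \leq d$, since the bound is then at least $1$. When $|C| > d$, suppose for contradiction that the probability exceeds $d / |C|$. Then the set of agreement points $S = \{x \in C : f_{0}(x) = f_{1}(x)\}$ satisfies $|S| > d$, i.e. there exists a subset $S \subseteq C$ with $|S| > d$ such that: $\forall x \in S . f_{0}(x) = f_{1}(x)$. In this case the previous corollary shows that $f_{0}(X) = f_{1}(X)$, contradicting the assumption that the polynomials are distinct.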
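The lemma translates into a one-evaluation equality test. The sketch below is illustrative only: it assumes the prime field with $p = 101$ and takes $C$ to be the whole field (both are example choices), and the helper names `evaluate`, `probably_equal` and `agreement_count` are hypothetical.

```python
import random

P = 101  # assumed example prime; arithmetic is in F_101

def evaluate(coeffs, x, p=P):
    """Evaluate f(x) mod p; coeffs[i] is the coefficient of X^i."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def probably_equal(f0, f1, challenge_set, rng, p=P):
    """Compare f0 and f1 at one uniformly random point of challenge_set."""
    x = rng.choice(challenge_set)
    return evaluate(f0, x, p) == evaluate(f1, x, p)

def agreement_count(f0, f1, challenge_set, p=P):
    """Exhaustively count the points of challenge_set where f0 and f1 agree."""
    return sum(1 for x in challenge_set if evaluate(f0, x, p) == evaluate(f1, x, p))

C = list(range(P))
f0 = [0, 0, 1]   # X^2
f1 = [1]         # the constant polynomial 1
agreements = agreement_count(f0, f1, C)      # points where X^2 = 1
same = probably_equal(f0, f0, C, random.Random(0))
```

Here the two polynomials are distinct and of degree at most $2$, and they agree on exactly two points of $C$ (namely $1$ and $100$), so a random check wrongly reports equality with probability $2/101$, within the bound $d / |C|$.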

Multivariate Polynomial Roots

An important way for us to view multivariate polynomials will be as univariate polynomials over polynomial rings. To this end, it is useful to verify that polynomial rings over integral domains are themselves integral domains, which allows us to apply the fundamental theorem:

Theorem. Polynomial Ring is Integral Domain

Let $F$ be an integral domain. Then $F[X]$ is an integral domain.

Proof.

We can add and multiply polynomials, and it is clear that polynomial multiplication is commutative (since multiplication in $F$ is), i.e. for $f(X) \in F[X]$ and $g(X) \in F[X]$, we have: $f(X) \cdot g(X) = g(X) \cdot f(X)$. Finally, let us verify that there are no zero divisors in $F[X]$, i.e. show: $f(X) \cdot g(X) = 0 \implies f(X) = 0 \text{ or } g(X) = 0$. To see this, suppose that $f(X)$ and $g(X)$ are both non-zero, let $j$ and $i$ be the degrees of $f(X)$ and $g(X)$ respectively, and denote by $f_{j} \in F$ and $g_{i} \in F$ the leading coefficients of $f(X)$ and $g(X)$. Note that $f_{j} \neq 0$ and $g_{i} \neq 0$ by the definition of a leading coefficient. Then the leading coefficient of $f(X) \cdot g(X)$ is $f_{j} \cdot g_{i}$, which is non-zero since $F$ is an integral domain, so $f(X) \cdot g(X) \neq 0$. Therefore, $f(X) \cdot g(X) = 0$ if and only if $f(X) = 0$ or $g(X) = 0$.

Corollary. Multivariate Polynomial Ring is Integral Domain

By applying the theorem above $n$ times, we can conclude that $F [X_{1}, \dots, X_{n}]$ is an integral domain: let $R_{0} = F$ and $R_{i} = R_{i - 1} [X_{i}]$ for $i = 1, \dots, n$ . Observe that: $R_{i} = (((F [X_{1}]) [X_{2}]) \dots) [X_{i}] = F [X_{1}, \dots, X_{i}]$

This "iterative" construction of $F[X_{1}, \dots, X_{n}]$ allows us to view $k$-variate polynomials over $F$ as univariate polynomials over the ring $\mathcal{F} = F[X_{1}, \dots, X_{k-1}]$:

$f(X_{1}, \dots, X_{k}) \in F[X_{1}, \dots, X_{k}]$ $\text{Equivalently}$ $f(X_{k}) \in \mathcal{F}[X_{k}] \text{ where } \mathcal{F} = F[X_{1}, \dots, X_{k-1}]$

With this interpretation $f(X_{k})$ is a polynomial over $\mathcal{F} = F[X_{1}, \dots, X_{k-1}]$ and therefore $X_{k}$ can take any value in $\mathcal{F}$ (not just $F$), i.e. we can evaluate $f(X_{k})$ at every $(k-1)$-variate polynomial $x_{k} \in \mathcal{F}$, which includes the constant polynomials, i.e. $F \subseteq \mathcal{F}$.
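As a concrete illustration of this view, the sketch below stores a bivariate polynomial as a list of coefficient polynomials in $X_{1}$ and evaluates it at a value of $X_{2}$ that is itself a polynomial in $X_{1}$. The prime $p = 7$, the example polynomial $X_{2}^{2} - X_{1}^{2}$ and all helper names are assumptions made for the example.

```python
P = 7  # assumed example prime; base coefficients live in F_7

def trim(a):
    """Drop trailing zero coefficients, so the zero polynomial is []."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_add(a, b, p=P):
    """Add two polynomials in F_p[X1] (coefficient lists, lowest degree first)."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([(x + y) % p for x, y in zip(a, b)])

def poly_mul(a, b, p=P):
    """Multiply two polynomials in F_p[X1]."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def evaluate_in_last_variable(outer, point, p=P):
    """Horner's rule in (F_p[X1])[X2]: outer[i] is the coefficient f_i(X1)
    of X2^i, and point is the element of F_p[X1] substituted for X2."""
    acc = []
    for f_i in reversed(outer):
        acc = poly_add(poly_mul(acc, point, p), f_i, p)
    return acc

# f(X1, X2) = X2^2 - X1^2, written as 6*X1^2 + 0*X2 + 1*X2^2 over F_7[X1].
f = [[0, 0, 6], [], [1]]
at_x1 = evaluate_in_last_variable(f, [0, 1])        # X2 := X1
at_minus_x1 = evaluate_in_last_variable(f, [0, 6])  # X2 := -X1
at_constant = evaluate_in_last_variable(f, [3])     # X2 := 3
```

The two non-constant substitutions $X_{2} = X_{1}$ and $X_{2} = -X_{1}$ both give the zero polynomial, while the constant $3$ leaves the non-zero polynomial $2 + 6 X_{1}^{2}$. Viewed as a degree-$2$ polynomial in $X_{2}$, the example therefore has two roots in the larger ring, neither of which is a field element.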

If we apply the fundamental theorem to this particular setting we get:

Corollary. Multivariate Polynomial Root Bound

Let $f(X_{1}, \dots, X_{k}) = f(X_{k}) \in \mathcal{F}[X_{k}]$ with $\mathcal{F} = F[X_{1}, \dots, X_{k-1}]$ be a non-zero $k$-variate polynomial. Let $d$ be its degree in the $X_{k}$ variable. Then there exist at most $d$ distinct $(k-1)$-variate polynomials $x_{k} \in \mathcal{F}$ such that $f(x_{k}) = 0 \in \mathcal{F}$. In particular, there exist at most $d$ field elements (constant polynomials) $x_{k} \in F$ such that $f(x_{k}) = 0 \in F[X_{1}, \dots, X_{k-1}]$.

If we apply this observation recursively, we can conclude that for sufficiently large tensor products, a polynomial vanishes over the tensor product if and only if the polynomial is the zero polynomial:

Theorem. Polynomial Vanishing on Tensor Products

Let $f(X_{1}, \dots, X_{k}) \in F[X_{1}, \dots, X_{k}]$ be a $k$-variate polynomial with degree at most $d_{i}$ in each variable $X_{i}$. Let $H_{k}$ be the tensor product of sets $S_{1}, \dots, S_{k} \subseteq F$ where $\forall i . |S_{i}| > d_{i}$: $H_{k} = S_{1} \otimes S_{2} \otimes \dots \otimes S_{k} \subseteq F^{k}$ Then: $\forall (x_{1}, \dots, x_{k}) \in H_{k} . f(x_{1}, \dots, x_{k}) = 0$ $\iff$ $f(X_{1}, \dots, X_{k}) = 0 \in F[X_{1}, \dots, X_{k}]$

Proof.

We prove this by induction on $k$. The zero polynomial clearly vanishes on $H_{k}$, so it suffices to show that a non-zero $f$ cannot vanish on all of $H_{k}$:

Base k = 1. When $k = 1$ the “multivariate” polynomial is a univariate polynomial $f(X_{1}) \in F[X_{1}]$. By applying the fundamental theorem directly over $F$, we observe that the number of roots of $f(X_{1})$ is at most $\deg(f) \leq d_{1}$; however, since $|S_{1}| > d_{1}$, the polynomial cannot evaluate to zero on all of $S_{1}$. So the claim holds.
Step k > 1. Define $\mathcal{F} = F[X_{1}, \dots, X_{k-1}]$ and rewrite $f$ as a polynomial in $X_{k}$ with coefficients in $\mathcal{F}$: $f(X_{1}, \dots, X_{k}) = \sum_{i} X_{k}^{i} \cdot f_{i}(X_{1}, \dots, X_{k-1}) \in \mathcal{F}[X_{k}]$ Since $\mathcal{F}$ is an integral domain, we can apply the fundamental theorem, this time over $\mathcal{F} = F[X_{1}, \dots, X_{k-1}]$ rather than over $F$. We conclude that at most $d_{k}$ values $x_{k} \in \mathcal{F}$ satisfy: $f(x_{k}) = \sum_{i} x_{k}^{i} \cdot f_{i}(X_{1}, \dots, X_{k-1}) = 0$ In particular, there exist at most $d_{k}$ elements $x_{k} \in S_{k} \subseteq F \subseteq \mathcal{F}$ (constant polynomials) satisfying this. On the other hand, since $|S_{k}| > d_{k}$ there must exist at least one $x_{k} \in S_{k}$ which is not a root, in other words: $f(x_{k}) = g(X_{1}, \dots, X_{k-1}) \neq 0 \in \mathcal{F}$ Since $g$ has degree at most $d_{i}$ in each $X_{i}$, we can apply the induction hypothesis to $g(X_{1}, \dots, X_{k-1})$ to conclude that it does not vanish over $H_{k-1}$. In other words, there is at least one $(x_{1}, \dots, x_{k-1}) \in H_{k-1}$ such that $g(x_{1}, \dots, x_{k-1}) \neq 0$, which also allows us to conclude: $f(x_{1}, \dots, x_{k}) = g(x_{1}, \dots, x_{k-1}) \neq 0 \in F$ So $f(X_{1}, \dots, X_{k})$ cannot vanish over $H_{k}$ either, and the claim holds for $k$ as well.

Corollary. Multilinear Polynomial Non-Vanishing on Hypercube

Setting $d_{1} = d_{2} = \dots = d_{k} = 1$ and $H_{k} = \{0, 1\} \otimes \dots \otimes \{0, 1\} = \{0, 1\}^{k}$ as the $k$-dimensional hypercube, we conclude that a non-zero multilinear polynomial cannot vanish on the hypercube, i.e. if $f(X_{1}, \dots, X_{k}) \in F[X_{1}, \dots, X_{k}]$ is a non-zero multilinear polynomial, then there exists at least one $(x_{1}, \dots, x_{k}) \in \{0, 1\}^{k}$ such that $f(x_{1}, \dots, x_{k}) \neq 0$.

An easy, but very important, observation is that two multilinear polynomials agree on the hypercube if and only if they are equal as polynomials.

Corollary. Multilinear Polynomial Equality on Hypercube

Let $f, g \in F[X_{1}, \dots, X_{k}]$ be two multilinear polynomials and let $H_{k} = \{0, 1\}^{k}$ be the hypercube. Suppose that: $\forall x \in H_{k} . f(x) = g(x)$ Then $f(X_{1}, \dots, X_{k}) = g(X_{1}, \dots, X_{k})$.

Proof.

We can form: $h(X_{1}, \dots, X_{k}) = f(X_{1}, \dots, X_{k}) - g(X_{1}, \dots, X_{k})$, which is again multilinear, since the difference of two multilinear polynomials is multilinear. By assumption $\forall x \in H_{k} . h(x) = f(x) - g(x) = 0$, therefore we conclude that $h(X_{1}, X_{2}, \dots, X_{k}) = 0$ by the previous corollary. Hence $f(X_{1}, \dots, X_{k}) = g(X_{1}, \dots, X_{k})$.
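The degree condition in these corollaries is essential, which a brute-force check over the hypercube makes visible. The sketch assumes the prime field with $p = 7$ and two example polynomials chosen for illustration; the dictionary representation and the helper names are hypothetical.

```python
from itertools import product

P = 7  # assumed example prime; arithmetic is in F_7

def evaluate(terms, point, p=P):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at a point."""
    total = 0
    for exponents, coeff in terms.items():
        term = coeff % p
        for x, e in zip(point, exponents):
            term = term * pow(x, e, p) % p
        total = (total + term) % p
    return total

def vanishes_on_hypercube(terms, k, p=P):
    """Check whether the polynomial is zero at every point of {0, 1}^k."""
    return all(evaluate(terms, pt, p) == 0 for pt in product((0, 1), repeat=k))

# Multilinear and non-zero: X1*X2 - X1 - X2 + 1 = (1 - X1) * (1 - X2).
multilinear = {(1, 1): 1, (1, 0): -1, (0, 1): -1, (0, 0): 1}

# Not multilinear (degree 2 in X1): X1^2 - X1.
quadratic = {(2, 0): 1, (1, 0): -1}

multilinear_vanishes = vanishes_on_hypercube(multilinear, 2)
quadratic_vanishes = vanishes_on_hypercube(quadratic, 2)
```

The multilinear example is non-zero at $(0, 0)$, as the corollary guarantees. The polynomial $X_{1}^{2} - X_{1}$ is non-zero as a polynomial yet vanishes on the whole hypercube, which is consistent with the theorem because $|\{0, 1\}| = 2$ is not larger than its degree in $X_{1}$.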

Multivariate Schwartz-Zippel

We can extend the techniques above to reason about the probability that a multivariate polynomial $f(X_{1}, \dots, X_{k})$ vanishes at a uniformly random point $x \leftarrow H_{k}$.

Theorem. Multivariate Schwartz-Zippel

Let $H_{k} = S_{1} \otimes \dots \otimes S_{k}$ for finite, non-empty sets $S_{i} \subseteq F$, and let $f(X_{1}, \dots, X_{k})$ be a non-zero multivariate polynomial of individual degrees $d_{i}$ in $X_{i}$. Then the probability that the polynomial vanishes at a uniformly random $x \leftarrow H_{k}$ can be bounded as follows: $P[f(x) = 0] \leq \sum_{i=1}^{k} \frac{d_{i}}{|S_{i}|}$ where $x \leftarrow H_{k}$.

Proof.

We show this by induction:

Base k = 1. When $k = 1$ the “multivariate” polynomial is a univariate polynomial $f(X_{1}) \in F[X_{1}]$. By applying the fundamental theorem directly over $F$, we observe that the number of roots of $f(X_{1})$ is at most $d_{1}$. Hence for uniform $x_{1} \leftarrow S_{1}$, the probability that $f(x_{1}) = 0$, i.e. that $x_{1}$ is one of the at most $d_{1}$ roots, is at most $d_{1} / |S_{1}|$.
Step k > 1. There are two ways that $f(x_{1}, \dots, x_{k})$ could be zero:
- When we partially evaluate at $x_{k}$ we get the zero polynomial: $f(X_{1}, \dots, X_{k-1}, x_{k}) = 0$.
- Or, $g(X_{1}, \dots, X_{k-1}) = f(X_{1}, \dots, X_{k-1}, x_{k}) \neq 0$, but $g(x_{1}, \dots, x_{k-1}) = 0$.
Define $\mathcal{F} = F[X_{1}, \dots, X_{k-1}]$ and view $f$ as a polynomial in $X_{k}$ over $\mathcal{F}$: $f(X_{1}, \dots, X_{k}) = \sum_{i} X_{k}^{i} \cdot f_{i}(X_{1}, \dots, X_{k-1}) \in \mathcal{F}[X_{k}]$ Since $\mathcal{F}$ is an integral domain, we conclude that at most $d_{k}$ values $x_{k} \in \mathcal{F}$ satisfy: $f(x_{k}) = \sum_{i} x_{k}^{i} \cdot f_{i}(X_{1}, \dots, X_{k-1}) = 0$ In particular, there exist at most $d_{k}$ elements $x_{k} \in S_{k} \subseteq F \subseteq \mathcal{F}$ which make $f(x_{k}) = 0$, hence the probability that $x_{k} \leftarrow S_{k}$ makes $f(x_{k}) = 0$ is at most $d_{k} / |S_{k}|$.

On the other hand, if $x_{k} \leftarrow S_{k}$ is not a root: $f(x_{k}) = g(X_{1}, \dots, X_{k-1}) \neq 0 \in \mathcal{F}$ We can apply the induction hypothesis to $g(X_{1}, \dots, X_{k-1})$ to conclude that: $P_{x \leftarrow H_{k-1}} [g(x) = 0] \leq \sum_{i=1}^{k-1} \frac{d_{i}}{|S_{i}|}$ By applying a union bound on both these events we conclude that: $P_{x \leftarrow H_{k}} [f(x) = 0] \leq \frac{d_{k}}{|S_{k}|} + \left( \sum_{i=1}^{k-1} \frac{d_{i}}{|S_{i}|} \right) = \sum_{i=1}^{k} \frac{d_{i}}{|S_{i}|}$
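The bound can be checked exhaustively on a small instance. The sketch below assumes the prime field with $p = 7$, the example polynomial $X_{1} \cdot X_{2}$ and $S_{1} = S_{2} = F$ (all example choices); `zero_fraction` and `schwartz_zippel_bound` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

P = 7  # assumed example prime; arithmetic is in F_7

def evaluate(terms, point, p=P):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at a point."""
    total = 0
    for exponents, coeff in terms.items():
        term = coeff % p
        for x, e in zip(point, exponents):
            term = term * pow(x, e, p) % p
        total = (total + term) % p
    return total

def zero_fraction(terms, sets, p=P):
    """Exact probability that f vanishes at a uniform point of S_1 x ... x S_k."""
    points = list(product(*sets))
    zeros = sum(1 for pt in points if evaluate(terms, pt, p) == 0)
    return Fraction(zeros, len(points))

def schwartz_zippel_bound(degrees, sets):
    """The bound sum_i d_i / |S_i| from the theorem."""
    return sum(Fraction(d, len(s)) for d, s in zip(degrees, sets))

f = {(1, 1): 1}                 # f(X1, X2) = X1 * X2, degrees d_1 = d_2 = 1
sets = [range(P), range(P)]     # S_1 = S_2 = F_7
exact = zero_fraction(f, sets)
bound = schwartz_zippel_bound([1, 1], sets)
```

For this example $f$ vanishes exactly when $x_{1} = 0$ or $x_{2} = 0$, i.e. on $13$ of the $49$ points, while the bound is $2/7 = 14/49$. The gap of $1/49$ is the overlap of the two events that the union bound counts twice.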
