Playing with the Ax–Grothendieck Theorem
Published:
We discuss a construction question that naturally arises from the Ax–Grothendieck theorem. In particular, the answer comes from a field isomorphism between the complex numbers and an ultraproduct of the algebraic closures of the prime fields of positive characteristic.
Introduction
Recall the celebrated theorem of Ax–Grothendieck:
Theorem. (Ax–Grothendieck) Suppose $F\colon\mathbb{C}^n\to\mathbb{C}^n$ is a polynomial map which is injective. Then it is also surjective.
The commonly cited proof of this theorem uses some model theory, especially the models of the theory of algebraically closed fields. A vast body of mathematical logic then awaits: first-order languages, their models, categoricity, completeness of the theory of algebraically closed fields, and the topological arguments supporting the above theorem. I do not intend to go through all of these, however; see Marker 2006, Theorem 2.2.1.
However, after going through all this theory and the proofs, and being happy with them, one can raise the following question.
Now that we know F is surjective, how do we find $\bar{x}\in\mathbb{C}^n$ with $F(\bar{x})=\bar{0}$?
This post aims to justify the following claim.
The (well-known) “logical” proof tells you how to get that.
Ultraproducts
Ultrafilters
Suppose X is a (discrete) set, whose power set is denoted $2^X$. We call a set $\mathscr{F}\subset 2^X$ of subsets of X a filter on X if it satisfies
- (Nontrivial, Proper) $X\in\mathscr{F}$, yet $\varnothing\notin\mathscr{F}$,
- (Upwards closed) if $A\in\mathscr{F}$ and $A\subset B(\subset X)$, then $B\in\mathscr{F}$,
- (has Meet) if $A,B\in\mathscr{F}$, then $A\cap B\in\mathscr{F}$.
Equivalently, the “set of complements” \(\{X\setminus A : A\in\mathscr{F}\}\) forms a proper ideal in the Boolean ring $2^X$.
If we furthermore add a condition
- (Ultra) for each $A\in 2^X$, either $A\in\mathscr{F}$ or $X\setminus A\in\mathscr{F}$,
then we say the filter $\mathscr{F}$ is an ultrafilter. (The “set of complements” then forms a prime ideal in the ring $2^X$, which is also a maximal ideal.) The cheapest way to make an ultrafilter is to let \(\mathscr{F}_x=\{A\in 2^X:x\in A\}\) for any element x in X; such ultrafilters are called principal ultrafilters. In most applications, it is the nonprincipal ultrafilters that are of interest.
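On a finite set the axioms above can be checked by brute force. The following is only a minimal sketch for building intuition; the helper names (`powerset`, `is_filter`, `is_ultrafilter`, `principal`) are made up for this illustration.

```python
from itertools import combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, X):
    """Check the three filter axioms for a family F of subsets of X."""
    X, F = frozenset(X), set(F)
    if X not in F or frozenset() in F:            # nontrivial, proper
        return False
    for A in F:
        for B in powerset(X):
            if A <= B and B not in F:             # upwards closed
                return False
    return all(A & B in F for A in F for B in F)  # has meet

def is_ultrafilter(F, X):
    """A filter that decides every subset: A or its complement is in F."""
    X, F = frozenset(X), set(F)
    return is_filter(F, X) and all(A in F or (X - A) in F for A in powerset(X))

def principal(x, X):
    """The principal ultrafilter F_x = {A : x in A}."""
    return {A for A in powerset(X) if x in A}

X = {0, 1, 2}
supersets_of_01 = {A for A in powerset(X) if frozenset({0, 1}) <= A}
print(is_ultrafilter(principal(1, X), X))   # True
print(is_filter(supersets_of_01, X))        # True
print(is_ultrafilter(supersets_of_01, X))   # False: neither {0} nor {1, 2} belongs
```

The last family is a filter but not an ultrafilter, since it decides neither $\{0\}$ nor its complement.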
Ultrafilter-almost-every
Given an ultrafilter $\mathscr{U}$ on a set X, we say a property $\phi(x)$ on X holds $\mathscr{U}$-almost everywhere (a.e.) if the set of x in X for which $\phi(x)$ holds is in $\mathscr{U}$ (i.e., \(\{x\in X : \phi(x)\}\in\mathscr{U}\)). The axioms of ultrafilters then read as follows.
- (Nontrivial) A property which is always true holds $\mathscr{U}$-a.e..
- (Proper) A property which is always false does not hold $\mathscr{U}$-a.e..
- (Upwards closed) Any property which is implied by a property holding $\mathscr{U}$-a.e. also holds $\mathscr{U}$-a.e..
- (has Meet) For two properties holding $\mathscr{U}$-a.e., their conjunction (AND) holds $\mathscr{U}$-a.e..
- (Ultra) Any property on X either holds $\mathscr{U}$-a.e. or fails $\mathscr{U}$-a.e..
One might recall a similar term from measure theory; the connection can be stated as follows. Define \(\mu_\mathscr{U}\colon 2^X\to\{0,1\}\) by $\mu_\mathscr{U}(I)=1$ if $I\in\mathscr{U}$, and 0 otherwise. It is not hard to see that $\mu_\mathscr{U}$ is a finitely additive measure (it is seldom countably additive; for instance, it never is when X is countable and $\mathscr{U}$ is nonprincipal, as every singleton then has measure 0 while X has measure 1), and that $\mu_\mathscr{U}$-a.e. coincides with $\mathscr{U}$-a.e. above.
Constructing Ultrafilters
To get a sense of how ultrafilters are constructed, consider a family $\mathscr{S}\subset 2^X$ of subsets with the following property.
- (Finite Intersection Property: FIP) For any $A_1,\ldots,A_n\in\mathscr{S}$, we have $A_1\cap\cdots\cap A_n$ nonempty.
For instance, the singleton family \(\{A\}\) of a nonempty subset A of X has FIP. Then the following set
\[\langle\mathscr{S}\rangle=\left\{B\in 2^X: (\exists A_1,\ldots,A_n\in\mathscr{S})(A_1\cap\cdots\cap A_n\subset B)\right\}\]
is a filter on X, called the filter generated by $\mathscr{S}$. An ultrafilter is then constructed from any family $\mathscr{S}$ with FIP as follows.
1. Build $\mathscr{F}=\langle\mathscr{S}\rangle$.
2. Choose any $A\subset X$ with $A\notin\mathscr{F}$ and $X\setminus A\notin\mathscr{F}$. (If there is no such A, then $\mathscr{F}$ is an ultrafilter and we stop.)
3. At least one of \(\mathscr{F}\cup\{A\}\) or \(\mathscr{F}\cup\{X\setminus A\}\) has FIP; choose any one with FIP and let it be the new $\mathscr{S}$.
4. Go to step 1.
If X is finite, this is a finite procedure (although we always end up with a principal ultrafilter). Otherwise, we need to understand this as a transfinite recursive process; we will not go into details, but leave the summary “keep trying until it is done.”
In fact, step 3 above always results in “both have FIP”: if, say, \(\mathscr{F}\cup\{A\}\) failed FIP, some $B\in\mathscr{F}$ would be disjoint from A, forcing $X\setminus A\supset B$ into $\mathscr{F}$ and contradicting the choice in step 2. So for instance, there are (plenty of!) ultrafilters on the natural numbers that contain the set of all even numbers, as well as ones containing the set of all odd numbers. Likewise, there are plenty of ultrafilters on the natural numbers that contain the set of all composite numbers (and 1), and ones containing the set of all prime numbers.
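The procedure above can be sketched on a finite set, where it terminates after finitely many rounds. This is only an illustration under that finiteness assumption; the function names and the `prefer_complement` switch (which picks between the two options in step 3) are hypothetical.

```python
from itertools import combinations

def powerset(X):
    """All subsets of X, smallest first."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def has_fip(S):
    """For a finite family, FIP means the total intersection is nonempty."""
    S = list(S)
    return True if not S else bool(frozenset.intersection(*S))

def generated_filter(S, X):
    """<S>: on a finite X, the supersets of the total intersection of S."""
    X = frozenset(X)
    core = frozenset.intersection(X, *S)
    return {B for B in powerset(X) if core <= B}

def extend_to_ultrafilter(S, X, prefer_complement=False):
    X, S = frozenset(X), set(S)
    assert has_fip(S), "the starting family needs FIP"
    while True:
        F = generated_filter(S, X)                        # step 1
        undecided = [A for A in powerset(X)
                     if A not in F and (X - A) not in F]  # step 2
        if not undecided:
            return F                                      # F is an ultrafilter
        A = undecided[0]
        choice = (X - A) if prefer_complement else A      # step 3
        assert has_fip(F | {choice})
        S = F | {choice}                                  # step 4: repeat

evens = frozenset({0, 2, 4})
U = extend_to_ultrafilter([evens], range(6))
V = extend_to_ultrafilter([evens], range(6), prefer_complement=True)
print(sorted(frozenset.intersection(*U)))   # [0]
print(sorted(frozenset.intersection(*V)))   # [4]
```

Both outputs contain the even numbers, but different choices in step 3 lead to different (here necessarily principal) ultrafilters.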
Exercise. If an ultrafilter on natural numbers contains (a) the set of even numbers and (b) the set of all prime numbers, show that it must be principal.
The number of such choices is quantitatively understood (see Engelking and Karłowicz, 1965): the number of ultrafilters on X is $2^{2^{\vert X\vert}}$ when X is infinite.
Ultraproducts
The ultraproduct is a construction that mingles various structures. For instance, one can take ultraproducts of graphs, groups, rings, fields, …, to get an enormously sized entity. One instance where it appears outside of set theory is the construction of infinitesimals via an ultraproduct of the field of reals, which we describe as follows.
Example: Nonstandard Reals
Recall the ordered field $(\mathbb{R},0,1,+,\cdot,<)$ of real numbers. The product $\mathbb{R}^\mathbb{N}$ will inherit some structures like constants 0 or 1, operations + and ${}\cdot{}$, but the overall structure $(\mathbb{R}^{\mathbb{N}},0,1,+,\cdot)$ is no longer a field. Moreover, the order < may also extend to the product but the order is no longer total (i.e., there may be two sequences that cannot be compared termwise).
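Both failures can be seen on a pair of alternating sequences (chosen only for illustration). Take
\[a=(1,0,1,0,\ldots),\qquad b=(0,1,0,1,\ldots).\]
Then
\[a\cdot b=(0,0,0,0,\ldots)=0\quad\text{although}\quad a\neq 0,\ b\neq 0,\]
so $\mathbb{R}^\mathbb{N}$ has zero divisors and cannot be a field. Moreover, none of $a_n<b_n$, $a_n=b_n$, $a_n>b_n$ holds for all n, so a and b are incomparable under the termwise order.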
However, an appropriate quotient of $\mathbb{R}^\mathbb{N}$ recovers an ordered field structure $(\mathbb{R}^{\mathbb{N}}/\sim,0,1,+,\cdot,<)$.
Perhaps the cheapest way to define such a quotient is to “let one coordinate survive and all others perish”: for instance, we can define $(a_n)\sim(b_n)$ by $a_6=b_6$, so that we recover the ordered field structure of the real numbers.
In the terms to be introduced below, this quotient can be viewed as a “principal ultraproduct.”
One way to describe such equivalences is to use ultrafilters on $\mathbb{N}$. Fix an ultrafilter $\mathscr{U}$ on $\mathbb{N}$. Define an equivalence
\[(a_n)\sim_\mathscr{U}(b_n)\quad\overset{\text{def}}{\equiv}\quad \{n\in\mathbb{N} : a_n=b_n\}\in\mathscr{U}.\label{eqn:ultrafilter-equivalence}\]
That is, two sequences are equivalent if the set of indices on which they agree is an element of $\mathscr{U}$ (Phew!). This defines an equivalence relation, thanks to the properties of filters. The relation $\sim_\mathscr{U}$ is
- reflexive because \(\{n\in\mathbb{N} : a_n=a_n\}=\mathbb{N}\in\mathscr{U}\),
- symmetric thanks to the definition itself, and
- transitive because the sets \(I=\{n\in\mathbb{N} : a_n=b_n\}\), \(J=\{n\in\mathbb{N}:b_n=c_n\}\), and \(K=\{n\in\mathbb{N} : c_n=a_n\}\) satisfy $I\cap J\subset K$; thus if $I,J\in\mathscr{U}$ then $K\in\mathscr{U}$.
The argument for transitivity also shows that the operations and relations
\[\begin{aligned} [(a_n)]_\mathscr{U} + [(b_n)]_\mathscr{U} &:= [(a_n+b_n)]_\mathscr{U}, \\ [(a_n)]_\mathscr{U} \cdot [(b_n)]_\mathscr{U} &:= [(a_nb_n)]_\mathscr{U}, \\ [(a_n)]_\mathscr{U} < [(b_n)]_\mathscr{U} &\overset{\text{def}}{\equiv} \{n\in\mathbb{N} : a_n<b_n\}\in\mathscr{U} \end{aligned}\]
are well-defined (independent of the choice of representatives). The elements 0 and 1 in $\mathbb{R}^\mathbb{N}/\sim_\mathscr{U}$ are defined to be $[(0)]_\mathscr{U}$ and $[(1)]_\mathscr{U}$, respectively. So the definitions above give a well-defined structure $(\mathbb{R}^{\mathbb{N}}/\sim_\mathscr{U},0,1,+,\cdot,<)$.
Proposition. The structure $(\mathbb{R}^{\mathbb{N}}/\sim_\mathscr{U},0,1,+,\cdot,<)$ satisfies axioms of ordered fields.
(Proof) It is not hard to see that the structure forms a commutative ring with unity. It remains to see that (a) one can take reciprocals of nonzero elements and (b) the order < is total. (Below we use the $\mathscr{U}$-a.e. language introduced previously.)
To see (a), we first claim that if \([(a_n)]_\mathscr{U}\neq 0\), then \(a_n\neq 0\) for $\mathscr{U}$-a.e. n: indeed, \(a_n=0\) does not hold $\mathscr{U}$-a.e., so by the (Ultra) property its negation does. Thus \(1/a_n\) is well-defined for $\mathscr{U}$-a.e. n.
Define \(b_n:=1/a_n\) if \(a_n\neq 0\), and \(b_n:=0\) otherwise. Then \(a_n\cdot b_n=1\) for $\mathscr{U}$-a.e. n. This verifies that \([(a_n)]_\mathscr{U}\cdot[(b_n)]_\mathscr{U}=1\), hence proving that \(1/[(a_n)]_\mathscr{U}=[(b_n)]_\mathscr{U}\).
To see (b), suppose neither \([(a_n)]_\mathscr{U}<[(b_n)]_\mathscr{U}\) nor \([(a_n)]_{\mathscr{U}}=[(b_n)]_\mathscr{U}\) is the case. Then neither \(a_n<b_n\) nor \(a_n=b_n\) holds for $\mathscr{U}$-a.e. n. This means that \(a_n\geq b_n\) and \(a_n\neq b_n\) each hold for $\mathscr{U}$-a.e. n, thus we have \(a_n>b_n\) for $\mathscr{U}$-a.e. n. That \([(a_n)]_\mathscr{U}>[(b_n)]_\mathscr{U}\) follows. $\square$
Summarizing, we have
- defined the structure \((\mathbb{R}^\mathbb{N}/\sim_\mathscr{U},0,1,+,\cdot,<)\) thanks to the filter structure that $\mathscr{U}$ has, and
- proved that the structure inherits properties of the real numbers because $\mathscr{U}$ is an ultrafilter on the natural numbers.
The structure constructed above is known as the nonstandard reals.
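To see an infinitesimal concretely, here is a small worked example. It assumes that $\mathscr{U}$ is nonprincipal, and uses the fact that a nonprincipal ultrafilter on $\mathbb{N}$ contains every cofinite set; the symbol $\varepsilon$ is introduced only for this illustration. Put
\[\varepsilon:=[(1/(n+1))_{n\in\mathbb{N}}]_\mathscr{U}.\]
Every term is positive, so $\varepsilon>0$. For any integer $k\geq 1$,
\[\{n\in\mathbb{N} : 1/(n+1)<1/k\}=\{n\in\mathbb{N} : n\geq k\}\in\mathscr{U},\]
since this set is cofinite. Hence $0<\varepsilon<[(1/k)]_\mathscr{U}$ for every $k\geq 1$, that is, $\varepsilon$ is positive yet smaller than the class of every positive constant sequence. No real number has this property.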
Ultraproduct of Algebraic Closures of Finite Fields
In the construction of nonstandard reals above, we note that entries with different indices do not interact in the definitions. Hence there is no reason to require $a_n\in\mathbb{R}$ for all n; we may instead let $a_n\in k_n$, where $k_n$ is a field of our choice.
Denote by $\mathcal{P}\subset\mathbb{N}$ the set of all prime numbers. For each prime p, denote by \(\overline{\mathbb{F}}_p\) the algebraic closure of the field \(\mathbb{F}_p=\mathbb{Z}/p\mathbb{Z}\) of order p. Fix a nonprincipal ultrafilter $\mathscr{U}$ on $\mathcal{P}$. Denote
\[\overline{\mathbb{F}}_\mathscr{U} := \left.\prod_{p\in\mathcal{P}}\overline{\mathbb{F}}_p\right/\sim_\mathscr{U},\]
where $\sim_\mathscr{U}$ is defined analogously to \eqref{eqn:ultrafilter-equivalence}. From the operations of the fields \(\overline{\mathbb{F}}_p\), it is not hard to define 0, 1, addition, and multiplication on \(\overline{\mathbb{F}}_\mathscr{U}\). These operations satisfy the following
Lemma. The structure $(\overline{\mathbb{F}}_\mathscr{U},0,1,+,\cdot)$ forms an algebraically closed field, of characteristic zero.
(Proof) That the structure forms a field can be shown similarly to the arguments for nonstandard reals.
To see algebraic closedness, let \(f(x)=\sum_{i=0}^d[(a_{i,p})_{p\in\mathcal{P}}]_\mathscr{U}x^i\) be any nonconstant polynomial over \(\overline{\mathbb{F}}_\mathscr{U}\), so that $d\geq 1$ and the leading coefficient is nonzero. Its componentwise polynomial \(f_p(x):=\sum_{i=0}^da_{i,p}x^i\) is then nonconstant for $\mathscr{U}$-a.e. p, because \(a_{d,p}\neq 0\) for $\mathscr{U}$-a.e. p. Let \(x_p\) be a root of \(f_p\) in \(\overline{\mathbb{F}}_p\), which exists for $\mathscr{U}$-a.e. p (set \(x_p=0\) for the remaining p). Then \(x=[(x_p)_{p\in\mathcal{P}}]_\mathscr{U}\) solves $f(x)=0$, since \(f_p(x_p)=0\) for $\mathscr{U}$-a.e. p.
To see the characteristic, fix a prime q, and denote \(q\cdot 1:=\underbrace{1+\cdots+1}_q\in\overline{\mathbb F}_\mathscr{U}\). We have $q\cdot 1\neq 0$ in \(\overline{\mathbb{F}}_p\) for all $p\neq q$, thus for $\mathscr{U}$-a.e. p as well (this is where we use that $\mathscr{U}$ is nonprincipal: the singleton \(\{q\}\) is not in $\mathscr{U}$, so its complement is). Hence $q\cdot 1\neq 0$ in \(\overline{\mathbb{F}}_\mathscr{U}\) too. $\square$
The field \(\overline{\mathbb{F}}_\mathscr{U}\) has large cardinality, in the following sense. We omit the proof, but refer to a MathStackExchange post that sketches a relevant fact.
Proposition. The field \(\overline{\mathbb{F}}_\mathscr{U}\) has cardinality of the continuum, i.e., \(\vert\overline{\mathbb{F}}_\mathscr{U}\vert=\vert\mathbb{C}\vert=2^{\aleph_0}\).
The above cardinality data tells us that \(\overline{\mathbb{F}}_\mathscr{U}\) is an algebraically closed field of transcendence degree $2^{\aleph_0}$ over its prime field $\mathbb{Q}$. In particular, it is isomorphic, as a field, to the algebraic closure of \(\mathbb{Q}(x_t)_{t\in[0,1]}\), the field of rational functions in the variables \(x_t\), $t\in[0,1]$. The same argument applies to the field $\mathbb{C}$ of complex numbers, thus yielding the following
Corollary. There is a field isomorphism $\eta\colon\mathbb{C}\to\overline{\mathbb{F}}_\mathscr{U}$.
Now we are ready to prove the Ax–Grothendieck theorem.
Proof of Ax–Grothendieck Theorem
Theorem. (Ax–Grothendieck) Suppose $F\colon\mathbb{C}^n\to\mathbb{C}^n$ is a polynomial map which is injective. Then it is also surjective.
(Proof) Since the map F is algebraically defined, via the field isomorphism \(\eta\colon\mathbb{C}\to\overline{\mathbb{F}}_\mathscr{U}\) we may view F as a polynomial map \(F^\eta\colon\overline{\mathbb{F}}_\mathscr{U}^n\to\overline{\mathbb{F}}_\mathscr{U}^n\). Denote its components by \(F^\eta=(F^\eta_1,\ldots,F^\eta_n)\), with each \(F^\eta_i(X_1,\ldots,X_n)=\sum_{\alpha}[(c_{i,\alpha,p})_{p\in\mathcal{P}}]_\mathscr{U}X^\alpha\) written in multi-index notation.
Let \(F^\eta_{i,p}=\sum_\alpha c_{i,\alpha,p}X^\alpha\) and $F^\eta\vert_p=(F^\eta_{1,p},\ldots,F^\eta_{n,p})$ (we use a special notation to avoid confusion with \(F^\eta_p\), the p-th component of $F^\eta$). We show the following
Claim A. The map $F^\eta\vert_p\colon\overline{\mathbb{F}}_p^n\to\overline{\mathbb{F}}_p^n$ is injective for $\mathscr{U}$-a.e. p.
Suppose otherwise. Then, by the (Ultra) property, the map \(F^\eta\vert_p\) fails to be injective for $\mathscr{U}$-a.e. p, so for such p we can find distinct \(\bar{x}_p,\bar{y}_p\in\overline{\mathbb{F}}_p^n\) with \(F^\eta\vert_p(\bar{x}_p)=F^\eta\vert_p(\bar{y}_p)\). The vectors \(\bar{x}=[(\bar{x}_p)]_\mathscr{U}\) and \(\bar{y}=[(\bar{y}_p)]_\mathscr{U}\) thus satisfy \(F^\eta(\bar{x})=F^\eta(\bar{y})\). Since $F^\eta$ is injective (as F is), we have $\bar{x}=\bar{y}$, thus \(\bar{x}_p=\bar{y}_p\) for $\mathscr{U}$-a.e. p. This contradicts that \(\bar{x}_p\neq\bar{y}_p\) for $\mathscr{U}$-a.e. p.
Claim B. The map $F^\eta\vert_p\colon\overline{\mathbb{F}}_p^n\to\overline{\mathbb{F}}_p^n$ is surjective for $\mathscr{U}$-a.e. p.
Fix p for which \(F^\eta\vert_p\colon\overline{\mathbb{F}}_p^n\to\overline{\mathbb{F}}_p^n\) is injective. Take $k$ large enough so that \(F^\eta\vert_p\) has coefficients in \(\mathbb{F}_{p^k}\) (the finite field of order $p^k$). For any \(\mathbb{F}_{p^\ell}\supset\mathbb{F}_{p^k}\), the map \(F^\eta\vert_p\) sends \(\mathbb{F}_{p^\ell}^n\) into itself injectively; since an injective self-map of a finite set is surjective, the image of $F^\eta\vert_p$ contains all of \(\mathbb{F}_{p^\ell}^n\). Since \(\overline{\mathbb{F}}_p^n\) is the union of all such sets, \(F^\eta\vert_p\) is surjective. By Claim A this applies to $\mathscr{U}$-a.e. p, which is our claim.
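The finite counting step in Claim B can be checked by brute force on small examples. The sketch below only works over the prime field $\mathbb{F}_p$ (the case $\ell=1$) with $n=2$, and the two polynomial maps `shear` and `square_first` are hypothetical examples, not taken from the proof.

```python
from itertools import product

def is_injective(F, points):
    """True if F takes distinct values on distinct points."""
    images = [F(v) for v in points]
    return len(set(images)) == len(images)

def is_surjective(F, points):
    """True if every point is hit by F."""
    return {F(v) for v in points} == set(points)

def shear(p):
    """Hypothetical injective example: (x, y) -> (x + y^2, y) over F_p."""
    return lambda v: ((v[0] + v[1] ** 2) % p, v[1] % p)

def square_first(p):
    """Hypothetical non-injective example: (x, y) -> (x^2, y) over F_p."""
    return lambda v: (v[0] ** 2 % p, v[1] % p)

for p in (3, 5, 7):
    points = list(product(range(p), repeat=2))   # all of F_p^2
    for F in (shear(p), square_first(p)):
        # On a finite set, injectivity and surjectivity stand or fall together.
        assert is_injective(F, points) == is_surjective(F, points)
    print(p, is_injective(shear(p), points), is_injective(square_first(p), points))
```

For each odd prime the shear is both injective and surjective, while the squaring map is neither. Handling $\mathbb{F}_{p^\ell}$ with $\ell>1$ would need an actual finite field implementation, which is left out here.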
*
To see the surjectivity of F, it suffices to show that there is \(\bar{x}\in\overline{\mathbb{F}}_\mathscr{U}^n\) with $F^\eta(\bar{x})=\bar{0}$. [For any target $\bar{y}$, the map \(\bar{x}\mapsto F(\bar{x})-\bar{y}\) is again an injective polynomial map if F is, so the same argument applies to it.] By Claim B, for $\mathscr{U}$-a.e. p we have \(\bar{x}_p\in\overline{\mathbb{F}}^n_p\) with \(F^\eta\vert_p(\bar{x}_p)=\bar{0}\). So setting \(\bar{x}=[(\bar{x}_p)]_\mathscr{U}\) (with \(\bar{x}_p\) chosen arbitrarily for the remaining p), we have the desired property. $\square$
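The last step suggests a recipe for the question in the introduction: solve $F^\eta\vert_p(\bar{x}_p)=\bar{0}$ prime by prime and collect the answers. Here is a toy sketch under strong simplifying assumptions: the hypothetical map $F(x,y)=(x+y^2-2,\ y-3)$ has integer coefficients, so that (since a field isomorphism sends each integer m to $m\cdot 1$) its p-th component is just the reduction mod p, and its zero already lies in $\mathbb{F}_p^2$. In general the zero \(\bar{x}_p\) lives in some larger \(\mathbb{F}_{p^\ell}^n\), which this search does not cover.

```python
from itertools import product

def F_mod(p):
    """Reduction mod p of the hypothetical map F(x, y) = (x + y^2 - 2, y - 3)."""
    def F(v):
        x, y = v
        return ((x + y * y - 2) % p, (y - 3) % p)
    return F

def zero_mod(p):
    """Brute-force search for a zero of F mod p inside F_p^2."""
    F = F_mod(p)
    return next(v for v in product(range(p), repeat=2) if F(v) == (0, 0))

for p in (2, 3, 5, 7, 11, 13):
    x_p = zero_mod(p)
    # Over the complex numbers the zero is (-7, 3); each x_p is its reduction.
    assert x_p == (-7 % p, 3 % p)
    print(p, x_p)
```

In this example the class \([(\bar{x}_p)]_\mathscr{U}\) is the image under $\eta$ of the complex zero $(-7,3)$, so the prime-by-prime answers do assemble into the solution over $\mathbb{C}$.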
We finally remark that the above proof is just an algebraic rendering of the classical logical proof. The punchline, that the theory of algebraically closed fields of characteristic p “approximates” that of characteristic 0, is captured by the Corollary above, leaving all the details as an algebraic game.
Update Log
- 230831: Created