Hensel's lemma explained

In mathematics, Hensel's lemma, also known as Hensel's lifting lemma, named after Kurt Hensel, is a result in modular arithmetic, stating that if a univariate polynomial has a simple root modulo a prime number p, then this root can be lifted to a unique root modulo any higher power of p. More generally, if a polynomial factors modulo p into two coprime polynomials, this factorization can be lifted to a factorization modulo any higher power of p (the case of roots corresponds to the case of degree 1 for one of the factors).

By passing to the "limit" (in fact this is an inverse limit) when the power of p tends to infinity, it follows that a root or a factorization modulo p can be lifted to a root or a factorization over the p-adic integers.

These results have been widely generalized, under the same name, to the case of polynomials over an arbitrary commutative ring, where p is replaced by an ideal, and "coprime polynomials" means "polynomials that generate an ideal containing 1".

Hensel's lemma is fundamental in p-adic analysis, a branch of analytic number theory.

The proof of Hensel's lemma is constructive, and leads to an efficient algorithm for Hensel lifting, which is fundamental for factoring polynomials, and gives the most efficient known algorithm for exact linear algebra over the rational numbers.

Modular reduction and lifting

Hensel's original lemma concerns the relation between polynomial factorization over the integers and over the integers modulo a prime number p and its powers. It can be straightforwardly extended to the case where the integers are replaced by any commutative ring, and p is replaced by any maximal ideal (indeed, the maximal ideals of

\Z

have the form

p\Z,

where p is a prime number).

Making this precise requires a generalization of the usual modular arithmetic, and so it is useful to define accurately the terminology that is commonly used in this context.

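Let R be a commutative ring, and I an ideal of R. Reduction modulo I refers to the replacement of every element of R by its image under the canonical map

R\to R/I.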

For example, if

f\in R[X]

is a polynomial with coefficients in R, its reduction modulo I, denoted

f\bmod I,

is the polynomial in

(R/I)[X]=R[X]/IR[X]

obtained by replacing the coefficients of f by their image in

R/I.

Two polynomials f and g in

R[X]

are congruent modulo I, denoted

f\equiv g \pmod I,

if they have the same coefficients modulo I, that is if

f-g\in IR[X].

If

h\in R[X],

a factorization of h modulo I consists of two (or more) polynomials f, g in

R[X]

such that

h\equiv fg \pmod I.

The lifting process is the inverse of reduction. That is, given objects depending on elements of

R/I,

the lifting process replaces these elements by elements of R (or of

R/I^k

for some k > 1) that map to them in a way that keeps the properties of the objects.

For example, given a polynomial

h\in R[X]

and a factorization modulo I expressed as

h\equiv fg \pmod I,

lifting this factorization modulo

I^k

consists of finding polynomials

f',g'\in R[X]

such that

f'\equiv f \pmod I,

g'\equiv g \pmod I,

and

h\equiv f'g' \pmod{I^k}.

Hensel's lemma asserts that such a lifting is always possible under mild conditions; see next section.

Statement

Originally, Hensel's lemma was stated (and proved) for lifting a factorization modulo a prime number p of a polynomial over the integers to a factorization modulo any power of p and to a factorization over the p-adic integers. This can be generalized easily, with the same proof, to the case where the integers are replaced by any commutative ring, the prime number is replaced by a maximal ideal, and the p-adic integers are replaced by the completion with respect to the maximal ideal. It is this generalization, which is also widely used, that is presented here.

Let

\mathfrak m

be a maximal ideal of a commutative ring R, and

h=\alpha_0X^n+\cdots+\alpha_{n-1}X+\alpha_n

be a polynomial in

R[X]

with a leading coefficient

\alpha_0

not in

\mathfrak m.

Since

\mathfrak m

is a maximal ideal, the quotient ring

R/\mathfrak m

is a field, and

(R/\mathfrak m)[X]

is a principal ideal domain, and, in particular, a unique factorization domain, which means that every nonzero polynomial in

(R/\mathfrak m)[X]

can be factorized in a unique way as the product of a nonzero element of

R/\mathfrak m

and irreducible polynomials that are monic (that is, their leading coefficients are 1).

Hensel's lemma asserts that every factorization of h modulo

\mathfrak m

into coprime polynomials can be lifted in a unique way into a factorization modulo

\mathfrak m^k

for every k.

More precisely, with the above hypotheses, if

h\equiv \alpha_0 fg\pmod{\mathfrak m},

where f and g are monic and coprime modulo

\mathfrak m,

then, for every positive integer k there are monic polynomials

f_k

and

g_k

such that

\begin{align} h&\equiv\alpha_0f_kg_k\pmod{\mathfrak m^k},\\ f_k&\equiv f\pmod{\mathfrak m},\\ g_k&\equiv g\pmod{\mathfrak m}, \end{align}

and

f_k

and

g_k

are unique (with these properties) modulo

\mathfrak m^k.

Lifting simple roots

An important special case is when

f=X-r.

In this case the coprimality hypothesis means that r is a simple root of

h\bmod\mathfrak m.

This gives the following special case of Hensel's lemma, which is often also called Hensel's lemma.

With the above hypotheses and notations, if r is a simple root of

h\bmod\mathfrak m,

then r can be lifted in a unique way to a simple root of

h\bmod{\mathfrak m^n}

for every positive integer n. Explicitly, for every positive integer n, there is a unique

r_n\in R/{\mathfrak m}^n

such that

r_n\equiv r \pmod{\mathfrak m}

and

r_n

is a simple root of

h\bmod\mathfrak m^n.

Lifting to adic completion

The fact that one can lift to

R/\mathfrak m^n

for every positive integer n suggests to "pass to the limit" when n tends to infinity. This was one of the main motivations for introducing p-adic integers.

Given a maximal ideal

\mathfrak m

of a commutative ring R, the powers of

\mathfrak m

form a basis of open neighborhoods for a topology on R, which is called the

\mathfrak m

-adic topology. The completion of this topology can be identified with the completion of the local ring

R_\mathfrak m,

and with the inverse limit

\lim_\leftarrow R/\mathfrak m^n.

This completion is a complete local ring, generally denoted

\widehat R_\mathfrak m.

When R is the ring of the integers, and

\mathfrak m=p\Z,

where p is a prime number, this completion is the ring of p-adic integers

\Z_p.

The definition of the completion as an inverse limit, and the above statement of Hensel's lemma imply that every factorization into pairwise coprime polynomials modulo

\mathfrak m

of a polynomial

h\in R[X]

can be uniquely lifted to a factorization of the image of h in

\widehat R_\mathfrak m[X].

Similarly, every simple root of h modulo

\mathfrak m

can be lifted to a simple root of the image of h in

\widehat R_\mathfrak m[X].

Proof

Hensel's lemma is generally proved incrementally by lifting a factorization over

R/\mathfrak m^n

to either a factorization over

R/\mathfrak m^{n+1}

(linear lifting), or a factorization over

R/\mathfrak m^{2n}

(quadratic lifting).

The main ingredient of the proof is that coprime polynomials over a field satisfy Bézout's identity. That is, if f and g are coprime univariate polynomials over a field (here

R/\mathfrak m

), there are polynomials a and b such that

\deg a<\deg g,

\deg b<\deg f,

and

af+bg=1.

Bézout's identity allows defining coprime polynomials and proving Hensel's lemma, even if the ideal

\mathfrak m

is not maximal. Therefore, in the following proofs, one starts from a commutative ring R, an ideal I, a polynomial

h\in R[X]

that has a leading coefficient that is invertible modulo I (that is, its image in

R/I

is a unit in

R/I

), and a factorization of h modulo I or modulo a power of I, such that the factors satisfy a Bézout's identity modulo I. In these proofs,

A\equiv B \pmod I

means

A-B\in IR[X].

Linear lifting

Let I be an ideal of a commutative ring R, and

h\in R[X]

be a univariate polynomial with coefficients in R that has a leading coefficient

\alpha

that is invertible modulo I (that is, the image of

\alpha

in

R/I

is a unit in

R/I

).

Suppose that for some positive integer k there is a factorization

h\equiv\alpha fg\pmod{I^k},

such that f and g are monic polynomials that are coprime modulo I, in the sense that there exist

a,b\in R[X],

such that

af+bg\equiv 1\pmod I.

Then, there are polynomials

\delta_f,\delta_g\in I^kR[X],

such that

\deg\delta_f<\deg f,

\deg\delta_g<\deg g,

and

h\equiv\alpha(f+\delta_f)(g+\delta_g)\pmod{I^{k+1}}.

Under these conditions,

\delta_f

and

\delta_g

are unique modulo

I^{k+1}R[X].

Moreover,

f+\delta_f

and

g+\delta_g

satisfy the same Bézout's identity as f and g, that is,

a(f+\delta_f)+b(g+\delta_g)\equiv 1\pmod I.

This follows immediately from the preceding assertions, but is needed to apply the result iteratively with increasing values of k.

The proof that follows is written for computing

\delta_f

and

\delta_g

by using only polynomials with coefficients in

R/I

or

I^k/I^{k+1}.

When

R=\Z

and

I=p\Z,

this allows manipulating only integers modulo p.

Proof: By hypothesis,

\alpha

is invertible modulo I. This means that there exist

\beta\in R

and

\gamma\in I

such that

\alpha\beta=1-\gamma.

Let

\delta_h\in I^kR[X],

of degree less than

\deg h,

such that

\delta_h\equiv h-\alpha fg\pmod{I^{k+1}}.

(One may choose

\delta_h=h-\alpha fg,

but other choices may lead to simpler computations. For example, if

R=\Z

and

I=p\Z,

it is possible and better to choose

\delta_h=p^k\delta'_h

where the coefficients of

\delta'_h

are integers in the interval [0, p − 1].)

As g is monic, the Euclidean division of

a\delta_h

by g is defined, and provides q and c such that

a\delta_h=qg+c,

and

\deg c<\deg g.

Moreover, both q and c are in

I^kR[X].

Similarly, let

b\delta_h=q'f+d,

with

\deg d<\deg f,

and

q',d\in I^kR[X].

One has

q+q'\in I^{k+1}R[X].

Indeed, one has

fc+gd=af\delta_h+bg\delta_h-fg(q+q')\equiv\delta_h-fg(q+q')\pmod{I^{k+1}}.

As

fg

is monic, the degree modulo

I^{k+1}

of

fg(q+q')

can be less than

\deg fg

only if

q+q'\in I^{k+1}R[X].

Thus, considering congruences modulo

I^{k+1},

one has

\begin{align} \alpha(f+\beta d)&(g+\beta c)-h\\ &\equiv\alpha fg-h+\alpha\beta(f(a\delta_h-qg)+g(b\delta_h-q'f))\\ &\equiv\delta_h(-1+\alpha\beta(af+bg))-\alpha\beta fg(q+q')\\ &\equiv 0\pmod{I^{k+1}}.\end{align}

So, the existence assertion is verified with

\delta_f=\beta d, \quad \delta_g=\beta c.
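The construction in this proof can be sketched in code for the special case R = Z, I = pZ and a monic h (so that α = β = 1). This is only an illustrative sketch under those assumptions: polynomials are represented as lists of integer coefficients, lowest degree first, and the helper names (`trim`, `add`, `sub`, `mul`, `divmod_monic`, `linear_lift_step`) are hypothetical, not taken from any library. The Bézout cofactors a and b modulo p are assumed to be supplied by the caller.

```python
def trim(poly):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    while poly and poly[-1] == 0:
        poly = poly[:-1]
    return poly

def add(u, v, m):
    """Sum of two polynomials with coefficients reduced modulo m."""
    n = max(len(u), len(v))
    u = u + [0] * (n - len(u))
    v = v + [0] * (n - len(v))
    return trim([(x + y) % m for x, y in zip(u, v)])

def sub(u, v, m):
    """Difference u - v with coefficients reduced modulo m."""
    return add(u, [-y for y in v], m)

def mul(u, v, m):
    """Product of two polynomials with coefficients reduced modulo m."""
    if not u or not v:
        return []
    out = [0] * (len(u) + len(v) - 1)
    for i, x in enumerate(u):
        for j, y in enumerate(v):
            out[i + j] = (out[i + j] + x * y) % m
    return trim(out)

def divmod_monic(u, g, m):
    """Euclidean division of u by the monic polynomial g, modulo m."""
    u = [x % m for x in u]
    q = [0] * max(len(u) - len(g) + 1, 0)
    for i in range(len(u) - len(g), -1, -1):
        coeff = u[i + len(g) - 1]
        q[i] = coeff
        for j, y in enumerate(g):
            u[i + j] = (u[i + j] - coeff * y) % m
    return trim(q), trim(u[:len(g) - 1])

def linear_lift_step(h, f, g, a, b, p, k):
    """Lift h = f*g (mod p^k) to a factorization mod p^(k+1).

    Assumes h, f, g monic and a*f + b*g = 1 (mod p)."""
    m = p ** (k + 1)
    delta_h = sub(h, mul(f, g, m), m)              # h - f*g, lies in p^k Z[X]
    _, c = divmod_monic(mul(a, delta_h, m), g, m)  # a*delta_h = q*g + c
    _, d = divmod_monic(mul(b, delta_h, m), f, m)  # b*delta_h = q'*f + d
    return add(f, d, m), add(g, c, m)              # f + delta_f, g + delta_g

# X^2 - 2 = (X + 4)(X + 3) mod 7, with 1*(X + 4) + 6*(X + 3) = 1 mod 7.
h = [-2, 0, 1]
f1, g1 = linear_lift_step(h, [4, 1], [3, 1], [1], [6], 7, 1)
print(f1, g1)  # X + 39 and X + 10 modulo 49
```

Because the lifted factors satisfy the same Bézout identity modulo p, the same a and b can be reused for the next step.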

Uniqueness

Let R, I, h and

\alpha

be as in the preceding section. Let

h\equiv\alpha fg\pmod I

be a factorization into coprime polynomials (in the above sense), such that

\deg f+\deg g=\deg h.

The application of linear lifting for

k=1,2,\ldots,n-1

shows the existence of

\delta_f

and

\delta_g

such that

\deg\delta_f<\deg f,

\deg\delta_g<\deg g,

and

h\equiv\alpha(f+\delta_f)(g+\delta_g)\pmod{I^n}.

The polynomials

\delta_f

and

\delta_g

are uniquely defined modulo

I^n.

This means that, if another pair

(\delta'_f,\delta'_g)

satisfies the same conditions, then one has

\delta'_f\equiv\delta_f\pmod{I^n} \quad\text{and}\quad \delta'_g\equiv\delta_g\pmod{I^n}.

Proof: Since a congruence modulo

I^n

implies the same congruence modulo

I^{n-1},

one can proceed by induction and suppose that the uniqueness has been proved for n − 1, the base case being trivial. That is, one can suppose that

\delta_f-\delta'_f\in I^{n-1}R[X] \quad\text{and}\quad \delta_g-\delta'_g\in I^{n-1}R[X].

By hypothesis, one has

h\equiv\alpha(f+\delta_f)(g+\delta_g)\equiv\alpha(f+\delta'_f)(g+\delta'_g)\pmod{I^n},

and thus

\begin{align} \alpha(f+\delta_f)(g+\delta_g)&-\alpha(f+\delta'_f)(g+\delta'_g)\\ &=\alpha(f(\delta_g-\delta'_g)+g(\delta_f-\delta'_f))+\alpha(\delta_f(\delta_g-\delta'_g)+\delta'_g(\delta_f-\delta'_f))\in I^nR[X]. \end{align}

By the induction hypothesis, and because the coefficients of \delta_f and \delta'_g lie in I, the second term of the latter sum belongs to

I^nR[X],

and the same is thus true for the first term. As

\alpha

is invertible modulo I, there exist

\beta\in R

and

\gamma\in I

such that

\alpha\beta=1+\gamma.

Thus

\begin{align} f(\delta_g-\delta'_g)&+g(\delta_f-\delta'_f)\\ &=\alpha\beta(f(\delta_g-\delta'_g)+g(\delta_f-\delta'_f))-\gamma(f(\delta_g-\delta'_g)+g(\delta_f-\delta'_f))\in I^nR[X], \end{align}

using the induction hypothesis again.

The coprimality modulo I implies the existence of

a,b\in R[X]

such that

1\equiv af+bg\pmod I.

Using the induction hypothesis once more, one gets

\begin{align} \delta_g-\delta'_g&\equiv(af+bg)(\delta_g-\delta'_g)\\ &\equiv g(b(\delta_g-\delta'_g)-a(\delta_f-\delta'_f))\pmod{I^n}. \end{align}

Thus one has a polynomial of degree less than

\deg g

that is congruent modulo

I^n

to the product of the monic polynomial g and another polynomial w. This is possible only if

w\in I^nR[X],

and implies

\delta_g-\delta'_g\in I^nR[X].

Similarly,

\delta_f-\delta'_f

is also in

I^nR[X],

and this proves the uniqueness.

Quadratic lifting

Linear lifting allows lifting a factorization modulo

I^n

to a factorization modulo

I^{n+1}.

Quadratic lifting allows lifting directly to a factorization modulo

I^{2n},

at the cost of lifting also the Bézout's identity and of computing modulo

I^n

instead of modulo I (if one uses the above description of linear lifting).

For lifting up to modulo

I^N

for large N one can use either method. If, say,

N=2^k,

a factorization modulo

I^N

requires N − 1 steps of linear lifting or only k steps of quadratic lifting. However, in the latter case the size of the coefficients that have to be manipulated increases during the computation. This implies that the best lifting method depends on the context (value of N, nature of R, multiplication algorithm that is used, hardware specificities, etc.).

Quadratic lifting is based on the following property.

Suppose that for some positive integer k there is a factorization

h\equiv\alpha fg\pmod{I^k},

such that f and g are monic polynomials that are coprime modulo I, in the sense that there exist

a,b\in R[X],

such that

af+bg\equiv 1\pmod{I^k}.

Then, there are polynomials

\delta_f,\delta_g\in I^kR[X],

such that

\deg\delta_f<\deg f,

\deg\delta_g<\deg g,

and

h\equiv\alpha(f+\delta_f)(g+\delta_g)\pmod{I^{2k}}.

Moreover,

f+\delta_f

and

g+\delta_g

satisfy a Bézout's identity of the form

(a+\delta_a)(f+\delta_f)+(b+\delta_b)(g+\delta_g)\equiv 1\pmod{I^{2k}}.

(This is required for allowing iterations of quadratic lifting.)

Proof: The first assertion is exactly that of linear lifting applied with k = 1 to the ideal

I^k

instead of I.

Let

\varepsilon=af+bg-1\in I^kR[X].

One has

a(f+\delta_f)+b(g+\delta_g)=1+\Delta,

where

\Delta=\varepsilon+a\delta_f+b\delta_g\in I^kR[X].

Setting

\delta_a=-a\Delta

and

\delta_b=-b\Delta,

one gets

(a+\delta_a)(f+\delta_f)+(b+\delta_b)(g+\delta_g)=1-\Delta^2\in 1+I^{2k}R[X],

which proves the second assertion.

Explicit example

Let

f(X)=X^6-2\in\Q[X].

Modulo 2, Hensel's lemma cannot be applied since the reduction of

f(X)

modulo 2 is simply[1] (pp. 15-16)

\bar{f}(X)=X^6-\overline{2}=X^6

with 6 factors X not being relatively prime to each other. By Eisenstein's criterion, however, one can conclude that the polynomial

f(X)

is irreducible in

\Q_2[X].

Over

k=\mathbb F_7,

on the other hand, one has

\bar{f}(X)=X^6-\overline{2}=X^6-\overline{16}=(X^3-\overline{4})(X^3+\overline{4})

where 4 is a square root of 2 in

\mathbb F_7.

As neither 4 nor −4 is a cube in

\mathbb F_7,

these two factors are irreducible over

\mathbb F_7.

Hence the complete factorization of

X^6-2

in

\Z_7[X]

and

\Q_7[X]

is

f(X)=X^6-2=(X^3-\alpha)(X^3+\alpha),

where

\alpha=\ldots450454_7

is a square root of 2 in

\Z_7

that can be obtained by lifting the above factorization.

Finally, in

\mathbb F_{727}[X]

the polynomial splits into

\bar{f}(X)=X^6-\overline{2}=(X-\overline{3})(X-\overline{116})(X-\overline{119})(X-\overline{608})(X-\overline{611})(X-\overline{724})

with all factors relatively prime to each other, so that in

\Z_{727}[X]

and

\Q_{727}[X]

there are 6 factors

X-\beta

with the (non-rational) 727-adic integers

\beta=\left\{\begin{array}{rrrr}3 +&545\cdot 727 +&537\cdot 727^2+&161\cdot 727^3+\ldots\\116 +&48\cdot 727 +&130\cdot 727^2+&498\cdot 727^3+\ldots\\119 +&593\cdot 727 +&667\cdot 727^2+&659\cdot 727^3+\ldots\\608 +&133\cdot 727 +&59\cdot 727^2+&67\cdot 727^3+\ldots\\611 +&678\cdot 727 +&596\cdot 727^2+&228\cdot 727^3+\ldots\\724 +&181\cdot 727 +&189\cdot 727^2+&565\cdot 727^3+\ldots\end{array}\right.
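The 727-adic part of this example can be checked numerically. The following is a minimal sketch, not a general factoring routine: it finds the roots of X^6 − 2 modulo 727 by exhaustive search and then produces successive base-727 digits of a lifted root, one digit per step. The names `P`, `f`, `df` and `lift_digits` are chosen here for illustration only.

```python
P = 727

def f(x):
    return x**6 - 2

def df(x):
    return 6 * x**5

# Roots of X^6 - 2 modulo 727, found by exhaustive search.
roots_mod_p = [r for r in range(P) if f(r) % P == 0]

def lift_digits(r, n_digits):
    """Return (digits, x): the first n_digits base-727 digits (lowest first)
    of the 727-adic root congruent to r, and the integer x they represent.

    Assumes r is a simple root of f modulo 727."""
    digits = [r]
    x, modulus = r, P
    for _ in range(n_digits - 1):
        # f(x) is divisible by `modulus`; solve for the next digit t in
        # f(x + t*modulus) = 0 mod (modulus * P).
        t = (-(f(x) // modulus) * pow(df(x), -1, P)) % P
        x += t * modulus
        modulus *= P
        digits.append(t)
    return digits, x

digits, x = lift_digits(3, 4)
print(roots_mod_p)
print(digits, f(x) % P**4)
```

The three-argument `pow` with exponent −1 computes a modular inverse and requires Python 3.8 or later.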

Using derivatives for lifting roots

Let

f(x)

be a polynomial with integer (or p-adic integer) coefficients, and let m, k be positive integers such that m ≤ k. If r is an integer such that

f(r)\equiv 0\bmod p^k \quad\text{and}\quad f'(r)\not\equiv 0\bmod p

then there exists an integer s such that

f(s)\equiv 0\bmod p^{k+m} \quad\text{and}\quad r\equiv s\bmod p^k.

Furthermore, this s is unique modulo p^{k+m}, and can be computed explicitly as the integer such that

s=r-f(r)\cdot a,

where a is an integer satisfying

a\equiv[f'(r)]^{-1}\bmod p^m.

Note that

f(r)\equiv 0\bmod p^k

so that the condition

s\equiv r\bmod p^k

is met. As an aside, if

f'(r)\equiv 0\bmod p,

then 0, 1, or several s may exist (see Hensel lifting below).

Derivation

We use the Taylor expansion of f around r to write:

f(s)=\sum_{n=0}^N c_n(s-r)^n, \qquad c_n=f^{(n)}(r)/n!.

From

r\equiv s\bmod p^k,

we see that s − r = tp^k for some integer t. Then

\begin{align} f(s)&=\sum_{n=0}^N c_n\left(tp^k\right)^n\\ &=f(r)+tp^kf'(r)+\sum_{n=2}^N c_nt^np^{kn}\\ &=f(r)+tp^kf'(r)+p^{2k}t^2g(t)&&g(t)\in\Z[t]\\ &=zp^k+tp^kf'(r)+p^{2k}t^2g(t)&&f(r)\equiv 0\bmod p^k\\ &=(z+tf'(r))p^k+p^{2k}t^2g(t). \end{align}

For

m\leqslant k,

we have:

\begin{align} f(s)\equiv 0\bmod p^{k+m}&\Longleftrightarrow(z+tf'(r))p^k\equiv 0\bmod p^{k+m}\\ &\Longleftrightarrow z+tf'(r)\equiv 0\bmod p^m\\ &\Longleftrightarrow tf'(r)\equiv -z\bmod p^m\\ &\Longleftrightarrow t\equiv -z[f'(r)]^{-1}\bmod p^m&&p\nmid f'(r) \end{align}

The assumption that

f'(r)

is not divisible by p ensures that

f'(r)

has an inverse mod

p^m

which is necessarily unique. Hence a solution for t exists uniquely modulo

p^m,

and s exists uniquely modulo

p^{k+m}.
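The explicit formula s = r − f(r)·a translates directly into code. The sketch below assumes f and its derivative are passed as Python callables on integers; the function name `hensel_step` is hypothetical. It rejects inputs outside the hypotheses (1 ≤ m ≤ k, r a simple root modulo p^k).

```python
def hensel_step(f, df, r, p, k, m):
    """Lift a root r of f modulo p**k to the unique root modulo p**(k+m)
    congruent to r modulo p**k, using s = r - f(r) * a."""
    if not 1 <= m <= k:
        raise ValueError("the explicit formula needs 1 <= m <= k")
    if f(r) % p**k != 0 or df(r) % p == 0:
        raise ValueError("r must be a simple root of f modulo p**k")
    a = pow(df(r), -1, p**m)  # inverse of f'(r) modulo p^m (Python 3.8+)
    return (r - f(r) * a) % p**(k + m)

# Square roots of 2 modulo powers of 7, starting from 3 mod 7.
f = lambda x: x * x - 2
df = lambda x: 2 * x
s1 = hensel_step(f, df, 3, 7, 1, 1)    # root modulo 7^2
s2 = hensel_step(f, df, s1, 7, 2, 2)   # root modulo 7^4
print(s1, s2)
```

Taking m = k at each step doubles the exponent, which is the root-lifting analogue of quadratic lifting.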

Observations

Criterion for irreducible polynomials

Using the above hypotheses, if we consider an irreducible polynomial

f(x)=a_0+a_1x+\cdots+a_nx^n\in K[X]

such that

a_0,a_n\neq 0,

then

|f|=\max\{|a_0|,|a_n|\}.

In particular, for

f(X)=X^6+10X-1,

we find in

\Q_2[X]

that the nonzero coefficients −1, 10 and 1 have 2-adic absolute values 1, 1/2 and 1, so

|f(X)|=\max\{|a_0|,\ldots,|a_n|\}=\max\{1,\tfrac{1}{2},1\}=1=\max\{|a_0|,|a_n|\}.

The necessary condition is therefore satisfied, and the same holds in

\Q_7[X],

where all three absolute values equal 1. The criterion only says that the polynomial could be irreducible. In order to determine irreducibility, the Newton polygon must be employed.[2]

Frobenius

Note that given an

a\in\mathbb F_p

the Frobenius endomorphism

y\mapsto y^p

gives a nonzero polynomial

x^p-a

that has zero derivative

\begin{align}\frac{d}{dx}(x^p-a)&=p\cdot x^{p-1}\\ &\equiv 0\cdot x^{p-1}\bmod p\\ &\equiv 0\bmod p \end{align}

hence the roots of this polynomial modulo p are never simple, and Hensel's lemma cannot be used to lift pth roots of a to

\Z_p.

For

a=1,

this is consistent with the fact that, for odd p,

\Z_p

does not contain a primitive pth root of unity

\mu_p.

Roots of unity

Although the pth roots of unity are not contained in

\mathbb F_p,

there are solutions of

x^p-x=x(x^{p-1}-1).

Note that

\begin{align}\frac{d}{dx}(x^p-x)&=px^{p-1}-1\\ &\equiv -1\bmod p \end{align}

is never zero, so if there exists a solution, it necessarily lifts to

\Z_p.

Because the Frobenius gives

a^p=a,

all of the non-zero elements

\mathbb F_p^\times

are solutions. In fact, for odd p, their lifts are the only roots of unity contained in

\Q_p.

Hensel lifting

Using the lemma, one can "lift" a root r of the polynomial f modulo p^k to a new root s modulo p^{k+1} such that r ≡ s mod p^k (by taking m = 1; taking larger m follows by induction). In fact, a root modulo p^{k+1} is also a root modulo p^k, so the roots modulo p^{k+1} are precisely the liftings of roots modulo p^k. The new root s is congruent to r modulo p, so the new root also satisfies

f'(s)\equiv f'(r)\not\equiv 0\bmod p.

So the lifting can be repeated, and starting from a solution r_k of

f(x)\equiv 0\bmod p^k

we can derive a sequence of solutions r_{k+1}, r_{k+2}, ... of the same congruence for successively higher powers of p, provided that

f'(r_k)\not\equiv 0\bmod p

for the initial root r_k. This also shows that f has the same number of roots mod p^k as mod p^{k+1}, mod p^{k+2}, or any other higher power of p, provided that the roots of f mod p^k are all simple.

What happens to this process if r is not a simple root mod p? Suppose that

f(r)\equiv 0\bmod p^k \quad\text{and}\quad f'(r)\equiv 0\bmod p.

Then

s\equiv r\bmod p^k

implies

f(s)\equiv f(r)\bmod p^{k+1}.

That is,

f(r+tp^k)\equiv f(r)\bmod p^{k+1}

for all integers t. Therefore, we have two cases:

If

f(r)\not\equiv 0\bmod p^{k+1}

then there is no lifting of r to a root of f(x) modulo p^{k+1}.

If

f(r)\equiv 0\bmod p^{k+1}

then every lifting of r to modulus p^{k+1} is a root of f(x) modulo p^{k+1}.

Example. To see both cases we examine two different polynomials with p = 2:

f(x)=x^2+1

and r = 1. Then

f(1)\equiv 0\bmod 2

and

f'(1)\equiv 0\bmod 2.

We have

f(1)\not\equiv 0\bmod 4

which means that no lifting of 1 to modulus 4 is a root of f(x) modulo 4.

g(x)=x^2-17

and r = 1. Then

g(1)\equiv 0\bmod 2

and

g'(1)\equiv 0\bmod 2.

However, since

g(1)\equiv 0\bmod 4,

we can lift our solution to modulus 4 and both lifts (i.e. 1, 3) are solutions. The derivative is still 0 modulo 2, so a priori we don't know whether we can lift them to modulo 8, but in fact we can, since g(1) is 0 mod 8 and g(3) is 0 mod 8, giving solutions at 1, 3, 5, and 7 mod 8. Since of these only g(1) and g(7) are 0 mod 16 we can lift only 1 and 7 to modulo 16, giving 1, 7, 9, and 15 mod 16. Of these, only 7 and 9 give g(x) ≡ 0 mod 32, so these can be raised giving 7, 9, 23, and 25 mod 32. It turns out that for every integer k ≥ 3, there are four liftings of 1 mod 2 to a root of g(x) mod 2^k.
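Both cases of this example are small enough to confirm by exhaustive search. The sketch below simply enumerates residues; the helper name `roots_mod` is introduced here for illustration and is not an efficient method for large moduli.

```python
def roots_mod(poly, modulus):
    """All residues x in [0, modulus) with poly(x) = 0 modulo modulus."""
    return [x for x in range(modulus) if poly(x) % modulus == 0]

f = lambda x: x * x + 1
g = lambda x: x * x - 17

print(roots_mod(f, 2), roots_mod(f, 4))
for k in range(1, 6):
    print(2**k, roots_mod(g, 2**k))
```

For f the root 1 mod 2 has no lift modulo 4, while for g the search reproduces the lists of roots modulo 4, 8, 16 and 32 given in the text.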

Hensel's lemma for p-adic numbers

In the p-adic numbers, where we can make sense of rational numbers modulo powers of p as long as the denominator is not a multiple of p, the recursion from r_k (roots mod p^k) to r_{k+1} (roots mod p^{k+1}) can be expressed in a much more intuitive way. Instead of choosing t to be an(y) integer which solves the congruence

tf'(r_k)\equiv -(f(r_k)/p^k)\bmod p^m,

let t be the rational number (the p^k here is not really a denominator since f(r_k) is divisible by p^k):

-(f(r_k)/p^k)/f'(r_k).

Then set

r_{k+1}=r_k+tp^k=r_k-\frac{f(r_k)}{f'(r_k)}.

This fraction may not be an integer, but it is a p-adic integer, and the sequence of numbers r_k converges in the p-adic integers to a root of f(x) = 0. Moreover, the displayed recursive formula for the (new) number r_{k+1} in terms of r_k is precisely Newton's method for finding roots to equations in the real numbers.

By working directly in the p-adics and using the p-adic absolute value, there is a version of Hensel's lemma which can be applied even if we start with a solution of f(a) ≡ 0 mod p such that

f'(a)\equiv 0\bmod p.

We just need to make sure the number

f'(a)

is not exactly 0. This more general version is as follows: if there is an integer a which satisfies:

|f(a)|_p<|f'(a)|_p^2,

then there is a unique p-adic integer b such that f(b) = 0 and

|b-a|_p<|f'(a)|_p.

The construction of b amounts to showing that the recursion from Newton's method with initial value a converges in the p-adics and we let b be the limit. The uniqueness of b as a root fitting the condition

|b-a|_p<|f'(a)|_p

needs additional work.

The statement of Hensel's lemma given above (taking

m=1

) is a special case of this more general version, since the conditions that f(a) ≡ 0 mod p and

f'(a)\not\equiv 0\bmod p

say that

|f(a)|_p<1

and

|f'(a)|_p=1.

Examples

Suppose that p is an odd prime and a is a non-zero quadratic residue modulo p. Then Hensel's lemma implies that a has a square root in the ring of p-adic integers

\Z_p.

Indeed, let

f(x)=x^2-a.

If r is a square root of a modulo p then:

f(r)=r^2-a\equiv 0\bmod p \quad\text{and}\quad f'(r)=2r\not\equiv 0\bmod p,

where the second condition is dependent on the fact that p is odd. The basic version of Hensel's lemma tells us that starting from r₁ = r we can recursively construct a sequence of integers

\{r_k\}

such that:

r_{k+1}\equiv r_k\bmod p^k, \quad r_k^2\equiv a\bmod p^k.

This sequence converges to some p-adic integer b which satisfies b² = a. In fact, b is the unique square root of a in

\Z_p

congruent to r₁ modulo p. Conversely, if a is a perfect square in

\Z_p

and it is not divisible by p then it is a nonzero quadratic residue mod p. Note that the quadratic reciprocity law allows one to easily test whether a is a nonzero quadratic residue mod p, thus we get a practical way to determine which p-adic numbers (for p odd) have a p-adic square root, and it can be extended to cover the case p = 2 using the more general version of Hensel's lemma (an example with 2-adic square roots of 17 is given later).

To make the discussion above more explicit, let us find a "square root of 2" (the solution to

x^2-2=0

) in the 7-adic integers. Modulo 7 one solution is 3 (we could also take 4), so we set

r_1=3.

Hensel's lemma then allows us to find

r_2

as follows:

\begin{align} f(r_1)&=3^2-2=7\\ f(r_1)/p^1&=7/7=1\\ f'(r_1)&=2r_1=6 \end{align}

Based on which the expression

tf'(r_1)\equiv -(f(r_1)/p^1)\bmod p

turns into:

t\cdot 6\equiv -1\bmod 7

which implies

t=1.

Now:

r_2=r_1+tp^1=3+1\cdot 7=10=13_7.

And sure enough,

10^2\equiv 2\bmod 7^2.

(If we had used the Newton method recursion directly in the 7-adics, then

r_2=r_1-f(r_1)/f'(r_1)=3-7/6=11/6,

and

11/6\equiv 10\bmod 7^2.

)

We can continue and find

r_3=108=3+7+2\cdot 7^2=213_7.

Each time we carry out the calculation (that is, for each successive value of k), one more base 7 digit is added for the next higher power of 7. In the 7-adic integers this sequence converges, and the limit is a square root of 2 in

\Z_7

which has initial 7-adic expansion

3+7+2\cdot 7^2+6\cdot 7^3+7^4+2\cdot 7^5+7^6+2\cdot 7^7+4\cdot 7^8+\cdots.

If we started with the initial choice

r_1=4

then Hensel's lemma would produce a square root of 2 in

\Z_7

which is congruent to 4 (mod 7) instead of 3 (mod 7) and in fact this second square root would be the negative of the first square root (which is consistent with 4 = −3 mod 7).
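The digit-by-digit computation above can be automated. The following sketch, with the hypothetical helper name `padic_sqrt_digits`, repeats the step "solve t·f'(r_k) ≡ −f(r_k)/p^k mod p" for f(x) = x² − a. It assumes p is odd and that the starting value is a nonzero square root of a modulo p.

```python
def padic_sqrt_digits(a, r1, p, n_digits):
    """Return (digits, x): the first n_digits base-p digits (lowest first)
    of the p-adic square root of a congruent to r1 mod p, and the integer x
    they represent, so that x*x = a modulo p**n_digits.

    Assumes p odd, r1*r1 = a (mod p) and r1 not divisible by p."""
    x, modulus, digits = r1, p, [r1]
    for _ in range(n_digits - 1):
        # x*x - a is divisible by `modulus`; f'(x) = 2x is a unit modulo p.
        t = (-((x * x - a) // modulus) * pow(2 * x, -1, p)) % p
        x += t * modulus
        modulus *= p
        digits.append(t)
    return digits, x

digits, x = padic_sqrt_digits(2, 3, 7, 9)
print(digits)              # base-7 digits, lowest power first
print((x * x - 2) % 7**9)  # 0 when x is a root modulo 7^9
```

Starting from 4 instead of 3 produces the digits of the other square root, whose sum with the first is 0 modulo the working power of 7.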

As an example where the original version of Hensel's lemma is not valid but the more general one is, let

f(x)=x^2-17

and

a=1.

Then

f(a)=-16

and

f'(a)=2,

so

|f(a)|_2<|f'(a)|_2^2,

which implies there is a unique 2-adic integer b satisfying

b^2=17 \quad\text{and}\quad |b-a|_2<|f'(a)|_2=\frac{1}{2},

i.e., b ≡ 1 mod 4. There are two square roots of 17 in the 2-adic integers, differing by a sign, and although they are congruent mod 2 they are not congruent mod 4. This is consistent with the general version of Hensel's lemma only giving us a unique 2-adic square root of 17 that is congruent to 1 mod 4 rather than mod 2. If we had started with the initial approximate root a = 3 then we could apply the more general Hensel's lemma again to find a unique 2-adic square root of 17 which is congruent to 3 mod 4. This is the other 2-adic square root of 17.

In terms of lifting the roots of

x^2-17

from modulus 2^k to 2^{k+1}, the lifts starting with the root 1 mod 2 are as follows:

1 mod 2 → 1, 3 mod 4

1 mod 4 → 1, 5 mod 8 and 3 mod 4 → 3, 7 mod 8

1 mod 8 → 1, 9 mod 16 and 7 mod 8 → 7, 15 mod 16, while 3 mod 8 and 5 mod 8 don't lift to roots mod 16

9 mod 16 → 9, 25 mod 32 and 7 mod 16 → 7, 23 mod 32, while 1 mod 16 and 15 mod 16 don't lift to roots mod 32.

For every k at least 3, there are four roots of x² − 17 mod 2^k, but if we look at their 2-adic expansions we can see that in pairs they are converging to just two 2-adic limits. For instance, the four roots mod 32 break up into two pairs of roots which each look the same mod 16:

9 = 1 + 2³ and 25 = 1 + 2³ + 2⁴.

7 = 1 + 2 + 2² and 23 = 1 + 2 + 2² + 2⁴.

The 2-adic square roots of 17 have expansions

1+2³+2⁵+2⁶+2⁷+2⁹+2¹⁰+ …

1+2+2²+2⁴+2⁸+2¹¹+ …
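These expansions can be reproduced with Newton's iteration carried out modulo a power of 2. The sketch below is one possible way to do this, assuming the starting value is odd and satisfies the hypothesis of the general lemma (which holds for 1 and 3). The function name `two_adic_sqrt17` and the fixed iteration count are illustrative choices.

```python
def two_adic_sqrt17(start, bits, iterations=6):
    """Approximate a 2-adic square root of 17 modulo 2**bits by Newton's method.

    Assumes `start` is odd with |f(start)|_2 < |f'(start)|_2^2, where
    f(x) = x^2 - 17. Six iterations are enough for bits <= 20, since the
    2-adic valuation of f roughly doubles at each step."""
    modulus = 2 ** bits
    x = start % modulus
    for _ in range(iterations):
        # Newton step x - f(x)/f'(x) with f'(x) = 2x: x*x - 17 is even for
        # odd x, so it is halved exactly, then divided by the 2-adic unit x.
        x = (x - ((x * x - 17) // 2) * pow(x, -1, modulus)) % modulus
    return x

b1 = two_adic_sqrt17(1, 20)   # the root congruent to 1 mod 4
b3 = two_adic_sqrt17(3, 20)   # the root congruent to 3 mod 4
print(bin(b1 % 2**11), bin(b3 % 2**12))
```

Reading the binary strings from right to left gives the exponents appearing in the two expansions above.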

Another example where we can use the more general version of Hensel's lemma but not the basic version is a proof that any 3-adic integer c ≡ 1 mod 9 is a cube in

\Z_3.

Let

f(x)=x^3-c

and take initial approximation a = 1. The basic Hensel's lemma cannot be used to find roots of f(x) since

f'(r)\equiv 0\bmod 3

for every r. To apply the general version of Hensel's lemma we want

|f(1)|_3<|f'(1)|_3^2,

which means

c\equiv 1\bmod 27.

That is, if c ≡ 1 mod 27 then the general Hensel's lemma tells us f(x) has a 3-adic root, so c is a 3-adic cube. However, we wanted to have this result under the weaker condition that c ≡ 1 mod 9. If c ≡ 1 mod 9 then c ≡ 1, 10, or 19 mod 27. We can apply the general Hensel's lemma three times depending on the value of c mod 27: if c ≡ 1 mod 27 then use a = 1, if c ≡ 10 mod 27 then use a = 4 (since 4 is a root of f(x) mod 27), and if c ≡ 19 mod 27 then use a = 7. (It is not true that every c ≡ 1 mod 3 is a 3-adic cube, e.g., 4 is not a 3-adic cube since it is not a cube mod 9.)
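The case analysis above can be checked mechanically. The sketch below, using hypothetical helper names, computes 3-adic valuations and verifies the inequality |f(a)|₃ < |f'(a)|₃² (equivalently v₃(f(a)) > 2·v₃(f'(a))) for the starting point chosen according to c mod 27.

```python
import math

def v3(n):
    """3-adic valuation of an integer (infinite for n = 0)."""
    if n == 0:
        return math.inf
    v = 0
    while n % 3 == 0:
        n //= 3
        v += 1
    return v

def starting_point(c):
    """Initial approximation used in the text for c = 1 mod 9."""
    return {1: 1, 10: 4, 19: 7}[c % 27]

def general_hensel_applies(c):
    """Check v3(f(a)) > 2 * v3(f'(a)) for f(x) = x^3 - c."""
    a = starting_point(c)
    return v3(a**3 - c) > 2 * v3(3 * a * a)

print(all(general_hensel_applies(c) for c in range(1, 1000, 9)))
print([x for x in range(9) if pow(x, 3, 9) == 4])  # cube roots of 4 mod 9
```

The second line confirms the parenthetical remark: 4 has no cube root modulo 9, so it cannot be a 3-adic cube.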

In a similar way, after some preliminary work, Hensel's lemma can be used to show that for any odd prime number p, any p-adic integer c congruent to 1 modulo p² is a p-th power in

\Z_p.

(This is false for p = 2.)

Generalizations

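Suppose A is a commutative ring, complete with respect to an ideal

\mathfrak m,

and let

f(x)\in A[x].

An element a ∈ A is called an "approximate root" of f, if

f(a)\equiv 0\bmod f'(a)^2\mathfrak m.

If f has an approximate root then it has an exact root b ∈ A "close to" a; that is,

f(b)=0 \quad\text{and}\quad b\equiv a\bmod{\mathfrak m}.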

Furthermore, if

f'(a)

is not a zero-divisor then b is unique.

This result can be generalized to several variables as follows:

Theorem. Let A be a commutative ring that is complete with respect to an ideal

\mathfrak m\subset A.

Let

f_1,\ldots,f_n\in A[x_1,\ldots,x_n]

be a system of n polynomials in n variables over A. View

\mathbf f=(f_1,\ldots,f_n),

as a mapping from Aⁿ to itself, and let

J_{\mathbf f}(\mathbf x)

denote its Jacobian matrix. Suppose \mathbf a = (a₁, ..., a_n) ∈ Aⁿ is an approximate solution to \mathbf f = 0 in the sense that

f_i(\mathbf a)\equiv 0\bmod(\det J_{\mathbf f}(\mathbf a))^2\mathfrak m, \qquad 1\leqslant i\leqslant n.

Then there is some \mathbf b = (b₁, ..., b_n) ∈ Aⁿ satisfying \mathbf f(\mathbf b) = 0, i.e.,

f_i(\mathbf b)=0, \qquad 1\leqslant i\leqslant n.

Furthermore this solution is "close" to \mathbf a in the sense that

b_i\equiv a_i\bmod\det J_{\mathbf f}(\mathbf a)\mathfrak m, \qquad 1\leqslant i\leqslant n.

As a special case, if

f_i(\mathbf a)\equiv 0\bmod\mathfrak m

for all i and

\det J_{\mathbf f}(\mathbf a)

is a unit in A then there is a solution to \mathbf f(\mathbf b) = 0 with

b_i\equiv a_i\bmod\mathfrak m

for all i.

When n = 1, \mathbf a = a is an element of A and

J_{\mathbf f}(\mathbf a)=J_f(a)=f'(a).

The hypotheses of this multivariable Hensel's lemma reduce to the ones which were stated in the one-variable Hensel's lemma.

Related concepts

Completeness of a ring is not a necessary condition for the ring to have the Henselian property: Goro Azumaya in 1950 defined a commutative local ring satisfying the Henselian property for the maximal ideal m to be a Henselian ring.

Masayoshi Nagata proved in the 1950s that for any commutative local ring A with maximal ideal m there always exists a smallest ring A^h containing A such that A^h is Henselian with respect to mA^h. This A^h is called the Henselization of A. If A is noetherian, A^h will also be noetherian, and A^h is manifestly algebraic as it is constructed as a limit of étale neighbourhoods. This means that A^h is usually much smaller than the completion Â while still retaining the Henselian property and remaining in the same category.

Notes and References

[1] Gras, Georges (2003). Class Field Theory: From Theory to Practice. Berlin. ISBN 978-3-662-11323-3. OCLC 883382066.
[2] Neukirch, Jürgen (1999). Algebraic Number Theory. Berlin, Heidelberg: Springer Berlin Heidelberg. ISBN 978-3-662-03983-0. OCLC 851391469.
