*Historical Intermezzo: from Bézout to Cayley

In connection to Sylvester's resultant, both Sylvester and Cayley quote <#2341#>Bézout's abridged method to obtain the resultant<#2341#> or <#2344#>Bézout's abbreviated method of elimination<#2344#>. Without pretending to give a survey on Bézout's result, I think helpful to give some pointers to it for the interested reader.

His method for computing resultants is preliminarly described by Bézout in the case of three homogeneous linear equations#math500##tex2html_wrap_inline12130#a_ijX_j, 1≤i≤3 : he considers the product #math501##tex2html_wrap_inline12132#X_j and successively, for i = 1..3, substutes each X_j with a_ij <#2351#>observing the signe rule<#2351#>.We thus obtain <#4788#>

#math502#

	a₁₁X₂X₃X₄ - a₁₂X₁X₃X₄ + a₁₃X₁X₂X₄ - a₁₄X₁X₂X₃	i = 1

	(a₁₁a₂₂ - a₂₁a₁₂)X₃X₄ - (a₁₁a₂₃ - a₂₁a₁₃)X₂X₄
+	(a₁₁a₂₄ - a₂₁a₁₄)X₂X₃ + (a₁₂a₂₃ - a₂₂a₁₃)X₁X₄	i = 2
-	(a₁₂a₂₄ - a₂₂a₁₄)X₁X₃ + (a₁₃a₂₄ - a₂₃a₁₄)X₁X₂

	#tex2html_wrap_indisplay12158#(a₁₁a₂₂ - a₂₁a₁₂)a₃₃ - (a₁₁a₂₃ - a₂₁a₁₃)a₃₂ + (a₁₂a₂₃ - a₂₂a₁₃)a₃₁#tex2html_wrap_indisplay12159#X₄
-	#tex2html_wrap_indisplay12163#(a₁₁a₂₂ - a₂₁a₁₂)a₃₄ - (a₁₁a₂₄ - a₂₁a₁₄)a₃₂ + (a₁₂a₂₄ - a₂₂a₁₄)a₃₁#tex2html_wrap_indisplay12164#X₃
+	#tex2html_wrap_indisplay12168#(a₁₁a₂₃ - a₂₁a₁₃)a₃₄ - (a₁₁a₂₄ - a₂₁a₁₄)a₃₃ + (a₁₃a₂₄ - a₂₃a₁₄)a₃₁#tex2html_wrap_indisplay12169#X₂	i = 3
-	#tex2html_wrap_indisplay12173#(a₁₂a₂₃ - a₂₂a₁₃)a₃₄ - (a₁₂a₂₄ - a₂₂a₁₄)a₃₃ + (a₁₃a₂₄ - a₂₃a₁₄)a₃₂#tex2html_wrap_indisplay12174#X₁

<#4788#> whence he deduces

#math503#

#tex2html_wrap_indisplay12177##tex2html_wrap_indisplay12178#

<#2537#>id est<#2537#> Cramer's formula.

As Muir put it

the unreal product #math504##tex2html_wrap_inline12181#X_j at the very outset must have been a sore puzzle to students. [...]

To throw light upon the process, let us compare the above solution of a set of three linear equations with the following solution, which from one point of view may be looked upon as an improvement on the ordinary determinantal modes of solution as presented to modern readers.

[...] The numerators of the values of #math505#X₁, X₂, X₃ and the common denominator are [...] the coefficients of #math506#X₁, X₂, X₃, X₄ in the determinant

#math507#

#tex2html_wrap_indisplay12185##tex2html_wrap_indisplay12186##tex2html_wrap_indisplay12187# : = #tex2html_wrap_indisplay12188#X₁X₂X₃X₄#tex2html_wrap_indisplay12189#

More precisely, Muir explains, if we denote

#math508#

#tex2html_wrap_indisplay12191#X_iX_j#tex2html_wrap_indisplay12192# : = #tex2html_wrap_indisplay12193##tex2html_wrap_indisplay12194##tex2html_wrap_indisplay12195##tex2html_wrap_indisplay12196##tex2html_wrap_indisplay12197#X_iX_jX_h#tex2html_wrap_indisplay12198# : = #tex2html_wrap_indisplay12199##tex2html_wrap_indisplay12200##tex2html_wrap_indisplay12201#

we have (by developing along the first line)

#math509#

a₁₁#tex2html_wrap_indisplay12203#X₂X₃X₄#tex2html_wrap_indisplay12204# - a₁₂#tex2html_wrap_indisplay12205#X₁X₃X₄#tex2html_wrap_indisplay12206# + a₁₃#tex2html_wrap_indisplay12207#X₁X₂X₄#tex2html_wrap_indisplay12208# - a₁₄#tex2html_wrap_indisplay12209#X₁X₂X₃#tex2html_wrap_indisplay12210#

and, developing, again along the first line, the four determinants #math510##tex2html_wrap_inline12212#X_iX_jX_h#tex2html_wrap_inline12213#

#math511#

		(a₁₁a₂₂ - a₂₁a₁₂)#tex2html_wrap_indisplay12216#X₃X₄#tex2html_wrap_indisplay12217# - (a₁₁a₂₃ - a₂₁a₁₃))#tex2html_wrap_indisplay12218#X₂X₄#tex2html_wrap_indisplay12219#
	+	(a₁₁a₂₄ - a₂₁a₁₄))#tex2html_wrap_indisplay12222#X₂X₃#tex2html_wrap_indisplay12223# + (a₁₂a₂₃ - a₂₂a₁₃))#tex2html_wrap_indisplay12224#X₁X₄#tex2html_wrap_indisplay12225#
	-	(a₁₂a₂₄ - a₂₂a₁₄)#tex2html_wrap_indisplay12228#X₁X₃#tex2html_wrap_indisplay12229# + (a₁₃a₂₄ - a₂₃a₁₄)#tex2html_wrap_indisplay12230#X₁X₃#tex2html_wrap_indisplay12231#

and, finally, expanding the six determainants and recollecting the result

#math512#

		#tex2html_wrap_indisplay12234##tex2html_wrap_indisplay12235##tex2html_wrap_indisplay12236#X₄ - #tex2html_wrap_indisplay12237##tex2html_wrap_indisplay12238##tex2html_wrap_indisplay12239#X₃
	+	#tex2html_wrap_indisplay12242##tex2html_wrap_indisplay12243##tex2html_wrap_indisplay12244#X₂ - #tex2html_wrap_indisplay12245##tex2html_wrap_indisplay12246##tex2html_wrap_indisplay12247#X₁.

The same method, <#2651#>id est<#2651#> an expansion of proper determinants expressed with a similar notation and process, is then applied by Bézout to <#2652#>resolve<#2652#> different systems of polynomial equations, includingcomputing the resultant in k[X₁] of two polynomials in #math516#k[X₁, X₂] and then is specialized to the computation of the resultant in k of two polynomials

#math517#

φ : = #tex2html_wrap_indisplay12272#a_i+1X^n-i, φ' : = #tex2html_wrap_indisplay12273#a'_i+1X^m-i∈k[X].

The specialized method, which is the one the English School called the <#2663#>Bézout's abridged/abbreviated method<#2663#>, consists in

multiplying φ and φ' respectively by the polynomials
#math518#
Φ : = #tex2html_wrap_indisplay12277#A_i+1X^ν-i#tex2html_wrap_indisplay12278#Φ' : = #tex2html_wrap_indisplay12279#A'_i+1X^μ-i
whose degree (actually we have #math519#μ : = n - 1 and #math520#ν : = m - 1) is deduced by means of results similar to Theorem~#bezot1#2673>;
summing such two product, considering the linear system whose equations are the coefficients of the resulting polynomial and whose unknowns are the #math523#A_i, A'_is and
solving it by the method discussed above, <#2676#>id est<#2676#> via determinant expansion.

#Example2678#

#Example2702#

#History2716#

#Example2731#

The Sylvester resultant was, at least implicitly, introduced by Euler. His approach is essentially a variation of the one illustrated in Example~#41Ex1#2753>: given

#math525#

φ : = #tex2html_wrap_indisplay12311#a_i+1(X)Y^m-i, φ' : = #tex2html_wrap_indisplay12312#a'_i+1(X)Y^n-i∈k(X)[Y]

he multiplies them, respectively, by

#math526#

Φ : = a'₁Y^n-1 + #tex2html_wrap_indisplay12314#A_i+1Y^n-2-i#tex2html_wrap_indisplay12315#Φ' : = a₁Y^m-1 + #tex2html_wrap_indisplay12316#A'_i+1Y^m-2-i

and subtracts the result, obtaining a polynomial of degree m + n - 2 --- the coefficient of Y^m+n-1 being 0; thus equating the coefficients of #math527#Y, Y²,…, Y^m+n-2 we obtain m + n - 2 linear equations into #math528#(m - 1) + (n - 1) variables; the solution is then substituted in the constant coefficient #math529#A_n-1a_m+1 - A'_m-1a'_n+1 giving #math530#E(X)∈k(X).

The equation system can be expressed as

#math531#

M⋅(a'₁, A₁,…, A_n-1, a₁, A'₁,…, A'_m-1)^T = (0,…, 0, E(X))^T

where M is the Sylvester matrix.

It is clear that the procedure proposed by Euler is equivalent to the computation of the Sylvester resultant, so that #math532#E(X) = #tex2html_wrap_inline12327#(φ, φ') is the required resultant.

In order to present the English School view of Bézout's abridged method, I refer to Salmon's <#2780#>Higher Algebra<#2780#>.

Given two polynomials of the same degree

#math533#

U : = #tex2html_wrap_indisplay12330#a_i+1X^m-i,∈k[X], V : = #tex2html_wrap_indisplay12331#a'_i+1X^n-i, m = n

and denoting

#math534#

V_ρ : = #tex2html_wrap_indisplay12333#a'_i+1X^ρ-i, U_ρ : = #tex2html_wrap_indisplay12334#a_i+1X^ρ-i,#tex2html_wrap_indisplay12335#ρ, 0≤ρ ;SPMlt; n,

we compute the n polynomials of degree bounded by m - 1

#math535#

F_ρ : = V_ρ-1U - U_ρ-1V : = #tex2html_wrap_indisplay12339#α_ρσX^m-σ, 1≤ρ≤n;

we thus obtain the square matrix #math536##tex2html_wrap_inline12341#α_ρσ#tex2html_wrap_inline12342# whose determinant is the required resultant; in order to extend this construction of a square matrix when m ;SPMgt; n, we need to have e : = m - n more polynomials of degree bounded by m - 1; the choice is to take #math537#F_ρ : = X^n-ρV, n ;SPMlt; ρ≤m so that the resultant

is, therefore, as it ought to be, of the n^th degree in the coefficients of [U], and of the m^th in those of [V].

#Example2814#

The first which studied the <#2822#>bezoutic matrix<#2822#> #math538##tex2html_wrap_inline12353# : = #tex2html_wrap_inline12354#α_ρσ#tex2html_wrap_inline12355# was Jacobiwhich, among other comments, remarked that

#math539##tex2html_wrap_inline12358#X^ρα_ρσX^σ = U(X)#tex2html_wrap_inline12359# - V(X)#tex2html_wrap_inline12360# (pg.103);
#tex2html_wrap_inline12392# is symmetric (pg.102) and
has no <#2868#>factore superfluo<#2868#> (pg.104);
if #math541#det(#tex2html_wrap_inline12394#)≠ 0 the inverse of #tex2html_wrap_inline12396# is a Hankel matrix (pg. 104);
if #math546#det(#tex2html_wrap_inline12409#) = 0 the common roots α of U and V are in relations with the linear solutions #math547#(1, α,…, α^n-1) of #tex2html_wrap_inline12415# (pg. 104).

Sylvestergave in 1842 a formula to express the coefficients of the F_ρ (in case n = m) in terms of the determinants

#math548#

(i, j) : = (a_ia_j') : = #tex2html_wrap_indisplay12422##tex2html_wrap_indisplay12423##tex2html_wrap_indisplay12424#, 1≤i ;SPMlt; j≤n + 1,

as follows:

[Conceive] a number of cubic blocks each of which has two numbers, termed its <#2895#>characteristics<#2895#>, inscribed upon one of its faces, upon which the values of such a block (itself called an <#2896#>element<#2896#>) depends.

For instance, the value of the <#2897#>element<#2897#>, whose <#2898#>characteristics<#2898#> are r, s, is the difference between two products: the one of the coefficient rth in order occurring in the polynomial U, by that which comes sth in order of the polynomial V; the other product is that of the coefficient sth in order of the polynomials U, by that rth in order of V; so that if the degree of each equation be n, there will be altogether #math549##tex2html_wrap_inline12436#n(n + 1) such elements.

The blocks are formed into squares or flats (<#2900#>plafonds<#2900#>) of which the number is #math550##tex2html_wrap_inline12438# or #math551##tex2html_wrap_inline12440#, according as n is even or odd. The first of these contains n blanks 103 in a side, the next (n - 2), the next (n - 4), till finally we reach a square of four blocks or of one, according as n is even or odd. These flats are laid upon one another so as to form a regularly ascending pyramid, of which the two diagonal planes are termed the planes of separation and symmetry respectively. The former divides the pyramid into two halves, such that no element on the one side of it is the same as that of any block in the other. The plan of symmetry, as the name denotes, divides the pyramid into two exactly <#2903#>similar<#2903#> parts; it being a rule, that <#2904#>all elements lying in any given line of a square (platfond) parallel to the plane of separation are identical<#2904#>; moreover the sum of the characteristics is the same for <#2905#>all<#2905#> elements lying <#2906#>anywhere<#2906#> in a <#2907#>plane<#2907#> parallel to that of separation.

The formula behind this rule is

#math552#

α_ρσ = α_σρ = #tex2html_wrap_indisplay12447#(a_ρ+σ+1-j, a'_j).

(4)

#Example2917#

If m ;SPMgt; n the same formula is applied simply by expressing V as

#math553#

V : = #tex2html_wrap_indisplay12455#a'_i+1X^n-i = #tex2html_wrap_indisplay12456#b_i+1X^m-i

with #math554#b_i+1 = #tex2html_wrap_inline12458#

In 1857 Cayley gave in <#2987#>Crelle<#2987#>

<#4882#>la forme la plus simple sous laquelle on peut présenter cette méthode.

Pour éliminer [x]entre deux équations du #math555#n^{<#1#>ième<#1#>} degré

#math556#

#tex2html_wrap_indisplay12464#a_i+1x^n-i = 0,#tex2html_wrap_indisplay12465#a'_i+1x^n-i = 0

on n'a qu'a former l'équation identique <#4881#>

#math557#

		#tex2html_wrap_indisplay12468#
	=	(y^n-1, y^n-2,…, 1)#tex2html_wrap_indisplay12471##tex2html_wrap_indisplay12472##tex2html_wrap_indisplay12473##tex2html_wrap_indisplay12474##tex2html_wrap_indisplay12475##tex2html_wrap_indisplay12476#

<#4881#> oú l'expression qui forme le second membre représente la fonction suivant

#math558#

		#tex2html_wrap_indisplay12479#a_{0, 0}x^n-1 + a_{1, 0}x^n-2 + ^... + a_{n-1, 0}#tex2html_wrap_indisplay12480#y^n-1
	+	#tex2html_wrap_indisplay12483#a_{0, 1}x^n-1 + a_{1, 1}x^n-2 + ^... + a_{n-1, 1}#tex2html_wrap_indisplay12484#y^n-2
		^...
	+	#tex2html_wrap_indisplay12488#a_{0, n-1}x^n-1 + a_{1, n-1}x^n-2 + ^... + a_{n-1, n-1}#tex2html_wrap_indisplay12489#;

le résultat de l'élimination sera

#math559#

#tex2html_wrap_indisplay12491##tex2html_wrap_indisplay12492##tex2html_wrap_indisplay12493#.

<#4882#>

#Example3068#

Sylvester, in the case n = m, denotes the polynomials #math560#F_ρ, 1≤ρ≤n the <#3083#>bezoutians<#3083#> of U and V and remarks that

The determinant formed by arranging in a square the n sets of coefficients of the n Bezoutians, and which I shall term the Bezoutian matrix , gives, as is well known, the Resultant (meaning thereby the Result in its simplest form of eliminating the variables out) of U and V.

Eliminating dialytically, first X^n-1 between the first and the second, then X^n-1 and X^n-2 between the first, second and the third, and so on, and finally, all the powers of X between the first, second, third,...,nth of these Bezoutians, and repeating the first of them, we obtain a derived set of n equations, the right-hand members of which I shall term the secondary Bezoutains to U and V.

The 'dialytical elimination' performed by Sylvester on the expressions

#math562#

V_ρ-1U - U_ρ-1V = F_ρ, 1≤ρ≤m

returns

#math563#

V₀U - U₀V	=	F₁	= :	B₁
(α₂₁V₀ - α₁₁V₁)U - (α₂₁U₀ - α₁₁U₁)V	=	α₂₁F₁ - α₁₁F₂	= :	B₂
…
S_ρU - T_ρV			= :	B_ρ
…
S_m-1U - T_m-1V			= :	B_m-1.

where we have #math564#deg(S_ρ) = deg(T_ρ) = ρ -1, deg(B_ρ) = m - ρ; thus, assuming U, V to be

perfectly unrelated, and each the most general function that can be formed of the same degree

and in case m = n then if we repeatedly perfom the Division Algorithm and <#3145#>change the sign<#3145#> of each remainder, as in Section~13.3, we obtain the polynomial sequence#math572#U, V, R₂,…, R_m where each R_i necessarily satisfies #math573#deg(R_i) = m - i + 1. Thus, the

n successive <#3152#>Secondary Bezoutians<#3152#> to the system U, V [...] will (saving at least a numerical factor of a magnitude and algebraic sign to be determined, but which, when proper conventions are made, will be subsequently proved to be +1) represent the simplified [...] residue[s] to #math574##tex2html_wrap_inline12587#.

<#3156#>id est<#3156#> #math575#B_ρ = R_ρ+1 for each ρ.

Once obtained the Bezoutic square of two polynomials f, φ of the same degree m, Sylvester remarks that

this square [...] is symmetrical about one of its diagonals, and corresponds therefore (as every symmetrical matrix must do) to a homogeneous quadratic function of m variables of which it expresses the determinant. This quadratic function, which plays a great part in [...] the theory of real roots, I term the Bezoutiant.

[...]

In Section V. Arts. 56.57, I show that the <#3161#>total<#3161#> number of effective intercalations between the roots of two functions of the same degree is given by the <#3162#>inertia<#3162#> of that quadratic form which we agreed to term the Bezoutiant to f and φ; and in the following article (58) the result is extended to embrace the case contemplated in M.Sturm's theorem; that is to say, I show, that on replacing the function of x by a homogeneous function of x and y, the Bezoutiant of the two functions, which are respectively the differential derivates of f with respect to x and with respect to y, will serve to determine by its form or <#3177#>inertia<#3177#> the total number of real roots and of <#3178#>equal<#3178#> roots in f (x). The subject is pursued in the following Arts. 59,60. 107 [...] In Arts. 61, 62, 63, it is proved that the Bezoutiant is an invariative function of the functions from which it is derived; and in Art. 64 the important remark is added, that it is an invariant of that particular class to which I have given the name of Combinants, which have the property of remaining unaltereted, not only for linear transformations of the variables, but also for linear combinations of the functions containing the variables, possessing thus a character of double invariability. In Arts. 65, 66 I consider the relation of the Bezoutiant to the differential determinant, so called by Jacoby, but which for greater brevity I call the Jacobian. On proper substitutions being made in the Bezoutiant for the m variables which it contains [...], the Bezoutiant becomes identical with the Jacobian of f and Φ.

To illustrate the `proper substitution to be done' I give again the word to Sylvester

[The Bezoutiant] #math594#B(u₁,…, u_m) being a covariant of the system f and φ [...] on making #math595#u₁,…, u_m equal to [#math596#x^m-1, x^m-2y,…, y^m-1], B will become [...] what I am in the habit of calling the Jacobian (after the name of the late but ever-illustrious Jacobi), a term capable of application to any number of homogeneous functions of as many variables. In the case before us, where we have two functions of two variables, the Jacobian 109

#math597#

J(f, φ) = #tex2html_wrap_indisplay12667##tex2html_wrap_indisplay12668##tex2html_wrap_indisplay12669# = #tex2html_wrap_indisplay12670##tex2html_wrap_indisplay12671# - #tex2html_wrap_indisplay12672##tex2html_wrap_indisplay12673#.

[...] So in the case of a single function F of the degree m, the Bezoutoid, that is the Bezoutiant to #math598##tex2html_wrap_inline12677#,#tex2html_wrap_inline12678#, on making the (m - 1) variables which it contains identical with #math599#x^m-2, x^m-3y,…, y^m-2 respectively, becomes identical with the Jacobian to #math600##tex2html_wrap_inline12682#,#tex2html_wrap_inline12683#, that is the Hessian of F, namely

#math601#

#tex2html_wrap_indisplay12686##tex2html_wrap_indisplay12687##tex2html_wrap_indisplay12688#.

As an example of this property of the Bezoutiant, suppose

#math602#

f	=	ax³ + bx²y + cxy² + dy³,
φ	=	αx³ + βx²y + γxy² + δy³.

The Bezoutiant matrix becomes

#math603#

aβ - bα,	aγ - cα,	aδ - dα,
aγ - cα,	#tex2html_wrap_indisplay12702##tex2html_wrap_indisplay12703##tex2html_wrap_indisplay12704#,	bγ - cβ,
aδ - dα,	bγ - cβ,	cδ - dγ.

The Bezoutiant accordingly will be the quadratic function

#math604#

		(aβ - bα)u₁² + #tex2html_wrap_indisplay12711#(aδ - dα) + (bγ - cβ)#tex2html_wrap_indisplay12712#u₂² + (cδ - dγ)u₃²
	+	2(aγ - cα)u₁u₂ +2(aδ - dα)u₃u₁ +2(bγ - cβ)u₂u₃,

which on making

#math605#

u₁ = x², u₂ = xy, u₃ = y²

becomes

#math606#

Lx⁴ + Mx³y + Nx²y² + Pxy³ + Qy⁴,

where L, M, N, P, Q respectively will be the sum of the terms lying in the successive bands drawn parallel to the sinister diagonal of the Bezoutiant matrix, that is

#math607#

L	=	(aβ - bα),
M	=	2(aγ - cα),
N	=	3(aδ - dα) + (bγ - cβ),
P	=	2(bγ - cβ),
Q	=	(cδ - dγ).

The biquadratic function in x and y[...] will be found on computation to be identical in point of form with the Jacobian to f,φ, namely <#3241#>

#math608#

(3ax² +2bxy + cy²)(βx² +2γxy + 3δy²) - (3αx² +2βxy + γy²)(bx² +2cxy + 3dy²)

<#3241#> this latter being in fact

#math609#

3Lx⁴ +3Mx³y + 3Nx²y² +3Pxy³ +3Qy⁴.

and concludes commenting:

The remark is not without some interest, that in fact the Bezoutiant, which is capable (as has been shown already) of being mechanically constructed, gives the best and readiest means of calculating the Jacobian; for in summing the sinister bands transverse to the axis of symmmetry the only numerical operation to be performed is that of addition of positive integers, whereas the direct method involves the necessity of numerical subtractions as well as additions, inasmuch as the same terms will be repeated with different signs.

and remarks, in a different example, that, unlike the computation via Bezoutiant, the direct evaluation requires to effectively employ also <#3245#>division<#3245#> in order to reduce the Jacobian to its simplest form, being divisible by #math610#deg(f )= deg(φ).