A search for a mathematical expression for mass ratios using a large database


Simon Plouffe

April 19, 2004





Using a database of 610 million mathematical constants and expressions, a search was conducted to find a reasonable expression, judged on simplicity and length, for the mass ratios of fundamental particles. The mass ratios are the dimensionless values of the NIST-CODATA 2002 table. The search was restricted to 8 of the fundamental particles: the electron, proton, neutron, helion, tau, deuteron, alpha and muon, since their masses are known to sufficient precision.


The search uncovered a weakness in a well-known algorithm and produced a series of candidate expressions that I propose as the most appropriate mathematical answer to that question. The paper explains the different models used.





The mass ratio of the proton and the electron is a dimensionless number, which is equal to 1836.15267261 with an uncertainty (85) on the last 2 digits. In other words, 9 digits are valid with a confidence of 99%.


In the 60’s, Richard P. Feynman proposed the value Mp/Me ≈ 6π⁵ = 1836.1181087. At the time it was considered an inspired guess, good up to the known precision. Compared with the value known today and its standard error, it is in fact not very good: the relative error is 4.6 x 10⁻⁶, and the difference (Mp/Me − 6π⁵) is 40663 times the standard error. A search using the author’s database and programs on the other ratios failed to produce similar formulas in simple terms of πᵏ or exp(πk), with k an integer or even a rational.
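The comparison with the measured ratio can be checked numerically. The following is a minimal sketch, assuming the CODATA 2002 value and uncertainty quoted above and the expression 6π⁵ that the paper attributes to Feynman; the variable names are illustrative only.

```python
import math

# CODATA 2002 proton-electron mass ratio and its standard uncertainty,
# as quoted in the text: 1836.15267261(85).
MP_ME = 1836.15267261
MP_ME_SIGMA = 0.00000085

candidate = 6 * math.pi ** 5        # the expression attributed to Feynman
difference = MP_ME - candidate      # absolute deviation from the measured ratio
sigmas = difference / MP_ME_SIGMA   # deviation in units of the standard error

print(f"6*pi^5     = {candidate:.7f}")
print(f"difference = {difference:.7f}")
print(f"in sigmas  = {sigmas:.0f}")
```

The last figure reproduces the factor of about 40663 standard errors mentioned above.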


The idea of finding the best mathematical expression for the dimensionless numbers of physics goes back to Sir Arthur Eddington in the 30’s, with Einstein at the beginning; later Feynman, Gell-Mann and Dirac addressed or thought about the problem. There is an extensive literature on the subject.


The project was to use a large database of mathematical constants and specialized programs to find an expression for many, if not all, mass ratios. Previous results like Feynman’s were more or less guessed; I believe that method was inspired but not appropriate.


The Inverter and methodology


The Inverter is the private version of Plouffe’s Inverter found on the internet, with over 610 million mathematical constants, including the 99000 sequences of the On-Line Encyclopedia of Integer Sequences [Sloane] and a subset of 797 mathematical constants from Steven Finch’s Mathematical Constants [Finch]. The constants are values of known functions or series at rational points and known constants like gamma, π, √2 or the golden ratio. The construction of the tables was inspired by known references like [AS], [Boll], [Le Lionnais], [Potter-Robinson], [Sloane, Plouffe] and [Finch].


[Text box: Entries from the On-Line Encyclopedia of Integer Sequences: x = 1, −1, exp(−2π/k), … + concatenated sequences, continued fractions, …; 13 million entries at 16 digits.]


Each sequence of the OEIS is transformed into a real number entry using various power series representations, continued fractions and concatenations.
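The three kinds of transformation named above can be sketched as follows. This is only an illustration, not the Inverter’s actual code: the function names, the choice of the Fibonacci terms and the evaluation point exp(−π) are assumptions made for the example.

```python
import math
from fractions import Fraction

def power_series_value(terms, x):
    """Evaluate sum(a_n * x**n) over the known terms of the sequence."""
    return sum(a * x ** n for n, a in enumerate(terms))

def continued_fraction_value(terms):
    """Read the terms as partial quotients [a0; a1, a2, ...] (terms must be > 0)."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return float(value)

def concatenation_value(terms):
    """Concatenate the decimal digits of the terms after '0.'."""
    return float("0." + "".join(str(a) for a in terms))

# Hypothetical input: the first terms of one integer sequence.
fibonacci = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

entries = {
    "power series at x = exp(-pi)": power_series_value(fibonacci, math.exp(-math.pi)),
    "continued fraction": continued_fraction_value(fibonacci),
    "concatenation": concatenation_value(fibonacci),
}
for label, value in entries.items():
    print(f"{label:30s} {value:.16g}")
```

Each sequence thus yields several real-number entries, one per transformation, which can then be stored with the expression that produced them.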


The existence of the database is based on the assumption that a number like 0.42278433 can only be recognized if it is known in advance that it is 1-gamma. Gamma (0.5772156649…) is a constant well known in mathematics but not that well known otherwise, and there is no simple way to go from 0.422… to 1-gamma. As of today there are algorithms that can find that value, but they are not simple: the Integer Relations algorithms, such as the LLL algorithm (Lenstra, Lenstra, Lovasz) or the PSLQ algorithm of Bailey and Ferguson. Examples like 1-gamma can easily be constructed. To recognize a number takes tables and algorithms. Recognizing means proposing a mathematical expression from a series of known digits. Usually in mathematics we go from left to right: a mathematical expression is equal to a certain number to a given precision. Here we go in the opposite direction.
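A reverse lookup of this kind can be sketched with a small dictionary keyed on truncated digits. This is a toy model under stated assumptions: the four constants, the four transforms and the 8-digit key are illustrative choices, not the contents or layout of the real database.

```python
import math

EULER_GAMMA = 0.5772156649015329  # not provided by the math module

# Illustrative base constants and elementary transforms (assumptions).
CONSTANTS = {
    "gamma": EULER_GAMMA,
    "pi": math.pi,
    "sqrt(2)": math.sqrt(2),
    "phi": (1 + math.sqrt(5)) / 2,
}
TRANSFORMS = {
    "{}": lambda v: v,
    "1-{}": lambda v: 1 - v,
    "1/{}": lambda v: 1 / v,
    "{}^2": lambda v: v * v,
}

def digit_key(value, digits=8):
    """Truncate (not round) to `digits` decimals so partial inputs still match."""
    text = f"{value:.15f}"
    return text[: text.index(".") + 1 + digits]

def build_table(digits=8):
    """Map each truncated value to the expressions that produce it."""
    table = {}
    for name, value in CONSTANTS.items():
        for pattern, transform in TRANSFORMS.items():
            key = digit_key(transform(value), digits)
            table.setdefault(key, []).append(pattern.format(name))
    return table

def lookup(table, value, digits=8):
    """Return the matching expressions, shortest first."""
    return sorted(table.get(digit_key(value, digits), []), key=len)

TABLE = build_table()
print(lookup(TABLE, 0.42278433))  # → ['1-gamma']
```

Sorting the hits by expression length mirrors the simplicity criterion used in the paper to rank candidates.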


Once the database was set up with all known sources of mathematical constants, a global lookup was conducted against the CODATA table at precisions varying from 5 to 11 digits. A simple criterion, the length in characters of the expression associated with each number, is used to classify candidates.


To catch more possible candidates, the Farey fractions of order 12, 24, 60, 120, 240 and 256 (Farey(n) is the set of rationals in ]0,1] with denominators ≤ n), together with elementary transforms and functions of each entry, were used for lookups as well. For each CODATA entry a table was produced and sorted by increasing length of expression: a tree-shaped table with the simplest expression associated with a value at the top. These calculations produced a set of 23 million candidates that were compared and analyzed to find similarities among tables. No definite pattern emerged from that analysis; at the top of all trees, the simplest expression found was Mp/Mn ≈ cos(π/60), valid to 5 digits. That value is elegant and simple, but it is unique among the candidates and not precise enough: the difference from the real value is more than 10,000 times the standard error.
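The Farey-based lookup can be illustrated with a small sketch. It assumes a single transform, cos(πr), and takes the proton-neutron ratio as the inverse of the neutron-proton value 1.00137841870 that appears later in the paper; the real search used many more transforms and tables.

```python
import math
from fractions import Fraction

def farey(n):
    """Rationals p/q in ]0, 1], in lowest terms, with denominator q <= n."""
    return sorted({Fraction(p, q) for q in range(1, n + 1) for p in range(1, q + 1)})

# Assumed target: Mp/Mn taken as 1 / (Mn/Mp), with Mn/Mp = 1.00137841870.
MP_MN = 1 / 1.00137841870

# Try cos(pi * r) for every Farey fraction of order 60 and keep the closest.
best = min(farey(60), key=lambda r: abs(math.cos(math.pi * r) - MP_MN))
error = abs(math.cos(math.pi * best) - MP_MN)

print(f"best r = {best}, cos(pi*r) = {math.cos(math.pi * best):.11f}")
print(f"absolute error = {error:.3e}")
```

The closest fraction is 1/60, and the remaining error of about 6 x 10⁻⁶ shows why the match holds to 5 digits only.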


Integer Relations algorithm


Many names are known for the next algorithms. They are usually referred to as Integer Relation algorithms and can be stated in the following way. Let x = (x₁, x₂, x₃, …, xₙ) be a vector of real numbers. x is said to possess an integer relation if there exist integers aᵢ, not all zero, such that a₁x₁ + a₂x₂ + … + aₙxₙ = 0. By an integer relation algorithm, I mean an algorithm that is guaranteed (provided the computer implementation has sufficient numerical precision) to recover the vector of integers aᵢ, if it exists, or to produce bounds within which no integer relation can exist.
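The definition can be made concrete with an exhaustive search over small coefficients. This sketch is not PSLQ or LLL: it is a brute-force stand-in that only works for tiny bounds, and the function name, the bound and the tolerance are assumptions chosen for the example.

```python
import itertools
import math

def find_integer_relations(x, max_coeff=3, tol=1e-9):
    """Search all a_i in [-max_coeff, max_coeff] with |sum(a_i * x_i)| < tol."""
    relations = []
    for a in itertools.product(range(-max_coeff, max_coeff + 1), repeat=len(x)):
        first_nonzero = next((c for c in a if c != 0), 0)
        if first_nonzero <= 0:
            continue  # skip the zero vector and sign-flipped duplicates
        if abs(sum(c * v for c, v in zip(a, x))) < tol:
            relations.append(a)
    return relations

phi = (1 + math.sqrt(5)) / 2
print(find_integer_relations([1.0, math.sqrt(5), phi]))  # → [(1, 1, -2)]
```

The relation found is 1 + √5 − 2φ = 0, which is the definition of the golden ratio. The cost grows exponentially with the coefficient bound, which is why real integer relation algorithms are needed.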


The 2-dimensional version of the algorithm is the continued fraction algorithm, which is equivalent to Euclid’s algorithm, but it was not until 1977 that Ferguson and Forcade could properly state and explain the problem in many dimensions, and not until the mid 80’s that an efficient computer program could be made. As of today, most computer algebra programs like Maple, Mathematica and Pari-GP have a built-in program to perform such an operation and can provide an answer in almost all conditions up to a precision of 1000 digits. Some versions of the program [PSLQ] by David H. Bailey can go even further, to many thousands of digits, attacking harder mathematical problems.
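In two dimensions the idea reduces to continued fraction convergents: each convergent p/q of x gives integers with q·x − p close to 0. A minimal sketch in double precision follows; the function name and the stopping threshold are assumptions.

```python
import math

def convergents(x, count):
    """Return the first `count` continued-fraction convergents (p, q) of x."""
    p_prev, q_prev = 1, 0
    a = math.floor(x)
    p, q = a, 1
    result = [(p, q)]
    frac = x - a
    while len(result) < count and frac > 1e-12:
        x = 1 / frac
        a = math.floor(x)
        frac = x - a
        # Standard recurrence: p_k = a_k * p_(k-1) + p_(k-2), same for q.
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

for p, q in convergents(math.pi, 4):
    # Each pair is an approximate integer relation q*pi - p*1 = 0.
    print(f"{p}/{q}: residual {q * math.pi - p:+.3e}")
```

For π the first four convergents are 3/1, 22/7, 333/106 and 355/113, with residuals shrinking at each step.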


These programs can easily attack problems dealing with hundreds of decimal digits but are almost useless for small problems dealing with only 5, because of the error control. Usually the safe bound for error control is obtained by doubling the precision, but with 5 digits the answer provided is hardly valid. In the best cases 11 digits are just enough, but with the known error bounds for the constants the results are poor. Nevertheless an extensive search was conducted using models like a combination of π, exp(π), exp(2π) and such, and some individual results were found. But when an individual result is found that cannot be reproduced for the other ratios, it must be rejected, because it has no value apart from being a numerical curiosity. For powers of π alone, 1 million expressions similar to Feynman’s expression were found, but no pattern emerged there either.




























[Tables: Masses in kg and ratios from the NIST-CODATA 2002 table. Surviving row labels: alpha particle-proton, alpha particle-electron.]

























Notes: Only the values > 1 are taken into account, and only if they appear separately as a ratio in the CODATA 2002 table. The ratios are the results of many experiments and averages, not the result of the arithmetical operation of taking ratios of entries in the first table.

Model #1, spheres or Archimedean solids of n dimensions


N-dimensional spheres of uniform matter are the first model considered. The volume of an n-dimensional sphere of radius r is V(n) = π^(n/2) rⁿ / Γ(n/2 + 1); consequently the mass ratio should be rational in 3 dimensions and involve powers of π if the dimension is higher. I could not find any evidence for such a hypothesis. The next step is to consider semi-regular polytopes such as the Archimedean solids. The volume of such polytopes is expressed in radicals, so I expected the mass ratios to be as well.
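The standard formula for the volume of an n-dimensional ball, V(n) = π^(n/2) rⁿ / Γ(n/2 + 1), can be evaluated directly to see which powers of π appear in each dimension. A short sketch, with an illustrative function name:

```python
import math

def ball_volume(n, radius=1.0):
    """Volume of an n-dimensional ball: pi^(n/2) * r^n / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) * radius ** n / math.gamma(n / 2 + 1)

# Unit-ball volumes for the first few dimensions, for comparison with
# the closed forms pi, 4*pi/3, pi^2/2, 8*pi^2/15 and pi^3/6.
for n in range(2, 7):
    print(f"V({n}) = {ball_volume(n):.12f}")

# Two 3-dimensional balls of rational radii: pi cancels in the ratio.
print(ball_volume(3, 2.0) / ball_volume(3, 1.0))
```

The last line prints the cube of the radius ratio, a rational number, since π cancels between the two volumes.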




(Courtesy of Eric Weisstein from MathWorld at Wolfram Research.)



If I go back to the number cos(π/60), that number is algebraic and the value is

cos(π/60) = [√(10 + 2√5) (√6 + √2) + (√5 − 1)(√6 − √2)] / 16

This is a fairly simple expression and could represent a ratio of such polytopes but a problem arises when I consider the polynomial which has this number as a root. The degree is 24, cos(π/60) is one of the roots of



I can’t use the known 11 digits of Mp/Mn to find such a polynomial using the PSLQ or LLL algorithms. The only way to detect simple expressions with radicals from as few as 11 digits is to construct tables of values like cos(π/60): values of algebraic numbers with embedded radicals, roots of simple polynomials and combinations of roots of unity. In all, there are 245 million algebraic entries of various degrees in the main table. I applied the same method and found nothing simpler than the cos(π/60) expression, only more precise ones.



Model #2, expressions with π, exp(π) and various bases


I go back to Feynman’s expression and to ζ(n) and/or powers of π, and consider these products.


n is integer, p is a prime ≥ 2 and c is a composite number.

The product with the inverses of all primes is expressed with π⁴ and the product of inverses of all integers is related to π/exp(π); therefore the product of inverses of all composite numbers is simply the ratio. In other words, an expression with π² and 1/π² means something, whereas Mp/Me ≈ 6π⁵ + 328/π⁸ can hardly be explained in terms of primes and composites: the exponents have to be related in some way. Other bases like Fibonacci and φ were tried, and π bases and exp(π) bases as well. In all cases the result had to correspond to a pattern, a similarity in either the exponent or the coefficient. Unfortunately no patterns were found despite the numerous candidates.
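How easily such candidates arise can be seen by fitting a correction term B/πᵏ to the remainder left by 6π⁵. The sketch below assumes the CODATA 2002 value 1836.15267261(85) quoted in the introduction; the function name and the range of k are illustrative.

```python
import math

MP_ME = 1836.15267261   # CODATA 2002 proton-electron mass ratio
SIGMA = 0.00000085      # its standard uncertainty

remainder = MP_ME - 6 * math.pi ** 5

def best_correction(k):
    """Integer B minimizing |remainder - B / pi**k|, with the resulting error."""
    b = round(remainder * math.pi ** k)
    error = 6 * math.pi ** 5 + b / math.pi ** k - MP_ME
    return b, error

for k in range(1, 13):
    b, error = best_correction(k)
    print(f"k={k:2d}  B={b:6d}  error={error:+.2e}  ({abs(error) / SIGMA:.1f} sigma)")
```

For k = 8 the search returns B = 328, the expression quoted above. Larger k give ever closer fits simply because B has more digits, which is why an isolated match of this kind carries little weight.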


Here is a summary of the models with the number of entries and method used.


Type of model | Expected ratio type | Method of detection | Number of entries in main table
Best possible match with any entry | Any from shortest to longest expression | Lookups + variations | 610 million entries
Sphere in n dimension | Powers of π with simple algebraic numbers | Integer Relations or LLL | Dynamic lookup
Archimedean solid of n dimension | Algebraic numbers with embedded radicals | Lookups with variations | 145 million entries
Additive with single base like exp(πk), Xᵏ or roots of unity | Linear combination of base | Integer Relations or LLL and GFUN | Dynamic lookup
Related to a specific integer sequence evaluated at ±1, exp(−π) or exp(−2π) | Any from shortest to longest expression * | GFUN, LLL, lookups + variations | 13 million entries
Single constant generation X | Multiplicative with powers of X: A/B * Xᵏ | Construction of specialized table + lookups with variations | 145 million entries


Families of expressions found for many ratios

These are the near identities involving F(n), L(n) and φ. F(n) are the Fibonacci numbers: 1, 2, 3, 5, 8, 13, 21, 34, 55, …, the nearest integer to φⁿ/√5. L(n) are the Lucas numbers: 2, 3, 4, 7, 11, 18, 29, 47, 76, 123, …, the nearest integer to φⁿ. By using the definition of those numbers, I can deduce 3 basic transformations that will lead to a nearby value.
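The nearest-integer property behind these near identities is easy to verify. The sketch below uses the standard indexing F(0) = 0, F(1) = 1, L(0) = 2, L(1) = 1, which is an assumption about the intended indexing; with it, the property holds for n ≥ 2.

```python
import math

PHI = (1 + math.sqrt(5)) / 2

def fibonacci_lucas(n):
    """Return (F(n), L(n)) with F(0)=0, F(1)=1, L(0)=2, L(1)=1."""
    f, f_next = 0, 1
    l, l_next = 2, 1
    for _ in range(n):
        f, f_next = f_next, f + f_next
        l, l_next = l_next, l + l_next
    return f, l

for n in range(2, 13):
    f, l = fibonacci_lucas(n)
    # Ratios close to 1: L(n)/phi^n and sqrt(5)*F(n)/phi^n.
    print(f"n={n:2d}  F={f:4d}  L={l:4d}  "
          f"L/phi^n={l / PHI ** n:.8f}  sqrt5*F/phi^n={math.sqrt(5) * f / PHI ** n:.8f}")
```

Both ratios approach 1 quickly as n grows, so replacing φⁿ by L(n), or by √5·F(n), changes a value only slightly.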


And from there another series of transforms when r divides p, a and b being integers.


This last set of identities that are almost equal to 1 forced me to reconsider many of the expressions encountered, especially one found for the alpha particle and electron ratio, that is

 valid up to 7 digits


Since 11 ≈ φ⁵ (11 is the 5th Lucas number) and 29 ≈ φ⁷, this identity is in fact a power of φ in disguise. This is a very simple expression. But since I have 3 different expressions near 1, it means that there is in fact a set of values near that point. It is not surprising that I could not find it with the Integer Relations algorithms: it is linear in [log(φ), log(Malpha/Melectron), log(11), log(29)], but since the numeric precision is only 11 digits at most, the algorithm fails to find it. I had to construct a specialized table of 145 million entries to be sure to detect any of these relations, and this is what I have found.


But since each expression can be either improved in precision or simplified, it means that for each approximation there is a family of expressions near the value, an infinite family of expressions all similar. This is by far the simplest model I have found. Each expression is generated by 1 single number at a given power, that is, the golden ratio.


Those ripples appear in groups near the ratio values of the CODATA 2002 table; in some cases I could find families of families of values like




These patterns can be explained easily, since for each n the basic transformation will lead to a series of nearby values once transformed. This phenomenon explains why I have those series.


…, …, …, … ≈ 1.00137841870


Summary of results for φ = ½+√5/2 = 1.6180339887… 



[Table: Particle Ratio | Expression | Other expressions | Other expressions. Surviving row label: Alpha particle-Electron.]





Error analysis 



Particle Ratio | Relative error
… | 0.1375 x 10⁻⁸
… | 0.48467 x 10⁻⁴
… | 0.34040 x 10⁻⁵
… | 0.9035 x 10⁻¹⁰
… | 0.2782 x 10⁻⁷
… | 0.15001 x 10⁻⁶
… | 0.3754 x 10⁻⁷
… | 0.149 x 10⁻⁴
… | 0.22071 x 10⁻⁵
… | 0.17397 x 10⁻⁶
… | 0.1192 x 10⁻⁵
… | 0.2844 x 10⁻²
Alpha particle-Electron | 0.38535 x 10⁻³



Three expressions are out of the range of the normal error (in gray) and have to be rejected. All the others are within the normal error, that is ±3ε.



Now the question is: do these patterns occur with any real number, or do they occur especially for those ratios? By using an algorithm that systematically replaces an expression by a nearby expression with the basic transforms, I get roughly 1 digit of precision per iteration (or term). In other words, a 10-digit approximation of an arbitrary real number will lead to an expression with 10 terms. Looking at the size of the expressions obtained, I conclude that they are remarkable, that these ripples of values appear near the values of the CODATA 2002 table, and that they fit within the error bounds. The expressions are, in my opinion, the simplest mathematical expressions that can exist for those numbers.

As I mentioned, the problem is not to find an answer but to find an answer for the 16 ratios that makes sense and, above all, a comprehensible or simple answer, if there is any. After all, those ratios could vary with time and not be constant at all, as suggested by recent findings. Even Paul A.M. Dirac doubted that any such mathematical expression could exist.


A weakness discovered in Integer Relations algorithms when using a small precision.


In the course of the experiments I dealt with one simple case: an integer relation among 1, √5 and φ⁴⁸. As you may know, the golden ratio has many facets, and one of them is the relation φⁿ = F(n)φ + F(n−1), with F(n) being the n-th Fibonacci number. Another is that φⁿ is very near an integer when n >> 1. We expect the program to at least detect that, but it is not the case: when asked to solve in integers for [1, √5, φ⁴⁸] it answers [-1791659574, 1, -4006272456], when the actual answer is [5374978561, 2403763488, -1], that is, φ⁴⁸ = 5374978561 + 2403763488 √5, where 2403763488 = F(48)/2 and 5374978561 = F(48)/2 + F(47). It does well if I increase the number of working digits to 100, but that misses the point when the digits are set to 24. This is exactly why I could not rely on that algorithm to find valid integer relations with a working precision that goes from 5 to 11 digits.
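The true relation can be confirmed exactly with integer arithmetic, and its numerical residual can be measured at the two working precisions mentioned. This sketch does not reproduce the PSLQ or LLL run reported above; it only uses Python’s decimal module to show how small the residual of the correct relation is at 24 and at 100 digits. The function names are illustrative.

```python
from decimal import Decimal, localcontext

def fibonacci_lucas(n):
    """Return (F(n), L(n)) with F(0)=0, F(1)=1, L(0)=2, L(1)=1."""
    f, f_next = 0, 1
    l, l_next = 2, 1
    for _ in range(n):
        f, f_next = f_next, f + f_next
        l, l_next = l_next, l + l_next
    return f, l

# Exact check: phi^n = (L(n) + F(n)*sqrt(5)) / 2, so for n = 48 the
# integer relation is [L(48)/2, F(48)/2, -1] on [1, sqrt(5), phi^48].
f48, l48 = fibonacci_lucas(48)
relation = (l48 // 2, f48 // 2, -1)
print(relation)

def residual(digits):
    """|5374978561 + 2403763488*sqrt(5) - phi^48| at `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        sqrt5 = Decimal(5).sqrt()
        phi = (1 + sqrt5) / 2
        return abs(5374978561 + 2403763488 * sqrt5 - phi ** 48)

for digits in (24, 100):
    print(digits, residual(digits))
```

With coefficients of about 10 digits each, the residual at 24 working digits is limited by rounding of numbers near 10¹⁰, whereas at 100 digits it is far smaller and the relation stands out clearly.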



[Aspden, Eagles] H. Aspden and D. M. Eagles, Physics Letters A, v. 41, p. 423 (1972).

[AS] Abramowitz, M. and Stegun, I. A. (Eds.). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing. New York: Dover, 1972.

[Bailey, Ferguson] "Numerical Results on Relations Between Fundamental Constants Using a New Algorithm," Mathematics of Computation, vol. 53 (October 1989), p. 649 - 656.

[Bergeron, Plouffe] Computing the generating function of a series given its first few terms, Experimental Mathematics, Vol. 1 #4, 1992.

[Boll] Marcel Boll, Tables Numériques Universelles des laboratoires et bureaux d'étude, 881 pages. Dunod 1947 Paris.

[CSD] Jean-Paul Delahaye, Certitudes sans démonstrations, in Pour La Science, #249 , juillet 1998. See article :


[Finch] S. Finch, Mathematical Constants, Encyclopedia of Mathematics and its Applications, Cambridge University Press, (2003).

[GFUN] GFUN a maple package for manipulating power series, Bruno Salvy and Paul Zimmermann, Paris 1992, INRIA internal report.

[Hardy, Wright] Hardy, G. H. and Wright, E. M. An Introduction to the Theory of Numbers, 5th ed. Oxford, England: Oxford University Press, 1979.

[Hastad et al] J. Hastad, B. Just, J. C. Lagarias and C. P. Schnorr, "Polynomial Time Algorithms for Finding Integer Relations Among Real Numbers," SIAM Journal on Computing, vol. 18 (1988), p. 859 - 881.

[Le Lionnais] Les nombres remarquables, Paris, Hermann, 1983.

[LLL] A. K. Lenstra, H. W. Lenstra and L. Lovasz, Factoring Polynomials with Rational Coefficients, Math. Annalen, vol. 261 (1982), p. 515 - 534.

[NIST- CODATA 2002] http://physics.nist.gov/cuu/Constants/Table/allascii.txt

[Plouffe’s Inverter] http://pi.lacim.uqam.ca/eng/.

[Inverseur de Plouffe] http://pi.lacim.uqam.ca/fra/.

[Ferguson-Forcade] "Generalization of the Euclidean Algorithm for Real Numbers to All Dimensions Higher Than Two," Bulletin of the American Mathematical Society, 1 (1979), p. 912 - 914.

[Potter-Robinson] copy of a manuscript document from Jeff O. Shallit 1995.

[PSLQ] Ferguson, H. R. P. and Bailey, D. H. A Polynomial Time, Numerically Stable Integer Relation Algorithm. RNR Techn. Rept. RNR-91-032, Jul. 14, 1992.

[Sloane, Plouffe] The Encyclopedia of Integer Sequences, Academic Press, San Diego, 600 pp., 1995.

[Sloane N.J.A.] The On-Line Encyclopedia of Integer Sequences.


[Mohr-Taylor] Peter J. Mohr and Barry N. Taylor, CODATA Recommended Values of the Fundamental Physical Constants: 2002 (to appear). Latest version is CODATA of Dec. 2003.
See also : http://physics.nist.gov/constants