This article presents and explains several methods which can be used to calculate square roots.
Many of the methods for calculating square roots of a positive real number S require an initial seed value. If the initial value is too far from the actual square root, the calculation will be slowed down. It is therefore useful to have a rough estimate, which may be very inaccurate but easy to calculate. If S ≥ 1, let D be the number of digits to the left of the decimal point. If S < 1, let D be the negative of the number of zeros to the immediate right of the decimal point. Then the rough estimation is this:
Two and six are used because and
When working in the binary numeral system (as computers do internally), an alternative method is to use (here D is the number of binary digits).
Perhaps the first algorithm used for approximating is known as the "Babylonian method", named after the Babylonians,^{[1]} or "Hero's method", named after the firstcentury Greek mathematician Hero of Alexandria who gave the first explicit description of the method.^{[2]} It can be derived from (but predates) Newton's method. This is a quadratically convergent algorithm, which means that the number of correct digits of the approximation roughly doubles with each iteration. It proceeds as follows:
It can also be represented as:
This algorithm works equally well in the padic numbers, but cannot be used to identify real square roots with padic square roots; it is easy, for example, to construct a sequence of rational numbers by this method which converges to + 3 in the reals, but to − 3 in the 2adics.
To calculate , where S = 125348, to 6 significant figures, we will use the rough estimation method above to get x_{0}. The number of digits in S is D=6=2·2+2. So, n=2 and the rough estimate is
Therefore,
We let the relative error in x_{n} be defined by
and thus
Then one can show that
and thus that
and consequently that convergence is assured provided that x_{0} and S are both positive.
If one is using the rough estimate above with the Babylonian method, then the worst cases are:
Thus in any case,
Remember that rounding errors will slow the convergence. One should keep at least one extra digit beyond the desired accuracy of the x_{n} which you are calculating to minimize round off error.
Pocket calculators typically implement good routines to compute the exponential function and the natural logarithm, and then compute the square root of S using the identity
The same identity is used when computing square roots with logarithm tables or slide rules.
Another simple way to find a square root is the high/low method, similar to the bisection method. This method involves guessing a number based on known squares, then checking if its square is too high or too low and adjusting accordingly.
Suppose you want to find the square root of 20. You know that the square of 5 is 25, and that the square of 4 is 16, so it must be between 4 and 5. Now you guess 4.5. The square of 4.5 equals 20.25 and is too high, so you guess 4.4. This equals 19.36 and is too low. So the root is between 4.4 and 4.5. Continue this pattern until you get as many decimal places as needed. We are going to stop at three.
Now that we know that it is between 4.472 and 4.473, we now know that the square root of 20 to the first three decimal places is 4.472.
This is a method for finding an approximation to a square root which was described in an ancient manuscript known as the Bakhshali manuscript. It is equivalent to two iterations of the Babylonian method beginning with N. The original presentation goes as follows: To calculate , let N^{2} be the nearest perfect square to S. Then, calculate:
This can be also written as:
We'll find
This is a method to find each digit of the square root in a sequence. It is slower than the Babylonian method (if you have a calculator which can divide in one operation), but it has several advantages:
Napier's bones include an aid for the execution of this algorithm. The shifting nthroot algorithm is a generalization of this method.
The algorithm works for any base, and naturally, the way it proceeds depends on the base chosen.
Write the original number in decimal form. The numbers are written similar to the long division algorithm, and, as in long division, the root will be written on the line above. Now separate the digits into pairs, starting from the decimal point and going both left and right. The decimal point of the root will be above the decimal point of the square. One digit of the root will appear above each pair of digits of the square.
Beginning with the leftmost pair of digits, do the following procedure for each pair:
Find the square root of 152.2756.
1 2. 3 4 / \/ 01 52.27 56 01 1*1 <= 1 < 2*2 x = 1 01 y = x*x = 1*1 = 1 00 52 22*2 <= 52 < 23*3 x = 2 00 44 y = (20+x)*x = 22*2 = 44 08 27 243*3 <= 827 < 244*4 x = 3 07 29 y = (240+x)*x = 243*3 = 729 98 56 2464*4 <= 9856 < 2465*5 x = 4 98 56 y = (2460+x)*x = 2464*4 = 9856 00 00 Algorithm terminates: Answer is 12.34
Find the square root of 2.
1. 4 1 4 2 / \/ 02.00 00 00 00 02 1*1 <= 2 < 2*2 x = 1 01 y = x*x = 1*1 = 1 01 00 24*4 <= 100 < 25*5 x = 4 00 96 y = (20+x)*x = 24*4 = 96 04 00 281*1 <= 400 < 282*2 x = 1 02 81 y = (280+x)*x = 281*1 = 281 01 19 00 2824*4 <= 11900 < 2825*5 x = 4 01 12 96 y = (2820+x)*x = 2824*4 = 11296 06 04 00 28282*2 <= 60400 < 28283*3 x = 2 The desired precision is achieved: The square root of 2 is about 1.4142
Inherent to digitbydigit algorithms is a search and test step: find a digit, , when added to the right of a current solution ', such that , where is the value for which a root is desired. Expanding, we obtain . The current value of —or, usually, the remainder—can be incrementally updated efficiently when working in binary, as the value of will be a single bit, and the operations needed to compute and can be replaced with faster bit shift operations. This gives rise to simple computer implementations:^{[3]}
short sqrt(short num) { short op = num; short res = 0; short one = 1 << 14; // The secondtotop bit is set: 1L<<30 for long // "one" starts at the highest power of four <= the argument. while (one > op) one >>= 2; while (one != 0) { if (op >= res + one) { op = res + one; res = (res >> 1) + one; } else res >>= 1; one >>= 2; } return res; }
Faster algorithms, in binary and decimal or any other base, can be realized by using lookup tables—in effect trading more storage space for reduced run time.^{[4]}
The duplex method is a variant of the digit by digit method for calculating the square root of a whole or decimal number one digit at a time.^{[5]} The duplex is the square of the central digit plus double the crossproduct of digits equidistant from the center. The duplex is computed from the quotient digits (square root digits) computed thus far, but after the initial digits. The duplex is subtracted from the dividend digit prior to the second subtraction for the product of the quotient digit times the divisor digit. For perfect squares the duplex and the dividend will get smaller and reach zero after a few steps. For nonperfect squares the decimal value of the square root can be calculated to any precision desired. However, as the decimal places proliferate, the duplex adjustment gets larger and longer to calculate. The duplex method follows the Vedic ideal for an algorithm, oneline, mental calculation. It is flexible in choosing the first digit group and the divisor. Small divisors are to be avoided by starting with a larger initial group.
In short, to calculate the duplex of a number, double the product of each pair of equidistant digits plus the square of the center digit (of the digits to the right of the colon).
Number => Calculation = Duplex 574 ==> 2(5·4) + 7^{2} = 89 406,739 ==> 2(4·9)+ 2(0·3)+ 2(6·7) = 72+0+84 = 156 123,456 ==> 2(1·6)+ 2(2·5)+ 2(3·4) = 12 +20 +24 = 56 88,900,777 ==> 2(8·7)+2(8·7)+2(9·7)+2(0·0)+2(0·0) = 112+112+126+0+0 = 350 48329,03711 ==> 2(4·1)+2(8·1)+2(3·7)+2(2·3)+2(9·0)= 8+16+42+12+0 = 78
In a square root calculation the quotient digit set increases incrementally for each step.
Number => Calculation = Duplex: 1 ==> 1^{2} = 1 14 ==>2(1·4) = 8 142 ==> 2(1·2) + 4^{2} = 4 + 16 = 20 14,21 ==> 2(1·1) + 2(4·2) = 2 + 16 = 18 14213 ==> 6+8+4 = 18 142,135 ==> 10+24+4 = 38 1421356 ==> 12+40+12+1 = 65 1421,3562 ==> 4+48+20+6 = 78 142,135,623 ==> 6+16+24+10+9 = 65 142,1356,237 ==> 14+24+8+12+30 = 88 142,13562,373 ==> 6+56+12+4+36+25 = 139
Consider the perfect square 2809. 53^{2} = 2809. Use the duplex method to find the square root of 2,809.
Find the square root of 2809. Set down the number in groups of two digits. The number of groups gives the number of whole digits in the root. Put a colon after the first group, 28, to separate it. From the first group, 28, we obtain the divisor, 10, since 28>25=5^{2} and by doubling this first root, 2x5=10. Gross dividend: 28: 0 9. Using mental math: Divisor: 10) 3 0 Square: 10) 28: _{3}0 9 Duplex, Deduction: 25: xx 09 Square root: 5: 3. 0 Dividend: 30 00 Remainder: 3: 00 00 Square Root, Quotient: 5: 3. 0
Find the square root of 2,080,180,881. Solution by the duplex method: this tendigit square has five digitpairs, so it will have a fivedigit square root. The first digitpair is 20. Put the colon to the right. The nearest square below 20 is 16, whose root is 4, the first root digit. So, we use 2·4=8 for the divisor. Now we proceed with the duplex division, one digit column at a time. Prefix the remainder to the next dividend digit.
divisor; gross dividend: 8) 20: 8 0 1 8 0 8 8 1 read the dividend diagonally up: 4 8 7 11 10 10 0 8 minus the duplex: 16: xx 25 60 36 90 108 00 81 actual dividend: : 48 55 11 82 10 00 08 00 minus the product: : 40 48 00 72 00 00 0 00 remainder: 4: 8 7 11 10 10 0 8 00 quotient: 4: 5, 6 0 9. 0 0 0 0
Duplex calculations: Quotientdigits ==> Duplex deduction. 5 ==> 5^{2}= 25 5 and 6 ==> 2(5·6) = 60 5,6,0 ==> 2(5·0)+6^{2} = 36 5,6,0,9 ==> 2(5·9)+2(6·0) = 90 5,6,0,9,0 ==> 2(5·0)+2(6·9)+ 0 = 108 5,6,0,9,0,0 ==> 2(5·0)+2(6·0)+2(0·9) = 0 5,6,0,9,0,0,0 ==> 2(5·0)+2(6·0)+2(0·0)+9^{2} = 81
Hence the square root of 2,080,180,881 is exactly 45,609.
Find the square root of two to ten places. Let us take 20,000 as the beginning group, using three digitpairs at the start. The perfect square just below 20,000 is 141, since 141^{2} = 19881 < 20,000. So, the first root digits are 141 and the divisor doubled, 2 x 141 = 282. With a larger divisor the duplex will be relatively small. Hence, we can pick the multiple of the divisor without confusion.
Dividend: 2.0000 : 0 0 0 0 0 0 0 0 Diagonal;Divisor: (282) : 1190 620 400 1020 1620 1820 750 1120 Minus duplex: : xxxx 16 16 12 28 53 74 59 Actual dividend: 20000 : 1190 604 384 1008 1592 1767 676 1061 Minus product: 19881 : 1128 564 282 846 1410 1692 564 846 Remainder: 119 : 62 40 102 162 182 75 112 215 Root quotient: 1.41 : 4 2 1 3 5 6 2 3
Ten multiples of 282: 282; 564; 846; 1128; 1410; 1692; 1974; 2256; 2538; 2820.
This method is applicable for finding the square root of and converges best for . This, however, is no real limitation for a computer based calculation, as in base 2 floating point and fixed point representations, it is trivial to multiply by an integer power of 4, and therefore by the corresponding power of 2, by changing the exponent or by shifting, respectively. Therefore, one can move to the range . Moreover, the following method does not employ general divisions, but only additions, subtractions, multiplications, and divisions by powers of two, which are again trivial to implement. A disadvantage of the method is that numerical errors accumulate, in contrast to single variable iterative methods such as the Babylonian one.
The initialization step of this method is
while the iterative steps read
Then, (while ).
Note that the convergence of , and therefore also of , is quadratic.
The proof of the method is rather easy. First, we rewrite the iterative definition of as
Then it is straightforward to prove by induction that
and therefore the convergence of to the desired result is ensured by the convergence of to 0, which in turn follows from .
This method was developed around 1950 by M. V. Wilkes, D. J. Wheeler and S. Gill^{[6]} for use on EDSAC, one of the first electronic computers^{[7]}. The method was later generalized, allowing the computation of nonsquare roots^{[8]}.
The following are iterative methods for finding the reciprocal square root of S which is . Once it has been found, we can find by simple multiplication: . These iterations involve only multiplication, and not division. They are therefore faster than the Babylonian method. However, they are not stable. If the initial value is not close to the reciprocal square root, the iterations will diverge away from it rather than converge to it. It can therefore be advantageous to perform an iteration of the Babylonian method on a rough estimate before starting to apply these methods.
If N is an approximation to , a better approximation can be found by using the Taylor series of the square root function:
As an iterative method, the order of convergence is equal to the number of terms used. With 2 terms, it is identical to the Babylonian method; With 3 terms, each iteration takes almost as many operations as the Bakhshali approximation, but converges more slowly. Therefore, this is not a particularly efficient way of calculation.
Finding is the same as solving the equation . Therefore, any general numerical rootfinding algorithm can be used. Newton's method, for example, reduces in this case to the Babylonian method. Other methods are less efficient than the ones presented above.
A completely different method for computing the square root is based on the CORDIC algorithm, which uses only very simple operations (addition, subtraction, bitshift and table lookup, but no multiplication).
Quadratic irrationals (numbers of the form , where a, b and c are integers), and in particular, square roots of integers, have periodic continued fractions. Sometimes we may be interested not in finding the numerical value of a square root, but rather in its continued fraction expansion. The following iterative algorithm can be used for this purpose (S is any natural number which is not a perfect square):
Notice that m_{n}, d_{n}, and a_{n} are always integers. The algorithm terminates when this triplet is the same as one encountered before. The expansion will repeat from then on. The sequence [a_{0}; a_{1}, a_{2}, a_{3}, …] is the continued fraction expansion:
We begin with m_{0} = 0; d_{0} = 1; and a_{0} = 10 (10^{2} = 100 and 11^{2} = 121 > 114 so 10 chosen).
So, m_{1} = 10; d_{1} = 14; and a_{1} = 1.
Next, m_{2} = 4; d_{2} = 7; and a_{2} = 2.
Now, loop back to the second equation above.
Consequently, the continued fraction for the square root of 114 is
Pell's equation and its variants yield a method for efficiently finding continued fraction convergents of square roots of integers. However, it can be complicated to execute, and usually not every convergent is generated. The ideas behind the method are as follows:
The method is as follows:
On computers, a very rapid Newton'smethodbased approximation to the square root can be obtained for floating point numbers when computers use an IEEE (or sufficiently similar) representation.
This technique is based on the fact that the IEEE floating point format approximates base2 logarithm. For example, you can get the approximate logarithm of 32bit single precision floating point number by translating its binary representation as an integer, scaling it by 2^{23}, and removing a bias of 127.
For example, 1.0 is represented by a hexadecimal number 0x3F800000, which would represent 1065353216 = if taken as an integer. Using the formula above you get 1065353216 / 2^{23} − 127 = 0, as expected from log_{2}(1.0). In a similar fashion you get 0.5 from 1.5(0x3FC00000).
In order to get the square root, we can divide the logarithm by 2 and convert the value back. The following program demonstrates the idea. Note that we intentionally allow the exponent's lowest bit to propagate into the mantissa.
float fastsqrt(float val) { union { int tmp; float val; } u; u.val = val; u.tmp = 1<<23; /* Remove last bit so 1.0 gives 1.0 */ /* tmp is now an approximation to logbase2(val) */ u.tmp >>= 1; /* divide by 2 */ u.tmp += 1<<29; /* add 64 to exponent: (e+127)/2 =(e/2)+63, */ /* that represents (e/2)64 but we want e/2 */ return u.val; }
In the above, the operations to remove last exponent bit and add the IEEE bias can be combined into a single operation. An additional adjustment can be made in the same operation to reduce the maximum relative error. So, the three operations, not including the cast, can be rewritten as:
tmp = (1<<29) + (tmp >> 1)  (1<<22) + m;
Where m is a bias for adjusting the approximation errors. For example, with m = 0 you get accurate results for even powers of 2 (e.g. 1.0), but for other numbers the results will be slightly too big (e.g. you get 1.5 for 2.0 instead of 1.414... with 6% error). With m = 0x4C000 you get errors between about 3.5% and 3.5%.
If the approximation is to be used for an initial guess for Newton's method to the equation (1 / x^{2}) − S = 0, you need to use a reciprocal form shown in the following section.
A variant of the above routine is included below, which can be used to compute the reciprocal of the square root, i.e. instead, was written by Greg Walsh, and implemented into SGI Indigo by Gary Tarolli.^{[9]}^{[10]} The integershift approximation produced a relative error of less than 4%, and the error dropped further to 0.15% with one iteration of Newton's method on the following line.^{[11]} In computer graphics it is a very efficient way to normalize a vector.
float invSqrt(float x) { float xhalf = 0.5f*x; union { float x; int i; } u; u.x = x; u.i = 0x5f3759df  (u.i >> 1); x = u.x * (1.5f  xhalf * u.x * u.x); return x; }
Some VLSI hardware implements inverse square root using a second degree polynomial estimation followed by a Goldschmidt iteration.^{[12]}
