Adds a test that triggers the RISC-V fused multiply-add code
generation bug fixed by CL 506575.
Change-Id: Ia3a55a68b48c5cc6beac4e5235975dea31f3faf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/507035
Auto-Submit: M Zhuo <mzh@golangcn.org>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Run-TryBot: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
When x*y == -z the portable implementation of FMA copied the sign
bit from x*y into the result. This meant that when x*y == -z and
x*y < 0 the result was -0 which is incorrect.
Fixes#61130.
Change-Id: Ib93a568b7bdb9031e2aedfa1bdfa9bddde90851d
Reviewed-on: https://go-review.googlesource.com/c/go/+/507376
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joedian Reid <joedian@golang.org>
This reverts CL 467515. Now that we have cmp.Compare,
we don't need math.Compare or math.Compare32 after all.
For #56491Fixes#60519
Change-Id: I8ed33464adfc6d69bd6b328edb26aa2ee3d234d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/499416
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Eli Bendersky <eliben@google.com>
This change introduces the Compare and Compare32 functions
based on the total-ordering predicate in IEEE-754, section 5.10.
In particular,
* -NaN is ordered before any other value
* +NaN is ordered after any other value
* -0 is ordered before +0
* All other values are ordered the usual way
Compare-8 0.4537n ± 1%
Compare32-8 0.3752n ± 1%
geomean 0.4126n
Fixes#56491.
Change-Id: I5c9c77430a2872f380688c1b0a66f2105b77d5ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/467515
Reviewed-by: WANG Xuerui <git@xen0n.name>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This change introduces the Compare and Compare32 functions
based on the total-ordering predicate in IEEE-754, section 5.10.
In particular,
* -NaN is ordered before any other value
* +NaN is ordered after any other value
* -0 is ordered before +0
* All other values are ordered the usual way
name time/op
Compare-8 0.24ns ± 1%
Compare32-8 0.24ns ± 0%
Fixes#56491.
Change-Id: I9444fbfefe26741794c4436a26d403b8da97bdaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/459435
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
The existing implementation does a float64 to int64 conversion in order to check whether the number is odd, however it does not check for overflows. If an overflow occurs, the result is implementation-defined and while it happens to work on amd64 and i386, it produces an incorrect result on arm64 and possibly other architectures.
This change fixes that and also avoids calling isOddInt altogether if the base is +0, because it's unnecessary.
(I was considering avoiding the extra check if runtime.GOARCH is "amd64" or "i386", but I can't see this pattern being used anywhere outside the tests. And having separate files with build tags just for isOddInt() seems like an overkill)
Fixes#57465
Change-Id: Ieb243796194412aa6b98fac05fd19766ca2413ef
GitHub-Last-Rev: 3bfbd85c4cd6c5dc3d15239e180c99764a19ca88
GitHub-Pull-Request: golang/go#57494
Reviewed-on: https://go-review.googlesource.com/c/go/+/459815
Auto-Submit: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Bypass: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
And then revert the bootstrap cmd directories and certain testdata.
And adjust tests as needed.
Not reverting the changes in std that are bootstrapped,
because some of those changes would appear in API docs,
and we want to use any consistently.
Instead, rewrite 'any' to 'interface{}' in cmd/dist for those directories
when preparing the bootstrap copy.
A few files changed as a result of running gofmt -w
not because of interface{} -> any but because they
hadn't been updated for the new //go:build lines.
Fixes#49884.
Change-Id: Ie8045cba995f65bd79c694ec77a1b3d1fe01bb09
Reviewed-on: https://go-review.googlesource.com/c/go/+/368254
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
The original value was rounded too early, which lead to the
surprising behavior that float64(math.SmallestNonzeroFloat64 / 2)
wasn't 0. That is, the exact compile-time computation of
math.SmallestNonzeroFloat64 / 2 resulted in a value that was
rounded up when converting to float64. To address this, added 3
more digits to the mantissa, ending in a 0.
While at it, also slightly increased the precision of MaxFloat64
to end in a 0.
Computed exact values via https://play.golang.org/p/yt4KTpIx_wP.
Added a test to verify expected behavior.
In contrast to the other (irrational) constants, expanding these
extreme values to more digits is unlikely to be important as they
are not going to appear in numeric computations except for tests
verifying their correctness (as is the case here).
Re-enabled a disabled test in go/types and types2.
Updates #44057.
Fixes#44058.
Change-Id: I8f363155e02331354e929beabe993c8d8de75646
Reviewed-on: https://go-review.googlesource.com/c/go/+/315170
Trust: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.
On arm64 flatform.
previous:
FCVTSD F0, F0
FSQRTD F0, F0
FCVTDS F0, F0
optimized:
FSQRTS S0, S0
And this patch adds the test case to check the correctness.
This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)
Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The s390x assembly implementation was previously only handling this
case correctly for x = -Pi. Update the special case handling for
any y.
Fixes#35446
Change-Id: I355575e9ec8c7ce8bd9db10d74f42a22f39a2f38
Reviewed-on: https://go-review.googlesource.com/c/go/+/223420
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
This makes it a little less likely the portable FMA will be
broken without realizing it.
Change-Id: I7f7f4509b35160a9709f8b8a0e494c09ea6e410a
Reviewed-on: https://go-review.googlesource.com/c/go/+/205337
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This API was added for #25819, where it was discussed as math.FMA.
The commit adding it used math.Fma, presumably for consistency
with the rest of the unusual names in package math
(Sincos, Acosh, Erfcinv, Float32bits, etc).
I believe that using an idiomatic Go name is more important here
than consistency with these other names, most of which are historical
baggage from C's standard library.
Early additions like Float32frombits happened before "uppercase for export"
(so they were originally like "float32frombits") and they were not properly
reconsidered when we uppercased the symbols to export them.
That's a mistake we live with.
The names of functions we have added since then, and even a few
that were legacy, are more properly Go-cased, such as IsNaN, IsInf,
and RoundToEven, rather than Isnan, Isinf, and Roundtoeven.
And also constants like MaxFloat32.
For new API, we should keep using proper Go-cased symbols
instead of minimally-upper-cased-C symbols.
So math.FMA, not math.Fma.
This API has not yet been released, so this change does not break
the compatibility promise.
This CL also modifies cmd/compile, since the compiler knows
the name of the function. I could have stopped at changing the
string constants, but it seemed to make more sense to use a
consistent casing everywhere.
Change-Id: I0f6f3407f41e99bfa8239467345c33945088896e
Reviewed-on: https://go-review.googlesource.com/c/go/+/205317
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently, the precision of the float64 multiply-add operation
(x * y) + z varies across architectures. While generated code for
ppc64, s390x, and arm64 can guarantee that there is no intermediate
rounding on those platforms, other architectures like x86, mips, and
arm will exhibit different behavior depending on available instruction
set. Consequently, applications cannot rely on results being identical
across GOARCH-dependent codepaths.
This CL introduces a software implementation that performs an IEEE 754
double-precision fused-multiply-add operation. The only supported
rounding mode is round-to-nearest ties-to-even. Separate CLs include
hardware implementations when available. Otherwise, this software
fallback is given as the default implementation.
Specifically,
- arm64, ppc64, s390x: Uses the FMA instruction provided by all
of these ISAs.
- mips[64][le]: Falls back to this software implementation. Only
release 6 of the ISA includes a strict FMA instruction with
MADDF.D (not implementation defined). Because the number of R6
processors in the wild is scarce, the assembly implementation
is left as a future optimization.
- x86: Guards the use of VFMADD213SD by checking cpu.X86.HasFMA.
- arm: Guards the use of VFMA by checking cpu.ARM.HasVFPv4.
- software fallback: Uses mostly integer arithmetic except
for input that involves Inf, NaN, or zero.
Updates #25819.
Change-Id: Iadadff2219638bacc9fec78d3ab885393fea4a08
Reviewed-on: https://go-review.googlesource.com/c/go/+/127458
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Modify the |x| == |y| case to return -0 when x < 0.
Fixes#30814.
Change-Id: Ic4cd48001e0e894a12b5b813c6a1ddc3a055610b
Reviewed-on: https://go-review.googlesource.com/c/go/+/167479
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The original port of Log1p incorrectly translated a ternary statement
so that a correction was only applied to one of the branches.
Fixes#29488
Change-Id: I035b2fc741f76fe7c0154c63da6e298b575e08a4
Reviewed-on: https://go-review.googlesource.com/c/156120
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This change implements Payne-Hanek range reduction by Pi/4
to properly calculate trigonometric functions of huge arguments.
The implementation is based on:
"ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit"
K. C. Ng et al, March 24, 1992
The major difference with the reference is that the simulated
multi-precision calculation of x*B is implemented using 64-bit
integer arithmetic rather than floating point to ease extraction
of the relevant bits of 4/Pi.
The assembly implementations for 386 were removed since the trigonometric
instructions only use a 66-bit representation of Pi internally for
reduction. It is not possible to use these instructions and maintain
accuracy without a prior accurate reduction in software as recommended
by Intel.
Fixes#6794
Change-Id: I31bf1369e0578891d738c5473447fe9b10560196
Reviewed-on: https://go-review.googlesource.com/c/153059
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ceil and Trunc of -0.2 return -0, not +0, but we didn't test that.
Updates #23647
Change-Id: Idbd4699376abfb4ca93f16c73c114d610d86a9f2
Reviewed-on: https://go-review.googlesource.com/91335
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Test large but not infinite arguments.
This CL adds a test which breaks s390x. Don't submit until
a fix for that is figured out.
Update #26477
Change-Id: Ic86739fe3554e87d7f8e15482875c198fcf1d59c
Reviewed-on: https://go-review.googlesource.com/125641
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Each URL was manually verified to ensure it did not serve up incorrect
content.
Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
One might try to implement the Mod or Remainder function with the expression
x - TRUNC(x/y + 0.5)*y, but in fact this method is wrong, because the rounding
of (x/y + 0.5) to initialize the argument of TRUNC may lose too much precision.
However, the current test cases can not detect this error. This CL adds two
test cases to prevent people from continuing to do such attempts.
Change-Id: I6690f5cffb21bf8ae06a314b7a45cafff8bcee13
Reviewed-on: https://go-review.googlesource.com/84275
Reviewed-by: Robert Griesemer <gri@golang.org>
Before this change, the smallest result Ldexp can handle was
ldexp(2, -1075), which is SmallestNonzeroFloat64.
There are some numbers below it should also be rounded to
SmallestNonzeroFloat64. The change fixes this.
Fixes#23407
Change-Id: I76f4cb005a6e9ccdd95b5e5c734079fd5d29e4aa
Reviewed-on: https://go-review.googlesource.com/87338
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Fixes#23224
The previous Pow code had an optimization for
powers equal to ±0.5 that used Sqrt for
increased accuracy/speed. This caused special
cases involving powers of ±0.5 to disagree with
the Pow spec. This change places the Sqrt optimization
after all of the special case handling.
Change-Id: I6bf757f6248256b29cc21725a84e27705d855369
Reviewed-on: https://go-review.googlesource.com/85660
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Dim performance has regressed by 14% vs 1.9 on amd64.
Current pure go version of Dim is faster and,
what is even more important for performance, is inlinable, so
instead of tweaking asm implementation, just remove it.
I had to update BenchmarkDim, because it was simply reloading
constant(answer) in a loop.
Perf data below:
name old time/op new time/op delta
Dim-6 6.79ns ± 0% 1.60ns ± 1% -76.39% (p=0.000 n=7+10)
If I modify benchmark to be the same as in this CL results are even better:
name old time/op new time/op delta
Dim-6 10.2ns ± 0% 1.6ns ± 1% -84.27% (p=0.000 n=8+10)
Updates #21913
Change-Id: I00e23c8affc293531e1d9f0e0e49f3a525634f53
Reviewed-on: https://go-review.googlesource.com/80695
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Rounding ties to even is statistically useful for some applications.
This implementation completes IEEE float64 rounding mode support (in
addition to Round, Ceil, Floor, Trunc).
This function avoids subtle faults found in ad-hoc implementations, and
is simple enough to be inlined by the compiler.
Fixes#21748
Change-Id: I09415df2e42435f9e7dabe3bdc0148e9b9ebd609
Reviewed-on: https://go-review.googlesource.com/61211
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
CL 62250 makes constant folding a bit more aggressive and these
benchmarks were optimized away. This CL adds some indirection to
the function arguments to stop them being folded.
The Copysign benchmark is a bit faster because I've left one
argument as a constant and it can be partially folded.
old CL 62250 this CL
Copysign 1.24ns ± 0% 0.34ns ± 2% 1.02ns ± 2%
Abs 0.67ns ± 0% 0.35ns ± 3% 0.67ns ± 0%
Signbit 0.87ns ± 0% 0.35ns ± 2% 0.87ns ± 1%
Change-Id: I9604465a87d7aa29f4bd6009839c8ee354be3cd7
Reviewed-on: https://go-review.googlesource.com/62450
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This function avoids subtle faults found in many ad-hoc implementations,
and is simple enough to be inlined by the compiler.
Fixes#20100
Change-Id: Ib320254e9b1f1f798c6ef906b116f63bc29e8d08
Reviewed-on: https://go-review.googlesource.com/43652
Reviewed-by: Robert Griesemer <gri@golang.org>
This commit defines the inverse of error function (erfinv) in the
math package. The function is based on the rational approximation
of percentage points of normal distribution available at
https://www.jstor.org/stable/pdf/2347330.pdf.
Fixes#6359
Change-Id: Icfe4508f623e0574c7fffdbf7aa929540fd4c944
Reviewed-on: https://go-review.googlesource.com/46990
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The existing implementation is translated from C, which uses a
polynomial coefficient very close to 1/6. If the function uses
1/6 as this coeffient, the result of Exp(1) will be more accurate.
And this change doesn't introduce more error to Exp function.
Fixes#20319
Change-Id: I94c236a18cf95570ebb69f7fb99884b0d7cf5f6e
Reviewed-on: https://go-review.googlesource.com/49294
Reviewed-by: Robert Griesemer <gri@golang.org>
The current implementation uses a shift and add
loop to compute the product of x's exponent xe and
the integer part of y (yi) for yi up to 1<<63.
Since xe is an 11-bit exponent, this product can be
up to 74-bits and overflow both 32 and 64-bit int.
This change checks whether the accumulated exponent
will fit in the 11-bit float exponent of the output
and breaks out of the loop early if overflow is detected.
The current handling of yi >= 1<<63 uses Exp(y * Log(x))
which incorrectly returns Nan for x<0. In addition,
for y this large, Exp(y * Log(x)) can be enumerated
to only overflow except when x == -1 since the
boundary cases computed exactly:
Pow(NextAfter(1.0, Inf(1)), 1<<63) == 2.72332... * 10^889
Pow(NextAfter(1.0, Inf(-1)), 1<<63) == 1.91624... * 10^-445
exceed the range of float64. So, the call can be
replaced with a simple case statement analgous to
y == Inf that correctly handles x < 0 as well.
Fixes#7394
Change-Id: I6f50dc951f3693697f9669697599860604323102
Reviewed-on: https://go-review.googlesource.com/48290
Reviewed-by: Robert Griesemer <gri@golang.org>
Add test cases to verify behavior for Ldexp with exponents outside the
range of Minint32/Maxint32, for a gccgo bug.
Test for issue #21323.
Change-Id: Iea67bc6fcfafdfddf515cf7075bdac59360c277a
Reviewed-on: https://go-review.googlesource.com/54230
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Removes init function from the math package.
Allows stripping of arrays with pre-computed values
used for Pow10 from binaries if Pow10 is not used.
cmd/go shrinks by 128 bytes.
Fixed small values like 10**-323 being 0 instead of 1e-323.
Overall precision is increased but still not as good as
predefined constants for some inputs.
Samples:
Pow10(208)
before: 1.0000000000000006662e+208
after: 1.0000000000000000959e+208
Pow10(202)
before 1.0000000000000009895e+202
after 1.0000000000000001193e+202
Pow10(60)
before 1.0000000000000001278e+60
after 0.9999999999999999494e+60
Pow10(-100)
before 0.99999999999999938551e-100
after 0.99999999999999989309e-100
Pow10(-200)
before 0.9999999999999988218e-200
after 1.0000000000000001271e-200
name old time/op new time/op delta
Pow10Pos-4 44.6ns ± 2% 1.2ns ± 1% -97.39% (p=0.000 n=19+17)
Pow10Neg-4 50.8ns ± 1% 4.1ns ± 2% -92.02% (p=0.000 n=17+19)
Change-Id: If094034286b8ac64be3a95fd9e8ffa3d4ad39b31
Reviewed-on: https://go-review.googlesource.com/36331
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Test finite negative x with Y0(-1), Y1(-1), Yn(2,-1), Yn(-3,-1).
Also test the special case Yn(0,0).
Fixes#19130.
Change-Id: I95f05a72e1c455ed8ddf202c56f4266f03f370fd
Reviewed-on: https://go-review.googlesource.com/37310
Reviewed-by: Robert Griesemer <gri@golang.org>
Add exported global variables and store the results of benchmarked
functions in them. This prevents the current compiler optimizations
from removing the instructions that are needed to compute the return
values of the benchmarked functions.
Change-Id: If8b08424e85f3796bb6dd73e761c653abbabcc5e
Reviewed-on: https://go-review.googlesource.com/37195
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Unlike the pure go implementation used by every other architecture,
the amd64 asm implementation of Exp does not fail early if the
argument is known to overflow. Make it fail early.
Cost of the check is < 1ns (on an old Sandy Bridge machine):
name old time/op new time/op delta
Exp-4 18.3ns ± 1% 18.7ns ± 1% +2.08% (p=0.000 n=18+20)
Fixes#14932Fixes#18912
Change-Id: I04b3f9b4ee853822cbdc97feade726fbe2907289
Reviewed-on: https://go-review.googlesource.com/36271
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Using 387 mode was computing it without underflow to zero,
apparently due to an 80-bit intermediate. Avoid underflow even
with 64-bit floats.
This eliminates the TODOs in the test suite.
Fixes linux-386-387 build and fixes#11441.
Change-Id: I8abaa63bfdf040438a95625d1cb61042f0302473
Reviewed-on: https://go-review.googlesource.com/30540
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The 387 implementation is less accurate and slower.
name old time/op new time/op delta
Exp-8 29.7ns ± 2% 24.0ns ± 2% -19.08% (p=0.000 n=10+10)
This makes Gamma more accurate too.
Change-Id: Iad33b9cce0b087ccbce3e08ba7a6d285c4999d02
Reviewed-on: https://go-review.googlesource.com/30230
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
1.7 introduced a significant regression compared to 1.6:
SqrtIndirect-4 2.32ns ± 0% 7.86ns ± 0% +238.79% (p=0.000 n=20+18)
This is caused by sqrtsd preserving upper part of destination register.
Which introduces dependency on previous value of X0.
In 1.6 benchmark loop didn't use X0 immediately after call:
callq *%rbx
movsd 0x8(%rsp),%xmm2
movsd 0x20(%rsp),%xmm1
addsd %xmm2,%xmm1
mov 0x18(%rsp),%rax
inc %rax
jmp loop
In 1.7 however xmm0 is used just after call:
callq *%rbx
mov 0x10(%rsp),%rcx
lea 0x1(%rcx),%rax
movsd 0x8(%rsp),%xmm0
movsd 0x18(%rsp),%xmm1
I've verified that this is caused by dependency, by inserting
XORPS X0,X0 in the beginning of math.Sqrt, which puts performance back on 1.6 level.
Splitting SQRTSD mem,reg into:
MOVSD mem,reg
SQRTSD reg,reg
Removes dependency, because MOVSD (load version)
doesn't need to preserve upper part of a register.
And reg,reg operation is solved by renamer in CPU.
As a result of this change regression is gone:
SqrtIndirect-4 7.86ns ± 0% 2.33ns ± 0% -70.36% (p=0.000 n=18+17)
This also removes old Sqrt benchmarks, in favor of benchmarks measuring latency.
Only SqrtIndirect is kept, to show impact of this patch.
Change-Id: Ic7eebe8866445adff5bc38192fa8d64c9a6b8872
Reviewed-on: https://go-review.googlesource.com/28392
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
Noticed by cmd/vet.
Expected values array produced by Python instead of Keisan because:
1) Keisan's website calculator is painfully difficult to copy/paste
values into and out of, and
2) after tediously computing e^(vf[i] * 10) - 1 via Keisan I
discovered that Keisan computing vf[i]*10 in a higher precision was
giving substantially different output values.
Also, testing uses "close" instead of "veryclose" because 386's
assembly implementation produces values for some of the test cases
that fail "veryclose". Curiously, Expm1(vf[i]*10) is identical to
Exp(vf[i]*10)-1 on 386, whereas with the portable implementation
they're only "veryclose".
Investigating these questions is left to someone else. I just wanted
to fix the cmd/vet warning.
Fixes#13101.
Change-Id: Ica8f6c267d01aa4cc31f53593e95812746942fbc
Reviewed-on: https://go-review.googlesource.com/16505
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Klaus Post <klauspost@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
- Make Log2 exact for powers of two.
- Fix error tolerance function to make tolerance
a function of the correct (expected) value.
Fixes#9066.
Change-Id: I0320a93ce4130deed1c7b7685627d51acb7bc56d
Reviewed-on: https://go-review.googlesource.com/12230
Reviewed-by: Ian Lance Taylor <iant@golang.org>