fanzha02
d5377c2026
test: fix the wrong test of math.Copysign(c, -1) for arm64
...
The CL 132915 added the wrong codegen test for math.Copysign(c, -1),
it should test that AND is not emitted. This CL fixes this error.
Change-Id: Ida1d3d54ebfc7f238abccbc1f70f914e1b5bfd91
Reviewed-on: https://go-review.googlesource.com/134815
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-12 15:34:20 +00:00
fanzha02
2e5c32518c
cmd/compile: optimize math.Copysign on arm64
...
Add rewrite rules to optimize math.Copysign() when the second
argument is negative floating point constant.
For example, math.Copysign(c, -2): The previous compile output is
"AND $9223372036854775807, R0, R0; ORR $-9223372036854775808, R0, R0".
The optimized compile output is "ORR $-9223372036854775808, R0, R0"
Math package benchmark results.
name old time/op new time/op delta
Copysign-8 2.61ns ± 2% 2.49ns ± 0% -4.55% (p=0.000 n=10+10)
Cos-8 43.0ns ± 0% 41.5ns ± 0% -3.49% (p=0.000 n=10+10)
Cosh-8 98.6ns ± 0% 98.1ns ± 0% -0.51% (p=0.000 n=10+10)
ExpGo-8 107ns ± 0% 105ns ± 0% -1.87% (p=0.000 n=10+10)
Exp2Go-8 100ns ± 0% 100ns ± 0% +0.39% (p=0.000 n=10+8)
Max-8 6.56ns ± 2% 6.45ns ± 1% -1.63% (p=0.002 n=10+10)
Min-8 6.66ns ± 3% 6.47ns ± 2% -2.82% (p=0.006 n=10+10)
Mod-8 107ns ± 1% 104ns ± 1% -2.72% (p=0.000 n=10+10)
Frexp-8 11.5ns ± 1% 11.0ns ± 0% -4.56% (p=0.000 n=8+10)
HypotGo-8 19.4ns ± 0% 19.4ns ± 0% +0.36% (p=0.019 n=10+10)
Ilogb-8 8.63ns ± 0% 8.51ns ± 0% -1.36% (p=0.000 n=10+10)
Jn-8 584ns ± 0% 585ns ± 0% +0.17% (p=0.000 n=7+8)
Ldexp-8 13.8ns ± 0% 13.5ns ± 0% -2.17% (p=0.002 n=8+10)
Logb-8 10.2ns ± 0% 9.9ns ± 0% -2.65% (p=0.000 n=10+7)
Nextafter64-8 7.54ns ± 0% 7.51ns ± 0% -0.37% (p=0.000 n=10+10)
Remainder-8 73.5ns ± 1% 70.4ns ± 1% -4.27% (p=0.000 n=10+10)
SqrtGoLatency-8 79.6ns ± 0% 76.2ns ± 0% -4.30% (p=0.000 n=9+10)
Yn-8 582ns ± 0% 579ns ± 0% -0.52% (p=0.000 n=10+10)
Change-Id: I0c9cd1ea87435e7b8bab94b4e79e6e29785f25b1
Reviewed-on: https://go-review.googlesource.com/132915
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 19:57:25 +00:00
Milan Knezevic
2959128dc5
cmd/compile: add softfloat support to mips64{,le}
...
mips64 softfloat support is based on mips implementation and introduces
new enviroment variable GOMIPS64.
GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between
hard-float and soft-float. Valid values are 'hardfloat' (default) and
'softfloat'. It is passed to the assembler as
'GOMIPS64_{hardfloat,softfloat}'.
Change-Id: I7f73078627f7cb37c588a38fb5c997fe09c56134
Reviewed-on: https://go-review.googlesource.com/108475
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-27 14:50:17 +00:00
Carlos Eduardo Seo
ebb67d993a
cmd/compile, cmd/internal/obj/ppc64: make math.Round an intrinsic on ppc64x
...
This change implements math.Round as an intrinsic on ppc64x so it can be
done using a single instruction.
benchmark old ns/op new ns/op delta
BenchmarkRound-16 2.60 0.69 -73.46%
Change-Id: I9408363e96201abdfc73ced7bcd5f0c29db006a8
Reviewed-on: https://go-review.googlesource.com/109395
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-04-26 14:12:09 +00:00
Giovanni Bajo
284ba47b49
test: run codegen tests on all supported architecture variants
...
This CL makes the codegen testsuite automatically test all
architecture variants for architecture specified in tests. For
instance, if a test file specifies a "arm" test, it will be
automatically run on all GOARM variants (5,6,7), to increase
the coverage.
The CL also introduces a syntax to specify only a specific
variant (eg: "arm/7") in case the test makes sense only there.
The same syntax also allows to specify the operating system
in case it matters (eg: "plan9/386/sse2").
Fixes #24658
Change-Id: I2eba8b918f51bb6a77a8431a309f8b71af07ea22
Reviewed-on: https://go-review.googlesource.com/107315
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-15 20:02:43 +00:00
Giovanni Bajo
79112707bb
cmd/compile: add patterns for bit set/clear/complement on amd64
...
This patch completes implementation of BT(Q|L), and adds support
for BT(S|R|C)(Q|L).
Example of code changes from time.(*Time).addSec:
if t.wall&hasMonotonic != 0 {
0x1073465 488b08 MOVQ 0(AX), CX
0x1073468 4889ca MOVQ CX, DX
0x107346b 48c1e93f SHRQ $0x3f, CX
0x107346f 48c1e13f SHLQ $0x3f, CX
0x1073473 48f7c1ffffffff TESTQ $-0x1, CX
0x107347a 746b JE 0x10734e7
if t.wall&hasMonotonic != 0 {
0x1073435 488b08 MOVQ 0(AX), CX
0x1073438 480fbae13f BTQ $0x3f, CX
0x107343d 7363 JAE 0x10734a2
Another example:
t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
0x10734c8 4881e1ffffff3f ANDQ $0x3fffffff, CX
0x10734cf 48c1e61e SHLQ $0x1e, SI
0x10734d3 4809ce ORQ CX, SI
0x10734d6 48b90000000000000080 MOVQ $0x8000000000000000, CX
0x10734e0 4809f1 ORQ SI, CX
0x10734e3 488908 MOVQ CX, 0(AX)
t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
0x107348b 4881e2ffffff3f ANDQ $0x3fffffff, DX
0x1073492 48c1e61e SHLQ $0x1e, SI
0x1073496 4809f2 ORQ SI, DX
0x1073499 480fbaea3f BTSQ $0x3f, DX
0x107349e 488910 MOVQ DX, 0(AX)
Go1 benchmarks seem unaffected, and I would be surprised
otherwise:
name old time/op new time/op delta
BinaryTree17-4 2.64s ± 4% 2.56s ± 9% -2.92% (p=0.008 n=9+9)
Fannkuch11-4 2.90s ± 1% 2.95s ± 3% +1.76% (p=0.010 n=10+9)
FmtFprintfEmpty-4 35.3ns ± 1% 34.5ns ± 2% -2.34% (p=0.004 n=9+8)
FmtFprintfString-4 57.0ns ± 1% 58.4ns ± 5% +2.52% (p=0.029 n=9+10)
FmtFprintfInt-4 59.8ns ± 3% 59.8ns ± 6% ~ (p=0.565 n=10+10)
FmtFprintfIntInt-4 93.9ns ± 3% 91.2ns ± 5% -2.94% (p=0.014 n=10+9)
FmtFprintfPrefixedInt-4 107ns ± 6% 104ns ± 6% ~ (p=0.099 n=10+10)
FmtFprintfFloat-4 187ns ± 3% 188ns ± 3% ~ (p=0.505 n=10+9)
FmtManyArgs-4 410ns ± 1% 415ns ± 6% ~ (p=0.649 n=8+10)
GobDecode-4 5.30ms ± 3% 5.27ms ± 3% ~ (p=0.436 n=10+10)
GobEncode-4 4.62ms ± 5% 4.47ms ± 2% -3.24% (p=0.001 n=9+10)
Gzip-4 197ms ± 4% 193ms ± 3% ~ (p=0.123 n=10+10)
Gunzip-4 30.4ms ± 3% 30.1ms ± 3% ~ (p=0.481 n=10+10)
HTTPClientServer-4 76.3µs ± 1% 76.0µs ± 1% ~ (p=0.236 n=8+9)
JSONEncode-4 10.5ms ± 9% 10.3ms ± 3% ~ (p=0.280 n=10+10)
JSONDecode-4 42.3ms ±10% 41.3ms ± 2% ~ (p=0.053 n=9+10)
Mandelbrot200-4 3.80ms ± 2% 3.72ms ± 2% -2.15% (p=0.001 n=9+10)
GoParse-4 2.88ms ±10% 2.81ms ± 2% ~ (p=0.247 n=10+10)
RegexpMatchEasy0_32-4 69.5ns ± 4% 68.6ns ± 2% ~ (p=0.171 n=10+10)
RegexpMatchEasy0_1K-4 165ns ± 3% 162ns ± 3% ~ (p=0.137 n=10+10)
RegexpMatchEasy1_32-4 65.7ns ± 6% 64.4ns ± 2% -2.02% (p=0.037 n=10+10)
RegexpMatchEasy1_1K-4 278ns ± 2% 279ns ± 3% ~ (p=0.991 n=8+9)
RegexpMatchMedium_32-4 99.3ns ± 3% 98.5ns ± 4% ~ (p=0.457 n=10+9)
RegexpMatchMedium_1K-4 30.1µs ± 1% 30.4µs ± 2% ~ (p=0.173 n=8+10)
RegexpMatchHard_32-4 1.40µs ± 2% 1.41µs ± 4% ~ (p=0.565 n=10+10)
RegexpMatchHard_1K-4 42.5µs ± 1% 41.5µs ± 3% -2.13% (p=0.002 n=8+9)
Revcomp-4 332ms ± 4% 328ms ± 5% ~ (p=0.720 n=9+10)
Template-4 48.3ms ± 2% 49.6ms ± 3% +2.56% (p=0.002 n=8+10)
TimeParse-4 252ns ± 2% 249ns ± 3% ~ (p=0.116 n=9+10)
TimeFormat-4 262ns ± 4% 252ns ± 3% -4.01% (p=0.000 n=9+10)
name old speed new speed delta
GobDecode-4 145MB/s ± 3% 146MB/s ± 3% ~ (p=0.436 n=10+10)
GobEncode-4 166MB/s ± 5% 172MB/s ± 2% +3.28% (p=0.001 n=9+10)
Gzip-4 98.6MB/s ± 4% 100.4MB/s ± 3% ~ (p=0.123 n=10+10)
Gunzip-4 639MB/s ± 3% 645MB/s ± 3% ~ (p=0.481 n=10+10)
JSONEncode-4 185MB/s ± 8% 189MB/s ± 3% ~ (p=0.280 n=10+10)
JSONDecode-4 46.0MB/s ± 9% 47.0MB/s ± 2% +2.21% (p=0.046 n=9+10)
GoParse-4 20.1MB/s ± 9% 20.6MB/s ± 2% ~ (p=0.239 n=10+10)
RegexpMatchEasy0_32-4 460MB/s ± 4% 467MB/s ± 2% ~ (p=0.165 n=10+10)
RegexpMatchEasy0_1K-4 6.19GB/s ± 3% 6.28GB/s ± 3% ~ (p=0.165 n=10+10)
RegexpMatchEasy1_32-4 487MB/s ± 5% 497MB/s ± 2% +2.00% (p=0.043 n=10+10)
RegexpMatchEasy1_1K-4 3.67GB/s ± 2% 3.67GB/s ± 3% ~ (p=0.963 n=8+9)
RegexpMatchMedium_32-4 10.1MB/s ± 3% 10.1MB/s ± 4% ~ (p=0.435 n=10+9)
RegexpMatchMedium_1K-4 34.0MB/s ± 1% 33.7MB/s ± 2% ~ (p=0.173 n=8+10)
RegexpMatchHard_32-4 22.9MB/s ± 2% 22.7MB/s ± 4% ~ (p=0.565 n=10+10)
RegexpMatchHard_1K-4 24.0MB/s ± 3% 24.7MB/s ± 3% +2.64% (p=0.001 n=9+9)
Revcomp-4 766MB/s ± 4% 775MB/s ± 5% ~ (p=0.720 n=9+10)
Template-4 40.2MB/s ± 2% 39.2MB/s ± 3% -2.47% (p=0.002 n=8+10)
The rules match ~1800 times during all.bash.
Fixes #18943
Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4
Reviewed-on: https://go-review.googlesource.com/94766
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-24 02:38:50 +00:00
Giovanni Bajo
89ae7045f3
test: convert all math-related tests from asm_test
...
Change-Id: If542f0b5c5754e6eb2f9b302fe5a148ba9a57338
Reviewed-on: https://go-review.googlesource.com/98443
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-04 16:52:33 +00:00