We compute a lot of stuff based off the CFG: postorder traversal,
dominators, dominator tree, loop nest. Multiple phases use this
information and we end up recomputing some of it. Add a cache
for this information so if the CFG hasn't changed, we can reuse
the previous computation.
Change-Id: I9b5b58af06830bd120afbee9cfab395a0a2f74b2
Reviewed-on: https://go-review.googlesource.com/29356
Reviewed-by: David Chase <drchase@google.com>
Teach SSA about the cmd/internal/obj/$ARCH register numbering.
It can then return that numbering when requested. Each architecture
now does not need to know anything about the internal SSA numbering
of registers.
Change-Id: I34472a2736227c15482e60994eebcdd2723fa52d
Reviewed-on: https://go-review.googlesource.com/29249
Reviewed-by: David Chase <drchase@google.com>
A tentative fix of #16380. It adds "line" everywhere...
This also reduces binary size slightly (cmd/go on ARM as an example):
before after
total binary size 8068097 8018945 (-0.6%)
.gopclntab 1195341 1179929 (-1.3%)
.debug_line 689692 652017 (-5.5%)
Change-Id: Ibda657c6999783c5bac180cbbba487006dbf0ed7
Reviewed-on: https://go-review.googlesource.com/25082
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Rip out the code that allows SSA to be used conditionally.
No longer exists:
ssa=0 flag
GOSSAHASH
GOSSAPKG
SSATEST
GOSSAFUNC now only controls the printing of the IR/html.
Still need to rip out all of the old backend. It should no longer be
callable after this CL.
Update #16357
Change-Id: Ib30cc18fba6ca52232c41689ba610b0a94aa74f5
Reviewed-on: https://go-review.googlesource.com/29155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The new SSA backend modifies the ABI slightly: R0 is now a usable
general purpose register.
Fixes#16677.
Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf
Reviewed-on: https://go-review.googlesource.com/28978
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
No need for it, we can treat calls as (mostly) normal values
that take a memory and return a memory.
Lowers the number of basic blocks needed to represent a function.
"go test -c net/http" uses 27% fewer basic blocks.
Probably doesn't affect generated code much, but should help
various passes whose running time and/or space depends on
the number of basic blocks.
Fixes#15631
Change-Id: I0bf21e123f835e2cfa382753955a4f8bce03dfa6
Reviewed-on: https://go-review.googlesource.com/28950
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Inline atomic reads and writes on amd64. There's no reason
to pay the overhead of a call for these.
To keep atomic loads from being reordered, we make them
return a <value,memory> tuple.
Change the meaning of resultInArg0 for tuple-generating ops
to mean the first part of the result tuple, not the second.
This means we can always put the store part of the tuple last,
matching how arguments are laid out. This requires reordering
the outputs of add32carry and sub32carry and their descendents
in various architectures.
benchmark old ns/op new ns/op delta
BenchmarkAtomicLoad64-8 2.09 0.26 -87.56%
BenchmarkAtomicStore64-8 7.54 5.72 -24.14%
TBD (in a different CL): Cas, Or8, ...
Change-Id: I713ea88e7da3026c44ea5bdb56ed094b20bc5207
Reviewed-on: https://go-review.googlesource.com/27641
Reviewed-by: Cherry Zhang <cherryyz@google.com>
- implement *, /, %, shifts, Zero, Move.
- fix mistakes in comparison.
- fix floating point rounding.
- handle RetJmp in assembler (which was not handled, as a consequence
Duff's device was disabled in the old backend.)
all.bash now passes with SSA on.
Updates #16359.
Change-Id: Ia14eed0ed1176b5d800592080c8f53dded7fe73f
Reviewed-on: https://go-review.googlesource.com/27592
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This time with the cherry-pick from the proper patch of
the old CL.
Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.
live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.
Enhanced debugging output for shared libs test.
Rebased onto master.
Updates #16010.
Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Last part of the 386 SSA port.
Modify the x86 backend to simulate SSE registers and
instructions with 387 registers and instructions.
The simulation isn't terribly performant, but it works,
and the old implementation wasn't very performant either.
Leaving to people who care about 387 to optimize if they want.
Turn on SSA backend for 386 by default.
Fixes#16358
Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0
Reviewed-on: https://go-review.googlesource.com/25271
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Access to globals requires a 2-instruction sequence on PIC 386.
MOVL foo(SB), AX
is translated by the obj package into:
CALL getPCofNextInstructionInTempRegister(SB)
MOVL (&foo-&thisInstruction)(tmpReg), AX
The call returns the PC of the next instruction in a register.
The next instruction then offsets from that register to get the
address required. The tricky part is the allocation of the
temp register. The legacy compiler always used CX, and forbid
the register allocator from allocating CX when in PIC mode.
We can't easily do that in SSA because CX is actually a required
register for shift instructions. (I think the old backend got away
with this because the register allocator never uses CX, only
codegen knows that shifts must use CX.)
Instead, we allow the temp register to be anything. When the
destination of the MOV (or LEA) is an integer register, we can
use that register. Otherwise, we make sure to compile the
operation using an LEA to reference the global. So
MOVL AX, foo(SB)
is never generated directly. Instead, SSA generates:
LEAL foo(SB), DX
MOVL AX, (DX)
which is then rewritten by the obj package to:
CALL getPcInDX(SB)
LEAL (&foo-&thisInstruction)(DX), AX
MOVL AX, (DX)
So this CL modifies the obj package to use different thunks
to materialize the pc into different registers. We use the
registers that regalloc chose so that SSA can still allocate
the full set of registers.
Change-Id: Ie095644f7164a026c62e95baf9d18a8bcaed0bba
Reviewed-on: https://go-review.googlesource.com/25442
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
It's not a new backend, just a PtrSize==4 modification
of the existing AMD64 backend.
Change-Id: Icc63521a5cf4ebb379f7430ef3f070894c09afda
Reviewed-on: https://go-review.googlesource.com/25586
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reg allocator skips flag-typed values. Flag allocator uses the type
and whether the op has "clobberFlags" set.
Tested on AMD64, ARM, ARM64, 386. Passed 'toolstash -cmp' on AMD64.
PPC64 is coded blindly.
Change-Id: Ib1cc27efecef6a1bb27f7d7ed035a582660d244f
Reviewed-on: https://go-review.googlesource.com/25480
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Includes hmul (all widths)
compare for boolean result and simplifications
shift operations plus changes/additions for implementation
(ORN, ADDME, ADDC)
Also fixed a backwards-operand CMP.
Change-Id: Id723c4e25125c38e0d9ab9ec9448176b75f4cdb4
Reviewed-on: https://go-review.googlesource.com/25410
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Support the following:
- Shifts. ARM64 machine instructions only use lowest 6 bits of the
shift (i.e. mod 64). Use conditional selection instruction to
ensure Go semantics.
- Zero/Move. Alignment is ensured.
- Hmul, Avg64u, Sqrt.
- reserve R18 (platform register in ARM64 ABI) and R29 (frame pointer
in ARM64 ABI).
Everything compiles, all.bash passed (with non-SSA test disabled).
Change-Id: Ia8ed58dae5cbc001946f0b889357b258655078b1
Reviewed-on: https://go-review.googlesource.com/25290
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Make tuple types and their SelectX ops fully generic.
These ops no longer need to be lowered.
Regalloc understands them and their tuple-generating arguments.
We can now have opcodes returning arbitrary pairs of results.
(And it would be easy to move to >2 results if needed.)
Update arm implementation to the new standard.
Implement just enough in 386 port to do 64-bit add.
Change-Id: I370ed5aacce219c82e1954c61d1f63af76c16f79
Reviewed-on: https://go-review.googlesource.com/24976
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
NaCl code runs in sandbox and there are restrictions for its
instruction uses
(https://developer.chrome.com/native-client/reference/sandbox_internals/arm-32-bit-sandbox).
Like the legacy backend, on NaCl,
- don't use R9, which is used as NaCl's "thread pointer".
- don't use Duff's device.
- don't use indexed load/stores.
- the assembler rewrites DIV/MOD to runtime calls, which on NaCl
clobbers R12, so R12 is marked as clobbered for DIV/MOD.
- other restrictions are satisfied by the assembler.
Enable SSA specific tests on nacl/arm, and disable non-SSA ones.
Updates #15365.
Change-Id: I9262693ec6756b89ca29d3ae4e52a96fe5403b02
Reviewed-on: https://go-review.googlesource.com/24859
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
If a spill is used to satisfy a merge edge (in shuffle), don't sink
it out of loop.
This is found in the following code (on ARM) where there is a stack
Phi (v268) inside a loop (b36 -> ... -> b47 -> b38 -> b36).
(before shuffle)
b36: <- b34 b38
...
v268 = Phi <int> v410 v360 : autotmp_198[int]
...
... -> b47
b47: <- b44
...
v360 = ... : R6
v230 = StoreReg <int> v360 : autotmp_198[int]
v261 = CMPconst <flags> [0] v360
EQ v261 -> b49 b38 (unlikely)
b38: <- b47
...
Plain -> b36
During shuffle, v230 (as spill of v360) is found to satisfy v268, but
it didn't record its use in shuffle, and v230 is sunk out of the loop
(to b49), which leads to bad value in v268.
This seems never happened on AMD64 (in make.bash), until 4 registers
are removed.
Change-Id: I01dfc28ae461e853b36977c58bcfc0669e556660
Reviewed-on: https://go-review.googlesource.com/24858
Reviewed-by: David Chase <drchase@google.com>
Provide better diagnostic messages.
Use an int for numRegs comparisons,
to avoid asking whether a uint8 is > 255.
Change-Id: I33ae193ce292b24b369865abda3902c3207d7d3f
Reviewed-on: https://go-review.googlesource.com/24135
Reviewed-by: Keith Randall <khr@golang.org>
Use hardware g register (R10) for GetG, allow g to appear at LHS of
some ops.
Progress on SSA backend for ARM. Now everything compiles and runs.
Updates #15365.
Change-Id: Icdf93585579faa86cc29b1e17ab7c90f0119fc4e
Reviewed-on: https://go-review.googlesource.com/23952
Reviewed-by: David Chase <drchase@google.com>
- 64x signed right shift was wrong for shift larger than 0x80000000.
- for Lsh-followed-by-Rsh, the intermediate value should be full int
width, so when it is spilled MOVW should be used.
- use RET for RetJmp, so the assembler can take case of restoring LR
for non-leaf case.
- reserve R9 in dynlink mode. R9 is used for GOT by the assembler.
Progress on SSA backend for ARM. Still not complete.
Updates #15365.
Change-Id: I3caca256b92ff7cf96469da2feaf4868a592efc5
Reviewed-on: https://go-review.googlesource.com/23793
Reviewed-by: David Chase <drchase@google.com>
Auto-generate register masks and load them through Config.
Passed toolstash -cmp on AMD64.
Tests phi_ssa.go and regalloc_ssa.go in cmd/compile/internal/gc/testdata
passed on ARM.
Updates #15365.
Change-Id: I393924d68067f2dbb13dab82e569fb452c986593
Reviewed-on: https://go-review.googlesource.com/23292
Reviewed-by: David Chase <drchase@google.com>
This has a minor performance cost, but far less than is being gained by SSA.
As an experiment, enable it during the Go 1.7 beta.
Having frame pointers on by default makes Linux's perf, Intel VTune,
and other profilers much more useful, because it lets them gather a
stack trace efficiently on profiling events.
(It doesn't help us that much, since when we walk the stack we usually
need to look up PC-specific information as well.)
Fixes#15840.
Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a
Reviewed-on: https://go-review.googlesource.com/23451
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fix hardcoded flag register mask in ssa/flagalloc.go by auto-generating
the mask.
Also fix a mistake (in previous CL) about conditional branches.
Progress on SSA backend for ARM. Still not complete. Now "container/ring"
package compiles and tests passed.
Updates #15365.
Change-Id: Id7c8805c30dbb8107baedb485ed0f71f59ed6ea8
Reviewed-on: https://go-review.googlesource.com/23093
Reviewed-by: Keith Randall <khr@golang.org>
Run live vars test only on ssa builds.
We can't just drop KeepAlive ops during regalloc. We need
to replace them with copies.
Change-Id: Ib4b3b1381415db88fdc2165fc0a9541b73ad9759
Reviewed-on: https://go-review.googlesource.com/23225
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Introduce a KeepAlive op which makes sure that its argument is kept
live until the KeepAlive. Use KeepAlive to mark pointer input
arguments as live after each function call and at each return.
We do this change only for pointer arguments. Those are the
critical ones to handle because they might have finalizers.
Doing compound arguments (slices, structs, ...) is more complicated
because we would need to track field liveness individually (we do
that for auto variables now, but inputs requires extra trickery).
Turn off the automatic marking of args as live. That way, when args
are explicitly nulled, plive will know that the original argument is
dead.
The KeepAlive op will be the eventual implementation of
runtime.KeepAlive.
Fixes#15277
Change-Id: I5f223e65d99c9f8342c03fbb1512c4d363e903e5
Reviewed-on: https://go-review.googlesource.com/22365
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
This adds a sparse method for locating nearest ancestors
in a dominator tree, and checks blocks with more than one
predecessor for differences and inserts phi functions where
there are.
Uses reversed post order to cut number of passes, running
it from first def to last use ("last use" for paramout and
mem is end-of-program; last use for a phi input from a
backedge is the source of the back edge)
Includes a cutover from old algorithm to new to avoid paying
large constant factor for small programs. This keeps normal
builds running at about the same time, while not running
over-long on large machine-generated inputs.
Add "phase" flags for ssa/build -- ssa/build/stats prints
number of blocks, values (before and after linking references
and inserting phis, so expansion can be measured), and their
product; the product governs the cutover, where a good value
seems to be somewhere between 1 and 5 million.
Among the files compiled by make.bash, this is the shape of
the tail of the distribution for #blocks, #vars, and their
product:
#blocks #vars product
max 6171 28180 173,898,780
99.9% 1641 6548 10,401,878
99% 463 1909 873,721
95% 152 639 95,235
90% 84 359 30,021
The old algorithm is indeed usually fastest, for 99%ile
values of usually.
The fix to LookupVarOutgoing
( https://go-review.googlesource.com/#/c/22790/ )
deals with some of the same problems addressed by this CL,
but on at least one bug ( #15537 ) this change is still
a significant help.
With this CL:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
github.com/gogo/protobuf/test/theproto3/combos/...
...
real 4m35.200s
user 13m16.644s
sys 0m36.712s
and pprof reports 3.4GB allocated in one of the larger profiles
With tip:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
github.com/gogo/protobuf/test/theproto3/combos/...
...
real 10m36.569s
user 25m52.286s
sys 4m3.696s
and pprof reports 8.3GB allocated in the same larger profile
With this CL, most of the compilation time on the benchmarked
input is spent in register/stack allocation (cumulative 53%)
and in the sparse lookup algorithm itself (cumulative 20%).
Fixes#15537.
Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00
Reviewed-on: https://go-review.googlesource.com/22342
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In regalloc, a sparse map is preallocated for later use by
spill-in-loop sinking. However, variables (spills) are added
during register allocation before spill sinking, and a map
query involving any of these new variables will index out of
bounds in the map.
To fix:
1) fix the queries to use s.orig[v.ID].ID instead, to ensure
proper indexing. Note that s.orig will be nil for values
that are not eligible for spilling (like memory and flags).
2) add a test.
Fixes#15585.
Change-Id: I8f2caa93b132a0f2a9161d2178320d5550583075
Reviewed-on: https://go-review.googlesource.com/22911
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
:= is the wrong thing here. The new variable masks the old
variable so we allocate the slice afresh each time around the loop.
Change-Id: I759c30e1bfa88f40decca6dd7d1e051e14ca0844
Reviewed-on: https://go-review.googlesource.com/22679
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
The cached copy's ID is sometimes outside the bounds of the orig array.
There's no reason to start at the cached copy and work backwards
to the original value. We already have the original value ID at
all the callsites.
Fixes noopt build
Change-Id: I313508a1917e838a87e8cc83b2ef3c2e4a8db304
Reviewed-on: https://go-review.googlesource.com/22355
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Improve forward-looking desired register calculations.
It is now inter-block and handles a bunch more cases.
Fixes#14504Fixes#14828Fixes#15254
Change-Id: Ic240fa0ec6a779d80f577f55c8a6c4ac8c1a940a
Reviewed-on: https://go-review.googlesource.com/22160
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This is a fix for the ssacheck builder
http://build.golang.org/log/baa00f70c34e41186051cfe90568de3d91f115d7
after CL 21307 for sinking spills down loop exits
https://go-review.googlesource.com/#/c/21037/
The fix is to reuse (move) the original spill, thus preserving
the definition of the variable and its use count. Original and
copy both use the same stack slot, but ssacheck needs to see
a definition for the variable itself.
Fixes#15279.
Change-Id: I286285490193dc211b312d64dbc5a54867730bd6
Reviewed-on: https://go-review.googlesource.com/21995
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For call-free inner loops.
Revised statistics:
85 inner loop spills sunk
341 inner loop spills remaining
1162 inner loop spills that were candidates for sinking
ended up completely register allocated
119 inner loop spills could have been sunk were used in
"shuffling" at the bottom of the loop.
1 inner loop spill not sunk because the register assigned
changed between def and exit,
Understanding how to make an inner loop definition not be
a candidate for from-memory shuffling (to force the shuffle
code to choose some other value) should pick up some of the
119 other spills disqualified for this reason.
Modified the stats printing based on feedback from Austin.
Change-Id: If3fb9b5d5a028f42ccc36c4e3d9e0da39db5ca60
Reviewed-on: https://go-review.googlesource.com/21037
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Instead of being a hint, resultInArg0 is now enforced by regalloc.
This allows us to delete all the code from amd64/ssa.go which
deals with converting from a semantically three-address instruction
into some copies plus a two-address instruction.
Change-Id: Id4f39a80be4b678718bfd42a229f9094ab6ecd7c
Reviewed-on: https://go-review.googlesource.com/21816
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Be more careful about inserting instrumentation in racewalk.
If the node being instrumented is an OAS, and it has a non-
empty Ninit, then append instrumentation to the Ninit list
rather than letting it be inserted before the OAS (and the
compilation of its init list). This deals with the case that
the Ninit list defines a variable used in the RHS of the OAS.
Fixes#15091.
Change-Id: Iac91696d9104d07f0bf1bd3499bbf56b2e1ef073
Reviewed-on: https://go-review.googlesource.com/21771
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: David Chase <drchase@google.com>
Some minor scoping cleanups found by a very old version of grind.
Change-Id: I1d373817586445fc87e38305929097b652696fdd
Reviewed-on: https://go-review.googlesource.com/21064
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Start working on arm port. Gets close to correct
code for fibonacci:
func fib(n int) int {
if n < 2 {
return n
}
return fib(n-1) + fib(n-2)
}
Still a lot to do, but this is a good starting point.
Cleaned up some arch-specific dependencies in regalloc.
Change-Id: I4301c6c31a8402168e50dcfee8bcf7aee73ea9d5
Reviewed-on: https://go-review.googlesource.com/21000
Reviewed-by: David Chase <drchase@google.com>
This seems to help the problem reported in #14606; this
change seems to produce about a 4% improvement (mostly
for the 128-8192 shards).
Fixes#14789.
Change-Id: I1bd52c82d4ca81d9d5e9ab371fdfc860d7e8af50
Reviewed-on: https://go-review.googlesource.com/20660
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Keep track of how many uses each Value has. Each appearance in
Value.Args and in Block.Control counts once.
The number of uses of a value is generically useful to
constrain rewrite rules. For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.
But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use. Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races. In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice. (I have a separate
CL which triggers the mutex failure. This CL has a test which
demonstrates a similar failure.)
Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
x86 has a lot of instructions that require the output to be in the same
register as one of the inputs. When allocating the output register,
allocate the same register as the input if it is available.
Improves the performance of golang.org/x/crypto/sha3 by
10% (from 6% slower than 1.6 to 4% faster).
Fixes#14745
Change-Id: I4d81785240c9368e4dc75107b45c959d200df8e6
Reviewed-on: https://go-review.googlesource.com/20488
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Change the existing flags from compile time consts to be configurable
from the command line.
Change-Id: I4aab4bf3dfcbdd8e2b5a2ff51af95c2543967769
Reviewed-on: https://go-review.googlesource.com/20560
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Make sure we do any just-before-return cleanup on all paths out of a
function, including when recovering. Each exit path should include
deferreturn (if there are any defers) and then the exit
code (e.g. copying heap-escaping return values back to the stack).
Introduce a Defer SSA block type which has two outgoing edges - one the
fallthrough edge (the defer was queued successfully) and one which
immediately returns (the defer had a successful recover() call and
normal execution should resume at the return point).
Fixes#14725
Change-Id: Iad035c9fd25ef8b7a74dafbd7461cf04833d981f
Reviewed-on: https://go-review.googlesource.com/20486
Reviewed-by: David Chase <drchase@google.com>
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In regalloc, make LoadReg instructions use the line number
of their *use*, not their *source*. This reduces the
tendency of debugger stepping to "jump around" the program.
Change-Id: I59e2eeac4dca9168d8af3a93effbc5bdacac2881
Reviewed-on: https://go-review.googlesource.com/19836
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>