This abstracts the creation of ACALL Progs into package gc. The main
benefit today is that we can refactor away a lot of common boilerplate
code.
Later, once liveness analysis happens on the SSA graph, this will also
provide an easy insertion point for emitting the PCDATA Progs
immediately before call instructions.
Passes toolstash-check -all.
Change-Id: Ia15108ace97201cd84314f1ca916dfeb4f09d61c
Reviewed-on: https://go-review.googlesource.com/38081
Reviewed-by: Keith Randall <khr@golang.org>
Several SSA ops will always behave identically regardless of target
architecture, so handle those within gc/ssa.go instead.
Passes toolstash-check -all.
Change-Id: I54d514e80ab86723e44332a5a38e3054cbca8c5d
Reviewed-on: https://go-review.googlesource.com/37931
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On AMD64, most operations can take one operand in memory. Where
possible, combine a load and the operation that depends on it into a
single new operation. I've seen no significant performance changes on
go1, but this removes ~1.8KB of code from the go tool, and in the math
package I see, e.g.:
Remainder-6 70.0ns ± 0% 64.6ns ± 0% -7.76% (p=0.000 n=9+1
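As a minimal illustration (function name assumed, not from this CL),
a load feeding an arithmetic op can now become a single memory-operand
instruction, e.g. an ADDQ with a memory source instead of a MOVQ
followed by a register ADDQ:

	// the load of *p folds into the add on amd64
	func addMem(a int64, p *int64) int64 {
		return a + *p
	}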
Change-Id: I88b8602b1d55da8ba548a34eb7da4b25d59a297e
Reviewed-on: https://go-review.googlesource.com/36793
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Removes all external uses of Linksym and Pkglookup, which are the only
two exported functions that return Syms.
Also add Duffcopy and Duffzero since they're used often enough across
SSA backends.
Passes toolstash -cmp.
Change-Id: I8d3fd048ad5cd676fc46378f09a917569ffc9b2c
Reviewed-on: https://go-review.googlesource.com/36418
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Gc's Sym type represents a package-qualified identifier, which is a
frontend concept and doesn't belong in SSA. Bonus: we can replace some
interface{} types with *obj.LSym.
Passes toolstash -cmp.
Change-Id: I456eb9957207d80f99f6eb9b8eab4a1f3263e9ed
Reviewed-on: https://go-review.googlesource.com/36415
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This is a mostly mechanical rename followed by manual fixes where necessary.
Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
Various minor adjustments.
Change-Id: Iedfb97989f7bedaa3e9e8993b167e05f162434a7
Reviewed-on: https://go-review.googlesource.com/34136
Reviewed-by: David Lazar <lazard@golang.org>
Adjust cmd/compile accordingly.
This will make it easier to replace the underlying implementation.
Change-Id: I33645850bb18c839b24785b6222a9e028617addb
Reviewed-on: https://go-review.googlesource.com/34133
Reviewed-by: David Lazar <lazard@golang.org>
Identify live stack variables during SSA and compute the stack frame
layout earlier so that we can emit instructions with the correct
offsets upfront.
Passes toolstash/buildall.
Change-Id: I191100dba274f1e364a15bdcfdc1d1466cdd1db5
Reviewed-on: https://go-review.googlesource.com/30216
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Teach SSA about the cmd/internal/obj/$ARCH register numbering, so it
can return that numbering when requested. Each architecture no longer
needs to know anything about SSA's internal register numbering.
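For context, each backend previously carried a table along these lines
(a sketch of the amd64 version; exact contents assumed):

	package amd64

	import "cmd/internal/obj/x86"

	// ssaRegToReg translated SSA's internal register numbers into
	// cmd/internal/obj register numbers. With this CL, SSA reports
	// the obj numbering directly, so tables like this can go away.
	var ssaRegToReg = []int16{
		x86.REG_AX, x86.REG_CX, x86.REG_DX, x86.REG_BX,
		// ... one entry per machine register
	}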
Change-Id: I34472a2736227c15482e60994eebcdd2723fa52d
Reviewed-on: https://go-review.googlesource.com/29249
Reviewed-by: David Chase <drchase@google.com>
Get rid of BlockCheck. Josh goaded me into it, and I went
down a rabbit hole making it happen.
NilCheck now panics if the pointer is nil and returns void, as before.
BlockCheck is gone, and NilCheck is no longer a Control value for
any block. It just exists (and deadcode knows not to throw it away).
I rewrote the nilcheckelim pass to handle this case. In particular,
there can now be multiple NilCheck ops per block.
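A source-level sketch of what the rewritten pass handles (names
assumed):

	// both dereferences imply a nil check in the same block;
	// nilcheckelim keeps the first and drops the redundant second
	func twoLoads(p *int) int {
		a := *p
		b := *p
		return a + b
	}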
I moved all of the arch-dependent nil check elimination done as
part of ssaGenValue into its own proper pass, so we don't have to
duplicate that code for every architecture.
Making the arch-dependent nil check its own pass means I needed
to add a bunch of flags to the opcode table so I could write
the code without arch-dependent ops everywhere.
Change-Id: I419f891ac9b0de313033ff09115c374163416a9f
Reviewed-on: https://go-review.googlesource.com/29120
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Rip out the code that allows SSA to be used conditionally.
The following no longer exist:
  - the ssa=0 flag
  - GOSSAHASH
  - GOSSAPKG
  - SSATEST
GOSSAFUNC now only controls the printing of the IR/html.
Still need to rip out all of the old backend. It should no longer be
callable after this CL.
Update #16357
Change-Id: Ib30cc18fba6ca52232c41689ba610b0a94aa74f5
Reviewed-on: https://go-review.googlesource.com/29155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
No need for it, we can treat calls as (mostly) normal values
that take a memory and return a memory.
Lowers the number of basic blocks needed to represent a function.
"go test -c net/http" uses 27% fewer basic blocks.
Probably doesn't affect generated code much, but should help
various passes whose running time and/or space depends on
the number of basic blocks.
Fixes #15631
Change-Id: I0bf21e123f835e2cfa382753955a4f8bce03dfa6
Reviewed-on: https://go-review.googlesource.com/28950
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reduce the duplication in every arch by moving the code into package gc.
Change-Id: Ia111add8316492571825431ecd4f0154c8792ae1
Reviewed-on: https://go-review.googlesource.com/28481
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Atomic add rules were depending on CSE to combine duplicate atomic ops.
With -N, CSE doesn't run.
Redo the rules for atomic add so there's only one atomic op.
Introduce add-to-first-part-of-tuple pseudo-ops to make the atomic add
result correct.
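A user-level sketch of the affected pattern, assuming the sync/atomic
call maps onto the same intrinsics:

	package main

	import "sync/atomic"

	func bump(n *int64) int64 {
		// with -N (no CSE), the old rules could leave duplicate
		// atomic ops behind for this single add-and-use-result
		return atomic.AddInt64(n, 1)
	}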
Change-Id: Ib132247051abe5f80fefad6c197db8df8ee06427
Reviewed-on: https://go-review.googlesource.com/27991
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Atomic swap, add/and/or, compare and swap.
Also works on amd64p32.
Change-Id: Idf2d8f3e1255f71deba759e6e75e293afe4ab2ba
Reviewed-on: https://go-review.googlesource.com/27813
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Inline atomic reads and writes on amd64. There's no reason
to pay the overhead of a call for these.
To keep atomic loads from being reordered, we make them
return a <value,memory> tuple.
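For example (shown with sync/atomic; the runtime's internal atomic
package is what gets intrinsified here):

	package main

	import "sync/atomic"

	var ready uint64

	func isReady() bool {
		// now a single inlined load on amd64; the memory half of
		// the tuple keeps it ordered with surrounding memory ops
		return atomic.LoadUint64(&ready) != 0
	}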
Change the meaning of resultInArg0 for tuple-generating ops
to mean the first part of the result tuple, not the second.
This means we can always put the store part of the tuple last,
matching how arguments are laid out. This requires reordering
the outputs of add32carry and sub32carry and their descendants
in various architectures.
benchmark old ns/op new ns/op delta
BenchmarkAtomicLoad64-8 2.09 0.26 -87.56%
BenchmarkAtomicStore64-8 7.54 5.72 -24.14%
TBD (in a different CL): Cas, Or8, ...
Change-Id: I713ea88e7da3026c44ea5bdb56ed094b20bc5207
Reviewed-on: https://go-review.googlesource.com/27641
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Now that we have ops that can return 2 results, have BSF return a result
and flags. We can then get rid of the redundant comparison and use CMOV
instead of CMOVconst ops.
Get rid of a bunch of the ops we don't use. Ctz{8,16}, plus all the Clzs,
and CMOVNEs. I don't think we'll ever use them, and they would be easy
to add back if needed.
Change-Id: I8858a1d017903474ea7e4002fc76a6a86e7bd487
Reviewed-on: https://go-review.googlesource.com/27630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
It's not a new backend, just a PtrSize==4 modification
of the existing AMD64 backend.
Change-Id: Icc63521a5cf4ebb379f7430ef3f070894c09afda
Reviewed-on: https://go-review.googlesource.com/25586
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
The SSA compiler on AMD64 may spill a Duff-adjusted address as a
scalar. If the object is on the stack and the stack moves, the spilled
address becomes invalid.
Making the spill pointer-typed does not work: the Duff-adjusted address
points to the memory before the area to be zeroed and may be invalid.
This may cause the stack scanning code to panic.
Fix it by doing the Duff adjustment in genValue, so the intermediate
value is never seen by the register allocator and will not be spilled.
Add a test to cover both cases. As it depends on allocation, it may
not always be triggered.
Fixes #16515.
Change-Id: Ia81d60204782de7405b7046165ad063384ede0db
Reviewed-on: https://go-review.googlesource.com/25309
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
We now allow Values to have 2 outputs. Use that ability for amd64.
This allows x, y := a/b, a%b to use just a single divide instruction.
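A minimal sketch of the pattern that benefits:

	// q and r now come from one DIVQ on amd64 instead of two
	func divmod(a, b int64) (q, r int64) {
		return a / b, a % b
	}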
Update #6815
Change-Id: Id70bcd20188a2dd8445e631a11d11f60991921e4
Reviewed-on: https://go-review.googlesource.com/25004
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
The comments were mostly duplicated; unify them.
Add a check that the required invariant holds.
Change-Id: I42fe09dcd1fac76d3c4e191f7a58c591c5ce429b
Reviewed-on: https://go-review.googlesource.com/24719
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
This will be used verbatim in other architectures.
Change-Id: I307891ae597d797fd45f296b6a38ffe9fac6b975
Reviewed-on: https://go-review.googlesource.com/24155
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Introduce a KeepAlive op which makes sure that its argument is kept
live until the KeepAlive. Use KeepAlive to mark pointer input
arguments as live after each function call and at each return.
We do this change only for pointer arguments. Those are the
critical ones to handle because they might have finalizers.
Doing compound arguments (slices, structs, ...) is more complicated
because we would need to track field liveness individually (we do
that for auto variables now, but inputs require extra trickery).
Turn off the automatic marking of args as live. That way, when args
are explicitly nulled, plive will know that the original argument is
dead.
The KeepAlive op will be the eventual implementation of
runtime.KeepAlive.
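For illustration, the canonical use runtime.KeepAlive is meant for
(a sketch using the classic finalizer/file-descriptor example; names
assumed, Unix-style syscall package):

	package main

	import (
		"os"
		"runtime"
		"syscall"
	)

	func readSome(path string) ([]byte, error) {
		f, err := os.Open(path)
		if err != nil {
			return nil, err
		}
		buf := make([]byte, 64)
		_, err = syscall.Read(int(f.Fd()), buf)
		// without this, f could be finalized (closing its fd)
		// while the read is still in flight
		runtime.KeepAlive(f)
		return buf, err
	}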
Fixes #15277
Change-Id: I5f223e65d99c9f8342c03fbb1512c4d363e903e5
Reviewed-on: https://go-review.googlesource.com/22365
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Now that we're using 32-bit ops for 8/16-bit logical operations
(to avoid partial register stalls), there's really no need to
keep track of the 8/16-bit ops at all. Convert everything we
can to 32-bit ops.
This CL is the obvious stuff. I might think a bit more about
whether we can get rid of weirder stuff like HMULWU.
The only downside to this CL is that we lose some information
about constants. If we had source like:
	var a byte = ...
	a += 128
	a += 128
We will convert that to a += 256, when we could get rid of the
add altogether (adding 256 to a byte is a no-op). This seems like a
fairly unusual scenario and I'm happy to forgo that optimization.
Change-Id: Ia7c1e5203d0d110807da69ed646535194a3efba1
Reviewed-on: https://go-review.googlesource.com/22382
Reviewed-by: Todd Neal <todd@tneal.org>
Instead of being a hint, resultInArg0 is now enforced by regalloc.
This allows us to delete all the code from amd64/ssa.go which
deals with converting from a semantically three-address instruction
into some copies plus a two-address instruction.
Change-Id: Id4f39a80be4b678718bfd42a229f9094ab6ecd7c
Reviewed-on: https://go-review.googlesource.com/21816
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
We already generate ADDL for byte operations; reflect this in the code.
This also allows INC/DEC for ±1 operations, which are 1 byte shorter,
and enables LEA for 3-operand addition/subtraction.
Change-Id: Ibfdfee50667ca4cd3c28f72e3dece0c6d114d3ae
Reviewed-on: https://go-review.googlesource.com/21251
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For idx1 ops, SP can appear in the index slot.
Swap SP into the base register slot so we can encode
the instruction.
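A sketch of the swap (helper shape assumed; the real change lives in
the amd64 ssaGenValue cases for the idx1 ops):

	package amd64

	import "cmd/internal/obj/x86"

	// x86 cannot encode SP as an index register, so move it to
	// the base slot; idx1 ops use Scale 1, so base and index are
	// interchangeable
	func baseIndex1(r, i int16) (base, index int16) {
		if i == x86.REG_SP {
			r, i = i, r
		}
		return r, i
	}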
Fixes #15053
Change-Id: I19000cc9d6c86c7611743481e6e2cb78b1ef04eb
Reviewed-on: https://go-review.googlesource.com/21384
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Helpful for indexed loads and stores when the stride is not equal to
the size being loaded/stored.
Update #7927
Change-Id: I8714dd4c7b18a96a611bf5647ee21f753d723945
Reviewed-on: https://go-review.googlesource.com/21346
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Previously if we were only using the low bits of AuxInt,
the high bits were ignored and could be junk. This CL
changes that behavior to define the high bits to be the
sign-extended version of the low bits for all cases.
There are 2 main benefits:
- Deterministic representation. This helps with CSE.
(Const8 [0x1]) and (Const8 [0x101]) used to be the same "value"
but CSE couldn't see them as such.
- Testability. We can check that all ops leave AuxInt in a state
consistent with the new rule. In the old scheme, it was hard
to check whether a rule correctly used only the low-order bits.
Side benefits:
- ==0 and !=0 tests are easier.
Drawbacks:
- This differs from the runtime representation in registers,
where it is important that we allow upper bits to be undefined
(so we're not sign/zero-extending all the time).
- Ops that treat AuxInt as unsigned (shifts, mostly) need to be
a bit more careful.
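The new invariant, written as a checkable predicate (a sketch; helper
name assumed):

	// validAuxInt reports whether x is the sign extension of its
	// low 'size' bytes, as the new representation requires
	func validAuxInt(x int64, size int) bool {
		switch size {
		case 1:
			return x == int64(int8(x))
		case 2:
			return x == int64(int16(x))
		case 4:
			return x == int64(int32(x))
		}
		return true // 8-byte AuxInts are canonical by construction
	}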
Change-Id: I9a685ff27e36dc03287c9ab1cecd6c0b4045c819
Reviewed-on: https://go-review.googlesource.com/21256
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The previous rules to combine indexed loads produced addresses like:
From: obj.Addr{
	Type:   TYPE_MEM,
	Reg:    REG_CX,
	Name:   NAME_AUTO,
	Offset: 121,
	...
}
which are erroneous because NAME_AUTO implies a base register of
REG_SP, and cmd/internal/obj/x86 makes many assumptions to this
effect. Note that previously we were also producing an extra "ADDQ
SP, CX" instruction, so indexing off of SP was already handled.
The approach taken by this CL to address the problem is to instead
produce addresses like:
From: obj.Addr{
	Type:   TYPE_MEM,
	Reg:    REG_SP,
	Name:   NAME_AUTO,
	Offset: 121,
	Index:  REG_CX,
	Scale:  1,
}
and to omit the "ADDQ SP, CX" instruction.
The downside to this approach is that it requires adding a lot of new
MOV[WLQ]loadidx1 instructions that nearly duplicate the functionality
of the existing MOV[WLQ]loadidx[248] instructions, but with a different
Scale.
Fixes #15001.
Change-Id: Iad9a1a41e5e2552f8d22e3ba975e4ea0862dffd2
Reviewed-on: https://go-review.googlesource.com/21245
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
One intrinsic was needed to help get the very best
performance out of a future GC; as long as that one was
being added, I also added Bswap since that is sometimes
a handy thing to have. I had intended to fill out the
bit-scan intrinsic family, but the mismatch between the
"scan forward" instruction and "count leading zeroes"
was large enough to cause me to leave it out -- it poses
a dilemma that I'd rather dodge right now.
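For reference, a pure-Go sketch of what the Bswap64 intrinsic computes
(not the compiler's code):

	// bswap64 reverses the byte order of x
	func bswap64(x uint64) uint64 {
		return x<<56 |
			(x&0xff00)<<40 |
			(x&0xff0000)<<24 |
			(x&0xff000000)<<8 |
			(x>>8)&0xff000000 |
			(x>>24)&0xff0000 |
			(x>>40)&0xff00 |
			x>>56
	}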
These intrinsics are not exposed for general use.
That's a separate issue, requiring a change through the API proposal
process (https://github.com/golang/proposal).
All intrinsics are tested, both that they are substituted
on the appropriate architecture, and that they produce the
expected result.
Change-Id: I5848037cfd97de4f75bdc33bdd89bba00af4a8ee
Reviewed-on: https://go-review.googlesource.com/20564
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
No need to have both ops when they do the same thing.
Just declare MOVBload to zero extend and we can get rid
of MOVBQZXload. Same for W and L.
Kind of a follow-on cleanup to https://go-review.googlesource.com/c/19506/
Should enable an easier fix for #14920
Change-Id: I7cfac909a8ba387f433a6ae75c050740ebb34d42
Reviewed-on: https://go-review.googlesource.com/21004
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Parts of the SSA compiler in package gc contain amd64-specific code,
most notably Prog generation. Move this code into package amd64, so that
other architectures can be added more easily.
In package gc, this change is just moving code. There are no functional
changes or even any larger structural changes beyond changing function
names (mostly for export).
In the cmd/compile/internal/ssa/gen tool, more information is included
in arch to remove the AMD64-specific behavior in the main portion of the
tool. The generated opGen.go is identical.
Change-Id: I8eb37c6e6df6de1b65fa7dab6f3bc32c29daf643
Reviewed-on: https://go-review.googlesource.com/20609
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>