120 Commits

Author SHA1 Message Date
Keith Randall
d2286ea284 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Semi-regular merge from tip into dev.ssa.

Change-Id: Iadb60e594ef65a99c0e1404b14205fa67c32a9e9
2016-08-04 10:08:20 -07:00
Cherry Zhang
111d590f86 cmd/compile: fix possible spill of invalid pointer with DUFFZERO on AMD64
SSA compiler on AMD64 may spill Duff-adjusted address as scalar. If
the object is on stack and the stack moves, the spilled address become
invalid.

Making the spill pointer-typed does not work. The Duff-adjusted address
points to the memory before the area to be zeroed and may be invalid.
This may cause stack scanning code panic.

Fix it by doing Duff-adjustment in genValue, so the intermediate value
is not seen by the reg allocator, and will not be spilled.

Add a test to cover both cases. As it depends on allocation, it may
be not always triggered.

Fixes #16515.

Change-Id: Ia81d60204782de7405b7046165ad063384ede0db
Reviewed-on: https://go-review.googlesource.com/25309
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-07-29 01:09:55 +00:00
Keith Randall
cf92e3845f [dev.ssa] cmd/compile: use 2-result divide op
We now allow Values to have 2 outputs.  Use that ability for amd64.
This allows x,y := a/b,a%b to use just a single divide instruction.

Update #6815

Change-Id: Id70bcd20188a2dd8445e631a11d11f60991921e4
Reviewed-on: https://go-review.googlesource.com/25004
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2016-07-18 19:41:05 +00:00
Josh Bleecher Snyder
41a7dca272 [dev.ssa] cmd/compile: unify and check LoweredGetClosurePtr
The comments were mostly duplicated; unify them.
Add a check that the required invariant holds.

Change-Id: I42fe09dcd1fac76d3c4e191f7a58c591c5ce429b
Reviewed-on: https://go-review.googlesource.com/24719
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2016-07-04 01:29:28 +00:00
Josh Bleecher Snyder
22d1318e7b [dev.ssa] cmd/compile: refactor out CheckLoweredPhi
This will be used verbatim in other architectures.

Change-Id: I307891ae597d797fd45f296b6a38ffe9fac6b975
Reviewed-on: https://go-review.googlesource.com/24155
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-16 14:34:28 +00:00
Keith Randall
3572c6418b cmd/compile: keep pointer input arguments live throughout function
Introduce a KeepAlive op which makes sure that its argument is kept
live until the KeepAlive.  Use KeepAlive to mark pointer input
arguments as live after each function call and at each return.

We do this change only for pointer arguments.  Those are the
critical ones to handle because they might have finalizers.
Doing compound arguments (slices, structs, ...) is more complicated
because we would need to track field liveness individually (we do
that for auto variables now, but inputs requires extra trickery).

Turn off the automatic marking of args as live.  That way, when args
are explicitly nulled, plive will know that the original argument is
dead.

The KeepAlive op will be the eventual implementation of
runtime.KeepAlive.

Fixes #15277

Change-Id: I5f223e65d99c9f8342c03fbb1512c4d363e903e5
Reviewed-on: https://go-review.googlesource.com/22365
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-05-18 19:25:27 +00:00
Keith Randall
4fa050024f cmd/compile: enable constant-time CFG editing
Provide indexes along with block pointers for Preds
and Succs arrays.  This allows us to splice edges in
and out of those arrays in constant time.

Fixes worst-case O(n^2) behavior in deadcode and fuse.

benchmark                     old ns/op      new ns/op     delta
BenchmarkFuse1-8              2065           2057          -0.39%
BenchmarkFuse10-8             9408           9073          -3.56%
BenchmarkFuse100-8            105238         76277         -27.52%
BenchmarkFuse1000-8           3982562        1026750       -74.22%
BenchmarkFuse10000-8          301220329      12824005      -95.74%
BenchmarkDeadCode1-8          1588           1566          -1.39%
BenchmarkDeadCode10-8         4333           4250          -1.92%
BenchmarkDeadCode100-8        32031          32574         +1.70%
BenchmarkDeadCode1000-8       590407         468275        -20.69%
BenchmarkDeadCode10000-8      17822890       5000818       -71.94%
BenchmarkDeadCode100000-8     1388706640     78021127      -94.38%
BenchmarkDeadCode200000-8     5372518479     168598762     -96.86%

Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17
Reviewed-on: https://go-review.googlesource.com/22589
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 15:58:59 +00:00
Keith Randall
cd956576ae cmd/compile: make vet happy with ssa code
Fixes #15488

Change-Id: I054eb1e1c859de315e3cdbdef5428682bce693fd
Reviewed-on: https://go-review.googlesource.com/22609
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-04-29 18:49:23 +00:00
Keith Randall
9e3c68f1e0 cmd/compile: get rid of most byte and word insns for amd64
Now that we're using 32-bit ops for 8/16-bit logical operations
(to avoid partial register stalls), there's really no need to
keep track of the 8/16-bit ops at all.  Convert everything we
can to 32-bit ops.

This CL is the obvious stuff.  I might think a bit more about
whether we can get rid of weirder stuff like HMULWU.

The only downside to this CL is that we lose some information
about constants.  If we had source like:
  var a byte = ...
  a += 128
  a += 128
We will convert that to a += 256, when we could get rid of the
add altogether.  This seems like a fairly unusual scenario and
I'm happy with forgoing that optimization.

Change-Id: Ia7c1e5203d0d110807da69ed646535194a3efba1
Reviewed-on: https://go-review.googlesource.com/22382
Reviewed-by: Todd Neal <todd@tneal.org>
2016-04-23 16:30:27 +00:00
Keith Randall
f8fc3710fd cmd/compile: handle mem copies in amd64 backend
Fixes noopt builder.

Change-Id: If13373b2597f0fcc9b1b2f9c860f2bd043e43c6c
Reviewed-on: https://go-review.googlesource.com/22338
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-21 17:04:01 +00:00
Keith Randall
0004f34cef cmd/compile: regalloc enforces 2-address instructions
Instead of being a hint, resultInArg0 is now enforced by regalloc.
This allows us to delete all the code from amd64/ssa.go which
deals with converting from a semantically three-address instruction
into some copies plus a two-address instruction.

Change-Id: Id4f39a80be4b678718bfd42a229f9094ab6ecd7c
Reviewed-on: https://go-review.googlesource.com/21816
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-10 23:20:38 +00:00
Ilya Tocar
036d09d5bf cmd/compile/internal/amd64: Use 32-bit operands for byte operations
We already generate ADDL for byte operations, reflect this in code.
This also allows inc/dec for +-1 operation, which are 1-byte shorter,
and enables lea for 3-operand addition/subtraction.

Change-Id: Ibfdfee50667ca4cd3c28f72e3dece0c6d114d3ae
Reviewed-on: https://go-review.googlesource.com/21251
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-05 15:15:19 +00:00
Keith Randall
9f66636c93 cmd/compile: don't put SP in index slot
For idx1 ops, SP can appear in the index slot.
Swap SP into the base register slot so we can encode
the instruction.

Fixes #15053

Change-Id: I19000cc9d6c86c7611743481e6e2cb78b1ef04eb
Reviewed-on: https://go-review.googlesource.com/21384
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-31 22:10:47 +00:00
Keith Randall
af517da2f9 cmd/compile: Add more idx1 load/store instructions
Helpful for indexed loads and stores when the stride is not equal to
the size being loaded/stored.

Update #7927

Change-Id: I8714dd4c7b18a96a611bf5647ee21f753d723945
Reviewed-on: https://go-review.googlesource.com/21346
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
2016-03-31 17:30:40 +00:00
Alexandru Moșoi
dc5a7682f0 cmd/compile: use inc/dec for bytes, too
Change-Id: Ib2890ab1983cbef7c1c1ee5a10204ba3ace19b53
Reviewed-on: https://go-review.googlesource.com/21312
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-30 20:18:16 +00:00
Keith Randall
7fc5621991 cmd/compile: define high bits of AuxInt
Previously if we were only using the low bits of AuxInt,
the high bits were ignored and could be junk.  This CL
changes that behavior to define the high bits to be the
sign-extended version of the low bits for all cases.

There are 2 main benefits:
- Deterministic representation.  This helps with CSE.
  (Const8 [0x1]) and (Const8 [0x101]) used to be the same "value"
  but CSE couldn't see them as such.
- Testability.  We can check that all ops leave AuxInt in a state
  consistent with the new rule.  In the old scheme, it was hard
  to check whether a rule correctly used only the low-order bits.
Side benefits:
- ==0 and !=0 tests are easier.

Drawbacks:
- This differs from the runtime representation in registers,
  where it is important that we allow upper bits to be undefined
  (so we're not sign/zero-extending all the time).
- Ops that treat AuxInt as unsigned (shifts, mostly) need to be
  a bit more careful.

Change-Id: I9a685ff27e36dc03287c9ab1cecd6c0b4045c819
Reviewed-on: https://go-review.googlesource.com/21256
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-03-30 04:48:28 +00:00
Matthew Dempsky
da19a0cff4 cmd/compile: fix plan9-amd64 build
The previous rules to combine indexed loads produced addresses like:

    From: obj.Addr{
        Type:   TYPE_MEM,
        Reg:    REG_CX,
        Name:   NAME_AUTO,
        Offset: 121,
        ...
    }

which are erroneous because NAME_AUTO implies a base register of
REG_SP, and cmd/internal/obj/x86 makes many assumptions to this
effect.  Note that previously we were also producing an extra "ADDQ
SP, CX" instruction, so indexing off of SP was already handled.

The approach taken by this CL to address the problem is to instead
produce addresses like:

    From: obj.Addr{
        Type:   TYPE_MEM,
        Reg:    REG_SP,
        Name:   NAME_AUTO,
        Offset: 121,
        Index:  REG_CX,
        Scale:  1,
    }

and to omit the "ADDQ SP, CX" instruction.

Downside to this approach is it requires adding a lot of new
MOV[WLQ]loadidx1 instructions that nearly duplicate functionality of
the existing MOV[WLQ]loadidx[248] instructions, but with a different
Scale.

Fixes #15001.

Change-Id: Iad9a1a41e5e2552f8d22e3ba975e4ea0862dffd2
Reviewed-on: https://go-review.googlesource.com/21245
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-29 03:22:06 +00:00
David Chase
8eec2bbfbc cmd/compile: added some intrinsics to SSA back end
One intrinsic was needed to help get the very best
performance out of a future GC; as long as that one was
being added, I also added Bswap since that is sometimes
a handy thing to have.  I had intended to fill out the
bit-scan intrinsic family, but the mismatch between the
"scan forward" instruction and "count leading zeroes"
was large enough to cause me to leave it out -- it poses
a dilemma that I'd rather dodge right now.

These intrinsics are not exposed for general use.
That's a separate issue requiring an API proposal change
( https://github.com/golang/proposal )

All intrinsics are tested, both that they are substituted
on the appropriate architecture, and that they produce the
expected result.

Change-Id: I5848037cfd97de4f75bdc33bdd89bba00af4a8ee
Reviewed-on: https://go-review.googlesource.com/20564
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-28 16:29:59 +00:00
Keith Randall
68e86e6dfa cmd/compile: MOVBload and MOVBQZXload are the same op
No need to have both ops when they do the same thing.
Just declare MOVBload to zero extend and we can get rid
of MOVBQZXload.  Same for W and L.

Kind of a followon cleanup for https://go-review.googlesource.com/c/19506/
Should enable an easier fix for #14920

Change-Id: I7cfac909a8ba387f433a6ae75c050740ebb34d42
Reviewed-on: https://go-review.googlesource.com/21004
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-03-23 00:28:01 +00:00
Michael Pratt
a4e31d42ee cmd/compile: remove amd64 code from package gc and the core gen tool
Parts of the SSA compiler in package gc contain amd64-specific code,
most notably Prog generation. Move this code into package amd64, so that
other architectures can be added more easily.

In package gc, this change is just moving code. There are no functional
changes or even any larger structural changes beyond changing function
names (mostly for export).

In the cmd/compile/internal/ssa/gen tool, more information is included
in arch to remove the AMD64-specific behavior in the main portion of the
tool. The generated opGen.go is identical.

Change-Id: I8eb37c6e6df6de1b65fa7dab6f3bc32c29daf643
Reviewed-on: https://go-review.googlesource.com/20609
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-14 16:59:03 +00:00