553 Commits

Author SHA1 Message Date
Cherry Zhang
f55317828b [dev.ssa] cmd/compile: ensure alignment for Zero and Move in SSA for ARM
Encode the size and the alignment into AuxInt of Zero and Move ops.
On AMD64, we simply don't look at the alignment. On ARM and PPC64, we
only generate aligned stores.

Updates #15365.

Change-Id: Ifdcc205c364f67c4516b9adebfe7d50d223b6863
Reviewed-on: https://go-review.googlesource.com/24511
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-07-02 22:22:12 +00:00
Cherry Zhang
29f0984a35 cmd/compile: don't set line number to 0 when building SSA
The frontend may emit node with line number missing. In this case,
use the parent line number. Instead of changing every call site of
pushLine, do it in pushLine itself.

Fixes #16214.

Change-Id: I80390550b56e4d690fc770b01ff725b892ffd6dc
Reviewed-on: https://go-review.googlesource.com/24641
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-07-01 01:12:24 +00:00
Brad Fitzpatrick
7ea62121a7 all: be consistent about spelling of cancelation
We had ~30 one way, and these four new occurrences the other way.

Updates #11626

Change-Id: Ic6403dc4905874916ae292ff739d33482ed8e5bf
Reviewed-on: https://go-review.googlesource.com/24683
Reviewed-by: Rob Pike <r@golang.org>
2016-06-30 21:09:56 +00:00
Lynn Boger
03d152f36f [dev.ssa] cmd/compile: Add initial SSA configuration for PPC64
This adds the initial SSA implementation for PPC64.
Builds golang and all.bash runs correctly.  Simple hello.go
builds but does not run.

Change-Id: I7cec211b934cd7a2dd75a6cdfaf9f71867063466
Reviewed-on: https://go-review.googlesource.com/24453
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-28 15:41:20 +00:00
Josh Bleecher Snyder
22d1318e7b [dev.ssa] cmd/compile: refactor out CheckLoweredPhi
This will be used verbatim in other architectures.

Change-Id: I307891ae597d797fd45f296b6a38ffe9fac6b975
Reviewed-on: https://go-review.googlesource.com/24155
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-16 14:34:28 +00:00
Josh Bleecher Snyder
d0fa6c2f9e [dev.ssa] cmd/compile: add and use SSAReg
This will be needed by other architectures as well.
Put a cleaner encapsulation around it.

Change-Id: I0ac25d600378042b2233301678e9d037e20701d8
Reviewed-on: https://go-review.googlesource.com/24154
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-06-16 14:12:30 +00:00
Keith Randall
0393ed8201 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Change-Id: Idd150294aaeced0176b53d6b95852f5d21ff4fdc
2016-06-14 07:34:09 -07:00
Cherry Zhang
e3a6d00876 [dev.ssa] cmd/compile: ensure OffPtr has pointer type
SSA treats SP as constant throughout a function, so as OffPtr [off] SP.
When the stack moves, spilled OffPtr values become invalid, if they are
not pointer-typed.

(Currently it is fine because of the optimization rules that folds OffPtr
into Load/Store. But it'd better be "optimization", not requirement.)

Updates #15365.

Change-Id: I76cf4008dfdc169e1cb5a55a2605b6678efc915d
Reviewed-on: https://go-review.googlesource.com/23941
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-06-13 12:55:30 +00:00
Cherry Zhang
f3689d1382 cmd/compile: nilcheck interface value in go/defer interface call for SSA
This matches the behavior of the legacy backend.

Fixes #15975 (if this is the intended behavior)

Change-Id: Id277959069b8b8bf9958fa8f2cbc762c752a1a19
Reviewed-on: https://go-review.googlesource.com/23820
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-08 20:35:53 +00:00
Keith Randall
2f088884ae cmd/compile: use fake package for allocating autos
Make sure auto names don't conflict with function names. Before this CL,
we confused name a.len (the len field of the slice a) with a.len (the function
len declared on a).

Fixes #15961

Change-Id: I14913de697b521fb35db9a1b10ba201f25d552bb
Reviewed-on: https://go-review.googlesource.com/23789
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-07 06:04:23 +00:00
Cherry Zhang
59e11d7827 [dev.ssa] cmd/compile: handle floating point on ARM
Machine supports (or the runtime simulates in soft float mode)
(u)int32<->float conversions. The frontend rewrites int64<->float
conversions to call to runtime function.

For int64->float32 conversion, the frontend generates

.   .   AS u(100) l(10) tc(1)
.   .   .   NAME-main.~r1 u(1) a(true) g(1) l(9) x(8+0) class(PPARAMOUT) f(1) float32
.   .   .   CALLFUNC u(100) l(10) tc(1) float32
.   .   .   .   NAME-runtime.int64tofloat64 u(1) a(true) x(0+0) class(PFUNC) tc(1) used(true) FUNC-func(int64) float64

The CALLFUNC node has type float32, whereas runtime.int64tofloat64
returns float64. The legacy backend implicitly makes a float64->float32
conversion. The SSA backend does not do implicit conversion, so we
insert an explicit CONV here.

All cmd/compile/internal/gc/testdata/*_ssa.go tests passed.

Progress on SSA for ARM. Still not complete.

Update #15365.

Change-Id: I30937c8ff977271246b068f48224693776804339
Reviewed-on: https://go-review.googlesource.com/23652
Reviewed-by: Keith Randall <khr@golang.org>
2016-06-06 14:06:38 +00:00
Cherry Zhang
e78d90beeb [dev.ssa] cmd/compile: handle Div, Convert, GetClosurePtr etc. on ARM
This CL adds support of Div, Mod, Convert, GetClosurePtr and 64-bit indexing
support to SSA backend for ARM.

Add tests for 64-bit indexing to cmd/compile/internal/gc/testdata/string_ssa.go.

Tests cmd/compile/internal/gc/testdata/*_ssa.go passed, except compound_ssa.go
and fp_ssa.go.

Progress on SSA for ARM. Still not complete. Essentially the only unsupported
part is floating point.

Updates #15365.

Change-Id: I269e88b67f641c25e7a813d910c96d356d236bff
Reviewed-on: https://go-review.googlesource.com/23542
Reviewed-by: David Chase <drchase@google.com>
2016-06-05 03:56:42 +00:00
Cherry Zhang
8756d9253f [dev.ssa] cmd/compile: decompose 64-bit integer on ARM
Introduce dec64 rules to (generically) decompose 64-bit integer on
32-bit architectures. 64-bit integer is composed/decomposed with
Int64Make/Hi/Lo ops, as for complex types.

The idea of dealing with Add64 is the following:

(Add64 (Int64Make xh xl) (Int64Make yh yl))
->
(Int64Make
	(Add32withcarry xh yh (Select0 (Add32carry xl yl)))
	(Select1 (Add32carry xl yl)))

where Add32carry returns a tuple (flags,uint32). Select0 and Select1
read the first and the second component of the tuple, respectively.
The two Add32carry will be CSE'd.

Similarly for multiplication, Mul32uhilo returns a tuple (hi, lo).

Also add support of KeepAlive, to fix build after merge.

Tests addressed_ssa.go, array_ssa.go, break_ssa.go, chan_ssa.go,
cmp_ssa.go, ctl_ssa.go, map_ssa.go, and string_ssa.go in
cmd/compile/internal/gc/testdata passed.

Progress on SSA for ARM. Still not complete.

Updates #15365.

Change-Id: I7867c76785a456312de5d8398a6b3f7ca5a4f7ec
Reviewed-on: https://go-review.googlesource.com/23213
Reviewed-by: Keith Randall <khr@golang.org>
2016-06-02 13:01:09 +00:00
Keith Randall
42da35c699 cmd/compile: SSA, don't let write barrier clobber return values
When we do *p = f(), we might need to copy the return value from
f to p with a write barrier.  The write barrier itself is a call,
so we need to copy the return value of f to a temporary location
before we call the write barrier function.  Otherwise, the call
itself (specifically, marshalling the args to typedmemmove) will
clobber the value we're trying to write.

Fixes #15854

Change-Id: I5703da87634d91a9884e3ec098d7b3af713462e7
Reviewed-on: https://go-review.googlesource.com/23522
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-27 22:11:45 +00:00
David Chase
31e13c83c2 [dev.ssa] Merge branch 'master' into dev.ssa
Change-Id: Iabc80b6e0734efbd234d998271e110d2eaad41dd
2016-05-27 15:19:33 -04:00
Cherry Zhang
d108bc0e73 [dev.ssa] cmd/compile: implement Defer, RetJmp on SSA for ARM
Also fix argument offset for runtime calls.

Also fix LoadReg/StoreReg by generating instructions by type.

Progress on SSA backend for ARM. Still not complete.
Tests append_ssa.go, assert_ssa.go, loadstore_ssa.go, short_ssa.go, and
deferNoReturn.go in cmd/compile/internal/gc/testdata passed.

Updates #15365.

Change-Id: I0f0a2398cab8bbb461772a55241a16a7da2ecedf
Reviewed-on: https://go-review.googlesource.com/23212
Reviewed-by: David Chase <drchase@google.com>
2016-05-27 12:53:22 +00:00
Russ Cox
20803b845f cmd/compile: eliminate PPARAMREF
As in the elimination of PHEAP|PPARAM in CL 23393,
this is something the front end can trivially take care of
and then not bother the back ends with.
It also eliminates some suspect (and only lightly exercised)
code paths in the back ends.

I don't have a smoking gun for this one but it seems
more clearly correct.

Change-Id: I3b3f5e669b3b81d091ff1e2fb13226a6f14c69d5
Reviewed-on: https://go-review.googlesource.com/23431
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
2016-05-27 05:16:16 +00:00
Russ Cox
b6dc3e6f66 cmd/compile: fix liveness computation for heap-escaped parameters
The liveness computation of parameters generally was never
correct, but forcing all parameters to be live throughout the
function covered up that problem. The new SSA back end is
too clever: even though it currently keeps the parameter values live
throughout the function, it may find optimizations that mean
the current values are not written back to the original parameter
stack slots immediately or ever (for example if a parameter is set
to nil, SSA constant propagation may replace all later uses of the
parameter with a constant nil, eliminating the need to write the nil
value back to the stack slot), so the liveness code must now
track the actual operations on the stack slots, exposing these
problems.

One small problem in the handling of arguments is that nodarg
can return ONAME PPARAM nodes with adjusted offsets, so that
there are actually multiple *Node pointers for the same parameter
in the instruction stream. This might be possible to correct, but
not in this CL. For now, we fix this by using n.Orig instead of n
when considering PPARAM and PPARAMOUT nodes.

The major problem in the handling of arguments is general
confusion in the liveness code about the meaning of PPARAM|PHEAP
and PPARAMOUT|PHEAP nodes, especially as contrasted with PAUTO|PHEAP.
The difference between these two is that when a local variable "moves"
to the heap, it's really just allocated there to start with; in contrast,
when an argument moves to the heap, the actual data has to be copied
there from the stack at the beginning of the function, and when a
result "moves" to the heap the value in the heap has to be copied
back to the stack when the function returns
This general confusion is also present in the SSA back end.

The PHEAP bit worked decently when I first introduced it 7 years ago (!)
in 391425ae. The back end did nothing sophisticated, and in particular
there was no analysis at all: no escape analysis, no liveness analysis,
and certainly no SSA back end. But the complications caused in the
various downstream consumers suggest that this should be a detail
kept mainly in the front end.

This CL therefore eliminates both the PHEAP bit and even the idea of
"heap variables" from the back ends.

First, it replaces the PPARAM|PHEAP, PPARAMOUT|PHEAP, and PAUTO|PHEAP
variable classes with the single PAUTOHEAP, a pseudo-class indicating
a variable maintained on the heap and available by indirecting a
local variable kept on the stack (a plain PAUTO).

Second, walkexpr replaces all references to PAUTOHEAP variables
with indirections of the corresponding PAUTO variable.
The back ends and the liveness code now just see plain indirected
variables. This may actually produce better code, but the real goal
here is to eliminate these little-used and somewhat suspect code
paths in the back end analyses.

The OPARAM node type goes away too.

A followup CL will do the same to PPARAMREF. I'm not sure that
the back ends (SSA in particular) are handling those right either,
and with the framework established in this CL that change is trivial
and the result clearly more correct.

Fixes #15747.

Change-Id: I2770b1ce3cbc93981bfc7166be66a9da12013d74
Reviewed-on: https://go-review.googlesource.com/23393
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-27 03:19:52 +00:00
Cherry Zhang
ccaed50c7b [dev.ssa] cmd/compile: handle boolean values for SSA on ARM
Fix hardcoded flag register mask in ssa/flagalloc.go by auto-generating
the mask.

Also fix a mistake (in previous CL) about conditional branches.

Progress on SSA backend for ARM. Still not complete. Now "container/ring"
package compiles and tests passed.

Updates #15365.

Change-Id: Id7c8805c30dbb8107baedb485ed0f71f59ed6ea8
Reviewed-on: https://go-review.googlesource.com/23093
Reviewed-by: Keith Randall <khr@golang.org>
2016-05-19 02:48:36 +00:00
Keith Randall
3572c6418b cmd/compile: keep pointer input arguments live throughout function
Introduce a KeepAlive op which makes sure that its argument is kept
live until the KeepAlive.  Use KeepAlive to mark pointer input
arguments as live after each function call and at each return.

We do this change only for pointer arguments.  Those are the
critical ones to handle because they might have finalizers.
Doing compound arguments (slices, structs, ...) is more complicated
because we would need to track field liveness individually (we do
that for auto variables now, but inputs requires extra trickery).

Turn off the automatic marking of args as live.  That way, when args
are explicitly nulled, plive will know that the original argument is
dead.

The KeepAlive op will be the eventual implementation of
runtime.KeepAlive.

Fixes #15277

Change-Id: I5f223e65d99c9f8342c03fbb1512c4d363e903e5
Reviewed-on: https://go-review.googlesource.com/22365
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-05-18 19:25:27 +00:00
David Chase
6b99fb5bea cmd/compile: use sparse algorithm for phis in large program
This adds a sparse method for locating nearest ancestors
in a dominator tree, and checks blocks with more than one
predecessor for differences and inserts phi functions where
there are.

Uses reversed post order to cut number of passes, running
it from first def to last use ("last use" for paramout and
mem is end-of-program; last use for a phi input from a
backedge is the source of the back edge)

Includes a cutover from old algorithm to new to avoid paying
large constant factor for small programs.  This keeps normal
builds running at about the same time, while not running
over-long on large machine-generated inputs.

Add "phase" flags for ssa/build -- ssa/build/stats prints
number of blocks, values (before and after linking references
and inserting phis, so expansion can be measured), and their
product; the product governs the cutover, where a good value
seems to be somewhere between 1 and 5 million.

Among the files compiled by make.bash, this is the shape of
the tail of the distribution for #blocks, #vars, and their
product:

	 #blocks	#vars	    product
 max	6171	28180	173,898,780
99.9%	1641	 6548	 10,401,878
  99%	 463	 1909	    873,721
  95%	 152	  639	     95,235
  90%	  84	  359	     30,021

The old algorithm is indeed usually fastest, for 99%ile
values of usually.

The fix to LookupVarOutgoing
( https://go-review.googlesource.com/#/c/22790/ )
deals with some of the same problems addressed by this CL,
but on at least one bug ( #15537 ) this change is still
a significant help.

With this CL:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
   github.com/gogo/protobuf/test/theproto3/combos/...
...
real	4m35.200s
user	13m16.644s
sys	0m36.712s
and pprof reports 3.4GB allocated in one of the larger profiles

With tip:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
   github.com/gogo/protobuf/test/theproto3/combos/...
...
real	10m36.569s
user	25m52.286s
sys	4m3.696s
and pprof reports 8.3GB allocated in the same larger profile

With this CL, most of the compilation time on the benchmarked
input is spent in register/stack allocation (cumulative 53%)
and in the sparse lookup algorithm itself (cumulative 20%).

Fixes #15537.

Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00
Reviewed-on: https://go-review.googlesource.com/22342
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-16 21:08:05 +00:00
Cherry Zhang
fdc4a964d2 [dev.ssa] cmd/compile/internal/gc, runtime: use 32-bit load for writeBarrier check
Use 32-bit load for writeBarrier check on all architectures.
Padding added to runtime structure.

Updates #15365, #15492.

Change-Id: I5d3dadf8609923fe0fe4fcb384a418b7b9624998
Reviewed-on: https://go-review.googlesource.com/22855
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-10 17:34:30 +00:00
Cherry Zhang
e1a2ea88d0 [dev.ssa] cmd/compile: handle symbolic constant for SSA on ARM
Progress on SSA backend for ARM. Still not complete. Now "helloworld"
function compiles and runs.

Updates #15365.

Change-Id: I02f66983cefdf07a6aed262fb4af8add464d8e9a
Reviewed-on: https://go-review.googlesource.com/22854
Reviewed-by: Keith Randall <khr@golang.org>
2016-05-10 13:30:51 +00:00
Josh Bleecher Snyder
bc1989f115 cmd/compile: optimize lookupVarOutgoing
If b has exactly one predecessor, as happens
frequently with static calls, we can make
lookupVarOutgoing generate less garbage.

Instead of generating a value that is just
going to be an OpCopy and then get eliminated,
loop. This can lead to lots of looping.
However, this loop is way cheaper than generating
lots of ssa.Values and then eliminating them.

For a subset of the code in #15537:

Before:

       28.31 real        36.17 user         1.68 sys
2282450944  maximum resident set size

After:

        9.63 real        11.66 user         0.51 sys
 638144512  maximum resident set size

Updates #15537.

Excitingly, it appears that this also helps
regular code:

name       old time/op      new time/op      delta
Template        288ms ± 6%       276ms ± 7%   -4.13%        (p=0.000 n=21+24)
Unicode         143ms ± 8%       141ms ±10%     ~           (p=0.287 n=24+25)
GoTypes         932ms ± 4%       874ms ± 4%   -6.20%        (p=0.000 n=23+22)
Compiler        4.89s ± 4%       4.58s ± 4%   -6.46%        (p=0.000 n=22+23)
MakeBash        40.2s ±13%       39.8s ± 9%     ~           (p=0.648 n=23+23)

name       old user-ns/op   new user-ns/op   delta
Template   388user-ms ±10%  373user-ms ± 5%   -3.80%        (p=0.000 n=24+25)
Unicode    203user-ms ± 6%  202user-ms ± 7%     ~           (p=0.492 n=22+24)
GoTypes    1.29user-s ± 4%  1.17user-s ± 4%   -9.67%        (p=0.000 n=25+23)
Compiler   6.86user-s ± 5%  6.28user-s ± 4%   -8.49%        (p=0.000 n=25+25)

name       old alloc/op     new alloc/op     delta
Template       51.5MB ± 0%      47.6MB ± 0%   -7.47%        (p=0.000 n=22+25)
Unicode        37.2MB ± 0%      37.1MB ± 0%   -0.21%        (p=0.000 n=25+25)
GoTypes         166MB ± 0%       138MB ± 0%  -16.83%        (p=0.000 n=25+25)
Compiler        756MB ± 0%       628MB ± 0%  -16.96%        (p=0.000 n=25+23)

name       old allocs/op    new allocs/op    delta
Template         450k ± 0%        445k ± 0%   -1.02%        (p=0.000 n=25+25)
Unicode          356k ± 0%        356k ± 0%     ~           (p=0.374 n=24+25)
GoTypes         1.31M ± 0%       1.25M ± 0%   -4.18%        (p=0.000 n=25+25)
Compiler        5.29M ± 0%       5.02M ± 0%   -5.15%        (p=0.000 n=25+23)

It also seems to help in other cases in which
phi insertion is a pain point (#14774, #14934).

Change-Id: Ibd05ed7b99d262117ece7bb250dfa8c3d1cc5dd2
Reviewed-on: https://go-review.googlesource.com/22790
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
2016-05-05 17:45:21 +00:00
Keith Randall
4fa050024f cmd/compile: enable constant-time CFG editing
Provide indexes along with block pointers for Preds
and Succs arrays.  This allows us to splice edges in
and out of those arrays in constant time.

Fixes worst-case O(n^2) behavior in deadcode and fuse.

benchmark                     old ns/op      new ns/op     delta
BenchmarkFuse1-8              2065           2057          -0.39%
BenchmarkFuse10-8             9408           9073          -3.56%
BenchmarkFuse100-8            105238         76277         -27.52%
BenchmarkFuse1000-8           3982562        1026750       -74.22%
BenchmarkFuse10000-8          301220329      12824005      -95.74%
BenchmarkDeadCode1-8          1588           1566          -1.39%
BenchmarkDeadCode10-8         4333           4250          -1.92%
BenchmarkDeadCode100-8        32031          32574         +1.70%
BenchmarkDeadCode1000-8       590407         468275        -20.69%
BenchmarkDeadCode10000-8      17822890       5000818       -71.94%
BenchmarkDeadCode100000-8     1388706640     78021127      -94.38%
BenchmarkDeadCode200000-8     5372518479     168598762     -96.86%

Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17
Reviewed-on: https://go-review.googlesource.com/22589
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 15:58:59 +00:00
Keith Randall
5325fbc7db cmd/compile: don't SSA any variables when -N
Turn SSAing of variables off when compiling with optimizations off.
This helps keep variable names around that would otherwise be
optimized away.

Fixes #14744

Change-Id: I31db8cf269c068c7c5851808f13e5955a09810ca
Reviewed-on: https://go-review.googlesource.com/22681
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-05-02 04:16:45 +00:00
Keith Randall
cd956576ae cmd/compile: make vet happy with ssa code
Fixes #15488

Change-Id: I054eb1e1c859de315e3cdbdef5428682bce693fd
Reviewed-on: https://go-review.googlesource.com/22609
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-04-29 18:49:23 +00:00
Michael Hudson-Doyle
f4d38a8792 cmd/compile: de-dup the gclocals symbols in compiler too
These symbols are de-duplicated in the linker but the compiler generates quite
many duplicates too: 2425 of 13769 total symbols for runtime.a for example.
De-duplicating them in the compiler saves the linker a bit of work.

Fixes #14983

Change-Id: I5f18e5f9743563c795aad8f0a22d17a7ed147711
Reviewed-on: https://go-review.googlesource.com/22293
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-04-27 22:07:17 +00:00
Dave Cheney
d3c79d324a cmd/compile/internal/gc: remove oconv(op, 0) calls
Updates #15462

Automatic refactor with sed -e.

Replace all oconv(op, 0) to string conversion with the raw op value
which fmt's %v verb can print directly.

The remaining oconv(op, FmtSharp) will be replaced with op.GoString and
%#v in the next CL.

Change-Id: I5e2f7ee0bd35caa65c6dd6cb1a866b5e4519e641
Reviewed-on: https://go-review.googlesource.com/22499
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-27 21:39:39 +00:00
Keith Randall
a19e60b2c3 cmd/compile: don't use line numbers from ONAME and named OLITERALs
The line numbers of ONAMEs are the location of their
declaration, not their use.

The line numbers of named OLITERALs are also the location
of their declaration.

Ignore both of these.  Instead, we will inherit the line number from
the containing syntactic item.

Fixes #14742
Fixes #15430

Change-Id: Ie43b5b9f6321cbf8cead56e37ccc9364d0702f2f
Reviewed-on: https://go-review.googlesource.com/22479
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-04-27 15:03:38 +00:00
Dave Cheney
8f2e780e8a cmd/compile/internal: unexport gc.Oconv
Updates #15462

Semi automatic change with gofmt -r and hand fixups for callers outside
internal/gc.

All the uses of gc.Oconv outside cmd/compile/internal/gc were for the
Oconv(op, 0) form, which is already handled the Op.String method.

Replace the use of gc.Oconv(op, 0) with op itself, which will call
Op.String via the %v or %s verb. Unexport Oconv.

Change-Id: I84da2a2e4381b35f52efce427b2d6a3bccdf2526
Reviewed-on: https://go-review.googlesource.com/22496
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-27 06:18:46 +00:00
Alexandru Moșoi
8b92397bcd cmd/compile: introduce bool operations.
Introduce OrB, EqB, NeqB, AndB to handle bool operations.

Change-Id: I53e4d5125a8090d5eeb4576db619103f19fff58d
Reviewed-on: https://go-review.googlesource.com/22412
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-25 20:43:04 +00:00
Josh Bleecher Snyder
f12bd8a5a8 cmd/compile: encapsulate OSLICE* representation
As a nice side-effect, this allows us to
unify several code paths.

The terminology (low, high, max, simple slice expr,
full slice expr) is taken from the spec and
the examples in the spec.

This is a trial run. The plan, probably for Go 1.8,
is to change slice expressions to use Node.List
instead of OKEY, and to do some similar
tree structure changes for other ops.

Passes toolstash -cmp. No performance change.
all.bash passes with GO_GCFLAGS=-newexport.

Updates #15350

Change-Id: Ic1efdc36e79cdb95ae1636e9817a3ac8f83ab1ac
Reviewed-on: https://go-review.googlesource.com/22425
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-04-25 18:39:33 +00:00
Josh Bleecher Snyder
fca0f331c8 cmd/compile: use gc.Etype's String method
Passes toolstash -cmp.

Change-Id: I42c962cc5a3ffec2969f223cf238c2fdadbf5857
Reviewed-on: https://go-review.googlesource.com/22381
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-24 21:36:23 +00:00
Josh Bleecher Snyder
f027241445 cmd/compile: give gc.Op a String method, use it
Passes toolstash -cmp.

Change-Id: I915e76374fd64aa2597e6fa47e4fa95ca00ca643
Reviewed-on: https://go-review.googlesource.com/22380
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-04-24 21:36:13 +00:00
Matthew Dempsky
97360096e5 cmd/compile: replace Ctype switches with type switches
Instead of switching on Ctype (which internally uses a type switch)
and then scattering lots of type assertions throughout the CTFOO case
clauses, just use type switches directly on the underlying constant
value.

Passes toolstash/buildall.

Change-Id: I9bc172cc67e5f391cddc15539907883b4010689e
Reviewed-on: https://go-review.googlesource.com/22384
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-22 21:34:25 +00:00
Keith Randall
3c1a4c1902 cmd/compile: don't nilcheck newobject and return values from mapaccess{1,2}
They are guaranteed to be non-nil, no point in inserting
nil checks for them.

Fixes #15390

Change-Id: I3b9a0f2319affc2139dcc446d0a56c6785ae5a86
Reviewed-on: https://go-review.googlesource.com/22291
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-22 16:18:42 +00:00
Matthew Dempsky
40f1d0ca9f cmd/compile: split TSLICE into separate Type kind
Instead of using TARRAY for both arrays and slices, create a new
TSLICE kind to handle slices.

Also, get rid of the "DDDArray" distinction. While kinda ugly, it
seems likely we'll need to defer evaluating the constant bounds
expressions for golang.org/issue/13890.

Passes toolstash/buildall.

Change-Id: I8e45d4900e7df3a04cce59428ec8b38035d3cc3a
Reviewed-on: https://go-review.googlesource.com/22329
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 21:03:22 +00:00
Cherry Zhang
508a424eed cmd/compile/internal/gc: fix return value offset for SSA backend on ARM
Progress on SSA backend for ARM. Still not complete. It compiles a
Fibonacci function, but the caller picked the return value from an
incorrect offset. This CL adjusts it to match the stack frame layout
for architectures with link register.

Updates #15365.

Change-Id: I01e03c3e95f5503a185e8ac2b6d9caf4faf3d014
Reviewed-on: https://go-review.googlesource.com/22186
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-21 16:53:53 +00:00
Keith Randall
bfe0cbdc50 cmd/compile,runtime: pass elem type to {make,grow}slice
No point in passing the slice type to these functions.
All they need is the element type.  One less indirection,
maybe a few less []T type descriptors in the binary.

Change-Id: Ib0b83b5f14ca21d995ecc199ce8ac00c4eb375e6
Reviewed-on: https://go-review.googlesource.com/22275
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-20 00:31:16 +00:00
Josh Bleecher Snyder
03e216f30d cmd/compile: re-enable in-place append optimization
CL 21891 was too clever in its attempts to avoid spills.
Storing newlen too early caused uses of append in the runtime
itself to receive an inconsistent view of a slice,
leading to corruption.

This CL makes the generate code much more similar to
the old backend. It spills more than before,
but those spills have been contained to the grow path.
It recalculates newlen unnecessarily on the fast path,
but that's measurably cheaper than spilling it.

CL 21891 caused runtime failures in 6 of 2000 runs
of net/http and crypto/x509 in my test setup.
This CL has gone 6000 runs without a failure.


Benchmarks going from master to this CL:

name                         old time/op  new time/op  delta
AppendInPlace/NoGrow/Byte-8   439ns ± 2%   436ns ± 2%  -0.72%  (p=0.001 n=28+27)
AppendInPlace/NoGrow/1Ptr-8   901ns ± 0%   856ns ± 0%  -4.95%  (p=0.000 n=26+29)
AppendInPlace/NoGrow/2Ptr-8  2.15µs ± 1%  1.95µs ± 0%  -9.07%  (p=0.000 n=28+30)
AppendInPlace/NoGrow/3Ptr-8  2.66µs ± 0%  2.45µs ± 0%  -7.93%  (p=0.000 n=29+26)
AppendInPlace/NoGrow/4Ptr-8  3.24µs ± 1%  3.02µs ± 1%  -6.75%  (p=0.000 n=28+30)
AppendInPlace/Grow/Byte-8     269ns ± 1%   271ns ± 1%  +0.84%  (p=0.000 n=30+29)
AppendInPlace/Grow/1Ptr-8     275ns ± 1%   280ns ± 1%  +1.75%  (p=0.000 n=30+30)
AppendInPlace/Grow/2Ptr-8     384ns ± 0%   391ns ± 0%  +1.94%  (p=0.000 n=27+30)
AppendInPlace/Grow/3Ptr-8     455ns ± 0%   462ns ± 0%  +1.43%  (p=0.000 n=29+29)
AppendInPlace/Grow/4Ptr-8     478ns ± 0%   479ns ± 0%  +0.23%  (p=0.000 n=30+27)


However, for the large no-grow cases, there is still more work to be done.
Going from this CL to the non-SSA backend:

name                         old time/op  new time/op  delta
AppendInPlace/NoGrow/Byte-8   436ns ± 2%   436ns ± 2%     ~     (p=0.967 n=27+29)
AppendInPlace/NoGrow/1Ptr-8   856ns ± 0%   884ns ± 0%   +3.28%  (p=0.000 n=29+26)
AppendInPlace/NoGrow/2Ptr-8  1.95µs ± 0%  1.56µs ± 0%  -20.28%  (p=0.000 n=30+29)
AppendInPlace/NoGrow/3Ptr-8  2.45µs ± 0%  1.89µs ± 0%  -22.88%  (p=0.000 n=26+28)
AppendInPlace/NoGrow/4Ptr-8  3.02µs ± 1%  2.56µs ± 1%  -15.35%  (p=0.000 n=30+28)
AppendInPlace/Grow/Byte-8     271ns ± 1%   283ns ± 1%   +4.56%  (p=0.000 n=29+29)
AppendInPlace/Grow/1Ptr-8     280ns ± 1%   288ns ± 1%   +2.99%  (p=0.000 n=30+30)
AppendInPlace/Grow/2Ptr-8     391ns ± 0%   409ns ± 0%   +4.66%  (p=0.000 n=30+29)
AppendInPlace/Grow/3Ptr-8     462ns ± 0%   481ns ± 0%   +4.13%  (p=0.000 n=29+30)
AppendInPlace/Grow/4Ptr-8     479ns ± 0%   502ns ± 0%   +4.81%  (p=0.000 n=27+26)


New generated code:

var x []byte

func a() {
	x = append(x, 1)
}


"".a t=1 size=208 args=0x0 locals=0x48
	0x0000 00000 (a.go:5)	TEXT	"".a(SB), $72-0
	0x0000 00000 (a.go:5)	MOVQ	(TLS), CX
	0x0009 00009 (a.go:5)	CMPQ	SP, 16(CX)
	0x000d 00013 (a.go:5)	JLS	190
	0x0013 00019 (a.go:5)	SUBQ	$72, SP
	0x0017 00023 (a.go:5)	FUNCDATA	$0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (a.go:5)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (a.go:6)	MOVQ	"".x+16(SB), CX
	0x001e 00030 (a.go:6)	MOVQ	"".x+8(SB), DX
	0x0025 00037 (a.go:6)	MOVQ	"".x(SB), BX
	0x002c 00044 (a.go:6)	LEAQ	1(DX), BP
	0x0030 00048 (a.go:6)	CMPQ	BP, CX
	0x0033 00051 (a.go:6)	JGT	$0, 73
	0x0035 00053 (a.go:6)	LEAQ	1(DX), AX
	0x0039 00057 (a.go:6)	MOVQ	AX, "".x+8(SB)
	0x0040 00064 (a.go:6)	MOVB	$1, (BX)(DX*1)
	0x0044 00068 (a.go:7)	ADDQ	$72, SP
	0x0048 00072 (a.go:7)	RET
	0x0049 00073 (a.go:6)	LEAQ	type.[]uint8(SB), AX
	0x0050 00080 (a.go:6)	MOVQ	AX, (SP)
	0x0054 00084 (a.go:6)	MOVQ	BX, 8(SP)
	0x0059 00089 (a.go:6)	MOVQ	DX, 16(SP)
	0x005e 00094 (a.go:6)	MOVQ	CX, 24(SP)
	0x0063 00099 (a.go:6)	MOVQ	BP, 32(SP)
	0x0068 00104 (a.go:6)	PCDATA	$0, $0
	0x0068 00104 (a.go:6)	CALL	runtime.growslice(SB)
	0x006d 00109 (a.go:6)	MOVQ	40(SP), CX
	0x0072 00114 (a.go:6)	MOVQ	48(SP), DX
	0x0077 00119 (a.go:6)	MOVQ	DX, "".autotmp_0+64(SP)
	0x007c 00124 (a.go:6)	MOVQ	56(SP), BX
	0x0081 00129 (a.go:6)	MOVQ	BX, "".x+16(SB)
	0x0088 00136 (a.go:6)	MOVL	runtime.writeBarrier(SB), AX
	0x008e 00142 (a.go:6)	TESTB	AL, AL
	0x0090 00144 (a.go:6)	JNE	$0, 162
	0x0092 00146 (a.go:6)	MOVQ	CX, "".x(SB)
	0x0099 00153 (a.go:6)	MOVQ	"".x(SB), BX
	0x00a0 00160 (a.go:6)	JMP	53
	0x00a2 00162 (a.go:6)	LEAQ	"".x(SB), BX
	0x00a9 00169 (a.go:6)	MOVQ	BX, (SP)
	0x00ad 00173 (a.go:6)	MOVQ	CX, 8(SP)
	0x00b2 00178 (a.go:6)	PCDATA	$0, $0
	0x00b2 00178 (a.go:6)	CALL	runtime.writebarrierptr(SB)
	0x00b7 00183 (a.go:6)	MOVQ	"".autotmp_0+64(SP), DX
	0x00bc 00188 (a.go:6)	JMP	153
	0x00be 00190 (a.go:6)	NOP
	0x00be 00190 (a.go:5)	CALL	runtime.morestack_noctxt(SB)
	0x00c3 00195 (a.go:5)	JMP	0


Fixes #14969 again

Change-Id: Ia50463b1f506011aad0718a4fef1d4738e43c32d
Reviewed-on: https://go-review.googlesource.com/22197
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-19 18:48:05 +00:00
Matthew Dempsky
980ab12ade cmd/compile/internal/gc: change flags to bool where possible
Some of the Debug[x] flags are actually boolean too, but not all, so
they need to be handled separately.

While here, change some obj.Flagstr and obj.Flagint64 calls to
directly use flag.StringVar and flag.Int64Var instead.

Change-Id: Iccedf6fed4328240ee2257f57fe6d66688f237c4
Reviewed-on: https://go-review.googlesource.com/22052
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-04-14 02:10:35 +00:00
Josh Bleecher Snyder
811ebb6ac9 cmd/compile: temporarily disable inplace append special case
Fixes #15246
Re-opens #14969

Change-Id: Ic0b41c5aa42bbb229a0d62b7f3e5888c6b29293d
Reviewed-on: https://go-review.googlesource.com/21891
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-12 17:44:43 +00:00
Josh Bleecher Snyder
a4650a2111 cmd/compile: avoid write barrier in append fast path
When we are writing the result of an append back
to the same slice, we don’t need a write barrier
on the fast path.

This re-implements an optimization that was present
in the old backend.

Updates #14921
Fixes #14969

Sample code:

var x []byte

func p() {
	x = append(x, 1, 2, 3)
}

Before:

"".p t=1 size=224 args=0x0 locals=0x48
	0x0000 00000 (append.go:21)	TEXT	"".p(SB), $72-0
	0x0000 00000 (append.go:21)	MOVQ	(TLS), CX
	0x0009 00009 (append.go:21)	CMPQ	SP, 16(CX)
	0x000d 00013 (append.go:21)	JLS	199
	0x0013 00019 (append.go:21)	SUBQ	$72, SP
	0x0017 00023 (append.go:21)	FUNCDATA	$0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:21)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:19)	MOVQ	"".x+16(SB), CX
	0x001e 00030 (append.go:19)	MOVQ	"".x(SB), DX
	0x0025 00037 (append.go:19)	MOVQ	"".x+8(SB), BX
	0x002c 00044 (append.go:19)	MOVQ	BX, "".autotmp_0+64(SP)
	0x0031 00049 (append.go:22)	LEAQ	3(BX), BP
	0x0035 00053 (append.go:22)	CMPQ	BP, CX
	0x0038 00056 (append.go:22)	JGT	$0, 131
	0x003a 00058 (append.go:22)	MOVB	$1, (DX)(BX*1)
	0x003e 00062 (append.go:22)	MOVB	$2, 1(DX)(BX*1)
	0x0043 00067 (append.go:22)	MOVB	$3, 2(DX)(BX*1)
	0x0048 00072 (append.go:22)	MOVQ	BP, "".x+8(SB)
	0x004f 00079 (append.go:22)	MOVQ	CX, "".x+16(SB)
	0x0056 00086 (append.go:22)	MOVL	runtime.writeBarrier(SB), AX
	0x005c 00092 (append.go:22)	TESTB	AL, AL
	0x005e 00094 (append.go:22)	JNE	$0, 108
	0x0060 00096 (append.go:22)	MOVQ	DX, "".x(SB)
	0x0067 00103 (append.go:23)	ADDQ	$72, SP
	0x006b 00107 (append.go:23)	RET
	0x006c 00108 (append.go:22)	LEAQ	"".x(SB), CX
	0x0073 00115 (append.go:22)	MOVQ	CX, (SP)
	0x0077 00119 (append.go:22)	MOVQ	DX, 8(SP)
	0x007c 00124 (append.go:22)	PCDATA	$0, $0
	0x007c 00124 (append.go:22)	CALL	runtime.writebarrierptr(SB)
	0x0081 00129 (append.go:23)	JMP	103
	0x0083 00131 (append.go:22)	LEAQ	type.[]uint8(SB), AX
	0x008a 00138 (append.go:22)	MOVQ	AX, (SP)
	0x008e 00142 (append.go:22)	MOVQ	DX, 8(SP)
	0x0093 00147 (append.go:22)	MOVQ	BX, 16(SP)
	0x0098 00152 (append.go:22)	MOVQ	CX, 24(SP)
	0x009d 00157 (append.go:22)	MOVQ	BP, 32(SP)
	0x00a2 00162 (append.go:22)	PCDATA	$0, $0
	0x00a2 00162 (append.go:22)	CALL	runtime.growslice(SB)
	0x00a7 00167 (append.go:22)	MOVQ	40(SP), DX
	0x00ac 00172 (append.go:22)	MOVQ	48(SP), AX
	0x00b1 00177 (append.go:22)	MOVQ	56(SP), CX
	0x00b6 00182 (append.go:22)	ADDQ	$3, AX
	0x00ba 00186 (append.go:19)	MOVQ	"".autotmp_0+64(SP), BX
	0x00bf 00191 (append.go:22)	MOVQ	AX, BP
	0x00c2 00194 (append.go:22)	JMP	58
	0x00c7 00199 (append.go:22)	NOP
	0x00c7 00199 (append.go:21)	CALL	runtime.morestack_noctxt(SB)
	0x00cc 00204 (append.go:21)	JMP	0

After:

"".p t=1 size=208 args=0x0 locals=0x48
	0x0000 00000 (append.go:21)	TEXT	"".p(SB), $72-0
	0x0000 00000 (append.go:21)	MOVQ	(TLS), CX
	0x0009 00009 (append.go:21)	CMPQ	SP, 16(CX)
	0x000d 00013 (append.go:21)	JLS	191
	0x0013 00019 (append.go:21)	SUBQ	$72, SP
	0x0017 00023 (append.go:21)	FUNCDATA	$0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:21)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:19)	MOVQ	"".x+16(SB), CX
	0x001e 00030 (append.go:19)	MOVQ	"".x+8(SB), DX
	0x0025 00037 (append.go:19)	MOVQ	DX, "".autotmp_0+64(SP)
	0x002a 00042 (append.go:19)	MOVQ	"".x(SB), BX
	0x0031 00049 (append.go:22)	LEAQ	3(DX), BP
	0x0035 00053 (append.go:22)	MOVQ	BP, "".x+8(SB)
	0x003c 00060 (append.go:22)	CMPQ	BP, CX
	0x003f 00063 (append.go:22)	JGT	$0, 84
	0x0041 00065 (append.go:22)	MOVB	$1, (BX)(DX*1)
	0x0045 00069 (append.go:22)	MOVB	$2, 1(BX)(DX*1)
	0x004a 00074 (append.go:22)	MOVB	$3, 2(BX)(DX*1)
	0x004f 00079 (append.go:23)	ADDQ	$72, SP
	0x0053 00083 (append.go:23)	RET
	0x0054 00084 (append.go:22)	LEAQ	type.[]uint8(SB), AX
	0x005b 00091 (append.go:22)	MOVQ	AX, (SP)
	0x005f 00095 (append.go:22)	MOVQ	BX, 8(SP)
	0x0064 00100 (append.go:22)	MOVQ	DX, 16(SP)
	0x0069 00105 (append.go:22)	MOVQ	CX, 24(SP)
	0x006e 00110 (append.go:22)	MOVQ	BP, 32(SP)
	0x0073 00115 (append.go:22)	PCDATA	$0, $0
	0x0073 00115 (append.go:22)	CALL	runtime.growslice(SB)
	0x0078 00120 (append.go:22)	MOVQ	40(SP), CX
	0x007d 00125 (append.go:22)	MOVQ	56(SP), AX
	0x0082 00130 (append.go:22)	MOVQ	AX, "".x+16(SB)
	0x0089 00137 (append.go:22)	MOVL	runtime.writeBarrier(SB), AX
	0x008f 00143 (append.go:22)	TESTB	AL, AL
	0x0091 00145 (append.go:22)	JNE	$0, 168
	0x0093 00147 (append.go:22)	MOVQ	CX, "".x(SB)
	0x009a 00154 (append.go:22)	MOVQ	"".x(SB), BX
	0x00a1 00161 (append.go:19)	MOVQ	"".autotmp_0+64(SP), DX
	0x00a6 00166 (append.go:22)	JMP	65
	0x00a8 00168 (append.go:22)	LEAQ	"".x(SB), DX
	0x00af 00175 (append.go:22)	MOVQ	DX, (SP)
	0x00b3 00179 (append.go:22)	MOVQ	CX, 8(SP)
	0x00b8 00184 (append.go:22)	PCDATA	$0, $0
	0x00b8 00184 (append.go:22)	CALL	runtime.writebarrierptr(SB)
	0x00bd 00189 (append.go:22)	JMP	154
	0x00bf 00191 (append.go:22)	NOP
	0x00bf 00191 (append.go:21)	CALL	runtime.morestack_noctxt(SB)
	0x00c4 00196 (append.go:21)	JMP	0

Change-Id: I77a41ad3a22557a4bb4654de7d6d24a029efe34a
Reviewed-on: https://go-review.googlesource.com/21813
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-11 21:38:18 +00:00
Keith Randall
b04e145248 cmd/compile: fix naming of decomposed structs
When a struct is SSAable, we will name its component parts
by their field names.  For example,
type T struct {
     a, b, c int
}
If we ever need to spill a variable x of type T, we will
spill its individual components to variables named x.a, x.b,
and x.c.

Change-Id: I857286ff1f2597f2c4bbd7b4c0b936386fb37131
Reviewed-on: https://go-review.googlesource.com/21389
Reviewed-by: David Chase <drchase@google.com>
2016-04-11 17:11:23 +00:00
Josh Bleecher Snyder
6b33b0e98e cmd/compile: avoid a spill in append fast path
Instead of spilling newlen, recalculate it.
This removes a spill from the fast path,
at the cost of a cheap recalculation
on the (rare) growth path.
This uses 8 bytes less of stack space.
It generates two more bytes of code,
but that is due to suboptimal register allocation;
see far below.

Runtime append microbenchmarks are all over the map,
presumably due to incidental code movement.

Sample code:

func s(b []byte) []byte {
	b = append(b, 1, 2, 3)
	return b
}

Before:

"".s t=1 size=160 args=0x30 locals=0x48
	0x0000 00000 (append.go:8)	TEXT	"".s(SB), $72-48
	0x0000 00000 (append.go:8)	MOVQ	(TLS), CX
	0x0009 00009 (append.go:8)	CMPQ	SP, 16(CX)
	0x000d 00013 (append.go:8)	JLS	149
	0x0013 00019 (append.go:8)	SUBQ	$72, SP
	0x0017 00023 (append.go:8)	FUNCDATA	$0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB)
	0x0017 00023 (append.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:9)	MOVQ	"".b+88(FP), CX
	0x001c 00028 (append.go:9)	LEAQ	3(CX), DX
	0x0020 00032 (append.go:9)	MOVQ	DX, "".autotmp_0+64(SP)
	0x0025 00037 (append.go:9)	MOVQ	"".b+96(FP), BX
	0x002a 00042 (append.go:9)	CMPQ	DX, BX
	0x002d 00045 (append.go:9)	JGT	$0, 86
	0x002f 00047 (append.go:8)	MOVQ	"".b+80(FP), AX
	0x0034 00052 (append.go:9)	MOVB	$1, (AX)(CX*1)
	0x0038 00056 (append.go:9)	MOVB	$2, 1(AX)(CX*1)
	0x003d 00061 (append.go:9)	MOVB	$3, 2(AX)(CX*1)
	0x0042 00066 (append.go:10)	MOVQ	AX, "".~r1+104(FP)
	0x0047 00071 (append.go:10)	MOVQ	DX, "".~r1+112(FP)
	0x004c 00076 (append.go:10)	MOVQ	BX, "".~r1+120(FP)
	0x0051 00081 (append.go:10)	ADDQ	$72, SP
	0x0055 00085 (append.go:10)	RET
	0x0056 00086 (append.go:9)	LEAQ	type.[]uint8(SB), AX
	0x005d 00093 (append.go:9)	MOVQ	AX, (SP)
	0x0061 00097 (append.go:9)	MOVQ	"".b+80(FP), BP
	0x0066 00102 (append.go:9)	MOVQ	BP, 8(SP)
	0x006b 00107 (append.go:9)	MOVQ	CX, 16(SP)
	0x0070 00112 (append.go:9)	MOVQ	BX, 24(SP)
	0x0075 00117 (append.go:9)	MOVQ	DX, 32(SP)
	0x007a 00122 (append.go:9)	PCDATA	$0, $0
	0x007a 00122 (append.go:9)	CALL	runtime.growslice(SB)
	0x007f 00127 (append.go:9)	MOVQ	40(SP), AX
	0x0084 00132 (append.go:9)	MOVQ	56(SP), BX
	0x0089 00137 (append.go:8)	MOVQ	"".b+88(FP), CX
	0x008e 00142 (append.go:9)	MOVQ	"".autotmp_0+64(SP), DX
	0x0093 00147 (append.go:9)	JMP	52
	0x0095 00149 (append.go:9)	NOP
	0x0095 00149 (append.go:8)	CALL	runtime.morestack_noctxt(SB)
	0x009a 00154 (append.go:8)	JMP	0

After:

"".s t=1 size=176 args=0x30 locals=0x40
	0x0000 00000 (append.go:8)	TEXT	"".s(SB), $64-48
	0x0000 00000 (append.go:8)	MOVQ	(TLS), CX
	0x0009 00009 (append.go:8)	CMPQ	SP, 16(CX)
	0x000d 00013 (append.go:8)	JLS	151
	0x0013 00019 (append.go:8)	SUBQ	$64, SP
	0x0017 00023 (append.go:8)	FUNCDATA	$0, gclocals·6432f8c6a0d23fa7bee6c5d96f21a92a(SB)
	0x0017 00023 (append.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0017 00023 (append.go:9)	MOVQ	"".b+80(FP), CX
	0x001c 00028 (append.go:9)	LEAQ	3(CX), DX
	0x0020 00032 (append.go:9)	MOVQ	"".b+88(FP), BX
	0x0025 00037 (append.go:9)	CMPQ	DX, BX
	0x0028 00040 (append.go:9)	JGT	$0, 81
	0x002a 00042 (append.go:8)	MOVQ	"".b+72(FP), AX
	0x002f 00047 (append.go:9)	MOVB	$1, (AX)(CX*1)
	0x0033 00051 (append.go:9)	MOVB	$2, 1(AX)(CX*1)
	0x0038 00056 (append.go:9)	MOVB	$3, 2(AX)(CX*1)
	0x003d 00061 (append.go:10)	MOVQ	AX, "".~r1+96(FP)
	0x0042 00066 (append.go:10)	MOVQ	DX, "".~r1+104(FP)
	0x0047 00071 (append.go:10)	MOVQ	BX, "".~r1+112(FP)
	0x004c 00076 (append.go:10)	ADDQ	$64, SP
	0x0050 00080 (append.go:10)	RET
	0x0051 00081 (append.go:9)	LEAQ	type.[]uint8(SB), AX
	0x0058 00088 (append.go:9)	MOVQ	AX, (SP)
	0x005c 00092 (append.go:9)	MOVQ	"".b+72(FP), BP
	0x0061 00097 (append.go:9)	MOVQ	BP, 8(SP)
	0x0066 00102 (append.go:9)	MOVQ	CX, 16(SP)
	0x006b 00107 (append.go:9)	MOVQ	BX, 24(SP)
	0x0070 00112 (append.go:9)	MOVQ	DX, 32(SP)
	0x0075 00117 (append.go:9)	PCDATA	$0, $0
	0x0075 00117 (append.go:9)	CALL	runtime.growslice(SB)
	0x007a 00122 (append.go:9)	MOVQ	40(SP), AX
	0x007f 00127 (append.go:9)	MOVQ	48(SP), CX
	0x0084 00132 (append.go:9)	MOVQ	56(SP), BX
	0x0089 00137 (append.go:9)	ADDQ	$3, CX
	0x008d 00141 (append.go:9)	MOVQ	CX, DX
	0x0090 00144 (append.go:8)	MOVQ	"".b+80(FP), CX
	0x0095 00149 (append.go:9)	JMP	47
	0x0097 00151 (append.go:9)	NOP
	0x0097 00151 (append.go:8)	CALL	runtime.morestack_noctxt(SB)
	0x009c 00156 (append.go:8)	JMP	0

Observe that in the following sequence,
we should use DX directly instead of using
CX as a temporary register, which would make
the new code a strict improvement on the old:

	0x007f 00127 (append.go:9)	MOVQ	48(SP), CX
	0x0084 00132 (append.go:9)	MOVQ	56(SP), BX
	0x0089 00137 (append.go:9)	ADDQ	$3, CX
	0x008d 00141 (append.go:9)	MOVQ	CX, DX
	0x0090 00144 (append.go:8)	MOVQ	"".b+80(FP), CX

Change-Id: I4ee50b18fa53865901d2d7f86c2cbb54c6fa6924
Reviewed-on: https://go-review.googlesource.com/21812
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-04-10 20:02:04 +00:00
David Chase
c3b3e7b4ef cmd/compile: insert instrumentation more carefully in racewalk
Be more careful about inserting instrumentation in racewalk.
If the node being instrumented is an OAS, and it has a non-
empty Ninit, then append instrumentation to the Ninit list
rather than letting it be inserted before the OAS (and the
compilation of its init list).  This deals with the case that
the Ninit list defines a variable used in the RHS of the OAS.

Fixes #15091.

Change-Id: Iac91696d9104d07f0bf1bd3499bbf56b2e1ef073
Reviewed-on: https://go-review.googlesource.com/21771
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: David Chase <drchase@google.com>
2016-04-08 21:06:39 +00:00
Matthew Dempsky
c6e11fe037 cmd: add new common architecture representation
Information about CPU architectures (e.g., name, family, byte
ordering, pointer and register size) is currently redundantly
scattered around the source tree. Instead consolidate the basic
information into a single new package cmd/internal/sys.

Also, introduce new sys.I386, sys.AMD64, etc. names for the constants
'8', '6', etc. and replace most uses of the latter. The notable
exceptions are a couple of error messages that still refer to the old
char-based toolchain names and function reltype in cmd/link.

Passes toolstash/buildall.

Change-Id: I8a6f0cbd49577ec1672a98addebc45f767e36461
Reviewed-on: https://go-review.googlesource.com/21623
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-07 01:23:25 +00:00
Josh Bleecher Snyder
f38f43d029 cmd/compile: shrink gc.Type in half
Many of Type's fields are etype-specific.
This CL organizes them into their own auxiliary types,
duplicating a few fields as necessary,
and adds an Extra field to hold them.
It also sorts the remaining fields for better struct packing.
It also improves documentation for most fields.

This reduces the size of Type at the cost of some extra allocations.
There's no CPU impact; memory impact below.
It also makes the natural structure of Type clearer.

Passes toolstash -cmp on all architectures.

Ideas for future work in this vein:

(1) Width and Align probably only need to be
stored for Struct and Array types.
The refactoring to accomplish this would hopefully
also eliminate TFUNCARGS and TCHANARGS entirely.

(2) Maplineno is sparsely used and could probably better be
stored in a separate map[*Type]int32, with mapqueue updated
to store both a Node and a line number.

(3) The Printed field may be removable once the old (non-binary)
importer/exported has been removed.

(4) StructType's fields field could be changed from *[]*Field to []*Field,
which would remove a common allocation.

(5) I believe that Type.Nod can be moved to ForwardType. Separate CL.

name       old alloc/op     new alloc/op     delta
Template       57.9MB ± 0%      55.9MB ± 0%  -3.43%        (p=0.000 n=50+50)
Unicode        38.3MB ± 0%      37.8MB ± 0%  -1.39%        (p=0.000 n=50+50)
GoTypes         185MB ± 0%       180MB ± 0%  -2.56%        (p=0.000 n=50+50)
Compiler        824MB ± 0%       806MB ± 0%  -2.19%        (p=0.000 n=50+50)

name       old allocs/op    new allocs/op    delta
Template         486k ± 0%        497k ± 0%  +2.25%        (p=0.000 n=50+50)
Unicode          377k ± 0%        379k ± 0%  +0.55%        (p=0.000 n=50+50)
GoTypes         1.39M ± 0%       1.42M ± 0%  +1.63%        (p=0.000 n=50+50)
Compiler        5.52M ± 0%       5.57M ± 0%  +0.84%        (p=0.000 n=47+50)

Change-Id: I828488eeb74902b013d5ae4cf844de0b6c0dfc87
Reviewed-on: https://go-review.googlesource.com/21611
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-06 19:10:10 +00:00
Keith Randall
309144b7f1 cmd/compile: fix x=x assignments
No point in doing anything for x=x assignments.
In addition, skipping these assignments prevents generating:
    VARDEF x
    COPY x -> x
which is bad because x is incorrectly considered
dead before the vardef.

Fixes #14904

Change-Id: I6817055ec20bcc34a9648617e0439505ee355f82
Reviewed-on: https://go-review.googlesource.com/21470
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2016-04-06 15:04:32 +00:00