576 Commits

Author SHA1 Message Date
Cherry Zhang
6f3e5e637c cmd/compile: intrinsify runtime.getcallersp
Add a compiler intrinsic for getcallersp. So we are able to get
rid of the argument (not done in this CL).

Change-Id: Ic38fda1c694f918328659ab44654198fb116668d
Reviewed-on: https://go-review.googlesource.com/69350
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
2017-10-10 15:15:21 +00:00
David Chase
ca360c3992 cmd/compile: better XPos for rematerialized values and JMPs
This attempts to choose better values for values that are
rematerialized (uses the XPos of the consumer, not the
original) and for unconditional branches (uses the last
assigned XPos in the block).

The JMP branches seem to sometimes end up with a PC in the
destination block, I think because of register movement
or rematerialization that gets placed in predecessor blocks.
This may be acceptable because (eyeball-empirically) that is
often the line number of the target block, so the line number
flow is correct.

Added proper test, that checks both -N -l and regular compilation.
The test is also capable (for gdb, delve soon) of tracking
variable printing based on comments in the source code.

There's substantial room for improvement in debugger behavior.

Updates #21098.

Change-Id: I13abd48a39141583b85576a015f561065819afd0
Reviewed-on: https://go-review.googlesource.com/50610
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-07 22:12:36 +00:00
Matthew Dempsky
31a3b719a0 cmd/compile: cleanup genwrapper slightly
ORETJMP doesn't need an ONAME if we just set the target method on Sym
instead of Left. Conveniently, this is where fmt.go was looking for it
anyway.

Change the iface parameter and global compiling_wrappers to bool.

Passes toolstash-check.

Change-Id: I5333f8bcb4e06bf8161808041125eb95c439aafe
Reviewed-on: https://go-review.googlesource.com/68252
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-05 22:37:16 +00:00
Keith Randall
41eabc0fc7 cmd/compile: fix merge rules for panic calls
Use entire inlining call stack to decide whether two panic calls
can be merged. We used to merge panic calls when only the leaf
line numbers matched, but that leads to places higher up the call
stack being merged incorrectly.

Fixes #22083

Change-Id: Ia41400a80de4b6ecf3e5089abce0c42b65e9b38a
Reviewed-on: https://go-review.googlesource.com/67632
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-03 09:27:37 +00:00
Heschi Kreinick
6bbe1bc940 cmd/compile: cover control flow insns in location lists
The information that's used to generate DWARF location lists is very
ssa.Value centric; it uses Values as start and end coordinates to define
ranges. That mostly works fine, but control flow instructions don't come
from Values, so the ranges couldn't cover them.

Control flow instructions are generated when the SSA representation is
converted to assembly, so that's the best place to extend the ranges
to cover them. (Before that, there's nothing to refer to, and afterward
the boundaries between blocks have been lost.) That requires block
information in the debugInfo type, which then flows down to make
everything else awkward. On the plus side, there's a little less copying
slices around than there used to be, so it should be a little faster.

Previously, the ranges for empty blocks were not very meaningful. That
was fine, because they had no Values to cover, so no debug information
was generated for them. But they do have control flow instructions
(that's why they exist) and so now it's important that the information
be correct. Introduce two sentinel values, BlockStart and BlockEnd, that
denote the boundary of a block, even if the block is empty. BlockEnd
replaces the previous SurvivedBlock flag.

There's one more problem: the last instruction in the function will be a
control flow instruction, so any live ranges need to be extended past
it. But there's no instruction after it to use as the end of the range.
Instead, leave the EndProg field of those ranges as nil and fix it up to
point to past the end of the assembled text at the very last moment.

Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b
Reviewed-on: https://go-review.googlesource.com/63010
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-28 20:30:12 +00:00
David Chase
6cac100eef cmd/compile: add intrinsic for reading caller's pc
First step towards removing the mandatory argument for
getcallerpc, which solves certain problems for the runtime.
This might also slightly improve performance.

Intrinsic enabled on 386, amd64, amd64p32,
runtime asm implementation removed on those architectures.

Now-superfluous argument remains in getcallerpc signature
(for a future CL; non-386/amd64 asm funcs ignore it).

Added getcallerpc to the "not a real function" test
in dcl.go, that story is a little odd with respect to
unexported functions but that is not this CL.

Fixes #17327.

Change-Id: I5df1ad91f27ee9ac1f0dd88fa48f1329d6306c3e
Reviewed-on: https://go-review.googlesource.com/31851
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-09-22 18:37:03 +00:00
Michael Munday
7582494e06 cmd/compile: add s390x intrinsics for Ceil, Floor, Round and Trunc
Ceil, Floor and Trunc are pre-existing intrinsics. Round is a new
function and has been added as an intrinsic in this CL. All of the
functions can be implemented as a single 'LOAD FP INTEGER'
instruction, FIDBR, on s390x.

name   old time/op  new time/op  delta
Ceil   2.34ns ± 0%  0.85ns ± 0%  -63.74%  (p=0.000 n=5+4)
Floor  2.33ns ± 0%  0.85ns ± 1%  -63.35%  (p=0.008 n=5+5)
Round  4.23ns ± 0%  0.85ns ± 0%  -79.89%  (p=0.000 n=5+4)
Trunc  2.35ns ± 0%  0.85ns ± 0%  -63.83%  (p=0.029 n=4+4)

Change-Id: Idee7ba24a2899d12bf9afee4eedd6b4aaad3c510
Reviewed-on: https://go-review.googlesource.com/63890
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-20 10:01:35 +00:00
Keith Randall
1787ced894 cmd/compile: remove Symbol wrappers from Aux fields
We used to have {Arg,Auto,Extern}Symbol structs with which we wrapped
a *gc.Node or *obj.LSym before storing them in the Aux field
of an ssa.Value.  This let the SSA part of the compiler distinguish
between autos and args, for example.  We no longer need the wrappers
as we can query the underlying objects directly.

There was also some sloppy usage, where VarDef had a *gc.Node
directly in its Aux field, whereas the use of that variable had
that *gc.Node wrapped in an AutoSymbol. Thus the Aux fields didn't
match (using ==) when they probably should.
This sloppy usage cleanup is the only thing in the CL that changes the
generated code - we can get rid of some more unused auto variables if
the matching happens reliably.

Removing this wrapper also lets us get rid of the varsyms cache
(which was used to prevent wrapping the same *gc.Node twice).

Change-Id: I0dedf8f82f84bfee413d310342b777316bd1d478
Reviewed-on: https://go-review.googlesource.com/64452
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-19 22:03:10 +00:00
Daniel Martí
71c9454f99 cmd/compile: remove some redundant types in decls
As per golint's suggestions.

Change-Id: Ie0c6ad9aa5dc69966a279562a341c7b095c47ede
Reviewed-on: https://go-review.googlesource.com/64192
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-17 09:51:38 +00:00
Lynn Boger
fa3fe2e3c6 cmd/compile, math/bits: add rotate rules to PPC64.rules
This adds rules to match the code in math/bits RotateLeft,
RotateLeft32, and RotateLef64 to allow them to be inlined.

The rules are complicated because the code in these function
use different types, and the non-const version of these
shifts generate Mask and Carry instructions that become
subexpressions during the match process.

Also adds a testcase to asm_test.go.

Improvement in math/bits:

BenchmarkRotateLeft-16       1.57     1.32      -15.92%
BenchmarkRotateLeft32-16     1.60     1.37      -14.37%
BenchmarkRotateLeft64-16     1.57     1.32      -15.92%

Updates #21390

Change-Id: Ib6f17669ecc9cab54f18d690be27e2225ca654a4
Reviewed-on: https://go-review.googlesource.com/59932
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-09-11 20:44:22 +00:00
Daniel Martí
99da8730b0 all: remove some double spaces from comments
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.

Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-26 15:09:09 +00:00
Michael Munday
744ebfde04 cmd/compile: eliminate stores to unread auto variables
This is a crude compiler pass to eliminate stores to auto variables
that are only ever written to.

Eliminates an unnecessary store to x from the following code:

func f() int {
	var x := 1
	return *(&x)
}

Fixes #19765.

Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa
Reviewed-on: https://go-review.googlesource.com/38746
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-08-24 16:53:56 +00:00
Keith Randall
bf4d8d3d05 cmd/compile: rename SSA Register.Name to Register.String
Just to get rid of lots of .Name() stutter in printf calls.

Change-Id: I86cf00b3f7b2172387a1c6a7f189c1897fab6300
Reviewed-on: https://go-review.googlesource.com/56630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-08-17 21:53:08 +00:00
Daniel Martí
3366f51544 cmd/compile: tweaks to unindent some code
Prioritized the chunks of code with 8 or more levels of indentation.
Basically early breaks/returns and joining nested ifs.

Change-Id: I6817df1303226acf2eb904a29f2db720e4f7427a
Reviewed-on: https://go-review.googlesource.com/55630
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-08-17 07:57:19 +00:00
Keith Randall
04d6f982ae runtime: remove link field from itab
We don't use it any more, remove it.

Change-Id: I76ce1a4c2e7048fdd13a37d3718b5abf39ed9d26
Reviewed-on: https://go-review.googlesource.com/44474
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-08-15 01:52:35 +00:00
Cholerae Hu
57bf6aca71 runtime, cmd/compile: add intrinsic getclosureptr
Intrinsic enabled on all architectures,
runtime asm implementation removed on all architectures.

Fixes #21258

Change-Id: I2cb86d460b497c2f287a5b3df5c37fdb231c23a7
Reviewed-on: https://go-review.googlesource.com/53411
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2017-08-11 18:11:22 +00:00
Gerrit Code Review
385cd6681b Merge "Merge remote-tracking branch 'origin/dev.debug' into master" 2017-08-11 17:47:15 +00:00
Lynn Boger
0f19e24da7 cmd/compile: intrinsics for trunc, floor, ceil on ppc64x
This implements trunc, floor, and ceil in the math package
as intrinsics on ppc64x.  Significant improvement mainly due
to avoiding call overhead of args and return value.

BenchmarkCeil-16                    5.95          0.69          -88.40%
BenchmarkFloor-16                   5.95          0.69          -88.40%
BenchmarkTrunc-16                   5.82          0.69          -88.14%

Updates #21390

Change-Id: I951e182694f6e0c431da79c577272b81fb0ebad0
Reviewed-on: https://go-review.googlesource.com/54654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
2017-08-11 16:35:49 +00:00
Austin Clements
6f6a9398e2 Merge remote-tracking branch 'origin/dev.debug' into master
Change-Id: I85df2745af666b533f4f6f1d06f7c8e137590b5b
2017-08-11 12:17:43 -04:00
Josh Bleecher Snyder
3b87defe4e cmd/compile: unexport gc.Sysfunc
Updates #21352

Change-Id: If21342f30be32e25840b4072b932a6d4257b420d
Reviewed-on: https://go-review.googlesource.com/54091
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-08-11 00:27:35 +00:00
Josh Bleecher Snyder
2fe53d8d55 cmd/compile: remove gc.Sysfunc calls from 387 backend
gc.Sysfunc must not be called concurrently.
We set up runtime routines used by the backend
prior to doing any backend compilation.
I missed the 387 ones; fix that.

Sysfunc should have been unexported during 1.9.
I will rectify that in a subsequent CL.

Fixes #21352

Change-Id: I8386eaa1e05879c25c672b9c9fc693c938e9aeb6
Reviewed-on: https://go-review.googlesource.com/54090
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-09 00:19:58 +00:00
Heschi Kreinick
4c54a047c6 [dev.debug] cmd/compile: better DWARF with optimizations on
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.

After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.

There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:

  type myint int
  var a int
  b := myint(int)
and
  b := (*uint64)(unsafe.Pointer(a))

don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.

Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.

A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.

TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.

go build -toolexec 'toolstash -cmp' -a std succeeds.

With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after:  06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean  /tmp/220352263 /tmp/621364410
completed   15 of   15, estimated time remaining 0s (eta 3:52PM)
name        old time/op       new time/op       delta
Template          199ms ± 3%        198ms ± 2%     ~     (p=0.400 n=15+14)
Unicode          96.6ms ± 5%       96.4ms ± 5%     ~     (p=0.838 n=15+15)
GoTypes           653ms ± 2%        647ms ± 2%     ~     (p=0.102 n=15+14)
Flate             133ms ± 6%        129ms ± 3%   -2.62%  (p=0.041 n=15+15)
GoParser          164ms ± 5%        159ms ± 3%   -3.05%  (p=0.000 n=15+15)
Reflect           428ms ± 4%        422ms ± 3%     ~     (p=0.156 n=15+13)
Tar               123ms ±10%        124ms ± 8%     ~     (p=0.461 n=15+15)
XML               228ms ± 3%        224ms ± 3%   -1.57%  (p=0.045 n=15+15)
[Geo mean]        206ms             377ms       +82.86%

name        old user-time/op  new user-time/op  delta
Template          292ms ±10%        301ms ±12%     ~     (p=0.189 n=15+15)
Unicode           166ms ±37%        158ms ±14%     ~     (p=0.418 n=15+14)
GoTypes           962ms ± 6%        963ms ± 7%     ~     (p=0.976 n=15+15)
Flate             207ms ±19%        200ms ±14%     ~     (p=0.345 n=14+15)
GoParser          246ms ±22%        240ms ±15%     ~     (p=0.587 n=15+15)
Reflect           611ms ±13%        587ms ±14%     ~     (p=0.085 n=15+13)
Tar               211ms ±12%        217ms ±14%     ~     (p=0.355 n=14+15)
XML               335ms ±15%        320ms ±18%     ~     (p=0.169 n=15+15)
[Geo mean]        317ms             583ms       +83.72%

name        old alloc/op      new alloc/op      delta
Template         40.2MB ± 0%       40.2MB ± 0%   -0.15%  (p=0.000 n=14+15)
Unicode          29.2MB ± 0%       29.3MB ± 0%     ~     (p=0.624 n=15+15)
GoTypes           114MB ± 0%        114MB ± 0%   -0.15%  (p=0.000 n=15+14)
Flate            25.7MB ± 0%       25.6MB ± 0%   -0.18%  (p=0.000 n=13+15)
GoParser         32.2MB ± 0%       32.2MB ± 0%   -0.14%  (p=0.003 n=15+15)
Reflect          77.8MB ± 0%       77.9MB ± 0%     ~     (p=0.061 n=15+15)
Tar              27.1MB ± 0%       27.0MB ± 0%   -0.11%  (p=0.029 n=15+15)
XML              42.7MB ± 0%       42.5MB ± 0%   -0.29%  (p=0.000 n=15+15)
[Geo mean]       42.1MB            75.0MB       +78.05%

name        old allocs/op     new allocs/op     delta
Template           402k ± 1%         398k ± 0%   -0.91%  (p=0.000 n=15+15)
Unicode            344k ± 1%         344k ± 0%     ~     (p=0.715 n=15+14)
GoTypes           1.18M ± 0%        1.17M ± 0%   -0.91%  (p=0.000 n=15+14)
Flate              243k ± 0%         240k ± 1%   -1.05%  (p=0.000 n=13+15)
GoParser           327k ± 1%         324k ± 1%   -0.96%  (p=0.000 n=15+15)
Reflect            984k ± 1%         982k ± 0%     ~     (p=0.050 n=15+15)
Tar                261k ± 1%         259k ± 1%   -0.77%  (p=0.000 n=15+15)
XML                411k ± 0%         404k ± 1%   -1.55%  (p=0.000 n=15+15)
[Geo mean]         439k              755k       +72.01%

name        old text-bytes    new text-bytes    delta
HelloSize         694kB ± 0%        694kB ± 0%   -0.00%  (p=0.000 n=15+15)

name        old data-bytes    new data-bytes    delta
HelloSize        5.55kB ± 0%       5.55kB ± 0%     ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         133kB ± 0%        133kB ± 0%     ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.04MB ± 0%       1.04MB ± 0%     ~     (all equal)

Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-27 20:19:44 +00:00
Heschi Kreinick
2d57d94ac3 [dev.debug] cmd/compile: track variable decomposition in LocalSlot
When the compiler decomposes a user variable, track its origin so that
it can be recomposed during DWARF generation.

Change-Id: Ia71c7f8e7f4d65f0652f1c97b0dda5d9cad41936
Reviewed-on: https://go-review.googlesource.com/50878
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-26 18:39:39 +00:00
Heschi Kreinick
c1c08a13e7 [dev.debug] cmd/compile: rename some locals in genssa
When we start tracking the mapping from Value to Prog, valueProgs will
be confusing. Disambiguate.

Change-Id: Ib3b302fedb7eb0ff1bde789d70a11656d82f0897
Reviewed-on: https://go-review.googlesource.com/50876
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-25 19:33:26 +00:00
Josh Bleecher Snyder
7cd6310014 cmd/compile: don't generate liveness maps when the stack is too large
Fixes #20529

Change-Id: I3cb0c037b1737fbc3fa3b1b61ed8a42cfaf8e10d
Reviewed-on: https://go-review.googlesource.com/44344
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-30 22:39:29 +00:00
David Chase
27da3ba5af cmd/compile: don't attach lines to SB, SP, similar constants
Attaching positions to SB, SP, initial mem can result in
less-good line-numbering when compiled for debugging.
This "fix" also removes source position from a zero-valued
struct (but not from its fields) and from a zero-length
array constant.

This may be a general problem for constants in entry blocks.

Fixes #20367.

Change-Id: I7e9df3341be2e2f60f127d35bb31e43cdcfce9a1
Reviewed-on: https://go-review.googlesource.com/43531
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-15 20:39:40 +00:00
Josh Bleecher Snyder
e5bb5e397d cmd/compile: restore panic deduplication
The switch to detailed position information broke
the removal of duplicate panics on the same line.
Restore it.

Neutral compiler performance impact:

name        old alloc/op      new alloc/op      delta
Template         38.8MB ± 0%       38.8MB ± 0%    ~     (p=0.690 n=5+5)
Unicode          28.7MB ± 0%       28.7MB ± 0%  +0.13%  (p=0.032 n=5+5)
GoTypes           109MB ± 0%        109MB ± 0%    ~     (p=1.000 n=5+5)
Compiler          457MB ± 0%        457MB ± 0%    ~     (p=0.151 n=5+5)
SSA              1.09GB ± 0%       1.10GB ± 0%  +0.17%  (p=0.008 n=5+5)
Flate            24.6MB ± 0%       24.5MB ± 0%  -0.35%  (p=0.008 n=5+5)
GoParser         30.9MB ± 0%       31.0MB ± 0%    ~     (p=0.421 n=5+5)
Reflect          73.4MB ± 0%       73.4MB ± 0%    ~     (p=0.056 n=5+5)
Tar              25.6MB ± 0%       25.5MB ± 0%  -0.61%  (p=0.008 n=5+5)
XML              40.9MB ± 0%       40.9MB ± 0%    ~     (p=0.841 n=5+5)
[Geo mean]       71.6MB            71.6MB       -0.07%

name        old allocs/op     new allocs/op     delta
Template           394k ± 0%         395k ± 1%    ~     (p=0.151 n=5+5)
Unicode            343k ± 0%         344k ± 0%  +0.38%  (p=0.032 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=1.000 n=5+5)
Compiler          4.41M ± 0%        4.42M ± 0%    ~     (p=0.151 n=5+5)
SSA               9.79M ± 0%        9.79M ± 0%    ~     (p=0.690 n=5+5)
Flate              238k ± 1%         238k ± 0%    ~     (p=0.151 n=5+5)
GoParser           321k ± 0%         321k ± 1%    ~     (p=0.548 n=5+5)
Reflect            958k ± 0%         957k ± 0%    ~     (p=0.841 n=5+5)
Tar                252k ± 0%         252k ± 1%    ~     (p=0.151 n=5+5)
XML                401k ± 0%         400k ± 0%    ~     (p=1.000 n=5+5)
[Geo mean]         741k              742k       +0.08%


Reduces object files a little bit:

name        old object-bytes  new object-bytes  delta
Template           386k ± 0%         386k ± 0%  -0.04%  (p=0.008 n=5+5)
Unicode            202k ± 0%         202k ± 0%    ~     (all equal)
GoTypes           1.16M ± 0%        1.16M ± 0%  -0.04%  (p=0.008 n=5+5)
Compiler          3.91M ± 0%        3.91M ± 0%  -0.08%  (p=0.008 n=5+5)
SSA               7.91M ± 0%        7.91M ± 0%  -0.04%  (p=0.008 n=5+5)
Flate              228k ± 0%         227k ± 0%  -0.28%  (p=0.008 n=5+5)
GoParser           283k ± 0%         283k ± 0%  -0.01%  (p=0.008 n=5+5)
Reflect            952k ± 0%         951k ± 0%  -0.03%  (p=0.008 n=5+5)
Tar                188k ± 0%         187k ± 0%  -0.09%  (p=0.008 n=5+5)
XML                406k ± 0%         406k ± 0%  -0.04%  (p=0.008 n=5+5)
[Geo mean]         648k              648k       -0.06%


This was discovered in the context for the Fannkuch benchmark.
It shrinks the number of panicindex calls in that function
from 13 back to 9, their 1.8.1 level.

It shrinks the function text a bit, from 829 to 801 bytes.
It slows down execution a little, presumably due to alignment (?).

name          old time/op  new time/op  delta
Fannkuch11-8   2.68s ± 2%   2.74s ± 1%  +2.09%  (p=0.000 n=19+20)

After this CL, 1.8.1 and tip are identical:

name          old time/op  new time/op  delta
Fannkuch11-8   2.74s ± 2%   2.74s ± 1%   ~     (p=0.301 n=20+20)

Fixes #20332

Change-Id: I2aeacc3e8cf2ac1ff10f36c572a27856f4f8f7c9
Reviewed-on: https://go-review.googlesource.com/43291
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-11 19:34:11 +00:00
David Chase
00263a8968 cmd/compile: reduce debugger-worsening line number churn
Reuse block head or preceding instruction's line number for
register allocator's spill, fill, copy, rematerialization
instructionsl; and also for phi, and for no-src-pos
instructions.  Assembler creates same line number tables
for copy-predecessor-line and for no-src-pos,
but copy-predecessor produces better-looking assembly
language output with -S and with GOSSAFUNC, and does not
require changes to tests of existing assembly language.

Split "copyInto" into two cases, one for register allocation,
one for otherwise.  This caused the test score line change
count to increase by one, which may reflect legitimately
useful information preserved.  Without any special treatment
for copyInto, the change count increases by 21 more, from
51 to 72 (i.e., quite a lot).

There is a test; using two naive "scores" for line number
churn, the old numbering is 2x or 4x worse.

Fixes #18902.

Change-Id: I0a0a69659d30ee4e5d10116a0dd2b8c5df8457b1
Reviewed-on: https://go-review.googlesource.com/36207
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-10 17:16:44 +00:00
Lynn Boger
8304d10763 cmd/compile: ppc64x intrinsics for math/bits
This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on
ppc64x.

benchmark                       old ns/op     new ns/op     delta
BenchmarkLeadingZeros-16        4.26          1.71          -59.86%
BenchmarkLeadingZeros16-16      3.04          1.83          -39.80%
BenchmarkLeadingZeros32-16      3.31          1.82          -45.02%
BenchmarkLeadingZeros64-16      3.69          1.71          -53.66%
BenchmarkTrailingZeros-16       2.55          1.62          -36.47%
BenchmarkTrailingZeros32-16     2.55          1.77          -30.59%
BenchmarkTrailingZeros64-16     2.78          1.62          -41.73%
BenchmarkOnesCount-16           3.19          0.93          -70.85%
BenchmarkOnesCount32-16         2.55          1.18          -53.73%
BenchmarkOnesCount64-16         3.22          0.93          -71.12%

Update #18616

I also made a change to bits_test.go because when debugging some failures
the output was not quite providing the right argument information.

Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e
Reviewed-on: https://go-review.googlesource.com/41630
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-10 12:10:56 +00:00
Josh Bleecher Snyder
46b88c9fbc cmd/compile: change ssa.Type into *types.Type
When package ssa was created, Type was in package gc.
To avoid circular dependencies, we used an interface (ssa.Type)
to represent type information in SSA.

In the Go 1.9 cycle, gri extricated the Type type from package gc.
As a result, we can now use it in package ssa.
Now, instead of package types depending on package ssa,
it is the other way.
This is a more sensible dependency tree,
and helps compiler performance a bit.

Though this is a big CL, most of the changes are
mechanical and uninteresting.

Interesting bits:

* Add new singleton globals to package types for the special
  SSA types Memory, Void, Invalid, Flags, and Int128.
* Add two new Types, TSSA for the special types,
  and TTUPLE, for SSA tuple types.
  ssa.MakeTuple is now types.NewTuple.
* Move type comparison result constants CMPlt, CMPeq, and CMPgt
  to package types.
* We had picked the name "types" in our rules for the handy
  list of types provided by ssa.Config. That conflicted with
  the types package name, so change it to "typ".
* Update the type comparison routine to handle tuples and special
  types inline.
* Teach gc/fmt.go how to print special types.
* We can now eliminate ElemTypes in favor of just Elem,
  and probably also some other duplicated Type methods
  designed to return ssa.Type instead of *types.Type.
* The ssa tests were using their own dummy types,
  and they were not particularly careful about types in general.
  Of necessity, this CL switches them to use *types.Type;
  it does not make them more type-accurate.
  Unfortunately, using types.Type means initializing a bit
  of the types universe.
  This is prime for refactoring and improvement.

This shrinks ssa.Value; it now fits in a smaller size class
on 64 bit systems. This doesn't have a giant impact,
though, since most Values are preallocated in a chunk.

name        old alloc/op      new alloc/op      delta
Template         37.9MB ± 0%       37.7MB ± 0%  -0.57%  (p=0.000 n=10+8)
Unicode          28.9MB ± 0%       28.7MB ± 0%  -0.52%  (p=0.000 n=10+10)
GoTypes           110MB ± 0%        109MB ± 0%  -0.88%  (p=0.000 n=10+10)
Flate            24.7MB ± 0%       24.6MB ± 0%  -0.66%  (p=0.000 n=10+10)
GoParser         31.1MB ± 0%       30.9MB ± 0%  -0.61%  (p=0.000 n=10+9)
Reflect          73.9MB ± 0%       73.4MB ± 0%  -0.62%  (p=0.000 n=10+8)
Tar              25.8MB ± 0%       25.6MB ± 0%  -0.77%  (p=0.000 n=9+10)
XML              41.2MB ± 0%       40.9MB ± 0%  -0.80%  (p=0.000 n=10+10)
[Geo mean]       40.5MB            40.3MB       -0.68%

name        old allocs/op     new allocs/op     delta
Template           385k ± 0%         386k ± 0%    ~     (p=0.356 n=10+9)
Unicode            343k ± 1%         344k ± 0%    ~     (p=0.481 n=10+10)
GoTypes           1.16M ± 0%        1.16M ± 0%  -0.16%  (p=0.004 n=10+10)
Flate              238k ± 1%         238k ± 1%    ~     (p=0.853 n=10+10)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.720 n=10+9)
Reflect            957k ± 0%         957k ± 0%    ~     (p=0.460 n=10+8)
Tar                252k ± 0%         252k ± 0%    ~     (p=0.133 n=9+10)
XML                400k ± 0%         400k ± 0%    ~     (p=0.796 n=10+10)
[Geo mean]         428k              428k       -0.01%


Removing all the interface calls helps non-trivially with CPU, though.

name        old time/op       new time/op       delta
Template          178ms ± 4%        173ms ± 3%  -2.90%  (p=0.000 n=94+96)
Unicode          85.0ms ± 4%       83.9ms ± 4%  -1.23%  (p=0.000 n=96+96)
GoTypes           543ms ± 3%        528ms ± 3%  -2.73%  (p=0.000 n=98+96)
Flate             116ms ± 3%        113ms ± 4%  -2.34%  (p=0.000 n=96+99)
GoParser          144ms ± 3%        140ms ± 4%  -2.80%  (p=0.000 n=99+97)
Reflect           344ms ± 3%        334ms ± 4%  -3.02%  (p=0.000 n=100+99)
Tar               106ms ± 5%        103ms ± 4%  -3.30%  (p=0.000 n=98+94)
XML               198ms ± 5%        192ms ± 4%  -2.88%  (p=0.000 n=92+95)
[Geo mean]        178ms             173ms       -2.65%

name        old user-time/op  new user-time/op  delta
Template          229ms ± 5%        224ms ± 5%  -2.36%  (p=0.000 n=95+99)
Unicode           107ms ± 6%        106ms ± 5%  -1.13%  (p=0.001 n=93+95)
GoTypes           696ms ± 4%        679ms ± 4%  -2.45%  (p=0.000 n=97+99)
Flate             137ms ± 4%        134ms ± 5%  -2.66%  (p=0.000 n=99+96)
GoParser          176ms ± 5%        172ms ± 8%  -2.27%  (p=0.000 n=98+100)
Reflect           430ms ± 6%        411ms ± 5%  -4.46%  (p=0.000 n=100+92)
Tar               128ms ±13%        123ms ±13%  -4.21%  (p=0.000 n=100+100)
XML               239ms ± 6%        233ms ± 6%  -2.50%  (p=0.000 n=95+97)
[Geo mean]        220ms             213ms       -2.76%


Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1
Reviewed-on: https://go-review.googlesource.com/42145
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder
53e62aba2f cmd/compile: add Func.SetNilCheckDisabled
Generated hash and eq routines don't need nil checks.
Prior to this CL, this was accomplished by
temporarily incrementing the global variable disable_checknil.
However, that increment lasted only the lifetime of the
call to funccompile. After CL 41503, funccompile may
do nothing but enqueue the function for compilation,
resulting in nil checks being generated.

Fix this by adding an explicit flag to a function
indicating whether nil checks should be disabled
for that function.

While we're here, allow concurrent compilation
with the -w and -W flags, since that was needed
to investigate this issue.

Fixes #20242

Change-Id: Ib9140c22c49e9a09e62fa3cf350f5d3eff18e2bd
Reviewed-on: https://go-review.googlesource.com/42591
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-05-05 19:34:09 +00:00
Josh Bleecher Snyder
c51559813f cmd/compile: add sizeCalculationDisabled flag
Use it to ensure that dowidth is not called
from the backend on a type whose size
has not yet been calculated.

This is an alternative to CL 42016.

Change-Id: I8c7b4410ee4c2a68573102f6b9b635f4fdcf392e
Reviewed-on: https://go-review.googlesource.com/42018
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-28 01:24:52 +00:00
Josh Bleecher Snyder
dae5389d3d Revert "cmd/compile: add Type.MustSize and Type.MustAlignment"
This reverts commit 94d540a4b6bf68ec472bf4469037955e3133fcf7.

Reason for revert: prefer something along the lines of CL 42018.

Change-Id: I876fe32e98f37d8d725fe55e0fd0ea429c0198e0
Reviewed-on: https://go-review.googlesource.com/42022
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-28 01:24:13 +00:00
Josh Bleecher Snyder
fc08a19cef cmd/compile: move Used from gc.Node to gc.Name
Node.Used was written to from the backend
concurrently with reads of Node.Class
for the same ONAME Nodes.
I do not know why it was not failing consistently
under the race detector, but it is a race.

This is likely also a problem with Node.HasVal and Node.HasOpt.
They will be handled in a separate CL.

Fix Used by moving it to gc.Name and making it a separate bool.
There was one non-Name use of Used, marking OLABELs as used.
That is no longer needed, now that goto and label checking
happens early in the front end.

Leave the getters and setters in place,
to ease changing the representation in the future
(or changing to an interface!).

Updates #20144

Change-Id: I9bec7c6d33dcb129a4cfa9d338462ea33087f9f7
Reviewed-on: https://go-review.googlesource.com/42015
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-27 22:58:13 +00:00
Josh Bleecher Snyder
94d540a4b6 cmd/compile: add Type.MustSize and Type.MustAlignment
Type.Size and Type.Alignment are for the front end:
They calculate size and alignment if needed.

Type.MustSize and Type.MustAlignment are for the back end:
They call Fatal if size and alignment are not already calculated.

Most uses are of MustSize and MustAlignment,
but that's because the back end is newer,
and this API was added to support it.

This CL was mostly generated with sed and selective reversion.
The only mildly interesting bit is the change of the ssa.Type interface
and the supporting ssa dummy types.

Follow-up to review feedback on CL 41970.

Passes toolstash-check.

Change-Id: I0d9b9505e57453dae8fb6a236a07a7a02abd459e
Reviewed-on: https://go-review.googlesource.com/42016
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-27 22:57:57 +00:00
Josh Bleecher Snyder
0b6a10ef24 cmd/compile: dowidth more in the front end
dowidth is fundamentally unsafe to call from the back end;
it will cause data races.

Replace all calls to dowidth in the backend with
assertions that the width has been calculated.

Then fix all the cases in which that was not so,
including the cases from #20145.

Fixes #20145.

Change-Id: Idba3d19d75638851a30ec2ebcdb703c19da3e92b
Reviewed-on: https://go-review.googlesource.com/41970
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-27 22:10:32 +00:00
Josh Bleecher Snyder
756b9ce3a5 cmd/compile: add initial backend concurrency support
This CL adds initial support for concurrent backend compilation.

BACKGROUND

The compiler currently consists (very roughly) of the following phases:

1. Initialization.
2. Lexing and parsing into the cmd/compile/internal/syntax AST.
3. Translation into the cmd/compile/internal/gc AST.
4. Some gc AST passes: typechecking, escape analysis, inlining,
   closure handling, expression evaluation ordering (order.go),
   and some lowering and optimization (walk.go).
5. Translation into the cmd/compile/internal/ssa SSA form.
6. Optimization and lowering of SSA form.
7. Translation from SSA form to assembler instructions.
8. Translation from assembler instructions to machine code.
9. Writing lots of output: machine code, DWARF symbols,
   type and reflection info, export data.

Phase 2 was already concurrent as of Go 1.8.

Phase 3 is planned for eventual removal;
we hope to go straight from syntax AST to SSA.

Phases 5–8 are per-function; this CL adds support for
processing multiple functions concurrently.
The slowest phases in the compiler are 5 and 6,
so this offers the opportunity for some good speed-ups.

Unfortunately, it's not quite that straightforward.
In the current compiler, the latter parts of phase 4
(order, walk) are done function-at-a-time as needed.
Making order and walk concurrency-safe proved hard,
and they're not particularly slow, so there wasn't much reward.
To enable phases 5–8 to be done concurrently,
when concurrent backend compilation is requested,
we complete phase 4 for all functions
before starting later phases for any functions.

Also, in reality, we automatically generate new
functions in phase 9, such as method wrappers
and equality and has routines.
Those new functions then go through phases 4–8.
This CL disables concurrent backend compilation
after the first, big, user-provided batch of
functions has been compiled.
This is done to keep things simple,
and because the autogenerated functions
tend to be small, few, simple, and fast to compile.

USAGE

Concurrent backend compilation still defaults to off.
To set the number of functions that may be backend-compiled
concurrently, use the compiler flag -c.
In future work, cmd/go will automatically set -c.

Furthermore, this CL has been intentionally written
so that the c=1 path has no backend concurrency whatsoever,
not even spawning any goroutines.
This helps ensure that, should problems arise
late in the development cycle,
we can simply have cmd/go set c=1 always,
and revert to the original compiler behavior.

MUTEXES

Most of the work required to make concurrent backend
compilation safe has occurred over the past month.
This CL adds a handful of mutexes to get the rest of the way there;
they are the mutexes that I didn't see a clean way to avoid.
Some of them may still be eliminable in future work.

In no particular order:

* gc.funcsymsmu. The global funcsyms slice is populated
  lazily when we need function symbols for closures.
  This occurs during gc AST to SSA translation.
  The function funcsym also does a package lookup,
  which is a source of races on types.Pkg.Syms;
  funcsymsmu also covers that package lookup.
  This mutex is low priority: it adds a single global,
  it is in an infrequently used code path, and it is low contention.
  Since funcsyms may now be added in any order,
  we must sort them to preserve reproducible builds.

* gc.largeStackFramesMu. We don't discover until after SSA compilation
  that a function's stack frame is gigantic.
  Recording that error happens basically never,
  but it does happen concurrently.
  Fix with a low priority mutex and sorting.

* obj.Link.hashmu. ctxt.hash stores the mapping from
  types.Syms (compiler symbols) to obj.LSyms (linker symbols).
  It is accessed fairly heavily through all the phases.
  This is the only heavily contended mutex.

* gc.signatlistmu. The global signatlist map is
  populated with types through several of the concurrent phases,
  including notably via ngotype during DWARF generation.
  It is low priority for removal.

* gc.typepkgmu. Looking up symbols in the types package
  happens a fair amount during backend compilation
  and DWARF generation, particularly via ngotype.
  This mutex helps us to avoid a broader mutex on types.Pkg.Syms.
  It has low-to-moderate contention.

* types.internedStringsmu. gc AST to SSA conversion and
  some SSA work introduce new autotmps.
  Those autotmps have their names interned to reduce allocations.
  That interning requires protecting types.internedStrings.
  The autotmp names are heavily re-used, and the mutex
  overhead and contention here are low, so it is probably
  a worthwhile performance optimization to keep this mutex.

TESTING

I have been testing this code locally by running
'go install -race cmd/compile'
and then doing
'go build -a -gcflags=-c=128 std cmd'
for all architectures and a variety of compiler flags.
This obviously needs to be made part of the builders,
but it is too expensive to make part of all.bash.
I have filed #19962 for this.

REPRODUCIBLE BUILDS

This version of the compiler generates reproducible builds.
Testing reproducible builds also needs automation, however,
and is also too expensive for all.bash.
This is #19961.

Also of note is that some of the compiler flags used by 'toolstash -cmp'
are currently incompatible with concurrent backend compilation.
They still work fine with c=1.
Time will tell whether this is a problem.

NEXT STEPS

* Continue to find and fix races and bugs,
  using a combination of code inspection, fuzzing,
  and hopefully some community experimentation.
  I do not know of any outstanding races,
  but there probably are some.
* Improve testing.
* Improve performance, for many values of c.
* Integrate with cmd/go and fine tune.
* Support concurrent compilation with the -race flag.
  It is a sad irony that it does not yet work.
* Minor code cleanup that has been deferred during
  the last month due to uncertainty about the
  ultimate shape of this CL.

PERFORMANCE

Here's the buried lede, at last. :)

All benchmarks are from my 8 core 2.9 GHz Intel Core i7 darwin/amd64 laptop.

First, going from tip to this CL with c=1 has almost no impact.

name        old time/op       new time/op       delta
Template          195ms ± 3%        194ms ± 5%    ~     (p=0.370 n=30+29)
Unicode          86.6ms ± 3%       87.0ms ± 7%    ~     (p=0.958 n=29+30)
GoTypes           548ms ± 3%        555ms ± 4%  +1.35%  (p=0.001 n=30+28)
Compiler          2.51s ± 2%        2.54s ± 2%  +1.17%  (p=0.000 n=28+30)
SSA               5.16s ± 3%        5.16s ± 2%    ~     (p=0.910 n=30+29)
Flate             124ms ± 5%        124ms ± 4%    ~     (p=0.947 n=30+30)
GoParser          146ms ± 3%        146ms ± 3%    ~     (p=0.150 n=29+28)
Reflect           354ms ± 3%        352ms ± 4%    ~     (p=0.096 n=29+29)
Tar               107ms ± 5%        106ms ± 3%    ~     (p=0.370 n=30+29)
XML               200ms ± 4%        201ms ± 4%    ~     (p=0.313 n=29+28)
[Geo mean]        332ms             333ms       +0.10%

name        old user-time/op  new user-time/op  delta
Template          227ms ± 5%        225ms ± 5%    ~     (p=0.457 n=28+27)
Unicode           109ms ± 4%        109ms ± 5%    ~     (p=0.758 n=29+29)
GoTypes           713ms ± 4%        721ms ± 5%    ~     (p=0.051 n=30+29)
Compiler          3.36s ± 2%        3.38s ± 3%    ~     (p=0.146 n=30+30)
SSA               7.46s ± 3%        7.47s ± 3%    ~     (p=0.804 n=30+29)
Flate             146ms ± 7%        147ms ± 3%    ~     (p=0.833 n=29+27)
GoParser          179ms ± 5%        179ms ± 5%    ~     (p=0.866 n=30+30)
Reflect           431ms ± 4%        429ms ± 4%    ~     (p=0.593 n=29+30)
Tar               124ms ± 5%        123ms ± 5%    ~     (p=0.140 n=29+29)
XML               243ms ± 4%        242ms ± 7%    ~     (p=0.404 n=29+29)
[Geo mean]        415ms             415ms       +0.02%

name        old obj-bytes     new obj-bytes     delta
Template           382k ± 0%         382k ± 0%    ~     (all equal)
Unicode            203k ± 0%         203k ± 0%    ~     (all equal)
GoTypes           1.18M ± 0%        1.18M ± 0%    ~     (all equal)
Compiler          3.98M ± 0%        3.98M ± 0%    ~     (all equal)
SSA               8.28M ± 0%        8.28M ± 0%    ~     (all equal)
Flate              230k ± 0%         230k ± 0%    ~     (all equal)
GoParser           287k ± 0%         287k ± 0%    ~     (all equal)
Reflect           1.00M ± 0%        1.00M ± 0%    ~     (all equal)
Tar                190k ± 0%         190k ± 0%    ~     (all equal)
XML                416k ± 0%         416k ± 0%    ~     (all equal)
[Geo mean]         660k              660k       +0.00%

Comparing this CL to itself, from c=1 to c=2
improves real times 20-30%, costs 5-10% more CPU time,
and adds about 2% alloc.
The allocation increase comes from allocating more ssa.Caches.

name       old time/op       new time/op       delta
Template         202ms ± 3%        149ms ± 3%  -26.15%  (p=0.000 n=49+49)
Unicode         87.4ms ± 4%       84.2ms ± 3%   -3.68%  (p=0.000 n=48+48)
GoTypes          560ms ± 2%        398ms ± 2%  -28.96%  (p=0.000 n=49+49)
Compiler         2.46s ± 3%        1.76s ± 2%  -28.61%  (p=0.000 n=48+46)
SSA              6.17s ± 2%        4.04s ± 1%  -34.52%  (p=0.000 n=49+49)
Flate            126ms ± 3%         92ms ± 2%  -26.81%  (p=0.000 n=49+48)
GoParser         148ms ± 4%        107ms ± 2%  -27.78%  (p=0.000 n=49+48)
Reflect          361ms ± 3%        281ms ± 3%  -22.10%  (p=0.000 n=49+49)
Tar              109ms ± 4%         86ms ± 3%  -20.81%  (p=0.000 n=49+47)
XML              204ms ± 3%        144ms ± 2%  -29.53%  (p=0.000 n=48+45)

name       old user-time/op  new user-time/op  delta
Template         246ms ± 9%        246ms ± 4%     ~     (p=0.401 n=50+48)
Unicode          109ms ± 4%        111ms ± 4%   +1.47%  (p=0.000 n=44+50)
GoTypes          728ms ± 3%        765ms ± 3%   +5.04%  (p=0.000 n=46+50)
Compiler         3.33s ± 3%        3.41s ± 2%   +2.31%  (p=0.000 n=49+48)
SSA              8.52s ± 2%        9.11s ± 2%   +6.93%  (p=0.000 n=49+47)
Flate            149ms ± 4%        161ms ± 3%   +8.13%  (p=0.000 n=50+47)
GoParser         181ms ± 5%        192ms ± 2%   +6.40%  (p=0.000 n=49+46)
Reflect          452ms ± 9%        474ms ± 2%   +4.99%  (p=0.000 n=50+48)
Tar              126ms ± 6%        136ms ± 4%   +7.95%  (p=0.000 n=50+49)
XML              247ms ± 5%        264ms ± 3%   +6.94%  (p=0.000 n=48+50)

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       39.3MB ± 0%   +1.48%  (p=0.008 n=5+5)
Unicode         29.8MB ± 0%       30.2MB ± 0%   +1.19%  (p=0.008 n=5+5)
GoTypes          113MB ± 0%        114MB ± 0%   +0.69%  (p=0.008 n=5+5)
Compiler         443MB ± 0%        447MB ± 0%   +0.95%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.26GB ± 0%   +0.89%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.9MB ± 1%   +2.35%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       32.2MB ± 0%   +1.59%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       78.9MB ± 0%   +0.91%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       27.0MB ± 0%   +1.80%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       43.4MB ± 0%   +2.35%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          379k ± 0%         378k ± 0%     ~     (p=0.421 n=5+5)
Unicode           322k ± 0%         321k ± 0%     ~     (p=0.222 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.548 n=5+5)
Compiler         4.12M ± 0%        4.11M ± 0%   -0.14%  (p=0.032 n=5+5)
SSA              9.72M ± 0%        9.72M ± 0%     ~     (p=0.421 n=5+5)
Flate             234k ± 1%         234k ± 0%     ~     (p=0.421 n=5+5)
GoParser          316k ± 1%         315k ± 0%     ~     (p=0.222 n=5+5)
Reflect           980k ± 0%         979k ± 0%     ~     (p=0.095 n=5+5)
Tar               249k ± 1%         249k ± 1%     ~     (p=0.841 n=5+5)
XML               392k ± 0%         391k ± 0%     ~     (p=0.095 n=5+5)

From c=1 to c=4, real time is down ~40%, CPU usage up 10-20%, alloc up ~5%:

name       old time/op       new time/op       delta
Template         203ms ± 3%        131ms ± 5%  -35.45%  (p=0.000 n=50+50)
Unicode         87.2ms ± 4%       84.1ms ± 2%   -3.61%  (p=0.000 n=48+47)
GoTypes          560ms ± 4%        310ms ± 2%  -44.65%  (p=0.000 n=50+49)
Compiler         2.47s ± 3%        1.41s ± 2%  -43.10%  (p=0.000 n=50+46)
SSA              6.17s ± 2%        3.20s ± 2%  -48.06%  (p=0.000 n=49+49)
Flate            126ms ± 4%         74ms ± 2%  -41.06%  (p=0.000 n=49+48)
GoParser         148ms ± 4%         89ms ± 3%  -39.97%  (p=0.000 n=49+50)
Reflect          360ms ± 3%        242ms ± 3%  -32.81%  (p=0.000 n=49+49)
Tar              108ms ± 4%         73ms ± 4%  -32.48%  (p=0.000 n=50+49)
XML              203ms ± 3%        119ms ± 3%  -41.56%  (p=0.000 n=49+48)

name       old user-time/op  new user-time/op  delta
Template         246ms ± 9%        287ms ± 9%  +16.98%  (p=0.000 n=50+50)
Unicode          109ms ± 4%        118ms ± 5%   +7.56%  (p=0.000 n=46+50)
GoTypes          735ms ± 4%        806ms ± 2%   +9.62%  (p=0.000 n=50+50)
Compiler         3.34s ± 4%        3.56s ± 2%   +6.78%  (p=0.000 n=49+49)
SSA              8.54s ± 3%       10.04s ± 3%  +17.55%  (p=0.000 n=50+50)
Flate            149ms ± 6%        176ms ± 3%  +17.82%  (p=0.000 n=50+48)
GoParser         181ms ± 5%        213ms ± 3%  +17.47%  (p=0.000 n=50+50)
Reflect          453ms ± 6%        499ms ± 2%  +10.11%  (p=0.000 n=50+48)
Tar              126ms ± 5%        149ms ±11%  +18.76%  (p=0.000 n=50+50)
XML              246ms ± 5%        287ms ± 4%  +16.53%  (p=0.000 n=49+50)

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       40.4MB ± 0%   +4.21%  (p=0.008 n=5+5)
Unicode         29.8MB ± 0%       30.9MB ± 0%   +3.68%  (p=0.008 n=5+5)
GoTypes          113MB ± 0%        116MB ± 0%   +2.71%  (p=0.008 n=5+5)
Compiler         443MB ± 0%        455MB ± 0%   +2.75%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.27GB ± 0%   +1.84%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       26.9MB ± 1%   +6.31%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       33.2MB ± 0%   +4.61%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       80.2MB ± 0%   +2.53%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       27.9MB ± 0%   +5.19%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       44.6MB ± 0%   +5.20%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          380k ± 0%         379k ± 0%   -0.39%  (p=0.032 n=5+5)
Unicode           321k ± 0%         321k ± 0%     ~     (p=0.841 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.421 n=5+5)
Compiler         4.12M ± 0%        4.14M ± 0%   +0.52%  (p=0.008 n=5+5)
SSA              9.72M ± 0%        9.76M ± 0%   +0.37%  (p=0.008 n=5+5)
Flate             234k ± 1%         234k ± 1%     ~     (p=0.690 n=5+5)
GoParser          316k ± 0%         317k ± 1%     ~     (p=0.841 n=5+5)
Reflect           981k ± 0%         981k ± 0%     ~     (p=1.000 n=5+5)
Tar               250k ± 0%         249k ± 1%     ~     (p=0.151 n=5+5)
XML               393k ± 0%         392k ± 0%     ~     (p=0.056 n=5+5)

Going beyond c=4 on my machine tends to increase CPU time and allocs
without impacting real time.

The CPU time numbers matter, because when there are many concurrent
compilation processes, that will impact the overall throughput.

The numbers above are in many ways the best case scenario;
we can take full advantage of all cores.
Fortunately, the most common compilation scenario is incremental
re-compilation of a single package during a build/test cycle.

Updates #15756

Change-Id: I6725558ca2069edec0ac5b0d1683105a9fff6bea
Reviewed-on: https://go-review.googlesource.com/40693
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-27 00:59:07 +00:00
Josh Bleecher Snyder
386765afdf cmd/compile: move Node.Class to flags
Put it at position zero, since it is fairly hot.

This shrinks gc.Node into a smaller size class on 64 bit systems.

name        old time/op       new time/op       delta
Template          193ms ± 5%        192ms ± 3%    ~     (p=0.353 n=94+93)
Unicode          86.1ms ± 5%       85.0ms ± 4%  -1.23%  (p=0.000 n=95+98)
GoTypes           546ms ± 3%        544ms ± 4%  -0.40%  (p=0.007 n=94+97)
Compiler          2.56s ± 3%        2.54s ± 3%  -0.67%  (p=0.000 n=99+97)
SSA               5.13s ± 2%        5.10s ± 3%  -0.55%  (p=0.000 n=94+98)
Flate             122ms ± 6%        121ms ± 4%  -0.75%  (p=0.002 n=97+95)
GoParser          144ms ± 5%        144ms ± 4%    ~     (p=0.298 n=98+97)
Reflect           348ms ± 4%        349ms ± 4%    ~     (p=0.350 n=98+97)
Tar               105ms ± 5%        104ms ± 5%    ~     (p=0.154 n=96+98)
XML               200ms ± 5%        198ms ± 4%  -0.71%  (p=0.015 n=97+98)
[Geo mean]        330ms             328ms       -0.52%

name        old user-time/op  new user-time/op  delta
Template          229ms ±11%        224ms ± 7%  -2.16%  (p=0.001 n=100+87)
Unicode           109ms ± 5%        109ms ± 6%    ~     (p=0.897 n=96+91)
GoTypes           712ms ± 4%        709ms ± 4%    ~     (p=0.085 n=96+98)
Compiler          3.41s ± 3%        3.36s ± 3%  -1.43%  (p=0.000 n=98+98)
SSA               7.46s ± 3%        7.31s ± 3%  -2.02%  (p=0.000 n=100+99)
Flate             145ms ± 6%        143ms ± 6%  -1.11%  (p=0.001 n=99+97)
GoParser          177ms ± 5%        176ms ± 5%  -0.78%  (p=0.018 n=95+95)
Reflect           432ms ± 7%        435ms ± 9%    ~     (p=0.296 n=100+100)
Tar               121ms ± 7%        121ms ± 5%    ~     (p=0.072 n=100+95)
XML               241ms ± 4%        239ms ± 5%    ~     (p=0.085 n=97+99)
[Geo mean]        413ms             410ms       -0.73%

name        old alloc/op      new alloc/op      delta
Template         38.4MB ± 0%       37.7MB ± 0%  -1.85%  (p=0.008 n=5+5)
Unicode          30.1MB ± 0%       28.8MB ± 0%  -4.09%  (p=0.008 n=5+5)
GoTypes           112MB ± 0%        110MB ± 0%  -1.69%  (p=0.008 n=5+5)
Compiler          470MB ± 0%        461MB ± 0%  -1.91%  (p=0.008 n=5+5)
SSA              1.13GB ± 0%       1.11GB ± 0%  -1.70%  (p=0.008 n=5+5)
Flate            25.0MB ± 0%       24.6MB ± 0%  -1.67%  (p=0.008 n=5+5)
GoParser         31.6MB ± 0%       31.1MB ± 0%  -1.66%  (p=0.008 n=5+5)
Reflect          77.1MB ± 0%       75.8MB ± 0%  -1.69%  (p=0.008 n=5+5)
Tar              26.3MB ± 0%       25.7MB ± 0%  -2.06%  (p=0.008 n=5+5)
XML              41.9MB ± 0%       41.1MB ± 0%  -1.93%  (p=0.008 n=5+5)
[Geo mean]       73.5MB            72.0MB       -2.03%

name        old allocs/op     new allocs/op     delta
Template           383k ± 0%         383k ± 0%    ~     (p=0.690 n=5+5)
Unicode            343k ± 0%         343k ± 0%    ~     (p=0.841 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=0.310 n=5+5)
Compiler          4.43M ± 0%        4.42M ± 0%  -0.17%  (p=0.008 n=5+5)
SSA               9.85M ± 0%        9.85M ± 0%    ~     (p=0.310 n=5+5)
Flate              236k ± 0%         236k ± 1%    ~     (p=0.841 n=5+5)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.421 n=5+5)
Reflect            988k ± 0%         987k ± 0%    ~     (p=0.690 n=5+5)
Tar                252k ± 0%         251k ± 0%    ~     (p=0.095 n=5+5)
XML                399k ± 0%         399k ± 0%    ~     (p=1.000 n=5+5)
[Geo mean]         741k              740k       -0.07%

Change-Id: I9e952b58a98e30a12494304db9ce50d0a85e459c
Reviewed-on: https://go-review.googlesource.com/41797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
2017-04-26 16:58:33 +00:00
Josh Bleecher Snyder
7f0757b394 cmd/compile: make node.Likely a flag
node.Likely may once have held -1/0/+1,
but it is now only 0/1.

With improved SSA heuristics,
it may someday go away entirely.

Change-Id: I6451d17fd7fb47e67fea4d39df302b6db00ea57b
Reviewed-on: https://go-review.googlesource.com/41760
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 00:02:24 +00:00
Josh Bleecher Snyder
eba396f596 cmd/compile: add and use gc.Node.funcname
Change-Id: If5631eae7e2ad2bef56e79b82f77105246e68773
Reviewed-on: https://go-review.googlesource.com/41494
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-23 19:00:28 +00:00
Matthew Dempsky
fe885fbdb0 cmd/compile: cleanup after IntSize->PtrSize conversion
Also, replace "PtrSize == 4 && Arch != amd64p32" with "RegSize == 4".

Passes toolstash-check -all.

Updates #19954.

Change-Id: I79b2ee9324f4fa53e34c9271d837ea288b5d7829
Reviewed-on: https://go-review.googlesource.com/41491
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-23 02:07:26 +00:00
Matthew Dempsky
c87520c598 cmd: remove IntSize and Widthint
Use PtrSize and Widthptr instead. CL prepared mostly with sed and
uniq.

Passes toolstash-check -all.

Fixes #19954.

Change-Id: I09371bd7128672885cb8bc4e7f534ad56a88d755
Reviewed-on: https://go-review.googlesource.com/40506
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-22 17:43:43 +00:00
Josh Bleecher Snyder
30940e2cc2 cmd/compile: move Linksym, linksymname, and isblanksym to types package
Response to code review feedback on CL 40693.

This CL was prepared by:

(1) manually adding new implementations and the Ctxt var to package types

(2) running eg with template:

func before(s *types.Sym) *obj.LSym { return gc.Linksym(s) }
func after(s *types.Sym) *obj.LSym  { return s.Linksym() }

(3) running gofmt -r:

gofmt -r 'isblanksym(a) -> a.IsBlank()'

(4) manually removing old implementations from package gc

Passes toolstash-check.

Change-Id: I39c35def7cae5bcbcc7c77253e5d2b066b981dea
Reviewed-on: https://go-review.googlesource.com/41302
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-04-21 16:10:29 +00:00
Keith Randall
38dee12dea cmd/compile: zero ambiguously live variables at VARKILLs
At VARKILLs, zero a variable if it is ambiguously live.
After the VARKILL anything this variable references
might be collected. If it were to become live again later,
the GC will see references to already-collected objects.

We don't know a variable is ambiguously live until very
late in compilation (after lowering, register allocation, ...),
so it is hard to generate the code in an arch-independent way.
We also have to be careful not to clobber any registers.
Fortunately, this almost never happens so performance is ~irrelevant.

There are only 2 instances where this triggers in the stdlib.

Fixes #20029

Change-Id: Ia9585a91d7b823fad4a9d141d954464cc7af31f4
Reviewed-on: https://go-review.googlesource.com/41076
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-04-20 23:47:43 +00:00
Matthew Dempsky
263ba3ac7b cmd/compile/internal/gc: make defframe arch-independent
The arch backends no longer depend on gc.Node.

Passes toolstash-check -all.

Change-Id: Ic7e49ae0a3ed155a2761c25e17cc341b46333fb4
Reviewed-on: https://go-review.googlesource.com/41196
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-20 18:34:14 +00:00
Josh Bleecher Snyder
01b1a34aac cmd/compile: rework handling of udiv on ARM
Instead of populating the aux symbol
of CALLudiv during rewrite rules,
populate it during genssa.

This simplifies the rewrite rules.
It also removes all remaining calls
to ctxt.Lookup from any rewrite rules.
This is a first step towards removing
ctxt from ssa.Cache entirely,
and also a first step towards converting
the obj.LSym.Version field into a boolean.
It should also speed up compilation.

Also, move func udiv into package runtime.
That's where it is anyway,
and it lets udiv look and act like the rest of
the runtime support functions.

Change-Id: I41462a632c14fdc41f61b08049ec13cd80a87bfe
Reviewed-on: https://go-review.googlesource.com/41191
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-20 16:27:38 +00:00
Matthew Dempsky
1e3570ac86 cmd/internal/objabi: extract shared functionality from obj
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.

There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.

objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.

Fixes #15165.
Fixes #20026.

Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-19 00:00:09 +00:00
Josh Bleecher Snyder
245ef3a157 cmd/compile: look up more runtime symbols before SSA begins
This avoids concurrent runtime package lookups.

Updates #15756

Change-Id: I9e2cbd042aba44923f0d03e6ca5b4eb60fa9e7ea
Reviewed-on: https://go-review.googlesource.com/40853
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-18 02:14:12 +00:00
Josh Bleecher Snyder
da15fe6870 cmd/internal/obj: rework gclocals handling
The compiler handled gcargs and gclocals LSyms unusually.
It generated placeholder symbols (makefuncdatasym),
filled them in, and then renamed them for content-addressability.
This is an important binary size optimization;
the same locals information occurs over and over.

This CL continues to treat these LSyms unusually,
but in a slightly more explicit way,
and importantly for concurrent compilation,
in a way that does not require concurrent
modification of Ctxt.Hash.

Instead of creating gcargs and gclocals in the usual way,
by creating a types.Sym and then an obj.LSym,
we add them directly to obj.FuncInfo,
initialize them in obj.InitTextSym,
and deduplicate and add them to ctxt.Data at the end.
Then the backend's job is simply to fill them in
and rename them appropriately.

Updates #15756

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       38.7MB ± 0%  -0.22%  (p=0.016 n=5+5)
Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.690 n=5+5)
GoTypes          113MB ± 0%        113MB ± 0%  -0.24%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.24GB ± 0%  -0.39%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.2MB ± 0%  -0.43%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       31.7MB ± 0%  -0.22%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       77.6MB ± 0%  -0.80%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       26.3MB ± 0%  -0.85%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       41.9MB ± 0%  -1.04%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          378k ± 0%         377k ± 1%    ~     (p=0.151 n=5+5)
Unicode           321k ± 1%         321k ± 0%    ~     (p=0.841 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%  -0.47%  (p=0.016 n=5+5)
SSA              9.71M ± 0%        9.67M ± 0%  -0.33%  (p=0.008 n=5+5)
Flate             233k ± 1%         232k ± 1%    ~     (p=0.151 n=5+5)
GoParser          316k ± 0%         315k ± 0%  -0.49%  (p=0.016 n=5+5)
Reflect           979k ± 0%         972k ± 0%  -0.75%  (p=0.008 n=5+5)
Tar               250k ± 0%         247k ± 1%  -0.92%  (p=0.008 n=5+5)
XML               392k ± 1%         389k ± 0%  -0.67%  (p=0.008 n=5+5)

Change-Id: Idc36186ca9d2f8214b5f7720bbc27b6bb22fdc48
Reviewed-on: https://go-review.googlesource.com/40697
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-18 02:13:32 +00:00
Matthew Dempsky
700574e791 cmd/compile/internal/ssa: ExternSymbol's Typ field is unused too
Change-Id: I5b692eb0586c40f3735a6b9c928e97ffa00a70e6
Reviewed-on: https://go-review.googlesource.com/40471
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-12 18:06:39 +00:00