107 Commits

Author SHA1 Message Date
Russ Cox
95ed5c3800 internal/buildcfg: move build configuration out of cmd/internal/objabi
The go/build package needs access to this configuration,
so move it into a new package available to the standard library.

Change-Id: I868a94148b52350c76116451f4ad9191246adcff
Reviewed-on: https://go-review.googlesource.com/c/go/+/310731
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
2021-04-16 19:20:53 +00:00
Austin Clements
0c93b16d01 cmd: move experiment flags into objabi.Experiment
This moves all remaining GOEXPERIMENT flags into the objabi.Experiment
struct, drops the "_enabled" from their name, and makes them all bool
typed.

We also drop DebugFlags.Fieldtrack because the previous CL shifted the
one test that used it to use GOEXPERIMENT instead.

Change-Id: I3406fe62b1c300bb4caeaffa6ca5ce56a70497fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/302389
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2021-03-18 21:27:23 +00:00
David Chase
f7dad5eae4 [dev.regabi] cmd/compile: remove leftover code form late call lowering work
It's no longer conditional.

Change-Id: I697bb0e9ffe9644ec4d2766f7e8be8b82d3b0638
Reviewed-on: https://go-review.googlesource.com/c/go/+/286013
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2021-01-26 18:35:19 +00:00
Michele Di Pede
94b3fd06cb cmd/compile: code cleanup
Change-Id: Ibf68e663f29a5cb3b64a7d923c005c16da647769
Reviewed-on: https://go-review.googlesource.com/c/go/+/266537
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-30 19:48:57 +00:00
David Chase
15f01d6ae9 cmd/compile: delay expansion of OpArg until expand_calls
As it says, delay expanpsion of OpArg to the expand_calls phase,
to enable (eventually) interprocedural SSA optimizations, and
(sooner) change to a register ABI.

Includes a round of cleanup to function names and comments,
largely to match the expanded scope of the functions.

This CL removes the per-function dependence on GOSSAHASH,
but the go116lateCallExpansion kill switch remains (and was
tested locally to ensure it worked).

Two functions in expand_calls.go that performed overlapping
things were combined into a single function that is called
twice.

Fixes #42236.
For #40724.

Change-Id: Icbb78947eaa39f17f2c1210d5c2caef20abd6571
Reviewed-on: https://go-review.googlesource.com/c/go/+/262117
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-29 03:23:51 +00:00
David Chase
c3fe874f25 cmd/compile: avoid generating CSEs; do all aggregates; maintain debug names
This adds a pass to detect common selection operations,
to avoid generating duplicates.  Duplicate offsets are
also detected.

All aggregate types are now handled; there is some freedom in where
expand_calls is run, though it must run before softfloat.

Debug-name-maintenance is now incremental both in decompose builtin
and in expand_calls; it might be good to push this into all the
decompose passes.

(this is a smash of 5 CLs that rewrote some of the same code several
times to deal with phase-ordering problems, and included an abandoned
attempt.)

For #40724.

Change-Id: I2a0c32f20660bf8b99e2bcecd33545d97d2bd3c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/249458
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-23 18:02:16 +00:00
David Chase
f4cbf3477f cmd/compile: allow directory specification for GOSSAFUNC output
This was useful for debugging failures occurring during make.bash.
The added flush also ensures that any hints in the GOSSAFUNC output
are flushed before fatal exit.

The environment variable GOSSADIR specifies where the SSA html debugging
files should be placed.  To avoid collisions, each one is written into
the [package].[functionOrMethod].html, where [package] is the filepath
separator separated package name, function is the function name, and method
is either (*Type).Method, or Type.Method, as appropriate.  Directories
are created as necessary to make this work.

Change-Id: I420927426b618b633bb1ffc51cf0f223b8f6d49c
Reviewed-on: https://go-review.googlesource.com/c/go/+/252338
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-01 20:52:34 +00:00
David Chase
ad8447bed9 cmd/compile: fix late call expansion for SSA-able aggregate results and arguments
This change incorporates the decision that it should be possible to
run call expansion relatively late in the optimization chain, so that
(1) calls themselves can be exposed to useful optimizations
(2) the effect of selectors on aggregates is seen at the rewrite,
    so that assignment of parts into registers is less complicated
    (at least I hope it works that way).

That means that selectors feeding into SelectN need to be processed,
and Make* feeding into call parameters need to be processed.

This does however require that call expansion run before decompose
builtins.

This doesn't yet handle rewrites of strings, slices, interfaces,
and complex numbers.

Passes run.bash and race.bash

Change-Id: I71ff23d3c491043beb30e926949970c4f63ef1a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/245133
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-01 16:43:20 +00:00
David Chase
44c0586931 cmd/compile: add code to expand calls just before late opt
Still needs to generate the calls that will need lowering.

Change-Id: Ifd4e510193441a5e27c462c1f1d704f07bf6dec3
Reviewed-on: https://go-review.googlesource.com/c/go/+/242359
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-18 16:14:40 +00:00
surechen
17553c6e71 cmd/compile: move dumpFileSeq
I noticed that there is a Todo comment here. This variable is only used for filename when dump a function's ssa passes result in details. It is no problem to print a function alone, but may be edited by not only one goroutine if dump multiple functions at the same time. Although it looks only dump one function's ssa passes now. As far as I am concerned this variable can be a member variable of the struct Func. I'm not sure if this change is necessary. Looking forward to your advices, thank you very much.

Change-Id: I35dd7247889e0cc7f19c0b400b597206592dee75
Reviewed-on: https://go-review.googlesource.com/c/go/+/244918
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2020-08-17 21:04:19 +00:00
Michael Munday
ac743dea8e cmd/compile: always tighten and de-duplicate tuple selectors
The scheduler assumes two special invariants that apply to tuple
selectors (Select0 and Select1 ops):

  1. There is only one tuple selector of each type per generator.
  2. Tuple selectors and generators reside in the same block.

Prior to this CL the assumption was that these invariants would
only be broken by the CSE pass. The CSE pass therefore contained
code to move and de-duplicate selectors to fix these invariants.

However it is also possible to write relatively basic optimization
rules that cause these invariants to be broken. For example:

  (A (Select0 (B))) -> (Select1 (B))

This rule could result in the newly added selector (Select1) being
in a different block to the tuple generator (see issue #38356). It
could also result in duplicate selectors if this rule matches
multiple times for the same tuple generator (see issue #39472).

The CSE pass will 'fix' these invariants. However it will only do
so when optimizations are enabled (since disabling optimizations
disables the CSE pass).

This CL moves the CSE tuple selector fixup code into its own pass
and makes it mandatory even when optimizations are disabled. This
allows tuple selectors to be treated like normal ops for most of
the compilation pipeline until after the new pass has run, at which
point we need to be careful to maintain the invariant again.

Fixes #39472.

Change-Id: Ia3f79e09d9c65ac95f897ce37e967ee1258a080b
Reviewed-on: https://go-review.googlesource.com/c/go/+/237118
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-06-10 14:55:29 +00:00
Bradford Lamson-Scribner
763bd58b19 cmd/compile: restore missing columns in ssa.html
If the final pass(es) are identical during ssa.html generation,
they are persisted in-memory as "pendingPhases" but never get
written as a column in the html. This change flushes those
in-memory phases.

Fixes #38242

Change-Id: Id13477dcbe7b419a818bb457861b2422ba5ef4bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/227182
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-05 23:23:03 +00:00
Bradford Lamson-Scribner
6736b2fdb2 cmd/compile: refactor around HTMLWriter removing logger in favor of Func
Replace HTMLWriter's Logger field with a *Func. Implement Fatalf method
for HTMLWriter which gets the Frontend() from the Func and calls down
into it's Fatalf method, passing the msg and args along. Replace
remaining calls to the old Logger with calls to logging methods on
the Func.

Change-Id: I966342ef9997396f3416fb152fa52d60080ebecb
Reviewed-on: https://go-review.googlesource.com/c/go/+/227277
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-05 20:54:32 +00:00
Keith Randall
98cb76799c cmd/compile: insert complicated x86 addressing modes as a separate pass
Use a separate compiler pass to introduce complicated x86 addressing
modes.  Loads in the normal architecture rules (for x86 and all other
platforms) can have constant offsets (AuxInt values) and symbols (Aux
values), but no more.

The complex addressing modes (x+y, x+2*y, etc.) are introduced in a
separate pass that combines loads with LEAQx ops.

Organizing rewrites this way simplifies the number of rewrites
required, as there are lots of different rule orderings that have to
be specified to ensure these complex addressing modes are always found
if they are possible.

Update #36468

Change-Id: I5b4bf7b03a1e731d6dfeb9ef19b376175f3b4b44
Reviewed-on: https://go-review.googlesource.com/c/go/+/217097
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-03-10 00:13:21 +00:00
Michael Munday
e37cc29863 cmd/compile: optimize integer-in-range checks
This CL incorporates code from CL 201206 by Josh Bleecher Snyder
(thanks Josh).

This CL restores the integer-in-range optimizations in the SSA
backend. The fuse pass is enhanced to detect inequalities that
could be merged and fuse their associated blocks while the generic
rules optimize them into a single unsigned comparison.

For example, the inequality `x >= 0 && x < 10` will now be optimized
to `unsigned(x) < 10`.

Overall has a fairly positive impact on binary sizes.

name                      old time/op       new time/op       delta
Template                        192ms ± 1%        192ms ± 1%    ~     (p=0.757 n=17+18)
Unicode                        76.6ms ± 2%       76.5ms ± 2%    ~     (p=0.603 n=19+19)
GoTypes                         694ms ± 1%        693ms ± 1%    ~     (p=0.569 n=19+20)
Compiler                        3.26s ± 0%        3.27s ± 0%  +0.25%  (p=0.000 n=20+20)
SSA                             7.41s ± 0%        7.49s ± 0%  +1.10%  (p=0.000 n=17+19)
Flate                           120ms ± 1%        120ms ± 1%  +0.38%  (p=0.003 n=19+19)
GoParser                        152ms ± 1%        152ms ± 1%    ~     (p=0.061 n=17+19)
Reflect                         422ms ± 1%        425ms ± 2%  +0.76%  (p=0.001 n=18+20)
Tar                             167ms ± 1%        167ms ± 0%    ~     (p=0.730 n=18+19)
XML                             233ms ± 4%        231ms ± 1%    ~     (p=0.752 n=20+17)
LinkCompiler                    927ms ± 8%        928ms ± 8%    ~     (p=0.857 n=19+20)
ExternalLinkCompiler            1.81s ± 2%        1.81s ± 2%    ~     (p=0.513 n=19+20)
LinkWithoutDebugCompiler        556ms ±10%        583ms ±13%  +4.95%  (p=0.007 n=20+20)
[Geo mean]                      478ms             481ms       +0.52%

name                      old user-time/op  new user-time/op  delta
Template                        270ms ± 5%        269ms ± 7%    ~     (p=0.925 n=20+20)
Unicode                         134ms ± 7%        131ms ±14%    ~     (p=0.593 n=18+20)
GoTypes                         981ms ± 3%        987ms ± 2%  +0.63%  (p=0.049 n=19+18)
Compiler                        4.50s ± 2%        4.50s ± 1%    ~     (p=0.588 n=19+20)
SSA                             10.6s ± 2%        10.6s ± 1%    ~     (p=0.141 n=20+19)
Flate                           164ms ± 8%        165ms ±10%    ~     (p=0.738 n=20+20)
GoParser                        202ms ± 5%        203ms ± 6%    ~     (p=0.820 n=20+20)
Reflect                         587ms ± 6%        597ms ± 3%    ~     (p=0.087 n=20+18)
Tar                             230ms ± 6%        228ms ± 8%    ~     (p=0.569 n=19+20)
XML                             311ms ± 6%        314ms ± 5%    ~     (p=0.369 n=20+20)
LinkCompiler                    878ms ± 8%        887ms ± 7%    ~     (p=0.289 n=20+20)
ExternalLinkCompiler            1.60s ± 7%        1.60s ± 7%    ~     (p=0.820 n=20+20)
LinkWithoutDebugCompiler        498ms ±12%        489ms ±11%    ~     (p=0.398 n=20+20)
[Geo mean]                      611ms             611ms       +0.05%

name                      old alloc/op      new alloc/op      delta
Template                       36.1MB ± 0%       36.0MB ± 0%  -0.32%  (p=0.000 n=20+20)
Unicode                        28.3MB ± 0%       28.3MB ± 0%  -0.03%  (p=0.000 n=19+20)
GoTypes                         121MB ± 0%        121MB ± 0%    ~     (p=0.226 n=16+20)
Compiler                        563MB ± 0%        563MB ± 0%    ~     (p=0.166 n=20+19)
SSA                            1.32GB ± 0%       1.33GB ± 0%  +0.88%  (p=0.000 n=20+19)
Flate                          22.7MB ± 0%       22.7MB ± 0%  -0.02%  (p=0.033 n=19+20)
GoParser                       27.9MB ± 0%       27.9MB ± 0%  -0.02%  (p=0.001 n=20+20)
Reflect                        78.3MB ± 0%       78.2MB ± 0%  -0.01%  (p=0.019 n=20+20)
Tar                            34.0MB ± 0%       34.0MB ± 0%  -0.04%  (p=0.000 n=20+20)
XML                            43.9MB ± 0%       43.9MB ± 0%  -0.07%  (p=0.000 n=20+19)
LinkCompiler                    205MB ± 0%        205MB ± 0%  +0.44%  (p=0.000 n=20+18)
ExternalLinkCompiler            223MB ± 0%        223MB ± 0%  +0.03%  (p=0.000 n=20+20)
LinkWithoutDebugCompiler        139MB ± 0%        142MB ± 0%  +1.75%  (p=0.000 n=20+20)
[Geo mean]                     93.7MB            93.9MB       +0.20%

name                      old allocs/op     new allocs/op     delta
Template                         363k ± 0%         361k ± 0%  -0.58%  (p=0.000 n=20+19)
Unicode                          329k ± 0%         329k ± 0%  -0.06%  (p=0.000 n=19+20)
GoTypes                         1.28M ± 0%        1.28M ± 0%  -0.01%  (p=0.000 n=20+20)
Compiler                        5.40M ± 0%        5.40M ± 0%  -0.01%  (p=0.000 n=20+20)
SSA                             12.7M ± 0%        12.8M ± 0%  +0.80%  (p=0.000 n=20+20)
Flate                            228k ± 0%         228k ± 0%    ~     (p=0.194 n=20+20)
GoParser                         295k ± 0%         295k ± 0%  -0.04%  (p=0.000 n=20+20)
Reflect                          949k ± 0%         949k ± 0%  -0.01%  (p=0.000 n=20+20)
Tar                              337k ± 0%         337k ± 0%  -0.06%  (p=0.000 n=20+20)
XML                              418k ± 0%         417k ± 0%  -0.17%  (p=0.000 n=20+20)
LinkCompiler                     553k ± 0%         554k ± 0%  +0.22%  (p=0.000 n=20+19)
ExternalLinkCompiler            1.52M ± 0%        1.52M ± 0%  +0.27%  (p=0.000 n=20+20)
LinkWithoutDebugCompiler         186k ± 0%         186k ± 0%  +0.06%  (p=0.000 n=20+20)
[Geo mean]                       723k              723k       +0.03%

name                      old text-bytes    new text-bytes    delta
HelloSize                       828kB ± 0%        828kB ± 0%  -0.01%  (p=0.000 n=20+20)

name                      old data-bytes    new data-bytes    delta
HelloSize                      13.4kB ± 0%       13.4kB ± 0%    ~     (all equal)

name                      old bss-bytes     new bss-bytes     delta
HelloSize                       180kB ± 0%        180kB ± 0%    ~     (all equal)

name                      old exe-bytes     new exe-bytes     delta
HelloSize                      1.23MB ± 0%       1.23MB ± 0%  -0.33%  (p=0.000 n=20+20)

file      before    after     Δ       %
addr2line 4320075   4311883   -8192   -0.190%
asm       5191932   5187836   -4096   -0.079%
buildid   2835338   2831242   -4096   -0.144%
compile   20531717  20569099  +37382  +0.182%
cover     5322511   5318415   -4096   -0.077%
dist      3723749   3719653   -4096   -0.110%
doc       4743515   4739419   -4096   -0.086%
fix       3413960   3409864   -4096   -0.120%
link      6690119   6686023   -4096   -0.061%
nm        4269616   4265520   -4096   -0.096%
pprof     14942189  14929901  -12288  -0.082%
trace     11807164  11790780  -16384  -0.139%
vet       8384104   8388200   +4096   +0.049%
go        15339076  15334980  -4096   -0.027%
total     132258257 132226007 -32250  -0.024%

Fixes #30645.

Change-Id: If551ac5996097f3685870d083151b5843170aab0
Reviewed-on: https://go-review.googlesource.com/c/go/+/165998
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-03 14:30:26 +00:00
Josh Bleecher Snyder
97a268624c cmd/compile: add -d=ssa/check/seed=SEED
This change adds the option to run the ssa checker with a random seed.
The current system uses a completely fixed seed,
which is good for reproducibility but bad for exploring the state space.

Preserve what we have, but also provide a way for the caller
to provide a seed. The caller can report the seed
alongside any failures.

Change-Id: I2676a8112d8260e6cac86d95d2e8db4d3221aeeb
Reviewed-on: https://go-review.googlesource.com/c/go/+/216418
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-02 22:11:29 +00:00
Giovanni Bajo
19532d04bf cmd/compile: add debugging mode for poset
Add an internal mode to simplify debugging of posets
by checking the integrity after every mutation. Turn
it on within SSA checked builds.

Change-Id: Idaa8277f58e5bce3753702e212cea4d698de30ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/196780
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-10-14 21:00:36 +00:00
David Chase
adc4d2cc2d cmd/compile: run deadcode before nilcheck for better statement relocation
Nilcheck would move statements from NilCheck values to others that
turned out were already dead, which leads to lost statements.  Better
to eliminate the dead code first.

One "error" is removed from test/prove.go because the code is
actually dead, and the additional deadcode pass removes it before
prove can run.

Change-Id: If75926ca1acbb59c7ab9c8ef14d60a02a0a94f8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/198479
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
2019-10-03 21:12:13 +00:00
Josh Bleecher Snyder
52ae04fdfc cmd/compile: improve shortcircuit pass
While working on #30645, I noticed that many instances
in which the walkinrange optimization could apply
were not even being considered.

This was because of extraneous blocks in the CFG,
of the type that shortcircuit normally removes.

The change improves the shortcircuit pass to handle
most of those cases. (There are a few that can only be
reasonably detected later in compilation, after other
optimizations have been run, but not enough to be worth chasing.)

Notable changes:

* Instead of calculating live-across-blocks values, use v.Uses == 1.
  This is cheaper and more straightforward.
  v.Uses did not exist when this pass was initially written.
* Incorporate a fusePlain and loop until stable.
  This is necessary to find many of the instances.
* Allow Copy and Not wrappers around Phi values.
  This significantly increases effectiveness.
* Allow removal of all preds, creating a dead block.
  The previous pass stopped unnecessarily at one pred.
* Use phielimValue during cleanup instead of manually
  setting the op to OpCopy.

The result is marginally faster compilation and smaller code.

name        old time/op       new time/op       delta
Template          213ms ± 2%        212ms ± 2%  -0.63%  (p=0.002 n=49+48)
Unicode          90.0ms ± 2%       89.8ms ± 2%    ~     (p=0.122 n=48+48)
GoTypes           710ms ± 3%        711ms ± 2%    ~     (p=0.433 n=45+49)
Compiler          3.23s ± 2%        3.22s ± 2%    ~     (p=0.124 n=47+49)
SSA               10.0s ± 1%        10.0s ± 1%  -0.43%  (p=0.000 n=48+50)
Flate             135ms ± 3%        135ms ± 2%    ~     (p=0.311 n=49+49)
GoParser          158ms ± 2%        158ms ± 2%    ~     (p=0.757 n=48+48)
Reflect           447ms ± 2%        447ms ± 2%    ~     (p=0.815 n=49+48)
Tar               189ms ± 2%        189ms ± 3%    ~     (p=0.530 n=47+49)
XML               251ms ± 3%        250ms ± 1%  -0.75%  (p=0.002 n=49+48)
[Geo mean]        427ms             426ms       -0.25%

name        old user-time/op  new user-time/op  delta
Template          265ms ± 2%        265ms ± 2%    ~     (p=0.969 n=48+50)
Unicode           119ms ± 6%        119ms ± 6%    ~     (p=0.738 n=50+50)
GoTypes           923ms ± 2%        925ms ± 2%    ~     (p=0.057 n=43+47)
Compiler          4.37s ± 2%        4.37s ± 2%    ~     (p=0.691 n=50+46)
SSA               13.4s ± 1%        13.4s ± 1%    ~     (p=0.282 n=42+49)
Flate             162ms ± 2%        162ms ± 2%    ~     (p=0.774 n=48+50)
GoParser          186ms ± 2%        186ms ± 3%    ~     (p=0.213 n=47+47)
Reflect           572ms ± 2%        573ms ± 3%    ~     (p=0.303 n=50+49)
Tar               240ms ± 3%        240ms ± 2%    ~     (p=0.939 n=46+44)
XML               302ms ± 2%        302ms ± 2%    ~     (p=0.399 n=47+47)
[Geo mean]        540ms             541ms       +0.07%

name        old alloc/op      new alloc/op      delta
Template         36.8MB ± 0%       36.7MB ± 0%  -0.42%  (p=0.008 n=5+5)
Unicode          28.1MB ± 0%       28.1MB ± 0%    ~     (p=0.151 n=5+5)
GoTypes           124MB ± 0%        124MB ± 0%  -0.26%  (p=0.008 n=5+5)
Compiler          571MB ± 0%        566MB ± 0%  -0.84%  (p=0.008 n=5+5)
SSA              1.86GB ± 0%       1.85GB ± 0%  -0.58%  (p=0.008 n=5+5)
Flate            22.8MB ± 0%       22.8MB ± 0%  -0.17%  (p=0.008 n=5+5)
GoParser         27.3MB ± 0%       27.3MB ± 0%  -0.20%  (p=0.008 n=5+5)
Reflect          79.5MB ± 0%       79.3MB ± 0%  -0.20%  (p=0.008 n=5+5)
Tar              34.7MB ± 0%       34.6MB ± 0%  -0.42%  (p=0.008 n=5+5)
XML              45.4MB ± 0%       45.3MB ± 0%  -0.29%  (p=0.008 n=5+5)
[Geo mean]       80.0MB            79.7MB       -0.34%

name        old allocs/op     new allocs/op     delta
Template           378k ± 0%         377k ± 0%  -0.22%  (p=0.008 n=5+5)
Unicode            339k ± 0%         339k ± 0%    ~     (p=0.643 n=5+5)
GoTypes           1.36M ± 0%        1.36M ± 0%  -0.10%  (p=0.008 n=5+5)
Compiler          5.51M ± 0%        5.50M ± 0%  -0.13%  (p=0.008 n=5+5)
SSA               17.5M ± 0%        17.5M ± 0%  -0.14%  (p=0.008 n=5+5)
Flate              234k ± 0%         234k ± 0%  -0.04%  (p=0.008 n=5+5)
GoParser           299k ± 0%         299k ± 0%  -0.05%  (p=0.008 n=5+5)
Reflect            978k ± 0%         979k ± 0%  +0.02%  (p=0.016 n=5+5)
Tar                351k ± 0%         351k ± 0%  -0.04%  (p=0.008 n=5+5)
XML                435k ± 0%         435k ± 0%  -0.11%  (p=0.008 n=5+5)
[Geo mean]         840k              840k       -0.08%

file      before    after     Δ       %
go        14794788  14770212  -24576  -0.166%
addr2line 4203688   4199592   -4096   -0.097%
api       5954056   5941768   -12288  -0.206%
asm       4862704   4846320   -16384  -0.337%
cgo       4778920   4770728   -8192   -0.171%
compile   24001568  23923792  -77776  -0.324%
cover     5198440   5190248   -8192   -0.158%
dist      3595248   3587056   -8192   -0.228%
doc       4618504   4610312   -8192   -0.177%
fix       3337416   3333320   -4096   -0.123%
link      6120408   6116312   -4096   -0.067%
nm        4149064   4140872   -8192   -0.197%
objdump   4555608   4547416   -8192   -0.180%
pprof     14616324  14595844  -20480  -0.140%
test2json 2766328   2762232   -4096   -0.148%
trace     11638844  11622460  -16384  -0.141%
vet       8274936   8258552   -16384  -0.198%
total     132520780 132270972 -249808 -0.189%

Change-Id: Ifcd235a2a6e5f13ed5c93e62523e2ef61321fccf
Reviewed-on: https://go-review.googlesource.com/c/go/+/178197
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-08-27 22:15:13 +00:00
Josh Bleecher Snyder
260e3d0818 cmd/compile: run deadcode before lowered CSE
CSE can make dead values live again.
Running deadcode first avoids that;
it also makes CSE more efficient.

file    before    after     Δ       %       
api     5970616   5966520   -4096   -0.069% 
asm     4867088   4846608   -20480  -0.421% 
compile 23988320  23935072  -53248  -0.222% 
link    6084376   6080280   -4096   -0.067% 
nm      4165736   4161640   -4096   -0.098% 
objdump 4572216   4568120   -4096   -0.090% 
pprof   14452996  14457092  +4096   +0.028% 
trace   11467292  11471388  +4096   +0.036% 
total   132181100 132099180 -81920  -0.062% 

Compiler performance impact is negligible:

name        old alloc/op      new alloc/op      delta
Template         38.8MB ± 0%       38.8MB ± 0%  -0.04%  (p=0.008 n=5+5)
Unicode          28.2MB ± 0%       28.2MB ± 0%    ~     (p=1.000 n=5+5)
GoTypes           131MB ± 0%        131MB ± 0%  -0.14%  (p=0.008 n=5+5)
Compiler          606MB ± 0%        606MB ± 0%  -0.05%  (p=0.008 n=5+5)
SSA              2.14GB ± 0%       2.13GB ± 0%  -0.26%  (p=0.008 n=5+5)
Flate            24.0MB ± 0%       24.0MB ± 0%  -0.18%  (p=0.008 n=5+5)
GoParser         28.8MB ± 0%       28.8MB ± 0%  -0.15%  (p=0.008 n=5+5)
Reflect          83.8MB ± 0%       83.7MB ± 0%  -0.11%  (p=0.008 n=5+5)
Tar              36.4MB ± 0%       36.4MB ± 0%  -0.09%  (p=0.008 n=5+5)
XML              47.9MB ± 0%       47.8MB ± 0%  -0.15%  (p=0.008 n=5+5)
[Geo mean]       84.6MB            84.5MB       -0.12%

name        old allocs/op     new allocs/op     delta
Template           379k ± 0%         380k ± 0%  +0.15%  (p=0.008 n=5+5)
Unicode            340k ± 0%         340k ± 0%    ~     (p=0.738 n=5+5)
GoTypes           1.36M ± 0%        1.36M ± 0%  +0.05%  (p=0.008 n=5+5)
Compiler          5.49M ± 0%        5.49M ± 0%  +0.12%  (p=0.008 n=5+5)
SSA               17.5M ± 0%        17.5M ± 0%  -0.18%  (p=0.008 n=5+5)
Flate              235k ± 0%         235k ± 0%    ~     (p=0.079 n=5+5)
GoParser           302k ± 0%         302k ± 0%    ~     (p=0.310 n=5+5)
Reflect            976k ± 0%         977k ± 0%  +0.08%  (p=0.008 n=5+5)
Tar                352k ± 0%         352k ± 0%  +0.12%  (p=0.008 n=5+5)
XML                436k ± 0%         436k ± 0%  -0.05%  (p=0.008 n=5+5)
[Geo mean]         842k              842k       +0.03%


Change-Id: I53e8faed1859885ca5c4a5d45067a50984f3eff1
Reviewed-on: https://go-review.googlesource.com/c/go/+/175879
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-08-27 21:04:53 +00:00
Josh Bleecher Snyder
0047353c53 cmd/compile: add countRule rewrite rule helper
noteRule is useful when you're trying to debug
a particular rule, or get a general sense for
how often a rule fires overall.

It is less useful if you're trying to figure
out which functions might be useful to benchmark
to ascertain the impact of a newly added rule.

Enter countRule. You use it like noteRule,
except that you get per-function summaries.

Sample output:

 # runtime
(*mspan).sweep: idx1=1
evacuate_faststr: idx1=1
evacuate_fast32: idx1=1
evacuate: idx1=2
evacuate_fast64: idx1=1
sweepone: idx1=1
purgecachedstats: idx1=1
mProf_Free: idx1=1

This suggests that the map benchmarks
might be good to run for this added rule.

Change-Id: Id471c3231f1736165f2020f6979ff01c29677808
Reviewed-on: https://go-review.googlesource.com/c/go/+/167088
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-05-08 22:58:10 +00:00
Josh Bleecher Snyder
49ad7bc67c cmd/compile: note that some rules know the name of the opt pass
Change-Id: I4a70f4a52f84cf50f99939351319504b1c5dff76
Reviewed-on: https://go-review.googlesource.com/c/go/+/175777
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-05-07 17:43:46 +00:00
Josh Bleecher Snyder
c1544ff906 cmd/compile: move phi tighten after critical
The phi tighten pass moves rematerializable phi args
to the immediate predecessor of the phis.
This reduces value lifetimes for regalloc.

However, the critical edge removal pass can introduce
new blocks, which can change what a block's
immediate precedessor is. This can result in tightened
phi args being spilled unnecessarily.

This change moves the phi tighten pass after the
critical edge pass, when the block structure is stable.

This improves the code generated for

func f(s string) bool { return s == "abcde" }

Before this change:

"".f STEXT nosplit size=44 args=0x18 locals=0x0
	0x0000 00000 (x.go:3)	MOVQ	"".s+16(SP), AX
	0x0005 00005 (x.go:3)	CMPQ	AX, $5
	0x0009 00009 (x.go:3)	JNE	40
	0x000b 00011 (x.go:3)	MOVQ	"".s+8(SP), AX
	0x0010 00016 (x.go:3)	CMPL	(AX), $1684234849
	0x0016 00022 (x.go:3)	JNE	36
	0x0018 00024 (x.go:3)	CMPB	4(AX), $101
	0x001c 00028 (x.go:3)	SETEQ	AL
	0x001f 00031 (x.go:3)	MOVB	AL, "".~r1+24(SP)
	0x0023 00035 (x.go:3)	RET
	0x0024 00036 (x.go:3)	XORL	AX, AX
	0x0026 00038 (x.go:3)	JMP	31
	0x0028 00040 (x.go:3)	XORL	AX, AX
	0x002a 00042 (x.go:3)	JMP	31

Observe the duplicated blocks at the end.
After this change:

"".f STEXT nosplit size=40 args=0x18 locals=0x0
	0x0000 00000 (x.go:3)	MOVQ	"".s+16(SP), AX
	0x0005 00005 (x.go:3)	CMPQ	AX, $5
	0x0009 00009 (x.go:3)	JNE	36
	0x000b 00011 (x.go:3)	MOVQ	"".s+8(SP), AX
	0x0010 00016 (x.go:3)	CMPL	(AX), $1684234849
	0x0016 00022 (x.go:3)	JNE	36
	0x0018 00024 (x.go:3)	CMPB	4(AX), $101
	0x001c 00028 (x.go:3)	SETEQ	AL
	0x001f 00031 (x.go:3)	MOVB	AL, "".~r1+24(SP)
	0x0023 00035 (x.go:3)	RET
	0x0024 00036 (x.go:3)	XORL	AX, AX
	0x0026 00038 (x.go:3)	JMP	31

Change-Id: I12c81aa53b89456cb5809aa5396378245f3beda9
Reviewed-on: https://go-review.googlesource.com/c/go/+/172597
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-04-19 15:39:53 +00:00
Daniel Martí
be9c534cde cmd/compile: don't crash on -d=ssa/
I forgot how to pull up the ssa debug options help, so instead of
writing -d=ssa/help, I just wrote -d=ssa/. Much to my amusement, the
compiler just crashed, as shown below. Fix that.

	panic: runtime error: index out of range

	goroutine 1 [running]:
	cmd/compile/internal/ssa.PhaseOption(0x7ffc375d2b70, 0x0, 0xdbff91, 0x5, 0x1, 0x0, 0x0, 0x1, 0x1)
	    /home/mvdan/tip/src/cmd/compile/internal/ssa/compile.go:327 +0x1876
	cmd/compile/internal/gc.Main(0xde7bd8)
	    /home/mvdan/tip/src/cmd/compile/internal/gc/main.go:411 +0x41d0
	main.main()
	    /home/mvdan/tip/src/cmd/compile/main.go:51 +0xab

Change-Id: Ia2ad394382ddf8f4498b16b5cfb49be0317fc1aa
Reviewed-on: https://go-review.googlesource.com/c/154421
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-02-26 17:50:01 +00:00
Keith Randall
2b4f24a2d2 cmd/compile: randomize value order in block for testing
A little bit of compiler stress testing. Randomize the order
of the values in a block before every phase. This randomization
makes sure that we're not implicitly depending on that order.

Currently the random seed is a hash of the function name.
It provides determinism, but sacrifices some coverage.
Other arrangements are possible (env var, ...) but require
more setup.

Fixes #20178

Change-Id: Idae792a23264bd9a3507db6ba49b6d591a608e83
Reviewed-on: https://go-review.googlesource.com/c/33909
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-28 17:13:46 +00:00
David Chase
04105ef1da cmd/compile: decompose composite OpArg before decomposeUser
This makes it easier to track names of function arguments
for debugging purposes.

Change-Id: Ic34856fe0b910005e1c7bc051d769d489a4b158e
Reviewed-on: https://go-review.googlesource.com/c/150098
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-23 22:59:32 +00:00
Josh Bleecher Snyder
a55f3ee46d cmd/compile: fuse before branchelim
The branchelim pass works better after fuse.
Running fuse before branchelim also increases
the stability of generated code amidst other compiler changes,
which was the original motivation behind this change.

The fuse pass is not cheap enough to run in its entirety
before branchelim, but the most important half of it is.
This change makes it possible to run "plain fuse" independently
and does so before branchelim.

During make.bash, elimIf occurrences increase from 4244 to 4288 (1%),
and elimIfElse occurrences increase from 989 to 1079 (9%).

Toolspeed impact is marginal; plain fuse pays for itself.

name        old time/op       new time/op       delta
Template          189ms ± 2%        189ms ± 2%    ~     (p=0.890 n=45+46)
Unicode          93.2ms ± 5%       93.4ms ± 7%    ~     (p=0.790 n=48+48)
GoTypes           662ms ± 4%        660ms ± 4%    ~     (p=0.186 n=48+49)
Compiler          2.89s ± 4%        2.91s ± 3%  +0.89%  (p=0.050 n=49+44)
SSA               8.23s ± 2%        8.21s ± 1%    ~     (p=0.165 n=46+44)
Flate             123ms ± 4%        123ms ± 3%  +0.58%  (p=0.031 n=47+49)
GoParser          154ms ± 4%        154ms ± 4%    ~     (p=0.492 n=49+48)
Reflect           430ms ± 4%        429ms ± 4%    ~     (p=1.000 n=48+48)
Tar               171ms ± 3%        170ms ± 4%    ~     (p=0.122 n=48+48)
XML               232ms ± 3%        232ms ± 2%    ~     (p=0.850 n=46+49)
[Geo mean]        394ms             394ms       +0.02%

name        old user-time/op  new user-time/op  delta
Template          236ms ± 5%        236ms ± 4%    ~     (p=0.934 n=50+50)
Unicode           132ms ± 7%        130ms ± 9%    ~     (p=0.087 n=50+50)
GoTypes           861ms ± 3%        867ms ± 4%    ~     (p=0.124 n=48+50)
Compiler          3.93s ± 4%        3.94s ± 3%    ~     (p=0.584 n=49+44)
SSA               12.2s ± 2%        12.3s ± 1%    ~     (p=0.610 n=46+45)
Flate             149ms ± 4%        150ms ± 4%    ~     (p=0.194 n=48+49)
GoParser          193ms ± 5%        191ms ± 6%    ~     (p=0.239 n=49+50)
Reflect           553ms ± 5%        556ms ± 5%    ~     (p=0.091 n=49+49)
Tar               218ms ± 5%        218ms ± 5%    ~     (p=0.359 n=49+50)
XML               299ms ± 5%        298ms ± 4%    ~     (p=0.482 n=50+49)
[Geo mean]        516ms             516ms       -0.01%

name        old alloc/op      new alloc/op      delta
Template         36.3MB ± 0%       36.3MB ± 0%  -0.02%  (p=0.000 n=49+49)
Unicode          29.7MB ± 0%       29.7MB ± 0%    ~     (p=0.270 n=50+50)
GoTypes           126MB ± 0%        126MB ± 0%  -0.34%  (p=0.000 n=50+49)
Compiler          534MB ± 0%        531MB ± 0%  -0.50%  (p=0.000 n=50+50)
SSA              1.98GB ± 0%       1.98GB ± 0%  -0.06%  (p=0.000 n=49+49)
Flate            24.6MB ± 0%       24.6MB ± 0%  -0.29%  (p=0.000 n=50+50)
GoParser         29.5MB ± 0%       29.4MB ± 0%  -0.15%  (p=0.000 n=49+50)
Reflect          87.3MB ± 0%       87.2MB ± 0%  -0.13%  (p=0.000 n=49+50)
Tar              35.6MB ± 0%       35.5MB ± 0%  -0.17%  (p=0.000 n=50+50)
XML              48.2MB ± 0%       48.0MB ± 0%  -0.30%  (p=0.000 n=48+50)
[Geo mean]       83.1MB            82.9MB       -0.20%

name        old allocs/op     new allocs/op     delta
Template           352k ± 0%         352k ± 0%  -0.01%  (p=0.004 n=49+49)
Unicode            341k ± 0%         341k ± 0%    ~     (p=0.341 n=48+50)
GoTypes           1.28M ± 0%        1.28M ± 0%  -0.03%  (p=0.000 n=50+49)
Compiler          4.96M ± 0%        4.96M ± 0%  -0.05%  (p=0.000 n=50+49)
SSA               15.5M ± 0%        15.5M ± 0%  -0.01%  (p=0.000 n=50+49)
Flate              233k ± 0%         233k ± 0%  +0.01%  (p=0.032 n=49+49)
GoParser           294k ± 0%         294k ± 0%    ~     (p=0.052 n=46+48)
Reflect           1.04M ± 0%        1.04M ± 0%    ~     (p=0.171 n=50+47)
Tar                343k ± 0%         343k ± 0%  -0.03%  (p=0.000 n=50+50)
XML                429k ± 0%         429k ± 0%  -0.04%  (p=0.000 n=50+50)
[Geo mean]         812k              812k       -0.02%

Object files grow slightly; branchelim often increases binary size, at least on amd64.

name        old object-bytes  new object-bytes  delta
Template          509kB ± 0%        509kB ± 0%  -0.01%  (p=0.008 n=5+5)
Unicode           224kB ± 0%        224kB ± 0%    ~     (all equal)
GoTypes          1.84MB ± 0%       1.84MB ± 0%  +0.00%  (p=0.008 n=5+5)
Compiler         6.71MB ± 0%       6.71MB ± 0%  +0.01%  (p=0.008 n=5+5)
SSA              21.2MB ± 0%       21.2MB ± 0%  +0.01%  (p=0.008 n=5+5)
Flate             324kB ± 0%        324kB ± 0%  -0.00%  (p=0.008 n=5+5)
GoParser          404kB ± 0%        404kB ± 0%  -0.02%  (p=0.008 n=5+5)
Reflect          1.40MB ± 0%       1.40MB ± 0%  +0.09%  (p=0.008 n=5+5)
Tar               452kB ± 0%        452kB ± 0%  +0.06%  (p=0.008 n=5+5)
XML               596kB ± 0%        596kB ± 0%  +0.00%  (p=0.008 n=5+5)
[Geo mean]       1.04MB            1.04MB       +0.01%

Change-Id: I535c711b85380ff657fc0f022bebd9cb14ddd07f
Reviewed-on: https://go-review.googlesource.com/c/129378
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-15 16:51:08 +00:00
Yury Smolsky
c35069d642 cmd/compile: clean the output of GOSSAFUNC
Since we print almost everything to ssa.html in the GOSSAFUNC mode,
there is a need to stop spamming stdout when user just wants to see
ssa.html.

This changes cleans output of the GOSSAFUNC debug mode.
To enable the dump of the debug data to stdout, one must
put suffix + after the function name like that:

GOSSAFUNC=Foo+

Otherwise gc will not print the IR and ASM to stdout after each phase.
AST IR is still sent to stdout because it is not included
into ssa.html. It will be fixed in a separate change.

The change adds printing out the full path to the ssa.html file.

Updates #25942

Change-Id: I711e145e05f0443c7df5459ca528dced273a62ee
Reviewed-on: https://go-review.googlesource.com/126603
Run-TryBot: Yury Smolsky <yury@smolsky.by>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-23 05:11:17 +00:00
Cherry Zhang
3f54e8537a cmd/compile: run generic deadcode in -N mode
Late opt pass may generate dead stores, which messes up store
chain calculation in later passes. Run generic deadcode even
in -N mode to remove them.

Fixes #26163.

Change-Id: I8276101717bb978d5980e6c7998f53fd8d0ae10f
Reviewed-on: https://go-review.googlesource.com/121856
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-07-02 20:53:23 +00:00
Yury Smolsky
5eb98b3c30 cmd/compile: use expandable columns in ssa.html
Display just a few columns in ssa.html, other
columns can be expanded by clicking on collapsed column.

Use sans serif font for the text, slightly smaller font size
for non program text.

Fixes #25286

Change-Id: I1094695135401602d90b97b69e42f6dda05871a2
Reviewed-on: https://go-review.googlesource.com/117275
Run-TryBot: Yury Smolsky <yury@smolsky.by>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-06-13 21:02:54 +00:00
David Chase
c2c1822b12 cmd/compile: assign and preserve statement boundaries.
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).

Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).

Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.

Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".

The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices.  This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.

Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go.  There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.

Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)

This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.

The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.

This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.

Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:09:49 +00:00
Michael Munday
f31a18ded4 cmd/compile: add some generic composite type optimizations
Propagate values through some wide Zero/Move operations. Among
other things this allows us to optimize some kinds of array
initialization. For example, the following code no longer
requires a temporary be allocated on the stack. Instead it
writes the values directly into the return value.

func f(i uint32) [4]uint32 {
    return [4]uint32{i, i+1, i+2, i+3}
}

The return value is unnecessarily cleared but removing that is
probably a task for dead store analysis (I think it needs to
be able to match multiple Store ops to wide Zero ops).

In order to reliably remove stack variables that are rendered
unnecessary by these new rules I've added a new generic version
of the unread autos elimination pass.

These rules are triggered more than 5000 times when building and
testing the standard library.

Updates #15925 (fixes for arrays of up to 4 elements).
Updates #24386 (fixes for up to 4 kept elements).
Updates #24416.

compilebench results:

name       old time/op       new time/op       delta
Template         353ms ± 5%        359ms ± 3%    ~     (p=0.143 n=10+10)
Unicode          219ms ± 1%        217ms ± 4%    ~     (p=0.740 n=7+10)
GoTypes          1.26s ± 1%        1.26s ± 2%    ~     (p=0.549 n=9+10)
Compiler         6.00s ± 1%        6.08s ± 1%  +1.42%  (p=0.000 n=9+8)
SSA              15.3s ± 2%        15.6s ± 1%  +2.43%  (p=0.000 n=10+10)
Flate            237ms ± 2%        240ms ± 2%  +1.31%  (p=0.015 n=10+10)
GoParser         285ms ± 1%        285ms ± 1%    ~     (p=0.878 n=8+8)
Reflect          797ms ± 3%        807ms ± 2%    ~     (p=0.065 n=9+10)
Tar              334ms ± 0%        335ms ± 4%    ~     (p=0.460 n=8+10)
XML              419ms ± 0%        423ms ± 1%  +0.91%  (p=0.001 n=7+9)
StdCmd           46.0s ± 0%        46.4s ± 0%  +0.85%  (p=0.000 n=9+9)

name       old user-time/op  new user-time/op  delta
Template         337ms ± 3%        346ms ± 5%    ~     (p=0.053 n=9+10)
Unicode          205ms ±10%        205ms ± 8%    ~     (p=1.000 n=10+10)
GoTypes          1.22s ± 2%        1.21s ± 3%    ~     (p=0.436 n=10+10)
Compiler         5.85s ± 1%        5.93s ± 0%  +1.46%  (p=0.000 n=10+8)
SSA              14.9s ± 1%        15.3s ± 1%  +2.62%  (p=0.000 n=10+10)
Flate            229ms ± 4%        228ms ± 6%    ~     (p=0.796 n=10+10)
GoParser         271ms ± 3%        275ms ± 4%    ~     (p=0.165 n=10+10)
Reflect          779ms ± 5%        775ms ± 2%    ~     (p=0.971 n=10+10)
Tar              317ms ± 4%        319ms ± 5%    ~     (p=0.853 n=10+10)
XML              404ms ± 4%        409ms ± 5%    ~     (p=0.436 n=10+10)

name       old alloc/op      new alloc/op      delta
Template        34.9MB ± 0%       35.0MB ± 0%  +0.26%  (p=0.000 n=10+10)
Unicode         29.3MB ± 0%       29.3MB ± 0%  +0.02%  (p=0.000 n=10+10)
GoTypes          115MB ± 0%        115MB ± 0%  +0.30%  (p=0.000 n=10+10)
Compiler         519MB ± 0%        521MB ± 0%  +0.30%  (p=0.000 n=10+10)
SSA             1.55GB ± 0%       1.57GB ± 0%  +1.34%  (p=0.000 n=10+9)
Flate           24.1MB ± 0%       24.2MB ± 0%  +0.10%  (p=0.000 n=10+10)
GoParser        28.1MB ± 0%       28.1MB ± 0%  +0.07%  (p=0.000 n=10+10)
Reflect         78.7MB ± 0%       78.7MB ± 0%  +0.03%  (p=0.000 n=8+10)
Tar             34.4MB ± 0%       34.5MB ± 0%  +0.12%  (p=0.000 n=10+10)
XML             43.2MB ± 0%       43.2MB ± 0%  +0.13%  (p=0.000 n=10+10)

name       old allocs/op     new allocs/op     delta
Template          330k ± 0%         330k ± 0%  -0.01%  (p=0.017 n=10+10)
Unicode           337k ± 0%         337k ± 0%  +0.01%  (p=0.000 n=9+10)
GoTypes          1.15M ± 0%        1.15M ± 0%  +0.03%  (p=0.000 n=10+10)
Compiler         4.77M ± 0%        4.77M ± 0%  +0.03%  (p=0.000 n=9+10)
SSA              12.5M ± 0%        12.6M ± 0%  +1.16%  (p=0.000 n=10+10)
Flate             221k ± 0%         221k ± 0%  +0.05%  (p=0.000 n=9+10)
GoParser          275k ± 0%         275k ± 0%  +0.01%  (p=0.014 n=10+9)
Reflect           944k ± 0%         944k ± 0%  -0.02%  (p=0.000 n=10+10)
Tar               324k ± 0%         323k ± 0%  -0.12%  (p=0.000 n=10+10)
XML               384k ± 0%         384k ± 0%  -0.01%  (p=0.001 n=10+10)

name       old object-bytes  new object-bytes  delta
Template         476kB ± 0%        476kB ± 0%  -0.04%  (p=0.000 n=10+10)
Unicode          218kB ± 0%        218kB ± 0%    ~     (all equal)
GoTypes         1.58MB ± 0%       1.58MB ± 0%  -0.04%  (p=0.000 n=10+10)
Compiler        6.25MB ± 0%       6.24MB ± 0%  -0.09%  (p=0.000 n=10+10)
SSA             15.9MB ± 0%       16.1MB ± 0%  +1.22%  (p=0.000 n=10+10)
Flate            304kB ± 0%        304kB ± 0%  -0.13%  (p=0.000 n=10+10)
GoParser         370kB ± 0%        370kB ± 0%  -0.00%  (p=0.000 n=10+10)
Reflect         1.27MB ± 0%       1.27MB ± 0%  -0.12%  (p=0.000 n=10+10)
Tar              421kB ± 0%        419kB ± 0%  -0.64%  (p=0.000 n=10+10)
XML              518kB ± 0%        517kB ± 0%  -0.12%  (p=0.000 n=10+10)

name       old export-bytes  new export-bytes  delta
Template        16.7kB ± 0%       16.7kB ± 0%    ~     (all equal)
Unicode         6.52kB ± 0%       6.52kB ± 0%    ~     (all equal)
GoTypes         29.2kB ± 0%       29.2kB ± 0%    ~     (all equal)
Compiler        88.0kB ± 0%       88.0kB ± 0%    ~     (all equal)
SSA              109kB ± 0%        109kB ± 0%    ~     (all equal)
Flate           4.49kB ± 0%       4.49kB ± 0%    ~     (all equal)
GoParser        8.10kB ± 0%       8.10kB ± 0%    ~     (all equal)
Reflect         7.71kB ± 0%       7.71kB ± 0%    ~     (all equal)
Tar             9.15kB ± 0%       9.15kB ± 0%    ~     (all equal)
XML             12.3kB ± 0%       12.3kB ± 0%    ~     (all equal)

name       old text-bytes    new text-bytes    delta
HelloSize        676kB ± 0%        672kB ± 0%  -0.59%  (p=0.000 n=10+10)
CmdGoSize       7.26MB ± 0%       7.24MB ± 0%  -0.18%  (p=0.000 n=10+10)

name       old data-bytes    new data-bytes    delta
HelloSize       10.2kB ± 0%       10.2kB ± 0%    ~     (all equal)
CmdGoSize        248kB ± 0%        248kB ± 0%    ~     (all equal)

name       old bss-bytes     new bss-bytes     delta
HelloSize        125kB ± 0%        125kB ± 0%    ~     (all equal)
CmdGoSize        145kB ± 0%        145kB ± 0%    ~     (all equal)

name       old exe-bytes     new exe-bytes     delta
HelloSize       1.46MB ± 0%       1.45MB ± 0%  -0.31%  (p=0.000 n=10+10)
CmdGoSize       14.7MB ± 0%       14.7MB ± 0%  -0.17%  (p=0.000 n=10+10)

Change-Id: Ic72b0c189dd542f391e1c9ab88a76e9148dc4285
Reviewed-on: https://go-review.googlesource.com/106495
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-08 10:31:21 +00:00
Alberto Donizetti
8a958bb8a6 cmd/compile: better formatting for ssa phases options doc
Change the help doc of

  go tool compile -d=ssa/help

from this:

  compile: GcFlag -d=ssa/<phase>/<flag>[=<value>|<function_name>]
  <phase> is one of:
  check, all, build, intrinsics, early_phielim, early_copyelim
  early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
  opt_deadcode, generic_cse, phiopt, nilcheckelim, prove, loopbce
  decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
  fuse, dse, writebarrier, insert_resched_checks, tighten, lower
  lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
  late_phielim, late_copyelim, phi_tighten, late_deadcode, critical
  likelyadjust, layout, schedule, late_nilcheck, flagalloc, regalloc
  loop_rotate, stackframe, trim
  <flag> is one of on, off, debug, mem, time, test, stats, dump
  <value> defaults to 1
  <function_name> is required for "dump", specifies name of function to dump after <phase>
  Except for dump, output is directed to standard out; dump appears in a file.
  Phase "all" supports flags "time", "mem", and "dump".
  Phases "intrinsics" supports flags "on", "off", and "debug".
  Interpretation of the "debug" value depends on the phase.
  Dump files are named <phase>__<function_name>_<seq>.dump.

To this:

  compile: PhaseOptions usage:

      go tool compile -d=ssa/<phase>/<flag>[=<value>|<function_name>]

  where:

  - <phase> is one of:
      check, all, build, intrinsics, early_phielim, early_copyelim
      early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
      opt_deadcode, generic_cse, phiopt, nilcheckelim, prove
      decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
      branchelim, fuse, dse, writebarrier, insert_resched_checks, lower
      lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
      late_phielim, late_copyelim, tighten, phi_tighten, late_deadcode
      critical, likelyadjust, layout, schedule, late_nilcheck, flagalloc
      regalloc, loop_rotate, stackframe, trim

  - <flag> is one of:
      on, off, debug, mem, time, test, stats, dump

  - <value> defaults to 1

  - <function_name> is required for the "dump" flag, and specifies the
    name of function to dump after <phase>

  Phase "all" supports flags "time", "mem", and "dump".
  Phase "intrinsics" supports flags "on", "off", and "debug".

  If the "dump" flag is specified, the output is written on a file named
  <phase>__<function_name>_<seq>.dump; otherwise it is directed to stdout.

Also add a few examples at the bottom.

Fixes #20349

Change-Id: I334799e951e7b27855b3ace5d2d966c4d6ec4cff
Reviewed-on: https://go-review.googlesource.com/110062
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-29 16:24:03 +00:00
Giovanni Bajo
f49369b67c cmd/compile: remove loopbce pass
prove now is able to do what loopbce used to do.

Passes toolstash -cmp.

Compilebench of the whole serie (master 9967582f770f6):

name       old time/op     new time/op     delta
Template       208ms ±18%      198ms ± 4%    ~     (p=0.690 n=5+5)
Unicode       99.1ms ±19%     96.5ms ± 4%    ~     (p=0.548 n=5+5)
GoTypes        623ms ± 1%      633ms ± 1%    ~     (p=0.056 n=5+5)
Compiler       2.94s ± 2%      3.02s ± 4%    ~     (p=0.095 n=5+5)
SSA            6.77s ± 1%      7.11s ± 2%  +4.94%  (p=0.008 n=5+5)
Flate          129ms ± 1%      136ms ± 0%  +4.87%  (p=0.016 n=5+4)
GoParser       152ms ± 3%      156ms ± 1%    ~     (p=0.095 n=5+5)
Reflect        380ms ± 2%      392ms ± 1%  +3.30%  (p=0.008 n=5+5)
Tar            185ms ± 6%      184ms ± 2%    ~     (p=0.690 n=5+5)
XML            223ms ± 2%      228ms ± 3%    ~     (p=0.095 n=5+5)
StdCmd         26.8s ± 2%      28.0s ± 5%  +4.46%  (p=0.032 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        252M ± 5%       248M ± 3%    ~     (p=1.000 n=5+5)
Unicode         118M ± 7%       121M ± 4%    ~     (p=0.548 n=5+5)
GoTypes         790M ± 2%       793M ± 2%    ~     (p=0.690 n=5+5)
Compiler       3.78G ± 3%      3.91G ± 4%    ~     (p=0.056 n=5+5)
SSA            8.98G ± 2%      9.52G ± 3%  +6.08%  (p=0.008 n=5+5)
Flate           155M ± 1%       160M ± 0%  +3.47%  (p=0.016 n=5+4)
GoParser        185M ± 4%       187M ± 2%    ~     (p=0.310 n=5+5)
Reflect         469M ± 1%       481M ± 1%  +2.52%  (p=0.016 n=5+5)
Tar             222M ± 4%       222M ± 2%    ~     (p=0.841 n=5+5)
XML             269M ± 1%       274M ± 2%  +1.88%  (p=0.032 n=5+5)

name       old text-bytes  new text-bytes  delta
HelloSize       664k ± 0%       664k ± 0%    ~     (all equal)
CmdGoSize      7.23M ± 0%      7.22M ± 0%  -0.06%  (p=0.008 n=5+5)

name       old data-bytes  new data-bytes  delta
HelloSize       134k ± 0%       134k ± 0%    ~     (all equal)
CmdGoSize       390k ± 0%       390k ± 0%    ~     (all equal)

name       old exe-bytes   new exe-bytes   delta
HelloSize      1.39M ± 0%      1.39M ± 0%    ~     (all equal)
CmdGoSize      14.4M ± 0%      14.4M ± 0%  -0.06%  (p=0.008 n=5+5)

Go1 of the whole serie:

name                      old time/op    new time/op    delta
BinaryTree17-16              5.40s ± 6%     5.38s ± 4%     ~     (p=1.000 n=12+10)
Fannkuch11-16                4.04s ± 3%     3.81s ± 3%   -5.70%  (p=0.000 n=11+11)
FmtFprintfEmpty-16          60.7ns ± 2%    60.2ns ± 3%     ~     (p=0.136 n=11+10)
FmtFprintfString-16          115ns ± 2%     114ns ± 4%     ~     (p=0.175 n=11+10)
FmtFprintfInt-16             118ns ± 2%     125ns ± 2%   +5.76%  (p=0.000 n=11+10)
FmtFprintfIntInt-16          196ns ± 2%     204ns ± 3%   +4.42%  (p=0.000 n=10+11)
FmtFprintfPrefixedInt-16     207ns ± 2%     214ns ± 2%   +3.23%  (p=0.000 n=10+11)
FmtFprintfFloat-16           364ns ± 3%     357ns ± 2%   -1.88%  (p=0.002 n=11+11)
FmtManyArgs-16               773ns ± 2%     775ns ± 1%     ~     (p=0.457 n=11+10)
GobDecode-16                11.2ms ± 4%    11.0ms ± 3%   -1.51%  (p=0.022 n=10+9)
GobEncode-16                9.91ms ± 6%    9.81ms ± 5%     ~     (p=0.699 n=11+11)
Gzip-16                      339ms ± 1%     338ms ± 1%     ~     (p=0.438 n=11+11)
Gunzip-16                   64.4ms ± 1%    65.2ms ± 1%   +1.28%  (p=0.001 n=10+11)
HTTPClientServer-16          157µs ± 7%     160µs ± 5%     ~     (p=0.133 n=11+11)
JSONEncode-16               22.3ms ± 4%    23.2ms ± 4%   +3.79%  (p=0.000 n=11+11)
JSONDecode-16               96.7ms ± 3%    96.6ms ± 1%     ~     (p=0.562 n=11+11)
Mandelbrot200-16            6.42ms ± 1%    6.40ms ± 1%     ~     (p=0.365 n=11+11)
GoParse-16                  5.59ms ± 7%    5.42ms ± 5%   -3.07%  (p=0.020 n=11+10)
RegexpMatchEasy0_32-16       113ns ± 2%     113ns ± 3%     ~     (p=0.968 n=11+10)
RegexpMatchEasy0_1K-16       417ns ± 1%     416ns ± 3%     ~     (p=0.742 n=11+10)
RegexpMatchEasy1_32-16       106ns ± 1%     107ns ± 3%     ~     (p=0.223 n=11+11)
RegexpMatchEasy1_1K-16       654ns ± 2%     657ns ± 1%     ~     (p=0.672 n=11+8)
RegexpMatchMedium_32-16      176ns ± 3%     177ns ± 1%     ~     (p=0.664 n=11+9)
RegexpMatchMedium_1K-16     56.3µs ± 3%    56.7µs ± 3%     ~     (p=0.171 n=11+11)
RegexpMatchHard_32-16       2.83µs ± 5%    2.83µs ± 4%     ~     (p=0.735 n=11+11)
RegexpMatchHard_1K-16       82.7µs ± 2%    82.7µs ± 2%     ~     (p=0.853 n=10+10)
Revcomp-16                   679ms ± 9%     782ms ±29%  +15.16%  (p=0.031 n=9+11)
Template-16                  118ms ± 1%     109ms ± 2%   -7.49%  (p=0.000 n=11+11)
TimeParse-16                 474ns ± 1%     462ns ± 1%   -2.59%  (p=0.000 n=11+11)
TimeFormat-16                482ns ± 1%     494ns ± 1%   +2.49%  (p=0.000 n=10+11)

name                      old speed      new speed      delta
GobDecode-16              68.7MB/s ± 4%  69.8MB/s ± 3%   +1.52%  (p=0.022 n=10+9)
GobEncode-16              77.6MB/s ± 6%  78.3MB/s ± 5%     ~     (p=0.699 n=11+11)
Gzip-16                   57.2MB/s ± 1%  57.3MB/s ± 1%     ~     (p=0.428 n=11+11)
Gunzip-16                  301MB/s ± 2%   298MB/s ± 1%   -1.07%  (p=0.007 n=11+11)
JSONEncode-16             86.9MB/s ± 4%  83.7MB/s ± 4%   -3.63%  (p=0.000 n=11+11)
JSONDecode-16             20.1MB/s ± 3%  20.1MB/s ± 1%     ~     (p=0.529 n=11+11)
GoParse-16                10.4MB/s ± 6%  10.7MB/s ± 4%   +3.12%  (p=0.020 n=11+10)
RegexpMatchEasy0_32-16     282MB/s ± 2%   282MB/s ± 3%     ~     (p=0.756 n=11+10)
RegexpMatchEasy0_1K-16    2.45GB/s ± 1%  2.46GB/s ± 2%     ~     (p=0.705 n=11+10)
RegexpMatchEasy1_32-16     299MB/s ± 1%   297MB/s ± 2%     ~     (p=0.151 n=11+11)
RegexpMatchEasy1_1K-16    1.56GB/s ± 2%  1.56GB/s ± 1%     ~     (p=0.717 n=11+8)
RegexpMatchMedium_32-16   5.67MB/s ± 4%  5.63MB/s ± 1%     ~     (p=0.538 n=11+9)
RegexpMatchMedium_1K-16   18.2MB/s ± 3%  18.1MB/s ± 3%     ~     (p=0.156 n=11+11)
RegexpMatchHard_32-16     11.3MB/s ± 5%  11.3MB/s ± 4%     ~     (p=0.711 n=11+11)
RegexpMatchHard_1K-16     12.4MB/s ± 1%  12.4MB/s ± 2%     ~     (p=0.535 n=9+10)
Revcomp-16                 370MB/s ± 5%   332MB/s ±24%     ~     (p=0.062 n=8+11)
Template-16               16.5MB/s ± 1%  17.8MB/s ± 2%   +8.11%  (p=0.000 n=11+11)

Change-Id: I41e46f375ee127785c6491f7ef5bd35581261ae6
Reviewed-on: https://go-review.googlesource.com/104039
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:37:58 +00:00
Josh Bleecher Snyder
3a9e4440fd cmd/compile: tighten after lowering
Moving tighten after lowering benefits from the removal of values by
lowering and lowered CSE. It lets us make better decisions about
which values are rematerializable and which generate flags.
Empirically, it lowers stack usage (by avoiding spills)
and generates slightly smaller and faster binaries.


Fixes #19853
Fixes #21041

name        old time/op       new time/op       delta
Template          195ms ± 4%        193ms ± 4%  -1.33%  (p=0.000 n=92+97)
Unicode          94.1ms ± 9%       92.5ms ± 8%  -1.66%  (p=0.002 n=97+95)
GoTypes           572ms ± 5%        566ms ± 7%  -0.92%  (p=0.001 n=95+98)
Compiler          2.56s ± 4%        2.52s ± 3%  -1.41%  (p=0.000 n=94+97)
SSA               6.52s ± 2%        6.47s ± 3%  -0.82%  (p=0.000 n=96+94)
Flate             117ms ± 5%        116ms ± 7%  -0.72%  (p=0.018 n=97+97)
GoParser          148ms ± 6%        146ms ± 4%  -0.97%  (p=0.002 n=98+95)
Reflect           370ms ± 7%        363ms ± 6%  -1.79%  (p=0.000 n=99+98)
Tar               175ms ± 6%        173ms ± 6%  -1.11%  (p=0.001 n=94+95)
XML               204ms ± 6%        201ms ± 5%  -1.49%  (p=0.000 n=97+96)
[Geo mean]        363ms             359ms       -1.22%

name        old user-time/op  new user-time/op  delta
Template          251ms ± 5%        245ms ± 5%  -2.40%  (p=0.000 n=97+93)
Unicode           131ms ±10%        128ms ± 9%  -1.93%  (p=0.001 n=100+99)
GoTypes           760ms ± 4%        752ms ± 4%  -0.96%  (p=0.000 n=97+95)
Compiler          3.51s ± 3%        3.48s ± 2%  -1.04%  (p=0.000 n=96+95)
SSA               9.57s ± 4%        9.52s ± 2%  -0.50%  (p=0.004 n=97+96)
Flate             149ms ± 6%        147ms ± 6%  -1.46%  (p=0.000 n=98+96)
GoParser          184ms ± 5%        181ms ± 7%  -1.84%  (p=0.000 n=98+97)
Reflect           469ms ± 6%        461ms ± 6%  -1.69%  (p=0.000 n=100+98)
Tar               219ms ± 8%        217ms ± 7%  -0.90%  (p=0.035 n=96+96)
XML               255ms ± 5%        251ms ± 6%  -1.48%  (p=0.000 n=98+98)
[Geo mean]        476ms             469ms       -1.42%

name        old alloc/op      new alloc/op      delta
Template         37.8MB ± 0%       37.8MB ± 0%  -0.17%  (p=0.000 n=100+100)
Unicode          28.8MB ± 0%       28.8MB ± 0%  -0.02%  (p=0.000 n=100+95)
GoTypes           112MB ± 0%        112MB ± 0%  -0.20%  (p=0.000 n=100+97)
Compiler          466MB ± 0%        464MB ± 0%  -0.27%  (p=0.000 n=100+100)
SSA              1.49GB ± 0%       1.49GB ± 0%  -0.08%  (p=0.000 n=100+99)
Flate            24.4MB ± 0%       24.3MB ± 0%  -0.25%  (p=0.000 n=98+99)
GoParser         30.7MB ± 0%       30.6MB ± 0%  -0.26%  (p=0.000 n=99+100)
Reflect          76.4MB ± 0%       76.4MB ± 0%    ~     (p=0.253 n=100+100)
Tar              38.9MB ± 0%       38.8MB ± 0%  -0.20%  (p=0.000 n=100+97)
XML              41.5MB ± 0%       41.4MB ± 0%  -0.19%  (p=0.000 n=100+98)
[Geo mean]       77.5MB            77.4MB       -0.16%

name        old allocs/op     new allocs/op     delta
Template           381k ± 0%         381k ± 0%  -0.15%  (p=0.000 n=100+100)
Unicode            342k ± 0%         342k ± 0%  -0.01%  (p=0.000 n=100+98)
GoTypes           1.19M ± 0%        1.18M ± 0%  -0.24%  (p=0.000 n=100+100)
Compiler          4.52M ± 0%        4.50M ± 0%  -0.29%  (p=0.000 n=100+100)
SSA               12.3M ± 0%        12.3M ± 0%  -0.11%  (p=0.000 n=100+100)
Flate              234k ± 0%         234k ± 0%  -0.26%  (p=0.000 n=99+96)
GoParser           318k ± 0%         317k ± 0%  -0.21%  (p=0.000 n=99+100)
Reflect            974k ± 0%         974k ± 0%  -0.03%  (p=0.000 n=100+100)
Tar                392k ± 0%         391k ± 0%  -0.17%  (p=0.000 n=100+99)
XML                404k ± 0%         403k ± 0%  -0.24%  (p=0.000 n=99+99)
[Geo mean]         794k              792k       -0.17%

name        old object-bytes  new object-bytes  delta
Template          393kB ± 0%        392kB ± 0%  -0.19%  (p=0.008 n=5+5)
Unicode           207kB ± 0%        207kB ± 0%    ~     (all equal)
GoTypes          1.23MB ± 0%       1.22MB ± 0%  -0.11%  (p=0.008 n=5+5)
Compiler         4.34MB ± 0%       4.33MB ± 0%  -0.15%  (p=0.008 n=5+5)
SSA              9.85MB ± 0%       9.85MB ± 0%  -0.07%  (p=0.008 n=5+5)
Flate             235kB ± 0%        234kB ± 0%  -0.59%  (p=0.008 n=5+5)
GoParser          297kB ± 0%        296kB ± 0%  -0.22%  (p=0.008 n=5+5)
Reflect          1.03MB ± 0%       1.03MB ± 0%  -0.00%  (p=0.008 n=5+5)
Tar               332kB ± 0%        331kB ± 0%  -0.15%  (p=0.008 n=5+5)
XML               413kB ± 0%        412kB ± 0%  -0.19%  (p=0.008 n=5+5)
[Geo mean]        728kB             727kB       -0.17%

Change-Id: I9b5cdb668ed102a001897a05e833105acba220a2
Reviewed-on: https://go-review.googlesource.com/95995
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 00:03:24 +00:00
philhofer
2d0172c3a7 cmd/compile/internal/ssa: emit csel on arm64
Introduce a new SSA pass to generate CondSelect intstrutions,
and add CondSelect lowering rules for arm64.

In order to make the CSEL instruction easier to optimize,
and to simplify the introduction of CSNEG, CSINC, and CSINV
in the future, modify the CSEL instruction to accept a condition
code in the aux field.

Notably, this change makes the go1 Gzip benchmark
more than 10% faster.

Benchmarks on a Cavium ThunderX:

name                      old time/op    new time/op    delta
BinaryTree17-96              15.9s ± 6%     16.0s ± 4%     ~     (p=0.968 n=10+9)
Fannkuch11-96                7.17s ± 0%     7.00s ± 0%   -2.43%  (p=0.000 n=8+9)
FmtFprintfEmpty-96           208ns ± 1%     207ns ± 0%     ~     (p=0.152 n=10+8)
FmtFprintfString-96          379ns ± 0%     375ns ± 0%   -0.95%  (p=0.000 n=10+9)
FmtFprintfInt-96             385ns ± 0%     383ns ± 0%   -0.52%  (p=0.000 n=9+10)
FmtFprintfIntInt-96          591ns ± 0%     586ns ± 0%   -0.85%  (p=0.006 n=7+9)
FmtFprintfPrefixedInt-96     656ns ± 0%     667ns ± 0%   +1.71%  (p=0.000 n=10+10)
FmtFprintfFloat-96           967ns ± 0%     984ns ± 0%   +1.78%  (p=0.000 n=10+10)
FmtManyArgs-96              2.35µs ± 0%    2.25µs ± 0%   -4.63%  (p=0.000 n=9+8)
GobDecode-96                31.0ms ± 0%    30.8ms ± 0%   -0.36%  (p=0.006 n=9+9)
GobEncode-96                24.4ms ± 0%    24.5ms ± 0%   +0.30%  (p=0.000 n=9+9)
Gzip-96                      1.60s ± 0%     1.43s ± 0%  -10.58%  (p=0.000 n=9+10)
Gunzip-96                    167ms ± 0%     169ms ± 0%   +0.83%  (p=0.000 n=8+9)
HTTPClientServer-96          311µs ± 1%     308µs ± 0%   -0.75%  (p=0.000 n=10+10)
JSONEncode-96               65.0ms ± 0%    64.8ms ± 0%   -0.25%  (p=0.000 n=9+8)
JSONDecode-96                262ms ± 1%     261ms ± 1%     ~     (p=0.579 n=10+10)
Mandelbrot200-96            18.0ms ± 0%    18.1ms ± 0%   +0.17%  (p=0.000 n=8+10)
GoParse-96                  14.0ms ± 0%    14.1ms ± 1%   +0.42%  (p=0.003 n=9+10)
RegexpMatchEasy0_32-96       644ns ± 2%     645ns ± 2%     ~     (p=0.836 n=10+10)
RegexpMatchEasy0_1K-96      3.70µs ± 0%    3.49µs ± 0%   -5.58%  (p=0.000 n=10+10)
RegexpMatchEasy1_32-96       662ns ± 2%     657ns ± 2%     ~     (p=0.137 n=10+10)
RegexpMatchEasy1_1K-96      4.47µs ± 0%    4.31µs ± 0%   -3.48%  (p=0.000 n=10+10)
RegexpMatchMedium_32-96      844ns ± 2%     849ns ± 1%     ~     (p=0.208 n=10+10)
RegexpMatchMedium_1K-96      179µs ± 0%     182µs ± 0%   +1.20%  (p=0.000 n=10+10)
RegexpMatchHard_32-96       10.0µs ± 0%    10.1µs ± 0%   +0.48%  (p=0.000 n=10+9)
RegexpMatchHard_1K-96        297µs ± 0%     297µs ± 0%   -0.14%  (p=0.000 n=10+10)
Revcomp-96                   3.08s ± 0%     3.13s ± 0%   +1.56%  (p=0.000 n=9+9)
Template-96                  276ms ± 2%     275ms ± 1%     ~     (p=0.393 n=10+10)
TimeParse-96                1.37µs ± 0%    1.36µs ± 0%   -0.53%  (p=0.000 n=10+7)
TimeFormat-96               1.40µs ± 0%    1.42µs ± 0%   +0.97%  (p=0.000 n=10+10)
[Geo mean]                   264µs          262µs        -0.77%

Change-Id: Ie54eee4b3092af53e6da3baa6d1755098f57f3a2
Reviewed-on: https://go-review.googlesource.com/55670
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-20 06:00:54 +00:00
Geoff Berry
75f0ad705f cmd/compile/internal/ssa: group dump files alphabetically
Change dump file names to group them alphabetically in directory
listings, in pass run order.

Change-Id: I8070578a5b4a3a7983dcc527ea1cfdb10a6d7d24
Reviewed-on: https://go-review.googlesource.com/83958
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-12-14 21:22:04 +00:00
Vladimir Stefanovic
6be1c09e19 cmd/compile: use soft-float routines for soft-float targets
Updates #18162 (mostly fixes)

Change-Id: I35bcb8a688bdaa432adb0ddbb73a2f7adda47b9e
Reviewed-on: https://go-review.googlesource.com/37958
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-30 17:37:37 +00:00
David Chase
38c725b148 cmd/compile: repair name propagation into aggregate parts
For structs, slices, strings, interfaces, etc, propagation of
names to their components (e.g., complex.real, complex.imag)
is fragile (depends on phase ordering) and not done right
for the "dec" pass.

The dec pass is subsumed into decomposeBuiltin,
and then names are pushed into the args of all
OpFooMake opcodes.

compile/ssa/debug_test.go was fixed to pay attention to
variable values, and the reference files include checks
for the fixes in this CL (which make debugging better).

Change-Id: Ic2591ebb1698d78d07292b92c53667e6c37fa0cd
Reviewed-on: https://go-review.googlesource.com/73210
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2017-11-05 17:30:11 +00:00
Michael Munday
744ebfde04 cmd/compile: eliminate stores to unread auto variables
This is a crude compiler pass to eliminate stores to auto variables
that are only ever written to.

Eliminates an unnecessary store to x from the following code:

func f() int {
	var x := 1
	return *(&x)
}

Fixes #19765.

Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa
Reviewed-on: https://go-review.googlesource.com/38746
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-08-24 16:53:56 +00:00
Josh Bleecher Snyder
e5c9358fe2 cmd/compile: move writebarrier pass after dse
This avoids generating writeBarrier.enabled
blocks for dead stores.

Change-Id: Ib11d8e2ba952f3f1f01d16776e40a7200a7683cf
Reviewed-on: https://go-review.googlesource.com/42012
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-29 16:37:02 +00:00
Keith Randall
39ce5907ca cmd/compile: rotate loops so conditional branch is at the end
Old loops look like this:
   loop:
     CMPQ ...
     JGE exit
     ...
     JMP loop
   exit:

New loops look like this:
    JMP entry
  loop:
    ...
  entry:
    CMPQ ...
    JLT loop

This removes one instruction (the unconditional jump) from
the inner loop.
Kinda surprisingly, it matters.

This is a bit different than the peeling that the old obj
library did in that we don't duplicate the loop exit test.
We just jump to the test.  I'm not sure if it is better or
worse to do that (peeling gets rid of the JMP but means more
code duplication), but this CL is certainly a much simpler
compiler change, so I'll try this way first.

The obj library used to do peeling before
CL https://go-review.googlesource.com/c/36205 turned it off.

Fixes #15837 (remove obj instruction reordering)
The reordering is already removed, this CL implements the only
part of that reordering that we'd like to keep.

Fixes #14758 (append loop)
name    old time/op    new time/op    delta
Foo-12     817ns ± 4%     538ns ± 0%  -34.08%   (p=0.000 n=10+9)
Bar-12     850ns ±11%     570ns ±13%  -32.88%  (p=0.000 n=10+10)

Update #19595 (BLAS slowdown)
name                       old time/op  new time/op  delta
DgemvMedMedNoTransIncN-12  13.2µs ± 9%  10.2µs ± 1%  -22.26%  (p=0.000 n=9+9)

Fixes #19633 (append loop)
name    old time/op    new time/op    delta
Foo-12     810ns ± 1%     540ns ± 0%  -33.30%   (p=0.000 n=8+9)

Update #18977 (Fannkuch11 regression)
name         old time/op    new time/op    delta
Fannkuch11-8                2.80s ± 0%     3.01s ± 0%  +7.47%   (p=0.000 n=9+10)
This one makes no sense.  There's strictly 1 less instruction in the
inner loop (17 instead of 18).  They are exactly the same instructions
except for the JMP that has been elided.

go1 benchmarks generally don't look very impressive.  But the gains for the
specific issues above make this CL still probably worth it.
name                      old time/op    new time/op    delta
BinaryTree17-8              2.32s ± 0%     2.34s ± 0%  +1.14%    (p=0.000 n=9+7)
Fannkuch11-8                2.80s ± 0%     3.01s ± 0%  +7.47%   (p=0.000 n=9+10)
FmtFprintfEmpty-8          44.1ns ± 1%    46.1ns ± 1%  +4.53%  (p=0.000 n=10+10)
FmtFprintfString-8         67.8ns ± 0%    74.4ns ± 1%  +9.80%   (p=0.000 n=10+9)
FmtFprintfInt-8            74.9ns ± 0%    78.4ns ± 0%  +4.67%   (p=0.000 n=8+10)
FmtFprintfIntInt-8          117ns ± 1%     123ns ± 1%  +4.69%   (p=0.000 n=9+10)
FmtFprintfPrefixedInt-8     160ns ± 1%     146ns ± 0%  -8.22%   (p=0.000 n=8+10)
FmtFprintfFloat-8           214ns ± 0%     206ns ± 0%  -3.91%    (p=0.000 n=8+8)
FmtManyArgs-8               468ns ± 0%     497ns ± 1%  +6.09%   (p=0.000 n=8+10)
GobDecode-8                6.16ms ± 0%    6.21ms ± 1%  +0.76%   (p=0.000 n=9+10)
GobEncode-8                4.90ms ± 0%    4.92ms ± 1%  +0.37%   (p=0.028 n=9+10)
Gzip-8                      209ms ± 0%     212ms ± 0%  +1.33%  (p=0.000 n=10+10)
Gunzip-8                   36.6ms ± 0%    38.0ms ± 1%  +4.03%    (p=0.000 n=9+9)
HTTPClientServer-8         84.2µs ± 0%    86.0µs ± 1%  +2.14%    (p=0.000 n=9+9)
JSONEncode-8               13.6ms ± 3%    13.8ms ± 1%  +1.55%   (p=0.003 n=9+10)
JSONDecode-8               53.2ms ± 5%    52.9ms ± 0%    ~     (p=0.280 n=10+10)
Mandelbrot200-8            3.78ms ± 0%    3.78ms ± 1%    ~      (p=0.661 n=10+9)
GoParse-8                  2.89ms ± 0%    2.94ms ± 2%  +1.50%  (p=0.000 n=10+10)
RegexpMatchEasy0_32-8      68.5ns ± 2%    68.9ns ± 1%    ~     (p=0.136 n=10+10)
RegexpMatchEasy0_1K-8       220ns ± 1%     225ns ± 1%  +2.41%  (p=0.000 n=10+10)
RegexpMatchEasy1_32-8      64.7ns ± 0%    64.5ns ± 0%  -0.28%  (p=0.042 n=10+10)
RegexpMatchEasy1_1K-8       348ns ± 1%     355ns ± 0%  +1.90%  (p=0.000 n=10+10)
RegexpMatchMedium_32-8      102ns ± 1%     105ns ± 1%  +2.95%  (p=0.000 n=10+10)
RegexpMatchMedium_1K-8     33.1µs ± 3%    32.5µs ± 0%  -1.75%  (p=0.000 n=10+10)
RegexpMatchHard_32-8       1.71µs ± 1%    1.70µs ± 1%  -0.84%   (p=0.002 n=10+9)
RegexpMatchHard_1K-8       51.1µs ± 0%    50.8µs ± 1%  -0.48%  (p=0.004 n=10+10)
Revcomp-8                   411ms ± 1%     402ms ± 0%  -2.22%   (p=0.000 n=10+9)
Template-8                 61.8ms ± 1%    59.7ms ± 0%  -3.44%    (p=0.000 n=9+9)
TimeParse-8                 306ns ± 0%     318ns ± 0%  +3.83%  (p=0.000 n=10+10)
TimeFormat-8                320ns ± 0%     318ns ± 1%  -0.53%   (p=0.012 n=7+10)

Change-Id: Ifaf29abbe5874e437048e411ba8f7cfbc9e1c94b
Reviewed-on: https://go-review.googlesource.com/38431
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-04-24 22:13:36 +00:00
Matthew Dempsky
1e3570ac86 cmd/internal/objabi: extract shared functionality from obj
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.

There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.

objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.

Fixes #15165.
Fixes #20026.

Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-19 00:00:09 +00:00
Josh Bleecher Snyder
2cdb7f118a cmd/compile: move Frontend field from ssa.Config to ssa.Func
Suggested by mdempsky in CL 38232.
This allows us to use the Frontend field
to associate frontend state and information
with a function.
See the following CL in the series for examples.

This is a giant CL, but it is almost entirely routine refactoring.

The ssa test API is starting to feel a bit unwieldy.
I will clean it up separately, once the dust has settled.

Passes toolstash -cmp.

Updates #15756

Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf
Reviewed-on: https://go-review.googlesource.com/38327
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-17 23:18:57 +00:00
Josh Bleecher Snyder
a5e3cac895 cmd/compile: rearrange fields between ssa.Func, ssa.Cache, and ssa.Config
This makes ssa.Func, ssa.Cache, and ssa.Config fulfill
the roles laid out for them in CL 38160.

The only non-trivial change in this CL is how cached
values and blocks get IDs. Prior to this CL, their IDs were
assigned as part of resetting the cache, and only modified
IDs were reset. This required knowing how many values and
blocks were modified, which required a tight coupling between
ssa.Func and ssa.Config. To eliminate that coupling,
we now zero values and blocks during reset,
and assign their IDs when they are used.
Since unused values and blocks have ID == 0,
we can efficiently find the last used value/block,
to avoid zeroing everything.
Bulk zeroing is efficient, but not efficient enough
to obviate the need to avoid zeroing everything every time.
As a happy side-effect, ssa.Func.Free is no longer necessary.

DebugHashMatch and friends now belong in func.go.
They have been left in place for clarity and review.
I will move them in a subsequent CL.

Passes toolstash -cmp. No compiler performance impact.
No change in 'go test cmd/compile/internal/ssa' execution time.

Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098
Reviewed-on: https://go-review.googlesource.com/38167
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-17 05:21:42 +00:00
Austin Clements
8eb14e9de5 cmd/compile: accept string debug flags
The compiler's -d flag accepts string-valued flags, but currently only
for SSA debug flags. Extend it to support string values for other
flags. This also makes the syntax somewhat more sane so flag=value and
flag:value now both accept integers and strings.

Change-Id: Idd144d8479a430970cc1688f824bffe0a56ed2df
Reviewed-on: https://go-review.googlesource.com/37345
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-03-03 15:50:49 +00:00
Russ Cox
47ce87877b all: merge dev.inline into master
Change-Id: I7715581a04e513dcda9918e853fa6b1ddc703770
2017-02-01 09:47:23 -05:00
Robert Griesemer
472c792e0a [dev.inline] cmd/internal/src: introduce compact source position representation
XPos is a compact (8 instead of 16 bytes on a 64bit machine) source
position representation. There is a 1:1 correspondence between each
XPos and each regular Pos, translated via a global table.

In some sense this brings back the LineHist, though positions can
track line and column information; there is a O(1) translation
between the representations (no binary search), and the translation
is factored out.

The size increase with the prior change is brought down again and
the compiler speed is in line with the master repo (measured on
the same "quiet" machine as for prior change):

name       old time/op     new time/op     delta
Template       256ms ± 1%      262ms ± 2%    ~             (p=0.063 n=5+4)
Unicode        132ms ± 1%      135ms ± 2%    ~             (p=0.063 n=5+4)
GoTypes        891ms ± 1%      871ms ± 1%  -2.28%          (p=0.016 n=5+4)
Compiler       3.84s ± 2%      3.89s ± 2%    ~             (p=0.413 n=5+4)
MakeBash       47.1s ± 1%      46.2s ± 2%    ~             (p=0.095 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        309M ± 1%       314M ± 2%    ~             (p=0.111 n=5+4)
Unicode         165M ± 1%       172M ± 9%    ~             (p=0.151 n=5+5)
GoTypes        1.14G ± 2%      1.12G ± 1%    ~             (p=0.063 n=5+4)
Compiler       5.00G ± 1%      4.96G ± 1%    ~             (p=0.286 n=5+4)

Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47
Reviewed-on: https://go-review.googlesource.com/34506
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-09 22:43:22 +00:00
David Chase
7f1ff65c39 cmd/compile: insert scheduling checks on loop backedges
Loop breaking with a counter.  Benchmarked (see comments),
eyeball checked for sanity on popular loops.  This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.

Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test.  Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.

If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop.  Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.

This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.

Updates #17831.
Updates #10958.

Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09 21:01:29 +00:00
Robert Griesemer
eaca0e0529 [dev.inline] cmd/internal/src: introduce NoPos and use it instead Pos{}
Using a variable instead of a composite literal makes
the code independent of implementation changes of Pos.

Per David Lazar's suggestion.

Change-Id: I336967ac12a027c51a728a58ac6207cb5119af4a
Reviewed-on: https://go-review.googlesource.com/34148
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-12-09 00:35:07 +00:00