133 Commits

Author SHA1 Message Date
Cherry Zhang
a444458112 cmd/compile: make sure linkname'd symbol is non-package
When a variable symbol is both imported (possibly through
inlining) and linkname'd, make sure its LSym is marked as
non-package for symbol indexing in the object file, so it is
resolved by name and dedup'd with the original definition.

Fixes #42401.

Change-Id: I8e90c0418c6f46a048945c5fdc06c022b77ed68d
Reviewed-on: https://go-review.googlesource.com/c/go/+/268178
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
2020-11-09 16:09:16 +00:00
Cherry Zhang
84d7a85089 cmd/compile: delete register maps, completely
Remove go115ReduceLiveness feature gating flag, along with code
that only needed when go115ReduceLiveness is false.

Change-Id: I7571913cc74cbd17b330a0ee0160fefc9eeee66e
Reviewed-on: https://go-review.googlesource.com/c/go/+/264338
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-10-30 20:55:43 +00:00
Alberto Donizetti
3bac5faa4a cmd/compile: make gc debug flags collector a struct
gc debug flags are currently stored in a 256-long array, that is then
addressed using the ASCII numeric value of the flag itself (a quirk
inherited from the old C compiler). It is also a little wasteful,
since we only define 16 flags, and the other 240 array elements are
always empty.

This change makes Debug a struct, which also provides static checking
that we're not referencing flags that does not exist.

Change-Id: I2f0dfef2529325514b3398cf78635543cdf48fe0
Reviewed-on: https://go-review.googlesource.com/c/go/+/263539
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-10-22 09:33:46 +00:00
Russ Cox
912262b806 cmd/internal/obj: move LSym.Func into LSym.Extra
This creates space for a different kind of extension field
in LSym without making the struct any larger.
(There are many LSym, so we care about keeping the struct small.)

Change-Id: Ib16edb9e15f54c2a7351c8b875e19684058711e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/243943
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-16 03:02:36 +00:00
Keith Randall
9e70564f63 cmd/compile,cmd/asm: simplify recording of branch targets, take 2
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.

This is a cleanup CL in preparation for some stack frame CLs.

Change-Id: If8239177e4b1ea2bccd0608eb39553d23210d405
Reviewed-on: https://go-review.googlesource.com/c/go/+/251437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-31 17:36:08 +00:00
Keith Randall
26ad27bb02 Revert "cmd/compile,cmd/asm: simplify recording of branch targets"
This reverts CL 243318.

Reason for revert: Seems to be crashing some builders.

Change-Id: I2ffc59bc5535be60b884b281c8d0eff4647dc756
Reviewed-on: https://go-review.googlesource.com/c/go/+/251169
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-08-28 02:10:13 +00:00
Keith Randall
8247da3662 cmd/compile,cmd/asm: simplify recording of branch targets
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.

This is a cleanup CL in preparation for some stack frame CLs.

Change-Id: I9055bf0a1d986aff421e47951a1dedc301c846f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/243318
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-27 22:35:45 +00:00
Keith Randall
623652e73f cmd/compile: make Haspointers a method instead of a function
More ergonomic that way. Also change Haspointers to HasPointers
while we are here.

Change-Id: I45bedc294c1a8c2bd01dc14bd04615ae77555375
Reviewed-on: https://go-review.googlesource.com/c/go/+/249959
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-08-23 18:33:55 +00:00
Tobias Klauser
bffb8818e7 all: fix dead links to inferno-os bitbucket repository
Generated using:

  perl -i -npe 's#inferno-os/src/default#inferno-os/src/master#' $(git grep -l "inferno-os/src/default" | grep -v vendor)

Change-Id: I4b6443bd09a8ea4c8aaeb40a1c73520d1f7ca648
Reviewed-on: https://go-review.googlesource.com/c/go/+/235821
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-06-04 07:25:06 +00:00
Austin Clements
9d812cfa5c cmd/compile,runtime: stack maps only at calls, remove register maps
Currently, we emit stack maps and register maps at almost every
instruction. This was originally intended to support non-cooperative
preemption, but was only ever used for debug call injection. Now debug
call injection also uses conservative frame scanning. As a result,
stack maps are only needed at call sites and register maps aren't
needed at all except that we happen to also encode unsafe-point
information in the register map PCDATA stream.

This CL reduces stack maps to only appear at calls, and replace full
register maps with just safe/unsafe-point information.

This is all protected by the go115ReduceLiveness feature flag, which
is defined in both runtime and cmd/compile.

This CL significantly reduces binary sizes and also speeds up compiles
and links:

name                      old exe-bytes     new exe-bytes     delta
BinGoSize                      15.0MB ± 0%       14.1MB ± 0%   -5.72%

name                      old pcln-bytes    new pcln-bytes    delta
BinGoSize                      3.14MB ± 0%       2.48MB ± 0%  -21.08%

name                      old time/op       new time/op       delta
Template                        178ms ± 7%        172ms ±14%  -3.59%  (p=0.005 n=19+19)
Unicode                        71.0ms ±12%       69.8ms ±10%    ~     (p=0.126 n=18+18)
GoTypes                         655ms ± 8%        615ms ± 8%  -6.11%  (p=0.000 n=19+19)
Compiler                        3.27s ± 6%        3.15s ± 7%  -3.69%  (p=0.001 n=20+20)
SSA                             7.10s ± 5%        6.85s ± 8%  -3.53%  (p=0.001 n=19+20)
Flate                           124ms ±15%        116ms ±22%  -6.57%  (p=0.024 n=18+19)
GoParser                        156ms ±26%        147ms ±34%    ~     (p=0.070 n=19+19)
Reflect                         406ms ± 9%        387ms ±21%  -4.69%  (p=0.028 n=19+20)
Tar                             163ms ±15%        162ms ±27%    ~     (p=0.370 n=19+19)
XML                             223ms ±13%        218ms ±14%    ~     (p=0.157 n=20+20)
LinkCompiler                    503ms ±21%        484ms ±23%    ~     (p=0.072 n=20+20)
ExternalLinkCompiler            1.27s ± 7%        1.22s ± 8%  -3.85%  (p=0.005 n=20+19)
LinkWithoutDebugCompiler        294ms ±17%        273ms ±11%  -7.16%  (p=0.001 n=19+18)

(https://perf.golang.org/search?q=upload:20200428.8)

The binary size improvement is even slightly better when you include
the CLs leading up to this. Relative to the parent of "cmd/compile:
mark PanicBounds/Extend as calls":

name                      old exe-bytes     new exe-bytes     delta
BinGoSize                      15.0MB ± 0%       14.1MB ± 0%   -6.18%

name                      old pcln-bytes    new pcln-bytes    delta
BinGoSize                      3.22MB ± 0%       2.48MB ± 0%  -22.92%

(https://perf.golang.org/search?q=upload:20200428.9)

For #36365.

Change-Id: I69448e714f2a44430067ca97f6b78e08c0abed27
Reviewed-on: https://go-review.googlesource.com/c/go/+/230544
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-29 21:29:21 +00:00
Austin Clements
faafdf5115 cmd/compile: fix unsafe-points with stack maps
The compiler currently conflates whether a Value has a stack map with
whether it's an unsafe point. For the most part, unsafe-points don't
have stack maps, so this is mostly fine, but call instructions can be
both an unsafe-point *and* have a stack map. For example, none of the
instructions in a nosplit function should be preemptible, but calls
must still have stack maps in case the called function grows the stack
or get preempted.

Currently, the compiler can't distinguish this case, so calls in
nosplit functions are marked as safe-points just because they have
stack maps. This is particularly problematic if a nosplit function
calls another nosplit function, since this can introduce a preemption
point where there should be none.

We realized this was a problem for split-stack prologues a while back,
and CL 207349 changed the encoding of unsafe-points to use the
register map index instead of the stack map index so we could record
both a stack map and an unsafe-point at the same instruction. But this
was never extended into the compiler.

This CL fixes this problem in the compiler. We make LivenessIndex
slightly more abstract by separating unsafe-point marks from stack and
register map indexes. We map this to the PCDATA encoding later when
producing Progs. This isn't enough to fix the whole problem for
nosplit functions, because obj still adds prologues and marks those as
preemptible, but it's a step in the right direction.

I checked this CL by comparing maps before and after this change in
the runtime and net/http. In net/http, unsafe-points match exactly; at
anything that isn't an unsafe-point, both the stack and register maps
are unchanged by this CL. In the runtime, at every point that was a
safe-point before this change, the stack maps agree (and mostly the
runtime doesn't have register maps at all now). In both, all CALLs
(except write barrier calls) have stack maps.

For #36365.

Change-Id: I066628938b02e78be5c81a6614295bcf7cc566c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/230541
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-29 21:29:17 +00:00
Matthew Dempsky
e341e93c51 cmd/compile, cmd/link: add coverage instrumentation for libfuzzer
This CL adds experimental coverage instrumentation similar to what
github.com/dvyukov/go-fuzz produces in its -libfuzzer mode. The
coverage can be enabled by compiling with -d=libfuzzer. It's intended
to be used in conjunction with -buildmode=c-archive to produce an ELF
archive (.a) file that can be linked with libFuzzer. See #14565 for
example usage.

The coverage generates a unique 8-bit counter for each basic block in
the original source code, and emits an increment operation. These
counters are then collected into the __libfuzzer_extra_counters ELF
section for use by libFuzzer.

Updates #14565.

Change-Id: I239758cc0ceb9ca1220f2d9d3d23b9e761db9bf1
Reviewed-on: https://go-review.googlesource.com/c/go/+/202117
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-11-05 00:00:36 +00:00
Austin Clements
61fa79885b cmd/compile: fix missing unsafe-points
Currently, the compiler fails to mark any unsafe-points in the initial
instructions of a function as unsafe points. This happens because
unsafe points are encoded as a stack map index of -2 and the compiler
emits PCDATA instructions when there's a change in the stack map
index, but I had set the initial stack map index to -2. The actual
initial PCDATA value assumed by the PCDATA encoder and the runtime is
-1. Hence, if the first instructions had a stack map index of -2, no
PCDATA was emitted, which cause the runtime to assume the index was -1
instead.

This was particularly problematic in the runtime, where the compiler
was supposed to mark only calls as safe-points and everything else as
unsafe-points. Runtime leaf functions, for example, should have been
marked as entirely unsafe-points, but were instead marked entirely as
safe-points.

Fix this by making the PCDATA instruction generator assume the initial
PCDATA value is -1 instead of -2, so it will emit a PCDATA instruction
right away if the first real instruction is an unsafe-point.

This increases the size of the cmd/go binary by 0.02% since we now
emit slightly more PCDATA than before.

For #10958, #24543.

Change-Id: I92222107f799130072b36d49098d2686f1543699
Reviewed-on: https://go-review.googlesource.com/c/go/+/202084
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-02 21:51:09 +00:00
LE Manh Cuong
515bb0129d cmd/compile: remove isfat from order expr
isfat was removed in walkexpr in CL 32313. For consistency,
remove it from order expr, too.

Change-Id: I0a47e0da13ba0168d6a055d990b8efad26ad790d
Reviewed-on: https://go-review.googlesource.com/c/go/+/179057
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-08-28 19:47:04 +00:00
Austin Clements
74d92db8d7 cmd/dist,cmd/compile: remove -allabis mode
dist passes the -allabis flag to the compiler to avoid having to
recreate the cross-package ABI logic from cmd/go. However, we removed
that logic from cmd/go in CL 179863 and replaced it with a different
mechanism that doesn't depend on the build system. Hence, passing
-allabis in dist is no longer necessary.

This CL removes -allabis from dist and, since that was the only use of
it, removes support for it from the compiler as well.

Updates #31230.

Change-Id: Ib005db95755a7028f49c885785e72c3970aea4f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/181079
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
2019-06-07 18:10:20 +00:00
Austin Clements
b402bd4499 cmd/compile: generate ABI wrappers for //go:linkname'd symbols
Calling a Go symbol from assembly in another package currently results
in a link failure because the Go symbol is defined as ABIInternal, but
the assembly call is from ABI0. In general this is okay because you
shouldn't do this anyway, but there are special cases where this is
necessary, especially between the runtime and packages closely tied to
the runtime in std.

Currently, we address this for runtime symbols with a hack in cmd/go
that knows to scan related packages when building the symabis file for
the runtime and runtime/internal/atomic. However, in addition to being
a messy solution in the first place, this hack causes races in cmd/go
that are difficult to work around.

We considered creating dummy references from assembly in the runtime
to these symbols, just to make sure they get ABI0 wrappers. However,
there are a fairly large number of these symbols on some platforms,
and it can vary significantly depending on build flags (e.g., race
mode), so even this solution is fairly unpalatable.

This CL addresses this by providing a way to mark symbols in Go code
that should be made available to assembly in other packages. Rather
than introduce a new pragma, we lightly expand the meaning of
"//go:linkname", since that pragma already generally indicates that
you're making the symbol available in a way it wasn't before. This
also dovetails nicely with the behavior of go:linkname in gccgo, which
makes unexported symbols available to other packages.

Follow-up CLs will make use of this and then remove the hack from
cmd/go.

Updates #31230.

Change-Id: I23060c97280626581f025c5c01fb8d24bb4c5159
Reviewed-on: https://go-review.googlesource.com/c/go/+/179860
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-06-06 19:44:08 +00:00
Keith Randall
f495f549ac cmd/compile: don't bother compiling functions named "_"
They can't be used, so we don't need code generated for them. We just
need to report errors in their bodies.

The compiler currently has a bunch of special cases sprinkled about
for "_" functions, because we never generate a linker symbol for them.
Instead, abort compilation earlier so we never reach any of that
special-case code.

Fixes #29870

Change-Id: I3530c9c353deabcf75ce9072c0b740e992349ee5
Reviewed-on: https://go-review.googlesource.com/c/158845
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-02-26 20:56:24 +00:00
Austin Clements
891f99eb43 cmd/compile: fix TestFormats
This fixes the linux-amd64-longtest builder, which was broken by CL
147160.

Updates #27539.

Change-Id: If6e69581ef503bba2449ec9bacaa31f34f59beb1
Reviewed-on: https://go-review.googlesource.com/c/149157
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-11-12 21:54:58 +00:00
Austin Clements
685aca45dc cmd/compile, cmd/link: separate stable and internal ABIs
This implements compiler and linker support for separating the
function calling ABI into two ABIs: a stable and an internal ABI. At
the moment, the two ABIs are identical, but we'll be able to evolve
the internal ABI without breaking existing assembly code that depends
on the stable ABI for calling to and from Go.

The Go compiler generates internal ABI symbols for all Go functions.
It uses the symabis information produced by the assembler to create
ABI wrappers whenever it encounters a body-less Go function that's
defined in assembly or a Go function that's referenced from assembly.

Since the two ABIs are currently identical, for the moment this is
implemented using "ABI alias" symbols, which are just forwarding
references to the native ABI symbol for a function. This way there's
no actual code involved in the ABI wrapper, which is good because
we're not deriving any benefit from it right now. Once the ABIs
diverge, we can eliminate ABI aliases.

The linker represents these different ABIs internally as different
versions of the same symbol. This way, the linker keeps us honest,
since every symbol definition and reference also specifies its
version. The linker is responsible for resolving ABI aliases.

Fixes #27539.

Change-Id: I197c52ec9f8fc435db8f7a4259029b20f6d65e95
Reviewed-on: https://go-review.googlesource.com/c/147160
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-11-12 20:46:55 +00:00
Austin Clements
15265ec421 cmd/compile: avoid duplicate GC bitmap symbols
Currently, liveness produces a distinct obj.LSym for each GC bitmap
for each function. These are then named by content hash and only
ultimately deduplicated by WriteObjFile.

For various reasons (see next commit), we want to remove this
deduplication behavior from WriteObjFile. Furthermore, it's
inefficient to produce these duplicate symbols in the first place.

GC bitmaps are the only source of duplicate symbols in the compiler.
This commit eliminates these duplicate symbols by declaring them in
the Ctxt symbol hash just like every other obj.LSym. As a result, all
GC bitmaps with the same content now refer to the same obj.LSym.

The next commit will remove deduplication from WriteObjFile.

For #27539.

Change-Id: I4f15e3d99530122cdf473b7a838c69ef5f79db59
Reviewed-on: https://go-review.googlesource.com/c/146557
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-03 15:12:34 +00:00
Austin Clements
9f95c9db23 cmd/compile, cmd/internal/obj: record register maps in binary
This adds FUNCDATA and PCDATA that records the register maps much like
the existing live arguments maps and live locals maps. The register
map is indexed independently from the argument and locals maps since
changes in register liveness tend not to correlate with changes to
argument and local liveness.

This is the final CL toward adding safe-points everywhere. The
following CLs will optimize liveness analysis to bring down the cost.
The effect of this CL is:

name        old time/op       new time/op       delta
Template          195ms ± 2%        197ms ± 1%    ~     (p=0.136 n=9+9)
Unicode          98.4ms ± 2%       99.7ms ± 1%  +1.39%  (p=0.004 n=10+10)
GoTypes           685ms ± 1%        700ms ± 1%  +2.06%  (p=0.000 n=9+9)
Compiler          3.28s ± 1%        3.34s ± 0%  +1.71%  (p=0.000 n=9+8)
SSA               7.79s ± 1%        7.91s ± 1%  +1.55%  (p=0.000 n=10+9)
Flate             133ms ± 2%        133ms ± 2%    ~     (p=0.190 n=10+10)
GoParser          161ms ± 2%        164ms ± 3%  +1.83%  (p=0.015 n=10+10)
Reflect           450ms ± 1%        457ms ± 1%  +1.62%  (p=0.000 n=10+10)
Tar               183ms ± 2%        185ms ± 1%  +0.91%  (p=0.008 n=9+10)
XML               234ms ± 1%        238ms ± 1%  +1.60%  (p=0.000 n=9+9)
[Geo mean]        411ms             417ms       +1.40%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.47M ± 0%        1.51M ± 0%  +2.79%  (p=0.000 n=10+10)

Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps is:

name        old time/op       new time/op       delta
Template          185ms ± 2%        197ms ± 1%   +6.42%  (p=0.000 n=10+9)
Unicode          96.3ms ± 3%       99.7ms ± 1%   +3.60%  (p=0.000 n=10+10)
GoTypes           658ms ± 0%        700ms ± 1%   +6.37%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.34s ± 0%   +6.53%  (p=0.000 n=9+8)
SSA               7.41s ± 2%        7.91s ± 1%   +6.71%  (p=0.000 n=9+9)
Flate             126ms ± 1%        133ms ± 2%   +6.15%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        164ms ± 3%   +6.89%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        457ms ± 1%   +4.59%  (p=0.000 n=10+10)
Tar               178ms ± 1%        185ms ± 1%   +4.18%  (p=0.000 n=10+10)
XML               223ms ± 1%        238ms ± 1%   +6.39%  (p=0.000 n=10+9)
[Geo mean]        394ms             417ms        +5.78%

name        old alloc/op      new alloc/op      delta
Template         34.5MB ± 0%       38.0MB ± 0%  +10.19%  (p=0.000 n=10+10)
Unicode          29.3MB ± 0%       30.3MB ± 0%   +3.56%  (p=0.000 n=8+9)
GoTypes           113MB ± 0%        125MB ± 0%  +10.89%  (p=0.000 n=10+10)
Compiler          510MB ± 0%        575MB ± 0%  +12.79%  (p=0.000 n=10+10)
SSA              1.46GB ± 0%       1.64GB ± 0%  +12.40%  (p=0.000 n=10+10)
Flate            23.9MB ± 0%       25.9MB ± 0%   +8.56%  (p=0.000 n=10+10)
GoParser         28.0MB ± 0%       30.8MB ± 0%  +10.08%  (p=0.000 n=10+10)
Reflect          77.6MB ± 0%       84.3MB ± 0%   +8.63%  (p=0.000 n=10+10)
Tar              34.1MB ± 0%       37.0MB ± 0%   +8.44%  (p=0.000 n=10+10)
XML              42.7MB ± 0%       47.2MB ± 0%  +10.75%  (p=0.000 n=10+10)
[Geo mean]       76.0MB            83.3MB        +9.60%

name        old allocs/op     new allocs/op     delta
Template           321k ± 0%         337k ± 0%   +4.98%  (p=0.000 n=10+10)
Unicode            337k ± 0%         340k ± 0%   +1.04%  (p=0.000 n=10+9)
GoTypes           1.13M ± 0%        1.18M ± 0%   +4.85%  (p=0.000 n=10+10)
Compiler          4.67M ± 0%        4.96M ± 0%   +6.25%  (p=0.000 n=10+10)
SSA               11.7M ± 0%        12.3M ± 0%   +5.69%  (p=0.000 n=10+10)
Flate              216k ± 0%         226k ± 0%   +4.52%  (p=0.000 n=10+9)
GoParser           271k ± 0%         283k ± 0%   +4.52%  (p=0.000 n=10+10)
Reflect            927k ± 0%         972k ± 0%   +4.78%  (p=0.000 n=10+10)
Tar                318k ± 0%         333k ± 0%   +4.56%  (p=0.000 n=10+10)
XML                376k ± 0%         395k ± 0%   +5.04%  (p=0.000 n=10+10)
[Geo mean]         730k              764k        +4.61%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.46M ± 0%        1.51M ± 0%   +3.66%  (p=0.000 n=10+10)

For #24543.

Change-Id: I91e003dc64151916b384274884bf02a2d6862547
Reviewed-on: https://go-review.googlesource.com/109353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:03 +00:00
Austin Clements
ed7a0682e2 cmd/compile: output stack map index everywhere it changes
Currently, the code generator only considers outputting stack map
indexes at CALL instructions. Raise this into the code generator loop
itself so that changes in the stack map index at any instruction emit
a PCDATA Prog before the actual instruction.

We'll optimize this in later CLs:

name        old time/op       new time/op       delta
Template          190ms ± 2%        191ms ± 2%    ~     (p=0.529 n=10+10)
Unicode          96.4ms ± 1%       98.5ms ± 3%  +2.18%  (p=0.001 n=9+10)
GoTypes           669ms ± 1%        673ms ± 1%  +0.62%  (p=0.004 n=9+9)
Compiler          3.18s ± 1%        3.22s ± 1%  +1.06%  (p=0.000 n=10+9)
SSA               7.59s ± 1%        7.64s ± 1%  +0.66%  (p=0.023 n=10+10)
Flate             128ms ± 1%        130ms ± 2%  +1.07%  (p=0.043 n=10+10)
GoParser          157ms ± 2%        158ms ± 3%    ~     (p=0.123 n=10+10)
Reflect           442ms ± 1%        445ms ± 1%  +0.73%  (p=0.017 n=10+9)
Tar               179ms ± 1%        180ms ± 1%  +0.58%  (p=0.019 n=9+9)
XML               229ms ± 1%        232ms ± 2%  +1.27%  (p=0.009 n=10+10)
[Geo mean]        401ms             405ms       +0.94%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.46M ± 0%        1.47M ± 0%  +0.84%  (p=0.000 n=10+10)
[Geo mean]        1.46M             1.47M       +0.84%

For #24543.

Change-Id: I4bfe45b767c9d9db47308a27763b303fa75bfa54
Reviewed-on: https://go-review.googlesource.com/109350
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 14:43:38 +00:00
David Chase
529362e1e2 cmd/compile: refactor and cleanup of common code/workaround
There's a glitch in how attributes from procs that do not
generate code are combined, and the workaround for this
glitch appeared in two places.

"One big pile is better than two little ones."

Updates #25426.

Change-Id: I252f9adc5b77591720a61fa22e6f9dda33d95350
Reviewed-on: https://go-review.googlesource.com/113717
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-05-18 21:00:56 +00:00
David Chase
c2c1822b12 cmd/compile: assign and preserve statement boundaries.
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).

Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).

Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.

Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".

The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices.  This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.

Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go.  There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.

Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)

This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.

The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.

This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.

Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:09:49 +00:00
Daniel Martí
2b2348ab14 cmd/compile/internal/gc: add some Node methods
Focus on "isfoo" funcs that take a *Node, and conver them to isFoo
methods instead. This makes for more idiomatic Go code, and also more
readable func names.

Found candidates with grep, and applied most changes with sed. The funcs
chosen were isgoconst, isnil, and isblank. All had the same signature,
func(*Node) bool.

While at it, camelCase the isliteral and iszero function names. Don't
move these to methods, as they are only used in the backend part of gc,
which might one day be split into a separate package.

Passes toolstash -cmp on std cmd.

Change-Id: I4df081b12d36c46c253167c8841c5a841f1c5a16
Reviewed-on: https://go-review.googlesource.com/105555
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-16 00:16:55 +00:00
Austin Clements
a7f73c436d cmd/compile: eliminate NoFramePointer
The NoFramePointer function flag is no longer used, so this CL
eliminates it. This cleans up some confusion between the compiler's
NoFramePointer flag and obj's NOFRAME flag. NoFramePointer was
intended to eliminate the saved base pointer on x86, but it was
translated into obj's NOFRAME flag. On x86, NOFRAME does mean to omit
the saved base pointer, but on ppc64 and s390x it has a more general
meaning of omitting *everything* from the frame, including the saved
LR and ppc64's "fixed frame". Hence, on ppc64 and s390x there are far
fewer situations where it is safe to set this flag.

Change-Id: If68991310b4d00638128c296bdd57f4ed731b46d
Reviewed-on: https://go-review.googlesource.com/92036
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-12 21:41:21 +00:00
Than McIntosh
4435fcfd6c compiler,linker: support for DWARF inlined instances
Compiler and linker changes to support DWARF inlined instances,
see https://go.googlesource.com/proposal/+/HEAD/design/22080-dwarf-inlining.md
for design details.

This functionality is gated via the cmd/compile option -gendwarfinl=N,
where N={0,1,2}, where a value of 0 disables dwarf inline generation,
a value of 1 turns on dwarf generation without tracking of formal/local
vars from inlined routines, and a value of 2 enables inlines with
variable tracking.

Updates #22080

Change-Id: I69309b3b815d9fed04aebddc0b8d33d0dbbfad6e
Reviewed-on: https://go-review.googlesource.com/75550
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-11-30 14:39:19 +00:00
Daniel Martí
006bc57095 cmd/compile: clean up various bits of code
* replace a copy of IsMethod with a call of it.
* a few more switches where they simplify the code.
* prefer composite literals over "n := new(...); n.x = y; ...".
* use defers to get rid of three goto labels.
* rewrite updateHasCall into two funcs to remove gotos.

Passes toolstash-check on std cmd.

Change-Id: Icb5442a89a87319ef4b640bbc5faebf41b193ef1
Reviewed-on: https://go-review.googlesource.com/72070
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-10-22 15:50:50 +00:00
Daniel Martí
71c9454f99 cmd/compile: remove some redundant types in decls
As per golint's suggestions.

Change-Id: Ie0c6ad9aa5dc69966a279562a341c7b095c47ede
Reviewed-on: https://go-review.googlesource.com/64192
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-17 09:51:38 +00:00
Josh Bleecher Snyder
26e126d6e6 cmd/compile: move nodarg to walk.go
Its sole use is in walk.go. 100% code movement.

gsubr.go increasingly contains backend-y things.
With a few more relocations, it could probably be
fruitfully renamed progs.go.

Change-Id: I61ec5c2bc1f8cfdda64c6d6f580952c154ff60e0
Reviewed-on: https://go-review.googlesource.com/41972
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-27 19:08:26 +00:00
Josh Bleecher Snyder
756b9ce3a5 cmd/compile: add initial backend concurrency support
This CL adds initial support for concurrent backend compilation.

BACKGROUND

The compiler currently consists (very roughly) of the following phases:

1. Initialization.
2. Lexing and parsing into the cmd/compile/internal/syntax AST.
3. Translation into the cmd/compile/internal/gc AST.
4. Some gc AST passes: typechecking, escape analysis, inlining,
   closure handling, expression evaluation ordering (order.go),
   and some lowering and optimization (walk.go).
5. Translation into the cmd/compile/internal/ssa SSA form.
6. Optimization and lowering of SSA form.
7. Translation from SSA form to assembler instructions.
8. Translation from assembler instructions to machine code.
9. Writing lots of output: machine code, DWARF symbols,
   type and reflection info, export data.

Phase 2 was already concurrent as of Go 1.8.

Phase 3 is planned for eventual removal;
we hope to go straight from syntax AST to SSA.

Phases 5–8 are per-function; this CL adds support for
processing multiple functions concurrently.
The slowest phases in the compiler are 5 and 6,
so this offers the opportunity for some good speed-ups.

Unfortunately, it's not quite that straightforward.
In the current compiler, the latter parts of phase 4
(order, walk) are done function-at-a-time as needed.
Making order and walk concurrency-safe proved hard,
and they're not particularly slow, so there wasn't much reward.
To enable phases 5–8 to be done concurrently,
when concurrent backend compilation is requested,
we complete phase 4 for all functions
before starting later phases for any functions.

Also, in reality, we automatically generate new
functions in phase 9, such as method wrappers
and equality and has routines.
Those new functions then go through phases 4–8.
This CL disables concurrent backend compilation
after the first, big, user-provided batch of
functions has been compiled.
This is done to keep things simple,
and because the autogenerated functions
tend to be small, few, simple, and fast to compile.

USAGE

Concurrent backend compilation still defaults to off.
To set the number of functions that may be backend-compiled
concurrently, use the compiler flag -c.
In future work, cmd/go will automatically set -c.

Furthermore, this CL has been intentionally written
so that the c=1 path has no backend concurrency whatsoever,
not even spawning any goroutines.
This helps ensure that, should problems arise
late in the development cycle,
we can simply have cmd/go set c=1 always,
and revert to the original compiler behavior.

MUTEXES

Most of the work required to make concurrent backend
compilation safe has occurred over the past month.
This CL adds a handful of mutexes to get the rest of the way there;
they are the mutexes that I didn't see a clean way to avoid.
Some of them may still be eliminable in future work.

In no particular order:

* gc.funcsymsmu. The global funcsyms slice is populated
  lazily when we need function symbols for closures.
  This occurs during gc AST to SSA translation.
  The function funcsym also does a package lookup,
  which is a source of races on types.Pkg.Syms;
  funcsymsmu also covers that package lookup.
  This mutex is low priority: it adds a single global,
  it is in an infrequently used code path, and it is low contention.
  Since funcsyms may now be added in any order,
  we must sort them to preserve reproducible builds.

* gc.largeStackFramesMu. We don't discover until after SSA compilation
  that a function's stack frame is gigantic.
  Recording that error happens basically never,
  but it does happen concurrently.
  Fix with a low priority mutex and sorting.

* obj.Link.hashmu. ctxt.hash stores the mapping from
  types.Syms (compiler symbols) to obj.LSyms (linker symbols).
  It is accessed fairly heavily through all the phases.
  This is the only heavily contended mutex.

* gc.signatlistmu. The global signatlist map is
  populated with types through several of the concurrent phases,
  including notably via ngotype during DWARF generation.
  It is low priority for removal.

* gc.typepkgmu. Looking up symbols in the types package
  happens a fair amount during backend compilation
  and DWARF generation, particularly via ngotype.
  This mutex helps us to avoid a broader mutex on types.Pkg.Syms.
  It has low-to-moderate contention.

* types.internedStringsmu. gc AST to SSA conversion and
  some SSA work introduce new autotmps.
  Those autotmps have their names interned to reduce allocations.
  That interning requires protecting types.internedStrings.
  The autotmp names are heavily re-used, and the mutex
  overhead and contention here are low, so it is probably
  a worthwhile performance optimization to keep this mutex.

TESTING

I have been testing this code locally by running
'go install -race cmd/compile'
and then doing
'go build -a -gcflags=-c=128 std cmd'
for all architectures and a variety of compiler flags.
This obviously needs to be made part of the builders,
but it is too expensive to make part of all.bash.
I have filed #19962 for this.

REPRODUCIBLE BUILDS

This version of the compiler generates reproducible builds.
Testing reproducible builds also needs automation, however,
and is also too expensive for all.bash.
This is #19961.

Also of note is that some of the compiler flags used by 'toolstash -cmp'
are currently incompatible with concurrent backend compilation.
They still work fine with c=1.
Time will tell whether this is a problem.

NEXT STEPS

* Continue to find and fix races and bugs,
  using a combination of code inspection, fuzzing,
  and hopefully some community experimentation.
  I do not know of any outstanding races,
  but there probably are some.
* Improve testing.
* Improve performance, for many values of c.
* Integrate with cmd/go and fine tune.
* Support concurrent compilation with the -race flag.
  It is a sad irony that it does not yet work.
* Minor code cleanup that has been deferred during
  the last month due to uncertainty about the
  ultimate shape of this CL.

PERFORMANCE

Here's the buried lede, at last. :)

All benchmarks are from my 8 core 2.9 GHz Intel Core i7 darwin/amd64 laptop.

First, going from tip to this CL with c=1 has almost no impact.

name        old time/op       new time/op       delta
Template          195ms ± 3%        194ms ± 5%    ~     (p=0.370 n=30+29)
Unicode          86.6ms ± 3%       87.0ms ± 7%    ~     (p=0.958 n=29+30)
GoTypes           548ms ± 3%        555ms ± 4%  +1.35%  (p=0.001 n=30+28)
Compiler          2.51s ± 2%        2.54s ± 2%  +1.17%  (p=0.000 n=28+30)
SSA               5.16s ± 3%        5.16s ± 2%    ~     (p=0.910 n=30+29)
Flate             124ms ± 5%        124ms ± 4%    ~     (p=0.947 n=30+30)
GoParser          146ms ± 3%        146ms ± 3%    ~     (p=0.150 n=29+28)
Reflect           354ms ± 3%        352ms ± 4%    ~     (p=0.096 n=29+29)
Tar               107ms ± 5%        106ms ± 3%    ~     (p=0.370 n=30+29)
XML               200ms ± 4%        201ms ± 4%    ~     (p=0.313 n=29+28)
[Geo mean]        332ms             333ms       +0.10%

name        old user-time/op  new user-time/op  delta
Template          227ms ± 5%        225ms ± 5%    ~     (p=0.457 n=28+27)
Unicode           109ms ± 4%        109ms ± 5%    ~     (p=0.758 n=29+29)
GoTypes           713ms ± 4%        721ms ± 5%    ~     (p=0.051 n=30+29)
Compiler          3.36s ± 2%        3.38s ± 3%    ~     (p=0.146 n=30+30)
SSA               7.46s ± 3%        7.47s ± 3%    ~     (p=0.804 n=30+29)
Flate             146ms ± 7%        147ms ± 3%    ~     (p=0.833 n=29+27)
GoParser          179ms ± 5%        179ms ± 5%    ~     (p=0.866 n=30+30)
Reflect           431ms ± 4%        429ms ± 4%    ~     (p=0.593 n=29+30)
Tar               124ms ± 5%        123ms ± 5%    ~     (p=0.140 n=29+29)
XML               243ms ± 4%        242ms ± 7%    ~     (p=0.404 n=29+29)
[Geo mean]        415ms             415ms       +0.02%

name        old obj-bytes     new obj-bytes     delta
Template           382k ± 0%         382k ± 0%    ~     (all equal)
Unicode            203k ± 0%         203k ± 0%    ~     (all equal)
GoTypes           1.18M ± 0%        1.18M ± 0%    ~     (all equal)
Compiler          3.98M ± 0%        3.98M ± 0%    ~     (all equal)
SSA               8.28M ± 0%        8.28M ± 0%    ~     (all equal)
Flate              230k ± 0%         230k ± 0%    ~     (all equal)
GoParser           287k ± 0%         287k ± 0%    ~     (all equal)
Reflect           1.00M ± 0%        1.00M ± 0%    ~     (all equal)
Tar                190k ± 0%         190k ± 0%    ~     (all equal)
XML                416k ± 0%         416k ± 0%    ~     (all equal)
[Geo mean]         660k              660k       +0.00%

Comparing this CL to itself, from c=1 to c=2
improves real times 20-30%, costs 5-10% more CPU time,
and adds about 2% alloc.
The allocation increase comes from allocating more ssa.Caches.

name       old time/op       new time/op       delta
Template         202ms ± 3%        149ms ± 3%  -26.15%  (p=0.000 n=49+49)
Unicode         87.4ms ± 4%       84.2ms ± 3%   -3.68%  (p=0.000 n=48+48)
GoTypes          560ms ± 2%        398ms ± 2%  -28.96%  (p=0.000 n=49+49)
Compiler         2.46s ± 3%        1.76s ± 2%  -28.61%  (p=0.000 n=48+46)
SSA              6.17s ± 2%        4.04s ± 1%  -34.52%  (p=0.000 n=49+49)
Flate            126ms ± 3%         92ms ± 2%  -26.81%  (p=0.000 n=49+48)
GoParser         148ms ± 4%        107ms ± 2%  -27.78%  (p=0.000 n=49+48)
Reflect          361ms ± 3%        281ms ± 3%  -22.10%  (p=0.000 n=49+49)
Tar              109ms ± 4%         86ms ± 3%  -20.81%  (p=0.000 n=49+47)
XML              204ms ± 3%        144ms ± 2%  -29.53%  (p=0.000 n=48+45)

name       old user-time/op  new user-time/op  delta
Template         246ms ± 9%        246ms ± 4%     ~     (p=0.401 n=50+48)
Unicode          109ms ± 4%        111ms ± 4%   +1.47%  (p=0.000 n=44+50)
GoTypes          728ms ± 3%        765ms ± 3%   +5.04%  (p=0.000 n=46+50)
Compiler         3.33s ± 3%        3.41s ± 2%   +2.31%  (p=0.000 n=49+48)
SSA              8.52s ± 2%        9.11s ± 2%   +6.93%  (p=0.000 n=49+47)
Flate            149ms ± 4%        161ms ± 3%   +8.13%  (p=0.000 n=50+47)
GoParser         181ms ± 5%        192ms ± 2%   +6.40%  (p=0.000 n=49+46)
Reflect          452ms ± 9%        474ms ± 2%   +4.99%  (p=0.000 n=50+48)
Tar              126ms ± 6%        136ms ± 4%   +7.95%  (p=0.000 n=50+49)
XML              247ms ± 5%        264ms ± 3%   +6.94%  (p=0.000 n=48+50)

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       39.3MB ± 0%   +1.48%  (p=0.008 n=5+5)
Unicode         29.8MB ± 0%       30.2MB ± 0%   +1.19%  (p=0.008 n=5+5)
GoTypes          113MB ± 0%        114MB ± 0%   +0.69%  (p=0.008 n=5+5)
Compiler         443MB ± 0%        447MB ± 0%   +0.95%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.26GB ± 0%   +0.89%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.9MB ± 1%   +2.35%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       32.2MB ± 0%   +1.59%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       78.9MB ± 0%   +0.91%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       27.0MB ± 0%   +1.80%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       43.4MB ± 0%   +2.35%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          379k ± 0%         378k ± 0%     ~     (p=0.421 n=5+5)
Unicode           322k ± 0%         321k ± 0%     ~     (p=0.222 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.548 n=5+5)
Compiler         4.12M ± 0%        4.11M ± 0%   -0.14%  (p=0.032 n=5+5)
SSA              9.72M ± 0%        9.72M ± 0%     ~     (p=0.421 n=5+5)
Flate             234k ± 1%         234k ± 0%     ~     (p=0.421 n=5+5)
GoParser          316k ± 1%         315k ± 0%     ~     (p=0.222 n=5+5)
Reflect           980k ± 0%         979k ± 0%     ~     (p=0.095 n=5+5)
Tar               249k ± 1%         249k ± 1%     ~     (p=0.841 n=5+5)
XML               392k ± 0%         391k ± 0%     ~     (p=0.095 n=5+5)

From c=1 to c=4, real time is down ~40%, CPU usage up 10-20%, alloc up ~5%:

name       old time/op       new time/op       delta
Template         203ms ± 3%        131ms ± 5%  -35.45%  (p=0.000 n=50+50)
Unicode         87.2ms ± 4%       84.1ms ± 2%   -3.61%  (p=0.000 n=48+47)
GoTypes          560ms ± 4%        310ms ± 2%  -44.65%  (p=0.000 n=50+49)
Compiler         2.47s ± 3%        1.41s ± 2%  -43.10%  (p=0.000 n=50+46)
SSA              6.17s ± 2%        3.20s ± 2%  -48.06%  (p=0.000 n=49+49)
Flate            126ms ± 4%         74ms ± 2%  -41.06%  (p=0.000 n=49+48)
GoParser         148ms ± 4%         89ms ± 3%  -39.97%  (p=0.000 n=49+50)
Reflect          360ms ± 3%        242ms ± 3%  -32.81%  (p=0.000 n=49+49)
Tar              108ms ± 4%         73ms ± 4%  -32.48%  (p=0.000 n=50+49)
XML              203ms ± 3%        119ms ± 3%  -41.56%  (p=0.000 n=49+48)

name       old user-time/op  new user-time/op  delta
Template         246ms ± 9%        287ms ± 9%  +16.98%  (p=0.000 n=50+50)
Unicode          109ms ± 4%        118ms ± 5%   +7.56%  (p=0.000 n=46+50)
GoTypes          735ms ± 4%        806ms ± 2%   +9.62%  (p=0.000 n=50+50)
Compiler         3.34s ± 4%        3.56s ± 2%   +6.78%  (p=0.000 n=49+49)
SSA              8.54s ± 3%       10.04s ± 3%  +17.55%  (p=0.000 n=50+50)
Flate            149ms ± 6%        176ms ± 3%  +17.82%  (p=0.000 n=50+48)
GoParser         181ms ± 5%        213ms ± 3%  +17.47%  (p=0.000 n=50+50)
Reflect          453ms ± 6%        499ms ± 2%  +10.11%  (p=0.000 n=50+48)
Tar              126ms ± 5%        149ms ±11%  +18.76%  (p=0.000 n=50+50)
XML              246ms ± 5%        287ms ± 4%  +16.53%  (p=0.000 n=49+50)

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       40.4MB ± 0%   +4.21%  (p=0.008 n=5+5)
Unicode         29.8MB ± 0%       30.9MB ± 0%   +3.68%  (p=0.008 n=5+5)
GoTypes          113MB ± 0%        116MB ± 0%   +2.71%  (p=0.008 n=5+5)
Compiler         443MB ± 0%        455MB ± 0%   +2.75%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.27GB ± 0%   +1.84%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       26.9MB ± 1%   +6.31%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       33.2MB ± 0%   +4.61%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       80.2MB ± 0%   +2.53%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       27.9MB ± 0%   +5.19%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       44.6MB ± 0%   +5.20%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          380k ± 0%         379k ± 0%   -0.39%  (p=0.032 n=5+5)
Unicode           321k ± 0%         321k ± 0%     ~     (p=0.841 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.421 n=5+5)
Compiler         4.12M ± 0%        4.14M ± 0%   +0.52%  (p=0.008 n=5+5)
SSA              9.72M ± 0%        9.76M ± 0%   +0.37%  (p=0.008 n=5+5)
Flate             234k ± 1%         234k ± 1%     ~     (p=0.690 n=5+5)
GoParser          316k ± 0%         317k ± 1%     ~     (p=0.841 n=5+5)
Reflect           981k ± 0%         981k ± 0%     ~     (p=1.000 n=5+5)
Tar               250k ± 0%         249k ± 1%     ~     (p=0.151 n=5+5)
XML               393k ± 0%         392k ± 0%     ~     (p=0.056 n=5+5)

Going beyond c=4 on my machine tends to increase CPU time and allocs
without impacting real time.

The CPU time numbers matter, because when there are many concurrent
compilation processes, that will impact the overall throughput.

The numbers above are in many ways the best case scenario;
we can take full advantage of all cores.
Fortunately, the most common compilation scenario is incremental
re-compilation of a single package during a build/test cycle.

Updates #15756

Change-Id: I6725558ca2069edec0ac5b0d1683105a9fff6bea
Reviewed-on: https://go-review.googlesource.com/40693
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-27 00:59:07 +00:00
Josh Bleecher Snyder
386765afdf cmd/compile: move Node.Class to flags
Put it at position zero, since it is fairly hot.

This shrinks gc.Node into a smaller size class on 64 bit systems.

name        old time/op       new time/op       delta
Template          193ms ± 5%        192ms ± 3%    ~     (p=0.353 n=94+93)
Unicode          86.1ms ± 5%       85.0ms ± 4%  -1.23%  (p=0.000 n=95+98)
GoTypes           546ms ± 3%        544ms ± 4%  -0.40%  (p=0.007 n=94+97)
Compiler          2.56s ± 3%        2.54s ± 3%  -0.67%  (p=0.000 n=99+97)
SSA               5.13s ± 2%        5.10s ± 3%  -0.55%  (p=0.000 n=94+98)
Flate             122ms ± 6%        121ms ± 4%  -0.75%  (p=0.002 n=97+95)
GoParser          144ms ± 5%        144ms ± 4%    ~     (p=0.298 n=98+97)
Reflect           348ms ± 4%        349ms ± 4%    ~     (p=0.350 n=98+97)
Tar               105ms ± 5%        104ms ± 5%    ~     (p=0.154 n=96+98)
XML               200ms ± 5%        198ms ± 4%  -0.71%  (p=0.015 n=97+98)
[Geo mean]        330ms             328ms       -0.52%

name        old user-time/op  new user-time/op  delta
Template          229ms ±11%        224ms ± 7%  -2.16%  (p=0.001 n=100+87)
Unicode           109ms ± 5%        109ms ± 6%    ~     (p=0.897 n=96+91)
GoTypes           712ms ± 4%        709ms ± 4%    ~     (p=0.085 n=96+98)
Compiler          3.41s ± 3%        3.36s ± 3%  -1.43%  (p=0.000 n=98+98)
SSA               7.46s ± 3%        7.31s ± 3%  -2.02%  (p=0.000 n=100+99)
Flate             145ms ± 6%        143ms ± 6%  -1.11%  (p=0.001 n=99+97)
GoParser          177ms ± 5%        176ms ± 5%  -0.78%  (p=0.018 n=95+95)
Reflect           432ms ± 7%        435ms ± 9%    ~     (p=0.296 n=100+100)
Tar               121ms ± 7%        121ms ± 5%    ~     (p=0.072 n=100+95)
XML               241ms ± 4%        239ms ± 5%    ~     (p=0.085 n=97+99)
[Geo mean]        413ms             410ms       -0.73%

name        old alloc/op      new alloc/op      delta
Template         38.4MB ± 0%       37.7MB ± 0%  -1.85%  (p=0.008 n=5+5)
Unicode          30.1MB ± 0%       28.8MB ± 0%  -4.09%  (p=0.008 n=5+5)
GoTypes           112MB ± 0%        110MB ± 0%  -1.69%  (p=0.008 n=5+5)
Compiler          470MB ± 0%        461MB ± 0%  -1.91%  (p=0.008 n=5+5)
SSA              1.13GB ± 0%       1.11GB ± 0%  -1.70%  (p=0.008 n=5+5)
Flate            25.0MB ± 0%       24.6MB ± 0%  -1.67%  (p=0.008 n=5+5)
GoParser         31.6MB ± 0%       31.1MB ± 0%  -1.66%  (p=0.008 n=5+5)
Reflect          77.1MB ± 0%       75.8MB ± 0%  -1.69%  (p=0.008 n=5+5)
Tar              26.3MB ± 0%       25.7MB ± 0%  -2.06%  (p=0.008 n=5+5)
XML              41.9MB ± 0%       41.1MB ± 0%  -1.93%  (p=0.008 n=5+5)
[Geo mean]       73.5MB            72.0MB       -2.03%

name        old allocs/op     new allocs/op     delta
Template           383k ± 0%         383k ± 0%    ~     (p=0.690 n=5+5)
Unicode            343k ± 0%         343k ± 0%    ~     (p=0.841 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=0.310 n=5+5)
Compiler          4.43M ± 0%        4.42M ± 0%  -0.17%  (p=0.008 n=5+5)
SSA               9.85M ± 0%        9.85M ± 0%    ~     (p=0.310 n=5+5)
Flate              236k ± 0%         236k ± 1%    ~     (p=0.841 n=5+5)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.421 n=5+5)
Reflect            988k ± 0%         987k ± 0%    ~     (p=0.690 n=5+5)
Tar                252k ± 0%         251k ± 0%    ~     (p=0.095 n=5+5)
XML                399k ± 0%         399k ± 0%    ~     (p=1.000 n=5+5)
[Geo mean]         741k              740k       -0.07%

Change-Id: I9e952b58a98e30a12494304db9ce50d0a85e459c
Reviewed-on: https://go-review.googlesource.com/41797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
2017-04-26 16:58:33 +00:00
Josh Bleecher Snyder
502a03ffcf cmd/compile: move Node.Typecheck to flags
Change-Id: Id5aa4a1499068bf2d3497b21d794f970b7e47fdf
Reviewed-on: https://go-review.googlesource.com/41795
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-26 01:27:28 +00:00
Josh Bleecher Snyder
a495fe2775 cmd/compile: make ggloblsym work with obj.LSyms
Automated refactoring using gorename, eg, and gofmt -r.

Passes toolstash-check.

Change-Id: Ib50f368bf62a07e5ced50b1b92a29c669ba9a158
Reviewed-on: https://go-review.googlesource.com/41401
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-21 23:15:25 +00:00
Josh Bleecher Snyder
30940e2cc2 cmd/compile: move Linksym, linksymname, and isblanksym to types package
Response to code review feedback on CL 40693.

This CL was prepared by:

(1) manually adding new implementations and the Ctxt var to package types

(2) running eg with template:

func before(s *types.Sym) *obj.LSym { return gc.Linksym(s) }
func after(s *types.Sym) *obj.LSym  { return s.Linksym() }

(3) running gofmt -r:

gofmt -r 'isblanksym(a) -> a.IsBlank()'

(4) manually removing old implementations from package gc

Passes toolstash-check.

Change-Id: I39c35def7cae5bcbcc7c77253e5d2b066b981dea
Reviewed-on: https://go-review.googlesource.com/41302
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-04-21 16:10:29 +00:00
Josh Bleecher Snyder
5aebeaaca2 cmd/compile: simplify sharedProgArray init
Per code review feedback on CL 40693.

Change-Id: I38c522022a3c2f3e61ea90181391edb5c178916e
Reviewed-on: https://go-review.googlesource.com/41300
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-04-21 16:10:13 +00:00
Matthew Dempsky
1e3570ac86 cmd/internal/objabi: extract shared functionality from obj
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.

There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.

objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.

Fixes #15165.
Fixes #20026.

Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-19 00:00:09 +00:00
Matthew Dempsky
1747078695 cmd/internal/obj: un-embed FuncInfo field in LSym
Automated refactoring using github.com/mdempsky/unbed (to rewrite
s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo
field to just Func).

Passes toolstash-check -all.

Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a
Reviewed-on: https://go-review.googlesource.com/40670
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-18 17:29:50 +00:00
Josh Bleecher Snyder
da15fe6870 cmd/internal/obj: rework gclocals handling
The compiler handled gcargs and gclocals LSyms unusually.
It generated placeholder symbols (makefuncdatasym),
filled them in, and then renamed them for content-addressability.
This is an important binary size optimization;
the same locals information occurs over and over.

This CL continues to treat these LSyms unusually,
but in a slightly more explicit way,
and importantly for concurrent compilation,
in a way that does not require concurrent
modification of Ctxt.Hash.

Instead of creating gcargs and gclocals in the usual way,
by creating a types.Sym and then an obj.LSym,
we add them directly to obj.FuncInfo,
initialize them in obj.InitTextSym,
and deduplicate and add them to ctxt.Data at the end.
Then the backend's job is simply to fill them in
and rename them appropriately.

Updates #15756

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       38.7MB ± 0%  -0.22%  (p=0.016 n=5+5)
Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.690 n=5+5)
GoTypes          113MB ± 0%        113MB ± 0%  -0.24%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.24GB ± 0%  -0.39%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.2MB ± 0%  -0.43%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       31.7MB ± 0%  -0.22%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       77.6MB ± 0%  -0.80%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       26.3MB ± 0%  -0.85%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       41.9MB ± 0%  -1.04%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          378k ± 0%         377k ± 1%    ~     (p=0.151 n=5+5)
Unicode           321k ± 1%         321k ± 0%    ~     (p=0.841 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%  -0.47%  (p=0.016 n=5+5)
SSA              9.71M ± 0%        9.67M ± 0%  -0.33%  (p=0.008 n=5+5)
Flate             233k ± 1%         232k ± 1%    ~     (p=0.151 n=5+5)
GoParser          316k ± 0%         315k ± 0%  -0.49%  (p=0.016 n=5+5)
Reflect           979k ± 0%         972k ± 0%  -0.75%  (p=0.008 n=5+5)
Tar               250k ± 0%         247k ± 1%  -0.92%  (p=0.008 n=5+5)
XML               392k ± 1%         389k ± 0%  -0.67%  (p=0.008 n=5+5)

Change-Id: Idc36186ca9d2f8214b5f7720bbc27b6bb22fdc48
Reviewed-on: https://go-review.googlesource.com/40697
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-18 02:13:32 +00:00
Josh Bleecher Snyder
3692925c5e cmd/compile: move Text.From.Sym initialization earlier
The initialization of an ATEXT Prog's From.Sym
can race with the assemblers in a concurrent compiler.
CL 40254 contains an initial, failed attempt to
fix that race.

This CL takes a different approach: Rather than
expose an API to initialize the Prog,
expose an API to initialize the Sym.

The initialization of the Sym can then be
moved earlier in the compiler, avoiding the race.

The growth of gc.Func has negligible
performance impact; see below.

Passes toolstash -cmp.

Updates #15756

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       38.8MB ± 0%    ~     (p=0.968 n=9+10)
Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.684 n=10+10)
GoTypes          113MB ± 0%        113MB ± 0%    ~     (p=0.912 n=10+10)
SSA             1.25GB ± 0%       1.25GB ± 0%    ~     (p=0.481 n=10+10)
Flate           25.3MB ± 0%       25.3MB ± 0%    ~     (p=0.105 n=10+10)
GoParser        31.7MB ± 0%       31.8MB ± 0%  +0.09%  (p=0.016 n=8+10)
Reflect         78.3MB ± 0%       78.2MB ± 0%    ~     (p=0.190 n=10+10)
Tar             26.5MB ± 0%       26.6MB ± 0%  +0.13%  (p=0.011 n=10+10)
XML             42.4MB ± 0%       42.4MB ± 0%    ~     (p=0.971 n=10+10)

name       old allocs/op     new allocs/op     delta
Template          378k ± 1%         378k ± 0%    ~     (p=0.315 n=10+9)
Unicode           321k ± 1%         321k ± 0%    ~     (p=0.436 n=10+10)
GoTypes          1.14M ± 0%        1.14M ± 0%    ~     (p=0.079 n=10+9)
SSA              9.70M ± 0%        9.70M ± 0%  -0.04%  (p=0.035 n=10+10)
Flate             233k ± 1%         234k ± 1%    ~     (p=0.529 n=10+10)
GoParser          315k ± 0%         316k ± 0%    ~     (p=0.095 n=9+10)
Reflect           980k ± 0%         980k ± 0%    ~     (p=0.436 n=10+10)
Tar               249k ± 1%         250k ± 0%    ~     (p=0.280 n=10+10)
XML               391k ± 1%         391k ± 1%    ~     (p=0.481 n=10+10)

Change-Id: I3c93033dddd2e1df8cc54a106a6e615d27859e71
Reviewed-on: https://go-review.googlesource.com/40496
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-12 22:49:06 +00:00
Josh Bleecher Snyder
ce3ee7cdae cmd/internal/obj: stop storing Text flags in From3
Prior to this CL, flags such as NOSPLIT
on ATEXT Progs were stored in From3.Offset.
Some but not all of those flags were also
duplicated into From.Sym.Attribute.

This CL migrates all of those flags into
From.Sym.Attribute and stops creating a From3.

A side-effect of this is that printing an
ATEXT Prog can no longer simply dump From3.Offset.
That's kind of good, since the raw flag value
wasn't very informative anyway, but it did
necessitate a bunch of updates to the cmd/asm tests.

The reason I'm doing this work now is that
avoiding storing flags in both From.Sym and From3.Offset
simplifies some other changes to fix the data
race first described in CL 40254.

This CL almost passes toolstash-check -all.
The only changes are in cases where the assembler
has decided that a function's flags may be altered,
e.g. to make a function with no calls in it NOSPLIT.
Prior to this CL, that information was not printed.

Sample before:

"".Ctz64 t=1 size=63 args=0x10 locals=0x0
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	TEXT	"".Ctz64(SB), $0-16
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	FUNCDATA	$0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)

Sample after:

"".Ctz64 t=1 nosplit size=63 args=0x10 locals=0x0
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	TEXT	"".Ctz64(SB), NOSPLIT, $0-16
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	FUNCDATA	$0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)

Observe the additional "nosplit" in the first line
and the additional "NOSPLIT" in the second line.

Updates #15756

Change-Id: I5c59bd8f3bdc7c780361f801d94a261f0aef3d13
Reviewed-on: https://go-review.googlesource.com/40495
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-12 21:53:39 +00:00
Josh Bleecher Snyder
e367ba9eae cmd/internal/obj: refactor ATEXT symbol initialization
This makes the core Flushplist loop clearer.

We may also want to move the Sym initialization
much earlier in the compiler (see discussion on
CL 40254), for which this paves the way.

While we're here, eliminate package log in favor of ctxt.Diag.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ieaf848d196764a5aa82578b689af7bc6638c385a
Reviewed-on: https://go-review.googlesource.com/40313
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2017-04-11 02:56:56 +00:00
Robert Griesemer
f68f292820 cmd/compile: factor out Pkg, Sym, and Type into package types
- created new package cmd/compile/internal/types
- moved Pkg, Sym, Type to new package
- to break cycles, for now we need the (ugly) types/utils.go
  file which contains a handful of functions that must be installed
  early by the gc frontend
- to break cycles, for now we need two functions to convert between
  *gc.Node and *types.Node (the latter is a dummy type)
- adjusted the gc's code to use the new package and the conversion
  functions as needed
- made several Pkg, Sym, and Type methods functions as needed
- renamed constructors typ, typPtr, typArray, etc. to types.New,
  types.NewPtr, types.NewArray, etc.

Passes toolstash-check -all.

Change-Id: I8adfa5e85c731645d0a7fd2030375ed6ebf54b72
Reviewed-on: https://go-review.googlesource.com/39855
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 03:04:00 +00:00
Josh Bleecher Snyder
5c359d8083 cmd/compile: add Prog cache to Progs
The existing bulk/cached Prog allocator, Ctxt.NewProg, is not concurrency-safe.
This CL moves Prog allocation to its clients, the compiler and the assembler.

The assembler is so fast and generates so few Progs that it does not need
optimization of Prog allocation. I could not generate measureable changes.
And even if I could, the assembly is a miniscule portion of build times.

The compiler already has a natural place to manage Prog allocation;
this CL migrates the Prog cache there.
It will be made concurrency-safe in a later CL by
partitioning the Prog cache into chunks and assigning each chunk
to a different goroutine to manage.

This CL does cause a performance degradation when the compiler
is invoked with the -S flag (to dump assembly).
However, such usage is rare and almost always done manually.
The one instance I know of in a test is TestAssembly
in cmd/compile/internal/gc, and I did not detect
a measurable performance impact there.

Passes toolstash-check -all.
Minor compiler performance impact.

Updates #15756

Performance impact from just this CL:

name        old time/op     new time/op     delta
Template        213ms ± 4%      213ms ± 4%    ~     (p=0.571 n=49+49)
Unicode        89.1ms ± 3%     89.4ms ± 3%    ~     (p=0.388 n=47+48)
GoTypes         581ms ± 2%      584ms ± 3%  +0.56%  (p=0.019 n=47+48)
SSA             6.48s ± 2%      6.53s ± 2%  +0.84%  (p=0.000 n=47+49)
Flate           128ms ± 4%      128ms ± 4%    ~     (p=0.832 n=49+49)
GoParser        152ms ± 3%      152ms ± 3%    ~     (p=0.815 n=48+47)
Reflect         371ms ± 4%      371ms ± 3%    ~     (p=0.617 n=50+47)
Tar             112ms ± 4%      112ms ± 3%    ~     (p=0.724 n=49+49)
XML             208ms ± 3%      208ms ± 4%    ~     (p=0.678 n=49+50)
[Geo mean]      284ms           285ms       +0.18%

name        old user-ns/op  new user-ns/op  delta
Template         251M ± 7%       252M ±11%    ~     (p=0.704 n=49+50)
Unicode          107M ± 7%       108M ± 5%  +1.25%  (p=0.036 n=50+49)
GoTypes          738M ± 3%       740M ± 3%    ~     (p=0.305 n=49+48)
SSA             8.83G ± 2%      8.86G ± 4%    ~     (p=0.098 n=47+50)
Flate            146M ± 6%       147M ± 3%    ~     (p=0.584 n=48+41)
GoParser         178M ± 6%       179M ± 5%  +0.93%  (p=0.036 n=49+48)
Reflect          441M ± 4%       446M ± 7%    ~     (p=0.218 n=44+49)
Tar              126M ± 5%       126M ± 5%    ~     (p=0.766 n=48+49)
XML              245M ± 5%       244M ± 4%    ~     (p=0.359 n=50+50)
[Geo mean]       341M            342M       +0.51%

Performance impact from this CL combined with its parent:

name        old time/op     new time/op     delta
Template        213ms ± 3%      214ms ± 4%    ~     (p=0.685 n=47+50)
Unicode        89.8ms ± 6%     90.5ms ± 6%    ~     (p=0.055 n=50+50)
GoTypes         584ms ± 3%      585ms ± 2%    ~     (p=0.710 n=49+47)
SSA             6.50s ± 2%      6.53s ± 2%  +0.39%  (p=0.011 n=46+50)
Flate           128ms ± 3%      128ms ± 4%    ~     (p=0.855 n=47+49)
GoParser        152ms ± 3%      152ms ± 3%    ~     (p=0.666 n=49+49)
Reflect         371ms ± 3%      372ms ± 3%    ~     (p=0.298 n=48+48)
Tar             112ms ± 5%      113ms ± 3%    ~     (p=0.107 n=49+49)
XML             208ms ± 3%      208ms ± 2%    ~     (p=0.881 n=50+49)
[Geo mean]      285ms           285ms       +0.26%

name        old user-ns/op  new user-ns/op  delta
Template         254M ± 9%       252M ± 8%    ~     (p=0.290 n=49+50)
Unicode          106M ± 6%       108M ± 7%  +1.44%  (p=0.034 n=50+50)
GoTypes          741M ± 4%       743M ± 4%    ~     (p=0.992 n=50+49)
SSA             8.86G ± 2%      8.83G ± 3%    ~     (p=0.158 n=47+49)
Flate            147M ± 4%       148M ± 5%    ~     (p=0.832 n=50+49)
GoParser         179M ± 5%       178M ± 5%    ~     (p=0.370 n=48+50)
Reflect          441M ± 6%       445M ± 7%    ~     (p=0.246 n=45+47)
Tar              126M ± 6%       126M ± 6%    ~     (p=0.815 n=49+50)
XML              244M ± 3%       245M ± 4%    ~     (p=0.190 n=50+50)
[Geo mean]       342M            342M       +0.17%

Change-Id: I020f1c079d495fbe2e15ccb51e1ea2cc1b5a1855
Reviewed-on: https://go-review.googlesource.com/39634
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-04-06 04:53:50 +00:00
Matthew Dempsky
3a89065c6c cmd/compile: replace nod(ONAME) with newname
Passes toolstash-check -all.

Change-Id: Ib9f969e5ecc1537b7eab186dc4fd504a50f800f2
Reviewed-on: https://go-review.googlesource.com/38586
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-24 23:32:09 +00:00
Josh Bleecher Snyder
6652572b75 cmd/compile: thread Curfn through to debuginfo
Updates #15756

Change-Id: I860dd45cae9d851c7844654621bbc99efe7c7f03
Reviewed-on: https://go-review.googlesource.com/38591
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-24 16:22:58 +00:00
Josh Bleecher Snyder
c3a50ad3c7 cmd/compile: eliminate Prog-related globals
Introduce a new type, gc.Progs, to manage
generation of Progs for a function.
Use it to replace globals pc and pcloc.

Passes toolstash-check -all.

Updates #15756

Change-Id: I2206998d7c58fe2a76b620904909f2e1cec8a57d
Reviewed-on: https://go-review.googlesource.com/38418
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-23 16:52:28 +00:00
Josh Bleecher Snyder
352e19c92c cmd/compile: eliminate Gins and Naddr
Preparation for eliminating Prog-related globals.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ia199fcb282cc3a84903a6e92a3ce342c5faba79c
Reviewed-on: https://go-review.googlesource.com/38409
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-22 19:05:52 +00:00
Matthew Dempsky
7bb5b2d33a cmd/internal/obj: remove unneeded Addr.Node and Prog.Opt fields
Change-Id: I218b241c32a5948b66ad0d95ecc368648cf4ddf5
Reviewed-on: https://go-review.googlesource.com/38130
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-20 23:49:29 +00:00
Matthew Dempsky
325904fe6a cmd/compile: port liveness analysis to SSA
Passes toolstash-check -all.

Change-Id: I92c3c25d6c053f971f346f4fa3bbc76419b58183
Reviewed-on: https://go-review.googlesource.com/38087
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-20 22:58:50 +00:00