213 Commits

Author SHA1 Message Date
Brad Fitzpatrick
07b4abd62e all: remove the nacl port (part 2, amd64p32 + toolchain)
This is part two if the nacl removal. Part 1 was CL 199499.

This CL removes amd64p32 support, which might be useful in the future
if we implement the x32 ABI. It also removes the nacl bits in the
toolchain, and some remaining nacl bits.

Updates #30439

Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/200077
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-09 22:34:34 +00:00
Cuong Manh Le
77f5adba55 cmd/compile: don't use statictmps for small object in slice literal
Fixes #21561

Change-Id: I89c59752060dd9570d17d73acbbaceaefce5d8ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/197560
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-10-08 06:09:26 +00:00
Keith Randall
8f296f59de Revert "Revert "cmd/compile,runtime: allocate defer records on the stack""
This reverts CL 180761

Reason for revert: Reinstate the stack-allocated defer CL.

There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues.

Issue #32477: Fix has been submitted as CL 181258.
Issue #32498: Possible fix is CL 181377 (not submitted yet).

Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/181378
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-06-10 16:19:39 +00:00
David Chase
037ac2bd84 cmd/compile: add -smallframes gc flag for GC latency diagnosis
Shrinks the size of things that can be stack allocated from
10M to 128k for declared variables and from 64k to 16k for
implicit allocations (new(T), &T{}, etc).

Usage: "go build -gcflags -smallframes hello.go"

An earlier GOEXPERIMENT version of this caused only one
problem, when a gc-should-detect-oversize-stack test no
longer had an oversized stack to detect.  The change was
converted to a flag to make it easier to access (for
diagnosing "long" GC-related single-thread pauses) and to
remove interference with the test.

Includes test to verify behavior.

Updates #27732.

Change-Id: I1255d484331e77185e07c78389a8b594041204c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/180817
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-06-06 11:38:40 +00:00
Keith Randall
49200e3f3e Revert "cmd/compile,runtime: allocate defer records on the stack"
This reverts commit fff4f599fe1c21e411a99de5c9b3777d06ce0ce6.

Reason for revert: Seems to still have issues around GC.

Fixes #32452

Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/180761
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-06-05 19:50:09 +00:00
Keith Randall
fff4f599fe cmd/compile,runtime: allocate defer records on the stack
When a defer is executed at most once in a function body,
we can allocate the defer record for it on the stack instead
of on the heap.

This should make defers like this (which are very common) faster.

This optimization applies to 363 out of the 370 static defer sites
in the cmd/go binary.

name     old time/op  new time/op  delta
Defer-4  52.2ns ± 5%  36.2ns ± 3%  -30.70%  (p=0.000 n=10+10)

Fixes #6980
Update #14939

Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004
Reviewed-on: https://go-review.googlesource.com/c/go/+/171758
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-06-04 17:35:20 +00:00
Josh Bleecher Snyder
23b476a3c8 cmd/compile: port callnew to ssa conversion
This is part of a general effort to shrink walk.
In an ideal world, we'd have an SSA op for allocation,
but we don't yet have a good mechanism for introducing
function calling during SSA compilation.
In the meantime, SSA conversion is a better place for it.

This also makes it easier to introduce new optimizations;
instead of doing the typecheck walk dance,
we can simply write what we want the backend to do.

I introduced a new opcode in this change because:

(a) It avoids a class of bugs involving correctly detecting
    whether this ONEW is a "before walk" ONEW or an "after walk" ONEW.
    It also means that using ONEW or ONEWOBJ in the wrong context
    will generally result in a faster failure.
(b) Opcodes are cheap.
(c) It provides a better place to put documentation.

This change also is also marginally more performant:

name        old alloc/op      new alloc/op      delta
Template         39.1MB ± 0%       39.0MB ± 0%  -0.14%  (p=0.008 n=5+5)
Unicode          28.4MB ± 0%       28.4MB ± 0%    ~     (p=0.421 n=5+5)
GoTypes           132MB ± 0%        132MB ± 0%  -0.23%  (p=0.008 n=5+5)
Compiler          608MB ± 0%        607MB ± 0%  -0.25%  (p=0.008 n=5+5)
SSA              2.04GB ± 0%       2.04GB ± 0%  -0.01%  (p=0.008 n=5+5)
Flate            24.4MB ± 0%       24.3MB ± 0%  -0.13%  (p=0.008 n=5+5)
GoParser         29.3MB ± 0%       29.1MB ± 0%  -0.54%  (p=0.008 n=5+5)
Reflect          84.8MB ± 0%       84.7MB ± 0%  -0.21%  (p=0.008 n=5+5)
Tar              36.7MB ± 0%       36.6MB ± 0%  -0.10%  (p=0.008 n=5+5)
XML              48.7MB ± 0%       48.6MB ± 0%  -0.24%  (p=0.008 n=5+5)
[Geo mean]       85.0MB            84.8MB       -0.19%

name        old allocs/op     new allocs/op     delta
Template           383k ± 0%         382k ± 0%  -0.26%  (p=0.008 n=5+5)
Unicode            341k ± 0%         341k ± 0%    ~     (p=0.579 n=5+5)
GoTypes           1.37M ± 0%        1.36M ± 0%  -0.39%  (p=0.008 n=5+5)
Compiler          5.59M ± 0%        5.56M ± 0%  -0.49%  (p=0.008 n=5+5)
SSA               16.9M ± 0%        16.9M ± 0%  -0.03%  (p=0.008 n=5+5)
Flate              238k ± 0%         238k ± 0%  -0.23%  (p=0.008 n=5+5)
GoParser           306k ± 0%         303k ± 0%  -0.93%  (p=0.008 n=5+5)
Reflect            990k ± 0%         987k ± 0%  -0.33%  (p=0.008 n=5+5)
Tar                356k ± 0%         355k ± 0%  -0.20%  (p=0.008 n=5+5)
XML                444k ± 0%         442k ± 0%  -0.45%  (p=0.008 n=5+5)
[Geo mean]         848k              845k       -0.33%

Change-Id: I2c36003a7cbf71b53857b7de734852b698f49310
Reviewed-on: https://go-review.googlesource.com/c/go/+/167957
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-03-20 19:38:50 +00:00
Keith Randall
2c423f063b cmd/compile,runtime: provide index information on bounds check failure
A few examples (for accessing a slice of length 3):

   s[-1]    runtime error: index out of range [-1]
   s[3]     runtime error: index out of range [3] with length 3
   s[-1:0]  runtime error: slice bounds out of range [-1:]
   s[3:0]   runtime error: slice bounds out of range [3:0]
   s[3:-1]  runtime error: slice bounds out of range [:-1]
   s[3:4]   runtime error: slice bounds out of range [:4] with capacity 3
   s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3

Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors kind of
arbitrarily, currently the rightmost one.

An exhaustive set of examples is in issue30116[u].out in the CL.

The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.

Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.

Fixes #30116

Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-18 17:33:38 +00:00
Lynn Boger
2d3474043c cmd/compile: call ginsnop, not ginsnop2 on ppc64le for mid-stack inlining tracebacks
A recent change to fix stacktraces for inlined functions
introduced a regression on ppc64le when compiling position
independent code. That happened because ginsnop2 was called for
the purpose of inserting a NOP to identify the location of
the inlined function, when ginsnop should have been used.
ginsnop2 is intended to be used before deferreturn to ensure
r2 is properly restored when compiling position independent code.
In some cases the location where r2 is loaded from might not be
initialized. If that happens and r2 is used to generate an address,
the result is likely a SEGV.

This fixes that problem.

Fixes #30283

Change-Id: If70ef27fc65ef31969712422306ac3a57adbd5b6
Reviewed-on: https://go-review.googlesource.com/c/163337
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-25 18:08:46 +00:00
Keith Randall
585c9e8412 cmd/compile: implement shifts by signed amounts
Allow shifts by signed amounts. Panic if the shift amount is negative.

TODO: We end up doing two compares per shift, see Ian's comment
https://github.com/golang/go/issues/19113#issuecomment-443241799 that
we could do it with a single comparison in the normal case.

The prove pass mostly handles this code well. For instance, it removes the
<0 check for cases like this:
    if s >= 0 { _ = x << s }
    _ = x << len(a)

This case isn't handled well yet:
    _ = x << (y & 0xf)
I'll do followon CLs for unhandled cases as needed.

Update #19113

R=go1.13

Change-Id: I839a5933d94b54ab04deb9dd5149f32c51c90fa1
Reviewed-on: https://go-review.googlesource.com/c/158719
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-02-15 23:13:09 +00:00
Keith Randall
69c2c56453 cmd/compile,runtime: redo mid-stack inlining tracebacks
Work involved in getting a stack trace is divided between
runtime.Callers and runtime.CallersFrames.

Before this CL, runtime.Callers returns a pc per runtime frame.
runtime.CallersFrames is responsible for expanding a runtime frame
into potentially multiple user frames.

After this CL, runtime.Callers returns a pc per user frame.
runtime.CallersFrames just maps those to user frame info.

Entries in the result of runtime.Callers are now pcs
of the calls (or of the inline marks), not of the instruction
just after the call.

Fixes #29007
Fixes #28640
Update #26320

Change-Id: I1c9567596ff73dc73271311005097a9188c3406f
Reviewed-on: https://go-review.googlesource.com/c/152537
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-12-28 20:55:36 +00:00
Brian Kessler
319787a528 cmd/compile: intrinsify math/bits.Div on amd64
Note that the intrinsic implementation panics separately for overflow and
divide by zero, which matches the behavior of the pure go implementation.
There is a modest performance improvement after intrinsic implementation.

name     old time/op  new time/op  delta
Div-4    53.0ns ± 1%  47.0ns ± 0%  -11.28%  (p=0.008 n=5+5)
Div32-4  18.4ns ± 0%  18.5ns ± 1%     ~     (p=0.444 n=5+5)
Div64-4  53.3ns ± 0%  47.5ns ± 4%  -10.77%  (p=0.008 n=5+5)

Updates #28273

Change-Id: Ic1688ecc0964acace2e91bf44ef16f5fb6b6bc82
Reviewed-on: https://go-review.googlesource.com/c/144378
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-27 05:04:25 +00:00
Martin Möhrmann
75798e8ada runtime: make processor capability variable naming platform specific
The current support_XXX variables are specific for the
amd64 and 386 platforms.

Prefix processor capability variables by architecture to have a
consistent naming scheme and avoid reuse of the existing
variables for new platforms.

This also aligns naming of runtime variables closer with internal/cpu
processor capability variable names.

Change-Id: I3eabb29a03874678851376185d3a62e73c1aff1d
Reviewed-on: https://go-review.googlesource.com/c/91435
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-14 20:30:31 +00:00
Robert Griesemer
6fe8ee78e9 cmd/compile/internal/gc: remove isforw predicate table (cleanup)
Was only ever filled with one Etype (TFORW) and only used
in one place. Easier to just check for TFORW.

Change-Id: Icc96da3a22b0af1d7e60bc5841c744916c53341e
Reviewed-on: https://go-review.googlesource.com/c/147285
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2018-11-03 19:56:55 +00:00
Matthew Dempsky
41c0b9eb88 cmd/compile: remove compiling_wrappers
It's no longer needed after removing safemode.

Change-Id: I7581d77a86342e3b6d7c632839f5eb7a5c20902e
Reviewed-on: https://go-review.googlesource.com/c/143397
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-19 18:01:17 +00:00
Matthew Dempsky
5185744962 cmd/compile: remove obsolete "safe" mode
Nowadays there are better ways to safely run untrusted Go programs, like
NaCl and gVisor.

Change-Id: I20c45f13a50dbcf35c343438b720eb93e7b4e13a
Reviewed-on: https://go-review.googlesource.com/c/142717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-10-17 19:00:37 +00:00
Matthew Dempsky
965fa3b191 cmd/compile: remove -dolinkobj flag
This used to be used by cmd/vet and some assembly generation tests, but
those were removed in CL 37691 and CL 107336. No point in keeping an
unneeded flag around.

Fixes #28220.

Change-Id: I59f8546954ab36ea61ceba81c10d6e16d74b966a
Reviewed-on: https://go-review.googlesource.com/c/142677
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-16 20:54:12 +00:00
Keith Randall
ceb0c371d9 cmd/compile: make []byte("...") more efficient
Do []byte(string) conversions more efficiently when the string
is a constant. Instead of calling stringtobyteslice, allocate
just the space we need and encode the initialization directly.

[]byte("foo") rewrites to the following pseudocode:

var s [3]byte // on heap or stack, depending on whether b escapes
s = *(*[3]byte)(&"foo"[0]) // initialize s from the string
b = s[:]

which generates this assembly:

	0x001d 00029 (tmp1.go:9)	LEAQ	type.[3]uint8(SB), AX
	0x0024 00036 (tmp1.go:9)	MOVQ	AX, (SP)
	0x0028 00040 (tmp1.go:9)	CALL	runtime.newobject(SB)
	0x002d 00045 (tmp1.go:9)	MOVQ	8(SP), AX
	0x0032 00050 (tmp1.go:9)	MOVBLZX	go.string."foo"+2(SB), CX
	0x0039 00057 (tmp1.go:9)	MOVWLZX	go.string."foo"(SB), DX
	0x0040 00064 (tmp1.go:9)	MOVW	DX, (AX)
	0x0043 00067 (tmp1.go:9)	MOVB	CL, 2(AX)
// Then the slice is b = {AX, 3, 3}

The generated code is still not optimal, as it still does load/store
from read-only memory instead of constant stores.  Next CL...

Update #26498
Fixes #10170

Change-Id: I4b990b19f9a308f60c8f4f148934acffefe0a5bd
Reviewed-on: https://go-review.googlesource.com/c/140698
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-10 16:10:40 +00:00
Tobias Klauser
e7f59f0284 cmd/compile/internal/gc: unexport Deferproc and Newproc
They are no longer used outside the package since CL 38080.

Passes toolstash-check -all

Change-Id: I30977ed2b233b7c8c53632cc420938bc3b0e37c6
Reviewed-on: https://go-review.googlesource.com/129781
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-21 08:12:06 +00:00
Wei Xiao
0a7ac93c27 cmd/compile: improve atomic add intrinsics with ARMv8.1 new instruction
ARMv8.1 has added new instruction (LDADDAL) for atomic memory operations. This
CL improves existing atomic add intrinsics with the new instruction. Since the
new instruction is only guaranteed to be present after ARMv8.1, we guard its
usage with a conditional on CPU feature.

Performance result on ARMv8.1 machine:
name        old time/op  new time/op  delta
Xadd-224    1.05µs ± 6%  0.02µs ± 4%  -98.06%  (p=0.000 n=10+8)
Xadd64-224  1.05µs ± 3%  0.02µs ±13%  -98.10%  (p=0.000 n=9+10)
[Geo mean]  1.05µs       0.02µs       -98.08%

Performance result on ARMv8.0 machine:
name        old time/op  new time/op  delta
Xadd-46      538ns ± 1%   541ns ± 1%  +0.62%  (p=0.000 n=9+9)
Xadd64-46    505ns ± 1%   508ns ± 0%  +0.48%  (p=0.003 n=9+8)
[Geo mean]   521ns        524ns       +0.55%

Change-Id: If4b5d8d0e2d6f84fe1492a4f5de0789910ad0ee9
Reviewed-on: https://go-review.googlesource.com/81877
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-06-21 14:52:43 +00:00
Richard Musiol
3b137dd2df cmd/compile: add wasm architecture
This commit adds the wasm architecture to the compile command.
A later commit will contain the corresponding linker changes.

Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4

The following files are generated:
- src/cmd/compile/internal/ssa/opGen.go
- src/cmd/compile/internal/ssa/rewriteWasm.go
- src/cmd/internal/obj/wasm/anames.go

Updates #18892

Change-Id: Ifb4a96a3e427aac2362a1c97967d5667450fba3b
Reviewed-on: https://go-review.googlesource.com/103295
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-04 17:56:12 +00:00
Josh Bleecher Snyder
24d5c871aa cmd/compile: alphabetize sysfunc lists
Change-Id: Ia95643752d743d208363e3434497ffcf0af7b2d7
Reviewed-on: https://go-review.googlesource.com/107816
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-18 17:15:42 +00:00
Josh Bleecher Snyder
581331e75b cmd/compile: fix race in SSA construction
syslook cannot be called safely during SSA construction.

Change-Id: Ief173babd2e964fd5016578073dd3ba12e5731c5
Reviewed-on: https://go-review.googlesource.com/107815
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-18 17:15:35 +00:00
Matthew Dempsky
8f6ae33796 cmd/compile, cmd/link: encode cgo directives using JSON
The standard library has plenty of polished encoder/decoder
implementations. No need for another ad-hoc one.

I considered using encoding/gob instead, but these strings go into the
package data part of the object file, so it's important they don't
contain "\n$$\n". Package json escapes newlines in strings, so it's
safe to use here.

Change-Id: I998655524ccee7365c2c8e9a843e6975e95a3e62
Reviewed-on: https://go-review.googlesource.com/106463
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-11 23:35:42 +00:00
Matthew Dempsky
17df5ed910 cmd/compile: insert instrumentation during SSA building
Insert appropriate race/msan calls before each memory operation during
SSA construction.

This is conceptually simple, but subtle because we need to be careful
that inserted instrumentation calls don't clobber arguments that are
currently being prepared for a user function call.

reorder1 already handles introducing temporary variables for arguments
in some cases. This CL changes it to use them for all arguments when
instrumenting.

Also, we can't SSA struct types with more than one field while
instrumenting. Otherwise, concurrent uses of disjoint fields within an
SSA-able struct can introduce false races.

This is both somewhat better and somewhat worse than the old racewalk
instrumentation pass. We're now able to easily recognize cases like
constructing non-escaping closures on the stack or accessing closure
variables don't need instrumentation calls. On the other hand,
spilling escaping parameters to the heap now results in an
instrumentation call.

Overall, this CL results in a small net reduction in the number of
instrumentation calls, but a small net increase in binary size for
instrumented executables. cmd/go ends up with 5.6% fewer calls, but a
2.4% larger binary.

Fixes #19054.

Change-Id: I70d1dd32ad6340e6fdb691e6d5a01452f58e97f3
Reviewed-on: https://go-review.googlesource.com/102817
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-09 18:40:55 +00:00
Matthew Dempsky
638f112d69 cmd/compile: cleanup method symbol creation
There were multiple ad hoc ways to create method symbols, with subtle
and confusing differences between them. This CL unifies them into a
single well-documented encoding and implementation.

This introduces some inconsequential changes to symbol format for the
sake of simplicity and consistency. Two notable changes:

1) Symbol construction is now insensitive to the package currently
being compiled. Previously, non-exported methods on anonymous types
received different method symbols depending on whether the method was
local or imported.

2) Symbols for method values parenthesized non-pointer receiver types
and non-exported method names, and also always package-qualified
non-exported method names. Now they use the same rules as normal
method symbols.

The methodSym function is also now stricter about rejecting
non-sensical method/receiver combinations. Notably, this means that
typecheckfunc needs to call addmethod to validate the method before
calling declare, which also means we no longer emit errors about
redeclaring bogus methods.

Change-Id: I9501c7a53dd70ef60e5c74603974e5ecc06e2003
Reviewed-on: https://go-review.googlesource.com/104876
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-05 22:01:17 +00:00
Matthew Dempsky
b55eedd173 Revert "cmd/compile: cleanup nodpc and nodfp"
This reverts commit dcac984b97470c4f047f0d3d87b0af40f5246ed2.

Reason for revert: broke LR architectures (arm64, ppc64, s390x)

Change-Id: I531d311c9053e81503c8c78d6cf044b318fc828b
Reviewed-on: https://go-review.googlesource.com/99695
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-08 21:23:01 +00:00
Matthew Dempsky
dcac984b97 cmd/compile: cleanup nodpc and nodfp
Instead of creating a new &nodfp expression for every recover() call,
or a new nodpc variable for every function instrumented by the race
detector, this CL introduces two new uintptr-typed pseudo-variables
callerSP and callerPC. These pseudo-variables act just like calls to
the runtime's getcallersp() and getcallerpc() functions.

For consistency, change runtime.gorecover's builtin stub's parameter
type from "*int32" to "uintptr".

Passes toolstash-check, but toolstash-check -race fails because of
register allocator changes.

Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358
Reviewed-on: https://go-review.googlesource.com/99416
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-08 18:22:29 +00:00
Austin Clements
2010189407 runtime: remove legacy eager write barrier
Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.

Fixes #22460.

Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).

Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Reviewed-on: https://go-review.googlesource.com/92705
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-13 16:34:46 +00:00
Vladimir Stefanovic
6be1c09e19 cmd/compile: use soft-float routines for soft-float targets
Updates #18162 (mostly fixes)

Change-Id: I35bcb8a688bdaa432adb0ddbb73a2f7adda47b9e
Reviewed-on: https://go-review.googlesource.com/37958
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-30 17:37:37 +00:00
Than McIntosh
4435fcfd6c compiler,linker: support for DWARF inlined instances
Compiler and linker changes to support DWARF inlined instances,
see https://go.googlesource.com/proposal/+/HEAD/design/22080-dwarf-inlining.md
for design details.

This functionality is gated via the cmd/compile option -gendwarfinl=N,
where N={0,1,2}, where a value of 0 disables dwarf inline generation,
a value of 1 turns on dwarf generation without tracking of formal/local
vars from inlined routines, and a value of 2 enables inlines with
variable tracking.

Updates #22080

Change-Id: I69309b3b815d9fed04aebddc0b8d33d0dbbfad6e
Reviewed-on: https://go-review.googlesource.com/75550
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-11-30 14:39:19 +00:00
Austin Clements
7e343134d3 cmd/compile: compiler support for buffered write barrier
This CL implements the compiler support for calling the buffered write
barrier added by the previous CL.

Since the buffered write barrier is only implemented on amd64 right
now, this still supports the old, eager write barrier as well. There's
little overhead to supporting both and this way a few tests in
test/fixedbugs that expect to have liveness maps at write barrier
calls can easily opt-in to the old, eager barrier.

This significantly improves the performance of the write barrier:

name             old time/op  new time/op  delta
WriteBarrier-12  73.5ns ±20%  19.2ns ±27%  -73.90%  (p=0.000 n=19+18)

It also reduces the size of binaries because the write barrier call is
more compact:

name        old object-bytes  new object-bytes  delta
Template           398k ± 0%         393k ± 0%  -1.14%  (p=0.008 n=5+5)
Unicode            208k ± 0%         206k ± 0%  -1.00%  (p=0.008 n=5+5)
GoTypes           1.18M ± 0%        1.15M ± 0%  -2.00%  (p=0.008 n=5+5)
Compiler          4.05M ± 0%        3.88M ± 0%  -4.26%  (p=0.008 n=5+5)
SSA               8.25M ± 0%        8.11M ± 0%  -1.59%  (p=0.008 n=5+5)
Flate              228k ± 0%         224k ± 0%  -1.83%  (p=0.008 n=5+5)
GoParser           295k ± 0%         284k ± 0%  -3.62%  (p=0.008 n=5+5)
Reflect           1.00M ± 0%        0.99M ± 0%  -0.70%  (p=0.008 n=5+5)
Tar                339k ± 0%         333k ± 0%  -1.67%  (p=0.008 n=5+5)
XML                404k ± 0%         395k ± 0%  -2.10%  (p=0.008 n=5+5)
[Geo mean]         704k              690k       -2.00%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.05M ± 0%        1.04M ± 0%  -1.55%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20171027.1

(Amusingly, this also reduces compiler allocations by 0.75%, which,
combined with the better write barrier, speeds up the compiler overall
by 2.10%. See the perf link.)

It slightly improves the performance of most of the go1 benchmarks and
improves the performance of the x/benchmarks:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.40s ± 1%     2.47s ± 1%  +2.69%  (p=0.000 n=19+19)
Fannkuch11-12                2.95s ± 0%     2.95s ± 0%  +0.21%  (p=0.000 n=20+19)
FmtFprintfEmpty-12          41.8ns ± 4%    41.4ns ± 2%  -1.03%  (p=0.014 n=20+20)
FmtFprintfString-12         68.7ns ± 2%    67.5ns ± 1%  -1.75%  (p=0.000 n=20+17)
FmtFprintfInt-12            79.0ns ± 3%    77.1ns ± 1%  -2.40%  (p=0.000 n=19+17)
FmtFprintfIntInt-12          127ns ± 1%     123ns ± 3%  -3.42%  (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12     152ns ± 1%     150ns ± 1%  -1.02%  (p=0.000 n=18+17)
FmtFprintfFloat-12           211ns ± 1%     209ns ± 0%  -0.99%  (p=0.000 n=20+16)
FmtManyArgs-12               500ns ± 0%     496ns ± 0%  -0.73%  (p=0.000 n=17+20)
GobDecode-12                6.44ms ± 1%    6.53ms ± 0%  +1.28%  (p=0.000 n=20+19)
GobEncode-12                5.46ms ± 0%    5.46ms ± 1%    ~     (p=0.550 n=19+20)
Gzip-12                      220ms ± 1%     216ms ± 0%  -1.75%  (p=0.000 n=19+19)
Gunzip-12                   38.8ms ± 0%    38.6ms ± 0%  -0.30%  (p=0.000 n=18+19)
HTTPClientServer-12         79.0µs ± 1%    78.2µs ± 1%  -1.01%  (p=0.000 n=20+20)
JSONEncode-12               11.9ms ± 0%    11.9ms ± 0%  -0.29%  (p=0.000 n=20+19)
JSONDecode-12               52.6ms ± 0%    52.2ms ± 0%  -0.68%  (p=0.000 n=19+20)
Mandelbrot200-12            3.69ms ± 0%    3.68ms ± 0%  -0.36%  (p=0.000 n=20+20)
GoParse-12                  3.13ms ± 1%    3.18ms ± 1%  +1.67%  (p=0.000 n=19+20)
RegexpMatchEasy0_32-12      73.2ns ± 1%    72.3ns ± 1%  -1.19%  (p=0.000 n=19+18)
RegexpMatchEasy0_1K-12       241ns ± 0%     239ns ± 0%  -0.83%  (p=0.000 n=17+16)
RegexpMatchEasy1_32-12      68.6ns ± 1%    69.0ns ± 1%  +0.47%  (p=0.015 n=18+16)
RegexpMatchEasy1_1K-12       364ns ± 0%     361ns ± 0%  -0.67%  (p=0.000 n=16+17)
RegexpMatchMedium_32-12      104ns ± 1%     103ns ± 1%  -0.79%  (p=0.001 n=20+15)
RegexpMatchMedium_1K-12     33.8µs ± 3%    34.0µs ± 2%    ~     (p=0.267 n=20+19)
RegexpMatchHard_32-12       1.64µs ± 1%    1.62µs ± 2%  -1.25%  (p=0.000 n=19+18)
RegexpMatchHard_1K-12       49.2µs ± 0%    48.7µs ± 1%  -0.93%  (p=0.000 n=19+18)
Revcomp-12                   391ms ± 5%     396ms ± 7%    ~     (p=0.154 n=19+19)
Template-12                 63.1ms ± 0%    59.5ms ± 0%  -5.76%  (p=0.000 n=18+19)
TimeParse-12                 307ns ± 0%     306ns ± 0%  -0.39%  (p=0.000 n=19+17)
TimeFormat-12                325ns ± 0%     323ns ± 0%  -0.50%  (p=0.000 n=19+19)
[Geo mean]                  47.3µs         46.9µs       -0.67%

https://perf.golang.org/search?q=upload:20171026.1

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.25ms ± 1%  2.20ms ± 1%  -2.31%  (p=0.000 n=18+18)
HTTP-12                    12.6µs ± 0%  12.6µs ± 0%  -0.72%  (p=0.000 n=18+17)
JSON-12                    11.0ms ± 0%  11.0ms ± 1%  -0.68%  (p=0.000 n=17+19)

https://perf.golang.org/search?q=upload:20171026.2

Updates #14951.
Updates #22460.

Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7
Reviewed-on: https://go-review.googlesource.com/73712
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-30 18:12:46 +00:00
Matthew Dempsky
31a3b719a0 cmd/compile: cleanup genwrapper slightly
ORETJMP doesn't need an ONAME if we just set the target method on Sym
instead of Left. Conveniently, this is where fmt.go was looking for it
anyway.

Change the iface parameter and global compiling_wrappers to bool.

Passes toolstash-check.

Change-Id: I5333f8bcb4e06bf8161808041125eb95c439aafe
Reviewed-on: https://go-review.googlesource.com/68252
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-05 22:37:16 +00:00
Hajime Hoshi
7537bb7b30 cmd/compile/internal/gc: unexport global constants
Change-Id: Ib292ef3b0a31b2c7bdd77519324362667f30389c
Reviewed-on: https://go-review.googlesource.com/44393
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-23 20:11:30 +00:00
Matthew Dempsky
0fb82fbcce cmd/compile: remove global bout variable
Change-Id: I7054bbec080708c3a11ed62d7f6594e82aa33747
Reviewed-on: https://go-review.googlesource.com/61691
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2017-09-05 19:59:21 +00:00
Josh Bleecher Snyder
2fe53d8d55 cmd/compile: remove gc.Sysfunc calls from 387 backend
gc.Sysfunc must not be called concurrently.
We set up runtime routines used by the backend
prior to doing any backend compilation.
I missed the 387 ones; fix that.

Sysfunc should have been unexported during 1.9.
I will rectify that in a subsequent CL.

Fixes #21352

Change-Id: I8386eaa1e05879c25c672b9c9fc693c938e9aeb6
Reviewed-on: https://go-review.googlesource.com/54090
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-09 00:19:58 +00:00
Keith Randall
a836254d08 cmd/compile: reject unknown //go: comments in std library
Fixes #18331

Change-Id: Ie5c6685be3002533b84604ff1f13f2f0850f29e2
Reviewed-on: https://go-review.googlesource.com/45010
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-06-06 22:28:17 +00:00
Alessandro Arzilli
2ad41a3090 cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in .debug_info
section when compiling with -N -l.

Version of debug_info is updated from DWARF v2 to DWARF v3 since
version 2 does not allow lexical blocks with discontinuous PC ranges.

Remaining open problems:
- scope information is removed from inlined functions
- variables records do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, even before its declaration.

Updates #6913.
Updates #12899.

Change-Id: Idc6808788512ea20e7e45bcf782453acb416fb49
Reviewed-on: https://go-review.googlesource.com/40095
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-05-18 23:10:50 +00:00
Josh Bleecher Snyder
5e157db37a cmd/compile: make Class a Stringer
Change-Id: I7fd137d01ab5a0690773ca6e47d2bd796f0e393e
Reviewed-on: https://go-review.googlesource.com/42177
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-30 16:26:09 +00:00
Josh Bleecher Snyder
756b9ce3a5 cmd/compile: add initial backend concurrency support
This CL adds initial support for concurrent backend compilation.

BACKGROUND

The compiler currently consists (very roughly) of the following phases:

1. Initialization.
2. Lexing and parsing into the cmd/compile/internal/syntax AST.
3. Translation into the cmd/compile/internal/gc AST.
4. Some gc AST passes: typechecking, escape analysis, inlining,
   closure handling, expression evaluation ordering (order.go),
   and some lowering and optimization (walk.go).
5. Translation into the cmd/compile/internal/ssa SSA form.
6. Optimization and lowering of SSA form.
7. Translation from SSA form to assembler instructions.
8. Translation from assembler instructions to machine code.
9. Writing lots of output: machine code, DWARF symbols,
   type and reflection info, export data.

Phase 2 was already concurrent as of Go 1.8.

Phase 3 is planned for eventual removal;
we hope to go straight from syntax AST to SSA.

Phases 5–8 are per-function; this CL adds support for
processing multiple functions concurrently.
The slowest phases in the compiler are 5 and 6,
so this offers the opportunity for some good speed-ups.

Unfortunately, it's not quite that straightforward.
In the current compiler, the latter parts of phase 4
(order, walk) are done function-at-a-time as needed.
Making order and walk concurrency-safe proved hard,
and they're not particularly slow, so there wasn't much reward.
To enable phases 5–8 to be done concurrently,
when concurrent backend compilation is requested,
we complete phase 4 for all functions
before starting later phases for any functions.

Also, in reality, we automatically generate new
functions in phase 9, such as method wrappers
and equality and has routines.
Those new functions then go through phases 4–8.
This CL disables concurrent backend compilation
after the first, big, user-provided batch of
functions has been compiled.
This is done to keep things simple,
and because the autogenerated functions
tend to be small, few, simple, and fast to compile.

USAGE

Concurrent backend compilation still defaults to off.
To set the number of functions that may be backend-compiled
concurrently, use the compiler flag -c.
In future work, cmd/go will automatically set -c.

Furthermore, this CL has been intentionally written
so that the c=1 path has no backend concurrency whatsoever,
not even spawning any goroutines.
This helps ensure that, should problems arise
late in the development cycle,
we can simply have cmd/go set c=1 always,
and revert to the original compiler behavior.

MUTEXES

Most of the work required to make concurrent backend
compilation safe has occurred over the past month.
This CL adds a handful of mutexes to get the rest of the way there;
they are the mutexes that I didn't see a clean way to avoid.
Some of them may still be eliminable in future work.

In no particular order:

* gc.funcsymsmu. The global funcsyms slice is populated
  lazily when we need function symbols for closures.
  This occurs during gc AST to SSA translation.
  The function funcsym also does a package lookup,
  which is a source of races on types.Pkg.Syms;
  funcsymsmu also covers that package lookup.
  This mutex is low priority: it adds a single global,
  it is in an infrequently used code path, and it is low contention.
  Since funcsyms may now be added in any order,
  we must sort them to preserve reproducible builds.

* gc.largeStackFramesMu. We don't discover until after SSA compilation
  that a function's stack frame is gigantic.
  Recording that error happens basically never,
  but it does happen concurrently.
  Fix with a low priority mutex and sorting.

* obj.Link.hashmu. ctxt.hash stores the mapping from
  types.Syms (compiler symbols) to obj.LSyms (linker symbols).
  It is accessed fairly heavily through all the phases.
  This is the only heavily contended mutex.

* gc.signatlistmu. The global signatlist map is
  populated with types through several of the concurrent phases,
  including notably via ngotype during DWARF generation.
  It is low priority for removal.

* gc.typepkgmu. Looking up symbols in the types package
  happens a fair amount during backend compilation
  and DWARF generation, particularly via ngotype.
  This mutex helps us to avoid a broader mutex on types.Pkg.Syms.
  It has low-to-moderate contention.

* types.internedStringsmu. gc AST to SSA conversion and
  some SSA work introduce new autotmps.
  Those autotmps have their names interned to reduce allocations.
  That interning requires protecting types.internedStrings.
  The autotmp names are heavily re-used, and the mutex
  overhead and contention here are low, so it is probably
  a worthwhile performance optimization to keep this mutex.

TESTING

I have been testing this code locally by running
'go install -race cmd/compile'
and then doing
'go build -a -gcflags=-c=128 std cmd'
for all architectures and a variety of compiler flags.
This obviously needs to be made part of the builders,
but it is too expensive to make part of all.bash.
I have filed #19962 for this.

REPRODUCIBLE BUILDS

This version of the compiler generates reproducible builds.
Testing reproducible builds also needs automation, however,
and is also too expensive for all.bash.
This is #19961.

Also of note is that some of the compiler flags used by 'toolstash -cmp'
are currently incompatible with concurrent backend compilation.
They still work fine with c=1.
Time will tell whether this is a problem.

NEXT STEPS

* Continue to find and fix races and bugs,
  using a combination of code inspection, fuzzing,
  and hopefully some community experimentation.
  I do not know of any outstanding races,
  but there probably are some.
* Improve testing.
* Improve performance, for many values of c.
* Integrate with cmd/go and fine tune.
* Support concurrent compilation with the -race flag.
  It is a sad irony that it does not yet work.
* Minor code cleanup that has been deferred during
  the last month due to uncertainty about the
  ultimate shape of this CL.

PERFORMANCE

Here's the buried lede, at last. :)

All benchmarks are from my 8 core 2.9 GHz Intel Core i7 darwin/amd64 laptop.

First, going from tip to this CL with c=1 has almost no impact.

name        old time/op       new time/op       delta
Template          195ms ± 3%        194ms ± 5%    ~     (p=0.370 n=30+29)
Unicode          86.6ms ± 3%       87.0ms ± 7%    ~     (p=0.958 n=29+30)
GoTypes           548ms ± 3%        555ms ± 4%  +1.35%  (p=0.001 n=30+28)
Compiler          2.51s ± 2%        2.54s ± 2%  +1.17%  (p=0.000 n=28+30)
SSA               5.16s ± 3%        5.16s ± 2%    ~     (p=0.910 n=30+29)
Flate             124ms ± 5%        124ms ± 4%    ~     (p=0.947 n=30+30)
GoParser          146ms ± 3%        146ms ± 3%    ~     (p=0.150 n=29+28)
Reflect           354ms ± 3%        352ms ± 4%    ~     (p=0.096 n=29+29)
Tar               107ms ± 5%        106ms ± 3%    ~     (p=0.370 n=30+29)
XML               200ms ± 4%        201ms ± 4%    ~     (p=0.313 n=29+28)
[Geo mean]        332ms             333ms       +0.10%

name        old user-time/op  new user-time/op  delta
Template          227ms ± 5%        225ms ± 5%    ~     (p=0.457 n=28+27)
Unicode           109ms ± 4%        109ms ± 5%    ~     (p=0.758 n=29+29)
GoTypes           713ms ± 4%        721ms ± 5%    ~     (p=0.051 n=30+29)
Compiler          3.36s ± 2%        3.38s ± 3%    ~     (p=0.146 n=30+30)
SSA               7.46s ± 3%        7.47s ± 3%    ~     (p=0.804 n=30+29)
Flate             146ms ± 7%        147ms ± 3%    ~     (p=0.833 n=29+27)
GoParser          179ms ± 5%        179ms ± 5%    ~     (p=0.866 n=30+30)
Reflect           431ms ± 4%        429ms ± 4%    ~     (p=0.593 n=29+30)
Tar               124ms ± 5%        123ms ± 5%    ~     (p=0.140 n=29+29)
XML               243ms ± 4%        242ms ± 7%    ~     (p=0.404 n=29+29)
[Geo mean]        415ms             415ms       +0.02%

name        old obj-bytes     new obj-bytes     delta
Template           382k ± 0%         382k ± 0%    ~     (all equal)
Unicode            203k ± 0%         203k ± 0%    ~     (all equal)
GoTypes           1.18M ± 0%        1.18M ± 0%    ~     (all equal)
Compiler          3.98M ± 0%        3.98M ± 0%    ~     (all equal)
SSA               8.28M ± 0%        8.28M ± 0%    ~     (all equal)
Flate              230k ± 0%         230k ± 0%    ~     (all equal)
GoParser           287k ± 0%         287k ± 0%    ~     (all equal)
Reflect           1.00M ± 0%        1.00M ± 0%    ~     (all equal)
Tar                190k ± 0%         190k ± 0%    ~     (all equal)
XML                416k ± 0%         416k ± 0%    ~     (all equal)
[Geo mean]         660k              660k       +0.00%

Comparing this CL to itself, from c=1 to c=2
improves real times 20-30%, costs 5-10% more CPU time,
and adds about 2% alloc.
The allocation increase comes from allocating more ssa.Caches.

name       old time/op       new time/op       delta
Template         202ms ± 3%        149ms ± 3%  -26.15%  (p=0.000 n=49+49)
Unicode         87.4ms ± 4%       84.2ms ± 3%   -3.68%  (p=0.000 n=48+48)
GoTypes          560ms ± 2%        398ms ± 2%  -28.96%  (p=0.000 n=49+49)
Compiler         2.46s ± 3%        1.76s ± 2%  -28.61%  (p=0.000 n=48+46)
SSA              6.17s ± 2%        4.04s ± 1%  -34.52%  (p=0.000 n=49+49)
Flate            126ms ± 3%         92ms ± 2%  -26.81%  (p=0.000 n=49+48)
GoParser         148ms ± 4%        107ms ± 2%  -27.78%  (p=0.000 n=49+48)
Reflect          361ms ± 3%        281ms ± 3%  -22.10%  (p=0.000 n=49+49)
Tar              109ms ± 4%         86ms ± 3%  -20.81%  (p=0.000 n=49+47)
XML              204ms ± 3%        144ms ± 2%  -29.53%  (p=0.000 n=48+45)

name       old user-time/op  new user-time/op  delta
Template         246ms ± 9%        246ms ± 4%     ~     (p=0.401 n=50+48)
Unicode          109ms ± 4%        111ms ± 4%   +1.47%  (p=0.000 n=44+50)
GoTypes          728ms ± 3%        765ms ± 3%   +5.04%  (p=0.000 n=46+50)
Compiler         3.33s ± 3%        3.41s ± 2%   +2.31%  (p=0.000 n=49+48)
SSA              8.52s ± 2%        9.11s ± 2%   +6.93%  (p=0.000 n=49+47)
Flate            149ms ± 4%        161ms ± 3%   +8.13%  (p=0.000 n=50+47)
GoParser         181ms ± 5%        192ms ± 2%   +6.40%  (p=0.000 n=49+46)
Reflect          452ms ± 9%        474ms ± 2%   +4.99%  (p=0.000 n=50+48)
Tar              126ms ± 6%        136ms ± 4%   +7.95%  (p=0.000 n=50+49)
XML              247ms ± 5%        264ms ± 3%   +6.94%  (p=0.000 n=48+50)

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       39.3MB ± 0%   +1.48%  (p=0.008 n=5+5)
Unicode         29.8MB ± 0%       30.2MB ± 0%   +1.19%  (p=0.008 n=5+5)
GoTypes          113MB ± 0%        114MB ± 0%   +0.69%  (p=0.008 n=5+5)
Compiler         443MB ± 0%        447MB ± 0%   +0.95%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.26GB ± 0%   +0.89%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.9MB ± 1%   +2.35%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       32.2MB ± 0%   +1.59%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       78.9MB ± 0%   +0.91%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       27.0MB ± 0%   +1.80%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       43.4MB ± 0%   +2.35%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          379k ± 0%         378k ± 0%     ~     (p=0.421 n=5+5)
Unicode           322k ± 0%         321k ± 0%     ~     (p=0.222 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.548 n=5+5)
Compiler         4.12M ± 0%        4.11M ± 0%   -0.14%  (p=0.032 n=5+5)
SSA              9.72M ± 0%        9.72M ± 0%     ~     (p=0.421 n=5+5)
Flate             234k ± 1%         234k ± 0%     ~     (p=0.421 n=5+5)
GoParser          316k ± 1%         315k ± 0%     ~     (p=0.222 n=5+5)
Reflect           980k ± 0%         979k ± 0%     ~     (p=0.095 n=5+5)
Tar               249k ± 1%         249k ± 1%     ~     (p=0.841 n=5+5)
XML               392k ± 0%         391k ± 0%     ~     (p=0.095 n=5+5)

From c=1 to c=4, real time is down ~40%, CPU usage up 10-20%, alloc up ~5%:

name       old time/op       new time/op       delta
Template         203ms ± 3%        131ms ± 5%  -35.45%  (p=0.000 n=50+50)
Unicode         87.2ms ± 4%       84.1ms ± 2%   -3.61%  (p=0.000 n=48+47)
GoTypes          560ms ± 4%        310ms ± 2%  -44.65%  (p=0.000 n=50+49)
Compiler         2.47s ± 3%        1.41s ± 2%  -43.10%  (p=0.000 n=50+46)
SSA              6.17s ± 2%        3.20s ± 2%  -48.06%  (p=0.000 n=49+49)
Flate            126ms ± 4%         74ms ± 2%  -41.06%  (p=0.000 n=49+48)
GoParser         148ms ± 4%         89ms ± 3%  -39.97%  (p=0.000 n=49+50)
Reflect          360ms ± 3%        242ms ± 3%  -32.81%  (p=0.000 n=49+49)
Tar              108ms ± 4%         73ms ± 4%  -32.48%  (p=0.000 n=50+49)
XML              203ms ± 3%        119ms ± 3%  -41.56%  (p=0.000 n=49+48)

name       old user-time/op  new user-time/op  delta
Template         246ms ± 9%        287ms ± 9%  +16.98%  (p=0.000 n=50+50)
Unicode          109ms ± 4%        118ms ± 5%   +7.56%  (p=0.000 n=46+50)
GoTypes          735ms ± 4%        806ms ± 2%   +9.62%  (p=0.000 n=50+50)
Compiler         3.34s ± 4%        3.56s ± 2%   +6.78%  (p=0.000 n=49+49)
SSA              8.54s ± 3%       10.04s ± 3%  +17.55%  (p=0.000 n=50+50)
Flate            149ms ± 6%        176ms ± 3%  +17.82%  (p=0.000 n=50+48)
GoParser         181ms ± 5%        213ms ± 3%  +17.47%  (p=0.000 n=50+50)
Reflect          453ms ± 6%        499ms ± 2%  +10.11%  (p=0.000 n=50+48)
Tar              126ms ± 5%        149ms ±11%  +18.76%  (p=0.000 n=50+50)
XML              246ms ± 5%        287ms ± 4%  +16.53%  (p=0.000 n=49+50)

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       40.4MB ± 0%   +4.21%  (p=0.008 n=5+5)
Unicode         29.8MB ± 0%       30.9MB ± 0%   +3.68%  (p=0.008 n=5+5)
GoTypes          113MB ± 0%        116MB ± 0%   +2.71%  (p=0.008 n=5+5)
Compiler         443MB ± 0%        455MB ± 0%   +2.75%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.27GB ± 0%   +1.84%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       26.9MB ± 1%   +6.31%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       33.2MB ± 0%   +4.61%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       80.2MB ± 0%   +2.53%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       27.9MB ± 0%   +5.19%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       44.6MB ± 0%   +5.20%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          380k ± 0%         379k ± 0%   -0.39%  (p=0.032 n=5+5)
Unicode           321k ± 0%         321k ± 0%     ~     (p=0.841 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.421 n=5+5)
Compiler         4.12M ± 0%        4.14M ± 0%   +0.52%  (p=0.008 n=5+5)
SSA              9.72M ± 0%        9.76M ± 0%   +0.37%  (p=0.008 n=5+5)
Flate             234k ± 1%         234k ± 1%     ~     (p=0.690 n=5+5)
GoParser          316k ± 0%         317k ± 1%     ~     (p=0.841 n=5+5)
Reflect           981k ± 0%         981k ± 0%     ~     (p=1.000 n=5+5)
Tar               250k ± 0%         249k ± 1%     ~     (p=0.151 n=5+5)
XML               393k ± 0%         392k ± 0%     ~     (p=0.056 n=5+5)

Going beyond c=4 on my machine tends to increase CPU time and allocs
without impacting real time.

The CPU time numbers matter, because when there are many concurrent
compilation processes, that will impact the overall throughput.

The numbers above are in many ways the best case scenario;
we can take full advantage of all cores.
Fortunately, the most common compilation scenario is incremental
re-compilation of a single package during a build/test cycle.

Updates #15756

Change-Id: I6725558ca2069edec0ac5b0d1683105a9fff6bea
Reviewed-on: https://go-review.googlesource.com/40693
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-27 00:59:07 +00:00
Josh Bleecher Snyder
e1a7db7f3b cmd/compile: minor cleanup
Follow-up to review comments on CL 41797.

Mask the input to set2 and set3, so that at the very least,
we won't corrupt the rest of the flags in case of a bad input.
It also seems more semantically appropriate.

Do minor cleanup in addrescapes. I started on larger cleanup,
but it wasn't clear that it was an improvement.

Add warning comments and sanity checks to Initorder and Class constants,
to attempt to prevent them from overflowing their allotted flag bits.

Passes toolstash-check.

Change-Id: I57b9661ba36f56406aa7a1d8da9b7c70338f9119
Reviewed-on: https://go-review.googlesource.com/41817
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 18:01:14 +00:00
Matthew Dempsky
c87520c598 cmd: remove IntSize and Widthint
Use PtrSize and Widthptr instead. CL prepared mostly with sed and
uniq.

Passes toolstash-check -all.

Fixes #19954.

Change-Id: I09371bd7128672885cb8bc4e7f534ad56a88d755
Reviewed-on: https://go-review.googlesource.com/40506
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-22 17:43:43 +00:00
Josh Bleecher Snyder
24c52ee570 cmd/compile: move typepkg to package types
Response to code review feedback on CL 40693.

It is now only accessible by types.TypePkgLookup.

Passes toolstash-check.

Change-Id: I0c422c1a271f97467ae38de53af9dc33f4b31bdb
Reviewed-on: https://go-review.googlesource.com/41304
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-04-21 16:11:15 +00:00
Keith Randall
38dee12dea cmd/compile: zero ambiguously live variables at VARKILLs
At VARKILLs, zero a variable if it is ambiguously live.
After the VARKILL anything this variable references
might be collected. If it were to become live again later,
the GC will see references to already-collected objects.

We don't know a variable is ambiguously live until very
late in compilation (after lowering, register allocation, ...),
so it is hard to generate the code in an arch-independent way.
We also have to be careful not to clobber any registers.
Fortunately, this almost never happens so performance is ~irrelevant.

There are only 2 instances where this triggers in the stdlib.

Fixes #20029

Change-Id: Ia9585a91d7b823fad4a9d141d954464cc7af31f4
Reviewed-on: https://go-review.googlesource.com/41076
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-04-20 23:47:43 +00:00
Matthew Dempsky
263ba3ac7b cmd/compile/internal/gc: make defframe arch-independent
The arch backends no longer depend on gc.Node.

Passes toolstash-check -all.

Change-Id: Ic7e49ae0a3ed155a2761c25e17cc341b46333fb4
Reviewed-on: https://go-review.googlesource.com/41196
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-20 18:34:14 +00:00
Josh Bleecher Snyder
01b1a34aac cmd/compile: rework handling of udiv on ARM
Instead of populating the aux symbol
of CALLudiv during rewrite rules,
populate it during genssa.

This simplifies the rewrite rules.
It also removes all remaining calls
to ctxt.Lookup from any rewrite rules.
This is a first step towards removing
ctxt from ssa.Cache entirely,
and also a first step towards converting
the obj.LSym.Version field into a boolean.
It should also speed up compilation.

Also, move func udiv into package runtime.
That's where it is anyway,
and it lets udiv look and act like the rest of
the runtime support functions.

Change-Id: I41462a632c14fdc41f61b08049ec13cd80a87bfe
Reviewed-on: https://go-review.googlesource.com/41191
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-20 16:27:38 +00:00
Josh Bleecher Snyder
245ef3a157 cmd/compile: look up more runtime symbols before SSA begins
This avoids concurrent runtime package lookups.

Updates #15756

Change-Id: I9e2cbd042aba44923f0d03e6ca5b4eb60fa9e7ea
Reviewed-on: https://go-review.googlesource.com/40853
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-18 02:14:12 +00:00
Josh Bleecher Snyder
da67c23fbb cmd/compile: add flag to disable DWARF generation
DWARF generation has non-trivial cost,
and in some cases is not necessary.
Provide an option to opt out.

Alloc impact of disabling DWARF generation:

name       old alloc/op      new alloc/op      delta
Template        38.7MB ± 0%       37.6MB ± 0%  -2.77%  (p=0.016 n=5+4)
Unicode         29.8MB ± 0%       29.8MB ± 0%  -0.16%  (p=0.032 n=5+5)
GoTypes          113MB ± 0%        110MB ± 0%  -2.38%  (p=0.008 n=5+5)
Compiler         463MB ± 0%        457MB ± 0%  -1.34%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.23GB ± 0%  -1.64%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.0MB ± 0%  -1.05%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       30.9MB ± 0%  -2.74%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       76.7MB ± 0%  -1.90%  (p=0.008 n=5+5)
Tar             26.5MB ± 0%       26.0MB ± 0%  -2.04%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       41.1MB ± 0%  -2.86%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          377k ± 0%         360k ± 1%  -4.46%  (p=0.008 n=5+5)
Unicode           321k ± 0%         320k ± 0%    ~     (p=0.151 n=5+5)
GoTypes          1.14M ± 0%        1.10M ± 0%  -4.13%  (p=0.008 n=5+5)
Compiler         4.26M ± 0%        4.13M ± 0%  -3.14%  (p=0.008 n=5+5)
SSA              9.70M ± 0%        9.33M ± 0%  -3.89%  (p=0.008 n=5+5)
Flate             233k ± 0%         228k ± 0%  -2.40%  (p=0.008 n=5+5)
GoParser          316k ± 0%         302k ± 0%  -4.48%  (p=0.008 n=5+5)
Reflect           980k ± 0%         945k ± 0%  -3.62%  (p=0.008 n=5+5)
Tar               249k ± 0%         241k ± 0%  -3.19%  (p=0.008 n=5+5)
XML               391k ± 0%         376k ± 0%  -3.95%  (p=0.008 n=5+5)

Change-Id: I97dbfb6b40195d1e0b91be097a4bf0e7f65b26af
Reviewed-on: https://go-review.googlesource.com/40857
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-17 03:16:28 +00:00
Josh Bleecher Snyder
0e4824aacc cmd/compile: remove flag_largemodel
It was added in 2013 in CL 7064048.
All uses of it in the compiler disappeared with
(or possibly before) the SSA backend.
Several releases have gone by without it,
from which I conclude that it is now not needed.

Change-Id: I2095f4ac05d4d7ab998168993a7fd5d954aeee88
Reviewed-on: https://go-review.googlesource.com/40856
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-16 14:53:20 +00:00
Robert Griesemer
f68f292820 cmd/compile: factor out Pkg, Sym, and Type into package types
- created new package cmd/compile/internal/types
- moved Pkg, Sym, Type to new package
- to break cycles, for now we need the (ugly) types/utils.go
  file which contains a handful of functions that must be installed
  early by the gc frontend
- to break cycles, for now we need two functions to convert between
  *gc.Node and *types.Node (the latter is a dummy type)
- adjusted the gc's code to use the new package and the conversion
  functions as needed
- made several Pkg, Sym, and Type methods functions as needed
- renamed constructors typ, typPtr, typArray, etc. to types.New,
  types.NewPtr, types.NewArray, etc.

Passes toolstash-check -all.

Change-Id: I8adfa5e85c731645d0a7fd2030375ed6ebf54b72
Reviewed-on: https://go-review.googlesource.com/39855
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 03:04:00 +00:00