Escape analysis has a hard time with tree-like
structures (see #13493 and #14858).
This is unlikely to change.
As a result, when invoking a function that accepts
a **Node parameter, we usually allocate a *Node
on the heap. This happens a whole lot.
This CL changes functions from taking a **Node
to acting more like append: It both modifies
the input and returns a replacement for it.
Because of the cascading nature of escape analysis,
in order to get the benefits, I had to modify
almost all such functions. The remaining functions
are in racewalk and the backend. I would be happy
to update them as well in a separate CL.
This CL was created by manually updating the
function signatures and the directly impacted
bits of code. The callsites were then automatically
updated using a bespoke script:
https://gist.github.com/josharian/046b1be7aceae244de39
For ease of reviewing and future understanding,
this CL is also broken down into four CLs,
mailed separately, which show the manual
and the automated changes separately.
They are CLs 20990, 20991, 20992, and 20993.
Passes toolstash -cmp.
name old time/op new time/op delta
Template 335ms ± 5% 324ms ± 5% -3.35% (p=0.000 n=23+24)
Unicode 176ms ± 9% 165ms ± 6% -6.12% (p=0.000 n=23+24)
GoTypes 1.10s ± 4% 1.07s ± 2% -2.77% (p=0.000 n=24+24)
Compiler 5.31s ± 3% 5.15s ± 3% -2.95% (p=0.000 n=24+24)
MakeBash 41.6s ± 1% 41.7s ± 2% ~ (p=0.586 n=23+23)
name old alloc/op new alloc/op delta
Template 63.3MB ± 0% 62.4MB ± 0% -1.36% (p=0.000 n=25+23)
Unicode 42.4MB ± 0% 41.6MB ± 0% -1.99% (p=0.000 n=24+25)
GoTypes 220MB ± 0% 217MB ± 0% -1.11% (p=0.000 n=25+25)
Compiler 994MB ± 0% 973MB ± 0% -2.08% (p=0.000 n=24+25)
name old allocs/op new allocs/op delta
Template 681k ± 0% 574k ± 0% -15.71% (p=0.000 n=24+25)
Unicode 518k ± 0% 413k ± 0% -20.34% (p=0.000 n=25+24)
GoTypes 2.08M ± 0% 1.78M ± 0% -14.62% (p=0.000 n=25+25)
Compiler 9.26M ± 0% 7.64M ± 0% -17.48% (p=0.000 n=25+25)
name old text-bytes new text-bytes delta
HelloSize 578k ± 0% 578k ± 0% ~ (all samples are equal)
CmdGoSize 6.46M ± 0% 6.46M ± 0% ~ (all samples are equal)
name old data-bytes new data-bytes delta
HelloSize 128k ± 0% 128k ± 0% ~ (all samples are equal)
CmdGoSize 281k ± 0% 281k ± 0% ~ (all samples are equal)
name old exe-bytes new exe-bytes delta
HelloSize 921k ± 0% 921k ± 0% ~ (all samples are equal)
CmdGoSize 9.86M ± 0% 9.86M ± 0% ~ (all samples are equal)
Change-Id: I277d95bd56d51c166ef7f560647aeaa092f3f475
Reviewed-on: https://go-review.googlesource.com/20959
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Boolean expressions involving t.Thistuple were converted to use
t.Recv(), because it's a bit clearer and will hopefully reveal cases
where we could remove redundant calls to t.Recv() (in followup CLs).
The other cases were all converted to use t.Recvs().NumFields(),
t.Params().NumFields(), or t.Results().NumFields().
Change-Id: I4df91762e7dc4b2ddae35995f8dd604a52c09b09
Reviewed-on: https://go-review.googlesource.com/20796
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
This is an automated rewrite of all the calls of the form:
for f, it := IterFields(t); f != nil; f = it.Next() { ... }
Followup CLs will work on cleaning up the remaining cases.
Change-Id: Ic1005ad45ae0b50c63e815e34e507e2d2644ba1a
Reviewed-on: https://go-review.googlesource.com/20794
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The obj.Fmt* values are only used by gc/fmt.go, so just move them
there. Also, add comments documenting the correspondance between
FmtFoo names and their flag characters to make understanding the
existing documentation slightly less confusing.
While here, add a new FmtFlag named type to represent these values.
Change-Id: I9631214b892557d094823f1ac575d0c43a84007b
Reviewed-on: https://go-review.googlesource.com/20717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Parts of the SSA compiler in package gc contain amd64-specific code,
most notably Prog generation. Move this code into package amd64, so that
other architectures can be added more easily.
In package gc, this change is just moving code. There are no functional
changes or even any larger structural changes beyond changing function
names (mostly for export).
In the cmd/compile/internal/ssa/gen tool, more information is included
in arch to remove the AMD64-specific behavior in the main portion of the
tool. The generated opGen.go is identical.
Change-Id: I8eb37c6e6df6de1b65fa7dab6f3bc32c29daf643
Reviewed-on: https://go-review.googlesource.com/20609
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In addition to reflect.Value.Call, exported methods can be invoked
by the Func value in the reflect.Method struct. This CL has the
compiler track what functions get access to a legitimate reflect.Method
struct by looking for interface calls to either of:
Method(int) reflect.Method
MethodByName(string) (reflect.Method, bool)
This is a little overly conservative. If a user implements a type
with one of these methods without using the underlying calls on
reflect.Type, the linker will assume the worst and include all
exported methods. But it's cheap.
No change to any of the binary sizes reported in cl/20483.
For #14740
Change-Id: Ie17786395d0453ce0384d8b240ecb043b7726137
Reviewed-on: https://go-review.googlesource.com/20489
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Change-Id: Ice3aa807169f4fec85745a3991b1084a9f85c1b5
Reviewed-on: https://go-review.googlesource.com/20499
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
More idiomatic naming (in particular, matches the naming used for
go/types.Signature).
Also, convert more code to use these methods and/or IterFields.
(Still more to go; only made a quick pass for low hanging fruit.)
Passes toolstash -cmp.
Change-Id: I61831bfb1ec2cd50d4c7efc6062bca4e0dcf267b
Reviewed-on: https://go-review.googlesource.com/20451
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Make more idiomatic with a defer cleanup, which allows declaring
variables closer to their first use, rather than up front before the
first goto statement.
Also, split the legacy code generation code path into a separate
genlegacy function, analogous to the new genssa.
Passes toolstash -cmp.
Change-Id: I86c22838704f6861b75716ae64ba103b0e73b12f
Reviewed-on: https://go-review.googlesource.com/20353
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Without SSA:
$ go build -a -gcflags='-S -ssa=0' runtime 2>&1 | grep 'TEXT.*""\.init(SB)'
0x0000 00000 ($GOROOT/src/runtime/write_err.go:14) TEXT "".init(SB), $88-0
With SSA, before this CL:
$ go build -a -gcflags='-S -ssa=1' runtime 2>&1 | grep 'TEXT.*""\.init(SB)'
0x0000 00000 ($GOROOT/src/runtime/traceback.go:608) TEXT "".init(SB), $152-0
With SSA, after this CL:
$ go build -a -gcflags='-S -ssa=1' runtime 2>&1 | grep 'TEXT.*""\.init(SB)'
0x0000 00000 ($GOROOT/src/runtime/write_err.go:14) TEXT "".init(SB), $152-0
Change-Id: Ida3541e03a1af6ffc753ee5c3abeb653459edbf6
Reviewed-on: https://go-review.googlesource.com/20321
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Eliminates type conversions in a bunch of Oconv(int(n.Op), ...) calls.
Notably, this identified a misuse of Oconv in amd64/gsubr.go to try to
print an assembly instruction op instead of a compiler node op.
Change-Id: I93b5aa49fe14a5eaf868b05426d3b8cd8ab52bc5
Reviewed-on: https://go-review.googlesource.com/20298
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This CL addresses some issues noted during CL 20089.
Change-Id: I4e91a8077c07a571ccc9c004278672eb951c5104
Reviewed-on: https://go-review.googlesource.com/20181
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Automated CL prepared by github.com/mdempsky/unconvert, except for
reverting changes to ssa/rewritegeneric.go (generated file) and
package big (vendored copy of math/big).
Change-Id: I64dc4199f14077c7b6a2f334b12249d4a785eadd
Reviewed-on: https://go-review.googlesource.com/20089
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
Introduces a new types Nodes that can be used to replace NodeList.
Update #14473.
Change-Id: Id77c5dcae0cbeb898ba12dd46bd400aad408871c
Reviewed-on: https://go-review.googlesource.com/19969
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
By using a Pragma bit set (8 bits) rather than 8 booleans, also
reduce Func type size by 8 bytes (208B -> 200B on 64bit platforms,
116B -> 108B on 32bit platforms).
Change-Id: Ibb7e1f8c418a0b5bc6ff813cbdde7bc6f0013b5a
Reviewed-on: https://go-review.googlesource.com/19966
Reviewed-by: Dave Cheney <dave@cheney.net>
A slice uses less memory than a NodeList, and has better memory locality
when walking the list.
This uncovered a tricky case involving closures: the escape analysis
pass when run on a closure was appending to the Dcl list of the OCLOSURE
rather than the ODCLFUNC. This happened to work because they shared the
same NodeList. Fixed with a change to addrescapes, and a check to
Tempname to catch any recurrences.
This removes the last use of the listsort function outside of tests.
I'll send a separate CL to remove it.
Unfortunately, while this passes all tests, it does not pass toolstash
-cmp. The problem is that cmpstackvarlt does not fully determine the
sort order, and the change from listsort to sort.Sort, while generally
desirable, produces a different ordering. I could stage this by first
making cmpstackvarlt fully determined, but no matter what toolstash -cmp
is going to break at some point.
In my casual testing the compiler is 2.2% faster.
Update #14473.
Change-Id: I367d66daa4ec73ed95c14c66ccda3a2133ad95d5
Reviewed-on: https://go-review.googlesource.com/19919
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Previously, given two Nodes n1 and n2 of different non-PAUTO classes
(e.g., PPARAM and PPARAMOUT), cmpstackvarlt(n1, n2) and
cmpstackvarlt(n2, n1) both returned true, which is nonsense.
This doesn't seem to cause any visible miscompilation problems, but
notably fixing it does cause toolstash/buildall to fail.
Change-Id: I33b2c66e902c5eced875d8fbf18b7cfdc81e8aed
Reviewed-on: https://go-review.googlesource.com/19778
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The old write barriers used _nostore versions, which
don't work for Ian's cgo checker. Instead, we adopt the
same write barrier pattern as the default compiler.
It's a bit trickier to code up but should be more efficient.
Change-Id: I6696c3656cf179e28f800b0e096b7259bd5f3bb7
Reviewed-on: https://go-review.googlesource.com/18941
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Consider this code:
func f(*int)
func g() {
p := new(int)
f(p)
}
where f is an assembly function.
In general liveness analysis assumes that during the call to f, p is dead
in this frame. If f has retained p, p will be found alive in f's frame and keep
the new(int) from being garbage collected. This is all correct and works.
We use the Go func declaration for f to give the assembly function
liveness information (the arguments are assumed live for the entire call).
Now consider this code:
func h1() {
p := new(int)
syscall.Syscall(1, 2, 3, uintptr(unsafe.Pointer(p)))
}
Here syscall.Syscall is taking the place of f, but because its arguments
are uintptr, the liveness analysis and the garbage collector ignore them.
Since p is no longer live in h once the call starts, if the garbage collector
scans the stack while the system call is blocked, it will find no reference
to the new(int) and reclaim it. If the kernel is going to write to *p once
the call finishes, reclaiming the memory is a mistake.
We can't change the arguments or the liveness information for
syscall.Syscall itself, both for compatibility and because sometimes the
arguments really are integers, and the garbage collector will get quite upset
if it finds an integer where it expects a pointer. The problem is that
these arguments are fundamentally untyped.
The solution we have taken in the syscall package's wrappers in past
releases is to insert a call to a dummy function named "use", to make
it look like the argument is live during the call to syscall.Syscall:
func h2() {
p := new(int)
syscall.Syscall(1, 2, 3, uintptr(unsafe.Pointer(p)))
use(unsafe.Pointer(p))
}
Keeping p alive during the call means that if the garbage collector
scans the stack during the system call now, it will find the reference to p.
Unfortunately, this approach is not available to users outside syscall,
because 'use' is unexported, and people also have to realize they need
to use it and do so. There is much existing code using syscall.Syscall
without a 'use'-like function. That code will fail very occasionally in
mysterious ways (see #13372).
This CL fixes all that existing code by making the compiler do the right
thing automatically, without any code modifications. That is, it takes h1
above, which is incorrect code today, and makes it correct code.
Specifically, if the compiler sees a foreign func definition (one
without a body) that has uintptr arguments, it marks those arguments
as "unsafe uintptrs". If it later sees the function being called
with uintptr(unsafe.Pointer(x)) as an argument, it arranges to mark x
as having escaped, and it makes sure to hold x in a live temporary
variable until the call returns, so that the garbage collector cannot
reclaim whatever heap memory x points to.
For now I am leaving the explicit calls to use in package syscall,
but they can be removed early in a future cycle (likely Go 1.7).
The rule has no effect on escape analysis, only on liveness analysis.
Fixes#13372.
Change-Id: I2addb83f70d08db08c64d394f9d06ff0a063c500
Reviewed-on: https://go-review.googlesource.com/18584
Reviewed-by: Ian Lance Taylor <iant@golang.org>
We used to compile everything with SSA and then decide whether
to use the result or not. It was useful when we were working
on coverage without much regard for correctness, but not so much now.
Instead, let's decide what we're going to compile and go through
the SSA compiler for only those functions.
TODO: next CL: get rid of all the UnimplementedF stuff.
Change-Id: If629addd8b62cd38ef553fd5d835114137885ce0
Reviewed-on: https://go-review.googlesource.com/17763
Reviewed-by: David Chase <drchase@google.com>
It is based on ppc64 compiler.
Change-Id: I15a101df05f2919ba5292136957ba0009227d067
Reviewed-on: https://go-review.googlesource.com/14445
Reviewed-by: Minux Ma <minux@golang.org>
Declare a function's arguments as having already been
spilled so their use just requires a restore.
Allow spill locations to be portions of larger objects the stack.
Required to load portions of compound input arguments.
Rename the memory input to InputMem. Use Arg for the
pre-spilled argument values.
Change-Id: I8fe2a03ffbba1022d98bfae2052b376b96d32dda
Reviewed-on: https://go-review.googlesource.com/16536
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Added an explicit compare-zero and branch-to-panic for
integer division and mod so that other optimizations will
not be fooled by their implicit panics.
Change-Id: Ibf96f636b541c0088861907c537a6beb4b99fa4c
Reviewed-on: https://go-review.googlesource.com/16450
Reviewed-by: Keith Randall <khr@golang.org>
Modified GOSSA{HASH.PKG} environment variable filters to
make it easier to make/run with all SSA for testing.
Disable attempts at SSA for architectures that are not
amd64 (avoid spurious errors/unimplementeds.)
Removed easy out for unimplemented features.
Add convert op for proper liveness in presence of uintptr
to/from unsafe.Pointer conversions.
Tweaked stack sizes to get a pass on windows;
1024 instead 768, was observed to pass at least once.
Change-Id: Ida3800afcda67d529e3b1cf48ca4a3f0fa48b2c5
Reviewed-on: https://go-review.googlesource.com/16201
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
This is mechanical change that is a step toward reusing the racewalk
pass for a more general instrumentation pass. The first use will be to
add support for the memory sanitizer.
Change-Id: I75b93b814ac60c1db1660e0b9a9a7d7977d86939
Reviewed-on: https://go-review.googlesource.com/16105
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
A followup CL will rewrite listsort to use the new cmpstackvarlt and
change cmpstackvar to avoid stringsCompare.
Change-Id: Idf0857a3bd67f9e2243ba82aa0bff510612927c3
Reviewed-on: https://go-review.googlesource.com/14611
Reviewed-by: Dave Cheney <dave@cheney.net>
Convert some fields of struct Type in go.go from uint8 to bool.
This change passes go build -toolexec 'toolstash -cmp' -a std.
Change-Id: I0a6c53f8ee686839b5234010ee2de7ae3940d499
Reviewed-on: https://go-review.googlesource.com/14370
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This helps vet see a real issue:
cmd/internal/gc$ go vet
gen.go:1223: unreachable code
Fixes#12106.
Change-Id: I720868b07ae6b6d5a4dc6b238baa8c9c889da6d8
Reviewed-on: https://go-review.googlesource.com/14083
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The SSA implementation logs for three purposes:
* debug logging
* fatal errors
* unimplemented features
Separating these three uses lets us attempt an SSA
implementation for all functions, not just
_ssa functions. This turns the entire standard
library into a compilation test, and makes it
easy to figure out things like
"how much coverage does SSA have now" and
"what should we do next to get more coverage?".
Functions called _ssa are still special.
They log profusely by default and
the output of the SSA implementation
is used. For all other functions,
logging is off, and the implementation
is built and discarded, due to lack of
support for the runtime.
While we're here, fix a few minor bugs and
add some extra Unimplementeds to allow
all.bash to pass.
As of now, SSA handles 20.79% of the functions
in the standard library (689 of 3314).
The top missing features are:
10.03% 2597 SSA unimplemented: zero for type error not implemented
7.79% 2016 SSA unimplemented: addr: bad op DOTPTR
7.33% 1898 SSA unimplemented: unhandled expr EQ
6.10% 1579 SSA unimplemented: unhandled expr OROR
4.91% 1271 SSA unimplemented: unhandled expr NE
4.49% 1163 SSA unimplemented: unhandled expr LROT
4.00% 1036 SSA unimplemented: unhandled expr LEN
3.56% 923 SSA unimplemented: unhandled stmt CALLFUNC
2.37% 615 SSA unimplemented: zero for type []byte not implemented
1.90% 492 SSA unimplemented: unhandled stmt CALLMETH
1.74% 450 SSA unimplemented: unhandled expr CALLINTER
1.74% 450 SSA unimplemented: unhandled expr DOT
1.71% 444 SSA unimplemented: unhandled expr ANDAND
1.65% 426 SSA unimplemented: unhandled expr CLOSUREVAR
1.54% 400 SSA unimplemented: unhandled expr CALLMETH
1.51% 390 SSA unimplemented: unhandled stmt SWITCH
1.47% 380 SSA unimplemented: unhandled expr CONV
1.33% 345 SSA unimplemented: addr: bad op *
1.30% 336 SSA unimplemented: unhandled OLITERAL 6
Change-Id: I4ca07951e276714dc13c31de28640aead17a1be7
Reviewed-on: https://go-review.googlesource.com/11160
Reviewed-by: Keith Randall <khr@golang.org>
//go:systemstack means that the function must run on the system stack.
Add one use in runtime as a demonstration.
Fixes#9174.
Change-Id: I8d4a509cb313541426157da703f1c022e964ace4
Reviewed-on: https://go-review.googlesource.com/10840
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>