Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.
Fixes#22460.
Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).
Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Reviewed-on: https://go-review.googlesource.com/92705
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Turn off append-to-itself optimization if optimizations are turned off.
This optimization triggered a bug when doing
s = append(s, s)
where we write to the leftmost s before reading the rightmost s.
Update #17039
Change-Id: I21996532d20a75db6ec8d49db50cb157a1360b80
Reviewed-on: https://go-review.googlesource.com/81816
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The ssa backend is aggressive about placing constants and
certain other values in the Entry block. It's implausible
that the original line numbers for these constants makes
any sort of sense when it appears to a user stepping in a
debugger, and they're also not that useful in dumps since
entry-block instructions tend to be constants (i.e.,
unlikely to be the cause of a crash).
Therefore, use src.NoXPos for any values that are explicitly
inserted into a function's entry block.
Passes all tests, including ssa/debug_test.go with both
gdb and a fairly recent dlv. Hand-verified that it solves
the reported problem; constructed a test that reproduced
a problem, and fixed it.
Modified test harness to allow injection of slightly more
interesting inputs.
Fixes#22558.
Change-Id: I4476927067846bc4366da7793d2375c111694c55
Reviewed-on: https://go-review.googlesource.com/81215
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This changes the assembly language output to use the
innermost (instead of outermost) position for line
number and file.
The file is printed separately, only when it changes,
to remove redundant and space-consuming noise from the
output.
Unknown positions have line number "?"
The output format was changed slightly to make it
easier to read.
One source of gratuitous variation in debugging output was
removed.
Change-Id: I1fd9c8b0ddd82766488582fb684dce4b04f35723
Reviewed-on: https://go-review.googlesource.com/78895
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Fixes#22703
The fix was already done by Cherry for defer/go of an interface call (CL 23820).
We just need to do it everywhere.
Change-Id: I0115d22e443931fe1bcce44c93c4d0770b5fd268
Reviewed-on: https://go-review.googlesource.com/77450
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Since the slice of names is almost exactly the same as what stringer is
already generating for us.
Passes toolstash -cmp on std cmd.
Change-Id: I3f1e95efc690c0108236689e721627f00f79a461
Reviewed-on: https://go-review.googlesource.com/77190
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Now possible, since stringer just got the -trimprefix flag added.
While at it, simplify a few Op stringifications since we can now use %v,
and no longer have to worry about o<len(opnames).
Passes toolstash -cmp on std cmd.
Fixes#15462.
Change-Id: Icdcde0b0a5eb165d18488918175024da274f782b
Reviewed-on: https://go-review.googlesource.com/76790
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
The linker will refuse to work on objects larger than
2e9 bytes (see issue #9862 for why).
With this change, the compiler gives a useful error
message explaining this, instead of leaving it to the
linker to give a cryptic message later.
Fixes#1700.
Change-Id: I3933ce08ef846721ece7405bdba81dff644cb004
Reviewed-on: https://go-review.googlesource.com/74330
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Plain blocks that contain only uninteresting instructions
(that do not have reliable Pos information themselves)
need to have their Pos left unset so that they can
inherit it from their successors. The "uninteresting"
test was not properly applied and not properly defined.
OpFwdRef does not appear in the ssa.html debugging output,
but at the time of the test these instructions did appear,
and it needs to be part of the test.
Fixes#22365.
Change-Id: I99e5b271acd8f6bcfe0f72395f905c7744ea9a02
Reviewed-on: https://go-review.googlesource.com/74252
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently, assigning a []T where T is a go:notinheap type generates an
unnecessary write barrier for storing the slice pointer.
This fixes this by teaching HasHeapPointer that this type does not
have a heap pointer, and tweaking the lowering of slice assignments so
the pointer store retains the correct type rather than simply lowering
it to a *uint8 store.
Change-Id: I8bf7c66e64a7fefdd14f2bd0de8a5a3596340bab
Reviewed-on: https://go-review.googlesource.com/76027
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A statement like
foo = bar + qux
might compile to
AX := AX + BX
resulting in a regkill for AX before this instruction.
The buggy behavior is to kill AX "at" this instruction,
before it has executed. (Code generation of no-instruction
values like RegKills applies their effects at the
next actual instruction emitted).
However, bar is still associated with AX until after the
instruction executes, so the effect of the regkill must
occur at the boundary between this instruction and the
next. Similarly, the new value bound to AX is not visible
until this instruction executes (and in the case of values
that require multiple instructions in code generation, until
all of them have executed).
The ranges are adjusted so that a value's start occurs
at the next following instruction after its evaluation,
and the end occurs after (execution of) the first
instruction following the end of the lifetime as a value.
(Notice the asymmetry; the entire value must be finished
before it is visible, but execution of a single instruction
invalidates. However, the value *is* visible before that
next instruction executes).
The test was adjusted to make it insensitive to the result
numbering for variables printed by gdb, since that is not
relevant to the test and makes the differences introduced
by small changes larger than necessary/useful.
The test was also improved to present variable probes
more intuitively, and also to allow explicit indication
of "this variable was optimized out"
Change-Id: I39453eead8399e6bb05ebd957289b112d1100c0e
Reviewed-on: https://go-review.googlesource.com/74090
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
We already do this for floor/ceil, but RoundToEven was added later.
Intrinsify it also.
name old time/op new time/op delta
RoundToEven-8 3.00ns ± 1% 0.68ns ± 2% -77.34% (p=0.000 n=10+10)
Change-Id: Ib158cbceb436c6725b2d9353a526c5c4be19bcad
Reviewed-on: https://go-review.googlesource.com/74852
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
Handle make(map[any]any) and make(map[any]any, hint) where
hint <= BUCKETSIZE special to allow for faster map initialization
and to improve binary size by using runtime calls with fewer arguments.
Given hint is smaller or equal to BUCKETSIZE in which case
overLoadFactor(hint, 0) is false and no buckets would be allocated by makemap:
* If hmap needs to be allocated on the stack then only hmap's hash0
field needs to be initialized and no call to makemap is needed.
* If hmap needs to be allocated on the heap then a new special
makehmap function will allocate hmap and intialize hmap's
hash0 field.
Reduces size of the godoc by ~36kb.
AMD64
name old time/op new time/op delta
NewEmptyMap 16.6ns ± 2% 5.5ns ± 2% -66.72% (p=0.000 n=10+10)
NewSmallMap 64.8ns ± 1% 56.5ns ± 1% -12.75% (p=0.000 n=9+10)
Updates #6853
Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b
Reviewed-on: https://go-review.googlesource.com/61190
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The new RoundToEven function can be implemented as a single FIDBR
instruction on s390x.
name old time/op new time/op delta
RoundToEven 5.32ns ± 1% 0.86ns ± 1% -83.86% (p=0.000 n=10+10)
Change-Id: Iaf597e57a0d1085961701e3c75ff4f6f6dcebb5f
Reviewed-on: https://go-review.googlesource.com/74350
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This adds support for math Abs, Copysign to be instrinsics on ppc64x.
New instruction FCPSGN is added to generate fcpsgn. Some new
rules are added to improve the int<->float conversions that are
generated mainly due to the Float64bits and Float64frombits in
the math package. PPC64.rules is also modified as suggested
in the review for CL 63290.
Improvements:
benchmark old ns/op new ns/op delta
BenchmarkAbs-16 1.12 0.69 -38.39%
BenchmarkCopysign-16 1.30 0.93 -28.46%
BenchmarkNextafter32-16 9.34 8.05 -13.81%
BenchmarkFrexp-16 8.81 7.60 -13.73%
Others that used Copysign also saw smaller improvements.
I attempted to make this work using rules since that
seems to be preferred, but due to the use of Float64bits and
Float64frombits in these functions, several rules had to be added and
even then not all cases were matched. Using rules became too
complicated and seemed too fragile for these.
Updates #21390
Change-Id: Ia265da9a18355e08000818a4fba1a40e9e031995
Reviewed-on: https://go-review.googlesource.com/67130
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Most write barrier calls are inserted by SSA, but copy and append are
lowered to runtime.typedslicecopy during walk. Fix these to set
Func.WBPos and emit the "write barrier" warning, as done for the write
barriers inserted by SSA. As part of this, we refactor setting WBPos
and emitting this warning into the frontend so it can be shared by
both walk and SSA.
Change-Id: I5fe9997d9bdb55e03e01dd58aee28908c35f606b
Reviewed-on: https://go-review.googlesource.com/73411
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The current go:nowritebarrierrec checker has two problems that limit
its coverage:
1. It doesn't understand that systemstack calls its argument, which
means there are several cases where we fail to detect prohibited write
barriers.
2. It only observes calls in the AST, so calls constructed during
lowering by SSA aren't followed.
This CL completely rewrites this checker to address these issues.
The current checker runs entirely after walk and uses visitBottomUp,
which introduces several problems for checking across systemstack.
First, visitBottomUp itself doesn't understand systemstack calls, so
the callee may be ordered after the caller, causing the checker to
fail to propagate constraints. Second, many systemstack calls are
passed a closure, which is quite difficult to resolve back to the
function definition after transformclosure and walk have run. Third,
visitBottomUp works exclusively on the AST, so it can't observe calls
created by SSA.
To address these problems, this commit splits the check into two
phases and rewrites it to use a call graph generated during SSA
lowering. The first phase runs before transformclosure/walk and simply
records systemstack arguments when they're easy to get. Then, it
modifies genssa to record static call edges at the point where we're
lowering to Progs (which is the latest point at which position
information is conveniently available). Finally, the second phase runs
after all functions have been lowered and uses a direct BFS walk of
the call graph (combining systemstack calls with static calls) to find
prohibited write barriers and construct nice error messages.
Fixes#22384.
For #22460.
Change-Id: I39668f7f2366ab3c1ab1a71eaf25484d25349540
Reviewed-on: https://go-review.googlesource.com/72773
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* replace a copy of IsMethod with a call of it.
* a few more switches where they simplify the code.
* prefer composite literals over "n := new(...); n.x = y; ...".
* use defers to get rid of three goto labels.
* rewrite updateHasCall into two funcs to remove gotos.
Passes toolstash-check on std cmd.
Change-Id: Icb5442a89a87319ef4b640bbc5faebf41b193ef1
Reviewed-on: https://go-review.googlesource.com/72070
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Use them to replace if/else chains with at least three comparisons,
where the code becomes clearly simpler.
Passes toolstash -cmp on std cmd.
Change-Id: Ic98aa3905944ddcab5aef5f9d9ba376853263d94
Reviewed-on: https://go-review.googlesource.com/70934
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
These functions are identical to math/bits.TrailingZeros{32,64}, which
are already intrinsified on ppc64.
Change-Id: If7ee57e7afe53154874f4b66bacdb6237806128a
Reviewed-on: https://go-review.googlesource.com/70350
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
In order to improve the line numbering for debuggers,
it's necessary to trace lines through compilation.
This makes it (much) easier to follow.
The format of the last column of the ssa.html output was
also changed to reduce the spamminess of the file name,
which is usually the same and makes it far harder to read
instructions and line numbers, and to make it wider and also
able to break words when wrapping (long path names still
can push off the end otherwise; side-to-side scrolling was
tried but was more annoying than the occasional wrapped
line).
Sample output now, where [...] is elision for sake of making
the CL character-counter happy -- and the (##) line numbers
are rendered in italics and a smaller font (11 point) under
control of a CSS class "line-number".
genssa
# /Users/drchase/[...]/ssa/testdata/hist.go
00000 (35) TEXT "".main(SB)
00001 (35) FUNCDATA $0, gclocals·7be4bb[...]1e8b(SB)
00002 (35) FUNCDATA $1, gclocals·9ab98a[...]4568(SB)
v920 00003 (36) LEAQ ""..autotmp_31-640(SP), DI
v858 00004 (36) XORPS X0, X0
v6 00005 (36) LEAQ -48(DI), DI
v6 00006 (36) DUFFZERO $277
v576 00007 (36) LEAQ ""..autotmp_31-640(SP), AX
v10 00008 (36) TESTB AX, (AX)
b1 00009 (36) JMP 10
and from an earlier phase:
b18: ← b17
v242 (47) = Copy <mem> v238
v243 (47) = VarKill <mem> {.autotmp_16} v242
v244 (48) = Addr <**bufio.Scanner> {scanner} v2
v245 (48) = Load <*bufio.Scanner> v244 v243
[...]
v279 (49) = Store <mem> {int64} v277 v276 v278
v280 (49) = Addr <*error> {.autotmp_18} v2
v281 (49) = Load <error> v280 v279
v282 (49) = Addr <*error> {err} v2
v283 (49) = VarDef <mem> {err} v279
v284 (49) = Store <mem> {error} v282 v281 v283
v285 (47) = VarKill <mem> {.autotmp_18} v284
v286 (47) = VarKill <mem> {.autotmp_17} v285
v287 (50) = Addr <*error> {err} v2
v288 (50) = Load <error> v287 v286
v289 (50) = NeqInter <bool> v288 v51
If v289 → b21 b22 (line 50)
Change-Id: I3f46310918f965761f59e6f03ea53067237c28a8
Reviewed-on: https://go-review.googlesource.com/69591
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
If the stack frame is too large, abort immediately.
We used to generate code first, then abort.
In issue 22200, generating code raised a panic
so we got an ICE instead of an error message.
Change the max frame size to 1GB (from 2GB).
Stack frames between 1.1GB and 2GB didn't used to work anyway,
the pcln table generation would have failed and generated an ICE.
Fixes#22200
Change-Id: I1d918ab27ba6ebf5c87ec65d1bccf973f8c8541e
Reviewed-on: https://go-review.googlesource.com/69810
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Add a compiler intrinsic for getcallersp. So we are able to get
rid of the argument (not done in this CL).
Change-Id: Ic38fda1c694f918328659ab44654198fb116668d
Reviewed-on: https://go-review.googlesource.com/69350
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
This attempts to choose better values for values that are
rematerialized (uses the XPos of the consumer, not the
original) and for unconditional branches (uses the last
assigned XPos in the block).
The JMP branches seem to sometimes end up with a PC in the
destination block, I think because of register movement
or rematerialization that gets placed in predecessor blocks.
This may be acceptable because (eyeball-empirically) that is
often the line number of the target block, so the line number
flow is correct.
Added proper test, that checks both -N -l and regular compilation.
The test is also capable (for gdb, delve soon) of tracking
variable printing based on comments in the source code.
There's substantial room for improvement in debugger behavior.
Updates #21098.
Change-Id: I13abd48a39141583b85576a015f561065819afd0
Reviewed-on: https://go-review.googlesource.com/50610
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
ORETJMP doesn't need an ONAME if we just set the target method on Sym
instead of Left. Conveniently, this is where fmt.go was looking for it
anyway.
Change the iface parameter and global compiling_wrappers to bool.
Passes toolstash-check.
Change-Id: I5333f8bcb4e06bf8161808041125eb95c439aafe
Reviewed-on: https://go-review.googlesource.com/68252
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
Use entire inlining call stack to decide whether two panic calls
can be merged. We used to merge panic calls when only the leaf
line numbers matched, but that leads to places higher up the call
stack being merged incorrectly.
Fixes#22083
Change-Id: Ia41400a80de4b6ecf3e5089abce0c42b65e9b38a
Reviewed-on: https://go-review.googlesource.com/67632
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The information that's used to generate DWARF location lists is very
ssa.Value centric; it uses Values as start and end coordinates to define
ranges. That mostly works fine, but control flow instructions don't come
from Values, so the ranges couldn't cover them.
Control flow instructions are generated when the SSA representation is
converted to assembly, so that's the best place to extend the ranges
to cover them. (Before that, there's nothing to refer to, and afterward
the boundaries between blocks have been lost.) That requires block
information in the debugInfo type, which then flows down to make
everything else awkward. On the plus side, there's a little less copying
slices around than there used to be, so it should be a little faster.
Previously, the ranges for empty blocks were not very meaningful. That
was fine, because they had no Values to cover, so no debug information
was generated for them. But they do have control flow instructions
(that's why they exist) and so now it's important that the information
be correct. Introduce two sentinel values, BlockStart and BlockEnd, that
denote the boundary of a block, even if the block is empty. BlockEnd
replaces the previous SurvivedBlock flag.
There's one more problem: the last instruction in the function will be a
control flow instruction, so any live ranges need to be extended past
it. But there's no instruction after it to use as the end of the range.
Instead, leave the EndProg field of those ranges as nil and fix it up to
point to past the end of the assembled text at the very last moment.
Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b
Reviewed-on: https://go-review.googlesource.com/63010
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
First step towards removing the mandatory argument for
getcallerpc, which solves certain problems for the runtime.
This might also slightly improve performance.
Intrinsic enabled on 386, amd64, amd64p32,
runtime asm implementation removed on those architectures.
Now-superfluous argument remains in getcallerpc signature
(for a future CL; non-386/amd64 asm funcs ignore it).
Added getcallerpc to the "not a real function" test
in dcl.go, that story is a little odd with respect to
unexported functions but that is not this CL.
Fixes#17327.
Change-Id: I5df1ad91f27ee9ac1f0dd88fa48f1329d6306c3e
Reviewed-on: https://go-review.googlesource.com/31851
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Ceil, Floor and Trunc are pre-existing intrinsics. Round is a new
function and has been added as an intrinsic in this CL. All of the
functions can be implemented as a single 'LOAD FP INTEGER'
instruction, FIDBR, on s390x.
name old time/op new time/op delta
Ceil 2.34ns ± 0% 0.85ns ± 0% -63.74% (p=0.000 n=5+4)
Floor 2.33ns ± 0% 0.85ns ± 1% -63.35% (p=0.008 n=5+5)
Round 4.23ns ± 0% 0.85ns ± 0% -79.89% (p=0.000 n=5+4)
Trunc 2.35ns ± 0% 0.85ns ± 0% -63.83% (p=0.029 n=4+4)
Change-Id: Idee7ba24a2899d12bf9afee4eedd6b4aaad3c510
Reviewed-on: https://go-review.googlesource.com/63890
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We used to have {Arg,Auto,Extern}Symbol structs with which we wrapped
a *gc.Node or *obj.LSym before storing them in the Aux field
of an ssa.Value. This let the SSA part of the compiler distinguish
between autos and args, for example. We no longer need the wrappers
as we can query the underlying objects directly.
There was also some sloppy usage, where VarDef had a *gc.Node
directly in its Aux field, whereas the use of that variable had
that *gc.Node wrapped in an AutoSymbol. Thus the Aux fields didn't
match (using ==) when they probably should.
This sloppy usage cleanup is the only thing in the CL that changes the
generated code - we can get rid of some more unused auto variables if
the matching happens reliably.
Removing this wrapper also lets us get rid of the varsyms cache
(which was used to prevent wrapping the same *gc.Node twice).
Change-Id: I0dedf8f82f84bfee413d310342b777316bd1d478
Reviewed-on: https://go-review.googlesource.com/64452
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
As per golint's suggestions.
Change-Id: Ie0c6ad9aa5dc69966a279562a341c7b095c47ede
Reviewed-on: https://go-review.googlesource.com/64192
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This adds rules to match the code in math/bits RotateLeft,
RotateLeft32, and RotateLef64 to allow them to be inlined.
The rules are complicated because the code in these function
use different types, and the non-const version of these
shifts generate Mask and Carry instructions that become
subexpressions during the match process.
Also adds a testcase to asm_test.go.
Improvement in math/bits:
BenchmarkRotateLeft-16 1.57 1.32 -15.92%
BenchmarkRotateLeft32-16 1.60 1.37 -14.37%
BenchmarkRotateLeft64-16 1.57 1.32 -15.92%
Updates #21390
Change-Id: Ib6f17669ecc9cab54f18d690be27e2225ca654a4
Reviewed-on: https://go-review.googlesource.com/59932
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.
Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This is a crude compiler pass to eliminate stores to auto variables
that are only ever written to.
Eliminates an unnecessary store to x from the following code:
func f() int {
var x := 1
return *(&x)
}
Fixes#19765.
Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa
Reviewed-on: https://go-review.googlesource.com/38746
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Just to get rid of lots of .Name() stutter in printf calls.
Change-Id: I86cf00b3f7b2172387a1c6a7f189c1897fab6300
Reviewed-on: https://go-review.googlesource.com/56630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Prioritized the chunks of code with 8 or more levels of indentation.
Basically early breaks/returns and joining nested ifs.
Change-Id: I6817df1303226acf2eb904a29f2db720e4f7427a
Reviewed-on: https://go-review.googlesource.com/55630
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
We don't use it any more, remove it.
Change-Id: I76ce1a4c2e7048fdd13a37d3718b5abf39ed9d26
Reviewed-on: https://go-review.googlesource.com/44474
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This implements trunc, floor, and ceil in the math package
as intrinsics on ppc64x. Significant improvement mainly due
to avoiding call overhead of args and return value.
BenchmarkCeil-16 5.95 0.69 -88.40%
BenchmarkFloor-16 5.95 0.69 -88.40%
BenchmarkTrunc-16 5.82 0.69 -88.14%
Updates #21390
Change-Id: I951e182694f6e0c431da79c577272b81fb0ebad0
Reviewed-on: https://go-review.googlesource.com/54654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
gc.Sysfunc must not be called concurrently.
We set up runtime routines used by the backend
prior to doing any backend compilation.
I missed the 387 ones; fix that.
Sysfunc should have been unexported during 1.9.
I will rectify that in a subsequent CL.
Fixes#21352
Change-Id: I8386eaa1e05879c25c672b9c9fc693c938e9aeb6
Reviewed-on: https://go-review.googlesource.com/54090
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.
After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.
There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:
type myint int
var a int
b := myint(int)
and
b := (*uint64)(unsafe.Pointer(a))
don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.
Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.
A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.
TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.
go build -toolexec 'toolstash -cmp' -a std succeeds.
With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean /tmp/220352263 /tmp/621364410
completed 15 of 15, estimated time remaining 0s (eta 3:52PM)
name old time/op new time/op delta
Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14)
Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15)
GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14)
Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15)
GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15)
Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13)
Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15)
XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15)
[Geo mean] 206ms 377ms +82.86%
name old user-time/op new user-time/op delta
Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15)
Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14)
GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15)
Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15)
GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15)
Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13)
Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15)
XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15)
[Geo mean] 317ms 583ms +83.72%
name old alloc/op new alloc/op delta
Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15)
Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15)
GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14)
Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15)
GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15)
Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15)
Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15)
XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15)
[Geo mean] 42.1MB 75.0MB +78.05%
name old allocs/op new allocs/op delta
Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15)
Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14)
GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14)
Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15)
GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15)
Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15)
Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15)
XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15)
[Geo mean] 439k 755k +72.01%
name old text-bytes new text-bytes delta
HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15)
name old data-bytes new data-bytes delta
HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal)
Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
When the compiler decomposes a user variable, track its origin so that
it can be recomposed during DWARF generation.
Change-Id: Ia71c7f8e7f4d65f0652f1c97b0dda5d9cad41936
Reviewed-on: https://go-review.googlesource.com/50878
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
When we start tracking the mapping from Value to Prog, valueProgs will
be confusing. Disambiguate.
Change-Id: Ib3b302fedb7eb0ff1bde789d70a11656d82f0897
Reviewed-on: https://go-review.googlesource.com/50876
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>