51 Commits

Author SHA1 Message Date
Alexander Yastrebov
8ffc931eae all: fix spelling errors
Fix spelling errors discovered using https://github.com/codespell-project/codespell. Errors in data files and vendored packages are ignored.

Change-Id: I83c7818222f2eea69afbd270c15b7897678131dc
GitHub-Last-Rev: 3491615b1b82832cc0064f535786546e89aa6184
GitHub-Pull-Request: golang/go#60758
Reviewed-on: https://go-review.googlesource.com/c/go/+/502576
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-06-14 00:03:57 +00:00
Nick Ripley
51225f6fc6 runtime: record parent goroutine ID, and print it in stack traces
Fixes #38651

Change-Id: Id46d684ee80e208c018791a06c26f304670ed159
Reviewed-on: https://go-review.googlesource.com/c/go/+/435337
Run-TryBot: Nick Ripley <nick.ripley@datadoghq.com>
Reviewed-by: Ethan Reesor <ethan.reesor@gmail.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-21 17:35:22 +00:00
Keith Randall
d8762b2f45 runtime: fix overflow in PingPongHog test
On 32-bit systems the result of hogCount*factor can overflow.
Use division instead to do comparison.

Update #52207

Change-Id: I429fb9dc009af645acb535cee5c70887527ba207
Reviewed-on: https://go-review.googlesource.com/c/go/+/407415
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-19 21:33:15 +00:00
Cherry Mui
c087121daf runtime: relax the threshold for TestPingPongHog
The test checks that the scheduling of the goroutines are within
a small factor, to ensure the scheduler handing off the P
correctly. There have been flaky failures on the builder (probably
due to OS scheduling delays). Increase the threshold to make it
less flaky. The gap would be much bigger if the scheduler doesn't
work correctly.

For the long term maybe it is better to test it more directly
with the scheduler, e.g. with scheduler instrumentation.

May fix #52207.

Change-Id: I50278b70ab21b7f04761fdc8b38dd13304c67879
Reviewed-on: https://go-review.googlesource.com/c/go/+/407134
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-05-18 19:52:53 +00:00
Russ Cox
9839668b56 all: separate doc comment from //go: directives
A future change to gofmt will rewrite

	// Doc comment.
	//go:foo

to

	// Doc comment.
	//
	//go:foo

Apply that change preemptively to all comments (not necessarily just doc comments).

For #51082.

Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-05 17:54:15 +00:00
Austin Clements
ca33b34e17 runtime: deflake TestPreemptionAfterSyscall
This test occasionally takes very slightly longer than the 3 second
timeout on slow builders (especially windows-386-2008), so increase
the timeout to 5 seconds. It fails with much longer timeouts on Plan
9, so skip it as flaky there.

Updates #41015.

Change-Id: I426a7adfae92c18a0f8a223dd92762b0b91565e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/379214
Trust: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-01-19 15:34:05 +00:00
Michael Anthony Knyszek
4c943abb95 runtime: fix comments on the behavior of SetGCPercent
Fixes for #49680, #49695, #45867, and #49370 all assumed that
SetGCPercent(-1) doesn't block until the GC's mark phase is done, but
it actually does. The cause of 3 of those 4 failures comes from the fact
that at the beginning of the sweep phase, the GC does try to preempt
every P once, and this may run concurrently with test code. In the
fourth case, the issue was likely that only *one* of the debug_test.go
tests was missing a call to SetGCPercent(-1). Just to be safe, leave a
TODO there for now to remove the extraneous runtime.GC calls, but leave
the calls in.

Updates #49680, #49695, #45867, and #49370.

Change-Id: Ibf4e64addfba18312526968bcf40f1f5d54eb3f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/369815
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2021-12-07 17:46:04 +00:00
Michael Anthony Knyszek
871d63fb73 runtime: call runtime.GC in several tests that disable GC
These tests disable GC because of the potential for a deadlock, but
don't consider that a GC could be in progress due to other tests. The
likelihood of this case was increased when the minimum heap size was
lowered during the Go 1.18 cycle. The issue was then mitigated by
CL 368137 but in theory is always a problem.

This change is intended specifically for #45867, but I just walked over
a whole bunch of other tests that don't take this precaution where it
seems like it could be relevant (some tests it's not, like the
UserForcedGC test, or testprogs where no other code has run before it).

Fixes #45867.

Change-Id: I6a1b4ae73e05cab5a0b2d2cce14126bd13be0ba5
Reviewed-on: https://go-review.googlesource.com/c/go/+/369747
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: David Chase <drchase@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2021-12-06 23:02:28 +00:00
sivchari
ba0f8ce50f runtime: gofmt proc_test.go
Change-Id: I09a2be64e96fe85d84560728814af74b234d7210
GitHub-Last-Rev: bc881ea0022326fcc35e0356a79634fde00efd2a
GitHub-Pull-Request: golang/go#45929
Reviewed-on: https://go-review.googlesource.com/c/go/+/316409
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Pratt <mpratt@google.com>
2021-05-06 12:33:02 +00:00
Michael Pratt
ecfce58965 runtime: skip work recheck for non-spinning Ms
When an M transitions from spinning to non-spinning state, it must
recheck most sources of work to avoid missing work submitted between its
initial check and decrementing sched.nmspinning (see "delicate dance"
comment).

Ever since the scheduler rewrite in Go 1.1 (golang.org/cl/7314062), we
have performed this recheck on all Ms before stopping, regardless of
whether or not they were spinning.

Unfortunately, there is a problem with this approach: non-spinning Ms
are not eligible to steal work (note the skip over the stealWork block),
but can detect work during the recheck. If there is work available, this
non-spinning M will jump to top, skip stealing, land in recheck again,
and repeat. i.e., it will spin uselessly.

The spin is bounded. This can only occur if there is another spinning M,
which will either take the work, allowing this M to stop, or take some
other work, allowing this M to upgrade to spinning. But the spinning is
ultimately just a fancy spin-wait.

golang.org/issue/43997 discusses several ways to address this. This CL
takes the simplest approach: skipping the recheck on non-spinning Ms and
allowing them to go to stop.

Results for scheduler-relevant runtime and time benchmarks can be found
at https://perf.golang.org/search?q=upload:20210420.5.

The new BenchmarkCreateGoroutinesSingle is a characteristic example
workload that hits this issue hard. A single M readies lots of work
without itself parking. Other Ms must spin to steal work, which is very
short-lived, forcing those Ms to spin again. Some of the Ms will be
non-spinning and hit the above bug.

With this fixed, that benchmark drops in CPU usage by a massive 68%, and
wall time 24%. BenchmarkNetpollBreak shows similar drops because it is
unintentionally almost the same benchmark (create short-living Gs in a
loop). Typical well-behaved programs show little change.

We also measure scheduling latency (time from goready to execute). Note
that many of these benchmarks are very noisy because they don't involve
much scheduling. Those that do, like CreateGoroutinesSingle, are
expected to increase as we are replacing unintentional spin waiting with
a real park.

Fixes #43997

Change-Id: Ie1d1e1800f393cee1792455412caaa5865d13562
Reviewed-on: https://go-review.googlesource.com/c/go/+/310850
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-22 20:45:37 +00:00
Michael Anthony Knyszek
eb3c6a93c3 runtime: disable stack shrinking in activeStackChans race window
Currently activeStackChans is set before a goroutine blocks on a channel
operation in an unlockf passed to gopark. The trouble is that the
unlockf is called *after* the G's status is changed, and the G's status
is what is used by a concurrent mark worker (calling suspendG) to
determine that a G has successfully been suspended. In this window
between the status change and unlockf, the mark worker could try to
shrink the G's stack, and in particular observe that activeStackChans is
false. This observation will cause the mark worker to *not* synchronize
with concurrent channel operations when it should, and so updating
pointers in the sudog for the blocked goroutine (which may point to the
goroutine's stack) races with channel operations which may also
manipulate the pointer (read it, dereference it, update it, etc.).

Fix the problem by adding a new atomically-updated flag to the g struct
called parkingOnChan, which is non-zero in the race window above. Then,
in isShrinkStackSafe, check if parkingOnChan is zero. The race is
resolved like so:

* Blocking G sets parkingOnChan, then changes status in gopark.
* Mark worker successfully suspends blocking G.
* If the mark worker observes parkingOnChan is non-zero when checking
  isShrinkStackSafe, then it's not safe to shrink (we're in the race
  window).
* If the mark worker observes parkingOnChan as zero, then because
  the mark worker observed the G status change, it can be sure that
  gopark's unlockf completed, and gp.activeStackChans will be correct.

The risk of this change is low, since although it reduces the number of
places that stack shrinking is allowed, the window here is incredibly
small. Essentially, every place that it might crash now is replaced with
no shrink.

This change adds a test, but the race window is so small that it's hard
to trigger without a well-placed sleep in park_m. Also, this change
fixes stackGrowRecursive in proc_test.go to actually allocate a 128-byte
stack frame. It turns out the compiler was destructuring the "pad" field
and only allocating one uint64 on the stack.

Fixes #40641.

Change-Id: I7dfbe7d460f6972b8956116b137bc13bc24464e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/247050
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2020-09-21 14:34:33 +00:00
Michael Pratt
11b3730a02 runtime: disable preemption in startTemplateThread
When a locked M wants to start a new M, it hands off to the template
thread to actually call clone and start the thread. The template thread
is lazily created the first time a thread is locked (or if cgo is in
use).

stoplockedm will release the P (_Pidle), then call handoffp to give the
P to another M. In the case of a pending STW, one of two things can
happen:

1. handoffp starts an M, which does acquirep followed by schedule, which
will finally enter _Pgcstop.

2. handoffp immediately enters _Pgcstop. This only occurs if the P has
no local work, GC work, and no spinning M is required.

If handoffp starts an M, and must create a new M to do so, then newm
will simply queue the M on newmHandoff for the template thread to do the
clone.

When a stop-the-world is required, stopTheWorldWithSema will start the
stop and then wait for all Ps to enter _Pgcstop. If the template thread
is not fully created because startTemplateThread gets stopped, then
another stoplockedm may queue an M that will never get created, and the
handoff P will never leave _Pidle. Thus stopTheWorldWithSema will wait
forever.

A sequence to trigger this hang when STW occurs can be visualized with
two threads:

  T1                                 T2
-------------------------------   -----------------------------

LockOSThread                      LockOSThread
  haveTemplateThread == 0
  startTemplateThread
    haveTemplateThread = 1
    newm                            haveTemplateThread == 1
      preempt -> schedule           g.m.lockedExt++
        gcstopm -> _Pgcstop         g.m.lockedg = ...
        park                        g.lockedm = ...
                                    return

                                 ... (any code)
                                   preempt -> schedule
                                     stoplockedm
                                       releasep -> _Pidle
                                       handoffp
                                         startm (first 3 handoffp cases)
                                          newm
                                            g.m.lockedExt != 0
                                            Add to newmHandoff, return
                                       park

Note that the P in T2 is stuck sitting in _Pidle. Since the template
thread isn't running, the new M will not be started complete the
transition to _Pgcstop.

To resolve this, we disable preemption around the assignment of
haveTemplateThread and the creation of the template thread in order to
guarantee that if handTemplateThread is set then the template thread
will eventually exist, in the presence of stops.

Fixes #38931

Change-Id: I50535fbbe2f328f47b18e24d9030136719274191
Reviewed-on: https://go-review.googlesource.com/c/go/+/232978
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-05-21 21:01:39 +00:00
Ian Lance Taylor
2edd351b92 runtime: skip TestBigGOMAXPROCS if it runs out of memory
Fixes #38541

Change-Id: I0e9ea5865628d953c32f3a5d4b3ccf1c1d0b081e
Reviewed-on: https://go-review.googlesource.com/c/go/+/229077
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-20 22:42:49 +00:00
Ian Lance Taylor
7ea40f6594 runtime: use mcache0 if no P in profilealloc
A case that I missed in CL 205239: profilealloc can be called at
program startup if GOMAXPROCS is large enough.

Fixes #38474

Change-Id: I2f089fc6ec00c376680e1c0b8a2557b62789dd7f
Reviewed-on: https://go-review.googlesource.com/c/go/+/228420
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-04-17 00:45:52 +00:00
Josh Bleecher Snyder
97711bfd60 runtime: skip TestPingPongHog in race mode
TestPingPongHog tests properties of the scheduler.
But the race detector intentionally does randomized scheduling,
so the test is not applicable.

Fixes #38266

Change-Id: Ib06aa317b2776cb1faa641c4e038e2599cf70b2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/227344
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-04-08 16:40:46 +00:00
Michael Anthony Knyszek
8a5af7910a runtime: ready scavenger without next
This change makes it so that waking up the scavenger readies its
goroutine without "next" set, so that it doesn't interfere with the
application's use of the runnext feature in the scheduler which helps
fairness.

As of CL 201763 the scavenger began waking up much more often, and in
TestPingPongHog this meant that it would sometimes supercede either a
hog or light goroutine in runnext, leading to a skew in the results and
ultimately a test flake.

This change thus re-enables the TestPingPongHog test on the builders.

Fixes #35271.

Change-Id: Iace08576912e8940554dd7de6447e458ad0d201d
Reviewed-on: https://go-review.googlesource.com/c/go/+/208380
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-27 23:14:19 +00:00
Bryan C. Mills
52aebe8d21 runtime: skip TestPingPongHog on builders
This test is failing consistently in the longtest builders,
potentially masking regressions in other packages.

Updates #35271

Change-Id: Idc03171c0109b5c8d4913e0af2078c1115666897
Reviewed-on: https://go-review.googlesource.com/c/go/+/206098
Reviewed-by: Carlos Amedee <carlos@golang.org>
2019-11-08 15:10:39 +00:00
Austin Clements
7955ecebfc runtime: add a test for asynchronous safe points
This adds a test of preempting a loop containing no synchronous safe
points for STW and stack scanning.

We couldn't add this test earlier because it requires scheduler, STW,
and stack scanning preemption to all be working.

For #10958, #24543.

Change-Id: I73292db78ca3d14aab11bdafd26d03986920ef0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/201777
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-02 21:51:23 +00:00
Clément Chigot
6453337494 runtime: initialize netpoll in TestNetpollBreak
Netpoll must be always be initialized when TestNetpollBreak is launched.
However, when it is run in standalone, it won't be the case, so it must
be forced.

Updates: #27707

Change-Id: I28147f3834f3d6aca982c6a298feadc09b55f66e
Reviewed-on: https://go-review.googlesource.com/c/go/+/204058
Run-TryBot: Clément Chigot <clement.chigot%atos.net@gtempaccount.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-29 17:38:23 +00:00
Ian Lance Taylor
50f4896b72 runtime: add netpollBreak
The new netpollBreak function can be used to interrupt a blocking netpoll.
This function is not currently used; it will be used by later CLs.

Updates #27707

Change-Id: I5cb936609ba13c3c127ea1368a49194fc58c9f4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/171824
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-10-21 16:37:45 +00:00
Cherry Zhang
2bcbe6a4b6 runtime: add a test for getg with thread switch
With gccgo, if we generate getg inlined, the backend may cache
the address of the TLS variable, which will become invalid after
a thread switch.

Currently there is no known bug for this. But if we didn't
implement this carefully, we may get subtle bugs. This CL adds a
test that will fail loudly if this is wrong. (See also
https://go.googlesource.com/gofrontend/+/refs/heads/master/libgo/runtime/proc.c#333
and an incorrect attempt CL 185337.)

Note: at least on Linux/AMD64, even with an incorrect
implementation, this only fails if the test is compiled with
-fPIC, which is not the default setting for gccgo test suite. So
some manual work is needed. Maybe we could extend the test suite
to run the runtime test with more settings (e.g. PIC and static).

Change-Id: I459a3b4c31f09b9785c0eca19b7756f80e8ef54c
Reviewed-on: https://go-review.googlesource.com/c/go/+/186357
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-07-16 20:53:01 +00:00
Andrei Vagin
4166ff42c0 runtime: preempt a goroutine which calls a lot of short system calls
A goroutine should be preempted if it runs for 10ms without blocking.
We found that this doesn't work for goroutines which call short system calls.

For example, the next program can stuck for seconds without this fix:

$ cat main.go
package main

import (
	"runtime"
	"syscall"
)

func main() {
	runtime.GOMAXPROCS(1)
	c := make(chan int)
	go func() {
		c <- 1
		for {
			t := syscall.Timespec{
				Nsec: 300,
			}
			if true {
				syscall.Nanosleep(&t, nil)
			}
		}
	}()
	<-c
}

$ time go run main.go

real	0m8.796s
user	0m0.367s
sys	0m0.893s

Updates #10958

Change-Id: Id3be54d3779cc28bfc8b33fe578f13778f1ae2a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/170138
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-04-09 07:45:26 +00:00
Michael Anthony Knyszek
429bae7158 runtime: skip TestLockOSThreadAvoidsStatePropagation if one can't unshare
This change splits a testprog out of TestLockOSThreadExit and makes it
its own test. Then, this change makes the testprog exit prematurely with
a special message if unshare fails with EPERM because not all of the
builders allow the user to call the unshare syscall.

Also, do some minor cleanup on the TestLockOSThread* tests.

Fixes #29366.

Change-Id: Id8a9f6c4b16e26af92ed2916b90b0249ba226dbe
Reviewed-on: https://go-review.googlesource.com/c/155437
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-21 18:42:22 +00:00
Michael Anthony Knyszek
d0f8a7517a runtime: don't clear lockedExt on locked M when G exits
When a locked M has its G exit without calling UnlockOSThread, then
lockedExt on it was getting cleared. Unfortunately, this meant that
during P handoff, if a new M was started, it might get forked (on
most OSs besides Windows) from the locked M, which could have kernel
state attached to it.

To solve this, just don't clear lockedExt. At the point where the
locked M has its G exit, it will also exit in accordance with the
LockOSThread API. So, we can safely assume that it's lockedExt state
will no longer be used. For the case of the main thread where it just
gets wedged instead of exiting, it's probably better for it to keep
the locked marker since it more accurately represents its state.

Fixed #28979.

Change-Id: I7d3d71dd65bcb873e9758086d2cbcb9a06429b0f
Reviewed-on: https://go-review.googlesource.com/c/153078
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2018-12-19 19:47:56 +00:00
Richard Musiol
e3c684777a all: skip unsupported tests for js/wasm
The general policy for the current state of js/wasm is that it only
has to support tests that are also supported by nacl.

The test nilptr3.go makes assumptions about which nil checks can be
removed. Since WebAssembly does not signal on reading a null pointer,
all nil checks have to be explicit.

Updates #18892

Change-Id: I06a687860b8d22ae26b1c391499c0f5183e4c485
Reviewed-on: https://go-review.googlesource.com/110096
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-30 19:39:18 +00:00
Brad Fitzpatrick
1e3f563b14 runtime: fix build on non-Linux platforms
CL 78538 was updated after running TryBots to depend on
syscall.NanoSleep which isn't available on all non-Linux platforms.

Change-Id: I1fa615232b3920453431861310c108b208628441
Reviewed-on: https://go-review.googlesource.com/79175
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-11-21 21:52:58 +00:00
Jamie Liu
868c8b374d runtime: only sleep before stealing work from a running P
The sleep in question does not make sense if the stolen-from P cannot
run the stolen G. The usleep(3) has been observed delaying execution of
woken G's by ~60us; skipping it reduces the wakeup-to-execution latency
to ~7us in these cases, improving CPU utilization.

Benchmarks added by this change:

name                             old time/op  new time/op  delta
WakeupParallelSpinning/0s-12     14.4µs ± 1%  14.3µs ± 1%     ~     (p=0.227 n=19+20)
WakeupParallelSpinning/1µs-12    18.3µs ± 0%  18.3µs ± 1%     ~     (p=0.950 n=20+19)
WakeupParallelSpinning/2µs-12    22.3µs ± 1%  22.3µs ± 1%     ~     (p=0.670 n=20+18)
WakeupParallelSpinning/5µs-12    31.7µs ± 0%  31.7µs ± 0%     ~     (p=0.460 n=20+17)
WakeupParallelSpinning/10µs-12   51.8µs ± 0%  51.8µs ± 0%     ~     (p=0.883 n=20+20)
WakeupParallelSpinning/20µs-12   91.9µs ± 0%  91.9µs ± 0%     ~     (p=0.245 n=20+20)
WakeupParallelSpinning/50µs-12    214µs ± 0%   214µs ± 0%     ~     (p=0.509 n=19+20)
WakeupParallelSpinning/100µs-12   335µs ± 0%   335µs ± 0%   -0.05%  (p=0.006 n=17+15)
WakeupParallelSyscall/0s-12       228µs ± 2%   129µs ± 1%  -43.32%  (p=0.000 n=20+19)
WakeupParallelSyscall/1µs-12      232µs ± 1%   131µs ± 1%  -43.60%  (p=0.000 n=19+20)
WakeupParallelSyscall/2µs-12      236µs ± 1%   133µs ± 1%  -43.44%  (p=0.000 n=18+19)
WakeupParallelSyscall/5µs-12      248µs ± 2%   139µs ± 1%  -43.68%  (p=0.000 n=18+19)
WakeupParallelSyscall/10µs-12     263µs ± 3%   150µs ± 2%  -42.97%  (p=0.000 n=18+20)
WakeupParallelSyscall/20µs-12     281µs ± 2%   170µs ± 1%  -39.43%  (p=0.000 n=19+19)
WakeupParallelSyscall/50µs-12     345µs ± 4%   246µs ± 7%  -28.85%  (p=0.000 n=20+20)
WakeupParallelSyscall/100µs-12    460µs ± 5%   350µs ± 4%  -23.85%  (p=0.000 n=20+20)

Benchmarks associated with the change that originally added this sleep
(see https://golang.org/s/go15gomaxprocs):

name        old time/op  new time/op  delta
Chain       19.4µs ± 2%  19.3µs ± 1%    ~     (p=0.101 n=19+20)
ChainBuf    19.5µs ± 2%  19.4µs ± 2%    ~     (p=0.840 n=19+19)
Chain-2     19.9µs ± 1%  19.9µs ± 2%    ~     (p=0.734 n=19+19)
ChainBuf-2  20.0µs ± 2%  20.0µs ± 2%    ~     (p=0.175 n=19+17)
Chain-4     20.3µs ± 1%  20.1µs ± 1%  -0.62%  (p=0.010 n=19+18)
ChainBuf-4  20.3µs ± 1%  20.2µs ± 1%  -0.52%  (p=0.023 n=19+19)
Powser       2.09s ± 1%   2.10s ± 3%    ~     (p=0.908 n=19+19)
Powser-2     2.21s ± 1%   2.20s ± 1%  -0.35%  (p=0.010 n=19+18)
Powser-4     2.31s ± 2%   2.31s ± 2%    ~     (p=0.578 n=18+19)
Sieve        13.6s ± 1%   13.6s ± 1%    ~     (p=0.909 n=17+18)
Sieve-2      8.02s ±52%   7.28s ±15%    ~     (p=0.336 n=20+16)
Sieve-4      4.00s ±35%   3.98s ±26%    ~     (p=0.654 n=20+18)

Change-Id: I58edd8ce01075859d871e2348fc0833e9c01f70f
Reviewed-on: https://go-review.googlesource.com/78538
Reviewed-by: Austin Clements <austin@google.com>
2017-11-21 19:31:06 +00:00
Austin Clements
4f34a52913 runtime: terminate locked OS thread if its goroutine exits
runtime.LockOSThread is sometimes used when the caller intends to put
the OS thread into an unusual state. In this case, we never want to
return this thread to the runtime thread pool. However, currently
exiting the goroutine implicitly unlocks its OS thread.

Fix this by terminating the locked OS thread when its goroutine exits,
rather than simply returning it to the pool.

Fixes #20395.

Change-Id: I3dcec63b200957709965f7240dc216fa84b62ad9
Reviewed-on: https://go-review.googlesource.com/46038
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 17:47:21 +00:00
Austin Clements
c85b12b579 runtime: make LockOSThread/UnlockOSThread nested
Currently, there is a single bit for LockOSThread, so two calls to
LockOSThread followed by one call to UnlockOSThread will unlock the
thread. There's evidence (#20458) that this is almost never what
people want or expect and it makes these APIs very hard to use
correctly or reliably.

Change this so LockOSThread/UnlockOSThread can be nested and the
calling goroutine will not be unwired until UnlockOSThread has been
called as many times as LockOSThread has. This should fix the vast
majority of incorrect uses while having no effect on the vast majority
of correct uses.

Fixes #20458.

Change-Id: I1464e5e9a0ea4208fbb83638ee9847f929a2bacb
Reviewed-on: https://go-review.googlesource.com/45752
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-05 19:50:23 +00:00
Austin Clements
d263e85597 runtime: expand acceptable PingPongHog factor from 2 to 5
Since TestPingPongHog tests the scheduler, it's ultimately
probabilistic. Currently, it requires the result be at most of factor
of 2 off of the ideal. It turns out this isn't quite enough in
practice, with factors on 1000 iterations on linux/amd64 ranging from
0.48 to 2.5. If the test were failing, we would expect a factor closer
to 1000X, so it's pretty safe to expand the accepted factor from 2 to
5.

Fixes #20494.

Change-Id: If8f2e96194fe66f1fb981a965d1167fe74ff38d7
Reviewed-on: https://go-review.googlesource.com/44859
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-06-05 15:51:49 +00:00
Daniel Martí
516e6f6d5d all: remove some unused parameters in test code
Mostly unnecessary *testing.T arguments.

Found with github.com/mvdan/unparam.

Change-Id: Ifb955cb88f2ce8784ee4172f4f94d860fa36ae9a
Reviewed-on: https://go-review.googlesource.com/41691
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-04-25 14:38:10 +00:00
Austin Clements
44497ebacb runtime: fix goroutine priority elevation
Currently it's possible for user code to exploit the high scheduler
priority of the GC worker in conjunction with the runnext optimization
to elevate a user goroutine to high priority so it will always run
even if there are other runnable goroutines.

For example, if a goroutine is in a tight allocation loop, the
following can happen:

1. Goroutine 1 allocates, triggering a GC.
2. G 1 attempts an assist, but fails and blocks.
3. The scheduler runs the GC worker, since it is high priority.
   Note that this also starts a new scheduler quantum.
4. The GC worker does enough work to satisfy the assist.
5. The GC worker readies G 1, putting it in runnext.
6. GC finishes and the scheduler runs G 1 from runnext, giving it
   the rest of the GC worker's quantum.
7. Go to 1.

Even if there are other goroutines on the run queue, they never get a
chance to run in the above sequence. This requires a confluence of
circumstances that make it unlikely, though not impossible, that it
would happen in "real" code. In the test added by this commit, we
force this confluence by setting GOMAXPROCS to 1 and GOGC to 1 so it's
easy for the test to repeated trigger GC and wake from a blocked
assist.

We fix this by making GC always put user goroutines at the end of the
run queue, instead of in runnext. This makes it so user code can't
piggy-back on the GC's high priority to make a user goroutine act like
it has high priority. The only other situation where GC wakes user
goroutines is waking all blocked assists at the end, but this uses the
global run queue and hence doesn't have this problem.

Fixes #15706.

Change-Id: I1589dee4b7b7d0c9c8575ed3472226084dfce8bc
Reviewed-on: https://go-review.googlesource.com/23172
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-05-19 18:18:13 +00:00
Dmitry Vyukov
fcd7c02c70 runtime: fix CPU underutilization
Runqempty is a critical predicate for scheduler. If runqempty spuriously
returns true, then scheduler can fail to schedule arbitrary number of
runnable goroutines on idle Ps for arbitrary long time. With the addition
of runnext runqempty predicate become broken (can spuriously return true).
Consider that runnext is not nil and the main array is empty. Runqempty
observes that the array is empty, then it is descheduled for some time.
Then queue owner pushes another element to the queue evicting runnext
into the array. Then queue owner pops runnext. Then runqempty resumes
and observes runnext is nil and returns true. But there were no point
in time when the queue was empty.

Fix runqempty predicate to not return true spuriously.

Change-Id: Ifb7d75a699101f3ff753c4ce7c983cf08befd31e
Reviewed-on: https://go-review.googlesource.com/20858
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-03 10:06:32 +00:00
Dmitry Vyukov
ea0386f85f runtime: improve randomized stealing logic
During random stealing we steal 4*GOMAXPROCS times from random procs.
One would expect that most of the time we check all procs this way,
but due to low quality PRNG we actually miss procs with frightening
probability. Below are modelling experiment results for 1e6 tries:

GOMAXPROCS = 2 : missed 1 procs 7944 times

GOMAXPROCS = 3 : missed 1 procs 101620 times
GOMAXPROCS = 3 : missed 2 procs 3571 times

GOMAXPROCS = 4 : missed 1 procs 63916 times
GOMAXPROCS = 4 : missed 2 procs 61 times
GOMAXPROCS = 4 : missed 3 procs 16 times

GOMAXPROCS = 5 : missed 1 procs 133136 times
GOMAXPROCS = 5 : missed 2 procs 1025 times
GOMAXPROCS = 5 : missed 3 procs 101 times
GOMAXPROCS = 5 : missed 4 procs 15 times

GOMAXPROCS = 8 : missed 1 procs 151765 times
GOMAXPROCS = 8 : missed 2 procs 5057 times
GOMAXPROCS = 8 : missed 3 procs 1726 times
GOMAXPROCS = 8 : missed 4 procs 68 times

GOMAXPROCS = 12 : missed 1 procs 199081 times
GOMAXPROCS = 12 : missed 2 procs 27489 times
GOMAXPROCS = 12 : missed 3 procs 3113 times
GOMAXPROCS = 12 : missed 4 procs 233 times
GOMAXPROCS = 12 : missed 5 procs 9 times

GOMAXPROCS = 16 : missed 1 procs 237477 times
GOMAXPROCS = 16 : missed 2 procs 30037 times
GOMAXPROCS = 16 : missed 3 procs 9466 times
GOMAXPROCS = 16 : missed 4 procs 1334 times
GOMAXPROCS = 16 : missed 5 procs 192 times
GOMAXPROCS = 16 : missed 6 procs 5 times
GOMAXPROCS = 16 : missed 7 procs 1 times
GOMAXPROCS = 16 : missed 8 procs 1 times

A missed proc won't lead to underutilization because we check all procs
again after dropping P. But it can lead to an unpleasant situation
when we miss a proc, drop P, check all procs, discover work, acquire P,
miss the proc again, repeat.

Improve stealing logic to cover all procs.
Also don't enter spinning mode and try to steal when there is nobody around.

Change-Id: Ibb6b122cc7fb836991bad7d0639b77c807aab4c2
Reviewed-on: https://go-review.googlesource.com/20836
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
2016-03-25 11:00:48 +00:00
Marcel van Lohuizen
872ca73cad runtime: don't assume b.N > 0
Change-Id: I2e26717f2563d7633ffd15f4adf63c3d0ee3403f
Reviewed-on: https://go-review.googlesource.com/20856
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-03-18 15:55:02 +00:00
Burcu Dogan
6df8038768 runtime: listen 127.0.0.1 instead of localhost on android
Fixes #14486.
Related to #14485.

Change-Id: I2dd77b0337aebfe885ae828483deeaacb500b12a
Reviewed-on: https://go-review.googlesource.com/20340
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-08 00:22:38 +00:00
Russ Cox
a3c1a3f401 runtime: deflake TestNumGoroutine
Fixes #14107.

Change-Id: Icd9463b1a77b139c7ebc2d8732482d704ea332d0
Reviewed-on: https://go-review.googlesource.com/19002
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-01-27 22:17:42 +00:00
Brad Fitzpatrick
9202e9e1b8 runtime: add more debug info to flaky TestNumGoroutine
This has been flaking on the new OpenBSD 5.8 builders lately:
https://storage.googleapis.com/go-build-log/808270e7/openbsd-amd64-gce58_61ce2663.log
(as one example)

Add more debug info when it fails.

Updates #14107

Change-Id: Ie30bc0c703d2e9ee993d1e232ffc5f2d17e65c97
Reviewed-on: https://go-review.googlesource.com/18938
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-01-27 02:51:01 +00:00
Russ Cox
fac8202c3f runtime: make NumGoroutine and Stack agree not to include system goroutines
[Repeat of CL 18343 with build fixes.]

Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them,
which was inconsistent and confusing.

To resolve which way they should be consistent, it seems like

	package main
	import "runtime"
	func main() { println(runtime.NumGoroutine()) }

should print 1 regardless of internal runtime details. Make it so.

Fixes #11706.

Change-Id: If26749fec06aa0ff84311f7941b88d140552e81d
Reviewed-on: https://go-review.googlesource.com/18432
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
2016-01-13 01:46:01 +00:00
Russ Cox
6da608206c Revert "runtime: make NumGoroutine and Stack agree not to include system goroutines"
This reverts commit c5bafc828126c8fa057e1accaa448583c7ec145f.

Change-Id: Ie7030c978c6263b9e996d5aa0e490086796df26d
Reviewed-on: https://go-review.googlesource.com/18431
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-08 15:31:15 +00:00
Russ Cox
c5bafc8281 runtime: make NumGoroutine and Stack agree not to include system goroutines
Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them,
which was inconsistent and confusing.

To resolve which way they should be consistent, it seems like

	package main
	import "runtime"
	func main() { println(runtime.NumGoroutine()) }

should print 1 regardless of internal runtime details. Make it so.

Fixes #11706.

Change-Id: I6bfe26a901de517728192cfb26a5568c4ef4fe47
Reviewed-on: https://go-review.googlesource.com/18343
Reviewed-by: Austin Clements <austin@google.com>
2016-01-08 15:25:00 +00:00
Russ Cox
8d5ff2e182 runtime: move test programs out of source code, coalesce
Now there are just three programs to compile instead of many,
and repeated tests can reuse the compilation result instead of
rebuilding it.

Combined, these changes reduce the time spent testing runtime
during all.bash on my laptop from about 60 to about 30 seconds.
(All.bash itself runs in 5½ minutes.)

For #10571.

Change-Id: Ie2c1798b847f1a635a860d11dcdab14375319ae9
Reviewed-on: https://go-review.googlesource.com/18085
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2015-12-29 21:16:59 +00:00
Dmitry Vyukov
fb6f8a96f2 runtime: remove unnecessary wakeups of worker threads
Currently we wake up new worker threads whenever we pass
through the scheduler with nmspinning==0. This leads to
lots of unnecessary thread wake ups.
Instead let only spinning threads wake up new spinning threads.

For the following program:

package main
import "runtime"
func main() {
	for i := 0; i < 1e7; i++ {
		runtime.Gosched()
	}
}

Before:
$ time ./test
real	0m4.278s
user	0m7.634s
sys	0m1.423s

$ strace -c ./test
% time     seconds  usecs/call     calls    errors syscall
 99.93    9.314936           3   2685009     17536 futex

After:
$ time ./test
real	0m1.200s
user	0m1.181s
sys	0m0.024s

$ strace -c ./test
% time     seconds  usecs/call     calls    errors syscall
  3.11    0.000049          25         2           futex

Fixes #13527

Change-Id: Ia1f5bf8a896dcc25d8b04beb1f4317aa9ff16f74
Reviewed-on: https://go-review.googlesource.com/17540
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-11 11:31:12 +00:00
Russ Cox
4a4eba9f37 runtime: disable TestGoroutineParallelism on uniprocessor
It's a bad test and it's worst on uniprocessors.

Fixes #11143.

Change-Id: I0164231ada294788d7eec251a2fc33e02a26c13b
Reviewed-on: https://go-review.googlesource.com/12522
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-07-22 18:53:12 +00:00
Austin Clements
f2c3957ed8 runtime: disable GC around TestGoroutineParallelism
TestGoroutineParallelism can deadlock if the GC runs during the
test. Currently it tries to prevent this by forcing a GC before the
test, but this is best effort and fails completely if GOGC is very low
for testing.

This change replaces this best-effort fix with simply setting GOGC to
off for the duration of the test.

Change-Id: I8229310833f241b149ebcd32845870c1cb14e9f8
Reviewed-on: https://go-review.googlesource.com/10454
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-28 17:40:19 +00:00
Russ Cox
42da270024 runtime: fix race in BenchmarkPingPongHog
The master goroutine was returning before
the child goroutine had done its final i < b.N
(the one that fails and causes it to exit the loop)
and then the benchmark harness was updating
b.N, causing a read+write race on b.N.

Change-Id: I2504270a0de30544736f6c32161337a25b505c3e
Reviewed-on: https://go-review.googlesource.com/9368
Reviewed-by: Austin Clements <austin@google.com>
2015-04-27 20:10:11 +00:00
Austin Clements
e870f06c3f runtime: yield time slice to most recently readied G
Currently, when the runtime ready()s a G, it adds it to the end of the
current P's run queue and continues running. If there are many other
things in the run queue, this can result in a significant delay before
the ready()d G actually runs and can hurt fairness when other Gs in
the run queue are CPU hogs. For example, if there are three Gs sharing
a P, one of which is a CPU hog that never voluntarily gives up the P
and the other two of which are doing small amounts of work and
communicating back and forth on an unbuffered channel, the two
communicating Gs will get very little CPU time.

Change this so that when G1 ready()s G2 and then blocks, the scheduler
immediately hands off the remainder of G1's time slice to G2. In the
above example, the two communicating Gs will now act as a unit and
together get half of the CPU time, while the CPU hog gets the other
half of the CPU time.

This fixes the problem demonstrated by the ping-pong benchmark added
in the previous commit:

benchmark                old ns/op     new ns/op     delta
BenchmarkPingPongHog     684287        825           -99.88%

On the x/benchmarks suite, this change improves the performance of
garbage by ~6% (for GOMAXPROCS=1 and 4), and json by 28% and 36% for
GOMAXPROCS=1 and 4. It has negligible effect on heap size.

This has no effect on the go1 benchmark suite since those benchmarks
are mostly single-threaded.

Change-Id: I858a08eaa78f702ea98a5fac99d28a4ac91d339f
Reviewed-on: https://go-review.googlesource.com/9289
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-24 15:12:52 +00:00
Austin Clements
da0e37fa8d runtime: benchmark for ping-pong in the presence of a CPU hog
This benchmark demonstrates a current problem with the scheduler where
a set of frequently communicating goroutines get very little CPU time
in the presence of another goroutine that hogs that CPU, even if one
of those communicating goroutines is always runnable.

Currently it takes about 0.5 milliseconds to switch between
ping-ponging goroutines in the presence of a CPU hog:

BenchmarkPingPongHog	    2000	    684287 ns/op

Change-Id: I278848c84f778de32344921ae8a4a8056e4898b0
Reviewed-on: https://go-review.googlesource.com/9288
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-24 15:12:47 +00:00
Dmitry Vyukov
c4ee44b7b9 cmd/gc: transform closure calls to function calls
Currently we always create context objects for closures that capture variables.
However, it is completely unnecessary for direct calls of closures
(whether it is func()(), defer func()() or go func()()).
This change transforms any OCALLFUNC(OCLOSURE) to normal function call.
Closed variables become function arguments.
This transformation is especially beneficial for go func(),
because we do not need to allocate context object on heap.
But it makes direct closure calls a bit faster as well (see BenchmarkClosureCall).

On implementation level it required to introduce yet another compiler pass.
However, the pass iterates only over xtop, so it should not be an issue.
Transformation consists of two parts: closure transformation and call site
transformation. We can't run these parts on different sides of escape analysis,
because tree state is inconsistent. We can do both parts during typecheck,
we don't know how to capture variables and don't have call site.
We can't do both parts during walk of OCALLFUNC, because we can walk
OCLOSURE body earlier.
So now capturevars pass only decides how to capture variables
(this info is required for escape analysis). New transformclosure
pass, that runs just before order/walk, does all transformations
of a closure. And later walk of OCALLFUNC(OCLOSURE) transforms call site.

benchmark                            old ns/op     new ns/op     delta
BenchmarkClosureCall                 4.89          3.09          -36.81%
BenchmarkCreateGoroutinesCapture     1634          1294          -20.81%

benchmark                            old allocs     new allocs     delta
BenchmarkCreateGoroutinesCapture     6              2              -66.67%

benchmark                            old bytes     new bytes     delta
BenchmarkCreateGoroutinesCapture     176           48            -72.73%

Change-Id: Ic85e1706e18c3235cc45b3c0c031a9c1cdb7a40e
Reviewed-on: https://go-review.googlesource.com/4050
Reviewed-by: Russ Cox <rsc@golang.org>
2015-02-13 12:12:18 +00:00
Dmitry Vyukov
0e80b2e082 cmd/gc: capture variables by value
Language specification says that variables are captured by reference.
And that is what gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting visible behavior of programs. For example, consider
the following typical pattern:

	func (o *Obj) requestMany(urls []string) []Result {
		wg := new(sync.WaitGroup)
		wg.Add(len(urls))
		res := make([]Result, len(urls))
		for i := range urls {
			i := i
			go func() {
				res[i] = o.requestOne(urls[i])
				wg.Done()
			}()
		}
		wg.Wait()
		return res
	}

Currently o, wg, res, and i are captured by reference causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap).
But all of them can be captured by value without changing behavior.

This change implements simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.

benchmark                                    old allocs     new allocs     delta
BenchmarkCompressedZipGarbage                153            126            -17.65%
BenchmarkEncodeDigitsSpeed1e4                91             69             -24.18%
BenchmarkEncodeDigitsSpeed1e5                178            129            -27.53%
BenchmarkEncodeDigitsSpeed1e6                1510           1051           -30.40%
BenchmarkEncodeDigitsDefault1e4              100            75             -25.00%
BenchmarkEncodeDigitsDefault1e5              193            139            -27.98%
BenchmarkEncodeDigitsDefault1e6              1420           985            -30.63%
BenchmarkEncodeDigitsCompress1e4             100            75             -25.00%
BenchmarkEncodeDigitsCompress1e5             193            139            -27.98%
BenchmarkEncodeDigitsCompress1e6             1420           985            -30.63%
BenchmarkEncodeTwainSpeed1e4                 109            81             -25.69%
BenchmarkEncodeTwainSpeed1e5                 211            151            -28.44%
BenchmarkEncodeTwainSpeed1e6                 1588           1097           -30.92%
BenchmarkEncodeTwainDefault1e4               103            77             -25.24%
BenchmarkEncodeTwainDefault1e5               199            143            -28.14%
BenchmarkEncodeTwainDefault1e6               1324           917            -30.74%
BenchmarkEncodeTwainCompress1e4              103            77             -25.24%
BenchmarkEncodeTwainCompress1e5              190            137            -27.89%
BenchmarkEncodeTwainCompress1e6              1327           919            -30.75%
BenchmarkConcurrentDBExec                    16223          16220          -0.02%
BenchmarkConcurrentStmtQuery                 17687          16182          -8.51%
BenchmarkConcurrentStmtExec                  5191           5186           -0.10%
BenchmarkConcurrentTxQuery                   17665          17661          -0.02%
BenchmarkConcurrentTxExec                    15154          15150          -0.03%
BenchmarkConcurrentTxStmtQuery               17661          16157          -8.52%
BenchmarkConcurrentTxStmtExec                3677           3673           -0.11%
BenchmarkConcurrentRandom                    14000          13614          -2.76%
BenchmarkManyConcurrentQueries               25             22             -12.00%
BenchmarkDecodeComplex128Slice               318            252            -20.75%
BenchmarkDecodeFloat64Slice                  318            252            -20.75%
BenchmarkDecodeInt32Slice                    318            252            -20.75%
BenchmarkDecodeStringSlice                   2318           2252           -2.85%
BenchmarkDecode                              11             8              -27.27%
BenchmarkEncodeGray                          64             56             -12.50%
BenchmarkEncodeNRGBOpaque                    64             56             -12.50%
BenchmarkEncodeNRGBA                         67             58             -13.43%
BenchmarkEncodePaletted                      68             60             -11.76%
BenchmarkEncodeRGBOpaque                     64             56             -12.50%
BenchmarkGoLookupIP                          153            139            -9.15%
BenchmarkGoLookupIPNoSuchHost                508            466            -8.27%
BenchmarkGoLookupIPWithBrokenNameServer      245            226            -7.76%
BenchmarkClientServer                        62             59             -4.84%
BenchmarkClientServerParallel4               62             59             -4.84%
BenchmarkClientServerParallel64              62             59             -4.84%
BenchmarkClientServerParallelTLS4            79             76             -3.80%
BenchmarkClientServerParallelTLS64           112            109            -2.68%
BenchmarkCreateGoroutinesCapture             10             6              -40.00%
BenchmarkAfterFunc                           1006           1005           -0.10%

Fixes #6632.

Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
2015-01-29 13:07:30 +00:00