264 Commits

Author SHA1 Message Date
Meng Zhuo
954f651ccc internal/cpu,runtime: call cpu.initialize before alginit
runtime.alginit needs runtime/support_{aes,ssse3,sse41} feature flag
to init aeshash function but internal/cpu.init not be called yet.
This CL will call internal/cpu.initialize before runtime.alginit, so
that we can move all cpu features related code to internal/cpu.

Change-Id: I00b8e403ace3553f8c707563d95f27dade0bc853
Reviewed-on: https://go-review.googlesource.com/104636
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-10 16:13:52 +00:00
Keith Randall
9d4215311b runtime: identify special functions by flag instead of address
When there are plugins, there may not be a unique copy of runtime
functions like goexit, mcall, etc.  So identifying them by entry
address is problematic.  Instead, keep track of each special function
using a field in the symbol table.  That way, multiple copies of
the same runtime function will be treated identically.

Fixes #24351
Fixes #23133

Change-Id: Iea3232df8a6af68509769d9ca618f530cc0f84fd
Reviewed-on: https://go-review.googlesource.com/100739
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-15 17:31:57 +00:00
Josh Bleecher Snyder
6cb064c9c4 Revert "runtime: convert g.waitreason from string to uint8"
This reverts commit 4eea887fd477368653f6fcf8ad766030167936e5.

Reason for revert: broke s390x build

Change-Id: Id6c2b6a7319273c4d21f613d4cdd38b00d49f847
Reviewed-on: https://go-review.googlesource.com/100375
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-13 15:21:21 +00:00
Josh Bleecher Snyder
4eea887fd4 runtime: convert g.waitreason from string to uint8
Every time I poke at #14921, the g.waitreason string
pointer writes show up.

They're not particularly important performance-wise,
but it'd be nice to clear the noise away.

And it does open up a few extra bytes in the g struct
for some future use.

Change-Id: I7ffbd52fbc2a286931a2218038fda52ed6473cc9
Reviewed-on: https://go-review.googlesource.com/99078
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-12 21:56:50 +00:00
Ian Lance Taylor
4902778607 runtime: set libcall values for Solaris system calls
This lets SIGPROF signals get a useful traceback.
Without it we just see sysvicallN calling asmcgocall.

Updates #24142

Change-Id: I5dfe3add51f0c3a4cb1c98acb7738be6396214bc
Reviewed-on: https://go-review.googlesource.com/99617
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-09 19:27:20 +00:00
Austin Clements
7f1b2738bb runtime: make throw safer to call
Currently, throw may grow the stack, which means whenever we call it
from a context where it's not safe to grow the stack, we first have to
switch to the system stack. This is pretty easy to get wrong.

Fix this by making throw switch to the system stack so it doesn't grow
the stack and is hence safe to call without a system stack switch at
the call site.

The only thing this complicates is badsystemstack itself, which would
now go into an infinite loop before printing anything (previously it
would also go into an infinite loop, but would at least print the
error first). Fix this by making badsystemstack do a direct write and
then crash hard.

Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049
Reviewed-on: https://go-review.googlesource.com/93659
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:55:52 +00:00
Ian Lance Taylor
419c06455a runtime: get traceback from VDSO code
Currently if a profiling signal arrives while executing within a VDSO
the profiler will report _ExternalCode, which is needlessly confusing
for a pure Go program. Change the VDSO calling code to record the
caller's PC/SP, so that we can do a traceback from that point. If that
fails for some reason, report _VDSO rather than _ExternalCode, which
should at least point in the right direction.

This adds some instructions to the code that calls the VDSO, but the
slowdown is reasonably negligible:

name                                  old time/op  new time/op  delta
ClockVDSOAndFallbackPaths/vDSO-8      40.5ns ± 2%  41.3ns ± 1%  +1.85%  (p=0.002 n=10+10)
ClockVDSOAndFallbackPaths/Fallback-8  41.9ns ± 1%  43.5ns ± 1%  +3.84%  (p=0.000 n=9+9)
TimeNow-8                             41.5ns ± 3%  41.5ns ± 2%    ~     (p=0.723 n=10+10)

Fixes #24142

Change-Id: Iacd935db3c4c782150b3809aaa675a71799b1c9c
Reviewed-on: https://go-review.googlesource.com/97315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-07 23:35:25 +00:00
David Crawshaw
b03f1d1a7e runtime: remove extraneous stackPreempt setting
The stackguard is set to stackPreempt earlier in reentersyscall, and
as it comes with throwsplit = true there's no way for the stackguard
to be set to anything else by the end of reentersyscall.

Change-Id: I4e942005b22ac784c52398c74093ac887fc8ec24
Reviewed-on: https://go-review.googlesource.com/65673
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-02-14 15:27:11 +00:00
Austin Clements
7c2cf4e779 runtime: avoid race on allp in findrunnable
findrunnable loops over allp to check run queues *after* it has
dropped its own P. This is unsafe because allp can change when nothing
is blocking safe-points. Hence, procresize could change allp
concurrently with findrunnable's loop. Beyond generally violating Go's
memory model, in the best case this could findrunnable to observe a
nil P pointer if allp has been grown but the new slots not yet
initialized. In the worst case, the reads of allp could tear, causing
findrunnable to read a word that isn't even a valid *P pointer.

Fix this by taking a snapshot of the allp slice header (but not the
backing store) before findrunnable drops its P and iterating over this
snapshot. The actual contents of allp are immutable up to len(allp),
so this fixes the race.

Updates #23098 (may fix).

Change-Id: I556ae2dbfffe9fe4a1bf43126e930b9e5c240ea8
Reviewed-on: https://go-review.googlesource.com/86215
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-01-04 18:01:55 +00:00
Austin Clements
292558be02 runtime: restore the Go-allocated signal stack in unminit
Currently, when we minit on a thread that already has an alternate
signal stack (e.g., because the M was an extram being used for a cgo
callback, or to handle a signal on a C thread, or because the
platform's libc always allocates a signal stack like on Android), we
simply drop the Go-allocated gsignal stack on the floor.

This is a problem for Ms on the extram list because those Ms may later
be reused for a different thread that may not have its own alternate
signal stack. On tip, this manifests as a crash in sigaltstack because
we clear the gsignal stack bounds in unminit and later try to use
those cleared bounds when we re-minit that M. On 1.9 and earlier, we
didn't clear the bounds, so this manifests as running more than one
signal handler on the same signal stack, which could lead to arbitrary
memory corruption.

This CL fixes this problem by saving the Go-allocated gsignal stack in
a new field in the m struct when overwriting it with a system-provided
signal stack, and then restoring the original gsignal stack in
unminit.

This CL is designed to be easy to back-port to 1.9. It won't quite
cherry-pick cleanly, but it should be sufficient to simply ignore the
change in mexit (which didn't exist in 1.9).

Now that we always have a place to stash the original signal stack in
the m struct, there are some simplifications we can make to the signal
stack handling. We'll do those in a later CL.

Fixes #22930.

Change-Id: I55c5a6dd9d97532f131146afdef0b216e1433054
Reviewed-on: https://go-review.googlesource.com/81476
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-12-01 20:20:45 +00:00
Austin Clements
be589f8d2b runtime: fix final stack split in exitsyscall
exitsyscall should be recursively nosplit, but we don't have a way to
annotate that right now (see #21314). There's exactly one remaining
place where this is violated right now: exitsyscall -> casgstatus ->
print. The other prints in casgstatus are wrapped in systemstack
calls. This fixes the remaining print.

Updates #21431 (in theory could fix it, but that would just indicate
that we have a different G status-related crash and we've *never* seen
that failure on the dashboard.)

Change-Id: I9a5e8d942adce4a5c78cfc6b306ea5bda90dbd33
Reviewed-on: https://go-review.googlesource.com/79815
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-24 15:48:04 +00:00
Austin Clements
09739d2850 runtime: call throw on systemstack in exitsyscall
If exitsyscall tries to grow the stack it will panic, but throw calls
print, which can grow the stack. Move the two bare throws in
exitsyscall to the system stack.

Updates #21431.

Change-Id: I5b29da5d34ade908af648a12075ed327a864476c
Reviewed-on: https://go-review.googlesource.com/79517
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-22 21:44:35 +00:00
Michael Pratt
b75b4d0ee6 runtime: skip netpoll check if there are no waiters
If there are no netpoll waiters then calling netpoll will never find any
goroutines. The later blocking netpoll in findrunnable already has this
optimization.

With golang.org/cl/78538 also applied, this change has a small impact on
latency:

name                             old time/op  new time/op  delta
WakeupParallelSpinning/0s-12     13.6µs ± 1%  13.7µs ± 1%    ~     (p=0.873 n=19+20)
WakeupParallelSpinning/1µs-12    17.7µs ± 0%  17.6µs ± 0%  -0.31%  (p=0.000 n=20+20)
WakeupParallelSpinning/2µs-12    20.2µs ± 2%  19.9µs ± 1%  -1.59%  (p=0.000 n=20+19)
WakeupParallelSpinning/5µs-12    32.0µs ± 1%  32.1µs ± 1%    ~     (p=0.201 n=20+19)
WakeupParallelSpinning/10µs-12   51.7µs ± 0%  51.4µs ± 1%  -0.60%  (p=0.000 n=20+18)
WakeupParallelSpinning/20µs-12   92.2µs ± 0%  92.2µs ± 0%    ~     (p=0.474 n=19+19)
WakeupParallelSpinning/50µs-12    215µs ± 0%   215µs ± 0%    ~     (p=0.319 n=20+19)
WakeupParallelSpinning/100µs-12   330µs ± 2%   331µs ± 2%    ~     (p=0.296 n=20+19)
WakeupParallelSyscall/0s-12       127µs ± 0%   126µs ± 0%  -0.57%  (p=0.000 n=18+18)
WakeupParallelSyscall/1µs-12      129µs ± 0%   128µs ± 1%  -0.43%  (p=0.000 n=18+19)
WakeupParallelSyscall/2µs-12      131µs ± 1%   130µs ± 1%  -0.78%  (p=0.000 n=20+19)
WakeupParallelSyscall/5µs-12      137µs ± 1%   136µs ± 0%  -0.54%  (p=0.000 n=18+19)
WakeupParallelSyscall/10µs-12     147µs ± 1%   146µs ± 0%  -0.58%  (p=0.000 n=18+19)
WakeupParallelSyscall/20µs-12     168µs ± 0%   167µs ± 0%  -0.52%  (p=0.000 n=19+19)
WakeupParallelSyscall/50µs-12     228µs ± 0%   227µs ± 0%  -0.37%  (p=0.000 n=19+18)
WakeupParallelSyscall/100µs-12    329µs ± 0%   328µs ± 0%  -0.28%  (p=0.000 n=20+18)

There is a bigger improvement in CPU utilization. Before this CL, these
benchmarks spent 12% of cycles in netpoll, which are gone after this CL.

This also fixes the sched.lastpoll load, which should be atomic.

Change-Id: I600961460608bd5ba3eeddc599493d2be62064c6
Reviewed-on: https://go-review.googlesource.com/78915
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2017-11-21 19:36:56 +00:00
Jamie Liu
868c8b374d runtime: only sleep before stealing work from a running P
The sleep in question does not make sense if the stolen-from P cannot
run the stolen G. The usleep(3) has been observed delaying execution of
woken G's by ~60us; skipping it reduces the wakeup-to-execution latency
to ~7us in these cases, improving CPU utilization.

Benchmarks added by this change:

name                             old time/op  new time/op  delta
WakeupParallelSpinning/0s-12     14.4µs ± 1%  14.3µs ± 1%     ~     (p=0.227 n=19+20)
WakeupParallelSpinning/1µs-12    18.3µs ± 0%  18.3µs ± 1%     ~     (p=0.950 n=20+19)
WakeupParallelSpinning/2µs-12    22.3µs ± 1%  22.3µs ± 1%     ~     (p=0.670 n=20+18)
WakeupParallelSpinning/5µs-12    31.7µs ± 0%  31.7µs ± 0%     ~     (p=0.460 n=20+17)
WakeupParallelSpinning/10µs-12   51.8µs ± 0%  51.8µs ± 0%     ~     (p=0.883 n=20+20)
WakeupParallelSpinning/20µs-12   91.9µs ± 0%  91.9µs ± 0%     ~     (p=0.245 n=20+20)
WakeupParallelSpinning/50µs-12    214µs ± 0%   214µs ± 0%     ~     (p=0.509 n=19+20)
WakeupParallelSpinning/100µs-12   335µs ± 0%   335µs ± 0%   -0.05%  (p=0.006 n=17+15)
WakeupParallelSyscall/0s-12       228µs ± 2%   129µs ± 1%  -43.32%  (p=0.000 n=20+19)
WakeupParallelSyscall/1µs-12      232µs ± 1%   131µs ± 1%  -43.60%  (p=0.000 n=19+20)
WakeupParallelSyscall/2µs-12      236µs ± 1%   133µs ± 1%  -43.44%  (p=0.000 n=18+19)
WakeupParallelSyscall/5µs-12      248µs ± 2%   139µs ± 1%  -43.68%  (p=0.000 n=18+19)
WakeupParallelSyscall/10µs-12     263µs ± 3%   150µs ± 2%  -42.97%  (p=0.000 n=18+20)
WakeupParallelSyscall/20µs-12     281µs ± 2%   170µs ± 1%  -39.43%  (p=0.000 n=19+19)
WakeupParallelSyscall/50µs-12     345µs ± 4%   246µs ± 7%  -28.85%  (p=0.000 n=20+20)
WakeupParallelSyscall/100µs-12    460µs ± 5%   350µs ± 4%  -23.85%  (p=0.000 n=20+20)

Benchmarks associated with the change that originally added this sleep
(see https://golang.org/s/go15gomaxprocs):

name        old time/op  new time/op  delta
Chain       19.4µs ± 2%  19.3µs ± 1%    ~     (p=0.101 n=19+20)
ChainBuf    19.5µs ± 2%  19.4µs ± 2%    ~     (p=0.840 n=19+19)
Chain-2     19.9µs ± 1%  19.9µs ± 2%    ~     (p=0.734 n=19+19)
ChainBuf-2  20.0µs ± 2%  20.0µs ± 2%    ~     (p=0.175 n=19+17)
Chain-4     20.3µs ± 1%  20.1µs ± 1%  -0.62%  (p=0.010 n=19+18)
ChainBuf-4  20.3µs ± 1%  20.2µs ± 1%  -0.52%  (p=0.023 n=19+19)
Powser       2.09s ± 1%   2.10s ± 3%    ~     (p=0.908 n=19+19)
Powser-2     2.21s ± 1%   2.20s ± 1%  -0.35%  (p=0.010 n=19+18)
Powser-4     2.31s ± 2%   2.31s ± 2%    ~     (p=0.578 n=18+19)
Sieve        13.6s ± 1%   13.6s ± 1%    ~     (p=0.909 n=17+18)
Sieve-2      8.02s ±52%   7.28s ±15%    ~     (p=0.336 n=20+16)
Sieve-4      4.00s ±35%   3.98s ±26%    ~     (p=0.654 n=20+18)

Change-Id: I58edd8ce01075859d871e2348fc0833e9c01f70f
Reviewed-on: https://go-review.googlesource.com/78538
Reviewed-by: Austin Clements <austin@google.com>
2017-11-21 19:31:06 +00:00
Austin Clements
f10d99f51d runtime: flush assist credit on goroutine exit
Currently dead goroutines retain their assist credit. This credit can
be used if the goroutine gets recycled, but in general this can make
assist pacing over-aggressive by hiding an amount of credit
proportional to the number of exited (and not reused) goroutines.

Fix this "hidden credit" by flushing assist credit to the global
credit pool when a goroutine exits.

Updates #14812.

Change-Id: I65f7f75907ab6395c04aacea2c97aea963b60344
Reviewed-on: https://go-review.googlesource.com/24703
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-11-07 18:41:14 +00:00
Ian Lance Taylor
86cd9c1176 runtime: only call netpoll if netpollinited returns true
This fixes a race on old Linux kernels, in which we might temporarily
set epfd to an invalid value other than -1. It's also the right thing
to do. No test because the problem only occurs on old kernels.

Fixes #22606

Change-Id: Id84bdd6ae6d7c5d47c39e97b74da27576cb51a54
Reviewed-on: https://go-review.googlesource.com/76319
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2017-11-07 16:18:12 +00:00
Austin Clements
e9079a69f3 runtime: buffered write barrier implementation
This implements runtime support for buffered write barriers on amd64.
The buffered write barrier has a fast path that simply enqueues
pointers in a per-P buffer. Unlike the current write barrier, this
fast path is *not* a normal Go call and does not require the compiler
to spill general-purpose registers or put arguments on the stack. When
the buffer fills up, the write barrier takes the slow path, which
spills all general purpose registers and flushes the buffer. We don't
allow safe-points or stack splits while this frame is active, so it
doesn't matter that we have no type information for the spilled
registers in this frame.

One minor complication is cgocheck=2 mode, which uses the write
barrier to detect Go pointers being written to non-Go memory. We
obviously can't buffer this, so instead we set the buffer to its
minimum size, forcing the write barrier into the slow path on every
call. For this specific case, we pass additional information as
arguments to the flush function. This also requires enabling the cgo
write barrier slightly later during runtime initialization, after Ps
(and the per-P write barrier buffers) have been initialized.

The code in this CL is not yet active. The next CL will modify the
compiler to generate calls to the new write barrier.

This reduces the average cost of the write barrier by roughly a factor
of 4, which will pay for the cost of having it enabled more of the
time after we make the GC pacer less aggressive. (Benchmarks will be
in the next CL.)

Updates #14951.
Updates #22460.

Change-Id: I396b5b0e2c5e5c4acfd761a3235fd15abadc6cb1
Reviewed-on: https://go-review.googlesource.com/73711
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-10-30 18:12:44 +00:00
Austin Clements
3526d8031a runtime: allow write barriers in gchelper
We're about to start tracking nowritebarrierrec through systemstack
calls, which detects that we're calling markroot (which has write
barriers) from gchelper, which is called from the scheduler during STW
apparently without a P.

But it turns out that func helpgc, which wakes up blocked Ms to run
gchelper, installs a P for gchelper to use. This means there *is* a P
when gchelper runs, so it is allowed to have write barriers. Tell the
compiler this by marking gchelper go:yeswritebarrierrec. Also,
document the call to gchelper so I don't have to spend another half a
day puzzling over how on earth this could possibly work before
discovering the spooky action-at-a-distance in helpgc.

Updates #22384.
For #22460.

Change-Id: I7394c9b4871745575f87a2d4fbbc5b8e54d669f7
Reviewed-on: https://go-review.googlesource.com/72772
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-10-29 17:56:21 +00:00
Russ Cox
986582126a runtime: avoid monotonic time zero on systems with low-res timers
Otherwise low-res timers cause problems at call sites that expect to
be able to use 0 as meaning "no time set" and therefore expect that
nanotime never returns 0 itself. For example, sched.lastpoll == 0
means no last poll.

Fixes #22394.

Change-Id: Iea28acfddfff6f46bc90f041ec173e0fea591285
Reviewed-on: https://go-review.googlesource.com/73410
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-10-25 17:10:20 +00:00
David du Colombier
f4faca6013 runtime: don't terminate locked OS threads on Plan 9
CL 46037 and CL 46038 implemented termination of
locked OS threads when the goroutine exits.

However, this behavior leads to crashes of Go programs
using runtime.LockOSThread on Plan 9. This is notably
the case of the os/exec and net packages.

This change disables termination of locked OS threads
on Plan 9.

Updates #22227.

Change-Id: If9fa241bff1c0b68e7e9e321e06e5203b3923212
Reviewed-on: https://go-review.googlesource.com/71230
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-17 15:15:12 +00:00
David du Colombier
d155b32f8d runtime: disable use of template thread on Plan 9
CL 46033 added a "template thread" mechanism to
allow creation of thread with a known-good state
from a thread of unknown state.

However, we are experiencing issues on Plan 9
with programs using the os/exec and net package.
These package are relying on runtime.LockOSThread.

Updates #22227.

Change-Id: I85b71580a41df9fe8b24bd8623c064b6773288b0
Reviewed-on: https://go-review.googlesource.com/70231
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-10-17 15:15:07 +00:00
David du Colombier
926373ea79 runtime: fix crash on Plan 9
Since CL 46037, the runtime is crashing after calling
exitThread on Plan 9.

The exitThread function shouldn't be called on
Plan 9, because the system manages thread stacks.

Fixes #22221.

Change-Id: I5d61c9660a87dc27e4cfcb3ca3ddcb4b752f2397
Reviewed-on: https://go-review.googlesource.com/70190
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-10-12 00:11:33 +00:00
Austin Clements
44d9e96da9 runtime: don't try to free OS-created signal stacks
Android's libc creates a signal stack for every thread it creates. In
Go, minitSignalStack picks up this existing signal stack and puts it
in m.gsignal.stack. However, if we later try to exit a thread (because
a locked goroutine is exiting), we'll attempt to stackfree this
libc-allocated signal stack and panic.

Fix this by clearing gsignal.stack when we unminitSignals in such a
situation.

This should fix the Android build, which is currently broken.

Change-Id: Ieea8d72ef063d22741c54c9daddd8bb84926a488
Reviewed-on: https://go-review.googlesource.com/70130
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-11 22:17:30 +00:00
Austin Clements
0aef82aa4a runtime: make (Un)LockOSThread doc more prescriptive
Right now users have to infer why they would want LockOSThread and
when it may or may not be appropriate to call UnlockOSThread. This
requires some understanding of Go's internal thread pool
implementation, which is unfortunate.

Improve the situation by making the documentation on these functions
more prescriptive so users can figure out when to use them even if
they don't know about the scheduler.

Change-Id: Ide221791e37cb5106dd8a172f89fbc5b3b98fe32
Reviewed-on: https://go-review.googlesource.com/52871
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 17:47:23 +00:00
Austin Clements
4f34a52913 runtime: terminate locked OS thread if its goroutine exits
runtime.LockOSThread is sometimes used when the caller intends to put
the OS thread into an unusual state. In this case, we never want to
return this thread to the runtime thread pool. However, currently
exiting the goroutine implicitly unlocks its OS thread.

Fix this by terminating the locked OS thread when its goroutine exits,
rather than simply returning it to the pool.

Fixes #20395.

Change-Id: I3dcec63b200957709965f7240dc216fa84b62ad9
Reviewed-on: https://go-review.googlesource.com/46038
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 17:47:21 +00:00
Austin Clements
eff2b2620d runtime: make it possible to exit Go-created threads
Currently, threads created by the runtime exist until the whole
program exits. For #14592 and #20395, we want to be able to exit and
clean up threads created by the runtime. This commit implements that
mechanism.

The main difficulty is how to clean up the g0 stack. In cgo mode and
on Solaris and Windows where the OS manages thread stacks, we simply
arrange to return from mstart and let the system clean up the thread.
If the runtime allocated the g0 stack, then we use a new exitThread
syscall wrapper that arranges to clear a flag in the M once the stack
can safely be reaped and call the thread termination syscall.

exitThread is based on the existing exit1 wrapper, which was always
meant to terminate the calling thread. However, exit1 has never been
used since it was introduced 9 years ago, so it was broken on several
platforms. exitThread also has the additional complication of having
to flag that the stack is unused, which requires some tricks on
platforms that use the stack for syscalls.

This still leaves the problem of how to reap the unused g0 stacks. For
this, we move the M from allm to a new freem list as part of the M
exiting. Later, allocm scans the freem list, finds Ms that are marked
as done with their stack, removes these from the list and frees their
g0 stacks. This also allows these Ms to be garbage collected.

This CL does not yet use any of this functionality. Follow-up CLs
will. Likewise, there are no new tests in this CL because we'll need
follow-up functionality to test it.

Change-Id: Ic851ee74227b6d39c6fc1219fc71b45d3004bc63
Reviewed-on: https://go-review.googlesource.com/46037
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 17:47:18 +00:00
Austin Clements
6c7bea6309 runtime: replace sched.mcount int32 with sched.mnext int64
Currently, since Ms never exit, the number of Ms, the number of Ms
ever created, and the ID of the next M are all the same and must be
small. That's about to change, so rename sched.mcount to sched.mnext
to make it clear it's the number of Ms ever created (and the ID of the
next M), change its type to int64, and use mcount() for the number of
Ms. In the next commit, mcount() will become slightly less trivial.

For #20395.

Change-Id: I9af34d36bd72416b5656555d16e8085076f1b196
Reviewed-on: https://go-review.googlesource.com/68750
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 17:47:16 +00:00
Austin Clements
a212083eea runtime: mark mstart as nowritebarrierrec
mstart is the entry point for new threads, so it certainly can't
interact with GC enough to have write barriers. We move the one small
piece that is allowed to have write barriers out into its own
function.

Change-Id: Id9c31d6ffac31d0051fab7db15eb428c11cadbad
Reviewed-on: https://go-review.googlesource.com/46035
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-10-11 17:47:13 +00:00
Austin Clements
2595fe7fb6 runtime: don't start new threads from locked threads
Applications that need to manipulate kernel thread state are currently
on thin ice in Go: they can use LockOSThread to prevent other
goroutines from running on the manipulated thread, but Go may clone
this manipulated state into a new thread that's put into the runtime's
thread pool along with other threads.

Fix this by never starting a new thread from a locked thread or a
thread that may have been started by C. Instead, the runtime starts a
"template thread" with a known-good state. If it then needs to start a
new thread but doesn't know that the current thread is in a good
state, it forwards the thread creation to the template thread.

Fixes #20676.

Change-Id: I798137a56e04b7723d55997e9c5c085d1d910643
Reviewed-on: https://go-review.googlesource.com/46033
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 17:47:08 +00:00
Daniel Martí
6f5ede8bd5 runtime: remove a few unused params and results
These have never had a use - not even going back to when they were added
in C.

Change-Id: I143b6902b3bacb1fa83c56c9070a8adb9f61a844
Reviewed-on: https://go-review.googlesource.com/69119
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-09 20:14:50 +00:00
Gabriel Aszalos
a04adcaf35 runtime: remove the 'go:nosplit' directive from documentation
The //go:nosplit directive was visible in GoDoc because the function
that it preceeded (Gosched) is exported. This change moves the directive
above the documentation, hiding it from the output.

Change-Id: I281fd7573f11d977487809f74c9cc16b2af0dc88
Reviewed-on: https://go-review.googlesource.com/69120
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-10-09 17:02:18 +00:00
Austin Clements
c85b12b579 runtime: make LockOSThread/UnlockOSThread nested
Currently, there is a single bit for LockOSThread, so two calls to
LockOSThread followed by one call to UnlockOSThread will unlock the
thread. There's evidence (#20458) that this is almost never what
people want or expect and it makes these APIs very hard to use
correctly or reliably.

Change this so LockOSThread/UnlockOSThread can be nested and the
calling goroutine will not be unwired until UnlockOSThread has been
called as many times as LockOSThread has. This should fix the vast
majority of incorrect uses while having no effect on the vast majority
of correct uses.

Fixes #20458.

Change-Id: I1464e5e9a0ea4208fbb83638ee9847f929a2bacb
Reviewed-on: https://go-review.googlesource.com/45752
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-05 19:50:23 +00:00
Austin Clements
e900e275e8 runtime: clean up loops over allp
allp now has length gomaxprocs, which means none of allp[i] are nil or
in state _Pdead. This lets replace several different styles of loops
over allp with normal range loops.

for i := 0; i < gomaxprocs; i++ { ... } loops can simply range over
allp. Likewise, range loops over allp[:gomaxprocs] can just range over
allp.

Loops that check for p == nil || p.state == _Pdead don't need to check
this any more.

Loops that check for p == nil don't have to check this *if* dead Ps
don't affect them. I checked that all such loops are, in fact,
unaffected by dead Ps. One loop was potentially affected, which this
fixes by zeroing p.gcAssistTime in procresize.

Updates #15131.

Change-Id: Ifa1c2a86ed59892eca0610360a75bb613bc6dcee
Reviewed-on: https://go-review.googlesource.com/45575
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-09-27 16:29:15 +00:00
Austin Clements
ee55000f6c runtime: eliminate GOMAXPROCS limit
Now that allp is dynamically allocated, there's no need for a hard cap
on GOMAXPROCS.

Fixes #15131.

Change-Id: I53eee8e228a711a818f7ebce8d9fd915b3865eed
Reviewed-on: https://go-review.googlesource.com/45574
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-09-27 16:29:12 +00:00
Austin Clements
84d2c7ea83 runtime: dynamically allocate allp
This makes it possible to eliminate the hard cap on GOMAXPROCS.

Updates #15131.

Change-Id: I4c422b340791621584c118a6be1b38e8a44f8b70
Reviewed-on: https://go-review.googlesource.com/45573
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-09-27 16:29:09 +00:00
Austin Clements
229aaac19e runtime: remove getcallerpc argument
Now that getcallerpc is a compiler intrinsic on x86 and non-x86
platforms don't need the argument, we can drop it.

Sadly, this doesn't let us remove any dummy arguments since all of
those cases also use getcallersp, which still takes the argument
pointer, but this is at least an improvement.

Change-Id: I9c34a41cf2c18cba57f59938390bf9491efb22d2
Reviewed-on: https://go-review.googlesource.com/65474
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-09-22 22:17:15 +00:00
Ian Lance Taylor
332719f7ce runtime: don't call lockOSThread for every cgo call
For a trivial benchmark with a do-nothing cgo call:

name    old time/op  new time/op  delta
Call-4  64.5ns ± 7%  63.0ns ± 6%  -2.25%  (p=0.027 n=20+16)

Because Windows uses the cgocall mechanism to make system calls,
and passes arguments in a struct held in the m,
we need to do the lockOSThread/unlockOSThread in that code.

Because deferreturn was getting a nosplit stack overflow error,
change it to avoid calling typedmemmove.

Updates #21827.

Change-Id: I9b1d61434c44faeb29805b46b409c812c9acadc2
Reviewed-on: https://go-review.googlesource.com/64070
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2017-09-22 18:17:13 +00:00
Giovanni Bajo
e7e4a4ffa3 runtime: improve fastrand with a better generator
The current generator is a simple LSFR, which showed strong
correlation in higher bits, as manifested by fastrandn().

Change it with xorshift64+, which is slightly more complex,
has a larger state, but has a period of 2^64-1 and is much better
at statistical tests. The version used here is capable of
passing Diehard and even SmallCrush.

Speed is slightly worse but is probably insignificant:

name                old time/op  new time/op  delta
Fastrand-4          0.77ns ±12%  0.91ns ±21%  +17.31%  (p=0.048 n=5+5)
FastrandHashiter-4  13.6ns ±21%  15.2ns ±17%     ~     (p=0.160 n=6+5)
Fastrandn/2-4       2.30ns ± 5%  2.45ns ±15%     ~     (p=0.222 n=5+5)
Fastrandn/3-4       2.36ns ± 7%  2.45ns ± 6%     ~     (p=0.222 n=5+5)
Fastrandn/4-4       2.33ns ± 8%  2.61ns ±30%     ~     (p=0.126 n=6+5)
Fastrandn/5-4       2.33ns ± 5%  2.48ns ± 9%     ~     (p=0.052 n=6+5)

Fixes #21806

Change-Id: I013bb37b463fdfc229a7f324df8fe2da8d286f33
Reviewed-on: https://go-review.googlesource.com/62530
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-09-16 10:17:26 +00:00
Ian Lance Taylor
165c15afa3 runtime: change lockedg/lockedm to guintptr/muintptr
This change has no real effect in itself. This is to prepare for a
followup change that will call lockOSThread during a cgo callback when
there is no p assigned, and therefore when lockOSThread can not use a
write barrier.

Change-Id: Ia122d41acf54191864bcb68f393f2ed3b2f87abc
Reviewed-on: https://go-review.googlesource.com/63630
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-09-15 17:29:51 +00:00
Aliaksandr Valialkin
76f4fd8a52 runtime: improve timers scalability on multi-CPU systems
Use per-P timers, so each P may work with its own timers.

This CL improves performance on multi-CPU systems
in the following cases:

- When serving high number of concurrent connections
  with read/write deadlines set (for instance, highly loaded
  net/http server).

- When using high number of concurrent timers. These timers
  may be implicitly created via context.WithDeadline
  or context.WithTimeout.

Production servers should usually set timeout on connections
and external requests in order to prevent from resource leakage.
See https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/

Below are relevant benchmark results for various GOMAXPROCS values
on linux/amd64:

context package:

name                                     old time/op  new time/op  delta
WithTimeout/concurrency=40      4.92µs ± 0%  5.17µs ± 1%  +5.07%  (p=0.000 n=9+9)
WithTimeout/concurrency=4000    6.03µs ± 1%  6.49µs ± 0%  +7.63%  (p=0.000 n=8+10)
WithTimeout/concurrency=400000  8.58µs ± 7%  9.02µs ± 4%  +5.02%  (p=0.019 n=10+10)

name                                     old time/op  new time/op  delta
WithTimeout/concurrency=40-2      3.70µs ± 1%  2.78µs ± 4%  -24.90%  (p=0.000 n=8+9)
WithTimeout/concurrency=4000-2    4.49µs ± 4%  3.67µs ± 5%  -18.26%  (p=0.000 n=10+10)
WithTimeout/concurrency=400000-2  6.16µs ±10%  5.15µs ±13%  -16.30%  (p=0.000 n=10+10)

name                                     old time/op  new time/op  delta
WithTimeout/concurrency=40-4      3.58µs ± 1%  2.64µs ± 2%  -26.13%  (p=0.000 n=9+10)
WithTimeout/concurrency=4000-4    4.17µs ± 0%  3.32µs ± 1%  -20.36%  (p=0.000 n=10+10)
WithTimeout/concurrency=400000-4  5.57µs ± 9%  4.83µs ±10%  -13.27%  (p=0.001 n=10+10)

time package:

name                     old time/op  new time/op  delta
AfterFunc                6.15ms ± 3%  6.07ms ± 2%     ~     (p=0.133 n=10+9)
AfterFunc-2              3.43ms ± 1%  3.56ms ± 1%   +3.91%  (p=0.000 n=10+9)
AfterFunc-4              5.04ms ± 2%  2.36ms ± 0%  -53.20%  (p=0.000 n=10+9)
After                    6.54ms ± 2%  6.49ms ± 3%     ~     (p=0.393 n=10+10)
After-2                  3.68ms ± 1%  3.87ms ± 0%   +5.14%  (p=0.000 n=9+9)
After-4                  6.66ms ± 1%  2.87ms ± 1%  -56.89%  (p=0.000 n=10+10)
Stop                      698µs ± 2%   689µs ± 1%   -1.26%  (p=0.011 n=10+10)
Stop-2                    729µs ± 2%   434µs ± 3%  -40.49%  (p=0.000 n=10+10)
Stop-4                    837µs ± 3%   333µs ± 2%  -60.20%  (p=0.000 n=10+10)
SimultaneousAfterFunc     694µs ± 1%   692µs ± 7%     ~     (p=0.481 n=10+10)
SimultaneousAfterFunc-2   714µs ± 3%   569µs ± 2%  -20.33%  (p=0.000 n=10+10)
SimultaneousAfterFunc-4   782µs ± 2%   386µs ± 2%  -50.67%  (p=0.000 n=10+10)
StartStop                 267µs ± 3%   274µs ± 0%   +2.64%  (p=0.000 n=8+9)
StartStop-2               238µs ± 2%   140µs ± 3%  -40.95%  (p=0.000 n=10+8)
StartStop-4               320µs ± 1%   125µs ± 1%  -61.02%  (p=0.000 n=9+9)
Reset                    75.0µs ± 1%  77.5µs ± 2%   +3.38%  (p=0.000 n=10+10)
Reset-2                   150µs ± 2%    40µs ± 5%  -73.09%  (p=0.000 n=10+9)
Reset-4                   226µs ± 1%    33µs ± 1%  -85.42%  (p=0.000 n=10+10)
Sleep                     857µs ± 6%   878µs ± 9%     ~     (p=0.079 n=10+9)
Sleep-2                   617µs ± 4%   585µs ± 2%   -5.21%  (p=0.000 n=10+10)
Sleep-4                   689µs ± 3%   465µs ± 4%  -32.53%  (p=0.000 n=10+10)
Ticker                   55.9ms ± 2%  55.9ms ± 2%     ~     (p=0.971 n=10+10)
Ticker-2                 28.7ms ± 2%  28.1ms ± 1%   -2.06%  (p=0.000 n=10+10)
Ticker-4                 14.6ms ± 0%  13.6ms ± 1%   -6.80%  (p=0.000 n=9+10)

Fixes #15133

Change-Id: I6f4b09d2db8c5bec93146db6501b44dbfe5c0ac4
Reviewed-on: https://go-review.googlesource.com/34784
Reviewed-by: Austin Clements <austin@google.com>
2017-09-12 16:52:23 +00:00
Austin Clements
b0392159f6 runtime,cmd/trace: trace GC STW events
Right now we only kind of sort of trace GC STW events. We emit events
around mark termination, but those start well after stopping the world
and end before starting it again, and we don't emit any events for
sweep termination.

Fix this by generalizing EvGCScanStart/EvGCScanDone. These were
already re-purposed to indicate mark termination (despite the names).
This commit renames them to EvGCSTWStart/EvGCSTWDone, adds an argument
to indicate the STW reason, and shuffles the runtime to generate them
right before stopping the world and right after starting the world,
respectively.

These events will make it possible to generate precise minimum mutator
utilization (MMU) graphs and could be useful in detecting
non-preemptible goroutines (e.g., #20792).

Change-Id: If95783f370781d8ef66addd94886028103a7c26f
Reviewed-on: https://go-review.googlesource.com/55411
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-08-29 21:54:55 +00:00
Daniel Martí
fbc8973a6b all: join some chained ifs to unindent code
Found with mvdan.cc/unindent. It skipped the cases where parentheses
would need to be added, where comments would have to be moved elsewhere,
or where actions and simple logic would mix.

One of them was of the form "err != nil && err == io.EOF", so the first
part was removed.

Change-Id: Ie504c2b03a2c87d10ecbca1b9270069be1171b91
Reviewed-on: https://go-review.googlesource.com/57690
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-29 20:57:41 +00:00
Austin Clements
9d17e175e0 runtime: capture runtimeInitTime after nanotime is initialized
CL 36428 changed the way nanotime works so on Darwin and Windows it
now depends on runtime.startNano, which is computed at runtime.init
time. Unfortunately, the `runtimeInitTime = nanotime()` initialization
happened *before* runtime.init, so on these platforms runtimeInitTime
is set incorrectly. The one (and only) consequence of this is that the
start time printed in gctrace lines is bogus:

gc 1 18446653480.186s 0%: 0.092+0.47+0.038 ms clock, 0.37+0.15/0.81/1.8+0.15 ms cpu, 4->4->1 MB, 5 MB goal, 8 P

To fix this, this commit moves the runtimeInitTime initialization to
shortly after runtime.init, at which point nanotime is safe to use.

This also requires changing the condition in newproc1 that currently
uses runtimeInitTime != 0 simply to detect whether or not the main M
has started. Since runtimeInitTime could genuinely be 0 now, this
introduces a separate flag to newproc1.

Fixes #21554.

Change-Id: Id874a4b912d3fa3d22f58d01b31ffb3548266d3b
Reviewed-on: https://go-review.googlesource.com/58690
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-25 16:02:17 +00:00
Daniel Morsing
32b94f13cf runtime: move selectdone into g
Writing to selectdone on the stack of another goroutine meant a
pretty subtle dance between the select code and the stack copying
code. Instead move the selectdone variable into the g struct.

Change-Id: Id246aaf18077c625adef7ca2d62794afef1bdd1b
Reviewed-on: https://go-review.googlesource.com/53390
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-15 19:18:00 +00:00
Austin Clements
250a9610a4 runtime: make STW duration more accurate
Currently, GC captures the start-the-world time stamp after
startTheWorldWithSema returns. This is problematic for two reasons:

1. It's possible to get preempted between startTheWorldWithSema
starting the world and calling nanotime.

2. startTheWorldWithSema does several clean-up tasks after the world
is up and running that on rare occasions can take upwards of 10ms.

Since the runtime uses the start-the-world time stamp to compute the
STW duration, both of these can significantly inflate the reported STW
duration.

Fix this by having startTheWorldWithSema itself call nanotime once the
world is started.

Change-Id: I114630234fb73c9dabae50a2ef1884661f2459db
Reviewed-on: https://go-review.googlesource.com/55410
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-08-15 18:47:08 +00:00
Vladimir Stefanovic
835dfef939 runtime/pprof: prevent a deadlock that SIGPROF might create on mips{,le}
64bit atomics on mips/mipsle are implemented using spinlocks. If SIGPROF
is received while the program is in the critical section, it will try to
write the sample using the same spinlock, creating a deadloop.
Prevent it by creating a counter of SIGPROFs during atomic64 and
postpone writing the sample(s) until called from elsewhere, with
pc set to _LostSIGPROFDuringAtomic64.

Added a test case, per Cherry's suggestion. Works around #20146.

Change-Id: Icff504180bae4ee83d78b19c0d9d6a80097087f9
Reviewed-on: https://go-review.googlesource.com/42652
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-07-26 13:29:59 +00:00
Ian Lance Taylor
28f650a2f7 runtime: don't call libc sigaction function in forked child
If we are using vfork, and if something (such as TSAN) is intercepting
the sigaction function, then we must call the system call, not the
libc function. Otherwise the intercepted sigaction call in the child
may trash the data structures in the parent.

Change-Id: Id9588bfeaa934f32c920bf829c5839be5cacf243
Reviewed-on: https://go-review.googlesource.com/50251
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
2017-07-20 18:02:47 +00:00
Austin Clements
093adeef40 runtime: use next timer to decide whether to relax
Currently, sysmon waits 60 ms during idle before relaxing. This is
primarily to avoid reducing the precision of short-duration timers. Of
course, if there are no short-duration timers, this wastes 60 ms
running the timer at high resolution.

Improve this by instead inspecting the time until the next timer fires
and relaxing the timer resolution immediately if the next timer won't
fire for a while.

Updates #20937.

Change-Id: If4ad0a565b65a9b3e8c4cdc2eff1486968c79f24
Reviewed-on: https://go-review.googlesource.com/47833
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-07-07 21:22:31 +00:00
Austin Clements
7a8f39fa14 runtime: delay before osRelaxing
Currently, sysmon relaxes the Windows timer resolution as soon as the
Go process becomes idle. However, if it's going idle because of a
short sleep (< 15.6 ms), this can turn that short sleep into a long
sleep (15.6 ms).

To address this, wait for 60 ms of idleness before relaxing the timer
resolution. It would be better to check the time until the next wakeup
and relax immediately if it makes sense, but there's currently no
interaction between sysmon and the timer subsystem, so adding this
simple delay is a much simpler and safer change for late in the
release cycle.

Fixes #20937.

Change-Id: I817db24c3bdfa06dba04b7bc197cfd554363c379
Reviewed-on: https://go-review.googlesource.com/47832
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-07-07 21:02:40 +00:00
Austin Clements
9745e88b22 runtime: use rwmutex for execLock
Currently the execLock is a mutex, which has the unfortunate
side-effect of serializing all thread creation. This replaces it with
an rwmutex so threads can be created in parallel, but exec still
blocks thread creation.

Fixes #20738.

Change-Id: Ia8f30a92053c3d28af460b0da71176abe5fd074b
Reviewed-on: https://go-review.googlesource.com/47072
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-06-28 22:08:59 +00:00