This commit moves the isSelect bool below the ticket uint32. The
boolean was consuming 8 bytes of the struct. The uint32 was also
consuming 8 bytes, so we can pack isSelect below the uint32 and save 8
bytes. This reduces the sudog struct from 96 bytes to 88 bytes.
Change-Id: If555cdaf2f5eaa125e2590fc4d113dbc99750738
GitHub-Last-Rev: d63b4e086b17da74e185046dfecb12d58e4f19ac
GitHub-Pull-Request: golang/go#36552
Reviewed-on: https://go-review.googlesource.com/c/go/+/214677
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.
Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.
Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.
See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.
I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.
I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).
Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.
For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.
Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.
To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.
Fixes#38029
Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This change adds a small microbenchmark for (*mspan).countAlloc, which
we're about to replace. Admittedly this isn't a critical piece of code,
but the benchmark was useful in understanding the performance change.
Change-Id: Iea93c00f571ee95534a42f2ef2ab026b382242b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/224438
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If typehash (used by reflect) does not match the built-in map's hash,
then problems occur. If a map is built using reflect, and then
assigned to a variable of map type, the hash function can change. That
causes very bad things.
This issue is rare. MapOf consults a cache of all types that occur in
the binary before making a new one. To make a true new map type (with
a hash function derived from typehash) that map type must not occur in
the binary anywhere. But to cause the bug, we need a variable of that
type in order to assign to it. The only way to make that work is to
use a named map type for the variable, so it is distinct from the
unnamed version that MapOf looks for.
Fixes#37716
Change-Id: I3537bfceca8cbfa1af84202f432f3c06953fe0ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/222357
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change formalizes an assumption made by the page allocator, which
is that (*pageAlloc).searchAddr should never refer to memory that is not
represented by (*pageAlloc).inUse. The portion of address space covered
by (*pageAlloc).inUse reflects the parts of the summary arrays which are
guaranteed to mapped, and so looking at any summary which is not
reflected there may cause a segfault.
In fact, this can happen today. This change thus also removes a
micro-optimization which is the only case which may cause
(*pageAlloc).searchAddr to point outside of any region covered by
(*pageAlloc).inUse, and adds a test verifying that the current segfault
can no longer occur.
Change-Id: I98b534f0ffba8656d3bd6d782f6fc22549ddf1c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/216697
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
In the previous CL we ensures that memmove writes pointers
atomically, so the concurrent GC won't observe a partially
updated pointer. This CL adds a test.
Change-Id: Icd1124bf3a15ef25bac20c7fb8933f1a642d897c
Reviewed-on: https://go-review.googlesource.com/c/go/+/212627
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Prior to this change, if the heap was very discontiguous (such as in
TestArenaCollision) it's possible we could map a large amount of memory
as R/W and commit it. We would use only the start and end to track what
should be mapped, and we would extend that mapping as needed to
accomodate a potentially fragmented address space.
After this change, we only map exactly the part of the summary arrays
that we need by using the inUse ranges from the previous change. This
reduces the GCSys footprint of TestArenaCollision from 300 MiB to 18
MiB.
Because summaries are no longer mapped contiguously, this means the
scavenger can no longer iterate directly. This change also updates the
scavenger to borrow ranges out of inUse and iterate over only the
parts of the heap which are actually currently in use. This is both an
optimization and necessary for correctness.
Fixes#35514.
Change-Id: I96bf0c73ed0d2d89a00202ece7b9d089a53bac90
Reviewed-on: https://go-review.googlesource.com/c/go/+/207758
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds a new inUse field to the allocator which tracks ranges
of addresses that are owned by the heap. It is updated on each heap
growth.
These ranges are tracked in an array which is kept sorted. In practice
this array shouldn't exceed its initial allocation except in rare cases
and thus should be small (ideally exactly 1 element in size).
In a hypothetical worst-case scenario wherein we have a 1 TiB heap and 4
MiB arenas (note that the address ranges will never be at a smaller
granularity than an arena, since arenas are always allocated
contiguously), inUse would use at most 4 MiB of memory if the heap
mappings were completely discontiguous (highly unlikely) with an
additional 2 MiB leaked from previous allocations. Furthermore, the
copies that are done to keep the inUse array sorted will copy at most 4
MiB of memory in such a scenario, which, assuming a conservative copying
rate of 5 GiB/s, amounts to about 800µs.
However, note that in practice:
1) Most 64-bit platforms have 64 MiB arenas.
2) The copies should incur little-to-no page faults, meaning a copy rate
closer to 25-50 GiB/s is expected.
3) Go heaps are almost always mostly contiguous.
Updates #35514.
Change-Id: I3ad07f1c2b5b9340acf59ecc3b9ae09e884814fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/207757
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
This will be used to parse the Linux kernel versions, but this code is
generic and can be tested on its own.
For #35777.
Change-Id: If1df48d07250e5855dde45bc9d57c66f777b9fb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/209597
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently the page allocator bitmap is implemented as a single giant
memory mapping which is reserved at init time and committed as needed.
This causes problems on systems that don't handle large uncommitted
mappings well, or institute low virtual address space defaults as a
memory limiting mechanism.
This change modifies the implementation of the page allocator bitmap
away from a directly-mapped set of bytes to a sparse array in same vein
as mheap.arenas. This will hurt performance a little but the biggest
gains are from the lockless allocation possible with the page allocator,
so the impact of this extra layer of indirection should be minimal.
In fact, this is exactly what we see:
https://perf.golang.org/search?q=upload:20191125.5
This reduces the amount of mapped (PROT_NONE) memory needed on systems
with 48-bit address spaces to ~600 MiB down from almost 9 GiB. The bulk
of this remaining memory is used by the summaries.
Go processes with 32-bit address spaces now always commit to 128 KiB of
memory for the bitmap. Previously it would only commit the pages in the
bitmap which represented the range of addresses (lowest address to
highest address, even if there are unused regions in that range) used by
the heap.
Updates #35568.
Updates #35451.
Change-Id: I0ff10380156568642b80c366001eefd0a4e6c762
Reviewed-on: https://go-review.googlesource.com/c/go/+/207497
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
On AIX, addresses returned by mmap are between 0x0a00000000000000
and 0x0afffffffffffff. The previous solution to handle these large
addresses was to increase the arena size up to 60 bits addresses,
cf CL 138736.
However, with the new page allocator, the 60bit heap addresses are
causing huge memory allocations, especially by (s *pageAlloc).init. mmap
and munmap syscalls dealing with these allocations are reducing
performances of every Go programs.
In order to avoid these allocations, arenaBaseOffset is set to
0x0a00000000000000 and heap addresses are on 48bit, as others operating
systems.
Updates: #35451
Change-Id: Ice916b8578f76703428ec12a82024147a7592bc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/206841
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
This change makes the test addresses start at 1 GiB instead of 2 GiB to
support mips and mipsle, which only have 31-bit address spaces.
It also changes some tests to use smaller offsets for the chunk index to
avoid jumping too far ahead in the address space to support 31-bit
address spaces. The tests don't require such large jumps for what
they're testing anyway.
Updates #35112.
Fixes#35440.
Change-Id: Ic68ff2b0a1f10ef37ac00d4bb5b910ddcdc76f2e
Reviewed-on: https://go-review.googlesource.com/c/go/+/205938
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.
The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.
Fixes#33747
The original work was done in CL 200577 by Carlo Alberto Ferraris. The
change was reverted in CL 205817 because it broke the linux-arm64-packet
and solaris-amd64-oraclerel builders.
Change-Id: I76d79b1d63fd206ed1c57fe6900cb7ae9e4d46cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/206180
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
CL 201765 activated calls from the runtime to functions in math/bits.
When coverage and race detection were simultaneously enabled,
this caused a crash when the covered+race-checked code in
math/bits was called from the runtime before there was even a P.
PS Win for gdlv in helping sort this out.
TODO - next CL intrinsifies the new functions in
runtime/internal/sys
TODO/Would-be-nice - Ctz64 and TrailingZeros64 are the same
function; 386.s is intrinsified; clean all that up.
Fixes#35461.
Updates #35112.
Change-Id: I750a54dba493130ad3e68a06530ede7687d41e1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/206199
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This change adds a per-p free page cache which the page allocator may
allocate out of without a lock. The change also introduces a completely
lockless page allocator fast path.
Although the cache contains at most 64 pages (and usually less), the
vast majority (85%+) of page allocations are exactly 1 page in size.
Updates #35112.
Change-Id: I170bf0a9375873e7e3230845eb1df7e5cf741b78
Reviewed-on: https://go-review.googlesource.com/c/go/+/195701
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
This change adds a page cache structure which owns a chunk of free pages
at a given base address. It also adds code to allocate to this cache
from the page allocator. Finally, it adds tests for both.
Notably this change does not yet integrate the code into the runtime,
just into runtime tests.
Updates #35112.
Change-Id: Ibe121498d5c3be40390fab58a3816295601670df
Reviewed-on: https://go-review.googlesource.com/c/go/+/196643
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change removes the old page allocator from the runtime.
Updates #35112.
Change-Id: Ib20e1c030f869b6318cd6f4288a9befdbae1b771
Reviewed-on: https://go-review.googlesource.com/c/go/+/195700
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change integrates all the bits and pieces of the new page allocator
into the runtime, behind a global constant.
Updates #35112.
Change-Id: I6696bde7bab098a498ab37ed2a2caad2a05d30ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/201764
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds a "locked" parameter to scavenge() and scavengeone()
which allows these methods to be run with the heap lock acquired, and
synchronously with respect to others which acquire the heap lock.
This mode is necessary for both heap-growth scavenging (multiple
asynchronous scavengers here could be problematic) and
debug.FreeOSMemory.
Updates #35112.
Change-Id: I24eea8e40f971760999c980981893676b4c9b666
Reviewed-on: https://go-review.googlesource.com/c/go/+/195699
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
This change makes it so that the new page allocator returns the number
of pages that are scavenged in a new allocation so that mheap can update
memstats appropriately.
The accounting could be embedded into pageAlloc, but that would make
the new allocator more difficult to test.
Updates #35112.
Change-Id: I0f94f563d7af2458e6d534f589d2e7dd6af26d12
Reviewed-on: https://go-review.googlesource.com/c/go/+/195698
Reviewed-by: Austin Clements <austin@google.com>
This change adds a scavenger for the new page allocator along with
tests. The scavenger walks over the heap backwards once per GC, looking
for memory to scavenge. It walks across the heap without any lock held,
searching optimistically. If it finds what appears to be a scavenging
candidate it acquires the heap lock and attempts to verify it. Upon
verification it then scavenges.
Notably, unlike the old scavenger, it doesn't show any preference for
huge pages and instead follows a more strict last-page-first policy.
Updates #35112.
Change-Id: I0621ef73c999a471843eab2d1307ae5679dd18d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/195697
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds a new bitmap-based allocator to the runtime with tests.
It does not yet integrate the page allocator into the runtime and thus
this change is almost purely additive.
Updates #35112.
Change-Id: Ic3d024c28abee8be8797d3918116a80f901cc2bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/190622
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds the concept of summaries and of summarizing a set of
pallocBits, a core concept in the new page allocator. These summaries
are really just three integers packed into a uint64. This change also
adds tests and a benchmark for generating these summaries.
Updates #35112.
Change-Id: I69686316086c820c792b7a54235859c2105e5fee
Reviewed-on: https://go-review.googlesource.com/c/go/+/190621
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change adds a per-chunk bitmap for page allocation called
pallocBits with algorithms for allocating and freeing pages out of the
bitmap. This change also adds tests for pallocBits, but does not yet
integrate it into the runtime.
Updates #35112.
Change-Id: I479006ed9f1609c80eedfff0580d5426b064b0ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/190620
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.
The net effect is that, when a sync.Mutex is contented, the code in
the critical section guarded by the Mutex gets a priority boost.
Fixes#33747
Change-Id: I9967f0f763c25504010651bdd7f944ee0189cd45
Reviewed-on: https://go-review.googlesource.com/c/go/+/200577
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This adds a test of preempting a loop containing no synchronous safe
points for STW and stack scanning.
We couldn't add this test earlier because it requires scheduler, STW,
and stack scanning preemption to all be working.
For #10958, #24543.
Change-Id: I73292db78ca3d14aab11bdafd26d03986920ef0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/201777
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
When everything is working correctly, any pointer the garbage
collector encounters can only point into a fully initialized heap
span, since the span must have been initialized before that pointer
could escape the heap allocator and become visible to the GC.
However, in various cases, we try to be defensive against bad
pointers. In findObject, this is just a sanity check: we never expect
to find a bad pointer, but programming errors can lead to them. In
spanOfHeap, we don't necessarily trust the pointer and we're trying to
check if it really does point to the heap, though it should always
point to something. Conservative scanning takes this to a new level,
since it can only guess that a word may be a pointer and verify this.
In all of these cases, we have a problem that the span lookup and
check can race with span initialization, since the span becomes
visible to lookups before it's fully initialized.
Furthermore, we're about to start initializing the span without the
heap lock held, which is going to introduce races where accesses were
previously protected by the heap lock.
To address this, this CL makes accesses to mspan.state atomic, and
ensures that the span is fully initialized before setting the state to
mSpanInUse. All loads are now atomic, and in any case where we don't
trust the pointer, it first atomically loads the span state and checks
that it's mSpanInUse, after which it will have synchronized with span
initialization and can safely check the other span fields.
For #10958, #24543, but a good fix in general.
Change-Id: I518b7c63555b02064b98aa5f802c92b758fef853
Reviewed-on: https://go-review.googlesource.com/c/go/+/203286
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Netpoll must be always be initialized when TestNetpollBreak is launched.
However, when it is run in standalone, it won't be the case, so it must
be forced.
Updates: #27707
Change-Id: I28147f3834f3d6aca982c6a298feadc09b55f66e
Reviewed-on: https://go-review.googlesource.com/c/go/+/204058
Run-TryBot: Clément Chigot <clement.chigot%atos.net@gtempaccount.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Fixes#35053
Change-Id: I31853d434610880044c169e0c1e9732f97ff1bdb
Reviewed-on: https://go-review.googlesource.com/c/go/+/202444
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
The new netpollBreak function can be used to interrupt a blocking netpoll.
This function is not currently used; it will be used by later CLs.
Updates #27707
Change-Id: I5cb936609ba13c3c127ea1368a49194fc58c9f4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/171824
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Currently, we map and grow the heap a whole arena (64MB) at a time.
Unfortunately, in order to fix#32828, we need to switch from
scavenging inline with allocation back to scavenging on heap growth,
but heap-growth scavenging happens in large jumps because we grow the
heap in large jumps.
In order to prepare for better heap-growth scavenging, this CL
separates mapping more space for the heap from actually "growing" it
(tracking the new space with spans). Instead, growing the heap keeps
track of the "current arena" it's growing into. It track that with new
spans as needed, and only maps more arena space when the current arena
is inadequate. The effect to the user is the same, but this will let
us scavenge on much smaller increments of heap growth.
There are two slightly subtleties to this change:
1. If an allocation requires mapping a new arena and that new arena
isn't contiguous with the current arena, we don't want to lose the
unused space in the current arena, so we have to immediately track
that with a span.
2. The mapped space must be accounted as released and idle, even
though it isn't actually tracked in a span.
For #32828, since this makes heap-growth scavenging far more
effective, especially at small heap sizes. For example, this change is
necessary for TestPhysicalMemoryUtilization to pass once we remove
inline scavenging.
Change-Id: I300e74a0534062467e4ce91cdc3508e5ef9aa73a
Reviewed-on: https://go-review.googlesource.com/c/go/+/189957
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
With gccgo, if we generate getg inlined, the backend may cache
the address of the TLS variable, which will become invalid after
a thread switch.
Currently there is no known bug for this. But if we didn't
implement this carefully, we may get subtle bugs. This CL adds a
test that will fail loudly if this is wrong. (See also
https://go.googlesource.com/gofrontend/+/refs/heads/master/libgo/runtime/proc.c#333
and an incorrect attempt CL 185337.)
Note: at least on Linux/AMD64, even with an incorrect
implementation, this only fails if the test is compiled with
-fPIC, which is not the default setting for gccgo test suite. So
some manual work is needed. Maybe we could extend the test suite
to run the runtime test with more settings (e.g. PIC and static).
Change-Id: I459a3b4c31f09b9785c0eca19b7756f80e8ef54c
Reviewed-on: https://go-review.googlesource.com/c/go/+/186357
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently there's an invariant in the runtime wherein the heap lock
can only be acquired on the system stack, otherwise a self-deadlock
could occur if the stack grows while the lock is held.
This invariant is upheld and documented in a number of situations (e.g.
allocManual, freeManual) but there are other places where the invariant
is either not maintained at all which risks self-deadlock (e.g.
setGCPercent, gcResetMarkState, allocmcache) or is maintained but
undocumented (e.g. gcSweep, readGCStats_m).
This change adds go:systemstack to any function that acquires the heap
lock or adds a systemstack(func() { ... }) around the critical section,
where appropriate. It also documents the invariant on (*mheap).lock
directly and updates repetitive documentation to refer to that comment.
Fixes#32105.
Change-Id: I702b1290709c118b837389c78efde25c51a2cafb
Reviewed-on: https://go-review.googlesource.com/c/go/+/177857
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
This change tracks the number of potential free and unscavenged huge
pages which will be used to inform the rate at which scavenging should
occur.
For #30333.
Change-Id: I47663e5ffb64cac44ffa10db158486783f707479
Reviewed-on: https://go-review.googlesource.com/c/go/+/170860
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change adds two new treap iteration types: one for large
unscavenged spans (contain at least one huge page) and one for small
unscavenged spans. This allows us to scavenge the huge spans first by
first iterating over the large ones, then the small ones.
Also, since we now depend on physHugePageSize being a power of two,
ensure that that's the case when it's retrieved from the OS.
For #30333.
Change-Id: I51662740205ad5e4905404a0856f5f2b2d2a5680
Reviewed-on: https://go-review.googlesource.com/c/go/+/174399
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change introduces a treapIterFilter type which represents the
power set of states described by a treapIterType.
This change then adds a treapIterFilter field to each treap node
indicating the types of spans that live in that subtree. The field is
maintained via the same mechanism used to maintain maxPages. This allows
pred, succ, start, and end to be judicious about which subtrees it will
visit, ensuring that iteration avoids traversing irrelevant territory.
Without this change, repeated scavenging attempts can end up being N^2
as the scavenger walks over what it already scavenged before finding new
spans available for scavenging.
Finally, this change also only scavenges a span once it is removed from
the treap. There was always an invariant that spans owned by the treap
may not be mutated in-place, but with this change violating that
invariant can cause issues with scavenging.
For #30333.
Change-Id: I8040b997e21c94a8d3d9c8c6accfe23cebe0c3d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/174878
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change modifies the treap implementation to support holding all
spans in a single treap, instead of keeping them all in separate treaps.
This improves ergonomics for nearly all treap-related callsites.
With that said, iteration is now more expensive, but it never occurs on
the fast path, only on scavenging-related paths.
This change opens up the opportunity for further optimizations, such as
splitting spans without treap removal (taking treap removal off the span
allocator's critical path) as well as improvements to treap iteration
(building linked lists for each iteration type and managing them on
insert/removal, since those operations should be less frequent).
For #30333.
Change-Id: I3dac97afd3682a37fda09ae8656a770e1369d0a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/174398
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change modifies the treap implementation to be address-ordered
instead of size-ordered, and further augments it so it may be used for
allocation. It then modifies the find method to implement a first-fit
allocation policy.
This change to the treap implementation consequently makes it so that
spans are scavenged in highest-address-first order without any
additional changes to the scavenging code. Because the treap itself is
now address ordered, and the scavenging code iterates over it in
reverse, the highest address is now chosen instead of the largest span.
This change also renames the now wrongly-named "scavengeLargest" method
on mheap to just "scavengeLocked" and also fixes up logic in that method
which made assumptions about size.
For #30333.
Change-Id: I94b6f3209211cc1bfdc8cdaea04152a232cfbbb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/164101
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This change exports the runtime mTreap in export_test.go and then adds a
series of tests which check that the invariants of the treap are
maintained under different operations. These tests also include tests
for the treap iterator type.
Also, we note that the find() operation on the treap never actually was
best-fit, so the tests just ensure that it returns an appropriately
sized span.
For #30333.
Change-Id: If81f7c746dda6677ebca925cb0a940134701b894
Reviewed-on: https://go-review.googlesource.com/c/go/+/164100
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
A goroutine should be preempted if it runs for 10ms without blocking.
We found that this doesn't work for goroutines which call short system calls.
For example, the next program can stuck for seconds without this fix:
$ cat main.go
package main
import (
"runtime"
"syscall"
)
func main() {
runtime.GOMAXPROCS(1)
c := make(chan int)
go func() {
c <- 1
for {
t := syscall.Timespec{
Nsec: 300,
}
if true {
syscall.Nanosleep(&t, nil)
}
}
}()
<-c
}
$ time go run main.go
real 0m8.796s
user 0m0.367s
sys 0m0.893s
Updates #10958
Change-Id: Id3be54d3779cc28bfc8b33fe578f13778f1ae2a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/170138
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This change makes mTreap's iterator type, treapIter, bidirectional
instead of unidirectional. This change helps support moving the find
operation on a treap to return an iterator instead of a treapNode, in
order to hide the details of the treap when accessing elements.
For #28479.
Change-Id: I5dbea4fd4fb9bede6e81bfd089f2368886f98943
Reviewed-on: https://go-review.googlesource.com/c/156918
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This change adds the treapIter type which provides an iterator
abstraction for walking over an mTreap. In particular, the mTreap type
now has iter() and rev() for iterating both forwards (smallest to
largest) and backwards (largest to smallest). It also has an erase()
method for erasing elements at the iterator's current position.
For #28479.
While the expectation is that this change will slow down Go programs,
the impact on Go1 and Garbage is negligible.
Go1: https://perf.golang.org/search?q=upload:20181214.6
Garbage: https://perf.golang.org/search?q=upload:20181214.11
Change-Id: I60dbebbbe73cbbe7b78d45d2093cec12cc0bc649
Reviewed-on: https://go-review.googlesource.com/c/151537
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
When we delete an element, and it was the last element in the bucket,
update the slots between the new last element and the old last element
with the marker that says "no more elements beyond here".
Change-Id: I8efeeddf4c9b9fc491c678f84220a5a5094c9c76
Reviewed-on: https://go-review.googlesource.com/c/142438
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This change extends the test function ReadMemStatsSlow to re-compute
the HeapReleased statistic such that it is checked in testing to be
consistent with the bookkeeping done in the runtime.
Change-Id: I49f5c2620f5731edea8e9f768744cf997dcd7c22
Reviewed-on: https://go-review.googlesource.com/c/142397
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Currently, if the runtime overflows the g0 stack on Windows, it leads
to an infinite recursion:
1. Something overflows the g0 stack bounds and calls morestack.
2. morestack determines it's on the g0 stack and hence cannot grow the
stack, so it calls badmorestackg0 (which prints "fatal: morestack on
g0") followed by abort.
3. abort performs an INT $3, which turns into a Windows
_EXCEPTION_BREAKPOINT exception.
4. This enters the Windows sigtramp, which ensures we're on the g0
stack and calls exceptionhandler.
5. exceptionhandler has a stack check prologue, so it determines that
it's out of stack and calls morestack.
6. goto 2
Fix this by making the exception handler avoid stack checks until it
has ruled out an abort and by blowing away the stack bounds in
lastcontinuehandler before we print the final fatal traceback (which
itself involves a lot of stack bounds checks).
Fixes#21382.
Change-Id: Ie66e91f708e18d131d97f22b43f9ac26f3aece5a
Reviewed-on: https://go-review.googlesource.com/120857
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
If the runtime code panics due to a bad index or slice expression,
then throw instead of panicing. This will skip calls to recover and dump
the entire runtime stack trace. The runtime should never panic due to
an out of bounds index, and this will help with debugging if it does.
For #24991
Updates #25201
Change-Id: I85a9feded8f0de914ee1558425931853223c0514
Reviewed-on: https://go-review.googlesource.com/121515
Reviewed-by: Austin Clements <austin@google.com>
This adds a mechanism for debuggers to safely inject calls to Go
functions on amd64. Debuggers must participate in a protocol with the
runtime, and need to know how to lay out a call frame, but the runtime
support takes care of the details of handling live pointers in
registers, stack growth, and detecting the trickier conditions when it
is unsafe to inject a user function call.
Fixes#21678.
Updates derekparker/delve#119.
Change-Id: I56d8ca67700f1f77e19d89e7fc92ab337b228834
Reviewed-on: https://go-review.googlesource.com/109699
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The hmap field in the maptype is only used by the runtime to check the sizes of
the hmap structure created by the compiler and runtime agree.
Comments are already present about the hmap structure definitions in the
compiler and runtime needing to be in sync.
Add a test that checks the runtimes hmap size is as expected to detect
when the compilers and runtimes hmap sizes diverge instead of checking
this at runtime when a map is created.
Change-Id: I974945ebfdb66883a896386a17bbcae62a18cf2a
Reviewed-on: https://go-review.googlesource.com/91796
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
getcallersp is intrinsified, and so the dummy arg is no longer
needed. Remove it, as well as a few dummy args that are solely
to feed getcallersp.
Change-Id: Ibb6c948ff9c56537042b380ac3be3a91b247aaa6
Reviewed-on: https://go-review.googlesource.com/109596
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>