77 Commits

Michael Anthony Knyszek
3eaf75c13a runtime: move next_gc and last_next_gc into gcControllerState
This change moves next_gc and last_next_gc into gcControllerState under
the names heapGoal and lastHeapGoal respectively. These are
fundamentally GC pacer related values, and so it makes sense for them to
live here.

Partially generated by

rf '
    ex . {
	memstats.next_gc -> gcController.heapGoal
	memstats.last_next_gc -> gcController.lastHeapGoal
    }
'

except for updates to comments and gcControllerState methods, where
they're accessed through the receiver, and trace-related renames of
NextGC -> HeapGoal, while we're here.

For #44167.

Change-Id: I1e871ad78a57b01be8d9f71bd662530c84853bed
Reviewed-on: https://go-review.googlesource.com/c/go/+/306603
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-14 14:03:30 +00:00
Michael Anthony Knyszek
f2d5bd1ad3 runtime: move internal GC statistics from memstats to gcController
This change moves certain important but internal-only GC statistics from
memstats into gcController. These statistics are mainly used in pacing
the GC, so it makes sense to keep them in the pacer's state.

This CL was mostly generated via

rf '
    ex . {
	memstats.gc_trigger -> gcController.trigger
	memstats.triggerRatio -> gcController.triggerRatio
	memstats.heap_marked -> gcController.heapMarked
	memstats.heap_live -> gcController.heapLive
	memstats.heap_scan -> gcController.heapScan
    }
'

except for a few special cases, like updating names in comments and when
these fields are used within gcControllerState methods (at which point
they're accessed through the receiver).

For #44167.

Change-Id: I6bd1602585aeeb80818ded24c07d8e6fec992b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/306598
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-13 23:42:29 +00:00
Michael Pratt
d85083911d runtime: encapsulate access to allgs
Correctly accessing allgs is a bit hairy. Some paths need to lock
allglock, some don't. Those that don't are safest using atomicAllG, but
usage is not consistent.

Rather than doing this ad-hoc, move all access* through forEachG /
forEachGRace, the locking and atomic versions, respectively. This will
make it easier to ensure safe access.

* markroot is the only exception, as it has a far-removed guarantee of
safe access via an atomic load of allglen far before actual use.
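
A minimal sketch of the two helpers as described (paraphrasing the CL;
the real versions live in runtime/proc.go):

    // forEachG calls fn on every G, holding allglock for a stable view.
    func forEachG(fn func(gp *g)) {
        lock(&allglock)
        for _, gp := range allgs {
            fn(gp)
        }
        unlock(&allglock)
    }

    // forEachGRace calls fn on every G without locking, using the
    // runtime's atomic snapshot of allgs. New Gs appended during the
    // iteration may be missed.
    func forEachGRace(fn func(gp *g)) {
        ptr, length := atomicAllG()
        for i := uintptr(0); i < length; i++ {
            fn(atomicAllGIndex(ptr, i))
        }
    }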

Change-Id: Ie1c7a8243e155ae2b4bc3143577380c695680e89
Reviewed-on: https://go-review.googlesource.com/c/go/+/279994
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-03-05 22:09:52 +00:00
Michael Pratt
e1faebe7b4 runtime: manage gcBgMarkWorkers with a global pool
Background mark workers perform per-P marking work. Currently each
worker is assigned a P at creation time. The worker "attaches" to the P
via p.gcBgMarkWorker, making itself (usually) available to
findRunnableGCWorker for scheduling GC work.

While running gcMarkDone, the worker "detaches" from the P (by clearing
p.gcBgMarkWorker), since it may park for other reasons and should not be
scheduled by findRunnableGCWorker.

Unfortunately, this design is complex and difficult to reason about. We
simplify things by changing the design to eliminate the hard P
attachment. Rather than workers always performing work from the same P,
workers perform work for whichever P they find themselves on. On park,
the workers are placed in a pool of free workers, which each P's
findRunnableGCWorker can use to run a worker for its P.

Now if a worker parks in gcMarkDone, a P may simply use another worker
from the pool to complete its own work.
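
A sketch of the pool, simplified from the CL (the real pool is a
lock-free stack of nodes pointing at parked worker goroutines; the mode
selection logic is elided):

    // gcBgMarkWorkerNode links a parked worker G into the free pool.
    type gcBgMarkWorkerNode struct {
        node lfnode   // lock-free stack linkage; must be first
        gp   guintptr // the parked gcBgMarkWorker goroutine
    }

    // Pool of free background mark workers, shared by all Ps.
    var gcBgMarkWorkerPool lfstack

    // findRunnableGCWorker, greatly simplified: pop any free worker,
    // communicate the mode via the P, and run the worker here.
    func findRunnableGCWorker(pp *p) *g {
        node := (*gcBgMarkWorkerNode)(gcBgMarkWorkerPool.pop())
        if node == nil {
            return nil // all workers busy, or parked in gcMarkDone
        }
        pp.gcMarkWorkerMode = gcMarkWorkerDedicatedMode // mode choice elided
        gp := node.gp.ptr()
        casgstatus(gp, _Gwaiting, _Grunnable)
        return gp
    }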

The P's GC worker mode is used to tell the selected worker which mode
to run in. It is also used to emit the appropriate worker EvGoStart
tracepoint. This is a slight change, as this G may be
preempted (e.g., in gcMarkDone). When it is rescheduled, the trace
viewer will show it as a normal goroutine again. It is currently a bit
difficult to connect to the original worker tracepoint, as the viewer
does not display the goid for the original worker (though the data is in
the trace file).

Change-Id: Id7bd3a364dc18a4d2b1c99c4dc4810fae1293c1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/262348
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-10-30 15:25:49 +00:00
Michael Anthony Knyszek
f5c6875f32 runtime: make next_gc atomically accessed
next_gc is mostly updated only during a STW, but may occasionally be
updated by calls to e.g. debug.SetGCPercent. In this case the update is
supposed to be protected by the heap lock, but in reality it's accessed
by gcController.revise which may be called without the heap lock held
(despite its documentation, which will be updated in a later change).

Change the synchronization policy on next_gc so that it's atomically
accessed when the world is not stopped to aid in making revise safe for
concurrent use.
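
A sketch of the resulting policy, with illustrative helper names (the
CL itself adjusts the readers and writers of memstats.next_gc in place,
using the runtime-internal atomic package):

    // Writers run during STW or with the heap lock held, but still
    // publish atomically because concurrent readers now exist.
    func setNextGC(goal uint64) {
        atomic.Store64(&memstats.next_gc, goal)
    }

    // Readers that may run while the world is running, such as
    // gcController.revise, must load atomically.
    func loadNextGC() uint64 {
        return atomic.Load64(&memstats.next_gc)
    }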

Change-Id: I79657a72f91563f3241aaeda66e8a7757d399529
Reviewed-on: https://go-review.googlesource.com/c/go/+/246962
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:25:54 +00:00
Michael Anthony Knyszek
c847589ad0 runtime: synchronize StartTrace and StopTrace with sysmon
Currently sysmon is not stopped when the world is stopped, since that
is in general a difficult thing to do. The result is that when
tracing starts and the value of trace.enabled changes, it's possible
for sysmon to fail to emit an event when it really should. This leads to
traces which the execution trace parser deems inconsistent.

Fix this by putting all of sysmon's work behind a new lock sysmonlock.
StartTrace and StopTrace both acquire this lock after stopping the world
but before performing any work in order to ensure sysmon sees the
required state change in tracing. This change is expected to slow down
StartTrace and StopTrace, but will help ensure consistent traces are
generated.
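
A sketch of the protocol (simplified; the lock here is assumed to hang
off sched as sched.sysmonlock):

    func sysmonWorkOnce() { // illustrative wrapper for one sysmon iteration
        lock(&sched.sysmonlock)
        // ... retake Ps, emit trace events, etc. ...
        unlock(&sched.sysmonlock)
    }

    func StartTrace() error { // StopTrace is symmetric
        stopTheWorld("start tracing")
        // Sysmon keeps running during STW; hold the lock so it observes
        // the trace.enabled transition before the world restarts.
        lock(&sched.sysmonlock)
        // ... install trace state, set trace.enabled = true ...
        unlock(&sched.sysmonlock)
        startTheWorld()
        return nil
    }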

Updates #29707.
Fixes #38794.

Change-Id: I64c58e7c3fd173cd5281ffc208d6db24ff6c0284
Reviewed-on: https://go-review.googlesource.com/c/go/+/234617
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-05-21 14:48:50 +00:00
Dan Scales
0a820007e7 runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT)
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.

Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.

Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.

See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.

I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.

I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).

Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.
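
A sketch of the check itself (simplified; lockPartialOrder is assumed
here to map each rank to the list of ranks allowed to be held before
it, and the held-locks bookkeeping is elided):

    // lockWithRank verifies that every lock this M already holds may
    // precede the new lock's rank, then acquires the underlying lock.
    func lockWithRank(l *mutex, rank lockRank) {
        gp := getg()
        for _, held := range gp.m.locksHeld[:gp.m.locksHeldLen] {
            if !rankAllowed(held.rank, rank) {
                throw("lock ordering problem")
            }
        }
        lock2(l) // the rank-unaware lock
        // record (l, rank) in gp.m.locksHeld (elided)
    }

    // rankAllowed reports whether a lock of rank prev may be held
    // while acquiring a lock of rank next.
    func rankAllowed(prev, next lockRank) bool {
        for _, r := range lockPartialOrder[next] {
            if r == prev {
                return true
            }
        }
        return false
    }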

For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.

Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.

To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.

Fixes #38029

Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-07 21:51:03 +00:00
Michael Anthony Knyszek
f1f947af28 runtime: don't hold worldsema across mark phase
This change makes it so that worldsema isn't held across the mark phase.
This means that various operations like ReadMemStats may now stop the
world during the mark phase, reducing latency on such operations.

Only three such operations are still disallowed during marking:
GOMAXPROCS, StartTrace, and StopTrace.

For the former it's because any change to GOMAXPROCS impacts GC mark
background worker scheduling and the details there are tricky.

For the latter two it's because tracing needs to observe consistent GC
start and GC end events, and if StartTrace or StopTrace may stop the
world during marking, then it's possible for it to see a GC end event
without a start or GC start event without an end, respectively.

To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until
marking is complete, the runtime now holds a new semaphore, gcsema,
across the mark phase just like it used to with worldsema.
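
A sketch of the handoff, heavily simplified from the GC cycle code:

    func gcStart() {
        semacquire(&worldsema)
        semacquire(&gcsema) // held until mark termination completes
        // ... stop the world, set up the cycle, restart the world ...
        semrelease(&worldsema) // ReadMemStats etc. may now STW during marking
    }

    func gcMarkTermination() {
        // ... stop the world (reacquiring worldsema), finish the cycle ...
        semrelease(&gcsema)
        semrelease(&worldsema)
    }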

This change is being landed once more after being reverted in the Go
1.14 release cycle, since CL 215157 allows it to have a positive
effect on system performance.

For the benchmark BenchmarkReadMemStatsLatency in the runtime, which
measures ReadMemStats latencies while the GC is exercised, the tail of
these latencies reduced dramatically on an 8-core machine:

name                   old 50%tile-ns  new 50%tile-ns  delta
ReadMemStatsLatency-8      4.40M ±74%      0.12M ± 2%  -97.35%  (p=0.008 n=5+5)

name                   old 90%tile-ns  new 90%tile-ns  delta
ReadMemStatsLatency-8       102M ± 6%         0M ±14%  -99.79%  (p=0.008 n=5+5)

name                   old 99%tile-ns  new 99%tile-ns  delta
ReadMemStatsLatency-8       147M ±18%         4M ±57%  -97.43%  (p=0.008 n=5+5)

Fixes #19812.

Change-Id: If66c3c97d171524ae29f0e7af4bd33509d9fd0bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/216557
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-03-18 19:13:50 +00:00
Michael Knyszek
64c22b70bf Revert "runtime: don't hold worldsema across mark phase"
This reverts commit 7b294cdd8df0a9523010f6ffc80c59e64578f34b, CL 182657.

Reason for revert: This change may be causing latency problems
for applications which call ReadMemStats, because it may cause
all goroutines to stop until the GC completes.

https://golang.org/cl/215157 fixes this problem, but it's too
late in the cycle to land that.

Updates #19812.

Change-Id: Iaa26f4dec9b06b9db2a771a44e45f58d0aa8f26d
Reviewed-on: https://go-review.googlesource.com/c/go/+/216358
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-24 23:27:33 +00:00
Ian Lance Taylor
580337e268 runtime, time: remove old timer code
Updates #6239
Updates #27707

Change-Id: I65e6471829c9de4677d3ac78ef6cd7aa0a1fc4cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/171884
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-11-19 15:30:58 +00:00
Michael Anthony Knyszek
383b447e0d runtime: clean up power-of-two rounding code with align functions
This change renames the "round" function to the more appropriately named
"alignUp" which rounds an integer up to the next multiple of a power of
two.

This change also adds the alignDown function, which is almost like
alignUp but rounds down to the previous multiple of a power of two.

With these two functions, we also replace manual rounding code with
them where we can.
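
The two helpers are the standard power-of-two bit trick (bodies shown
here for reference; a must be a power of 2):

    // alignUp rounds n up to a multiple of a.
    func alignUp(n, a uintptr) uintptr {
        return (n + a - 1) &^ (a - 1)
    }

    // alignDown rounds n down to a multiple of a.
    func alignDown(n, a uintptr) uintptr {
        return n &^ (a - 1)
    }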

Change-Id: Ie1487366280484dcb2662972b01b4f7135f72fec
Reviewed-on: https://go-review.googlesource.com/c/go/+/190618
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-11-04 23:41:34 +00:00
Brad Fitzpatrick
03ef105dae all: remove nacl (part 3, more amd64p32)
Part 1: CL 199499 (GOOS nacl)
Part 2: CL 200077 (amd64p32 files, toolchain)
Part 3: stuff that arguably should've been part of Part 2, but I forgot
        one of my grep patterns when splitting the original CL up into
        two parts.

This one might also have interesting stuff to resurrect for any future
x32 ABI support.

Updates #30439

Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-10 22:38:38 +00:00
Michael Anthony Knyszek
7b294cdd8d runtime: don't hold worldsema across mark phase
This change makes it so that worldsema isn't held across the mark phase.
This means that various operations like ReadMemStats may now stop the
world during the mark phase, reducing latency on such operations.

Only three such operations are still disallowed during marking:
GOMAXPROCS, StartTrace, and StopTrace.

For the former it's because any change to GOMAXPROCS impacts GC mark
background worker scheduling and the details there are tricky.

For the latter two it's because tracing needs to observe consistent GC
start and GC end events, and if StartTrace or StopTrace may stop the
world during marking, then it's possible for it to see a GC end event
without a start or GC start event without an end, respectively.

To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until
marking is complete, the runtime now holds a new semaphore, gcsema,
across the mark phase just like it used to with worldsema.

Fixes #19812.

Change-Id: I15d43ed184f711b3d104e8f267fb86e335f86bf9
Reviewed-on: https://go-review.googlesource.com/c/go/+/182657
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-09-04 15:53:59 +00:00
Iskander Sharipov
fd7d3259c9 runtime: remove redundant explicit deref in trace.go
Replaces legacy Go syntax for pointer struct member access with the
more modern auto-deref alternative.

Found using https://go-critic.github.io/overview#underef-ref

Change-Id: I71a3c424126c4ff5d89f9e4bacb6cc01c6fa2ddf
Reviewed-on: https://go-review.googlesource.com/122895
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 19:48:50 +00:00
Iskander Sharipov
fa6639d626 runtime: simplify slice expression to sliced value itself
Replace `x[:]` where x is a slice with just `x`.

Found using https://go-critic.github.io/overview.html#unslice-ref

Change-Id: Ib0ee16e1d49b2a875b6b92a770049acc33208362
Reviewed-on: https://go-review.googlesource.com/123375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 19:48:18 +00:00
Josh Bleecher Snyder
4d7cf3fedb runtime: convert g.waitreason from string to uint8
Every time I poke at #14921, the g.waitreason string
pointer writes show up.

They're not particularly important performance-wise,
but it'd be nice to clear the noise away.

And it does open up a few extra bytes in the g struct
for some future use.

This is a re-roll of CL 99078, which was rolled
back because of failures on s390x.
Those failures were apparently due to an old version of gdb.
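
A sketch of the new representation (constant names illustrative): the
waitreason becomes a one-byte enum with a string table, so setting it
writes a single byte instead of a two-word string header.

    type waitReason uint8

    const (
        waitReasonZero            waitReason = iota // ""
        waitReasonGCAssistMarking                   // "GC assist marking"
        waitReasonIOWait                            // "IO wait"
        // ...
    )

    var waitReasonStrings = [...]string{
        waitReasonZero:            "",
        waitReasonGCAssistMarking: "GC assist marking",
        waitReasonIOWait:          "IO wait",
    }

    func (w waitReason) String() string {
        if w >= waitReason(len(waitReasonStrings)) {
            return "unknown wait reason"
        }
        return waitReasonStrings[w]
    }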

Change-Id: Icc2c12f449b2934063fd61e272e06237625ed589
Reviewed-on: https://go-review.googlesource.com/111256
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-05-03 17:04:22 +00:00
Hana Kim
c2d1024368 runtime/trace: rename "Span" to "Region"
"Span" is a commonly used term in many distributed tracing systems
(Dapper, OpenCensus, OpenTracing, ...). They use it to refer to a
period of time, not necessarily tied into execution of underlying
processor, thread, or goroutine, unlike the "Span" of runtime/trace
package.

Since distributed tracing and Go runtime execution tracing are
already similar enough to cause confusion, this CL attempts to avoid
using the same word if possible.

"Region" is being used in a certain tracing system to refer to a code
region which is pretty close to what runtime/trace.Span currently
refers to. So, replace that.
https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions

This CL also tweaks APIs a bit based on jbd and heschi's comments:

  NewContext -> NewTask
    and it now returns a Task object that exports End method.

  StartSpan -> StartRegion
    and it now returns a Region object that exports End method.

Also, WithSpan was changed to WithRegion, and it now takes a func()
with no context. Another thought is to get rid of WithRegion entirely.
It is a nice concept, but in practice it seems problematic (a lot of
code churn, and it pollutes the stack trace). The tracing concept is
already very low level, and we hope this API will be used with great
care.

Recommended usage will be
   defer trace.StartRegion(ctx, "someRegion").End()
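
For context, a fuller self-contained example of the renamed API as it
shipped (Go 1.11):

    package main

    import (
        "context"
        "runtime/trace"
    )

    func makeCappuccino(ctx context.Context) {
        // A Task groups related regions, possibly across goroutines.
        ctx, task := trace.NewTask(ctx, "makeCappuccino")
        defer task.End()

        // WithRegion runs the function inside a region.
        trace.WithRegion(ctx, "steamMilk", func() { /* steam milk */ })

        // StartRegion/End bracket a region explicitly.
        defer trace.StartRegion(ctx, "brewCoffee").End()
    }

    func main() {
        makeCappuccino(context.Background())
    }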

The old APIs are left untouched in this CL. Once their uses are
cleaned up, they will be removed in a separate CL.

Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a
Reviewed-on: https://go-review.googlesource.com/108296
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: JBD <jbd@google.com>
2018-04-24 16:33:15 +00:00
Josh Bleecher Snyder
6cb064c9c4 Revert "runtime: convert g.waitreason from string to uint8"
This reverts commit 4eea887fd477368653f6fcf8ad766030167936e5.

Reason for revert: broke s390x build

Change-Id: Id6c2b6a7319273c4d21f613d4cdd38b00d49f847
Reviewed-on: https://go-review.googlesource.com/100375
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-13 15:21:21 +00:00
Josh Bleecher Snyder
4eea887fd4 runtime: convert g.waitreason from string to uint8
Every time I poke at #14921, the g.waitreason string
pointer writes show up.

They're not particularly important performance-wise,
but it'd be nice to clear the noise away.

And it does open up a few extra bytes in the g struct
for some future use.

Change-Id: I7ffbd52fbc2a286931a2218038fda52ed6473cc9
Reviewed-on: https://go-review.googlesource.com/99078
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-12 21:56:50 +00:00
Hana Kim
6977a3b257 runtime/trace: implement annotation API
This implements the annotation API proposed in golang.org/cl/63274.

traceString is updated to protect the string map with trace.stringsLock,
because the assumption that traceString is called by a single goroutine
(either at the beginning of tracing or at the end, when dumping all the
symbols and function names) is no longer true.

After this change, traceString is used by the annotation APIs
(NewContext, StartSpan, Log) to register frequently appearing strings
(task and span names, and log keys).

NewContext -> one or two records (EvString, EvUserTaskCreate)
end function -> one record (EvUserTaskEnd)
StartSpan -> one or two records (EvString, EvUserSpan)
span end function -> one or two records (EvString, EvUserSpan)
Log -> one or two records (EvString, EvUserLog)

The EvUserLog record uses the typical record format written by
traceEvent, except that it is followed by the bytes of the value string.

In addition to runtime/trace change, this change includes
corresponding changes in internal/trace to parse the new record types.

Future work to improve efficiency:
  More efficient unique task id generation instead of an atomic
  (e.g. a per-P counter).
  Instead of a centralized trace.stringsLock, consider using a per-P
  string cache or something more efficient.

R=go1.11

Change-Id: Iec9276c6c51e5be441ccd52dec270f1e3b153970
Reviewed-on: https://go-review.googlesource.com/71690
Reviewed-by: Austin Clements <austin@google.com>
2018-02-15 18:54:14 +00:00
Hana Kim
32d1cd33c7 runtime/trace: user annotation API
This CL presents the proposed user annotation API skeleton.
This CL bumps up the trace version to 1.11.

Design doc https://goo.gl/iqJfJ3

Implementation CLs are followed.

The API introduces three basic building blocks: Log, Span, and Task.

Log is for basic logging. When called, the message will be recorded
to the trace along with timestamp, goroutine id, and stack info.

   trace.Log(ctx, messageType, message)

Span can be thought of as an extension of Log, recording an
interesting time interval during a goroutine's execution. A span is
local to a goroutine by definition.

   trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context) {
      /* do something very expensive */
   })

Task is a higher-level concept that aids tracing of complex operations
that encompass multiple goroutines or are asynchronous.
For example, an RPC request, an HTTP request, a file write, or a
batch job can be traced with a Task.

Note we chose to design the API around context.Context, which allows
easier integration with other tracing tools that are often designed
around context.Context as well. The Log and WithSpan APIs recognize the
task information embedded in the context and record it in the trace
too. That allows the Go execution tracer to associate and group the
spans and log messages based on the task information.

In order to create a Task,

   ctx, end := trace.NewContext(ctx, "myTask")
   defer end()

The Go execution tracer measures the time between task creation and
task end and reports it as the task latency.

More discussion history in golang.org/cl/59572.

Update #16619

R=go1.11

Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd
Reviewed-on: https://go-review.googlesource.com/63274
Reviewed-by: Austin Clements <austin@google.com>
2018-02-15 18:52:43 +00:00
Hana (Hyang-Ah) Kim
d58f4e9b7b runtime/trace: fix corrupted trace during StartTrace
Since Go 1.8, the different types of GC mark workers have been
annotated, and the annotation strings are recorded during StartTrace.
This change fixes two issues around that use of traceString from
StartTrace.

1) "failed to parse trace: no consistent ordering of events possible"

This issue is a result of a missing 'batch' event entry. For efficient
tracing, tracer maintains system allocated buffers and once a buffer
is full, it is Flushed out for writing. Moreover, tracing assumes all
the records in the same buffer (batch) are already ordered and implements
more optimization in encoding and defers the completing order
reconstruction till the trace parsing time. Thus, when a Flush happens
and a new buffer is used, the new buffer should contain an event to
indicate the start of a new batch. Before this CL, the batch entry was
written only by traceEvent only when the buffer position is 0 and
wasn't written when flush occurs during traceString.

This CL fixes it by moving the batch entry write to the traceFlush.

2) crash during tracing due to invalid memory access, or during parsing
due to duplicate string entries

This issue is a result of memory allocation during traceString calls.
The execution tracer traces some memory allocation activity. Before
this CL, traceString took the buffer address (*traceBuf) and mutated
the buffer. If memory tracing occurred in the meantime on the same P,
the allocation tracing (traceEvent) would take the same buffer address
through the pointer to the buffer address (**traceBuf) and mutate the
buffer.

As a result, one of the following can happen:
 - the allocation record is overwritten by the following trace string
   record (data loss);
 - if a buffer flush occurs during the allocation tracing, traceString
   will attempt to write the string record to the old buffer, eventually
   causing an invalid memory access crash;
 - or a flush of the same buffer can occur twice (once from the memory
   allocation, and once from the string record write), in which case the
   trace can contain the same data twice and the parser will complain
   about duplicate string record entries.

This CL fixes the second issue by making traceString take **traceBuf
(*traceBufPtr).
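
A sketch of the shape of the fix (signature simplified; bufHasRoom and
nextStringID are hypothetical helpers standing in for the real capacity
check and id allocation):

    // traceString now receives the location of the buffer pointer, so
    // when it triggers a flush it can hand the caller the new buffer.
    func traceString(bufp *traceBufPtr, s string) uint64 {
        buf := bufp.ptr()
        if buf == nil || !bufHasRoom(buf, len(s)) {
            // traceFlush queues the old buffer and returns a fresh one,
            // now writing the batch header itself (the fix for issue 1).
            *bufp = traceFlush(*bufp)
            buf = bufp.ptr()
        }
        id := nextStringID()
        // ... append an EvString record for (id, s) to buf ...
        return id
    }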

Change-Id: I24f629758625b38e1916fbfc7d7be6ea210586af
Reviewed-on: https://go-review.googlesource.com/50873
Run-TryBot: Austin Clements <austin@google.com>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-10-31 22:03:30 +00:00
Austin Clements
e900e275e8 runtime: clean up loops over allp
allp now has length gomaxprocs, which means none of allp[i] are nil or
in state _Pdead. This lets us replace several different styles of loops
over allp with normal range loops.

for i := 0; i < gomaxprocs; i++ { ... } loops can simply range over
allp. Likewise, range loops over allp[:gomaxprocs] can just range over
allp.

Loops that check for p == nil || p.state == _Pdead don't need to check
this any more.

Loops that check for p == nil don't have to check this *if* dead Ps
don't affect them. I checked that all such loops are, in fact,
unaffected by dead Ps. One loop was potentially affected, which this
fixes by zeroing p.gcAssistTime in procresize.
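
Illustrative before/after for the loop cleanup:

    // Before: bound by gomaxprocs, skipping nil and dead Ps.
    for i := int32(0); i < gomaxprocs; i++ {
        p := allp[i]
        if p == nil || p.status == _Pdead {
            continue
        }
        // ... use p ...
    }

    // After: allp has length gomaxprocs and contains no nil or dead Ps.
    for _, p := range allp {
        // ... use p ...
    }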

Updates #15131.

Change-Id: Ifa1c2a86ed59892eca0610360a75bb613bc6dcee
Reviewed-on: https://go-review.googlesource.com/45575
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-09-27 16:29:15 +00:00
Austin Clements
84d2c7ea83 runtime: dynamically allocate allp
This makes it possible to eliminate the hard cap on GOMAXPROCS.

Updates #15131.

Change-Id: I4c422b340791621584c118a6be1b38e8a44f8b70
Reviewed-on: https://go-review.googlesource.com/45573
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-09-27 16:29:09 +00:00
Aliaksandr Valialkin
76f4fd8a52 runtime: improve timers scalability on multi-CPU systems
Use per-P timers, so each P may work with its own timers.

This CL improves performance on multi-CPU systems
in the following cases:

- When serving a high number of concurrent connections
  with read/write deadlines set (for instance, a highly loaded
  net/http server).

- When using a high number of concurrent timers. These timers
  may be implicitly created via context.WithDeadline
  or context.WithTimeout.

Production servers should usually set timeouts on connections and
external requests in order to prevent resource leakage.
See https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/
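
A sketch of the sharding this CL introduces (padding details elided;
cacheLinePadSize is illustrative):

    const timersLen = 64

    // One timer bucket per P (by P id, modulo timersLen), so timer
    // operations on different Ps no longer contend on a single lock.
    var timers [timersLen]struct {
        timersBucket
        pad [cacheLinePadSize]byte // keep buckets on separate cache lines
    }

    func (t *timer) assignBucket() *timersBucket {
        id := uint8(getg().m.p.ptr().id) % timersLen
        t.tb = &timers[id].timersBucket
        return t.tb
    }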

Below are relevant benchmark results for various GOMAXPROCS values
on linux/amd64:

context package:

name                                     old time/op  new time/op  delta
WithTimeout/concurrency=40      4.92µs ± 0%  5.17µs ± 1%  +5.07%  (p=0.000 n=9+9)
WithTimeout/concurrency=4000    6.03µs ± 1%  6.49µs ± 0%  +7.63%  (p=0.000 n=8+10)
WithTimeout/concurrency=400000  8.58µs ± 7%  9.02µs ± 4%  +5.02%  (p=0.019 n=10+10)

name                                     old time/op  new time/op  delta
WithTimeout/concurrency=40-2      3.70µs ± 1%  2.78µs ± 4%  -24.90%  (p=0.000 n=8+9)
WithTimeout/concurrency=4000-2    4.49µs ± 4%  3.67µs ± 5%  -18.26%  (p=0.000 n=10+10)
WithTimeout/concurrency=400000-2  6.16µs ±10%  5.15µs ±13%  -16.30%  (p=0.000 n=10+10)

name                                     old time/op  new time/op  delta
WithTimeout/concurrency=40-4      3.58µs ± 1%  2.64µs ± 2%  -26.13%  (p=0.000 n=9+10)
WithTimeout/concurrency=4000-4    4.17µs ± 0%  3.32µs ± 1%  -20.36%  (p=0.000 n=10+10)
WithTimeout/concurrency=400000-4  5.57µs ± 9%  4.83µs ±10%  -13.27%  (p=0.001 n=10+10)

time package:

name                     old time/op  new time/op  delta
AfterFunc                6.15ms ± 3%  6.07ms ± 2%     ~     (p=0.133 n=10+9)
AfterFunc-2              3.43ms ± 1%  3.56ms ± 1%   +3.91%  (p=0.000 n=10+9)
AfterFunc-4              5.04ms ± 2%  2.36ms ± 0%  -53.20%  (p=0.000 n=10+9)
After                    6.54ms ± 2%  6.49ms ± 3%     ~     (p=0.393 n=10+10)
After-2                  3.68ms ± 1%  3.87ms ± 0%   +5.14%  (p=0.000 n=9+9)
After-4                  6.66ms ± 1%  2.87ms ± 1%  -56.89%  (p=0.000 n=10+10)
Stop                      698µs ± 2%   689µs ± 1%   -1.26%  (p=0.011 n=10+10)
Stop-2                    729µs ± 2%   434µs ± 3%  -40.49%  (p=0.000 n=10+10)
Stop-4                    837µs ± 3%   333µs ± 2%  -60.20%  (p=0.000 n=10+10)
SimultaneousAfterFunc     694µs ± 1%   692µs ± 7%     ~     (p=0.481 n=10+10)
SimultaneousAfterFunc-2   714µs ± 3%   569µs ± 2%  -20.33%  (p=0.000 n=10+10)
SimultaneousAfterFunc-4   782µs ± 2%   386µs ± 2%  -50.67%  (p=0.000 n=10+10)
StartStop                 267µs ± 3%   274µs ± 0%   +2.64%  (p=0.000 n=8+9)
StartStop-2               238µs ± 2%   140µs ± 3%  -40.95%  (p=0.000 n=10+8)
StartStop-4               320µs ± 1%   125µs ± 1%  -61.02%  (p=0.000 n=9+9)
Reset                    75.0µs ± 1%  77.5µs ± 2%   +3.38%  (p=0.000 n=10+10)
Reset-2                   150µs ± 2%    40µs ± 5%  -73.09%  (p=0.000 n=10+9)
Reset-4                   226µs ± 1%    33µs ± 1%  -85.42%  (p=0.000 n=10+10)
Sleep                     857µs ± 6%   878µs ± 9%     ~     (p=0.079 n=10+9)
Sleep-2                   617µs ± 4%   585µs ± 2%   -5.21%  (p=0.000 n=10+10)
Sleep-4                   689µs ± 3%   465µs ± 4%  -32.53%  (p=0.000 n=10+10)
Ticker                   55.9ms ± 2%  55.9ms ± 2%     ~     (p=0.971 n=10+10)
Ticker-2                 28.7ms ± 2%  28.1ms ± 1%   -2.06%  (p=0.000 n=10+10)
Ticker-4                 14.6ms ± 0%  13.6ms ± 1%   -6.80%  (p=0.000 n=9+10)

Fixes #15133

Change-Id: I6f4b09d2db8c5bec93146db6501b44dbfe5c0ac4
Reviewed-on: https://go-review.googlesource.com/34784
Reviewed-by: Austin Clements <austin@google.com>
2017-09-12 16:52:23 +00:00
Austin Clements
b0392159f6 runtime,cmd/trace: trace GC STW events
Right now we only kind of sort of trace GC STW events. We emit events
around mark termination, but those start well after stopping the world
and end before starting it again, and we don't emit any events for
sweep termination.

Fix this by generalizing EvGCScanStart/EvGCScanDone. These were
already re-purposed to indicate mark termination (despite the names).
This commit renames them to EvGCSTWStart/EvGCSTWDone, adds an argument
to indicate the STW reason, and shuffles the runtime to generate them
right before stopping the world and right after starting the world,
respectively.

These events will make it possible to generate precise minimum mutator
utilization (MMU) graphs and could be useful in detecting
non-preemptible goroutines (e.g., #20792).

Change-Id: If95783f370781d8ef66addd94886028103a7c26f
Reviewed-on: https://go-review.googlesource.com/55411
Reviewed-by: Rick Hudson <rlh@golang.org>
2017-08-29 21:54:55 +00:00
Austin Clements
22000f5407 runtime: record swept and reclaimed bytes in sweep trace
This extends the GCSweepDone event with counts of swept and reclaimed
bytes. These are useful for understanding the duration and
effectiveness of sweep events.

Change-Id: I3c97a4f0f3aad3adbd188adb264859775f54e2df
Reviewed-on: https://go-review.googlesource.com/40811
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2017-04-19 18:31:14 +00:00
Austin Clements
79c56addb6 runtime: make sweep trace events encompass entire sweep loop
Currently, each individual span sweep emits a span to the trace. But
sweeps are generally done in loops until some condition is satisfied,
so this tracing is lower-level than anyone really wants and hides the
fact that no other work is being accomplished between adjacent sweep
events. This is also high overhead: enabling tracing significantly
impacts sweep latency.

Replace this by instead tracing around the sweep loops used for
allocation. This is slightly tricky because sweep loops don't
generally know if any sweeping will happen in them. Hence, we make the
tracing lazy by recording in the P that we would like to start tracing
the sweep *if* one happens, and then only closing the sweep event if
we started it.

This does mean we don't get tracing on every sweep path, which are
legion. However, we get much more informative tracing on the paths
that block allocation, which are the paths that matter.

Change-Id: I73e14fbb250acb0c9d92e3648bddaa5e7d7e271c
Reviewed-on: https://go-review.googlesource.com/40810
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-19 18:31:11 +00:00
David Lazar
3249cb0ab4 runtime/trace: iterate over frames instead of PCs
Now the runtime/trace tests pass with -l=4.

This also gets rid of the frames cache for multiple reasons:

1) The frames cache was used to avoid repeated calls to funcname and
funcline. Now these calls happen inside the CallersFrames iterator.

2) Maintaining a frames cache is harder: map[uintptr]traceFrame
doesn't work since each PC can map to multiple traceFrames.

3) It's not clear that the cache is important.

Change-Id: I2914ac0b3ba08e39b60149d99a98f9f532b35bbb
Reviewed-on: https://go-review.googlesource.com/40591
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-04-14 12:21:21 +00:00
Daniel Martí
77b09b8b8d runtime: remove unused g parameter
Found by github.com/mvdan/unparam.

Change-Id: I20145440ff1bcd27fcf15a740354c52f313e536c
Reviewed-on: https://go-review.googlesource.com/37894
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-03-16 14:03:45 +00:00
Austin Clements
0efc8b2188 runtime: avoid repeated findmoduledatap calls
Currently almost every function that deals with a *_func has to first
look up the *moduledata for the module containing the function's entry
point. This means we almost always do at least two identical module
lookups whenever we deal with a *_func (one to get the *_func and
another to get something from its module data) and sometimes several
more.

Fix this by making findfunc return a new funcInfo type that embeds
*_func, but also includes the *moduledata, and making all of the
functions that currently take a *_func instead take a funcInfo and use
the already-found *moduledata.

This transformation is trivial for the most part, since the *_func
type is usually inferred. The annoying part is that we can no longer
use nil to indicate failure, so this introduces a funcInfo.valid()
method and replaces nil checks with calls to valid.

Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f
Reviewed-on: https://go-review.googlesource.com/37331
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2017-03-06 19:17:24 +00:00
Dmitry Vyukov
0556e26273 sync: make Mutex more fair
Add a new starvation mode for Mutex.
In starvation mode, ownership is directly handed off from the
unlocking goroutine to the next waiter. Newly arriving goroutines
don't compete for ownership.
Unfair wait time is now limited to 1ms.
Also fix a long-standing bug whereby goroutines were requeued at the
tail of the wait queue. That led to even more unfair acquisition
times with multiple waiters.

Performance of normal mode is not considerably affected.

Fixes #13086

On the lockskew program provided in the issue:

done in 1.207853ms
done in 1.177451ms
done in 1.184168ms
done in 1.198633ms
done in 1.185797ms
done in 1.182502ms
done in 1.316485ms
done in 1.211611ms
done in 1.182418ms

name                    old time/op  new time/op   delta
MutexUncontended-48     0.65ns ± 0%   0.65ns ± 1%     ~           (p=0.087 n=10+10)
Mutex-48                 112ns ± 1%    114ns ± 1%   +1.69%        (p=0.000 n=10+10)
MutexSlack-48            113ns ± 0%     87ns ± 1%  -22.65%         (p=0.000 n=8+10)
MutexWork-48             149ns ± 0%    145ns ± 0%   -2.48%         (p=0.000 n=9+10)
MutexWorkSlack-48        149ns ± 0%    122ns ± 3%  -18.26%         (p=0.000 n=6+10)
MutexNoSpin-48           103ns ± 4%    105ns ± 3%     ~           (p=0.089 n=10+10)
MutexSpin-48             490ns ± 4%    515ns ± 6%   +5.08%        (p=0.006 n=10+10)
Cond32-48               13.4µs ± 6%   13.1µs ± 5%   -2.75%        (p=0.023 n=10+10)
RWMutexWrite100-48      53.2ns ± 3%   41.2ns ± 3%  -22.57%        (p=0.000 n=10+10)
RWMutexWrite10-48       45.9ns ± 2%   43.9ns ± 2%   -4.38%        (p=0.000 n=10+10)
RWMutexWorkWrite100-48   122ns ± 2%    134ns ± 1%   +9.92%        (p=0.000 n=10+10)
RWMutexWorkWrite10-48    206ns ± 1%    188ns ± 1%   -8.52%         (p=0.000 n=8+10)
Cond32-24               12.1µs ± 3%   12.4µs ± 3%   +1.98%         (p=0.043 n=10+9)
MutexUncontended-24     0.74ns ± 1%   0.75ns ± 1%     ~           (p=0.650 n=10+10)
Mutex-24                 122ns ± 2%    124ns ± 1%   +1.31%        (p=0.007 n=10+10)
MutexSlack-24           96.9ns ± 2%  102.8ns ± 2%   +6.11%        (p=0.000 n=10+10)
MutexWork-24             146ns ± 1%    135ns ± 2%   -7.70%         (p=0.000 n=10+9)
MutexWorkSlack-24        135ns ± 1%    128ns ± 2%   -5.01%         (p=0.000 n=10+9)
MutexNoSpin-24           114ns ± 3%    110ns ± 4%   -3.84%        (p=0.000 n=10+10)
MutexSpin-24             482ns ± 4%    475ns ± 8%     ~           (p=0.286 n=10+10)
RWMutexWrite100-24      43.0ns ± 3%   43.1ns ± 2%     ~           (p=0.956 n=10+10)
RWMutexWrite10-24       43.4ns ± 1%   43.2ns ± 1%     ~            (p=0.085 n=10+9)
RWMutexWorkWrite100-24   130ns ± 3%    131ns ± 3%     ~           (p=0.747 n=10+10)
RWMutexWorkWrite10-24    191ns ± 1%    192ns ± 1%     ~           (p=0.210 n=10+10)
Cond32-12               11.5µs ± 2%   11.7µs ± 2%   +1.98%        (p=0.002 n=10+10)
MutexUncontended-12     1.48ns ± 0%   1.50ns ± 1%   +1.08%        (p=0.004 n=10+10)
Mutex-12                 141ns ± 1%    143ns ± 1%   +1.63%        (p=0.000 n=10+10)
MutexSlack-12            121ns ± 0%    119ns ± 0%   -1.65%          (p=0.001 n=8+9)
MutexWork-12             141ns ± 2%    150ns ± 3%   +6.36%         (p=0.000 n=9+10)
MutexWorkSlack-12        131ns ± 0%    138ns ± 0%   +5.73%         (p=0.000 n=9+10)
MutexNoSpin-12          87.0ns ± 1%   83.7ns ± 1%   -3.80%        (p=0.000 n=10+10)
MutexSpin-12             364ns ± 1%    377ns ± 1%   +3.77%        (p=0.000 n=10+10)
RWMutexWrite100-12      42.8ns ± 1%   43.9ns ± 1%   +2.41%         (p=0.000 n=8+10)
RWMutexWrite10-12       39.8ns ± 4%   39.3ns ± 1%     ~            (p=0.433 n=10+9)
RWMutexWorkWrite100-12   131ns ± 1%    131ns ± 0%     ~            (p=0.591 n=10+9)
RWMutexWorkWrite10-12    173ns ± 1%    174ns ± 0%     ~            (p=0.059 n=10+8)
Cond32-6                10.9µs ± 2%   10.9µs ± 2%     ~           (p=0.739 n=10+10)
MutexUncontended-6      2.97ns ± 0%   2.97ns ± 0%     ~     (all samples are equal)
Mutex-6                  122ns ± 6%    122ns ± 2%     ~           (p=0.668 n=10+10)
MutexSlack-6             149ns ± 3%    142ns ± 3%   -4.63%        (p=0.000 n=10+10)
MutexWork-6              136ns ± 3%    140ns ± 5%     ~           (p=0.077 n=10+10)
MutexWorkSlack-6         152ns ± 0%    138ns ± 2%   -9.21%         (p=0.000 n=6+10)
MutexNoSpin-6            150ns ± 1%    152ns ± 0%   +1.50%         (p=0.000 n=8+10)
MutexSpin-6              726ns ± 0%    730ns ± 1%     ~           (p=0.069 n=10+10)
RWMutexWrite100-6       40.6ns ± 1%   40.9ns ± 1%   +0.91%         (p=0.001 n=8+10)
RWMutexWrite10-6        37.1ns ± 0%   37.0ns ± 1%     ~            (p=0.386 n=9+10)
RWMutexWorkWrite100-6    133ns ± 1%    134ns ± 1%   +1.01%         (p=0.005 n=9+10)
RWMutexWorkWrite10-6     152ns ± 0%    152ns ± 0%     ~     (all samples are equal)
Cond32-2                7.86µs ± 2%   7.95µs ± 2%   +1.10%        (p=0.023 n=10+10)
MutexUncontended-2      8.10ns ± 0%   9.11ns ± 4%  +12.44%         (p=0.000 n=9+10)
Mutex-2                 32.9ns ± 9%   38.4ns ± 6%  +16.58%        (p=0.000 n=10+10)
MutexSlack-2            93.4ns ± 1%   98.5ns ± 2%   +5.39%         (p=0.000 n=10+9)
MutexWork-2             40.8ns ± 3%   43.8ns ± 7%   +7.38%         (p=0.000 n=10+9)
MutexWorkSlack-2        98.6ns ± 5%  108.2ns ± 2%   +9.80%         (p=0.000 n=10+8)
MutexNoSpin-2            399ns ± 1%    398ns ± 2%     ~             (p=0.463 n=8+9)
MutexSpin-2             1.99µs ± 3%   1.97µs ± 1%   -0.81%          (p=0.003 n=9+8)
RWMutexWrite100-2       37.6ns ± 5%   46.0ns ± 4%  +22.17%         (p=0.000 n=10+8)
RWMutexWrite10-2        50.1ns ± 6%   36.8ns ±12%  -26.46%         (p=0.000 n=9+10)
RWMutexWorkWrite100-2    136ns ± 0%    134ns ± 2%   -1.80%          (p=0.001 n=7+9)
RWMutexWorkWrite10-2     140ns ± 1%    138ns ± 1%   -1.50%        (p=0.000 n=10+10)
Cond32                  5.93µs ± 1%   5.91µs ± 0%     ~            (p=0.411 n=9+10)
MutexUncontended        15.9ns ± 0%   15.8ns ± 0%   -0.63%          (p=0.000 n=8+8)
Mutex                   15.9ns ± 0%   15.8ns ± 0%   -0.44%        (p=0.003 n=10+10)
MutexSlack              26.9ns ± 3%   26.7ns ± 2%     ~           (p=0.084 n=10+10)
MutexWork               47.8ns ± 0%   47.9ns ± 0%   +0.21%          (p=0.014 n=9+8)
MutexWorkSlack          54.9ns ± 3%   54.5ns ± 3%     ~           (p=0.254 n=10+10)
MutexNoSpin              786ns ± 2%    765ns ± 1%   -2.66%        (p=0.000 n=10+10)
MutexSpin               3.87µs ± 1%   3.83µs ± 0%   -0.85%          (p=0.005 n=9+8)
RWMutexWrite100         21.2ns ± 2%   21.0ns ± 1%   -0.88%         (p=0.018 n=10+9)
RWMutexWrite10          22.6ns ± 1%   22.6ns ± 0%     ~             (p=0.471 n=9+9)
RWMutexWorkWrite100      132ns ± 0%    132ns ± 0%     ~     (all samples are equal)
RWMutexWorkWrite10       124ns ± 0%    123ns ± 0%     ~           (p=0.656 n=10+10)

Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b
Reviewed-on: https://go-review.googlesource.com/34310
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-17 17:24:59 +00:00
Austin Clements
d089a6c718 runtime: remove stack barriers
Now that we don't rescan stacks, stack barriers are unnecessary. This
removes all of the code and structures supporting them as well as
tests that were specifically for stack barriers.

Updates #17503.

Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c
Reviewed-on: https://go-review.googlesource.com/36620
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-14 15:52:54 +00:00
Heschi Kreinick
2a74b9e814 cmd/trace: Record mark assists in execution traces
During the mark phase of garbage collection, goroutines that allocate
may be recruited to assist. This change creates trace events for mark
assists and displays them similarly to sweep assists in the trace
viewer.

Mark assists are different than sweeps in that they can be preempted, so
displaying them in the trace viewer is a little tricky -- we may need to
synthesize multiple slices for one mark assist. This could have been
done in the parser instead, but I thought it might be preferable to keep
the parser as true to the event stream as possible.

Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7
Reviewed-on: https://go-review.googlesource.com/36015
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2017-02-10 18:03:42 +00:00
Austin Clements
6da83c6fc0 runtime, cmd/trace: track goroutines blocked on GC assists
Currently when a goroutine blocks on a GC assist, it emits a generic
EvGoBlock event. Since assist blocking events and, in particular, the
length of the blocked assist queue, are important for diagnosing GC
behavior, this commit adds a new EvGoBlockGC event for blocking on a
GC assist. The trace viewer uses this event to report a "waiting on
GC" count in the "Goroutines" row. This makes sense because, unlike
other blocked goroutines, these goroutines do have work to do, so
being blocked on a GC assist is quite similar to being in the
"runnable" state, which we also report in the trace viewer.

Change-Id: Ic21a326992606b121ea3d3d00110d8d1fdc7a5ef
Reviewed-on: https://go-review.googlesource.com/30704
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-28 14:29:47 +00:00
Austin Clements
6834839427 runtime, cmd/trace: annotate different mark worker types
Currently mark workers are shown in the trace as regular goroutines
labeled "runtime.gcBgMarkWorker". That's somewhat unhelpful to an end
user because of the opaque label and particularly unhelpful to runtime
developers because it doesn't distinguish the different types of mark
workers.

Fix this by introducing a variant of the GoStart event called
GoStartLabel that lets the runtime indicate a label for a goroutine
execution span and using this to label mark worker executions as "GC
(<mode>)" in the trace viewer.

Since this bumps the trace version to 1.8, we also add test data for
1.7 traces.

Change-Id: Id7b9c0536508430c661ffb9e40e436f3901ca121
Reviewed-on: https://go-review.googlesource.com/30702
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-28 14:29:40 +00:00
Peter Weinberger
ca922b6d36 runtime: Profile goroutines holding contended mutexes.
runtime.SetMutexProfileFraction(n int) will capture 1/n-th of the stack
traces of goroutines holding contended mutexes if n > 0. From
runtime/pprof, pprof.Lookup("mutex").WriteTo writes the accumulated
stack traces to w (in essentially the same format that blocking
profiling uses).

Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a
Reviewed-on: https://go-review.googlesource.com/29650
Reviewed-by: Austin Clements <austin@google.com>
2016-10-28 11:47:16 +00:00
Austin Clements
c242517866 runtime: replace *g with guintptr in trace
trace's reader *g is going to cause write barriers in unfortunate
places, so replace it with a guintptr.

Change-Id: Ie8fb13bb89a78238f9d2a77ec77da703e96df8af
Reviewed-on: https://go-review.googlesource.com/31469
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-21 16:00:20 +00:00
Austin Clements
1bc6be6423 runtime: mark several types go:notinheap
This covers basically all sysAlloc'd, persistentalloc'd, and
fixalloc'd types.

Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b
Reviewed-on: https://go-review.googlesource.com/30941
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-15 17:58:20 +00:00
Austin Clements
94589054d3 cmd/trace: label mark termination spans as such
Currently these are labeled "MARK", which was accurate in the STW
collector, but these really indicate mark termination now, since
marking happens for the full duration of the concurrent GC. Re-label
them as "MARK TERMINATION" to clarify this.

Change-Id: Ie98bd961195acde49598b4fa3f9e7d90d757c0a6
Reviewed-on: https://go-review.googlesource.com/30018
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2016-10-07 18:33:23 +00:00
Austin Clements
fa9b57bb1d runtime: make next_gc ^0 when GC is disabled
When GC is disabled, we set gcpercent to -1. However, we still use
gcpercent to compute several values, such as next_gc and gc_trigger.
These calculations are meaningless when gcpercent is -1 and produce
bogus values. This is okay in a sense because we also never use
these values if gcpercent is -1, but they're confusing when exposed to
the user, for example via MemStats or the execution trace. It's
particularly unfortunate in the execution trace because it attempts to
plot the underflowed value of next_gc, which scales all useful
information in the heap row into oblivion.

Fix this by making next_gc ^0 when gcpercent < 0. This has the
advantage of being true in a way: next_gc is effectively infinite when
gcpercent < 0. We can also detect this special value when updating the
execution trace and report next_gc as 0 so it doesn't blow up the
display of the heap line.
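
A sketch of both halves of the fix (setGCGoal is an illustrative
stand-in for the code that computes next_gc):

    func setGCGoal() {
        if gcpercent < 0 {
            memstats.next_gc = ^uint64(0) // effectively infinite: GC is off
            return
        }
        // ... normal goal computation from heap_marked and gcpercent ...
    }

    // When emitting the heap goal to the execution trace, map the
    // sentinel back to 0 so the heap row stays readable.
    func traceNextGC() {
        if memstats.next_gc == ^uint64(0) {
            traceEvent(traceEvNextGC, -1, 0)
        } else {
            traceEvent(traceEvNextGC, -1, memstats.next_gc)
        }
    }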

Change-Id: I4f366e4451f8892a4908da7b2b6086bdc67ca9a9
Reviewed-on: https://go-review.googlesource.com/30016
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-07 18:32:51 +00:00
Dmitry Vyukov
cd285f1c6f runtime: fix global buffer reset in StopTrace
We reset the global buffer only if its pos != 0.
We ought to reset it always, but queue it only if pos != 0.
This is a latent bug: currently it does not fire because
whenever we create a global buffer, we increment pos.

Change-Id: I01e28ae88ce9a5412497c524391b8b7cb443ffd9
Reviewed-on: https://go-review.googlesource.com/25574
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-09-02 19:14:11 +00:00
Dmitry Vyukov
747a158ef3 runtime: speed up StartTrace with lots of blocked goroutines
In StartTrace we emit EvGoCreate for all existing goroutines.
This includes a stack unwind to obtain the current stack.
Real Go programs can contain hundreds of thousands of blocked goroutines.
For such programs StartTrace can take up to a second (a few µs per
goroutine).

Obtain current stack ID once and use it for all EvGoCreate events.

This speeds up StartTrace with 10K blocked goroutines from 20ms to 4ms
(the win for StartTrace called from the net/http/pprof handler will be
bigger, as the stack is deeper).

Change-Id: I9e5ff9468331a840f8fdcdd56c5018c2cfde61fc
Reviewed-on: https://go-review.googlesource.com/25573
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2016-08-22 17:40:10 +00:00
Dmitry Vyukov
a3703618ea runtime: use per-goroutine sequence numbers in tracer
Currently the tracer uses a global sequencer, which introduces a
significant slowdown on parallel machines (up to 10x).
Replace the global sequencer with a per-goroutine sequencer.

If we assign per-goroutine sequence numbers to only 3 types
of events (start, unblock, and syscall exit), that is enough to
restore a consistent partial ordering of all events. Even these
events don't need sequence numbers all the time (if a goroutine
starts on the same P where it was unblocked, then the start does
not need a sequence number).
The burden of restoring the order is put on the trace parser.
Details of the algorithm are described in the comments.

On http benchmark with GOMAXPROCS=48:
no tracing: 5026 ns/op
tracing: 27803 ns/op (+453%)
with this change: 6369 ns/op (+26%, mostly for traceback)

Also trace size is reduced by ~22%. Average event size before: 4.63
bytes/event, after: 3.62 bytes/event.

Besides running trace tests, I've also tested with manually broken
cputicks (random skew for each event, per-P skew and episodic random skew).
In all cases the broken timestamps were detected and there were no test
failures.

Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa
Reviewed-on: https://go-review.googlesource.com/21512
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-23 15:57:05 +00:00
Dmitry Vyukov
2d342fba78 runtime: fix description of trace events
Change-Id: I037101b1921fe151695d32e9874b50dd64982298
Reviewed-on: https://go-review.googlesource.com/22314
Reviewed-by: Austin Clements <austin@google.com>
2016-04-22 21:32:37 +00:00
Dmitry Vyukov
3fafe2e888 internal/trace: support parsing of 1.5 traces
1. Parse out version from trace header.
2. Restore handling of 1.5 traces.
3. Restore optional symbolization of traces.
4. Add some canned 1.5 traces for regression testing
   (http benchmark trace, runtime/trace stress traces,
    plus one with broken timestamps).

Change-Id: Idb18a001d03ded8e13c2730eeeb37c5836e31256
Reviewed-on: https://go-review.googlesource.com/21803
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-04-11 17:56:44 +00:00
Dmitry Vyukov
0fb7b4cccd runtime: emit file:line info into traces
This makes traces self-contained and simplifies trace workflow
in modern cloud environments where it is simpler to reach
a service via HTTP than to obtain the binary.

Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d
Reviewed-on: https://go-review.googlesource.com/21732
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2016-04-08 20:52:30 +00:00
Austin Clements
08594ac7c7 runtime: acquire stack lock in traceEvent
traceEvent records system call events after a G has already entered
_Gsyscall, which means the garbage collector could be installing stack
barriers in the G's stack during the traceEvent. If traceEvent
attempts to capture the user stack during this, it may observe
inconsistent stack barriers and panic. Fix this by acquiring the stack
lock around the stack walk in traceEvent.

Fixes #14101.

Change-Id: I15f0ab0c70c04c6e182221f65a6f761c5a896459
Reviewed-on: https://go-review.googlesource.com/18973
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-27 02:22:09 +00:00
Austin Clements
e9aef43d87 runtime: eliminate traceAllocBlock write barriers
This replaces *traceAllocBlock with traceAllocBlockPtr.

Updates #10600.

Change-Id: I94a20d90f04cca7c457b29062427748e315e4857
Reviewed-on: https://go-review.googlesource.com/17004
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-11-19 21:17:09 +00:00
Michael Matloob
432cb66f16 runtime: break out system-specific constants into package sys
runtime/internal/sys will hold system-, architecture-, and
config-specific constants.

Updates #11647

Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54
Reviewed-on: https://go-review.googlesource.com/16817
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12 17:04:45 +00:00