62883 Commits

Author SHA1 Message Date
Dmitri Shuralyov
eb55b985a1 cmd/dist: add "devel" substring check to isRelease computation
Non-release versions that are built from source without a VERSION file
specifying any particular version end up with a development version like
"devel go1.25-67e0681aef Thu Apr 24 12:17:27 2025 -0700". Right now
those versions are correctly determined to be non-release because they
don't have a "go" prefix, instead they have a "devel " prefix.

In preparation of being able to move the "devel" substring, add a check
that said substring isn't present anywhere, since it is certain not to
be included in any released Go version we publish at https://go.dev/dl/.

For #73372.

Change-Id: Ia3e0d03b5723d4034d6270c3a2224f8dfae380e9
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/667955
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-25 14:30:13 -07:00
Keith Randall
3f3782feed cmd/compile: allow all of the preamble to be preemptible
We currently make some parts of the preamble unpreemptible because
it confuses morestack. See comments in the code.

Instead, have morestack handle those weird cases so we can
remove unpreemptible marks from most places.

This CL makes user functions preemptible everywhere if they have no
write barriers (at least, on x86). In cmd/go the fraction of functions
that need preemptible markings drops from 82% to 36%. Makes the cmd/go
binary 0.3% smaller.

Update #35470

Change-Id: Ic83d5eabfd0f6d239a92e65684bcce7e67ff30bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/648518
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-04-25 12:21:48 -07:00
limeidan
dc1e255104 runtime, internal/fuzz: add comparison tracing for libFuzzer on loong64
Change-Id: I212330962453139fa353db29928786b64c9ff063
Reviewed-on: https://go-review.googlesource.com/c/go/+/667455
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-04-24 23:16:24 -07:00
Guoqi Chen
fb2c88147d cmd/internal/obj: add new assembly format for BFPT and BFPF on loong64
On loong64, BFPT and BFPF are mapped to the platform assembly as follows:

   Go asm syntax:
        BFPT   FCCx, offs21
        BFPF   FCCx, offs21
   Equivalent platform assembler syntax:
        bcnez  cj, offs21
        bceqz  cj, offs21

If the condition register is not specified, it defaults to FCC0.

Change-Id: I2cc3df62a9c55d4b5eb124789358983c6737319c
Reviewed-on: https://go-review.googlesource.com/c/go/+/667456
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
2025-04-24 18:28:42 -07:00
Julian Zhu
06f96a598e crypto/sha256: improve performance of riscv64 assembly
Simplified the implementation of Ch and Maj by reducing instructions, based on CL 605495 which made the same change for SHA-512.

goos: linux
goarch: riscv64
pkg: crypto/sha256
cpu: Spacemit(R) X60
                    │  oldsha256  │              newsha256              │
                    │   sec/op    │   sec/op     vs base                │
Hash8Bytes/New-8      2.303µ ± 0%   2.098µ ± 0%   -8.90% (p=0.000 n=10)
Hash8Bytes/Sum224-8   2.535µ ± 0%   2.329µ ± 0%   -8.13% (p=0.000 n=10)
Hash8Bytes/Sum256-8   2.558µ ± 0%   2.352µ ± 0%   -8.04% (p=0.000 n=10)
Hash1K/New-8          28.67µ ± 0%   25.21µ ± 0%  -12.06% (p=0.000 n=10)
Hash1K/Sum224-8       28.89µ ± 0%   25.43µ ± 0%  -11.99% (p=0.000 n=10)
Hash1K/Sum256-8       28.91µ ± 0%   25.43µ ± 0%  -12.04% (p=0.000 n=10)
Hash8K/New-8          218.0µ ± 1%   192.7µ ± 2%  -11.58% (p=0.000 n=10)
Hash8K/Sum224-8       218.0µ ± 1%   193.6µ ± 1%  -11.20% (p=0.000 n=10)
Hash8K/Sum256-8       219.1µ ± 1%   193.4µ ± 1%  -11.74% (p=0.000 n=10)
geomean               24.93µ        22.28µ       -10.65%

                    │  oldsha256   │              newsha256               │
                    │     B/s      │     B/s       vs base                │
Hash8Bytes/New-8      3.309Mi ± 0%   3.633Mi ± 0%   +9.80% (p=0.000 n=10)
Hash8Bytes/Sum224-8   3.009Mi ± 0%   3.271Mi ± 0%   +8.72% (p=0.000 n=10)
Hash8Bytes/Sum256-8   2.985Mi ± 0%   3.242Mi ± 0%   +8.63% (p=0.000 n=10)
Hash1K/New-8          34.06Mi ± 0%   38.73Mi ± 0%  +13.72% (p=0.000 n=10)
Hash1K/Sum224-8       33.80Mi ± 0%   38.40Mi ± 0%  +13.63% (p=0.000 n=10)
Hash1K/Sum256-8       33.78Mi ± 0%   38.40Mi ± 0%  +13.69% (p=0.000 n=10)
Hash8K/New-8          35.84Mi ± 1%   40.54Mi ± 2%  +13.10% (p=0.000 n=10)
Hash8K/Sum224-8       35.83Mi ± 1%   40.35Mi ± 1%  +12.61% (p=0.000 n=10)
Hash8K/Sum256-8       35.66Mi ± 1%   40.40Mi ± 1%  +13.29% (p=0.000 n=10)
geomean               15.54Mi        17.39Mi       +11.89%

Change-Id: I9aa692fcfd70634dc6c308db9b5d06bd82ac2302
Reviewed-on: https://go-review.googlesource.com/c/go/+/639495
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
2025-04-24 17:55:31 -07:00
Mateusz Poliwczak
42d3cdc909 sync/atomic: document that atomic types should not be copied
Change-Id: I3c557d02cd676a389b5c5ea70ed92c8959041e3b
GitHub-Last-Rev: 8732da19a64853834ca155cafc1d7b2967290c31
GitHub-Pull-Request: golang/go#63256
Reviewed-on: https://go-review.googlesource.com/c/go/+/531375
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Sean Liao <sean@liao.dev>
2025-04-24 16:10:59 -07:00
changwang ma
da64b60c7e runtime: fix typo in comment
Change-Id: I85f518e36c18f4f0eda8b167750b43cd8c48ecff
Reviewed-on: https://go-review.googlesource.com/c/go/+/622675
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-04-24 15:19:23 -07:00
Keith Randall
67e0681aef cmd/compile: put constant value on node inside parentheses
That's where the unified IR writer expects it.

Fixes #73476

Change-Id: Ic22bd8dee5be5991e6d126ae3f6eccb2acdc0b19
Reviewed-on: https://go-review.googlesource.com/c/go/+/667415
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-04-24 12:17:27 -07:00
Felix Geisendörfer
3672a09a48 runtime/debug: update SetCrashOutput example to not pass parent env vars
Fixes #73490

Change-Id: I500fa73f4215c7f490779f53c1c2c0d775f51a95
Reviewed-on: https://go-review.googlesource.com/c/go/+/667775
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-04-24 12:01:27 -07:00
Keith Randall
3452d80da3 cmd/compile: add cast in range loop final value computation
When replacing a loop where the iteration variable has a named type,
we need to compute the last iteration value as i = T(len(a)-1), not
just i = len(a)-1.

Fixes #73491

Change-Id: Ic1cc3bdf8571a40c10060f929a9db8a888de2b70
Reviewed-on: https://go-review.googlesource.com/c/go/+/667815
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-04-24 11:02:26 -07:00
Keith Randall
3009566a46 runtime: fix tag pointers on aix
Clean up tagged pointers a bit. I got the shifts wrong
for the weird aix case.

Change-Id: I21449fd5973f4651fd1103d3b8be9c2b9b93a490
Reviewed-on: https://go-review.googlesource.com/c/go/+/667715
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-24 10:09:19 -07:00
qmuntal
8a8f506516 os,internal/poll: disassociate handle from IOCP in File.Fd
Go 1.25 will gain support for overlapped IO on handles passed to
os.NewFile thanks to CL 662236. It was previously not possible to add
an overlapped handle to the Go runtime's IO completion port (IOCP),
and now happens on the first call the an IO method.

This means that there is code that relies on the fact that File.Fd
returns a handle that can always be associated with a custom IOCP.
That wouldn't be the case anymore, as a handle can only be associated
with one IOCP at a time and it must be explicitly disassociated.

To fix this breaking change, File.Fd will disassociate the handle
from the Go runtime IOCP before returning it. It is then not necessary
to defer the association until the first IO method is called, which
was recently added in CL 661955 to support this same use case, but
in a more complex and unreliable way.

Updates #19098.

Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-race,gotip-windows-amd64-longtest,gotip-windows-arm64
Change-Id: Id8a7e04d35057047c61d1733bad5bf45494b2c28
Reviewed-on: https://go-review.googlesource.com/c/go/+/664455
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-24 06:51:09 -07:00
Keith Randall
c1fc209c41 runtime: use precise bounds of Go data/bss for race detector
We only want to call into the race detector for Go global variables.
By rounding up the region bounds, we can include some C globals.
Even worse, we can include only *part* of a C global, leading to
race{read,write}range calls which straddle the end of shadow memory.
That causes the race detector to barf.

Fix some off-by-one errors in the assembly comparisons. We want to
skip calling the race detector when addr == racedataend.

Fixes #73483

Change-Id: I436b0f588d6165b61f30cb7653016ba9b7cbf585
Reviewed-on: https://go-review.googlesource.com/c/go/+/667655
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2025-04-23 23:22:12 -07:00
Keith Randall
9d0320de25 runtime: align taggable pointers more so we can use low bits for tag
Currently we assume alignment to 8 bytes, so we can steal the low 3 bits.
This CL assumes alignment to 512 bytes, so we can steal the low 9 bits.

That's 6 extra bits!

Aligning to 512 bytes wastes a bit of space but it is not egregious.
Most of the objects that we make tagged pointers to are pretty big.

Update #49405

Change-Id: I66fc7784ac1be5f12f285de1d7851d5a6871fb75
Reviewed-on: https://go-review.googlesource.com/c/go/+/665815
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-23 21:44:50 -07:00
Alan Donovan
702f164ed1 cmd/vet: add hostport analyzer
+ test, release note

Fixes #28308

Change-Id: I190e2fe513eeb6b90b0398841f67bf52510b5f59
Reviewed-on: https://go-review.googlesource.com/c/go/+/667596
Auto-Submit: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-23 19:09:44 -07:00
Alan Donovan
fca5832607 cmd/vendor: update x/tools and x/text
This CL updates x/tools to 68e94bd and x/text to v0.24.0,
updates the vendor tree, and re-runs the bundle step for net/http.

Updates golang/go#28308

Change-Id: I4184f77547f535270ddc8e2ce6542377e3046ffd
Reviewed-on: https://go-review.googlesource.com/c/go/+/667597
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
2025-04-23 15:34:39 -07:00
Nevkontakte
71d9505998 crypto/tls: skip part of the test based on GOOS instead of GOARCH
This allows to skip the last part of the test under GopherJS as well as
WebAssembly, since GopherJS shares GOOS=js with wasm.

Change-Id: I41adad788043c1863b23eb2a6da9bc9aa2833092
GitHub-Last-Rev: d8d42a3b7ccb2bee6479306b6ac1a319443702ec
GitHub-Pull-Request: golang/go#51827
Reviewed-on: https://go-review.googlesource.com/c/go/+/394114
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-23 10:22:11 -07:00
Michael Anthony Knyszek
f1ebad19bd internal/goexperiment: add Green Tea GC goexperiment
Change-Id: Ia3ea5290842d8eddfafad4882f5874a2aff03e94
Reviewed-on: https://go-review.googlesource.com/c/go/+/645935
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-04-23 08:07:17 -07:00
Michael Anthony Knyszek
e90ba1d208 runtime: move some malloc constants to internal/runtime/gc
These constants are needed by some future generator programs.

Change-Id: I5dccd009cbb3b2f321523bc0d8eaeb4c82e5df81
Reviewed-on: https://go-review.googlesource.com/c/go/+/655276
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-23 08:06:33 -07:00
Michael Anthony Knyszek
528bafa049 runtime: move sizeclass defs to new package internal/runtime/gc
We will want to reference these definitions from new generator programs,
and this is a good opportunity to cleanup all these old C-style names.

Change-Id: Ifb06f0afc381e2697e7877f038eca786610c96de
Reviewed-on: https://go-review.googlesource.com/c/go/+/655275
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-23 08:00:33 -07:00
limeidan
ecdd429a3b runtime: optimize the function memequal using SIMD on loong64
goos: linux
goarch: loong64
pkg: bytes
cpu: Loongson-3A6000-HV @ 2500.00MHz
                              │      old      │                 new                  │
                              │    sec/op     │    sec/op     vs base                │
Equal/0                          0.4012n ± 0%   0.4003n ± 0%   -0.21% (p=0.000 n=10)
Equal/same/1                      2.555n ± 1%    2.419n ± 0%   -5.32% (p=0.000 n=10)
Equal/same/6                      2.574n ± 1%    2.425n ± 1%   -5.79% (p=0.000 n=10)
Equal/same/9                      2.578n ± 0%    2.419n ± 1%   -6.19% (p=0.000 n=10)
Equal/same/15                     2.565n ± 1%    2.417n ± 0%   -5.73% (p=0.000 n=10)
Equal/same/16                     2.576n ± 1%    2.414n ± 0%   -6.31% (p=0.000 n=10)
Equal/same/20                     2.573n ± 1%    2.416n ± 0%   -6.10% (p=0.000 n=10)
Equal/same/32                     2.559n ± 0%    2.411n ± 0%   -5.80% (p=0.000 n=10)
Equal/same/4K                     2.579n ± 1%    2.410n ± 0%   -6.53% (p=0.000 n=10)
Equal/same/4M                     2.571n ± 0%    2.411n ± 0%   -6.22% (p=0.000 n=10)
Equal/same/64M                    2.568n ± 1%    2.413n ± 0%   -6.05% (p=0.000 n=10)
Equal/1                           5.215n ± 0%    6.404n ± 0%  +22.80% (p=0.000 n=10)
Equal/6                          11.630n ± 0%    6.404n ± 0%  -44.94% (p=0.000 n=10)
Equal/9                          15.240n ± 0%    6.404n ± 0%  -57.98% (p=0.000 n=10)
Equal/15                         22.925n ± 0%    6.404n ± 0%  -72.07% (p=0.000 n=10)
Equal/16                         24.070n ± 0%    5.203n ± 0%  -78.38% (p=0.000 n=10)
Equal/20                         28.880n ± 0%    6.404n ± 0%  -77.83% (p=0.000 n=10)
Equal/32                         43.320n ± 0%    6.404n ± 0%  -85.22% (p=0.000 n=10)
Equal/4K                        4938.50n ± 0%    55.43n ± 0%  -98.88% (p=0.000 n=10)
Equal/4M                         5048.8µ ± 0%    202.0µ ± 0%  -96.00% (p=0.000 n=10)
Equal/64M                        80.819m ± 0%    4.539m ± 0%  -94.38% (p=0.000 n=10)
EqualBothUnaligned/64_0          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/64_1          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/64_4          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/64_7          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/4096_0       4937.00n ± 0%    65.64n ± 0%  -98.67% (p=0.000 n=10)
EqualBothUnaligned/4096_1       4937.00n ± 0%    78.85n ± 0%  -98.40% (p=0.000 n=10)
EqualBothUnaligned/4096_4       4937.00n ± 0%    78.87n ± 0%  -98.40% (p=0.000 n=10)
EqualBothUnaligned/4096_7       4937.00n ± 0%    78.87n ± 0%  -98.40% (p=0.000 n=10)
EqualBothUnaligned/4194304_0     5049.2µ ± 0%    204.2µ ± 0%  -95.96% (p=0.000 n=10)
EqualBothUnaligned/4194304_1     5049.2µ ± 0%    205.1µ ± 0%  -95.94% (p=0.000 n=10)
EqualBothUnaligned/4194304_4     5049.4µ ± 0%    205.1µ ± 0%  -95.94% (p=0.000 n=10)
EqualBothUnaligned/4194304_7     5049.2µ ± 0%    205.1µ ± 0%  -95.94% (p=0.000 n=10)
EqualBothUnaligned/67108864_0    80.796m ± 0%    3.863m ± 0%  -95.22% (p=0.000 n=10)
EqualBothUnaligned/67108864_1    80.801m ± 0%    3.706m ± 0%  -95.41% (p=0.000 n=10)
EqualBothUnaligned/67108864_4    80.799m ± 0%    3.706m ± 0%  -95.41% (p=0.000 n=10)
EqualBothUnaligned/67108864_7    80.781m ± 0%    3.706m ± 0%  -95.41% (p=0.000 n=10)
geomean                           1.040µ         149.6n       -85.63%

Change-Id: Id4c2bc0ca758337dd9759df83750c761814be488
Reviewed-on: https://go-review.googlesource.com/c/go/+/667255
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-04-23 01:29:52 -07:00
Xiaolin Zhao
93e4e26d5b runtime: fix typos in comments
Change-Id: Id169b68cc93bb6eb4cdca384efaaf971fcfa32b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/666316
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-22 18:37:58 -07:00
Carlos Amedee
489917fc40 Revert "runtime: only poll network from one P at a time in findRunnable"
This reverts commit 352dd2d932c1c1c6dbc3e112fcdfface07d4fffb.

Reason for revert: cockroachdb benchmark failing. Likely due to CL 564197.

For #73474

Change-Id: Id5d83cd8bb8fe9ee7fddb8dc01f1a01f2d40154e
Reviewed-on: https://go-review.googlesource.com/c/go/+/667336
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Carlos Amedee <carlos@golang.org>
2025-04-22 15:49:52 -07:00
1911860538
83b53527fa net/http: replace map lookup with switch for scheme port
Improve scheme port lookup by replacing map with switch, reducing overhead and improving performance.

Change-Id: I45c790da15e237d5f32c50d342b3713b98fd2ffa
GitHub-Last-Rev: 4c02e4cabf181b365fbf2b722e3051625a289527
GitHub-Pull-Request: golang/go#73422
Reviewed-on: https://go-review.googlesource.com/c/go/+/666356
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2025-04-22 13:10:32 -07:00
Mateusz Poliwczak
8a85a2e70a runtime, internal/runtime/maps: speed-up empty/zero map lookups
This lets the inliner do a better job optimizing the mapKeyError call.

goos: linux
goarch: amd64
pkg: runtime
cpu: AMD Ryzen 5 4600G with Radeon Graphics
                                 │ /tmp/before2 │             /tmp/after3             │
                                 │    sec/op    │   sec/op     vs base                │
MapAccessZero/Key=int64-12          1.875n ± 0%   1.875n ± 0%        ~ (p=0.506 n=25)
MapAccessZero/Key=int32-12          1.875n ± 0%   1.875n ± 0%        ~ (p=0.082 n=25)
MapAccessZero/Key=string-12         1.902n ± 1%   1.902n ± 1%        ~ (p=0.256 n=25)
MapAccessZero/Key=mediumType-12     2.816n ± 0%   1.958n ± 0%  -30.47% (p=0.000 n=25)
MapAccessZero/Key=bigType-12        2.815n ± 0%   1.935n ± 0%  -31.26% (p=0.000 n=25)
MapAccessEmpty/Key=int64-12         1.942n ± 0%   2.109n ± 0%   +8.60% (p=0.000 n=25)
MapAccessEmpty/Key=int32-12         2.110n ± 0%   1.940n ± 0%   -8.06% (p=0.000 n=25)
MapAccessEmpty/Key=string-12        2.024n ± 0%   2.109n ± 0%   +4.20% (p=0.000 n=25)
MapAccessEmpty/Key=mediumType-12    3.157n ± 0%   2.344n ± 0%  -25.75% (p=0.000 n=25)
MapAccessEmpty/Key=bigType-12       3.054n ± 0%   2.115n ± 0%  -30.75% (p=0.000 n=25)
geomean                             2.305n        2.011n       -12.75%

Change-Id: Iee83930884dc4c8a791a711aa189a1c93b68d536
Reviewed-on: https://go-review.googlesource.com/c/go/+/663495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-22 11:01:05 -07:00
Keith Randall
7d0cb2a2ad cmd/compile: constant fold 128-bit multiplies
The full 64x64->128 multiply comes up when using bits.Mul64.
The 64x64->64+overflow multiply comes up in unsafe.Slice when using
a constant length.

Change-Id: I298515162ca07d804b2d699d03bc957ca30a4ebc
Reviewed-on: https://go-review.googlesource.com/c/go/+/667175
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-22 10:24:18 -07:00
Rhys Hiltner
7a177114df runtime: commit to spinbitmutex GOEXPERIMENT
Use the "spinbit" mutex implementation always (including on platforms
that need to emulate atomic.Xchg8), and delete the prior "tristate"
implementations.

The exception is GOARCH=wasm, where the Go runtime does not use multiple
threads.

For #68578

Change-Id: Ifc29bbfa05071d776c23a19ae185891a03a82417
Reviewed-on: https://go-review.googlesource.com/c/go/+/658456
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-22 10:00:26 -07:00
Rhys Hiltner
7ce45a014c runtime: fix test of when a mutex is contended
This is used only in tests that verify reports of runtime-internal mutex
contention.

For #66999
For #70602

Change-Id: I72cb1302d8ea0524f1182ec892f5c9a1923cddba
Reviewed-on: https://go-review.googlesource.com/c/go/+/667095
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-04-22 09:58:32 -07:00
Prabhav Dogra
95611c0eb4 sync: use atomic.Bool for Once.done
Updated the use of atomic.Uint32 to atomic.Bool for sync package.

Change-Id: Ib8da66fea86ef06e1427ac5118016b96fbcda6b1
GitHub-Last-Rev: d36e0f431fcde988f90badf86bbf04a18a411947
GitHub-Pull-Request: golang/go#73447
Reviewed-on: https://go-review.googlesource.com/c/go/+/666895
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
2025-04-22 08:28:13 -07:00
Ian Lance Taylor
352dd2d932 runtime: only poll network from one P at a time in findRunnable
For #65064

Change-Id: Ifecd7e332d2cf251750752743befeda4ed396f33
Reviewed-on: https://go-review.googlesource.com/c/go/+/564197
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Artur M. Wolff <artur.m.wolff@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-22 02:49:42 -07:00
Keith Randall
336626bac4 cmd/compile: ensure we evaluate side effects of len() arg
For any len() which requires the evaluation of its arg (according to the spec).

Update #72844

Change-Id: Id2b0bcc78073a6d5051abd000131dafdf65e7f26
Reviewed-on: https://go-review.googlesource.com/c/go/+/658097
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-04-21 15:50:54 -07:00
Keith Randall
8af32240c6 cmd/compile: don't evaluate side effects of range over array
If the thing we're ranging over is an array or ptr to array, and
it doesn't have a function call or channel receive in it, then we
shouldn't evaluate it.

Typecheck the ranged-over value as a constant in that case.
That makes the unified exporter replace the range expression
with a constant int.

Change-Id: I0d4ea081de70d20cf6d1fa8d25ef6cb021975554
Reviewed-on: https://go-review.googlesource.com/c/go/+/659317
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-04-21 15:50:43 -07:00
thepudds
04a9b16f3d cmd/compile/internal/escape: avoid reading ir.Node during inner loop of walkOne
Broadly speaking, escape analysis has two main phases. First, it
traverses the AST while building a data-flow graph of locations and
edges. Second, during "solve", it repeatedly walks the data-flow graph
while carefully propagating information about each location, including
whether a location's address reaches the heap.

Once escape analysis is in the solve phase and repeatedly walking the
data-flow graph, almost all the information it needs is within the
location graph, with a notable exception being the ir.Class of an
ir.Name, which currently must be checked by following a pointer from
the location to its ir.Node.

For typical graphs, that does not matter much, but if the graph becomes
large enough, cache misses in the inner solve loop start to matter more,
and the class is checked many times in the inner loop.

We therefore store the class information on the location in the graph
to reduce how much memory we need to load in the inner loop.

The package github.com/microsoft/typescript-go/internal/checker
has many locations, and compilation currently spends most of its time
in escape analysis.

This CL gives roughly a 30% speedup for wall clock compilation time
for the checker package:

  go1.24.0:      91.79s
  this CL:       64.98s

Linux perf shows a healthy reduction for example in l2_request.miss and
dTLB-load-misses on an amd64 test VM.

We could tweak things a bit more, though initial review feedback
has suggested it would be good to get this in as it stands.

Subsequent CLs in this stack give larger improvements.

Updates #72815

Change-Id: I3117430dff684c99e6da1e0d7763869873379238
Reviewed-on: https://go-review.googlesource.com/c/go/+/657295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Jake Bailey <jacob.b.bailey@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2025-04-21 15:44:05 -07:00
Julian Zhu
c0245b31fb crypto/sha512: remove unnecessary move op, replace with direct add
goos: linux
goarch: riscv64
pkg: crypto/sha512
                    │      o      │                 n                  │
                    │   sec/op    │   sec/op     vs base               │
Hash8Bytes/New-4      3.499µ ± 0%   3.444µ ± 0%  -1.56% (p=0.000 n=10)
Hash8Bytes/Sum384-4   4.012µ ± 0%   3.957µ ± 0%  -1.37% (p=0.000 n=10)
Hash8Bytes/Sum512-4   4.218µ ± 0%   4.162µ ± 0%  -1.32% (p=0.000 n=10)
Hash1K/New-4          17.07µ ± 0%   16.57µ ± 0%  -2.97% (p=0.000 n=10)
Hash1K/Sum384-4       17.59µ ± 0%   17.11µ ± 0%  -2.76% (p=0.000 n=10)
Hash1K/Sum512-4       17.78µ ± 0%   17.30µ ± 0%  -2.72% (p=0.000 n=10)
Hash8K/New-4          112.2µ ± 0%   108.7µ ± 0%  -3.08% (p=0.000 n=10)
Hash8K/Sum384-4       112.7µ ± 0%   109.2µ ± 0%  -3.09% (p=0.000 n=10)
Hash8K/Sum512-4       112.9µ ± 0%   109.4µ ± 0%  -3.07% (p=0.000 n=10)
geomean               19.72µ        19.24µ       -2.44%

                    │      o       │                  n                  │
                    │     B/s      │     B/s       vs base               │
Hash8Bytes/New-4      2.184Mi ± 0%   2.213Mi ± 0%  +1.31% (p=0.000 n=10)
Hash8Bytes/Sum384-4   1.898Mi ± 1%   1.926Mi ± 0%  +1.51% (p=0.000 n=10)
Hash8Bytes/Sum512-4   1.812Mi ± 1%   1.831Mi ± 0%  +1.05% (p=0.000 n=10)
Hash1K/New-4          57.20Mi ± 0%   58.95Mi ± 0%  +3.06% (p=0.000 n=10)
Hash1K/Sum384-4       55.51Mi ± 0%   57.09Mi ± 0%  +2.84% (p=0.000 n=10)
Hash1K/Sum512-4       54.91Mi ± 0%   56.44Mi ± 0%  +2.79% (p=0.000 n=10)
Hash8K/New-4          69.63Mi ± 0%   71.84Mi ± 0%  +3.17% (p=0.000 n=10)
Hash8K/Sum384-4       69.30Mi ± 0%   71.52Mi ± 0%  +3.20% (p=0.000 n=10)
Hash8K/Sum512-4       69.19Mi ± 0%   71.39Mi ± 0%  +3.18% (p=0.000 n=10)
geomean               19.65Mi        20.13Mi       +2.45%

Change-Id: Ib68b934276ec08246d4ae60ef9870c233f0eac69
Reviewed-on: https://go-review.googlesource.com/c/go/+/665595
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2025-04-20 21:03:00 -07:00
Joel Sing
c893e1cf82 crypto/internal/fips140/aes: actually use the VTBL instruction on arm64
Support for the VTBL instruction was added in CL 110015 - use it
directly, rather than using WORD encodings. Note that one of the
WORD encodings does not actually match the instruction in the
comment - use the instruction that matches the existing encoding
instead.

Change-Id: I1933162f8144a6b86b38e8b550d36907131b1dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/666795
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-20 02:15:23 -07:00
1911860538
a204ed53d9 net: simplify readProtocols via sync.OnceFunc
In this case, using sync.OnceFunc is a better choice.

Change-Id: I52d27b9741265c90300a04a03537020e1aaaaaa7
GitHub-Last-Rev: a281daea255f1508a9042e8c8c7eb7ca1cef2430
GitHub-Pull-Request: golang/go#73434
Reviewed-on: https://go-review.googlesource.com/c/go/+/666635
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
2025-04-19 16:11:06 -07:00
Matthew Burton
ad0434200c fs: clarify documentation for ReadDir method
The fs.ReadDir method behaves the same way as
os.ReadDir, in that when n <= 0, ReadDir returns
all DirEntry values remaining in the dictionary.

Update the comment to reflect that only remaining
DirEntry values are returned (not all entries),
for subsequent calls.

Fixes #69301

Change-Id: I41ef7ef1c8e3fe7d64586f5297512697dc60dd40
Reviewed-on: https://go-review.googlesource.com/c/go/+/663215
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-19 16:07:50 -07:00
Russ Cox
a4d0269a4f math/big: use clearer loop bounds check elimination
Checking that the lengths are equal and panicking teaches the compiler
that it can assume “i in range for z” implies “i in range for x”, letting us
simplify the actual loops a bit.

It also turns up a few places in math/big that were playing maybe a little
too fast and loose with slice lengths. Update those to explicitly set all the
input slices to the same length.

These speedups are basically irrelevant, since they only happen
in real code if people are compiling with -tags math_big_pure_go.
But at least the code is clearer.

benchmark \ system                   c3h88    c2s16       s7      386   s7-386   c4as16      mac      arm  loong64  ppc64le  riscv64    s390x
AddVV/words=1/impl=go                    ~  +11.20%   +5.11%   -7.67%   -7.77%   +1.90%  +10.76%  -33.22%        ~  +10.98%        ~   +6.60%
AddVV/words=10/impl=go             -22.12%  -13.48%  -10.37%  -17.95%  -18.07%  -24.58%  -22.04%  -29.95%  -14.22%        ~   -6.33%   +3.66%
AddVV/words=16/impl=go              -9.75%  -13.73%        ~  -21.90%  -18.66%  -30.03%  -20.45%  -28.09%  -17.33%   -7.15%   -8.96%  +12.55%
AddVV/words=100/impl=go             -5.91%   -1.02%        ~  -29.23%  -22.18%  -25.62%   -6.49%  -23.59%  -22.31%   -1.88%  -14.13%   +9.23%
AddVV/words=1000/impl=go            -0.52%   -0.19%   -3.58%  -33.89%  -23.46%  -22.46%        ~  -24.00%  -24.73%   +0.93%  -15.79%  +12.32%
AddVV/words=10000/impl=go                ~        ~        ~  -33.79%  -23.72%  -23.79%   -5.98%  -23.92%        ~   +0.78%  -15.45%   +8.59%
AddVV/words=100000/impl=go               ~        ~        ~  -33.90%  -24.25%  -22.82%   -4.09%  -24.63%        ~   +1.00%  -13.56%        ~
SubVV/words=1/impl=go                    ~  +11.64%  +14.05%        ~   -4.07%        ~  +10.79%  -33.69%        ~        ~   +3.89%  +12.33%
SubVV/words=10/impl=go             -10.31%  -14.09%   -7.38%  +13.76%  -13.25%  -18.05%  -20.08%  -24.97%  -14.15%  +10.13%   -0.97%   -2.51%
SubVV/words=16/impl=go              -8.06%  -13.73%   -5.70%  +17.00%  -12.83%  -23.76%  -17.52%  -25.25%  -17.30%   -2.80%   -4.96%  -18.25%
SubVV/words=100/impl=go             -9.22%   -1.30%   -2.76%  +20.88%  -14.35%  -15.29%   -8.49%  -19.64%  -22.31%   -0.68%  -14.30%   -9.04%
SubVV/words=1000/impl=go            -0.60%        ~   -3.43%  +23.08%  -16.14%  -11.96%        ~  -28.52%  -24.73%        ~  -15.95%   -9.91%
SubVV/words=10000/impl=go                ~        ~        ~  +26.01%  -15.24%  -11.92%        ~  -28.26%   +4.25%        ~  -15.42%   -5.95%
SubVV/words=100000/impl=go               ~        ~        ~  +25.71%  -15.83%  -12.13%        ~  -27.88%   -1.27%        ~  -13.57%   -6.72%
LshVU/words=1/impl=go               +0.56%   +0.36%        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~
LshVU/words=10/impl=go             +13.37%   +4.63%        ~        ~        ~        ~        ~   -2.90%        ~        ~        ~        ~
LshVU/words=16/impl=go             +22.83%   +6.47%        ~        ~        ~        ~        ~        ~   +0.80%        ~        ~   +5.88%
LshVU/words=100/impl=go             +7.56%  +13.95%        ~        ~        ~        ~        ~        ~   +0.33%   -2.50%        ~        ~
LshVU/words=1000/impl=go            +0.64%  +17.92%        ~        ~        ~        ~        ~   -6.52%        ~   -2.58%        ~        ~
LshVU/words=10000/impl=go                ~  +17.60%        ~        ~        ~        ~        ~   -6.64%   -6.22%   -1.40%        ~        ~
LshVU/words=100000/impl=go               ~  +14.57%        ~        ~        ~        ~        ~        ~   -5.47%        ~        ~        ~
RshVU/words=1/impl=go                    ~        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~   +2.72%
RshVU/words=10/impl=go                   ~        ~        ~        ~        ~        ~        ~   +2.50%        ~        ~        ~        ~
RshVU/words=16/impl=go                   ~   +0.53%        ~        ~        ~        ~        ~   +3.82%        ~        ~        ~        ~
RshVU/words=100/impl=go                  ~        ~        ~        ~        ~        ~        ~   +6.18%        ~        ~        ~        ~
RshVU/words=1000/impl=go                 ~        ~        ~        ~        ~        ~        ~   +7.00%        ~        ~        ~        ~
RshVU/words=10000/impl=go                ~        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~
RshVU/words=100000/impl=go               ~        ~        ~        ~        ~        ~        ~   +7.05%        ~        ~        ~        ~
MulAddVWW/words=1/impl=go          -10.34%   +4.43%  +10.62%   -1.62%   -4.74%   -2.86%  +11.75%        ~   -8.00%   +8.89%   +3.87%        ~
MulAddVWW/words=10/impl=go          -1.61%   -5.87%        ~   -8.30%   -4.55%   +0.87%        ~   -5.28%  -20.82%        ~        ~   -2.32%
MulAddVWW/words=16/impl=go          -2.96%   -5.28%        ~   -9.22%   -5.28%        ~        ~   -3.74%  -19.52%   -1.48%   -2.53%   -9.52%
MulAddVWW/words=100/impl=go         -3.89%   -7.53%   +1.93%  -10.49%   -4.87%   -8.27%        ~        ~   -0.65%   -0.61%   -7.59%  -20.61%
MulAddVWW/words=1000/impl=go        -0.45%   -3.91%   +4.54%  -11.46%   -4.69%   -8.53%        ~        ~   -0.05%        ~   -8.88%  -19.77%
MulAddVWW/words=10000/impl=go            ~   -3.30%   +4.10%  -11.34%   -4.10%   -9.43%        ~   -0.61%        ~   -0.55%   -8.21%  -18.48%
MulAddVWW/words=100000/impl=go      -0.30%   -3.03%   +4.31%  -11.55%   -4.41%   -9.74%        ~   -0.75%   +0.63%        ~   -7.80%  -19.82%
AddMulVVWW/words=1/impl=go               ~  +13.09%  +12.50%   -7.05%  -10.41%   +2.53%  +13.32%   -3.49%        ~  +15.56%   +3.62%        ~
AddMulVVWW/words=10/impl=go        -15.96%   -9.06%   -5.06%  -14.56%  -11.83%   -5.44%  -26.30%  -14.23%  -11.44%   -1.79%   -5.93%   -6.60%
AddMulVVWW/words=16/impl=go        -19.05%  -12.43%   -6.19%  -14.24%  -12.67%   -8.65%  -18.64%  -16.56%  -10.64%   -3.00%   -7.61%  -12.80%
AddMulVVWW/words=100/impl=go       -22.13%  -16.59%  -13.04%  -13.79%  -11.46%  -12.01%   -6.46%  -21.80%   -5.08%   -3.13%  -13.60%  -22.53%
AddMulVVWW/words=1000/impl=go      -17.07%  -17.05%  -14.08%  -13.59%  -12.13%  -11.21%        ~  -22.81%   -4.27%   -1.27%  -16.35%  -23.47%
AddMulVVWW/words=10000/impl=go     -15.03%  -16.78%  -14.23%  -13.86%  -11.84%  -11.69%        ~  -22.75%  -13.39%   -1.10%  -14.37%  -22.01%
AddMulVVWW/words=100000/impl=go    -13.70%  -14.90%  -14.26%  -13.55%  -12.04%  -11.63%        ~  -22.61%        ~   -2.53%  -10.42%  -23.16%

Change-Id: Ic6f64344484a762b818c7090d1396afceb638607
Reviewed-on: https://go-review.googlesource.com/c/go/+/665155
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-04-19 08:19:27 -07:00
Russ Cox
7f516a31b0 math/big: replace assembly with mini-compiler output
Step 4 of the mini-compiler: switch to the new generated assembly.
No systematic performance regressions, and many many improvements.

In the benchmarks, the systems are:

	c3h88     GOARCH=amd64     c3h88 perf gomote (newer Intel, Google Cloud)
	c2s16     GOARCH=amd64     c2s16 perf gomote (Intel, Google Cloud)
	s7        GOARCH=amd64     rsc basement server (AMD Ryzen 9 7950X)
	386       GOARCH=386       gotip-linux-386 gomote (Intel, Google Cloud)
	s7-386    GOARCH=386       rsc basement server (AMD Ryzen 9 7950X)
	c4as16    GOARCH=arm64     c4as16 perf gomote (Google Cloud)
	mac       GOARCH=arm64     Apple M3 Pro in MacBook Pro
	arm       GOARCH=arm       gotip-linux-arm gomote
	loong64   GOARCH=loong64   gotip-linux-loong64 gomote
	ppc64le   GOARCH=ppc64le   gotip-linux-ppc64le gomote
	riscv64   GOARCH=riscv64   gotip-linux-riscv64 gomote
	s390x     GOARCH=s390x     linux-s390x-ibm old gomote

benchmark \ system           c3h88    c2s16       s7      386   s7-386   c4as16      mac      arm  loong64  ppc64le  riscv64    s390x
AddVV/words=1               -4.03%   +5.21%   -4.04%   +4.94%        ~        ~        ~        ~  -19.51%        ~        ~        ~
AddVV/words=10             -10.20%   +0.34%   -3.46%  -11.50%   -7.46%   +7.66%   +5.97%        ~  -17.90%        ~        ~        ~
AddVV/words=16             -10.91%   -6.45%   -8.45%  -21.86%  -17.90%   +2.73%   -1.61%        ~  -22.47%   -3.54%        ~        ~
AddVV/words=100             -3.77%   -4.30%   -3.17%  -47.27%  -45.34%   -0.78%        ~   -8.74%  -27.19%        ~        ~        ~
AddVV/words=1000            -0.08%   -0.71%        ~  -49.21%  -48.07%        ~        ~  -16.80%  -24.74%        ~        ~        ~
AddVV/words=10000                ~        ~        ~  -48.73%  -48.56%   -0.06%        ~  -17.08%        ~        ~   -4.81%        ~
AddVV/words=100000               ~        ~        ~  -47.80%  -48.38%        ~        ~  -15.10%  -25.06%        ~   -5.34%        ~
SubVV/words=1               -0.84%   +3.43%   -3.62%   +1.34%        ~   -0.76%        ~        ~  -18.18%   +5.58%        ~        ~
SubVV/words=10              -9.99%   +0.34%        ~  -11.23%   -8.24%   +7.53%   +6.15%        ~  -17.55%   +2.77%   -2.08%        ~
SubVV/words=16             -11.94%   -6.45%   -6.81%  -21.82%  -18.11%   +1.58%   -1.21%        ~  -20.36%        ~        ~        ~
SubVV/words=100             -3.38%   -4.32%   -1.80%  -46.14%  -46.43%   +0.41%        ~   -7.20%  -26.17%        ~   -0.42%        ~
SubVV/words=1000            -0.38%   -0.80%        ~  -49.22%  -48.90%        ~        ~  -15.86%  -24.73%        ~        ~        ~
SubVV/words=10000                ~        ~        ~  -49.57%  -49.64%   -0.03%        ~  -15.85%  -26.52%        ~   -5.05%        ~
SubVV/words=100000               ~        ~        ~  -46.88%  -49.66%        ~        ~  -15.45%  -16.11%        ~   -4.99%        ~
LshVU/words=1                    ~   +5.78%        ~        ~   -2.48%   +1.61%   +2.18%   +2.70%  -18.16%  -34.16%  -21.29%        ~
LshVU/words=10             -18.34%   -3.78%   +2.21%        ~        ~   -2.81%  -12.54%        ~  -25.02%  -24.78%  -38.11%  -66.98%
LshVU/words=16             -23.15%   +1.03%   +7.74%   +0.73%        ~   +8.88%   +1.56%        ~  -25.37%  -28.46%  -41.27%        ~
LshVU/words=100            -32.85%   -8.86%   -2.58%        ~   +2.69%   +1.24%        ~  -20.63%  -44.14%  -42.68%  -53.09%        ~
LshVU/words=1000           -37.30%   -0.20%   +5.67%        ~        ~   +1.44%        ~  -27.83%  -45.01%  -37.07%  -57.02%  -46.57%
LshVU/words=10000          -36.84%   -2.30%   +3.82%        ~   +1.86%   +1.57%  -66.81%  -28.00%  -13.15%  -35.40%  -41.97%        ~
LshVU/words=100000         -40.30%        ~   +3.96%        ~        ~        ~        ~  -24.91%  -19.06%  -36.14%  -40.99%  -66.03%
RshVU/words=1               -3.17%   +4.76%   -4.06%   +4.31%   +4.55%        ~        ~        ~  -20.61%        ~  -26.20%  -51.33%
RshVU/words=10             -22.08%   -4.41%  -17.99%   +3.64%  -11.87%        ~  -16.30%        ~  -30.01%        ~  -40.37%  -63.05%
RshVU/words=16             -26.03%   -8.50%  -18.09%        ~  -17.52%   +6.50%        ~   -2.85%  -30.24%        ~  -42.93%  -63.13%
RshVU/words=100            -20.87%  -28.83%  -29.45%        ~  -26.25%   +1.46%   -1.14%  -16.20%  -45.65%  -16.20%  -53.66%  -77.27%
RshVU/words=1000           -24.03%  -21.37%  -26.71%        ~  -28.95%   +0.98%        ~  -18.82%  -45.21%  -23.55%  -57.09%  -71.18%
RshVU/words=10000          -24.56%  -22.44%  -27.01%        ~  -28.88%   +0.78%   -5.35%  -17.47%  -16.87%  -20.67%  -41.97%        ~
RshVU/words=100000         -23.36%  -15.65%  -27.54%        ~  -29.26%   +1.73%   -6.67%  -13.68%  -21.40%  -23.02%  -40.37%  -66.31%
MulAddVWW/words=1           +2.37%   +8.14%        ~   +4.10%   +3.71%        ~        ~        ~  -21.62%        ~   +1.12%        ~
MulAddVWW/words=10               ~   -2.72%  -15.15%   +8.04%        ~        ~        ~   -2.52%  -19.48%        ~   -6.18%        ~
MulAddVWW/words=16               ~   +1.49%        ~   +4.49%   +6.58%   -8.70%   -7.16%  -12.08%  -21.43%   -6.59%   -9.05%        ~
MulAddVWW/words=100         +0.37%   +1.11%   -4.51%  -13.59%        ~  -11.10%   -3.63%  -21.40%  -22.27%   -2.92%  -14.41%        ~
MulAddVWW/words=1000             ~   +0.90%   -7.13%  -18.94%        ~  -14.02%   -9.97%  -28.31%  -18.72%   -2.32%  -15.80%        ~
MulAddVWW/words=10000            ~   +1.08%   -6.75%  -19.10%        ~  -14.61%   -9.04%  -28.48%  -14.29%   -2.25%   -9.40%        ~
MulAddVWW/words=100000           ~        ~   -6.93%  -18.09%        ~  -14.33%   -9.66%  -28.92%  -16.63%   -2.43%   -8.23%        ~
AddMulVVWW/words=1          +2.30%   +4.83%  -11.37%   +4.58%        ~   -3.14%        ~        ~  -10.58%  +30.35%        ~        ~
AddMulVVWW/words=10         -3.27%        ~   +8.96%   +5.74%        ~   +2.67%   -1.44%   -7.64%  -13.41%        ~        ~        ~
AddMulVVWW/words=16         -6.12%        ~        ~        ~   +1.91%   -7.90%  -16.22%  -14.07%  -14.26%   -4.15%   -7.30%        ~
AddMulVVWW/words=100        -5.48%   -2.14%        ~   -9.40%   +9.98%   -1.43%  -12.35%  -18.56%  -21.94%        ~   -9.84%        ~
AddMulVVWW/words=1000      -11.35%   -3.40%   -3.64%  -11.04%  +12.82%   -1.33%  -15.63%  -20.50%  -20.95%        ~  -11.06%  -51.97%
AddMulVVWW/words=10000     -10.31%   -1.61%   -8.41%  -12.15%  +13.10%   -1.03%  -16.34%  -22.46%   -1.00%        ~  -10.33%  -49.80%
AddMulVVWW/words=100000    -13.71%        ~   -8.31%  -12.18%  +12.98%   -1.35%  -15.20%  -21.89%        ~        ~   -9.38%  -48.30%

Change-Id: I0a33c33602c0d053c84d9946e662500cfa048e2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/664938
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-19 08:19:23 -07:00
Russ Cox
39070da4f8 math/big: add shift and mul to mini-compiler
Step 3 of the mini-compiler: add the generators for the shift and mul routines.

Change-Id: I981d5b7086262c740036f5db768d3e63083984e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/664937
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-19 08:19:20 -07:00
Russ Cox
2a88106617 math/big: add all architectures to mini-compiler
Step 2 of the mini-compiler: add all the remaining architectures.

Change-Id: I8c5283aa8baa497785a5c15f2248528fa9ae886e
Reviewed-on: https://go-review.googlesource.com/c/go/+/664936
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2025-04-19 08:19:17 -07:00
Russ Cox
8cc98a04ef math/big: new mini-compiler for arith assembly
The arith assembly is big enough, and the details that you have to keep
in mind are complex enough and varied enough, that it is worth using
a Go program to generate the assembly. That way, all the architectures
can use the same algorithms, and porting to new architectures will be
easier.

This is the first of a sequence of CLs to introduce a new mini-compiler
for generating the arith assembly, in math/big/internal/asmgen.
This CL has the basics of the compiler as well as a couple simple
architectures and the generator for addVV/subVV. It does not check
in the generated assembly yet. That will happen in a followup CL after
the other architectures and generators have been added.

Change-Id: Ib704c60fd972fc5690ac04d8fae3712ee2c1a80a
Reviewed-on: https://go-review.googlesource.com/c/go/+/664935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2025-04-19 08:19:14 -07:00
Russ Cox
a11643df8f math/big: replace addVW/subVW assembly with fast pure Go
The vast majority of the time, carry propagation is limited and
addVW/subVW only need to consider a single word for carry propagation.
As Josh Bleecher-Snyder pointed out in 2019 (CL 164968), once carrying
is done, the remaining words can be handled faster with copy (memmove).
In the benchmarks below, this is the data=random case.

Even more important, if the source and destination are the same,
the copy can be optimized away entirely, making a small in-place
addition to a big.Int O(1) instead of O(N). To date, only a few
systems (amd64, arm64, and pure Go, meaning wasm) make use of this
asymptotic improvement. This is the data=shortcut case.

This CL deletes the addVW/subVW assembly and replaces it with
an optimized pure Go version. Using Go makes it easy to call
the real copy builtin, which will use optimized memmove code,
instead of recreating a worse memmove in assembly (as arm64 does)
or omitting the copy optimization entirely (as most others do).

The worst case for the Go version versus assembly is the case
of incrementing 2^N-1 by 1, which has to propagate a carry
the entire length of the array. This is the data=carry case.
On balance, we believe this case is rare enough to be worth
taking a hit in that case, in exchange for significant wins
in the other cases and the deletion of significant amounts of
assembly of varying quality. (Remember that half the assembly has
the copy optimization and shortcut, while half does not.)

In the benchmarks, the systems are:

	c2s16     GOARCH=amd64     c2s16 perf gomote (Intel, Google Cloud)
	c3h88     GOARCH=amd64     c3h88 perf gomote (newer Intel, Google Cloud)
	s7        GOARCH=amd64     rsc basement server (AMD Ryzen 9 7950X)
	c4as16    GOARCH=arm64     c4as16 perf gomote (Google Cloud)
	mac       GOARCH=arm64     Apple M3 Pro in MacBook Pro
	386       GOARCH=386       gotip-linux-386 gomote
	arm       GOARCH=arm       gotip-linux-arm gomote
	loong64   GOARCH=loong64   gotip-linux-loong64 gomote
	ppc64le   GOARCH=ppc64le   gotip-linux-ppc64le gomote
	riscv64   GOARCH=riscv64   gotip-linux-riscv64 gomote

benchmark \ system                    c2s16     c3h88       s7    c4as16       mac       386      arm  loong64   ppc64le  riscv64

AddVW/words=1/data=random            -1.15%    -1.74%   -5.89%    -9.80%   -11.54%   +23.71%  -12.74%  -14.25%   +14.67%  +10.27%
AddVW/words=2/data=random            -2.59%         ~   -4.38%   -19.31%   -15.41%   +24.80%        ~  -19.99%   +13.73%  +19.71%
AddVW/words=3/data=random            -3.75%   -19.10%   -3.79%   -23.15%   -17.04%   +20.04%  -10.07%  -23.20%         ~  +15.39%
AddVW/words=4/data=random            -2.84%    +7.05%   -8.77%   -22.64%   -15.77%   +16.01%   -7.36%  -28.22%         ~  +23.00%
AddVW/words=5/data=random           -10.97%    +2.16%  -12.09%   -20.89%   -17.14%    +9.42%   -4.69%  -32.60%         ~  +10.07%
AddVW/words=6/data=random            -9.87%         ~   -7.54%   -19.08%    -6.46%         ~   -3.44%  -34.61%         ~  +12.19%
AddVW/words=7/data=random           -14.36%         ~  -10.09%   -19.10%   -10.47%    -6.20%   -5.06%  -38.14%   -11.54%   +6.79%
AddVW/words=8/data=random           -17.50%         ~  -11.06%   -25.14%   -12.88%    -8.35%   -5.11%  -41.39%   -14.04%  +11.87%
AddVW/words=9/data=random           -19.76%    -4.05%  -15.47%   -24.08%   -16.50%   -12.34%  -21.56%  -44.25%   -14.82%        ~
AddVW/words=10/data=random          -13.89%         ~   -9.69%   -23.06%    -8.04%   -12.58%  -19.25%  -32.80%   -11.68%        ~
AddVW/words=16/data=random          -29.36%   -15.35%  -21.86%   -25.04%   -19.89%   -32.26%  -16.29%  -42.66%   -25.92%   -3.01%
AddVW/words=32/data=random          -39.02%   -28.76%  -39.87%   -11.22%    -2.85%   -55.40%  -31.17%  -55.37%   -37.92%  -16.28%
AddVW/words=64/data=random          -25.94%   -19.09%  -20.60%    -6.90%    +8.91%   -51.00%  -43.72%  -62.27%   -44.11%  -28.74%
AddVW/words=100/data=random         -22.79%   -18.13%  -18.25%         ~   +33.89%   -67.40%  -51.77%  -63.54%   -53.75%  -30.97%
AddVW/words=1000/data=random         -8.98%    -3.84%        ~    -3.15%         ~   -93.35%  -63.92%  -65.66%   -68.67%  -42.30%
AddVW/words=10000/data=random        -1.38%    -0.38%        ~         ~         ~   -89.16%  -65.18%  -44.65%   -70.35%  -20.08%
AddVW/words=100000/data=random            ~         ~        ~         ~         ~   -87.03%  -64.51%  -36.08%   -61.40%  -16.53%

SubVW/words=1/data=random            -3.67%         ~   -8.38%   -10.26%    -3.07%   +45.78%   -6.06%  -11.17%         ~        ~
SubVW/words=2/data=random            -3.48%   -10.07%   -5.76%   -20.14%    -8.45%   +44.28%        ~  -19.09%         ~  +16.98%
SubVW/words=3/data=random            -7.11%   -26.64%   -4.48%   -22.07%    -9.21%   +35.61%        ~  -23.93%   -18.20%        ~
SubVW/words=4/data=random            -4.23%    +7.19%   -8.95%   -22.62%   -13.89%   +33.20%   -8.96%  -29.96%         ~  +22.23%
SubVW/words=5/data=random           -11.49%    +1.92%  -10.86%   -22.27%   -17.53%   +24.48%   -2.88%  -35.19%   -19.55%        ~
SubVW/words=6/data=random            -7.67%         ~   -7.72%   -18.44%    -6.24%   +12.03%   -2.00%  -39.68%   -10.73%        ~
SubVW/words=7/data=random           -13.69%   -18.32%  -11.82%   -18.92%   -11.57%    +6.63%        ~  -43.54%   -30.81%        ~
SubVW/words=8/data=random           -16.02%         ~  -11.07%   -24.50%   -11.92%    +4.32%   -3.01%  -46.95%   -24.14%        ~
SubVW/words=9/data=random           -18.76%    -3.34%  -14.84%   -23.79%   -17.50%         ~  -21.80%  -49.98%   -29.62%        ~
SubVW/words=10/data=random          -13.23%         ~   -9.25%   -21.26%   -11.63%         ~  -18.58%  -39.19%   -20.09%        ~
SubVW/words=16/data=random          -28.25%   -13.24%  -22.66%   -27.18%   -19.13%   -23.38%  -20.24%  -51.01%   -28.06%   -3.05%
SubVW/words=32/data=random          -38.41%   -28.88%  -40.12%   -11.20%    -2.80%   -49.17%  -34.67%  -63.29%   -39.25%  -15.20%
SubVW/words=64/data=random          -25.51%   -19.24%  -22.20%    -6.57%    +9.98%   -48.52%  -48.14%  -69.50%   -49.44%  -27.92%
SubVW/words=100/data=random         -21.69%   -18.51%        ~    +1.92%   +34.42%   -65.88%  -54.67%  -71.24%   -58.88%  -30.71%
SubVW/words=1000/data=random         -9.81%    -4.05%   -2.14%    -3.06%         ~   -93.37%  -67.33%  -74.12%   -68.36%  -42.17%
SubVW/words=10000/data=random             ~    -0.52%        ~         ~         ~   -88.87%  -68.54%  -44.94%   -70.63%  -19.95%
SubVW/words=100000/data=random            ~         ~        ~         ~         ~   -86.69%  -68.09%  -48.36%   -62.42%  -19.32%

AddVW/words=1/data=shortcut         -29.38%   -25.38%  -27.37%   -23.15%   -25.41%    +3.01%  -33.60%  -36.12%   -15.76%        ~
AddVW/words=2/data=shortcut         -32.79%   -34.72%  -31.47%   -24.47%   -28.21%    -3.75%  -34.66%  -43.89%   -23.65%  -21.56%
AddVW/words=3/data=shortcut         -38.50%   -46.83%  -35.67%   -26.38%   -30.29%   -10.41%  -44.89%  -47.68%   -30.93%  -26.85%
AddVW/words=4/data=shortcut         -40.40%   -28.85%  -34.19%   -29.83%   -32.95%   -16.09%  -42.86%  -51.02%   -34.19%  -26.69%
AddVW/words=5/data=shortcut         -43.87%   -35.42%  -36.46%   -32.59%   -37.72%   -20.82%  -45.14%  -54.01%   -35.49%  -30.48%
AddVW/words=6/data=shortcut         -46.98%   -39.34%  -42.22%   -35.43%   -38.18%   -27.46%  -46.72%  -56.61%   -40.21%  -34.07%
AddVW/words=7/data=shortcut         -49.63%   -47.97%  -46.61%   -35.28%   -41.93%   -31.14%  -49.29%  -58.89%   -41.10%  -37.01%
AddVW/words=8/data=shortcut         -50.48%   -42.33%  -45.40%   -40.24%   -41.74%   -32.92%  -50.62%  -60.98%   -44.85%  -38.10%
AddVW/words=9/data=shortcut         -54.27%   -43.52%  -49.06%   -42.16%   -45.22%   -37.57%  -51.84%  -62.91%   -46.04%  -40.82%
AddVW/words=10/data=shortcut        -56.01%   -45.40%  -51.42%   -43.29%   -46.14%   -38.65%  -53.65%  -64.62%   -47.05%  -43.21%
AddVW/words=16/data=shortcut        -62.73%   -55.66%  -59.31%   -56.38%   -54.31%   -53.16%  -61.03%  -72.29%   -58.24%  -52.57%
AddVW/words=32/data=shortcut        -74.00%   -69.42%  -71.75%   -33.65%   -37.35%   -71.73%  -72.59%  -82.44%   -70.87%  -67.69%
AddVW/words=64/data=shortcut        -56.69%   -52.72%  -52.09%   -35.48%   -36.87%   -84.24%  -83.10%  -90.37%   -82.56%  -80.81%
AddVW/words=100/data=shortcut       -56.68%   -53.18%  -51.49%   -33.49%   -37.72%   -89.95%  -88.21%  -93.37%   -88.47%  -86.52%
AddVW/words=1000/data=shortcut      -56.68%   -52.45%  -51.66%   -35.31%   -36.65%   -98.88%  -98.62%  -99.24%   -98.78%  -98.41%
AddVW/words=10000/data=shortcut     -56.70%   -52.40%  -51.92%   -33.49%   -36.98%   -99.89%  -99.86%  -99.92%   -99.87%  -99.91%
AddVW/words=100000/data=shortcut    -56.67%   -52.46%  -52.38%   -35.31%   -37.20%   -99.99%  -99.99%  -99.99%   -99.99%  -99.99%

SubVW/words=1/data=shortcut         -29.80%   -20.71%  -26.94%   -23.24%   -25.33%   +26.97%  -32.02%  -37.85%   -40.20%  -12.67%
SubVW/words=2/data=shortcut         -35.47%   -36.38%  -31.93%   -25.43%   -30.18%   +18.96%  -33.48%  -46.48%   -39.38%  -18.65%
SubVW/words=3/data=shortcut         -39.22%   -49.96%  -36.90%   -25.82%   -30.96%   +12.53%  -40.67%  -51.07%   -43.71%  -23.78%
SubVW/words=4/data=shortcut         -40.46%   -24.90%  -34.66%   -29.87%   -33.97%    +4.60%  -42.32%  -54.92%   -42.83%  -22.45%
SubVW/words=5/data=shortcut         -43.84%   -34.17%  -38.00%   -32.55%   -37.27%    -2.46%  -43.09%  -58.18%   -45.70%  -26.45%
SubVW/words=6/data=shortcut         -47.69%   -37.49%  -42.73%   -35.90%   -37.73%    -8.52%  -46.55%  -61.01%   -44.00%  -30.14%
SubVW/words=7/data=shortcut         -49.45%   -50.66%  -46.88%   -34.77%   -41.64%   -14.46%  -48.92%  -63.46%   -50.47%  -33.39%
SubVW/words=8/data=shortcut         -50.45%   -39.31%  -47.14%   -40.47%   -41.70%   -15.77%  -50.21%  -65.64%   -47.71%  -34.01%
SubVW/words=9/data=shortcut         -54.28%   -43.07%  -49.42%   -41.34%   -44.99%   -19.39%  -51.55%  -67.61%   -56.92%  -36.82%
SubVW/words=10/data=shortcut        -56.85%   -47.88%  -50.92%   -42.76%   -45.67%   -23.60%  -53.04%  -69.34%   -60.18%  -39.43%
SubVW/words=16/data=shortcut        -62.36%   -54.83%  -58.80%   -55.83%   -53.74%   -41.04%  -60.16%  -76.75%   -60.56%  -48.63%
SubVW/words=32/data=shortcut        -73.68%   -68.64%  -71.57%   -33.52%   -37.34%   -64.73%  -72.67%  -85.89%   -71.87%  -64.56%
SubVW/words=64/data=shortcut        -56.68%   -51.66%  -52.56%   -34.75%   -37.54%   -80.30%  -83.58%  -92.39%   -83.41%  -78.70%
SubVW/words=100/data=shortcut       -56.68%   -50.97%  -51.57%   -33.68%   -36.78%   -87.42%  -88.53%  -94.84%   -88.87%  -84.96%
SubVW/words=1000/data=shortcut      -56.68%   -50.89%  -52.10%   -34.94%   -37.77%   -98.59%  -98.71%  -99.43%   -98.80%  -98.20%
SubVW/words=10000/data=shortcut     -56.68%   -51.00%  -52.44%   -33.65%   -37.27%   -99.86%  -99.87%  -99.94%   -99.88%  -99.90%
SubVW/words=100000/data=shortcut    -56.68%   -50.80%  -52.20%   -34.79%   -37.46%   -99.99%  -99.99%  -99.99%   -99.99%  -99.99%

AddVW/words=1/data=carry             -0.51%    -5.29%  -24.03%   -26.48%         ~         ~  -33.14%  -30.23%         ~  -20.74%
AddVW/words=2/data=carry             -6.36%         ~  -21.05%   -39.40%         ~   +10.72%  -29.12%  -31.34%         ~  -17.29%
AddVW/words=3/data=carry                  ~         ~  -17.46%   -19.53%   +17.58%         ~  -26.23%  -23.61%    +7.80%  -14.34%
AddVW/words=4/data=carry            +19.02%   +16.80%        ~         ~   +28.25%         ~  -27.90%  -20.31%   +19.16%        ~
AddVW/words=5/data=carry             +3.97%   +53.02%        ~         ~   +11.31%         ~  -19.05%  -17.47%   +16.81%        ~
AddVW/words=6/data=carry             +2.98%   +19.83%        ~         ~   +14.84%         ~  -18.48%  -14.92%   +18.25%        ~
AddVW/words=7/data=carry                  ~         ~        ~         ~   +27.17%         ~  -15.50%  -12.74%   +13.00%        ~
AddVW/words=8/data=carry             +0.58%   +22.32%        ~    +6.10%   +29.63%         ~  -13.04%        ~   +28.46%   +2.95%
AddVW/words=9/data=carry                  ~   +31.53%        ~         ~   +14.42%         ~  -11.32%        ~   +18.37%   +3.28%
AddVW/words=10/data=carry            +3.94%   +22.36%        ~    +6.29%   +19.22%         ~  -11.27%        ~   +20.10%   +3.91%
AddVW/words=16/data=carry            +2.82%   +14.23%        ~   +10.06%   +25.91%   -16.12%        ~        ~   +52.28%  +10.40%
AddVW/words=32/data=carry                 ~   +25.35%  +13.66%         ~   +34.89%   -34.39%   +6.51%  -18.71%   +41.06%  +19.42%
AddVW/words=64/data=carry           -42.03%         ~  -39.70%    +6.65%   +32.29%   -39.94%  +14.34%        ~   +19.68%  +20.86%
AddVW/words=100/data=carry          -33.95%   -34.28%  -39.65%         ~   +27.72%   -26.80%  +17.40%        ~   +26.39%  +23.32%
AddVW/words=1000/data=carry         -42.49%   -47.87%  -47.44%    +1.25%    +4.25%   -41.76%  +23.40%        ~   +25.48%  +27.99%
AddVW/words=10000/data=carry        -41.85%   -48.49%  -49.43%         ~         ~   -42.09%  +24.61%  -10.32%   +40.55%  +18.35%
AddVW/words=100000/data=carry       -28.18%   -48.13%  -48.24%    +1.35%         ~   -42.90%  +24.73%   -9.79%   +22.55%  +17.16%

SubVW/words=1/data=carry            -10.32%   -17.16%  -24.14%   -26.24%         ~   +18.43%  -34.10%  -29.54%    -9.57%        ~
SubVW/words=2/data=carry            -19.45%   -23.31%  -20.74%   -39.73%         ~   +15.74%  -28.13%  -30.21%         ~  -18.74%
SubVW/words=3/data=carry                  ~   -16.18%  -15.34%   -19.54%   +17.62%   +12.39%  -27.64%  -27.09%         ~  -14.97%
SubVW/words=4/data=carry            +11.67%   +24.42%        ~         ~   +25.11%   +14.07%  -28.08%  -26.18%         ~        ~
SubVW/words=5/data=carry             +8.08%   +25.64%        ~         ~   +10.35%    +8.12%  -21.75%  -25.50%         ~   -4.86%
SubVW/words=6/data=carry                  ~   +13.82%        ~         ~   +12.92%    +6.79%  -20.25%  -24.70%         ~   -2.74%
SubVW/words=7/data=carry                  ~         ~   +8.29%    +4.51%   +26.59%    +4.62%  -18.01%  -24.09%         ~   -1.26%
SubVW/words=8/data=carry                  ~   +23.16%  +16.19%    +6.16%   +25.46%    +6.74%  -15.57%  -22.74%         ~   +1.44%
SubVW/words=9/data=carry                  ~   +30.71%  +20.81%         ~   +12.36%         ~  -12.99%        ~         ~   +3.13%
SubVW/words=10/data=carry            +5.03%   +19.53%  +14.84%   +14.16%   +16.12%         ~  -11.64%  -16.00%   +15.45%   +3.29%
SubVW/words=16/data=carry           +14.42%   +15.58%  +33.07%   +11.43%   +24.65%         ~        ~  -21.90%   +25.59%   +9.40%
SubVW/words=32/data=carry                 ~   +27.57%  +46.58%         ~   +35.35%    -8.49%        ~  -24.04%   +11.86%  +18.40%
SubVW/words=64/data=carry           -24.34%   -27.83%  -20.90%   +13.34%   +37.17%   -14.90%        ~   -8.81%   +12.88%  +18.92%
SubVW/words=100/data=carry          -25.19%   -34.70%  -27.45%   +12.86%   +28.42%   -14.48%        ~        ~   +25.71%  +21.93%
SubVW/words=1000/data=carry         -24.93%   -47.86%  -47.26%    +2.66%         ~   -23.88%        ~        ~   +25.99%  +27.81%
SubVW/words=10000/data=carry        -24.17%   -36.48%  -49.41%    +1.06%         ~   -25.06%        ~  -26.50%   +27.94%  +18.36%
SubVW/words=100000/data=carry       -22.51%   -35.86%  -49.46%    +3.96%         ~   -25.18%        ~  -22.15%   +26.86%  +15.44%

Change-Id: I8f252073040e674780ac6ec9912082fb205329dd
Reviewed-on: https://go-review.googlesource.com/c/go/+/664898
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-18 15:07:59 -07:00
Russ Cox
b44b360dd4 math/big: add more complete tests and benchmarks of assembly
Also fix a few real but currently harmless bugs from CL 664895.
There were a few places that were still wrong if z != x or if a != 0.

Change-Id: Id8971e2505523bc4708780c82bf998a546f4f081
Reviewed-on: https://go-review.googlesource.com/c/go/+/664897
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-04-18 14:14:06 -07:00
Russ Cox
930cf59ba8 regexp/syntax: recognize category aliases like \p{Letter}
The Unicode specification defines aliases for some of the general
category names. For example the category "L" has alias "Letter".

The regexp package supports \p{L} but not \p{Letter}, because there
was nothing in the Unicode tables that lets regexp know about Letter.
Now that package unicode provides CategoryAliases (see #70780),
we can use it to provide \p{Letter} as well.

This is the only feature missing from making package regexp suitable
for use in a JSON-API Schema implementation. (The official test suite
includes usage of aliases like \p{Letter} instead of \p{L}.)

For better conformity with Unicode TR18, also accept case-insensitive
matches for names and ignore underscores, hyphens, and spaces;
and add Any, ASCII, and Assigned.

Fixes #70781.

Change-Id: I50ff024d99255338fa8d92663881acb47f1e92a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/641377
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-04-18 14:13:38 -07:00
Russ Cox
28fd9fa8a6 unicode: add CategoryAliases, Cn, LC
CategoryAliases is for regexp to use, for things like \p{Letter} as an alias for \p{L}.
Cn and LC are special-case categories that were never implemented
but should have been.

These changes were generated by the updated generator in CL 641395.

Fixes #70780.

Change-Id: Ibba20ff76191c8ae9631ac5ba19965790fe0cc81
Reviewed-on: https://go-review.googlesource.com/c/go/+/641376
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-04-18 14:13:31 -07:00
Michael Pratt
252c939445 internal/runtime/maps: move tombstone test to swiss file
This test fails on GOEXPERIMENT=noswissmap as it is testing behavior
specific to swissmaps. Move it to map_swiss_test.go to skip it on
noswissmap.

We could also switch the test to use NewTestMap, which provides a
swissmap even in GOEXPERIMENT=noswissmap, but that is tedious to use and
noswissmap is going away soon anyway.

For #70886.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-noswissmap
Change-Id: I6a6a636c5ec72217d936cd01e9da36ae127ea2c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/666437
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-18 08:30:30 -07:00
Damien Neil
0e17905793 encoding/json: add json/v2 with GOEXPERIMENT=jsonv2 guard
This imports the proposed new v2 JSON API implemented in
github.com/go-json-experiment/json as of commit
d3c622f1b874954c355e60c8e6b6baa5f60d2fed.

When GOEXPERIMENT=jsonv2 is set, the encoding/json/v2 and
encoding/jsontext packages are visible, the encoding/json
package is implemented in terms of encoding/json/v2, and
the encoding/json package include various additional APIs.
(See #71497 for details.)

When GOEXPERIMENT=jsonv2 is not set, the new API is not
present and the encoding/json package is unchanged.

The experimental API is not bound by the Go compatibility
promise and is expected to evolve as updates are made to
the json/v2 proposal.

The contents of encoding/json/internal/jsontest/testdata
are compressed with zstd v1.5.7 with the -19 option.

Fixes #71845
For #71497

Change-Id: Ib8c94e5f0586b6aaa22833190b41cf6ef59f4f01
Reviewed-on: https://go-review.googlesource.com/c/go/+/665796
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-04-18 08:24:07 -07:00
apocelipes
c889004615 internal,runtime: use the builtin clear
To simplify the code.

Change-Id: I023de705504c0b580718eec3c7c563b6cf2c8184
GitHub-Last-Rev: 026b32c799b13d0c7ded54f2e61429e6c5ed0aa8
GitHub-Pull-Request: golang/go#73412
Reviewed-on: https://go-review.googlesource.com/c/go/+/666118
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-18 04:21:17 -07:00
apocelipes
8a8efafa88 cmd/compile: use the builtin clear
To simplify the code a bit.

Change-Id: Ia72f576de59ff161ec389a4992bb635f89783540
GitHub-Last-Rev: eaec8216be964418a085649fcca53a042f28ce1a
GitHub-Pull-Request: golang/go#73411
Reviewed-on: https://go-review.googlesource.com/c/go/+/666117
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-04-18 04:21:12 -07:00