mirror of
https://github.com/golang/go.git
synced 2025-05-17 05:14:40 +00:00
47 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
44b54b99c9 |
strings,bytes: improve Repeat panic messages
The Repeat("-", maxInt) call should produce panic: runtime error: makeslice: len out of range instead of panic: strings: Repeat output length overflow This PR is only for theory perfection. Change-Id: If67d87b147d666fbbb7238656f2a0cb6cf1dbb5b GitHub-Last-Rev: 29dc0cb9c9c63d8a008960b4527d6aa6798c1c17 GitHub-Pull-Request: golang/go#67068 Reviewed-on: https://go-review.googlesource.com/c/go/+/581936 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> |
||
|
66b776b025 |
runtime: short path for equal pointers in arm64 memequal
If memequal is invoked with the same pointers as arguments it ends up comparing the whole memory contents, instead of just comparing the pointers. This effectively makes an operation that could be O(1) into O(n). All the other architectures already have this optimization in place. For instance, arm64 also have it, in memequal_varlen. Such optimization is very specific, one case that it will probably benefit is programs that rely heavily on interning of strings. goos: darwin goarch: arm64 pkg: bytes │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ Equal/same/1-8 2.678n ± ∞ ¹ 2.400n ± ∞ ¹ -10.38% (p=0.008 n=5) Equal/same/6-8 3.267n ± ∞ ¹ 2.431n ± ∞ ¹ -25.59% (p=0.008 n=5) Equal/same/9-8 2.981n ± ∞ ¹ 2.385n ± ∞ ¹ -19.99% (p=0.008 n=5) Equal/same/15-8 2.974n ± ∞ ¹ 2.390n ± ∞ ¹ -19.64% (p=0.008 n=5) Equal/same/16-8 2.983n ± ∞ ¹ 2.380n ± ∞ ¹ -20.21% (p=0.008 n=5) Equal/same/20-8 3.567n ± ∞ ¹ 2.384n ± ∞ ¹ -33.17% (p=0.008 n=5) Equal/same/32-8 3.568n ± ∞ ¹ 2.385n ± ∞ ¹ -33.16% (p=0.008 n=5) Equal/same/4K-8 78.040n ± ∞ ¹ 2.378n ± ∞ ¹ -96.95% (p=0.008 n=5) Equal/same/4M-8 78713.000n ± ∞ ¹ 2.385n ± ∞ ¹ -100.00% (p=0.008 n=5) Equal/same/64M-8 1348095.000n ± ∞ ¹ 2.381n ± ∞ ¹ -100.00% (p=0.008 n=5) geomean 43.52n 2.390n -94.51% ¹ need >= 6 samples for confidence interval at level 0.95 │ old.txt │ new.txt │ │ B/s │ B/s vs base │ Equal/same/1-8 356.1Mi ± ∞ ¹ 397.3Mi ± ∞ ¹ +11.57% (p=0.008 n=5) Equal/same/6-8 1.711Gi ± ∞ ¹ 2.298Gi ± ∞ ¹ +34.35% (p=0.008 n=5) Equal/same/9-8 2.812Gi ± ∞ ¹ 3.515Gi ± ∞ ¹ +24.99% (p=0.008 n=5) Equal/same/15-8 4.698Gi ± ∞ ¹ 5.844Gi ± ∞ ¹ +24.41% (p=0.008 n=5) Equal/same/16-8 4.995Gi ± ∞ ¹ 6.260Gi ± ∞ ¹ +25.34% (p=0.008 n=5) Equal/same/20-8 5.222Gi ± ∞ ¹ 7.814Gi ± ∞ ¹ +49.63% (p=0.008 n=5) Equal/same/32-8 8.353Gi ± ∞ ¹ 12.496Gi ± ∞ ¹ +49.59% (p=0.008 n=5) Equal/same/4K-8 48.88Gi ± ∞ ¹ 1603.96Gi ± ∞ ¹ +3181.17% (p=0.008 n=5) Equal/same/4M-8 49.63Gi ± ∞ ¹ 1637911.85Gi ± ∞ ¹ +3300381.91% (p=0.008 n=5) Equal/same/64M-8 46.36Gi ± ∞ ¹ 26253069.97Gi ± ∞ ¹ +56626517.99% (p=0.008 n=5) geomean 6.737Gi 122.7Gi +1721.01% ¹ need >= 6 samples for confidence interval at level 0.95 Fixes #64381 Change-Id: I7d423930a688edd88c4ba60d45e097296d9be852 GitHub-Last-Rev: ae8189fafb1cba87b5394f09f971746ae9299273 GitHub-Pull-Request: golang/go#64419 Reviewed-on: https://go-review.googlesource.com/c/go/+/545416 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> |
||
|
03cd8a7b0e |
internal/bytealg: fix alignment code in equal_riscv64.s
The riscv64 implementation of equal has an optimization that is applied when both pointers share the same alignment but that alignment is not 8 bytes. In this case it tries to align both pointers to an 8 byte boundaries, by individually comparing the first few bytes of each buffer. Unfortunately, the existing code is incorrect. It adjusts the pointers by the wrong number of bytes resulting, in most cases, in pointers that are not 8 byte aligned. This commit fixes the issue by individually comparing the first (8 - (pointer & 7)) bytes of each buffer rather than the first (pointer & 7) bytes. This particular optimization is not covered by any of the existing benchmarks so a new benchmark, BenchmarkEqualBothUnaligned, is provided. The benchmark tests the case where both pointers have the same alignment but may not be 8 byte aligned. Results of the new benchmark along with some of the existing benchmarks generated on a SiFive HiFive Unmatched A00 with 16GB of RAM running Ubuntu 23.04 are presented below. Equal/0-4 3.356n ± 0% 3.357n ± 0% ~ (p=0.840 n=10) Equal/1-4 63.91n ± 7% 65.97n ± 5% +3.22% (p=0.029 n=10) Equal/6-4 72.94n ± 5% 76.09n ± 4% ~ (p=0.075 n=10) Equal/9-4 84.61n ± 7% 85.83n ± 3% ~ (p=0.315 n=10) Equal/15-4 103.7n ± 2% 102.9n ± 4% ~ (p=0.739 n=10) Equal/16-4 89.14n ± 3% 100.40n ± 4% +12.64% (p=0.000 n=10) Equal/20-4 107.8n ± 3% 106.8n ± 3% ~ (p=0.725 n=10) Equal/32-4 63.95n ± 8% 67.79n ± 7% ~ (p=0.089 n=10) Equal/4K-4 1.256µ ± 1% 1.254µ ± 0% ~ (p=0.925 n=10) Equal/4M-4 1.231m ± 0% 1.230m ± 0% -0.04% (p=0.011 n=10) Equal/64M-4 19.77m ± 0% 19.78m ± 0% ~ (p=0.052 n=10) EqualBothUnaligned/64_0-4 43.70n ± 4% 44.40n ± 5% ~ (p=0.529 n=10) EqualBothUnaligned/64_1-4 6957.5n ± 0% 105.9n ± 1% -98.48% (p=0.000 n=10) EqualBothUnaligned/64_4-4 100.1n ± 2% 101.5n ± 4% ~ (p=0.149 n=10) EqualBothUnaligned/64_7-4 6965.00n ± 0% 95.60n ± 4% -98.63% (p=0.000 n=10) EqualBothUnaligned/4096_0-4 1.233µ ± 1% 1.225µ ± 0% -0.65% (p=0.015 n=10) EqualBothUnaligned/4096_1-4 584.226µ ± 0% 1.277µ ± 0% -99.78% (p=0.000 n=10) EqualBothUnaligned/4096_4-4 1.270µ ± 1% 1.268µ ± 0% ~ (p=0.105 n=10) EqualBothUnaligned/4096_7-4 584.944µ ± 0% 1.266µ ± 1% -99.78% (p=0.000 n=10) EqualBothUnaligned/4194304_0-4 1.241m ± 0% 1.236m ± 0% -0.38% (p=0.035 n=10) EqualBothUnaligned/4194304_1-4 600.956m ± 0% 1.238m ± 0% -99.79% (p=0.000 n=10) EqualBothUnaligned/4194304_4-4 1.239m ± 0% 1.241m ± 0% +0.22% (p=0.007 n=10) EqualBothUnaligned/4194304_7-4 601.036m ± 0% 1.239m ± 0% -99.79% (p=0.000 n=10) EqualBothUnaligned/67108864_0-4 19.79m ± 0% 19.78m ± 0% ~ (p=0.393 n=10) EqualBothUnaligned/67108864_1-4 9616.61m ± 0% 19.82m ± 0% -99.79% (p=0.000 n=10) EqualBothUnaligned/67108864_4-4 19.82m ± 0% 19.82m ± 0% ~ (p=0.971 n=10) EqualBothUnaligned/67108864_7-4 9616.34m ± 0% 19.86m ± 0% -99.79% (p=0.000 n=10) geomean 38.38µ 7.194µ -81.26% Change-Id: I4caab6c3450bd7e2773426b08b70bbc37fbe4e5f Reviewed-on: https://go-review.googlesource.com/c/go/+/500855 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
|
0b3f58c48e |
bytes, strings: add ContainsFunc
Fixes #54386. Change-Id: I78747da337ed6129e4f7426dd0483a644bed82e3 Reviewed-on: https://go-review.googlesource.com/c/go/+/460216 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: hopehook <hopehook@golangcn.org> Auto-Submit: Ian Lance Taylor <iant@golang.org> |
||
|
e590afcf2c |
bytes, strings: rename field in CutSuffix tests
Change-Id: I63181f6540fc1bfcfc988a16bf9fafbd3575cfdf GitHub-Last-Rev: d90528730a92a087866c1bfc227a0a0bf1cdffbe GitHub-Pull-Request: golang/go#57909 Reviewed-on: https://go-review.googlesource.com/c/go/+/462284 Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
|
dcb90152a4 |
bytes,strings: optimize Repeat
When generating long strings or slices with Repeat we currently reuse intermediate states as a way to quickly build exponentially longer results. This works well as long as the intermediate states fit into the processor D-cache. If they don't we start thrashing the D-cache by reading in the whole intermediate state over and over on each iteration. Instead, once we reach a large enough intermediate state (that allows the memcpy operation to perform at peak) we cap the size of chunk of the state that is used as source for subsequent appends. This ensures that this smaller source chunk is always present in the D-cache, and the append operation does not need to read the state contents from memory. Currently the cap is set to 8KB, a number derived via experimentation to yield the highest performance across a a large range of result sizes. Slightly higher caps also produced similar results: 8KB was chosen as the smallest one in this performance plateau with the intention to minimize D-cache pollution. For result sizes larger than the fastest cache levels we get significantly higher performance compared to the current implementation: strings: name old speed new speed delta RepeatLarge/256/1-16 1.73GB/s ± 1% 1.73GB/s ± 0% ~ (p=0.556 n=5+4) RepeatLarge/256/16-16 2.02GB/s ± 0% 1.95GB/s ± 8% ~ (p=0.222 n=5+5) RepeatLarge/512/1-16 2.30GB/s ±13% 2.47GB/s ± 1% ~ (p=0.548 n=5+5) RepeatLarge/512/16-16 2.38GB/s ±16% 2.77GB/s ± 1% +16.27% (p=0.032 n=5+5) RepeatLarge/1024/1-16 3.17GB/s ± 1% 3.18GB/s ± 0% ~ (p=0.730 n=4+5) RepeatLarge/1024/16-16 3.39GB/s ± 2% 3.38GB/s ± 1% ~ (p=0.548 n=5+5) RepeatLarge/2048/1-16 3.32GB/s ± 2% 3.32GB/s ± 2% ~ (p=1.000 n=5+5) RepeatLarge/2048/16-16 3.41GB/s ± 4% 3.46GB/s ± 2% ~ (p=0.310 n=5+5) RepeatLarge/4096/1-16 3.60GB/s ± 4% 3.67GB/s ± 3% ~ (p=0.690 n=5+5) RepeatLarge/4096/16-16 3.74GB/s ± 3% 3.71GB/s ± 5% ~ (p=0.690 n=5+5) RepeatLarge/8192/1-16 3.94GB/s ± 4% 4.01GB/s ± 1% ~ (p=0.222 n=5+5) RepeatLarge/8192/16-16 3.94GB/s ± 6% 4.05GB/s ± 1% ~ (p=0.222 n=5+5) RepeatLarge/8192/4097-16 4.25GB/s ± 6% 4.32GB/s ± 3% ~ (p=0.690 n=5+5) RepeatLarge/16384/1-16 4.96GB/s ± 1% 5.02GB/s ± 2% ~ (p=0.421 n=5+5) RepeatLarge/16384/16-16 4.99GB/s ± 2% 5.07GB/s ± 1% ~ (p=0.421 n=5+5) RepeatLarge/16384/4097-16 5.15GB/s ± 3% 5.17GB/s ± 1% ~ (p=1.000 n=5+5) RepeatLarge/32768/1-16 5.44GB/s ± 2% 5.42GB/s ± 1% ~ (p=0.841 n=5+5) RepeatLarge/32768/16-16 5.46GB/s ± 4% 5.44GB/s ± 1% ~ (p=0.905 n=5+4) RepeatLarge/32768/4097-16 4.84GB/s ± 2% 4.59GB/s ±12% -5.05% (p=0.032 n=5+5) RepeatLarge/65536/1-16 5.85GB/s ± 0% 5.84GB/s ± 1% ~ (p=0.690 n=5+5) RepeatLarge/65536/16-16 5.81GB/s ± 2% 5.84GB/s ± 2% ~ (p=0.421 n=5+5) RepeatLarge/65536/4097-16 5.38GB/s ± 6% 5.45GB/s ± 1% ~ (p=1.000 n=5+5) RepeatLarge/131072/1-16 6.20GB/s ± 1% 6.31GB/s ± 1% +1.80% (p=0.008 n=5+5) RepeatLarge/131072/16-16 6.12GB/s ± 3% 6.25GB/s ± 3% ~ (p=0.095 n=5+5) RepeatLarge/131072/4097-16 5.95GB/s ± 1% 5.85GB/s ±10% ~ (p=1.000 n=5+5) RepeatLarge/262144/1-16 6.33GB/s ± 1% 6.56GB/s ± 0% +3.62% (p=0.016 n=5+4) RepeatLarge/262144/16-16 6.42GB/s ± 0% 6.65GB/s ± 1% +3.58% (p=0.016 n=4+5) RepeatLarge/262144/4097-16 6.31GB/s ± 1% 6.44GB/s ± 1% +1.94% (p=0.008 n=5+5) RepeatLarge/524288/1-16 6.23GB/s ± 1% 6.92GB/s ± 3% +11.02% (p=0.008 n=5+5) RepeatLarge/524288/16-16 6.24GB/s ± 1% 6.97GB/s ± 2% +11.77% (p=0.016 n=4+5) RepeatLarge/524288/4097-16 6.14GB/s ± 2% 6.73GB/s ± 3% +9.50% (p=0.008 n=5+5) RepeatLarge/1048576/1-16 5.23GB/s ± 1% 6.53GB/s ± 6% +24.85% (p=0.008 n=5+5) RepeatLarge/1048576/16-16 5.21GB/s ± 1% 6.56GB/s ± 4% +25.93% (p=0.008 n=5+5) RepeatLarge/1048576/4097-16 5.22GB/s ± 1% 6.26GB/s ± 2% +20.09% (p=0.008 n=5+5) RepeatLarge/2097152/1-16 3.95GB/s ± 1% 5.96GB/s ± 1% +51.01% (p=0.008 n=5+5) RepeatLarge/2097152/16-16 3.94GB/s ± 1% 5.98GB/s ± 2% +51.99% (p=0.008 n=5+5) RepeatLarge/2097152/4097-16 4.94GB/s ± 1% 5.71GB/s ± 2% +15.63% (p=0.008 n=5+5) RepeatLarge/4194304/1-16 3.10GB/s ± 1% 5.89GB/s ± 1% +89.90% (p=0.008 n=5+5) RepeatLarge/4194304/16-16 3.09GB/s ± 1% 5.86GB/s ± 1% +89.89% (p=0.008 n=5+5) RepeatLarge/4194304/4097-16 3.13GB/s ± 1% 5.89GB/s ± 1% +88.36% (p=0.008 n=5+5) RepeatLarge/8388608/1-16 3.06GB/s ± 1% 6.31GB/s ±16% +105.84% (p=0.008 n=5+5) RepeatLarge/8388608/16-16 3.08GB/s ± 1% 6.62GB/s ± 1% +114.66% (p=0.008 n=5+5) RepeatLarge/8388608/4097-16 3.13GB/s ± 2% 6.87GB/s ± 1% +119.62% (p=0.008 n=5+5) RepeatLarge/16777216/1-16 3.21GB/s ± 3% 5.88GB/s ± 1% +83.27% (p=0.008 n=5+5) RepeatLarge/16777216/16-16 3.23GB/s ± 2% 5.84GB/s ± 2% +80.49% (p=0.008 n=5+5) RepeatLarge/16777216/4097-16 3.30GB/s ± 6% 5.88GB/s ± 2% +78.18% (p=0.008 n=5+5) RepeatLarge/33554432/1-16 3.71GB/s ± 3% 5.91GB/s ± 2% +59.17% (p=0.008 n=5+5) RepeatLarge/33554432/16-16 3.67GB/s ± 3% 5.91GB/s ± 2% +61.13% (p=0.008 n=5+5) RepeatLarge/33554432/4097-16 3.71GB/s ± 1% 5.77GB/s ± 6% +55.51% (p=0.008 n=5+5) RepeatLarge/67108864/1-16 4.61GB/s ±11% 6.00GB/s ± 5% +30.15% (p=0.008 n=5+5) RepeatLarge/67108864/16-16 4.62GB/s ± 7% 6.11GB/s ± 2% +32.35% (p=0.008 n=5+5) RepeatLarge/67108864/4097-16 4.71GB/s ± 2% 6.24GB/s ± 2% +32.60% (p=0.008 n=5+5) RepeatLarge/134217728/1-16 4.53GB/s ± 8% 6.28GB/s ±11% +38.57% (p=0.008 n=5+5) RepeatLarge/134217728/16-16 4.78GB/s ± 3% 6.36GB/s ± 3% +33.16% (p=0.008 n=5+5) RepeatLarge/134217728/4097-16 4.73GB/s ± 6% 6.46GB/s ± 3% +36.63% (p=0.008 n=5+5) RepeatLarge/268435456/1-16 4.09GB/s ±25% 6.37GB/s ±19% +56.00% (p=0.008 n=5+5) RepeatLarge/268435456/16-16 4.50GB/s ± 4% 6.86GB/s ± 0% +52.49% (p=0.016 n=5+4) RepeatLarge/268435456/4097-16 4.73GB/s ± 5% 6.90GB/s ± 0% +45.94% (p=0.008 n=5+5) RepeatLarge/536870912/1-16 4.38GB/s ±36% 6.52GB/s ± 8% +48.68% (p=0.008 n=5+5) RepeatLarge/536870912/16-16 4.69GB/s ±12% 6.90GB/s ± 1% +46.97% (p=0.008 n=5+5) RepeatLarge/536870912/4097-16 4.87GB/s ± 8% 6.98GB/s ± 0% +43.36% (p=0.008 n=5+5) RepeatLarge/1073741824/1-16 3.87GB/s ±28% 6.96GB/s ± 1% +79.94% (p=0.016 n=5+4) RepeatLarge/1073741824/16-16 4.79GB/s ± 9% 6.93GB/s ± 0% +44.79% (p=0.008 n=5+5) RepeatLarge/1073741824/4097-16 4.65GB/s ± 8% 7.02GB/s ± 1% +51.02% (p=0.008 n=5+5) bytes: name old speed new speed delta RepeatLarge/256/1-16 1.93GB/s ± 1% 1.84GB/s ± 1% -4.81% (p=0.000 n=10+10) RepeatLarge/256/16-16 2.25GB/s ± 2% 2.15GB/s ± 1% -4.45% (p=0.000 n=9+8) RepeatLarge/512/1-16 2.71GB/s ± 1% 2.62GB/s ± 1% -3.27% (p=0.000 n=10+9) RepeatLarge/512/16-16 2.96GB/s ± 4% 2.91GB/s ± 1% ~ (p=0.243 n=9+10) RepeatLarge/1024/1-16 3.35GB/s ± 1% 3.27GB/s ± 1% -2.61% (p=0.000 n=9+10) RepeatLarge/1024/16-16 3.56GB/s ± 2% 3.52GB/s ± 1% -1.10% (p=0.010 n=10+9) RepeatLarge/2048/1-16 3.52GB/s ± 1% 3.45GB/s ± 1% -1.92% (p=0.000 n=10+10) RepeatLarge/2048/16-16 3.61GB/s ± 1% 3.58GB/s ± 0% -0.82% (p=0.008 n=9+8) RepeatLarge/4096/1-16 3.85GB/s ± 2% 3.80GB/s ± 2% ~ (p=0.165 n=10+10) RepeatLarge/4096/16-16 3.88GB/s ± 3% 3.84GB/s ± 4% ~ (p=0.393 n=10+10) RepeatLarge/8192/1-16 4.12GB/s ± 2% 4.04GB/s ± 1% -1.96% (p=0.000 n=10+10) RepeatLarge/8192/16-16 4.11GB/s ± 2% 4.09GB/s ± 1% ~ (p=0.278 n=9+10) RepeatLarge/8192/4097-16 4.38GB/s ± 1% 4.39GB/s ± 4% ~ (p=0.720 n=9+10) RepeatLarge/16384/1-16 5.06GB/s ± 2% 4.95GB/s ± 3% -2.29% (p=0.001 n=10+9) RepeatLarge/16384/16-16 5.11GB/s ± 3% 5.06GB/s ± 3% ~ (p=0.315 n=10+9) RepeatLarge/16384/4097-16 5.22GB/s ± 3% 5.26GB/s ± 3% ~ (p=0.211 n=9+10) RepeatLarge/32768/1-16 5.54GB/s ± 2% 5.50GB/s ± 3% ~ (p=0.353 n=10+10) RepeatLarge/32768/16-16 5.55GB/s ± 1% 5.60GB/s ± 1% +0.91% (p=0.035 n=10+9) RepeatLarge/32768/4097-16 4.88GB/s ± 2% 4.85GB/s ± 2% ~ (p=0.447 n=10+9) RepeatLarge/65536/1-16 5.86GB/s ± 1% 5.93GB/s ± 2% +1.18% (p=0.043 n=8+10) RepeatLarge/65536/16-16 5.83GB/s ± 2% 5.98GB/s ± 1% +2.67% (p=0.000 n=10+10) RepeatLarge/65536/4097-16 5.57GB/s ± 0% 5.56GB/s ± 3% ~ (p=0.696 n=8+10) RepeatLarge/131072/1-16 6.23GB/s ± 1% 6.38GB/s ± 2% +2.51% (p=0.000 n=9+10) RepeatLarge/131072/16-16 6.21GB/s ± 2% 6.37GB/s ± 1% +2.72% (p=0.000 n=9+10) RepeatLarge/131072/4097-16 6.04GB/s ± 1% 6.09GB/s ± 3% ~ (p=0.356 n=9+10) RepeatLarge/262144/1-16 6.47GB/s ± 1% 6.63GB/s ± 2% +2.57% (p=0.003 n=10+10) RepeatLarge/262144/16-16 6.45GB/s ± 2% 6.69GB/s ± 2% +3.65% (p=0.000 n=10+10) RepeatLarge/262144/4097-16 6.35GB/s ± 1% 6.51GB/s ± 2% +2.48% (p=0.000 n=9+10) RepeatLarge/524288/1-16 6.21GB/s ± 2% 6.95GB/s ± 1% +11.95% (p=0.000 n=10+10) RepeatLarge/524288/16-16 6.24GB/s ± 2% 6.93GB/s ± 2% +11.11% (p=0.000 n=10+10) RepeatLarge/524288/4097-16 6.18GB/s ± 2% 6.82GB/s ± 1% +10.39% (p=0.000 n=9+10) RepeatLarge/1048576/1-16 5.34GB/s ± 2% 6.41GB/s ± 2% +20.05% (p=0.000 n=10+10) RepeatLarge/1048576/16-16 5.33GB/s ± 1% 6.45GB/s ± 2% +20.84% (p=0.000 n=10+9) RepeatLarge/1048576/4097-16 5.28GB/s ± 1% 6.17GB/s ± 2% +16.75% (p=0.000 n=10+10) RepeatLarge/2097152/1-16 4.04GB/s ± 1% 6.21GB/s ± 1% +53.89% (p=0.000 n=9+8) RepeatLarge/2097152/16-16 4.02GB/s ± 1% 6.20GB/s ± 2% +54.37% (p=0.000 n=10+9) RepeatLarge/2097152/4097-16 4.94GB/s ± 1% 6.04GB/s ± 1% +22.36% (p=0.000 n=10+10) RepeatLarge/4194304/1-16 3.10GB/s ± 1% 5.74GB/s ± 0% +85.04% (p=0.000 n=10+9) RepeatLarge/4194304/16-16 3.10GB/s ± 2% 5.72GB/s ± 1% +84.26% (p=0.000 n=9+10) RepeatLarge/4194304/4097-16 3.03GB/s ± 4% 5.61GB/s ± 1% +85.06% (p=0.000 n=10+9) RepeatLarge/8388608/1-16 3.08GB/s ± 2% 6.25GB/s ± 1% +103.09% (p=0.000 n=9+9) RepeatLarge/8388608/16-16 3.07GB/s ± 2% 6.26GB/s ± 3% +104.07% (p=0.000 n=10+9) RepeatLarge/8388608/4097-16 3.08GB/s ± 2% 6.23GB/s ± 2% +102.09% (p=0.000 n=9+10) RepeatLarge/16777216/1-16 3.25GB/s ± 2% 5.78GB/s ± 3% +78.03% (p=0.000 n=9+9) RepeatLarge/16777216/16-16 3.25GB/s ± 1% 5.75GB/s ± 1% +77.21% (p=0.000 n=9+10) RepeatLarge/16777216/4097-16 3.29GB/s ± 3% 5.72GB/s ± 2% +73.74% (p=0.000 n=10+10) RepeatLarge/33554432/1-16 3.68GB/s ± 2% 5.90GB/s ± 1% +60.20% (p=0.000 n=10+10) RepeatLarge/33554432/16-16 3.69GB/s ± 3% 5.88GB/s ± 1% +59.54% (p=0.000 n=10+9) RepeatLarge/33554432/4097-16 3.74GB/s ± 1% 5.94GB/s ± 2% +58.68% (p=0.000 n=7+10) RepeatLarge/67108864/1-16 4.62GB/s ±12% 6.11GB/s ± 3% +32.23% (p=0.000 n=10+9) RepeatLarge/67108864/16-16 4.77GB/s ± 2% 6.09GB/s ± 2% +27.88% (p=0.000 n=9+9) RepeatLarge/67108864/4097-16 4.78GB/s ± 1% 6.19GB/s ± 1% +29.51% (p=0.000 n=9+10) RepeatLarge/134217728/1-16 4.60GB/s ±16% 6.52GB/s ± 9% +41.67% (p=0.000 n=10+10) RepeatLarge/134217728/16-16 4.80GB/s ± 4% 6.81GB/s ± 2% +41.82% (p=0.000 n=10+9) RepeatLarge/134217728/4097-16 4.79GB/s ± 4% 6.81GB/s ± 2% +42.31% (p=0.000 n=9+10) RepeatLarge/268435456/1-16 4.43GB/s ±25% 6.27GB/s ±14% +41.52% (p=0.000 n=10+10) RepeatLarge/268435456/16-16 4.75GB/s ± 4% 6.68GB/s ± 4% +40.50% (p=0.000 n=9+10) RepeatLarge/268435456/4097-16 4.75GB/s ± 3% 6.58GB/s ± 4% +38.68% (p=0.000 n=9+10) RepeatLarge/536870912/1-16 4.96GB/s ± 9% 6.39GB/s ±16% +28.90% (p=0.000 n=8+10) RepeatLarge/536870912/16-16 4.66GB/s ± 6% 6.57GB/s ± 7% +40.82% (p=0.000 n=10+9) RepeatLarge/536870912/4097-16 4.68GB/s ±11% 6.88GB/s ± 3% +47.01% (p=0.000 n=10+9) RepeatLarge/1073741824/1-16 4.39GB/s ±23% 6.57GB/s ± 5% +49.75% (p=0.000 n=10+8) RepeatLarge/1073741824/16-16 4.73GB/s ±13% 6.89GB/s ± 1% +45.68% (p=0.000 n=9+8) RepeatLarge/1073741824/4097-16 4.97GB/s ±15% 6.73GB/s ± 9% +35.45% (p=0.000 n=10+10) The results above come from a Intel i9-9980HK (256KB L2) with TurboBoost disabled. Change-Id: I79dd57da0429aee9020ffd7bc458a034b999b740 Reviewed-on: https://go-review.googlesource.com/c/go/+/419054 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
|
2a627afe13 |
bytes: simplify code using unsafe.SliceData
Updates #54854 Change-Id: I9c14f9fa595f73eae44eb714abc5d486915893c1 Reviewed-on: https://go-review.googlesource.com/c/go/+/428155 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> |
||
|
68005592b3 |
strings, bytes: add CutPrefix and CutSuffix
Fixes #42537 Change-Id: Ie03c2614ffee30ebe707acad6b9f6c28fb134a45 Reviewed-on: https://go-review.googlesource.com/c/go/+/407176 Reviewed-by: Benny Siegert <bsiegert@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Changkun Ou <mail@changkun.de> Reviewed-by: Ian Lance Taylor <iant@google.com> |
||
|
7b45edb450 |
bytes: add Clone function
The new Clone function returns a copy of b[:len(b)] for the input byte slice b. The result may have additional unused capacity. Clone(nil) returns nil. Fixes #45038 Change-Id: I0469a202d77a7b491f1341c08915d07ddd1f0300 Reviewed-on: https://go-review.googlesource.com/c/go/+/359675 Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Joseph Tsai <joetsai@digital-static.net> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
||
|
9a4685f220 |
strings: avoid utf8.RuneError mangling in Split
Split should only split strings and not perform mangling of invalid UTF-8 into ut8.RuneError. The prior behavior is clearly a bug since mangling is not performed in all other situations (e.g., separator is non-empty). Fixes #53511 Change-Id: I112a2ef15ee46ddecda015ee14bca04cd76adfbf Reviewed-on: https://go-review.googlesource.com/c/go/+/413715 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
|
7ca1e2aa56 |
internal/bytealg: optimize index function for ppc64le/power9
Optimized index2to16 loop by unrolling the loop by 4. Multiple benchmark tests show performance improvement on POWER9. Similar improvements are seen on POWER10. Added tests to ensure changes work fine. name old time/op new time/op delta Index/10 18.3ns ± 0% 19.7ns ±25% ~ Index/32 75.3ns ± 0% 69.2ns ± 0% -8.22% Index/4K 5.53µs ± 0% 3.69µs ± 0% -33.20% Index/4M 5.64ms ± 0% 3.75ms ± 0% -33.55% Index/64M 92.9ms ± 0% 61.6ms ± 0% -33.69% IndexHard2 1.41ms ± 0% 0.93ms ± 0% -33.75% CountHard2 1.41ms ± 0% 0.93ms ± 0% -33.75% Change-Id: If9331df6a141a4716724b8cb648d2b91bdf17e5f Reviewed-on: https://go-review.googlesource.com/c/go/+/377016 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Archana Ravindar <aravind5@in.ibm.com> |
||
|
5bb2628c6f |
bytes: limit allocation in SplitN
So that bytes.SplitN("", "T", int(144115188075855872)) does not panic. Change-Id: I7c068852bd708416164fc2ed8b84cf6b2d593666 GitHub-Last-Rev: f8df09d65e2bc889fbd0c736bfb5e9a9078dfced GitHub-Pull-Request: golang/go#52147 Reviewed-on: https://go-review.googlesource.com/c/go/+/398076 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: mzh <mzh@golangcn.org> |
||
|
32fdad19a2 |
bytes: restore old Trim/TrimLeft behavior for nil
Keep returning nil for the cases where we historically returned nil, even though this is slightly different for TrimLeft and TrimRight. Fixes #51793 Change-Id: Ifbdfc6b09d52b8e063cfe6341019f9b2eb8b70e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/393876 Trust: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> |
||
|
8e36ab0551 |
bytes, strings: add Cut
Using Cut is a clearer way to write the vast majority (>70%) of existing code that calls Index, IndexByte, IndexRune, and SplitN. There is more discussion on https://golang.org/issue/46336. Fixes #46336. Change-Id: Ia418ed7c3706c65bf61e1b2c5baf534cb783e4d3 Reviewed-on: https://go-review.googlesource.com/c/go/+/351710 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
5baf60d472 |
bytes, strings: optimize Trim for single byte cutsets
Using the latest version of all modules known by the module proxy, we determine that for all Trim usages (and related functionality): * 76.6% have cutsets of len=1, and * 13.4% have cutsets of len=2. Given that a vast majority of usages only have a cutset of len=1, we should more heavily optimize for that situation. Previously, there was some optimization for cutsets of len=1, but it's within the internal makeCutsetFunc function. This is sub-optimal as it incurs an allocation in makeCutsetFunc for the closure over that single byte. This CL removes special-casing of one-byte cutsets from makeCutsetFunc and instead distributes it directly in Trim, TrimRight, and TrimLeft. Whether we should distribute the entire ASCII cutset logic into Trim is a future CL that should be discussed and handled separately. The evidence for multibyte cutsets is not as obviously compelling. name old time/op new time/op delta bytes/TrimByte-4 84.1ns ± 2% 7.5ns ± 1% -91.10% (p=0.000 n=9+7) strings/TrimByte-4 86.2ns ± 3% 8.3ns ± 1% -90.33% (p=0.000 n=9+10) Fixes #46446 Change-Id: Ia0e31a8384c3ce111ae35465605bcec45df2ebec Reviewed-on: https://go-review.googlesource.com/c/go/+/323318 Trust: Joe Tsai <joetsai@digital-static.net> Run-TryBot: Joe Tsai <joetsai@digital-static.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Go Bot <gobot@golang.org> |
||
|
3af92acb9f |
strings, bytes: improve IndexAny and LastIndexAny performance
For the case of a pattern containing multi-byte rune, the time complexity of the previous algorithm is O(nm), and if both input arguments are long, the search performance will be poor. This CL improves the searching performance for these cases by using IndexRune, which is mainly implemented with IndexByte and Index. As IndexByte and Index are specially optimized with some powerful instructions for short patterns (an UTF8 rune is 1 to 4 bytes), so they can help to reduce the runtime complexity of IndexAny and LastIndexAny. Another optimization method is using hash table, however, the actual test results show that using indexrune is better, and the space complexity is lower. There are two fast paths in IndexAny and LastIndexAny for cases where the length of the input arguements are 1, and their locations are not exactly the same, which is determined based on the actual test results. Benchmarks on arm64 and amd64: name old time/op new time/op delta pkg:strings goos:linux goarch:arm64 IndexAnyASCII/1:1-8 23.7ns ± 3% 28.5ns ± 0% +20.15% (p=0.008 n=5+5) IndexAnyASCII/1:2-8 18.0ns ± 0% 33.1ns ± 0% +83.67% (p=0.008 n=5+5) IndexAnyASCII/1:4-8 20.0ns ± 0% 36.0ns ± 0% +80.00% (p=0.029 n=4+4) IndexAnyASCII/1:8-8 36.1ns ± 0% 36.0ns ± 0% ~ (p=0.095 n=5+4) IndexAnyASCII/1:16-8 48.1ns ± 0% 36.0ns ± 0% -25.19% (p=0.029 n=4+4) IndexAnyASCII/1:32-8 72.1ns ± 0% 36.0ns ± 0% -50.01% (p=0.008 n=5+5) IndexAnyASCII/1:64-8 120ns ± 0% 39ns ± 0% -67.83% (p=0.008 n=5+5) IndexAnyASCII/16:1-8 73.0ns ± 0% 28.5ns ± 0% -60.95% (p=0.008 n=5+5) IndexAnyASCII/16:2-8 76.8ns ± 0% 77.0ns ± 0% ~ (p=1.000 n=5+5) IndexAnyASCII/16:4-8 83.2ns ± 1% 83.0ns ± 0% ~ (p=0.770 n=5+5) IndexAnyASCII/16:8-8 111ns ± 1% 107ns ± 0% -3.25% (p=0.008 n=5+5) IndexAnyASCII/16:16-8 139ns ± 1% 137ns ± 0% -1.58% (p=0.008 n=5+5) IndexAnyASCII/16:32-8 199ns ± 1% 197ns ± 0% -1.20% (p=0.008 n=5+5) IndexAnyASCII/16:64-8 307ns ± 0% 313ns ± 0% +1.82% (p=0.016 n=5+4) IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 0% -90.31% (p=0.008 n=5+5) IndexAnyASCII/256:2-8 678ns ± 0% 683ns ± 0% +0.68% (p=0.008 n=5+5) IndexAnyASCII/256:4-8 685ns ± 0% 683ns ± 0% -0.29% (p=0.000 n=5+4) IndexAnyASCII/256:8-8 711ns ± 0% 708ns ± 0% -0.48% (p=0.008 n=5+5) IndexAnyASCII/256:16-8 740ns ± 0% 740ns ± 0% ~ (p=0.444 n=5+5) IndexAnyASCII/256:32-8 799ns ± 0% 798ns ± 0% -0.18% (p=0.008 n=5+5) IndexAnyASCII/256:64-8 910ns ± 0% 914ns ± 0% +0.44% (p=0.016 n=4+5) IndexAnyUTF8/1:1-8 27.1ns ± 0% 19.0ns ± 0% -29.79% (p=0.008 n=5+5) IndexAnyUTF8/1:2-8 44.1ns ± 0% 33.0ns ± 0% -25.17% (p=0.008 n=5+5) IndexAnyUTF8/1:4-8 46.1ns ± 0% 33.1ns ± 0% -28.29% (p=0.016 n=4+5) IndexAnyUTF8/1:8-8 85.1ns ± 0% 33.0ns ± 0% -61.18% (p=0.008 n=5+5) IndexAnyUTF8/1:16-8 110ns ± 1% 36ns ± 0% -67.27% (p=0.008 n=5+5) IndexAnyUTF8/1:32-8 188ns ± 0% 36ns ± 0% -80.85% (p=0.008 n=5+5) IndexAnyUTF8/1:64-8 332ns ± 0% 39ns ± 0% ~ (p=0.079 n=4+5) IndexAnyUTF8/16:1-8 293ns ± 0% 54ns ± 0% -81.56% (p=0.008 n=5+5) IndexAnyUTF8/16:2-8 563ns ± 0% 349ns ± 0% -37.98% (p=0.008 n=5+5) IndexAnyUTF8/16:4-8 546ns ± 1% 349ns ± 0% -36.10% (p=0.000 n=5+4) IndexAnyUTF8/16:8-8 1.22µs ± 0% 0.35µs ± 0% -71.39% (p=0.008 n=5+5) IndexAnyUTF8/16:16-8 1.63µs ± 1% 0.42µs ± 0% -73.98% (p=0.008 n=5+5) IndexAnyUTF8/16:32-8 2.87µs ± 0% 0.42µs ± 0% -85.22% (p=0.008 n=5+5) IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.47µs ± 0% -90.98% (p=0.008 n=5+5) IndexAnyUTF8/256:1-8 4.26µs ± 0% 0.47µs ± 0% -88.85% (p=0.000 n=4+5) IndexAnyUTF8/256:2-8 8.62µs ± 0% 5.15µs ± 0% -40.21% (p=0.008 n=5+5) IndexAnyUTF8/256:4-8 8.25µs ± 0% 5.15µs ± 0% -37.50% (p=0.016 n=5+4) IndexAnyUTF8/256:8-8 19.2µs ± 1% 5.2µs ± 0% -73.08% (p=0.016 n=5+4) IndexAnyUTF8/256:16-8 25.6µs ± 1% 6.3µs ± 0% -75.32% (p=0.008 n=5+5) IndexAnyUTF8/256:32-8 45.6µs ± 0% 6.3µs ± 0% -86.15% (p=0.008 n=5+5) IndexAnyUTF8/256:64-8 82.4µs ± 0% 7.0µs ± 0% -91.53% (p=0.016 n=5+4) LastIndexAnyASCII/1:1-8 23.0ns ± 0% 33.5ns ± 0% +45.65% (p=0.008 n=5+5) LastIndexAnyASCII/1:2-8 24.5ns ± 0% 33.5ns ± 0% +36.73% (p=0.016 n=4+5) LastIndexAnyASCII/1:4-8 27.5ns ± 0% 35.5ns ± 0% +29.09% (p=0.008 n=5+5) LastIndexAnyASCII/1:8-8 44.5ns ± 0% 35.5ns ± 0% -20.13% (p=0.008 n=5+5) LastIndexAnyASCII/1:16-8 56.5ns ± 0% 35.5ns ± 0% -37.15% (p=0.008 n=5+5) LastIndexAnyASCII/1:32-8 80.3ns ± 0% 35.5ns ± 0% -55.79% (p=0.000 n=5+4) LastIndexAnyASCII/1:64-8 129ns ± 0% 40ns ± 0% -68.85% (p=0.008 n=5+5) LastIndexAnyASCII/16:1-8 72.8ns ± 0% 72.7ns ± 0% -0.19% (p=0.016 n=4+5) LastIndexAnyASCII/16:2-8 75.4ns ± 0% 75.1ns ± 0% ~ (p=0.127 n=5+5) LastIndexAnyASCII/16:4-8 81.9ns ± 1% 80.2ns ± 0% -2.00% (p=0.008 n=5+5) LastIndexAnyASCII/16:8-8 110ns ± 1% 108ns ± 0% -1.46% (p=0.008 n=5+5) LastIndexAnyASCII/16:16-8 138ns ± 1% 134ns ± 0% -3.18% (p=0.008 n=5+5) LastIndexAnyASCII/16:32-8 198ns ± 0% 197ns ± 0% -0.51% (p=0.008 n=5+5) LastIndexAnyASCII/16:64-8 309ns ± 0% 313ns ± 0% +1.30% (p=0.008 n=5+5) LastIndexAnyASCII/256:1-8 652ns ± 0% 653ns ± 0% +0.21% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-8 656ns ± 0% 656ns ± 0% ~ (all equal) LastIndexAnyASCII/256:4-8 663ns ± 0% 663ns ± 0% ~ (p=0.444 n=5+5) LastIndexAnyASCII/256:8-8 691ns ± 0% 690ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyASCII/256:16-8 719ns ± 0% 715ns ± 0% -0.53% (p=0.000 n=5+4) LastIndexAnyASCII/256:32-8 779ns ± 0% 780ns ± 0% +0.13% (p=0.029 n=4+4) LastIndexAnyASCII/256:64-8 890ns ± 0% 894ns ± 0% +0.45% (p=0.008 n=5+5) LastIndexAnyUTF8/1:1-8 31.6ns ± 0% 33.5ns ± 0% +6.01% (p=0.008 n=5+5) LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 33.5ns ± 0% -30.99% (p=0.008 n=5+5) LastIndexAnyUTF8/1:4-8 48.6ns ± 0% 33.5ns ± 0% -31.13% (p=0.000 n=5+4) LastIndexAnyUTF8/1:8-8 89.6ns ± 0% 33.5ns ± 0% -62.56% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-8 113ns ± 1% 36ns ± 0% -68.47% (p=0.000 n=5+4) LastIndexAnyUTF8/1:32-8 190ns ± 0% 36ns ± 0% -81.26% (p=0.029 n=4+4) LastIndexAnyUTF8/1:64-8 327ns ± 0% 40ns ± 0% -87.77% (p=0.008 n=5+5) LastIndexAnyUTF8/16:1-8 364ns ± 0% 158ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyUTF8/16:2-8 636ns ± 0% 472ns ± 0% -25.79% (p=0.000 n=5+4) LastIndexAnyUTF8/16:4-8 630ns ± 0% 472ns ± 0% -25.03% (p=0.008 n=5+5) LastIndexAnyUTF8/16:8-8 1.28µs ± 0% 0.47µs ± 0% -63.09% (p=0.016 n=5+4) LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.53µs ± 0% -68.39% (p=0.016 n=5+4) LastIndexAnyUTF8/16:32-8 2.88µs ± 0% 0.53µs ± 0% -81.72% (p=0.008 n=5+5) LastIndexAnyUTF8/16:64-8 5.08µs ± 0% 0.57µs ± 0% -88.79% (p=0.008 n=5+5) LastIndexAnyUTF8/256:1-8 5.41µs ± 0% 2.03µs ± 0% -62.46% (p=0.016 n=4+5) LastIndexAnyUTF8/256:2-8 9.77µs ± 0% 7.14µs ± 0% -26.97% (p=0.008 n=5+5) LastIndexAnyUTF8/256:4-8 9.63µs ± 0% 7.14µs ± 0% -25.86% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-8 20.0µs ± 0% 7.1µs ± 0% -64.30% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-8 26.1µs ± 1% 8.0µs ± 0% -69.40% (p=0.008 n=5+5) LastIndexAnyUTF8/256:32-8 45.6µs ± 1% 8.0µs ± 0% -82.51% (p=0.008 n=5+5) LastIndexAnyUTF8/256:64-8 80.8µs ± 0% 8.6µs ± 0% -89.33% (p=0.016 n=5+4) pkg:bytes goos:linux goarch:arm64 IndexAnyASCII/1:1-8 26.2ns ± 1% 26.5ns ± 0% +1.30% (p=0.016 n=5+4) IndexAnyASCII/1:2-8 18.5ns ± 0% 26.5ns ± 0% +43.24% (p=0.008 n=5+5) IndexAnyASCII/1:4-8 21.0ns ± 0% 26.5ns ± 0% +26.38% (p=0.008 n=5+5) IndexAnyASCII/1:8-8 37.5ns ± 0% 26.5ns ± 0% -29.33% (p=0.000 n=5+4) IndexAnyASCII/1:16-8 49.6ns ± 0% 26.5ns ± 0% -46.49% (p=0.008 n=5+5) IndexAnyASCII/1:32-8 73.6ns ± 0% 30.1ns ± 0% -59.16% (p=0.008 n=5+5) IndexAnyASCII/1:64-8 122ns ± 0% 33ns ± 0% -73.23% (p=0.008 n=5+5) IndexAnyASCII/16:1-8 73.7ns ± 0% 33.4ns ± 0% -54.71% (p=0.008 n=5+5) IndexAnyASCII/16:2-8 79.1ns ± 0% 78.9ns ± 0% -0.30% (p=0.016 n=4+5) IndexAnyASCII/16:4-8 84.8ns ± 0% 86.1ns ± 0% +1.58% (p=0.016 n=5+4) IndexAnyASCII/16:8-8 111ns ± 0% 111ns ± 0% ~ (all equal) IndexAnyASCII/16:16-8 139ns ± 0% 144ns ± 0% +3.60% (p=0.016 n=4+5) IndexAnyASCII/16:32-8 196ns ± 0% 207ns ± 0% +5.61% (p=0.016 n=5+4) IndexAnyASCII/16:64-8 311ns ± 0% 320ns ± 0% +2.89% (p=0.016 n=4+5) IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 1% -90.35% (p=0.008 n=5+5) IndexAnyASCII/256:2-8 680ns ± 0% 680ns ± 0% ~ (p=0.444 n=5+5) IndexAnyASCII/256:4-8 686ns ± 0% 687ns ± 0% ~ (p=0.167 n=5+5) IndexAnyASCII/256:8-8 713ns ± 0% 712ns ± 0% -0.14% (p=0.008 n=5+5) IndexAnyASCII/256:16-8 740ns ± 0% 744ns ± 0% +0.54% (p=0.016 n=5+4) IndexAnyASCII/256:32-8 797ns ± 0% 808ns ± 0% +1.43% (p=0.008 n=5+5) IndexAnyASCII/256:64-8 912ns ± 0% 921ns ± 0% +0.99% (p=0.016 n=4+5) IndexAnyUTF8/1:1-8 27.5ns ± 0% 26.5ns ± 0% -3.64% (p=0.008 n=5+5) IndexAnyUTF8/1:2-8 44.5ns ± 0% 26.5ns ± 0% -40.50% (p=0.008 n=5+5) IndexAnyUTF8/1:4-8 45.6ns ± 0% 26.5ns ± 0% -41.89% (p=0.000 n=5+4) IndexAnyUTF8/1:8-8 85.8ns ± 1% 26.5ns ± 0% -69.11% (p=0.008 n=5+5) IndexAnyUTF8/1:16-8 110ns ± 1% 26ns ± 0% -76.00% (p=0.016 n=5+4) IndexAnyUTF8/1:32-8 188ns ± 0% 30ns ± 0% -84.04% (p=0.008 n=5+5) IndexAnyUTF8/1:64-8 333ns ± 0% 33ns ± 0% -90.20% (p=0.008 n=5+5) IndexAnyUTF8/16:1-8 294ns ± 0% 235ns ± 0% -20.07% (p=0.008 n=5+5) IndexAnyUTF8/16:2-8 563ns ± 0% 309ns ± 0% -45.12% (p=0.008 n=5+5) IndexAnyUTF8/16:4-8 558ns ± 1% 309ns ± 0% -44.60% (p=0.000 n=5+4) IndexAnyUTF8/16:8-8 1.23µs ± 0% 0.31µs ± 0% -74.79% (p=0.008 n=5+5) IndexAnyUTF8/16:16-8 1.62µs ± 2% 0.31µs ± 0% -80.93% (p=0.008 n=5+5) IndexAnyUTF8/16:32-8 2.86µs ± 0% 0.38µs ± 0% -86.87% (p=0.008 n=5+5) IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.42µs ± 0% -91.86% (p=0.008 n=5+5) IndexAnyUTF8/256:1-8 4.27µs ± 1% 3.30µs ± 1% -22.75% (p=0.008 n=5+5) IndexAnyUTF8/256:2-8 8.61µs ± 0% 4.45µs ± 0% -48.31% (p=0.016 n=4+5) IndexAnyUTF8/256:4-8 8.44µs ± 0% 4.45µs ± 0% -47.23% (p=0.008 n=5+5) IndexAnyUTF8/256:8-8 19.2µs ± 0% 4.5µs ± 0% -76.78% (p=0.008 n=5+5) IndexAnyUTF8/256:16-8 25.6µs ± 0% 4.5µs ± 0% -82.63% (p=0.008 n=5+5) IndexAnyUTF8/256:32-8 45.4µs ± 0% 5.5µs ± 0% -87.85% (p=0.016 n=4+5) IndexAnyUTF8/256:64-8 82.5µs ± 0% 6.2µs ± 0% -92.49% (p=0.008 n=5+5) LastIndexAnyASCII/1:1-8 23.0ns ± 0% 26.5ns ± 0% +15.02% (p=0.008 n=5+5) LastIndexAnyASCII/1:2-8 24.5ns ± 0% 26.5ns ± 0% +8.16% (p=0.008 n=5+5) LastIndexAnyASCII/1:4-8 27.8ns ± 0% 26.5ns ± 0% -4.68% (p=0.029 n=4+4) LastIndexAnyASCII/1:8-8 45.1ns ± 1% 26.5ns ± 0% -41.29% (p=0.000 n=5+4) LastIndexAnyASCII/1:16-8 57.1ns ± 0% 26.5ns ± 0% -53.61% (p=0.008 n=5+5) LastIndexAnyASCII/1:32-8 81.5ns ± 0% 30.0ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyASCII/1:64-8 129ns ± 0% 32ns ± 0% -74.81% (p=0.008 n=5+5) LastIndexAnyASCII/16:1-8 72.6ns ± 0% 72.1ns ± 0% -0.63% (p=0.000 n=4+5) LastIndexAnyASCII/16:2-8 77.2ns ± 0% 77.2ns ± 0% ~ (p=0.167 n=5+5) LastIndexAnyASCII/16:4-8 83.1ns ± 0% 83.2ns ± 0% ~ (p=0.444 n=5+5) LastIndexAnyASCII/16:8-8 109ns ± 1% 108ns ± 0% ~ (p=0.167 n=5+5) LastIndexAnyASCII/16:16-8 136ns ± 0% 136ns ± 0% ~ (all equal) LastIndexAnyASCII/16:32-8 195ns ± 0% 197ns ± 0% +0.82% (p=0.008 n=5+5) LastIndexAnyASCII/16:64-8 309ns ± 0% 309ns ± 0% ~ (all equal) LastIndexAnyASCII/256:1-8 653ns ± 0% 657ns ± 0% +0.61% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-8 659ns ± 0% 658ns ± 0% ~ (p=0.167 n=5+5) LastIndexAnyASCII/256:4-8 664ns ± 0% 663ns ± 0% ~ (p=0.095 n=5+4) LastIndexAnyASCII/256:8-8 698ns ± 0% 689ns ± 0% -1.29% (p=0.008 n=5+5) LastIndexAnyASCII/256:16-8 726ns ± 0% 717ns ± 0% -1.24% (p=0.008 n=5+5) LastIndexAnyASCII/256:32-8 777ns ± 0% 779ns ± 0% ~ (p=0.079 n=5+4) LastIndexAnyASCII/256:64-8 889ns ± 0% 890ns ± 0% ~ (p=0.444 n=5+5) LastIndexAnyUTF8/1:1-8 32.1ns ± 0% 26.5ns ± 0% -17.45% (p=0.000 n=5+4) LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 26.5ns ± 0% -45.52% (p=0.000 n=5+4) LastIndexAnyUTF8/1:4-8 49.6ns ± 0% 26.5ns ± 0% -46.62% (p=0.008 n=5+5) LastIndexAnyUTF8/1:8-8 91.9ns ± 0% 26.5ns ± 0% -71.18% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-8 114ns ± 1% 26ns ± 0% -76.84% (p=0.000 n=5+4) LastIndexAnyUTF8/1:32-8 203ns ± 6% 30ns ± 0% -85.25% (p=0.008 n=5+5) LastIndexAnyUTF8/1:64-8 330ns ± 0% 33ns ± 0% -90.14% (p=0.000 n=4+5) LastIndexAnyUTF8/16:1-8 365ns ± 0% 164ns ± 0% -55.04% (p=0.008 n=5+5) LastIndexAnyUTF8/16:2-8 638ns ± 0% 296ns ± 0% -53.58% (p=0.008 n=5+5) LastIndexAnyUTF8/16:4-8 634ns ± 0% 296ns ± 0% -53.31% (p=0.008 n=5+5) LastIndexAnyUTF8/16:8-8 1.30µs ± 0% 0.30µs ± 0% -77.18% (p=0.000 n=4+5) LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.30µs ± 0% -82.19% (p=0.008 n=5+5) LastIndexAnyUTF8/16:32-8 2.90µs ± 0% 0.38µs ± 0% -87.00% (p=0.029 n=4+4) LastIndexAnyUTF8/16:64-8 5.10µs ± 0% 0.42µs ± 0% -91.78% (p=0.008 n=5+5) LastIndexAnyUTF8/256:1-8 5.42µs ± 0% 2.12µs ± 0% -60.92% (p=0.008 n=5+5) LastIndexAnyUTF8/256:2-8 9.79µs ± 0% 4.26µs ± 0% -56.47% (p=0.008 n=5+5) LastIndexAnyUTF8/256:4-8 9.66µs ± 0% 4.26µs ± 0% -55.87% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-8 20.4µs ± 0% 4.3µs ± 0% -79.10% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-8 26.0µs ± 1% 4.3µs ± 0% -83.62% (p=0.008 n=5+5) LastIndexAnyUTF8/256:32-8 46.0µs ± 0% 5.5µs ± 0% -88.09% (p=0.008 n=5+5) LastIndexAnyUTF8/256:64-8 81.1µs ± 0% 6.2µs ± 0% -92.38% (p=0.008 n=5+5) name old time/op new time/op delta pkg:strings goos:linux goarch:amd64 IndexAnyASCII/1:1-48 10.0ns ± 0% 13.3ns ± 0% +33.00% (p=0.008 n=5+5) IndexAnyASCII/1:2-48 11.0ns ± 0% 15.5ns ± 0% +40.55% (p=0.016 n=4+5) IndexAnyASCII/1:4-48 12.9ns ± 0% 15.4ns ± 0% +19.69% (p=0.008 n=5+5) IndexAnyASCII/1:8-48 18.6ns ± 0% 15.5ns ± 0% -16.45% (p=0.000 n=4+5) IndexAnyASCII/1:16-48 30.1ns ± 0% 16.9ns ± 0% ~ (p=0.079 n=4+5) IndexAnyASCII/1:32-48 53.1ns ± 0% 18.6ns ± 0% -64.95% (p=0.000 n=5+4) IndexAnyASCII/1:64-48 98.9ns ± 0% 17.4ns ± 0% -82.41% (p=0.000 n=5+4) IndexAnyASCII/16:1-48 35.0ns ± 0% 14.2ns ± 0% -59.47% (p=0.000 n=5+4) IndexAnyASCII/16:2-48 35.5ns ± 0% 35.6ns ± 0% ~ (p=0.238 n=5+4) IndexAnyASCII/16:4-48 40.8ns ± 0% 40.7ns ± 1% ~ (p=0.643 n=5+5) IndexAnyASCII/16:8-48 50.8ns ± 0% 50.9ns ± 1% ~ (p=1.000 n=4+5) IndexAnyASCII/16:16-48 64.0ns ± 1% 64.5ns ± 1% ~ (p=0.071 n=5+5) IndexAnyASCII/16:32-48 98.3ns ± 0% 100.8ns ± 1% +2.52% (p=0.008 n=5+5) IndexAnyASCII/16:64-48 156ns ± 0% 157ns ± 0% ~ (p=0.238 n=4+5) IndexAnyASCII/256:1-48 299ns ± 0% 24ns ± 3% -92.12% (p=0.008 n=5+5) IndexAnyASCII/256:2-48 303ns ± 0% 304ns ± 0% ~ (p=0.762 n=5+5) IndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.476 n=5+5) IndexAnyASCII/256:8-48 321ns ± 0% 321ns ± 0% ~ (p=0.429 n=4+5) IndexAnyASCII/256:16-48 334ns ± 0% 335ns ± 0% ~ (p=0.079 n=5+4) IndexAnyASCII/256:32-48 367ns ± 0% 365ns ± 0% ~ (p=0.079 n=4+5) IndexAnyASCII/256:64-48 431ns ± 1% 421ns ± 0% -2.27% (p=0.008 n=5+5) IndexAnyUTF8/1:1-48 17.2ns ± 0% 10.8ns ± 0% -37.21% (p=0.029 n=4+4) IndexAnyUTF8/1:2-48 26.7ns ± 0% 15.6ns ± 0% ~ (p=0.079 n=4+5) IndexAnyUTF8/1:4-48 28.2ns ± 0% 15.6ns ± 0% -44.68% (p=0.000 n=5+4) IndexAnyUTF8/1:8-48 48.8ns ± 0% 15.6ns ± 0% -68.03% (p=0.029 n=4+4) IndexAnyUTF8/1:16-48 58.3ns ± 0% 16.2ns ± 0% ~ (p=0.079 n=4+5) IndexAnyUTF8/1:32-48 103ns ± 0% 18ns ± 0% -82.27% (p=0.008 n=5+5) IndexAnyUTF8/1:64-48 182ns ± 0% 17ns ± 0% -90.53% (p=0.008 n=5+5) IndexAnyUTF8/16:1-48 197ns ± 0% 25ns ± 0% -87.34% (p=0.000 n=5+4) IndexAnyUTF8/16:2-48 348ns ± 0% 163ns ± 0% -53.11% (p=0.000 n=5+4) IndexAnyUTF8/16:4-48 374ns ± 0% 163ns ± 0% -56.37% (p=0.000 n=5+4) IndexAnyUTF8/16:8-48 716ns ± 0% 163ns ± 0% -77.22% (p=0.000 n=5+4) IndexAnyUTF8/16:16-48 859ns ± 0% 175ns ± 0% -79.63% (p=0.000 n=5+4) IndexAnyUTF8/16:32-48 1.58µs ± 0% 0.20µs ± 0% -87.01% (p=0.029 n=4+4) IndexAnyUTF8/16:64-48 2.84µs ± 0% 0.19µs ± 1% -93.34% (p=0.008 n=5+5) IndexAnyUTF8/256:1-48 2.61µs ± 0% 0.27µs ± 0% -89.81% (p=0.008 n=5+5) IndexAnyUTF8/256:2-48 4.95µs ± 0% 2.23µs ± 0% -54.91% (p=0.016 n=5+4) IndexAnyUTF8/256:4-48 5.55µs ± 0% 2.23µs ± 0% -59.72% (p=0.008 n=5+5) IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.2µs ± 0% -79.39% (p=0.008 n=5+5) IndexAnyUTF8/256:16-48 13.1µs ± 0% 2.5µs ± 0% -81.21% (p=0.016 n=4+5) IndexAnyUTF8/256:32-48 24.7µs ± 0% 2.8µs ± 0% -88.49% (p=0.008 n=5+5) IndexAnyUTF8/256:64-48 45.0µs ± 0% 2.6µs ± 1% -94.23% (p=0.008 n=5+5) LastIndexAnyASCII/1:1-48 13.9ns ± 0% 15.2ns ± 0% +9.35% (p=0.008 n=5+5) LastIndexAnyASCII/1:2-48 14.4ns ± 0% 15.2ns ± 0% +5.56% (p=0.008 n=5+5) LastIndexAnyASCII/1:4-48 16.7ns ± 0% 15.2ns ± 0% -8.98% (p=0.008 n=5+5) LastIndexAnyASCII/1:8-48 24.0ns ± 0% 15.2ns ± 0% -36.67% (p=0.008 n=5+5) LastIndexAnyASCII/1:16-48 35.6ns ± 0% 15.0ns ± 0% -57.82% (p=0.008 n=5+5) LastIndexAnyASCII/1:32-48 68.9ns ± 0% 16.7ns ± 0% -75.75% (p=0.008 n=5+5) LastIndexAnyASCII/1:64-48 104ns ± 0% 17ns ± 1% -83.81% (p=0.008 n=5+5) LastIndexAnyASCII/16:1-48 35.0ns ± 0% 35.0ns ± 0% ~ (all equal) LastIndexAnyASCII/16:2-48 35.6ns ± 0% 35.6ns ± 0% ~ (all equal) LastIndexAnyASCII/16:4-48 41.0ns ± 0% 40.8ns ± 0% -0.49% (p=0.032 n=5+5) LastIndexAnyASCII/16:8-48 50.9ns ± 0% 50.7ns ± 1% ~ (p=0.397 n=5+5) LastIndexAnyASCII/16:16-48 64.3ns ± 1% 64.4ns ± 1% ~ (p=1.000 n=4+5) LastIndexAnyASCII/16:32-48 100ns ± 0% 100ns ± 0% +0.38% (p=0.016 n=4+5) LastIndexAnyASCII/16:64-48 157ns ± 1% 163ns ± 0% +3.82% (p=0.008 n=5+5) LastIndexAnyASCII/256:1-48 302ns ± 0% 300ns ± 0% -0.53% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-48 305ns ± 0% 303ns ± 0% -0.66% (p=0.000 n=5+4) LastIndexAnyASCII/256:4-48 313ns ± 0% 307ns ± 0% -2.04% (p=0.000 n=4+5) LastIndexAnyASCII/256:8-48 323ns ± 0% 315ns ± 0% -2.48% (p=0.029 n=4+4) LastIndexAnyASCII/256:16-48 333ns ± 0% 332ns ± 0% -0.30% (p=0.048 n=5+5) LastIndexAnyASCII/256:32-48 366ns ± 0% 367ns ± 0% ~ (p=0.238 n=4+5) LastIndexAnyASCII/256:64-48 430ns ± 0% 430ns ± 0% ~ (p=1.000 n=5+5) LastIndexAnyUTF8/1:1-48 21.1ns ± 0% 13.9ns ± 0% -34.00% (p=0.008 n=5+5) LastIndexAnyUTF8/1:2-48 29.5ns ± 0% 13.9ns ± 0% -52.95% (p=0.008 n=5+5) LastIndexAnyUTF8/1:4-48 31.6ns ± 0% 13.9ns ± 0% -55.96% (p=0.008 n=5+5) LastIndexAnyUTF8/1:8-48 51.1ns ± 0% 13.9ns ± 0% -72.81% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-48 58.9ns ± 0% 14.6ns ± 0% -75.23% (p=0.016 n=5+4) LastIndexAnyUTF8/1:32-48 103ns ± 0% 16ns ± 1% -84.12% (p=0.008 n=5+5) LastIndexAnyUTF8/1:64-48 177ns ± 0% 17ns ± 1% -90.62% (p=0.008 n=5+5) LastIndexAnyUTF8/16:1-48 275ns ± 1% 105ns ± 0% -61.85% (p=0.000 n=5+4) LastIndexAnyUTF8/16:2-48 406ns ± 0% 216ns ± 0% -46.70% (p=0.008 n=5+5) LastIndexAnyUTF8/16:4-48 458ns ± 0% 216ns ± 0% -52.75% (p=0.000 n=4+5) LastIndexAnyUTF8/16:8-48 753ns ± 0% 216ns ± 0% -71.31% (p=0.029 n=4+4) LastIndexAnyUTF8/16:16-48 902ns ± 0% 221ns ± 0% -75.50% (p=0.016 n=5+4) LastIndexAnyUTF8/16:32-48 1.57µs ± 0% 0.24µs ± 0% -84.46% (p=0.008 n=5+5) LastIndexAnyUTF8/16:64-48 2.77µs ± 0% 0.24µs ± 0% -91.22% (p=0.000 n=5+4) LastIndexAnyUTF8/256:1-48 4.06µs ± 0% 1.53µs ± 0% -62.26% (p=0.008 n=5+5) LastIndexAnyUTF8/256:2-48 5.92µs ± 0% 3.04µs ± 0% -48.55% (p=0.016 n=4+5) LastIndexAnyUTF8/256:4-48 6.82µs ± 0% 3.04µs ± 0% -55.34% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-48 11.5µs ± 0% 3.0µs ± 0% -73.48% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-48 14.1µs ± 0% 3.1µs ± 0% -77.85% (p=0.008 n=5+5) LastIndexAnyUTF8/256:32-48 24.5µs ± 0% 3.5µs ± 0% -85.85% (p=0.016 n=5+4) LastIndexAnyUTF8/256:64-48 44.0µs ± 0% 3.5µs ± 0% -92.12% (p=0.008 n=5+5) pkg:bytes goos:linux goarch:amd64 IndexAnyASCII/1:1-48 9.56ns ± 0% 11.00ns ± 0% +15.06% (p=0.016 n=5+4) IndexAnyASCII/1:2-48 11.0ns ± 0% 10.8ns ± 2% -1.64% (p=0.048 n=5+5) IndexAnyASCII/1:4-48 13.9ns ± 0% 11.0ns ± 1% -21.15% (p=0.008 n=5+5) IndexAnyASCII/1:8-48 19.6ns ± 0% 10.8ns ± 3% -44.90% (p=0.008 n=5+5) IndexAnyASCII/1:16-48 31.1ns ± 0% 11.5ns ± 0% -63.02% (p=0.008 n=5+5) IndexAnyASCII/1:32-48 54.0ns ± 0% 11.8ns ± 0% -78.15% (p=0.000 n=5+4) IndexAnyASCII/1:64-48 100ns ± 0% 13ns ± 0% -86.89% (p=0.008 n=5+5) IndexAnyASCII/16:1-48 35.5ns ± 0% 14.8ns ± 0% -58.26% (p=0.008 n=5+5) IndexAnyASCII/16:2-48 36.2ns ± 1% 36.0ns ± 1% ~ (p=0.087 n=5+5) IndexAnyASCII/16:4-48 40.3ns ± 1% 39.7ns ± 4% ~ (p=0.175 n=4+5) IndexAnyASCII/16:8-48 48.7ns ± 5% 45.8ns ± 0% -6.02% (p=0.016 n=5+4) IndexAnyASCII/16:16-48 64.1ns ±11% 62.1ns ± 1% ~ (p=0.143 n=5+5) IndexAnyASCII/16:32-48 97.9ns ± 1% 98.3ns ± 1% ~ (p=0.294 n=5+5) IndexAnyASCII/16:64-48 163ns ± 0% 157ns ± 0% -3.68% (p=0.008 n=5+5) IndexAnyASCII/256:1-48 389ns ± 0% 25ns ± 0% -93.65% (p=0.000 n=5+4) IndexAnyASCII/256:2-48 391ns ± 0% 307ns ± 0% -21.48% (p=0.000 n=5+4) IndexAnyASCII/256:4-48 394ns ± 0% 323ns ± 0% -17.92% (p=0.008 n=5+5) IndexAnyASCII/256:8-48 402ns ± 0% 323ns ± 0% -19.51% (p=0.008 n=5+5) IndexAnyASCII/256:16-48 414ns ± 0% 334ns ± 0% -19.32% (p=0.016 n=4+5) IndexAnyASCII/256:32-48 446ns ± 0% 367ns ± 0% -17.75% (p=0.016 n=5+4) IndexAnyASCII/256:64-48 511ns ± 0% 424ns ± 0% -17.02% (p=0.008 n=5+5) IndexAnyUTF8/1:1-48 17.4ns ± 0% 11.0ns ± 0% -36.64% (p=0.008 n=5+5) IndexAnyUTF8/1:2-48 27.3ns ± 1% 11.0ns ± 0% -59.74% (p=0.008 n=5+5) IndexAnyUTF8/1:4-48 28.7ns ± 0% 11.0ns ± 0% -61.73% (p=0.008 n=5+5) IndexAnyUTF8/1:8-48 49.2ns ± 0% 11.0ns ± 0% -77.66% (p=0.008 n=5+5) IndexAnyUTF8/1:16-48 56.0ns ± 0% 11.5ns ± 0% -79.46% (p=0.000 n=5+4) IndexAnyUTF8/1:32-48 102ns ± 0% 12ns ± 0% -88.24% (p=0.008 n=5+5) IndexAnyUTF8/1:64-48 177ns ± 0% 13ns ± 0% -92.51% (p=0.008 n=5+5) IndexAnyUTF8/16:1-48 212ns ± 0% 112ns ± 0% -47.17% (p=0.008 n=5+5) IndexAnyUTF8/16:2-48 356ns ± 0% 159ns ± 1% -55.28% (p=0.000 n=4+5) IndexAnyUTF8/16:4-48 372ns ± 0% 158ns ± 0% -57.47% (p=0.008 n=5+5) IndexAnyUTF8/16:8-48 712ns ± 0% 159ns ± 1% -77.70% (p=0.008 n=5+5) IndexAnyUTF8/16:16-48 829ns ± 0% 129ns ± 0% -84.44% (p=0.008 n=5+5) IndexAnyUTF8/16:32-48 1.55µs ± 0% 0.16µs ± 0% -89.87% (p=0.008 n=5+5) IndexAnyUTF8/16:64-48 2.77µs ± 0% 0.14µs ± 0% -94.94% (p=0.008 n=5+5) IndexAnyUTF8/256:1-48 2.85µs ± 0% 1.63µs ± 1% -42.74% (p=0.008 n=5+5) IndexAnyUTF8/256:2-48 5.14µs ± 1% 2.03µs ± 0% -60.51% (p=0.008 n=5+5) IndexAnyUTF8/256:4-48 5.56µs ± 0% 2.03µs ± 0% -63.52% (p=0.008 n=5+5) IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.0µs ± 0% -81.22% (p=0.008 n=5+5) IndexAnyUTF8/256:16-48 12.9µs ± 0% 1.9µs ± 0% -85.55% (p=0.008 n=5+5) IndexAnyUTF8/256:32-48 24.2µs ± 0% 2.1µs ± 0% -91.29% (p=0.016 n=5+4) IndexAnyUTF8/256:64-48 43.7µs ± 0% 2.0µs ± 0% -95.32% (p=0.016 n=5+4) LastIndexAnyASCII/1:1-48 13.7ns ± 1% 12.8ns ± 0% -6.57% (p=0.016 n=5+4) LastIndexAnyASCII/1:2-48 14.7ns ± 0% 12.7ns ± 1% -13.33% (p=0.000 n=4+5) LastIndexAnyASCII/1:4-48 16.9ns ± 0% 12.7ns ± 1% -24.73% (p=0.000 n=4+5) LastIndexAnyASCII/1:8-48 20.5ns ± 0% 12.7ns ± 0% -37.85% (p=0.000 n=4+5) LastIndexAnyASCII/1:16-48 28.0ns ± 0% 11.7ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyASCII/1:32-48 69.8ns ± 0% 12.4ns ± 0% -82.19% (p=0.008 n=5+5) LastIndexAnyASCII/1:64-48 73.8ns ± 0% 13.3ns ± 0% -82.03% (p=0.000 n=4+5) LastIndexAnyASCII/16:1-48 35.5ns ± 0% 35.5ns ± 0% ~ (all equal) LastIndexAnyASCII/16:2-48 36.0ns ± 0% 36.1ns ± 0% +0.28% (p=0.016 n=4+5) LastIndexAnyASCII/16:4-48 40.3ns ± 2% 40.0ns ± 6% ~ (p=0.651 n=5+5) LastIndexAnyASCII/16:8-48 50.3ns ± 0% 50.2ns ± 9% ~ (p=0.175 n=4+5) LastIndexAnyASCII/16:16-48 62.4ns ± 4% 64.4ns ± 0% +3.28% (p=0.016 n=5+4) LastIndexAnyASCII/16:32-48 98.9ns ± 0% 98.4ns ± 0% -0.53% (p=0.016 n=5+4) LastIndexAnyASCII/16:64-48 160ns ± 1% 161ns ± 1% ~ (p=0.325 n=5+5) LastIndexAnyASCII/256:1-48 300ns ± 0% 301ns ± 0% +0.33% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-48 304ns ± 0% 304ns ± 0% ~ (p=1.000 n=5+5) LastIndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.556 n=4+5) LastIndexAnyASCII/256:8-48 320ns ± 0% 321ns ± 0% ~ (p=0.143 n=5+5) LastIndexAnyASCII/256:16-48 333ns ± 0% 335ns ± 0% +0.60% (p=0.029 n=4+4) LastIndexAnyASCII/256:32-48 367ns ± 0% 366ns ± 0% ~ (p=0.095 n=4+5) LastIndexAnyASCII/256:64-48 431ns ± 0% 424ns ± 0% -1.62% (p=0.008 n=5+5) LastIndexAnyUTF8/1:1-48 19.7ns ± 1% 11.9ns ± 0% -39.47% (p=0.008 n=5+5) LastIndexAnyUTF8/1:2-48 27.6ns ± 1% 11.9ns ± 0% -56.82% (p=0.008 n=5+5) LastIndexAnyUTF8/1:4-48 29.9ns ± 0% 11.9ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyUTF8/1:8-48 48.7ns ± 0% 11.9ns ± 0% -75.54% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-48 57.8ns ± 0% 11.4ns ± 0% -80.26% (p=0.008 n=5+5) LastIndexAnyUTF8/1:32-48 94.7ns ± 0% 12.2ns ± 0% -87.07% (p=0.008 n=5+5) LastIndexAnyUTF8/1:64-48 163ns ± 0% 13ns ± 1% -91.93% (p=0.008 n=5+5) LastIndexAnyUTF8/16:1-48 258ns ± 0% 88ns ± 0% -65.76% (p=0.008 n=5+5) LastIndexAnyUTF8/16:2-48 400ns ± 0% 162ns ± 0% -59.38% (p=0.008 n=5+5) LastIndexAnyUTF8/16:4-48 415ns ± 0% 162ns ± 0% -60.87% (p=0.008 n=5+5) LastIndexAnyUTF8/16:8-48 737ns ± 0% 162ns ± 0% -78.02% (p=0.000 n=5+4) LastIndexAnyUTF8/16:16-48 882ns ± 0% 128ns ± 0% -85.49% (p=0.008 n=5+5) LastIndexAnyUTF8/16:32-48 1.47µs ± 0% 0.16µs ± 0% -89.29% (p=0.000 n=4+5) LastIndexAnyUTF8/16:64-48 2.56µs ± 0% 0.14µs ± 0% -94.41% (p=0.016 n=5+4) LastIndexAnyUTF8/256:1-48 3.60µs ± 0% 1.23µs ± 0% -65.67% (p=0.008 n=5+5) LastIndexAnyUTF8/256:2-48 5.78µs ± 0% 2.18µs ± 0% -62.32% (p=0.008 n=5+5) LastIndexAnyUTF8/256:4-48 6.26µs ± 0% 2.18µs ± 0% -65.15% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-48 11.2µs ± 0% 2.2µs ± 0% -80.53% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-48 13.5µs ± 0% 1.9µs ± 0% -86.02% (p=0.016 n=4+5) LastIndexAnyUTF8/256:32-48 23.0µs ± 0% 2.1µs ± 0% -90.72% (p=0.008 n=5+5) LastIndexAnyUTF8/256:64-48 40.5µs ± 0% 2.1µs ± 0% -94.73% (p=0.008 n=5+5) Change-Id: Ie05e306f8b184b989701868cb161ce8b3f18203b Reviewed-on: https://go-review.googlesource.com/c/go/+/156998 Run-TryBot: eric fang <eric.fang@arm.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
18a6fd44bb |
bytes, strings: moves indexRabinKarp function to internal/bytealg
In order to facilitate optimization of IndexAny and LastIndexAny, this patch moves three Rabin-Karp related functions indexRabinKarp, hashStr and hashStrRev in strings package to initernal/bytealg. There are also three functions in the bytes package with the same names and functions but different parameter types. To highlight this, this patch also moves them to internal/bytealg and gives them slightly different names. Related benchmark changes on amd64 and arm64: name old time/op new time/op delta pkg:strings goos:linux goarch:amd64 Index-16 14.0ns ± 1% 14.1ns ± 2% ~ (p=0.738 n=5+5) LastIndex-16 15.5ns ± 1% 15.7ns ± 4% ~ (p=0.897 n=5+5) pkg:bytes goos:linux goarch:amd64 Index/10-16 26.5ns ± 1% 26.5ns ± 0% ~ (p=0.873 n=5+5) Index/32-16 26.2ns ± 0% 25.7ns ± 0% -1.68% (p=0.008 n=5+5) Index/4K-16 5.12µs ± 4% 5.14µs ± 2% ~ (p=0.841 n=5+5) Index/4M-16 5.44ms ± 3% 5.34ms ± 2% ~ (p=0.056 n=5+5) Index/64M-16 85.8ms ± 3% 84.6ms ± 0% -1.37% (p=0.016 n=5+5) name old speed new speed delta pkg:bytes goos:linux goarch:amd64 Index/10-16 377MB/s ± 1% 377MB/s ± 0% ~ (p=1.000 n=5+5) Index/32-16 1.22GB/s ± 1% 1.24GB/s ± 0% +1.66% (p=0.008 n=5+5) Index/4K-16 800MB/s ± 4% 797MB/s ± 2% ~ (p=0.841 n=5+5) Index/4M-16 771MB/s ± 3% 786MB/s ± 2% ~ (p=0.056 n=5+5) Index/64M-16 783MB/s ± 3% 793MB/s ± 0% +1.36% (p=0.016 n=5+5) name old time/op new time/op delta pkg:strings goos:linux goarch:arm64 Index-8 22.6ns ± 0% 22.5ns ± 0% ~ (p=0.167 n=5+5) LastIndex-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal) pkg:bytes goos:linux goarch:arm64 Index/10-8 25.0ns ± 0% 25.0ns ± 0% ~ (all equal) Index/32-8 160ns ± 0% 160ns ± 0% ~ (all equal) Index/4K-8 6.26µs ± 0% 6.26µs ± 0% ~ (p=0.167 n=5+5) Index/4M-8 6.30ms ± 0% 6.31ms ± 0% ~ (p=1.000 n=5+5) Index/64M-8 101ms ± 0% 101ms ± 0% ~ (p=0.690 n=5+5) name old speed new speed delta pkg:bytes goos:linux goarch:arm64 Index/10-8 399MB/s ± 0% 400MB/s ± 0% +0.08% (p=0.008 n=5+5) Index/32-8 200MB/s ± 0% 200MB/s ± 0% ~ (p=0.127 n=4+5) Index/4K-8 654MB/s ± 0% 654MB/s ± 0% +0.01% (p=0.016 n=5+5) Index/4M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.833 n=5+5) Index/64M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.913 n=5+5) Change-Id: Icce3bc162bb8613ac36dc963a46c51f8e82ab842 Reviewed-on: https://go-review.googlesource.com/c/go/+/208638 Run-TryBot: eric fang <eric.fang@arm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
3259bc4419 |
strings, bytes: add ToValidUTF8
The newly added functions create a copy of their input with all bytes in invalid UTF-8 byte sequences mapped to the UTF-8 byte sequence given as replacement parameter. Fixes #25805 Change-Id: Iaf65f65b40c0581c6bb000f1590408d6628321d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/142003 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
ca0c449a6b |
bytes, internal/bytealg: simplify Equal
The compiler has advanced enough that it is cheaper to convert to strings than to go through the assembly trampolines to call runtime.memequal. Simplify Equal accordingly, and cull dead code from bytealg. While we're here, simplify Equal's documentation. Fixes #31587 Change-Id: Ie721d33f9a6cbd86b1d873398b20e7882c2c63e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/173323 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
08e1823a63 |
bytes: optimize ToLower and ToUpper for ASCII-only case
Follow what CL 68370 and CL 76470 did for the respective functions in package strings. Also adjust godoc strings to match the respective strings functions and mention the special case for ASCII-only byte slices which don't need conversion. name old time/op new time/op delta ToUpper/#00-8 9.35ns ± 3% 6.08ns ± 2% -35.04% (p=0.000 n=9+9) ToUpper/ONLYUPPER-8 77.7ns ± 1% 16.9ns ± 2% -78.22% (p=0.000 n=10+10) ToUpper/abc-8 36.5ns ± 1% 22.1ns ± 1% -39.43% (p=0.000 n=10+8) ToUpper/AbC123-8 56.9ns ± 2% 28.2ns ± 2% -50.54% (p=0.000 n=8+10) ToUpper/azAZ09_-8 62.3ns ± 1% 26.9ns ± 1% -56.82% (p=0.000 n=9+10) ToUpper/longStrinGwitHmixofsmaLLandcAps-8 219ns ± 2% 63ns ± 2% -71.17% (p=0.000 n=10+10) ToUpper/longɐstringɐwithɐnonasciiⱯchars-8 367ns ± 2% 374ns ± 3% +2.05% (p=0.000 n=9+10) ToUpper/ɐɐɐɐɐ-8 200ns ± 1% 206ns ± 1% +2.49% (p=0.000 n=10+10) ToUpper/a\u0080\U0010ffff-8 90.4ns ± 1% 93.8ns ± 0% +3.82% (p=0.000 n=10+7) ToLower/#00-8 9.59ns ± 1% 6.13ns ± 2% -36.08% (p=0.000 n=10+10) ToLower/abc-8 36.4ns ± 1% 10.4ns ± 1% -71.50% (p=0.000 n=10+10) ToLower/AbC123-8 55.8ns ± 1% 27.5ns ± 1% -50.61% (p=0.000 n=10+10) ToLower/azAZ09_-8 61.7ns ± 1% 30.2ns ± 1% -50.98% (p=0.000 n=8+10) ToLower/longStrinGwitHmixofsmaLLandcAps-8 226ns ± 1% 64ns ± 1% -71.53% (p=0.000 n=10+9) ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS-8 354ns ± 0% 361ns ± 0% +2.18% (p=0.000 n=10+10) ToLower/ⱭⱭⱭⱭⱭ-8 180ns ± 1% 186ns ± 0% +3.45% (p=0.000 n=10+9) ToLower/A\u0080\U0010ffff-8 91.7ns ± 0% 94.5ns ± 0% +2.99% (p=0.000 n=10+10) Change-Id: Ifdb8ae328ff9feacd1c170db8eebbf98c399e204 Reviewed-on: https://go-review.googlesource.com/c/go/+/170954 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> |
||
|
94507d2213 |
bytes: merge explodetests into splittests
splittests already contains most of the tests that cover explode. Add the missing ones and skip the append test for empty results which would otherwise lead to an "index out of range" panic. Change-Id: I2cb922282d2676be9ef85f186513075ae17c0243 Reviewed-on: https://go-review.googlesource.com/c/go/+/170126 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
154e5abfcd |
bytes, strings: add tests for TrimLeftFunc and TrimRightFunc
When I was working on the fix for #31038 (make TrimSpace return nil on all-space input) I noticed that there were no tests for TrimLeftFunc and TrimRightFunc, including the funky nil behavior. So add some! I've just reused the existing TrimFunc test cases for TrimLeftFunc and TrimRightFunc, as well as adding new tests for the empty string and all-trimmed cases (which test the nil-returning behavior of TrimFunc and TrimLeftFunc). Change-Id: Ib580d4364e9b3c91350305f9d9873080d7862904 Reviewed-on: https://go-review.googlesource.com/c/go/+/170061 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
f24e1099cb |
bytes: make TrimSpace return nil on all-space input
Issue #29122 introduced a subtle regression due to the way that TrimFuncLeft is written: previously TrimSpace returned nil when given an input of all whitespace, but with the #29122 changes it returned an empty slice on all-space input. This change adds a special case to the new, optimized TrimSpace to go back to that behavior. While it is odd behavior and people shouldn't be relying on these functions returning a nil slice in practice, it's not worth the breakage of code that does. This tweak doesn't change the TrimSpace benchmarks significantly. Fixes #31038 Change-Id: Idb495d02b474054d2b2f593c2e318a7a6625688a Reviewed-on: https://go-review.googlesource.com/c/go/+/169518 Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
9d9d4eeb87 |
bytes: add hard benchmarks for Index and Count
Add Benchmark(Index|Count)Hard[1-3] in preparation for implementing Index and Count in assembly on arm. Updates #29001 Change-Id: I2a9701892190e8d91de069c2f5a7f5bd3544c6c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/167798 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
4b4f222a0d |
bytes, strings: speed up TrimSpace 4-5x for common ASCII cases
This change adds a fast path for ASCII strings to both strings.TrimSpace and bytes.TrimSpace. It doesn't slow down the non-ASCII path much, if at all. I added benchmarks for strings.TrimSpace as it didn't have any, and I fleshed out the benchmarks for bytes.TrimSpace as it just had one case (for ASCII). The benchmarks (and the code!) are now the same between the two versions. Below are the benchmark results: strings.TrimSpace: name old time/op new time/op delta TrimSpace/NoTrim-8 18.6ns ± 0% 3.8ns ± 0% -79.53% (p=0.000 n=5+4) TrimSpace/ASCII-8 33.5ns ± 2% 6.0ns ± 3% -82.05% (p=0.008 n=5+5) TrimSpace/SomeNonASCII-8 97.1ns ± 1% 88.6ns ± 1% -8.68% (p=0.008 n=5+5) TrimSpace/JustNonASCII-8 144ns ± 0% 143ns ± 0% ~ (p=0.079 n=4+5) bytes.TrimSpace: name old time/op new time/op delta TrimSpace/NoTrim-8 18.9ns ± 1% 4.1ns ± 1% -78.34% (p=0.008 n=5+5) TrimSpace/ASCII-8 29.9ns ± 0% 6.3ns ± 1% -79.06% (p=0.008 n=5+5) TrimSpace/SomeNonASCII-8 91.5ns ± 0% 82.3ns ± 0% -10.03% (p=0.008 n=5+5) TrimSpace/JustNonASCII-8 150ns ± 0% 150ns ± 0% ~ (all equal) Fixes #29122 Change-Id: Ica45cd86a219cadf60173ec9db260133cd1d7951 Reviewed-on: https://go-review.googlesource.com/c/go/+/152917 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
b4baa8dd1d |
bytes: add benchmark for LastIndex
Add BenchmarkLastIndexHard[1-3] in preparation for implementing LastIndex using Rabin-Karp akin to strings.LastIndex BenchmarkLastIndexHard1-8 500 3162694 ns/op BenchmarkLastIndexHard2-8 500 3170475 ns/op BenchmarkLastIndexHard3-8 500 3051127 ns/op Change-Id: Id99f85f9640e248958f2b4be4dfd8c974e3b50e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/166257 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
ebdc0b8d68 |
bytes, strings: add ReplaceAll
Credit to Harald Nordgren for the proposal in https://golang.org/cl/137456 and #27864. Fixes #27864 Change-Id: I80546683b0623124fe4627a71af88add2f6c1c27 Reviewed-on: https://go-review.googlesource.com/137855 Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
45964e4f9c |
internal/bytealg: move Count to bytealg
Move bytes.Count and strings.Count to bytealg. Update #19792 Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb Reviewed-on: https://go-review.googlesource.com/98495 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
a025277505 |
bytes,strings: in generic Index, use mix of IndexByte and Rabin-Karp
Use IndexByte first, as it allows us to skip lots of bytes quickly. If IndexByte is generating a lot of false positives, switch over to Rabin-Karp. Experiments for ppc64le bytes: name old time/op new time/op delta IndexPeriodic/IndexPeriodic2-2 1.12ms ± 0% 0.18ms ± 0% -83.54% (p=0.000 n=10+9) IndexPeriodic/IndexPeriodic4-2 635µs ± 0% 184µs ± 0% -71.06% (p=0.000 n=9+9) IndexPeriodic/IndexPeriodic8-2 289µs ± 0% 184µs ± 0% -36.51% (p=0.000 n=10+9) IndexPeriodic/IndexPeriodic16-2 133µs ± 0% 183µs ± 0% +37.68% (p=0.000 n=10+9) IndexPeriodic/IndexPeriodic32-2 68.3µs ± 0% 70.2µs ± 0% +2.76% (p=0.000 n=10+10) IndexPeriodic/IndexPeriodic64-2 35.8µs ± 0% 36.6µs ± 0% +2.17% (p=0.000 n=8+10) strings: name old time/op new time/op delta IndexPeriodic/IndexPeriodic2-2 184µs ± 0% 184µs ± 0% +0.11% (p=0.029 n=4+4) IndexPeriodic/IndexPeriodic4-2 184µs ± 0% 184µs ± 0% ~ (p=0.886 n=4+4) IndexPeriodic/IndexPeriodic8-2 184µs ± 0% 184µs ± 0% ~ (p=0.486 n=4+4) IndexPeriodic/IndexPeriodic16-2 185µs ± 1% 184µs ± 0% ~ (p=0.343 n=4+4) IndexPeriodic/IndexPeriodic32-2 184µs ± 0% 69µs ± 0% -62.37% (p=0.029 n=4+4) IndexPeriodic/IndexPeriodic64-2 184µs ± 0% 37µs ± 0% -80.17% (p=0.029 n=4+4) Fixes #22578 Change-Id: If2a4d8554cb96bfd699b58149d13ac294615f8b8 Reviewed-on: https://go-review.googlesource.com/76070 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com> |
||
|
a9e2479a44 |
bytes: set cap of slices returned by Split and Fields and friends
This avoids the problem in which appending to a slice returned by Split can affect subsequent slices. Fixes #21149. Change-Id: Ie3df2b9ceeb9605d4625f47d49073c5f348cf0a1 Reviewed-on: https://go-review.googlesource.com/74510 Reviewed-by: Jelte Fennema <github-tech@jeltef.nl> Reviewed-by: Robert Griesemer <gri@golang.org> |
||
|
7df29b50b2 |
bytes: speed up Fields and FieldsFunc
Applies the optimizations from golang.org/cl/42810 and golang.org/cl/37959 done to the strings package to the bytes package. name old time/op new time/op delta Fields/ASCII/16 417ns ± 4% 118ns ± 3% -71.65% (p=0.000 n=10+10) Fields/ASCII/256 5.95µs ± 3% 0.88µs ± 0% -85.23% (p=0.000 n=10+7) Fields/ASCII/4096 92.3µs ± 1% 12.8µs ± 2% -86.13% (p=0.000 n=10+10) Fields/ASCII/65536 1.49ms ± 1% 0.25ms ± 1% -83.14% (p=0.000 n=10+10) Fields/ASCII/1048576 25.0ms ± 1% 6.5ms ± 2% -74.04% (p=0.000 n=10+10) Fields/Mixed/16 406ns ± 1% 222ns ± 1% -45.24% (p=0.000 n=10+9) Fields/Mixed/256 5.78µs ± 1% 2.27µs ± 1% -60.73% (p=0.000 n=9+10) Fields/Mixed/4096 97.9µs ± 1% 40.5µs ± 3% -58.66% (p=0.000 n=10+10) Fields/Mixed/65536 1.58ms ± 1% 0.69ms ± 1% -56.58% (p=0.000 n=10+10) Fields/Mixed/1048576 26.6ms ± 1% 12.6ms ± 2% -52.44% (p=0.000 n=9+10) FieldsFunc/ASCII/16 395ns ± 1% 188ns ± 1% -52.34% (p=0.000 n=10+10) FieldsFunc/ASCII/256 5.90µs ± 1% 2.00µs ± 1% -66.06% (p=0.000 n=10+10) FieldsFunc/ASCII/4096 92.5µs ± 1% 33.0µs ± 1% -64.34% (p=0.000 n=10+9) FieldsFunc/ASCII/65536 1.48ms ± 1% 0.54ms ± 1% -63.38% (p=0.000 n=10+9) FieldsFunc/ASCII/1048576 25.1ms ± 1% 10.5ms ± 3% -58.24% (p=0.000 n=10+10) FieldsFunc/Mixed/16 401ns ± 1% 205ns ± 2% -48.87% (p=0.000 n=10+10) FieldsFunc/Mixed/256 5.70µs ± 1% 1.98µs ± 1% -65.28% (p=0.000 n=10+10) FieldsFunc/Mixed/4096 97.5µs ± 1% 35.4µs ± 1% -63.65% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 1.57ms ± 1% 0.61ms ± 1% -61.20% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 26.5ms ± 1% 11.4ms ± 2% -56.84% (p=0.000 n=10+10) name old speed new speed delta Fields/ASCII/16 38.4MB/s ± 4% 134.9MB/s ± 3% +251.55% (p=0.000 n=10+10) Fields/ASCII/256 43.0MB/s ± 3% 290.6MB/s ± 1% +575.97% (p=0.000 n=10+8) Fields/ASCII/4096 44.4MB/s ± 1% 320.0MB/s ± 2% +620.90% (p=0.000 n=10+10) Fields/ASCII/65536 44.0MB/s ± 1% 260.7MB/s ± 1% +493.15% (p=0.000 n=10+10) Fields/ASCII/1048576 42.0MB/s ± 1% 161.6MB/s ± 2% +285.21% (p=0.000 n=10+10) Fields/Mixed/16 39.4MB/s ± 1% 71.7MB/s ± 1% +82.20% (p=0.000 n=10+10) Fields/Mixed/256 44.3MB/s ± 1% 112.8MB/s ± 1% +154.64% (p=0.000 n=9+10) Fields/Mixed/4096 41.9MB/s ± 1% 101.2MB/s ± 3% +141.92% (p=0.000 n=10+10) Fields/Mixed/65536 41.5MB/s ± 1% 95.5MB/s ± 1% +130.29% (p=0.000 n=10+10) Fields/Mixed/1048576 39.4MB/s ± 1% 82.9MB/s ± 2% +110.28% (p=0.000 n=9+10) FieldsFunc/ASCII/16 40.5MB/s ± 1% 84.9MB/s ± 2% +109.80% (p=0.000 n=10+10) FieldsFunc/ASCII/256 43.4MB/s ± 1% 127.9MB/s ± 1% +194.58% (p=0.000 n=10+10) FieldsFunc/ASCII/4096 44.3MB/s ± 1% 124.2MB/s ± 1% +180.44% (p=0.000 n=10+9) FieldsFunc/ASCII/65536 44.2MB/s ± 1% 120.6MB/s ± 1% +173.06% (p=0.000 n=10+9) FieldsFunc/ASCII/1048576 41.8MB/s ± 1% 100.2MB/s ± 3% +139.53% (p=0.000 n=10+10) FieldsFunc/Mixed/16 39.8MB/s ± 1% 77.8MB/s ± 2% +95.46% (p=0.000 n=10+10) FieldsFunc/Mixed/256 44.9MB/s ± 1% 129.4MB/s ± 1% +187.97% (p=0.000 n=10+10) FieldsFunc/Mixed/4096 42.0MB/s ± 1% 115.6MB/s ± 1% +175.08% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 41.6MB/s ± 1% 107.3MB/s ± 1% +157.75% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 39.6MB/s ± 1% 91.8MB/s ± 2% +131.72% (p=0.000 n=10+10) name old alloc/op new alloc/op delta Fields/ASCII/16 80.0B ± 0% 80.0B ± 0% ~ (all equal) Fields/ASCII/256 768B ± 0% 768B ± 0% ~ (all equal) Fields/ASCII/4096 9.47kB ± 0% 9.47kB ± 0% ~ (all equal) Fields/ASCII/65536 147kB ± 0% 147kB ± 0% ~ (all equal) Fields/ASCII/1048576 2.27MB ± 0% 2.27MB ± 0% ~ (all equal) Fields/Mixed/16 96.0B ± 0% 96.0B ± 0% ~ (all equal) Fields/Mixed/256 768B ± 0% 768B ± 0% ~ (all equal) Fields/Mixed/4096 9.47kB ± 0% 24.83kB ± 0% +162.16% (p=0.000 n=10+10) Fields/Mixed/65536 147kB ± 0% 497kB ± 0% +237.24% (p=0.000 n=10+10) Fields/Mixed/1048576 2.26MB ± 0% 9.61MB ± 0% +324.89% (p=0.000 n=10+10) FieldsFunc/ASCII/16 80.0B ± 0% 80.0B ± 0% ~ (all equal) FieldsFunc/ASCII/256 768B ± 0% 768B ± 0% ~ (all equal) FieldsFunc/ASCII/4096 9.47kB ± 0% 24.83kB ± 0% +162.16% (p=0.000 n=10+10) FieldsFunc/ASCII/65536 147kB ± 0% 497kB ± 0% +237.24% (p=0.000 n=10+10) FieldsFunc/ASCII/1048576 2.27MB ± 0% 9.61MB ± 0% +323.72% (p=0.000 n=10+10) FieldsFunc/Mixed/16 96.0B ± 0% 96.0B ± 0% ~ (all equal) FieldsFunc/Mixed/256 768B ± 0% 768B ± 0% ~ (all equal) FieldsFunc/Mixed/4096 9.47kB ± 0% 24.83kB ± 0% +162.16% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 147kB ± 0% 497kB ± 0% +237.24% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 2.26MB ± 0% 9.61MB ± 0% +324.89% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Fields/ASCII/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/4096 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/65536 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/1048576 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/Mixed/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/Mixed/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/Mixed/4096 1.00 ± 0% 5.00 ± 0% +400.00% (p=0.000 n=10+10) Fields/Mixed/65536 1.00 ± 0% 12.00 ± 0% +1100.00% (p=0.000 n=10+10) Fields/Mixed/1048576 1.00 ± 0% 24.00 ± 0% +2300.00% (p=0.000 n=10+10) FieldsFunc/ASCII/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/ASCII/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/ASCII/4096 1.00 ± 0% 5.00 ± 0% +400.00% (p=0.000 n=10+10) FieldsFunc/ASCII/65536 1.00 ± 0% 12.00 ± 0% +1100.00% (p=0.000 n=10+10) FieldsFunc/ASCII/1048576 1.00 ± 0% 24.00 ± 0% +2300.00% (p=0.000 n=10+10) FieldsFunc/Mixed/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/Mixed/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/Mixed/4096 1.00 ± 0% 5.00 ± 0% +400.00% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 1.00 ± 0% 12.00 ± 0% +1100.00% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 1.00 ± 0% 24.00 ± 0% +2300.00% (p=0.000 n=10+10) Change-Id: If1926782decc2f60d3b4b8c41c2ce7d8bdedfd8f Reviewed-on: https://go-review.googlesource.com/55131 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> |
||
|
01cd22c687 |
bytes: add optimized countByte for amd64
Use SSE/AVX2 when counting a single byte. Inspired from runtime indexbyte implementation. Benchmark against previous implementation, where 1 byte in every 8 is the one we are looking for: * On a machine without AVX2 name old time/op new time/op delta CountSingle/10-4 61.8ns ±10% 15.6ns ±11% -74.83% (p=0.000 n=10+10) CountSingle/32-4 100ns ± 4% 17ns ±10% -82.54% (p=0.000 n=10+9) CountSingle/4K-4 9.66µs ± 3% 0.37µs ± 6% -96.21% (p=0.000 n=10+10) CountSingle/4M-4 11.0ms ± 6% 0.4ms ± 4% -96.04% (p=0.000 n=10+10) CountSingle/64M-4 194ms ± 8% 8ms ± 2% -95.64% (p=0.000 n=10+10) name old speed new speed delta CountSingle/10-4 162MB/s ±10% 645MB/s ±10% +297.00% (p=0.000 n=10+10) CountSingle/32-4 321MB/s ± 5% 1844MB/s ± 9% +474.79% (p=0.000 n=10+9) CountSingle/4K-4 424MB/s ± 3% 11169MB/s ± 6% +2533.10% (p=0.000 n=10+10) CountSingle/4M-4 381MB/s ± 7% 9609MB/s ± 4% +2421.88% (p=0.000 n=10+10) CountSingle/64M-4 346MB/s ± 7% 7924MB/s ± 2% +2188.78% (p=0.000 n=10+10) * On a machine with AVX2 name old time/op new time/op delta CountSingle/10-8 37.1ns ± 3% 8.2ns ± 1% -77.80% (p=0.000 n=10+10) CountSingle/32-8 66.1ns ± 3% 9.8ns ± 2% -85.23% (p=0.000 n=10+10) CountSingle/4K-8 7.36µs ± 3% 0.11µs ± 1% -98.54% (p=0.000 n=10+10) CountSingle/4M-8 7.46ms ± 2% 0.15ms ± 2% -97.95% (p=0.000 n=10+9) CountSingle/64M-8 124ms ± 2% 6ms ± 4% -95.09% (p=0.000 n=10+10) name old speed new speed delta CountSingle/10-8 269MB/s ± 3% 1213MB/s ± 1% +350.32% (p=0.000 n=10+10) CountSingle/32-8 484MB/s ± 4% 3277MB/s ± 2% +576.66% (p=0.000 n=10+10) CountSingle/4K-8 556MB/s ± 3% 37933MB/s ± 1% +6718.36% (p=0.000 n=10+10) CountSingle/4M-8 562MB/s ± 2% 27444MB/s ± 3% +4783.43% (p=0.000 n=10+9) CountSingle/64M-8 543MB/s ± 2% 11054MB/s ± 3% +1935.81% (p=0.000 n=10+10) Fixes #19411 Change-Id: Ieaf20b1fabccabe767c55c66e242e86f3617f883 Reviewed-on: https://go-review.googlesource.com/38258 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> |
||
|
8946502776 |
bytes, strings: optimize Split*
The relevant benchmark results on linux/amd64: bytes: SplitSingleByteSeparator-4 25.7ms ± 5% 9.1ms ± 4% -64.40% (p=0.000 n=10+10) SplitMultiByteSeparator-4 13.8ms ±20% 4.3ms ± 8% -69.23% (p=0.000 n=10+10) SplitNSingleByteSeparator-4 1.88µs ± 9% 0.88µs ± 4% -53.25% (p=0.000 n=10+10) SplitNMultiByteSeparator-4 4.83µs ±10% 1.32µs ± 9% -72.65% (p=0.000 n=10+10) strings: name old time/op new time/op delta SplitSingleByteSeparator-4 21.4ms ± 8% 8.5ms ± 5% -60.19% (p=0.000 n=10+10) SplitMultiByteSeparator-4 13.2ms ± 9% 3.9ms ± 4% -70.29% (p=0.000 n=10+10) SplitNSingleByteSeparator-4 1.54µs ± 5% 0.75µs ± 7% -51.21% (p=0.000 n=10+10) SplitNMultiByteSeparator-4 3.57µs ± 8% 1.01µs ±11% -71.76% (p=0.000 n=10+10) Fixes #18973 Change-Id: Ie4bc010c6cc389001e72eab530497c81e5b26f34 Reviewed-on: https://go-review.googlesource.com/36510 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
9a8c69539c |
bytes, strings: optimize for ASCII sets
In a large codebase within Google, there are thousands of uses of: ContainsAny|IndexAny|LastIndexAny|Trim|TrimLeft|TrimRight An analysis of their usage shows that over 97% of them only use character sets consisting of only ASCII symbols. Uses of ContainsAny|IndexAny|LastIndexAny: 6% are 1 character (e.g., "\n" or " ") 58% are 2-4 characters (e.g., "<>" or "\r\n\t ") 24% are 5-9 characters (e.g., "()[]*^$") 10% are 10+ characters (e.g., "+-=&|><!(){}[]^\"~*?:\\/ ") We optimize for ASCII sets, which are commonly used to search for "control" characters in some string. We don't optimize for the single character scenario since IndexRune or IndexByte could be used. Uses of Trim|TrimLeft|TrimRight: 71% are 1 character (e.g., "\n" or " ") 14% are 2 characters (e.g., "\r\n") 10% are 3-4 characters (e.g., " \t\r\n") 5% are 10+ characters (e.g., "0123456789abcdefABCDEF") We optimize for the single character case with a simple closured function that only checks for that character's value. We optimize for the medium and larger sets using a 16-byte bit-map representing a set of ASCII characters. The benchmarks below have the following suffix name "%d:%d" where the first number is the length of the input and the second number is the length of the charset. == bytes package == benchmark old ns/op new ns/op delta BenchmarkIndexAnyASCII/1:1-4 5.09 5.23 +2.75% BenchmarkIndexAnyASCII/1:2-4 5.81 5.85 +0.69% BenchmarkIndexAnyASCII/1:4-4 7.22 7.50 +3.88% BenchmarkIndexAnyASCII/1:8-4 11.0 11.1 +0.91% BenchmarkIndexAnyASCII/1:16-4 17.5 17.8 +1.71% BenchmarkIndexAnyASCII/16:1-4 36.0 34.0 -5.56% BenchmarkIndexAnyASCII/16:2-4 46.6 36.5 -21.67% BenchmarkIndexAnyASCII/16:4-4 78.0 40.4 -48.21% BenchmarkIndexAnyASCII/16:8-4 136 47.4 -65.15% BenchmarkIndexAnyASCII/16:16-4 254 61.5 -75.79% BenchmarkIndexAnyASCII/256:1-4 542 388 -28.41% BenchmarkIndexAnyASCII/256:2-4 705 382 -45.82% BenchmarkIndexAnyASCII/256:4-4 1089 386 -64.55% BenchmarkIndexAnyASCII/256:8-4 1994 394 -80.24% BenchmarkIndexAnyASCII/256:16-4 3843 411 -89.31% BenchmarkIndexAnyASCII/4096:1-4 8522 5873 -31.08% BenchmarkIndexAnyASCII/4096:2-4 11253 5861 -47.92% BenchmarkIndexAnyASCII/4096:4-4 17824 5883 -66.99% BenchmarkIndexAnyASCII/4096:8-4 32053 5871 -81.68% BenchmarkIndexAnyASCII/4096:16-4 60512 5888 -90.27% BenchmarkTrimASCII/1:1-4 79.5 70.8 -10.94% BenchmarkTrimASCII/1:2-4 79.0 105 +32.91% BenchmarkTrimASCII/1:4-4 79.6 109 +36.93% BenchmarkTrimASCII/1:8-4 78.8 118 +49.75% BenchmarkTrimASCII/1:16-4 80.2 132 +64.59% BenchmarkTrimASCII/16:1-4 243 116 -52.26% BenchmarkTrimASCII/16:2-4 243 171 -29.63% BenchmarkTrimASCII/16:4-4 243 176 -27.57% BenchmarkTrimASCII/16:8-4 241 184 -23.65% BenchmarkTrimASCII/16:16-4 238 199 -16.39% BenchmarkTrimASCII/256:1-4 2580 840 -67.44% BenchmarkTrimASCII/256:2-4 2603 1175 -54.86% BenchmarkTrimASCII/256:4-4 2572 1188 -53.81% BenchmarkTrimASCII/256:8-4 2550 1191 -53.29% BenchmarkTrimASCII/256:16-4 2585 1208 -53.27% BenchmarkTrimASCII/4096:1-4 39773 12181 -69.37% BenchmarkTrimASCII/4096:2-4 39946 17231 -56.86% BenchmarkTrimASCII/4096:4-4 39641 17179 -56.66% BenchmarkTrimASCII/4096:8-4 39835 17175 -56.88% BenchmarkTrimASCII/4096:16-4 40229 17215 -57.21% == strings package == benchmark old ns/op new ns/op delta BenchmarkIndexAnyASCII/1:1-4 5.94 4.97 -16.33% BenchmarkIndexAnyASCII/1:2-4 5.94 5.55 -6.57% BenchmarkIndexAnyASCII/1:4-4 7.45 7.21 -3.22% BenchmarkIndexAnyASCII/1:8-4 10.8 10.6 -1.85% BenchmarkIndexAnyASCII/1:16-4 17.4 17.2 -1.15% BenchmarkIndexAnyASCII/16:1-4 36.4 32.2 -11.54% BenchmarkIndexAnyASCII/16:2-4 49.6 34.6 -30.24% BenchmarkIndexAnyASCII/16:4-4 77.5 37.9 -51.10% BenchmarkIndexAnyASCII/16:8-4 138 45.5 -67.03% BenchmarkIndexAnyASCII/16:16-4 241 59.1 -75.48% BenchmarkIndexAnyASCII/256:1-4 509 378 -25.74% BenchmarkIndexAnyASCII/256:2-4 720 381 -47.08% BenchmarkIndexAnyASCII/256:4-4 1142 384 -66.37% BenchmarkIndexAnyASCII/256:8-4 1999 391 -80.44% BenchmarkIndexAnyASCII/256:16-4 3735 403 -89.21% BenchmarkIndexAnyASCII/4096:1-4 7973 5824 -26.95% BenchmarkIndexAnyASCII/4096:2-4 11432 5809 -49.19% BenchmarkIndexAnyASCII/4096:4-4 18327 5819 -68.25% BenchmarkIndexAnyASCII/4096:8-4 33059 5828 -82.37% BenchmarkIndexAnyASCII/4096:16-4 59703 5817 -90.26% BenchmarkTrimASCII/1:1-4 71.9 71.8 -0.14% BenchmarkTrimASCII/1:2-4 73.3 103 +40.52% BenchmarkTrimASCII/1:4-4 71.8 106 +47.63% BenchmarkTrimASCII/1:8-4 71.2 113 +58.71% BenchmarkTrimASCII/1:16-4 71.6 128 +78.77% BenchmarkTrimASCII/16:1-4 152 116 -23.68% BenchmarkTrimASCII/16:2-4 160 168 +5.00% BenchmarkTrimASCII/16:4-4 172 170 -1.16% BenchmarkTrimASCII/16:8-4 200 177 -11.50% BenchmarkTrimASCII/16:16-4 254 193 -24.02% BenchmarkTrimASCII/256:1-4 1438 864 -39.92% BenchmarkTrimASCII/256:2-4 1551 1195 -22.95% BenchmarkTrimASCII/256:4-4 1770 1200 -32.20% BenchmarkTrimASCII/256:8-4 2195 1216 -44.60% BenchmarkTrimASCII/256:16-4 3054 1224 -59.92% BenchmarkTrimASCII/4096:1-4 21726 12557 -42.20% BenchmarkTrimASCII/4096:2-4 23586 17508 -25.77% BenchmarkTrimASCII/4096:4-4 26898 17510 -34.90% BenchmarkTrimASCII/4096:8-4 33714 17595 -47.81% BenchmarkTrimASCII/4096:16-4 47429 17700 -62.68% The benchmarks added test the worst case. For IndexAny, that is when the charset matches none of the input. For Trim, it is when the charset matches all of the input. Change-Id: I970874d101a96b33528fc99b165379abe58cf6ea Reviewed-on: https://go-review.googlesource.com/31593 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Martin Möhrmann <martisch@uos.de> |
||
|
4b2665786e |
bytes, strings: fix regression in IndexRune
In all previous versions of Go, the behavior of IndexRune(s, r) where r was utf.RuneError was that it would effectively return the index of any invalid UTF-8 byte sequence (include RuneError). Optimizations made in http://golang.org/cl/28537 and http://golang.org/cl/28546 altered this undocumented behavior such that RuneError would only match on the RuneError rune itself. Although, the new behavior is arguably reasonable, it did break code that depended on the previous behavior. Thus, we add special checks to ensure that we preserve the old behavior. There is a slight performance hit for correctness: benchmark old ns/op new ns/op delta BenchmarkIndexRune/10-4 19.3 21.6 +11.92% BenchmarkIndexRune/32-4 33.6 35.2 +4.76% This only occurs on small strings. The performance hit for larger strings is neglible and not shown. Fixes #17611 Change-Id: I1d863a741213d46c40b2e1724c41245df52502a5 Reviewed-on: https://go-review.googlesource.com/32123 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
7b40b0c3a3 |
strings, bytes: panic if Repeat overflows or if given a negative count
Panic if Repeat is given a negative count or if the value of (len(*) * count) is detected to overflow. We panic because we cannot change the signature of Repeat to return an error. Fixes #16237 Change-Id: I9f5ba031a5b8533db0582d7a672ffb715143f3fb Reviewed-on: https://go-review.googlesource.com/29954 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
9a7ce41d6c |
bytes: cut 10 seconds off the race builder's benchmark test
Don't benchmark so many sizes during the race builder's benchmark run. This package doesn't even use goroutines. Cuts off 10 seconds. Updates #17104 Change-Id: Ibb2c7272c18b9014a775949c656a5b930f197cd4 Reviewed-on: https://go-review.googlesource.com/29158 Reviewed-by: David Crawshaw <crawshaw@golang.org> |
||
|
e10286aeda |
bytes: make IndexRune faster
re-implement IndexRune by IndexByte and Index which are well optimized to get performance gain. name old time/op new time/op delta IndexRune/10-4 53.2ns ± 1% 29.1ns ± 1% -45.32% (p=0.008 n=5+5) IndexRune/32-4 191ns ± 1% 27ns ± 1% -85.75% (p=0.008 n=5+5) IndexRune/4K-4 23.5µs ± 1% 1.0µs ± 1% -95.77% (p=0.008 n=5+5) IndexRune/4M-4 23.8ms ± 0% 1.0ms ± 2% -95.90% (p=0.008 n=5+5) IndexRune/64M-4 384ms ± 1% 15ms ± 1% -95.98% (p=0.008 n=5+5) IndexRuneASCII/10-4 61.5ns ± 0% 10.3ns ± 4% -83.17% (p=0.008 n=5+5) IndexRuneASCII/32-4 203ns ± 0% 11ns ± 5% -94.68% (p=0.008 n=5+5) IndexRuneASCII/4K-4 23.4µs ± 0% 0.3µs ± 2% -98.60% (p=0.008 n=5+5) IndexRuneASCII/4M-4 24.0ms ± 1% 0.3ms ± 1% -98.60% (p=0.008 n=5+5) IndexRuneASCII/64M-4 386ms ± 2% 6ms ± 1% -98.57% (p=0.008 n=5+5) name old speed new speed delta IndexRune/10-4 188MB/s ± 1% 344MB/s ± 1% +82.91% (p=0.008 n=5+5) IndexRune/32-4 167MB/s ± 0% 1175MB/s ± 1% +603.52% (p=0.008 n=5+5) IndexRune/4K-4 174MB/s ± 1% 4117MB/s ± 1% +2262.71% (p=0.008 n=5+5) IndexRune/4M-4 176MB/s ± 0% 4299MB/s ± 2% +2340.46% (p=0.008 n=5+5) IndexRune/64M-4 175MB/s ± 1% 4354MB/s ± 1% +2388.57% (p=0.008 n=5+5) IndexRuneASCII/10-4 163MB/s ± 0% 968MB/s ± 4% +494.66% (p=0.008 n=5+5) IndexRuneASCII/32-4 157MB/s ± 0% 2974MB/s ± 4% +1788.59% (p=0.008 n=5+5) IndexRuneASCII/4K-4 175MB/s ± 0% 12481MB/s ± 2% +7027.71% (p=0.008 n=5+5) IndexRuneASCII/4M-4 175MB/s ± 1% 12510MB/s ± 1% +7061.15% (p=0.008 n=5+5) IndexRuneASCII/64M-4 174MB/s ± 2% 12143MB/s ± 1% +6881.70% (p=0.008 n=5+5) Change-Id: I0632eadb83937c2a9daa7f0ce79df1dee64f992e Reviewed-on: https://go-review.googlesource.com/28537 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
9e112a3fe4 |
bytes: use Run method for benchmarks
Change-Id: I34ab1003099570f0ba511340e697a648de31d08a Reviewed-on: https://go-review.googlesource.com/23427 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
||
|
59af53d681 |
bytes: add ContainsRune
Make package bytes consistent with strings by adding missing function ContainsRune. Fixes #15189 Change-Id: Ie09080b389e55bbe070c57aa3bd134053a805423 Reviewed-on: https://go-review.googlesource.com/21710 Run-TryBot: Rob Pike <r@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org> |
||
|
d636d7907c |
bytes: add ContainsAny
This function is present in the strings package but missing from bytes, and we would like to keep the two packages consistent. Add it to bytes, and copy the test over as well. Fixes #15140 Change-Id: I5dbd28da83a9fe741885794ed15f2af2f826cb3c Reviewed-on: https://go-review.googlesource.com/21562 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
||
|
b2cf571040 |
all: delete dead test code
This deletes unused code and helpers from tests. Change-Id: Ie31d46115f558ceb8da6efbf90c3c204e03b0d7e Reviewed-on: https://go-review.googlesource.com/20927 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
5fea2ccc77 |
all: single space after period.
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
687abca1ea |
runtime: avoid using REP prefix for IndexByte
REP-prefixed instructions have a large startup cost. Avoid them like the plague. benchmark old ns/op new ns/op delta BenchmarkIndexByte10-8 22.4 5.34 -76.16% Fixes #13983 Change-Id: I857e956e240fc9681d053f2584ccf24c1b272bb3 Reviewed-on: https://go-review.googlesource.com/18703 Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
32add8d7c8 |
bytes: improve Compare function on amd64 for large byte arrays
This patch contains only loop unrolling change for size > 63B Following are the performance numbers for various sizes on On Haswell based system: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz. benchcmp go.head.8.25.15.txt go.head.8.25.15.opt.txt benchmark old ns/op new ns/op delta BenchmarkBytesCompare1-4 5.37 5.37 +0.00% BenchmarkBytesCompare2-4 5.37 5.38 +0.19% BenchmarkBytesCompare4-4 5.37 5.37 +0.00% BenchmarkBytesCompare8-4 4.42 4.38 -0.90% BenchmarkBytesCompare16-4 4.27 4.45 +4.22% BenchmarkBytesCompare32-4 5.30 5.36 +1.13% BenchmarkBytesCompare64-4 6.93 6.78 -2.16% BenchmarkBytesCompare128-4 10.3 9.50 -7.77% BenchmarkBytesCompare256-4 17.1 13.8 -19.30% BenchmarkBytesCompare512-4 31.3 22.1 -29.39% BenchmarkBytesCompare1024-4 62.5 39.0 -37.60% BenchmarkBytesCompare2048-4 112 73.2 -34.64% Change-Id: I4eeb1c22732fd62cbac97ba757b0d29f648d4ef1 Reviewed-on: https://go-review.googlesource.com/11871 Reviewed-by: Keith Randall <khr@golang.org> |
||
|
0fb5475bdf |
bytes, strings: add LastIndexByte
Currently the packages have the following index functions: func Index(s, sep []byte) int func IndexAny(s []byte, chars string) int func IndexByte(s []byte, c byte) int func IndexFunc(s []byte, f func(r rune) bool) int func IndexRune(s []byte, r rune) int func LastIndex(s, sep []byte) int func LastIndexAny(s []byte, chars string) int func LastIndexFunc(s []byte, f func(r rune) bool) int Searching for the last occurrence of a byte is quite common for string parsing algorithms (e.g. find the last paren on a line). Also addition of LastIndexByte makes the set more orthogonal. Change-Id: Ida168849acacf8e78dd70c1354bef9eac5effafe Reviewed-on: https://go-review.googlesource.com/9500 Reviewed-by: Rob Pike <r@golang.org> |
||
|
c007ce824d |
build: move package sources from src/pkg to src
Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg. |