40 Commits

Author SHA1 Message Date
luochuanhang
e4ab8b0fe6 regexp: add the missing is
Change-Id: I23264972329aa3414067cd0e0986b69bb39bbeb5
GitHub-Last-Rev: d1d668a3cbe852d9a06f03369e7e635232d85139
GitHub-Pull-Request: golang/go#50650
Reviewed-on: https://go-review.googlesource.com/c/go/+/378935
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Daniel Martí <mvdan@mvdan.cc>
2022-01-19 22:56:09 +00:00
Russ Cox
702e337174 regexp: document and implement that invalid UTF-8 bytes are the same as U+FFFD
What should it mean to run a regexp match on invalid UTF-8 bytes?
The coherent behavior options are:

1. Invalid UTF-8 does not match any character classes,
   nor a U+FFFD literal (nor \x{fffd}).
2. Each byte of invalid UTF-8 is treated identically to a U+FFFD in the input,
   as a utf8.DecodeRune loop might.

RE2 uses Rule 1.
Because it works byte at a time, it can also provide \C to match any
single byte of input, which matches invalid UTF-8 as well.
This provides the nice property that a match for a regexp without \C
is guaranteed to be valid UTF-8.

Unfortunately, today Go has an incoherent mix of these two, although
mostly Rule 2. This is a deviation from RE2, and it gives up the nice
property, but we probably can't correct that at this point.
In particular .* already matches entire inputs today, valid UTF-8 or
not, and I doubt we can break that.

This CL adopts Rule 2 officially, fixing the few places that deviate from it.

Fixes #48749.

Change-Id: I96402527c5dfb1146212f568ffa09dde91d71244
Reviewed-on: https://go-review.googlesource.com/c/go/+/354569
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2021-10-11 15:28:50 +00:00
Russ Cox
4d8db00641 all: use bytes.Cut, strings.Cut
Many uses of Index/IndexByte/IndexRune/Split/SplitN
can be written more clearly using the new Cut functions.
Do that. Also rewrite to other functions if that's clearer.

For #46336.

Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448
Reviewed-on: https://go-review.googlesource.com/c/go/+/351711
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-10-06 15:53:04 +00:00
Sylvain Zimmer
782fcb44b9 regexp: add (*Regexp).SubexpIndex
SubexpIndex returns the index of the first subexpression with the given name,
or -1 if there is no subexpression with that name.

Fixes #32420

Change-Id: Ie1f9d22d50fb84e18added80a9d9a9f6dca8ffc4
Reviewed-on: https://go-review.googlesource.com/c/go/+/187919
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-04-10 09:38:07 +00:00
Sylvain Zimmer
8116599f01 regexp: optimize for provably too short inputs
For many patterns we can compute the minimum input length at compile time.
If the input is shorter, we can return early and get a huge speedup.

As pointed out by Damian Gryski, Perl's regex engine contains a number of
these kinds of fail-fast optimizations:
https://perldoc.perl.org/perlreguts.html#Peep-hole-Optimisation-and-Analysis

Benchmarks: (including new ones for compile time)

name               old time/op    new time/op    delta
Compile/Onepass-8    4.39µs ± 1%    4.40µs ± 0%  +0.34%  (p=0.029 n=9+8)
Compile/Medium-8     9.80µs ± 0%    9.91µs ± 0%  +1.17%  (p=0.000 n=10+10)
Compile/Hard-8       72.7µs ± 0%    73.5µs ± 0%  +1.10%  (p=0.000 n=9+10)

name                       old time/op    new time/op      delta
Match/Easy0/16-8             52.6ns ± 5%       4.9ns ± 0%     -90.68%  (p=0.000 n=10+9)
Match/Easy0/32-8             64.1ns ±10%      61.4ns ± 1%        ~     (p=0.188 n=10+9)
Match/Easy0/1K-8              280ns ± 1%       277ns ± 2%      -0.97%  (p=0.004 n=10+10)
Match/Easy0/32K-8            4.61µs ± 1%      4.55µs ± 1%      -1.49%  (p=0.000 n=9+10)
Match/Easy0/1M-8              229µs ± 0%       226µs ± 1%      -1.29%  (p=0.000 n=8+10)
Match/Easy0/32M-8            7.50ms ± 1%      7.47ms ± 1%        ~     (p=0.165 n=10+10)
Match/Easy0i/16-8             533ns ± 1%         5ns ± 2%     -99.07%  (p=0.000 n=10+10)
Match/Easy0i/32-8             950ns ± 0%       950ns ± 1%        ~     (p=0.920 n=10+9)
Match/Easy0i/1K-8            27.5µs ± 1%      27.5µs ± 0%        ~     (p=0.739 n=10+10)
Match/Easy0i/32K-8           1.13ms ± 0%      1.13ms ± 1%        ~     (p=0.079 n=9+10)
Match/Easy0i/1M-8            36.7ms ± 2%      36.1ms ± 0%      -1.64%  (p=0.000 n=10+9)
Match/Easy0i/32M-8            1.17s ± 0%       1.16s ± 1%      -0.80%  (p=0.004 n=8+9)
Match/Easy1/16-8             55.5ns ± 6%       4.9ns ± 1%     -91.19%  (p=0.000 n=10+9)
Match/Easy1/32-8             58.3ns ± 8%      56.6ns ± 1%        ~     (p=0.449 n=10+8)
Match/Easy1/1K-8              750ns ± 0%       748ns ± 1%        ~     (p=0.072 n=8+10)
Match/Easy1/32K-8            31.8µs ± 0%      31.6µs ± 1%      -0.50%  (p=0.035 n=10+9)
Match/Easy1/1M-8             1.10ms ± 1%      1.09ms ± 0%      -0.95%  (p=0.000 n=10+9)
Match/Easy1/32M-8            35.5ms ± 0%      35.2ms ± 1%      -1.05%  (p=0.000 n=9+10)
Match/Medium/16-8             442ns ± 2%         5ns ± 1%     -98.89%  (p=0.000 n=10+10)
Match/Medium/32-8             875ns ± 0%       878ns ± 1%        ~     (p=0.071 n=9+10)
Match/Medium/1K-8            26.1µs ± 0%      25.9µs ± 0%      -0.64%  (p=0.000 n=10+10)
Match/Medium/32K-8           1.09ms ± 1%      1.08ms ± 0%      -0.84%  (p=0.000 n=10+9)
Match/Medium/1M-8            34.9ms ± 0%      34.6ms ± 1%      -0.98%  (p=0.000 n=9+10)
Match/Medium/32M-8            1.12s ± 1%       1.11s ± 1%      -0.98%  (p=0.000 n=10+9)
Match/Hard/16-8               721ns ± 1%         5ns ± 0%     -99.32%  (p=0.000 n=10+9)
Match/Hard/32-8              1.32µs ± 1%      1.31µs ± 0%      -0.71%  (p=0.000 n=9+9)
Match/Hard/1K-8              39.8µs ± 1%      39.7µs ± 1%        ~     (p=0.165 n=10+10)
Match/Hard/32K-8             1.57ms ± 0%      1.56ms ± 0%      -0.70%  (p=0.000 n=10+9)
Match/Hard/1M-8              50.4ms ± 1%      50.1ms ± 1%      -0.57%  (p=0.007 n=10+10)
Match/Hard/32M-8              1.62s ± 1%       1.60s ± 0%      -0.98%  (p=0.000 n=10+10)
Match/Hard1/16-8             3.88µs ± 1%      3.86µs ± 0%        ~     (p=0.118 n=10+10)
Match/Hard1/32-8             7.44µs ± 1%      7.46µs ± 1%        ~     (p=0.109 n=10+10)
Match/Hard1/1K-8              232µs ± 1%       229µs ± 1%      -1.31%  (p=0.000 n=10+9)
Match/Hard1/32K-8            7.41ms ± 2%      7.41ms ± 0%        ~     (p=0.237 n=10+8)
Match/Hard1/1M-8              238ms ± 1%       238ms ± 0%        ~     (p=0.481 n=10+10)
Match/Hard1/32M-8             7.69s ± 1%       7.61s ± 0%      -1.00%  (p=0.000 n=10+10)

Fixes #31329

Change-Id: I04640e8c59178ec8b3106e13ace9b109b6bdbc25
Reviewed-on: https://go-review.googlesource.com/c/go/+/171023
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-05-15 15:28:22 +00:00
Dmitry Vyukov
7f5434c44c regexp: clarify docs re Submatch result
Currently we say that a negative index means no match,
but we don't say how "no match" is expressed when 'Index'
is not present. Say how it is expressed.

Change-Id: I82b6c9038557ac49852ac03642afc0bc545bb4a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/175677
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-05-08 17:03:20 +00:00
Francesc Campoy
20930c7623 regexp: limit the capacity of slices of bytes returned by FindX
This change limits the capacity of the slices of bytes returned by:

- Find
- FindAll
- FindAllSubmatch

to be the same as their length.

Fixes #30169

Change-Id: I07b632757d2bfeab42fce0d42364e2a16c597360
Reviewed-on: https://go-review.googlesource.com/c/161877
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-26 23:51:17 +00:00
Russ Cox
bf68744a12 regexp: add partial Deprecation comment to Copy
Change-Id: I21b7817e604a48330f1ee250f7b1b2adc1f16067
Reviewed-on: https://go-review.googlesource.com/c/139784
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-12 17:48:44 +00:00
Russ Cox
3ca1f28e54 regexp: evaluate context flags lazily
There's no point in computing whether we're at the
beginning of the line if the NFA isn't going to ask.
Wait to compute that until asked.

Whatever minor slowdowns were introduced by
the conversion to pools that were not repaid by
other optimizations are taken care of by this one.

name                             old time/op    new time/op    delta
Find-12                             252ns ± 0%     260ns ± 0%   +3.34%  (p=0.000 n=10+8)
FindAllNoMatches-12                 136ns ± 4%     134ns ± 4%   -0.96%  (p=0.033 n=10+10)
FindString-12                       246ns ± 0%     250ns ± 0%   +1.46%  (p=0.000 n=8+10)
FindSubmatch-12                     332ns ± 1%     332ns ± 0%     ~     (p=0.101 n=9+10)
FindStringSubmatch-12               321ns ± 1%     322ns ± 1%     ~     (p=0.717 n=9+10)
Literal-12                         91.6ns ± 0%    92.3ns ± 0%   +0.74%  (p=0.000 n=9+9)
NotLiteral-12                      1.47µs ± 0%    1.47µs ± 0%   +0.38%  (p=0.000 n=9+8)
MatchClass-12                      2.15µs ± 0%    2.15µs ± 0%   +0.39%  (p=0.000 n=10+10)
MatchClass_InRange-12              2.09µs ± 0%    2.11µs ± 0%   +0.75%  (p=0.000 n=9+9)
ReplaceAll-12                      1.40µs ± 0%    1.40µs ± 0%     ~     (p=0.525 n=10+10)
AnchoredLiteralShortNonMatch-12    83.5ns ± 0%    81.6ns ± 0%   -2.28%  (p=0.000 n=9+10)
AnchoredLiteralLongNonMatch-12      101ns ± 0%      97ns ± 1%   -3.54%  (p=0.000 n=10+10)
AnchoredShortMatch-12               131ns ± 0%     128ns ± 0%   -2.29%  (p=0.000 n=10+9)
AnchoredLongMatch-12                268ns ± 1%     252ns ± 1%   -6.04%  (p=0.000 n=10+10)
OnePassShortA-12                    614ns ± 0%     587ns ± 1%   -4.33%  (p=0.000 n=6+10)
NotOnePassShortA-12                 552ns ± 0%     547ns ± 1%   -0.89%  (p=0.000 n=10+10)
OnePassShortB-12                    494ns ± 0%     455ns ± 0%   -7.96%  (p=0.000 n=9+9)
NotOnePassShortB-12                 411ns ± 0%     406ns ± 0%   -1.30%  (p=0.000 n=9+9)
OnePassLongPrefix-12                109ns ± 0%     108ns ± 1%     ~     (p=0.064 n=8+9)
OnePassLongNotPrefix-12             403ns ± 0%     349ns ± 0%  -13.30%  (p=0.000 n=9+8)
MatchParallelShared-12             38.9ns ± 1%    37.9ns ± 1%   -2.65%  (p=0.000 n=10+8)
MatchParallelCopied-12             39.2ns ± 1%    38.3ns ± 2%   -2.20%  (p=0.001 n=10+10)
QuoteMetaAll-12                    94.5ns ± 0%    94.7ns ± 0%   +0.18%  (p=0.043 n=10+9)
QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
Match/Easy0/32-12                  72.2ns ± 0%    71.9ns ± 0%   -0.38%  (p=0.009 n=8+10)
Match/Easy0/1K-12                   296ns ± 1%     297ns ± 0%   +0.51%  (p=0.001 n=10+9)
Match/Easy0/32K-12                 4.57µs ± 3%    4.61µs ± 2%     ~     (p=0.280 n=10+10)
Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.986 n=10+10)
Match/Easy0/32M-12                 7.96ms ± 0%    7.98ms ± 0%   +0.22%  (p=0.010 n=10+9)
Match/Easy0i/32-12                 1.09µs ± 0%    1.10µs ± 0%   +0.23%  (p=0.000 n=8+9)
Match/Easy0i/1K-12                 31.7µs ± 0%    31.7µs ± 0%   +0.09%  (p=0.003 n=9+8)
Match/Easy0i/32K-12                1.61ms ± 0%    1.27ms ± 1%  -21.03%  (p=0.000 n=8+10)
Match/Easy0i/1M-12                 51.4ms ± 0%    40.4ms ± 0%  -21.29%  (p=0.000 n=8+8)
Match/Easy0i/32M-12                 1.65s ± 0%     1.30s ± 1%  -21.22%  (p=0.000 n=9+9)
Match/Easy1/32-12                  67.6ns ± 1%    67.2ns ± 0%     ~     (p=0.085 n=10+9)
Match/Easy1/1K-12                   873ns ± 2%     880ns ± 0%   +0.78%  (p=0.006 n=9+7)
Match/Easy1/32K-12                 39.7µs ± 1%    34.3µs ± 3%  -13.53%  (p=0.000 n=10+10)
Match/Easy1/1M-12                  1.41ms ± 1%    1.19ms ± 3%  -15.48%  (p=0.000 n=10+10)
Match/Easy1/32M-12                 44.9ms ± 1%    38.0ms ± 2%  -15.21%  (p=0.000 n=10+10)
Match/Medium/32-12                 1.04µs ± 0%    1.03µs ± 0%   -0.57%  (p=0.000 n=9+9)
Match/Medium/1K-12                 31.2µs ± 0%    31.4µs ± 1%   +0.61%  (p=0.000 n=8+10)
Match/Medium/32K-12                1.45ms ± 1%    1.20ms ± 0%  -17.70%  (p=0.000 n=10+8)
Match/Medium/1M-12                 46.4ms ± 0%    38.4ms ± 2%  -17.32%  (p=0.000 n=6+9)
Match/Medium/32M-12                 1.49s ± 1%     1.24s ± 1%  -16.81%  (p=0.000 n=10+10)
Match/Hard/32-12                   1.47µs ± 0%    1.47µs ± 0%   -0.31%  (p=0.000 n=9+10)
Match/Hard/1K-12                   44.5µs ± 1%    44.4µs ± 0%     ~     (p=0.075 n=10+10)
Match/Hard/32K-12                  2.09ms ± 0%    1.78ms ± 7%  -14.88%  (p=0.000 n=8+10)
Match/Hard/1M-12                   67.8ms ± 5%    56.9ms ± 7%  -16.05%  (p=0.000 n=10+10)
Match/Hard/32M-12                   2.17s ± 5%     1.84s ± 6%  -15.21%  (p=0.000 n=10+10)
Match/Hard1/32-12                  7.89µs ± 0%    7.94µs ± 0%   +0.61%  (p=0.000 n=9+9)
Match/Hard1/1K-12                   246µs ± 0%     245µs ± 0%   -0.30%  (p=0.010 n=9+10)
Match/Hard1/32K-12                 8.93ms ± 0%    8.17ms ± 0%   -8.44%  (p=0.000 n=9+8)
Match/Hard1/1M-12                   286ms ± 0%     269ms ± 9%   -5.66%  (p=0.028 n=9+10)
Match/Hard1/32M-12                  9.16s ± 0%     8.61s ± 8%   -5.98%  (p=0.028 n=9+10)
Match_onepass_regex/32-12           825ns ± 0%     712ns ± 0%  -13.75%  (p=0.000 n=8+8)
Match_onepass_regex/1K-12          28.7µs ± 1%    19.8µs ± 0%  -30.99%  (p=0.000 n=9+8)
Match_onepass_regex/32K-12          950µs ± 1%     628µs ± 0%  -33.83%  (p=0.000 n=9+8)
Match_onepass_regex/1M-12          30.4ms ± 0%    20.1ms ± 0%  -33.74%  (p=0.000 n=9+8)
Match_onepass_regex/32M-12          974ms ± 1%     646ms ± 0%  -33.73%  (p=0.000 n=9+8)
CompileOnepass-12                  4.60µs ± 0%    4.59µs ± 0%     ~     (p=0.063 n=8+9)
[Geo mean]                         23.1µs         21.3µs        -7.44%

https://perf.golang.org/search?q=upload:20181004.4

Change-Id: I47cdd09f6dcde1d7c317080e9b4df42c7d0a8d24
Reviewed-on: https://go-review.googlesource.com/c/139782
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-12 17:48:41 +00:00
Russ Cox
a376435ae5 regexp: use pools for NFA machines
Now the machine struct is only used for NFA execution.
Use global pools to cache machines instead of per-Regexp lists.

Also eliminate some tail calls in NFA execution, to pay for
the added overhead of sync.Pool.

name                             old time/op    new time/op    delta
Find-12                             252ns ± 0%     252ns ± 0%     ~     (p=1.000 n=10+10)
FindAllNoMatches-12                 134ns ± 1%     136ns ± 4%     ~     (p=0.443 n=9+10)
FindString-12                       246ns ± 0%     246ns ± 0%   -0.16%  (p=0.046 n=10+8)
FindSubmatch-12                     333ns ± 2%     332ns ± 1%     ~     (p=0.489 n=10+9)
FindStringSubmatch-12               320ns ± 0%     321ns ± 1%   +0.55%  (p=0.005 n=10+9)
Literal-12                         91.1ns ± 0%    91.6ns ± 0%   +0.55%  (p=0.000 n=10+9)
NotLiteral-12                      1.45µs ± 0%    1.47µs ± 0%   +0.82%  (p=0.000 n=10+9)
MatchClass-12                      2.19µs ± 0%    2.15µs ± 0%   -2.01%  (p=0.000 n=9+10)
MatchClass_InRange-12              2.09µs ± 0%    2.09µs ± 0%     ~     (p=0.082 n=10+9)
ReplaceAll-12                      1.39µs ± 0%    1.40µs ± 0%   +0.50%  (p=0.000 n=10+10)
AnchoredLiteralShortNonMatch-12    82.4ns ± 0%    83.5ns ± 0%   +1.36%  (p=0.000 n=8+9)
AnchoredLiteralLongNonMatch-12      106ns ± 1%     101ns ± 0%   -4.36%  (p=0.000 n=10+10)
AnchoredShortMatch-12               130ns ± 0%     131ns ± 0%   +0.77%  (p=0.000 n=9+10)
AnchoredLongMatch-12                272ns ± 0%     268ns ± 1%   -1.46%  (p=0.000 n=8+10)
OnePassShortA-12                    615ns ± 0%     614ns ± 0%     ~     (p=0.094 n=10+6)
NotOnePassShortA-12                 549ns ± 0%     552ns ± 0%   +0.52%  (p=0.000 n=9+10)
OnePassShortB-12                    494ns ± 0%     494ns ± 0%     ~     (p=0.247 n=8+9)
NotOnePassShortB-12                 412ns ± 1%     411ns ± 0%     ~     (p=0.625 n=10+9)
OnePassLongPrefix-12                108ns ± 0%     109ns ± 0%   +0.93%  (p=0.000 n=10+8)
OnePassLongNotPrefix-12             402ns ± 0%     403ns ± 0%   +0.14%  (p=0.041 n=8+9)
MatchParallelShared-12             38.6ns ± 2%    38.9ns ± 1%     ~     (p=0.172 n=9+10)
MatchParallelCopied-12             39.4ns ± 7%    39.2ns ± 1%     ~     (p=0.423 n=10+10)
QuoteMetaAll-12                    94.9ns ± 0%    94.5ns ± 0%   -0.42%  (p=0.000 n=9+10)
QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
Match/Easy0/32-12                  72.1ns ± 0%    72.2ns ± 0%     ~     (p=0.435 n=9+8)
Match/Easy0/1K-12                   298ns ± 0%     296ns ± 1%   -1.01%  (p=0.000 n=8+10)
Match/Easy0/32K-12                 4.64µs ± 1%    4.57µs ± 3%   -1.39%  (p=0.030 n=10+10)
Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.971 n=10+10)
Match/Easy0/32M-12                 7.95ms ± 0%    7.96ms ± 0%     ~     (p=0.278 n=9+10)
Match/Easy0i/32-12                 1.10µs ± 0%    1.09µs ± 0%   -0.29%  (p=0.000 n=9+8)
Match/Easy0i/1K-12                 31.8µs ± 1%    31.7µs ± 0%     ~     (p=0.704 n=10+9)
Match/Easy0i/32K-12                1.62ms ± 1%    1.61ms ± 0%   -1.12%  (p=0.000 n=10+8)
Match/Easy0i/1M-12                 51.8ms ± 0%    51.4ms ± 0%   -0.84%  (p=0.000 n=8+8)
Match/Easy0i/32M-12                 1.65s ± 0%     1.65s ± 0%   -0.46%  (p=0.000 n=9+9)
Match/Easy1/32-12                  67.7ns ± 1%    67.6ns ± 1%     ~     (p=0.723 n=10+10)
Match/Easy1/1K-12                   873ns ± 0%     873ns ± 2%     ~     (p=0.345 n=10+9)
Match/Easy1/32K-12                 39.4µs ± 0%    39.7µs ± 1%   +0.66%  (p=0.000 n=10+10)
Match/Easy1/1M-12                  1.39ms ± 0%    1.41ms ± 1%   +1.10%  (p=0.000 n=10+10)
Match/Easy1/32M-12                 44.3ms ± 0%    44.9ms ± 1%   +1.18%  (p=0.000 n=10+10)
Match/Medium/32-12                 1.04µs ± 0%    1.04µs ± 0%   -0.58%  (p=0.000 n=9+9)
Match/Medium/1K-12                 31.4µs ± 0%    31.2µs ± 0%   -0.62%  (p=0.000 n=8+8)
Match/Medium/32K-12                1.45ms ± 0%    1.45ms ± 1%     ~     (p=0.356 n=9+10)
Match/Medium/1M-12                 46.4ms ± 0%    46.4ms ± 0%     ~     (p=0.142 n=8+6)
Match/Medium/32M-12                 1.49s ± 1%     1.49s ± 1%     ~     (p=0.739 n=10+10)
Match/Hard/32-12                   1.48µs ± 0%    1.47µs ± 0%   -0.53%  (p=0.000 n=9+9)
Match/Hard/1K-12                   45.0µs ± 1%    44.5µs ± 1%   -1.06%  (p=0.000 n=10+10)
Match/Hard/32K-12                  2.24ms ± 0%    2.09ms ± 0%   -6.56%  (p=0.000 n=8+8)
Match/Hard/1M-12                   71.6ms ± 0%    67.8ms ± 5%   -5.36%  (p=0.000 n=7+10)
Match/Hard/32M-12                   2.29s ± 0%     2.17s ± 5%   -5.40%  (p=0.000 n=9+10)
Match/Hard1/32-12                  7.89µs ± 0%    7.89µs ± 0%     ~     (p=0.053 n=9+9)
Match/Hard1/1K-12                   244µs ± 0%     246µs ± 0%   +0.71%  (p=0.000 n=10+9)
Match/Hard1/32K-12                 10.3ms ± 0%     8.9ms ± 0%  -13.76%  (p=0.000 n=10+9)
Match/Hard1/1M-12                   331ms ± 0%     286ms ± 0%  -13.72%  (p=0.000 n=9+9)
Match/Hard1/32M-12                  10.6s ± 0%      9.2s ± 0%  -13.72%  (p=0.000 n=10+9)
Match_onepass_regex/32-12           830ns ± 0%     825ns ± 0%   -0.57%  (p=0.000 n=9+8)
Match_onepass_regex/1K-12          28.7µs ± 1%    28.7µs ± 1%   -0.22%  (p=0.040 n=9+9)
Match_onepass_regex/32K-12          949µs ± 0%     950µs ± 1%     ~     (p=0.236 n=8+9)
Match_onepass_regex/1M-12          30.4ms ± 0%    30.4ms ± 0%     ~     (p=0.059 n=8+9)
Match_onepass_regex/32M-12          973ms ± 0%     974ms ± 1%     ~     (p=0.258 n=9+9)
CompileOnepass-12                  4.64µs ± 0%    4.60µs ± 0%   -0.90%  (p=0.000 n=10+8)
[Geo mean]                         23.3µs         23.1µs        -1.16%

https://perf.golang.org/search?q=upload:20181004.3

Change-Id: I46f3d52ce89c8cd992cf554473c27af81fd81bfd
Reviewed-on: https://go-review.googlesource.com/c/139781
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-12 17:48:36 +00:00
Russ Cox
60b2971180 regexp: split one-pass execution out of machine struct
This allows the one-pass executions to have their
own pool of (much smaller) allocated structures.
A step toward eliminating the per-Regexp machine cache.

Not much effect on benchmarks, since there are no
optimizations here, and pools are a tiny bit slower than a
locked data structure for single-threaded code.

name                             old time/op    new time/op    delta
Find-12                             254ns ± 0%     252ns ± 0%  -0.94%  (p=0.000 n=9+10)
FindAllNoMatches-12                 135ns ± 0%     134ns ± 1%  -0.49%  (p=0.002 n=9+9)
FindString-12                       247ns ± 0%     246ns ± 0%  -0.24%  (p=0.003 n=8+10)
FindSubmatch-12                     334ns ± 0%     333ns ± 2%    ~     (p=0.283 n=10+10)
FindStringSubmatch-12               321ns ± 0%     320ns ± 0%  -0.51%  (p=0.000 n=9+10)
Literal-12                         92.2ns ± 0%    91.1ns ± 0%  -1.25%  (p=0.000 n=9+10)
NotLiteral-12                      1.47µs ± 0%    1.45µs ± 0%  -0.99%  (p=0.000 n=9+10)
MatchClass-12                      2.17µs ± 0%    2.19µs ± 0%  +0.84%  (p=0.000 n=7+9)
MatchClass_InRange-12              2.13µs ± 0%    2.09µs ± 0%  -1.70%  (p=0.000 n=10+10)
ReplaceAll-12                      1.39µs ± 0%    1.39µs ± 0%  +0.51%  (p=0.000 n=10+10)
AnchoredLiteralShortNonMatch-12    83.2ns ± 0%    82.4ns ± 0%  -0.96%  (p=0.000 n=8+8)
AnchoredLiteralLongNonMatch-12      105ns ± 0%     106ns ± 1%    ~     (p=0.087 n=10+10)
AnchoredShortMatch-12               131ns ± 0%     130ns ± 0%  -0.76%  (p=0.000 n=10+9)
AnchoredLongMatch-12                267ns ± 0%     272ns ± 0%  +2.01%  (p=0.000 n=10+8)
OnePassShortA-12                    611ns ± 0%     615ns ± 0%  +0.61%  (p=0.000 n=9+10)
NotOnePassShortA-12                 552ns ± 0%     549ns ± 0%  -0.46%  (p=0.000 n=8+9)
OnePassShortB-12                    491ns ± 0%     494ns ± 0%  +0.61%  (p=0.000 n=8+8)
NotOnePassShortB-12                 412ns ± 0%     412ns ± 1%    ~     (p=0.151 n=9+10)
OnePassLongPrefix-12                112ns ± 0%     108ns ± 0%  -3.57%  (p=0.000 n=10+10)
OnePassLongNotPrefix-12             410ns ± 0%     402ns ± 0%  -1.95%  (p=0.000 n=9+8)
MatchParallelShared-12             38.8ns ± 1%    38.6ns ± 2%    ~     (p=0.536 n=10+9)
MatchParallelCopied-12             39.2ns ± 3%    39.4ns ± 7%    ~     (p=0.986 n=10+10)
QuoteMetaAll-12                    94.6ns ± 0%    94.9ns ± 0%  +0.29%  (p=0.001 n=8+9)
QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%    ~     (all equal)
Match/Easy0/32-12                  72.9ns ± 0%    72.1ns ± 0%  -1.07%  (p=0.000 n=9+9)
Match/Easy0/1K-12                   298ns ± 0%     298ns ± 0%    ~     (p=0.140 n=6+8)
Match/Easy0/32K-12                 4.60µs ± 2%    4.64µs ± 1%    ~     (p=0.171 n=10+10)
Match/Easy0/1M-12                   235µs ± 0%     234µs ± 0%  -0.14%  (p=0.004 n=10+10)
Match/Easy0/32M-12                 7.96ms ± 0%    7.95ms ± 0%  -0.12%  (p=0.043 n=10+9)
Match/Easy0i/32-12                 1.09µs ± 0%    1.10µs ± 0%  +0.15%  (p=0.000 n=8+9)
Match/Easy0i/1K-12                 31.7µs ± 0%    31.8µs ± 1%    ~     (p=0.905 n=9+10)
Match/Easy0i/32K-12                1.61ms ± 0%    1.62ms ± 1%  +1.12%  (p=0.000 n=9+10)
Match/Easy0i/1M-12                 51.4ms ± 0%    51.8ms ± 0%  +0.85%  (p=0.000 n=8+8)
Match/Easy0i/32M-12                 1.65s ± 1%     1.65s ± 0%    ~     (p=0.113 n=9+9)
Match/Easy1/32-12                  67.9ns ± 0%    67.7ns ± 1%    ~     (p=0.232 n=8+10)
Match/Easy1/1K-12                   884ns ± 0%     873ns ± 0%  -1.29%  (p=0.000 n=9+10)
Match/Easy1/32K-12                 39.2µs ± 0%    39.4µs ± 0%  +0.50%  (p=0.000 n=9+10)
Match/Easy1/1M-12                  1.39ms ± 0%    1.39ms ± 0%  +0.29%  (p=0.000 n=9+10)
Match/Easy1/32M-12                 44.2ms ± 1%    44.3ms ± 0%  +0.21%  (p=0.029 n=10+10)
Match/Medium/32-12                 1.05µs ± 0%    1.04µs ± 0%  -0.27%  (p=0.001 n=8+9)
Match/Medium/1K-12                 31.3µs ± 0%    31.4µs ± 0%  +0.39%  (p=0.000 n=9+8)
Match/Medium/32K-12                1.45ms ± 0%    1.45ms ± 0%  +0.33%  (p=0.000 n=8+9)
Match/Medium/1M-12                 46.2ms ± 0%    46.4ms ± 0%  +0.35%  (p=0.000 n=9+8)
Match/Medium/32M-12                 1.48s ± 0%     1.49s ± 1%  +0.70%  (p=0.000 n=8+10)
Match/Hard/32-12                   1.49µs ± 0%    1.48µs ± 0%  -0.43%  (p=0.000 n=10+9)
Match/Hard/1K-12                   45.1µs ± 1%    45.0µs ± 1%    ~     (p=0.393 n=10+10)
Match/Hard/32K-12                  2.18ms ± 1%    2.24ms ± 0%  +2.71%  (p=0.000 n=9+8)
Match/Hard/1M-12                   69.7ms ± 1%    71.6ms ± 0%  +2.76%  (p=0.000 n=9+7)
Match/Hard/32M-12                   2.23s ± 1%     2.29s ± 0%  +2.65%  (p=0.000 n=9+9)
Match/Hard1/32-12                  7.89µs ± 0%    7.89µs ± 0%    ~     (p=0.286 n=9+9)
Match/Hard1/1K-12                   244µs ± 0%     244µs ± 0%    ~     (p=0.905 n=9+10)
Match/Hard1/32K-12                 10.3ms ± 0%    10.3ms ± 0%    ~     (p=0.796 n=10+10)
Match/Hard1/1M-12                   331ms ± 0%     331ms ± 0%    ~     (p=0.167 n=8+9)
Match/Hard1/32M-12                  10.6s ± 0%     10.6s ± 0%    ~     (p=0.315 n=8+10)
Match_onepass_regex/32-12           812ns ± 0%     830ns ± 0%  +2.19%  (p=0.000 n=10+9)
Match_onepass_regex/1K-12          28.5µs ± 0%    28.7µs ± 1%  +0.97%  (p=0.000 n=10+9)
Match_onepass_regex/32K-12          936µs ± 0%     949µs ± 0%  +1.43%  (p=0.000 n=10+8)
Match_onepass_regex/1M-12          30.2ms ± 0%    30.4ms ± 0%  +0.62%  (p=0.000 n=10+8)
Match_onepass_regex/32M-12          970ms ± 0%     973ms ± 0%  +0.35%  (p=0.000 n=10+9)
CompileOnepass-12                  4.63µs ± 1%    4.64µs ± 0%    ~     (p=0.060 n=10+10)
[Geo mean]                         23.3µs         23.3µs       +0.12%

https://perf.golang.org/search?q=upload:20181004.2

Change-Id: Iff9e9f9d4a4698162126a2f300e8ed1b1a39361e
Reviewed-on: https://go-review.googlesource.com/c/139780
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-12 17:48:35 +00:00
Russ Cox
2d4346b319 regexp: split bit-state execution out of machine struct
This allows the bit-state executions to have their
own pool of allocated structures. A step toward
eliminating the per-Regexp machine cache.

Note especially the -92% on MatchParallelShared.
This is real but not a complete story: the other
execution engines still need to be de-shared,
but the benchmark was only using bit-state.

The tiny slowdowns in unrelated code are noise.

name                             old time/op    new time/op    delta
Find-12                             264ns ± 3%     254ns ± 0%   -3.86%  (p=0.000 n=10+9)
FindAllNoMatches-12                 140ns ± 2%     135ns ± 0%   -3.91%  (p=0.000 n=10+9)
FindString-12                       256ns ± 0%     247ns ± 0%   -3.52%  (p=0.000 n=8+8)
FindSubmatch-12                     339ns ± 1%     334ns ± 0%   -1.41%  (p=0.000 n=9+10)
FindStringSubmatch-12               322ns ± 0%     321ns ± 0%   -0.21%  (p=0.005 n=8+9)
Literal-12                          100ns ± 2%      92ns ± 0%   -8.10%  (p=0.000 n=10+9)
NotLiteral-12                      1.50µs ± 0%    1.47µs ± 0%   -1.91%  (p=0.000 n=8+9)
MatchClass-12                      2.18µs ± 0%    2.17µs ± 0%   -0.20%  (p=0.001 n=10+7)
MatchClass_InRange-12              2.12µs ± 0%    2.13µs ± 0%   +0.23%  (p=0.000 n=10+10)
ReplaceAll-12                      1.41µs ± 0%    1.39µs ± 0%   -1.30%  (p=0.000 n=7+10)
AnchoredLiteralShortNonMatch-12    89.8ns ± 0%    83.2ns ± 0%   -7.35%  (p=0.000 n=8+8)
AnchoredLiteralLongNonMatch-12      105ns ± 3%     105ns ± 0%     ~     (p=0.186 n=10+10)
AnchoredShortMatch-12               141ns ± 0%     131ns ± 0%   -7.09%  (p=0.000 n=9+10)
AnchoredLongMatch-12                276ns ± 4%     267ns ± 0%   -3.23%  (p=0.000 n=10+10)
OnePassShortA-12                    620ns ± 0%     611ns ± 0%   -1.39%  (p=0.000 n=10+9)
NotOnePassShortA-12                 575ns ± 3%     552ns ± 0%   -3.97%  (p=0.000 n=10+8)
OnePassShortB-12                    493ns ± 0%     491ns ± 0%   -0.33%  (p=0.000 n=8+8)
NotOnePassShortB-12                 423ns ± 0%     412ns ± 0%   -2.60%  (p=0.000 n=8+9)
OnePassLongPrefix-12                112ns ± 0%     112ns ± 0%     ~     (all equal)
OnePassLongNotPrefix-12             405ns ± 0%     410ns ± 0%   +1.23%  (p=0.000 n=8+9)
MatchParallelShared-12              501ns ± 1%      39ns ± 1%  -92.27%  (p=0.000 n=10+10)
MatchParallelCopied-12             39.1ns ± 0%    39.2ns ± 3%     ~     (p=0.785 n=6+10)
QuoteMetaAll-12                    94.6ns ± 0%    94.6ns ± 0%     ~     (p=0.439 n=10+8)
QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
Match/Easy0/32-12                  79.1ns ± 0%    72.9ns ± 0%   -7.85%  (p=0.000 n=9+9)
Match/Easy0/1K-12                   307ns ± 1%     298ns ± 0%   -2.99%  (p=0.000 n=10+6)
Match/Easy0/32K-12                 4.65µs ± 2%    4.60µs ± 2%     ~     (p=0.159 n=10+10)
Match/Easy0/1M-12                   234µs ± 0%     235µs ± 0%   +0.17%  (p=0.003 n=10+10)
Match/Easy0/32M-12                 7.98ms ± 1%    7.96ms ± 0%     ~     (p=0.278 n=9+10)
Match/Easy0i/32-12                 1.13µs ± 1%    1.09µs ± 0%   -3.24%  (p=0.000 n=9+8)
Match/Easy0i/1K-12                 32.5µs ± 0%    31.7µs ± 0%   -2.66%  (p=0.000 n=9+9)
Match/Easy0i/32K-12                1.59ms ± 0%    1.61ms ± 0%   +0.75%  (p=0.000 n=9+9)
Match/Easy0i/1M-12                 51.0ms ± 0%    51.4ms ± 0%   +0.77%  (p=0.000 n=10+8)
Match/Easy0i/32M-12                 1.63s ± 0%     1.65s ± 1%   +1.24%  (p=0.000 n=7+9)
Match/Easy1/32-12                  75.1ns ± 1%    67.9ns ± 0%   -9.54%  (p=0.000 n=8+8)
Match/Easy1/1K-12                   861ns ± 0%     884ns ± 0%   +2.71%  (p=0.000 n=8+9)
Match/Easy1/32K-12                 39.2µs ± 1%    39.2µs ± 0%     ~     (p=0.090 n=10+9)
Match/Easy1/1M-12                  1.38ms ± 0%    1.39ms ± 0%     ~     (p=0.095 n=10+9)
Match/Easy1/32M-12                 44.2ms ± 1%    44.2ms ± 1%     ~     (p=0.218 n=10+10)
Match/Medium/32-12                 1.04µs ± 1%    1.05µs ± 0%   +1.05%  (p=0.000 n=9+8)
Match/Medium/1K-12                 31.3µs ± 0%    31.3µs ± 0%   -0.14%  (p=0.004 n=9+9)
Match/Medium/32K-12                1.44ms ± 0%    1.45ms ± 0%   +0.18%  (p=0.001 n=8+8)
Match/Medium/1M-12                 46.1ms ± 0%    46.2ms ± 0%   +0.13%  (p=0.003 n=6+9)
Match/Medium/32M-12                 1.48s ± 0%     1.48s ± 0%   +0.20%  (p=0.002 n=9+8)
Match/Hard/32-12                   1.54µs ± 1%    1.49µs ± 0%   -3.60%  (p=0.000 n=9+10)
Match/Hard/1K-12                   46.4µs ± 1%    45.1µs ± 1%   -2.78%  (p=0.000 n=9+10)
Match/Hard/32K-12                  2.19ms ± 0%    2.18ms ± 1%   -0.51%  (p=0.006 n=8+9)
Match/Hard/1M-12                   70.1ms ± 0%    69.7ms ± 1%   -0.52%  (p=0.006 n=8+9)
Match/Hard/32M-12                   2.24s ± 0%     2.23s ± 1%   -0.42%  (p=0.046 n=8+9)
Match/Hard1/32-12                  8.17µs ± 1%    7.89µs ± 0%   -3.42%  (p=0.000 n=8+9)
Match/Hard1/1K-12                   254µs ± 2%     244µs ± 0%   -3.91%  (p=0.000 n=9+9)
Match/Hard1/32K-12                 9.58ms ± 1%   10.35ms ± 0%   +8.00%  (p=0.000 n=10+10)
Match/Hard1/1M-12                   306ms ± 1%     331ms ± 0%   +8.27%  (p=0.000 n=9+8)
Match/Hard1/32M-12                  9.79s ± 1%    10.60s ± 0%   +8.29%  (p=0.000 n=9+8)
Match_onepass_regex/32-12           808ns ± 0%     812ns ± 0%   +0.47%  (p=0.000 n=8+10)
Match_onepass_regex/1K-12          27.8µs ± 0%    28.5µs ± 0%   +2.32%  (p=0.000 n=8+10)
Match_onepass_regex/32K-12          925µs ± 0%     936µs ± 0%   +1.24%  (p=0.000 n=9+10)
Match_onepass_regex/1M-12          29.5ms ± 0%    30.2ms ± 0%   +2.38%  (p=0.000 n=10+10)
Match_onepass_regex/32M-12          945ms ± 0%     970ms ± 0%   +2.60%  (p=0.000 n=9+10)
CompileOnepass-12                  4.67µs ± 0%    4.63µs ± 1%   -0.84%  (p=0.000 n=10+10)
[Geo mean]                         24.5µs         23.3µs        -5.04%

https://perf.golang.org/search?q=upload:20181004.1

Change-Id: Idbc2b76223718265657819ff38be2d9aba1c54b4
Reviewed-on: https://go-review.googlesource.com/c/139779
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-12 17:48:34 +00:00
Alan Donovan
8c610aa633 regexp: fix incorrect name in Match doc comment
Change-Id: I628aad9a3abe9cc0c3233f476960e53bd291eca9
Reviewed-on: https://go-review.googlesource.com/135235
Reviewed-by: Ralph Corderoy <ralph@inputplus.co.uk>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-09-13 16:29:06 +00:00
Russ Cox
f50448f531 regexp: reword Match documentation to be more like Find
Before:

  // Find returns a slice holding the text of the leftmost match in b of the regular expression.

  // Match checks whether a textual regular expression matches a byte slice.

After:

  // Match reports whether the byte slice b contains any match of the regular expression re.

The use of different wording for Find and Match always makes me think
that Match required the entire string to match while Find clearly allows
a substring to match.

This CL makes the Match wording correspond more closely to Find,
to try to avoid that confusion.

Change-Id: I97fb82d5080d3246ee5cf52abf28d2a2296a5039
Reviewed-on: https://go-review.googlesource.com/123736
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-07-13 18:52:46 +00:00
Russ Cox
a41d21695c regexp: revert "use sync.Pool to cache regexp.machine objects"
Revert CL 101715.

The size of a sync.Pool scales linearly with GOMAXPROCS,
making it inappropriate to put a sync.Pool in any individually
allocated object, as the sync.Pool documentation explains.
The change also broke DeepEqual on regexps.

I have a cleaner way to do this with global sync.Pools but it's
too late in the cycle. Will revisit in Go 1.12. For now, revert.

Fixes #26219.

Change-Id: Ie632e709eb3caf489d85efceac0e4b130ec2019f
Reviewed-on: https://go-review.googlesource.com/122596
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-07-09 15:12:53 +00:00
Matthew Broberg
3885e86411 regexp: add QuoteMeta example
Change-Id: I0bbb53cad9a7c464ab1cfca381128f33496813ff
Reviewed-on: https://go-review.googlesource.com/49130
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-12 22:37:01 +00:00
Tim Cooper
161874da2a all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect
content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2018-06-01 21:52:00 +00:00
Javier Kohen
7dbf9d43f5 regexp: use sync.Pool to cache regexp.machine objects
Performance optimization for the internals of the Regexp type. This adds
no features and has no user-visible impact beyond performance. Copy now
shares the cache, so memory usage for programs that use Copy a lot
should go down; Copy has effectively become a no-op.

The before v. after benchmark results show a lot of noise from run to
run, but there's a clear improvement to the Shared case and no detriment
to the Copied case.

BenchmarkMatchParallelShared-4                        361           77.9          -78.42%
BenchmarkMatchParallelCopied-4                        70.3          72.2          +2.70%

Macro benchmarks show that the lock contention in Regexp is gone, and my
server is now able to scale linearly 2.5x times more than before (and I
only stopped there because I ran out of CPU in my test machine).

Fixes #24411

Change-Id: Ib33abff2802f27599f5d09084775e95b54e3e1d7
Reviewed-on: https://go-review.googlesource.com/101715
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-03 16:03:19 +00:00
Diogo Pinela
9d84e0edd0 regexp: document behavior of FindAll* functions when n < 0
Fixes #24526

Change-Id: I0e38322fca12f9c88db836776920b9dfb66ff844
Reviewed-on: https://go-review.googlesource.com/102423
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Rob Pike <r@golang.org>
2018-03-28 00:31:10 +00:00
Oleg Bulatov
7263540146 regexp: Regexp shouldn't keep references to inputs
If you try to find something in a slice of bytes using a Regexp object,
the byte array will not be released by GC until you use the Regexp object
on another slice of bytes. It happens because the Regexp object keep
references to the input data in its cache.

Change-Id: I873107f15c1900aa53ccae5d29dbc885b9562808
Reviewed-on: https://go-review.googlesource.com/96715
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-23 16:58:48 +00:00
Ahmet Soormally
bb355ed5eb regexp: dont use builtin type as variable name
The existing implementation declares a variable error which collides
with builting type error.

This change simply renames error variable to err.

Change-Id: Ib56c2530f37f53ec70fdebb825a432d4c550cd04
Reviewed-on: https://go-review.googlesource.com/87775
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
2018-02-19 23:40:33 +00:00
Josh Bleecher Snyder
df5997b99b regexp: don't allocate when All methods find no matches
name                old time/op    new time/op    delta
FindAllNoMatches-8     216ns ± 3%     122ns ± 2%   -43.52%  (p=0.000 n=10+10)

name                old alloc/op   new alloc/op   delta
FindAllNoMatches-8      240B ± 0%        0B       -100.00%  (p=0.000 n=10+10)

name                old allocs/op  new allocs/op  delta
FindAllNoMatches-8      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)

This work was supported by Sourcegraph.

Change-Id: I30aac201370ccfb40a6ff637402020ac20f61f70
Reviewed-on: https://go-review.googlesource.com/87418
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-13 18:44:40 +00:00
Marvin Stenger
90d71fe99e all: revert "all: prefer strings.IndexByte over strings.Index"
This reverts https://golang.org/cl/65930.

Fixes #22148

Change-Id: Ie0712621ed89c43bef94417fc32de9af77607760
Reviewed-on: https://go-review.googlesource.com/68430
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-10-05 23:19:10 +00:00
Marvin Stenger
f22ba1f247 all: prefer strings.IndexByte over strings.Index
strings.IndexByte was introduced in go1.2 and it can be used
effectively wherever the second argument to strings.Index is
exactly one byte long.

This avoids generating unnecessary string symbols and saves
a few calls to strings.Index.

Change-Id: I1ab5edb7c4ee9058084cfa57cbcc267c2597e793
Reviewed-on: https://go-review.googlesource.com/65930
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-25 17:35:41 +00:00
Kevin Burke
89ebdbb5fd regexp: speed up QuoteMeta with a lookup table
This is the same technique used in CL 24466. By adding a little bit of
size to the binary, we can remove a function call and gain a lot of
performance.

A raw array ([128]bool) would be faster, but is also be 128 bytes
instead of 16.

Running tip on a Mac:

name             old time/op    new time/op     delta
QuoteMetaAll-4      192ns ±12%      120ns ±11%   -37.27%  (p=0.000 n=10+10)
QuoteMetaNone-4     186ns ± 6%       64ns ± 6%   -65.52%  (p=0.000 n=10+10)

name             old speed      new speed       delta
QuoteMetaAll-4   73.2MB/s ±11%  116.6MB/s ±10%   +59.21%  (p=0.000 n=10+10)
QuoteMetaNone-4   139MB/s ± 6%    405MB/s ± 6%  +190.74%  (p=0.000 n=10+10)

Change-Id: I68ce9fe2ef1c28e2274157789b35b0dd6ae3efb5
Reviewed-on: https://go-review.googlesource.com/41495
Run-TryBot: Kevin Burke <kev@inburke.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-28 06:43:14 +00:00
Ian Lance Taylor
94a9bc960c regexp: document that Longest method is not concurrent-safe
Change-Id: I9ec137502353e65325087dfb60ee9bd68ffd286d
Reviewed-on: https://go-review.googlesource.com/38447
Reviewed-by: Russ Cox <rsc@golang.org>
2017-04-07 21:12:11 +00:00
Martin Möhrmann
e74c6cd3c0 regexp: add ASCII fast path for context methods
The step method implementations check directly if the next rune
only needs one byte to be decoded and avoid calling utf8.DecodeRune
for such ASCII characters.

Introduce the same fast path optimization for rune decoding
for the context methods.

Results for regexp benchmarks that use the context methods:

name                            old time/op  new time/op  delta
AnchoredLiteralShortNonMatch-4  97.5ns ± 1%  94.8ns ± 2%  -2.80%  (p=0.000 n=45+43)
AnchoredShortMatch-4             163ns ± 1%   160ns ± 1%  -1.84%  (p=0.000 n=46+47)
NotOnePassShortA-4               742ns ± 2%   742ns ± 2%    ~     (p=0.440 n=49+50)
NotOnePassShortB-4               535ns ± 1%   533ns ± 2%  -0.37%  (p=0.005 n=46+48)
OnePassLongPrefix-4              169ns ± 2%   166ns ± 2%  -2.06%  (p=0.000 n=50+49)

Change-Id: Ib302d9e8c63333f02695369fcf9963974362e335
Reviewed-on: https://go-review.googlesource.com/38256
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-23 00:08:20 +00:00
Ingo Oeser
9aed16e96d regexp: avoid alloc in QuoteMeta when not quoting
Many users quote literals in regular expressions just in case.
No need to allocate then.

Note: Also added benchmarks for quoting and not quoting.

	name             old time/op    new time/op     delta
	QuoteMetaAll-4      629ns ± 6%      654ns ± 5%    +4.01%        (p=0.001 n=20+19)
	QuoteMetaNone-4    1.02µs ± 6%     0.20µs ± 0%   -80.73%        (p=0.000 n=18+20)

	name             old speed      new speed       delta
	QuoteMetaAll-4   22.3MB/s ± 6%   21.4MB/s ± 5%    -3.94%        (p=0.001 n=20+19)
	QuoteMetaNone-4  25.3MB/s ± 3%  131.5MB/s ± 0%  +419.28%        (p=0.000 n=17+19)

	name             old alloc/op   new alloc/op    delta
	QuoteMetaAll-4      64.0B ± 0%      64.0B ± 0%      ~     (all samples are equal)
	QuoteMetaNone-4     96.0B ± 0%      0.0B ±NaN%  -100.00%        (p=0.000 n=20+20)

	name             old allocs/op  new allocs/op   delta
	QuoteMetaAll-4       2.00 ± 0%       2.00 ± 0%      ~     (all samples are equal)
	QuoteMetaNone-4      2.00 ± 0%      0.00 ±NaN%  -100.00%        (p=0.000 n=20+20)

Change-Id: I38d50f463cde463115d22534f8eb849e54d899af
Reviewed-on: https://go-review.googlesource.com/31395
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-19 07:09:08 +00:00
Aliaksandr Valialkin
bea39e63ec regexp: reduce mallocs in Regexp.Find* and Regexp.ReplaceAll*.
This improves Regexp.Find* and Regexp.ReplaceAll* speed:

name                  old time/op    new time/op    delta
Find-4                   345ns ± 1%     314ns ± 1%    -8.94%    (p=0.000 n=9+8)
FindString-4             341ns ± 1%     308ns ± 0%    -9.85%   (p=0.000 n=10+9)
FindSubmatch-4           440ns ± 1%     404ns ± 0%    -8.27%   (p=0.000 n=10+8)
FindStringSubmatch-4     426ns ± 0%     387ns ± 0%    -9.07%   (p=0.000 n=10+9)
ReplaceAll-4            1.75µs ± 1%    1.67µs ± 0%    -4.45%   (p=0.000 n=9+10)

name                  old alloc/op   new alloc/op   delta
Find-4                   16.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=10+10)
FindString-4             16.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=10+10)
FindSubmatch-4           80.0B ± 0%     48.0B ± 0%   -40.00%  (p=0.000 n=10+10)
FindStringSubmatch-4     64.0B ± 0%     32.0B ± 0%   -50.00%  (p=0.000 n=10+10)
ReplaceAll-4              152B ± 0%      104B ± 0%   -31.58%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
Find-4                    1.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=10+10)
FindString-4              1.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=10+10)
FindSubmatch-4            2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
FindStringSubmatch-4      2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
ReplaceAll-4              8.00 ± 0%      5.00 ± 0%   -37.50%  (p=0.000 n=10+10)

Fixes #15643

Change-Id: I594fe51172373e2adb98d1d25c76ca2cde54ff48
Reviewed-on: https://go-review.googlesource.com/23030
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-06 17:04:31 +00:00
Dominik Honnef
fdba5a7544 all: delete dead non-test code
This change removes a lot of dead code. Some of the code has never been
used, not even when it was first commited. The rest shouldn't have
survived refactors.

This change doesn't remove unused routines helpful for debugging, nor
does it remove code that's used in commented out blocks of code that are
only unused temporarily. Furthermore, unused constants weren't removed
when they were part of a set of constants from specifications.

One noteworthy omission from this CL are about 1000 lines of unused code
in cmd/fix, 700 lines of which are the typechecker, which hasn't been
used ever since the pre-Go 1 fixes have been removed. I wasn't sure if
this code should stick around for future uses of cmd/fix or be culled as
well.

Change-Id: Ib714bc7e487edc11ad23ba1c3222d1fd02e4a549
Reviewed-on: https://go-review.googlesource.com/20926
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-25 06:28:13 +00:00
David Symonds
248c3a3c7b regexp: avoid copying mutex in (*Regexp).Copy.
There's nothing guaranteeing that the *Regexp isn't in active use,
and so copying the sync.Mutex value is invalid.

Updates #14839.

Change-Id: Iddf52bf69df1b563377922399f64a571f76b95dd
Reviewed-on: https://go-review.googlesource.com/20841
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-03-18 03:56:28 +00:00
Brad Fitzpatrick
5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Nathan VanBenschoten
b04f3b06ec all: replace strings.Index with strings.Contains where possible
Change-Id: Ia613f1c37bfce800ece0533a5326fca91d99a66a
Reviewed-on: https://go-review.googlesource.com/18120
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
2016-02-19 01:06:05 +00:00
Caleb Spare
937172b247 regexp: add Copy method to Regexp
This helps users who wish to use separate Regexps in each goroutine to
avoid lock contention. Previously they had to parse the expression
multiple times to achieve this.

I used variants of the included benchmark to evaluate this change. I
used the arguments -benchtime 20s -cpu 1,2,4,8,16 on a machine with 16
hardware cores.

Comparing a single shared Regexp vs. copied Regexps, we can see that
lock contention causes huge slowdowns at higher levels of parallelism.
The copied version shows the expected linear speedup.

name              old time/op  new time/op  delta
MatchParallel      366ns ± 0%   370ns ± 0%   +1.09%   (p=0.000 n=10+8)
MatchParallel-2    324ns ±28%   184ns ± 1%  -43.37%  (p=0.000 n=10+10)
MatchParallel-4    352ns ± 5%    93ns ± 1%  -73.70%   (p=0.000 n=9+10)
MatchParallel-8    480ns ± 3%    46ns ± 0%  -90.33%    (p=0.000 n=9+8)
MatchParallel-16   510ns ± 8%    24ns ± 6%  -95.36%   (p=0.000 n=10+8)

I also compared a modified version of Regexp that has no mutex and a
single machine (the "RegexpForSingleGoroutine" rsc mentioned in
https://github.com/golang/go/issues/8232#issuecomment-66096128).

In this next test, I compared using N copied Regexps vs. N separate
RegexpForSingleGoroutines. This shows that, even for this relatively
simple regex, avoiding the lock entirely would only buy about 10-12%
further improvement.

name              old time/op  new time/op  delta
MatchParallel      370ns ± 0%   322ns ± 0%  -12.97%    (p=0.000 n=8+8)
MatchParallel-2    184ns ± 1%   162ns ± 1%  -11.60%  (p=0.000 n=10+10)
MatchParallel-4   92.7ns ± 1%  81.1ns ± 2%  -12.43%  (p=0.000 n=10+10)
MatchParallel-8   46.4ns ± 0%  41.8ns ±10%   -9.78%   (p=0.000 n=8+10)
MatchParallel-16  23.7ns ± 6%  20.6ns ± 1%  -13.14%    (p=0.000 n=8+8)

Updates #8232.

Change-Id: I15201a080c363d1b44104eafed46d8df5e311902
Reviewed-on: https://go-review.googlesource.com/16110
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-25 17:26:37 +00:00
Didier Spezia
f75f2f3fcc regexp: fix slice bounds out of range panics
Regular expressions involving a (x){0} term are
simplified by removing this term from the
expression, just before the expression is compiled.

The number of subexpressions is evaluated before
the simplification. The number of capture instructions
in the compiled expressions is not necessarily in line
with the number of subexpressions.

When the ReplaceAll(String) methods are used, a number
of capture slots (nmatch) is evaluated as 2*(s+1)
(s being the number of subexpressions).

In some case, it can be higher than the number of capture
instructions evaluated at compile time, resulting in a
panic when the internal slices of regexp.machine
are resized to this value.

Fixed by capping the number of capture slots to the number
of capture instructions.

I must say I do not really see the benefits of setting
nmatch lower than re.prog.NumCap using this 2*(s+1) formula,
so perhaps this can be further simplified.

Fixes #11178
Fixes #11176

Change-Id: I21415e8ef2dd5f2721218e9a679f7f6bfb76ae9b
Reviewed-on: https://go-review.googlesource.com/14013
Reviewed-by: Russ Cox <rsc@golang.org>
2015-10-23 03:30:25 +00:00
Rob Pike
ae38ef4cdf regexp: suggest go doc, not godoc
In 1.6, go doc is more likely to be available.

Change-Id: I970ad1d3317b35273f5c8d830f75713d3570c473
Reviewed-on: https://go-review.googlesource.com/10518
Reviewed-by: Andrew Gerrand <adg@golang.org>
2015-06-01 20:16:31 +00:00
Brad Fitzpatrick
fc9a234d9f regexp: fix link to RE2 syntax
Fixes #10224

Change-Id: I21037379b4667575e51ab0b6b683138c505c3f68
Reviewed-on: https://go-review.googlesource.com/7960
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-03-23 19:17:52 +00:00
Michael Matloob
c7eb9663aa regexp: fix typo in comment: s/onpass/onepass/
Change-Id: Idff57050a34d09e7fa9b77e9b53d61bb5ea2a71c
Reviewed-on: https://go-review.googlesource.com/2095
Reviewed-by: Minux Ma <minux@golang.org>
2014-12-24 07:30:28 +00:00
Ian Lance Taylor
3c5fd98918 regexp: correct doc comment for ReplaceAllLiteralString
Fixes #8959.

LGTM=adg
R=golang-codereviews, adg
CC=golang-codereviews
https://golang.org/cl/161790043
2014-10-19 10:28:27 -07:00
Russ Cox
c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00