15 Commits

Author SHA1 Message Date
Ian Lance Taylor
94a9bc960c regexp: document that Longest method is not concurrent-safe
Change-Id: I9ec137502353e65325087dfb60ee9bd68ffd286d
Reviewed-on: https://go-review.googlesource.com/38447
Reviewed-by: Russ Cox <rsc@golang.org>
2017-04-07 21:12:11 +00:00
Martin Möhrmann
e74c6cd3c0 regexp: add ASCII fast path for context methods
The step method implementations check directly if the next rune
only needs one byte to be decoded and avoid calling utf8.DecodeRune
for such ASCII characters.

Introduce the same fast path optimization for rune decoding
for the context methods.

Results for regexp benchmarks that use the context methods:

name                            old time/op  new time/op  delta
AnchoredLiteralShortNonMatch-4  97.5ns ± 1%  94.8ns ± 2%  -2.80%  (p=0.000 n=45+43)
AnchoredShortMatch-4             163ns ± 1%   160ns ± 1%  -1.84%  (p=0.000 n=46+47)
NotOnePassShortA-4               742ns ± 2%   742ns ± 2%    ~     (p=0.440 n=49+50)
NotOnePassShortB-4               535ns ± 1%   533ns ± 2%  -0.37%  (p=0.005 n=46+48)
OnePassLongPrefix-4              169ns ± 2%   166ns ± 2%  -2.06%  (p=0.000 n=50+49)

Change-Id: Ib302d9e8c63333f02695369fcf9963974362e335
Reviewed-on: https://go-review.googlesource.com/38256
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-23 00:08:20 +00:00
Ingo Oeser
9aed16e96d regexp: avoid alloc in QuoteMeta when not quoting
Many users quote literals in regular expressions just in case.
No need to allocate then.

Note: Also added benchmarks for quoting and not quoting.

	name             old time/op    new time/op     delta
	QuoteMetaAll-4      629ns ± 6%      654ns ± 5%    +4.01%        (p=0.001 n=20+19)
	QuoteMetaNone-4    1.02µs ± 6%     0.20µs ± 0%   -80.73%        (p=0.000 n=18+20)

	name             old speed      new speed       delta
	QuoteMetaAll-4   22.3MB/s ± 6%   21.4MB/s ± 5%    -3.94%        (p=0.001 n=20+19)
	QuoteMetaNone-4  25.3MB/s ± 3%  131.5MB/s ± 0%  +419.28%        (p=0.000 n=17+19)

	name             old alloc/op   new alloc/op    delta
	QuoteMetaAll-4      64.0B ± 0%      64.0B ± 0%      ~     (all samples are equal)
	QuoteMetaNone-4     96.0B ± 0%      0.0B ±NaN%  -100.00%        (p=0.000 n=20+20)

	name             old allocs/op  new allocs/op   delta
	QuoteMetaAll-4       2.00 ± 0%       2.00 ± 0%      ~     (all samples are equal)
	QuoteMetaNone-4      2.00 ± 0%      0.00 ±NaN%  -100.00%        (p=0.000 n=20+20)

Change-Id: I38d50f463cde463115d22534f8eb849e54d899af
Reviewed-on: https://go-review.googlesource.com/31395
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-19 07:09:08 +00:00
Aliaksandr Valialkin
bea39e63ec regexp: reduce mallocs in Regexp.Find* and Regexp.ReplaceAll*.
This improves Regexp.Find* and Regexp.ReplaceAll* speed:

name                  old time/op    new time/op    delta
Find-4                   345ns ± 1%     314ns ± 1%    -8.94%    (p=0.000 n=9+8)
FindString-4             341ns ± 1%     308ns ± 0%    -9.85%   (p=0.000 n=10+9)
FindSubmatch-4           440ns ± 1%     404ns ± 0%    -8.27%   (p=0.000 n=10+8)
FindStringSubmatch-4     426ns ± 0%     387ns ± 0%    -9.07%   (p=0.000 n=10+9)
ReplaceAll-4            1.75µs ± 1%    1.67µs ± 0%    -4.45%   (p=0.000 n=9+10)

name                  old alloc/op   new alloc/op   delta
Find-4                   16.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=10+10)
FindString-4             16.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=10+10)
FindSubmatch-4           80.0B ± 0%     48.0B ± 0%   -40.00%  (p=0.000 n=10+10)
FindStringSubmatch-4     64.0B ± 0%     32.0B ± 0%   -50.00%  (p=0.000 n=10+10)
ReplaceAll-4              152B ± 0%      104B ± 0%   -31.58%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
Find-4                    1.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=10+10)
FindString-4              1.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=10+10)
FindSubmatch-4            2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
FindStringSubmatch-4      2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
ReplaceAll-4              8.00 ± 0%      5.00 ± 0%   -37.50%  (p=0.000 n=10+10)

Fixes #15643

Change-Id: I594fe51172373e2adb98d1d25c76ca2cde54ff48
Reviewed-on: https://go-review.googlesource.com/23030
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-06 17:04:31 +00:00
Dominik Honnef
fdba5a7544 all: delete dead non-test code
This change removes a lot of dead code. Some of the code has never been
used, not even when it was first commited. The rest shouldn't have
survived refactors.

This change doesn't remove unused routines helpful for debugging, nor
does it remove code that's used in commented out blocks of code that are
only unused temporarily. Furthermore, unused constants weren't removed
when they were part of a set of constants from specifications.

One noteworthy omission from this CL are about 1000 lines of unused code
in cmd/fix, 700 lines of which are the typechecker, which hasn't been
used ever since the pre-Go 1 fixes have been removed. I wasn't sure if
this code should stick around for future uses of cmd/fix or be culled as
well.

Change-Id: Ib714bc7e487edc11ad23ba1c3222d1fd02e4a549
Reviewed-on: https://go-review.googlesource.com/20926
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-25 06:28:13 +00:00
David Symonds
248c3a3c7b regexp: avoid copying mutex in (*Regexp).Copy.
There's nothing guaranteeing that the *Regexp isn't in active use,
and so copying the sync.Mutex value is invalid.

Updates #14839.

Change-Id: Iddf52bf69df1b563377922399f64a571f76b95dd
Reviewed-on: https://go-review.googlesource.com/20841
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-03-18 03:56:28 +00:00
Brad Fitzpatrick
5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Nathan VanBenschoten
b04f3b06ec all: replace strings.Index with strings.Contains where possible
Change-Id: Ia613f1c37bfce800ece0533a5326fca91d99a66a
Reviewed-on: https://go-review.googlesource.com/18120
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
2016-02-19 01:06:05 +00:00
Caleb Spare
937172b247 regexp: add Copy method to Regexp
This helps users who wish to use separate Regexps in each goroutine to
avoid lock contention. Previously they had to parse the expression
multiple times to achieve this.

I used variants of the included benchmark to evaluate this change. I
used the arguments -benchtime 20s -cpu 1,2,4,8,16 on a machine with 16
hardware cores.

Comparing a single shared Regexp vs. copied Regexps, we can see that
lock contention causes huge slowdowns at higher levels of parallelism.
The copied version shows the expected linear speedup.

name              old time/op  new time/op  delta
MatchParallel      366ns ± 0%   370ns ± 0%   +1.09%   (p=0.000 n=10+8)
MatchParallel-2    324ns ±28%   184ns ± 1%  -43.37%  (p=0.000 n=10+10)
MatchParallel-4    352ns ± 5%    93ns ± 1%  -73.70%   (p=0.000 n=9+10)
MatchParallel-8    480ns ± 3%    46ns ± 0%  -90.33%    (p=0.000 n=9+8)
MatchParallel-16   510ns ± 8%    24ns ± 6%  -95.36%   (p=0.000 n=10+8)

I also compared a modified version of Regexp that has no mutex and a
single machine (the "RegexpForSingleGoroutine" rsc mentioned in
https://github.com/golang/go/issues/8232#issuecomment-66096128).

In this next test, I compared using N copied Regexps vs. N separate
RegexpForSingleGoroutines. This shows that, even for this relatively
simple regex, avoiding the lock entirely would only buy about 10-12%
further improvement.

name              old time/op  new time/op  delta
MatchParallel      370ns ± 0%   322ns ± 0%  -12.97%    (p=0.000 n=8+8)
MatchParallel-2    184ns ± 1%   162ns ± 1%  -11.60%  (p=0.000 n=10+10)
MatchParallel-4   92.7ns ± 1%  81.1ns ± 2%  -12.43%  (p=0.000 n=10+10)
MatchParallel-8   46.4ns ± 0%  41.8ns ±10%   -9.78%   (p=0.000 n=8+10)
MatchParallel-16  23.7ns ± 6%  20.6ns ± 1%  -13.14%    (p=0.000 n=8+8)

Updates #8232.

Change-Id: I15201a080c363d1b44104eafed46d8df5e311902
Reviewed-on: https://go-review.googlesource.com/16110
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-25 17:26:37 +00:00
Didier Spezia
f75f2f3fcc regexp: fix slice bounds out of range panics
Regular expressions involving a (x){0} term are
simplified by removing this term from the
expression, just before the expression is compiled.

The number of subexpressions is evaluated before
the simplification. The number of capture instructions
in the compiled expressions is not necessarily in line
with the number of subexpressions.

When the ReplaceAll(String) methods are used, a number
of capture slots (nmatch) is evaluated as 2*(s+1)
(s being the number of subexpressions).

In some case, it can be higher than the number of capture
instructions evaluated at compile time, resulting in a
panic when the internal slices of regexp.machine
are resized to this value.

Fixed by capping the number of capture slots to the number
of capture instructions.

I must say I do not really see the benefits of setting
nmatch lower than re.prog.NumCap using this 2*(s+1) formula,
so perhaps this can be further simplified.

Fixes #11178
Fixes #11176

Change-Id: I21415e8ef2dd5f2721218e9a679f7f6bfb76ae9b
Reviewed-on: https://go-review.googlesource.com/14013
Reviewed-by: Russ Cox <rsc@golang.org>
2015-10-23 03:30:25 +00:00
Rob Pike
ae38ef4cdf regexp: suggest go doc, not godoc
In 1.6, go doc is more likely to be available.

Change-Id: I970ad1d3317b35273f5c8d830f75713d3570c473
Reviewed-on: https://go-review.googlesource.com/10518
Reviewed-by: Andrew Gerrand <adg@golang.org>
2015-06-01 20:16:31 +00:00
Brad Fitzpatrick
fc9a234d9f regexp: fix link to RE2 syntax
Fixes #10224

Change-Id: I21037379b4667575e51ab0b6b683138c505c3f68
Reviewed-on: https://go-review.googlesource.com/7960
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-03-23 19:17:52 +00:00
Michael Matloob
c7eb9663aa regexp: fix typo in comment: s/onpass/onepass/
Change-Id: Idff57050a34d09e7fa9b77e9b53d61bb5ea2a71c
Reviewed-on: https://go-review.googlesource.com/2095
Reviewed-by: Minux Ma <minux@golang.org>
2014-12-24 07:30:28 +00:00
Ian Lance Taylor
3c5fd98918 regexp: correct doc comment for ReplaceAllLiteralString
Fixes #8959.

LGTM=adg
R=golang-codereviews, adg
CC=golang-codereviews
https://golang.org/cl/161790043
2014-10-19 10:28:27 -07:00
Russ Cox
c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00