45 Commits

Author SHA1 Message Date
Ludi Rehak
82bf12902f regexp/syntax: test for lowercase letters first in IsWordChar
Lowercase letters occur more frequently than uppercase letters
in English text.  In IsWordChar, evaluate the most common case
(lowercase letters) first to minimize the expected value of its
execution time. Code clarity does not suffer by rearranging the
order of the checks.

Add a benchmark on a sentence demonstrating the performance
improvement.

name           old time/op  new time/op  delta
IsWordChar-10   122ns ± 0%   114ns ± 1%  -6.68%  (p=0.000 n=8+10)

Change-Id: Ieee8126a4bd8ee8703905b4f75724623029f6fa2
Reviewed-on: https://go-review.googlesource.com/c/go/+/404100
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
2023-03-14 04:39:42 +00:00
cuiweixie
581a822a9e regexp: add ErrLarge error
For #56041

Change-Id: I6c98458b5c0d3b3636a53ee04fc97221f3fd8bbc
Reviewed-on: https://go-review.googlesource.com/c/go/+/444817
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: xie cui <523516579@qq.com>
2022-11-02 18:15:21 +00:00
Russ Cox
c3c4aea55b regexp: limit size of parsed regexps
Set a 128 MB limit on the amount of space used by []syntax.Inst
in the compiled form corresponding to a given regexp.

Also set a 128 MB limit on the rune storage in the *syntax.Regexp
tree itself.

Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue.

Fixes CVE-2022-41715.
Fixes #55949.

Change-Id: Ia656baed81564436368cf950e1c5409752f28e1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/439356
Auto-Submit: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
2022-10-05 20:39:49 +00:00
cui fliter
1888875182 regexp: fix a few function names on comments
Change-Id: I192dd34c677e52e16f0ef78e1dae58a78f6d1aac
GitHub-Last-Rev: 1638a7468951df72f13fea34855b6a4fcbb08226
GitHub-Pull-Request: golang/go#55967
Reviewed-on: https://go-review.googlesource.com/c/go/+/436885
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2022-10-02 02:31:23 +00:00
Ludi Rehak
e7b0559448 regexp/syntax: fix typo in comment
Fix typo in comment describing IsWordChar.

Change-Id: Ia283813cf5662e218ee6d0411fb0c1b1ad1021f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/393435
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-29 01:00:55 +00:00
Ian Lance Taylor
0bf545e51f regexp/syntax: rename ErrInvalidDepth to ErrNestingDepth
The proposal accepted the name ErrNestingDepth.

For #51684

Change-Id: I702365f19e5e1889dbcc3c971eecff68e0b08727
Reviewed-on: https://go-review.googlesource.com/c/go/+/401854
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-04-22 22:35:03 +00:00
Ian Lance Taylor
575fd8817a regexp: change ErrInvalidDepth message to match proposal
Also update the file in $GOROOT/api/next to use proposal number.

For #51684

Change-Id: I28bfa6bc1cee98a17b13da196d41cda34d968bb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/401076
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-04-22 00:10:17 +00:00
Russ Cox
19309779ac all: gofmt main repo
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]

Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.

For #51082.

Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-11 16:34:30 +00:00
Russ Cox
81431c7aa7 all: replace `` and '' with “ (U+201C) and ” (U+201D) in doc comments
go/doc in all its forms applies this replacement when rendering
the comments. We are considering formatting doc comments,
including doing this replacement as part of the formatting.
Apply it to our source files ahead of time.

For #51082.

Change-Id: Ifcc1f5861abb57c5d14e7d8c2102dfb31b7a3a19
Reviewed-on: https://go-review.googlesource.com/c/go/+/384262
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-05 17:52:29 +00:00
Russ Cox
1af60b2f49 regexp/syntax: add and use ErrInvalidDepth
The fix for #51112 introduced a depth check but used
ErrInternalError to avoid introduce new API in a CL that
would be backported to earlier releases.

New API accepted in proposal #51684.

This CL adds a distinct error for this case.

For #51112.
Fixes #51684.

Change-Id: I068fc70aafe4218386a06103d9b7c847fb7ffa65
Reviewed-on: https://go-review.googlesource.com/c/go/+/384617
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-04 10:59:27 +00:00
Russ Cox
690ac4071f all: remove trailing blank doc comment lines
A future change to gofmt will rewrite

	// Doc comment.
	//
	func f()

to

	// Doc comment.
	func f()

Apply that change preemptively to all doc comments.

For #51082.

Change-Id: I4023e16cfb0729b64a8590f071cd92f17343081d
Reviewed-on: https://go-review.googlesource.com/c/go/+/384259
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01 18:18:07 +00:00
Russ Cox
452f24ae94 regexp/syntax: reject very deeply nested regexps in Parse
The regexp code assumes it can recurse over the structure of
a regexp safely. Go's growable stacks make that reasonable
for all plausible regexps, but implausible ones can reach the
“infinite recursion?” stack limit.

This CL limits the depth of any parsed regexp to 1000.
That is, the depth of the parse tree is required to be ≤ 1000.
Regexps that require deeper parse trees will return ErrInternalError.
A future CL will change the error to ErrInvalidDepth,
but using ErrInternalError for now avoids introducing new API
in point releases when this is backported.

Fixes #51112.

Change-Id: I97d2cd82195946eb43a4ea8561f5b95f91fb14c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/384616
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-02-10 15:23:05 +00:00
Russ Cox
702e337174 regexp: document and implement that invalid UTF-8 bytes are the same as U+FFFD
What should it mean to run a regexp match on invalid UTF-8 bytes?
The coherent behavior options are:

1. Invalid UTF-8 does not match any character classes,
   nor a U+FFFD literal (nor \x{fffd}).
2. Each byte of invalid UTF-8 is treated identically to a U+FFFD in the input,
   as a utf8.DecodeRune loop might.

RE2 uses Rule 1.
Because it works byte at a time, it can also provide \C to match any
single byte of input, which matches invalid UTF-8 as well.
This provides the nice property that a match for a regexp without \C
is guaranteed to be valid UTF-8.

Unfortunately, today Go has an incoherent mix of these two, although
mostly Rule 2. This is a deviation from RE2, and it gives up the nice
property, but we probably can't correct that at this point.
In particular .* already matches entire inputs today, valid UTF-8 or
not, and I doubt we can break that.

This CL adopts Rule 2 officially, fixing the few places that deviate from it.

Fixes #48749.

Change-Id: I96402527c5dfb1146212f568ffa09dde91d71244
Reviewed-on: https://go-review.googlesource.com/c/go/+/354569
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2021-10-11 15:28:50 +00:00
Russ Cox
4d8db00641 all: use bytes.Cut, strings.Cut
Many uses of Index/IndexByte/IndexRune/Split/SplitN
can be written more clearly using the new Cut functions.
Do that. Also rewrite to other functions if that's clearer.

For #46336.

Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448
Reviewed-on: https://go-review.googlesource.com/c/go/+/351711
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-10-06 15:53:04 +00:00
Russ Cox
2a61b3c590 regexp: fix repeat of preferred empty match
In Perl mode, (|a)* should match an empty string at the start of the
input. Instead it matches as many a's as possible.
Because (|a)+ is handled correctly, matching only an empty string,
this leads to the paradox that e* can match more text than e+
(for e = (|a)) and that e+ is sometimes different from ee*.

This is a very old bug that ultimately derives from the picture I drew
for e* in https://swtch.com/~rsc/regexp/regexp1.html. The picture is
correct for longest-match (POSIX) regexps but subtly wrong for
preferred-match (Perl) regexps in the case where e has a preferred
empty match. Pointed out by Andrew Gallant in private mail.

The current code treats e* and e+ as the same structure, with
different entry points. In the case of e* the preference list ends up
not quite in the right order, in part because the “before e” and
“after e” states are the same state. Splitting them apart fixes the
preference list, and that can be done by compiling e* as if it were
(e+)?.

Like with any bug fix, there is a very low chance of breaking a
program that accidentally depends on the buggy behavior.

RE2, Go, and Rust all have this bug, and we've all agreed to fix it,
to keep the implementations in sync.

Fixes #46123.

Change-Id: I70e742e71e0a23b626593b16ddef3c1e73b413b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/318750
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2021-05-13 14:52:20 +00:00
Tom Payne
1d3baf20dc regexp/syntax: add note about Unicode character classes
As proposed on golang-nuts:
https://groups.google.com/g/golang-nuts/c/M3lmSUptExQ/m/hRySV9GsCAAJ

Includes the latest updates from re2's mksyntaxgo:
https://code.googlesource.com/re2/+/refs/heads/master/doc/mksyntaxgo

Change-Id: Ib7b79aa6531f473feabd0a7f1d263cd65c4388e4
Reviewed-on: https://go-review.googlesource.com/c/go/+/264678
Reviewed-by: Russ Cox <rsc@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
2020-11-25 15:10:09 +00:00
Andy Balholm
0ce1ffc49d regexp/syntax: append patchLists in constant time
By keeping a tail pointer, we can append to a patchList in constant
time, rather than in time proportional to the length of the list. This
gets rid of the quadratic compile times we were seeing for long series
of alternations.

This is basically the same change as
e9d517989f.

Fixes #39542.

Change-Id: Ib4ca0ca9c55abd1594df1984653c7d311ccf7572
Reviewed-on: https://go-review.googlesource.com/c/go/+/238079
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-06-18 00:40:11 +00:00
Russ Cox
4d9ecde30a regexp/syntax: fix comment on p.literal and simplify
p.literal's doc comment said it returned a value but it doesn't.
While we're here, p.newLiteral is only called from p.literal,
so simplify the code by merging the two.

Change-Id: Ia357937a99f4e7473f0f1ec837113a39eaeb83d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/222659
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-04-17 22:12:02 +00:00
Keegan Carruthers-Smith
648c7b592a regexp/syntax: exclude full range from String negation case
If the char class is 0x0-0x10ffff we mistakenly would String that to `[^]`,
which is not a valid regex.

Fixes #31807

Change-Id: I9ceeaddc28b67b8e1de12b6703bcb124cc784556
Reviewed-on: https://go-review.googlesource.com/c/go/+/175679
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-05-22 04:43:25 +00:00
Brad Fitzpatrick
3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
Ian Lance Taylor
3396034155 regexp/syntax: don't do both linear and binary sesarch in MatchRunePos
MatchRunePos is a significant element of regexp performance, so some
attention to optimization is appropriate. Before this CL, a
non-matching rune would do both a linear search in the first four
entries, and a binary search over all the entries. Change the code to
optimize for the common case of two runes, to only do a linear search
when there are up to four entries, and to only do a binary search when
there are more than four entries.

Updates #26623

name                             old time/op    new time/op    delta
Find-12                             260ns ± 1%     275ns ± 7%   +5.84%  (p=0.000 n=8+10)
FindAllNoMatches-12                 144ns ± 9%     143ns ±12%     ~     (p=0.187 n=10+10)
FindString-12                       256ns ± 4%     254ns ± 1%     ~     (p=0.357 n=9+8)
FindSubmatch-12                     587ns ±12%     593ns ±11%     ~     (p=0.516 n=10+10)
FindStringSubmatch-12               534ns ±12%     525ns ±14%     ~     (p=0.565 n=10+10)
Literal-12                          104ns ±14%     106ns ±11%     ~     (p=0.145 n=10+10)
NotLiteral-12                      1.51µs ± 8%    1.47µs ± 2%     ~     (p=0.508 n=10+9)
MatchClass-12                      2.47µs ± 1%    2.26µs ± 6%   -8.55%  (p=0.000 n=8+10)
MatchClass_InRange-12              2.18µs ± 5%    2.25µs ±11%   +2.85%  (p=0.009 n=9+10)
ReplaceAll-12                      2.35µs ± 6%    2.08µs ±23%  -11.59%  (p=0.010 n=9+10)
AnchoredLiteralShortNonMatch-12    93.2ns ± 9%    93.2ns ±11%     ~     (p=0.716 n=10+10)
AnchoredLiteralLongNonMatch-12      118ns ±10%     117ns ± 9%     ~     (p=0.802 n=10+10)
AnchoredShortMatch-12               142ns ± 1%     141ns ± 1%   -0.53%  (p=0.007 n=8+8)
AnchoredLongMatch-12                303ns ± 9%     304ns ± 6%     ~     (p=0.724 n=10+10)
OnePassShortA-12                    620ns ± 1%     618ns ± 9%     ~     (p=0.162 n=8+10)
NotOnePassShortA-12                 599ns ± 8%     568ns ± 1%   -5.21%  (p=0.000 n=10+8)
OnePassShortB-12                    525ns ± 7%     489ns ± 1%   -6.93%  (p=0.000 n=10+8)
NotOnePassShortB-12                 449ns ± 9%     431ns ±11%   -4.05%  (p=0.033 n=10+10)
OnePassLongPrefix-12                119ns ± 6%     114ns ± 0%   -3.88%  (p=0.006 n=10+9)
OnePassLongNotPrefix-12             420ns ± 9%     410ns ± 7%     ~     (p=0.645 n=10+9)
MatchParallelShared-12              376ns ± 0%     375ns ± 0%   -0.45%  (p=0.003 n=8+10)
MatchParallelCopied-12             39.4ns ± 1%    39.1ns ± 0%   -0.55%  (p=0.004 n=10+9)
QuoteMetaAll-12                     139ns ± 7%     142ns ± 7%     ~     (p=0.445 n=10+10)
QuoteMetaNone-12                   56.7ns ± 0%    61.3ns ± 7%   +8.03%  (p=0.001 n=8+10)
Match/Easy0/32-12                  83.4ns ± 7%    83.1ns ± 8%     ~     (p=0.541 n=10+10)
Match/Easy0/1K-12                   417ns ± 8%     394ns ± 6%     ~     (p=0.059 n=10+9)
Match/Easy0/32K-12                 7.05µs ± 8%    7.30µs ± 9%     ~     (p=0.190 n=10+10)
Match/Easy0/1M-12                   291µs ±17%     284µs ±10%     ~     (p=0.481 n=10+10)
Match/Easy0/32M-12                 9.89ms ± 4%   10.27ms ± 8%     ~     (p=0.315 n=10+10)
Match/Easy0i/32-12                 1.13µs ± 1%    1.14µs ± 1%   +1.51%  (p=0.000 n=8+8)
Match/Easy0i/1K-12                 35.7µs ±11%    36.8µs ±10%     ~     (p=0.143 n=10+10)
Match/Easy0i/32K-12                1.70ms ± 7%    1.72ms ± 7%     ~     (p=0.776 n=9+6)

name                             old alloc/op   new alloc/op   delta
Find-12                             0.00B          0.00B          ~     (all equal)
FindAllNoMatches-12                 0.00B          0.00B          ~     (all equal)
FindString-12                       0.00B          0.00B          ~     (all equal)
FindSubmatch-12                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
FindStringSubmatch-12               32.0B ± 0%     32.0B ± 0%     ~     (all equal)

name                             old allocs/op  new allocs/op  delta
Find-12                              0.00           0.00          ~     (all equal)
FindAllNoMatches-12                  0.00           0.00          ~     (all equal)
FindString-12                        0.00           0.00          ~     (all equal)
FindSubmatch-12                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
FindStringSubmatch-12                1.00 ± 0%      1.00 ± 0%     ~     (all equal)

name                             old speed      new speed      delta
QuoteMetaAll-12                   101MB/s ± 8%    99MB/s ± 7%     ~     (p=0.529 n=10+10)
QuoteMetaNone-12                  458MB/s ± 0%   425MB/s ± 8%   -7.22%  (p=0.003 n=8+10)
Match/Easy0/32-12                 385MB/s ± 7%   386MB/s ± 7%     ~     (p=0.579 n=10+10)
Match/Easy0/1K-12                2.46GB/s ± 8%  2.60GB/s ± 6%     ~     (p=0.065 n=10+9)
Match/Easy0/32K-12               4.66GB/s ± 7%  4.50GB/s ±10%     ~     (p=0.190 n=10+10)
Match/Easy0/1M-12                3.63GB/s ±15%  3.70GB/s ± 9%     ~     (p=0.481 n=10+10)
Match/Easy0/32M-12               3.40GB/s ± 4%  3.28GB/s ± 8%     ~     (p=0.315 n=10+10)
Match/Easy0i/32-12               28.4MB/s ± 1%  28.0MB/s ± 1%   -1.50%  (p=0.000 n=8+8)
Match/Easy0i/1K-12               28.8MB/s ±10%  27.9MB/s ±11%     ~     (p=0.143 n=10+10)
Match/Easy0i/32K-12              19.0MB/s ±14%  19.1MB/s ± 8%     ~     (p=1.000 n=10+6)

Change-Id: I238a451b36ad84b0f5534ff0af5c077a0d52d73a
Reviewed-on: https://go-review.googlesource.com/130417
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 17:11:57 +00:00
Tim Cooper
161874da2a all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect
content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2018-06-01 21:52:00 +00:00
Ian Lance Taylor
9967582f77 regexp/syntax: update perl script to preserve \s behavior
Incorporate https://code-review.googlesource.com/#/c/re2/+/3050/ from
the re2 repository. Description of that change:

    Preserve the original behaviour of \s.

    Prior to Perl 5.18, \s did not match vertical tab. Bake that into
    make_perl_groups.pl as an override so that perl_groups.cc retains
    its current definitions when rebuilt with newer versions of Perl.

This fixes make_perl_groups.pl to generate an unchanged perl_groups.go
with perl versions 5.18 and later.

Fixes #22057

Change-Id: I9a56e9660092ed6c1ff1045b4a3847de355441a7
Reviewed-on: https://go-review.googlesource.com/103517
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-29 22:31:57 +00:00
Brad Fitzpatrick
48db2c01b4 all: use strings.Builder instead of bytes.Buffer where appropriate
I grepped for "bytes.Buffer" and "buf.String" and mostly ignored test
files. I skipped a few on purpose and probably missed a few others,
but otherwise I think this should be most of them.

Updates #18990

Change-Id: I5a6ae4296b87b416d8da02d7bfaf981d8cc14774
Reviewed-on: https://go-review.googlesource.com/102479
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-26 23:05:53 +00:00
Daniel Martí
8da180f6ca all: remove some unused return parameters
As found by unparam. Picked the low-hanging fruit, consisting only of
errors that were always nil and results that were never used. Left out
those that were useful for consistency with other func signatures.

Change-Id: I06b52bbd3541f8a5d66659c909bd93cb3e172018
Reviewed-on: https://go-review.googlesource.com/102418
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-24 19:44:47 +00:00
Daniel Martí
d4b2168b23 regexp/syntax: make Op an fmt.Stringer
Using stringer.

Fixes #22684.

Change-Id: I62fbde5dcb337cf269686615616bd39a27491ac1
Reviewed-on: https://go-review.googlesource.com/95355
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-20 15:33:50 +00:00
Joe Tsai
b53088a634 Revert "go/printer: forbid empty line before first comment in block"
This reverts commit 08f19bbde1b01227fdc2fa2d326e4029bb74dd96.

Reason for revert:
The changed transformation takes effect on a larger set
of code snippets than expected.

For example, this:
    func foo() {

        // Comment
        bar()

    }
becomes:
    func foo() {
        // Comment
        bar()

    }

This is an unintended consequence.

Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8
Reviewed-on: https://go-review.googlesource.com/81335
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-12-01 01:12:26 +00:00
Joe Tsai
08f19bbde1 go/printer: forbid empty line before first comment in block
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.

Before the change:
<<<
type NamedArg struct {

        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}
        // contains filtered or unexported fields
}
>>>

After the change:
<<<
type NamedArg struct {
        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}

        // contains filtered or unexported fields
}
>>>

Fixes #18264

Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-02 18:17:22 +00:00
Sylvain Zimmer
67da597312 regexp: Remove duplicated function wordRune()
Fixes #21742

Change-Id: Ib56b092c490c27a4ba7ebdb6391f1511794710b8
Reviewed-on: https://go-review.googlesource.com/61034
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2017-09-08 23:39:37 +00:00
Daniel Martí
ad74e450ca regexp/syntax: remove unused flags parameter
Found by github.com/mvdan/unparam.

Change-Id: I186d2afd067e97eb05d65c4599119b347f82867d
Reviewed-on: https://go-review.googlesource.com/37840
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-06 19:11:09 +00:00
Marcel van Lohuizen
a2a4db7375 unicode: upgrade to version 9.0.0
Changes beyond generated tables:
- Now supports aliases to handle deprecated
  property classes.
- Some Mongolian letters are now modifiers.

Other changes:
- strconv: newly generated table to be in sync
- regexp/syntax: updated maxFold

Fixes #16191

Change-Id: I56bdf21ee2f775f2a82d0465b3772faf5c24cb61
Reviewed-on: https://go-review.googlesource.com/24496
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-28 15:08:11 +00:00
Brad Fitzpatrick
cdcb8271a4 regexp/syntax: clarify that \Z means Perl's \Z
Fixes #14793

Change-Id: I408056d096cd6a999fa5e349704b5ea8e26d2e4e
Reviewed-on: https://go-review.googlesource.com/23201
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-18 04:43:32 +00:00
Emmanuel Odeke
53fd522c0d all: make copyright headers consistent with one space after period
Follows suit with https://go-review.googlesource.com/#/c/20111.

Generated by running
$ grep -R 'Go Authors.  All' * | cut -d":" -f1 | while read F;do perl -pi -e 's/Go
Authors.  All/Go Authors. All/g' $F;done

The code in cmd/internal/unvendor wasn't changed.

Fixes #15213

Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f
Reviewed-on: https://go-review.googlesource.com/21819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-02 13:43:18 +00:00
Brad Fitzpatrick
5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Brad Fitzpatrick
519474451a all: make copyright headers consistent with one space after period
This is a subset of https://golang.org/cl/20022 with only the copyright
header lines, so the next CL will be smaller and more reviewable.

Go policy has been single space after periods in comments for some time.

The copyright header template at:

    https://golang.org/doc/contribute.html#copyright

also uses a single space.

Make them all consistent.

Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0
Reviewed-on: https://go-review.googlesource.com/20111
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-01 23:34:33 +00:00
Nathan VanBenschoten
b04f3b06ec all: replace strings.Index with strings.Contains where possible
Change-Id: Ia613f1c37bfce800ece0533a5326fca91d99a66a
Reviewed-on: https://go-review.googlesource.com/18120
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
2016-02-19 01:06:05 +00:00
Paul Wankadia
5ccaf0255b regexp/syntax: fix factoring of common prefixes in alternations
In the past, `a.*?c|a.*?b` was factored to `a.*?[bc]`. Thus, given
"abc" as its input string, the automaton would consume "ab" and
then stop (when unanchored) whereas it should consume all of "abc"
as per leftmost semantics.

Fixes #13812.

Change-Id: I67ac0a353d7793b3d0c9c4aaf22d157621dfe784
Reviewed-on: https://go-review.googlesource.com/18357
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-08 16:41:46 +00:00
Russ Cox
0680e9c0c1 regexp/syntax: fix handling of \Q...\E
It's not a group: must handle the inside as a sequence of literal chars,
not a single literal string.

That is, \Qab\E+ is the same as ab+, not (ab)+.

Fixes #11187.

Change-Id: I5406d05ccf7efff3a7f15395bdb0cfb2bd23a8ed
Reviewed-on: https://go-review.googlesource.com/17233
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-01 22:45:12 +00:00
Tamir Duberstein
3aa755b8fb regexp/syntax: correctly print ^ BOL and $ EOL
Fixes #12980.

Change-Id: I936db2f57f7c4dc80bb8ec32715c4c6b7bf0d708
Reviewed-on: https://go-review.googlesource.com/16112
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-25 17:25:22 +00:00
Josh Bleecher Snyder
2adc4e8927 all: use "reports whether" in place of "returns true if(f)"
Comment changes only.

Change-Id: I56848814564c4aa0988b451df18bebdfc88d6d94
Reviewed-on: https://go-review.googlesource.com/7721
Reviewed-by: Rob Pike <r@golang.org>
2015-03-18 15:14:06 +00:00
Robin Eklind
04c7b68b4a regexp/syntax: Clarify comment of OpAnyCharNotNL.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/171560043
2014-11-11 18:52:07 -08:00
Ian Lance Taylor
0f022fdd52 regexp/syntax: fix validity testing of zero repeats
This is already tested by TestRE2Exhaustive, but the build has
not broken because that test is not run when using -test.short.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/155580043
2014-10-20 08:12:45 -07:00
Russ Cox
85fd0fd7c4 regexp/syntax: regenerate doc.go from re2 syntax
Generated using re2/doc/mksyntaxgo.

Fixes #8505.

LGTM=iant
R=r, iant
CC=golang-codereviews
https://golang.org/cl/155890043
2014-10-06 15:32:11 -04:00
Russ Cox
9b2b0c8c16 regexp/syntax: reject large repetitions created by nesting small ones
Fixes #7609.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/150270043
2014-09-30 12:08:09 -04:00
Russ Cox
c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00