39 Commits

Author SHA1 Message Date
apocelipes
bd6f911f85 archive: use slices and maps to clean up tests
Replace reflect.DeepEqual with slices.Equal/maps.Equal, which is
much faster.

Clean up some unnecessary helper functions.

Change-Id: I9b94bd43886302b9b327539ab065a435ce0d75d9
GitHub-Last-Rev: b9ca21f165bcc5e45733e6a511a2344b1aa4a281
GitHub-Pull-Request: golang/go#67607
Reviewed-on: https://go-review.googlesource.com/c/go/+/587936
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
2024-07-25 00:25:45 +00:00
qiulaidongfeng
d838e4dcdf archive/tar: add FileInfoNames interface
An optional interface FileInfoNames has been added.

If the parameter fi of FileInfoHeader implements the interface
the Gname/Uname of the return value Header
are provided by the method of the interface.

Also added testing.

Fixes #50102

Change-Id: I47976e238eb20ed43113b060e4f83a14ae49493e
GitHub-Last-Rev: a213613c79e150d52a2f5c84dca7a49fe123fa40
GitHub-Pull-Request: golang/go#65273
Reviewed-on: https://go-review.googlesource.com/c/go/+/558355
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-15 16:01:50 +00:00
Cherry Mui
5000b51680 Revert "archive/tar: add FileInfoNames interface"
This reverts CL 514235. Also reverts CL 518056 which is a followup
fix.

Reason for revert: Proposal #50102 defined an interface that is
too specific to UNIX-y systems and also didn't make much sense.
The proposal is un-accepted, and we'll revisit in Go 1.23.

Fixes (via backport) #65245.
Updates #50102.

Change-Id: I41ba0ee286c1d893e6564a337e5d76418d19435d
Reviewed-on: https://go-review.googlesource.com/c/go/+/558295
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-01-24 20:54:27 +00:00
Ian Lance Taylor
d13f7aa0ae archive/tar: correct value passed to Uname method
For #50102

Change-Id: I28b5579611b07952b6379bc4603daf29a86a3be0
Reviewed-on: https://go-review.googlesource.com/c/go/+/518056
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Tianon Gravi (Andrew) <admwiggin@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
2023-08-10 14:37:50 +00:00
qiulaidongfeng
834a3f844a archive/tar: add FileInfoNames interface
An optional interface FileInfoNames has been added.

If the parameter fi of FileInfoHeader implements the interface
the Gname and Uname of the return value Header are
provided by the method of the interface.

Also added testing.

Fixes #50102

Change-Id: I6fd06c7c9aaf29b22b7384542fe57affed33009a

Change-Id: I6fd06c7c9aaf29b22b7384542fe57affed33009a
GitHub-Last-Rev: 5e82257948759e13880d8af12743b9524ae3df5a
GitHub-Pull-Request: golang/go#61662
Reviewed-on: https://go-review.googlesource.com/c/go/+/514235
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-08-07 00:25:25 +00:00
Russ Cox
2580d0e08d all: gofmt -w -r 'interface{} -> any' src
And then revert the bootstrap cmd directories and certain testdata.
And adjust tests as needed.

Not reverting the changes in std that are bootstrapped,
because some of those changes would appear in API docs,
and we want to use any consistently.
Instead, rewrite 'any' to 'interface{}' in cmd/dist for those directories
when preparing the bootstrap copy.

A few files changed as a result of running gofmt -w
not because of interface{} -> any but because they
hadn't been updated for the new //go:build lines.

Fixes #49884.

Change-Id: Ie8045cba995f65bd79c694ec77a1b3d1fe01bb09
Reviewed-on: https://go-review.googlesource.com/c/go/+/368254
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2021-12-13 18:45:54 +00:00
Manlio Perillo
069983e5db archive/tar: replace os.MkdirTemp with T.TempDir
Updates #45402

Change-Id: I296f8c676c68ed1e10b6ad1a17b5b23d2c395252
Reviewed-on: https://go-review.googlesource.com/c/go/+/309355
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-13 21:06:12 +00:00
Russ Cox
4f1b0a44cb all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)

Update the Go tree to use the preferred names.

As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.

ReadDir changes are in a separate CL, because they are not a
simple search and replace.

For #42026.

Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-12-09 19:12:23 +00:00
Russ Cox
1b09d43067 all: update references to symbols moved from io/ioutil to io
The old ioutil references are still valid, but update our code
to reflect best practices and get used to the new locations.

Code compiled with the bootstrap toolchain
(cmd/asm, cmd/dist, cmd/compile, debug/elf)
must remain Go 1.4-compatible and is excluded.
Also excluded vendored code.

For #41190.

Change-Id: I6d86f2bf7bc37a9d904b6cee3fe0c7af6d94d5b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/263142
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2020-10-20 18:41:18 +00:00
Russ Cox
7bb721b938 all: update references to symbols moved from os to io/fs
The old os references are still valid, but update our code
to reflect best practices and get used to the new locations.

Code compiled with the bootstrap toolchain
(cmd/asm, cmd/dist, cmd/compile, debug/elf)
must remain Go 1.4-compatible and is excluded.

For #41190.

Change-Id: I8f9526977867c10a221e2f392f78d7dec073f1bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/243907
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2020-10-20 02:32:42 +00:00
Caio Marcelo de Oliveira Filho
e4bde05104 archive/tar: automatically promote TypeRegA
Change Reader to promote TypeRegA to TypeReg in headers, unless their
name have a trailing slash which is already promoted to TypeDir. This
will allow client code to handle just TypeReg instead both TypeReg and
TypeRegA.

Change Writer to promote TypeRegA to TypeReg or TypeDir in the headers
depending on whether the name has a trailing slash. This normalization
is motivated by the specification (in pax(1)):

   0 represents a regular file. For backwards-compatibility, a
   typeflag value of binary zero ( '\0' ) should be recognized as
   meaning a regular file when extracting files from the
   archive. Archives written with this version of the archive file
   format create regular files with a typeflag value of the
   ISO/IEC 646:1991 standard IRV '0'.

Fixes #22768.

Change-Id: I149ec55824580d446cdde5a0d7a0457ad7b03466
Reviewed-on: https://go-review.googlesource.com/85656
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-13 18:36:49 +00:00
Joe Tsai
ba2835db6c archive/tar: partially revert sparse file support
This CL removes the following APIs:
	type SparseEntry struct{ ... }
	type Header struct{ SparseHoles []SparseEntry; ... }
	func (*Header) DetectSparseHoles(f *os.File) error
	func (*Header) PunchSparseHoles(f *os.File) error
	func (*Reader) WriteTo(io.Writer) (int, error)
	func (*Writer) ReadFrom(io.Reader) (int, error)

This API was added during the Go1.10 dev cycle, and are safe to remove.

The rationale for reverting is because Header.DetectSparseHoles and
Header.PunchSparseHoles are functionality that probably better belongs in
the os package itself.

The other API like Header.SparseHoles, Reader.WriteTo, and Writer.ReadFrom
perform no OS specific logic and only perform the actual business logic of
reading and writing sparse archives. Since we do know know what the API added to
package os may look like, we preemptively revert these non-OS specific changes
as well by simply commenting them out.

Updates #13548
Updates #22735

Change-Id: I77842acd39a43de63e5c754bfa1c26cc24687b70
Reviewed-on: https://go-review.googlesource.com/78030
Reviewed-by: Russ Cox <rsc@golang.org>
2017-11-16 16:54:08 +00:00
Joe Tsai
577aab0c59 archive/tar: ignore ChangeTime and AccessTime unless Format is specified
CL 59230 changed Writer.WriteHeader to ignore the ChangeTime and AccessTime
fields when considering using the USTAR format when the format is unspecified.
This policy is confusing and leads to unexpected behavior where some files
have ModTime only, while others have ModTime+AccessTime+ChangeTime if the
format became PAX for some unrelated reason (e.g., long pathname).

Change the policy to simply always ignore ChangeTime, AccessTime, and
sub-second time resolutions unless the user explicitly specifies a format.
This is a safe policy change since WriteHeader had no support for the
above features in any Go release.

Support for ChangeTime and AccessTime was added in CL 55570.
Support for sub-second times was added in CL 55552.
Both CLs landed after the latest Go release (i.e., Go1.9), which was
cut from the master branch around August 6th, 2017.

Change-Id: Ib82baa1bf9dd4573ed4f674b7d55d15f733a4843
Reviewed-on: https://go-review.googlesource.com/69296
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-10 20:13:27 +00:00
Joe Tsai
4cd58c2f26 archive/tar: improve handling of directory paths
The USTAR format says:
<<<
Implementors should be aware that the previous file format did not include
a mechanism to archive directory type files.
For this reason, the convention of using a filename ending with
<slash> was adopted to specify a directory on the archive.
>>>

In light of this suggestion, make the following changes:
* Writer.WriteHeader refuses to encode a header where a file that
is obviously a file-type has a trailing slash in the name.
* formatter.formatString avoids encoding a trailing slash in the event
that the string is truncated (the full string will be encoded elsewhere,
so stripping the slash is safe).
* Reader.Next treats a TypeRegA (which is the zero value of Typeflag)
as a TypeDir if the name has a trailing slash.

Change-Id: Ibf27aa8234cce2032d92e5e5b28546c2f2ae5ef6
Reviewed-on: https://go-review.googlesource.com/69293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-10 20:11:26 +00:00
David du Colombier
d83b23fd4f archive/tar: skip TestSparseFiles on Plan 9
CL 60871 added TestSparseFiles. This test is succeeding
on Plan 9 when executed on the ramfs file system, but
is failing when executed on the Fossil file system.

This may be due to an issue in the handling of sparse
files in the Fossil file system on Plan 9 that should
be investigated.

Updates #21977.

Change-Id: I177afff519b862a5c548e094203c219504852006
Reviewed-on: https://go-review.googlesource.com/65352
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-22 13:50:50 +00:00
Joe Tsai
718d9de60f archive/tar: perform test for hole-detection on specific builders
The test for hole-detection is heavily dependent on whether the
OS and underlying FS provides support for it.
Even on Linux, which has support for SEEK_HOLE and SEEK_DATA,
the underlying filesystem may not have support for it.
In order to avoid an ever-changing game of whack-a-mole,
we whitelist the specific builders that we expect the test to pass on.

Updates #21964

Change-Id: I7334e8532c96cc346ea83aabbb81b719685ad7e5
Reviewed-on: https://go-review.googlesource.com/65270
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-21 20:42:11 +00:00
Joe Tsai
1eacf78858 archive/tar: add Header.DetectSparseHoles and Header.PunchSparseHoles
To support the detection and creation of sparse files,
add two new methods:
	func Header.DetectSparseHoles(*os.File) error
	func Header.PunchSparseHoles(*os.File) error

DetectSparseHoles is intended to be used after FileInfoHeader
prior to serializing the Header with WriteHeader.
For each OS, it uses specialized logic to detect
the location of sparse holes. On most Unix systems, it uses
SEEK_HOLE and SEEK_DATA to query for the holes.
On Windows, it uses a specialized the FSCTL_QUERY_ALLOCATED_RANGES
syscall to query for all the holes.

PunchSparseHoles is intended to be used after Reader.Next
prior to populating the file with Reader.WriteTo.
On Windows, this uses the FSCTL_SET_ZERO_DATA syscall.
On other operating systems it simply truncates the file
to the end-offset of SparseHoles.

DetectSparseHoles and PunchSparseHoles are added as methods on
Header because they are heavily tied to the operating system,
for which there is already an existing precedence for
(since FileInfoHeader makes uses of OS-specific details).

Fixes #13548

Change-Id: I98a321dd1ce0165f3d143d4edadfda5e7db67746
Reviewed-on: https://go-review.googlesource.com/60871
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-20 22:12:38 +00:00
Joe Tsai
57c79febda archive/tar: add Reader.WriteTo and Writer.ReadFrom
To support the efficient packing and extracting of sparse files,
add two new methods:
	func Reader.WriteTo(io.Writer) (int64, error)
	func Writer.ReadFrom(io.Reader) (int64, error)

If the current archive entry is sparse and the provided io.{Reader,Writer}
is also an io.Seeker, then use Seek to skip past the holes.
If the last region in a file entry is a hole, then we seek to 1 byte
before the EOF:
	* for Reader.WriteTo to write a single byte
	to ensure that the resulting filesize is correct.
	* for Writer.ReadFrom to read a single byte
	to verify that the input filesize is correct.

The downside of this approach is when the last region in the sparse file
is a hole. In the case of Reader.WriteTo, the 1-byte write will cause
the last fragment to have a single chunk allocated.
However, the goal of ReadFrom/WriteTo is *not* the ability to
exactly reproduce sparse files (in terms of the location of sparse holes),
but rather to provide an efficient way to create them.

File systems already impose their own restrictions on how the sparse file
will be created. Some filesystems (e.g., HFS+) don't support sparseness and
seeking forward simply causes the FS to write zeros. Other filesystems
have different chunk sizes, which will cause chunk allocations at boundaries
different from what was in the original sparse file. In either case,
it should not be a normal expectation of users that the location of holes
in sparse files exactly matches the source.

For users that really desire to have exact reproduction of sparse holes,
they can wrap os.File with their own io.WriteSeeker that discards the
final 1-byte write and uses File.Truncate to resize the file to the
correct size.

Other reasons we choose this approach over special-casing *os.File because:
	* The Reader already has special-case logic for io.Seeker
	* As much as possible, we want to decouple OS-specific logic from
	Reader and Writer.
	* This allows other abstractions over *os.File to also benefit from
	the "skip past holes" logic.
	* It is easier to test, since it is harder to mock an *os.File.

Updates #13548

Change-Id: I0a4f293bd53d13d154a946bc4a2ade28a6646f6a
Reviewed-on: https://go-review.googlesource.com/60872
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-18 16:18:17 +00:00
Joe Tsai
f85dc050ba archive/tar: require opt-in to PAX or GNU format for time features
Nearly every Header obtained from FileInfoHeader via the FS has
timestamps with sub-second resolution and the AccessTime
and ChangeTime fields populated. This forces the PAX format
to almost always be used, which has the following problems:
* PAX is still not as widely supported compared to USTAR
* The PAX headers will occupy at minimum 1KiB for every entry

The old behavior of tar Writer had no support for sub-second resolution
nor any support for AccessTime or ChangeTime, so had neither problem.
Instead the Writer would just truncate sub-second information and
ignore the AccessTime and ChangeTime fields.

In this CL, we preserve the behavior such that the *default* behavior
would output a USTAR header for most cases by truncating sub-second
time measurements and ignoring AccessTime and ChangeTime.
To use either of the features, users will need to explicitly specify
that the format is PAX or GNU.

The exact policy chosen is this:
* USTAR and GNU may still be chosen even if sub-second measurements
are present; they simply truncate the timestamp to the nearest second.
As before, PAX uses sub-second resolutions.
* If the Format is unspecified, then WriteHeader ignores AccessTime
and ChangeTime when using the USTAR format.

This ensures that USTAR may still be chosen for a vast majority of
file entries obtained through FileInfoHeader.

Updates #11171
Updates #17876

Change-Id: Icc5274d4245922924498fd79b8d3ae94d5717271
Reviewed-on: https://go-review.googlesource.com/59230
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-30 18:00:59 +00:00
Joe Tsai
19a995945f archive/tar: add raw support for global PAX records
The PAX specification says the following:
<<<
'g' represents global extended header records for the following files in the archive.
The format of these extended header records shall be as described in pax Extended Header.
Each value shall affect all subsequent files that do not override that value
in their own extended header record and until another global extended header record
is reached that provides another value for the same field.
>>>

This CL adds support for parsing and composing global PAX records,
but intentionally does not provide support for automatically
persisting the global state across files.

Changes made:
* When Reader encounters a TypeXGlobalRecord header, it parses the
PAX records and returns them to the user ad-verbatim. Reader does not
store them in its state, ensuring it has no effect on future Next calls.
* When Writer receives a TypeXGlobalRecord header, it writes the
PAX records to the archive ad-verbatim. It does not store them in
its state, ensuring it has no effect on future WriteHeader calls.
* The restriction regarding empty record values is lifted since this
value is used to represent deletion in global headers.

Why provide raw support only:
* Some archives in the wild have a global header section (often empty)
and it is the user's responsibility to manually read and discard it's body.
The logic added here allows users to more easily skip over these sections.
* For users that do care about global headers, having access to the raw
records allows them to implement the functionality of global headers themselves
and manually persist the global state across files.
* We can still upgrade to a full implementation in the future.

Why we don't provide full support:
* Even though the PAX specification describes their operation in detail,
both the GNU and BSD tar tools (which are the most common implementations)
do not have a consistent interpretation of many details.
* Global headers were a controversial feature in PAX, by admission of the
specification itself:
  <<<
  The concept of a global extended header (typeflag g) was controversial.

  The typeflag g global headers should not be used with interchange media that
  could suffer partial data loss in transporting the archive.
  >>>
* Having state persist from entry-to-entry complicates the implementation
for a feature that is not widely used and not well supported.

Change-Id: I1d904cacc2623ddcaa91525a5470b7dbe226c7e8
Reviewed-on: https://go-review.googlesource.com/59190
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
2017-08-25 23:03:52 +00:00
Joe Tsai
a795ca51db archive/tar: support arbitrary PAX records
This CL adds the following new publicly visible API:
	type Header struct { ...; PAXRecords map[string]string }

The new Header.PAXRecords field is a map of all PAX extended header records.

We suggest (but do not enforce) that users use VENDOR-prefixed keys
according to the following in the PAX specification:
<<<
The standard developers have reserved keyword name space for vendor extensions.
It is suggested that the format to be used is:
	VENDOR.keyword
where VENDOR is the name of the vendor or organization in all uppercase letters.
>>>

When reading, the Header.PAXRecords is populated with all PAX records
encountered so far, including basic ones (e.g., "path", "mtime", etc).
When writing, the fields of Header will be merged into PAXRecords,
overwriting any records that may conflict.

Since PAXRecords is a more expressive feature than Xattrs and
is entirely a superset of Xattrs, we mark Xattrs as deprecated,
and steer users towards the new PAXRecords API.

The issue has a discussion about adding a Header.SetPAXRecord method
to help validate records and keep the Header fields in sync.
However, we do not include that in this CL since that helper
method can always be added in the future.

There is no support for global records.

Fixes #14472

Change-Id: If285a52749acc733476cf75a2c7ad15bc1542071
Reviewed-on: https://go-review.googlesource.com/58390
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-25 21:57:32 +00:00
Joe Tsai
3d62000adc archive/tar: return better WriteHeader errors
WriteHeader may fail to encode a header for any number of reasons,
which can be frustrating for the user when trying to create a tar archive.
As we validate the Header, we generate an informative error message
intended for human consumption and return that if and only if no
format can be selected.

This allows WriteHeader to return informative errors like:
    tar: cannot encode header: invalid PAX record: "linkpath = \x00hello"
    tar: cannot encode header: invalid PAX record: "SCHILY.xattr.foo=bar = baz"
    tar: cannot encode header: Format specifies GNU; and only PAX supports Xattrs
    tar: cannot encode header: Format specifies GNU; and GNU cannot encode ModTime=1969-12-31 15:59:59.0000005 -0800 PST
    tar: cannot encode header: Format specifies GNU; and GNU supports sparse files only with TypeGNUSparse
    tar: cannot encode header: Format specifies USTAR; and USTAR cannot encode ModTime=292277026596-12-04 07:30:07 -0800 PST
    tar: cannot encode header: Format specifies USTAR; and USTAR does not support sparse files
    tar: cannot encode header: Format specifies PAX; and only GNU supports TypeGNUSparse

Updates #18710

Change-Id: I82a498d6f29d02c4e73bce47b768eb578da8499c
Reviewed-on: https://go-review.googlesource.com/58310
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-25 05:21:00 +00:00
Joe Tsai
9d3d370632 archive/tar: support reporting and selecting the format
The Reader and Writer are now at feature parity,
meaning that everything that can be parsed by the Reader,
can also be composed by the Writer.

This position enables us to support selection of the format
in a backwards compatible way, since it ensures that everything
that can be read can also be round-trip written.

As such, we add the following new API:
    type Format int
            const FormatUnknown Format = 0 ...
    type Header struct { ...; Format Format }

The new Header.Format field is populated by the Reader on the
best guess on what the format is. Note that the Reader is very liberal
in what it permits, so a hybrid TAR file using aspects of multiple
formats can still be decoded, but will be reported as FormatUnknown.

Even though Reader has full support for V7 and basic support for STAR,
it will still report those formats as unknown (and the constants for
those formats are not even exported). The reasons for this is because
the Writer has no support for V7 or STAR. Leaving it as unknown allows
the Writer to choose a format usually USTAR or GNU that can encode
the equivalent Header.

When writing, the Header.allowedFormats will take the Format field
into consideration if it is a known format.

Fixes #18710

Change-Id: I00980c475d067c6969d3414e1ff0224fdd89cd49
Reviewed-on: https://go-review.googlesource.com/58230
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-24 01:35:39 +00:00
Joe Tsai
3bece2fa0e archive/tar: refactor Reader support for sparse files
This CL is the first step (of two) for adding sparse file support
to the Writer. This CL only refactors the logic of sparse-file handling
in the Reader so that common logic can be easily shared by the Writer.

As a result of this CL, there are some new publicly visible API changes:
	type SparseEntry struct { Offset, Length int64 }
	type Header struct { ...; SparseHoles []SparseEntry }

A new type is defined to represent a sparse fragment and a new field
Header.SparseHoles is added to represent the sparse holes in a file.
The API intentionally represent sparse files using hole fragments,
rather than data fragments so that the zero value of SparseHoles
naturally represents a normal file (i.e., a file without any holes).
The Reader now populates SparseHoles for sparse files.

It is necessary to export the sparse hole information, otherwise it would
be impossible for the Writer to specify that it is trying to encode
a sparse file, and what it looks like.

Some unexported helper functions were added to common.go:
	func validateSparseEntries(sp []SparseEntry, size int64) bool
	func alignSparseEntries(src []SparseEntry, size int64) []SparseEntry
	func invertSparseEntries(src []SparseEntry, size int64) []SparseEntry

The validation logic that used to be in newSparseFileReader is now moved
to validateSparseEntries so that the Writer can use it in the future.
alignSparseEntries is currently unused by the Reader, but will be used
by the Writer in the future. Since TAR represents sparse files by
only recording the data fragments, we add the invertSparseEntries
function to convert a list of data fragments to a normalized list
of hole fragments (and vice-versa).

Some other high-level changes:
* skipUnread is deleted, where most of it's logic is moved to the
Discard methods on regFileReader and sparseFileReader.
* readGNUSparsePAXHeaders was rewritten to be simpler.
* regFileReader and sparseFileReader were completely rewritten
in simpler and easier to understand logic.
* A bug was fixed in sparseFileReader.Read where it failed to
report an error if the logical size of the file ends before
consuming all of the underlying data.
* The tests for sparse-file support was completely rewritten.

Updates #13548

Change-Id: Ic1233ae5daf3b3f4278fe1115d34a90c4aeaf0c2
Reviewed-on: https://go-review.googlesource.com/56771
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-19 00:57:31 +00:00
Agniva De Sarker
d9606e5532 archive/tar: add reader/writer benchmarks
According to the discussion on golang.org/cl/55210,
adding benchmarks for reading from and writing to tar archives.

Splitting the benchmarks into 3 sections of USTAR, GNU, PAX each.

Results ran with -cpu=1 -count=10 on an amd64 machine (i5-5200U CPU @ 2.20GHz)
name           time/op
/Writer/USTAR  5.31µs ± 0%
/Writer/GNU    5.01µs ± 1%
/Writer/PAX    11.0µs ± 2%
/Reader/USTAR  3.22µs ± 1%
/Reader/GNU    3.04µs ± 1%
/Reader/PAX    7.48µs ± 1%

name           alloc/op
/Writer/USTAR  1.20kB ± 0%
/Writer/GNU    1.15kB ± 0%
/Writer/PAX    2.61kB ± 0%
/Reader/USTAR  1.38kB ± 0%
/Reader/GNU    1.35kB ± 0%
/Reader/PAX    4.91kB ± 0%

name           allocs/op
/Writer/USTAR    53.0 ± 0%
/Writer/GNU      47.0 ± 0%
/Writer/PAX       107 ± 0%
/Reader/USTAR    32.0 ± 0%
/Reader/GNU      30.0 ± 0%
/Reader/PAX      67.0 ± 0%

Change-Id: I58b1b85b52e58cbd566736aae4d722a3ddf2395b
Reviewed-on: https://go-review.googlesource.com/55254
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-16 20:51:52 +00:00
Joe Tsai
5c20ffbb2f archive/tar: add support for long binary strings in GNU format
The GNU tar format defines the following type flags:
	TypeGNULongName = 'L' // Next file has a long name
	TypeGNULongLink = 'K' // Next file symlinks to a file w/ a long name

Anytime a string exceeds the field dedicated to store it, the GNU format
permits a fake "file" to be prepended where that file entry has a Typeflag
of 'L' or 'K' and the contents of the file is a NUL-terminated string.

Contrary to previous TODO comments,
the GNU format supports arbitrary strings (without NUL) rather UTF-8 strings.
The manual says the following:
<<<
The name, linkname, magic, uname, and gname are
null-terminated character strings
>>>
<<<
All characters in header blocks are represented
by using 8-bit characters in the local variant of ASCII.
>>>

From this description, we gather the following:
* We must forbid NULs in any GNU strings
* Any 8-bit value (other than NUL) is permitted

Since the modern world has moved to UTF-8, it is really difficult to
determine what a "local variant of ASCII" means. For this reason,
we treat strings as just an arbitrary binary string (without NUL)
and leave it to the user to determine the encoding of this string.
(Practically, it seems that UTF-8 is the typical encoding used
in GNU archives seen in the wild).

The implementation of GNU tar seems to confirm this interpretation
of the manual where it permits any arbitrary binary string to exist
within these fields so long as they do not contain the NUL character.

 $ touch `echo -e "not\x80\x81\x82\x83utf8"`
 $ gnutar -H gnu --tar -cvf gnu-not-utf8.tar $(echo -e "not\x80\x81\x82\x83utf8")

The fact that we permit arbitrary binary in GNU strings goes
hand-in-hand with the fact that GNU also permits a "base-256" encoding
of numeric fields, which is effectively two-complement binary.

Change-Id: Ic037ec6bed306d07d1312f0058594bd9b64d9880
Reviewed-on: https://go-review.googlesource.com/55573
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-16 00:39:32 +00:00
Joe Tsai
9223adcc2c archive/tar: add support for atime and ctime to Writer
Both the GNU and PAX formats support atime and ctime fields.
The implementation is trivial now that we have:
* support for formatting PAX records for timestamps
* dedicated methods that only handle one format (e.g., GNU)

Fixes #17876

Change-Id: I0c604fce14a47d722098afc966399cca2037395d
Reviewed-on: https://go-review.googlesource.com/55570
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-15 03:07:59 +00:00
Joe Tsai
1da0e7e28e archive/tar: reject bad key-value pairs for PAX records
We forbid empty keys or keys with '=' because it leads to ambiguous parsing.
Relevent PAX specification:
<<<
A keyword shall not include an <equals-sign>.
>>>

Also, we forbid the writer from encoding records with an empty value.
While, this is a valid record syntactically, the semantics of an empty
value is that previous records with that key should be deleted.
Since we have no support (and probably never will) for global PAX records,
deletion is a non-sensible operation.
<<<
If the <value> field is zero length,
it shall delete any header block field,
previously entered extended header value,
or global extended header value of the same name.
>>>

Fixes #20698
Fixes #15567

Change-Id: Ia29c5c6ef2e36cd9e6d7f6cff10e92b96a62f0d1
Reviewed-on: https://go-review.googlesource.com/55571
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-15 02:29:29 +00:00
Joe Tsai
ead6255ce3 archive/tar: check for permissible output formats first
The current logic in writeHeader attempts to encode the Header in one
format and if it discovered that it could not it would attempt to
switch to a different format mid-way through. This makes it very
hard to reason about what format will be used in the end and whether
it will even be a valid format.

Instead, we should verify from the start what formats are allowed
to encode the given input Header. If no formats are possible,
then we can return immediately, rejecting the Header.

For now, we continue on to the hairy logic in writeHeader, but
a future CL can split that logic up and specialize them for each
format now that we know what is possible.

Update #9683
Update #12594

Change-Id: I8406ea855dfcb8b478a03a7058ddf8b2b09d46dc
Reviewed-on: https://go-review.googlesource.com/54433
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-11 04:39:39 +00:00
Lars Jeppesen
66b5a2f3f0 archive/tar: remove file type bits from mode field
When writing tar files by using the FileInfoHeader
the type bits was set in the mode field of the header
This is not correct according to the standard (GNU/Posix) and
other implementations.

Fixed #20150

Change-Id: I3be7d946a1923ad5827cf45c696546a5e287ebba
Reviewed-on: https://go-review.googlesource.com/42093
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-13 00:22:29 +00:00
Alex Brainman
1989921aef os: do not report ModeDir for symlinks on windows
When using Lstat against symlinks that point to a directory,
the function returns FileInfo with both ModeDir and ModeSymlink set.
Change that to never set ModeDir if ModeSymlink is set.

Fixes #10424
Fixes #17540
Fixes #17541

Change-Id: Iba280888aad108360b8c1f18180a24493fe7ad2b
Reviewed-on: https://go-review.googlesource.com/41830
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 23:17:23 +00:00
Alex Brainman
cf3a28124b archive/tar: extend TestFileInfoHeaderSymlink
For #17541.

Change-Id: I524ab194f32b8b061ce1c9c3e0cd34cc5539358e
Reviewed-on: https://go-review.googlesource.com/39410
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-25 03:04:46 +00:00
Russ Cox
0e3355903d time: record monotonic clock reading in time.Now, for more accurate comparisons
See https://golang.org/design/12914-monotonic for details.

Fixes #12914.

Change-Id: I80edc2e6c012b4ace7161c84cf067d444381a009
Reviewed-on: https://go-review.googlesource.com/36255
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Caleb Spare <cespare@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-03 19:04:52 +00:00
Joe Tsai
04262986a0 archive/tar: compact slices in tests
Took this opportunity to also embed tables in the functions
that they are actually used in and other stylistic cleanups.

There was no logical changes to the tests.

Change-Id: Ifa724060532175f6f4407d6cedc841891efd8f7b
Reviewed-on: https://go-review.googlesource.com/31436
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-19 18:38:34 +00:00
Alex Brainman
53eb4783c2 archive/tar: move round-trip reading into common os file
Fixes #11426

Change-Id: I77368b0e852149ed4533e139cc43887508ac7f78
Reviewed-on: https://go-review.googlesource.com/11662
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-30 02:20:20 +00:00
Brad Fitzpatrick
2b3b7dc1a9 archive/tar: also skip header roundtrip test on nacl
Update #11426

Change-Id: I7abc4ed2241a7a3af6d57c934786f36de4f97b77
Reviewed-on: https://go-review.googlesource.com/11592
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-06-26 22:36:38 +00:00
Brad Fitzpatrick
48025d2ce0 archive/tar: disable new failing test on windows and plan9
Update #11426

Change-Id: If406d2efcc81965825a63c76f5448d544ba2a740
Reviewed-on: https://go-review.googlesource.com/11590
Reviewed-by: Austin Clements <austin@google.com>
2015-06-26 21:43:51 +00:00
Vincent Batts
f271f928d9 archive/tar: fix round-trip attributes
The issue was identified while
working with round trip FileInfo of the headers of hardlinks. Also,
additional test cases for hard link handling.
(review carried over from http://golang.org/cl/165860043)

Fixes #9027

Change-Id: I9e3a724c8de72eb1b0fbe0751a7b488894911b76
Reviewed-on: https://go-review.googlesource.com/6790
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-26 15:51:06 +00:00
Russ Cox
c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00