node-tar
Fast and full-featured Tar for Node.js
The API is designed to mimic the behavior of tar(1) on unix systems.
If you are familiar with how tar works, most of this will hopefully be
straightforward for you. If not, then hopefully this module can teach
you useful unix skills that may come in handy someday :)
Background
A "tar file" or "tarball" is an archive of file system entries
(directories, files, links, etc.). The name comes from "tape archive".
If you run man tar on almost any Unix command line, you'll learn
quite a bit about what it can do, and its history.
Tar has 5 main top-level commands:
- c: Create an archive
- r: Replace entries within an archive
- u: Update entries within an archive (ie, replace if they're newer)
- t: List out the contents of an archive
- x: Extract an archive to disk
The other flags and options modify how this top level function works.
High-Level API
These 5 functions are the high-level API. All of them have a
single-character name (for unix nerds familiar with tar(1)) as well
as a long name (for everyone else).
All the high-level functions take the following arguments, all three of which are optional and may be omitted.
- options: An optional object specifying various options
- paths: An array of paths to add or extract
- callback: Called when the command is completed, if async. (If sync or no file specified, providing a callback throws a TypeError.)
If the command is sync (ie, if options.sync=true), then the
callback is not allowed, since the action will be completed immediately.
If a file argument is specified, and the command is async, then a
Promise is returned. In this case, a callback may also be
provided, which is called when the command is completed.
If a file option is not specified, then a stream is returned. For
create, this is a readable stream of the generated archive. For
list and extract, this is a writable stream that an archive should
be written into. If a file is not specified, then a callback is not
allowed, because you're already getting a stream to work with.
replace and update only work on existing archives, and so require
a file argument.
Sync commands without a file argument return a stream that acts on its
input immediately in the same tick. For readable streams, this means
that all of the data is immediately available by calling
stream.read(). For writable streams, it will be acted upon as soon
as it is provided, but this can be at any time.
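The rules above can be condensed into a small lookup. The following is only a summary sketch and is not part of tar: describeCall is a hypothetical helper that takes the file and sync options and reports what a high-level command returns and whether a callback is allowed. It does not model the extra rule that replace and update always require a file.

```javascript
// Hypothetical helper (not part of tar): summarizes the return value and
// callback rules described above for a given combination of options.
function describeCall({ file, sync = false } = {}) {
  const hasFile = file !== undefined
  if (!hasFile) {
    // No file: you get a stream to work with, so no callback is allowed.
    return { returns: sync ? 'sync stream' : 'stream', callbackAllowed: false }
  }
  if (sync) {
    // Sync with a file: the work is done by the time the call returns.
    return { returns: 'nothing', callbackAllowed: false }
  }
  // Async with a file: a Promise is returned and a callback may be given.
  return { returns: 'Promise', callbackAllowed: true }
}

console.log(describeCall({ file: 'my-tarball.tgz' }))
console.log(describeCall({ sync: true }))
```

Only the async-with-file combination accepts a callback; every other combination either hands back a stream or has already finished.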
Warnings and Errors
Tar emits warnings and errors for recoverable and unrecoverable situations, respectively. In many cases, a warning only affects a single entry in an archive, or is simply informing you that it's modifying an entry to comply with the settings provided.
Unrecoverable warnings will always raise an error (ie, emit 'error' on
streaming actions, throw for non-streaming sync actions, reject the
returned Promise for non-streaming async operations, or call a provided
callback with an Error as the first argument). Recoverable errors will
raise an error only if strict: true is set in the options.
Respond to (recoverable) warnings by listening to the warn event.
Handlers receive 3 arguments:
- code: String. One of the error codes below. This may not match data.code, which preserves the original error code from fs and zlib.
- message: String. More details about the error.
- data: Metadata about the error. An Error object for errors raised by fs and zlib. All fields are attached to errors raised by tar. Typically contains the following fields, as relevant:
  - tarCode: The tar error code.
  - code: Either the tar error code, or the error code set by the underlying system.
  - file: The archive file being read or written.
  - cwd: Working directory for creation and extraction operations.
  - entry: The entry object (if it could be created) for TAR_ENTRY_INFO, TAR_ENTRY_INVALID, and TAR_ENTRY_ERROR warnings.
  - header: The header object (if it could be created, and the entry could not be created) for TAR_ENTRY_INFO and TAR_ENTRY_INVALID warnings.
  - recoverable: Boolean. If false, then the warning will emit an error, even in non-strict mode.
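As a rough sketch of a handler for these three arguments, the hypothetical createWarningCollector below records each warning and tracks whether any was unrecoverable. The helper name and the record shape are illustrative only, and treating a missing recoverable field as recoverable is an assumption of this sketch. The calls at the bottom simulate warnings rather than running tar.

```javascript
// Hypothetical helper: builds a handler taking (code, message, data) that
// records each warning and remembers whether any was unrecoverable.
function createWarningCollector() {
  const warnings = []
  const onwarn = (code, message, data = {}) => {
    warnings.push({
      code,
      message,
      // data.tarCode is the tar code; data.code may be an fs or zlib code.
      tarCode: data.tarCode ?? code,
      // Assumption of this sketch: a missing field counts as recoverable.
      recoverable: data.recoverable !== false,
    })
  }
  const hasFatal = () => warnings.some(w => !w.recoverable)
  return { onwarn, warnings, hasFatal }
}

// Simulated warnings, shaped like the arguments described above.
const collector = createWarningCollector()
collector.onwarn('TAR_ENTRY_INFO', 'stripping / from absolute path', {
  tarCode: 'TAR_ENTRY_INFO',
  recoverable: true,
})
collector.onwarn('TAR_ABORT', 'zlib error', {
  tarCode: 'TAR_ABORT',
  recoverable: false,
})
console.log(collector.warnings.length, collector.hasFatal())
```

A function like collector.onwarn could be attached as a warn event listener, since it accepts the same three arguments.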
Error Codes
- TAR_ENTRY_INFO: An informative error indicating that an entry is being modified, but otherwise processed normally. For example, removing / or C:\ from absolute paths if preservePaths is not set.
- TAR_ENTRY_INVALID: An indication that a given entry is not a valid tar archive entry, and will be skipped. This occurs when:
  - a checksum fails,
  - a linkpath is missing for a link type, or
  - a linkpath is provided for a non-link type.
  If every entry in a parsed archive raises a TAR_ENTRY_INVALID error, then the archive is presumed to be unrecoverably broken, and TAR_BAD_ARCHIVE will be raised.
- TAR_ENTRY_ERROR: The entry appears to be a valid tar archive entry, but encountered an error which prevented it from being unpacked. This occurs when:
  - an unrecoverable fs error happens during unpacking,
  - an entry is trying to extract into an excessively deep location (by default, limited to 1024 subfolders),
  - an entry has .. in the path and preservePaths is not set, or
  - an entry is extracting through a symbolic link, when preservePaths is not set.
- TAR_ENTRY_UNSUPPORTED: An indication that a given entry is a valid archive entry, but of a type that is unsupported, and so will be skipped in archive creation or extracting.
- TAR_ABORT: When parsing gzip-encoded archives, the parser will abort the parse process and raise a warning for any zlib errors encountered. Aborts are considered unrecoverable for both parsing and unpacking.
- TAR_BAD_ARCHIVE: The archive file is totally hosed. This can happen for a number of reasons, and always occurs at the end of a parse or extract:
  - An entry body was truncated before seeing the full number of bytes.
  - The archive contained only invalid entries, indicating that it is likely not an archive, or at least, not an archive this library can parse.
  TAR_BAD_ARCHIVE is considered informative for parse operations, but unrecoverable for extraction. Note that, if encountered at the end of an extraction, tar WILL still have extracted as much as it could from the archive, so there may be some garbage files to clean up.
Errors that occur deeper in the system (ie, either the filesystem or zlib)
will have their error codes left intact, and a tarCode matching one of
the above will be added to the warning metadata or the raised error object.
Errors generated by tar will have one of the above codes set as the
error.code field as well, but since errors originating in zlib or fs will
have their original codes, it's better to read error.tarCode if you wish
to see how tar is handling the issue.
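To make the difference between the two fields concrete, here is a small sketch. tarCodeOf is a hypothetical helper, and the two error objects are hand-built stand-ins shaped like the description above, not errors actually produced by tar.

```javascript
// Hypothetical helper: prefer tarCode (how tar classified the problem) and
// fall back to code for errors that carry no tarCode.
function tarCodeOf(error) {
  return error.tarCode ?? error.code
}

// Simulated errors: one originating in fs, one generated by tar itself.
const fsError = Object.assign(new Error('ENOENT: no such file'), {
  code: 'ENOENT',
  tarCode: 'TAR_ENTRY_ERROR',
})
const tarError = Object.assign(new Error('checksum failure'), {
  code: 'TAR_ENTRY_INVALID',
  tarCode: 'TAR_ENTRY_INVALID',
})

console.log(tarCodeOf(fsError), fsError.code)
console.log(tarCodeOf(tarError), tarError.code)
```

For the simulated fs error the two fields differ, which is why switching on tarCode gives a consistent view of how the problem was classified.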
Examples
The API mimics the tar(1) command line functionality, with aliases
for more human-readable option and function names. The goal is that
if you know how to use tar(1) in Unix, then you know how to use
import('tar') in JavaScript.
To replicate tar czf my-tarball.tgz files and folders, you'd do:
import { create } from 'tar'
create(
  {
    gzip: true, // or an object of gzip options
    file: 'my-tarball.tgz',
  },
  ['some', 'files', 'and', 'folders'],
).then(() => {
  // tarball has been created
})
To replicate tar cz files and folders > my-tarball.tgz, you'd do:
// if you're familiar with the tar(1) cli flags, this can be nice
import fs from 'node:fs'
import * as tar from 'tar'
tar.c(
  {
    // 'z' is alias for 'gzip' option
    z: true, // or an object of gzip options
  },
  ['some', 'files', 'and', 'folders'],
).pipe(fs.createWriteStream('my-tarball.tgz'))
To replicate tar xf my-tarball.tgz you'd do:
tar.x( // or `tar.extract`
  {
    // or `file:`
    f: 'my-tarball.tgz',
  },
).then(() => {
  // tarball has been dumped in cwd
})
To replicate cat my-tarball.tgz | tar x -C some-dir --strip=1:
fs.createReadStream('my-tarball.tgz').pipe(
  tar.x({
    strip: 1,
    C: 'some-dir', // alias for cwd:'some-dir', also ok
  }),
)
To replicate tar tf my-tarball.tgz, do this:
tar.t({
  file: 'my-tarball.tgz',
  onReadEntry: entry => {
    // do whatever with it
  },
})
For example, to just get the list of filenames from an archive:
const getEntryFilenames = async tarballFilename => {
  const filenames = []
  await tar.t({
    file: tarballFilename,
    onReadEntry: entry => filenames.push(entry.path),
  })
  return filenames
}
To replicate cat my-tarball.tgz | tar t do:
fs.createReadStream('my-tarball.tgz')
  .pipe(tar.t())
  .on('entry', entry => {
    // do whatever with it
  })
To do anything synchronous, add sync: true to the options. Note
that sync functions don't take a callback and don't return a promise.
When the function returns, it's already done. Sync methods without a
file argument return a sync stream, which flushes immediately. But,
of course, it still won't be done until you .end() it.
const getEntryFilenamesSync = tarballFilename => {
  const filenames = []
  tar.t({
    file: tarballFilename,
    onReadEntry: entry => filenames.push(entry.path),
    sync: true,
  })
  return filenames
}
To filter entries, add filter: <function> to the options.
Tar-creating methods call the filter with filter(path, stat).
Tar-reading methods (including extraction) call the filter with
filter(path, entry). The filter is called in the this-context of
the Pack or Unpack stream object.
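As an illustration of a filter for the tar-creating methods, the sketch below builds a filter(path, stat) function. makeCreateFilter and its maxBytes parameter are hypothetical names, and the sketch assumes that stat behaves like an fs.Stats object (with isFile() and size) and that paths use / as the separator.

```javascript
// Hypothetical filter factory for tar-creating methods, which call the
// filter as filter(path, stat). Assumes stat looks like fs.Stats.
function makeCreateFilter(maxBytes) {
  return (path, stat) => {
    // Skip anything inside a node_modules directory.
    if (path.split('/').includes('node_modules')) return false
    // Skip regular files larger than the limit; keep everything else.
    if (stat.isFile() && stat.size > maxBytes) return false
    return true
  }
}

// Exercise the filter with stand-in stat objects instead of real files.
const smallFile = { isFile: () => true, size: 100 }
const filter = makeCreateFilter(1024)
console.log(filter('src/index.js', smallFile))
console.log(filter('node_modules/dep/index.js', smallFile))
```

The returned function could then be supplied as the filter option described above.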
The arguments list to tar t and tar x specify a list of filenames
to extract or list, so they're equivalent to a filter that tests if
the file is in the list.
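The equivalence can be sketched as follows. filterFromList is a hypothetical helper that does exact-match lookups only; how tar itself matches list entries (for example directories or trailing slashes) may differ, so treat this as an approximation of the idea rather than the library's behavior.

```javascript
// Hypothetical helper: turn a list of filenames into a reading filter,
// called as filter(path, entry). The entry argument is unused here.
function filterFromList(fileList) {
  const wanted = new Set(fileList)
  return (path, _entry) => wanted.has(path)
}

const onlyReadme = filterFromList(['package/README.md'])
console.log(onlyReadme('package/README.md'))
console.log(onlyReadme('package/index.js'))
```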
For those who aren't fans of tar's single-character command names:
tar.c === tar.create
tar.r === tar.replace (appends to archive, file is required)
tar.u === tar.update (appends if newer, file is required)
tar.x === tar.extract
tar.t === tar.list
Keep reading for all the command descriptions and options, as well as the low-level API that they are built on.
tar.c(options, fileList, callback) [alias: tar.create]
Create a tarball archive.
The fileList is an array of paths to add to the tarball. Adding a
directory also adds its children recursively.
An entry in fileList that starts with an @ symbol is a tar archive
whose entries will be added. To add a file that starts with @,
prepend it with ./.
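When a file list comes from user input or a directory listing, the ./ rule can be applied mechanically. escapeLeadingAt below is a hypothetical helper, not something tar provides; it only rewrites names that begin with @ so they are treated as plain files.

```javascript
// Hypothetical helper: prefix './' to any path starting with '@' so it is
// added as a regular file rather than read as a tar archive.
function escapeLeadingAt(fileList) {
  return fileList.map(path => (path.startsWith('@') ? `./${path}` : path))
}

console.log(escapeLeadingAt(['@scope', 'src', './@already-safe']))
```

Paths that already start with ./ are left alone, since they no longer begin with @.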
The following options are supported:
- file: Write the tarball archive to the specified filename. If this is specified, then the callback will be fired when the file has been written, and a promise will be returned that resolves when the file is written. If a filename is not specified, then a Readable Stream will be returned which will emit the file data. [Alias: f]
- sync: Act synchronously. If this is set, then any provided file will be fully written after the call to tar.c. If this is set, and a file is not provided, then the resulting stream will already have the data ready to read or emit('data') as soon as you request it.
- onwarn: A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
- strict: Treat warnings as crash-worthy errors. Default false.
- cwd: The current working directory for creating the archive. Defaults to process.cwd(). [Alias: C]
- prefix: A path portion to prefix onto the entries in the archive.
- gzip: Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
- filter: A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
- portable: Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
- preservePaths: Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
- mode: The mode to set on the created file archive
- noDirRecurse: Do not recursively archive the contents of directories. [Alias: n]
- follow: Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
- noPax: Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
- noMtime: Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
- mtime: Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.
- onWriteEntry: Called with each WriteEntry or WriteEntrySync that is created in the course of writing the archive.
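For readers unfamiliar with umask values, the arithmetic behind "a umask value of 0o22" in the portable option is shown below. This is plain bit arithmetic and not tar's actual mode-handling code; applyUmask and toOctal are hypothetical helper names.

```javascript
// Applying a umask clears the permission bits that the mask has set.
const applyUmask = (mode, umask) => mode & ~umask

// Format a numeric mode as an octal literal for display.
const toOctal = mode => '0o' + mode.toString(8)

console.log(toOctal(applyUmask(0o666, 0o22))) // 0o644
console.log(toOctal(applyUmask(0o777, 0o22))) // 0o755
```

A umask of 0o22 therefore removes write permission for group and others while leaving the owner's bits untouched.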
The following options are mostly internal, but can be modified in some advanced use cases, such as re-using caches between runs.
- linkCache: A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
- statCache: A Map object that caches calls to lstat.
- readdirCache: A Map object that caches calls to readdir.
- jobs: A number specifying how many concurrent jobs to run. Defaults to 4.
- maxReadSize: The maximum buffer size for fs.read() operations. Defaults to 16 MB.
tar.x(options, fileList, callback) [alias: tar.extract]
Extract a tarball archive.
The fileList is an array of paths to extract from the tarball. If
no paths are provided, then all the entries are extracted.
If the archive is gzipped, then tar will detect this and unzip it.
Note that all directories that are created will be forced to be writable, readable, and listable by their owner, to avoid cases where a directory prevents extraction of child entries by virtue of its mode.
Most extraction errors will cause a warn event to be emitted. If
the cwd is missing, or not a directory, then the extraction will
fail completely.
The following options are supported:
- cwd: Extract files relative to the specified directory. Defaults to process.cwd(). If provided, this must exist and must be a directory. [Alias: C]
- file: The archive file to extract. If not specified, then a Writable stream is returned where the archive data should be written. [Alias: f]
- sync: Create files and directories synchronously.
- strict: Treat warnings as crash-worthy errors. Default false.
- filter: A function that gets called with (path, entry) for each entry being unpacked. Return true to unpack the entry from the archive, or false to skip it.
- newer: Set to true to keep the existing file on disk if it's newer than the file in the archive. [Alias: keep-newer, keep-newer-files]
- keep: Do not overwrite existing files. In particular, if a file appears more than once in an archive, later copies will not overwrite earlier copies. [Alias: k, keep-existing]
- preservePaths: Allow absolute paths, paths containing .., and extracting through symbolic links. By default, / is stripped from absolute paths, .. paths are not extracted, and any file whose location would be modified by a symbolic link is not extracted. [Alias: P]
- unlink: Unlink files before creating them. Without this option, tar overwrites existing files, which preserves existing hardlinks. With this option, existing hardlinks will be broken, as will any symlink that would affect the location of an extracted file. [Alias: U]
- strip: Remove the specified number of leading path elements. Pathnames with fewer elements will be silently skipped. Note that the pathname is edited after applying the filter, but before security checks. [Alias: strip-components, stripComponents]
- onwarn: A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
- preserveOwner: If true, tar will set the uid and gid of extracted entries to the uid and gid fields in the archive. This defaults to true when run as root, and false otherwise. If false, then files and directories will be set with the owner and group of the user running the process. This is similar to -p in tar(1), but ACLs and other system-specific data is never unpacked in this implementation, and modes are set by default already. [Alias: p]
- uid: Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified user id, regardless of the uid field in the archive. Cannot be used along with preserveOwner. Requires also setting a gid option.
- gid: Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified group id, regardless of the gid field in the archive. Cannot be used along with preserveOwner. Requires also setting a uid option.
- noMtime: Set to true to omit writing mtime value for extracted entries. [Alias: m, no-mtime]
- transform: Provide a function that takes an entry object, and returns a stream, or any falsey value. If a stream is provided, then that stream's data will be written instead of the contents of the archive entry. If a falsey value is provided, then the entry is written to disk as normal. (To exclude items from extraction, use the filter option described above.)
- onReadEntry: A function that gets called with (entry) for each entry that passes the filter.
- chmod: Set to true to call fs.chmod() to ensure that the extracted file matches the entry mode. This may necessitate a call to the deprecated and thread-unsafe process.umask() method to determine the default umask value, unless a processUmask option is also provided. Otherwise tar will extract with whatever mode is provided, and let the process umask apply normally.
- processUmask: Set to an explicit numeric value to avoid calling process.umask() when chmod: true is set.
- maxDepth: The maximum depth of subfolders to extract into. This defaults to 1024. Anything deeper than the limit will raise a warning and skip the entry. Set to Infinity to remove the limitation.
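The effect of the strip option on a single path can be pictured with the sketch below. stripPath is a hypothetical illustration and not tar's implementation; in particular, treating a path with nothing left after stripping as skipped is an assumption of this sketch.

```javascript
// Hypothetical illustration of strip: drop `count` leading path elements.
// Returns null when the entry would be skipped.
function stripPath(path, count) {
  // Ignore empty segments so a trailing slash does not count as an element.
  const parts = path.split('/').filter(part => part !== '')
  // Assumption: nothing left after stripping also means "skip".
  if (parts.length <= count) return null
  return parts.slice(count).join('/')
}

console.log(stripPath('package/lib/index.js', 1))
console.log(stripPath('package', 2))
```

This mirrors the common case of unpacking an archive whose entries all live under a single top-level folder, with strip: 1 removing that folder.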
The following options are mostly internal, but can be modified in some advanced use cases, such as re-using caches between runs.
- maxReadSize: The maximum buffer size for fs.read() operations. Defaults to 16 MB.
- umask: Filter the modes of entries like process.umask().
- dmode: Default mode for directories
- fmode: Default mode for files
- dirCache: A Map object of which directories exist.
- maxMetaEntrySize: The maximum size of meta entries that is supported. Defaults to 1 MB.
Note that using an asynchronous stream type with the transform
option will cause undefined behavior in sync extractions.
MiniPass-based streams are designed for this use case.
tar.t(options, fileList, callback) [alias: tar.list]
List the contents of a tarball archive.
The fileList is an array of paths to list from the tarball. If
no paths are provided, then all the entries are listed.
If the archive is gzipped, then tar will detect this and unzip it.
If the file option is not provided, then tar.t returns an event emitter that
emits entry events with tar.ReadEntry objects. However, they don't
emit 'data' or 'end' events. (If you want to get actual readable
entries, use the tar.Parse class instead.)
If a file option is provided, then the return value will be a promise
that resolves when the file has been fully traversed in async mode, or
undefined if sync: true is set. Thus, you must specify an onReadEntry
method in order to do anything useful with the data it parses.
The following options are supported:
- file: The archive file to list. If not specified, then a Writable stream is returned where the archive data should be written. [Alias: f]
- sync: Read the specified file synchronously. (This has no effect when a file option isn't specified, because entries are emitted as fast as they are parsed from the stream anyway.)
- strict: Treat warnings as crash-worthy errors. Default false.
- filter: A function that gets called with (path, entry) for each entry being listed. Return true to emit the entry from the archive, or false to skip it.
- onReadEntry: A function that gets called with (entry) for each entry that passes the filter. This is important for when file is set, because there is no other way to do anything useful with this method.
- maxReadSize: The maximum buffer size for fs.read() operations. Defaults to 16 MB.
- noResume: By default, entry streams are resumed immediately after the call to onReadEntry. Set noResume: true to suppress this behavior. Note that by opting into this, the stream will never complete until the entry data is consumed.
- onwarn: A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
tar.u(options, fileList, callback) [alias: tar.update]
Add files to an archive if they are newer than the entry already in the tarball archive.
The fileList is an array of paths to add to the tarball. Adding a
directory also adds its children recursively.
An entry in fileList that starts with an @ symbol is a tar archive
whose entries will be added. To add a file that starts with @,
prepend it with ./.
The following options are supported:
- file: Required. Write the tarball archive to the specified filename. [Alias: f]
- sync: Act synchronously. If this is set, then any provided file will be fully written after the call to tar.u.
- onwarn: A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
- strict: Treat warnings as crash-worthy errors. Default false.
- cwd: The current working directory for adding entries to the archive. Defaults to process.cwd(). [Alias: C]
- prefix: A path portion to prefix onto the entries in the archive.
- gzip: Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
- filter: A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
- portable: Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
- preservePaths: Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
- maxReadSize: The maximum buffer size for fs.read() operations. Defaults to 16 MB.
- noDirRecurse: Do not recursively archive the contents of directories. [Alias: n]
- follow: Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
- noPax: Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
- noMtime: Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
- mtime: Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.
- onWriteEntry: Called with each WriteEntry or WriteEntrySync that is created in the course of writing the archive.
tar.r(options, fileList, callback) [alias: tar.replace]
Add files to an existing archive. Because later entries override earlier entries, this effectively replaces any existing entries.
The fileList is an array of paths to add to the tarball. Adding a
directory also adds its children recursively.
An entry in fileList that starts with an @ symbol is a tar archive
whose entries will be added. To add a file that starts with @,
prepend it with ./.
The following options are supported:
- file: Required. Write the tarball archive to the specified filename. [Alias: f]
- sync: Act synchronously. If this is set, then any provided file will be fully written after the call to tar.r.
- onwarn: A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
- strict: Treat warnings as crash-worthy errors. Default false.
- cwd: The current working directory for adding entries to the archive. Defaults to process.cwd(). [Alias: C]
- prefix: A path portion to prefix onto the entries in the archive.
- gzip: Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
- filter: A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
- portable: Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
- preservePaths: Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
- maxReadSize: The maximum buffer size for fs.read() operations. Defaults to 16 MB.
- noDirRecurse: Do not recursively archive the contents of directories. [Alias: n]
- follow: Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
- noPax: Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
- noMtime: Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
- mtime: Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.
- onWriteEntry: Called with each WriteEntry or WriteEntrySync that is created in the course of writing the archive.
Low-Level API
class Pack
A readable tar stream.
Has all the standard readable stream interface stuff. 'data' and
'end' events, read() method, pause() and resume(), etc.
constructor(options)
The following options are supported:
- onwarn: A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
- strict: Treat warnings as crash-worthy errors. Default false.
- cwd: The current working directory for creating the archive. Defaults to process.cwd().
- prefix: A path portion to prefix onto the entries in the archive.
- gzip: Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip()
- filter: A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
- portable: Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
- preservePaths: Allow absolute paths. By default, / is stripped from absolute paths.
- linkCache: A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
- statCache: A Map object that caches calls to lstat.
- readdirCache: A Map object that caches calls to readdir.
- jobs: A number specifying how many concurrent jobs to run. Defaults to 4.
- maxReadSize: The maximum buffer size for fs.read() operations. Defaults to 16 MB.
- noDirRecurse: Do not recursively archive the contents of directories.
- follow: Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such.
- noPax: Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
- noMtime: Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
- mtime: Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.
- onWriteEntry: Called with each WriteEntry or WriteEntrySync that is created in the course of writing the archive.
add(path)
Adds an entry to the archive. Returns the Pack stream.
write(path)
Adds an entry to the archive. Returns true if flushed.
end()
Finishes the archive.
class PackSync
Synchronous version of Pack.
class Unpack
A writable stream that unpacks a tar archive onto the file system.
All the normal writable stream stuff is supported. write() and
end() methods, 'drain' events, etc.
Note that all directories that are created will be forced to be writable, readable, and listable by their owner, to avoid cases where a directory prevents extraction of child entries by virtue of its mode.
'close' is emitted when it's done writing stuff to the file system.
Most unpack errors will cause a warn event to be emitted. If the
cwd is missing, or not a directory, then an error will be emitted.
constructor(options)
cwd
Extract files relative to the specified directory. Defaults toprocess.cwd()
. If provided, this must exist and must be a directory.filter
A function that gets called with(path, entry)
for each entry being unpacked. Returntrue
to unpack the entry from the archive, orfalse
to skip it.newer
Set to true to keep the existing file on disk if it's newer than the file in the archive.keep
Do not overwrite existing files. In particular, if a file appears more than once in an archive, later copies will not overwrite earlier copies.preservePaths
Allow absolute paths, paths containing..
, and extracting through symbolic links. By default,/
is stripped from absolute paths,..
paths are not extracted, and any file whose location would be modified by a symbolic link is not extracted.unlink
Unlink files before creating them. Without this option, tar overwrites existing files, which preserves existing hardlinks. With this option, existing hardlinks will be broken, as will any symlink that would affect the location of an extracted file.strip
Remove the specified number of leading path elements. Pathnames with fewer elements will be silently skipped. Note that the pathname is edited after applying the filter, but before security checks.onwarn
A function that will get called with(code, message, data)
for any warnings encountered. (See "Warnings and Errors")umask
Filter the modes of entries likeprocess.umask()
.dmode
Default mode for directoriesfmode
Default mode for filesdirCache
A Map object of which directories exist.maxMetaEntrySize
The maximum size of meta entries that is supported. Defaults to 1 MB.preserveOwner
If true, tar will set theuid
andgid
of extracted entries to theuid
andgid
fields in the archive. This defaults to true when run as root, and false otherwise. If false, then files and directories will be set with the owner and group of the user running the process. This is similar to-p
intar(1)
, but ACLs and other system-specific data is never unpacked in this implementation, and modes are set by default already.win32
True if on a windows platform. Causes behavior where filenames containing<|>?
chars are converted to windows-compatible values while being unpacked.uid
Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified user id, regardless of theuid
field in the archive. Cannot be used along withpreserveOwner
. Requires also setting agid
option.gid
Set to a number to force ownership of all extracted files and folders, and all implicitly created directories, to be owned by the specified group id, regardless of thegid
field in the archive. Cannot be used along withpreserveOwner
. Requires also setting auid
option.noMtime
Set to true to omit writingmtime
value for extracted entries.transform
Provide a function that takes anentry
object, and returns a stream, or any falsey value. If a stream is provided, then that stream's data will be written instead of the contents of the archive entry. If a falsey value is provided, then the entry is written to disk as normal. (To exclude items from extraction, use thefilter
option described above.)strict
Treat warnings as crash-worthy errors. Default false.onReadEntry
A function that gets called with(entry)
for each entry that passes the filter.onwarn
A function that will get called with(code, message, data)
for any warnings encountered. (See "Warnings and Errors")chmod
Set to true to call fs.chmod() to ensure that the extracted file matches the entry mode. This may necessitate a call to the deprecated and thread-unsafe process.umask() method to determine the default umask value, unless a processUmask option is also provided. Otherwise tar will extract with whatever mode is provided, and let the process umask apply normally.
processUmask
Set to an explicit numeric value to avoid calling process.umask() when chmod: true is set.
maxDepth
The maximum depth of subfolders to extract into. This defaults to 1024. Anything deeper than the limit will raise a warning and skip the entry. Set to Infinity to remove the limitation.
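To make the maxDepth rule concrete, here is a minimal sketch of a depth check. The helper names entryDepth and exceedsMaxDepth are hypothetical, and counting every non-empty path segment as one level is an assumption of this sketch, not necessarily how the library measures depth.

```javascript
// Hypothetical helper: count the path segments of an archive entry path.
// Assumption: every non-empty segment other than "." counts as one level.
function entryDepth(entryPath) {
  return entryPath
    .split('/')
    .filter((part) => part !== '' && part !== '.')
    .length
}

// Hypothetical check following the documented behavior: entries deeper than
// the limit would be skipped, and Infinity disables the check entirely.
function exceedsMaxDepth(entryPath, maxDepth = 1024) {
  return entryDepth(entryPath) > maxDepth
}

console.log(exceedsMaxDepth('a/b/c', 2)) // true
console.log(exceedsMaxDepth('a/b/c', Infinity)) // false
```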
class UnpackSync
Synchronous version of Unpack.
Note that using an asynchronous stream type with the transform option will cause undefined behavior in sync unpack streams. MiniPass-based streams are designed for this use case.
class tar.Parse
A writable stream that parses a tar archive stream. All the standard writable stream stuff is supported.
If the archive is gzipped, then tar will detect this and unzip it.
Emits 'entry' events with tar.ReadEntry objects, which are themselves readable streams that you can pipe wherever.
Each entry will not emit until the one before it is flushed through, so make sure to either consume the data (with on('data', ...) or .pipe(...)) or throw it away with .resume() to keep the stream flowing.
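The gzip detection mentioned above can be illustrated without the library. Gzip data starts with the two magic bytes 0x1f 0x8b, so a first chunk can be sniffed as in the sketch below. The looksGzipped helper is hypothetical, and the library's own detection may inspect more than these two bytes.

```javascript
const zlib = require('node:zlib')

// Hypothetical helper: gzip streams begin with the magic bytes 0x1f 0x8b.
function looksGzipped(chunk) {
  return chunk.length >= 2 && chunk[0] === 0x1f && chunk[1] === 0x8b
}

const compressed = zlib.gzipSync(Buffer.from('hello'))
console.log(looksGzipped(compressed)) // true
console.log(looksGzipped(Buffer.alloc(512))) // false
```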
constructor(options)
Returns an event emitter that emits entry events with tar.ReadEntry objects.
The following options are supported:
strict
Treat warnings as crash-worthy errors. Default false.
filter
A function that gets called with (path, entry) for each entry being listed. Return true to emit the entry from the archive, or false to skip it.
onReadEntry
A function that gets called with (entry) for each entry that passes the filter.
onwarn
A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
abort(error)
Stop all parsing activities. This is called when there are zlib errors. It also emits an unrecoverable warning with the error provided.
class tar.ReadEntry extends MiniPass
A representation of an entry that is being read out of a tar archive.
It has the following fields:
extended
The extended metadata object provided to the constructor.
globalExtended
The global extended metadata object provided to the constructor.
remain
The number of bytes remaining to be written into the stream.
blockRemain
The number of 512-byte blocks remaining to be written into the stream.
ignore
Whether this entry should be ignored.
meta
True if this represents metadata about the next entry, false if it represents a filesystem object.
All the fields from the header, extended header, and global extended header are added to the ReadEntry object. So it has path, type, size, mode, and so on.
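Tar stores entry bodies in 512-byte blocks and pads the final block, which is why a byte count and a block count are tracked separately. The sketch below shows that relationship only. The helper names are hypothetical, and the library's internal accounting is not shown here.

```javascript
const BLOCK_SIZE = 512

// Hypothetical helper: number of 512-byte blocks needed to hold byteCount bytes.
function blocksFor(byteCount) {
  return Math.ceil(byteCount / BLOCK_SIZE)
}

// Hypothetical helper: bytes occupied in the archive once padding is included.
function paddedSize(byteCount) {
  return blocksFor(byteCount) * BLOCK_SIZE
}

console.log(blocksFor(513)) // 2
console.log(paddedSize(513)) // 1024
```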
constructor(header, extended, globalExtended)
Create a new ReadEntry object with the specified header, extended header, and global extended header values.
class tar.WriteEntry extends MiniPass
A representation of an entry that is being written from the file system into a tar archive.
Emits data for the Header, and for the Pax Extended Header if one is required, as well as any body data.
Creating a WriteEntry for a directory does not also create WriteEntry objects for all of the directory contents.
It has the following fields:
path
The path field that will be written to the archive. By default, this is also the path from the cwd to the file system object.
portable
Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
myuid
If supported, the uid of the user running the current process.
myuser
The env.USER string if set, or ''. Set as the entry uname field if the file's uid matches this.myuid.
maxReadSize
The maximum buffer size for fs.read() operations. Defaults to 1 MB.
linkCache
A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
statCache
A Map object that caches calls to lstat.
preservePaths
Allow absolute paths. By default, / is stripped from absolute paths.
cwd
The current working directory for creating the archive. Defaults to process.cwd().
absolute
The absolute path to the entry on the filesystem. By default, this is path.resolve(this.cwd, this.path), but it can be overridden explicitly.
strict
Treat warnings as crash-worthy errors. Default false.
win32
True if on a windows platform. Causes behavior where paths replace \ with / and filenames containing the windows-compatible forms of <|>?: characters are converted to actual <|>?: characters in the archive.
noPax
Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
noMtime
Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
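The path handling described by win32 and preservePaths can be sketched as follows. The toArchivePath helper is hypothetical and covers only separator replacement and leading-slash stripping. Drive letters and the special-character conversion are deliberately left out.

```javascript
// Hypothetical helper: turn a filesystem path into the form written to an archive.
function toArchivePath(fsPath, { win32 = false, preservePaths = false } = {}) {
  // With win32 set, backslash separators become forward slashes.
  let archivePath = win32 ? fsPath.replace(/\\/g, '/') : fsPath
  // Unless preservePaths is set, leading slashes are stripped.
  if (!preservePaths) {
    archivePath = archivePath.replace(/^\/+/, '')
  }
  return archivePath
}

console.log(toArchivePath('/etc/hosts')) // "etc/hosts"
console.log(toArchivePath('/etc/hosts', { preservePaths: true })) // "/etc/hosts"
console.log(toArchivePath('dir\\file.txt', { win32: true })) // "dir/file.txt"
```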
constructor(path, options)
path is the path of the entry as it is written in the archive.
The following options are supported:
portable
Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
maxReadSize
The maximum buffer size for fs.read() operations. Defaults to 1 MB.
linkCache
A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
statCache
A Map object that caches calls to lstat.
preservePaths
Allow absolute paths. By default, / is stripped from absolute paths.
cwd
The current working directory for creating the archive. Defaults to process.cwd().
absolute
The absolute path to the entry on the filesystem. By default, this is path.resolve(this.cwd, this.path), but it can be overridden explicitly.
strict
Treat warnings as crash-worthy errors. Default false.
win32
True if on a windows platform. Causes behavior where paths replace \ with /.
onwarn
A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
noMtime
Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
umask
Set to restrict the modes on the entries in the archive, somewhat like how umask works on file creation. Defaults to process.umask() on unix systems, or 0o22 on Windows.
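As a rough illustration of what restricting modes with a umask means, the sketch below clears the permission bits that the umask names. The applyUmask helper is hypothetical, and the exact adjustment the library applies (for example in portable mode) may differ.

```javascript
// Hypothetical helper: keep the 4 low-order octal digits of the mode,
// then clear every bit that is set in the umask.
function applyUmask(mode, umask = 0o22) {
  return mode & 0o7777 & ~umask
}

console.log(applyUmask(0o777).toString(8)) // "755"
console.log(applyUmask(0o666).toString(8)) // "644"
```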
warn(message, data)
If strict, emit an error with the provided message.
Otherwise, emit a 'warn' event with the provided message and data.
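The strict versus non-strict behavior can be modeled with a plain EventEmitter. The Warner class below is hypothetical and only mirrors the description above. It is not the library's implementation.

```javascript
const { EventEmitter } = require('node:events')

// Hypothetical stand-in for an object with a warn(message, data) method.
class Warner extends EventEmitter {
  constructor({ strict = false } = {}) {
    super()
    this.strict = strict
  }

  warn(message, data) {
    if (this.strict) {
      // In strict mode the warning is raised as an error event.
      this.emit('error', Object.assign(new Error(message), { data }))
    } else {
      this.emit('warn', message, data)
    }
  }
}

const lenient = new Warner()
lenient.on('warn', (message, data) => console.log('warning:', message, data))
lenient.warn('bad entry', { path: 'x' })
```

Note that an EventEmitter throws when an 'error' event is emitted without a listener, so in strict mode an 'error' handler should be attached first.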
class tar.WriteEntry.Sync
Synchronous version of tar.WriteEntry
class tar.WriteEntry.Tar
A version of tar.WriteEntry that gets its data from a tar.ReadEntry instead of from the filesystem.
constructor(readEntry, options)
readEntry is the entry being read out of another archive.
The following options are supported:
portable
Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations. Additionally, mode is set to a "reasonable default" for most unix systems, based on a umask value of 0o22.
preservePaths
Allow absolute paths. By default, / is stripped from absolute paths.
strict
Treat warnings as crash-worthy errors. Default false.
onwarn
A function that will get called with (code, message, data) for any warnings encountered. (See "Warnings and Errors")
noMtime
Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
class tar.Header
A class for reading and writing header blocks.
It has the following fields:
nullBlock
True if decoding a block which is entirely composed of 0x00 null bytes. (Useful because tar files are terminated by at least 2 null blocks.)
cksumValid
True if the checksum in the header is valid, false otherwise.
needPax
True if the values, as encoded, will require a Pax extended header.
path
The path of the entry.
mode
The 4 lowest-order octal digits of the file mode. That is, read/write/execute permissions for world, group, and owner, and the setuid, setgid, and sticky bits.
uid
Numeric user id of the file owner.
gid
Numeric group id of the file owner.
size
Size of the file in bytes.
mtime
Modified time of the file.
cksum
The checksum of the header. This is generated by adding all the bytes of the header block, treating the checksum field itself as all ascii space characters (that is, 0x20).
type
The human-readable name of the type of entry this represents, or the alphanumeric key if unknown.
typeKey
The alphanumeric key for the type of entry this header represents.
linkpath
The target of Link and SymbolicLink entries.
uname
Human-readable user name of the file owner.
gname
Human-readable group name of the file owner.
devmaj
The major portion of the device number. Always 0 for files, directories, and links.
devmin
The minor portion of the device number. Always 0 for files, directories, and links.
atime
File access time.
ctime
File change time.
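The checksum rule and the null-block test described in the field list can be sketched with nothing but a Buffer. The function names are hypothetical. The checksum field position (offset 148, length 8) comes from the standard ustar header layout rather than from this document.

```javascript
const HEADER_SIZE = 512
const CKSUM_OFFSET = 148 // position of the checksum field in a ustar header
const CKSUM_LENGTH = 8

// Hypothetical helper: sum every byte of the header block, counting the
// checksum field itself as eight ASCII spaces (0x20).
function headerChecksum(block, offset = 0) {
  let sum = 0
  for (let i = 0; i < HEADER_SIZE; i++) {
    const inCksumField = i >= CKSUM_OFFSET && i < CKSUM_OFFSET + CKSUM_LENGTH
    sum += inCksumField ? 0x20 : block[offset + i]
  }
  return sum
}

// Hypothetical helper: true when the whole 512-byte block is 0x00 bytes.
function isNullBlock(block, offset = 0) {
  for (let i = 0; i < HEADER_SIZE; i++) {
    if (block[offset + i] !== 0) return false
  }
  return true
}

const block = Buffer.alloc(HEADER_SIZE)
block.write('a', 0) // a one-character path; 'a' is byte 97
console.log(headerChecksum(block)) // 353, that is 97 + 8 * 32
console.log(isNullBlock(Buffer.alloc(HEADER_SIZE))) // true
```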
constructor(data, [offset=0])
data is optional. It is either a Buffer that should be interpreted as a tar Header starting at the specified offset and continuing for 512 bytes, or a data object of keys and values to set on the header object, and eventually encode as a tar Header.
decode(block, offset)
Decode the provided buffer starting at the specified offset.
The buffer must contain at least 512 bytes starting at the specified offset.
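For readers unfamiliar with the header layout, here is a minimal decoding sketch for three fields. It assumes the standard ustar offsets (path at 0, mode at 100, size at 124) and plain octal text fields. The helper names are hypothetical, the bounds check is this sketch's own, and base-256 numbers, prefixes, and Pax overrides are not handled.

```javascript
// Hypothetical helper: read an octal text field padded with NULs or spaces.
function readOctal(block, start, length) {
  const text = block
    .toString('latin1', start, start + length)
    .replace(/[\0 ]+$/, '')
    .trim()
  return text === '' ? 0 : parseInt(text, 8)
}

// Hypothetical helper: read a NUL-terminated string field.
function readString(block, start, length) {
  const raw = block.subarray(start, start + length)
  const end = raw.indexOf(0)
  return raw.toString('utf8', 0, end === -1 ? length : end)
}

// Hypothetical decoder for a small subset of the header fields.
function decodeBasicHeader(block, offset = 0) {
  if (block.length < offset + 512) {
    throw new Error('need 512 bytes for header')
  }
  return {
    path: readString(block, offset, 100),
    mode: readOctal(block, offset + 100, 8) & 0o7777,
    size: readOctal(block, offset + 124, 12),
  }
}

const block = Buffer.alloc(512)
block.write('hello.txt', 0)
block.write('0000644\0', 100)
block.write('00000000005\0', 124)
console.log(decodeBasicHeader(block)) // { path: 'hello.txt', mode: 420, size: 5 }
```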
set(data)
Set the fields in the data object.
encode(buffer, offset)
Encode the header fields into the buffer at the specified offset.
Returns this.needPax to indicate whether a Pax Extended Header is required to properly encode the specified data.
class tar.Pax
An object representing a set of key-value pairs in a Pax extended header entry.
It has the following fields. Where the same name is used, they have the same semantics as the tar.Header field of the same name.
global
True if this represents a global extended header, or false if it is for a single entry.
atime
charset
comment
ctime
gid
gname
linkpath
mtime
path
size
uid
uname
dev
ino
nlink
constructor(object, global)
Set the fields set in the object. global is a boolean that defaults to false.
encode()
Return a Buffer containing the header and body for the Pax extended header entry, or null if there is nothing to encode.
encodeBody()
Return a string representing the body of the pax extended header entry.
encodeField(fieldName)
Return a string representing the key/value encoding for the specified fieldName, or '' if the field is unset.
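In the pax format, each key/value pair is written as a record of the form "LENGTH KEY=VALUE\n", where LENGTH is the byte length of the whole record including the length digits themselves. The sketch below shows one way to compute such a record. The encodePaxRecord function is hypothetical, and any key prefixes the library may add for non-standard fields are not modeled.

```javascript
// Hypothetical helper: encode one pax extended header record.
function encodePaxRecord(key, value) {
  if (value === undefined || value === null) return ''
  const body = ` ${key}=${value}\n`
  const bodyBytes = Buffer.byteLength(body)
  // The length prefix counts its own digits, so grow it until it is stable.
  let total = bodyBytes + 1
  while (String(total).length + bodyBytes !== total) {
    total = String(total).length + bodyBytes
  }
  return `${total}${body}`
}

console.log(JSON.stringify(encodePaxRecord('path', 'foo'))) // "12 path=foo\n"
```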
tar.Pax.parse(string, extended, global)
Return a new Pax object created by parsing the contents of the string provided.
If the extended object is set, then also add the fields from that object. (This is necessary because multiple metadata entries can occur in sequence.)
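Parsing is the reverse of the record format "LENGTH KEY=VALUE\n". The sketch below walks the records by byte length and layers the results on top of a previously parsed object, which is the merging behavior described above. parsePaxRecords is a hypothetical name, and values are left as strings here, whereas a real implementation would likely convert numeric and date fields.

```javascript
// Hypothetical helper: parse pax records, starting from earlier fields.
function parsePaxRecords(text, extended = {}) {
  const fields = { ...extended }
  const bytes = Buffer.from(text, 'utf8')
  let pos = 0
  while (pos < bytes.length) {
    const space = bytes.indexOf(0x20, pos)
    if (space === -1) break
    const length = parseInt(bytes.toString('utf8', pos, space), 10)
    // Stop on a malformed length rather than looping forever.
    if (!Number.isInteger(length) || length <= space - pos) break
    // The record spans `length` bytes from `pos` and ends with a newline.
    const record = bytes.toString('utf8', space + 1, pos + length - 1)
    const eq = record.indexOf('=')
    if (eq !== -1) {
      fields[record.slice(0, eq)] = record.slice(eq + 1)
    }
    pos += length
  }
  return fields
}

console.log(parsePaxRecords('12 path=foo\n9 uid=42\n', { comment: 'x' }))
// { comment: 'x', path: 'foo', uid: '42' }
```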
tar.types
A translation table for the type field in tar headers.
tar.types.name.get(code)
Get the human-readable name for a given alphanumeric code.
tar.types.code.get(name)
Get the alphanumeric code for a given human-readable name.
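A two-way table like this can be modeled with a pair of Map objects. The sketch below is a stand-in with a subset of the standard ustar type flags. The local types object is hypothetical, and the human-readable names are assumed to match the library's.

```javascript
// Subset of the standard ustar type flags, keyed by alphanumeric code.
// Assumption: these names match the ones the library reports.
const name = new Map([
  ['0', 'File'],
  ['1', 'Link'],
  ['2', 'SymbolicLink'],
  ['3', 'CharacterDevice'],
  ['4', 'BlockDevice'],
  ['5', 'Directory'],
  ['6', 'FIFO'],
])

// The reverse table is derived from the first so the two cannot drift apart.
const code = new Map([...name].map(([key, label]) => [label, key]))

const types = { name, code }

console.log(types.name.get('5')) // "Directory"
console.log(types.code.get('SymbolicLink')) // "2"
```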