|
- # seek-bzip
-
- [![Build Status][1]][2] [![dependency status][3]][4] [![dev dependency status][5]][6]
-
- `seek-bzip` is a pure-JavaScript Node.js module adapted from [node-bzip](https://github.com/skeggse/node-bzip) and, before that, [antimatter15's pure-JavaScript bzip2 decoder](https://github.com/antimatter15/bzip2.js). Like these projects, `seek-bzip` only does decompression (see [compressjs](https://github.com/cscott/compressjs) if you need compression code). Unlike them, `seek-bzip` can seek to and decode single blocks from the bzip2 file.
-
- `seek-bzip` primarily decodes buffers into other buffers, synchronously.
- With the help of the [fibers](https://github.com/laverdet/node-fibers)
- package, it can operate on node streams; see `test/stream.js` for an
- example.
-
- ## How to Install
-
- ```
- npm install seek-bzip
- ```
-
- This package uses
- [Typed Arrays](https://developer.mozilla.org/en-US/docs/JavaScript/Typed_arrays), which are present in node.js >= 0.5.5.
-
- ## Usage
-
- After compressing some example data into `example.bz2`, the following will recreate that original data and save it to `example`:
-
- ```
- var Bunzip = require('seek-bzip');
- var fs = require('fs');
-
- var compressedData = fs.readFileSync('example.bz2');
- var data = Bunzip.decode(compressedData);
-
- fs.writeFileSync('example', data);
- ```
-
- See the tests in the `test/` directory for further usage examples.
-
- For uncompressing single blocks of bzip2-compressed data, you will need
- an out-of-band index listing the start of each bzip2 block. (Presumably
- you generate this at the same time as you index the start of the information
- you wish to seek to inside the compressed file.) The `seek-bzip` module
- has been designed to be compatible with the C implementation `seek-bzip2`
- available from https://bitbucket.org/james_taylor/seek-bzip2. That codebase
- contains a `bzip-table` tool which will generate bzip2 block start indices.
- There is also a pure-JavaScript `seek-bzip-table` tool in this package's
- `bin` directory.
-
- ## Documentation
-
- `require('seek-bzip')` returns a `Bunzip` object. It contains three static
- methods. The first is a function accepting one to three parameters:
-
- `Bunzip.decode = function(input, [Number expectedSize] or [output], [boolean multistream])`
-
- The `input` argument can be a "stream" object (which must implement the
- `readByte` method), or a `Buffer`.
-
- If `expectedSize` is not present, `Bunzip.decode` simply decodes `input` and
- returns the resulting `Buffer`.
-
- If `expectedSize` is present (and numeric), `Bunzip.decode` will store
- the results in a `Buffer` of length `expectedSize`, and throw an error
- if the size of the decoded data does not match
- `expectedSize`.
-
- If you pass a non-numeric second parameter, it can either be a `Buffer`
- object (which must be of the correct length; an error will be thrown if
- the size of the decoded data does not match the buffer length) or
- a "stream" object (which must implement a `writeByte` method).
-
- The optional third `multistream` parameter, if true, attempts to continue
- reading past the end of the bzip2 file. This supports "multistream"
- bzip2 files, which are simply multiple bzip2 files concatenated together.
- If this argument is true, the input stream must have an `eof` method
- which returns true when the end of the input has been reached.
-
- The second exported method is a function accepting two or three parameters:
-
- `Bunzip.decodeBlock = function(input, Number blockStartBits, [Number expectedSize] or [output])`
-
- The `input` and `expectedSize`/`output` parameters are as above.
- The `blockStartBits` parameter gives the start of the desired block, in bits.
-
- If passing a stream as the `input` parameter, it must implement the
- `seek` method.
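
The stream requirements described in this section (`readByte`, `eof`, `seek` on the input side, `writeByte` on the output side) can be sketched as a pair of in-memory adapters. The names `BufferInput` and `ArrayOutput` are hypothetical, and two details are assumptions because this README does not specify them: that `readByte` signals end of input with `-1`, and that `seek` receives a byte offset.

```javascript
// Hypothetical adapters for illustration; not part of seek-bzip itself.

// Input side: wraps a Buffer (or any array-like of bytes).
function BufferInput(bytes) {
  this.bytes = bytes;
  this.pos = 0; // current read position, in bytes
}

// Return the next byte. Returning -1 at end of input is an assumed convention.
BufferInput.prototype.readByte = function () {
  return this.pos < this.bytes.length ? this.bytes[this.pos++] : -1;
};

// Required when the `multistream` argument is true.
BufferInput.prototype.eof = function () {
  return this.pos >= this.bytes.length;
};

// Required by `decodeBlock`. A byte offset is assumed here.
BufferInput.prototype.seek = function (bytePos) {
  this.pos = bytePos;
};

// Output side: collects every decoded byte it is handed.
function ArrayOutput() {
  this.collected = [];
}

ArrayOutput.prototype.writeByte = function (byte) {
  this.collected.push(byte);
};

// Convenience helper (hypothetical) to get the collected bytes as a Buffer.
ArrayOutput.prototype.toBuffer = function () {
  return Buffer.from(this.collected);
};
```

Under those assumptions, instances of these objects could be passed as the `input` and `output` arguments of `Bunzip.decode` in place of buffers.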
-
- The final exported method is a function accepting two or three parameters:
-
- `Bunzip.table = function(input, Function callback, [boolean multistream])`
-
- The `input` and `multistream` parameters are identical to those for the
- `decode` method.
-
- This function will invoke `callback(position, size)` once per bzip2 block,
- where `position` gives the starting position of the block (in *bits*), and
- `size` gives the uncompressed size of the block (in bytes).
-
- This can be used to construct an index allowing direct access to a particular
- block inside a bzip2 file, using the `decodeBlock` method.
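
As a rough sketch of that workflow, the helpers below collect the `table` callbacks into an array and then decode one block by number with `decodeBlock`. The helper names `buildIndex` and `decodeNthBlock` are hypothetical. The sketch assumes `table` invokes its callback synchronously, in line with the module's synchronous design, and it takes the `Bunzip` object as a parameter (for example the result of `require('seek-bzip')`) so it can also be tried with a stand-in object.

```javascript
// Collect one { startBits, size } entry per bzip2 block.
// Assumes Bunzip.table calls the callback synchronously.
function buildIndex(Bunzip, compressed) {
  var index = [];
  Bunzip.table(compressed, function (position, size) {
    // position: block start in bits; size: uncompressed block size in bytes
    index.push({ startBits: position, size: size });
  });
  return index;
}

// Decode only block number `n` (0-based) using a previously built index.
function decodeNthBlock(Bunzip, compressed, index, n) {
  if (n < 0 || n >= index.length) {
    throw new RangeError('no such block: ' + n);
  }
  return Bunzip.decodeBlock(compressed, index[n].startBits);
}
```

The index only needs to be built once per file; it can then be stored alongside the compressed data and reused for later lookups.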
-
- ## Command-line
-
- There are binaries available in the `bin` directory. The first generates an index of all
- the blocks in a bzip2-compressed file:
- ```
- $ bin/seek-bzip-table test/sample4.bz2
- 32 99981
- 320555 99981
- 606348 99981
- 847568 99981
- 1089094 99981
- 1343625 99981
- 1596228 99981
- 1843336 99981
- 2090919 99981
- 2342106 39019
- $
- ```
- The first field is the starting position of the block, in bits, and the
- second field is the uncompressed size of the block, in bytes.
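
To show how this two-column output might be consumed, here is a minimal sketch that parses it and finds the block covering a given offset in the uncompressed data. The function names `parseTable` and `findBlock` are hypothetical, and the sketch assumes the two fields are separated by whitespace and that block sizes add up to uncompressed offsets in file order.

```javascript
// Parse "position size" lines into index entries, recording where each
// block begins in the uncompressed data (running sum of earlier sizes).
function parseTable(text) {
  var offset = 0;
  return text.trim().split('\n').map(function (line) {
    var fields = line.trim().split(/\s+/);
    var entry = {
      startBits: parseInt(fields[0], 10),   // block start, in bits
      size: parseInt(fields[1], 10),        // uncompressed size, in bytes
      uncompressedStart: offset
    };
    offset += entry.size;
    return entry;
  });
}

// Find the block containing uncompressed byte `offset`.
// Returns null when the offset lies past the end of the data.
function findBlock(index, offset) {
  for (var i = 0; i < index.length; i++) {
    var e = index[i];
    if (offset >= e.uncompressedStart && offset < e.uncompressedStart + e.size) {
      return {
        block: i,
        startBits: e.startBits,
        offsetInBlock: offset - e.uncompressedStart
      };
    }
  }
  return null;
}
```

For the sample output above, uncompressed offset 250000 falls in the third block (start bit 606348), because the first two blocks together cover 199962 bytes. That start bit is the value to pass to `decodeBlock` or to the `-b` option.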
-
- The second binary decodes an arbitrary block of a bzip2 file:
- ```
- $ bin/seek-bunzip -d -b 2342106 test/sample4.bz2 | tail
- élan's
- émigré
- émigré's
- émigrés
- épée
- épée's
- épées
- étude
- étude's
- études
- $
- ```
-
- Use `--help` to see other options.
-
- ## Help wanted
-
- Improvements to this module are welcome.
- Feel free to fork on GitHub and submit pull requests!
-
- ## Related projects
-
- * https://github.com/skeggse/node-bzip node-bzip (original upstream source)
- * https://github.com/cscott/compressjs
- Lots of compression/decompression algorithms from the same author as this
- module, including bzip2 compression code.
- * https://github.com/cscott/lzjb fast LZJB compression/decompression
-
- ## License
-
- #### MIT License
-
- > Copyright © 2013-2015 C. Scott Ananian
- >
- > Copyright © 2012-2015 Eli Skeggs
- >
- > Copyright © 2011 Kevin Kwok
- >
- > Permission is hereby granted, free of charge, to any person obtaining
- > a copy of this software and associated documentation files (the
- > "Software"), to deal in the Software without restriction, including
- > without limitation the rights to use, copy, modify, merge, publish,
- > distribute, sublicense, and/or sell copies of the Software, and to
- > permit persons to whom the Software is furnished to do so, subject to
- > the following conditions:
- >
- > The above copyright notice and this permission notice shall be
- > included in all copies or substantial portions of the Software.
- >
- > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- > EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- > MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- > NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
- > LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- > OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- > WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- [1]: https://travis-ci.org/cscott/seek-bzip.png
- [2]: https://travis-ci.org/cscott/seek-bzip
- [3]: https://david-dm.org/cscott/seek-bzip.png
- [4]: https://david-dm.org/cscott/seek-bzip
- [5]: https://david-dm.org/cscott/seek-bzip/dev-status.png
- [6]: https://david-dm.org/cscott/seek-bzip#info=devDependencies
|