|
- # clarinet
-
- ![NPM Downloads](http://img.shields.io/npm/dm/clarinet.svg?style=flat) ![NPM Version](http://img.shields.io/npm/v/clarinet.svg?style=flat) [![CDNJS](https://img.shields.io/cdnjs/v/clarinet.svg)](https://cdnjs.com/libraries/clarinet)
-
- `clarinet` is a sax-like streaming parser for JSON that works in the browser and node.js. `clarinet` is inspired by (and forked from) [sax-js][saxjs]. just as you shouldn't use `sax` when you need `dom`, you shouldn't use `clarinet` when you need `JSON.parse`. for a more detailed introduction and a performance study, please refer to this [article][blog].
-
- # design goals
-
- `clarinet` is very much like [yajl] but written in javascript:
-
- * written in javascript
- * portable
- * robust (~110 tests pass before even announcing the project)
- * data representation independent
- * fast
- * generates verbose, useful error messages including context of where
- the error occurs in the input text.
- * can parse json data off a stream, incrementally
- * simple to use
- * tiny
-
- # motivation
-
- the reason behind this work was to create better full text support in node. creating indexes out of large (or many) json files doesn't require a full understanding of the json file, but it does require something like `clarinet`.
-
- # installation
-
- ## node.js
-
- 1. install [npm]
- 2. `npm install clarinet`
- 3. `var clarinet = require('clarinet');`
-
- ## browser
-
- 1. minify clarinet.js
- 2. load it into your webpage
-
- # usage
-
- ## basics
-
- ``` js
- var clarinet = require("clarinet")
- , parser = clarinet.parser()
- ;
-
- parser.onerror = function (e) {
- // an error happened. e is the error.
- };
- parser.onvalue = function (v) {
- // got some value. v is the value. can be string, double, bool, or null.
- };
- parser.onopenobject = function (key) {
- // opened an object. key is the first key.
- };
- parser.onkey = function (key) {
- // got a subsequent key in an object.
- };
- parser.oncloseobject = function () {
- // closed an object.
- };
- parser.onopenarray = function () {
- // opened an array.
- };
- parser.onclosearray = function () {
- // closed an array.
- };
- parser.onend = function () {
- // parser stream is done, and ready to have more stuff written to it.
- };
-
- parser.write('{"foo": "bar"}').close();
- ```
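
as a minimal sketch of the indexing use case mentioned under motivation, the handlers below gather every object key seen in a document. `collectKeys` is a hypothetical helper, not part of `clarinet`; it relies only on the behaviour documented here, namely that `onopenobject` receives the first key (if any) and `onkey` receives the subsequent ones. the demo drives the handlers by hand with the events expected for `{"foo": "bar", "baz": 1}`, so it runs without the library installed.

```javascript
// hypothetical helper, not part of clarinet: attaches key-collecting
// handlers to a parser-like object and returns the array they fill.
function collectKeys(parser) {
  var keys = [];
  parser.onopenobject = function (key) {
    // the first key is optional ("if any"), so skip a missing one
    if (key !== undefined) keys.push(key);
  };
  parser.onkey = function (key) {
    keys.push(key);
  };
  return keys;
}

// stand-in for a real parser so the sketch runs without clarinet;
// with the library installed, pass clarinet.parser() instead.
var fakeParser = {};
var seenKeys = collectKeys(fakeParser);

// events expected for {"foo": "bar", "baz": 1}
fakeParser.onopenobject("foo");
fakeParser.onkey("baz");

console.log(seenKeys); // [ 'foo', 'baz' ]
```

because only two handlers are assigned, values and array events are simply ignored, which is the point of a streaming parser: you pay only for the parts of the document you care about.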
-
- ``` js
- // stream usage
- var fs = require("fs");
- // takes the same options as the parser
- var options = {};
- var stream = require("clarinet").createStream(options);
- stream.on("error", function (e) {
- // unhandled errors will throw, since this is a proper node
- // event emitter.
- console.error("error!", e)
- // clear the error
- this._parser.error = null
- this._parser.resume()
- })
- stream.on("openobject", function (key) {
- // same argument as parser.onopenobject above: the first key
- })
- // pipe is supported, and it's readable/writable
- // same chunks coming in also go out.
- fs.createReadStream("file.json")
- .pipe(stream)
- .pipe(fs.createWriteStream("file-altered.json"))
- ```
-
- ## arguments
-
- pass the following arguments to the parser function. all are optional.
-
- `opt` - object bag of settings regarding string formatting. all default to `false`.
-
- settings supported:
-
- * `trim` - boolean. whether or not to trim text and comment nodes.
- * `normalize` - boolean. if true, then turn any whitespace into a single
- space.
-
- ## methods
-
- `write` - write bytes onto the stream. you don't have to do this all at
- once. you can keep writing as much as you want.
-
- `close` - close the stream. once closed, no more data may be written until
- it is done processing the buffer, which is signaled by the `end` event.
-
- `resume` - to gracefully handle errors, assign a listener to the `error`
- event. then, when the error is taken care of, you can call `resume` to
- continue parsing. otherwise, the parser will not continue while in an error
- state.
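
the recovery flow described for `resume` can be sketched as follows. `attachRecovery` is a hypothetical helper, not part of `clarinet`; it assumes only what this README documents: the error sits on `parser.error` until cleared, `resume` continues parsing, and `line` and `column` report the current position. a small stub stands in for the parser so the sketch runs on its own.

```javascript
// hypothetical helper, not part of clarinet: records each error with the
// documented position members, clears it, and resumes parsing.
function attachRecovery(parser, errors) {
  parser.onerror = function (e) {
    errors.push({ message: e.message, line: parser.line, column: parser.column });
    parser.error = null; // the error must be cleared before parsing continues
    parser.resume();
  };
  return parser;
}

// stand-in exposing only the members used above
var stubParser = {
  line: 3,
  column: 14,
  error: null,
  resumed: 0,
  resume: function () { this.resumed += 1; }
};
var parseErrors = [];
attachRecovery(stubParser, parseErrors);

// simulate the parser reporting an error
stubParser.error = new Error("bad value");
stubParser.onerror(stubParser.error);
```

whether resuming is sensible depends on the input; for a stream of independent documents it may be reasonable, while for a single damaged document stopping is usually safer.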
-
- ## members
-
- at all times, the parser object will have the following members:
-
- `line`, `column`, `position` - indications of the position in the json
- document where the parser currently is looking.
-
- `closed` - boolean indicating whether or not the parser can be written to.
- if it's `true`, then wait for the `ready` event to write again.
-
- `opt` - any options passed into the constructor.
-
- and a bunch of other stuff that you probably shouldn't touch.
-
- ## events
-
- all events emit with a single argument. to listen to an event, assign a
- function to `on<eventname>`. functions get executed in the this-context of
- the parser object. the list of supported events is also available in the exported
- `EVENTS` array.
-
- when using the stream interface, assign handlers using the `EventEmitter`
- `on` function in the normal fashion.
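
a sketch of attaching one listener per event name with `on`. `traceEvents` is a hypothetical helper, and a plain node `EventEmitter` stands in for the stream returned by `createStream`, so nothing here depends on `clarinet` being installed. the event names are passed explicitly; with the library available, the exported `EVENTS` array could presumably be passed instead.

```javascript
var EventEmitter = require("events");

// hypothetical helper: registers a listener for every name in `names`
// and appends [name, argument] pairs to `trace` as events arrive.
function traceEvents(emitter, names, trace) {
  names.forEach(function (name) {
    emitter.on(name, function (arg) {
      trace.push([name, arg]);
    });
  });
  return trace;
}

// stand-in for a clarinet stream
var standIn = new EventEmitter();
var trace = traceEvents(standIn, ["openobject", "value", "closeobject"], []);

// events expected for {"foo": "bar"}
standIn.emit("openobject", "foo");
standIn.emit("value", "bar");
standIn.emit("closeobject");
```

note that including `"error"` in the list of names counts as handling errors, so the emitter will no longer throw on them.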
-
- `error` - indication that something bad happened. the error will be hanging
- out on `parser.error`, and must be deleted before parsing can continue. by
- listening to this event, you can keep an eye on that kind of stuff. note:
- this happens *much* more in strict mode. argument: instance of `Error`.
-
- `value` - a json value. argument: value, which can be a bool, null, string, or number.
-
- `openobject` - object was opened. argument: key, a string with the first key of the object (if any)
-
- `key` - an object key. argument: key, a string with the current key. not called for the first key (use `openobject` for that).
-
- `closeobject` - indication that an object was closed
-
- `openarray` - indication that an array was opened
-
- `closearray` - indication that an array was closed
-
- `end` - indication that the closed stream has ended.
-
- `ready` - indication that the stream has reset, and is ready to be written
- to.
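
to show how the events above fit together, here is a minimal sketch that rebuilds a plain javascript value from them. `createBuilder` is a hypothetical helper, not part of `clarinet`, and the event order fed to it is an assumption derived from the descriptions above (first key on `openobject`, later keys on `key`). you would normally just call `JSON.parse` for this; the sketch is only meant to make the event semantics concrete.

```javascript
// hypothetical helper, not part of clarinet: returns handlers that
// reassemble a value from open/close/key/value events.
function createBuilder() {
  var stack = []; // containers that are currently open
  var keys = [];  // pending key per open container (undefined for arrays)
  var result;

  // put a finished value (or a freshly opened container) where it belongs
  function place(value) {
    if (stack.length === 0) { result = value; return; }
    var top = stack[stack.length - 1];
    if (Array.isArray(top)) top.push(value);
    else top[keys[keys.length - 1]] = value;
  }
  function open(container, firstKey) {
    place(container);
    stack.push(container);
    keys.push(firstKey);
  }
  function close() { stack.pop(); keys.pop(); }

  return {
    onopenobject: function (key) { open({}, key); },
    onkey: function (key) { keys[keys.length - 1] = key; },
    onvalue: place,
    oncloseobject: close,
    onopenarray: function () { open([], undefined); },
    onclosearray: close,
    result: function () { return result; }
  };
}

// hand-fed events assumed for {"a": [1, {"b": null}], "c": true}
var builder = createBuilder();
builder.onopenobject("a");
builder.onopenarray();
builder.onvalue(1);
builder.onopenobject("b");
builder.onvalue(null);
builder.oncloseobject();
builder.onclosearray();
builder.onkey("c");
builder.onvalue(true);
builder.oncloseobject();

var rebuilt = builder.result();
```

with the library installed, the same handlers could be assigned to the matching `on<eventname>` members of a parser before calling `write`.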
-
- ## samples
-
- some [samples] are available to help you get started: one creates a list of top npm contributors, and another gets a bunch of data from twitter and generates valid json.
-
- # roadmap
-
- check [issues]
-
- # contribute
-
- everyone is welcome to contribute patches, bug fixes, and new features.
-
- 1. create an [issue][issues] so the community can comment on your idea
- 2. fork `clarinet`
- 3. create a new branch `git checkout -b my_branch`
- 4. create tests for the changes you made
- 5. make sure you pass both existing and newly inserted tests
- 6. commit your changes
- 7. push to your branch `git push origin my_branch`
- 8. create a pull request
-
- helpful tips:
-
- check `index.html`. there are two env vars you can set, `CRECORD` and `CDEBUG`.
-
- * `CRECORD` allows you to `record` the event sequence from a new json test so you don't have to write everything.
- * `CDEBUG` can be set to `info` or `debug`. `info` will `console.log` all emits, `debug` will `console.log` what happens to each char.
-
- in `test/clarinet.js` there are two lines you might want to change. line `#8` is where you define `seps`; if you are isolating a test you probably just want to run one sep, so change this array to `[undefined]`. line `#718`, which says `for (var key in docs) {`, is where you can change the docs you want to run. e.g. to run `foobar` i would do something like `for (var key in {foobar:''}) {`.
-
- # meta
-
- * code: `git clone git://github.com/dscape/clarinet.git`
- * home: <http://github.com/dscape/clarinet>
- * bugs: <http://github.com/dscape/clarinet/issues>
- * build: [![build status](https://secure.travis-ci.org/dscape/clarinet.png)](http://travis-ci.org/dscape/clarinet)
-
- `(oO)--',-` in [caos]
-
- [npm]: http://npmjs.org
- [issues]: http://github.com/dscape/clarinet/issues
- [caos]: http://caos.di.uminho.pt/
- [saxjs]: http://github.com/isaacs/sax-js
- [yajl]: https://github.com/lloyd/yajl
- [samples]: https://github.com/dscape/clarinet/tree/master/samples
- [blog]: http://writings.nunojob.com/2011/12/clarinet-sax-based-evented-streaming-json-parser-in-javascript-for-the-browser-and-nodejs.html
|