|
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111 |
- text-encoding
- ==============
-
- This is a polyfill for the [Encoding Living
- Standard](https://encoding.spec.whatwg.org/) API for the Web, allowing
- encoding and decoding of textual data to and from Typed Array buffers
- for binary data in JavaScript.
-
- By default it adheres to the spec and does not support *encoding* to
- legacy encodings, only *decoding*. It is also implemented to match the
- specification's algorithms, rather than for performance. The intended
- use is within Web pages, so it has no dependency on server frameworks
- or particular module schemes.
-
- Basic examples and tests are included.
-
- ### Install ###
-
- There are a few ways you can get and use the `text-encoding` library.
-
- ### HTML Page Usage ###
-
- Clone the repo and include the files directly:
-
- ```html
- <!-- Required for non-UTF encodings -->
- <script src="encoding-indexes.js"></script>
- <script src="encoding.js"></script>
- ```
-
- This is the only use case the developer cares about. If you want those
- fancy module and/or package manager things that are popular these days
- you should probably use a different library.
-
- #### Package Managers ####
-
- The package is published to **npm** and **bower** as `@sinonjs/text-encoding`.
- Use through these is not really supported, since they aren't used by
- the developer of the library. Using `require()` in interesting ways
- probably breaks. Patches welcome, as long as they don't break the
- basic use of the files via `<script>`.
-
- ### API Overview ###
-
- Basic Usage
-
- ```js
- var uint8array = new TextEncoder().encode(string);
- var string = new TextDecoder(encoding).decode(uint8array);
- ```
-
- Streaming Decode
-
- ```js
- var string = "", decoder = new TextDecoder(encoding), buffer;
- while (buffer = next_chunk()) {
- string += decoder.decode(buffer, {stream:true});
- }
- string += decoder.decode(); // finish the stream
- ```
-
- ### Encodings ###
-
- All encodings from the Encoding specification are supported:
-
- utf-8 ibm866 iso-8859-2 iso-8859-3 iso-8859-4 iso-8859-5 iso-8859-6
- iso-8859-7 iso-8859-8 iso-8859-8-i iso-8859-10 iso-8859-13 iso-8859-14
- iso-8859-15 iso-8859-16 koi8-r koi8-u macintosh windows-874
- windows-1250 windows-1251 windows-1252 windows-1253 windows-1254
- windows-1255 windows-1256 windows-1257 windows-1258 x-mac-cyrillic
- gb18030 hz-gb-2312 big5 euc-jp iso-2022-jp shift_jis euc-kr
- replacement utf-16be utf-16le x-user-defined
-
- (Some encodings may be supported under other names, e.g. ascii,
- iso-8859-1, etc. See [Encoding](https://encoding.spec.whatwg.org/) for
- additional labels for each encoding.)
-
- Encodings other than **utf-8**, **utf-16le** and **utf-16be** require
- an additional `encoding-indexes.js` file to be included. It is rather
- large (596kB uncompressed, 188kB gzipped); portions may be deleted if
- support for some encodings is not required.
-
- ### Non-Standard Behavior ###
-
- As required by the specification, only encoding to **utf-8** is
- supported. If you want to try it out, you can force a non-standard
- behavior by passing the `NONSTANDARD_allowLegacyEncoding` option to
- TextEncoder and a label. For example:
-
- ```js
- var uint8array = new TextEncoder(
- 'windows-1252', { NONSTANDARD_allowLegacyEncoding: true }).encode(text);
- ```
-
- But note that the above won't work if you're using the polyfill in a
- browser that natively supports the TextEncoder API natively, since the
- polyfill won't be used!
-
- You can force the polyfill to be used by using this before the polyfill:
-
- ```html
- <script>
- window.TextEncoder = window.TextDecoder = null;
- </script>
- ```
-
- To support the legacy encodings (which may be stateful), the
- TextEncoder `encode()` method accepts an optional dictionary and
- `stream` option, e.g. `encoder.encode(string, {stream: true});` This
- is not needed for standard encoding since the input is always in
- complete code points.
|