Package com.ibm.icu.text
Class UnicodeDecompressor
java.lang.Object
    com.ibm.icu.text.UnicodeDecompressor
public final class UnicodeDecompressor extends Object
A decompression engine implementing the Standard Compression Scheme for Unicode (SCSU) as outlined in Unicode Technical Report #6.
USAGE
The static methods on UnicodeDecompressor may be used in a straightforward manner to decompress simple strings:
byte [] compressed = ... ; // get compressed bytes from somewhere
String result = UnicodeDecompressor.decompress(compressed);
The static methods have a fairly large memory footprint. For finer-grained control over memory usage, UnicodeDecompressor offers more powerful APIs allowing iterative decompression:
// Decompress an array "bytes" of length "len" using a buffer of 512 chars
// to the Writer "out"
UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
final int BUFSIZE = 512; // a local variable cannot be declared static
char [] charBuffer = new char [ BUFSIZE ];
int charsWritten = 0;
int [] bytesRead = new int [1];
int totalBytesDecompressed = 0;
int totalCharsWritten = 0;
do {
    // do the decompression
    charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed, len, bytesRead, charBuffer, 0, BUFSIZE);
    // do something with the current set of chars
    out.write(charBuffer, 0, charsWritten);
    // update the no. of bytes decompressed
    totalBytesDecompressed += bytesRead[0];
    // update the no. of chars written
    totalCharsWritten += charsWritten;
} while(totalBytesDecompressed < len);
myDecompressor.reset(); // reuse decompressor
Decompression is performed according to the standard set forth in Unicode Technical Report #6.
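For orientation, the sketch below decodes a small subset of SCSU single-byte mode as UTR #6 describes it: bytes that stand for themselves, bytes that select a character in the active dynamic window, and the SC0..SC7 tags that change the active window. It is not ICU4J's implementation. The class name ScsuSubsetSketch and its decode method are hypothetical, the window offsets are the initial values listed in UTR #6, and every other tag is rejected rather than handled.

```java
// Simplified sketch of SCSU single-byte mode (UTR #6); not ICU4J's code.
final class ScsuSubsetSketch {
    // Initial dynamic window offsets as listed in UTR #6.
    private static final int[] INITIAL_DYNAMIC = {
        0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00
    };

    /** Decodes pass-through bytes, active-window bytes, and SCn tags only. */
    static String decode(byte[] in) {
        int active = 0;                       // window 0 is active at the start
        StringBuilder out = new StringBuilder();
        for (byte value : in) {
            int b = value & 0xFF;             // treat the byte as unsigned
            if (b >= 0x80) {
                // High bytes select a character in the active dynamic window.
                out.append((char) (INITIAL_DYNAMIC[active] + (b - 0x80)));
            } else if (b >= 0x20 || b == 0x00 || b == 0x09 || b == 0x0A || b == 0x0D) {
                // Printable ASCII plus NUL, TAB, LF, and CR stand for themselves.
                out.append((char) b);
            } else if (b >= 0x10 && b <= 0x17) {
                // SC0..SC7: make dynamic window n the active window.
                active = b - 0x10;
            } else {
                throw new IllegalArgumentException(
                    "tag 0x" + Integer.toHexString(b) + " is outside this sketch");
            }
        }
        return out.toString();
    }
}
```

With these assumptions, the seven bytes 12 9C BE C1 BA B2 B0 select the Cyrillic window and decode to six Cyrillic letters, which illustrates why SCSU output for small alphabets is close to one byte per character.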
- Author:
- Stephen F. Booth
- See Also:
UnicodeCompressor
-
-
Field Summary
Fields
Modifier and Type    Field                      Description
static int           ARMENIANINDEX
static int           COMPRESSIONOFFSET
static int           GREEKINDEX
static int           HALFWIDTHKATAKANAINDEX
static int           HIRAGANAINDEX
static int           INVALIDCHAR
static int           INVALIDWINDOW
static int           IPAEXTENSIONINDEX
static int           KATAKANAINDEX
static int           LATININDEX
static int           MAXINDEX
static int           NUMSTATICWINDOWS
static int           NUMWINDOWS
static int           RESERVEDINDEX
static int           SCHANGE0
static int           SCHANGE1
static int           SCHANGE2
static int           SCHANGE3
static int           SCHANGE4
static int           SCHANGE5
static int           SCHANGE6
static int           SCHANGE7
static int           SCHANGEU
static int           SDEFINE0
static int           SDEFINE1
static int           SDEFINE2
static int           SDEFINE3
static int           SDEFINE4
static int           SDEFINE5
static int           SDEFINE6
static int           SDEFINE7
static int           SDEFINEX
static int           SINGLEBYTEMODE
static int[]         sOffsets                   Static compression window offsets
static int[]         sOffsetTable               For window offset mapping
static int           SQUOTE0
static int           SQUOTE1
static int           SQUOTE2
static int           SQUOTE3
static int           SQUOTE4
static int           SQUOTE5
static int           SQUOTE6
static int           SQUOTE7
static int           SQUOTEU
static int           SRESERVED
static int           UCHANGE0
static int           UCHANGE1
static int           UCHANGE2
static int           UCHANGE3
static int           UCHANGE4
static int           UCHANGE5
static int           UCHANGE6
static int           UCHANGE7
static int           UDEFINE0
static int           UDEFINE1
static int           UDEFINE2
static int           UDEFINE3
static int           UDEFINE4
static int           UDEFINE5
static int           UDEFINE6
static int           UDEFINE7
static int           UDEFINEX
static int           UNICODEMODE
static int           UQUOTEU
static int           URESERVED
-
Constructor Summary
Constructors
Constructor              Description
UnicodeDecompressor()    Create a UnicodeDecompressor.
-
Method Summary
Modifier and Type    Method    Description
static String    decompress(byte[] buffer)    Decompress a byte array into a String.
static char[]    decompress(byte[] buffer, int start, int limit)    Decompress a byte array into a Unicode character array.
int    decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)    Decompress a byte array into a Unicode character array.
void    reset()    Reset the decompressor to its initial state.
-
-
-
Field Detail
-
COMPRESSIONOFFSET
public static final int COMPRESSIONOFFSET
- See Also:
- Constant Field Values
-
NUMWINDOWS
public static final int NUMWINDOWS
- See Also:
- Constant Field Values
-
NUMSTATICWINDOWS
public static final int NUMSTATICWINDOWS
- See Also:
- Constant Field Values
-
INVALIDWINDOW
public static final int INVALIDWINDOW
- See Also:
- Constant Field Values
-
INVALIDCHAR
public static final int INVALIDCHAR
- See Also:
- Constant Field Values
-
SINGLEBYTEMODE
public static final int SINGLEBYTEMODE
- See Also:
- Constant Field Values
-
UNICODEMODE
public static final int UNICODEMODE
- See Also:
- Constant Field Values
-
MAXINDEX
public static final int MAXINDEX
- See Also:
- Constant Field Values
-
RESERVEDINDEX
public static final int RESERVEDINDEX
- See Also:
- Constant Field Values
-
LATININDEX
public static final int LATININDEX
- See Also:
- Constant Field Values
-
IPAEXTENSIONINDEX
public static final int IPAEXTENSIONINDEX
- See Also:
- Constant Field Values
-
GREEKINDEX
public static final int GREEKINDEX
- See Also:
- Constant Field Values
-
ARMENIANINDEX
public static final int ARMENIANINDEX
- See Also:
- Constant Field Values
-
HIRAGANAINDEX
public static final int HIRAGANAINDEX
- See Also:
- Constant Field Values
-
KATAKANAINDEX
public static final int KATAKANAINDEX
- See Also:
- Constant Field Values
-
HALFWIDTHKATAKANAINDEX
public static final int HALFWIDTHKATAKANAINDEX
- See Also:
- Constant Field Values
-
SDEFINEX
public static final int SDEFINEX
- See Also:
- Constant Field Values
-
SRESERVED
public static final int SRESERVED
- See Also:
- Constant Field Values
-
SQUOTEU
public static final int SQUOTEU
- See Also:
- Constant Field Values
-
SCHANGEU
public static final int SCHANGEU
- See Also:
- Constant Field Values
-
SQUOTE0
public static final int SQUOTE0
- See Also:
- Constant Field Values
-
SQUOTE1
public static final int SQUOTE1
- See Also:
- Constant Field Values
-
SQUOTE2
public static final int SQUOTE2
- See Also:
- Constant Field Values
-
SQUOTE3
public static final int SQUOTE3
- See Also:
- Constant Field Values
-
SQUOTE4
public static final int SQUOTE4
- See Also:
- Constant Field Values
-
SQUOTE5
public static final int SQUOTE5
- See Also:
- Constant Field Values
-
SQUOTE6
public static final int SQUOTE6
- See Also:
- Constant Field Values
-
SQUOTE7
public static final int SQUOTE7
- See Also:
- Constant Field Values
-
SCHANGE0
public static final int SCHANGE0
- See Also:
- Constant Field Values
-
SCHANGE1
public static final int SCHANGE1
- See Also:
- Constant Field Values
-
SCHANGE2
public static final int SCHANGE2
- See Also:
- Constant Field Values
-
SCHANGE3
public static final int SCHANGE3
- See Also:
- Constant Field Values
-
SCHANGE4
public static final int SCHANGE4
- See Also:
- Constant Field Values
-
SCHANGE5
public static final int SCHANGE5
- See Also:
- Constant Field Values
-
SCHANGE6
public static final int SCHANGE6
- See Also:
- Constant Field Values
-
SCHANGE7
public static final int SCHANGE7
- See Also:
- Constant Field Values
-
SDEFINE0
public static final int SDEFINE0
- See Also:
- Constant Field Values
-
SDEFINE1
public static final int SDEFINE1
- See Also:
- Constant Field Values
-
SDEFINE2
public static final int SDEFINE2
- See Also:
- Constant Field Values
-
SDEFINE3
public static final int SDEFINE3
- See Also:
- Constant Field Values
-
SDEFINE4
public static final int SDEFINE4
- See Also:
- Constant Field Values
-
SDEFINE5
public static final int SDEFINE5
- See Also:
- Constant Field Values
-
SDEFINE6
public static final int SDEFINE6
- See Also:
- Constant Field Values
-
SDEFINE7
public static final int SDEFINE7
- See Also:
- Constant Field Values
-
UCHANGE0
public static final int UCHANGE0
- See Also:
- Constant Field Values
-
UCHANGE1
public static final int UCHANGE1
- See Also:
- Constant Field Values
-
UCHANGE2
public static final int UCHANGE2
- See Also:
- Constant Field Values
-
UCHANGE3
public static final int UCHANGE3
- See Also:
- Constant Field Values
-
UCHANGE4
public static final int UCHANGE4
- See Also:
- Constant Field Values
-
UCHANGE5
public static final int UCHANGE5
- See Also:
- Constant Field Values
-
UCHANGE6
public static final int UCHANGE6
- See Also:
- Constant Field Values
-
UCHANGE7
public static final int UCHANGE7
- See Also:
- Constant Field Values
-
UDEFINE0
public static final int UDEFINE0
- See Also:
- Constant Field Values
-
UDEFINE1
public static final int UDEFINE1
- See Also:
- Constant Field Values
-
UDEFINE2
public static final int UDEFINE2
- See Also:
- Constant Field Values
-
UDEFINE3
public static final int UDEFINE3
- See Also:
- Constant Field Values
-
UDEFINE4
public static final int UDEFINE4
- See Also:
- Constant Field Values
-
UDEFINE5
public static final int UDEFINE5
- See Also:
- Constant Field Values
-
UDEFINE6
public static final int UDEFINE6
- See Also:
- Constant Field Values
-
UDEFINE7
public static final int UDEFINE7
- See Also:
- Constant Field Values
-
UQUOTEU
public static final int UQUOTEU
- See Also:
- Constant Field Values
-
UDEFINEX
public static final int UDEFINEX
- See Also:
- Constant Field Values
-
URESERVED
public static final int URESERVED
- See Also:
- Constant Field Values
-
sOffsetTable
public static final int[] sOffsetTable
For window offset mapping
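As background for the window offset mapping, UTR #6 defines how the index byte that follows a define-window tag is turned into a window offset. The sketch below follows that table from the specification. It is an illustration only: the class WindowOffsetSketch and the method offsetFor are hypothetical names, and whether sOffsetTable stores exactly these values is an assumption that should be checked against the Constant Field Values page and the ICU4J source.

```java
// Index-byte to window-offset mapping as tabulated in UTR #6 (illustrative).
final class WindowOffsetSketch {
    /** Returns the window offset for an index byte, or -1 if the index is reserved. */
    static int offsetFor(int index) {
        if (index >= 0x01 && index <= 0x67) {
            return index * 0x80;              // half-blocks from U+0080 to U+3380
        }
        if (index >= 0x68 && index <= 0xA7) {
            return index * 0x80 + 0xAC00;     // half-blocks from U+E000 to U+FF80
        }
        switch (index) {
            case 0xF9: return 0x00C0;         // Latin-1 letters and Latin Extended-A
            case 0xFA: return 0x0250;         // IPA Extensions
            case 0xFB: return 0x0370;         // Greek
            case 0xFC: return 0x0530;         // Armenian
            case 0xFD: return 0x3040;         // Hiragana
            case 0xFE: return 0x30A0;         // Katakana
            case 0xFF: return 0xFF60;         // Halfwidth Katakana
            default:   return -1;             // 0x00 and 0xA8..0xF8 are reserved
        }
    }
}
```

The seven special cases line up by name with the LATININDEX through HALFWIDTHKATAKANAINDEX fields listed above, which suggests those constants identify the same indices.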
-
sOffsets
public static final int[] sOffsets
Static compression window offsets
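In UTR #6 the static windows are used by the quote tags of single-byte mode: the byte after SQn addresses static window n when it is below 0x80 and dynamic window n otherwise. The sketch below shows that rule with the static offsets listed in the specification. The class QuoteSketch, the method quoted, and its dynamicOffset parameter are hypothetical, and the sketch assumes offsets inside the Basic Multilingual Plane so that the result fits in a single char.

```java
// Resolving the byte that follows an SQn tag, per UTR #6 (illustrative).
final class QuoteSketch {
    // Static window offsets as listed in UTR #6.
    private static final int[] STATIC_OFFSETS = {
        0x0000, 0x0080, 0x0100, 0x0300, 0x2000, 0x2080, 0x2100, 0x3000
    };

    /**
     * @param window        the n of SQn, in the range 0..7
     * @param nextByte      the byte following the tag
     * @param dynamicOffset current offset of dynamic window n (assumed to be in the BMP)
     */
    static char quoted(int window, int nextByte, int dynamicOffset) {
        int b = nextByte & 0xFF;
        if (b < 0x80) {
            return (char) (STATIC_OFFSETS[window] + b);   // low byte: static window n
        }
        return (char) (dynamicOffset + (b - 0x80));       // high byte: dynamic window n
    }
}
```

A quote affects only the one character that follows it; the active window is left unchanged.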
-
-
Constructor Detail
-
UnicodeDecompressor
public UnicodeDecompressor()
Create a UnicodeDecompressor. Sets all windows to their default values.
- See Also:
reset()
-
-
Method Detail
-
decompress
public static String decompress(byte[] buffer)
Decompress a byte array into a String.
- Parameters:
buffer - The byte array to decompress.
- Returns:
- A String containing the decompressed characters.
- See Also:
decompress(byte [], int, int)
-
decompress
public static char[] decompress(byte[] buffer, int start, int limit)
Decompress a byte array into a Unicode character array.
- Parameters:
buffer - The byte array to decompress.
start - The start of the byte run to decompress.
limit - The limit of the byte run to decompress.
- Returns:
- A character array containing the decompressed characters.
- See Also:
decompress(byte [])
-
decompress
public int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
Decompress a byte array into a Unicode character array. This function will either completely fill the output buffer, or consume the entire input.
- Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The start of the byte run to decompress.
byteBufferLimit - The limit of the byte run to decompress.
bytesRead - A one-element array. If not null, on return it holds the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at least two characters in size.
charBufferStart - The starting offset to which to write decompressed data.
charBufferLimit - The limiting offset for writing decompressed data.
- Returns:
- The number of Unicode characters written to charBuffer.
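The contract above (fill the output buffer or consume the input, report progress through bytesRead[0], return the count of characters written) implies a simple driver loop. The sketch below runs without ICU4J by using a hypothetical ChunkDecoder interface that copies this method's parameter list, plus a toy Latin-1 decoder in place of SCSU. Every name in it is an assumption made for illustration.

```java
// Driver loop for a decoder with the same parameter list as decompress (illustrative).
final class ChunkedDecodeSketch {
    /** Hypothetical stand-in mirroring the instance decompress signature. */
    interface ChunkDecoder {
        int decode(byte[] in, int inStart, int inLimit, int[] bytesRead,
                   char[] out, int outStart, int outLimit);
    }

    /** Toy decoder, not SCSU: each Latin-1 byte becomes one char. */
    static final ChunkDecoder LATIN1 =
        (in, inStart, inLimit, bytesRead, out, outStart, outLimit) -> {
            int n = Math.min(inLimit - inStart, outLimit - outStart);
            for (int i = 0; i < n; i++) {
                out[outStart + i] = (char) (in[inStart + i] & 0xFF);
            }
            if (bytesRead != null) {
                bytesRead[0] = n;             // report how much input was consumed
            }
            return n;                         // number of chars written
        };

    /** Calls the decoder repeatedly until all input has been consumed. */
    static String drain(ChunkDecoder decoder, byte[] input, int bufferSize) {
        char[] buffer = new char[bufferSize];
        int[] bytesRead = new int[1];
        int consumed = 0;
        StringBuilder result = new StringBuilder();
        while (consumed < input.length) {
            int written = decoder.decode(input, consumed, input.length, bytesRead,
                                         buffer, 0, buffer.length);
            result.append(buffer, 0, written);
            if (written == 0 && bytesRead[0] == 0) {
                throw new IllegalStateException("decoder made no progress");
            }
            consumed += bytesRead[0];         // resume where the decoder stopped
        }
        return result.toString();
    }
}
```

Passing the running total as the start offset and the full length as the limit, as in the usage example at the top of this page, is what lets each call resume exactly where the previous one stopped.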
-
reset
public void reset()
Reset the decompressor to its initial state.
-
-