Sunday, October 16, 2011

Java ZipInputStream with IBM PC or MS-DOS code page 437 (CP437) support

Preamble

This article shows how to extend the existing Java ZipInputStream so that it also handles entry names encoded in IBM PC or MS-DOS code page 437 (CP437).

JDK and Eclipse compatibility

JDK version 1.4 with Eclipse 3.3 is the minimum requirement.

Project overview

The downloadable zip file contains an Eclipse project whose MainClass demonstrates sample usage.

The core of the project is the ExtendedZipInputStream class, which supports everything the Java ZipInputStream provides and adds support for extended ASCII file names.

Unzip the download and import the project into Eclipse. Create a run configuration for MainClass and you are all set. The key logic resides in the getString method:

    /**
     * Decodes a string from the specified byte array. UTF-8 (or the
     * configured charset) is tried first; if that fails, CP437 is used.
     *
     * @param b   The byte array to read the string from
     * @param off The offset to start reading from
     * @param len The number of bytes to read
     * @return Either the UTF-8 or the CP437 string
     */
    private String getString(byte[] b, int off, int len) {
        String name;

        try {
            if (this.charset == null) {
                // No charset configured: call the default UTF-8 read method via reflection
                name = (String) Reflect.invoke(this, DEF_READ_UTF8_METHOD, b, off, len);
            } else {
                name = new String(b, off, len, this.charset);
            }
        } catch (Exception e) {
            // Unable to determine the UTF-8 file name, so use CP437
            name = this.getCP437String(b, off, len);
        }
        return name;
    }
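
For readers who want to try the fallback idea without the downloadable project, here is a minimal standalone sketch. It is not the code from ExtendedZipInputStream: the class name Cp437Fallback and the method decodeName are made up for illustration, and it assumes Java 7 or later (for StandardCharsets) and a runtime that ships the IBM437 charset. Since the String constructors silently replace malformed bytes instead of throwing, the sketch uses a CharsetDecoder configured to report errors, so that invalid UTF-8 actually triggers the CP437 fallback.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the downloadable project.
final class Cp437Fallback {

    // Assumption: the runtime provides the IBM437 (CP437) charset.
    private static final Charset CP437 = Charset.forName("IBM437");

    private Cp437Fallback() {
    }

    /**
     * Decodes len bytes of b starting at off as strict UTF-8 and
     * falls back to CP437 when the bytes are not valid UTF-8.
     */
    static String decodeName(byte[] b, int off, int len) {
        try {
            // REPORT makes the decoder throw on invalid input
            // instead of substituting a replacement character.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(b, off, len))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8, so treat the bytes as CP437.
            return new String(b, off, len, CP437);
        }
    }
}
```

For example, calling decodeName on the three bytes 0x82 0x74 0x82 returns "été": a lone 0x82 is not valid UTF-8, and CP437 maps it to é. One caveat of this approach is that a CP437 name whose bytes happen to form valid UTF-8 will be decoded as UTF-8.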
Enjoy! If you like it, please show your appreciation.
