I'm trying to figure out a clean way to parse streaming JSON with Jackson. "Streaming" as in TCP, off-the-wire, in a piecemeal fashion without any guarantee of receiving complete JSON data in a single read (no message framing either). Also, the goal is to do this asynchronously, which rules out relying on Jackson's handling of java.io.InputStreams. I came up with a functioning solution (see demonstration below), but I'm not particularly happy with it. Imperative style aside, I don't like the ungraceful handling of incomplete JSON by JsonParser#readValueAsTree. When processing a stream of bytes, incomplete data is absolutely normal and is not an exceptional scenario, so it's strange (and unacceptable) to see java.io.IOExceptions in Jackson's APIs. I also looked into using Jackson's TokenBuffer, but ran into similar issues. Is Jackson not really meant for processing true streaming JSON?
package com.example.jackson;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import static java.nio.charset.StandardCharsets.UTF_8;
import static java.util.Collections.emptyList;
public class AsyncJsonParsing {
public static void main(String[] args) {
final AsyncJsonParsing parsing = new AsyncJsonParsing();
parsing.runFirstScenario();
parsing.runSecondScenario();
parsing.runThirdScenario();
parsing.runFourthScenario();
}
static final class ParsingOutcome {
final List<JsonNode> roots;//list of parsed JSON objects and JSON arrays
final byte[] remainder;
ParsingOutcome(final List<JsonNode> roots, final byte[] remainder) {
this.roots = roots;
this.remainder = remainder;
}
}
final byte[] firstMessage = "{\"message\":\"first\"}".getBytes(UTF_8);
final byte[] secondMessage = "{\"message\":\"second\"}".getBytes(UTF_8);
final byte[] leadingHalfOfFirstMessage = Arrays.copyOfRange(firstMessage, 0, firstMessage.length / 2);
final byte[] trailingHalfOfFirstMessage = Arrays.copyOfRange(firstMessage, firstMessage.length / 2, firstMessage.length);
final byte[] leadingHalfOfSecondMessage = Arrays.copyOfRange(secondMessage, 0, secondMessage.length / 2);
final byte[] trailingHalfOfSecondMessage = Arrays.copyOfRange(secondMessage, secondMessage.length / 2, secondMessage.length);
final ObjectMapper mapper = new ObjectMapper();
void runFirstScenario() {
//expectation: remainder = empty array and roots has a single element - parsed firstMessage
final ParsingOutcome result = parse(firstMessage, mapper);
report(result);
}
void runSecondScenario() {
//expectation: remainder = leadingHalfOfFirstMessage and roots is empty
final ParsingOutcome firstResult = parse(leadingHalfOfFirstMessage, mapper);
report(firstResult);
//expectation: remainder = empty array and roots has a single element - parsed firstMessage
final ParsingOutcome secondResult = parse(concat(firstResult.remainder, trailingHalfOfFirstMessage), mapper);
report(secondResult);
}
void runThirdScenario() {
//expectation: remainder = leadingHalfOfSecondMessage and roots has a single element - parsed firstMessage
final ParsingOutcome firstResult = parse(concat(firstMessage, leadingHalfOfSecondMessage), mapper);
report(firstResult);
//expectation: remainder = empty array and roots has a single element - parsed secondMessage
final ParsingOutcome secondResult = parse(concat(firstResult.remainder, trailingHalfOfSecondMessage), mapper);
report(secondResult);
}
void runFourthScenario() {
//expectation: remainder = empty array and roots has two elements - parsed firstMessage, followed by parsed secondMessage
final ParsingOutcome result = parse(concat(firstMessage, secondMessage), mapper);
report(result);
}
static void report(final ParsingOutcome result) {
System.out.printf("Remainder of length %d: %s%n", result.remainder.length, Arrays.toString(result.remainder));
System.out.printf("Total of %d parsed JSON roots: %s%n", result.roots.size(), result.roots);
}
static byte[] concat(final byte[] left, final byte[] right) {
final byte[] union = Arrays.copyOf(left, left.length + right.length);
System.arraycopy(right, 0, union, left.length, right.length);
return union;
}
static ParsingOutcome parse(final byte[] chunk, final ObjectMapper mapper) {
final List<JsonNode> roots = new LinkedList<>();
JsonParser parser;
JsonNode root;
try {
parser = mapper.getFactory().createParser(chunk);
root = parser.readValueAsTree();
} catch (IOException e) {
return new ParsingOutcome(emptyList(), chunk);
}
byte[] remainder = new byte[0];
try {
while(root != null) {
roots.add(root);
remainder = extractRemainder(parser);
root = parser.readValueAsTree();
}
} catch (IOException e) {
//fallthrough
}
return new ParsingOutcome(roots, remainder);
}
static byte[] extractRemainder(final JsonParser parser) {
try {
final ByteArrayOutputStream baos = new ByteArrayOutputStream();
parser.releaseBuffered(baos);
return baos.toByteArray();
} catch (IOException e) {
return new byte[0];
}
}
}
To elaborate a bit further, conceptually (at least in my mind), parsing of any streaming data boils down to a simple function which accepts an array of bytes and returns a tuple of (1) a possibly empty list of parsed results and (2) an array of remaining, currently-unparsable bytes. In the snippet above, this tuple is represented by an instance of ParsingOutcome.
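The bytes-in, (results, remainder)-out function described above can be sketched without any JSON library by tracking nesting depth. This is only an illustration of the framing idea, not Jackson's API: JsonFrameSplitter and Split are hypothetical names, the sketch assumes every root is a JSON object or array (as in ParsingOutcome.roots), and it does not validate the JSON it slices.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: cuts a chunk into complete top-level JSON objects/arrays plus a remainder. */
final class JsonFrameSplitter {

    /** Result tuple: complete documents (still raw bytes) and the not-yet-complete tail. */
    static final class Split {
        final List<byte[]> documents;
        final byte[] remainder;

        Split(List<byte[]> documents, byte[] remainder) {
            this.documents = documents;
            this.remainder = remainder;
        }
    }

    static Split split(byte[] chunk) {
        List<byte[]> documents = new ArrayList<>();
        int depth = 0;            // current nesting level of {} and []
        boolean inString = false; // inside a JSON string literal
        boolean escaped = false;  // previous byte was a backslash inside a string
        int start = 0;            // index where the current (incomplete) document begins
        for (int i = 0; i < chunk.length; i++) {
            byte b = chunk[i];
            if (inString) {
                // Brackets inside strings must not affect the depth.
                if (escaped) {
                    escaped = false;
                } else if (b == '\\') {
                    escaped = true;
                } else if (b == '"') {
                    inString = false;
                }
            } else if (b == '"') {
                inString = true;
            } else if (b == '{' || b == '[') {
                if (depth == 0) {
                    start = i; // skips whitespace between documents
                }
                depth++;
            } else if ((b == '}' || b == ']') && depth > 0) {
                depth--;
                if (depth == 0) {
                    documents.add(Arrays.copyOfRange(chunk, start, i + 1));
                    start = i + 1;
                }
            }
        }
        return new Split(documents, Arrays.copyOfRange(chunk, start, chunk.length));
    }

    public static void main(String[] args) {
        byte[] chunk = "{\"message\":\"first\"}{\"mess".getBytes(StandardCharsets.UTF_8);
        Split result = split(chunk);
        System.out.println("complete documents: " + result.documents.size());
        System.out.println("remainder: " + new String(result.remainder, StandardCharsets.UTF_8));
    }
}
```

Each complete slice could then be handed to mapper.readTree(byte[]), so the mapper only ever sees whole documents and incomplete input never surfaces as an IOException. The remainder is prepended to the next chunk, exactly as concat does in the scenarios above.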
I'm reading and writing to a ByteBuffer
import org.assertj.core.api.Assertions;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
public class Solution{
public static void main(String[] args) throws Exception{
final CharsetEncoder messageEncoder = Charset.forName("ISO-8859-1").newEncoder();
String message = "TRANSACTION IGNORED";
String carrierName= "CARR00AB";
int messageLength = message.length()+carrierName.length()+8;
System.out.println(" --------Fill data---------");
ByteBuffer messageBuffer = ByteBuffer.allocate(4096);
messageBuffer.order(ByteOrder.BIG_ENDIAN);
messageBuffer.putInt(messageLength);
messageBuffer.put(messageEncoder.encode(CharBuffer.wrap(carrierName)));
messageBuffer.put(messageEncoder.encode(CharBuffer.wrap(message)));
messageBuffer.put((byte) 0x2b);
messageBuffer.flip();
System.out.println("------------Extract Data Approach 1--------");
CharsetDecoder messageDecoder = Charset.forName("ISO-8859-1").newDecoder();
int lengthField = messageBuffer.getInt();
System.out.println("lengthField="+lengthField);
int responseLength = lengthField - 12;
System.out.println("responseLength="+responseLength);
String messageDecoded= messageDecoder.decode(messageBuffer).toString();
System.out.println("messageDecoded="+messageDecoded);
String decodedCarrier = messageDecoded.substring(0, carrierName.length());
System.out.println("decodedCarrier="+ decodedCarrier);
String decodedBody = messageDecoded.substring(carrierName.length(), messageDecoded.length() - 1);
System.out.println("decodedBody="+decodedBody);
Assertions.assertThat(messageLength).isEqualTo(lengthField);
Assertions.assertThat(decodedBody).isEqualTo(message);
Assertions.assertThat(decodedCarrier).isEqualTo(carrierName);
ByteBuffer messageBuffer2 = ByteBuffer.allocate(4096);
messageBuffer2.order(ByteOrder.BIG_ENDIAN);
messageBuffer2.putInt(messageLength);
messageBuffer2.put(messageEncoder.encode(CharBuffer.wrap(carrierName)));
messageBuffer2.put(messageEncoder.encode(CharBuffer.wrap(message)));
messageBuffer2.put((byte) 0x2b);
messageBuffer2.flip();
System.out.println("---------Extract Data Approach 2--------");
byte [] data = new byte[messageBuffer2.limit()];
messageBuffer2.get(data);
String dataString =new String(data, "ISO-8859-1");
System.out.println(dataString);
}
}
It works fine, but then I decided to refactor it. Please see approach 2 in the code above:
byte [] data = new byte[messageBuffer.limit()];
messageBuffer.get(data);
String dataString =new String(data, "ISO-8859-1");
System.out.println(dataString);
Output= #CARR00ABTRANSACTION IGNORED+
Could you help me with an explanation?
Why does the integer go missing in the second approach while decoding?
Is there any way to extract the integer in the second approach?
Okay, so you are reading an int from the buffer, which takes up 4 bytes, and then trying to get the whole data after those 4 bytes have already been consumed.
What I have done to resolve this is call messageBuffer2.clear() after reading the int. Here is the full code:
System.out.println(messageBuffer2.getInt());
byte[] data = new byte[messageBuffer2.limit()];
messageBuffer2.clear();
messageBuffer2.get(data);
String dataString = new String(data, StandardCharsets.ISO_8859_1);
System.out.println(dataString);
Output is:
35
#CARR0033TRANSACTION IGNORED+
Edit: Calling clear() resets the buffer's position to zero (and its limit to the capacity), so the following get(data) starts reading from the beginning again. That is how it fixes the problem.
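As a rough illustration of the position handling discussed above, the sketch below builds a small length-prefixed frame and reads it in three ways: an absolute getInt(0) that does not move the position, a relative getInt() followed by the body, and a full copy taken from a rewound duplicate. ByteBufferPositionDemo and its method names are made up for this example. Note also that in the question's second approach the integer is not actually lost: 35 is 0x23, which is the character '#', so the four length bytes show up in the string as three NUL characters followed by '#'.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of relative vs. absolute reads on a ByteBuffer. */
final class ByteBufferPositionDemo {

    /** Builds a frame: 4-byte big-endian length followed by ISO-8859-1 text, ready for reading. */
    static ByteBuffer frame(String text) {
        byte[] body = text.getBytes(StandardCharsets.ISO_8859_1);
        ByteBuffer buffer = ByteBuffer.allocate(4 + body.length);
        buffer.putInt(body.length).put(body);
        buffer.flip(); // limit = bytes written, position = 0
        return buffer;
    }

    /** Relative reads: getInt() advances the position by 4, get(body) by body.length. */
    static String readBody(ByteBuffer buffer) {
        int length = buffer.getInt();
        byte[] body = new byte[length];
        buffer.get(body);
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    /** Copies every byte from 0 to the limit without touching the caller's position. */
    static byte[] allBytes(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // shares content, has its own position
        view.rewind();                        // position = 0, limit unchanged
        byte[] all = new byte[view.remaining()];
        view.get(all);
        return all;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = frame("CARR00ABTRANSACTION IGNORED+");
        System.out.println("length (absolute read) = " + buffer.getInt(0));
        System.out.println("body = " + readBody(buffer));
        System.out.println("total bytes = " + allBytes(buffer).length);
    }
}
```

Compared with clear(), rewind() keeps the limit where flip() left it, so remaining() still reports only the bytes that were actually written.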
I am using msgpack to serialize data. I have some code that works fine for serializing data:
public void testJackson() throws Exception {
ByteArrayOutputStream out = new ByteArrayOutputStream();
String data1 = "test data";
int data2 = 10;
List<String> data3 = new ArrayList<String>();
data3.add("list data1");
data3.add("list data1");
ObjectMapper mapper = new ObjectMapper();
mapper.writeValue(out, data1);
mapper.writeValue(out, data2);
mapper.writeValue(out, data3);
// TODO: How to deserialize?
}
But now I don't know how to deserialize the data.
I have not found a solution anywhere. It would be great if anyone could help me with how to proceed.
The problem
I have tried many of the readValue methods, but I can only get the first String; I have no idea how to get the second and third values.
The thing is, Jackson always reads the first value, since the data is neither removed from the input nor did you explicitly tell Jackson that the next value runs from position A to position B.
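One generic way to make those boundaries explicit is to store each value's byte length in front of it. The sketch below is not Jackson code and is not what the answer goes on to do; it only illustrates the "from position A to position B" idea with the JDK's DataOutputStream and DataInputStream. LengthPrefixedRecords is a hypothetical name, and the JSON values are passed in as ready-made strings (for example the output of mapper.writeValueAsString).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: writes and reads values framed by a 4-byte length prefix. */
final class LengthPrefixedRecords {

    /** Writes each value as [4-byte length][UTF-8 payload]. */
    static byte[] write(List<String> jsonValues) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String json : jsonValues) {
                byte[] payload = json.getBytes(StandardCharsets.UTF_8);
                out.writeInt(payload.length);
                out.write(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the records back; the reader needs no prior knowledge of the individual lengths. */
    static List<String> read(byte[] data) {
        List<String> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload); // blocks until the whole payload has been read
                values.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] data = write(Arrays.asList("\"test data\"", "10", "[\"list data1\",\"list data1\"]"));
        for (String json : read(data)) {
            System.out.println(json);
        }
    }
}
```

Each string returned by read is one complete JSON value, so it could be passed to a single readValue call with the matching target type.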
Solutions
This example works and is similar to your code, but it is not very elegant. Here I explicitly tell Jackson where my data is, but to do that I have to know how it was written, which makes the solution far too specific:
File dataFile = new File("jackson.txt");
if(!dataFile.exists())
dataFile.createNewFile();
FileOutputStream fileOut = new FileOutputStream(dataFile);
ByteArrayOutputStream out = new ByteArrayOutputStream();
FileInputStream fileIn = new FileInputStream(dataFile);
String writeData1 = "test data";
int writeData2 = 10;
List<String> writeData3 = new ArrayList<String>();
writeData3.add("list data1");
writeData3.add("list data1");
ObjectMapper mapper = new ObjectMapper();
byte[] writeData1Bytes = mapper.writeValueAsBytes(writeData1);
out.write(writeData1Bytes);
byte[] writeData2Bytes = mapper.writeValueAsBytes(writeData2);
out.write(writeData2Bytes);
byte[] writeData3Bytes = mapper.writeValueAsBytes(writeData3);
out.write(writeData3Bytes);
out.writeTo(fileOut);
// TODO: How to deserialize?
int pos = 0;
byte[] readData = new byte[1000];
fileIn.read(readData);
String readData1 = mapper.readValue(readData, pos, writeData1Bytes.length, String.class);
pos += writeData1Bytes.length;
Integer readData2 = mapper.readValue(readData, pos, writeData2Bytes.length, Integer.class);
pos += writeData2Bytes.length;
ArrayList readData3 = mapper.readValue(readData, pos, writeData3Bytes.length, ArrayList.class);
pos += writeData3Bytes.length;
System.out.printf("readData1 = %s%n", readData1);
System.out.printf("readData2 = %s%n", readData2);
System.out.printf("readData3 = %s%n", readData3);
The file then looks like this:
"test data"10["list data1","list data1"]
How to do it correctly
A much more elegant way is to encapsulate your data in an object that can be turned into a valid JSON string; with that, Jackson won't need any further information:
public class JacksonTest {
public static class DataNode {
@JsonProperty("data1")
private String data1;
@JsonProperty("data2")
private int data2;
@JsonProperty("data3")
private List<String> data3;
//needed for Jackson
public DataNode() {
}
public DataNode(String data1, int data2, List<String> data3) {
this.data1 = data1;
this.data2 = data2;
this.data3 = data3;
}
}
public static void main(String[] args) throws Exception {
File dataFile = new File("jackson.txt");
if(!dataFile.exists())
dataFile.createNewFile();
FileOutputStream fileOut = new FileOutputStream(dataFile);
ByteArrayOutputStream out = new ByteArrayOutputStream();
FileInputStream fileIn = new FileInputStream(dataFile);
String writeData1 = "test data";
int writeData2 = 10;
List<String> writeData3 = new ArrayList<String>();
writeData3.add("list data1");
writeData3.add("list data1");
DataNode writeData = new DataNode(writeData1, writeData2, writeData3);
ObjectMapper mapper = new ObjectMapper();
mapper.writeValue(out, writeData);
out.writeTo(fileOut);
// TODO: How to deserialize?
DataNode readData = mapper.readValue(fileIn, DataNode.class);
System.out.printf("readData1 = %s%n", readData.data1);
System.out.printf("readData2 = %s%n", readData.data2);
System.out.printf("readData3 = %s%n", readData.data3);
}
}
The content of the file looks like this:
{"data1":"test data","data2":10,"data3":["list data1","list data1"]}
You'll want to use one of the readValue methods from ObjectMapper - probably one that has a Reader or InputStream as the first parameter.
@Japu_D_Cret Thank you for such a detailed answer!
Actually I want to use msgpack to transfer data, and I made it work by using msgpack, here is my code
ByteArrayOutputStream out = new ByteArrayOutputStream();
String data1 = "test data";
int data2 = 10;
List<String> data3 = new ArrayList<String>();
data3.add("list data1");
data3.add("list data1");
MessagePack packer = new MessagePack();
packer.write(out, data1);
packer.write(out, data2);
packer.write(out, data3);
// TODO: How to deserialize?
BufferUnpacker unpacker = packer.createBufferUnpacker(out.toByteArray());
System.out.println(unpacker.readString());
System.out.println(unpacker.readInt());
System.out.println(unpacker.read(Templates.tList(Templates.TString)));
Then I found jackson-databind on the msgpack website; it also supports the msgpack format.
I ran some tests on the two and found that Jackson's serialization performance is better than msgpack's, so I want to use Jackson instead of msgpack.
I have an XmlObject which holds the correct value that I need.
Ex : 1½-2Y
But when I tried to convert it into a stream of bytes, the result I am seeing is 1Â½-2Y.
Sample code :
import org.apache.xmlbeans.XmlObject;
class MyClass<T> implements XmlBuilder<T> {
protected final String serializeToXml(XmlObject xmlObject) {
ByteArrayOutputStream os = null;
try {
os = new ByteArrayOutputStream();
xmlObject.save(os,createXmlOptions()); // It's adding a special char here
return os.toString();
} catch (IOException e) {
throw new IllegalStateException(e);
}
}
protected final XmlOptions createXmlOptions() {
final XmlOptions xmlOptions = new XmlOptions();
xmlOptions.setValidateOnSet();
xmlOptions.setCharacterEncoding(UTF_8_ENCODING);
return xmlOptions;
}
}
os.toString() will internally call new String(buffer) and thus will use the system encoding which I assume is not UTF-8.
In general you should explicitly provide the encoding, e.g. new String( os.toByteArray(), "UTF-8").
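The effect can be reproduced with the JDK alone. The sketch below (EncodingDemo is a made-up class name) writes the UTF-8 bytes of the value into a ByteArrayOutputStream and decodes them twice. Decoding as ISO-8859-1 stands in for a non-UTF-8 platform default and is an assumption about the asker's system; it turns the two UTF-8 bytes of ½ (0xC2 0xBD) into the two characters Â and ½.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: the same bytes decoded with the wrong and the right charset. */
final class EncodingDemo {

    /** Stands in for xmlObject.save(os, options) with UTF-8 as the character encoding. */
    static ByteArrayOutputStream utf8Bytes(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        os.write(encoded, 0, encoded.length);
        return os;
    }

    /** What os.toString() effectively does when the platform default is ISO-8859-1. */
    static String decodeAsLatin1(ByteArrayOutputStream os) {
        return new String(os.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Decoding with the charset that was used for writing. */
    static String decodeAsUtf8(ByteArrayOutputStream os) {
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream os = utf8Bytes("1\u00BD-2Y"); // \u00BD is the one-half sign
        System.out.println("bytes written: " + os.size());
        System.out.println("chars when decoded as ISO-8859-1: " + decodeAsLatin1(os).length());
        System.out.println("chars when decoded as UTF-8: " + decodeAsUtf8(os).length());
    }
}
```

Using the StandardCharsets constants instead of the "UTF-8" string also avoids the checked UnsupportedEncodingException.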
I have the following code which merges two audio files into one:
import java.io.File;
import java.io.IOException;
import java.io.SequenceInputStream;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
public class WavAppender {
public static void main(String[] args) {
String wavFile1 = "D:\\wav1.wav";
String wavFile2 = "D:\\wav2.wav";
try {
AudioInputStream clip1 = AudioSystem.getAudioInputStream(new File(wavFile1));
AudioInputStream clip2 = AudioSystem.getAudioInputStream(new File(wavFile2));
AudioInputStream appendedFiles =
new AudioInputStream(
new SequenceInputStream(clip1, clip2),
clip1.getFormat(),
clip1.getFrameLength() + clip2.getFrameLength());
AudioSystem.write(appendedFiles,
AudioFileFormat.Type.WAVE,
new File("D:\\wavAppended.wav"));
} catch (Exception e) {
e.printStackTrace();
}
}
}
I will have a string in the following format: [1,2,3,4,5]. Based on the string, I will need to select the appropriate wav files. For example, if the string is [3,4,5,6,7], I will need to send wavfile3, wavfile4, wavfile5, wavfile6 and wavfile7. What is the best way to achieve this?
Create an array or List of items, so that wavfile1 is at index 0, wavfile2 is at index 1, and so on.
Take each element from the String array, convert it to an int, and subtract one from it (as arrays and lists are zero-indexed); that becomes your index into the "wave file array"...
String waveFile = waveFiles[Integer.parseInt(indices[0]) - 1];
...Now this is prone to some issues, particularly the conversion of the String to an int (Integer.parseInt throws a NumberFormatException on malformed input)...
Instead, you could use a Map, where each wave file is mapped to the corresponding String id
Map<String, String> waveFiles = new ...;
waveFiles.put("1", "WaveFile1");
waveFiles.put("2", "WaveFile2");
//...
Then you would simply use the value from the String array to look it up...
String waveFile = waveFiles.get(indices[0]);
Just some ideas...
Take a look at the Collections Trail for more details and ideas...
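Putting the Map idea together with the input format from the question, a minimal sketch might look like this. WaveFileSelector, select, and the D:\wavN.wav paths are placeholders invented for the example; the resulting list of file names could then be opened one by one and chained with SequenceInputStream as in the question's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: resolves a string such as "[3,4,5]" to wave file names via a Map. */
final class WaveFileSelector {

    static List<String> select(String ids, Map<String, String> waveFiles) {
        String trimmed = ids.trim();
        // Drop the surrounding brackets, if present.
        if (trimmed.startsWith("[") && trimmed.endsWith("]")) {
            trimmed = trimmed.substring(1, trimmed.length() - 1);
        }
        List<String> selected = new ArrayList<>();
        for (String id : trimmed.split(",")) {
            String key = id.trim();
            if (key.isEmpty()) {
                continue; // tolerates "[]" and stray commas
            }
            String file = waveFiles.get(key);
            if (file == null) {
                throw new IllegalArgumentException("No wave file registered for id " + key);
            }
            selected.add(file);
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, String> waveFiles = new HashMap<>();
        for (int i = 1; i <= 7; i++) {
            waveFiles.put(String.valueOf(i), "D:\\wav" + i + ".wav"); // placeholder paths
        }
        System.out.println(select("[3,4,5,6,7]", waveFiles));
    }
}
```

Because the ids stay Strings, no number parsing is needed, and an unknown id fails with a clear message instead of an ArrayIndexOutOfBoundsException.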
I am trying to produce images without gamma information so that IE8 can display them correctly. I used the following code, but the result is a distorted image that looks nothing like the original image.
///PNG
PNGEncodeParam params= PNGEncodeParam.getDefaultEncodeParam(outImage);
params.unsetGamma();
params.setChromaticity(DEFAULT_CHROMA);
params.setSRGBIntent(PNGEncodeParam.INTENT_ABSOLUTE);
ImageEncoder encoder= ImageCodec.createImageEncoder("PNG", response.getOutputStream(), params);
encoder.encode(outImage);
response.getOutputStream().close();
Here is the original image and the distorted one resulting from the code above.
Thanks!
I saw the same question asked several places but there seems to be no answer, so I am offering mine here. I have no idea whether Java imageio saves gamma or not. Given the fact gamma is system dependent, it is unlikely imageio could handle it. One thing is for sure: imageio ignores gamma when reading pngs.
PNG is a chunk-based image format. gAMA is one of the 14 ancillary chunks; it compensates for differences between the computer systems that create and display the image, so that it looks more or less equally "bright" on different systems. Each chunk starts with a 4-byte data length and a 4-byte chunk identifier, followed by the chunk data and a 4-byte CRC checksum. The data length covers only the data, not the length field itself, the chunk identifier, or the CRC. The gAMA chunk is identified by hex 0x67414D41.
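To make that layout concrete, here is a small sketch that walks a PNG byte array and lists the chunk identifiers it encounters. PngChunkLister is a hypothetical name, and the sketch neither verifies the CRCs nor guards against corrupt length fields; it relies on DataInputStream, which reads big-endian values just as the PNG format stores them.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the chunk types of a PNG in file order. */
final class PngChunkLister {

    static final long SIGNATURE = 0x89504E470D0A1A0AL;

    static List<String> chunkTypes(byte[] png) {
        List<String> types = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(png))) {
            if (in.readLong() != SIGNATURE) {
                throw new IllegalArgumentException("not a PNG stream");
            }
            while (true) {
                int length = in.readInt();          // 4-byte data length
                byte[] type = new byte[4];          // 4-byte chunk identifier
                in.readFully(type);
                in.readFully(new byte[length + 4]); // chunk data plus 4-byte CRC
                String name = new String(type, StandardCharsets.US_ASCII);
                types.add(name);
                if (name.equals("IEND")) {
                    return types;                   // IEND is always the last chunk
                }
            }
        } catch (IOException e) {
            // Includes EOFException for a truncated stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            System.out.println(chunkTypes(Files.readAllBytes(Paths.get(args[0]))));
        }
    }
}
```

Running it against a file that IE8 renders incorrectly is a quick way to check whether a gAMA chunk is present at all.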
Here is the raw way to remove gAMA from a PNG image, assuming the input stream is in valid PNG format. First read 8 bytes, which are the PNG identifier 0x89504e470d0a1a0aL. Then read another 25 bytes, which make up the image header chunk (4 bytes length, 4 bytes type, 13 bytes data, 4 bytes CRC). Altogether we have now read 33 bytes from the top of the file; save them to a temp file with a png extension. Then comes a while loop in which we read the chunks one by one: if a chunk is neither IEND nor gAMA, we copy it to the output temp file; if it is a gAMA chunk, we skip it. This continues until we reach IEND, which should be the last chunk, and we copy it to the temp file as well. Done. Here is the whole test code to show how things are done (it is just for demo purposes, not optimized):
import java.io.*;
public class RemoveGamma
{
/** PNG signature constant */
public static final long SIGNATURE = 0x89504E470D0A1A0AL;
/** PNG Chunk type constants, 4 Critical chunks */
/** Image header */
private static final int IHDR = 0x49484452; // "IHDR"
/** Image data */
private static final int IDAT = 0x49444154; // "IDAT"
/** Image trailer */
private static final int IEND = 0x49454E44; // "IEND"
/** Palette */
private static final int PLTE = 0x504C5445; // "PLTE"
/** 14 Ancillary chunks */
/** Transparency */
private static final int tRNS = 0x74524E53; // "tRNS"
/** Image gamma */
private static final int gAMA = 0x67414D41; // "gAMA"
/** Primary chromaticities */
private static final int cHRM = 0x6348524D; // "cHRM"
/** Standard RGB color space */
private static final int sRGB = 0x73524742; // "sRGB"
/** Embedded ICC profile */
private static final int iCCP = 0x69434350; // "iCCP"
/** Textual data */
private static final int tEXt = 0x74455874; // "tEXt"
/** Compressed textual data */
private static final int zTXt = 0x7A545874; // "zTXt"
/** International textual data */
private static final int iTXt = 0x69545874; // "iTXt"
/** Background color */
private static final int bKGD = 0x624B4744; // "bKGD"
/** Physical pixel dimensions */
private static final int pHYs = 0x70485973; // "pHYs"
/** Significant bits */
private static final int sBIT = 0x73424954; // "sBIT"
/** Suggested palette */
private static final int sPLT = 0x73504C54; // "sPLT"
/** Palette histogram */
private static final int hIST = 0x68495354; // "hIST"
/** Image last-modification time */
private static final int tIME = 0x74494D45; // "tIME"
public void remove(InputStream is) throws Exception
{
//Local variables for reading chunks
int data_len = 0;
int chunk_type = 0;
long CRC = 0;
byte[] buf=null;
DataOutputStream ds = new DataOutputStream(new FileOutputStream("temp.png"));
long signature = readLong(is);
if (signature != SIGNATURE)
{
System.out.println("--- NOT A PNG IMAGE ---");
return;
}
ds.writeLong(SIGNATURE);
//*******************************
//Chunks follow, start with IHDR
//*******************************
/** Chunk layout
Each chunk consists of four parts:
Length
A 4-byte unsigned integer giving the number of bytes in the chunk's data field.
The length counts only the data field, not itself, the chunk type code, or the CRC.
Zero is a valid length. Although encoders and decoders should treat the length as unsigned,
its value must not exceed 2^31-1 bytes.
Chunk Type
A 4-byte chunk type code. For convenience in description and in examining PNG files,
type codes are restricted to consist of uppercase and lowercase ASCII letters
(A-Z and a-z, or 65-90 and 97-122 decimal). However, encoders and decoders must treat
the codes as fixed binary values, not character strings. For example, it would not be
correct to represent the type code IDAT by the EBCDIC equivalents of those letters.
Additional naming conventions for chunk types are discussed in the next section.
Chunk Data
The data bytes appropriate to the chunk type, if any. This field can be of zero length.
CRC
A 4-byte CRC (Cyclic Redundancy Check) calculated on the preceding bytes in the chunk,
including the chunk type code and chunk data fields, but not including the length field.
The CRC is always present, even for chunks containing no data. See CRC algorithm.
*/
/** Read header */
/** We are expecting IHDR */
if ((readInt(is)!=13)||(readInt(is) != IHDR))
{
System.out.println("--- NOT A PNG IMAGE ---");
return;
}
ds.writeInt(13);//We expect length to be 13 bytes
ds.writeInt(IHDR);
buf = new byte[13+4];//13 plus 4 bytes CRC
is.read(buf,0,17);
ds.write(buf);
while (true)
{
data_len = readInt(is);
chunk_type = readInt(is);
//System.out.println("chunk type: 0x"+Integer.toHexString(chunk_type));
if (chunk_type == IEND)
{
System.out.println("IEND found");
ds.writeInt(data_len);
ds.writeInt(IEND);
int crc = readInt(is);
ds.writeInt(crc);
break;
}
switch (chunk_type)
{
case gAMA://or any non-significant chunk you want to remove
{
System.out.println("gamma found");
is.skip(data_len+4);
break;
}
default:
{
buf = new byte[data_len+4];
is.read(buf,0, data_len+4);
ds.writeInt(data_len);
ds.writeInt(chunk_type);
ds.write(buf);
break;
}
}
}
is.close();
ds.close();
}
private int readInt(InputStream is) throws Exception
{
byte[] buf = new byte[4];
is.read(buf,0,4);
return (((buf[0]&0xff)<<24)|((buf[1]&0xff)<<16)|
((buf[2]&0xff)<<8)|(buf[3]&0xff));
}
private long readLong(InputStream is) throws Exception
{
byte[] buf = new byte[8];
is.read(buf,0,8);
return (((buf[0]&0xffL)<<56)|((buf[1]&0xffL)<<48)|
((buf[2]&0xffL)<<40)|((buf[3]&0xffL)<<32)|((buf[4]&0xffL)<<24)|
((buf[5]&0xffL)<<16)|((buf[6]&0xffL)<<8)|(buf[7]&0xffL));
}
public static void main(String args[]) throws Exception
{
FileInputStream fs = new FileInputStream(args[0]);
RemoveGamma rg = new RemoveGamma();
rg.remove(fs);
}
}
Since the input is a Java InputStream, we could use some kind of encoder to encode the image as a PNG and write it to a ByteArrayOutputStream, which is later fed to the above test class as a ByteArrayInputStream; the gamma information (if any) will then be removed. Here is the result:
The left side is the original image with gAMA, the right side is the same image with gAMA removed.
Image source: http://r6.ca/cs488/kosh.png
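The in-memory pipeline described above (encode to a ByteArrayOutputStream, then filter the bytes) might be sketched as follows. This is a compact variant rather than the class shown earlier: PngChunkFilter and strip are hypothetical names, chunk types are matched by their four-letter ASCII names, and readFully is used so that short reads cannot silently truncate a chunk. The main method uses javax.imageio to produce test bytes, which is only one possible encoder.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;
import javax.imageio.ImageIO;

/** Hypothetical helper: copies a PNG in memory while dropping the named chunk types. */
final class PngChunkFilter {

    static final long SIGNATURE = 0x89504E470D0A1A0AL;

    /** typesToDrop holds names such as "gAMA"; critical chunks should not be listed. */
    static byte[] strip(byte[] png, Set<String> typesToDrop) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(png));
             DataOutputStream out = new DataOutputStream(result)) {
            long signature = in.readLong();
            if (signature != SIGNATURE) {
                throw new IllegalArgumentException("not a PNG stream");
            }
            out.writeLong(signature);
            while (true) {
                int length = in.readInt();
                byte[] type = new byte[4];
                in.readFully(type);
                byte[] dataAndCrc = new byte[length + 4];
                in.readFully(dataAndCrc);
                String name = new String(type, StandardCharsets.US_ASCII);
                if (!typesToDrop.contains(name)) {
                    // Kept chunks are copied byte for byte, so their CRCs stay valid.
                    out.writeInt(length);
                    out.write(type);
                    out.write(dataAndCrc);
                }
                if (name.equals("IEND")) {
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage image = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream encoded = new ByteArrayOutputStream();
        ImageIO.write(image, "png", encoded);
        byte[] slim = strip(encoded.toByteArray(), Collections.singleton("gAMA"));
        System.out.println(encoded.size() + " bytes in, " + slim.length + " bytes out");
    }
}
```

The returned array can be written straight to a servlet response or a file, so no temp file is needed.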
Edit: here is a revised version of the code to remove any ancillary chunks.
import java.io.*;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;
public class PNGChunkRemover
{
/** PNG signature constant */
private static final long SIGNATURE = 0x89504E470D0A1A0AL;
/** PNG Chunk type constants, 4 Critical chunks */
/** Image header */
private static final int IHDR = 0x49484452; // "IHDR"
/** Image data */
private static final int IDAT = 0x49444154; // "IDAT"
/** Image trailer */
private static final int IEND = 0x49454E44; // "IEND"
/** Palette */
private static final int PLTE = 0x504C5445; // "PLTE"
//Ancillary chunks keys
private static String[] KEYS = { "TRNS", "GAMA","CHRM","SRGB","ICCP","TEXT","ZTXT",
"ITXT","BKGD","PHYS","SBIT","SPLT","HIST","TIME"};
private static int[] VALUES = {0x74524E53,0x67414D41,0x6348524D,0x73524742,0x69434350,0x74455874,0x7A545874,
0x69545874,0x624B4744,0x70485973,0x73424954,0x73504C54,0x68495354,0x74494D45};
private static HashMap<String, Integer> TRUNK_TYPES = new HashMap<String, Integer>()
{{
for(int i=0;i<KEYS.length;i++)
put(KEYS[i],VALUES[i]);
}};
private static HashMap<Integer, String> REVERSE_TRUNK_TYPES = new HashMap<Integer,String>()
{{
for(int i=0;i<KEYS.length;i++)
put(VALUES[i],KEYS[i]);
}};
private static Set<Integer> REMOVABLE = new HashSet<Integer>();
private static void remove(InputStream is, File dir, String fileName) throws Exception
{
//Local variables for reading chunks
int data_len = 0;
int chunk_type = 0;
byte[] buf=null;
DataOutputStream ds = new DataOutputStream(new FileOutputStream(new File(dir,fileName)));
long signature = readLong(is);
if (signature != SIGNATURE)
{
System.out.println("--- NOT A PNG IMAGE ---");
return;
}
ds.writeLong(SIGNATURE);
/** Read header */
/** We are expecting IHDR */
if ((readInt(is)!=13)||(readInt(is) != IHDR))
{
System.out.println("--- NOT A PNG IMAGE ---");
return;
}
ds.writeInt(13);//We expect length to be 13 bytes
ds.writeInt(IHDR);
buf = new byte[13+4];//13 plus 4 bytes CRC
is.read(buf,0,17);
ds.write(buf);
while (true)
{
data_len = readInt(is);
chunk_type = readInt(is);
//System.out.println("chunk type: 0x"+Integer.toHexString(chunk_type));
if (chunk_type == IEND)
{
System.out.println("IEND found");
ds.writeInt(data_len);
ds.writeInt(IEND);
int crc = readInt(is);
ds.writeInt(crc);
break;
}
if(REMOVABLE.contains(chunk_type))
{
System.out.println(REVERSE_TRUNK_TYPES.get(chunk_type)+" chunk removed!");
is.skip(data_len+4);
}
else
{
buf = new byte[data_len+4];
is.read(buf,0, data_len+4);
ds.writeInt(data_len);
ds.writeInt(chunk_type);
ds.write(buf);
}
}
is.close();
ds.close();
}
private static int readInt(InputStream is) throws Exception
{
byte[] buf = new byte[4];
int bytes_read = is.read(buf,0,4);
if(bytes_read<0) return IEND;
return (((buf[0]&0xff)<<24)|((buf[1]&0xff)<<16)|
((buf[2]&0xff)<<8)|(buf[3]&0xff));
}
private static long readLong(InputStream is) throws Exception
{
byte[] buf = new byte[8];
int bytes_read = is.read(buf,0,8);
if(bytes_read<0) return IEND;
return (((buf[0]&0xffL)<<56)|((buf[1]&0xffL)<<48)|
((buf[2]&0xffL)<<40)|((buf[3]&0xffL)<<32)|((buf[4]&0xffL)<<24)|
((buf[5]&0xffL)<<16)|((buf[6]&0xffL)<<8)|(buf[7]&0xffL));
}
public static void main(String args[]) throws Exception
{
if(args.length>0)
{
File[] files = {new File(args[0])};
File dir = new File(".");
if(files[0].isDirectory())
{
dir = files[0];
files = files[0].listFiles(new FileFilter(){
public boolean accept(File file)
{
if(file.getName().toLowerCase().endsWith("png")){
return true;
}
return false;
}
}
);
}
if(args.length>1)
{
FileInputStream fs = null;
if(args[1].equalsIgnoreCase("all")){
REMOVABLE = REVERSE_TRUNK_TYPES.keySet();
}
else
{
String key = "";
for (int i=1;i<args.length;i++)
{
key = args[i].toUpperCase();
if(TRUNK_TYPES.containsKey(key))
REMOVABLE.add(TRUNK_TYPES.get(key));
}
}
for(int i= files.length-1;i>=0;i--)
{
String outFileName = files[i].getName();
outFileName = outFileName.substring(0,outFileName.lastIndexOf('.'))
+"_slim.png";
System.out.println("<<"+files[i].getName());
fs = new FileInputStream(files[i]);
remove(fs, dir, outFileName);
System.out.println(">>"+outFileName);
System.out.println("************************");
}
}
}
}
}
Usage: java PNGChunkRemover filename.png all will remove any of the predefined 14 ancillary chunks.
java PNGChunkRemover filename.png gama time ... will only remove the chunks specified after the png file.
Note: If a folder name is specified as the first argument to PNGChunkRemover, all PNG files in the folder will be processed.
The above example has become part of a Java image library which can be found at https://github.com/dragon66/icafe
You can also do it with the (my) PNGJ library
http://code.google.com/p/pngj/
Eg
PngReader pngr = FileHelper.createPngReader(new File(origFilename));
PngWriter pngw = FileHelper.createPngWriter(new File(destFilename), pngr.imgInfo, false);
pngw.copyChunksFirst(pngr, ChunkCopyBehaviour.COPY_ALL); // all chunks are queued
PngChunkGAMA gama = (PngChunkGAMA) pngw.getChunkList().getQueuedById1(ChunkHelper.gAMA);
if (gama != null) {
System.out.println("removing gama chunk gamma=" + gama.getGamma());
pngw.getChunkList().removeChunk(gama);
}
for (int row = 0; row < pngr.imgInfo.rows; row++) {
ImageLine l1 = pngr.readRow(row);
pngw.writeRow(l1, row);
}
pngw.copyChunksLast(pngr, ChunkCopyBehaviour.COPY_ALL); // in case some new metadata has been read
pngw.end();
Included in the library samples.
The tool pngcrush can remove the gamma information and other unwanted chunks:
pngcrush -m 3 -rem gAMA -rem cHRM -rem iCCP -rem sRGB in.png out.png
It recompresses the PNG at the same time, trying different methods. The -m 3 option tries only method number 3, which seems to be quick and reasonably effective. Omit that if you want the smallest png.