How to parse binary data from xml? - java

I'm trying to parse XML with StAX for a school project. If I have an element:
<binary data></binary data>
If this data takes up to 250 MB, how should I deal with it?
XMLStreamReader2 has byte[] getElementAsBinary(), but I can't afford to hold this much data in memory. If anyone can help with this, I would really appreciate it.
EDIT
Is it possible to read the data into a stream somehow? Currently I have:
private byte[] readBinary(XMLStreamReader2 sr) throws XMLStreamException {
Stax2Util.ByteAggregator aggr = new Stax2Util.ByteAggregator();
byte[] buffer = aggr.startAggregation();
while (true) {
int offset = 0;
int len = buffer.length;
do {
int readCount = sr.readElementAsBinary(buffer, offset, len);
if (readCount < 1) { // all done!
return aggr.aggregateAll(buffer, offset);
}
offset += readCount;
len -= readCount;
} while (len > 0);
buffer = aggr.addFullBlock(buffer);
}
}
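One way to avoid holding the whole payload in memory is to decode the Base64 text as it arrives and write each decoded piece to an OutputStream. The sketch below swaps in a different technique from the one in the question: it uses only the JDK's javax.xml.stream API, not the Stax2 typed-access methods. The names Base64ElementStreamer and copyBase64Element are hypothetical. It assumes the element holds only Base64 text (no child elements) and that the parser reports long text as several CHARACTERS events when coalescing is off, which depends on the implementation. With XMLStreamReader2, the loop above could presumably do the same thing by writing each filled buffer to an OutputStream instead of handing it to the aggregator.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class Base64ElementStreamer {

    /**
     * Decodes the Base64 text of the current element and writes it to out.
     * The reader must be positioned on the element's START_ELEMENT event.
     * Returns the number of decoded bytes written.
     */
    static long copyBase64Element(XMLStreamReader sr, OutputStream out)
            throws XMLStreamException, IOException {
        Base64.Decoder decoder = Base64.getDecoder();
        // Holds the current text event plus up to 3 characters carried over.
        StringBuilder pending = new StringBuilder();
        long total = 0;
        while (sr.hasNext()) {
            int event = sr.next();
            if (event == XMLStreamConstants.END_ELEMENT) {
                break;
            }
            if (event != XMLStreamConstants.CHARACTERS
                    && event != XMLStreamConstants.CDATA
                    && event != XMLStreamConstants.SPACE) {
                continue; // skip comments and processing instructions
            }
            char[] chars = sr.getTextCharacters();
            int start = sr.getTextStart();
            int end = start + sr.getTextLength();
            for (int i = start; i < end; i++) {
                if (!Character.isWhitespace(chars[i])) {
                    pending.append(chars[i]); // Base64 inside XML may be line-wrapped
                }
            }
            // Decode only whole 4-character groups; carry the rest to the next event.
            int usable = pending.length() - (pending.length() % 4);
            if (usable > 0) {
                byte[] decoded = decoder.decode(pending.substring(0, usable));
                out.write(decoded);
                total += decoded.length;
                pending.delete(0, usable);
            }
        }
        if (pending.length() > 0) { // unpadded tail of 2 or 3 characters
            byte[] decoded = decoder.decode(pending.toString());
            out.write(decoded);
            total += decoded.length;
        }
        return total;
    }
}
```

Passing a FileOutputStream (or any other sink) as out keeps memory use bounded by the size of a single text event, not by the size of the element.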

Related

read a File in hex byteArray and write part of that array to another File - java android

I have a 200 KB file that I have to read as bytes, and then write part of the resulting byte array (from index 90000 to 165000) to another file. How can I accomplish this?
File file1 = new File(getExternalStorageDirectory(), "file1.raw"); //the 200kb file
try {
File.createTempFile("file2", "mp3", getCacheDir()); //empty file that needs to get filled with part of the byte array from file1
} catch (IOException ignored) {}
Use a RandomAccessFile in order to seek to the offset to start copying from.
For example:
final int bufSize = 1024; // the buffer size for copying
byte[] buf = new byte[bufSize];
final long start = 10000L; // start offset at source
final long end = 15000L; // end (non-inclusive) offset at source
try (
RandomAccessFile in = new RandomAccessFile("input.bin", "r");
OutputStream out = new FileOutputStream("output.bin"))
{
in.seek(start);
long remaining = end - start;
do {
int n = (remaining < bufSize)? ((int) remaining) : bufSize;
int nread = in.read(buf, 0, n);
if (nread < 0)
break; // EOF
remaining -= nread;
out.write(buf, 0, nread);
} while (remaining > 0);
}
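Applied to the question's numbers, the same idea can be wrapped in a small helper. This is only a sketch: RangeCopier and copyRange are hypothetical names, and treating the end offset as exclusive is an assumption about what the asker wants.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

final class RangeCopier {

    /** Copies bytes [start, end) of src into dst and returns how many bytes were copied. */
    static long copyRange(File src, File dst, long start, long end) throws IOException {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid range: " + start + ".." + end);
        }
        byte[] buf = new byte[8192];
        long copied = 0;
        try (RandomAccessFile in = new RandomAccessFile(src, "r");
             OutputStream out = new FileOutputStream(dst)) {
            in.seek(start);
            long remaining = end - start;
            while (remaining > 0) {
                // Never request more than what is left of the range.
                int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n < 0) {
                    break; // source is shorter than the requested range
                }
                out.write(buf, 0, n);
                remaining -= n;
                copied += n;
            }
        }
        return copied;
    }
}
```

For the question's case this would be called as RangeCopier.copyRange(file1, file2, 90000L, 165000L), where file2 is the File returned by File.createTempFile (the question's snippet discards that return value).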

Java 8: How to chunk multipart file for POST request

I have a multipart file (an image or a video) that needs to be chunked for a POST request. How can I chunk the file into byte array segments?
Edit: I'm using the Twitter API to upload an image; according to their docs, media must be chunked.
I've found a solution thanks to https://www.baeldung.com/2013/04/04/multipart-upload-on-s3-with-jclouds/
public final class MediaUtil {
public static int getMaximumNumberOfParts(byte[] byteArray) {
int numberOfParts = byteArray.length / (1024 * 1024); // 1MB
if (numberOfParts == 0) {
return 1;
}
return numberOfParts;
}
public static List<byte[]> breakByteArrayIntoParts(byte[] byteArray, int maxNumberOfParts) {
List<byte[]> parts = new ArrayList<>();
int fullSize = byteArray.length;
long dimensionOfPart = fullSize / maxNumberOfParts;
for (int i = 0; i < maxNumberOfParts; i++) {
int previousSplitPoint = (int) (dimensionOfPart * i);
int splitPoint = (int) (dimensionOfPart * (i + 1));
if (i == (maxNumberOfParts - 1)) {
splitPoint = fullSize;
}
byte[] partBytes = Arrays.copyOfRange(byteArray, previousSplitPoint, splitPoint);
parts.add(partBytes);
}
return parts;
}
}
// Post the request
byte[] fileBytes = multipartFile.getBytes(); // read the upload once and reuse it
int maxParts = MediaUtil.getMaximumNumberOfParts(fileBytes);
List<byte[]> bytes = MediaUtil.breakByteArrayIntoParts(fileBytes, maxParts);
int segment = 0;
for (byte[] b : bytes) {
// POST request here
segment++;
}
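Note that getMaximumNumberOfParts rounds down, so the parts produced above are at least 1 MB each and can approach 2 MB (a 1.9 MB input yields a single 1.9 MB part). If the receiving endpoint enforces an upper bound per chunk, splitting by a fixed chunk size may be a better fit. A minimal sketch of that alternative follows; FixedSizeChunker and split are hypothetical names, and the chunk size to pass is an assumption to check against the API documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FixedSizeChunker {

    /** Splits data into consecutive chunks of at most chunkSize bytes each. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int from = 0;
        while (from < data.length) {
            // Compute the end in long arithmetic so from + chunkSize cannot overflow.
            int to = (int) Math.min((long) from + chunkSize, data.length);
            chunks.add(Arrays.copyOfRange(data, from, to));
            from = to;
        }
        return chunks;
    }
}
```

Only the last chunk can be shorter than chunkSize, and an empty input produces an empty list.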
Well, you may need this:
File resource = ResourceUtils.getFile(path);
if (resource.isFile()) {
byte[] bytes = readFile2Bytes(new FileInputStream(resource));
}
private byte[] readFile2Bytes(FileInputStream fis) throws IOException {
int length;
byte[] buffer = new byte[8192]; // copy buffer; any reasonable size works
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while ((length = fis.read(buffer)) != -1) {
baos.write(buffer, 0, length);
}
return baos.toByteArray();
}

Incomplete file returned by GridFS

I'm working on a Java project to store and retrieve files from MongoDB using the GridFS specification. I'm using the code snippets provided in the MongoDB Java driver documentation at https://mongodb.github.io/mongo-java-driver/4.1/driver/tutorials/gridfs/.
While using openDownloadStream to retrieve a file, I noticed that if the file is divided into more than one chunk, only the first chunk is returned, not the full file.
ObjectId fileId;
GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
int fileLength = (int) downloadStream.getGridFSFile().getLength();
byte[] bytesToWriteTo = new byte[fileLength];
downloadStream.read(bytesToWriteTo); /*read file contents */
downloadStream.close();
System.out.println(new String(bytesToWriteTo, StandardCharsets.UTF_8));
Any solutions to this?
Looking at the class GridFSDownloadStreamImpl which implements GridFSDownloadStream, it looks like the method read(byte[]) reads chunk by chunk:
@Override
public int read(final byte[] b) {
return read(b, 0, b.length);
}
@Override
public int read(final byte[] b, final int off, final int len) {
checkClosed();
if (currentPosition == length) {
return -1;
} else if (buffer == null) {
buffer = getBuffer(chunkIndex);
} else if (bufferOffset == buffer.length) {
chunkIndex += 1;
buffer = getBuffer(chunkIndex);
bufferOffset = 0;
}
int r = Math.min(len, buffer.length - bufferOffset);
System.arraycopy(buffer, bufferOffset, b, off, r);
bufferOffset += r;
currentPosition += r;
return r;
}
Therefore, you have to loop until all expected bytes are actually read:
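byte[] bytesToWriteTo = new byte[fileLength];
int bytesRead = 0;
while (bytesRead < fileLength) {
    // Continue after the bytes already read instead of overwriting from index 0.
    int newBytesRead = downloadStream.read(bytesToWriteTo, bytesRead, fileLength - bytesRead);
    if (newBytesRead == -1) {
        throw new IOException("Stream ended after " + bytesRead + " of " + fileLength + " bytes");
    }
    bytesRead += newBytesRead;
}
downloadStream.close();
Note that I was not able to test the above code, so please use it with caution.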
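Since GridFSDownloadStream is an InputStream, the "loop until full" pattern can also be written once against InputStream and reused. The sketch below is a generic illustration, not part of the MongoDB driver; StreamReads and readFully are hypothetical names.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class StreamReads {

    /** Fills dest completely from in, or throws EOFException if the stream ends first. */
    static void readFully(InputStream in, byte[] dest) throws IOException {
        int offset = 0;
        while (offset < dest.length) {
            // Request only the missing bytes and place them after those already read.
            int n = in.read(dest, offset, dest.length - offset);
            if (n < 0) {
                throw new EOFException(
                        "stream ended after " + offset + " of " + dest.length + " bytes");
            }
            offset += n;
        }
    }
}
```

With the variables from the question this would be StreamReads.readFully(downloadStream, bytesToWriteTo). The JDK's DataInputStream.readFully(byte[]) offers the same behavior if wrapping the stream is acceptable.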
I ended up using the readAllBytes() method (available since Java 9), and it returns the whole file.
GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
byte[] bytesToWriteTo = downloadStream.readAllBytes(); // allocates the array itself
downloadStream.close();

How to get BufferedInputStream from Multipart?

I'm trying to get a BufferedInputStream from an uploaded CSV file.
I'm working with a Multipart derived from the CSV file.
When I first get the Multipart, it's a BufferedInputStream, but the buffer is all null.
But if I look deeper down, there's another buffer in the CoyoteInputStream and that has data.
How can I get at this second buffer? My code is below.
And of course it's throwing a NullPointerException when it gets to
while ((multiPartDataPos = stream.read(buffer)) >= 0)
What am I doing wrong? Am I mistaken that the CoyoteInputStream is the data I want?
public byte[] handleUploadedFile(Multipart multiPart) throws EOFException {
Multipart multiPartData = null;
BufferedInputStream stream = null;
int basicBufferSize = 0x2000;
byte[] buffer = new byte[basicBufferSize];
int bufferPos = 0;
try {
while (multiPart.hasNext()) {
int multiPartDataPos = bufferPos;
multiPartData = (Multipart) multiPart.next();
stream = new BufferedInputStream(multiPartData.getInputStream());
while ((multiPartDataPos = stream.read(buffer)) >= 0) {
int len = stream.read(buffer, multiPartDataPos, buffer.length - multiPartDataPos);
multiPartDataPos += len;
}
bufferPos = bufferPos + multiPartDataPos;
}
} ...
Your code doesn't make any sense.
while ((multiPartDataPos = stream.read(buffer)) >= 0) {
At this point you have read multiPartDataPos bytes into buffer, so that buffer[0..multiPartDataPos-1] contains the data just read.
int len = stream.read(buffer, multiPartDataPos, buffer.length - multiPartDataPos);
At this point you are doing another read, which could return -1; otherwise it adds some data at buffer[multiPartDataPos..multiPartDataPos+len-1].
multiPartDataPos += len;
This step is only valid if len > 0.
And you are doing nothing with the buffer; and next time around the loop you will clobber whatever you just read.
The correct way to read any stream in Java is as follows:
while ((count = in.read(buffer)) > 0)
{
    // use buffer[0..count-1], for example out.write(buffer, 0, count);
}
I don't understand why you think access to an underlying stream is required or what it's going to give you that you don't already have.
Turns out the better solution was to move the data from an InputStream to a ByteArrayOutputStream and then return ByteArrayOutputStream.toByteArray():
Multipart multiPartData = null;
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int read;
byte[] input = new byte[4096];
InputStream is;
try {
multiPartData = (Multipart)multipart.next();
is = multiPartData.getInputStream();
while ((read = is.read(input, 0, input.length)) != -1) {
buffer.write(input, 0, read);
}
buffer.flush();
return buffer.toByteArray(); // just a test right now
}
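The snippet above reads a single part. A sketch of extending it to every part is shown below. Because the Multipart type is not shown in the question, an Iterable of InputStream stands in for it, on the assumption that each part exposes its data the way multiPartData.getInputStream() does. PartReader and readAllParts are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class PartReader {

    /** Reads every part to its end and returns the concatenated bytes. */
    static byte[] readAllParts(Iterable<? extends InputStream> parts) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (InputStream part : parts) {
            try (InputStream in = part) { // close each part once it has been drained
                int count;
                while ((count = in.read(chunk)) != -1) {
                    // Only the first count bytes of chunk are valid after each read.
                    collected.write(chunk, 0, count);
                }
            }
        }
        return collected.toByteArray();
    }
}
```

Each read is followed immediately by a write of exactly the bytes just read, which is the point made in the answer above.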

Inserting an image to a particular position in a word document using docx4j

I want to add an image at a particular position in my Word document using docx4j. I don't want inline insertion. The code below adds the image inline with the text, but I want a floating image whose location on the page I can specify explicitly. Please help me.
public R addUserPic(P parag, WordprocessingMLPackage wordMLPackage)
throws Exception {
File file = new File("src/main/resources/PictureNew.png");
byte[] bytes = convertImageToByteArray(file);
BinaryPartAbstractImage imagePart = BinaryPartAbstractImage
.createImagePart(wordMLPackage, bytes);
int docPrId = 1;
int cNvPrId = 2;
Inline inline = imagePart.createImageInline("Filename hint",
"Alternative text", docPrId, cNvPrId, false);
ObjectFactory factory = new ObjectFactory();
R run = factory.createR();
org.docx4j.wml.Drawing drawing = factory.createDrawing();
run.getContent().add(drawing);
drawing.getAnchorOrInline().add(inline);
return run;
}
private static byte[] convertImageToByteArray(File file)
throws FileNotFoundException, IOException {
InputStream is = new FileInputStream(file);
long length = file.length();
if (length > Integer.MAX_VALUE) {
System.out.println("File too large!!");
}
byte[] bytes = new byte[(int) length];
int offset = 0;
int numRead = 0;
while (offset < bytes.length
&& (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
offset += numRead;
}
if (offset < bytes.length) {
System.out.println("Could not completely read file "
+ file.getName());
}
is.close();
return bytes;
}
The thread you have cross posted in, at http://www.docx4java.org/forums/docx-java-f6/how-to-create-a-floating-image-t1224.html answers your question.
