Incomplete file returned by GridFS - java

I'm working on a Java project to store and retrieve files from MongoDB using the GridFS specification. I'm using the code snippets provided in the MongoDB Java driver documentation at https://mongodb.github.io/mongo-java-driver/4.1/driver/tutorials/gridfs/.
While using openDownloadStream to retrieve the file, I noticed that if the file is divided into more than one chunk, it returns only the first chunk, not the full file.
ObjectId fileId;
GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
int fileLength = (int) downloadStream.getGridFSFile().getLength();
byte[] bytesToWriteTo = new byte[fileLength];
downloadStream.read(bytesToWriteTo); /*read file contents */
downloadStream.close();
System.out.println(new String(bytesToWriteTo, StandardCharsets.UTF_8));
Any solutions to this?

Looking at the class GridFSDownloadStreamImpl, which implements GridFSDownloadStream, it looks like the method read(byte[]) reads at most one chunk at a time:
@Override
public int read(final byte[] b) {
    return read(b, 0, b.length);
}

@Override
public int read(final byte[] b, final int off, final int len) {
    checkClosed();

    if (currentPosition == length) {
        return -1;
    } else if (buffer == null) {
        buffer = getBuffer(chunkIndex);
    } else if (bufferOffset == buffer.length) {
        chunkIndex += 1;
        buffer = getBuffer(chunkIndex);
        bufferOffset = 0;
    }

    int r = Math.min(len, buffer.length - bufferOffset);
    System.arraycopy(buffer, bufferOffset, b, off, r);
    bufferOffset += r;
    currentPosition += r;
    return r;
}
Therefore, you have to loop until all expected bytes are actually read:
byte[] bytesToWriteTo = new byte[fileLength];
int bytesRead = 0;
while (bytesRead < fileLength) {
    // read into the array at the current offset, not always at position 0
    int newBytesRead = downloadStream.read(bytesToWriteTo, bytesRead, fileLength - bytesRead);
    if (newBytesRead == -1) {
        throw new IOException("Stream ended before the expected " + fileLength + " bytes were read");
    }
    bytesRead += newBytesRead;
}
downloadStream.close();
Note that I was not able to test the above code, so please use it with caution.
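Alternatively, the driver can do the looping for you: GridFSBucket has a downloadToStream method that writes the whole file to any OutputStream. A minimal sketch, assuming the file fits comfortably in memory:
// Let the driver copy every chunk of the file into the destination stream
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
gridFSBucket.downloadToStream(fileId, outputStream);
byte[] bytesToWriteTo = outputStream.toByteArray();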

I ended up using the readAllBytes() method, and it returns the whole file.
GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
byte[] bytesToWriteTo = downloadStream.readAllBytes();
downloadStream.close();
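readAllBytes() comes from java.io.InputStream (Java 9+), which GridFSDownloadStream extends, so the same can be written with try-with-resources to guarantee the stream is closed; a small sketch:
byte[] fileBytes;
try (GridFSDownloadStream stream = gridFSBucket.openDownloadStream(fileId)) {
    // InputStream.readAllBytes (Java 9+) keeps reading until the end of the stream
    fileBytes = stream.readAllBytes();
}
System.out.println(new String(fileBytes, StandardCharsets.UTF_8));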

Related

Java 8: How to chunk multipart file for POST request

I have a multipart file (an image or a video) that needs to be chunked for a POST request. How can I split the file into byte array segments?
Edit: I'm using the Twitter API to upload the image; according to their docs, the media must be chunked.
I've found a solution thanks to https://www.baeldung.com/2013/04/04/multipart-upload-on-s3-with-jclouds/
public final class MediaUtil {

    public static int getMaximumNumberOfParts(byte[] byteArray) {
        int numberOfParts = byteArray.length / (1024 * 1024); // 1 MB
        if (numberOfParts == 0) {
            return 1;
        }
        return numberOfParts;
    }

    public static List<byte[]> breakByteArrayIntoParts(byte[] byteArray, int maxNumberOfParts) {
        List<byte[]> parts = new ArrayList<>();
        int fullSize = byteArray.length;
        long dimensionOfPart = fullSize / maxNumberOfParts;
        for (int i = 0; i < maxNumberOfParts; i++) {
            int previousSplitPoint = (int) (dimensionOfPart * i);
            int splitPoint = (int) (dimensionOfPart * (i + 1));
            if (i == (maxNumberOfParts - 1)) {
                splitPoint = fullSize;
            }
            byte[] partBytes = Arrays.copyOfRange(byteArray, previousSplitPoint, splitPoint);
            parts.add(partBytes);
        }
        return parts;
    }
}
// Post the request
int maxParts = MediaUtil.getMaximumNumberOfParts(multipartFile.getBytes());
List<byte[]> bytes = MediaUtil.breakByteArrayIntoParts(multipartFile.getBytes(), maxParts);
int segment = 0;
for (byte[] b : bytes) {
    // POST request here
    segment++;
}
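If fixed-size chunks are easier to reason about than a fixed number of parts, a hedged alternative sketch (splitIntoChunks and the 1 MB chunk size are my own assumptions, not anything the Twitter docs require) would be:
// Hypothetical helper: split a byte array into chunks of at most chunkSize bytes
public static List<byte[]> splitIntoChunks(byte[] data, int chunkSize) {
    List<byte[]> chunks = new ArrayList<>();
    for (int start = 0; start < data.length; start += chunkSize) {
        int end = Math.min(data.length, start + chunkSize);
        chunks.add(Arrays.copyOfRange(data, start, end));
    }
    return chunks;
}

// Usage: List<byte[]> parts = splitIntoChunks(multipartFile.getBytes(), 1024 * 1024);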
Well, you may need this:
File resource = ResourceUtils.getFile(path);
if (resource.isFile()) {
    byte[] bytes = readFile2Bytes(new FileInputStream(resource));
}

private byte[] readFile2Bytes(FileInputStream fis) throws IOException {
    int length = 0;
    byte[] buffer = new byte[4096]; // 4 KB read buffer
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    while ((length = fis.read(buffer)) != -1) {
        baos.write(buffer, 0, length);
    }
    return baos.toByteArray();
}
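On Java 7+ the manual loop can be skipped entirely; a hedged one-liner, assuming the file fits in memory and that resource is the File from the snippet above:
// java.nio.file.Files reads the whole file into a byte array in one call
byte[] bytes = Files.readAllBytes(resource.toPath());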

How to use XOR to develop an OTPInputStream in Java

I want to develop an OTPInputStream in Java that extends InputStream, takes another input stream of key data, and provides an encrypting/decrypting input stream. I need to develop a test program that shows the use of OTPInputStream with XOR and arbitrary data.
I tried the code below, but I get this error:
java.io.FileInputStream cannot be cast to java.lang.CharSequence
What should I do here?
public class Bitwise_Encryption {
    static String file = "";
    static String key = "VFGHTrbg";

    private static int[] encrypt(FileInputStream file, String key) {
        int[] output = new int[((CharSequence) file).length()];
        for (int i = 0; i < ((CharSequence) file).length(); i++) {
            int o = (Integer.valueOf(((CharSequence) file).charAt(i)) ^ Integer.valueOf(key.charAt(i % (key.length() - 1)))) + '0';
            output[i] = o;
        }
        return output;
    }

    private static String decrypt(int[] input, String key) {
        String output = "";
        for (int i = 0; i < input.length; i++) {
            output += (char) ((input[i] - 48) ^ (int) key.charAt(i % (key.length() - 1)));
        }
        return output;
    }

    public static void main(String args[]) throws FileNotFoundException {
        FileInputStream file = new FileInputStream("directory");
        encrypt(file, key);
        //decrypt();
        int[] encrypted = encrypt(file, key);
        System.out.println("Encrypted Data is :");
        for (int i = 0; i < encrypted.length; i++)
            System.out.printf("%d,", encrypted[i]);
        System.out.println("");
        System.out.println("---------------------------------------------------");
        System.out.println("Decrypted Data is :");
        System.out.println(decrypt(encrypted, key));
    }
}
I think what you want is just file.read() and file.getChannel().size() to read one byte at a time and get the size of the file.
Try something like this:
private static int[] encrypt(FileInputStream file, String key) throws IOException {
    int fileSize = (int) file.getChannel().size(); // size() returns a long
    int[] output = new int[fileSize];
    for (int i = 0; i < output.length; i++) {
        char char1 = (char) file.read();
        int o = (char1 ^ Integer.valueOf(key.charAt(i % (key.length() - 1)))) + '0';
        output[i] = o;
    }
    return output;
}
You will have to do some error handling, because file.read() returns -1 once the end of the file has been reached, and, as pointed out, reading one byte at a time means a lot of I/O operations and can slow down performance. You can read into a buffer instead, like this:
private static int[] encrypt(FileInputStream file, String key) throws IOException {
    int fileSize = (int) file.getChannel().size(); // size() returns a long
    int[] output = new int[fileSize];
    int read = 0;
    int offset = 0;
    byte[] buffer = new byte[1024];
    while ((read = file.read(buffer)) > 0) {
        for (int i = 0; i < read; i++) {
            char char1 = (char) buffer[i];
            // use the absolute position (i + offset) so the key keeps cycling across buffers
            int o = (char1 ^ Integer.valueOf(key.charAt((i + offset) % (key.length() - 1)))) + '0';
            output[i + offset] = o;
        }
        offset += read;
    }
    return output;
}
This reads 1024 bytes at a time from the file into the buffer, and you then loop through the buffer to apply your logic. The offset value tracks the current position in the output array. You will also have to make sure that i + offset does not exceed the array size.
UPDATE
After working with it, I decided to switch to Base64 encoding/decoding to remove non-printable characters:
private static String encrypt(InputStream file, String key) throws Exception {
    int read = 0;
    byte[] buffer = new byte[1024];
    try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
        while ((read = file.read(buffer)) > 0) {
            baos.write(buffer, 0, read);
        }
        return base64Encode(xorWithKey(baos.toByteArray(), key.getBytes()));
    }
}

private static String decrypt(String input, String key) {
    byte[] decoded = base64Decode(input);
    return new String(xorWithKey(decoded, key.getBytes()));
}

private static byte[] xorWithKey(byte[] a, byte[] key) {
    byte[] out = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
        out[i] = (byte) (a[i] ^ key[i % key.length]);
    }
    return out;
}

private static byte[] base64Decode(String s) {
    return Base64.getDecoder().decode(s.trim());
}

private static String base64Encode(byte[] bytes) {
    return Base64.getEncoder().encodeToString(bytes);
}
This approach is cleaner and doesn't require knowing the size of your InputStream or doing any character conversions. It reads your InputStream into an OutputStream and then Base64-encodes the XORed bytes, so no non-printable characters end up in the result.
I have tested this and it works both for encrypting and decrypting.
I got the idea from this answer:
XOR operation with two strings in java
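For reference, a small hedged usage sketch of the updated methods (the file path is a placeholder and the key is taken from the question):
public static void main(String[] args) throws Exception {
    String key = "VFGHTrbg";
    try (FileInputStream in = new FileInputStream("path/to/input.txt")) {
        String encrypted = encrypt(in, key);   // Base64 text, safe to print or store
        String decrypted = decrypt(encrypted, key);
        System.out.println("Encrypted: " + encrypted);
        System.out.println("Decrypted: " + decrypted);
    }
}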

How to get BufferedInputStream from Multipart?

I'm trying to get a BufferedInputStream from an uploaded CSV file.
I'm working with a Multipart derived from the CSV file.
When I first get the Multipart, it's a BufferedInputStream, but the buffer is all null.
But if I look deeper down, there's another buffer in the CoyoteInputStream and that has data.
How can I get at this second buffer? My code is below.
And of course it's throwing a null exception when it gets to
while ((multiPartDataPos = stream.read(buffer)) >= 0)
What am I doing wrong? Am I mistaken in thinking the CoyoteInputStream holds the data I want?
public byte[] handleUploadedFile(Multipart multiPart) throws EOFException {
    Multipart multiPartData = null;
    BufferedInputStream stream = null;
    int basicBufferSize = 0x2000;
    byte[] buffer = new byte[basicBufferSize];
    int bufferPos = 0;
    try {
        while (multiPart.hasNext()) {
            int multiPartDataPos = bufferPos;
            multiPartData = (Multipart) multiPart.next();
            stream = new BufferedInputStream(multiPartData.getInputStream());
            while ((multiPartDataPos = stream.read(buffer)) >= 0) {
                int len = stream.read(buffer, multiPartDataPos, buffer.length - multiPartDataPos);
                multiPartDataPos += len;
            }
            bufferPos = bufferPos + multiPartDataPos;
        }
    } ...
Your code doesn't make any sense.
while ((multiPartDataPos = stream.read(buffer)) >= 0) {
At this point you have read multiPartDataPos bytes into buffer, so that buffer[0..multiPartDataPos-1] contains the data just read.
int len = stream.read(buffer, multiPartDataPos, buffer.length - multiPartDataPos);
At this point you are doing another read, which could return -1, and which will otherwise add some data into buffer[multiPartDataPos..multiPartDataPos+len-1].
multiPartDataPos += len;
This step is only valid if len > 0.
And you are doing nothing with the buffer, so the next time around the loop you will clobber whatever you just read.
The correct way to read any stream in Java is as follows:
while ((count = in.read(buffer)) > 0)
{
    // use buffer[0..count-1], for example out.write(buffer, 0, count);
}
I don't understand why you think access to an underlying stream is required or what it's going to give you that you don't already have.
It turns out the better solution was to move the data from the InputStream to a ByteArrayOutputStream and then return ByteArrayOutputStream.toByteArray():
Multipart multiPartData = null;
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int read;
byte[] input = new byte[4096];
InputStream is;
try {
    multiPartData = (Multipart) multipart.next();
    is = multiPartData.getInputStream();
    while ((read = is.read(input, 0, input.length)) != -1) {
        buffer.write(input, 0, read);
    }
    buffer.flush();
    return buffer.toByteArray(); // just a test right now
}

Inserting an image at a particular position in a Word document using docx4j

I want to add an image at a particular position in my Word document using docx4j. I don't want inline insertion. The code below adds the image inline with the text, but I want a floating insertion where I can explicitly specify where on the page the image should be placed. Please help me.
public R addUserPic(P parag, WordprocessingMLPackage wordMLPackage) throws Exception {
    File file = new File("src/main/resources/PictureNew.png");
    byte[] bytes = convertImageToByteArray(file);
    BinaryPartAbstractImage imagePart = BinaryPartAbstractImage.createImagePart(wordMLPackage, bytes);
    int docPrId = 1;
    int cNvPrId = 2;
    Inline inline = imagePart.createImageInline("Filename hint", "Alternative text", docPrId, cNvPrId, false);
    ObjectFactory factory = new ObjectFactory();
    R run = factory.createR();
    org.docx4j.wml.Drawing drawing = factory.createDrawing();
    run.getContent().add(drawing);
    drawing.getAnchorOrInline().add(inline);
    return run;
}

private static byte[] convertImageToByteArray(File file) throws FileNotFoundException, IOException {
    InputStream is = new FileInputStream(file);
    long length = file.length();
    if (length > Integer.MAX_VALUE) {
        System.out.println("File too large!!");
    }
    byte[] bytes = new byte[(int) length];
    int offset = 0;
    int numRead = 0;
    while (offset < bytes.length
            && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
        offset += numRead;
    }
    if (offset < bytes.length) {
        System.out.println("Could not completely read file " + file.getName());
    }
    is.close();
    return bytes;
}
The thread you have cross-posted in, at http://www.docx4java.org/forums/docx-java-f6/how-to-create-a-floating-image-t1224.html, answers your question.

How to parse binary data from XML?

I'm trying to parse XML with StAX for a school project. If I have an element:
<binary data></binary data>
and this data takes up to 250 MB, how do I deal with it?
XMLStreamReader has a byte[] getElementAsBinary() method, but I can't afford to hold that amount in memory. If anyone can help with this I would really appreciate it.
EDIT
Is it possible to somehow read the data to a stream? Currently I have:
private byte[] readBinary(XMLStreamReader2 sr) throws XMLStreamException {
    Stax2Util.ByteAggregator aggr = new Stax2Util.ByteAggregator();
    byte[] buffer = aggr.startAggregation();
    while (true) {
        int offset = 0;
        int len = buffer.length;
        do {
            int readCount = sr.readElementAsBinary(buffer, offset, len);
            if (readCount < 1) { // all done!
                return aggr.aggregateAll(buffer, offset);
            }
            offset += readCount;
            len -= readCount;
        } while (len > 0);
        buffer = aggr.addFullBlock(buffer);
    }
}
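The aggregator above still collects everything into a single byte[] in memory. If the goal is to avoid that, readElementAsBinary can be drained straight into an OutputStream instead; a minimal untested sketch (streamBinaryToFile and the 8 KB buffer size are my own choices):
// Decode the element's binary content incrementally and write it to a file,
// so only one small buffer is ever held in memory.
private void streamBinaryToFile(XMLStreamReader2 sr, File target) throws XMLStreamException, IOException {
    byte[] buffer = new byte[8192];
    try (OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
        int readCount;
        while ((readCount = sr.readElementAsBinary(buffer, 0, buffer.length)) > 0) {
            out.write(buffer, 0, readCount);
        }
    }
}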
