I have connected to an FTP location using:
URL url = new URL("ftp://user:password@mydomain.com/" + file_name + ";type=i");
I read the content into a byte array as shown below:
InputStream fis = url.openConnection().getInputStream();
byte[] buffer = new byte[1024];
int count = 0;
while ((count = fis.read(buffer)) != -1)
{
    // check whether the bytes in buffer are a file
}
I want to be able to check whether the bytes in the buffer are a file, without explicitly creating a specific file and writing to it, like this:
File xfile = new File("dir1/");
FileOutputStream fos = new FileOutputStream(xfile);
fos.write(buffer, 0, count);
if (xfile.isFile())
{
}
In an ideal world I would do something like this:
File xfile = new File(buffer); // Note: you cannot do this in Java
if (xfile.isFile())
{
}
isFile() here is meant to check whether the bytes read from the FTP location are a file. I don't want to pass an explicit file name, as I do not know the name of the file on the FTP location.
Any solutions available?
What is a file?
A computer file is a block of arbitrary information [...] which is available to a computer program and is usually based on some kind of durable storage. A file is durable in the sense that it remains available for programs to use after the current program has finished.
Your bytes that are stored in the byte array will become part of a file if you write them to some kind of durable storage.
Sure, we often say that we read a file or write a file, but basically we read bytes from a file and write bytes to a file.
So we can't test whether a byte array's content is a file or not, simply because every byte array can be used to create a file (even an empty one).
BTW, the FTP server does not send a file. It (1) reads bytes and (2) a filename, then (3) sends the bytes and (4) the filename, so that a client can (5) read the bytes and (6) the filename and use both to (7) create a file. The FTP server doesn't even have to access a file; it can take the bytes and names from a database or create both in memory...
I guess you cannot check whether the byte[] array is a file or not. Why don't you just use an already written and tested library, for example Apache Commons Net: http://commons.apache.org/net/
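For instance, here is a minimal sketch along those lines, assuming the Commons Net FTPClient API and reusing the host and credentials from the question (error handling omitted). The server's directory listing already tells you which entries are files and what they are called, so you never have to guess from the raw bytes:

import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;

public class FtpDownload {
    public static void main(String[] args) throws Exception {
        FTPClient ftp = new FTPClient();
        ftp.connect("mydomain.com");           // host taken from the question's URL
        ftp.login("user", "password");
        ftp.enterLocalPassiveMode();
        ftp.setFileType(FTP.BINARY_FILE_TYPE); // the equivalent of ";type=i"

        // The listing reports names and whether each entry is a file or a directory.
        for (FTPFile remote : ftp.listFiles()) {
            if (remote.isFile()) {
                try (OutputStream out = new FileOutputStream(remote.getName())) {
                    ftp.retrieveFile(remote.getName(), out);
                }
            }
        }
        ftp.logout();
        ftp.disconnect();
    }
}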
There is no way to do that easily.
A file is a byte array on a disk and a byte array will be a file if you write it to disk. There is no reliable way of telling what is in the data you just received, without parsing the data and checking if you can find a valid file header in it.
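If you do want such a heuristic, the usual trick is to sniff well-known file signatures ("magic numbers") at the start of the downloaded bytes. A rough sketch (the signatures are real, but the helper and its names are just illustrative, and an "unknown" result does not prove the data isn't a file):

import java.util.Arrays;

public class MagicSniffer {
    // Returns a best-guess type for the start of the buffer, or "unknown".
    static String guessType(byte[] buf) {
        if (startsWith(buf, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) return "png";
        if (startsWith(buf, new byte[] {(byte) 0x1F, (byte) 0x8B})) return "gzip";
        if (startsWith(buf, new byte[] {'P', 'K', 3, 4})) return "zip";
        if (startsWith(buf, new byte[] {'%', 'P', 'D', 'F'})) return "pdf";
        return "unknown";
    }

    static boolean startsWith(byte[] buf, byte[] magic) {
        return buf.length >= magic.length
                && Arrays.equals(Arrays.copyOf(buf, magic.length), magic);
    }
}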
Here, isFile() means that the content fetched from the FTP stream is a file.
The answer to that is simple. You can't do it because it doesn't make any sense.
What you have read from the stream IS a sequence of bytes stored in memory.
A file is a sequence of bytes stored on a disk (typically).
These are not the same thing. (Or, if you want to get all theoretical/philosophical, you have to answer the question "when is a sequence of bytes a file, and when is it not a file?".)
Now a more sensible question to ask might be:
How do I know whether the stuff I fetched by FTP is the contents of a file on the FTP server?
(... as distinct from a rendering of a directory or something).
The answer is that you can't be sure, if you fetched it by opening a URLConnection to the FTP server ... like you have done. It is like asking "is '(123) 555-5555' a phone number?". It could be a phone number, or it could just be a sequence of characters that looks like a phone number.
Related
I am trying to read a big AWS S3 compressed object (gz). I don't want to read the whole object; I want to read it in parts, so that I can process the uncompressed data in parallel.
I am reading it with a GetObjectRequest with the "Range" header, where I am setting a byte range.
However, when I give a byte range somewhere in the middle, e.g. (100, 200), it fails with "Not in GZIP format".
The reason for the failure is that the AWS request returns a stream, but when I wrap it in a GZIPInputStream it fails, because GZIPInputStream expects the stream to start with the GZIP magic number (GZIP_MAGIC = 0x8b1f), which is not present at that offset.
GetObjectRequest rangeObjectRequest = new GetObjectRequest(<<Bucket>>, <<Key>>).withRange(100, 200);
S3Object object = s3Client.getObject(rangeObjectRequest);
S3ObjectInputStream rawData = object.getObjectContent();
InputStream data = new GZIPInputStream(rawData);
Can anyone guide the right approach?
GZIP is a compression format in which each byte in the file depends on all of the bytes that precede it. Which means that you can't pick an arbitrary byte range out of the file and make sense of it.
If you need to read byte ranges, you'll need to store it uncompressed.
You could also create your own file storage format that stores chunks of the file as separately-compressed blocks. You could do this using the ZIP format, where each file in the archive represents a specific block size. But you'd need to implement your own ZIP directory reader to make that work.
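If the object has to stay gzipped as one stream, the practical alternative is to skip the Range header, read the object sequentially from byte 0, and hand the decompressed chunks to workers. A rough sketch using the same SDK calls as in the question (the client construction and the bucket/key names are placeholders):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;

public class S3GzipStreamRead {
    public static void main(String[] args) throws Exception {
        AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();

        // No Range header: decompression has to start at byte 0 of the object.
        S3Object object = s3Client.getObject("my-bucket", "my-key.gz");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(object.getObjectContent()),
                        StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // hand each decompressed line (or batch of lines) to a worker pool here
            }
        }
    }
}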
I am attempting to transfer a gzipped file using IOUtils.copyLarge. When I transfer from a GZIPInputStream to a non-compressed output, it works fine, but when I transfer the original InputStream (attempting to leave it compressed) the end result is 0 bytes.
I have verified the input file is correct. Here is an example of what works
IOUtils.copyLarge(new GZIPInputStream(inputStream), out)
This of course results in an uncompressed file being written out. I would like to keep the file compressed as it is in the original input.
When I try val countBytes = IOUtils.copyLarge(inputStream, out) the result is 0, and the resulting file is empty. The desired result is simply copying the already compressed gzip file to a new destination maintaining compression.
Reading the documentation for the API, I should be using this correctly. Any ideas on what is preventing it from working?
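For what it's worth, copying the still-compressed bytes with IOUtils.copyLarge does work when the source stream is freshly opened and has not already been consumed (for example by an earlier read through a GZIPInputStream). The sketch below uses plain file streams and hypothetical file names, not the asker's actual setup:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.io.IOUtils;

public class RawGzipCopy {
    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream("input.gz");
             OutputStream out = new FileOutputStream("copy.gz")) {
            // Copies the compressed bytes as-is; no GZIPInputStream involved,
            // so the destination file stays gzipped.
            long copied = IOUtils.copyLarge(in, out);
            System.out.println("Copied " + copied + " bytes");
        }
    }
}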
I have asked this question https://stackoverflow.com/questions/32735189/sending-files-from-java-server-to-unity3d-c-sharp-client but I saw that it isn't an optimal solution to send files between Java and C# via built-in operations, because I also need other messages, not only the file content.
Therefore, I tried using Protobuf, because it is fast and can serialize/deserialize objects in a platform-independent way. My .proto file is the following:
message File{
optional int32 fileSize = 1;
optional string fileName = 2;
optional bytes fileContent = 3;
}
So, I set the values for each variable in the generated .java file:
file.setFileSize(fileSize);
file.setFileName(fileName);
file.setFileContent(ByteString.copyFrom(fileContent, 0, fileContent.length));
I saw many tutorials about how to write the objects to a file and read from it. However, I can't find any example about how to send a file from server socket to client socket.
My intention is to serialize the object (file size, file name and file content) on the Java server and to send this information to the C# client, so the file can be deserialized and stored on the client side.
In my example code above, the server reads the bytes of the file (an image file) and writes them to the output stream, so that the client can read the bytes through its input stream and write them to disk. I want to achieve the same thing with serialization of my generated .proto file.
Can anyone provide me an example or give me a hint how to do that?
As described in the documentation, protobuf does not take care of where a message starts and stops, so when using a stream socket like TCP you'll have to do that yourself.
From the doc:
[...] If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own. The easiest way to solve this problem is to write the size of each message before you write the message itself. When you read the messages back in, you read the size, then read the bytes into a separate buffer, then parse from that buffer. [...]
Length-prefixing is a good candidate. Depending on what language you're writing in, there are libraries that do length-prefixing over e.g. TCP that you can use, or you can define it yourself.
An example representation of the buffer on the wire might be of the following format (beginning of buffer to the left):
[buf_length|serialized_buffer2]
So your code to pack the buffer before sending might look something like this (this is in JavaScript with Node.js):
function pack(message) {
    // Allocate room for a 2-byte length prefix plus the serialized message
    var packet = new Buffer(message.length + 2);
    packet.writeIntBE(message.length, 0, 2); // write the length prefix first
    message.copy(packet, 2);                 // then the message itself
    return packet;
}
To read you would have to do the opposite:
client.on('data', function (data) {
    dataBuffer = Buffer.concat([dataBuffer, data]);
    // Process as many complete messages as are currently buffered
    while (dataBuffer.length >= 2) {
        // Message length excluding the 2-byte length prefix
        var msgLen = dataBuffer.readIntBE(0, 2);
        if (dataBuffer.length < msgLen + 2) {
            break; // wait for the rest of this message to arrive
        }
        var thisMsg = new Buffer(dataBuffer.slice(2, msgLen + 2));
        // do something with the msg here
        // Remove the processed message from the buffer
        dataBuffer = dataBuffer.slice(msgLen + 2);
    }
});
You should also be aware that when sending multiple protobufs on a TCP socket, they are likely to be buffered for network optimization (concatenated) and sent together, meaning some sort of delimiting is needed anyway.
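Since the server in the question is Java, it is worth noting that the protobuf Java runtime already ships a length-prefixing convention: writeDelimitedTo writes a varint length before each message, and the receiver reads that prefix to know where the message ends. A minimal sketch of the sending side (the generated File class from the .proto above, the file name and the port are assumptions):

import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Paths;
import com.google.protobuf.ByteString;

public class ProtoFileServer {
    public static void main(String[] args) throws Exception {
        byte[] fileContent = Files.readAllBytes(Paths.get("punton.png")); // example file

        try (ServerSocket server = new ServerSocket(9999);
             Socket client = server.accept();
             OutputStream out = client.getOutputStream()) {

            // "File" is the message class generated from the .proto in the question.
            File message = File.newBuilder()
                    .setFileSize(fileContent.length)
                    .setFileName("punton.png")
                    .setFileContent(ByteString.copyFrom(fileContent))
                    .build();

            // writeDelimitedTo prepends a varint length, so the receiver can tell
            // where this message ends and the next one begins.
            message.writeDelimitedTo(out);
        }
    }
}

The C# client then has to read the same varint length prefix before handing the remaining bytes to its protobuf parser.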
I'm trying to send an image from a Qt server through a socket and display it in a client written in Java. Until now I have only transferred strings between both sides, and I have tried different examples for sending images, but with no results.
The code I used to transfer the image in Qt is:
QImage image;
image.load("../punton.png");
qDebug()<<"Image loaded";
QByteArray ban; // Construct a QByteArray object
QBuffer buffer(&ban); // Construct a QBuffer object using the QbyteArray
image.save(&buffer, "PNG"); // Save the QImage data into the QBuffer
socket->write(ban);
On the other end, the code to read it in Java is:
BufferedInputStream in = new BufferedInputStream(socket.getInputStream(),1);
File f = new File("C:\\Users\\CLOUDMOTO\\Desktop\\JAVA\\image.png");
System.out.println("Receiving...");
FileOutputStream fout = new FileOutputStream(f);
byte[] by = new byte[1];
for(int len; (len = in.read(by)) > 0;){
fout.write(by, 0, len);
System.out.println("Done!");
}
The process in Java gets stuck until I close the Qt server, and after that the generated file is corrupt.
I'll appreciate any help, because it's necessary for me to do this and I'm new to programming in both languages.
Also, I've used the following commands, and the receiving process now ends and shows a message, but the file is corrupt. In Qt:
socket->write(ban + "-1");
socket->close();
And in Java:
System.out.println(by);
String received = new String(by, 0, by.length, "ISO8859_1");
System.out.println(received);
System.out.println("Done!");
You cannot transport a file over a socket in such a simple way. You are not giving the receiver any clue about how many bytes are coming. Read the javadoc for InputStream.read() carefully. Your receiver is in an endless loop because it keeps waiting for the next byte until the stream is closed, so you have partially fixed that by calling socket->close() at the sender side. Ideally, you need to write the length of ban into the socket before the buffer, read that length at the receiver side and then receive only that amount of bytes. Also flush and close the receiver stream before trying to read the received file.
I have absolutely no idea what you wanted to achieve with socket->write(ban+"-1"). Your logged output starts with %PNG, which is correct. But I can see "-1" at the end, which means that you appended characters to the binary image data and hence corrupted it. Why?
And no, a 1x1 PNG does not have a size of 1 byte. It does not even have 4 bytes (red, green, blue, alpha). PNG needs things like a header and checksums. Have a look at the size of the file on the filesystem; that is the number of bytes your by buffer actually needs to receive.
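To make that advice concrete, here is a sketch of the receiving side (not the asker's code: the host, port and the 4-byte big-endian length prefix are assumptions, and the Qt sender would have to write that prefix first, e.g. via QDataStream):

import java.io.DataInputStream;
import java.io.FileOutputStream;
import java.net.Socket;

public class ImageReceiver {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 1234); // placeholders
             DataInputStream in = new DataInputStream(socket.getInputStream());
             FileOutputStream fout = new FileOutputStream("image.png")) {

            // The sender writes the payload size as a 4-byte big-endian int first.
            int length = in.readInt();

            byte[] imageBytes = new byte[length];
            in.readFully(imageBytes); // blocks until exactly 'length' bytes have arrived
            fout.write(imageBytes);
            System.out.println("Received " + length + " bytes");
        }
    }
}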
I'm reading a file line by line, like this:
FileReader myFile = new FileReader(file);
BufferedReader inputFile = new BufferedReader(myFile);
// Read the first line
String currentRecord = inputFile.readLine();
while (currentRecord != null) {
    // process the current record, then read the next line
    currentRecord = inputFile.readLine();
}
But if other types of files are uploaded, it will still read their contents. For instance, if the uploaded file is an image, it will output junk characters when reading it. So my question is: how can I check that the file really is a CSV before reading it?
Checking the extension of the file is kind of lame, since someone can upload a file that is not CSV but has a .csv extension. Thanks in advance.
Determining the MIME type of a file is not something easy to do, especially if ASCII sections can be mixed with binary ones.
Actually, when you look at how a Java mail system determines the MIME type of an email, it does involve reading all the bytes in it and applying some "rules".
Check out MimeUtility.java
If the primary type of this datasource is "text" and if all the bytes in its input stream are US-ASCII, then the encoding is "7bit".
If more than half of the bytes are non-US-ASCII, then the encoding is "base64".
If less than half of the bytes are non-US-ASCII, then the encoding is "quoted-printable".
If the primary type of this datasource is not "text", then if all the bytes of its input stream are US-ASCII, the encoding is "7bit".
If there is even one non-US-ASCII character, the encoding is "base64".
#return "7bit", "quoted-printable" or "base64"
As mentioned by mmyers in a deleted comment, JavaMimeType is supposed to do the same thing, but:
it has been dead since 2006
it does involve reading all the content!
For example, using a magic-number matching API (Magic / MagicMatch, as in the jMimeMagic library):
File file = new File("/home/bibi/monfichieratester");
InputStream inputStream = new FileInputStream(file);
ByteArrayOutputStream byteArrayStream = new ByteArrayOutputStream();
// Read the whole file into memory
int readByte;
while ((readByte = inputStream.read()) != -1) {
    byteArrayStream.write(readByte);
}
inputStream.close();
// Let the magic-number matcher guess the MIME type from the content
byte[] bytes = byteArrayStream.toByteArray();
MagicMatch m = Magic.getMagicMatch(bytes);
String mimetype = m.getMimeType();
So... since you are reading all the content of the file anyway, you could take advantage of that to determine the type based on that content and your own rules.
Java Mime Magic may be of use. It'll analyse mime-types from files and input streams. I can't vouch for its functionality, however.
This link may provide further info. It provides several different means of determining how to do what you want (or at least something similar).
I would perhaps be tempted to write something specific to your problem domain, e.g. determining the number of comma-separated values per line and rejecting the file if it's not within certain limits. Then split on the commas and parse each entry according to requirements (e.g. are they doubles/floats/valid Strings, and if Strings, what encoding). I think you may have to do this anyway, given that someone may upload a file that starts like a CSV but is corrupted half-way through.
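A rough sketch of that idea (only a heuristic: the number of lines checked, the minimum column count and the control-character test are assumptions you would tune, and it does not handle quoted commas):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class CsvSniffer {
    // Returns true if the first few lines look like CSV with a consistent column count.
    static boolean looksLikeCsv(String path, int linesToCheck) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            int expectedColumns = -1;
            String line;
            for (int i = 0; i < linesToCheck && (line = reader.readLine()) != null; i++) {
                for (char c : line.toCharArray()) {
                    if (c < 0x09) {
                        return false; // control characters suggest binary content
                    }
                }
                int columns = line.split(",", -1).length;
                if (expectedColumns == -1) {
                    expectedColumns = columns;
                } else if (columns != expectedColumns) {
                    return false; // inconsistent column count
                }
            }
            return expectedColumns > 1;
        }
    }
}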