Checksum verification for a file on two different Hadoop file systems - Java

I am very new to the Hadoop file system.
I have two different Hadoop file systems (one is the client and the other is the server). They are in different domains and have no direct access to one another.
I want to copy files (several GB in size) from the server to the client.
Since I don't have direct access to the server from the client, I used the following method to copy the files.
I wrote a server Java program that reads the file using the server configuration and writes it as bytes to stdout:
System.out.write(buf.array(), 0, length);
System.out.flush();
Then I wrote a CGI script that calls this server jar.
Then I wrote a client Java program that calls the above CGI script to read the data:
FSDataOutputStream dataOut = fs.create(client_file, true, bufSize, replica, blockSize);
URL url = new URL("http://xxx.company.com/cgi/my_cgi_script?" + "file=" + server_file);
InputStream is = url.openStream();
byte[] byteChunk = new byte[1024 * 1024];
int n = 0;
while ((n = is.read(byteChunk)) > 0) {
    dataOut.write(byteChunk, 0, n);
    received += n;
}
dataOut.close();
The copy now completes without any issue, and I see the same file size on the server and the client.
However, when I compute the FileChecksum for the same file on the client and server file systems, I get different values:
MD5-of-262144MD5-of-512CRC32C:86094f4043b9592a49ec7f6ef157e0fe
MD5-of-262144MD5-of-512CRC32C:a83a0b3f182db066da7520b36c79e696
Can you please help me fix this issue?
Note: I am using the same blockSize on the client and server file systems.
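One way to narrow this down, offered as a sketch and not a confirmed fix: a digest computed over the raw bytes depends only on the file content, so computing one on each side would show whether the copied bytes really are identical. The class name StreamDigest and the method sha256Hex below are hypothetical. The in-memory streams in main stand in for the two files, which on Hadoop would presumably be opened with fs.open(path) on each file system.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Hadoop): digests any InputStream in chunks.
class StreamDigest {

    // Reads the stream to the end and returns its SHA-256 digest as lowercase hex.
    static String sha256Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] chunk = new byte[64 * 1024];
        int n;
        while ((n = in.read(chunk)) != -1) {
            md.update(chunk, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for the server-side and client-side copies of the file.
        byte[] serverBytes = "example file content".getBytes(StandardCharsets.UTF_8);
        byte[] clientBytes = "example file content".getBytes(StandardCharsets.UTF_8);
        String serverDigest = sha256Hex(new ByteArrayInputStream(serverBytes));
        String clientDigest = sha256Hex(new ByteArrayInputStream(clientBytes));
        System.out.println(serverDigest.equals(clientDigest) ? "content matches" : "content differs");
    }
}
```

If the two digests differ, the bytes were changed somewhere along the stdout/CGI/HTTP path. If they match, the difference lies in how the two file systems compute their checksums.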

Related

Understanding ICAP Server File Transfer In Java Problem

I am trying to get a Java ICAP server to interface with a Blue Coat device that is acting as the ICAP client. The ICAP server I am working with is here: icap. I have mostly been getting things working, but now I am stuck on why the server side is not receiving the file. Below are a few lines of code that show where I am at most recently. Most of the code has been omitted.
IcapRequest request = (IcapRequest) e.getMessage();
ChannelBuffer buffer = null;
buffer = request.getHttpRequest().getContent();
if (buffer != null) {
    System.out.println("Buffer = " + buffer.toString(Charset.defaultCharset()));
}
try {
    FileOutputStream fout = new FileOutputStream(testfile);
    while (request.getHttpResponse().getContent().readable()) {
        byte[] bb = new byte[request.getHttpResponse().getContent().readableBytes()];
        request.getHttpResponse().getContent().readBytes(bb);
        fout.write(bb);
    }
Using Wireshark and the print statements on my server, I can see that I am getting the file name, the HTML request, and so on, but I am not getting all the content when the file is large. If it is a small .txt file, I receive the content and can save the complete file to disk on the server side. If it is any kind of .docx file of about 10 KB or larger, there appears to be only one ICAP client packet with content (sent with PSH) and no further content, so the file I save on the server is incomplete and therefore corrupt. At this point I am not sure why my ICAP server cannot save the .docx file sent from the Blue Coat device, though I am leaning toward the problem being on the server side. Any advice would be greatly appreciated.

How to continue downloading file after disconnection?

I have a simple Java server that uses sockets.
The server reads from the client the URL of the file that needs to be downloaded.
FileOutputStream outStream = new FileOutputStream(SERVER_PATH + file.getName());
BufferedOutputStream out = new BufferedOutputStream(outStream);
byte buf[] = new byte[BATCH];
int read = 0;
while ((read = in.read(buf, 0, BATCH)) >= 0) {
    out.write(buf, 0, read);
}
How can I continue downloading the file after a disconnection?
Your question is a little ambiguous.
From the code, it looks like you are reading from a file on the client machine and writing it to the server.
Assuming that is the situation, these points may help you resolve it:
1. An IOException will be thrown if the connection is lost. You have to handle the exception and reconnect the socket, perhaps after waiting for some time.
2. Then you need to open the server file in append mode and continue with out.write. The data already written through out is not reset or lost by the disconnection.
Thanks, Sunil
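The two points above can be sketched as follows. This is only an illustration under stated assumptions: the class ResumableCopy and its resume method are hypothetical names, and the sketch assumes the reconnected source stream restarts from the beginning of the file, so the bytes already stored are read and discarded before appending.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical helper illustrating resume-by-append; names are not from the question.
class ResumableCopy {

    // Skips the bytes the target already holds, then appends the remainder.
    // Assumes 'source' restarts from the beginning of the file after a reconnect.
    static void resume(InputStream source, File target) throws IOException {
        long remainingToSkip = target.length(); // 0 if the file does not exist yet
        byte[] buf = new byte[8 * 1024];
        while (remainingToSkip > 0) {
            int n = source.read(buf, 0, (int) Math.min(buf.length, remainingToSkip));
            if (n == -1) {
                return; // source is shorter than what is already stored
            }
            remainingToSkip -= n;
        }
        // The 'true' argument opens the file in append mode, so earlier data is kept.
        try (OutputStream out = new FileOutputStream(target, true)) {
            int n;
            while ((n = source.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] full = "0123456789".getBytes();
        File target = File.createTempFile("resume", ".bin");
        // Simulate a first attempt that was cut off after 4 bytes.
        Files.write(target.toPath(), Arrays.copyOf(full, 4));
        resume(new ByteArrayInputStream(full), target);
        System.out.println(new String(Files.readAllBytes(target.toPath()))); // prints 0123456789
        target.delete();
    }
}
```

In a real client-server setup it would likely be cheaper to tell the sender the offset to start from, so the skipped bytes are not sent again. That would need a small addition to the protocol, which the question does not describe.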

Apache Commons Net Slow FTP Upload

I'm using Apache Commons Net 3.3 to handle FTP transfers in a Java application.
Downloads seem to work fine, but I'm getting speeds a lot slower than the local internet connection capabilities for uploads.
The code that writes the file data to the stream looks like this:
BufferedOutputStream out = new BufferedOutputStream(ftp.getOutputStream(prt));
BufferedInputStream in = new BufferedInputStream(prov.getInputStream(s));
byte[] buff = new byte[BUFF_SIZE];
int len;
while ((len = in.read(buff)) >= 0 && !prog.isCanceled()) {
    out.write(buff, 0, len);
    total += len;
    prog.setProgress((int) (Math.round((total / combo) * 100)));
}
in.close();
out.close();
BUFF_SIZE is 16 KB, and I have also set the FTPClient buffer size to 16 KB via setBufferSize.
The issue isn't with the server or my internet connection, because the upload proceeds at a much more reasonable speed when I use FileZilla as an FTP client.
The issue seems to occur with both Java 6 and Java 7 JVMs.
Does anyone have any ideas as to why this is happening? Is there a problem with Commons Net or Java? Or is there something I haven't configured correctly?
I had the same problem. Using SDK 1.6 resolved it, but I am still trying to find a better way.
UPD: Solved (see comments)

Trying to send a file and a string through the same socket (in Java)

I need to send a file from the server to the client through a socket (let's say port 8478) and also send a message in the middle of the file transfer (something like "hi", "you reached your limit", or "you reached 50% of your limit").
Sending only the file is easy: I am using
BufferedInputStream and BufferedOutputStream on the client and server side.
How can I also send a message in the middle of the file transfer on the same port (8478)?
Thank you all.
This is how I transfer the file:
Server side:
BufferedInputStream d = new BufferedInputStream(new FileInputStream(s));
BufferedOutputStream outStream = new BufferedOutputStream(cs.getOutputStream());
ObjectOutputStream msgoutStream = new ObjectOutputStream(cs.getOutputStream());
byte buffer[] = new byte[1024];
int read;
while ((read = d.read(buffer)) != -1)
{
    //msgoutStream.writeUTF("hjlhkhjk");
    outStream.write(buffer, 0, read);
    outStream.flush();
}
Client side:
byte buffer[] = new byte[1024];
int read;
int f = 0;
while ((read = d.read(buffer)) != -1)
{
    if (ifContinun)
    {
        System.out.println("strat write to file...");
    }
    //String s1=msgInPutStream.readLine();
    //String s2=msgInPutStream.readUTF();
    outStream.write(buffer, 0, read);
    outStream.flush();
    if (ifContinun)
    {
        System.out.println("after write to file...");
        ifContinun = false;
    }
}
You need to send the file in parts. You can invent a protocol like this:
short stream-id
short length of message
bytes of the message
This will allow you to interleave multiple streams of data on the same socket and have the other end separate the different streams again.
However, it's likely to be much simpler to open two connections, which avoids the need for a protocol like this. FTP, for example, does this. ;)
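As a rough sketch of the stream-id / length / bytes layout described above: the class FrameCodec, the stream ids 1 and 2, and the in-memory "wire" are all assumptions made for illustration. On a real connection the streams would wrap socket.getOutputStream() and socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical framing helper: [short stream-id][short length][payload bytes].
class FrameCodec {
    static final int FILE_STREAM = 1;    // assumed id for file data
    static final int MESSAGE_STREAM = 2; // assumed id for text messages

    static final class Frame {
        final int streamId;
        final byte[] payload;
        Frame(int streamId, byte[] payload) {
            this.streamId = streamId;
            this.payload = payload;
        }
    }

    static void writeFrame(DataOutputStream out, int streamId, byte[] payload) throws IOException {
        if (payload.length > 0xFFFF) {
            throw new IllegalArgumentException("payload does not fit a 16-bit length");
        }
        out.writeShort(streamId);
        out.writeShort(payload.length);
        out.write(payload);
    }

    // Returns the next frame, or null once the stream has ended cleanly.
    static Frame readFrame(DataInputStream in) throws IOException {
        int streamId;
        try {
            streamId = in.readUnsignedShort();
        } catch (EOFException endOfStream) {
            return null;
        }
        byte[] payload = new byte[in.readUnsignedShort()];
        in.readFully(payload);
        return new Frame(streamId, payload);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        DataOutputStream out = new DataOutputStream(wire);
        writeFrame(out, FILE_STREAM, new byte[] {1, 2, 3});
        writeFrame(out, MESSAGE_STREAM, "50% of limit".getBytes(StandardCharsets.UTF_8));
        writeFrame(out, FILE_STREAM, new byte[] {4, 5});
        out.flush();

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        for (Frame f = readFrame(in); f != null; f = readFrame(in)) {
            if (f.streamId == FILE_STREAM) {
                file.write(f.payload); // file parts are reassembled in order
            } else {
                System.out.println("message: " + new String(f.payload, StandardCharsets.UTF_8));
            }
        }
        System.out.println("file bytes received: " + file.size());
    }
}
```

With a 16-bit length, each frame carries at most 65535 bytes, so a large file is simply sent as many frames.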
In order to do this, you need to define a protocol on top of TCP. For example, the protocol can be:
There is a series of messages
Each message has a type
Each message is preceded by 4 bytes that carry the size of the next message
Each message starts with a type byte
The types are: 1 -- StartFile, 2 -- NextFileChunk, 3 -- TextMessage
The second byte onwards contains the body of the message
For StartFile, the rest of the bytes constitute the filename and whatever other properties you want to send. (You can choose to use regular Java serialization.)
For NextFileChunk, you just have the next n bytes of the file being transferred
For TextMessage, the rest of the bytes carry the text message
One way is to use an escape code to indicate when you change from file transfer to text transfer, and vice versa. Because a binary file may itself contain your escape code, you must handle those occurrences somehow.
But message protocol is preferable.
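For comparison, here is a minimal sketch of the escape-code idea. Everything in it is an assumption chosen for illustration: the class EscapeCodec, the escape byte 0x1B, and the switch codes 'T' (text follows) and 'F' (file data follows). A literal escape byte inside the data is sent twice so the receiver does not mistake it for a mode switch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical escape-code scheme; the byte values are arbitrary choices.
class EscapeCodec {
    static final int ESC = 0x1B;
    static final int TO_TEXT = 'T';
    static final int TO_FILE = 'F';

    // Writes data, doubling every ESC byte so it cannot look like a mode switch.
    static void writeEscaped(OutputStream out, byte[] data) throws IOException {
        for (byte b : data) {
            if ((b & 0xFF) == ESC) {
                out.write(ESC);
            }
            out.write(b);
        }
    }

    // Announces that the following bytes belong to the given mode.
    static void writeSwitch(OutputStream out, int mode) throws IOException {
        out.write(ESC);
        out.write(mode);
    }

    // Splits the incoming bytes into file data and text data.
    static void decode(InputStream in, OutputStream file, OutputStream text) throws IOException {
        OutputStream current = file; // the transfer starts in file mode
        int b;
        while ((b = in.read()) != -1) {
            if (b != ESC) {
                current.write(b);
                continue;
            }
            int next = in.read();
            if (next == ESC) {
                current.write(ESC); // escaped literal ESC byte
            } else if (next == TO_TEXT) {
                current = text;
            } else if (next == TO_FILE) {
                current = file;
            } else {
                throw new IOException("unexpected byte after ESC: " + next);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] fileData = {10, 0x1B, 'T', 20, 0x1B, 0x1B, 30}; // deliberately contains ESC bytes
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        writeEscaped(wire, Arrays.copyOfRange(fileData, 0, 3));
        writeSwitch(wire, TO_TEXT);
        writeEscaped(wire, "hi".getBytes(StandardCharsets.UTF_8));
        writeSwitch(wire, TO_FILE);
        writeEscaped(wire, Arrays.copyOfRange(fileData, 3, fileData.length));

        ByteArrayOutputStream file = new ByteArrayOutputStream();
        ByteArrayOutputStream text = new ByteArrayOutputStream();
        decode(new ByteArrayInputStream(wire.toByteArray()), file, text);
        System.out.println("text: " + new String(text.toByteArray(), StandardCharsets.UTF_8));
        System.out.println("file intact: " + Arrays.equals(fileData, file.toByteArray()));
    }
}
```

The receiver has to inspect every byte, which is one reason a length-prefixed message protocol tends to be easier to get right.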

Sending file with custom attributes over a network

I want to create a client-server program that allows the client to send a file to the server along with some information about the file (sender name, description, etc.).
The file could potentially be quite large, since it could be a text, picture, audio or video file. Because of that, I do not want to read the whole file into a byte array before sending. I would rather read the file in blocks, send them over the network, and let the server append the blocks to the file at its end.
However I am faced with the problem of how to best send the file along with a few bits of information about the file itself. I would like at a minimum to send the sender's name and a description both of which will be input to the client program by the user, but this may change in the future so should be flexible.
What is a good way of doing this that would also allow me to "stream" the file being sent rather than reading it in as a whole and then sending?
Sockets are natively streams of bytes, so you shouldn't have a problem there. I suggest a protocol like the one in the code below: the properties as a string, then the length of the file, then the file contents.
This allows you to send arbitrary properties as long as their total length is less than 64 KB, followed by the file, which can have any 63-bit length and is sent a block at a time (with a buffer of 8 KB).
The socket can be used to send more files if you wish.
DataOutputStream dos = new DataOutputStream(socket.getOutputStream());
Properties fileProperties = new Properties();
File file = new File(filename);
// send the properties
StringWriter writer = new StringWriter();
fileProperties.store(writer, "");
writer.close();
dos.writeUTF(writer.toString());
// send the length of the file
dos.writeLong(file.length());
// send the file.
byte[] bytes = new byte[8*1024];
FileInputStream fis = new FileInputStream(file);
int len;
while ((len = fis.read(bytes)) > 0) {
    dos.write(bytes, 0, len);
}
fis.close();
dos.flush();
To read:
DataInputStream dis = new DataInputStream(socket.getInputStream());
String propertiesText = dis.readUTF();
Properties properties = new Properties();
properties.load(new StringReader(propertiesText));
long lengthRemaining = dis.readLong();
FileOutputStream fos = new FileOutputStream(outFilename);
// the reader needs its own buffer; 8 KB matches the sender
byte[] bytes = new byte[8*1024];
int len;
while (lengthRemaining > 0
        && (len = dis.read(bytes, 0, (int) Math.min(bytes.length, lengthRemaining))) > 0) {
    fos.write(bytes, 0, len);
    lengthRemaining -= len;
}
fos.close();
You could build your program around a well-known protocol such as FTP.
To send the meta information, you could create a special file with a unique name that contains the info, and then transfer both the user file and the meta file with FTP.
Otherwise, while still using FTP for the file, you could transfer the metadata over the client-server stream of your hand-written program.
I recommend using the HTTP protocol for this. The server can be implemented using a servlet, and Apache HttpClient can be used for the client. This article has some good examples. You can send both the file and the parameters in the same request, and with very little code.
