GZipping from standard input to standard output in Java - java

Preface: I'm a total Java noob...I just wrote Hello World yesterday. Have mercy on my noob self.
I'm not sure how to read from standard input or output to standard output in Java. I know there are things like Scanners and System.out.println, but this doesn't seem to apply directly to what I'm trying to do.
In particular, I'm trying to use GZIP on standard input and output the compressed result to standard output. I see that there is a GZIPOutputStream class that I'll certainly want to use. However, how can I initialize the output stream to direct to standard output? Further, how can I just read from standard input?
How can I accomplish this? How do I compress std input and output the result to std output?
(Here's a diagram of what I'm trying to accomplish: std input -> GZIP (via my Java program) -> std output (the compressed version of the std input).)

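Take a look at the following constructor: GZIPInputStream(InputStream in). To get stdin as an InputStream, use System.in. Reading from the stream is done with the read(byte[] buf, int off, int len) method; take a look at the documentation for a detailed description. Note that GZIPInputStream decompresses what it reads; for compressing, the counterpart is GZIPOutputStream(OutputStream out), which can wrap System.out.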
The whole thing would be something like
GZIPInputStream i = new GZIPInputStream(System.in);
byte[] buffer = new byte[1024];
int n = i.read(buffer, 0, buffer.length);
System.out.println("Bytes read: " + n);
Personally, I found streams in Java to have a steep learning curve, so I do recommend reading any tutorial online.
I'll leave it as an exercise to figure out the output.
--
Disclaimer: haven't actually tried the code

import java.io.IOException;
import java.util.zip.GZIPOutputStream;
public class InToGzipOut {
private static final int BUFFER_SIZE = 512;
public static void main(String[] args) throws IOException {
byte[] buf = new byte[BUFFER_SIZE];
GZIPOutputStream out = new GZIPOutputStream(System.out);
int len;
while ((len = System.in.read(buf)) > 0) {
out.write(buf, 0, len);
}
out.finish();
}
}
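A variant of the program above, as a minimal sketch: the copy loop is pulled into a helper that takes any InputStream and OutputStream, so it can be tried with in-memory streams as well as with System.in and System.out. The class name GzipPipe and the method name compress are made up for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

class GzipPipe {
    // Copies everything from 'in' to 'out', writing it in gzip format.
    static void compress(InputStream in, OutputStream out) throws IOException {
        GZIPOutputStream gzip = new GZIPOutputStream(out);
        byte[] buf = new byte[8192];
        int len;
        // read() returns -1 at end of input; otherwise the number of bytes read
        while ((len = in.read(buf)) != -1) {
            gzip.write(buf, 0, len);
        }
        gzip.finish(); // write the gzip trailer without closing 'out'
        out.flush();   // push any buffered bytes through to the destination
    }

    public static void main(String[] args) throws IOException {
        // stdin -> gzip -> stdout
        compress(System.in, System.out);
    }
}
```

Run it as a filter, for example with a file redirected to standard input and standard output redirected to a .gz file.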

Related

Detect Encoding with Java

I have an example which is working fine. With this example (provided below), I can detect the encoding of a file using the universaldetector framework from Mozilla.
But I want this example to detect the encoding of input rather than of a file, for example input read with the Scanner class. How can I modify the code below to detect the encoding of input instead of a file?
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import org.mozilla.universalchardet.UniversalDetector;
public class TestDetector {
public static void main(String[] args) throws java.io.IOException {
byte[] buf = new byte[4096];
java.io.FileInputStream fis = new java.io.FileInputStream("C:\\Users\\khalat\\Desktop\\Java\\toti.txt");
// (1)
UniversalDetector detector = new UniversalDetector(null);
// (2)
int nread;
while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
detector.handleData(buf, 0, nread);
}
// (3)
detector.dataEnd();
// (4)
String encoding = detector.getDetectedCharset();
if (encoding != null) {
System.out.println("Detected encoding = " + encoding);
} else {
System.out.println("No encoding detected.");
}
// (5)
detector.reset();
}
}
I found an elegant example which can at least test whether a character is ISO-8859-1; see the code below.
public class TestIso88591 {
public static void main(String[] args){
if(TestIso88591.testISO("ü")){
System.out.println("True");
}
else{
System.out.println("False");
}
}
public static boolean testISO(String text){
return Charset.forName(CharEncoding.ISO_8859_1).newEncoder().canEncode(text);
}
}
Now I have a question for the Java experts: is there a possibility to test whether a character is ISO-8859-5 or ISO-8859-7? Yes, I know there is UTF-8, but my exact question is how I can test for an ISO-8859-5 character, because the input data should be stored in SAP and SAP can only handle ISO-8859-1 characters. I need that as soon as possible.
OK, I researched a bit more, and the result is: it is useless to read bytes from stdin to guess the encoding, because the Java API lets you read the input directly as a string, which is already decoded ;) The only use case for this detector is when you get a stream of unknown bytes from a file, socket, etc. and need to guess how to decode it into a Java string.
What follows is pseudocode; it's only a theoretical approach. But as we figured out, it makes no sense ;)
It's very simple.
byte[] buf = new byte[4096];
java.io.FileInputStream fis = new java.io.FileInputStream("C:\\Users\\khalat\\Desktop\\Java\\toti.txt");
UniversalDetector detector = new UniversalDetector(null);
int nread;
while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
detector.handleData(buf, 0, nread);
}
What you are doing here is reading from the file into a byte array, which is then passed to the detector.
Replace your FileInputStream with another input source.
For example, to read everything from standard input:
byte[] buf = new byte[4096];
// Read raw bytes: a Reader would hand back already-decoded chars, not bytes.
java.io.InputStream in = System.in;
UniversalDetector detector = new UniversalDetector(null);
int nread;
while ((nread = in.read(buf)) > 0 && !detector.isDone()) {
    detector.handleData(buf, 0, nread);
}
Attention: I have not tested this code. It is based only on the Java API docs.
I would also place a buffered stream between the input stream and the read, to buffer the input. Note that the 4096-byte buffer size is not a problem in itself: read does not wait until the whole buffer is full, it returns as soon as some bytes are available, so the loop also runs when less than 4096 bytes are entered on standard input.
About the Reader API: the base class java.io.Reader (http://docs.oracle.com/javase/7/docs/api/java/io/Reader.html#read(char[],%20int,%20int)) defines the method read as abstract, and any Reader-based implementation has to implement it. So it is there! Note that it fills a char[], not a byte[].
About not being able to figure out the encoding of a chunk of unknown bytes: yes, that's right. But you can make a guess, as the detector from Mozilla tries to do, because you have some clues: 1. we expect the bytes to be text; 2. we know how every byte is interpreted in each specified encoding; 3. we can try to decode several bytes in a guessed encoding and compare the resulting strings.
About us being experts:
Yes, most of us are ;) But we don't like to do someone else's homework. We like to fix bugs or give advice. So provide a full example that produces an error we can fix. Or, as happened here, we give you advice with some pseudocode. (I don't have the time to set up a project and write you a working example.)
Nice comment thread ;)
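On the side question of testing for ISO-8859-5 or ISO-8859-7: the same canEncode idea used in the ISO-8859-1 example carries over by changing the charset name. The following is a minimal sketch, assuming the JRE in use ships these two charsets (only a handful, such as ISO-8859-1 and UTF-8, are guaranteed on every platform). The class name CharsetCheck and the method name fitsIn are made up for this illustration. Note that this tells you whether a string can be represented in a charset, not which encoding a stream of incoming bytes was written in.

```java
import java.nio.charset.Charset;

class CharsetCheck {
    // True if every character of 'text' has a representation in the named charset.
    static boolean fitsIn(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String cyrillic = "\u0416"; // Cyrillic capital letter ZHE
        String greek = "\u03b1";    // Greek small letter alpha
        System.out.println(fitsIn("ISO-8859-5", cyrillic));
        System.out.println(fitsIn("ISO-8859-7", greek));
        System.out.println(fitsIn("ISO-8859-1", cyrillic));
    }
}
```

On a JRE that provides these charsets, the first two checks should come out true and the last one false, since ISO-8859-1 has no Cyrillic letters.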

Scan Char from file in Java

The java project i am working on requires me to write the java equivalent of this C code:
void read_hex_char(char *filename, unsigned char *image)
{
int i;
FILE *ff;
short tmp_short;
ff = fopen(filename, "r");
for (i = 0; i < 100; i++)
{
fscanf(ff, "%hx", &tmp_short);
image[i] = tmp_short;
}
fclose(ff);
}
I have written this Java code.
void read_hex_char(String filename, char[] image) throws IOException
{
Scanner s=new Scanner(new BufferedReader(new FileReader(filename)));
for(int i=0;i<100;i++)
{
image[i]=s.nextShort();
}
s.close();
}
Is this code correct? If it's not, what corrections should be made?
I would go with a FileInputStream and read byte by byte (a short is just two bytes; char is more "complex" than just a short, see http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html). Simple byte extraction code from my project:
public static byte[] readFile(File file) throws IOException {
    // try-with-resources closes the stream even if a read fails
    try (FileInputStream in = new FileInputStream(file)) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        int ch;
        while ((ch = in.read()) != -1) {
            bos.write(ch);
        }
        return bos.toByteArray();
    }
}
For your example, the simplest approach is to find a few samples: run the C function on them, then the Java one, and compare the results. That should tell you what you need to know.
Keep in mind that the Java char type is pretty smart (smarter than just a byte) and represents a Unicode character (and reader classes process the incoming byte stream into Unicode characters, possibly decoding or modifying bytes according to the actual locale and charset settings). From your source I guess that it is just the actual bytes you want in a memory buffer. Here is a really good explanation of how to do this:
Convert InputStream to byte array in Java
For parsing hex values, you can use Short.parseShort(String s,int radix) with radix 16. In Java, char is a bit different than short, so if you intend to perform bitmap operations, short is probably the better type to use. However, note that Java doesn't have unsigned types, which may make some of the operations likely to be used in image processing (like bitwise operations) tricky.
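To make the radix point concrete, here is a minimal sketch of reading hex tokens with Scanner. It assumes the file holds whitespace-separated hex values without a 0x prefix (for example "ff 0a 1b"), and it stores them in a byte[] rather than a char[], keeping the low 8 bits as the C assignment to unsigned char does. The class name HexReader and the method name readHexBytes are made up for this illustration.

```java
import java.util.Scanner;

class HexReader {
    // Reads up to 'count' whitespace-separated hex tokens from 'source'
    // and keeps the low 8 bits of each value.
    static byte[] readHexBytes(Readable source, int count) {
        byte[] image = new byte[count];
        try (Scanner s = new Scanner(source)) {
            // hasNextInt(16) stops the loop at end of input or at a non-hex token
            for (int i = 0; i < count && s.hasNextInt(16); i++) {
                image[i] = (byte) s.nextInt(16);
            }
        }
        return image;
    }
}
```

For a file, pass new FileReader(filename) as the source. Because byte is signed in Java, use image[i] & 0xFF wherever the unsigned value 0..255 is needed.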

Java reading file into memory and how not to blow up memory

I'm a bit of a newbie in Java and I'm trying to perform a MAC calculation on a file.
Now since the size of the file is not known in advance, I can't just load all of the file into memory. So I wrote the code so it would read in pieces (4k in this case).
The issue I'm having is that I tried loading the entire file into memory to see if both methods produce the same hash. However, they seem to be producing different hashes.
Here's the piece-by-piece code:
FileInputStream fis = new FileInputStream("sbs.dat");
byte[] file = new byte[4096];
m = Mac.getInstance("HmacSHA1");
int i=fis.read(file);
m.init(key);
while (i != -1)
{
m.update(file);
i=fis.read(file);
}
mac = m.doFinal();
And here's the all at once approach:
File f = new File("sbs.dat");
long size = f.length();
byte[] file = new byte[(int) size];
fis.read(file);
m = Mac.getInstance("HmacSHA1");
m.init(key);
m.update(file);
mac = m.doFinal();
Shouldn't they both produce the same hash?
The question however is more generic. Is the 1st code the correct way of loading a file into memory into pieces and perform whatever we want to do inside the while cycle? (socket send, cipher a file, etc...).
This question is useful because every tutorial I've seen just loads everything at once...
Update: Working :-D. Will this approach work properly sending a file in pieces through a socket?
No. You have no guarantee that fis.read(file) will read file.length bytes. This is why read() returns an int: to tell you how many bytes it has actually read.
You should instead do this:
m.init(key);
int i=fis.read(file);
while (i != -1)
{
m.update(file, 0, i);
i=fis.read(file);
}
taking advantage of the Mac.update(byte[] data, int offset, int len) method, which allows you to specify the length of the actual data in the byte[] array.
The read function will not necessarily fill up your entire array. So you need to check how many bytes were returned from the read function, and only use that many bytes of your buffer.
Just like Jason LeBrun says: the read method will not always read the specified number of bytes. For example: what do you think will happen if the file does not contain a multiple of 4096 bytes?
I would go for something like this:
FileInputStream fis = new FileInputStream(filename);
byte[] buffer = new byte[buffersize];
Mac m = Mac.getInstance("HmacSHA1");
m.init(key);
int n;
while ((n = fis.read(buffer)) != -1)
{
m.update(buffer, 0, n);
}
byte[] mac = m.doFinal();
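To check that the chunked loop and the all-at-once approach agree, the two can be compared on in-memory data. This is a minimal sketch: the key and the data are placeholders, and the class name ChunkedMac and the method name hmac are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class ChunkedMac {
    // Computes HmacSHA1 over a stream, feeding only the bytes each read() returned.
    static byte[] hmac(InputStream in, byte[] keyBytes, int bufferSize)
            throws IOException, GeneralSecurityException {
        Mac m = Mac.getInstance("HmacSHA1");
        m.init(new SecretKeySpec(keyBytes, "HmacSHA1"));
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            m.update(buffer, 0, n); // never the whole buffer, only n bytes
        }
        return m.doFinal();
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "demo-key".getBytes("UTF-8"); // placeholder key
        byte[] data = new byte[10_000];             // not a multiple of 4096
        Arrays.fill(data, (byte) 7);
        byte[] chunked = hmac(new ByteArrayInputStream(data), key, 4096);
        byte[] whole = hmac(new ByteArrayInputStream(data), key, data.length);
        System.out.println(Arrays.equals(chunked, whole));
    }
}
```

The same loop shape applies to sending a file through a socket in pieces: write(buffer, 0, n) in place of update(buffer, 0, n).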

how to send an array of bytes over a TCP connection (java programming)

Can somebody demonstrate how to send an array of bytes over a TCP connection from a sender program to a receiver program in Java?
byte[] myByteArray
(I'm new to Java programming, and can't seem to find an example of how to do this that shows both ends of the connection, sender and receiver. If you know of an existing example, maybe you could post the link. No need to reinvent the wheel.) P.S. This is NOT homework! :-)
The InputStream and OutputStream classes in Java natively handle byte arrays. The one thing you may want to add is the length at the beginning of the message so that the receiver knows how many bytes to expect. I typically like to offer a method that allows controlling which bytes in the byte array to send, much like the standard API.
Something like this:
private Socket socket;
public void sendBytes(byte[] myByteArray) throws IOException {
sendBytes(myByteArray, 0, myByteArray.length);
}
public void sendBytes(byte[] myByteArray, int start, int len) throws IOException {
if (len < 0)
throw new IllegalArgumentException("Negative length not allowed");
if (start < 0 || start >= myByteArray.length)
throw new IndexOutOfBoundsException("Out of bounds: " + start);
// Other checks if needed.
// May be better to save the streams in the support class;
// just like the socket variable.
OutputStream out = socket.getOutputStream();
DataOutputStream dos = new DataOutputStream(out);
dos.writeInt(len);
if (len > 0) {
dos.write(myByteArray, start, len);
}
}
EDIT: To add the receiving side:
public byte[] readBytes() throws IOException {
// Again, probably better to store these objects references in the support class
InputStream in = socket.getInputStream();
DataInputStream dis = new DataInputStream(in);
int len = dis.readInt();
byte[] data = new byte[len];
if (len > 0) {
dis.readFully(data);
}
return data;
}
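The length-prefix scheme used by sendBytes and readBytes can be tried without a network by substituting in-memory streams for the socket streams. The following is a minimal sketch of that idea; the class name Framing and the method names writeFrame and readFrame are made up for this illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class Framing {
    // Writes a 4-byte length followed by the payload bytes.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(payload.length);
        dos.write(payload);
        dos.flush();
    }

    // Reads one frame: the length first, then exactly that many bytes.
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        int len = dis.readInt();
        byte[] data = new byte[len];
        dis.readFully(data); // blocks until all 'len' bytes have arrived
        return data;
    }
}
```

With a real connection, pass socket.getOutputStream() and socket.getInputStream() instead. Because each frame carries its own length, several messages can follow one another on the same connection.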
Just start with this example from the Really Big Index. Notice though, that it's designed to transmit and receive characters, not bytes. This isn't a big deal, though - you can just deal with the raw InputStream and OutputStream objects that the Socket class provides. See the API for more info about the different types of readers, writers and streams. Methods you'll be interested in are OutputStream.write(byte[]) and InputStream.read(byte[]).
The Oracle Socket Communications Tutorial would seem to be the appropriate launch point.
Note that it's going to extra trouble to turn characters into bytes. If you want to work at the byte level, just peel that off.
This Sun Sockets tutorial should give you a good starting point
What you need to use is the write method of an java.io.OutputStream, and the read method of an java.io.InputStream, both of which you can retrieve from the Socket you open.
I would suggest using ObjectOutputStream and ObjectInputStream. These send everything as an object and receive it as the same.
ObjectOutputStream os = new ObjectOutputStream(socket.getOutputStream());
os.flush();
ObjectInputStream is = new ObjectInputStream(socket.getInputStream());
os.writeObject(byte_array_that_you_want_to_send);
byte[] temp = (byte[]) is.readObject();
Also remember to create the output stream first, flush it, and only then create the input stream: the ObjectInputStream constructor blocks until it has read the stream header that the other side's ObjectOutputStream writes.
I'm guessing that the question is worded incorrectly. I found this when searching for an answer to why my use of InputStream and OutputStream seemed to be setting the entire array to 0 upon encountering a byte of value 0. Do these assume that the bytes contain valid ASCII and not binary? Since the question doesn't come right out and ask this, and nobody else seems to have caught it as a possibility, I guess I'll have to satisfy my quest elsewhere.
What I was trying to do was write a TransparentSocket class that can instantiate either a TCP (Socket/ServerSocket) or a UDP (DatagramSocket) to use the DatagramPacket transparently. It works for UDP, but not (yet) for TCP.
Follow-up: I seem to have verified that these streams are themselves useless for binary transfers, but that they can be passed to a more programmer-friendly instantiation, e.g.,
new DataOutputStream(socket.getOutputStream()).writeInt(5);
^ So much for that idea. It writes data in a "portable" way, i.e., probably ASCII, which is no help at all, especially when emulating software over which I have no control!
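On the "probably ASCII" worry above: DataOutputStream.writeInt does not produce text. It writes the four bytes of the int, high byte first, and zero bytes pass through like any other value. A minimal sketch that captures the bytes in a ByteArrayOutputStream instead of a socket stream (the class name WriteIntBytes and the method name encode are made up for this illustration):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

class WriteIntBytes {
    // Returns the exact bytes that writeInt(value) puts on the stream.
    static byte[] encode(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(sink)) {
            dos.writeInt(value);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(encode(5))); // prints [0, 0, 0, 5]
    }
}
```

If the software on the other end expects a different byte order or field width, that has to be handled explicitly, for example by writing the individual bytes with write(byte[]).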
import java.io.*;
import java.net.*;
public class ByteSocketClient
{
public static void main(String[] args) throws UnknownHostException, IOException
{
Socket s=new Socket("",6000);
DataOutputStream dout=new DataOutputStream(new BufferedOutputStream(s.getOutputStream()));
byte[] a = {(byte)0xC0,(byte)0xA8,(byte)0x01,(byte)0x02,(byte)0x53,(byte)0x4D,(byte)0x41,(byte)0x52,(byte)0x54};
dout.write(a);
dout.close();
s.close();
}
}
Here is an example that streams a wav file in frames of 100 bytes at a time.
private Socket socket;
public void streamWav(byte[] myByteArray, int start, int len) throws IOException {
Path path = Paths.get("path/to/file.wav");
byte[] data = Files.readAllBytes(path);
OutputStream out = socket.getOutputStream();
DataOutputStream os = new DataOutputStream(out);
os.writeInt(len);
if (len > 0) {
os.write(data, start, len);
}
}
public void readWav() throws IOException {
InputStream in = socket.getInputStream();
int frameLength = 100; // use useful number of bytes
int input;
boolean active = true;
while(active) {
byte[] frame = new byte[frameLength];
for(int i=0; i<frameLength; i++) {
input = in.read();
if(input < 0) {
active = false;
break;
} else frame[i] = (byte) input;
}
// playWavPiece(frame);
// streamed frame byte array is full
// use it here ;-)
}
}

What is InputStream & Output Stream? Why and when do we use them?

Can someone explain to me what InputStream and OutputStream are?
I am confused about the use cases for both InputStream and OutputStream.
If you could also include a snippet of code to go along with your explanation, that would be great. Thanks!
The goal of InputStream and OutputStream is to abstract different ways to input and output: whether the stream is a file, a web page, or the screen shouldn't matter. All that matters is that you receive information from the stream (or send information into that stream.)
InputStream is used for many things that you read from.
OutputStream is used for many things that you write to.
Here's some sample code. It assumes the InputStream instr and OutputStream osstr have already been created:
int i;
while ((i = instr.read()) != -1) {
osstr.write(i);
}
instr.close();
osstr.close();
InputStream is used for reading, OutputStream for writing. They are connected as decorators to one another such that you can read/write all different types of data from all different types of sources.
For example, you can write primitive data to a file:
File file = new File("C:/text.bin");
file.createNewFile();
DataOutputStream stream = new DataOutputStream(new FileOutputStream(file));
stream.writeBoolean(true);
stream.writeInt(1234);
stream.close();
To read the written contents:
File file = new File("C:/text.bin");
DataInputStream stream = new DataInputStream(new FileInputStream(file));
boolean isTrue = stream.readBoolean();
int value = stream.readInt();
stream.close();
System.out.println(isTrue + " " + value);
You can use other types of streams to enhance the reading/writing. For example, you can introduce a buffer for efficiency:
DataInputStream stream = new DataInputStream(
new BufferedInputStream(new FileInputStream(file)));
You can write other data such as objects:
MyClass myObject = new MyClass(); // MyClass has to implement Serializable
ObjectOutputStream stream = new ObjectOutputStream(
new FileOutputStream("C:/text.obj"));
stream.writeObject(myObject);
stream.close();
You can read from other different input sources:
byte[] test = new byte[] {0, 0, 1, 0, 0, 0, 1, 1, 8, 9};
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(test));
int value0 = stream.readInt();
int value1 = stream.readInt();
byte value2 = stream.readByte();
byte value3 = stream.readByte();
stream.close();
System.out.println(value0 + " " + value1 + " " + value2 + " " + value3);
For most input streams there is a corresponding output stream. You can define your own streams for reading/writing special things, and there are complex streams for reading complex things (for example, there are streams for reading/writing the ZIP format).
From the Java Tutorial:
A stream is a sequence of data.
A program uses an input stream to read data from a source, one item at a time.
A program uses an output stream to write data to a destination, one item at a time.
The data source and data destination can be anything that holds, generates, or consumes data. Obviously this includes disk files, but a source or destination can also be another program, a peripheral device, a network socket, or an array.
Sample code from oracle tutorial:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
This program uses byte streams to copy the file xanadu.txt to outagain.txt, one byte at a time.
Have a look at this SE question for more details about character streams, which are wrappers on top of byte streams:
byte stream and character stream
You read from an InputStream and write to an OutputStream.
For example, say you want to copy a file. You would create a FileInputStream to read from the source file and a FileOutputStream to write to the new file.
If your data is a character stream, you could use a FileReader instead of an InputStream and a FileWriter instead of an OutputStream if you prefer.
InputStream input = ... // many different types
OutputStream output = ... // many different types
byte[] buffer = new byte[1024];
int n = 0;
while ((n = input.read(buffer)) != -1)
output.write(buffer, 0, n);
input.close();
output.close();
OutputStream is an abstract class that represents writing output. There are many different OutputStream classes, and they write out to certain things (like the screen, files, byte arrays, network connections, and so on). InputStream classes access the same things, but they read data in from them.
Here is a good basic example of using FileOutputStream and FileInputStream to write data to a file, then read it back in.
A stream is a continuous flow of liquid, air, or gas.
Java stream is a flow of data from a source into a destination. The source or destination can be a disk, memory, socket, or other programs. The data can be bytes, characters, or objects. The same applies for C# or C++ streams. A good metaphor for Java streams is water flowing from a tap into a bathtub and later into a drainage.
The data represents the static part of the stream; the read and write methods the dynamic part of the stream.
InputStream represents a flow of data from the source, the OutputStream represents a flow of data into the destination.
Finally, InputStream and OutputStream are abstractions over low-level access to data, such as C file pointers.
Stream: In layman's terms, a stream is data; the most generic stream is a binary representation of data.
Input stream: If you are reading data from a file or any other source, the stream used is an input stream. In simpler terms, an input stream acts as a channel to read data.
Output stream: If you read and process data from a source (a file etc.), you then need somewhere to save the result; the means to store data is an output stream.
An output stream is generally tied to some data destination, like a file or a network connection. In Java, an output stream is the destination where data is eventually written and where it ends up.
import java.io.PrintStream;
// PPrint.oout mimics System.out: a static field holding a stream-like object
class PPrint {
    static PPrintStream oout = new PPrintStream();
}
class PPrintStream {
    void print(String str) {
        System.out.println(str);
    }
}
class outputstreamDemo {
    public static void main(String args[]) {
        PrintStream out = System.out; // System.out is itself a PrintStream
        out.println("hello world");
        out.println("this is output stream demo");
    }
}
For one kind of InputStream, you can think of it as a "representation" of a data source, like a file.
For example:
FileInputStream fileInputStream = new FileInputStream("/path/to/file/abc.txt");
fileInputStream represents the data at this path; you can use its read method to read bytes from the file.
The other kind of InputStream takes in another InputStream and does further processing, like decompression.
For example:
GZIPInputStream gzipInputStream = new GZIPInputStream(fileInputStream);
gzipInputStream will treat the fileInputStream as a compressed data source. When you use the read(buffer, 0, buffer.length) method, it will decompress part of the gzip file into the buffer you provide.
The reason we use InputStream is that, as the data in the source becomes larger and larger (say we have 500GB of data in the source file), we don't want to hold everything in memory (that needs an expensive machine and puts pressure on the GC), and we want to get some results sooner (reading the whole file may take a long time).
The same goes for OutputStream. We can start moving some results to the destination without waiting for the whole thing to finish, and with less memory consumption.
If you want more explanations and examples, you can check these summaries: InputStream, OutputStream, How To Use InputStream, How To Use OutputStream
To add to the other great answers, in my own simple words:
Stream: as Sher Mohammad mentioned, it is data.
Input stream: used, for example, to get input (data) from a file. The case is when I have a file (the user uploads a file, the input) and I want to read what we have there.
Output stream: the reverse. For example, you are generating an Excel file and outputting it to some place.
The "how to write" to the file is defined by the sender (the Excel workbook class), not by the file output stream.
See this example in this context.
try (OutputStream fileOut = new FileOutputStream("xssf-align.xlsx")) {
wb.write(fileOut);
}
wb.close();
