I want to convert XML into binary data in Java. What is the fastest and easiest way to do this?
Is there any internal Java method that I can use?
If you just want to compress the XML, you can read it and write it through either GZIPOutputStream or ZipOutputStream, as described here.
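A minimal sketch of the GZIPOutputStream route, assuming the XML is already in memory as a String and should be encoded as UTF-8. The class and method names (XmlGzip, compress, decompress) are made up for illustration; only the java.util.zip and java.io classes are real JDK API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of any library.
final class XmlGzip {

    // Encode the XML as UTF-8 and compress the bytes with GZIP.
    static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed here, so the trailer has been written.
        return buffer.toByteArray();
    }

    // Reverse of compress(): inflate the bytes and decode them as UTF-8.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><item>hello</item></root>";
        byte[] packed = compress(xml);
        System.out.println("round trip ok: " + decompress(packed).equals(xml));
    }
}
```

readAllBytes() needs Java 9 or later. For a large file you would probably wrap a FileOutputStream instead of buffering everything in memory.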
I need to parse an EBCDIC input file format. Using Java, I am able to read it like below:
InputStreamReader rdr = new InputStreamReader(new FileInputStream("/Users/rr/Documents/workspace/EBCDIC_TO_ASCII/ebcdic.txt"), java.nio.charset.Charset.forName("ibm500"));
But in Hadoop MapReduce I need to parse it via a RecordReader, which has not worked so far.
Can anyone provide a solution to this problem?
You can try to parse it through Spark, maybe, by using Cobrix which is an open-source COBOL data source for Spark.
The best thing you can do is to convert the data to ASCII first and then load it into HDFS.
Why is the file in EBCDIC? Does it need to be?
If it is just text data, why not convert it to ASCII when you send / pull the file from the Mainframe / AS400?
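For the text-only case, a rough sketch of the conversion in plain Java, reusing the ibm500 charset from the question. The class and method names are invented for illustration, and this assumes the JDK ships the extended charsets (the jdk.charsets module); if it does not, Charset.forName will throw.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class EbcdicToAscii {

    // Code page taken from the question; other mainframes may use a different one.
    static final Charset EBCDIC = Charset.forName("IBM500");

    // Decode EBCDIC bytes to a String, then re-encode as ASCII.
    // Characters that have no ASCII equivalent come out as '?'.
    static byte[] toAscii(byte[] ebcdicBytes) {
        String text = new String(ebcdicBytes, EBCDIC);
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] sample = "HELLO 123".getBytes(EBCDIC);
        String converted = new String(toAscii(sample), StandardCharsets.US_ASCII);
        System.out.println(converted);
    }
}
```

This only makes sense when every byte is character data. Packed-decimal or binary fields would be corrupted by a charset conversion.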
If the file contains binary or Cobol numeric fields, then you have several options:
Convert the file to normal text on the mainframe (the Mainframe Sort utility is good at this), then send the file and convert it to ASCII.
If it is a Cobol file, there are some open source projects you could look at: https://github.com/tmalaska/CopybookInputFormat or https://github.com/ianbuss/CopybookHadoop
There are commercial packages for loading mainframe-Cobol data into hadoop.
I have a text file and I need to convert all the data in it to XML format to make it more readable.
Text file
How can I convert it to XML format?
Is there any Java library or any other way that I can do it?
Your question is rather vague (and you could probably find the answer yourself with just a little research), but I'll give you a hint.
Your sample appears to be an INI file (as traditionally used for configuration files on Windows & DOS). So, look for an "INI file parser." If you can't find one, you should be able to write a simple parser yourself using regular expressions. It's a simple file format, consisting of section headings like [SectionTitle] and data fields like Key=Value. That's all.
As for generating XML ... it shouldn't be hard, but "xml format" is not a useful description. Can you be more specific? E.g., what will the XML be used for?
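To make the hint concrete, here is a small sketch of the regex approach. Since the target XML layout is not specified, the element names (config, section, entry) are pure assumptions, as are the class and method names; treat it as a starting point, not a full INI parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical converter; the XML layout it emits is an assumption.
final class IniToXml {

    // [SectionTitle]
    private static final Pattern SECTION = Pattern.compile("^\\[([^\\]]+)\\]$");
    // Key=Value
    private static final Pattern FIELD = Pattern.compile("^([^=]+)=(.*)$");

    static String convert(String ini) {
        StringBuilder xml = new StringBuilder("<config>\n");
        boolean inSection = false;
        for (String raw : ini.split("\\R")) {
            String line = raw.trim();
            // Skip blank lines and the usual INI comment styles.
            if (line.isEmpty() || line.startsWith(";") || line.startsWith("#")) {
                continue;
            }
            Matcher section = SECTION.matcher(line);
            Matcher field = FIELD.matcher(line);
            if (section.matches()) {
                if (inSection) {
                    xml.append("  </section>\n");
                }
                xml.append("  <section name=\"")
                   .append(escape(section.group(1).trim()))
                   .append("\">\n");
                inSection = true;
            } else if (field.matches()) {
                xml.append("    <entry key=\"")
                   .append(escape(field.group(1).trim()))
                   .append("\">")
                   .append(escape(field.group(2).trim()))
                   .append("</entry>\n");
            }
            // Lines matching neither pattern are silently ignored.
        }
        if (inSection) {
            xml.append("  </section>\n");
        }
        return xml.append("</config>\n").toString();
    }

    // Escape the characters that are special in XML text and attributes.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.print(convert("[db]\nhost=localhost\nport=5432\n"));
    }
}
```

Building the XML with a StringBuilder keeps the sketch short. For anything bigger, a real XML writer such as javax.xml.stream.XMLStreamWriter would handle escaping for you.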
Try this: http://www.smooks.org/mediawiki/index.php?title=Main_Page. I've used it and it's great.
A more sophisticated solution would be to use Mule Data Mapper. On the server side, obviously.
I'm looking for a library/framework to generate/parse TXT files from/into Java objects.
I'm thinking in something like Castor or JAXB, where the mapping between the file and the objects can be defined programmatically or with XML/annotations. The TXT file is not homogeneous and has no separators (fixed positions). The size of the file is not big, therefore DOM-like handling is allowed, no streaming required.
For instance:
TextWriter.write(Collection objects) -> FileOutputStream
TextReader.read(FileInputStream fis) -> Collection
I suggest you use Google's Protocol Buffers:
Protocol buffers are a flexible, efficient, automated mechanism for
serializing structured data – think XML, but smaller, faster, and
simpler. You define how you want your data to be structured once, then
you can use special generated source code to easily write and read
your structured data to and from a variety of data streams and using a
variety of languages. You can even update your data structure without
breaking deployed programs that are compiled against the "old" format.
Protobuf messages can be exported/read in binary or text format.
Other solutions would depend on what you call text file : if base64 is texty enough for you, you could simply use java standard serialization with base64 encoding of the binary stream.
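A quick sketch of that last idea, standard serialization wrapped in Base64, using only JDK classes. The helper names (Base64Serialization, toText, fromText) are made up for this example, and the object being written has to implement Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper class, not part of any library.
final class Base64Serialization {

    // Serialize the object with ObjectOutputStream, then Base64-encode the bytes.
    static String toText(Serializable object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Base64-decode the text and deserialize the resulting bytes.
    static Object fromText(String text) {
        byte[] bytes = Base64.getDecoder().decode(text);
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("alice", "bob"));
        String text = toText((Serializable) names);
        System.out.println(text);
        System.out.println(fromText(text));
    }
}
```

The resulting text is not human-readable and does not follow a fixed-position layout, so it only helps if "text file" just means "safe to store as characters". Only deserialize data you trust.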
You can do this with Jackson: serialize to JSON and back.
http://jackson.codehaus.org/
Just generate and parse it with XML or JSON formats, there's a whole load of libraries out there that will do all the work for you.
Does anyone know how I can read any file in binary using Java? I want to be able to read any image, document, PDF, etc. as a stream of binary digits. Thanks
Sounds like you want FileInputStream! You may find the Basic I/O tutorial useful too.
use FileInputStream
FileInputStream is meant for reading streams of raw bytes such as image data. For reading streams of characters, consider using FileReader.
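Something along these lines should work as a starting point. The class and method names are invented for the example, and the toBits helper is only there because the question asks for binary digits.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of any library.
final class ReadBytes {

    // Read the whole file into a byte array, one chunk at a time.
    static byte[] readAll(String path) {
        try (InputStream in = new FileInputStream(path)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Render one byte as eight binary digits, most significant bit first.
    static String toBits(byte b) {
        String bits = Integer.toBinaryString(b & 0xFF);
        return "0".repeat(8 - bits.length()) + bits;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ReadBytes.java <file>");
            return;
        }
        for (byte b : readAll(args[0])) {
            System.out.print(toBits(b));
        }
        System.out.println();
    }
}
```

This loads the entire file into memory, which is fine for small files; for large ones you would process each chunk inside the loop instead.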
I'm trying to read a file that was created in a Java-based game using ObjectOutputStream in PHP. The data is a serialized object written in a binary format.
I've been using fopen and fread to get the binary data, but I have absolutely no idea what to do with it.
PHP doesn't understand Java. Both do, however, understand a common format like JSON, XML, CSV, etc. I'd suggest changing the format to one of those and using that as the data transfer format instead.
In the case of JSON, in Java you can use Google Gson to convert (encode) fully fledged JavaBeans into JSON, and in PHP you can use json_decode() to convert (decode) it into an associative PHP array.
It doesn't seem easy to reimplement http://download.oracle.com/javase/6/docs/platform/serialization/spec/protocol.html
You can't do it so easily (unless an existing framework is available). This is because the binary format used by Java serialization is highly specific to the JVM; compatibility isn't guaranteed even between different JVM versions.
You should use a different approach, for example XML, YAML or JSON.