Java has streams for input and output.
I am confused: when I create a stream, does the stream hold the data itself, or is it just a pipeline for the data?
I am trying to parse the XML response that a web service returns for a REST request.
//Parse Xml
ParseXml parser=new ParseXml();
parser.parseStream(connection.getInputStream());
where connection is an HttpURLConnection object.
The source of parseStream() follows:
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
public class ParseXml
{
public void parseStream(InputStream input)
{
XMLReader xmlReader;
try
{
xmlReader = (XMLReader) XMLReaderFactory.createXMLReader();
xmlReader.setContentHandler(new XmlParser());
xmlReader.parse(new InputSource(input));
}
catch (SAXException e)
{
e.printStackTrace();
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
I'm getting an exception:
[Fatal Error] :1:1: Premature end of file.
org.xml.sax.SAXParseException: Premature end of file.
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at xmlparsing.ParseXml.parseStream(ParseXml.java:24)
at smssend.SmsSend.restHttpPost(SmsSend.java:129)
at main.SmsApiClass.main(SmsApiClass.java:28)
An InputStream is something from which you can read data. I could also call it a data source, but I wouldn't call it a pipeline. To me, a pipeline involves multiple parts that are stuck together.
Regarding your parser error: Before feeding the data directly to the parser, you should write it to a file or System.out, just to make sure that some data actually arrived.
Then you should feed that data to the parser, to see what happens when you feed it known data.
And if these two cases work properly, you can feed the data directly.
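The debugging steps above can be sketched roughly as follows. This is only an illustration, not the asker's code: the class name StreamDebug and its helper methods are made up for this example, and it obtains the XMLReader through SAXParserFactory rather than XMLReaderFactory. It buffers everything the stream delivers, prints it, and only then parses the buffered copy. If the printed length is 0, the parser is being handed an empty document, which would be consistent with an error reported at line 1, column 1.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration only.
class StreamDebug {

    /** Copies everything the stream delivers into a byte array. */
    static byte[] readAll(InputStream input) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = input.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /** Prints the raw bytes first, then parses the buffered copy. */
    static void dumpAndParse(InputStream input)
            throws IOException, SAXException, ParserConfigurationException {
        byte[] data = readAll(input);
        System.out.println("Received " + data.length + " bytes:");
        // Assumes the response is UTF-8; adjust if the service uses another encoding.
        System.out.println(new String(data, StandardCharsets.UTF_8));

        XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setContentHandler(new DefaultHandler());
        xmlReader.parse(new InputSource(new ByteArrayInputStream(data)));
    }

    public static void main(String[] args) throws Exception {
        // Known-good data, to check the parser independently of the network.
        byte[] known = "<root><child/></root>".getBytes(StandardCharsets.UTF_8);
        dumpAndParse(new ByteArrayInputStream(known));
    }
}
```

In the asker's program, the call would presumably become StreamDebug.dumpAndParse(connection.getInputStream()), with the real content handler in place of DefaultHandler.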
[Update 2011-03-12]
Here is a complete example that works for me. Maybe you can spot the difference from your code (of which you posted only parts, and notably not the part that creates the InputStream):
package so5281746;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;
public class ParseXml {
public static void parseStream(InputStream input) {
try {
XMLReader xmlReader = XMLReaderFactory.createXMLReader();
xmlReader.setContentHandler(new XmlParser());
xmlReader.parse(new InputSource(input));
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
public static void main(String[] args) throws IOException {
URLConnection conn = new URL("http://repo1.maven.org/maven2/org/apache/ant/ant/maven-metadata.xml").openConnection();
InputStream input = conn.getInputStream();
parseStream(input);
}
static class XmlParser extends DefaultHandler {
@Override
public void startDocument() throws SAXException {
System.out.println("startDocument");
}
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
System.out.println("startElement " + localName);
}
@Override
public void endDocument() throws SAXException {
System.out.println("endDocument");
}
}
}
In Java there's no such thing as "data", there are only "objects". Like everything else, an InputStream is an object. It has methods, such as read(), that give you access to data. Asking whether it "is" the data is a meaningless question - a principle of object-oriented languages is that data is always hidden behind interfaces, such as the read() interface.
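One practical consequence of reaching the data only through read() is that the bytes are consumed as they are read. The small sketch below (the class name StreamIsConsumed and the drain helper are invented for this illustration) reads the same in-memory stream twice. If something else has already read a connection's stream before the parser gets it, the parser would similarly find nothing left, though whether that is what happened in the question cannot be told from the posted code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class for illustration only.
class StreamIsConsumed {

    /** Reads bytes until end of stream and returns how many were read. */
    static int drain(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30});
        System.out.println(drain(in)); // first pass sees all three bytes: prints 3
        System.out.println(drain(in)); // stream is already at its end: prints 0
    }
}
```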
I have an XML file. I'm reading its content using a BufferedReader, and I then store some pieces of information in Strings using substring. See the following code:
Load the file; basically I take the whole XML file and store it in a String called wholeXML:
try {
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile), "UTF-8"));
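while ((line2 = bufferedReader.readLine()) != null) {
// Append each line (assumes wholeXML starts as ""); plain assignment would keep only the last line.
wholeXML += line2;
}
} catch (IOException ex2) {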
System.out.println("Exception xml");
}
After that I use substring to get the data I need, for example:
String senderID = wholeXML.substring(wholeXML.indexOf("<q1:SenderID>")+13,wholeXML.indexOf("</q1:SenderID>"));
This serves my purpose and works just fine, but I have a problem because one part of the XML file is not static. It is dynamic, like this:
<q1:Attachment>
<q1:AttachmentID>ba9727cc-a831-4ded-b88c-a00000041357</q1:AttachmentID>
</q1:Attachment>
<q1:Attachment>
<q1:AttachmentID>c0773e77-e011-484e-a1e9-b00000131099</q1:AttachmentID>
</q1:Attachment>
<q1:Attachment>
<q1:AttachmentID>08f57403-2feb-443c-8dd4-b00000131103</q1:AttachmentID>
</q1:Attachment>
<q1:Attachment>
<q1:AttachmentID>53c47aba-bb64-4349-a0dc-b00000131105</q1:AttachmentID>
</q1:Attachment>
<q1:Attachment>
<q1:AttachmentID>3ee501ed-5c5c-43ab-8bd0-b00000131108</q1:AttachmentID>
</q1:Attachment>
<q1:Attachment>
<q1:AttachmentID>d4fe537a-a95a-4902-a583-b00000131112</q1:AttachmentID>
So as you can see there are multiple tags with the same name and I need to store data inside of them, but I don't know how many there will be, given it's different for each XML file. I'm a beginner so please go easy on me if there is an obvious solution, I'm just not seeing it.
Your approach (substring matching on the XML string) is not advisable. You should use one of the XML parsing methods available in Java (SAX, DOM, StAX, JAXB; see "Which is the best library for XML parsing in java").
Example using SAX:
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLStreamException;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class StaxExample {
public static class CustomSAXHandler extends DefaultHandler {
private String senderId;
private final List<String> attachmentIds = new ArrayList<>();
private StringBuffer currentCharacters = new StringBuffer();
@Override
public void characters(char[] ch, int start, int length) throws SAXException {
if (currentCharacters != null) {
currentCharacters.append(String.valueOf(ch, start, length));
}
}
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes)
throws SAXException {
currentCharacters = new StringBuffer();
}
@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
switch (localName) {
case "AttachmentID":
getAttachmentIds().add(currentCharacters.toString());
break;
case "SenderID":
setSenderId(currentCharacters.toString());
break;
}
currentCharacters = null;
}
public String getSenderId() {
return senderId;
}
public void setSenderId(String senderId) {
this.senderId = senderId;
}
public List<String> getAttachmentIds() {
return attachmentIds;
}
}
public static void main(String[] args) throws XMLStreamException, SAXException, IOException, ParserConfigurationException {
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);
SAXParser saxParser = factory.newSAXParser();
CustomSAXHandler saxHandler = new CustomSAXHandler();
saxParser.parse(StaxExample.class.getResourceAsStream("test.xml"), saxHandler);
System.out.println("SenderID: " + saxHandler.getSenderId());
System.out.println("AttachmentIDs: " + saxHandler.getAttachmentIds());
}
}
Explanation:
Parsing a document with SAX requires you to provide a SAX handler, in which you override certain methods to react to the different XML elements as the parser encounters them.
I created a fairly simple custom SAX handler which just records encountered text and stores it in instance variables (senderId, attachmentIds) for later retrieval.
As you see, the senderId is a single String (as it is expected to be encountered only once), and the attachmentIds is a List of Strings to be able to store multiple occurrences.
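For comparison, DOM (one of the other parsing methods mentioned above) can collect the repeated elements with less code, at the cost of loading the whole document into memory. The following is a rough sketch, not part of the original answer: the class name DomAttachmentIds, the root element q1:Message, and the namespace URI urn:example:q1 are all invented for the example, since the question does not show them.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class for illustration only.
class DomAttachmentIds {

    /** Returns the text of every AttachmentID element, however many there are. */
    static List<String> attachmentIds(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed so local names can be matched
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        // "*" matches any namespace, so the real URI behind the q1 prefix need not be known.
        NodeList nodes = doc.getElementsByTagNameNS("*", "AttachmentID");
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            ids.add(nodes.item(i).getTextContent().trim());
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document; the namespace URI is an assumption.
        String xml = "<q1:Message xmlns:q1=\"urn:example:q1\">"
                + "<q1:Attachment><q1:AttachmentID>a-1</q1:AttachmentID></q1:Attachment>"
                + "<q1:Attachment><q1:AttachmentID>b-2</q1:AttachmentID></q1:Attachment>"
                + "</q1:Message>";
        System.out.println(attachmentIds(xml.getBytes(StandardCharsets.UTF_8))); // prints [a-1, b-2]
    }
}
```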
I am a newbie to EDI. I have just converted an ORDERS EDI file to XML using the Smooks API. Some of the ORDERS example files work fine with the following example, but I get the exception below when I run it against the following EDI file, and I am stuck. Here is my example and the EDI data:
package example;
import org.json.JSONObject;
import org.json.XML;
import org.milyn.Smooks;
import org.milyn.SmooksException;
import org.milyn.io.StreamUtils;
import org.milyn.smooks.edi.unedifact.UNEdifactReaderConfigurator;
import org.xml.sax.SAXException;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.StringWriter;
public class Main {
public static int PRETTY_PRINT_INDENT_FACTOR = 4;
protected static String runSmooksTransform() throws IOException, SAXException, SmooksException {
Smooks smooks = new Smooks();
smooks.setReaderConfig(new UNEdifactReaderConfigurator("urn:org.milyn.edi.unedifact:d93a-mapping:*"));
try {
StringWriter writer = new StringWriter();
smooks.filterSource(new StreamSource(new FileInputStream("EDI.edi")), new StreamResult(writer));
return writer.toString();
} finally {
smooks.close();
}
}
public static void main(String[] args) throws IOException, SAXException, SmooksException {
System.out.println("\n\n==============Message In==============");
System.out.println(readInputMessage());
System.out.println("======================================\n");
String messageOut = Main.runSmooksTransform();
System.out.println("==============Message Out=============");
System.out.println(messageOut);
System.out.println("======================================\n\n");
JSONObject xmlJSONObj = XML.toJSONObject(messageOut);
String jsonPrettyPrintString = xmlJSONObj.toString(PRETTY_PRINT_INDENT_FACTOR);
System.out.println(jsonPrettyPrintString);
}
private static String readInputMessage() throws IOException {
return StreamUtils.readStreamAsString(new FileInputStream("EDI.edi"));
}
}
And the exception with Sample EDI Data
Exception in thread "main" org.milyn.SmooksException: Failed to filter source.
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:97)
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:64)
at org.milyn.Smooks._filter(Smooks.java:526)
at org.milyn.Smooks.filterSource(Smooks.java:482)
at org.milyn.Smooks.filterSource(Smooks.java:456)
at example.Main.runSmooksTransform(Main.java:49)
at example.Main.main(Main.java:63)
Caused by: org.milyn.edisax.EDIParseException: EDI message processing failed [ORDERS][D:93A:UN]. Must be a minimum of 1 instances of segment [UNS]. Currently at segment number 9.
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:499)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:450)
at org.milyn.edisax.EDIParser.parse(EDIParser.java:426)
at org.milyn.edisax.EDIParser.parse(EDIParser.java:410)
at org.milyn.edisax.unedifact.handlers.UNHHandler.process(UNHHandler.java:97)
at org.milyn.edisax.unedifact.handlers.UNBHandler.process(UNBHandler.java:75)
at org.milyn.edisax.unedifact.UNEdifactInterchangeParser.parse(UNEdifactInterchangeParser.java:113)
at org.milyn.smooks.edi.unedifact.UNEdifactReader.parse(UNEdifactReader.java:75)
at org.milyn.delivery.sax.SAXParser.parse(SAXParser.java:76)
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:86)
... 6 more
Bad source data will cause this.
It looks like Smooks is looking for a UNS segment which isn't in your data. The section control segment (UNS) is mandatory per the D.93A standard.
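A quick way to confirm this before handing the file to Smooks is to scan the raw message for the segment tag. The sketch below is only a rough pre-check written for this illustration; it is not part of Smooks. It assumes the default EDIFACT delimiters (apostrophe as segment terminator, "+" as element separator, "?" as release character), and the sample messages in main are made up.

```java
// Hypothetical pre-check for illustration only; not a Smooks API.
class UnsCheck {

    /**
     * Returns true if the message contains a segment with the given tag.
     * Assumes default delimiters. A doubled release character directly
     * before a terminator ("??'") is not handled by this simple split.
     */
    static boolean hasSegment(String edi, String tag) {
        // Split on apostrophes that are not escaped by the release character '?'.
        for (String segment : edi.split("(?<!\\?)'")) {
            String s = segment.trim();
            if (s.equals(tag) || s.startsWith(tag + "+")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up, heavily shortened messages.
        String withUns = "UNH+1+ORDERS:D:93A:UN'BGM+220+123'UNS+S'UNT+4+1'";
        String withoutUns = "UNH+1+ORDERS:D:93A:UN'BGM+220+123'UNT+3+1'";
        System.out.println(hasSegment(withUns, "UNS"));    // prints true
        System.out.println(hasSegment(withoutUns, "UNS")); // prints false
    }
}
```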
The code below is quoted from: http://examples.javacodegeeks.com/core-java/io/fileoutputstream/java-io-fileoutputstream-example/
Although OutputStream is an abstract class, in the code below an OutputStream object is used for writing to the file.
Files.newOutputStream(filepath) returns an OutputStream, so the type of out is OutputStream, and out references an OutputStream.
How is this possible when OutputStream is an abstract class?
package com.javacodegeeks.core.io.outputstream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class FileOutputStreamExample {
private static final String OUTPUT_FILE = "C:\\Users\\nikos\\Desktop\\TestFiles\\testFile.txt";
public static void main(String[] args) {
String content = "Hello Java Code Geeks";
byte[] bytes = content.getBytes();
Path filepath = Paths.get(OUTPUT_FILE);
try ( OutputStream out = Files.newOutputStream(filepath)) {
out.write(bytes);
} catch (IOException e) {
e.printStackTrace();
}
}
}
Just because the declared type is OutputStream, that doesn't mean the implementation doesn't create an instance of a concrete subclass of OutputStream. You see this all the time with interfaces. For example:
public List<String> getList() {
return new ArrayList<String>();
}
Basically you need to distinguish between the API exposed (which uses the abstract class) and the implementation (which can choose to use any subclass it wants).
So Files.newOutputStream could be implemented as:
public static OutputStream newOutputStream(Path path)
throws IOException {
return new FileOutputStream(path.toFile());
}
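To see the distinction at runtime, one can print the class of the object that actually comes back. This is a small sketch with an invented class name (ConcreteStreamDemo); the concrete class it reports is an implementation detail of the JDK in use, so no particular name should be relied on.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.lang.reflect.Modifier;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class for illustration only.
class ConcreteStreamDemo {

    /** True if the given class is declared abstract. */
    static boolean isAbstract(Class<?> type) {
        return Modifier.isAbstract(type.getModifiers());
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            // The declared type is abstract...
            System.out.println("OutputStream abstract: " + isAbstract(OutputStream.class));
            // ...but the object behind it is an instance of some concrete subclass.
            System.out.println("runtime class: " + out.getClass().getName());
            System.out.println("runtime class abstract: " + isAbstract(out.getClass()));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```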
I'm using the SAX parser that comes with JDK7. I'm trying to get hold of the DOCTYPE declaration, but none of the methods in DefaultHandler seem to be fired for it. What am I missing?
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class Problem {
public static void main(String[] args) throws Exception {
String xml = "<!DOCTYPE HTML><html><head></head><body></body></html>";
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
InputSource in = new InputSource(new StringReader(xml));
saxParser.parse(in, new DefaultHandler() {
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
System.out.println("Element: " + qName);
}
});
}
}
This produces:
Element: html
Element: head
Element: body
I want it to produce:
DocType: HTML
Element: html
Element: head
Element: body
How do I get the DocType?
Update: Looks like there's a DefaultHandler2 class to extend. Can I use that as a drop-in replacement?
Instead of a DefaultHandler, use org.xml.sax.ext.DefaultHandler2, which has the startDTD() method. Its documentation says:
Report the start of DTD declarations, if any. This method is intended
to report the beginning of the DOCTYPE declaration; if the document
has no DOCTYPE declaration, this method will not be invoked.
All declarations reported through DTDHandler or DeclHandler events
must appear between the startDTD and endDTD events. Declarations are
assumed to belong to the internal DTD subset unless they appear
between startEntity and endEntity events. Comments and processing
instructions from the DTD should also be reported between the startDTD
and endDTD events, in their original order of (logical) occurrence;
they are not required to appear in their correct locations relative to
DTDHandler or DeclHandler events, however.
Note that the start/endDTD events will appear within the
start/endDocument events from ContentHandler and before the first
startElement event.
However, you must also set the LexicalHandler for the XML Reader.
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.ext.DefaultHandler2;
public class Problem{
public static void main(String[] args) throws Exception {
String xml = "<!DOCTYPE html><hml><img/></hml>";
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
InputSource in = new InputSource(new StringReader(xml));
DefaultHandler2 myHandler = new DefaultHandler2(){
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
System.out.println("Element: " + qName);
}
@Override
public void startDTD(String name, String publicId,
String systemId) throws SAXException {
System.out.println("DocType: " + name);
}
};
saxParser.setProperty("http://xml.org/sax/properties/lexical-handler",
myHandler);
saxParser.parse(in, myHandler);
}
}
I need to just read the value of a single attribute inside an XML file using java. The XML would look something like this:
<behavior name="Fred" version="2.0" ....>
and I just need to read out the version. Can someone point in the direction of a resource that would show me how to do this?
You don't need a fancy library -- plain old JAXP versions of DOM and XPath are pretty easy to read and write for this. Whatever you do, don't use a regular expression.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
public class GetVersion {
public static void main(String[] args) throws Exception {
XPath xpath = XPathFactory.newInstance().newXPath();
Document doc = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse("file:////tmp/whatever.xml");
String version = xpath.evaluate("//behavior/@version", doc);
System.out.println(version);
}
}
JAXB for brevity:
private static String readVersion(File file) {
@XmlRootElement class Behavior {
@XmlAttribute String version;
}
return JAXB.unmarshal(file, Behavior.class).version;
}
StAX for efficiency:
private static String readVersionEfficient(File file)
throws XMLStreamException, IOException {
XMLInputFactory inFactory = XMLInputFactory.newInstance();
XMLStreamReader xmlReader = inFactory
.createXMLStreamReader(new StreamSource(file));
try {
while (xmlReader.hasNext()) {
if (xmlReader.next() == XMLStreamConstants.START_ELEMENT) {
if (xmlReader.getLocalName().equals("behavior")) {
return xmlReader.getAttributeValue(null, "version");
} else {
throw new IOException("Invalid file");
}
}
}
throw new IOException("Invalid file");
} finally {
xmlReader.close();
}
}
Here's one.
import javax.xml.parsers.SAXParser;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.SAXException;
import org.xml.sax.Attributes;
import javax.xml.parsers.SAXParserFactory;
/**
* Here is sample of reading attributes of a given XML element.
*/
public class SampleOfReadingAttributes {
/**
* Application entry point
* @param args command-line arguments
*/
public static void main(String[] args) {
try {
// creates and returns new instance of SAX-implementation:
SAXParserFactory factory = SAXParserFactory.newInstance();
// create SAX-parser...
SAXParser parser = factory.newSAXParser();
// .. define our handler:
SaxHandler handler = new SaxHandler();
// and parse:
parser.parse("sample.xml", handler);
} catch (Exception ex) {
ex.printStackTrace(System.out);
}
}
/**
* Our own implementation of SAX handler reading
* a purchase-order data.
*/
private static final class SaxHandler extends DefaultHandler {
// we enter to element 'qName':
public void startElement(String uri, String localName,
String qName, Attributes attrs) throws SAXException {
if (qName.equals("behavior")) {
// get version
String version = attrs.getValue("version");
System.out.println("Version is " + version );
}
}
}
}
As mentioned you can use the SAXParser.
Digester mentioned using regular expressions, which I wouldn't recommend, as it leads to code that is difficult to maintain. What if you add another version attribute in another tag, or another behavior tag? You can handle it, but it won't be pretty.
You can also use XPath, which is a language for querying xml. That's what I would recommend.
If all you need is to read the version, then you can use a regex. But really, I think you need Apache Commons Digester.
Apache Commons Configuration is nice, too. Commons Digester is based on it.