Why can't my DOM parser read UTF-8? - java

My DOM parser can't load an XML file when the file contains UTF-8 characters.
I know I have to tell it to read UTF-8, but I don't know where to put that in my code.
Here it is:
File xmlFile = new File(fileName);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
I know there is a setEncoding() method, but I don't know where to put it in my code.

Try this. It worked for me:
InputStream inputStream= new FileInputStream(completeFileName);
Reader reader = new InputStreamReader(inputStream,"UTF-8");
InputSource is = new InputSource(reader);
is.setEncoding("UTF-8");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(is);

Try using a Reader and pass the encoding as a parameter:
InputStream inputStream = new FileInputStream(fileName);
documentBuilder.parse(new InputSource(new InputStreamReader(inputStream, "UTF-8")));

I used what Eugene did up there and changed it a little.
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
FileInputStream in = new FileInputStream(new File("XML.xml"));
// The second argument of parse(InputStream, String) is a system ID (base URI),
// not an encoding, so the encoding is set on an InputSource instead.
InputSource is = new InputSource(in);
is.setEncoding("UTF-8");
Document doc = dBuilder.parse(is);
Although this reads the file as UTF-8, printing to the Eclipse console won't show the non-ASCII characters correctly unless the Java file is saved as UTF-8, or at least that is what happened to me.
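The Reader-based answers above can be put together as a minimal sketch. The class name Utf8DomExample, the helper parseUtf8 and the temporary file are made up for illustration; the sample text is written with Unicode escapes so the result does not depend on how the Java source file itself is saved.

```java
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class Utf8DomExample {

    // Hypothetical helper: decodes the file as UTF-8 (not the platform default)
    // and hands the resulting characters to the DOM parser.
    static Document parseUtf8(Path path) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (Reader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            Document doc = builder.parse(new InputSource(reader));
            doc.getDocumentElement().normalize();
            return doc;
        }
    }

    public static void main(String[] args) throws Exception {
        // Write a small UTF-8 file containing non-ASCII characters.
        Path tmp = Files.createTempFile("utf8-demo", ".xml");
        try {
            Files.write(tmp, "<name>\u010De\u0161tina</name>".getBytes(StandardCharsets.UTF_8));
            Document doc = parseUtf8(tmp);
            System.out.println(doc.getDocumentElement().getTextContent());
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because the parser receives characters rather than bytes here, the decoding is done entirely by the Reader, and try-with-resources closes the file even if parsing fails.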

Related

Creating a space at the end of an XML tag in Java

My XML tag is:
<Description/>
I want it with a space:
<Description />
How can I do this in Java?
I am signing an XML document. The original file has the space, but when I run the following code and print the result, it is printed without the space.
String thisLine = "";
String xmlString = "";
BufferedReader br = new BufferedReader(new FileReader(originalXmlFilePath));
while ((thisLine = br.readLine()) != null) {
xmlString = xmlString + thisLine.trim();
}
br.close();
ByteArrayInputStream xmlStream = new ByteArrayInputStream(xmlString.getBytes());
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setValidating(false);
Document doc = dbf.newDocumentBuilder().parse(xmlStream);
doc.setXmlStandalone(true);
DOMSignContext dsc = new DOMSignContext(keyEntry.getPrivateKey(), doc.getDocumentElement());
javax.xml.crypto.dsig.XMLSignature signature = fac.newXMLSignature(si, ki);
signature.sign(dsc);
// Output the resulting document.
// OutputStream os = new FileOutputStream(new File(destnSignedXmlFilePath));
TransformerFactory tf = TransformerFactory.newInstance();
Transformer trans = tf.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
StringWriter writer = new StringWriter();
trans.transform(new DOMSource(doc), new StreamResult(writer));
String output = writer.getBuffer().toString();//.replaceAll("\n|\r", "");
System.out.println("output== "+output);
What you are doing wrong is signing arbitrary unprocessed text instead of submitting a canonical version of your document (without spaces in tags, but also with sorted attributes, with quotes of the same type, etc.) to the digital signature computation.
The Canonical XML and Exclusive Canonical XML W3C recommendations specify a standard and comprehensive way to eliminate such arbitrary differences.
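To illustrate why the space cannot survive a parse-and-serialize round trip, here is a small sketch showing that a DOM parser sees <Description/> and <Description /> as the same element, and likewise ignores attribute order and quote style. It does not run a canonicalizer; it only compares parsed trees with isEqualNode. The class and method names are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EmptyTagEquivalence {

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // True when the root elements of both documents have equal names,
    // attributes and children, whatever their textual spelling was.
    static boolean sameContent(String a, String b) throws Exception {
        return parse(a).getDocumentElement().isEqualNode(parse(b).getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        // Space before "/>" is not part of the document's content.
        System.out.println(sameContent("<Description/>", "<Description />"));
        // Attribute order and quote style are not content either.
        System.out.println(sameContent("<a x=\"1\" y=\"2\"/>", "<a y='2' x='1'/>"));
        // Actual text content does make a difference.
        System.out.println(sameContent("<Description/>", "<Description>x</Description>"));
    }
}
```

Since the DOM keeps no record of the original spelling, any output format has to be chosen by the serializer, which is why a signature should depend on the canonical form rather than on the exact text of the file.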

How can I optimise this code for converting from JSON to XML?

I'm using this code to convert JSON to XML:
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
document = standardJsonToXML(hierarchyData, document, null);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(document);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
transformer.transform(source, result);
return writer.toString();
How can I increase its performance?

Transfer XML via socket between server and client in java

Hi I want to send a simple XML from server to client.
On the server side I use
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
OutputStream bos = userSocket.getOutputStream();
StreamResult result = new StreamResult(bos);
transformer.transform(source, result);
//here bos.close();
On the client side I use
InputStream is = socket.getInputStream();
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(is);
When I close bos on the server side after the transform, the XML is transferred successfully. But when I don't, Document doc = dBuilder.parse(is); keeps waiting for input and my program hangs. So my question is: how can I transfer XML between my client and server without closing the socket? Thanks ;)
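Change your bos.close() to bos.flush(). This pushes any buffered bytes to the client, but it does not signal end of input, and the parser reads until the stream ends. If the client still blocks, half-close the connection with userSocket.shutdownOutput(), or send the length of the XML before its bytes so the client knows how much to read.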
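One way to tell the client where a document ends without closing the socket is to send the byte length first and the XML after it. The following is a minimal sketch of that idea, not the approach from the answer above; FramedXml, send and receive are hypothetical names, and in-memory streams stand in for the socket streams so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class FramedXml {

    // Serializes the document, then writes a 4-byte length followed by the bytes.
    static void send(Document doc, OutputStream out) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(buffer));
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(buffer.size());
        buffer.writeTo(data);
        data.flush(); // push the frame out, but leave the stream open
    }

    // Reads exactly one frame and parses only those bytes, so the parser
    // never waits for the end of the underlying stream.
    static Document receive(InputStream in) throws Exception {
        DataInputStream data = new DataInputStream(in);
        byte[] payload = new byte[data.readInt()];
        data.readFully(payload);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(payload));
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("msg");
        root.setTextContent("hello");
        doc.appendChild(root);

        // Stand-in for the socket: two documents sent over the same open stream.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        send(doc, wire);
        send(doc, wire);

        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(receive(in).getDocumentElement().getTextContent());
        System.out.println(receive(in).getDocumentElement().getTextContent());
    }
}
```

With a real connection, out would be userSocket.getOutputStream() and in would be socket.getInputStream(); the same connection can then carry any number of documents in either direction.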

UTF-8 to UTF16 Parsing

I have an XML document that is encoded in UTF-8 and contains some special Chinese characters. I need to parse this XML.
DocumentBuilderFactory factory = DocumentBuilderFactory
.newInstance();
factory.setIgnoringElementContentWhitespace(true);
factory.setNamespaceAware(true);
factory.setValidating(true);
//byte[] buffer = xmlMsg.getBytes("UTF-16");
logger.info("transformToUTP " + xmlMsg);
//byte[] buffer = soapMessage.getBytes();
//ByteArrayInputStream stream = new ByteArrayInputStream(buffer);
InputSource is = new InputSource(new ByteArrayInputStream(
xmlMsg.getBytes("UTF-16")));
Document doc = factory.newDocumentBuilder().parse(is);
//Document doc = factory.newDocumentBuilder().parse(
//new InputSource(new StringReader(xmlMsg)));
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(getNameSpace());
XPathExpression soapBodyExpr = xpath.compile(BODY_XPATH_EXP);
Node soapBody = (Node) soapBodyExpr.evaluate(doc,
XPathConstants.NODE);
Node reqMsgNode = soapBody.getFirstChild();
I am getting a null pointer exception on reqMsgNode.
Do not convert the XML into a string; parse it as is, using
DocumentBuilder.parse(File) or DocumentBuilder.parse(InputStream).
The parser will take the encoding from the XML declaration, e.g. <?xml version="1.0" encoding="UTF-8"?>, and if the declaration is missing it will use UTF-8 by default (unless a byte order mark indicates UTF-16).
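As a small sketch of that behaviour, the same Chinese text is encoded below once as UTF-8 and once as UTF-16, each with a matching declaration, and both byte arrays are handed to the parser without any manual decoding. DeclaredEncodingExample and rootText are made-up names; the Chinese characters are written as Unicode escapes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class DeclaredEncodingExample {

    // Parses raw bytes; the parser works out the encoding from the
    // byte order mark and the XML declaration.
    static String rootText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String text = "\u4E2D\u6587"; // two Chinese characters

        // The bytes and the declaration agree in both cases.
        byte[] utf8 = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?><msg>" + text + "</msg>")
                .getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = ("<?xml version=\"1.0\" encoding=\"UTF-16\"?><msg>" + text + "</msg>")
                .getBytes(StandardCharsets.UTF_16); // includes a byte order mark

        System.out.println(text.equals(rootText(utf8)));
        System.out.println(text.equals(rootText(utf16)));
    }
}
```

The code in the question re-encodes the string as UTF-16; if its declaration still says UTF-8, the bytes and the declaration no longer agree, which is the situation that parsing the original bytes avoids.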

Parse XML string on BlackBerry

I am trying to parse XML with the following code, but StringReader is not available in the BlackBerry JDE. What is the right way to do this?
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xmlRecords));
Document doc = db.parse(is);
Try this out:
String xmlString = "<xml> </xml>"; // your xml string
ByteArrayInputStream bis = new ByteArrayInputStream(xmlString.getBytes("UTF-8"));
Document doc = builder.parse(bis); // builder is a DocumentBuilder, like db in the question
If you want to build a DOM from data coming from a server, you're much better off parsing the InputStream directly with a DocumentBuilder rather than reading the data into a String and trying to work with that. One way is:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(input);
