How do I call DaisyDiff to compare two HTML files? - java

I need to create a diff between two HTML documents in my app. I found a library called DaisyDiff that can do it. It has an API that looks like this:
/**
* Diffs two html files, outputting the result to the specified consumer.
*/
public static void diffHTML(InputSource oldSource, InputSource newSource,
ContentHandler consumer, String prefix, Locale locale)
throws SAXException, IOException
I know absolutely nothing about SAX and I can't figure out what to pass as the third argument. After poking through https://code.google.com/p/daisydiff/source/browse/trunk/daisydiff/src/java/org/outerj/daisy/diff/Main.java I wrote this method:
@Override
protected String doInBackground(String... params)
{
try {
String oldFileName = params[0],
newFileName = params[1];
ByteArrayOutputStream os = new ByteArrayOutputStream();
FileInputStream oldis = null, newis = null;
oldis = openFileInput(oldFileName);
newis = openFileInput(newFileName);
SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory
.newInstance();
TransformerHandler result = tf.newTransformerHandler();
result.setResult(new StreamResult(os));
DaisyDiff.diffHTML(new InputSource(oldis), new InputSource(newis), result, "", Locale.getDefault());
Log.d("diff", "output length = " + os.size());
return os.toString("Utf-8");
}catch (Exception e){
return e.toString();
}
}
I have no idea if that even makes sense. It doesn't work, nothing is written to the output. Please help me with this. Thanks in advance.

Judging by how HtmlTestFixture.diff is written (inside src/test/java of DaisyDiff), you need to tell the transformer how the result should be formatted. Have you tried adding the setOutputProperty(...) calls below?
@Test
// @Test comes from TestNG and is not related to DaisyDiff
public void daisyDiffTest() throws Exception {
String html1 = "<html><body>var v2</body></html>";
String html2 = "<html> \n <body> \n Hello world \n </body> \n </html>";
try {
StringWriter finalResult = new StringWriter();
SAXTransformerFactory tf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
TransformerHandler result = tf.newTransformerHandler();
result.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
result.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");
result.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
result.getTransformer().setOutputProperty(OutputKeys.ENCODING, "UTF-8");
result.setResult(new StreamResult(finalResult));
ContentHandler postProcess = result;
DaisyDiff.diffHTML(new InputSource(new StringReader(html1)), new InputSource(new StringReader(html2)), postProcess, "test", Locale.ENGLISH);
System.out.println(finalResult.toString());
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Done this way, I get the output below. I can now put it into an HTML file, include the right CSS and JS files, and get nicely formatted output.
<span class="diff-html-removed" id="removed-test-0" previous="first-test" changeId="removed-test-0" next="added-test-0">var v2</span><span class="diff-html-added" previous="removed-test-0" changeId="added-test-0" next="last-test"> </span><span class="diff-html-added" id="added-test-0" previous="removed-test-0" changeId="added-test-0" next="last-test">Hello world </span>
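To see what the third argument does in isolation, here is a minimal sketch that leaves DaisyDiff out entirely and feeds a few SAX events to a TransformerHandler by hand, which is presumably what diffHTML does with the consumer it is given. The class name HandlerSketch and the single span element are made up for illustration; only JDK classes are used.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class, not part of DaisyDiff.
class HandlerSketch {

    // Sends a tiny stream of SAX events through a TransformerHandler
    // and returns whatever the handler serialized.
    static String serialize() throws Exception {
        SAXTransformerFactory tf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = tf.newTransformerHandler();

        // Same kind of output instructions as in the answer above.
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
        handler.getTransformer().setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // The handler is a ContentHandler: every event it receives is
        // written to the result configured above.
        handler.startDocument();
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "class", "class", "CDATA", "diff-html-added");
        handler.startElement("", "span", "span", attrs);
        char[] text = "Hello world".toCharArray();
        handler.characters(text, 0, text.length);
        handler.endElement("", "span", "span");
        handler.endDocument();

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize());
    }
}
```

If this prints a span element but the DaisyDiff call still produces nothing, the handler setup is probably not the part that is failing.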

Related

analyzing inputstream xml format java

I have an InputStream containing xml format like the following :-
InputStream is = asStream("<TransactionList>\n" +
" <Transaction type=\"C\" amount=\"1000\"narration=\"salary\" />\n" +
" <Transaction type=\"X\" amount=\"400\" narration=\"rent\"/>\n" +
" <Transaction type=\"D\" amount=\"750\" narration=\"other\"/>\n" +
"</TransactionList>");
xmlTransactionProcessor.importTransactions(is);
I'm trying to parse this and store the values in an ArrayList of (user-defined) Transaction objects, but I have not managed to do so.
I have tried many solutions without success.
I have read about reading XML files, but I still can't work out how to handle an InputStream like this.
Can anybody help? This is my latest attempt, but it still fails somewhere.
// TODO Auto-generated method stub
BufferedReader inputReader = new BufferedReader(new InputStreamReader(is));
StringBuilder sb = new StringBuilder();
String inline = "";
try {
while ((inline = inputReader.readLine()) != null) {
sb.append(inline);
}
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
SAXBuilder builder = new SAXBuilder();
try {
org.jdom2.Document document = (org.jdom2.Document) builder.build(new ByteArrayInputStream(sb.toString().getBytes()));
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
You don't have to parse the XML yourself with a SAX parser. There are several libraries that support XML binding: they serialize and deserialize XML documents into custom POJO classes (or collections of these).
There is even a standard for XML binding in the JDK. It is called JAXB. You can use annotations to map the XML element names to the properties of your custom POJO.
Here's an example with my personal favorite library: Jackson. It is primarily designed to process JSON-formatted text, but it has an extension to support XML (and JAXB).
import java.util.*;
import com.fasterxml.jackson.databind.*;
import com.fasterxml.jackson.dataformat.xml.*;
public class XMLTest
{
public static void main(String[] args)
{
String input =
"<TransactionList>\n" +
" <Transaction type=\"C\" amount=\"1000\" narration=\"salary\" />\n" +
" <Transaction type=\"X\" amount=\"400\" narration=\"rent\"/>\n" +
" <Transaction type=\"D\" amount=\"750\" narration=\"other\"/>\n" +
"</TransactionList>";
try {
XmlMapper xmlMapper = new XmlMapper();
xmlMapper.setDefaultUseWrapper(false);
// this is how we tell Jackson the target type of the deserialization
JavaType transactionListType = xmlMapper.getTypeFactory().constructCollectionType(List.class, Transaction.class);
List<Transaction> transactionList = xmlMapper.readValue(input, transactionListType );
System.out.println(transactionList);
} catch (Exception e) {
e.printStackTrace();
}
}
public static class Transaction
{
public String type;
public int amount;
public String narration;
@Override
public String toString() {
return String.format("{ type:%s, amount:%d, narration:%s }", type, amount, narration);
}
}
}
As explained by Sharon Ben Asher, you could use annotated data mapping with JAXB, or with Jackson and its XML data format module. That would be easier.
If you want to fix your existing code, which uses JDOM's SAXBuilder, here is how.
You have to iterate over the document object, as in the code below.
public static void main(String[] args) {
InputStream is = new ByteArrayInputStream(("<TransactionList>\n" +
" <Transaction type=\"C\" amount=\"1000\" narration=\"salary\" />\n" +
" <Transaction type=\"X\" amount=\"400\" narration=\"rent\"/>\n" +
" <Transaction type=\"D\" amount=\"750\" narration=\"other\"/>\n" +
"</TransactionList>").getBytes(StandardCharsets.UTF_8));
ArrayList<Transaction> transactions = importTransactions(is);
}
In the importTransactions method, use getRootElement to get the root-level TransactionList element. Then iterate over each of the Transaction child elements using getChildren and a for-each loop.
public static ArrayList<Transaction> importTransactions(InputStream is){
ArrayList<Transaction> transactions = new ArrayList<>();
BufferedReader inputReader = new BufferedReader(new InputStreamReader(is));
StringBuilder sb = new StringBuilder();
String inline = "";
try {
while ((inline = inputReader.readLine()) != null) {
sb.append(inline);
}
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
SAXBuilder builder = new SAXBuilder();
try {
org.jdom2.Document document = builder.build(new ByteArrayInputStream(sb.toString().getBytes()));
Element transactionsElement = document.getRootElement();
List<Element> transactionList = transactionsElement.getChildren();
for (Element transaction:transactionList) {
Transaction t = new Transaction();
t.setType(transaction.getAttribute("type").getValue());
t.setAmount(transaction.getAttribute("amount").getValue());
transactions.add(t);
}
} catch (Exception e) {
// Log the error....
e.printStackTrace();
}
return transactions;
}
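For comparison, the same traversal can be sketched with only the JDK's DOM parser, which accepts the InputStream directly so the intermediate BufferedReader and StringBuilder are not needed. This is an alternative to the JDOM code above, not a correction of it. The class name DomTransactionSketch and its nested Transaction class are hypothetical stand-ins for the asker's own types.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class using only JDK APIs.
class DomTransactionSketch {

    // Stand-in for the user-defined Transaction class from the question.
    static class Transaction {
        final String type;
        final int amount;
        final String narration;

        Transaction(String type, int amount, String narration) {
            this.type = type;
            this.amount = amount;
            this.narration = narration;
        }
    }

    // Parses the stream and maps each <Transaction> element to an object.
    static List<Transaction> importTransactions(InputStream is) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(is);
        NodeList nodes = doc.getElementsByTagName("Transaction");
        List<Transaction> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            result.add(new Transaction(
                    e.getAttribute("type"),
                    Integer.parseInt(e.getAttribute("amount")),
                    e.getAttribute("narration")));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<TransactionList>\n"
                + " <Transaction type=\"C\" amount=\"1000\" narration=\"salary\" />\n"
                + " <Transaction type=\"X\" amount=\"400\" narration=\"rent\"/>\n"
                + " <Transaction type=\"D\" amount=\"750\" narration=\"other\"/>\n"
                + "</TransactionList>";
        InputStream is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (Transaction t : importTransactions(is)) {
            System.out.println(t.type + " " + t.amount + " " + t.narration);
        }
    }
}
```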

Can we create testng.xml file on a fly?

I am trying to create a UI that shows all the methods in the project that have the @Test annotation. This will let the user select, at run time, the methods they want to execute.
My intention is that when the user selects, say, Method1 and Method3 in the UI, the code creates a testng.xml file containing Method1 and Method3 and passes that XML file on for execution.
Is there any way of doing this? Any help is much appreciated. Thanks.
Yes, you can create testng.xml at run time.
I have written a utility that reads all the test cases together with a value of Y or N, indicating whether each test case should be run, and creates testng.xml accordingly. You could read the values selected in the UI instead.
The code below may help:
public static void createTestNg() {
try {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder dbBuilder = dbFactory.newDocumentBuilder();
Document doc = dbBuilder.newDocument();
Element rootElement = doc.createElement("suite");
doc.appendChild(rootElement);
Attr rootNameAttribute = doc.createAttribute("name");
rootNameAttribute.setValue("Suite");
Attr rootParallelAttribute = doc.createAttribute("parallel");
rootParallelAttribute.setValue("none");
rootElement.setAttributeNode(rootNameAttribute);
rootElement.setAttributeNode(rootParallelAttribute);
Element testElement = doc.createElement("test");
rootElement.appendChild(testElement);
Attr testNameAttribute = doc.createAttribute("name");
testNameAttribute.setValue("Test1");
testElement.setAttributeNode(testNameAttribute);
Element classesElement = doc.createElement("classes");
testElement.appendChild(classesElement);
Fillo fillo = new Fillo();
Connection con = fillo.getConnection("./testCaseStatus.xls");
String query = "Select * from Sheet1";
Recordset recordSet = con.executeQuery(query);
while (recordSet.next()) {
if (recordSet.getField("Execute").equals("Y")) {
Element classElement = doc.createElement("class");
Attr classNameAttribute = doc.createAttribute("name");
classNameAttribute.setValue(recordSet.getField("TestCase"));
classElement.setAttributeNode(classNameAttribute);
classesElement.appendChild(classElement);
}
}
recordSet.close();
con.close();
TransformerFactory transformerFactory = TransformerFactory
.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("./testNg.xml"));
transformer.transform(source, result);
// Output to console for testing
StreamResult consoleResult = new StreamResult(System.out);
transformer.transform(source, consoleResult);
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (TransformerException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (FilloException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
I have used fillo.jar to read from excel. You can use any other utility as required.
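The code above selects whole classes. Since the question asks about individual methods, here is a hedged sketch of the same DOM approach that writes a methods/include list inside a class element, which is the testng.xml syntax for running only named methods. The class name TestNgXmlSketch, the test class name com.example.MyTests and the method names are placeholders; the selected names would come from the UI.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it builds the XML text only and does not run TestNG.
class TestNgXmlSketch {

    // Builds a suite with one test, one class and an <include> per selected method.
    static String buildSuiteXml(String className, List<String> selectedMethods) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        Element suite = doc.createElement("suite");
        suite.setAttribute("name", "Suite");
        doc.appendChild(suite);

        Element test = doc.createElement("test");
        test.setAttribute("name", "Test1");
        suite.appendChild(test);

        Element classes = doc.createElement("classes");
        test.appendChild(classes);

        Element cls = doc.createElement("class");
        cls.setAttribute("name", className);
        classes.appendChild(cls);

        Element methods = doc.createElement("methods");
        cls.appendChild(methods);
        for (String method : selectedMethods) {
            Element include = doc.createElement("include");
            include.setAttribute("name", method);
            methods.appendChild(include);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // testng.xml files usually declare the TestNG DTD.
        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "https://testng.org/testng-1.0.dtd");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildSuiteXml("com.example.MyTests", Arrays.asList("Method1", "Method3")));
    }
}
```

Writing the result to a file instead of a StringWriter works the same way as in the answer above, by passing a File to StreamResult.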
testng.xml is not mandatory.
You can have your own implementation of IMethodInterceptor where you will launch the GUI and then filter the methods you want.

Loss of special characters while using javax.xml.transform.Transformer

I have the following problem: I lose some special characters when using javax.xml.transform.Transformer. Both the XML and XSL files are UTF-8 encoded.
Some capital Polish characters (Ą, Ł, etc.) are lost during the transform and replaced by "�?" characters.
Here is my transform method:
public static boolean transform(Logger logger, String inXML,String inXSL,String outTXT) throws Exception
{
try
{
TransformerFactory factory = TransformerFactory.newInstance();
ErrorListener listener = new ErrorListener()
{
@Override
public void warning(TransformerException exception)
throws TransformerException {}
@Override
public void fatalError(TransformerException exception)
throws TransformerException {}
@Override
public void error(TransformerException exception)
throws TransformerException {}
};
factory.setErrorListener(listener);
StreamSource xslStream = new StreamSource(inXSL);
Transformer transformer = factory.newTransformer(xslStream);
StreamSource in = new StreamSource(inXML);
StreamResult out = new StreamResult(outTXT);
transformer.transform(in,out);
return true;
}
catch(Exception e)
{
logger.log("ERROR DURING XSLT TRANSFORM (" + e.getMessage() + ")",2);
return false;
}
}
Any help will be appreciated!
=====
Using XSL file - Link
It turned out that the output encoding had to be set.
After adding
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
the engine works fine in both environments.
I had a similar problem, and after setting the encoding to UTF-16 (not UTF-8)
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
the special characters were preserved.
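The effect of the ENCODING output property can be checked in isolation with a small sketch like the one below. It uses an identity transformer (no stylesheet) and an in-memory result rather than the files from the question, and the class name EncodingSketch is made up. The Polish characters are written as Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, JDK only.
class EncodingSketch {

    // Copies the XML through an identity transformer, encoding the
    // output bytes with the requested encoding.
    static byte[] transform(String xml, String encoding) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // \u0104 and \u0141 are the capital Polish letters from the question.
        String xml = "<name>\u0104\u0141</name>";
        byte[] bytes = transform(xml, "UTF-8");
        // Decode with the same encoding that was requested for the output.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Whatever reads the output afterwards has to decode it with the same encoding; a mismatch there produces the same kind of replacement characters.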

SAXException unable to get document encoding

I'm trying to build an application that displays a news feed from a website. I get the input stream and parse it into a Document, but the parser throws a SAXException saying it is unable to determine the encoding of the stream. Earlier I copied the website's feed into an XML file by hand and read that file, and it worked; when streaming directly from the Internet it throws the exception. This is my code:
public final class MyScreen extends MainScreen {
protected static RichTextField RTF = new RichTextField("Plz Wait . . . ",
Field.FIELD_BOTTOM);
public MyScreen() {
// Set the displayed title of the screen
super(Manager.NO_VERTICAL_SCROLL);
setTitle("Yalla Kora");
Runnable R = new Runnable();
R.start();
add(RTF);
}
private class Runnable extends Thread {
public Runnable() {
// TODO Auto-generated constructor stub
ConnectionFactory factory = new ConnectionFactory();
ConnectionDescriptor descriptor = factory
.getConnection("http://www.yallakora.com/arabic/rss.aspx?id=0");
HttpConnection httpConnection;
httpConnection = (HttpConnection) descriptor.getConnection();// Connector.open("http://www.yallakora.com/pictures/main//2011/11/El-Masry-807-11-2011-21-56-7.jpg");
Manager mainManager = getMainManager();
RichList RL = new RichList(mainManager, true, 2, 1);
InputStream input;
try {
input = httpConnection.openInputStream();
Document document;
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder docBuilder;
try {
docBuilder = docBuilderFactory.newDocumentBuilder();
docBuilder.isValidating();
try {
document = docBuilder.parse(input);
document.getDocumentElement().normalize();
NodeList item = document.getElementsByTagName("item");
int k = item.getLength();
for (int i = 0; i < k; i++) {
Node value = item.item(i);
NodeList Data = value.getChildNodes();
Node title = Data.item(0);
Node link = Data.item(1);
Node date = Data.item(2);
Node discription = Data.item(5);
Node Discription = discription.getFirstChild();
String s = Discription.getNodeValue();
int mm = s.indexOf("'><BR>");
int max = s.length();
String imagelink = s.substring(0, mm);
String Khabar = s.substring(mm + 6, max);
String Date = date.getFirstChild().getNodeValue();
String Title = title.getFirstChild().getNodeValue();
String Link = link.getFirstChild().getNodeValue();
ConnectionFactory factory1 = new ConnectionFactory();
ConnectionDescriptor descriptor1 = factory1
.getConnection(imagelink);
HttpConnection httpConnection1;
httpConnection1 = (HttpConnection) descriptor1
.getConnection();
InputStream input1;
input1 = httpConnection1.openInputStream();
byte[] bytes = IOUtilities.streamToBytes(input1);
Bitmap bitmap = Bitmap.createBitmapFromBytes(bytes,
0, -1, 1);
;
RL.add(new Object[] { bitmap, Title, Khabar, Date });
add(new RichTextField(link.getNodeValue(),
Field.NON_FOCUSABLE));
}
RTF.setText("");
} catch (SAXException e) {
// TODO Auto-generated catch block
RTF.setText("SAXException " + e.toString());
e.printStackTrace();
}
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
RTF.setText("ParserConfigurationException " + e.toString());
e.printStackTrace();
}
} catch (IOException e) {
RTF.setText("IOException " + e.toString());
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}}
Any Ideas ??
I recommend restructuring this code into at least two parts.
I would create a download function that is given a URL and downloads the bytes associated with that URL. It should open and close the connection and return either the downloaded bytes or an error indication.
I would use this download function to fetch your XML bytes, then feed the bytes you obtain directly into your parser. If the data is properly constructed XML, it will have a header indicating the encoding used, so you do not need to worry about that; the parser will cope.
Once you have this parsed, then use the download function again to download the bytes associated with any images you want.
Regarding the SAX processing, have you reviewed this question:
parse-xml-inputstream-in-blackberry-java-application
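The split suggested above can be sketched in plain Java SE as follows. This is not BlackBerry API code: the class name FeedSketch and both helper methods are hypothetical, and on the device the stream would come from httpConnection.openInputStream() rather than from the in-memory stand-in used in main.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, Java SE only.
class FeedSketch {

    // Step 1: drain the stream into a byte array and close it.
    static byte[] readAll(InputStream in) throws IOException {
        try (InputStream src = in; ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = src.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }

    // Step 2: parse the raw bytes; the parser takes the encoding
    // from the XML declaration.
    static Document parse(byte[] xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the bytes a real connection would deliver.
        byte[] feed = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<rss><item><title>caf\u00e9</title></item></rss>")
                .getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parse(readAll(new ByteArrayInputStream(feed)));
        System.out.println(doc.getElementsByTagName("title").item(0).getTextContent());
    }
}
```

Keeping the download separate also makes it easy to log or inspect the first bytes of the response when the parser complains about the encoding.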

Java(JAXP) and XSLT: Overwriting the XML file

I'm generating an XML file by taking an XML/HTML file (temp.xml) and an XSLT stylesheet (temp.xsl) as input, and my output is written to a separate file with a new name (temp_copy.xml). I want to overwrite the input XML file instead of creating a new file. I tried giving the same path as the input file, but that didn't work. Is there another way to achieve this?
Thanks in advance.
My Java code:
public class SimpleXSLT {
public static void main(String[] args) {
String inXML = "C:/tmp/temp.xml";
String inXSL = "C:/tmp/temp.xsl";
String outTXT = "C:/tmp/temp_copy.xml";
SimpleXSLT st = new SimpleXSLT();
try {
st.transform(inXML,inXSL,outTXT);
} catch(TransformerConfigurationException e) {
System.err.println("Invalid factory configuration");
System.err.println(e);
} catch(TransformerException e) {
System.err.println("Error during transformation");
System.err.println(e);
}
}
public void transform(String inXML,String inXSL,String outTXT)
throws TransformerConfigurationException,
TransformerException {
TransformerFactory factory = TransformerFactory.newInstance();
StreamSource xslStream = new StreamSource(inXSL);
Transformer transformer = factory.newTransformer(xslStream);
transformer.setErrorListener(new MyErrorListener());
StreamSource in = new StreamSource(inXML);
StreamResult out = new StreamResult(outTXT);
transformer.transform(in,out);
System.out.println("The generated XML file is:" + outTXT);
}
}
"But that didn't work" needs to be better defined. You got an error? If so, what did it say? If not, what happened that was contrary to your expectation?
Usually, a process that overwrites its input is in danger of clobbering the input before it finishes reading it, unless it's specifically designed to be able to handle that case.
The simplest solution is to write to a separate output file, then when the transformation is finished, delete or move/rename the input file, and move/rename the output file to be what the input file used to be.
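One way to sketch the write-then-rename approach is with java.nio.file, where Files.move with REPLACE_EXISTING does the delete and rename in a single call. The class name InPlaceXsltSketch and the method transformInPlace are made up for this example, and the temporary file is created next to the input so that the move stays on the same file system.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, JDK only.
class InPlaceXsltSketch {

    // Transforms xml with xsl into a temporary file, then moves the
    // temporary file over the original input.
    static void transformInPlace(Path xml, Path xsl) throws Exception {
        Path tmp = Files.createTempFile(xml.toAbsolutePath().getParent(), "xslt-", ".tmp");
        try {
            Transformer transformer;
            try (InputStream style = Files.newInputStream(xsl)) {
                transformer = TransformerFactory.newInstance().newTransformer(new StreamSource(style));
            }
            // Both streams are closed before the move, so the input is
            // fully read before it is replaced.
            try (InputStream in = Files.newInputStream(xml);
                 OutputStream out = Files.newOutputStream(tmp)) {
                transformer.transform(new StreamSource(in), new StreamResult(out));
            }
            Files.move(tmp, xml, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Only has an effect if the transformation failed before the move.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("xslt-demo");
        Path xml = dir.resolve("temp.xml");
        Path xsl = dir.resolve("temp.xsl");
        Files.write(xml, "<in>hello</in>".getBytes(StandardCharsets.UTF_8));
        Files.write(xsl, ("<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"/\"><out><xsl:value-of select=\"/in\"/></out></xsl:template>"
                + "</xsl:stylesheet>").getBytes(StandardCharsets.UTF_8));
        transformInPlace(xml, xsl);
        System.out.println(new String(Files.readAllBytes(xml), StandardCharsets.UTF_8));
    }
}
```

If the transformation throws, the original file is left untouched and the temporary file is cleaned up.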
If anyone else is facing the same problem, here is what I did following LarsH's suggestion; it works perfectly:
public static void main(String[] args) {
String inXML = "C:/tmp/text.xml";
String inXSL = "C:/tmp/text.xsl";
String outTXT = "C:/tmp/text_copy_copy.xml";
String renamedFile = "C:/tmp/text.xml";
File oldfile =new File(outTXT);
File newfile =new File(renamedFile);
SimpleXSLT st = new SimpleXSLT();
try {
//TRANSFORMATION CODE
} catch (Exception e) {
e.printStackTrace();
}
try{
File file = new File(inXML);
if(file.delete()){
System.out.println("Deleted!");
}else{
System.out.println("Failed.");
}
}catch(Exception e){
e.printStackTrace();
}
if(oldfile.renameTo(newfile)){
System.out.println("Renamed");
}else{
System.out.println("Rename failed");
}
}
