Cutting an XML with JAXB - java

I have the following xml:
<Package>
<PackageHeader>
<name>External Vendor File</name>
<description>External vendor file for some purpose</description>
<version>3.141694baR3</version>
</PackageHeader>
<PackageBody>
<Characteristic id="1">
<Size>
<value>1.68</value>
<scale>Meters</scale>
<comment>Size can vary, depending on temperature</comment>
</Size>
<Weight>
<value>9</value>
<scale>M*Tons</scale>
<comment>His mama is so fat, we had to use another scale</comment>
</Weight>
<rating>
<ratingCompany>ISO</ratingCompany>
<rating:details xmlns:rating="http://www.w3schools.com/ratingDetails">
<rating:value companyDepartment="Finance">A</rating:value>
<rating:expirationDate update="1/12/2010">1/1/2014</rating:expirationDate>
<rating:comment userID="z94234">You're not Silvia.</rating:comment>
<rating:comment userID="r24942">You're one of the Kung-Fu Creatures On The Rampage</rating:comment>
<rating:comment userID="i77880">TWO!</rating:comment>
<rating:priority>3</rating:priority>
</rating:details>
</rating>
</Characteristic>
<Characteristic id="2">
<Size/>
<Weight/>
<rating/>
</Characteristic>
...
<Characteristic id="n"/>
</PackageBody>
</Package>
And the following Java code:
public class XMLTest {
public static void main(String[] args) throws Exception {
Package currentPackage = new Package();
Package sourcePackage = new Package();
int totalCharacteristics;
PackageBody currentPackageBody = new PackageBody();
Characteristic currentCharacteristic = new Characteristic();
rating currentRating = new rating();
FileInputStream fis = new FileInputStream("sourceFile.xml");
JAXBContext myCurrentContext = JAXBContext.newInstance(Package.class);
Marshaller m = myCurrentContext.createMarshaller();
Unmarshaller um = myCurrentContext.createUnmarshaller();
sourcePackage = (Package)um.unmarshal(fis);
currentPackage.setPackageHeader(sourcePackage.getPackageHeader());
totalCharacteristics = sourcePackage.getPackageBody().getCharacteristics().size();
for (int i = 0; i < totalCharacteristics; i++)
{
currentRating = sourcePackage.getPackageBody().getCharacteristics().get(i).getrating();
}
currentCharacteristic.setrating(currentRating);
currentPackageBody.getCharacteristics().add(currentCharacteristic);
currentPackage.setPackageBody(currentPackageBody);
m.marshal(currentPackage, new File("targetFile.xml"));
fis.close();
}
}
Which gives me the next XML:
<Package>
<PackageHeader>
<name>External Vendor File</name>
<description>External vendor file for some purpose</description>
<version>3.141694baR3</version>
</PackageHeader>
<PackageBody>
<Characteristic id="1">
<rating>
<ratingCompany>ISO</ratingCompany>
<rating:details xmlns:rating="http://www.w3schools.com/ratingDetails">
<rating:value companyDepartment="Finance">A</rating:value>
<rating:expirationDate update="1/12/2010">1/1/2014</rating:expirationDate>
<rating:comment userID="z94234">You're not Silvia.</rating:comment>
<rating:comment userID="r24942">You're one of the Kung-Fu Creatures On The Rampage</rating:comment>
<rating:comment userID="i77880">TWO!</rating:comment>
<rating:priority>3</rating:priority>
</rating:details>
</rating>
</Characteristic>
<Characteristic id="2">
<rating/>
</Characteristic>
...
<Characteristic id="n"/>
</PackageBody>
</Package>
And this is what I need:
<Package>
<PackageHeader>
<name>External Vendor File</name>
<description>External vendor file for some purpose</description>
<version>3.141694baR3</version>
</PackageHeader>
<PackageBody>
<Characteristic>
<rating id="1">
<ratingCompany>ISO</ratingCompany>
<rating:details xmlns:rating="http://www.w3schools.com/ratingDetails">
<rating:comment userID="z94234">You're not Silvia.</rating:comment>
<rating:comment userID="r24942">You're one of the Kung-Fu Creatures On The Rampage</rating:comment>
<rating:comment userID="i77880">TWO!</rating:comment>
<rating:priority>3</rating:priority>
</rating:details>
</rating>
</Characteristic>
<Characteristic>
<rating id="2"/>
</Characteristic>
...
<Characteristic/>
</PackageBody>
</Package>
But I have a few questions:
How could I read a 4 GB file (for example, by reading it with StAX)?
If I want to filter some tags from source to target (as in the last XML), do I have to assign them one by one to the targetFile? Is there an iterator that would let me go through all subnodes and assign them?
If the sourceFile changes, do I need to rerun xjc and recompile the whole project?
Thanks.

For reading huge XML files, you definitely need a streaming parser like StAX. You can also combine it with JAXB to selectively map a given piece of XML to a Java object if you wish to work with it. You need to regenerate your JAXB classes only if your schema changes; there is no need to regenerate them if your application code changes.
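As a rough illustration of the streaming idea, here is a minimal sketch that uses only StAX (no JAXB) to copy a document while dropping whole subtrees. The class name StaxFilterSketch and the choice of dropping Size and Weight are assumptions based on the sample XML in the question. For a multi-gigabyte file you would pass file-based readers and writers instead of the in-memory strings used in main.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Set;

public class StaxFilterSketch {

    // Assumption: the element names to drop, taken from the sample XML in the question.
    private static final Set<String> DROPPED = Set.of("Size", "Weight");

    /** Copies the XML from in to out, skipping every subtree whose root element is in DROPPED. */
    public static void filter(Reader in, Writer out) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        int skipDepth = 0; // greater than 0 while we are inside a dropped subtree
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (skipDepth > 0) {
                // Track nesting so we know when the dropped subtree ends.
                if (event.isStartElement()) {
                    skipDepth++;
                } else if (event.isEndElement()) {
                    skipDepth--;
                }
                continue;
            }
            if (event.isStartElement()
                    && DROPPED.contains(event.asStartElement().getName().getLocalPart())) {
                skipDepth = 1;
                continue;
            }
            // Only one event is held in memory at a time.
            writer.add(event);
        }
        writer.close();
        reader.close();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Package><PackageBody><Characteristic id=\"1\">"
                + "<Size><value>1.68</value></Size>"
                + "<Weight><value>9</value></Weight>"
                + "<rating><ratingCompany>ISO</ratingCompany></rating>"
                + "</Characteristic></PackageBody></Package>";
        StringWriter out = new StringWriter();
        filter(new StringReader(xml), out);
        System.out.println(out);
    }
}
```

If you want typed objects for the pieces you keep, a JAXB Unmarshaller can also be pointed at a StAX reader positioned on a single element, so only that fragment is mapped to a Java object.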

Related

How to convert xml file to HashMap using apache Tika

In my case I am able to read the XML file and parse it to get its content, but the metadata only provides the type of the file, which is "application/xml".
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.xml.XMLParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;
public class XmlParserExample {
public static void main(String[] args) throws IOException, SAXException, TikaException {
BodyContentHandler handler = new BodyContentHandler();
XMLParser parser = new XMLParser();
Metadata metadata = new Metadata();
ParseContext pcontext = new ParseContext();
FileInputStream inputstream = new FileInputStream(new File("example.xml"));
parser.parse(inputstream, handler, metadata, pcontext);
System.out.println("Contents of the document:" + handler.toString());
System.out.println("Metadata of the document:");
String[] metadataNames = metadata.names();
for(String name : metadataNames) {
System.out.println(name + ": " + metadata.get(name));
}
}
}
The snippet above prints the whole XML content and the Content-Type (as metadata). But I also want to fetch the XML tags so that I can create a HashMap, which is a requirement in my case.
Below is my dummy example.xml:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE PubmedArticleSet SYSTEM "http://dtd.nlm.nih.gov/ncbi/pubmed/out/pubmed_190101.dtd">
<PubmedArticleSet>
<PubmedArticle>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">27483086</PMID>
<DateCompleted>
<Year>2018</Year>
<Month>05</Month>
<Day>02</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>05</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1532-849X</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>26</Volume>
<Issue>4</Issue>
<PubDate>
<Year>2017</Year>
<Month>Jun</Month>
</PubDate>
</JournalIssue>
<Title>Journal of prosthodontics : official journal of the American College of Prosthodontists</Title>
<ISOAbbreviation>J Prosthodont</ISOAbbreviation>
</Journal>
<ArticleTitle>The Use of CADCAM Technology for Fabricating Cast Gold Survey Crowns under Existing Partial Removable Dental Prosthesis. A Clinical Report.</ArticleTitle>
<Pagination>
<MedlinePgn>321-326</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1111jopr.12525</ELocationID>
<Abstract>
<AbstractText>The fabrication of a survey crown under an existing partial removable dental prosthesis (PRDP) has always been a challenge to many dental practitioners. This clinical report presents a technique for fabricating accurate cast gold survey crowns to fit existing PRDPs using CAD/CAM technology. The report describes a technique that would digitally scan the coronal anatomy of a cast gold survey crown and an abutment tooth under existing PRDPs planned for restoration, prior to any preparation. The information is stored in the digital software where all the coronal anatomical details are preserved without any modifications. The scanned designs are then applied to the scanned teeth preparations, sent to the milling machine and milled into full-contour clear acrylic resin burn-out patterns. The acrylic resin patterns are tried in the patient's mouth the same day to verify the full seating of the PRDP components. The patterns are then invested and cast into gold crowns and cemented in the conventional manner.</AbstractText>
<CopyrightInformation>© 2016 by the American College of Prosthodontists.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>El Kerdani</LastName>
<ForeName>Tarek</ForeName>
<Initials>T</Initials>
<AffiliationInfo>
<Affiliation>Department of Restorative Dental Sciences, Division of Prosthodontics, University of Florida College of Dentistry, Gainesville, FL.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Roushdy</LastName>
<ForeName>Sally</ForeName>
<Initials>S</Initials>
<AffiliationInfo>
<Affiliation>Department of Restorative Dental Sciences, Division of Prosthodontics, University of Florida College of Dentistry, Gainesville, FL.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D002363">Case Reports</PublicationType>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2016</Year>
<Month>08</Month>
<Day>02</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>J Prosthodont</MedlineTA>
<NlmUniqueID>9301275</NlmUniqueID>
<ISSNLinking>1059-941X</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList>
<Chemical>
<RegistryNumber>7440-57-5</RegistryNumber>
<NameOfSubstance UI="D006046">Gold</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>D</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000368" MajorTopicYN="N">Aged</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017076" MajorTopicYN="Y">Computer-Aided Design</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D003442" MajorTopicYN="Y">Crowns</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D000044" MajorTopicYN="N">Dental Abutments</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017267" MajorTopicYN="Y">Dental Prosthesis Design</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D003832" MajorTopicYN="Y">Denture, Partial, Removable</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006046" MajorTopicYN="N">Gold</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D008297" MajorTopicYN="N">Male</DescriptorName>
</MeshHeading>
</MeshHeadingList>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">CADM</Keyword>
<Keyword MajorTopicYN="N">cast gold</Keyword>
<Keyword MajorTopicYN="N">milled acrylic resin patterns</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="accepted">
<Year>2016</Year>
<Month>06</Month>
<Day>13</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2016</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2018</Year>
<Month>5</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2016</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">27483086</ArticleId>
<ArticleId IdType="doi">10.111pr.12525</ArticleId>
</ArticleIdList>
</PubmedData>
</PubmedArticle>
<PubmedArticle>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">27483087</PMID>
<DateCompleted>
<Year>2018</Year>
<Month>08</Month>
<Day>07</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>08</Month>
<Day>07</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">2326-5205</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>68</Volume>
<Issue>11</Issue>
<PubDate>
<Year>2016</Year>
<Month>11</Month>
</PubDate>
</JournalIssue>
<Title>Arthritis &amp; rheumatology (Hoboken, N.J.)</Title>
</Journal>
<ArticleTitle>Reply.</ArticleTitle>
<Pagination>
<MedlinePgn>2826-2827</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10t.39831</ELocationID>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Hitchon</LastName>
<ForeName>Carol Ann</ForeName>
<Initials>CA</Initials>
<AffiliationInfo>
<Affiliation>University of Manitoba, Winnipeg, Manitoba, Canada.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Koppejan</LastName>
<ForeName>Hester</ForeName>
<Initials>H</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Trouw</LastName>
<ForeName>Leendert A</ForeName>
<Initials>LA</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Huizinga</LastName>
<ForeName>Tom J W</ForeName>
<Initials>TJ</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Toes</LastName>
<ForeName>René E M</ForeName>
<Initials>RE</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>El-Gabalawy</LastName>
<ForeName>Hani S</ForeName>
<Initials>HS</Initials>
<AffiliationInfo>
<Affiliation>University of Manitoba, Winnipeg, Manitoba, Canada.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<GrantList CompleteYN="Y">
<Grant>
<GrantID>MOP‐77700</GrantID>
<Agency>CIHR</Agency>
<Country>Canada</Country>
</Grant>
</GrantList>
<PublicationTypeList>
<PublicationType UI="D016422">Letter</PublicationType>
<PublicationType UI="D013485">Research Sup</PublicationType>
<PublicationType UI="D016420">Comment</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2016</Year>
<Month>10</Month>
<Day>09</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>Arthritis Rheumatol</MedlineTA>
<NlmUniqueID>101623795</NlmUniqueID>
<ISSNLinking>2326-5191</ISSNLinking>
</MedlineJournalInfo>
<CommentsCorrectionsList>
<CommentsCorrections RefType="CommentOn">
<RefSource>dff</RefSource>
<PMID Version="1">27483211</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="CommentOn">
<RefSource>Arthritis Rheumato</RefSource>
<PMID Version="1">26946484</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2016</Year>
<Month>07</Month>
<Day>26</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2016</Year>
<Month>07</Month>
<Day>28</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2016</Year>
<Month>10</Month>
<Day>28</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2016</Year>
<Month>10</Month>
<Day>28</Day>
<Hour>6</Hour>
<Minute>1</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2016</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">27483087</ArticleId>
<ArticleId IdType="doi">efre</ArticleId>
</ArticleIdList>
</PubmedData>
</PubmedArticle>
</PubmedArticleSet>
Kindly help me out on this.
Thanks
My suggestion: If you want to read an XML file, and then parse its contents, you are probably better off using a purpose-built XML parser, rather than Tika.
There are various options - each with its own pros and cons (for example speed, memory consumption).
Here is one approach - it reads the entire file into memory, but you already do that with your Tika approach, so I assume file size is not a problem.
The code assumes there is a file called pubmed.xml which contains the XML presented in the question.
It reads the XML from file, and handles each element as a DOM node.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;
...
public void parseUsingDom() {
try {
File xmlFile = new File("C:/tmp/pubmed.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
NodeList articles = doc.getElementsByTagName("Article");
for (int i = 0; i < articles.getLength(); i++) {
Node article = articles.item(i);
if (article.getNodeType() == Node.ELEMENT_NODE) {
Element articleElement = (Element) article;
String title = articleElement
.getElementsByTagName("ArticleTitle")
.item(0).getTextContent();
System.out.println("");
System.out.println("Title : " + title);
NodeList authors = articleElement.getElementsByTagName("Author");
for (int j = 0; j < authors.getLength(); j++) {
Node author = authors.item(j);
if (author.getNodeType() == Node.ELEMENT_NODE) {
Element authorElement = (Element) author;
String foreName = authorElement
.getElementsByTagName("ForeName")
.item(0).getTextContent();
String lastName = authorElement
.getElementsByTagName("LastName")
.item(0).getTextContent();
System.out.println("Author : " + lastName + ", " + foreName);
}
}
}
}
} catch (Exception e) {
System.err.print(e);
}
}
The program prints the following output, just as a demo of what is possible:
Title : The Use of CADCAM Technology for Fabricating Cast Gold Survey Crowns under Existing Partial Removable Dental Prosthesis. A Clinical Report.
Author : El Kerdani, Tarek
Author : Roushdy, Sally
Title : Reply.
Author : Hitchon, Carol Ann
Author : Koppejan, Hester
Author : Trouw, Leendert A
Author : Huizinga, Tom J W
Author : Toes, René E M
Author : El-Gabalawy, Hani S
In your case, you would capture the relevant values in a hash map, of course.
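A minimal sketch of that last step, assuming you want a map from each article title to its list of authors. The class name ArticleMapSketch, the helper firstText, and the choice of key and value are illustrative assumptions, not something the question specifies. It uses a LinkedHashMap (a HashMap subclass) only so that the articles keep their document order.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArticleMapSketch {

    /** Maps each ArticleTitle to the "LastName, ForeName" strings of its authors. */
    public static Map<String, List<String>> authorsByTitle(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList articles = doc.getElementsByTagName("Article");
        for (int i = 0; i < articles.getLength(); i++) {
            // getElementsByTagName only returns elements, so the cast is safe.
            Element article = (Element) articles.item(i);
            List<String> authors = new ArrayList<>();
            NodeList authorNodes = article.getElementsByTagName("Author");
            for (int j = 0; j < authorNodes.getLength(); j++) {
                Element author = (Element) authorNodes.item(j);
                authors.add(firstText(author, "LastName") + ", " + firstText(author, "ForeName"));
            }
            result.put(firstText(article, "ArticleTitle"), authors);
        }
        return result;
    }

    /** Text of the first descendant with the given tag, or an empty string if there is none. */
    private static String firstText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<PubmedArticleSet><PubmedArticle><MedlineCitation><Article>"
                + "<ArticleTitle>Reply.</ArticleTitle><AuthorList>"
                + "<Author><LastName>Hitchon</LastName><ForeName>Carol Ann</ForeName></Author>"
                + "<Author><LastName>Koppejan</LastName><ForeName>Hester</ForeName></Author>"
                + "</AuthorList></Article></MedlineCitation></PubmedArticle></PubmedArticleSet>";
        System.out.println(authorsByTitle(xml));
    }
}
```

Titles are not guaranteed to be unique, so for real data a key such as the PMID may be a safer choice.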

How to export Weka model made in Java to PMML format

Is there a way I can export a LinearRegression model (built on some dataset) into PMML format in Java?
The code so far
DataSource source = new DataSource("house.arff");
Instances dataset = source.getDataSet();
Instances m_structure = new Instances(dataset, 0);
m_structure.setClassIndex(dataset.numAttributes()-1);
dataset.setClassIndex(dataset.numAttributes()-1);
LinearRegression lReg = new LinearRegression();
int m_NumClasses = dataset.numClasses();
int class_index= dataset.classIndex();
int nK = m_NumClasses - 1;
int nR = dataset.numAttributes() - 1;
double[][] m_Par = new double[nR + 1][nK];
String pmmlx= LogisticProducerHelper.toPMML(dataset,m_structure,m_Par,m_NumClasses);
System.out.println(pmmlx);
This produces the following PMML file
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
<Header copyright="WEKA">
<Application name="WEKA" version="3.8.0"/>
</Header>
<DataDictionary>
<DataField name="houseSize" optype="continuous"/>
<DataField name="lotSize" optype="continuous"/>
<DataField name="bedrooms" optype="continuous"/>
<DataField name="granite" optype="continuous"/>
<DataField name="bathroom" optype="continuous"/>
<DataField name="sellingPrice" optype="continuous"/>
</DataDictionary>
<RegressionModel algorithmName="logisticRegression" functionName="classification" modelType="logisticRegression" normalizationMethod="softmax">
<MiningSchema>
<MiningField missingValueReplacement="3132.0" missingValueTreatment="asMean" name="houseSize" usageType="active"/>
<MiningField missingValueReplacement="11788.142857142857" missingValueTreatment="asMean" name="lotSize" usageType="active"/>
<MiningField missingValueReplacement="5.0" missingValueTreatment="asMean" name="bedrooms" usageType="active"/>
<MiningField missingValueReplacement="0.42857142857142855" missingValueTreatment="asMean" name="granite" usageType="active"/>
<MiningField missingValueReplacement="0.7142857142857143" missingValueTreatment="asMean" name="bathroom" usageType="active"/>
<MiningField name="sellingPrice" usageType="predicted"/>
</MiningSchema>
<Output/>
</RegressionModel>
</PMML>
The PMML file above cannot be used to predict an Instance because the model is not yet built.
Using the following line builds the Classifier.
lReg.buildClassifier(dataset);
So I am wondering: is there a way to add the parameters learned by this classifier to the PMML file, so that it can be exported/imported easily as an already trained classifier?
According to JavaDoc of LogisticProducerHelper:
Helper class for producing PMML for a Logistic classifier. Not designed to be used directly - you should call toPMML() on a trained Logistic classifier.
The JavaDoc states that only the Logistic classifier implements PMMLProducer.
If you use Logistic, you can call the logistic.toPMML(train) method.

Unable to run the XSLT through JAVA and Empty result observed in output

I just want to extract a few tags from an XML file, and I'm using XSLT for it.
XSLT:
<xsl:stylesheet version="1.0"xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<xsl:strip-space elements="*"/>
<xsl:template match="/testng-results">
<xsl:copy-of select="class/test-method[@status='PASS']"/>
</xsl:template>
</xsl:stylesheet>
Input.XML
<?xml version="1.0" encoding="UTF-8"?>
<testng-results skipped="0" failed="0" total="10" passed="10">
<class name="com.transfermoney.Transfer">
<test-method status="PASS" name="setParameter" is-config="true" duration-ms="4"
started-at="2018-08-16T21:43:38Z" finished-at="2018-08-16T21:43:38Z">
<params>
<param index="0">
<value>
<![CDATA[org.testng.TestRunner@31c2affc]]>
</value>
</param>
</params>
<reporter-output>
</reporter-output>
</test-method> <!-- setParameter -->
</class>
<class name="com.transfermoney.Transfer">
<test-method status="FAIL" name="setSettlementFlag" is-config="true" duration-ms="5"
started-at="2018-08-16T21:44:55Z" finished-at="2018-08-16T21:44:55Z">
<reporter-output>
<line>
<![CDATA[runSettlement Value Set :false]]>
</line>
</reporter-output>
</test-method> <!-- setSettlementFlag -->
</class>
</testng-results>
JAVA Code:
public static void main(String[] args) throws Exception {
String XML = fetchDataFrmXML(".//Test//testng-results_2.xml");
Transformer t = TransformerFactory.newInstance().newTransformer(new StreamSource(new File(".//Test//Cut.xslt")));
t.transform(new StreamSource(new StringReader(XML)), new StreamResult(new File(".//Test//Sample1.xml")));
}
Expected Output:
<test-method status="PASS" name="setParameter" is-config="true" duration-ms="4" started-at="2018-08-16T21:43:38Z" finished-at="2018-08-16T21:43:38Z">
<params>
<param index="0">
<value>
<![CDATA[runSettlement Value Set :false]]>
</value>
</param>
</params>
<reporter-output/>
</test-method>
FetchXML:
public static String fetchDataFrmXML(String fileLocation) throws Exception
{
file = new File(fileLocation);
fr = new FileReader(file);
br = new BufferedReader(fr);
String temp;
String result = "";
while ((temp = br.readLine()) != null) {
result += temp;
}
br.close();
return result;
}
I get an empty Sample1.xml file after running the Java class. But if I run the same XSLT script through an online editor, it gives the expected result.
Is there any issue in my Java code that executes the XSLT? Please help me with this.
Your code works for me. The only things I changed were:
Declaring the variables used in your fetchDataFrmXML() method
Adding the missing space after version="1.0" in your stylesheet
Changing the file names
I added the line
System.err.println(t.getClass().getName());
to identify the XSLT engine used; the output was
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl
You might like to do the same.
Looking more carefully at the output, it displays the record with status="PASS", which is what the code is selecting, though you said you wanted the one that has status="FAIL".
A note about your fetchDataFrmXML() method: it's incredibly inefficient to build up the content of a string by repeated string concatenation this way. Use a StringBuilder instead.
(I once earned myself $10K in consultancy fees by pointing this mistake out to a client, who probably saved themselves $1m in hardware costs as a result).
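A minimal sketch of the StringBuilder version follows. The class name ReadAllSketch and the method name readAll are made up for illustration, and the method takes a Reader so that it can be exercised without a file on disk. Unlike the loop in the question, this version appends a line break after each line, because readLine() strips line terminators and gluing lines together can change the text content of the XML.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadAllSketch {

    /** Reads everything from source into one String, ending each line with '\n'. */
    public static String readAll(Reader source) throws IOException {
        StringBuilder result = new StringBuilder();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // append() grows one buffer instead of copying the whole string on every line.
                result.append(line).append('\n');
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readAll(new StringReader("<a>\n<b/>\n</a>"));
        System.out.print(text);
    }
}
```

To read a file, pass a FileReader for the path. If the string is only needed as transformer input, another option is to skip it entirely and hand the file to the transformer with new StreamSource(new File(...)), as the question already does for the stylesheet.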

VTD-XML reading gives no results

I am trying to read RSS content using VTD-XML. Below is a sample of the RSS.
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<?xml-stylesheet type="text/xsl" href="rss.xsl"?>
<channel>
<title>MyRSS</title>
<atom:link href="http://www.example.com/rss.php" rel="self" type="application/rss+xml" />
<link>http://www.example.com/rss.php</link>
<description>MyRSS</description>
<language>en-us</language>
<pubDate>Tue, 22 May 2018 13:15:15 +0530</pubDate>
<item>
<title>Title 1</title>
<pubDate>Tue, 22 May 2018 13:14:40 +0530</pubDate>
<link>http://www.example.com/news.php?nid=47610</link>
<guid>http://www.example.com/news.php?nid=47610</guid>
<description>bla bla bla</description>
</item>
</channel>
</rss>
As you know, some RSS feeds can contain more styling info and so on. However, <channel> and <item> are common to every RSS feed, at least for the ones I need to use.
I tried VTD-XML to read this as quickly as possible. Below is the code.
VTDGen vg = new VTDGen();
if (vg.parseHttpUrl(appDataBean.getUrl(), true)) {
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
ap.selectXPath("/channel/item");
int result = -1;
while ((result = ap.evalXPath()) != -1) {
if (vn.matchElement("item")) {
do {
// do something with the contents of the item node
Log.d("VTD", vn.toString(vn.getText()));
} while (vn.toElement(VTDNav.NEXT_SIBLING));
}
}
}
Unfortunately this did not print anything. What am I doing wrong here? Also, none of the RSS feeds are very big, so I need to read them in a couple of milliseconds. This code runs on Android.

Xml validation with xsl

I have been trying to validate XML through XSLT for a couple of hours. I have the following XSL for XML validation. Every time I try to validate the XML, I get the warnings below, empty currencyID attributes in the XML are ignored, and the XML validates although it is not valid. Does anyone have an idea why they are ignored and the XML validates?
<xsl:variable name="CurrencyCodeList"
select="',AED,AFN,ALL,AMD,ANG,AOA,ARS,AUD,AWG,AZN,BAM,BBD,BDT,BGN,BHD,BIF,BMD,BND,BOB,BOV,BRL,BSD,BTN,BWP,BYR,BZD,CAD,CDF,CHE,CHF,CHW,CLF,CLP,CNY,COP,COU,CRC,CUC,CUP,CVE,CZK,DJF,DKK,DOP,DZD,EEK,EGP,ERN,ETB,EUR,FJD,FKP,GBP,GEL,GHS,GIP,GMD,GNF,GTQ,GWP,GYD,HKD,HNL,HRK,HTG,HUF,IDR,ILS,INR,IQD,IRR,ISK,JMD,JOD,JPY,KES,KGS,KHR,KMF,KPW,KRW,KWD,KYD,KZT,LAK,LBP,LKR,LRD,LSL,LTL,LVL,LYD,MAD,MAD,MDL,MGA,MKD,MMK,MNT,MOP,MRO,MUR,MVR,MWK,MXN,MXV,MYR,MZN,NAD,NGN,NIO,NOK,NPR,NZD,OMR,PAB,PEN,PGK,PHP,PKR,PLN,PYG,QAR,RON,RSD,RUB,RWF,SAR,SBD,SCR,SDG,SEK,SGD,SHP,SLL,SOS,SRD,STD,SVC,SYP,SZL,THB,TJS,TMT,TND,TOP,TRY,TTD,TWD,TZS,UAH,UGX,USD,USN,USS,UYI,UYU,UZS,VEF,VND,VUV,WST,XAF,XAG,XAU,XBA,XBB,XBC,XBD,XCD,XDR,XFU,XOF,XPD,XPF,XPF,XPF,XPT,XTS,XXX,YER,ZAR,ZMK,ZWL,'"/>
<xsl:template match="//@currencyID" priority="1008" mode="M0">
<svrl:fired-rule xmlns:svrl="http://purl.oclc.org/dsdl/svrl" context="//@currencyID"/>
<!--ASSERT -->
<xsl:choose>
<xsl:when test="contains($CurrencyCodeList, concat(',',.,','))"/>
<xsl:otherwise>
<svrl:failed-assert xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
test="contains($CurrencyCodeList, concat(',',.,','))">
<xsl:attribute name="location">
<xsl:apply-templates select="." mode="schematron-select-full-path"/>
</xsl:attribute>
<svrl:text>Geçersiz currencyID niteliği : '<xsl:text/>
<xsl:value-of select="."/>
<xsl:text/>'. Geçerli değerler için kod listesine bakınız.</svrl:text>
</svrl:failed-assert>
</xsl:otherwise>
</xsl:choose>
<xsl:apply-templates select="*|comment()|processing-instruction()" mode="M0"/>
Warning: on line 286
The preceding-sibling axis starting at a namespace node will never select anything
Warning: on line 311
The preceding-sibling axis starting at a namespace node will never select anything
Warning: on line 407
The child axis starting at an attribute node will never select anything
Warning: on line 407
The child axis starting at an attribute node will never select anything
Warning: on line 407
The child axis starting at an attribute node will never select anything
Warning: on line 436
The child axis starting at an attribute node will never select anything
Warning: on line 436
The child axis starting at an attribute node will never select anything
Warning: on line 436
EDIT:
Actually, I transformed the Schematron to XSLT in order to test it in XSLT as well. What I was given is a Schematron for validation, so I have to validate using the given Schematron files, a sample XML, and Java code. The main Schematron and the other files have more rules and patterns, but I removed most of them to keep the sample code easy to test. Everything is validated successfully except attributes in elements (e.g. the currencyID attribute). I'm using UgliSch (Ugli Schematron Validator) for Schematron validation.
MainSchematron.xml:
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns="http://purl.oclc.org/dsdl/schematron"
xmlns:sch="http://purl.oclc.org/dsdl/schematron"
xmlns:sh="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader"
xmlns:ef="http://www.efatura.gov.tr/envelope-namespace">
<sch:include href="UBL-TR_Codelist.xml#codes"/>
<sch:include href="UBL-TR_Common_Schematron.xml#abstracts"/>
<sch:ns prefix="cbc" uri="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" />
<sch:rule context="//cbc:CurrencyCode">
<sch:extends rule="GeneralCurrencyCodeCheck"/>
</sch:rule>
<sch:rule context="//@currencyID">
<sch:extends rule="GeneralCurrencyIDCheck"/>
</sch:rule>
</sch:schema>
UBL-TR_Common_Schematron.xml:
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
xmlns="http://purl.oclc.org/dsdl/schematron">
<sch:pattern name="AbstractRules" id="abstracts">
<sch:p>Pattern for storing abstract rules</sch:p>
<!-- Rule to validate currencyID Genel -->
<sch:rule abstract="true" id="GeneralCurrencyIDCheck">
<sch:assert test="contains($CurrencyCodeList, concat(',',.,','))">Geçersiz currencyID niteliği : '<sch:value-of select="."/>'. Geçerli değerler için kod listesine bakınız.</sch:assert>
</sch:rule>
</sch:pattern>
</sch:schema>
UBL-TR_Codelist.xml:
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
xmlns="http://purl.oclc.org/dsdl/schematron">
<sch:pattern name="CodeList" id="codes">
<sch:let name="CurrencyCodeList" value="',AED,AFN,ALL,AMD,ANG,AOA,ARS,AUD,AWG,AZN,BAM,BBD,BDT,BGN,BHD,BIF,BMD,BND,BOB,BOV,BRL,BSD,BTN,BWP,BYR,BZD,CAD,CDF,CHE,CHF,CHW,CLF,CLP,CNY,COP,COU,CRC,CUC,CUP,CVE,CZK,DJF,DKK,DOP,DZD,EEK,EGP,ERN,ETB,EUR,FJD,FKP,GBP,GEL,GHS,GIP,GMD,GNF,GTQ,GWP,GYD,HKD,HNL,HRK,HTG,HUF,IDR,ILS,INR,IQD,IRR,ISK,JMD,JOD,JPY,KES,KGS,KHR,KMF,KPW,KRW,KWD,KYD,KZT,LAK,LBP,LKR,LRD,LSL,LTL,LVL,LYD,MAD,MAD,MDL,MGA,MKD,MMK,MNT,MOP,MRO,MUR,MVR,MWK,MXN,MXV,MYR,MZN,NAD,NGN,NIO,NOK,NPR,NZD,OMR,PAB,PEN,PGK,PHP,PKR,PLN,PYG,QAR,RON,RSD,RUB,RWF,SAR,SBD,SCR,SDG,SEK,SGD,SHP,SLL,SOS,SRD,STD,SVC,SYP,SZL,THB,TJS,TMT,TND,TOP,TRY,TTD,TWD,TZS,UAH,UGX,USD,USN,USS,UYI,UYU,UZS,VEF,VND,VUV,WST,XAF,XAG,XAU,XBA,XBB,XBC,XBD,XCD,XDR,XFU,XOF,XPD,XPF,XPF,XPF,XPT,XTS,XXX,YER,ZAR,ZMK,ZWL,'"/>
</sch:pattern>
</sch:schema>
sample.xml:
<sh:StandardBusinessDocument xsi:schemaLocation="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader PackageProxy_1_2.xsd" xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2" xmlns:ef="http://www.efatura.gov.tr/package-namespace" xmlns:oa="http://www.openapplications.org/oagis/9" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:sh="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:ns9="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:ns11="urn:oasis:names:specification:ubl:schema:xsd:ApplicationResponse-2" xmlns:ns3="http://www.hr-xml.org/3" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sh:StandardBusinessDocumentHeader>
<sh:HeaderVersion>1.0</sh:HeaderVersion>
<sh:Sender>
<sh:Identifier>urn:mail:defaultgb#xxx.com.tr</sh:Identifier>
<sh:ContactInformation>
<sh:Contact>xxx Kurumsal Bilgi Sistemleri A.Ş</sh:Contact>
<sh:ContactTypeIdentifier>UNVAN</sh:ContactTypeIdentifier>
</sh:ContactInformation>
<sh:ContactInformation>
<sh:Contact>8110120507</sh:Contact>
<sh:ContactTypeIdentifier>VKN_TCKN</sh:ContactTypeIdentifier>
</sh:ContactInformation>
</sh:Sender>
<sh:Receiver>
<sh:Identifier>urn:mail:defaultpk#xxx.com.tr</sh:Identifier>
<sh:ContactInformation>
<sh:Contact>KAKAR KURUMSAL BİLGİSİSTEMLERİ LTD.ŞTİ. Test Kullanıcısı</sh:Contact>
<sh:ContactTypeIdentifier>UNVAN</sh:ContactTypeIdentifier>
</sh:ContactInformation>
<sh:ContactInformation>
<sh:Contact>4545552073</sh:Contact>
<sh:ContactTypeIdentifier>VKN_TCKN</sh:ContactTypeIdentifier>
</sh:ContactInformation>
</sh:Receiver>
<sh:DocumentIdentification>
<sh:Standard>UBL-TR</sh:Standard>
<sh:TypeVersion>1.2</sh:TypeVersion>
<sh:InstanceIdentifier>bb583542-a81a-4b45-87d6-e90596101a41</sh:InstanceIdentifier>
<sh:Type>SENDERENVELOPE</sh:Type>
<sh:MultipleType>false</sh:MultipleType>
<sh:CreationDateAndTime>2016-01-06T16:27:25.759+02:00</sh:CreationDateAndTime>
</sh:DocumentIdentification>
</sh:StandardBusinessDocumentHeader>
<ef:Package>
<Elements>
<ElementType>INVOICE</ElementType>
<ElementCount>1</ElementCount>
<ElementList>
<ns9:Invoice>
<ext:UBLExtensions>
<ext:UBLExtension>
<ext:ExtensionContent>
...
</ext:ExtensionContent>
</ext:UBLExtension>
</ext:UBLExtensions>
<cbc:UBLVersionID>2.1</cbc:UBLVersionID>
<cbc:CustomizationID>TR1.2</cbc:CustomizationID>
<cbc:ProfileID>TICARIFATURA</cbc:ProfileID>
<cbc:ID>PAZ2015000000012</cbc:ID>
<cbc:CopyIndicator>false</cbc:CopyIndicator>
<cbc:UUID>54b0dad2-e3a7-44ee-848a-cf7977000020</cbc:UUID>
<cbc:IssueDate>2016-01-06</cbc:IssueDate>
<cbc:InvoiceTypeCode>SATIS</cbc:InvoiceTypeCode>
<cbc:DocumentCurrencyCode>TRY</cbc:DocumentCurrencyCode>
<cbc:LineCountNumeric>0</cbc:LineCountNumeric>
<cac:Signature>
<cbc:ID schemeID="VKN_TCKN">8110120507</cbc:ID>
<cac:SignatoryParty>
<cbc:WebsiteURI>http://www.xxx.com.tr/</cbc:WebsiteURI>
<cac:PartyIdentification>
<cbc:ID schemeID="VKN">8110120507</cbc:ID>
</cac:PartyIdentification>
<cac:PartyName>
<cbc:Name>xxx Kurumsal Bilgi Sistemleri A.Ş</cbc:Name>
</cac:PartyName>
<cac:PostalAddress>
<cbc:StreetName>Besiktas Teknik Universitesi</cbc:StreetName>
<cbc:BuildingNumber>150/1G</cbc:BuildingNumber>
<cbc:CitySubdivisionName>Besıktas</cbc:CitySubdivisionName>
<cbc:CityName>Istanbul</cbc:CityName>
<cbc:PostalZone>06100</cbc:PostalZone>
<cac:Country>
<cbc:Name>Turkiye</cbc:Name>
</cac:Country>
</cac:PostalAddress>
</cac:SignatoryParty>
<cac:DigitalSignatureAttachment>
<cac:ExternalReference>
<cbc:URI>#Signature</cbc:URI>
</cac:ExternalReference>
</cac:DigitalSignatureAttachment>
</cac:Signature>
<cac:AccountingSupplierParty>
<cac:Party>
<cac:PartyIdentification>
<cbc:ID schemeID="VKN">7221130507</cbc:ID>
</cac:PartyIdentification>
<cac:PartyName>
<cbc:Name>KAKAR KURUMSAL LTD.ŞTİ.</cbc:Name>
</cac:PartyName>
<cac:PostalAddress>
<cbc:Room/>
<cbc:BuildingName/>
<cbc:BuildingNumber/>
<cbc:CitySubdivisionName>besiktas</cbc:CitySubdivisionName>
<cbc:CityName>istanbul</cbc:CityName>
<cbc:PostalZone/>
<cac:Country>
<cbc:Name>ALMANYA</cbc:Name>
</cac:Country>
</cac:PostalAddress>
<cac:Contact>
<cbc:Telephone/>
<cbc:Telefax/>
<cbc:ElectronicMail/>
</cac:Contact>
</cac:Party>
</cac:AccountingSupplierParty>
<cac:AccountingCustomerParty>
<cac:Party>
<cbc:WebsiteURI/>
<cac:PartyIdentification>
<cbc:ID schemeID="VKN">2535552073</cbc:ID>
</cac:PartyIdentification>
<cac:PartyName>
<cbc:Name>KAKAR LTD.ŞTİ. Test Kullanıcısı</cbc:Name>
</cac:PartyName>
<cac:PostalAddress>
<cbc:ID/>
<cbc:Postbox/>
<cbc:Room/>
<cbc:StreetName/>
<cbc:BlockName/>
<cbc:BuildingName/>
<cbc:BuildingNumber/>
<cbc:CitySubdivisionName>besiktas</cbc:CitySubdivisionName>
<cbc:CityName>istanbul</cbc:CityName>
<cbc:PostalZone/>
<cbc:Region/>
<cbc:District/>
<cac:Country>
<cbc:Name>TÜRKİYE</cbc:Name>
</cac:Country>
</cac:PostalAddress>
<cac:Contact>
<cbc:Telephone/>
<cbc:Telefax/>
<cbc:ElectronicMail/>
</cac:Contact>
<cac:Person>
<cbc:FirstName/>
<cbc:FamilyName/>
</cac:Person>
</cac:Party>
</cac:AccountingCustomerParty>
<cac:TaxTotal>
<cbc:TaxAmount currencyID="TRY">2.16</cbc:TaxAmount>
<cac:TaxSubtotal>
<cbc:TaxableAmount currencyID="asdasdasdasd">0</cbc:TaxableAmount>
<cbc:TaxAmount currencyID="TRY">2.16</cbc:TaxAmount>
<cbc:CalculationSequenceNumeric>0</cbc:CalculationSequenceNumeric>
<cbc:Percent>18</cbc:Percent>
<cac:TaxCategory>
<cac:TaxScheme>
<cbc:Name>KDV</cbc:Name>
<cbc:TaxTypeCode>0015</cbc:TaxTypeCode>
</cac:TaxScheme>
</cac:TaxCategory>
</cac:TaxSubtotal>
</cac:TaxTotal>
<cac:LegalMonetaryTotal>
<cbc:LineExtensionAmount currencyID="TRY">12</cbc:LineExtensionAmount>
<cbc:TaxExclusiveAmount currencyID="TRY">12</cbc:TaxExclusiveAmount>
<cbc:TaxInclusiveAmount currencyID="TRY">14.16</cbc:TaxInclusiveAmount>
<cbc:AllowanceTotalAmount currencyID="TRY">0</cbc:AllowanceTotalAmount>
<cbc:PayableAmount currencyID="TRY">14.16</cbc:PayableAmount>
</cac:LegalMonetaryTotal>
<cac:InvoiceLine>
<cbc:ID>1</cbc:ID>
<cbc:InvoicedQuantity unitCode="NIU">1</cbc:InvoicedQuantity>
<cbc:LineExtensionAmount currencyID="">12</cbc:LineExtensionAmount>
<cac:AllowanceCharge>
<cbc:ChargeIndicator>false</cbc:ChargeIndicator>
<cbc:MultiplierFactorNumeric>0</cbc:MultiplierFactorNumeric>
<cbc:Amount currencyID="">0</cbc:Amount>
<cbc:BaseAmount currencyID="">0</cbc:BaseAmount>
</cac:AllowanceCharge>
<cac:TaxTotal>
<cbc:TaxAmount currencyID="">2.16</cbc:TaxAmount>
<cac:TaxSubtotal>
<cbc:TaxableAmount currencyID="">0</cbc:TaxableAmount>
<cbc:TaxAmount currencyID="">2.16</cbc:TaxAmount>
<cbc:Percent>18</cbc:Percent>
<cac:TaxCategory>
<cac:TaxScheme>
<cbc:Name>KDV</cbc:Name>
<cbc:TaxTypeCode>0015</cbc:TaxTypeCode>
</cac:TaxScheme>
</cac:TaxCategory>
</cac:TaxSubtotal>
</cac:TaxTotal>
<cac:Item>
<cbc:Name>asdasd</cbc:Name>
</cac:Item>
<cac:Price>
<cbc:PriceAmount currencyID="TRY">12</cbc:PriceAmount>
</cac:Price>
</cac:InvoiceLine>
</ns9:Invoice>
</ElementList>
</Elements>
</ef:Package>
</sh:StandardBusinessDocument>
Java:
// Load the main Schematron file from the classpath
try (InputStream ubl = getClass().getResourceAsStream("/schematrons/UBL-TR_Main_Schematron.xml")) {
SchemaFactory schemaFactory = SchemaFactory.newInstance(XmlSchemaNsUris.SCHEMATRON_NS_URI);
Schema schema = schemaFactory.newSchema(new StreamSource(ubl));
Validator validator = schema.newValidator();
// Validation problems are reported to validationErrorHandler
validator.setErrorHandler(validationErrorHandler);
// binary holds the document to validate as UTF-8 encoded bytes
validator.validate(new StringSource(new String(binary, "UTF-8")));
} catch (Exception e) {
e.printStackTrace();
}
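To check the currency rule on its own, independent of the Schematron setup, the same XPath test can be run with the JDK's `javax.xml.xpath` API. This is not Schematron validation, only a sketch of the equivalent expression; the class name, the shortened code list and the sample fragment are assumptions made for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CurrencyIdXPathCheck {
    // Hypothetical, shortened excerpt of CurrencyCodeList
    static final String CODE_LIST = ",EUR,TRY,USD,";

    // Hypothetical fragment with one valid, one invalid and one empty currencyID
    static final String SAMPLE =
        "<TaxTotal>"
        + "<TaxAmount currencyID=\"TRY\">2.16</TaxAmount>"
        + "<TaxableAmount currencyID=\"asdasdasdasd\">0</TaxableAmount>"
        + "<LineExtensionAmount currencyID=\"\">12</LineExtensionAmount>"
        + "</TaxTotal>";

    // Returns every currencyID value that is not in CODE_LIST
    static List<String> invalidCurrencyIds(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Same test as the Schematron assert, negated to select the failures
        String expr = "//@currencyID[not(contains('" + CODE_LIST + "', concat(',', ., ',')))]";
        NodeList hits = (NodeList) xpath.evaluate(
            expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            result.add(hits.item(i).getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String value : invalidCurrencyIds(SAMPLE)) {
            System.out.println("Invalid currencyID: '" + value + "'");
        }
    }
}
```

For the fragment above this reports the `asdasdasdasd` value and the empty value, and leaves `TRY` alone. `currencyID` is an unprefixed attribute, so `//@currencyID` needs no namespace mapping even in the full UBL document.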
