I am trying to extract information from an XML file. I can extract element values, but not their attributes.
Code:
public class NRusEntity {
private String code;
private String name;
private String saltForm;
getters and setters
...
Parser Class:
...
String filePath = FileUtility.getOwlFilePath();
try {
Digester digester = new Digester();
digester.setValidating(false);
//digester.setNamespaceAware(true);
digester.addObjectCreate("rdf:RDF", NRus.class);
digester.addObjectCreate("rdf:RDF/owl:Class", NRusEntity.class);
digester.addCallMethod("rdf:RDF/owl:Class/Preferred_Name", "setName", 0);
digester.addCallMethod("rdf:RDF/owl:Class/code", "setCode", 0);
/**This commented part creates exception*/
//digester.addCallMethod("rdf:RDF/owl:Class/Has_Salt_Form", "setSaltForm", 2);
//digester.addCallParam("rdf:RDF/owl:Class/Has_Salt_Form", 0);
//digester.addCallParam("rdf:RDF/owl:Class/Has_Salt_Form", 1, "rdf:resource");
digester.addSetNext("rdf:RDF/owl:Class", "addEntry");
File input = new File(filePath);
digester.parse(input);
}
...
XML Looks like this:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#">
<owl:Class rdf:about="#z">
<Preferred_Name rdf:datatype="http://www.w3.org/2001/XMLSchema#string">von</Preferred_Name>
<code rdf:datatype="http://www.w3.org/2001/XMLSchema#string">XY221</code>
<Has_Format rdf:resource="http://zlib.com#Ni_Hydro"/>
</owl:Class>
...
</rdf:RDF>
How can I extract the URI value
"http://zlib.com#Ni_Hydro"
from this XML line?
<Has_Format rdf:resource="http://zlib.com#Ni_Hydro"/>
I can't tell exactly, as your XML does not quite match your code: the commented-out code refers to a Has_Salt_Form element, but the rdf:resource attribute appears on a Has_Format element. However, I can see one potential problem, and fixing it may help you progress:
I'm assuming your NRusEntity class setter is something like:
public void setSaltForm(String saltForm) {
// assign saltForm, or whatever...
}
However, the digester code you have is:
digester.addCallMethod("rdf:RDF/owl:Class/Has_Salt_Form", "setSaltForm", 2);
digester.addCallParam("rdf:RDF/owl:Class/Has_Salt_Form", 0);
digester.addCallParam("rdf:RDF/owl:Class/Has_Salt_Form", 1, "rdf:resource");
This looks for a setSaltForm method with two parameters (the first is the element body, the second the rdf:resource attribute), so it will not match the simple setter, and you'll get something like "no such method" in the exception message.
So if you need the body content then try adding another set method:
public void setSaltForm(String content, String attrib) {
// content will have the element content
// attrib will have "http://zlib.com#Ni_Hydro"
}
Or if you don't need the content then drop it from the digester rules:
digester.addCallMethod("rdf:RDF/owl:Class/Has_Salt_Form", "setSaltForm", 1);
digester.addCallParam("rdf:RDF/owl:Class/Has_Salt_Form", 0, "rdf:resource");
If neither of those works, can you add details of the version of Digester you are using and the exception you get?
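As a cross-check that does not depend on Digester at all, here is a minimal sketch that reads the rdf:resource attribute with the JDK's built-in DOM parser. This is a different technique from the Digester rules above, and the class and method names (RdfResourceReader, resourcesOf) are made up for illustration. It assumes the element name is Has_Format, as in the posted XML.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Digester or of the original code.
final class RdfResourceReader {
    private static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the rdf:resource value of every element with the given tag name. */
    static List<String> resourcesOf(String xml, String elementName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getAttributeNS to resolve the rdf prefix
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(elementName);
            List<String> resources = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                if (element.hasAttributeNS(RDF_NS, "resource")) {
                    resources.add(element.getAttributeNS(RDF_NS, "resource"));
                }
            }
            return resources;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed copy of the XML from the question
        String xml = "<?xml version=\"1.0\"?>"
                + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
                + " xmlns:owl=\"http://www.w3.org/2002/07/owl#\">"
                + "<owl:Class rdf:about=\"#z\">"
                + "<Has_Format rdf:resource=\"http://zlib.com#Ni_Hydro\"/>"
                + "</owl:Class></rdf:RDF>";
        System.out.println(resourcesOf(xml, "Has_Format"));
    }
}
```

If this prints the URI but the Digester rules still return nothing, the element name used in the rule pattern is the most likely culprit.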
Related
I have to update an XML document object generated using Apache XMLBeans. There are two ways I am trying to update and save the document.
Step 1 : I am parsing the document and updating with the new values and saving with the parsed document itself.
private boolean updateContact(ContactType contacts, String contactFilePath, String name) throws Exception {
ContactsDocument contactDoc = ContactsDocument.Factory.parse(new File(contactFilePath));
ContactType contact = contactDoc.getContactType();
contact.setName(name);
contactDoc.save(new File(contactFilePath) , XmlUtils.getDefaultFileSavingOptions());
return true; // reached only if parse and save did not throw
}
Step 2 : I am passing the updated document type and creating new instance of the xml document and saving with the updated type.
private boolean writeContact(ContactType contactType, String contactFilePath) throws Exception {
ContactsDocument contactsDoc = ContactsDocument.Factory.newInstance();
contactsDoc.setContactType(contactType);
contactsDoc.save(new File(contactFilePath), XmlUtils.getDefaultFileSavingOptions());
return true; // reached only if save did not throw
}
Step 2 is working, but I want to know: will Step 1 work, and which is the more efficient way of doing it in this scenario?
Step 1 works fine with the default XML file saving options, and it does not modify or remove any of the existing namespaces present in the file.
private boolean updateContact(ContactType contacts, String contactFilePath, String name) throws Exception {
ContactsDocument contactDoc = ContactsDocument.Factory.parse(new File(contactFilePath));
ContactType contact = contactDoc.getContactType();
contact.setName(name);
contactDoc.save(new File(contactFilePath) , XmlUtils.getDefaultFileSavingOptions());
return true; // reached only if parse and save did not throw
}
Parsing the file and saving the changes on top of it is also a good approach, rather than instantiating a new XML document for every update.
I have the below method ...
public void sendmessage( final String messageText)
{
}
The messageText parameter contains an XML message. I need to extract the value of one XML tag from this message and store it in an integer variable.
The message contains this tag:
<transferGroupId>206320940</transferGroupId>
I want to extract the value of this tag and store it in a variable. Please advise how to achieve this.
The complete XML message is shown below:
<?xml version="1.0" encoding="UTF-8"?>
<emml message="emml-transfer-lifecycle">
<messageHeader>
<businessDate>2016-01-09</businessDate>
<eventDateTime timeContextReference="London">2016-01-09T16:55:00.485
</eventDateTime>
<system id="ACSDE">
<systemId>ADS ABLO</systemId>
<systemClass>ADS</systemClass>
<systemRole>Reference</systemRole>
</system>
<timeContext id="ndon">
<location>ABLO</location>
</timeContext>
</messageHeader>
<transferEventHeader>
<transferGroupStatus>Settled</transferGroupStatus>
<transferGroupIdentifier>
<transferGroupId>206320940</transferGroupId>
<systemReference>Ghtr</systemReference>
<transferGroupClassificationScheme>Primary Identifier
</transferGroupClassificationScheme>
</transferGroupIdentifier>
</transferEventHeader>
</emml>
I have tried this approach as shown below
String tagname = "transferGroupId";
String t = getTagValue( messageText, tagname);
which calls this method:
public static String getTagValue(String messageText, String tagname){
return messageText.split("<"+tagname+">")[1].split("</"+tagname+">")[0];
}
But this does not work. Please advise how I can fix it.
I was also advised to use jsoup and tried it as shown below, but it fails with an error saying that the Parser class has no method named xmlParser:
Document doc = Jsoup.parse(messageText, "", Parser.xmlParser());
for (Element e : doc.select("transferGroupId")) {
System.out.println(e.text());
}
JSoup sounds like what you need. (It has xml parsing support)
In JSoup:
Document doc = Jsoup.parse(messageText, "", Parser.xmlParser());
for (Element e : doc.select("transferGroupId")) {
System.out.println(e.text());
}
This will print out the text of the transferGroupId, which is 206320940 in this case. You can do other things with this such as sending a message using your own methods and resources.
Hope this helps!
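Since the question asks for the value in an integer variable, and in case the jsoup version on the classpath lacks Parser.xmlParser(), here is a minimal sketch that uses only the JDK's DOM parser instead of jsoup. The class and method names (TransferGroupIdReader, readTransferGroupId) are made up for illustration, and the sketch assumes the id always fits in an int, as 206320940 does.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper using the JDK DOM parser rather than jsoup.
final class TransferGroupIdReader {

    /** Parses messageText as XML and returns the first transferGroupId as an int. */
    static int readTransferGroupId(String messageText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(messageText)));
            NodeList nodes = doc.getElementsByTagName("transferGroupId");
            if (nodes.getLength() == 0) {
                throw new IllegalArgumentException("No transferGroupId element found");
            }
            // trim() guards against whitespace or line breaks around the number
            return Integer.parseInt(nodes.item(0).getTextContent().trim());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("messageText is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the message from the question
        String messageText = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<emml message=\"emml-transfer-lifecycle\">"
                + "<transferEventHeader><transferGroupIdentifier>"
                + "<transferGroupId>206320940</transferGroupId>"
                + "</transferGroupIdentifier></transferEventHeader>"
                + "</emml>";
        int transferGroupId = readTransferGroupId(messageText);
        System.out.println(transferGroupId);
    }
}
```

Note that the XML declaration must be the very first thing in the string; leading whitespace before it makes the parser reject the message.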
In my current project, we are in the process of re-factoring a java class that constructs an XML document. In previous versions of the product delivered to the customer, the XML document is built with lower case elements and attributes:
<rootElement attr = "abc">
<childElement childAttr = "xyz"/>
</rootElement>
But now we have a requirement to build the XML document with TitleCase element and attributes. The user will set a flag in a properties file to indicate whether the document should be built in lower case or title case. If the flag is configured to build the document in TitleCase, the resultant document will look like:
<RootElement Attr = "abc">
<ChildElement ChildAttr = "xyz"/>
</RootElement>
Various approaches to solve the problem:
1. Plugging in a transformer to convert lowercase XML document to TitleCase XML document. But this will impact the overall performance, as we deal with huge XML files spanning more than 10,000 lines.
2. Create two separate maps with the corresponding XML elements and attributes.
For example:
lowercase map: rootelement -> rootElement, attr -> attr ....
TitleCase map: rootelement -> RootElement, attr -> Attr ....
Based on the property set by the user, the corresponding map will be chosen, and the XML elements/attributes from this map will be used to build the XML document.
3. Using an enum to define constants and their corresponding values.
public enum XMLConstants {
ROOTELEMENT("rootElement", "RootElement"),
ATTRIBUTE("attr", "Attr");
private String lowerCase;
private String titleCase;
private XMLConstants(String aLowerCase, String aTitleCase){
titleCase = aTitleCase;
lowerCase = aLowerCase;
}
public String getValue(boolean isLowerCase){
if(isLowerCase){
return lowerCase;
} else {
return titleCase;
}
}
}
--------------------------------------------------------------
// XML document builder
// propertyFlag selects TitleCase, so lower case is simply its negation
isLowerCase = !propertyFlag;
....
....
createRootElement(ROOTELEMENT.getValue(isLowerCase));
createAttribute(ATTRIBUTE.getValue(isLowerCase));
Please help me choose the right option keeping in mind the performance aspect of the entire solution. If you have any other suggestions, please let me know.
// set before generating the XML
boolean isUpperCase;
// call this function for each tag/attribute name instead of using a string constant,
// e.g. getInCase("rootElement")
String getInCase(String initialName) {
String initialFirstCharacter = initialName.substring(0, 1);
String actualFirstCharacter;
if (isUpperCase) {
actualFirstCharacter = initialFirstCharacter.toUpperCase();
} else {
actualFirstCharacter = initialFirstCharacter.toLowerCase();
}
return actualFirstCharacter + initialName.substring(1);
}
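To show how such a helper behaves, here is a small self-contained sketch of the same idea. The wrapper class TagNameCase and its constructor flag are made up for illustration; only the first character of each name is changed, so a single set of camelCase names can serve both output modes without a transformer pass or a second map.

```java
// Hypothetical wrapper around the first-character conversion described above.
final class TagNameCase {
    private final boolean titleCase;

    TagNameCase(boolean titleCase) {
        this.titleCase = titleCase;
    }

    /** Returns the name with its first character upper-cased or lower-cased. */
    String getInCase(String initialName) {
        if (initialName.isEmpty()) {
            return initialName; // nothing to convert
        }
        char first = initialName.charAt(0);
        char converted = titleCase
                ? Character.toUpperCase(first)
                : Character.toLowerCase(first);
        return converted + initialName.substring(1);
    }

    public static void main(String[] args) {
        TagNameCase title = new TagNameCase(true);
        TagNameCase lower = new TagNameCase(false);
        System.out.println(title.getInCase("rootElement")); // RootElement
        System.out.println(title.getInCase("childAttr"));   // ChildAttr
        System.out.println(lower.getInCase("rootElement")); // rootElement
    }
}
```

The conversion is a constant amount of work per name, so its cost should be negligible next to building a 10,000-line document, though that is worth measuring rather than assuming.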
I did some research and it seems that Jsoup makes this change (converting tag names to lower case) by default. Is there a way to configure this, is there some other parser whose output I can convert to a Jsoup Document, or is there some other way to fix this?
Unfortunately not, the constructor of Tag class changes the name to lower case:
private Tag(String tagName) {
this.tagName = tagName.toLowerCase();
}
But there are two ways to change this behaviour:
If you want a clean solution, you can clone / download the JSoup source from Git and change this line.
If you want a dirty solution, you can use reflection.
Example for #2:
Field tagName = Tag.class.getDeclaredField("tagName"); // Get the field which contains the tagname
tagName.setAccessible(true); // Set accessible to allow changes
for( Element element : doc.select("*") ) // Iterate over all tags
{
Tag tag = element.tag(); // Get the tag of the element
String value = tagName.get(tag).toString(); // Get the value (= name) of the tag
if( !value.startsWith("#") ) // You can ignore all tags starting with a '#'
{
tagName.set(tag, value.toUpperCase()); // Set the tagname to the uppercase
}
}
tagName.setAccessible(false); // Revert to false
Here is a code sample (version >= 1.11.x):
Parser parser = Parser.htmlParser();
parser.settings(new ParseSettings(true, true));
Document doc = parser.parseInput(html, baseUrl);
There is a ParseSettings class, introduced in version 1.9.3.
It comes with options to preserve case for tags and attributes.
You must use xmlParser instead of htmlParser and the tags will remain unchanged. One line does the trick:
String html = "<camelCaseTag>some text</camelCaseTag>";
Document doc = Jsoup.parse(html, "", Parser.xmlParser());
I am using 1.11.1-SNAPSHOT version which does not have this piece of code.
private Tag(String tagName) {
this.tagName = tagName.toLowerCase();
}
So I checked ParseSettings as suggested above and changed this piece of code from:
static {
htmlDefault = new ParseSettings(false, false);
preserveCase = new ParseSettings(true, true);
}
to:
static {
htmlDefault = new ParseSettings(true, true);
preserveCase = new ParseSettings(true, true);
}
and skipped test cases while building JAR.
I need some quick help. I am new to QuickFixJ. I have a FIX message in a txt file that I need to convert into FIX50SP2 format. The code snippet is below.
String fixMsg = "1128=99=25535=X49=CME34=47134052=20100318-03:21:11.36475=20120904268=2279=122=848=336683=607400107=ESU2269=1270=140575271=152273=121014000336=2346=521023=1279=122=848=336683=607401107=ESU2269=1270=140600271=206273=121014000336=2346=681023=210=159";
System.out.println("FixMsg String:"+fixMsg);
Message FIXMessage = new Message();
DataDictionary dd = new DataDictionary("FIX50SP2.xml");
FIXMessage.fromString(fixMsg, dd, false);
System.out.println("FIXMessage Output:" + FIXMessage.toString()); // Print message after parsing
MsgType msgType = new MsgType();
System.out.println(FIXMessage.getField(msgType));
Here is the output:
FixMsg String:1128=99=15835=X49=CME34=47164052=2012090312102051175=20120904268=1279=122=848=336683=607745107=ESU2269=1270=140575271=123273=121020000336=2346=501023=110=205
FIXMessage Output:9=6135=X34=47164049=CME52=2012090312102051175=20120904268=110=117
quickfix.FieldNotFound: Field [35] was not found in message.
at quickfix.FieldMap.getField(FieldMap.java:216)
at quickfix.FieldMap.getFieldInternal(FieldMap.java:353)
at quickfix.FieldMap.getField(FieldMap.java:349)
at MainApp.main(MainApp.java:52)
I want to extract the MsgType field (field 35). Could you please tell me where I am going wrong? What I have observed is that after parsing to FIX50SP2 format, the converted FIX message is missing many data elements (for details see the output).
Thanks
As others have mentioned, MsgType is a header field, and you can get it with the following:
String msgType = null;
if(FIXMessage.getHeader().isSetField(MsgType.FIELD)) {
msgType = FIXMessage.getHeader().getString(MsgType.FIELD);
}
System.out.println("MsgType is " + msgType);
The reason you are missing many data elements after parsing is probably that your message has some custom tags (like tag 2346) which are not defined in your data dictionary (FIX50SP2.xml). Parsing of those tags therefore fails, and they are missing from the output.
To fix this, get the data dictionary from the party that is sending you the message and use it to parse the message.
I'm not familiar with FIX messages and QuickFixJ, but glancing at the Javadoc, it seems like you should use the identifyType method :
String fixMsg = "1128=99=25535=X49=CME34=47134052=20100318-03:21:11.36475=20120904268=2279=122=848=336683=607400107=ESU2269=1270=140575271=152273=121014000336=2346=521023=1279=122=848=336683=607401107=ESU2269=1270=140600271=206273=121014000336=2346=681023=210=159";
MsgType msgType = Message.identifyType(fixMsg);
You may find FixB framework useful as it deals well with non-standard use cases of FIX.
As in your case, to extract only data you are interested in, you need to define a class that will represent this data and to bind it to FIX using annotations. E.g.:
@FixBlock
public class MDEntry {
@FixField(tag=269) public int entryType; // you could define an enum type for it as well
@FixField(tag=278) public String entryId;
@FixField(tag=55) public String symbol;
}
...
FixFieldExtractor fixExtractor = new NativeFixFieldExtractor();
List<MDEntry> mdEntries = fixExtractor.getGroups(fixMsg, List.class, 268, FixMetaScanner.scanClass(MDEntry.class));
In more common cases, the FixSerializer interface should be used, but it requires a message with the MsgType(35) tag and a class annotated with @FixMessage(type="...") accordingly. E.g.:
@FixMessage(type="X")
public class MarketData {
@FixGroup(tag=268) public List<MDEntry> entries;
}
...
FixMetaDictionary fixMetaDictionary = FixMetaScanner.scanClassesIn("my.fix.classes.package");
FixSerializer fixSerializer = new NativeFixSerializer("FIX.5.0.SP2", fixMetaDictionary);
MarketData marketData = fixSerializer.deserialize(fixMsg);
I hope you will find it useful.
If you need just the MsgType, you are sure the message is correct, and you do not need any other field from the message, then I would recommend extracting MsgType from the string using a regular expression,
e.g.: \u000135=(\w+)\u0001
It is MUCH FASTER than parsing (and validating) a string via QuickFix.
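A minimal sketch of that regular-expression approach in plain Java follows. The class and method names (MsgTypeExtractor, extractMsgType) and the sample message are made up for illustration. It assumes the raw string still contains the SOH (0x01) field delimiters; the message as pasted in the question appears to have lost them, in which case the pattern cannot match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls tag 35 out of a raw FIX string without QuickFIX/J.
final class MsgTypeExtractor {
    // SOH, then "35=", then the value, then SOH. \x01 is the regex escape for SOH.
    private static final Pattern MSG_TYPE = Pattern.compile("\\x0135=(\\w+)\\x01");

    /** Returns the MsgType value, or an empty Optional if tag 35 is not found. */
    static Optional<String> extractMsgType(String rawFix) {
        Matcher matcher = MSG_TYPE.matcher(rawFix);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String soh = String.valueOf((char) 1);
        // Illustrative header only, not a complete or checksum-valid message
        String rawFix = "8=FIXT.1.1" + soh + "9=25" + soh + "35=X" + soh
                + "49=CME" + soh + "34=471340" + soh;
        System.out.println(extractMsgType(rawFix).orElse("not found"));
    }
}
```

Because the pattern requires a delimiter before 35=, it does not match tags that merely end in 35, such as 135=. It performs no validation, so it only suits messages already known to be well-formed.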