How to test for CDATA value of an element using Dom4j?

Does anybody have an idea how to find out whether an element contains <![CDATA[ text ]]> or not? I searched through the dom4j API and Jaxen and I can't find how to do that... If I retrieve the text, the CDATA wrapper is stripped.

The method:
Node.asXML()
returns the entire element with its value unmodified by anything.
So if you have:
<nodes>
<node><![CDATA[value]]></node>
</nodes>
Calling the text methods will return "value", but calling "asXML()" will return:
<node><![CDATA[value]]></node>
From there, I guess you can do a String search for the CDATA tag.

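Technically you can still do this, by checking the node types of the element's children.
public boolean isCDATA(org.dom4j.Element element) {
    // content() is defined on Branch/Element, not on Node; it holds the child nodes
    for (Object child : element.content()) {
        org.dom4j.Node n = (org.dom4j.Node) child;
        // dom4j node type codes match the W3C DOM constants
        if (n.getNodeType() == org.w3c.dom.Node.CDATA_SECTION_NODE) {
            return true;
        }
    }
    return false;
}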

http://dom4j.sourceforge.net/dom4j-1.6.1/apidocs/org/dom4j/Node.html#getNodeType%28%29
Will this work?
public short getNodeType()
Returns the code according to the type of node. This makes processing nodes polymorphically much easier, as a switch statement can be used instead of multiple if (instanceof) statements.
Returns: a W3C DOM compliant code for the node type, such as ELEMENT_NODE or ATTRIBUTE_NODE
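Since getNodeType() returns W3C DOM codes, the switch pattern the documentation describes can be sketched against the JDK's built-in org.w3c.dom API. This is only an illustration under that assumption: it deliberately uses the JDK parser instead of dom4j so it runs without extra dependencies, and the class and method names (NodeTypeSwitchDemo, childKinds) are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeTypeSwitchDemo {

    /** Returns a label for each direct child of the root element, chosen by node type code. */
    static List<String> childKinds(String xml) {
        try {
            // The default factory is non-coalescing, so CDATA sections stay separate nodes.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> kinds = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                switch (children.item(i).getNodeType()) {
                    case Node.CDATA_SECTION_NODE: kinds.add("cdata"); break;
                    case Node.TEXT_NODE: kinds.add("text"); break;
                    case Node.ELEMENT_NODE: kinds.add("element"); break;
                    default: kinds.add("other");
                }
            }
            return kinds;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childKinds("<node><![CDATA[value]]></node>"));
        System.out.println(childKinds("<node>value</node>"));
    }
}
```

With dom4j the loop would run over the element's content() list instead of a NodeList, but the type codes compared in the switch are the same.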

Related

How to know the element contain namespace(xmlns) or not?

<xxx1 xmlns="hello">
<xxx2>
<xxx3>
<name>rule_1</name>
</xxx3>
</xxx2>
</xxx1>
I select the node with "//*[namespace-uri()='hello']/*[local-name()='name']".
It should match //hello:xxx1/xxx2/xxx3/name, and it does.
Now I am trying to get the <xxx1> element. In practice, I don't know how many parents up from <name> the <xxx1> element is.
I tried this check
node.getParent().getNamespaceURI().equals("hello")
and kept adding getParent() calls to reach <xxx1>.
But the check is already true the first time, when I call it on <xxx3>.
Is the namespace inherited?
How can I tell whether an element itself has an xmlns declaration or not?
Sorry that my question was not clear.
I'm trying to get the element that first declares the namespace "hello".
<xxx1 xmlns="hello">
<xxx2>
<xxx3>
Of these three nodes, which one actually contains xmlns="hello"? <xxx2> and <xxx3> do not declare xmlns in their tags.

Hello and Welcome to Stack Overflow!
Yes, namespaces are sort of inherited. The terminology normally used is that, in your example, the <name> element is in the scope of the namespace declaration xmlns="hello", so the <name> element is in the hello namespace.
With DOM4J, you can test whether an element is in a namespace or not like this:
boolean hasNamespace(Element e) {
return e.getNamespaceURI().length() > 0;
}
If the element is not in any namespace, getNamespaceURI() returns an empty string.
I guess that you want to select the <name> element, but you don't know at which level it will be, i.e. how many parents it will have. You can always use this XPath expression:
Node node = doc.selectSingleNode("//*[namespace-uri() = 'hello' and local-name() = 'name']");
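For the follow-up part of the question (which element actually carries the xmlns="hello" declaration, as opposed to merely being in its scope), one possible approach is to walk up the parents and look for the xmlns attribute itself. The sketch below is an assumption-laden illustration: it uses the JDK's built-in namespace-aware DOM parser rather than dom4j, and NamespaceDeclarationDemo and findDeclaringElement are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceDeclarationDemo {

    /** Parses XML with namespace awareness switched on. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /**
     * Walks from start towards the root and returns the first element that
     * itself carries xmlns="uri", or null if no element on the path declares it.
     */
    static Element findDeclaringElement(Element start, String uri) {
        for (Node n = start; n instanceof Element; n = n.getParentNode()) {
            Element e = (Element) n;
            if (e.hasAttribute("xmlns") && uri.equals(e.getAttribute("xmlns"))) {
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<xxx1 xmlns=\"hello\"><xxx2><xxx3><name>rule_1</name></xxx3></xxx2></xxx1>");
        Element name = (Element) doc.getElementsByTagNameNS("hello", "name").item(0);
        // Every element in scope reports the namespace URI...
        System.out.println(name.getNamespaceURI());
        // ...but only one element holds the declaration.
        System.out.println(findDeclaringElement(name, "hello").getLocalName());
    }
}
```

The point of the design is the distinction between getNamespaceURI(), which answers "which namespace is this element in", and the presence of the xmlns attribute, which answers "where was it declared".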

Getting the Xpath of a span web element

I have the following HTML code:
I need to refer to the span element (last element in the tree) in order to check if it exists.
The problem is, I can't find the right XPath to it and was not able to find any question already concerning this specific issue.
I tried:
"//span[@data-highlighted='true']"
and also longer XPaths that go through its ancestor nodes, but I was not able to get a working XPath. The difficulty for me is that it has no id or title, so I tried to locate it through its "data-highlighted" attribute, but that does not seem to work.
Just for the sake of completeness:
I have written the following Java method which is meant to get an Xpath as its input:
public Boolean webelementIsPresent (String inputXpath) throws InterruptedException {
return driver.findElements(By.xpath(inputXpath)).size()>0;
}
Then in a test class I call assertTrue to check whether the web element exists (the method returns true) or not.
I'm open for any help, thanks in advance! :)

To identify the element "//span[@data-highlighted='true']" you can use the following XPath:
"//table[@class='GJBYOXIDAQ']/tbody//tr/td/div[@class='GJBYOXIDPL' and @id='descriptionZoom']/table/tbody/tr/td/div[@class='GJBYOXIDIN zoomable highlight' and @id='description']/div[@class='gwt-HTML' and @id='description']//span[@data-highlighted='true']"

You can get element by text
driver.findElement(By.xpath("//span[contains(text(), 'Willkommen')]"));
Or find the div by its id and, based on that, find the span element. There are a few options to do that:
driver.findElement(By.xpath("//div[@id='description']//span"));
OR
WebElement descriptionDiv = driver.findElement(By.id("description"));
descriptionDiv.findElement(By.tagName("span"));
OR
driver.findElement(By.cssSelector("#description span"));

Your XPath looks fine, my guess is that it's a timing issue and you need a brief wait. It could also be that the page was in a certain state when you captured the HTML and it's not always in that state when you reach the page.
There are other locators that should work here.
XPath
//span[contains(., 'Willkommen')]
CSS selectors (These may or may not work based on your current XPath results)
span[data-highlighted='true']
#description span[data-highlighted='true']
For your function, I would suggest a change. Replace the String parameter with a By for more flexibility. You can then locate an element using any method and not be restricted to just XPath.
public Boolean webElementIsPresent(By locator)
{
return driver.findElements(locator).size() > 0;
}
or if you want to add a wait,
public Boolean webElementIsPresent(By locator)
{
try
{
new WebDriverWait(driver, 5).until(ExpectedConditions.presenceOfElementLocated(locator));
return true;
}
catch (TimeoutException e)
{
return false;
}
}

How to parse element with same name at different level in SAX parser java

I have an xml structure like below
<cr:TRFCoraxData instrumentId="8590925624" organizationId="4296241518">
<cr:Dividends>
<cr:ExDate>2017-02-27T00:00:00+00:00</cr:ExDate>
<cr:PeriodEndDate>2017-03-31T00:00:00+00:00</cr:PeriodEndDate>
<cr:PeriodDuration>P3M</cr:PeriodDuration>
</cr:Dividends>
<cr:AdjustmentFactors>
<cr:ExDate>2222-05-21T00:00:00+00:00</cr:ExDate>
<cr:AdjustmentFactor>0.50000</cr:AdjustmentFactor>
</cr:AdjustmentFactors>
</cr:TRFCoraxData>
So I have two cr:ExDate elements with the same name, one in the cr:Dividends tag and one in the cr:AdjustmentFactors tag.
I have POJO classes for both, and I handle the start and end element events.
In my end element handler I have the conditions below:
if (element.equals("cr:ExDate")) {
dividend.setExDate(tmpValue);
}else if (element.equals("cr:DividendEventId")) {
dividend.setDividendEventId(tmpValue);
}else if (element.equals("cr:AnnouncementDate")) {
dividend.setAnnouncementDate(tmpValue);
}
else if (element.equals("cr:ExDate")) {
adjustmentFactorObj.setExDate(tmpValue);
}else if (element.equals("cr:AdjustmentFactor")) {
adjustmentFactorObj.setAdjustmentFactor(tmpValue);
}
Clearly, for "cr:ExDate" the first if condition is always the one that matches, so I am never able to set the "cr:ExDate" value on adjustmentFactorObj.
Please suggest how I can solve this problem.

You can try something like this:
boolean isInDivident; // it is a field
...
// in startElement: remember which parent block we are inside
if (element.equals("cr:Dividends")) {
isInDivident = true;
} else if (element.equals("cr:AdjustmentFactors")) {
isInDivident = false;
}
...
// in endElement: use the flag to pick the right object
if (element.equals("cr:ExDate") && isInDivident) {
...

Writing SAX applications is not easy. Many people are lured into it by the promise of better performance, but you will only get better performance if you are highly skilled in the art. Most people who end up asking questions about SAX on SO probably should be using a different tool for the job. (But you haven't described the task, so we can't judge that.)
In any case, when you use SAX, it's your responsibility to keep track of context. For most situations, a stack holding element names is sufficient: push an element name to the stack in the startElement event, pop it in the endElement event. That's sufficient in this case to determine, when an ExDate arrives, what its parent element is.
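The stack approach described above could look roughly like the sketch below. It is not the asker's handler: the class names (ExDateContextDemo, ContextHandler) and the "parent=value" output format are invented for illustration, and it only records which parent each cr:ExDate was found under instead of filling the POJOs.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ExDateContextDemo {

    /** Records "parentQName=value" for every cr:ExDate, using a stack of open elements. */
    static class ContextHandler extends DefaultHandler {
        private final Deque<String> openElements = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();
        final List<String> found = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            openElements.push(qName);   // entering an element
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            openElements.pop();                    // leave the element that just closed
            String parent = openElements.peek();   // top of stack is now its parent
            if ("cr:ExDate".equals(qName)) {
                // A real handler would branch on parent here and call
                // dividend.setExDate(...) or adjustmentFactorObj.setExDate(...).
                found.add(parent + "=" + text.toString().trim());
            }
            text.setLength(0);
        }
    }

    static List<String> parse(String xml) {
        try {
            ContextHandler handler = new ContextHandler();
            // The default factory is not namespace-aware, so qName keeps the "cr:" prefix.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.found;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<cr:TRFCoraxData xmlns:cr=\"urn:example\">"
                + "<cr:Dividends><cr:ExDate>2017-02-27T00:00:00+00:00</cr:ExDate></cr:Dividends>"
                + "<cr:AdjustmentFactors><cr:ExDate>2222-05-21T00:00:00+00:00</cr:ExDate>"
                + "</cr:AdjustmentFactors></cr:TRFCoraxData>";
        System.out.println(parse(xml));
    }
}
```

The xmlns:cr value in the sample is a placeholder, since the original snippet does not show the real namespace declaration.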

Get next element of specific tag type in jsoup

I'm iterating through a list of elements using jsoup, but periodically need to find one element that doesn't occur directly after the current element.
For example, if I'm iterating through and come to an img tag, I want to find the very next a tag occurring after that img tag. But, there could be a few tags in between the two.
Here's some example code:
for (Element e : elements) {
if (e.tagName().equalsIgnoreCase("img")) {
// Do some stuff with "img" tag
// Now, find the next tag in the list with tag <a>
}
// Do some other things before the next loop iteration
}
I thought something like e.select("img ~ a") should work, but it returns no results.
What's a good way of doing this in jsoup?

This appears to be a way of accomplishing the stated goal. Not sure it's the most efficient, but it is the most straightforward.
Element node = e.nextElementSibling();
while (node != null && !node.tagName().equalsIgnoreCase("a")) {
node = node.nextElementSibling();
}
I was hoping for a way to run the equivalent of e.nextElementSibling("a"). Perhaps something for me to contribute back to jsoup ;-)

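Use the nextElementSibling() method.
Inside your if statement add the following code:
Element imgNext = e.nextElementSibling();
Element a = null;
// walk the following siblings until one is, or contains, a link
while (imgNext != null && a == null) {
    a = imgNext.select("a[href]").first();
    imgNext = imgNext.nextElementSibling();
}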

How to check if an element exists in the XML using XPath?

Below is my element hierarchy. How can I check (using XPath) that the AttachedXml element is present under the CreditReport of the Primary Consumer?
<Consumers xmlns="http://xml.mycompany.com/XMLSchema">
<Consumer subjectIdentifier="Primary">
<DataSources>
<Credit>
<CreditReport>
<AttachedXml><![CDATA[ blah blah]]>

Use the boolean() XPath function. From the XPath specification:
The boolean function converts its argument to a boolean as follows:
a number is true if and only if it is neither positive or negative zero nor NaN
a node-set is true if and only if it is non-empty
a string is true if and only if its length is non-zero
an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type
If there is an AttachedXml in the CreditReport of the primary Consumer, then the expression below returns true(). The mc prefix must be bound to the http://xml.mycompany.com/XMLSchema namespace.
boolean(/mc:Consumers
/mc:Consumer[@subjectIdentifier='Primary']
//mc:CreditReport/mc:AttachedXml)
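To run a boolean() expression like this from Java, the mc prefix has to be bound to the document's default namespace first. The following is a minimal sketch using only the JDK's JAXP classes; the class and method names (AttachedXmlCheck, hasAttachedXml) are assumptions made for the example, and the prefix mc is arbitrary as long as the XPath and the NamespaceContext agree on it.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class AttachedXmlCheck {
    static final String NS = "http://xml.mycompany.com/XMLSchema";

    /** Maps the prefix "mc" to the document's namespace and nothing else. */
    static final NamespaceContext MC_CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "mc".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            return NS.equals(uri) ? "mc" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singleton(prefix).iterator();
        }
    };

    static boolean hasAttachedXml(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-qualified XPath steps
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(MC_CONTEXT);
            // Asking for BOOLEAN makes the engine return a java.lang.Boolean.
            return (Boolean) xpath.evaluate(
                    "boolean(/mc:Consumers/mc:Consumer[@subjectIdentifier='Primary']"
                            + "//mc:CreditReport/mc:AttachedXml)",
                    doc, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("XPath check failed", e);
        }
    }
}
```

Requesting XPathConstants.BOOLEAN avoids having to interpret a null or empty node result by hand.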

The Saxon documentation, though a little unclear, seems to suggest that the JAXP XPath API will return false when evaluating an XPath expression if no matching nodes are found.
This IBM article mentions a return value of null when no nodes are matched.
You might need to play around with the return types a bit based on this API, but the basic idea is that you just run a normal XPath and check whether the result is a node / false / null / etc.
XPathFactory xpathFactory = XPathFactory.newInstance(NamespaceConstant.OBJECT_MODEL_SAXON);
XPath xpath = xpathFactory.newXPath();
XPathExpression expr = xpath.compile("/Consumers/Consumer/DataSources/Credit/CreditReport/AttachedXml");
Object result = expr.evaluate(doc, XPathConstants.NODE);
if ( result == null ) {
// do something
}

Use:
boolean(/*/*[@subjectIdentifier="Primary"]/*/*/*/*
[name()='AttachedXml'
and
namespace-uri()='http://xml.mycompany.com/XMLSchema'
]
)

Normally, when you try to select a node using XPath, your XPath engine will return null or an equivalent if the node doesn't exist.
xpath: "/Consumers/Consumer/DataSources/Credit/CreditReport/AttachedXml"
If you're using XSL, check out this question for an answer:
xpath find if node exists

Take a look at my example:
<tocheading language="EN">
<subj-group>
<subject>Editors Choice</subject>
<subject>creative common</subject>
</subj-group>
</tocheading>
Now, to check whether "creative common" exists:
tocheading/subj-group/subject/text() = 'creative common'
Hope this helps.

If boolean() is not available (the tool I'm using does not support it), one way to achieve the same thing is:
//SELECT[@id='xpto']/OPTION[not(not(@selected))]
In this case, one of the OPTION elements under the SELECT is the selected one. The "selected" attribute does not have a value; it just exists, while the other OPTION elements do not have it. This achieves the objective.
