How to remove special characters from an XPath result using Selenium? - java

As you can see, I use one dynamic XPath, //td[text()='Discharge Air']/following-sibling::td/span, to cover zone1 through zone3. When I call getText() I want only the value 100, but the special characters °F come back as well. How can I remove °F so that I get only 100 from this XPath? As the image shows, there is only one span, so I can't separate the value from the unit by selecting a different element.
String s = driver.findElement(By.xpath("//td[text()='Discharge Air']/following-sibling::td/span")).getText();
s = s.replace("°F", ""); // replace() returns a new String, so the result must be assigned
Instead of a String, can I use a List? All these XPaths are of the same type, so I could collect the elements once and then call getText() on each one in a for loop.
List s=driver.findElements(By.xpath("//td[text()='Discharge Air']/following-sibling::td/span"));
s.replace("°F","");
Thanks in advance,

List<WebElement> disch_Air = driver.findElements(By.xpath("//td[text()='Discharge Air']/following-sibling::td/span"));
for (int i = 0; i < disch_Air.size(); i++) {
    // strip the unit from each element's text before printing
    System.out.println(disch_Air.get(i).getText().replace("°F", ""));
}
This is what I wanted and it works fine. Thank you all so much for your help.

Use this:
// first find the element (with the XPath you posted) and read its text
String s = driver.findElement(By.xpath("//td[text()='Discharge Air']/following-sibling::td/span")).getText();
s = s.replace("°F", ""); // Strings are immutable: replace() returns a new String, so assign the result
If there are still spaces left in the string, you can remove them like this:
s = s.trim();
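To illustrate the answer above without a browser, here is a minimal sketch that works on plain strings standing in for getText() results. The class and method names (UnitStripper, stripUnit, stripUnits) and the sample values are made up for this example and are not part of Selenium. It shows that replace() and trim() return new strings and leave the original untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Selenium.
class UnitStripper {
    // Degree sign followed by F, written as an escape so the
    // source file encoding does not matter.
    private static final String UNIT = "\u00B0F";

    // Returns the text without the unit and without surrounding spaces.
    static String stripUnit(String text) {
        return text.replace(UNIT, "").trim();
    }

    // Applies stripUnit to every entry, e.g. to the getText() of each span.
    static List<String> stripUnits(List<String> texts) {
        List<String> values = new ArrayList<>();
        for (String text : texts) {
            values.add(stripUnit(text));
        }
        return values;
    }

    public static void main(String[] args) {
        String s = "100 \u00B0F";
        s.replace(UNIT, "");              // result discarded: s is unchanged
        System.out.println(s);            // still contains the unit
        System.out.println(stripUnit(s)); // unit and space removed

        // Assumed sample readings for three zones.
        List<String> zones = Arrays.asList("100 \u00B0F", "72\u00B0F", " 68 \u00B0F ");
        System.out.println(stripUnits(zones));
    }
}
```

With Selenium, the same helper could be fed element.getText() for each WebElement returned by findElements().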

Related

Appium NOT locating element when java variable is used in xpath

I'm trying to locate elements dynamically using XPath. When I use a variable in the XPath, the elements are NOT located, but with a hardcoded value they are located properly.
What am I missing here?
The XPath below locates the elements perfectly:
driver.findElements(By.xpath("//XCUIElementTypeStaticText[contains(@value, 'hp')]"));
whereas the XPath below doesn't locate the elements:
driver.findElements(By.xpath("//XCUIElementTypeStaticText[contains(@value, '" + device + "')]"));
Please note that there are multiple elements matching the above XPath.
I even tried the code below, but to no avail:
driver.findElements(By.XPath(String.Format("//XCUIElementTypeStaticText[contains(@value, '{0}')]", device)));
Any help would be appreciated.
Try to debug this issue as follows:
Define the XPath string before calling the driver.findElements method, format the string so that it contains the proper value, and then pass it to the Selenium method:
String xpathLocator = "//XCUIElementTypeStaticText[contains(@value, '%s')]";
xpathLocator = String.format(xpathLocator, device);
driver.findElements(By.xpath(xpathLocator));
As for your existing code:
Here driver.findElements(By.xpath("//XCUIElementTypeStaticText[contains(@value, '" + device + "')]"));
I can't see the formatting action.
And here driver.FindElements(By.XPath(string.Format("//XCUIElementTypeStaticText[contains(@value, '{0}')]", device)));
the syntax seems to be wrong.
It should be String.format, while you wrote string.Format.
Try trimming the spaces as:
driver.findElements(By.xpath("//XCUIElementTypeStaticText[contains(@value, '"+device+"')]"));
Or using String.format() as:
String device = "hp";
driver.findElements(By.xpath(String.format("//XCUIElementTypeStaticText[contains(@value, '%s')]", device)));
Note:
Instead of FindElements() it should be findElements()
Instead of String.Format() it should be String.format()
The issue was a case mismatch in the value held by the variable: device contained 'hP' instead of 'hp'.
I corrected the code and it works fine now.
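Since the root cause was a case mismatch, one way to make such a locator less fragile is to lowercase both sides inside the XPath with the XPath 1.0 translate() function. The sketch below is only an illustration: the class and method names (CaseInsensitiveXPath, containsIgnoreCase, countMatches) and the sample XML are made up, and it assumes the driver's XPath engine supports translate(). It is checked here only against the JDK's own javax.xml.xpath, not against Appium.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class CaseInsensitiveXPath {
    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    // Builds a locator that compares the lowercased @value with the
    // lowercased device name. Assumes device contains no apostrophe.
    static String containsIgnoreCase(String device) {
        return String.format(
            "//XCUIElementTypeStaticText[contains(translate(@value, '%s', '%s'), '%s')]",
            UPPER, LOWER, device.toLowerCase(Locale.ROOT));
    }

    // Counts the nodes an expression selects in an XML string (JDK XPath 1.0 engine).
    static int countMatches(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up stand-in for a page source.
        String xml = "<App>"
            + "<XCUIElementTypeStaticText value=\"hp LaserJet\"/>"
            + "<XCUIElementTypeStaticText value=\"Canon\"/>"
            + "</App>";
        String locator = containsIgnoreCase("hP");
        System.out.println(locator);
        System.out.println(countMatches(xml, locator));
    }
}
```

Printing the generated locator before passing it to By.xpath() is also a quick way to spot this kind of mismatch.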

Parsing xml with multi childs using jsoup

I have an xml file that looks as follows - link.
I would like to get the title from it.
In order to do so, I did the following:
Document bookDoc = Jsoup.connect( url ).parser( Parser.xmlParser() ).get();
Node node = bookDoc.childNode( 2 ).childNode( 3 ).childNode( 3 );
This returns me this:
Now I have 2 questions:
Isn't there a simpler way to get this title than chaining all of these childNode calls? My worry is that in some results the title won't be exactly at childNode(3) and my code will break.
How do I eventually get this title? I'm stuck at this point and can't get the title as a string.
Thank you
You can use selectors to access elements. Here you want to select by tag name. Two ways to get the element you want:
String title1 = bookDoc.select("record>display>title").text();
String title2 = bookDoc.selectFirst("record").selectFirst("display").selectFirst("title").text();
If you want to select more complicated things read:
https://jsoup.org/cookbook/extracting-data/dom-navigation
https://jsoup.org/cookbook/extracting-data/selector-syntax
But you probably won't need them for parsing this XML.

How to replace xml empty tags using regex

I have a lot of empty XML tags that need to be removed from a string.
String someData = dealDataWriter.toString();
someData = someData.replaceAll("<somerandomField1/>", "");
someData = someData.replaceAll("<somerandomField2/>", "");
someData = someData.replaceAll("<somerandomField3/>", "");
someData = someData.replaceAll("<somerandomField4/>", "");
This performs a lot of string operations, which is not efficient. What would be a better way to avoid them?
I would not recommend using regex on HTML/XML in general, but for a simple case like yours a rule like this one may be fine:
someData = someData.replaceAll("<\\w+?\\/>", "");
Test: link
If you also want to allow optional spaces before and after the tag name:
someData = someData.replaceAll("<\\s*\\w+?\\s*\\/>", "");
Test: link
Try the following code. It removes every self-closing tag that does not contain any spaces.
someData = someData.replaceAll("<\\w+/>", "");
Alternatively to using regex or string matching, you can use an xml parser to find empty tags and remove them.
See the answers given over here: Java Remove empty XML tags
If you would like to remove <tagA></tagA> as well as <tagB/>, you can use the following regex. Note that \1 is a back-reference to the first capturing group, so the closing tag name must equal the opening one.
// identifies an empty tag, i.e. <tag1></tag1> or <tag/>
// it also tolerates whitespace around or within the tag; however, tags whose value is whitespace will not match.
private static final String EMPTY_VALUED_TAG_REGEX = "\\s*<\\s*(\\w+)\\s*></\\s*\\1\\s*>|\\s*<\\s*\\w+\\s*/\\s*>";
Run the code on ideone
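As a rough sketch of how the regex above could be applied, the example below compiles it once and reuses the Pattern, which avoids one replaceAll call per tag name. The class name EmptyTagRemover, the method name and the sample XML are made up for illustration. It performs a single pass, so a parent that only becomes empty after its children are removed is left in place.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class EmptyTagRemover {
    // Regex taken from the answer above: <tag></tag> (same name via \1) or <tag/>.
    private static final Pattern EMPTY_TAG = Pattern.compile(
        "\\s*<\\s*(\\w+)\\s*></\\s*\\1\\s*>|\\s*<\\s*\\w+\\s*/\\s*>");

    // Removes every empty element in one pass over the input.
    static String removeEmptyTags(String xml) {
        return EMPTY_TAG.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        // Assumed sample data.
        String xml = "<deal><id>42</id><note></note><flag/><name>x</name></deal>";
        System.out.println(removeEmptyTags(xml));
    }
}
```

For anything beyond flat, attribute-free tags like these, an XML parser remains the safer choice, as noted above.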

Using multiple criteria to find a WebElement in Selenium

I am using Selenium to test a website. Does it work if I find an element by more than one criterion? For example:
driverChrome.findElements(By.tagName("input").id("id_Start"));
or
driverChrome.findElements(By.tagName("input").id("id_Start").className("blabla"));
No, it does not. You cannot chain selectors like that; it is not valid Selenium API. However, you can write a single selector that covers all the criteria and pass it to findElements():
By byXpath = By.xpath("//input[(@id='id_Start') and (@class = 'blabla')]");
List<WebElement> elements = driver.findElements(byXpath);
This should return a list of input elements that have the id id_Start and the class attribute blabla.
To combine By statements, use ByChained:
driverChrome.findElements(
new ByChained(
By.tagName("input"),
By.id("id_Start"),
By.className("blabla")
)
);
However, if the criteria refer to the same element, see @Saifur's answer.
CSS selectors would be perfect in this scenario.
Your example would be:
By.cssSelector("input#id_Start.blabla")
There is plenty of information available if you search for CSS selectors. Also, when dealing with classes, CSS is easier than XPath, because XPath treats class as a literal string whereas CSS treats it as a space-delimited collection.
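The difference between the literal-string and token views of class can be seen with a small sketch. It uses the JDK's javax.xml.xpath on a made-up XML snippet instead of a live page, and the class and method names (ClassTokenXPath, inputWithIdAndClass, count) are hypothetical. The exact comparison @class='blabla' misses an element whose class attribute is "wide blabla", while the common concat/normalize-space idiom matches it as a token, similar to what the CSS selector .blabla does.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class ClassTokenXPath {
    // XPath 1.0 idiom that treats @class as a space-separated token list.
    // Assumes id and cssClass contain no apostrophes.
    static String inputWithIdAndClass(String id, String cssClass) {
        return String.format(
            "//input[@id='%s' and contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
            id, cssClass);
    }

    // Counts the nodes an expression selects in an XML string.
    static int count(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up markup: the first input carries two classes.
        String xml = "<form>"
            + "<input id=\"id_Start\" class=\"wide blabla\"/>"
            + "<input id=\"id_End\" class=\"blabla\"/>"
            + "</form>";
        // Exact string comparison on @class.
        System.out.println(count(xml, "//input[@id='id_Start' and @class='blabla']"));
        // Token-based comparison.
        System.out.println(count(xml, inputWithIdAndClass("id_Start", "blabla")));
    }
}
```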
Based on @George's reply, the same code for C#:
//reference
using OpenQA.Selenium.Support.PageObjects;
...
int allElements = _driver.FindElements(new ByChained(
By.CssSelector(".sc-pAyMl.cnszJw"),
By.Id("base-field")
)).Count();

Need java Regex to remove/replace the XML elements from specific string

I am having trouble getting the regular expression right. I have the XML below as a string:
<user_input>
<UserInput Question="test Q?" Answer=<value>0</value><sam@testmail.com>"
</user_input>
Now I need to remove the XML characters from the Answer attribute only.
So I need the following result:
<user_input>
<UserInput Question="test Q?" Answer=value0value sam@testmail.com"
</user_input>
I tried the regex below, but it did not work:
str1.replaceAll("Answer=.*?<([^<]*)>", "$1");
It removes all the text before the match.
Can anyone help, please?
You need to put ? inside the first group to make it non-greedy. You also don't need Answer=.*?:
str1.replaceAll("<([^<]*?)>", "$1")
DEMO
Although variable width look-behinds are not supported in Java, you can work around it with .{0,1000} that should suffice.
Please check out this approach using 2 regexes, or 1 regex and 1 replace. Choose the one that suits best (I removed the \n line break from the first input string to show the flaw with using simple replace):
String input = "<user_input><UserInput Question=\"test Q?\" Answer=<value>0</value><sam@testmail.com>\"\n</user_input>";
String st = input.replace("><", " ").replaceAll("(?<=Answer=.{0,1000})[<>/]+(?=[^\"]*\")", "");
String st1 = input.replaceAll("(?<=Answer=.{0,1000})><(?=[^\"]*\")", " ").replaceAll("(?<=Answer=.{0,1000})[<>/]+(?=[^\"]*\")", "");
System.out.println(st + "\n" + st1);
Output of a sample program:
<user_input UserInput Question="test Q?" Answer=value0value sam@testmail.com"
</user_input>
<user_input><UserInput Question="test Q?" Answer=value0value sam@testmail.com"
</user_input>
First off, in your sample above there is a trailing " after the email and the >, and I do not know whether it was placed there by mistake.
However, I will keep it, because according to your expected result it needs to remain present.
This is my hack. Match
(Answer=)(<)(value)(>)(.+?([^<]*))(</)(value)(><)(.+?([^>]*))(>)
and replace it with
$1$3$5$8 $10
The explanation:
(Answer=)(<)(value)(>) matches from Answer= up to the start of the value 0
(.+?([^<]*)) matches the value itself, up to the < that starts the closing value tag
(</) is matched separately, since the previous expression stops before it
(value) matches the name in the closing tag
(><) will later be replaced with a space
(.+?([^>]*)) matches from the start of the email and excludes the > after the .com
(>) selects the last >, which is dropped in the replacement
The trailing " is not matched, so it is left untouched as requested.
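To see the group-based replacement above in action, here is a minimal sketch that applies it with String.replaceAll. The class name AnswerCleaner and the method name are made up for illustration, and the email address in the input is an assumed sample value. The pattern is tied to exactly this shape of input (one value element followed by one bracketed address), so it should be read as a demonstration of the group numbering, not as a general XML cleaner.

```java
// Hypothetical helper for illustration only.
class AnswerCleaner {
    // Twelve capturing groups; groups 1, 3, 5, 8 and 10 are kept.
    private static final String REGEX =
        "(Answer=)(<)(value)(>)(.+?([^<]*))(</)(value)(><)(.+?([^>]*))(>)";

    // $10 refers to group 10 because the pattern has at least ten groups.
    private static final String REPLACEMENT = "$1$3$5$8 $10";

    static String cleanAnswer(String xml) {
        return xml.replaceAll(REGEX, REPLACEMENT);
    }

    public static void main(String[] args) {
        // Assumed sample input modeled on the question.
        String input = "<user_input>\n"
            + "<UserInput Question=\"test Q?\" Answer=<value>0</value><sam@testmail.com>\"\n"
            + "</user_input>";
        System.out.println(cleanAnswer(input));
    }
}
```

The surrounding user_input tags and the trailing quote are outside the match, so they pass through unchanged.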
