Jsoup url, get url by link name - java

I want to get a URL by its link name.
download
ad
So what I want is the first URL, since its link name is download.
My question is how to get a URL by link name.
I know a complete solution is to get all elements and use if (a.text().contains("download")). But I guess there is a simpler way.
Thanks

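Well, the best way would be to get all the <a> elements that have an href attribute, and read that attribute from each one. Just like this:
Document doc = Jsoup.connect("whatever url").get();
Elements a = doc.select("a[href]");
String href;
for (Element elem : a) {
    // read the attribute from the current element, not from the whole collection
    href = elem.attr("href");
}
Now, which hrefs you want to keep is entirely up to you. I think you'd have to filter them with the String methods
.contains("");
.endsWith("");
.startsWith("");
Oh, and maybe you could try using the getters from the doc variable. Note that the first argument is an attribute name, not a selector, and that this call only matches elements whose href is exactly "download":
.getElementsByAttributeValue("href", "download");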

Use a pseudo-selector. For example,
Document doc = Jsoup.connect(url).get();
Elements a = doc.select("a[href]:contains(download)");
Depending on what exactly you are trying to accomplish, you might want to use containsOwn to avoid searching within child elements, or use matches/matchesOwn if you want to use a regex to get elements that contain ONLY the text "download". That regex would be
^download$
See the Selector documentation.

Related

Parsing xml with multi childs using jsoup

I have an xml file that looks as follows - link.
I would like to get the title from it.
In order to do so, I did the following:
Document bookDoc = Jsoup.connect( url ).parser( Parser.xmlParser() ).get();
Node node = bookDoc.childNode( 2 ).childNode( 3 ).childNode( 3 );
This returns me this:
Now I have 2 questions:
Isn't there a simpler way to get this title instead of using all of these childNode calls? My worry is that in some results the title won't be exactly at childNode(3) and all my code won't work.
How do I eventually get this title? I'm stuck at this point and can't get the title as a string.
Thank you
You can use selectors to access elements. Here you want to select by tag name. Two ways to get the element you want:
String title1 = bookDoc.select("record>display>title").text();
String title2 = bookDoc.selectFirst("record").selectFirst("display").selectFirst("title").text();
If you want to select more complicated things read:
https://jsoup.org/cookbook/extracting-data/dom-navigation
https://jsoup.org/cookbook/extracting-data/selector-syntax
But you probably won't need them for parsing this XML.

Find element by text inside another element using UISelector query

I have the following code snippet and the screenshot attached.
String query = "new UiScrollable(new UiSelector().className(\"androidx.recyclerview.widget.RecyclerView\"))" +
".scrollIntoView(new UiSelector().text(\"Test Group\"))";
driver.findElementByAndroidUIAutomator(query).click();
What I want is to find an element with the text "Test Group" using UiSelector, but inside the RecyclerView only (not searching the whole app source). What I get instead is the element inside the search field (not in the RecyclerView).
Please advise. I know that I can get all matching elements using findElements(By.id("name")), but I want to use UiSelector in this case.
With UiSelector you can use chaining via childSelector:
String query = "new UiScrollable(resourceIdMatches(\".*recycler_view\")).scrollIntoView(resourceIdMatches(\".*recycler_view\").childSelector(text(\"Test Group\")))";
In addition, the new UiSelector() part can be omitted; Appium supports this shorthand syntax.

How do i Jsoup query for the value of a html key/value pair

Sorry if my terms are off, I haven't done this before.
I'm using jsoup to scrape a single value off a web page.
I am trying to find the "serialno", which is stored within this function (JavaScript?):
function set(obj, val)
{
    document.getElementById(obj).innerHTML = val;
}
called by
{set("modelname", "NPort 5650-16");set("mac", "00:90:E8:22:76:F4");set("serialno", "2583");set("ver", "3.3 Build 08042219");setlabel("NPORT");uptime("264 days, 03h:31m:34s");}
I am unsure how I can use jsoup to extract or print the serialno value, which in this case happens to be 2583. I've tried basic commands using getElementById, but I've never used jsoup before. I am familiar with maps, but I am not sure how to work with them in jsoup, and most of the tutorials online need the actual 'path' to the exact cell within the table (where this info is displayed).
You can't select this value with Jsoup selectors. Jsoup parses HTML, but it does not execute JavaScript; script contents are treated as plain text, so nothing inside them can be selected.
But if you already have the HTML parsed into a Document and you're looking for an alternative solution, you can try a regular expression to grab this value.
Document doc = Jsoup.parse...
String html = doc.toString();
Pattern p = Pattern.compile("set\\(\"serialno\", \"(\\d+)\"\\)");
Matcher m = p.matcher(html);
if (m.find()) {
    String serialno = m.group(1);
    System.out.println(serialno);
}
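Since the question mentions being familiar with maps, the same regex idea can be extended to collect every set("key", "value") call into a Map. This is only a sketch: the class name SetCallParser and the method parse are made up for illustration, and it assumes that keys and values never contain escaped quotes. It works on any string, so you could pass it doc.toString() or the text of a single script element.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup.
class SetCallParser {
    // Matches set("key", "value"); \b keeps setlabel(...) or reset(...) from matching.
    private static final Pattern SET_CALL =
            Pattern.compile("\\bset\\(\\s*\"([^\"]*)\"\\s*,\\s*\"([^\"]*)\"\\s*\\)");

    // Returns the key/value pairs in the order they appear in the text.
    static Map<String, String> parse(String text) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = SET_CALL.matcher(text);
        while (m.find()) {
            values.put(m.group(1), m.group(2));
        }
        return values;
    }

    public static void main(String[] args) {
        String script = "{set(\"modelname\", \"NPort 5650-16\");set(\"mac\", \"00:90:E8:22:76:F4\");"
                + "set(\"serialno\", \"2583\");set(\"ver\", \"3.3 Build 08042219\");"
                + "setlabel(\"NPORT\");uptime(\"264 days, 03h:31m:34s\");}";
        Map<String, String> values = parse(script);
        System.out.println(values.get("serialno")); // prints 2583
    }
}
```

The function definition itself (function set(obj, val)) is not picked up, because the pattern requires quoted arguments.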

Using multiple criteria to find a WebElement in Selenium

I am using Selenium to test a website. Does it work if I find an element by more than one criterion? For example:
driverChrome.findElements(By.tagName("input").id("id_Start"));
or
driverChrome.findElements(By.tagName("input").id("id_Start").className("blabla"));
No, it does not. You cannot chain locators like that. However, you can write a single selector that covers all the criteria and use it with findElements():
By byXpath = By.xpath("//input[(@id='id_Start') and (@class='blabla')]");
List<WebElement> elements = driver.findElements(byXpath);
This should return a list of input elements whose class is blabla and whose id is id_Start.
To combine By statements, use ByChained:
driverChrome.findElements(
    new ByChained(
        By.tagName("input"),
        By.id("id_Start"),
        By.className("blabla")
    )
);
However, if the criteria refer to the same element, see @Saifur's answer.
CSS selectors would be perfect in this scenario.
Your example would be:
By.cssSelector("input#id_Start.blabla")
There is a lot of information available if you search for CSS selectors. Also, when dealing with classes, CSS is easier than XPath because XPath treats class as a literal string, whereas CSS treats it as a space-delimited collection.
Based on @George's reply, the same code for C#:
//reference
using OpenQA.Selenium.Support.PageObjects;
...
int allElements = _driver.FindElements(new ByChained(
By.CssSelector(".sc-pAyMl.cnszJw"),
By.Id("base-field")
)).Count();

How to extract xml tag value without using the tag name in java?

I am using Java. I have an XML file which looks like this:
<?xml version="1.0"?>
<personaldetails>
    <phno>1553294232</phno>
    <email>
        <official>xya@gmail.com</official>
        <personal>bk@yahoo.com</personal>
    </email>
</personaldetails>
Now, I need to check each tag value for its type using specific conditions, and put the values in separate files.
For example, for the above file I write conditions such as: 10 digits is a phone number,
and something in the format xxx@yy.com is an email.
So what I need to do is extract the value of each tag; if it matches one of the conditions, it goes in the first text file, and if not, in the second text file.
In that case, the first text file will contain:
1553294232
xya@gmail.com
bk@yahoo.com
and the rest of the values go in the second file.
I just don't know how to extract the tag values without using the tag name (that is, without using getElementsByTagName).
I mean this code should extract the email bk@yahoo.com even if I use a <mailing> tag instead of <personal>. It should not depend on the tag name.
I hope this is not confusing. I am new to using XML in Java, so pardon me if my question is silly.
Please help.
Seems like a typical use case for XPath
XPath allows you to query XML in a very flexible way.
This tutorial could help:
http://www.javabeat.net/2009/03/how-to-query-xml-using-xpath/
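As a rough illustration of the XPath approach, the expression //text()[normalize-space()] selects every non-blank text node no matter which tag contains it. The sketch below uses only the JDK's javax.xml.xpath API; the class name TagValueExtractor and the method extractValues are invented for this example, and the sample input assumes the addresses are written with @.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects tag values without naming any tag.
class TagValueExtractor {
    static List<String> extractValues(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Select all text nodes that are not whitespace-only, in document order.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//text()[normalize-space()]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue().trim());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath on input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<personaldetails><phno>1553294232</phno><email>"
                + "<official>xya@gmail.com</official><mailing>bk@yahoo.com</mailing>"
                + "</email></personaldetails>";
        System.out.println(extractValues(xml)); // prints [1553294232, xya@gmail.com, bk@yahoo.com]
    }
}
```

Renaming <personal> to <mailing> makes no difference here, because the expression never refers to an element name.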
If you're using JavaScript, which could be the case since you mention getElementsByTagName(), you could just use jQuery selectors. They give you consistent behavior across browsers, and the jQuery library is useful for a lot of other things if you are not using it already: http://api.jquery.com/category/selectors/
Here for example is information on this:
http://www.switchonthecode.com/tutorials/xml-parsing-with-jquery
Since you don't know your element names, I would suggest building a DOM tree and iterating through it. Each time you reach a text value, you would match it against your ruleset (I would suggest using regex for this purpose) and then write it to the appropriate file.
This is a sample structure to help you get started, but you would need to modify it based on your requirements:
public void parseXML(){
    try{
        DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc;
        doc = documentBuilder.parse(new File("test.xml"));
        getData(null, doc.getDocumentElement());
    }catch(Exception exe){
        exe.printStackTrace();
    }
}
private void getData(Node parentNode, Node node){
    switch(node.getNodeType()){
        case Node.ELEMENT_NODE:{
            if(node.hasChildNodes()){
                NodeList list = node.getChildNodes();
                int size = list.getLength();
                for(int index = 0; index < size; index++){
                    getData(node, list.item(index));
                }
            }
            break;
        }
        case Node.TEXT_NODE:{
            String data = node.getNodeValue();
            if(data.trim().length() > 0){
                /*
                 * Here you need to check the data against your ruleset and perform your operation
                 */
                System.out.println(parentNode.getNodeName()+" :: "+node.getNodeValue());
            }
            break;
        }
    }
}
You might want to look at the Chain of Responsibility design pattern to design your ruleset.
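For the ruleset itself, one simple option (a plain list of regex rules tried in order, which is a lighter stand-in for a full Chain of Responsibility) might look like the sketch below. The class name ValueRules, the method matchesAnyRule, and both patterns are assumptions for illustration; the email pattern is deliberately loose and assumes addresses are written with @.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical ruleset: a value "matches" if any rule accepts the whole value.
class ValueRules {
    private static final List<Pattern> RULES = Arrays.asList(
            Pattern.compile("\\d{10}"),                             // 10 digits: phone number
            Pattern.compile("[^@\\s]+@[^@\\s]+\\.[A-Za-z]{2,}"));   // rough email shape

    static boolean matchesAnyRule(String value) {
        String trimmed = value.trim();
        for (Pattern rule : RULES) {
            if (rule.matcher(trimmed).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> firstFile = new ArrayList<>();   // values that match a rule
        List<String> secondFile = new ArrayList<>();  // everything else
        for (String value : Arrays.asList("1553294232", "bk@yahoo.com", "some note")) {
            (matchesAnyRule(value) ? firstFile : secondFile).add(value);
        }
        System.out.println("first file: " + firstFile);
        System.out.println("second file: " + secondFile);
    }
}
```

You would call matchesAnyRule from the TEXT_NODE branch of getData and write to the files there instead of collecting into lists.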
