Get child pages in java component xwiki - java

I'm trying to get all the documents that list the current document as the parent using:
List<DocumentReference> childDocs = doc.getChildrenReferences(xcontext);
where doc is the parent XWikiDocument.
However, the function only returns an empty list, although there are documents in the space.
If I print doc.getFullName() it is AAA.WebHome, and all the child spaces are listed under AAA. How should I reference doc to get, say, AAA.BBB.WebHome in the list? Or where am I going wrong?
I'm trying to write a recursive function that deletes all the child pages in the current space, but I can't list all the child pages. Here is the recursive function:
public void RECdeleteSpace(XWikiDocument doc, XWikiContext xcontext, boolean toTrash) {
    XWiki xwiki = xcontext.getWiki();
    List<DocumentReference> childDocs;
    try {
        childDocs = doc.getChildrenReferences(xcontext);
        System.out.println("REC " + doc.getFullName());
        System.out.println("CHLD " + childDocs.toString());
        System.out.println("----- ");
        Iterator<DocumentReference> docit = childDocs.iterator();
        while (docit.hasNext()) {
            DocumentReference chdocref = docit.next();
            XWikiDocument chdoc = xwiki.getDocument(chdocref, xcontext);
            System.out.println("DOC: " + chdoc.getFullName());
            RECdeleteSpace(chdoc, xcontext, toTrash);
        }
        xwiki.deleteDocument(doc, toTrash, xcontext);
    } catch (XWikiException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
The output is only:
INIT WebHome
REC AAA.WebHome
CHLD []
-----
And the iterator never gets to AAA.BBB.WebHome or AAA.BBB.CCC.WebHome.
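A possible direction (a sketch, not a verified fix): getChildrenReferences() only returns documents whose parent field explicitly points to the current document, so nested pages such as AAA.BBB.WebHome will not show up unless that field is set on them. An alternative is to query for documents by space. The snippet below assumes the component can have XWiki's QueryManager injected and uses an illustrative XWQL statement and method name; adapt both to your setup:

import javax.inject.Inject;
import org.xwiki.query.Query;
import org.xwiki.query.QueryManager;

@Inject
private QueryManager queryManager;

// Hypothetical helper: delete every document in space AAA and its nested spaces.
public void deleteSpaceByQuery(XWikiContext xcontext, boolean toTrash) throws Exception {
    XWiki xwiki = xcontext.getWiki();
    Query query = queryManager.createQuery(
            "where doc.space = :space or doc.space like :nested", Query.XWQL);
    query.bindValue("space", "AAA");
    query.bindValue("nested", "AAA.%");
    // With no select clause, XWQL returns the matching documents' full names.
    List<String> docNames = query.execute();
    for (String name : docNames) {
        XWikiDocument child = xwiki.getDocument(name, xcontext);
        xwiki.deleteDocument(child, toTrash, xcontext);
    }
}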

Related

Price extraction in java

I am trying to create a Discord bot that searches for an item entered by the user ("!price item") and then gives me a price that I can work with later in the code. I figured out how to get the HTML into a string or a Document, but I am struggling to find a way to extract only the prices.
Here is the code:
@Override
public void onMessageReceived(MessageReceivedEvent event) {
    String html;
    System.out.println("I received a message from " +
            event.getAuthor().getName() + ": " +
            event.getMessage().getContentDisplay());
    if (event.getMessage().getContentRaw().contains("!price")) {
        String input = event.getMessage().getContentDisplay();
        String item = input.substring(9).replaceAll(" ", "%20");
        String URL = "https://www.google.lt/search?q=" + item + "%20price";
        try {
            html = Jsoup.connect(URL).userAgent("Mozilla/49.0").get().html();
            html = html.replaceAll("[^\\ ,.£€eur0123456789]", " ");
        } catch (Exception e) {
            return;
        }
        System.out.println(html);
    }
}
The biggest problem is that I am using Google search, so the prices are not in the same place in the HTML. Is there a way I can extract only (numbers + EUR) or (a euro sign + price) from the HTML?
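Whatever the page layout, one hedged fallback is to run a regular expression over the page text and keep only tokens that look like a number followed by EUR or a euro sign next to a number. The sketch below uses only core Java (java.util.regex); the pattern and class name are illustrative, so tune them to the price formats you actually see:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PriceExtractor {
    // Matches e.g. "349,99 EUR", "349.99 EUR", "€349,99" or "349,99 €".
    private static final Pattern PRICE = Pattern.compile(
            "(?:€\\s*\\d+(?:[.,]\\d{2})?)|(?:\\d+(?:[.,]\\d{2})?\\s*(?:€|EUR))",
            Pattern.CASE_INSENSITIVE);

    public static List<String> extractPrices(String text) {
        List<String> prices = new ArrayList<String>();
        Matcher m = PRICE.matcher(text);
        while (m.find()) {
            prices.add(m.group());
        }
        return prices;
    }

    public static void main(String[] args) {
        // Prints [449,99 EUR, €529.90]
        System.out.println(extractPrices("OnePlus 8 for 449,99 EUR or €529.90 today"));
    }
}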
You can easily do that by scraping the website. Here's a simple working example of what you are looking for, using Jsoup:
import java.io.IOException;
import java.util.ArrayList;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) {
        try {
            String query = "oneplus";
            String url = "https://www.google.com/search?q=" + query + "%20price&client=firefox-b&source=lnms&tbm=shop&sa=X";
            int pricesToRetrieve = 3;
            ArrayList<String> prices = new ArrayList<String>();
            Document document = Jsoup.connect(url).userAgent("Mozilla/5.0").get();
            Elements elements = document.select("div.pslires");
            for (Element element : elements) {
                String price = element.select("div > div > b").text();
                String[] finalPrice = price.split(" ");
                prices.add(finalPrice[0] + finalPrice[1]);
                pricesToRetrieve -= 1;
                if (pricesToRetrieve == 0) {
                    break;
                }
            }
            System.out.println(prices);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
That piece of code will output:
[347,10€, 529,90€, 449,99€]
And if you want to retrieve more information, just connect Jsoup to the Google Shopping URL with your desired query and scrape it. In this case I scraped Google Shopping for OnePlus to check its prices, but you can also get the URL to buy it, the full product name, etc. In this piece of code I retrieve the first 3 prices indexed in Google Shopping and add them to an ArrayList of String. Before adding each one to the ArrayList, I split the retrieved text by space so I keep just the information I want: the price.
This is a simple scraping example; if you need anything else, feel free to ask! And if you want to learn more about scraping with Jsoup, check this link.
Hope this helped you!

How to loop inside containers to select value in selenium?

I have a product page which has the sizes inside containers. I tried to list the elements and get the size by text, but the list always returns zero elements. I tried the XPath of the parent and the child and I get the same error. How can I list the sizes and select a specific size?
public void chooseSize(String size) {
    String selectedSize;
    List<WebElement> sizesList = actions.driver.findElements(By.xpath("SelectSizeLoactor"));
    try {
        for (int i = 0; i <= sizesList.size(); i++) {
            if (sizesList.get(i).getText().toLowerCase().contains(size.toLowerCase()));
            {
                selectedSize = sizesList.get(i).getText();
                sizesList.get(i).click();
                assertTrue(selectedSize.equals(size));
            }
        }
    } catch (Exception e) {
        Assert.fail("Couldn't select size cause of " + e.getMessage());
    }
}
It looks to me like the proper selector would be:
actions.driver.findElements(By.cssSelector(".SizeSelection-option"))
Try the options below:
List<WebElement> sizesList = actions.driver.findElements(By.xpath("//*[@class='SelectSizeLoactor']"));
List<WebElement> sizesList = actions.driver.findElements(By.cssSelector(".SelectSizeLoactor"));
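If the .SizeSelection-option selector suggested above does match the size buttons (an assumption about the page markup), a loop along these lines iterates them and clicks the matching one; it also avoids the off-by-one (<=) and the stray semicolon after the if in the original snippet:

// Sketch only; assumes org.openqa.selenium.By / WebElement, java.util.List,
// and JUnit's assertTrue / Assert are already imported, as in the question.
public void chooseSize(String size) {
    List<WebElement> sizesList =
            actions.driver.findElements(By.cssSelector(".SizeSelection-option"));
    boolean clicked = false;
    try {
        for (int i = 0; i < sizesList.size(); i++) {
            String optionText = sizesList.get(i).getText();
            if (optionText.toLowerCase().contains(size.toLowerCase())) {
                sizesList.get(i).click();
                clicked = true;
                break;
            }
        }
    } catch (Exception e) {
        Assert.fail("Couldn't select size because of " + e.getMessage());
    }
    assertTrue("No size option matched " + size, clicked);
}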
I found a quick solution: I used part of the XPath with text(), passed the value of that text, then appended the rest of the XPath, and it worked!
String SelectSizeLoactor = "//button[text()='";

public void chooseSize(String size) {
    String selectedSize;
    WebElement sizeLocator = actions.driver.findElement(By.xpath(SelectSizeLoactor + size.toUpperCase() + "']"));
    try {
        if (sizeLocator.getText().toUpperCase().contains(size.toUpperCase())) {
            selectedSize = sizeLocator.getText();
            sizeLocator.click();
            assertTrue(selectedSize.equals(size));
        }
    } catch (Exception e) {
        Assert.fail("Couldn't select size cause of " + e.getMessage());
    }
}

Java - How do I extract Google News Titles and Links using Jsoup?

I am very new to using Jsoup and HTML. I was wondering how to extract the titles and links (if possible) from the stories on the front page of Google News. Here is my code:
org.jsoup.nodes.Document doc = null;
try {
    doc = (org.jsoup.nodes.Document) Jsoup.connect("https://news.google.com/").get();
} catch (IOException e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}
Elements titles = doc.select("titletext");
System.out.println("Titles: " + titles.text());
// non existent
for (org.jsoup.nodes.Element e : titles) {
    System.out.println("Title: " + e.text());
    System.out.println("Link: " + e.attr("href"));
}
For some reason I think my program is unable to find titletext, since this is the output when the code runs: Titles:
I would really appreciate your help, thanks.
First, get all nodes/elements under the h2 HTML tag:
Elements elem = html.select("h2"); // html is the parsed org.jsoup.nodes.Document
Now you have the elements; each one has child elements and attributes (id, href, originalhref and so on). From these, retrieve the data you need:
for (Element e : elem) {
    System.out.println(e.select("[class=titletext]").text());
    System.out.println(e.select("a").attr("href"));
}
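Put together, a self-contained version might look like the sketch below; the h2 / titletext / href structure is taken from the answer above and may not match Google News' current markup, so treat the selectors as assumptions:

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleNewsTitles {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the front page.
        Document doc = Jsoup.connect("https://news.google.com/")
                .userAgent("Mozilla/5.0")
                .get();

        // Selectors assumed from the answer above; adjust to the live markup.
        Elements headlines = doc.select("h2");
        for (Element h2 : headlines) {
            String title = h2.select("[class=titletext]").text();
            String link = h2.select("a").attr("abs:href"); // abs: resolves relative URLs
            System.out.println("Title: " + title);
            System.out.println("Link: " + link);
        }
    }
}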

Solr SearchComponent get all documents

I'm writing a Solr plugin (SearchComponent) and want to iterate over all documents that are found for the query. This is the relevant part of my code in the process method:
// Searcher to search a document
SolrIndexSearcher searcher = rb.req.getSearcher();
// Getting the list of documents found for the query
DocList docs = rb.getResults().docList;
// Return if no results are found for the query
if (docs == null || docs.size() == 0) {
    return;
}
// Get the iterator for the documents that will be returned
DocIterator iterator = docs.iterator();
// Iterate over all documents and count occurrences
for (int i = 0; i < docs.size(); i++) {
    try {
        // Getting the current document ID
        int docid = iterator.nextDoc();
        // Fetch the document from the searcher
        Document doc = searcher.doc(docid);
        // do stuff
    } catch (Exception e) {
        LOGGER.error(e.getMessage());
    }
}
So far I have found a method where I can iterate over all documents that will be returned, i.e. if 1300 documents are found for the query and I only return 20, I will only iterate over those 20. Is there a possibility to get the full set of documents (1300)?
There is a possibility to do that. You are using DocList, which contains only 'rows' docs starting from 'start'. If you want to iterate over all 'numFound' docs, use DocSet via
rb.getResults().docSet
For an explanation of this mechanism, see http://wiki.apache.org/solr/FAQ#How_can_I_get_ALL_the_matching_documents_back.3F_..._How_can_I_return_an_unlimited_number_of_rows.3F
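A minimal sketch of how that could look inside process(), assuming docSet is populated for your request (if it is not, you may need to ask for it, e.g. with rb.setNeedDocSet(true) in prepare()); types come from org.apache.solr.search and org.apache.lucene.document:

// Iterate over every matching document, not just the page of rows being returned.
DocSet docSet = rb.getResults().docSet;
if (docSet != null) {
    SolrIndexSearcher searcher = rb.req.getSearcher();
    DocIterator it = docSet.iterator();
    while (it.hasNext()) {
        try {
            int docid = it.nextDoc();
            Document doc = searcher.doc(docid); // may throw IOException
            // do stuff with doc
        } catch (IOException e) {
            LOGGER.error(e.getMessage());
        }
    }
}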

Tips on storing Query values and passing them between activities?

I am trying to store a list of songs that I query from the Android MediaStore, but I am not sure how to save multiple columns (i.e. song name, track path, duration, etc.).
I currently use a HashMap and ArrayList to display the song Name and Duration in a list view, but I'd like to store more information from my query. Any tips on how to get a multidimensional vector/container of some sort? I tried using JSON objects/arrays, but every time I store values in them I can only get the last one out...
while (c.moveToNext()) {
    HashMap<String, String> temp = new HashMap<String, String>();
    temp.put("Title", c.getString(c
            .getColumnIndex(MediaStore.Audio.Media.DISPLAY_NAME)));
    temp.put("Duration", Tools.stringOfTime(Long.parseLong(c.getString(c
            .getColumnIndex(MediaStore.Audio.Media.DURATION)))));
    list.add(temp);
JSON attempt: basically I added each query row into one JSON object and kept putting those objects into a JSON array, but for some reason I can only get the last value from my list, whereas the HashMap approach works fine but only stores 2 fields:
// object = new JSONObject();
// try {
//     object.put("Title", c.getString(c
//             .getColumnIndex(MediaStore.Audio.Media.DISPLAY_NAME)));
//     object.put("Data", c.getString(c
//             .getColumnIndex(MediaStore.Audio.Media.DATA)));
//     object.put("Artist", c.getString(c
//             .getColumnIndex(MediaStore.Audio.Media.ARTIST)));
//     object.put("Album", c.getString(c
//             .getColumnIndex(MediaStore.Audio.Media.ALBUM)));
//     object.put("Duration", c.getString(c
//             .getColumnIndex(MediaStore.Audio.Media.DURATION)));
//     jarray.put(object);
// } catch (JSONException e) {
//     e.printStackTrace();
// }
//
// jlist.add(object);
// }
// try {
//     tv.setText(object.getString("Title"));
// } catch (JSONException e) {
//     // TODO Auto-generated catch block
//     e.printStackTrace();
//     Log.d("SongsActivity", "Couldn't print json object");
// }
Store the cursor object in an Application class field variable.
Like: http://developer.android.com/reference/android/app/Application.html
I did some quick research, and almost every answer I see to questions like this comes down to: run the query again in the next activity.
The other answer was to create an application class to act as a data hub.
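To illustrate the first suggestion, here is a hedged sketch of an Application subclass used as a simple data holder; the class and method names are made up for this example, and the class would be registered in the manifest via android:name on the <application> element. Note that a HashMap can hold as many columns as you put into it (Title, Data, Artist, Album, Duration, ...), so the two-field limit above is only because two keys were added.

import java.util.ArrayList;
import java.util.HashMap;

import android.app.Application;

// Hypothetical data-holder Application class (name is illustrative only).
public class SongsApplication extends Application {
    // Rows queried from MediaStore: one HashMap per song, with keys such as
    // "Title", "Data", "Artist", "Album", "Duration".
    private ArrayList<HashMap<String, String>> songs = new ArrayList<HashMap<String, String>>();

    public ArrayList<HashMap<String, String>> getSongs() {
        return songs;
    }

    public void setSongs(ArrayList<HashMap<String, String>> songs) {
        this.songs = songs;
    }
}

Another activity can then read the data with ((SongsApplication) getApplication()).getSongs() instead of re-running the query.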
