Java web crawler to extract emails

I want to write a web crawler that starts at one page and goes to each link on that page looking for an email address. This is what I have so far, but it's not doing anything other than going from webpage to webpage.
package com.netinstructions.crawler;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
public class WebCrawler {
private static final int MAX_PAGES_TO_SEARCH = 26;
private Set<String> pagesVisited = new HashSet<String>();
private List<String> pagesToVisit = new LinkedList<String>();
private List<String> emails = new LinkedList<>();
private String nextUrl()
{
String nextUrl;
do
{
nextUrl = this.pagesToVisit.remove(0);
} while(this.pagesVisited.contains(nextUrl));
this.pagesVisited.add(nextUrl);
return nextUrl;
}
public void search(String url, String searchWord)
{
while(this.pagesVisited.size() < MAX_PAGES_TO_SEARCH)
{
String currentUrl;
SpiderLeg leg = new SpiderLeg();
if(this.pagesToVisit.isEmpty())
{
currentUrl = url;
this.pagesVisited.add(url);
}
else
{
currentUrl = this.nextUrl();
}
leg.crawl(currentUrl); // Lots of stuff happening here. Look at the crawl method in
// SpiderLeg
leg.searchForWord(currentUrl, emails);
this.pagesToVisit.addAll(leg.getLinks());
}
System.out.println(emails.toString());
//System.out.println(String.format("**Done** Visited %s web page(s)", this.pagesVisited.size()));
}
}
And this is my Spider Leg Class
package com.netinstructions.crawler;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class SpiderLeg
{
// We'll use a fake USER_AGENT so the web server thinks the robot is a normal web browser.
private static final String USER_AGENT =
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1";
private List<String> links = new LinkedList<String>();
private Document htmlDocument;
/**
* This performs all the work. It makes an HTTP request, checks the response, and then gathers
* up all the links on the page. Perform a searchForWord after the successful crawl
*
* @param url
* - The URL to visit
* @return whether or not the crawl was successful
*/
public boolean crawl(String url)
{
try
{
Connection connection = Jsoup.connect(url).userAgent(USER_AGENT);
Document htmlDocument = connection.get();
this.htmlDocument = htmlDocument;
if(connection.response().statusCode() == 200) // 200 is the HTTP OK status code
// indicating that everything is great.
{
System.out.println("\n**Visiting** Received web page at " + url);
}
if(!connection.response().contentType().contains("text/html"))
{
System.out.println("**Failure** Retrieved something other than HTML");
return false;
}
Elements linksOnPage = htmlDocument.select("a[href]");
//System.out.println("Found (" + linksOnPage.size() + ") links");
for(Element link : linksOnPage)
{
this.links.add(link.absUrl("href"));
}
return true;
}
catch(IOException ioe)
{
// We were not successful in our HTTP request
return false;
}
}
/**
* Performs a search on the body of the HTML document that is retrieved. This method should
* only be called after a successful crawl.
*
* @param searchWord
* - The word or string to look for
* @return whether or not the word was found
*/
public void searchForWord(String searchWord, List<String> emails)
{
if(this.htmlDocument == null)
{
System.out.println("ERROR! Call crawl() before performing analysis on the document");
//return false;
}
Pattern pattern =
Pattern.compile("\"^[A-Z0-9._%+-]+#[A-Z0-9.-]+\\\\.[A-Z]{2,6}$\", Pattern.CASE_INSENSITIVE");
Matcher matchs = pattern.matcher(searchWord);
while (matchs.find()) {
System.out.println(matchs.group());
}
}
public List<String> getLinks()
{
return this.links;
}
}
My web crawler was taken from another source and I changed a few things. I added a List to hold the emails so that they can all be returned to me at the end. I think I'm going wrong in the way I take each email and put it into the list, but I am not sure how to fix it.

Spider Leg Class
Pattern.compile("\"^[A-Z0-9._%+-]+#[A-Z0-9.-]+\\\\.[A-Z]{2,6}$\", Pattern.CASE_INSENSITIVE");
Shouldn't this be...?
Pattern.compile("[A-Z0-9._%+-]+#[A-Z0-9.-]+\\.[A-Z]{2,6}", Pattern.CASE_INSENSITIVE);
Nothing gets added to emails, so you need to add() each email you find to the list. Secondly, you probably want to be parsing the HTML document, not the URL of the page. And since the method no longer returns anything, you need to expand the if statement to avoid a NullPointerException. The searchForWord method should be:
public void searchForWord(String searchWord, List<String> emails)
{
if(this.htmlDocument == null)
{
System.out.println("ERROR! Call crawl() before performing analysis on the document");
} else
{
String input = this.htmlDocument.toString();
Pattern pattern =
Pattern.compile("[A-Z0-9._%+-]+#[A-Z0-9.-]+\\.[A-Z]{2,6}", Pattern.CASE_INSENSITIVE);
Matcher matchs = pattern.matcher(input);
while (matchs.find()) {
emails.add(matchs.group());
}
}
}

Related

Example OAuth 1.0 implementation in Java

I need an example in Java of consuming an API using OAuth 1.0 to generate the token.
NB: I only have this information: the consumer key, consumer secret, access token and access token secret.
I have searched the internet but have not found anything that helps.
I'm looking for the same thing, but I do have an example provided by the admins of the API that I'm trying to consume.
Since no one has replied to this thread, I'll paste that example here, hoping it helps in some way.
Also, if someone could give me a hand understanding the code and explaining how to implement it, that would be great.
package com.example.entities;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.MultivaluedMap;
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.WebResource;
import com.sun.jersey.api.client.filter.LoggingFilter;
import com.sun.jersey.oauth.client.OAuthClientFilter;
import com.sun.jersey.oauth.signature.HMAC_SHA1;
import com.sun.jersey.oauth.signature.OAuthParameters;
import com.sun.jersey.oauth.signature.OAuthSecrets;
public class SSRestServiceClient {
/*
* private constructor to avoid construction of object.
*/
private SSRestServiceClient(String url, String consumerKey, String consumerSecret, int timeoutSeconds) {
OAuthParameters params = new OAuthParameters().signatureMethod(HMAC_SHA1.NAME).consumerKey(consumerKey).version(OAUTH_VERSION);
OAuthSecrets secrets = new OAuthSecrets().consumerSecret(consumerSecret);
Client httpClient = Client.create();
httpClient.setConnectTimeout(timeoutSeconds * 1000);
httpClient.setReadTimeout(timeoutSeconds * 1000);
OAuthClientFilter filter = new OAuthClientFilter(httpClient.getProviders(), params, secrets);
httpClient.addFilter(filter);
httpClient.addFilter(new LoggingFilter());
this.consumerKey = consumerKey;
this.webResource = httpClient.resource(url);
}
/*
* Creates instance of the Client class, returns existing instance if it's already created.
*/
public static synchronized SSRestServiceClient getInstance(String url, String consumerKey, String consumerSecret) {
return getInstance(url, consumerKey, consumerSecret, 0);
}
/*
* Creates instance of the Client class, returns existing instance if it's already created.
*/
public static synchronized SSRestServiceClient getInstance(String url, String consumerKey,String consumerSecret, int timeoutSeconds) {
if (client == null || client.getConsumerKey() == null || !client.getConsumerKey().equals(consumerKey)) {
if (timeoutSeconds != 0 && timeoutSeconds < ONE_MINUTE) {
timeoutSeconds = ONE_MINUTE;
}
client = new SSRestServiceClient(url, consumerKey, consumerSecret, timeoutSeconds);
}
return client;
}
/*
* Return a list of all active source cities.
*/
@Override
public CityList getAllSources() {
return webResource.path(GET_ALL_SOURCES).get(CityList.class);
}
}
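For reference, a minimal sketch of how this client appears to be meant to be used. The endpoint URL and credentials are placeholders, CityList comes from the same example, and the class above also relies on fields and constants that are not shown (client, webResource, consumerKey, OAUTH_VERSION, ONE_MINUTE, GET_ALL_SOURCES). Note that it only signs requests with the consumer key and secret; the access token and token secret mentioned in the question are not used here.
public class SSRestServiceClientDemo {
    public static void main(String[] args) {
        // Placeholder endpoint and OAuth 1.0 consumer credentials
        SSRestServiceClient client = SSRestServiceClient.getInstance(
                "https://api.example.com/rest",
                "yourConsumerKey",
                "yourConsumerSecret");
        // Every request sent through the wrapped WebResource is signed by the OAuthClientFilter
        CityList sources = client.getAllSources();
        System.out.println(sources);
    }
}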

Exception in thread "main" java.lang.NoClassDefFoundError: org/jsoup/Jsoup

I copied a simple web crawler from the internet and then tried to run the application from a test class. Every time I run it I get an "Exception in thread "main" java.lang.NoClassDefFoundError: org/jsoup/Jsoup" error. I had first imported the jsoup JAR as an external JAR in a library, because I needed it for the HTTP stuff.
Error messages:
Exception in thread "main" java.lang.NoClassDefFoundError: org/jsoup/Jsoup
at com.copiedcrawler.SpiderLeg.crawl(SpiderLeg.java:35)
at com.copiedcrawler.Spider.search(Spider.java:40)
at com.copiedcrawler.SpiderTest.main(SpiderTest.java:9)
Caused by: java.lang.ClassNotFoundException: org.jsoup.Jsoup
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:602)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
... 3 more
Spider Class
package com.copiedcrawler;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
public class Spider
{
private static final int MAX_PAGES_TO_SEARCH = 10;
private Set<String> pagesVisited = new HashSet<String>();
private List<String> pagesToVisit = new LinkedList<String>();
public void search(String url, String searchWord)
{
while(this.pagesVisited.size() < MAX_PAGES_TO_SEARCH)
{
String currentUrl;
SpiderLeg leg = new SpiderLeg();
if(this.pagesToVisit.isEmpty())
{
currentUrl = url;
this.pagesVisited.add(url);
}
else
{
currentUrl = this.nextUrl();
}
leg.crawl(currentUrl); // Lots of stuff happening here. Look at the crawl method in
// SpiderLeg
boolean success = leg.searchForWord(searchWord);
if(success)
{
System.out.println(String.format("**Success** Word %s found at %s", searchWord, currentUrl));
break;
}
this.pagesToVisit.addAll(leg.getLinks());
}
System.out.println("\n**Done** Visited " + this.pagesVisited.size() + " web page(s)");
}
/**
* Returns the next URL to visit (in the order that they were found). We also do a check to make
* sure this method doesn't return a URL that has already been visited.
*
* @return
*/
private String nextUrl()
{
String nextUrl;
do
{
nextUrl = this.pagesToVisit.remove(0);
} while(this.pagesVisited.contains(nextUrl));
this.pagesVisited.add(nextUrl);
return nextUrl;
}
}
SpiderLeg class
package com.copiedcrawler;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class SpiderLeg
{
// We'll use a fake USER_AGENT so the web server thinks the robot is a normal web browser.
private static final String USER_AGENT =
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1";
private List<String> links = new LinkedList<String>();
private Document htmlDocument;
/**
* This performs all the work. It makes an HTTP request, checks the response, and then gathers
* up all the links on the page. Perform a searchForWord after the successful crawl
*
* @param url
* - The URL to visit
* @return whether or not the crawl was successful
*/
public boolean crawl(String url)
{
try
{
Connection connection = Jsoup.connect(url).userAgent(USER_AGENT);
Document htmlDocument = connection.get();
this.htmlDocument = htmlDocument;
if(connection.response().statusCode() == 200) // 200 is the HTTP OK status code
// indicating that everything is great.
{
System.out.println("\n**Visiting** Received web page at " + url);
}
if(!connection.response().contentType().contains("text/html"))
{
System.out.println("**Failure** Retrieved something other than HTML");
return false;
}
Elements linksOnPage = htmlDocument.select("a[href]");
System.out.println("Found (" + linksOnPage.size() + ") links");
for(Element link : linksOnPage)
{
this.links.add(link.absUrl("href"));
}
return true;
}
catch(IOException ioe)
{
// We were not successful in our HTTP request
return false;
}
}
/**
* Performs a search on the body of the HTML document that is retrieved. This method should
* only be called after a successful crawl.
*
* @param searchWord
* - The word or string to look for
* @return whether or not the word was found
*/
public boolean searchForWord(String searchWord)
{
// Defensive coding. This method should only be used after a successful crawl.
if(this.htmlDocument == null)
{
System.out.println("ERROR! Call crawl() before performing analysis on the document");
return false;
}
System.out.println("Searching for the word " + searchWord + "...");
String bodyText = this.htmlDocument.body().text();
return bodyText.toLowerCase().contains(searchWord.toLowerCase());
}
public List<String> getLinks()
{
return this.links;
}
}
SpiderTest class
package com.copiedcrawler;
public class SpiderTest {
public static void main(String[] args) {
// TODO Auto-generated method stub
Spider s1 = new Spider();
s1.search("https://www.w3schools.com/html/", "html");
}
}
Based on the stack trace, you are running the Java program from the command line and you forgot to add jsoup to the classpath. Try running
java -cp classes:libs/jsoup.jar com.copiedcrawler.SpiderTest
where classes contains your compiled classes and libs is the folder with the libraries.
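If you compile from the command line as well, jsoup has to be on the classpath for that step too; a sketch assuming the sources sit under src and the JAR under libs (paths are placeholders):
javac -cp libs/jsoup.jar -d classes src/com/copiedcrawler/*.java
java -cp classes:libs/jsoup.jar com.copiedcrawler.SpiderTest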
You might have added the jsoup JAR file to the module path. You need to add the JAR file to the classpath instead.
Follow these steps:
Remove the jsoup JAR from the libraries.
Project->Build Path->Configure Build Path->Libraries->Classpath->Add External JARs.
Apply and Close.
Re-run the project.
It should now work.

Iterate through all links of a website using Selenium

I'm new to Selenium and I would like to download all the pdf, ppt(x) and doc(x) files from a website. I have written the following code, but I'm confused about how to follow the inner links:
import java.io.*;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.io.FileUtils;
import org.openqa.selenium.By;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
public class WebScraper {
String loginPage = "https://blablah/login";
static String userName = "11";
static String password = "11";
static String mainPage = "https://blahblah";
public WebDriver driver = new FirefoxDriver();
ArrayList<String> visitedLinks = new ArrayList<>();
public static void main(String[] args) throws IOException {
System.setProperty("webdriver.gecko.driver", "E:\\geckodriver.exe");
WebScraper webSrcaper = new WebScraper();
webSrcaper.openTestSite();
webSrcaper.login(userName, password);
webSrcaper.getText(mainPage);
webSrcaper.saveScreenshot();
webSrcaper.closeBrowser();
}
/**
* Open the test website.
*/
public void openTestSite() {
driver.navigate().to(loginPage);
}
/**
* @param username
* @param Password Logs into the website by entering the provided username and password
*/
public void login(String username, String Password) {
WebElement userName_editbox = driver.findElement(By.id("IDToken1"));
WebElement password_editbox = driver.findElement(By.id("IDToken2"));
WebElement submit_button = driver.findElement(By.name("Login.Submit"));
userName_editbox.sendKeys(username);
password_editbox.sendKeys(Password);
submit_button.click();
}
/**
* grabs the status text and saves that into status.txt file
*
* @throws IOException
*/
public void getText(String website) throws IOException {
driver.navigate().to(website);
try {
Thread.sleep(10000);
} catch (InterruptedException e) {
e.printStackTrace();
}
List<WebElement> allLinks = driver.findElements(By.tagName("a"));
System.out.println("Total no of links Available: " + allLinks.size());
for (int i = 0; i < allLinks.size(); i++) {
String fileAddress = allLinks.get(i).getAttribute("href");
System.out.println(allLinks.get(i).getAttribute("href"));
if (fileAddress.contains("download")) {
driver.get(fileAddress);
} else {
// getText(allLinks.get(i).getAttribute("href"));
}
}
}
/**
* Saves the screenshot
*
* @throws IOException
*/
public void saveScreenshot() throws IOException {
File scrFile = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
FileUtils.copyFile(scrFile, new File("screenshot.png"));
}
public void closeBrowser() {
driver.close();
}
}
I have an if clause which checks whether the current link is a downloadable file (its address contains the word "download"). If it is, I download it; if not, what should I do? That part is my problem. I tried to implement a recursive function to retrieve the nested links and repeat the steps for them, but with no success.
In the meantime, the first link found when https://blahblah is given as the input is https://blahblah/#, which refers to the same page as https://blahblah. That can also cause a problem, but currently I'm stuck on the other issue, namely implementing the recursive function. Could you please help me?
You are not far off. To answer your question: grab all the links into a list of elements, then iterate, click and wait. In C# it would look something like this:
IList<IWebElement> listOfLinks = _driver.FindElements(By.XPath("//a"));
foreach (var link in listOfLinks)
{
if(link.GetAttribute("href").Contains("download"))
{
link.Click();
WaitForSecs(); //Thread.Sleep(1000)
}
}
In Java:
List<WebElement> listOfLinks = webDriver.findElements(By.xpath("//a"));
for (WebElement link :listOfLinks ) {
if(link.getAttribute("href").contains("download"))
{
link.click();
//WaitForSecs(); //Thread.Sleep(1000)
}
}
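The recursive part the question asks about is not covered above, so here is a rough Java sketch: collect the hrefs first (navigating away makes the original WebElements stale), keep a visited list to avoid loops, download "download" links and recurse into the rest. The class name and the same-site check are made up for illustration:
import java.util.ArrayList;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class LinkWalker {
    private final WebDriver driver;
    private final List<String> visitedLinks = new ArrayList<>();

    public LinkWalker(WebDriver driver) {
        this.driver = driver;
    }

    // Visits a page, fetches "download" links, and recurses into the remaining links.
    public void walk(String url) {
        if (url == null || visitedLinks.contains(url)) {
            return; // already seen (or no href), avoid infinite loops
        }
        visitedLinks.add(url);
        driver.navigate().to(url);

        // Copy the hrefs before navigating anywhere else
        List<String> hrefs = new ArrayList<>();
        for (WebElement link : driver.findElements(By.tagName("a"))) {
            hrefs.add(link.getAttribute("href"));
        }
        for (String href : hrefs) {
            if (href == null) {
                continue;
            }
            if (href.contains("download")) {
                driver.get(href); // triggers the file download
            } else if (href.startsWith("https://blahblah")) {
                walk(href); // stay on the same site and go deeper
            }
        }
    }
}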
One option is to embed Groovy in your Java code if you want to search depth-first. When HTTPBuilder parses the page, it gives you an XML-like document tree, and you can traverse as deep as you like using GPath in Groovy. Your test.groovy would look like this:
@Grab(group='org.codehaus.groovy.modules.http-builder', module='http-builder', version='0.7' )
import groovyx.net.http.HTTPBuilder
import static groovyx.net.http.Method.GET
import static groovyx.net.http.ContentType.JSON
import groovy.json.*
import org.cyberneko.html.parsers.SAXParser
import groovy.util.XmlSlurper
import groovy.json.JsonSlurper
urlValue="http://yoururl.com"
def http = new HTTPBuilder(urlValue)
//parses page and provide xml tree , it even includes malformed html
def parsedText = http.get([:])
// number of a tags. "**" will parse depth-first
aCount= parsedText."**".findAll {it.name()=='a'}.size()
Then you just call test.groovy from Java like this:
static void runWithGroovyShell() throws Exception {
new GroovyShell().parse( new File( "test.groovy" ) ).invokeMethod( "hello_world", null ) ;
}
More information on parsing HTML with Groovy is available elsewhere.
Addition:
When you evaluate Groovy within Java, you can access Groovy variables in the Java environment through Groovy bindings.
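A minimal sketch of that, assuming test.groovy is the script above: because aCount is assigned without def, it ends up in the binding and can be read back from Java:
import java.io.File;
import groovy.lang.Binding;
import groovy.lang.GroovyShell;

public class GroovyEmbedDemo {
    public static void main(String[] args) throws Exception {
        Binding binding = new Binding();
        GroovyShell shell = new GroovyShell(binding);
        shell.evaluate(new File("test.groovy"));       // runs the script above
        Object aCount = binding.getVariable("aCount"); // script variable, visible via the binding
        System.out.println("Number of <a> tags: " + aCount);
    }
}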

How to call Fedora Commons findObjects method (web service)

I'm trying to run a search through the Fedora Commons web service. I'm interested in the findObjects method. How can I make a search in Java equivalent to the example described in the findObjects syntax documentation?
I'm particularly interested in this type of request:
http://localhost:8080/fedora/search?terms=fedora&pid=true&title=true
I'll attach some code; I already have a class that can call my Fedora service.
package test.fedora;
import info.fedora.definitions._1._0.types.DatastreamDef;
import info.fedora.definitions._1._0.types.MIMETypedStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.List;
import javax.xml.ws.BindingProvider;
import org.w3c.dom.Document;
public class FedoraAccessor {
info.fedora.definitions._1._0.api.FedoraAPIAService service;
info.fedora.definitions._1._0.api.FedoraAPIA port;
final String username = "xxxx";
final String password = "yyyy";
public FedoraAccessor() {
service = new info.fedora.definitions._1._0.api.FedoraAPIAService();
port = service.getFedoraAPIAServiceHTTPPort();
((BindingProvider) port).getRequestContext().put(BindingProvider.USERNAME_PROPERTY, username);
((BindingProvider) port).getRequestContext().put(BindingProvider.PASSWORD_PROPERTY, password);
}
public List findObjects() {
//how?
}
public List<DatastreamDef> listDatastreams(String pid, String asOfTime) {
List<DatastreamDef> result = null;
try {
result = port.listDatastreams(pid, asOfTime);
} catch (Exception ex) {
ex.printStackTrace();
}
return result;
}
}
It's easier to use the client from mediashelf (http://mediashelf.github.com/fedora-client/). Here's an example that searches for objects containing the string foobar in the title:
@Test
public void doTest() throws FedoraClientException {
connect();
FindObjectsResponse response = null;
response = findObjects().pid().title().query("title~foobar").execute(fedoraClient);
List<String> pids = response.getPids();
List<String> titles = new ArrayList<String>();
for (String pid : pids) {
titles.add(response.getObjectField(pid, "title").get(0));
}
assertEquals(7, titles.size());
}
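If you only need the plain search request shown in the question, rather than the SOAP or mediashelf client, a minimal sketch with java.net.HttpURLConnection (the URL is the one from the question; credentials and error handling omitted):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FedoraSearchDemo {
    public static void main(String[] args) throws Exception {
        // REST search from the question: terms=fedora, return the pid and title fields
        URL url = new URL("http://localhost:8080/fedora/search?terms=fedora&pid=true&title=true");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // raw response to parse as needed
            }
        }
    }
}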

Page Expiration in Java (Wicket)

I'm writing a Wicket project for a social network. My project has authentication, so when a user enters the address of the home page he is redirected to the login page like this:
public class MyApplication extends WebApplication {
private Folder uploadFolder = null;
@Override
public Class getHomePage() {
return UserHome.class;
}
public Folder getUploadFolder()
{
return uploadFolder;
}
@Override
protected void init() {
super.init();
// Disable the Ajax debug label!
//getDebugSettings().setAjaxDebugModeEnabled(false);
this.getMarkupSettings().setDefaultMarkupEncoding("UTF-8");
this.getRequestCycleSettings().setResponseRequestEncoding("UTF-8");
mountBookmarkablePage("/BossPage", BossPage.class);
mountBookmarkablePage("/Branch", EditProfile.class);
mountBookmarkablePage("/SA", SuperAdmin.class);
mountBookmarkablePage("/Admin", ir.pnusn.branch.ui.pages.administratorPages.EditProfile.class);
mountBookmarkablePage("/Student", StudentSignUP.class);
mountBookmarkablePage("/Student/Test", StudentSignUpConfirm.class);
mountBookmarkablePage("/Branch/categories.xml", CategoriesXML.class);
get().getPageSettings().setAutomaticMultiWindowSupport(true);
getResourceSettings().setThrowExceptionOnMissingResource(false);
uploadFolder = new Folder("C:\\", "wicket-uploads");
uploadFolder.mkdirs();
this.getSecuritySettings().setAuthorizationStrategy(WiaAuthorizationStrategy.getInstance());
this.getSecuritySettings().setUnauthorizedComponentInstantiationListener(WiaAuthorizationStrategy.getInstance());
addComponentInstantiationListener(new IComponentInstantiationListener() {
public void onInstantiation(final Component component) {
if (!getSecuritySettings().getAuthorizationStrategy().isInstantiationAuthorized(component.getClass())) {
try {
getSecuritySettings().getUnauthorizedComponentInstantiationListener().onUnauthorizedInstantiation(component);
} catch (Exception e) {
System.out.println("ERRORRRRRRR:" + e.toString());
}
}
}
});
}
}
and my WiaAuthorizationStrategy class is shown below; it gets page names and user roles from an XML file named Realm.xml:
/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/
package ir.pnusn.ui.library;
import ir.pnusn.authentication.RealmPolicy;
import ir.pnusn.authentication.ui.pages.Login;
import org.apache.wicket.Component;
import org.apache.wicket.RestartResponseAtInterceptPageException;
import org.apache.wicket.authorization.Action;
import org.apache.wicket.authorization.IAuthorizationStrategy;
import org.apache.wicket.authorization.IUnauthorizedComponentInstantiationListener;
public final class WiaAuthorizationStrategy implements
IAuthorizationStrategy,
IUnauthorizedComponentInstantiationListener {
private RealmPolicy roleManager;
private static WiaAuthorizationStrategy instance;
private WiaAuthorizationStrategy() {
roleManager = RealmPolicy.getInstance();
}
public static WiaAuthorizationStrategy getInstance() {
if(instance == null)
instance = new WiaAuthorizationStrategy();
return instance;
}
public boolean isInstantiationAuthorized(Class componentClass) {
if (ProtectedPage.class.isAssignableFrom(componentClass)) {
if (WiaSession.get().getUser() == null) {
return false;
}
if(!roleManager.isAuthorized(WiaSession.get().getUser().getRole(), componentClass.getName()))//WiaSession.get().isAuthenticated();
{
WiaSession.get().setAccess(false);
return false;
}
else
return true;
}
return true;
}
public void onUnauthorizedInstantiation(Component component) {
throw new RestartResponseAtInterceptPageException(
Login.class);
}
public boolean isActionAuthorized(Component component, Action action) {
//System.out.println("Name:" + component.getClass().getName() + "\n Action:" + action.getName() + "\nUser:" + WiaSession.get().getUser());
if (action.equals(Component.RENDER)) {
if (roleManager.containClass(component.getClass().getName()))
{
if (WiaSession.get().getUser() != null) {
if(!roleManager.isAuthorized(WiaSession.get().getUser().getRole(), component.getClass().getName()))
{
WiaSession.get().setAccess(false);
return false;
}
return true;
}
return false;
}
}
return true;
}
}
In this situation I have a Google Map on one of my protected pages, and because the Google Map needs to read building data from an XML file, I created a servlet that generates it dynamically depending on the username. This servlet is below:
/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/
package ir.pnusn.branch.ui.pages;
import ir.pnusn.branch.database.BranchNotFoundException;
import ir.pnusn.branch.database.DatabaseException;
import ir.pnusn.branch.facade.admin.branchDataEnter.BranchDataSubmitFacade;
import ir.pnusn.branch.facade.admin.branchDataEnter.BuildingBean;
import java.util.Iterator;
import java.util.List;
import org.apache.wicket.PageParameters;
import org.apache.wicket.RequestCycle;
import org.apache.wicket.markup.html.WebPage;
/**
*
* @author mohammad
*/
public class CategoriesXML extends WebPage
{
public CategoriesXML(PageParameters parameters)
{
System.out.println("user " + parameters.getString("user"));
StringBuilder builder = new StringBuilder("<markers>");
List<BuildingBean> buildings;
try
{
buildings = BranchDataSubmitFacade.createBranchDataSubmitFacade(parameters.getString("user")).getBranchSecondPageData();
for (Iterator<BuildingBean> it = buildings.iterator(); it.hasNext();)
{
BuildingBean buildingBean = it.next();
builder.append("<marker lat=\"");
builder.append(buildingBean.getLatit());
builder.append("\" lng=\"");
builder.append(buildingBean.getLongit());
builder.append("\"");
builder.append(" address=\"");
builder.append(buildingBean.getDesctiption());
builder.append("\" category=\"branch\" name=\"");
builder.append(buildingBean.getBuildingName());
builder.append("\"/>");
}
builder.append("</markers>");
}
catch (DatabaseException ex)
{
builder = new StringBuilder("<markers></markers>");
}
catch (BranchNotFoundException ex)
{
builder = new StringBuilder("<markers></markers>");
}
RequestCycle.get().getResponse().println(builder.toString());
/*"<markers>" +
"<marker lat=\"35.69187\" lng=\"51.413269\" address=\"Some stuff to display in the First Info Window\" category=\"branch\" name=\"gholi\"/>" +
"<marker lat=\"52.91892\" lng=\"78.89231\" address=\"Some stuff to display in the Second Info Window\" category=\"branch\" name=\"taghi\"/>" +
"<marker lat=\"40.82589\" lng=\"35.10040\" address=\"Some stuff to display in the Third Info Window\" category=\"branch\" name=\"naghi\"/>" +
"</markers> "**/
}
}
At first I made this page protected, so the user had to be logged in to access this XML, but after a lot of debugging I found that the Google Map cannot log in to my system, so instead of parsing the data XML it parses the login page HTML as its input. So I changed CategoriesXML to extend WebPage.
But now I have another problem:
When I go to the Google Map page on my social site, I cannot refresh the page because it expires, and so I cannot add another building to my data XML.
What should I do?
Tell me if you need more code or information.
In trying to close this question, which the OP seems to have pretty much abandoned, I'll just post my old comment as an answer:
In your CategoriesXML I would highly advise against building your tags as Strings and adding them to the page like that. See if you can work this into your markup (.xml?) file instead, as that's the Wicket way to do things (and as such just might solve the problems you're having).
Would making this extend org.apache.wicket.Page instead, and associating markup that returns XML instead of HTML, work? Basically, mimicking a WebPage but implementing it for XML.
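As an illustration of the advice above about not concatenating tags as Strings, a minimal sketch that builds the same <markers> document with the JDK DOM API, so attribute values are escaped automatically (BuildingBean and its getters are taken from the question):
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class MarkersXmlBuilder {
    // Builds the <markers> document from the question without string concatenation
    public static String toXml(List<BuildingBean> buildings) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element markers = doc.createElement("markers");
        doc.appendChild(markers);
        for (BuildingBean b : buildings) {
            Element marker = doc.createElement("marker");
            marker.setAttribute("lat", String.valueOf(b.getLatit()));
            marker.setAttribute("lng", String.valueOf(b.getLongit()));
            marker.setAttribute("address", b.getDesctiption());
            marker.setAttribute("category", "branch");
            marker.setAttribute("name", b.getBuildingName());
            markers.appendChild(marker);
        }
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}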
