Handling non-english URL with Spring and App Engine Task Queue - java

I have this problem where I need to queue a page link with TaskQueue:
Queue queue = QueueFactory.getDefaultQueue();
for (String href : hrefs){
href = baseUrl + href;
pageLinks = pageLinks + "\n" + href;
queue.add(TaskOptions.Builder
.withUrl("/crawler")
.param("url", href));
l("Added to queue url=["+href+"]");
}
I think the problem is that the URL passed into the queue contains ?'s in place of the Arabic characters, because the task keeps getting rescheduled.
However, the String pageLinks is output in the browser through Spring MVC, and I can see the Arabic characters displayed properly, so I'm pretty sure the links are OK.
If I copy one of the links from the browser output and paste it into the browser's address bar, it works fine. So I'm pretty sure the queue keeps rescheduling because it gets the wrong URL.
What could I be missing here? Do I need to convert the String href before passing it into the queue?
The crawl service looks like this:
@RequestMapping(method = RequestMethod.GET, value = "/crawl",
produces = "application/json; charset=iso-8859-6")
public @ResponseBody String crawl(HttpServletRequest req, HttpServletResponse res,
@RequestParam(value="url", required = false) String url) {
l("Processs url:" + url);
}
Also, do I need to convert the @QueryParam String url here to Arabic or not?

You must URL-encode the parameters. See this question: Java URL encoding of query string parameters
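As a minimal sketch of what URL-encoding the parameter means, the standard library's URLEncoder can percent-encode the Arabic characters as UTF-8 before the value is handed to the queue. The class and method names below (QueryParamEncoding, encodeParam, decodeParam) and the sample link are hypothetical, and the Charset overloads assume Java 10 or later. Whether TaskOptions.param already encodes its value is not confirmed here, so check for double encoding before applying this.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the App Engine or Spring APIs.
final class QueryParamEncoding {

    // Percent-encodes a query parameter value as UTF-8 (form encoding: a space becomes '+').
    public static String encodeParam(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Reverses encodeParam on the receiving side.
    public static String decodeParam(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed sample link containing Arabic characters.
        String href = "http://example.com/\u0645\u0631\u062D\u0628\u0627";
        String encoded = encodeParam(href);
        System.out.println(encoded);
        // The round trip should give back the original link.
        System.out.println(decodeParam(encoded).equals(href));
    }
}
```

The encoded value contains only ASCII characters, so it can no longer be damaged by a component that assumes a single-byte charset.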

Related

Java - Special character is being encoded twice

I am writing a test in Java that sends a request to an api. The url includes a # which should be getting encoded as %23.
However I can't get that %23 to show up accurately in the url.
public static void addRecords(JSONObject creds, String subsection, String jsonFile) {
String body = readFile("files/" + jsonFile);
given()
.header(CONTENT_TYPE, JSON)
.header(AUTHORIZATION, BEARER_TOKEN + creds.getString(ACCESS_TOKEN))
.body(body)
.post(BASE_API_URL_V1 + "/setup/" + subsection + "/records")
.then()
.statusCode(SC_ACCEPTED);
}
When I run the call below, the URL becomes /v1/setup/test, so everything from the # onwards is removed.
addRecords(creds, "test#v2", VALID_DATA_JSON);
When I run the call below, the URL becomes /v1/setup/test%2523/records, so the % sign itself gets encoded.
addRecords(creds, "test%23v2", VALID_DATA_JSON);
What I need is /v1/setup/test%23v2/records
given()
.urlEncodingEnabled(false)
Adding this flag allowed me to hardcode the url without worrying about the library encoding the % when executing addRecords(creds, "test%23v2", VALID_DATA_JSON);
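The double encoding described above can be reproduced with the standard library alone. This is only a sketch: URLEncoder stands in for whatever encoder the HTTP client applies internally, and the class and method names (DoubleEncodingDemo, encodeOnce) are hypothetical. The Charset overload assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; URLEncoder is a stand-in for the client's own encoder.
final class DoubleEncodingDemo {

    // Applies one round of percent-encoding.
    public static String encodeOnce(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String once = encodeOnce("test#v2");   // '#' becomes %23
        String twice = encodeOnce(once);       // '%' becomes %25, giving %2523
        System.out.println(once);
        System.out.println(twice);
    }
}
```

Passing an already-encoded value to a client that encodes again yields the second form, which is why the encoding has to happen in exactly one place.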

How to deal with dot in an url path in writing service

I am writing a GET endpoint that looks like the following:
@RequestMapping(value = "/{configSetId}/{version}", method = RequestMethod.GET, produces = { "application/json" })
public ResponseEntity<List<Metadata>> getMetadatasByConfigSetIdAndVersion(
@PathVariable("configSetId") final String configSetId,
@PathVariable("version") final String version) {
return ResponseEntity.ok(metadataService.getMetadatasByConfigSetIdAndVersion(configSetId, version));
}
So I can send a GET request to localhost:8080/{configSetId}/{version}, for example: localhost:8080/configSet1/v1
But the problem is that if the version is "v1.02", the ".02" is ignored and the version I get is "v1". How can I avoid this behavior? Thank you!
Since "." is a special character, don't use it directly in your request.
Instead of
v1.02
try
v1%2E02
where %2E is the URL encoding of ".".
For more information, please refer to this link: HTML URL Encoding
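To illustrate the encoding itself, here is a small sketch using only the standard library. Note that URLEncoder leaves "." untouched, so the replacement has to be done by hand; the class and method names (DotEncodingDemo, encodeDots, decode) are hypothetical. Whether a framework matches the path before or after decoding it is not shown here, so treat this purely as a demonstration of what %2E stands for.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the %2E encoding of a dot.
final class DotEncodingDemo {

    // Manually percent-encodes every dot, since URLEncoder treats '.' as safe.
    public static String encodeDots(String value) {
        return value.replace(".", "%2E");
    }

    // Decodes percent-escapes back to plain text.
    public static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(URLEncoder.encode("v1.02", StandardCharsets.UTF_8)); // dot is left as-is
        System.out.println(encodeDots("v1.02"));
        System.out.println(decode("v1%2E02"));
    }
}
```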

How to get the browser name alone from client in java?

I tried using
String userAgent=req.getHeader("user-agent");
and also the following
@GET
@Path("/get")
public Response addUser(@HeaderParam("user-agent") String userAgent) {
return Response.status(200)
.entity("addUser is called, userAgent : " + userAgent)
.build();
}
But I need only the browser name, such as Chrome, Firefox, or IE. Please help if anyone knows.
UPDATE: Got the answer:
public String browser(@HeaderParam("user-agent") String userAgent){
UserAgent browserName = UserAgent.parseUserAgentString(userAgent);
String browser=browserName.toString();
System.out.println(browser);
return browser; // the method is declared to return a String
}
Getting information out of user agent strings is somewhat of a black art. Easiest is probably to use a library to parse the user agent string and extract the needed information.
I've used UADetector in the past with good results, but there are undoubtedly other libraries out there.
The following sample is from the UADetector documentation:
UserAgentStringParser parser = UADetectorServiceFactory.getResourceModuleParser();
ReadableUserAgent agent = parser.parse(request.getHeader("User-Agent"));
out.append("You're a <em>");
out.append(agent.getName());
out.append("</em> on <em>");
out.append(agent.getOperatingSystem().getName());
out.append("</em>!");

How to read the public URL in GWT?

I am new to GWT and am building a web application in which I have to create a public URL.
In this public URL I have to pass a hashtag (#) and some parameters.
I am having difficulty with two tasks:
Extracting the hashtag from the URL.
Extracting the userid from the URL.
My public URL example is: http://www.xyz.com/#profile?userid=10003
To access the URL in GWT you can use the History.getToken() method. It will give you the entire string that follows the hashtag ("#").
In your case (http://www.xyz.com/#profile?userid=10003) it will return the string "profile?userid=10003". Once you have this, you can parse it however you want. You can check whether it contains "?" and split it on "?", or you can take a substring. How you get the information out of it is really up to you.
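One possible way to do that parsing is sketched below in plain Java, taking the token string as input so that it does not depend on GWT at all. The class and method names (HistoryTokenParser, placeOf, paramsOf) are hypothetical, and the sketch assumes the token has the form place?key=value&key=value without any percent-encoded characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for a token such as "profile?userid=10003".
final class HistoryTokenParser {

    // Returns the part before '?', or the whole token if there is no '?'.
    public static String placeOf(String token) {
        int q = token.indexOf('?');
        return q < 0 ? token : token.substring(0, q);
    }

    // Returns the key=value pairs after '?', in the order they appear.
    public static Map<String, String> paramsOf(String token) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = token.indexOf('?');
        if (q < 0 || q == token.length() - 1) {
            return params; // no query part
        }
        for (String pair : token.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skip pairs without a key
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        String token = "profile?userid=10003"; // what History.getToken() is described as returning
        System.out.println(placeOf(token));
        System.out.println(paramsOf(token).get("userid"));
    }
}
```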
I guess you already have the URL. I'm not that good at Regex, but this should work:
String yourURL = "http://www.xyz.com/#profile?userid=10003";
String[] array = yourURL.split("[\\p{Lower}\\p{Upper}\\p{Punct}}]");
int userID = 0;
for (String string : array) {
if (!string.isEmpty()) {
userID = Integer.valueOf(string);
}
}
System.out.println(userID);
To get the parameters:
String userId = Window.Location.getParameter("userid");
To get the anchor / hash tag:
I don't think there is a dedicated method for this; you can parse the URL yourself. Look at the methods provided by Window.Location.

Java Webservice URL in JSF

I created a JSF application which also offers some Webservices. The webservices are created via annotations.
Now I want to create a webserviceInfo.xhtml page where I get all the needed webservice information.
When I go to the address http://our.server.com/application/OurWebserviceName, I get all the information needed to access the webservice (this info page is generated automatically by Glassfish).
To include this page, I did the following in the webserviceInfo.xhtml:
<iframe scrolling="automatic" width="971" height="1000" src="#{myBean.generateUrlToWebservice()}/OurWebserviceName"/>
Where:
public String generateUrlToWebservice(){
FacesContext fc = FacesContext.getCurrentInstance();
String servername = fc.getExternalContext().getRequestServerName();
String port = String.valueOf(fc.getExternalContext().getRequestServerPort());
String appname = fc.getExternalContext().getRequestContextPath();
return "http://"+servername+":"+port+appname;
}
Is there a more elegant solution to this?
BR, Rene
Use a page-relative URL.
<iframe src="OurWebserviceName"></iframe>
Or make use of the <base> tag with a little help from the JSTL functions taglib.
<html xmlns:fn="http://java.sun.com/jsp/jstl/functions">
...
<base href="#{fn:replace(request.requestURL, request.requestURI, '')}#{request.contextPath}"></base>
This way any URL which doesn't start with a scheme or / is always relative to this URL.
Or, if you really need to do it in JSF, the following causes fewer headaches with scheme and port.
HttpServletRequest request = (HttpServletRequest) externalContext.getRequest();
return request.getRequestURL().toString().replace(request.getRequestURI(), "") + request.getContextPath();
It would be better to get all the parts dynamically, such as the protocol (http, https, ...) and the page path (after the app name):
public String generateUrlToWebservice(){
FacesContext fc = FacesContext.getCurrentInstance();
ExternalContext exContext = fc.getExternalContext();
String servername = exContext.getRequestServerName();
String port = String.valueOf(exContext.getRequestServerPort());
String appname = exContext.getRequestContextPath();
String protocol = exContext.getRequestScheme();
String pagePath = exContext.getInitParameter("pagePath"); //read it from web.xml
return protocol +"://"+servername+":"+port+appname+pagePath;
}
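One detail the string concatenation above does not handle is the default port: http://host:80/app is valid but noisier than http://host/app. The following is a minimal sketch of a helper that omits the port when it is the default for the scheme. The class and method names (BaseUrlBuilder, baseUrl) are hypothetical, and the values passed in are assumed to come from the ExternalContext getters used above.

```java
// Hypothetical helper; the arguments are assumed to come from ExternalContext.
final class BaseUrlBuilder {

    // Builds scheme://host[:port]contextPath, leaving out the port when it is the scheme's default.
    public static String baseUrl(String scheme, String host, int port, String contextPath) {
        boolean defaultPort = ("http".equals(scheme) && port == 80)
                || ("https".equals(scheme) && port == 443);
        StringBuilder sb = new StringBuilder(scheme).append("://").append(host);
        if (!defaultPort) {
            sb.append(':').append(port);
        }
        return sb.append(contextPath).toString();
    }

    public static void main(String[] args) {
        System.out.println(baseUrl("http", "our.server.com", 80, "/application"));
        System.out.println(baseUrl("https", "our.server.com", 8443, "/application"));
    }
}
```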
