How to determine the final URL from a link in Java

This link was generated by Google Alerts, and I would like to find out where it redirects to. I need to retrieve that target URL with Java. I have checked the response, but it contains no Location redirect header.
https://www.google.com/url?rct=j&sa=t&url=http://naija247news.com/2016/03/nigerian-bond-yields-rise-after-cbns-interest-rate-hike-aimed-at-luring-investors/&ct=ga&cd=CAIyGjA3ZmJiYzk0ZDM0N2U2MjU6Y29tOmVuOlVT&usg=AFQjCNGs7HsYSodEUnECfdAatG6KgY18DA

Maybe something like this:
String URL = "https://www.google.com/url?rct=j&sa=t&url=http://naija247news.com/2016/03/nigerian-bond-yields-rise-after-cbns-interest-rate-hike-aimed-at-luring-investors/&ct=ga&cd=CAIyGjA3ZmJiYzk0ZDM0N2U2MjU6Y29tOmVuOlVT&usg=AFQjCNGs7HsYSodEUnECfdAatG6KgY18DA";
// "url=" is 4 characters long, so the target starts 4 characters past its index
String subStr = URL.substring(URL.indexOf("url=") + 4, URL.indexOf("&ct"));
The start index is the position of "url=" plus 4 (the length of "url="), and the end index is the position of "&ct". The basic idea is to cut out the URL you need and nothing more. This works for the link you posted. For a different link you may have to search for something else to find the end of the substring, because &ct may not follow the target URL there. Look at several of your URLs to work out where to cut.
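As a variation on the substring idea, here is a minimal sketch that reads the url query parameter by name instead of relying on fixed markers such as &ct. The class and method names (RedirectTarget, queryParam) are made up for illustration, and the sketch assumes the target URL either contains no unescaped & or is percent-encoded, and that Java 10 or later is available for URLDecoder.decode(String, Charset).

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class RedirectTarget {

    // Returns the decoded value of the named query parameter, or null if it is absent.
    static String queryParam(String link, String name) {
        String query = URI.create(link).getRawQuery();
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(name)) {
                // Decode percent-escapes in case the target was encoded
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String link = "https://www.google.com/url?rct=j&sa=t&url=http://naija247news.com/2016/03/nigerian-bond-yields-rise-after-cbns-interest-rate-hike-aimed-at-luring-investors/&ct=ga&cd=CAIyGjA3ZmJiYzk0ZDM0N2U2MjU6Y29tOmVuOlVT&usg=AFQjCNGs7HsYSodEUnECfdAatG6KgY18DA";
        System.out.println(queryParam(link, "url"));
    }
}
```

Looking the parameter up by name means the code does not depend on which parameter happens to follow url= in a given link.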

Related

Redisearch query with "begin with" instead of "contains"

I am trying to understand how to perform queries in Redisearch strictly with "begins with" semantics, but I keep getting "contains" results.
For example, if I have fields with values like 'football', 'myfootball', 'greenfootball' and provide a search term like this:
> FT.SEARCH myIdx @myfield:foot*
I want to get just 'football', but I keep getting other fields that contain the word instead of beginning with it.
Is there a way to avoid this?
I tried VERBATIM and things like @myfield:^foot*, but nothing worked.
I am using JRedisearch as a client, but eventually I had to enter the DB and run these queries manually to figure out what is happening. That being said, is this possible with this client at the moment?
Thanks
EDIT
A sample of my index setup:
Client client = new Client(INDEX_NAME, url, PORT);
Schema sc = new Schema().addSortableTextField("url", 1.0); // using this field for query
client.dropIndex(true);
client.createIndex(sc, Client.IndexOptions.Default());
return client;
Sample document:
id: // random uuid
urlPath: myfootbal
application: web
market: Europe

After checking the RDB you provided, I see that searching for foot* does not return myfootbal. The replies look like this: /dot-com/plp/football/x/index.html. You get those replies because the url is tokenized, and '/' is one of the tokenizer's separator characters. If you do not want those urls to be tokenized, declare the field as TAG rather than TEXT. The entire url is then indexed as is, and a search for foot* will no longer return it.
For more information about TAGS see the FT.CREATE documentation: https://oss.redislabs.com/redisearch/Commands.html

How can I get value after hashtag from URL in Java

I have a URL, and I want to show the ID value after the hashtag in my graphical user interface.
For example, given www.site.com/index.php#hello, I want to print the value hello on a label in my GUI.
How can I do this using Java in Netbeans?

A simple solution is getRef() in the URL class:
URL url = new URL("http://www.anyhost.com/index.php#hello");
jLabel.setText(url.getRef());
EDIT: According to @Henry's comment:
I would recommend to use the java.net.URI as it also deals with encoding. The Javadocs say: "Note, the URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use URI, and to convert between these two classes using toURI() and URI.toURL()."
and this comment:
Why not just use uri.getFragment()?
URI uri = new URI("http://www.anyhost.com/index.php#hello");
jLabel.setText(uri.getFragment());
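A small sketch of the URI approach with the label left out, so it runs on its own. The class and method names (FragmentDemo, fragmentOf) are invented for illustration. It also covers two cases the snippets above do not show: getFragment() returns null when the URL has no #, and it decodes percent-escapes.

```java
import java.net.URI;

// Hypothetical helper, not part of any library.
class FragmentDemo {

    // Returns the decoded fragment, or an empty string when the URL has none.
    static String fragmentOf(String link) {
        String fragment = URI.create(link).getFragment();
        return fragment == null ? "" : fragment;
    }

    public static void main(String[] args) {
        System.out.println(fragmentOf("http://www.anyhost.com/index.php#hello"));
        System.out.println(fragmentOf("http://www.anyhost.com/index.php#hello%20world"));
    }
}
```

In a GUI the result could then be passed to jLabel.setText(...) as in the snippets above.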

Use the String.split() method.
public static String getId(String url) {
return url.split("#")[1];
}
String.split() returns an array of Strings that are delimited, or "split", by the value you pass to it, in this case #.
Because you want only the string after the #, you can take the second item of the returned array by adding [1] to the end of the call. Note that this throws an ArrayIndexOutOfBoundsException if the URL contains no #.
For more on String.split() go to Tutorials Point.
By the way, the part of the URL after the # is called the fragment (or fragment identifier). Browsers use it to jump to the element with that ID on a web page.
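If you prefer to stay with split(), a slightly more defensive sketch is shown below. The names FragmentSplit and afterHash are hypothetical. The limit argument 2 keeps everything after the first # together, and the length check avoids the exception when the URL has no # at all.

```java
// Hypothetical helper, not part of any library.
class FragmentSplit {

    // Returns the text after the first '#', or an empty string if there is none.
    static String afterHash(String url) {
        // A limit of 2 splits only at the first '#'
        String[] parts = url.split("#", 2);
        return parts.length == 2 ? parts[1] : "";
    }

    public static void main(String[] args) {
        System.out.println(afterHash("www.site.com/index.php#hello"));
    }
}
```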

Request parameter is modified in my servlet

I send a request URL with data to a servlet, but the data I read from the request in the servlet is not what I sent. How can I make sure the data I pass in the URL reaches the servlet unchanged?
Example: this is the URL I pass to the servlet:
http://localhost/helloservlet/servlet/ppd.abcd.build.coupons.CouponValueFormatterServlet?dsn=frd_abc_abcde&lang=ENG&val=PRCTXT|12345 &ABCDEFG
When I read it in the servlet with String abc = request.getParameter("val"), the val parameter is truncated and comes back as "PRCTXT|12345", but it is supposed to be "PRCTXT|12345 &ABCDEFG". Please help me with this.

The servlet interprets each & in the URL as the start of a new parameter. So when it sees &ABCDEFG, it thinks you are sending a new parameter called ABCDEFG with no value (though this is technically a "keyless value" according to the specifications).
There are two things to fix. First, when you actually want to send an &, use %26 instead. The code that splits up the parameters skips it, and it is converted to a real & in the parameter's value.
Second, replace spaces with +. Spaces in URLs work sometimes but can be problematic.
So your actual request URL should look like this:
http://localhost/helloservlet/servlet/ppd.abcd.build.coupons.CouponValueFormatterServlet?dsn=frd_abc_abcde&lang=ENG&val=PRCTXT|12345+%26ABCDEFG
If you're building these parameters in JavaScript, you can use encodeURIComponent() to escape all problem characters for you. So you could do something like this:
var userInput = 'PRCTXT|12345 &ABCDEFG'; // or any other input
var addr = 'http://www.example.com?param1=' + encodeURIComponent(userInput);
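If the URL is built on the Java side instead, java.net.URLEncoder does the same job for a single parameter value. The sketch below is illustrative only: the class name ParamEncoding is made up, and it assumes Java 10 or later for the Charset overloads of encode and decode.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class ParamEncoding {

    // Encodes one query parameter value: space becomes '+', '&' becomes %26.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Reverses encode(); the servlet container does this for getParameter().
    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "PRCTXT|12345 &ABCDEFG";
        String encoded = encode(raw);
        System.out.println("val=" + encoded); // val=PRCTXT%7C12345+%26ABCDEFG
        System.out.println(decode(encoded));  // PRCTXT|12345 &ABCDEFG
    }
}
```

Encode each value separately and join the pairs with & afterwards. Encoding the whole URL at once would also escape the separators.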

Regex to Extract First Part of URL

I need a java regex to extract parts of a URL.
For example, take the following URLs:
http://localhost:81/example
https://test.com/test
http://test.com/
I would want my regex expression to return:
http://localhost:81
https://test.com
http://test.com
I will be using this in a Java patcher.
This is what I have so far; the problem is that it matches the whole URL:
^https?:\/\/(?!.*:\/\/)\S+

import java.net.URL;
//snip
URL url = new URL(urlString);
return url.getProtocol() + "://" + url.getAuthority();
The right tool for the right job.

Building off your attempt, try this:
^https?://[^/]+
I'm assuming that you want to capture everything up to the first / after http://. (That's what I gathered from your examples. If not, please post some more.)
Are these URLs given as one input, or is each a separate string?
Edit: It was pointed out that there were unnecessary escapes, so I condensed the regex.

Language independent answer:
For the whitespace: replace /^\s+/ with the empty string.
For removing the path information from the URL, if you can assume there aren't any slashes in the path (i.e. you're not dealing with http://localhost:81/foo/bar/baz), replace /\/[^\/]+$/ with the empty string. If there might be more slashes, you might try something like replacing /(^\s*.*:\/\/[^\/]+)\/.*/ with $1.

A simple one: ^(https?://[^/]+)
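For completeness, here is a rough sketch of how the last regex could be applied in Java. The class and method names (UrlPrefix, prefixOf) are invented for illustration. The input is trimmed first, which covers the leading-whitespace case mentioned in the language-independent answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class UrlPrefix {

    // Scheme plus authority: everything up to the first '/' after "://".
    private static final Pattern PREFIX = Pattern.compile("^(https?://[^/]+)");

    // Returns the matched prefix, or null if the input is not an http(s) URL.
    static String prefixOf(String url) {
        Matcher m = PREFIX.matcher(url.trim());
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("http://localhost:81/example")); // http://localhost:81
        System.out.println(prefixOf("https://test.com/test"));       // https://test.com
        System.out.println(prefixOf("http://test.com/"));            // http://test.com
    }
}
```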

How to add a parameter to a URL in GWT?

I have a Java/GWT application that shows a list of items. Clicking an item title opens that item with its full description.
I am using an Anchor for the item title. What I want is that when the user clicks an item title, the id of that item is appended to the current URL.
For example, this is my URL:
"http://127.0.0.1:8888/MyApp.html?gwt.codesvr=127.0.0.1:9997#listItem?list"
and I have to append id to the end of the URL like:
"http://127.0.0.1:8888/MyApp.html?gwt.codesvr=127.0.0.1:9997#listItem?list&itemId=55"

Using Window.Location should do the trick (see its documentation).
Something like this:
String url = Window.Location.getHref();
url = url + "&itemId=" + itemId;
Window.Location.replace(url);
Although of course, as Crollster pointed out, you should insert your URL parameter before the # sign. Please give more details on what exactly you're looking for (why do you have to add the parameter manually, does the page have to reload, ...).

You can use a redirect to add this parameter, where yourUrl stands for the URL you are redirecting to:
response.sendRedirect(yourUrl + "&itemId=55");
Then you can extract this variable.
I hope this will help.

You can try JavaScript: when the user clicks the link, get the current URL, append your id to it, and reconstruct the URL.

See that # in the URL? That's an anchor. Your parameter needs to be added before it, so it looks like this:
http://127.0.0.1:8888/MyApp.html?gwt.codesvr=127.0.0.1:9997&itemId=55#listItem?list
HTH

URIBuilder of Apache HttpComponents offers a convenient method to add parameters and will deal with existing query parameters and anchors.
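Without adding a dependency, the "insert before the #" advice can also be sketched with plain string handling. This is not how URIBuilder works internally, just an illustration. The names QueryParamAppender and addParam are hypothetical, and Java 10 or later is assumed for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GWT or HttpComponents.
class QueryParamAppender {

    // Inserts name=value into the query string, keeping any #fragment at the end.
    static String addParam(String url, String name, String value) {
        int hash = url.indexOf('#');
        String base = hash >= 0 ? url.substring(0, hash) : url;
        String fragment = hash >= 0 ? url.substring(hash) : "";
        // Use '&' if a query string already exists, otherwise start one with '?'
        String separator = base.contains("?") ? "&" : "?";
        return base + separator
                + URLEncoder.encode(name, StandardCharsets.UTF_8) + "="
                + URLEncoder.encode(value, StandardCharsets.UTF_8)
                + fragment;
    }

    public static void main(String[] args) {
        String url = "http://127.0.0.1:8888/MyApp.html?gwt.codesvr=127.0.0.1:9997#listItem?list";
        System.out.println(addParam(url, "itemId", "55"));
    }
}
```

This code targets a regular JVM. In GWT client code the same splitting logic applies, but the URL would come from Window.Location.getHref() and the encoding call may need a GWT-compatible replacement.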
