java regular expressions matching specific urls

I once built a program in PHP that used very specific regular expressions to match links. However, that pattern doesn't seem to work in Java. I'm trying to find the Java equivalent of
"~http://(bit.ly|t.co)~"
In PHP this would match links such as http://t.co/UURRNlrK and http://bit.ly/AenG5W. What would be the Java equivalent?

I think you are looking for
String str = "http://t.co/UURRNlrK";
// Backslashes are doubled: "\\." in a Java string literal is the regex \. (a literal dot).
String p = "(http://(t\\.co|bit\\.ly).*)";
Pattern pattern = Pattern.compile(p);
Matcher matcher = pattern.matcher(str);
if (matcher.find())
    System.out.println(matcher.group(0));
Output: http://t.co/UURRNlrK
If str = "http://bit.ly/AenG5W", the output is http://bit.ly/AenG5W
Here is a nice Regex Tutorial for Java.

http://(bit\.ly|t\.co)/\w*
I think this one would give the same result as the ones above. In a Java string literal, each backslash has to be written twice.

I tried this:
String str = "http://bit.ly/asdfsd";
if (str.matches("http://(bit\\.ly|t\\.co).+")) {
    System.out.println("hurray");
}
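The answers above differ mainly in whether they call find() or matches(). The following is a minimal sketch of that difference, not code from any of the answers; the class name ShortLinkCheck and its two helper methods are made up for illustration.

```java
import java.util.regex.Pattern;

class ShortLinkCheck {
    // Dots are escaped twice: once for the Java string literal, once for the regex.
    private static final Pattern SHORT_LINK =
            Pattern.compile("http://(bit\\.ly|t\\.co)/\\w+");

    // matches() succeeds only if the whole input is a short link.
    static boolean isShortLink(String s) {
        return SHORT_LINK.matcher(s).matches();
    }

    // find() succeeds if a short link occurs anywhere in the input.
    static boolean containsShortLink(String s) {
        return SHORT_LINK.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isShortLink("http://t.co/UURRNlrK"));           // true
        System.out.println(isShortLink("see http://bit.ly/AenG5W"));       // false
        System.out.println(containsShortLink("see http://bit.ly/AenG5W")); // true
    }
}
```

PHP's preg_match behaves like find() here, which is why the PHP pattern needed no leading or trailing wildcard.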

Related

Pattern (string) allows characters only one time

I want to check whether my string contains only allowed characters. Everything works properly for inputs such as 7B, 77B or 7BBBB, but when I input something like 7B7 or 7BB2 it doesn't match.
In other words, it works fine unless a digit is the last character.
Could you tell me what is wrong with this code?
pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}

If you want to mix digits and letters in any order, you need something like:
Pattern pattern = Pattern.compile("[\\da-fA-F]*");
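To see why the original pattern rejects 7B7, it may help to compare the two patterns side by side. This is only a sketch; the class name AllowedCharsCheck and its methods are hypothetical and not part of the question.

```java
import java.util.regex.Pattern;

class AllowedCharsCheck {
    // The question's pattern: digits first, then a-f, then A-F, in that fixed order.
    static final Pattern ORDERED = Pattern.compile("[0-9]*[a-f]*[A-F]*");
    // One character class: digits and hex letters in any order.
    static final Pattern ANY_ORDER = Pattern.compile("[0-9a-fA-F]*");

    static boolean matchesOrdered(String s) {
        return ORDERED.matcher(s).matches();
    }

    static boolean matchesAnyOrder(String s) {
        return ANY_ORDER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesOrdered("7BBBB"));  // true
        System.out.println(matchesOrdered("7B7"));    // false: a digit after a letter
        System.out.println(matchesAnyOrder("7B7"));   // true
        System.out.println(matchesAnyOrder("7BB2"));  // true
    }
}
```

Both patterns also accept the empty string because of the * quantifier; replacing it with + would require at least one character.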

Why not try it this way?
// Compile this pattern: digits, then a-f, then A-F, then digits again.
// The order is still fixed, so an input such as 7B7B is rejected.
Pattern pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*[0-9]*");
// See if this String matches.
Matcher m = pattern.matcher("7BB2");
if (m.matches()) {
    System.out.println(true);
}

Are you trying to verify that the string has only digits and letters and nothing else?
If so, try the following:
pattern = Pattern.compile("^[a-zA-Z\\d]*$");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}

How can I use the $+ or $& regular expressions in Java?

I've been searching for a while trying to figure out why this won't work. I found exactly what I want to accomplish in "Simple regex replace to keep original string", but I can't seem to use the replacement tokens $+ or $& in Java,
like so:
String S1 = "bob";
String S2 = "the builder";
Pattern p = Pattern.compile(S1, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(ST);
ST = m.replaceAll("$+/"+S2);

Use $0 to refer to the whole match in the replacement pattern.
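A minimal sketch of that suggestion, assuming the goal is to append "/" plus a second string after every case-insensitive occurrence of the first. The class and method names are invented for this example, and the sample text is not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeMatchReplace {
    // Appends "/" + suffix after every case-insensitive occurrence of word.
    static String appendAfterMatch(String text, String word, String suffix) {
        // Pattern.quote treats word as literal text rather than as a regex.
        Pattern p = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        // $0 is the whole match; quoteReplacement protects any $ or \ in suffix.
        return p.matcher(text).replaceAll("$0/" + Matcher.quoteReplacement(suffix));
    }

    public static void main(String[] args) {
        System.out.println(appendAfterMatch("Bob and bob", "bob", "the builder"));
        // Bob/the builder and bob/the builder
    }
}
```

Because $0 inserts the text that was actually matched, the original capitalization of each occurrence is preserved.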

Java regex pattern for number

I know that this question may be stupid, but I am trying to get some information from text and you are my last hope after three hours of trying.
DIC: C/40764176 IC: 407641'6
Dekujerne a t8ime se na shledanou
I need to get, for example, 40764176.
I need to get a string of length 8-10. Sometimes it can contain special characters (such as I, i, G, S, O, ó, l), but I have tried a lot of patterns for this and none of them works.
I tried:
String generalDicFormatPattern = "([0-9IiGSOól]{8,10})";
String generalDicFormatPattern = ".*([0-9IiGSOól]{8,10}).*";
String generalDicFormatPattern = "\\b([0-9IiGSOól]{8,10})\\b";
Nothing works. Do you know where the problem is?
edit:
I use regex in this way:
private List<String> getGeneralDicFromLine(String concreteLine) {
List<String> allMatches = new ArrayList<String>();
Pattern pattern = Pattern.compile(generalDicFormatPattern);
Matcher matcher = pattern.matcher(concreteLine);
while (matcher.find()) {
allMatches.add(matcher.group(1));
}
return allMatches;
}

If your string's pattern is fixed you can use the regex
C/([^\s]{8,10})\sIC:
Sample code:
String s = "DIC: C/40764176 IC: 407641'6";
Pattern p = Pattern.compile("C/([^\\s]{8,10})\\sIC:");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println(m.group(1)); // 40764176
}
The group accepts any character except whitespace, which includes the special ones you've shown in your examples.
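If the surrounding text is not fixed, another option is to keep the question's character class and require that the run is not part of a longer one. The sketch below is an assumption about what the asker wants, not a confirmed fix; the class name DicExtractor and the method candidates are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DicExtractor {
    // Digits plus the look-alike letters from the question; \u00f3 is the letter o with an acute accent.
    private static final String ALLOWED = "[0-9IiGSO\u00f3l]";
    // A run of 8-10 allowed characters that is not embedded in a longer run.
    private static final Pattern CANDIDATE = Pattern.compile(
            "(?<!" + ALLOWED + ")(" + ALLOWED + "{8,10})(?!" + ALLOWED + ")");

    // Returns every candidate found in the line, in order of appearance.
    static List<String> candidates(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(candidates("DIC: C/40764176 IC: 407641'6")); // [40764176]
    }
}
```

The lookarounds are used instead of \b because, by default, Java's word boundary may not treat non-ASCII letters such as ó as word characters.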

Maybe you can split your string on whitespace (string.split("\\s")); then you should have an array like this:
DIC:
C/40764176
IC:
407641'6
...
shledanou
Get the second string, split it using "/", and take the second element.
I hope this helps.
Tip: you can check the result afterwards using a regex such as ([0-9IiGSOól]{8,10}).

regular expression - parse classpath location

$JAR_REPO/nlb/grbox/smnt.jar
I want to get the string between $ and the first /; it will then be replaced with some other string.
What is the regex to get JAR_REPO alone from the above?
Can I use a regex so that the pattern match (by any method) returns the string JAR_REPO?
Please help.
Thanks.
Wells

\$([^/]+)/.*
or, as a Java String:
"\\$([^/]+)/.*"
The JAR_REPO String will be the group(1):
Pattern pattern = Pattern.compile("\\$([^/]+)/.*");
Matcher matcher = pattern.matcher(yourstring);
if (matcher.find()) {
String jarRepo = matcher.group(1);
}
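The question also mentions replacing the variable with another string. A rough sketch of both steps follows, assuming the variable always appears at the start of the path; the class name ClasspathVariable, its methods, and the replacement value /opt/jars are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClasspathVariable {
    // A leading "$" followed by everything up to (not including) the first "/".
    private static final Pattern LEADING_VAR = Pattern.compile("^\\$([^/]+)");

    // Returns the variable name without the "$", or null if the path has none.
    static String variableName(String path) {
        Matcher m = LEADING_VAR.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    // Replaces the leading "$NAME" with value; quoteReplacement keeps value literal.
    static String substitute(String path, String value) {
        return LEADING_VAR.matcher(path).replaceFirst(Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String path = "$JAR_REPO/nlb/grbox/smnt.jar";
        System.out.println(variableName(path));            // JAR_REPO
        System.out.println(substitute(path, "/opt/jars")); // /opt/jars/nlb/grbox/smnt.jar
    }
}
```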

This kind of recursive parsing can be handled by combining the Interpreter pattern with a parser.

extract substring in java using regex

I need to extract "URPlus1_S2_3" from the string:
"Last one: http://abc.imp/Basic2#URPlus1_S2_3,"
using regular expression in Java language.
Can someone please help me? I am using regex for the first time.

Try
Pattern p = Pattern.compile("#([^,]*)");
Matcher m = p.matcher(myString);
if (m.find()) {
doSomethingWith(m.group(1)); // The matched substring
}

String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3,";
Matcher m = Pattern.compile("(URPlus1_S2_3)").matcher(s);
if (m.find()) System.out.println(m.group(1));
You gotta learn how to specify your requirements ;)

You haven't really defined what criteria you need to use to find that string, but here is one way to approach based on '#' separator. You can adjust the regex as necessary.
expr: .*#([^,]*)
extract: \1
Go here for syntax documentation:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html

String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3,";
String result = s.replaceAll(".*#", "");
The above returns the full String if there is no "#". There are better ways using regex, but the best solution here uses no regex at all: the classes URL and URI do the job.

Since this is your first time using regular expressions, I would suggest going another way, which is more understandable for now (until you master regular expressions ;) and is easy to modify if you ever need to:
String yourPart = yourString.split("#")[1];

Here's a long version:
String url = "http://abc.imp/Basic2#URPlus1_S2_3,";
String anchor = null;
String ps = "#(.+),";
Pattern p = Pattern.compile(ps);
Matcher m = p.matcher(url);
// find() rather than matches(): the pattern covers only part of the URL.
if (m.find()) {
    anchor = m.group(1);
}
The main point to understand is the use of parentheses: they create groups that can be extracted from a match. In the Matcher object, the group method returns them in order starting at index 1, while index 0 returns the full match.

If you just want everything after the #, use split:
String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3," ;
System.out.println(s.split("#")[1]);
Alternatively, if you want to parse the URI and get the fragment component you can do:
URI u = new URI("http://abc.imp/Basic2#URPlus1_S2_3,");
System.out.println(u.getFragment());
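The answers above behave differently when the input has no "#": split("#")[1] throws an ArrayIndexOutOfBoundsException and replaceAll(".*#", "") returns the whole string. One possible way to make that case explicit is sketched below; the class name FragmentExtractor and the choice of Optional are assumptions for illustration, not something the answers prescribe.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FragmentExtractor {
    // Captures everything after the first "#" up to (not including) the next comma.
    private static final Pattern AFTER_HASH = Pattern.compile("#([^,]*)");

    // Returns the captured text, or an empty Optional if the input has no "#".
    static Optional<String> fragment(String text) {
        Matcher m = AFTER_HASH.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3,";
        System.out.println(fragment(s).orElse("(none)"));              // URPlus1_S2_3
        System.out.println(fragment("no hash here").orElse("(none)")); // (none)
    }
}
```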
