I have this example expression:
firstName =:'Mon';lastName =:'Arthur';:or{size >:'20';lastName ^:'H';:and{company |:'lon';:or{company |:'we'}}};lastName =:'aa';:and{length >:'33';:or{color =:'red'};width <:'2'};date <:'2012';:!{source =:'dictionary,locale'}
and the regex must match:
:or{size >:'20';lastName ^:'H';:and{company |:'lon';:or{company |:'we'}}}
:and{length >:'33';:or{color =:'red'};width <:'2'}
:!{source =:'dictionary,locale'}
So the regex must match expressions that start with ':[any characters]{' and end with '}'. The expression between the curly braces may itself contain inner expressions of the same form.
I tried to write something:
https://regex101.com/r/gM3dR9/13
and the result is:
:or{size >:'20';lastName ^:'H';:and{company |:'lon';:or{company |:'we'} - OK
:and{length >:'33';:or{color =:'red'} - MISSING: ;width <:'2'}
:!{source =:'dictionary, locale'} -OK
I tried to work out a solution that fits your example and the requirements you described, but I'm not sure I understood them entirely:
(?:;:)(\S+(?:{.*?}(?=[^}]*$|;[^}]*;:)))
This uses a positive lookahead to ensure that the last closing brace is caught correctly (it has to be followed either by the end of the string or by another ;:).
If your match can occur at the beginning of the string, and is therefore not preceded by ;:, you could change the part (?:;:) to (?:^|;:)
Here is the link for Regex101: https://regex101.com/r/dV8uI4/1
Try this regex:
(:or{.*?\};{1,})|(:and{.*\};)|(:!{.*?\};{0,})
I can't guarantee that it handles more complex cases, but it produces exactly the output you mentioned, except for an extra ';'.
"firstName =:'Mon';lastName =:'Arthur';:or{size >:'20';lastName ^:'H';:and{company |:'lon';:or{company |:'we'}}};lastName =:'aa';:and{length >:'33';:or{color =:'red'};:width <:'2'};date <:'2012';:!{source =:'dictionary,locale'}".match(/(:or{.*?\};{1,})|(:and{.*\};)|(:!{.*?\};{0,})/g)
Output
[":or{size >:'20';lastName ^:'H';:and{company |:'lon';:or{company |:'we'}}};", ":and{length >:'33';:or{color =:'red'};:width <:'2'};", ":!{source =:'dictionary,locale'}"]
Formatted output
[
":or{size >:'20';lastName ^:'H';:and{company |:'lon';:or{company |:'we'}}};",
":and{length >:'33';:or{color =:'red'};:width <:'2'};",
":!{source =:'dictionary,locale'}"
]
tested Here - Java RegEx Tester
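Since the groups can nest to any depth, and java.util.regex has no recursive patterns, another option is to skip the regex and count braces. The following is only a minimal sketch of that different technique, not the approach of either answer above. The class name GroupExtractor and its methods are made up for illustration, and it assumes a group name never contains braces, quotes, semicolons or whitespace, and that braces inside '...' values should be ignored.

```java
import java.util.ArrayList;
import java.util.List;

final class GroupExtractor {

    // Assumption: a group name is a run of characters that are not
    // braces, quotes, semicolons or whitespace (e.g. "or", "and", "!").
    private static boolean isNameChar(char c) {
        return c != '{' && c != '}' && c != '\'' && c != ';' && !Character.isWhitespace(c);
    }

    // Returns every top-level ":name{...}" group; nested groups stay inside its text.
    static List<String> topLevelGroups(String input) {
        List<String> groups = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = input.indexOf(':', i);
            if (start < 0) {
                break;
            }
            int open = start + 1;
            while (open < input.length() && isNameChar(input.charAt(open))) {
                open++;
            }
            boolean isGroupStart = open > start + 1
                    && open < input.length()
                    && input.charAt(open) == '{';
            int end = isGroupStart ? matchingBrace(input, open) : -1;
            if (end < 0) {
                i = start + 1; // not a group (e.g. the ':' in "=:'Mon'"), keep scanning
                continue;
            }
            groups.add(input.substring(start, end + 1));
            i = end + 1;
        }
        return groups;
    }

    // Index of the '}' that closes the '{' at position open, or -1 if unbalanced.
    private static int matchingBrace(String s, int open) {
        int depth = 0;
        boolean quoted = false;
        for (int j = open; j < s.length(); j++) {
            char c = s.charAt(j);
            if (c == '\'') {
                quoted = !quoted; // braces inside '...' values are ignored
            } else if (!quoted && c == '{') {
                depth++;
            } else if (!quoted && c == '}' && --depth == 0) {
                return j;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String expr = "firstName =:'Mon';lastName =:'Arthur';:or{size >:'20';lastName ^:'H';"
                + ":and{company |:'lon';:or{company |:'we'}}};lastName =:'aa';"
                + ":and{length >:'33';:or{color =:'red'};width <:'2'};date <:'2012';"
                + ":!{source =:'dictionary,locale'}";
        topLevelGroups(expr).forEach(System.out::println);
    }
}
```

Run against the expression from the question, this should print the three groups the question asks for, one per line and without the trailing ';'.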
I need a regex for the string that represents the date, [07/Mar/2014:22:12:28 -0800], in the following line:
64.242.88.10 – – [07/Mar/2014:22:12:28 -0800] “GET /twiki/bin/attach/TWiki/WebSearch HTTP/1.1” 401 12846
If your string doesn't have any other content in square brackets besides this, then:
\[.*?]
Regex101 Demo
Details
\[ - opening bracket (escaped because [ is a meta-character)
.*? - non-greedy match-all
] - closing bracket (doesn't need escaping)
When adapting this for use in a Java program, you'll need to escape the backslash too:
Pattern.compile("\\[.*?]");
Try this:
\[[0-9]{1,2}\/[a-zA-Z]+\/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} -[0-9]{4}]
Short (greedy) version, since the date is enclosed by [ ]:
\[.*]
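For completeness, here is a rough sketch of how the non-greedy pattern could be used from Java to pull out the bracketed text and, optionally, turn it into a date. The class name LogTimestamp and the format string dd/MMM/yyyy:HH:mm:ss Z are my own assumptions based on the sample line, so adjust them if your logs differ.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogTimestamp {

    // Same idea as \[.*?] but with a capture group so the brackets are dropped.
    private static final Pattern BRACKETED = Pattern.compile("\\[(.*?)]");

    // Assumed layout of the timestamp, taken from the sample log line.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    // Returns the text of the first [...] section, if the line has one.
    static Optional<String> extract(String line) {
        Matcher m = BRACKETED.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Parses a timestamp such as "07/Mar/2014:22:12:28 -0800".
    static OffsetDateTime parse(String timestamp) {
        return OffsetDateTime.parse(timestamp, FORMAT);
    }

    public static void main(String[] args) {
        String line = "64.242.88.10 - - [07/Mar/2014:22:12:28 -0800] "
                + "\"GET /twiki/bin/attach/TWiki/WebSearch HTTP/1.1\" 401 12846";
        extract(line).map(LogTimestamp::parse).ifPresent(System.out::println);
    }
}
```

Using find() rather than matches() matters here, because the pattern only describes the bracketed part and not the whole line.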
package com.j;
public class Program {
public static void main(String[] args) {
System.out.println(Puzzel.class.getName().replaceAll(".", "/")
+ ".class");
System.out.println(Program.class.getName());
}
}
In the above program I was expecting the output com/j/Program.class,
but it prints //////.class. Why?
In replaceAll, the first argument is treated as a regular expression, in which . means "any character". Every character is therefore replaced with /, so the output becomes
////////////.class
For the expected answer, escape the . in the expression:
System.out.println(Puzzel.class.getName().replaceAll("\\.", "/") + ".class");
Then the output will be what you expected:
com/j/Puzzel.class
Because . is a special character in regex. You should escape it with a backslash.
replaceAll() takes a regular expression as its first argument. Your code says to replace every character (.) with a /. You need replaceAll("\\.", "/"): in Java source, "\\." is the two-character regex \. that matches a literal dot. "\\\\." would be one escape too many, since it matches a literal backslash followed by any character.
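If counting backslashes is the annoying part, there are ways to avoid it altogether. Below is a small sketch comparing three variants; the class ClassPathDemo and its method names are just illustrative. String.replace(char, char) does not use regex at all, and Pattern.quote turns any text into a literal pattern.

```java
import java.util.regex.Pattern;

final class ClassPathDemo {

    // Regex with the dot escaped by hand.
    static String withEscapedRegex(String className) {
        return className.replaceAll("\\.", "/") + ".class";
    }

    // Regex where Pattern.quote does the escaping.
    static String withQuotedRegex(String className) {
        return className.replaceAll(Pattern.quote("."), "/") + ".class";
    }

    // No regex at all: replace(char, char) treats '.' literally.
    static String withLiteralReplace(String className) {
        return className.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        String name = "com.j.Puzzel";
        System.out.println(withEscapedRegex(name));
        System.out.println(withQuotedRegex(name));
        System.out.println(withLiteralReplace(name));
    }
}
```

For a single-character substitution like this one, replace('.', '/') is probably the simplest choice.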
I am trying to get a regular expression written that will capture what I'm trying to match in Java, but can't seem to get it.
This is my latest attempt:
Pattern.compile( "[A-Za-z0-9]+(/[A-Za-z0-9]+)*/?" );
This is what I want to match:
hello
hello/world
hello/big/world
hello/big/world/
This is what I don't want matched:
/
/hello
hello//world
hello/big//world
I'd appreciate any insight into what I am doing wrong :)
Try this regex:
Pattern.compile( "^[A-Za-z0-9]+(/[A-Za-z0-9]+)*/?$" );
Doesn't your regex require a question mark at the end?
I always write unit tests for my regexes so I can fiddle with them until they pass.
// your exact regex:
final Pattern regex = Pattern.compile( "[A-Za-z0-9]+(/[A-Za-z0-9]+)*/?" );
// your exact examples:
final String[]
good = { "hello", "hello/world", "hello/big/world", "hello/big/world/" },
bad = { "/", "/hello", "hello//world", "hello/big//world"};
for (String goodOne : good) System.out.println(regex.matcher(goodOne).matches());
for (String badOne : bad) System.out.println(!regex.matcher(badOne).matches());
prints a solid column of true values.
Put another way: your regex is perfectly fine just as it is.
It looks like what you're trying to capture is being overwritten on each quantified iteration. Just change the arrangement of the parentheses.
# "[A-Za-z0-9]+((?:/[A-Za-z0-9]+)*)/?"
[A-Za-z0-9]+
( # (1 start)
(?: / [A-Za-z0-9]+ )*
) # (1 end)
/?
Or, with no captures at all:
# "[A-Za-z0-9]+(?:/[A-Za-z0-9]+)*/?"
[A-Za-z0-9]+
(?: / [A-Za-z0-9]+ )*
/?
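One thing that may explain the different experiences in this thread: whether the unanchored pattern "works" depends on which Matcher method is called. matches() requires the whole input to match, while find() is satisfied by any matching substring, which is when the ^...$ anchors become necessary. A small sketch to illustrate (the class PathPatternDemo and its method names are mine, not from the question):

```java
import java.util.regex.Pattern;

final class PathPatternDemo {

    // The unanchored pattern from the question, with a non-capturing group.
    private static final Pattern PATH =
            Pattern.compile("[A-Za-z0-9]+(?:/[A-Za-z0-9]+)*/?");

    // True only if the entire string is a path of the wanted form.
    static boolean isWholePath(String s) {
        return PATH.matcher(s).matches();
    }

    // True if any substring has the wanted form.
    static boolean containsPath(String s) {
        return PATH.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"hello/big/world/", "hello//world", "/hello", "/"};
        for (String s : samples) {
            System.out.println(s + " matches=" + isWholePath(s) + " find=" + containsPath(s));
        }
    }
}
```

So if the original code used find() (or String.replaceAll and the like), "hello//world" would be reported as a hit because "hello/" on its own fits the pattern.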
If I have this:
thisisgibberish 1234 /hello/world/
more gibberish 43/7 /good/timing/
just onemore 8888 /thanks/mate
what would the regular expression inside the Java String.split() method be to obtain the path on each line?
i.e.
[0]: /hello/world/
[1]: /good/timing/
[2]: /thanks/mate
Doing
myString.split("\/[a-zA-Z]")
causes the splits to occur at every /h, /w, /g, /t, and /m.
How would I go about writing a regular expression to split it only once per line while only capturing the paths?
Thanks in advance.
Why split? I think running a match is better here. Try the following expression:
(?<=\s)/[a-zA-Z/]+
Regex101 Demo
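A rough sketch of the matching approach in Java, assuming each path is preceded by whitespace and consists only of letters and slashes, as in the sample input. The class name PathFinder and the method findPaths are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PathFinder {

    // A '/' that follows whitespace, then any run of letters and slashes.
    private static final Pattern PATH = Pattern.compile("(?<=\\s)/[a-zA-Z/]+");

    // Collects every path found in the text, in order of appearance.
    static List<String> findPaths(String text) {
        List<String> paths = new ArrayList<>();
        Matcher m = PATH.matcher(text);
        while (m.find()) {
            paths.add(m.group());
        }
        return paths;
    }

    public static void main(String[] args) {
        String text = "thisisgibberish 1234 /hello/world/\n"
                + "more gibberish 43/7 /good/timing/\n"
                + "just onemore 8888 /thanks/mate";
        System.out.println(findPaths(text));
    }
}
```

The lookbehind is what keeps the "/7" in "43/7" from being picked up, because that slash follows a digit rather than whitespace.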
This uses split() :
String[] split = myString.split(myString.substring(0, myString.lastIndexOf(" ")));
OR
myString.split(myString.substring(0, myString.lastIndexOf(" ")))[1]; //works for current inputs
You must first remove the leading junk, then split on the intervening junk:
String[] paths = str.replaceAll("^.*? (?=/[a-zA-Z])", "")
.split("(?m)((?<=[a-zA-Z]/|[a-zA-Z])\\s|^).*? (?=/[a-zA-Z])");
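One important point here is the use of (?m), the multiline flag, which makes ^ match at the start of every line rather than only at the start of the input. (The "dot matches newline" switch is (?s), which is not used here; the newlines are consumed by \s instead.)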
Here's some test code:
String str = "thisisgibberish 1234 /hello/world/\nmore gibberish 43/7 /good/timing/\njust onemore 8888 /thanks/mate";
String[] paths = str.replaceAll("^.*? (?=/[a-zA-Z])", "")
.split("(?m)((?<=[a-zA-Z]/|[a-zA-Z])\\s|^).*? (?=/[a-zA-Z])");
System.out.println( Arrays.toString( paths));
Output (achieving requirements):
[/hello/world/, /good/timing/, /thanks/mate]
I need to get all non-empty strings that start with # and end with ' ' (space) in the String below:
String s = "#test1 #test2 #test3 #test4 ## #test5";
I hope to get all of the strings "test1", "test2", "test3", "test4", "test5".
How can I do this with Java regex? Thanks a lot!
You can use the following regex
#\w+
\w is similar to [a-zA-Z\d_]
\w+ matches 1 to many characters which are from [a-zA-Z\d_]
The Java regex (?<=#)[^# ]+(?= ) should do the trick. According to Regex Planet's Java regex page that regex matches test1, test2, test3 and test4. (#test5 does not end with a space, so test5 is not matched.)
If you're OK with matching the leading #s and trailing spaces as well, you can get away with the simpler Java regex "#[^# ]+ " (note the trailing space).
Finally I solved it with the code below:
Pattern pattern = Pattern.compile("#\\p{L}+");
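One caveat with \p{L}: it matches letters only, so on "#test1" it would stop before the digit and give "#test". If the digits should be kept and the leading # dropped, something along these lines might be closer to what the question asks for. This is only a sketch; the class HashTags and the method names are invented for the example, and it uses the #\w+ idea from the first answer with a capture group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HashTags {

    // '#' followed by one or more word characters; group 1 is the name without '#'.
    private static final Pattern TAG = Pattern.compile("#(\\w+)");

    // Returns the tag names in order, skipping empty tags such as "##".
    static List<String> names(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = TAG.matcher(s);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(names("#test1 #test2 #test3 #test4 ## #test5"));
    }
}
```

Note that this version does not require the trailing space, which is why test5 at the end of the string is included.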