ReplaceAll regexp to match all patterns, without a specific String - java

I have a String, and I want to replace it:
src="test.jpg" -> src="file://test.jpg"
src="http://xxx...." -> untouched
In fact I replace src=" with src="file:// but I don't want to replace it if it starts with http, e.g. src="http.
So I wrote this regexp to replace src=" with src="file://:
html2.replaceAll("src=\"","src=\"file://");
But the problem is that this also matches src="http.
I don't know how to build the regexp for this. I thought I could do it like this, but it doesn't work:
html2.replaceAll("src=\"[^(http)]","src=\"file:///android_asset/verkehr/");

I think you want a zero-width negative lookahead.
html2.replaceAll("(src=\"(?!http://))", "src=\"file:///");
But beware of other protocols such as https, ftp etc.
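To cover the "other protocols" caveat, one option is to have the lookahead reject anything that looks like a URI scheme rather than only http. The following is a minimal sketch under that assumption; the class name SrcRewriter, the method name rewrite and the scheme pattern are illustrative and not part of the original answer, and a src value that contains a colon for other reasons (a Windows drive letter, for example) would also be skipped.

```java
import java.util.regex.Pattern;

class SrcRewriter {
    // Matches src=" only when it is NOT followed by something that looks like
    // a URI scheme (letters, digits, +, . or - followed by a colon).
    private static final Pattern LOCAL_SRC =
            Pattern.compile("src=\"(?![a-zA-Z][a-zA-Z0-9+.-]*:)");

    static String rewrite(String html) {
        // replaceAll returns a new string; Java strings are immutable,
        // so the result has to be used or assigned by the caller.
        return LOCAL_SRC.matcher(html).replaceAll("src=\"file://");
    }

    public static void main(String[] args) {
        String html = "<img src=\"test.jpg\"> <img src=\"http://xxx/a.jpg\">";
        System.out.println(rewrite(html));
    }
}
```

Note that the original snippets call html2.replaceAll(...) without using the return value; the result needs to be assigned, for example html2 = html2.replaceAll(...).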

You want a negative lookahead.
html2.replaceAll("src=\"(?!http)", ...)

Use regex with negative lookahead:
src=\"(?!http://)

Related

Replace string part with regex pattern

I would like to replace the following string.
img/s/430x250/
The problem is there are variations, like:
img/s/265x200/
or:
img/s/110x73/
So I would like to replace this part in whole, but the numbers are changeable, so how could I make a pattern that replaces it from a string?
Is your goal to match all three of those cases?
If so, this should work: img\/s\/\d+x\d+\/
It searches for img/s/[1 or more digits]x[1 or more digits]/
This regular expression will match your examples
img\/s\/\d+?x\d+?\/
The \/ matches a literal /.
The \d matches a digit 0-9, and the + means one or more. The ? makes it lazy instead of greedy.
The img and s just match literally.
Check out https://regex101.com/ to try out regular expressions. It's much easier than testing them by debugging code. Once you find an expression that works, you can move on to make sure your specific code will perform the same.
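As a rough illustration of the pattern in use, here is a small Java sketch. The question does not say which language it targets or what the replacement should be, so Java, the class name ImagePathReplacer and the replacement "img/l/" are all assumptions. In Java the forward slash needs no escaping, but the backslash in \d must be doubled inside a string literal.

```java
import java.util.regex.Matcher;

class ImagePathReplacer {
    static String replaceSize(String path, String replacement) {
        // quoteReplacement keeps any $ or \ in the replacement literal.
        return path.replaceAll("img/s/\\d+x\\d+/", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "img/l/" is a hypothetical replacement chosen for the example.
        System.out.println(replaceSize("img/s/430x250/photo.jpg", "img/l/"));
        System.out.println(replaceSize("img/s/110x73/photo.jpg", "img/l/"));
    }
}
```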

Regex for matching a string starting with a pattern and not ending with a pattern

Given a URL, I have to check that it starts with a certain domain and does not end with a certain pattern.
For example,
given a list of URLs, I want to match a URL that starts with "http://www.google.com/" or "http://www.facebook.com/" and does not end with ".jpg", ".bmp" or ".png".
I tried something like
^(http://www\.google\.com/|http://www\.facebook\.com/).*(\.(?!png)|(?!bmp)|(?!jpg))$
But it doesn't seem to work. Are there any mistakes in it? Or is there an alternative way?
Something like (?!png)$ is, in general, pretty meaningless; it means "a position that is not followed by png, and that is at the end of the string", but of course the end of a string is never followed by png anyway, so (?!png)$ is equivalent to just $. (Do you see what I mean?)
Java regexes, fortunately, support zero-width lookbehind assertions, so you can write:
^http://www\.(google|facebook)\.com/.*(?<!\.(png|bmp|jpg))$
where (?<!...) means "a position that is not preceded by ...". (See the Javadoc for java.util.regex.Pattern.)
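A minimal Java sketch of the lookbehind approach, for illustration only. The class name UrlFilter and the method name isAllowed are made up for this example, and the pattern includes the slash after .com so that only the two domains from the question are accepted.

```java
import java.util.regex.Pattern;

class UrlFilter {
    // Must start with one of the two domains and must not end in .png, .bmp or .jpg.
    private static final Pattern ALLOWED = Pattern.compile(
            "^http://www\\.(google|facebook)\\.com/.*(?<!\\.(png|bmp|jpg))$");

    static boolean isAllowed(String url) {
        return ALLOWED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("http://www.google.com/search"));   // prints true
        System.out.println(isAllowed("http://www.google.com/logo.png")); // prints false
        System.out.println(isAllowed("http://www.example.com/page"));    // prints false
    }
}
```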
This regex should work:
^https?:\/\/(?:www\.)?(?:google|facebook)\.com\/(?!.*?\.(?:jpe?g|png|bmp|gif)$).*$
Live Demo: http://www.rubular.com/r/V0X6ve1iUT
Java Demo: http://ideone.com/TlCyTG
Try this; it follows your exact requirement:
^(http://www\.google\.com/|http://www\.facebook\.com/)(?!.*?\.(?:jpe?g|png|bmp|gif)$).*$

Curious if this is possible with a Regex replacement

I'm trying to figure out a regex pattern to use with the Java String.replaceAll() function. I need to replace all the %26 but I don't want it to pick up any other numbers with a '26' prefix.
For example I want:
"abc%26def".replaceAll(regex, "&") to return "abc&def"
- and -
"abc%2623def".replaceAll(regex, "&") to return "abc%2623def" (no change)
I'm aware I can easily write a few more lines of code to accomplish this task but I was wondering if it's possible to do this with just a single replaceAll.
You can use a negative lookahead assertion that prevents matches where %26 is followed by another digit (you'll need to escape the \ in Java, so it would be \\d):
%26(?!\d)
Regex Demo: http://www.rubular.com/r/J07zojxabd
Java Demo: http://ideone.com/luCmFN
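For reference, a short Java sketch of the lookahead applied to the two inputs from the question. The class name AmpersandDecoder and the method name decodeAmpersand are illustrative only.

```java
class AmpersandDecoder {
    static String decodeAmpersand(String s) {
        // %26 is replaced only when the next character is not a digit.
        return s.replaceAll("%26(?!\\d)", "&");
    }

    public static void main(String[] args) {
        System.out.println(decodeAmpersand("abc%26def"));   // prints abc&def
        System.out.println(decodeAmpersand("abc%2623def")); // prints abc%2623def
    }
}
```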
You could achieve this using negative lookahead. From the manual:
(?!X) X, via zero-width negative lookahead
You appear to be doing decoding of percent-encoded octets, in which case you might want to look at URLDecoder.decode:
System.out.println(URLDecoder.decode("abc%26def", "UTF-8"));
This may or may not work for your purposes, as it also translates + to a space.
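A small sketch of how URLDecoder.decode behaves on the inputs from the question, assuming UTF-8; the wrapper class PercentDecoding is made up for the example. Note that it decodes "abc%2623def" as well, which is not the "no change" result the question asks for.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PercentDecoding {
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("abc%26def"));   // prints abc&def
        System.out.println(decode("abc%2623def")); // prints abc&23def
        System.out.println(decode("a+b"));         // prints "a b": + becomes a space
    }
}
```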

validating input string "RX-EZ12345678912345B" using regex

I need to validate input string which should be in the below format:
<2_upper_case_letters><"-"><2_upper_case_letters><14-digit number><1_uppercase_letter>
Ex: RX-EZ12345678912345B
I tried something like this ^[IN]-?[A-Z]{0,2}?\\d{0,14}[A-Z]{0,1} but it's not giving the expected result.
Any help will be appreciated.
Thanks
Your biggest problem is the [IN] at the beginning, which matches only one letter, and only if it's I or N. If you want to match two of any letters, use [A-Z]{2}.
Once you fix that, your regex will still only match RX-E. That's because [A-Z]{0,2}? starts out trying to consume nothing, thanks to the reluctant quantifier, {0,2}?. Then \d{0,14} matches zero digits, and [A-Z]{0,1} greedily consumes the E.
If you want to match exactly 2 letters and 14 digits, use [A-Z]{2} and \d{14}. And since you're validating the string, you should end the regex with the end anchor, $. Result:
^[A-Z]{2}-[A-Z]{2}\d{14}[A-Z]$
...or, as a Java string literal:
"^[A-Z]{2}-[A-Z]{2}\\d{14}[A-Z]$"
As @nhahtdh observed, you don't really have to use the anchors if you're using Java's matches() method to apply the regex, but I recommend doing so anyway. It communicates your intent better, and it makes the regex portable, in case you have to use it in a different flavor/context.
EDIT: If the first two characters should be exactly IN, it would be
^IN-[A-Z]{2}\d{14}[A-Z]$
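The partial match described above can be reproduced with a short sketch. Here LOOSE is the question's pattern with only the [IN] prefix changed to [A-Z]{2}, and STRICT is the corrected pattern; the class and method names are illustrative and not from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeValidator {
    // The question's pattern with only the prefix fixed; the optional and
    // reluctant quantifiers are left in place to show the partial match.
    private static final Pattern LOOSE =
            Pattern.compile("^[A-Z]{2}-?[A-Z]{0,2}?\\d{0,14}[A-Z]{0,1}");

    // The strict pattern with exact repetition counts and both anchors.
    private static final Pattern STRICT =
            Pattern.compile("^[A-Z]{2}-[A-Z]{2}\\d{14}[A-Z]$");

    // Returns the text the loose pattern matches at the start of the input, or null.
    static String looseMatch(String input) {
        Matcher m = LOOSE.matcher(input);
        return m.find() ? m.group() : null;
    }

    static boolean isValid(String input) {
        return STRICT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "RX-EZ12345678912345B";
        System.out.println(looseMatch(input)); // prints RX-E
        System.out.println(isValid(input));    // prints true
    }
}
```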
Simply translating your requirements into a java regex:
"^[A-Z]{2}-[A-Z]{2}\\d{14}[A-Z]$"
This will allow you to use:
if (!input.matches("^[A-Z]{2}-[A-Z]{2}\\d{14}[A-Z]$")) {
    // do something because input is invalid
}
Not sure what you are trying to do at the beginning of your current regex.
"^[A-Z]{2}-[A-Z]{2}\\d{14}[A-Z]$"
The regex above will strictly match the input string as you specified. If you use the matches() method, ^ and $ may be omitted.
Since you want an exact number of repetitions, you should specify it as {<number>} only. {<number>,<number>} is used for a variable number of repetitions. And ? specifies that the preceding token may or may not appear; if it must be there, then specifying ? is incorrect.
^[A-Z]{2}-[A-Z]{2}\\d{14}[A-Z]$
This should serve your purpose.
This should solve your problem:
^[A-Z]{2}-[A-Z]{2}[0-9]{14}[A-Z]$
^([A-Z]{2,2}[-]{1,1}[A-Z]{2,2}[0-9]{14,14}[A-Z]{1,1}){1,1}$

Negating a set of words via java regex

I would like to negate a set of words using java regex.
Say, I want to negate cvs, svn, nvs, mvc. I wrote a regex which is ^[(svn|cvs|nvs|mvc)].
Somehow that doesn't seem to work.
Try this:
^(?!.*(svn|cvs|nvs|mvc)).*$
This will match text if it doesn't contain one of svn, cvs, nvs or mvc.
This is a similar question: C# Regex to match a string that doesn't contain a certain string?
It's not that simple. If you want to negate a word you have to split it to letters and negate each letter.
so to negate
/svn/
you have to write
/[^s][^v][^n]/
So what you want to filter out will turn into really ugly regex and I think it's better idea to use this regex
/svn|cvs|nvs|mvc/
and when you test your string against it, just negate the result.
In JS this would look more or less like this:
!/svn|cvs|nvs|mvc/.test("this is your test string");
Your regex is wrong. Between square brackets, you can put characters to require or to ignore. If you don't find ^(svn|cvs|nvs|mvc)$, you're fine.
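Since the question is about Java, here is a rough Java sketch of two of the approaches above: the negative lookahead, and searching for the words and negating the result. The class and method names are illustrative, and the sketch assumes single-line input, because . does not match line terminators by default.

```java
import java.util.regex.Pattern;

class WordExclusion {
    // Finds any of the unwanted words anywhere in the input.
    private static final Pattern FORBIDDEN = Pattern.compile("svn|cvs|nvs|mvc");

    // Matches the whole input only if none of the words occurs in it.
    private static final Pattern NONE_OF =
            Pattern.compile("^(?!.*(svn|cvs|nvs|mvc)).*$");

    // Search for the words and negate the result.
    static boolean byNegatedFind(String s) {
        return !FORBIDDEN.matcher(s).find();
    }

    // Let the negative lookahead do the negation inside the regex.
    static boolean byLookahead(String s) {
        return NONE_OF.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(byNegatedFind("trunk/main"));         // prints true
        System.out.println(byLookahead("project/.svn/entries")); // prints false
    }
}
```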
