Blank spaces in regular expression - java

I use this regular expression to validate many of the input fields of my Java web app:
"^[a-zA-Z0-9]+$"
But I need to modify it, because I have a couple of fields that need to allow blank spaces (for example: Address).
How can I modify it to allow blank spaces (if possible, not at the start)?
I think I need to use some escape character like \
I tried a few different combinations but none of them worked. Can somebody help me with this regex?

I'd suggest using this:
^[a-zA-Z0-9][a-zA-Z0-9 ]+$
It does two things: the first character class guarantees that the string cannot start with a space, and the second allows letters a-z and A-Z, all digits, and spaces for the rest of the string (there's a space just before the closing bracket of my regex). Because of the +, the input must be at least two characters long.

If you only want to allow the space character, you can do:
^[a-zA-Z0-9 ]+$
If you also want to allow tabs (\t) and line breaks (\n, \r\n), you can do:
^[a-zA-Z0-9\s]+$
Also, as you asked, if you don't want the whitespace to be at the beginning:
^[a-zA-Z0-9][a-zA-Z0-9 ]+$

Use this: ^[a-zA-Z0-9]+[a-zA-Z0-9 ]+$. This should work. The first atom ensures that there is at least one non-space character at the beginning.

Try ^[a-zA-Z0-9 ]+$, that is, add a space inside the character class.

This regex doesn't allow spaces at the end of the string; one downside is that it also accepts the underscore character.
^(\w+ )+\w+|\w+$

Try this one. I assume that any input with a length of at least one character is valid; the previously mentioned answers do not take that into account.
"^[a-zA-Z0-9][a-zA-Z0-9 ]*$"
If you want to allow all whitespace characters, replace the space by "\s"
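As a minimal sketch of how the last pattern could be wired into Java validation code: the class and method names below (FieldValidator, isValid) are made up for illustration and are not from the question. The pattern is compiled once and reused, and matches() already requires the whole input to match, so the ^ and $ anchors are kept only for readability.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class FieldValidator {
    // First character must be a letter or digit; the rest may also be spaces.
    // The * (instead of +) lets a single-character input pass.
    private static final Pattern ALNUM_WITH_SPACES =
            Pattern.compile("^[a-zA-Z0-9][a-zA-Z0-9 ]*$");

    static boolean isValid(String input) {
        return input != null && ALNUM_WITH_SPACES.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("221B Baker Street")); // true
        System.out.println(isValid(" Baker Street"));     // false: leading space
        System.out.println(isValid("A"));                 // true: one character is enough
        System.out.println(isValid(""));                  // false: empty input
    }
}
```

Note that this still accepts trailing spaces; if that matters for a field, trimming the input before validating is one option.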

Related

Regular Expression Match after double space, and before comma

I am trying to match the bolded portion of the below String, which would represent a city.
1795 New Test Dr Test TEst Wildwood, MI 48769-1100
There are two spaces between Dr and Test, the starting portion should happen after those double spaces, and end before the comma.
I feel like I am very close to having this correct but can't quite get it 100%, as it is including the white space characters before Test.
(?=\s{2})[\w+\s]*[^,]
The above is what I have so far. The many other alternatives I tried did not work either: they still include the whitespace characters I do not want at the beginning.
I feel like I am missing something simple, but even after looking in many places I cannot seem to find the regex that would match this pattern.
Also I know this can be easily accomplished with split and substrings, but the requirement is a regex unfortunately, as this is for a database driven automation application and the format should be able to change on the fly without requiring a deploy due to code changes.
You need a look-behind for the spaces rather than a look-ahead, as you want the match to start immediately after them. From that point on, you can simply do a greedy match for anything that is not a comma:
(?<=\s{2})[^,]*
The * is greedy and will consume as many characters as it can, ending the match immediately before the comma.
Note that \s also matches whitespace other than the space character, which may or may not be what you want.
How about ^.*?  ([^,]*).*$ ? That's a non-greedy match at the beginning of the line (^.*?), followed by two literal spaces, then a group capturing everything that isn't a comma, then a match of everything else to the end of the line.
Be aware, though, that when I copy and paste your example text, it does not contain two spaces. This might be causing you problems, or it's just a transcription issue and your original has the two spaces.
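Both approaches above can be tried side by side in Java. This is only a sketch: the class and method names are invented for illustration, the sample line is the one from the question with the two spaces restored after "Dr", and the capture-group pattern writes the two literal spaces as " {2}" so they stay visible in source code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class CityExtractor {
    // Look-behind: start right after two whitespace characters, stop before the comma.
    private static final Pattern AFTER_DOUBLE_SPACE =
            Pattern.compile("(?<=\\s{2})[^,]*");

    // Capture group: lazily skip to the first double space, then capture up to the comma.
    private static final Pattern CAPTURE =
            Pattern.compile("^.*? {2}([^,]*).*$");

    static String viaLookbehind(String line) {
        Matcher m = AFTER_DOUBLE_SPACE.matcher(line);
        return m.find() ? m.group() : null;
    }

    static String viaGroup(String line) {
        Matcher m = CAPTURE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "1795 New Test Dr  Test TEst Wildwood, MI 48769-1100";
        System.out.println(viaLookbehind(line)); // Test TEst Wildwood
        System.out.println(viaGroup(line));      // Test TEst Wildwood
    }
}
```

If the input only contains single spaces, neither pattern finds anything and both methods return null.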

Need regular expression for pattern this

I need a regular expression for the pattern below:
It can start with / or number
It can only contain numbers, no text
Numbers can have space in between them.
It can contain /*, at least 1 number and space or numbers and /*
Valid Strings:
3232////33 43/323//
3232////3343/323//
/3232////343/323//
Invalid Strings:
/sas/3232/////dsds/
/ /34343///// /////
///////////
My problem is that it can have a space between numbers, like /3232 323/, but not between slashes, like / /.
How can I validate it?
I have tried so far:
(\\d[\\d ]*/+) , (/*\\d[\\d ]*/+) , (/*)(\\d*)(/*)
This regex should work for you:
^/*(?:\\d(?: \\d)*/*)+$
Live Demo: http://www.rubular.com/r/pUOYFwV8SQ
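To check this pattern against the sample strings from the question, a small Java sketch could look like the following. The class and method names are assumptions made for illustration; the regex itself is the one given above, written as a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class SlashNumberValidator {
    // Optional leading slashes, then one or more groups of
    // "digit, optionally followed by single-space-separated digits, then any slashes".
    private static final Pattern NUMBERS_AND_SLASHES =
            Pattern.compile("^/*(?:\\d(?: \\d)*/*)+$");

    static boolean isValid(String input) {
        return NUMBERS_AND_SLASHES.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "3232////33 43/323//",   // valid in the question
            "3232////3343/323//",    // valid
            "/3232////343/323//",    // valid
            "/sas/3232/////dsds/",   // invalid: letters
            "/ /34343///// /////",   // invalid: space between slashes
            "///////////"            // invalid: no digit at all
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because a space is only accepted between two digits, a string that ends in a space or has a space next to a slash is rejected.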
My solution is not so simple but it works
^(((\d[\d ]*\d)|\d)|/)*((\d[\d ]*\d)|\d)(((\d[\d ]*\d)|\d)|/)*$
Just use lookarounds for the last criterion.
^(?=.*?\\d)([\\d/]*(?:/ ?(?!/)|\\d ?))+$
The best would have been to use conditional regex, but I think Java doesn't support them.
Explanation:
Basically, numbers or slashes, followed by one number and a space, or one slash and a space which is not followed by another slash. Repeat that. The space is made optional because I assume there's none at the end of your string.
Try this Java regex:
/*(\\d[\\d ]*(?<=\\d)/+)+
It meets all your criteria.
Although you didn't specifically state it, I have assumed that a space may not appear as the first or last character of a number (i.e. spaces must be between digits).
"(?![A-Za-z])(?=.*[0-9].*)(?!.*/ /.*)[0-9/ ]{2,}(?![A-Za-z])"
This will match what you want, but keep in mind that it will also match this:
/3232///// from /sas/3232/////dsds/
This is because part of the invalid string is valid on its own.
If you are reading line by line, anchor the pattern with ^ and $; if you are reading an entire block of text, search for \r\n around the regex above to match each line.

How to use regex to remove punctuations in a sentence

I am trying to take from a file all the valid words. Valid words are defined as normal characters that can appear like so:
don't won't can't
and I have to ignore commas, periods, and exclamation points.
I have gotten the expression to extract plain letters, but now it won't pick up words like don't, can't, or won't.
This is the expression I am using: "[^A-Za-z]+". I have also tried "\'[^A-Za-z]+", but this breaks and allows all characters. Does anyone have any idea what I can use to get normal words, including words such as don't, won't, and can't?
Thank you very much
[^A-Za-z] means anything NOT matching those character ranges! Try this:
[A-Za-z']
You may need to escape the single quote, in which case you'll probably need to escape the slash that escapes it:
[A-Za-z\\']
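One way to apply that character class in Java is to add a + quantifier and collect each match with Matcher.find(), instead of splitting on the negated class. This is a sketch under that assumption; the class and method names are invented for illustration, and the apostrophe is left unescaped because it has no special meaning inside a Java regex character class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class WordExtractor {
    // One or more letters or apostrophes, so "don't" stays a single word.
    private static final Pattern WORD = Pattern.compile("[A-Za-z']+");

    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Commas, question marks and exclamation points are skipped.
        System.out.println(words("Don't stop, won't you? I can't!"));
        // [Don't, stop, won't, you, I, can't]
    }
}
```

A stray apostrophe on its own, or quotes around a word, would also be picked up by this class, so some post-filtering may be needed depending on the input file.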
Another way (using abbreviations) is: \b[\w']+
This will match letters from any language and exclude numbers.
\b[\p{L}\!\'\?]+
Here is a very good resource for regular expressions.
http://www.regular-expressions.info/

Removing all whitespace characters except for " "

I consider myself pretty good with Regular Expressions, but this one is appearing to be surprisingly tricky: I want to trim all whitespace, except the space character: ' '.
In Java, the RegEx I have tried is: [\s-[ ]], but this one also strips out ' '.
UPDATE:
Here is the particular string that I am attempting to strip spaces from:
project team manage key
Note: it would be the characters between "team" and "manage". They appear as a long space when editing this post but view as a single space in view mode.
Try using this regular expression:
[^\S ]+
It's a bit confusing to read because of the double negative. The regular expression [\S ] matches the characters you want to keep, i.e. either a space or anything that isn't a whitespace. The negated character class [^\S ] therefore must match all the characters you want to remove.
Using a Guava CharMatcher:
String text = ...
// Older Guava releases expose this matcher as the constant CharMatcher.WHITESPACE.
String stripped = CharMatcher.whitespace().and(CharMatcher.isNot(' '))
        .removeFrom(text);
If you actually just want that trimmed from the start and end of the string (like String.trim()) you'd use trimFrom rather than removeFrom.
There's no subtraction of character classes in Java, otherwise you could use [\s--[ ]], note the double dash. You can always simulate set subtraction using intersection with the complement, so
[\s&&[^ ]]
should work. It's no better than [^\S ]+ from the first answer, but the principle is different and it's good to know both.
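For comparison, here is a small Java sketch that applies both character classes with String.replaceAll. The class and method names are made up for illustration, and a + is appended to the intersection form so that runs of whitespace are removed in one step.

```java
// Hypothetical helper; the names are illustrative only.
class WhitespaceStripper {
    // Negated class: anything that is neither non-whitespace nor a space.
    static String viaNegatedClass(String s) {
        return s.replaceAll("[^\\S ]+", "");
    }

    // Intersection: whitespace AND not a space.
    static String viaIntersection(String s) {
        return s.replaceAll("[\\s&&[^ ]]+", "");
    }

    public static void main(String[] args) {
        String input = "project team\t\n manage key";
        System.out.println(viaNegatedClass(input)); // project team manage key
        System.out.println(viaIntersection(input)); // project team manage key
    }
}
```

By default \s in Java covers only space, tab, line feed, vertical tab, form feed, and carriage return, so other Unicode space characters are left untouched by both versions.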
I solved it with this (JavaScript syntax):
anyString.replace(/[\f\t\n\v\r]*/g, '');
It is just a collection of all possible whitespace characters excluding the blank (so actually \s without blanks). It includes the tab, carriage return, new line, vertical tab, and form feed characters.

Java regex help

A string must not include spaces or special characters. Only a-z, A-Z, 0-9, the underscore, and the period characters are allowed.
How do I achieve this?
Update:
All the solutions posted worked for me.
Thanks everyone for helping out.
if (!myString.matches("^[a-zA-Z0-9._]*$")) {
    // fail ...
}
or you can use the \w character class (shorthand for [a-zA-Z_0-9])
if (!myString.matches("^[\\w.]*$")) {
    // fail ...
}
I am certain that by the time I finish typing this, you will have received your answer. So here is some genuine advice to go with it: take the time (an hour or so) to learn the basics of regular expressions.
You will be surprised how often they show up in solutions to 'real world' problems.
Great testing resource -> http://gskinner.com/RegExr/
A different solution:
text = text.replaceAll("[^\\w.]", "");
It removes the unwanted characters instead of just detecting them. Note the ^ inside the brackets: without it, the call would remove the allowed characters instead.
From Sun's website:
\w A word character: [a-zA-Z_0-9]
"[\\w.]+" should do the trick
You could simply delete all the characters that don't match the set [a-zA-Z0-9_.]. Alternatively you could replace characters not in the set with a valid character (e.g. the underscore). Finally you could altogether reject any string that does not consist solely of characters in the permitted set.
You can either write an "all characters must be one of these" regular expression or simply ask whether any of the characters you dislike are present at all and, if so, reject the string. I believe the latter will be the easiest to write and to understand later.
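The three options discussed in this thread (accept only allowed characters, look for any forbidden character, or strip forbidden characters) could be sketched in Java as follows. The class and method names are assumptions chosen for illustration; the character set is the one from the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class AllowedCharsCheck {
    // Whitelist: the whole string consists of letters, digits, underscore, period.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9._]*");

    // Blacklist view of the same rule: any single character outside the set.
    private static final Pattern FORBIDDEN = Pattern.compile("[^a-zA-Z0-9._]");

    static boolean isValid(String s) {
        return ALLOWED.matcher(s).matches();
    }

    static boolean containsForbidden(String s) {
        return FORBIDDEN.matcher(s).find();
    }

    static String stripForbidden(String s) {
        return FORBIDDEN.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(isValid("user_name.01"));         // true
        System.out.println(containsForbidden("user name!")); // true
        System.out.println(stripForbidden("user name!"));    // username
    }
}
```

With the * quantifier an empty string counts as valid; switching to + would require at least one character.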
