I need a regex that matches numbers depending on a variable - java

I'm having some problems when trying to find a regex for my code. Here it is:
Scanner key = new Scanner(System.in);
//this is the variable
int s = 4;
String input = "";
String bregex = "[1-9][0-9]{1," + (s*s) + "}";
boolean cfgmatch = false;
while(cfgmatch == false){
input = key.next();
Pattern cfgbp = Pattern.compile(bregex);
Matcher bm = cfgbp.matcher(input);
if(bm.matches()){
System.out.println("working");
}
else{
System.out.println("not working");
}
}
I'm trying to make a regex that restricts the number of cells on a board. The cell count can't be higher than the board's area, which is "s*s".
Example: if the board's size is 4, the input can be from 1 to 16; if it's 5, from 1 to 25, etc.
Board size can only be from 1 to 9.
I've written that while loop to ask for another number if the input fails.

Be Careful with Regular Expressions
While a regular expression could potentially work for this, regular expressions are designed for pattern matching rather than arithmetic comparisons. Your current regular expression allows up to s*s digits after the first one, which does not define the range you are looking for:
// If s = 4, this regular expression matches any string made of a digit 1-9 followed by
// 1 to 16 more digits, i.e. values from 10 to 99999999999999999, not the 1-16 you expect
String bregex = "[1-9][0-9]{1,16}";
Consider a Simpler Approach
You may be better off avoiding a regex entirely, since you are comparing your input numerically to another value (i.e. is this number no greater than x):
// Is your number between 1 and the largest possible cell count?
int value = Integer.parseInt(input); // throws NumberFormatException for non-numeric input
if(value >= 1 && value <= s*s){
// Valid
}
else {
// Invalid
}
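The numeric check can be combined with the retry loop from the question. The following is only a minimal sketch: the class name CellCountInput and the helper isValidCellCount are made up for illustration, and it assumes that non-numeric input should simply count as invalid.

```java
import java.util.Scanner;

public class CellCountInput {
    // Hypothetical helper: true when text is a whole number between 1 and size*size.
    static boolean isValidCellCount(String text, int size) {
        try {
            int value = Integer.parseInt(text.trim());
            return value >= 1 && value <= size * size;
        } catch (NumberFormatException e) {
            return false; // not a number at all, e.g. "abc"
        }
    }

    public static void main(String[] args) {
        int s = 4; // board size, as in the question
        Scanner key = new Scanner(System.in);
        // Keep asking until a valid number is entered or the input ends.
        while (key.hasNext()) {
            String input = key.next();
            if (isValidCellCount(input, s)) {
                System.out.println("working");
                break;
            }
            System.out.println("not working");
        }
    }
}
```

Unlike the loop in the question, this one stops after the first valid number, which is presumably the intended behaviour.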

Related

Using scanner.next() to return the next n number of characters

I'm trying to use a Scanner to parse out some text but I keep getting an InputMismatchException. I'm using the scanner.next(Pattern pattern) method and I want to return the next n characters (including whitespace).
For example when trying to parse out
"21        SPAN 1101"
I want to store the first 4 characters ("21  ") in a variable, then the next 6 characters ("      ") in another variable, then the next 5 ("SPAN "), and finally the last 4 ("1101")
What I have so far is:
String input = "21        SPAN 1101";
Scanner parser = new Scanner(input);
avl = parser.next(".{4}");
cnt = parser.next(".{6}");
abbr = parser.next(".{5}");
num = parser.next(".{4}");
But this keeps throwing an InputMismatchException, even though according to the Java 8 documentation for Scanner.next(Pattern pattern) it doesn't throw that type of exception. Even if I explicitly declare the pattern and then pass it into the method, I get the same exception.
Am I approaching this problem with the wrong class/method altogether? As far as I can tell my syntax is correct, but I still can't figure out why I'm getting this exception.
In the documentation of next(String pattern) we can find that it
Returns the next token if it matches the pattern constructed from the specified string.
But by default Scanner uses one or more whitespace characters as its delimiter, so it doesn't consider spaces part of a token. So the first token it returns is "21" without any trailing spaces, and the condition "...if it matches the pattern constructed from the specified string" is not fulfilled for .{4} because the token is too short.
The simplest solution would be reading the entire line with nextLine() and splitting it into separate parts via a regex like (.{4})(.{6})(.{5})(.{4}) or a series of substring calls.
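The regex variant could look roughly like the sketch below. The class name FixedWidthSplit and the method split are invented for this example, and the sample line assumes the input is "21" padded to 4 characters, followed by 6 spaces, "SPAN " and "1101" (19 characters in total).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedWidthSplit {
    // Four fixed-width fields of 4, 6, 5 and 4 characters, spaces included.
    private static final Pattern FIELDS =
            Pattern.compile("(.{4})(.{6})(.{5})(.{4})");

    // Hypothetical helper: returns the four fields or throws if the line has the wrong length.
    static String[] split(String line) {
        Matcher m = FIELDS.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected exactly 19 characters: " + line);
        }
        String[] parts = new String[m.groupCount()];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = m.group(i + 1); // capturing groups are numbered from 1
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed sample line, built piece by piece so the widths are visible.
        String line = "21  " + "      " + "SPAN " + "1101";
        for (String part : split(line)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Using matches() rather than find() means a line that is too short or too long is rejected instead of being silently truncated.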
You might want to consider creating a convenience method that cuts your input String into a variable number of pieces of given lengths, since the approach with Scanner.next() fails because spaces are not considered part of tokens (they are the default delimiter). That way you can store the resulting pieces of the input String in an array and assign specific elements of the array to other variables (I added some explanations in comments on the relevant lines):
public static void main(String[] args) {
String input = "21        SPAN 1101"; // 19 characters: 4 + 6 + 5 + 4
String[] result = cutIntoPieces(input, 4, 6, 5, 4);
// You can assign elements of result to variables the following way:
String avl = result[0]; // "21  "
String cnt = result[1]; // "      "
String abbr = result[2]; // "SPAN "
String num = result[3]; // "1101"
// Here is an example how you can print whole array to console:
System.out.println(Arrays.toString(result));
}
public static String[] cutIntoPieces(String input, int... howLongPiece) {
String[] pieces = new String[howLongPiece.length]; // Here you store pieces of input String
int startingIndex = 0;
for (int i = 0; i < howLongPiece.length; i++) { // for each "length" passed as an argument...
pieces[i] = input.substring(startingIndex, startingIndex + howLongPiece[i]); // store at the i-th index of pieces array a substring starting at startingIndex and ending "howLongPiece indexes later"
startingIndex += howLongPiece[i]; // update value of startingIndex for next iterations
}
return pieces; // return array containing all pieces
}
Output that you get:
[21  ,       , SPAN , 1101]

java pattern regular expression matching

I'm really bad with pattern matching. I'm trying to take in a password and just check that it meets these criteria:
contains at least 1 lowercase letter
contains at least 1 uppercase letter
contains at least 1 number
contains at least one of these special chars: @#$%
has a minimum length of 8 characters
has a maximum length of 10 characters
This is what I have:
Pattern pattern = Pattern.compile("((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%]).{8,10})");
Matcher matcher = pattern.matcher(in);
if(!matcher.find())
{
return false;
}
else
{
return true;
}
I would also like to do something like this:
int MIN = 8,
MAX = 10;
"((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%]).{MIN,MAX})"
but I get some weird message about malformed expression.
Something isn't right. My program crashes with this. I don't know what's wrong. Any ideas?
private boolean isValidPassword(String in)
{
/* PASSWORD MUST:
* contains at least 1 lowercase letter
* contains at least 1 uppercase letter
* contains at least 1 number
* contains at least one of these special chars: @#$%
* has a minimum length of 8 characters
* has a maximum length of 10 characters
*/
Pattern hasLowercase = Pattern.compile(".*[a-z].*");
Pattern hasUppercase = Pattern.compile(".*[A-Z].*");
Pattern hasNumber = Pattern.compile(".*[0-9].*");
Pattern hasSpecial = Pattern.compile(".*(@|#|$|%).*");
Matcher matcher = hasLowercase.matcher(in);
if (!matcher.matches()) //a-z
{
return false;
}
matcher = hasUppercase.matcher(in);
if (!matcher.matches()) //A-Z
{
return false;
}
matcher = hasNumber.matcher(in);
if (!matcher.matches()) //0-9
{
return false;
}
matcher = hasSpecial.matcher(in);
if (!matcher.matches()) //@#$%
{
return false;
}
if(in.length() < MIN_LENGTH || in.length() > MAX_LENGTH) //length must be min-to-max.
{
return false;
}
return true;
}
If you really want to do this with regular expressions, it would be much easier to test the input against multiple simple expressions rather than one single and excessively complex expression.
Test your input against the following regexes.
If one of them fails, then the input is invalid.
.*[a-z].*
.*[A-Z].*
.*[0-9].*
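.*(@|#|\$|%).*
Additionally, check the length of the input with basic string methods. Note that $ has to be escaped inside the alternation; unescaped, it is an end-of-input anchor and the last expression would accept every input.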
I am not sure how to help you with crashing without more information, but I do have a suggestion.
Instead of trying to create one giant regex expression, I would recommend making one expression for each rule, then test them all on the string individually. This allows you to easily edit the individual rules if you decide you want to change/add/remove rules. This also makes them easier to understand.
There is also the option of not using regex at all; each rule is easy to check by looping over the characters and testing them with methods such as Character.isLowerCase, Character.isUpperCase and Character.isDigit.
As for the malformed expression, you should concatenate MIN and MAX like this:
"((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%]).{" + MIN + "," + MAX + "})" which will insert the values of MIN and MAX into the string.
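I think that your expression might be off, but I found one that meets what you are looking for (it also accepts !, ^, & and * as special characters). The trailing .{8,10}$ is what actually enforces the maximum length; a bare lookahead (?=.{8,10}) would only check that at least 8 characters are present.
"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*]).{8,10}$"
You can modify the min and max length by using
"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*]).{" + MIN + "," + MAX + "}$"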
I have included this RegEx in Regexr so you can see how it works.
http://regexr.com/3gnbd
Also, for future reference when testing Regular Expressions, regexr.com is very helpful for seeing the different components.
You also shouldn't use an if/else statement just to return true or false, because you can return the tested condition directly: return matcher.find(); eliminates the need for the if statement.
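Putting the "one simple expression per rule" advice together with the direct return, a validator might look like the sketch below. It is only an illustration: the class name PasswordCheck is invented, and it assumes the allowed special characters are @, #, $ and % and that the length limits are 8 and 10.

```java
import java.util.regex.Pattern;

public class PasswordCheck {
    private static final int MIN = 8;   // assumed minimum length
    private static final int MAX = 10;  // assumed maximum length

    // One small pattern per rule; find() succeeds if the class occurs anywhere.
    private static final Pattern LOWER = Pattern.compile("[a-z]");
    private static final Pattern UPPER = Pattern.compile("[A-Z]");
    private static final Pattern DIGIT = Pattern.compile("[0-9]");
    // Inside a character class, $ is a literal and needs no escaping.
    private static final Pattern SPECIAL = Pattern.compile("[@#$%]");

    static boolean isValidPassword(String in) {
        // Return the combined condition directly instead of if/else.
        return in != null
                && in.length() >= MIN
                && in.length() <= MAX
                && LOWER.matcher(in).find()
                && UPPER.matcher(in).find()
                && DIGIT.matcher(in).find()
                && SPECIAL.matcher(in).find();
    }

    public static void main(String[] args) {
        System.out.println(isValidPassword("Abcdef1@"));
        System.out.println(isValidPassword("abcdef1@"));
    }
}
```

Because each rule is a separate constant, adding or removing a rule is a one-line change, and the length limits never have to be spliced into a regex string.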

Java: Stringtokenizer To Array

Given a polynomial, I'm attempting to write code that orders its terms by degree and adds like terms together. For instance, given
String term = "323x^3+2x+x-5x+5x^2"; //Given
What I'd like = "323x^3+5x^2-2x" //result
So far I've tokenized the given polynomial by this...
term = term.replace("+" , "~+");
term = term.replace("-", "~-");
System.out.println(term);
StringTokenizer multiTokenizer = new StringTokenizer(term, "~");
int numberofTokens = multiTokenizer.countTokens();
String[] tokensArray = new String[numberofTokens];
int x=0;
while (multiTokenizer.hasMoreTokens())
{
System.out.println(multiTokenizer.nextToken());
}
Resulting in
323x^3~+2x~+x~-5x~+5x^2
323x^3
+2x
+x
-5x
+5x^2
How would I go about splitting the coefficient from the x value, saving each coefficient in an array, and then putting the degrees in a different array with the same index as its coefficient? I will then use this algorithm to add like terms:
for (i=0;i<=biggest_Root; i++)
for(j=0; j<=items_in_list ; j++)
if (degree_array[j] = i)
total += b1[j];
array_of_totals[i] = total;
Any and all help is much appreciated!
You can also update the terms so they all have coefficients:
s/([+-])x/${1}1x/g
So +x^2 becomes +1x^2.
Your individual coefficients can be pulled out with simple regular expressions.
Something like this should suffice:
/([+-]?\d+)x(?!\^)/ // match for x (the lookahead keeps x^2, x^3, ... out)
/([+-]?\d+)x\^2/ // match for x^2
/([+-]?\d+)x\^3/ // match for x^3
/([+-]?\d+)x\^4/ // match for x^4
Then
sum_of_coefficient[degree] += match
where "match" is the parseInt of the regex match (with a special case where the coefficient is 1 and has no number, e.g. +x)
sum_of_coefficient[3] = 323
sum_of_coefficient[1] = +2+1-5 = -2
sum_of_coefficient[2] = 5
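In Java, the same idea can be sketched with a single pattern and a map from degree to summed coefficient. This is an illustrative sketch rather than the answer's own code: the class name LikeTerms and its methods are invented, and it assumes integer coefficients, non-negative integer exponents and x as the only variable.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LikeTerms {
    // Either a term with x (optional coefficient, optional ^exponent) or a plain constant.
    private static final Pattern TERM =
            Pattern.compile("([+-]?\\d*)x(?:\\^(\\d+))?|([+-]?\\d+)");

    static int parseCoefficient(String text) {
        if (text.isEmpty() || text.equals("+")) return 1;  // "x" or "+x"
        if (text.equals("-")) return -1;                   // "-x"
        return Integer.parseInt(text);                     // "+2", "-5", "323"
    }

    static String simplify(String poly) {
        // Highest degree first, so the result is already ordered.
        Map<Integer, Integer> sums = new TreeMap<Integer, Integer>(Comparator.reverseOrder());
        Matcher m = TERM.matcher(poly);
        while (m.find()) {
            if (m.group(3) != null) {                      // constant term, degree 0
                sums.merge(0, parseCoefficient(m.group(3)), Integer::sum);
            } else {
                int degree = (m.group(2) == null) ? 1 : Integer.parseInt(m.group(2));
                sums.merge(degree, parseCoefficient(m.group(1)), Integer::sum);
            }
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Integer, Integer> e : sums.entrySet()) {
            int degree = e.getKey();
            int coeff = e.getValue();
            if (coeff == 0) continue;                      // like terms cancelled out
            if (coeff > 0 && out.length() > 0) out.append('+');
            out.append(coeff);
            if (degree >= 1) out.append('x');
            if (degree > 1) out.append('^').append(degree);
        }
        return out.length() == 0 ? "0" : out.toString();
    }

    public static void main(String[] args) {
        System.out.println(simplify("323x^3+2x+x-5x+5x^2")); // prints 323x^3+5x^2-2x
    }
}
```

A map keyed by degree avoids having to know the highest degree in advance, which the two parallel arrays in the question would require.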
Using a "Regular Expression" Pattern to Simplify the Parsing
(and make the code cooler and more concise)
Here is a working example that parses the coefficient, variable and degree of each term, based on the terms you've parsed so far. It simply puts the terms from your example into a list of Strings and then processes each string the same way.
This program runs and produces output, and if you like it you can splice it into your program. To try it:
$ javac parse.java
$ java parse
Limitations and Potential Improvements:
Technically speaking the coefficient and degrees could be fractional, so the regular expression could easily be changed to handle those kinds of numbers. And then instead of Integer.parseInt() you could use Float.parseFloat() instead to convert the matched value to a variable you can use.
import java.util.*;
import java.util.regex.*;
public class parse {
public static void main(String args[]) {
/*
* Substitute this List with your own list or
* array from the code you've written already...
*
* vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv */
List<String>terms = new ArrayList<String>();
terms.add("323x^3");
terms.add("+2x");
terms.add("+x");
terms.add("-5x");
terms.add("+5x^2");
/* ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ */
for (String term : terms) {
System.out.print("Term: " + term + ": \n");
Pattern pattern = Pattern.compile("([+-]*\\d*)([A-Za-z]*)\\^*(\\d*)");
Matcher matcher = pattern.matcher(term);
if (matcher.find()) {
int coefficient = 1;
try {
coefficient = Integer.parseInt(matcher.group(1));
} catch (Exception e) {}
String variable = matcher.group(2);
int degree = 1;
try {
degree = Integer.parseInt(matcher.group(3));
} catch (Exception e) {}
System.out.println(" coefficient = " + coefficient);
System.out.println(" variable = " + variable);
System.out.println(" degree = " + degree);
/*
* Here, do what you need to do with
* variable, coefficient and degree
*/
}
}
}
}
Explanation of the Regular Expression in the Example Code:
This is the regular expression used:
([+-]*\\d*)([A-Za-z]*)\\^*(\\d*)
Each parenthesized section represents part of the term I want to match and extract into my result. It puts whatever is matched in a group corresponding to the set of parentheses. The first set of parentheses goes into group 1, the second into group 2, etc.
The first matcher (grouped by ( )), is ([+-]*\\d*)
That is designed to match (i.e. extract) the coefficient (if any) and put it into group 1. It expects zero or more occurrences of '+' or '-' characters, followed by zero or more digits. I probably should have written [+-]?\\d*, which would match zero or one + or - characters.
The next grouped matcher is ([A-Za-z]*) That says match zero or more capital or lowercase letters.
That is trying to extract the variable name, if any and put it into group 2.
Following that, there is an ungrouped \\^*, which matches 0 or more ^ characters. It's not grouped in parenthesis, because we want to account for the ^ character in the text, but not stash it anywhere. We're really interested in the exponent number following it. Note: Two backslashes are how you make one backslash in a Java string. The real world regular expression we're trying to represent is \^*. The reason it's escaped here is because ^ unescaped has special meaning in regular expressions, but we just want to match/allow for the possibility of an actual caret ^ at that position in the algebraic term we're parsing.
The final pattern group is (\\d*). Outside of a string literal, as most regexes in the wild are, that would simply be \d*. It's escaped because, by default, an unescaped d in a regex means "match a literal d at the current position in the text", whereas the escaped \d is a special regex pattern that matches any digit [0-9] (as the Pattern javadoc explains). * means expect (match) zero or more digits at that point. Alternatively, + would mean expect 1 or more digits in the text at the current position, and ? would mean 0 or 1 digits are expected in the text at the current position. So, essentially, the last group is designed to match and extract the exponent (if any) after the optional caret, putting that number into group 3.
Remember the ( ) (parenthesized) groupings are just so that we can extract those parsed areas into separate groups.
If this doesn't all make perfect sense, study regular expressions in general and read the Java Pattern class javadoc online. They are NOT as scary as they first look, and they are an extremely worthwhile study for any programmer, since they carry over to most popular scripting languages and compilers; learn them once and you have an extremely powerful tool for life.
This looks like a homework question, so I won't divulge the entire answer here but here's how I'd get started
public class Polynomial {
private String rawPolynomial;
private int lastTermIndex = 0;
private Map<Integer, Integer> terms = new HashMap<>();
public Polynomial(String poly) {
this.rawPolynomial = poly;
}
public void simplify() {
while(true){
String term = getNextTerm(rawPolynomial);
if ("".equalsIgnoreCase(term)) {
return;
}
Integer degree = getDegree(term);
Integer coeff = getCoefficient(term);
System.out.println(String.format("%dx^%d", coeff, degree));
terms.merge(degree, coeff, Integer::sum);
}
}
private String getNextTerm(String poly) {
...
}
private Integer getDegree(String poly) {
...
}
private Integer getCoefficient(String poly) {
...
}
@Override public String toString() {
return terms.toString();
}
}
and some tests to get you started -
public class PolynomialTest {
@Test public void oneTermPolynomialRemainsUnchanged() {
Polynomial poly = new Polynomial("3x^2");
poly.simplify();
assertTrue("3x^2".equalsIgnoreCase(poly.toString()));
}
}
You should be able to fill in the blanks, hope this helps. I'll be happy to help you further if you're stuck somewhere.

How do I go through the String to make sure it only contains numbers in certain places?

I am writing something that will take a user's input for a phone number and tell the user if their input was valid or not based on five parameters:
input is 13 characters in length
char at index 0 is '('
char at index 4 is ')'
char at index 8 is '-'
All other characters must be one of the digits: ’0’ through ’9’ inclusive.
So far I have everything down except the 5th parameter. My code for that is as follows:
if (number.contains("[0-9]+"))
{
ints = true;
if (number.contains("[a-zA-Z]*\\d+."))
{
ints = false;
}
}
else
{
ints = false;
}
(Side note: number is my string that is the user's input, and ints is a boolean declared earlier in the code).
Here is a regular expression to do it.
Pattern p = Pattern.compile("^\\(\\d{3}+\\)\\d{3}+-\\d{4}");
System.out.println(p.matcher("(916)628-4563").matches());
System.out.println(p.matcher("( 916 ) 628-4563").matches());
output:
true
false
It can be tough to enter data like this, and when receiving user input you should try to limit the user's options, e.g. ask for each part of the phone number separately and omit the (, ) and -.
As the OP has added new requirements: first check for the required (, ) and -.
boolean goodNumber = number.indexOf("(") == 0 && number.indexOf(")") == 4;
goodNumber = goodNumber && number.indexOf("-") == 8;
goodNumber = goodNumber && number.length() == 13;
goodNumber = goodNumber && number.replaceAll("\\d", "").length() == 3;
Find the brackets and the dash, then remove all of the digits and check that only the two brackets and the dash are left.
You can use the following. If the string is correct it will print valid, otherwise it will print invalid.
public void compare(){
String inputString="(123)848-3452";
if(inputString.matches("^\\([0-9]{3}\\)[0-9]{3}-[0-9]{4}")){
System.out.println("valid");
}else{
System.out.println("invalid");
}
}
You can have a simple regex like this
Pattern pattern = Pattern.compile("\\(\\d{3}\\)\\d{3}\\-\\d{4}");
Matcher matcher = pattern.matcher("(071)234-2434");
System.out.println(matcher.matches());
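For comparison, the five rules from the question can also be checked position by position without any regex. This is just a sketch under the stated rules; the class name PhoneCheck and the method isValidPhone are made up for the example.

```java
public class PhoneCheck {
    // Hypothetical helper: applies the five rules from the question one character at a time.
    static boolean isValidPhone(String number) {
        if (number == null || number.length() != 13) {
            return false;                       // rule 1: exactly 13 characters
        }
        for (int i = 0; i < number.length(); i++) {
            char c = number.charAt(i);
            if (i == 0) {
                if (c != '(') return false;     // rule 2
            } else if (i == 4) {
                if (c != ')') return false;     // rule 3
            } else if (i == 8) {
                if (c != '-') return false;     // rule 4
            } else if (c < '0' || c > '9') {
                return false;                   // rule 5: every other position is 0-9
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidPhone("(916)628-4563"));
        System.out.println(isValidPhone("( 916 ) 628-4563"));
    }
}
```

The explicit range test c < '0' || c > '9' is used instead of Character.isDigit because the latter also accepts non-ASCII digits, which the question's fifth rule excludes.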

Checking if a character is an integer or letter

I am modifying a file using Java. Here's what I want to accomplish:
if an & symbol, along with an integer, is detected while being read, I want to drop the & symbol and translate the integer to binary.
if an & symbol, along with a (random) word, is detected while being read, I want to drop the & symbol and replace the word with the integer 16, and if a different string of characters is being used along with the & symbol, I want to set the number 1 higher than integer 16.
Here's an example of what I mean. If a file is inputted containing these strings:
&myword
&4
&anotherword
&9
&yetanotherword
&10
&myword
The output should be:
&0000000000010000 (which is 16 in decimal)
&0000000000000100 (or the number '4' in decimal)
&0000000000010001 (which is 17 in decimal, since 16 is already used, so 16+1=17)
&0000000000000101 (or the number '9' in decimal)
&0000000000010001 (which is 18 in decimal, or 17+1=18)
&0000000000000110 (or the number '10' in decimal)
&0000000000010000 (which is 16 because value of myword = 16)
Here's what I tried so far, but haven't succeeded yet:
for (i=0; i<anyLines.length; i++) {
char[] charray = anyLines[i].toCharArray();
for (int j=0; j<charray.length; j++)
if (Character.isDigit(charray[j])) {
anyLines[i] = anyLines[i].replace("&","");
anyLines[i] = Integer.toBinaryString(Integer.parseInt(anyLines[i]);
}
else {
continue;
}
if (Character.isLetter(charray[j])) {
anyLines[i] = anyLines[i].replace("&","");
for (int k=16; j<charray.length; k++) {
anyLines[i] = Integer.toBinaryString(Integer.parseInt(k);
}
}
}
}
I hope that I am articulate enough. Any suggestions on how to accomplish this task?
Character.isLetter() //tests whether the character is a letter
Character.isDigit() //tests whether the character is a digit
It looks like something you could match against a regex. I don't know Java, but you should have at least one regex engine at your disposal. The regex would then be:
regex1: &(\d+)
and
regex2: &(\w+)
or
regex3: &(\d+|\w+)
In the first case, if regex1 matches, you know you ran into a number, and that number is in the first capturing group (e.g. match.group(1)). If regex1 does not match but regex2 does, you know you have a word. You can then look that word up in a dictionary and see what its associated number is or, if it is not present, add it to the dictionary and associate it with the next free number (16 + dictionary size, so the first word gets 16).
regex3, on the other hand, matches both numbers and words, so it's up to you to see what's in the capturing group (it's just a different approach).
If none of the regexes match, then you have an invalid sequence, or you need some other action. Note that \w in a regex only matches word characters (i.e. letters, digits and _), so &*SomeWord won't match at all, while the captured group in &Hello.World would be just "Hello". Whether an accented letter such as the ç in &çSomeWord counts as a word character depends on the regex engine and its Unicode settings.
Regex libs usually provide a length for the matched text, so you can move i forward by that much in order to skip already matched text.
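In Java, the matching and the "skip already matched text" step can be handled by Matcher itself. The sketch below is one possible reading of the approach, not a definitive implementation: the class name SymbolTranslator and its methods are invented, it captures with &(\w+) and then checks whether the capture is all digits (since \w also matches digits), and it keeps the & in the output as the question's expected output does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SymbolTranslator {
    private static final Pattern TOKEN = Pattern.compile("&(\\w+)");
    private static final int FIRST_WORD_VALUE = 16;

    // The "dictionary" of words seen so far and the number assigned to each.
    private final Map<String, Integer> words = new HashMap<String, Integer>();

    String translate(String text) {
        Matcher m = TOKEN.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {                         // find() resumes after the previous match
            String token = m.group(1);
            int value;
            if (token.matches("\\d+")) {
                value = Integer.parseInt(token);   // a plain number
            } else {
                Integer known = words.get(token);
                if (known == null) {               // new word: next free number
                    known = FIRST_WORD_VALUE + words.size();
                    words.put(token, known);
                }
                value = known;
            }
            m.appendReplacement(out, Matcher.quoteReplacement("&" + toBinary16(value)));
        }
        m.appendTail(out);                         // copy whatever follows the last match
        return out.toString();
    }

    // Left-pads the binary form with zeros to 16 digits.
    static String toBinary16(int value) {
        String bits = Integer.toBinaryString(value);
        StringBuilder sb = new StringBuilder();
        for (int i = bits.length(); i < 16; i++) {
            sb.append('0');
        }
        return sb.append(bits).toString();
    }

    public static void main(String[] args) {
        String input = "&myword\n&4\n&anotherword\n&9\n&yetanotherword\n&10\n&myword";
        System.out.println(new SymbolTranslator().translate(input));
    }
}
```

Text that does not start with & is copied through unchanged, so the same method works both line by line and on free text containing embedded tokens.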
You have to somehow tokenize your input. It seems you are splitting it into lines and then analyzing each line individually. If this is what you want, okay. If not, you could simply search for & (indexOf('&')) and then determine what the next token is (either a number or a "word", however you want to define word).
What do you want to do with input which does not match your pattern? Neither the description of the task nor the example really covers this.
You need to have a dictionary of already read strings. Use a Map<String, Integer>.
I would post this as a comment, but don't have the ability yet. What is the issue you are running into? Error? Incorrect Results? 16's not being correctly incremented? Also, the examples use a '%' but in your description you say it should start with a '&'.
Edit2: Was thinking it was line by line, but re-reading indicates you could be trying to find say "I went to the &store" and want it to say "I went to the &000010000". So you would want to split by whitespace and then iterate through and pass the strings into your 'replace' method, which is similar to below.
Edit1: If I understand what you are trying to do, code like this should work.
Map<String, Integer> usedWords = new HashMap<String, Integer>();
List<String> output = new ArrayList<String>();
int wordIncrementer = 16;
String[] arr = test.split("\n");
for(String s : arr)
{
if(s.startsWith("&"))
{
String line = s.substring(1).trim(); //Removes &
try
{
Integer lineInt = Integer.parseInt(line);
output.add("&" + Integer.toBinaryString(lineInt));
}
catch(Exception e)
{
System.out.println("Line was not an integer. Parsing as a String.");
String outputString = "&";
if(usedWords.containsKey(line))
{
outputString += Integer.toBinaryString(usedWords.get(line));
}
else
{
outputString += Integer.toBinaryString(wordIncrementer);
usedWords.put(line, wordIncrementer++);
}
output.add(outputString);
}
}
else
{
continue; //Nothing indicating that we should parse the line.
}
}
How about this?
String input = "&myword\n&4\n&anotherword\n&9\n&yetanotherword\n&10\n&myword";
String[] lines = input.split("\n");
int wordValue = 16;
// to keep track words that are already used
Map<String, Integer> wordValueMap = new HashMap<String, Integer>();
for (String line : lines) {
// if line doesn't begin with &, then ignore it
if (!line.startsWith("&")) {
continue;
}
// remove &
line = line.substring(1);
Integer binaryValue = null;
if (line.matches("\\d+")) {
binaryValue = Integer.parseInt(line);
}
else if (line.matches("\\w+")) {
binaryValue = wordValueMap.get(line);
// if the map doesn't contain the word value, then assign and store it
if (binaryValue == null) {
binaryValue = wordValue;
wordValueMap.put(line, binaryValue);
wordValue++;
}
}
// I'm using Commons Lang's StringUtils.leftPad(..) to create the zero padded string
String out = "&" + StringUtils.leftPad(Integer.toBinaryString(binaryValue), 16, "0");
System.out.println(out);
}
Here's the printout:
&0000000000010000
&0000000000000100
&0000000000010001
&0000000000001001
&0000000000010010
&0000000000001010
&0000000000010000
Just FYI, the binary value for 10 is "1010", not "110" as stated in your original post.
