avoid code duplication - java

Consider the following code:
if (matcher1.find()) {
String str = line.substring(matcher1.start()+7,matcher1.end()-1);
/*+7 and -1 indicate the prefix and suffix of the matcher... */
method1(str);
}
if (matcher2.find()) {
String str = line.substring(matcher2.start()+8,matcher2.end()-1);
method2(str);
}
...
I have n matchers, and all of them are independent (if one matches, that says nothing about the others). For each matcher that matches, I invoke a different method on the content it matched.
Question: I don't like the code duplication or the "magic numbers" here, and I'm wondering whether there is a better way to do it (maybe the Visitor pattern?). Any suggestions?

Create an abstract class and define the offsets in each subclass (along with the string processing, depending on your requirements).
Then put the subclass instances in a list and process the list.
Here is a sample abstract processor:
public abstract class AbstractProcessor {
public void find(Pattern pattern, String line) {
Matcher matcher = pattern.matcher(line);
if (matcher.find()) {
process(line.substring(matcher.start() + getStartOffset(), matcher.end() - getEndOffset()));
}
}
protected abstract int getStartOffset();
protected abstract int getEndOffset();
protected abstract void process(String str);
}
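A minimal sketch of the "populate a list and process it" step, under some assumptions of mine: the class name OffsetProcessor, the two sample regexes, and the choice to let each processor own its compiled pattern are all hypothetical and not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the abstract processor: each instance owns its pattern.
abstract class OffsetProcessor {
    private final Pattern pattern;

    protected OffsetProcessor(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Searches the line once and hands the match, minus prefix and suffix, to process().
    public final void find(String line) {
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {
            process(line.substring(matcher.start() + getStartOffset(),
                                   matcher.end() - getEndOffset()));
        }
    }

    protected abstract int getStartOffset();
    protected abstract int getEndOffset();
    protected abstract void process(String str);
}

class OffsetProcessorDemo {
    // Runs every processor on the line and collects what each one extracted.
    static List<String> run(String line) {
        List<String> out = new ArrayList<>();
        List<OffsetProcessor> processors = new ArrayList<>();
        // Matches name="..."; the prefix name=" is 6 characters, the suffix is 1.
        processors.add(new OffsetProcessor("name=\"[^\"]*\"") {
            @Override protected int getStartOffset() { return 6; }
            @Override protected int getEndOffset() { return 1; }
            @Override protected void process(String str) { out.add("name:" + str); }
        });
        // Matches id=<...>; the prefix id=< is 4 characters, the suffix is 1.
        processors.add(new OffsetProcessor("id=<[^>]*>") {
            @Override protected int getStartOffset() { return 4; }
            @Override protected int getEndOffset() { return 1; }
            @Override protected void process(String str) { out.add("id:" + str); }
        });
        for (OffsetProcessor p : processors) {
            p.find(line);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("id=<42> name=\"bob\""));
    }
}
```

The offsets are still numbers, but each one now lives next to the regex it belongs to instead of being scattered through the calling code.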

Simply mark the part of the regex that you want to pass to the method with a capturing group.
For example, if your regex is foo.*bar and you are not interested in foo or bar, make the regex foo(.*)bar. Then always grab group 1 from the Matcher.
Your code would then look like this:
method1(matcher1.group(1));
method2(matcher2.group(1));
...
One further step would be to replace your methods with classes implementing an interface like this:
public interface MatchingMethod {
String getRegex();
void apply(String result);
}
Then you can easily automate the task:
for (MatchingMethod mm : getAllMatchingMethods()) {
Pattern p = Pattern.compile(mm.getRegex());
Matcher m = p.matcher(input);
while (m.find()) {
mm.apply(m.group(1));
}
}
Note that if performance is important, then pre-compiling the Pattern can improve runtime if you apply this to many inputs.
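To illustrate the pre-compiling remark, here is a sketch that compiles each regex once at registration time and then dispatches group 1 to a handler. The class GroupDispatch, its register/dispatch methods, and the sample regexes are hypothetical names chosen for this example; it assumes Java 8 or later for the lambdas.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDispatch {
    // Each pattern is compiled once; group 1 marks the part passed to the handler.
    private final Map<Pattern, Consumer<String>> handlers = new LinkedHashMap<>();

    void register(String regex, Consumer<String> handler) {
        handlers.put(Pattern.compile(regex), handler);
    }

    // Applies every registered pattern to the input, in registration order.
    void dispatch(String input) {
        for (Map.Entry<Pattern, Consumer<String>> e : handlers.entrySet()) {
            Matcher m = e.getKey().matcher(input);
            while (m.find()) {
                e.getValue().accept(m.group(1));
            }
        }
    }

    // Small demo: records which handler received which captured text.
    static List<String> demo(String input) {
        List<String> out = new ArrayList<>();
        GroupDispatch d = new GroupDispatch();
        d.register("foo(.*?)bar", s -> out.add("1:" + s));
        d.register("<(\\w+)>", s -> out.add("2:" + s));
        d.dispatch(input);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo("fooXYbar <tag>"));
    }
}
```

A GroupDispatch instance can be reused for many inputs without recompiling anything.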

You could make it a little bit shorter, but the question is whether it is really worth the effort:
// line is assumed to be a field of the enclosing class
private String getStringFromMatcher(Matcher matcher, int magicNumber) {
return line.substring(matcher.start() + magicNumber, matcher.end() - 1);
}
if (matcher1.find()) {
method1(getStringFromMatcher(matcher1, 7));
}
if (matcher2.find()) {
method2(getStringFromMatcher(matcher2, 8));
}

Use Cochard's solution combined with a factory (a switch statement) that covers all the methodX methods, so you can call it like this:
Factory.CallMethodX(myEnum.MethodX, str)
You can assign myEnum.MethodX in the population step of Cochard's solution.
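One way to read the "factory (switch statement)" idea is sketched here. The enum Target, the class MethodDispatch, and the string return values are my own assumptions, used only so the dispatch can be observed; the original methodX methods are not shown in the question.

```java
class MethodDispatch {
    // Hypothetical enum naming the method each matcher should trigger.
    enum Target { METHOD_1, METHOD_2 }

    // The single switch that maps an enum constant to the method to call.
    static String call(Target target, String str) {
        switch (target) {
            case METHOD_1: return method1(str);
            case METHOD_2: return method2(str);
            default: throw new IllegalArgumentException("unknown target: " + target);
        }
    }

    // Stand-ins for the real methodX implementations.
    static String method1(String str) { return "method1 got " + str; }
    static String method2(String str) { return "method2 got " + str; }

    public static void main(String[] args) {
        System.out.println(call(Target.METHOD_2, "abc"));
    }
}
```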

Related

Java console input handling

This is my first question here, I hope it's not too based on opinions. I've searched on the internet for quite a while now, but couldn't find a similar question.
I need to write a Java program that reads commands from the console, validates the input, gets the parameters and passes them on to a different class.
There are some restrictions on what I can do and use (university).
Only the packages java.util, java.lang and java.io are allowed
Each method can only be 80 lines long
Each line can only be 120 characters long
I am not allowed to use System.exit / Runtime.exit
The Terminal class is used to handle user input. Terminal.readLine() will read a line from the console, like Scanner.nextLine()
I have a fully working program - however my solution will not be accepted because of the way I handle console inputs (runInteractionLoop() method too long). I'm doing it like this:
The main class has the main method and an "interaction loop" where console inputs are handled. The main method calls the interaction loop in a while loop, with a boolean "quit" as a guardian.
private static boolean quit = false;
...
public static void main(String[] args) {
...
while (quit == false) {
runInteractionLoop();
}
}
The interaction loop handles console input. I need to check for 16 different commands - each with their own types of parameters. I chose to work with Patterns and Matchers, because I can use the groups for convenience. Now the problems start - I have never learned how to correctly handle user inputs. What I have done here is, for each possible command, create a new Matcher, see if the input matches, if it does then do whatever needs to be done for this input.
private static void runInteractionLoop() {
Matcher m;
String query = Terminal.readLine();
m = Pattern.compile("sliding-window (\\d+) (-?\\d+(?:\\.\\d+)?;)*(-?\\d+(?:\\.\\d+)?)").matcher(query);
if (m.matches()) {
xyz.doSth(Integer.parseInt(m.group(1)), ......);
...
return;
}
m = Pattern.compile("record ([a-z]+) (-?\\d+(?:\\.\\d+)?)").matcher(query);
if (m.matches()) {
xyz.doSthElse(m.group(1), Double.parseDouble(m.group(2)));
return;
}
...
if (query.equals("quit")) {
quit = true;
return;
}
Terminal.printError("invalid input");
}
As you can see, doing this 16 times stretches out the method to more than 80 lines (5 lines per input max). It's also obviously very inefficient and to be honest, I'm quite ashamed to be posting this here (crap code). I just don't know how to do this correctly, using only java.util and having some way to quickly get the parameters (e.g. the Matcher groups here).
Any ideas? I would be very grateful for suggestions. Thanks.
EDIT/UPDATE:
I have made the decision to split the verification into two methods - one for each half of the commands. Looks ugly, but passes the Uni's checkstyle requirements. However, I'd still be more than happy if someone shows me a better solution to my problem - for the future (because I obviously have no idea how to make this prettier, shorter and/or more efficient).
I guess you could try something painful like this where you separate everything into a chain of method calls:
private static void runInteractionLoop() {
Matcher m;
String query = Terminal.readLine();
m = Pattern.compile("sliding-window (\\d+) (-?\\d+(?:\\.\\d+)?;)*(-?\\d+(?:\\.\\d+)?)").matcher(query);
if (m.matches()) {
xyz.doSth(Integer.parseInt(m.group(1)), ......);
...
return;
} else {
tryDouble(query, m);
}
}
private static void tryDouble(String query, Matcher m) {
m = Pattern.compile("record ([a-z]+) (-?\\d+(?:\\.\\d+)?)").matcher(query);
if (m.matches()) {
xyz.doSthElse(m.group(1), Double.parseDouble(m.group(2)));
return;
} else {
trySomethingElse(query, m);
}
}
private static void trySomethingElse(String query, Matcher m) {
...
if (query.equals("quit")) {
quit = true;
return;
}
Terminal.printError("invalid input");
}
I would solve this with an abstract class CommandValidator:
public abstract class CommandValidator {
/* getter and setter */
public Matcher resolveMatcher(String query) {
return Pattern.compile(getCommand()).matcher(query);
}
public abstract String getCommand();
public abstract void doSth();
}
and would implement 16 different CommandValidators, one per command, each implementing the abstract methods differently:
public class IntegerCommandValidator extends CommandValidator {
@Override
public String getCommand() {
return "sliding-window (\\d+) (-?\\d+(?:\\.\\d+)?;)*(-?\\d+(?:\\.\\d+)?)";
}
@Override
public void doSth() {
/* magic here, parameter input the matcher and xyz, or have it defined as field at the class */
// xyz.doSth(Integer.parseInt(m.group(1)), ......);
}
}
Since you need the matcher in your CommandValidator, you might store it as a field of the class or simply pass it to the doSth() method.
Then you can instantiate each concrete validator, put them in a list, and iterate over the list, resolving the matcher and checking whether it matches:
private static final List<CommandValidator> allConcreteValidators = new ArrayList<>();
public static void main(String[] args) {
/* */
allConcreteValidators.add(new IntegerCommandValidator());
/* */
while (quit == false) {
runInteractionLoop();
}
}
private static void runInteractionLoop() {
String query = Terminal.readLine();
for (CommandValidator validator : allConcreteValidators) {
if (validator.resolveMatcher(query).matches()) {
validator.doSth();
}
}
}
Of course, you could first add a lookup method that checks whether any validator fits at all, and handle the case where none does.
This might be a bit over-engineered for your exercise. If the concrete validators share the same doSth logic, you could also pass the command into their constructor.
Of course, you should find better names for the classes, because each one is not only a validator but also a handler.
You can boil down each possibility to two lines (or three if there must be a closing bracket on a separate line) by delegating the match work to a submethod:
Matcher m;
if ((m = matches(query, "sliding-window (\\d+) (-?\\d+(?:\\.\\d+)?;)*(-?\\d+(?:\\.\\d+)?)")) != null)
xyz.doSth(Integer.parseInt(m.group(1)), ......);
else if ((m = matches(query, "record ([a-z]+) (-?\\d+(?:\\.\\d+)?)")) != null)
xyz.doSthElse(m.group(1), Double.parseDouble(m.group(2)));
...
else
Terminal.printError("invalid input");
private static Matcher matches(String input, String regexp)
{
Matcher result = Pattern.compile(regexp).matcher(input);
if ( result.matches() )
return result;
else
return null;
}
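Another option that keeps every method short is a table of pre-compiled patterns mapped to handlers. This is only a sketch: the class CommandTable and its log field are hypothetical, the log stands in for the calls into xyz, and it assumes that java.util.function and java.util.regex count as allowed sub-packages of java.util.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommandTable {
    // One entry per command; patterns are compiled once, not on every input line.
    private final Map<Pattern, Consumer<Matcher>> commands = new LinkedHashMap<>();
    private boolean quit = false;
    final List<String> log = new ArrayList<>(); // stands in for calls into xyz

    CommandTable() {
        commands.put(Pattern.compile("record ([a-z]+) (-?\\d+(?:\\.\\d+)?)"),
                m -> log.add("record " + m.group(1) + " " + Double.parseDouble(m.group(2))));
        commands.put(Pattern.compile("quit"), m -> quit = true);
        // The remaining commands would be registered the same way.
    }

    // Runs the first command whose pattern matches the whole line.
    // Returns false if none matched, so the caller can report "invalid input".
    boolean handle(String query) {
        for (Map.Entry<Pattern, Consumer<Matcher>> e : commands.entrySet()) {
            Matcher m = e.getKey().matcher(query);
            if (m.matches()) {
                e.getValue().accept(m);
                return true;
            }
        }
        return false;
    }

    boolean isQuit() {
        return quit;
    }

    public static void main(String[] args) {
        CommandTable table = new CommandTable();
        System.out.println(table.handle("record temp 3.5") + " " + table.log);
    }
}
```

With this layout the interaction loop only reads a line, calls handle(), and prints an error when it returns false; adding a seventeenth command means adding one entry to the table.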

Java check that string will only allow commas as special characters

How can I check to make sure the only special character a string can have is a comma?
testString = "123,34565,222" //OK
testString = "123,123.123." //Fail
A full working example based on @Simeon's regex. This reuses a single Matcher object, which is recommended if the check will be done frequently.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class OnlyLettersDigitsCommas {
//"": Dummy search string, to reuse matcher
private static final Matcher lettersCommasMtchr = Pattern.
compile("^[a-zA-Z0-9,]+$").matcher("");
public static final boolean isOnlyLettersDigitsCommas(String to_test) {
return lettersCommasMtchr.reset(to_test).matches();
}
public static final void main(String[] ignored) {
System.out.println(isOnlyLettersDigitsCommas("123,34565,222"));
System.out.println(isOnlyLettersDigitsCommas("123,123.123."));
}
}
Output:
[C:\java_code\]java OnlyLettersDigitsCommas
true
false
You can use a quick String.contains method like this:
if (testString.contains(".")) {
// fails
}
But I would consider using Regex for this type of validation.
EDIT : As stated in the comments of the question : [a-zA-Z0-9,]
Maybe a check like this?
if (!testString.matches("^[a-zA-Z0-9,]+$")) {
// throw an exception
}
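For comparison, the same rule can be checked without a regex by looking at each character. This is a sketch that assumes "special character" means anything other than ASCII letters, ASCII digits, and the comma, as the character class [a-zA-Z0-9,] above suggests; the class name CommaOnlyCheck is made up for the example.

```java
class CommaOnlyCheck {
    // True if the string is non-empty and contains only ASCII letters, digits, and commas.
    static boolean isOnlyLettersDigitsCommas(String s) {
        if (s.isEmpty()) {
            return false; // mirrors the '+' quantifier, which requires at least one character
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digit = c >= '0' && c <= '9';
            if (!letter && !digit && c != ',') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isOnlyLettersDigitsCommas("123,34565,222"));
        System.out.println(isOnlyLettersDigitsCommas("123,123.123."));
    }
}
```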

Match a string against multiple regex patterns

I have an input string.
I am wondering how to match this string against more than one regular expression efficiently.
Example Input: ABCD
I'd like to match against these reg-ex patterns, and return true if at least one of them matches:
[a-zA-Z]{3}
^[^\\d].*
([\\w&&[^b]])*
I am not sure how to match against multiple patterns at once. Can someone tell me how to do it efficiently?
If you have just a few regexes, and they are all known at compile time, then this can be enough:
private static final Pattern
rx1 = Pattern.compile("..."),
rx2 = Pattern.compile("..."),
...;
return rx1.matcher(s).matches() || rx2.matcher(s).matches() || ...;
If there are more of them, or they are loaded at runtime, then use a list of patterns:
final List<Pattern> rxs = new ArrayList<>();
for (Pattern rx : rxs) if (rx.matcher(input).matches()) return true;
return false;
You can make one large regex out of the individual ones:
[a-zA-Z]{3}|^[^\\d].*|([\\w&&[^b]])*
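If the individual regexes live in a list, the combined pattern can be built programmatically. The sketch below uses a hypothetical helper anyOf and wraps every alternative in a non-capturing group so that an alternation inside one regex cannot interact with its neighbours. One caveat I am assuming callers accept: capturing groups are renumbered across the combined regex, so numbered backreferences inside the individual regexes may no longer point at the intended group.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CombinedRegex {
    // Joins the regexes with '|', each wrapped in (?:...) to keep it self-contained.
    static Pattern anyOf(List<String> regexes) {
        String joined = regexes.stream()
                .map(r -> "(?:" + r + ")")
                .collect(Collectors.joining("|"));
        return Pattern.compile(joined);
    }

    public static void main(String[] args) {
        Pattern p = anyOf(Arrays.asList("[a-zA-Z]{3}", "^[^\\d].*", "([\\w&&[^b]])*"));
        System.out.println(p.matcher("ABCD").matches());
    }
}
```

Because matches() requires the whole input to match, the combined pattern accepts a string exactly when at least one of the alternatives accepts it in full.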
To avoid recreating instances of the Pattern and Matcher classes, you can create one of each and reuse them. To reuse a Matcher, call its reset(newInput) method.
Warning: This approach is not thread-safe. Use it only when you can guarantee that only one thread at a time will use this method; otherwise create a separate Matcher instance for each method call.
This is one possible code example:
private static Matcher m1 = Pattern.compile("regex1").matcher("");
private static Matcher m2 = Pattern.compile("regex2").matcher("");
private static Matcher m3 = Pattern.compile("regex3").matcher("");
public boolean matchesAtLeastOneRegex(String input) {
return m1.reset(input).matches()
|| m2.reset(input).matches()
|| m3.reset(input).matches();
}
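If several threads need to call the method, one way to keep the matcher reuse is to give each thread its own Matcher through ThreadLocal. This is a sketch rather than a recommendation: the class name PerThreadMatchers and the two sample regexes are assumptions, and ThreadLocal.withInitial requires Java 8 or later. Pattern objects are immutable and can be shared safely; only the Matcher instances need to be per-thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PerThreadMatchers {
    // Patterns are immutable and safe to share between threads.
    private static final Pattern P1 = Pattern.compile("[a-zA-Z]{3}");
    private static final Pattern P2 = Pattern.compile("^[^\\d].*");

    // Matchers are not thread-safe, so every thread lazily gets its own pair.
    private static final ThreadLocal<Matcher> M1 = ThreadLocal.withInitial(() -> P1.matcher(""));
    private static final ThreadLocal<Matcher> M2 = ThreadLocal.withInitial(() -> P2.matcher(""));

    static boolean matchesAtLeastOneRegex(String input) {
        return M1.get().reset(input).matches()
                || M2.get().reset(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAtLeastOneRegex("ABCD"));
        System.out.println(matchesAtLeastOneRegex("1234"));
    }
}
```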
As explained in "Running multiple regex patterns on String", it is better to concatenate the regexes into one large regex and then run the matcher only once. This is a large improvement if you reuse the regex often.
I'm not sure what effectively means, but if it's about performance and you want to check a lot of strings, I'd go for this
...
static Pattern p1 = Pattern.compile("[a-zA-Z]{3}");
static Pattern p2 = Pattern.compile("^[^\\d].*");
static Pattern p3 = Pattern.compile("([\\w&&[^b]])*");
public static boolean test(String s){
return p1.matcher(s).matches() ? true:
p2.matcher(s).matches() ? true:
p3.matcher(s).matches();
}
I'm not sure how it will affect performance, but combining them all in one regexp with | could also help.
Here's an alternative.
Note that one thing this doesn't do is return them in a specific order. But one could do that by sorting by m.start() for example.
private static HashMap<String, String> regs = new HashMap<String, String>();
...
regs.put("COMMA", ",");
regs.put("ID", "[a-z][a-zA-Z0-9]*");
regs.put("SEMI", ";");
regs.put("GETS", ":=");
regs.put("DOT", "\\.");
for (HashMap.Entry<String, String> entry : regs.entrySet()) {
String key = entry.getKey();
String value = entry.getValue();
Matcher m = Pattern.compile(value).matcher("program var a, b, c; begin a := 0; end.");
boolean f = m.find();
while(f)
{
System.out.println(key);
System.out.print(m.group() + " ");
System.out.print(m.start() + " ");
System.out.println(m.end());
f = m.find();
}
}
}
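The ordering remark can be sketched as follows: collect every match together with its start offset, then sort by that offset. The Token class and the reduced set of three regexes are hypothetical and chosen to keep the example short; with overlapping token definitions (for example ":" and ":=") a tie-breaking rule would also be needed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderedTokens {
    // One match: which named regex produced it, the matched text, and where it starts.
    static final class Token {
        final String name;
        final String text;
        final int start;

        Token(String name, String text, int start) {
            this.name = name;
            this.text = text;
            this.start = start;
        }
    }

    // Returns "NAME:text" entries in the order the tokens appear in the input.
    static List<String> tokens(String input) {
        Map<String, String> regs = new LinkedHashMap<>();
        regs.put("COMMA", ",");
        regs.put("ID", "[a-z][a-zA-Z0-9]*");
        regs.put("SEMI", ";");

        List<Token> found = new ArrayList<>();
        for (Map.Entry<String, String> entry : regs.entrySet()) {
            Matcher m = Pattern.compile(entry.getValue()).matcher(input);
            while (m.find()) {
                found.add(new Token(entry.getKey(), m.group(), m.start()));
            }
        }
        // Sort by position in the input rather than by regex.
        found.sort(Comparator.comparingInt((Token t) -> t.start));

        List<String> out = new ArrayList<>();
        for (Token t : found) {
            out.add(t.name + ":" + t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a, b;"));
    }
}
```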

How to iterate over regexp compliant strings

What is the easiest way to implement a class (in Java) that would serve as an iterator over the set of all values which conform to a given regexp?
Let's say I have a class like this:
public class RegexpIterator
{
private String regexp;
public RegexpIterator(String regexp) {
this.regexp = regexp;
}
public boolean hasNext() {
...
}
public String next() {
...
}
}
How do I implement it? The class assumes some linear ordering on the set of all conforming values and the next() method should return the i-th value when called for the i-th time.
Ideally the solution should support full regexp syntax (as supported by the Java SDK).
To avoid confusion, please note that the class is not supposed to iterate over matches of the given regexp over a given string. Rather it should (eventually) enumerate all string values that conform to the regexp (i.e. would be accepted by the matches() method of a matcher), without any other input string given as argument.
To further clarify the question, let's show a simple example.
RegexpIterator it = new RegexpIterator("ab?cd?e");
while (it.hasNext()) {
System.out.println(it.next());
}
This code snippet should have the following output (the order of lines is not relevant, even though a solution which would list shorter strings first would be preferred).
ace
abce
acde
abcde
Note that with some regexps, such as ab[A-Z]*cd, the set of values over which the class is to iterate is infinite. The preceding code snippet would run forever in these cases.
Do you need to implement a class? This pattern works well:
Pattern p = Pattern.compile("[0-9]+");
Matcher m = p.matcher("123, sdfr 123kjkh 543lkj ioj345ljoij123oij");
while (m.find()) {
System.out.println(m.group());
}
output:
123
123
543
345
123
for a more generalized solution:
public static List<String> getMatches(String input, String regex) {
List<String> retval = new ArrayList<String>();
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
while (m.find()) {
retval.add(m.group());
}
return retval;
}
which then can be used like this:
public static void main(String[] args) {
List<String> matches = getMatches("this matches _all words that _start _with an _underscore", "_[a-z]*");
for (String s : matches) { // List implements the 'iterable' interface
System.out.println(s);
}
}
which produces this:
_all
_start
_with
_underscore
more information about the Matcher class can be found here: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html
Here is another working example. It might be helpful :
public class RegxIterator<E> implements RegexpIterator {
private Iterator<E> itr = null;
public RegxIterator(Iterator<E> itr, String regex) {
ArrayList<E> list = new ArrayList<E>();
while (itr.hasNext()) {
E e = itr.next();
if (Pattern.matches(regex, e.toString()))
list.add(e);
}
this.itr = list.iterator();
}
@Override
public boolean hasNext() {
return this.itr.hasNext();
}
@Override
public String next() {
return this.itr.next().toString();
}
}
If you want to use it for other data types (Integer, Float, etc., or other classes where toString() is meaningful), declare next() to return Object instead of String. Then you may be able to cast the return value back to the actual type.
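For the enumeration the question describes (listing the strings a regexp accepts, with no input string given), a brute-force approach is at least possible when the alphabet and the maximum length are bounded. This sketch is my own assumption of how that could look, not something taken from the answers: it generates every candidate string, shortest first, and keeps those that matches() accepts. The cost grows as the alphabet size to the power of maxLength, so it is only usable for small cases, and it does not cover infinite languages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class BoundedRegexEnumerator {
    // Lists all strings over `alphabet` with length <= maxLength that the regex fully matches.
    // Shorter strings come first; strings of equal length follow the alphabet's order.
    static List<String> enumerate(String regex, String alphabet, int maxLength) {
        Pattern p = Pattern.compile(regex);
        List<String> result = new ArrayList<>();
        List<String> level = new ArrayList<>();
        level.add(""); // level 0 holds only the empty string
        for (int len = 0; len <= maxLength; len++) {
            for (String s : level) {
                if (p.matcher(s).matches()) {
                    result.add(s);
                }
            }
            if (len == maxLength) {
                break;
            }
            // Build the next level by appending every alphabet character to every string.
            List<String> next = new ArrayList<>();
            for (String s : level) {
                for (int i = 0; i < alphabet.length(); i++) {
                    next.add(s + alphabet.charAt(i));
                }
            }
            level = next;
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : enumerate("ab?cd?e", "abcde", 5)) {
            System.out.println(s);
        }
    }
}
```

The returned list can be wrapped with its iterator() to get the hasNext()/next() pair from the question.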

java regex multiple patterns sequential matching

I have a specific question to which I couldn't find an answer online. Basically, I would like to run a pattern-matching operation on a text with multiple patterns. However, I don't want the matcher to give me all the results at once; instead, each pattern should be applied at a different stage of the loop, with specific operations performed at each stage. For instance, given Pattern1, Pattern2, and Pattern3, I would like something like:
if (Pattern 1 = true) {
delete Pattern1;
} else if (Pattern 2 = true) {
delete Pattern2;
} else if (Pattern 3 = true) {
replace with 'something;
} .....and so on
(This is just an illustration of the loop, so the syntax is probably not correct.)
My question is then: how can I compile different patterns, while calling them separately?
(I've only seen multiple patterns compiled together and searched together with the help of AND/OR and so on..that's not what I'm looking for unfortunately) Could I save the patterns in an array and call each of them on my loop?
Prepare your Pattern objects pattern1, pattern2, pattern3 and store them in any container (array or list). Then loop over this container, calling the Matcher's usePattern(Pattern newPattern) method at each iteration.
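A sketch of the usePattern loop, assuming the goal is to find out which pattern applies first so that the matching stage can run its own operation. The class and method names are hypothetical. Note that usePattern keeps the matcher's current position in the input, so the sketch calls reset() before each search.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SequentialPatterns {
    // Returns the index of the first pattern that occurs anywhere in the text, or -1.
    static int firstPatternFound(String text, List<Pattern> patterns) {
        if (patterns.isEmpty()) {
            return -1;
        }
        // One Matcher is created and then re-targeted at each pattern in turn.
        Matcher m = patterns.get(0).matcher(text);
        for (int i = 0; i < patterns.size(); i++) {
            m.usePattern(patterns.get(i));
            m.reset(); // rewind, because usePattern does not change the position
            if (m.find()) {
                return i; // the caller can now run the operation for stage i
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = Arrays.asList(
                Pattern.compile("\\d+"), Pattern.compile("[a-z]+"), Pattern.compile("#"));
        System.out.println(firstPatternFound("abc", patterns));
    }
}
```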
You can make a common interface, and make anonymous implementations that use patterns or whatever else you may want to transform your strings:
interface StringProcessor {
String process(String source);
}
StringProcessor[] processors = new StringProcessor[] {
new StringProcessor() {
private final Pattern p = Pattern.compile("[0-9]+");
public String process(String source) {
String res = source;
if (p.matcher(source).find()) {
res = ... // delete
}
return res;
}
}
, new StringProcessor() {
private final Pattern p = Pattern.compile("[a-z]+");
public String process(String source) {
String res = source;
if (p.matcher(source).find()) {
res = ... // replace
}
return res;
}
}
, new StringProcessor() {
private final Pattern p = Pattern.compile("[%^#@]{2,5}");
public String process(String source) {
String res = source;
if (p.matcher(source).find()) {
res = ... // do whatever else
}
return res;
}
}
};
String res = "My starting string 123 and more 456";
for (StringProcessor p : processors) {
res = p.process(res);
}
Note that implementations of StringProcessor.process do not need to use regular expressions at all. The loop at the bottom has no idea that a regexp is involved in obtaining the results.
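On Java 8 or later the same staged pipeline can be written with lambdas instead of anonymous classes. This sketch swaps the StringProcessor interface for the standard UnaryOperator<String>, and the three concrete stages (delete digits, collapse spaces, trim) are examples I made up to fill in the elided bodies; the last stage uses no regex, which illustrates the point made above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

class LambdaProcessors {
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");
    private static final Pattern SPACES = Pattern.compile(" {2,}");

    // Each stage takes the current string and returns the transformed one.
    static final List<UnaryOperator<String>> STAGES = Arrays.<UnaryOperator<String>>asList(
            s -> DIGITS.matcher(s).replaceAll(""),  // stage 1: delete runs of digits
            s -> SPACES.matcher(s).replaceAll(" "), // stage 2: collapse repeated spaces
            String::trim);                          // stage 3: no regex involved

    // Feeds the output of each stage into the next one.
    static String run(String source) {
        String res = source;
        for (UnaryOperator<String> stage : STAGES) {
            res = stage.apply(res);
        }
        return res;
    }

    public static void main(String[] args) {
        System.out.println(run("My starting string 123 and more 456"));
    }
}
```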
