Given a string, how can I validate the contents are a valid PCRE within Java? I don't want to use the regex in any way within Java, just validate its contents.
java.util.regex.Pattern is almost good enough, but its javadoc points out how it differs from Perl.
In detail, there's a system with 3 relevant components:
Component A - Generates, among other things, Perl-compliant regular expressions (PCREs) to be evaluated at runtime by some other component capable of executing PCREs (component C). What's "generated" here may be coming from a human.
Component B - Validates the data generated by component A and, if valid, shuttles it over to the runtime (component C).
Component C - Some runtime that evaluates PCREs. This could be a Perl VM, a native process using the PCRE library, Boost.Regex, etc., or something else that can compile/execute a Perl-compliant regular expression.
Now, component B is implemented in Java. As mentioned above, it needs to validate a string potentially containing a PCRE, but does not need to execute it.
How could we do that?
One option would be something like:
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public static boolean isValidPCRE(String str) {
    try {
        Pattern.compile(str);
    } catch (PatternSyntaxException e) {
        return false;
    }
    return true;
}
The problem is that java.util.regex.Pattern is designed to work with a regular expression syntax that is not exactly Perl-compliant. The javadoc makes that quite clear.
So, given a string, how can I validate the contents are a valid PCRE within Java?
Note: There are some differences between libPCRE and Perl, but they are pretty minor. To a certain degree, that is true of Java's syntax as well. However, the question still stands.
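To make the mismatch concrete, here is a minimal sketch that runs the same check against two sample patterns. The class name and the patterns are illustrative choices. Only the Java behaviour is exercised; the statements about PCRE in the comments are assumptions based on PCRE's documented syntax (recursion via (?R) is a PCRE construct, while the java* property names are specific to java.util.regex).

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class JavaVsPcreDemo {

    // Same check as in the question: true if java.util.regex accepts the pattern.
    public static boolean compilesInJava(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Recursion is a PCRE construct; Java rejects it, so a valid PCRE would be refused.
        System.out.println(compilesInJava("a(?R)?b"));

        // Java-only character property; accepted by Java, assumed to be rejected by PCRE.
        System.out.println(compilesInJava("\\p{javaLowerCase}+"));
    }
}
```

So the Pattern-based check can err in both directions, which is why it is only "almost good enough" for component B.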
I am working in Java 19, and using the pattern matching for instanceof that was finalized in JEP 394 (which was released in Java 16). However, I am running into a warning that I am struggling to understand.
public class ExpressionTypeIsASubsetOfPatternType
{
    public record Triple(int a, int b, int c) {}

    public static void main(String[] args)
    {
        System.out.println("Java Version = " + System.getProperty("java.version"));
        final Triple input = new Triple(1, 2, 3);
        if (input instanceof Triple t)
        {
            System.out.println("Made it here");
        }
    }
}
And here is the warning that is returned.
$ javac -Xlint:preview --enable-preview --release 19 UnconditionalPatternsPreviewWarning.java
UnconditionalPatternsPreviewWarning.java:15: warning: [preview] unconditional patterns in instanceof are a preview feature and may be removed in a future release.
if (input instanceof Triple t)
^
1 warning
What does this warning message mean? More specifically, what does an unconditional pattern mean? I tried to search on Stack Overflow, but found nothing helpful on this.
I understand well enough that, whatever it is, it is a preview feature, and thus I am trying to do something that has not yet been released. But this looks and sounds like the most basic possible pattern match using the most basic form of pattern matching: instanceof. And the JEP that I linked above made it sound like this feature had been released.
I guess whatever it is I am doing is an unconditional pattern. But what does that mean?
So, documentation is frustratingly scarce on this. The best thing I could find (which also happens to be an Oracle resource) was a blog post written by Nicolai Parlog -- Pattern matching updates for Java 19’s JEP 427: when and null. Most of the article is unrelated to what I am talking about, but about halfway down, there is a reference to unconditional patterns.
...an unconditional pattern, that is, a pattern that matches all possible instances of the switched variable’s type.
Nicolai is talking about pattern matching for switch, but I am willing to bet that the same rules apply to pattern matching for instanceof.
In short, the article is saying that if the static type of the expression you are matching already guarantees that the pattern's type test succeeds, then the pattern qualifies as an unconditional pattern.
More specifically, if I declare a variable String s, then the compiler knows for certain that any non-null value of s is a String. Therefore, if I try to do something like if (s instanceof String match) {}, then this will be considered to be an unconditional pattern. After all, a String is a String, so the type test itself can never fail; the only way the instanceof can be false is if s is null.
And it is this concept, unconditional patterns, that is in preview. Pattern matching for instanceof has been fully released since Java 16. But allowing unconditional patterns is new functionality that the Java team is considering adding to pattern matching for instanceof. Since they are still figuring it out, they are keeping it in preview for the time being.
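The distinction can be sketched as follows. The class and method names are made up for illustration, and the code needs Java 16 or later. Only the conditional form is compiled here; the unconditional form is left in a comment because, going by the javac output above, Java 19 accepts it only as a preview feature, and how other releases treat it is not something this sketch relies on.

```java
public class ConditionalVsUnconditional {

    // Conditional pattern: the static type is Object, so the test can genuinely fail.
    public static String describe(Object value) {
        if (value instanceof String text) {
            return "String of length " + text.length();
        }
        return "not a String";
    }

    // Unconditional pattern (not compiled here): the static type is already String,
    // so the type test cannot fail for any non-null value.
    //
    //     String s = "abc";
    //     if (s instanceof String match) { ... }

    public static void main(String[] args) {
        System.out.println(describe("abc"));
        System.out.println(describe(42));
    }
}
```

In the question's code, input is declared as Triple and tested against Triple t, which is the same situation as the commented-out String case.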
I'm new to Java and I couldn't find an answer to this anywhere because I don't even know how to search for it.
I want to define how two objects can be added together to produce a new one. For example, you can add String "a" and String "b" to get "ab".
I know this can be done in Python by defining __add__(self, other) on the class.
How can you do this in Java?
The thing you are looking for is called operator overloading. It exists in some languages, however in Java it does not.
The best thing you can do is to define a method add() inside the class and then use it like this:
object1.add(object2);
I know it looks nicer with a + between them, but that would make compiling more complex.
With the exception of java.lang.String being treated as a special case [1], Java does not allow you to define the behaviour of + for arbitrary types, or indeed any other operator, as you can in some languages such as C++ or Scala. In other words, Java does not support operator overloading.
Your best bet is to build functions like add &c. Appeal to precedent here: see how the Java guys have done it with BigInteger, for example. Sadly there is no way of defining the precedence of your functions, so you have to use very many parentheses to tell the compiler how you want an expression to be evaluated. It's for this reason that I don't use Java for any serious mathematical applications as the implementation of even a simple equation quickly becomes an unreadable mess [2].
[1] Which in some ways does more harm than good: e.g. consider 1 + 2 + "Hello" + 3 + 4. This compile time constant expression is a string type with the value 3Hello34.
[2] Note that C++ was used to model the gravitational lensing effects of the wormhole in the movie "Interstellar". I challenge anyone to do that in a language that does not support operator overloading! See https://arxiv.org/pdf/1502.03808v1.pdf
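As a small illustration of the BigInteger precedent and of the parentheses point, here is a sketch with two helper methods (the class and method names are invented for this example). The grouping that operator precedence would give you for free has to be spelled out through the nesting and chaining of method calls.

```java
import java.math.BigInteger;

public class BigIntegerArithmetic {

    // a + b * c: the multiplication is nested inside the call to add.
    public static BigInteger addProduct(BigInteger a, BigInteger b, BigInteger c) {
        return a.add(b.multiply(c));
    }

    // (a + b) * c: the addition happens first because it is called first in the chain.
    public static BigInteger multiplySum(BigInteger a, BigInteger b, BigInteger c) {
        return a.add(b).multiply(c);
    }

    public static void main(String[] args) {
        BigInteger two = BigInteger.valueOf(2);
        BigInteger three = BigInteger.valueOf(3);
        BigInteger four = BigInteger.valueOf(4);
        System.out.println(addProduct(two, three, four));  // 2 + 3 * 4 = 14
        System.out.println(multiplySum(two, three, four)); // (2 + 3) * 4 = 20
    }
}
```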
Java does not allow you to overload operators. String is a special case: the language itself defines + for it, but you cannot do the same for your own classes.
What you can do is add an add function like so:
public YourObject add(YourObject yourObject) {
    return new YourObject(this.propertyToAdd + yourObject.propertyToAdd);
}
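For completeness, here is how that method might look inside a full class. Vec2 and its fields are hypothetical names chosen for the example; the point is only that add returns a new object and leaves both operands untouched, which is also how String concatenation and BigInteger.add behave.

```java
public final class Vec2 {
    public final double x;
    public final double y;

    public Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Plays the role of "+": returns a new Vec2 and does not modify either operand.
    public Vec2 add(Vec2 other) {
        return new Vec2(this.x + other.x, this.y + other.y);
    }

    public static void main(String[] args) {
        Vec2 sum = new Vec2(1, 2).add(new Vec2(3, 4));
        System.out.println(sum.x + ", " + sum.y);
    }
}
```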
All C++ functions are of the form
type name ( parameters ) { … }
To identify the regex, I'm using
regex = "...";
pattern = Pattern.compile(regex);
matcher = pattern.matcher(line);
if (matcher.matches())
{
    ...
}
I can only realistically search for the "type name (" part, since I am using a line reader and function definitions can span multiple lines, and I'm not sure what the regex would be. .*\\b.*\\( was my latest guess, but it doesn't work. Any help would be greatly appreciated.
Unfortunately, there is no general regular expression that can match all function definitions.
The C++ grammar specification allows you to parenthesize the name of any variable as many times as you'd like. For example, you can write
int ((((((a))))));
to declare a variable named a. This means that you can define functions like this:
void whyWouldYouDoThis(int (((((becauseICan)))))) {
    /* ... */
}
The problem with this is that it means that function declarations can have arbitrarily-complicated nesting of parentheses. You can prove that, in general, sets of strings that require keeping track of balanced parentheses cannot be matched by regular expressions (formally, that the language of those strings is not regular), and unfortunately this applies here.
This is definitely really contrived, but there are cases where you will see lots of nested parentheses. For example, consider this function:
void thisFunctionTakesACallback(void imACallbackFunction()) {
    /* ... */
}
Here, there's an extra layer of parentheses induced by the fact that the function argument is itself of function type. If that function took a callback, you could see something like this:
void thisFunctionTakesACallback(void soDoesThisOne(void imACallbackInACallback())) {
    /* ... */
}
If you're looking to find all function declarations, you might be better off using a parser and defining a grammar for what you're looking for, since these patterns are context-free. You could alternatively consider hooking into a compiler front-end (g++ can produce ASTs for you in the GIMPLE or GENERIC framework, for example) and using that to extract what you're looking for. That guarantees you won't miss anything.
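If a rough line-based heuristic is still acceptable, two things are worth separating. First, the guess from the question fails with matches() because matches() requires the whole line to match, and a typical line continues after the opening parenthesis; find() does not have that requirement. Second, a slightly stricter pattern can capture the name. The sketch below is an assumption-laden heuristic, not a parser: SIGNATURE_START and functionName are invented for this example, and the pattern also fires on statements such as return foo(x); and misses anything spread over several lines.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FunctionLineHeuristic {

    // The guess from the question, unchanged.
    public static final Pattern QUESTION_GUESS = Pattern.compile(".*\\b.*\\(");

    // Hypothetical heuristic: one or more type-like tokens, an identifier, then "(".
    public static final Pattern SIGNATURE_START =
            Pattern.compile("^\\s*(?:[\\w:<>,*&]+\\s+)+[*&]*(\\w+)\\s*\\(");

    // Returns the identifier that precedes the first "(" if the line looks like a signature.
    public static Optional<String> functionName(String line) {
        Matcher m = SIGNATURE_START.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "int main(int argc, char** argv) {";
        System.out.println(QUESTION_GUESS.matcher(line).matches()); // whole line must match
        System.out.println(QUESTION_GUESS.matcher(line).find());    // any substring may match
        System.out.println(functionName(line));
        System.out.println(functionName("x = foo(1);"));
    }
}
```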
In Java I have a method:
private boolean testFunction(int x) {
    // code goes here...
}
Now I have an expression written in a file, something like:
if(testFunction(10)){ return "ok"; }else{ return null;}
I am storing this in a String variable inside a Java program and want to execute it as if it were Java code:
if(testFunction(10)){ return "ok"; }else{ return null;}
Is it possible?
The thing is, I have a web application with 10+ different kinds of forms, each with different fields. For example, one form has fields X, Y and Z, of which X and Y are required; another has A, B and C, of which only C is required.
So instead of writing validation code for each form, I wanted to write an expression in an XML file; at evaluation time these expressions would be executed by a single Java method and return some value. That way I would only have to write expressions in the XML file.
No. Java is a compiled language, it is not interpreted.
There are ways of generating bytecode dynamically in Java, but they are highly involved, and aren't anywhere close to the concept of eval(String code)
If you want dynamic validation for form entries, I'd suggest using regular expressions, which can be loaded as data and matched against form input at runtime.
Unfortunately your OP was a bit vague as to what you're actually trying to achieve.
To get you started: the java.util.regex.Pattern class
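A minimal sketch of that idea, assuming the rules have already been read from the XML file into (field name, regex) pairs; the XML parsing itself is left out. RegexFormValidator, addRule and invalidFields are names invented for this example. A missing field is treated as an empty string, so a rule like .+ makes a field required while \d* keeps it optional.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class RegexFormValidator {

    // One compiled pattern per field name, kept in insertion order.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    public void addRule(String field, String regex) {
        rules.put(field, Pattern.compile(regex));
    }

    // Returns the names of fields whose value does not fully match its rule.
    public List<String> invalidFields(Map<String, String> form) {
        List<String> invalid = new ArrayList<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            String value = form.getOrDefault(rule.getKey(), ""); // missing means empty
            if (!rule.getValue().matcher(value).matches()) {
                invalid.add(rule.getKey());
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        RegexFormValidator validator = new RegexFormValidator();
        validator.addRule("X", ".+");   // required, any non-empty text
        validator.addRule("Y", "\\d*"); // optional, digits only
        System.out.println(validator.invalidFields(Map.of("X", "hello")));
        System.out.println(validator.invalidFields(Map.of("Y", "12a")));
    }
}
```

Each form then only needs its own set of rules, and the same invalidFields method serves all of them.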
I often find myself searching for statements of a particular form in Java. Say I've written a simple function to express an idiom, such as "take this value, or a default value if it's null"
/** return a if not null, the default otherwise. */
public static <T> T notNull(T a, T def) {
    if (a == null)
        return def;
    else
        return a;
}
Now if I've written this, I want to look for cases in my code where it can be used to simplify, for instance
(some.longExpressionWhichMayBeNull() == null ? "default string" : some.longExpressionWhichMayBeNull())
The problem is that it's pretty tricky to write a regular expression that matches Java syntax. It can be done, of course, but it's easy to get wrong. It's hard to get regular expressions to ignore whitespace in all the right locations, always accurately figure out where strings start and stop, know the difference between a cast and a function call, etc.
It also seems a bit wasteful, since we already have a Java parser, which does all of that already.
So my question is: is there some Java syntax aware alternative to regular expressions for searching for particular (sub-)expressions?
You'd probably need to build an abstract syntax tree of the Java source file(s) and then analyse that. It might be possible to leverage PMD (http://pmd.sourceforge.net/) and write a custom rule (http://pmd.sourceforge.net/pmd-5.0.5/howtowritearule.html) to detect and flag expressions that could be simplified as you describe.
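To give a feel for the AST approach without setting up PMD, here is a rough sketch that uses the JDK's own Compiler Tree API (com.sun.source, shipped in the jdk.compiler module) rather than a PMD rule. It is a swapped-in technique for illustration, needs a full JDK at runtime, and all class and method names are invented. It only finds ternaries whose condition compares something with null; checking that the tested expression reappears in a branch is left out.

```java
import com.sun.source.tree.BinaryTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ConditionalExpressionTree;
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.ParenthesizedTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class NullTernaryFinder {

    // Keeps source text in memory so it can be parsed without a file on disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns the source form of every ternary whose condition compares with null.
    public static List<String> findNullTernaries(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));
        List<String> hits = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) { // parse only, no type checking
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitConditionalExpression(ConditionalExpressionTree node, Void unused) {
                        if (isNullComparison(node.getCondition())) {
                            hits.add(node.toString());
                        }
                        return super.visitConditionalExpression(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    // True for "x == null" or "x != null", with or without surrounding parentheses.
    static boolean isNullComparison(ExpressionTree condition) {
        ExpressionTree e = condition;
        while (e instanceof ParenthesizedTree) {
            e = ((ParenthesizedTree) e).getExpression();
        }
        if (e.getKind() != Tree.Kind.EQUAL_TO && e.getKind() != Tree.Kind.NOT_EQUAL_TO) {
            return false;
        }
        BinaryTree comparison = (BinaryTree) e;
        return comparison.getLeftOperand().getKind() == Tree.Kind.NULL_LITERAL
                || comparison.getRightOperand().getKind() == Tree.Kind.NULL_LITERAL;
    }

    public static void main(String[] args) {
        String code = "class Sample { String f(String s) { return s == null ? \"d\" : s; } }";
        findNullTernaries("Sample", code).forEach(System.out::println);
    }
}
```

Because the search runs on the parsed tree, whitespace, comments and string literals are handled by the parser rather than by the pattern.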