Convert Java Regex into PHP regex - java

I got the following Java code from Apache commons to validate email addresses. I code in PHP so I'm trying to see if these regex can be used directly in PHP without any modification.
LEGAL_ASCII_REGEX = "^\\p{ASCII}+$";
EMAIL_REGEX = "^\\s*?(.+)#(.+?)\\s*$";
IP_DOMAIN_REGEX = "^\\[(.*)\\]$";
USER_REGEX = "^\\s*" + WORD + "(\\." + WORD + ")*$";
If an email address fails any of these 4 conditions above, then it would be considered invalid.
I don't have any experience with Java so any advice on modifications on these regex needed for PHP is hugely appreciated!
Best,
Update:
the code I'm using is:
$email_to_test='www.jinfu66#foxmail.com';
if(filter_var($email_to_test, FILTER_VALIDATE_EMAIL)&&preg_match('/^[[:ascii:]]+$/', $email_to_test)&&preg_match('/^\s*?(.+)#(.+?)\s*$/', $email_to_test))
{
echo 'It passed';
}
else
{
echo 'It did not pass';
}
I'm not sure how to add the condition that $email_to_test must match the requirement from $USER_REGEX in order for it to echo 'It passed'. Thank you!
2nd update:
Here's what WORD stands for in the original JAVA regex:
private static final String SPECIAL_CHARS = "\\p{Cntrl}\\(\\)<>#,;:'\\\\\\\"\\.\\[\\]";
private static final String VALID_CHARS = "[^\\s" + SPECIAL_CHARS + "]";
private static final String QUOTED_USER = "(\"[^\"]*\")";
private static final String WORD = "((" + VALID_CHARS + "|')+|" + QUOTED_USER + ")";

PHP regexes in single-quoted strings don't need the doubled backslashes (\\) that Java string literals require.
PCRE uses [[:ascii:]] instead of \p{ASCII}.
PCRE patterns need delimiters, unlike Java regexes.
The following PHP regexes should work for you:
$LEGAL_ASCII_REGEX = '/^[[:ascii:]]+$/';
$EMAIL_REGEX = '/^\s*?(.+)#(.+?)\s*$/';
$IP_DOMAIN_REGEX = '/^\[(.*)\]$/';
// $WORD must hold the PCRE translation of WORD; it is itself a pattern, so do not pass it through preg_quote()
$USER_REGEX = '/^\s*' . $WORD . '(\.' . $WORD . ')*$/';
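One way to see what USER_REGEX actually contains is to let Java assemble it and print the result. The sketch below copies the constants from the question verbatim; the class name UserRegexDump and the helper isValidUser are made up for illustration. The printed text is the pattern with single backslashes, which is the form a PHP single-quoted string expects, although constructs such as \p{Cntrl} may still need a PCRE equivalent such as [:cntrl:].

```java
import java.util.regex.Pattern;

class UserRegexDump {
    // Definitions copied from the question (including the '#' in SPECIAL_CHARS).
    static final String SPECIAL_CHARS = "\\p{Cntrl}\\(\\)<>#,;:'\\\\\\\"\\.\\[\\]";
    static final String VALID_CHARS = "[^\\s" + SPECIAL_CHARS + "]";
    static final String QUOTED_USER = "(\"[^\"]*\")";
    static final String WORD = "((" + VALID_CHARS + "|')+|" + QUOTED_USER + ")";
    static final String USER_REGEX = "^\\s*" + WORD + "(\\." + WORD + ")*$";

    // Hypothetical helper: true if the user part matches USER_REGEX as a whole.
    static boolean isValidUser(String user) {
        return Pattern.compile(USER_REGEX).matcher(user).matches();
    }

    public static void main(String[] args) {
        // Prints the regex as the engine sees it, with single backslashes.
        System.out.println(USER_REGEX);
        System.out.println(isValidUser("www.jinfu66")); // true
        System.out.println(isValidUser("two words"));   // false: whitespace is not allowed in a WORD
    }
}
```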

Related

Python Regex to Java

I am trying to convert a python regex to java. It finds a match in python but fails on the same string in java.
Python regex : "(CommandLineEventConsumer)(\x00\x00)(.*?)(\x00)(.*?)({})(\x00\x00)?([^\x00]*)?".format(event_consumer_name)
Java regex : "(CommandLineEventConsumer)(\\u0000\\u0000)(.*?)(\\u0000)(.*?)(" + event_consumer_name + ")(\\u0000\\u0000)?([^\\u0000]*)?"
I also tried this : "(CommandLineEventConsumer)(\\x00\\x00)(.*?)(\\x00)(.*?)(" + event_consumer_name + ")(\\x00\\x00)?([^\\x00]*)?"
What am I missing, please?
I have attached a piece of the code
String sampleStr = "\u0000\u0000�\u0003\b\u0000\u0000\u0000�\u0005\u0000\u0000\u0003\u0000\u0000�\u0000\u000B\u0000\u0000\u0000���\u0005\u0000\u0000\u0000\u0003\u0000\u0000\u0000 \u0000\u0000\u0000\u0000string\u0000\u0000WMIDataID\u0000\u0000SystemVersion\u0000\b\u0000\u0000\u0000\f\u0000.\u0000\u0000\u0000\u0000\u0000\u0000\u0000)\u0000\u0000\u0000 \u0000\u0000�\u0003\b\u0000\u0000\u0000'\u0006\u0000\u0000\u0003\u0000\u0000�\u0000\u000B\u0000\u0000\u0000��/\u0006\u0000\u0000\u0000\u0003\u0000\u0000\u0000\u000B\u0000\u0000\u0000\u0000string\u0000\u0000WMIDataID\u0000\f\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000�\u0016\u0000\u0000\u0000R\u0000O\u0000O\u0000T\u0000\\\u0000M\u0000i\u0000c\u0000r\u0000o\u0000s\u0000o\u0000f\u0000t\u0000\\\u0000H\u0000o\u0000m\u0000e\u0000N\u0000e\u0000t\u0000\u0019\u0000\u0000\u0000H\u0000N\u0000e\u0000t\u0000_\u0000C\u0000o\u0000n\u0000n\u0000e\u0000c\u0000t\u0000i\u0000o\u0000n\u0000P\u0000r\u0000o\u0000p\u0000e\u0000r\u0000t\u0000i\u0000e\u0000s\u0000 
\u0000\u0000\u0000C\u0000o\u0000n\u0000n\u0000e\u0000c\u0000t\u0000i\u0000o\u0000n\u0000�\u0000\u0000\u0000N\u0000S\u0000_\u00005\u00001\u00001\u00006\u00002\u00006\u0000F\u0000A\u0000E\u00004\u0000F\u00005\u00007\u0000D\u0000B\u0000D\u00002\u00000\u0000D\u0000F\u00005\u0000C\u0000D\u00004\u00004\u0000A\u00004\u00001\u0000D\u0000A\u0000E\u0000C\u0000E\u0000D\u00002\u00008\u0000C\u0000F\u00007\u0000B\u00003\u0000F\u0000D\u00008\u0000B\u00001\u00002\u00000\u00001\u00002\u0000C\u00007\u0000F\u00004\u0000B\u00005\u00008\u0000F\u00004\u00004\u0000E\u00006\u00006\u00005\u0000\\\u0000K\u0000I\u0000_\u0000A\u00000\u00001\u00000\u00008\u0000C\u0000E\u00002\u00006\u00001\u0000D\u00006\u0000C\u0000D\u00007\u00000\u0000D\u00003\u00005\u00000\u0000F\u00005\u0000B\u00007\u00002\u0000F\u00002\u0000E\u00009\u00008\u00007\u00004\u0000A\u0000E\u00006\u0000E\u00000\u00000\u00004\u0000D\u00003\u00000\u00002\u00009\u00000\u00001\u00005\u0000B\u00000\u00009\u00001\u00009\u0000B\u00001\u0000B\u0000D\u00003\u00002\u00006\u0000B\u0000B\u00006\u00004\u00009\u0000\\\u0000I\u0000_\u0000E\u0000D\u0000C\u0000E\u0000A\u00001\u00004\u0000E\u0000C\u00006\u00003\u0000A\u00005\u00007\u00004\u00001\u0000F\u0000A\u0000A\u00006\u00003\u00000\u00001\u0000C\u00007\u00007\u0000C\u0000A\u00002\u00006\u00000\u0000A\u0000B\u0000E\u0000C\u00000\u0000E\u00007\u00007\u00000\u00009\u00005\u00001\u00004\u0000F\u00006\u0000A\u00003\u00002\u0000C\u00000\u00003\u00004\u00007\u0000E\u00000\u00002\u00006\u00008\u00001\u00007\u0000C\u00008\u00008\u0000\u0000\u0000WQL:Re4\u00007\u0000C\u00007\u00009\u0000E\u00006\u00002\u0000C\u00002\u00002\u00002\u00007\u0000E\u0000D\u0000D\u00000\u0000F\u0000F\u00002\u00009\u0000B\u0000F\u00004\u00004\u0000D\u00008\u00007\u0000F\u00002\u0000F\u0000A\u0000F\u00009\u0000F\u0000E\u0000D\u0000F\u00006\u00000\u0000A\u00001\u00008\u0000D\u00009\u0000F\u00008\u00002\u00005\u00009\u00007\u00006\u00000\u00002\u0000B\u0000D\u00009\u00005\u0000E\u00002\u00000\u0000B\u0000D\u00003\u0000�3u�&��\u00
01����+\u0004�\u0001�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\f;\u0000\u0000\u0000\u000F\u0000\u0000\u0000�\u0000\u0000\u0000F\u0000\u0000\u0000/\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004\u0000\u0000\u0000\u0001�\u0000\u0000�\u0000__EventFilter\u0000\u001C\u0000\u0000\u0000\u0001\u0005\u0000\u0000\u0000\u0000\u0000\u0005\u0015\u0000\u0000\u0000�tw�}\n" +
"z�p�)��\u0001\u0000\u0000\u0000root\\cimv2\u0000\u0000BVTFilter\u0000\u0000SELECT * FROM __InstanceModificationEvent WITHIN 60 WHERE TargetInstance ISA \"Win32_Processor\" AND TargetInstance.LoadPercentage > 99\u0000\u0000WQL\u0000B\u0000B\u0000F\u0000C\u0000C\u0000B\u00004\u00004\u00004\u0000C\u0000F\u00006\u00006\u0000A\u0000A\u00000\u00009\u0000A\u0000E\u00006\u0000F\u00001\u00005\u00009\u00006\u00007\u0000A\u00006\u00008\u00006\u00005\u00001\u00007\u00005\u0000B\u0000B\u00000\u0000E\u0000D\u00002\u00001\u00006\u0000D\u00001\u00009\u00009\u00007\u00000\u0000A\u00007\u00009\u00008\u00008\u0000B\u00007\u00002\u0000C\u0000D\u0000F\u00000\u0000A\u00003\u0000A\u00004\u0000�3u�&��\u0001Ԏ��+\u0004�\u0001�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u000F�����\"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000/\u0000\u0000\u0000O\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u001A\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\\\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004\u0000\u0000\u0000\u0001q\u0000\u0000�\u0000CommandLineEventConsumer\u0000\u0000cscript KernCap.vbs\u0000\u001C\u0000\u0000\u0000\u0001\u0005\u0000\u0000\u0000\u0000\u0000\u0005\u0015\u0000\u0000\u0000�tw�}\n" +
"z�p�)��\u0001\u0000\u0000\u0000BVTConsumer\u0000\u0000C:\\\\tools\\\\kernrate\u00000\u0000A\u00007\u0000A\u0000B\u0000E\u00006\u00003\u0000F\u00003\u00006\u0000E\u00002\u0000B\u00002\u00009\u00002\u00000\u0000F\u0000E\u0000D\u0000A\u0000F\u0000A\u0000E\u00008\u00004\u00009\u00008\u00002\u00003\u0000A\u0000F\u00009\u00004\u00002\u00009\u0000C\u0000C\u00000\u0000E\u0000A\u00003\u00007\u00003\u0000F\u0000F\u0000E\u0000E\u00001\u00005\u00000\u00007\u0000E\u0000D\u0000B\u00002\u00001\u0000F\u0000D\u00009\u00001\u00007\u00000\u0000�3u�&��\u0001����+\u0004�\u0001�\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000�";
String event_consumer_name = "BVTConsumer";
String cPattern = "(CommandLineEventConsumer)(\\u0000\\u0000)(.*?)(\\u0000)(.*?)(" + event_consumer_name + ")(\\u0000\\u0000)?([^\\u0000]*)?";
Pattern consumer_mo = Pattern.compile(cPattern, Pattern.CASE_INSENSITIVE);
Matcher consumer_match = consumer_mo.matcher(sampleStr);
if(consumer_match.find()){
System.out.println(consumer_match.group(6));
}
UPDATE
In python the groups return
[screenshot of the Python result]
From what I posted as comments:
The (CommandLineEventConsumer)(\u0000\u0000)(.*?)(\u0000)(.*?) part matches fine.
group(3) gets cscript KernCap.vbs
group(4) gets a null character
but group(5) gets nothing.
I did try in Python and I have the exact same lack of match when I include the (BVTConsumer). So you probably had a difference in the code doing the matching in Python, not the regex itself.
The reason is that your string contains a \n, and by default . does not match line terminators, so the match stops there. If you compile the pattern with Pattern.DOTALL:
Pattern consumer_mo = Pattern.compile(cPattern, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
it does match in your example.
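A much smaller input shows the same effect. This is only a sketch: the input string and the helper found are invented for the demonstration and are not taken from the question's data.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical helper: compiles the regex with the given flags and searches the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "start\u0000\u0000first line\nsecond line END";
        String regex = "(start)(\\u0000\\u0000)(.*?)(END)";
        System.out.println(found(regex, 0, input));              // false: '.' cannot cross '\n'
        System.out.println(found(regex, Pattern.DOTALL, input)); // true
    }
}
```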

"missing ) after argument list" [duplicate]

I want to initialize a String in Java, but that string needs to include quotes; for example: "ROM". I tried doing:
String value = " "ROM" ";
but that doesn't work. How can I include "s within a string?
In Java, you can escape quotes with \:
String value = " \"ROM\" ";
In reference to your comment after Ian Henry's answer, I'm not quite 100% sure I understand what you are asking.
If it is about getting double quote marks added into a string, you can concatenate the double quotes into your string, for example:
String theFirst = "Java Programming";
String ROM = "\"" + theFirst + "\"";
Or, if you want to do it with one String variable, it would be:
String ROM = "Java Programming";
ROM = "\"" + ROM + "\"";
Of course, this reassigns ROM to a new String; the original String object is not modified, since Java Strings are immutable.
If you are wanting to do something like turn the variable name into a String, you can't do that in Java, AFAIK.
Not sure what language you're using (you didn't specify), but you should be able to "escape" the quotation mark character with a backslash: "\"ROM\""
\ = \\
" = \"
new line = \r\n or \n (depends on the OS), but usually \n is enough.
tab = \t
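Assuming the language is Java, the escapes listed above can be checked with a few lines. The class name EscapeDemo and the helper quote are invented for this sketch.

```java
class EscapeDemo {
    // Hypothetical helper: wraps a string in double quotes.
    static String quote(String s) {
        return "\"" + s + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("ROM"));          // "ROM"
        System.out.println(quote("ROM").length()); // 5: the two quotes count as characters
        System.out.println("C:\\temp");            // C:\temp
        System.out.println("a\tb");                // a and b separated by a tab
    }
}
```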
Just escape the quotes:
String value = "\"ROM\"";
In Java, you can use char value with ":
char quotes ='"';
String strVar=quotes+"ROM"+quotes;
Here is a full Java example:
public class QuoteInJava {
public static void main (String args[])
{
System.out.println ("If you need to 'quote' in Java");
System.out.println ("you can use single \' or double \" quote");
}
}
Here is the output:
If you need to 'quote' in Java
you can use single ' or double " quote
Look into this one ... call from anywhere you want.
public String setdoubleQuote(String myText) {
String quoteText = "";
if (!myText.isEmpty()) {
quoteText = "\"" + myText + "\"";
}
return quoteText;
}
This applies double quotes to a non-empty dynamic string. Hope this is helpful.
This tiny Java method will help you produce standard CSV text for a specific column.
public static String getStandardizedCsv(String columnText){
//contains line feed ?
boolean containsLineFeed = false;
if(columnText.contains("\n")){
containsLineFeed = true;
}
boolean containsCommas = false;
if(columnText.contains(",")){
containsCommas = true;
}
boolean containsDoubleQuotes = false;
if(columnText.contains("\"")){
containsDoubleQuotes = true;
}
// Strings are immutable, so the result must be assigned back
columnText = columnText.replace("\"", "\"\"");
if(containsLineFeed || containsCommas || containsDoubleQuotes){
columnText = "\"" + columnText + "\"";
}
return columnText;
}
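The same idea can be written more compactly. This is a sketch of a variant, not a full CSV writer: the names CsvField and escape are invented here, and only the three cases handled above (line feed, comma, double quote) trigger quoting.

```java
class CsvField {
    // Doubles embedded quotes and wraps the field in quotes when it contains
    // a line feed, a comma, or a double quote.
    static String escape(String field) {
        boolean needsQuotes = field.contains("\n") || field.contains(",") || field.contains("\"");
        String doubled = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + doubled + "\"" : doubled;
    }

    public static void main(String[] args) {
        System.out.println(escape("plain"));        // plain
        System.out.println(escape("a,b"));          // "a,b"
        System.out.println(escape("say \"hi\""));   // "say ""hi"""
    }
}
```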
Suppose ROM is a String variable which equals "strval".
You can simply do
String value = " \"" + ROM + "\" ";
and value will hold the text
 "strval" 
(quotes included, with one space on each side).

How to properly escape this regex patterns in java?

This is the input I want to process. I want to extract the value of the operation attribute:
<h:outputLink value="#" id="temp_solution">
<rich:componentControl
for="panel"
attachTo="temp_solution"
operation="show"
event="onclick"/>
</h:outputLink>
With the help of an online regex tester I came up with the following regular expression
(?<=operation=")(\w+)(?=")
To be a bit more dynamic, I replaced operation with %s so I can use this template for different situations. But I encountered a problem, while trying to test my "creation" with the help of a small test program:
public class Main {
private static final String INPUT = "<h:outputLink value=\"#\" id=\"temp_solution\">\n"
+ " <rich:componentControl \n"
+ " for=\"panel\" \n"
+ " attachTo=\"temp_solution\" \n"
+ " operation=\"show\""
+ " event=\"onclick\"/> \n"
+ "</h:outputLink>";
private static final String REGEX_TEMPLATE = "(?<=%s=\")(\\w+)(?=\")";
public static void main(String[] args) throws IOException {
final String actualRegex = String.format(REGEX_TEMPLATE, "operation");
final Pattern pattern = Pattern.compile(actualRegex);
final Matcher matcher = pattern.matcher(INPUT);
System.out.println("Regex: " + pattern);
System.out.println(matcher.matches() ? matcher.group(0) : "Nothing found");
}
}
Output:
Regex: (?<=operation=")(\w+)(?=")
Nothing found
Even double escaping the regex inside my code:
private static final String REGEX_TEMPLATE = "(?<=%s=\\\")(\\\\w+)(?=\\\")";
doesn't help:
Regex: (?<=operation=\")(\\w+)(?=\")
Nothing found
Please give me some advice on this.
There is nothing wrong with your regex. However, it doesn't match the entire input, so you can't use matches(). Change it to find(), which only tries to find a matching subsequence:
System.out.println(matcher.find() ? matcher.group(0) : "Nothing found");
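The difference between the two calls can be seen on a shortened input. In this sketch the input string, the class name FindVsMatches, and the helper attribute are made up; the regex template is the one from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatches {
    // Hypothetical helper: returns the value of the named attribute, or null if absent.
    static String attribute(String input, String name) {
        String regex = String.format("(?<=%s=\")(\\w+)(?=\")", name);
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "<tag for=\"panel\"\n  operation=\"show\" event=\"onclick\"/>";
        Pattern p = Pattern.compile("(?<=operation=\")(\\w+)(?=\")");
        System.out.println(p.matcher(input).matches());    // false: the regex does not cover the whole input
        System.out.println(attribute(input, "operation")); // show
    }
}
```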
Try a regex like this:
(?<=operation=\")(\w+)

Replace bibtex or latex {\dg} string command in java

I want to write a parser for a bibtex file. In my bibtex-file there is a string booktitle = {Wohnen - Pflege - Teilhabe {\dq}Besser leben durch Technik{\dq}} . Anyway, I am using JUnit-Tests and a method which should detect and replace {\dg} to ' or ". Unfortunately I am not able to write the corresponding java code, for example my following code did not detect the substring?
inproceedingsCitation1 += " booktitle = {Wohnen - Pflege - Teilhabe {\\dq}Besser leben durch Technik{\\dq}},\n";
Corresponding part of my replace-Method:
String afterDg = "";
CharSequence targetDg = "{\\dg}";
CharSequence replacementDg = "\"";
afterDg = afterAe.replace(targetDg, replacementDg);
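Your replace call searches for {\dg}, but the input contains {\dq}, so nothing is found; a literal replace with the corrected spelling would already work. This regular expression should also solve your problem: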
String afterDg = afterAe.replaceAll("\\{\\\\dq\\}", "\"");
For more details on regular expressions, have a look at Vogella's Java Regex Tutorial.
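Both approaches can be compared side by side. This is a sketch with invented names (DqReplace, literal, regex); it assumes the marker in the file is spelled {\dq}. Using Pattern.quote avoids escaping the braces and the backslash by hand.

```java
import java.util.regex.Pattern;

class DqReplace {
    // Literal replacement: String.replace does not interpret its target as a regex.
    static String literal(String s) {
        return s.replace("{\\dq}", "\"");
    }

    // Regex replacement: Pattern.quote turns the marker into a literal pattern.
    static String regex(String s) {
        return s.replaceAll(Pattern.quote("{\\dq}"), "\"");
    }

    public static void main(String[] args) {
        String title = "Teilhabe {\\dq}Besser leben durch Technik{\\dq}";
        System.out.println(literal(title)); // Teilhabe "Besser leben durch Technik"
        System.out.println(regex(title));   // Teilhabe "Besser leben durch Technik"
    }
}
```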

Java RegEx replace all characters in string except for a word

I am using the code in Java:
String word = "hithere";
String str = "123hithere12345hi";
output(str.replaceAll("(?!"+word+")", "x"));
However, rather than outputting xxxhitherexxxxxxx like I want it to, it outputs x1x2x3hxixtxhxexrxex1x2x3x4x5xhxix x. I've tried a load of different regex patterns, but I can't figure out how to do this :(
Any help would be much appreciated.
Well, this technically works. It uses only replaceAll on a single line, and it assumes your string does not contain the deprecated ASCII character BEL (char 7, which is the length of the word here):
String string = "hithere";
String string2 = "asdfasdfasdfasdfhithereasasdf";
System.out.println(string2.replaceAll(string,"" + (char)string.length()).replaceAll("[^" + (char)string.length() + "]", "x").replaceAll("" + (char)string.length(), string));
I think this is what you're looking for, if I'm not mistaken:
String pattern = "(\\d)|(hi$)";
System.out.println("123hithere12345hi".replaceAll(pattern, "X"));
The pattern replaces any numeric digit and a "hi" at the end of the string.
This lookaround based code will work for you:
String word = "hithere";
String string = "123hithere12345hi";
System.out.println(string.replaceAll(
".(?=.*?\\Q" + word + "\\E)|(?<=\\Q" + word + "\\E(.){0,99}).", "x"));
//=> xxxhitherexxxxxxx
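If a single regex is not a requirement, the same result can be built with a Matcher loop, which also avoids the fixed {0,99} bound in the lookbehind. This is an alternative sketch, not the lookaround technique above; the class name MaskExceptWord and the method mask are invented here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MaskExceptWord {
    // Replaces every character outside occurrences of word with 'x'.
    static String mask(String input, String word) {
        Matcher m = Pattern.compile(Pattern.quote(word)).matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Mask the gap between the previous occurrence and this one.
            for (int i = last; i < m.start(); i++) {
                out.append('x');
            }
            out.append(word);
            last = m.end();
        }
        // Mask whatever follows the final occurrence.
        for (int i = last; i < input.length(); i++) {
            out.append('x');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("123hithere12345hi", "hithere")); // xxxhitherexxxxxxx
    }
}
```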
