How to determine wrong domain/IP - java

I have a field in my web application that accepts domain names and IP addresses.
A user can mistakenly enter values such as:
10.10.10.
or
-domain.com
Both values are invalid according to the RFCs for IP addresses and domain names.
I use the following validators from the Guava library:
InetAddresses
InternetDomainName
My goal is to determine whether the user entered a wrong IP address or a wrong domain name, and to show a matching message.
For example, if the user enters "-domain.com", an alert should appear with the message "Wrong domain name is specified".
I have already used ^[0-9\\.]*$ to detect wrong IP addresses, but I am not sure whether it is correct.
Could you suggest some

Validate the IP address with the following regular expression:
^([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])$
Validate domain names with the following regular expression:
/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]\.[a-zA-Z]{2,}$/
Note that some input that users consider valid, such as http://google.com, will fail this test.
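One way to tell the two error cases apart is to guess the user's intent first and validate second. The sketch below is only an illustration and uses java.util.regex instead of Guava: the class name HostInputCheck, the "digits and dots only means an IP address was intended" heuristic, and the IP error message are assumptions made for this example; the domain message is the one from the question.

```java
import java.util.regex.Pattern;

final class HostInputCheck {
    // Dotted-quad IPv4 with each octet limited to 0..255.
    private static final String OCTET = "(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)";
    private static final Pattern IPV4 =
            Pattern.compile("^" + OCTET + "(\\." + OCTET + "){3}$");

    // One label: alphanumeric at both ends, hyphens allowed inside, at most 63 characters.
    private static final String LABEL = "[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?";
    private static final Pattern DOMAIN =
            Pattern.compile("^(" + LABEL + "\\.)+[a-zA-Z]{2,}$");

    // Assumption: input made only of digits and dots was meant to be an IP address.
    private static final Pattern LOOKS_LIKE_IP = Pattern.compile("^[0-9.]+$");

    /** Returns null when the input is acceptable, otherwise the message to show. */
    static String validate(String input) {
        if (input == null || input.isEmpty()) {
            return "No value is specified";
        }
        if (LOOKS_LIKE_IP.matcher(input).matches()) {
            return IPV4.matcher(input).matches() ? null : "Wrong IP address is specified";
        }
        return DOMAIN.matcher(input).matches() ? null : "Wrong domain name is specified";
    }
}
```

With this heuristic, "10.10.10." is reported as a wrong IP address and "-domain.com" as a wrong domain name. The same split could be kept while delegating the actual checks to the Guava validators mentioned in the question.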
Thanks

Related

Redisearch query with "begin with" instead of "contains"

I am trying to understand how to perform queries in Redisearch that strictly match "begins with", but I keep getting "contains" behavior.
For example, if I have fields with values like 'football', 'myfootball', 'greenfootball' and provide a search term like this:
> FT.SEARCH myIdx @myfield:foot*
I want just to get 'football' but I keep getting other fields that contain the word instead of beginning with that word.
Is there a way to avoid this?
I tried VERBATIM and things like @myfield:^foot*, but nothing worked.
I am using JRedisearch as a client but eventually I had to enter the DB and perform these queries manually in order to figure out what's happening. That being said, is this possible to do with this client at the moment?
Thanks
EDIT
A sample of my index setup:
Client client = new Client(INDEX_NAME, url, PORT);
Schema sc = new Schema().addSortableTextField("url", 1.0); // using this field for query
client.dropIndex(true);
client.createIndex(sc, Client.IndexOptions.Default());
return client;
Sample document:
id: // random uuid
urlPath: myfootbal
application: web
market: Europe
After checking the RDB you provided, I see that searching for foot* does not return myfootbal. The replies look like this: /dot-com/plp/football/x/index.html. You are getting those replies because this url is tokenized, and '/' is one of the tokenizer's separator characters. If you do not want those urls to be tokenized, you need to declare them as TAGS and not as TEXT. This way the entire url is indexed as is, and when you search for foot* it will not appear in the results.
For more information about TAGS see the FT.CREATE documentation: https://oss.redislabs.com/redisearch/Commands.html
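The difference between the two field types can be pictured with a small plain-Java simulation. This is not RediSearch's tokenizer and does not call the JRediSearch client; the class PrefixMatchSketch and its "split on every non-alphanumeric character" rule are simplifying assumptions used only to show why a prefix query can hit the middle of a tokenized url.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixMatchSketch {
    // Simplified stand-in for a full-text tokenizer: split on any non-alphanumeric character.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String token : value.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // TEXT-like behaviour: the prefix may match any token inside the value.
    static boolean textPrefixMatch(String value, String prefix) {
        for (String token : tokenize(value)) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // TAG-like behaviour: the value is kept whole, so the prefix must match its start.
    static boolean tagPrefixMatch(String value, String prefix) {
        return value.toLowerCase().startsWith(prefix);
    }
}
```

In this model, /dot-com/plp/football/x/index.html matches foot as tokenized text because one of its tokens is football, but it does not match when the whole url is treated as a single value.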

Regex to validate wildcard domains with special conditions

I am looking to validate wildcards against Samsung Knox Firewall. Please see below the full criteria for all domains:
A list of URLs for specified domain names to block DNS resolution. The format of the URL must be compliant with RFC's standards and must also match one of the following rules:
Full URL: "www.google.com"
Partial URL: "android.com"; "www.samsung"; "google". The
character "*" (wildcard) must be at the beginning and/or at the end
of the URL otherwise the URL is invalid.
Special case, matches any URL : "*"
Valid domains
The following examples are considered valid by Knox.
*.test.com
*test.com
*test
*test*
test.*
test1.test.*
Invalid domains
The following examples are considered invalid by Knox.
*test-
*test.
*test.com-
*test-.com
Is anybody able to offer a hand? I am struggling to accommodate for all of the requirements with this one.
Current code:
(?=^\*|.*\*$)^(?:\*\.?)?(?:(?:[a-z0-9-]+(?(?=\.)(?<!-)\.(?!-)))+[a-z]+)(?:\.?\*)?$
Edit: Actually, it looks like conditional regex may not even be supported in Java.
BASED ON YOUR PROVIDED EXAMPLES
If you're trying to pre-filter the domains, then this one matches all of your "Valid" examples and rejects all of your "Invalid" examples
^[\w*]([\w*-]+[\w*])?(\.[\w*]([\w*-]+[\w*])?)*$
If there's a file or carriage return separated field with all of these in it that you're trying to test, you may want to use the "multiline" switch like so:
(?m)^[\w*]([\w*-]+[\w*])?(\.[\w*]([\w*-]+[\w*])?)*$
Since you tagged Java, that would be encoded into a Java string as follows:
"(?m)^[\\w*]([\\w*-]+[\\w*])?(\\.[\\w*]([\\w*-]+[\\w*])?)*$"
EDIT - Matching all the rules, in addition to your provided examples
This expression seems to work:
^(\*|(\*|\*\.)?\w+(\.\w+)*(\.\*|\*)?)$
Matching/Non-matching examples:
MATCHING        NON-MATCHING
------------    ------------
*               *test-
*.test.com      *test.
*test.com       *test.com-
*test           *test-.com
*test*          test*.com
test.*          test.*com
test1.test.*    -test.com
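A short Java check of the last expression against the examples listed above could look like the following. The class and method names are hypothetical; only the pattern itself is taken from the answer. Note that \w does not include the hyphen, so a label such as my-site would be rejected by this pattern, and it does include the underscore.

```java
import java.util.regex.Pattern;

final class KnoxWildcardCheck {
    // The expression from the answer, escaped for a Java string literal.
    private static final Pattern WILDCARD_DOMAIN =
            Pattern.compile("^(\\*|(\\*|\\*\\.)?\\w+(\\.\\w+)*(\\.\\*|\\*)?)$");

    static boolean isValid(String domain) {
        return domain != null && WILDCARD_DOMAIN.matcher(domain).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"*", "*.test.com", "*test*", "test1.test.*", "*test-", "test*.com"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```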

apache commons-validator alternative for new gTLDS

I need to validate emails and domains. I just need a formal validation, no whois or other forms of domain lookup needed.
Currently I'm using apache's commons-validator v1.4.0
Unfortunately my customers use the new gTLDs, like .bike or .productions that are not yet supported by the DomainValidator class.
See Apache's Jira issue for more details.
Are there any sound alternatives that I may easily include in my Maven POM?
If you are not concerned about internationalized addresses, you could change last part of address, and continue to use Apache commons.
This approach is based on the fact that whatever the TLD is, the validity of the whole domain name is equivalent to the validity of the same domain name with the TLD replaced with com. For example:
abc.def.com is valid. Similarly abc.def.name, abc.def.xx--kput3i, abc.def.uk are valid.
ab,de.com is not valid. Similarly ab,de.name, ab,de.xx-kput3i, ab,de.uk are not valid.
So instead of calling
return EmailValidator.getInstance().isValid(userEmail);
You can call
if (userEmail == null) {
    return false;
}
return EmailValidator.getInstance().isValid(userEmail.trim().replaceFirst("\\.\\p{Alpha}[\\p{Alnum}-]*\\p{Alnum}$", ".com"));
Explanation
The regular expression "\\.\\p{Alpha}[\\p{Alnum}-]*\\p{Alnum}$" checks for the TLD part: it's at the end of the string (because of the $), it starts with a dot and contains no other dot, and it conforms to the standards: begins with an ASCII Alpha character, followed by zero or more alphanumerics or dashes, and ends with an alphanumeric character.
I am using trim() because until now, if you used EmailValidator, it allows spaces before and after the address. Removing the spaces just makes it easier to replace the TLD, and it shouldn't matter as far as the validity of the address is concerned.
If the string doesn't have a valid TLD at the end, String.replaceFirst() will return it as is. It could still be valid, because email addresses of the format x@[n.n.n.n], where n.n.n.n is a valid IP address, are valid. So basically, if you didn't find a TLD, you let EmailValidator decide the validity issue itself.
Of course, if the TLD is not an IANA-recognized TLD, this validation will not tell you that. An e-mail like david@galaxy.hoopie-frood will be accepted as legal, but IANA doesn't have that TLD as yet.
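To see the substitution step in isolation, here is a minimal sketch that uses only the JDK. The helper name withComTld is made up for this illustration; the regular expression is the one explained above, and the result would then be handed to EmailValidator or DomainValidator.

```java
final class TldNormalizer {
    // Final label: starts with a letter, ends with a letter or digit, contains no dot.
    private static final String TLD_REGEX = "\\.\\p{Alpha}[\\p{Alnum}-]*\\p{Alnum}$";

    // Replaces the trailing TLD, if one is found, with ".com"; otherwise returns the trimmed input.
    static String withComTld(String address) {
        return address.trim().replaceFirst(TLD_REGEX, ".com");
    }

    public static void main(String[] args) {
        System.out.println(withComTld("user@example.bike"));
        System.out.println(withComTld(" abc.def.productions "));
        System.out.println(withComTld("x@[10.0.0.1]"));
    }
}
```

An address that ends in a bracketed IP literal or in a dash has no matching TLD, so it passes through unchanged and the validator sees it as entered.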
Checking a domain is similar, without the trim() part:
if (userDomain == null) {
    return false;
}
return DomainValidator.getInstance().isValid(userDomain.replaceFirst("\\.\\p{Alpha}[\\p{Alnum}-]*\\p{Alnum}$", ".com"));
I have also tried JavaMail's email address validation, but I don't really like it: it allows completely invalid domain names such as net-name.net- (ending with a dash) or IP addresses (which are not allowed for e-mail without square brackets around them), and it's only good for e-mail addresses, not for domains.
Internationalization
If you need to check for internationalized domains and e-mails, it's a bit different. It's easy to check for internationalized domains (for example 元気。テスト). All you need to do is convert them to ASCII with java.net.IDN.toASCII() (yielding xn--z4qx76d.xn--zckzah for my example domain - this is a valid TLD), and then do the same as I wrote above.
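The conversion step can be tried on its own with the JDK class java.net.IDN. In this sketch the class and method names are hypothetical, and the example domain from the text is written with Unicode escapes so the source file stays ASCII-only.

```java
import java.net.IDN;

final class IdnExample {
    // Converts each label of a Unicode domain name to its ASCII (Punycode) form.
    // IDN.toASCII also treats the ideographic full stop as a label separator.
    static String toAsciiDomain(String domain) {
        return IDN.toASCII(domain);
    }

    public static void main(String[] args) {
        // The internationalized example domain from the text above.
        String unicodeDomain = "\u5143\u6c17\u3002\u30c6\u30b9\u30c8";
        System.out.println(toAsciiDomain(unicodeDomain));
        // A plain ASCII domain is returned unchanged.
        System.out.println(toAsciiDomain("example.com"));
    }
}
```

IDN.toASCII throws IllegalArgumentException for input it cannot convert, so a caller that validates user input would probably want to catch that and treat the domain as invalid.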
Internationalized e-mails are a different story. If the local part is ASCII, you can convert the domain part to ASCII. If you have to display the email address, you need to use the Unicode version, and if you have to send an email message, you use the ASCII version.
But recently a standard has been introduced for internationalized local parts as well, which also allows sending to the unicode version of the domain name without translating it to ASCII first. Whether you want to support that or not requires some thought, as not many mail servers and mail transfer agents support it at the moment.
Copied the implementation from DomainValidator and replaced the TOP_LABEL_REGEX expression with "\\p{Alpha}[\\p{Alnum}-]*\\p{Alpha}".
In addition, I removed validation against the hard coded list of approved gTLDs. This is, basically, quite weak in that it doesn't validate against the actual domains. But I think it's good enough (catches the gTLDs similar to XN--YGBI2AMMX).
See full list of approved gTLDs here.
// Copied from org.apache.commons.validator.routines.DomainValidator
private static final String DOMAIN_LABEL_REGEX = "\\p{Alnum}(?>[\\p{Alnum}-]*\\p{Alnum})*";
// Changed to include new gTLD - http://data.iana.org/TLD/tlds-alpha-by-domain.txt
private static final String TOP_LABEL_REGEX = "\\p{Alpha}[\\p{Alnum}-]*\\p{Alpha}";
// Copied from org.apache.commons.validator.routines.DomainValidator
private static final String DOMAIN_NAME_REGEX = "^(?:" + DOMAIN_LABEL_REGEX + "\\.)+" + "(" + TOP_LABEL_REGEX + ")$";
private static final RegexValidator domainRegex = new RegexValidator(DOMAIN_NAME_REGEX);
private static final EmailValidator EMAIL_VALIDATOR = new EmailValidator();
public static boolean isValidDomain(String domain) {
String[] groups = domainRegex.match(domain);
return groups != null && groups.length > 0;
}
What I often do in this situation is check out the source code for the library in question (it's open source, remember?), modify it to suit my requirement, and then contribute the patch back to the project.
Your use case certainly sounds like it would be a useful contribution.
I made you a public suffix list Java API. The method PublicSuffixList.getRegistrableDomain() can be used for Domain validation:
PublicSuffixListFactory factory = new PublicSuffixListFactory();
PublicSuffixList suffixList = factory.build();
assertNull(suffixList.getRegistrableDomain("galaxy.hoopie-frood"));
assertNotNull(suffixList.getRegistrableDomain("example.bike"));
Since DomainValidator is missing some of the new TLDs, the best solution for me was to add the missing TLDs as overrides.
DomainValidator.updateTLDOverride(ArrayType.COUNTRY_CODE_PLUS, new String[]{"someTLD"});
And then obtain the EmailValidator instance:
EmailValidator.getInstance(false, true)

Substitute {0}, {1} .. {n} in a template with given varargs

Consider a string template of the following format:
String template = "The credentials you provided were username '{0}' with password '{1}'";
Substitution variable fields are of the form {n}, where n is a zero based index.
This is the template format used in Adobe Flex, see StringUtil.substitute(...). And also .NET, IIRC.
Since I want to re-use the templates used by the Flex code, I'm looking for a Java equivalent. I'm aware of String.format(...) but the template structure is not identical.
What is the best way to get the same "Flex compatible" template functionality in Java?
Basically this is the desired end-result:
assert StringUtil.substitute(template, "powerUser", "difficultPassword").equals("The credentials you provided were username 'powerUser' with password 'difficultPassword'");
Use MessageFormat
You want java.text.MessageFormat http://download-llnw.oracle.com/javase/6/docs/api/java/text/MessageFormat.html
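One detail matters for the template in the question: MessageFormat treats a single quote as a quoting character, so '{0}' is printed literally as {0} instead of being substituted. A minimal sketch of a Flex-style helper therefore doubles the quotes first. The class name TemplateSubstitution is hypothetical, and the helper assumes that every single quote in the template is meant literally.

```java
import java.text.MessageFormat;

final class TemplateSubstitution {
    // Doubles single quotes so MessageFormat prints them instead of using them for quoting,
    // then substitutes {0}, {1}, ... with the given arguments.
    static String substitute(String template, Object... args) {
        return MessageFormat.format(template.replace("'", "''"), args);
    }

    public static void main(String[] args) {
        String template = "The credentials you provided were username '{0}' with password '{1}'";
        System.out.println(substitute(template, "powerUser", "difficultPassword"));
    }
}
```

MessageFormat also formats numeric arguments according to the default locale, so passing values as strings is the safer choice when the digits must appear exactly as given.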

regular expression for email validation in Java

I am using the following regular expression:
(".+@.+\\.[a-z]+")
But it accepts #@#.com as a valid email. What's the pattern I should use?
You should use apache-commons email validator. You can get the jar file from here.
Here is a simple example of how to use it:
import org.apache.commons.validator.routines.EmailValidator;
boolean isValidEmail = EmailValidator.getInstance().isValid(emailAddress);
Here's a web page that explains that better than I can: http://www.regular-expressions.info/email.html (EDIT: that appears to be a bit out of date since it refers to RFC 2822, which has been superseded by RFC 5322)
And another with an interesting take on the problem of validation: http://www.markussipila.info/pub/emailvalidator.php
Generally the best strategy for validating an email address is to just try sending mail to it.
If somebody wants to enter a non-existent email address, they will do it whatever format validation you choose.
The only way to check that the user owns the email address they entered is to send a confirmation (or activation) link to that address and ask the user to click it.
So don't try to make your users' lives harder. Checking for the presence of @ is good enough.
[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}
I usually use the following one:
([a-zA-Z0-9]+(?:[._+-][a-zA-Z0-9]+)*)@([a-zA-Z0-9]+(?:[.-][a-zA-Z0-9]+)*[.][a-zA-Z]{2,})
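As a rough sketch, that pattern (with @ separating the local part from the domain) can be wrapped in a small Java helper. The class and method names are made up for this example, and passing the check only means the text is shaped like an address, not that the mailbox exists.

```java
import java.util.regex.Pattern;

final class EmailFormatCheck {
    // Local part: alphanumeric runs joined by single '.', '_', '+' or '-'.
    // Domain: alphanumeric runs joined by '.' or '-', ending in a dot and at least two letters.
    private static final Pattern EMAIL = Pattern.compile(
            "([a-zA-Z0-9]+(?:[._+-][a-zA-Z0-9]+)*)"
            + "@([a-zA-Z0-9]+(?:[.-][a-zA-Z0-9]+)*[.][a-zA-Z]{2,})");

    static boolean looksLikeEmail(String value) {
        return value != null && EMAIL.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("john.doe@example.com"));
        System.out.println(looksLikeEmail("#@#.com"));
    }
}
```

Unlike .+@.+\.[a-z]+, this pattern requires letters or digits on both sides of the @, which is why an input such as #@#.com is rejected.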
import java.util.regex.*;

class ValidateEmailPhone {
    public static void main(String args[]) {
        // Phone number validation: starts with 9 and has 10 digits
        System.out.println(Pattern.matches("[9]{1}[0-9]{9}", "9999999999"));
        // Email validation
        System.out.println(Pattern.matches("[a-zA-Z0-9]{1,}[@]{1}[a-z]{5,}[.]{1}+[a-z]{3}", "abcd@gmail.com"));
    }
}
This is my regex for email validation:
(([a-zA-Z0-9]+)([\.\-_]?)([a-zA-Z0-9]+)([\.\-_]?)([a-zA-Z0-9]+)?)(@)([a-zA-Z]+.[A-Za-z]+\.?([a-zA-Z0-9]+)\.?([a-zA-Z0-9]+))
For the username it allows ".", "_" and "-" as separators.
After "@" it allows only "." and "-".
It can easily be modified for more words.
