Java text filter

For a project I need to develop a Java application that checks a string in several ways:
First, a check of whether the text contains a word from a specified list.
Keep in mind that:
The input may contain a word from the list with spaces or special characters inserted between its characters to bypass the filter. In that case the filter needs to catch the word too.
The word may be embedded in another word. In that case it needs to be filtered only if a before and/or after filter is specified for it in the list.
Second, filtering the text if it contains an IP address.
Keep in mind that:
The input may contain an IP address in which special characters or spaces are used to bypass the filter. In that case the filter needs to catch the IP address too.
Third, filtering web addresses from the text.
Here too, keep in mind that:
The input may contain a web address in which special characters or spaces are used to bypass the filter. In that case the filter needs to catch the web address too.
I tested some ideas that check for spaces and special characters, but processing the incoming text this way takes a lot of work.
An example of what I tried:
public static boolean validateBericht(String msg) {
return validateTransformedBericht(msg);
}
private static boolean validateTransformedBericht(String bericht) {
if (bericht.length() != 0) {
for (String woord : ChatControlList.getChatControlList()
.getWoordenLijst()) {
for (int i = 0; i < (bericht.length() - (woord.length() - 1)); i++) {
if (i == 0 || inTekenLijst(bericht.charAt(i))) {
int index = 0;
for (int j = i; j < bericht.length(); j++) {
if (inTekenLijst(bericht.charAt(j))) {
} else if (bericht.charAt(j) == woord.charAt(index)) {
index++;
} else {
break;
}
if (index == woord.length()) {
if ((bericht.length() - 1) == j
|| inTekenLijst(bericht.charAt(index))) {
return true;
} else {
break;
}
}
}
}
}
}
}
return false;
}
private static boolean inTekenLijst(char teken) {
for (String tekenUitLijst : ChatControlList.getChatControlList()
.getSpecialeTekens()) {
if (tekenUitLijst.equalsIgnoreCase(String.valueOf(teken))
|| String.valueOf(teken).equalsIgnoreCase(" ")) {
return true;
}
}
return false;
}
Does anyone have an idea for a solution that works well?
Harm

In this case you should create two methods:
the first to test whether the String contains the searched word,
and the second to test the type of an address.
Then you can use them however you want in your code.
Code to check whether your String contains the searched word:
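// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String line = "the wor ld is wonderful";
String search = "wor ld";
// Pattern.quote makes the search text literal, so characters such as '.' or '(' in it are not read as regex syntax
Pattern r = Pattern.compile(Pattern.quote(search));
Matcher m = r.matcher(line);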
if (m.find()) {
System.out.println("Found value: " + m.group(0));
} else {
System.out.println("NO MATCH");
}
Method that tests a given address and tells whether it is an IP address, a web address, or an invalid address:
public static String testAddress(String address) {
if (address.matches("^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$")) {
return "IP Address";
} else if (address.matches("^(http\\:\\/\\/|https\\:\\/\\/)?([a-z0-9][a-z0-9\\-]*\\.)+[a-z0-9][a-z0-9\\-]*$")) {
return "Web address";
} else {
return "invalid input";
}
}
And this is a Working Ideone Test.
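The testAddress method above uses matches(), which only succeeds when the whole string is an address. To detect an address embedded in a longer message, as the question requires, the same octet pattern can be used with find(). The following is a minimal sketch; the class name AddressScanner and the method containsIpv4 are hypothetical, and it does not attempt to handle addresses obfuscated with extra separators.

```java
import java.util.regex.Pattern;

class AddressScanner {
    // One IPv4 octet (0-255), same alternation as in testAddress above.
    private static final String OCTET = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

    // Four octets separated by dots, not directly preceded or followed by a digit or dot,
    // so that fragments of longer dotted numbers are not reported.
    private static final Pattern IPV4 = Pattern.compile(
            "(?<![0-9.])" + OCTET + "(?:\\." + OCTET + "){3}(?![0-9.])");

    // Returns true if an IPv4-looking address occurs anywhere in the text.
    static boolean containsIpv4(String text) {
        return IPV4.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsIpv4("connect to 192.168.1.10 now")); // true
        System.out.println(containsIpv4("version 1.2.3 released"));      // false
    }
}
```

The lookbehind and lookahead are a design choice to reduce false positives on version strings and longer numbers; drop them if any dotted quad should count.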

For the first part, you could strip out all of the special characters and spaces, e.g.
testString = origString.replaceAll("[- #$%]", ""); //Extend the regex to add your own special characters
...and then search for the words...
containsWord = testString.toLowerCase().contains(badWord);
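The strip-then-search idea above can be sketched as a small self-contained class. The class name WordFilter, the method containsBlockedWord and the sample word are assumptions made for illustration, not part of the original code. Instead of listing special characters one by one, this version removes every character that is not a letter or a digit.

```java
import java.util.List;
import java.util.Locale;

class WordFilter {
    // Returns true if any blocked word occurs in the input once all
    // non-letter, non-digit characters have been removed.
    static boolean containsBlockedWord(String input, List<String> blockedWords) {
        String normalized = input.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]", "");
        for (String word : blockedWords) {
            if (normalized.contains(word.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> blocked = List.of("badword"); // hypothetical list entry
        System.out.println(containsBlockedWord("b-a d w.o r d", blocked)); // true
        System.out.println(containsBlockedWord("hello world", blocked));   // false
    }
}
```

Because spaces are removed as well, a blocked word that only appears when two neighbouring words are joined will also match, so this approach trades some false positives for robustness against padding.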

Related

Convert a String to customised version of Snake case and capitalise first character before every underscore in Java

Suppose I have string like FOO_BAR or foo_bar or Foo_Bar, I want to convert it to customised snake case like below
`FOO_BAR` -> `Foo Bar`
`foo_bar` -> `Foo Bar`
`Foo_Bar` -> `Foo Bar`
As you can see, all three inputs give the same output: capitalise the first character of the string and the first character after every underscore, lowercase the rest, and replace each _ (underscore) with a space.
I looked into various libraries like Guava and Apache, but as this is a customised format I couldn't find any out-of-the-box solution.
I tried the code below and got it to work, but it looks a bit complex
str.replaceAll("([a-z])([A-Z])", "$1_$2").replaceAll("_", " ")
The output of the above code is like FOO BAR, basically all characters in uppercase. I can fix that in another pass, but I'm looking for something more efficient and simple.

Just for a bit of fun, here is a stream-based answer:
var answer = Arrays.stream(s.split("_"))
.map(i -> i.substring(0, 1).toUpperCase() + i.substring(1).toLowerCase())
.collect(Collectors.joining(" "));
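As written, the stream snippet throws a StringIndexOutOfBoundsException when a segment is empty, for example with a doubled underscore, because substring(0, 1) is called on an empty string. A hedged, self-contained variant that skips empty segments could look like this; the class name SnakeToTitle and the method convert are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class SnakeToTitle {
    // Splits on underscores, drops empty segments, and capitalises
    // the first character of each remaining segment.
    static String convert(String s) {
        return Arrays.stream(s.split("_"))
                .filter(part -> !part.isEmpty())
                .map(part -> part.substring(0, 1).toUpperCase()
                        + part.substring(1).toLowerCase())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(convert("FOO_BAR"));   // Foo Bar
        System.out.println(convert("foo__bar_")); // Foo Bar
    }
}
```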

Here's a simple implementation. I would add a few more test cases before I trusted it.
It does not handle characters outside the Basic Multilingual Plane, which Java represents as two char values (a surrogate pair).
public class Snakify {
public static String toSnake(String in) {
boolean first = true;
boolean afterUnderscore = false;
char[] chars = in.toCharArray();
for (int i = 0; i < chars.length; i++) {
if ((first || afterUnderscore) && Character.isAlphabetic(chars[i])) {
chars[i] = Character.toUpperCase(chars[i]);
first = false;
afterUnderscore = false;
} else if (chars[i] == '_') {
chars[i] = ' ';
afterUnderscore = true;
} else if (Character.isAlphabetic(chars[i])) {
chars[i] = Character.toLowerCase(chars[i]);
}
}
return new String(chars);
}
public static void main(String[] args) {
System.out.println(toSnake("FOO_BAR").equals("Foo Bar"));
System.out.println(toSnake("foo_bar").equals("Foo Bar"));
System.out.println(toSnake("Foo_Bar").equals("Foo Bar"));
System.out.println(toSnake("àèì_òù_ÀÈÌ").equals("Àèì Òù Àèì"));
}
}

How to program a context-free grammar?

I have two classes here.
The CFG class takes a string array in its constructor that defines the context-free grammar. The SampleTest class is being used to test the CFG class by inputting the grammar (C) into the class, then inputting a string by the user, and seeing if that string can be generated by the context-free grammar.
The problem I'm running into is a stack overflow (obviously). I'm assuming that I just created a never-ending recursive function.
Could someone take a look at the processData() function and help me figure out how to fix it? I'm basically using recursion to generate all possible strings that the CFG can create, then returning true if one of the generated strings matches the user's input (inString). Oh, and the wkString parameter is simply the string being generated by the grammar through each recursive call.
public class SampleTest {
public static void main(String[] args) {
// Language: strings that contain 0+ b's, followed by 2+ a's,
// followed by 1 b, and ending with 2+ a's.
String[] C = { "S=>bS", "S=>aaT", "T=>aT", "T=>bU", "U=>Ua", "U=>aa" };
String inString, startWkString;
boolean accept1;
CFG CFG1 = new CFG(C);
if (args.length >= 1) {
// Input string is command line parameter
inString = args[0];
char[] startNonTerm = new char[1];
startNonTerm[0] = CFG1.getStartNT();
startWkString = new String(startNonTerm);
accept1 = CFG1.processData(inString, startWkString);
System.out.println(" Accept String? " + accept1);
}
} // end main
} // end class
public class CFG {
private String[] code;
private char startNT;
CFG(String[] c) {
this.code = c;
setStartNT(c[0].charAt(0));
}
void setStartNT(char startNT) {
this.startNT = startNT;
}
char getStartNT() {
return this.startNT;
}
boolean processData(String inString, String wkString) {
if (inString.equals(wkString)) {
return true;
} else if (wkString.length() > inString.length()) {
return false;
}
// search for non-terminal in the working string
boolean containsNT = false;
for (int i = 0; i < wkString.length(); i++) {
// if one of the characters in the working string is a non-terminal
if (Character.isUpperCase(wkString.charAt(i))) {
// mark containsNT as true, and exit the for loop
containsNT = true;
break;
}
}
// if there isn't a non-terminal in the working string
if (containsNT == false) {
return false;
}
// for each production rule
for (int i = 0; i < this.code.length; i++) {
// for each character on the RHS of the production rule
for (int j = 0; j <= this.code[i].length() - 3; j++) {
if (Character.isUpperCase(this.code[i].charAt(j))) {
// make substitution for non-terminal, creating a new working string
String newWk = wkString.replaceFirst(Character.toString(this.code[i].charAt(0)), this.code[i].substring(3));
if (processData(inString, newWk) == true) {
return true;
}
}
}
} // end for loop
return false;
} // end processData
} // end class

Your grammar contains a left-recursive rule
U=>Ua
Recursive-descent parsers can't handle left-recursion, as you've just discovered.
You have two options: alter your grammar so it is no longer left-recursive, or use a parsing algorithm that can handle left recursion, such as LR(1). In your case, U matches "at least two a characters", so we can just move the recursion to the right.
U=>aU
and everything will be fine. This isn't always possible to do in such a nice way, but in your case, avoiding left-recursion is the easy solution.

You don't need this for loop: "for (int j = 0; j <= this.code[i].length() - 3; j++)". Just create a variable to hold the capital letter found in the non-terminal search you did above. Then, in your outer for loop, whenever a production rule in the String[] starts with that non-terminal, do your substitution and recursion.
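One way to read the suggestion above, offered as a sketch rather than a definitive fix: expand only the leftmost non-terminal, and only with rules whose left-hand side is that non-terminal. In the question's code, replaceFirst returns the string unchanged when the rule's left-hand side does not occur in wkString, so the recursive call repeats with identical arguments, which appears to be what exhausts the stack. The class name CfgSketch and the method derives are hypothetical; the "X=>rhs" rule format is the question's. Termination here assumes every right-hand side has at least two characters, as in the question's grammar.

```java
class CfgSketch {
    private final String[] rules; // each rule has the form "X=>rhs"

    CfgSketch(String[] rules) {
        this.rules = rules;
    }

    // Returns true if 'working' can be rewritten into 'target' using the rules.
    boolean derives(String target, String working) {
        if (working.equals(target)) {
            return true;
        }
        if (working.length() > target.length()) {
            return false; // rules never shrink the string, so this branch is dead
        }
        // Locate the leftmost non-terminal (an upper-case letter).
        int pos = -1;
        for (int i = 0; i < working.length(); i++) {
            if (Character.isUpperCase(working.charAt(i))) {
                pos = i;
                break;
            }
        }
        if (pos < 0) {
            return false; // only terminals left, and they differ from the target
        }
        // Terminals before the non-terminal are final, so they must match the target.
        if (!target.startsWith(working.substring(0, pos))) {
            return false;
        }
        char nonTerminal = working.charAt(pos);
        for (String rule : rules) {
            if (rule.charAt(0) == nonTerminal) {
                String next = working.substring(0, pos)
                        + rule.substring(3)
                        + working.substring(pos + 1);
                if (derives(target, next)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] c = { "S=>bS", "S=>aaT", "T=>aT", "T=>bU", "U=>Ua", "U=>aa" };
        CfgSketch cfg = new CfgSketch(c);
        System.out.println(cfg.derives("aabaa", "S")); // true
        System.out.println(cfg.derives("aab", "S"));   // false
    }
}
```

Restricting the search to leftmost derivations loses no strings, since every string a context-free grammar generates has a leftmost derivation. With the length bound in place, the left-recursive rule U=>Ua also terminates in this sketch, because each application makes the working string longer.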

Java Swing Register Form Password Weakness Conditions

I want to make a registration system in Java. It is progressing well; I only have one specific problem: password strength. I required the password to be longer than 8 characters with a simple
if(password.getText().length() > 8) { error message }
I also put a condition just like that:
if(... < 8 || !(password.getText().contains("1"))) { error message }
But with this condition it only accepts a password like: asdfghjk1
So I tried chaining many || conditions like !....contains("2")..|| !..contains("9")
But with these conditions it only works when the password is: 123456789
What I really want is a password that is longer than 8 characters and contains at least one capital letter and at least one digit. Is there any way to do that?
By the way, I use Java Swing.

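The best way to solve this problem is to use a regex. Here is an example of how to use a regex to check a password. Note that this pattern is stricter than the question asks: it also requires a lowercase letter and a special character, forbids whitespace, and limits the length to 8 to 20 characters.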
import java.util.regex.*;
class GFG {
// Function to validate the password.
public static boolean
isValidPassword(String password)
{
// Regex to check valid password.
String regex = "^(?=.*[0-9])"
+ "(?=.*[a-z])(?=.*[A-Z])"
+ "(?=.*[@#$%^&+=])"
+ "(?=\\S+$).{8,20}$";
// Compile the ReGex
Pattern p = Pattern.compile(regex);
// If the password is null
// return false
if (password == null) {
return false;
}
// Pattern class contains matcher() method
// to find matching between given password
// and regular expression.
Matcher m = p.matcher(password);
// Return if the password
// matched the ReGex
return m.matches();
}
// Driver Code.
public static void main(String args[])
{
// Test Case 1:
String str1 = "Thuans#portal20";
System.out.println(isValidPassword(str1));
// Test Case 2:
String str2 = "DaoMinhThuan";
System.out.println(isValidPassword(str2));
// Test Case 3:
String str3 = "Thuan# portal9";
System.out.println(isValidPassword(str3));
// Test Case 4:
String str4 = "1234";
System.out.println(isValidPassword(str4));
// Test Case 5:
String str5 = "Gfg#20";
System.out.println(isValidPassword(str5));
// Test Case 6:
String str6 = "thuan#portal20";
System.out.println(isValidPassword(str6));
}
}
Output:
true
false
false
false
false
false
Also you can refer to similar topics by following the link below:
Regex Java for password

You can do that with a regex, but I don't know how.
But this should work.
This is where you validate your password:
String passwordString = password.getText();
if (passwordString.length() > 8 && checkCapital(passwordString) && checkDigit(passwordString)){ valid password }
else { error message }
This is checkCapital; I used ASCII codes:
private static boolean checkCapital(String string) {
for (char c : string.toCharArray()) {
int code = (int) c;
if (code >= 65 && code <= 90)
return true;
}
return false;
}
this is checkDigit:
private static boolean checkDigit(String string) {
for (int i = 0; i < 10; i++) {
if (string.contains("" + i))
return true;
}
return false;
}
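The same three requirements from the question (longer than 8 characters, at least one capital letter, at least one digit) can also be checked without ASCII codes or a digit loop. This is a minimal sketch; the class name PasswordRules and the method isAcceptable are hypothetical. In a Swing form, JPasswordField.getPassword() returns a char[], which can be wrapped with new String(...) before calling it.

```java
class PasswordRules {
    // True if the password is longer than 8 characters and contains
    // at least one upper-case letter and at least one digit.
    static boolean isAcceptable(String password) {
        if (password == null || password.length() <= 8) {
            return false;
        }
        boolean hasUpper = password.chars().anyMatch(Character::isUpperCase);
        boolean hasDigit = password.chars().anyMatch(Character::isDigit);
        return hasUpper && hasDigit;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("Asdfghjk1")); // true
        System.out.println(isAcceptable("asdfghjk1")); // false, no capital letter
        System.out.println(isAcceptable("Abc1"));      // false, too short
    }
}
```

Unlike the ASCII range check, Character.isUpperCase and Character.isDigit also accept non-ASCII letters and digits; whether that is desirable depends on the registration rules.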

Finding and printing a string that has white space using for loop in Java

I am trying to print any email addresses that are invalid. An email address is invalid if it does not contain an # sign or a period, or if it contains a space. My code reports email addresses that lack an # sign or a period, but it does not report email addresses that contain a space.
public static void print_emails(){
for (int i = 0; i < student.size(); i++) {
if (student.get(i).getEmail().contains("#") && student.get(i).getEmail().contains(".") && student.get(i).getEmail().contains("")){
System.out.println("Scanning Roster");
}
else if (student.get(i).getEmail().contains("\\s")){
System.out.println("Invalid email address, " + student.get(i).getEmail());
}
else {
System.out.println("Invalid email address, " + student.get(i).getEmail());
}
}
}

A better way of doing it would be to match the email string against a regular expression. The following should work:
^[A-Za-z0-9+_.-]+#(.+)$
This is a very simple one, which you could make increasingly complex to suit your needs. Currently, this one ensures:
1) A-Z and a-z characters are allowed
2) 0-9 numbers are allowed
3) The email may contain only dot(.), dash(-) and underscore(_)
As well as allowing the # symbol in the correct place.

contains() does a literal substring search, not a regex match, so contains("\\s") looks for a backslash followed by an s. Use matches() with a regex instead:
else if (studentList.get(i).getEmail().matches(".*\\s.*"))
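The difference between the two calls can be seen in a short, self-contained demonstration. The class name WhitespaceCheck, the helper hasWhitespace and the sample string are assumptions made for illustration.

```java
class WhitespaceCheck {
    // Regex-free alternative: true if any character is whitespace.
    static boolean hasWhitespace(String s) {
        return s.chars().anyMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        String value = "jane doe";
        // contains() searches for the two literal characters '\' and 's'.
        System.out.println(value.contains("\\s"));      // false
        // matches() treats its argument as a regex that must cover the whole string.
        System.out.println(value.matches(".*\\s.*"));   // true
        System.out.println(hasWhitespace(value));       // true
    }
}
```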

You can modify your code to include this:
public static void print_invalid_emails()
{
for (int i = 0; i < studentList.size(); i++)
{
if (studentList.get(i).getEmail().contains("#") && studentList.get(i).getEmail().contains(".") && !studentList.get(i).getEmail().contains(" "))
{
System.out.println("Scanning Roster");
}
else
{
System.out.println("Invalid email address, " + studentList.get(i).getEmail());
}
}
}

Java Program - Email address verification using Strings and RegEx

The problem statement goes like this:
We need to create a String data type called emailId
The email ID has to be set using appropriate setter methods.
The validation rules for the email ID check have to be implemented in the main().
Conditions are:
Overall length of the email ID must be >3 and <20.
The emailId must include "#" followed by a minimum of 1 and maximum of 2 "." characters.
The substring before "#" must contain a combination of Upper Case, Lower Case and "_"(underscore) symbols.
The first letter of the email Id must be in Upper Case.
If all the above conditions are valid, it should display "Email ID is valid" or else, it should display an appropriate error message
This is my code:
public class EmailCheck {
String emailId;
public void setEmailId(String emailId){
this.emailId=emailId;
}
public String getEmailId(){
return emailId;
}
public static void main(String[] args) {
EmailCheck em = new EmailCheck();
em.setEmailId("CFDV#gm.a.il.com");
String email = em.getEmailId();
int length = email.length();
boolean flag1 = false;
boolean flag2 = false;
boolean flag3 = false;
boolean flag4 = false;
boolean flag5 = false;
boolean flag6 = false;
boolean flag7 = false;
int count = 0;
//Condition 1
if((length>3 && length<20)== true)
flag1 = true;
else
flag1 = false;
//Condition 2
int temp = email.length();
if(email.contains("#")){
flag2=true;
int a=email.indexOf("#");
for(int i=a;i<temp;i++){
if(email.charAt(i)=='.'){
flag3=true;
count=count+1;
}
}
if(count<1||count>2){
flag4=false;
}
else{
flag4=true;
}
}
else{
flag2 =false;
System.out.println("No # symbol present");
}
//Condition 3
if(email.matches("[A-Z a-z _]+#.*")) //Unable to get the right RegEx here!
flag5 = true;
else
flag5 = false;
//Condition4
if(Character.isUpperCase(email.charAt(0))==true)
flag6 = true;
else
flag6=false;
if(flag1==true && flag2==true && flag3==true && flag4==true && flag5==true &&flag6==true)
System.out.println("Email ID is valid");
else{
if(flag1==false)
System.out.println("Inavlid length of Email ID");
if(flag2==false||flag3==false||flag4==false)
System.out.println("Invalid Position of Special Characters");
if(flag5==false)
System.out.println("Invalid combination for username");
if(flag6==false)
System.out.println("Invalid case of first letter");
}
}
}
I'm not sure of the condition #2(the logic?) and condition #3(the RegExp part). A few of the test cases seem correct, the rest of them are wrong(owing to faulty logic in #2 and esp #3, I think.)
Please, tell me what changes have to be done here to get the right output.
Thanks!

If you insist on using a regex you can use this, but without proper validation you could be in for all kinds of trouble
static Pattern emailPattern = Pattern.compile("[a-zA-Z0-9[!#$%&'()*+,/\\-_\\.\"]]+#[a-zA-Z0-9[!#$%&'()*+,/\\-_\"]]+\\.[a-zA-Z0-9[!#$%&'()*+,/\\-_\"\\.]]+");
public static boolean isValidEmail(String email) {
// matches() is true only when the whole string fits the pattern
Matcher m = emailPattern.matcher(email);
return m.matches();
}
Or alternatively you could use
public static boolean isValidEmailAddress(String email) {
boolean result = true;
try {
InternetAddress emailAddr = new InternetAddress(email);
emailAddr.validate();
} catch (AddressException ex) {
result = false;
}
return result;
}
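This uses javax.mail.internet.InternetAddress from the JavaMail library, which is not part of core Java and must be on the classpath, but it would be the better option.
Note that neither of these will guarantee that an address actually exists, just that it is correctly formed and so hypothetically could exist.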
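// Backslashes must be doubled in a Java string literal; \\. matches a literal dot, and {1,2} allows one or two dots
String email_regex = "[A-Z][a-zA-Z_]*#([a-zA-Z]+\\.){1,2}[a-zA-Z]+";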
String testString = "Im_an_email#email.co.uk";
Boolean b = testString.matches(email_regex);
System.out.println("String: " + testString + " :Valid = " + b);
will check for the last three constraints which you can then combine with
string.length()>3 && string.length()<20

Overall length of the email ID must be >3 and <20.
You have this part fine - a pair of length checks. Things to consider:
You don't need if (condition == true), just if (condition).
If this fails, you can stop processing the email and just display the error. The same applies for all your other error conditions.
The emailId must include "#" followed by a minimum of 1 and maximum of 2 "." characters.
Check for the # sign first and get its index. Once you have that, you can split the string there. You can then split the second substring on periods and count the parts: you want two or three parts, which means one or two periods.
There's probably a regex that would do this as well.
The substring before "#" must contain a combination of Upper Case, Lower Case and "_"(underscore) symbols.
You can use this approach to ensure your regex must match each part. As it stands, I believe your regex will work if there are caps, lowercase, or an underscore instead of checking for all three.
The first letter of the email Id must be in Upper Case.
Same comments as for the first part, drop the == true and short circuit the method if it fails.
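Putting the four points above together, a hedged sketch might collect error messages instead of juggling boolean flags. It keeps the # separator exactly as it appears in the question's code. The class name EmailRules, the method check and the wording of the messages are hypothetical. Condition 3 is read here as "the part before # contains only upper-case letters, lower-case letters and underscores, with at least one of each", which is one possible interpretation of the assignment.

```java
import java.util.ArrayList;
import java.util.List;

class EmailRules {
    // Returns a list of rule violations; an empty list means the ID is valid.
    static List<String> check(String email) {
        List<String> errors = new ArrayList<>();

        // Condition 1: overall length must be > 3 and < 20.
        if (email.length() <= 3 || email.length() >= 20) {
            errors.add("Invalid length of Email ID");
        }

        // Condition 2: a '#' followed by one or two '.' characters.
        int sep = email.indexOf('#');
        if (sep < 0) {
            errors.add("No # symbol present");
            return errors; // the remaining checks need the separator
        }
        String user = email.substring(0, sep);
        String domain = email.substring(sep + 1);
        long dots = domain.chars().filter(ch -> ch == '.').count();
        if (dots < 1 || dots > 2) {
            errors.add("Expected one or two '.' characters after #");
        }

        // Condition 3: lookaheads require at least one upper-case letter,
        // one lower-case letter and one underscore; the final class
        // forbids anything else.
        if (!user.matches("(?=.*[A-Z])(?=.*[a-z])(?=.*_)[A-Za-z_]+")) {
            errors.add("Invalid combination for username");
        }

        // Condition 4: first character must be upper case.
        if (user.isEmpty() || !Character.isUpperCase(user.charAt(0))) {
            errors.add("Invalid case of first letter");
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(check("Ab_c#mail.co.uk"));  // []
        System.out.println(check("CFDV#gm.a.il.com")); // two messages
    }
}
```

Returning after a missing separator follows the advice to stop processing once a check makes the later ones meaningless; the other checks are kept independent so that all applicable messages are reported together.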
