I have "-" characters in my strings, as shown below.
I am using an if contains "-" check and splitting on it, which mostly works. But some string values contain additional "-" characters at other indexes.
I tried adding a second if contains ".-" check, but that does not solve the issue either.
So how can I get the correct output without the "-" characters?
13-adana-demirspor -> has 2 "-" characters.
15-y.-malatyaspor -> has 2 "-" characters too.
These two strings cause problems when splitting.
The others have only one "-" character and cause no issue.
My Code is:
final String [] URL = {
"13-adana-demirspor",
"14-fenerbahce",
"15-y.-malatyaspor",
"16-trabzonspor",
"17-sivasspor",
"18-konyaspor",
"19-giresunspor",
"20-galatasaray"
};
for(int i=0; i<URL.length; i++) {
    String team;
    if (URL[i].contains("-")) {
        String[] divide = URL[i].split("-");
        team = divide[1];
        System.out.println(" " + team.toUpperCase());
    } else if (URL[i].contains(".-")){
        String[] divide = URL[i].split(".-");
        team = divide[2];
        System.out.println(" " + team.toUpperCase());
    } else {
        team = null;
    }
}
My Output is:
ADANA ** missing second word
FENERBAHCE
Y. ** missing second word
TRABZONSPOR
SIVASSPOR
KONYASPOR
GIRESUNSPOR
GALATASARAY
Thanks for your help.
It looks like you just want to split on the first occurrence. For this you can use the second parameter of split (the limit) and set it to 2, like so:
if (URL[i].contains("-")) {
String[] divide = URL[i].split("-", 2);
team = divide[1];
System.out.println(" " + team.toUpperCase());
} else {
team = null;
}
To get the last part instead, you could do:
if (URL[i].contains("-")) {
String[] divide = URL[i].split("-");
team = divide[divide.length - 1];
System.out.println(" " + team.toUpperCase());
} else {
team = null;
}
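As a minimal, self-contained sketch of the two variants above: the class name `TeamNameSplit` and the helpers `afterFirstDash` and `lastSegment` are made up for illustration and are not part of the original code. The second helper uses `lastIndexOf` rather than `split`, which is a swapped-in but equivalent way to get the last part.

```java
class TeamNameSplit {

    // Everything after the first "-"; returns null when there is no "-".
    static String afterFirstDash(String s) {
        // A limit of 2 cuts the string into at most two pieces,
        // so later "-" characters stay inside the second piece.
        String[] parts = s.split("-", 2);
        return parts.length == 2 ? parts[1] : null;
    }

    // Only the text after the last "-"; returns null when there is no "-".
    static String lastSegment(String s) {
        int idx = s.lastIndexOf('-');
        return idx >= 0 ? s.substring(idx + 1) : null;
    }

    public static void main(String[] args) {
        String[] urls = {"13-adana-demirspor", "14-fenerbahce", "15-y.-malatyaspor"};
        for (String url : urls) {
            System.out.println(afterFirstDash(url) + " | " + lastSegment(url));
        }
    }
}
```

If the remaining dashes should become spaces, something like `afterFirstDash(url).replace('-', ' ')` would do that; whether that is the desired output is an assumption.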
I divided my string into three parts using a newline ('\n'). What I want to achieve: count how many unique dates there are in each part of the string.
In the code below, the first part contains two unique dates, the second part contains two, and the third part contains three. So the output should be: 2,2,3,
But when I run the code below I get this output: 5,5,5,5,1,3,1,
How do I get the output 2,2,3, instead?
Thanks in advance.
String strH;
String strT = null;
StringBuilder sbE = new StringBuilder();
String strA = "2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-11,2021-03-11,2021-03-11,2021-03-11,2021-03-11," + '\n' +
"2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-15,2021-03-15,2021-03-15,2021-03-15,2021-03-15," + '\n' +
"2021-03-02,2021-03-09,2021-03-07,2021-03-09,2021-03-09,";
String[] strG = strA.split("\n");
for(int h=0; h<strG.length; h++){
strH = strG[h];
String[] words=strH.split(",");
int wrc=1;
for(int i=0;i<words.length;i++) {
for(int j=i+1;j<words.length;j++) {
if(words[i].equals(words[j])) {
wrc=wrc+1;
words[j]="0";
}
}
if(words[i]!="0"){
sbE.append(wrc).append(",");
strT = String.valueOf(sbE);
}
wrc=1;
}
}
Log.d("TAG", "Output: "+strT);
I would use a set here to count the unique values in each line:
String strA = "2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-11,2021-03-11,2021-03-11,2021-03-11,2021-03-11" + "\n" +
"2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-15,2021-03-15,2021-03-15,2021-03-15,2021-03-15" + "\n" +
"2021-03-02,2021-03-09,2021-03-07,2021-03-09,2021-03-09";
String[] lines = strA.split("\n");
List<Integer> counts = new ArrayList<>();
for (String line : lines) {
counts.add(new HashSet<String>(Arrays.asList(line.split(","))).size());
}
System.out.println(counts); // [2, 2, 3]
Note that I have done a minor cleanup of the strA input by removing the trailing comma from each line.
With Java 8 Streams, this can be done in a single statement:
String strA = "2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-11,2021-03-11,2021-03-11,2021-03-11,2021-03-11," + '\n' +
"2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-15,2021-03-15,2021-03-15,2021-03-15,2021-03-15," + '\n' +
"2021-03-02,2021-03-09,2021-03-07,2021-03-09,2021-03-09,";
String strT = Pattern.compile("\n").splitAsStream(strA)
.map(strG -> String.valueOf(Pattern.compile(",").splitAsStream(strG).distinct().count()))
.collect(Collectors.joining(","));
System.out.println(strT); // 2,2,3
Note that Pattern.compile("\n").splitAsStream(strA) can also be written as Arrays.stream(strA.split("\n")), which is shorter to write but creates an unnecessary intermediate array. Which one is better is a matter of personal preference.
String strT = Arrays.stream(strA.split("\n"))
.map(strG -> String.valueOf(Arrays.stream(strG.split(",")).distinct().count()))
.collect(Collectors.joining(","));
The first version can be further micro-optimized by only compiling the regex once:
Pattern patternComma = Pattern.compile(",");
String strT = Pattern.compile("\n").splitAsStream(strA)
.map(strG -> String.valueOf(patternComma.splitAsStream(strG).distinct().count()))
.collect(Collectors.joining(","));
I need to capture two groups from an input string. The values differ in structure as they come in.
The following are examples of the incoming strings:
Comment = "This is a comment";
NumericValue = 123456;
What I am trying to accomplish is to capture the string value from the left of the equals sign as one group and the value after the equals sign as a second group. The semicolon should never be included.
The caveat is that if the second group is a string, the quotes from each end must not be included in that capture group.
The expected results would be:
Comment = "This is a comment";
key group => Comment
value group => This is a comment
NumericValue = 123456;
key group => NumericValue
value group => 123456
The following is what I have so far. This works fine for capturing the numeric value, but leaves the end double quote when capturing the string value.
(?<key>\w+)\s*=\s*(?:[\"]?)(?<group>.+(?:(?=[\"]?;)))
EDIT
When applying the regex against a string value, it must allow capture of semicolons and double quotes within the string and ignore only the closing ones.
So, if we have an input of:
Comment = "This is a "comment"; This is still a comment";
The second capture group should be:
This is a "comment"; This is still a comment
An option is to use an alternation where you would have to check for group 2 or group 3:
(?<key>\w+)\h*=\h*(?:"(.*?)"|([^"\r\n]+));$
(?<key>\w+) Group key match 1+ word chars
\h*=\h* Match an = between optional horizontal whitespace chars
(?: Non capturing group
"(.*?)" Capture in group 2 0+ times any char, as few as possible, between "
| Or
([^"\r\n]+) Capture group 3, match 1+ times any char except " or a newline
); Close non capturing group and match ;
$ End of string
In Java
String regex = "(?<key>\\w+)\\h*=\\h*(?:\"(.*?)\"|([^\"\\r\\n]+));$";
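A small sketch of how the "check group 2 or group 3" step might look in Java, using the pattern above unchanged. The class name `KeyValueAlternation` and the `parse` helper are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueAlternation {

    // The alternation pattern from the answer above, compiled once.
    private static final Pattern PAIR =
            Pattern.compile("(?<key>\\w+)\\h*=\\h*(?:\"(.*?)\"|([^\"\\r\\n]+));$");

    // Returns {key, value}, or null when the line does not match.
    static String[] parse(String line) {
        Matcher m = PAIR.matcher(line);
        if (!m.find()) {
            return null;
        }
        // Group 2 is set for quoted values, group 3 for unquoted ones.
        String value = m.group(2) != null ? m.group(2) : m.group(3);
        return new String[] {m.group("key"), value};
    }

    public static void main(String[] args) {
        String[] lines = {
            "Comment = \"This is a comment\";",
            "Comment = \"This is a \"comment\"; This is still a comment\";",
            "NumericValue = 123456;"
        };
        for (String line : lines) {
            String[] kv = parse(line);
            System.out.println(kv[0] + " => " + kv[1]);
        }
    }
}
```

Because the quoted branch is lazy and must be followed by `;` at the end of the string, inner quotes and semicolons end up inside the captured value.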
Edited based on comment to include ; and " in the comments as per the examples given:
(?<key>\w+)\s*=\s*(?:[\"]?)(?<value>((")(?!;?$)|;(?!$)|[^;"])+)"?;?$
The following one additionally doesn't allow ; or " to appear in the numeric text. However, to include this, I had to rename the capturing groups because the name cannot be used for more than one group.
(?<key>\w+)\s*=\s*((?:")(?<valueT>((")(?!;?$)|;(?!$)|[^;"])+)";?$|(?<valueN>[^;"]+);?$)
Here is a class that tests it.
For readability, I have separated the key and value regexes in the class. I have added the test cases in a method within the class. However, this still doesn't handle the case of a numeric text containing ; or ". Also, the line needs to be trimmed before being subjected to the pattern test (which I think is feasible).
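import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameValuePairRegex{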
public static void main( String[] args ){
String SPACE = "\\s*";
String EQ = "=";
String OR = "|";
/* The original regex tried by you (for comparison). */
String orig = "(?<key>\\w+)\\s*=\\s*(?:[\\\"]?)(?<value>.+(?:(?=;)))";
String key = "(?<key>\\w+)";
String valuePatternForText = "(?:\")(?<valueT>((\")(?!;?$)|;(?!$)|[^;\"])+)\";?$";
String valuePatternForNumbers = "(?<valueN>[^;\"]+);?$";
String p = key + SPACE + EQ + SPACE + "(" + valuePatternForText + OR + valuePatternForNumbers + ")";
Pattern nvp = Pattern.compile( p );
System.out.println( nvp.pattern() );
print( input(), nvp );
}
private static void print( List<String> input, Pattern ep ) {
for( String e : input ) {
System.out.println( e );
Matcher m = ep.matcher( e );
boolean found = m.find();
if( !found ) {
System.out.println( "\t\tNo match" );
continue;
}
String valueT = m.group( "valueT" );
String valueN = m.group( "valueN" );
System.out.print( "\t\t" + m.group( "key" ) + " -> " + ( valueT == null ? "" : valueT ) + " " + ( valueN == null ? "" : valueN ) );
System.out.println( );
}
}
private static List<String> input(){
List<String> neg = new ArrayList<>();
Collections.addAll( neg,
"Comment = \"This is a comment\";",
"Comment = \"This is a comment with semicolon ;\";",
"Comment = \"This is a comment with semicolon ; and quote\"\";",
"Comment = \"This is a comment\"",
"Comment = \"This is a \"comment\"; This is still a comment\";",
"NumericValue = 123456;",
"NumericValue = 123;456;",
"NumericValue = 123\"456;",
"NumericValue = 123456" );
return neg;
}
}
Original answer:
The following changed regex is fulfilling the requirements you mentioned. I added the exclusion of ; and " from the value part.
Original that you tried:
(?<key>\w+)\s*=\s*(?:[\"]?)(?<group>.+(?:(?=[\"]?;)))
The changed one:
(?<key>\w+)\s*=\s*(?:[\"]?)(?<value>[^;"]+)
Regular expressions are fun, but look how clean and easy to read this would be without using a regular expression:
int equals = s.indexOf('=');
String key = s.substring(0, equals).trim();
String value = s.substring(equals + 1).trim();
if (value.endsWith(";")) {
value = value.substring(0, value.length() - 1).trim();
}
if (value.startsWith("\"") && value.endsWith("\"")) {
value = value.substring(1, value.length() - 1);
}
Don't assume that this is slower just because it uses more lines of code than a regular expression. The lines of code executed internally by a regex engine will far exceed the above code.
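For reference, here is the same non-regex approach wrapped in a runnable sketch and tried against the examples from the question. The class name `KeyValueNoRegex`, the `parse` helper, the exception for a missing `=`, and the extra length check before stripping quotes are additions for this example, not part of the snippet above.

```java
class KeyValueNoRegex {

    // Returns {key, value}. Throws if the line has no '=' (an assumption;
    // the snippet above does not say what should happen in that case).
    static String[] parse(String s) {
        int equals = s.indexOf('=');
        if (equals < 0) {
            throw new IllegalArgumentException("no '=' in: " + s);
        }
        String key = s.substring(0, equals).trim();
        String value = s.substring(equals + 1).trim();
        // Drop only the closing semicolon.
        if (value.endsWith(";")) {
            value = value.substring(0, value.length() - 1).trim();
        }
        // Drop only the outermost quotes; the length check guards against
        // a value that is a single quote character.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return new String[] {key, value};
    }

    public static void main(String[] args) {
        String[] lines = {
            "Comment = \"This is a \"comment\"; This is still a comment\";",
            "NumericValue = 123456;"
        };
        for (String line : lines) {
            String[] kv = parse(line);
            System.out.println(kv[0] + " => " + kv[1]);
        }
    }
}
```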
I need to insert a space after every given character in a string.
For example "abc.def..."
Needs to become "abc. def. . . "
So in this case the given character is the dot.
My search on Google brought up no answer to that question.
I really should go and get some serious regex knowledge.
EDIT : ----------------------------------------------------------
String test = "0:;1:;";
test.replaceAll( "\\:", ": " );
System.out.println(test);
// output: 0:;1:;
// so it didn't do anything
SOLUTION: -------------------------------------------------------
String test = "0:;1:;";
test = test.replaceAll( "\\:", ": " ); // assign the result back to test
System.out.println(test);
You could use String.replaceAll():
String input = "abc.def...";
String result = input.replaceAll( "\\.", ". " );
// result will be "abc. def. . . "
Edit:
String test = "0:;1:;";
result = test.replaceAll( ":", ": " );
// result will be "0: ;1: ;" (test is still unmodified)
Edit:
As said in other answers, String.replace() is all you need for this simple substitution. You only have to use String.replaceAll() if the search term is a regular expression (as you said in your question).
You can use replace.
text = text.replace(".", ". ");
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace%28java.lang.CharSequence,%20java.lang.CharSequence%29
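A short sketch of the same idea generalized to any given character. The class name `SpaceAfterChar` and the helper `spaceAfter` are hypothetical; the only library call used is `String.replace(CharSequence, CharSequence)`, which treats both arguments literally, so `.` needs no escaping.

```java
class SpaceAfterChar {

    // Inserts one space after every occurrence of ch.
    static String spaceAfter(String input, char ch) {
        String target = String.valueOf(ch);
        // Strings are immutable, so the result has to be returned (or assigned).
        return input.replace(target, target + " ");
    }

    public static void main(String[] args) {
        // Brackets make the trailing space visible in the output.
        System.out.println("[" + spaceAfter("abc.def...", '.') + "]");
        System.out.println("[" + spaceAfter("0:;1:;", ':') + "]");
    }
}
```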
If you want a simple brute-force technique, the following code will do it.
String input = "abc.def...";
StringBuilder output = new StringBuilder();
for (int i = 0; i < input.length(); i++) {
    char c = input.charAt(i);
    output.append(c);
    // add a space only after the given character
    if (c == '.') {
        output.append(' ');
    }
}
return output.toString();
Compare how you would accomplish the two tasks mentioned below with and without regular expressions. The problem:
The format for an SMS-based food delivery will be:
PABUSOG slash or comma repeated an infinite number of times #
// The quantity can only be numeric. For simplicity, assume that quantity is always an integer
e.g. PABUSOG STRFRY_SMAI/2 HSHBRWN_BRGR/1 COFEEFLT/1 #En311
it will capture the following:
STRFRY_SMAI - 2
HSHBRWN_BRGR - 1
COFEEFLT - 1
this is my sample code: // doing with regex
String message = "PABUSOG ASD_ASD/1 ASD_ASA/2";
Pattern pattern = Pattern.compile("PABUSOG(\\s+([A-Z]+_[A-Z]+)(/|,)([0-9]))+"
,Pattern.CASE_INSENSITIVE);
Matcher m = pattern.matcher(message);
try
{
if (m.matches())
{
String food = m.group(2);
String quantity = m.group(4);
System.out.println(food + " -- " + quantity + "\\n");
}
}
catch (NullPointerException e)
{
}
It displays ASD_ASA -- 2; it overwrites the first one, which is ASD_ASD/1.
It must display:
ASD_ASD -- 1
ASD_ASA -- 2
You cannot accomplish that with a single regex that gives you all the data inside groups, because a repeated capturing group only keeps its last match. There's no great need for a complex regex either. But if you still prefer regex, try searching for the pattern iteratively:
if (!message.startsWith("PABUSOG")) {
return;
}
// the + is inside the group so that multi-digit quantities are captured whole
Pattern pattern = Pattern.compile("([A-Z_]+)[/,]([0-9]+)", Pattern.CASE_INSENSITIVE);
Matcher m = pattern.matcher(message);
while (m.find()) {
String food = m.group(1);
String quantity = m.group(2);
System.out.println(food + " -- " + quantity);
}
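The iterative search above could be packaged roughly as follows. This is only a sketch: the class name `OrderParser`, the `parse` method, and the choice to return formatted strings in a list are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderParser {

    // One food code (letters and underscores), a slash or comma, then the quantity.
    private static final Pattern ITEM =
            Pattern.compile("([A-Z_]+)[/,]([0-9]+)", Pattern.CASE_INSENSITIVE);

    // Returns one "food -- quantity" entry per item, or an empty list
    // when the message does not start with the expected header.
    static List<String> parse(String message) {
        List<String> result = new ArrayList<>();
        if (!message.startsWith("PABUSOG")) {
            return result;
        }
        Matcher m = ITEM.matcher(message);
        // Each call to find() moves on to the next item, so nothing is overwritten.
        while (m.find()) {
            result.add(m.group(1) + " -- " + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String message = "PABUSOG STRFRY_SMAI/2 HSHBRWN_BRGR/1 COFEEFLT/1 #En311";
        for (String line : parse(message)) {
            System.out.println(line);
        }
    }
}
```

The header and the trailing `#En311` are skipped simply because neither contains letters directly followed by a slash or comma.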
Without complex regex you can do the following by using String API:
// Check for correct header
if (!message.startsWith("PABUSOG")) {
return;
}
// split by whitespaces
String[] items = message.split("\\s+");
// skip header and iterate over remaining items
for (String item : Arrays.asList(items).subList(1, items.length)) {
// split each item by / or ,
String[] foodQuantity = item.split("[/,]");
assert foodQuantity.length == 2;
String food = foodQuantity[0];
String quantity = foodQuantity[1];
System.out.println(food + " -- " + quantity);
}
To skip items starting with #, you can either add
if (item.startsWith("#")) {
break; // or continue if it might not be the last item
}
inside the loop, or limit subList in the following way if you are sure that such an item is always present and terminates the sequence: Arrays.asList(items).subList(1, items.length - 1).
By the way, your pattern [A-Z]+_[A-Z]+ won't match COFEEFLT from your example.
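Putting the String API variant and the `#` check together might look like the sketch below. The class name `OrderSplitter`, the use of a `LinkedHashMap` (which would merge repeated food codes, keeping the last quantity), and throwing on a malformed item instead of using `assert` are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OrderSplitter {

    // Maps each food code to its quantity, in the order the items appear.
    static Map<String, Integer> parse(String message) {
        Map<String, Integer> order = new LinkedHashMap<>();
        String[] items = message.trim().split("\\s+");
        // Check for the correct header.
        if (!items[0].equals("PABUSOG")) {
            return order;
        }
        for (int i = 1; i < items.length; i++) {
            String item = items[i];
            // An item starting with '#' ends the list of food codes.
            if (item.startsWith("#")) {
                break;
            }
            // Split each item by / or ,
            String[] foodQuantity = item.split("[/,]");
            if (foodQuantity.length != 2) {
                throw new IllegalArgumentException("Malformed item: " + item);
            }
            // parseInt rejects non-numeric quantities with a NumberFormatException.
            order.put(foodQuantity[0], Integer.parseInt(foodQuantity[1]));
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(parse("PABUSOG STRFRY_SMAI/2 HSHBRWN_BRGR/1 COFEEFLT/1 #En311"));
    }
}
```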