Regular expression to recognize string of '#' - java

I have a Java question. I can't figure out how to write my regular expression to print something to a file when encountering one or more instances of '#'. It must not print when the string equals "", but it must print when the string equals "#". Here's my code:
int num = 1;
StringBuffer noletterbuf = new StringBuffer(nospaces);
noletterbuf.deleteCharAt(0);
String noletter = noletterbuf.toString();
//if(num == noletter.split("[^#]").length){//applies # to C# and C
if(num == noletter.split("[#*]").length){//applies # to C
double yacc = octave*-50;
p6.println("sb.append(\"/Times-Roman findfont 70 scalefont setfont 1 -1 scale newpath \"); sb.append(" + xaccplace + " + \" \" +" + yacc + " + \" moveto \"); sb.append(\"( # ) show 1 -1 scale \");");
}
Thanks in advance!
Bjorn

Why use regex and .split() at all, since you just discard the resulting array?
You can check if the string contains # using the following:
if (noletter.indexOf('#') >= 0) {
// ...
}

Your code:
noletter.split("[#*]")
This will split at each # and each *, since the asterisk is within brackets.

A simple way to check using regex is:
if (str.matches(".*#.*")) // true if there's a # in str
I don't understand the "no spaces" relevance, or why you used a StringBuffer, or why you deleted the first character.
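To make the difference between these checks concrete, here is a minimal sketch. The class name HashCheck, the method names, and the sample inputs are made up for illustration; it assumes the goal is either "contains a #" or "consists only of # characters", since the question can be read both ways.

```java
class HashCheck {
    // True if s contains at least one '#'; no regex involved.
    static boolean containsHash(String s) {
        return s.indexOf('#') >= 0;
    }

    // True only if s consists entirely of one or more '#' characters.
    static boolean isOnlyHashes(String s) {
        return s.matches("#+");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "#", "##", "C#"}) {
            System.out.println("\"" + s + "\" contains=" + containsHash(s)
                    + " onlyHashes=" + isOnlyHashes(s));
        }
    }
}
```

Both checks reject the empty string and accept "#", which are the two cases the question spells out.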

Related

Find index of a non-digit using regex in Java

This is probably an easy question but I haven't been able to figure it out. I want to find the next letter (A to Z) in a string after a certain point in the string. The result I want from below is for the string money to be "$5. 00" but num2 always comes up as -1.
String text = "hello$5. 00Bla bla words that don't matter";
int num1 = text.indexOf('$');
int num2 = text.indexOf("[a-zA-Z]" , num1 + 1); // Always results in -1
String money = text.substring(num1, num2);
To find the first letter following a $ dollar sign, using regex, you can use the following regex:
\$\P{L}*\p{L}
Explanation:
\$ Match a $ dollar sign
\P{L}* Match 0 or more characters that are not Unicode letters
\p{L} Match a Unicode letter
The index of the letter is then the last character of the matched substring, i.e. one character before the end() of the match.
Example
String text = "hello$5. 00Bla bla words that don't matter";
Matcher m = Pattern.compile("\\$\\P{L}*\\p{L}").matcher(text);
if (m.find()) {
int idx = m.end() - 1;
System.out.println("Letter found at index " + idx + ": '" + text.substring(idx) + "'");
}
Output
Letter found at index 11: 'Bla bla words that don't matter'
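String.indexOf treats its argument as literal text, which is why the call in the question returns -1. If a regex-aware indexOf is what you want, a small helper along these lines may do; the class name RegexIndexOf and the helper's signature are hypothetical, and only Matcher.find(int) and Matcher.start() come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexIndexOf {
    // Index of the first match of regex at or after fromIndex, or -1 if there is none.
    // Note: Matcher.find(int) throws IndexOutOfBoundsException for an out-of-range fromIndex.
    static int indexOf(String text, String regex, int fromIndex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find(fromIndex) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "hello$5. 00Bla bla words that don't matter";
        int num1 = text.indexOf('$');
        int num2 = indexOf(text, "[a-zA-Z]", num1 + 1);
        System.out.println(text.substring(num1, num2)); // prints $5. 00
    }
}
```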
UPDATE
It seems the actual question was slightly different than answered above, so to capture the text from $ dollar sign (inclusive) and all following characters up to first letter (exclusive) or end of string, use this regex:
\$\P{L}*
Example
String text = "hello$5. 00Bla bla words that don't matter";
Matcher m = Pattern.compile("\\$\\P{L}*").matcher(text);
if (m.find()) {
String money = m.group();
System.out.println("money = \"" + money + "\"");
}
Output
money = "$5. 00"
This is untested, as my workstation isn't set up for Java 9, but using that release, you should be able to do this:
String result = text.substring(text.indexOf('$'))
        .chars() // IntStream of the remaining characters
        .takeWhile(ch -> !Character.isAlphabetic(ch))
        .mapToObj(ch -> String.valueOf((char) ch))
        .collect(Collectors.joining());
result will evaluate to $5. 00
Note: IntStream#takeWhile is a Java 9 feature
Thanks for the help everyone. I found a way to do this without using regex.
String money = "";
while (num1 < text.length() && !Character.isLetter(text.charAt(num1))) {
money = money + text.charAt(num1);
num1++;
}
It might need some work later but it seems to work.

Splitting on comma outside quotes

My program reads a line from a file. This line contains comma-separated text like:
123,test,444,"don't split, this",more test,1
I would like the result of a split to be this:
123
test
444
"don't split, this"
more test
1
If I use the String.split(","), I would get this:
123
test
444
"don't split
this"
more test
1
In other words: The comma in the substring "don't split, this" is not a separator. How to deal with this?
You can try out this regex:
str.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
This splits the string on , that is followed by an even number of double quotes. In other words, it splits on comma outside the double quotes. This will work provided you have balanced quotes in your string.
Explanation:
, // Split on comma
(?= // Followed by
(?: // Start a non-capture group
[^"]* // 0 or more non-quote characters
" // 1 quote
[^"]* // 0 or more non-quote characters
" // 1 quote
)* // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
[^"]* // Finally 0 or more non-quotes
$ // Till the end (This is necessary, else every comma will satisfy the condition)
)
You can even write it like this in your code, using the (?x) modifier with your regex. The modifier ignores any whitespace in your regex, so a regex broken into multiple lines becomes easier to read:
String[] arr = str.split("(?x) " +
", " + // Split on comma
"(?= " + // Followed by
" (?: " + // Start a non-capture group
" [^\"]* " + // 0 or more non-quote characters
" \" " + // 1 quote
" [^\"]* " + // 0 or more non-quote characters
" \" " + // 1 quote
" )* " + // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
" [^\"]* " + // Finally 0 or more non-quotes
" $ " + // Till the end (This is necessary, else every comma will satisfy the condition)
") " // End look-ahead
);
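As a quick check of the look-ahead split, here is a minimal runnable sketch using the sample line from the question. The class name QuoteAwareSplit and the constant name are made up for illustration.

```java
import java.util.Arrays;

class QuoteAwareSplit {
    // A comma followed by an even number of double quotes up to the end of the input.
    static final String COMMA_OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static String[] split(String line) {
        return line.split(COMMA_OUTSIDE_QUOTES);
    }

    public static void main(String[] args) {
        String line = "123,test,444,\"don't split, this\",more test,1";
        Arrays.stream(split(line)).forEach(System.out::println);
    }
}
```

Because the look-ahead rescans the rest of the line for every comma, this is best suited to short lines; for long inputs a single-pass matcher or a CSV library is likely a better fit.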
Why Split when you can Match?
Resurrecting this question because for some reason, the easy solution wasn't mentioned. Here is our beautifully compact regex:
"[^"]*"|[^,]+
This will match all the desired fragments.
Explanation
With "[^"]*", we match complete "double-quoted strings"
or |
we match [^,]+ any characters that are not a comma.
A possible refinement is to improve the string side of the alternation to allow the quoted strings to include escaped quotes.
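One way that refinement might look, assuming quotes inside a quoted field are escaped with a backslash (an assumption; CSV files often double the quote instead). The class name CsvFieldMatcher and the sample line are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsvFieldMatcher {
    // Either a quoted string that may contain backslash-escaped characters,
    // or a run of characters that are not commas.
    static final Pattern FIELD = Pattern.compile("\"(?:[^\"\\\\]|\\\\.)*\"|[^,]+");

    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The line below is: a,"b \"c\", d",e
        String line = "a,\"b \\\"c\\\", d\",e";
        fields(line).forEach(System.out::println);
    }
}
```

As with the original pattern, empty fields (two commas in a row) produce no match and are silently skipped.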
Building upon @zx81's answer, because the matching idea is really nice, I've added the Java 9 results() call, which returns a Stream. Since the OP wanted to use split, I've collected to String[], as split does.
Caution if you have spaces after your comma-separators (a, b, "c,d"). Then you need to change the pattern.
Jshell demo
$ jshell
-> String so = "123,test,444,\"don't split, this\",more test,1";
| Added variable so of type String with initial value "123,test,444,"don't split, this",more test,1"
-> Pattern.compile("\"[^\"]*\"|[^,]+").matcher(so).results();
| Expression value is: java.util.stream.ReferencePipeline$Head@2038ae61
| assigned to temporary variable $68 of type java.util.stream.Stream<MatchResult>
-> $68.map(MatchResult::group).toArray(String[]::new);
| Expression value is: [Ljava.lang.String;@6b09bb57
| assigned to temporary variable $69 of type String[]
-> Arrays.stream($69).forEach(System.out::println);
123
test
444
"don't split, this"
more test
1
Code
String so = "123,test,444,\"don't split, this\",more test,1";
Pattern.compile("\"[^\"]*\"|[^,]+")
.matcher(so)
.results()
.map(MatchResult::group)
.toArray(String[]::new);
Explanation
Regex "[^"]" matches: a quote, anything but a quote, a quote.
Regex "[^"]*" matches: a quote, anything but a quote 0 (or more) times, a quote.
That regex needs to go first to "win", otherwise matching anything but a comma 1 or more times - that is: [^,]+ - would "win".
results() requires Java 9 or higher.
It returns Stream<MatchResult>, which I map using group() call and collect to array of Strings. Parameterless toArray() call would return Object[].
You can do this very easily without complex regular expression:
Split on the character ". You get a list of Strings
Process each string in the list: Split every string that is on an even position in the List (starting indexing with zero) on "," (you get a list inside a list), leave every odd positioned string alone (directly putting it in a list inside the list).
Join the list of lists, so you get only a list.
If you want to handle quoting of '"', you have to adapt the algorithm a little bit (joining some parts you have incorrectly split off, or changing the splitting to a simple regexp), but the basic structure stays.
So basically it is something like this:
public class SplitTest {
public static void main(String[] args) {
final String splitMe="123,test,444,\"don't split, this\",more test,1";
final String[] splitByQuote=splitMe.split("\"");
final String[][] splitByComma=new String[splitByQuote.length][];
for(int i=0;i<splitByQuote.length;i++) {
String part=splitByQuote[i];
if (i % 2 == 0){
splitByComma[i]=part.split(",");
}else{
splitByComma[i]=new String[1];
splitByComma[i][0]=part;
}
}
for (String parts[] : splitByComma) {
for (String part : parts) {
System.out.println(part);
}
}
}
}
This will be much cleaner with lambdas, promised!
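For what it's worth, the same split-by-quote idea could be sketched with streams roughly as follows. The class and method names are made up, and one behavior is an assumption of this sketch rather than part of the answer above: empty pieces are filtered out, because splitting ",more test,1" on the comma yields a leading empty string. That filter also drops genuinely empty fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class QuoteSplitStreams {
    // Quoted parts are kept whole; unquoted parts are split on commas.
    static Stream<String> pieces(String part, boolean quoted) {
        return quoted ? Stream.of(part) : Arrays.stream(part.split(","));
    }

    static List<String> split(String line) {
        String[] byQuote = line.split("\"");
        return IntStream.range(0, byQuote.length)
                .boxed()
                .flatMap(i -> pieces(byQuote[i], i % 2 == 1)) // odd positions were inside quotes
                .filter(s -> !s.isEmpty())                    // drop empty pieces next to quotes
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        split("123,test,444,\"don't split, this\",more test,1")
                .forEach(System.out::println);
    }
}
```

Like the loop version, this removes the surrounding quotes from the quoted field.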
Please see the code snippet below. This code only considers the happy flow. Change it according to your requirements.
public static String[] splitWithEscape(final String str, char split,
char escapeCharacter) {
final List<String> list = new LinkedList<String>();
char[] cArr = str.toCharArray();
boolean isEscape = false;
StringBuilder sb = new StringBuilder();
for (char c : cArr) {
if (isEscape && c != escapeCharacter) {
sb.append(c);
} else if (c != split && c != escapeCharacter) {
sb.append(c);
} else if (c == escapeCharacter) {
if (!isEscape) {
isEscape = true;
if (sb.length() > 0) {
list.add(sb.toString());
sb = new StringBuilder();
}
} else {
isEscape = false;
}
} else if (c == split) {
list.add(sb.toString());
sb = new StringBuilder();
}
}
if (sb.length() > 0) {
list.add(sb.toString());
}
String[] strArr = new String[list.size()];
return list.toArray(strArr);
}

Check if string ends with certain pattern

If I have a string like:
This.is.a.great.place.too.work.
or:
This/is/a/great/place/too/work/
then my program should tell me that the sentence is valid and that it has "work".
If I have:
This.is.a.great.place.too.work.hahahha
or:
This/is/a/great/place/too/work/hahahah
then my program should not give me that there is a "work" in the sentence.
So I am looking at java strings to find a word at the end of the sentence having . or , or / before it. How can I achieve this?
This is really simple, the String object has an endsWith method.
From your question it seems like you want either /, , or . as the delimiter set.
So:
String str = "This.is.a.great.place.to.work.";
if (str.endsWith(".work.") || str.endsWith("/work/") || str.endsWith(",work,"))
// ...
You can also do this with the matches method and a fairly simple regex:
if (str.matches(".*([.,/])work\\1$"))
Using the character class [.,/] specifying either a period, a slash, or a comma, and a backreference, \1 that matches whichever of the alternates were found, if any.
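Here is a small sketch that runs the backreference pattern against the strings from the question. The class name EndsWithWork is hypothetical. Since String.matches already requires the whole input to match, the trailing $ is optional and is left out here.

```java
class EndsWithWork {
    // "work" must be the last word, with the same delimiter before and after it.
    static boolean endsWithWork(String s) {
        return s.matches(".*([.,/])work\\1");
    }

    public static void main(String[] args) {
        String[] samples = {
            "This.is.a.great.place.too.work.",
            "This/is/a/great/place/too/work/",
            "This.is.a.great.place.too.work.hahahha",
            "This/is/a/great/place/too/work/hahahah"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + endsWithWork(s));
        }
    }
}
```

Note that a mixed pair such as ".work/" is rejected, because \1 must repeat exactly the delimiter captured by the group.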
You can test if a string ends with work followed by one character like this:
theString.matches(".*work.$");
If the trailing character is optional you can use this:
theString.matches(".*work.?$");
To make sure the last character is a period . or a slash / you can use this:
theString.matches(".*work[./]$");
To test for work followed by an optional period or slash you can use this:
theString.matches(".*work[./]?$");
To test for work surrounded by periods or slashes, you could do this:
theString.matches(".*[./]work[./]$");
If the tokens before and after work must match each other, you could do this:
theString.matches(".*([./])work\\1$");
Your exact requirement isn't precisely defined, but I think it would be something like this:
theString.matches(".*work[,./]?$");
In other words:
zero or more characters
followed by work
followed by zero or one , . OR /
followed by the end of the input
Explanation of various regex items:
. -- any character
* -- zero or more of the preceding expression
$ -- the end of the line/input
? -- zero or one of the preceding expression
[./,] -- either a period or a slash or a comma
[abc] -- matches a, b, or c
[abc]* -- zero or more of (a, b, or c)
[abc]? -- zero or one of (a, b, or c)
enclosing a pattern in parentheses is called "grouping"
([abc])blah\\1 -- a, b, or c followed by blah followed by "the first group"
Here's a test harness to play with:
class TestStuff {
public static void main (String[] args) {
String[] testStrings = {
"work.",
"work-",
"workp",
"/foo/work.",
"/bar/work",
"baz/work.",
"baz.funk.work.",
"funk.work",
"jazz/junk/foo/work.",
"funk/punk/work/",
"/funk/foo/bar/work",
"/funk/foo/bar/work/",
".funk.foo.bar.work.",
".funk.foo.bar.work",
"goo/balls/work/",
"goo/balls/work/funk"
};
for (String t : testStrings) {
print("word: " + t + " ---> " + matchesIt(t));
}
}
public static boolean matchesIt(String s) {
return s.matches(".*([./,])work\\1?$");
}
public static void print(Object o) {
String s = (o == null) ? "null" : o.toString();
System.out.println(s);
}
}
Of course you can use the StringTokenizer class to split the String with '.' or '/', and check if the last word is "work".
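A rough sketch of the StringTokenizer idea follows. The class and method names are made up, and the extra check that the string ends with a delimiter is an assumption added here so that "work.hahahha" style inputs are rejected; the tokenizer alone only looks at the words.

```java
import java.util.StringTokenizer;

class LastTokenCheck {
    static final String DELIMITERS = "./,";

    // True if s ends with a delimiter and the token just before it equals word.
    static boolean lastWordIs(String s, String word) {
        if (s.isEmpty() || DELIMITERS.indexOf(s.charAt(s.length() - 1)) < 0) {
            return false; // must end with one of the delimiters
        }
        StringTokenizer tokens = new StringTokenizer(s, DELIMITERS);
        String last = null;
        while (tokens.hasMoreTokens()) {
            last = tokens.nextToken(); // keep overwriting until the final token
        }
        return word.equals(last);
    }

    public static void main(String[] args) {
        System.out.println(lastWordIs("This.is.a.great.place.too.work.", "work"));
        System.out.println(lastWordIs("This/is/a/great/place/too/work/hahahah", "work"));
    }
}
```

Unlike the backreference regex, this does not require the delimiter before and after the word to be the same character.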
You can use the substring method:
String aString = "This.is.a.great.place.too.work.";
String aSubstring = "work";
String endString = aString.substring(aString.length() -
(aSubstring.length() + 1),aString.length() - 1);
if ( endString.equals(aSubstring) )
System.out.println("Equal " + aString + " " + aSubstring);
else
System.out.println("NOT equal " + aString + " " + aSubstring);
I tried all the different things mentioned here to get the index of the . character in a filename that ends with .[0-9][0-9]*, e.g. srcfile.1, srcfile.12, etc. Nothing worked. Finally, the following worked:
int dotIndex = inputfilename.lastIndexOf(".");
Weird! This is with java -version:
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-0ubuntu1.16.10.2-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
Also, the official Java doc page for regex (from which there is a quote in one of the answers above) does not seem to specify how to look for the . character. Because \., \\., and [.] did not work for me, and I don't see any other options specified apart from these.
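A likely explanation (an assumption, since the failing code is not shown) is that String.indexOf and String.lastIndexOf take literal text and never interpret regex syntax, so passing "\\." searches for a backslash followed by a dot. To locate the dot with a regex, go through Pattern and Matcher instead. The sketch below uses a hypothetical class name NumericSuffix and the file names from the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericSuffix {
    // A literal dot followed by one or more digits at the end of the name.
    static final Pattern SUFFIX = Pattern.compile("\\.[0-9]+$");

    // Index of the dot that starts the numeric suffix, or -1 if there is none.
    static int suffixDotIndex(String filename) {
        Matcher m = SUFFIX.matcher(filename);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(suffixDotIndex("srcfile.1"));   // prints 7
        System.out.println(suffixDotIndex("srcfile.12"));  // prints 7
        System.out.println(suffixDotIndex("srcfile.txt")); // prints -1
    }
}
```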
String input1 = "This.is.a.great.place.too.work.";
String input2 = "This/is/a/great/place/too/work/";
String input3 = "This,is,a,great,place,too,work,";
String input4 = "This.is.a.great.place.too.work.hahahah";
String input5 = "This/is/a/great/place/too/work/hahaha";
String input6 = "This,is,a,great,place,too,work,hahahha";
String regEx = ".*work[.,/]";
System.out.println(input1.matches(regEx)); // true
System.out.println(input2.matches(regEx)); // true
System.out.println(input3.matches(regEx)); // true
System.out.println(input4.matches(regEx)); // false
System.out.println(input5.matches(regEx)); // false
System.out.println(input6.matches(regEx)); // false

Print multiple lines output in java without using a new line character

This is one of the interview questions. I am supposed to print multiple lines of output on the command line without using the newline (\n) character in Java. I tried googling for this but didn't find appropriate answers. If I am printing 5 numbers, then it should print in the following fashion. But I am not supposed to use the newline character or loops either. I have to print this using a single println() statement. Can you give me some ideas? Thanks!
1
2
3
4
5
You can do it recursively:
public void foo(int currNum) {
if (currNum > 5)
return;
println(currNum);
foo(currNum + 1);
}
Then you are only using a single println and you aren't using a for or while loop.
If you're just not allowed to use \n and println(), then you can get the system's line.separator, e.g.
String h = "Hello" + System.getProperty("line.separator") + "World!";
Hope this helped, have fun!
Ok, now I think I understand your question. What about this?
println(String.format("%d%n%d%n%d%n%d%n%d%n", 1, 2, 3, 4, 5));
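If hard-coding five format arguments feels too rigid, a stream can build the same text for any range. This is only a sketch: the class and method names are made up, and whether an interviewer counts a stream as a loop is a judgment call. It uses System.lineSeparator(), available since Java 7, instead of a "\n" literal.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class LinesWithoutNewlineLiteral {
    // Joins the numbers from..to (inclusive) with the platform line separator.
    static String numberedLines(int from, int to) {
        return IntStream.rangeClosed(from, to)
                .mapToObj(i -> String.valueOf(i))
                .collect(Collectors.joining(System.lineSeparator()));
    }

    public static void main(String[] args) {
        System.out.println(numberedLines(1, 5)); // the only println call
    }
}
```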
One way is this: Platform Independent
final String EOL = System.getProperty("line.separator");
System.out.println('1' + EOL + '2' + EOL + '3' + EOL + '4' + EOL + '5');
This is Platform Dependent
char eol = (char) 13;
System.out.println("" + '1' + eol + '2' + eol + '3' + eol + '4');
There are many ways to achieve this...
One alternative to using '\n' is to output the byte value for the character. So, an example to print out your list of the numbers 1-5 in your example...
char line = (char)10;
System.out.println("1" + line+ "2" + line+ "3" + line + "4" + line+ "5");
You could also build a byte[] array or char[] array and output that...
char line = (char)10;
char[] output = new char[]{'1',line,'2',line,'3',line,'4',line,'5'};
System.out.println(new String(output));
Probably cheating based on the requirements, but technically only 1 println statement and no loops.
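public static int recursivePrint(int number)
{
    // Each call prints the value returned by the call below it, so
    // recursivePrint(6) prints 1 through 5; the outermost return value is not printed.
    if (number > 1)
        System.out.println(recursivePrint(number - 1));
    return number;
}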
No loops, 1 println call, +flexibility:
static final String newLine = System.getProperty("line.separator");
public static void main (String[] args) {
    print(5);
}
// Static so that it can be called from main without creating an instance.
public static void print(int fin) {
    System.out.println(printRec("",1,fin));
}
private static String printRec(String s, int start, int fin) {
    if(start > fin)
        return s;
    s += start + newLine;
    return printRec(s, start+1, fin);
}
The ASCII value of new Line is 10.
So use this
char line = 10;
System.out.print("1" + line + "2" + line ......);
ANSI terminal escape codes can do the trick.
Aside: Since System.out is a PrintStream, it may not be able to support the escape codes.
However, you can define your own println(msg) function, and make one call to that. Might be cheating, but unless they explicitly say System.out.println, you're golden (hell, even if they do, you can define your own object named System in the local scope using a class defined outside your function, give it a field out with a function println(msg) and you're still scot-free).

How can I make Java print quotes, like "Hello"?

How can I make Java print "Hello"?
When I type System.out.print("Hello"); the output will be Hello. What I am looking for is "Hello" with the quotes("").
System.out.print("\"Hello\"");
The double quote character has to be escaped with a backslash in a Java string literal. Other characters that need special treatment include:
Carriage return and newline: "\r" and "\n"
Backslash: "\\"
Single quote: "\'"
Horizontal tab and form feed: "\t" and "\f"
The complete list of Java string and character literal escapes may be found in the section 3.10.6 of the JLS.
It is also worth noting that you can include arbitrary Unicode characters in your source code using Unicode escape sequences of the form \uxxxx where the xs are hexadecimal digits. However, these are different from ordinary string and character escapes in that you can use them anywhere in a Java program ... not just in string and character literals; see JLS sections 3.1, 3.2 and 3.3 for details on the use of Unicode in Java source code.
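The escapes listed above can be tried out with a short program like the following sketch; the class name EscapeDemo and the sample strings are made up for illustration.

```java
class EscapeDemo {
    // Wraps s in double quotes; each quote is written as an escaped quote in the literal.
    static String quoted(String s) {
        return "\"" + s + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quoted("Hello"));   // prints "Hello" including the quotes
        System.out.println("C:\\temp");        // escaped backslash, prints C:\temp
        System.out.println("it\'s");           // escaped single quote, prints it's
        System.out.println("col1\tcol2");      // a tab between the two words
        System.out.println("\u0048ello");      // Unicode escape for H, prints Hello
    }
}
```

Be careful with Unicode escapes for characters such as the double quote or a line terminator inside a string literal: because they are translated before the compiler reads the literal, they can end the literal early.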
See also:
The Oracle Java Tutorial: Numbers and Strings - Characters
In Java, is there a way to write a string literal without having to escape quotes? (Answer: No)
char ch='"';
System.out.println(ch + "String" + ch);
Or
System.out.println('"' + "ASHISH" + '"');
Escape double-quotes in your string: "\"Hello\""
More on the topic (check 'Escape Sequences' part)
You can do it using a unicode character also
System.out.print('\u0022' + "Hello" + '\u0022');
Adding the actual quote characters is only a tiny fraction of the problem; once you have done that, you are likely to face the real problem: what happens if the string already contains quotes, or line feeds, or other unprintable characters?
The following method will take care of everything:
public static String escapeForJava( String value, boolean quote )
{
StringBuilder builder = new StringBuilder();
if( quote )
builder.append( "\"" );
for( char c : value.toCharArray() )
{
if( c == '\\' )
builder.append( "\\\\" );
else if( c == '\'' )
builder.append( "\\'" );
else if ( c == '\"' )
builder.append( "\\\"" );
else if( c == '\r' )
builder.append( "\\r" );
else if( c == '\n' )
builder.append( "\\n" );
else if( c == '\t' )
builder.append( "\\t" );
else if( c < 32 || c >= 127 )
builder.append( String.format( "\\u%04x", (int)c ) );
else
builder.append( c );
}
if( quote )
builder.append( "\"" );
return builder.toString();
}
System.out.println("\"Hello\"");
System.out.println("\"Hello\"");
There are two easy methods:
Use backslash \ before double quotes.
Use two single quotes instead of double quotes like '' instead of "
For example:
System.out.println("\"Hello\"");
System.out.println("''Hello''");
Note that backslashes need care: a backslash must itself be escaped, so a string that ends in a backslash is written with two of them.
System.out.println("Hello\\");
The output above will be:
Hello\
System.out.println(" Hello\" ");
The output above will be:
Hello"
Use Escape sequence.
\"Hello\"
This will print "Hello".
You can use JSON serialization utilities to quote a Java String, like this:
public class Test{
    public static String quote(String a){
        return JSON.toJsonString(a);
    }
}
If the input is hello, the output will be "hello".
If you want to implement the function yourself, it may look like this:
public static String quotes(String origin) {
// every \ becomes \\; as a regex that is \\ => \\\\, and as Java string literals \\\\ ==> \\\\\\\\
origin = origin.replaceAll("\\\\", "\\\\\\\\");
// " -> \" as a regex: \" => \\\" and as Java string literals: \" ==> \\\\\\\"
origin = origin.replaceAll("\"", "\\\\\\\"");
// newline -> \n
origin = origin.replaceAll("\\n", "\\\\\\n");
// tab -> \t
origin = origin.replaceAll("\\t", "\\\\\\t");
return origin;
}
The above implementation escapes the special characters in the string but does not add the " at the start and end.
It is also incomplete; if you need other escape characters, you can add them.
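A variant that avoids the regex double-escaping altogether is to use String.replace, which treats both arguments as literal text. The sketch below is one possible shape, not the author's method: the class name LiteralQuote is made up, it adds the surrounding quotes itself, and it only handles the same four characters as the snippet above.

```java
class LiteralQuote {
    // Escapes backslash, double quote, newline and tab, then wraps the result in quotes.
    static String quote(String s) {
        String escaped = s
                .replace("\\", "\\\\")   // backslash first, so later escapes are not doubled
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\t", "\\t");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("hello"));         // prints "hello"
        System.out.println(quote("say \"hi\""));    // prints "say \"hi\""
    }
}
```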
