I just wanted to ask if there is an easy way to do this before I start building a fully fledged regex interpreter, or at least a fairly big state machine, just to figure out what nesting level the or operators are at and where to split. To make things clearer, here is a random example:
String regex = "test (1|2|3)|testing\\||tester\\nNextLine[ab|]|(test)";
The result I want is the following, splitting the regex at its top-level or operators:
String[] result = { "test (1|2|3)", "testing\\|", "tester\\nNextLine[ab|]", "(test)" };
As mentioned, I already have some ideas for complex solutions: go through the string char by char, skip escaped characters, track where the brackets open and close and at which bracket level each character sits, add the indices of the level-0 '|' characters to a list, and split the string at those indices. But I am looking for a simple one- or two-liner, i.e. a more beautiful solution. Is there one?
To clarify this even further - I want all alternatives like this in one string array
UPDATE: Not the most beautiful version, but I actually implemented something like a state machine for this now:
private ArrayList<String> parseFilters(String regex) {
ArrayList<Integer> indices = new ArrayList<>();
Stack<Integer> brackets = new Stack<>();
int level = 0;
int bracketType = -1;
char lastChar = ' ';
char currentChar = ' ';
for (int i = 0; i < regex.length(); i++) {
currentChar = regex.charAt(i);
if (lastChar == '\\' || "^$?*+".indexOf(currentChar) >= 0)
;
else if (level == 0 && "|".indexOf(currentChar) >= 0)
indices.add(i + 1);
else if ((bracketType = "([{".indexOf(currentChar)) >= 0) {
brackets.push(bracketType);
level++;
} else if ((bracketType = ")]}".indexOf(currentChar)) >= 0) {
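// guard against an unmatched closing bracket, which would otherwise throw EmptyStackException
if (!brackets.isEmpty() && bracketType == brackets.peek()) {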
brackets.pop();
level--;
}
}
lastChar = currentChar;
}
ArrayList<String> results = new ArrayList<>();
int lastIndex = 0;
for (int i : indices)
results.add(regex.substring(lastIndex, (lastIndex = i) - 1));
results.add(regex.substring(lastIndex));
return results;
}
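For what it's worth, the char-by-char idea described above can be written a bit more compactly. This is only a minimal sketch, not a full regex parser: the names TopLevelAlternation and splitTopLevel are made up for illustration, and it assumes the pattern does not use \Q...\E quoting or nested character classes such as [a[b]].

```java
import java.util.ArrayList;
import java.util.List;

public class TopLevelAlternation {

    /**
     * Splits a regex at every '|' that sits outside groups and character classes.
     * Assumption: no \Q...\E quoting and no nested character classes.
     */
    public static List<String> splitTopLevel(String regex) {
        List<String> parts = new ArrayList<>();
        int depth = 0;            // nesting depth of ( ... ) groups
        boolean inClass = false;  // true while inside [ ... ]
        int start = 0;            // start index of the current alternative
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++;              // skip the escaped character, whatever it is
            } else if (inClass) {
                if (c == ']') {
                    inClass = false;
                }
            } else if (c == '[') {
                inClass = true;
            } else if (c == '(') {
                depth++;
            } else if (c == ')') {
                if (depth > 0) {
                    depth--;      // ignore unmatched closing parentheses
                }
            } else if (c == '|' && depth == 0) {
                parts.add(regex.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(regex.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        String regex = "test (1|2|3)|testing\\||tester\\nNextLine[ab|]|(test)";
        for (String part : splitTopLevel(regex)) {
            System.out.println(part);
        }
    }
}
```

Skipping the character after every backslash also handles an escaped backslash followed by a real pipe, which a "was the last char a backslash" check does not.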
Here is a proof of concept Java program splitting on double || and leaving single | untouched.
This would be more complicated to achieve with regex.
We have to escape each pipe twice because the pattern is parsed twice: once by the Java compiler when it reads the string literal, and again by the regex engine when the string is used as a pattern. The literal "\\|\\|" becomes the regex \|\|, which matches the two characters ||.
class split{
public static void main(String[] args){
String lineTest = "test (1|2|3)|testing\\||tester\\nNextLine[ab||]|(test)";
String separated[] =
lineTest.split("\\|\\|");
for ( int i = 0; i < separated.length;i++){
System.out.println( separated[i]);
}
}
}
The output is:
test (1|2|3)|testing\
tester\nNextLine[ab
]|(test)
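If the double escaping is hard to read, Pattern.quote can build the literal pattern instead. A small sketch (the class and method names are hypothetical, chosen only for this example):

```java
import java.util.regex.Pattern;

public class LiteralSplit {

    /** Splits on the literal two-character delimiter "||", leaving single pipes alone. */
    public static String[] splitOnDoublePipe(String s) {
        // Pattern.quote wraps the text so the regex engine treats it literally,
        // which avoids writing "\\|\\|" by hand.
        return s.split(Pattern.quote("||"));
    }

    public static void main(String[] args) {
        String lineTest = "test (1|2|3)|testing\\||tester\\nNextLine[ab||]|(test)";
        for (String part : splitOnDoublePipe(lineTest)) {
            System.out.println(part);
        }
    }
}
```

This should print the same three lines as the program above.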
I'm trying to practice for a technical test where I have to count the number of characters in a DNA sequence, but no matter what I do the counter won't update. This is really frustrating, as I learnt to code with Ruby and there it would update, but Java seems to have an issue. I know there's something wrong with my syntax, but for the life of me I can't figure it out.
public class DNA {
public static void main(String[] args) {
String dna1 = "ATGCGATACGCTTGA";
String dna2 = "ATGCGATACGTGA";
String dna3 = "ATTAATATGTACTGA";
String dna = dna1;
int aCount = 0;
int cCount = 0;
int tCount = 0;
for (int i = 0; i <= dna.length(); i++) {
if (dna.substring(i) == "A") {
aCount+= 1;
}
else if (dna.substring(i) == "C") {
cCount++;
}
else if (dna.substring(i) == "T") {
tCount++;
}
System.out.println(aCount);
}
}
}
It just keeps returning zero instead of adding one to it if the conditions are met and reassigning the value.
Good time to learn some basic debugging!
Let's look at what's actually in that substring you're looking at. Add
System.out.println(dna.substring(i));
to your loop. You'll see:
ATGCGATACGCTTGA
TGCGATACGCTTGA
GCGATACGCTTGA
CGATACGCTTGA
GATACGCTTGA
ATACGCTTGA
TACGCTTGA
ACGCTTGA
CGCTTGA
GCTTGA
CTTGA
TTGA
TGA
GA
A
So, substring doesn't mean what you thought it did - it's taking the substring starting at that index and going to the end of the string. Only the last character has a chance of matching your conditions.
Though, that last one still won't match your condition, which is understandably surprising if you're new to the language. In Java, == is "referential equality": when applied to non-primitives, it asserts that the two things are the same object in memory. For strings in particular, this can give surprising and inconsistent results. Java keeps a pool of string literals and reuses them, but strings built at runtime (such as the result of substring) are generally separate objects even when the characters match. The important takeaway is that string1.equals(string2) is the correct way to check.
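To make the difference concrete, here is a small sketch. The class and method names are made up for illustration; new String("A") is used because it is guaranteed to create an object distinct from the literal "A".

```java
public class StringEqualityDemo {

    /** Compares by content, which is what the DNA loop needs. */
    public static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    /** Compares by reference: true only if both variables point at the same object. */
    public static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "A";
        String built = new String("A"); // same characters, different object

        System.out.println(sameObject(literal, built));  // false
        System.out.println(sameContent(literal, built)); // true
    }
}
```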
It's a good idea to do some visibility and sanity checks like that, when your program isn't doing what you think it is. With a little practice you'll get a feel for what values to inspect.
Edward Peters is right about misuse of substring that returns a String.
In Java, strings must be placed between double quotes. A String is an object, and you must use the equals method to compare two objects:
String a = "first string";
String b = "second string";
boolean result = a.equals(b);
In your case, you should consider using charAt(int) instead. Chars must be placed between single quotes. A char is a primitive type (not an object), and you must use a double equals sign to compare two of them:
char a = '6';
char b = 't';
boolean result = (a==b);
So, your code should look like this:
public class DNA {
public static void main(String[] args) {
String dna1 = "ATGCGATACGCTTGA";
String dna2 = "ATGCGATACGTGA";
String dna3 = "ATTAATATGTACTGA";
String dna = dna1;
int aCount = 0;
int cCount = 0;
int tCount = 0;
for (int i = 0; i < dna.length(); i++) {
if (dna.charAt(i) == 'A') {
aCount += 1;
} else if (dna.charAt(i) == 'C') {
cCount++;
} else if (dna.charAt(i) == 'T') {
tCount++;
}
System.out.println(aCount);
}
}
}
substring(i) doesn't select one character but all the characters from i to the end of the string. You also made a wrong comparison: == checks object identity, while you want to check that the contents are equal.
You could substitute
if (dna.substring(i) == "A")
with:
if (dna.charAt(i) == 'A')
This works because charAt(i) returns a primitive char, so you can correctly compare it to 'A' using ==.
One of the problems, as stated, was the way you are comparing Strings. Here is a way that uses a switch statement and iterates over an array of characters (the arrow-style case labels need Java 14 or later). I put all the strings in an array. If you only have one string, the outer loop can be eliminated.
public class DNA {
public static void main(String[] args) {
String dna1 = "ATGCGATACGCTTGA";
String dna2 = "ATGCGATACGTGA";
String dna3 = "ATTAATATGTACTGA";
String[] dnaStrings =
{dna1,dna2,dna3};
int aCount = 0;
int cCount = 0;
int tCount = 0;
int gCount = 0;
for (String dnaString : dnaStrings) {
for (char c : dnaString.toCharArray()) {
switch (c) {
case 'A' -> aCount++;
case 'T' -> tCount++;
case 'C' -> cCount++;
case 'G' -> gCount++;
}
}
}
System.out.println("A's = " + aCount);
System.out.println("T's = " + tCount);
System.out.println("C's = " + cCount);
System.out.println("G's = " + gCount);
}
}
prints
A's = 14
T's = 13
C's = 6
G's = 10
I am learning Java and wonder how I can read two numbers from the same line.
Is this algorithm okay? What can I improve? What would you suggest?
import java.util.Scanner;
public class Main{
public static int Separate(String Values, int Order){
String toReturn = "";
int Counter = 0;
for(int Iterator = 0; Iterator < Values.length(); Iterator = Iterator + 1){
if(Values.charAt(Iterator) == ' ') {
if(Order == Counter) break;
else{
toReturn = "";
Counter = Counter + 1;
}
}
else toReturn += Values.charAt(Iterator);
}
return Integer.parseInt(toReturn);
}
public static void main(String[] args){
Scanner Entry = new Scanner(System.in);
System.out.print("Enter two numbers separated by space: ");
String Number = Entry.nextLine();
int Frst = Separate(Number, 0);
int Scnd = Separate(Number, 1);
}
}
What can I improve? What would you suggest?
Adopt the Java Naming Conventions:
Method Names are camelCase, starting with a lower case letter
Field and Property Names and Method Argument Names are camelCase, too
Basically only Class and Interface Names start with an upper case letter in Java.
public static int separate(String values, int order){
String toReturn = "";
int counter = 0;
for(int iterator = 0; ...) { ...
Otherwise I'd say: this algorithm is pretty solid for a beginner. It's easy to understand what's going on.
Of course Java provides much more sophisticated tools to solve this, using for example Regular Expressions with myString.split(...), or Streams with IntStream intStream = myString.chars().
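As a rough illustration of the split route, here is a sketch that parses every whitespace-separated number on a line. The class and method names (SplitNumbers, parseAll) are hypothetical, and it assumes every token is a valid int; otherwise Integer.parseInt throws a NumberFormatException.

```java
import java.util.Arrays;

public class SplitNumbers {

    /** Parses whitespace-separated integers, e.g. "12 34" becomes [12, 34]. */
    public static int[] parseAll(String line) {
        String trimmed = line.trim();          // drop leading and trailing spaces
        if (trimmed.isEmpty()) {
            return new int[0];                 // nothing to parse
        }
        return Arrays.stream(trimmed.split("\\s+")) // "\\s+" = one or more whitespace chars
                     .mapToInt(Integer::parseInt)
                     .toArray();
    }

    public static void main(String[] args) {
        int[] numbers = parseAll("12 34");
        System.out.println(numbers[0] + " and " + numbers[1]); // 12 and 34
    }
}
```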
Last but not least, you could add exception handling: what happens if Integer.parseInt is given some non-number? It throws a NumberFormatException, which crashes the program if nothing catches it.
try {
return Integer.parseInt(toReturn);
} catch (NumberFormatException e) {
// when "toReturn" cannot be parsed to an int, return a
// default value instead of crashing your application
return 0;
}
Or if crashing is the desired behavior, or you can ensure that this method is never called with an illegal String, leave it as it is (= don't add try catch)
I think what you've done is great for well-formatted input, where you have a single space character between the numbers. As others have pointed out, following Java naming conventions will greatly improve the readability of your code.
Handling sequences of space characters, possibly before, between, and after your numbers, is a little tricky. The general pattern would be to consume any sequence of spaces, remember the current position, consume the sequence of non-space characters, then if we're at the requested position return the parsed number.
public static int separate(String str, int order)
{
for(int i = 0, pos = 0; ; pos++)
{
while(i < str.length() && str.charAt(i) == ' ') i += 1;
int j = i;
while(i < str.length() && str.charAt(i) != ' ') i += 1;
if(i == j) throw new IllegalStateException("Missing number!");
if(order == pos)
{
// handle NumberFormatException
return Integer.parseInt(str.substring(j, i));
}
}
}
Test:
String s = " 23432 798 44";
for(int i=0; i<3; i++)
System.out.print(separate(s, i) + " ");
Output:
23432 798 44
For example String grdwe,erwd becomes dwregrdwe
I have most of the code. I just have trouble accessing all of ch1 and ch2 after the for loops in my method. I think I have to add all the elements of ch1 and ch2 to two separate arrays of characters, but I wouldn't know what to initialize the arrays to. Right now it only reads 1 element; I want to access all the elements and then concatenate them. I'm stumped.
And I'd prefer to avoid StringBuilder if possible.
public class reverseStringAfterAComma{
public void reverseMethod(String word){
char ch1 = ' ';
char ch2 = ' ';
for(int a=0; a<word.length(); a++)
{
if(word.charAt(a)==',')
{
for(int i=word.length()-1; i>a; i--)
{
ch1 = word.charAt(i);
System.out.print(ch1);
}
for (int j=0; j<a; j++)
{
ch2 = word.charAt(j);
System.out.print(ch2);
}
}
}
//System.out.print("\n"+ch1);
//System.out.print("\n"+ch2);
}
public static void main(String []args){
reverseStringAfterAComma rsac = new reverseStringAfterAComma();
String str="grdwe,erwd";
rsac.reverseMethod(str);
}
}
You can use StringBuilder for this:
First split the string using:
String[] splitString = yourString.split(",");
Then reverse the second part of the string using this:
splitString[1] = new StringBuilder(splitString[1]).reverse().toString();
then append the two sections like so:
String result = splitString[1] + splitString[0];
And if you want to print it just do:
System.out.print(result);
The final code would be:
String[] splitString = yourString.split(",");
splitString[1] = new StringBuilder(splitString[1]).reverse().toString();
// "final" is a reserved word in Java, so the variable is called result
String result = splitString[1] + splitString[0];
System.out.print(result);
Since StringBuilder lives in java.lang, it is imported automatically, so you do not need an import statement for it.
It appears you currently have working code, but are looking to print/save the value outside of the for loops. Just set a variable before you enter the loops, and concatenate the chars in each loop:
String result = "";
for (int a = 0; a < word.length(); a++) {
if (word.charAt(a) == ',') {
for (int i = word.length() - 1; i > a; i--) {
ch1 = word.charAt(i);
result += ch1;
}
for (int j = 0; j < a; j++) {
ch2 = word.charAt(j);
result += ch2;
}
}
}
System.out.println(result);
Let me propose a solution that doesn't use a StringBuilder.
You should know there is no good reason not to use that class, since it is well tested.
The first step would be to split your String on the first comma found (I assumed that if there is more than one, the rest are part of the text to reverse). To do that, we can use String.split(String regex, int limit).
The limit is defined like this:
If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n and the array's last entry will contain all input beyond the last matched delimiter.
If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
Example :
"foobar".split(",", 2) // {"foobar"}
"foo,bar".split(",", 2) // {"foo", "bar"}
"foo,bar,far".split(",", 2) // {"foo", "bar,far"}
So this can be used to our advantage here:
String text = "Jake, ma I ,dlrow olleh";
String[] splittedText = text.split( ",", 2 ); //gives an array of at most 2 elements
Now we just need to reverse the second element, if it exists, using the simplest algorithm.
String result;
if ( splittedText.length == 2 ) { //A comma was found
char[] toReverse = splittedText[1].toCharArray(); //get the char array to reverse
int start = 0;
int end = toReverse.length - 1;
while ( start < end ) { //iterate until needed
char tmp = toReverse[start];
toReverse[start] = toReverse[end];
toReverse[end] = tmp;
start++; //step forward
end--; //step back
}
result = new String( toReverse ) + splittedText[0];
}
With a StringBuilder, that whole block reduces to:
if ( splittedText.length == 2 ){
result = new StringBuilder(splittedText[1]).reverse().toString() + splittedText[0];
}
And if there is only one element (no comma was found), the result is the same as the original text:
else { //No comma found, just take the original text
result = text;
}
Then we just need to print the result
System.out.println( result );
hello world, I am Jake
So I'm creating a program that will output the first character of a string and then the first character of another string. Then the second character of the first string and the second character of the second string, and so on.
I created what is below, I was just wondering if there is an alternative to this using a loop or something rather than substring
public class Whatever
{
public static void main(String[] args)
{
System.out.println (interleave ("abcdefg", "1234"));
}
public static String interleave(String you, String me)
{
if (you.length() == 0) return me;
else if (me.length() == 0) return you;
return you.substring(0,1) + interleave(me, you.substring(1));
}
}
OUTPUT: a1b2c3d4efg
Well, if you really don't want to use substrings, you can use String's toCharArray() method, then you can use a StringBuilder to append the chars. With this you can loop through each of the array's indices.
Doing so, this would be the outcome:
public static String interleave(String you, String me) {
char[] a = you.toCharArray();
char[] b = me.toCharArray();
StringBuilder out = new StringBuilder();
int maxLength = Math.max(a.length, b.length);
for( int i = 0; i < maxLength; i++ ) {
if( i < a.length ) out.append(a[i]);
if( i < b.length ) out.append(b[i]);
}
return out.toString();
}
Your code is efficient enough as it is, though. This can be an alternative, if you really want to avoid substrings.
This is a loop implementation (not handling null value, just to show the logic):
public static String interleave(String you, String me) {
StringBuilder result = new StringBuilder();
for (int i = 0 ; i < Math.max(you.length(), me.length()) ; i++) {
if (i < you.length()) {
result.append(you.charAt(i));
}
if (i < me.length()) {
result.append(me.charAt(i));
}
}
return result.toString();
}
The solution I am proposing is based on the expected output. In your particular case, consider using String's split method, since you are interleaving one character at a time.
So do something like this:
String[] xs = "abcdefg".split("");
String[] ys = "1234".split("");
Now loop up to the length of the longer array and interleave, checking each index against both array lengths before accessing the element.
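Put together, the split-based idea might look like the sketch below. It assumes Java 8 or later, where split("") no longer produces a leading empty string; the class name SplitInterleave is made up for this example.

```java
public class SplitInterleave {

    /** Interleaves two strings one character at a time using split(""). */
    public static String interleave(String you, String me) {
        String[] xs = you.split(""); // one single-character String per element (Java 8+)
        String[] ys = me.split("");
        StringBuilder out = new StringBuilder();
        int longest = Math.max(xs.length, ys.length);
        for (int i = 0; i < longest; i++) {
            if (i < xs.length) {
                out.append(xs[i]); // length check before touching the shorter array
            }
            if (i < ys.length) {
                out.append(ys[i]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(interleave("abcdefg", "1234")); // a1b2c3d4efg
    }
}
```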
To implement this as a loop, you have to keep track of the position and keep appending until one string runs out, then tack the rest of the other one on. For larger strings, use a StringBuilder instead of +=. Something like this:
int i = 0;
String result = "";
while (i < you.length() && i < me.length())
{
// append separately: char + char would add the numeric character codes
result += you.charAt(i);
result += me.charAt(i);
i++;
}
if (i == you.length())
result += me.substring(i);
else
result += you.substring(i);
Improved (in some sense) version of @BenjaminBoutier's answer.
StringBuilder is generally the efficient way to concatenate Strings in a loop, because it avoids creating a new String on every step.
public static String interleave(String you, String me) {
StringBuilder result = new StringBuilder();
int min = Math.min(you.length(), me.length());
String longest = you.length() > me.length() ? you : me;
int i = 0;
while (i < min) { // mix characters
result.append(you.charAt(i));
result.append(me.charAt(i));
i++;
}
while (i < longest.length()) { // add the remaining characters of longest
result.append(longest.charAt(i));
i++;
}
return result.toString();
}
I need to split Java Strings at any " character.
The catch is that the character before it must not be a backslash ( \ ).
So these Strings would split like so:
asdnaoe"asduwd"adfdgb => asdnaoe, asduwd, adfdgb
addfgmmnp"fd asd\"das"fsfk => addfgmmnp, fd asd\"das, fsfk
Is there any easy way to achieve this using regular expressions?
(I use RegEx because it is easiest for me, the coder. Also performance is not an issue...)
Thank you in advance.
I solved it like this:
private static String[] split(String s) {
char[] cs = s.toCharArray();
int n = 1;
for (int i = 0; i < cs.length; i++) {
if (cs[i] == '"') {
int sn = 0;
for (int j = i - 1; j >= 0; j--) {
if (cs[j] == '\\')
sn += 1;
else
break;
}
if (sn % 2 == 0)
n += 1;
}
}
String[] result = new String[n];
int lastBreakPos = 0;
int index = 0;
for (int i = 0; i < cs.length; i++) {
if (cs[i] == '"') {
int sn = 0;
for (int j = i - 1; j >= 0; j--) {
if (cs[j] == '\\')
sn += 1;
else
break;
}
if (sn % 2 == 0) {
char[] splitcs = new char[i - lastBreakPos];
System.arraycopy(cs, lastBreakPos, splitcs, 0, i - lastBreakPos);
lastBreakPos = i + 1;
result[index] = new StringBuilder().append(splitcs).toString();
index += 1;
}
}
}
// everything after the last unescaped quote is the final segment
char[] splitcs = new char[cs.length - lastBreakPos];
System.arraycopy(cs, lastBreakPos, splitcs, 0, cs.length - lastBreakPos);
result[index] = new StringBuilder().append(splitcs).toString();
return result;
}
Anyways, thanks for all your great responses!
(Oh, and despite this, I will be using either @biziclop's or @Alan Moore's version, as they're shorter and probably more efficient! =)
Sure, just use
(?<!\\)"
Quick PowerShell test:
PS> 'addfgmmnp"fd asd\"das"fsfk' -split '(?<!\\)"'
addfgmmnp
fd asd\"das
fsfk
However, this won't split on \\" (an escaped backslash followed by a normal quote, at least in most C-like languages' escaping rules). You cannot really solve that with a lookbehind in Java, though, as arbitrary-length lookbehind isn't supported:
PS> 'addfgmmnp"fd asd\\"das"fsfk' -split '(?<!\\)"'
addfgmmnp
fd asd\\"das
fsfk
Usually you would expect a proper solution to split on the remaining " because it isn't really escaped.
You can solve this problem with a Java regex; just don't use split().
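// needs: import java.util.ArrayList; import java.util.List;
//        import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) throws Exception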
{
String[] strs = {
"asdnaoe\"asduwd\"adfdgb",
"addfgmmnp\"fd asd\\\"das\"fsfk"
};
for (String str : strs)
{
System.out.printf("%n%-28s=> %s%n", str, splitIt(str));
}
}
public static List<String> splitIt(String s)
{
ArrayList<String> result = new ArrayList<String>();
Matcher m = Pattern.compile("([^\"\\\\]|\\\\.)+").matcher(s);
while (m.find())
{
result.add(m.group());
}
return result;
}
output:
asdnaoe"asduwd"adfdgb => [asdnaoe, asduwd, adfdgb]
addfgmmnp"fd asd\"das"fsfk => [addfgmmnp, fd asd\"das, fsfk]
The core regex, [^"\\]|\\., consumes anything that's not a backslash or a quotation mark, or a backslash followed by anything--so \\\" would be matched as an escaped backslash (\\) followed by an escaped quote (\").
Just for reference, here's a non-regexp solution that handles escaping of \ as well. (In real life, this could be simplified, there's no real need for the START_NEW state, but I tried to write it in a way that's easier to read.)
import java.util.LinkedList;
import java.util.List;

public class Splitter {
private enum State {
IN_TEXT, ESCAPING, START_NEW;
}
public static List<String> split( String source ) {
LinkedList<String> ret = new LinkedList<String>();
StringBuilder sb = new StringBuilder();
State state = State.START_NEW;
for( int i = 0; i < source.length(); i++ ) {
char next = source.charAt( i );
if( next == '\\' && state != State.ESCAPING ) {
state = State.ESCAPING;
} else if( next == '\\' && state == State.ESCAPING ) {
state = State.IN_TEXT;
} else if( next == '"' && state != State.ESCAPING ) {
ret.add( sb.toString() );
sb = new StringBuilder();
state = State.START_NEW;
} else {
state = State.IN_TEXT;
}
if( state != State.START_NEW ) {
sb.append( next );
}
}
ret.add( sb.toString() );
return ret;
}
}