CSV Data excluding commas between another character set


For a class assignment, I'm using data from https://www.kaggle.com/shivamb/netflix-shows, which has presented a small problem for me:
It is a CSV; however, the values in the cast column are also separated by commas, which breaks the .split call I was using. Each row looks like [value, value, value, "value, value", value, ...]. The goal is to exclude the values within the " ".
Currently, to run this, I have:
while ( inFile.hasNext() ){
String delims = "[,]"; // delimiters for separation
String[] tokens = inFile.nextLine().split(delims); // split the line into a String array
for (String token : tokens) {
System.out.println(token);
}
}

Because it's a class assignment, I would simply code the logic by hand.
For each character, decide whether to add it to the current word or to start a new word. That makes it easy to track whether you are inside the " " and react to it.
Something like this:
public List<String> split(String line)
{
List<String> result = new ArrayList<>();
String currentWord = "";
boolean inWord = false;
for (int i = 0; i < line.length(); i++)
{
char c = line.charAt(i);
if (c == ',' && !inWord)
{
result.add(currentWord.trim());
currentWord = "";
continue;
}
if (c == '"')
{
inWord = !inWord;
continue;
}
currentWord += c;
}
result.add(currentWord.trim()); // add the last field, which has no trailing comma
return result;
}
There are also some hard-core regular expressions, as in the question "Splitting on comma outside quotes",
but I would not use them in an assignment.
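For reference, the regular-expression approach mentioned above might look roughly like the sketch below. The class name RegexCsvSplit and the sample line are made up for illustration; the pattern splits on a comma only when an even number of double quotes follows it, which assumes the quotes in each line are balanced. Note that the quotes themselves stay in the tokens.

```java
import java.util.Arrays;

public class RegexCsvSplit {
    // A comma counts as a separator only if the rest of the line contains
    // an even number of double quotes, i.e. the comma is outside a quoted field.
    private static final String COMMA_OUTSIDE_QUOTES =
            ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static String[] split(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        return line.split(COMMA_OUTSIDE_QUOTES, -1);
    }

    public static void main(String[] args) {
        // Hypothetical row in the style of the data set described in the question.
        String line = "s1,Movie,\"Actor One, Actor Two\",2020,";
        System.out.println(Arrays.toString(split(line)));
    }
}
```

The lookahead rescans the rest of the line for every comma, so this is slower than the character loop on long lines, and it is much harder to read, which is a fair reason to avoid it in an assignment.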

I'm sure there is a simpler way of doing this but this is one solution I came up with.
while ( inFile.hasNext() ) {
int quote = 0;
String delims = "[,]"; //Delimiters for separation
String[] tokens = inFile.nextLine().split(delims);
for (String token : tokens) {
if(token.contains("\"")) { //If contains a quote
quote++; //Increment quote counter
}
if (quote != 1) //If not between quotes
{
if(token.indexOf(" ") == -1) //Print if no space at beginning
{
System.out.println(token);
}
else { //Print from first character
System.out.println(token.substring(token.indexOf(" ") + 1));
}
}
}
}
inFile.close();

Related

Format the results of JTextArea where it doesn't skip a line?

I wrote a program that, given a list of anything, adds single quotes around each item and a comma after it, so
"Dogs are cool" becomes 'Dogs', 'are', 'cool'
except the issue is that the program puts a single quote character on a line of its own.
Here are the results:
'190619904419','
190619904469','
190619904569','
190619904669','
190619904759','
190619904859','
190619904869','
'
See how it appends the single quote to the end of the first line,
when it should be the following:
'190619904419',
'190619904469',
'190619904569',
'190619904669',
'190619904759',
'190619904859',
'190619904869',
The text is entered in a JTextArea, and I do the following:
String line = JTextArea.getText().toString();
and I pass it to this method:
private static String SQLFormatter(String list, JFrame frame){
String ret = "";
String currentWord = "";
for(int i = 0; i < list.length(); i++){
char c = list.charAt(i);
if( i == list.length() - 1){
currentWord += c;
currentWord = '\'' + currentWord + '\'';
ret += currentWord;
currentWord = "";
}
else if(c != ' '){
currentWord += c;
}else if(c == ' '){
currentWord = '\'' + currentWord + '\'' + ',';
ret += currentWord;
currentWord = "";
}
}
return ret;
}
Any advice? The bug is in there somewhere, but I'm not sure if it's the method or some JTextArea feature I am missing.
[JTEXT AREA RESULTS][1]
[1]: https://i.stack.imgur.com/WXBKs.png

So it's a little hard to tell without the input, but there seems to be other white space, like carriage returns, in the input, which throws off your parsing. Also, if the input has several white space characters in a row or ends in white space, you might get more than you want (for example a trailing comma, which I see you worked to avoid). Your original routine works on "Dogs are cool", but not as well on "Dogs \rare \rcool \r". Here is a slightly modified version that I think addresses the issues (I also pulled out the unused JFrame parameter).
I also tried to think of it as: a comma precedes every word but the first. I introduced a boolean for that, though it would have worked to check whether ret was empty.
public static String SQLFormatter(String list) {
String ret = "";
String currentWord = "";
boolean firstWord = true;
for (int i = 0; i < list.length(); i++) {
// note modified to prepend comma to words beyond first and treat any white space as separator
// but multiple whitespace is treated as if just one space
char c = list.charAt(i);
if (!Character.isWhitespace(c)) {
currentWord += c;
} else if (!currentWord.equals("")) {
currentWord = '\'' + currentWord + '\'';
if (firstWord) {
ret += currentWord;
firstWord = false;
} else {
ret = ret + ',' + currentWord;
}
currentWord = "";
}
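}
// flush the last word when the input does not end in white space
if (!currentWord.equals("")) {
ret += (firstWord ? "" : ",") + '\'' + currentWord + '\'';
}
return ret;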
}
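The same idea (any run of white space is a separator, and a comma goes before every word but the first) can also be sketched with the standard library doing the bookkeeping. This is only an alternative sketch; the class name QuoteWords and the method name format are invented for the example.

```java
import java.util.StringJoiner;

public class QuoteWords {
    // Wraps every white-space-separated word in single quotes and joins
    // the results with commas, without a leading or trailing comma.
    public static String format(String list) {
        StringJoiner joiner = new StringJoiner(",");
        // trim() drops leading/trailing white space (including \r and \n);
        // "\\s+" treats any run of white space as one separator.
        for (String word : list.trim().split("\\s+")) {
            if (!word.isEmpty()) { // a blank input yields one empty token
                joiner.add("'" + word + "'");
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("Dogs \rare \rcool \r"));
    }
}
```

StringJoiner only places the separator between elements, so there is no need for a firstWord flag or for special handling of the last word.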

How to write a replaceAll function java?

I'm trying to write a program that lets a user input a phrase (for example: "I like cats") and prints each word on a separate line. I have already written the part that starts a new line at every space, but I don't want blank lines between the words because of excess spaces. I can't use regular expressions or methods such as String.split(), replaceAll() or trim().
I tried a few different approaches, but I don't know how to delete spaces when I don't know how many there could be, and nothing seems to work.
Is there a way I could implement it in the code I've already written?
for (i=0; i<length-1;) {
j = text.indexOf(" ", i);
if (j==-1) {
j = text.length();
}
System.out.print("\n"+text.substring(i,j));
i = j+1;
}
Or how can I write a new expression for it? Any suggestions would really be appreciated.

I have already written the part to allow a new line at every space but
I don't want to have blank lines between the words because of excess
spaces.
If you can't use trim() or replaceAll(), you can use java.util.Scanner to read each word as a token. By default, Scanner uses a white space pattern as the delimiter for finding tokens. Similarly, you can also use StringTokenizer to print each word on a new line.
String str = "I like cats";
Scanner scanner = new Scanner(str);
while (scanner.hasNext()) {
System.out.println(scanner.next());
}
OUTPUT
I
like
cats
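The StringTokenizer variant mentioned above could look something like this. It is a sketch only; the class name TokenizerWords and the helper method words are made up for the example. With the single-argument constructor, StringTokenizer splits on space, tab, newline, carriage return and form feed, and it never returns empty tokens, so runs of spaces produce no blank lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerWords {
    // Collects the white-space-separated words of the text, in order.
    public static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text); // default delimiters
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Extra spaces on purpose: they should not produce blank lines.
        for (String word : words("I   like  cats ")) {
            System.out.println(word);
        }
    }
}
```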

Here is a simple solution using substring() and indexOf()
public static void main(String[] args) {
List<String> split = split("I like cats");
split.forEach(System.out::println);
}
public static List<String> split(String s){
List<String> list = new ArrayList<>();
while(s.contains(" ")){
int pos = s.indexOf(' ');
if (pos > 0) { // skip empty pieces caused by consecutive spaces
list.add(s.substring(0, pos));
}
s = s.substring(pos + 1);
}
if (!s.isEmpty()) {
list.add(s);
}
return list;
}
Edit:
If you only want to print the text without splitting or making lists, you can use this:
public static void main(String[] args) {
newLine("I like cats");
}
public static void newLine(String s){
while(s.contains(" ")){
int pos = s.indexOf(' ');
if (pos > 0) { // skip empty pieces caused by consecutive spaces
System.out.println(s.substring(0, pos));
}
s = s.substring(pos + 1);
}
if (!s.isEmpty()) {
System.out.println(s);
}
}

I think this will solve your problem.
public static List<String> getWords(String text) {
List<String> words = new ArrayList<>();
BreakIterator breakIterator = BreakIterator.getWordInstance();
breakIterator.setText(text);
int lastIndex = breakIterator.first();
while (BreakIterator.DONE != lastIndex) {
int firstIndex = lastIndex;
lastIndex = breakIterator.next();
if (lastIndex != BreakIterator.DONE && Character.isLetterOrDigit(text.charAt(firstIndex))) {
words.add(text.substring(firstIndex, lastIndex));
}
}
return words;
}
public static void main(String[] args) {
String text = "I like cats";
List<String> words = getWords(text);
for (String word : words) {
System.out.println(word);
}
}
Output :
I
like
cats

What about something like this? It's O(N) time complexity:
Just use a StringBuilder to build the result as you iterate through your string, and append "\n" the first time you find a space after a word.
String word = "I like cats";
StringBuilder sb = new StringBuilder();
boolean newLine = true;
for(int i = 0; i < word.length(); i++) {
if (word.charAt(i) == ' ') {
if (newLine) {
sb.append("\n");
newLine = false;
}
} else {
newLine = true;
sb.append(word.charAt(i));
}
}
String result = sb.toString();
EDIT: Fixed the problem mentioned in the comments (new line on multiple spaces)

Sorry, I did not notice that you cannot use replaceAll().
This is my other solution:
String s = "I like cats";
Pattern p = Pattern.compile("([\\S])+");
Matcher m = p.matcher(s);
while (m.find( )) {
System.out.println(m.group());
}
Old solution:
String s = "I like cats";
System.out.println(s.replaceAll("( )+","\n"));

You have almost done the whole job. Just make a small addition, and your code will work as you wish:
for (int i = 0; i < length - 1;) {
j = text.indexOf(" ", i);
if (i == j) { //if next space after space, skip it
i = j + 1;
continue;
}
if (j == -1) {
j = text.length();
}
System.out.print("\n" + text.substring(i, j));
i = j + 1;
}

splitting a comma delimited string with escaping quotes

I see that there are several similar questions, but I have not found any of the answers satisfactory. I have a comma delimited file where each line looks something like this:
4477,52544,,,P,S, ,,SUSAN JONES,9534 Black Bear Dr,,"CITY, NV 89506",9534 BLACK BEAR DR,,CITY,NV,89506,2008,,,, , , , ,,1
The problem arises when a token uses quotes to escape a comma, as in "CITY, NV 89506".
I need a result where the escaped tokens are handled and every token is included, even empty ones.

Consider a proper CSV parser such as opencsv. It will be highly tested (unlike a new, home-grown solution) and handle edge-conditions such as the one you describe (and lots you haven't thought about).
In the download, there is an examples folder which contains "addresses.csv" with this line:
Jim Sample,"3 Sample Street, Sampleville, Australia. 2615",jim@sample.com
In the same directory, the file AddressExample.java parses this file, and is highly relevant to your question.

Here is one way to answer your question using delivered java.lang.String methods. I believe it does what you need.
private final char QUOTE = '"';
private final char COMMA = ',';
private final char SUB = 0x001A; // or whatever character you know will NEVER
// appear in the input String
public void readLine(String line) {
System.out.println("original: " + line);
// Replace commas inside quoted text with substitute character
boolean quote = false;
for (int index = 0; index < line.length(); index++) {
char ch = line.charAt(index);
if (ch == QUOTE) {
quote = !quote;
} else if (ch == COMMA && quote) {
line = replaceChar(line, index, SUB);
System.out.println("replaced: " + line);
}
}
// Strip out all quotation marks
for (int index = 0; index < line.length(); index++) {
if (line.charAt(index) == QUOTE) {
line = removeChar(line, index);
index--; // re-check this position, since the next character shifted into it
System.out.println("stripped: " + line);
}
}
// Parse input into tokens; the limit of -1 keeps trailing empty tokens
String[] tokens = line.split(",", -1);
// restore commas in place of SUB characters
for (int i = 0; i < tokens.length; i++) {
tokens[i] = tokens[i].replace(SUB, COMMA);
}
// Display final results
System.out.println("Final Parsed Tokens: ");
for (String token : tokens) {
System.out.println("[" + token + "]");
}
}
private String replaceChar(String input, int position, char replacement) {
String begin = input.substring(0, position);
String end = input.substring(position + 1, input.length());
return begin + replacement + end;
}
private String removeChar(String input, int position) {
String begin = input.substring(0, position);
String end = input.substring(position + 1, input.length());
return begin + end;
}
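If the substitute-character trick feels fragile, a single pass over the line with a StringBuilder does the same job without needing a character that never appears in the input. The sketch below is not from the original answer: the class name QuotedCsvLine is invented, and treating a doubled quote ("") inside a quoted token as one literal quote is an assumption borrowed from the usual CSV convention. Every token is kept, including empty ones and a trailing empty one.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvLine {
    // Splits one line on commas that are outside double quotes.
    public static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // assumption: "" inside quotes is a literal quote
                    i++;                 // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the token
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // end of token, possibly empty
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // the last token has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "4477,52544,,,P,S, ,,SUSAN JONES,9534 Black Bear Dr,,\"CITY, NV 89506\",9534 BLACK BEAR DR,,CITY,NV,89506,2008,,,, , , , ,,1";
        for (String field : parse(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

For real files with quoted line breaks or other edge cases, a tested CSV library is still the safer choice.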

error related to array index

The code below has a variable "name". It may contain a first and last name or only a first name. The code checks whether there is any white space in "name". If a space exists, it splits the string.
However, I am getting the error "Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1 at Space.main(Space.java:9)" in the following cases:
If there is a white space before "Richard"
If there is a white space after "Richard" without a second word or second string.
If I have two spaces after "Richard", then it will not save the name in the lname variable.
How can I resolve this error?
public class Space {
public static void main(String[] args) {
String name = "Richard rinse ";
if(name.indexOf(' ') >= 0) {
String[] temp;
temp = name.split(" ");
String fname = temp[0];
String lname = temp[1];
System.out.println(fname);
System.out.println(lname);
} else {
System.out.println("Space does not exists");}
}
}

You have to split the string on white space ("\s") like this:
name.split("\\s+");

If there are two spaces, temp[1] will be empty. Given "Richard  rinse", the array is split this way:
1 Richard
2
3 rinse
You should trim() the string and do something like
while(name.contains("  "))
name=name.replace("  "," ");

String[] parts = name.trim().split("\\s+");
if (parts.length == 2) {
// print names out
} else {
// either less than 2 names or more than 2 names
}
trim() removes leading and trailing white space; without it, leading white space would produce an empty first string in the array.
The token to split on is a regular expression meaning any run of one or more white space characters (spaces, tabs, etc.).
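Put together as something runnable, the trim-then-split idea might look like the sketch below. The class name NameSplit and the method firstAndLast are invented for illustration, and returning an empty string for a missing last name is just one possible choice.

```java
public class NameSplit {
    // Returns {first, last}; last is "" when only one name is present.
    public static String[] firstAndLast(String name) {
        // trim() removes leading/trailing white space, and "\\s+" treats
        // any run of white space between the names as a single separator.
        String[] parts = name.trim().split("\\s+");
        String first = parts[0]; // "" when the input is blank
        String last = parts.length > 1 ? parts[1] : ""; // guard against a missing index 1
        return new String[] { first, last };
    }

    public static void main(String[] args) {
        String[] names = firstAndLast("  Richard   rinse ");
        System.out.println(names[0]);
        System.out.println(names[1]);
    }
}
```

Checking parts.length before reading parts[1] is what avoids the ArrayIndexOutOfBoundsException from the question.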

Maybe that way:
public class Space {
public static void main(String[] args) {
String name = "Richard rinse ";
// collapse runs of white space; replaceAll takes a Java regex, not a /.../g literal
String tname = name.trim().replaceAll("\\s+", " ");
String[] temp = tname.split(" ");
String fname = (temp.length > 0) ? temp[0] : null;
String lname = (temp.length > 1) ? temp[1] : null;
if (fname != null) System.out.println(fname);
if (lname != null) System.out.println(lname);
}
}

To trim the leading white spaces, use this.
public String trimSpaces(String s){
String str = "";
boolean spacesOmitted = false; // becomes true at the first non-space character
for (int i=0; i<s.length(); i++){
char ch = s.charAt(i);
if (ch!=' '){
spacesOmitted = true;
}
if (spacesOmitted){
str+=ch;
}
}
return str;
}
Then use the trimmed string in place of name.

Reversing characters in each word in a sentence - Stack Implementation

This code is inside the main function:
Scanner input = new Scanner(System.in);
System.out.println("Type a sentence");
String sentence = input.next();
Stack<Character> stk = new Stack<Character>();
int i = 0;
while (i < sentence.length())
{
while (sentence.charAt(i) != ' ' && i < sentence.length() - 1)
{
stk.push(sentence.charAt(i));
i++;
}
stk.empty();
i++;
}
And this is the empty() function:
public void empty()
{
while (this.first != null)
System.out.print(this.pop());
}
It doesn't work properly, as by typing example sentence I am getting this output: lpmaxe. The first letter is missing and the loop stops instead of counting past the space to the next part of the sentence.
I am trying to achieve this:
This is a sentence ---> sihT si a ecnetnes

Per the modifications to the original post, the OP now indicates that the goal is to reverse the letter order of each word within a sentence while leaving the words in their initial positions.
The simplest way to do this, I think, is to make use of the String split function, iterate through the words, and reverse their orders.
String[] words = sentence.split(" "); // splits on the space between words
for (int i = 0; i < words.length; i++) {
String word = words[i];
System.out.print(reverseWord(word));
if (i < words.length-1) {
System.out.print(" "); // space after all words but the last
}
}
Where the method reverseWord is defined as:
public String reverseWord(String word) {
for( int i = 0; i < word.length(); i++) {
stk.push(word.charAt(i));
}
return stk.empty();
}
And where the empty method has been changed to:
public String empty() {
String stackWord = "";
while (this.first != null)
stackWord += this.pop();
return stackWord;
}
Original response
The original question indicated that the OP wanted to completely reverse the sentence.
You've got a double-looping construct where you don't really need it.
Consider this logic:
Read each character from the input string and push that character to the stack
When the input string is empty, pop each character from the stack and print it to screen.
So:
for( int i = 0; i < sentence.length(); i++) {
stk.push(sentence.charAt(i));
}
stk.empty();

I assume that what you want your code to do is to reverse each word in turn, not the entire string. So, given the input example sentence you want it to output elpmaxe ecnetnes not ecnetnes elpmaxe.
The reason that you see lpmaxe instead of elpmaxe is because your inner while-loop doesn't process the last character of the string since you have i < sentence.length() - 1 instead of i < sentence.length(). The reason that you only see a single word is because your sentence variable consists only of the first token of the input. This is what the method Scanner.next() does; it reads the next (by default) space-delimited token.
If you want to input a whole sentence, wrap up System.in as follows:
BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
and call reader.readLine().
Hope this helps.
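Putting both fixes together (read the whole line, and do not stop before the last character), a self-contained sketch could look like this. It uses java.util.ArrayDeque as the stack rather than the custom Stack class from the question, and the names ReverseEachWord, reverseWords and drain are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayDeque;
import java.util.Deque;

public class ReverseEachWord {
    // Reverses the letters of each space-separated word, keeping word order.
    public static String reverseWords(String sentence) {
        Deque<Character> stack = new ArrayDeque<>();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < sentence.length(); i++) { // note: length(), not length() - 1
            char c = sentence.charAt(i);
            if (c == ' ') {
                drain(stack, out); // end of a word: pop it out in reverse
                out.append(c);
            } else {
                stack.push(c);
            }
        }
        drain(stack, out); // the last word has no trailing space
        return out.toString();
    }

    // Pops every character off the stack and appends it to the output.
    private static void drain(Deque<Character> stack, StringBuilder out) {
        while (!stack.isEmpty()) {
            out.append(stack.pop());
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Type a sentence");
        String sentence = reader.readLine(); // reads the whole line, spaces included
        if (sentence != null) {
            System.out.println(reverseWords(sentence));
        }
    }
}
```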

Assuming you've already got your input in sentence and the Stack object is called stk, here's an idea:
char[] tokens = sentence.toCharArray();
for (char c : tokens) {
if (c == ' ') {
stk.empty();
System.out.print(c);
} else {
stk.push(c);
}
}
stk.empty(); // flush the last word, which has no trailing space
Thus, it will scan through one character at a time. If we hit a space character, we'll assume we've hit the end of a word, spit out that word in reverse, print that space character, then continue. Otherwise, we'll add the character to the stack and continue building the current word. (If you want to also allow punctuation like periods, commas, and the like, change if (c == ' ') { to something like if (c == ' ' || c == '.' || c == ',') { and so on.)
As for why you're only getting one word, darrenp already pointed it out. (Personally, I'd use a Scanner instead of a BufferedReader unless speed is an issue, but that's just my opinion.)

import java.util.StringTokenizer;
public class stringWork {
public static void main(String[] args) {
String s1 = "Hello World";
s1 = reverseSentence(s1);
System.out.println(s1);
s1 = reverseWord(s1);
System.out.println(s1);
}
private static String reverseSentence(String s1){
String s2 = "";
for(int i=s1.length()-1;i>=0;i--){
s2 += s1.charAt(i);
}
return s2;
}
private static String reverseWord(String s1){
String s2 = "";
StringTokenizer st = new StringTokenizer(s1);
while (st.hasMoreTokens()) {
s2 += reverseSentence(st.nextToken());
s2 += " ";
}
return s2;
}
}

import java.util.Stack;
import java.util.StringTokenizer;
public class ReverseofeachWordinaSentance {
/**
* @param args
*/
public static void main(String[] args) {
String source = "Welcome to the word reversing program";
for (String str : source.split(" ")) {
System.out.print(new StringBuilder(str).reverse().toString());
System.out.print(" ");
}
System.out.println("");
System.out.println("------------------------------------ ");
String original = "Welcome to the word reversing program";
wordReverse(original);
System.out.println("Orginal Sentence :::: "+original);
System.out.println("Reverse Sentence :::: "+wordReverse(original));
}
public static String wordReverse(String original){
StringTokenizer string = new StringTokenizer(original);
Stack<Character> charStack = new Stack<Character>();
while (string.hasMoreTokens()){
String temp = string.nextToken();
for (int i = 0; i < temp.length(); i ++){
charStack.push(temp.charAt(i));
}
charStack.push(' ');
}
StringBuilder result = new StringBuilder();
while(!charStack.empty()){
result.append(charStack.pop());
}
return result.toString();
}
}

public class reverseStr {
public static void main(String[] args) {
String testsa[] = { "", " ", " ", "a ", " a", " aa bd cs " };
for (String tests : testsa) {
System.out.println(tests + "|" + reverseWords2(tests) + "|");
}
}
public static String reverseWords2(String s) {
String[] sa;
String out = "";
sa = s.split(" ");
for (int i = 0; i < sa.length; i++) {
String word = sa[sa.length - 1 - i];
// exclude "" in splited array
if (!word.equals("")) {
//add space between two words
out += word + " ";
}
}
//exclude the last space and return when string is void
int n = out.length();
if (n > 0) {
return out.substring(0, out.length() - 1);
} else {
return "";
}
}
}
This passes on LeetCode.
