Export a string in Java to CSV - java

How can I export a string in Java to a CSV file so that each row ends up in a single column?
This is what I am expecting:
Column 1
Row 1: string1,string2,string3
Row 2: string4, string5, string6
Thanks in advance

In the code below you provide a List of elements. Each element contains the info for one line of the csv file.
The StringBuilder is used to create the String for one line, which then is output at once to the file.
public void writeCsvFile(List<Elements> elements, String fileName) throws IOException {
BufferedWriter csvFile = null;
String delim = ",";
try {
csvFile = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream(fileName), StandardCharsets.UTF_8));
for (int i = 0; i < elements.size(); i++) {
StringBuilder buf = new StringBuilder();
Elements elem = elements.get(i);
buf.append(elem.info1).append(delim);
buf.append(elem.info2).append(delim);
buf.append(elem.info3);
csvFile.write(buf.toString());
csvFile.newLine();
}
} finally {
try {
if (csvFile != null) {
csvFile.close();
}
} catch (IOException e) {
// empty
}
}
}

You essentially have to "escape" the commas so the CSV reader won't interpret them as column delimiters.
If you wrap each row's value in double quotes, the commas inside it are not treated as delimiters.
This will give you 3 columns:
Value1,Value2,Value3
This should give you 1 column with the entire string as a single value:
"Value1,Value2,Value3"
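As a minimal sketch of the quoting idea above: the helper below wraps a value in double quotes and doubles any quotes already inside it, which is the usual CSV convention. The class name CsvQuoting and the method quote are made up for illustration and are not part of any library.

```java
import java.util.Arrays;
import java.util.List;

class CsvQuoting {
    // Hypothetical helper: wraps a value in double quotes and doubles any
    // embedded quotes, so a CSV reader treats the whole value as one field.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("string1", "string2", "string3");
        // Join the values first, then quote the result as a single field.
        String line = quote(String.join(",", row));
        System.out.println(line); // "string1,string2,string3"
    }
}
```

Each line produced this way could then be written with the BufferedWriter from the answer above.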

Related

Modify a Csv file with OpenCsv, SpringBoot, extra quote (") in the result

I want to modify a column of a csv file with OpenCsv.
In the result I have an extra quote ("). How can I get rid of it?
Ex.
Input file: "Order ID,Customer Name, 123, John"
Output file: ""Order ID,Customer Name, 123, Chris""
public static byte[] updateCSV(String fileToUpdate) throws IOException {
final byte[] fallback = {};
// Read existing file
CSVReader reader = new CSVReader(new StringReader(fileToUpdate));
List<String[]> csvBody = reader.readAll();
// Prefix the value in column 1 of every row after the header with "-"
for(int i = 1; i < csvBody.size(); i++){
csvBody.get(i)[1] = "-"+csvBody.get(i)[1];
}
reader.close();
//Write back in csv
try (StringWriter writer = new StringWriter();
CSVWriter csvWriter = new CSVWriter(writer)
) {
csvWriter.writeAll(csvBody);
csvWriter.flush();
return writer.toString().getBytes();
} catch (Exception e) {
LOGGER.error(LogUtil.systemLoggingContext(), "Cannot modify the current CSV file");
return fallback;
}
}

Delete multiple lines from text file

I am trying to create a method to delete some of the text from my txt file. I started by checking whether a string that I have exists in the file:
public boolean ifConfigurationExists(String pathofFile, String configurationString)
{
Scanner scanner=new Scanner(pathofFile);
List<String> list=new ArrayList<>();
while(scanner.hasNextLine())
{
list.add(scanner.nextLine());
}
if(list.contains(configurationString))
{
return true;
}
else
{
return false;
}
}
Since the string I want to delete contains multiple lines (String configurationString = "This\nis a\n multiple lines\n string";) I started by creating a new array of strings and splitting the string into array members.
public boolean deleteCurrentConfiguration(String pathofFile, String configurationString)
{
String textStr[] = configurationString.split("\\r\\n|\\n|\\r");
File inputFile = new File(pathofFile);
File tempFile = new File("myTempFile.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));
BufferedWriter writer = new BufferedWriter(new FileWriter(tempFile));
String currentLine;
while((currentLine = reader.readLine()) != null) {
// trim newline when comparing with lineToRemove
String trimmedLine = currentLine.trim();
if(trimmedLine.equals(textStr[0])) continue;
writer.write(currentLine + System.getProperty("line.separator"));
}
writer.close();
reader.close();
boolean successful = tempFile.renameTo(inputFile);
return true;
}
Can someone please help on how to delete the string from the txt file and also the line before and after the string?
There are many different ways to do this. One way is to first read the file's content into a list of Strings, one per line (it looks like you already did this), then remove the data you don't want, and finally write the remaining lines back to the file one by one.
To remove the line before the line you don't want, the line itself, and the line after it, you could do something like this:
List<String> newLines=new ArrayList<>();
boolean lineRemoved = false;
for (int i = 0; i < lines.size(); i++) {
if (i < lines.size() - 1 && lines.get(i + 1).equals(lineToRemove)) {
// this is the line before it
} else if (lines.get(i).equals(lineToRemove)) {
// this is the line itself
lineRemoved = true;
} else if (lineRemoved) {
// this is the line after the line you want to remove
lineRemoved = false; // set back to false so you don't remove every line after the one you want
} else {
newLines.add(lines.get(i));
}
}
// now write newLines to file
Note that this code is rough and untested, but should get you where you need to be.
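Since the string in the question spans several lines, the same idea can be applied to a whole block of lines. The following is only a sketch under assumptions: the names BlockRemover and removeBlockWithNeighbours are invented for illustration, lines are compared exactly (no trimming), and only the first occurrence of the block is removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BlockRemover {
    // Returns a copy of lines without the first occurrence of block,
    // also dropping the line directly before and directly after it.
    static List<String> removeBlockWithNeighbours(List<String> lines, List<String> block) {
        int start = Collections.indexOfSubList(lines, block);
        if (start < 0) {
            return new ArrayList<>(lines); // block not found: nothing to remove
        }
        int from = Math.max(0, start - 1);                         // line before the block
        int to = Math.min(lines.size(), start + block.size() + 1); // exclusive end, past the line after
        List<String> result = new ArrayList<>(lines.subList(0, from));
        result.addAll(lines.subList(to, lines.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("keep 1", "before", "This", "is a", "after", "keep 2");
        List<String> block = Arrays.asList("This", "is a");
        System.out.println(removeBlockWithNeighbours(lines, block)); // [keep 1, keep 2]
    }
}
```

The input list could come from Files.readAllLines and the result could be written back with Files.write. The block list would be the result of splitting configurationString on line breaks, as the question already does.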

Read the each string text from file in java

I am new to Java. I just want to read each string in a file and print it on the console.
Code:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = new String();
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
System.out.println(""+data);
}
} catch (IOException e) {
// Error
}
}
If file contains:
Add label abc to xyz
Add instance cdd to pqr
I want to read each word from file and print it to a new line, e.g.
Add
label
abc
...
And afterwards, I want to extract the index of a specific string, for instance get the index of abc.
Can anyone please help me?
It sounds like you want to be able to do two things:
Print all words inside the file
Search the index of a specific word
In that case, I would suggest scanning all lines, splitting each one on any whitespace character (space, tab, etc.) and storing the words in a collection so you can search it later. Now the question is: can you have repeats, and if so, which index would you like to print? The first? The last? All of them?
Assuming words are unique, you can simply do:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
ArrayList<String> words = new ArrayList<String>();
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = null;
while ((data = infile.readLine()) != null) {
for (String word : data.split("\\s+")) {
words.add(word);
System.out.println(word);
}
}
} catch (IOException e) {
// Error
}
// search for the index of abc:
for (int i = 0; i < words.size(); i++) {
if (words.get(i).equals("abc")) {
System.out.println("abc index is " + i);
break;
}
}
}
If you don't break, it'll print every index of abc (if words are not unique). You could of course optimize it more if the set of words is very large, but for a small amount of data, this should suffice.
Of course, if you know in advance which words' indices you'd like to print, you could forego the extra data structure (the ArrayList) and simply print that as you scan the file, unless you want the printings (of words and specific indices) to be separate in output.
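If words can repeat and you want all of the positions, one option is to collect every matching index in a list. This is a small sketch, not the only way to do it. The names WordIndices and indicesOf are hypothetical, and the sample text is adapted from the question so that abc appears twice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordIndices {
    // Collects every 0-based position at which target occurs in words.
    static List<Integer> indicesOf(List<String> words, String target) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).equals(target)) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        // Sample text (an assumption): two lines in which "abc" occurs twice.
        String text = "Add label abc to xyz\nAdd instance abc to pqr";
        List<String> words = Arrays.asList(text.trim().split("\\s+"));
        System.out.println(indicesOf(words, "abc")); // [2, 7]
    }
}
```

An empty result means the word is not present. The first and last occurrence are simply the first and last elements of the returned list.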
Split the String received for any whitespace with the regex \\s+ and print out the resultant data with a for loop.
public static void main(String[] args) { // Don't make main throw an exception
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(fstream));
String data;
while ((data = infile.readLine()) != null) {
String[] words = data.split("\\s+"); // Split on whitespace
for (String word : words) { // Iterate through info
System.out.println(word); // Print it
}
}
} catch (IOException e) {
// Probably best to actually have this on there
System.err.println("Error found.");
e.printStackTrace();
}
}
Just add a for-each loop before printing the output :-
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
for(String temp : data.split(" "))
System.out.println(temp); // no need to concatenate the empty string.
}
This will automatically print the individual strings, obtained from each String line read from the file, in a new line.
And afterwards, I want to extract the index of a specific string, for
instance get the index of abc.
I don't know which index you are actually talking about. But if you want the position within each individual line being read, add a temporary variable count initialised to 0.
Increment it for every word until temp equals abc. Like:
int count = 0;
for(String temp : data.split(" ")){
count++;
if("abc".equals(temp))
System.out.println("Index of abc is : "+count);
System.out.println(temp);
}
Use the split() method of the String class. You can adapt it to your needs.
Or
iterate over the complete line using its length()
and, whenever you hit a non-alphabetic character, take the substring() up to that point and write it on a new line.
List<String> words = new ArrayList<String>();
while ((data = infile.readLine()) != null) {
for (String d : data.split(" ")) {
System.out.println(d);
words.add(d); // store each word, not the whole line
}
}
// The words list now holds all the words. Use words.indexOf("abc") to get the index
int index = words.indexOf("abc");
if (index < 0) {
System.out.println("word not present");
} else {
System.out.println("word present at index " + index);
}

OpenNLP - Tokenize an Array of Strings

I am trying to tokenize a text file using the OpenNLP tokenizer.
I read a .txt file into a list, and I want to iterate over every line, tokenize the line, and write the tokenized line to a new file.
In the line:
tokens[i] = tokenizer.tokenize(output[i]);
I get:
Type mismatch: cannot convert from String[] to String
This is my code:
public class Tokenizer {
public static void main(String[] args) throws Exception {
InputStream modelIn = new FileInputStream("en-token-max.bin");
try {
TokenizerModel model = new TokenizerModel(modelIn);
Tokenizer tokenizer = new TokenizerME(model);
CSVReader reader = new CSVReader(new FileReader("ParsedRawText1.txt"),',', '"', 1);
String csv = "ParsedRawText2.txt";
CSVWriter writer = new CSVWriter(new FileWriter(csv),CSVWriter.NO_ESCAPE_CHARACTER,CSVWriter.NO_QUOTE_CHARACTER);
//Read all rows at once
List<String[]> allRows = reader.readAll();
for(String[] output : allRows) {
//get current row
String[] tokens=new String[output.length];
for(int i=0;i<output.length;i++){
tokens[i] = tokenizer.tokenize(output[i]);
System.out.println(tokens[i]);
}
//write line
writer.writeNext(tokens);
}
writer.close();
}
catch (IOException e) {
e.printStackTrace();
}
finally {
if (modelIn != null) {
try {
modelIn.close();
}
catch (IOException e) {
}
}
}
}
}
Does anyone have any idea how to complete this task?
As the compiler says, you are trying to assign an array of Strings (the result of tokenize()) to a String (tokens[i] is a String). So you should declare and use tokens inside the inner loop and write tokens there, too:
for (String[] output : allRows) {
// get current row
for (int i = 0; i < output.length; i++) {
String[] tokens = tokenizer.tokenize(output[i]);
System.out.println(Arrays.toString(tokens)); // print the tokens, not the array reference
// write line
writer.writeNext(tokens);
}
}
writer.close();
By the way, are you sure that your source file is a CSV? If it is actually a plain text file, then you are splitting the text on commas and giving those chunks to OpenNLP, which may perform worse, because its model was trained on normal sentences, not on fragments like yours.
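Note that the loop above writes one output row per input cell. If you would rather keep one output row per input row, a possible variation is to join each cell's tokens back into a single string. The sketch below is an assumption-laden illustration: RowTokenizer and tokenizeRow are invented names, and a plain whitespace splitter stands in for the OpenNLP tokenizer so the example runs on its own.

```java
import java.util.function.Function;

class RowTokenizer {
    // Tokenizes every cell of a row and joins each cell's tokens with spaces,
    // so the output row has the same number of columns as the input row.
    static String[] tokenizeRow(String[] row, Function<String, String[]> tokenize) {
        String[] result = new String[row.length];
        for (int i = 0; i < row.length; i++) {
            result[i] = String.join(" ", tokenize.apply(row[i]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in tokenizer (an assumption): splits on whitespace only.
        Function<String, String[]> whitespace = s -> s.trim().split("\\s+");
        String[] out = tokenizeRow(new String[] {"Hello   world", "a  b"}, whitespace);
        System.out.println(String.join("|", out)); // Hello world|a b
    }
}
```

In the question's code, tokenizer::tokenize could be passed in place of the stand-in, since the compiler error shows that tokenize() takes a String and returns a String[].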

How to trim the elements before assigning it into an array list?

I need to load the elements of a CSV file into an ArrayList. The CSV file contains filenames with the extension .tar. I need to trim those elements before I read them into the ArrayList, or trim the whole ArrayList afterwards. Please help me with this.
try
{
String strFile1 = "D:\\Ramakanth\\PT2573\\target.csv"; //csv file containing data
BufferedReader br1 = new BufferedReader( new FileReader(strFile1)); //create BufferedReader
String strLine1 = "";
StringTokenizer st1 = null;
while( (strLine1 = br1.readLine()) != null) //read comma separated file line by line
{
st1 = new StringTokenizer(strLine1, ","); //break comma separated line using ","
while(st1.hasMoreTokens())
{
array1.add(st1.nextToken()); //store csv values in array
}
}
}
catch(Exception e)
{
System.out.println("Exception while reading csv file: " + e);
}
If you want to remove the ".tar" string from your tokens, you can use:
String nextToken = st1.nextToken();
if (nextToken.endsWith(".tar")) {
nextToken = nextToken.substring(0, nextToken.length() - ".tar".length()); // strip only the trailing ".tar"
}
array1.add(nextToken);
You shouldn't be using StringTokenizer. The JavaDoc says (in part): StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
You should also close your BufferedReader; you could use a try-with-resources statement to do that. And you might use a for-each loop to iterate over the array produced by String.split(String). The regular expression below optionally matches whitespace before or after each comma, and you might continue the loop if the token endsWith ".tar", like this:
String strFile1 = "D:\\Ramakanth\\PT2573\\target.csv";
try (BufferedReader br1 = new BufferedReader(new FileReader(strFile1)))
{
String strLine1 = "";
while( (strLine1 = br1.readLine()) != null) {
String[] parts = strLine1.split("\\s*,\\s*");
for (String token : parts) {
if (token.endsWith(".tar")) continue; // <-- don't add "tar" files.
array1.add(token);
}
}
}
catch(Exception e)
{
System.out.println("Exception while reading csv file: " + e);
}
if (str.indexOf(".tar") > 0)
str = str.substring(0, str.indexOf(".tar"));
while(st1.hasMoreTokens())
{
String input = st1.nextToken();
int index = input.indexOf("."); // Get the position of '.'
if(index >= 0){ // To avoid StringIndexOutOfBoundsException, when there is no match with '.' then the index position set to -1.
array1.add(input.substring(0, index)); // Get the String before '.' position.
}
}
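Putting the suggestions above together, here is a minimal sketch that trims whitespace and strips a trailing ".tar" from each value before it is added to the list. TarNames, stripTar and parseLine are hypothetical names chosen for this example, and the sample line is made up.

```java
import java.util.ArrayList;
import java.util.List;

class TarNames {
    // Trims surrounding whitespace and removes one trailing ".tar", if present.
    static String stripTar(String token) {
        String trimmed = token.trim();
        return trimmed.endsWith(".tar")
                ? trimmed.substring(0, trimmed.length() - ".tar".length())
                : trimmed;
    }

    // Splits one CSV line on commas and cleans every value.
    static List<String> parseLine(String line) {
        List<String> values = new ArrayList<>();
        for (String token : line.split(",")) {
            values.add(stripTar(token));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a.tar, b.tar ,c.txt")); // [a, b, c.txt]
    }
}
```

Unlike the indexOf(".") variant, this keeps values that have no extension and only touches names that actually end in ".tar".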
