Csv: search for String and replace with another string

I have a .csv file that contains:
scenario, custom, master_data
1, ${CUSTOM}, A_1
I have a string:
a, b, c
and I want to replace 'custom' with 'a, b, c'. How can I do that and save to the existing .csv file?

Probably the easiest way is to read one file and write to another as you go, modifying the content line by line.
You could try something with tokenizers. This may not be exactly right for your input and output, but you can adapt it to your CSV formatting:
import java.io.*;
import java.util.StringTokenizer;

BufferedReader reader = new BufferedReader(new FileReader("input.csv"));
BufferedWriter writer = new BufferedWriter(new FileWriter("output.csv"));
String custom = "custom"; // use "${CUSTOM}" to target the placeholder cell instead
String replace = "a, b, c";
for (String line = reader.readLine(); line != null; line = reader.readLine())
{
    StringBuilder output = new StringBuilder();
    StringTokenizer tokenizer = new StringTokenizer(line, ",");
    while (tokenizer.hasMoreTokens())
    {
        // Trim so that " custom" (with its leading space) still matches
        String token = tokenizer.nextToken().trim();
        if (output.length() > 0)
            output.append(", ");
        // Swap the matching token for the replacement; copy the others unchanged
        output.append(token.equals(custom) ? replace : token);
    }
    writer.write(output.toString());
    writer.newLine();
}
reader.close();
writer.close();
If this is a one-off, it also has the benefit of not requiring you to research regular expressions (which are powerful, useful, and good to know, but maybe for a later date).

Have a look at "Can you recommend a Java library for reading (and possibly writing) CSV files?"
Once the values have been read, search for values that start with ${ and end with }, using a Java regular expression such as \$\{(\w+)\}. Then look up each captured key in a map to get its replacement value; java.util.Properties would be a good candidate.
Then write a new CSV file.
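A minimal sketch of the placeholder lookup described here, assuming the values for each row have already been read into a string. The class name PlaceholderReplacer and the method expand are made up for illustration, and a plain Map stands in for java.util.Properties. Unknown placeholders are left untouched, which is a design choice rather than something the question requires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderReplacer {
    // Matches ${NAME}; group 1 captures NAME (one or more word characters)
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces every ${KEY} in the line with values.get(KEY), if the key is known
    static String expand(String line, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            // Keep the original ${KEY} text when there is no value for it
            String value = values.containsKey(key) ? values.get(key) : m.group();
            // quoteReplacement stops '$' and '\' in the value from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("CUSTOM", "a, b, c");
        System.out.println(expand("1, ${CUSTOM}, A_1", values));
    }
}
```

Note that a replacement containing commas adds columns to the row unless the CSV writer quotes the field.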

Since the text you want to replace is quite distinctive, you can do this quickly without complicated parsing: read the whole file into a byte buffer, convert that buffer to a string, replace all occurrences of the text with your target text, then convert the string back to bytes and write them to the file.
Pattern.quote is required because replaceAll interprets its first argument as a regular expression, and ${CUSTOM} contains characters that are special in a regex. If you don't quote it, you may get unexpected results or an exception.
Also, it's generally not smart to overwrite your source file in place. It is better to create a new file, then delete the old one and rename the new file to the old name. That way an error halfway through will not destroy all your data.
final Path yourPath = Paths.get("Your path");
byte[] buff = Files.readAllBytes(yourPath);
String s = new String(buff, Charset.defaultCharset());
s = s.replaceAll(Pattern.quote("${CUSTOM}"), "a, b, c");
Files.write(yourPath, s.getBytes());
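The advice about not overwriting the source file directly could look roughly like the sketch below. It is only an illustration: the class name SafeReplace and its methods are hypothetical, UTF-8 is assumed instead of the platform default charset, and String.replace is swapped in for replaceAll because it treats the search text literally, so no Pattern.quote is needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SafeReplace {
    // Rewrites the file via a temporary sibling file, then moves it over the original
    static void replaceInFile(Path file, String target, String replacement) {
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String updated = text.replace(target, replacement); // literal, not a regex
            // Write next to the original so a failure leaves the original intact
            Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "csv", ".tmp");
            Files.write(tmp, updated.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Tries the method on a throwaway file and returns the resulting content
    static String demo() {
        try {
            Path file = Files.createTempFile("demo", ".csv");
            Files.write(file, "scenario, custom, master_data\n1, ${CUSTOM}, A_1\n"
                    .getBytes(StandardCharsets.UTF_8));
            replaceInFile(file, "${CUSTOM}", "a, b, c");
            String result = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```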

Related

Problem with input from user saved to file by RandomAccessFile methods

I've got a problem with input from the user. I need to save the user's input into a binary file, but when I read it back and show it on the screen, it isn't displayed properly. I don't want to post a few hundred lines, so I will try to describe it in a more compact form. The encoding in the NetBeans project properties is "UTF-8".
I get input from the user, in the NetBeans console or the cmd console. Then I save it to an object made up of strings, and add it to an ArrayList<Ksiazka>, where Ksiazka is my class (basically a book's properties). Then I save the whole ArrayList to the file baza.bin. I do it by looping through the whole list of Ksiazka objects, taking each String one by one and saving it into baza.bin using the method writeUTF(oneOfStrings). When I try to read baza.bin, I see question marks instead of special characters (ą, ć, ę, ł, ń, ó, ś, ź). I think the problem is a difference between the encoding of the file and that of the input data, but to be honest I don't have any idea how to solve that.
Those are attributes of my class Ksiazka:
private String id;
private String tytul;
private String autor;
private String rok;
private String wydawnictwo;
private String gatunek;
private String opis;
private String ktoWypozyczyl;
private String kiedyWypozyczona;
private String kiedyDoOddania;
This is method for reading data from user:
static String podajDana(String[] tab, int coPokazac){
    System.out.print(tab[coPokazac]);
    boolean podawajDalej = true;
    String linia = "";
    Scanner klawiatura = new Scanner(System.in, "utf-8");
    do{
        try {
            podawajDalej = false;
            linia = klawiatura.nextLine();
        }
        catch(NoSuchElementException e){
            System.err.println("Wystąpił błąd w czasie podawania wartości!"
                    + " Spróbuj jeszcze raz!");
        }
        catch(IllegalStateException e){
            System.err.println("Wewnętrzny błąd programu typu 2! Zgłoś to jak najszybciej"
                    + " razem z tą wiadomością");
        }
    }while(podawajDalej);
    return linia;
}
String[] tab is just an array of strings that I want to be able to show on the screen (each array has its own purpose); int coPokazac is the index of the line from the array that I want to show.
This method saves all the data from ArrayList<Ksiazka> to the file baza.bin:
static void zapiszZmiany(ArrayList<Ksiazka> bazaKsiazek){
    try{
        RandomAccessFile plik = new RandomAccessFile("baza.bin","rw");
        for(int i = 0; i < bazaKsiazek.size(); i++){
            plik.writeUTF(bazaKsiazek.get(i).zwrocId());
            plik.writeUTF(bazaKsiazek.get(i).zwrocTytul());
            plik.writeUTF(bazaKsiazek.get(i).zwrocAutor());
            plik.writeUTF(bazaKsiazek.get(i).zwrocRok());
            plik.writeUTF(bazaKsiazek.get(i).zwrocWydawnictwo());
            plik.writeUTF(bazaKsiazek.get(i).zwrocGatunek());
            plik.writeUTF(bazaKsiazek.get(i).zwrocOpis());
            plik.writeUTF(bazaKsiazek.get(i).zwrocKtoWypozyczyl());
            plik.writeUTF(bazaKsiazek.get(i).zwrocKiedyWypozyczona());
            plik.writeUTF(bazaKsiazek.get(i).zwrocKiedyDoOddania());
        }
        plik.close();
    }
    catch (FileNotFoundException ex){
        System.err.println("Nie znaleziono pliku z bazą książek!");
    }
    catch (IOException ex){
        System.err.println("Błąd zapisu bądź odczytu pliku!");
    }
}
I think the problem is in one of those two methods (either I do something wrong while reading the input, or something goes wrong when saving the data to the file using writeUTF()), but even though I tried a few things to solve it, none of them worked.
After a quick talk with my lecturer, I learned that I can use JDK 8 at most.

You are using different techniques for reading and writing, and they are not compatible.
Despite the name, the writeUTF method of RandomAccessFile does not write a UTF-8 string. From the documentation:
Writes a string to the file using modified UTF-8 encoding in a machine-independent manner.
First, two bytes are written to the file, starting at the current file pointer, as if by the writeShort method giving the number of bytes to follow. This value is the number of bytes actually written out, not the length of the string. Following the length, each character of the string is output, in sequence, using the modified UTF-8 encoding for each character.
writeUTF will write a two-byte length, then write the string as UTF-8, except that '\u0000' characters are written as two UTF-8 bytes and supplementary characters are written as two UTF-8 encoded surrogates, rather than single UTF-8 codepoint sequences.
On the other hand, you are trying to read that data using new Scanner(System.in, "utf-8") and klawiatura.nextLine();. This approach is not compatible because:
The text was not written as a true UTF-8 sequence.
Before the text was written, two bytes indicating its numeric length were written. They are not readable text.
writeUTF does not write a newline. It does not write any terminating sequence at all, in fact.
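To make the length prefix visible, here is a small sketch that writes one string with writeUTF to a temporary file and then inspects what ended up there. The class and method names (WriteUtfDemo, roundTrip) are invented for this illustration. It also shows that readUTF, the documented counterpart of writeUTF, reads the value back intact, which may be an option if the binary format has to stay.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class WriteUtfDemo {
    // Writes text with writeUTF, then returns { length prefix, text read back with readUTF }
    static Object[] roundTrip(String text) {
        try {
            File tmp = File.createTempFile("baza", ".bin");
            try (RandomAccessFile plik = new RandomAccessFile(tmp, "rw")) {
                plik.writeUTF(text);
                plik.seek(0);
                int byteLength = plik.readUnsignedShort(); // the two-byte length prefix
                plik.seek(0);
                String readBack = plik.readUTF();          // reads prefix + modified UTF-8
                return new Object[] { byteLength, readBack };
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "\u0105\u0107" is "ąć", written with escapes to avoid source-encoding issues
        Object[] result = roundTrip("\u0105\u0107");
        System.out.println("bytes after prefix: " + result[0]);
        System.out.println("round trip ok: " + result[1].equals("\u0105\u0107"));
    }
}
```

The two characters occupy four bytes after the prefix, which is why the prefix counts bytes rather than characters.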
The best solution is to remove all usage of RandomAccessFile and replace it with a Writer. The FileWriter constructor that accepts a charset was only added in Java 11, so on JDK 8 wrap a FileOutputStream in an OutputStreamWriter instead:
Writer plik = new OutputStreamWriter(new FileOutputStream("baza.bin"), StandardCharsets.UTF_8);
for (int i = 0; i < bazaKsiazek.size(); i++) {
    plik.write(bazaKsiazek.get(i).zwrocId());
    plik.write('\n');
    plik.write(bazaKsiazek.get(i).zwrocTytul());
    plik.write('\n');
    // ...

How to modify a given String (from CSV)

I need to write a program for a university project that cuts specific parts out of a given CSV file. I've already started, but I don't know how to keep only the content (the sentence and the vote values), or at least how to remove the date part.
PARENT,"Lorem ipsum...","3","0","Town","09:17, 29/11/2016"
REPLY,"Loren ipsum...","2","0","Town","09:18, 29/11/2016"
After the program has run, I want to have it like this:
Lorem ipsum... (String) 3 (int) 0 (int)
Loren ipsum... (String) 2 (int) 0 (int)
I have no problem writing a parser (reading in, removing separators), but I don't know how to implement this part.

You can create your own data structure that holds a string and two integers, and fill it while reading the CSV file. Include only the columns you want, selecting them by column number, which is the index into the String array returned by split(). The fields in your sample are wrapped in double quotes, so the quotes have to be stripped before calling Integer.parseInt.
Scanner reader = new Scanner(new File("path to your CSV File"));
ArrayList<DataStructure> csvData = new ArrayList<>();
while(reader.hasNextLine())
{
    // Caveat: a plain split(",") also splits on commas inside quoted fields
    String[] csvLine = reader.nextLine().split(",");
    DataStructure data = new DataStructure(
        csvLine[1].replace("\"", ""),
        Integer.parseInt(csvLine[2].replace("\"", "")),
        Integer.parseInt(csvLine[3].replace("\"", "")));
    csvData.add(data);
}
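One thing to watch with split(",") is that it also breaks on commas inside a quoted field, for example a sentence that contains a comma. The sketch below is one possible way around that for simple input, using a lookahead that only splits on commas followed by an even number of quotes. The names VoteLineParser, Entry, vote1 and vote2 are placeholders invented here, and escaped quotes inside fields are not handled; a CSV library is the safer choice for anything more complex.

```java
class VoteLineParser {
    // Hypothetical holder for the three values the question wants to keep
    static class Entry {
        final String sentence;
        final int vote1;
        final int vote2;

        Entry(String sentence, int vote1, int vote2) {
            this.sentence = sentence;
            this.vote1 = vote1;
            this.vote2 = vote2;
        }
    }

    // Splits only on commas that are outside double quotes
    private static final String COMMA_OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Removes one pair of surrounding double quotes, if present
    static String unquote(String field) {
        if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
            return field.substring(1, field.length() - 1);
        }
        return field;
    }

    static Entry parse(String line) {
        String[] fields = line.split(COMMA_OUTSIDE_QUOTES);
        return new Entry(
                unquote(fields[1]),
                Integer.parseInt(unquote(fields[2])),
                Integer.parseInt(unquote(fields[3])));
    }

    public static void main(String[] args) {
        Entry e = parse("PARENT,\"Lorem, ipsum...\",\"3\",\"0\",\"Town\",\"09:17, 29/11/2016\"");
        System.out.println(e.sentence + " " + e.vote1 + " " + e.vote2);
    }
}
```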

How to get rid of "Rogue Chars" in an .txt encoded under UTF-8

My program is reading from a .txt file encoded as UTF-8. I'm using UTF-8 in order to handle the characters åäö. The problem is that when the lines are read, some "rogue" characters seem to sneak into the string, which causes problems when I try to store those lines in variables. Here's the code:
public void Läsochlista()
{
    String Content = "";
    String[] Argument = new String[50];
    int index = 0;
    Log.d("steg1", "steg1");
    try{
        InputStream inputstream = openFileInput("text.txt");
        if(inputstream != null)
        {
            Log.d("steg2", "steg2");
            //InputStreamReader inputstreamreader = new InputStreamReader(inputstream);
            //BufferedReader bufferreader = new BufferedReader(inputstreamreader);
            BufferedReader in = new BufferedReader(new InputStreamReader(inputstream, "UTF-8"));
            String reciveString = "";
            StringBuilder stringbuilder = new StringBuilder();
            while ((reciveString = in.readLine()) != null)
            {
                Argument[index] = reciveString;
                index++;
                if(index == 6)
                {
                    Log.d(Argument[0], String.valueOf((Argument[0].length())));
                    AllaPlatser.add(new Platser(Float.parseFloat(Argument[0]), Float.parseFloat(Argument[1]), Integer.parseInt(Argument[2]), Argument[3], Argument[4], Integer.parseInt(Argument[5])));
                    Log.d("En ny plats skapades", Argument[3]);
                    Arrays.fill(Argument, null);
                    index = 0;
                }
            }
            inputstream.close();
            Content = stringbuilder.toString();
        }
    }
    catch (FileNotFoundException e){
        Log.e("Filen", " Hittades inte");
    } catch (IOException e){
        Log.e("Filen", " Ej läsbar");
    }
}
Now, I'm getting the error
Invalid float: "61.193521"
where the line contains only the characters "61.193521". When I print the length of the string as read by the program, the output is "10", which is one more character than the string is supposed to contain. The question: how do I get rid of those invisible "rogue" characters, and why are they there in the first place?

When you save a file as "UTF-8", your editor may be writing a byte-order mark (BOM) at the beginning of the file.
See if there's an option in your editor to save UTF-8 without the BOM.
Apparently the BOM is just a pain in the butt: What's different between UTF-8 and UTF-8 without BOM?
I know you want to be able to have extended characters in your data; however, you may want to pick a different encoding like Latin-1 (ISO 8859-1).
Or you can just read & discard the first three bytes from the input stream before you wrap it with the reader.
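Reading and discarding the first three bytes is only safe when the file really starts with a BOM, so a slightly more careful sketch checks for the UTF-8 BOM bytes (EF BB BF) and pushes them back if they are something else. The class BomSkipper and its methods are illustrative names, not an existing API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomSkipper {
    // Returns a stream positioned just past a leading UTF-8 BOM, if there is one
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) break; // the stream is shorter than three bytes
            n += r;
        }
        boolean isBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!isBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return pushback;
    }

    // Convenience wrapper for trying it out on in-memory data
    static String firstLine(byte[] data) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                skipUtf8Bom(new ByteArrayInputStream(data)), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the question's code this would mean passing skipUtf8Bom(inputstream) to the InputStreamReader instead of inputstream itself.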

Unfortunately, you have not provided the sample text file, so I cannot test with your exact code. This answer is therefore a guess at the likely cause.
This looks like a BOM-related issue that you will have to handle. Some related detail is given here: http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html
There is more information here: What is XML BOM and how do I detect it?
Broadly, there are three situations:
In the first, we run into problems because we don't read and write using the correct encoding.
In the second, we use an editor or reader that doesn't support UTF-8.
In the third, we read and write with the correct encoding and see no problem in a text editor, but another application or program has trouble with the file. I think your issue falls into this third case.
In that case we may have to remove the BOM programmatically or otherwise handle it according to context.
Here is a solution you may find useful:
UTF-8 file reading: the first character issue
You can use the code given in that thread's answer, or use Apache Commons to deal with it:
Byte order mark screws up file reading in Java

Java Replacing Help Needed

Hey guys, I'm trying to strip the file name so that I get only /hello/what/, without the REMOVEThis4.PNG part. I don't want to use string.replace("REMOVEThis4.PNG", ""); because I want this to work on other strings, not only that one.
Any help would be great. My code:
String sFile = "/hello/what/REMOVEThis4.PNG";
if (sFile.contains("/")){
    String Replaced = sFile.replaceAll("(?s)", "");
    System.out.println(Replaced);
}
I want the output to be
/hello/what/
only. Thanks a lot!

If you are trying to parse a path, I recommend finding the last index of / and taking the substring up to that index plus one. So
string = string.substring(0, string.lastIndexOf("/") + 1);
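One detail worth spelling out: lastIndexOf returns -1 when the string contains no slash, so the expression above becomes substring(0, 0) and yields an empty string rather than throwing. A small sketch that makes this case explicit (PathPrefix and directoryOf are made-up names):

```java
class PathPrefix {
    // Returns everything up to and including the last '/', or "" if there is none
    static String directoryOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        if (lastSlash < 0) {
            return ""; // no directory part at all
        }
        return path.substring(0, lastSlash + 1);
    }

    public static void main(String[] args) {
        System.out.println(directoryOf("/hello/what/REMOVEThis4.PNG"));
    }
}
```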

No need to use regular expressions in your case:
String sFile = "/hello/what/REMOVEThis4.PNG";
// TODO check actual last index of "/" against -1
System.out.println(sFile.substring(0, sFile.lastIndexOf("/") + 1));
Output
/hello/what/
Note
In case you are dealing with actual files, you can probably spare yourself the String manipulation and use File.getParent() instead:
File file = new File("/hello/what/REMOVEThis4.PNG");
System.out.println(file.getParent());
Output (may change depending on your system)
\hello\what

Use Java's File API:
String example = "/hello/what/REMOVEThis4.PNG";
File file = new File(example);
System.out.println(example);
String absolutePath = file.getAbsolutePath();
String filePath = absolutePath.substring(0, absolutePath.lastIndexOf(File.separator));
System.out.println(filePath);

Collections.sort() isn't sorting in the right order

I have this code in Java:
List<String> unSorted = new ArrayList<String>();
List<String> beforeHash = new ArrayList<String>();
String[] unSortedAux, beforeHashAux;
String line = null;
BufferedReader reader = new BufferedReader(new FileReader("C:\\CPD\\temp0.txt"));
while ((line = reader.readLine()) != null){
    unSorted.add(line);
    beforeHash.add(line.split("#")[0]);
}
reader.close();
Collections.sort(beforeHash);
beforeHashAux = beforeHash.toArray(new String[beforeHash.size()]);
unSortedAux = unSorted.toArray(new String[unSorted.size()]);
System.out.println(Arrays.toString(beforeHashAux));
System.out.println(Arrays.toString(unSortedAux));
It reads a file named temp0.txt, which contains:
Carlos Magno#261
Mateus Carl#12
Analise Soares#151
Giancarlo Tobias#150
My goal is to sort the names, ignoring the part of each string after "#". I am using beforeHash.add(line.split("#")[0]); to do this. The problem is that the file is read correctly, but the names are sorted in the wrong order. The corresponding outputs are:
[Analise Soares, Giancarlo Tobias, Mateus Carl, Carlos Magno]
[Carlos Magno#261, Mateus Carl#12, Analise Soares#151, Giancarlo Tobias#150]
The first result is the "sorted" one, note that "Carlos Magno" comes after "Mateus Carl". I cannot find the problem in my code.

The problem is that "Carlos Magno" starts with a Unicode byte-order mark.
If you copy and paste your sample text ([Analise ... Carlos Magno]) into the Unicode Explorer you'll see that just before the "C" of Carlos Magno, you've got U+FEFF.
Basically, you'll need to strip that when reading the file. The easiest way to do this is just use:
line = line.replace("\ufeff", "");
... or check first:
if (line.startsWith("\ufeff")) {
line = line.substring(1);
}
Note that you should really specify the encoding you want to use when opening the file - use a FileInputStream wrapped in an InputStreamReader.
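Putting both suggestions together, a sketch might look like the following. It assumes the file is UTF-8 and takes an InputStream so that a FileInputStream can be passed in; the class name BomSafeSort and the method sortedNames are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BomSafeSort {
    // Reads "name#number" lines as UTF-8, drops a leading BOM, and returns the sorted names
    static List<String> sortedNames(InputStream in) {
        List<String> names = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Java's UTF-8 decoder keeps the BOM as the character U+FEFF
                if (line.startsWith("\uFEFF")) {
                    line = line.substring(1);
                }
                names.add(line.split("#")[0]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }
}
```

With the file from the question this would be called as sortedNames(new FileInputStream("C:\\CPD\\temp0.txt")).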
