Parsing text string using delimiter

I have a text file:
Structure:
key1: xxx
key2:
key3:
key4:
key5:value5
key6: This is an example text file to add data
this is an example text file
this is an example text file
this is an example text file
key7:
I have tried to parse it, but I am finding it difficult to split on the delimiter ':' and add the pairs to a map so that I can access the values by key. I have tried the code below. The problem is key6, whose value is a paragraph: the code tries to split on the delimiter after every new line. Any help with this issue is much appreciated.
try{
Map<Object, Object> map = new Properties();
BufferedReader br = new BufferedReader
(new FileReader(textString));
String line = "";
while ((line = br.readLine()) != null) {
String fields[] = line.split(":");
map.put(fields[0], fields[1]);
}
br.close();
}catch(Exception e){
LOGGER.debug("Exception", e);
}

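The problem is that you split after reading each line, so the continuation lines of key6 are treated as separate entries. Why not read the whole text first and then perform the split? Like this:
try {
    Map<String, String> map = new LinkedHashMap<>();
    BufferedReader br = new BufferedReader(new FileReader(textString));
    StringBuilder text = new StringBuilder();
    String line;
    // Read each line exactly once and keep the line breaks
    while ((line = br.readLine()) != null) {
        text.append(line).append('\n');
    }
    br.close();
    // Split only at line breaks that are followed by "key:"
    String[] entries = text.toString().split("\\n(?=\\w+:)");
    for (String entry : entries) {
        // Limit 2: split at the first colon only
        String[] fields = entry.split(":", 2);
        if (fields.length == 2) {
            map.put(fields[0].trim(), fields[1].trim());
        }
    }
} catch (Exception e) {
    LOGGER.debug("Exception", e);
}
NOTE
A continuation line that itself begins with a word followed by a colon will still be mistaken for a new key, with this solution as well as with what you are currently doing. Ideally, if you can, use the Properties class that is part of Java.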

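If you want to use the data as key/value pairs and store it in a map, write it as a properties file. The example below uses = as the separator (Properties also accepts :). To continue a value on the next line, end the line with the \ symbol; the loader will pick up the rest of the value from the next line.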
Properties File:
key1= xxx
key2=
key3=
key4=
key5=value5
key6= This is an example text file to add data \
this is an example text file \
this is an example text file \
this is an example text file
key7=
How to fetch value:
public static void main(String[] args) throws IOException
{
Map<String, String> map = new HashMap<String, String>();
Properties properties = new Properties();
properties.load(Main.class.getResourceAsStream("prop.properties"));
for (final Entry<Object, Object> entry : properties.entrySet()) {
map.put( (String)entry.getKey(), (String) entry.getValue());
}
for (Entry<String, String> type : map.entrySet()) {
System.out.println(type.getKey());
System.out.println(type.getValue());
}
}
Output:
key1
xxx
key2
key5
value5
key6
This is an example text file to add data this is an example text file this is an example text file this is an example text file
key3
key4
key7
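As a rough sketch of the same idea without touching the file system: the snippet below loads properties text from a string, using ':' as the separator and a trailing backslash for the multi-line value. The class name PropertiesColonDemo and the helper parse are made up for this illustration; only Properties.load(Reader) comes from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesColonDemo {

    // Hypothetical helper: parses properties-format text held in a String.
    // Properties.load accepts '=' or ':' as the key/value separator.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "key1: xxx",
                "key2:",
                "key5:value5",
                // A trailing backslash continues the value on the next line
                "key6: This is an example text file to add data \\",
                "this is an example text file",
                "key7:");
        Properties props = parse(text);
        System.out.println(props.getProperty("key1"));
        System.out.println(props.getProperty("key6"));
    }
}
```

Leading whitespace on a continuation line is dropped by the loader, so keep the space before the backslash if the words should stay separated.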

Related

Replacing values from HashMap in a file with Java

I'm stuck on this part. The aim is to take the values from a file.ini with this format
X = Y
X1 = Y1
X2 = Y2
take the Y values, substitute them for the corresponding X keys in an scxml file, and save the result as a new file.scxml.
As you can see from my pasted code, I use a HashMap to hold the keys and values, and they print correctly. The replacement code looks right to me, but it only works for the first entry of the HashMap.
The code is currently as follows:
public String getPropValues() throws IOException {
try {
Properties prop = new Properties();
String pathconf = this.pathconf;
String pathxml = this.pathxml;
//Read file conf
File inputFile = new File(pathconf);
InputStream is = new FileInputStream(inputFile);
BufferedReader br = new BufferedReader(new InputStreamReader(is));
//load the buffered file
prop.load(br);
String name = prop.getProperty("name");
//Read xml file to get the format
FileReader reader = new FileReader(pathxml);
String newString;
StringBuffer str = new StringBuffer();
String lineSeparator = System.getProperty("line.separator");
BufferedReader rb = new BufferedReader(reader);
//read file.ini to HashMap
Map<String, String> mapFromFile = getHashMapFromFile();
//iterate over HashMap entries
for(Map.Entry<String, String> entry : mapFromFile.entrySet()){
System.out.println( entry.getKey() + " -> " + entry.getValue() );
//replace values
while ((newString = rb.readLine()) != null){
str.append(lineSeparator);
str.append(newString.replaceAll(entry.getKey(), entry.getValue()));
}
}
rb.close();
String pathwriter = pathxml + name + ".scxml";
BufferedWriter bw = new BufferedWriter(new FileWriter(new File(pathwriter)));
bw.write(str.toString());
//flush the stream
bw.flush();
//close the stream
bw.close();
} catch (Exception e) {
System.out.println("Exception: " + e);
}
return result;
}
So my .ini file is, for example:
Apple = red
Lemon = yellow
It prints the keys and values correctly:
Apple -> red
Lemon -> yellow
but in the file it only replaces Apple with red, not the other keys.

The problem lies in your control flow order.
The first iteration of your for loop, which corresponds to the first entry Apple -> red, reads the BufferedReader rb all the way to the end of the stream, so the subsequent iterations have nothing left to read and do nothing.
You then have to either reinitialize the BufferedReader for each iteration or, better, invert the loops so that the iteration over your Map entries happens inside the BufferedReader read loop:
EDIT (following @David's hints)
You can assign the result of each replacement back to the line, so that the fully replaced line is appended to the result at each line iteration:
public String getPropValues() throws IOException {
try {
// ...
BufferedReader rb = new BufferedReader(reader);
//read file.ini to HashMap
Map<String, String> mapFromFile = getHashMapFromFile();
//replace values
while ((newString = rb.readLine()) != null) {
// iterate over HashMap entries
for (Map.Entry<String, String> entry : mapFromFile.entrySet()) {
newString = newString.replace(entry.getKey(), entry.getValue());
}
str.append(lineSeparator)
.append(newString);
}
rb.close();
// ...
} catch (Exception e) {
System.out.println("Exception: " + e);
}
return result;
}
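To try the inverted loop order in isolation, here is a small sketch that works on a list of lines in memory instead of files. The class ReplaceAllKeys and the method replaceKeys are hypothetical names for this example, not part of the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplaceAllKeys {

    // For every line, apply every replacement before moving to the next line.
    static List<String> replaceKeys(List<String> lines, Map<String, String> replacements) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String replaced = line;
            for (Map.Entry<String, String> entry : replacements.entrySet()) {
                // replace() treats the key as literal text, not as a regex
                replaced = replaced.replace(entry.getKey(), entry.getValue());
            }
            result.add(replaced);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("Apple", "red");
        replacements.put("Lemon", "yellow");
        List<String> lines = Arrays.asList("<state id=\"Apple\"/>", "<state id=\"Lemon\"/>");
        for (String line : replaceKeys(lines, replacements)) {
            System.out.println(line);
        }
    }
}
```

Because the replacements are applied one after another, a value that happens to contain a later key would be replaced again; that may or may not matter for your file.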

Java how to remove duplicates from ArrayList

I have a CSV file which contains rules and ruleversions. The CSV file looks like this:
CSV FILE:
#RULENAME, RULEVERSION
RULE,01-02-01
RULE,01-02-02
RULE,01-02-34
OTHER_RULE,01-02-04
THIRDRULE, 01-02-04
THIRDRULE, 01-02-04
As you can see, 1 rule can have 1 or more rule versions. What I need to do is read this CSV file and put them in an array. I am currently doing that with the following script:
private static List<String[]> getRulesFromFile() {
String csvFile = "rulesets.csv";
BufferedReader br = null;
String line = "";
String delimiter = ",";
List<String[]> input = new ArrayList<String[]>();
try {
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
if (!line.startsWith("#")) {
String[] rulesetEntry = line.split(delimiter);
input.add(rulesetEntry);
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return input;
}
But I need to adapt the script so that it saves the information in the following format:
ARRAY (
=> RULE => 01-02-01, 01-02-02, 01-02-04
=> OTHER_RULE => 01-02-34
=> THIRDRULE => 01-02-01, 01-02-02
)
What is the best way to do this? Multidimensional array? And how do I make sure it doesn't save the rulename more than once?

You should use a different data structure, for example a HashMap, like this:
Map<String, List<String>> myMap = new HashMap<>();
try {
    br = new BufferedReader(new FileReader(csvFile));
    while ((line = br.readLine()) != null) {
        if (!line.startsWith("#")) {
            String[] parts = line.split(delimiter);
            // trim() removes the space after the comma in lines like "THIRDRULE, 01-02-04"
            String key = parts[0].trim();
            String value = parts[1].trim();
            if (myMap.containsKey(key)) {
                myMap.get(key).add(value);
            } else {
                List<String> values = new ArrayList<String>();
                values.add(value);
                myMap.put(key, values);
            }
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}
This stores each rule name only once. Note that a List still keeps repeated versions of the same rule.

An ArrayList is not a good choice of data structure here.
I would personally suggest using a HashMap<String, List<String>> for this particular purpose.
The rules will be your keys, and the rule versions will be your values, each value being a list of strings.
While traversing your original file, check whether the rule (key) is already present. If it is, add the value to the list of rule versions already stored for it; otherwise add a new key with a new list and add the value to that.
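A minimal sketch of that traversal, assuming the lines have already been read into a list. The names RuleGrouping and group are invented for the example, and a LinkedHashSet is swapped in for the list so that repeated versions are dropped as well.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RuleGrouping {

    // Groups "RULENAME,VERSION" lines by rule name; each version is kept once.
    static Map<String, Set<String>> group(List<String> lines) {
        Map<String, Set<String>> rules = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.startsWith("#") || line.trim().isEmpty()) {
                continue; // skip the header comment and blank lines
            }
            String[] parts = line.split(",");
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            // Create the set on first sight of the key, then add the version
            rules.computeIfAbsent(parts[0].trim(), k -> new LinkedHashSet<>())
                 .add(parts[1].trim());
        }
        return rules;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "#RULENAME, RULEVERSION",
                "RULE,01-02-01",
                "RULE,01-02-02",
                "RULE,01-02-34",
                "OTHER_RULE,01-02-04",
                "THIRDRULE, 01-02-04",
                "THIRDRULE, 01-02-04");
        System.out.println(group(lines));
    }
}
```

The LinkedHashMap and LinkedHashSet keep the order in which rules and versions first appear; plain HashMap and HashSet would work too if order does not matter.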

For instance like this:
public List<String> removeDuplicates(List<String> myList) {
Hashtable<String, String> hashtable=new Hashtable<String, String>();
for(String s:myList) {
hashtable.put(s, s);
}
return new ArrayList<String>(hashtable.values());
}

This is exactly what key-value pairs are for. Just take a look at the Map interface. There you can define a unique key with several elements as its value, which is perfect for your issue.

Code:
// This collection will take String type as a Key
// and Prevent duplicates in its associated values
Map<String, HashSet<String>> map = new HashMap<String,HashSet<String>>();
// Check if collection contains the Key you are about to enter
// !REPLACE! -> "rule" with the Key you want to enter into your collection
// !REPLACE! -> "whatever" with the Value you want to associate with the key
if(!map.containsKey("rule")){
map.put("rule", new HashSet<String>());
}
// Add the value whether or not the key was just created
map.get("rule").add("whatever");
Reference:
Set
Map

Java - Write hashmap to a csv file

I have a hashmap with a String key and String value. It contains a large number of keys and their respective values.
For example:
key | value
abc | aabbcc
def | ddeeff
I would like to write this hashmap to a csv file such that my csv file contains rows as below:
abc,aabbcc
def,ddeeff
I tried the following example here using the supercsv library: http://javafascination.blogspot.com/2009/07/csv-write-using-java.html. However, in this example, you have to create a hashmap for each row that you want to add to your csv file. I have a large number of key value pairs which means that several hashmaps, with each containing data for one row need to be created. I would like to know if there is a more optimized approach that can be used for this use case.

Using the Jackson API, a Map or a List of Maps can be written to a CSV file:
/**
* @param listOfMap
* @param writer
* @throws IOException
*/
public static void csvWriter(List<HashMap<String, String>> listOfMap, Writer writer) throws IOException {
CsvSchema schema = null;
CsvSchema.Builder schemaBuilder = CsvSchema.builder();
if (listOfMap != null && !listOfMap.isEmpty()) {
for (String col : listOfMap.get(0).keySet()) {
schemaBuilder.addColumn(col);
}
schema = schemaBuilder.build().withLineSeparator(System.lineSeparator()).withHeader();
}
CsvMapper mapper = new CsvMapper();
mapper.writer(schema).writeValues(writer).writeAll(listOfMap);
writer.flush();
}

Something like this should do the trick:
String eol = System.getProperty("line.separator");
try (Writer writer = new FileWriter("somefile.csv")) {
for (Map.Entry<String, String> entry : myHashMap.entrySet()) {
writer.append(entry.getKey())
.append(',')
.append(entry.getValue())
.append(eol);
}
} catch (IOException ex) {
ex.printStackTrace(System.err);
}

As your question is asking how to do this using Super CSV, I thought I'd chime in (as a maintainer of the project).
I initially thought you could just iterate over the map's entry set using CsvBeanWriter and a name mapping array of "key", "value", but this doesn't work because HashMap's internal implementation doesn't allow reflection to get the key/value.
So your only option is to use CsvListWriter as follows. At least this way you don't have to worry about escaping CSV (every other example here just joins with commas...aaarrggh!):
@Test
public void writeHashMapToCsv() throws Exception {
Map<String, String> map = new HashMap<>();
map.put("abc", "aabbcc");
map.put("def", "ddeeff");
StringWriter output = new StringWriter();
try (ICsvListWriter listWriter = new CsvListWriter(output,
CsvPreference.STANDARD_PREFERENCE)){
for (Map.Entry<String, String> entry : map.entrySet()){
listWriter.write(entry.getKey(), entry.getValue());
}
}
System.out.println(output);
}
Output:
abc,aabbcc
def,ddeeff
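For readers who cannot add a library, the escaping that the comment above refers to can be sketched by hand. This is only an approximation of the usual CSV quoting rules (quote a field that contains a comma, a double quote, or a line break, and double any embedded quotes); CsvEscape, escape and toCsv are names invented here, not Super CSV API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CsvEscape {

    // Quotes a field only when it contains a character that would break the row.
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are doubled, then the whole field is quoted
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Writes one "key,value" row per map entry.
    static String toCsv(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : map.entrySet()) {
            sb.append(escape(entry.getKey()))
              .append(',')
              .append(escape(entry.getValue()))
              .append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("abc", "aabbcc");
        map.put("def", "dd,ee \"ff\"");
        System.out.print(toCsv(map));
    }
}
```

A maintained CSV library remains the safer choice when the data can contain arbitrary text.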

try {
Map<String, String> csvMap = new TreeMap<>();
csvMap.put("Hotel Name", hotelDetails.getHotelName());
csvMap.put("Hotel Classification", hotelDetails.getClassOfHotel());
csvMap.put("Number of Rooms", hotelDetails.getNumberOfRooms());
csvMap.put("Hotel Address", hotelDetails.getAddress());
// specified by filepath
File file = new File(fileLocation + hotelDetails.getHotelName() + ".csv");
// create FileWriter object with file as parameter
FileWriter outputfile = new FileWriter(file);
String[] header = csvMap.keySet().toArray(new String[csvMap.size()]);
String[] dataSet = csvMap.values().toArray(new String[csvMap.size()]);
// create CSVWriter object filewriter object as parameter
CSVWriter writer = new CSVWriter(outputfile);
// adding data to csv
writer.writeNext(header);
writer.writeNext(dataSet);
// closing writer connection
writer.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

If you have a single hashmap it is just a few lines of code. Something like this:
Map<String,String> myMap = new HashMap<>();
myMap.put("foo", "bar");
myMap.put("baz", "foobar");
StringBuilder builder = new StringBuilder();
for (Map.Entry<String, String> kvp : myMap.entrySet()) {
builder.append(kvp.getKey());
builder.append(",");
builder.append(kvp.getValue());
builder.append("\r\n");
}
String content = builder.toString().trim();
System.out.println(content);
// Use your preferred method to write content to a file, for example Apache FileUtils.writeStringToFile(...), instead of System.out.println.
The result would be:
foo,bar
baz,foobar

My Java is a little limited but couldn't you just loop over the HashMap and add each entry to a string?
// m = your HashMap
StringBuilder builder = new StringBuilder();
for(Entry<String, String> e : m.entrySet())
{
String key = e.getKey();
String value = e.getValue();
builder.append(key);
builder.append(',');
builder.append(value);
builder.append(System.getProperty("line.separator"));
}
String result = builder.toString();

Avoid losing the line breaks (\n) while reading and writing files

I read several property files to compare them against a template file for missing keys.
FileInputStream compareFis = new FileInputStream(compareFile);
Properties compareProperties = new Properties();
compareProperties.load(compareFis);
Note: I read the template file the same way.
After reading, I compare them and write the missing keys, with their values from the template file, into a Set.
CompareResult result = new CompareResult(Main.resultDir);
[...]
if (!compareProperties.containsKey(key)) {
retVal = true;
result.add(compareFile.getName(), key + "=" + entry.getValue());
}
Finally, I write the missing keys and their values into a new file.
for (Entry<String, SortedSet<String>> entry : resultSet) {
PrintWriter out = null;
try {
out = new java.io.PrintWriter(resultFile);
SortedSet<String> values = entry.getValue();
for (String string : values) {
out.println(string);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} finally {
out.flush();
out.close();
}
}
If I open the result file, I see that every line break "\n" in the values from the template file has been replaced by an actual new line. Example:
test.key=Hello\nWorld!
becomes
test.key=Hello
World!
This is basically correct, but in my case I have to keep the "\n".
Does anyone know how I can avoid that?

Since it seems that your output is a properties file, you should use Properties.store() to generate the output file. This would not only take care of encoding the newline chars, but also the other special characters (non ISO8859-1 characters for example).
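A small round-trip sketch of that behaviour, written against a StringWriter so it runs without any files. The class name StoreEscapesNewlines and the helpers store and load are assumptions made for the example; Properties.store(Writer, String) and Properties.load(Reader) are the JDK calls being shown.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

public class StoreEscapesNewlines {

    // Serializes the properties to text; line breaks in values become the two characters \n.
    static String store(Properties props) {
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // null: no extra comment line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Parses properties text back into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties missing = new Properties();
        // The value holds a real line break, as it does after load()
        missing.setProperty("test.key", "Hello\nWorld!");
        String text = store(missing);
        System.out.println(text);
        // Loading the stored text gives back the original value
        System.out.println(load(text).getProperty("test.key").equals("Hello\nWorld!"));
    }
}
```

store() also writes a timestamp comment as the first line, so the output is not byte-for-byte identical between runs.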

Using println will end each line with the platform-specific line terminator. You can instead write the line terminator that you want explicitly:
for (Entry<String, SortedSet<String>> entry : resultSet) {
PrintWriter out = null;
try {
out = new java.io.PrintWriter(resultFile);
SortedSet<String> values = entry.getValue();
for (String string : values) {
out.print(string); // NOT out.println(string)
out.print("\n");
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} finally {
out.flush();
out.close();
}
}

To add an example to JB Nizet's answer (the best one, I think) using Properties.store():
FileInputStream compareFis = new FileInputStream(compareFile);
Properties compareProperties = new Properties();
compareProperties.load(compareFis);
....
StringBuilder value=new StringBuilder();
for (Entry<String, SortedSet<String>> entry : resultSet) {
SortedSet<String> values = entry.getValue();
for (String string : values) {
value.append(string).append("\n");
}
}
compareProperties.setProperty("test.key", value.toString());
FileOutputStream fos = new FileOutputStream(compareFile);
compareProperties.store(fos,null);
fos.close();

You need something like this:
"test.key=Hello\\nWorld!"
where "\\n" is actually \n.

Escape the \n before serializing it. If you intend to read the file you output, your reading code will need to be aware of the escaping.

You might also look at the Apache Commons StringEscapeUtils.escapeJava( String ).

How to check number of instances of a domain in a text file

I have a text file containing domains like
ABC.COM
ABC.COM
DEF.COM
DEF.COM
XYZ.COM
I want to read the domains from the text file and count how many instances of each domain there are.
Reading from a text file is easy, but I am confused about how to count the instances of each domain.
Please help.

Split by space (String instances have a split method), iterate through the resulting array, and use a Map<String, Integer> from domain name to count: when the domain is already in the map, increase its count by 1; when it is not, put the domain name in the map with 1 as the value.
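Here is one way that could look, as a sketch that assumes one domain per line, as in the sample file. DomainCounter and count are hypothetical names; Map.merge does the "increase or insert 1" step in a single call.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DomainCounter {

    // Counts how many times each non-empty line occurs.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String domain = line.trim();
            if (domain.isEmpty()) {
                continue; // ignore blank lines
            }
            // Insert 1 for a new domain, otherwise add 1 to the existing count
            counts.merge(domain, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("ABC.COM", "ABC.COM", "DEF.COM", "DEF.COM", "XYZ.COM");
        for (Map.Entry<String, Integer> entry : count(lines).entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

The comparison is case-sensitive; lower-casing each line first would treat ABC.COM and abc.com as the same domain.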

A better solution is to use a Map that maps each word to its frequency.
Map<String,Integer> frequency = new LinkedHashMap<String,Integer>();
Read file
BufferedReader in = new BufferedReader(new FileReader("infilename"));
String str;
while ((str = in.readLine()) != null) {
buildMap(str);
}
in.close();
Build map method: the file is read line by line, and each line is split on the delimiter (a space here; with one domain per line, each line yields a single word).
void buildMap(String line) {
String [] words = line.split(" ");
for (String word:words){
Integer f = frequency.get(word);
if(f==null) f=0;
frequency.put(word,f+1);
}
}
Find out for a particular domain with:
frequency.get(domainName)
Ref: Counting frequency of a string

List<String> domains=new ArrayList<String>(); // values from your file
domains.add("abc.com");
domains.add("abc.com");
domains.add("xyz.com");
//added for example
Map<String,Integer> domainCount=new HashMap<String, Integer>();
for(String domain:domains){
if(domainCount.containsKey(domain)){
domainCount.put(domain, domainCount.get(domain)+1);
}else{
domainCount.put(domain, 1);
}
}
Set<Entry<String, Integer>> entrySet = domainCount.entrySet();
for (Entry<String, Integer> entry : entrySet) {
System.out.println(entry.getKey()+" : "+entry.getValue());
}

If the domains are unknown you can do something like:
// Field Declaration
private Map<String, Integer> mappedDomain = new LinkedHashMap<String, Integer>();
private static final List<String> domainList = new ArrayList<String>();
// Add all that you want to track
domainList.add("com");
domainList.add("net");
domainList.add("org");
...
// Inside the loop where you do a readLine
String[] words = line.split(" ");
for (String word : words) {
// split() takes a regex, so the dot must be escaped to match a literal '.'
String[] wordSplit = word.split("\\.");
if (wordSplit.length == 2) {
for (String domainCheck : domainList) {
if (domainCheck.equals(wordSplit[1])) {
if (mappedDomain.containsKey(word)) {
mappedDomain.put(word, mappedDomain.get(word)+1);
} else {
mappedDomain.put(word, 1);
}
}
}
}
}
Note: This will work for something like xxx.xxx; if you need to take care of complex formats you need to modify the logic from wordSplit!
