Splitting a String into sections with keywords - java

I have a String I read from a .txt file which has values in sections separated like this:
Text first
[section_name_1]
Text with values pattern1
...
[section_name_2]
Text with values pattern2
I need to split the text at the section_name_# marks and add the sections to a String[] (the size of the array is fixed). My code so far produces some weird output:
//Code:
public static String[] parseFileToParams(File file)
{
String[] sections= {"[section_name_1]","[section_name_2]","[section_name_3]","[section_name_4]"};
String[] params = new String[sections.length+1];
StringBuilder sb = new StringBuilder();
String decoded = parseFile(file);// Returns the Text from the file
for(int i=0; i< sections.length;i++)
{
params[i]= decoded.split(sections[i])[1];
sb.append(params[i]);
}
return params;
}
//For Test of the output
String[] textArray = BasicOsuParser.parseFileToParams(parseFile);
for(int j = 0; j<textArray.length;j++)
{
sb.append(textArray[j]);
}
String text= sb.toString();
System.out.println (text); //Output: su f form formau fnull
// Obviously not how it should look like
Thanks for help!

Try this:
String[] sections= {"[section_name_1]","[section_name_2]","[section_name_3]","[section_name_4]"};
String textFromFile = "Text first [section_name_1] Text with values pattern1 [section_name_2] Text with values pattern2";
int count = 0;
for(int i = 0; i < sections.length; i++){
if(textFromFile.contains(sections[i])){//Use this to tell how big the parms array will be.
count++;
}
sections[i] = sections[i].replace("[", "\\[").replace("]", "\\]");//Escapes the brackets so the regex treats them as literal characters.
}
String[] parms = new String[count+1];//Where the split items will go.
int next = 0;//The next index for the parms array.
for(String sec : sections){
String split[] = textFromFile.split(sec);//Split the file's text by the sec
if(split.length == 2){
parms[next] = split[0];//Adds split to the parms
next++;//Go to the next index for the parms.
textFromFile = split[1];//Remove text which has just been added to the parms.
}
}
parms[next] = textFromFile;//Add any text after the last split.
for(String out : parms){
System.out.println(out);//Output parms.
}
This will do what you have asked and it is commented so you can see how it works.
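As an alternative to escaping the brackets by hand, Pattern.quote can turn a whole marker into a literal pattern. The following is only a minimal sketch: the class name MarkerSplitDemo and the helper splitAtMarker are made up for illustration and are not part of the answer above.

```java
import java.util.regex.Pattern;

class MarkerSplitDemo {
    // Hypothetical helper: splits text at the first literal occurrence of marker.
    // Returns {before, after}, or a one-element array if the marker is absent.
    static String[] splitAtMarker(String text, String marker) {
        // Pattern.quote makes '[' and ']' literal characters instead of a
        // character class; the limit of 2 keeps everything after the marker intact.
        return text.split(Pattern.quote(marker), 2);
    }

    public static void main(String[] args) {
        String text = "Text first [section_name_1] Text with values pattern1";
        String[] parts = splitAtMarker(text, "[section_name_1]");
        System.out.println(parts[0].trim());
        System.out.println(parts[1].trim());
    }
}
```

Checking the length of the returned array tells you whether the marker was found at all.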

It's not a good idea to use split() when the delimiter occurs only once in the text. The method splits the text on a regular expression and is usually used when a delimiter occurs several times. You would also have to escape characters that are special in a regex, such as '.' and '['; read about patterns in Java. In your case it is better to use substring() and indexOf():
public static String[] parseFileToParams(File file)
{
String[] sections= {"[section_name_1]","[section_name_2]","[section_name_3]","[section_name_4]"};
String[] params = new String[sections.length+1];
String decoded = parseFile(file);// Returns the Text from the file
int sectionStart = 0;
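for (int i = 0; i < sections.length; i++) {
int sectionEnd = decoded.indexOf(sections[i], sectionStart);
if (sectionEnd < 0) {
continue; // Marker not found: leave params[i] as null instead of failing in substring().
}
params[i] = decoded.substring(sectionStart, sectionEnd);
sectionStart = sectionEnd + sections[i].length();
}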
params[sections.length] = decoded.substring(sectionStart, decoded.length());
return params;
}
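To try the indexOf()/substring() approach without a file, here is a small runnable sketch. It assumes the text is already in memory and trims each piece; the class name SectionParser, the method splitByMarkers, and the sample text are illustrative only.

```java
class SectionParser {
    // Returns markers.length + 1 entries: entry i holds the text found before
    // markers[i], and the last entry holds whatever follows the final marker found.
    static String[] splitByMarkers(String text, String[] markers) {
        String[] params = new String[markers.length + 1];
        int start = 0;
        for (int i = 0; i < markers.length; i++) {
            int end = text.indexOf(markers[i], start);
            if (end < 0) {
                continue; // marker missing: this slot stays null
            }
            params[i] = text.substring(start, end).trim();
            start = end + markers[i].length();
        }
        params[markers.length] = text.substring(start).trim();
        return params;
    }

    public static void main(String[] args) {
        String text = "Text first\n[section_name_1]\nText with values pattern1\n"
                + "[section_name_2]\nText with values pattern2";
        String[] markers = {"[section_name_1]", "[section_name_2]", "[section_name_3]"};
        for (String part : splitByMarkers(text, markers)) {
            System.out.println(part);
        }
    }
}
```

Because indexOf() works on plain strings, no escaping of the brackets is needed here.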

params[i]= decoded.split(sections[i])[1];
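Because split() takes a regular expression, "[section_name_1]" is read as a character class that matches any single one of the characters between the brackets, so the text is cut at each of those characters. Even with the brackets escaped, this would return the string after the first appearance of sections[i], i.e. not just up to sections[i+1] but up to the end of the file.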
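One way to get each section's text only up to the next marker is to split once on a pattern that matches any marker. This is a sketch under the assumption that every marker looks like [section_name_N] with N a number; the class and method names are invented for the example.

```java
class AnyMarkerSplit {
    // Splits the text on every marker of the form [section_name_<digits>].
    static String[] splitOnAnyMarker(String text) {
        // \[ and \] match literal brackets, \d+ matches the section number.
        return text.split("\\[section_name_\\d+\\]");
    }

    public static void main(String[] args) {
        String text = "Text first [section_name_1] Text with values pattern1 "
                + "[section_name_2] Text with values pattern2";
        for (String part : splitOnAnyMarker(text)) {
            System.out.println(part.trim());
        }
    }
}
```

Note that this returns as many pieces as the text happens to contain, so it does not by itself give a fixed-size array.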

This loop,
for(int i=0; i< sections.length;i++)
{
params[i]= decoded.split(sections[i])[1];
sb.append(params[i]);
}
return params;
Repeatedly splits decoded into 2 halves, separated by the given section. You then store the entire 2nd half in params.
For example, pretend you wanted to split the string "abcdef" along "a", "b", etc.
You would split along a and get "bcdef", then split along b and get "cdef", and so on, so the combined output would be "bcdefcdef...".
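The "abcdef" example can be reproduced with a few lines. This is just an illustrative sketch (the class RepeatedSplitDemo is made up); it guards against the case where the split yields no second half, which happens for the last letter because split() drops trailing empty strings.

```java
class RepeatedSplitDemo {
    // Splits the same text once per delimiter and concatenates each second half.
    static String secondHalves(String text, String[] delimiters) {
        StringBuilder sb = new StringBuilder();
        for (String delimiter : delimiters) {
            String[] halves = text.split(delimiter);
            if (halves.length > 1) { // e.g. "abcdef".split("f") has no second half
                sb.append(halves[1]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] delimiters = {"a", "b", "c", "d", "e", "f"};
        System.out.println(secondHalves("abcdef", delimiters)); // prints bcdefcdefdefeff
    }
}
```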
I think what you want to do is use a real regex as the delimiter, something like params = decoded.split("\\[section_name_.\\]"). Look at http://www.tutorialspoint.com/java/java_string_split.htm and https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
and if you want t

Related

Trying to read the contents of a file into an array, but the contents are separated by a colon

Inside my file is the two-letter abbreviation of each state followed by the full name of the state, and each state abbreviation and name is separated by a colon.
Like this:
al:Alabama
ak:Alaska
I need to read this into an array of 52x2 and I am not sure how to do that. The code I have now just reads each line from the file into the array without separating the abbreviation and name.
String[][] states = new String[52][2];
while (input2.hasNext()) {
for (int row = 0; row < states.length; row++) {
for (int column = 0; column < states[row].length; column++) {
states[row][column] = input2.next();
System.out.printf("%s%n", states[row][column]);
}
}
}
You can try the code below (comments inline):
String[][] states = new String[52][2];
int row = 0;
while (input2.hasNextLine() && row < states.length) {
// Read whole line from the file
String line = input2.nextLine();
// Split string into tokens with : character.
// It means, Line: al:Alabama is converted to
// ["al", "Alabama"] in tokens
String tokens[] = line.split(":");
// Store first token in first column and similarly for second.
states[row][0] = tokens[0];
states[row][1] = tokens[1];
row++;
}
Use the predefined split() String function:
// A string variable.
String myString = "Hello:World";
// split the string by the colon.
String[] myStringArray = myString.split(":");
// print the first element of the array.
System.out.println(myStringArray[0]);
// print the second element of the array.
System.out.println(myStringArray[1]);
As your data also adheres to the .properties format, you can use the Properties class.
Path file = Paths.get("...");
Properties properties = new Properties(52); // Initial capacity.
properties.load(Files.newBufferedReader(file, StandardCharsets.ISO_8859_1));
properties.list(System.out);
String name = properties.getProperty("AR", "Arabia?");
Here I used the overloaded getProperty, where one can provide a default ("Arabia?") in case the key is missing.
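To see the Properties approach work without a file on disk, the same data can be loaded from a string. This is a rough sketch: the class StatePropertiesDemo and its loadStates helper are hypothetical, and the two sample lines come from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class StatePropertiesDemo {
    // Parses "key:value" lines; ':' is one of the separators the format accepts.
    static Properties loadStates(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory reader
        }
        return props;
    }

    public static void main(String[] args) {
        Properties states = loadStates("al:Alabama\nak:Alaska\n");
        System.out.println(states.getProperty("al"));
        System.out.println(states.getProperty("AR", "Arabia?")); // falls back to the default
    }
}
```

Keep in mind that Properties does not preserve the order of the lines, so it suits lookups better than filling a 52x2 array.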
This one-line version uses NIO Files to read the file, streams its lines, and splits each line at the first colon:
String[][] states = Files.readAllLines(Path.of("states.cfg")).stream().map(s -> s.split(":", 2)).toArray(String[][]::new);
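The stream version can be tried on an in-memory list as well. The sketch below adds one thing the one-liner does not have, a filter that drops lines without a colon (for example a trailing blank line); the class StateTableDemo and the method toTable are names chosen for this example only.

```java
import java.util.Arrays;
import java.util.List;

class StateTableDemo {
    // Converts "abbr:Name" lines into rows of {abbr, Name}, skipping lines without ':'.
    static String[][] toTable(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains(":"))
                .map(line -> line.split(":", 2)) // limit 2: split at the first colon only
                .toArray(String[][]::new);
    }

    public static void main(String[] args) {
        String[][] states = toTable(Arrays.asList("al:Alabama", "ak:Alaska", ""));
        for (String[] row : states) {
            System.out.println(row[0] + " -> " + row[1]);
        }
    }
}
```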

JSP JSTL function fn:split is not working properly

Today I came across an issue and need your help to fix it.
I am trying to split a string using the JSTL fn:split function, like this:
<c:set var="stringArrayName" value="${fn:split(element, '~$')}" />
Actual String :- "abc~$pqr$xyz"
Expected Result :-
abc
pqr$xyz
I expect only 2 parts, but it gives
abc
pqr
xyz
Here a total of 3 parts is returned, which is wrong.
NOTE :- I have added <%@taglib prefix="fn" uri="http://java.sun.com/jsp/jstl/functions"%> at the top of the JSP.
Any help is really appreciated!
The JSTL split does not work like the Java split; you can see the difference in the source code:
org.apache.taglibs.standard.functions.Functions.split
public static String[] split(String input, String delimiters) {
String[] array;
if (input == null) {
input = "";
}
if (input.length() == 0) {
array = new String[1];
array[0] = "";
return array;
}
if (delimiters == null) {
delimiters = "";
}
StringTokenizer tok = new StringTokenizer(input, delimiters);
int count = tok.countTokens();
array = new String[count];
int i = 0;
while (tok.hasMoreTokens()) {
array[i++] = tok.nextToken();
}
return array;
}
java.lang.String.split
public String[] split(String regex, int limit) {
return Pattern.compile(regex).split(this, limit);
}
So it's clear that fn:split uses StringTokenizer
...
StringTokenizer tok = new StringTokenizer(input, delimiters);
int count = tok.countTokens();
array = new String[count];
int i = 0;
while (tok.hasMoreTokens()) {
array[i++] = tok.nextToken();
}
...
unlike java.lang.String.split, which uses a regular expression
return Pattern.compile(regex).split(this, limit);
//-----------------------^
The StringTokenizer documentation says:
Constructs a string tokenizer for the specified string. The characters
in the delim argument are the delimiters for separating tokens.
Delimiter characters themselves will not be treated as tokens.
How exactly does `fn:split` work?
It splits on each character in the delimiter string. In your case there are two characters, ~ and $, so the string abc~$pqr$xyz is split like this:
abc~$pqr$xyz
   ^^   ^
1st split :
abc
$pqr$xyz
2nd split :
abc
pqr$xyz
3rd split :
abc
pqr
xyz
Solution
Use String.split in your servlet instead of JSTL, escaping the $ so the regex matches the literal two-character delimiter.
For example:
String[] array = "abc~$pqr$xyz".split("~\\$");
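The difference between the two behaviours can be checked side by side in plain Java. This is a small sketch, with SplitComparison and its two methods being names invented here; it mirrors what fn:split does internally according to the source quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class SplitComparison {
    // Treats every character of delimiters as a separate delimiter (fn:split behaviour).
    static List<String> tokenizerSplit(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tok = new StringTokenizer(input, delimiters);
        while (tok.hasMoreTokens()) {
            tokens.add(tok.nextToken());
        }
        return tokens;
    }

    // Treats delimiter as one literal multi-character string.
    static List<String> literalSplit(String input, String delimiter) {
        return Arrays.asList(input.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        System.out.println(tokenizerSplit("abc~$pqr$xyz", "~$")); // prints [abc, pqr, xyz]
        System.out.println(literalSplit("abc~$pqr$xyz", "~$"));   // prints [abc, pqr$xyz]
    }
}
```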

How to merge many List<String> elements in one based on double quote delimiter in java

I have a CSV file generated on another platform (Salesforce). By default, Salesforce does not seem to handle line breaks in some large text fields when generating the file, so my CSV file has some rows with line breaks like this that I need to fix:
"column1","column2","my column with text
here the text continues
more text in the same field
here we finish this","column3","column4"
Same idea using this piece of code:
List<String> listWords = new ArrayList<String>();
listWords.add("\"Hi all");
listWords.add("This is a test");
listWords.add("of how to remove");
listWords.add("");
listWords.add("breaklines and merge all in one\"");
listWords.add("\"This is a new Line with the whole text in one row\"");
In this case I would like to merge the elements. My first approach was to look for lines whose last character is not a double quote ("), then concatenate the following lines until a line ends with another double quote.
This is a non-working sample of what I was trying to achieve, but I hope it gives you an idea:
String[] csvLines = csvContent.split("\n");
Integer iterator = 0;
String mergedRows = "";
for(String row:csvLines){
newCsvfile.add(row);
if(row != null){
if(!row.isEmpty()){
String lastChar = String.valueOf(row.charAt(row.length()-1));
if(!lastChar.contains("\"")){
//row += row+" "+csvLines[iterator+1].replaceAll("\r", "").replaceAll("\n", "").replaceAll("","").replaceAll("\r\n?|\n", "");
mergedRows += row+" "+csvLines[iterator+1].replaceAll("\r", "").replaceAll("\n", "").replaceAll("","").replaceAll("\r\n?|\n", "");
row = mergedRows;
csvLines[iterator+1] = null;
}
}
newCsvfile.add(row);
}
iterator++;
}
My final result should look like (based on the list sample):
"Hi all This is a test of how to remove break lines and merge all in one"
"This is a new Line with the whole text in one row".
What is the best approach to achieve this?
In case you don't want to use a CSV reading library like @RealSkeptic suggested...
Going from your listWords to your expected solution is fairly simple:
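List<String> listSentences = new ArrayList<>();
StringBuilder tmp = new StringBuilder();
for (String s : listWords) {
if (s.isEmpty()) {
continue; // Skip blank lines so they do not add extra spaces.
}
if (tmp.length() > 0) {
tmp.append(' '); // Separate merged lines with a single space.
}
tmp.append(s);
if (s.endsWith("\"")) {
listSentences.add(tmp.toString());
tmp.setLength(0);
}
}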
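If a continuation line might itself end with a quote inside the field, counting quotes can be more reliable than checking the last character. The sketch below swaps in that different technique: a row is treated as complete once it contains an even number of double quotes. It assumes quotes inside fields are escaped by doubling them, and the names CsvLineMerger and mergeBrokenRows are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvLineMerger {
    // Merges physical lines into logical rows; a row is complete when its
    // total number of double quotes is even (assumes "" is the only escape).
    static List<String> mergeBrokenRows(List<String> lines) {
        List<String> rows = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int quoteCount = 0;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // blank lines contribute nothing
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(line);
            for (int i = 0; i < line.length(); i++) {
                if (line.charAt(i) == '"') {
                    quoteCount++;
                }
            }
            if (quoteCount % 2 == 0) {
                rows.add(current.toString());
                current.setLength(0);
                quoteCount = 0;
            }
        }
        if (current.length() > 0) {
            rows.add(current.toString()); // keep an unterminated trailing row
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> listWords = Arrays.asList("\"Hi all", "This is a test",
                "of how to remove", "", "breaklines and merge all in one\"",
                "\"This is a new Line with the whole text in one row\"");
        mergeBrokenRows(listWords).forEach(System.out::println);
    }
}
```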

Checking whether the String contains multiple words

I am getting the names as a String. How can I display them in the following format: if the name is a single word, I need to display its first character alone. If it's two words, I need to display the first character of each word.
John : J
Peter: P
Mathew Rails : MR
Sergy Bein : SB
I cannot use an enum as I am not sure that the list would return the same values all the time. Though they said, it's never going to change.
String name = myString.split('');
topTitle = name[0].subString(0,1);
subTitle = name[1].subString(0,1);
String finalName = topTitle + finalName;
The above code looks fine to me, but it's not working. I am not getting any exception either.
There are a few mistakes in your attempted code.
String#split takes a String as regex.
Return value of String#split is an array of String.
so it should be:
String[] name = myString.split(" ");
or
String[] name = myString.split("\\s+");
You also need to check the number of elements in the array first, like this, to avoid an exception:
String topTitle;
String subTitle = "";
if (name.length == 2) {
topTitle = name[0].substring(0,1);
subTitle = name[1].substring(0,1);
}
else
topTitle = name[0].substring(0,1);
The String.split method splits a string into an array of strings, based on your regular expression.
This should work:
String[] names = myString.split("\\s+");
String topTitle = names[0].substring(0,1);
String subTitle = names[1].substring(0,1);
String finalName = topTitle + subTitle;
First: "name" should be an array.
String[] names = myString.split(" ");
Second: You should use an if statement and the array's length field to determine how many words there are.
String initial = "";
if(names.length > 1){
initial = names[0].substring(0,1) + names[1].substring(0,1);
}else{
initial = names[0].substring(0,1);
}
Alternatively you could use a for loop
String initial = "";
for(int i = 0; i < names.length; i++){
initial += names[i].substring(0,1);
}
You were close..
String[] name = myString.split(" ");
String finalName = name[0].charAt(0)+""+(name.length==1?"":name[1].charAt(0));
(name.length==1?"":name[1].charAt(0)) uses the ternary operator: it yields an empty string if the name array has length 1, otherwise the first character of the second word.
This will work for you
public static void getString(String str) {
String[] strr=str.split(" ");
StringBuilder sb=new StringBuilder();
for(int i=0;i<strr.length;i++){
sb.append(strr[i].charAt(0));
}
System.out.println(sb);
}
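Pulling the suggestions above together, here is one possible self-contained version. It is only a sketch with assumed behaviour: it ignores leading, trailing, and repeated whitespace, returns an empty string for blank input, and uses at most the first two words. The class InitialsDemo and the method initials are hypothetical names.

```java
class InitialsDemo {
    // Returns the first character of up to the first two words of name.
    static String initials(String name) {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            return ""; // nothing to take an initial from
        }
        String[] words = trimmed.split("\\s+");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.length && i < 2; i++) {
            sb.append(words[i].charAt(0));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(initials("John"));         // prints J
        System.out.println(initials("Mathew Rails")); // prints MR
    }
}
```

Trimming first matters because splitting a string with leading whitespace on "\\s+" produces an empty first element, and charAt(0) would then throw.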

Extract string text into another strings

I got a string like this:
String text = "number|name|url||number2|name2|url2";
Now I have written a loop
int initialize = 0;
for (int i = initialize; i < text.length(); i++) {
//do the work
}
In this loop I want to extract number into one string, name into one string, and url into one string, and when I reach || perform an action (e.g. insert these three strings into the db). Once this action is done, start again: extract number2, name2 and url2 into strings and perform the action.
Is this possible? Can you tell me how? I don't get it.
You can use the split() method of String.
String[] bigParts = myString.split("\\|\\|");
for(String part : bigParts)
{
String[] words = part.split("\\|");
//save to db or what you want
}
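A runnable variant of the two-level split might look like the sketch below. It collects the fields of each record into a list instead of writing to a database; PipeRecords and parseRecords are names made up for the example, and it assumes no field is ever empty, since an empty field would make "||" ambiguous.

```java
import java.util.ArrayList;
import java.util.List;

class PipeRecords {
    // Splits the text into records at "||", then each record into fields at "|".
    static List<String[]> parseRecords(String text) {
        List<String[]> records = new ArrayList<>();
        for (String record : text.split("\\|\\|")) {
            records.add(record.split("\\|")); // '|' must be escaped in a regex
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "number|name|url||number2|name2|url2";
        for (String[] fields : parseRecords(text)) {
            // Replace this line with the real action, e.g. a database insert.
            System.out.println(fields[0] + ", " + fields[1] + ", " + fields[2]);
        }
    }
}
```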
for your case
StringTokenizer stPipe = null;
StringTokenizer stDblPipe = null;
String firstPipeElement=null;
stPipe = new StringTokenizer(text, "|");
if (stPipe.hasMoreElements())
{
firstPipeElement= stPipe.nextElement().toString();
.......
if(firstPipeElement.equals("||"))
{
stDblPipe = new StringTokenizer(firstPipeElement , "||");
.....
}
}
hope this helps
Java is not my language, but this is worth a try:
String text = "number|name|url||number2|name2|url2";
String[] temp;
String[] temp2;
int i;
temp = text.split("\\|\\|");
for(i=0;i<temp.length;i++){
temp2 = temp[i].split("\\|");
String no = temp2[0];
String name = temp2[1];
String url = temp2[2];
// Do processing with no, name, url
}
I hope, this would help
