Split string into key-value pairs - java

I have a string like this:
pet:cat::car:honda::location:Japan::food:sushi
Here : separates a key from its value, while :: separates the pairs.
I want to add the key-value pairs to a map.
I can achieve this using:
Map<String, String> map = new HashMap<String, String>();
String test = "pet:cat::car:honda::location:Japan::food:sushi";
String[] test1 = test.split("::");
for (String s : test1) {
String[] t = s.split(":");
map.put(t[0], t[1]);
}
for (String s : map.keySet()) {
System.out.println(s + " is " + map.get(s));
}
But is there an efficient way of doing this?
I feel the code is inefficient because I have used 2 String[] objects and called the split function twice.
Also, I am using t[0] and t[1] which might throw an ArrayIndexOutOfBoundsException if there are no values.

You could do a single call to split() and a single pass on the String using the following code. But it of course assumes the String is valid in the first place:
Map<String, String> map = new HashMap<String, String>();
String test = "pet:cat::car:honda::location:Japan::food:sushi";
// split on ':' and on '::'
String[] parts = test.split("::?");
for (int i = 0; i < parts.length; i += 2) {
map.put(parts[i], parts[i + 1]);
}
for (String s : map.keySet()) {
System.out.println(s + " is " + map.get(s));
}
The above is probably a little bit more efficient than your solution, but if you find your code clearer, then keep it, because there is almost zero chance such an optimization has a significant impact on performance, unless you do that millions of times. Anyway, if it's so important, then you should measure and compare.
EDIT:
for those who wonder what ::? means in the above code: String.split() takes a regular expression as argument. A separator is a substring that matches the regular expression. ::? is a regular expression which means: 1 colon, followed by 0 or 1 colon. It thus allows considering :: and : as separators.
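The one-pass loop above assumes the input is well formed. For the ArrayIndexOutOfBoundsException concern raised in the question, here is a minimal sketch of a defensive variant. It assumes a malformed pair should be rejected rather than skipped, and the names PairParser and parse are hypothetical, not from any library:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PairParser {
    // Splits "k:v::k:v" into a map; throws if a pair has no ':' separator.
    static Map<String, String> parse(String input) {
        Map<String, String> map = new LinkedHashMap<>();
        if (input.isEmpty()) {
            return map;
        }
        for (String pair : input.split("::")) {
            // A limit of 2 keeps any later ':' inside the value.
            String[] kv = pair.split(":", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Missing ':' in pair: " + pair);
            }
            map.put(kv[0], kv[1]);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(parse("pet:cat::car:honda::location:Japan::food:sushi"));
    }
}
```

This still calls split twice, so it trades a little speed for an explicit error instead of an index exception.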

Using Guava library it's a one-liner:
String test = "pet:cat::car:honda::location:Japan::food:sushi";
Map<String, String> map = Splitter.on( "::" ).withKeyValueSeparator( ':' ).split( test );
System.out.println(map);
The output:
{pet=cat, car=honda, location=Japan, food=sushi}
This also might work faster than JDK String.split as it does not create a regexp for "::".
Update: it even correctly handles the corner case from the comments:
String test = "pet:cat::car:honda::location:Japan::food:sushi:::cool";
Map<String, String> map = Splitter.on( "::" ).withKeyValueSeparator( ':' ).split( test );
System.out.println(map);
The output is:
{pet=cat, car=honda, location=Japan, food=sushi, =cool}

Your solution is indeed somewhat inefficient.
The person who gave you the string to parse is also somewhat of a clown. There are industry standard serialization formats, like JSON or XML, for which fast, efficient parsers exist. Inventing the square wheel is never a good idea.
First question: Do you care? Is it slow enough that it hinders performance of your application? It's likely not to, but there is only one way to find out. Benchmark your code.
That said, more efficient solutions exist. Below is an example
public static void main (String[] args) throws java.lang.Exception
{
String test = "pet:cat::car:honda::location:Japan::food:sushi";
boolean stateiskey = true;
Map<String, String> map = new HashMap<>();
int keystart = 0;
int keyend = 0;
int valuestart = 0;
int valueend = 0;
for(int i = 0; i < test.length(); i++){
char nextchar = test.charAt(i);
if (stateiskey) {
if (nextchar == ':') {
keyend = i;
stateiskey = false;
valuestart = i + 1;
}
} else {
if (i == test.length() - 1 || (nextchar == ':' && test.charAt(i + 1) == ':')) {
valueend = i;
if (i + 1 == test.length()) valueend += 1; //compensate one for the end of the string
String key = test.substring(keystart, keyend);
String value = test.substring(valuestart, valueend);
keystart = i + 2;
map.put(key, value);
i++;
stateiskey = true;
}
}
}
System.out.println(map);
}
This solution is a finite state machine with only two states. It looks at every character only twice, once when it tests it for a boundary, and once when it copies it to the new string in your map. This is the minimum amount.
It doesn't create objects that are not needed, like StringBuilders, strings or arrays, which keeps garbage-collection pressure low.
It maintains good locality. The next character probably always is in cache, so the lookup is cheap.
It comes at a grave cost that is probably not worth it though:
It's far more complicated and less obvious
There are all sorts of moving parts
It's harder to debug when your string is in an unexpected format
Your coworkers will hate you
You will hate you when you have to debug something
Worth it? Maybe. How fast do you need that string parsed exactly?
A quick and dirty benchmark at https://ideone.com/8T7twy tells me that for this string, this method is approximately 4 times faster. For longer strings the difference is likely somewhat greater.
But your version is still only 415 milliseconds for 100.000 repetitions, where this one is 99 milliseconds.

Try this code - see the comments for an explanation:
HashMap<String,String> hmap = new HashMap<>();
String str="abc:1::xyz:2::jkl:3";
String straraay[]= str.split("::?");
for(int i=0;i<straraay.length;i+=2) {
hmap.put(straraay[i],straraay[i+1]);
}
System.out.println(hmap.values()); // values only
System.out.println(hmap.keySet()); // keys only, if you want it clearer

I don't know whether this is the best approach, but I think it is another way of doing the same thing without calling split twice:
Map<String, String> map = new HashMap<String, String>();
String test = "pet:cat::car:honda::location:Japan::food:sushi";
String[] test1 = test.replaceAll("::",":").split(":");
for(int i=0;i<test1.length;i=i+2)
{
map.put(test1[i], test1[i+1]);
}
for (String s : map.keySet()) {
System.out.println(s + " is " + map.get(s));
}
Hope it will help :)

This might be useful. For an input string (referrerString below) such as:
utm_source=test_source&utm_medium=test_medium&utm_term=test_term&utm_content=test_content&utm_campaign=test_name&referral_code=DASDASDAS
// referrerString holds the input shown above
String str[] = referrerString.split("&");
HashMap<String,String> stringStringHashMap= new HashMap<>();
List<String> al;
al = Arrays.asList(str);
String[] strkey ;
for (String s : al) {
strkey= s.split("=");
stringStringHashMap.put(strkey[0],strkey[1]);
}
for (String s : stringStringHashMap.keySet()) {
System.out.println(s + " is " + stringStringHashMap.get(s));
}

Your program is absolutely fine.
Since you asked for more optimal code, though:
I reduced the memory use by keeping a few variables instead of storing the pieces in arrays.
Look at your string; it follows a pattern:
key : value :: key : value ::....
What can we do with this?
Read the key until you reach ':'; after that, read the value until you reach '::'.
package qwerty7;
import java.util.HashMap;
public class Demo {
public static void main(String ar[])
{
StringBuilder s = new StringBuilder("pet:cat::car:honda::location:Japan::food:sushi");
boolean isKey = true;
String key = "", value = "";
HashMap<String, String> hm = new HashMap<>();
for(int i = 0; i < s.length(); i++)
{
char ch = s.charAt(i);
char nextChar = (i + 1 < s.length()) ? s.charAt(i + 1) : '\0'; // avoid reading past the end of the string
if(ch == ':' && nextChar != ':')
{
isKey = false;
continue;
}
else if(ch == ':' && nextChar == ':')
{
hm.put(key, value);
isKey = true;
key = "";
value = "";
i+=1;
continue;
}
if(isKey)
{
key += ch;
}
else
{
value += ch;
}
if(i == s.length() - 1)
{
hm.put(key, value);
}
}
for (String x : hm.keySet()) {
System.out.println(x + " is " + hm.get(x));
}
}
}
This makes a single pass over the string instead of splitting it repeatedly.
It needs no intermediate arrays.
Time complexity: O(n)
Output:
car is honda
location is Japan
pet is cat
food is sushi

Related

Reverse the words in a sentence but not punctuation using recursion

How to reverse the words in a sentence, but not punctuation using recursion. The sentence is said to use punctuation marks: ,.?!
Input: "Jack, come home!"
Output: "home, come Jack!"
Now I have somehow managed to complete the task correctly but without using recursion.
How should I convert this work to use recursion to solve the problem?
Here's the method:
public static StringBuilder reverseSentenceWithPunctuation(String sentence, int i) {
String[] parts = sentence.split(" ");
StringBuilder newSentence = new StringBuilder();
Map<Integer, Character> punctuationMap = new HashMap<>();
for (int j = 0; j < parts.length; j++) {
if (parts[j].endsWith(",") || parts[j].endsWith(".") || parts[j].endsWith("!") || parts[j].endsWith("?")) {
char lastSymbol = parts[j].charAt(parts[j].length()-1);
punctuationMap.put(j, lastSymbol);
String changedWord = parts[j].replace(String.valueOf(lastSymbol), "");
parts[j] = changedWord;
}
}
for (int j = parts.length-1; j >= 0; j--) {
newSentence.append(parts[j]);
if (punctuationMap.containsKey(i)) {
newSentence.append(punctuationMap.get(i));
newSentence.append(" ");
} else
newSentence.append(" ");
i++;
}
return newSentence;
}
Thanks in advance!
To implement this task using recursion, a pattern matching the first and the last words followed by some delimiters should be prepared:
word1 del1 word2 del2 .... wordLast delLast
In case of matching the input the result is calculated as:
wordLast del1 REVERT(middle_part) + word1 delLast
Example implementation may be as follows (the words are considered to contain English letters and apostrophe ' for contractions):
static Pattern SENTENCE = Pattern.compile("^([A-Za-z']+)([^A-Za-z]+)?(.*)([^'A-Za-z]+)([A-Za-z']+)([^'A-Za-z]+)?$");
public static String revertSentence(String sentence) {
Matcher m = SENTENCE.matcher(sentence);
if (m.matches()) {
return m.group(5) + (m.group(2) == null ? "" : m.group(2))
+ revertSentence(m.group(3) + m.group(4)) // middle part
+ m.group(1) + (m.group(6) == null ? "" : m.group(6));
}
return sentence;
}
Tests:
System.out.println(revertSentence("Jack, come home!"));
System.out.println(revertSentence("Jack, come home please!!"));
System.out.println(revertSentence("Jane cried: Will you come home Jack, please, don't go!"));
Output:
home, come Jack!
please, home come Jack!!
go don't: please Jack home come you, Will, cried Jane!
I don't think this is a good case for a recursive function, mainly because you need 2 loops. Also, in general, iterative algorithms perform better and won't throw a StackOverflowError.
So I think the main reasons to work with recursive functions are readability and simplicity, and honestly, in this case, I think it isn't worth it.
In any case, this is my attempt to convert your code to a recursive function. As stated before, I use 2 functions because of the 2 loops. I'm sure there is a way to achieve this with a single function that first loads the map of punctuation marks and then composes the final String, but to be honest that would be quite ugly.
import java.util.*;
import java.util.stream.*;
public class HelloWorld{
static Character[] punctuationCharacters = {',','.','!','?'};
public static void main(String []args){
System.out.println(reverseSentenceWithPunctuation("Jack, come home!"));
}
private static String reverseSentenceWithPunctuation(String sentence) {
String[] parts = sentence.split(" ");
return generate(0, parts, extractPunctuationMap(0, parts));
}
private static Map<Integer, Character> extractPunctuationMap(int index, String[] parts){
Map<Integer, Character> map = new HashMap<>();
if (index >= parts.length) {
return map;
}
char lastSymbol = parts[index].charAt(parts[index].length() - 1);
if (Arrays.stream(punctuationCharacters).anyMatch(character -> character == lastSymbol)) {
parts[index] = parts[index].substring(0, parts[index].length() - 1);
map = Stream.of(new Object[][] {
{ index, lastSymbol}
}).collect(Collectors.toMap(data -> (Integer) data[0], data -> (Character) data[1]));
}
map.putAll(extractPunctuationMap(index + 1, parts));
return map;
}
private static String generate(int index, String[] parts, Map<Integer, Character> punctuationMap) {
if (index >= parts.length) {
return "";
}
String part = index == parts.length - 1 ? parts[index] : " " + parts[index]; // the last part comes first in the output, so it gets no leading space
if (punctuationMap.containsKey(parts.length -1 - index)) {
part += punctuationMap.get(parts.length -1 - index);
}
return generate(index + 1, parts, punctuationMap) + part;
}
}
In pseudocode maybe something like that:
1. take the whole sentence
   (a). get the first word
   (b). get the last word
   (if there is a punctuation after the first or last word, leave it there)
2. swap(a, b) and return the remaining middle of the sentence
3. repeat (1) and (2) until there are only two words or one
4. return the last two (swapped) words left (if one word, just return that)
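The steps above can be sketched in Java roughly as follows. This is only one possible reading of the pseudocode: it assumes words are separated by single spaces and that punctuation (,.?!) only trails a word. RecursiveReverse and its helper methods are made-up names for illustration.

```java
final class RecursiveReverse {
    private static final String PUNCTUATION = ",.?!";

    static String reverse(String sentence) {
        String[] parts = sentence.split(" ");
        swap(parts, 0, parts.length - 1);
        return String.join(" ", parts);
    }

    // Swaps the words at lo and hi, leaving trailing punctuation where it was,
    // then recurses on the remaining middle of the sentence.
    private static void swap(String[] parts, int lo, int hi) {
        if (lo >= hi) {
            return; // zero or one word left: nothing to swap
        }
        String first = core(parts[lo]);
        String last = core(parts[hi]);
        parts[lo] = last + parts[lo].substring(first.length());
        parts[hi] = first + parts[hi].substring(last.length());
        swap(parts, lo + 1, hi - 1);
    }

    // Returns the token without its trailing punctuation marks.
    private static String core(String token) {
        int end = token.length();
        while (end > 0 && PUNCTUATION.indexOf(token.charAt(end - 1)) >= 0) {
            end--;
        }
        return token.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(reverse("Jack, come home!")); // home, come Jack!
    }
}
```

The recursion depth is half the number of words, so stack usage stays small for ordinary sentences.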

Efficient and non-interfering way of replacing multiple substrings in a String

I'm trying to apply the same replacement instructions several thousand times to different input strings with as little overhead as possible. I need to consider two things for this:
The search Strings aren't necessarily all the same length: one may be just "a", another might be "ch", yet another might be "sch"
What was already replaced shall not be replaced again: If the replacement patterns are [a->e; e->a], "beat" should become "baet", not "baat" or "beet".
With that in mind, this is the code I came up with:
public class Replacements {
private String[] search;
private String[] replace;
Replacements(String[] s, String[] r)
{
if (s.length!=r.length) throw new IllegalArgumentException();
Map<String,String> map = new HashMap<String,String>();
for (int i=0;i<s.length;i++)
{
map.put(s[i], r[i]);
}
List<String> sortedKeys = new ArrayList(map.keySet());
Collections.sort(sortedKeys, new StringLengthComparator());
this.search = sortedKeys.toArray(new String[0]);
Stack<String> r2 = new Stack<>();
sortedKeys.stream().forEach((i) -> {
r2.push(map.get(i));
});
this.replace = r2.toArray(new String[0]);
}
public String replace(String input)
{
return replace(input,0);
}
private String replace(String input,int i)
{
String out = "";
List<String> parts = Arrays.asList(input.split(this.search[i],-1));
for (Iterator it = parts.iterator(); it.hasNext();)
{
String part = it.next().toString();
if (part.length()>0 && i<this.search.length-1) out += replace(part,i+1);
if (it.hasNext()) out += this.replace[i];
}
return out;
}
}
And then
String[] words;
//fill variable words
String[] s_input = "ou|u|c|ch|ce|ci".split("\\|",-1);
String[] r_input = "u|a|k|c|se|si".split("\\|",-1);
Replacements reps = new Replacements(s_input,r_input);
for (String word : words) {
System.out.println(reps.replace(word));
}
(s_input and r_input would be up to the user, so they're just examples, just like the program wouldn't actually use println())
This code makes sure longer search strings get looked for first and also covers the second condition above.
It is, however, quite costly. What would be the most efficient way to accomplish what I'm doing here (especially if the number of Strings in words is significantly large)?
With my current code, "couch" should be converted into "kuc" (except it doesn't, apparently; it now does, thanks to the -1 in split(p,-1))
This is not a full solution but it shows how to scan the input and find all target substrings in one pass. You would use a StringBuilder to assemble the result, looking up the replacements in a Map as you are currently doing. Use the start and end indexes to handle copying of non-matching segments.
public static void main(String[] args) throws Exception
{
Pattern p = Pattern.compile("(ou|ch|ce|ci|u|c)");
Matcher m = p.matcher("auouuchcceaecxici");
while (m.find())
{
MatchResult r = m.toMatchResult();
System.out.printf("s=%d e=%d '%s'\n", r.start(), r.end(), r.group());
}
}
Output:
s=1 e=2 'u'
s=2 e=4 'ou'
s=4 e=5 'u'
s=5 e=7 'ch'
s=7 e=8 'c'
s=8 e=10 'ce'
s=12 e=13 'c'
s=15 e=17 'ci'
Note the strings in the regex have to be sorted in order of descending length to work correctly.
One could make a regex pattern from the keys and leave it to that module for optimization.
Obviously
"(ou|u|ch|ce|ci|c)"
needs to take care of ce/ci/c, either by reverse sorting or immediately as tree:
"(c(e|h|i)?|ou|u)"
Then
String soughtKeys = "ou|u|ch|ce|ci|c"; // c last
String replacements = "u|a|c|se|si|k";
Map<String, String> map = new HashMap<>();
// ... fill map
Pattern pattern = Pattern.compile("(" + soughtKeys + ")");
for (String word : words) {
StringBuffer sb = new StringBuffer();
Matcher m = pattern.matcher(word);
while (m.find()) {
m.appendReplacement(sb, Matcher.quoteReplacement(map.get(m.group())));
}
m.appendTail(sb);
System.out.printf("%s -> %s%n", word, sb.toString());
}
The advantage being that regex is quite smart (though slow), and replacements are not done over replaced text.
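A self-contained sketch of the regex approach above, for anyone who wants to run it. It assumes the search strings are non-empty, sorts them longest-first before building the alternation, and quotes them so regex metacharacters in user input are taken literally. The class name RegexReplacer is made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexReplacer {
    private final Map<String, String> map = new HashMap<>();
    private final Pattern pattern;

    RegexReplacer(String[] search, String[] replace) {
        if (search.length != replace.length) {
            throw new IllegalArgumentException();
        }
        for (int i = 0; i < search.length; i++) {
            map.put(search[i], replace[i]);
        }
        // Longest keys first, so "ch" is tried before "c".
        List<String> keys = new ArrayList<>(map.keySet());
        keys.sort((a, b) -> Integer.compare(b.length(), a.length()));
        StringJoiner alternation = new StringJoiner("|");
        for (String key : keys) {
            alternation.add(Pattern.quote(key));
        }
        pattern = Pattern.compile(alternation.toString());
    }

    // One pass over the input; replaced text is never scanned again.
    String replace(String input) {
        Matcher m = pattern.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(map.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        RegexReplacer reps = new RegexReplacer(
                "ou|u|c|ch|ce|ci".split("\\|"), "u|a|k|c|se|si".split("\\|"));
        System.out.println(reps.replace("couch")); // kuc
    }
}
```

The pattern is compiled once in the constructor, so applying it to several thousand words only pays the matching cost.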
public class Replacements
{
private String[] search; // sorted in descending length and order, eg: sch, ch, c
private String[] replace; // corresponding replacement
Replacements(String[] s, String[] r)
{
if (s.length != r.length)
throw new IllegalArgumentException();
final TreeMap<String, String> map = new TreeMap<String, String>(Collections.reverseOrder());
for (int i = 0; i < s.length; i++)
map.put(s[i], r[i]);
this.search = map.keySet().toArray(new String[map.size()]);
this.replace = map.values().toArray(new String[map.size()]);
}
public String replace(String input)
{
final StringBuilder result = new StringBuilder();
// start of yet-to-be-copied substring
int s = 0;
SEARCH:
for (int i = s; i < input.length(); i++)
{
for (int p = 0; p < this.search.length; p++)
{
if (input.regionMatches(i, this.search[p], 0, this.search[p].length()))
{
// append buffer and replacement
result.append(input, s, i).append(this.replace[p]);
// skip beyond current match and reset buffer
i += this.search[p].length();
s = i--;
continue SEARCH;
}
}
}
if (s == 0) // no matches? no changes!
return input;
// append remaining buffer
return result.append(input, s, input.length()).toString();
}
}

Sort the words and letters in Java

The code below counts how many times the words and letters appeared in the string. How do I sort the output from highest to lowest? The output should be like:
the - 2
quick - 1
brown - 1
fox - 1
t - 2
h - 2
e - 2
b - 1
My code:
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;
public class Tokenizer {
public static void main(String[] args) {
int index = 0;
int tokenCount;
int i = 0;
Map<String, Integer> wordCount = new HashMap<String, Integer>();
Map<Integer, Integer> letterCount = new HashMap<Integer, Integer>();
String message = "The Quick brown fox the";
StringTokenizer string = new StringTokenizer(message);
tokenCount = string.countTokens();
System.out.println("Number of tokens = " + tokenCount);
while (string.hasMoreTokens()) {
String word = string.nextToken().toLowerCase();
Integer count = wordCount.get(word);
Integer lettercount = letterCount.get(word);
if (count == null) {
// this means the word was encountered the first time
wordCount.put(word, 1);
} else {
// word was already encountered we need to increment the count
wordCount.put(word, count + 1);
}
}
for (String words : wordCount.keySet()) {
System.out.println("Word : " + words + " has count :" + wordCount.get(words));
}
for (i = 0; i < message.length(); i++) {
char c = message.charAt(i);
if (c != ' ') {
int value = letterCount.getOrDefault((int) c, 0);
letterCount.put((int) c, value + 1);
}
}
for (int key : letterCount.keySet()) {
System.out.println((char) key + ": " + letterCount.get(key));
}
}
}
You have a Map<String, Integer>; I'd suggest something along the lines of another LinkedHashMap<String, Integer> which is populated by inserting keys that are sorted by value.
It seems that you want to sort the Map by its value (i.e., count). Here are some general solutions.
Specifically for your case, a simple solution might be:
1. Use a TreeSet<Integer> to save all possible values of counts in the HashMap.
2. Iterate the TreeSet from high to low.
3. Inside the iteration mentioned in 2., use a loop to output all word-count pairs whose count equals the currently iterated count.
Please see if this may help.
Just put all your entries into a List and then sort that list.
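As a rough sketch of the list-and-sort idea: copy the map entries into a list and sort it by count, highest first. The names CountSorter, byCountDescending and countWords are invented for this example. A LinkedHashMap is assumed so that words with equal counts stay in first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CountSorter {
    // Returns the entries ordered by count, highest first.
    // List.sort is stable, so ties keep the map's iteration order.
    static <K> List<Map.Entry<K, Integer>> byCountDescending(Map<K, Integer> counts) {
        List<Map.Entry<K, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return entries;
    }

    // Counts whitespace-separated words, case-insensitively, in first-seen order.
    static Map<String, Integer> countWords(String message) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : message.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The Quick brown fox the");
        for (Map.Entry<String, Integer> e : byCountDescending(counts)) {
            System.out.println(e.getKey() + " - " + e.getValue());
        }
    }
}
```

The same byCountDescending call works for a letter-count map, since it is generic in the key type.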

Find and replace String with a substring result

I was asked this recently, and I couldn't figure out the best way. We are trying to replicate Google's search results where the search terms are bolded (using a b tag) in the results.
Input              | Terms | Output
The search is cool | {sea} | The <b>sea</b>rch is cool
Originally, I thought this was pretty easy:
String results(String input, String[] terms)
{
for(String term : terms)
{
input = input.replace(term, "<b>" + term + "</b>");
}
return input;
}
However, this isn't correct. For example:
Input              | Terms         | Output
The search is cool | {sea, search} | The <b>search</b> is cool
I struggled to figure out the best way to approach this. Obviously we can no longer find and replace immediately. I played around with using a Map<Integer,String> where the key is the index returned by input.indexOf(term) and the value is the term, but this seemed potentially unnecessary. Any improvements?
public String results(String input, String[] terms)
{
Map<Integer, String> map = new HashMap<Integer,String>();
for(String term : terms)
{
int index = input.indexOf(term);
if(index >= 0)//if found
{
String value = map.get(index);
if(value == null || value.length() < term.length())//use the longer term
map.put(index, term);
}
}
for(String term: map.values())
{
input = input.replace(term, "<b>" + term + "</b>");
}
return input;
}
Try this
import java.net.*;
import java.util.HashMap;
import java.util.Map;
import java.io.*;
public class main {
public static String results(String input, String[] terms)
{
for(String t : terms)
{
input = input.replace(t, "<b>" + t + "</b>");
}
return input;
}
public static void main(String[] args) {
String [] terms={"sea", "search"};
String s = results("The search is cool ",terms);
System.out.println(s);
String [] terms2={"search", "sea"};
String s2 = results("The search is cool ",terms2);
System.out.println(s2);
}
}
Output
The <b>sea</b>rch is cool
The <b><b>sea</b>rch</b> is cool
In your code, both terms are found at the same index (4), so the second put replaces "sea" with "search" in the hash map itself, because the map is keyed by that index.
Map<Integer, String> map = new HashMap<Integer,String>();
for(String term : terms)
{
int index = input.indexOf(term);
if(index >= 0)//if found
{
String value = map.get(index); //the index is 4 here both the times
if(value == null || value.length() < term.length())
map.put(index, term);//so first time putting string sea at index 4 and in second iteration replacing "sea" to "search" at the same index 4 in hashmap because you want a longer term
}
}
for(String term: map.values())//here getting only one string which is "search"
{
input = input.replace(term, "<b>" + term + "</b>");
}
But if you want the longer term, then your code already works fine as it is.
You might do it with regular expressions.
public static String results(String input, String[] terms) {
String output = input;
Arrays.sort(terms);
for (int i = terms.length - 1; i >= 0; --i) {
String term = terms[i];
output = output.replaceAll("(?<!>)\\b" + term, "<b>" + term + "</b>");
}
// With regular expressions.
// \\b = word boundary, starting at words
// (?<!X) = without preceding X (negative look-behind)
// Converting " searching " to " <b>search</b>ing ",
// Not converting " research ".
return output;
}
The solution is a reverse sort, so that "search" precedes "sea", and checking that no ">" precedes the word (= already replaced; with a longer term).
I have added a word boundary check, that is, terms should be at the beginning of words. This is optional.
Note that the array parameter terms gets sorted in place.
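A single-pass alternative, sketched under a few assumptions: terms are non-empty, the longest term should win when several start at the same position, and terms are matched literally anywhere in the text (no word-boundary check). TermHighlighter and highlight are hypothetical names.

```java
import java.util.Arrays;
import java.util.StringJoiner;
import java.util.regex.Pattern;

final class TermHighlighter {
    static String highlight(String input, String[] terms) {
        if (terms.length == 0) {
            return input;
        }
        String[] sorted = terms.clone(); // leave the caller's array untouched
        // Longest terms first, so "search" is tried before "sea".
        Arrays.sort(sorted, (a, b) -> Integer.compare(b.length(), a.length()));
        StringJoiner alternation = new StringJoiner("|");
        for (String term : sorted) {
            alternation.add(Pattern.quote(term));
        }
        // $0 refers to the whole match, so each hit is wrapped exactly once.
        return Pattern.compile(alternation.toString())
                .matcher(input)
                .replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("The search is cool", new String[] {"sea", "search"}));
    }
}
```

Because the whole input is scanned once, already inserted tags are never matched again, which avoids the nested <b> output seen earlier.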

Best Loop Idiom for special casing the last element

I run into this case a lot of times when doing simple text processing and print statements where I am looping over a collection and I want to special case the last element (for example every normal element will be comma separated except for the last case).
Is there some best practice idiom or elegant form that doesn't require duplicating code or shoving an if/else into the loop?
For example I have a list of strings that I want to print in a comma separated list. (the do while solution already assumes the list has 2 or more elements otherwise it'd be just as bad as the more correct for loop with conditional).
e.g. List = ("dog", "cat", "bat")
I want to print "[dog, cat, bat]"
I present 2 methods:
For loop with conditional
public static String forLoopConditional(String[] items) {
String itemOutput = "[";
for (int i = 0; i < items.length; i++) {
// Check if we're not at the last element
if (i < (items.length - 1)) {
itemOutput += items[i] + ", ";
} else {
// last element
itemOutput += items[i];
}
}
itemOutput += "]";
return itemOutput;
}
do while loop priming the loop
public static String doWhileLoopPrime(String[] items) {
String itemOutput = "[";
int i = 0;
itemOutput += items[i++];
if (i < (items.length)) {
do {
itemOutput += ", " + items[i++];
} while (i < items.length);
}
itemOutput += "]";
return itemOutput;
}
Tester class:
public static void main(String[] args) {
String[] items = { "dog", "cat", "bat" };
System.out.println(forLoopConditional(items));
System.out.println(doWhileLoopPrime(items));
}
In the Java AbstractCollection class it has the following implementation (a little verbose because it contains all edge case error checking, but not bad).
public String toString() {
Iterator<E> i = iterator();
if (! i.hasNext())
return "[]";
StringBuilder sb = new StringBuilder();
sb.append('[');
for (;;) {
E e = i.next();
sb.append(e == this ? "(this Collection)" : e);
if (! i.hasNext())
return sb.append(']').toString();
sb.append(", ");
}
}
I usually write it like this:
static String commaSeparated(String[] items) {
StringBuilder sb = new StringBuilder();
String sep = "";
for (String item: items) {
sb.append(sep);
sb.append(item);
sep = ",";
}
return sb.toString();
}
There are a lot of for loops in these answers, but I find that an Iterator and while loop reads much more easily. E.g.:
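Iterator<String> itemIterator = Arrays.asList(items).iterator();
StringBuilder sb = new StringBuilder("[");
if (itemIterator.hasNext()) {
// special-case first item. in this case, no comma
sb.append(itemIterator.next());
while (itemIterator.hasNext()) {
// process the rest
sb.append(", ").append(itemIterator.next());
}
}
sb.append("]");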
This is the approach taken by Joiner in Google collections and I find it very readable.
String value = "[" + StringUtils.join( items, ',' ) + "]";
My usual take is to test if the index variable is zero, e.g.:
var result = "[ ";
for (var i = 0; i < list.length; ++i) {
if (i != 0) result += ", ";
result += list[i];
}
result += " ]";
But of course, that's only if we talk about languages that don't have some Array.join(", ") method. ;-)
I think it is easier to think of the first element as the special case because it is much easier to know if an iteration is the first rather than the last. It does not take any complex or expensive logic to know if something is being done for the first time.
public static String prettyPrint(String[] items) {
String itemOutput = "[";
boolean first = true;
for (int i = 0; i < items.length; i++) {
if (!first) {
itemOutput += ", ";
}
itemOutput += items[i];
first = false;
}
itemOutput += "]";
return itemOutput;
}
I'd go with your second example, ie. handle the special case outside of the loop, just write it a bit more straightforward:
String itemOutput = "[";
if (items.length > 0) {
itemOutput += items[0];
for (int i = 1; i < items.length; i++) {
itemOutput += ", " + items[i];
}
}
itemOutput += "]";
Java 8 solution, in case someone is looking for it:
String res = Arrays.stream(items).reduce((t, u) -> t + "," + u).get();
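Note that reduce(...).get() throws NoSuchElementException for an empty array. A small sketch of an alternative that uses Collectors.joining with a prefix and suffix and also handles the empty case; the class name BracketJoiner is made up:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class BracketJoiner {
    // Joins the items with ", " and wraps the result in square brackets.
    static String bracketed(String[] items) {
        return Arrays.stream(items).collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(new String[] {"dog", "cat", "bat"})); // [dog, cat, bat]
    }
}
```

For the delimiter alone, String.join(", ", items) does the same job without a stream.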
I like to use a flag for the first item.
ArrayList<String> list = new ArrayList<String>() {{
add("dog");
add("cat");
add("bat");
}};
String output = "[";
boolean first = true;
for(String word: list){
if(!first) output += ", ";
output+= word;
first = false;
}
output += "]";
Since your case is simply processing text, you don't need the conditional inside the loop. A C example:
char* items[] = {"dog", "cat", "bat"};
char output[STRING_LENGTH] = {0};
char* pStr = &output[1];
int i;
output[0] = '[';
for (i=0; i < (sizeof(items) / sizeof(char*)); ++i) {
sprintf(pStr,"%s,",items[i]);
pStr = &output[0] + strlen(output);
}
output[strlen(output)-1] = ']';
Instead of adding a conditional to avoid generating the trailing comma, go ahead and generate it (to keep your loop simple and conditional-free) and simply overwrite it at the end. Many times, I find it clearer to generate the special case just like any other loop iteration and then manually replace it at the end (although if the "replace it" code is more than a couple of lines, this method can actually become harder to read).
...
String[] items = { "dog", "cat", "bat" };
String res = "[";
for (String s : items) {
res += (res.length() == 1 ? "" : ", ") + s;
}
res += "]";
or so is quite readable. You can put the conditional in a separate if clause, of course. What makes it idiomatic (I think so, at least) is that it uses a foreach loop and does not use a complicated loop header.
Also, no logic is duplicated (i.e. there is only one place where an item from items is actually appended to the output string - in a real world application this might be a more complicated and lengthy formatting operation, so I wouldn't want to repeat the code).
In this case, you are essentially concatenating a list of strings using some separator string. You can maybe write something yourself which does this. Then you will get something like:
String[] items = { "dog", "cat", "bat" };
String result = "[" + joinListOfStrings(items, ", ") + "]";
with
public static String joinListOfStrings(String[] items, String sep) {
StringBuffer result = new StringBuffer();
for (int i=0; i<items.length; i++) {
result.append(items[i]);
if (i < items.length-1) result.append(sep);
}
return result.toString();
}
If you have a Collection instead of a String[] you can also use iterators and the hasNext() method to check if this is the last or not.
If you are building a string dynamically like that, you shouldn't be using the += operator.
The StringBuilder class works much better for repeated dynamic string concatenation.
public String commaSeparate(String[] items, String delim){
StringBuilder bob = new StringBuilder();
for(int i=0;i<items.length;i++){
bob.append(items[i]);
if(i+1<items.length){
bob.append(delim);
}
}
return bob.toString();
}
Then call it like this
String[] items = {"one","two","three"};
StringBuilder bob = new StringBuilder();
bob.append("[");
bob.append(commaSeparate(items,","));
bob.append("]");
System.out.print(bob.toString());
Generally, my favourite is the multi-level exit. Change
for ( s1; exit-condition; s2 ) {
doForAll();
if ( !modified-exit-condition )
doForAllButLast();
}
to
for ( s1;; s2 ) {
doForAll();
if ( modified-exit-condition ) break;
doForAllButLast();
}
It eliminates any duplicate code or redundant checks.
Your example:
for (int i = 0;; i++) {
itemOutput.append(items[i]);
if ( i == items.length - 1) break;
itemOutput.append(", ");
}
It works for some things better than others. I'm not a huge fan of this for this specific example.
Of course, it gets really tricky for scenarios where the exit condition depends on what happens in doForAll() and not just s2. Using an Iterator is such a case.
Here's a paper from the prof that shamelessly promoted it to his students :-). Read section 5 for exactly what you're talking about.
I think there are two answers to this question: the best idiom for this problem in any language, and the best idiom for this problem in java. I also think the intent of this problem wasn't the tasks of joining strings together, but the pattern in general, so it doesn't really help to show library functions that can do that.
Firstly though the actions of surrounding a string with [] and creating a string separated by commas are two separate actions, and ideally would be two separate functions.
For any language, I think the combination of recursion and pattern matching works best. For example, in haskell I would do this:
join [] = ""
join [x] = x
join (x:xs) = concat [x, ",", join xs]
surround before after str = concat [before, str, after]
yourFunc = surround "[" "]" . join
-- example usage: yourFunc ["dog", "cat"] will output "[dog,cat]"
The benefit of writing it like this is it clearly enumerates the different situations that the function will face, and how it will handle it.
Another very nice way to do this is with an accumulator type function. Eg:
join [] = ""
join strings = foldr1 (\a b -> concat [a, ",", b]) strings
This can be done in other languages as well, e.g. C#:
public static string Join(List<string> strings)
{
if (!strings.Any()) return string.Empty;
return strings.Aggregate((acc, val) => acc + "," + val);
}
Not very efficient in this situation, but can be useful in other cases (or efficiency may not matter).
Unfortunately, Java can't express either of those approaches as directly (it has no pattern matching on lists, and lambdas only arrived in Java 8). So in this case I think the best way is to have checks at the top of the function for the special cases (0 or 1 elements), and then use a for loop to handle the case with more than 1 element:
public static String join(String[] items) {
if (items.length == 0) return "";
if (items.length == 1) return items[0];
StringBuilder result = new StringBuilder();
for(int i = 0; i < items.length - 1; i++) {
result.append(items[i]);
result.append(",");
}
result.append(items[items.length - 1]);
return result.toString();
}
This function clearly shows what happens in the two edge cases (0 or 1 elements). It then uses a loop for all but the last elements, and finally adds the last element on without a comma. The inverse way of handling the non-comma element at the start is also easy to do.
Note that the if (items.length == 1) return items[0]; line isn't actually necessary; however, I think it makes what the function does easier to see at a glance.
(If anyone wants more explanation of the Haskell/C# functions, ask and I'll add it in.)
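For completeness, the "inverse" arrangement mentioned above (treat the first element specially instead of the last) might look like the sketch below. JoinFirstSketch is just an illustrative name, and the separator is fixed to "," as in the function above.

```java
class JoinFirstSketch {
    // Append the first element without a separator, then ",item" for each of the rest.
    static String join(String[] items) {
        if (items.length == 0) return "";
        StringBuilder result = new StringBuilder(items[0]);
        for (int i = 1; i < items.length; i++) {
            result.append(",").append(items[i]);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"dog", "cat"})); // prints "dog,cat"
    }
}
```

With this ordering the one-element case needs no special handling, because the loop simply doesn't run.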
This can be done with Java 8 streams and Collectors.joining(), which takes a delimiter, a prefix and a suffix:
List<String> items = Arrays.asList("dog", "cat", "bat");
String result = items.stream().collect(Collectors.joining(", ", "[", "]"));
System.out.println(result);
I usually write a for loop like this:
public static String forLoopConditional(String[] items) {
StringBuilder builder = new StringBuilder();
builder.append("[");
for (int i = 0; i < items.length - 1; i++) {
builder.append(items[i]).append(", ");
}
if (items.length > 0) {
builder.append(items[items.length - 1]);
}
builder.append("]");
return builder.toString();
}
If you are just looking for a comma-separated list like this: "[The, Cat, in, the, Hat]", don't even waste time writing your own method. Just use List.toString:
List<String> strings = Arrays.asList("The", "Cat", "in", "the", "Hat");
System.out.println(strings.toString());
Provided the element type of the List has a toString() that returns the value you want to display, just call List.toString:
public class Dog {
private String name;
public Dog(String name){
this.name = name;
}
public String toString(){
return name;
}
}
Then you can do:
List<Dog> dogs = Arrays.asList(new Dog("Frank"), new Dog("Hal"));
System.out.println(dogs);
And you'll get:
[Frank, Hal]
A third alternative is the following:
StringBuilder output = new StringBuilder();
for (int i = 0; i < items.length - 1; i++) {
output.append(items[i]);
output.append(",");
}
if (items.length > 0) output.append(items[items.length - 1]);
But the best is to use a join()-like method. For Java there's StringUtils.join in third-party libraries such as Apache Commons Lang; that way your code becomes:
StringUtils.join(items,',');
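If you are on Java 8 or later, the standard library's String.join covers the simple case without any extra dependency. A small sketch follows; the StringJoinSketch class and the bracketed helper are names I made up for the example.

```java
import java.util.Arrays;
import java.util.List;

class StringJoinSketch {
    // Hypothetical helper: joins with ", " and wraps the result in brackets.
    static String bracketed(String[] items) {
        return "[" + String.join(", ", items) + "]";
    }

    public static void main(String[] args) {
        String[] items = {"dog", "cat", "bat"};
        System.out.println(bracketed(items)); // prints "[dog, cat, bat]"

        // String.join also accepts any Iterable of CharSequence, such as a List<String>.
        List<String> list = Arrays.asList("one", "two");
        System.out.println(String.join(",", list)); // prints "one,two"
    }
}
```

An empty array gives "[]" here, since String.join returns an empty string when there are no elements.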
FWIW, the join() method (line 3232 onwards) in Apache Commons does use an if within a loop though:
public static String join(Object[] array, char separator, int startIndex, int endIndex) {
if (array == null) {
return null;
}
int bufSize = (endIndex - startIndex);
if (bufSize <= 0) {
return EMPTY;
}
bufSize *= ((array[startIndex] == null ? 16 : array[startIndex].toString().length()) + 1);
StringBuilder buf = new StringBuilder(bufSize);
for (int i = startIndex; i < endIndex; i++) {
if (i > startIndex) {
buf.append(separator);
}
if (array[i] != null) {
buf.append(array[i]);
}
}
return buf.toString();
}
