I have a simple, general question regarding a real small issue that bothers me:
I'm printing a list of elements on the fly, so I don't have prior knowledge about the number of printed elements. I want a simple format where the elements are separated by a comma (elem1, elem2...) or something similar. Now, if I use a simple loop like:
while(elements to check) {
if (elem should be printed) {
print elem . ","
}
}
I get a comma after the last element...
I know this sounds quite stupid, but is there a way to handle this?
Let's assume that "should be printed" means "contains at least one non-whitespace character". In Perl, the idiomatic way to write this would be (you'll need to adjust the grep to taste):
print join "," => grep { /\S/ } @elements;
The "grep" is like "filter" in other languages and the /\S/ is a regex matching one non-whitespace character. Only elements matching the expression are returned. The "join" should be self-explanatory.
perldoc -f join
perldoc -f grep
Having all your data in an array and then calling
print join(',', @yourarray)
is a good approach.
You can also build the string by concatenation in a loop (here @elements stands for whatever you are iterating over):
my $eltToPrint = '';
foreach my $elt (@elements) {
    $eltToPrint .= $elt . ',';
}
and then remove the last comma with a regex:
$eltToPrint =~ s/,$//;
PS: this also works if you put the comma at the beginning:
$eltToPrint =~ s/^,//;
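For Java readers, the same append-then-trim idea could be sketched roughly as below. The class and method names (TrimTrailing, joinByTrimming) are made up for illustration, and "should be printed" is assumed here to mean "not blank".

```java
import java.util.List;

public class TrimTrailing {
    // Appends "elem," for every kept element, then drops the final comma.
    static String joinByTrimming(List<String> elements) {
        StringBuilder sb = new StringBuilder();
        for (String elem : elements) {
            if (!elem.trim().isEmpty()) {       // assumed filter: skip blank elements
                sb.append(elem).append(',');
            }
        }
        if (sb.length() > 0) {
            sb.setLength(sb.length() - 1);      // cut off the trailing comma
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinByTrimming(List.of("a", " ", "b", "c"))); // prints: a,b,c
    }
}
```

The length check matters: if nothing was appended, there is no comma to remove.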
Before Java 8, Java did not have a built-in join (newer versions offer String.join), but if you don't want to reinvent the wheel, you can use Guava's Joiner. It can skipNulls, or useForNull(something).
An object which joins pieces of text (specified as an array, Iterable, varargs or even a Map) with a separator. It either appends the results to an Appendable or returns them as a String. Example:
Joiner joiner = Joiner.on("; ").skipNulls();
return joiner.join("Harry", null, "Ron", "Hermione");
This returns the string "Harry; Ron; Hermione". Note that all input elements are converted to strings using Object.toString() before being appended.
Why not add a comma BEFORE each element (but the first one)? Pseudo-code:
is_first = true
loop element over element_array
BEGIN LOOP
if (! is_first)
print ","
end if
print element
is_first = false
END
print NEWLINE
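The pseudo-code above might translate to Java along these lines. This is only a sketch: the names SeparatorFirst and joinOnTheFly are invented for the example, and an Iterator is used to stress that the number of elements need not be known in advance.

```java
import java.util.Arrays;
import java.util.Iterator;

public class SeparatorFirst {
    // Consumes elements as they arrive; the total count is never needed.
    static String joinOnTheFly(Iterator<String> elements) {
        StringBuilder out = new StringBuilder();
        boolean isFirst = true;
        while (elements.hasNext()) {
            String element = elements.next();
            if (!isFirst) {
                out.append(",");    // separator goes BEFORE every element except the first
            }
            out.append(element);
            isFirst = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("elem1", "elem2", "elem3").iterator();
        System.out.println(joinOnTheFly(it)); // prints: elem1,elem2,elem3
    }
}
```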
I guess the simplest way is to create a new array containing only the elements from the original array that you need to print (i.e. a filter operation). Then print the newly created array, preferably using your language's built-in array/vector print/join function.
(In Perl)
@orig=("a","bc","d","ef","g");
@new_list=();
for $x (@orig){
    push(@new_list,$x) if (length($x)==1);
}
print join(',',@new_list)."\n";
(In Java)
List<String> orig=Arrays.asList(new String[]{"a","bc","d","ef","g"});
List<String> new_list=new ArrayList<String>();
for(String x: orig){
if (x.length()==1)
new_list.add(x);
}
System.out.println(new_list);
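On Java 8 or later, the filter-then-join idea can probably be written more compactly with streams. The following is a sketch under that assumption; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterThenJoin {
    // Keeps only one-character strings and joins them with commas.
    static String joinSingleChars(List<String> orig) {
        return orig.stream()
                .filter(x -> x.length() == 1)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        List<String> orig = Arrays.asList("a", "bc", "d", "ef", "g");
        System.out.println(joinSingleChars(orig)); // prints: a,d,g
    }
}
```

Collectors.joining only inserts the separator between elements, so no trailing comma can appear.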
You have several options depending on language.
e.g. in JavaScript just do:
var prettyString = someArray.join(', ');
in PHP you can implode()
$someArray = array('apple', 'orange', 'pear');
$prettyString = implode(",", $someArray);
If all else fails, you can either add the comma after every entry and trim the last one when done, or check in your while/foreach loop (bad for perf) whether this is the last item, and add a comma only if it is not.
update: since you noted Java... you could create a method like this:
public static String join(String[] strings, String separator) {
StringBuffer sb = new StringBuffer();
for (int i=0; i < strings.length; i++) {
if (i != 0) sb.append(separator);
sb.append(strings[i]);
}
return sb.toString();
}
update 2: sounds like you really want this then if you are not outputting every element (pseudo-code):
first = true;
for(item in list){
if(item meets condition){
if(!first){
print ", ";
} else {
first = false;
}
print item;
}
}
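In Java 8 and later, java.util.StringJoiner does the "first element" bookkeeping for you, which suits the case where some items are skipped. A minimal sketch, assuming that "meets condition" means "is not empty" (the class and method names are made up):

```java
import java.util.List;
import java.util.StringJoiner;

public class JoinerSketch {
    // Adds only the items that pass the check; StringJoiner places the separators.
    static String joinMatching(List<String> items) {
        StringJoiner joiner = new StringJoiner(", ");
        for (String item : items) {
            if (!item.isEmpty()) {      // assumed condition, adjust to taste
                joiner.add(item);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinMatching(List.of("x", "", "y"))); // prints: x, y
    }
}
```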
I'm trying to replace/remove certain characters within a String ArrayList. Here is the code that I've already tried:
for (int i = 0; i < newList.size(); i++) {
if (newList.get(i).contains(".")) {
newList.get(i).replace(".", "");
}
}
I tried iterating through the ArrayList and, if any of the elements contained a ".", replacing it with nothing. For example, if one of the words in my ArrayList was "Works.", then the resulting output would be "Works". Unfortunately, this code segment does not seem to work. I did something similar to remove any elements which contain numbers and it worked fine, but I'm not sure whether specific syntax or methods need to be used when replacing just a single character in a String while still keeping the ArrayList element.
newList.get(i).replace(".", "");
This doesn't update the list element - you construct the replaced string, but then discard that string.
You could use set:
newList.set(i, newList.get(i).replace(".", ""));
Or you could use a ListIterator:
ListIterator<String> it = newList.listIterator();
while (it.hasNext()) {
String s = it.next();
if (s.contains(".")) {
it.set(s.replace(".", ""));
}
}
but a better way would be to use replaceAll, with no need to loop explicitly:
newList.replaceAll(s -> s.replace(".", ""));
There's no need to check contains either: replace won't do anything if the search string is not present.
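As a self-contained illustration of the replaceAll approach, something like the following should work on Java 8+. The wrapper class and method (StripDots, stripDots) are hypothetical, and the list is copied first only so that an immutable input list can be passed in.

```java
import java.util.ArrayList;
import java.util.List;

public class StripDots {
    // Returns a copy of the list with every "." removed from each element.
    static List<String> stripDots(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.replaceAll(s -> s.replace(".", ""));   // List.replaceAll stores the new strings
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(stripDots(List.of("Works.", "fine", "a.b."))); // prints: [Works, fine, ab]
    }
}
```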
I need to be able to turn a string, for instance "This and <those> are.", into a string array of the form ["This and ", "<those>", " are."]. I have been trying to using the String.split() command, and I've gotten this regex:
"(?=[<>])"
However, this just gets me ["This and ", "<those", "> are."]. I can't figure out a good regex to get the brackets all on the same element, and I also can't have spaces between those brackets. So for instance, "This and <hey there> are." Should be simply split to ["This and <hey there> are."]. Ideally I'd like to just rely solely on the split command for this operation. Can anyone point me in the right direction?
Not actually possible with split alone: since the 'separator' needs to match 0 characters, it has to be all lookahead/lookbehind, and lookbehind in Java needs a bounded length. You would have to look arbitrarily far through the string to know whether a space is going to occur before the closing bracket, so what you want is impossible.
Just write a regexp that FINDS the construct you want; that's a lot simpler. Simply use Pattern.compile("<\\w+>") (taking a few liberties on what you intend a thing-in-brackets to look like; if it truly can be ANYTHING except spaces and the closing bracket, "<[^ >]+>" is what you want).
Then, just loop through, finding as you go:
private static final Pattern TOKEN_FINDER = Pattern.compile("<\\w+>");
List<String> parse(String in) {
Matcher m = TOKEN_FINDER.matcher(in);
if (!m.find()) return List.of(in);
var out = new ArrayList<String>();
int pos = 0;
do {
int s = m.start();
if (s > pos) out.add(in.substring(pos, s));
out.add(m.group());
pos = m.end();
} while (m.find());
if (pos < in.length()) out.add(in.substring(pos));
return out;
}
Let's try it:
System.out.println(parse("This and <those> are."));
System.out.println(parse("This and <hey there> are."));
System.out.println(parse("<edgecase>"));
System.out.println(parse("<edgecase>2"));
System.out.println(parse("3<edgecase>"));
prints:
[This and , <those>,  are.]
[This and <hey there> are.]
[<edgecase>]
[<edgecase>, 2]
[3, <edgecase>]
seems like what you wanted.
I have an arraylist of type String with many words, and in some cases they are just single letters. Such as the letter "K".
I am essentially trying to remove all single instance characters, EXCEPT "A" and "I".
Here is the code/regex I was trying, to no avail:
//removing all single letters
ArrayList<String> newList2 = new ArrayList<String>();
for(String word : words) {
newList2.add(word.replace("[BCDEFGHJKLMOPQRSTUVWXYZ]", ""));
}
words = newList2;
Should I not use regex? Is there a better method, or is there a way I am not using regex correctly? From my understanding my implementation, if it even worked, would only replace it with an empty spot, not completely remove the element.. my goal is to remove the element entirely if it exists, perhaps by the .remove method... Not sure how to go about this. (JAVA)
(P.S, ideally I would also remove the "=" and other symbols if they are apparent, but characters is my gripe at the moment)
No need to use stream api for it. List#removeIf will suffice here:
list.removeIf(s -> s.length() == 1 && !List.of("A", "I").contains(s));
Note: It is a mutative operation.
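A runnable sketch of the removeIf approach, assuming Java 9+ for Set.of. The names SingleLetterFilter, KEEP and dropSingleLetters are invented for the example. Because the check is on length rather than on letters, single symbols such as "=" are removed as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SingleLetterFilter {
    // One-character words that are allowed to stay.
    private static final Set<String> KEEP = Set.of("A", "I");

    // Returns a copy without one-character entries, except those in KEEP.
    static List<String> dropSingleLetters(List<String> words) {
        List<String> result = new ArrayList<>(words);
        result.removeIf(s -> s.length() == 1 && !KEEP.contains(s));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dropSingleLetters(List.of("K", "A", "word", "I", "=", "b")));
        // prints: [A, word, I]
    }
}
```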
A solution with loop:
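// iterate backwards so that removing an element does not skip the next one
for (int i = newList2.size() - 1; i >= 0; i--) {
    String s = newList2.get(i);
    // both conditions must hold: not "A" AND not "I"
    if (s.length() == 1 && !s.equals("A") && !s.equals("I")) {
        newList2.remove(i);
    }
}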
I'm trying to create a program that can abbreviate certain words in a string given by the user.
This is how I've laid it out so far:
Create a hashmap from a .txt file such as the following:
thanks,thx
your,yr
probably,prob
people,ppl
Take a string from the user
Split the string into words
Check the hashmap to see if that word exists as a key
Use hashmap.get() to return the key value
Replace the word with the key value returned
Return an updated string
It all works perfectly fine until I try to update the string:
public String shortenMessage( String inMessage ) {
String updatedstring = "";
String rawstring = inMessage;
String[] words = rawstring.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\\s+");
for (String word : words) {
System.out.println(word);
if (map.containsKey(word) == true) {
String x = map.get(word);
updatedstring = rawstring.replace(word, x);
}
}
System.out.println(updatedstring);
return updatedstring;
}
Input:
thanks, your, probably, people
Output:
thanks, your, probably, ppl
Does anyone know how I can update all the words in the string?
Thanks in advance
updatedstring = rawstring.replace(word, x);
This keeps overwriting updatedstring with rawstring plus a single replacement, so only the last replacement survives.
You need to do something like
updatedstring = rawstring;
...
updatedstring = updatedstring.replace(word, x);
Edit:
That is the solution to the problem you are seeing but there are a few other problems with your code:
Your replacement won't work for words that you had to lowercase or strip characters from. You create the words array that you iterate over from an altered version of your rawstring. Then you go back and try to replace the altered versions in your original rawstring, where they don't exist. This will not find the words you think you are replacing.
If you are doing global replacements, you could just create a set of words instead of an array since once the word is replaced, it shouldn't come up again.
You might want to replace the words one at a time, because your global replacement could cause weird bugs where a word in the replacement map is a sub-word of another replacement word. Instead of using String.replace, make an array/list of words, iterate over the words, replace the element in the list if needed, and join them. In Java 8:
String.join(" ", elements);
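Put together, the word-by-word approach might look like the sketch below. The class name, method name and map parameter are assumptions for illustration. Note one simplification: when a word is replaced, any punctuation attached to it (such as a trailing comma) is dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class Abbreviator {
    // Replaces each whitespace-separated word that has an abbreviation; other words stay as typed.
    static String shorten(String message, Map<String, String> abbreviations) {
        List<String> out = new ArrayList<>();
        for (String word : message.split("\\s+")) {
            // Look up a lower-cased, letters-only form of the word.
            String key = word.replaceAll("[^a-zA-Z]", "").toLowerCase();
            String replacement = abbreviations.get(key);
            out.add(replacement != null ? replacement : word);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        Map<String, String> map = Map.of(
                "thanks", "thx", "your", "yr", "probably", "prob", "people", "ppl");
        System.out.println(shorten("thanks, your, probably, people", map));
        // prints: thx yr prob ppl
    }
}
```

Because each word is looked up as a whole, an abbreviation can never be applied to part of a longer word.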
I have a string. What is the best way to put the things in between the $ signs into a list in Java?
String temp = "$abc$and$xyz$";
How can I get all the variables within the $ signs as a list in Java?
[abc, xyz]
i can do using stringtokenizer but want to avoid using it if possible.
thx
Maybe you could think about calling String.split(String regex) ...
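A minimal sketch of the split idea follows; DollarSplit and tokens are hypothetical names. Note that split treats every $ as a delimiter, so for the example string it also returns "and", and it produces a leading empty string that is filtered out here.

```java
import java.util.ArrayList;
import java.util.List;

public class DollarSplit {
    // Splits on "$" and drops the empty pieces.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        for (String part : text.split("\\$")) {   // "$" must be escaped in a regex
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("$abc$and$xyz$")); // prints: [abc, and, xyz]
    }
}
```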
The pattern is simple enough that String.split should work here, but in the more general case, one alternative for StringTokenizer is the much more powerful java.util.Scanner.
String text = "$abc$and$xyz$";
Scanner sc = new Scanner(text);
while (sc.findInLine("\\$([^$]*)\\$") != null) {
System.out.println(sc.match().group(1));
} // abc, xyz
The pattern to find is:
\$([^$]*)\$
  \_____/     i.e. literal $, a sequence of anything but $ (captured in group 1)
     1             and another literal $
The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.
(…) is used for grouping. (pattern) is a capturing group and creates a backreference.
The backslash preceding the $ (outside of character class definition) is used to escape the $, which has a special meaning as the end of line anchor. That backslash is doubled in a String literal ("\\" is a String of length one containing a backslash).
This is not a typical usage of Scanner (usually the delimiter pattern is set, and tokens are extracted using next), but it does show how you'd use findInLine to find an arbitrary pattern (ignoring delimiters), and then use match() to access the MatchResult, from which you can get individual group captures.
You can also use this Pattern in a Matcher find() loop directly.
Matcher m = Pattern.compile("\\$([^$]*)\\$").matcher(text);
while (m.find()) {
System.out.println(m.group(1));
} // abc, xyz
Related questions
Validating input using java.util.Scanner
Scanner vs. StringTokenizer vs. String.Split
Just try this one: temp.split("\\$");
I would go for a regex myself, like Riduidel said.
This special case is, however, simple enough that you can just treat the String as a character sequence, and iterate over it char by char, and detect the $ sign. And so grab the strings yourself.
On a side note, I would try to go for different demarcation characters, to make it more readable to humans. Use $ as start-of-sequence and something else as end-of-sequence, for instance. Or something like what I think the Bash shell uses: ${some_value}. As said, the computer doesn't care, but you debugging your string just might :)
As for an appropriate regex, something like (\\$.*\\$)* or so should do. Though I'm no expert on regexes (see http://www.regular-expressions.info for nice info on regexes).
Basically I'd ditto Khotyn as the easiest solution. I see you post on his answer that you don't want zero-length tokens at beginning and end.
That brings up the question: What happens if the string does not begin and end with $'s? Is that an error, or are they optional?
If it's an error, then just start with:
if (!text.startsWith("$") || !text.endsWith("$"))
return "Missing $'s"; // or whatever you do on error
If that passes, fall into the split.
If the $'s are optional, I'd just strip them out before splitting. i.e.:
if (text.startsWith("$"))
text=text.substring(1);
if (text.endsWith("$"))
text=text.substring(0,text.length()-1);
Then do the split.
Sure, you could make more sophisticated regex's or use StringTokenizer or no doubt come up with dozens of other complicated solutions. But why bother? When there's a simple solution, use it.
PS There's also the question of what result you want to see if there are two $'s in a row, e.g. "$foo$$bar$". Should that give ["foo","bar"], or ["foo","","bar"]? Khotyn's split will give the second result, with zero-length strings. If you want the first result, you should split("\\$+").
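The difference between the two patterns can be checked with a short sketch like this one. It assumes the outer $'s are optional and stripped first, as described above; RepeatedDelimiters and stripOuter are made-up names.

```java
import java.util.Arrays;

public class RepeatedDelimiters {
    // Removes one leading and one trailing "$" if present.
    static String stripOuter(String text) {
        if (text.startsWith("$")) {
            text = text.substring(1);
        }
        if (text.endsWith("$")) {
            text = text.substring(0, text.length() - 1);
        }
        return text;
    }

    public static void main(String[] args) {
        String inner = stripOuter("$foo$$bar$");                    // foo$$bar
        System.out.println(Arrays.toString(inner.split("\\$")));   // prints: [foo, , bar]
        System.out.println(Arrays.toString(inner.split("\\$+")));  // prints: [foo, bar]
    }
}
```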
If you want a simple split function then use Apache Commons Lang which has StringUtils.split. The java one uses a regex which can be overkill/confusing.
You can do it in simple manner writing your own code.
Just use the following code and it will do the job for you
import java.util.ArrayList;
import java.util.List;
public class MyStringTokenizer {
/**
* @param args
*/
public static void main(String[] args) {
List <String> result = getTokenizedStringsList("$abc$efg$hij$");
for(String token : result)
{
System.out.println(token);
}
}
private static List<String> getTokenizedStringsList(String string) {
    List<String> tokenList = new ArrayList<String>();
    char[] in = string.toCharArray();
    int stringLength = in.length;
    for (int i = 0; i < stringLength;) {
        StringBuilder myBuilder = new StringBuilder();
        // skip ahead to the next '$', then step past it
        while (i < stringLength && in[i] != '$')
            i++;
        i++;
        // collect characters up to the following '$' (or the end of the input)
        while (i < stringLength && in[i] != '$') {
            myBuilder.append(in[i]);
            i++;
        }
        // ignore empty tokens, such as the one produced after the final '$'
        if (myBuilder.length() > 0)
            tokenList.add(myBuilder.toString());
    }
    return tokenList;
}
}
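You can use
String temp = "$abc$and$xyz$";
String array[] = temp.split(Pattern.quote("$"));
List<String> list = new ArrayList<String>();
for (int i = 0; i < array.length; i++) {
    // split yields a leading empty string because temp starts with "$"
    if (!array[i].isEmpty()) {
        list.add(array[i]);
    }
}
Now the list holds the tokens between the $ signs (abc, and, xyz).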