Efficient and non-interfering way of replacing multiple substrings in a String - java

I'm trying to apply the same replacement instructions several thousand times to different input strings with as little overhead as possible. I need to consider two things for this:
The search Strings aren't necessarily all the same length: one may be just "a", another might be "ch", yet another might be "sch"
What was already replaced shall not be replaced again: If the replacement patterns are [a->e; e->a], "beat" should become "baet", not "baat" or "beet".
With that in mind, this is the code I came up with:
public class Replacements {
private String[] search;
private String[] replace;
Replacements(String[] s, String[] r)
{
if (s.length!=r.length) throw new IllegalArgumentException();
Map<String,String> map = new HashMap<String,String>();
for (int i=0;i<s.length;i++)
{
map.put(s[i], r[i]);
}
List<String> sortedKeys = new ArrayList<>(map.keySet());
Collections.sort(sortedKeys, new StringLengthComparator());
this.search = sortedKeys.toArray(new String[0]);
Stack<String> r2 = new Stack<>();
sortedKeys.stream().forEach((i) -> {
r2.push(map.get(i));
});
this.replace = r2.toArray(new String[0]);
}
public String replace(String input)
{
return replace(input,0);
}
private String replace(String input,int i)
{
String out = "";
List<String> parts = Arrays.asList(input.split(this.search[i],-1));
for (Iterator it = parts.iterator(); it.hasNext();)
{
String part = it.next().toString();
if (part.length()>0 && i<this.search.length-1) out += replace(part,i+1);
if (it.hasNext()) out += this.replace[i];
}
return out;
}
}
And then
String[] words;
//fill variable words
String[] s_input = "ou|u|c|ch|ce|ci".split("\\|",-1);
String[] r_input = "u|a|k|c|se|si".split("\\|",-1);
Replacements reps = new Replacements(s_input,r_input);
for (String word : words) {
System.out.println(reps.replace(word));
}
(s_input and r_input would be up to the user, so they're just examples, just like the program wouldn't actually use println())
This code makes sure longer search strings get looked for first and also covers the second condition above.
It is, however, quite costly. What would be the most efficient way to accomplish what I'm doing here (especially if the number of Strings in words is significantly large)?
With my current code, "couch" should be converted into "kuc" (except it doesn't, apparently; it now does, thanks to the -1 in split(p,-1))

This is not a full solution but it shows how to scan the input and find all target substrings in one pass. You would use a StringBuilder to assemble the result, looking up the replacements in a Map as you are currently doing. Use the start and end indexes to handle copying of non-matching segments.
public static void main(String[] args) throws Exception
{
Pattern p = Pattern.compile("(ou|ch|ce|ci|u|c)");
Matcher m = p.matcher("auouuchcceaecxici");
while (m.find())
{
MatchResult r = m.toMatchResult();
System.out.printf("s=%d e=%d '%s'\n", r.start(), r.end(), r.group());
}
}
Output:
s=1 e=2 'u'
s=2 e=4 'ou'
s=4 e=5 'u'
s=5 e=7 'ch'
s=7 e=8 'c'
s=8 e=10 'ce'
s=12 e=13 'c'
s=15 e=17 'ci'
Note the strings in the regex have to be sorted in order of descending length to work correctly.
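The loop above only reports match positions. One way to finish it, offered as a sketch rather than the answerer's own code, is to copy the unmatched text between matches into a StringBuilder and append the mapped replacement for each match. The class name SinglePassReplacer and its methods are hypothetical; the keys are sorted longest-first and passed through Pattern.quote so that they are treated as literal text, and the map is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
class SinglePassReplacer {
    private final Pattern pattern;
    private final Map<String, String> map;

    SinglePassReplacer(Map<String, String> map) {
        this.map = map;
        // Longest keys first, so "ch" is tried before "c".
        List<String> keys = new ArrayList<>(map.keySet());
        keys.sort((a, b) -> Integer.compare(b.length(), a.length()));
        // Pattern.quote makes every key match as literal text.
        this.pattern = Pattern.compile(
                keys.stream().map(Pattern::quote).collect(Collectors.joining("|")));
    }

    String replace(String input) {
        StringBuilder out = new StringBuilder(input.length());
        Matcher m = pattern.matcher(input);
        int copiedUpTo = 0; // end index of the previous match
        while (m.find()) {
            out.append(input, copiedUpTo, m.start()); // unmatched segment
            out.append(map.get(m.group()));           // replacement for this match
            copiedUpTo = m.end();
        }
        out.append(input, copiedUpTo, input.length()); // tail after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        SinglePassReplacer r = new SinglePassReplacer(Map.of("a", "e", "e", "a"));
        System.out.println(r.replace("beat"));
    }
}
```

Because the output is assembled from the original input only, text that has already been replaced is never scanned again, which is the second requirement in the question.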

One could make a regex pattern from the keys and leave it to that module for optimization.
Obviously
"(ou|u|ch|ce|ci|c)"
needs to take care of ce/ci/c, either by reverse sorting or immediately as tree:
"(c(e|h|i)?|ou|u)"
Then
String soughtKeys = "ou|u|ch|ce|ci|c"; // c last
String replacements = "u|a|c|se|si|k";
Map<String, String> map = new HashMap<>();
... fill map
Pattern pattern = Pattern.compile("(" + soughtKeys + ")");
for (String word : words) {
StringBuffer sb = new StringBuffer();
Matcher m = pattern.matcher(word);
while (m.find()) {
m.appendReplacement(sb, map.get(m.group()));
}
m.appendTail(sb);
System.out.printf("%s -> %s%n", word, sb.toString());
}
The advantage being that regex is quite smart (though slow), and replacements are not done over replaced text.

public class Replacements
{
private String[] search; // sorted in descending length and order, eg: sch, ch, c
private String[] replace; // corresponding replacement
Replacements(String[] s, String[] r)
{
if (s.length != r.length)
throw new IllegalArgumentException();
final TreeMap<String, String> map = new TreeMap<String, String>(Collections.reverseOrder());
for (int i = 0; i < s.length; i++)
map.put(s[i], r[i]);
this.search = map.keySet().toArray(new String[map.size()]);
this.replace = map.values().toArray(new String[map.size()]);
}
public String replace(String input)
{
final StringBuilder result = new StringBuilder();
// start of yet-to-be-copied substring
int s = 0;
SEARCH:
for (int i = s; i < input.length(); i++)
{
for (int p = 0; p < this.search.length; p++)
{
if (input.regionMatches(i, this.search[p], 0, this.search[p].length()))
{
// append buffer and replacement
result.append(input, s, i).append(this.replace[p]);
// skip beyond current match and reset buffer
i += this.search[p].length();
s = i--;
continue SEARCH;
}
}
}
if (s == 0) // no matches? no changes!
return input;
// append remaining buffer
return result.append(input, s, input.length()).toString();
}
}

Related

How do i ignore same words in a string (JAVA)

I want to find how many words there are in a string, counting repeated words only once.
For example, the main method should return 8 instead of 9.
I want it to be a method that takes one parameter s of type String and returns an int value. I'm only allowed to use the basics, so no HashMaps or ArrayLists; only charAt, length, or substring, plus loops and if statements.
public static void main(String[] args) {
countUniqueWords("A long long time ago, I can still remember");
}
public static int countUniqueWords(String str) {
char[] sentence = str.toCharArray();
boolean inWord = false;
int wordCt = 0;
for (char c : sentence) {
if (c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z') {
if (!inWord) {
wordCt++;
inWord = true;
}
} else {
inWord = false;
}
}
return wordCt;
}
Don't restrict yourself to limited options; learn the Stream API. Your question is as simple as:
public static long countUniqueWords(String str) {
var str2 = str.replaceAll("[^a-zA-Z0-9 ]", "").replaceAll(" +", " ");
return Arrays.stream(str2.split(" "))
.distinct()
.count();
}
[Optional step] Get rid of all non-alphanumeric characters
Split the string on spaces
Remove duplicates
Count what remains
To ignore same words in a string, you can use a combination of the split and distinct methods from the Java Stream API.
// Define the input string
String input = "This is a test string with some repeating words";
// Split the string into an array of words
String[] words = input.split("\\s+");
// Use the distinct method to remove duplicate words from the array
String[] distinctWords = Arrays.stream(words).distinct().toArray(String[]::new);
// Print the distinct words
System.out.println(Arrays.toString(distinctWords));
Try this:
public static int countUniqueWords(String words) {
// Add all the words to a list
List<String> array = new ArrayList<>();
Scanner in = new Scanner(words);
while (in.hasNext()) {
String s = in.next();
array.add(s);
}
// Save per word the amount of duplicates
HashMap<String, Integer> listOfWords = new HashMap<>();
Iterator<String> itr = array.iterator();
while (itr.hasNext()) {
String next = itr.next();
int prev = listOfWords.getOrDefault(next, 0);
listOfWords.put(next, prev + 1);
}
// Grab the size of all known words
return listOfWords.size();
}
public static void main(String args[]) {
int count = countUniqueWords("A long long time ago, I can still remember");
System.out.println("The number of unique words: " + count);
}
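None of the answers above stays within the question's constraint of charAt, length, substring, loops and if statements. A minimal sketch under that constraint follows; it uses only charAt and length, treats a word as a run of ASCII letters (as the question's own loop does), compares case-sensitively, and rescans the earlier part of the string for every word, so it is quadratic in the input length. The class and helper names are hypothetical.

```java
// Hypothetical sketch that uses only charAt, length, loops and if statements.
class UniqueWordCounter {
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // True if str[s1, e1) and str[s2, e2) contain the same characters.
    static boolean sameRegion(String str, int s1, int e1, int s2, int e2) {
        if (e1 - s1 != e2 - s2) return false;
        for (int k = 0; k < e1 - s1; k++) {
            if (str.charAt(s1 + k) != str.charAt(s2 + k)) return false;
        }
        return true;
    }

    // True if the word str[start, end) already occurs as a whole word before start.
    static boolean seenBefore(String str, int start, int end) {
        int i = 0;
        while (i < start) {
            if (!isLetter(str.charAt(i))) { i++; continue; }
            int j = i;
            while (j < start && isLetter(str.charAt(j))) j++;
            if (sameRegion(str, i, j, start, end)) return true;
            i = j;
        }
        return false;
    }

    static int countUniqueWords(String str) {
        int count = 0;
        int i = 0;
        while (i < str.length()) {
            if (!isLetter(str.charAt(i))) { i++; continue; }
            int j = i;
            while (j < str.length() && isLetter(str.charAt(j))) j++;
            if (!seenBefore(str, i, j)) count++; // count only the first occurrence
            i = j;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUniqueWords("A long long time ago, I can still remember"));
    }
}
```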

Fastest way to search several strings in a string

Below is my code to find the occurrences of all the substrings in a given single string
public static void main(String... args) {
String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
Map<String, Integer> countMap = countWords(fullString, severalStringArray);
}
public static Map<String, Integer> countWords(String fullString, String[] severalStringArray) {
Map<String, Integer> countMap = new HashMap<>();
for (String searchString : severalStringArray) {
if (countMap.containsKey(searchString)) {
int searchCount = countMatchesInString(fullString, searchString);
countMap.put(searchString, countMap.get(searchString) + searchCount);
} else
countMap.put(searchString, countMatchesInString(fullString, searchString));
}
return countMap;
}
private static int countMatchesInString(String fullString, String subString) {
int count = 0;
int pos = fullString.indexOf(subString);
while (pos > -1) {
count++;
pos = fullString.indexOf(subString, pos + 1);
}
return count;
}
Assume the full string might be an entire file read as a string. Is the above an efficient way to search, or is there a better or faster way to do it?
Thanks
You could just form a regex alternation of words to search, and then do a single search against that regex:
public static int matchesInString(String fullString, String regex) {
int count = 0;
Pattern r = Pattern.compile(regex);
Matcher m = r.matcher(fullString);
while (m.find())
++count;
return count;
}
String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
String regex = "\\b(?:" + String.join("|", severalStringArray) + ")\\b";
int count = matchesInString(fullString, regex);
System.out.println("There were " + count + " matches in the input");
This prints:
There were 8 matches in the input
Note that the regex pattern used in the above example was:
\b(?:one|two|three|four)\b
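The method above returns only the total number of matches, whereas the question's countWords returns one count per word. As a sketch, the same single-pass alternation can fill a Map by keying on m.group(). The class name PerWordCounter is hypothetical, and the use of Pattern.quote is an addition so that search words containing regex metacharacters are matched literally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: per-word counts from one scan of the text.
class PerWordCounter {
    static Map<String, Integer> countWords(String fullString, String[] words) {
        StringBuilder alternation = new StringBuilder();
        for (String w : words) {
            if (alternation.length() > 0) alternation.append('|');
            alternation.append(Pattern.quote(w)); // match the word literally
        }
        Pattern p = Pattern.compile("\\b(?:" + alternation + ")\\b");
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = p.matcher(fullString);
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum); // increment this word's count
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "one is a good one. two is ok. three is three. four is four. five is not four";
        System.out.println(countWords(text, new String[] { "one", "two", "three", "four" }));
    }
}
```

Words that never occur are simply absent from the returned map.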
Regular expressions
Your problem can be solved using regex (regular expressions). Regular expressions are a tool that helps you match patterns in strings. A pattern can be a word or a set of characters.
Regular expressions in Java
In Java there are two classes that help you with regular expressions: Pattern and Matcher.
Below you can see an example for searching if the word stackoverflow exists in the string stackoverflowXstackoverflowXXXstackoverflowXX in Java.
String pattern = "stackoverflow";
String stringToExamine = "stackoverflowXstackoverflowXXXstackoverflowXX";
Pattern patternObj = Pattern.compile(pattern);
Matcher matcherObj = patternObj.matcher(stringToExamine);
Counting the occurrences of a word in a given string
Depending on your Java version, there are different solutions:
Java 9+
long matches = matcherObj.results().count();
Older Java versions
int count = 0;
while (matcherObj.find())
count++;
Regular expressions in your problem
You use a method for calculating how many times a word is occurring in a text (a string), and you can modify it like this:
Java 9+
public static long matchesInString(String fullString, String pattern) // count() returns a long
{
Pattern patternObj = Pattern.compile(pattern);
Matcher matcherObj = patternObj.matcher(fullString);
return matcherObj.results().count();
}
Older Java versions
public static int matchesInString(String fullString, String pattern)
{
int count = 0;
Pattern patternObj = Pattern.compile(pattern);
Matcher matcherObj = patternObj.matcher(fullString);
while (matcherObj.find())
count++;
return count;
}
Actually, the fastest way is to scan the string once, count the words as you go in a Map, and keep only the required words.
Keep it simple! A regular expression is too complicated and not efficient for this simple task. Let's solve it with a hammer!
public static void main(String... args) {
String str = "one is a good one. two is ok. three is three. four is four. five is not four";
Set<String> words = Set.of("one", "two", "three", "four");
Map<String, Integer> map = countWords(str, words);
}
public static Map<String, Integer> countWords(String str, Set<String> words) {
Map<String, Integer> map = new HashMap<>();
for (int i = 0, j = 0; j <= str.length(); j++) {
char ch = j == str.length() ? '\0' : str.charAt(j);
if (j == str.length() || !isWordSymbol(ch)) {
String word = str.substring(i, j);
if (!word.isEmpty() && words.contains(word))
map.put(word, map.getOrDefault(word, 0) + 1);
i = j + 1;
}
}
return map;
}
private static boolean isWordSymbol(char ch) {
return Character.isLetter(ch) || ch == '-' || ch == '_';
}
An implementation of the Trie tree that someone commented on. I don't know if it's fast or not.
static class Trie {
static final long INC_NODE_NO = 1L << Integer.SIZE;
private long nextNodeNo = 0;
private Node root = new Node();
private final Map<Long, Node> nodes = new HashMap<>();
public void put(String word) {
Node node = root;
for (int i = 0, len = word.length(); i < len; ++i)
node = node.put(word.charAt(i));
node.data = word;
}
public List<String> findPrefix(String text, int start) {
List<String> result = new ArrayList<>();
Node node = root;
for (int i = start, length = text.length(); i < length; ++i) {
if ((node = node.get(text.charAt(i))) == null)
break;
String v = node.data;
if (v != null)
result.add(v);
}
return result;
}
public Map<String, Integer> find(String text) {
Map<String, Integer> result = new HashMap<>();
for (int i = 0, length = text.length(); i < length; ++i)
for (String w : findPrefix(text, i))
result.compute(w, (k, v) -> v == null ? 1 : v + 1);
return result;
}
class Node {
final long no;
String data;
Node() {
this.no = nextNodeNo;
nextNodeNo += INC_NODE_NO;
}
Node get(int key) {
return nodes.get(no | key);
}
Node put(int key) {
return nodes.computeIfAbsent(no | key, k -> new Node());
}
}
}
public static void main(String args[]) throws IOException {
String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
Trie trie = new Trie();
for (String word : severalStringArray)
trie.put(word);
Map<String, Integer> count = trie.find(fullString);
System.out.println(count);
}
output:
{four=3, one=2, three=2, two=1}

Split a string with multiple delimiters using only String methods

I want to split a string into tokens.
I ripped off another Stack Overflow question - Equivalent to StringTokenizer with multiple characters delimiters, but I want to know if this can be done with only string methods (.equals(), .startsWith(), etc.). I don't want to use RegEx's, the StringTokenizer class, Patterns, Matchers or anything other than String for that matter.
For example, this is how I want to call the method
String[] delimiters = {" ", "==", "=", "+", "+=", "++", "-", "-=", "--", "/", "/=", "*", "*=", "(", ")", ";", "/**", "*/", "\t", "\n"};
String splitString[] = tokenizer(contents, delimiters);
And this is the code I ripped off from the other question (I don't want to do this).
private String[] tokenizer(String string, String[] delimiters) {
// First, create a regular expression that matches the union of the
// delimiters
// Be aware that, in case of delimiters containing others (example &&
// and &),
// the longer may be before the shorter (&& should be before &) or the
// regexpr
// parser will recognize && as two &.
Arrays.sort(delimiters, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return -o1.compareTo(o2);
}
});
// Build a string that will contain the regular expression
StringBuilder regexpr = new StringBuilder();
regexpr.append('(');
for (String delim : delimiters) { // For each delimiter
if (regexpr.length() != 1)
regexpr.append('|'); // Add union separator if needed
for (int i = 0; i < delim.length(); i++) {
// Add an escape character if the character is a regexp reserved
// char
regexpr.append('\\');
regexpr.append(delim.charAt(i));
}
}
regexpr.append(')'); // Close the union
Pattern p = Pattern.compile(regexpr.toString());
// Now, search for the tokens
List<String> res = new ArrayList<String>();
Matcher m = p.matcher(string);
int pos = 0;
while (m.find()) { // While there's a delimiter in the string
if (pos != m.start()) {
// If there's something between the current and the previous
// delimiter
// Add it to the tokens list
res.add(string.substring(pos, m.start()));
}
res.add(m.group()); // add the delimiter
pos = m.end(); // Remember end of delimiter
}
if (pos != string.length()) {
// If it remains some characters in the string after last delimiter
// Add this to the token list
res.add(string.substring(pos));
}
// Return the result
return res.toArray(new String[res.size()]);
}
public static String[] clean(final String[] v) {
List<String> list = new ArrayList<String>(Arrays.asList(v));
list.removeAll(Collections.singleton(" "));
return list.toArray(new String[list.size()]);
}
Edit: I ONLY want to use string methods charAt, equals, equalsIgnoreCase, indexOf, length, and substring
EDIT:
My original answer did not quite do the trick, it did not include the delimiters in the resultant array, and used the String.split() method, which was not allowed.
Here's my new solution, which is split into 2 methods:
/**
* Splits the string at all specified literal delimiters, and includes the delimiters in the resulting array
*/
private static String[] tokenizer(String subject, String[] delimiters) {
//Sort delimiters into length order, starting with longest
Arrays.sort(delimiters, new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
return s2.length()-s1.length();
}
});
//start with a list with only one string - the whole thing
List<String> tokens = new ArrayList<String>();
tokens.add(subject);
//loop through the delimiters, splitting on each one
for (int i=0; i<delimiters.length; i++) {
tokens = splitStrings(tokens, delimiters, i);
}
return tokens.toArray(new String[] {});
}
/**
* Splits each String in the subject at the delimiter
*/
private static List<String> splitStrings(List<String> subject, String[] delimiters, int delimiterIndex) {
List<String> result = new ArrayList<String>();
String delimiter = delimiters[delimiterIndex];
//for each input string
for (String part : subject) {
int start = 0;
//if this part equals one of the delimiters, don't split it up any more
boolean alreadySplit = false;
for (String testDelimiter : delimiters) {
if (testDelimiter.equals(part)) {
alreadySplit = true;
break;
}
}
if (!alreadySplit) {
for (int index=0; index<part.length(); index++) {
String subPart = part.substring(index);
if (subPart.indexOf(delimiter)==0) {
result.add(part.substring(start, index)); // part before delimiter
result.add(delimiter); // delimiter
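start = index+delimiter.length(); // next part starts after delimiter
index = start-1; // resume scanning after the delimiter so overlapping matches (e.g. "==" in "===") are not split twice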
}
}
}
result.add(part.substring(start)); // rest of string after last delimiter
}
return result;
}
Original Answer
I notice you are using Pattern when you said you only wanted to use String methods.
The approach I would take would be to think of the simplest way possible. I think that is to first replace all the possible delimiters with just one delimiter, and then do the split.
Here's the code:
private String[] tokenizer(String string, String[] delimiters) {
//replace all specified delimiters with one
for (String delimiter : delimiters) {
while (string.indexOf(delimiter)!=-1) {
string = string.replace(delimiter, "{split}");
}
}
//now split at the new delimiter
return string.split("\\{split\\}");
}
I need to use String.replace() and not String.replaceAll() because replace() takes literal text whereas replaceAll() takes a regex argument, and the delimiters supplied are literal text.
Note that replace() already replaces every occurrence of the delimiter, so the while loop is redundant; it is harmless only as long as the placeholder {split} does not itself contain one of the delimiters.
Using only non-regex String methods...
I used the startsWith(...) method, which wasn't in the list of methods you gave, because it does a plain string comparison rather than a regex comparison.
The following impl:
public static void main(String ... params) {
String haystack = "abcdefghijklmnopqrstuvwxyz";
String [] needles = new String [] { "def", "tuv" };
String [] tokens = splitIntoTokensUsingNeedlesFoundInHaystack(haystack, needles);
for (String string : tokens) {
System.out.println(string);
}
}
private static String[] splitIntoTokensUsingNeedlesFoundInHaystack(String haystack, String[] needles) {
List<String> list = new LinkedList<String>();
StringBuilder builder = new StringBuilder();
for(int haystackIndex = 0; haystackIndex < haystack.length(); haystackIndex++) {
boolean foundAnyNeedle = false;
String substring = haystack.substring(haystackIndex);
for(int needleIndex = 0; (!foundAnyNeedle) && needleIndex < needles.length; needleIndex ++) {
String needle = needles[needleIndex];
if(substring.startsWith(needle)) {
if(builder.length() > 0) {
list.add(builder.toString());
builder = new StringBuilder();
}
foundAnyNeedle = true;
list.add(needle);
haystackIndex += (needle.length() - 1);
}
}
if( ! foundAnyNeedle) {
builder.append(substring.charAt(0));
}
}
if(builder.length() > 0) {
list.add(builder.toString());
}
return list.toArray(new String[]{});
}
outputs
abc
def
ghijklmnopqrs
tuv
wxyz
Note...
This code is demo-only. In the event that one of the delimiters is an empty String, it will behave poorly and eventually crash with OutOfMemoryError: Java heap space after consuming a lot of CPU.
As far as I understand your problem, you can do something like this:
public Object[] tokenizer(String value, String[] delimeters){
List<String> list= new ArrayList<String>();
for(String s:delimeters){
if(value.contains(s)){
String[] strArr=value.split(Pattern.quote(s)); // quote the whole delimiter, not just its first character
for(String str:strArr){
list.add(str);
if(!list.contains(s)){
list.add(s);
}
}
}
}
Object[] newValues=list.toArray();
return newValues;
}
Now in the main method call this function -
String[] delimeters = {" ", "{", "==", "=", "+", "+=", "++", "-", "-=", "--", "/", "/=", "*", "*=", "(", ")", ";", "/**", "*/", "\t", "\n"};
Object[] obj=st.tokenizer("ge{ab", delimeters); //st is the reference of the other class. Edit this of your own.
for(Object o:obj){
System.out.println(o.toString());
}
Suggestion:
private static int INIT_INDEX_MAX_INT = Integer.MAX_VALUE;
private static String[] tokenizer(final String string, final String[] delimiters) {
final List<String> result = new ArrayList<>();
int currentPosition = 0;
while (currentPosition < string.length()) {
// plan: search for the nearest delimiter and its position
String nextDelimiter = "";
int positionIndex = INIT_INDEX_MAX_INT;
for (final String currentDelimiter : delimiters) {
final int currentPositionIndex = string.indexOf(currentDelimiter, currentPosition);
if (currentPositionIndex < 0) { // current delimiter not found, go to the next
continue;
}
if (currentPositionIndex < positionIndex) { // we found a better one, update
positionIndex = currentPositionIndex;
nextDelimiter = currentDelimiter;
}
}
if (positionIndex == INIT_INDEX_MAX_INT) { // we found nothing, finish up
final String finalPart = string.substring(currentPosition, string.length());
result.add(finalPart);
break;
}
// we have one, add substring + delimiter to result and update current position
// System.out.println(positionIndex + ":[" + nextDelimiter + "]"); // to follow the internals
final String stringBeforeNextDelimiter = string.substring(currentPosition, positionIndex);
result.add(stringBeforeNextDelimiter);
result.add(nextDelimiter);
currentPosition += stringBeforeNextDelimiter.length() + nextDelimiter.length();
}
return result.toArray(new String[] {});
}
Notes:
I have added more comments than necessary. I guess it would help in this case.
The performance of this is quite bad (it could be improved with tree structures and hashes). Performance was not part of the specification.
Operator precedence is not specified (see my comment on the question). It was not part of the specification.
I ONLY want to use string methods charAt, equals, equalsIgnoreCase, indexOf, length, and substring
Check. The function uses only indexOf(), length() and substring()
No, I mean in the returned results. For example, If my delimiter was {, and a string was ge{ab, I would like an array with ge, { and ab
Check:
private static void test() {
final String[] delimiters = { "{" };
final String contents = "ge{ab";
final String splitString[] = tokenizer(contents, delimiters);
final String joined = String.join("", splitString);
System.out.println(Arrays.toString(splitString));
System.out.println(contents.equals(joined) ? "ok" : "wrong: [" + contents + "]#[" + joined + "]");
}
// [ge, {, ab]
// ok
One final remark: I would advise reading about compiler construction, in particular the compiler front end, if one wants best practices for this kind of question.
Maybe I haven't fully understood the question, but I have the impression that you want to rewrite the Java String method split(). I would advise you to have a look at this function, see how it's done and start from there.
Honestly, you could use Apache Commons Lang. If you check the library's source code, you will notice that it doesn't use regex. Only String operations and a lot of flags are used in the method [StringUtils.split](http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringUtils.html#split(java.lang.String, java.lang.String)).
Anyway, take a look at this code using Apache Commons Lang.
import org.apache.commons.lang.StringUtils;
import org.junit.Assert;
import org.junit.Test;
public class SimpleTest {
@Test
public void testSplitWithoutRegex() {
String[] delimiters = {"==", "+=", "++", "-=", "--", "/=", "*=", "/**", "*/",
" ", "=", "+", "-", "/", "*", "(", ")", ";", "\t", "\n"};
String finalDelimiter = "#";
//check if delimiter can be used
boolean canBeUsed = true;
for (String delimiter : delimiters) {
if (finalDelimiter.equals(delimiter)) {
canBeUsed = false;
break;
}
}
if (!canBeUsed) {
Assert.fail("The selected delimiter can't be used.");
}
String s = "Assuming that we have /** or /* all these signals like == and; / or * will be replaced.";
System.out.println(s);
for (String delimiter : delimiters) {
while (s.indexOf(delimiter) != -1) {
s = s.replace(delimiter, finalDelimiter);
}
}
String[] splitted = StringUtils.split(s, "#");
for (String s1 : splitted) {
System.out.println(s1);
}
}
}
I hope it helps.
As simple as I could get it...
public class StringTokenizer {
public static String[] split(String s, String[] tokens) {
Arrays.sort(tokens, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o2.length()-o1.length();
}
});
LinkedList<String> result = new LinkedList<>();
int j=0;
for (int i=0; i<s.length(); i++) {
String ss = s.substring(i);
for (String token : tokens) {
if (ss.startsWith(token)) {
if (i>j) {
result.add(s.substring(j, i));
}
result.add(token);
j = i+token.length();
i = j-1;
break;
}
}
}
result.add(s.substring(j));
return result.toArray(new String[result.size()]);
}
}
It creates a lot of new objects, and could be optimized by writing a custom startsWith() implementation that compares the string char by char.
@Test
public void test() {
String[] split = StringTokenizer.split("this==is the most>complext<=string<<ever", new String[] {"=", "<", ">", "==", ">=", "<="});
assertArrayEquals(new String[] {"this", "==", "is the most", ">", "complext", "<=", "string", "<", "<", "ever"}, split);
}
passes fine :)
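The optimization mentioned above can be sketched as follows: a hypothetical matchesAt helper compares characters in place with charAt, so no substring is created per position. This version also picks the longest token that matches at each position instead of pre-sorting the array. It still collects results in a List, which goes beyond the question's strict method list, and unlike the class above it does not append an empty trailing element; both are choices of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the tokenizer above that avoids substring(i) in the loop.
class CharTokenizer {
    // True if token occurs in s starting at index pos; compares char by char.
    static boolean matchesAt(String s, int pos, String token) {
        if (token.length() == 0 || pos + token.length() > s.length()) return false;
        for (int k = 0; k < token.length(); k++) {
            if (s.charAt(pos + k) != token.charAt(k)) return false;
        }
        return true;
    }

    static String[] split(String s, String[] tokens) {
        List<String> result = new ArrayList<>();
        int segmentStart = 0; // start of the text not yet emitted
        int i = 0;
        while (i < s.length()) {
            // Pick the longest token that matches at i, so "==" wins over "=".
            String best = null;
            for (String token : tokens) {
                if (matchesAt(s, i, token) && (best == null || token.length() > best.length())) {
                    best = token;
                }
            }
            if (best == null) { i++; continue; }
            if (i > segmentStart) result.add(s.substring(segmentStart, i));
            result.add(best);
            i += best.length();
            segmentStart = i;
        }
        if (segmentStart < s.length()) result.add(s.substring(segmentStart));
        return result.toArray(new String[0]);
    }
}
```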
You can use recursion (a hallmark of functional programming) to make it less verbose.
public static String[] tokenizer(String text, String[] delims) {
for(String delim : delims) {
int i = text.indexOf(delim);
if(i >= 0) {
// recursive call
String[] tail = tokenizer(text.substring(i + delim.length()), delims);
// return [ head, middle, tail.. ]
String[] list = new String[tail.length + 2];
list[0] = text.substring(0,i);
list[1] = delim;
System.arraycopy(tail, 0, list, 2, tail.length);
return list;
}
}
return new String[] { text };
}
Tested it using the same unit-test from the other answer
public static void main(String ... params) {
String haystack = "abcdefghijklmnopqrstuvwxyz";
String [] needles = new String [] { "def", "tuv" };
String [] tokens = tokenizer(haystack, needles);
for (String string : tokens) {
System.out.println(string);
}
}
Output
abc
def
ghijklmnopqrs
tuv
wxyz
It would be a little more elegant if Java had better native array support.

How to count how many occurrences of a word I have in an ArrayList?

How can I count how many occurrences of a word I have in an ArrayList? Can anyone please help me?
If you are trying to only find how many matches and you don't need to know positions try using this
BufferedReader reader = new BufferedReader(new FileReader("hello.txt"));
StringBuilder builder = new StringBuilder();
String str = null;
while((str = reader.readLine()) != null){
builder.append(str);
}
String fileString = builder.toString();
String match = "wordToMatch";
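// Quote the word so it is matched literally; the -1 limit keeps trailing empty strings,
// so matches at the very end of the text are counted as well
String[] split = fileString.split(Pattern.quote(match), -1);
System.out.println(split.length - 1); // number of exact matches of the word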
Try this:
private static int findMatches(List<Character> text, List<Character> pattern) {
int n = text.size();
int m = pattern.size();
int count = 0; // tracks number of matches found
for (int i = 0; i <= n - m; i++) {
int k = 0;
while (k < m && text.get(i + k).equals(pattern.get(k))) // equals(), because == compares Character references
k++;
if (k == m) { // if we reach the end of the pattern
k = 0;
count++;
}
}
return count;
}
Finding the number of times one string occurs in another is fairly trivial so it might be best to read your file into a string rather than a list of characters if all you are going to do is search it
int matches = 0;
for (int index = text.indexOf(target); index != -1; index = text.indexOf(target, index + 1))
matches++;
However most efficient search algorithms are going to require you to create an index as you read the input - something like a Map<String,List<Integer>> that can quickly find a list of word positions for a given word.
For example (using Java 8):
List<String> words; // ordered list of words read from the text file
Map<String, List<Integer>> index = IntStream.range(0, words.size()).boxed()
.collect(Collectors.groupingBy(words::get));
Then searching for a word becomes trivial (e.g. index.get(word).size() is the number of occurrences, provided the word is present in the index). Searching for a phrase is not much harder: you break it into words, then filter the index values for consecutive word positions.
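The index and the phrase search described above can be sketched as below. The class name WordIndex and its methods are hypothetical, and the phrase lookup uses a linear List.contains per word, which is fine for illustration but not tuned for large inputs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical positional index: word -> positions at which it occurs.
class WordIndex {
    private final Map<String, List<Integer>> index;

    WordIndex(List<String> words) {
        this.index = IntStream.range(0, words.size()).boxed()
                .collect(Collectors.groupingBy(words::get));
    }

    // Number of occurrences of a single word; 0 if it never occurs.
    int count(String word) {
        return index.getOrDefault(word, Collections.emptyList()).size();
    }

    // Number of positions at which the phrase's words appear consecutively.
    long countPhrase(String... phrase) {
        if (phrase.length == 0) return 0;
        return index.getOrDefault(phrase[0], Collections.emptyList()).stream()
                .filter(start -> IntStream.range(1, phrase.length)
                        .allMatch(k -> index.getOrDefault(phrase[k], Collections.emptyList())
                                .contains(start + k)))
                .count();
    }

    public static void main(String[] args) {
        WordIndex idx = new WordIndex(Arrays.asList("one", "two", "one", "two", "three"));
        System.out.println(idx.count("one"));
        System.out.println(idx.countPhrase("one", "two"));
    }
}
```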
Use a List<String> instead of List<Character>. After that you can either brute force the occurrence or use Java 8's streams and filter.
public static void main(String[] args) throws Exception {
List<String> words = new ArrayList<String>() {{
add("one");
add("two");
add("one");
add("one");
add("two");
add("one");
add("two");
add("three");
add("one");
add("one");
}};
String wordToSearch = "one";
int occurrence = 0;
for (String word : words) {
if (word.equals(wordToSearch)) {
occurrence++;
}
}
System.out.println("Brute force: " + occurrence);
System.out.println("Streams : " + words.stream()
.filter(word -> word.equalsIgnoreCase(wordToSearch)).count());
}
Results:
Brute force: 6
Streams : 6

How can I replace two strings in a way that one does not end up replacing the other?

Let's say that I have the following code:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
story = story.replace("foo", word1);
story = story.replace("bar", word2);
After this code runs, the value of story will be "Once upon a time, there was a foo and a foo."
A similar issue occurs if I replaced them in the opposite order:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
story = story.replace("bar", word2);
story = story.replace("foo", word1);
The value of story will be "Once upon a time, there was a bar and a bar."
My goal is to turn story into "Once upon a time, there was a bar and a foo." How could I accomplish that?
Use the replaceEach() method from Apache Commons StringUtils:
StringUtils.replaceEach(story, new String[]{"foo", "bar"}, new String[]{"bar", "foo"})
You use an intermediate value (which is not yet present in the sentence).
story = story.replace("foo", "lala");
story = story.replace("bar", "foo");
story = story.replace("lala", "bar");
As a response to criticism: if you use a large enough uncommon string like zq515sqdqs5d5sq1dqs4d1q5dqqé"&é5d4sqjshsjddjhodfqsqc, nvùq^µù;d&€sdq: d: ;)àçàçlala and use that, it is unlikely to the point where I won't even debate it that a user will ever enter this. The only way to know whether a user will is by knowing the source code and at that point you're with a whole other level of worries.
Yes, maybe there are fancy regex ways. I prefer something readable that I know will not break out on me either.
Also reiterating the excellent advice given by @David Conrad in the comments:
Don't use some string cleverly (stupidly) chosen to be unlikely. Use characters from the Unicode Private Use Area, U+E000..U+F8FF. Remove any such characters first, since they shouldn't legitimately be in the input (they only have application-specific meaning within some application), then use them as placeholders when replacing.
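A minimal sketch of that Private Use Area idea could look like the following. The class and method names are invented for illustration, and it assumes that neither the search strings nor the replacement strings contain Private Use Area characters themselves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating placeholders taken from U+E000..U+F8FF.
final class PlaceholderReplace {
    private static final char PUA_START = '\uE000';
    private static final char PUA_END = '\uF8FF';

    static String replaceAll(String text, Map<String, String> replacements) {
        if (replacements.size() > PUA_END - PUA_START + 1) {
            throw new IllegalArgumentException("Too many replacements for one placeholder each");
        }
        // Step 1: strip any Private Use Area characters already present in the input.
        StringBuilder cleaned = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < PUA_START || c > PUA_END) cleaned.append(c);
        }
        String result = cleaned.toString();

        // Step 2: replace each search string with its own placeholder character.
        char placeholder = PUA_START;
        Map<String, String> placeholderToReplacement = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            String tag = String.valueOf(placeholder++);
            result = result.replace(e.getKey(), tag);
            placeholderToReplacement.put(tag, e.getValue());
        }
        // Step 3: replace each placeholder with the real replacement text.
        for (Map.Entry<String, String> e : placeholderToReplacement.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

This still makes two passes per search string, so it is a sketch of the safer placeholder choice rather than of an efficient algorithm.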
You can try something like this, using Matcher#appendReplacement and Matcher#appendTail:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
Pattern p = Pattern.compile("foo|bar");
Matcher m = p.matcher(story);
StringBuffer sb = new StringBuffer();
while (m.find()) {
/* do the swap... */
switch (m.group()) {
case "foo":
m.appendReplacement(sb, word1);
break;
case "bar":
m.appendReplacement(sb, word2);
break;
default:
/* error */
break;
}
}
m.appendTail(sb);
System.out.println(sb.toString());
Once upon a time, there was a bar and a foo.
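The same single-pass idea can be generalized to an arbitrary set of replacements. The sketch below is one possible way to do it, not part of the answer above: the class name RegexReplacer is made up, the pattern is compiled once so it can be reused for many inputs, and it assumes all search strings are non-empty. Keys are sorted longest first so that a longer key wins over a shorter one starting at the same position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical reusable replacer built from a search -> replacement map.
final class RegexReplacer {
    private final Pattern pattern;
    private final Map<String, String> replacements;

    RegexReplacer(Map<String, String> replacements) {
        this.replacements = replacements;
        List<String> keys = new ArrayList<>(replacements.keySet());
        // Longest keys first, so "ch" is tried before "c" at the same position.
        keys.sort((a, b) -> Integer.compare(b.length(), a.length()));
        // Pattern.quote makes every key match literally.
        this.pattern = Pattern.compile(
                keys.stream().map(Pattern::quote).collect(Collectors.joining("|")));
    }

    String replace(String input) {
        if (replacements.isEmpty()) return input;
        Matcher m = pattern.matcher(input);
        StringBuilder sb = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            sb.append(input, last, m.start());       // unmatched text before this match
            sb.append(replacements.get(m.group()));  // appended literally, no $-expansion
            last = m.end();
        }
        return sb.append(input, last, input.length()).toString();
    }
}
```

Because the replacement text is appended directly instead of going through appendReplacement, characters such as $ and \ in the replacement need no escaping.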
This is not an easy problem. And the more search-replacement parameters you have, the trickier it gets. You have several options, scattered on the palette of ugly-elegant, efficient-wasteful:
Use StringUtils.replaceEach from Apache Commons as @AlanHay recommended. This is a good option if you're free to add new dependencies in your project. You might get lucky: the dependency might be included already in your project.
Use a temporary placeholder as @Jeroen suggested, and perform the replacement in 2 steps:
Replace all search patterns with a unique tag that doesn't exist in the original text
Replace the placeholders with the real target replacement
This is not a great approach, for several reasons: it needs to ensure that the tags used in the first step are really unique; it performs more string replacement operations than really necessary.
Build a regex from all the patterns and use the method with Matcher and StringBuffer as suggested by @arshajii. This is not terrible, but not that great either, as building the regex is kind of hackish, and it involves StringBuffer which went out of fashion a while ago in favor of StringBuilder.
Use a recursive solution proposed by @mjolka, by splitting the string at the matched patterns, and recursing on the remaining segments. This is a fine solution, compact and quite elegant. Its weakness is the potentially many substring and concatenation operations, and the stack size limits that apply to all recursive solutions.
Split the text to words and use Java 8 streams to perform the replacements elegantly as @msandiford suggested, but of course that only works if you are ok with splitting at word boundaries, which makes it not suitable as a general solution.
Here's my version, based on ideas borrowed from Apache's implementation. It's neither simple nor elegant, but it works, and should be relatively efficient, without unnecessary steps. In a nutshell, it works like this: repeatedly find the next matching search pattern in the text, and use a StringBuilder to accumulate the unmatched segments and the replacements.
public static String replaceEach(String text, String[] searchList, String[] replacementList) {
// TODO: throw new IllegalArgumentException() if any param doesn't make sense
//validateParams(text, searchList, replacementList);
SearchTracker tracker = new SearchTracker(text, searchList, replacementList);
if (!tracker.hasNextMatch(0)) {
return text;
}
StringBuilder buf = new StringBuilder(text.length() * 2);
int start = 0;
do {
SearchTracker.MatchInfo matchInfo = tracker.matchInfo;
int textIndex = matchInfo.textIndex;
String pattern = matchInfo.pattern;
String replacement = matchInfo.replacement;
buf.append(text.substring(start, textIndex));
buf.append(replacement);
start = textIndex + pattern.length();
} while (tracker.hasNextMatch(start));
return buf.append(text.substring(start)).toString();
}
private static class SearchTracker {
private final String text;
private final Map<String, String> patternToReplacement = new HashMap<>();
private final Set<String> pendingPatterns = new HashSet<>();
private MatchInfo matchInfo = null;
private static class MatchInfo {
private final String pattern;
private final String replacement;
private final int textIndex;
private MatchInfo(String pattern, String replacement, int textIndex) {
this.pattern = pattern;
this.replacement = replacement;
this.textIndex = textIndex;
}
}
private SearchTracker(String text, String[] searchList, String[] replacementList) {
this.text = text;
for (int i = 0; i < searchList.length; ++i) {
String pattern = searchList[i];
patternToReplacement.put(pattern, replacementList[i]);
pendingPatterns.add(pattern);
}
}
boolean hasNextMatch(int start) {
int textIndex = -1;
String nextPattern = null;
for (String pattern : new ArrayList<>(pendingPatterns)) {
int matchIndex = text.indexOf(pattern, start);
if (matchIndex == -1) {
pendingPatterns.remove(pattern);
} else {
if (textIndex == -1 || matchIndex < textIndex) {
textIndex = matchIndex;
nextPattern = pattern;
}
}
}
if (nextPattern != null) {
matchInfo = new MatchInfo(nextPattern, patternToReplacement.get(nextPattern), textIndex);
return true;
}
return false;
}
}
Unit tests:
@Test
public void testSingleExact() {
assertEquals("bar", StringUtils.replaceEach("foo", new String[]{"foo"}, new String[]{"bar"}));
}
@Test
public void testReplaceTwice() {
assertEquals("barbar", StringUtils.replaceEach("foofoo", new String[]{"foo"}, new String[]{"bar"}));
}
@Test
public void testReplaceTwoPatterns() {
assertEquals("barbaz", StringUtils.replaceEach("foobar",
new String[]{"foo", "bar"},
new String[]{"bar", "baz"}));
}
@Test
public void testReplaceNone() {
assertEquals("foofoo", StringUtils.replaceEach("foofoo", new String[]{"x"}, new String[]{"bar"}));
}
@Test
public void testStory() {
assertEquals("Once upon a foo, there was a bar and a baz, and another bar and a cat.",
StringUtils.replaceEach("Once upon a baz, there was a foo and a bar, and another foo and a cat.",
new String[]{"foo", "bar", "baz"},
new String[]{"bar", "baz", "foo"})
);
}
Search for the first word to be replaced. If it's in the string, recurse on the part of the string before the occurrence, and on the part of the string after the occurrence.
Otherwise, continue with the next word to be replaced.
A naive implementation might look like this
public static String replaceAll(String input, String[] search, String[] replace) {
return replaceAll(input, search, replace, 0);
}
private static String replaceAll(String input, String[] search, String[] replace, int i) {
if (i == search.length) {
return input;
}
int j = input.indexOf(search[i]);
if (j == -1) {
return replaceAll(input, search, replace, i + 1);
}
return replaceAll(input.substring(0, j), search, replace, i + 1) +
replace[i] +
replaceAll(input.substring(j + search[i].length()), search, replace, i);
}
Sample usage:
String input = "Once upon a baz, there was a foo and a bar.";
String[] search = new String[] { "foo", "bar", "baz" };
String[] replace = new String[] { "bar", "baz", "foo" };
System.out.println(replaceAll(input, search, replace));
Output:
Once upon a foo, there was a bar and a baz.
A less-naive version:
public static String replaceAll(String input, String[] search, String[] replace) {
StringBuilder sb = new StringBuilder();
replaceAll(sb, input, 0, input.length(), search, replace, 0);
return sb.toString();
}
private static void replaceAll(StringBuilder sb, String input, int start, int end, String[] search, String[] replace, int i) {
while (i < search.length && start < end) {
int j = indexOf(input, search[i], start, end);
if (j == -1) {
i++;
} else {
replaceAll(sb, input, start, j, search, replace, i + 1);
sb.append(replace[i]);
start = j + search[i].length();
}
}
sb.append(input, start, end);
}
Unfortunately, Java's String has no indexOf(String str, int fromIndex, int toIndex) method. I've omitted the implementation of indexOf here as I'm not certain it's correct, but it can be found on ideone, along with some rough timings of various solutions posted here.
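For reference, one simple way to write such a bounded indexOf is sketched below. This is not the implementation from the linked page, just an assumption about what would satisfy the loop above: it delegates to String.indexOf and rejects a match that does not fit entirely before end. It assumes a non-empty search string.

```java
// Hypothetical helper; the name and signature follow the call in the code above.
final class RangeIndexOf {
    // Index of the first occurrence of search that lies entirely within [start, end), or -1.
    static int indexOf(String input, String search, int start, int end) {
        int j = input.indexOf(search, start);
        // If the first occurrence at or after start runs past end, every later one does too.
        return (j == -1 || j + search.length() > end) ? -1 : j;
    }
}
```

The trade-off is that String.indexOf may keep scanning beyond end before giving up, so a hand-written loop bounded by end could be faster on long inputs with short ranges.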
One-liner in Java 8:
story = Pattern
.compile(String.format("(?<=%1$s)|(?=%1$s)", "foo|bar"))
.splitAsStream(story)
.map(w -> ImmutableMap.of("bar", "foo", "foo", "bar").getOrDefault(w, w))
.collect(Collectors.joining());
Lookaround regular expressions (?<=, ?=): http://www.regular-expressions.info/lookaround.html
If the words can contain special regex characters, use Pattern.quote to
escape them.
I use guava ImmutableMap for conciseness, but obviously any other Map will do the job as well.
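To make the Pattern.quote remark concrete, here is a sketch of the same split-and-map approach with quoting and a plain java.util.Map instead of Guava. The class and method names are made up, and it assumes the keys do not overlap (for example, one key being a prefix of another), since the lookaround split would then cut inside the longer key.

```java
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper wrapping the lookaround split shown above.
final class SplitSwap {
    static String swap(String story, Map<String, String> words) {
        if (words.isEmpty()) return story;
        // Quote each key so that characters such as '.' or '+' are matched literally.
        String alternatives = words.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Split just before and just after every key, keeping the keys as their own pieces.
        return Pattern
                .compile(String.format("(?<=%1$s)|(?=%1$s)", alternatives))
                .splitAsStream(story)
                .map(w -> words.getOrDefault(w, w))
                .collect(Collectors.joining());
    }
}
```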
Here is a Java 8 streams possibility that might be interesting for some:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
// Map is from untranslated word to translated word
Map<String, String> wordMap = new HashMap<>();
wordMap.put(word1, word2);
wordMap.put(word2, word1);
// Split on word boundaries so we retain whitespace.
String translated = Arrays.stream(story.split("\\b"))
.map(w -> wordMap.getOrDefault(w, w))
.collect(Collectors.joining());
System.out.println(translated);
Here is an approximation of the same algorithm in Java 7:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
// Map is from untranslated word to translated word
Map<String, String> wordMap = new HashMap<>();
wordMap.put(word1, word2);
wordMap.put(word2, word1);
// Split on word boundaries so we retain whitespace.
StringBuilder translated = new StringBuilder();
for (String w : story.split("\\b"))
{
String tw = wordMap.get(w);
translated.append(tw != null ? tw : w);
}
System.out.println(translated);
If you want to replace words in a sentence that are separated by white space, as shown in your example, you can use this simple algorithm:
Split story on white space
Replace each element: if it is foo, replace it with bar, and vice versa
Join the array back into one string
If splitting on space is not acceptable, one can follow this alternate algorithm. You need to use the longer string first. If the strings are foo and fool, you need to use fool first and then foo.
Split on the word foo
Replace bar with foo in each element of the array
Join that array back, adding bar after each element except the last
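The second algorithm (split, replace inside the pieces, join) might be sketched as follows. The class and method names are invented for the example; a plays the role of foo and b the role of bar.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: swaps every a with b and every b with a.
final class SplitJoinSwap {
    static String swap(String story, String a, String b) {
        // Limit -1 keeps trailing empty pieces, so an occurrence of a at the very end is not lost.
        return Arrays.stream(story.split(Pattern.quote(a), -1))
                .map(piece -> piece.replace(b, a))   // pieces contain no a, so nothing is replaced twice
                .collect(Collectors.joining(b));     // every removed a comes back as b
    }
}
```

When one word contains the other, pass the longer one as a, as described above.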
Here's a less complicated answer using Map.
private static String replaceEach(String str,Map<String, String> map) {
Object[] keys = map.keySet().toArray();
for(int x = 0 ; x < keys.length ; x ++ ) {
str = str.replace((String) keys[x],"%"+x);
}
for(int x = 0 ; x < keys.length ; x ++) {
str = str.replace("%"+x,map.get(keys[x]));
}
return str;
}
And the method is called like this:
Map<String, String> replaceStr = new HashMap<>();
replaceStr.put("Raffy","awesome");
replaceStr.put("awesome","Raffy");
String replaced = replaceEach("Raffy is awesome, awesome awesome is Raffy Raffy", replaceStr);
Output is:
awesome is Raffy, Raffy Raffy is awesome awesome
If you want to be able to handle multiple occurrences of the search strings to be replaced, you can do that easily by splitting the string on each search term, then replacing it.
Here is an example:
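// Split just before and after each search term, so the terms survive as separate pieces.
String regex = Pattern.quote(word1) + "|" + Pattern.quote(word2);
String[] values = story.split("(?<=" + regex + ")|(?=" + regex + ")");
StringBuilder result = new StringBuilder();
for (String subStr : values)
{
// Swap a piece only if it is exactly one of the two words.
if (subStr.equals(word1)) result.append(word2);
else if (subStr.equals(word2)) result.append(word1);
else result.append(subStr);
}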
You can accomplish your goal with the following code block:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, in a foo, there was a foo and a bar.";
story = String.format(story.replace(word1, "%1$s").replace(word2, "%2$s"),
word2, word1);
It replaces the words regardless of the order. You can extend this principle into a utility method, like:
private static String replace(String source, String[] targets, String[] replacements) throws IllegalArgumentException {
if (source == null) {
throw new IllegalArgumentException("The parameter \"source\" cannot be null.");
}
if (targets == null || replacements == null) {
throw new IllegalArgumentException("Neither parameters \"targets\" or \"replacements\" can be null.");
}
if (targets.length == 0 || targets.length != replacements.length) {
throw new IllegalArgumentException("The parameters \"targets\" and \"replacements\" must have at least one item and have the same length.");
}
String outputMask = source;
for (int i = 0; i < targets.length; i++) {
outputMask = outputMask.replace(targets[i], "%" + (i + 1) + "$s");
}
return String.format(outputMask, (Object[])replacements);
}
Which would be consumed as:
String story = "Once upon a time, in a foo, there was a foo and a bar.";
story = replace(story, new String[] { "bar", "foo" },
new String[] { "foo", "bar" });
This works and is simple:
public String replaceBoth(String text, String token1, String token2) {
return text.replace(token1, "\ufdd0").replace(token2, token1).replace("\ufdd0", token2);
}
You use it like this:
replaceBoth("Once upon a time, there was a foo and a bar.", "foo", "bar");
Note: this relies on the strings not containing the character \ufdd0, which is a noncharacter permanently reserved by Unicode for internal use (see http://www.unicode.org/faq/private_use.html).
I don't think it's necessary, but If you want to be absolutely safe you can use:
public String replaceBoth(String text, String token1, String token2) {
if (text.contains("\ufdd0") || token1.contains("\ufdd0") || token2.contains("\ufdd0")) throw new IllegalArgumentException("Invalid character.");
return text.replace(token1, "\ufdd0").replace(token2, token1).replace("\ufdd0", token2);
}
Swapping Only One Occurrence
If there is only one occurrence of each of the swappable strings in the input, you can do the following:
Before doing any replacement, get the indices of the occurrences of the words. After that we only replace the words found at these indices, and not all occurrences. This solution uses StringBuilder and does not produce intermediate Strings like String.replace().
One thing to note: if the swappable words have different lengths, then after the first replacement the second index shifts by exactly the difference of the two lengths (if the 1st word occurs before the 2nd). Adjusting the second index ensures this works even if we're swapping words with different lengths.
public static String swap(String src, String s1, String s2) {
StringBuilder sb = new StringBuilder(src);
int i1 = src.indexOf(s1);
int i2 = src.indexOf(s2);
sb.replace(i1, i1 + s1.length(), s2); // Replace s1 with s2
// If s1 was before s2, i2 might have changed after the replace
if (i1 < i2)
i2 += s2.length() - s1.length();
sb.replace(i2, i2 + s2.length(), s1); // Replace s2 with s1
return sb.toString();
}
Swapping Arbitrary Number of Occurrences
Analogous to the previous case, we will first collect the indices (occurrences) of the words, but in this case it will be a list of integers for each word, not just one int. For this we will use the following utility method:
public static List<Integer> occurrences(String src, String s) {
List<Integer> list = new ArrayList<>();
for (int idx = 0;;)
if ((idx = src.indexOf(s, idx)) >= 0) {
list.add(idx);
idx += s.length();
} else
return list;
}
Using this, we replace each word with the other one in order of decreasing index (which might require alternating between the 2 swappable words), so that we don't even have to correct the indices after a replacement:
public static String swapAll(String src, String s1, String s2) {
List<Integer> l1 = occurrences(src, s1), l2 = occurrences(src, s2);
StringBuilder sb = new StringBuilder(src);
// Replace occurrences by decreasing index, alternating between s1 and s2
for (int i1 = l1.size() - 1, i2 = l2.size() - 1; i1 >= 0 || i2 >= 0;) {
int idx1 = i1 < 0 ? -1 : l1.get(i1);
int idx2 = i2 < 0 ? -1 : l2.get(i2);
if (idx1 > idx2) { // Replace s1 with s2
sb.replace(idx1, idx1 + s1.length(), s2);
i1--;
} else { // Replace s2 with s1
sb.replace(idx2, idx2 + s2.length(), s1);
i2--;
}
}
return sb.toString();
}
It's easy to write a method to do this using String.regionMatches:
public static String simultaneousReplace(String subject, String... pairs) {
if (pairs.length % 2 != 0) throw new IllegalArgumentException(
"Strings to find and replace are not paired.");
StringBuilder sb = new StringBuilder();
outer:
for (int i = 0; i < subject.length(); i++) {
for (int j = 0; j < pairs.length; j += 2) {
String find = pairs[j];
if (subject.regionMatches(i, find, 0, find.length())) {
sb.append(pairs[j + 1]);
i += find.length() - 1;
continue outer;
}
}
sb.append(subject.charAt(i));
}
return sb.toString();
}
Testing:
String s = "There are three cats and two dogs.";
s = simultaneousReplace(s,
"cats", "dogs",
"dogs", "budgies");
System.out.println(s);
Output:
There are three dogs and two budgies.
It is not immediately obvious, but a function like this can still be dependent on the order in which the replacements are specified. Consider:
String truth = "Java is to JavaScript";
truth += " as " + simultaneousReplace(truth,
"JavaScript", "Hamster",
"Java", "Ham");
System.out.println(truth);
Output:
Java is to JavaScript as Ham is to Hamster
But reverse the replacements:
truth += " as " + simultaneousReplace(truth,
"Java", "Ham",
"JavaScript", "Hamster");
Output:
Java is to JavaScript as Ham is to HamScript
Oops! :)
Therefore it is sometimes useful to make sure to look for the longest match (as PHP's strtr function does, for example). This version of the method will do that:
public static String simultaneousReplace(String subject, String... pairs) {
if (pairs.length % 2 != 0) throw new IllegalArgumentException(
"Strings to find and replace are not paired.");
StringBuilder sb = new StringBuilder();
for (int i = 0; i < subject.length(); i++) {
int longestMatchIndex = -1;
int longestMatchLength = -1;
for (int j = 0; j < pairs.length; j += 2) {
String find = pairs[j];
if (subject.regionMatches(i, find, 0, find.length())) {
if (find.length() > longestMatchLength) {
longestMatchIndex = j;
longestMatchLength = find.length();
}
}
}
if (longestMatchIndex >= 0) {
sb.append(pairs[longestMatchIndex + 1]);
i += longestMatchLength - 1;
} else {
sb.append(subject.charAt(i));
}
}
return sb.toString();
}
Note that the above methods are case-sensitive. If you need a case-insensitive version it is easy to modify the above because String.regionMatches can take an ignoreCase parameter.
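For illustration, a case-insensitive variant of the longest-match version might look like this. It is a sketch with a made-up class and method name; the only substantive change is the five-argument overload String.regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len).

```java
// Hypothetical case-insensitive variant of the longest-match method above.
final class CaseInsensitiveReplace {
    static String simultaneousReplaceIgnoreCase(String subject, String... pairs) {
        if (pairs.length % 2 != 0) throw new IllegalArgumentException(
                "Strings to find and replace are not paired.");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < subject.length(); i++) {
            int longestMatchIndex = -1;
            int longestMatchLength = 0;   // starting at 0 means empty search strings never match
            for (int j = 0; j < pairs.length; j += 2) {
                String find = pairs[j];
                // The leading 'true' makes the comparison ignore case.
                if (find.length() > longestMatchLength
                        && subject.regionMatches(true, i, find, 0, find.length())) {
                    longestMatchIndex = j;
                    longestMatchLength = find.length();
                }
            }
            if (longestMatchIndex >= 0) {
                sb.append(pairs[longestMatchIndex + 1]);
                i += longestMatchLength - 1;   // skip the rest of the matched text
            } else {
                sb.append(subject.charAt(i));
            }
        }
        return sb.toString();
    }
}
```

Note that the replacement text is inserted as given; the case of the matched text is not carried over.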
If you don't want any dependencies, you could simply use an array which allows a one-time change only. This is not the most efficient solution, but it should work.
public String replace(String sentence, String[]... replace){
String[] words = sentence.split("\\s+");
int[] lock = new int[words.length];
StringBuilder out = new StringBuilder();
for (int i = 0; i < words.length; i++) {
for(String[] r : replace){
if(words[i].contains(r[0]) && lock[i] == 0){
words[i] = words[i].replace(r[0], r[1]);
lock[i] = 1;
}
}
out.append((i < (words.length - 1) ? words[i] + " " : words[i]));
}
return out.toString();
}
Then it would work:
String story = "Once upon a time, there was a foo and a bar.";
String[] a = {"foo", "bar"};
String[] b = {"bar", "foo"};
String[] c = {"there", "Pocahontas"};
story = replace(story, a, b, c);
System.out.println(story); // Once upon a time, Pocahontas was a bar and a foo.
You are performing multiple search-replace operations on the input. This will produce undesired results when the replacement strings contain search strings. Consider the foo->bar, bar->foo example; here are the results for each iteration:
Once upon a time, there was a foo and a bar. (input)
Once upon a time, there was a bar and a bar. (foo->bar)
Once upon a time, there was a foo and a foo. (bar->foo, output)
You need to perform the replacement in one iteration without going back. A brute-force solution is as follows:
Search the input from current position to end for multiple search strings until a match is found
Replace the matched search string with corresponding replace string
Set current position to the next character after the replaced string
Repeat
A function such as String.indexOfAny(String[]) -> int[]{index, whichString} would be useful. Here is an example (not the most efficient one):
private static String replaceEach(String str, String[] searchWords, String[] replaceWords) {
String ret = "";
while (str.length() > 0) {
int i;
for (i = 0; i < searchWords.length; i++) {
String search = searchWords[i];
String replace = replaceWords[i];
if (str.startsWith(search)) {
ret += replace;
str = str.substring(search.length());
break;
}
}
if (i == searchWords.length) {
ret += str.substring(0, 1);
str = str.substring(1);
}
}
return ret;
}
Some tests:
System.out.println(replaceEach(
"Once upon a time, there was a foo and a bar.",
new String[]{"foo", "bar"},
new String[]{"bar", "foo"}
));
// Once upon a time, there was a bar and a foo.
System.out.println(replaceEach(
"a p",
new String[]{"a", "p"},
new String[]{"apple", "pear"}
));
// apple pear
System.out.println(replaceEach(
"ABCDE",
new String[]{"A", "B", "C", "D", "E"},
new String[]{"B", "C", "E", "E", "F"}
));
// BCEEF
System.out.println(replaceEach(
"ABCDEF",
new String[]{"ABCDEF", "ABC", "DEF"},
new String[]{"XXXXXX", "YYY", "ZZZ"}
));
// XXXXXX
// note the order of search strings, longer strings should be placed first
// in order to make the replacement greedy
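The indexOfAny helper mentioned above does not exist in the JDK, but it could be sketched roughly as follows and used to skip over unmatched text in one step instead of one character at a time. The names are hypothetical, and the sketch assumes that all search strings are non-empty.

```java
// Hypothetical helper class; indexOfAny is not part of java.lang.String.
final class IndexOfAny {
    // Returns {index, whichString} for the earliest match at or after 'from', or null if none.
    static int[] indexOfAny(String text, String[] searchWords, int from) {
        int[] best = null;
        for (int k = 0; k < searchWords.length; k++) {
            int idx = text.indexOf(searchWords[k], from);
            // Strict '<' means the earlier-listed word wins when two words match at the same index.
            if (idx != -1 && (best == null || idx < best[0])) {
                best = new int[] { idx, k };
            }
        }
        return best;
    }

    static String replaceEach(String text, String[] searchWords, String[] replaceWords) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        int[] hit;
        while ((hit = indexOfAny(text, searchWords, pos)) != null) {
            // Copy the unmatched text, then the replacement for the word that matched.
            sb.append(text, pos, hit[0]).append(replaceWords[hit[1]]);
            pos = hit[0] + searchWords[hit[1]].length();
        }
        return sb.append(text, pos, text.length()).toString();
    }
}
```

As with the startsWith loop, longer search strings should be listed before shorter ones that share a prefix if the replacement is meant to be greedy.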
You can always replace it with a word you are sure will appear nowhere else in the string, and then do the second replace later:
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
story = story.replace("foo", "StringYouAreSureWillNeverOccur").replace("bar", word2).replace("StringYouAreSureWillNeverOccur", word1);
Note that this will not work correctly if "StringYouAreSureWillNeverOccur" does occur in the input.
Consider using StringBuilder
Then store the index where each string should start. If you use a placeholder string at each position, you can remove it and insert the user's string. You can then compute the end position by adding the string length to the start position.
String firstString = "???";
String secondString = "???";
StringBuilder story = new StringBuilder("Once upon a time, there was a "
+ firstString
+ " and a "
+ secondString);
int firstWord = 30;
int secondWord = firstWord + firstString.length() + 7;
// Replace the later placeholder first, so the index of the earlier one stays valid.
story.replace(secondWord, secondWord + secondString.length(), userStringTwo);
story.replace(firstWord, firstWord + firstString.length(), userStringOne);
firstString = userStringOne;
secondString = userStringTwo;
return story;
All I can share is my own method.
You can use a temporary String temp = "<?>"; or String.Format();
This is my example code, written as a C# console application ("idea only, not an exact answer").
static void Main(string[] args)
{
String[] word1 = {"foo", "Once"};
String[] word2 = {"bar", "time"};
String story = "Once upon a time, there was a foo and a bar.";
story = Switcher(story,word1,word2);
Console.WriteLine(story);
Console.Read();
}
// Using a temporary string.
static string Switcher(string text, string[] target, string[] value)
{
string temp = "<?>";
if (target.Length == value.Length)
{
for (int i = 0; i < target.Length; i++)
{
text = text.Replace(target[i], temp);
text = text.Replace(value[i], target[i]);
text = text.Replace(temp, value[i]);
}
}
return text;
}
Or you can also use the String.Format();
static string Switcher(string text, string[] target, string[] value)
{
if (target.Length == value.Length)
{
for (int i = 0; i < target.Length; i++)
{
text = text.Replace(target[i], "{0}").Replace(value[i], "{1}");
text = String.Format(text, value[i], target[i]);
}
}
return text;
}
Output: time upon a Once, there was a bar and a foo.
Here's my version, which is word-based:
class TextReplace
{
public static String replaceAll (String text, String [] lookup,
String [] replacement, String delimiter)
{
String [] words = text.split(delimiter);
for (int i = 0; i < words.length; i++)
{
int j = find(lookup, words[i]);
if (j >= 0) words[i] = replacement[j];
}
// Return the result: assigning to the parameter would not be visible to the caller.
return StringUtils.join(words, delimiter);
}
public static int find (String [] array, String key)
{
for (int i = 0; i < array.length; i++)
if (array[i].equals(key))
return i;
return (-1);
}
}
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
A slightly tricky way, and you would need to add some more checks:
1. Split the string into an array of words
String temp[] = story.split(" "); // assumes the words are separated only by single spaces
2. Loop over temp and replace foo with bar and bar with foo; since each word is visited only once, a replaced word cannot be replaced again.
Well, the shorter answer is...
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
story = story.replace("foo", "#"+ word1).replace("bar", word2).replace("#" + word2, word1);
System.out.println(story);
Using the answer found here you can find all occurrences of the strings you wish to replace with.
So for example you run the code in the above SO answer. Create two tables of indexes (let's say bar and foo do not appear only once in your string) and you can work with those tables on replacing them in your string.
Now for replacing on specific index locations you can use:
public static String replaceStringAt(String s, int pos, int oldLength, String c) {
// Cut out oldLength characters starting at pos and put c in their place.
return s.substring(0, pos) + c + s.substring(pos + oldLength);
}
Here pos is the index where the string to be replaced starts (from the index tables mentioned above) and oldLength is its length.
So let's say you created two tables of indexes, one for each word.
Let's call them indexBar and indexFoo.
Now to replace them you could simply run two loops, one for each replacement you wish to make.
for (int i = 0; i < indexBar.length; i++)
originalString = replaceStringAt(originalString, indexBar[i], "bar".length(), newString);
Similarly another loop for indexFoo.
This may not be as efficient as other answers here but it's simpler to understand than Maps or other stuff.
This would always give you the result you wish and for multiple possible occurrences of each string. As long as you store the index of each occurrence.
Also, this answer needs neither recursion nor any external dependencies. As far as complexity goes, it is probably O(n squared), where n is the sum of the occurrences of both words.
I developed this code to solve the problem:
public static String change(String s, String s1, String s2) {
int x1 = s1.length();
int x2 = s2.length();
int x12 = s.indexOf(s1);
int x22 = s.indexOf(s2);
// Assumes s1 occurs exactly once, before the single occurrence of s2.
String s3 = s.substring(0, x12);
String s4 = s.substring(x12 + x1, x22);
String s5 = s.substring(x22 + x2);
return s3 + s2 + s4 + s1 + s5;
}
In the main use change(story,word2,word1).
String word1 = "bar";
String word2 = "foo";
String story = "Once upon a time, there was a foo and a bar.";
story = story.replace("foo", "<foo />");
story = story.replace("bar", "<bar />");
story = story.replace("<foo />", word1);
story = story.replace("<bar />", word2);
