add space between list elements at odd position - java

I have a string:
str = "Hello there"
I am removing the whitespace:
String[] parts = str.split("\\s+");
Creating a List and populating it with the parts:
List<String> theParts = new ArrayList<String>();
for (int i = 0; i < parts.length; i++) {
theParts.add(parts[i]);
}
The size of the List is 2. Now I want to increase its size so that it matches the size of another list.
Let's say the other list has size 3.
So, I check:
if (otherList.size() > theParts.size()) {
and then I want to change theParts so that it contains an empty-space element between its parts (the number of spaces being how much larger otherList is).
So, I want theParts to be (a space at every odd position):
theParts[0] = "Hello"
theParts[1] = " "
theParts[2] = "there"
I am not sure whether this can be done with Lists, but I can't think of another solution.
Or use something like join (doesn't work, just an idea to use something like this):
if (otherList.size() > theParts.size()) {
for (int i = 0; i < otherList.size(); i++) {
if (i%2 !=0) {
String.join(" ", theParts);
}
}
}

Just insert the spaces as you're populating the list:
List<String> theParts = new ArrayList<>(2 * parts.length - 1);
for (int i = 0; i < parts.length; i++) {
if (i > 0) theParts.add(" ");
theParts.add(parts[i]);
}
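As a self-contained sketch of the same idea, the snippet below wraps the loop in a helper. The class name SeparatorDemo and the method name withSeparators are made up for illustration and do not come from the question. It also guards the initial capacity, on the assumption that the parts array may be empty (a negative capacity would make the ArrayList constructor throw).

```java
import java.util.ArrayList;
import java.util.List;

class SeparatorDemo {
    // Hypothetical helper: returns the parts with a single-space element
    // inserted between neighbouring parts.
    static List<String> withSeparators(String[] parts) {
        // Math.max avoids a negative capacity when parts is empty.
        List<String> result = new ArrayList<>(Math.max(0, 2 * parts.length - 1));
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                result.add(" "); // separator goes before every part except the first
            }
            result.add(parts[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> theParts = withSeparators("Hello there".split("\\s+"));
        for (String part : theParts) {
            System.out.println("'" + part + "'");
        }
    }
}
```

With two parts the result has three elements, which matches the size of the other list in the question's example.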

You could use a word break regex:
public void test() throws Exception {
String str = "Hello there";
List<String> strings = Arrays.asList(str.split("\\b"));
for ( String s : strings ) {
System.out.println("'"+s+"'");
}
}
This will retain all of the spaces for you:
'Hello'
' '
'there'

for (String dis : theParts) {
newParts.add(dis); // newParts is another list
String last = parts[parts.length - 1]; // last element of the original array
// add a separator after every element except the last
// (assumes the last word does not also occur earlier in the list)
if (!last.equals(dis)) {
newParts.add(" ");
}
}

Related

Creating longer array if elements contain whitespaces? (Java)

currently I'm trying to make a method that does the following:
Takes 3 String Arrays (words, beforeList, and afterList)
Looks for words that are in both words and in beforeList, and if found, replaces with word in afterList
Returns a new array in which any replacement from afterList that contains a space is split into separate elements
For example, here is a test case, notice that "i'm" becomes split into two elements in the final array "i" and "am":
String [] someWords = {"i'm", "cant", "recollect"};
String [] beforeList = {"dont", "cant", "wont", "recollect", "i'm"};
String [] afterList = {"don't", "can't", "won't", "remember", "i am"};
String [] result = Eliza.replacePairs( someWords, beforeList, afterList);
if ( result != null && result[0].equals("i") && result[1].equals("am")
&& result[2].equals("can't") && result[3].equals("remember")) {
System.out.println("testReplacePairs 1 passed.");
} else {
System.out.println("testReplacePairs 1 failed.");
}
My biggest problem is accounting for this whitespace case. I know the code I post below is wrong; I've been trying different approaches. As it stands, I think my code returns an empty array whose length is that of the first array plus the number of spaces. I realize it may require a whole different approach. Any advice would be appreciated. I'm going to keep trying to figure it out, but if there is a simple way to do this, I'd love to hear and learn from it! Thank you.
public static String[] replacePairs(String []words, String [] beforeList, String [] afterList) {
if(words == null || beforeList == null || afterList == null){
return null;
}
String[] returnArray;
int countofSpaces = 0;
/* Check if words in words array can be found in beforeList, here I use
a method I created "inList". If a word is found the index of it in
beforeList will be returned, if a word is not found, -1 is returned.
If a word is found, I set the word in words to the afterList value */
for(int i = 0; i < words.length; i++){
int listCheck = inList(words[i], beforeList);
if(listCheck != -1){
words[i] = afterList[listCheck];
}
}
// This is where I check for spaces (or attempt to)
for(int j = 0; j < words.length; j++){
if(words[j].contains(" ")){
countofSpaces++;
}
}
// Here I return an array that is the length of words + the space count)
returnArray = new String[words.length + countofSpaces];
return returnArray;
}
Here's one of the many ways of doing it, assuming you have to handle cases where words contain more than one consecutive space:
for(int i = 0; i < words.length; i++){
int listCheck = inList(words[i], beforeList);
if(listCheck != -1){
words[i] = afterList[listCheck];
}
}
ArrayList<String> newWords = new ArrayList<String>();
for(int i = 0 ; i < words.length ; i++) {
String str = words[i];
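if(str.contains(" ")){
// collapse runs of spaces into a single space before splitting
while(str.contains("  ")) {
str = str.replace("  ", " ");
}
String[] subWord = str.split(" ");
newWords.addAll(Arrays.asList(subWord));
} else {
newWords.add(str);
}
}
// toArray() without an argument returns Object[], which cannot be cast to String[]
return newWords.toArray(new String[0]);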
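For reference, here is a compact, self-contained sketch of the whole method. It is one possible shape, not the asker's actual Eliza class: the class name ReplacePairsDemo is made up, inList is a stand-in written from the question's description (index of the word, or -1), and beforeList and afterList are assumed to have the same length. It splits on the regex \\s+ so runs of whitespace need no separate collapsing step, and it does not modify the input array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReplacePairsDemo {
    // Stand-in for the question's inList helper (assumed behaviour):
    // returns the index of word in list, or -1 if it is not present.
    static int inList(String word, String[] list) {
        for (int i = 0; i < list.length; i++) {
            if (list[i].equals(word)) {
                return i;
            }
        }
        return -1;
    }

    static String[] replacePairs(String[] words, String[] beforeList, String[] afterList) {
        if (words == null || beforeList == null || afterList == null) {
            return null;
        }
        List<String> out = new ArrayList<>();
        for (String word : words) {
            int idx = inList(word, beforeList);
            String replaced = (idx != -1) ? afterList[idx] : word;
            // Split on runs of whitespace so "i am" contributes two elements.
            for (String piece : replaced.trim().split("\\s+")) {
                if (!piece.isEmpty()) {
                    out.add(piece);
                }
            }
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] someWords = {"i'm", "cant", "recollect"};
        String[] beforeList = {"dont", "cant", "wont", "recollect", "i'm"};
        String[] afterList = {"don't", "can't", "won't", "remember", "i am"};
        System.out.println(Arrays.toString(replacePairs(someWords, beforeList, afterList)));
    }
}
```

Collecting into a List first sidesteps the need to count spaces in advance to size the result array.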

Sorting an array of strings [duplicate]

I have an array of Strings that contains: Extra Water, Juice, and Extra Milk. How would I get rid of the "Extra"s and use only the second word in each string, so that the expected output is Water, Juice, and Milk?
If all you want to do is remove a specific substring then:
String[] array = {"Extra Water", "Juice", "Extra Milk"};
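// toArray(String[]::new) is needed because plain toArray() returns Object[]; trim() drops the leftover space
array = Arrays.stream(array).map(s -> s.replaceAll("Extra", "").trim()).toArray(String[]::new);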
This uses Java 8 streams but you could do it just as simply with iteration.
Use String.split(" ") to split the string on a space, then check whether the resulting array has length 2. If so, take the second element of the array; otherwise keep the first.
for( int i = 0; i < array.length; i++ ) {
String[] parts = array[i].split(" ");
if( parts.length == 2 ) {
array[i] = parts[1];
}
}
EDIT: If you want to remove all duplicate words, you could do the following using two passes over the array:
// Pass 1 -- find all duplicate words
Set<String> wordSet = new HashSet<>();
Set<String> duplicateSet = new HashSet<>();
for (int i = 0; i < array.length; i++) {
String[] parts = array[i].split(" ");
for (String part : parts) {
if (!wordSet.contains(part)) {
// Haven't seen this word before
wordSet.add(part);
} else {
// This word is a duplicate word
if (!duplicateSet.contains(part)) {
duplicateSet.add(part);
}
}
}
}
// Pass 2 -- remove all words that are in the duplicate set
for (int i = 0; i < array.length; i++) {
String[] parts = array[i].split(" ");
String dedupedString = "";
for (String part : parts) {
if (!duplicateSet.contains(part)) {
dedupedString += part + " ";
}
}
array[i] = dedupedString.trim(); // drop the trailing space
}
You simply need to iterate over the array, remove "Extra" from each element, and then trim the whitespace.
String[] array = {"Extra Water", "Juice", "Extra Milk"};
for (int i = 0; i < array.length; i++) {
array[i] = array[i].replace("Extra", "").trim();
}
for (String each : array) {
System.out.println(each);
}

Extract words from an array of Strings in java based on conditions

I am trying to do an assignment that works with Arrays and Strings. The code is almost complete, but I've run into a hitch. Every time the code runs, it replaces the value in the index of the output array instead of putting the new value in a different index. For example, if I was trying to search for the words containing a prefix "b" in the array of strings, the intended output is "bat" and "brewers" but instead, the output comes out as "brewers" and "brewers". Any suggestions? (ps. The static main method is there for testing purposes.)
--
public static void main(String[] args) {
String[] words = {"aardvark", "bat", "brewers", "cadmium", "wolf", "dastardly", "enigmatic", "frenetic",
"sycophant", "rattle", "zinc", "alloy", "tunnel", "nitrate", "sample", "yellow", "mauve", "abbey",
"thinker", "junk"};
String prefix = "b";
String[] output = new String[wordsStartingWith(words, prefix).length];
output = wordsStartingWith(words, prefix);
for (int i = 0; i < output.length; i++) {
System.out.println("Words: " + i + " " + output[i]);
}
}
public static String[] wordsStartingWith(String[] words, String prefix) {
// method that finds and returns all strings that start with the prefix
String[] returnWords;
int countWords = 0;
for (int i = 0; i < words.length; i++) {
// loop to count the number of words that actually have the prefix
if (words[i].substring(0, prefix.length()).equalsIgnoreCase(prefix)) {
countWords++;
}
}
// assign length of array based on number of words containing prefix
returnWords = new String[countWords];
for (int i = 0; i < words.length; i++) {
// loop to put strings containing prefix into new array
for (int j = 0; j < returnWords.length; j++) {
if (words[i].substring(0, prefix.length()).equalsIgnoreCase(prefix)) {
returnWords[j] = words[i];
}
}
}
return returnWords;
}
--
Thank You
Soul
Don't reinvent the wheel. Your code can be replaced by this single, easy to read, bug free, line:
String[] output = Arrays.stream(words)
.filter(w -> w.startsWith(prefix))
.toArray(String[]::new);
Note that startsWith is case-sensitive, unlike the equalsIgnoreCase comparison in the question. If you just want to print the matching words:
Arrays.stream(words)
.filter(w -> w.startsWith(prefix))
.forEach(System.out::println);
It's because of the code you have written. If you had thought it through properly, you would have spotted your mistake.
The culprit code:
for (int j = 0; j < returnWords.length; j++) {
if (words[i].substring(0, prefix.length()).equalsIgnoreCase(prefix)) {
returnWords[j] = words[i];
}
}
When you get a matching word, you set every element of your output array to that word. This means the last word that satisfies the condition replaces all the previous words in the array.
All elements of returnWords are first set to "bat", and then each element is replaced by "brewers".
The corrected code looks like this:
int j = 0;
for (int i = 0; i < words.length; i++) {
if (words[i].substring(0, prefix.length()).equalsIgnoreCase(prefix)) {
returnWords[j] = words[i];
j++;
}
}
Also, you are calling the method twice, which is not needed.
For example, these statements
String[] output = new String[wordsStartingWith(words, prefix).length];
output = wordsStartingWith(words, prefix);
can be simplified to a single statement
String[] output = wordsStartingWith(words, prefix);
The way you're doing this is looping through the same array multiple times.
You only need to check the values once:
public static void main(String[] args) {
String[] words = {"aardvark", "bat", "brewers", "cadmium", "wolf", "dastardly", "enigmatic", "frenetic",
"sycophant", "rattle", "zinc", "alloy", "tunnel", "nitrate", "sample", "yellow", "mauve", "abbey",
"thinker", "junk"};
String prefix = "b";
for (int i = 0; i < words.length; i++) {
if (words[i].toLowerCase().startsWith(prefix.toLowerCase())) {
System.out.println("Words: " + i + " " + words[i]);
}
}
}
Instead of doing two separate loops, try just having one:
String[] returnWords;
String[] foundWords = new String[words.length]; // large enough even if every word matches
int index = 0; // next free slot in foundWords
int countWords = 0;
for (int i = 0; i < words.length; i++) {
// loop to collect the words that actually have the prefix
if (words[i].substring(0, prefix.length()).equalsIgnoreCase(prefix)) {
foundWords[index] = words[i];
index++;
countWords++;
}
}
// assign length of array based on number of words containing prefix
returnWords = new String[countWords];
for (int i = 0; i < countWords; i++) {
returnWords[i] = foundWords[i];
}
My method uses another array (foundWords) for all the words found during the first loop; it has the same size as words in case every single word starts with the prefix. index keeps track of where to place the next found word in foundWords. Lastly, you just copy the first countWords elements into returnWords.
Not only will this fix your code, it will also make it run slightly faster; the bigger the word bank, the bigger the saving.
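If a collection is allowed in the assignment, the temporary array and both counters can be dropped. The sketch below is one way to write that; the class name PrefixFilter is made up for illustration. It uses String.regionMatches with ignoreCase set to true, which keeps the case-insensitive behaviour of the original comparison and simply returns false for words shorter than the prefix, where substring(0, prefix.length()) would throw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrefixFilter {
    // Returns all words that start with prefix, ignoring case.
    static String[] wordsStartingWith(String[] words, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String word : words) {
            // regionMatches returns false (no exception) when word is shorter than prefix
            if (word.regionMatches(true, 0, prefix, 0, prefix.length())) {
                matches.add(word);
            }
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] words = {"aardvark", "bat", "brewers", "cadmium", "wolf"};
        System.out.println(Arrays.toString(wordsStartingWith(words, "b")));
    }
}
```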

Compare two arrayList and get longest matching String

So what I'm trying to do is take two text files and return the longest string that occurs in both. I put both text files into ArrayLists, separated word by word. This is my code so far, but I'm wondering how I would return the longest String and not just the first one found.
for(int i = 0; i < file1Words.size(); i++)
{
for(int j = 0; j < file2Words.size(); j++)
{
if(file1Words.get(i).equals(file2Words.get(j)))
{
matchingString += file1Words.get(i) + " ";
}
}
}
String longest = "";
for (String s1: file1Words)
for (String s2: file2Words)
if (s1.length() > longest.length() && s1.equals(s2)) longest = s1;
If you are looking for better performance in time and space than the replies above, you can use the code below.
System.out.println("Start time :"+System.currentTimeMillis());
String longestMatch="";
for(int i = 0; i < file1Words.size(); i++) {
if(file1Words.get(i).length()>longestMatch.length()){
for(int j = 0; j < file2Words.size(); j++) {
String w = file1Words.get(i);
if (w.length() > longestMatch.length() && w.equals(file2Words.get(j)))
longestMatch = w;
}
}
}
System.out.println("End time :"+System.currentTimeMillis());
I'm not going to give you the code, but I'll help you with the main ideas...
You will need a new string variable "curLargestString" to keep track of what is currently the largest string. Declare it outside of your for loops. Now, every time you get two matching words, compare the length of the matching word to the length of the word in "curLargestString". If the new matching word is longer, then set "curLargestString" to the new word. Then, after your for loops have run, return curLargestString.
One more note: be sure to initialize curLargestString with an empty string. This will prevent an error when you call the length method on it after you get your first matching word.
Assuming your files are small enough to fit in memory, sort them both with a custom comparator that puts longer strings before shorter ones and otherwise sorts lexicographically.
Then go through both lists in order, advancing only one index at a time (the one pointing to the "smaller" entry of the two), and return the first match.
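A minimal sketch of that sort-and-walk approach might look like the following. The class name LongestCommonWord and the comparator name LONGEST_FIRST are made up for illustration, and returning an empty string when nothing matches is an assumption. The lists are copied before sorting so the callers' lists are left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LongestCommonWord {
    // Longer strings sort first; equal lengths fall back to lexicographic order.
    static final Comparator<String> LONGEST_FIRST = (a, b) -> {
        if (a.length() != b.length()) {
            return Integer.compare(b.length(), a.length());
        }
        return a.compareTo(b);
    };

    static String longestCommon(List<String> first, List<String> second) {
        List<String> a = new ArrayList<>(first);
        List<String> b = new ArrayList<>(second);
        a.sort(LONGEST_FIRST);
        b.sort(LONGEST_FIRST);
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = LONGEST_FIRST.compare(a.get(i), b.get(j));
            if (cmp == 0) {
                return a.get(i); // first match in this order is the longest one
            }
            // Advance whichever index points at the entry that sorts earlier;
            // that entry cannot match anything later in the other list.
            if (cmp < 0) {
                i++;
            } else {
                j++;
            }
        }
        return ""; // assumption: empty string means "no common word"
    }

    public static void main(String[] args) {
        List<String> file1Words = Arrays.asList("the", "quick", "brown", "fox");
        List<String> file2Words = Arrays.asList("a", "quick", "fox", "jumps");
        System.out.println(longestCommon(file1Words, file2Words));
    }
}
```

After the two sorts, the walk touches each element at most once, so the sorting dominates the running time rather than a nested loop over both lists.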
You can use the following code:
String matchingString = "";
Set<String> intersection = new HashSet<>(file1Words);
intersection.retainAll(file2Words); // keep only the words present in both lists
for (String word : intersection)
if (word.length() > matchingString.length())
matchingString = word;
private String getLongestString(List<String> list1, List<String> list2) {
String longestString = ""; // start empty so length() is safe before the first match
for (String list1String : list1) {
if (list1String.length() > longestString.length()) {
for (String list2String : list2) {
if (list1String.equals(list2String)) {
longestString = list1String;
}
}
}
}
return longestString;
}

Tokenize method: Split string into array

I've been really struggling with a programming assignment. Basically, we have to write a program that translates a sentence in English into one in Pig Latin. The first method we need is one to tokenize the string, and we are not allowed to use the Split method usually used in Java. I've been trying to do this for the past 2 days with no luck, here is what I have so far:
public class PigLatin
{
public static void main(String[] args)
{
String s = "Hello there my name is John";
Tokenize(s);
}
public static String[] Tokenize(String english)
{
String[] tokenized = new String[english.length()];
for (int i = 0; i < english.length(); i++)
{
int j= 0;
while (english.charAt(i) != ' ')
{
String m = "";
m = m + english.charAt(i);
if (english.charAt(i) == ' ')
{
j++;
}
else
{
break;
}
}
for (int l = 0; l < tokenized.length; l++) {
System.out.print(tokenized[l] + ", ");
}
}
return tokenized;
}
}
All this does is print an enormously long array of "null"s. If anyone can offer any input at all, I would reallllyyyy appreciate it!
Thank you in advance
Update: We are supposed to assume that there will be no punctuation or extra spaces, so basically whenever there is a space, it's a new word
If I understand your question, and what your Tokenize was intended to do; then I would start by writing a function to split the String
static String[] splitOnWhiteSpace(String str) {
List<String> al = new ArrayList<>();
StringBuilder sb = new StringBuilder();
for (char ch : str.toCharArray()) {
if (Character.isWhitespace(ch)) {
if (sb.length() > 0) {
al.add(sb.toString());
sb.setLength(0);
}
} else {
sb.append(ch);
}
}
if (sb.length() > 0) {
al.add(sb.toString());
}
String[] ret = new String[al.size()];
return al.toArray(ret);
}
and then print using Arrays.toString(Object[]) like
public static void main(String[] args) {
String s = "Hello there my name is John";
String[] words = splitOnWhiteSpace(s);
System.out.println(Arrays.toString(words));
}
If you're allowed to use the StringTokenizer object (which I think is what the assignment is asking for), it would look something like this:
StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
which will produce the output:
this
is
a
test
Taken from here.
The string is split into tokens, which the tokenizer returns one at a time. The while loop iterates over the tokens, which is where you can apply the Pig Latin logic.
Some hints for doing the "manual splitting" work:
There is a method String#indexOf(int ch, int fromIndex) to help you find the next occurrence of a character.
There is a method String#substring(int beginIndex, int endIndex) to extract a certain part of a string.
Here is some pseudo-code that shows you how to split it (you will need more safety handling; I will leave that to you):
List<String> results = ...;
int startIndex = 0;
int endIndex = 0;
while (startIndex < inputString.length) {
endIndex = get next index of space after startIndex
if no space found {
endIndex = inputString.length
}
String result = get substring of inputString from startIndex to endIndex-1
results.add(result)
startIndex = endIndex + 1 // move startIndex to next position after space
}
// here, results contains all splitted words
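The pseudo-code above could be turned into runnable Java roughly as follows. This is a sketch, with the class name ManualSplit and method name tokenize chosen for illustration. Note that the end index of substring is exclusive, so endIndex is passed as-is rather than endIndex - 1. The extra endIndex > startIndex check, which skips empty tokens from repeated spaces, is an addition beyond what the assignment requires.

```java
import java.util.ArrayList;
import java.util.List;

class ManualSplit {
    // Splits input on single space characters without using String.split.
    static List<String> tokenize(String input) {
        List<String> results = new ArrayList<>();
        int startIndex = 0;
        while (startIndex < input.length()) {
            int endIndex = input.indexOf(' ', startIndex); // next space, or -1
            if (endIndex == -1) {
                endIndex = input.length(); // no more spaces: take the rest
            }
            if (endIndex > startIndex) {
                // substring's end index is exclusive, so this stops before the space
                results.add(input.substring(startIndex, endIndex));
            }
            startIndex = endIndex + 1; // move past the space
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello there my name is John"));
    }
}
```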
String english = "hello my fellow friend";
ArrayList<String> tokenized = new ArrayList<String>();
String m = "";
int j = 0; //index for tokenised array list.
for (int i = 0; i < english.length(); i++)
{
//the condition's position do matter here, if you
//change them, english.charAt(i) will give index
//out of bounds exception
while( i < english.length() && english.charAt(i) != ' ')
{
m = m + english.charAt(i);
i++;
}
//add to array list if there is some string
//if its only ' ', array will be empty so we are OK.
if(m.length() > 0 )
{
tokenized.add(m);
j++;
m = "";
}
}
//print the array list
for (int l = 0; l < tokenized.size(); l++) {
System.out.print(tokenized.get(l) + ", ");
}
This prints "hello, my, fellow, friend, ".
I used an ArrayList since the number of words is not known up front.
