Find n:th word in a string

I'm trying to find the nth word in a string. I am not allowed to use StringTokenizer or the split method of the String class.
I now realize that I can use white space as a separator. The only problem is that I don't know how to find the location of the first white space.
public static String pick(String message, int number){
String lastWord;
int word = 1;
String result = "haha";
for(int i=0; i<message.length();i++){
if(message.charAt(i)==' '){
word++;
}
}
if(number<=word && number > 0 && number != 1){//Confused..
int space = message.indexOf(" ");//improve
int nextSpace = message.indexOf(" ", space + 1);//also check dat
result = message.substring(space,message.indexOf(' ', space + 1));
}
if(number == 1){
result = message.substring(0,message.indexOf(" "));
}
if(number>word){
lastWord = message.substring(message.lastIndexOf(" ")+1);
return lastWord;
}
else return result;
}

The current implementation is overcomplicated and hard to understand.
Consider this alternative algorithm:
Initialize index = 0 to track your position in the input string
Repeat n - 1 times:
Skip over non-space characters
Skip over space characters
At this point you are at the start of the n-th word; save this position as start
Skip over non-space characters
At this point you are just after the end of the n-th word
Return the substring between start and end
Like this:
public static String pick(String message, int n) {
int index = 0;
for (int i = 1; i < n; i++) {
while (index < message.length() && message.charAt(index) != ' ') index++;
while (index < message.length() && message.charAt(index) == ' ') index++;
}
int start = index;
while (index < message.length() && message.charAt(index) != ' ') index++;
return message.substring(start, index);
}
Note that if n is greater than the number of words in the input,
this returns an empty string.
(If that's not what you want, it should be easy to tweak.)
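One possible tweak, as a minimal sketch rather than a definitive version: the variant below skips leading spaces and returns null when the input has fewer than n words. The names NthWord and pickOrNull are made up for illustration, and the code assumes words are separated by plain space characters only.

```java
final class NthWord {
    // Hypothetical variant of pick(): returns null when there is no n-th word.
    static String pickOrNull(String message, int n) {
        if (n < 1) {
            return null;
        }
        int index = 0;
        int length = message.length();
        // Skip leading spaces so the first word starts at a non-space character.
        while (index < length && message.charAt(index) == ' ') index++;
        for (int i = 1; i < n; i++) {
            while (index < length && message.charAt(index) != ' ') index++;
            while (index < length && message.charAt(index) == ' ') index++;
        }
        int start = index;
        while (index < length && message.charAt(index) != ' ') index++;
        // An empty span means the input ended before the n-th word was reached.
        return start == index ? null : message.substring(start, index);
    }

    public static void main(String[] args) {
        System.out.println(pickOrNull("This is a test", 3));   // a
        System.out.println(pickOrNull("  This is a test", 1)); // This
        System.out.println(pickOrNull("This is a test", 5));   // null
    }
}
```

Returning null keeps "no such word" distinguishable from a real result; throwing an exception would be an equally reasonable choice.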

CHEAT (using regex) (see note 1)
public static String pick(String message, int number){
Matcher m = Pattern.compile("^\\W*" + (number > 1 ? "(?:\\w+\\W+){" + (number - 1) + "}" : "") + "(\\w+)").matcher(message);
return (m.find() ? m.group(1) : null);
}
Test
System.out.println(pick("This is a test", 1));
System.out.println(pick("! This # is # a $ test % ", 3));
System.out.println(pick("This is a test", 5));
Output
This
a
null
1) Only StringTokenizer and split are disallowed ;-)

This needs some edge case handling (e.g. there are fewer than n words), but here's the idea I was getting at. This is similar to your solution, but IMO less elegant than janos'.
public static String pick(String message, int n) {
int wordCount = 0;
String word = "";
int wordBegin = 0;
int wordEnd = message.indexOf(' ');
while (wordEnd >= 0 && wordCount < n) {
word = message.substring(wordBegin, wordEnd).trim();
message = message.substring(wordEnd).trim();
wordEnd = message.indexOf(' ');
wordCount++;
}
if (wordEnd == -1 && wordCount + 1 == n) {
return message;
}
if (wordCount + 1 < n) {
return "Not enough words to satisfy";
}
return word;
}

Most iteration in Java can now be replaced by streams. Whether this is an improvement is a matter of (strong) opinion.
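// Each whitespace-to-letter transition marks the start of a word after the first,
// so if message starts with a word, the second transition begins the third word.
int thirdWordIndex = IntStream.range(0, message.length() - 1)
.filter(i -> Character.isWhitespace(message.charAt(i)))
.filter(i -> Character.isLetter(message.charAt(i + 1)))
.skip(1).findFirst()
.orElseThrow(IllegalArgumentException::new) + 1;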

Related

How to remove repeating code in this solution?

I have this code which compresses characters in the given string and replaces repeated adjacent characters with their count.
Consider the following example:
Input:
aaabbccdsa
Expected output:
a3b2c2dsa
My code works properly, but I think the repeated if condition can be removed.
public class Solution {
public static String getCompressedString(String str) {
String result = "";
char anch = str.charAt(0);
int count = 0;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (ch == anch) {
count++;
} else {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
anch = ch;
count = 1;
}
if (i == str.length() - 1) {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
}
}
return result;
}
}
In this solution, the code below is repeated twice:
if (count == 1) {
result += anch;
} else {
result += anch + Integer.toString(count);
}
Please note that I don't want to use a separate method for the repeated logic.

You could do away with the if statements.
public static String getCompressedString(String str) {
char[] a = str.toCharArray();
StringBuilder sb = new StringBuilder();
for(int i=0,j=0; i<a.length; i=j){
for(j=i+1;j < a.length && a[i] == a[j]; j++);
sb.append(a[i]).append(j-i==1?"":j-i);
}
return sb.toString();
}

You can do something like this:
public static String getCompressedString(String str) {
String result = "";
int count = 1;
for (int i = 0; i < str.length(); i++) {
if (i + 1 < str.length() && str.charAt(i) == str.charAt(i + 1)) {
count++;
} else {
if (count == 1) {
result += str.charAt(i);
} else {
result += str.charAt(i) + "" + count;
count = 1;
}
}
}
return result;
}
I got rid of the repeated code, and it works as intended.

You can use this approach as explained below:
Code:
public class Test {
public static void main(String[] args) {
String s = "aaabbccdsaccbbaaadsa";
char[] strArray = s.toCharArray();
char ch0 = strArray[0];
int counter = 0;
StringBuilder sb = new StringBuilder();
for(int i=0;i<strArray.length;i++){
if(ch0 == strArray[i]){//check for consecutive characters and increment the counter
counter++;
} else { // when character changes while iterating
sb.append(ch0 + "" + (counter > 1 ? counter : ""));
counter = 1; // reset the counter to 1
ch0 = strArray[i]; // reset the ch0 with the current character
}
if(i == strArray.length-1){// case for last element of the string
sb.append(ch0 + "" + (counter > 1 ? counter : ""));
}
}
System.out.println(sb);
}
}
Sample Input/Output:
Input:: aaabbccdsaccbbaaadsa
Output:: a3b2c2dsac2b2a3dsa
Input:: abcdaaaaa
Output:: abcda5

Since the bodies of the else branch and the second if are the same, we can merge them by updating the condition. The updated body of the function will be:
String result = "";
char anch = str.charAt(0);
int count = 0;
char ch = str.charAt(0); // declare ch outside the loop, and initialize to avoid error
for (int i = 0; i < str.length(); i++) {
ch = str.charAt(i);
if (ch == anch) {
count++;
}
// check if the second condition is false, or if we are at the end of the string
if (ch != anch || i == str.length() - 1) {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
anch = ch;
count = 1;
}
}
// The loop flushes only the previous run when the last character starts a new
// run at the final index, so append that last character here.
int n = str.length();
if (n > 1 && str.charAt(n - 1) != str.charAt(n - 2)) {
result += ch;
}
return result;
Test Cases:
I have tested the solution on the following inputs:
aaabbccdsa -> a3b2c2dsa
aaab -> a3b
aaa -> a3
ab -> ab
aabbc -> a2b2c
Optional
If you want to make it shorter, you can update these 2 conditions.
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
as
result += anch;
if (count != 1) { // from here
result += count;// no need to convert (implicit conversion)
} // to here

Here's a single-statement solution using Stream API and regular expressions:
public static final Pattern GROUP_OF_ONE_OR_MORE = Pattern.compile("(.)\\1*");
public static String getCompressedString(String str) {
return GROUP_OF_ONE_OR_MORE.matcher(str).results()
.map(MatchResult::group)
.map(s -> s.charAt(0) + (s.length() == 1 ? "" : String.valueOf(s.length())))
.collect(Collectors.joining());
}
main()
public static void main(String[] args) {
System.out.println(getCompressedString("aaabbccdsa"));
System.out.println(getCompressedString("awswwwhhhp"));
}
Output:
a3b2c2dsa // "aaabbccdsa"
awsw3h3p // "awswwwhhhp"
How does it work
A regular expression "(.)\\1*" is capturing a group (.) of identical characters of length 1 or greater. Where . - denotes any symbol, and \\1 is a back reference to the group.
Method Matcher.results() "returns a stream of match results for each subsequence of the input sequence that matches the pattern".
The only thing left is to evaluate the length of each group and transform it accordingly before collecting into the resulting String.
Links:
A quick tutorial on Regular Expressions.
Official tutorials on lambda expressions and streams

You can use a function which has the following 3 parameters : result, anch, count .
something of this sort:
private static String extractedFunction(String result,int count, char anch) {
return count ==1 ? (result + anch) : (result +anch+Integer.toString(count) );
}
make a function call from those two points like this :
result = extractedFunction(result,count,anch);

Try this.
static final Pattern PAT = Pattern.compile("(.)\\1*");
static String getCompressedString(String str) {
return PAT.matcher(str)
.replaceAll(m -> m.group(1)
+ (m.group().length() == 1 ? "" : m.group().length()));
}
Test cases:
@Test
public void testGetCompressedString() {
assertEquals("", getCompressedString(""));
assertEquals("a", getCompressedString("a"));
assertEquals("abc", getCompressedString("abc"));
assertEquals("abc3", getCompressedString("abccc"));
assertEquals("a3b2c2dsa", getCompressedString("aaabbccdsa"));
}
The regular expression "(.)\\1*" used here matches any sequence of identical characters. .replaceAll() takes a lambda expression as an argument, evaluates the lambda expression each time the pattern matches, and replaces the original string with the result.
The lambda expression is passed a MatchResult object containing the results of the match. Here we are receiving this object in the variable m. m.group() returns the entire matched substring, m.group(1) returns its first character.
If the input string is "aaabbccdsa", it will be processed as follows.
m.group(1)   m.group()   returned by lambda
a            aaa         a3
b            bb          b2
c            cc          c2
d            d           d
s            s           s
a            a           a

Splitting a string in an array of strings of limited size

I have a string of a random address like
String s = "H.N.-13/1443 laal street near bharath dental lab near thana qutubsher near modern bakery saharanpur uttar pradesh 247001";
I want to split it into an array of strings under two conditions:
Each element of the array has a length less than or equal to 20.
No element ends awkwardly in the middle of a word.
For example, splitting every 20 characters would produce:
"H.N.-13/1443 laal st"
"reet near bharath de"
"ntal lab near thana"
"qutubsher near moder"
"n bakery saharanpur"
but the correct output would be:
"H.N.-13/1443 laal"
"street near bharath"
"dental lab near"
"thana qutubsher near"
"modern bakery"
"saharanpur"
Notice how each element of the string array has a length less than or equal to 20.
The first output above is what my code currently produces:
static String[] split(String s,int max){
int total_lines = s.length () / 24;
if (s.length () % 24 != 0) {
total_lines++;
}
String[] ans = new String[total_lines];
int count = 0;
int j = 0;
for (int i = 0; i < total_lines; i++) {
for (j = 0; j < 20; j++) {
if (ans[count] == null) {
ans[count] = "";
}
if (count > 0) {
if ((20 * count) + j < s.length()) {
ans[count] += s.charAt (20 * count + j);
} else {
break;
}
} else {
ans[count] += s.charAt (j);
}
}
String a = "";
a += ans[count].charAt (0);
if (a.equals (" ")) {
ans[i] = ans[i].substring (0, 0) + "" + ans[i].substring (1);
}
System.out.println (ans[i]);
count++;
}
return ans;
}
public static void main (String[]args) {
String add = "H.N.-13/1663 laal street near bharath dental lab near thana qutubsher near modern bakery";
String city = "saharanpur";
String state = "uttar pradesh";
String zip = "247001";
String s = add + " " + city + " " + state + " " + zip;
String[]ans = split (s);
}

Find all occurrences of up to 20 chars starting with a non-space and ending with a word boundary, and collect them to a List:
List<String> parts = Pattern.compile("\\S.{1,19}\\b").matcher(s)
.results()
.map(MatchResult::group)
.collect(Collectors.toList());

The code is not very clear, but at first glance it seems you are building each element character by character, which is why you get the output you see. Instead, go word by word, so that a word that does not fit is carried over whole to the next string. A more promising approach would be:
static String[] splitString(String s, int max) {
String[] words = s.split("\\s+");
List<String> out = new ArrayList<>();
int numWords = words.length;
int i = 0;
while (i <numWords) {
int len = 0;
StringBuilder sb = new StringBuilder();
while (i < numWords && len < max) {
int wordLength = words[i].length();
len += (i == numWords-1 ? wordLength : wordLength + 1);//1 for space
if (len <= max) {
sb.append(words[i]+ " ");
i++;
}
}
out.add(sb.toString().trim());
}
return out.toArray(new String[] {});
}
Note: It works on your example input, but a word longer than max characters never fits on any line, so the outer loop would not terminate on such input; tweak the code if that case can occur.
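For the long-word case, here is a minimal sketch of a greedy word wrap that hard-splits any word longer than max. The names AddressWrap and wrap are made up for illustration, and hard-splitting is only one possible policy (rejecting such input would be another).

```java
import java.util.ArrayList;
import java.util.List;

final class AddressWrap {
    // Hypothetical helper: packs whole words into lines of at most max characters.
    static List<String> wrap(String s, int max) {
        if (max < 1) {
            throw new IllegalArgumentException("max must be at least 1");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : s.trim().split("\\s+")) {
            // A word that cannot fit on a line by itself is cut into max-sized pieces.
            while (word.length() > max) {
                if (line.length() > 0) {
                    lines.add(line.toString());
                    line.setLength(0);
                }
                lines.add(word.substring(0, max));
                word = word.substring(max);
            }
            if (word.isEmpty()) {
                continue;
            }
            // The +1 accounts for the separating space when the line already has content.
            int needed = line.length() == 0 ? word.length() : line.length() + 1 + word.length();
            if (needed > max) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String s = "H.N.-13/1443 laal street near bharath dental lab near thana qutubsher near modern bakery";
        for (String part : wrap(s, 20)) {
            System.out.println(part);
        }
    }
}
```

Because the separating space is only counted when a word is actually added to a non-empty line, a line such as "thana qutubsher near" (exactly 20 characters) is kept whole.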

Hello, I am trying to write a method that duplicates all vowels, but only if they are on their own. For example, "beautiful" would return "beautiifuul".

Here's what I have. If someone could give me some idea of what to do, that would be great. I think taking the index and counting how many vowels are together would be helpful, but I'm not sure how to implement that. isVowel is a helper method that determines whether the char is a vowel.
public static String doubleVowelsMaybe(String s)
{
int run =0;
String n = "";
for(int i = 0; i< s.length(); ++i)
{
char k = s.charAt(i);
if(isVowel(k))
{
}
if(run == 1)
{
n = n + s.substring(i, i+1) + s.substring(i, i+1);
run=0;
}
else
{
n = n + s.substring(i, i+1);
run= 0;
}
}
return n;
}

Most simple string manipulation tasks like this can be fairly easily done with a regex. This one's a one-liner:
public static String doubleVowelsMaybe(String s) {
return s.replaceAll("(?<![aeiou])([aeiou])(?![aeiou])", "$1$1");
}
The regex works as follows:
(?<![aeiou]) is a negative lookbehind, so it matches only if the character is not preceded by a vowel.
([aeiou]) matches a single vowel, and captures it to group number 1.
(?![aeiou]) is a negative lookahead, so it matches only if the character is not followed by a vowel.
The replacement of $1$1 means two copies of whatever was matched by group number 1, which is the single vowel character.
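If a regex is not an option, the same lookbehind/lookahead idea can be sketched with a plain loop that checks the neighbours of each character. The names VowelDoubler and doubleLoneVowels are made up for illustration, and like the regex, this sketch assumes only lowercase a, e, i, o, u count as vowels.

```java
final class VowelDoubler {
    // Assumption: only lowercase a, e, i, o, u are vowels, matching [aeiou].
    static boolean isVowel(char c) {
        return "aeiou".indexOf(c) >= 0;
    }

    static String doubleLoneVowels(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(c);
            // Same role as the negative lookbehind and negative lookahead.
            boolean prevIsVowel = i > 0 && isVowel(s.charAt(i - 1));
            boolean nextIsVowel = i + 1 < s.length() && isVowel(s.charAt(i + 1));
            if (isVowel(c) && !prevIsVowel && !nextIsVowel) {
                out.append(c); // a vowel on its own is written twice
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleLoneVowels("beautiful")); // beautiifuul
    }
}
```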

import java.util.*;
class Hello {
public static void main(String[] args) {
String abc = "beautiful";
String n = "";
int i = 0;
char[] abcchar = abc.toCharArray();
HashSet<Character> hs = new HashSet<>();
hs.add('a');
hs.add('e');
hs.add('i');
hs.add('o');
hs.add('u');
while (i < abcchar.length) {
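// A vowel at this point is never preceded by a vowel, so it is "on its own"
// when it is the last character or the next character is not a vowel.
if (hs.contains(abcchar[i]) && (i + 1 == abcchar.length || !hs.contains(abcchar[i + 1]))) {
n = n + abc.substring(i, i + 1) + abc.substring(i, i + 1);
} else {
while (i < abcchar.length && hs.contains(abcchar[i])) {
n = n + abc.substring(i, i + 1);
i++;
}
if (i < abcchar.length) {
n = n + abc.substring(i, i + 1);
}
}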
i++;
}
System.out.print(n);
}
}

Check endings of an String without built-in methods like endsWith() in Java

I want to check whether each word in a string ends with a specific ending, where endings can have various lengths. I can't use arrays or methods like endsWith() for this. The only methods I'm allowed to use are charAt() and length().
public class TextAnalyse {
public static void main(String[] args) {
System.out.println(countEndings("This is a test", "t"));
System.out.println(countEndings("Waren sollen rollen", "en"));
System.out.println(countEndings("The ending is longer then every single word", "abcdefghijklmn"));
System.out.println(countEndings("Today is a good day", "xyz"));
System.out.println(countEndings("Thist is a test", "t"));
System.out.println(countEndings("This is a test!", "t"));
System.out.println(countEndings("Is this a test?", "t"));
}
public static int countEndings(String text, String ending) {
int counter = 0;
int counting;
int lastStringChar;
for (int i = 0; i < text.length(); i++) {
lastStringChar = 0;
if (!(text.charAt(i) >= 'A' && text.charAt(i) <= 'Z' || text.charAt(i) >= 'a' && text.charAt(i) <= 'z') || i == text.length() - 1) {
if( i == text.length() - 1 ){
lastStringChar = 1;
}
counting = 0;
for (int j = 0; j + lastStringChar < ending.length() && i > ending.length(); j++) {
if (text.charAt(i - ending.length() + j + lastStringChar) == ending.charAt(j)) {
counting = 1;
} else {
counting = 0;
}
}
counter += counting;
}
}
return counter;
}
}
The actual result is that I get one count too few; I guess that's because it doesn't check the last characters properly.

The most simple solution I can come up with is the following:
Check, if a word ends with a given suffix:
public static boolean endsWith(String word, String suffix) {
if(suffix.length() > word.length()) {
return false;
}
int textIndex = (word.length() - 1);
int suffixIndex = (suffix.length() - 1);
while(suffixIndex >= 0) {
char textChar = word.charAt(textIndex);
char suffixChar = suffix.charAt(suffixIndex);
if(textChar != suffixChar) {
return false;
}
textIndex--;
suffixIndex--;
}
return true;
}
Split the given text into its words and use the above method to count every word ending with the given ending:
public static int countEndings(String text, String ending) {
{
//maybe remove punctuation here...
//(if yes, use String.replace for example)
}
String[] words = text.split(" ");
int counter = 0;
for(String word: words) {
if(endsWith(word, ending)) {
counter++;
}
}
return counter;
}
Also consider removing unwanted punctuation such as '!' or '?': the above implementation will not count a word like test! as ending with t.
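Since the question only allows charAt() and length(), here is a minimal sketch that avoids split() and arrays altogether. It treats a word as a maximal run of ASCII letters, so punctuation such as '!' simply ends the word. The names EndingCounter and endsAt are made up for illustration.

```java
final class EndingCounter {
    static boolean isLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Counts words (maximal runs of ASCII letters) that end with the given ending.
    static int countEndings(String text, String ending) {
        int counter = 0;
        int wordStart = -1; // index of the current word's first letter, -1 outside a word
        // Running i up to text.length() lets the end of the text close the last word.
        for (int i = 0; i <= text.length(); i++) {
            boolean inWord = i < text.length() && isLetter(text.charAt(i));
            if (inWord) {
                if (wordStart < 0) {
                    wordStart = i;
                }
            } else if (wordStart >= 0) {
                // The word occupies [wordStart, i); compare its tail with the ending.
                if (endsAt(text, wordStart, i, ending)) {
                    counter++;
                }
                wordStart = -1;
            }
        }
        return counter;
    }

    static boolean endsAt(String text, int wordStart, int wordEnd, String ending) {
        if (ending.length() > wordEnd - wordStart) {
            return false; // the ending is longer than the word itself
        }
        for (int j = 0; j < ending.length(); j++) {
            if (text.charAt(wordEnd - ending.length() + j) != ending.charAt(j)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(countEndings("Waren sollen rollen", "en")); // 3
        System.out.println(countEndings("This is a test!", "t"));      // 1
    }
}
```

Tracking the start of the current word means the comparison never reads characters that belong to a previous word.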

I guess you are able to use regular expression.
If you want count words with ending, you can use the following code:
public static int countEndings(String text, String ending) {
// Pattern.quote keeps regex metacharacters in the ending from being interpreted
final Matcher wordWithEndMatches = Pattern.compile("\\b[A-Za-z]*" + Pattern.quote(ending) + "\\b").matcher(text);
int count = 0;
while(wordWithEndMatches.find()) {
count++;
}
return count;
}

I want the string pattern aabbcc to be displayed as 2a2b2c

I have somehow got the output with the help of some browsing. But I couldn't understand the logic behind the code. Is there any simple way to achieve this?
public class LetterCount {
public static void main(String[] args)
{
String str = "aabbcccddd";
int[] counts = new int[(int) Character.MAX_VALUE];
// If you are certain you will only have ASCII characters, I would use `new int[256]` instead
for (int i = 0; i < str.length(); i++) {
char charAt = str.charAt(i);
counts[(int) charAt]++;
}
for (int i = 0; i < counts.length; i++) {
if (counts[i] > 0)
//System.out.println("Number of " + (char) i + ": " + counts[i]);
System.out.print(""+ counts[i] + (char) i + "");
}
}
}

There are 3 conditions which need to be taken care of:
if (s.charAt(x) != s.charAt(x + 1) && count == 1) ⇒ print the counter and character;
if (s.charAt(x) == s.charAt(x + 1)) ⇒ increase the counter;
if (s.charAt(x) != s.charAt(x + 1) && count >= 2) ⇒ reset to counter 1.
{
int count= 1;
int x;
for (x = 0; x < s.length() - 1; x++) {
// print the count first, then the character (e.g. 2a)
if (s.charAt(x) != s.charAt(x + 1) && count == 1) {
System.out.print(count);
System.out.print(s.charAt(x));
}
else if (s.charAt(x)== s.charAt(x + 1)) {
count++;
}
else if (s.charAt(x) != s.charAt(x + 1) && count >= 2) {
System.out.print(count);
System.out.print(s.charAt(x));
count = 1;
}
}
System.out.print(count);
System.out.println(s.charAt(x));
}

The code is really simple. It uses the ASCII value of a character to index into the array that stores the frequency of each character.
The output is produced by iterating over that array and, for every character whose frequency is greater than 0, printing the frequency followed by the character.
If equal characters are always consecutive in the input string, the problem can be solved with O(1) extra space.
For example, in your string aabbcc the same characters are consecutive, so we can take advantage of this fact and count the character frequency and print it at the same time.
for (int i = 0; i < str.length(); i++)
{
int freq = 1;
while((i+1)<str.length()&&str.charAt(i) == str.charAt(i+1))
{++freq;++i;}
// concatenate with "" so the int and char are printed as text, not added as numbers
System.out.print(freq + "" + str.charAt(i));
}

You are trying to keep count of the number of times each character is found. An array is referenced by an index. For example, the ASCII code for the lowercase letter a is the integer 97. Thus the count of the number of times the letter a is seen is in counts[97]. After every element in the counts array has been set, you print out how many have been found.
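To make the difference concrete, here is a small sketch that puts the frequency-table approach next to a run-length approach. The class and method names are made up for illustration, and the table of 128 entries assumes ASCII-only input. Both give the same result for aabbcc, but they differ as soon as a character reappears later in the string.

```java
final class CountVersusRuns {
    // Frequency table indexed by character code (assumes ASCII input).
    static String byFrequency(String str) {
        int[] counts = new int[128];
        for (int i = 0; i < str.length(); i++) {
            counts[str.charAt(i)]++; // e.g. 'a' is code 97, so this is counts[97]++
        }
        StringBuilder out = new StringBuilder();
        for (int code = 0; code < counts.length; code++) {
            if (counts[code] > 0) {
                out.append(counts[code]).append((char) code);
            }
        }
        return out.toString();
    }

    // Run-length encoding: counts only consecutive repeats.
    static String byRuns(String str) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            int freq = 1;
            while (i + 1 < str.length() && str.charAt(i) == str.charAt(i + 1)) {
                freq++;
                i++;
            }
            out.append(freq).append(str.charAt(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(byFrequency("aabbcc")); // 2a2b2c
        System.out.println(byRuns("aabbcc"));      // 2a2b2c
        System.out.println(byFrequency("aabba"));  // 3a2b
        System.out.println(byRuns("aabba"));       // 2a2b1a
    }
}
```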

This should help you understand the basic idea behind how to approach the string compression problem
import java.util.*;
public class LetterCount {
public static void main(String[] args) {
//your input string
String str = "aabbcccddd";
//split your input into characters
String chars[] = str.split("");
//maintain a map to store unique character and its frequency
Map<String, Integer> compressMap = new LinkedHashMap<String, Integer>();
//read every letter in input string
for(String s: chars) {
//on Java 7 and earlier, java.lang.String.split("") puts an empty string at
//the start of the array, so you need to ignore that
if("".equals(s))
continue;
//obtain the previous occurrences of the character
Integer count = compressMap.get(s);
//if the character was previously encountered, increment its count
if(count != null)
compressMap.put(s, ++count);
else//otherwise store it as first occurrence
compressMap.put(s, 1);
}
//Create a StringBuffer object, to append your input
//StringBuffer is thread safe, so I prefer using it
//you could use StringBuilder if you don't expect your code to run
//in a multithreaded environment
StringBuffer output = new StringBuffer("");
//iterate over every entry in map
for (Map.Entry<String, Integer> entry : compressMap.entrySet()) {
//append the results to output
output.append(entry.getValue()).append(entry.getKey());
}
//print the output on console
System.out.println(output);
}
}

class Solution {
public String toFormat(String input) {
char inChar[] = input.toCharArray();
String output = "";
int i;
for(i=0;i<input.length();i++) {
int count = 1;
while(i+1<input.length() && inChar[i] == inChar[i+1]) {
count+=1;
i+=1;
}
output+=inChar[i]+String.valueOf(count);
}
return output;
}
public static void main(String[] args) {
Solution sol = new Solution();
String input = "aaabbbbcc";
System.out.println("Formatted String is: " + sol.toFormat(input));
}
}

def encode(Test_string):
    count = 0
    Result = ""
    for i in range(len(Test_string)):
        if (i+1) < len(Test_string) and (Test_string[i] == Test_string[i+1]):
            count += 1
        else:
            Result += str((count+1))+Test_string[i]
            count = 0
    return Result
print(encode("ABBBBCCCCCCCCAB"))

If the string is not in alphabetical order and you want the total count of each character, sort the string first.
import java.util.Arrays;
public class SquareStrings {
public static void main(String[] args) {
SquareStrings squareStrings = new SquareStrings();
String str = "abbccddddbd";
System.out.println(squareStrings.manipulate(str));
}
private String manipulate(String str1) {
//convert to charArray and sort so that equal characters become adjacent
char[] charArray = str1.toCharArray();
Arrays.sort(charArray);
String str = new String(charArray);
StringBuilder stbuBuilder = new StringBuilder();
int length = str.length();
for (int i = 0; i < length; i++) {
int freq = 1;
while (((i + 1) < length) && (str.charAt(i) == str.charAt(i + 1))) {
++freq;
++i;
}
//append after the run is counted so characters that occur once are included too
stbuBuilder.append(str.charAt(i)).append(freq);
}
return stbuBuilder.toString();
}
}

Kotlin:
fun compressString(input: String): String {
if (input.isEmpty()){
return ""
}
var result = ""
var count = 1
var char1 = input[0]
for (i in 1 until input.length) {
val char2 = input[i]
if (char1 == char2) {
count++
} else {
if (count != 1) {
result += "$count$char1"
count = 1
} else {
result += "$char1"
}
char1 = char2
}
}
result += if (count != 1) {
"$count$char1"
} else {
"$char1"
}
return result
}
