Tokenize method: Split string into array - java

I've been really struggling with a programming assignment. Basically, we have to write a program that translates an English sentence into Pig Latin. The first method we need is one to tokenize the string, and we are not allowed to use the split method normally used in Java. I've been trying to do this for the past two days with no luck. Here is what I have so far:
public class PigLatin
{
public static void main(String[] args)
{
String s = "Hello there my name is John";
Tokenize(s);
}
public static String[] Tokenize(String english)
{
String[] tokenized = new String[english.length()];
for (int i = 0; i < english.length(); i++)
{
int j= 0;
while (english.charAt(i) != ' ')
{
String m = "";
m = m + english.charAt(i);
if (english.charAt(i) == ' ')
{
j++;
}
else
{
break;
}
}
for (int l = 0; l < tokenized.length; l++) {
System.out.print(tokenized[l] + ", ");
}
}
return tokenized;
}
}
All this does is print an enormously long array of "null"s. If anyone can offer any input at all, I would reallllyyyy appreciate it!
Thank you in advance
Update: We are supposed to assume that there will be no punctuation or extra spaces, so basically whenever there is a space, it's a new word

If I understand your question and what your Tokenize method was intended to do, I would start by writing a function to split the String:
static String[] splitOnWhiteSpace(String str) {
List<String> al = new ArrayList<>();
StringBuilder sb = new StringBuilder();
for (char ch : str.toCharArray()) {
if (Character.isWhitespace(ch)) {
if (sb.length() > 0) {
al.add(sb.toString());
sb.setLength(0);
}
} else {
sb.append(ch);
}
}
if (sb.length() > 0) {
al.add(sb.toString());
}
String[] ret = new String[al.size()];
return al.toArray(ret);
}
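and then print the result using Arrays.toString(Object[]) (both snippets need java.util.ArrayList, java.util.List and java.util.Arrays imported), like this: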
public static void main(String[] args) {
String s = "Hello there my name is John";
String[] words = splitOnWhiteSpace(s);
System.out.println(Arrays.toString(words));
}

If you're allowed to use the StringTokenizer class (which I think is what the assignment is asking for), it would look something like this:
StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
which will produce the output:
this
is
a
test
This example is taken from the StringTokenizer Javadoc.
The tokenizer splits the string on whitespace and returns the tokens one at a time. The while loop walks through the tokens, which is where you can apply the Pig Latin logic.
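If the tokenizer is allowed, the same loop can also fill the String[] that the assignment's Tokenize method is supposed to return. The following is only a sketch: the class name TokenizerSketch and the method name tokenize are made up for illustration, and it assumes words are separated by whitespace only.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class TokenizerSketch {
    // Hypothetical helper: collects the tokens of a sentence into an array.
    static String[] tokenize(String english) {
        // With no delimiter argument, StringTokenizer splits on whitespace.
        StringTokenizer st = new StringTokenizer(english);
        // countTokens() reports how many tokens are still left to read.
        String[] tokens = new String[st.countTokens()];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = st.nextToken();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("Hello there my name is John")));
    }
}
```

countTokens() is read once, before the loop, because its value shrinks every time nextToken() is called.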

Here are some hints for doing the splitting manually.
There is a method String#indexOf(int ch, int fromIndex) that finds the next occurrence of a character at or after a given index.
There is a method String#substring(int beginIndex, int endIndex) that extracts part of a string.
Here is some pseudo-code that shows how to split the string (you will need more safety handling, which I will leave to you):
List<String> results = ...;
int startIndex = 0;
int endIndex = 0;
while (startIndex < inputString.length) {
endIndex = get next index of space after startIndex
if no space found {
endIndex = inputString.length
}
String result = get substring of inputString from startIndex to endIndex-1
results.add(result)
startIndex = endIndex + 1 // move startIndex to next position after space
}
// here, results contains all the split words
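The pseudo-code above can be turned into runnable Java roughly as follows. This is a sketch under a few assumptions: the names ManualSplit and splitOnSpaces are invented here, the only delimiter is a single space character, and empty pieces (from doubled spaces) are simply skipped. Note that substring's end index is exclusive, so passing endIndex directly covers the characters up to endIndex-1.

```java
import java.util.ArrayList;
import java.util.List;

class ManualSplit {
    // Hypothetical helper: splits on ' ' using only indexOf and substring.
    static List<String> splitOnSpaces(String input) {
        List<String> results = new ArrayList<>();
        int startIndex = 0;
        while (startIndex < input.length()) {
            int endIndex = input.indexOf(' ', startIndex);
            if (endIndex == -1) {
                endIndex = input.length(); // no more spaces: take the rest
            }
            if (endIndex > startIndex) {
                // substring's end index is exclusive, so this stops before the space
                results.add(input.substring(startIndex, endIndex));
            }
            startIndex = endIndex + 1; // continue after the space
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(splitOnSpaces("Hello there my name is John"));
    }
}
```

If the assignment requires an array, the list can be converted at the end with results.toArray(new String[0]).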

String english = "hello my fellow friend";
ArrayList<String> tokenized = new ArrayList<String>();
String m = "";
int j = 0; //index for tokenised array list.
for (int i = 0; i < english.length(); i++)
{
//the order of the conditions matters here: if you
//swap them, english.charAt(i) can throw an index
//out of bounds exception
while( i < english.length() && english.charAt(i) != ' ')
{
m = m + english.charAt(i);
i++;
}
//add to array list if there is some string
//if its only ' ', array will be empty so we are OK.
if(m.length() > 0 )
{
tokenized.add(m);
j++;
m = "";
}
}
//print the array list
for (int l = 0; l < tokenized.size(); l++) {
System.out.print(tokenized.get(l) + ", ");
}
This prints "hello, my, fellow, friend, ".
I used an ArrayList because the number of words, and therefore the length of the array, is not known up front.

Related

why are some words not checked or included in string of reversed words?

Hello everyone. I have a task: reverse every word in a sentence as long as the word is 5 or more letters long. The program works for most words, but after the first couple, words are left out. Does anyone know why this is happening? Here is the code:
public static int wordCount(String str) {
int count = 0;
for(int i = 0; i < str.length(); i++) if(str.charAt(i) == ' ') count++;
return count + 1;
}
This just gets the word count for me, which I use in a for loop later to loop through all the words.
public static String reverseString(String s) {
Stack<Character> stack = new Stack<>();
StringBuilder sb = new StringBuilder();
for (int i = 0; i < s.length(); i++) {
stack.push(s.charAt(i));
}
while (!stack.empty()) {
sb.append(stack.pop());
}
return sb.toString();
}
This reverses a single string. This is not where I reverse certain words- this reverses a string. "Borrowed" from https://stackoverflow.com/a/33458528/16818831.
Lastly, the actual function:
public static String spinWords(String sentence) {
String ans = "";
for(int i = 0; i <= wordCount(sentence); i++) {
if(sentence.substring(0, sentence.indexOf(' ')).length() >= 5) {
ans += reverseString(sentence.substring(0, sentence.indexOf(' '))) + " ";
sentence = sentence.substring(sentence.indexOf(' ') + 1);
} else {
ans += sentence.substring(0, sentence.indexOf(' ')) + " ";
sentence = sentence.substring(sentence.indexOf(' ') + 1);
}
}
return ans;
}
This is where my mistake probably is. I'd like to know why some words are omitted. Just in case, here is my main method:
public static void main(String[] args) {
System.out.println(spinWords("Why, hello there!"));
System.out.println(spinWords("The weather is mighty fine today!"));
}
Let me know why this happens. Thank you!
The main issue appears to be the for loop condition in spinWords().
The word count of your sentence keeps shrinking while, at the same time, i increases.
For example:
i is 0 when the word count is 5
i is 1 when the word count is 4
i is 2 when the word count is 3
i is 3 when the word count is 2, which stops the loop.
So the loop can't get through the whole sentence.
As many have mentioned, using the split method would help greatly, for example:
public static String spinWords(String sentence) {
return Arrays.asList(sentence.split(" ")).stream()
.map(word -> word.length() < 5 ? word : new StringBuilder(word).reverse().toString())
.collect(Collectors.joining(" "));
}
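If split is off the table, the shrinking-word-count problem can also be avoided by never modifying sentence at all and walking through it with an index instead. This is a minimal sketch, not the asker's original method: the class name SpinWordsSketch is made up, and it assumes words are separated by single spaces.

```java
class SpinWordsSketch {
    // Reverses every word of 5 or more characters, leaving the sentence itself untouched.
    static String spinWords(String sentence) {
        StringBuilder ans = new StringBuilder();
        int start = 0; // index where the current word begins
        while (start <= sentence.length()) {
            int end = sentence.indexOf(' ', start);
            if (end == -1) {
                end = sentence.length(); // last word: no space after it
            }
            String word = sentence.substring(start, end);
            if (start > 0) {
                ans.append(' '); // separator between words, none before the first
            }
            ans.append(word.length() >= 5
                    ? new StringBuilder(word).reverse().toString()
                    : word);
            start = end + 1;
        }
        return ans.toString();
    }

    public static void main(String[] args) {
        System.out.println(spinWords("Why, hello there!"));
        System.out.println(spinWords("The weather is mighty fine today!"));
    }
}
```

Because the loop bound depends only on the fixed length of sentence, it cannot end early, and the last word no longer needs a space after it.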
I think you should rewrite a lot of your code using String.split(). Instead of manually parsing every letter, you can get an array of every word just by writing String[] arr = sentence.split(" "). You can then use a for loop to go through the array and reverse each long word, something like this:
for (int i = 0; i < arr.length; i++) {
if (arr[i].length() >= 5) {
arr[i] = reverseString(arr[i]);
}
}
I know you just asked for a solution to your current code, but this would probably get you a better grade :)

Reverse words without changing capitals or punctuation

Create a program, using the fewest characters possible, that reverses each word in a string while keeping the order of the words, as well as punctuation and capital letters, in their initial place.
By "Order of the words", I mean that each word is split by an empty space (" "), so contractions and such will be treated as one word. The apostrophe in contractions should stay in the same place. ("Don't" => "Tno'd").
(Punctuation means any characters that are not a-z, A-Z or whitespace*).
Numbers were removed from this list due to the fact that you cannot have capital numbers. Numbers are now treated as punctuation.
For example, for the input:
Hello, I am a fish.
it should output:
Olleh, I ma a hsif.
Notice that O, which is the first letter in the first word, is now capital, since H was capital before in the same location.
The comma and the period are also in the same place.
More examples:
This; Is Some Text!
would output
Siht; Si Emos Txet!
I've tried this:
public static String reverseWord(String input)
{
String words[]=input.split(" ");
StringBuilder result=new StringBuilder();
for (String string : words) {
String revStr = new StringBuilder(string).reverse().toString();
result.append(revStr).append(" ");
}
return result.toString().trim();
}
I have tried to solve your problem. It's working fine for the examples I have checked :) Please look and let me know :)
public static void main(String[] args) {
System.out.println(reverseWord("This; Is Some Text!"));
}
public static boolean isAlphaNumeric(String s) {
return s != null && s.matches("^[a-zA-Z0-9]*$");
}
public static String reverseWord(String input)
{
String words[]=input.split(" ");
StringBuilder result=new StringBuilder();
int startIndex = 0;
int endIndex = 0;
for(int i = 0 ; i < input.length(); i++) {
if (isAlphaNumeric(Character.toString(input.charAt(i)))) {
endIndex++;
} else {
String string = input.substring(startIndex, endIndex);
startIndex = ++endIndex;
StringBuilder revStr = new StringBuilder("");
for (int j = 0; j < string.length(); j++) {
char charToAdd = string.charAt(string.length() - j - 1);
if (Character.isUpperCase(string.charAt(j))) {
revStr.append(Character.toUpperCase(charToAdd));
} else {
revStr.append(Character.toLowerCase(charToAdd));
}
}
result.append(revStr);
result.append(input.charAt(i));
}
}
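if(endIndex>startIndex) // a trailing word with no punctuation after it
{
String string = input.substring(startIndex, endIndex);
// reverse it with the same case-preserving logic as above
StringBuilder revStr = new StringBuilder("");
for (int j = 0; j < string.length(); j++) {
char charToAdd = string.charAt(string.length() - j - 1);
revStr.append(Character.isUpperCase(string.charAt(j)) ? Character.toUpperCase(charToAdd) : Character.toLowerCase(charToAdd));
}
result.append(revStr);
}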
return result.toString().trim();
}
Call the reverseWord with your test string.
Hope it helps. Don't forget to mark it as right answer, if it is :)
Here is a proposal that follows your requirements. It may seem very long, but it's just comments and aerated code; and everybody loves comments.
public static String smartReverseWords(String input) {
StringBuilder finalString = new StringBuilder();
// Word accumulator, reset after each whitespace character
StringBuilder wordAcc = new StringBuilder();
int processedChars = 0;
for(char c : input.toCharArray()) {
// If not a whitespace character
if(!Character.isWhitespace(c)) {
// Accumulate letters
wordAcc.append(c);
// Have I reached the last character? Then finalize now:
if(processedChars == input.length()-1) {
reverseWordAndAppend(wordAcc, finalString);
}
}
else {
// Was a word accumulated?
if(wordAcc.length() > 0) {
reverseWordAndAppend(wordAcc, finalString);
}
// Append non-letter char to final string:
finalString.append(c);
}
processedChars++;
}
return finalString.toString();
}
private static void reverseWordAndAppend(StringBuilder wordAcc, StringBuilder finalString) {
// Then reverse it:
smartReverse(wordAcc); // a simple wordAcc.reverse() is not possible
// Append word to final string:
finalString.append(wordAcc.toString());
// Reset accumulator
wordAcc.setLength(0);
}
private static class Marker {
Integer position;
String character;
}
private static void smartReverse(StringBuilder wordAcc) {
char[] arr = wordAcc.toString().toCharArray();
wordAcc.setLength(0); // clean it for now
// Memorize positions of 'punctuation' + build array free of 'punctuation' in the same time:
List<Marker> mappedPosOfNonLetters = new ArrayList<>(); // order matters
List<Integer> mappedPosOfCapitals = new ArrayList<>(); // order matters
for (int i = 0; i < arr.length; i++) {
char c = arr[i];
if(!Character.isLetter(c)) {
Marker mark = new Marker();
mark.position = i;
mark.character = c+"";
mappedPosOfNonLetters.add(mark);
}
else {
if(Character.isUpperCase(c)) {
mappedPosOfCapitals.add(i);
}
wordAcc.append(Character.toLowerCase(c));
}
}
// Reverse cleansed word:
wordAcc.reverse();
// Reintroduce 'punctuation' at right place(s)
for (Marker mark : mappedPosOfNonLetters) {
wordAcc.insert(mark.position, mark.character);
}
// Restore capitals at right place(s)
for (Integer idx : mappedPosOfCapitals) {
wordAcc.setCharAt(idx,Character.toUpperCase(wordAcc.charAt(idx)));
}
}
EDIT
I've updated the code to take all your requirements into account. Indeed, we have to make sure that "punctuation" stays in place (and capitals too), including within a word, as in a contraction.
Therefore given the following input string:
"Hello, I am on StackOverflow. Don't tell anyone."
The code produces this output:
"Olleh, I ma no WolfrEvokcats. Tno'd llet enoyna."

How to write a replaceAll function java?

I'm trying to write a program that will allow a user to input a phrase (for example: "I like cats") and print each word on a separate line. I have already written the part to allow a new line at every space but I don't want to have blank lines between the words because of excess spaces. I can't use any regular expressions such as String.split(), replaceAll() or trim().
I tried using a few different methods but I don't know how to delete spaces if you don't know the exact number there could be. I tried a bunch of different methods but nothing seems to work.
Is there a way I could implement it into the code I've already written?
for (i=0; i<length-1;) {
j = text.indexOf(" ", i);
if (j==-1) {
j = text.length();
}
System.out.print("\n"+text.substring(i,j));
i = j+1;
}
Or how can I write a new expression for it? Any suggestions would really be appreciated.
"I have already written the part to allow a new line at every space but I don't want to have blank lines between the words because of excess spaces."
If you can't use trim() or replaceAll(), you can use java.util.Scanner to read each word as a token. By default Scanner uses white space pattern as a delimiter for finding tokens. Similarly, you can also use StringTokenizer to print each word on new line.
String str = "I like cats";
Scanner scanner = new Scanner(str);
while (scanner.hasNext()) {
System.out.println(scanner.next());
}
OUTPUT
I
like
cats
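The StringTokenizer variant mentioned above could look like the sketch below. The class and method names (TokenizerLines, wordsOnLines) are invented for this example. It relies on the fact that StringTokenizer treats a run of consecutive delimiters as a single separator, so excess spaces never produce empty tokens.

```java
import java.util.StringTokenizer;

class TokenizerLines {
    // Hypothetical helper: returns the words of text, one per line.
    static String wordsOnLines(String text) {
        StringTokenizer st = new StringTokenizer(text, " ");
        StringBuilder out = new StringBuilder();
        while (st.hasMoreTokens()) {
            // Runs of spaces are skipped by the tokenizer, so no blank lines appear.
            out.append(st.nextToken()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(wordsOnLines("I   like    cats"));
    }
}
```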
Here is a simple solution using substring() and indexOf()
public static void main(String[] args) {
List<String> split = split("I like cats");
split.forEach(System.out::println);
}
public static List<String> split(String s){
List<String> list = new ArrayList<>();
while(s.contains(" ")){
int pos = s.indexOf(' ');
list.add(s.substring(0, pos));
s = s.substring(pos + 1);
}
list.add(s);
return list;
}
Edit:
If you only want to print the text without splitting or making lists, you can use this:
public static void main(String[] args) {
newLine("I like cats");
}
public static void newLine(String s){
while(s.contains(" ")){
int pos = s.indexOf(' ');
System.out.println(s.substring(0, pos));
s = s.substring(pos + 1);
}
System.out.println(s);
}
I think this will solve your problem.
public static List<String> getWords(String text) {
List<String> words = new ArrayList<>();
BreakIterator breakIterator = BreakIterator.getWordInstance();
breakIterator.setText(text);
int lastIndex = breakIterator.first();
while (BreakIterator.DONE != lastIndex) {
int firstIndex = lastIndex;
lastIndex = breakIterator.next();
if (lastIndex != BreakIterator.DONE && Character.isLetterOrDigit(text.charAt(firstIndex))) {
words.add(text.substring(firstIndex, lastIndex));
}
}
return words;
}
public static void main(String[] args) {
String text = "I like cats";
List<String> words = getWords(text);
for (String word : words) {
System.out.println(word);
}
}
Output :
I
like
cats
What about something like this? It runs in O(N) time.
Just use a StringBuilder to build the result as you iterate through the input, adding "\n" only for the first space that follows a word:
String word = "I like cats";
StringBuilder sb = new StringBuilder();
boolean newLine = true;
for(int i = 0; i < word.length(); i++) {
if (word.charAt(i) == ' ') {
if (newLine) {
sb.append("\n");
newLine = false;
}
} else {
newLine = true;
sb.append(word.charAt(i));
}
}
String result = sb.toString();
EDIT: Fixed the problem mentioned in the comments (new line on multiple spaces)
Sorry, I did not notice that you cannot use replaceAll().
This is my other solution:
String s = "I like cats";
Pattern p = Pattern.compile("([\\S])+");
Matcher m = p.matcher(s);
while (m.find( )) {
System.out.println(m.group());
}
Old solution:
String s = "I like cats";
System.out.println(s.replaceAll("( )+","\n"));
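You have done almost all of the job. Just make a small addition (and let the loop run to the end of the text, so a one-letter last word is not dropped), and your code will work as you wish:
for (int i = 0; i < text.length();) {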
j = text.indexOf(" ", i);
if (i == j) { //if next space after space, skip it
i = j + 1;
continue;
}
if (j == -1) {
j = text.length();
}
System.out.print("\n" + text.substring(i, j));
i = j + 1;
}
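A variant of the same idea, written as a self-contained method, is sketched below. It is not the asker's code: the names SkipSpaces and wordsOnLines are hypothetical, and it uses charAt with two inner loops (one to skip a run of spaces, one to consume a word) rather than indexOf. No regular expressions, split(), replaceAll() or trim() are involved.

```java
class SkipSpaces {
    // Hypothetical helper: returns each word of text on its own line.
    static String wordsOnLines(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip any run of spaces before the next word.
            while (i < n && text.charAt(i) == ' ') {
                i++;
            }
            int start = i;
            // Advance to the end of the word.
            while (i < n && text.charAt(i) != ' ') {
                i++;
            }
            if (i > start) {
                // append(CharSequence, start, end) copies text[start..i)
                out.append(text, start, i).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(wordsOnLines("I   like  cats "));
    }
}
```

Leading, trailing and repeated spaces are all handled by the first inner loop, so no special cases are needed.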

Creating longer array if elements contain whitespaces? (Java)

Currently I'm trying to write a method that does the following:
Takes 3 String arrays (words, beforeList, and afterList)
Looks for words that are in both words and beforeList and, if found, replaces each with the word at the same index in afterList
Returns a new array in which any replacement that contains spaces is split, so that each part becomes an element by itself
For example, here is a test case, notice that "i'm" becomes split into two elements in the final array "i" and "am":
String [] someWords = {"i'm", "cant", "recollect"};
String [] beforeList = {"dont", "cant", "wont", "recollect", "i'm"};
String [] afterList = {"don't", "can't", "won't", "remember", "i am"};
String [] result = Eliza.replacePairs( someWords, beforeList, afterList);
if ( result != null && result[0].equals("i") && result[1].equals("am")
&& result[2].equals("can't") && result[3].equals("remember")) {
System.out.println("testReplacePairs 1 passed.");
} else {
System.out.println("testReplacePairs 1 failed.");
}
My biggest problem is in accounting for this case of whitespaces. I know the code I will post below is wrong, however I've been trying different methods. I think my code right now should return an empty array that is the length of the first but accounted for spaces. I realize it may require a whole different approach. Any advice though would be appreciated, I'm going to continue to try and figure it out but if there is a way to do this simply then I'd love to hear and learn from it! Thank you.
public static String[] replacePairs(String []words, String [] beforeList, String [] afterList) {
if(words == null || beforeList == null || afterList == null){
return null;
}
String[] returnArray;
int countofSpaces = 0;
/* Check if words in words array can be found in beforeList, here I use
a method I created "inList". If a word is found the index of it in
beforeList will be returned, if a word is not found, -1 is returned.
If a word is found, I set the word in words to the afterList value */
for(int i = 0; i < words.length; i++){
int listCheck = inList(words[i], beforeList);
if(listCheck != -1){
words[i] = afterList[listCheck];
}
}
// This is where I check for spaces (or attempt to)
for(int j = 0; j < words.length; j++){
if(words[j].contains(" ")){
countofSpaces++;
}
}
// Here I return an array that is the length of words + the space count)
returnArray = new String[words.length + countofSpaces];
return returnArray;
}
Here's one of the many ways of doing it, assuming you have to handle cases where words contain more than 1 consecutive spaces:
for(int i = 0; i < words.length; i++){
int listCheck = inList(words[i], beforeList);
if(listCheck != -1){
words[i] = afterList[listCheck];
}
}
ArrayList<String> newWords = new ArrayList<String>();
for(int i = 0 ; i < words.length ; i++) {
String str = words[i];
if(str.contains(" ")){
// collapse runs of spaces: replace two spaces with one until none are left
while(str.contains("  ")) {
str = str.replace("  ", " ");
}
String[] subWord = str.split(" ");
newWords.addAll(Arrays.asList(subWord));
} else {
newWords.add(str);
}
}
return newWords.toArray(new String[0]);
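Putting the pieces together, a complete method might look like the sketch below. Several things here are assumptions: the question does not show inList, so the version here is only a guess at its behaviour (index of the word in the list, or -1), and the class is called ElizaSketch to make clear it is not the asker's Eliza class. Unlike the snippet above, it leaves the input array unmodified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ElizaSketch {
    // Assumed behaviour of the asker's helper: index of word in list, or -1.
    static int inList(String word, String[] list) {
        for (int i = 0; i < list.length; i++) {
            if (list[i].equals(word)) {
                return i;
            }
        }
        return -1;
    }

    static String[] replacePairs(String[] words, String[] beforeList, String[] afterList) {
        if (words == null || beforeList == null || afterList == null) {
            return null;
        }
        List<String> out = new ArrayList<>();
        for (String word : words) {
            int idx = inList(word, beforeList);
            // Use the replacement if one exists, otherwise keep the word.
            String replaced = (idx != -1 && idx < afterList.length) ? afterList[idx] : word;
            // A replacement such as "i am" becomes several elements.
            for (String part : replaced.split(" ")) {
                if (!part.isEmpty()) { // ignore pieces produced by doubled spaces
                    out.add(part);
                }
            }
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] someWords = {"i'm", "cant", "recollect"};
        String[] beforeList = {"dont", "cant", "wont", "recollect", "i'm"};
        String[] afterList = {"don't", "can't", "won't", "remember", "i am"};
        System.out.println(Arrays.toString(replacePairs(someWords, beforeList, afterList)));
    }
}
```

Collecting into a List sidesteps the need to count spaces up front in order to size the result array.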

Reverse characters in a sentence

I'm trying to reverse the characters of each word in a sentence without using the split function. I'm really close, but I am missing the final letter. Can someone please point me in the right direction? Right now it prints "This is a new sentence" as "sihT si a wen cnetnes". Also, I included if(start == 0) because the program would skip the initial space character, but I don't understand why.
static String reverseLetters(String sentence)
{
StringBuilder reversed = new StringBuilder("");
int counter = 0;
int start = 0;
String word;
for(int i = 0; i <= sentence.length()-1 ; i++ )
{
if(sentence.charAt(i)== ' '|| i == sentence.length()-1 )
{
StringBuilder sb = new StringBuilder("");
sb.append(sentence.substring(start,i));
if(start == 0)
{
start = i;
word = sb.toString();
reversed.append(reverseChar(word));
reversed.append(' ');
}
else
{
start = i;
word = sb.toString();
reversed.append(reverseChar(word));
}
}
}
return reversed.toString();
}
static String reverseChar (String word)
{
StringBuilder b = new StringBuilder("");
for(int idx = word.length()-1; idx >= 0; idx -- )
{
b.append(word.charAt(idx));
}
return b.toString();
}
start really means wordStart. Since i points at the space, the next wordStart should be the position after i, that is i + 1.
For the same reason, the last value of i should point just past the last character of the last word, so the loop should run up to length().
The if-then-else is too broad; a space has to be appended in exactly one case: when i points at a space.
One could also loop unconditionally and break in the middle of the loop body when i == length().
I think the error lies in the index, the for should be
for(int i = 0; i <= sentence.length() ; i++ )
Then if should be:
if (sentence.charAt(i==0?0:i-1)== ' '|| i == sentence.length() )
As I see it, the error is that for the last word the end index in substring(start, i) should be sentence.length() instead of sentence.length() - 1, so this would solve it.
The end index of substring is exclusive, so substring(1, 10) returns the characters at indices 1 to 9. That is likely the problem with the last word.
The issue with the first space also comes from substring. Say you're reading "this is...": the first time it takes a substring with start = 0 and i = 4, so you might expect "this " but you actually get "this". The next time, with start = 4 and i = 7, you get " is".
So with the change of the index you should be able to remove the if/else with start==0 too.
Another option
private String reverse (String originalString) {
StringBuilder reverseString = new StringBuilder();
for (int i = originalString.length() - 1; i >= 0; i--) {
reverseString.append(originalString.charAt(i));
}
return reverseString.toString();
}
String reverseString = "This is a new sentence";
System.out.println(new StringBuffer(reverseString).reverse().toString());
Syso prints : ecnetnes wen a si sihT
Put i <= sentence.length() in your for loop and change the if to:
if(i == sentence.length() || sentence.charAt(i)== ' ')
because substring(start,i) returns the string up to, but not including, index i.
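With both changes applied, and with start moved to the position after the space, the whole method could look roughly like this. It is a sketch based on the asker's code rather than a definitive version: the class name ReverseLettersSketch is made up, and the start == 0 special case is dropped because it is no longer needed.

```java
class ReverseLettersSketch {
    static String reverseLetters(String sentence) {
        StringBuilder reversed = new StringBuilder();
        int start = 0; // index of the first character of the current word
        for (int i = 0; i <= sentence.length(); i++) {
            // Test the bound first so charAt is never called with i == length().
            if (i == sentence.length() || sentence.charAt(i) == ' ') {
                // substring's end index is exclusive: this is the word without the space
                reversed.append(reverseChar(sentence.substring(start, i)));
                if (i < sentence.length()) {
                    reversed.append(' '); // keep the space that ended the word
                }
                start = i + 1; // next word begins after the space
            }
        }
        return reversed.toString();
    }

    static String reverseChar(String word) {
        StringBuilder b = new StringBuilder();
        for (int idx = word.length() - 1; idx >= 0; idx--) {
            b.append(word.charAt(idx));
        }
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseLetters("This is a new sentence"));
    }
}
```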
import java.util.Stack;
public class Class {
public static void main(String[] args) {
String input = "This is a sentence";
char[] charinput = input.toCharArray();
Stack<String> stack = new Stack<String>();
for (int i = input.length() - 1; i >= 0; i--) {
stack.push(String.valueOf(charinput[i]));
}
StringBuilder StackPush = new StringBuilder();
for (int i = 0; i < stack.size(); i++) {
StackPush.append(stack.get(i));
}
System.out.println(StackPush.toString());
}
}
Not a split to be seen.
