Java - divide a text into n equal lines

Let's say I have this String
String myText="I think that stackoverflow is a very great website";
If I want to divide it into 2 lines, I would have something like
I think that stackoverflow
is a very great website
So the String would now be "I think that stackoverflow\nis a very great website".
If I want to divide it into 3 lines, it would look like
I think that
stackoverflow is a
very great website
What I've tried was simply dividing by word count: every line gets (total number of words / n) words, where n is the number of lines I want to divide my text into.
But this gives bad results. For example, with
String myText="I me him is veryverylong wordvery longestwordever thisisevenlonger";
the result (if I want to divide it into 2 lines) would be something like
"I me him is\nveryverylong wordvery longestwordever thisisevenlonger"
What do you suggest I try?
I've tried the common Apache algorithm
http://pastebin.com/68zycavf
But if I use wrap(text,2), my output text has every word separated by \n.

As Eran noted in his answer, you want to split at approximately the text length divided by the desired number of lines, but you have to adjust when that point falls in the middle of a word.
I think his solution won't always give the best result though, as it might sometimes be better to split before the word instead of after it, as he's doing.
A divide-and-conquer approach would be a recursive algorithm roughly as follows:
Let N be the desired number of lines and LENGTH be the number of characters in the input string (normalizing to single-spaces first).
If the character at LENGTH/N is a space, make the first cut there and recurse to split the remainder into N-1 lines. Otherwise, find the spaces at each end of the word containing this character, make a trial cut at each of the two points, and recurse to complete both splits. Score the results somehow and choose the better one.
I have implemented this as follows. For the scoring function, I chose to minimize the maximum length of lines in the split. A more complex scoring function might possibly improve the results, but this seems to work for all your cases.
public class WordWrapper {
    public String wrapWords(String input, int lines) {
        // Normalize every run of whitespace to a single space first
        return splitWords(input.replaceAll("\\s+", " "), lines);
    }
    private String splitWords(String input, int lines) {
        if (lines <= 1) {
            return input;
        }
        // Trial cut at the first space at or after the ideal split point
        int splitPointHigh = findSplit(input, lines, 1);
        String splitHigh = input.substring(0, splitPointHigh).trim() + "\n" + splitWords(input.substring(splitPointHigh).trim(), lines - 1);
        // Trial cut at the first space at or before the ideal split point
        int splitPointLow = findSplit(input, lines, -1);
        String splitLow = input.substring(0, splitPointLow).trim() + "\n" + splitWords(input.substring(splitPointLow).trim(), lines - 1);
        // Keep the trial whose longest line is shorter
        if (maxLineLength(splitLow) < maxLineLength(splitHigh))
            return splitLow;
        else return splitHigh;
    }
    private int maxLineLength(String split) {
        return maxLength(split.split("\n"));
    }
    private int maxLength(String[] lines) {
        int maxLength = 0;
        for (String line: lines) {
            if (line.length() > maxLength)
                maxLength = line.length();
        }
        return maxLength;
    }
    private int findSplit(String input, int lines, int dir) {
        int result = input.length() / lines;
        // Stop at either end of the string, so input with no space in this
        // direction cannot push the index out of bounds
        while (result > 0 && result < input.length() && input.charAt(result) != ' ')
            result += dir;
        return result;
    }
}
I didn't actually bother with the special case of the lucky situation of the simple split landing on a space, and adding special handling for that might make it a little quicker. This code will in that case generate two identical "trial splits" and "choose one".
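You might want to make all these methods static of course, and the recursion might give you a stack overflow for large inputs and large line counts. Since every level tries two cuts, the number of trial splits also doubles with each additional line.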
I make no claim that this is the best algorithm, but it seems to work.

You can split based on the number of characters divided by n.
Then, for each line that ends in the middle of a word, you append the rest of that word (the fragment that currently starts the next line), so that no word is split in the middle. A word was split if the current line doesn't end with a space and the next line doesn't begin with a space.
So if you have:
I me him is veryverylong wordvery longestwordever thisisevenlonger
and you wish to split it into two lines, you get:
I me him is veryverylong wordvery
 longestwordever thisisevenlonger
In this case the second line already starts with a space, so we know that no word was split in the middle, and we are done.
If you split it into three lines, you first get:
I me him is veryverylo
ng wordvery longestwor
dever thisisevenlonger
Here some words were split, so you move "ng" to the first line, and then move "dever" to the second line.
I me him is veryverylong
wordvery longestwordever
thisisevenlonger
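The procedure above could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names (CharacterCountSplitter, splitIntoLines) are made up for this example, whitespace is normalized to single spaces first, and n is assumed to be at least 1. A cut that lands inside a word is pushed forward to the end of that word.

```java
import java.util.ArrayList;
import java.util.List;

class CharacterCountSplitter {

    // Cuts every (length / n) characters, then moves each cut forward
    // to the end of the word it landed in. Names here are hypothetical.
    static String splitIntoLines(String text, int n) {
        String s = text.trim().replaceAll("\\s+", " ");
        int target = s.length() / n;              // ideal line length in characters
        List<String> lines = new ArrayList<>();
        int start = 0;
        for (int line = 1; line < n; line++) {
            // Ideal cut position, never before the start of the current line
            int end = Math.max(start, Math.min(line * target, s.length()));
            if (end > start && s.charAt(end - 1) == ' ') {
                end--;                             // cut fell exactly between two words
            }
            while (end < s.length() && s.charAt(end) != ' ') {
                end++;                             // finish the word the cut landed in
            }
            lines.add(s.substring(start, end));
            start = Math.min(end + 1, s.length()); // skip the space at the cut
        }
        lines.add(s.substring(start));             // the remainder is the last line
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        String text = "I me him is veryverylong wordvery longestwordever thisisevenlonger";
        System.out.println(splitIntoLines(text, 3));
    }
}
```

Because cuts only ever move forward, later lines can end up shorter than earlier ones; the divide-and-conquer answer addresses that by also trying the cut before the word.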

This is my solution using the split() function.
import java.util.Arrays;
public class Textcut {
    public static void main(String[] arg) {
        String myText = "I think that stackoverflow is a very great website";
        int n = 2;
        String[] textSplit = myText.split(" ");
        int wordNumber = textSplit.length;
        // Line k gets the words from k*wordNumber/n up to (k+1)*wordNumber/n,
        // so leftover words are spread over the lines instead of being dropped.
        for (int k = 0; k < n; k++) {
            int from = k * wordNumber / n;
            int to = (k + 1) * wordNumber / n;
            System.out.println(String.join(" ", Arrays.copyOfRange(textSplit, from, to)));
        }
    }
}

Related

How to delete characters at x?

How do I delete the characters at the positions x and keep the rest? For the input below, the output should be "12345678", i.e. every '9' sitting at a position x is deleted. x is i*(i+1)/2, so each position is the previous one plus the next number: 0, 1, 3, 6, 10, 15, 21, 28, etc.
public class removeMysteryI {
    public static String removeMysteryI(String str) {
        String newString = "";
        int x=0;
        for(int i=0;i<str.length();i++){
            int y = (i*(i+1)/2)+1;
            if(y<=str.length()){
                x=i*(i+1)/2;
                newString=str.substring(0, x) + str.substring(x + 1);
            }
        }
        return newString;
    }
    public static void main(String[] args) {
        String str = "9919239456978";
        System.out.println(removeMysteryI(str));
    }
}

OK, so there are a couple of mistakes in your code. One is easy to fix. The others not so easy.
The easy one first:
newString=str.substring(0, x) + str.substring(x + 1);
OK so that is creating a string with the character at position x removed. The problem is what it is operating on. The str variable is the input parameter. So at the end of the day newString will still only be str with one character removed.
The above actually needs to be operating on the string from the previous loop iterations ... if you are going to remove more than one character.
The next problem arises when you try to solve the first one. When you remove a character from a string, all characters after the removal point are renumbered; e.g. after removing the character at 5, the character at 6 becomes the character at 5, the character at 7 becomes the character at 6, and so on.
So if you are going to remove characters by "snipping" the string, you need to make sure that the indexes for the positions for the "snips" are adjusted for the number of characters you have already removed.
That can be done ... but you need to think about it.
The final problem is efficiency. Each time your current code removes a single character (as above), it is actually copying all remaining characters to a new string. For small strings, that's OK. For really large strings, the repeated copying could have a serious performance impact [1].
The solution to this is to use a different approach to removing the characters. Instead of snipping out the characters you want to discard, copy the characters that you want to keep. The StringBuilder class is one way of doing this [2]. If you are not permitted to use that, then you could do it with an array of char, and an index variable to keep track of your "append" position in the array. Finally, there is a String constructor that can create a String from the relevant part of the char[].
I'll leave it to you to work out the details.
[1] Efficiency could be viewed as beyond the scope of this exercise.
[2] @Horse's answer uses a StringBuilder but in a different way to what I am suggesting. This will also suffer from the repeated copying problem because each deleteCharAt call will copy all characters after the deletion point.
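For readers who want to see the char[] idea spelled out, here is one possible sketch. The names (TriangularIndexRemover, removeAtTriangularIndexes) are invented for this example, and it assumes the positions to delete are exactly 0, 1, 3, 6, 10, ... as described in the question. Each character is copied at most once, so there is no repeated copying.

```java
class TriangularIndexRemover {

    // Copies the characters to keep into a char[] and skips the positions
    // 0, 1, 3, 6, 10, ... (hypothetical helper, not from the original answer).
    static String removeAtTriangularIndexes(String str) {
        char[] kept = new char[str.length()];
        int appendPos = 0;   // next free slot in kept
        int nextDelete = 0;  // next position to skip
        int step = 1;        // distance from nextDelete to the position after it
        for (int i = 0; i < str.length(); i++) {
            if (i == nextDelete) {
                nextDelete += step;   // 0 -> 1 -> 3 -> 6 -> 10 ...
                step++;
            } else {
                kept[appendPos++] = str.charAt(i);
            }
        }
        // String constructor that takes only the filled part of the array
        return new String(kept, 0, appendPos);
    }

    public static void main(String[] args) {
        System.out.println(removeAtTriangularIndexes("9919239456978"));
    }
}
```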

Follow the steps below:
Initialize builderIndexToDelete = 0
Initialize counter = 1
Repeat the following while the index is valid:
    delete the character at builderIndexToDelete
    increase builderIndexToDelete by counter - 1 (-1 because a character is deleted in every iteration)
    increment the counter
public static String deleteNaturalSumIndexes(String str) {
    StringBuilder builder = new StringBuilder(str);
    int counter = 1;
    int builderIndexToDelete = 0;
    while (builderIndexToDelete < builder.length()) {
        builder.deleteCharAt(builderIndexToDelete);
        builderIndexToDelete += (counter - 1);
        counter++;
    }
    return builder.toString();
}
public static void main(String[] args) {
    String str = "9919239456978";
    System.out.println(deleteNaturalSumIndexes(str));
}
Thank you @dreamcrash and @StephenC
Using @StephenC's suggestion to improve performance:
public static String deleteNaturalSumIndexes(String str) {
    StringBuilder builder = new StringBuilder();
    int nextNum = 1;
    int indexToDelete = 0;
    while (indexToDelete < str.length()) {
        // check whether this is a valid range to continue
        // handles 0,1 specifically
        if (indexToDelete + 1 < indexToDelete + nextNum) {
            // min is used to limit the index of last iteration
            builder.append(str, indexToDelete + 1, Math.min(indexToDelete + nextNum, str.length()));
        }
        indexToDelete += nextNum;
        nextNum++;
    }
    return builder.toString();
}
public static void main(String[] args) {
    System.out.println(deleteNaturalSumIndexes(""));
    System.out.println(deleteNaturalSumIndexes("a"));
    System.out.println(deleteNaturalSumIndexes("ab"));
    System.out.println(deleteNaturalSumIndexes("abc"));
    System.out.println(deleteNaturalSumIndexes("99192394569"));
    System.out.println(deleteNaturalSumIndexes("9919239456978"));
}

Using bufferedreader then convert to a string

Hi, I have this assignment that I don't really understand how to pull off.
I've been programming Java for 2.5 weeks, so I'm really new.
I'm supposed to import a text document into my program and then do these operations: count letters, count sentences, and compute the average length of words. I have to perform the counting letter by letter; I'm not allowed to scan the entire document at once. I've managed to import the text and print it out, but my problem is that I can't use my string "line" to do any of these operations. I've tried converting it to arrays and strings, and after a lot of failed attempts I'm giving up. So how do I convert my input to something I can use? I always get the error message "line is not a variable" or something like that.
Jesper
UPDATE WITH MY SOLUTION! Some of it is in Swedish, sorry for that.
Somehow the formatting came out wrong, so I uploaded the code here instead; I really don't feel like arguing with this right now!
http://txs.io/3eIb

To count letters, check each character. If it's a space or punctuation, ignore it. Otherwise, it's a letter and we should increment the letter count.
Every word should have a space after it unless it is the last word of the sentence. To get the number of words, track the number of spaces + number of sentences. To get number of sentences, find the number of ! ? and .
I would do that by looking at the ascii value of each character.
int numLetters = 0;
int numSentences = 0;
int numWords = 0;
while (line = ...){ //read loop elided here
    for(int i = 0; i < line.length(); i++){
        int curCharAsc = (int)(line.charAt(i)); //get ascii value by casting char to int
        if((curCharAsc >= 65 && curCharAsc <= 90) || (curCharAsc >= 97 && curCharAsc <= 122)) //check if letter is uppercase or lowercase
            numLetters++;
        if(curCharAsc == 32){ //ascii for space
            numWords++;
        }
        else if (curCharAsc == 33 || curCharAsc == 46 || curCharAsc == 63){ //ascii for ! . ?
            numWords++;
            numSentences++;
        }
    }
}
double avgWordLength = ((double)(numLetters))/numWords; //cast to double before dividing to avoid integer division
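One caveat with counting spaces plus sentence terminators is that a sentence end followed by a space is counted as two words. A possible alternative, sketched below under a few assumptions, is to count a word whenever a letter follows a non-letter. The TextStats class and its accept method are hypothetical names for this example, a StringReader stands in for the file, and an apostrophe splits a word in two (so "don't" counts as two words).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class TextStats {
    int letters;
    int words;
    int sentences;
    private boolean inWord = false;

    // Feed the text one character at a time.
    void accept(char ch) {
        if (Character.isLetter(ch)) {
            letters++;
            if (!inWord) {          // first letter of a new word
                words++;
                inWord = true;
            }
        } else {
            inWord = false;
            if (ch == '.' || ch == '!' || ch == '?') {
                sentences++;
            }
        }
    }

    double averageWordLength() {
        return words == 0 ? 0.0 : (double) letters / words;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a FileReader on the real input file.
        try (BufferedReader in = new BufferedReader(new StringReader("Hi there. How are you?"))) {
            TextStats stats = new TextStats();
            int c;
            while ((c = in.read()) != -1) {   // read() returns one character, or -1 at the end
                stats.accept((char) c);
            }
            System.out.println(stats.letters + " letters, " + stats.words + " words, "
                    + stats.sentences + " sentences, average word length " + stats.averageWordLength());
        }
    }
}
```

Reading with read() instead of readLine() keeps the processing strictly character by character, which seems to be what the assignment asks for.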

Your code as presented works fine: it loads a file and prints out the contents line by line. What you probably need to do is capture each of those lines. Java has two useful classes for this: StringBuilder and StringBuffer (pick one).
BufferedReader input = new BufferedReader(new FileReader(args[0]));
String line;
StringBuffer buffer = new StringBuffer();
while ((line = input.readLine()) != null) {
    System.out.println(line);
    buffer.append(line+" ");
}
input.close();
performOperations(buffer.toString());
The only other possibility is (if your own code is not running for you) - possibly you aren't passing the input file name as a parameter when you run this class?
UPDATE
NB - I've modified the line
buffer.append(line+"\n");
to add a space instead of a line break, so that it is compatible with the algorithms in @faraza's answer.
The method performOperations doesn't exist yet, so you should add something like this:
public static void performOperations(String data){
}
Your method could in turn call separate methods for each operation:
public static void performOperations(String data){
    countWords(data);
    countLetters(data);
    averageWordLength(data);
}
To take it to the next level, and introduce Object Orientation, you could create a class TextStatsCollector.
public class TextStatsCollector{
    private final String data;
    public TextStatsCollector(final String data) {
        this.data = data;
    }
    public int countWords(){
        //word count impl here
        return 0; //placeholder so the skeleton compiles
    }
    public int countLetters(){
        //letter count impl here
        return 0; //placeholder so the skeleton compiles
    }
    public int averageWordLength(){
        //average word length impl here
        return 0; //placeholder so the skeleton compiles
    }
    public void performOperations(){
        System.out.println("Number of Words is " + countWords());
        System.out.println("Number of Letters is " + countLetters());
        System.out.println("Average word length is " + averageWordLength());
    }
}
Then you could use TextStatsCollector like the following in your main method
new TextStatsCollector(buffer.toString()).performOperations();

Word in a java string

I am very new to Java and as a starter I have been offered to try this at home.
Write a program that will find the number of occurrences of a smaller string in a bigger string, both as a part of the string and as an individual word.
For example,
Bigger string = "I AM IN AMSTERDAM", smaller string = "AM".
Output: As part of string: 3, as a part of word: 1.
While I did nail the second part (as a part of word), and even had a go at the first one (searching for the word as a part of the string), I just can't figure out how to crack the first part. It keeps displaying 1 for the example input, where it should be 3.
I have definitely made an error, and I'd be really grateful if you could point it out and correct it. I am a curious learner, so if possible, please also explain why it is wrong.
import java.util.Scanner;
public class Program {
    static Scanner sc = new Scanner(System.in);
    static String search,searchstring;
    static int n;
    void input(){
        System.out.println("What do you want to do?");
        System.out.println("1. Search as part of string?");
        System.out.println("2. Search as part of word?");
        int n = sc.nextInt();
        System.out.println("Enter the main string");
        searchstring = sc.nextLine();
        sc.nextLine(); //Clear buffer
        System.out.println("Enter the search string");
        search = sc.nextLine();
    }
    static int asPartOfWord(String main,String search){
        int count = 0;
        char c; String w = "";
        for (int i = 0; i<main.length();i++){
            c = main.charAt(i);
            if (!(c==' ')){
                w += c;
            }
            else {
                if (w.equals(search)){
                    count++;
                }
                w = ""; // Flush old value of w
            }
        }
        return count;
    }
    static int asPartOfString(String main,String search){
        int count = 0;
        char c; String w = ""; //Stores the word
        for (int i = 0; i<main.length();i++){
            c = main.charAt(i);
            if (!(c==' ')){
                w += c;
            }
            else {
                if (w.length()==search.length()){
                    if (w.equals(search)){
                        count++;
                    }
                }
                w = ""; // Replace with new value, no string
            }
        }
        return count;
    }
    public static void main(String[] args){
        Program a = new Program();
        a.input();
        switch(n){
            case 1: System.out.println("Total occurences: " + asPartOfString(searchstring,search));
            case 2: System.out.println("Total occurences: " + asPartOfWord(searchstring,search));
            default: System.out.println("ERROR: No valid number entered");
        }
    }
}
EDIT: I will be using the loop structure.

A simpler way would be to use regular expressions (that probably defeats the idea of writing it yourself, although learning regexes is a good idea because they are very powerful: as you can see the core of my code is 4 lines long in the countMatches method).
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public static void main(String... args) {
    String bigger = "I AM IN AMSTERDAM";
    String smaller = "AM";
    System.out.println("Output: As part of string: " + countMatches(bigger, smaller) +
            ", as a part of word: " + countMatches(bigger, "\\b" + smaller + "\\b"));
}
private static int countMatches(String in, String regex) {
    Matcher m = Pattern.compile(regex).matcher(in);
    int count = 0;
    while (m.find()) count++;
    return count;
}
How does it work?
We create a Matcher that looks for a specific pattern in your string, then call find() repeatedly, incrementing a counter each time, until there is no match left.
The patterns themselves: "AM" will find any occurrence of AM in the string, in any position. "\\bAM\\b" will only match whole words (\\b is a word boundary).
That may not be what you were looking for, but I thought it'd be interesting to see another approach. And technically, I am using a loop :-)

Although writing your own code with lots of loops to work things out may execute faster (debatable), it's better to use the JDK if you can, because there's less code to write, less debugging and you can focus on the high-level stuff instead of the low level implementation of character iteration and comparison.
It so happens, the tools you need to solve this already exist, and although using them requires knowledge you don't have, they are elegant to the point of being a single line of code for each method.
Here's how I would solve it:
static int asPartOfString(String main,String search){
    return main.split(search, -1).length - 1;
}
static int asPartOfWord(String main,String search){
    return main.split("\\b" + search + "\\b", -1).length - 1;
}
This works with your sample input, which (probably deliberately) contains an edge case (see below).
Performance? Probably a few microseconds - fast enough. But the real benefit is there is so little code that it's completely clear what's going on, and almost nothing to get wrong or that needs debugging.
The stuff you need to know to use this solution:
the regex term for "word boundary" is \b
split() takes a regex as its search term
the 2nd parameter of split() controls behaviour at the end of the string: a negative number means "retain blanks at the end of the split", which handles the edge case of the main string ending with the smaller string. Without the -1, a call to split would throw away the trailing blank in this edge case.
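To make the effect of the -1 limit concrete, here is a small sketch comparing the two calls. The class and method names (SplitCountDemo, countLiteral, countWithoutLimit) are made up for illustration. It also wraps the search string in Pattern.quote, an addition not in the answer above, on the assumption that the search string should be treated as literal text rather than as a regex; the search string is assumed to be non-empty.

```java
import java.util.regex.Pattern;

class SplitCountDemo {

    // Counts occurrences of a literal search string; -1 keeps trailing empty pieces.
    static int countLiteral(String main, String search) {
        return main.split(Pattern.quote(search), -1).length - 1;
    }

    // Same idea without the limit: trailing empty pieces are discarded,
    // so a match at the very end of the string goes uncounted.
    static int countWithoutLimit(String main, String search) {
        return main.split(Pattern.quote(search)).length - 1;
    }

    public static void main(String[] args) {
        String main = "I AM IN AMSTERDAM";
        System.out.println("with -1:    " + countLiteral(main, "AM"));
        System.out.println("without -1: " + countWithoutLimit(main, "AM"));
    }
}
```

With the sample input the string ends in "AM", so the version without the limit reports one match fewer.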

You could use regular expressions. A pattern like ".*<target string>.*" (replace <target string> with what you are searching for) tells you whether the target occurs at all, but because it consumes the whole input, find() reports only a single match. To count occurrences, search for the bare target instead.
Have a look at the Javadoc for java.util.regex.Pattern.
To count the occurrences in a string, this could be helpful:
Matcher matcher = Pattern.compile("AM").matcher("I AM IN AMSTERDAM");
int count = 0;
while (matcher.find()) {
    count++;
}

Here's an alternative (and much shorter) way to get it to work using Pattern and Matcher, more commonly known as regex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class CountOccurances {
    public static void main(String[] args) {
        String main = "I AM IN AMSTERDAM";
        String search = "AM";
        System.out.printf("As part of string: %d%n",
                asPartOfString(main, search));
        System.out.printf("As part of word: %d%n",
                asPartOfWord(main, search));
    }
    private static int asPartOfString(String main, String search) {
        Matcher m = Pattern.compile(search).matcher(main);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }
    private static int asPartOfWord(String main, String search) {
        // \b - A word boundary
        return asPartOfString(main, "\\b" + search + "\\b");
    }
}
Output:
As part of string: 3
As part of word: 1

For the first part of your exercise (as part of the string), this should work:
static int asPartOfString(String main, String search) {
    int count = 0;
    while (main.length() >= search.length()) { // while main is at least as long as search
        if (main.substring(0, search.length()).equals(search)) { // main starts with search, so count is incremented
            count++;
        }
        main = main.substring(1); // main is shortened by cutting off the first character
    }
    return count;
}
You may also want to think about the way you name variables:
static String search,searchstring;
static int n;
While search and searchstring tell us what is meant, the convention is to write the first word in lower case and every following word with its first letter in upper case. This improves readability.
static int n won't give you much of a clue what it is used for if you read your code again after a few days; you might use something more meaningful here.
static String search, searchString;
static int command;

Java Count Words IndexOf

I'm trying to use simply a while loop and the String method indexOf() to count how many times a certain word appears in a String given by the user.
The method I created seems to count how many times a certain letter appears, but not how many times a word appears. I think this is because the indexOf can't differentiate between a set of letters and spaces. So would I have to create another loop for the computer to understand what I consider words?
This is what I have so far:
public static void countWord(String sentenceEntered, String badWord){
    int number = sentenceEntered.toUpperCase().indexOf(badWord, 0);
    while (number >= 0){
        System.out.print(number);
        number = sentenceEntered.indexOf(badWord, number + 1);
    }
}//end of countWord
But when I run my program, nothing gets printed.

There are two main problems in your method:
The code converts the sentence to uppercase for the first search, but not the input word, and inside the while loop the sentence is not converted to uppercase at all. So the comparison is inconsistent.
The offset in the call to String#indexOf should be the previous index plus the length of the word, i.e. number + badWord.length() instead of number + 1.
number = sentenceEntered.indexOf(badWord, number + badWord.length());
So after the changes, and assuming no conversion to uppercase is done, the method should be as follows:
public static void countWord(String sentenceEntered, String badWord) {
    int number = sentenceEntered.indexOf(badWord, 0);
    while (number >= 0) {
        System.out.println(number);
        number = sentenceEntered.indexOf(badWord, number + badWord.length());
    }
}// end of countWord

You should uppercase your badword too.
int number = sentenceEntered.toUpperCase().indexOf(badWord.toUpperCase(), 0);

Your problem is that you need to set the new starting index to number + badWord.length() in the while loop so that you can start the next indexOf at the end of the last badWord.
number = sentenceEntered.indexOf(badWord, number + badWord.length());
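Putting these fixes together, and addressing the question of whole words versus letter sequences, one possible sketch is shown below. It is only an illustration: the class name BadWordCounter is invented, the method returns the count instead of printing indexes, and a "word" is assumed to be a match that is not directly preceded or followed by a letter or digit. The search word itself is assumed to consist of letters and digits only.

```java
class BadWordCounter {

    // Counts case-insensitive whole-word occurrences of word in sentence,
    // using only a while loop and indexOf.
    static int countWord(String sentence, String word) {
        if (word.isEmpty()) {
            return 0;
        }
        String haystack = sentence.toUpperCase();
        String needle = word.toUpperCase();   // uppercase both sides consistently
        int count = 0;
        int index = haystack.indexOf(needle);
        while (index >= 0) {
            int end = index + needle.length();
            boolean startsWord = index == 0
                    || !Character.isLetterOrDigit(haystack.charAt(index - 1));
            boolean endsWord = end == haystack.length()
                    || !Character.isLetterOrDigit(haystack.charAt(end));
            if (startsWord && endsWord) {
                count++;
            }
            index = haystack.indexOf(needle, end);  // continue after this match
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWord("Darn it, the darned thing is DARN hot", "darn"));
    }
}
```

Dropping the two boundary checks turns this back into a plain substring count, which is the behaviour the corrected method in the first answer has.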

Apart from what is causing your current issue, if you just want to count the number of occurrences of the word in a sentence, there's a simple way without using loops:
int length = sentenceEntered.length();
int badWordLength = badWord.length();
if (sentenceEntered.contains(badWord)) {
    System.out.println("badword count: " + ((length - sentenceEntered.replace(badWord, "").length()) / badWordLength));
}

How to count the number of dashes (-) in a phone number String for input validation?

What I have is a program that reads in a user-entered phone number and returns the country code (if it exists), area code (if it exists), and local 7-digit phone number.
The number must be entered as countrycode-area-local. So, the maximum number of dashes you can have in the phone number is two.
For example:
1-800-5555678 has two dashes (it has a country code and area code)
800-5555678 has one dash (it only has an area code)
5555678 has no dashes (only local number)
So, it's OK to have zero dashes, one dash, or two dashes, but no more than two.
What I am trying to figure out is how you would count the number of dashes ("-") in the string to make sure there aren't more than two instances of them. If there are, it would print an error.
So far, I have:
if (phoneNumber /* contains more than two dashes */)
{
    System.out.println("Error, your input has more than two dashes. Please input using the specified format.");
}
else
{
    //normal operations
}
Everything works so far except for this one part. I'm not sure what method to use to do this. I tried looking at indexOf, but I am stumped.

One way to do it would be to split the string on the dashes. If the array you get has more than 3 elements, give an error. Passing -1 as the limit keeps trailing empty strings, so dashes at the end of the input are counted too.
String s = "1-800-5555678";
String[] parts = s.split("-", -1);
if (parts.length > 3) {
    System.out.println("error");
} else {
    // do something
}
A variant suggested by samrap computes the number of dashes directly:
String s = "1-800-555-5678";
int dashes = s.split("-", -1).length - 1;
if (dashes > 2) {
    System.out.print("error");
} else {
    // do something
}

String phoneNumber = "1-800-5555678";
int counter = 0;
for (int i = 0; i < phoneNumber.length(); i++) {
    if (phoneNumber.charAt(i) == '-') {
        counter++;
    }
}
if (counter > 2) // contains more than two dashes
{
    System.out.println("Error, your input has more than two dashes. Please input using the specified format.");
}
else
{
    //normal operations
}

Something like this. Loop through the chars in the string and check whether each one is a '-':
int nDashes = 0;
for (int i=0; i<phoneNumber.length(); i++){
    if (phoneNumber.charAt(i)=='-')
        nDashes++;
}
if (nDashes>2){
    //do something
}

Regex and replaceAll() are your friends. Use
int count=string.replaceAll("\\d","").length();
This replaces all the digits with empty strings, so, assuming the input contains only digits and dashes, all you are left with is the -s.
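If the goal is to validate the whole format rather than only count dashes, a single regular expression could do both. The sketch below is an assumption-laden illustration: the class and method names (PhoneDashCheck, countDashes, looksValid) are invented, the country and area codes are assumed to be one or more digits each, and the local number is assumed to be exactly 7 digits as described in the question.

```java
class PhoneDashCheck {

    // Counts the '-' characters without regex or an explicit loop.
    static long countDashes(String phoneNumber) {
        return phoneNumber.chars().filter(c -> c == '-').count();
    }

    // Accepts "local", "area-local" or "countrycode-area-local".
    // The digit-count rules are assumptions made for this example.
    static boolean looksValid(String phoneNumber) {
        return phoneNumber.matches("(\\d+-)?(\\d+-)?\\d{7}");
    }

    public static void main(String[] args) {
        String[] samples = {"1-800-5555678", "800-5555678", "5555678", "1-800-555-5678"};
        for (String sample : samples) {
            System.out.println(sample + ": dashes=" + countDashes(sample)
                    + ", valid=" + looksValid(sample));
        }
    }
}
```

Because matches() must match the entire input, a number with three or more dashes is rejected without a separate count.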
