I want to separate all the words in a text file. There can be many words on one line. I tried the code below, but it didn't work. What can I do?
This is the text file:
desem ki vakitlerden
bir nisan akşamıdır
rüzgarların en ferahlatıcısı senden esiyor
sen de açıyor çiçeklerin en solmazı
ormanların en kuytusunu sen de gezmekteyim
demişken sana ormanların
senden güzeli yok
vakitlerden geçmekteyim
çiçeklerin tadı yok çiçeklerin
neyi var çiçeklerin
And I want to write out all the words one by one.
String[] words = null;
String line = inputStream.nextLine();
while (inputStream.hasNextLine()) {
line = inputStream.nextLine();
words = line.split(" ");
}
for (int i = 0; i < words.length; i++) {
System.out.println(words[i]);
}
}
inputStream.close();
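Use an ArrayList, so that the words of every line are kept instead of being overwritten by the next line:
List<String> words = new ArrayList<String>();
// no nextLine() call before the loop, otherwise the first line is thrown away
while (inputStream.hasNextLine()) {
    String line = inputStream.nextLine();
    // append this line's words to the ones collected so far
    words.addAll(Arrays.asList(line.split(" ")));
}
for (int i = 0; i < words.size(); i++) {
    System.out.println(words.get(i));
}
inputStream.close();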
Since you already seem to be using Scanner, instead of
while (inputStream.hasNextLine()) {
line = inputStream.nextLine();
words = line.split(" ");
}
for (int i = 0; i < words.length; i++) {
System.out.println(words[i]);
}
}
you can use its next() method, which reads words instead of lines. This way you can print all the words without having to:
read the entire line,
parse that line to split it into words,
store all the words.
Your code can look like this:
Scanner inputStream = new Scanner(new File("location/of/your/file.txt"));
List<String> words = new ArrayList<>();
while (inputStream.hasNext()){
words.add(inputStream.next());
}
inputStream.close();
for (String word : words){
//here you can do whatever you want with each word from list
System.out.println(word);
}
Related
I have the following code which counts and displays the number of times each word occurs in the whole text document.
try {
List<String> list = new ArrayList<String>();
int totalWords = 0;
int uniqueWords = 0;
File fr = new File("filename.txt");
Scanner sc = new Scanner(fr);
while (sc.hasNext()) {
String words = sc.next();
String[] space = words.split(" ");
for (int i = 0; i < space.length; i++) {
list.add(space[i]);
}
totalWords++;
}
System.out.println("Words with their frequency..");
Set<String> uniqueSet = new HashSet<String>(list);
for (String word : uniqueSet) {
System.out.println(word + ": " + Collections.frequency(list,word));
}
} catch (Exception e) {
System.out.println("File not found");
}
Is it possible to modify this code to make it so it only counts each occurrence once per line rather than in the entire document?
One can read the contents per line and then apply logic per line to count the words:
File fr = new File("filename.txt");
FileReader fileReader = new FileReader(fr);
BufferedReader br = new BufferedReader(fileReader);
// Read the line in the file
String line = null;
while ((line = br.readLine()) != null) {
//Code to count the occurrences of the words
}
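The placeholder comment in the loop above could be filled in along the following lines. This is only a sketch: the class name PerLineWordCount and the method countLinesContaining are made up for illustration, and a StringReader stands in for the FileReader so that the example runs without a file. Each line gets its own HashSet, so a word repeated within a line is counted once for that line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class PerLineWordCount {

    // For every word, counts the number of lines on which it appears at least once.
    // (Hypothetical helper, not part of the original question.)
    static Map<String, Integer> countLinesContaining(Reader source) {
        Map<String, Integer> counts = new TreeMap<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // nothing to count on a blank line
                }
                // a fresh set per line: add() returns false for a repeated word
                Set<String> seenOnThisLine = new HashSet<>();
                for (String word : line.split("\\s+")) {
                    if (seenOnThisLine.add(word)) {
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "a" occurs twice on the first line but is counted once for that line
        System.out.println(countLinesContaining(new StringReader("a b a\nb c\n")));
        // prints {a=1, b=2, c=1}
    }
}
```

To read from a file instead, pass new FileReader("filename.txt") as the source; the constructor throws FileNotFoundException, which the caller has to handle.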
Yes. A Set is a collection like an ArrayList, but with the key difference that it holds no duplicates.
So read the file line by line and pass each line's words through a set before adding them to the list.
In your while loop:
while (sc.hasNextLine()) {
    String line = sc.nextLine().trim();
    if (line.isEmpty()) {
        continue; // skip blank lines
    }
    String[] space = line.split("\\s+");
    // convert this line's word array to a set, which drops duplicates within the line
    Set<String> set = new HashSet<String>(Arrays.asList(space));
    for (String word : set) {
        list.add(word);
    }
    totalWords += space.length;
}
The rest of the code should remain the same.
I have a text file with multiple lines of numbers like this:
0.0336 0.0243 0.0261
0.0075 0.1788 0.0669
I need to make a Java program to reformat them to one number per line:
0.0336
0.0243
0.0261
0.0075
0.1788
0.0669
Here is my code and it does not work:
while (scanner.hasNext())
{
String[] arr = scanner.nextLine().split("\\s+");
for(int i =0; i< arr.length; i++){
System.out.println(arr[i]);
}
}
This code results in an extra line whenever there is a new line, for example:
0.0336
0.0243
0.0261
//extra line here, which should be ignored
0.0075
0.1788
0.0669
Is there a way to ignore the line?
I tried the same code you did, and it works fine on my text file. I did this:
BufferedReader source = new BufferedReader(new FileReader("/path/of/file/stack.txt"));
Scanner scanner = new Scanner(source);
while (scanner.hasNext()) {
String[] arr = scanner.nextLine().split("\\s+");
for (int i = 0; i < arr.length; i++) {
System.out.println(arr[i]);
}
}
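One possible cause of the extra line, assuming the input file contains a blank line or a line that starts with whitespace: "".split("\\s+") returns an array holding one empty string, and a leading space produces an empty first element, and each of those is printed as an empty line. A sketch that sidesteps this by letting Scanner.next() read whitespace-separated tokens directly is shown below. The class name OneNumberPerLine and the method tokens are made up for illustration, and the Scanner reads from a String so the example runs without a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class OneNumberPerLine {

    // Collects every whitespace-separated token; blank lines and
    // leading spaces never produce an empty token with next().
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // sample input with a blank line and a leading space on the last line
        String text = "0.0336 0.0243 0.0261\n\n 0.0075 0.1788 0.0669\n";
        for (String token : tokens(text)) {
            System.out.println(token);
        }
    }
}
```

For a real file, the Scanner would be built with new Scanner(new File(...)) instead of from a String. If the line-based loop has to stay, skipping elements for which isEmpty() is true has the same effect.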
I am working through a past sample final exam in which the question asks me to read input from a file and then process it into words. The end of a sentence is marked by any word that ends with one of the three characters . ? !
I was able to write code for this; however, I can only split the input into sentences, using the Scanner class and useDelimiter. I want to process the input word by word instead and, when a word ends in one of the sentence separators above, stop adding words to the current sentence.
Any help would be appreciated, as I am learning this on my own. This is what I came up with:
File file = new File("finalq4.txt");
Scanner scanner = new Scanner(file);
scanner.useDelimiter("[.?!]");
while(scanner.hasNext()){
sentCount++;
line = scanner.next();
line = line.replaceAll("\\r?\\n", " ");
line = line.trim();
StringTokenizer tokenizer = new StringTokenizer(line, " ");
wordsCount += tokenizer.countTokens();
sentences.add(new Sentence(line,wordsCount));
for(int i = 0; i < line.replaceAll(",|\\s+|'|-","").length(); i++){
currentChar = line.charAt(i);
if (Character.isDigit(currentChar)) {
}else{
lettersCount++;
}
}
}
In this code I split the input into sentences using the useDelimiter method, then count the words and letters of the entire file and store the sentences in a Sentence class.
If I want to split this into words, how can I do that without using the Scanner class?
Some of the input from the file that I have to process is here:
Text that follows is based on the Wikipedia page on cryptography!
Cryptography is the practice and study of hiding information. In modern times,
cryptography is considered to be a branch of both mathematics and computer
science, and is affiliated closely with information theory, computer security, and
engineering. Cryptography is used in applications present in technologically
advanced societies; examples include the security of ATM cards, computer
passwords, and electronic commerce, which all depend on cryptography.....
I can further elaborate on this question if it needs explanation.
What I want to be able to do is keep adding words to the Sentence class and stop when a word ends in one of the sentence separators above, then read the next word and keep adding words until I hit another separator.
The snippet below should work:
public static void main(String[] args) throws FileNotFoundException {
    File file = new File("final.txt");
    Scanner scanner = new Scanner(file);
    scanner.useDelimiter("[.?!]");
    List<Sentence> sentences = new ArrayList<Sentence>();
    while (scanner.hasNext()) {
        String line = scanner.next().trim();
        if (!line.isEmpty()) { // skips the empty tokens produced by the "....." at the end
            // split on any run of whitespace, including the line breaks inside a sentence
            String[] wordsOfLine = line.split("\\s+");
            int wordsCount = wordsOfLine.length;
            Sentence sentence = new Sentence(line, wordsCount);
            sentences.add(sentence);
        }
    }
    scanner.close();
}
public class Sentence {
    String line = "";
    int wordsCount = 0;
    public Sentence(String line, int wordsCount) {
        this.line = line;
        this.wordsCount = wordsCount;
    }
}
You can use a BufferedReader to read every line of the file, split the collected text into sentences with the split method, and finally split each sentence into words with the same method. In the end it would look something like this:
StringBuilder sb = new StringBuilder();
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
    String line;
    while ((line = br.readLine()) != null) {
        // readLine() strips the line break, so add a space to keep words apart
        sb.append(line).append(' ');
    }
} catch (IOException e) {
    e.printStackTrace();
}
// a sentence ends at '.', '?' or '!'
String[] sentences = sb.toString().split("[.?!]+");
for (String sentence : sentences) {
    if (sentence.trim().isEmpty()) {
        continue;
    }
    String[] words = sentence.trim().split("\\s+");
    // Add words to sentence...
}
Okay, so I have been attacking this question with several techniques, and one of the approaches is shown above. However, I was also able to solve it with another approach that does not use the Scanner class. This one was much more accurate and gave me the exact output, whereas with the approach above I was off by a few words and letters.
try {
    input = new BufferedReader(new FileReader("file.txt"));
    strLine = input.readLine();
    while (strLine != null) {
        String[] tokens = strLine.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String s = tokens[i];
            // blank lines and leading whitespace produce an empty token; skip it
            if (s.isEmpty()) {
                continue;
            }
            wordsJoin += s + " ";
            wordCount++; // one more word in the current sentence
            int len = s.length();
            String charString = s.replaceAll("[^a-zA-Z ]", "");
            for (int k = 0; k < charString.length(); k++) {
                currentChar = charString.charAt(k);
                if (Character.isLetter(currentChar)) {
                    lettersCount++;
                }
            }
            // a word ending in '.', '?' or '!' closes the current sentence
            if (s.charAt(len - 1) == '.' || s.charAt(len - 1) == '?' || s.charAt(len - 1) == '!') {
                sentences.add(new Sentence(wordsJoin, wordCount));
                sentCount++;
                numOfWords += countWords(wordsJoin);
                wordsJoin = "";
                wordCount = 0;
            }
        }
        strLine = input.readLine();
    }
    input.close();
} catch (IOException e) {
    e.printStackTrace();
}
This might be useful for anyone doing the same problem or just need an idea of how to count letters, words and sentences from a text file.
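For readers who want something they can run as-is, here is a minimal self-contained sketch of the same word-by-word idea. It is not the exam's reference solution: the class SentenceSplitter, its nested Sentence class, and the methods split and countLetters are stand-ins made up for illustration, and the text is passed in as a String rather than read from a file.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceSplitter {

    // Hypothetical stand-in for the Sentence class from the question.
    static final class Sentence {
        final String text;
        final int wordCount;

        Sentence(String text, int wordCount) {
            this.text = text;
            this.wordCount = wordCount;
        }
    }

    // Adds words to the current sentence until a word ends in '.', '?' or '!'.
    static List<Sentence> split(String text) {
        List<Sentence> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int wordCount = 0;
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens when the whole text is blank
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
            wordCount++;
            char last = word.charAt(word.length() - 1);
            if (last == '.' || last == '?' || last == '!') {
                sentences.add(new Sentence(current.toString(), wordCount));
                current.setLength(0);
                wordCount = 0;
            }
        }
        // Assumption: trailing words without a terminator still form a sentence.
        if (wordCount > 0) {
            sentences.add(new Sentence(current.toString(), wordCount));
        }
        return sentences;
    }

    // Counts letters only; digits, punctuation and whitespace are ignored.
    static int countLetters(String text) {
        int letters = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isLetter(text.charAt(i))) {
                letters++;
            }
        }
        return letters;
    }

    public static void main(String[] args) {
        String text = "Is it hidden? Cryptography is the practice\n"
                + "and study of hiding information. Yes!";
        for (Sentence s : split(text)) {
            System.out.println(s.wordCount + " words: " + s.text);
        }
        System.out.println("letters: " + countLetters(text));
    }
}
```

Splitting on "\\s+" treats line breaks like spaces, so a sentence that wraps over several lines stays in one piece.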
I'm having a small problem with my code and I'm not exactly sure how to fix it.. Basically I'm trying to separate the file into different lines (Frames) and then input those lines into the file, and proceed to print them. My first line of the file never prints.
public class Main {
public static void main(String[] args) throws IOException
{
/*Switch switcherino = new Switch();*/
Frame frame = new Frame();
Scanner input = new Scanner(System.in);
System.out.println("Enter the name of the file to process: ");
String fileName = input.nextLine();
FileInputStream inputStream =
new FileInputStream(fileName);
InputStreamReader inputStreamReader =
new InputStreamReader(inputStream,Charset.forName("UTF-8"));
BufferedReader bufferedReader =
new BufferedReader(inputStreamReader);
try{
String str = " ";
while((str = bufferedReader.readLine())!= null){
String words[] = str.split(" ");
for (int i = 0; i < words.length; i++){
words[i] = bufferedReader.readLine();
System.out.println(words[i]);
}
}
}
catch (IOException e){
e.printStackTrace();
} finally {
try {
if (inputStream != null)
inputStream.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
}
I don't want to use an ArrayList, as much as it would probably be easier.
Thanks in advance!
File: (switch.txt)
fa00 123123123abc 111111222222 data1
fa01 111111222222 123123123abc data2
fa03 444444444444 123123123abc data3
fa01 123123123abc 4353434234ab data4
fa99 a11b22c33d44 444444444444 data5
Output: (from System.out.println(words[i]);)
fa01 111111222222 123123123abc data2
fa03 444444444444 123123123abc data3
fa01 123123123abc 4353434234ab data4
fa99 a11b22c33d44 444444444444 data5
The logic is wrong: once you have read a line and split it into words, just go ahead and print them. There is no need to read any more lines inside the inner loop:
while((str = bufferedReader.readLine())!= null){
String words[] = str.split(" ");
for (int i = 0; i < words.length; i++){
words[i] = bufferedReader.readLine();
System.out.println(words[i]);
}
}
Use this instead:
while((str = bufferedReader.readLine())!= null){
String words[] = str.split(" ");
for (int i = 0; i < words.length; i++){
System.out.println(words[i]);
}
}
// First pass: count the lines
int length = 0;
BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(fileName), Charset.forName("UTF-8")));
while (br.readLine() != null) {
    length++;
}
br.close();
// Second pass: reopen the file, because the first reader has already reached the end
String words[] = new String[length];
int j = 0;
bufferedReader = new BufferedReader(
        new InputStreamReader(new FileInputStream(fileName), Charset.forName("UTF-8")));
while ((str = bufferedReader.readLine()) != null) {
    words[j++] = str; // store the whole line
    System.out.println(str);
}
bufferedReader.close();
// Now words[] holds every line individually
Your code doesn't work because you called readLine() twice, which skipped the first line. Try this and let me know.
You don't need to use split() since you want the entire line :)
while((str = bufferedReader.readLine())!= null){
String words[] = str.split(" ");
for (int i = 0; i < words.length; i++){
words[i] = bufferedReader.readLine();
System.out.println(words[i]);
}
}
When you iterate over the file, you split your first line into a String array;
words[] contains the following elements: fa00, 123123123abc, 111111222222 and data1.
Then the inner for loop reads further lines from the BufferedReader, assigns each one to an index of words (overwriting the element that was there), and prints it, so the first line is never printed.
You are not supposed to invoke bufferedReader.readLine() in the inner for loop; it breaks your logic.
I need to read a text file, and break the text into blocks of 6 characters (including spaces), pad zeroes to the end of text to meet the requirement.
I tried doing it and here is what I have done.
File file = new File("Sample.txt");
String line;
try {
Scanner sc = new Scanner(file);
while(sc.hasNext()){
line = sc.next();
int chunk = line.length();
int block_size=6;
if((chunk%block_size) != 0)
{
StringBuilder sb = new StringBuilder(line);
int val = chunk%block_size;
for(int i=0; i<val; i++){
sb.append(" ");
}
line = new String(sb.toString());
}
int group = line.length() / block_size;
String[] b = new String[group];
System.out.println(line);
System.out.println(chunk);
int j =0;
for(int i=0; i<group;i++){
b[i] = line.substring(j,j+block_size);
j += block_size;
}
System.out.println("String after spliting is: ");
for(int i=0; i<group;i++){
System.out.println(b[i]);
}
}
}
This works fine when the text in the input file has no spaces between words, but when I add spaces it gives me a different output. I am stuck at this point. Any suggestions?
I don't want to write the solution for you, but I'd advise you that what you're trying to accomplish might be easier to do using a BufferedReader with a FileReader and by using Reader.read(buf), where buf is a char[6].
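A rough sketch of that suggestion follows, under a few assumptions: the class name FixedSizeBlocks and the method readBlocks are made up for illustration, the pad character is '0' as the question asks for zeroes, and a StringReader stands in for the file so the example runs on its own. It uses the three-argument Reader.read(buf, off, len) overload, because a single read call may return fewer characters than requested.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FixedSizeBlocks {

    // Reads the whole input in blocks of blockSize characters (spaces and
    // line breaks included) and pads the last block with the pad character.
    static List<String> readBlocks(Reader reader, int blockSize, char pad) {
        List<String> blocks = new ArrayList<>();
        char[] buf = new char[blockSize];
        try {
            while (true) {
                int filled = 0;
                // keep reading until the buffer is full or the input ends
                while (filled < blockSize) {
                    int n = reader.read(buf, filled, blockSize - filled);
                    if (n == -1) {
                        break; // end of input
                    }
                    filled += n;
                }
                if (filled == 0) {
                    break; // nothing left, no empty block is emitted
                }
                Arrays.fill(buf, filled, blockSize, pad); // no-op for a full block
                blocks.add(new String(buf));
                if (filled < blockSize) {
                    break; // the padded block was the last one
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return blocks;
    }

    public static void main(String[] args) {
        for (String block : readBlocks(new StringReader("hello world!!"), 6, '0')) {
            System.out.println("[" + block + "]");
        }
        // prints [hello ], [world!] and [!00000] on separate lines
    }
}
```

For the file from the question, the reader would be new BufferedReader(new FileReader("Sample.txt")). Because the reader is character-based rather than token-based, spaces are kept, which is where the Scanner.next() version in the question goes wrong: next() splits on whitespace before the blocks are built.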