Scan for duplicate strings in a file - java

I'm stuck on the next step in the logic of my code. I'm supposed to examine each line of a file, look for consecutive repeated tokens on the same line, and print each repeated token along with the number of times it occurs consecutively. Non-repeated tokens aren't printed.
Here's a sample file:
/*
* sometext.txt
* hello how how are you you you you
I I I am Jack's Jack's smirking smirking smirking smirking revenge
bow wow wow yippee yippee yo yippee yippee yay yay yay
one fish two fish red fish blue fish
It's the Muppet Show, wakka wakka wakka
*/
And here's the code I've written so far:
package chapter6;
import java.util.*;
import java.io.*;
public class OutputDuplicates {
public static void main (String[] args) throws FileNotFoundException {
for (;;) {
Scanner scan = new Scanner(System.in);
prompt(scan);
}
}
public static void prompt(Scanner scan) throws FileNotFoundException {
System.out.println("What is the name of the file?");
String name = scan.next();
File inputFile = new File(name);
if (inputFile.exists()) {
Scanner read = new Scanner(inputFile);
while (read.hasNext()) {
String line = read.nextLine();
Scanner oneLine = new Scanner (line);
while (oneLine.hasNext()) {
String word = oneLine.next();
System.out.println(word);
}
}
} else if (!inputFile.exists()) {
prompt(scan);
}
}
}
Any insight to the logic from here out would be much appreciated.

Here you go. Note that this counts how many times each whole line occurs in the file; it does not look at consecutive tokens within a line:
public Map<String, Long> scan(File file) throws Exception {
Map<String, Long> map = new HashMap<>();
Scanner read = new Scanner(file);
while (read.hasNext()) {
String line = read.nextLine();
if(map.containsKey(line)) {
map.put(line, map.get(line).longValue() + 1);
} else {
map.put(line, 1L);
}
}
return map;
}

pseudocode:
for each line in the file
{
lastword = ""
numtimes = 1
for each word in the line
{
if word == lastword
{
numtimes++
}
else
{
if numtimes > 1
{
print (/*print lastword and numtimes here*/)
}
lastword = word
numtimes = 1
}
}
if numtimes > 1
{
print (/*print lastword and numtimes here*/)
}
}
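The pseudocode above can be sketched in Java roughly as follows. The class name RunCounter and the methods describeRuns and appendRun are hypothetical names chosen for illustration. The sketch handles one line of text at a time and also reports a run that reaches the end of the line.

```java
import java.util.Scanner;

class RunCounter {
    // Builds a summary such as "how*2 you*4" for one line of text.
    static String describeRuns(String line) {
        StringBuilder out = new StringBuilder();
        String lastWord = null;
        int numTimes = 0;
        Scanner words = new Scanner(line);
        while (words.hasNext()) {
            String word = words.next();
            if (word.equals(lastWord)) {
                numTimes++;
            } else {
                // The previous run has ended; report it if it was a repeat.
                appendRun(out, lastWord, numTimes);
                lastWord = word;
                numTimes = 1;
            }
        }
        // Report a run that reaches the end of the line.
        appendRun(out, lastWord, numTimes);
        return out.toString();
    }

    // Appends "word*count" only when the word was repeated.
    private static void appendRun(StringBuilder out, String word, int count) {
        if (count > 1) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word).append('*').append(count);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeRuns("hello how how are you you you you"));
        System.out.println(describeRuns("bow wow wow yippee yippee yo yippee yippee yay yay yay"));
    }
}
```

To process a whole file, call describeRuns once for each line returned by nextLine().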

You want to make a symbol frequency table:
Map<String, Integer> symbolFrequencies = new HashMap<String, Integer>();
Then for each symbol, do this:
Integer countForSymbol = symbolFrequencies.get(symbol);
if (countForSymbol == null) {
symbolFrequencies.put(symbol, 1);
} else {
// Store the incremented count back into the map
symbolFrequencies.put(symbol, countForSymbol + 1);
}
and that's it. You will now have counts for all the symbols you have parsed.
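For comparison, here is a compact sketch of the same frequency table using Map.merge, which is available from Java 8 onward. The class name SymbolCounter and the method countSymbols are hypothetical. Keep in mind that a frequency table counts every occurrence of a symbol, not only consecutive repeats.

```java
import java.util.Map;
import java.util.TreeMap;

class SymbolCounter {
    // Counts how often each whitespace-separated symbol occurs in the text.
    static Map<String, Integer> countSymbols(String text) {
        Map<String, Integer> frequencies = new TreeMap<>();
        for (String symbol : text.trim().split("\\s+")) {
            if (!symbol.isEmpty()) {
                // Inserts 1 for a new symbol, otherwise adds 1 to the stored count.
                frequencies.merge(symbol, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        // prints {blue=1, fish=4, one=1, red=1, two=1}
        System.out.println(countSymbols("one fish two fish red fish blue fish"));
    }
}
```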

Related

Array/ArrayList to make order with scanner input

My teacher asked me to write a program that:
asks the user for the size of the array with a Scanner
recognizes commands in the Scanner input: if a line starts with the word "add", the word after it is added to the array
The required commands are ADD, DELETE, VIEW (display the element at index n), and DISPLAY (display all elements).
This is my attempt so far, but it's still far from correct. Please help!
public static void main(String[] args) {
Scanner input = new Scanner(System.in);
int a = input.nextInt();
String arr[] = new String[a];
for (int i = 0; i < a; i++) {
for (int j=0;j<arr.length;j++){
arr[i] = input.nextLine();
}
}
for( String b : arr ){
System.out.println(b);
}
}
An example of the scanner input is:
7
ADD this
ADD IS
ADD not
ADD real
VIEW 2
DELETE not
DISPLAY
and the output will be
not
this is real
There's no reason to ask how long the array should be if you use an ArrayList, which is a dynamically sized list. This code could be made easier to read, but here's an example of what you are looking for:
// Requires: import java.util.ArrayList; import java.util.List; import java.util.Scanner;
public static void main(final String[] args)
{
final List<String> list = new ArrayList<String>();
final Scanner scan = new Scanner(System.in);
while (true)
{
// Lower-casing makes the commands case-insensitive, but it also lower-cases the stored words
final String command = scan.nextLine().toLowerCase();
// startsWith/substring only strip the leading command, not later occurrences in the text
if (command.startsWith("add "))
{
list.add(command.substring("add ".length()));
} else if (command.startsWith("delete "))
{
final String toDelete = command.substring("delete ".length());
if (!list.remove(toDelete))
System.out.format("\"%s\" didn't exist in the list!%n", toDelete);
} else if (command.startsWith("view "))
{
System.out.println(list.get(Integer.parseInt(command.substring("view ".length()).trim())));
} else if (command.equals("display"))
{
for (final String str : list)
{
System.out.println(str);
}
break;
} else
{
System.out.println("Unknown command!");
}
}
scan.close();
}

What is wrong with my code? It always prints null+word. What is null?

We have a file with a few words. I am trying to store the words that have 2, 4, 6, or 8 letters in an array, but the output shows null, or null followed by the matching word.
What did I write wrong, and why does it show null?
public static void lyginis () throws IOException {
Path path = Paths.get("words.txt");
Scanner scanner = new Scanner(path);
int kiek = 0;
while (scanner.hasNext()) {
scanner.next();
kiek++;
}
Scanner scanner1 = new Scanner(path);
String[] atrinkti = new String[kiek];
String scan = "";
for (int i = 0; i < kiek; i++) {
scan = scanner1.next();
if (scan.length() % 2 == 0) {
atrinkti[i] += scan ;
}
System.out.println(atrinkti[i]);
}
}
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
public class Hello {
public static void main(String[] args) throws IOException {
File file = new File("words.txt");
Scanner scanner = new Scanner(file);
int kiek = 0;
while (scanner.hasNext()) {
scanner.next();
kiek++;
}
Scanner scanner2 = new Scanner(file);
String[] atrinkti = new String[kiek];
String word = "";
for (int i = 0; i < kiek; i++) {
word = scanner2.next();
if (word.length() % 2 == 0) {
atrinkti[i] = word;
System.out.println(atrinkti[i]);
}
}
}
}
Output
$ cat words.txt
hi
hello
whats up
chicken
duck
goose
$ javac Hello.java; java Hello
hi
up
duck
The issues were:
Path was used instead of File. This change is optional, since Scanner also has a constructor that accepts a Path.
+= was used inside the if statement instead of =. Each array element starts out as null, so += concatenates the text "null" with the word.
System.out.println() was called outside the if statement, so whenever a word's length was odd, the untouched array element printed its default value, null.
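The null+word output can be reproduced in isolation. The following is a minimal sketch; the class name NullConcatDemo and the method appendToUnsetSlot are hypothetical and exist only for this demonstration.

```java
class NullConcatDemo {
    // Shows what += does to an array element that was never assigned.
    static String appendToUnsetSlot(String word) {
        String[] slots = new String[1]; // elements of a new String[] default to null
        slots[0] += word;               // same as slots[0] = slots[0] + word
        return slots[0];
    }

    public static void main(String[] args) {
        System.out.println(appendToUnsetSlot("duck")); // prints nullduck
    }
}
```

String concatenation converts a null reference to the four characters "null", which is why plain assignment with = is the right operator here.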

Checking for and counting punctuation in a file

I'm currently doing an assignment which requires the program to count words and punctuation from a text file. The word counting program is done and working but my professor provided an additional method to be combined with it to count punctuation that I cannot seem to get to work. Here is the working program:
import java.util.*;
import java.io.*;
public class SnippetWeek11 {
public static void main(String[] args) throws Exception {
Scanner input = new Scanner(System.in);
System.out.print("Enter a filename of a text file to process: ");
String filename = input.nextLine();
File file = new File(filename);
if (file.exists()) {
processFile(file);
}
else {
System.out.println("File " + filename + " does not exist");
}
}
private static void processFile(File theFile) throws Exception {
int wordIndex;
// Create a TreeMap to hold words as key and count as value
Map<String, Integer> map = new TreeMap<>();
Scanner input = new Scanner(theFile);
String line, keyText;
String[] words;
while (input.hasNextLine()) {
line = input.nextLine();
words = line.split("[\\s+\\p{P}]");
for (wordIndex = 0; wordIndex < words.length; wordIndex++) {
keyText = words[wordIndex].toLowerCase();
updateMap(map, keyText);
}
}
// Display key and value for each entry
map.forEach((key, value) -> System.out.println(key + "\t" + value));
}
private static void updateMap(Map<String, Integer> theMap,
String theText) {
int value;
String key = theText.toLowerCase();
if (key.length() > 0) {
if (!theMap.containsKey(key)) {
// The key does not exist in the Map object (theMap), so add key and
// the value (which is a count in this case) to a new theMap element.
theMap.put(key, 1);
}
else {
// The key already exists, so obtain the value (count in this case)
// from theMap element that contains the key and update the element
// with an increased count.
value = theMap.get(key);
value++;
theMap.put(key, value);
}
}
}
And here is the method that must be combined with the word count program. I would appreciate any help you could give. Thanks.
public static int countPunctuation(File theFile) throws Exception {
String[] punctuationString = {"[","]",".",";",",",":","!","?","(",")","{","}","'"};
Set<String> punctuationSet =
new HashSet<>(Arrays.asList(punctuationString));
int count = 0;
Scanner input = new Scanner(theFile);
while (input.hasNext()) {
String character = input.next();
if (punctuationSet.contains(character))
count++;
}
return count;
}
}
If you can use the Pattern class, you can do this:
import java.util.regex.*;
class PunctuationMatch
{
public static void main(String[] args) {
// Inside a character class no '|' separators are needed, and a leading '^'
// anchor would restrict the match to the start of the input.
final Pattern p = Pattern.compile("[,.?!:;]");
final Matcher m = p.matcher("Hello, World! How are you?");
int count = 0;
// Each successful find() is one punctuation mark
while (m.find()) {
count++;
}
System.out.println(count); // prints 3
}
}
Put every punctuation mark you want to identify inside the character class passed to compile.
Pass your entire data string to matcher, or process the file line by line and add up the counts.
See the Javadoc for java.util.regex.Pattern for details.
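One likely reason the professor's countPunctuation finds so little is that Scanner.next() returns whitespace-separated tokens, so a mark attached to a word (such as the comma in "Show,") never equals an entry in the set. A sketch that inspects individual characters instead, using the same punctuation list, could look like this. The class name PunctuationCounter and the String-based signature are assumptions for illustration; for a file, call the method on each line and sum the results.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PunctuationCounter {
    // Same marks as the punctuationString array in the question, stored as characters.
    private static final Set<Character> PUNCTUATION = new HashSet<>(Arrays.asList(
            '[', ']', '.', ';', ',', ':', '!', '?', '(', ')', '{', '}', '\''));

    // Counts every character of the text that is one of the listed marks.
    static int countPunctuation(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (PUNCTUATION.contains(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The apostrophe, the comma, and the exclamation mark are counted: prints 3
        System.out.println(countPunctuation("It's the Muppet Show, wakka wakka wakka!"));
    }
}
```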

print specific characters in an array of strings

I need to print the characters at specific indexes of the strings in an array. For example, given
String[] words = {"car", "bike", "truck"};
printing words[0][0] would give c, and printing words[0][1] would give a.
I also have to read the array from a text file. What I have so far prints the first word of the array.
import java.util.Scanner;
import java.io.File;
import java.io.IOException;
import java.io.FileNotFoundException;
public class DemoReadingFiles
{
public static void main (String[] args)
{
String[] words = readArray("words.txt");
System.out.println(words[0]);//i can get it to print specific elements
}
public static String[] readArray(String file)
{
int ctr = 0;
try
{
Scanner s1 = new Scanner(new File(file));
while (s1.hasNextLine())
{
ctr = ctr + 1;
s1.next();
}
String[] words = new String[ctr];
Scanner s2 = new Scanner(new File(file));
for (int i = 0; i < ctr; i = i + 1)
{
words[i] = s2.next();
}
return words;
}
catch (FileNotFoundException e)
{
}
return null;
}
}
public static void main(String[] args) {
String[] words = {"cars", "bike", "truck"};
System.out.println("Specific character print:" + words[0].charAt(0));
System.out.println("Multi character selection printed as follows:" + words[0].substring(1, words[0].length() - 1));
}
Output:
Specific character print:c
Multi character selection printed as follows:ar
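Java has no words[0][0] syntax for strings, but combining an array index with charAt gives the same effect. A small sketch follows; the class name CharIndexDemo and the helper letterAt are hypothetical.

```java
class CharIndexDemo {
    // Plays the role of words[wordIndex][charIndex] from the question.
    static char letterAt(String[] words, int wordIndex, int charIndex) {
        return words[wordIndex].charAt(charIndex);
    }

    public static void main(String[] args) {
        String[] words = {"car", "bike", "truck"};
        System.out.println(letterAt(words, 0, 0)); // prints c
        System.out.println(letterAt(words, 0, 1)); // prints a
        // Visit every character of every word in order.
        for (String word : words) {
            for (int i = 0; i < word.length(); i++) {
                System.out.print(word.charAt(i) + " ");
            }
            System.out.println();
        }
    }
}
```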

Java: Count duplicate tokens on line using Scanner object

Yes, this is an exercise from "Building Java Programs", but it's not an assigned problem.
I need to write a method that reads the following text as input:
hello how how are you you you you
I I I am Jack's Jack's smirking smirking smirking smirking smirking revenge
bow wow wow yippee yippee yo yippee yippee yay yay yay
one fish two fish red fish blue fish
It's the Muppet Show, wakka wakka wakka
And produces the following as output:
how*2 you*4
I*3 Jack's*2 smirking*5
wow*2 yippee*2 yippee*2 yay*3
wakka*3
Now I know I have to use Scanner objects to first read a line into a String, then to tokenize the string. What I don't get is how to read a token into a string and then immediately compare it to the next token.
CONSTRAINT -> This is from the chapter before arrays so I'd like to solve without using one.
Here is the code I have so far:
public class Exercises {
public static void main(String[] Args) throws FileNotFoundException {
Scanner inputFile = new Scanner(new File("misc/duplicateLines.txt"));
printDuplicates(inputFile);
}
public static void printDuplicates(Scanner input){
while(input.hasNextLine()){
//read each line of input into new String
String lineOfWords = input.nextLine();
//feed String into new scanner object to parse based on tokens
Scanner newInput = new Scanner(lineOfWords);
while(newInput.hasNext()){
//read next token into String
String firstWord = newInput.next();
//some code to compare one token to another
}
}
}
No need to use arrays...you just need a little bit of state in the while loop:
import java.io.*;
import java.util.*;
public class Exercises {
public static void main(String[] Args) throws FileNotFoundException {
Scanner inputFile = new Scanner(new File("misc/duplicateLines.txt"));
printDuplicates(inputFile);
}
public static void printDuplicates(Scanner input){
while(input.hasNextLine()){
// tokenize one line at a time so that a run never spans a line break
Scanner newInput = new Scanner(input.nextLine());
int lastWordCount = 0;
String lastWord = null;
while(newInput.hasNext()){
//read next token into String
String nextWord = newInput.next();
// reset counters on change and print out if count > 1
if(!nextWord.equals(lastWord)) {
if(lastWordCount > 1) {
System.out.print(lastWord + "*" + lastWordCount + " ");
}
lastWordCount = 0;
}
lastWord = nextWord;
lastWordCount++;
}
// print out last word if it was repeated
if(lastWordCount > 1) {
System.out.print(lastWord + "*" + lastWordCount + " ");
}
// end the output line for this input line
System.out.println();
}
}
}
How about this? I'm allocating an extra string to keep track of the previous word.
while(input.hasNextLine()){
//read each line of input into new String
String lineOfWords = input.nextLine();
//feed String into new scanner object to parse based on tokens
Scanner newInput = new Scanner(lineOfWords);
String previousWord = "";
String currentWord = "";
while(newInput.hasNext()){
//read next token into String
previousWord = currentWord;
currentWord = newInput.next();
if (currentWord.equals(previousWord)) {
// duplicate detected!
}
}
}
public class test2 {
public static void main(String[] args) {
Scanner input = null;
try {
input = new Scanner(new File("chinese.txt"));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
String currentLine;
String lastWord="";
String currentWord="";
int count=1;
while (input.hasNextLine()){
currentLine=input.nextLine();
Scanner newInput = new Scanner (currentLine);
//System.out.println(currentLine);
while(newInput.hasNext()){
currentWord=newInput.next();
if (!currentWord.equals(lastWord)&& count>1){
System.out.print(lastWord+"*"+count+" ");
count=1;
}
else if (currentWord.equals(lastWord)){
count++;
}
lastWord=currentWord;
}
if (count>1){
System.out.print(lastWord+"*"+count+" ");
}
System.out.println();
count=1;
lastWord=""; // reset so a run cannot continue onto the next line
}
input.close();
}
}
