I feel as if I am missing something really simple but I can't find it.
The goal of this code is to read a Shakespeare text file and use a hash map to count how many times each word occurs in the text, and to look up words that are "n" characters long. However, I can't even get to the debugging stage because compilation fails with this error:
Bard.java:13: error: cannot find symbol
Pattern getout = Pattern.compile("[\\w']+"); //this will take only the words
^ symbol: class Pattern location: class Bard
Bard.java:13: error: cannot find symbol
Pattern getout = Pattern.compile("[\\w']+"); //this will take only the words
plus a few more at other locations. Help would be greatly appreciated.
import java.io.*;
import java.util.*;
public class Bard {
public static void main(String[] args) {
HashMap < String, Integer > m1 = new HashMap < String, Integer > (); // sets the hashmap
//create file reader for the shakespere text
try (BufferedReader br = new BufferedReader(new FileReader("shakespeare.txt"))) {
String line = br.readLine();
Pattern getout = Pattern.compile("[\\w']+"); //this will take only the words
//create the hashmap
while (line != null) {
Matcher m = getout.matcher(line); //find the relevant information
while (m.find()) {
if (m1.get(m.group()) == null && !m.group().toUpperCase().equals(m.group())) { //find new word that is not in all caps.
m1.put(m.group(), 1);
} else { //increments the old word
int newValue = m1.get(m.group());
newValue++;
m1.put(m.group(), newValue);
}
}
line = br.readLine();
}
} catch (Exception e) {
e.printStackTrace();
}
try (BufferedReader br2 = new BufferedReader(new FileReader("input.txt"))) {
String line2 = br2.readLine();
FileWriter output = new FileWriter("analysis.txt");
while (line2 != null) {
if (line2.matches("[\\d\\s]+")) { // if i am dealing with the two integers
String[] args = line.split(" "); // split them up
wordSize = Integer.parseInt(args[0]); // set the first on the the word size
numberOfWords = Integer.parseInt(args[1]); // set the other one to the number of words wanted
String[] wordsToReturn = new String[numberOfWords]; //create array to place the words
int i = 0;
int j;
for (String word: m1.keySet()) { //
if (word.length() == wordSize) {
wordToReturn[i] = word;
i++;
}
for (j = 0; numberOfWords > j; j++) {
output.write(wordToReturn[j]);
}
}
} else {
output.write(m1.get(line2));
}
}
line2 = br2.readLine();
} catch (Exception e) {
e.printStackTrace();
}
}
}
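You have not imported the Pattern class (Matcher is missing too). Both live in the java.util.regex package, which the java.util.* wildcard does not cover, so import it with: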
import java.util.regex.*;
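With the import in place, the counting part of the question can be sketched roughly as below. This is only a minimal sketch, not the asker's program: the class name WordCountSketch and the method countWords are made up for illustration, it works on a String instead of shakespeare.txt, and it leaves out the "skip all-caps words" rule from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original Bard program.
class WordCountSketch {
    // Same pattern as in the question: runs of word characters and apostrophes.
    private static final Pattern WORD = Pattern.compile("[\\w']+");

    // Counts how often each word occurs in the given text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the old count.
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("to be or not to be"));
    }
}
```

Using merge avoids the separate "is it already in the map" branch, which is where a null lookup can otherwise cause trouble.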
I want to find the line numbers on which each word of a text file occurs. However, the method I wrote below only gives the first line number, while I need the full list of line numbers.
For instance, if "a" occurs in lines 1, 3 and 5, it should have the list [1,3,5]. This list will then be passed into another method for further processing. But my result only shows [1] for "a".
Can someone help me fix this? Thank you!
public SomeObject<Word> buildIndex(String fileName, Comparator<Word> comparator) {
SomeObject<Word> someObject = new SomeObject<>(comparator);
Comparator<Word> comp = checkComparator(someObject.comparator());
int num = 0;
if (fileName != null) {
File file = new File(fileName);
try (Scanner scanner = new Scanner(file, "latin1")) {
while (scanner.hasNextLine()) {
String lines;
if (comparator instanceof IgnoreCase) {
lines = scanner.nextLine().toLowerCase();
} else {
lines = scanner.nextLine();
}
if (lines != null) {
String[] lineFromText = lines.split("\n");
List<Integer> list = new ArrayList<>();
for (int i = 0; i < lineFromText.length; i++) {
String[] wordsFromText = lineFromText[i].split("\\W");
num++;
for (String s : wordsFromText) {
if (s != null && lineFromText[i].contains(s)) {
list.add(num);
}
if (s != null && !s.trim().isEmpty() && s.matches("^[a-zA-Z]*$")) {
doInsert(s, comp, someObject, list);
}
}
}
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
return someObject;
}
Does something like this work for you?
It reads in the lines one at a time.
Finds the words by splitting on whitespace.
Then puts the words and the line numbers in a map where the
key is the word and the value is a list of line numbers.
int lineCount = 1;
String fileName = "SomeFileName";
Map<String, List<Integer>> index = new HashMap<>();
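// pass a File (java.io.File); new Scanner("fileName") would scan the literal text "fileName"
Scanner scanner = new Scanner(new File(fileName));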
while (scanner.hasNextLine()) {
//get single line from file
String line = scanner.nextLine().toLowerCase();
//split into words
for (String word : line.split("\\s+")) {
// add to lineNumber to map if List already there.
// otherwise add new List and then add lineNumber
index.compute(word,
(wd, list) -> list == null ? new ArrayList<>()
: list).add(lineCount);
}
// bump lineCount for next line
lineCount++;
}
Print them out.
index.forEach((k, v) -> System.out.println(k + " --> " + v));
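For reference, here is a self-contained variant of the same idea that can be run without a file. It is a sketch under a few assumptions: the class LineIndexSketch and method buildIndex are invented names, the input is passed as a String (a Scanner built from a String scans that text directly), words are split on non-word characters, and a line number is recorded only once per word even if the word repeats on that line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

// Hypothetical helper class for illustration only.
class LineIndexSketch {
    // Maps each lower-cased word to the list of line numbers it occurs on.
    static Map<String, List<Integer>> buildIndex(String text) {
        Map<String, List<Integer>> index = new TreeMap<>();
        int lineNumber = 0;
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lineNumber++;
                String line = scanner.nextLine().toLowerCase();
                for (String word : line.split("\\W+")) {
                    if (word.isEmpty()) {
                        continue; // split can yield an empty leading token
                    }
                    // create the list on first sight of the word, then reuse it
                    List<Integer> lines = index.computeIfAbsent(word, k -> new ArrayList<>());
                    // avoid recording the same line twice for a repeated word
                    if (lines.isEmpty() || lines.get(lines.size() - 1) != lineNumber) {
                        lines.add(lineNumber);
                    }
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        buildIndex("a b\nc\na a").forEach((k, v) -> System.out.println(k + " --> " + v));
    }
}
```

The key difference from the question's code is that one list lives per word inside the map, rather than one list per line shared by every word on it.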
EDIT: edited for clarity about what I'm having trouble with. I'm not getting the right results because it's counting duplicates. I HAVE to use regex; I could use a tokenizer, but I did not.
What I am trying to do here: there are 5 input files, and I need to calculate how many "USER DEFINED VARIABLES" there are in each. Please ignore the messy code, I'm just learning Java.
I replaced: everything within ( and ), all non-word characters, any keywords such as int, main etc., any digit with a space in front of it, and any blank space with a new line, then trimmed the result.
This leaves me with a list of strings which I match with my regex. However, at this point, how do I make my count only include unique identifiers?
EXAMPLE:
For example, in the input file I have attached beneath the code, I am receiving
"distinct/unique identifiers: 10" in my output file, when it should be "distinct/unique identifiers: 3"
And for example, in the 5th input file I have attached, I should have "distinct/unique identifiers: 3" instead I currently have "distinct/unique identifiers: 6"
I cannot use Set, Map etc.
Any help is great! Thanks.
import java.util.*;
import java.util.regex.*;
import java.io.*;
public class A1_123456789 {
public static void main(String[] args) throws IOException {
if (args.length < 1) {
System.out.println("Wrong number of arguments");
System.exit(1);
}
for (int i = 0; i < args.length; i++) {
FileReader jk = new FileReader(args[i]);
BufferedReader ij = new BufferedReader(jk);
FileWriter fw = null;
BufferedWriter bw = null;
String regex = "\\b(\\w+)(\\s+\\1\\b)+";
Pattern p = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");
String line;
int count = 0;
while ((line = ij.readLine()) != null) {
line = line.replaceAll("\\(([^\\)]+)\\)", " " );
line = line.replaceAll("[^\\w]", " ");
line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
line = line.replaceAll(" \\d", "");
line = line.replaceAll(" ", "\n");
line = line.trim();
Matcher m = p.matcher(line);
while (m.find()) {
count++;
}
}
try {
String s1 = args[i];
String s2 = s1.replaceAll("input","output");
fw = new FileWriter(s2);
bw = new BufferedWriter(fw);
bw.write("distinct/unique identifiers: " + count);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (bw != null) {
bw.close();
}
if (fw != null) {
fw.close();
}
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
}
//This is the 3rd input file below.
int celTofah(int cel)
{
int fah;
fah = 1.8*cel+32;
return fah;
}
int main()
{
int cel, fah;
cel = 25;
fah = celTofah(cel);
printf("Fah: %d", fah);
return 0;
}
//This is the 5th input file below.
int func2(int i)
{
while(i<10)
{
printf("%d\t%d\n", i, i*i);
i++;
}
}
int func1()
{
int i = 0;
func2(i);
}
int main()
{
func1();
return 0;
}
Try this
LinkedList dtaa = new LinkedList();
String[] parts =line.split(" ");
for(int ii =0;ii<parts.length;ii++){
if(ii == 0)
dtaa.add(parts[ii]);
else{
if(dtaa.contains(parts[ii]))
continue;
else
dtaa.add(parts[ii]);
}
}
count = dtaa.size();
instead of
Matcher m = p.matcher(line);
while (m.find()) {
count++;
}
Amal Dev has suggested a correct implementation, but given the OP wants to keep Matcher, we have:
// Previous code to here
// Linked list of unique entries
LinkedList<String> uniqueMatches = new LinkedList<>();
// Existing code
while ((line = ij.readLine()) != null) {
line = line.replaceAll("\\(([^\\)]+)\\)", " " );
line = line.replaceAll("[^\\w]", " ");
line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
line = line.replaceAll(" \\d", "");
line = line.replaceAll(" ", "\n");
line = line.trim();
Matcher m = p.matcher(line);
while (m.find()) {
// New code - get this match
String thisMatch = m.group();
// If we haven't seen this string before, add it to the list
if(!uniqueMatches.contains(thisMatch))
uniqueMatches.add(thisMatch);
}
}
// Now see how many unique strings we have collected
count = uniqueMatches.size();
Note I haven't compiled this, but hopefully it works as is...
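A compilable version of the same approach might look like the sketch below. The assumptions: UniqueIdentifierSketch and uniqueMatches are made-up names, the lines passed in have already gone through the question's replaceAll clean-up, and a plain List with contains() is used because the OP cannot use Set or Map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class UniqueIdentifierSketch {
    // Identifier pattern taken from the question.
    private static final Pattern IDENT = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");

    // Collects every distinct identifier found in the (already cleaned) lines.
    static List<String> uniqueMatches(String[] cleanedLines) {
        List<String> unique = new ArrayList<>();
        for (String line : cleanedLines) {
            Matcher m = IDENT.matcher(line);
            while (m.find()) {
                String match = m.group();
                // only remember a string the first time it is seen
                if (!unique.contains(match)) {
                    unique.add(match);
                }
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        String[] demo = {"cel fah", "fah cel celTofah"};
        System.out.println("distinct/unique identifiers: " + uniqueMatches(demo).size());
    }
}
```

The list has to be created once per input file, outside the line loop, otherwise duplicates on different lines are counted again.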
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;
public class ReadTemps{
public static void main(String[] args) throws IOException {
//taking the word to search from keyboard
Scanner keyboard = new Scanner(System.in);
System.out.print("Enter the word you want to search: ");
String input = keyboard.nextLine();
//counter for how many times the word occurs in the current line
int counter = 0;
//counter to find which line we are searching
int counterLine = 1;
// // read KeyWestTemp.txt
// create token1
String token1 = "";
// for-each loop for calculating heat index of May - October
// create Scanner inFile1
Scanner inFile1 = new Scanner(new File("C:\\KeyWestTemp.txt"));
// Original answer used LinkedList, but probably preferable to use
// ArrayList in most cases
// List<String> temps = new LinkedList<String>();
ArrayList<String> temps = new ArrayList<String>();
// while loop
while (inFile1.hasNext()) {
// find next line
token1 = inFile1.nextLine();
//removing whitespace (replaceAll returns a new String, so keep the result)
token1 = token1.replaceAll("\\s+","");
//taking all the letters as String
for(int i = 0; i < token1.length(); i++) {
char c = token1.charAt(i);
String s = "" + c;
temps.add(s);
}
//adding a marker to find the end of the line
temps.add("line");
}
inFile1.close();
String[] tempsArray = temps.toArray(new String[0]);
//searching on array to find first letter of word
for (int i = 0; i < tempsArray.length; i++) {
String s = temps.get(i);
//if its the end of line time to print
if(s.equals("line")) {
System.out.println("Line" + counterLine + " : " + counter + " occurrence ");
counterLine++;
counter = 0;
}
//if the first letter found need to search rest of the letters
if(s.equalsIgnoreCase("" + input.charAt(0))) {
s = "";
try {
for(int j = i; j < i + input.length(); j++) {
String comp = temps.get(j);
if(comp.equalsIgnoreCase("" + input.charAt(j-i)))
s = s + comp;
}
} catch (IndexOutOfBoundsException e) {
}
//checks if found the word
if(s.equalsIgnoreCase(input))
counter++;
}
}
}
}
This is the code I ended up with; it searches character by character for the wanted String.
Rather than using inFile1.next();, use inFile1.nextLine(), and don't bother wasting time using a token string.
while (inFile1.hasNextLine()) {
temps.add(inFile1.nextLine());
}
Use a BufferedReader; it reads line by line:
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
String line;
while ((line = br.readLine()) != null) {
// process the line here
}
}
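To tie the line-by-line reading back to the original goal (counting how often a word occurs on each line), something like the following could work. It is a rough sketch with assumptions: LineSearchSketch and countPerLine are invented names, the method takes any Reader so it can be tried with a StringReader instead of C:\KeyWestTemp.txt, matching ignores case, and overlapping occurrences are counted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
class LineSearchSketch {
    // Returns, for each line of the source, how often `word` occurs in it (ignoring case).
    static List<Integer> countPerLine(Reader source, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("search word must not be empty");
        }
        String needle = word.toLowerCase();
        List<Integer> counts = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String haystack = line.toLowerCase();
                int count = 0;
                // indexOf returns -1 once there are no further occurrences
                for (int at = haystack.indexOf(needle); at >= 0; at = haystack.indexOf(needle, at + 1)) {
                    count++;
                }
                counts.add(count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Reader demo = new StringReader("hot hot\ncold\nHot day");
        List<Integer> counts = countPerLine(demo, "hot");
        for (int i = 0; i < counts.size(); i++) {
            System.out.println("Line" + (i + 1) + " : " + counts.get(i) + " occurrence ");
        }
    }
}
```

For a real file, pass new FileReader(fileName) as the source. String.indexOf does the character comparison that the original code does by hand with a list of single-letter strings.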
I am trying to match words from array to create a Symbol Table for Lexical Analysis (compiler lab). I am reading a C code file from Java. I am able to find everything from the file except the first word. No matter what I try the first word does not match with anything although it is a valid word.
In my file, the first line starts with int (a declaration of variables) and the second line starts with float (another declaration). If I swap them, my code can match int but does not match float.
Here is the file I am reading:
float d, e;
int a, b, c;
Here is the code to read from file:
public static void fileRead(String fileName)
{
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader(fileName));
try {
String x;
while ( (x = br.readLine()) != null )
{
// printing out each line in the file
System.out.println(x);
parser(x);
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
} catch (FileNotFoundException e) {
System.out.println(e);
e.printStackTrace();
}
}
parser is another method and it is used to parse out different words:
public static void parser(String line)
{
String text = "";
for(int i = 0; i < line.length(); i++)
{
String temp = line.charAt(i) + "";
if(!(temp.equals(" ")
|| temp.equals(",")
|| temp.equals(";")
|| temp.equals(")")
|| temp.equals("}")
|| temp.equals("(")
|| temp.equals("{")
|| temp.equals("[")
|| temp.equals("]")
))
{
text = text + temp;
}
else
{
text = text.trim();
if(text.equals("int"))
{
System.out.println("Say cheese");
}
addToarray(text);
text = "";
}
}
I thought there might be a space at the end, so I trimmed the text as a precaution.
And this is how I am adding to an array. The check in question is:
if(item.equals(text))
Here "int" seems to get lost, and execution never goes inside the if block:
public static void addToarray(String text)
{
boolean flag = false;
//look for keyWords first.
for (String item : keyWords)
{
if(item.equals(text))
{
if(resultKey.size() == 0)
{
System.out.println("Size zero> "+resultKey.size());
resultKey.add(text);
text = "";
flag = true;
break;
}
else
{
boolean checker = true;
for(String key : resultKey)
{
if(key.equals(text))
{
checker = false;
break;
}
}
if(checker)
{
resultKey.add(text);
flag = true;
text = "";
}
}
}
}
This is the array I am using to match:
final static String []keyWords = {"float", "if", "else",
"long", "double", "BigInteger","int"};
and these are the ArrayList to store variables.
static ArrayList <String> resultKey, resultIdent , resultMath,
resultLogic, resultNumeric, resultOthers;
Thanks for your help.
Launching this simple app, it works for me, so I don't know why you are unable to read the first word. EDIT: it's 100% the BOM at the start of your file, as @Fildor noticed.
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
public class Parser {
final static String[] keyWords = { "float", "if", "else", "long", "double", "BigInteger", "int" };
static ArrayList<String> resultKey = new ArrayList<>();
public static void main(String[] args) {
fileRead("src/test/resources/test.txt");
for (final String key : resultKey) {
System.out.println(key);
}
}
public static void fileRead(String fileName) {
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader(fileName));
try {
String x;
while ((x = br.readLine()) != null) {
// printing out each line in the file
System.out.println(x);
parser(x);
}
br.close();
} catch (final IOException e) {
e.printStackTrace();
}
} catch (final FileNotFoundException e) {
System.out.println(e);
e.printStackTrace();
}
}
public static void parser(String line) {
String text = "";
for (int i = 0; i < line.length(); i++) {
final String temp = line.charAt(i) + "";
if (!(temp.equals(" ") || temp.equals(",") || temp.equals(";") || temp.equals(")") || temp.equals("}")
|| temp.equals("(") || temp.equals("{") || temp.equals("[") || temp.equals("]"))) {
text = text + temp;
} else {
text = text.trim();
if (text.equals("int")) {
System.out.println("Say cheese");
}
addToarray(text);
text = "";
}
}
}
public static void addToarray(String text) {
boolean flag = false;
// look for keyWords first.
for (final String item : keyWords) {
if (item.equals(text)) {
if (resultKey.size() == 0) {
System.out.println("Size zero> " + resultKey.size());
resultKey.add(text);
text = "";
flag = true;
break;
} else {
boolean checker = true;
for (final String key : resultKey) {
if (key.equals(text)) {
checker = false;
break;
}
}
if (checker) {
resultKey.add(text);
flag = true;
text = "";
}
}
}
}
}
}
And the file test.txt contains exactly
float d, e;
int a, b, c;
Launching it prints
float d, e;
Size zero> 0
int a, b, c;
Say cheese
float
int
"int" is not matched, because your input file probably contains a Byte-Order-Mark.
You can check for it in code or with a hex editor. Most likely it will be one of the byte sequences EF BB BF (UTF-8), FE FF (UTF-16 big endian) or FF FE (UTF-16 little endian), but there are more. I already referenced a W3C document on the topic in the comments. Here is the Wikipedia article with even more BOMs.
Sidenote:
Which teacher hands out a "dirty" input file!? He must be some kind of sadist or (which would be even worse, imho) he did not do it on purpose. I would try to copy the (printable) content of the file to a new file and test this as input. So if the clean file works to your satisfaction, you can find some means of sanitizing input.
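One possible way to sanitize the input in code is sketched below. It assumes the file is decoded as UTF-8 (or UTF-16), in which case the BOM arrives as the single character U+FEFF at the start of the first line; with a different default charset the BOM bytes show up as other characters and this check will not catch them. The names BomSketch and stripBom are made up for this example.

```java
// Hypothetical helper class for illustration only.
class BomSketch {
    // A decoded byte-order mark is the single character U+FEFF.
    static final char BOM = '\uFEFF';

    // Returns the line without a leading BOM; other lines are returned unchanged.
    static String stripBom(String line) {
        if (line != null && !line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) {
        String firstLine = BOM + "float d, e;";
        // Without stripping, the first token is BOM + "float" and equals("float") fails.
        System.out.println(firstLine.startsWith("float"));
        System.out.println(stripBom(firstLine).startsWith("float"));
    }
}
```

In the question's fileRead method this would mean calling stripBom on the first line only, before handing it to parser. Note that trim() does not help here, because U+FEFF is not treated as whitespace by trim().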
I have been given this question for practice and am kind of stuck on how to complete it. It basically asks us to create a program which uses a BufferedReader object to read values (55, 96, 88, 32) from a txt file (say "s.txt") and then returns the smallest of them.
So far I have got two parts of the program but I'm not sure how to join them together.
import java.io.*;
class CalculateMin
{
public static void main(String[] args)
{
try {
BufferedReader br = new BufferedReader(new FileReader("grades.txt"));
int numberOfLines = 5;
String[] textInfo = new String[numberOfLines];
for (int i = 0; i < numberOfLines; i++) {
textInfo[i] = br.readLine();
}
br.close();
} catch (IOException ie) {
}
}
}
and then I have the loop which I made, but I'm not sure how to fit it into the program above. Eugh, I know I'm complicating things.
int[] numArray;
numArray = new int[Integer.parseInt(br.readLine())];
int smallestSoFar = numArray[0];
for (int i = 0; i < numArray.length; i++) {
if (numArray[i] < smallestSoFar) {
smallestSoFar = numArray[i];
}
}
Appreciate your help
Try this code. It iterates through the entire file, comparing the number on each line with the lowest number read so far:
public static void main(String[] args) {
try {
BufferedReader br = new BufferedReader(new FileReader("grades.txt"));
String line;
int lowestNumber = Integer.MAX_VALUE;
int number;
while ((line = br.readLine()) != null) {
try {
number = Integer.parseInt(line);
lowestNumber = number < lowestNumber ? number : lowestNumber;
} catch (NumberFormatException ex) {
// print the error saying that the line does not contain a number
}
}
br.close();
System.out.println("Lowest number is " + lowestNumber);
} catch (IOException ie) {
// print the exception
}
}
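The same loop can also be written with try-with-resources, so the reader is closed even if something goes wrong halfway. This is just a sketch under assumptions: MinValueSketch and smallest are invented names, the method accepts any Reader (pass new FileReader("grades.txt") for the real file), lines that are not whole numbers are skipped, and an empty OptionalInt is returned when no number was found.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.OptionalInt;

// Hypothetical helper class for illustration only.
class MinValueSketch {
    // Returns the smallest integer found, one number per line, or empty if there is none.
    static OptionalInt smallest(Reader source) {
        boolean found = false;
        int lowest = Integer.MAX_VALUE;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                try {
                    lowest = Math.min(lowest, Integer.parseInt(line.trim()));
                    found = true;
                } catch (NumberFormatException ex) {
                    // skip lines that do not contain a whole number
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found ? OptionalInt.of(lowest) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt result = smallest(new StringReader("55\n96\n88\n32\n"));
        System.out.println("Lowest number is " + result.getAsInt());
    }
}
```

Returning an OptionalInt avoids printing Integer.MAX_VALUE as the "lowest number" when the file is empty or contains no numbers.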