Unable to match the first word from a file using Java - java

I am trying to match words against an array to build a symbol table for lexical analysis (compiler lab). I am reading a C source file from Java. I can match everything in the file except the first word: no matter what I try, the first word never matches anything, even though it is a valid keyword.
In my file, the first line starts with int (a variable declaration) and the second line starts with float (another declaration). If I swap the two lines, my code matches int but no longer matches float.
Here is the file I am reading:
float d, e;
int a, b, c;
Here is the code to read from file:
public static void fileRead(String fileName)
{
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader(fileName));
try {
String x;
while ( (x = br.readLine()) != null )
{
// printing out each line in the file
System.out.println(x);
parser(x);
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
} catch (FileNotFoundException e) {
System.out.println(e);
e.printStackTrace();
}
}
parser is another method and it is used to parse out different words:
public static void parser(String line)
{
String text = "";
for(int i = 0; i < line.length(); i++)
{
String temp = line.charAt(i) + "";
if(!(temp.equals(" ")
|| temp.equals(",")
|| temp.equals(";")
|| temp.equals(")")
|| temp.equals("}")
|| temp.equals("(")
|| temp.equals("{")
|| temp.equals("[")
|| temp.equals("]")
))
{
text = text + temp;
}
else
{
text = text.trim();
if(text.equals("int"))
{
System.out.println("Say cheese");
}
addToarray(text);
text = "";
}
}
}
I thought there might be a space at the end, so I trimmed the text as well, as a backup.
This is how I add to the array. The comparison
if(item.equals(text))
never succeeds for "int", so execution never enters the if block:
public static void addToarray(String text)
{
boolean flag = false;
//look for keyWords first.
for (String item : keyWords)
{
if(item.equals(text))
{
if(resultKey.size() == 0)
{
System.out.println("Size zero> "+resultKey.size());
resultKey.add(text);
text = "";
flag = true;
break;
}
else
{
boolean checker = true;
for(String key : resultKey)
{
if(key.equals(text))
{
checker = false;
break;
}
}
if(checker)
{
resultKey.add(text);
flag = true;
text = "";
}
}
}
}
}
This is the array I am using to match:
final static String []keyWords = {"float", "if", "else",
"long", "double", "BigInteger","int"};
And these are the ArrayLists that store the results:
static ArrayList <String> resultKey, resultIdent , resultMath,
resultLogic, resultNumeric, resultOthers;
Thanks for your help.

Launching this simple app, it works for me; I don't know why you are unable to read the first word. EDIT: 100% it's the BOM at the start of your file, as @Fildor noticed.
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
public class Parser {
final static String[] keyWords = { "float", "if", "else", "long", "double", "BigInteger", "int" };
static ArrayList<String> resultKey = new ArrayList<>();
public static void main(String[] args) {
fileRead("src/test/resources/test.txt");
for (final String key : resultKey) {
System.out.println(key);
}
}
public static void fileRead(String fileName) {
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader(fileName));
try {
String x;
while ((x = br.readLine()) != null) {
// printing out each line in the file
System.out.println(x);
parser(x);
}
br.close();
} catch (final IOException e) {
e.printStackTrace();
}
} catch (final FileNotFoundException e) {
System.out.println(e);
e.printStackTrace();
}
}
public static void parser(String line) {
String text = "";
for (int i = 0; i < line.length(); i++) {
final String temp = line.charAt(i) + "";
if (!(temp.equals(" ") || temp.equals(",") || temp.equals(";") || temp.equals(")") || temp.equals("}")
|| temp.equals("(") || temp.equals("{") || temp.equals("[") || temp.equals("]"))) {
text = text + temp;
} else {
text = text.trim();
if (text.equals("int")) {
System.out.println("Say cheese");
}
addToarray(text);
text = "";
}
}
}
public static void addToarray(String text) {
boolean flag = false;
// look for keyWords first.
for (final String item : keyWords) {
if (item.equals(text)) {
if (resultKey.size() == 0) {
System.out.println("Size zero> " + resultKey.size());
resultKey.add(text);
text = "";
flag = true;
break;
} else {
boolean checker = true;
for (final String key : resultKey) {
if (key.equals(text)) {
checker = false;
break;
}
}
if (checker) {
resultKey.add(text);
flag = true;
text = "";
}
}
}
}
}
}
And the file test.txt contains exactly
float d, e;
int a, b, c;
Launching it prints
float d, e;
Size zero> 0
int a, b, c;
Say cheese
float
int

"int" is not matched, because your input file probably contains a Byte-Order-Mark.
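The BOM is decoded as extra characters glued to the front of the first word in the file, so the first token is never equal to the keyword, whichever line comes first, and trim() does not remove it.
You can check for it in code or with a hex editor. Most likely it will be one of 0xEFBBBF (UTF-8), 0xFEFF (UTF-16 big-endian) or 0xFFFE (UTF-16 little-endian), but there are more. I already referenced a W3C document on the topic in the comments. Here is the Wikipedia article with even more BOMs.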
Sidenote:
Which teacher hands out a "dirty" input file!? He must be some kind of sadist or (which would be even worse, imho) he did not do it on purpose. I would try to copy the (printable) content of the file to a new file and test this as input. So if the clean file works to your satisfaction, you can find some means of sanitizing input.
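One way to sanitize the input in code is sketched below. It assumes the file is decoded with the charset that matches its BOM, in which case the BOM arrives as the single character U+FEFF at the start of the first line. The class and method names are made up for illustration, and a StringReader stands in for the FileReader so the example runs without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class BomStripper {
    // U+FEFF is what a BOM decodes to when the reader uses the matching charset (assumption).
    static final char BOM = '\uFEFF';

    // Returns the line without a leading BOM character, or the line unchanged.
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) throws IOException {
        // Simulated file content: a BOM followed by the two declarations.
        String content = "\uFEFFint a, b, c;\nfloat d, e;";
        try (BufferedReader br = new BufferedReader(new StringReader(content))) {
            String first = br.readLine();
            System.out.println(first.startsWith("int"));           // prints false
            System.out.println(stripBom(first).startsWith("int")); // prints true
        }
    }
}
```

In fileRead() this would mean calling stripBom on the first line only, before handing it to parser(). If the file is read with a charset that does not match the BOM, the BOM shows up as different characters and this check will not catch it.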

Related

Write a program NumberCount that counts the numbers (including integers and floating point values) in one or more text files. (Due today lol)

INSTRUCTIONS:
Write a program NumberCount that counts the numbers (including integers and floating point values) in
one or more text files. Note that only numbers separated by whitespace characters are counted, i.e., only
those numbers that can be read by either readInt() or readDouble() are considered.
So I've been trying to get this program to read text files (the title is pretty much the assignment), but it will not read the text files I have in the project folder. I tried moving them several times, but wherever I put them they did not load. This is my code:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
public class NumberCount implements Runnable {
private static int combinedCount = 0;
public static synchronized int getCombinedCount() {
return combinedCount;
}
public static synchronized void setCombinedCount(int combinedCount) {
// combinedCount is a static field, so assign it through the class, not through this
NumberCount.combinedCount = combinedCount;
}
String filename;
NumberCount(String filename) {
this.filename = filename;
}
NumberCount() {
}
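@Override
public void run() {
String fileText = this.getTextFromFile(filename);
int count = countNumbers(fileText); // count once and reuse the result
System.out.println(filename + ": " + count);
// the read-modify-write must be one atomic step, or concurrent threads can lose updates
synchronized (NumberCount.class) {
setCombinedCount(getCombinedCount() + count);
}
}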
String getTextFromFile(String filename) {
try {
String data = "";
BufferedReader br = new BufferedReader(new FileReader(new File(filename)));
String st;
while ((st = br.readLine()) !=null) {
data += "\n" + st;
}
data = data.replaceAll("\n", " ");
return data;
} catch(Exception e) {
System.out.println("Unable to retrieve text from file: " + filename );
return "";
}
}
int countNumbers(String text) {
int count = 0;
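String words[] = text.split("\\s+"); // split on any run of whitespace
for (String word : words) {
try {
// Double.parseDouble accepts integers and floating point values alike;
// calling Integer.parseInt first rejected every non-integer number
Double.parseDouble(word);
count++;
} catch (NumberFormatException e) {
// not a number (or an empty token): skip it
}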
}
return count;
}
String helpMessage() {
String data = "Please call as NumberCount <list of file names>\n";
data += "File names to be present in the same directory";
return data;
}
public static void main(String[] args) {
if (args.length == 0) {
System.out.println(new NumberCount().helpMessage());
} else {
Thread threads[] = new Thread[args.length];
int i = 0;
for (String filename : args) {
NumberCount nc = new NumberCount(filename);
threads[i] = new Thread(nc);
threads[i++].start();
}
try {
for(Thread t : threads) {
t.join();
}
}catch(Exception e) {
e.printStackTrace();
}
System.out.println("combined count: " + getCombinedCount());
}
}
}
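A minimal sketch of the counting step on its own, assuming "a number" means any whitespace-separated token that Double.parseDouble accepts (which is slightly broader than plain decimal numbers: it also accepts tokens such as NaN or 1e3). The class name and the file name numbers.txt are made up for illustration. The last line prints where a relative file name is actually looked up, since Java resolves it against the working directory of the process, which is not necessarily the project folder.

```java
import java.io.File;

class NumberTokenCounter {
    // Counts whitespace-separated tokens that parse as a number.
    static int countNumbers(String text) {
        int count = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only when the whole text is blank
            }
            try {
                Double.parseDouble(token);
                count++;
            } catch (NumberFormatException e) {
                // not a number: ignore the token
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNumbers("12 abc 3.5\t-7\n1e3 x9")); // prints 4
        // Shows the absolute location a relative name resolves to (hypothetical file name).
        System.out.println(new File("numbers.txt").getAbsolutePath());
    }
}
```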

How to ignore duplicate strings when using RegEx to match string?

EDIT: edited for clarity about what I'm having trouble with. I'm not getting the right results because duplicates are being counted. I HAVE to use RegEx; I could use a tokenizer, but I did not.
What I am trying to do: there are 5 input files, and I need to count how many "USER DEFINED VARIABLES" each one contains. Please ignore the messy code, I'm just learning Java.
I replace: everything within ( and ), all non-word characters, any keywords and names such as int, main etc., any digit with a space in front of it, and any blank space with a newline; then I trim the result.
This leaves me with a list of strings that I match with my RegEx. At this point, how do I make my count include only unique identifiers?
EXAMPLE:
For example, in the input file I have attached beneath the code, I am receiving
"distinct/unique identifiers: 10" in my output file, when it should be "distinct/unique identifiers: 3"
And for example, in the 5th input file I have attached, I should have "distinct/unique identifiers: 3" instead I currently have "distinct/unique identifiers: 6"
I cannot use Set, Map etc.
Any help is great! Thanks.
import java.util.*;
import java.util.regex.*;
import java.io.*;
public class A1_123456789 {
public static void main(String[] args) throws IOException {
if (args.length < 1) {
System.out.println("Wrong number of arguments");
System.exit(1);
}
for (int i = 0; i < args.length; i++) {
FileReader jk = new FileReader(args[i]);
BufferedReader ij = new BufferedReader(jk);
FileWriter fw = null;
BufferedWriter bw = null;
String regex = "\\b(\\w+)(\\s+\\1\\b)+";
Pattern p = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");
String line;
int count = 0;
while ((line = ij.readLine()) != null) {
line = line.replaceAll("\\(([^\\)]+)\\)", " " );
line = line.replaceAll("[^\\w]", " ");
line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
line = line.replaceAll(" \\d", "");
line = line.replaceAll(" ", "\n");
line = line.trim();
Matcher m = p.matcher(line);
while (m.find()) {
count++;
}
}
try {
String s1 = args[i];
String s2 = s1.replaceAll("input","output");
fw = new FileWriter(s2);
bw = new BufferedWriter(fw);
bw.write("distinct/unique identifiers: " + count);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (bw != null) {
bw.close();
}
if (fw != null) {
fw.close(); // close fw here; calling bw.close() would throw if bw is null
}
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
}
}
//This is the 3rd input file below.
int celTofah(int cel)
{
int fah;
fah = 1.8*cel+32;
return fah;
}
int main()
{
int cel, fah;
cel = 25;
fah = celTofah(cel);
printf("Fah: %d", fah);
return 0;
}
//This is the 5th input file below.
int func2(int i)
{
while(i<10)
{
printf("%d\t%d\n", i, i*i);
i++;
}
}
int func1()
{
int i = 0;
func2(i);
}
int main()
{
func1();
return 0;
}
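Try this:
// declare once, before the while loop that reads lines, so entries persist across lines
LinkedList<String> dtaa = new LinkedList<>();
// inside the loop, after the replaceAll calls (the tokens are separated by "\n" at that point)
String[] parts = line.split("\\s+");
for (int ii = 0; ii < parts.length; ii++) {
// keep only tokens that look like identifiers and have not been seen yet
if (p.matcher(parts[ii]).matches() && !dtaa.contains(parts[ii]))
dtaa.add(parts[ii]);
}
// after the loop
count = dtaa.size();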
instead of
Matcher m = p.matcher(line);
while (m.find()) {
count++;
}
Amal Dev has suggested a correct implementation, but given the OP wants to keep Matcher, we have:
// Previous code to here
// Linked list of unique entries
LinkedList<String> uniqueMatches = new LinkedList<>();
// Existing code
while ((line = ij.readLine()) != null) {
line = line.replaceAll("\\(([^\\)]+)\\)", " " );
line = line.replaceAll("[^\\w]", " ");
line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
line = line.replaceAll(" \\d", "");
line = line.replaceAll(" ", "\n");
line = line.trim();
Matcher m = p.matcher(line);
while (m.find()) {
// New code - get this match
String thisMatch = m.group();
// If we haven't seen this string before, add it to the list
if(!uniqueMatches.contains(thisMatch))
uniqueMatches.add(thisMatch);
}
}
// Now see how many unique strings we have collected
count = uniqueMatches.size();
Note I haven't compiled this, but hopefully it works as is...

how to fix error: cannot find symbol

I feel as if I am missing something really simple but I can't find it.
The goal of this code is to take a Shakespeare file and use a hash map to find the number of times a word is given by the text as well as words of "n" characters long. However I can't even get to the debugging portion because I get the error
Bard.java:13: error: cannot find symbol
Pattern getout = Pattern.compile("[\\w']+"); //this will take only the words
^
symbol: class Pattern
location: class Bard
Bard.java:13: error: cannot find symbol
Pattern getout = Pattern.compile("[\\w']+"); //this will take only the words
plus a few more at other locations. Help would be greatly appreciated.
import java.io.*;
import java.util.*;
public class Bard {
public static void main(String[] args) {
HashMap < String, Integer > m1 = new HashMap < String, Integer > (); // sets the hashmap
//create file reader for the shakespere text
try (BufferedReader br = new BufferedReader(new FileReader("shakespeare.txt"))) {
String line = br.readLine();
Pattern getout = Pattern.compile("[\\w']+"); //this will take only the words
//create the hashmap
while (line != null) {
Matcher m = getout.matcher(line); //find the relevent information
while (m.find()) {
String word = m.group();
if (!word.toUpperCase().equals(word)) { // only count words that are not in all caps
// the first occurrence is stored as 1, later ones increment the old value
m1.merge(word, 1, Integer::sum);
}
}
line = br.readLine();
}
} catch (Exception e) {
e.printStackTrace();
}
try (BufferedReader br2 = new BufferedReader(new FileReader("input.txt"))) {
String line2 = br2.readLine();
FileWriter output = new FileWriter("analysis.txt");
while (line2 != null) {
if (line2.matches("[\\d\\s]+")) { // if i am dealing with the two integers
String[] parts = line2.trim().split("\\s+"); // split them up (the name args is already taken by main)
int wordSize = Integer.parseInt(parts[0]); // set the first one to the word size
int numberOfWords = Integer.parseInt(parts[1]); // set the other one to the number of words wanted
String[] wordsToReturn = new String[numberOfWords]; //create array to place the words
int i = 0;
for (String word: m1.keySet()) {
if (word.length() == wordSize && i < numberOfWords) {
wordsToReturn[i] = word;
i++;
}
}
for (int j = 0; j < i; j++) { // write only the slots that were filled
output.write(wordsToReturn[j] + "\n");
}
} else {
// write the count as text; write(int) would emit a single character instead
output.write(String.valueOf(m1.get(line2)) + "\n");
}
line2 = br2.readLine(); // advance inside the loop so that it terminates
}
output.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
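You have not imported the Pattern class (or Matcher, which lives in the same package). import java.util.*; does not cover subpackages such as java.util.regex, so import it with: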
import java.util.regex.*;
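For reference, here is a small self-contained sketch of the word-counting part with the regex imports in place. It is not the asker's program: the class name and the sample sentence are made up for illustration, and it uses HashMap.merge (Java 8 or later) instead of the get/put pair.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCounter {
    // Same pattern as in the question: runs of word characters and apostrophes.
    private static final Pattern WORD = Pattern.compile("[\\w']+");

    // Returns how many times each word occurs in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            // store 1 for a new word, otherwise add 1 to the existing count
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("to be, or not to be");
        System.out.println(counts.get("to")); // prints 2
        System.out.println(counts.get("or")); // prints 1
    }
}
```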

Get some text from a file with java that we know the character and the line

Does anyone have a trick for getting a piece of text from a file?
My goal is like this:
Suppose we need to read a Java file. We create a new Java program that takes as input a starting line and character and an ending line and character within that file. For example, the Java file looks like this:
package test;
public class FileDua {
public int method(int c, int d) {
while (d != 0) {
if (c > d) {
c = c - d;
} else {
d = d - c;
}
}
return c;
}
public int factorial(int n){
if(n == 0){
return 1;
}else{
return n * factorial(n-1);
}
}
}
With a BufferedReader or something like that, we give the input:
(4.3 - 10.3)
which means line 4 character 3 to line 10 character 3.
And then, voila, we get this:
public int method(int c, int d) {
while (d != 0) {
if (c > d) {
c = c - d;
} else {
d = d - c;
}
}
return c;
}
Or, as another example, with 1.3 - 3.3 we get this:
package test;
public class FileDua {
Thanks to everyone who helps. It is much appreciated.
Update
I am a newbie in Java, so I can only manage something like this:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
public class FinishingTouch {
public static void main(String[] args) {
String path = "src\\FIleDua.java";
String input = "";
try {
// read the whole file once; FileReader.toString() does not return the file path
input = new String(Files.readAllBytes(Paths.get(path)));
} catch (IOException ex) {
System.out.println("File: " + path + " could not be read!");
}
//print all of the contents
System.out.println(input);
//now time to get just the lines that I want
String[] lines = input.split("\r?\n");
}
}
What about something like this?
String input = "";
try {
input = new String(Files.readAllBytes(Paths.get(INPUT_PATH)));
} catch (IOException ex) {
System.out.println("File: " + INPUT_PATH + " not found!");
}
This will make input be a string containing all the text in your file. Then just extract the components of it that you want. You can separate your string into an array of strings by using the new line character as a separator so that each string component will be a single line in your file. Something like
String[] lines = input.split("\r?\n");
Then if you want the fourth through the tenth lines, just return lines[3] through lines[9]. I'm not sure what you mean by referencing character 3 in your examples. From what I can tell, you want to get back the entire line. I don't see where, when saying line 4 character 3, the character 3 comes into play at all.
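If the character position is meant as a column within the line, the idea above can be extended as in the following sketch. The interpretation is an assumption, not something the question confirms: lines and characters are taken as 1-based, and the end character is included. The class and method names are made up for illustration.

```java
class RangeExtractor {
    // Hypothetical helper: returns the text from (startLine, startChar) to (endLine, endChar).
    // Assumes 1-based positions with an inclusive end; columns past the end of a line are clamped.
    static String extract(String text, int startLine, int startChar, int endLine, int endChar) {
        String[] lines = text.split("\r?\n", -1);
        StringBuilder sb = new StringBuilder();
        for (int ln = startLine; ln <= endLine; ln++) {
            String line = lines[ln - 1];
            // first line starts at startChar, every other line at its beginning
            int from = (ln == startLine) ? Math.min(startChar - 1, line.length()) : 0;
            // last line stops at endChar, every other line runs to its end
            int to = (ln == endLine) ? Math.min(endChar, line.length()) : line.length();
            sb.append(line, from, Math.max(from, to));
            if (ln < endLine) {
                sb.append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "package test;\npublic class FileDua {\n}";
        // line 1 character 9 to line 2 character 6: prints "test;" and then "public"
        System.out.println(extract(text, 1, 9, 2, 6));
    }
}
```

Line numbers outside the file are not checked here; a real program would validate them before indexing into the array.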

How can I fix my code so it can input letters [duplicate]

This question already has answers here:
How to insist that a users input is an int?
(7 answers)
Closed 9 years ago.
So I'm just learning Java, and I know this issue is very basic; this is from the book Head First Java. When I enter a letter instead of a number, it crashes. How do I fix that? I want it to say "please try again with a number" when a letter is entered.
public class Game {
public static void main(String[] args)
{
int numOfGuesses = 0;
GameHelper helper = new GameHelper();
SimpleDotCom theDotCom = new SimpleDotCom();
int randomNum = (int) (Math.random() * 5);
int[] locations = {randomNum, randomNum+1, randomNum+2};
theDotCom.setLocationCells(locations);
boolean isAlive = true;
while (isAlive == true)
{
String guess = helper.getUserInput("enter a number");
String result = theDotCom.checkYourself(guess);
numOfGuesses++;
if (result.equals("kill")) {
isAlive = false;
System.out.println("You took " + numOfGuesses + " guesses");
}
}
}
}
public class GameHelper {
private static final String alphabet = "abcdefg";
private int gridLength = 7;
private int gridSize = 49;
private int [] grid = new int[gridSize];
private int comCount = 0;
public String getUserInput(String prompt) {
String inputLine = null;
System.out.print(prompt + " ");
try {
BufferedReader is = new BufferedReader(
new InputStreamReader(System.in));
inputLine = is.readLine();
if (inputLine.length() == 0 ) return null;
} catch (IOException e) {
System.out.println("IOException: " + e);
}
return inputLine.toLowerCase();
}
}
public class SimpleDotCom {
int[] locationCells;
int numOfHits = 0;
public void setLocationCells(int[] locs)
{
locationCells = locs;
}
public String checkYourself(String stringGuess) {
int guess = Integer.parseInt(stringGuess);
String result = "miss";
for (int cell: locationCells)
{
if (guess == cell) {
result = "hit";
numOfHits++;
break;
}
}
if (numOfHits == locationCells.length)
{
result = "kill";
}
System.out.println(result);
return result;
}
}
In the following -
int guess = Integer.parseInt(stringGuess);
the parsing succeeds only if stringGuess contains an integer within the range [-2147483648, 2147483647]. Otherwise, it fails with a NumberFormatException.
To avoid that you have to make sure that stringGuess contains the right value.
Following is where the value comes from -
String guess = helper.getUserInput("enter a number");
String result = theDotCom.checkYourself(guess);
It's the getUserInput() method -
public String getUserInput(String prompt) {
String inputLine = null;
System.out.print(prompt + " ");
try {
BufferedReader is = new BufferedReader(new InputStreamReader(System.in));
inputLine = is.readLine();
if (inputLine.length() == 0)
return null; // this cannot be parsed
} catch (IOException e) {
System.out.println("IOException: " + e);
}
return inputLine.toLowerCase(); //this might not be an integer
}
And that's the part that you need to fix.
Following should do the job -
//...
BufferedReader is = new BufferedReader(new InputStreamReader(System.in));
while (true) { //keep reading
try {
inputLine = is.readLine();
int num = Integer.parseInt(inputLine); //make sure it's an integer
if(num > -1 && num < 10) { // if it is, and within [0-9]
break; // stop reading
}
System.out.println("please try again with a number within [0-9]"); // an integer, but out of range
} catch (Exception e) { // if not prompt again
System.out.println("please try again with a number within [0-9]");
}
}
return inputLine; // no toLowerCase() needed, it's a number
You can improve this further by, say, returning an int from this method instead of a String.
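A minimal sketch of that last suggestion, with the method returning an int directly. The class name, the method name, and the min/max parameters are made up for illustration; in the game, the reader would wrap System.in, whereas here a StringReader simulates the user's input so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class IntPrompt {
    // Hypothetical helper: keeps reading lines until one parses as an int within [min, max].
    static int readIntInRange(BufferedReader in, int min, int max) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                try {
                    int value = Integer.parseInt(line.trim());
                    if (value >= min && value <= max) {
                        return value;
                    }
                } catch (NumberFormatException e) {
                    // not an integer: fall through to the retry message
                }
                System.out.println("please try again with a number within [" + min + "-" + max + "]");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // readLine() returned null: the input ended without a valid number
        throw new IllegalStateException("input ended before a valid number was entered");
    }

    public static void main(String[] args) {
        // Simulated user input: a letter, an out-of-range number, then a valid guess.
        BufferedReader in = new BufferedReader(new StringReader("x\n42\n7\n"));
        System.out.println(readIntInRange(in, 0, 9)); // prints two retry messages, then 7
    }
}
```

With this shape, checkYourself could take an int parameter and would no longer need Integer.parseInt at all.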
If you don't know if stringGuess is an integer or not, you can put Integer.parseInt(stringGuess) in a try { } catch construct. parseInt throws an exception if its input cannot be turned into an integer, so catch it. In the catch block, we know that it was not an integer. Otherwise it was an integer. Now do the logic you want to do (displaying a message, choosing to loop or not, etc)
(If you have not yet done exception handling, look up try and catch in Java)
As suggested by @patashu, you can use try { } catch () { },
because Integer.parseInt(argument) throws a NumberFormatException if the argument is not a number (a number in the form of a string).
To ask for input again when the user enters a letter, you can simply call your input method inside the catch block, like:
try{
int guess = Integer.parseInt(stringGuess);
// ... rest of the logic that uses guess ...
}
catch(NumberFormatException e){
System.out.println("Oooppps letter entered - try again with number ");
// now make a call here to your method that takes input, i.e. getUserInput() in your case
}
