What I'm looking to do here is process a log file, in my case squid's access.log. I want my program to look at the first 'word' on each line, which is the Unix timestamp of when the URL was accessed. In another part of the program, I designed a time class that records, as a Unix time, when the program was last run, and I want to compare that time to the first word of each line.
My initial thinking is to read the file into an array and then, based on the first word of each line, remove the lines I don't want from that array and put them into another array.
Here's what I've got so far. I'm pretty sure that I'm close, but this is the first time that I've done file processing, so I don't exactly know what I'm doing here.
private void readFile(File file) throws FileNotFoundException, IOException{
String[] lines = new String[getLineCount(file)];
Long unixTime = time.getUnixLastRun();
String[] removedTime = new String[getLineCount(file)];
try(BufferedReader br = new BufferedReader(new FileReader(file))) {
int i = 0;
for(String line; (line = br.readLine()) != null; i++) {
lines[i] = line;
}
}
for(String arr: lines){
System.out.println(arr);
}
}
private void readFile(File file) {
List<String> lines = new ArrayList<String>();
List<String> firstWord = new ArrayList<String>();
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String sCurrentLine;
while ((sCurrentLine = br.readLine()) != null) {
// Adds the entire first line
lines.add(sCurrentLine);
// Adds the first word
firstWord.add(sCurrentLine.split(" ")[0]);
}
} catch (IOException e) {
e.printStackTrace();
}
}
If you want, you can keep using your arrays:
private void readFile(File file) throws FileNotFoundException, IOException {
String[] lines = new String[getLineCount(file)];
Long unixTime = time.getUnixLastRun();
String[] removedTime = new String[getLineCount(file)];
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
int i = 0;
for (String line; (line = br.readLine()) != null; i++) {
lines[i] = line;
}
}
ArrayList<String> logsToBeUsed = new ArrayList<String>();
for (String arr : lines) {
//Gets the first word from the line and compares it with the current unix time, if it is >= unix time
//then we add it to the list of Strings to be used
try{
if(Long.parseLong(getFirstWord(arr)) >= unixTime){
logsToBeUsed.add(arr);
}
}catch(NumberFormatException nfe){
//Means the first word was not a number, do something here
}
}
}
private String getFirstWord(String text) {
if (text.indexOf(' ') > -1) {
return text.substring(0, text.indexOf(' '));
} else {
return text;
}
}
This is the answer based on the code you posted, but it can be done more efficiently: if you use an ArrayList to store the lines from the file, you don't need to call getLineCount(file) first, so you avoid opening the file twice. Also, in the for loop you are declaring the String object again and again.
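For example, a single-pass version could look like the sketch below; it reuses your time field and the getFirstWord helper above (plus the java.io and java.util imports you already have), and returns the kept lines instead of discarding them:
private List<String> readAndFilter(File file) throws IOException {
    long unixTime = time.getUnixLastRun();
    List<String> logsToBeUsed = new ArrayList<>();
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        for (String line; (line = br.readLine()) != null; ) {
            try {
                // keep the line if its first word is a Unix time at or after the last run
                if (Long.parseLong(getFirstWord(line)) >= unixTime) {
                    logsToBeUsed.add(line);
                }
            } catch (NumberFormatException nfe) {
                // the first word was not a number; skip this line
            }
        }
    }
    return logsToBeUsed;
}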
Related
I have a text file with a list of thousands of Strings (3272 of them) and I want to put each one into a slot of an array so that they can be sorted. I have the sorting part done; I just need help putting each line into the array. This is what I have tried, but it only prints the last item from the text file.
public static void main(String[] args) throws IOException
{
FileReader fileText = new FileReader("test.txt");
BufferedReader scan = new BufferedReader (fileText);
String line;
String[] word = new String[3272];
Comparator<String> com = new ComImpl();
while((line = scan.readLine()) != null)
{
for(int i = 0; i < word.length; i++)
{
word[i] = line;
}
}
Arrays.parallelSort(word, com);
for(String i: word)
{
System.out.println(i);
}
}
Each time you read a line, you assign it to all of the elements of word. This is why word only ends up with the last line of the file.
Replace the while loop with the following code.
int next = 0;
while ((line = scan.readLine()) != null) word[next++] = line;
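If the file won't always have exactly 3272 lines, a List avoids hard-coding the size. Here is a sketch that would replace the String[] word declaration and the while loop, reusing line, scan, and com from the code above (it needs the java.util.List and ArrayList imports):
// collect every line first, then convert to an array for Arrays.parallelSort
List<String> collected = new ArrayList<>();
while ((line = scan.readLine()) != null) {
    collected.add(line);
}
String[] word = collected.toArray(new String[0]);
Arrays.parallelSort(word, com);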
Try this.
Files.readAllLines(Paths.get("test.txt"))
     .parallelStream()
     .sorted(new ComImpl())
     .forEachOrdered(System.out::println); // forEachOrdered keeps the sorted order even on a parallel stream
I'm just trying to do an exercise where I have to read a particular file called test.txt in the following format:
Sampletest 4
What I want to do is store the text part in one variable and the number in another. I am still a beginner, so I had to google quite a bit to find something that would at least work. Here is what I've got so far.
public static void main(String[] args) throws Exception{
try {
FileReader fr = new FileReader("test.txt");
BufferedReader br = new BufferedReader(fr);
String str;
while((str = br.readLine()) != null) {
System.out.println(str);
}
br.close();
} catch(IOException e) {
System.out.println("File not found");
}
Use a Scanner, which makes reading your file way easier than DIY code:
try (Scanner scanner = new Scanner(new FileInputStream("test.txt"))) {
    while (scanner.hasNextLine()) {
        String name = scanner.next();      // the text part
        int number = scanner.nextInt();    // the number part
        scanner.nextLine();                // clears the rest of the line (and the newline) from the buffer
        System.out.println(name + " and " + number);
    }
} catch (IOException e) {
    System.out.println("File not found");
}
Note the use of the try-with-resources syntax, which closes the scanner automatically when the try is exited, usable because Scanner implements Closeable.
You just need:
String[] parts = str.split(" ");
Then parts[0] is the text ("Sampletest") and parts[1] is the number (still the String "4"; convert it with Integer.parseInt if you need an int).
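For example, dropping the split into your read loop could look like this; just a sketch, assuming each line really is a word, a space, then a number:
try (BufferedReader br = new BufferedReader(new FileReader("test.txt"))) {
    String str;
    while ((str = br.readLine()) != null) {
        String[] parts = str.split(" ");
        String text = parts[0];                   // e.g. "Sampletest"
        int number = Integer.parseInt(parts[1]);  // e.g. 4
        System.out.println(text + " and " + number);
    }
} catch (IOException e) {
    System.out.println("File not found");
}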
It seems like you are reading the whole content of the test.txt file line by line, so you need two separate List objects to store the numeric and non-numeric lines, as shown below:
String str;
List<Integer> numericValues = new ArrayList<>();     // stores the numeric lines
List<String> nonNumericValues = new ArrayList<>();   // stores the non-numeric lines
while ((str = br.readLine()) != null) {
    if (str.matches("\\d+")) {                        // the line consists only of digits
        numericValues.add(Integer.valueOf(str));      // parse it and store it in the numeric list
    } else {
        nonNumericValues.add(str);                    // store it in the non-numeric list
    }
}
If you are sure every line in the file always follows that format:
List<Integer> intvalues = new ArrayList<>();
List<String> charvalues = new ArrayList<>();
try (BufferedReader br = new BufferedReader(new FileReader("test.txt"))) {
    String str;
    while ((str = br.readLine()) != null) {
        String[] parts = str.split(" ");
        charvalues.add(parts[0]);                  // the text part, e.g. "Sampletest"
        intvalues.add(Integer.parseInt(parts[1])); // the number part, e.g. 4
    }
} catch (IOException ioer) {
    ioer.printStackTrace();
}
You can use the java.nio.file utility Files#lines().
Then you can do something like this: use String#split() to parse each line with a regular expression; in this example I split on a comma.
public static void main(String[] args) throws IOException {
try (Stream<String> lines = Files.lines(Paths.get("yourPath"))) {
lines.map(Representation::new).forEach(System.out::println);
}
}
static class Representation{
final String stringPart;
final Integer intPart;
Representation(String line){
String[] splitted = line.split(",");
this.stringPart = splitted[0];
this.intPart = Integer.parseInt(splitted[1]);
}
}
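If you also want the printed output to show the parsed values rather than the default object representation, Representation could override toString, for example:
@Override
public String toString() {
    return stringPart + " and " + intPart;
}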
I am working on creating an inverted index for a list of words in Java. Basically, it creates a list for each word containing the index of each document the word appears in, together with the word's frequency in that document. The desired output should look like this:
[word1:[FileNo:frequency],[FileNo:frequency],[FileNo:frequency],word2:[FileNo:frequency],[FileNo:frequency]...etc]
Here is the code:
package assigenment2;
import java.io.*;
import java.util.*;
public class invertedIndex {
public static Map<String, Map<Integer,Integer>> wordTodocumentMap;
public static BufferedReader buffer;
public static BufferedReader br;
public static BufferedReader reader;
public static List<String> files = new ArrayList<String>();
public static List<String>[] tokens;
public static void main(String[] args) throws IOException {
//read the token file and store the token in list
String tokensPath="/Users/Manal/Documents/workspace/Information Retrieval/tokens.txt";
int k=0;
String[] tokens = new String[8500];
String sCurrentLine;
try
{
FileReader fr=new FileReader(tokensPath);
BufferedReader br= new BufferedReader(fr);
while ((sCurrentLine = br.readLine()) != null)
{
tokens[k]=sCurrentLine;
k++;
}
System.out.println("the number of token are:"+k+" words");
br.close();
}
catch(Exception ex)
{System.out.println(ex);}
Up to there it works correctly; I believe the problem is in the manipulation of the nested map in the following part:
TreeMap<Integer,Integer> documentToCount = new TreeMap<Integer,Integer>();
//read files
System.out.print("Enter the path of files you want to process:\n");
Scanner InputPath = new Scanner(System.in);
String cranfield = InputPath.nextLine();
File cranfieldFiles = new File(cranfield);
for (File file: cranfieldFiles.listFiles())
{
int fileno = files.indexOf(file.getPath());
if (fileno == -1) //the current file isn't in the files list \
{
files.add(file.getPath());// add file to the files list
fileno = files.size() - 1;//the index of file will start from 0 to size-1
}
int frequency = 0;
BufferedReader reader = new BufferedReader(new FileReader(file));
for (String line = reader.readLine(); line != null; line = reader.readLine())
{
for (String _word : line.split(" "))
{
String word = _word.toLowerCase();
if (Arrays.asList(tokens).contains(word))
if (wordTodocumentMap.get(word) == null)//check whether word is new word
{
documentToCount = new TreeMap<Integer,Integer>();
wordTodocumentMap.put(word, documentToCount);
}
documentToCount.put(fileno, frequency+1);//add the location and frequency
}
}
}
reader.close();
}
}
The error I get is:
Exception in thread "main" java.lang.NullPointerException
at assigenment2.invertedIndex.main(invertedIndex.java:65)
You’re never instantiating wordTodocumentMap, so it remains null throughout. Therefore the line if (wordTodocumentMap.get(word) == null) throws a NullPointerException as soon as .get() is called, before there is anything to compare to null. One possible solution is to instantiate the map in its declaration:
public static Map<String, Map<Integer,Integer>> wordTodocumentMap = new HashMap<>();
There may be other problems in your code, but this should get you a step further.
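For example, once the map is initialized, the if block and the put call inside the inner loop could be replaced with the Java 8 map methods below. This is only a sketch, but it also makes the per-document count actually increment instead of always storing 1:
// get (or create) this word's document map, then add one to this file's count
wordTodocumentMap
        .computeIfAbsent(word, w -> new TreeMap<>())
        .merge(fileno, 1, Integer::sum);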
I am new to Java and it has all been self-taught. I enjoy working with the code and it is just a hobby, so, I don't have any formal education on the topic.
I am at the point now where I am learning to read from a text file. The code that I have been given isn't correct. It works when I hardcode the exact number of lines but if I use a "for" loop to sense how many lines, it doesn't work.
I have altered it a bit from what I was given. Here is where I am now:
This is my main class
package textfiles;
import java.io.IOException;
public class FileData {
public static void main(String[] args) throws IOException {
String file_name = "C:/Users/Desktop/test.txt";
ReadFile file = new ReadFile(file_name);
String[] aryLines = file.OpenFile();
int nLines = file.readLines();
int i = 0;
for (i = 0; i < nLines; i++) {
System.out.println(aryLines[i]);
}
}
}
This is my class that will read the text file and sense the number of lines
package textfiles;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class ReadFile {
private String path;
public ReadFile(String file_path) {
path = file_path;
}
int readLines() throws IOException {
FileReader file_to_read = new FileReader(path);
BufferedReader bf = new BufferedReader(file_to_read);
int numberOfLines = 0;
String aLine;
while ((aLine = bf.readLine()) != null) {
numberOfLines++;
}
bf.close();
return numberOfLines;
}
public String[] OpenFile() throws IOException {
FileReader fr = new FileReader(path);
BufferedReader textReader = new BufferedReader(fr);
int numberOfLines = 0;
String[] textData = new String[numberOfLines];
int i;
for (i = 0; i < numberOfLines; i++) {
textData[i] = textReader.readLine();
}
textReader.close();
return textData;
}
}
Please, keep in mind that I am self-taught; I may not indent correctly or I may make simple mistakes but don't be rude. Can someone look this over and see why it is not sensing the number of lines (int numberOfLines) and why it won't work unless I hardcode the number of lines in the readLines() method.
The problem is that you set the number of lines to read to zero with int numberOfLines = 0;
I'd suggest using a list for the lines instead and then converting it to an array (this needs the java.util.List and java.util.ArrayList imports):
public String[] OpenFile() throws IOException {
    FileReader fr = new FileReader(path);
    BufferedReader textReader = new BufferedReader(fr);
    // we don't know in advance how many lines the file will have
    List<String> textData = new ArrayList<String>();
    // this part works just like the loop in readLines()
    String aLine;
    while ((aLine = textReader.readLine()) != null) {
        textData.add(aLine); // add the line to the list
    }
    textReader.close();
    return textData.toArray(new String[textData.size()]); // convert to an array and return
}
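With that version, the main method in FileData no longer needs readLines() at all; a sketch, reusing the file_name variable from your main:
ReadFile file = new ReadFile(file_name);
String[] aryLines = file.OpenFile();
for (String line : aryLines) {
    System.out.println(line);
}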
int numberOfLines = 0;
String[] textData = new String[numberOfLines];
textData is an empty array, so the following for loop won't do anything.
Note also that this is not the best way to read a file line by line. Here is a proper example of how to get the lines from a text file:
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
ArrayList<String> list = new ArrayList<String>();
while ((line = br.readLine()) != null) {
list.add(line);
}
br.close();
I also suggest that you read tutorials on object oriented concepts.
This is a class that I wrote a while back that I think you may find helpful.
import java.io.*;
import javax.swing.JFileChooser;

public class FileIO {
static public String getContents(File aFile) {
StringBuilder contents = new StringBuilder();
try {
//use buffering, reading one line at a time
//FileReader always assumes default encoding is OK!
BufferedReader input = new BufferedReader(new FileReader(aFile));
try {
String line = null; //not declared within while loop
/*
* readLine is a bit quirky :
* it returns the content of a line MINUS the newline.
* it returns null only for the END of the stream.
* it returns an empty String if two newlines appear in a row.
*/
while ((line = input.readLine()) != null) {
contents.append(line);
contents.append(System.getProperty("line.separator"));
}
} finally {
input.close();
}
} catch (IOException ex) {
}
return contents.toString();
}
static public File OpenFile()
{
return (FileIO.FileDialog("Open"));
}
static private File FileDialog(String buttonText)
{
String defaultDirectory = System.getProperty("user.dir");
final JFileChooser jfc = new JFileChooser(defaultDirectory);
jfc.setMultiSelectionEnabled(false);
jfc.setApproveButtonText(buttonText);
if (jfc.showOpenDialog(jfc) != JFileChooser.APPROVE_OPTION)
{
return (null);
}
File file = jfc.getSelectedFile();
return (file);
}
}
It is used:
File file = FileIO.OpenFile();
It is designed specifically for reading in files and nothing else, so it can hopefully be a useful example to study as you learn.
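A small usage sketch, assuming the user picks a readable text file in the dialog:
File file = FileIO.OpenFile();              // shows the chooser; returns null if the user cancels
if (file != null) {
    String contents = FileIO.getContents(file);
    System.out.println(contents);
}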
I have a method that takes data from a .csv file and puts it into an array backwards (the first row of the file goes in the last array slot). However, I would like the first row of the .csv file not to end up in the array at all. How would I accomplish this? Here is my code thus far:
public static String[][] parse(String symbol) throws Exception{
String destination = "C:/"+symbol+"_table.csv";
LineNumberReader lnr = new LineNumberReader(new FileReader(new File(destination)));
lnr.skip(Long.MAX_VALUE);
String[][] stock_array = new String[lnr.getLineNumber()][3];
try{
BufferedReader br = new BufferedReader(new FileReader(destination));
String strLine = "";
StringTokenizer st = null;
int line = lnr.getLineNumber()-1;
while((strLine = br.readLine()) != null){
st = new StringTokenizer(strLine, ",");
while(st.hasMoreTokens()){
stock_array[line][0] = st.nextToken();
st.nextToken();
stock_array[line][1] = st.nextToken();
stock_array[line][2] = st.nextToken();
st.nextToken();
st.nextToken();
st.nextToken();
}
line--;
}
}
catch(Exception e){
System.out.println("Error while reading csv file: " + e);
}
return stock_array;
}
You can skip the first line by just reading it in and doing nothing. Do this just before your while loop:
br.readLine();
To make sure that your array is the right size and lines get stored in the right places, you should also make these changes:
String[][] stock_array = new String[lnr.getLineNumber()-1][3];
...
int line = lnr.getLineNumber()-2;
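Put together, the setup before your while loop would then look roughly like this (just a sketch of the same changes in context):
String[][] stock_array = new String[lnr.getLineNumber() - 1][3]; // one fewer row, since the header is dropped
BufferedReader br = new BufferedReader(new FileReader(destination));
br.readLine();                                                   // read and discard the header line
int line = lnr.getLineNumber() - 2;                              // last index of the smaller array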
Your code is not very efficient. You are using LineNumberReader.skip(Long.MAX_VALUE), which is not a reliable way to find the line count of a file, and StringTokenizer is a legacy way of splitting tokens; String.split is preferred now. I would code it the following way:
public static List<String[]> parse(String symbol) throws Exception {
    String destination = "C:/" + symbol + "_table.csv";
    List<String[]> lines = new ArrayList<String[]>();
    try (BufferedReader br = new BufferedReader(new FileReader(destination))) {
        String line;
        int index = 0;
        while ((line = br.readLine()) != null) {
            if (index == 0) {
                index++;
                continue; // skip the first (header) line
            }
            lines.add(line.split(","));
        }
        if (!lines.isEmpty()) {
            Collections.reverse(lines); // the first data row ends up last, as before
        }
    } catch (IOException ioe) {
        // IOException handling
    } catch (Exception e) {
        // Exception handling
    }
    return lines;
}