How to calculate total hits - java

I am parsing several log files and searching for a particular String in them.
I look through each line; once I find the string, I create a Map with a label as the key and the extracted text as the value.
Like Map result = new HashMap(); result.put("Report Page", line.substring(60));
I then add these Maps to a list, iterate through the list, and display my table.
What I want is to output the number of times the string occurred in the files.
Desired output :
Name Value Occurrences
...
...
...
Could someone please help?
(Note :This is not a homework project.)
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
String line;
while ((line = reader.readLine()) != null) {
    Map<String, String> result = new HashMap<String, String>();
    if (line.contains("Parm Name/Value:REPORT_PAGE")) {
        result.put("Report Page", line.substring(60));
    }
    rows.add(result);
}

The question is a bit unclear; I hope I got you right.
You're currently mapping some string (whose meaning I don't understand) to the substring itself.
It also seems, for some reason, that you create a map for each line.
Are you sure that's what you want to do?
Anyway, what I think you want to do is create a hash map which maps strings to integers.
Please paste more complete code...
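In the meantime, here is a minimal sketch of that counting idea. The file name is a placeholder, and the marker string and column offset are taken from your snippet, so treat them as assumptions:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class HitCounter {
    public static void main(String[] args) throws IOException {
        // maps each extracted value to the number of times it was seen
        Map<String, Integer> counts = new HashMap<String, Integer>();
        BufferedReader reader = new BufferedReader(new FileReader("server.log")); // placeholder file name
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains("Parm Name/Value:REPORT_PAGE")) {
                String value = line.substring(60); // same column offset as in your snippet
                Integer seen = counts.get(value);
                counts.put(value, seen == null ? 1 : seen + 1);
            }
        }
        reader.close();
        // one row per distinct value: Name, Value, Occurrences
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println("Report Page\t" + e.getKey() + "\t" + e.getValue());
        }
    }
}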

Check Guava's Multiset: http://guava-libraries.googlecode.com/svn/tags/release09/javadoc/index.html. A Multiset keeps a count for each distinct element, which is exactly your use case.
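For example, a rough sketch (assuming Guava is on your classpath; the page values are placeholders):

import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;

public class MultisetExample {
    public static void main(String[] args) {
        Multiset<String> pageHits = HashMultiset.create();
        // add one entry per matching log line (placeholder values here)
        pageHits.add("page1");
        pageHits.add("page1");
        pageHits.add("page2");
        // entrySet() gives one entry per distinct element, with its count
        for (Multiset.Entry<String> e : pageHits.entrySet()) {
            System.out.println(e.getElement() + " occurred " + e.getCount() + " times");
        }
    }
}

This saves you from writing the get-or-increment bookkeeping by hand.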

Related

Removing duplicate lines from a text file

I have a text file that is sorted alphabetically, with around 94,000 lines of names (one name per line, text only, no punctuation).
Example:
Alice
Bob
Simon
Simon
Tom
Each line takes the same form, first letter is capitalized, no accented letters.
My code:
try {
    BufferedReader br = new BufferedReader(new FileReader("orderedNames.txt"));
    PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("sortedNoDuplicateNames.txt", true)));
    ArrayList<String> textToTransfer = new ArrayList<String>();
    String previousLine = "";
    String current = "";
    // Load first line into previous line
    previousLine = br.readLine();
    // Add first line to the transfer list
    textToTransfer.add(previousLine);
    while ((current = br.readLine()) != previousLine && current != null) {
        textToTransfer.add(current);
        previousLine = current;
    }
    int index = 0;
    for (int i = 0; i < textToTransfer.size(); i++) {
        out.println(textToTransfer.get(i));
        System.out.println(textToTransfer.get(i));
        index++;
    }
    System.out.println(index);
} catch (Exception e) {
    e.printStackTrace();
}
From what I understand, the first line of the file is read and loaded into the previousLine variable as intended, and current is set to the second line of the file. current is then compared against the previous line and against null; if it is not the same as the last line and not null, we add it to the array-list.
previousLine is then set to current's value, so the next readLine() can replace the current value of current and the while loop keeps comparing.
I cannot see what is wrong with this.
If a duplicate is found, surely the loop should break?
Sorry in advance if it turns out to be something stupid.
Use a TreeSet instead of an ArrayList.
Set<String> textToTransfer = new TreeSet<>();
The TreeSet is sorted and does not allow duplicates.
Don't reinvent the wheel!
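A sketch of the whole read/write cycle with a TreeSet, reusing the file names from your code:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Set;
import java.util.TreeSet;

public class Dedup {
    public static void main(String[] args) throws IOException {
        Set<String> names = new TreeSet<String>();
        BufferedReader br = new BufferedReader(new FileReader("orderedNames.txt"));
        String line;
        while ((line = br.readLine()) != null) {
            names.add(line); // add() silently ignores duplicates
        }
        br.close();
        PrintWriter out = new PrintWriter("sortedNoDuplicateNames.txt");
        for (String name : names) { // a TreeSet iterates in sorted order
            out.println(name);
        }
        out.close();
        System.out.println(names.size() + " unique names written");
    }
}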
If you don't want duplicates, you should consider using a Collection that doesn't allow duplicates. The easiest way to remove repeated elements is to add the contents to a Set, which will not allow duplicates:
import java.util.*;
import java.util.stream.*;

public class RemoveDups {
    public static void main(String[] args) {
        Set<String> dist = Arrays.stream(args).collect(Collectors.toSet());
        dist.forEach(System.out::println); // each element is kept only once
    }
}
Another way is to remove duplicates from text file before reading the file by the Java code, in Linux for example (far quicker than do it in Java code):
sort myFileWithDuplicates.txt | uniq > myFileWithoutDuplicates.txt
(Note that uniq -u would drop every line that has a duplicate entirely, rather than keeping one copy of it.)
While, like the others, I recommend using a collection object that does not allow repeated entries, I think I can identify what is wrong with your function. The way you are comparing strings (which is what you are trying to do, of course) in your while loop is incorrect in Java. The == operator (and its counterpart !=) determines whether two references point to the same object, which is not the same as determining whether their values are equal. Luckily, Java's String class provides a comparison method, equals(). You may want something like this:
while ((current = br.readLine()) != null && !current.equals(previousLine)) {
Keep in mind that ending your while loop here will force your file reading to stop at the first duplicate, which may or may not be what you intended.
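If the goal is to skip duplicates rather than stop at the first one, a sketch of the loop (assuming the usual java.io and java.util imports, and that the input really is sorted so duplicates are adjacent):

static List<String> readDistinctSorted(BufferedReader br) throws IOException {
    List<String> textToTransfer = new ArrayList<String>();
    String previousLine = null;
    String current;
    while ((current = br.readLine()) != null) { // null check first avoids a NullPointerException at end of file
        if (!current.equals(previousLine)) {    // skip consecutive duplicates instead of breaking
            textToTransfer.add(current);
        }
        previousLine = current;
    }
    return textToTransfer;
}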

How to search for name in file and extract value

I have a file that looks like this:
Dwarf remains:0
Toolkit:1
Cannonball:2
Nulodion's notes:3
Ammo mould:4
Instruction manual:5
Cannon base:6
Cannon base noted:7
Cannon stand:8
Cannon stand noted:9
Cannon barrels:10
...
What is the easiest way to open this file, search for name and return the value of the field? I cannot use any external libraries.
What I have tried is below; is this OK?
public String item(String name) throws IOException {
    String line;
    FileReader in = new FileReader("C:/test.txt");
    BufferedReader br = new BufferedReader(in);
    while ((line = br.readLine()) != null) {
        if (line.contains(name)) {
            String[] parts = line.split(":");
            return parts[1];
        }
    }
    return null;
}
As a follow-up to your code: it compiles and works OK. Be aware, though, that the native path separator on Windows is \ rather than / (although Java's file APIs generally accept / on Windows too). You could have created the path portably using, for example, Paths.get("C:", "test.txt").toString(). The platform separator is also available as File.separator.
The task can be easily achieved using basic Java capabilities. Firstly, you need to open the file and read its lines, which can be done with Files.lines(Paths.get("path/to/file")). Secondly, you need to iterate through all the lines returned by that call. If you do not know the Stream API, you can convert the value returned by Files.lines(...) from a Stream to an array using String[] lines = Files.lines(Paths.get("path/to/file")).toArray(a -> new String[a]);. Now the lines variable holds all the lines from the input file.
You then have to split each line into two parts (String.split) and check whether the first part equals (String.equals) what you're looking for. If it does, simply return the second one.
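A sketch of that stream-based version (assuming Java 8+; the path is the one from your question):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public static String item(String name) throws IOException {
    // try-with-resources closes the underlying file when the stream is done
    try (Stream<String> lines = Files.lines(Paths.get("C:/test.txt"))) {
        return lines
                .map(line -> line.split(":", 2))                             // split into name and value
                .filter(parts -> parts.length == 2 && parts[0].equals(name)) // exact name match, not contains()
                .map(parts -> parts[1])
                .findFirst()
                .orElse(null);
    }
}

Using equals() on the first part also fixes a subtle issue in your version: contains() would match the "Cannon base noted" line when searching for "Cannon base".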

String split from a CSV - Java

I am having a small problem, I hope you can help.
I am reading a CSV in Java, in which one of the columns contains strings such as:
a. "123345"
b. "12345 - 67890"
I want to split this into two separate columns, like:
a. "123345", ""
b. "12345","67890"
Now, when I am using Java's default split function, it splits the string as follows:
a. "123345,"
b. "12345,67890" (Which is basically a string)
Any idea how I can achieve this? I have wasted 3 hours on this. Hope anyone can help.
Code as follows:
while ((line = bufferReader.readLine()) != null)
{
    main = line.split(","); // splitting the CSV on ","
    // I know that column #13 is the column where I can find digits like
    // "123-123" etc., therefore I have hard-coded it.
    if (main[12].contains("-"))
    {
        temp = main[12].split("-");
        // At this point, when I print temp, it still shows me a string.
        // What I have to do is to write them to the csv file.
        // E.g: ["86409, 3567"] <-- Problem here!
    }
    else
    {
        // do nothing
    }
}
After this, I will write the main[] array to the file.
Please check if java.util.StringTokenizer helps
Example:
StringTokenizer tokenizer = new StringTokenizer(inputString, ";")
Manual: StringTokenizer docs
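For the two-column output specifically, a small helper sketched here keeps every row at exactly two fields. The name splitRange is mine, and the whitespace-tolerant regex is an assumption about your data:

// hypothetical helper: "12345 - 67890" -> {"12345", "67890"}, "123345" -> {"123345", ""}
static String[] splitRange(String value) {
    if (value.contains("-")) {
        return value.split("\\s*-\\s*", 2); // limit 2, tolerating spaces around the dash
    }
    return new String[] { value, "" };      // empty second column for plain values
}

You would then write both fields back out, e.g. String[] cols = splitRange(main[12]); followed by out.println(cols[0] + "," + cols[1]);.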

Iterating over the content of a text file line by line having limitation, working but not finishing the entire text

I'm working on a Java method that parses a text file line by line inside an eternal for-loop.
As you can see, I'm assigning the content of a BufferedReader to a list:
BufferedReader br = new BufferedReader(new FileReader("C:/feed.txt"));
String strLine;
ArrayList list = new ArrayList();
while ((strLine = br.readLine()) != null) {
    list.add(strLine);
}
This works perfectly, and the feed.txt content, all 18,238 lines, is assigned to the ArrayList.
But when I try to use and process the content of the list with an iterator in a for-loop (the following code), there is a problem:
Iterator itr;
for (itr = list.iterator(); itr.hasNext();) {
    String str = itr.next().toString();
}
The instructions (business logic) in the loop work perfectly until line number 5175, where the program stops iterating.
I think the problem is linked to the iterator, because there isn't anything special about that line; even after deleting it, the problem persists.
Does the iterator have a limitation? How can I raise it?
I'm trying to parse a file with this number of lines, but I'm also supposed to build into my project an eternal, never-ending loop that receives lines all the time.
Can you help me please?
A few things here:
First of all, use
List<String> list = new ArrayList<String>();
Then you can just iterate using
for (String str : list) {
// do something
}
Does that solve the problem?
You asked "how do I combine"? Here is a simple example. Note it does NOT use an iterator - so you will just be able to see that you are able to do something with all the lines in the input file. It's not really answering your question, but it should help you narrow down where the problem lies.
BufferedReader br = new BufferedReader(new FileReader("C:/feed.txt"));
String strLine, str;
int numLines;
ArrayList list = new ArrayList();
numLines = 0;
while ((strLine = br.readLine()) != null) {
    list.add(strLine);
    System.out.println(list.get(numLines));
    numLines++;
    // do whatever you were going to do with the iterator here
    str = strLine.toString();
}
System.out.printf("Read in %d lines; the last line read is \n%s\n", numLines, list.get(numLines - 1));
While this is not exactly what you were asking, when you run this code and see how it fails you will be a step closer to solving your stated problem.
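Since the question mentions a never-ending loop that keeps receiving lines, one common pattern is to sleep and retry when readLine() returns null instead of exiting. A hedged sketch (file name from the question; the one-second delay is an arbitrary choice):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class TailLoop {
    public static void main(String[] args) throws IOException, InterruptedException {
        BufferedReader br = new BufferedReader(new FileReader("C:/feed.txt"));
        while (true) {              // eternal loop, as described in the question
            String line = br.readLine();
            if (line == null) {
                Thread.sleep(1000); // no new data yet; wait and poll again
                continue;
            }
            // process each line here instead of collecting everything first
            System.out.println(line);
        }
    }
}

Processing each line as it arrives also sidesteps any memory pressure from holding all 18,238 lines in a list.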

Java Lucene Stop Words Filter

I have about 500 sentences from which I would like to compile a set of ngrams. I am having trouble removing the stop words. I tried adding the Lucene StandardFilter and StopFilter, but I still have the same problem. Here is my code:
for (String curS : Sentences)
{
    reader = new StringReader(curS);
    tokenizer = new StandardTokenizer(Version.LUCENE_36, reader);
    tokenizer = new StandardFilter(Version.LUCENE_36, tokenizer);
    tokenizer = new StopFilter(Version.LUCENE_36, tokenizer, stopWords);
    tokenizer = new ShingleFilter(tokenizer, 2, 3);
    charTermAttribute = tokenizer.addAttribute(CharTermAttribute.class);
    while (tokenizer.incrementToken())
    {
        curNGram = charTermAttribute.toString();
        nGrams.add(curNGram); // store each token in an ArrayList
    }
}
For example, the first phrase I am testing is: "For every person that listens to". In this example curNGram is set to "For" which is a stop word in my list stopWords. Also, in this example "every" is a stop word and so "person" should be the first ngram.
Why are stop words being added to my list when I am using the StopFilter?
All help is appreciated!
What you've posted looks okay to me, so I suspect that stopWords isn't providing the information you want to the filter.
Try something like:
// Let's say we read the stop words into a list (a simple array, or any List implementation, should be fine)
List<String> words = new ArrayList<String>();
// ... read the file into words ...
Set stopWords = StopFilter.makeStopSet(Version.LUCENE_36, words, true);
Assuming the list of stopwords you generated (the one I've named 'words') looks like you think it does, this should put them into a format usable by the StopFilter.
Were you already generating stopWords like that?
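For completeness, a sketch of filling that list from a one-word-per-line file before building the set (assuming Lucene 3.6 to match your code; the class name and file path are placeholders):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.util.Version;

public class StopWordLoader {
    public static Set<?> loadStopWords(String path) throws IOException {
        List<String> words = new ArrayList<String>();
        BufferedReader br = new BufferedReader(new FileReader(path));
        String word;
        while ((word = br.readLine()) != null) {
            words.add(word.trim()); // one stop word per line
        }
        br.close();
        // true -> ignore case when matching stop words
        return StopFilter.makeStopSet(Version.LUCENE_36, words, true);
    }
}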
