Improve efficiency of a BufferedReader - Java

Hi, I'm using an API to run a search for books:
http://openlibrary.org/search.json?q=prolog
I then read the response line by line with a BufferedReader (below) and use an if statement to select the lines I want, for example:
"title_suggest": "Prolog",
URL search = new URL("http://openlibrary.org/search.json?q=" + searchingTerm); // search string
BufferedReader in = new BufferedReader(
        new InputStreamReader(search.openStream()));
String inputLine;
while ((inputLine = in.readLine()) != null) { // read the response line by line
    // filter inputLine here
}
For instance: if inputLine contains title_suggest, add inputLine to an ArrayList.
However, this is quite slow, and I was wondering whether there is a more efficient way to read in the data?

I cannot imagine that the parsing is a giant time suck compared to retrieving the data over the internet. But whatever you're doing, you're way better off using a genuine JSON parser than rolling your own, especially if you're relying on if statements to do so.
Also, make damned sure the query you send to the API is as restrictive as you can make it. After all, the more exact the data they can give you, the better off all parties are.

this is quite slow
BufferedReader isn't 'quite slow'. It is extremely fast. You can read millions of lines per second with BufferedReader.readLine().
The slow part is your code that processes each line. Or possibly the server is slow executing the query or delivering the data.
You're barking up the wrong tree here.
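One way to check this claim for yourself is to time readLine() on data that is already in memory, so that neither the network nor the server is involved. The following is a minimal sketch under that assumption; the class name ReadLineTiming, its methods, and the generated sample lines are hypothetical and not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadLineTiming {
    /** Builds an in-memory text of lineCount lines that loosely mimic the API output. */
    static String sampleText(int lineCount) {
        StringBuilder sb = new StringBuilder(lineCount * 32);
        for (int i = 0; i < lineCount; i++) {
            sb.append("\"title_suggest\": \"Book ").append(i).append("\",\n");
        }
        return sb.toString();
    }

    /** Reads every line of the text with BufferedReader.readLine() and returns the line count. */
    static int countLines(String text) {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            int count = 0;
            while (in.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the nanoseconds spent reading all lines of the text. */
    static long timeNanos(String text) {
        long start = System.nanoTime();
        countLines(text);
        return System.nanoTime() - start;
    }
}
```

Calling ReadLineTiming.timeNanos(ReadLineTiming.sampleText(1_000_000)) and comparing the result with the time your real request takes should show where the time actually goes.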

Related

Reading internal storage file is taking too much time in Android

I have a large JSON payload of around 20 MB (I create it using JSON.stringify from JavaScript code). I'm writing this JSON data to an internal storage file on an Android device and reading it back later. When I read the file, it takes too much time, and I can't tell whether it is reading at all. One more thing: I need to read it on the main thread only.
The code below works fine if I pass the value "Hello World" to the WriteFile method, but it fails with the large JSON:
public String ReadFile()
{
    StringBuffer text = new StringBuffer();
    String FILE_NAME = "file.txt";
    try {
        BufferedReader bReader = new BufferedReader(new InputStreamReader(openFileInput(FILE_NAME)));
        String line;
        int count = 0;
        while ((line = bReader.readLine()) != null) {
            text.append(line + "\n");
            alert("Reading File: " + ++count);
        }
    }
    catch (Exception e) {
        alert(e.toString());
    }
    return text.toString();
}
public String WriteFile(String data)
{
    String FILE_NAME = "file.txt";
    String result = "";
    try {
        FileOutputStream fos = openFileOutput(FILE_NAME, Context.MODE_PRIVATE);
        fos.write(data.toString().getBytes());
        result = "Success";
        fos.close();
    }
    catch (Exception e) {
        e.printStackTrace();
        result="Error";
    }
    return result;
}
I have also added an alert inside the while loop, but I never see any alert message. I don't see the exception message either.
So there can be two problems:
Something is wrong with writing to the file (but I don't know how to verify this, because I don't think there is any way to view an internal storage file).
Something is wrong in my reading code.
Update 1:
If, let's say, I cannot read such a large file in Java native code, is there any way to read an Android internal storage file from WebView JavaScript code?
============================================================================
Application Requirement
I have an Android application that contains a WebView. I have copied the full JavaScript code (JS and HTML files) to the app's assets folder. I write to the file from Java native code and read it from Java native code. I get all the data from the server on app launch. My client has a very slow internet connection that disconnects frequently, so they want the app to run in offline mode. That means the app will fetch all the data at launch, store it somewhere, and then read it throughout the app. If a user launches the app again, it will use the existing data. This data is very big, which is why I'm storing it in an internal storage file.
First of all, the only way to be really sure why your code is taking a long time is to profile it. We can't do that for you.
But here are some performance tips relevant to your code:
Don't read the entire 20MB JSON file into the Java heap unless you really need to. (I am finding it difficult to understand why you are doing this. For example, a typical JSON parser will happily [1] read input directly from a file. Or if you are reading this so that you can send it to a client on the other end of an HTTP connection, you should be able to stream the data.)
Reading a file a line at a time and then stitching the lines back together is unnecessary. It generates garbage, and extra garbage means more work for the GC, which slows you down. If the lines are long, you have the added performance "hit" of using an internal StringBuilder to build each line.
Reading to a recycled char[], then appending the char[] content to the StringBuilder will be faster than appending lines.
Your StringBuilder will repeatedly "grow" its backing character array to accommodate the characters as you append them. This generates garbage and leads to unnecessary copying. (Implementations typically "grow" the array exponentially to avoid O(N^2) behavior. However, the expansions still affect performance, and can result in up to 3 times the peak memory usage that is actually required.)
One way to avoid this is to get an initial estimate of the number of characters you are going to add and set the StringBuilder "capacity" accordingly. You may be able to estimate the number of characters from the file size. (It depends on the encoding.)
Look for a way to do it using existing standard Java libraries; e.g. Files.copy and ByteArrayOutputStream, or Files.readAllBytes.
Look for an existing 3rd-party library method; e.g. Apache Commons IO has an IOUtils.toString(Reader) method. The chances are that its authors will have spent a lot of time figuring out how to do this efficiently. Reusing a well-engineered, well-maintained library is likely to save you time.
Don't put a trace print (I assume that is what alert is ...) in the middle of a loop that could be called millions of times. (Duh!)
[1] Parsers are cheerful once you get to know them :-)
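The tips about a recycled char[] and a pre-sized StringBuilder can be sketched roughly as follows. This is only an illustration: the class BulkRead, the method readAll, and the expectedChars parameter are hypothetical names, and the 8192-character buffer size is an arbitrary assumption rather than a tuned value.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class BulkRead {
    /**
     * Reads everything from the reader into one String.
     * expectedChars is the caller's estimate of the character count
     * (for example derived from the file size); it only sets the initial capacity.
     */
    static String readAll(Reader reader, int expectedChars) {
        StringBuilder text = new StringBuilder(Math.max(expectedChars, 16));
        char[] buffer = new char[8192]; // one buffer, reused for every read
        try {
            int n;
            while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
                text.append(buffer, 0, n); // append only the characters actually read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }
}
```

On Android, the reader could presumably be built from the openFileInput call used in the question, wrapped in an InputStreamReader with an explicit charset. Even so, handing the file straight to a JSON parser avoids building the 20 MB string at all.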

Reading lines from BufferedReader using Stream

I am trying to read a huge file with approximately one billion lines in it. I want to use a Stream for parallel processing and insert each line into a database. I am doing something like this:
br = new BufferedReader(new InputStreamReader(inputStream, "UTF-8"));
list = br.lines().parallel().collect(Collectors.toList());
This stores all the lines in a list, but I don't want to keep all the lines in memory. Instead, I want to save each line to the database as soon as it is read. Please help me achieve this, and advise me on how to improve the idea.
Thanks in advance :)
It seems you need to use forEach and pass a Consumer that takes each line and stores it in the database:
br.lines().parallel()
    .forEach(line -> {
        // Invoke the code that persists the line in the DB, something like:
        dbWriter.write(line);
    });
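As a self-contained illustration of the forEach approach, here is a minimal sketch in which the database writer is replaced by an arbitrary Consumer. The class LineSink and the method forEachLine are hypothetical names. Note that with a parallel stream the consumer is called from several threads and in no guaranteed order, so whatever stands in for the database writer must be thread-safe.

```java
import java.io.BufferedReader;
import java.util.function.Consumer;

class LineSink {
    /**
     * Passes each line to the sink as the stream produces it,
     * without collecting the lines into a list first.
     * The sink must be thread-safe because the stream is parallel.
     */
    static void forEachLine(BufferedReader reader, Consumer<String> sink) {
        reader.lines().parallel().forEach(sink);
    }
}
```

A call might look like LineSink.forEachLine(br, line -> dbWriter.write(line)), where dbWriter is the placeholder from the answer above. Whether parallelism helps at all depends on the database, since the reading itself is sequential.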

Reading Random Access File with Buffered Reader

I am trying to read a huge file (> 1 GB), and I think that reading it as a random access file with a buffered reader would be efficient.
I need to read the file line by line and parse it.
However, being new to the Java I/O API, I'm not sure how to do this.
I appreciate your help.
You can use Java's BufferedReader for this:
BufferedReader reader = new BufferedReader(new FileReader(fileName));
String line;
while ((line = reader.readLine()) != null) {
    // Do some stuff with the line
}
fileName is the path to the file you want to read.
Do you need to read all of it, and from the beginning? You can use a RandomAccessFile to jump to different parts of the file if you know the byte offset you want to start at. The seek method does this.
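Combining the two answers, a rough sketch of seeking to a known byte offset and then reading buffered lines from there might look like this. The class SeekRead and the method readLineAt are hypothetical names, and UTF-8 is an assumed encoding. The offset has to fall on the start of a line (and on a character boundary), otherwise the first line returned is only a fragment.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SeekRead {
    /**
     * Jumps to the given byte offset and returns the next line from there,
     * or null if the offset is at the end of the file.
     */
    static String readLineAt(String fileName, long offset) {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r")) {
            file.seek(offset); // move the file pointer to the byte offset
            // The stream shares the file descriptor, so it starts reading at the seek position.
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new FileInputStream(file.getFD()), StandardCharsets.UTF_8));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The BufferedReader is used instead of RandomAccessFile.readLine() because the latter is unbuffered and does not decode multi-byte characters. To process many lines, the same reader would be kept open and read in a loop rather than reopened per line.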
While this is perfectly doable in Java, I want to suggest an alternative based on my experience:
If you're on a Unix platform, you can use an external shell script to search through GBs of log. sed is well suited to this purpose. Specific usage here: http://www.grymoire.com/Unix/Sed.html
Call the shell script from your Java code whenever you need to read/grep through the log file.
How?
1) In your Java code, use the ProcessBuilder class. Its constructor can take the shell script as an argument:
ProcessBuilder obj = new ProcessBuilder("FastLogRead.sh");
2) Create a Process object by starting it:
Process process = obj.start();
3) You can read the output of the script directly through a BufferedReader:
BufferedReader br=new BufferedReader(new InputStreamReader(process.getInputStream()));
Pros:
Speeds up execution by about 10 times on average (I searched through a log file of around 4 GB).
Cons:
Some developers don't like bringing a lightweight shell script into the realm of Java, and hence prefer Java's RandomAccessFile. This is justified.
In your case, you have to choose between standardization and performance.

HTML to TXT library that mimics the output of "lynx -dump"?

The problem is really that specific.
I need a library in Java that can take HTML content and generate text in the same format as the Linux lynx program.
I need to expose data provided by 3rd-party servers to end users on Android. The data format is ancient, badly formatted HTML, so much so that reading it with Java fails occasionally (unacceptable). It is also growing every month (preinstalling is ruled out), and I can't convince them to change to "modern" stuff (life would be great with XML etc.).
Shortest route: I wrote a class that uses the W3 html2txt service online (Google it). It worked fine in the app until I got complaints and noticed that the W3 service fails occasionally. It's not that big of a deal, but the black-box logic expects the output to be in this "lynx-like" text format.
So I would like a library to do the conversion (HTML->TXT) in "lynx style" inside the app and avoid the outages of the W3 service. Besides, the lynx output is probably the best I've seen: the most organized and neat.
Are you guys aware of any?
Not sure what you mean by lynx style, so I might be completely off in submitting this (if so, please excuse me).
I used this piece of code a while back to check HTML/XML files (at the time I was just printing it out in the logs):
InputStream in = context.getResources().openRawResource(id);
StringBuffer inLine = new StringBuffer();
InputStreamReader isr = new InputStreamReader(in);
BufferedReader inRd = new BufferedReader(isr);
String text;
while ((text = inRd.readLine()) != null) {
    inLine.append(text);
    inLine.append("\n");
}
in.close();
return inLine.toString();
I hope it helps, but I get the feeling you need something more complex :P
After a year, I give up. The answer is: there is no way to handle that, no library in Java. At least for now.
I'm closing this. Thank you for your attention.

Building an irc client in Java

I'm trying to write an IRC bot in Java for some practice. I am using this sample code as the base. I'm trying to figure out how to get it to read text from my console so I can actually talk to people through the bot.
There's one while loop that takes the input from the IRC server, prints it to the console, and responds to PINGs. I'm assuming I need another thread that takes input from the user and then uses the same BufferedWriter to send it to the IRC server, but I can't figure that out.
Any help would be awesome!
In the code you have linked to, the 'reader' and 'writer' instances are indeed connected to the incoming and outgoing ends, respectively, of the two-way socket you have established with the IRC server.
So in order to get input from the user, you do indeed need another thread that takes commands from the user in some fashion and acts upon them. The most basic model would naturally be to use System.in for this, preferably wrapped so that you can retrieve whole lines of input from the user and parse them as commands.
To read whole lines from System.in you could do something like this:
BufferedReader bin = new BufferedReader(new InputStreamReader(System.in));
String line;
while ((line = bin.readLine()) != null) {
    // Do stuff
}
You could also consider using one of the CLI libraries available for Java, such as JLine.
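The second thread described above could look roughly like the following sketch. It is not taken from the linked sample code: the class ConsoleForwarder and its fields are hypothetical, and it assumes every console line should be sent to one channel as an IRC PRIVMSG. Because the server loop answers PINGs through the same writer, both threads should synchronize on the writer when sending.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

class ConsoleForwarder implements Runnable {
    private final BufferedReader console; // e.g. wraps System.in
    private final BufferedWriter writer;  // the writer connected to the IRC server
    private final String channel;         // target channel, e.g. "#test"

    ConsoleForwarder(BufferedReader console, BufferedWriter writer, String channel) {
        this.console = console;
        this.writer = writer;
        this.channel = channel;
    }

    @Override
    public void run() {
        try {
            String line;
            while ((line = console.readLine()) != null) {
                // Lock the shared writer so a PING reply cannot interleave with this message.
                synchronized (writer) {
                    writer.write("PRIVMSG " + channel + " :" + line + "\r\n");
                    writer.flush();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

It would be started alongside the existing server loop with something like new Thread(new ConsoleForwarder(bin, writer, "#channel")).start(), where bin is the reader around System.in shown above and writer is the bot's existing BufferedWriter.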
If you really want to do yourself a favour, I recommend (after having used it extensively) switching to pircbot. Pircbot really is a wonderful library and will let you get an IRC bot up and running in just a few minutes. Check out some of the examples on the site, it's super easy to use.
