Compare speed of two Java methods [duplicate]

This question already has answers here:
How do I write a correct micro-benchmark in Java?
(11 answers)
Closed 4 years ago.
I have two methods that do the same thing but are implemented slightly differently. Each walks through a directory, reads all the files in it, and counts how many files with a certain name it contains. Now I want to know which one is faster. Both are similar and take around 3-4 seconds (the directory has millions of files), but how can I tell which is really faster? Is there a method that compares their speed?
First method:
private void getAllRelatedFilesEig(String corrId) throws InterruptedException, IOException
{
    log.debug("Get all files with corrId=" + corrId + " from directory=" + processingDir);
    Profiler profiler = Profiler.createStarted();
    // Files.list keeps a directory handle open, so the stream should be closed;
    // try-with-resources does that automatically.
    try (Stream<Path> paths = Files.list(Paths.get(processingDir)))
    {
        paths.filter(p -> p.getFileName().toString().indexOf("EPX_" + corrId + "_") >= 0)
             .forEach(path ->
             {
                 try
                 {
                     EPEXFile file = new EPEXFile(path);
                     if (file.isTranMessage())
                     {
                         if (file.isOrderMessage())
                         {
                             orderFiles.add(file);
                         }
                         else
                         {
                             tradeFiles.add(file);
                         }
                     }
                     else
                     {
                         infoFiles.add(file);
                     }
                 }
                 catch (IFException ex)
                 {
                     log.error("Error creating EPEXFile object " + ex.getMessage());
                 }
             });
    }
    profiler.stop("allFilesWithSameCorrIdRetrieval");
    log.info(orderFiles.size() + " order files with corrId=" + corrId);
    log.info(tradeFiles.size() + " trade files with corrId=" + corrId);
    log.info(infoFiles.size() + " info files with corrId=" + corrId);
    profiler = Profiler.createStarted();
    profiler.stop("processFiles");
    orderFiles.clear();
    tradeFiles.clear();
    infoFiles.clear();
}
Second method:
private void getAllRelatedFilesOrig(String corrId) throws InterruptedException, IOException {
    log.debug("Get all files with corrId=" + corrId + " from directory=" + processingDir);
    Path dirPath = Paths.get(processingDir);
    ArrayList<Path> fileList;
    Profiler profiler = Profiler.createStarted();
    try (Stream<Path> paths = Files.walk(dirPath)) {
        fileList = paths.filter(t -> t.getFileName().toString().indexOf("EPX_" + corrId + "_") >= 0)
                        .collect(Collectors.toCollection(ArrayList::new));
        for (Path path : fileList) {
            try {
                EPEXFile file = new EPEXFile(path);
                if (file.isTranMessage()) {
                    if (file.isOrderMessage()) {
                        orderFiles.add(file);
                    } else {
                        tradeFiles.add(file);
                    }
                } else {
                    infoFiles.add(file);
                }
            } catch (IFException ex) {
                log.error("Error creating EPEXFile object " + ex.getMessage());
            }
        }
    }
    profiler.stop("allFilesWithSameCorrIdRetrieval");
    log.info(orderFiles.size() + " order files with corrId=" + corrId);
    log.info(tradeFiles.size() + " trade files with corrId=" + corrId);
    log.info(infoFiles.size() + " info files with corrId=" + corrId);
    profiler = Profiler.createStarted();
    profiler.stop("processFiles");
    orderFiles.clear();
    tradeFiles.clear();
    infoFiles.clear();
}
I tried to figure it out with the Profiler class, but I could not tell which one is actually faster: sometimes the first wins and sometimes the second. Is there even a way to say which is faster in general? Even if one is only slightly faster, it would help me to know which one it is.

I recently wrote this method to compare two of my own methods that did the exact same thing in different ways.
private void benchMark() {
    long t, t1 = 0, t2 = 0;
    for (int i = 0; i < 50; i++) {
        t = System.currentTimeMillis();
        method1();
        t1 += System.currentTimeMillis() - t;
        t = System.currentTimeMillis();
        method2();
        t2 += System.currentTimeMillis() - t;
    }
    System.out.println("Benchmarking\n\tMethod 1 took " + t1 + " ms\n\tMethod 2 took " + t2 + " ms");
}
That's a brute-force way to do it, but it works: I found that one of my methods was consistently about 5% faster in every one of my tests.
I call the methods one after the other within the same loop to diminish the effect of performance variations during the test.
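For anything where the margin matters this much, a harness such as JMH (the approach recommended in the duplicate linked above) is more trustworthy than a hand-rolled timing loop, because it takes care of JVM warmup, dead-code elimination, and run-to-run variance. A minimal sketch; FileScanBenchmark, FileScanner, and "someCorrId" are hypothetical names standing in for the class that owns the two methods:

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(1)
@Warmup(iterations = 3)
@Measurement(iterations = 5)
public class FileScanBenchmark {

    // Hypothetical wrapper exposing the two variants under test.
    FileScanner scanner = new FileScanner();

    @Benchmark
    public void eigVariant() throws Exception {
        scanner.getAllRelatedFilesEig("someCorrId");
    }

    @Benchmark
    public void origVariant() throws Exception {
        scanner.getAllRelatedFilesOrig("someCorrId");
    }
}

JMH runs each @Benchmark method many times after warmup and reports the average together with an error margin, so a persistent 5% difference shows up clearly instead of being drowned out by variance.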

Related

Concurrent modification on ArrayList

There are a lot of ConcurrentModificationException questions, but I'm unable to find an answer that resolves my issue. If you find one that does, please supply a link instead of just downvoting.
I originally got a ConcurrentModificationException when searching through an ArrayList and removing elements. For a while I had it resolved by creating a second ArrayList, adding the discovered elements to it, and then calling removeAll() outside the for loop. That seemed to work, but once I used the for loop to import data from multiple files I started getting the exception again, intermittently for some reason. Any help would be greatly appreciated.
Here's the specific method having the problem (as well as the other methods it calls...):
public static void removeData(ServiceRequest r) {
    readData();
    ArrayList<ServiceRequest> targets = new ArrayList<ServiceRequest>();
    for (ServiceRequest s : serviceQueue) {
        // ConcurrentModificationException triggered on the previous line
        if (s.getClient().getSms() == r.getClient().getSms() &&
                s.getTech().getName().equals(r.getTech().getName()) &&
                s.getDate().equals(r.getDate())) {
            JOptionPane.showMessageDialog(null, s.getClient().getSms() + "'s Service Request with "
                    + s.getTech().getName() + " on " + s.getDate().toString() + " has been removed!");
            targets.add(s);
            System.out.print("targetted");
        }
    }
    if (targets.isEmpty()) {
        System.out.print("*");
    } else {
        System.out.print("removed");
        serviceQueue.removeAll(targets);
        writeData();
    }
}
public static void addData(ServiceRequest r) {
    readData();
    removeData(r);
    if (r.getClient().getStatus().equals("MEMBER") || r.getClient().getStatus().equals("ALISTER")) {
        serviceQueue.add(r);
    } else if (r.getClient().getStatus().equals("BANNED") || r.getClient().getStatus().equals("UNKNOWN")) {
        JOptionPane.showMessageDialog(null, "New Request failed: " + r.getClient().getSms() + " is "
                + r.getClient().getStatus() + "!", "ERROR: " + r.getClient().getSms(),
                JOptionPane.WARNING_MESSAGE);
    } else {
        int response = JOptionPane.showConfirmDialog(null, r.getClient().getSms() + " is "
                + r.getClient().getStatus() + "...", "Manually Overide?", JOptionPane.OK_CANCEL_OPTION);
        if (response == JOptionPane.OK_OPTION) {
            serviceQueue.add(r);
        }
    }
    writeData();
}

public static void readData() {
    try {
        Boolean complete = false;
        FileReader reader = new FileReader(f);
        ObjectInputStream in = xstream.createObjectInputStream(reader);
        serviceQueue.clear();
        while (complete != true) {
            ServiceRequest test = (ServiceRequest) in.readObject();
            if (test != null && test.getDate().isAfter(LocalDate.now().minusDays(180))) {
                serviceQueue.add(test);
            } else {
                complete = true;
            }
        }
        in.close();
    } catch (IOException | ClassNotFoundException e) {
        e.printStackTrace();
    }
}

public static void writeData() {
    if (serviceQueue.isEmpty()) {
        serviceQueue.add(new ServiceRequest());
    }
    try {
        FileWriter writer = new FileWriter(f);
        ObjectOutputStream out = xstream.createObjectOutputStream(writer);
        for (ServiceRequest r : serviceQueue) {
            out.writeObject(r);
        }
        out.writeObject(null);
        out.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
EDIT
The changes cause the ConcurrentModificationException to trigger every time rather than intermittently, which I guess means the removal code is better, but the error now triggers at it.remove():
public static void removeData(ServiceRequest r) {
    readData();
    for (Iterator<ServiceRequest> it = serviceQueue.iterator(); it.hasNext();) {
        ServiceRequest s = it.next();
        if (s.getClient().getSms() == r.getClient().getSms() &&
                s.getTech().getName().equals(r.getTech().getName()) &&
                s.getDate().equals(r.getDate())) {
            JOptionPane.showMessageDialog(null, s.getClient().getSms() + "'s Service Request with "
                    + s.getTech().getName() + " on " + s.getDate().toString() + " has been removed!");
            it.remove(); // Triggers here (line 195)
            System.out.print("targetted");
        }
    }
    writeData();
}
Exception in thread "AWT-EventQueue-0" java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
    at java.util.ArrayList$Itr.next(ArrayList.java:851)
    at data.ServiceRequest.removeData(ServiceRequest.java:195)
    at data.ServiceRequest.addData(ServiceRequest.java:209) <...>
EDIT
After some more searching, I've switched the for loop to:
Iterator<ServiceRequest> it = serviceQueue.iterator();
while (it.hasNext()) {
and it's back to triggering intermittently. By that I mean: the first time I attempt to import data (removeData is being triggered from addData), it throws the ConcurrentModificationException, but on the next try it pushes past the failure and moves on to another file. I know there are a lot of these questions, but I'm not finding anything that helps in my situation, so links to other answers are more than welcome...
That is not the way to do it: to remove elements while iterating over a List, you use an Iterator, like this:
List<ServiceRequest> targets = new ArrayList<ServiceRequest>();
for (Iterator<ServiceRequest> it = targets.iterator(); it.hasNext();) {
    ServiceRequest currentServReq = it.next();
    if (someCondition) {
        it.remove();
    }
}
You will not get a ConcurrentModificationException this way if you only have one thread.
If multiple threads are involved in your code, you may still get one. One way to mitigate this is to wrap your collection (serviceQueue) with Collections.synchronizedCollection(...); that synchronizes the individual operations (you still need to synchronize on the collection while iterating over it), but your code may become very slow.
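On Java 8+ there is an even shorter option: Collection.removeIf does the same iterate-and-remove internally. A sketch of the question's removal condition (note it compares the SMS values with equals() instead of ==, since == on objects only checks reference identity and is almost certainly not what was intended):

serviceQueue.removeIf(s ->
        s.getClient().getSms().equals(r.getClient().getSms())
        && s.getTech().getName().equals(r.getTech().getName())
        && s.getDate().equals(r.getDate()));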

Java: How do I optimize memory footprint of reading/updating/writing many little files?

I need to improve an open source tool (Releng, with JDK 1.5 compliance) that updates copyright headers in source files (e.g., copyright 2000, 2011).
It reads files and inserts back a newer revision date (e.g., 2014).
Currently it eats so much memory that performance slows to a crawl.
I need to rewrite the file parser so that it uses less memory and runs faster.
I've written a basic file parser (below) that reads all files in a directory (project/files), increments the first four digits found in each file, and prints run-time information.
[edit]
At this small scale, a run performs 25 garbage collections totalling 12 ms. At a large scale, the memory overhead is so high that GC thrashes performance.
Runs  Time(ms)  avrg(ms)  GC_count  GC_time
200   4096      20        25        12
200   4158      20        25        12
200   4072      20        25        12
200   4169      20        25        13
Is it possible to re-use File or String objects (and other objects?) to reduce the garbage collection count?
Optimization guides suggest re-using objects.
I have considered using StringBuilder instead of String, but from what I gather it only helps when you do a lot of concatenation, which is not the case here.
I also don't know how to re-use any of the other objects in the code below (e.g., the files).
How can I go about re-using objects in this scenario (or otherwise optimizing the code below)?
Any ideas/suggestions are welcome.
import java.io.File;
import java.io.IOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;

public class Test {
    // Use a Bash script to create 2000 files, each holding a 4-digit number:
    /*
    #!/bin/sh
    rm files/test*
    for i in {1..2000}
    do
        echo "2000" > files/test$i
    done
    */

    /*
     * Example output:
     * runs: 200
     * Run time: 4822 average: 24
     * Gc runs: Total Garbage Collections: 28
     * Total Garbage Collection Time (ms): 17
     */

    private static String filesPath = System.getProperty("user.dir") + "/src/files";

    public static void main(String args[]) {
        final File folder = new File(filesPath);
        ArrayList<String> paths = listFilesForFolder(folder);
        if (paths == null) {
            System.out.println("no files found");
            return;
        }
        long start = System.currentTimeMillis();
        int runs = 200;
        System.out.println("Run: ");
        for (int i = 1; i <= runs; i++) {
            System.out.print(" " + i);
            updateFiles(paths);
        }
        System.out.println("");
        long end = System.currentTimeMillis();
        long runtime = end - start;
        System.out.println("Runs Time avrg GC_count GC_time");
        System.out.println(runs + " " + Long.toString(runtime) + " " + (runtime / runs) + " " + printGCStats());
    }

    private static ArrayList<String> listFilesForFolder(final File folder) {
        ArrayList<String> paths = new ArrayList<>();
        for (final File fileEntry : folder.listFiles()) {
            if (fileEntry.isDirectory()) {
                // note: the result of this recursive call is discarded, so files
                // in subdirectories never make it into the returned list
                listFilesForFolder(fileEntry);
            } else {
                paths.add(filesPath + "/" + fileEntry.getName());
            }
        }
        if (paths.size() == 0) {
            return null;
        } else {
            return paths;
        }
    }

    private static void updateFiles(final ArrayList<String> paths) {
        for (String path : paths) {
            try {
                String content = readFile(path, StandardCharsets.UTF_8);
                int year = Integer.parseInt(content.substring(0, 4));
                year++;
                Files.write(Paths.get(path), Integer.toString(year).getBytes(),
                        StandardOpenOption.CREATE);
            } catch (IOException e) {
                System.out.println("Failed to read: " + path);
            }
        }
    }

    static String readFile(String path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(path)); // closes the file.
        return new String(encoded, encoding);
    }

    // PROFILING HELPER
    public static String printGCStats() {
        long totalGarbageCollections = 0;
        long garbageCollectionTime = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count >= 0) {
                totalGarbageCollections += count;
            }
            long time = gc.getCollectionTime();
            if (time >= 0) {
                garbageCollectionTime += time;
            }
        }
        return " " + totalGarbageCollections + " " + garbageCollectionTime;
    }
}
In the end, the code above actually works fine.
I found that the production code didn't close a file buffer, which caused a memory leak, which in turn caused the performance issues with larger numbers of files.
After that was fixed, it scaled well.
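For reference, the usual way to guarantee a reader is closed even when an exception is thrown is try-with-resources (Java 7+). A minimal sketch of a leak-proof read, reusing the NIO types from the test code above (readFirstLine is a hypothetical helper):

static String readFirstLine(Path path, Charset cs) throws IOException {
    // The reader is closed automatically when the block exits, normally
    // or via an exception, so file handles cannot accumulate across runs.
    try (BufferedReader reader = Files.newBufferedReader(path, cs)) {
        return reader.readLine();
    }
}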

Java multi threading file saving

I have an app that creates multiple endless threads. Each thread reads some info, and I create some tasks using a thread pool (which is fine).
I have added additional functionality that handles arrays; when it finishes, it sends those ArrayLists to a new thread that saves them as files. I have implemented the saving in 3 ways, and only one of them succeeds. I would like to know why the other 2 did not:
1. I created a thread (via new Thread(Runnable)) and gave it the array and the name of the file. In the thread constructor I created the PrintWriter and saved the files. It ran without any problems (I have 1-10 file-save threads running in parallel).
2. If I place the save code outputStream.println(aLog); in the run method, it is never reached: after the constructor finishes, the thread exits.
3. I place the created runnables (file save) in a thread pool (with the saving code in the run() method). When I send just 1 task (1 file to save), all is fine. When more than 1 task is added to the pool (very quickly), exceptions are thrown (at debug time I can see that all the needed info is available) and some of the files are not saved.
Can anyone explain the difference in behavior?
Thanks.
Please see the code below (starting with a function that is part of an endless thread class, which also places some tasks in the pool); the pool is created in the endless thread:
ExecutorService iPool = Executors.newCachedThreadPool();
private void logRate(double r1, int ind) {
    historicalData.clear();
    for (int i = 499; i > 0; i--) {
        // some code
        Data.add(0, array1[ind][i][0] + "," + array1[ind][i][1] + "," +
                array1[ind][i][2] + "," + array1[ind][i][3] + "," +
                array2[ind][i] + "\n");
    }
    // first item
    array1[ind][0][0] = r1;
    array1[ind][0][1] = array1[ind][0][0];
    array1[ind][0][2] = array1[ind][0][0];
    array2[ind][0] = new SimpleDateFormat("HH:mm:ss yyyy_MM_dd").format(today);
    Data.add(0, r1 + "," + r1 + "," + r1 + "," + r1 + "," + array2[ind][0] + '\n');
    // save the log: send it to the pool (this is case 3)
    //iPool.submit(new FeedLogger(fName, Integer.toString(ind), Data));
    // Cases 1 and 2
    Thread fl = new Thread(new FeedLogger(fName, Integer.toString(ind), Data));
}
Here is the FeedLogger class:
public class FeedLogger implements Runnable {
    private List<String> fLog = new ArrayList<>();
    PrintWriter outputStream = null;
    String asName, asPathName;

    public FeedLogger(String aName, String ind, List<String> fLog) {
        this.fLog = fLog;
        this.asName = aName;
        try {
            asPathName = System.getProperty("user.dir") + "\\AsLogs\\" + asName + "\\Feed" + ind + ".log";
            outputStream = new PrintWriter(new FileWriter(asPathName));
            outputStream.println(fLog); // Case 1: all is fine
            outputStream.flush();       // Case 1: all is fine
            outputStream.close();       // Case 1: all is fine
        } catch (Exception ex) {
            JavaFXApplication2.logger.log(Level.SEVERE, null, asName + ex.getMessage());
        }
    }

    @Override
    public void run() {
        try {
            outputStream.println(fLog); // Case 2: this code is never reached.
                                        // Case 3 (as pool task): throws when multiple tasks run.
            outputStream.flush();
        } catch (Exception e) {
            System.out.println("err in file save e=" + e.getMessage() + asPathName
                    + " feed size=" + fLog.size());
            JavaFXApplication2.logger.log(Level.ALL, null, asName + e.getMessage());
        } finally {
            if (outputStream != null) {
                outputStream.close();
            }
        }
    }
}
You need to call start() on a Thread instance to make it actually do something.
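A minimal sketch of the fix for cases 1 and 2 (variable names as in the question), assuming the file writing is moved out of the constructor and into run():

Thread fl = new Thread(new FeedLogger(fName, Integer.toString(ind), Data));
fl.start(); // without start(), run() never executes; in case 1 only the
            // constructor ran, on the calling thread, which is why it "worked"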

Open file to view its content [duplicate]

This question already has answers here:
How to open a file with the default associated program
(4 answers)
Closed 9 years ago.
I have a list of files. Let's say it looks like this:
String[] lst = new String[] {
"C:\\Folder\\file.txt",
"C:\\Another folder\\another file.pdf"
};
I need some method to open these files with their default programs: say, "file.txt" with Notepad, "another file.pdf" with Adobe Reader, and so on.
Does anyone know how?
There is a method to do this:
java.awt.Desktop.getDesktop().open(file);
JavaDoc:
Launches the associated application to open the file.
If the specified file is a directory, the file manager of the current platform is launched to open it.
The Desktop class allows a Java application to launch associated applications registered on the native desktop to handle a URI or a file.
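Applied to the question's array, a minimal sketch (the isDesktopSupported() check guards against headless environments, where no desktop is available):

import java.awt.Desktop;
import java.io.File;

public class OpenAll {
    public static void main(String[] args) throws Exception {
        String[] lst = new String[] {
            "C:\\Folder\\file.txt",
            "C:\\Another folder\\another file.pdf"
        };
        if (Desktop.isDesktopSupported()) {
            for (String path : lst) {
                // Each file opens in whatever application the OS associates with it.
                Desktop.getDesktop().open(new File(path));
            }
        }
    }
}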
If you are using J2SE 1.4 or Java SE 5, the best option is:
for (int i = 0; i < lst.length; i++) {
    String path = lst[i];
    if (path.indexOf(' ') > 0) {
        // Path with spaces
        Runtime.getRuntime().exec("explorer \"" + path + "\"");
    } else {
        // Path without spaces
        Runtime.getRuntime().exec("explorer " + path);
    }
}
Just make sure the file is in the right location, and this should work fine.
try {
    File dir = new File(System.getenv("APPDATA"), "data");
    if (!dir.exists()) dir.mkdirs();
    File file = new File(dir, "file.txt");
    if (!file.exists()) System.out.println("File doesn't exist");
    else Desktop.getDesktop().open(file);
} catch (Exception e) {
    e.printStackTrace();
}
I didn't realize you have a String array now, so this one uses a regex to process the file list in the format you specified before; ignore it if not required.
If the file list is huge and you would prefer the files to open one by one, cmd works great. If you want them all to open at once, use explorer. This works only on Windows, but then on almost all JVM versions, so there's a trade-off to consider.
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilesOpenWith {
    static String listOfFiles = "{\"C:\\Setup.log\", \"C:\\Users\\XYZ\\Documents\\Downloads\\A B C.pdf\"}";

    public static void main(String[] args) {
        if (args != null && args.length == 1) {
            // Braces must be escaped in the pattern, or Pattern.compile
            // rejects them as an illegal repetition.
            if (args[0].matches("\\{\"[^\"]+\"(,\\s?\"[^\"]+\")*\\}")) {
                listOfFiles = args[0];
            } else {
                usage();
                return;
            }
        }
        openFiles();
    }

    private static void openFiles() {
        Matcher m = Pattern.compile("\"([^\"]+)\"").matcher(listOfFiles);
        while (m.find()) {
            try {
                Runtime.getRuntime().exec("cmd /c \"" + m.group(1) + "\"");
                // Runtime.getRuntime().exec("explorer \"" + m.group(1) + "\"");
            } catch (IOException e) {
                System.out.println("Bad Input: " + e.getMessage());
                e.printStackTrace(System.err);
            }
        }
    }

    private static void usage() {
        System.out.println("Input filelist format = {\"file1\", \"file2\", ...}");
    }
}

When to flush a BufferedWriter

In a Java program (Java 1.5), I have a BufferedWriter that wraps a FileWriter, and I call write() many, many times, so the resulting file is pretty big.
Some of the lines in this file are incomplete.
Do I need to call flush() each time I write something (I suspect that would be inefficient), or should I use another method of BufferedWriter, or another class?
(Since I have a zillion lines to write, I do want something quite efficient.)
What would be the ideal moment to flush? (When I reach the capacity of the BufferedWriter?)
Init:
try {
    analysisOutput = new BufferedWriter(new FileWriter("analysisResults", true));
    analysisOutput.newLine();
    analysisOutput.write("Processing File " + fileName + "\n");
} catch (FileNotFoundException ex) {
    ex.printStackTrace();
} catch (IOException ex) {
    ex.printStackTrace();
}
Writing:
private void printAfterInfo(String toBeMoved, HashMap<String, Boolean> afterMap, Location location)
        throws IOException {
    if (afterMap != null) {
        for (Map.Entry<String, Boolean> map : afterMap.entrySet()) {
            if (toBeMoved == "Condition") { // note: == compares references; equals() is usually intended
                if (1 <= DEBUG)
                    System.out.println("###" + toBeMoved + " " + location + " "
                            + conditionalDefs.get(conditionalDefs.size() - 1)
                            + " After " + map.getKey() + " " + map.getValue() + "\n");
                analysisOutput.write("###" + toBeMoved + " " + location + " "
                        + conditionalDefs.get(conditionalDefs.size() - 1)
                        + " After " + map.getKey() + " " + map.getValue() + "\n");
            } else {
                if (1 <= DEBUG)
                    System.out.println("###" + toBeMoved + " " + location + " "
                            + map.getKey() + " After " + map.getValue() + "\n");
                if (conditionalDefs.size() > 0)
                    analysisOutput.write("###" + toBeMoved + " " + location + " "
                            + conditionalDefs.get(conditionalDefs.size() - 1) + " "
                            + map.getKey() + " After " + map.getValue() + "\n");
                else
                    analysisOutput.write("###" + toBeMoved + " " + location + " "
                            + map.getKey() + " After " + map.getValue() + "\n");
            }
        }
    }
}
I've just figured out that the incomplete lines are the ones written just before "Processing File", so it happens when I switch from one analyzed file to the next.
Closing:
dispatch(unit);
try {
    if (analysisOutput != null) {
        printFileInfo();
        analysisOutput.close();
    }
} catch (IOException ex) {
    ex.printStackTrace();
}
Sometimes the information printed out by printFileInfo does not appear in the results file...
The BufferedWriter will already flush when it fills its buffer. From the docs of BufferedWriter.write:
Ordinarily this method stores characters from the given array into this stream's buffer,
flushing the buffer to the underlying stream as needed.
(Emphasis mine.)
The point of BufferedWriter is basically to consolidate lots of little writes into far fewer big writes, as that's usually more efficient (but more of a pain to code for). You shouldn't need to do anything special to get it to work properly though, other than making sure you flush it when you're finished with it - and calling close() will do this and flush/close the underlying writer anyway.
In other words, relax - just write, write, write and close :) The only time you normally need to call flush manually is if you really, really need the data to be on disk now. (For instance, if you have a perpetual logger, you might want to flush it every so often so that whoever's reading the logs doesn't need to wait until the buffer's full before they can see new log entries!)
The ideal flushing moment is when you need another program reading the file to see the data that's been written, before the file is closed. In many cases, that's never.
If you have a loop alternating init and printAfterInfo, my guess about your problem is that you don't close your writer before creating a new one on the same file. You'd better create the BufferedWriter once and close it at the end of all the processing.
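A sketch of that shape using try-with-resources (Java 7+; on the asker's Java 1.5, an explicit finally { analysisOutput.close(); } achieves the same). fileNames here is a hypothetical list of the inputs being analyzed:

try (BufferedWriter out = new BufferedWriter(new FileWriter("analysisResults", true))) {
    for (String fileName : fileNames) {
        out.newLine();
        out.write("Processing File " + fileName + "\n");
        // ... write the analysis lines for this file ...
    }
} // close() flushes the buffer, so no partial lines are left behind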
