Simple performance test, different results - java

I wrote a performance test for a collection, but every time I run it I get different results. Why is this happening, and what can I do to get reliable results? I think the problem may be related to my VM options.
public class PerformanceTest {
private static void addToTheBeginTest() {
String nameOfMethod = "Method add 250_000 result: ";
List<Integer> arrayList = new ArrayList<>();
StopWatch.start();
for (int i = 0; i < 250_000; i++) {
arrayList.add(0, 1);
}
long resultAL = StopWatch.getElapsedTime();
outputInFile(nameOfMethod, resultAL);
}
private static void outputInFile(String nameOfMethod, long resultAl) {
File outputFile = new File("D:\\Idea Project\\ExperimentalProject\\src\\SimplePerformance\\");
outputFile.mkdir();
try (FileWriter writer = new FileWriter("D:\\Idea Project\\ExperimentalProject\\src\\SimplePerformance\\SimplePerformanceTest.txt", true)) {
writer.write(nameOfMethod);
writer.write(String.valueOf(resultAl) + " mc \n");
} catch (IOException e) {
System.out.println(e.getMessage());
}
}
}
class StopWatch {
private static long result;
public static void start() {
result = System.currentTimeMillis();
}
public static long getElapsedTime() {
return System.currentTimeMillis() - result;
}
}
Results of 3 times

The reasons for this have already been explained in other answers.
The easiest way to handle this is to run each test many times (say, 1,000) and take the average. You should find the results more consistent.

The amount of resources (CPU time, memory, caches) available to your code changes from one millisecond to the next, because the operating system and other processes compete for them. It is not possible to keep this constant on an ordinary computer.

JVMs optimise code under the hood the more often a particular piece of code runs (JIT compilation), so early runs are typically slower than later ones. You should look into microbenchmarking. There is a whole list of things that need to be done in order to get accurate results.
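A minimal sketch of the advice in the answers above: run the workload a few times untimed so the JIT can warm up, then time many runs and look at the minimum and median instead of a single number. The class name, the scaled-down workload size and the run counts are assumptions made for illustration; this is not a substitute for a proper benchmark harness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original question.
class WarmupTimingSketch {

    // The workload from the question, scaled down so the sketch finishes quickly.
    static int addToTheBeginning(int count) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            list.add(0, 1);
        }
        return list.size(); // return something so the work is actually used
    }

    // Runs the workload `warmups` times untimed, then `runs` times timed.
    // Returns the timed durations in nanoseconds, sorted ascending.
    static long[] measure(int count, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            addToTheBeginning(count);
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // monotonic, finer-grained than currentTimeMillis
            addToTheBeginning(count);
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = measure(10_000, 20, 50);
        long median = nanos[nanos.length / 2];
        System.out.println("min    = " + nanos[0] / 1_000 + " us");
        System.out.println("median = " + median / 1_000 + " us");
        System.out.println("max    = " + nanos[nanos.length - 1] / 1_000 + " us");
    }
}
```

The spread between min and max gives a rough idea of how noisy the machine is; the median is usually more stable than the mean when a few runs are disturbed by garbage collection or other processes.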

Related

How can I make this code more efficient? loops and Big Data

I only learned to code a few months ago, and this project is heavy for what I know so far, so any help making this code run more efficiently would be appreciated.
I'm trying to make this code more efficient because it took 20 hours to process a 30 MB file, and I want to process 6.5 GB worth of files. I need it to process a file in 30 minutes at most ... is that possible?
What I'm doing in the code is:
I extract a word and check if its ID is stored in my hashmap
I get all the parents of this word and add them to a list
In each item on the list I get the ID and Word and other parents
I create a node and add it to the hashmap
Then move on to the next word
P.S. I do not know how to write Hadoop MapReduce code. I know it's the obvious solution ... but I don't have the time to learn it.
UPDATE!!
As the profiler screenshot shows, 99.7% of the time was spent in "getInstance", which fetches the word from the WordNet dictionary (the library I'm using is extJWNL). "getResourceInstance" is the method that calls the dictionary itself, and the third highlighted entry is my method that calls these methods (only 0.001% of the time is actually spent in the rest of the method).
I am not sure whether this issue is solvable. Do you have any ideas?
static HashMap<Long, Node> graph = new HashMap <Long, Node> ();
private static void demonstrateTree (IndexWord word) throws JWNLException {
Long Os = word.getSenses().get(0).getOffset();
if (graph.containsKey(Os)) {
return;
}
PointerTargetTree hypernyms = PointerUtils.getHypernymTree(word.getSenses().get(0));
List<PointerTargetNodeList> hypernymsList = hypernyms.toList();
for(int c=0;c<hypernymsList.size();c++){
PointerTargetNodeList l = hypernymsList.get(c);
for(int j = l.size()-1; j >= 0 ; j--) {
Long tempid = l.get(j).getPointerTarget().getSynset().getOffset();
String tempword = l.get(j).getPointerTarget().getSynset().getWords().get(0).getLemma();
Node n = new Node(tempid, tempword, new ArrayList<Node>());
if (!graph.containsKey(tempid)) {
n.id = tempid;
n.word = tempword;
if (!(j == l.size()-1)){
n.parents.add(graph.get(l.get(j+1).getPointerTarget().getSynset().getOffset()));
}
graph.put(tempid, n);
}
}
}
}
public static void demonstrateListHelper(String text) throws JWNLException {
String lineText =text.split("\t")[2];
String [] singleWord = lineText.split("\\s+");
for (int k=0; k <singleWord.length; k++){
singleWord[k] = singleWord[k].replaceAll("[^\\w]", "");
IndexWordSet set = Dictionary.getDefaultResourceInstance().lookupAllIndexWords(singleWord[k]);
for (IndexWord word:set.getIndexWordArray()) {
demonstrateTree(word);
}
}
}
public static void generateHierarchy() {
Set<Entry<Long, Node>> iterator = graph.entrySet();
int i =0;
for(Entry<Long,Node> e : iterator) {
System.out.println(i++ +" - " +e.getValue().firstParents());
}
}
@SuppressWarnings({ "resource" })
public static void main(String[] args) throws JWNLException {
File file = new File("C:/Users/D060891/Desktop/Thesis/sentencesNYT/part-m-00001");
try {
BufferedReader input = new BufferedReader(new FileReader(file));
String line;
while ((line = input.readLine()) != null) {
demonstrateListHelper(line);
}
generateHierarchy();
}
catch (IOException e) {
e.printStackTrace();
}
}
The first rule of performance optimization is to not stare at code or guess but measure runtime behaviour. So fire up a profiler and see where your program spends the time (or the memory).
A good start would be to profile your code with VisualVM, which was bundled with the JDK up to version 8 and is available as a separate download for later versions.
Update:
You now have identified the bottleneck:
Dictionary.getDefaultResourceInstance()
Looking at the source code, a WordNet dictionary is loaded from an XML document every time you call that method. So simply move the bottleneck out of the loop and get the Dictionary once at the beginning. Define a class variable
private static Dictionary dictionary;
initialize at the beginning, e.g. in main
dictionary = Dictionary.getDefaultResourceInstance();
and then use it later
dictionary.lookupAllIndexWords(singleWord[k]);
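To make the effect of this change concrete, here is a small self-contained sketch. It does not use the real library: `SlowDictionary` is a made-up stand-in whose `load()` method only counts how often it is called, and the word list is invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a resource that is expensive to obtain.
class SlowDictionary {
    static int loads = 0; // counts how often the expensive load ran

    static SlowDictionary load() {
        loads++; // imagine parsing a large XML file here
        return new SlowDictionary();
    }

    int lookup(String word) {
        return word.length(); // placeholder for a real dictionary lookup
    }
}

class HoistLookupSketch {
    // Slow variant: obtains the dictionary again for every word.
    static int lookupAllReloading(List<String> words) {
        int total = 0;
        for (String w : words) {
            total += SlowDictionary.load().lookup(w);
        }
        return total;
    }

    // Fast variant: obtains the dictionary once and reuses it inside the loop.
    static int lookupAllReusing(List<String> words) {
        SlowDictionary dictionary = SlowDictionary.load();
        int total = 0;
        for (String w : words) {
            total += dictionary.lookup(w);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("dog", "cat", "horse");
        SlowDictionary.loads = 0;
        lookupAllReloading(words);
        System.out.println("loads when reloading: " + SlowDictionary.loads); // 3
        SlowDictionary.loads = 0;
        lookupAllReusing(words);
        System.out.println("loads when reusing:   " + SlowDictionary.loads); // 1
    }
}
```

Both variants return the same result; only the number of expensive loads differs, and it grows with the number of words in the first variant.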

Benchmarking in Java (comparing two classes)

I want to compare the speed of two classes (StringBuilder and StringBuffer) using the append method.
I wrote a very simple program:
public class Test {
public static void main(String[] args) {
try {
test(new StringBuffer("")); // StringBuffer: 35117ms.
test(new StringBuilder("")); // StringBuilder: 3358ms.
} catch (IOException e) {
System.err.println(e.getMessage());
}
}
private static void test(Appendable obj) throws IOException {
long before = System.currentTimeMillis();
for (int i = 0; i++ < 1e9; ) {
obj.append("");
}
long after = System.currentTimeMillis();
System.out.println(obj.getClass().getSimpleName() + ": " +
(after - before) + "ms.");
}
}
But I know that this is a bad way to benchmark. I want to put annotations on the method or class, set the number of iterations, tests and different conditions, and get accurate results as output.
Please recommend a good library or standard Java tool for this. Additionally, if it's not too difficult, please show what a good benchmark would look like.
Thanks in advance!
JMH, the Java Microbenchmark Harness, lets you write and run reliable microbenchmarks. It uses annotations to express benchmark parameters.

Class.forName() caching

In one of my applications I am using the following:
public void calculate (String className)
{
...
Class clazz = Class.forName(className);
...
}
This function is called several times / second.
There are about 10 possible class names.
And while I do realize there is some internal caching inside this function,
I think this caching is only available at the native level.
For this reason I am starting to wonder whether I should add my own caching.
// initialized up front, so the first classMap.get() cannot hit a null map
private static Map<String,Class> classMap = new HashMap<String, Class>(40);
public void calculate (String className)
{
...
Class clazz = classMap.get(className);
if (clazz == null)
{
clazz = Class.forName(className);
classMap.put(className, clazz);
}
...
}
Will this be a performance gain or does it really make no difference ?
Thank you in advance
I wrote a little script to calculate the execution time of both functions.
This is the Main class that I used.
public class Benchmark
{
public static void main(String... pArgs)
{
// prepare all data as much as possible.
// we don't want to do this while the clock is running.
Class[] classes = {Object.class, Integer.class, String.class, Short.class, Long.class, Double.class,
Float.class, Boolean.class, Character.class, Byte.class};
int cycles = 1000000;
String[] classNames = new String[cycles];
for (int i = 0; i < cycles; i++)
{
classNames[i] = classes[i % classes.length].getName();
}
// THERE ARE 2 IMPLEMENTATIONS - CLASSIC vs CACHING
Implementation impl = new Caching(); // or Classic();
// Start the clocks !
long startTime = System.currentTimeMillis();
for (int i = 0; i < cycles; i++)
{
impl.doStuff(classNames[i]);
}
long endTime = System.currentTimeMillis();
// calculate and display result
long totalTime = endTime - startTime;
System.out.println(totalTime);
}
}
Here is the classic implementation that uses Class.forName
private interface Implementation
{
Class doStuff(String clzName);
}
private static class Classic implements Implementation
{
@Override
public Class doStuff(String clzName)
{
try
{
return Class.forName(clzName);
}
catch (Exception e)
{
return null;
}
}
}
Here is the second implementation that uses a HashMap to cache the Class objects.
private static class Caching implements Implementation
{
private Map<String, Class> cache = new HashMap<String, Class>();
@Override
public Class doStuff(String clzName)
{
Class clz = cache.get(clzName);
if (clz != null) return clz;
try
{
clz = Class.forName(clzName);
cache.put(clzName, clz);
}
catch (Exception e)
{
}
return clz;
}
}
The results:
1100 ms without caching.
only 15 ms with caching.
Conclusion:
Is it a significant difference --> yes !
Does it matter for my application --> not at all.
Will this be a performance gain or does it really make no difference?
I would be astonished if it made a significant difference - and if you're only calling it "several times per second" (rather than, say, a million) it's really not worth optimizing.
You should at least try this in isolation in a benchmark before committing to this more complicated design. I would strongly expect Class.forName to be caching this anyway, and adding more complexity into your app does no good.
Class.forName() does two things:
it fetches a loaded class from the classloader
if no such class is found, it tries to load it.
Part #1 is pretty quick. #2 is where the real work starts (where the JVM might hit the hard disk or even the network, depending on the classloader). And if you pass the same parameters in, then all but the first invocations will never get to step #2.
So no: it's probably not worth optimizing.
No, you shouldn't. Class.forName will not load the same class twice; it first tries to find the class among the classes that are already loaded. This is done at the native level and is supposed to be very efficient.
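If a cache is wanted anyway, despite the advice in the answers above, a sketch along the following lines avoids the null-map and thread-safety pitfalls of a hand-rolled static HashMap. The class and method names are made up for illustration, and wrapping ClassNotFoundException in an IllegalArgumentException is just one possible choice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not taken from any of the posts above.
class ClassCacheSketch {
    // ConcurrentHashMap allows safe lookups from several threads without extra locking.
    private static final Map<String, Class<?>> CACHE = new ConcurrentHashMap<>();

    // Returns the Class for the given name; Class.forName is only called on a cache miss.
    static Class<?> lookup(String className) {
        Class<?> cached = CACHE.get(className);
        if (cached != null) {
            return cached;
        }
        try {
            Class<?> loaded = Class.forName(className);
            CACHE.putIfAbsent(className, loaded);
            return loaded;
        } catch (ClassNotFoundException e) {
            // Assumption for this sketch: an unknown name is a caller error.
            throw new IllegalArgumentException("Unknown class: " + className, e);
        }
    }

    // Number of distinct class names cached so far.
    static int size() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println(lookup("java.lang.String") == String.class);
        System.out.println(lookup("java.lang.String") == lookup("java.lang.String"));
    }
}
```

With only about ten class names and a few calls per second, as described in the question, the difference is unlikely to be noticeable, so this is mainly worth considering when the lookup sits on a genuinely hot path.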

Problem with Priority Queue

I'm trying to use a priority queue in my code, and for some reason when I remove the objects, they aren't in order. Do you know what I'm doing wrong?
Here's my code:
The constructor:
recordedsong = new PriorityQueue<recordedNote>(50, new Comparator<recordedNote>()
{
public int compare(recordedNote n1, recordedNote n2)
{
long l = n1.rt()-n2.rt();
int i = (int)l;
return i;
}
});
where each recordedNote has a long value that is returned by the method rt().
But when I call
while (!Song.isEmpty())
{
recordedNote temp = (recordedNote)Song.remove();
and then print temp.rt() for each one, all the numbers are out of order. And not just in reverse order, but all over the place, like 1103, 0, 500, 0, 220.
Can you see anything wrong with my constructor?
Thanks!
remove should work, and in fact it does work fine in a small example program that I created to help answer this question:
import java.util.Comparator;
import java.util.PriorityQueue;
public class TestPriorityQueue {
public static void main(String[] args) {
long[] noteTimes = {1103L, 0L, 500L, 0L, 220L, 1021212812012L};
PriorityQueue<RecordedNote> noteQueue = new PriorityQueue<RecordedNote>(10,
new Comparator<RecordedNote>() {
@Override
public int compare(RecordedNote o1, RecordedNote o2) {
Long time1 = o1.getTime();
Long time2 = o2.getTime();
// uses Long's built in compareTo method, so we
//don't have to worry as much about edge cases.
return time1.compareTo(time2);
}
});
for (int i = 0; i < noteTimes.length; i++) {
RecordedNote note = new RecordedNote(noteTimes[i]);
System.out.println(note);
noteQueue.add(note);
}
System.out.println();
while (noteQueue.size() > 0) {
System.out.println(noteQueue.remove());
}
}
}
class RecordedNote {
private long time;
public RecordedNote(long time) {
this.time = time;
}
public long getTime() {
return time;
}
@Override
public String toString() {
return "[Time: " + time + "]";
}
}
So this raises the question: why isn't it working for you? I don't see enough coherent code in your question to be able to answer that. We can't tell what Song is, since it isn't declared as a class or a variable anywhere, and I also don't see where you use your PriorityQueue variable, recordedsong. So I suggest you do the same thing I did: create a small, compilable, runnable program that we can run and modify and that demonstrates your problem, i.e. an SSCCE (http://sscce.org).
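One edge case that the comment about Long's compareTo hints at is truncation: subtracting two long values and casting the difference to int can flip the sign or collapse it to 0 once the difference no longer fits in 32 bits. Whether that is what happened in the question cannot be determined from the posted code; the sketch below, with made-up class name and sample values, only shows the effect.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical demo class; the sample values are chosen only to trigger the truncation.
class ComparatorOverflowSketch {
    // Comparator in the style of the question: subtract, then cast the long to int.
    static final Comparator<Long> SUBTRACT_AND_CAST = (a, b) -> (int) (a - b);
    // Comparator that compares the long values directly.
    static final Comparator<Long> SAFE = Long::compare;

    // Fills a queue that uses the given comparator, then returns the removal order.
    static long[] drain(Comparator<Long> cmp, long... values) {
        PriorityQueue<Long> queue = new PriorityQueue<>(10, cmp);
        for (long v : values) {
            queue.add(v);
        }
        long[] out = new long[values.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = queue.remove();
        }
        return out;
    }

    public static void main(String[] args) {
        long big = 1L << 31; // 2147483648, one more than Integer.MAX_VALUE
        // The cast turns a positive difference into a negative int.
        System.out.println(SUBTRACT_AND_CAST.compare(big, 0L) < 0); // true
        System.out.println(SAFE.compare(big, 0L) > 0);              // true
        System.out.println(Arrays.toString(drain(SAFE, 1103L, 0L, big, 500L)));
        // [0, 500, 1103, 2147483648]
    }
}
```

With small values such as 1103, 0 and 500 both comparators agree, so the cast only becomes a problem when time differences exceed the int range.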
I suspect i can end up as 0. Try changing the compare method so that it returns a positive value in that case instead of the raw result.
Reading the API docs for PriorityQueue, it states the following:
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
My guess is that remove() is not obligated to follow the natural ordering, either.

confirming program flow

Can someone tell me whether the code below would work?
class CriticalSection{
int iProcessId, iCounter=0;
public static boolean[] freq = new boolean[Global.iParameter[2]];
int busy;
//constructors
CriticalSection(){}
CriticalSection(int iPid){
this.iProcessId = iPid;
}
int freqAvailable(){
for(int i=0; i<Global.iParameter[2]; i++){
if(freq[i]==true){
//this means that there is no frequency available and the request will be dropped
iCounter++;
}
}
if(iCounter == freq.length)
return 3;
BaseStaInstance.iNumReq++;
return enterCritical();
}
int enterCritical(){
int busy=0;
for(int i=0; i<Global.iParameter[2]; i++){
if(freq[i]==true){
freq[i] = false;
break;
}
}
//implement a thread that will execute the critical section simultaneously as the (contd down)
//basestation leaves it critical section and then generates another request
UseFrequency freqInUse = new UseFrequency;
busy = freqInUse.start(i);
//returns control back to the main program
return 1;
}
}
class UseFrequency extends Thread {
int iFrequency=0;
UseFrequency(int i){
this.iFrequency = i;
}
//this class just allows the frequency to be used in parallel as the other basestations carry on making requests
public void run() {
try {
sleep(((int) (Math.random() * (Global.iParameter[5] - Global.iParameter[4] + 1) ) + Global.iParameter[4])*1000);
} catch (InterruptedException e) { }
}
CriticalSection.freq[iFrequency] = true;
stop();
}
No, this code will not even compile. For example, your "UseFrequency" class has a constructor and a run() method, but then you have two lines, CriticalSection.freq[iFrequency] = true; and stop();, that aren't in any method body - they are just sitting there on their own.
Even if you get the code to compile, it still will not work the way you expect, because you have multiple threads and no concurrency control. That means the different threads can "step on each other" and corrupt shared data, like your "freq" array. Any time multiple threads access shared mutable variables, you need to protect that access, for example with a synchronized block. The Java Tutorial on concurrency explains this: http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html
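A minimal sketch of what protecting the shared array could look like. It uses its own convention (true means "taken") and made-up class and method names, so it is not a drop-in fix for the code in the question; it only shows the find-and-mark step being done under a lock.

```java
// Hypothetical sketch: a shared pool of frequencies guarded by one lock.
class FrequencyPoolSketch {
    private final boolean[] inUse; // true = frequency currently taken (this sketch's own convention)

    FrequencyPoolSketch(int size) {
        inUse = new boolean[size];
    }

    // Finds a free frequency and marks it taken in one atomic step; returns -1 if none is free.
    synchronized int acquire() {
        for (int i = 0; i < inUse.length; i++) {
            if (!inUse[i]) {
                inUse[i] = true;
                return i;
            }
        }
        return -1;
    }

    // Marks a frequency as free again.
    synchronized void release(int index) {
        inUse[index] = false;
    }

    public static void main(String[] args) throws InterruptedException {
        FrequencyPoolSketch pool = new FrequencyPoolSketch(2);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                int f = pool.acquire();
                if (f < 0) {
                    System.out.println("request dropped");
                    return;
                }
                try {
                    Thread.sleep(10); // pretend to use the frequency
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.release(f); // always give the frequency back
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
    }
}
```

Synchronized methods are used here instead of an explicit synchronized block; both lock on the same object, so two threads can never claim the same frequency. Letting each thread finish by returning from run() also removes the need for the deprecated Thread.stop().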
Have you tried compiling and testing it? Are you using an IDE like Eclipse? You can step through your program in the debugger to see what it's doing. As the question stands, no one can tell whether your program is doing the right or wrong thing, because the intended behaviour is specified neither in the code comments nor in the question itself.
