Google RateLimiter not Working for counter - java

I have a case of limiting calls to 100/s.
I am thinking of using Google Guava RateLimiter. I tested it like this:-
int cps = 100;
RateLimiter limiter = RateLimiter.create(cps);
for (int i = 0; i < 200; i++) {
    limiter.acquire();
    System.out.print("\rNumber of records processed = " + i+1);
}
But the code did not pause at 100 records to let 1 second complete. Am I doing something wrong?

The RateLimiter is working ok. The problem is that your output is buffered, because you are not flushing it each time. Usually, standard output is line-buffered. So if you had written
System.out.println("Number of records processed = " + (i+1));
you would have seen a pause at 100. However, what you have:
System.out.print("\rNumber of records processed = " + i+1);
has two problems. First, the "\r" is not taken as a new line and does not cause flushing; therefore the whole output is buffered and printed to the console in one go. Second, you need to put (i+1) in parentheses. What you have appends i to the string, and then appends 1 to the resulting string.
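Putting both fixes together, a minimal corrected loop (a sketch reusing the limiter from the question) would be:
for (int i = 0; i < 200; i++) {
    limiter.acquire();
    // println ends the line, so line-buffered stdout is flushed each time,
    // and (i + 1) is evaluated arithmetically before concatenation
    System.out.println("Number of records processed = " + (i + 1));
}
With this version the records appear as they are processed, and the 200 acquisitions take about two seconds at 100 permits per second instead of everything printing in one go.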

Besides @DodgyCodeException's suggestions regarding output flushing and the (i+1) concatenation, let's run this code to make sure you understand how RateLimiter works:
// Uses com.google.common.util.concurrent.RateLimiter,
// com.google.common.base.Stopwatch and java.util.concurrent.TimeUnit.
final double permitsPerSecond = 1;
RateLimiter limiter = RateLimiter.create(permitsPerSecond);
final Stopwatch stopwatch = Stopwatch.createStarted();
int i = 0;
for (; i < 2 * permitsPerSecond; i++) {
    limiter.acquire();
}
System.out.println("Elapsed = " + stopwatch.stop().elapsed(TimeUnit.MILLISECONDS) + "ms");
System.out.println("Number of records processed = " + i);
(Note that I set the number of tries to twice permitsPerSecond.) When you set permitsPerSecond to 1, you'll see:
Elapsed = 1001ms
Number of records processed = 2
For permitsPerSecond = 10 and permitsPerSecond = 100, the elapsed time approaches (in the mathematical sense) the 2s limit, because the 11th or 101st acquire has to wait for the rate set in the RateLimiter:
Elapsed = 1902ms
Number of records processed = 20
and
Elapsed = 1991ms
Number of records processed = 200

Very slow iteration on Chronicle Map

I'm seeing very slow times iterating over a Chronicle Map: in the example below, 93 ms per iteration over 1M entries on my 2013 MacBook Pro. I'm wondering if there's a better way to iterate, or something I'm doing wrong, or if this is expected? I know Chronicle Map isn't optimized for iterating, but this ticket from a few years ago made me expect much faster iteration times. Toy example below:
public static void main(String[] args) throws Exception {
    int numEntries = 1_000_000;
    int numIterations = 1_000;
    int avgEntrySize = BitUtil.SIZE_OF_LONG + BitUtil.SIZE_OF_INT;
    ChronicleMap<IntValue, ByteBuffer> map = ChronicleMap.of(IntValue.class, ByteBuffer.class)
            .name("test").entries(numEntries).averageValueSize(avgEntrySize)
            .putReturnsNull(true).create();
    IntValue value = Values.newHeapInstance(IntValue.class);
    ByteBuffer buffer = ByteBuffer.allocate(avgEntrySize);
    for (int i = 0; i < numEntries; i++) {
        value.setValue(i);
        buffer.clear();
        buffer.putLong(i);
        buffer.putInt(i);
        buffer.flip();
        map.put(value, buffer);
    }
    System.out.println("Finished insertion");
    for (int i = 0; i < numIterations; i++) {
        map.forEachEntry(entry -> {
            Data<ByteBuffer> data = entry.value();
            ByteBuffer val = data.get();
        });
    }
    System.out.println("Finished priming");
    long start = System.currentTimeMillis();
    for (int i = 0; i < numIterations; i++) {
        map.forEachEntry(entry -> {
            Data<ByteBuffer> data = entry.value();
            ByteBuffer val = data.get();
        });
    }
    System.out.println(
            "Elapsed: " + (System.currentTimeMillis() - start) + " for " + numIterations
                    + " iterations");
}
Output:
Finished insertion
Finished priming
Elapsed: 93327 for 1000 iterations
Your results (93 milliseconds per 1 million keys) exactly match the result of the benchmark here: http://jetbrains.github.io/xodus/#benchmarks, so they're in the expected ballpark. 93 ms per 1M keys is 93 ns per key; "very slow" compared to what? Your map contains 16 MB of payload and its total off-heap size is ~30 MB (FYI, you can check that with map.offHeapMemoryUsed()), which is much more than the L3 cache in consumer laptops, so iteration speed is bound by the latency of main memory. Chronicle Map's iteration is mostly not sequential, so memory prefetching doesn't help. I've created an issue about this.
Also several notes about your code:
In your case the value size of the map is constant, so you should use constantValueSizeBySample(ByteBuffer.allocate(12)) instead of averageValueSize() (see the sketch after these notes). Even if the map value size weren't constant, it's preferred to use averageValue() instead of averageValueSize(), because you cannot be sure how many bytes the serializers use for the values.
Your value seems to be a good use case for a value interface with two fields. Moreover, you already use a value interface as the key type: IntValue.
Do your benchmarks using JMH.
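A minimal sketch of the first suggestion, using the builder call named above with the rest of the configuration copied from the question:
// Constant 12-byte values (long + int): give the builder a sample
// of the exact size instead of averageValueSize().
ChronicleMap<IntValue, ByteBuffer> map = ChronicleMap.of(IntValue.class, ByteBuffer.class)
        .name("test")
        .entries(numEntries)
        .constantValueSizeBySample(ByteBuffer.allocate(12))
        .putReturnsNull(true)
        .create();
And a hypothetical value interface for the long + int payload, in the same style as the IntValue interface already used for the key (the field names are illustrative):
public interface LongIntValue {
    long getLongField();
    void setLongField(long v);

    int getIntField();
    void setIntField(int v);
}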

I can't identify the issue with my parallel run timer

I have a program that applies a median filter to an array of over 2 million values.
I'm trying to compare run times for sequential vs. parallel on the same dataset. So when I execute the program, it does 20 runs; every run is timed, and the average of the 20 times is printed to the console.
ArrayList<Double> times = new ArrayList<>(20); // to calculate average run time
for (int run = 1; run < 21; run++) // algorithm will run 20 times
{
    long startTime = System.nanoTime();
    switch (method)
    {
        case 1: // Sequential
            filt.seqFilter();
            break;
        case 2: // ForkJoin framework
            pool.invoke(filt); // pool is a ForkJoinPool
            break;
    }
    Double timeElapsed = (System.nanoTime() - startTime) / 1000000.0;
    times.add(run - 1, timeElapsed);
    System.out.println("Run " + run + ": " + timeElapsed + " milliseconds.");
}
times.remove(Collections.max(times)); // there's always a slow outlier
double timesSum = 0;
for (Double e : times)
{
    timesSum += e;
}
double average = timesSum / 19;
System.out.println("Runtime: " + average);
filt is of type FilterObject, which extends RecursiveAction. My overridden compute() method in FilterObject looks like this:
public void compute()
{
    if (hi - lo <= SEQUENTIAL_THRESHOLD)
    {
        seqFilter();
    }
    else
    {
        FilterObject left = new FilterObject(lo, (hi + lo) / 2);
        FilterObject right = new FilterObject((hi + lo) / 2, hi);
        left.fork();
        right.compute();
        left.join();
    }
}
seqFilter() processes the values between the lo and hi indices in the starting array and adds the processed values to a final array in the same positions. That's why there is no merging of arrays after left.join().
My run times for this are insanely fast for parallel - so fast that I think there must be something wrong with my timer OR with my left.join() statement. I'm getting average times of around 170 milliseconds for sequential with a filtering window of size 3 and 0.004 milliseconds for parallel. Why am I getting these values? I'm especially concerned that my join() is in the wrong place.
If you'd like to see my entire code, with all the classes and some input files, follow this link.
After some testing of your code I found the reason. It turns out that a ForkJoinPool runs a given task instance only once; subsequent invoke() calls with the same task instance return immediately. So you have to re-instantiate the task for every run.
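A minimal sketch of that fix, assuming the (lo, hi) constructor of FilterObject seen in your compute() method (arraySize is a placeholder for the real upper bound):
for (int run = 1; run < 21; run++)
{
    // A ForkJoinTask instance can only be invoked once,
    // so create a fresh task for each run.
    FilterObject filt = new FilterObject(0, arraySize);
    long startTime = System.nanoTime();
    pool.invoke(filt);
    double timeElapsed = (System.nanoTime() - startTime) / 1000000.0;
    times.add(timeElapsed);
    System.out.println("Run " + run + ": " + timeElapsed + " milliseconds.");
}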
Another problem is with the parallel (standard threads) run: you are starting the threads but never waiting for them to finish before measuring the time. I think you could use a CyclicBarrier here.
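For example, a sketch of the barrier idea (the thread count and the per-thread work are assumptions for illustration):
import java.util.concurrent.CyclicBarrier;

public class BarrierTiming
{
    public static void main(String[] args) throws Exception
    {
        int numThreads = 4;
        // One extra party for the main thread, which also awaits the barrier
        CyclicBarrier barrier = new CyclicBarrier(numThreads + 1);
        long startTime = System.nanoTime();
        for (int t = 0; t < numThreads; t++)
        {
            new Thread(() -> {
                // ... per-thread filtering work would go here ...
                try
                {
                    barrier.await(); // signal that this worker is done
                }
                catch (Exception e)
                {
                    throw new RuntimeException(e);
                }
            }).start();
        }
        barrier.await(); // main thread blocks until all workers arrive
        double elapsedMs = (System.nanoTime() - startTime) / 1000000.0;
        System.out.println("Elapsed: " + elapsedMs + " milliseconds.");
    }
}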
With the mentioned fixes I get roughly the same time for ForkJoin and standard threads. And it's three times faster than sequential. Seems reasonable.
P.S. You are doing a micro-benchmark. It may be useful to read answers to that question to improve your benchmark accuracy: How do I write a correct micro-benchmark in Java?

Simple concurrent Java threads--capture begin and end

Am I correctly implementing these Java threads? The goal is to have ten concurrent threads that each compute a sum from 1 to (an upper bound of 22 + i). I'm trying to identify each thread's name and print it when the thread runs, then print the result when the thread exits. Currently, all of the results print at the same time in a random order, and I'm not sure whether I'm correctly capturing the information when each thread begins and ends.
public class threads {
    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            final int iCopy = i;
            new Thread("" + i) {
                public void run() {
                    int sum = 0;
                    int upperBound = 22;
                    int lowerBound = 1;
                    long threadID = Thread.currentThread().getId();
                    for (int number = lowerBound; number <= upperBound; number++) {
                        sum = sum + number + iCopy;
                    }
                    System.out.println(threadID + " thread is running now, and I will compute the sum from 1 to " + (upperBound + iCopy) + ". The i is: " + iCopy);
                    System.out.println("Thread id #" + threadID + ", the " + sum + " is done by the thread.");
                }
            }.start();
        }
    }
}
I have executed your code and observed that all threads (10 in this case) are running properly. Since the threads are scheduled in a nondeterministic order, the output appears random, but I am sure all the threads run fine and execute the functionality you require.
In the output you might expect the loop values to appear in order from 0 to 9, but even this is random, perhaps because some threads yield while executing and give way to other threads.
Hope this helps.
The order the threads run in will depend entirely on the JVM being used and the underlying resources.
If you have several cores (CPUs) available, your code may run completely differently than it would on a single core.
Essentially, your main loop runs to completion on a single thread, firing off 10 new threads and queueing their start methods for the scheduler. Other cores may start running those threads. Each extra thread adds a different total load, so each runs slightly differently (performance-wise), meaning they run faster or slower and finish at different times.
Your code demonstrates this very well.
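If you want to capture the overall end deterministically, one option (a sketch, not the only way) is to keep references to the threads and join them all; the workers stay concurrent, but the main thread waits for completion:
public static void main(String[] args) throws InterruptedException {
    Thread[] workers = new Thread[10];
    for (int i = 0; i < 10; i++) {
        final int iCopy = i;
        workers[i] = new Thread("" + i) {
            public void run() {
                // ... same computation as in the question ...
            }
        };
        workers[i].start();
    }
    for (Thread t : workers) {
        t.join(); // wait for each worker to finish
    }
    System.out.println("All threads are done.");
}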

Why is Collections.synchronizedSet(HashSet) faster than HashSet for addAll, retainAll, and contains?

I ran a test to find the best concurrent Set implementation for my program, with a non-synchronized HashSet as a control, and ran into an interesting result: the addAll, retainAll, and contains operations for a Collections.synchronizedSet(HashSet) appear to be faster than those of a regular HashSet. My understanding is that a SynchronizedSet(HashSet) should never be faster than a HashSet because it consists of a HashSet with synchronization locks. I've run the test quite a few times now, with similar results. Am I doing something wrong?
Relevant results:
Testing set: HashSet
Add: 17.467758 ms
Retain: 28.865039 ms
Contains: 22.18998 ms
Total: 68.522777 ms
--
Testing set: SynchronizedSet
Add: 17.54269 ms
Retain: 20.173502 ms
Contains: 19.618188 ms
Total: 57.33438 ms
Relevant code:
public class SetPerformance {
    static Set<Long> source1 = new HashSet<>();
    static Set<Long> source2 = new HashSet<>();
    static Random rand = new Random();

    public static void main(String[] args) {
        Set<Long> control = new HashSet<>();
        Set<Long> synch = Collections.synchronizedSet(new HashSet<Long>());
        // populate sets to draw values from
        System.out.println("Populating source");
        for (int i = 0; i < 100000; i++) {
            source1.add(rand.nextLong());
            source2.add(rand.nextLong());
        }
        // populate sets with initial values
        System.out.println("Populating test sets");
        control.addAll(source1);
        synch.addAll(source1);
        testSet(control);
        testSet(synch);
    }

    public static void testSet(Set<Long> set) {
        System.out.println("--\nTesting set: " + set.getClass().getSimpleName());
        long start = System.nanoTime();
        set.addAll(source1);
        long add = System.nanoTime();
        set.retainAll(source1);
        long retain = System.nanoTime();
        boolean test;
        for (int i = 0; i < 100000; i++) {
            test = set.contains(rand.nextLong());
        }
        long contains = System.nanoTime();
        System.out.println("Add: " + (add - start) / 1000000.0 + " ms");
        System.out.println("Retain: " + (retain - add) / 1000000.0 + " ms");
        System.out.println("Contains: " + (contains - retain) / 1000000.0 + " ms");
        System.out.println("Total: " + (contains - start) / 1000000.0 + " ms");
    }
}
You aren't warming up the JVM.
Note that you run the HashSet test first.
I changed your program slightly to run the test in a loop 5 times. SynchronizedSet was faster, on my machine, in only the first test.
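A minimal sketch of that change, reusing the testSet method from the question:
// Run both tests several times so the JIT has compiled the hot paths
// before the timings you actually compare.
for (int round = 0; round < 5; round++) {
    testSet(control);
    testSet(synch);
}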
Then, I tried reversing the order of the two tests, and only running the test once. HashSet won again.
Read more about this here: How do I write a correct micro-benchmark in Java?
Additionally, check out Google Caliper for a framework that handles all these microbenchmarking issues.
Yes.
Try running the synchronized set before the regular one and you will get your "needed" results.
I reckon this has to do with JVM warm-up and nothing else.
Try warming up the VM with some computations and then run the benchmark, or run it a few times in a mixed order.

How do apps measure CPU usage (as a %)?

So I'm trying to write an app that measures CPU usage (i.e., the time the CPU is working vs. the time it isn't). I've done some research, but unfortunately there are a bunch of different opinions on how it should be done.
These different solutions include, but aren't limited to:
Get Memory Usage in Android
and
http://juliano.info/en/Blog:Memory_Leak/Understanding_the_Linux_load_average
I've tried writing some code myself that I thought might do the trick, because the links above don't take into consideration when the core is off (or do they?):
long[][] cpuUseVal = {{2147483647, 0}, {2147483647, 0}, {2147483647, 0},
        {2147483647, 0}, {2147483647, 0}};

public float[] readCPUUsage(int coreNum) {
    int j = 1;
    String[] entries; // Array to hold entries in the /proc/stat file
    int cpu_work;
    float percents[] = new float[5];
    Calendar c = Calendar.getInstance();
    // Write the dataPackage
    long currentTime = c.getTime().getTime();
    for (int i = 0; i <= coreNum; i++) {
        try {
            // Point the app to the file where CPU values are located
            RandomAccessFile reader = new RandomAccessFile("/proc/stat", "r");
            String load = reader.readLine();
            while (j <= i) {
                load = reader.readLine();
                j++;
            }
            // Reset j for use later in the loop
            j = 1;
            entries = load.split("[ ]+");
            // Pull the CPU working time from the file
            cpu_work = Integer.parseInt(entries[1]) + Integer.parseInt(entries[2]) + Integer.parseInt(entries[3])
                    + Integer.parseInt(entries[6]) + Integer.parseInt(entries[6]) + Integer.parseInt(entries[7]);
            reader.close();
            percents[i] = (float) (cpu_work - cpuUseVal[i][1]) / (currentTime - cpuUseVal[i][0]);
            cpuUseVal[i][0] = currentTime;
            cpuUseVal[i][1] = cpu_work;
            // In case of an error, print a stack trace
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    // Return the array holding the usage values for the CPU, and all cores
    return percents;
}
So here is the idea of the code I wrote: I have a global array with some dummy values that should produce negative percentages the first time the function runs. The values are stored in a database, so I know to disregard anything negative. Anyway, each time the function runs it reads how long the CPU has spent doing certain things and compares that to the values from the previous run (with the help of the global array). The differences are divided by the amount of time that has passed between the runs (with the help of the Calendar).
I've downloaded some existing CPU usage monitors and compared them to the values I get from my app, and mine are never even close to what they report. Can someone explain what I'm doing wrong?
Thanks to some help, I have changed my function to look like the following. Hope this helps others who have this question.
// Function to read values from /proc/stat and do computations to compute CPU %
public float[] readCPUUsage(int coreNum) {
    int j = 1;
    String[] entries;
    int cpu_total;
    int cpu_work;
    float percents[] = new float[5];
    for (int i = 0; i <= coreNum; i++) {
        try {
            // Point the app to the file where CPU values are located
            RandomAccessFile reader = new RandomAccessFile("/proc/stat", "r");
            String load = reader.readLine();
            // Loop to read down to the line that corresponds to the core
            // whose values we are trying to read
            while (j <= i) {
                load = reader.readLine();
                j++;
            }
            // Reset j for use later in the loop
            j = 1;
            // Break the line into separate array elements. The end of each
            // element is determined by any number of spaces
            entries = load.split("[ ]+");
            // Pull the CPU total time on and "working time" from the file
            cpu_total = Integer.parseInt(entries[1])
                    + Integer.parseInt(entries[2])
                    + Integer.parseInt(entries[3])
                    + Integer.parseInt(entries[4])
                    + Integer.parseInt(entries[5])
                    + Integer.parseInt(entries[6])
                    + Integer.parseInt(entries[7]);
            cpu_work = Integer.parseInt(entries[1])
                    + Integer.parseInt(entries[2])
                    + Integer.parseInt(entries[3])
                    + Integer.parseInt(entries[6])
                    + Integer.parseInt(entries[7]);
            reader.close();
            // If it was off the whole time, say 0
            if ((cpu_total - cpuUseVal[i][0]) == 0)
                percents[i] = 0;
            // If it was on for any amount of time, compute the %
            else
                percents[i] = (float) (cpu_work - cpuUseVal[i][1])
                        / (cpu_total - cpuUseVal[i][0]);
            // Save the values measured for future comparison
            cpuUseVal[i][0] = cpu_total;
            cpuUseVal[i][1] = cpu_work;
            // In case of an error, print a stack trace
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    // Return the array holding the usage values for the CPU, and all cores
    return percents;
}
Apps don't measure CPU usage; the kernel does, by interrupting the process 100 times per second (or at some other frequency, depending on how the kernel is tuned) and incrementing a counter that corresponds to what it was doing when interrupted:
If in the process => increment the user counter.
If in the kernel => increment the system counter.
If waiting for disk, network, or a device => increment the waiting-for-IO counter.
Otherwise, increment the idle counter.
The load average (the numbers reported by uptime) is determined by the decaying average length of the run queue, i.e., how many threads are waiting to run. The first number is the average over the last minute. You can get the load average via JMX.
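For example, a minimal sketch of reading it through the standard management bean:
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class LoadAverage {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // 1-minute system load average, or a negative value if unavailable
        double loadAvg = os.getSystemLoadAverage();
        System.out.println("Load average (1 min): " + loadAvg);
    }
}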
