I created a small app (as a project, with no plans to release it). Part of the code generates a grid of random numbers and displays the corresponding sprite from a list for each number. It behaved as expected, at least on my Kindle Fire (and one other device I tried it on).
The emulator, however, is a different story. I think the emulated device was a generic Samsung (4, maybe). The same code, when run on the emulator, fills about half the grid with one sprite, and half with a different sprite.
The grid might look like this on the emulator:
11177
11177
11777
11777
11777
instead of this on real devices:
64251
26253
87635
38415
28167
The relevant part of my code (yes, I should move new Random() elsewhere):
import java.util.Random;
// ... ... ...
for (int i = 0; i < GRID; i++) {
    for (int j = 0; j < GRID; j++) {
        paint.setColor(Color.parseColor("#FBB117"));
        Random rand = new Random();
        int num = rand.nextInt(8);
        canvas.drawBitmap(bmp, frames[num], calcGridSpot(i, j), paint);
        // ... eventually closing braces:
    }
}
I have had issues in the past, in Java, with Random deciding to be less random (I believe it might be due to optimizations).
Why is the emulator behaving less randomly? (And how do I fix it?)
I suspect your "yes, I should move new Random() elsewhere" remark is more accurate than you think.
You can find the source for Android's java.util.Random class; note that the KitKat and Lollipop/Marshmallow versions are different.
KitKat:
public Random() {
    // Note: Using identityHashCode() to be hermetic wrt subclasses.
    setSeed(System.currentTimeMillis() + System.identityHashCode(this));
}
Marshmallow:
public Random() {
    // Note: Don't use identityHashCode(this) since that causes the monitor to
    // get inflated when we synchronize.
    setSeed(System.nanoTime() + seedBase);
    ++seedBase;
}
Note the change from millisecond-granularity time to nanosecond-granularity time. However, this change is only relevant if the emulated device actually has a clock that granular. Some emulators may have very coarse clocks, which means that System.nanoTime() could be returning the same value on every iteration of a reasonably fast loop.
You're creating a new random number generator on each loop iteration, seeding them with N, N+1, N+2, and so on. Because the seeds are nearly identical, most of the bits in the first number each generator outputs are identical.
If you pull the "new Random" out of the loop, not only will it be more efficient, it will allow the pseudo-random number generator to progress through its full sequence and give you better results.
Related
I have 2 strings in an array. I want there to be a 10% chance of one and 90% chance to select the other. Right now I am using:
Random random = new Random();
int x = random.nextInt(100 - 1) + 1;
if (x < 10) {
    string = stringArray(0);
} else {
    string = stringArray(1);
}
Is this the best way of accomplishing this or is there a better method?
I know it's typically a bad idea to submit a Stack Overflow response without code, but I really challenge this question of "the best way." People ask this all the time and, while there are established design patterns in software worth knowing, this question can almost always be answered by "it depends."
For example, your pattern looks fine (I might add some comments). You might get a minuscule performance increase by using 1 - 10 instead of 1 - 100, but the things you need to ask yourself are as follows:
If I get hit by a bus, is the person who is going to be working on the application going to know what I was trying to do?
If it isn't intuitive, I should write a comment. Then I should ask myself, "Can I change this code so that a comment isn't necessary?"
Is there an existing library that solves this problem? If so, is it FOSS approved (if applicable) / can I use it?
What is the size of this codebase eventually going to be? Am I making a full program with microservices, a DAO, DTO, Controller, View, and different layers for validation?
Is there an existing convention to solve my problem (either at my company or in general), or is it unique enough that I can take my own spin on it?
Does this follow the DRY principle?
I'm in (apparently) a very small camp on Stack Overflow that doesn't always believe in universal "bests" for solving code problems. Just remember, programming is only as hard as the problem you're trying to solve.
EDIT
Since people asked, I'd do it like this:
/*
 * @author DaveCat
 * @version 1.0
 * @since 2019-03-9
 * Convenience method that returns option A with 90% odds and option B with 10% odds.
 *
 */
public static String[] calculatesNinetyPercent()
{
    Random random = new Random();
    int x = random.nextInt(10) + 1; // uniform over 1..10
    // Option A (x in 1..9, i.e. 90%)
    if (x <= 9) {
        return stringArray(0);
    }
    else
    {
        // Option B (x == 10, i.e. 10%)
        return stringArray(1);
    }
}
As an aside, one of the common mistakes junior devs make in enterprise-level development is excessive comments. This has a Javadoc, which is probably overkill, but I'm assuming this is a convenience method you're using in a greater program.
Edit (again)
You guys keep confusing me. This is how you randomly generate a number between 2 given numbers in Java
One alternative is to use a random float value between 0..1 and comparing it to the probability of the event. If the random value is less than the probability, then the event occurs.
In this specific example, set x to a random float and compare it to 0.1
I like this method because it can be used for probabilities other than percent integers.
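A minimal, self-contained sketch of that approach (the class and helper names are just illustrative):

import java.util.Random;

public class WeightedPick {
    private static final Random RNG = new Random();

    // Returns true with the given probability in [0.0, 1.0].
    static boolean happens(float probability) {
        return RNG.nextFloat() < probability;
    }

    public static void main(String[] args) {
        String[] options = {"rare", "common"};
        // 10% chance of options[0], 90% chance of options[1]
        String picked = happens(0.1F) ? options[0] : options[1];
        System.out.println(picked);
    }
}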
I created a checkers game and I would like the computer to calculate the optimal move.
Here is what I've done so far:
public BoardS calcNextMove(BoardS bs)
{
    ArrayList<BoardS> options = calcPossibleOptions(bs);
    int max = -1;
    int temp;
    int bestMove = 0;
    for (int k = 0; k < options.size(); k++)
    {
        temp = calculateNextMove2(options.get(k));
        if (max < temp)
        {
            max = temp;
            bestMove = k;
        }
    }
    return options.get(bestMove);
}

public int calculateNextMove2(BoardS bs)
{
    int res = soWhoWon(bs);
    if (res == 2) // pc won (which is good, so we return 1)
        return 1;
    if (res == 1)
        return 0;
    ArrayList<BoardS> options = calcPossibleOptions(bs);
    int sum = 0;
    for (int k = 0; k < options.size(); k++)
    {
        sum += calculateNextMove2(options.get(k));
    }
    return sum;
}
I keep getting
Exception in thread "AWT-EventQueue-0" java.lang.StackOverflowError
calcPossibleOptions works well; it's a function that returns an array of all the possible options.
BoardS is a class that represents a game board.
I guess I have to make it more efficient - how?
The recursion in calculateNextMove2() runs too deep, and that is why you get a StackOverflowError. If you expect the search to run the game to the end (unless relatively close to an actual win), it may also take a very long time indeed. With chess, for example (which I have more experience with), an engine will go maybe 20 moves deep before you go with the best move it has found thus far. If you ran it from the start of a chess game, it could run for hundreds of years on current technology (and still not really be able to say what the winning first move is). Maybe just try 10 or 20 moves deep? (That would still beat most humans, and would still probably qualify as the "best move".) As with chess, you will have the difficulty of assessing a position as good or bad for this to work (often calculated as a combination of material advantage and positional advantage). That is the tricky part. Note: Project Chinook has done what you're looking to achieve - it has "defeated" the game of Checkers (no such thing exists yet for chess. EDIT: Magnus Carlsen has "defeated" the game for all intents and purposes :D).
see: Alpha-beta pruning
(it may help somewhat)
Also here is an (oldish) paper I saw on the subject:
Graham Owen 1997 Search Algorithms and Game Playing
(it may be useful)
Also note that returning the first move that results in a "win" is naive, because it may be a totally improbable line of moves where the opponent gains an advantage if they don't play into that very particular win - e.g. playing for Fool's Mate or Scholar's Mate in chess (quick, but very punishable).
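To make the depth cap concrete, here is a hedged sketch in terms of the poster's own calculateNextMove2(); evaluate() is a hypothetical heuristic (e.g. piece-count difference) that you would have to supply:

public int calculateNextMove2(BoardS bs, int depth)
{
    int res = soWhoWon(bs);
    if (res == 2) // pc won
        return 1;
    if (res == 1) // opponent won
        return 0;
    if (depth == 0) // stop searching; score the position heuristically
        return evaluate(bs);
    ArrayList<BoardS> options = calcPossibleOptions(bs);
    int sum = 0;
    for (int k = 0; k < options.size(); k++)
    {
        sum += calculateNextMove2(options.get(k), depth - 1);
    }
    return sum;
}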
You're going to need to find an alternative to minimax, even with alpha-beta pruning most likely, as the board is too large, generating too many possible moves and counter-moves. This leads to the stack overflow.
I was fairly surprised myself to see that building the full decision tree for Tic-Tac-Toe didn't overflow, but I'm afraid I don't know enough about AI planning, or another algorithm for these problems, to help you beyond that.
This was a question asked in one of the interviews that I recently attended.
As far as I know, a random number between two numbers can be generated as follows:
public static int rand(int low, int high) {
    return low + (int)(Math.random() * (high - low + 1));
}
But here I am using Math.random() to generate a random number between 0 and 1, and using that to help me generate a number between low and high. Is there any other way I can do this directly, without using external functions?
Typical pseudo-random number generators calculate new numbers based on previous ones, so in theory they are completely deterministic. The only randomness is guaranteed by providing a good seed (initialization of the random number generation algorithm). As long as the random numbers aren't very security critical (this would require "real" random numbers), such a recursive random number generator often satisfies the needs.
The recursive generation can be expressed without any "external" functions, once a seed was provided. There are a couple of algorithms solving this problem. A good example is the Linear Congruential Generator.
A pseudo-code implementation might look like the following:
long a = 25214903917L; // These values for a and c are the actual values found
long c = 11L;          // in the implementation of java.util.Random(), see link
long previous = 0L;

void rseed(long seed) {
    previous = seed;
}

long rand() {
    long r = a * previous + c;
    // Note: typically, one keeps only some of the bits of this value, see link
    previous = r;
    return r;
}
You still need to seed this generator with some initial value. This can be done by doing one of the following:
Using something like the current time (good in most non-security-critical cases like games)
Using hardware noise (good for security-critical randomness)
Using a constant number (good for debugging, since you get always the same sequence)
If you can't use any function and don't want to use a constant seed, and if you are using a language which allows this, you could also use some uninitialized memory. In C and C++ for example, define a new variable, don't assign something to it and use its value to seed the generator. But note that this is far from being a "good seed" and only a hack to fulfill your requirements. Never use this in real code.
Note that there is no algorithm which can generate different values for different runs with the same inputs without access to some external sources like the system environment. Every well-seeded random number generator makes use of some external sources.
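For comparison, java.util.Random itself is an LCG of exactly the kind sketched above: it keeps 48 bits of state and returns only the high-order bits of each step. A self-contained Java rendering of that idea (same constants, same bit selection):

public class Lcg48 {
    private static final long A = 25214903917L; // 0x5DEECE66DL
    private static final long C = 11L;
    private static final long MASK = (1L << 48) - 1;
    private long state;

    Lcg48(long seed) {
        state = (seed ^ A) & MASK; // the same seed scrambling java.util.Random applies
    }

    // Advance the LCG and return the top `bits` bits, as java.util.Random does.
    int next(int bits) {
        state = (A * state + C) & MASK;
        return (int) (state >>> (48 - bits));
    }

    public static void main(String[] args) {
        Lcg48 r = new Lcg48(42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(r.next(32)); // matches new java.util.Random(42).nextInt()
        }
    }
}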
Here are some sources, with comments, that you may find helpful:
System time: monotonic over the course of a day, so poor randomness, but fast and easy.
Mouse position: random, but not useful on a standalone system.
Raw socket / local network (the info part of packets): good randomness, but technical and time-consuming, and an attack could be modeled to reduce the randomness.
Some input text with permutation: fast, a common way, and good too (in my opinion).
Timing of interrupts from the keyboard, disk drive, and other events: a common way, but error-prone if not used carefully.
Another approach is to feed in an analog noise signal: for example, temperature.
/proc file data: on Linux systems. I feel you should use this.
/proc/sys/kernel/random:
This directory contains various parameters controlling the operation of the file /dev/random.
The character special files /dev/random and /dev/urandom (present since Linux 1.3.30) provide an interface to the kernel's random number generator.
Try these commands:
$cat /dev/urandom
and
$cat /dev/random
You can write a file-read function that reads from these files.
See also: Is a rand from /dev/urandom secure for a login key?
Does System.currentTimeMillis() count as external? You could always get this and calculate mod by some max value:
int rand = (int)(System.currentTimeMillis()%high)+low;
You can get near-randomness (actually chaotic and definitely not uniform*) from the logistic map x = 4x(1-x), starting with a "non-rational" x between 0 and 1.
The "randomness" appears because of the rounding errors at the edge of the accuracy of the floating point representation.
(*)You can undo the skewing once you know it is there.
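A tiny sketch of that logistic-map trick (purely illustrative; the output is chaotic and skewed toward the ends of [0, 1], not uniform):

public class LogisticMap {
    public static void main(String[] args) {
        double x = 0.31415926535; // any non-special starting point in (0, 1)
        for (int i = 0; i < 10; i++) {
            x = 4.0 * x * (1.0 - x); // the logistic map at r = 4 is chaotic
            System.out.println(x);
        }
    }
}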
You may use the address of a variable, or combine the addresses of several variables to make a more complex one...
You could get the current system time, but that would also require a function in most languages.
You can do it without external functions if you are allowed to use some external state (e.g. a long initialised with the current system time). This is enough for you to implement a simple pseudo-random number generator.
In each call to your random function, you would use the state to create a new random value, and update the state, so that subsequent calls get different results.
You can do this with just regular Java arithmetic and/or bitwise operations, so no external functions are required.
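As one concrete example, Marsaglia's xorshift64 needs nothing but XORs and shifts once its 64-bit state is seeded; a minimal sketch:

public class XorShift64 {
    private long state;

    XorShift64(long seed) {
        // xorshift state must never be zero, or the sequence gets stuck at zero
        state = (seed == 0L) ? 1L : seed;
    }

    long next() {
        long x = state;
        x ^= x << 13;
        x ^= x >>> 7; // the unsigned shift matters in Java
        x ^= x << 17;
        state = x;
        return x;
    }

    public static void main(String[] args) {
        XorShift64 r = new XorShift64(System.currentTimeMillis());
        for (int i = 0; i < 5; i++) {
            System.out.println(r.next());
        }
    }
}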
import java.util.HashMap;
import java.util.Map;

public class RandomNumberGenerator {
    int generateRandomNumber(int min, int max) {
        return (int) ((System.currentTimeMillis() % max) + min);
    }

    public static void main(String[] args) {
        RandomNumberGenerator rn = new RandomNumberGenerator();
        int cv = 0;
        int min = 1, max = 4;
        Map<Integer, Integer> hmap = new HashMap<Integer, Integer>();
        int count = min;
        // Loop until each value in [min, max] has been printed once.
        while (count <= max) {
            cv = rn.generateRandomNumber(min, max);
            if ((hmap.get(cv) == null) && cv >= min && cv <= max) {
                System.out.print(cv + ",");
                hmap.put(cv, 1);
                count++;
            }
        }
    }
}
Poisson Random Generator
Let's say we start with an expected value v for the random numbers. To say that a sequence of non-negative integers satisfies a Poisson distribution with expected value v means that, over long subsequences, the mean of the values approaches v.
The Poisson distribution is part of statistics, and the details can be found on Wikipedia.
But here the main advantages of using this function are:
1. Only integer values are generated.
2. The mean of those integers equals the value we initially provided.
It is helpful in applications where fractional values don't make sense, e.g. the number of planes arriving at an airport in 1 min being 2.5 doesn't make sense, but it implies that 5 planes arrive in 2 mins.
int poissonRandom(double expectedValue) {
    int n = 0; // counter of iterations
    double limit = exp(-expectedValue);
    double x = rand() / (double) RAND_MAX; // pseudo-random number in [0, 1]
    while (x > limit) {
        n++;
        x *= rand() / (double) RAND_MAX;
    }
    return n;
}
The expression
rand() / (double) RAND_MAX
generates a random number between 0 and 1. If no random function is available, we can use the system time instead; the current seconds value divided by 60 will serve the purpose.
Which function we should use is totally application dependent.
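Since this thread is about Java, here is a hedged, self-contained translation of the same Knuth-style sampler using java.util.Random as the uniform source:

import java.util.Random;

public class PoissonSampler {
    private static final Random RNG = new Random();

    // Knuth's algorithm; fine for small expected values (roughly < 30).
    static int poisson(double expectedValue) {
        double limit = Math.exp(-expectedValue);
        double p = RNG.nextDouble();
        int n = 0;
        while (p > limit) {
            n++;
            p *= RNG.nextDouble();
        }
        return n;
    }

    public static void main(String[] args) {
        long sum = 0;
        int trials = 100000;
        for (int i = 0; i < trials; i++) {
            sum += poisson(2.5);
        }
        System.out.println("observed mean: " + (double) sum / trials); // near 2.5
    }
}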
I am trying to make a Java port of a simple feed-forward neural network.
This obviously involves lots of numeric calculations, so I am trying to optimize my central loop as much as possible. The results should be correct within the limits of the float data type.
My current code looks as follows (error handling & initialization removed):
/**
 * Simple implementation of a feedforward neural network. The network supports
 * including a bias neuron with a constant output of 1.0 and weighted synapses
 * to hidden and output layers.
 *
 * @author Martin Wiboe
 */
public class FeedForwardNetwork {
    private final int outputNeurons;    // No of neurons in output layer
    private final int inputNeurons;     // No of neurons in input layer
    private int largestLayerNeurons;    // No of neurons in largest layer
    private final int numberLayers;     // No of layers
    private final int[] neuronCounts;   // Neuron count in each layer, 0 is input layer.
    private final float[][][] fWeights; // Weights between neurons.
                                        // fWeights[fromLayer][fromNeuron][toNeuron]
                                        // is the weight from fromNeuron in fromLayer
                                        // to toNeuron in layer fromLayer+1.
    private float[][] neuronOutput;     // Temporary storage of output from previous layer

    public float[] compute(float[] input) {
        // Copy input values to input layer output
        for (int i = 0; i < inputNeurons; i++) {
            neuronOutput[0][i] = input[i];
        }

        // Loop through layers
        for (int layer = 1; layer < numberLayers; layer++) {
            // Loop over neurons in the layer and determine weighted input sum
            for (int neuron = 0; neuron < neuronCounts[layer]; neuron++) {
                // Bias neuron is the last neuron in the previous layer
                int biasNeuron = neuronCounts[layer - 1];

                // Get weighted input from bias neuron - output is always 1.0
                float activation = 1.0F * fWeights[layer - 1][biasNeuron][neuron];

                // Get weighted inputs from rest of neurons in previous layer
                for (int inputNeuron = 0; inputNeuron < biasNeuron; inputNeuron++) {
                    activation += neuronOutput[layer - 1][inputNeuron] * fWeights[layer - 1][inputNeuron][neuron];
                }

                // Store neuron output for next round of computation
                neuronOutput[layer][neuron] = sigmoid(activation);
            }
        }

        // Return output from network = output from last layer
        float[] result = new float[outputNeurons];
        for (int i = 0; i < outputNeurons; i++)
            result[i] = neuronOutput[numberLayers - 1][i];
        return result;
    }

    private final static float sigmoid(final float input) {
        return (float) (1.0F / (1.0F + Math.exp(-1.0F * input)));
    }
}
I am running the JVM with the -server option, and as of now my code is between 25% and 50% slower than similar C code. What can I do to improve this situation?
Thank you,
Martin Wiboe
Edit #1: After seeing the vast number of responses, I should probably clarify the numbers in our scenario. During a typical run, the method will be called about 50,000 times with different inputs. A typical network would have numberLayers = 3 layers with 190, 2 and 1 neurons, respectively. The innermost loop will therefore have about 2*191+3 = 385 iterations (when counting the added bias neuron in layers 0 and 1).
Edit #2: After implementing the various suggestions in this thread, our implementation is practically as fast as the C version (within ~2%). Thanks for all the help! All of the suggestions have been helpful, but since I can only mark one answer as the correct one, I will give it to @Durandal for both suggesting array optimizations and being the only one to precalculate the for loop header.
Some tips.
in your innermost loop, think about how you are traversing your CPU cache and re-arrange your matrix so you are accessing the outermost array sequentially. This will result in you accessing your cache in order rather than jumping all over the place. A cache hit can be two orders of magnitude faster than a cache miss.
e.g. restructure fWeights so it is accessed as
activation += neuronOutput[layer-1][inputNeuron] * fWeights[layer - 1][neuron][inputNeuron];
don't perform work inside the loop (every time) which can be done outside the loop (once). Don't perform the [layer - 1] lookup every time when you can place the result in a local variable. Your IDE should be able to refactor this easily.
multi-dimensional arrays in Java are not as efficient as they are in C. They are actually multiple layers of single-dimensional arrays. You can restructure the code so you're only using a single-dimensional array (see the sketch after these tips).
don't return a new array when you can pass the result array as an argument. (Saves creating a new object on each call).
rather than performing layer - 1 all over the place, why not use a variable layer1 holding layer - 1, and use layer1 + 1 instead of layer?
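To illustrate the single-dimensional array tip above, here is a hedged sketch; flatWeights is a hypothetical restructured field, and it also uses the transposed layout from the first tip so the innermost loop walks memory sequentially:

// Per layer, store weights in one flat, transposed array:
// the weight from `from` to `to` lives at flat[to * fromCount + from].
float[] w = flatWeights[layer - 1];
int fromCount = neuronCounts[layer - 1] + 1; // +1 for the bias neuron
int base = neuron * fromCount;
float activation = w[base + biasNeuron]; // bias output is always 1.0
for (int inputNeuron = 0; inputNeuron < biasNeuron; inputNeuron++) {
    activation += neuronOutput[layer - 1][inputNeuron] * w[base + inputNeuron];
}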
Disregarding the actual math, the array indexing in Java can be a performance hog in itself. Consider that Java has no real multidimensional arrays, but rather implements them as arrays of arrays. In your innermost loop, you access over multiple indices, some of which are in fact constant in that loop. Part of the array access can be moved outside of the loop:
final float[] neuronOutputSlice = neuronOutput[layer - 1];
final float[][] fWeightsSlice = fWeights[layer - 1];
for (int inputNeuron = 0; inputNeuron < biasNeuron; inputNeuron++) {
    activation += neuronOutputSlice[inputNeuron] * fWeightsSlice[inputNeuron][neuron];
}
It is possible that the server JIT performs similar loop-invariant code motion; the only way to find out is to change it and profile. On the client JIT this should improve performance no matter what.
Another thing you can try is to precalculate the for-loop exit conditions, like this:
for (int neuron = 0; neuron < neuronCounts[layer]; neuron++) { ... }
// transform to precalculated exit condition (move invariant array access outside loop)
for (int neuron = 0, neuronCount = neuronCounts[layer]; neuron < neuronCount; neuron++) { ... }
Again the JIT may already do this for you, so profile if it helps.
Is there a point to multiplying with 1.0F that eludes me here?:
float activation = 1.0F * fWeights[layer - 1][biasNeuron][neuron];
Other things that could potentially improve speed at the cost of readability: inline the sigmoid() function manually (the JIT has a very tight limit for inlining and the function might be larger).
It can be slightly faster to run a loop backwards (where it doesn't change the outcome, of course), since testing the loop index against zero is a little cheaper than checking against a local variable (the innermost loop is a potential candidate again, but don't expect the output to be 100% identical in all cases, since adding floats a + b + c is potentially not the same as a + c + b).
For a start, don't do this:
// Copy input values to input layer output
for (int i = 0; i < inputNeurons; i++) {
    neuronOutput[0][i] = input[i];
}
But this:
System.arraycopy( input, 0, neuronOutput[0], 0, inputNeurons );
First thing I would look into is seeing if Math.exp is slowing you down. See this post on a Math.exp approximation for a native alternative.
Replace the expensive floating point sigmoid transfer function with an integer step transfer function.
The sigmoid transfer function is a model of organic analog synaptic learning, which in turn seems to be a model of a step function.
The historical precedent for this is that Hinton designed the back-prop algorithm directly from the first principles of cognitive science theories about real synapses, which in turn were based on real analog measurements, which turn out to be sigmoid.
But the sigmoid transfer function seems to be an organic model of the digital step function, which of course cannot be directly implemented organically.
Rather than model a model, replace the expensive floating point implementation of the organic sigmoid transfer function with the direct digital implementation of a step function (less than zero = -1, greater than zero = +1).
The brain cannot do this, but backprop can!
This not only linearly and drastically improves performance of a single learning iteration, it also reduces the number of learning iterations required to train the network: supporting evidence that learning is inherently digital.
Also supports the argument that Computer Science is inherently cool.
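A hedged sketch of that swap in the poster's code; only the transfer function changes:

// Digital step transfer function: sign of the weighted input sum.
private static float step(final float input) {
    return input < 0.0F ? -1.0F : 1.0F;
}

// ... and in compute(), instead of sigmoid():
// neuronOutput[layer][neuron] = step(activation);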
Purely based upon code inspection, your innermost loop has to compute references into a three-dimensional array, and it's being done a lot. Depending upon your array dimensions, you could be having cache issues due to having to jump around memory with each loop iteration. Maybe you could rearrange the dimensions so the inner loop tries to access memory elements that are closer to one another than they are now?
In any case, profile your code before making any changes and see where the real bottleneck is.
I suggest using a fixed-point system rather than a floating-point system. On almost all processors, using int is faster than float. The simplest way to do this is simply to shift everything left by a certain amount (4 or 5 bits is a good starting point) and treat the low bits as the fraction.
Your innermost loop is doing floating point maths so this may give you quite a boost.
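A minimal sketch of the idea with a 4-bit fraction (a Q-format; all names are illustrative):

public class FixedPoint {
    static final int SHIFT = 4; // 4 fractional bits => precision of 1/16

    static int toFixed(float f) { return Math.round(f * (1 << SHIFT)); }
    static float toFloat(int q) { return q / (float) (1 << SHIFT); }

    // A fixed-point product carries 2 * SHIFT fraction bits, so shift back once.
    static int mul(int a, int b) { return (a * b) >> SHIFT; }

    public static void main(String[] args) {
        int a = toFixed(1.5F);
        int b = toFixed(2.25F);
        System.out.println(toFloat(mul(a, b))); // prints 3.375
    }
}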
The key to optimization is to first measure where the time is spent. Surround various parts of your algorithm with calls to System.nanoTime():
long start_time = System.nanoTime();
doStuff();
long time_taken = System.nanoTime() - start_time;
I'd guess that while using System.arraycopy() would help a bit, you'll find your real costs in the inner loop.
Depending on what you find, you might consider replacing the float arithmetic with integer arithmetic.
Sometimes this piece of code always returns the same number (and sometimes it works fine):
(new Random()).nextInt(5)
I have suspicions where the problem is - it probably always creates a new Random with the same seed. So what would be the best solution:
create a static var for Random() and use it instead.
use Math.random() * 5 (looks like it uses a static var internally)
or something else? I don't need anything fancy, just something that looks random.
Also it would be helpful if someone can explain why the original code sometimes works and sometimes it doesn't.
Thanks.
The javadoc for java.util.Random is clear:
If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers.
The default constructor is also clear:
Creates a new random number generator. This constructor sets the seed of the random number generator to a value very likely to be distinct from any other invocation of this constructor.
In other words, no guarantees.
If you need a more random algorithm, use java.security.SecureRandom.
...Sometimes this piece of code [..] returns the same number (and sometimes it works fine)...
So it works randomly??? :) :) :)
Ok, ok, downvote me now!!
If you're calling that line of code on successive lines, then yes, the two Random instances you're creating could be created with the same seed from the clock (the clock millisecond tick count is the default seed for Random objects). Almost universally, if an application needs multiple random numbers, you'd create one instance of Random and re-use it as often as you need.
Edit: Interesting note, The javadoc for Random has changed since 1.4.2, which explained that the clock is used as the default seed. Apparently, that's no longer a guarantee.
Edit #2: By the way, even with a properly seeded Random instance that you re-use, you'll still get the same random number as the previous call about 1/5 of the time when you call nextInt(5).
public static void main(String[] args) {
    Random rand = new Random();
    int count = 0;
    int trials = 10000;
    int current;
    int previous = rand.nextInt(5);
    for (int i = 0; i < trials; ++i)
    {
        current = rand.nextInt(5);
        if (current == previous)
        {
            count++;
        }
    }
    System.out.println("Random int was the same as previous " + count +
            " times out of " + trials + " tries.");
}
In Java 1.4, the default seed of a new Random instance was specified in the API documentation to be the result of System.currentTimeMillis(). Obviously, a tight loop could create many Random() instances per tick, all having the same seed and all producing the same pseudo-random sequence. This was especially bad on some platforms, like Windows, where the clock resolution was poor (10 ms or greater).
Since Java 5, however, the seed is set "to a value very likely to be distinct" for each invocation of the default constructor. With a different seed for each Random instance, results should appear random as desired.
The Javadoc for Random isn't explicit about this, but the seed it uses is probably dependent on the current system time. It does state the random numbers will be the same for the same seed. If you use the call within the same millisecond, it will use the same seed. The best solution is probably to use a static Random object and use it for subsequent calls to the method.
The best way to approximate a uniform distribution is to use the method nextInt(n) on a shared Random instance to produce integers in the range [0, n-1] (yes, n is excluded). In your particular example, if you want integers in the range 0 to 5, inclusive, you would call nextInt(6).
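For the example in the question, a minimal sketch:

Random rand = new Random(); // create once and reuse
int value = rand.nextInt(6); // uniform over 0..5 (6 is excluded)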