I saved a tensorflow model using tf.saved_model.builder.SavedModelBuilder.
However, when I try to make predictions in Java, it returns the same result most of the time (for fc8, the AlexNet layer before softmax); in some cases it produces genuinely different results that are most likely correct, so from that I assume the training itself is OK.
Has anyone else experienced this? Does anyone have an idea what's wrong?
My Java implementation:
Tensor image = constructAndExecuteGraphToNormalizeImage(imageBytes);
Tensor result = s.runner().feed("input_tensor", image).feed("Placeholder_1", t).fetch("fc8/fc8").run().get(0);
private static Tensor constructAndExecuteGraphToNormalizeImage(byte[] imageBytes) {
    try (Graph g = new Graph()) {
        TF.GraphBuilder b = new TF.GraphBuilder(g);
        // Constants specific to this pre-trained model (adapted from the
        // TensorFlow LabelImage example for the inception5h model at:
        // https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip):
        // - The model expects images scaled to 227x227 pixels.
        // - The colors, represented as R, G, B in 1 byte each, are converted to
        //   float using (value - mean) / scale.
        final int H = 227;
        final int W = 227;
        final float mean = 117f;
        final float scale = 1f;

        // Since the graph is constructed once per execution here, we can use a
        // constant for the input image. If the graph were re-used for multiple
        // input images, a placeholder would be more appropriate.
        final Output input = b.constant("input", imageBytes);
        final Output output =
            b.div(
                b.sub(
                    b.resizeBilinear(
                        b.expandDims(
                            b.cast(b.decodeJpeg(input, 3), DataType.FLOAT),
                            b.constant("make_batch", 0)),
                        b.constant("size", new int[] {H, W})),
                    b.constant("mean", mean)),
                b.constant("scale", scale));
        try (Session s = new Session(g)) {
            return s.runner().fetch(output.op().name()).run().get(0);
        }
    }
}
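For reference, I read the fetched fc8 activations back into a Java array like this (a minimal sketch; the shape {1, 1000}, i.e. a batch of one and 1000 classes, is an assumption about the model):
// Copy the fetched fc8 activations out of the result Tensor.
// copyTo returns the filled destination array; {1, 1000} is assumed.
float[][] logits = result.copyTo(new float[1][1000]);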
I am assuming there is no random operation left in your graph, such as dropout (which seems to be the case, since you often get the same results).
Alas, some operations in TensorFlow appear to be non-deterministic, such as reductions and convolutions. We have to live with the fact that TensorFlow's nets are stochastic beasts: their performance can be characterized statistically, but their outputs are non-deterministic.
(I believe some other frameworks, such as Theano, go further than TensorFlow in offering deterministic operations.)
I'm using JavaCL to process images.
I keep getting
com.nativelibs4java.opencl.CLException$InvalidKernelArgs: InvalidKernelArgs
on the call to enqueueNDRange in this (partial) function:
FloatBuffer outBuffer = ByteBuffer.allocateDirect(4 * XYZ.length).order(context.getByteOrder()).asFloatBuffer();
CLFloatBuffer cl_outBuffer = context.createFloatBuffer(CLMem.Usage.Output, outBuffer, false);
CLFloatBuffer cl_inBuffer = context.createFloatBuffer(CLMem.Usage.Input, XYZ.length);
FloatBuffer inBuffer = cl_inBuffer.map(queue, CLMem.MapFlags.Write).put(XYZ);
inBuffer.rewind();
event = cl_inBuffer.unmap(queue, inBuffer);
XYZ2RGBKernel.setArgs(cl_inBuffer, XYZ.length / 4, cl_outBuffer);
event = XYZ2RGBKernel.enqueueNDRange(queue, new int[]{XYZ.length / 4}, event);
event = cl_outBuffer.read(queue, outBuffer, true, event);
XYZ is a pixel array with 4 floats per pixel (encoded like RGBARGBARGBA...).
The associated kernel header is :
__kernel void XYZ2RGB( __constant float3* inputXYZ,
int numberOfPixels,
__global float* output
)
I can't figure out why it doesn't work, given that this other call to enqueueNDRange:
CLFloatBuffer cl_Rbuffer = context.createFloatBuffer(CLMem.Usage.Input, R.length);
FloatBuffer R_buffer = cl_Rbuffer.map(queue, CLMem.MapFlags.Write).put(R);
R_buffer.rewind();
event = cl_Rbuffer.unmap(queue, R_buffer);
CLFloatBuffer cl_Gbuffer = context.createFloatBuffer(CLMem.Usage.Input, G.length);
FloatBuffer G_buffer = cl_Gbuffer.map(queue, CLMem.MapFlags.Write, event).put(G);
G_buffer.rewind();
event = cl_Gbuffer.unmap(queue, G_buffer);
CLFloatBuffer cl_Bbuffer = context.createFloatBuffer(CLMem.Usage.Input, B.length);
FloatBuffer B_buffer = cl_Bbuffer.map(queue, CLMem.MapFlags.Write, event).put(B);
B_buffer.rewind();
event = cl_Bbuffer.unmap(queue, B_buffer);
FloatBuffer outBuffer = ByteBuffer.allocateDirect(4*4*R.length).order(context.getByteOrder()).asFloatBuffer();
CLFloatBuffer cl_outBuffer = context.createFloatBuffer(CLMem.Usage.Output, outBuffer, false);
RGB2XYZKernel.setArgs(cl_Rbuffer, cl_Gbuffer, cl_Bbuffer, cl_outBuffer);
event = RGB2XYZKernel.enqueueNDRange(queue, new int[]{R.length}, event);
event = cl_outBuffer.read(queue, outBuffer, true, event);
with the associated kernel header:
__kernel void RGB2XYZ( __constant float* inputR,
__constant float* inputG,
__constant float* inputB,
__global float3* output)
works without any problem.
Before anyone asks: float3 or float4 would behave the same here, because the OpenCL spec uses 4*sizeof(float) alignment for both, and I've tried switching between the two.
I also tried passing the input as float*, but that doesn't work either.
Both calls happen one after the other.
Update
I fixed it, after multiple hours:
__constant seems to have a size limit (though I couldn't find it in the spec). XYZ being 4 times the size of R, G, or B, it crashed at runtime.
I had issues afterwards with float3: the library I'm forced to use isn't up to date and doesn't support it well enough, so I switched to float4.
However, if any of you have more insight about the __constant size limit, let me know; I'm sure it will be handy for people who come across this thread.
__constant seems to have a size limit (though I couldn't find it in the spec).
Limits depend on the device. Constant buffers have a per-buffer size limit (CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, minimum 64 KB), and there is also a limit on how many constant arguments you can pass to a kernel (CL_DEVICE_MAX_CONSTANT_ARGS, minimum 8). Both AMD and Nvidia GPUs usually stay close to these minimums, so the total amount of data that can be passed as __constant can be very small.
The point of constant memory is not to pass read-only user input data to kernels (as you seem to be doing); it is to store algorithm-specific constants (lookup tables, matrix/polynomial/filter coefficients, etc.). If you want to pass read-only input data, the usual way is to declare the kernel argument as __global const <type>* and create the corresponding buffer with CL_MEM_READ_ONLY.
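As a hedged sketch of that change, reusing the names from the question (the kernel body itself is assumed), the kernel argument becomes a __global const pointer, and on the JavaCL side a CLMem.Usage.Input buffer already corresponds to a read-only (CL_MEM_READ_ONLY) buffer, if I recall the JavaCL mapping correctly:
// Kernel side: declare the input as __global const instead of __constant:
//   __kernel void XYZ2RGB(__global const float4* inputXYZ,
//                         int numberOfPixels,
//                         __global float* output)
// Host side (JavaCL), unchanged apart from the kernel source:
CLFloatBuffer cl_inBuffer = context.createFloatBuffer(CLMem.Usage.Input, XYZ.length);
FloatBuffer inBuffer = cl_inBuffer.map(queue, CLMem.MapFlags.Write).put(XYZ);
inBuffer.rewind();
event = cl_inBuffer.unmap(queue, inBuffer);
XYZ2RGBKernel.setArgs(cl_inBuffer, XYZ.length / 4, cl_outBuffer);
event = XYZ2RGBKernel.enqueueNDRange(queue, new int[]{XYZ.length / 4}, event);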
Here is some more insight.
I'm trying to create a large 2D array (int[][]) from a LinkedHashMap which contains a number of smaller arrays, for an A* pathfinder I'm working on.
The map the pathfinder uses is streamed to the client in smaller chunks and converted into a simplified version for the pathfinder.
Map<Coord, int[][]> pfmapcache = new LinkedHashMap<Coord, int[][]>(9, 0.75f, true);
The Coords look like this: Coord(0,0), Coord(-1,0), etc., and the int[][] arrays are always int[100][100].
1. I would like to create a new large int[][] that encompasses all the smaller arrays, with the small array at Coord(0,0) in the center of the new large array:
int[][] largearray = [-1,1][0,1][1,1]
[-1,0][0,0][1,0]
[-1,-1][0,-1][1,-1]
So the large array would be int[300][300] in this example.
2. I would like to expand the large array whenever a new small array gets added to pfmapcache:
int[][] largearray = [][][1,2]
[-1,1][0,1][1,1]
[-1,0][0,0][1,0]
[-1,-1][0,-1][1,-1]
I don't have to store the smaller arrays in pfmapcache; I could combine them as they are created, two small arrays at a time, and so on. But with the negative positions of the arrays relative to the original, I have no idea how to combine them while preserving their relative positions.
First time posting here; if I need to clarify something, please let me know.
You're wondering how to use your existing pathfinder algorithm with a chunked map.
This is when you need to place an abstraction layer between your data representation and your data usage (something like a Landscape class).
Q: Does a pathfinding algorithm need to know it works on a grid, on a chunked grid, on a sparse matrix, or on a more exotic representation?
A: No. A Pathfinder only needs to know one thing: 'Where can I get from here?'
Ideally
You should drop any reference to the fact that your world is a grid by working only with a class like:
public interface IdealLandscape {
    Map<Point, Integer> neighbours(Point location); // returns all neighbours, and the cost to get there
}
Easy alternative
However, as I understand it, your existing implementation 'knows' about grids, with the added value that adjacency is implicit and you work with points as (x, y). You lost this when you introduced chunks, so working directly with the grid doesn't work anymore. So let's make the abstraction as painless as possible. Here's the plan:
1. Introduce a Landscape class
public interface Landscape {
    public int getHeight(int x, int y); // Assuming you're storing height in your int[][] map?
}
2. Refactor your Pathfinder
It's really easy:
just replace map[i][j] with landscape.getHeight(i, j)
3. Test your refactoring
Use a very simple GridLandscape implementation like:
public class GridLandscape implements Landscape {
    int[][] map;

    public GridLandscape(...) {
        map = // Build it somehow
    }

    @Override
    public int getHeight(int x, int y) {
        return map[x][y]; // Maybe check bounds here?
    }
}
4. Use your ChunkedGridLandscape
Now that your map is abstracted away and you know your pathfinder works on it, you can swap in your chunked map!
public class ChunkedGridLandscape implements Landscape {
    private static final int CHUNK_SIZE = 100; // each chunk is int[100][100] in the question

    Map<Coord, int[][]> mapCache = new LinkedHashMap<>(9, 0.75f, true);
    Coord centerChunkCoord;

    public ChunkedGridLandscape(Map<Coord, int[][]> pfmapcache, Coord centerChunkCoord) {
        this.mapCache = pfmapcache;
        this.centerChunkCoord = centerChunkCoord;
    }

    @Override
    public int getHeight(int x, int y) {
        // Compute the chunk coordinate (floorDiv handles negative coordinates correctly)
        int chunkX = Math.floorDiv(x, CHUNK_SIZE) - centerChunkCoord.getX();
        int chunkY = Math.floorDiv(y, CHUNK_SIZE) - centerChunkCoord.getY();
        Coord chunkCoord = new Coord(chunkX, chunkY);
        // Now retrieve the correct chunk
        int[][] chunk = mapCache.get(chunkCoord); // Careful: is the chunk already loaded?
        // Now retrieve the height within the chunk (floorMod keeps the index non-negative)
        int xInChunk = Math.floorMod(x, CHUNK_SIZE);
        int yInChunk = Math.floorMod(y, CHUNK_SIZE);
        // We have everything!
        return chunk[xInChunk][yInChunk];
    }
}
Gotcha: your Coord class NEEDS to have equals and hashCode properly overridden, or the map lookups will fail!
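For example, a minimal Coord along those lines might look like this (a sketch; adapt to your actual class):
public final class Coord {
    private final int x, y;

    public Coord(int x, int y) { this.x = x; this.y = y; }

    public int getX() { return x; }
    public int getY() { return y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Coord)) return false;
        Coord c = (Coord) o;
        return x == c.x && y == c.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // combine both fields so different coords hash apart
    }
}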
5. It just works
This should just immediately work with your pathfinder. Enjoy!
First, I'm new to image processing on Android. I have a .cube file ("Generated by Resolve") with LUT_3D_SIZE 33. I'm trying to use android.support.v8.renderscript.ScriptIntrinsic3DLUT to apply the lookup table to an image. I assume I should use ScriptIntrinsic3DLUT and NOT android.support.v8.renderscript.ScriptIntrinsicLUT, correct?
I'm having trouble finding sample code, so this is what I've pieced together so far. The issue I'm having is: how do I create an Allocation from my .cube file?
...
final RenderScript renderScript = RenderScript.create(getApplicationContext());
final ScriptIntrinsic3DLUT scriptIntrinsic3DLUT = ScriptIntrinsic3DLUT.create(renderScript, Element.U8_4(renderScript));
// How to create an Allocation from the .cube file?
// final Allocation allocationLut = Allocation.createXXX();
scriptIntrinsic3DLUT.setLUT(allocationLut);
Bitmap bitmapIn = selectedImage;
Bitmap bitmapOut = selectedImage.copy(bitmapIn.getConfig(), true);
Allocation aIn = Allocation.createFromBitmap(renderScript, bitmapIn);
Allocation aOut = Allocation.createTyped(renderScript, aIn.getType());
scriptIntrinsic3DLUT.forEach(aIn, aOut); // apply the LUT to the input allocation
aOut.copyTo(bitmapOut);
imageView.setImageBitmap(bitmapOut);
...
Any thoughts?
Parsing the .cube file
First, you should parse the .cube file.
OpenColorIO shows how to do this in C++; it has parsers for LUT formats like .cube, .lut, etc. For example, FileFormatIridasCube.cpp shows how to process a .cube file.
You can easily get the size from LUT_3D_SIZE. I contacted an image-processing algorithm engineer; this is what he said:
Generally in the industry a 17^3 cube is considered preview, 33^3 normal and 65^3 for highest quality output.
Note that a .cube file contains 3*LUT_3D_SIZE^3 floats.
The key point is what to do with that float array: we cannot pass it to the cube in ScriptIntrinsic3DLUT via the Allocation as-is; we need to process the float array first.
Handle the data in .cube file
As we know, at 8-bit depth each RGB component is an 8-bit int. R sits in the high 8 bits, G in the middle 8 bits, and B in the low 8 bits; this way a 24-bit int can hold all three components at once.
In a .cube file, each data line contains 3 floats.
Please note: the blue component goes first! I reached this conclusion by trial and error (or perhaps someone can give a more accurate explanation).
Each float is the coefficient of its component relative to 255, so we compute the real value from the three components:
int getRGBColorValue(float b, float g, float r) {
    int bcol = (int) (255 * clamp(b, 0.f, 1.f));
    int gcol = (int) (255 * clamp(g, 0.f, 1.f));
    int rcol = (int) (255 * clamp(r, 0.f, 1.f));
    return bcol | (gcol << 8) | (rcol << 16);
}
So each data line of 3 floats yields one integer.
Finally, we get an integer array of length LUT_3D_SIZE^3; this is the array to load into the cube.
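As a rough sketch of that packing step (my own code, not from the demo below; it assumes the .cube layout described above and the getRGBColorValue helper):
// Parse a .cube file into the packed int[] lut expected by the cube.
// Header keywords (TITLE, DOMAIN_MIN, ...) are skipped; only lines starting
// with a digit or a minus sign are treated as data lines.
static int[] parseCube(java.io.BufferedReader reader) throws java.io.IOException {
    int size = 0;
    int[] lut = null;
    int i = 0;
    String line;
    while ((line = reader.readLine()) != null) {
        line = line.trim();
        if (line.isEmpty() || line.startsWith("#")) continue;
        if (line.startsWith("LUT_3D_SIZE")) {
            size = Integer.parseInt(line.split("\\s+")[1]);
            lut = new int[size * size * size];
        } else if (Character.isDigit(line.charAt(0)) || line.charAt(0) == '-') {
            String[] parts = line.split("\\s+");
            float r = Float.parseFloat(parts[0]);
            float g = Float.parseFloat(parts[1]);
            float b = Float.parseFloat(parts[2]);
            lut[i++] = getRGBColorValue(b, g, r); // blue first, as noted above
        }
    }
    return lut;
}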
ScriptIntrinsic3DLUT
RsLutDemo shows how to apply ScriptIntrinsic3DLUT.
RenderScript mRs;
Bitmap mBitmap;
Bitmap mLutBitmap;
ScriptIntrinsic3DLUT mScriptlut;
Bitmap mOutputBitmap;
Allocation mAllocIn;
Allocation mAllocOut;
Allocation mAllocCube;
...
int redDim, greenDim, blueDim;
int[] lut;
if (mScriptlut == null) {
    mScriptlut = ScriptIntrinsic3DLUT.create(mRs, Element.U8_4(mRs));
}
if (mBitmap == null) {
    mBitmap = BitmapFactory.decodeResource(getResources(), R.drawable.bugs);
    mOutputBitmap = Bitmap.createBitmap(mBitmap.getWidth(), mBitmap.getHeight(), mBitmap.getConfig());
    mAllocIn = Allocation.createFromBitmap(mRs, mBitmap);
    mAllocOut = Allocation.createFromBitmap(mRs, mOutputBitmap);
}
...
// get the expected lut[] from .cube file.
...
Type.Builder tb = new Type.Builder(mRs, Element.U8_4(mRs));
tb.setX(redDim).setY(greenDim).setZ(blueDim);
Type t = tb.create();
mAllocCube = Allocation.createTyped(mRs, t);
mAllocCube.copyFromUnchecked(lut);
mScriptlut.setLUT(mAllocCube);
mScriptlut.forEach(mAllocIn, mAllocOut);
mAllocOut.copyTo(mOutputBitmap);
Demo
I have put together a demo showing this working.
You can view it on Github.
Thanks.
With a 3D LUT, yes, you have to use the core framework version, as there is no support library version of ScriptIntrinsic3DLUT at this time. Your 3D LUT Allocation has to be created by parsing the file appropriately; there is no built-in support for .cube files (or any other 3D LUT format).
I am making a Java rigid-body physics engine, and it had gone great until I tried to implement rotation. I don't know where the problem is coming from. I have methods that calculate the moment of inertia of convex polygons and circles, using formulas from these websites:
http://lab.polygonal.de/?p=57
http://en.wikipedia.org/wiki/List_of_moments_of_inertia
This is the code for the polygon moment of inertia:
public float momentOfInertia() {
    Vector C = centerOfMass().subtract(position); // center of mass
    Line[] sides = sides();                       // sides of the polygon
    float moi = 0;                                // moment of inertia
    for (int i = 0; i < sides.length; i++) {
        Line l = sides[i];    // current side of the polygon being looped through
        Vector p1 = C;        // points 1, 2, and 3 are the points of the triangle
        Vector p2 = l.point1;
        Vector p3 = l.point2;
        Vector Cp = p1.add(p2).add(p3).divide(3); // center of mass of the triangle, or C'
        float d = new Line(C, Cp).length();       // distance between the centers of mass
        Vector bv = p2.subtract(p1);  // vector for side b of the triangle
        float b = bv.magnitude();     // length of side b
        Vector u = bv.divide(b);      // unit vector along side b
        Vector cv = p3.subtract(p1);  // vector for side c, only used to calculate a and h
        float a = cv.dot(u);          // length of a in the triangle
        Vector av = u.multiply(a);    // vector for a in the triangle
        Vector hv = cv.subtract(av);  // vector for the height of the triangle, or h in the diagram
        float h = hv.magnitude();     // length of the height of the triangle, or h in the diagram
        float I = ((b*b*b*h) - (b*b*h*a) + (b*h*a*a) + (b*h*h*h)) / 36; // moment of inertia of this triangle
        float M = (b*h) / 2; // mass (area) of the triangle
        moi += I + M*d*d;    // equation in the sigma series on the website
    }
    return moi;
}
And this is for the circle:
public float momentOfInertia() {
    return (float) Math.pow(radius, 2) * area() / 2;
}
I know for a fact that the area functions work; I have checked them. I just don't know how to check whether the moment of inertia equations are wrong.
For collision detection I use the separating axis theorem, for any combination of two polygons and circles; it can find out whether they are colliding, the normal velocity of the collision, and the contact point of the collision. These methods all work beautifully.
I should also mention how positions are organized. Every body has a position and a shape, either a polygon or a circle. Each shape has a position of its own, and polygons have individual vertices. So to find the absolute position of a vertex of a polygon-shaped body, I add the positions of the body, the polygon, and the vertex itself. The center-of-mass equation gives an absolute position with respect to the shape, with no account taken of the body. The center-of-mass and moment-of-inertia methods live in the Shape class.
For every body, these quantities are updated from the force and torque in the body's update method, where dt is the delta time. I also rotate the polygon by the change in rotation, because the vertices are ever-changing.
public void update(float dt) {
    if (mass != 0) {
        momentum = momentum.add(force.multiply(dt));
        velocity = momentum.divide(mass);
        position = position.add(velocity.multiply(dt));
        angularMomentum += torque * dt;
        angularVelocity = angularMomentum / momentOfInertia;
        angle += angularVelocity * dt;
        shape.rotate(angularVelocity * dt);
    }
}
Finally, I have a CollisionResolver class which resolves the collision of two colliding bodies, applying the normal force and friction. Here is the class's only method, which does all of this:
public static void resolveCollision(Body a, Body b, float dt) {
    // calculate normal vector
    Vector norm = CollisionDetector.normal(a, b);
    Vector normb = norm.multiply(-1);
    // undo overlap between bodies
    float ratio1 = a.mass / (a.mass + b.mass);
    float ratio2 = b.mass / (b.mass + a.mass);
    a.position = a.position.add(norm.multiply(ratio1));
    b.position = b.position.add(normb.multiply(ratio2));
    // calculate contact point of collision and other values needed for rotation
    Vector cp = CollisionDetector.contactPoint(a, b, norm);
    Vector c = a.shape.centerOfMass().add(a.position);
    Vector cb = b.shape.centerOfMass().add(b.position);
    Vector d = cp.subtract(c);
    Vector db = cp.subtract(cb);
    // create the normal force vector from the velocity
    Vector u = norm.unit();
    Vector ub = u.multiply(-1);
    Vector F = new Vector(0, 0);
    boolean doA = a.mass != 0;
    if (doA) {
        F = a.force;
    } else {
        F = b.force;
    }
    Vector n = new Vector(0, 0);
    Vector nb = new Vector(0, 0);
    if (doA) {
        Vector Fyp = u.multiply(F.dot(u));
        n = Fyp.multiply(-1);
        nb = Fyp;
    } else {
        Vector Fypb = ub.multiply(F.dot(ub));
        n = Fypb;
        nb = Fypb.multiply(-1);
    }
    // calculate normal force for body A
    float r = a.restitution;
    Vector v1 = a.velocity;
    Vector vy1p = u.multiply(u.dot(v1));
    Vector vx1p = v1.subtract(vy1p);
    Vector vy2p = vy1p.multiply(-r);
    Vector v2 = vy2p.add(vx1p);
    // calculate normal force for body B
    float rb = b.restitution;
    Vector v1b = b.velocity;
    Vector vy1pb = ub.multiply(ub.dot(v1b));
    Vector vx1pb = v1b.subtract(vy1pb);
    Vector vy2pb = vy1pb.multiply(-rb);
    Vector v2b = vy2pb.add(vx1pb);
    // calculate friction for body A
    float mk = (a.friction + b.friction) / 2;
    Vector v = a.velocity;
    Vector vyp = u.multiply(v.dot(u));
    Vector vxp = v.subtract(vyp);
    float fk = -n.multiply(mk).magnitude();
    Vector fkv = vxp.unit().multiply(fk);   // friction force
    Vector vr = vxp.subtract(d.multiply(a.angularVelocity));
    Vector fkvr = vr.unit().multiply(fk);   // friction torque - indicated by r for rotation
    // calculate friction for body B
    Vector vb = b.velocity;
    Vector vypb = ub.multiply(vb.dot(ub));
    Vector vxpb = vb.subtract(vypb);
    float fkb = -nb.multiply(mk).magnitude();
    Vector fkvb = vxpb.unit().multiply(fkb); // friction force
    Vector vrb = vxpb.subtract(db.multiply(b.angularVelocity));
    Vector fkvrb = vrb.unit().multiply(fkb); // friction torque - indicated by r for rotation
    // move bodies based on calculations
    a.momentum = v2.multiply(a.mass).add(fkv.multiply(dt));
    if (a.mass != 0) {
        a.velocity = a.momentum.divide(a.mass);
        a.position = a.position.add(a.velocity.multiply(dt));
    }
    b.momentum = v2b.multiply(b.mass).add(fkvb.multiply(dt));
    if (b.mass != 0) {
        b.velocity = b.momentum.divide(b.mass);
        b.position = b.position.add(b.velocity.multiply(dt));
    }
    // apply torque to bodies
    float t = (d.cross(fkvr) + d.cross(n));
    float tb = (db.cross(fkvrb) + db.cross(nb));
    if (a.mass != 0) {
        a.angularMomentum = t * dt;
        a.angularVelocity = a.angularMomentum / a.momentOfInertia;
        a.angle += a.angularVelocity * dt;
        a.shape.rotate(a.angularVelocity * dt);
    }
    if (b.mass != 0) {
        b.angularMomentum = tb * dt;
        b.angularVelocity = b.angularMomentum / b.momentOfInertia;
        b.angle += b.angularVelocity * dt;
        b.shape.rotate(b.angularVelocity * dt);
    }
}
As for the actual problem: both the circles and polygons rotate very slowly, and often in the wrong direction. I know I'm throwing a lot out there, but this problem has been bugging me for a while, and I would appreciate any help I can get.
Thanks.
This answer addresses the "I just don't know how to check if the moment of inertia equations are wrong." part of the question.
There are several possible approaches, some of which you may have already tried, and they can be used in combination:
Unit testing
Take your moment of inertia code and apply it to problems with known solutions from a tutorial or textbook.
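For instance, a unit-density square of side s centered at the origin has polar moment of inertia s^4/6 about its centroid, so a test along these lines (assuming your Vector and Polygon classes, with a hypothetical vertex-list constructor) would catch a wrong formula:
@Test
public void squareMomentOfInertia() {
    // 2x2 square centered at the origin; expected moment: s^4 / 6 = 16 / 6
    Polygon square = new Polygon(
        new Vector(-1, -1), new Vector(1, -1),
        new Vector(1, 1), new Vector(-1, 1));
    float expected = 16f / 6f;
    assertEquals(expected, square.momentOfInertia(), 1e-4f);
}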
Dimensional analysis
I would recommend this anyway for any scientific or engineering program. You may have deleted comments for compactness of posted code, but they are important. Annotate each variable that represents a physical quantity with its units. Check that every expression you evaluate has the right units, based on its inputs, for its result variable. For example, in the classic equation F=ma in SI units: F is in Newtons, equivalent to kg.m/(s^2), m is in kg, a is in m/(s^2), so it all balances. Be careful with transitions between physics world coordinates and screen coordinates.
Program simplification
Try working first with only one instance of one very simple shape for which you can do all the calculations by hand. Since some of your problems do not relate to rotation, a circle may be a good first choice because of its symmetry. Debug that, comparing intermediate results to equivalent results from paper-and-pencil (and calculator). Gradually add more instances of the same shape, then debug a single instance of the next shape...
Deliberate error
Given that you suspect your inertia calculations, try setting arbitrary values slightly different from your calculations, and see what differences they make in the display. Are the effects similar to the problems you are seeing? If so, keep it as a hypothesis.
As a more general note, programs that do iterative simulation can be very vulnerable to accumulated floating point error. Unless you have a real need to save space, and have done enough analysis of the numerical stability of your code to be sure float is OK, I strongly recommend using double instead. This is probably not your current problem, but is something that could become an issue later.
I'm trying to create a program that finds images that are similar to each other, and I found a site (http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html) that gives steps for a function that creates a fingerprint of an image. The first step is to reduce the image to an 8 by 8 (64-pixel) image, but I can't figure out how to convert a group of pixels into one pixel, e.g.:
[(R,G,B)][(R,G,B)][(R,G,B)]
[(R,G,B)][(R,G,B)][(R,G,B)]
[(R,G,B)][(R,G,B)][(R,G,B)]
Take this group of pixels: each pixel has different R, G, and B values. How can I take them all and turn them into one set of values, e.g.:
[(R,G,B)]
I thought maybe adding up all the R, G, and B values and then averaging them, but that seemed too simple. Does anyone know how to do this? I am writing this program in Java.
There are a lot of different interpolation/resampling techniques for downscaling; you can choose one depending on the results you're expecting. A simple one, for example, is nearest-neighbour interpolation, but it won't produce very detailed results, due to its simplicity.
More advanced techniques, e.g. linear, bilinear, or bicubic interpolation, are far better suited if the pictures are actual photos (rather than, say, pixel art). But the downscaled image in the link doesn't have much detail left either, so nearest neighbour seems quite sufficient (at least to start with).
public int[] resizePixels(int[] pixels, int w1, int h1, int w2, int h2) {
    int[] temp = new int[w2 * h2];
    double x_ratio = w1 / (double) w2;
    double y_ratio = h1 / (double) h2;
    double px, py;
    for (int i = 0; i < h2; i++) {
        for (int j = 0; j < w2; j++) {
            // pick the nearest source pixel for each target pixel
            px = Math.floor(j * x_ratio);
            py = Math.floor(i * y_ratio);
            temp[(i * w2) + j] = pixels[(int) ((py * w1) + px)];
        }
    }
    return temp;
}
This Java function takes an array of pixel values (original size w1 x h1) and returns a nearest-neighbour (up/down)-scaled array of pixels with dimensions w2 x h2.
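A quick usage sketch with Java's standard imaging APIs (javax.imageio.ImageIO, java.awt.image.BufferedImage), for the 8x8 fingerprint step from the question; "input.jpg" is a placeholder:
// Downscale an image to 8x8 using the function above.
BufferedImage src = ImageIO.read(new File("input.jpg"));
int w1 = src.getWidth(), h1 = src.getHeight();
int[] pixels = src.getRGB(0, 0, w1, h1, null, 0, w1);
int[] small = resizePixels(pixels, w1, h1, 8, 8);
BufferedImage dst = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
dst.setRGB(0, 0, 8, 8, small, 0, 8);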