I need a simple opinion from all the gurus!
I developed a program to do some matrix calculations. It works fine with small matrices. However, when I start calculating BIG matrices with thousands of columns and rows, the speed collapses.
I was thinking of processing each row, writing the result to a file, freeing the memory, and then processing the second row and writing it to the file, and so on.
Will it help in improving speed? I would have to make big changes to implement this, which is why I need your opinion first. What do you think?
Thanks
P.S.: I know about the Colt and Jama matrix packages. I cannot use these packages due to company rules.
Edited
In my program I store each matrix in a two-dimensional array, which is fine when the matrix is small. When it has thousands of columns and rows, however, keeping the whole matrix in memory for the calculation causes performance problems. The matrices contain floating-point values. For processing, I read the whole matrix into memory, then start the calculation; once it finishes I write the result to a file.
Is memory really your bottleneck? Because if it isn't, then stopping to write things out to a file is always going to be much, much slower than keeping everything in memory. It sounds like you are probably running into a limitation of your algorithm.
Perhaps you should consider optimising the algorithm first.
And as I always say for any performance issue - asking people is one thing, but there is no substitute for trying it! Opinions don't matter when the real-world performance is measurable.
I suggest using profiling tools and timing statements in your code to work out exactly where your performance problem is before you start making changes.
You could spend a long time 'fixing' something that isn't the problem. I suspect that the file IO you suggest would actually slow your code down.
If your code effectively has a loop nested within another loop to process each element then you will see your speed drop away quickly as you increase the size of the matrix. If so, an area to look at would be processing your data in parallel, allowing your code to take advantage of multiple CPUs/cores.
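For illustration only, here is a minimal sketch of that idea, assuming a naive double[][] multiplication; the class and method names are made up, not the poster's code:

    import java.util.stream.IntStream;

    public class ParallelMultiply {
        // Each row of the result depends only on one row of 'a' and all of 'b',
        // so the outer loop can safely be spread across cores.
        static double[][] multiply(double[][] a, double[][] b) {
            int rows = a.length, cols = b[0].length, inner = b.length;
            double[][] c = new double[rows][cols];
            IntStream.range(0, rows).parallel().forEach(i -> {
                for (int j = 0; j < cols; j++) {
                    double sum = 0.0;
                    for (int k = 0; k < inner; k++) {
                        sum += a[i][k] * b[k][j];
                    }
                    c[i][j] = sum;
                }
            });
            return c;
        }
    }

Whether this actually helps depends on the matrix size and the number of cores, so measure it, as the other answers suggest.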
Consider a more efficient sparse matrix data structure rather than a multidimensional array (if that is what you are using now).
You need to remember that multiplying an NxN matrix by an NxN matrix takes about 2*N^3 calculations. Even so, it shouldn't take hours. You should get an improvement (about 30%) by transposing the second matrix (see the sketch below), but it really shouldn't be taking hours.
So as you double N, you increase the time by 8x. Worse than that, a matrix which fits into your cache is very fast, but once it is more than a few MB the data has to come from main memory, which slows your operations down by another 2-5x.
Putting the data on disk will really slow down your calculation. I would only suggest doing this if your matrix doesn't fit in memory, and it will make things 10x-100x slower, so buying a little more memory is a better idea. (In your case your matrices should be small enough to fit into memory.)
I tried Jama, which is a very basic library that uses two-dimensional arrays instead of a single one-dimensional array; on a 4-year-old laptop it took 7 minutes. You should be able to halve that time just by using the latest hardware, and with multiple threads cut it to below one minute.
EDIT: Using a Xeon X5570, Jama multiplied two 5000x5000 matrices in 156 seconds. A parallel implementation I wrote cut this time to 27 seconds.
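For what it's worth, a sketch of the transposition trick mentioned above (a hypothetical method, not Jama's code): copying the second matrix's columns into rows first means both inner-loop reads walk memory sequentially, which is where the improvement comes from.

    public class TransposedMultiply {
        static double[][] multiply(double[][] a, double[][] b) {
            int rows = a.length, cols = b[0].length, inner = b.length;
            double[][] bT = new double[cols][inner];          // transpose of b
            for (int k = 0; k < inner; k++)
                for (int j = 0; j < cols; j++)
                    bT[j][k] = b[k][j];

            double[][] c = new double[rows][cols];
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    double sum = 0.0;
                    for (int k = 0; k < inner; k++) {
                        sum += a[i][k] * bT[j][k];            // both rows read sequentially
                    }
                    c[i][j] = sum;
                }
            }
            return c;
        }
    }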
Use the profiler in jvisualvm in the JDK to identify where the time is spent.
I would do some simple experiments to identify how your algorithm scales, because it sounds like you might be using one that has a higher runtime complexity than you think. If it runs in O(N^3) (which is common for matrix multiplication) then doubling the input size will increase the run time eightfold.
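A rough way to run that experiment (everything below is made up for illustration): time the same workload at size N and at 2N and look at the ratio.

    public class ScalingCheck {
        // Hypothetical workload: a naive O(n^3) multiplication of two random n x n matrices.
        static void workload(int n) {
            java.util.Random rnd = new java.util.Random(42);
            double[][] a = new double[n][n], b = new double[n][n], c = new double[n][n];
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) {
                    a[i][j] = rnd.nextDouble();
                    b[i][j] = rnd.nextDouble();
                }
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    for (int k = 0; k < n; k++)
                        c[i][j] += a[i][k] * b[k][j];
        }

        static long timeIt(int n) {
            long start = System.nanoTime();
            workload(n);
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            long small = timeIt(500);
            long large = timeIt(1000);
            // A ratio near 8 suggests cubic scaling, near 4 quadratic, near 2 linear.
            System.out.printf("ratio = %.1f%n", (double) large / small);
        }
    }

A single run like this is crude (no JIT warm-up, for instance), but it is usually enough to spot the complexity class.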
The app I'm working on lets users simulate tests and answer them offline. I have software that takes the data from my database (the questions, alternatives, question type, etc.) and turns it into an array.
I don't know which is most efficient (memory-wise): creating one object with a big array holding all the questions, creating separate objects (one per subject, for example) with an array each, or creating multiple arrays in the same object. Is it OK to create an array with about 1000 arrays inside, or is it better to split it into, say, 10 arrays with 100 arrays inside each?
P.S.: During a test I will only use 30 items from the array, so I'll take the entries from the big array (or from the multiple arrays) and add them to the small 30-entry array that is created according to the user's inputs.
What I would like to use
I would like a big array, because for me it would be easier to sort and to create random tests. Some people are saying 1000 entries aren't too much, so I think I'll stick to a big array. What would be too big? 10k? 100k?
There are three kinds of efficiency you need to consider:
Memory efficiency; i.e. minimizing RAM utilization
CPU efficiency
Programmer efficiency; i.e. minimizing the amount of your valuable time spent on writing code, writing test cases, debugging, and maintaining the code.
Note that the above criteria work against each other.
Memory Efficiency
The memory size in bytes of an array of N references in Java is given by
N * reference_size + array_header_size + padding
where:
reference_size is the size of a reference in bytes (typically 4 or 8)
array_header_size is typically 12 bytes
padding is greater than or equal to zero, and less than the heap node size granularity.
The array itself also has a unique reference which must be held in memory somewhere.
So, if you split a large array into M smaller arrays, you will be using at least (M - 1) * 16 extra bytes of RAM, and possibly more. On the other hand, we are talking about bytes here, not kilobytes or megabytes. So this is hardly significant.
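To put numbers on that, here is a tiny sketch using the formula above with assumed values (4-byte compressed references, a 12-byte array header, 8-byte alignment); the real figures depend on the JVM and its settings.

    public class ArrayOverhead {
        // Rough estimate only, using the formula above with assumed sizes.
        static long arrayBytes(long n) {
            long raw = n * 4 + 12;          // references + array header
            return (raw + 7) / 8 * 8;       // pad up to the next 8-byte boundary
        }

        public static void main(String[] args) {
            System.out.println("1 array of 1000 refs : " + arrayBytes(1000));       // 4016
            System.out.println("10 arrays of 100 refs: " + 10 * arrayBytes(100));   // 4160
        }
    }

The difference is around 144 bytes, which matches the (M - 1) * 16 figure above.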
CPU Efficiency
This is harder to predict. The CPU utilization effects will depend largely on what you do with the arrays, and how you do it.
If you are simply subscripting (indexing) an array, that operation doesn't depend on the array size. But if you have multiple arrays (e.g. an array of arrays) then there will be additional overhead in determining which array to subscript.
If you are searching for something in an array, then the larger the array you have to search, the longer it will take (on average). But if you split a large array into smaller arrays, that doesn't necessarily help ... unless you know beforehand which of the smaller arrays to search.
Programmer Efficiency
It will probably make your code more complicated if you use multiple arrays rather than one. More complicated code means more programmer effort in all phases of the application's development and maintenance lifecycle. It is hard to quantify how much extra effort is involved. However, programmer effort means cost (salaries) and time (deadlines, time to market, etc.), and this is likely to outweigh any small savings in memory and CPU.
Scalability
You said:
Some people are saying 1000 entries aren't too much, so I think I'll stick to a big array. What would be too big? 10k, 100k?
Once again, it depends on the context. In reality, the memory used for an array of 100K instances of X depends largely on the average size of X. You will most likely run out of memory for the X instances themselves long before the array becomes the problem.
So, if you want your application to scale up indefinitely, you should probably change the architecture so that it fetches the questions / answers from the database on demand rather than loading them all into memory on start up.
Premature Optimization
Donald Knuth is often (mis-)quoted1 as saying:
"Premature optimization is the root of all evil."
What he is pointing out is that programmers are inclined to optimize things that don't really need optimizing, or spend their effort optimizing the wrong areas of their code based on incorrect intuitions.
My advice on this is the following:
Don't do fine-grained optimization too early. (This doesn't mean that you should ignore efficiency concerns in the design and coding stages, but my advice would be to consider only the major issues; e.g. complexity of algorithms, granularity of APIs and database queries, and so on. Especially things that would be a lot of effort to fix later.)
If and when you do your optimization, do it scientifically:
Use a benchmark to measure performance.
Use a profiler to find performance hotspots and focus your efforts on those.
Use the benchmark to see if the optimization has improved things, and abandon optimizations that don't help.
Set some realistic goals (or time limits) for your optimization and stop when you reach them.
1 - The full quotation is more nuanced. Look it up. And in fact, Knuth is himself quoting Tony Hoare. For a deeper exploration of this, see https://ubiquity.acm.org/article.cfm?id=1513451
To be honest, 1000 is not a big size; what matters is the size of the elements.
However, in the future the number of entries will grow, and you definitely don't want to change the logic every time. So you have to design it in an efficient manner that works for growing data as well.
Memory concern: whether you store all the data in one array or in 10 arrays, it will take roughly the same amount of memory (the difference is minimal).
But if you have 10 arrays then management could be difficult, and with growing demands you may face more complexity.
I would suggest using a single array, which will be easier to manage. You could also consider an ArrayList for convenience; note, though, that a LinkedList will not give you faster searches.
I'm programming something in Java; for context see this question: Markov Model decision process in Java
I have two options:
byte[][] mypatterns = new byte[MAX][4];
or
ArrayList<byte[]> mypatterns
I can use a Java ArrayList and append a new array whenever I create one, or use a static array by calculating all possible data combinations, then looping through to see which indexes are 'on or off'.
Essentially, I'm wondering if I should allocate a large block that may contain uninitialized values, or use the dynamic array.
This runs every frame, so looping through 200 elements each frame could be very slow, especially because I will have multiple instances of this loop.
Based on theory and on what I have heard, dynamic arrays are very inefficient.
My question is: would looping through an array of, say, 200 elements be faster than appending an object to a dynamic array?
Edit>>>
More information:
I will know the maximum length of the array, if it is static.
The items in the array will change frequently, but their sizes are constant, so I can easily overwrite them.
Allocating it statically would effectively make it a memory pool.
Some instances may have more of the data initialized than others.
You're right, really: I should use a profiler first, but I'm also just curious about the question 'in theory'.
The "theory" is too complicated. There are too many alternatives (different ways to implement this) to analyse. On top of that, the actual performance for each alternative will depend on the the hardware, JIT compiler, the dimensions of the data structure, and the access and update patterns in your (real) application on (real) inputs.
And the chances are that it really doesn't matter.
In short, nobody can give you an answer that is well founded in theory. The best we can give is recommendations that are based on intuition about performance, and / or based on software engineering common sense:
simpler code is easier to write and to maintain,
a compiler is a more consistent1 optimizer than a human being,
time spent on optimizing code that doesn't need to be optimized is wasted time.
1 - Certainly over a large code-base. Given enough time and patience, a human can do a better job for some problems, but that is not sustainable over a large code-base, and it doesn't take account of the facts that 1) compilers are always being improved, 2) optimal code can depend on things that a human cannot take into account, and 3) a compiler doesn't get tired and make mistakes.
The fastest way to iterate over bytes is as a single array. A faster way to process them is as int or long values, since processing 4-8 bytes at a time is faster than processing one byte at a time; however, it rather depends on what you are doing. Note: a byte[4] actually occupies 24 bytes on a 64-bit JVM, which means you are not making efficient use of your CPU cache. If you don't know the exact size you need, you might be better off creating a buffer larger than you need, even if you don't use all of it; i.e. in the case of the byte[][] you are already using 6x the memory you really need.
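A sketch of the "process 8 bytes at a time" idea, assuming the data is one flat byte[] whose length is a multiple of 8 (a real version would handle any leftover tail separately):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class WideByteProcessing {
        // Folds the whole array with XOR, reading 8 bytes per loop iteration
        // instead of 1 by pulling longs out of a ByteBuffer view.
        static long xorAll(byte[] data) {
            ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.nativeOrder());
            long acc = 0;
            while (buf.remaining() >= 8) {
                acc ^= buf.getLong();
            }
            return acc;
        }

        public static void main(String[] args) {
            System.out.println(xorAll(new byte[1 << 20]));   // 1 MB of zeros, just to show it runs
        }
    }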
Any performance difference will not be visible when you set initialCapacity on the ArrayList. You say that your collection's size can never change, but what if that logic changes?
Using an ArrayList you also get access to a lot of methods, such as contains.
As other people have said already, use ArrayList unless performance benchmarks say it is a bottleneck.
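To illustrate the initialCapacity and contains points (the MAX value below is made up for the example):

    import java.util.ArrayList;
    import java.util.List;

    public class PatternList {
        public static void main(String[] args) {
            final int MAX = 200;                        // hypothetical known upper bound
            // Sizing the list up front avoids the backing array being re-grown as you append.
            List<byte[]> myPatterns = new ArrayList<>(MAX);

            byte[] pattern = {1, 0, 1, 0};
            myPatterns.add(pattern);
            // contains() uses equals(); for arrays that is reference equality,
            // so it finds this exact array instance, not an equal-looking copy.
            System.out.println(myPatterns.contains(pattern));   // true
        }
    }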
I have been struggling to understand the following question.
Is a zero matrix necessarily memory efficient? Does a zero matrix cost less memory (or does it not cost any memory at all)?
I tried to verify this in Java, but it turns out the memory is allocated for the specified size regardless.
I am not sure about C/C++ or other languages like MATLAB and Octave, and how they manage matrix and vector memory.
The reason I am asking is that I want to build a huge sparse matrix where most of the entries are zero. It turns out that Java is not a good choice, because a zero matrix in Java still costs a lot of memory. Does anyone have experience with this problem? I am not sure how to deal with it; your help will be appreciated.
Thanks
A straightforward zero-filled matrix will cost you in any language: the amount of memory allocated does not depend on what numbers you fill it with.
Take a look at e.g. UJMP, which provides sparse matrix support and many algorithms. Other implementations probably exist too.
In general, if you find something to be difficult to implement but likely useful, google for open-source libraries. Chances are many wheels have been invented already.
Because you have to allocate space for your matrix, it will take up space regardless of which numbers it holds (even if they are all zero).
However, I can imagine that someone somewhere has designed a data structure to handle this.
The first thing that pops into my mind is that you could create a data structure which holds each position together with its corresponding value. If you asked for the value at a position that doesn't exist, it would return 0. Of course this would be inefficient for small matrices or for matrices with only a few zeroes. Just an idea.
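A minimal sketch of that idea, as I imagine it (nothing official, just a map from (row, column) to the non-zero values, returning 0 for everything else):

    import java.util.HashMap;
    import java.util.Map;

    public class SparseMatrix {
        private final Map<Long, Double> entries = new HashMap<>();

        private static long key(int row, int col) {
            return ((long) row << 32) | (col & 0xFFFFFFFFL);
        }

        public void set(int row, int col, double value) {
            if (value == 0.0) {
                entries.remove(key(row, col));           // only non-zero entries are stored
            } else {
                entries.put(key(row, col), value);
            }
        }

        public double get(int row, int col) {
            return entries.getOrDefault(key(row, col), 0.0);
        }
    }

This only pays off when the matrix is large and mostly zero; each stored entry costs far more than 8 bytes because of boxing and map overhead.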
I have a problem. I work in Java (Eclipse). My program calculates some mathematical physics, and I need to draw an animation (using the Java SWT package) of the process (some hydrodynamics). The problem is 2D, so each iteration returns a two-dimensional array of numbers. One iteration takes a rather long time, and the time needed changes from one iteration to another, so showing the pictures dynamically as the program runs seems like a bad idea. My idea was therefore to store a three-dimensional array, where the third index represents time, and build the animation once the calculations are over. But since I want accuracy from my program, I need a lot of iterations, so the program easily reaches the maximum array size. So the question is: how do I avoid creating such an enormous array, or how do I avoid the limitations on array size? I thought about creating a file to store the data and then reading from it, but I'm not sure about this. Do you have any ideas?
When I was working on a procedural architecture generation system at university for my dissertation, I created small binary files for the calculated data that were extremely easy to read and parse. This meant that the data could be read back in within an acceptable amount of time, despite there being quite a lot of it...
I would suggest doing the same for your animations... It might be worth storing maybe five seconds of animation per file and then caching each of these just before it is required...
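Something along these lines, for example, assuming each frame is a double[rows][cols] (the class name and file layout are just an illustration): write the dimensions first, then the raw values, so the file can be read back without any extra metadata.

    import java.io.BufferedOutputStream;
    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class FrameWriter {
        static void writeFrames(String fileName, double[][][] frames) throws IOException {
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(fileName)))) {
                out.writeInt(frames.length);             // number of frames in this file
                out.writeInt(frames[0].length);          // rows per frame
                out.writeInt(frames[0][0].length);       // columns per frame
                for (double[][] frame : frames) {
                    for (double[] row : frame) {
                        for (double value : row) {
                            out.writeDouble(value);
                        }
                    }
                }
            }
        }
    }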
Also, how large are your arrays? You could increase the amount of memory your JVM is allowed to allocate if it's the memory limit, rather than the maximum array size, that you're hitting.
I hope this helps and isn't just my ramblings...
Dear StackOverflowers,
I am in the process of writing an application that sorts a huge amount of integers from a binary file. I need to do it as quickly as possible, and the main performance issue is disk access time; since I make a multitude of reads, the algorithm slows down quite significantly.
The standard way of doing this would be to fill ~50% of the available memory with a buffered object of some sort (BufferedInputStream etc) then transfer the integers from the buffered object into an array of integers (which takes up the rest of free space) and sort the integers in the array. Save the sorted block back to disk, repeat the procedure until the whole file is split into sorted blocks and then merge the blocks together.
The strategy for sorting the blocks utilises only 50% of the memory available since the data is essentially duplicated (50% for the cache and 50% for the array while they store the same data).
I am hoping that I can optimise this phase of the algorithm (sorting the blocks) by writing my own buffered class that caches data straight into an int array, so that the array could take up all of the free space, not just 50% of it; this would reduce the number of disk accesses in this phase by a factor of 2. The thing is, I am not sure where to start.
EDIT:
Essentially I would like to find a way to fill up an array of integers by executing only one read on the file. Another constraint is that the array has to use most of the free memory.
If any of the statements I made are wrong, or at least seem to be, please correct me;
any help appreciated,
Regards
When you say limited, how limited... <1 MB, <10 MB, <64 MB?
It makes a difference, since you won't actually get much benefit, if any, from having large BufferedInputStreams; in most cases the default buffer size of 8192 (JDK 1.6) is enough, and increasing it doesn't usually make much difference.
Using a smaller BufferedInputStream should leave you with nearly all of the heap to create and sort each chunk before writing it to disk.
You might want to look into the Java NIO libraries, specifically FileChannel and IntBuffer.
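As a rough sketch of that (the method name and parameters are invented for the example): one large positioned read fills a ByteBuffer, and an IntBuffer view hands the values over to an int[] without a per-byte copy loop.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.IntBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class ChunkReader {
        static int[] readChunk(Path file, long byteOffset, int intsToRead) throws IOException {
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                ByteBuffer bytes = ByteBuffer.allocate(intsToRead * 4);
                channel.read(bytes, byteOffset);          // a single positioned read
                bytes.flip();
                IntBuffer ints = bytes.asIntBuffer();
                int[] chunk = new int[ints.remaining()];
                ints.get(chunk);
                return chunk;
            }
        }
    }

If even the temporary ByteBuffer is too much duplication, a memory-mapped file (FileChannel.map) would let you read the ints directly from the mapping without a separate heap buffer.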
You don't give many hints, but two things come to my mind. First, if you have many integers but not that many distinct values, bucket sort could be the solution (see the sketch below).
Secondly, one word (OK, term) screams in my head when I hear that: external tape sorting. In the early days of computing (i.e. the stone age), data lived on tapes, and it was very hard to sort data spread over multiple tapes. That is very similar to your situation. Merge sort was indeed the most commonly used sort in those days, and as far as I remember Knuth's TAOCP has a nice chapter about it. There may be some good hints there about the sizes of caches, buffers and the like.
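A minimal sketch of the bucket (counting) sort idea from the first point, assuming the integers are non-negative and the largest value is known and not too big:

    public class CountingSort {
        static int[] sort(int[] values, int maxValue) {
            int[] counts = new int[maxValue + 1];
            for (int v : values) {
                counts[v]++;                              // tally each distinct value
            }
            int[] sorted = new int[values.length];
            int pos = 0;
            for (int v = 0; v <= maxValue; v++) {
                for (int c = counts[v]; c > 0; c--) {
                    sorted[pos++] = v;
                }
            }
            return sorted;
        }
    }

The counts array is tiny compared with the data itself, so when this applies it sidesteps the memory problem entirely.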