I'm working on a project where I need to plot some data. At the moment I keep all the data in one object and give the pointer to this object to the graphs. But the data can change dynamically, in which case I also need to change the data the graphs get. So here is my question:
Should I create a new array every time I edit the data and then change the pointers in the graphs, or should I just change the data within the original array and then repaint the graphs?
Using immutable data results in a cleaner, more predictable API. If you mutate the array that the graph API is currently using, nasty interactions are lurking just around the corner. This may force the graph API to defensively copy the array internally; at that point you lose: you end up with more copying than you would have needed had you started with an immutable approach up front.
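For example, here is a minimal sketch of the copy-on-write idea (the Graph interface is only a stand-in for whatever plotting component you actually use):

```java
import java.util.Arrays;

// Stand-in for the plotting component; the point is handing the graph a
// fresh array instead of mutating the one it already holds.
interface Graph {
    void setData(double[] data);
    void repaint();
}

final class DataModel {
    private double[] data = new double[0];

    void update(Graph graph, int index, double newValue) {
        double[] next = Arrays.copyOf(data, Math.max(data.length, index + 1));
        next[index] = newValue;   // edit the copy, never the array the graph is reading
        data = next;
        graph.setData(next);      // swap the pointer; the old array stays intact
        graph.repaint();
    }
}
```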
Keeping a single model is the preferred approach, especially from a memory-performance point of view. However, it may depend: if you use the same model somewhere else, then you have to think a little harder.
I'm creating a graph in Java using JGraphT and adding vertices and edges from a list using a stream.
My question is:
Can I use stream().parallel() to add them faster?
No, at least not as far as I'm aware. Essentially, adding a vertex or edge boils down to two steps: (a) check whether the edge/vertex already exists and, if not, (b) add the edge/vertex. Depending on the type of graph, step (b) involves adding the object to the appropriate container that stores the edges/vertices. I'm not an expert on concurrent programming, but I don't see how a parallel stream can do the above any faster.
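For what it's worth, a minimal sketch of the plain sequential approach (assuming a SimpleGraph with String vertices; the default graph implementations aren't thread-safe, so a parallel stream would need external synchronization anyway):

```java
import org.jgrapht.Graph;
import org.jgrapht.graph.DefaultEdge;
import org.jgrapht.graph.SimpleGraph;
import java.util.List;

public class BuildGraph {
    public static void main(String[] args) {
        Graph<String, DefaultEdge> graph = new SimpleGraph<>(DefaultEdge.class);
        List<String> vertices = List.of("a", "b", "c");
        vertices.forEach(graph::addVertex);   // a plain sequential stream/loop is the safe choice
        graph.addEdge("a", "b");
    }
}
```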
I don't know exactly what your use case is, or what you are trying to accomplish. There are, however, some optimized, special graph types in the jgrapht-opt package that might benefit you. The graph functionality doesn't change (i.e. you can run the same algorithms on them); only the way the graph is stored changes. Some storage mechanisms are more memory efficient, allowing you to store massive graphs using little memory. Others, such as the sparse graphs, can be created more quickly and have faster access operations, but these graphs are typically immutable, i.e. once created they cannot be changed. What you need really depends on your use case.
I am currently developing my own app (just another remake of the famous "Game of Life") and I want to add a "revert" button. My game basically consists of a two-dimensional array: Cell[][]...
So: my idea was to create an ArrayList to which that array is added on every update... (with a limit of 50 entries)
But then I thought that that would be a lot of objects in that list... So:
Would it be more performant to have an ArrayList in each cell in the 2-dimensional array, containing the history of itself, or to have a huge ArrayList containing entire game states as a history?
(I don't think you need any of my code to answer this question, but I will post it if you do)
Instead of storing cell state, why not keep the events that transform the cell state? An event would simply encapsulate the previous state and the next state, or maybe even an enum code that can be executed in reverse. This would already significantly reduce the data you need to store.
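A rough sketch of that idea, assuming a boolean[][] board (the names are just illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// One event per cell that flipped; a generation is just a list of these.
record CellChange(int row, int col, boolean before, boolean after) { }

class History {
    private final Deque<List<CellChange>> steps = new ArrayDeque<>();

    void record(List<CellChange> changesOfOneGeneration) {
        steps.push(changesOfOneGeneration);
    }

    void revert(boolean[][] board) {
        if (steps.isEmpty()) return;
        for (CellChange c : steps.pop()) {
            board[c.row()][c.col()] = c.before();   // walk the events backwards
        }
    }
}
```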
Also, it makes a difference whether you're working natively or on a garbage-collected VM platform such as Java or C#. The former tends to be faster and is more forgiving when addressing large amounts of storage, while the latter can start thrashing wildly if you store lots of objects.
But the real proof of the pudding is in the eating / TINSTATFC: "there is no such thing as the fastest code" (c) Michael Abrash. In other words: benchmark your code and find out! (And come tell us.)
In my opinion, an ArrayList per cell would work fine; however, if you only want to store 50 states, you'll have to create a custom ADT or utilize Guava's Collections API.
I recommend using a Stack rather than an ArrayList, as reverting every cell would be a simple operation. In Java, the correct ADT to use is an ArrayDeque, but you'll have to handle capacity yourself.
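For instance, a small sketch of handling the capacity yourself on top of an ArrayDeque (the 50-state limit comes from the question):

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BoundedHistory<T> {
    private static final int LIMIT = 50;
    private final Deque<T> states = new ArrayDeque<>();

    void push(T state) {
        if (states.size() == LIMIT) {
            states.removeLast();     // drop the oldest state to stay within the cap
        }
        states.push(state);          // newest state goes on top
    }

    T pop() {
        return states.pop();         // most recent state, for the revert button
    }

    boolean isEmpty() {
        return states.isEmpty();
    }
}
```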
I am reading this blog post about making animations with Gnuplot and the Cairo terminal, where the plan of the algorithm is simply
to save PNG images to the working directory, and
to save the latest video to the working directory.
I would like to have something more, such that the user can also browse the images in real time while they are being converted:
Data-parallelism model - a data structure regularly arranged in an array
to give the user some list in some interface which the user can browse with arrow buttons
in this interface, new images are added to the end of the list
the user can also remove bad images from the stream in real time
This may work well with the data-parallelism model of parallel programming, i.e. a data set regularly structured in an array.
The operations (additions, deletions) can operate on this data, but independently, in distinct processes.
Let's assume, for simplicity, that there is no need for efficient searches in Version 1.
However, if you come up with a model which can do that as well, I am happy to consider it - let's call it Version 2.
I think a list is not a good data structure here because of the desired support for deletions and continuous, easy additions at the end of the data structure.
A stack is not going to work either because of the deletions.
I think some sort of tree data structure could work, because deletions and searches are rather cheap there.
However, a simple array in the data-parallelism model may be sufficient.
Languages
I think Java is a good option here because of parallelism.
However, any language and pseudocode are good too.
Frontend
I have an intuition that the frontend requirement for such a system would be Qt as a terminal emulator.
What is a better data structure for cheap deletions and continuous additions to the end?
Java's LinkedList seems to be the thing you could use for Version 1. You can use its single-parameter add() to append to the list in constant time. If by "real time" you mean the image is currently in the user's display and thus already pointed to somehow, you can delete it in constant time as well.
Optimal use of memory and no re-allocation, as you would have with an ArrayList.
Any doubly linked list implemented with node objects (as opposed to an array) would do.
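A minimal sketch of what I mean for Version 1, assuming images are identified by their file Path (the class and file names are just illustrative):

```java
import java.nio.file.Path;
import java.util.LinkedList;
import java.util.ListIterator;

public class ImageBrowserSketch {
    public static void main(String[] args) {
        LinkedList<Path> images = new LinkedList<>();
        images.add(Path.of("frame-0001.png"));   // producer appends as frames are rendered
        images.add(Path.of("frame-0002.png"));

        ListIterator<Path> cursor = images.listIterator();
        Path shown = cursor.next();              // the user steps forward with an arrow button
        System.out.println("showing " + shown);
        cursor.remove();                         // the user marks the shown image as bad: O(1)
    }
}
```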
Your second version isn't clear enough.
I want to use a list which will store objects of some type (let's say, for simplicity, books) so I can show them in a ListView object.
I'm kind of new to this, so I'm asking more advanced and experienced users for help with the following questions:
Which one should I use? LinkedList is something I'm familiar with. However, how do I make the app maintain the list? Should I save the details of each object in an XML file? If I do that, isn't it just better to use an ArrayList? (Please exclude things related to processing time from your answer.)
If not via XML, how do I 'store' a list for later use, even after the app is shut down and later started again?
Thanks!
ArrayLists are good to use when you want random access via an indexed lookup. They're just as well suited for iterating through as LinkedLists.
On the other hand, a LinkedList doesn't need to be resized; it only runs out of room when you run out of memory to hold more nodes. If you have lots of data growth, or you're doing lots of sequential adds/removes, then a LinkedList will win out in performance.
Sometimes you need both random access and growth, in those cases you need to make a judgment call on which criteria you want to be more performant.
In your current use case, I'd probably choose an ArrayList: you'll likely know how big the list should be, it won't be changing in size that often, and if you want to display this thing in a GUI, you're likely to need indexed lookups.
As far as storing the list goes, XML is as good a means as any; CSV files (or plain line-delimited text files), YAML, JSON, and even class serialization are some alternatives. Choose whatever is easiest and most convenient for you.
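As an illustration, here is a rough sketch of the line-delimited text option (Book and the file layout are just assumptions: one book per line, fields separated by a tab, so titles and authors must not contain tabs):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

record Book(String title, String author) { }

class BookFile {
    static void save(List<Book> books, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Book b : books) {
            lines.add(b.title() + "\t" + b.author());   // one book per line
        }
        Files.write(file, lines);                       // overwrite with the current list
    }

    static List<Book> load(Path file) throws IOException {
        List<Book> books = new ArrayList<>();
        for (String line : Files.readAllLines(file)) {
            String[] parts = line.split("\t", 2);
            books.add(new Book(parts[0], parts[1]));
        }
        return books;
    }
}
```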
You should store your data in SQLite; Android provides a very easy way to do this. Look at this tutorial: http://www.vogella.de/articles/AndroidSQLite/article.html
I would prefer ArrayList over LinkedList because it has methods to manipulate the size of the array that is used internally to store the list.
If I were going to use it as a stack, queue, or double-ended queue, then I would use a LinkedList.
I'm trying to design a lightweight way to store persistent data in Java. I already have a very efficient way to serialize POJOs to DataOutputStreams (and back), but I'm trying to think of a good way to ensure that changes to the data in the POJOs get serialized when necessary.
This is for a client-side app where I'm trying to keep the size of the eventual distributable as low as possible, so I'm reluctant to use anything that would pull in heavyweight dependencies. Right now my distributable is almost 10 MB, and I don't want it to get much bigger.
I've considered db4o, but it's too heavy - I need something light. Really, it's probably more a design pattern I need than a library.
Any ideas?
The 'lightest weight' persistence option will almost surely be simply marking some classes Serializable and reading/writing from some fixed location. Are you trying to accomplish something more complex than that? If so, it's time to bundle HSQLDB and use an ORM.
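A minimal sketch of that first option, with hypothetical class and file names:

```java
import java.io.*;

// Whatever POJO you want to persist; it just needs to be Serializable.
class Settings implements Serializable {
    private static final long serialVersionUID = 1L;
    String lastOpenedFile;
    int windowWidth;
}

class SettingsStore {
    private static final File LOCATION = new File(System.getProperty("user.home"), ".myapp.dat");

    static void save(Settings s) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(LOCATION))) {
            out.writeObject(s);                       // write to the fixed location
        }
    }

    static Settings load() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(LOCATION))) {
            return (Settings) in.readObject();
        }
    }
}
```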
If your users are tech savvy, or you're just worried about initial payload, there are libraries which can pull dependencies at runtime, such as Grape.
If you already have a compact binary output format (which I assume you have, if you can serialize efficiently to a DataOutputStream), then an efficient and general technique is to use run-length encoding on the difference between the previous byte-array output and the new byte-array output.
Points to note:
If the object has not changed, the difference in byte arrays will be an array of zeros and hence will compress very small....
The first time you serialize the object, consider the previous output to be all zeros, so that you communicate a complete set of data
You probably want to be a bit clever when the object has variable-sized substructures....
You can also try zipping the difference rather than using RLE - this might be more efficient in some cases where you have a large object graph with a lot of changes.
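A rough sketch of that "zip the difference" variant (assuming both snapshots are plain byte arrays produced by your existing DataOutputStream serialization; the method name is made up):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

class DeltaEncoder {
    // XOR the new snapshot against the previous one so unchanged bytes become
    // zero, then compress the result.
    static byte[] encodeDelta(byte[] previous, byte[] current) throws IOException {
        byte[] delta = new byte[current.length];
        for (int i = 0; i < current.length; i++) {
            byte prev = i < previous.length ? previous[i] : 0; // first write diffs against all zeros
            delta[i] = (byte) (current[i] ^ prev);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream zip = new DeflaterOutputStream(out)) {
            zip.write(delta);                                  // mostly zeros -> compresses very small
        }
        return out.toByteArray();
    }
}
```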