Shared List between classes - java

I am in the design stage of my next task and I am not sure whether my idea for it is right or not, as I am not quite sure how to realize it in a UML diagram. I would much appreciate your comments on it.
Basically the point is that I am going to have a reader and a writer class. They will be used to read and write values from/to a certain data source, e.g. a database or a Modbus PLC. Each of these values is identified by a unique id in my data model and in the data source. The read operation will be performed periodically for all the values by sending all their ids and querying their values. The write operation is made each time one of these values changes in my data model and needs to be sent to the data source.
My idea is to have a shared List for the reader and the writer containing all the objects in my datamodel. For example:
class ExternalObject {
    private String id;
    private String transactionId;
    private String value;
    private String lastValue;
}
There will be a controller class that, when a value changes in my data model, writes it into the value attribute of the right object. The Writer class, which is continuously iterating over all the elements of the list, will see that value is not null and send it; after that, it copies it into lastValue and resets value to null.
Besides, the Reader class, which is continuously reading the values from this data source, will save a value into my data model whenever the value it reads differs from lastValue.
By now I suppose you get the idea. There will of course be some more logic to reset values when there is no connection to the data source, or to send or read the initial values, but that's another matter.
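To make the protocol concrete, here is a minimal sketch of the Writer's scan loop. The DataSourceConnection interface, the getters/setters on ExternalObject, the per-object synchronization, and the poll interval are all illustrative assumptions, not part of the design above:

import java.util.List;

// Hypothetical transport abstraction standing in for the database / Modbus connection.
interface DataSourceConnection {
    void send(String id, String value);
}

// Sketch of the Writer's periodic scan over the shared list.
class Writer implements Runnable {
    private final List<ExternalObject> shared;
    private final DataSourceConnection connection;

    Writer(List<ExternalObject> shared, DataSourceConnection connection) {
        this.shared = shared;
        this.connection = connection;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            for (ExternalObject obj : shared) {
                synchronized (obj) { // guard against the controller writing concurrently
                    String pending = obj.getValue();
                    if (pending != null) {
                        connection.send(obj.getId(), pending);
                        obj.setLastValue(pending); // remember what was sent ...
                        obj.setValue(null);        // ... and clear the pending slot
                    }
                }
            }
            try {
                Thread.sleep(100); // poll interval, an arbitrary assumption
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}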
My concern is this shared list. I am not sure whether it is fine, in object-oriented design, to share lists or objects like this. If it is, the next problem is that I don't know how to model it in a UML diagram to indicate that one object is shared between two classes.
Any ideas are most welcome.

Unfortunately this is not a complete answer, because I've never implemented anything like this at industrial grade, but a few notes come to mind:
1) New IDs: the Reader polls for the IDs it knows about - but what about new IDs, inserted by external processes?
2) Performance: do you control the schema, and are your machine clocks synchronized within some reasonable margin? If so, perhaps you could put a timestamp on each object, and the reader could 'refresh' only objects that were edited/inserted since its last refresh (plus some safety margin)?
3) List: I wouldn't say "object-oriented design forbids list sharing", but for your own convenience you might want to consider a wrapper data structure with methods for searching/updating/inserting/deleting (see the sketch after this list). That way you can easily replace the underlying data structure at will, e.g. with a map.
4) Transactions: how were you going to handle transactions against those data sources?
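A minimal sketch of the wrapper idea from point 3, assuming the ExternalObject class from the question; the class name SharedObjectStore, the getId() accessor, and the choice of a ConcurrentHashMap keyed by id are all illustrative assumptions:

import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;

// Wraps the shared collection so the Reader, the Writer and the controller all
// go through one interface; the backing structure can be swapped out later.
class SharedObjectStore {
    private final ConcurrentHashMap<String, ExternalObject> byId = new ConcurrentHashMap<>();

    void insert(ExternalObject obj) {
        byId.put(obj.getId(), obj); // assumes ExternalObject exposes getId()
    }

    ExternalObject find(String id) {
        return byId.get(id);
    }

    void remove(String id) {
        byId.remove(id);
    }

    Collection<ExternalObject> all() {
        return byId.values(); // live view for the Writer's periodic scan
    }
}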
Anyway, good luck.

Related

database or ObjectOutputStream, Object specific member or actual object for reference

I'm working on an application for a pharmacy. Basically, this application has a class "Item" and another class "SellingInvoice" which logs selling processes.
So my question here is: if the pharmacy is expected to have about ten thousand products in stock, and I'm storing these products in a linked list of type Item, and storing the invoices in a linked list also, then on closing the app I save them using ObjectOutputStream and reload them upon start - is that bad practice? Should I use a database instead?
My second question is: if I continue using LinkedList and ObjectOutputStream, what is better for performance and memory - storing the actual Item as a field in the invoice class, or storing just its ID and fetching the item by that ID when needed?
Thanks in advance.
It is a bad idea to use ObjectOutputStream like that.
Here are some of the reasons:
If your application crashes (or the power fails) before you "save", then all changes are lost.
Saving "all objects" is expensive.
Serialized objects are opaque. It is only practical to look at them from Java code.
Serialized objects are fragile. If your application classes change, you may find that old serialized objects can no longer be read. That's bad enough, but now consider what happens if your client wants to look at pharmacy records from 5 years ago ... from a backup tape.
Serialized objects provide no way of searching ... apart from reading all of the objects one at a time.
Designs which involve reading all objects into memory do not scale. You are liable to run out of memory. Or compromise on your requirements to avoid running out of memory.
By contrast:
A database won't lose changes that have been committed. Databases are much more resilient than serialized files to things like application errors and system-level failures.
Committing database changes is not as expensive, because you only write data that has changed.
Typical databases can be viewed, queried, and if necessary repaired using an off-the-shelf database tool.
Changing Java code doesn't break the database, and for schema changes there are established ways to migrate the existing database schema and records to match the updated application.
Databases have indexes and query languages for implementing efficient search.
Databases scale because the primary copy of the data is on disk, not in memory.
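To illustrate the searching point, here is a minimal JDBC sketch of an indexed lookup; the table name, column names, and the idea of storing only the item's ID in the invoice are assumptions for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class ItemDao {
    // Fetches a single item by ID; with an index on the id column this touches
    // a handful of pages instead of deserializing the whole object graph.
    static String findItemName(Connection con, long itemId) throws SQLException {
        try (PreparedStatement ps =
                 con.prepareStatement("SELECT name FROM item WHERE id = ?")) {
            ps.setLong(1, itemId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}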

Multithreading in Grails - Passing domain objects into each thread causes some fields to randomly be null

I am trying to speed up a process in a Grails application by introducing parallel programming. This particular process requires sifting through thousands of documents, gathering the necessary data from them, and exporting it to an Excel file.
After many hours of trying to track down why this process was going so slowly, I've determined that the process has to do a lot of work gathering specific parts of the data from each domain object. (Example: the domain object has lists of data inside it, and this process takes each index in these lists and appends it to a comma-separated string to make a nice-looking, sorted list in a cell of the Excel sheet. There are more examples, but those shouldn't be important.)
So anything that wasn't simple data access (document.id, document.name, etc.) was causing this process to take a long time.
My idea was to use a thread per document to gather all this data asynchronously; when each thread finished gathering the data, the result could come back to the main thread and be placed into the Excel sheet, now all with simple data access, because the thread had already gathered all the data.
This seems to be working; however, I have a bug involving the domain objects and the threads. Each thread is passed its corresponding document domain object, but for whatever reason the document domain objects will randomly have parts of their data changed to null.
For example: before the document is passed into the thread, one part of the domain object will hold a list that looks like this: [US, England, Wales]; randomly, at any point, the list may look like this in the thread: [US, null, Wales]. And this happens to any random part of the domain object, at any random time.
Generating the threads:
import java.util.concurrent.Executors
import java.util.concurrent.Future

def docThreadPool = Executors.newFixedThreadPool(1)
def docThreadsResults = new Future[filteredDocs.size()] // each element is a Future<Map>
def docCount = 0
filteredDocs.each {
    def final document = it
    def future = docThreadPool.submit(new DocumentExportCallable(document))
    docThreadsResults[docCount] = future
    docCount++
}
Getting the data back from the threads:
def count = 0
filteredDocs.each {
    def data = docThreadsResults[count].get()
    // ... build the Excel spreadsheet from 'data' ...
    count++
}
DocumentExportCallable class:
import java.util.concurrent.Callable

class DocumentExportCallable implements Callable<Map> {
    final def document

    DocumentExportCallable(document) {
        this.document = document
    }

    Map call() {
        def data = [:]
        // ... code to gather all the data into 'data' ...
        return data
    }
}
EDIT:
As seen below, it would be useful if I could show you the domain object; however, I am not able to do this. But the fact that you asked about the domain object had me thinking that it just might be where the problem lies. It turns out that every part of the domain object that randomly breaks in the threads is a variable in the domain object's "mapping" block which uses SQL joins to get the data for those variables. I've just been made aware of lazy vs. eager fetching in Grails, and I'm wondering if this might be where the problem lies: by default fetching is lazy, so this constant access to the db by each thread might be where things are going wrong. I believe finding a way to change this to eager fetching might solve the problem.
I have the answer to why these null values were appearing randomly. Everything seems to be working now, and my implementation performs much faster than the previous one!
It turns out I was unaware that Grails domain objects with 1-m relationships make separate SQL calls when you access those fields, even after you have fetched the object itself. This must have caused the threads to make non-thread-safe SQL calls which created those random null values. Setting these 1-m properties to be eagerly fetched in this specific case fixed the issue.
For anyone reading this later on, you'll want to read up on lazy vs. eager fetching to get a better understanding.
As for the code:
These are the 1-m variables that were the issue in my domain object:
static hasMany = [propertyOne: OtherDomainObject, propertyTwo: OtherDomainObject, propertyThree: OtherDomainObject]
I added a flag to my database call which would enable this code for this specific case, as I didn't want these properties to always be eagerly fetched throughout the app:
if (isEager) {
    fetchMode 'propertyOne', FetchMode.JOIN
    fetchMode 'propertyTwo', FetchMode.JOIN
    fetchMode 'propertyThree', FetchMode.JOIN
    // JOIN fetching returns one result row per joined child, so the root entity
    // appears multiple times; this transformer filters the duplicates back out
    setResultTransformer Criteria.DISTINCT_ROOT_ENTITY
}
My apologies, but at the moment I do not remember why I had to put the "setResultTransformer" in the code above; without it there were issues, though. Maybe someone later on can explain this; otherwise I'm sure a Google search will.
What is happening is that your Grails domain objects were becoming detached from the Hibernate session, thus hitting a LazyInitializationException when your thread attempted to load lazy properties.
It's good that switching to eager fetching worked for you, but it may not be an option for everyone. What you could also have done is use the Grails async task framework instead, as it has built-in session handling. See https://async.grails.org/latest/guide/index.html
However, even with Grails async tasks, passing an object between threads seems to detach it, as the new thread will have a newly bound session. The solutions that I have found were to call either .attach() or .merge() on the new thread to bind the object to that thread's session.
I believe the optimal solution would be to have Hibernate load the object on the new thread; that is, in your code snippet you would pass a document id and call Document.get(id) on the session-supported thread.
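A language-neutral sketch of that last idea in plain Java (the Grails specifics are left out); the DocumentLoader interface is a hypothetical stand-in for Document.get(id) running inside the worker thread's own session:

import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical loader standing in for Document.get(id).
interface DocumentLoader {
    Object load(long id);
}

class DocumentExportTask implements Callable<Map<String, Object>> {
    private final long documentId; // pass the id across threads, not the entity
    private final DocumentLoader loader;

    DocumentExportTask(long documentId, DocumentLoader loader) {
        this.documentId = documentId;
        this.loader = loader;
    }

    @Override
    public Map<String, Object> call() {
        Object document = loader.load(documentId); // loaded on this thread's session
        // ... gather the export data from 'document' ...
        return Map.of("id", documentId);
    }
}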

query a database for a list of objects in order to create objects from that list: proper way to do

I have a list of report types (objects) in the database that need to be generated for the user and sent out by email / printed / saved on HDD, etc.
One report ('skeleton') is one row in the database.
My question is: should I create a separate object for the query result of one row - a 'skeleton' report object - and then use that object to create the final 'report' object? Is this the correct way of handling such a task?
I have been told that it is easier to create a method and just get a row set from the database in it, then parse the row set for the parameters necessary to create the report, create the final report object, and so on.
I am not totally sure I understand your question correctly, but I assume that you want to know whether you should fill an object with the data from the database and parse that object when creating the report, or just pass the result set to the creation method?
I would recommend using a 'Skeleton' object and filling it, since you can reuse it later on, and in my opinion it makes the code much more readable.
More information on this topic:
In many applications the MVC pattern is used to structure the program. In this pattern you structure your program in three layers: the first for your UI (View), the second for your business logic (Controller), and the third for your persistent data (Model). These layers communicate only through domain model objects which represent your data (in your case this would be the 'Skeleton' object; such objects are also called POJOs). This is especially helpful if you suddenly want to change from a database to a text file or any other persistence strategy, since you should only have to change the model layer while keeping the other layers mostly the same (especially if you're using interfaces). You can find a lot on this pattern on the internet, and for most standard applications I would definitely recommend it.
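A minimal sketch of that row-to-object mapping step; the column names (id, title, recipient) and the Skeleton fields are hypothetical:

import java.sql.ResultSet;
import java.sql.SQLException;

// 'Skeleton' POJO: one instance per database row, passed between layers.
class Skeleton {
    long id;
    String title;
    String recipient;

    // Maps the current row of a ResultSet onto a Skeleton instance.
    static Skeleton fromRow(ResultSet rs) throws SQLException {
        Skeleton s = new Skeleton();
        s.id = rs.getLong("id");
        s.title = rs.getString("title");
        s.recipient = rs.getString("recipient");
        return s;
    }
}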

client view of very large collection of objects. How to optimize?

I have a 3-tier EJB application, and I need to create a view in a thick client (a desktop Java application) that shows a very large collection of objects (over 5000 orders). Each object has child properties that are themselves complex objects, for example:
import java.util.List;

class Address {
    String value;
    // other properties
}

class Order {
    public String number;
    private List<Address> addresses;

    // a collection of complex objects; the view needs the first and last
    // elements to show their properties
    public List<Address> getAddresses() {
        return addresses;
    }
    // other properties
}
The view is a table of Orders:
Number | FirstAddress | LastAddress | ...
My first attempt was to load the full list of orders (without child properties) and then download child objects dynamically when needed for display. But when I have 10000 orders and begin scrolling quickly, the UI becomes unresponsive.
Then I tried to load all the orders and all the children that need to be shown in the table, but the UI got very heavy and slow, possibly because of the memory cost. And it's not really a thick client at all, because I download almost all the data from the db.
What is the best practice for solving this task?
Assuming you are using a JTable as the view of a suitable TableModel, query the database using a SwingWorker and publish() the results as they arrive. For simplicity, the sketch below simply fetches random data in blocks of 10. Note that the UI remains responsive as data accumulates.
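The original example code did not survive here, so what follows is a minimal reconstruction; the class name OrderTableDemo, the column layout, and the simulated latency are assumptions:

import java.awt.EventQueue;
import java.util.List;
import java.util.Random;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTable;
import javax.swing.SwingWorker;
import javax.swing.table.DefaultTableModel;

public class OrderTableDemo {

    // One flattened view row; in the real application this would be filled
    // from the Order's first and last Address.
    static class Row {
        final String number, first, last;
        Row(String number, String first, String last) {
            this.number = number; this.first = first; this.last = last;
        }
    }

    public static void main(String[] args) {
        EventQueue.invokeLater(() -> {
            DefaultTableModel model = new DefaultTableModel(
                    new Object[] { "Number", "FirstAddress", "LastAddress" }, 0);
            JFrame frame = new JFrame("Orders");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(new JScrollPane(new JTable(model)));
            frame.setSize(480, 320);
            frame.setVisible(true);

            new SwingWorker<Void, Row>() {
                private final Random random = new Random();

                @Override
                protected Void doInBackground() throws Exception {
                    for (int i = 0; i < 10_000; i++) {
                        // simulate fetching one row from the remote tier
                        publish(new Row("N" + i,
                                "addr" + random.nextInt(100),
                                "addr" + random.nextInt(100)));
                        if (i % 10 == 0) Thread.sleep(5); // simulated latency per block of 10
                    }
                    return null;
                }

                @Override
                protected void process(List<Row> chunk) {
                    for (Row r : chunk) { // runs on the EDT as results accumulate
                        model.addRow(new Object[] { r.number, r.first, r.last });
                    }
                }
            }.execute();
        });
    }
}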
Follow the Value Object or Data Transfer Object pattern: send only what you really need. Instead of sending a graph of domain objects, just create one or more 'stupid' flat objects (containing simple attributes) per view.
I suggest implementing some sort of pagination; in other words, you'll have to implement a mechanism for retrieving only a small subset of all your data and showing it chunk by chunk on different pages.
Exactly "how" depends on your approach so far:
- you can either use some programming pattern like those already mentioned,
- or you can implement it at the DB level, where you query your DB server; depending on the chosen DBMS, you'll have to write the fetch queries in such a manner that they retrieve only a portion of all the data (see the sketch below).
Hope this helps!
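A minimal sketch of the DB-level variant, assuming a DBMS with LIMIT/OFFSET syntax and a hypothetical orders table:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class OrderPageDao {
    // Fetches one "page" of order numbers; LIMIT/OFFSET syntax varies by DBMS.
    static List<String> fetchPage(Connection con, int page, int pageSize) throws SQLException {
        String sql = "SELECT number FROM orders ORDER BY number LIMIT ? OFFSET ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setInt(1, pageSize);
            ps.setInt(2, page * pageSize);
            try (ResultSet rs = ps.executeQuery()) {
                List<String> result = new ArrayList<>();
                while (rs.next()) {
                    result.add(rs.getString("number"));
                }
                return result;
            }
        }
    }
}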
It's advisable to make a proxy object for your list that fetches only a small part of its elements at a time, plus the total count, and then has the ability to load other parts of the original list on demand.
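A sketch of that proxy idea, which could be backed by a page loader like the fetchPage method above; the class name PagedList and the keep-everything caching policy are assumptions:

import java.util.AbstractList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// A read-only "proxy list": the total size is known up front, but element
// pages are fetched lazily and cached the first time they are touched.
class PagedList<T> extends AbstractList<T> {
    private final int size;
    private final int pageSize;
    private final IntFunction<List<T>> pageLoader; // loads one page by page index
    private final Map<Integer, List<T>> cache = new HashMap<>();

    PagedList(int size, int pageSize, IntFunction<List<T>> pageLoader) {
        this.size = size;
        this.pageSize = pageSize;
        this.pageLoader = pageLoader;
    }

    @Override
    public T get(int index) {
        List<T> page = cache.computeIfAbsent(index / pageSize, pageLoader::apply);
        return page.get(index % pageSize);
    }

    @Override
    public int size() {
        return size;
    }
}

A table model could then pull rows through get(index), so only the pages the user actually scrolls to are ever fetched.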

Java: versioned data structures?

I have a data structure that is pretty simple (basically a structure containing some arrays and single values), but I need to record the history of the data structure so that I can efficiently get the contents of the data structure at any point in time.
Is there a relatively straightforward way to do this?
The best way I can think of would be to encapsulate the whole data structure in something that handles all the mutating operations by storing data in functional data structures, and then, for each mutating operation, caching a copy of the data structure in a map indexed by time ordering (e.g. a TreeMap with real time as keys, or a HashMap with a counter of mutation operations, combined with one or more indexes stored in TreeMaps mapping real time / tick count / etc. to mutation operations).
Any suggestions?
Edit: in one case I already have the history as a series of transactions (this is reading items from a data file), so I can replay them, but this takes O(n) steps (n = number of transactions) every time I need to access the data. I'm looking for alternatives.
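A minimal sketch of the TreeMap-indexed snapshot idea from the question; the class name and the tick-counter key are illustrative, and callers are assumed to pass immutable snapshots:

import java.util.Map;
import java.util.TreeMap;

// Keeps one immutable snapshot per mutation, keyed by a monotonically
// increasing tick counter; lookup at any past tick is O(log n).
class VersionedValue<T> {
    private final TreeMap<Long, T> history = new TreeMap<>();
    private long tick = 0;

    void set(T snapshot) {       // the caller must pass an immutable copy
        history.put(tick++, snapshot);
    }

    T get(long atTick) {         // latest version at or before the given tick
        Map.Entry<Long, T> e = history.floorEntry(atTick);
        return e == null ? null : e.getValue();
    }
}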
You are correct: storing the data in a purely functional data structure is the way to go. Supporting anything moderately complicated using do/undo actions relies on the programmer being aware of all side effects of every operation, which does not scale and breaks encapsulation.
You should use some form of persistent data structure that is immutable and based on structural sharing (i.e. the parts of the data structure which do not change between versions are stored only once).
I created an open source Java library of such data structures here:
http://code.google.com/p/mikeralib/source/browse/#svn/trunk/Mikera/src/mikera/persistent
These were somewhat inspired by Clojure's persistent data structures, which might also be suitable for your purposes (they are also written in Java).
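To make "structural sharing" concrete, here is a minimal sketch of a persistent singly-linked list (an illustration of the principle, not the linked library's actual implementation):

// A minimal persistent singly-linked list: prepending is O(1) and the new
// version shares the entire old list as its tail, so nothing is copied.
final class PersistentList<T> {
    private final T head;
    private final PersistentList<T> tail; // null represents the empty tail here

    private PersistentList(T head, PersistentList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    static <T> PersistentList<T> of(T value) {
        return new PersistentList<>(value, null);
    }

    PersistentList<T> prepend(T value) {
        return new PersistentList<>(value, this); // old versions stay valid
    }

    T head() { return head; }
    PersistentList<T> tail() { return tail; }
}

Here v2 = v1.prepend("b") creates a single new node that shares all of v1 as its tail, so both versions remain usable without copying.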
If you are only storing a little bit of data and don't have a lot of changes, then storing each version is fine.
If you don't need to access the old versions of the data too often, I wouldn't cache each one; I'd just make it so you can rebuild to any of them.
You could do this by saving mutations as transactions and replaying the transactions (with the ability to stop at any point).
So you start with an empty data structure, and you might get an "Add" instruction followed by a "Change", another "Add", and then maybe a "Delete". Each of these objects would contain a COPY (not a pointer to the same object) of the thing being added or changed.
You would concatenate each operation into a list while at the same time mutating your collection.
If you find that you need the version at an older timestamp, start with a new empty collection and replay until you hit that timestamp, then stop; you have the collection as it would have been at that time.
If this were a very long-running application and you often needed to access items near the end, you could write an "Undo" for each add/change/delete operation object and actually mutate the data back and forth.
So imagine you have your data object and this array of mutations: you could easily run up and down the mutation list, changing the data object back and forth to any version you want.
You could even maintain multiple data objects: just create a new empty one and run it up the mutation array (think of it as a timeline, where each stored mutation contains a timestamp or version number) until you get it to the timestamp you want. This way you can have "milestones" that you can reach instantly; for instance, if you allocated one milestone per thread, you could make the addMutation method synchronized and this data collection would become 100% thread-safe.
Note that if you actually return the data object, you should return only a copy of the data; otherwise the next time you mutated that milestone, it would mutate the data object you returned.
Hmm, you could also include "rollup" functionality: if you ever decide that you will not need access to the tail (the first few transactions), you can apply them to a "start" structure and then delete them; from then on you copy the start structure to begin from, rather than always starting with an empty data structure.
Man, this is an awesome pattern--now I want to implement it.
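A minimal sketch of that mutation-log pattern; the Mutation interface and History class names are illustrative assumptions:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// One recorded operation on the model; undo enables walking backwards
// instead of replaying from scratch.
interface Mutation<M> {
    void apply(M model);
    void undo(M model);
}

class History<M> {
    private final List<Mutation<M>> log = new ArrayList<>();

    void record(M liveModel, Mutation<M> m) {
        m.apply(liveModel); // mutate the live model ...
        log.add(m);         // ... and append the operation to the timeline
    }

    // Rebuilds the state as of 'version' mutations, starting from a fresh model.
    M replay(Supplier<M> freshModel, int version) {
        M model = freshModel.get();
        for (int i = 0; i < version && i < log.size(); i++) {
            log.get(i).apply(model);
        }
        return model;
    }
}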
Either do as you already suggested, or have a base class of some sort with subclasses that represent the different changes. Then get the proper class at run time by passing the version/timestamp/whatever to a factory that hands you back the right one.
Multi-level undo can be based on a model (i.e. a data structure) and a sequence of actions. Each action supports two operations: "do" and "undo". To perform a change on the model you register a new action and "do" it. This allows you to "walk" back and forth through the history, but the state of the model at a specific index cannot be accessed in constant time.
Maybe something like this would be applicable to your situation?
How long will the application be running for?
It seems like you could do what you suggested (playing the transactions back), but cache the data structure and the list of transactions at particular points in time (every hour or every day?) to ease the pain of having to go through O(n) operations every time you need to rebuild the collection from scratch.
Granted, there is definitely a trade-off between the space the cache takes up and the number of operations needed to rebuild, but hopefully you will be able to find a happy medium.
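A minimal sketch of that checkpointing idea, reusing the hypothetical Mutation interface from the sketch above; the deep-copy function supplied by the caller is also an assumption:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Snapshot every K mutations so a rebuild replays at most K operations
// instead of the whole log.
class CheckpointedHistory<M> {
    private final List<Mutation<M>> log = new ArrayList<>();
    private final TreeMap<Integer, M> checkpoints = new TreeMap<>();
    private final Supplier<M> freshModel;
    private final UnaryOperator<M> copier; // deep copy of the model, caller-supplied
    private final int interval;            // K

    CheckpointedHistory(Supplier<M> freshModel, UnaryOperator<M> copier, int interval) {
        this.freshModel = freshModel;
        this.copier = copier;
        this.interval = interval;
    }

    void record(M liveModel, Mutation<M> m) {
        m.apply(liveModel);
        log.add(m);
        if (log.size() % interval == 0) {
            checkpoints.put(log.size(), copier.apply(liveModel)); // periodic milestone
        }
    }

    M stateAt(int version) {
        Map.Entry<Integer, M> cp = checkpoints.floorEntry(version);
        M model = (cp == null) ? freshModel.get() : copier.apply(cp.getValue());
        int start = (cp == null) ? 0 : cp.getKey();
        for (int i = start; i < version && i < log.size(); i++) {
            log.get(i).apply(model); // replay only the tail after the checkpoint
        }
        return model;
    }
}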
