Design pattern for import and update actions


I have a small component required for an application. The component loads a CSV file and then updates customer records based on the data it finds. There will be a single CSV file for every customer update. The component:
Checks a file location for CSV files
For each CSV file it finds, loads the file, parses it and updates the customer data with any changed values
That's it.
However, I'm torn between a couple of ways to do this:
Have a single Updater() class which does everything.
Have an Update() class which represents the loaded CSV data and knows how to parse the CSV, and also an Updater() class that is responsible for updating the customer records. The Update() class would have an Updater().
Which of these is the correct solution, or are there any better solutions?

If you are thinking of a real general design, consider the following:
Class UpdateSet: A list of updates. A CSV file if you want.
Interface UpdateInstance: The different types of updates, from a request point of view. A CSV line if you want.
Class InsertInstance: Implements UpdateInstance. An insert request.
Class DeleteInstance: Implements UpdateInstance. A delete request.
Class ChangeInstance: Implements UpdateInstance. An update request.
Interface UpdateSetBuilder: Produces an UpdateSet from somewhere.
Class CSVUpdateSetBuilder: Implements UpdateSetBuilder by reading a CSV file. Probably a singleton object.
Interface UpdateParser: Takes a CSV line and produces an UpdateInstance (or refuses it).
Class InsertParser: Implements UpdateParser. Probably a singleton object. Detects and parses an insert request.
Class DeleteParser: Implements UpdateParser. Probably a singleton object. Detects and parses a delete request.
Class ChangeParser: Implements UpdateParser. Probably a singleton object. Detects and parses an update request.
The different UpdateParsers are registered with the CSVUpdateSetBuilder and selected through a delegation mechanism: each of them is given, in turn, the opportunity to recognize a record. If a parser returns null, the next UpdateParser gets its chance.
Class Updater: Takes a collection of CustomerRecords and applies an UpdateSet to it.
Interface UpdateTypeDoer: The different types of operations, from an execution point of view.
Class InsertDoer: Implements UpdateTypeDoer. Detects InsertInstance objects and applies them to the data.
Class DeleteDoer: Implements UpdateTypeDoer. Detects DeleteInstance objects and applies a delete request to the data.
Class ChangeDoer: Implements UpdateTypeDoer. Detects ChangeInstance objects and applies an update request to the data.
The different UpdateTypeDoers are registered with the Updater and selected through the same kind of delegation mechanism: each of them is given, in turn, the opportunity to recognize an UpdateInstance. If a doer declines it, the next UpdateTypeDoer gets the opportunity.
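The parser side of this delegation mechanism could be sketched roughly as follows. This is only an illustration under assumptions that are not in the answer: the CSV layout (a verb such as INSERT or DELETE in the first column), the field names, and the naive comma split (no quoted fields) are all hypothetical, and ChangeInstance/ChangeParser are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// One parsed request; the concrete types correspond to the kinds of CSV line.
interface UpdateInstance {}

class InsertInstance implements UpdateInstance {
    final String customerId;
    final String name;

    InsertInstance(String customerId, String name) {
        this.customerId = customerId;
        this.name = name;
    }
}

class DeleteInstance implements UpdateInstance {
    final String customerId;

    DeleteInstance(String customerId) {
        this.customerId = customerId;
    }
}

// Returns null when the line is not of the kind this parser handles.
interface UpdateParser {
    UpdateInstance parse(String[] fields);
}

class InsertParser implements UpdateParser {
    @Override
    public UpdateInstance parse(String[] fields) {
        // Assumed layout: INSERT,<customerId>,<name>
        if (fields.length == 3 && fields[0].equals("INSERT")) {
            return new InsertInstance(fields[1], fields[2]);
        }
        return null;
    }
}

class DeleteParser implements UpdateParser {
    @Override
    public UpdateInstance parse(String[] fields) {
        // Assumed layout: DELETE,<customerId>
        if (fields.length == 2 && fields[0].equals("DELETE")) {
            return new DeleteInstance(fields[1]);
        }
        return null;
    }
}

class CSVUpdateSetBuilder {
    private final List<UpdateParser> parsers = new ArrayList<>();

    void register(UpdateParser parser) {
        parsers.add(parser);
    }

    // Builds the "update set" (here simply a list) from raw CSV lines.
    List<UpdateInstance> build(List<String> lines) {
        List<UpdateInstance> updates = new ArrayList<>();
        for (String line : lines) {
            updates.add(parseLine(line));
        }
        return updates;
    }

    // Offers the line to each registered parser in turn; the first non-null result wins.
    private UpdateInstance parseLine(String line) {
        String[] fields = line.split(",", -1);
        for (UpdateParser parser : parsers) {
            UpdateInstance update = parser.parse(fields);
            if (update != null) {
                return update;
            }
        }
        throw new IllegalArgumentException("Unrecognized line: " + line);
    }
}
```

Supporting a new kind of request then means writing one more UpdateParser and registering it; the builder itself does not change. The Updater/UpdateTypeDoer side could follow the same shape.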
Advantages: Very flexible, and easy to evolve and modify (add new data sources, update types, etc).
Disadvantages: A big investment in design and implementation time that may never pay off. Are you ever going to add types of update? Different data sources? File formats?
I've always thought that in design and programming there are 2 things you can do endlessly: abstraction and indirection. Knowing how much is too little and how much is too much is the real art.

Split the functionality. It's difficult to say without knowing more about the code, but I'd say at least have the loading/parsing of the CSV separate from applying it to your internal record. I'd probably keep the code that searches the directory for CSVs separate from both of them.
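A minimal sketch of that three-way split, assuming (purely for illustration) that each CSV row is "customerId,email" and that the customer records are a simple in-memory map. The class names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Responsibility 1: find the CSV files in a directory.
class CsvFileFinder {
    List<Path> find(Path directory) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, "*.csv")) {
            for (Path file : stream) {
                files.add(file);
            }
        }
        return files;
    }
}

// Responsibility 2: turn one file into rows of fields (naive split, no quoting).
class CsvParser {
    List<String[]> parse(Path file) throws IOException {
        List<String[]> rows = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.trim().isEmpty()) {
                rows.add(line.split(",", -1));
            }
        }
        return rows;
    }
}

// Responsibility 3: apply parsed rows to the customer records.
class CustomerUpdater {
    void apply(List<String[]> rows, Map<String, String> emailByCustomerId) {
        for (String[] row : rows) {
            if (row.length < 2) {
                throw new IllegalArgumentException("Expected customerId,email");
            }
            // Assumed row layout: customerId,email
            emailByCustomerId.put(row[0], row[1]);
        }
    }
}
```

With this split, the parser can be tested against a single file and the updater against hand-built rows, without touching the file system scan.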

I would keep it as simple as possible (1 class) for now unless you need to add more functionality. Will other classes need to get updates? If not, don't bother designing, creating (and testing) a more complicated system.
If however you need multiple classes, perhaps the Observer Pattern is what you are looking for. That way other objects register that they want to get update events and your class that knows how to parse these records fires Update events. The Listening class can receive those Update events and does the actual update.
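If the Observer route is taken, the wiring might look something like the sketch below. The event fields, the "customerId,newValue" line layout and the class names are assumptions made for the example, not part of the question.

```java
import java.util.ArrayList;
import java.util.List;

// Event fired for each parsed record (hypothetical fields).
class UpdateEvent {
    final String customerId;
    final String newValue;

    UpdateEvent(String customerId, String newValue) {
        this.customerId = customerId;
        this.newValue = newValue;
    }
}

// Implemented by any class that wants to react to updates.
interface UpdateListener {
    void onUpdate(UpdateEvent event);
}

// Knows how to parse records and fires an event for each one.
class CsvUpdateSource {
    private final List<UpdateListener> listeners = new ArrayList<>();

    void addListener(UpdateListener listener) {
        listeners.add(listener);
    }

    void publish(List<String> lines) {
        for (String line : lines) {
            // Assumed line layout: customerId,newValue
            String[] fields = line.split(",", -1);
            if (fields.length < 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            UpdateEvent event = new UpdateEvent(fields[0], fields[1]);
            for (UpdateListener listener : listeners) {
                listener.onUpdate(event);
            }
        }
    }
}
```

The listener that updates the customer records is then just one subscriber among possibly several, for example `source.addListener(event -> records.put(event.customerId, event.newValue))`.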

If the logic that loads a csv file and the logic that then updates customer records based on the data it finds are both pretty simple/short then keep them in a single Updater class that first loads the data and then updates it.
If the CSV data itself is complex it might be better to make an additional class that stores the data of each object (entry in the csv file) with the appropriate accessors and mutators (getters/setters). It can still be used in a single class that reads the file, creates an object for each entry and then updates the customer records.
The only reason I could think of to split the Updater in 2 separate classes is having very complex logic of reading the file and updating the customer records. But I can't see how they could be so hard/long to implement.

Related

What Java design pattern fits the scenario of adding a user to multiple systems based on csv values?

I am working on a project where I need to add users to multiple systems (Active Directory, a database, and Sisense) based on data received from a spreadsheet. I've written code that gets the data into each system correctly, but I am struggling to figure out how to organize my code, in terms of what design pattern to use.
I have a model class for each component that contains the field each system needs:
ActiveDirectoryUser
SisenseUser
DatabaseUser
Then, I have what I call the worker class for each of these, which actually creates the user in the system.
ActiveDirectoryWorker
SisenseWorker
DatabaseWorker
The basic flow of my code is
Read in each line from the spreadsheet
Validate the input.
Create an instance of each model class that contains the appropriate fields.
Call the individual worker classes that control how the user gets added to the respective system. The model instance will be passed into this class.
I've read up on some of the various design patterns, but none of the explanations are in "plain" English. Still learning the ropes here a bit, so I'd appreciate someone suggesting a model that fits my scenario.

It sounds as though you've defined three distinct data models, one for each storage. That makes your job more difficult than it has to be. Instead, consider modelling data based on data in the spreadsheet. You could, for instance, define a class called SpreadsheetUser, which contains the valid data from a spreadsheet row.
Now define an interface, e.g. UserCreator:
interface UserCreator
{
    void Create(SpreadsheetUser user);
}
Now loop through each row in your spreadsheet, validate the data and then call Create on a Composite, which could be defined like this:
class CompositeUserCreator : UserCreator
{
    private readonly UserCreator[] creators;

    public CompositeUserCreator(params UserCreator[] creators)
    {
        this.creators = creators;
    }

    public void Create(SpreadsheetUser user)
    {
        // Forward the call to every wrapped creator in turn.
        foreach (var creator in creators)
            creator.Create(user);
    }
}
You also define three concrete implementations of UserCreator, one for each storage system, and create the composite like this:
CompositeUserCreator creator =
    new CompositeUserCreator(
        new ActiveDirectoryUserCreator(/* perhaps some config values here... */),
        new SisenseUserCreator(/* ... and here... */),
        new DatabaseUserCreator(/* ... and here... */));
You'll still have the problem of dealing with failures. What should happen if you've already created a user in active directory, but then Sisense creation fails? That is, however, not a problem introduced by the Composite pattern, but a problem which is inherent in distributed computing.
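The snippets above use C#-style syntax, while the question is about Java. A rough Java rendering of the same Composite might look like this; the fields of SpreadsheetUser are invented for the example, and the three concrete creators are left out because their contents depend on each system's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The valid data from one spreadsheet row (hypothetical fields).
class SpreadsheetUser {
    final String userName;
    final String email;

    SpreadsheetUser(String userName, String email) {
        this.userName = userName;
        this.email = email;
    }
}

// One implementation per target system would implement this interface.
interface UserCreator {
    void create(SpreadsheetUser user);
}

// Composite: looks like a single UserCreator, forwards to all of them.
class CompositeUserCreator implements UserCreator {
    private final List<UserCreator> creators;

    CompositeUserCreator(UserCreator... creators) {
        this.creators = new ArrayList<>(Arrays.asList(creators));
    }

    @Override
    public void create(SpreadsheetUser user) {
        for (UserCreator creator : creators) {
            creator.create(user);
        }
    }
}
```

The row-processing loop then only needs a single UserCreator reference and does not have to know how many systems sit behind it.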

A better way to call static methods in user-submitted code?

I have a large data set. I am creating a system which allows users to submit java source files, which will then be applied to the data set. To be more specific, each submitted java source file must contain a static method with a specific name, let's say toBeInvoked(). toBeInvoked will take a row of the data set as an array parameter. I want to call the toBeInvoked method of each submitted source file on each row in the data set. I also need to implement security measures (so toBeInvoked() can't do I/O, can't call exit, etc.).
Currently, my implementation is this: I have a list of the names of the java source files. For each file, I create an instance of the custom secure ClassLoader which I coded, which compiles the source file and returns the compiled class. I use reflection to extract the static method toBeInvoked() (e.g. method = c.getMethod("toBeInvoked", double[].class)). Then, I iterate over the rows of the data set, and invoke the method on each row.
There are at least two problems with my approach:
it appears to be painfully slow (I've heard reflection tends to be slow)
the code is more complicated than I would like
Is there a better way to accomplish what I am trying to do?

There is no significantly better approach given the constraints that you have set yourself.
For what it is worth, what makes this "painfully slow" is compiling the source files to class files and loading them. That is many orders of magnitude slower than the use of reflection to call the methods.
(Use of a common interface rather than static methods is not going to make a measurable difference to speed, and the reduction in complexity is relatively small.)
If you really want to simplify this and speed it up, change your architecture so that the code is provided as a JAR file containing all of the compiled classes.

Assuming your #toBeInvoked() could be defined in an interface rather than being static (it should be!), you could just load the class and cast it to the interface:
Class<? extends YourInterface> c = Class.forName("name", true, classLoader).asSubclass(YourInterface.class);
YourInterface i = c.getDeclaredConstructor().newInstance(); // Class.newInstance() is deprecated since Java 9
Afterwards invoke #toBeInvoked() directly.
Also have a look into java.util.ServiceLoader, which could be helpful for finding the right class to load in case you have more than one source file.
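Put together, the interface-based approach could look roughly like the sketch below. RowFunction, SumRow and RowFunctionLoader are hypothetical names, and the double return type is an assumption, since the question only states that the method takes a row as an array.

```java
// The contract every submitted class would have to implement.
interface RowFunction {
    double toBeInvoked(double[] row);
}

// Stand-in for a user-submitted class.
class SumRow implements RowFunction {
    @Override
    public double toBeInvoked(double[] row) {
        double sum = 0.0;
        for (double value : row) {
            sum += value;
        }
        return sum;
    }
}

class RowFunctionLoader {
    // Reflection is used once, to load and instantiate the class.
    static RowFunction load(String className, ClassLoader loader)
            throws ReflectiveOperationException {
        Class<? extends RowFunction> type =
                Class.forName(className, true, loader).asSubclass(RowFunction.class);
        return type.getDeclaredConstructor().newInstance();
    }

    // After loading, every row is handled by an ordinary interface call.
    static double[] applyToAll(RowFunction function, double[][] rows) {
        double[] results = new double[rows.length];
        for (int i = 0; i < rows.length; i++) {
            results[i] = function.toBeInvoked(rows[i]);
        }
        return results;
    }
}
```

In the real system the ClassLoader argument would be the custom secure class loader from the question; asSubclass throws a ClassCastException if a submitted class does not implement the interface.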

Personally, I would use an interface. This will allow you to have multiple instances with their own state (useful for multi-threading), but more importantly you can use the interface both to define which methods must be implemented and to call those methods.
Reflection is slow, but only relative to other options such as a direct method call. If you are scanning a large data set, the fact that you have to pull data from main memory is likely to be much more expensive.

I would suggest following steps for your problem.
To check if the method contains any unwanted code, you need to have a check script which can do these checks at upload time.
Create an Interface having a method toBeInvoked() (not a static method).
All the classes which are uploaded must implement this interface and add the logic inside this method.
You can have your custom class loader scan a particular folder for new classes being added and load them accordingly.
When a file is uploaded and successfully validated, you can compile and copy the class file to the folder which class loader scans.
Your processor class can look for new files and then call the toBeInvoked() method on the loaded class when required.
Hope this helps. (Note that I have used a similar mechanism to dynamically load workflow step classes in a workflow engine tool that I worked on.)

State Design Pattern Implementation

I am trying to apply the State Design Pattern to an instant messenger program that I am building. The program is built on top of an existing instant messenger API. I am essentially creating a wrapper class to simplify the process of sending a message. (The wrapper class is going to be used by several automated scripts to fire off messages when some event occurs.)
Here is what I have so far:
A Messenger class that will serve as the client interface and hold a reference to the current state.
An AbstractMessengerState class from which all of the concrete states will inherit.
Several concrete State classes representing the various states (e.g. SessionStarted, LoggedIn, LoggedOut, etc.)
The problem I am having is where to store the state data. That is, which class(es) should store the fields that I need to carry out the business logic of the messenger program? For example, I have a Map data structure that maps userIDs (strings) to the objects used by the underlying API. I have a Session object that is used to access various messaging components and to log in and out of the messenger server. These objects need to be shared between all the subclasses.
If I store this data in the base class then I will be duplicating data every time I instantiate a new State. Is there a way to ensure that the data in the base class is accessible by the subclasses without duplicating the fields?
UPDATED
Ok, after reading a related post I am going to try to store everything in the Context (Messenger) class and see how that goes.
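One way the "store everything in the Context" idea could be laid out is sketched below. It uses a state interface instead of the AbstractMessengerState base class, a plain map of strings in place of the API's user objects, and a list called outbox as a stand-in for the real Session; all of these are simplifications assumed for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// States hold no data of their own; they receive the context on every call.
interface MessengerState {
    void logIn(Messenger messenger);
    void send(Messenger messenger, String userId, String text);
}

class LoggedOut implements MessengerState {
    @Override
    public void logIn(Messenger messenger) {
        messenger.setState(new LoggedIn());
    }

    @Override
    public void send(Messenger messenger, String userId, String text) {
        throw new IllegalStateException("Not logged in");
    }
}

class LoggedIn implements MessengerState {
    @Override
    public void logIn(Messenger messenger) {
        // Already logged in; nothing to do.
    }

    @Override
    public void send(Messenger messenger, String userId, String text) {
        String handle = messenger.contacts().get(userId);
        if (handle == null) {
            throw new IllegalArgumentException("Unknown user: " + userId);
        }
        messenger.outbox().add(handle + ": " + text);
    }
}

// The context owns the shared data exactly once and delegates behavior to the state.
class Messenger {
    private final Map<String, String> contacts = new HashMap<>();
    private final List<String> outbox = new ArrayList<>();
    private MessengerState state = new LoggedOut();

    Map<String, String> contacts() { return contacts; }
    List<String> outbox() { return outbox; }
    void setState(MessengerState newState) { state = newState; }

    void logIn() { state.logIn(this); }
    void send(String userId, String text) { state.send(this, userId, text); }
}
```

Because the state objects carry no fields, creating a new one on each transition duplicates nothing; they could even be shared singletons.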

Design of a standalone class/framework requiring external data

For the sake of an example, I have a class called FileIdentifier. This class:
Has the method identify which accepts a File and returns a String representing the type.
Requires external data since new file formats are a possibility.
How could this class be written so it could be used in any project while remaining unobtrusive? Overall, how is this aspect usually handled in standalone frameworks that require configuration?

That all depends on how you identify the file type. From your question I would assume that it's not a process as trivial as parsing for the file extension...
That said maybe you could just use an external XML file, or INI, or db table etc. that maps file types and just have the class read that data and return whatever... (You would actually want to use a few classes to keep things clean.) That way only the external data would need to be updated and the class remain unchanged.
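As a rough illustration of keeping the mapping outside the class, the sketch below reads it from a java.util.Properties source. For simplicity it keys on the file extension, which may well be too trivial for the real identification logic; the "extension=type" format is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.util.Locale;
import java.util.Properties;

class FileIdentifier {
    private final Properties typesByExtension = new Properties();

    // The caller decides where the mappings come from (file, classpath resource, ...).
    FileIdentifier(Reader mappings) throws IOException {
        typesByExtension.load(mappings);
    }

    String identify(File file) {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "unknown";
        }
        String extension = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return typesByExtension.getProperty(extension, "unknown");
    }
}
```

Supporting a new format then means adding a line such as `png=image` to the external data, with no change to the class.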

Try with a chain of responsibility.
Each instance in the chain is from a different class that manages a single file type. The file is passed down in the chain, and as soon as an instance decides to manage it, the chain stops and the results are returned back.
Then you would just have to build the chain in the desired order (maybe with more common file types at the top) and provide default classes that manage some file types in your framework. This should also be easy to extend in your applications: it's just a matter of writing another subclass of the chain that manages your new user-defined file types.
Of course your base class for the chain (the Handler, as called by dofactory.com) could provide useful protected methods to its subclasses in order to make their work easier.
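A small sketch of such a chain, assuming for the example that a file type is recognized from its leading bytes (the PNG and PDF signatures). The class names and the "unknown" fallback are hypothetical.

```java
// Base class of the chain: tries itself first, then passes the data on.
abstract class FileTypeHandler {
    private FileTypeHandler next;

    // Appends a handler and returns it, so calls can be chained.
    FileTypeHandler linkTo(FileTypeHandler nextHandler) {
        this.next = nextHandler;
        return nextHandler;
    }

    final String identify(byte[] header) {
        String type = tryIdentify(header);
        if (type != null) {
            return type; // this handler took responsibility; the chain stops
        }
        return next != null ? next.identify(header) : "unknown";
    }

    // Returns the type name, or null if this handler does not recognize the data.
    protected abstract String tryIdentify(byte[] header);

    // Helper offered to subclasses: does the data start with these byte values?
    protected static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}

class PngHandler extends FileTypeHandler {
    @Override
    protected String tryIdentify(byte[] header) {
        return startsWith(header, 0x89, 'P', 'N', 'G') ? "png" : null;
    }
}

class PdfHandler extends FileTypeHandler {
    @Override
    protected String tryIdentify(byte[] header) {
        return startsWith(header, '%', 'P', 'D', 'F') ? "pdf" : null;
    }
}
```

An application adds its own format by writing one more subclass and linking it in, for example `chain.linkTo(new PdfHandler()).linkTo(new MyFormatHandler())`, where MyFormatHandler is whatever the application defines.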

Servlet doPost() Method setup?

I am interested in creating a web app that uses JSP, Servlets and XML.
At the moment I have the following:
JSP - Form input.
Servlet - Retrieving Form data and sending that data to a java object.
Java object (1) - Converts data into XML file....instantiates java object (2).
Java object (2) - Sends that file to a database.
On the returning side the database will send back another XML file that I will then process using XSLT to display back to the user.
Can I place that XSLT code in the original Servlet's doPost() method? So my doPost() method would:
Retrieve user-entered data from the form on my JSP page.
Instantiate a java object to convert that data to XML; in turn, that object instantiates another object to send the XML file to a database.
Convert the resulting XML file sent from the database and display it for the user.
Can one servlet doPost() method handle all of this? If not, how would I set up my application and classes to handle this work flow?
Thank you in advance

I wouldn't load the XSLT in POST, because then every request has to do it.
Read that XSLT in the init method, precompile and cache it. Just make sure that you keep it thread safe.
Once you have the XSLT, you've got to apply it to every XML response, so those steps do belong in POST.
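A minimal sketch of the "compile once, apply per request" idea using the JDK's javax.xml.transform API. XsltRenderer is a hypothetical helper; in a servlet it would be constructed in init() and render() would be called from doPost(). A compiled Templates object is safe to share between threads, whereas a Transformer is not, so a fresh Transformer is created for each call.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltRenderer {
    // Compiled stylesheet, built once and shared by all requests.
    private final Templates templates;

    XsltRenderer(String stylesheet) throws TransformerException {
        this.templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(stylesheet)));
    }

    // Called per request: cheap Transformer creation, then the actual transformation.
    String render(String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

Passing the stylesheet as a String keeps the example self-contained; a real application would more likely read it from a resource stream.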

All your doPost() method has to do is generate a suitable servlet response (some form of content, and a suitable HTTP response structure). So it can do anything you want (including the above).
However it sounds like your rendering requirement is distinct from your form submission and storage requirement. So I would make your doPost() method delegate to a suitable method for rendering the output. That way you can generate output from stored data separately from submitting data to the database.

Well, this is not really specific to servlets, but more to Java/OOP (object-oriented programming) in general. You can in fact do everything in a single method, even in a main() method. But hundreds of lines or more in a single method isn't readable, maintainable, reusable or testable in the long term. Right now, you're probably just starting with Java and you probably don't need to do anything else than this, but if you ever need to duplicate (almost) the same lines of code, then it's time to refactor. Extract the variables from the duplicate code lines and wrap those lines in a new method which takes those variables as arguments and does a simple one-step task.
In general, you'd like to split the big task into separate subtasks beforehand, using separate and reusable classes and methods. In your case, you can for example have a single DAO class for all the DB interaction, a generic XML helper class to convert Javabeans to XML and vice versa with the help of XSL, and (maybe) a domain object to manage the input/output processing (conversion/validation/error handling/response) and execute actions. Write down on paper how the big picture is to be accomplished in small single tasks. Each task can often be done just as well by a single method. Group the methods with the same responsibilities and/or the same shared data in the same class.
To go a step further, for several tasks there may be 3rd party tools available which ease the task. I can think of for example XMLBeans and/or XStream to do the Javabean <--> XML conversion. That would already save a lot of boilerplate code and likely also the XSL step.
That said, duffymo's suggestion to load the XSL only once is a very good one. You don't need to re-execute exactly the same task which isn't dependent on request parameters at all again and again on every request, that's only inefficient.
