Design of a standalone class/framework requiring external data - java

For the sake of an example, I have a class called FileIdentifier. This class:
Has the method identify which accepts a File and returns a String representing the type.
Requires external data since new file formats are a possibility.
How could this class be written so that it could be used in any project while remaining unobtrusive? Overall, how is this aspect usually handled in standalone frameworks that require configuration?

That all depends on how you identify the file type. From your question I would assume that it's not a process as trivial as parsing for the file extension...
That said, you could use an external XML file, INI file, database table, or similar that maps file types, and have the class read that data and return the result. (You would actually want to use a few classes to keep things clean.) That way only the external data needs to be updated and the class remains unchanged.
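A minimal sketch of that idea, assuming the mapping lives in a Java properties file. The mapping is keyed by file extension purely to keep the example short; a real mapping might key on magic bytes instead. The "unknown" fallback and the choice to take a Reader, so the caller decides where the data comes from, are assumptions and not part of the question.

```java
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

final class FileIdentifier {
    // Loaded once from external data; the class itself never changes
    // when a new file format is added.
    private final Properties typesByExtension = new Properties();

    // The caller supplies the mapping source (file, classpath resource, ...).
    FileIdentifier(Reader mapping) {
        try {
            typesByExtension.load(mapping);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read file type mapping", e);
        }
    }

    // Returns the mapped type, or "unknown" (an assumed fallback) if no entry matches.
    String identify(File file) {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        String extension = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return typesByExtension.getProperty(extension, "unknown");
    }
}
```

A project would then ship or override a file containing lines such as png=image/png and pass a FileReader for it to the constructor.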

Try with a chain of responsibility.
Each instance in the chain is from a different class that manages a single file type. The file is passed down in the chain, and as soon as an instance decides to manage it, the chain stops and the results are returned back.
Then you would just have to build the chain in the desired order (maybe with more common file types at the top) and provide default classes in your framework that manage some file types. This should also be easy to extend in your applications: it's just a matter of writing another subclass of the chain that manages your new user-defined file types.
Of course your base class for the chain (the Handler, as called by dofactory.com) could provide useful protected methods to its subclasses in order to make their work easier.
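Something like the following could serve as a starting point. It is only a sketch: the names FileTypeHandler, linkWith and hasExtension are made up for illustration, and the two concrete handlers look at the extension only to keep the example short.

```java
import java.io.File;
import java.util.Locale;

abstract class FileTypeHandler {
    private FileTypeHandler next;

    // Links the next handler and returns it, so a chain can be built fluently.
    FileTypeHandler linkWith(FileTypeHandler next) {
        this.next = next;
        return next;
    }

    // Walks the chain; stops at the first handler that recognises the file.
    // Returns null when no handler in the chain manages it.
    final String identify(File file) {
        String type = tryIdentify(file);
        if (type != null) {
            return type;
        }
        return next == null ? null : next.identify(file);
    }

    // Subclasses return a type name, or null to pass the file down the chain.
    protected abstract String tryIdentify(File file);

    // Example of a protected helper the base class can offer its subclasses.
    protected static boolean hasExtension(File file, String extension) {
        return file.getName().toLowerCase(Locale.ROOT).endsWith("." + extension);
    }
}

final class PngHandler extends FileTypeHandler {
    @Override
    protected String tryIdentify(File file) {
        return hasExtension(file, "png") ? "PNG image" : null;
    }
}

final class TextHandler extends FileTypeHandler {
    @Override
    protected String tryIdentify(File file) {
        return hasExtension(file, "txt") ? "plain text" : null;
    }
}
```

An application adds its own format by writing one more subclass and linking it in, for example chain.linkWith(new TextHandler()).linkWith(new MyFormatHandler()).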

Related

A better way to call static methods in user-submitted code?

I have a large data set. I am creating a system which allows users to submit java source files, which will then be applied to the data set. To be more specific, each submitted java source file must contain a static method with a specific name, let's say toBeInvoked(). toBeInvoked will take a row of the data set as an array parameter. I want to call the toBeInvoked method of each submitted source file on each row in the data set. I also need to implement security measures (so toBeInvoked() can't do I/O, can't call exit, etc.).
Currently, my implementation is this: I have a list of the names of the java source files. For each file, I create an instance of the custom secure ClassLoader which I coded, which compiles the source file and returns the compiled class. I use reflection to extract the static method toBeInvoked() (e.g. method = c.getMethod("toBeInvoked", double[].class)). Then, I iterate over the rows of the data set, and invoke the method on each row.
There are at least two problems with my approach:
it appears to be painfully slow (I've heard reflection tends to be slow)
the code is more complicated than I would like
Is there a better way to accomplish what I am trying to do?
There is no significantly better approach given the constraints that you have set yourself.
For what it is worth, what makes this "painfully slow" is compiling the source files to class files and loading them. That is many orders of magnitude slower than the use of reflection to call the methods.
(Use of a common interface rather than static methods is not going to make a measurable difference to speed, and the reduction in complexity is relatively small.)
If you really want to simplify this and speed it up, change your architecture so that the code is provided as a JAR file containing all of the compiled classes.
Assuming your #toBeInvoked() could be defined in an interface rather than being static (it should be!), you could just load the class and cast it to the interface:
Class<? extends YourInterface> c = Class.forName("name", true, classLoader).asSubclass(YourInterface.class);
YourInterface i = c.getDeclaredConstructor().newInstance();
Afterwards invoke #toBeInvoked() directly.
Also have a look into java.util.ServiceLoader, which could be helpful for finding the right class to load in case you have more than one source file.
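Put together, the interface approach might look roughly like this. RowFunction, SumRow and RowFunctionLoader are hypothetical names, and the double return type is an assumption since the question does not say what toBeInvoked returns. SumRow stands in for a user-submitted class.

```java
// The contract every submitted class must implement (assumed signature).
interface RowFunction {
    double toBeInvoked(double[] row);
}

// Stand-in for a user-submitted class; needs a no-argument constructor.
final class SumRow implements RowFunction {
    @Override
    public double toBeInvoked(double[] row) {
        double sum = 0;
        for (double value : row) {
            sum += value;
        }
        return sum;
    }
}

final class RowFunctionLoader {
    // Reflection is used once, to create the instance.
    static RowFunction load(String className, ClassLoader loader) {
        try {
            Class<? extends RowFunction> type =
                    Class.forName(className, true, loader).asSubclass(RowFunction.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }

    // After loading, every row is handled by an ordinary interface call.
    static double[] applyToAll(RowFunction function, double[][] rows) {
        double[] results = new double[rows.length];
        for (int i = 0; i < rows.length; i++) {
            results[i] = function.toBeInvoked(rows[i]);
        }
        return results;
    }
}
```

asSubclass throws a ClassCastException if the submitted class does not implement the interface, which doubles as a validation step.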
Personally, I would use an interface. This will allow you to have multiple instance with their own state (useful for multi-threading) but more importantly you can use an interface, first to define which methods must be implemented but also to call the methods.
Reflection is slow, but only relative to other options such as a direct method call. If you are scanning a large data set, the fact that you have to pull data from main memory is likely to be much more expensive.
I would suggest following steps for your problem.
To check if the method contains any unwanted code, you need to have a check script which can do these checks at upload time.
Create an Interface having a method toBeInvoked() (not a static method).
All the classes which are uploaded must implement this interface and add the logic inside this method.
You can have your custom class loader scan a particular folder for new classes being added and load them accordingly.
When a file is uploaded and successfully validated, you can compile it and copy the class file to the folder which the class loader scans.
Your processor class can look for new files and then call the toBeInvoked() method on the loaded class when required.
Hope this helps. (Note that I have used a similar mechanism to dynamically load workflow step classes in a Workflow Engine tool which was developed.)

Can I load a Java class in a way that automatically removes its privileges?

I am working on developing a library that needs to instantiate and return untrusted objects downloaded from an external website. At a high-level, the library works as follows:
Clients of the library request a class from a remote source.
My library instantiates that object, then returns it to the user.
This is a major security risk, since the untrusted code can do just about anything. To address this, my library has the following design:
I enable the SecurityManager and, when instantiating the untrusted object, I use an AccessController to handle the instantiation in a context where there are no privileges.
Before returning the object back to the client, I wrap the object in a decorator that uses an AccessController to forward all method requests to the underlying object in a way that ensures that the untrusted code is never run with any permissions.
It occurs to me, though, that this might not be the most elegant solution. Fundamentally, I want to strip away all permissions from any object of any type downloaded from the remote source. My current use of AccessController is simply a way of faking this up by intercepting all requests and dropping privileges before executing them. The AccessController approach also has its own issues:
If the wrapped object has any methods that return objects, those returned objects have to themselves be wrapped.
The wrapper code will potentially be thousands of lines long, since every exported method has to be secured.
All of the methods exported by the downloaded object have to be known in advance in order to be wrapped.
My question is this: is there a way to load classes into the JVM (probably using a custom ClassLoader) such that any instances of those classes execute their methods with no permissions?
Thanks!
You will want to call defineClass with an untrusted ProtectionDomain.
Your current solution has a number of problems. It doesn't appear to cover the static initialiser. It may be possible to install code into some mutable arguments. Methods that use the immediate caller will still be privileged (AccessController.doPrivileged, say). But most of all, it falls apart when rubbing up against any kind of global, for instance when a finaliser runs.
Don't know if there's a way to directly do what you asked, but I think your approach can be simplified by using interfaces and dynamic proxies. Basically, if you have an interface for the object to be returned, and all its methods return either simple types or interfaces, then you can wrap all the methods and their return values automatically, without knowing the methods in advance. Just implement an InvocationHandler that does the AccessController magic in its invoke method, and create proxies using Proxy.newProxyInstance(...).
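A rough sketch of the proxy idea follows. Sandbox, Greeter and PlainGreeter are hypothetical names used only for the demonstration. It assumes a runtime where the Security Manager is still enabled; on recent JDKs, where it is deprecated or disabled, doPrivileged simply runs the action, so nothing is actually restricted there.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.security.AccessControlContext;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;
import java.security.ProtectionDomain;

@SuppressWarnings({"removal", "deprecation"})
final class Sandbox {
    // A context whose single protection domain carries no permissions.
    private static final AccessControlContext NO_PERMISSIONS =
            new AccessControlContext(new ProtectionDomain[] {new ProtectionDomain(null, null)});

    // Wraps target so that every call through iface runs in the empty context.
    static <T> T wrap(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result;
            try {
                result = AccessController.doPrivileged(
                        (PrivilegedExceptionAction<Object>) () -> method.invoke(target, args),
                        NO_PERMISSIONS);
            } catch (PrivilegedActionException e) {
                // Rethrow what the wrapped method itself threw, if anything.
                Throwable cause = e.getException();
                throw (cause instanceof InvocationTargetException) ? cause.getCause() : cause;
            }
            // Returned objects are wrapped too when the declared type is an interface.
            Class<?> returnType = method.getReturnType();
            if (result != null && returnType.isInterface()) {
                return wrapUnchecked(returnType, result);
            }
            return result;
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }

    @SuppressWarnings("unchecked")
    private static Object wrapUnchecked(Class<?> iface, Object target) {
        return wrap((Class<Object>) iface, target);
    }
}

// Hypothetical interface of the downloaded object, for demonstration only.
interface Greeter {
    String greet(String name);
    Greeter louder();
}

final class PlainGreeter implements Greeter {
    private final String greeting;

    PlainGreeter(String greeting) {
        this.greeting = greeting;
    }

    @Override
    public String greet(String name) {
        return greeting + ", " + name;
    }

    @Override
    public Greeter louder() {
        return new PlainGreeter(greeting.toUpperCase());
    }
}
```

This removes the hand-written wrapper code and the need to know the methods in advance, but it does not address the static initialiser or finaliser problems mentioned in the other answer.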

Allowing maximal flexibility/extensibility using a factory

I have a little design issue on which I would like to get some advice:
I have several classes that inherit from the same base class, each one can accept the same data and analyze it in a slightly different way.
Analyzer
|
˪_ AnalyzerA
|
˪_ AnalyzerB
...
I have an input file (I do not have control over the file's format) that defines which analyzers should be invoked and their parameters. Plus it defines data-extractors in the same way and other similar things too (in similar I mean that this is an action that can have several variations).
I have a module that iterates over different analyzers in the file and calls some factory that constructs the correct analyzer. I have a factory for each of the archetypes the input file can define and so far so good.
But what if I want to extend it and to add a new type of analyzer?
The solution I was thinking about is using a property file for each factory, named after the factory's name, which will hold a mapping between the input file's definition of whatever it wants me to execute and the actual classes that I use to execute the action.
This way I could load that class at run-time -> verify that it's implementing the right interface and then execute it.
If some John Doe would like to create his own analyzer he'd just need to add a new property to the correct file (I'm not quite sure what would be the best strategy to allow this kind of property customization).
So in short:
Is my solution too flawed?
If not, what would be the most user-friendly/convenient way to allow customization of properties?
P.S
Unfortunately I'm confined to using only built-in JDK classes, as the existing solution does, so I can't just drop SF in on them.
I hope this question is not out of line; I'm just not used to having my wings clipped this way, without SF or something similar to help me implement an elegant solution.
Have a look at how the java.sql.DriverManager.getConnection(connectionString) method is implemented. The best way is to read the source code.
Here is a very rough summary of the idea (it is hidden inside a lot of private methods). It is more or less an implementation of chain of responsibility, although there is no linked list of drivers.
DriverManager manages a list of drivers.
Each driver must register itself to the DriverManager by calling its method registerDriver().
Upon request for a connection, the getConnection(connectionString) method sequentially calls the drivers passing them the connectionString.
Each driver KNOWS if the given connection string is within its competence. If yes, it creates the connection and returns it. Otherwise the control is passed to the next driver.
Analogy:
drivers = your concrete Analyzers
connection strings = types of your files to be analyzed
Advantages:
There is no need to explicitly bind the analyzers to the type of file they are meant for. Let each analyzer decide for itself whether it is able to analyze the file. If not, null is returned (or an exception is thrown, or whatever) to tell the AnalyzerManager that the next analyzer in the row should be asked.
Adding a new analyzer just means adding a new call to the register() method. Complete decoupling.
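Transferred to analyzers, the pattern could be sketched like this, using only JDK classes. The Analyzer signature, the String result and the two example analyzers are assumptions made for illustration. Unlike DriverManager, this manager is an ordinary instance and not a static registry.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

interface Analyzer {
    // Returns a result, or null if the given type is outside this analyzer's competence.
    String analyze(String type, String data);
}

final class AnalyzerManager {
    private final List<Analyzer> analyzers = new CopyOnWriteArrayList<>();

    // Each analyzer (including third-party ones) registers itself here.
    void register(Analyzer analyzer) {
        analyzers.add(analyzer);
    }

    // Asks the analyzers in registration order; the first non-null result wins.
    String analyze(String type, String data) {
        for (Analyzer analyzer : analyzers) {
            String result = analyzer.analyze(type, data);
            if (result != null) {
                return result;
            }
        }
        throw new IllegalArgumentException("No analyzer registered for type: " + type);
    }
}

final class LengthAnalyzer implements Analyzer {
    @Override
    public String analyze(String type, String data) {
        return "length".equals(type) ? String.valueOf(data.length()) : null;
    }
}

final class WordCountAnalyzer implements Analyzer {
    @Override
    public String analyze(String type, String data) {
        if (!"wordcount".equals(type)) {
            return null;
        }
        String trimmed = data.trim();
        return String.valueOf(trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length);
    }
}
```

The module that iterates over the input file only ever talks to AnalyzerManager, so it needs no changes when a new analyzer is registered.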

Generating code for converting between classes

In one of the projects I'm working on, we have different systems.
Since those systems should evolve independently, we have a number of CommunicationLibs to handle communication between them.
CommunicationLib objects are not used inside any system, but only in communication between systems.
Since many features require data retrieval, I am often forced to create "local" system objects that are equivalent to CommLib objects. I use a Converter utility class to convert from such objects to CommLib objects.
The code might look like this:
public static CommLibObjX objXToCommLib(objX p) {
CommLibObjX b = new CommLibObjX();
b.setAddressName(p.getAddressName());
b.setCityId(p.getCityId());
b.setCountryId(p.getCountryId());
b.setFieldx(p.getFieldx());
b.setFieldy(p.getFieldy());
[...]
return b;
}
Is there a way to generate such code automatically? Using Eclipse or other tools? Some field might have a different name, but I would like to generate a Converter method draft and edit it manually.
Try Apache commons-beanutils:
BeanUtils.copyProperties(b, p);
It copies property values from the origin bean (the second argument) to the destination bean (the first argument) for all cases where the property names are the same.
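To show roughly what such a name-based copy does, here is a JDK-only sketch using plain reflection. It is not the BeanUtils implementation, and PropertyCopier, ObjX and the two fields shown are placeholders modelled on the question. Properties whose names differ are skipped and still need manual mapping.

```java
import java.lang.reflect.Method;

final class PropertyCopier {
    // Copies every getX() on source to a setX(...) of the same type on target.
    static void copyMatching(Object source, Object target) {
        for (Method getter : source.getClass().getMethods()) {
            String name = getter.getName();
            if (!name.startsWith("get") || name.equals("getClass")
                    || getter.getParameterCount() != 0) {
                continue;
            }
            String setterName = "set" + name.substring(3);
            try {
                Method setter = target.getClass().getMethod(setterName, getter.getReturnType());
                setter.invoke(target, getter.invoke(source));
            } catch (NoSuchMethodException e) {
                // No setter with that name and type: leave it for manual mapping.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not copy " + name, e);
            }
        }
    }
}

// Placeholder "local" object and CommLib object for the demonstration.
final class ObjX {
    private final String addressName;
    private final int cityId;

    ObjX(String addressName, int cityId) {
        this.addressName = addressName;
        this.cityId = cityId;
    }

    public String getAddressName() { return addressName; }
    public int getCityId() { return cityId; }
}

final class CommLibObjX {
    private String addressName;
    private int cityId;

    public String getAddressName() { return addressName; }
    public void setAddressName(String addressName) { this.addressName = addressName; }
    public int getCityId() { return cityId; }
    public void setCityId(int cityId) { this.cityId = cityId; }
}
```

A converter method then shrinks to creating the CommLibObjX, calling PropertyCopier.copyMatching(p, b), and setting the few differently named fields by hand.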
If you feel the need to have source code automatically generated, you are probably doing something wrong. I think you need to reexamine the design of the communication between your two "systems". How do these "systems" communicate?
If they are on different computers or in different processes, design a wire protocol for them to use, rather than serializing objects.
If they are classes used together, design better entity classes, which are suitable for them both.

How to programmatically instantiate dynamically loaded class from values in a file?

I have basic knowledge of Java's reflection API - therefore, this is not only a question of how, it's a question of whether it's possible and whether I'm going about a solution the best way.
We're doing some acceptance testing of multiple, interrelated projects; each of these projects retrieve data from a MongoDB store using an in-house abstraction API. To facilitate this testing, each component needs some pre-loaded data to be available in the database.
I'm building a command-line tool to accept a DTO (pre-compiled class binary), for loading of multiple instances using the morphia ORM library. I would like each member of our team to be able to run the generator passing in via cli their DTO (in jar or directory form), and a file (csv or otherwise) for instantiating a desired amount of records.
I have the class loading working fine with URLClassLoader. Now I'm trying to instantiate an instance of this class using data from a file.
Is this possible? Would serialized objects be a better approach?
That's perfectly possible using the Java Reflection API :
Load the Class instance by name (Class.forName(className) is enough when the class is on the application classpath; if it is only visible through your URLClassLoader, use Class.forName(className, true, classLoader) or classLoader.loadClass(className)).
Grab a Constructor instance if the constructors have parameters, and invoke newInstance(Object... args) on it to create an instance of your DTO class.
Invoke getDeclaredFields() on your Class instance and iterate over them to set their values (field.set(instance, value)). Make sure to invoke field.setAccessible(true) to be able to access private fields.
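These steps could be sketched as follows, assuming each record from the file has already been parsed into a map from field name to raw string (for example one CSV row plus the header). DtoPopulator, PersonDto and the small set of supported field types are assumptions for illustration, and the DTO is assumed to have a no-argument constructor.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;

final class DtoPopulator {
    // Creates an instance of type and sets every field that has a value in the map.
    static <T> T create(Class<T> type, Map<String, String> values) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T instance = constructor.newInstance();
            for (Field field : type.getDeclaredFields()) {
                String raw = values.get(field.getName());
                if (raw == null || Modifier.isStatic(field.getModifiers())) {
                    continue; // no value supplied, or not an instance field
                }
                field.setAccessible(true); // allows writing private fields
                field.set(instance, convert(raw, field.getType()));
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot populate " + type.getName(), e);
        }
    }

    // Converts the raw text to the field's type; only a few types are handled here.
    private static Object convert(String raw, Class<?> target) {
        if (target == String.class) return raw;
        if (target == int.class || target == Integer.class) return Integer.valueOf(raw);
        if (target == long.class || target == Long.class) return Long.valueOf(raw);
        if (target == double.class || target == Double.class) return Double.valueOf(raw);
        if (target == boolean.class || target == Boolean.class) return Boolean.valueOf(raw);
        throw new IllegalArgumentException("Unsupported field type: " + target.getName());
    }
}

// Placeholder DTO standing in for a class loaded from a team member's jar.
final class PersonDto {
    private String name;
    private int age;

    String getName() { return name; }
    int getAge() { return age; }
}
```

With a dynamically loaded class the call is the same; the Class object simply comes from the URLClassLoader and not from a class literal.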
If by "serialized objects" you mean canned instances, then no, by loading properties from a text file you allow much easier tweaking of test data (if that's a goal), including the number of objects.
But sure, it's possible; unmarshal the data from the input file and use it to initialize or construct the object in question like you would in regular code.
