Should we be reusing JPA entities - java

Currently I am working for a company that has 6-7 Java EE projects. They are multi-module Maven projects that are all fairly large and serve different purposes. As such, their models are very different, but for the most part the data is stored in the same database.
The problem, to me, is that since there are a few areas of overlap, they simply inject the existing DAOs all the way up the dependency chain. So I have
A-Parent
    -A-JPA
    -A-DAO
B-Parent
    -B-JPA
    -B-DAO
    -A-JPA
    -A-DAO
etc., etc. They are really only using about 2 percent of the other project's model and its respective DAO.
I am attempting to decouple these dependencies by simply duplicating the entities needed (and only including the fields/mappings that are really required) so that the same EJB isn't deployed 7 times (or more when clustered), but apparently I'm not making a convincing argument. Can anyone point me to an article with best practices for this situation, or help me bring up points to explain to my boss?
TL;DR: I want each project to have its own set of entities, even if there is a very small bit of overlap, to reduce dependencies between projects and so that we aren't deploying the same EJBs 7 times. My boss thinks there is nothing wrong with them being unnecessarily coupled. Am I making a big deal about this for nothing? Thanks!
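For illustration, here is a minimal sketch of the kind of duplication proposed above; the entity, table and column names are hypothetical. Project B maps only the columns it actually reads from one of A's tables, as a read-only entity, instead of depending on A-JPA:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Project B's own, deliberately slim view of a table owned by project A.
// Only the two columns B actually reads are mapped, and nothing is writable.
@Entity
@Table(name = "A_ORDER")
public class OrderSummary {

    @Id
    @Column(name = "ORDER_ID")
    private Long id;

    @Column(name = "STATUS", insertable = false, updatable = false)
    private String status;

    public Long getId() { return id; }

    public String getStatus() { return status; }
}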

If it is a single data model that is being maintained for various applications to use, the persistence entities (and even their DAOs) may be seen as the Java API to that database, and I'd put them in a central component. Some organizations even drive the design from the database upwards and reverse-engineer the persistence entities, in which case they'll be the same or similar for the different users.
Whether such a central component is a library (reused by other components) or an EJB of its own (called by other components) I would let depend on the desired transactional and caching behavior of the application, and on how you see responsibilities being organized. On one project we strongly upheld the rule that each piece of data could only be maintained by a single component (a service or an EJB), and others had to go through that single component.
If it is a common domain model, but every EJB implements its own data storage for it, then the domain model may be shared, but I would not share the persistence entities. Then you get into the discussion of sharing the domain model among different components. The world may be viewed in slightly different ways from within different sub-domains, and I feel you end up designing your domains slightly differently across different sub-systems, so there I would probably vote against reuse.
Everyone's mileage may vary, and I may see things differently given the actual circumstances of a particular project.


Is it better to hold a repository for every web application (context), or is it better to share a common instance via JNDI or a similar technique?

Within our company it is more or less standard practice to create repositories for data that is originally stored in the database, as described for example in https://thinkinginobjects.com/2012/08/26/dont-use-dao-use-repository/.
Our web infrastructure consists of a few independent web applications within Tomcat 7 for printing, product descriptions, product orders (these are not persisted in the database!), category descriptions, etc.
They are all built on the Servlet 2 API.
So each instance/implementation of a repository holds a specialized kind of data represented by serializable classes, and the instances of these serializable classes are set up/filled by a periodically executed database query (for every result row the setters of the fields are called; this reminds me of domain-oriented entity beans with CMP).
The repositories are initialized in the servlets' init sequences (so every servlet keeps its own set of instances).
Each context has its own connection to the Oracle database (set up by a resource description file on deployment).
All the data is read only, we never need to write back to the database.
Because we need some of these data types in more than one web application (context), and some even in more than one servlet within the same web context, repositories holding identical data types are instantiated more than once - e.g. four times, twice of those within the same application.
In the end some of the data is duplicated, and I'm not sure this is as clever and efficient as it should be. It should be possible to share the same repository object across more than one application (JNDI?), but at the very least it must be possible to share it between several servlets within the same application context.
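If it helps the discussion, here is a minimal sketch (class names are hypothetical) of sharing one repository instance between all servlets of the same web application: build it once in a ServletContextListener and publish it as a ServletContext attribute. On the Servlet 2.x API the listener is registered via a <listener> element in web.xml rather than an annotation.

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

// Minimal stand-in for the hand-written repository described above (hypothetical).
class ProductRepository {
    void loadFromDatabase() { /* run the query once and fill the internal maps */ }
}

// Registered in web.xml as a <listener>; builds the repository once per context.
public class RepositoryBootstrap implements ServletContextListener {

    public static final String REPO_ATTR = "productRepository";

    public void contextInitialized(ServletContextEvent sce) {
        ProductRepository repository = new ProductRepository();
        repository.loadFromDatabase();
        sce.getServletContext().setAttribute(REPO_ATTR, repository);
    }

    public void contextDestroyed(ServletContextEvent sce) {
        sce.getServletContext().removeAttribute(REPO_ATTR);
    }
}

Every servlet in that context can then fetch the same instance with getServletContext().getAttribute(RepositoryBootstrap.REPO_ATTR), so at least the duplication within one application disappears; sharing across contexts would still need JNDI or a separate service.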
Besides, I'm bothered by the idea of using a "self-built" repository instead of something like a well-tested, openly developed cache (Ehcache, JCS, ...), because some of these caches also provide options for distributed caching (so it should also work within the same container).
If certain entries are searched for, the search algorithm iterates over all entries in the repository (see the link above). For every search pattern there are specialized functions which are called directly from within the business logic classes using the "entity beans"; there is no specification object or interface.
In the end the application server as a whole does not perform that well and it uses a huge amount of RAM (at least for approximately 10,000 DB entries); in my opinion this is most probably related to the use of serializable XSD-to-JAXB-generated classes.
Additionally, every time an application is deployed for testing you have to wait at least two minutes until all entries from the database have been loaded into the repositories; when deploying to production there is a clearly noticeable out-of-service phase at context/servlet startup.
I tend to think all of this is closely related to the solutions I described above.
Because I don't have any experience in this field and I'm new to the company, I don't want to be too obtrusive.
Maybe you can help me to evaluate ideas for a better setup:
Is it better for performance and memory to unify all the repositories into one "repository servlet" and request objects from there via HTTP (I don't think so, though it seems quite modular/distributed-system friendly), or should I try to go with JNDI (I've never done that before) and connect to the repository similarly to a JDBC database?
Wouldn't it be even more sensible, faster, and more efficient to at least use one single connection pool for the whole Tomcat instance (and reference this connection pool from within the web apps' deployment descriptors)? Or might that slow connections down or limit them in some other way?
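For reference, if the pool were defined once at the container level (e.g. as a global resource in Tomcat that is linked into each context), the application code would only do a JNDI lookup; the resource name jdbc/OracleDS below is hypothetical:

import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Looks up a container-managed DataSource instead of creating a pool per web app.
public class SharedPoolLookup {

    public Connection openConnection() throws NamingException, SQLException {
        InitialContext ctx = new InitialContext();
        DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/OracleDS");
        return ds.getConnection(); // borrowed from the shared pool; close() returns it
    }
}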
I was told that the cache system (Ehcache) didn't work well (at least not with the performance of the self-written solution, though I can't believe that). I imagine that using repositories backed by a distributed cache (shared across all contexts) in all web applications should not only reduce the memory footprint significantly but should also not be significantly slower. I believe it would be faster, have shorter startup times, and make it unnecessary to redeploy so often.
I'm very grateful for every tip or hint and your thoughts. It would be marvellous to get a peer review of my ideas based on practical experience.
So thank you very much in advance!
Is it better to hold a repository for every web application (context), or is it better to share a common instance via JNDI or a similar technique?
Unless someone proves otherwise, I would say there is no standard way to do it, standard meaning as defined in the Servlet Spec or in the rest of the Java EE spec canon.
There are technical ways to do it which probably depend on a specific application server implementation, but that cannot be "better" in any universal sense.
If you have two applications that operate on the same data, I wonder whether the partitioning of the applications is useful. Maybe all functionality operating on some kind of data needs to be in the same application?
Within our company it is more or less standard practice to create repositories for data that is originally stored in the database, as described for example in https://thinkinginobjects.com/2012/08/26/dont-use-dao-use-repository/.
I looked up Evans on our bookshelf. The blog post is quite weird. A repository and a DAO are basically the same thing: both provide CRUD operations for an object or for a tree of objects (Evans says only for the aggregate roots).
The repositories are initialized in the servlets' init sequences (so every servlet keeps its own set of instances). Each context has its own connection to the Oracle database (set up by a resource description file on deployment). [ ... ]
In the end the application server as a whole does not perform that well and it uses a huge amount of RAM
When something performs badly, it's best to profile it, e.g. with YourKit, or with perf and flame graphs if you are on Linux. If your applications need a lot of RAM, analyze the heap, e.g. with Eclipse MAT. There is no way somebody can give you a recommendation or a best-practice hint without seeing a single line of code.
A general answer would have to cover everything from performance tuning for Oracle DBs, JDBC, Java collections, and concurrent programming to networking and operating systems.
I was told that the cache system (Ehcache) didn't work well (at least not with the performance of the self-written solution, though I can't believe that)
I can. Ehcache is between 10 and 20 times slower than a simple HashMap; see cache benchmarks. You only need a map when you do a complete preload and don't have any mutations.
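As a rough illustration of the "complete preload, no mutations" case (table, column and class names are made up): load everything with one query at startup and serve reads from an unmodifiable map, with no cache library involved.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public final class PreloadedProducts {

    private final Map<Long, String> namesById;

    public PreloadedProducts(Connection connection) throws SQLException {
        Map<Long, String> map = new HashMap<Long, String>();
        Statement stmt = connection.createStatement();
        try {
            // One preload query; afterwards the data is never mutated.
            ResultSet rs = stmt.executeQuery("SELECT ID, NAME FROM PRODUCT");
            while (rs.next()) {
                map.put(rs.getLong("ID"), rs.getString("NAME"));
            }
        } finally {
            stmt.close();
        }
        this.namesById = Collections.unmodifiableMap(map);
    }

    public String nameById(long id) {
        return namesById.get(id); // plain map lookup
    }
}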
I imagine the usage of repositories backed by a distributed (as across all contexts) cache used in all web applications should not only reduce memory footprint significantly but should not be significantly slower
Distributed caches need to go over the network and add serialization/deserialization overhead. That's probably another factor 30 slower. When is the distributed cache updated?
I'm very grateful for every tip or hint and your thoughts.
Wrap up:
Do the normal software engineering homework: profile, analyze, and spend the tuning effort in the right places.
Ask specific questions on a single topic on Stack Overflow and share your code and performance data. Ask about one thing at a time, and read https://stackoverflow.com/help/on-topic
You may also come to the conclusion that there is nothing to tune. There are applications out there that need a day to build up an in-memory data structure from persistent data. Maybe it's just a lot of data? If you do not like the downtime, use blue-green deployment. Also, use smaller data sets for development and testing.

How to perform centralized changes on objects in an EJB environment

I'm working on a project where I have to process incoming message files of different kinds (XML, EDIFACT, etc.). The project is built on JBoss, using EJBs for recurring tasks.
I'm not a native to EJBs and the ways they are used, so I try to listen to what my colleagues say and to what I read. One central concept, I've been told, is to keep logic out of your entities (they are only data-storage objects), because the entities JAR file is included in every program that uses any of the services, and you do not want all those calling projects to be updated for every change in the logic; you want that in one central place.
That sounds kind of sensible to me, and it is where I think EJBs come in: to provide services that operate on the data. The problem I run into is that my remote EJBs cannot alter any objects, because the objects are (of course) passed by value. Returning an altered version of the object doesn't work either, because that object already exists and is referenced in different places.
How can I perform operations that modify an object in a centralized manner? I will need the same action on that object in dozens of projects. The only ways I can think of are adding a tool.jar, which defeats the purpose of EJBs and is not so different from putting the logic in the entities, or adding a Camel route with a jump-back address, which seems awfully complicated and hard for readers of the code to understand.
Either I misunderstand some basic principle, am missing an important tool, or our design has a serious flaw.
(To be a little more concrete: I have a Message entity that already has many attributes set, like recipient, sender, message size, and so on. Now, for some kinds of message, I have to perform the same actions, which involve setting numerous fields in the message and adding multiple entities that need to be attached to the message entity. My EJB could do all of that, but the changes are lost, of course, and returning the modified object does not work because of the multiple references to the original message entity.)
I may be missing the point, but the entity objects are representations of database rows, so why not just persist the changes from your remote session beans? The database is there precisely to provide a consistent view of state, so it's the ideal place for this.
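A rough sketch of what that could look like (the entity and its methods are hypothetical names, not from the question): the remote bean loads the entity inside its own transaction, applies the centralized changes, and lets the container flush them on commit.

import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

// In the real project this would expose a @Remote business interface.
@Stateless
public class MessageEnrichmentBean {

    @PersistenceContext
    private EntityManager em;

    public void enrich(long messageId) {
        // Hypothetical entity and setters; the point is that 'message' is managed here,
        // so every change is written to the database when the transaction commits.
        MessageEntity message = em.find(MessageEntity.class, messageId);
        message.setProcessed(true);
        message.addAttachment(new AttachmentEntity(message));
    }
}

Callers pass only the id and afterwards re-read (or refresh) their own copy of the message, so no modified object has to travel back by value.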

When to use multiple main methods?

I am confused about when it is proper to use an application with multiple entry points, or I guess an application with multiple interconnected modules. I have a network application (Netty) as well as a web application (Spring). I can bundle them together, in effect tightly coupling them, or I can modularize them to operate independently of each other while still working together to make the application whole.
Is there any specific reason for making an application a single entity vs. multiple entities? Is it "desired" to have a self-contained application (e.g. one main method)?
First of all, asking about the number of main() methods is a bit misleading. You can have several classes with main() methods in a single JAR file after all.
But the question seems to be more about single application vs. multiple applications, or to be more precise: tiers.
It's important to note that this issue is separate from the question of modularity and multi-threading, all of which can be employed in a single tier application just as easily as in a multi-tier application.
The reasons you'd need a multi-tiered application can vary, but here are a few examples:
It is simply part of the requirements: e.g. chat software will usually need a server and a client, because the requirement is to move data between two computers.
Scaling: you need to spread the work to multiple computers to cope with large amounts of data or requests. (This is a typical use case of message queues for example.)
Separation of concerns: This typically happens in "enterprise" systems, where different functions need to be performed in complete isolation, allowing modules to be replaced/restarted on the go or to scale them separately.
Web applications are supposed to have multiple entries; think of the URL you type that can lead to a resource. In fact, in many web application architectures, such as JAX-RS, exposing the resource URI is encouraged. Each entity, as small as one Java bean, has its own entry point. Not sure if this is what you mean, but that's my opinion.
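For example, a minimal JAX-RS resource (the path and payload are made up) shows how each resource gets its own entry point without any main method; the container dispatches requests by URI:

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/orders")
public class OrderResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public String byId(@PathParam("id") long id) {
        return "{\"id\": " + id + "}"; // placeholder body; a real resource would delegate to a service
    }
}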

Is it possible to build a web application as a generic product rather than a one-time customized solution?

There is a business problem that needs to be solved. The obvious solution is an enterprise web application - a locally hosted website that provides the desired functionality.
I want to build this web application, but build it such that -
It's more of a product than a one-time solution, such that it can be customized for different clients
It is possible to provide 'fixes' for this web application, so that bugs can be removed and enhancements added with minimum impact on operations
The web app should be capable of working with different databases and existing authentication systems
Is this even possible? Is it a common enough approach that there is a known way of going about it? Would it be better to use an application framework like Spring, or to try to keep dependencies on frameworks minimal?
Also, any links or references to books that will guide me will be greatly appreciated.
Thanks in advance StackOverflow!
(I feel like I don't know everything I need to know before embarking on this project, so please feel free to point out things I haven't considered and should.)
Developing software, especially for re-usability, requires analyzing which parts/functions are common between use cases and which aren't, and drawing the line between re-usable (library) code and customized/specialized code.
If you know what use cases you expect or want to support in the future this can be feasible.
If you don't, you should not start trying to generalize arbitrary functionality in the first place, because you cannot know what you will be needing in the future.
Java provides some good abstractions of various functionalities, like universal DB support via JDBC.
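For instance, code written against the java.sql interfaces stays the same no matter which DBMS sits behind it; only the connection URL (and the driver on the classpath) changes, and that can live in configuration. The table and column names below are made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserCount {

    // The same code runs against e.g. "jdbc:postgresql://host:5432/app"
    // or "jdbc:oracle:thin:@host:1521:app"; only the URL differs.
    public static int countUsers(String jdbcUrl, String user, String password) throws SQLException {
        Connection con = DriverManager.getConnection(jdbcUrl, user, password);
        try {
            PreparedStatement ps = con.prepareStatement("SELECT COUNT(*) FROM USERS");
            ResultSet rs = ps.executeQuery();
            rs.next();
            return rs.getInt(1);
        } finally {
            con.close();
        }
    }
}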
If you haven't already, have a look at application servers like JBoss or GlassFish. They provide plenty of basic functionality for web applications, support very loose coupling between components, and are highly configurable. To switch from one DBMS to another, for instance, it is enough to alter a single line of configuration (given that the supported SQL is similar enough). Deploying applications or parts of them can often be done on the fly ("hot deployment") without even stopping the server.
Plus: there is a vast amount of supporting libraries and frameworks out there to help you standardize your application design.
I have been working for a while on a webapp that can be deployed in multiple locations: it is designed to be instantiated on many hosts. It's entirely possible to do this, but it is difficult. Writing the code so that it can work this way takes a great deal of care.
The key to doing it is to make all your dependencies on things explicit and all your configuration driven by properties that can be set during installation. Spring makes this quite a lot easier! In particular, the org.springframework.web.context.support.ServletContextPropertyPlaceholderConfigurer class allows you to use the servlet context as a source of values that you can then inject into your beans (e.g., via @Value annotations). It's far harder to do all that yourself. Here's (a simplified version of) what I use:
<bean class="org.springframework.web.context.support.ServletContextPropertyPlaceholderConfigurer">
    <property name="contextOverride" value="true" />
    <property name="location" value="/WEB-INF/default.properties" />
</bean>
This merges the servlet context's properties on top of the ones you provide as defaults inside your webapp (definitely a good practice if most things aren't going to need to be modified most of the time) and then uses them to define properties. I then apply a configuration property (e.g., foo.bar) to a bean property using a placeholder, like this:
@Value("${foo.bar}")
public void setFoobar(String foobar) { ... }
Things to configure that way include the database configuration, absolute locations of files holding things that can't be packaged inside the webapp, etc. You'll have to use your skill and knowledge of the application domain to work out what things need to be listed.
Other key principles are to keep as much as possible inside the webapp (reducing the opportunity for the deployer to mess it up), to be very careful about documenting everything, and to try it with multiple servlet containers. Remember, the person deploying your webapp does not have access to the contents of your thoughts: you have to write it down and tell them exactly what to do. (Too many instructions are at the level of "click this, click that, magic happens", but those are poor instructions since the exact method will vary over time: saying why will help far more because it's more portable.)
We are currently developing a product that can be deployed internally for multiple clients and also as a public portal solution. Here is our experience.
As others have pointed out, there are different factors to keep in mind.
Security
Consider the security that is associated with your product, and how you map the product's functional requirements to external security roles.
Authentication and authorization should not be part of the base product. Once a user is authorized, their external roles need to be mapped to product roles to reach the corresponding functionality (a minimal mapping sketch follows at the end of this list).
Images and logos that require customization.
Internationalization.
For working with multiple databases: a product typically has two different concerns here, persistence and querying. Our experience was to use Hibernate to support multiple databases, though in practice we have only used two databases in the past, DB2 and MySQL.
Testing against multiple databases for every release of your product is a pain. Your test effort multiplies, or at the very least you have to run the tests against every supported database once in a while.
Using custom, database-specific functions is a big no. You can use some general functions, but custom database-specific functions in your queries are going to be a pain, and you have to be very diligent about avoiding them.
Supported browsers in your product.
Licenses of third-party JARs may not be compatible with or acceptable to all institutions, so you have to watch out for that carefully.
As much as possible, enable properties or configuration to customize all variables.
Caching strategy and properties initialization strategies.
A well-established framework helps the team stay on the same page better than an internal framework does. There are many advantages to using a framework like Spring, for performance and other considerations.
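To make the role-mapping point above concrete, here is a hedged sketch (all role names are hypothetical) of an adapter that translates whatever roles the client's authentication system supplies into the product's internal roles:

import java.util.HashMap;
import java.util.Map;

public class RoleMapper {

    private final Map<String, String> externalToProduct = new HashMap<String, String>();

    public RoleMapper() {
        // Per-client configuration, e.g. loaded from properties at deployment time.
        externalToProduct.put("ACME_SUPERVISOR", "ORDER_APPROVER");
        externalToProduct.put("ACME_CLERK", "ORDER_VIEWER");
    }

    /** Returns the product role for an external role, or null if it grants nothing. */
    public String toProductRole(String externalRole) {
        return externalToProduct.get(externalRole);
    }
}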
Cheers!

Understand application architecture

I am a software developer working on a Java web application.
I have very limited knowledge of Java and want to understand my application architecture from a very high level in simple terms. Being a developer, I do not completely understand the big picture.
Can anyone help me understand this?
Each layer represents a place where certain problems are solved in a similar way, often using some particular libraries or frameworks. In trying to understand this, work your way down through the layers. BUT note that each layer hides the details underneath it; you don't need to understand the details of the lower layers in order to understand one layer.
So the Struts piece deals with the user-interface-related issues: understanding user requests, choosing some business logic to invoke, and choosing how to display the results back to the user. It doesn't concern itself with how the business logic works; that's the job of the next layer down.
By business logic I mean the Java (or other language) code that expresses the realities of the customer's business. For example, in a retail application we might need to work out discounts for particular order volumes. So the UI layer wants to display the price for a customer's order. It doesn't have any discount logic itself; instead it says to the business logic layer, "Customer X is ordering N widgets and M zettuls; when can we supply them and how much shall we charge?" The business logic figures out the pricing for this customer, which might depend on all sorts of things, such as the status of the customer, the number of items we have in stock, the size of the order, and so on. The UI just gets an answer (£450, to be delivered 16th September) and displays it.
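As a small illustration (the interface and names are invented for this answer, not taken from the question), the UI layer would only ever see something like this, while all the discounting rules stay behind the implementation:

import java.math.BigDecimal;
import java.time.LocalDate;

// The UI calls this and displays the result; it never sees the pricing rules.
public interface PricingService {

    Quote quoteFor(String customerId, int widgets, int zettuls);

    final class Quote {
        public final BigDecimal total;
        public final LocalDate deliveryDate;

        public Quote(BigDecimal total, LocalDate deliveryDate) {
            this.total = total;
            this.deliveryDate = deliveryDate;
        }
    }
}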
That leads to questions such as "why separate the business logic to its own layer?" There are several possible reasons:
The business logic might be used by some completely different UI as well
It pre-exists, from some older system
Our brains are too small to think about UI and Business Logic at the same time
We have different teams working on UI and BL - different skills are needed
This same way of thinking follows down through the layers. The important thing when thinking about each layer is to try to focus on the role of that layer and to treat the other layers as black boxes. Our brains tend to be too small to think about the whole thing at the same time. I can almost feel myself changing mode as I shift between the layers: take off my UI head, put on my persistence head.
There's plenty of material "out there" about each layer. I suggest you start by reading up on one of them and ask specific questions if you get stuck.
Looks like a standard "enterprisy" application to me.
Interface
This app is primarily intended to be used by humans via web browsers. This interface or UI layer (in a broad sense) is built using the MVC framework Struts. If you don't know MVC, then it's a good idea to read up on the term.
The app also exposes a web service interface, which is intended to be used by non-humans (other applications or scripts).
So, two kinds of interface are provided.
Business logic
Requests coming from the two interfaces noted above ultimately talk to the lower half (the app tier) to get the real work done. The actual processing (calculating stuff, storing stuff and what not) happens here. Note that this tier also talks to external systems (via the Service Gateway) in order to complete requests.
The reasons for separating the system like this can vary. If it were a very simple, small app it wouldn't pay off to have these abstractions, so it's probably a fairly complex system. Maybe the app tier is some legacy application and they wanted to put a better UI on top of it. Maybe the app tier and the web tier happened to use different technology stacks. Or perhaps the app tier is used by many other external systems. It also makes updating or replacing individual services easier, so maybe they are anticipating that. Anyway, it looks like a very common SOA type of design.
Not sure what else you might want to know. I see that the designer intends to use a distributed cache in both tiers. All in all, it's a very common type of system integration diagram.
App Tier - Where the basic application logic resides (think of a basic console-only program).
Database - MySQL, Oracle... server
DOs - Short for Domain Objects. If anemic, they are usually limited to getters/setters and can be synonymous with entities (@Entity).
Data Access Objects - Objects using the DAO pattern to fetch domain objects from the database. Could be considered equivalent to a DAL (Data Access Layer), although that might not fit here. Usually uses a persistence/mapping framework such as Hibernate or iBatis.
Domain Model - Where the domain classes (the domain being what has been studied/analyzed from the requirements) are packaged and associated (one-to-one, many-to-one, ...). Sometimes they are even composed into other container classes.
Core Application Services - Groups the service classes, which could be equated to the Business Logic Layer (BLL). Business services handle the domain model; application services are relative to the application (their methods don't manipulate the domain); and I'm not sure what system services are supposed to do. I'm also not sure what the distinction between the Core block and the Application Service block is.
Facade/Interface - This is where all the communication with other tiers happens. It helps to fully understand interfaces and the Facade pattern.
TOs: Transfer Objects. Related to Domain Objects and, as the name implies, used for transferring data across tiers (a combined sketch follows after this list).
Web Tier - All the logic added to make the application accessible from the Web.
Business Delegate: A block using the delegation pattern? Here it apparently plays the middleman between the application facade and the presentation tier, using TOs.
Model: A notion relative to the MVC pattern and its variants. With JSF I know that beans are usually linked to JSPs. This model is to be distinguished from the domain model: it is created for presentation purposes, is specific to the web tier, and might contain information that has nothing to do with the data manipulated in the application tier.
View: A notion relative to the MVC pattern and its variants. JSPs represent individual pages and use their own markup.
Session: Look up the theory on sessions (necessary because HTTP is stateless).
ApplicationContext: Groups instances/settings that can be pulled in from anywhere. Often associated in the Java world with the Spring framework. Not to be confused with UglyGlobalVar/singletons.
Front controller: Could be considered equivalent to the Controller of MVC, although the Front Controller pattern appears to be more specific. Servlets are considered controllers as they handle communication (GET, POST, ...) and coordinate the model/view. I'm not familiar with Struts, but I imagine the framework implements classes to represent user actions.
Presentation Tier: Simply all of the logic used for outside presentation.
Filters: Can be used to filter requests, for example to redirect a request to a login page.
Feel free to add/correct this description.
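To tie the DO / DAO / TO terms above together, here is a rough sketch with invented names (not taken from the diagram): the domain object is the JPA-mapped entity, the DAO loads it, and the transfer object is the flattened copy handed across the tier boundary.

import java.io.Serializable;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

@Entity
class Customer {                      // DO: persistent domain object
    @Id Long id;
    String name;
}

class CustomerDao {                   // DAO: data access, hides the persistence details
    private final EntityManager em;
    CustomerDao(EntityManager em) { this.em = em; }

    Customer findById(Long id) {
        return em.find(Customer.class, id);
    }
}

class CustomerTO implements Serializable {   // TO: what crosses the facade to the web tier
    final Long id;
    final String name;
    CustomerTO(Customer c) { this.id = c.id; this.name = c.name; }
}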
