I have a website consisting of about 20 Java web applications (Servlet/JSP-based webapps) of varying sizes, each handling a different area of the site.
The combined size of all 20 WARs is 350 MB; by merging them I anticipate being able to reduce that and realise shared caching benefits.
Is it best to keep them separate, or merge them into a single uber-webapp WAR file, and why?
I'm particularly interested in the technical drawbacks of merging them.
I "vote" to combine them.
Pros
Code sharing: If you combine them, you can share code between them (because there will be only one code base).
This does not apply just to your code; it also applies to all the external libraries you use, which I think will be the bigger gain.
Less memory: The combined app will also require less memory (possibly significantly less), because external libraries used by multiple apps only have to be loaded once.
Maintainability: If you change something in your code base or database, you only have to change it in one place and re-deploy a single app.
Easier synchronization: If the separate apps do something critical in the database, for example, it is harder to synchronize them than when everything is in one app.
Easier collaboration between different parts/modules of the code: If they are combined, you can simply call methods of other modules. If they are in separate web apps, you have to resort to clumsier mechanisms such as HTTP calls or RMI (a sketch follows this list).
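To make that last point concrete, here is a minimal, hypothetical sketch; the `ReportService` class, the `reports-app.example` host and the endpoint path are all invented for the example, and the remote half assumes Java 11's `java.net.http.HttpClient`:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ModuleCallComparison {

    // Hypothetical module API that both styles need to reach.
    static class ReportService {
        String monthlySummary(String period) { return "summary for " + period; }
    }

    public static void main(String[] args) throws Exception {
        // Merged webapp: one classloader, so a plain, type-safe method call works.
        ReportService reports = new ReportService();
        System.out.println(reports.monthlySummary("2024-01"));

        // Separate webapps: the same lookup becomes a remote HTTP call;
        // serialization, network failures and versioning are now your problem.
        // (The host below is fictitious, so this call is illustrative only.)
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://reports-app.example/reports/monthly?period=2024-01"))
                .build();
        String remote = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        System.out.println(remote);
    }
}
```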
Cons
It will be bigger (obviously). If you are worried about it being too big, exclude the shared libraries from the deployment WAR and place them in Tomcat's shared lib directory instead (see the Maven sketch after this list).
The separate apps might use different versions of the same library. It's better to sort those conflicts out early, while it can still be done with relatively little work.
Another drawback can be a longer deployment time. Again, "outsourcing" the libraries can help make it faster.
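One common way to "outsource" a library like this is Maven's `provided` scope, which compiles against the JAR but leaves it out of the WAR. This is a sketch only (Guava is just an example dependency), and you must copy the JAR into `$CATALINA_HOME/lib` yourself:

```xml
<!-- pom.xml: compile against the library, but do not package it in the WAR. -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>33.0.0-jre</version>
    <!-- Tomcat's shared lib directory supplies it at runtime -->
    <scope>provided</scope>
</dependency>
```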
There is no real drawback in terms of size, memory or performance from packaging everything as a single file; systems are getting faster every day, and, as you said, whether the modules run in one app or several, the total combined resources consumed will be about the same in processing power. What should decide between a single app and multiple apps is maintenance and administration. If you have multiple modules that change frequently and independently of one another, it is better to have multiple webapps, talking via RMI or WS calls for intercommunication (if required). If all of them are oriented as one unit, where everything changes at once, you may go with a single app. Having multiple apps makes it easier to install and update each one when functionality changes at the module level.
See also this discussion of deploying multiple applications to Tomcat:
http://www.coderanch.com/t/471496/Tomcat/Deploying-multiple-applications-WAR
Hope it helps.
I have two microservices (ms), ms1 and ms2. Both contain duplicated reusable code (hashing, security-related, ORM, etc.). Just FYI, this reusable code can sometimes maintain state in the DB as well (in case that matters).
There are two ways I can proceed:
Either extract the reusable code as a separate library and include it in both ms, or
Create a separate ms for this reusable code and expose it as REST endpoints.
If I take approach 2, the advantage is that I only have to redeploy ms3 when something changes. If I take approach 1, I need to redeploy both ms. At the same time, approach 2 requires separate maintenance, resources and monitoring.
Which approach is more ideal in terms of system design, considering that hardware resources are not a constraint? I mentioned two microservices, but in some cases there are more than two ms with duplicated code.
I am not sure what criteria should help me decide between a shared library and a separate microservice.
Update: I have got some clarity from the blogs below, but I still have questions. I will think it over and post a new question if required.
https://blog.staticvoid.co.nz/2017/library_vs_microservice/
https://dzone.com/articles/dilemma-on-utility-module-making-a-jar-or-separate-2
Microservices are only one architectural style. In some cases it is better, in some cases worse, than other styles. Not using microservices does not mean your architecture is bad.
If you still want to have microservices, then neither of these approaches (shared library vs. library as a new "microservice") is good.
I'd suggest considering the following.
The microservice approach does not mean that each endpoint should be encapsulated in a separate microservice. It is normal for one microservice to provide several different endpoints. If this is your case, put your two services into a single microservice and make them reachable via two different endpoints; it is then fine for both of them to share some classes (see the sketch below).
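As a minimal, hypothetical sketch of that idea, assuming Spring Boot (the `hash` helper and the endpoint paths are invented for the example), one deployable service exposes two endpoints that share the same class:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class CombinedServiceApplication {

    // Shared helper used by both endpoints: no extra library or service needed.
    static String hash(String input) {
        return Integer.toHexString(input.hashCode()); // stand-in for real hashing code
    }

    @GetMapping("/orders/checksum")
    String orderChecksum(@RequestParam String orderId) {
        return hash(orderId);
    }

    @GetMapping("/users/checksum")
    String userChecksum(@RequestParam String userId) {
        return hash(userId);
    }

    public static void main(String[] args) {
        SpringApplication.run(CombinedServiceApplication.class, args);
    }
}
```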
Microservices should normally have independent persistence layers. If there is a strong dependency on a common persistence layer, check what the reason was for splitting them into different microservices. Do they really work with different business domains? Can these services be developed and deployed independently of each other? If not, there may be no reason to put them into different microservices, and it could be better to put them into a single one. Then it would be fine if they share some classes.
A good microservice should provide functionality for some domain. If you put shared code into a separate microservice, it may be that your shared "microservice" does not provide any domain functionality but is just a wrapper for utilities. That would not be a microservice.
If you have a strong reason to separate your services into two different microservices, then duplicate the code. Each microservice should be independent of the others: it should be possible to replace the database, or any class, of one microservice without affecting the other. One normal way to keep them independent is to duplicate the classes that you (currently) consider shared. If the services are really independent, this duplicated code will diverge over time and end up different in each microservice. If you find yourself having to change this code in both services simultaneously, it means your split is not correct and what you have are not really microservices.
I'm fairly new to data science, and am just starting to develop a system that requires me to analyze large data sets (e.g. 5-6 million records in each DB).
In the bigger picture: I have multiple DBs containing various kinds of data which need to be integrated. After integrating the data, I also need to perform some data analysis. And lastly, I need to visualize the data for many clients.
Overall, I want to know the current technology/trend for handling big data (i.e. with a Java framework).
The answer is: it depends on your non-functional requirements. Your use cases will be critical in deciding which technology to use.
Let me share one of my experiences to clarify what I mean:
In 2012 I needed to handle ~2 million non-structured records per month and run entropy (information theory) and similarity algorithms for ~600 requests per minute.
Our scenario was composed of:
Non-structured records, but already in JSON format.
Entropy and similarity algorithms that matched each incoming record against the entire content of the DB (take a look at the [Shannon entropy formula][1] and you will understand the complexity I'm talking about; a small sketch follows this list).
More than 100 different web applications as clients of this solution.
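For reference, the Shannon entropy the answer refers to is H = -Σ p(x) log₂ p(x). Here is a minimal Java sketch over the characters of a string (not the actual code used in the project):

```java
import java.util.HashMap;
import java.util.Map;

public class ShannonEntropy {

    // H = -(sum over symbols x of p(x) * log2(p(x)))
    static double entropy(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / s.length();
            h -= p * (Math.log(p) / Math.log(2)); // log base 2
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(entropy("aaaa"));     // 0.0: no uncertainty
        System.out.println(entropy("abab"));     // 1.0: two equally likely symbols
        System.out.println(entropy("abcdefgh")); // 3.0: eight equally likely symbols
    }
}
```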
Given those requirements (and many others), and after performing PoCs with [Cassandra][2], [Hadoop][3], [Voldemort][4] and [neo4j][5], as well as stress, resiliency, scalability and robustness tests, we arrived at the best solution for that moment (2012):
Java EE 7 (with the new Garbage-First (G1) collector activated)
JBoss AS 7 ([WildFly][6]) + [Infinispan][7] for the MapReduce race condition, other cluster controls, and distributed cache needs.
Servlet 3.0 (for its asynchronous request processing; strictly, non-blocking I/O only arrived in Servlet 3.1; a sketch follows this list)
[Nginx][8] (at that time still in beta, but unlike httpd it already handled many connections in a non-blocking fashion)
[MongoDB][9] (since our raw content was already in JSON document style)
[Apache Mahout][10] for all the algorithm implementations, including the MapReduce strategy
among other things.
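To illustrate the Servlet 3.0 asynchronous processing mentioned above, here is a minimal, hypothetical sketch (the servlet path, timeout and `runExpensiveAnalysis` are invented); the container thread is released while the slow work runs elsewhere:

```java
import java.io.IOException;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// asyncSupported releases the container thread while the work runs.
@WebServlet(urlPatterns = "/analyze", asyncSupported = true)
public class AnalyzeServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();
        ctx.setTimeout(30_000); // fail the request after 30 seconds

        // Hand the slow work to another thread; the container thread returns
        // to the pool immediately and can serve other requests.
        ctx.start(() -> {
            try {
                String result = runExpensiveAnalysis(); // hypothetical slow call
                ctx.getResponse().getWriter().write(result);
            } catch (IOException e) {
                ((HttpServletResponse) ctx.getResponse())
                        .setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            } finally {
                ctx.complete();
            }
        });
    }

    private String runExpensiveAnalysis() {
        return "done"; // stand-in for the entropy/similarity computation
    }
}
```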
So, it all depends on your requirements. There's no silver bullet; each situation demands an architectural analysis.
I remember NASA was at that time processing ~1 TB per hour in AWS with Hadoop, due to the [Mars project with Curiosity][11].
In your case, I would recommend paying attention to your requirements; maybe a Java framework is not what you need (or not all that you need):
If you are just going to implement algorithms for data analysis, statisticians and data miners (for example), the [R programming language][12] is probably the best choice.
If you need really fast I/O (aircraft software, for example): any natively compiled language like [Go][13], [C++][14], etc.
But if you are actually going to create a web application that will just be a client of, or a feeder for, the big data solution, I'd recommend something lighter and more scalable like [Node.js][15], or a just-in-time-compiled technology based on the JVM ([Scala][16], [Jython][17], Java) in [dockerized][18] [microservices][19]...
Good luck! (Sorry, Stack Overflow didn't allow me to add the reference links yet, but everything I've talked about here can easily be googled.)
I am really curious about how professional programmers scale up a web application. I have made a significant research effort but failed to find information about the stages of scaling; that may be because server performance depends on many factors. However, I am pretty sure some details can be laid down approximately.
For instance:
1.) How many concurrent requests can a single Tomcat server handle with a decent implementation and decent hardware?
2.) At what point should a load-balancer server be involved?
3.) When does a full Java EE stack (JBoss/GlassFish) begin to make sense?
I feel that this is somewhat opinion-based but, ultimately, "it depends".
For example, how much load can Tomcat handle? It depends. If you're sending a static HTML page for every request, then the answer is "a lot". If you're trying to compute the first 100,000 prime numbers every time, then probably not so much.
In general, it is best to design your application for clustered/distributed use. Don't count on too much in the session; keeping sessions in sync can be expensive. Do your best to make every method truly stateless. That can be hard sometimes, as the consumer (i.e. a web site) may have to pass a bit more information on each call so that any of the clustered machines knows the current state of a request. And so on (see the sketch below).
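As a small, hypothetical sketch of the stateless style (the cart example and all names are invented): the state travels with the call instead of living in the server's session, so any clustered node can handle the request:

```java
import java.util.Arrays;
import java.util.List;

public class CartPricing {

    static class Item {
        final int quantity;
        final double unitPrice;

        Item(int quantity, double unitPrice) {
            this.quantity = quantity;
            this.unitPrice = unitPrice;
        }
    }

    // Stateful style (awkward in a cluster): the cart lives in the HttpSession,
    // so the request only works on the node holding that session, or you pay
    // for session replication:
    //   double total = priceCart((List<Item>) session.getAttribute("cart"), rate);

    // Stateless style: everything the method needs arrives as parameters, so
    // any node behind the load balancer can serve the request.
    static double priceCart(List<Item> items, double taxRate) {
        double subtotal = 0;
        for (Item i : items) {
            subtotal += i.quantity * i.unitPrice;
        }
        return subtotal * (1 + taxRate);
    }

    public static void main(String[] args) {
        List<Item> cart = Arrays.asList(new Item(2, 10.0));
        System.out.println(priceCart(cart, 0.08)); // prints 21.6 (modulo floating point)
    }
}
```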
I moved a web app from Tomcat to GlassFish and then WildFly when I wanted to take advantage of some additional Java EE functionality, specifically JMS, CDI, and JPA. I could have used TomEE and bolted it on, but a unified environment with a unified management UI was a nice benefit too. You may never need to do that, though; you can add the parts you want (i.e. CDI and JPA) fairly easily to Tomcat.
Note that I didn't move from Tomcat to a full EE server for performance; I wanted to take advantage of a larger part of the EE stack. While WildFly has some management interfaces that make managing a cluster a bit easier, I could have kept using Tomcat with no problem.
So, again, "it depends". If you don't need more of the EE stack than Tomcat provides, a full EE server may very well be overkill. Putting a set of Tomcat servers behind an Apache HTTPD load balancer (or an Amazon one), on top of a database that is also clustered, isn't too bad to implement. If that is sufficient for you, I'd stick with that. Don't jump to WildFly etc. just for performance, as you will not likely see a huge change in either direction.
I have a server (Linux/Apache Tomcat & MySQL) which hosts several almost identical websites. At least, the Java libraries are identical.
Right now, every website has its own .jar file with these Java classes.
I'd like to know whether this is good practice, or whether I should have these classes in one place where each of the websites can access them. Would this improve performance in any way? Would it result in less memory usage for the JVM? Are there any downsides?
I haven't been able to find any information related to this situation.
Upsides: a small amount of disk space and RAM is saved. Remember that the only heap space taken belongs to the java.lang.Class instances representing the types you actually load from that JAR file.
Downsides: all applications in the JVM are locked into using the version of the library that is shared. If you really want all deployed webapps to be identical, then this really is no downside. Deployments can get tricky, because you have to maintain a non-standard deployment process (e.g. the webapp is not self-contained) that may differ from container to container, or between versions of the same container (e.g. Tomcat changed its mind between versions 4 and 5, 5 and 5.5, and 5 and 6 about how to configure "common" and "shared" libraries).
If the web applications are identical, you should ask yourself: should you even be deploying more than one? Instead, you could sniff the URL and use a per-client configuration rather than deploying the applications separately (see the sketch below).
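A minimal, hypothetical sketch of that idea (the filter, host names and `TenantConfig` are all invented): a single deployed webapp selects a per-site configuration from the Host header:

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;
import javax.servlet.http.HttpServletRequest;

// One deployed webapp serves every site; the Host header picks the configuration.
@WebFilter("/*")
public class TenantFilter implements Filter {

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        String host = ((HttpServletRequest) req).getServerName(); // e.g. "site-a.example"
        TenantConfig config = TenantConfig.forHost(host);
        req.setAttribute("tenantConfig", config); // downstream servlets/JSPs read this
        chain.doFilter(req, resp);
    }

    @Override public void init(FilterConfig cfg) {}
    @Override public void destroy() {}

    // Minimal stand-in for a real per-site configuration lookup.
    static class TenantConfig {
        final String theme;
        final String dbSchema;

        TenantConfig(String theme, String dbSchema) {
            this.theme = theme;
            this.dbSchema = dbSchema;
        }

        static TenantConfig forHost(String host) {
            return "site-a.example".equals(host)
                    ? new TenantConfig("blue", "site_a")
                    : new TenantConfig("default", "shared");
        }
    }
}
```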
This is more of a general programming question about a GIANT performance difference I have seen.
Basically, I'll use two common programs as examples: the Eclipse IDE and Newsbin (a Usenet client).
On my Windows 7 machine, Eclipse is so sluggish it's almost painful to use, and it's built on the Java platform, right?
Newsbin, on the same machine, can handle hundreds of thousands of headers and literally NEVER lags. It's one of the most responsive programs I have ever used.
So, could someone shed some light on which language/platform Newsbin is built on? I'm curious because I want to expand my skills into desktop applications, and there seems to be such a massive difference in performance.
Apologies if this type of question shouldn't be posted here, but it is 'linked' with programming and I would very much like some feedback/answers.
Thanks.
There are many reasons the performance could be different. It is most likely a tuning problem, or your hardware doesn't suit the application. I use IntelliJ CE (another IDE, like Eclipse), and it caches a lot of information about the Java classes it uses. It does this to provide rich refactoring/search capabilities, which can result in enormous amounts of disk activity if you don't have lots of free memory (to cache the disk data). I use a machine with 48 GB of memory and it almost never lags (at least not when I am the only one using it).
My guess is that Newsbin keeps only the most essential information about each post and avoids caching lots of information about each article; i.e. it has a completely different use and usage pattern.
The performance difference between the two is most likely due not to the platform but to the fact that they are very different applications.
Second, two versions of the same program can be vastly different. You could create a slower Newsbin-type application on the same platform that Newsbin uses.
You're comparing apples to oranges: these two programs do completely different things and the performance difference probably has nothing to do with the underlying platform or language.
Also, keep in mind that Eclipse can be fast by itself, but you can add plugins to it, and poorly written plugins can slow it down horrendously.
Remember: no matter what language you're using, you can always find a way to write code that is poor enough to make a program feel slow and unresponsive.