What is the simplest way to run Hazelcast nodes on dedicated servers?
We have a web application that uses a Hazelcast distributed map.
Currently the Hazelcast nodes are configured to run in the Servlet Container nodes.
As we scale up, we'd like to add dedicated hardware as Hazelcast nodes.
Then we won't need full Hazelcast nodes in the Servlet Containers anymore; those can be clients. (There are licensing costs associated with the Servlet Containers, so getting load off them is good. Don't ask...)
So the question is, what's a minimal Hazelcast node installation? Something analogous to a memcached installation.
All it needs to do is read configuration and start up, no local clients.
I see it supports Jetty, but is that needed at all, or is there some simple class in those jars I could execute on a JVM raw?
Just create a simple class that calls Hazelcast.init (in current versions, Hazelcast.newHazelcastInstance()).
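As a sketch, a minimal member could look like this (assumes hazelcast.jar on the classpath; with no explicit configuration Hazelcast falls back to a hazelcast.xml found on the classpath, then to its built-in defaults):

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

// Minimal standalone Hazelcast member: read config, join the cluster, serve.
public class StandaloneNode {
    public static void main(String[] args) {
        Config config = new Config(); // or new XmlConfigBuilder(path).build()
        HazelcastInstance node = Hazelcast.newHazelcastInstance(config);
        // Hazelcast's threads are non-daemon, so the JVM stays up
        // until node.shutdown() is called.
    }
}
```

Run it with something like `java -cp hazelcast.jar:. StandaloneNode`; no servlet container or Jetty is required.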
There are a number of test classes in the com.hazelcast.examples package which can be run from the bin directory of the hazelcast distribution.
TL;DR
Newer version:
java -cp hazelcast-3.7.2.jar com.hazelcast.core.server.StartServer
Older version:
java -cp hazelcast-2.0.3.jar com.hazelcast.examples.StartServer
This will start a standalone Hazelcast instance
If you're using maven:
mvn -DgroupId=com.hazelcast -DartifactId=hazelcast -Dversion=3.7.2 dependency:get
cd ~/.m2/repository/com/hazelcast/hazelcast/3.7.2
will get you to the folder with the jar
You can get it to run by calling {hazelcast-directory}/bin/server.sh or, on Windows, {hazelcast-directory}/bin/server.bat.
The configuration file can still be found in {hazelcast-directory}/bin/hazelcast.xml
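A pared-down hazelcast.xml for a dedicated member might look like the sketch below (the port and discovery settings shown are illustrative defaults, not requirements):

```xml
<hazelcast xmlns="http://www.hazelcast.com/schema/config">
    <network>
        <!-- First member binds 5701; auto-increment lets further members
             on the same host take 5702, 5703, ... -->
        <port auto-increment="true">5701</port>
        <join>
            <!-- Multicast discovery is the default; switch to tcp-ip with an
                 explicit member list where multicast is blocked. -->
            <multicast enabled="true"/>
        </join>
    </network>
</hazelcast>
```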
This is an update to thSoft's answer as that way is no longer valid.
You can also simply run hazelcast/bin/start.sh (the configuration file is hazelcast/bin/hazelcast.xml).
Related
We upgraded to the latest Ignite version, 2.10.0, and are trying to launch compute tasks using the thin client configuration.
Problem: I have to place the jar on the cluster nodes every time I change a single line of code.
Is there any way I can execute tasks dynamically (auto-deploy) without placing the jar on the cluster nodes?
Your response will be greatly appreciated.
It's not possible to deploy tasks dynamically (like using peerClassLoading) for thin clients yet. For now, you need to have the tasks deployed before calling them.
More details: https://ignite.apache.org/docs/latest/thin-clients/java-thin-client#executing-compute-tasks
Alternatively, you might check the following docs regarding UriDeploymentSpi. In short, instead of manually updating the jars, you might set a local folder or URI that the cluster will check from time to time and deploy a new version of code if available.
I am doing some experiments for my thesis involving the cold start problem that occurs with containers. My test application is a Spring Boot application built on the openjdk image. The first thing I want to try to resolve the cold start problem is the following:
Have a container ready that holds the openjdk and the libraries the Spring Boot app uses. I start my other container using the IPC and network namespace of the already existing container, and then use the openjdk and the libraries of that container to run the jar file.
I am not exactly sure how to achieve this. Can I achieve it by using volumes, or should I be looking for a completely different approach?
On another note, if I want x containers to run, I will make sure there are x pre-existing containers running. This is to make sure that every container has its own specific library container to work with. Would this be okay?
In short, any way that I can speed up the Spring Boot application by using a second container connected through IPC/net would be helpful for my problem.
Spring boot is a purely "runtime" framework.
If I've got your question right, you describe the following situation:
So, say you have a container A with a JDK and some jars. This alone doesn't mean that you have a running process though, so it's more like a volume with files ready to be reused (or maybe a layer in terms of Docker images).
In addition you have another container B with a spring boot application that should be started somehow (probably with the open jdk from container A or its dedicated JDK).
Now what exactly would you like to "speed up"? The size of the image (a smaller image means faster deployment in a CI/CD pipeline, for example)? The Spring Boot application startup time (the interval between spawning the JVM and the Spring Boot application being up and running)? Or maybe you're trying to load fewer classes at runtime?
The techniques that solve these issues are different. But all in all I think you might want to check out the GraalVM integration, which among other things can create native images and speed up startup time. This stuff is fairly new and I haven't tried it myself yet. I believe it's work-in-progress and Spring will put effort into pushing it forward (this is only my speculation, so take it with a grain of salt).
Anyway, you may be interested in reading this article
However, I doubt it has something to do with your research as you've described it.
Update 1
Based on your comments - let me give some additional information that might help. This update contains information from the "real-life" working experience and I post it because it might help to find directions in your thesis.
So, we have a spring boot application on the first place.
By default it's a JAR, which is Pivotal's recommendation; there is also the option of WARs (as Josh Long, their developer advocate, says: "Make JAR not WAR").
This Spring Boot application usually includes some web server: Tomcat by default for traditional Spring Web MVC applications, but you can switch it to Jetty or Undertow. If you're running a "reactive application" (Spring WebFlux, supported since Spring Boot 2), your default choice is Netty.
One side note: not all Spring Boot driven applications have to include an embedded web server, but I'll put this subtle point aside since you seem to target the case with web servers (you mention Tomcat, a quicker ability to serve requests, etc.; hence my assumption).
Ok, now let's try to analyze what happens when you start a Spring Boot application JAR.
First of all the JVM itself starts: the process is started, the heap is allocated, the internal classes are loaded, and so on and so forth. This can take some time (around a second or even slightly more, depending on the server, parameters, the speed of your disk, etc.).
This thread addresses the question of whether the JVM is really slow to start; I probably won't be able to add more to that.
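As a tiny stdlib sketch of measuring this phase yourself: `RuntimeMXBean` exposes the JVM's start time, so an application can report how long it took to reach any given point (the class name here is made up):

```java
import java.lang.management.ManagementFactory;

// Prints how many milliseconds elapsed between JVM start and reaching main().
public class StartupTimer {
    public static long millisSinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    public static void main(String[] args) {
        System.out.println("ms since JVM start: " + millisSinceJvmStart());
    }
}
```

Calling this at the end of application initialization (instead of at the top of main) gives a rough total startup figure.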
Ok, so now it's time to load the Tomcat internal classes. This again can take a couple of seconds on modern servers. Netty seems to be faster. You can try to download a standalone distribution of Tomcat and start it up on your machine, or create a sample application without Spring Boot but with embedded Tomcat, to see what I'm talking about.
So far so good; now comes our application. As I said in the beginning, Spring Boot is purely a runtime framework. So the classes of Spring/Spring Boot itself must be loaded, and then the classes of the application itself. If the application uses some libraries, they'll also be loaded, and sometimes custom code is even executed during application startup: Hibernate may check and/or scan DB schema definitions and even update the underlying schema, Flyway/Liquibase can execute schema migrations, Swagger might scan controllers and generate documentation, and what not.
Now in "real life" this process can take a minute or even more, but that's not because of Spring Boot itself; rather it comes from the beans created in the application that have some code in their constructors/post-construct methods, which runs during Spring application context initialization. Another side note: I won't really dive into the internals of the Spring Boot startup process. Spring Boot is an extremely powerful framework with a lot happening under the hood; I assume you've worked with it in one way or another. If not, feel free to ask concrete questions about it and I/my colleagues will try to address them.
If you go to start.spring.io you can create a sample demo application; it will load pretty fast. So it all depends on your application's beans.
In this light, what exactly should be optimized?
You've mentioned in comments that there might be a tomcat running with some JARs so that they won't be loaded upon the spring boot application starts.
Well, like our colleagues mentioned, this indeed more resembles a "traditional" web servlet container/application server model that we, people in the industry, "used for ages" (for around 20 years more or less).
This kind of deployment indeed has an "always up-and-running" JVM process that is "always" ready to accept WAR files: packaged archives of your application.
Once it detects a WAR dropped into some folder, it will "deploy" the application by creating a hierarchical classloader and loading up the application's JARs/classes. What's interesting in your context is that it was possible to "share" libraries between multiple WARs so that they were loaded only once. For example, if your Tomcat hosts, say, 3 applications (read: 3 WARs) and all of them use the Oracle database driver, you can put that driver's jar into a shared libs folder and it will be loaded only once, by the classloader that is a "parent" of the classloaders created per WAR. This classloader hierarchy is crucial, but I believe it's outside the scope of the question.
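The sharing effect can be sketched with plain JDK classloaders (the class name is illustrative): two "per-WAR" child loaders that delegate to the same parent end up seeing a single copy of any class the parent defines.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class SharedParentDemo {
    public static Class<?> loadViaChild(ClassLoader parent) throws Exception {
        // A "per-WAR" child loader with no jars of its own; parent-first
        // delegation makes it resolve the class from the shared parent.
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            return child.loadClass("SharedParentDemo");
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader shared = SharedParentDemo.class.getClassLoader();
        Class<?> a = loadViaChild(shared); // "WAR" number one
        Class<?> b = loadViaChild(shared); // "WAR" number two
        // Both children got the one copy defined by the shared parent.
        System.out.println(a == b); // prints true
    }
}
```

If the children had their own copy of the jar on their URL paths and skipped parent delegation, each would define its own Class object, which is exactly the isolation per-WAR loaders provide.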
I used to work with both models (spring boot driven with embedded server, an application without spring boot with embedded Jetty server and "old-school" tomcat/jboss deployments ).
From my experience, and as time proves many of our colleagues agree on this point, Spring Boot applications are much more convenient operationally for many reasons (again, these reasons are out of scope for the question IMO; let me know if you need to know more). That's why it's the current "trend", and "traditional" deployments remain in the industry for many non-technical reasons (historical, the system is "defined to be" in maintenance mode, you already have a deployment infrastructure, a team of "sysadmins" who "know" how to deploy, you name it; but bottom line, nothing purely technical).
Now with all this information you probably understand better why I suggested taking a look at GraalVM, which will allow a faster application startup by means of native images.
One more point that might be relevant. If you're choosing a technology that allows a fast startup, you're probably looking at AWS Lambda or the alternatives offered by other cloud providers these days.
This model allows virtually infinite scalability of "computational" power (CPU); under the hood they "start" containers and "kill" them immediately once they detect that a container is doing nothing. For this kind of application Spring Boot simply is not a good fit, and neither, basically, is Java, again because the JVM process is relatively slow to start: once they start a container like this it takes too long until it becomes operational.
You can read here about what the Spring ecosystem has to offer in this field, but it's not really relevant to your question (I'm trying to provide directions).
Spring Boot shines when you need an application that might take some time to start, but once it starts it can do its job pretty fast. And yes, it's possible to stop the application (we use the terms scale out/scale in) if it's not "occupied" doing actual work. This approach is also kind of new (~3-4 years) and works best in "managed" deployment environments like Kubernetes, Amazon ECS, etc.
So if speeding up application start is your goal, I think you would need a different approach. Here is a summary of why I think so:
docker: a container is a running instance of an image. You can see an image as a filesystem (it is actually more than that, but we are talking about libraries). In a container you have a JDK (and I guess your image is based on Tomcat). The Docker engine has a very well designed cache system, so containers start very quickly; if no changes are made to a container, Docker only needs to retrieve some info from a cache. Containers are isolated, and for good reasons (security, modularity, and, talking about libraries, isolation lets you have different versions of a library in different containers). Volumes do not do what you think: they are not designed to share libraries; they let you break isolation to do certain things. For example, you can create a volume for your codebase so you don't have to rebuild an image for each change during the programming phase, but you usually won't see that in a production environment (maybe for some config files).
java/spring: Spring is a framework based on Java, Java is based on a JDK, and Java code runs on a VM. So to run a Java program you have to start that VM (there is no other way), and of course you cannot eliminate this startup time. The Java environment is very powerful, but this is why a lot of people prefer Node.js, especially for little services: Java code is slow to start up (minutes versus seconds). Spring, as said before, is based on Java, servlets, and a context. A Spring application lives in that context, so to run a Spring application you have to initialize that context.
You are running a container; on top of that you are running a VM; then you are initializing a Spring context; and finally you are initializing the beans of your application. These steps are sequential for dependency reasons. You cannot initialize Docker, the VM, and a Spring context and then run your application somewhere else. For example, if you add a filter chain to a Spring application, you need to restart the application because you need to add a servlet to your system. If you want to speed up the startup process you would need to change the Java VM or make some changes in the Spring initialization. In summary, you are trying to deal with this problem at a high level instead of a low level.
To answer your first question:
I am not exactly sure on how to achieve this? Can I achieve this by using volumes or should I be looking for a completely different approach?
This has to be balanced against the actual capabilities of your infrastructure.
One could have a full optical fibre network and a Docker repository within that network, and thus bandwidth that allows for "big" images.
One could have a poor connection to its Docker repository and so need extra-small images, but a quick connection to the library repositories.
One could have a blue/green deployment technique and care about neither the size of the images and layers nor the boot time.
One thing is certain: caring about image and layer size is good, and it is definitely a good practice advised by Docker, but it all depends on your needs. The recommendation to keep images and layers small is for images that you'll distribute. If this is your own image for your own application, then you should act upon your needs.
Here is a little bit of my own experience: in a company I was working for, we needed the database to be synchronised back from production to the user acceptance test and developer environments.
Because of the size of the production environment, importing the data from an SQL file in the entrypoint of the container took around twenty minutes. This might have been alright for the UAT environment, but it was not for the developers'.
So after trying all sorts of minor improvements to the SQL file (like disabling foreign key checks and the like), I came up with a totally new approach: in a nightly build, I created a big fat image that would contain the database already. This is, indeed, against all Docker good practice, but the bandwidth at the office allowed the container to start in a matter of 5 minutes at worst, compared to the twenty it took before.
So I indeed ended up with a humongous build time for my Docker SQL image, but a download time that was acceptable given the available bandwidth, and a run time reduced to the minimum.
This takes advantage of the fact that the build of an image happens only once, while the start time matters for every container derived from that image.
To answer your second question:
On another note, if I want x containers to run, I will make sure there are x pre-existing containers running. This is to make sure that every container has its own specific library container to work with. Would this be okay?
I would say the answer is: no.
Even in a microservices architecture, each service should be able to do something. As I understand it, your actual not-library containers are unable to do anything because they are tightly coupled to the pre-existence of another container.
This said there are two things that might be of interest to you:
First: remember that you can always build from another pre-existing image, even your own.
Given this would be your library-container Dockerfile
FROM openjdk:8-jdk-alpine
ARG JAR_FILE=target/*.jar
COPY ${JAR_FILE} app.jar
ENTRYPOINT ["java","-jar","/app.jar"]
Credits: https://spring.io/guides/topicals/spring-boot-docker/#_a_basic_dockerfile
And that you build it via
docker build -t my/spring-boot .
Then you can have another container build on top of that image:
FROM my/spring-boot
COPY some-specific-lib lib.jar
Secondly: there is a nice technique in Docker to deal with libraries that is called multi-stage builds and that can be used exactly for your case.
FROM openjdk:8-jdk-alpine as build
WORKDIR /workspace/app
COPY mvnw .
COPY .mvn .mvn
COPY pom.xml .
COPY src src
RUN ./mvnw install -DskipTests
RUN mkdir -p target/dependency && (cd target/dependency; jar -xf ../*.jar)
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG DEPENDENCY=/workspace/app/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
Credits: https://spring.io/guides/topicals/spring-boot-docker/#_multi_stage_build
And as you can see in the credits of this multi-stage build, there is even a reference to this technique in the guide of the Spring website.
The manner in which you are attempting to reach your goal defies the entire point of containerisation.
Let's cycle back and firmly focus on the goal: you are aiming to "resolve the problem of cold start" and to "speed up the spring boot application".
Have you considered actually compiling your Java application to a native binary?
The essence of the JVM is to support Java's feature of interoperability in a respective host environment. Since containers by their nature inherently resolve interoperability, another layer of resolution (by the JVM) is absolutely irrelevant.
Native compilation of your application will factor out the JVM from your application runtime, therefore ultimately resolving the cold start issue. GraalVM is a tool you could use to do native compilation of a Java application. There are GraalVM Container Images to support your development of your application container.
Below is a sample Dockerfile that demonstrates building a Docker image for a native compiled Java application.
# Dockerfile
FROM oracle/graalvm-ce AS builder
LABEL maintainer="Igwe Kalu <igwe.kalu@live.com>"
COPY HelloWorld.java /app/HelloWorld.java
RUN \
set -euxo pipefail \
&& gu install native-image \
&& cd /app \
&& javac HelloWorld.java \
&& native-image HelloWorld
FROM debian:10.4-slim
COPY --from=builder /app/helloworld /app/helloworld
CMD [ "/app/helloworld" ]
# .dockerignore
**/*
!HelloWorld.java
// HelloWorld.java
public class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello, Native Java World!");
}
}
Build the image and run the container:
# Building...
docker build -t graalvm-demo-debian-v0 .
# Running...
docker run graalvm-demo-debian-v0:latest
## Prints
## Hello, Native Java World!
Spring Tips: The GraalVM Native Image Builder Feature is an article that demos building a Spring Boot application with GraalVM.
I have an application that dispatches Apache Flink jobs to an AWS Elastic MapReduce YARN cluster through the RemoteExecutionEnvironment scala API.
These jobs use JNI to run part of their calculations through a C library. During development I just put a System.loadLibrary() call in a RichCrossFunction's open() method to load this JNI library. This worked fine in a LocalExecutionEnvironment.
Now that I'm moving to a RemoteExecutionEnvironment this no longer seems to work. It looks like Flink is using a new ClassLoader every time it dispatches a job and I am getting Native library already loaded in another classloader errors on the calculation nodes.
Some googling informed me that this is a common issue with Tomcat applications and a solution is available in the Tomcat FAQ: http://wiki.apache.org/tomcat/HowTo#I.27m_encountering_classloader_problems_when_using_JNI_under_Tomcat
Is there a similar solution available for Flink or YARN?
Additionally, is it possible to avoid resubmitting the JAR every time a job is queued? I'm always using the same jar on this cluster, so it's unnecessary overhead...
I fixed the issue by calling loadLibrary in a static initializer in my JNI jar, then dropping my JNI jar in Flink's /lib folder, similar to the pattern in the Tomcat link above.
It is automatically replicated to the Flink TaskManagers by the yarn-session.sh startup procedure. This allowed me to circumvent the ClassLoader segregation in the same way you would do with Tomcat.
I'm using Maven, so I prevented the JNI jar from being included in my uberjar using maven-shade-plugin.
I still don't know if this is the best way, since the flink manual discourages using the /lib folder because it doesn't respect their ClassLoader management (https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/cluster_execution.html), but that's exactly what I wanted.
Maybe another way would have been to use a NativeLoader pattern and create a separate temp file for every ClassLoader, but that would create a bunch of duplicate native libraries and this method works for me.
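The static-initializer pattern can be sketched as below (the class and library names are hypothetical). Because the load happens when the class is defined, shipping this class in a jar placed in Flink's /lib folder means the shared parent classloader defines it exactly once, so System.loadLibrary() runs only once no matter how many jobs are submitted:

```java
// Hypothetical wrapper class; ship it in a separate jar placed in Flink's /lib.
public class NativeBridge {
    private static final boolean LOADED;

    // Runs once per defining classloader. With the jar in /lib, that is the
    // parent classloader shared by all jobs, which avoids the
    // "Native library already loaded in another classloader" error.
    static {
        boolean ok;
        try {
            System.loadLibrary("mynative"); // hypothetical library name
            ok = true;
        } catch (UnsatisfiedLinkError e) {
            ok = false; // not on java.library.path; callers can degrade gracefully
        }
        LOADED = ok;
    }

    public static boolean isLoaded() {
        return LOADED;
    }

    public static void main(String[] args) {
        System.out.println("native loaded: " + isLoaded());
    }
}
```

Job code then calls NativeBridge methods instead of loading the library itself, so the RichCrossFunction's open() no longer needs a loadLibrary call.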
I have several webapps deployed to production. I have used Tomcat as my servlet engine for ~10 years now. I'm considering moving to embedding Jetty model from the deploy-a-war-into-Tomcat model.
These webapps are deployed over several servers and some of the are horizontally scaled (using nginx IP hash based partitioning).
I see some advantages:
I can configure my servlet engine for a particular webapp (instead of having a generic configuration for Tomcat which is running several different webapps)
It's easier to horizontally scale my webapp by running multiple Jetty instances (configured to listen on different ports) on the same host. I could also do this with Tomcat (and I have run multiple tomcat instance on the same host in the past), but I've moved to using Debian packages (.deb archives) for deployment and it's not as easy to run multiple Tomcats using this method.
My deployment package (.deb) is more "complete" at build time i.e. I don't have to be sure the Tomcat instance is configured correctly.
And disadvantages:
More instances of a servlet engine running on a server means more resources being used
I've never used Jetty. I don't think I have any Tomcat-specific stuff going on in my webapps, but I'm not sure.
My main concern is the amount of resources that Jetty will use. If I had one Tomcat instance running 4 webapps, what will the difference in resources (memory/processor) be with four Jetty instances running?
This question is probably too open-ended, but I'm curious to know if I'm overlooking something or if anybody has any experience moving from Tomcat to (embedded) Jetty.
The web container I've found easiest to embed in a jar file (and it is still a valid WAR too) is Winstone (http://winstone.sourceforge.net/).
Jenkins - http://jenkins-ci.org/ - uses this container, so it has been pretty well stress-tested. Note that it is Servlet 2.4 only.
Well I think there is no direct answer;
I might not fully understand the ".deb" part as I'm not a debian freak :)
I prefer having one Tomcat instance with a number of configurations, aka CATALINA_HOME folders, where you can specify the apps and ports to run; that way you always have all your configs separate and can change the Tomcat instance if needed.
Also, see related post:
Jetty: To embed or not to embed?
I also was used to Tomcat, so in my new project I tried using Jetty to learn about it.
In an enterprise environment (where you have production/testing/development servers) I would stick to Tomcat, mainly because it helps you separate code from configuration files (right now I am working on putting the conf files in a separate jar, because that way, when I move changes from testing to production, the jars I hand to the sysadmin don't have to be updated manually).
Another issue is that Jetty changed ownership not so long ago, and looking for info often got me to the old version.
Apart from that, using Jetty is not that different from Tomcat. I had to run through the docs a little to find where everything is, but the structure is (at least from what I have seen; I have not tried anything too complicated) more or less like Tomcat's.
You might have a set of properties that is used on the developer machine, which varies from developer to developer, another set for a staging environment, and yet another for the production environment.
In a Spring application you may also have beans that you want to load in a local environment but not in a production environment, and vice versa.
How do you handle this? Do you use separate files, ant/maven resource filtering or other approaches?
I just put the various properties in JNDI. This way each of the servers can be configured and I can have ONE war file.
If the list of properties is large, then I'll host the properties (or XML) files on another server. I'll use JNDI to specify the URL of the file to use.
If you are creating different app files (war/ear) for each environment, then you aren't deploying the same war/ear that you are testing.
In one of my apps, we use several REST services. I just put the root url in JNDI. Then in each environment, the server can be configured to communicate with the proper REST service for that environment.
I just use different Spring XML configuration files for each machine, and make sure that all the bits of configuration data that vary between machines is referenced by beans that load from those Spring configuration files.
For example, I have a webapp that connects to a Java RMI interface of another app. My app gets the address of this other app's RMI interface via a bean that's configured in the Spring XML config file. Both my app and the other app have dev, test, and production instances, so I have three configuration files for my app -- one that corresponds to the configuration appropriate for the production instance, one for the test instance, and one for the dev instance.
Then, the only thing that I need to keep straight is which configuration file gets deployed to which machine. So far, I haven't had any problems with the strategy of creating Ant tasks that handle copying the correct configuration file into place before generating my WAR file; thus, in the above example, I have three Ant tasks, one that generates the production WAR, one that generates the dev WAR, and one that generates the test WAR. All three tasks handle copying the right config file into the right place, and then call the same next step, which is compiling the app and creating the WAR.
Hope this makes some sense...
We use properties files specific to the environments and have the ant build select the correct set when building the jars/wars.
Environment specific things can also be handled through the directory service (JNDI), depending on your app server. We use tomcat and our DataSource is defined in Tomcat's read only JNDI implementation. Spring makes the lookup very easy.
We also use the Ant strategy for building different sites (differing content, security roles, etc.) from the same source project.
There is one thing that causes us a little trouble with this build strategy: files and directories often don't exist until the build is run, so it can be difficult to write true integration tests (using the same Spring setup as when deployed) that are runnable from within the IDE. You also miss out on some of the IDE's ability to check for the existence of files, etc.
I use Maven to filter out the resources under src/main/resources in my project. I use this in combination with property files to pull in customized attributes in my Spring-based projects.
For default builds, I have a properties file in my home directory that Maven then uses as overrides (so things like my local Tomcat install are found correctly). Test server and production server are my other profiles. A simple -Pproduction is all it then takes to build an application for my production server.
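As a sketch of that setup (the profile ids, property names, and paths below are illustrative, not from the original post), the pom.xml can enable resource filtering and define per-environment profiles selected with -P:

```xml
<!-- pom.xml excerpt -->
<build>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <!-- Replace ${...} placeholders in resources at build time -->
      <filtering>true</filtering>
    </resource>
  </resources>
</build>

<profiles>
  <profile>
    <id>production</id>
    <properties>
      <!-- Referenced as ${config.dir} inside filtered resources -->
      <config.dir>/etc/myapp/prod</config.dir>
    </properties>
  </profile>
  <profile>
    <id>test-server</id>
    <properties>
      <config.dir>/etc/myapp/test</config.dir>
    </properties>
  </profile>
</profiles>
```

With this in place, `mvn -Pproduction package` builds the artifact with the production values substituted.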
Use different properties files and use ant replace filters which will do the replacement based on environment for which the build is done.
See http://www.devrecipes.com/2009/08/14/environment-specific-configuration-for-java-applications/
Separate configuration files, stored in the source control repository and updated by hand. Typically configuration does not change radically between one version and the next so synchronization (even by hand) isn't really a major issue.
For highly scalable systems in production environments I would seriously recommend a scheme in which configuration files are kept in templates, and as part of the build script these templates are used to render "final" configuration files (all environments should use the same process).
I recently also used Maven for alternative configurations for live or staging environments. Production configuration using Maven Profiles. Hope it helps.
I use Ant's copy with a filter file.
In the directory with the config file with variables, I have a directory with a file for each environment. The build script knows the env and uses the correct variable file.
I have different configuration folders holding the configurations for the target deployment, and I use ANT to select the one to use during the file copy stage.
We use different ant targets for different environments. The way we do it may be a bit inelegant but it works. We will just tell certain ant targets to filter out different resource files (which is how you could exclude certain beans from being loaded), load different database properties, and load different seed data into the database. We don't really have an ant 'expert' running around but we're able to run our builds with different configurations from a single command.
One solution I have seen used is to configure the staging environment so that it is identical to the production environment. This means each environment has a VLAN with the same IP range, and machine roles on the same IP addresses (e.g. the db cluster IP is always 192.168.1.101 in each environment). The firewalls mapped external facing addresses to the web servers, so by swapping host files on your PC the same URL could be used - http://www.myapp.com/webapp/file.jsp would go to either staging or production, depending on which hosts file you had swapped in.
I'm not sure this is an ideal solution, it's quite fiddly to maintain, but it's an interesting one to note.
Caleb P and JeeBee probably have your fastest solution. Plus you don't have to set up different services or point to files on different machines. You can specify your environment either by using a ${user.name} variable or by specifying the profile in a -D argument for Ant or Maven.
Additionally in this setup, you can have a generic properties file, and overriding properties files for the specific environments. Both Ant and Maven support these capabilities.
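That defaults-plus-overrides layering maps directly onto java.util.Properties, which accepts a fallback table in its constructor. A small sketch (the keys and values are made up for illustration):

```java
import java.util.Properties;

public class LayeredConfig {
    // Environment-specific entries win; anything missing falls back to defaults.
    public static Properties resolve(Properties defaults, Properties overrides) {
        Properties merged = new Properties(defaults); // defaults as fallback
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("db.url", "jdbc:h2:mem:dev");
        base.setProperty("pool.size", "5");

        Properties prod = new Properties();
        prod.setProperty("db.url", "jdbc:postgresql://prod-db/app");

        Properties merged = resolve(base, prod);
        System.out.println(merged.getProperty("db.url"));    // overridden per environment
        System.out.println(merged.getProperty("pool.size")); // inherited default
    }
}
```

The same merge order works whether the override set comes from an Ant filter file, a Maven profile, or a per-environment properties file on the classpath.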
Don't forget to investigate PropertyPlaceholderConfigurer - this is especially useful in environments where JNDI is not available