I understand the concept of source version control and how it applies to self-contained projects like a Windows application. But for web development, most files are stored on the web server. This has become a headache for development: many people just copy and rename files, and pushing files over to production is another mess.
I need some kind of source version control that is relatively easy to learn and is GUI-based, or at least has a GUI as an option. The people who will use this have little or no knowledge of the command line.
How can I integrate source version control with web server files? What software is available for such an endeavor? And is it possible to have the source version control software administer both the production and development web servers, or must I have a separate version control install for each web server and push changes over manually?
The web servers are Windows-based and also use Tomcat for Java/JSP.
Any help would be appreciated. Thank you.
I think you are not clear on the idea of version control. Version control is about managing your code. You put your code on a remote server (possibly in a central location) and access it using a client tool. This way a number of people can work on different parts of the code and then push their work to the version control server. It has nothing to do with the type of the project.
The project can be a Windows application, a web server application, or any other application.
While using version control, at regular intervals or whenever needed, you build your code from the version control server and deploy it to the web server, which means you are deploying code that is already built (a .war for a web application).
You first deploy to your development server and later deploy the same war to the production server.
You can use SVN server for your version control server and Tortoise SVN as client.
You have to keep in mind two different but interacting things: Version Control and Deploy Tools.
VCS deals with any items that evolve over time and that you want to keep under control.
Deployment just delivers the correct objects to the correct place at the correct time, and turns a "set of somethings" into a product.
Deployment isn't a problem per se (almost any job can be automated). The main problem in a multi-DEV environment (2+ developers) with a central STAGE server (less so with PROD) is communication between devs and synchronization of their operations, i.e. workflow and management:
Just imagine two (or more) devs, performing different unrelated tasks, who each want to test their own latest changes (and only their own) on a common STAGING server (because they have no functional local environment). If the first deploys some work in progress to the server, he doesn't want his tests interrupted and his code polluted by someone else's changes being deployed on top. The devs must communicate and coordinate their actions; it can't be a dumb "copy to..." in a post-commit hook.
And is it possible to have the source version control software administer both the production and development web servers
Yes. But a VCS does not "administer" web servers in the common sense; rather, it "communicates with" them or "takes them into account".
My area of expertise is not in Java; I write Java code snippets where needed by my other apps. Others in my organization do work with Java.
For a large browser-based app developed in Java using Eclipse Oxygen and deployed on Tomcat or WildFly on a RedHat 7 server, how does one create a patch? I.e., where a couple of JSPs and classes have changed, must the complete WAR be created and redeployed? Or is there a way to generate a WAR that contains only these changes, and deploy it on the target machine via the CLI or the TC/WF console?
The complete WAR may be a few gigs in size, so it would take longer to email to the client, whereas a 50KB patch would appeal more to a sense of neatness. Also, if I send out monthly updates with full WARs, the backup size would be much larger.
Is there a tutorial one could follow?
I have been programming in Java for 3 years, but have mostly been deploying Java applications locally, and thus, have not had a need to think about scale, auto-updating, deployment servers, and the like. I simply used a corporate shared file drive for distribution, and wrapped the jar in an Exe wrapper every time I wanted to push an update.
However, the company I work for wants to scale the Java desktop application I made and deploy it across departments. This calls for automated deployment, updating, and change tracking. So, for the last week, I have been searching and speaking with a lot of people about the process of automating the deployment of Java applications. Of course, I have found a huge number of responses pertaining to Java EE, Tomcat servers, and mostly web-based applications. But what I am looking for is a way to manage desktop-based Java applications that run from an executable on a user's Windows machine.
The question I have run into, then, is how do I get started setting up automatic builds, updates, and deployment when the target audience is running the application from their desktop as an Exe-wrapped jar file? I have found a few articles that reference installers with update capabilities, but I'm trying to figure out how I can utilize a Java server to auto-compile and distribute updates, and all that jazz. Like I said, I am extremely dumb when it comes to this side of Java and would love any direction that there is to offer. Thanks!
When shipping a web application to production, would you consider an enterprise application archive or an RPM?
What are the cons and pros of each?
With RPM you can keep track of versioning and treat configuration files properly.
Let's assume your application won't be installed on Windows boxes, so OS dependency is not an issue we are worried about.
Most Java web applications that I've worked with, or studied, have been typically published as Enterprise or Web archives. The case for RPMs is quite weak, except for very specific scenarios.
There are several points that go in favor of EAR/WAR files:
Installation is often easy, even if it involves some manual process of copying/uploading EAR/WAR files to a designated directory. However, you need to know your target audience here. If you are expecting that a Linux system administrator (who has little or no knowledge of Java application servers) will perform the installation and maintenance of the application, then you might be right in choosing RPMs. This is, however, a rare case for businesses, for it makes less sense from the point of view of support; you simply do not want to be at the mercy of a third-party application developer when installation/configuration issues are encountered in production.
EAR/WAR files can be published in a manner that allows for portable installations. It is theoretically possible to support multiple containers with a single build. This is a far better alternative than requiring an RPM to be published per container; each RPM would have to install an application-specific container and publish the EAR/WAR file to this embedded container. If you wish to have your customers retain the choice of deploying onto their own containers, an RPM-only deployment model will require them to extract the EAR/WAR file from the RPM and then perform the deploy themselves.
RPMs cannot be used to deploy applications in a standard manner across commercial containers like WebLogic, WebSphere, etc. It simply cannot be done, unless you expect your customers to employ a standard installation model involving directory layouts, clustering modes, etc. RPMs that are created in-house to target a single customer might not have this problem, as internal standards can be established on how the containers are installed and configured.
Inferring from the above statements, EAR/WAR files ought to be available always to account for customer needs, with the added possibility of having RPMs for a hassle-free installation.
I'm not very experienced, but I have worked on some big Java EE projects (using Maven 2) with very distinct ways of handling installation/delivery on the different platforms.
1) One of them was to use snapshots for development and then make a Maven release of the components and the main web applications. Thus the delivery consists of:
war/ear files
properties files
database (SGBD) files
some others
And the teams will use those files to put the new application versions on the different platforms.
I think this process is strict and makes it easy to keep track of the different configurations that have gone to production, but it's not very flexible; the process is a bit heavy, and it sometimes led us to do dirty things like overriding a class in a war to patch a regression...
This is an e-commerce website with 10 million unique visitors per month and 99.89% availability.
2) Another approach I saw was to check out the sources on each platform and then install the snapshot artifacts in a local repository. The application server then uses these snapshots from the .m2 folder.
There is no real delivery process: to put a new version in production, we just have to update the sources of the components/webapps, do a maven clean install, and restart the application server.
I think it's more flexible, but I see some drawbacks, and this approach seems dangerous to me.
This website has a front office; I don't know the numbers, but it's far less than the first one. It also has a big back office available to most employees of a 130,000-person company.
I guess that depending on the website, its exposure to the public, and the availability required, we have to adapt the delivery strategy to the needs.
I'm not here to ask which solution is the best but wonder if you have seen different things, and which strategy you would use in which case?
Without dealing with web sites, I had to participate in the release management process for various big (Java) projects in a heterogeneous environment:
development on "PC", meaning in our case Windows -- sadly still Windows XP for now -- (and unit testing)
continuous integration and system testing on linux (because they are cheaper to setup)
pre-production and production on Solaris (Sun Fire for instance)
The common method I saw was:
binary dependency (each project uses the binaries produced by the other project, not their sources)
no recompilation for integration testing (the jars produced on PC are directly used on linux farms)
full recompilation on pre-production (meaning the binaries stored in the Maven repo), at least to make sure that everything is recompiled with the same JDK and the same options.
no VCS (Version Control System, like SVN, Perforce, Git, Mercurial, ...) on a production system: everything is deployed from pre-prod through rsync.
So the various parameters to take into account for a release management process are:
when you develop your project, do you depend directly on the sources or the binaries of the other projects?
where do you store your setting values?
Do you parametrize them and, if yes, when do you replace the variables by their final values (only at startup, or also during runtime?)
do you recompile everything on the final (pre-production) system?
How do you access/copy/deploy on your production system?
How do you stop/restart/patch your applications?
(This is not an exhaustive list; depending on the nature of the application release, other concerns will have to be addressed.)
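One of the parameter questions above - when do you replace placeholder variables with their final values - can be sketched in a few lines. This is an illustrative example, not taken from any particular project: a ${NAME} token in a configuration value is substituted at startup from a map (for instance, the process environment).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal placeholder substitution at startup: replaces ${NAME} tokens in a
// configuration value with entries from a map (e.g. the process environment).
// Unknown placeholders are left untouched so they are easy to spot.
public class PlaceholderResolver {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    public static String resolve(String template, Map<String, String> values) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Fall back to the original ${NAME} text when no value is known.
            String replacement = values.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of("DB_HOST", "db.example.com", "DB_PORT", "5432");
        System.out.println(resolve("jdbc:postgresql://${DB_HOST}:${DB_PORT}/app", env));
    }
}
```

Doing this once at startup (rather than at build time) is what lets a single binary artifact travel unchanged from integration to pre-production to production.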
The answer to this varies greatly depending on the exact requirements and team structures.
I've implemented processes for a few very large websites with similar availability requirements and there are some general principles I find have worked:
Externalise any config so that the same built artifact can run on all your environments. Then build the artifacts only once for each release - rebuilding for different environments is time-consuming and risky, e.g. it's not the same app that you tested.
Centralise the place where the artifacts get built - e.g. all wars for production must be packaged on the CI server (using the Maven release plugin on Hudson works well for us).
All changes for release must be traceable (version control, audit table etc.), to ensure stability and allow for quick rollbacks & diagnostics. This doesn't have to mean a heavyweight process - see the next point
Automate everything: building, testing, releasing, and rollbacks. If the process is dependable, automatable and quick, then the same process can be used for everything from quick fixes to emergency changes. We use the same process for a quick 5-minute emergency fix and for a major release, because it is automated and quick.
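The first principle above (externalised config, one artifact for all environments) could be sketched like this. The property name app.config and the file names here are assumptions for illustration only: the artifact carries defaults on its classpath, and each environment supplies its own overrides file from outside the artifact.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// One artifact for every environment: the build contains only defaults,
// and each environment points the app at its own config file via a
// system property, e.g. -Dapp.config=/etc/myapp/app.properties.
// Both the property name and the file paths are illustrative.
public class ExternalConfig {
    public static Properties load() throws IOException {
        Properties props = new Properties();
        // Defaults baked into the artifact (on the classpath), if present.
        try (InputStream defaults =
                 ExternalConfig.class.getResourceAsStream("/defaults.properties")) {
            if (defaults != null) props.load(defaults);
        }
        // Environment-specific overrides live outside the artifact.
        String external = System.getProperty("app.config");
        if (external != null) {
            try (InputStream in = new FileInputStream(external)) {
                props.load(in); // later loads override earlier defaults
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("db.url = " + load().getProperty("db.url", "<unset>"));
    }
}
```

Because the environment-specific values never enter the artifact, the exact bytes you tested on staging are the bytes you run in production.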
Some additional pointers:
See my answer on property-placeholder location from another property for a simple way to load different properties per environment with Spring.
http://wiki.hudson-ci.org/display/HUDSON/M2+Release+Plugin If you use this plugin and ensure that only the CI server has the correct credentials to perform Maven releases, you can ensure that all releases are performed consistently.
http://decodify.blogspot.com/2010/10/how-to-build-one-click-deployment-job.html A simple way of deploying your releases. Although for large sites you will probably need something more complicated to ensure no downtime - e.g. deploying to half the cluster at a time and flip-flopping web traffic between the two halves - http://martinfowler.com/bliki/BlueGreenDeployment.html
http://continuousdelivery.com/ A good website and book with some very good patterns for releasing.
Hope this helps - good luck.
To add to my previous answer, what you are dealing with is basically a CM-RM issue:
CM (Change Management)
RM (Release Management)
In other words, after the first release (i.e. once the main initial development is over), you have to keep making releases, and that is what CM-RM is supposed to manage.
The implementation of the RM can be either 1) or 2) in your question, but my point would be to add to that mechanism:
proper CM in order to track any change request, and evaluate their impact before committing to any development
proper RM in order to be able to run the "release" tests (system, performance, regression, deployment tests), and then to plan, schedule, perform and monitor the release itself.
Without claiming it's a best solution, this is how my team currently does staging and deployment.
Developers initially develop at their local machine, the OS is free to choose, but we strongly encourage using the same JVM as will be used in production.
We have a DEV server to which snapshots of the code are frequently pushed. This is simply an scp of the binary build produced from the IDE. We plan to build directly on the server, though.
The DEV server is used for stakeholders to continuously peek along with development. By its very nature it's unstable. This is well known with all users of this server.
If the code is good enough, it's branched and pushed to a BETA server. Again, this is a scp of a binary build from the IDE.
Testing and general QA takes place on this BETA server.
Mean while, if any emergency changes should be necessary for the software currently in production, we have a third staging server called the UPDATE server.
The UPDATE server is initially only used to stage very small fixes. Here too we use scp to copy binaries.
After all testing is conducted on UPDATE, we copy the build from UPDATE to LIVE. Nothing ever goes to the live servers directly, it always goes via the update server.
When all testing is finalized on BETA, the tested build is copied from the beta server to the UPDATE server and a final round of sanity testing is performed. Since this is the exact build that was tested on the beta server, it is very unlikely that problems are found in this stage, but we uphold the rule that everything deployed to the live server should go via the update server and that everything on the update server should be tested before moving it on.
This sliding strategy allows us to develop for 3 versions in parallel. Version N that's currently in production and staged via the update server, version N+1 that will be the next major release that's about to be released and is staged on the beta server, and version N+2 that is the next-next major release for which development is currently underway and is staged on the dev server.
Some of the choices that we made:
A full application (an EAR) typically depends on artifacts from other projects. We choose to include the binaries of those other projects instead of building the whole thing from source. This simplifies building and gives greater assurance that a tested application is bundled with exactly the right versions of all its dependencies. The cost is that a fix in such a dependency has to be manually distributed to all applications that depend on it.
Configuration for every stage is embedded in the EAR. We currently use a naming convention, and a script copies the right version of each configuration file to the right location. Parameterizing the path for each configuration file, e.g. by using a single {stage} placeholder in a root config file, is currently being considered. The reason we store the config in the EAR is that the developers are the ones who introduce and depend on configuration, so they should be the ones responsible for maintaining it (adding new entries, removing unused ones, tweaking existing ones, etc.).
We use a DevOps strategy for a deployment team. It consists of a person who is purely a developer, two persons who are both developer and operations and two persons who are purely operations.
Embedding the configuration in the EAR might be controversial, since traditionally operations needs to have control about e.g. the DB data sources being used in production (to what server it points to, how many connections a connection pool is allowed to have, etc). However, since we have persons on the development team who are also in operations, they are easily able to sanity check the changes made by other developers in the configuration while the code is still in development.
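The {stage} placeholder being considered above could look roughly like this; the template path and stage names are illustrative, not the team's actual convention.

```java
// Sketch of the "{stage} placeholder" idea: one root config lists file
// paths containing {stage}, and at deploy time each path is resolved
// for the target environment (here the DEV/BETA/UPDATE/LIVE servers
// described in the answer). All names are illustrative.
public class StageResolver {
    public static String resolve(String pathTemplate, String stage) {
        return pathTemplate.replace("{stage}", stage);
    }

    public static void main(String[] args) {
        String template = "conf/datasource-{stage}.xml";
        for (String stage : new String[] {"dev", "beta", "update", "live"}) {
            System.out.println(resolve(template, stage));
        }
    }
}
```

Compared to the naming-convention-plus-copy-script approach, a single resolved placeholder removes the copy step: the EAR carries one template, and each server reads its own variant.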
In parallel with the staging, we have the continuous build server doing a scripted (Ant) build after every check-in (with a maximum of once per 5 minutes), which runs unit tests and some other integrity tests.
It remains difficult to say whether this is a best-of-breed approach and we're constantly trying to improve our process.
I am a big advocate of a single deployable containing everything (Code, Config, DB Delta, ...) for all environments, built and released centrally on the CI server.
The main idea behind this is that Code, Config & DB Delta are tightly coupled anyway. The code is dependent on certain properties being set in the config and some objects (tables, views, ...) being present in the DB. So why split this and spend your time tracking everything to make sure it fits together, when you can just ship it together in the first place.
Another big aspect is minimizing differences between environments, to reduce failure causes to the absolute minimum.
More details in my Continuous Delivery talk on Parleys: http://parleys.com/#id=2443&st=5
I recently had a problem where my Java code worked perfectly on my local machine, but it just wouldn't work when I deployed it onto the web server, especially the DB part. The worst part is that the server is not my machine, so I had to go back and forth to check the versions of software, the DB accounts, the settings, and so on...
I have to admit that I did not do a good job with the logging mechanism in the system. However, as a newbie programmer with little experience, I had to accept my learning curve. Therefore, here comes a very general but important question:
According to your experience, where would it be most likely to go wrong when it is working perfectly on the development machine but totally surprises you on the production machine?
Thank you for sharing your experience.
The absolute number one cause of problems which occur in production but not in development is Environment.
Your production machine is, more likely than not, configured very differently from your development machine. You might be developing your Java application on a Windows PC whilst deploying to a Linux-based server, for example.
It's important to try and develop against the same applications and libraries as you'll be deploying to in production. Here's a quick checklist:
Ensure the JVM version you're using in development is the exact same one on the production machine (java -version).
Ensure the application server (e.g. Tomcat, Resin) is the same version in production as you're using in development.
Ensure the version of the database you're using is the same in production as in development.
Ensure the libraries (e.g. the database driver) installed on the production machine are the same versions as you're using in development.
Ensure the user has the correct access rights on the production server.
Of course you can't always get everything the same -- a lot of Linux servers now run in a 64-bit environment, whilst this isn't always the case (yet!) with standard developer machines. But, the rule still stands that if you can get your environments to match as closely as possible, you will minimise the chances of this sort of problem.
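Part of the checklist above can be automated: run a small fingerprint program on both machines and diff the output. This is a generic sketch using standard JVM system properties; extend the key list with whatever matters for your app.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Prints a small "environment fingerprint" of the JVM it runs on.
// Run it on both the development and the production machine and diff
// the output to spot JVM/OS/user mismatches from the checklist above.
public class EnvFingerprint {
    public static Map<String, String> fingerprint() {
        Map<String, String> fp = new LinkedHashMap<>();
        for (String key : new String[] {
                "java.version", "java.vendor", "os.name", "os.arch",
                "user.name", "file.encoding", "user.timezone"}) {
            fp.put(key, System.getProperty(key, "<unknown>"));
        }
        return fp;
    }

    public static void main(String[] args) {
        fingerprint().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

For the application server, database, and driver versions the checklist mentions, you would add app-specific probes; the JVM properties shown here are only the common baseline.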
Ideally you would build a staging server (which can be a virtual machine, as opposed to a real server) which has exactly (or as close as possible to) the same environment as the production server.
If you can afford a staging server, the deployment process should be something like this:
Ensure application runs locally in development and ensure all unit and functional tests pass in development
Deploy to staging server. Ensure all tests pass.
Once happy, deploy to production
You're most likely running under a different user account. So the environment that you inherit as a developer will be vastly different from that of a production user (which is likely to be a very cut-down environment). Your PATH/LD_LIBRARY_PATH (or Windows equivalents) will be different. Permissions will have changed, etc. Plus the installed software will be different.
I would strongly recommend maintaining a test box and a test user account that is set up with the same software, permissions and environment as the production user. Otherwise you really can't guarantee anything. You really need to manage and control the production and test servers with regard to accounts, installed software, etc. Your development box will always be different, but you need to be aware of the differences.
Finally a deployment sanity check is always a good idea. I usually implement a test URL that can be checked as soon as the app is deployed. It will perform database queries or whatever other key functions are required, and report unambiguously as to what's working/not working via a traffic light mechanism.
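A minimal sketch of such a traffic-light check, with the checks reduced to placeholders (a real one would run a database query, touch the filesystem, call key services, etc.). In a web app this would sit behind a test URL, e.g. a servlet; that wiring is omitted here so the example stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Traffic-light deployment sanity check: register named checks and
// report GREEN only when every one of them passes. The check bodies
// below are stand-ins for real probes (DB query, config read, ...).
public class HealthCheck {
    private final Map<String, Supplier<Boolean>> checks = new LinkedHashMap<>();

    public HealthCheck register(String name, Supplier<Boolean> check) {
        checks.put(name, check);
        return this;
    }

    public String trafficLight() {
        boolean allGreen = true;
        for (Map.Entry<String, Supplier<Boolean>> e : checks.entrySet()) {
            boolean ok;
            try {
                ok = e.getValue().get();
            } catch (RuntimeException ex) {
                ok = false; // a throwing check counts as failed
            }
            System.out.println(e.getKey() + ": " + (ok ? "OK" : "FAILED"));
            allGreen &= ok;
        }
        return allGreen ? "GREEN" : "RED";
    }

    public static void main(String[] args) {
        HealthCheck hc = new HealthCheck()
            .register("config readable", () -> true)    // stand-in check
            .register("database reachable", () -> true); // stand-in check
        System.out.println("Status: " + hc.trafficLight());
    }
}
```

Reporting an unambiguous GREEN/RED per named check is what makes the result useful right after a deploy: the first FAILED line points at the broken subsystem.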
Specifically you can check all the configuration files (*.xml / *.properties) in your application and ensure that you are not hard coding any paths/variables in your app.
You should maintain different config files for each environment and verify the installation guide with the environment admin (if one exists).
Other than that, check the versions of all software, the dependency list, etc., as described by others.
A production machine will likely be missing some of the libraries and tools you have on your development machine, or have older versions of them. Under some circumstances this may interfere with the software's normal function.
The database connection situation may also be different, meaning users, roles, and access levels.
One common (albeit easy to detect) problem is conflicting libraries, especially if you're using Maven or Ivy for dependency management and don't double check all the managed dependencies at least once before deploying.
We've had numerous incompatible versions of logging frameworks, and even Servlet/JSP API .jars, a few times too many in our test deployment environment. It's also always a good idea to check what the shared libraries folder of your Tomcat (or equivalent) contains; we've had some database datasource class conflicts because someone had put Postgres' JDBC jar in the shared folder while the project came with its own jar for JDBC connectivity.
I always try to get an exact copy of the server my product is running on. After some apps, and of course a lot of bugs, I created a list of common bugs/hints for myself. Another solution I tested on my last project was to take the software running on that server and try to configure it locally. Strange effects can happen with that^^
Last but not least, I always test my apps on different machines.
In my experience there is no definite answer to this question. Following are some of the issues I faced.
Automatic updates were not turned on on the dev server (Windows) but were turned on on the production server (which is wrong in the first place!). So one of my web applications crashed due to a patch being applied.
Some batch jobs were running on the production app server which changed some data that my application was using.
It is not me who does the deployment for my company, so most of the time the people who deploy miss some registry entries, or add wrong registry entries. Simple, but very hard to detect (maybe just for me ;-) ) - once it took me hours to identify a stray space in one of the registry values. Now we have a very long release document which has all the details about all the servers used by the application, and there is a checklist for the "current release" which the engineers who deploy the application fill in.
Will add more if I remember any.
Beyond just a staging server, another strategy for making sure the environments you deploy into are the same is to set them up automatically. That is, you use a tool like Puppet to install all the dependencies the server has, and you run your install process before every installation so that all the configuration is reset. That way you can ensure the configuration of the box is what you set it to during the development process, and you keep the configuration of the production environment in source control.