How do Apache httpd and Tomcat work together?


I am inheriting a project involving a Java web app whose backend is powered by an Apache httpd/Tomcat combo. The web server is being used to serve back JS, static content, and to perform general load balancing, and Tomcat is serving back JSPs via a single WAR file.
I will be receiving access to the code base later on today or tomorrow, but wanted to try and do some research ahead of time.
My question can be summed up as: how do these two work together?
Who first receives HTTP requests?
How does httpd know when to forward JSP requests on to Tomcat, or to just respond to a request itself?
How does httpd "pass" the request to, and "receive" the response from, Tomcat? Does it just "copy-n-paste" the request/response to a port Tomcat is listening on? Is there some sort of OS-level interprocess communication going on? Etc.
These are just general questions about how the technologies collaborate with each other. Thanks in advance!

Who first receives HTTP requests?
Apache, almost certainly. There could be admin processes that talk directly to Tomcat, though.
How does httpd know when to forward JSP requests on to Tomcat, or to just respond to a request itself?
From its configuration. The specifics will vary. It might, for instance, be using mod_jk or mod_jk2, in which case you'll find JkMount directives in the config files, e.g.:
JkMount /*.jsp ajp13_worker
...which tells it to pass on requests at the root of the site for files matching *.jsp to the ajp13_worker, which is defined in the workers.properties file.
Or it could be set up in a simple HTTP reverse-proxy arrangement. Or something else.
How does httpd "pass" the request to, and "receive" the response from, Tomcat?
It depends on the configuration; it could be HTTP, it could be AJP, or it could be using some other module.
Does it just "copy-n-paste" the request/response to a port Tomcat is listening on?
Sort of. :-) See the reverse-proxy link above.
Is there some sort of OS-level interprocess communication going on?
Yes. AFAIK, it's all socket-based (rather than, say, shared memory stuff), which means (amongst other things) that Tomcat and Apache need not be running on the same machine.
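To make the "it's just sockets" point concrete, here is a rough sketch of the idea, not what httpd or any of its modules actually do internally. The class SocketForwardSketch, the stub backend standing in for Tomcat, and the half-close used to mark the end of the request are all assumptions made for illustration; real HTTP or AJP forwarding uses the protocol's own framing instead.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: a front end hands a request to a backend over a TCP socket.
final class SocketForwardSketch {

    /** Starts a stub backend (standing in for Tomcat) on a free port and returns that port. */
    static int startStubBackend() {
        try {
            ServerSocket server = new ServerSocket(0); // port 0 = let the OS pick a free port
            Thread t = new Thread(() -> {
                while (true) {
                    try (Socket s = server.accept()) {
                        // Read everything the front end sent, until it half-closes its side.
                        byte[] request = s.getInputStream().readAllBytes();
                        String reply = "backend saw " + request.length + " bytes";
                        s.getOutputStream().write(reply.getBytes(StandardCharsets.UTF_8));
                    } catch (IOException e) {
                        return; // stop the stub on any I/O error
                    }
                }
            });
            t.setDaemon(true); // do not keep the JVM alive just for the stub
            t.start();
            return server.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The front end's side: connect to the backend, send the request, read the response. */
    static byte[] forward(String host, int port, byte[] request) {
        try (Socket backend = new Socket(host, port)) {
            OutputStream out = backend.getOutputStream();
            out.write(request);
            out.flush();
            backend.shutdownOutput(); // simplification: signals "end of request"
            return backend.getInputStream().readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = startStubBackend();
        byte[] request = "GET /index.jsp HTTP/1.1\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(forward("localhost", port, request), StandardCharsets.UTF_8));
    }
}
```

Because forward only needs a host name and a port, the backend could just as well live on a different machine.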

Related

How can I see what Tomcat is doing with respect to Servlets and the network side

Currently, I am learning about Java web development.
A lot of it seems to simply be configuration to me, and I feel that my understanding is superficial because I only see the configuration (i.e. define your servlets and their mappings in the web.xml file, make custom Servlets by extending the HttpServlet class, instantiate Tomcat in the main method, etc.)
I want to know a bit more about what is actually going on under the hood, so I am in need of some guidance.
To this end, I have done some cursory reading on Tomcat and servlets from the following links:
What is a servlet
Difference between embedded and not embedded
Tomcat docs
So what I think I understand from this is that the servlets sit inside the Tomcat instance (a servlet container), and Tomcat handles receiving all of the client's requests and relaying them to the servlets. The servlets process the requests and send back a response, which Tomcat then sends back to the client. I suppose in the local setup I have, my machine would be acting as both the client and the server.
Given the above I want to know:
How I can directly see and monitor the client's sending the request to Tomcat and verify that Tomcat has received the request? Essentially, how can I verify that this networking side of things is happening due to some implementation by Tomcat?
How does Tomcat parse the request information and send it to the servlets?
Is Tomcat a servlet container or a web server? Are these the same thing?
In the answer given in the second link regarding embedded vs. non-embedded, the answer states that an embedded server looks like a regular java program. Does this mean that for an embedded server, the server is in the java application while the web application is inside the server in the non-embedded case? Like the containment relationship is reversed? What does containment mean in the first place here?
Apologies for the numerous questions and thank you for helping to clarify.

2. How does Tomcat parse the request information and send it to the servlets?
The Servlet specification explains that in detail. The spec is surprisingly easy to read; I suggest giving it a go.
As a simplified overview…
The job of a Servlet container is to process the incoming request, which is just a bunch of text. The Servlet container pulls out the various pieces and assembles them into a request object.
Likewise, the response produced by your servlet is packaged up as a response object. The Servlet container's job is to use all the info contained in that object to create a stream of text to be sent back to the client web browser.
The whole point of Servlet containers is to relieve the servlet-writing programmers of the need to know much of the details of HTTP and how to make a server. The Servlet container does all that work. In other words, the great thing about Servlet technology is that you the programmer need not ask this # 2 question of yours!
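If you are still curious what "pulling out the various pieces" looks like, here is a heavily simplified sketch. It is not Tomcat's parser: RawRequestParser and ParsedRequest are made-up names, and a real container also deals with bodies, encodings, chunking, limits, and malformed input.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the "text in, request object out" step of a container.
final class RawRequestParser {

    /** Minimal request object holding the pieces pulled out of the text. */
    record ParsedRequest(String method, String path, String version, Map<String, String> headers) {}

    static ParsedRequest parse(String raw) {
        // The header section ends at the first blank line; a body (if any) follows it.
        String head = raw.split("\r\n\r\n", 2)[0];
        String[] lines = head.split("\r\n");

        // Request line, e.g. "GET /hello.jsp HTTP/1.1"
        String[] requestLine = lines[0].split(" ", 3);
        if (requestLine.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }

        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon <= 0) {
                continue; // skip anything that is not "Name: value"
            }
            // Header names are case-insensitive, so store them in lower case.
            String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
            headers.put(name, lines[i].substring(colon + 1).trim());
        }
        return new ParsedRequest(requestLine[0], requestLine[1], requestLine[2], headers);
    }

    public static void main(String[] args) {
        ParsedRequest r = parse("GET /hello.jsp HTTP/1.1\r\nHost: localhost:8080\r\nAccept: text/html\r\n\r\n");
        System.out.println(r.method() + " " + r.path() + " host=" + r.headers().get("host"));
    }
}
```

In a real container the result is an HttpServletRequest rather than a record like this, but the division of labor is the same: the container does the text handling, and your servlet only sees the object.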
3. Is Tomcat a servlet container or a web server? Are these the same thing?
(a) both, (b) no.
No, servlet containers and web servers are two different kinds of software.
A web server handles:
listening for incoming connections from clients (web browsers, etc.)
sending the response back to the client
A web server handles all the network traffic.
A Servlet container provides an environment in which relatively small chunks of code (servlets) can process a request and formulate a response. The small servlet does not have to handle network traffic, launching & shutting down, security, and all the other responsibilities of a full server. That explains the "-let" in "Servlet".
The servlet you write plugs into a Servlet container. The container communicates with the web server, receiving each request passed by the web server, and passing to the web server the response produced by your servlet. When a request arrives, the container invokes your servlet.
Your servlet remains blissfully ignorant as to what particular Servlet container implementation is running, as long as it complies with the Jakarta Servlet specification. And your servlet remains blissfully ignorant as to the existence of web servers.
Some products, such as Tomcat & Jetty, can be composed of both a web server and a Servlet container.
Tomcat is composed mainly of three components: (1) Catalina, a servlet container, (2) Coyote, a web server, and (3) Jasper, a Jakarta Server Pages processor. See Wikipedia.
For most people's needs, the Coyote web server in Tomcat is a suitable web server. So you can use Tomcat as an all-in-one application server, handling both web traffic and servlets.
[web request] ➜ [Tomcat Coyote] ➜ [Tomcat Catalina] ➜ [your servlet]
Alternatively, some folks choose to use Tomcat only as a Servlet container, sitting behind a separate web server such as Apache HTTP Server. In such a case, Tomcat’s Coyote component goes unused. Instead, the separate web server handles client browser connections and processes incoming requests. If a request is asking for a static resource, the web server serves it out, without any involvement from Tomcat. If the request is asking for work that has been assigned to a servlet, then the separate web server passes the request on to Tomcat and its Catalina component. After your servlet produces a response, the response moves from Tomcat back to the external web server, which passes the response onwards to the client web browser.
[web request] ➜ [Apache HTTP Server] ➜ [Tomcat Catalina] ➜ [your servlet]
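The decision the front-end web server makes for each request can be sketched roughly like this. In practice it is driven by the web server's configuration rather than Java code; the class FrontEndRouter and the two rules (anything under /app/ or ending in .jsp goes to Tomcat) are purely hypothetical.

```java
import java.util.List;

// Hypothetical sketch of the "serve it myself or pass it on?" decision.
final class FrontEndRouter {

    enum Target { SERVE_STATIC, FORWARD_TO_TOMCAT }

    // Assumed rules for illustration only; a real setup defines these in server config.
    private static final List<String> FORWARD_PREFIXES = List.of("/app/");
    private static final List<String> FORWARD_SUFFIXES = List.of(".jsp");

    static Target route(String requestTarget) {
        // Match on the path only, ignoring any query string.
        int q = requestTarget.indexOf('?');
        String path = q >= 0 ? requestTarget.substring(0, q) : requestTarget;

        for (String prefix : FORWARD_PREFIXES) {
            if (path.startsWith(prefix)) {
                return Target.FORWARD_TO_TOMCAT;
            }
        }
        for (String suffix : FORWARD_SUFFIXES) {
            if (path.endsWith(suffix)) {
                return Target.FORWARD_TO_TOMCAT;
            }
        }
        // Everything else is treated as a static resource served by the front end.
        return Target.SERVE_STATIC;
    }

    public static void main(String[] args) {
        for (String p : List.of("/index.jsp", "/static/site.css", "/app/orders?id=7")) {
            System.out.println(p + " -> " + route(p));
        }
    }
}
```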
4 … embedded vs. non-embedded …
Non-embedded is the classic situation, as originally envisioned when Servlet technology was first invented.
Back then, servers were few, expensive, and already in place permanently. The goal of Servlet technology was to make it easy for companies to keep those expensive servers busy by having many web applications running alongside each other.
Servlet technology allowed many different servlets to be running on one machine without stepping on each other, and without the programmers of each servlet having to know anything about the other servlets being written. The Servlet container can stay up and running as servlets are deployed and un-deployed.
Fast forward, and we have cloud technology where servers are many, cheap, and convenient to create and destroy on-the-fly. So nowadays many people want to run their web applications separately, one web app per virtual machine or virtual service. Thus the need for embedded mode. We need an application that can be launched and shut down on its own, to run one specific servlet (or multiple servlets meant to work together) without any other unrelated web apps.
One way to achieve this new goal is to package a web server and servlet container into a standalone Java app. A system administrator can launch and quit this standalone app like any other Java app, without knowing anything about how to configure an on-going web server and Servlet container.
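To show the shape of "the server lives inside an ordinary Java program", here is a minimal sketch. Note the swap: it uses the JDK's built-in com.sun.net.httpserver.HttpServer, which is not a Servlet container, only because it needs no extra dependencies. An embedded Tomcat or Jetty follows the same pattern, with main creating and starting the server. The class name EmbeddedServerSketch, the /hello path, and the fetch helper are assumptions for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the application's main method owns the HTTP server.
final class EmbeddedServerSketch {

    /** Creates and starts a server on the given port (0 = any free port). */
    static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/hello", exchange -> {
                byte[] body = "hello from an embedded server".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Small client helper, only here to try the server out. */
    static String fetch(int port, String path) {
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:" + port + path)).build();
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = start(8080);
        System.out.println("Listening on " + server.getAddress());
        // Launch and quit this like any other Java app, e.g. stop it with Ctrl+C.
    }
}
```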

How can I make GRPC server run on another web server (not netty)

The gRPC server seems to be implemented using Netty. Is there a way to use other implementations?

Netty is the only supported server. You can either have two separate ports (one for your other server, one for gRPC) or could reverse proxy from your other server to the Netty server.
There is work underway (tracking issue) to allow serving using the Servlet API, so that any Servlet container could be used. But there are restrictions, such as needing to be the root ('/') webapp. It is far enough along to test it and provide feedback, but there may also be some gaps in the implementation.

Java application server without HTTP

I have a client software that is written in C++/C# and a database. Now I don't want the client to access the database directly, so I thought about placing an application server in the middle. This one should get a short request from the client, ask the database for new data, do some filtering (that can't be done in sql) and then return the data to the client.
My search for this kind of software brought me to Glassfish or Tomcat but my problem in understanding is, that these always want to talk http with html/jsp. Because most of my data is encrypted anyways, I don't need such plain text protocols and would be totally happy with something that just takes a byte stream.
On the other hand, it would be nice to have a server handle the thread pool for me (I don't want to implement all that from scratch).
After more than a day of searching / testing I'm even more confused than at the beginning (ejb, beans, servlet, websocket, ... so many things to google before understanding just the simplest tutorials).
TL;DR: how do I get Tomcat/Glassfish to just open a socket and create a new thread for every request, without any HTML/CSS/JSP involved?

Jetty and Tomcat are so-called servlet containers and thus primarily targeted at HTTP exchanges. Glassfish is an application server that uses a servlet container as one of its modules. I would stop thinking in that direction: that is all geared toward web applications and web services, several levels higher than what you are asking for.
I think you should look into something like Netty instead, which is merely a "high performance protocol" server. Take a look at the documentation here (there is even a tutorial of sorts that might fit your use case).

GlassFish is an "enterprise application server", targeting the Java EJB specification. That's surely overdone for your purpose. You can give Tomcat a try. It is a "servlet container", targeting Java Servlet specification. Servlets have one purpose: listening to an incoming URL (request), executing Java code and returning a response, usually over HTTP.
Of course, you can start your own (plain) ServerSocket, for example from a ServletContextListener (which runs once when your application starts). But you should use a higher-level protocol to send the data, such as Hessian or Burlap, which are implemented in both Java and C++ and are easy to set up.
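For the plain ServerSocket route, a minimal sketch with a thread pool might look like the following. Everything specific here is an assumption for illustration: the class name ByteStreamServerSketch, the pool size of 8, the framing (a 4-byte big-endian length followed by the payload), and the process method, which just reverses the bytes as a placeholder for the real database lookup and filtering.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: raw byte-stream server, one pooled worker per connection.
final class ByteStreamServerSketch implements AutoCloseable {
    private final ServerSocket serverSocket;
    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    private ByteStreamServerSketch(ServerSocket serverSocket) {
        this.serverSocket = serverSocket;
    }

    static ByteStreamServerSketch start(int port) {
        try {
            ByteStreamServerSketch server = new ByteStreamServerSketch(new ServerSocket(port));
            Thread acceptor = new Thread(server::acceptLoop, "acceptor");
            acceptor.setDaemon(true);
            acceptor.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try {
                Socket client = serverSocket.accept();
                pool.execute(() -> handle(client)); // hand the connection to a pooled thread
            } catch (IOException | RuntimeException e) {
                return; // accept() fails once the server socket is closed
            }
        }
    }

    private void handle(Socket client) {
        try (client;
             DataInputStream in = new DataInputStream(client.getInputStream());
             DataOutputStream out = new DataOutputStream(client.getOutputStream())) {
            byte[] payload = new byte[in.readInt()]; // a real server should cap this length
            in.readFully(payload);
            byte[] reply = process(payload);
            out.writeInt(reply.length);
            out.write(reply);
            out.flush();
        } catch (IOException e) {
            // Connection dropped; nothing more to do in this sketch.
        }
    }

    /** Placeholder for the real work (query the database, filter, and so on). */
    static byte[] process(byte[] payload) {
        byte[] reversed = new byte[payload.length];
        for (int i = 0; i < payload.length; i++) {
            reversed[i] = payload[payload.length - 1 - i];
        }
        return reversed;
    }

    /** Java client helper for trying it out; a C++ or C# client would use the same framing. */
    static byte[] call(int port, byte[] payload) {
        try (Socket s = new Socket("localhost", port);
             DataOutputStream out = new DataOutputStream(s.getOutputStream());
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            byte[] reply = new byte[in.readInt()];
            in.readFully(reply);
            return reply;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            serverSocket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

This runs fine without any servlet container. Starting it from a ServletContextListener would only be a way to let Tomcat manage its lifecycle.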

Test large number of webservices on a single computer

We have a large system of physical devices which all run a web service for control, and a central control system for controlling these devices. I need to make a substitute for such a physical device in order to test the controlling unit. How would I go about running more than one instance of a test device on a single computer? The protocol used is SOAP, with a WSDL written in stone. In addition to the web service, each test device needs a web server to monitor state and generate events.
My first approach is to embed Jetty and use Axis2 for web services, but I am having some trouble making that fly. I managed to get the Axis2 SimpleHttpServer working with a web service, but as far as I can tell SimpleHttpServer will not let me run servlets, let alone WARs. Is there a better approach I am missing?
I considered making a proxy server that listens on any number of ports and forwards each request to a central web service, with an additional parameter saying which port the request originated from, but since the WSDL is written in stone I cannot pass this parameter along.
EDIT: I am using NetBeans to generate a web service for me. It works like a charm, but it is not enough for my project. For some reason wsimport chokes on the WSDL. I don't understand how NetBeans can deploy to the bundled Glassfish server, yet if I drop the generated dist/my-project.war into Tomcat the web service doesn't work. It doesn't even show up in web.xml. What am I missing?

Be aware that if you route your network requests through a SOCKS proxy, you can essentially redirect even hardcoded names and ports in the SOCKS proxy to whatever you need.

Axis2 is not meant to be used as a servlet container, so using SimpleHttpServer doesn't help you there.
But Jetty is a full featured Servlet container. If you want to make it work, you have to run your Wars with Jetty. (Or any other servlet container, but Jetty is perfectly fine)
I'm not a Jetty expert, but this should work:
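// Jetty 6 API (org.mortbay.jetty packages)
import org.mortbay.jetty.Server;
import org.mortbay.jetty.servlet.Context;
import org.mortbay.jetty.servlet.ServletHolder;
Server server = new Server(8080);
Context root = new Context(server, "/", Context.SESSIONS);
root.addServlet(new ServletHolder( yourServletInstance ), "/*");
server.start();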
(Taken from Jetty Wiki)

OK, I've figured out a solution. I can use Glassfish and deploy the same webapp multiple times under different names. Then I have a small proxy, made in Glassfish, which listens on a number of ports and routes each request to one of the instances running in Glassfish.

Separating web server and app server, do both need java?

If we are to separate our web server and app server, would we need java on both machines? I've had one coworker say to install jboss on both machines (seems to defeat the purpose if both machines have app server installed) and another says just install jboss on one and apache on the other (app server/web server).
I have the web project setup already and it uses servlets and JSPs. The JSPs display the content while the servlets do the action. The servlets receive requests and forward responses to the JSP. My question is how do I do this if the web server only has apache and therefore displays static content? I understand how to forward the requests from the web server to the app server but what about maintaining session state, is that done on the web server and if so how would it be done?
If the login page is html and the content after the login is html then how could I stop people from accessing the content if they haven't logged in?

The latter setup you describe, with Apache serving static content and forwarding requests for JSPs/servlets on to the app server, is the standard setup.
Session state is maintained as normal: your Java webapp on the app server sends the user back a cookie containing a JSESSIONID, and when the user makes subsequent requests, Apache includes all request info (including cookies) in what it forwards to the app server.
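On the app server side, the servlet container reads that cookie for you, so you never write this yourself. Purely to illustrate what arrives in the forwarded request, here is a sketch that pulls the session ID out of a Cookie header. The class name SessionCookieSketch is made up, and real cookie parsing handles more edge cases than this.

```java
import java.util.Optional;

// Hypothetical illustration of finding JSESSIONID in a forwarded Cookie header.
final class SessionCookieSketch {

    static Optional<String> sessionId(String cookieHeader) {
        if (cookieHeader == null) {
            return Optional.empty(); // first visit: the browser has no cookie yet
        }
        // A Cookie header looks like "name1=value1; name2=value2".
        for (String pair : cookieHeader.split(";")) {
            String[] nameValue = pair.trim().split("=", 2);
            if (nameValue.length == 2 && nameValue[0].equals("JSESSIONID")) {
                return Optional.of(nameValue[1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(sessionId("theme=dark; JSESSIONID=ABC123; lang=en"));
    }
}
```

Apache itself needs no session logic for this to work, as long as it forwards the header unchanged.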
The setup becomes a bit more complicated if you want to have Apache sit in front of and load balance requests to multiple JBoss instances, but it's still pretty easy to set up with mod_proxy_balancer.
Some links that might help you:
http://help.shadocms.com/blog/2009/how-to-setup-apache-with-jboss-on-a-shado-site.cfm
http://redlumxn.blogspot.com/2008/01/configure-apache2-and-jboss-422ga.html

There are many possibilities.
On the web machine, install just Apache with mod_jk to forward the requests to Tomcat/JBoss.
In this case you don't need Java on this machine.
You can also separate your JSP container (e.g. Tomcat/JBoss) from your app server; in this case you will need to install Java where you have your web container.
Generally, where there is a need for higher security, people combine the above-mentioned possibilities: thin web layer (Apache, no Java) + web container (e.g. Tomcat) + app layer (JBoss/Glassfish).
The first solution is normally the standard one.

Your scenario reminds me of SiteMinder. It was used to control access to our application. It has built-in HTTP forwarding, so from the user's perspective the browser talks to SiteMinder and SiteMinder talks to the real application. They both use session cookies; SiteMinder's is called SMSESSION while the app's is called JSESSIONID, so there is no conflict.

A common deployment is to use Apache as a fronting server that serves static content and forwards requests for dynamic content to the JSP server. This is mainly for performance reasons: Apache is faster at serving static content, and offloading it reduces the load on the JSP server.
I don't see any reason why you couldn't, for example, use IIS as the fronting server (removing Java from the equation), although with the wealth of Apache modules and accompanying information about the configuration, I think you might be making life difficult for yourself if you did.

Short answer - No.
Long answer -
It depends on the needs of your application. There are a few reasons why you would want to have the web server on a different physical machine:
You want to have the web server serve the static content, and leave the app server free to only process servlet/jsp content
You wish to implement software based load balancing. You would have the apache server proxy requests to multiple backing app servers
In your login example, the HTML page is served by Apache, and the action of the HTML form points to your servlet for processing, so JBoss/Java will still manage the session. Keep in mind that any static content you want Apache to serve will need to be present on the web server.
