I want to open a website in a web browser. I know it is easy, but I want to do it in a different way...
It works like a proxy server. I have written Java code that fetches the content (source code) of a web page, and when the browser requests localhost on a particular port, the code writes that source to the browser. But instead of the rendered page, the browser shows the page's source code. I also want the request made from my Java code to look as if it came from a browser, so that the server believes the request was made by a browser and not by a Java console program.
import java.net.*;
import java.io.*;
public class URLConnectionReader {
public static void main(String args[]) throws Exception{
URL ul = null;
HttpURLConnection ulc = null;
ServerSocket server = null;
Socket client = null;
DataInputStream in = null;
DataOutputStream out = null;
String c = null;
server = new ServerSocket(9898);
System.out.println("Server is waiting for clients on port no 9898....");
while(client == null){
client = server.accept();
}
System.out.println("Connected.....");
out = new DataOutputStream(client.getOutputStream());
ul = new URL("http://www.google.com");
ulc = (HttpURLConnection)ul.openConnection();
in = new DataInputStream(ulc.getInputStream());
while((c = in.readLine())!=null){
out.writeBytes(c);
}
in.close();
out.close();
client.close();
}
}
Loading web pages is not quite as simple as you probably think. Both the browser and the server use a protocol called HTTP. In simple terms, the browser sends a request consisting of a request line, headers and sometimes data, and the server responds with a response line, headers and data. Most web pages also have related resources that need to be loaded for displaying the page (such as images, stylesheets and scripts), and each resource is loaded through a separate request.
Your program only accepts one request, completely ignores the details of the request, and then loads a fixed web page and sends it as the response. The way you are loading the web page (with a URL), you are only getting the data part of the response (the page source); the response line and the headers are missing. The headers are very important as one of them (named "Content-Type") specifies what kind of resource it is - web page, image or something else. Without it, browsers usually assume the data is plain text and display it accordingly.
So if you want your experiment to work better, you need to make sure you send a complete and valid HTTP response to the browser. You can probably reconstruct the response line and headers from the HttpURLConnection object. Or you can use sockets directly to load the web page.
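As a rough illustration of what "a complete and valid HTTP response" means, here is a minimal sketch that writes a response line and a few headers before the body. The class name HttpResponseWriter and the method writeResponse are made up for this example, and it always answers 200 OK; a real proxy would copy the status and headers from the upstream response instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: writes a minimal but complete HTTP response.
class HttpResponseWriter {

    // Sends the response line, three headers, a blank line, then the body.
    static void writeResponse(OutputStream out, String contentType, byte[] body)
            throws IOException {
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n"; // the empty line separates headers from data
        out.write(head.getBytes(StandardCharsets.US_ASCII));
        out.write(body);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "<html><body><h1>Hello</h1></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        // A ByteArrayOutputStream stands in for client.getOutputStream() here.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeResponse(sink, "text/html; charset=UTF-8", page);
        System.out.print(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In the program from the question, the stream passed in would be the one obtained from client.getOutputStream(), and the body would be the raw bytes read from the remote page (not lines from readLine(), which drops the line breaks).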
A better solution would be to use a Java web server (such as Jetty) in which you'd run a servlet that loads the remote page using an HTTP client library (such as Apache HttpComponents) and does the necessary processing of addresses and headers. But... small steps :)
I have a Java application which opens an existing company's website using the Socket class:
Socket sockSite;
InputStream inFile = null;
BufferedWriter out = null;
try
{
sockSite = new Socket( presetSite, 80 );
inFile = sockSite.getInputStream();
out = new BufferedWriter( new OutputStreamWriter(sockSite.getOutputStream()) );
}
catch ( IOException e )
{
...
}
out.write( "GET " + presetPath + " HTTP/1.1\r\n\r\n" );
out.flush();
I would read the website with the stream inFile and life is good.
Recently this started to fail. I was getting an HTTP 301 "site has moved" error but no moved-to link. The site still exists and responds using the same original HTTP reference and any web browser. But the above code comes back with the HTTP 301.
I changed the code to this:
URL url;
InputStream inFile = null;
try
{
url = new URL( presetSite + presetPath );
inFile = url.openStream();
}
catch ( IOException e )
{
...
}
And read the site with the original code from inFile stream and it now works again.
This difference doesn't just occur in Java but it also occurs if I use Perl (using IO::Socket::INET approach opening the website port 80, then issuing a GET fails, but using LWP::Simple method get just works). In other words, I get a failure if I open the web page first with port 80, then do a GET, but it works fine if I use a class which does it "all at once" (that just says, "get me web page with such-and-such an HTTP address").
I thought I'd try the different approaches on http://www.microsoft.com and got an interesting result. When I opened port 80 and then issued GET /, I received an HTTP 200 response with a page that said, "Your current user agent appears to be from an automated process...". But if I use the second method (URL class in Java, or LWP in Perl) I simply get their web page.
So my question is: how does the URL class (in Java) or the LWP module (in Perl) do its thing under the hood that makes it different from opening the website on port 80 and issuing a GET?
Most servers require the Host: header, to allow virtual hosting (multiple domains on one IP).
If you use packet-capturing software to see what is being sent when URL is used, you'll see that a lot more than just "GET /" goes over the wire. All sorts of additional header information is included. If a server gets just a simple "GET /", it's easy to deduce that there can't be a very sophisticated client on the other end.
Also, HTTP 1.0 is "outdated"; the current version is 1.1.
Java's URL implementation delegates to HttpURLConnection if the URL starts with "http:".
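To make the difference concrete, here is a minimal sketch of a raw-socket request that includes the headers a library would normally add for you. The class name RawHttpGet, the User-Agent value ExampleClient/1.0, and the exact set of headers are assumptions chosen for illustration; they are not what URL or LWP actually send.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: a hand-written HTTP/1.1 GET over a plain socket.
class RawHttpGet {

    // Builds the request text. HTTP/1.1 requires the Host header.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "User-Agent: ExampleClient/1.0\r\n" // illustrative value
                + "Accept: */*\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    // Sends the request on port 80 and returns the first line of the reply,
    // e.g. the status line.
    static String fetchStatusLine(String host, String path) throws IOException {
        try (Socket socket = new Socket(host, 80)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(),
                            StandardCharsets.ISO_8859_1));
            return in.readLine();
        }
    }
}
```

Even with these headers, a raw socket still will not follow redirects or speak HTTPS, which is part of what the higher-level classes do on your behalf.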
I'm trying to download a file from a site that runs on a Liferay server.
I have been reading a lot about it, but everything describes how to configure a server, not how to read from one. All the examples I saw use HttpServletRequest, which needs a request as input. How can I turn a URL into a request? Where do I start, at least?
In other words: I have the URL; on the web page I select a date and a download link is generated. How can I download it in Java?
I tried this:
HttpServletRequest request = PortalUtil.getHttpServletRequest(PortletRequest);
So how do I link my URL to the PortletRequest?
If you have the URL of the download, the only thing you need to do is perform a client request against that URL.
First, to be sure that the URL you have is the one that gives the expected result, paste it into a new browser window and verify that the download starts.
Then, if you want to perform that download from Java, you can do it very easily using the URL and URLConnection (HttpURLConnection in this case) classes:
String urlString = "..."; // Your URL
URL url = new URL(urlString);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
if (conn.getResponseCode() == 200) {
InputStream stream = conn.getInputStream();
// Read the data from the stream
}
You could also do the same using Apache HTTP Client.
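For completeness, here is one way the "read the data from the stream" step might look when the goal is to save the download to disk. This is only a sketch: the class name FileDownloader and its method names are invented for the example, and it assumes the download URL needs no login or session cookie.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: downloads a URL to a local file.
class FileDownloader {

    // Copies everything from the stream into the target file and
    // returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Requests the URL and saves the body if the server answers 200 OK.
    static long download(String urlString, Path target) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(urlString).toURL().openConnection();
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return save(in, target);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

If the site only hands out the file to logged-in users, the request would additionally need whatever cookies or credentials the browser sends, which this sketch does not cover.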
Note: PortalUtil.getHttpServletRequest(...) is used internally by Liferay and you won't have any access to that API if you are doing a client request.
If you're writing a portlet, by design you don't get access to the HttpServletRequest.
What you can do is to utilize the "resource-serving" lifecycle phase of a portlet. There you get access to a ResourceRequest and ResourceResponse object. Those objects behave almost like a HttpServletRequest/-Response object
As you don't name the framework that you're using: javax.portlet.GenericPortlet.serveResource() is the method that you want to override in the pure JSR-286 API.
On the UI side, <portlet:resourceURL/> will provide the URL to your portlet's resource handling method.
This should provide you with enough google-food to find tutorials on how to implement different lifecycle phases - I can't judge the required level of detail you need. Note that Liferay has quite a few sample portlets that you can utilize as a source for sample code.
Edit: Following your comment below, let me give you some pseudo code (just typed here, never compiled/run):
on a jsp frontend, e.g. view.jsp:
<a href="<portlet:resourceURL/>">Download File</a>
Then, in your portlet, assuming you're implementing javax.portlet.GenericPortlet in one way or another (e.g. indirectly through Liferay's MVCPortlet or any other superclass):
public class MyPortlet extends GenericPortlet {
    ....
    @Override
    public void serveResource(ResourceRequest request, ResourceResponse response)
            throws PortletException, IOException {
        // implement the file streaming here,
        // use ResourceResponse the way you find illustrated
        // in samples for HttpServletResponse
    }
}
I am trying to read a website using the java.net package classes. The site has content, and I can see it manually with the browser's HTML source tools. When I get its response code and try to view the site using Java, it connects successfully but treats the site as one without content (a 204 code). What is going on, and is it possible to get around this so I can view the HTML automatically?
Thanks for your responses.
Do you need the URL?
Here is the code:
URL hef=new URL(the website);
BufferedReader kj=null;
int kjkj=((HttpURLConnection)hef.openConnection()).getResponseCode();
System.out.println(kjkj);
String j=((HttpURLConnection)hef.openConnection()).getResponseMessage();
System.out.println(j);
URLConnection g=hef.openConnection();
g.connect();
try{
kj=new BufferedReader(new InputStreamReader(g.getInputStream()));
String y;
// read each line exactly once; calling readLine() twice per pass skips every other line
while((y=kj.readLine())!=null)
{
System.out.println(y);
}
}
finally
{
if(kj!=null)
{
kj.close();
}
}
}
Suggestions:
Verify that when you access the site manually (with a web browser client) you actually get a 200 return code.
Make sure that the HTTP request issued by the automated (Java-based) logic is similar or identical to the one sent by an interactive web browser client. In particular, make sure the User-Agent is identical (some sites purposely alter their responses depending on the agent).
You can use a packet sniffer, maybe something like Fiddler2, to see exactly what is being sent to and received from the server.
I'm not sure that the java.net package is robot-aware, but that could be a factor as well (can you check whether the underlying site has a robots.txt file?).
Edit:
Assuming you are using the java.net package's HttpURLConnection class, the "robot" hypothesis doesn't apply.
On the other hand, you'll probably want to use the connection's setRequestProperty() method to prepare the desired HTTP headers for the request (so that they match those sent by the web browser client).
Maybe you can post the relevant portions of your code.
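A minimal sketch of the setRequestProperty() suggestion follows. The class name BrowserLikeRequest is made up, and the header values are only illustrative placeholders; ideally you would copy the exact values your own browser sends, as seen in the packet sniffer.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: prepares a request with browser-like headers.
class BrowserLikeRequest {

    // Placeholder value; replace it with the string your browser really sends.
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0";

    // Creates the connection object and sets the headers.
    // Nothing is sent until the response is actually requested.
    static HttpURLConnection open(String urlString) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(urlString).toURL().openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Accept", "text/html,application/xhtml+xml");
        conn.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // http://www.example.com/ is a stand-in for the real site.
        HttpURLConnection conn = open("http://www.example.com/");
        System.out.println(conn.getResponseCode() + " " + conn.getResponseMessage());
        conn.disconnect();
    }
}
```

Note that this uses a single connection object for the status code, the message and (if needed) the body, rather than calling openConnection() several times.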
I have abandoned my earlier quest to make the applet communicate directly with the database, even though users and webpages have said that it's possible. I am now trying to get my applet to pass information (String and boolean format) entered in textfields or indicated by checkboxes, and give this to the servlet, which then stores it appropriately in the database. I've got the applet front end - the GUI - built, and the servlet - database connection also built. The only problem is the link between the two, applet and servlet. How would one pass String data from an applet to a servlet?
Thanks,
Joseph G.
First up, you have to accept that you can only communicate with the server your applet was downloaded from (and that includes the port number), unless you want to mess around with permissions, applet signing and all that malarkey. This isn't just an Applet restriction either; the same applies to Flash and JavaScript (though in the case of JavaScript there are tricks to get around it).
Using either the "getCodeBase()" or "getDocumentBase()" method on your Applet will get you a URL from which you can take the component parts required to build a new URL that will let you call a servlet.
Thus, your Applet must be served from the same server that hosts your servlet.
e.g. if your Applet is in the following page:
http://www.example.com/myapplet.html
...it means you can make calls to any URL that starts with
http://www.example.com/
...relatively easily.
The following is a crude, untested, example showing how to call a Servlet. This assumes that this snippet of code is being called from within an instance of Applet.
URL codeBase = getCodeBase();
URL servletURL = new URL(codeBase.getProtocol(), codeBase.getHost(), codeBase.getPort(), "/myServlet");
// assumes protocol is http, could be https
HttpURLConnection conn = (HttpURLConnection)servletURL.openConnection();
conn.setDoOutput(true);
conn.setRequestMethod("POST");
PrintWriter out = new PrintWriter(conn.getOutputStream());
out.println("hello world");
out.close();
System.out.println(conn.getResponseCode());
Then in your servlet, you can get the text sent by overriding doPost() and reading the input stream from the request (no exception handling shown and only reads first line of input):
public void doPost(HttpServletRequest req, HttpServletResponse res) throws IOException {
BufferedReader reader = req.getReader();
String line = reader.readLine();
System.out.println("servlet received text: " + line);
}
Of course, that's just one approach. You could also take your inputs and build up a query string like this (URLEncoding not shown):
String queryString = "inputa=" + view.getInputA() + "&inputb=" + view.getInputB();
and append that to your URL:
URL servletURL = new URL(codeBase.getProtocol(), codeBase.getHost(), codeBase.getPort(), "/myServlet?" + queryString);
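Since the URL encoding is not shown above, here is a small sketch of how the query string could be built safely. The class name QueryStringBuilder is invented for the example, the parameter names are taken from the snippet above, and the two-argument URLEncoder.encode(String, Charset) overload assumes Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a URL-encoded query string.
class QueryStringBuilder {

    // Encodes every key and value, then joins the pairs with '&'.
    static String build(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("inputa", "hello world");
        params.put("inputb", "a&b=c");
        System.out.println(build(params)); // prints inputa=hello+world&inputb=a%26b%3Dc
    }
}
```

Without the encoding step, a value containing & or = would be split into extra parameters on the servlet side.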
However, it seems fairly common to build up some kind of string and stream it to the servlet instead these days.
A recommended format would be JSON as it's semi-structured, while being easy to read and there are plenty of (de)serializers around that should work in your Applet and in your servlet. This means you can have a nice object model for your data which you could share between your Applet and Servlet. Building up a query string of complex inputs can be a mind bender.
Likewise, you could actually use Java serialisation and stream binary to your Servlet which then uses Java serialisation to create the appropriate Java objects. However, if you stick to something like JSON, it'll mean your servlet is more open to re-use since Java serialisation has never been implemented outside of Java (that I am aware of).
Hm, I guess the applet and the servlet run in two separate Java processes. In that case you'll have to use some remoting technology, e.g. an http call to localhost. In fact, that is what servlets are mainly used and implemented for: Accept and process http requests.
I have a simple web page with an embedded Java applet. The applet makes HTTP calls to different Axis cameras, which all share the same authentication (e.g. username, password).
I am passing the user name and password to the Java code upon launch of the applet - no problem.
When I run from within NetBeans with the applet viewer, I get full access to the cameras and see streaming video - exactly as advertised.
The problem begins when I open the HTML page in a web browser (Firefox).
Even though my code handles authentication:
URL u = new URL(useMJPGStream ? mjpgURL : jpgURL);
huc = (HttpURLConnection) u.openConnection();
String base64authorization =
securityMan.getAlias(this.securityAlias).getBase64authorization();
// if authorization is required set up the connection with the encoded
// authorization-information
if(base64authorization != null)
{
huc.setDoInput(true);
huc.setRequestProperty("Authorization",base64authorization);
huc.connect();
}
InputStream is = huc.getInputStream();
connected = true;
BufferedInputStream bis = new BufferedInputStream(is);
dis= new DataInputStream(bis);
The browser still brings up an authentication pop-up and requests the username and password for each camera separately!
To make things worse, the images displayed from the camera are frozen and old (from last night).
How can I bypass the browser's authentication?
Fixed
I added the following lines:
huc.setDoOutput(true);
huc.setUseCaches(false);
after the
huc.setDoInput(true);
line.
When running in the browser, base64authorization is not null, correct?
I'm not really sure what getBase64authorization is supposed to return, but I'm fairly certain that when you call huc.setRequestProperty("Authorization", <authorization value>) it expects an HTTP Basic authentication value. That means <authorization value> needs to be in the format Basic <base64 encoding of username:password>, as described here.
Maybe you just need to prepend the string "Basic " (note the trailing space) to your property value.
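As a sketch of what a correctly formatted value looks like, the following builds the header from a username and password using java.util.Base64 (available since Java 8). The class name BasicAuth and the credentials "user" and "pass" are placeholders for this example, not anything from the original applet.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: builds an HTTP Basic "Authorization" header value.
class BasicAuth {

    // Returns "Basic " followed by base64("username:password").
    static String headerValue(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded; // note the space after "Basic"
    }

    public static void main(String[] args) {
        System.out.println(headerValue("user", "pass")); // prints Basic dXNlcjpwYXNz
    }
}
```

The result would then be passed along as huc.setRequestProperty("Authorization", BasicAuth.headerValue(username, password)).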