How to use HTTP, SOCKS4 and SOCKS5 proxies in Java? - java

I want to screen-scrape a website, and for that I want to use HTTP, SOCKS4 and SOCKS5 proxies. So my questions are as follows:
Is it possible to use these proxies through Java without using any other external API? For instance, is it possible to send a request through HttpURLConnection through these proxies?
If it is not possible, then what other external APIs can I use?
I was doing it with the headless browser provided by HtmlUnit, but it takes time to load even simple webpages, so could you please suggest other APIs (if any) that provide headless browsers that load webpages quickly? I don't want to open webpages that contain heavy AJAX or JavaScript code. I just need to click a form's button through the headless browser.

Is it possible to use these proxies through Java without using any other external API? For instance, is it possible to send a request through HttpURLConnection through these proxies?
Yes, you can configure proxies either by using (global) system properties, by using the Proxy class, or by using a ProxySelector. The latter two options have been available since Java 5 and are more flexible. Have a look at Java Networking and Proxies, as mentioned by jarnbjo, for all the details.
I was doing it with the headless browser provided by HtmlUnit, but it takes time to load even simple webpages, so could you please suggest other APIs (if any) that provide headless browsers that load webpages quickly? I don't want to open webpages that contain heavy AJAX or JavaScript code. I just need to click a form's button through the headless browser.
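As a rough sketch of the Proxy class option (the class name, the helper names and the proxy host/port you pass in are placeholders for illustration, not something taken from the linked guide): a Proxy is built per connection and handed to openConnection. Note that Proxy.Type only distinguishes HTTP from SOCKS; as far as I know the JDK negotiates SOCKS5 by default, and the socksProxyVersion system property can be set to 4 if the server only speaks SOCKS4.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

public class ProxyExample {

    // Builds an HTTP proxy descriptor; the address is left unresolved so no DNS lookup happens here.
    public static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Builds a SOCKS proxy descriptor (covers both SOCKS4 and SOCKS5 servers).
    public static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    // Prepares a connection that will go through the given proxy; nothing is sent yet.
    public static HttpURLConnection open(String url, Proxy proxy) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: ProxyExample <url> <proxyHost> <proxyPort> [socks]");
            return;
        }
        int port = Integer.parseInt(args[2]);
        boolean socks = args.length > 3 && args[3].equals("socks");
        Proxy proxy = socks ? socksProxy(args[1], port) : httpProxy(args[1], port);

        HttpURLConnection conn = open(args[0], proxy);
        // The request is actually sent when the response code is read.
        System.out.println("HTTP status: " + conn.getResponseCode());
        conn.disconnect();
    }
}
```

Because the proxy is chosen per connection, different requests in the same JVM can use different proxies, which is what makes this more flexible than the global properties.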
Unfortunately, the first alternatives I can think of are either HtmlUnit based (like JWebUnit or WebTest) or slower (Selenium, WebDriver - that you can run in headless mode). But maybe you could try HttpUnit if you don't need advanced JavaScript support.

Yes, that is possible. You can find the configuration options for different network proxies here.
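For the global route, a minimal sketch using the standard networking system properties might look like the following. The class and method names are made up for illustration, and the host and port values are whatever your proxy uses. The properties apply JVM-wide, so it is safest to set them before the first connection is opened.

```java
public class GlobalProxyConfig {

    // Routes http:// and https:// URLs through an HTTP proxy for the whole JVM.
    public static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", Integer.toString(port));
    }

    // Routes TCP connections through a SOCKS proxy; version is 4 or 5.
    public static void useSocksProxy(String host, int port, int version) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
        System.setProperty("socksProxyVersion", Integer.toString(version));
    }

    // Removes every property set above so later connections go direct again.
    public static void clear() {
        String[] keys = {
            "http.proxyHost", "http.proxyPort",
            "https.proxyHost", "https.proxyPort",
            "socksProxyHost", "socksProxyPort", "socksProxyVersion"
        };
        for (String key : keys) {
            System.clearProperty(key);
        }
    }

    public static void main(String[] args) {
        // Placeholder proxy address, for illustration only.
        useSocksProxy("proxy.example.com", 1080, 4);
        System.out.println("socksProxyHost = " + System.getProperty("socksProxyHost"));
        clear();
    }
}
```

The same properties can also be passed on the command line, for example -Dhttp.proxyHost=... -Dhttp.proxyPort=..., without touching the code.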

Related

How can I automate a server health check without using Selenium?

I want to automate a health check for a server that is configured using WebLogic. I have to log in to the WebLogic console to view the server's health. When I think about automating this process, the only way known to me is Selenium, but I don't want to use it. Is there any other way to log in to the WebLogic console and get the health status of the server in Java?
I think it is pretty simple: if you don't have components at hand that provide the functionality you are looking for, then you will have to see what it takes to implement it yourself.
You could start here for example. Or for a slightly different approach there.
You can do API automation instead of UI automation. You have to find out whether any APIs are exposed to get the data. If there are APIs available, then you can use an HTTP client library in Java for API automation.
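A minimal sketch of that idea, using the JDK's built-in java.net.http.HttpClient (Java 11+) rather than a third-party client library. Everything specific here is an assumption: the URL is whatever health endpoint your server actually exposes, and HTTP Basic authentication is assumed, which your setup may or may not use.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;

public class HealthCheck {

    // Encodes credentials as an HTTP Basic "Authorization" header value.
    public static String basicAuth(String user, String password) {
        String raw = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Builds a GET request for the (assumed) health endpoint.
    public static HttpRequest buildRequest(String url, String user, String password) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Authorization", basicAuth(user, password))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: HealthCheck <url> <user> <password>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest(args[0], args[1], args[2]),
                HttpResponse.BodyHandlers.ofString());
        // How to interpret the body depends on the API; here it is just printed.
        System.out.println("status: " + response.statusCode());
        System.out.println(response.body());
    }
}
```

Run on a schedule, a check like this can alert on a non-200 status or on a body that does not report a healthy state, with no browser involved.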

Is it a bad idea to use HtmlUnit for every first HTTP request?

I have an AJAX front end for my Java back end, and to make things work with crawlers and HTML5 pushState I am going to use HtmlUnit to process the JavaScript on every first request.
I can work around this with a Filter that skips HtmlUnit, to try to reduce the server load.
Regardless of browser compatibility, which is best for the server?
After a lot of research and testing I can say that processing heavy JavaScript applications on the server side isn't a good idea. HtmlUnit is the best tool around and it doesn't do the job very well, so if your requirement is to support clients that don't support JavaScript, go for PHP or some other server-side scripting language, or maybe JSF.
In my case, I have a back end in JAX-RS, and I managed to support HTML5 pushState without processing the JavaScript on the server side, assuming that the client processes JavaScript. HtmlUnit is still in use to enable crawlability.

Listen for custom protocol in a java application

I have created a Java app and I would like to be able to execute actions in this app by calling some custom URLs (e.g. myapp://do_this).
I have already searched for this, and I have found some information about handling such URLs in Java (URLStreamHandler).
The only part I'm missing is how to tell the OS to redirect the "myapp://" protocol to my application.
More and more apps are defining their own protocol, and I was wondering whether it is possible to do this kind of thing in Java.
Thanks
The protocol in URLStreamHandler will only be used inside the JVM. Typical usage is something like res://... for resources. The OS and the browsers have their own sets of protocols (ftp, mailto).
You could probably extend Firefox with a plugin to handle your protocol.
But then you have to send that to your app, running as a small "web" server.
Forget for a moment URLStreamHandler.
Embed the Jetty web server in your app, say on port 8765, and write a servlet to handle your URLs. Then you can type "http://localhost:8765/do_this" in your browser. That should suffice.
To add a new protocol, one needs to implement an XPCOM component. Since XPCOM allows programming languages to talk to each other, XPCOM components can currently be implemented in C++ or JavaScript in Mozilla.
http://www.nexgenmedia.net/docs/protocol/ will help you to understand more.
I think this is what you want.
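To illustrate the local-server idea without pulling in Jetty, here is a rough sketch that swaps in the JDK's built-in com.sun.net.httpserver.HttpServer. It is a stand-in for the Jetty servlet suggested above, not the same thing. The class name and the /do_this handler are illustrative; the port is the one from the example above.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class LocalActionServer {

    // Starts a loopback-only HTTP server that runs the given action on GET /do_this.
    // Pass port 0 to let the OS pick a free port.
    public static HttpServer start(int port, Runnable action) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
            server.createContext("/do_this", exchange -> {
                action.run();
                byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = start(8765, () -> System.out.println("do_this requested"));
        System.out.println("Listening on http://localhost:" + server.getAddress().getPort() + "/do_this");
    }
}
```

Binding to 127.0.0.1 keeps the endpoint unreachable from other machines, which seems prudent for something that triggers actions in a desktop app.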
As others have said, getting browsers to understand a new protocol name is browser (and OS) specific - you can't do it from the server.
However, would Java Web Start ( http://download.oracle.com/javase/tutorial/deployment/webstart/ ) fit your requirements? Most browsers are already set up to handle JWS applications correctly.

Cross-domain calls from a webbrowser

I would like to execute a cross domain http request from a website. What are my options?
JavaScript is out, because most browsers don't allow cross-domain calls. Generally the solution is to use a proxy, but that isn't an option for this project.
The other things I was thinking about would be to use Flash or maybe Java. Are there any other platforms that I could use?
You will have to stick with the proxy solution because Flash and Java have the same cross-domain restrictions as JavaScript. If this is only for personal use, there is one option that I know of: Flex Builder and the debugger version of the Flash player, which can make cross-domain requests.
Both Java and Flash support crossdomain.xml files, as documented on Oracle and Adobe sites respectively.
W3C is working on a standard that takes a different approach. When that gets implemented by which systems, I cannot predict.
If you have administrative access to the server you will be making a cross-domain request to, then you can make it serve a Flash cross-domain policy file that grants another server (or servers) cross-domain access. Then that other server needs to use Flash to make its cross-domain requests.
If you are looking for something to help get you started, check out the opensource Forge project. It exposes a cross-domain XmlHttpRequest API in JavaScript so you only have to write JavaScript code:
http://github.com/digitalbazaar/forge/blob/master/README
"Javascript is out, because most browser don't allow cross domain calls."
Unfortunately, Javascript is most definitely in. You just have to add a new script to the page with whatever src url you like. It's called Cross-Site Scripting (or XSS). IMO, the vulnerability it introduces renders moot all the other attempts by browsers to regulate a "same-origin" policy. They're just trying to patch a hole in a pair of pants that have already fallen down around your ankles.

Develop desktop applications view with HTML, as a web application

I am used to developing web applications in Java (Struts, Spring, JSP...). But now I want to develop a desktop one. I never liked designing windows in Java (AWT, Swing, SWT): too much work for an ugly interface. So I think it could be a good idea if I could take advantage of my web-app skills. One option is to modify the SWT Browser and make calls to a Java function instead of HTTP requests. A very good add-on would be the use of JSP. Finally, I thought that there is probably some framework or tool for this.
Do you think that what I propose is a good idea?
Is there a framework available for this?
I need this for light applications. So I think that embedding a Tomcat server and using it with HTTP requests is not a good idea.
Edit: One example application could be a folder comparer: you specify two folders and the app shows you which folders and files are different. In this case, I think opening an external browser is ugly. Bloated application (with its server, MVC, etc) wouldn't be the best choice.
If you have used the JavaScript library ExtJS, then you can use it with Adobe AIR to build a good-looking desktop-based web app.
Building apps in Adobe AIR is also simple and elegant with the Flex Builder IDE.
If you opt to embed a light server, check out Winstone. It is not fully J2EE compliant but should be enough for what you need.
As for the browser, I am not a big fan of SWT myself, as it complicates cross-platform deployment a lot, so it is probably worth keeping an eye on JWebPane. It is not quite ready yet, but it is probably the solution you'll need.
I wouldn't discount embedding a web server. I've done this before with a web start application embedding Jetty.
The download was pretty fast, the server starts up and you can use BrowserLauncher to immediately drive your browser to the embedded server, and hence your application. Jetty is designed to be modular and have a small footprint, so you can probably cut it down to the bare necessities.
There are several options: You can use the plugin API of Firefox and develop your app in there. You can use HTML, JavaScript, the built-in database, all the browser features and access the OS level.
Or you could try PyQt (Python and Qt), which allows you to write simple applications very quickly.
[EDIT] The main problem you're facing is security: for security reasons, JavaScript apps (running in an HTML page) can't access local OS resources. So unless your browser allows you to write plugins in JavaScript (which is only true for FF AFAIK), there is no way to write an application which uses HTML as the "view" without the help of something else.
Moreover, HTML is very limited when it comes to features for applications. HTML is designed to be a "static document view" not an "application". You can do things like GMail but if you compare GMail to any real mail app (Outlook, Thunderbird, Notes), you'll see quickly that real desktop apps offer a lot more features.
