Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I want to scrape my website and use the data to populate elements in my app. The website has login pages, and certain pages only open after logging in.
I started with HtmlUnit, since it is a headless browser, and completed a custom API in a Java IDE. When I later tried to use the JAR generated from that project in Android, I found that HtmlUnit and Android have incompatibility issues.
Can anyone propose a solution to this problem?
Edit:
Since no one actually answered this question, I am currently going with a workaround: I use Android's native WebView, set its visibility to invisible, and attach a JavaScript interface to a Java object, so I can inject JS code to scrape any data.
Use the Jsoup library for this purpose. It is very handy and easy to use.
Start with this answer and follow documents and other examples.
Either contribute to HtmlUnit to produce a version that does not use the dependencies missing from Android,
or use an alternative method like this one, as this seems to be the path someone else took before you.
If a real headless browser able to handle every recent web feature existed, it would mean that a team had developed it and kept investing considerable effort in it, consistently supporting existing and upcoming features.
Apart from the Opera, Chrome, IE, and Firefox browsers, there is no such team.
I would point to Chromium (CEF) as the most open and actively supported option across languages. Try CEF for Java.
I'm thinking about writing a desktop application whose GUI is made with either HTML or PHP, while the functions are run by separate Java or Python code. Are there any pointers I can look into?
There are a couple of possible options:
Run your backend code as an embedded HTTP server (like Jetty* for Java or Tornado* for Python). When the user starts the application, the backend runs the server and automatically opens the web browser at the URL of your server. This, however, may cause problems with the operating system firewall (running a server on the local machine).
You could also have a look at CEF (Chromium Embedded Framework). It is made for exactly this purpose (running an HTML application inside your code). It uses the same codebase as the Chromium (and Chrome) web browser. It was originally developed for C++, but there is also a Java binding: java-cef
Oh, and by the way, PHP is a server-side language. I would not recommend using it in your scenario (since your backend code is Python or Java).
*I do not have enough reputation to add more than two links, so you'll have to google those yourself.
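The embedded-server option can be sketched with the HTTP server that ships with the JDK (`com.sun.net.httpserver`), used here instead of Jetty only to keep the example dependency-free. The class name `LocalUiServer`, the `start` method, and the page content are illustrative assumptions, not part of the answer above. Binding to the loopback address keeps the server unreachable from other machines, which may also reduce firewall prompts on some systems.

```java
import com.sun.net.httpserver.HttpServer;
import java.awt.Desktop;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: a backend that serves its own HTML GUI.
public class LocalUiServer {

    /** Starts an HTTP server bound to 127.0.0.1 only; port 0 lets the OS pick a free port. */
    public static HttpServer start(int port) throws IOException {
        InetSocketAddress address =
                new InetSocketAddress(InetAddress.getByName("127.0.0.1"), port);
        HttpServer server = HttpServer.create(address, 0);

        // Serve a single placeholder page; a real application would serve its GUI files here.
        server.createContext("/", exchange -> {
            byte[] body = "<html><body><h1>Hello from the Java backend</h1></body></html>"
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start(0);
        URI ui = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/");
        System.out.println("UI available at " + ui);

        // Open the default browser if the platform supports it (not available when headless).
        if (Desktop.isDesktopSupported()
                && Desktop.getDesktop().isSupported(Desktop.Action.BROWSE)) {
            Desktop.getDesktop().browse(ui);
        }
    }
}
```

The server keeps running after `main` returns because its worker thread is not a daemon thread; call `server.stop(0)` when the application should exit.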
You could expose data from Java or Python as JSON via a GET request and use PHP to access it. There are multiple libraries for each of these languages for both writing and reading JSON. A GET request can take parameters if needed.
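As a rough illustration of the Java side, the sketch below answers a GET request with JSON and reads one query parameter. The `/greeting` path, the `name` parameter, and the class and method names are all made up for the example. The JSON is assembled by hand to avoid a dependency; in practice one of the JSON libraries mentioned above would be the safer choice.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class: exposes backend data as JSON over GET.
public class JsonEndpoint {

    /** Parses a raw query such as "a=1&b=2" into a map; a null or empty query gives an empty map. */
    public static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    /** Wraps a string in quotes, escaping quotes, backslashes, and control characters. */
    public static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds the JSON document returned by the endpoint. */
    public static String greetingJson(String name) {
        return "{\"greeting\":" + jsonString("Hello, " + name) + "}";
    }

    /** Starts a loopback-only server with one JSON endpoint at /greeting. */
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getByName("127.0.0.1"), port), 0);
        server.createContext("/greeting", exchange -> {
            String name = parseQuery(exchange.getRequestURI().getRawQuery())
                    .getOrDefault("name", "world");
            byte[] body = greetingJson(name).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

A client in any language can then request `/greeting?name=...` and decode the response with its own JSON library.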
I have a piece of Java code I would like to run in my web browser and publish online. How can I do this without using applets? I have tried Vert.x for Java, but I am not sure how to use it and there are no good tutorials online.
The short answer is you can't. Browsers don't "speak" Java natively, which is why applets required a plugin. As you probably know, Google is in the process of removing support for the plugin technology used by the Java plugin (NPAPI) and so soon Java won't work in Chrome at all (it already doesn't under Linux).
Your only real options are:
Provide a means of running it server-side, like http://ideone.com and various other "online" compilers do.
Translate it from Java to JavaScript (either manually or using a tool), which the browser can then run. But note that Java and JavaScript are not only markedly different languages despite a superficial similarity in syntax, but the standard environment for each is also quite different from the other.
How you do either of those is much too broad a question for SO.
I want to write a program that saves a website when I enter its link. What is the easiest programming language for that? I want to save the entire website to my computer. I know there are ways to write a program that saves a single web page, but my requirement is to save the entire website. How can I do it? I just need some tips; then I can do some research and find a solution. Please help me get started. Thanks.
What you are trying to create is actually a download manager. It is fairly easy to create a simple download manager in Java, but quite tedious to create a full-fledged one.
The idea behind it is simple. Say you have a web page with the URL www.example.com/index.html. Downloading just index.html is easy. To download all pages of a domain or website, you first download index.html, then parse it for links that are inside the domain (i.e. within www.example.com). You then download all of those links, go through the downloaded pages, and find more links. This goes on until every link has been parsed once. So essentially you need to read a web page, grab its links, and then download those links. Search for information on web crawlers, web page parsing, etc.
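The loop described above (download a page, collect same-site links, repeat until nothing new turns up) could look roughly like the following sketch. The class and method names are invented for illustration, the regular expression is a crude stand-in for a real HTML parser, and the `fetcher` parameter is an assumption that keeps the crawl logic separate from the actual download step.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: breadth-first crawl restricted to one host.
public class SiteCrawler {
    // Crude href matcher that ignores fragment-only links; a real HTML parser is more robust.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Returns absolute links found in html that point to the same host as pageUrl. */
    public static List<String> sameHostLinks(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                URI target = base.resolve(m.group(1).trim());
                if (base.getHost() != null && base.getHost().equalsIgnoreCase(target.getHost())) {
                    links.add(target.toString());
                }
            } catch (IllegalArgumentException e) {
                // Skip hrefs that are not valid URIs.
            }
        }
        return links;
    }

    /** Crawls from startUrl; fetcher maps a URL to its HTML, or null if the download failed. */
    public static Map<String, String> crawl(String startUrl,
                                            Function<String, String> fetcher,
                                            int maxPages) {
        Map<String, String> pages = new LinkedHashMap<>(); // URL -> downloaded HTML
        Set<String> seen = new HashSet<>();                // every URL queued so far
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startUrl);
        seen.add(startUrl);
        while (!queue.isEmpty() && pages.size() < maxPages) {
            String url = queue.poll();
            String html = fetcher.apply(url);
            if (html == null) {
                continue;
            }
            pages.put(url, html);
            for (String link : sameHostLinks(url, html)) {
                if (seen.add(link)) { // add() is false when the link was already queued
                    queue.add(link);
                }
            }
        }
        return pages;
    }

    /** One possible fetcher, based on java.net.http (Java 11+). */
    public static String httpFetch(String url) {
        try {
            HttpResponse<String> response = HttpClient.newHttpClient().send(
                    HttpRequest.newBuilder(URI.create(url)).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200 ? response.body() : null;
        } catch (IOException | IllegalArgumentException e) {
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

Calling `SiteCrawler.crawl("http://www.example.com/index.html", SiteCrawler::httpFetch, 100)` would return a map from URL to HTML that can then be written to disk. Images, stylesheets, and links generated by JavaScript are not handled in this sketch.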
If you are just trying to download a website, try software like FlashGet, Internet Download Manager, etc. There are some open-source ones, so you could get the source as well.
Please go through the links below for more info
http://www.9code.in/java-download-manager-with-full-source-code/
http://www.javaworld.com/article/2076095/core-java/download-a-website-for-offline-browsing.html
http://www.programcreek.com/2012/12/how-to-make-a-web-crawler-using-java/
How to get a web page's source code from Java
I want to write a web crawler that can interpret JavaScript. Basically it's a program in Java or PHP that takes a URL as input and outputs the DOM tree, similar to the output in the Firebug HTML window. The best example is Kayak.com, where you cannot see the resulting DOM displayed in the browser when you 'view source' but can save the resulting HTML through Firebug.
How would I go about doing this? What tools exist that would help me?
Ruby's Capybara is an integration test library, but it can also be used to write stand-alone web crawlers. Given that it uses backends like Selenium or headless WebKit, it interprets JavaScript out of the box:
require 'capybara/dsl'
require 'capybara-webkit'
include Capybara::DSL
Capybara.current_driver = :webkit
Capybara.app_host = "http://www.google.com"
page.visit("/")
puts(page.html)
I've been using HtmlUnit (Java). It was originally designed for unit testing pages. Its JavaScript support is not perfect, but it hasn't failed me in my limited usage. According to the site, it can run the following JS frameworks to a reasonable degree:
jQuery 1.2.6
MochiKit 1.4.1
GWT 2.0.0
Sarissa 0.9.9.3
MooTools 1.2.1
Prototype 1.6.0
Ext JS 2.2
Dojo 1.0.2
YUI 2.3.0
You are more likely to have success in Java than in PHP. There is a pre-existing JavaScript interpreter for Java called Rhino. It's a reference implementation, and well documented.
Rhino is used in lots of existing Java apps to provide JavaScript scripting ability within the app. I have also heard of it being used to assist with automated tests in JavaScript.
I also know that Java includes code that can parse and render HTML, though someone who knows more about Java than me can probably advise more on that. I am not denying it would be very difficult to achieve something like this; you'd essentially be re-implementing a lot of what a browser does.
You could use Mozilla's rendering engine Gecko:
https://developer.mozilla.org/en/Gecko
Take a look here: http://snippets.scrapy.org/snippets/22/
It's a Python screen-scraping and web-crawling framework used with webdrivers that open a page, render everything you need, and give you the possibility to "capture" anything you want in the page via
I need to develop a basic web app very quickly (1 week) for a demo.
My requirements are
Java (I need to make use of existing Java libraries to access the relevant data)
2 screens
One for static data view, maybe some search parameters
Other for basic form entry
No fancy AJAX required
Ideally easy for a web designer to come in and tart it up as necessary, without having to rewrite everything
My first stop was going to be to check out Wicket, as I've heard good things about it. I don't have the time right now to dive into anything heavy, which probably writes off JSF in my mind (I played with JSF1, a steep learning curve which I've now slid back down).
I'm happy to treat the result as throwaway, so if there's a framework which starts off well but then doesn't scale up to bigger projects, that would be OK.
Any suggestions appreciated on frameworks/approach.
Spring Roo can very quickly create CRUD web applications using GWT, which can be tarted up later. Check out the keynote from Google I/O 2010 (especially Day 1, Part 9), where the skeleton of a basic expense-tracking application is developed from scratch in about 2 minutes.
GWT support is in Roo 1.1.0.M1
As a milestone release, Roo 1.1.0.M1 isn't intended for mission-critical use.
It is probably easiest to get in the form already integrated with the Eclipse-based SpringSource Tool Suite.
I suggest the Play framework, which has the huge advantage of being pure Java (so less of a learning curve if you don't know Groovy). Check out the demo!
Use Groovy/Grails. Full access to all Java libraries and you will be done so much faster it will make your head spin.
This is from a hardcore Java user, by the way. It's just not the appropriate language for most web apps.
Oh, you could probably also use JRuby on Rails.
Google App Engine. There are some good video tutorials from Google that get you up and running in no time.
http://code.google.com/appengine/
Intro (< 10 min): http://www.youtube.com/watch?v=bfgO-LXGpTM
Check out the CUBA Platform; it matches your requirements.
I can suggest GWT. It works on Google App Engine too if this is an internet app.