How To download web page using java as well preserve image files

How can I download a page and its images in Java, given the URL? A similar question already exists, but the approach there saves only the page and not the images.

You'll have to download the page (HTML), then parse the HTML, find the <img> tags in there, and download the images (the src attribute of the <img> tag is the URL of the image file). Likewise you might want to download accompanying CSS or JavaScript files for the page.
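
The steps above can be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not a definitive implementation: the class name PageImageLinks and its two methods are made up for this example, the regular expression handles only simple <img> tags (a real HTML parser is more robust), and the page HTML is assumed to be already available as a string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for this example only.
class PageImageLinks {

    // Matches the src attribute of an <img> tag, whether the value is
    // double-quoted, single-quoted, or unquoted. Assumption: simple markup;
    // this is not a full HTML parser.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
            Pattern.CASE_INSENSITIVE);

    // Returns the absolute URL of every image referenced by the page.
    static List<URI> imageUrls(String html, URI pageUrl) {
        List<URI> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(1) != null ? m.group(1)
                    : m.group(2) != null ? m.group(2)
                    : m.group(3);
            try {
                // Relative src values are resolved against the page URL.
                result.add(pageUrl.resolve(src.trim()));
            } catch (IllegalArgumentException e) {
                // Skip src values that are not valid URIs.
            }
        }
        return result;
    }

    // Copies the bytes behind one URL into a local file.
    static void download(URI source, Path target) throws IOException {
        try (InputStream in = source.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A caller would first save the HTML itself with download, then call imageUrls on its text and call download once per returned URL. The same pattern could be repeated for <link> and <script> tags to pick up CSS and JavaScript files.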

Java makes it easy to copy files from a Web site

Related

How to programmatically download all contents of webpage, not only the source code in Java

I know how to download a webpage's source in Java. But a webpage also references images, CSS files, and JavaScript files by URL, and these need to be downloaded as well, for example:
<LINK REL="STYLESHEET" HREF="htmlatex.css">
<img src=p10012.gif>
If I download only the source of a webpage, rendering it offline still requires htmlatex.css and p10012.gif, so content is missing in offline mode. My objective is to download all contents of a webpage programmatically and provide them as assets of an Android app. How can I do that in Java?
Note: please let me know if my question is not clear enough.

I would suggest using the jsoup library, as it is a pretty good HTML parser. You can parse the HTML and then iterate over the resources to download them. I am not sure, but there should be an example covering the topic you asked about.

Display PDF in browser

I am developing a Java web application (JSP/servlet) using Tomcat. I need to display a PDF file from the local machine. Can you suggest the best way to display it?

I used iframe to display pdf file.
<iframe src="resume.pdf" width="100%" style="height:60em">
[Your browser does <em>not</em> support <code>iframe</code>,
or has been configured not to display inline frames.
You can access the document
via a link though.]
</iframe>

I think you can try a library called Xpdf, which I believe can convert a PDF to an HTML page. The second option is to just let the user open a link to the file (www.yourwebsite.com/pdffolder/somepdf.pdf).

If you need to display a PDF file using Tomcat, you can open the file directly in your browser at the URL where it is located, which depends on the path where you put the file; for example, 127.0.0.1/files/test.pdf. If you need to generate a PDF, I think the best tool is iText. This is an easy example of how to use it: Introducing PDF and iText

How to set favicon.ico on a pdf link?

I am working on a web project in Java where I have to open a PDF file through this link:
View User Manual
I know how to set a favicon on ordinary pages, but I don't know how to set one for this PDF file.
Does anyone have the answer?
Thanks in advance.

In Chrome, do the following:
Place the favicon in the root of your website, and call it favicon.ico
Clear the cache
Load the PDF from your website

You can't. A favicon is something that can be set on an HTML page, but not on a PDF file. You might be able to do that by linking to an HTML page which has a favicon and contains an iframe containing the PDF file, but it seems overkill for just a favicon.

You can try the default favicon location, i.e. place favicon.ico in the root of the server (which is normally the ROOT application). In production you will almost always be running as ROOT. But I don't know whether browsers will recognize that; if they don't, it means you can't do it. PDFs are read in the browser only if there's a plugin, so perhaps the normal favicon resolution doesn't happen.

You could try using a PHP script to deliver the PDF and setting a favicon in the header there.
But I don't think that's worth the effort.

save webpages for offline browsing

I am trying to create an Android application that saves webpages for offline browsing. I was able to save the webpage itself, but not its contents (images, JavaScript, etc.). Is there a way to do this programmatically? I use Eclipse and test my work on an emulator.

Hm, I am afraid you will have to parse the HTML yourself (I mean with a proper library) and store all the resources (CSS, JS, images, videos, etc.) too.
See how it is done in a Java crawler: open source crawlers

You will need to search for all images, JavaScript files, CSS files, etc. and download them, saving each to the same path relative to the HTML file. This assumes the HTML is coded with relative paths (images/image.png) and not absolute paths (http://www.domain.com/image/image.png).
You can pretty easily search the HTML string for <img, <script, <link, etc. and parse from there, or you can use a third-party HTML parser.
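
One way to decide where each downloaded resource should be saved is sketched below. It is an illustration only: the class OfflinePaths, its methods, and the "_external" folder name are hypothetical choices for this example, and the code assumes plain http or https URLs. A resource that lives under the page's own directory keeps its relative path, so relative references in the HTML keep working; anything else is placed under a separate folder named after its host, and references to it in the HTML would still need to be rewritten.

```java
import java.net.URI;
import java.nio.file.Path;

// Hypothetical helper, named for this example only.
class OfflinePaths {

    // Computes the file name (relative to the saved HTML file) for a resource.
    static String relativeName(URI pageUrl, URI resourceUrl) {
        String pagePath = pageUrl.getPath();
        String resPath = resourceUrl.getPath();
        if (pagePath == null || resPath == null || resourceUrl.getHost() == null) {
            throw new IllegalArgumentException("expected http(s) URLs");
        }
        // Directory part of the page URL, e.g. "/docs/" for "/docs/page.html".
        String pageDir = pagePath.substring(0, pagePath.lastIndexOf('/') + 1);
        boolean sameHost = resourceUrl.getHost().equalsIgnoreCase(pageUrl.getHost());

        String rel;
        if (sameHost && resPath.startsWith(pageDir) && resPath.length() > pageDir.length()) {
            // Resource sits below the page's directory: keep the relative path.
            rel = resPath.substring(pageDir.length());
        } else {
            // Anything else goes into an assumed "_external/<host>/..." folder.
            rel = "_external/" + resourceUrl.getHost() + resPath;
        }
        while (rel.startsWith("/")) {
            rel = rel.substring(1);
        }
        return rel;
    }

    // Resolves the relative name inside the save directory and refuses
    // names that would escape it (for example via "..").
    static Path localPath(Path saveDir, String relativeName) {
        Path base = saveDir.normalize();
        Path target = base.resolve(relativeName).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("path escapes save directory: " + relativeName);
        }
        return target;
    }
}
```

Before writing a file to the returned path, the caller would typically create its parent directories, for example with Files.createDirectories(target.getParent()).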

Load HTML file to WebView with custom CSS

I have a WebView on my Android application which loads (WebView.loadUrl()) different local HTML files from phone's internal storage. I would like to include some custom css styles for them.
Now, I could have my app edit every HTML file and add linking reference for the CSS file.
I could also read the file contents, add the CSS linking and use WebView.loadData() to load it.
But is it possible to do this in a simpler and more efficient way?
Note: The HTML files are downloaded from a website. So editing them manually is not possible in this case, but once downloaded they can be edited via the app if necessary.

One possibility (I have not tried this):
WebView.loadDataWithBaseURL(String baseUrl, String data, ..)
takes a baseURL for the document to use to resolve relative URLs. Take a look at the CSS url and construct baseURL so that CSS url will reference local CSS file.
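
For the variant mentioned in the question (read the file contents and add the CSS link before loading), the string manipulation could look roughly like this. It is a sketch under assumptions: CssInjector and withStylesheet are hypothetical names, the stylesheet name is assumed to be a simple trusted file name such as custom.css, and the Android call itself is not shown or exercised here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for this example only.
class CssInjector {

    private static final Pattern HEAD_CLOSE =
            Pattern.compile("</head\\s*>", Pattern.CASE_INSENSITIVE);

    // Returns the HTML with a stylesheet link added just before </head>.
    // If the document has no </head>, the link is prepended instead.
    static String withStylesheet(String html, String cssHref) {
        // Assumption: cssHref is a trusted, simple file name (no escaping done).
        String link = "<link rel=\"stylesheet\" href=\"" + cssHref + "\">";
        Matcher m = HEAD_CLOSE.matcher(html);
        if (m.find()) {
            return html.substring(0, m.start()) + link + html.substring(m.start());
        }
        return link + html;
    }
}
```

The resulting string could then be passed as the data argument of WebView.loadDataWithBaseURL, with the base URL pointing at the directory that holds the CSS file so that the relative href resolves to it.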
