Convert HTML to DOC with images in Java - java

I am stuck on a Java application.
Is there any way to convert an HTML template that contains images into a DOC template using Java?
I have tried the Aspose API, but I can't use it because it is not open source.
I fetch the HTML template from a database, store the whole template in a String, and now I want to write that string out as a Word document, including the images.
Here is my code:
proc_stmt = con.prepareCall("{call PROCEDURECALL(?)}");
proc_stmt.registerOutParameter(1, Types.CLOB);
proc_stmt.execute();
String htmltemplate = proc_stmt.getString(1);
The HTML template is stored in a String, and now I want to convert it to a Word document.
It also has an image whose src is a local path. The rest of the template works fine, but the image does not appear. Can anyone help me with this?

Thank you all for the time and help.
I tried the docx4j API 2.8.1 and it works wonderfully.
It has a ConvertInXHTMLFile sample, and it works fine.
If anyone wants the code, I will post it.
Here is the link that helped me:
https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/samples/ConvertInXHTMLFile.java
Once again, Thank you all.
Vrinda.
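
For readers hitting the same missing-image symptom: one possible cause is that a relative src value cannot be resolved once the HTML lives in a String rather than in a file. The sketch below uses only the standard library and rewrites relative img src values to absolute file: URIs before the markup is handed to a converter. ImageSrcResolver and the base directory are hypothetical names chosen for illustration, not part of docx4j, and the regex only handles double-quoted src attributes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of docx4j): makes relative <img src="..."> values absolute.
final class ImageSrcResolver {

    // Captures the text before the src value, the value itself, and the closing quote.
    private static final Pattern IMG_SRC = Pattern.compile(
            "(<img\\b[^>]*?\\ssrc\\s*=\\s*\")([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    static String resolve(String html, Path baseDir) {
        Matcher m = IMG_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String src = m.group(2);
            // Leave http:, https:, file:, data: and similar values untouched.
            String resolved = hasScheme(src)
                    ? src
                    : baseDir.resolve(src).normalize().toUri().toString();
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + resolved + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // A scheme needs at least two characters, so a Windows drive letter ("C:") is not one.
    private static boolean hasScheme(String src) {
        return src.matches("[a-zA-Z][a-zA-Z0-9+.-]+:.*");
    }

    public static void main(String[] args) {
        Path base = Paths.get("templates").toAbsolutePath(); // assumed image folder
        System.out.println(resolve("<img src=\"img/logo.png\"/>", base));
    }
}
```

The rewritten string can then be passed to whichever converter is in use. Whether a given converter accepts file: URIs is something to verify against its own documentation.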

Related

web scraping jsoup java unable to scrape full information

I need to scrape information from a website. I can scrape it, but not all of the information comes through; a lot of data is lost. The following images illustrate this:
I used Jsoup, connected to the URL, and then extracted this particular data using the following code:
Document doc = Jsoup.connect("https://www.awattar.com/tariffs/hourly#").userAgent("Mozilla/17.0").get();
Elements durationCycle = doc.select("g.x.axis g.tick text");
But I couldn't find any of that information in the result. So I printed the whole document from the URL, and it shows the following:
I can see the information when I download the page and read it as an input file, but not when I connect directly to the URL. I want to connect to the URL directly, though. Any suggestions?
I hope my question is understandable. Let me know if it needs more explanation.
Jsoup limits the size of the response body it reads by default, so a large page gets truncated. You should use the maxBodySize parameter:
Document doc = Jsoup.connect("https://www.awattar.com/tariffs/hourly#").userAgent("Mozilla/17.0").maxBodySize(0).get();
A value of 0 means no limit.
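
To show why a size cap makes elements disappear, here is a standard-library-only sketch of a capped read in which 0 means unlimited. It is an illustration of the idea, not jsoup's actual implementation; BodyCapDemo and readBody are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustration only: NOT jsoup's implementation of maxBodySize.
final class BodyCapDemo {

    // Reads at most maxBytes from the stream; maxBytes == 0 is treated as "no limit".
    static byte[] readBody(InputStream in, int maxBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        try {
            while (true) {
                int want = (maxBytes == 0) ? buffer.length : Math.min(buffer.length, remaining);
                if (want == 0) break;          // cap reached, the rest is dropped
                int n = in.read(buffer, 0, want);
                if (n == -1) break;            // end of stream
                out.write(buffer, 0, n);
                remaining -= n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] page = "<html><body><svg><text>00:00</text></svg></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        // With a small cap the <text> element never reaches the parser.
        System.out.println(new String(readBody(new ByteArrayInputStream(page), 20), StandardCharsets.UTF_8));
        System.out.println(new String(readBody(new ByteArrayInputStream(page), 0), StandardCharsets.UTF_8));
    }
}
```

A selector run against the truncated text finds nothing, which matches the symptom described in the question.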

JavaFX HTML-formatted text in PDF using iText, keeping the formatting

Is it possible to write formatted HTML text (color, alignment, ...) from an HTMLEditor into an "editable" PDF using iText?
I didn't find anything on the internet.
Thanks.
The easiest way of doing this is (as Amedee suggested) using pdfHTML.
It's an iText 7 add-on that converts HTML5 (+CSS3) into PDF syntax.
The code is pretty straightforward:
HtmlConverter.convertToPdf(
"<b>This text should be written in bold.</b>", // html to be converted
new PdfWriter(
new File("C://users/user2002/output.pdf") // destination file
)
);
To learn more, go to https://itextpdf.com/itext7/pdfHTML
I found a solution in another post that uses Flying Saucer.

Parsing shopping websites using jsoup

I have the following code:
doc = Jsoup.connect("http://www.amazon.com/gp/goldbox").userAgent("Mozilla").timeout(5000).get();
Elements hrefs = doc.select("div.a-row.layer");
System.out.println("Results:"+ hrefs); //I am trying to print out contents but not able to see the output.
Problem: I want to display every image src inside the div with class name "a-row layer", but I am unable to see any output.
What is wrong with my query?
I took a look at the website and tested it myself. The issue seems to be that the piece of HTML you want to extract (div.a-row.layer) is generated by JavaScript.
Jsoup does not execute JavaScript, so it cannot see content generated by it. You would need a headless web browser, such as HtmlUnit, to deal with this.

I am using jsoup to pull images from a website URL, but I want the page to load first. Is there any way to do this?

The problem is that some of the URLs I am trying to pull from have JavaScript slideshow containers that haven't loaded yet. I want to get the images in the slideshow, but since it hasn't loaded yet, jsoup doesn't find the elements. Is there any way to do this? This is what I have so far:
Document doc = Jsoup.connect("http://news.nationalgeographic.com/news/2013/03/pictures/130316-gastric-brooding-frog-animals-weird-science-extinction-tedx/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+ng%2FNews%2FNews_Main+%28National+Geographic+News+-+Main%29").get();
Elements jpg = doc.select("img[src$=.jpg]");
jsoup can't handle JavaScript, but you can use an additional library for this. See:
Parse JavaScript with jsoup
Trying to parse html hidden by javascript

How to read XML data in Android

Hi, I have been trying to read my XML file in my Android app but have been unsuccessful.
This is my XML file: http://collectionservice.byethost13.com/backup.XML
The document contains row tags, and all I have to do is show only the IDs from all the row tags in the ListView on the first screen.
Can anybody give me an example or something? I would be very thankful.
I just want to display this XML file:
the IDs in a ListView on the first screen, and then, when a specific ID is clicked, go to the next screen and show the ID, Name, Phone, Department, and What_Ever of that ID.
Can anybody give me some code or something? I have to give it to the client today and I am new to Android. I would be very thankful.
I have tried many links but with no success.
Pretty please.
Here is the official documentation, which is pretty clear and simple: https://developer.android.com/training/basics/network-ops/xml.html
I found that simple-xml makes it very easy to parse XML into objects.
The docs are well written, so they could be helpful to you.
You can use SAXParser to parse an XML file in Android. Extend the DefaultHandler class to read your tags.
Find a simple example here
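
The SAX approach from the last answer could be sketched as follows, using only the javax.xml and org.xml.sax classes. The XML layout (an ID element nested directly inside each row element) is an assumption, since the actual file is not shown in the question, and RowIdReader is a hypothetical class name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reader; assumes <row><ID>...</ID>...</row> elements in the document.
final class RowIdReader {

    static List<String> readIds(String xml) {
        final List<String> ids = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inRow;
            private StringBuilder text; // non-null only while inside an <ID> element

            // Some parsers fill localName and leave qName empty, so accept either.
            private String name(String localName, String qName) {
                return qName == null || qName.isEmpty() ? localName : qName;
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                String tag = name(localName, qName);
                if ("row".equals(tag)) inRow = true;
                else if (inRow && "ID".equals(tag)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length); // may arrive in chunks
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String tag = name(localName, qName);
                if ("ID".equals(tag) && text != null) {
                    ids.add(text.toString().trim());
                    text = null;
                } else if ("row".equals(tag)) {
                    inRow = false;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return ids;
    }
}
```

The returned list can back the ListView on the first screen, for example through an ArrayAdapter; collecting the other fields per row would follow the same pattern.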
