I'm trying to download some images from a website. I've been using Jsoup for scraping and have successfully downloaded images from a URL before, but the images on this website are in SVG format. There is no link to a location where an SVG file is stored; the image is embedded in the page inside <svg> tags. I have seen Batik used for converting SVG files to other image formats, but I don't have an SVG file available.
Is there any way to do this? Would appreciate any guidance. Thank you.
Typically an SVG image is not a separate file; it is included in the response body of the GET request a browser makes. To test this, download a REST client (Postman, if you're using Chrome) and issue a GET request to the URL of the SVG. The response will contain the SVG image. In Java terms, you may have to do some parsing in your code to grab just the actual svg element, because the website may return extra markup wrapping the embedded SVG.
I have used Batik and I think it's not a good solution for many reasons for what you're trying to do. In the past I ended up writing Java code that executed a 3rd party program for image conversion. It was basically a Command class that wrapped the execution of phantomjs. Download phantomjs, and use the rasterize.js file in the examples folder to achieve quick and easy image conversion from .svg to .png or .jpg. At the command line, the command for phantomjs is something like:
phantomjs rasterize.js C:\sourceImage.svg C:\outputImage.png
If you also need image manipulation, I did a lot of that with ImageMagick, since phantomjs is only good for rendering SVG to a rasterized image format.
In your case what you want to do is for every svg image at the url, GET the svg, parse it into a String, write that String to a file, then do something like:
String command = "C:\\phantomjs\\phantomjs.exe C:\\phantomjs\\rasterize.js C:\\source.svg C:\\output.png";
Process process = Runtime.getRuntime().exec(command);
Obviously make your code more general, replacing the values in the command string with reusable variables.
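A more general version of that call might look like the sketch below. The class name SvgRasterizer and its methods are hypothetical, not part of phantomjs or any library; the only assumptions are that phantomjs and its rasterize.js example script are installed at the paths you pass in. It uses ProcessBuilder with one list entry per argument, which avoids quoting problems when a path contains spaces.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper that wraps one run of the phantomjs rasterize.js script. */
final class SvgRasterizer {
    private final Path phantomJs;   // path to the phantomjs executable (assumed installed)
    private final Path rasterizeJs; // path to rasterize.js from the phantomjs examples folder

    SvgRasterizer(Path phantomJs, Path rasterizeJs) {
        this.phantomJs = phantomJs;
        this.rasterizeJs = rasterizeJs;
    }

    /** Builds the argument list: executable, script, source file, output file. */
    List<String> buildCommand(Path source, Path output) {
        return List.of(phantomJs.toString(), rasterizeJs.toString(),
                       source.toString(), output.toString());
    }

    /** Runs the conversion, waits for it to finish, and returns the process exit code. */
    int rasterize(Path source, Path output) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(source, output))
                .inheritIO() // forward phantomjs output to this JVM's console
                .start();
        return process.waitFor();
    }
}
```

Usage would be along the lines of new SvgRasterizer(Paths.get("C:\\phantomjs\\phantomjs.exe"), Paths.get("C:\\phantomjs\\rasterize.js")).rasterize(source, output), called once per SVG file.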
If this is in the context of a commercial platform, you can install phantomjs and your java app on a single server, and then just connect this app via REST endpoints to your svg finder app that gets the images. When your svg finder app gets an image, have it parse it, format it, then POST it to the phantomjs server for rendering and uploading/storage.
Just save the part of the HTML file between the <svg> tags (including the <svg>). Give it a .svg extension. You should then be able to open it in a browser, or pass it to Batik, ImageMagick or some other converter.
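As a rough sketch of that idea using only the Java standard library: the class InlineSvgExtractor and its method names are made up for illustration, and the regular expression is a simplification that does not handle nested <svg> elements (a real HTML parser such as Jsoup would be more robust). One detail worth knowing is that inline SVG in HTML often omits the xmlns attribute, which a standalone .svg file needs before browsers will render it, so the sketch adds it when missing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: pulls inline <svg> fragments out of an HTML string. */
final class InlineSvgExtractor {
    // Non-greedy match from "<svg" to the first "</svg>"; nested svg elements are not handled.
    private static final Pattern SVG =
            Pattern.compile("<svg\\b.*?</svg>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns every <svg>...</svg> fragment found in the HTML, tags included. */
    static List<String> extract(String html) {
        List<String> fragments = new ArrayList<>();
        Matcher m = SVG.matcher(html);
        while (m.find()) {
            fragments.add(m.group());
        }
        return fragments;
    }

    /** Adds the SVG namespace declaration if the fragment does not already have one. */
    static String withNamespace(String svg) {
        if (svg.contains("xmlns=")) {
            return svg;
        }
        return svg.replaceFirst("(?i)<svg\\b", "<svg xmlns=\"http://www.w3.org/2000/svg\"");
    }

    /** Writes one fragment to a .svg file as UTF-8. */
    static void save(String svg, Path target) throws IOException {
        Files.write(target, withNamespace(svg).getBytes(StandardCharsets.UTF_8));
    }
}
```

The saved file can then be opened in a browser or handed to whichever converter you choose.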
Right now I'm working on displaying LaTeX generated document with Java.
Strictly speaking, LaTeX source can be used to directly generate two formats:
DVI using latex, the first one to be supported;
PDF using pdflatex, more recent.
However rendering dvi or pdf is not available as far as I know.
Is there any way to handle those formats? Or maybe others that make sense?
There are not enough details with regards to how you wish to "render" DVI or PDF from a LaTeX document. However, you could always just render the pdf using pdflatex and DVI using latex and use ICEpdf for viewing PDFs and javaDVI for viewing DVIs.
Another neat hack to display pdf in a panel is to pass the file path to an embedded web component in the application, and the web component will use whatever pdf rendering tool is available on your machine (Acrobat, Foxit, Preview, etc.)
I remember there was a post about this a long time ago.
I don't think there's a generic way to preview the rendered output without generating the file itself. You can write your own LaTeX engine which caches the output every few seconds and displays that but regardless of the storage, you have to output it somewhere physically and then render the output separately using any of the steps mentioned above.
Another approach is to convert the DVI output to an SVG image file and render that with SVGGraphics2D. That will produce nice scalable results. DVI files can be converted to SVG on the command line (or in a script) using:
dvisvgm --no-fonts input.dvi -o output.svg
For more conversion options see this thread on how to convert pdf to clean svg.
In Lotus Notes, I have a document that contains a PPT file attachment. Using Apache POI, I was planning to generate an image from a specific slide from that PPT and display it on the web browser by accessing the agent from the web probably with the use of Ajax. Is there a way to temporarily store the generated image and display it? If yes, how would you be able to do it?
You could achieve this even without saving the attachment by encoding the image as Base64 and embedding it in the page source like this:
<img src="data:image/gif;base64,
R0lGODlhmwDFAPcAAAAAAAEBAQICAgMDAwQEBAUFBQYGBgcHBwgICAkJCQoKCgsLCwwMDA0N
DQ4ODg8PDxAQEBERERISEhMTExQUFBUVFRYWFhcXFxgYGBkZGRoaGhsbGxwcHB0dHR4eHh8f
HyAgICEhISIiIiMjIyQkJCUlJSYmJicnJygoKCkpKSoqKisrKywsLC0tLS4uLi8vLzAwMDEx
MTIyMjMzMzQ0NDU1NTY2Njc3Nzg4ODk5OTo6Ojs7Ozw8PD09PT4+Pj8/P0BAQEFBQUJCQkND
.... Lot of ascii characters ....
gww18FBEikHcgNkMRW5lmkJI/teaa0wNiOhshFFuiRSVpL34nqQRphZmcV5miORZQwnRpndI
nUmiiTStuaKbLl4Z45wuuADDDDfsgNKeMmy160w1hdaVSZfupyiXSgLoWpOQFjgpWUsKCGem
CEXFlIRlBefllxqKKlyblb45olWqosgmi29iGiudM+6Knp5F6LhrDYYCccQRQuzQp1cBAQA7">
If you need to store it, you could save it to a temp folder and then attach it to a document. You could then easily show it using an img tag with the src URL http://server/db.nsf/_/documentuniqueidOfSavedDocument/$File/nameOfImage.jpg
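The Base64 embedding can be sketched in Java with the standard java.util.Base64 class. The class DataUriImage and its method are hypothetical names, and the sketch assumes you already have the rendered slide as a byte array (for example, written to a ByteArrayOutputStream after rendering with POI, which is not shown here).

```java
import java.util.Base64;

/** Hypothetical helper: turns raw image bytes into an <img> tag with a data URI. */
final class DataUriImage {
    /**
     * @param imageBytes the encoded image file content (PNG, GIF, JPEG, ...)
     * @param mimeType   the matching MIME type, e.g. "image/png"
     */
    static String toImgTag(byte[] imageBytes, String mimeType) {
        // The basic encoder emits a single line without line breaks, as a data URI needs.
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "<img src=\"data:" + mimeType + ";base64," + encoded + "\">";
    }
}
```

The agent can write the returned string straight into its HTML output, so nothing has to be stored on disk.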
I know it's possible to convert an HTML file to PDF using Google Drive (HTML2PDF using Google Drive API), but I'd like to know whether this is possible when the HTML has images and CSS files, and how to do it.
You need to convert the HTML to a Docs file and export it as PDF. During the Docs conversion most of the non-trivial styles are trimmed. Basic coloring, sizing and positioning are all you'll get. The exported PDF is the PDF version of the Docs file. Images will be preserved, though.
You can experiment by uploading your HTML files to Google Drive on drive.google.com with the conversion setting turned on and checking the results.
For images you could try this: Embedding Base64 Images
Worked for me when uploading by web. Should work with my solution https://stackoverflow.com/a/21711109/592042
CSS can be written right into the HTML file.
I already tested the window.print() command for this purpose, but it does not fulfill my requirement.
I also tried printing the content of an iframe whose source is a PDF file, but that only works in Chrome, not in other browsers.
I want to print PDF files automatically from code instead of opening each file and printing it.
For example, if there are two files such as 1.pdf and 2.pdf in a directory and their location is given, how can I print both files using JavaScript, PHP, or both?
Million thanks in advance.
This is not possible, since most browsers, unlike Google Chrome (where it works), don't have a built-in PDF viewer.
The printing of a pdf document is up to the pdf reader, whether or not it is installed as a browser plugin, not the browser.
I fixed this issue by merging multiple PDFs or images (or both) into one file using ImageMagick.
Using the command below, we can merge a PDF and an image:
<?php
// ImageMagick's convert takes the input files first and the output file last.
$cmd = "test.pdf test.jpeg final.pdf";
exec("convert $cmd");
?>
After the merge has completed, open final.pdf automatically from code so the user can print it easily.
I am working on a web application (using Grails) which will generate a gift certificate. I'm thinking of a workflow like this:
The user will pick a template which will be stored as an image.
Then the text (name, date, amount etc) will be overlaid on the image to make the final certificate. There is a set of co-ordinates associated with each template which describes where to put each bit of text.
There is a kind of 'live preview' in the browser which shows the user what the final certificate will look like.
When the user is happy with the results, they download the certificate as a PDF and print it.
Can anyone recommend a library for Java (or Groovy) that will make it easy to do this? I'm not particularly worried about speed, as I suspect that the webapp will only be used by a few people at a time.
UPDATE: in the end I used the iText PDF library to overlay text on a PDF template using PdfStamper.
You can do this with the standard Java 2D graphics libraries - create a BufferedImage from the image, get its Graphics and use drawString() to put the text on top. Of course, the text would then be part of the bitmap in the resulting PDF, and not use the full printing resolution.
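A minimal sketch of the Java 2D approach follows. The class CertificateOverlay and its method are hypothetical, and the font, size and color are arbitrary choices for illustration. It copies the template first so the original image is left untouched, which is convenient when the same template is reused for a live preview.

```java
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper: draws one piece of text onto a copy of a template image. */
final class CertificateOverlay {
    /**
     * @param x left edge of the text, in pixels
     * @param y baseline of the text, in pixels (not the top of the glyphs)
     */
    static BufferedImage overlay(BufferedImage template, String text, int x, int y) {
        BufferedImage result = new BufferedImage(
                template.getWidth(), template.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            g.drawImage(template, 0, 0, null); // copy the template pixels
            g.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING,
                               RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
            g.setColor(Color.BLACK);                          // arbitrary choice
            g.setFont(new Font(Font.SERIF, Font.PLAIN, 24));  // arbitrary choice
            g.drawString(text, x, y);
        } finally {
            g.dispose(); // release the graphics context
        }
        return result;
    }
}
```

Calling overlay once per field (name, date, amount) with the coordinates stored for the template yields the preview image; javax.imageio.ImageIO.write can then encode it as PNG for the browser.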
In addition to the answers above, I have come across the Groovy library GraphicsBuilder and the Grails plugin j2D, which are also potential solutions.
You might consider using Batik to do this as SVG. Your image would be an <image> element and your text would be one or more <text> elements. There's a converter (called FOP, I believe) which will get you PDF output.