I have a Java app that automatically processes Excel files (.xls, .xlsx, etc.) received by email. I've noticed that some of these files are not native Excel files. Opening them in Excel gives a warning that the file is corrupt or badly formatted, and opening them in Notepad++ clearly shows HTML.
Unfortunately I can't handle these files manually, so I need a way to spot them automatically.
I noticed that when I have a java.io.File object, I can detect the type with org.apache.tika.Tika. With the File object I can find out the extension, and with tika.detect() I can find that the format is "text/html". (I'm not sure this is the best way, but it seems to work with my single example.)
So I can then find these kinds of files using:
File file = getTheFileObject();
if (tika.detect(file).equalsIgnoreCase("text/html") && file.getName().contains(".xls")) {
    // ... do what I want with the corrupt file ...
}
My problem comes when doing something similar with email attachments. To get the file from emails I'm using com.microsoft.ews-java-api 2.0, which gives me a FileAttachment object representing the file.
But when I use tika.detect() on this (the same corrupt file), I get a different result: "application/octet-stream" instead of "text/html". With the FileAttachment's own methods I get "application/vnd.ms-excel".
How can I spot these corrupt files if I can't detect the HTML-formatted .xls files?
FileAttachment attachment = getFileAttachment();
attachment.getContentType();                 // application/vnd.ms-excel
tika.detect(attachment.getContentStream());  // application/octet-stream
How would I spot an HTML file that has an .xls file extension, given the EWS FileAttachment object from the email? Will Tika still help?
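One possible explanation for the "application/octet-stream" result is that the attachment content had not been loaded when the stream was read, so there were no bytes to inspect. That is a guess, not something the question confirms. The sketch below sidesteps the stream entirely and checks the raw bytes. It assumes the bytes are available as a byte[] (with ews-java-api this is believed to be attachment.load() followed by attachment.getContent(), which is an assumption here). SpreadsheetSniffer and its methods are hypothetical names, not part of Tika or EWS.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical helper: sniffs whether an ".xls" payload is really HTML. */
final class SpreadsheetSniffer {

    // Magic bytes of a binary .xls (OLE2 compound file) and of an .xlsx (ZIP container).
    private static final byte[] OLE2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0};
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** True if the bytes look like HTML markup rather than a native Excel file. */
    static boolean looksLikeHtml(byte[] data) {
        if (data == null || data.length == 0) {
            return false;
        }
        if (startsWith(data, OLE2) || startsWith(data, ZIP)) {
            return false;
        }
        // Inspect only the first 1 KiB. ISO-8859-1 maps every byte to a char,
        // so decoding cannot fail on arbitrary input.
        int len = Math.min(data.length, 1024);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1)
                .toLowerCase(Locale.ROOT);
        return head.contains("<html")
                || head.contains("<!doctype html")
                || head.contains("<table");
    }

    /** The check from the question: an Excel file name whose content is really HTML. */
    static boolean isHtmlDisguisedAsExcel(String fileName, byte[] data) {
        return fileName != null
                && fileName.toLowerCase(Locale.ROOT).contains(".xls")
                && looksLikeHtml(data);
    }
}
```

This is a heuristic and will miss HTML saved as UTF-16. If Tika is preferred, passing the same byte[] to tika.detect(byte[]) should give it real content to examine.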
I need Java code to convert a Google Drive spreadsheet to Excel. Later I want to send the converted file as a mail attachment.
I am using this code to retrieve the file:
services.files().get(fileObj.getId()).executeAsInputStream();
I have code to send the mail, but the problem is that if I use the code above, the attachment is just a link to the Google Drive document.
The File object has an exportLinks property, which the documentation describes as "Links for exporting Google Docs to specific formats."
exportLinks.(key) can be used to get the link for a specific format. Replace the key with the MIME type you want the file exported to (if I remember correctly).
You can find more info about the exportLinks keys (and document downloading in general) in the Google Drive API documentation.
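The lookup described above might be sketched as follows. It assumes the export links are available as a Map from MIME type to URL (in the Drive v2 Java client this is believed to be what File.getExportLinks() returns, which is an assumption here). ExportLinkPicker is a hypothetical helper and the URLs in main are placeholders, not real Drive links.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: picks the Excel export URL out of a file's export links. */
final class ExportLinkPicker {

    /** MIME type of the .xlsx (Office Open XML spreadsheet) format. */
    static final String XLSX_MIME =
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";

    static String pickXlsxLink(Map<String, String> exportLinks) {
        if (exportLinks == null) {
            // Only native Google Docs files carry export links.
            throw new IllegalArgumentException("file has no export links");
        }
        String url = exportLinks.get(XLSX_MIME);
        if (url == null) {
            throw new IllegalArgumentException("no .xlsx export link; available: "
                    + exportLinks.keySet());
        }
        return url;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for the map returned by the Drive client.
        Map<String, String> links = new LinkedHashMap<>();
        links.put("application/pdf", "https://example.invalid/export?format=pdf");
        links.put(XLSX_MIME, "https://example.invalid/export?format=xlsx");
        System.out.println(pickXlsxLink(links));
    }
}
```

The returned URL still has to be fetched with an authorized HTTP GET; the response body is then the .xlsx content that can be attached to the mail instead of the link.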
I am trying to extract the data from a PDF file, but I couldn't find any example on the internet. Is it possible to use the JPedal library to open a PDF file and read its data?
You can use PDFBox from Apache.
I am not familiar with JPedal, but I write lots of code that generates and processes PDF files. I use iText and highly recommend it. If you have a specific question on how to process a PDF file, let me know.
I need to generate XML, Excel, PDF, and text files from a list that I retrieve from a database. I used itext-1.3.jar for this and generated them successfully. What I want to know is whether there is any other API available like iText. If so, please suggest one. Thanks in advance.
Apache POI, the Java API for accessing Microsoft Excel format files, can read and write the Microsoft Excel 2003 and 2010 formats.
I have a Java program that creates a PDF file.
However, I need to send this PDF file to a printer via the printer's SDK, which only accepts the PRN file type …
I understand that a PRN file is built by the specific driver for the specific printer, so the Java program should be able to pick the driver to use in order to convert the PDF to a PRN file.
As for the question of why I don't send the PDF file directly to print via the driver: this is a Zebra printer that prints and encodes smart cards. Printing and encoding together is only available through the SDK. If I sent the file directly to the driver, it would only print, without encoding the cards.
The PDF will need to be rendered and set into the format desired by the printer.
Ghostscript is what you want for this. The command would be:
gswin32c -dNOPAUSE -sDEVICE=PrinterName -sOutputFile="c:/out.prn" "file.pdf"
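Since the PDF is produced by a Java program, the same command could be launched from Java. The following is a minimal sketch, assuming gswin32c is on the PATH. GhostscriptPrn is a hypothetical helper, and the -dBATCH switch is an addition not present in the command above, included so that Ghostscript exits after the last page instead of waiting at its prompt.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: runs Ghostscript to render a PDF into a PRN file. */
final class GhostscriptPrn {

    /** Builds the argument list; ProcessBuilder needs no shell quoting around paths. */
    static List<String> buildCommand(String device, String pdfPath, String prnPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("gswin32c");
        cmd.add("-dNOPAUSE");
        cmd.add("-dBATCH"); // assumption: added so the process terminates on its own
        cmd.add("-sDEVICE=" + device);
        cmd.add("-sOutputFile=" + prnPath);
        cmd.add(pdfPath);
        return cmd;
    }

    /** Starts Ghostscript, waits for it to finish, and returns its exit code. */
    static int convert(String device, String pdfPath, String prnPath)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(device, pdfPath, prnPath))
                .inheritIO() // show Ghostscript's messages on this program's console
                .start();
        return process.waitFor();
    }
}
```

A call such as GhostscriptPrn.convert("PrinterName", "file.pdf", "c:/out.prn") mirrors the command line above; a non-zero return value indicates that Ghostscript reported an error.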
As I understand it, your printer manufacturer is Zebra and you are looking for a vendor-specific solution to the problem. There is an open source project, jZebra, that supports many vendors and, as far as I understand, can print from a PRN file. Check this thread to see whether it applies to you.
If you are lucky and the file you are trying to open contains plain text, you can try this:
Change the extension of the file from .prn to .txt.
Open the renamed file in Notepad or any other text editor.
You may see exactly what you wanted, plus some header that can be removed later.
Zebra printers use the ZPL language. You can design your ZPL template and send it to your printer as a PRN file.
^XA
^FX Top section with company logo, name and address.
^CF0,60
^FO50,50^GB100,100,100^FS
^FO75,75^FR^GB100,100,100^FS
^FO88,88^GB50,50,50^FS
^FO220,50^FDInternational Shipping, Inc.^FS
^CF0,40
^FO220,100^FD1000 Shipping Lane^FS
^FO220,135^FDShelbyville TN 38102^FS
^FO220,170^FDUnited States (USA)^FS
^FO50,250^GB700,1,3^FS
^XZ
Copy the text above and paste it into Notepad. Save it as abc.prn and send it to your printer.
You will see that it prints a label (I assume you are using a 4x6 printer).
You can create your own ZPL template using this website:
http://labelary.com/viewer.html
and this guide:
https://www.zebra.com/content/dam/zebra/manuals/en-us/software/zpl-zbi2-pm-en.pdf
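The copy-and-save step above can also be done from the Java program. The following is a minimal sketch that builds a shortened version of the address part of the label and writes it to a .prn file. ZplLabelWriter and its methods are hypothetical names, and sending the resulting file to the printer or SDK is left out because that part depends on the setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: builds a small ZPL label and saves it as a PRN file. */
final class ZplLabelWriter {

    /** Fills the three text fields of the address block, using the same ZPL commands as the template. */
    static String buildLabel(String company, String street, String city) {
        return String.join("\n",
                "^XA",
                "^CF0,60",
                "^FO220,50^FD" + company + "^FS",
                "^CF0,40",
                "^FO220,100^FD" + street + "^FS",
                "^FO220,135^FD" + city + "^FS",
                "^XZ") + "\n";
    }

    /** Writes the ZPL text as ASCII; non-ASCII characters are replaced with '?'. */
    static Path writePrn(Path target, String zpl) throws IOException {
        return Files.write(target, zpl.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        String zpl = buildLabel("International Shipping, Inc.",
                "1000 Shipping Lane", "Shelbyville TN 38102");
        Path written = writePrn(Paths.get("abc.prn"), zpl);
        System.out.println("Wrote " + written.toAbsolutePath());
    }
}
```

The values passed to buildLabel must not contain ^ or ~, because ZPL treats those characters as command prefixes.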