I am trying to fetch the HTML source of an email message so I can parse it.
For some reason content.getCount() returns 2,
and sometimes content.getBodyPart(i) is actually a nested Multipart object.
How do I distinguish between the elements returned by content.getBodyPart(i)?
Is this the correct way to fetch the HTML source?
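ParsedEmailData procesMultiPart(Multipart content) throws MessagingException, IOException {
    for (int i = 0; i < content.getCount(); i++) {
        BodyPart bodyPart = content.getBodyPart(i);
        if (bodyPart.isMimeType("multipart/*")) {
            // Nested multipart: recurse, but keep scanning the siblings if nothing was found
            ParsedEmailData nested = procesMultiPart((Multipart) bodyPart.getContent());
            if (nested != null) {
                return nested;
            }
        } else if (bodyPart.isMimeType("text/html")) {
            // A String body can also be text/plain, so check the MIME type explicitly
            return parseEmailBody((String) bodyPart.getContent());
        }
    }
    return null; // no HTML part found
}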
I'm trying to get the inline images of a mail, for which I have the following code:
protected void setCidAttachments(Message message, MensajeEmail mensajeEmail) {
try {
MimeMultipart mimeMultipart = (MimeMultipart) message.getDataHandler().getContent();
for (int k = 0; k < mimeMultipart.getCount(); k++) {
MimeBodyPart part = (MimeBodyPart) mimeMultipart.getBodyPart(k);
processPart(part, mensajeEmail);
}
}
catch (Exception e) {
log.error("Error getting attachments with cid", e);
}
}
private void processPart (BodyPart part, MensajeEmail mensajeEmail) throws MessagingException, IOException {
String type = getContentType(part);
StringBuilder content = new StringBuilder(mensajeEmail.getContenido());
if (isImage(type) && part.getDataHandler() != null && part.getDataHandler().getContent() != null) {
if (part.getDataHandler().getContent() instanceof MimeMultipart) {
MimeMultipart p = (MimeMultipart) part.getDataHandler().getContent();
for (int i = 0; i < p.getCount(); i++) {
BodyPart subpart = p.getBodyPart(i != p.getCount() - 1 ? i + 1 : i);
processPart(subpart, mensajeEmail);
}
} else {
mensajeEmail.setContenido(getInlineImage(part, content));
}
}
}
private String getInlineImage (BodyPart part, StringBuilder content) throws MessagingException, IOException {
Base64 decoder64 = new Base64();
ByteArrayOutputStream bos = new ByteArrayOutputStream();
// Get type
String type = getContentType(part);
// Get Content-ID
String contentId = getContentId(part);
// Replace
if (contentId.length() > 0) {
part.getDataHandler().writeTo(bos);
int start = content.indexOf("src=\"cid:" + contentId + "\"") + 5;
if (start > 4) {
int length = contentId.length() + 4;
content.replace(start, start + length, "data:" + (isImage(type) ? type : "image/png;") + " base64," + decoder64.encodeToString(bos.toByteArray()));
}
}
bos.close();
return content.toString();
}
private String getContentId (BodyPart part) throws MessagingException {
Enumeration headers = part.getAllHeaders();
while (headers.hasMoreElements()) {
Header header = (Header)headers.nextElement();
if (header.getName().equalsIgnoreCase("Content-ID"))
return cleanContentId(header.getValue());
}
return "";
}
private String getContentType (BodyPart part) throws MessagingException {
return part.getContentType().split(" ")[0];
}
private boolean isImage (String mime) {
return !mime.equals("text/html;") && !mime.equals("text/plain;");
}
private String cleanContentId (String contentId) {
if (contentId.charAt(0) == '<') contentId = contentId.substring(1, contentId.length() - 1);
return contentId;
}
This works perfectly fine when I send PNG images (which makes me think my code is indeed correct). However, when I try to send a JPG image, I get the following exception:
javax.activation.UnsupportedDataTypeException: Unknown image type image/jpeg; name=sony-car-796x418.jpg
at org.apache.geronimo.activation.handlers.AbstractImageHandler.getContent(AbstractImageHandler.java:57)
at javax.activation.DataSourceDataContentHandler.getContent(DataHandler.java:795)
at javax.activation.DataHandler.getContent(DataHandler.java:542)
at es.enxenio.fcpw.plinper.daemons.email.AbstractProtocoloObtencionEmail.processPart(AbstractProtocoloObtencionEmail.java:378)
...
Is the framework really not able to work with JPG images? Is there some way I can fix this?
EDIT: Gmail doesn't even let me send JPG images, so it's probably not a very common format for mail images. That makes me think it might not be widely implemented, which could be why Java doesn't seem to be able to work with it.
I found the problem. This line
if (isImage(type) && part.getDataHandler() != null && part.getDataHandler().getContent()
shouldn't check whether the type is an image but whether it is a multipart. Otherwise we could be processing a jpg image as a multipart. For some reason png images don't mind this and that's why it was working. Here are the changed parts of the code:
protected void setCidAttachments(Message message, MensajeEmail mensajeEmail) {
try {
processPart(message, mensajeEmail);
}
catch (Exception e) {
log.error("Error getting attachments with cid", e);
}
}
private void processPart(Part part, MensajeEmail mensajeEmail) throws MessagingException, IOException {
String type = getContentType(part);
StringBuilder content = new StringBuilder(mensajeEmail.getContenido());
if (isMultipart(type) && part.getDataHandler() != null && part.getDataHandler().getContent() != null && part.getDataHandler().getContent() instanceof MimeMultipart) {
MimeMultipart multipart = (MimeMultipart) part.getDataHandler().getContent();
for (int i = 0; i < multipart.getCount(); i++) {
BodyPart subpart = multipart.getBodyPart(i);
processPart(subpart, mensajeEmail);
}
}
else {
mensajeEmail.setContenido(getInlineImage(part, content));
}
}
private boolean isMultipart(String mime) {
return (Pattern.matches("multipart/.*", mime));
}
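For the substitution step that getInlineImage performs, independent of how the parts are walked, here is a minimal sketch using only the JDK. It swaps a src="cid:..." reference for a data: URI in the form RFC 2397 describes (data:<type>;base64,<payload>, with no space before base64). The names CidInliner and inlineCid are made up for illustration, and the MIME type is assumed to have been stripped of parameters such as name=... already.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class CidInliner {
    // Replaces every src="cid:<contentId>" in the HTML with an inline data: URI.
    // mimeType is assumed to be a bare type such as "image/png" (no parameters).
    static String inlineCid(String html, String contentId, String mimeType, byte[] data) {
        String needle = "src=\"cid:" + contentId + "\"";
        String dataUri = "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(data);
        return html.replace(needle, "src=\"" + dataUri + "\"");
    }

    public static void main(String[] args) {
        String html = "<img src=\"cid:logo@example\">";
        byte[] bytes = "hi".getBytes(StandardCharsets.US_ASCII); // stand-in for image bytes
        // prints <img src="data:image/png;base64,aGk=">
        System.out.println(inlineCid(html, "logo@example", "image/png", bytes));
    }
}
```

If the Content-ID does not occur in the HTML, the string is returned unchanged, so no index arithmetic is needed.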
I got a similar exception running an app on Eclipse OSGi with Java 11 and the bundles javax.mail.glassfish 1.4.1 and javax.activation 1.1.0 (both from https://eclipse.org/orbit):
javax.activation.UnsupportedDataTypeException: Unknown image type image/jpeg; name=image001.jpg
at org.apache.geronimo.activation.handlers.AbstractImageHandler.getContent(AbstractImageHandler.java:57)
at javax.activation.DataHandler.getContent(DataHandler.java:147)
at javax.mail.internet.MimeBodyPart.getContent(MimeBodyPart.java:652)
at my.code.calling.getcontent.MyClass(MyClass.java:802)
The package org.apache.geronimo.activation.handlers is included in the javax.activation 1.1.0 bundle.
I resolved the problem by commenting out (with #) the image/gif and image/jpeg handlers in the file META-INF/mailcap inside the javax.activation bundle:
## <apache license disclaimer> http://www.apache.org/licenses/LICENSE-2.0
##
## $Rev$ $Date: 2008/04/09 19:25:23 $
##
text/plain;; x-java-content-handler=org.apache.geronimo.activation.handlers.TextPlainHandler
text/html;; x-java-content-handler=org.apache.geronimo.activation.handlers.TextHtmlHandler
text/xml;; x-java-content-handler=org.apache.geronimo.activation.handlers.TextXmlHandler
#image/gif;; x-java-content-handler=org.apache.geronimo.activation.handlers.ImageGifHandler
#image/jpeg;; x-java-content-handler=org.apache.geronimo.activation.handlers.ImageJpegHandler
multipart/*;; x-java-content-handler=org.apache.geronimo.activation.handlers.MultipartHandler
There's no image/png entry here, which is why PNGs were not a problem in the first place.
So with the GIF and JPEG handlers commented out, attachments of these types are now handled like PNGs: getContent() just yields an InputStream instead of an AWT Image, which I think those Geronimo image handlers would produce if everything worked as intended.
Some internals: the Geronimo AbstractImageHandler in javax.activation 1.1.0 tries to determine the image type from the javax.mail.glassfish 1.4.1 method IMAPBodyPart.getContentType(), but this returns the MIME type including parameters, e.g. "image/jpeg; name=sony-car-796x418.jpg", which isn't understood and ultimately leads to the UnsupportedDataTypeException.
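To illustrate the mismatch, here is a small sketch in plain JDK code (it is not the Geronimo or JavaMail implementation) of stripping the parameters from a Content-Type value before looking it up. MimeBaseType, baseType and SUPPORTED are hypothetical names, and the set of types is only an example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class MimeBaseType {
    // Example set of types a handler lookup might know about (illustrative only).
    static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("image/gif", "image/jpeg"));

    // Returns "type/subtype" in lower case, without parameters such as "; name=...".
    static String baseType(String contentType) {
        int semi = contentType.indexOf(';');
        String base = semi >= 0 ? contentType.substring(0, semi) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String header = "image/jpeg; name=sony-car-796x418.jpg";
        System.out.println(SUPPORTED.contains(header));           // false
        System.out.println(SUPPORTED.contains(baseType(header))); // true
    }
}
```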
javax.mail.glassfish also has a META-INF/mailcap file, whose image/* section interestingly looks like this:
# can't support image types because java.awt.Toolkit doesn't work on servers
#
#image/gif;; x-java-content-handler=com.sun.mail.handlers.image_gif
#image/jpeg;; x-java-content-handler=com.sun.mail.handlers.image_jpeg
That could be a lead, though I still got the original JPEG exception in a GUI application as well.
Also, this error doesn't occur for me when running the same setup with Java 8 instead of 11, probably because Java 8 ships its own javax.activation package.
I'm trying to extract the content of an MTOM message using the code below:
Iterator i = msg.getAttachments();
while (i.hasNext())
{
AttachmentPart att = (AttachmentPart)i.next();
Object obj = att.getContent();
}
where msg is a SOAPMessage with a MIME content type, but rawContent comes back null and the code crashes when getting the AttachmentPart.
Is there any other way to get MTOM content? Getting boundaries and looping through?
I ended up with the code below:
// inputStream holds the raw multipart (MTOM) message
MimeMultipart mp = new MimeMultipart(new ByteArrayDataSource(inputStream, "multipart/related"));
int count = mp.getCount();
for (int i = 0; i < count; i++) {
BodyPart bodyPart = mp.getBodyPart(i);
String content = new String(read(bodyPart));
String partContentType = bodyPart.getContentType();
if(partContentType.toLowerCase().contains(SOAPConstants.SOAP_1_2_CONTENT_TYPE)) {
//process SOAP 1.2
}
if(partContentType.toLowerCase().contains(SOAPConstants.SOAP_1_1_CONTENT_TYPE)) {
//process SOAP 1.1
}
if(partContentType.toLowerCase().contains("application/octet-stream")) {
// process binary part
}
}
I have a requirement where I need to process the first line in the email message and, possibly, forward it.
The problem arises when the message has attachments, because I need to forward them as well. I can't find a good example of processing email messages with JavaMail safely across different message structures, and I haven't found a working forwarding example either.
Can anyone point me to a good resource with some code examples?
Thank you
Here is the code for getting the first line of the email message; I don't have forwarding working yet:
private String getMessgaeFirstLine(Message msg) throws IOException, MessagingException{
String result = null;
Object objRef = msg.getContent();
Multipart mp = (Multipart) objRef;
int count = mp.getCount();
for (int i = 0; i < count; i++)
{
BodyPart bp = mp.getBodyPart( i );
if (bp instanceof MimeBodyPart )
{
MimeBodyPart mbp = (MimeBodyPart) bp;
if ( mbp.isMimeType( "text/plain" )) {
result = (String) mbp.getContent();
result = result.replaceAll("(\\r|\\n)", "");
break;
}
}
}
return result;
}
The simplest way will be to forward the original message as an attachment to the new message. See the JavaMail FAQ.
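On the first-line part of the question: replaceAll("(\\r|\\n)", "") removes every line break, so it joins all lines together rather than keeping only the first one. A minimal plain-JDK sketch of taking the first non-blank line of a text body follows. FirstLine and firstLine are hypothetical names, and skipping leading blank lines is an assumption about what is wanted.

```java
class FirstLine {
    // Returns the first non-blank line of the body, trimmed, or "" if there is none.
    static String firstLine(String body) {
        if (body == null) {
            return "";
        }
        // Split on CRLF, LF, or a lone CR
        for (String line : body.split("\\r?\\n|\\r")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                return trimmed;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String body = "\r\nFORWARD to bob\r\nrest of the message\r\n";
        System.out.println(firstLine(body)); // prints FORWARD to bob
    }
}
```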
Is it possible to get only the count of attachments for a mail in Java? I tried using this:
DataHandler handler = message.getDataHandler();
AttachedFileName= handler.getName();
This lists the attachments for all mails in the inbox, not for a specific mail.
Is this possible? If so, how?
Thanks!
This should give you the attachment count:
Multipart multipart = (Multipart) message.getContent();
int attachmentCount = multipart.getCount();
I don't have enough reputation to comment on the accepted solution:
Multipart multipart = (Multipart) message.getContent();
int attachmentCount = multipart.getCount();
But I don't think that it is ideal for the following reasons:
Many email clients (for example, Thunderbird) send all HTML emails as multipart/alternative. They include an HTML part and an alternative plain-text part. Historically, this was done to let clients choose the best alternative they are capable of displaying.
Not everything included as a part is an attachment. For example, many images are not displayed as attachments in email clients because their disposition is set to "inline".
In summary, this solution potentially counts all HTML emails as having attachments and all emails with inline images as having attachments.
Here is an alternative that ignores parts not normally considered attachments:
private int getAttachmentCount(Message message) {
int count = 0;
try {
Object object = message.getContent();
if (object instanceof Multipart) {
Multipart parts = (Multipart) object;
for (int i = 0; i < parts.getCount(); ++i) {
MimeBodyPart part = (MimeBodyPart) parts.getBodyPart(i);
if (Part.ATTACHMENT.equalsIgnoreCase(part.getDisposition()))
++count;
}
}
} catch (IOException | MessagingException e) {
e.printStackTrace();
}
return count;
}
I know that this solution gets the body part, but I believe that it is the only accurate way to see if it is an attachment.
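One caveat is that the loop above only looks at the top-level parts, so an attachment sitting inside a nested multipart would be missed. The recursion needed can be sketched on a plain stand-in for the MIME tree. PartNode below is a hypothetical model class, not a JavaMail type; with real messages the same walk would go through Multipart.getCount() and getBodyPart(i).

```java
import java.util.Arrays;
import java.util.List;

class AttachmentCounter {
    // Hypothetical stand-in for a MIME part: a leaf has a disposition,
    // a multipart has children. A node without children is treated as a leaf.
    static final class PartNode {
        final String disposition;      // e.g. "attachment", "inline", or null
        final List<PartNode> children; // empty for leaf parts

        PartNode(String disposition, PartNode... children) {
            this.disposition = disposition;
            this.children = Arrays.asList(children);
        }
    }

    // Counts leaf parts whose disposition is "attachment", at any nesting depth.
    static int countAttachments(PartNode part) {
        if (part.children.isEmpty()) {
            return "attachment".equalsIgnoreCase(part.disposition) ? 1 : 0;
        }
        int count = 0;
        for (PartNode child : part.children) {
            count += countAttachments(child);
        }
        return count;
    }

    public static void main(String[] args) {
        // multipart/alternative with a plain-text and an HTML leaf
        PartNode alternative = new PartNode(null, new PartNode(null), new PartNode(null));
        // multipart/related: the alternative body plus one inline image
        PartNode related = new PartNode(null, alternative, new PartNode("inline"));
        // multipart/mixed: the body plus two real attachments
        PartNode mixed = new PartNode(null, related,
                new PartNode("ATTACHMENT"), new PartNode("attachment"));
        System.out.println(countAttachments(mixed)); // prints 2
    }
}
```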
I had a similar problem. The accepted answer didn't work for me because a part of a multipart is not necessarily an attachment file. I counted the attachments by skipping the other possible cases in the multipart.
int count = 0;
Multipart multipart = (Multipart) message.getContent();
for(int i = multipart.getCount() - 1; i >= 0; i--)
{
BodyPart bodyPart = multipart.getBodyPart(i);
String bodyPartContentType = bodyPart.getContentType();
if(isSimpleType(bodyPartContentType)) continue;
else if(isMultipartType(bodyPartContentType)) continue;
else if(!isTextPlain(bodyPartContentType)) count++;
}
You can check simple type, multipart type, and text using these methods:
private boolean isTextPlain(String contentType)
{
return (contentType.contains("TEXT/PLAIN") || (contentType.contains("TEXT/plain")));
}
private boolean isSimpleType(String contentType)
{
if(contentType.contains("TEXT/HTML") || contentType.contains("text") ||
contentType.contains("TEXT/html")) return true;
return false;
}
private boolean isMultipartType(String contentType)
{
if(contentType.contains("multipart") || contentType.contains("multipart/mixed")) return true;
return false;
}
This worked for me.
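The helper methods above compare against specific upper- and lower-case spellings. Since MIME type names are case-insensitive, a variant that normalises the value first may be easier to maintain. This is only a sketch with hypothetical names (ContentTypeCheck, isText, isMultipart, looksLikeAttachment), assuming the same rule as above: anything that is neither text nor multipart is counted.

```java
import java.util.Locale;

class ContentTypeCheck {
    // Lower-cases and trims the header value so comparisons ignore case.
    private static String normalize(String contentType) {
        return contentType.trim().toLowerCase(Locale.ROOT);
    }

    static boolean isText(String contentType) {
        return normalize(contentType).startsWith("text/");
    }

    static boolean isMultipart(String contentType) {
        return normalize(contentType).startsWith("multipart/");
    }

    // Same rule as the answer above: not text and not multipart.
    static boolean looksLikeAttachment(String contentType) {
        return !isText(contentType) && !isMultipart(contentType);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeAttachment("TEXT/PLAIN; charset=us-ascii"));    // false
        System.out.println(looksLikeAttachment("multipart/ALTERNATIVE"));           // false
        System.out.println(looksLikeAttachment("APPLICATION/PDF; name=report.pdf")); // true
    }
}
```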
I am writing a Java application to download emails using Exchange Web Services. I am using Microsoft's ewsjava API for doing this.
I am able to fetch email headers. But, I am not able to download email attachments using this API. Below is the code snippet.
FolderId folderId = new FolderId(WellKnownFolderName.Inbox, "mailbox#example.com");
findResults = service.findItems(folderId, view);
for(Item item : findResults.getItems()) {
if (item.getHasAttachments()) {
AttachmentCollection attachmentsCol = item.getAttachments();
System.out.println(attachmentsCol.getCount()); // This is printing zero all the time. My message has one attachment.
for (int i = 0; i < attachmentsCol.getCount(); i++) {
FileAttachment attachment = (FileAttachment)attachmentsCol.getPropertyAtIndex(i);
String name = attachment.getFileName();
int size = attachment.getContent().length;
}
}
}
item.getHasAttachments() is returning true, but attachmentsCol.getCount() is 0.
You need to load the Attachments property before you can use it in your code. You can set it on the ItemView object that you pass to the findItems method.
Alternatively, you can first find the items and then call service.loadPropertiesForItems, passing findResults and a PropertySet that includes EmailMessageSchema.Attachments:
FolderId folderId = new FolderId(WellKnownFolderName.Inbox, "mailbox#example.com");
findResults = service.findItems(folderId, view);
service.loadPropertiesForItems(findResults, new PropertySet(BasePropertySet.FirstClassProperties, EmailMessageSchema.Attachments));
for(Item item : findResults.getItems()) {
if (item.getHasAttachments()) {
AttachmentCollection attachmentsCol = item.getAttachments();
System.out.println(attachmentsCol.getCount());
for (int i = 0; i < attachmentsCol.getCount(); i++) {
FileAttachment attachment = (FileAttachment)attachmentsCol.getPropertyAtIndex(i);
attachment.load(attachment.getName());
}
}
}
Honestly as painful as it is, I'd use the PROXY version instead of the Managed API. It's a pity, but the managed version for java seems riddled with bugs.
Before checking item.getHasAttachments(), you should call item.load(). Otherwise there is a chance that the attachments will not be loaded and attachmentsCol.getCount() will be 0.
Working code with Exchange Server 2010:
ItemView view = new ItemView(Integer.MAX_VALUE);
view.getOrderBy().add(ItemSchema.DateTimeReceived, SortDirection.Descending);
FindItemsResults<Item> results = service.findItems(WellKnownFolderName.Inbox, new SearchFilter.IsEqualTo(EmailMessageSchema.IsRead, true), view);
Iterator<Item> itr = results.iterator();
while(itr.hasNext()) {
Item item = itr.next();
item.load();
ItemId itemId = item.getId();
EmailMessage email = EmailMessage.bind(service, itemId);
if (item.getHasAttachments()) {
System.err.println(item.getAttachments());
AttachmentCollection attachmentsCol = item.getAttachments();
for (int i = 0; i < attachmentsCol.getCount(); i++) {
FileAttachment attachment=(FileAttachment)attachmentsCol.getPropertyAtIndex(i);
attachment.load("C:\\TEMP\\" +attachment.getName());
}
}
}
A little late with the answer, but here is what I have.
HashMap<String, HashMap<String, String>> attachments = new HashMap<String, HashMap<String, String>>();
if (emailMessage.getHasAttachments() || emailMessage.getAttachments().getItems().size() > 0) {
//get all the attachments
AttachmentCollection attachmentsCol = emailMessage.getAttachments();
log.info("File Count: " +attachmentsCol.getCount());
//loop over the attachments
for (int i = 0; i < attachmentsCol.getCount(); i++) {
Attachment attachment = attachmentsCol.getPropertyAtIndex(i);
//log.debug("Starting to process attachment "+ attachment.getName());
//FileAttachment - Represents a file that is attached to an email item
if (attachment instanceof FileAttachment || attachment.getIsInline()) {
attachments.putAll(extractFileAttachments(attachment, properties));
} else if (attachment instanceof ItemAttachment) { //ItemAttachment - Represents an Exchange item that is attached to another Exchange item.
attachments.putAll(extractItemAttachments(service, attachment, properties, appendedBody));
}
}
} else {
log.debug("Email message does not have any attachments.");
}
//Extract File Attachments
try {
FileAttachment fileAttachment = (FileAttachment) attachment;
// if we don't call this, the Content property may be null.
fileAttachment.load();
//extract the attachment content, it's not base64 encoded.
attachmentContent = fileAttachment.getContent();
if (attachmentContent != null && attachmentContent.length > 0) {
//check the size
int attachmentSize = attachmentContent.length;
//check if the attachment is valid
ValidateEmail.validateAttachment(fileAttachment, properties,
emailIdentifier, attachmentSize);
fileAttachments.put(UtilConstants.ATTACHMENT_SIZE, String.valueOf(attachmentSize));
//get attachment name
String fileName = fileAttachment.getName();
fileAttachments.put(UtilConstants.ATTACHMENT_NAME, fileName);
String mimeType = fileAttachment.getContentType();
fileAttachments.put(UtilConstants.ATTACHMENT_MIME_TYPE, mimeType);
log.info("File Name: " + fileName + " File Size: " + attachmentSize);
if (attachmentContent != null && attachmentContent.length > 0) {
//convert the content to base64 encoded string and add to the collection.
String base64Encoded = UtilFunctions.encodeToBase64(attachmentContent);
fileAttachments.put(UtilConstants.ATTACHMENT_CONTENT, base64Encoded);
}
//Extract Item Attachment
try {
ItemAttachment itemAttachment = (ItemAttachment) attachment;
PropertySet propertySet = new PropertySet(
BasePropertySet.FirstClassProperties, ItemSchema.Attachments,
ItemSchema.Body, ItemSchema.Id, ItemSchema.DateTimeReceived,
EmailMessageSchema.DateTimeReceived, EmailMessageSchema.Body);
itemAttachment.load();
propertySet.setRequestedBodyType(BodyType.Text);
Item item = itemAttachment.getItem();
eBody = appendItemBody(item, appendedBody.get(UtilConstants.BODY_CONTENT));
appendedBody.put(UtilConstants.BODY_CONTENT, eBody);
/*
* We need to check if Item attachment has further more
* attachments like .msg attachment, which is an outlook email
* as attachment. Yes, we can attach an email chain as
* attachment and that email chain can have multiple
* attachments.
*/
AttachmentCollection childAttachments = item.getAttachments();
//check if not empty collection. move on
if (childAttachments != null && !childAttachments.getItems().isEmpty() && childAttachments.getCount() > 0) {
for (Attachment childAttachment : childAttachments) {
if (childAttachment instanceof FileAttachment) {
itemAttachments.putAll(extractFileAttachments(childAttachment, properties, emailIdentifier));
} else if (childAttachment instanceof ItemAttachment) {
itemAttachments = extractItemAttachments(service, childAttachment, properties, appendedBody, emailIdentifier);
}
}
}
} catch (Exception e) {
throw new Exception("Exception while extracting Item Attachments: " + e.getMessage(), e); // keep the original cause
}