My application uses Spring Integration to poll email from an Outlook mailbox.
Because it receives the String (the email body) from an external system (Outlook), I have no control over its content.
For Example,
String emailBodyStr= "rejected by sundar14-\u200B.";
Now I am trying to remove the unicode character \u200B from this String.
What I tried already.
Try#1:
emailBodyStr = emailBodyStr.replaceAll("\u200B", "");
Try#2:
`emailBodyStr = emailBodyStr.replaceAll("\u200B", "").trim();`
Try#3 (using Apache Commons):
StringEscapeUtils.unescapeJava(emailBodyStr);
Try#4:
StringEscapeUtils.unescapeJava(emailBodyStr).trim();
Nothing has worked so far.
I print the String using the code below:
logger.info("Comment BEFORE:{}",emailBodyStr);
logger.info("Comment AFTER :{}",emailBodyStr);
In the Eclipse console, it does NOT print the Unicode character:
Comment BEFORE:rejected by sundar14-.
But the same code prints the Unicode character in the Linux console, as below:
Comment BEFORE:rejected by sundar14-\u200B.
I read some examples where str.replace() is recommended, but note that those examples use JavaScript and PHP, not Java.
Finally, I was able to remove the 'Zero Width Space' character by using a Unicode regex.
// \p{Cf} matches every character in the Unicode "format" category, which includes U+200B.
String plainEmailBody = emailBodyStr.replaceAll("\\p{Cf}", "");
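As a rough illustration of the approach above, the sketch below strips every character in the Unicode "format" category (Cf) and checks that U+200B belongs to it. The class and method names are made up for this example, and it assumes the body really contains the U+200B character rather than a literal backslash followed by "u200B".

```java
final class ZeroWidthCleaner {

    // Removes all characters in Unicode general category Cf (format),
    // such as U+200B ZERO WIDTH SPACE.
    static String stripFormatChars(String input) {
        return input.replaceAll("\\p{Cf}", "");
    }

    // Reports whether a single character is classified as a format character.
    static boolean isFormatChar(char c) {
        return Character.getType(c) == Character.FORMAT;
    }

    public static void main(String[] args) {
        String emailBodyStr = "rejected by sundar14-\u200B.";
        String cleaned = stripFormatChars(emailBodyStr);
        System.out.println(emailBodyStr.length() + " -> " + cleaned.length());
        System.out.println(cleaned);
    }
}
```

Because the whole category is removed, other invisible format characters go too, which may or may not be what you want for a given mailbox.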
References for finding the category of a Unicode character:
The Character class in Java lists all of these Unicode categories.
Website: http://www.fileformat.info/
Website: http://www.regular-expressions.info/ => Unicode Regular Expressions
Note 1: Because I received this string from an Outlook email body, none of the approaches listed in my question worked.
My application is receiving a String from an external system
(Outlook), so I have no control over it.
Note 2: This SO answer helped me learn about Unicode regular expressions.
I am using a MessageConsole in Eclipse to display output information. The output is formatted into Error 1 - (MyClass.java:10), which is expected to generate a clickable link to code (MyClass.java line 10, in this case), since the console should be able to parse the pattern (FileName.java:LineNumber) automatically as suggested in this post.
However, it does not work this way. When I use System.out.println() to output this pattern directly in the Eclipse plugin, the link is generated.
I also considered the possibility of multiple consoles in the plugin, but streaming the patterned text to other consoles did not work either. Any insights?
My code is like below:
ConsolePlugin plugin = ConsolePlugin.getDefault();
IConsoleManager conMan = plugin.getConsoleManager();
MessageConsole myConsole = new MessageConsole( name, null );
conMan.addConsoles( new IConsole[]{myConsole} );
MessageConsoleStream out = myConsole.newMessageStream();
out.println("Error 1 - (MyClass.java:10)");
Matching for Java code links is only done for consoles which have the javaStackTraceConsole console type.
So you can use the org.eclipse.ui.console.consolePatternMatchListeners extension point to define your own pattern matcher to do the same thing for your console.
Or you can use the:
public MessageConsole(String name, String consoleType, ImageDescriptor imageDescriptor, boolean autoLifecycle)
constructor to specify the console type for your console to match the existing matchers.
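If you go the custom pattern-matcher route, the listener needs a regular expression for the (FileName.java:LineNumber) pattern. The sketch below only shows such a regex with plain java.util.regex. The class name and the exact expression are assumptions for illustration, not what Eclipse's own matcher uses, and the Eclipse listener wiring is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SourceLinkPattern {

    // Matches "(SomeFile.java:123)"; group 1 is the file name, group 2 the line number.
    static final Pattern LINK = Pattern.compile("\\(([\\w$]+\\.java):(\\d+)\\)");

    // Returns {fileName, lineNumber} for the first link in the text, or null if there is none.
    static String[] firstLink(String text) {
        Matcher m = LINK.matcher(text);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String[] link = firstLink("Error 1 - (MyClass.java:10)");
        System.out.println(link[0] + " line " + link[1]);
    }
}
```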
When I try to call a method with a parameter containing Polish characters, e.g.
node.call("ąćęasdasdęczć")
I get these characters in the input field:
Ä?Ä?Ä?asdasdÄ?czÄ
I don't know where to set the correct encoding: in the Maven pom.xml, or in my IDE? I tried changing UTF-8 to ISO_8859-2 in my IDE settings, but it didn't work. I searched for similar questions, but I didn't find an answer.
#Edit 1
Sample code:
public void findAndSendKeys(String vToSet , By vLocator){
WebElement element;
element = webDriverWait.until(ExpectedConditions.presenceOfElementLocated(vLocator));
element.sendKeys(vToSet);
}
By nameLoc = By.id("First_Name");
findAndSendKeys("ąćęasdasdęczć" , nameLoc );
Then in the input field I get Ä?Ä?Ä?asdasdÄ?czÄ. Converting the string to Basic Latin in my IDE helps, but it's not the solution I need.
I also have problems with fields in classes, e.g. I have a class in which I have to convert a String to Basic Latin:
public class Contacts{
private static final By LOC_ADDRESS_BTN = By.xpath("//button[contains(@aria-label,'Wybór adresu')]");
// it doesn't work, I have to use basic latin and replace "ó" with "\u00f3" in my IDE
}
#Edit 2 - Changed encoding, but problem still exists
I have this weird problem. I have this Java method that works fine in my program:
/*
* Extract all image urls from the html source code
*/
public void extractImageUrlFromSource(ArrayList<String> imgUrls, String html) {
Pattern pattern = Pattern.compile("\\<[ ]*[iI][mM][gG][\t\n\r\f ]+.*[sS][rR][cC][ ]*=[ ]*\".*\".*>");
Matcher matcher = pattern.matcher(html);
while (matcher.find()) {
imgUrls.add(extractImgUrlFromTag(matcher.group()));
}
}
This method works fine in my Java application. But whenever I test it in a JUnit test, it only adds the last URL to the ArrayList:
/**
* Test of extractImageUrlFromSource method, of class ImageDownloaderProc.
*/
@Test
public void testExtractImageUrlFromSource() {
System.out.println("extractImageUrlFromSource");
String html = "<html><title>fdjfakdsd</title><body><img kfjd src=\"http://image1.png\">df<img dsd src=\"http://image2.jpg\"></body><img dsd src=\"http://image3.jpg\"></html>";
ArrayList<String> imgUrls = new ArrayList<String>();
ArrayList<String> expimgUrls = new ArrayList<String>();
expimgUrls.add("http://image1.png");
expimgUrls.add("http://image2.jpg");
expimgUrls.add("http://image3.jpg");
ImageDownloaderProc instance = new ImageDownloaderProc();
instance.extractImageUrlFromSource(imgUrls, html);
imgUrls.stream().forEach((x) -> {
System.out.println(x);
});
assertArrayEquals(expimgUrls.toArray(), imgUrls.toArray());
}
Is JUnit at fault? Remember, it works fine in my application.
I think there is a problem in the regex:
"\\<[ ]*[iI][mM][gG][\t\n\r\f ]+.*[sS][rR][cC][ ]*=[ ]*\".*\".*>"
The problem (or at least one problem) is the first .*. The + and * quantifiers are greedy, which means that they will attempt to match as many characters as possible. In your unit test, I think that the .* is matching everything up to the last 'src' in the input string.
I suspect that the reason that this "works" in your application is that the input data is different. Specifically, I suspect that you are running your application on input files where each img element is on a different line. Why does this make a difference? Well, it turns out that by default, the . metacharacter does not match line breaks.
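To make the greedy behaviour concrete, here is a small sketch that runs the question's pattern and a bounded alternative over the single-line HTML from the unit test. The class name, the helper methods and the BOUNDED pattern are assumptions for illustration; the alternative is one possible rewrite, not the only one, and it still shares the usual weaknesses of regex-based HTML handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImgSrcRegexDemo {

    // The pattern from the question: every ".*" is greedy.
    static final Pattern GREEDY = Pattern.compile(
            "\\<[ ]*[iI][mM][gG][\t\n\r\f ]+.*[sS][rR][cC][ ]*=[ ]*\".*\".*>");

    // Hypothetical alternative: [^>]* and [^"]* cannot run past the end of a tag or value.
    static final Pattern BOUNDED = Pattern.compile(
            "<\\s*img\\s+[^>]*?src\\s*=\\s*\"([^\"]*)\"[^>]*>", Pattern.CASE_INSENSITIVE);

    // Counts how many separate matches a pattern finds in the input.
    static int countMatches(Pattern p, String html) {
        Matcher m = p.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Collects the src value of every img tag found by the bounded pattern.
    static List<String> extractSrc(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = BOUNDED.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }
}
```

On input with all the img tags on one line, the greedy pattern produces a single match that swallows all three tags, while the bounded one yields one match per tag.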
For what it is worth, using regexes to "parse" HTML is generally thought to be a bad idea. For a start, it is horribly fragile. People who do a lot of this kind of stuff tend to use proper HTML parsers ... like "jsoup".
Reference: RegEx match open tags except XHTML self-contained tags
I wish I could comment as I'm not sure about this, but it might be worth mentioning...
This line looks like it's extracting the URLs from the wrong array...did you mean to extract from expimgUrls instead of imgUrls?
instance.extractImageUrlFromSource(imgUrls, html);
I haven't gotten this far in my Java education so I may be incorrect...I just looked over the code and noticed it. I hope someone else who knows more can actually give you a solid answer!
How do I open the default mail program with a Subject and Body in a cross-platform way?
Unfortunately, this is for a client app written in Java, not a website.
I would like this to work in a cross-platform way (which means Windows and Mac, sorry Linux). I am happy to execute a VBScript in Windows, or AppleScript in OS X. But I have no idea what those scripts should contain. I would love to execute the user's default program vs. just searching for Outlook or whatever.
In OS X, I have tried executing the command:
open mailto:?subject=MySubject&body=TheBody
URL escaping is needed to replace spaces with %20.
Update: On Windows, you have to play all sorts of games to get start to run correctly. Here is the proper Java incantation:
class Win32 extends OS {
public void email(String subject, String body) throws Exception {
String cmd = "cmd.exe /c start \"\" \"" + formatMailto(subject, body) + "\"";
Runtime.getRuntime().exec(cmd);
}
}
In Java 1.6 you have a standard way to open the default mailer of the platform:
the Desktop.mail(URI) method. The URI can be used to set all the fields of the mail (sender, recipients, body, subject).
You can check a full example of desktop integration in Java 1.6 in Using the Desktop API in Java SE 6.
start works fine in Windows (see below). I would use Java's built-in URLEncoder, then just run a second replacement for '+' characters.
start mailto:"?subject=My%20Subject&body=The%20Body"
Never use Runtime.exec(String) on Mac OS X or any other operating system. If you do that, you'll have to figure out how to properly quote all argument strings and so on; it's a pain and very error-prone.
Instead, use Runtime.exec(String[]) which takes an array of already-separated arguments. This is much more appropriate for virtually all uses.
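A minimal sketch of the array form, assuming the macOS open command mentioned in the question; the class and method names are invented for this example. Each array element reaches the process as one argument, so characters such as & or spaces in the URI need no extra quoting.

```java
import java.io.IOException;

final class OpenMailto {

    // Builds the argument array for "open"; each element is passed to the process as-is.
    static String[] openCommand(String mailtoUri) {
        return new String[] {"open", mailtoUri};
    }

    // Launches the command; only meaningful on a system where "open" exists (e.g. macOS).
    static Process launch(String mailtoUri) throws IOException {
        return Runtime.getRuntime().exec(openCommand(mailtoUri));
    }
}
```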
1. Add a Subject Line
You can prefill the subject line in the email by adding the subject preceded by '?subject=' after the email address.
So the link now becomes:
Email Us
2. Send to Multiple Recipients
Mail can be sent to additional recipients either as carbon copies (cc) or blind carbon copies (bcc).
This is done in a similar way, by placing '?cc=someoneelse@theirsite.com' after the initial address.
So the link looks like this:
Email Us
cc can simply be replaced by bcc if you wish to send blind carbon copies.
This can be very useful if you have links on pages with different subjects. You might have the email on each page go to the appropriate person in a company but with a copy of all mails sent to a central address also.
You can of course specify more than one additional recipient, just separate your list of recipients with a comma.
Email Us
Sourced from Getting More From 'mailto' which now 404s. I retrieved the content from waybackmachine.
3. Combining Code
You can combine the various bits of code above by the addition of an '&' between each.
Thus adding
me@mysite.com?subject=Hello&cc=you@yoursite.com&bcc=her@hersite.com
would send an email with the subject 'Hello' to me, you and her.
4. Write the Email
You can also prefill the body of the email with the start of a message, or write the whole message if you like! Adding something to the body of the email is again as simple as above: '?body=' after the email address. However, formatting that email can be a little tricky. To create spaces between words you will have to use hex codes, for example '%20' between each word, and creating new lines means adding '%0D'. Similarly, symbols such as $ signs need to be written in hex code.
If you also wish to add a subject line and send copies to multiple recipients, this can make for a very long and difficult to write bit of code.
It will send a message to three people, with the subject and the message filled in, all you need to do is add your name.
Just look at the code!
<a href="mailto:abbeyvet@outfront.net?CC=spooky@outfront.net
&BCC=thomasbrunt@outfront.net&Subject=Please%2C%20I%20insist
%21&Body=Hi%0DI%20would%20like%20to%20send%20you%20
%241000000%20to%20divide%20as%20you%20see%20fit%20among
%20yourselves%20and%20all%20the%20moderators.%0DPlease%
20let%20me%20know%20to%20whom%20I%20should%20send
%20the%20check.">this link</a>
Note: The original source URL where I found this now returns a 404, so I grabbed the content from the Wayback Machine and posted it here so it doesn't get lost. Also, the OP stated it was not for a website, which is what these examples are, but some of these techniques may still be useful.
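Since the question is about a Java client, the same combination of fields can be assembled in code. The sketch below is only an illustration: the class and method names are invented, the addresses are the placeholder ones from the examples above, and it relies on java.net.URLEncoder with '+' swapped for '%20'.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class MailtoLink {

    // Percent-encodes a field value; URLEncoder emits '+' for spaces, so swap in %20.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Joins the fields with '?' before the first one and '&' before the rest.
    static String build(String to, String subject, String cc, String body) {
        return "mailto:" + to
                + "?subject=" + encode(subject)
                + "&cc=" + encode(cc)
                + "&body=" + encode(body);
    }

    public static void main(String[] args) {
        System.out.println(build("me@mysite.com", "Hello", "you@yoursite.com", "Hi\r\nthere"));
    }
}
```

Note that URLEncoder also encodes the '@' in the cc address as %40, which a mail client is expected to decode again.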
I don't know if Java has some built-in method for urlencoding the string, but this link http://www.permadi.com/tutorial/urlEncoding/ shows some of the most common chars to encode:
; %3B
? %3F
/ %2F
: %3A
# %23
& %26
= %3D
+ %2B
$ %24
, %2C
space %20 or +
% %25
< %3C
> %3E
~ %7E
For percent-encoding mailto URI hnames and hvalues, I use the rules at http://shadow2531.com/opera/testcases/mailto/modern_mailto_uri_scheme.html#encoding. Under http://shadow2531.com/opera/testcases/mailto/modern_mailto_uri_scheme.html#implementations, there's a Java example that may help.
Basically, I use:
private String encodex(final String s) {
try {
return java.net.URLEncoder.encode(s, "utf-8").replaceAll("\\+", "%20").replaceAll("\\%0A", "%0D%0A");
} catch (Throwable x) {
return s;
}
}
The string that's passed in should be a string with \r\n, and stray \r already normalized to \n.
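The normalization step could look roughly like the sketch below, which collapses line endings to \n first and then applies the same replacements as encodex. The class and method names are assumptions for this example, and the exception handling differs from the original on purpose so the helper never returns an unencoded string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class MailtoNewlines {

    // Collapses CRLF and stray CR to LF so every line break is a single '\n'.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    // Same idea as encodex: %20 for spaces, %0D%0A for each line break.
    static String encode(String s) {
        try {
            return URLEncoder.encode(normalize(s), "UTF-8")
                    .replace("+", "%20")
                    .replace("%0A", "%0D%0A");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Without the normalization, an input that already contains \r\n would end up as %0D%0D%0A.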
Also note that just returning the original string on an exception like above is only safe if the mailto URI argument you're passing on the command-line is properly escaped and quoted.
On windows that means:
Quote the argument.
Escape any " inside the quotes with \.
Escape any \ that precede a " or the end of the string with \.
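The three rules above can be sketched as a small helper. This is an illustrative implementation under the usual Windows argument-parsing convention (backslashes are only special when they precede a double quote); the class and method names are made up, and it does not cover the additional escaping cmd.exe itself applies.

```java
final class WindowsArgQuoter {

    // Wraps the argument in double quotes, escaping embedded quotes and the
    // backslashes that directly precede a quote or the end of the argument.
    static String quote(String arg) {
        StringBuilder sb = new StringBuilder("\"");
        int pendingBackslashes = 0;
        for (char c : arg.toCharArray()) {
            if (c == '\\') {
                pendingBackslashes++; // decide how to emit these once we see what follows
            } else if (c == '"') {
                // Double the backslashes before the quote, then escape the quote itself.
                appendBackslashes(sb, pendingBackslashes * 2 + 1);
                sb.append('"');
                pendingBackslashes = 0;
            } else {
                // Backslashes before an ordinary character are literal.
                appendBackslashes(sb, pendingBackslashes);
                sb.append(c);
                pendingBackslashes = 0;
            }
        }
        // Trailing backslashes would otherwise escape the closing quote.
        appendBackslashes(sb, pendingBackslashes * 2);
        return sb.append('"').toString();
    }

    private static void appendBackslashes(StringBuilder sb, int count) {
        for (int i = 0; i < count; i++) {
            sb.append('\\');
        }
    }
}
```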
Also, on Windows, if you're dealing with UTF-16 strings like in Java, you might want to use ShellExecuteW to "open" the mailto URI. If you don't and return s on an exception (where some hvalue isn't completely percent-encoded), you could end up narrowing some wide characters and losing information. But not all mail clients accept Unicode arguments, so ideally you want to pass a properly percent-encoded UTF-8 ASCII argument with ShellExecute.
Like 'start', ShellExecute with "open" should open the mailto URI in the default client.
Not sure about other OS's.
Mailto isn't a bad route to go. But as you mentioned, you'll need to make sure it is encoded correctly.
The main problem with using mailto is with line breaks. Use %0D%0A for line breaks (%0D is the carriage return, %0A the line feed) and %20 for spaces.
Also, keep in mind that a mailto link is treated like a URL and therefore has the same length limitations. See
http://support.microsoft.com/kb/208427, and note the maximum URL length of 2083 characters. This is confirmed for mailto as well
in this article: http://support.microsoft.com/kb/279460/en-us. Some mail clients also have their own limit (I believe older versions of Outlook Express had a much smaller one, something like 483 characters). If you expect a longer string than that, you'll need to look at alternatives.
BTW, you shouldn't have to resort to kicking out a script to do that as long as you can shell out a command from Java (I don't know if you can since I don't do Java).
You may use this...
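import java.awt.Desktop;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
public class MailSender {
    public static void main(String[] args) throws Exception {
        // Placeholder values; substitute your own subject and body.
        String sub = "My Subject";
        String mailBody = "The Body";
        String forUri = String.format("mailto:?subject=%s&body=%s", urlEncode(sub), urlEncode(mailBody));
        // Opens the platform's default mail client with the fields prefilled.
        Desktop.getDesktop().mail(new URI(forUri));
    }
    private static String urlEncode(String str) {
        try {
            // URLEncoder encodes spaces as '+', so replace them with %20.
            return URLEncoder.encode(str, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }
}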
Also for formatting read A simple way of sending emails in Java: mail-to links
I have implemented this, and it works well on OS X. (Ryan's mention of the max URL length has not been codified.)
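public void email(String subject, String body) throws Exception {
    // Build the mailto URI; both fields are percent-encoded below.
    String uri = "mailto:?subject=" + urlEncode(subject) + "&body=" + urlEncode(body);
    // Pass the arguments as an array so no shell-style quoting is needed.
    Runtime.getRuntime().exec(new String[] {"open", uri});
}
private static String urlEncode(String s) {
    StringBuilder sb = new StringBuilder();
    // Encode the UTF-8 bytes so non-ASCII characters become valid %XX sequences.
    for (byte b : s.getBytes(java.nio.charset.StandardCharsets.UTF_8)) {
        int ch = b & 0xFF;
        boolean asciiLetterOrDigit = (ch >= 'A' && ch <= 'Z')
                || (ch >= 'a' && ch <= 'z')
                || (ch >= '0' && ch <= '9');
        if (asciiLetterOrDigit) {
            sb.append((char) ch);
        }
        else {
            sb.append(String.format("%%%02X", ch));
        }
    }
    return sb.toString();
}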
I had to re-implement URLencode because Java's would use + for space and Mail took those literally. Haven't tested on Windows yet.