How to put Unicode char in Java String? [duplicate] - java

How to put Unicode char U+1F604 in Java String?
I tried
String s = "\u1F604";
but it is equivalent to
String s = "\u1F60"+"4";
so it was split into 2 chars.

DuncG's answer is a good way of doing it. The short explanation is that a Java char is a 16-bit UTF-16 code unit, so the string literal escape takes exactly four hex digits: \u####. Code points above U+FFFF, which include most emoji, are encoded in UTF-16 as a surrogate pair. Unicode reserves U+D800 to U+DFFF for the two halves of these pairs, which allows 1024 x 1024 supplementary characters.
A different way of doing it that doesn't require working out the surrogate pair by hand is to use Character.toChars(...):
public class Main {
    public static void main(String[] args) {
        String s = "Hello " + new String(Character.toChars(0x1f604)) + "!";
        System.out.println(s);
    }
}

A third variant uses Character.toString(int), available since Java 11:
public class Main {
    public static void main(String[] args) {
        String s1 = "Hello " + Character.toString(0x1f604) + "!"; // since Java 11
        String s2 = "Hello " + new String(new int[]{0x1f604}, 0, 1) + "!"; // before Java 11
        System.out.println(s1 + " " + s2);
    }
}
(Note that some other languages use \U0001f604 for this. Java has no \U escape: \u takes exactly four hex digits, so a supplementary character needs a surrogate pair or one of the code point methods.)

The UTF-16 encoding of your character U+1F604 is 0xD83D 0xDE04, so it should be:
String s = "\uD83D\uDE04";
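If you would rather not work out the surrogate pair by hand, the standard Character.highSurrogate and Character.lowSurrogate methods (Java 7 and later) can compute it. The following is only a sketch; the class name SurrogateDemo and the helper escapeFor are made up for illustration.

```java
public class SurrogateDemo {
    // Hypothetical helper: formats the two UTF-16 code units of a
    // supplementary code point as Java-style escape text.
    static String escapeFor(int codePoint) {
        char high = Character.highSurrogate(codePoint);
        char low = Character.lowSurrogate(codePoint);
        // Cast to int because %X does not accept a char argument.
        return String.format("\\u%04X\\u%04X", (int) high, (int) low);
    }

    public static void main(String[] args) {
        int cp = 0x1F604;
        // Prints the escape text for the pair D83D, DE04.
        System.out.println(escapeFor(cp));
        String s = new String(Character.toChars(cp));
        System.out.println(s.length());                      // 2 UTF-16 code units
        System.out.println(s.codePointCount(0, s.length())); // 1 code point
    }
}
```

The last two lines show why the original literal was "split": length() counts UTF-16 code units, not code points.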

You can add this smiley face to a string as the symbol itself, as a surrogate pair written with \u escapes, or from its supplementary code point.
// symbol itself
String str1 = "😄";
// surrogate pair
String str2 = "\uD83D\uDE04";
// surrogate pair to its supplementary code point value
int cp = Character.toCodePoint('\uD83D', (char) 0xDE04);
// since 11 - decimal codepoint to string
String str3 = Character.toString(cp);
// since 11 - hexadecimal codepoint to string
String str4 = Character.toString(0x1f604);
// output
System.out.println(str1 + " " + str2 + " " + str3 + " " + str4);
Output:
😄 😄 😄 😄

If you have a string representation of a hexadecimal value of a character, you can read a numeric value using Integer.parseInt method.
// surrogate pair
char high = (char) Integer.parseInt("D83D", 16);
char low = (char) Integer.parseInt("DE04", 16);
String str1 = new String(new char[]{high, low});
// supplementary code point
int cp = Integer.parseInt("1F604", 16);
char[] chars = Character.toChars(cp);
String str2 = new String(chars);
// since 11
String str3 = Character.toString(cp);
// output
System.out.println(str1 + " " + str2 + " " + str3);
Output:
😄 😄 😄

Related

split String If get any capital letters

My String:
BByTTheWay. I want to split the string as B By T The Way BByTTheWay. That means I want to split the string at every capital letter and append the original string unchanged at the end. This is what I have tried so far in Java:
public String breakWord(String fileAsString) throws FileNotFoundException, IOException {
    String allWord = "";
    String allmethod = "";
    String[] splitString = fileAsString.split(" ");
    for (int i = 0; i < splitString.length; i++) {
        String k = splitString[i].replaceAll("([A-Z])(?![A-Z])", " $1").trim();
        allWord = k.concat(" " + splitString[i]);
        allWord = Arrays.stream(allWord.split("\\s+")).distinct().collect(Collectors.joining(" "));
        allmethod = allmethod + " " + allWord;
        // System.out.print(allmethod);
    }
    return allmethod;
}
It gives me the output: B ByT The Way BByTTheWay. I hope the Stack Overflow community can help me solve this.
You may use this code:
Code 1
String s = "BByTTheWay";
Pattern p = Pattern.compile("\\p{Lu}\\p{Ll}*");
String out = p.matcher(s)
        .results()
        .map(MatchResult::group)
        .collect(Collectors.joining(" "))
        + " " + s;
//=> "B By T The Way BByTTheWay"
The regex \\p{Lu}\\p{Ll}* matches any Unicode uppercase letter followed by 0 or more lowercase letters. Note that Matcher.results() requires Java 9 or later.
Or use String.split with a lookahead that splits before each uppercase letter, and join the pieces back later:
Code 2
String out = Arrays.stream(s.split("(?=\\p{Lu})"))
        .collect(Collectors.joining(" ")) + " " + s;
//=> "B By T The Way BByTTheWay"
Use
String s = "BByTTheWay";
Pattern p = Pattern.compile("[A-Z][a-z]*");
Matcher m = p.matcher(s);
String r = "";
while (m.find()) {
r = r + m.group(0) + " ";
}
System.out.println(r + s);
Results: B By T The Way BByTTheWay
EXPLANATION
[A-Z]    any character from 'A' to 'Z'
[a-z]*   any character from 'a' to 'z' (0 or more times, matching as many as possible)
As per the requirements, you can also do it without a regex by checking whether each character is an uppercase letter:
char[] chars = fileAsString.toCharArray();
StringBuilder fragment = new StringBuilder();
for (char ch : chars) {
    if (Character.isLetter(ch) && Character.isUpperCase(ch)) { // works for non-ASCII letters too
        fragment.append(" ");
    }
    fragment.append(ch);
}
// keep the result instead of discarding it
String result = (fragment + " " + fileAsString).trim(); // B By T The Way BByTTheWay
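The question's method receives text containing several space-separated words, while the answers handle a single word. A minimal sketch of applying the split-before-uppercase idea to each word is shown below. The class name BreakWords and the choice to leave words without an inner capital untouched are assumptions, not something the question specifies.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class BreakWords {
    // Splits one word before every uppercase letter and appends the original word.
    // Words that do not split (no inner capital) are returned unchanged (an assumption).
    static String breakWord(String word) {
        String[] pieces = word.split("(?=\\p{Lu})");
        if (pieces.length < 2) {
            return word;
        }
        return String.join(" ", pieces) + " " + word;
    }

    // Applies breakWord to every whitespace-separated word of the input.
    static String breakAll(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .map(BreakWords::breakWord)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(breakAll("BByTTheWay getName"));
        // B By T The Way BByTTheWay get Name getName
    }
}
```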

Password Regex with ANY THREE combination but no spaces

Ensure Password meets the following Criteria
Length: 8-32 characters
Password must contain at least 3 of the following: Uppercase, Lowercase, Number, Symbol
Password must not contain Spaces
Password Characters allowed:
!##$%^*()_+Aa1~`-={}[]|\:;"',.?/
I tried:
^.*(?=.*[a-z])(?=.*[A-Z])(?=.*[\d\W])^(?!.*[&])(?=.*[!##$%^*()_+Aa1~`\-={}[\]|\:;"',.?/])\S{8,32}$
I have written this regex, but it enforces one uppercase letter, which should not be the case. It should accept any three of uppercase, lowercase, numbers, and symbols:
!##$%^*()_+Aa1~`-={}[]|\:;"',.?/
The regex is limited to 255 characters. Any suggestions would help.
^(?:[A-Z]()|[a-z]()|[0-9]()|[!##$%^*()_+~`={}\[\]|\\:;"',.?/-]())+$(?:\1\2\3|\1\2\4|\1\3\4|\2\3\4)
In more readable form:
^
(?:
[A-Z] () |
[a-z] () |
[0-9] () |
[!##$%^*()_+~`={}\[\]|\\:;"',.?/-] ()
)+
$
(?:
\1\2\3 |
\1\2\4 |
\1\3\4 |
\2\3\4
)
What I'm doing is using empty capturing groups as check boxes, tallying which kinds of characters were seen over the course of the match. So, for example, if there's no uppercase letter in the string, group #1 never participates in the match, so \1 won't succeed at the end. Unless all three other groups do participate, the match will fail.
Be aware that this technique doesn't work in all flavors. In JavaScript, for example, a backreference to an empty group always succeeds, even if the group didn't participate in the match.
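As a rough illustration of the check-box technique in Java, the sketch below uses a shortened symbol class and adds a length lookahead. Both are assumptions made for this example rather than part of the pattern above, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class PasswordCheckboxes {
    // Each empty group () acts as a check box for one character kind:
    // 1 = uppercase, 2 = lowercase, 3 = digit, 4 = symbol.
    // The symbol class is shortened here (an assumption for readability).
    private static final Pattern RULE = Pattern.compile(
            "^(?=.{8,32}$)"
            + "(?:[A-Z]()|[a-z]()|[0-9]()|[!#$%^*_+=.?-]())+$"
            + "(?:\\1\\2\\3|\\1\\2\\4|\\1\\3\\4|\\2\\3\\4)");

    static boolean isValid(String password) {
        return RULE.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcDEF123"));  // true: upper, lower, digit
        System.out.println(isValid("abc123!!"));   // true: lower, digit, symbol
        System.out.println(isValid("abcdef12"));   // false: only two kinds
        System.out.println(isValid("abc DEF123")); // false: contains a space
    }
}
```

Because a space is not in any of the four classes, the repeated group cannot consume it, so the no-spaces rule needs no separate check.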
You can use:
^(?=.*[!##$%^*()_+~`={}|:;"',.?\[\]\/-].*)(?=.*[A-Z].*)(?=.*[a-z].*)[\w!##$%^*()_+~`={}|:;"',.?\[\]\/-]{8,32}$|
^(?=.*[!##$%^*()_+~`={}|:;"',.?\[\]\/-].*)(?=.*\d.*)(?=.*[a-z].*)[\w!##$%^*()_+~`={}|:;"',.?\[\]\/-]{8,32}$|
^(?=.*[!##$%^*()_+~`={}|:;"',.?\[\]\/-].*)(?=.*[A-Z].*)(?=.*\d.*)[\w!##$%^*()_+~`={}|:;"',.?\[\]\/-]{8,32}$|
^(?=.*\d.*)(?=.*[A-Z].*)(?=.*[a-z].*)[\w!##$%^*()_+~`={}|:;"',.?\[\]\/-]{8,32}$
If you want a bit more of flexibility you can use this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Dummy{
    public static void main(String []args){
        String password = "aaaZZZ111";
        String specials = "!##$%^*()_+~`={}|:;\"',.?\\[\\]\\/-";
        String uppercase = "A-Z";
        String lowercase = "a-z";
        String numbers = "\\d";
        // specials goes last so its trailing '-' stays a literal hyphen
        // instead of forming an unintended range with the next character
        String all = uppercase + lowercase + numbers + specials;
        int min = 8;
        int max = 32;
        String regex =
            "^" + lookahead(specials) + lookahead(uppercase) + lookahead(lowercase) + "[" + all + "]{"+ min +"," + max + "}$|" +
            "^" + lookahead(specials) + lookahead(uppercase) + lookahead(numbers) + "[" + all + "]{"+ min +"," + max + "}$|" +
            "^" + lookahead(specials) + lookahead(lowercase) + lookahead(numbers) + "[" + all + "]{"+ min +"," + max + "}$|" +
            "^" + lookahead(uppercase) + lookahead(lowercase) + lookahead(numbers) + "[" + all + "]{"+ min +"," + max + "}$";
        Pattern r = Pattern.compile(regex);
        Matcher m = r.matcher(password);
        if (m.find()) {
            System.out.println("OK");
        } else {
            System.out.println("NO MATCH");
        }
    }
    // builds a lookahead requiring at least one character from the given class
    public static String lookahead(String input) {
        return "(?=.*[" + input + "].*)";
    }
}

String method output not understanding

I don't understand how the code below produces its output. Could someone please explain how it comes about?
public class mystery{
    public static void main(String[] args) {
        System.out.println(serios("DELIVER"));
    }
    public static String serios(String s)
    {
        String s1 = s.substring(0,1);
        System.out.println(s1);
        String s2 = s.substring(1, s.length() - 1);
        System.out.println(s2);
        String s3 = s.substring(s.length() - 1);
        System.out.println(s3);
        if (s.length() <= 3)
            return s3 + s2 + s1;
        else
            return s1 + serios(s2) + s3;
    }
}
Output:
D
ELIVE
R
E
LIV
E
L
I
V
DEVILER
Thanks !!
For this chunk of code
String s1 = s.substring(0,1);//this initializes s1 = D as Substring reads up to right before the ending index which is 1.
System.out.println(s1);//print s1
This chunk
String s2 = s.substring(1, s.length() - 1);//Starts where the previous chunk left off, ends right before the ending initializing s2 = ELIVE
System.out.println(s2);//print s2
Final Chunk
String s3 = s.substring(s.length() - 1);//This chunk starts from the end and captures R
System.out.println(s3);//print s3
These three chunks and their print statements will give you
D ELIVE R
Now let's move on.
The final return statement returns s1 + serios(s2) + s3. This is recursion: a method calling itself.
The recursion continues until the if condition is met, and the combined result, DEVILER, is then printed by main.
You can see a pattern to understand it better.
DELIVER is printed as D ELIVE R when run through the code. The first and last letters are separated from the center of the word.
return s1 + serios(s2) + s3;
Since s2 = ELIVE, that becomes the new s. It is split apart using substring just like DELIVER to become E LIV E, setting
LIV = s2
s will now equal LIV, and be split apart and printed out as
L I V
Now the length of s is 3, so the if branch runs and returns s3 + s2 + s1 = VIL. The pending calls then finish: E + VIL + E = EVILE, and D + EVILE + R = DEVILER, which main prints.
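To watch the return values as the recursion unwinds, a variant of the method can log what each call returns. This is only a sketch: the class name MysteryReturns and the extra indent parameter are additions for illustration, and the printing of s1, s2 and s3 is left out.

```java
public class MysteryReturns {
    // Same logic as serios, but prints the value each call returns.
    // The indent parameter is hypothetical and only affects the log layout.
    static String serios(String s, String indent) {
        String s1 = s.substring(0, 1);
        String s2 = s.substring(1, s.length() - 1);
        String s3 = s.substring(s.length() - 1);
        String result;
        if (s.length() <= 3) {
            result = s3 + s2 + s1;                          // base case: swap first and last
        } else {
            result = s1 + serios(s2, indent + "  ") + s3;   // recurse on the middle part
        }
        System.out.println(indent + "serios(\"" + s + "\") returns \"" + result + "\"");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(serios("DELIVER", ""));
    }
}
```

The innermost call finishes first, so the log reads from the inside out: serios("LIV") returns "VIL", then serios("ELIVE") returns "EVILE", then serios("DELIVER") returns "DEVILER".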
Apart from what substring is doing, I think your problem is about the recursive behavior of the serios method.
On the first call you pass "DELIVER".
In the following lines you can see that if the input is longer than 3 characters, the method calls itself again, this time with s2. For the first call, s2 = ELIVE.
if (s.length() <= 3)
    return s3 + s2 + s1;
else
    return s1 + serios(s2) + s3;
You can think of this as running serios("ELIVE"); the same process applies, and this time s2 is "LIV", so serios("LIV") is called. That string has length 3, so the recursion stops and the if branch runs:
if (s.length() <= 3)
    return s3 + s2 + s1;
I hope this helps.
For this type of task, it helps to trace the method calls:
public class mystery {
    public static void main(String[] args)
    {
        serios("DELIVER", "");
    }
    public static String serios(String s, String indentation)
    {
        String s1 = s.substring(0, 1);
        System.out.println(indentation + "\"" + s1 + "\" is the substring of \"" + s + "\" at 0");
        String s2 = s.substring(1, s.length() - 1);
        System.out.println(indentation + "\"" +s2 + "\" is the substring of \"" + s + "\" from 1 to " + (s.length() - 2));
        String s3 = s.substring(s.length() - 1);
        System.out.println(indentation + "\"" + s3 + "\" is the substring of \"" + s + "\" at " + (s.length() - 1));
        if (s.length() <= 3)
            return s3 + s2 + s1;
        else
        {
            indentation += " ";
            return s1 + serios(s2, indentation) + s3;
        }
    }
}
Output:
"D" is the substring of "DELIVER" at 0
"ELIVE" is the substring of "DELIVER" from 1 to 5
"R" is the substring of "DELIVER" at 6
 "E" is the substring of "ELIVE" at 0
 "LIV" is the substring of "ELIVE" from 1 to 3
 "E" is the substring of "ELIVE" at 4
  "L" is the substring of "LIV" at 0
  "I" is the substring of "LIV" from 1 to 1
  "V" is the substring of "LIV" at 2

What do these quotation marks represent in System.out.println()?

I wonder why the double quotation marks are not shown in the actual output - just after the equal sign:
String word = "" + c1 + c2 + "ll";
The full code as follows:
public class InstantMethodsIndexOf
{
    public void start()
    {
        String greeting = new String ("Hello World");
        int position = greeting.indexOf('r');
        char c1 = greeting.charAt(position + 2);
        char c2 = greeting.charAt(position - 1);
        String word = "" + c1 + c2 + "ll";
        System.out.println(word);
    }
}
The literal "" is an empty String, so it contributes nothing to the output. You need to escape the quotation mark with a backslash if you want to print it.
Example:
String word = "\"" + c1 + c2 + "ll\"";
then System.out.println(word) will print:
"Hell"
As you can see I am escaping one double quotation at the beginning and another at the end
(Assuming c1 == 'H' and c2 == 'e')
The quotation mark does not appear because you have none being printed. What you have is an empty string being concatenated with other contents.
If you need the quotation mark, then you should do the following:
String word = "\"" + c1 + c2 + "ll";
It's a way to let Java know that it will be a string straight from the beginning, since "" is a String object of an empty string.
In your code, it doesn't really look useful. But following is an example where it would be:
int a=10, b=20;
String word = a + b + "he"; // word = "30he"
String word2 = "" + a + b + "he"; // word2 = "1020he"
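The same left-to-right evaluation applies to the char variables in the question. As a small sketch (the class and method names are made up for illustration), dropping the leading "" makes the two chars add up as numbers before the String is reached:

```java
public class ConcatOrder {
    // Without a leading String, char + char is int addition.
    static String withoutPrefix(char c1, char c2) {
        return c1 + c2 + "ll";
    }

    // With a leading "", every + is String concatenation.
    static String withPrefix(char c1, char c2) {
        return "" + c1 + c2 + "ll";
    }

    public static void main(String[] args) {
        System.out.println(withoutPrefix('d', 'o')); // 211ll, because 'd' is 100 and 'o' is 111
        System.out.println(withPrefix('d', 'o'));    // doll
    }
}
```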
I wonder why the double quotation marks is not shown in the actual
output - just after the equal sign:
String word = "" + c1 + c2 + "ll";
You are declaring a String that concatenates:
The empty String ""
c1
c2
The String literal "ll"
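To show the quotes, try:
String word = "" + '\u0022' + c1 + c2 + "ll" + '\u0022';
which uses the Unicode escape for the double quote character. The leading "" is still needed: without it, '\u0022' + c1 + c2 would be added together as numbers before being concatenated.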
I wonder why the double quotation marks is not shown in the actual
output - just after the equal sign:
In Java, a String literal is written between double quotes; the quotes delimit the value and are not part of it. If you want to include a double quote in the value, you have to use the escape sequence \".
Moreover I suggest you to use StringBuilder and append your characters and String into it and use toString to print.
String str="ABC";//So value of String literal is ABC not "ABC"
String empty="";//is just empty but NOT Null
String quote="\"";//Here String has value " (One Double Quote)
This code
String greeting = "Hello World"; // <-- no need for new String()
int position = greeting.indexOf('r'); // <-- 8
char c1 = greeting.charAt(position + 2); // <-- 'd'
char c2 = greeting.charAt(position - 1); // <-- 'o'
String word = "" + c1 + c2 + "ll"; // <-- "" + 'd' + 'o' + "ll"
The empty String "" forces the + operators to perform String concatenation rather than char arithmetic, so it could also be written as
StringBuilder sb = new StringBuilder();
sb.append(c1).append(c2).append("ll");
String word = sb.toString();
or
StringBuilder sb = new StringBuilder("ll");
sb.insert(0, c2);
sb.insert(0, c1);
String word = sb.toString();
If you wanted to include double quotes in your word, you could escape them with a backslash (\") or use a char literal:
char c1 = greeting.charAt(position + 2); // <-- 'd'
char c2 = greeting.charAt(position - 1); // <-- 'o'
String word = "\"" + c1 + c2 + "ll\""; // <-- "\"" + 'd' + 'o' + "ll\""
or
String word = "" + '"' + c1 + c2 + "ll" + '"';

Regex for validating numbers with certain length and beginning

I would like to validate a value which should have numbers only and length should be 11 and should not start with 129.
Is this possible? I am not very proficient with regular expressions.
Use a negative lookahead. The regex should be ^(?!129)\d{11}$. To turn that into a Java pattern, escape the backslash.
You can use
String num_regex = "(?!129)[0-9]{11}"; // matches() already anchors at both ends
String testString= "12345678910";
boolean b = testString.matches(num_regex);
System.out.println("String: " + testString + " :Valid = " + b);
testString= "12945678910";
b = testString.matches(num_regex);
System.out.println("String: " + testString + " :Valid = " + b);
OUTPUT:
String: 12345678910 :Valid = true
String: 12945678910 :Valid = false
