Convert a string containing Unicode escapes to show the actual characters in Java?


I am trying to convert strings that contain a Unicode escape into the actual character, but everything I have found so far either only works when the string consists solely of the escape sequence, or converts the symbol to its code instead.
This is the string I am using as an example right now:
Rebroadcast of Shows from the past Week! RPGs, Talk shows, Science, Wisdom, Vampires and more - Good stuff! \\u003c3 - !rbschedule for more info
I am getting this from an API call, so I can't just write it with \ instead of \\.

The \\ is called escaping, and it is what currently blocks you from seeing the < character.
Un-escaping is not something you want to do by hand, as there are many caveats.
You might want to use Apache Commons Text, StringEscapeUtils#unescapeJava:
    final String result = StringEscapeUtils.unescapeJava(yourString);
That will output "...Good stuff! <3 - !rbschedule for more info..."
The Maven dependency:
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-text</artifactId>
        <version>1.6</version>
    </dependency>
Or for Gradle:
    implementation group: 'org.apache.commons', name: 'commons-text', version: '1.6'
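If pulling in a library is not an option, the Unicode escapes alone can be decoded with the JDK's regex classes. The following is only a minimal sketch under that assumption: the class and method names are made up for illustration, it handles only backslash-u escapes with four hex digits (not other escapes such as newline, tab or octal, and not an escaped backslash in front of the u), and it is not a replacement for StringEscapeUtils#unescapeJava.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
class UnicodeEscapeDecoder {

    // A literal backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    static String decode(String input) {
        Matcher matcher = UNICODE_ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (matcher.find()) {
            // Copy the text between the previous escape and this one unchanged.
            out.append(input, last, matcher.start());
            // Turn the four hex digits into the UTF-16 code unit they denote.
            out.append((char) Integer.parseInt(matcher.group(1), 16));
            last = matcher.end();
        }
        // Copy whatever follows the last escape.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash in this Java literal is a single backslash at
        // runtime, which is what an API response typically contains.
        String fromApi = "Good stuff! \\u003c3 - !rbschedule for more info";
        System.out.println(decode(fromApi));
        // prints: Good stuff! <3 - !rbschedule for more info
    }
}
```

Because surrogate pairs are written as two consecutive escapes, appending each decoded code unit in order also reassembles characters outside the Basic Multilingual Plane.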

What if you replace every "\\" with "\" dynamically?

Related

Reliable approach to unescape HTML characters in Java (Android)

Apart from doing the replacements myself, I haven't found a reliable way to unescape all kinds of HTML characters.
StringEscapeUtils currently seems the best option to me, but it doesn't work in every case:
My code:
    @Test
    public void escapeHTMLChars2() {
        String sample1 = "&lt;";
        Assert.assertEquals("<", org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(sample1));
        String sample2 = "&ndash;";
        Assert.assertEquals("–", org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(sample2));
    }
Gradle:
    implementation group: 'org.apache.commons', name: 'commons-lang3', version: '3.11'
This leads to:
org.junit.ComparisonFailure:
Expected :-
Actual :?
Is there a way to unescape all kinds of HTML characters without resorting to hacks?

Apache Solr 6.6.1 number mapping in Urdu language

I have configured Apache Solr 6.6.2 to index documents and search them later, and I am facing a problem. If a document contains a number like 1234, I want it to be mapped (copied) to the corresponding Urdu numerals, like ۱۲۳۴. That way the document can be retrieved whether the user enters 1234 or ۱۲۳۴.
Is there a built-in solution in Solr, or how else can I achieve this?

If you are using a Java/SolrJ client for indexing, add the junidecode dependency to your project.
For Gradle:
    implementation group: 'junidecode', name: 'junidecode', version: '0.1.1'
For Maven:
    <dependency>
        <groupId>junidecode</groupId>
        <artifactId>junidecode</artifactId>
        <version>0.1.1</version>
    </dependency>
While indexing, index an additional field with the converted value:
    import net.sf.junidecode.Junidecode;

    String converted = Junidecode.unidecode("۱۲۳۴");
    // converted is "1234"
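If the extra dependency is unwanted, the digit mapping itself can be sketched with the JDK alone, since Character.digit understands the Extended Arabic-Indic digits (U+06F0 to U+06F9) used for Urdu. This is only an illustration: the class and method names are hypothetical, and wiring the result into an additional Solr field is left to the indexing code.

```java
// Hypothetical helper for illustration; not part of Solr or junidecode.
class UrduDigits {

    // Code point of the Extended Arabic-Indic digit zero.
    private static final int URDU_ZERO = 0x06F0;

    // Replaces every Unicode decimal digit with its ASCII counterpart.
    static String toAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (Character.isDigit(cp)) {
                out.append((char) ('0' + Character.digit(cp, 10)));
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Replaces ASCII digits with the corresponding Urdu digits.
    static String toUrdu(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            out.append(c >= '0' && c <= '9' ? (char) (URDU_ZERO + (c - '0')) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escapes below spell the Urdu digits one, two, three, four.
        String urdu = "\u06F1\u06F2\u06F3\u06F4";
        System.out.println(toAscii(urdu)); // prints: 1234
        System.out.println(toUrdu("1234").equals(urdu)); // prints: true
    }
}
```

Indexing both forms (or normalizing both the indexed text and the query to one form) is what lets either spelling of the number find the document.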

Convert from CL8ISO8859P5 encoding to ASCII in Java

Perhaps you can give me some hints about what I can do or look at in my case :)
There is Oracle code that converts a given hexadecimal input from AMERICAN_AMERICA.CL8ISO8859P5 to ASCII:
    utl_raw.cast_to_varchar2(utl_raw.convert(hextoraw('31383831303891353080853737303338385A5A'), 'AMERICAN_AMERICA.CL8ISO8859P5', 'AMERICAN_AMERICA.RU8PC866'))
Example input: 31383831303891353080853737303338385A5A, example output: 188108С50АЕ770388ZZ
My problem is how to do this in Java :) Constraint: I have no connection to the database, so I can't execute a prepared SQL statement to call this function in the Oracle package.
I am able to parse everything except some specific bytes (91 -> 'C1', 8085 -> 'B0B5', 5A -> 'Z') with the following code:
    new String(DatatypeConverter.parseHexBinary("31383831303891353080853737303338385A5A"))
I've also tried all the standard encodings in the String constructor that takes an encoding, but without success :(
Do you know whether any Java encoding is identical to AMERICAN_AMERICA.CL8ISO8859P5? Or do you know of any libraries or Java functions that can perform this conversion (AMERICAN_AMERICA.CL8ISO8859P5 to ASCII)?
Many thanks in advance!

The Oracle encoding AMERICAN_AMERICA.RU8PC866 corresponds to the IBM-866 encoding in Java (hint from @kfinity). My issue was solved by using
    new String(DatatypeConverter.parseHexBinary(input), "IBM-866")
CP866 worked as well.
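The approach above can also be written without DatatypeConverter, which lives in javax.xml.bind and is no longer bundled with the JDK since Java 11. The sketch below assumes the runtime provides the IBM866 charset; the class and method names are illustrative only.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration.
class Cp866HexDecoder {

    // Converts a string of hex digit pairs into the bytes they denote.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }

    // Interprets the bytes as CP866 (IBM866) text.
    static String decode(String hex) {
        return new String(hexToBytes(hex), Charset.forName("IBM866"));
    }

    public static void main(String[] args) {
        // Example input from the question; the Cyrillic letters come from
        // the bytes 0x91, 0x80 and 0x85.
        System.out.println(decode("31383831303891353080853737303338385A5A"));
    }
}
```

Passing a Charset object instead of a charset name avoids the checked UnsupportedEncodingException of the String(byte[], String) constructor.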

How to call Java methods in Maven archetype XML

I am trying to install a custom archetype in my local repo. When users generate a project from this archetype, they are required to provide the artifactId, which is usually all lower case. However, the main class name (and hence the Java file name) depends on this artifactId with its first letter capitalized. Instead of asking the user for another variable, I would like to call some String method to convert the artifactId into the correct format for the class name.
According to "Maven archetype: Modify artifactId", it looks like you can embed Java method calls in archetype-metadata.xml as below:
    <requiredProperty key="artifactIdWithUnderscore" >
        <defaultValue>${artifactId.replaceAll("-", "_")}</defaultValue>
    </requiredProperty>
So I did something similar in my archetype-metadata.xml to capitalize the first letter:
    <requiredProperty key="artifactId" />
    <requiredProperty key="serviceName">
        <defaultValue>${artifactId.toLowerCase().substring(0,1).toUpperCase()+artifactId.toLowerCase().substring(1)}</defaultValue>
    </requiredProperty>
However, I got the following parse error:
SEVERE: Parser Exception: serviceName
org.apache.velocity.runtime.parser.ParseException: Encountered "+artifactId.toLowerCase().substring(1)}" at line 1, column 55.
Was expecting one of:
"[" ...
"}" ...
at org.apache.velocity.runtime.parser.Parser.generateParseException(Parser.java:3679)
What is the correct way to use Java String methods in this archetype XML?

The plus character can't be used as a string concatenation operator here, and it isn't needed either. Just place the two references next to each other, without any operator between them:
    <defaultValue>${artifactId.toLowerCase().substring(0,1).toUpperCase()}${artifactId.toLowerCase().substring(1)}</defaultValue>
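Each ${...} reference simply calls java.lang.String methods on the property value, so the expression above can be checked in plain Java before putting it into archetype-metadata.xml. A minimal sketch, with a hypothetical class name and sample artifactId; Locale.ROOT is added here to keep the result independent of the machine's locale, which the Velocity expression itself does not do.

```java
import java.util.Locale;

// Hypothetical demo class; mirrors the two concatenated Velocity references.
class ServiceNameDemo {

    // Lower-cases the artifactId, then upper-cases its first character.
    // Assumes a non-empty artifactId, which a required property should be.
    static String serviceName(String artifactId) {
        String lower = artifactId.toLowerCase(Locale.ROOT);
        return lower.substring(0, 1).toUpperCase(Locale.ROOT) + lower.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(serviceName("myservice")); // prints: Myservice
        System.out.println(serviceName("MyService")); // prints: Myservice
    }
}
```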

    <defaultValue>
        ${artifactId.substring(0,1).toUpperCase()}${artifactId.substring(1).toLowerCase()}
    </defaultValue>
works for me, but artifactId then resolves to the archetype project's own artifactId rather than the artifactId argument I pass in, as I would have expected...

This will capitalize the first letter:
    <requiredProperty key="artifactId" />
    <requiredProperty key="serviceName">
        <defaultValue>${StringUtils.capitalize("artifactId")}</defaultValue>
    </requiredProperty>
Make sure you have the Apache Commons Lang dependency included:
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.5</version>
    </dependency>

Is there any alternative to --guess in Java RunCukes?

I am getting this error:
    cucumber.runtime.AmbiguousStepDefinitionsException: ✽.Then I should validate my username with expected value (features/sanity.feature:32) matches more than one step definition:
    ^I should validate ([^"]*) with expected value in UserSteps.iShouldValidateWithExpectedValue(String)
    ^I should validate my username with expected value$ in HomeSteps.iShouldValidateMyUsernameWithExpectedValue()
When I run with Calabash, I add "--guess" to the run command, but I don't know how to resolve the same problem with Cucumber-JVM in Java.
Is there any possibility to add this in @CucumberOptions?
I am using this Maven dependency:
    <dependency>
        <groupId>info.cukes</groupId>
        <artifactId>cucumber-java</artifactId>
        <version>1.2.5</version>
    </dependency>

--guess is not an option that you can set in the command line runner. Keep in mind that Cucumber can always guess wrong. I recommend simply changing a few words near the beginning of the failing step so that it matches only one step definition. Better safe than sorry.
Searching through the cucumber-jvm source, there is at least one reference to --guess being set by default.
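To see why the step is ambiguous, the two patterns from the error message can be tried against the step text with plain java.util.regex. This is only a rough approximation of how Cucumber-JVM matches steps, not its actual code. The class name is made up, and the "quoted" pattern is a hypothetical rewrite of the generic step definition that requires the value in double quotes, so feature files using it would have to quote the value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; approximates step matching with plain regexes.
class StepAmbiguityDemo {

    // Returns the patterns that match the given step text.
    static List<String> matching(String step, List<String> patterns) {
        List<String> hits = new ArrayList<>();
        for (String pattern : patterns) {
            if (Pattern.compile(pattern).matcher(step).find()) {
                hits.add(pattern);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String step = "I should validate my username with expected value";

        // The two step definitions from the error message.
        String generic = "^I should validate ([^\"]*) with expected value";
        String specific = "^I should validate my username with expected value$";

        // Hypothetical tightened version of the generic definition.
        String quoted = "^I should validate \"([^\"]*)\" with expected value$";

        // Both original patterns match, hence the ambiguity.
        System.out.println(matching(step, Arrays.asList(generic, specific)).size()); // prints: 2

        // With the tightened pattern only the specific definition matches.
        System.out.println(matching(step, Arrays.asList(quoted, specific)).size()); // prints: 1
    }
}
```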
