Problems with encoding in IntelliJ IDEA on Windows


When I launch my program, it crashes on the line
String[] words = tdElement.text().replaceAll("[^a-zA-Zа-яА-Я ]", " ").split(" ");
with the following exception:
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal character range near index 16
[^a-zA-ZР°-СЏРђ-РЇ ]
tdElement contains both English and Russian letters. When there are no Russian letters in tdElement, everything works fine. I went to "Settings" -> "File Encodings" and set the "Global Encoding", "Project Encoding" and "Default encoding for properties files" fields to UTF-8, but it didn't help. Thank you in advance.
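One possible workaround, sketched under the assumption that the compiler is reading the UTF-8 source file with a different charset (the garbled range in the exception message looks like UTF-8 bytes decoded as Windows-1251): write the Cyrillic ranges as regex Unicode escapes so the source file stays pure ASCII. The class and method names below are illustrative and not from the original post. Passing -encoding UTF-8 to javac is another option.

```java
import java.util.Arrays;

class CyrillicSplit {
    // Same character class as in the question, but the Cyrillic ranges are
    // written as regex escapes, so the compiler's source charset no longer
    // matters for this literal.
    static String[] words(String text) {
        return text
                .replaceAll("[^a-zA-Z\\u0430-\\u044F\\u0410-\\u042F ]", " ")
                .split(" ");
    }

    public static void main(String[] args) {
        // The Russian word is also written with escapes to keep this file ASCII-only.
        String sample = "Hello, \u043C\u0438\u0440!";
        System.out.println(Arrays.toString(words(sample)));
    }
}
```

Note that split(" ") still produces empty strings where two spaces are adjacent, exactly as the original expression would.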

Related

Regex Expression working in Notepad++ but not Java

I'm working on a problem I'd rather solve with regex. I test most of my regular expressions in Notepad++, which has worked fine after a few tweaks such as double-escaping some things for Java. However, this expression throws an exception when run in Java, although it runs in Notepad++ just fine. The idea of this code is to be able to mention a different player in the game with a highlighted name.
tl;dr: I'm trying to replace the first occurrence of a specific name in a message.
I've looked around for a while but haven't found a solution, so I thought I might as well ask here.
p.getName() simply returns a string (the player's name).
String newmessage = message.replaceFirst("(?i)" + Pattern.quote(p.getName()) + "((?(?=\\s)|('|,|!))|$)", color + p.getName() + Color.toString(Color.getLastColorOf(message)));
However, executing the code throws this exception:
...at java.lang.Thread.run(Unknown Source) [?:1.8.0_202]
Caused by: java.util.regex.PatternSyntaxException: Unknown inline modifier near index 15
(?i)\QTauCubed\E((?(?=\s)|('|,))|$)
^
at java.util.regex.Pattern.error(Unknown Source) ~[?:1.8.0_202]...
I'm not sure what it wants me to do; I don't see how this is not valid regex.
This is the regex for Notepad++
(?i)Name((?(?=\s)|('|,|!))|$)
The above will match
Name's r
Name
Name test
Name,
Name!
But will not match
Nametest
That is what I intended it to do.

I vote for just using the pattern \bName\b along with String#replaceFirst:
String input = "Rename here is a Name and here is the same Name again.";
input = input.replaceFirst("\\bName\\b", "blah");
System.out.println(input);
This prints:
Rename here is a blah and here is the same Name again.
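For context, java.util.regex does not support the conditional construct (?(condition)yes|no) that Notepad++ accepts, which is why Java reports an unknown inline modifier. If the trailing characters from the question (whitespace, ', comma, ! or end of input) should be kept as the only allowed followers, a lookahead is one way to express that. This is a minimal sketch: the helper name and the prefix/suffix parameters are hypothetical stand-ins for the color strings in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MentionHighlighter {
    // Hypothetical helper: wraps the first occurrence of name that is
    // followed by whitespace, ', comma, ! or end of input.
    static String highlightFirst(String message, String name,
                                 String prefix, String suffix) {
        Pattern p = Pattern.compile(
                "(?i)" + Pattern.quote(name) + "(?=[\\s',!]|$)");
        // quoteReplacement keeps any $ or \ in the replacement literal.
        String replacement = Matcher.quoteReplacement(prefix + name + suffix);
        return p.matcher(message).replaceFirst(replacement);
    }

    public static void main(String[] args) {
        System.out.println(highlightFirst("hey Name, hi", "Name", "<", ">"));
        System.out.println(highlightFirst("Nametest", "Name", "<", ">"));
    }
}
```

Unlike the original expression, the lookahead does not consume the following character, so punctuation after the name is left in place. Like the original, it does not require a boundary before the name.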

substring using \(backslash) in java

I want to get the file name from the complete path of a file.
Input : "D://amol//1/\15_amol.jpeg"
Expected Output : 15_amol.jpeg
I have written the code below for this:
public class JavaApplication9 {
    public static void main(String[] args) {
        String fname="D://amol//1/\15_amol.jpeg";
        System.out.println(fname.substring(fname.lastIndexOf("/")));
        System.out.println(fname.substring(fname.lastIndexOf("\\")));
    }
}
but I get the output below:
_amol.jpeg
Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
String index out of range: -1
at java.lang.String.substring(String.java:1927)
at javaapplication9.JavaApplication9.main(JavaApplication9.java:6)
C:\Users\lakhan.kamble\AppData\Local\NetBeans\Cache\8.1\executor-snippets\run.xml:53:
Java returned: 1

The sequence \15 is an "octal escape" for the carriage return character (0x0d, 13 decimal). There are two possibilities here.
1. You really meant \15 to be the octal escape, in which case you are trying to create a filename with an embedded carriage return. The actual contents of fname in this case could be expressed as
"D://amol//1/" + "\r" + "_amol.jpeg";
Windows will prevent that from happening and your program will throw an IOException.
2. You really meant
String fname="D://amol//1/\\15_amol.jpeg";
In this case the extra backslash is redundant and will be ignored by Windows because the filename will resolve (in Windows path terms) to D:\amol\1\\15_amol.jpeg and adjacent directory separators collapse to a single separator. So you could just omit the extra backslash altogether without changing the effective path.
As to your exception, the string as shown DOES NOT contain a backslash character (case 1 above), so
fname.lastIndexOf("\\")
returned -1, causing the exception.
This also accounts for the first line of output: the first println writes "/", the carriage return and "_amol.jpeg", and on a console the carriage return typically moves the cursor back to the start of the line, so only _amol.jpeg remains visible.
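A minimal sketch of case 2, assuming the backslash was meant literally and the path may mix both separators. The class and method names are illustrative. Taking the larger of the two lastIndexOf results avoids the -1 problem, because substring(0) simply returns the whole string when no separator is present.

```java
class FileNameDemo {
    // Returns everything after the last '/' or '\' in path,
    // or the whole path if it contains neither separator.
    static String fileName(String path) {
        int cut = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(cut + 1);
    }

    public static void main(String[] args) {
        // The backslash is escaped, so the string really contains "\15".
        String fname = "D://amol//1/\\15_amol.jpeg";
        System.out.println(fileName(fname));

        // Without the escape, the octal sequence is a single carriage return.
        System.out.println((int) "\15".charAt(0)); // 13
    }
}
```

For real file system paths, java.nio.file.Paths.get(...).getFileName() is usually a better fit than manual string slicing.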

Running an Imported android project

I have a problem running an imported Android project. When I run it, I get an error from BuildConfig.java (this file is not editable). The error is
"Error:(18, 69) error: illegal escape character".
and this is the line that the error is pointing to:
public static final String[] TRANSLATION_ARRAY = new String[]{"C:\gpslogger\gpslogger\src\main\res\values","C:\gpslogger\gpslogger\src\main\res\af","C:\gpslogger\gpslogger\src\main\res\ar","C:\gpslogger\gpslogger\src\main\res\ca","C:\gpslogger\gpslogger\src\main\res\cs","C:\gpslogger\gpslogger\src\main\res\cy","C:\gpslogger\gpslogger\src\main\res\da","C:\gpslogger\gpslogger\src\main\res\de","C:\gpslogger\gpslogger\src\main\res\el","C:\gpslogger\gpslogger\src\main\res\es","C:\gpslogger\gpslogger\src\main\res\es-ES","C:\gpslogger\gpslogger\src\main\res\es-MX","C:\gpslogger\gpslogger\src\main\res\es-PE","C:\gpslogger\gpslogger\src\main\res\fa","C:\gpslogger\gpslogger\src\main\res\fi","C:\gpslogger\gpslogger\src\main\res\fr","C:\gpslogger\gpslogger\src\main\res\fr-CA","C:\gpslogger\gpslogger\src\main\res\gl","C:\gpslogger\gpslogger\src\main\res\he","C:\gpslogger\gpslogger\src\main\res\hi","C:\gpslogger\gpslogger\src\main\res\hr","C:\gpslogger\gpslogger\src\main\res\hu","C:\gpslogger\gpslogger\src\main\res\is","C:\gpslogger\gpslogger\src\main\res\it","C:\gpslogger\gpslogger\src\main\res\ja","C:\gpslogger\gpslogger\src\main\res\ko","C:\gpslogger\gpslogger\src\main\res\lt","C:\gpslogger\gpslogger\src\main\res\lv","C:\gpslogger\gpslogger\src\main\res\mk","C:\gpslogger\gpslogger\src\main\res\ms","C:\gpslogger\gpslogger\src\main\res\nl","C:\gpslogger\gpslogger\src\main\res\no","C:\gpslogger\gpslogger\src\main\res\pl","C:\gpslogger\gpslogger\src\main\res\pt","C:\gpslogger\gpslogger\src\main\res\pt-BR","C:\gpslogger\gpslogger\src\main\res\pt-PT","C:\gpslogger\gpslogger\src\main\res\ro","C:\gpslogger\gpslogger\src\main\res\ru","C:\gpslogger\gpslogger\src\main\res\sk","C:\gpslogger\gpslogger\src\main\res\sl","C:\gpslogger\gpslogger\src\main\res\sr","C:\gpslogger\gpslogger\src\main\res\sv","C:\gpslogger\gpslogger\src\main\res\sv-SE","C:\gpslogger\gpslogger\src\main\res\ta","C:\gpslogger\gpslogger\src\main\res\th","C:\gpslogger\gpslogger\src\main\res\tl","C:\gpslogger\gpslogger\src
\main\res\tr","C:\gpslogger\gpslogger\src\main\res\uk","C:\gpslogger\gpslogger\src\main\res\vi","C:\gpslogger\gpslogger\src\main\res\zh","C:\gpslogger\gpslogger\src\main\res\zh-CN","C:\gpslogger\gpslogger\src\main\res\zh-TW"};

\g is an illegal escape sequence in "C:\gps..."
In Java source, every backslash in a Windows path string must be escaped as \\, for example "C:\\"
BuildConfig.java is generated, so you should find the editable file that produces those strings (typically the build script) and correct them there.
P.S. The (18, 69) in the error means line 18, column 69 of the file.
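As an illustration of the escaping rule, here is a small sketch of what a code generator has to do before embedding a raw Windows path in Java source. The helper name is hypothetical and is not part of any Android or Gradle API.

```java
class PathEscapeDemo {
    // Turns a raw Windows path into a Java string literal by doubling
    // every backslash and wrapping the result in double quotes.
    static String toJavaLiteral(String rawPath) {
        return "\"" + rawPath.replace("\\", "\\\\") + "\"";
    }

    public static void main(String[] args) {
        // In memory this string holds single backslashes: C:\gpslogger\res
        String raw = "C:\\gpslogger\\res";
        // The printed text is what may legally appear in generated source.
        System.out.println(toJavaLiteral(raw));
    }
}
```

Emitting forward slashes instead of backslashes is a common alternative, since Java's file APIs accept them on Windows.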

Java replace/replaceAll strange behavior

I can't see what I'm missing here. Both replace and replaceAll from java.lang.String are generating a question mark (?) after each occurrence:
String str = "ABCD DKABCED DLS ABC";
System.out.println("str='"+str+"'");
System.out.println("str.replaceAll(\"ABC\", \"A\\\\${BC}\" ) => " + str.replaceAll("ABC", "A\\${BC}" ));
System.out.println("str.replace(\"ABC\", \"A${BC}\" ) => " + str.replace("ABC", "A${BC}" ));
Generates the following output:
str='ABCD DKABCED DLS ABC'
str.replaceAll("ABC", "A\\${BC}?" ) => A${BC}?D DKA${BC}?ED DLS A${BC}?
str.replace("ABC", "A${BC}?" ) => A${BC}?D DKA${BC}?ED DLS A${BC}?
Does anybody know why?
EDITED:
Just for the record: the problem is that there really WAS a character after the braces.
After copying and pasting into Notepad++ I could see the }?" text, which was not visible in NetBeans.
So it was purely an encoding misunderstanding.

I suspect this is a character encoding problem. When I pasted your code into Eclipse (on Windows) it could not save the code, complaining about the character set:
Some characters cannot be mapped using "Cp1252" character encoding.
When I retyped it in from scratch, the problem went away:
String str = "ABCD DKABCED DLS ABC";
System.out.println("str='" + str + "'");
System.out.println(str.replace("ABC", "A${BC}"));
produces the following (without extra ? marks):
str='ABCD DKABCED DLS ABC'
A${BC}D DKA${BC}ED DLS A${BC}
If you take the hexdump of a normal } you get 7d.
But for the } character in your code, I get 7d e2 80 8b. The extra bytes e2 80 8b are the UTF-8 encoding of U+200B (ZERO WIDTH SPACE).
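The hexdump check can also be done from Java itself. The sketch below dumps the UTF-8 bytes of a string and strips zero width spaces; the class and method names are illustrative. A console that cannot represent U+200B would plausibly print it as ?, which would match the output in the question.

```java
import java.nio.charset.StandardCharsets;

class HiddenCharDemo {
    // Space-separated lowercase hex of the string's UTF-8 bytes.
    static String hexDump(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so bytes above 0x7f are not sign-extended.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Removes every U+200B (ZERO WIDTH SPACE) from the string.
    static String stripZeroWidthSpaces(String s) {
        return s.replace("\u200B", "");
    }

    public static void main(String[] args) {
        String suspicious = "}\u200B";
        System.out.println(hexDump(suspicious));                       // 7d e2 80 8b
        System.out.println(hexDump(stripZeroWidthSpaces(suspicious))); // 7d
    }
}
```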

That would be because you have question marks in your replacement string. Thus replace and replaceAll are simply doing exactly what you are telling them to do.

Why do I get "Not a hexadecimal character" when using tdbloader2

I'm loading a recent DBPedia dump file, specifically short_abstracts_en.nt available from http://data.dws.informatik.uni-mannheim.de/dbpedia/2014/en/short_abstracts_en.nt.bz2 (warning, 409M file).
tdbloader2 fails to load, with:
org.apache.jena.riot.RiotException: [line: 1263473, col: 122] Not a hexadecimal character:
I can replicate this error with riot --validate
$JENA_HOME/bin/riot --validate /var/data/uncompressed/short_abstracts_en.nt
20:04:36 ERROR riot :: [line: 1263473, col: 122] Not a hexadecimal character:
Line 1263473 of that file looks like this:
<http://dbpedia.org/resource/Taiwanese_kana> <http://www.w3.org/2000/01/rdf-schema#comment> "Taiwanese kana (\u30BF\u30A \u30F2\u30A1\u30CC \u30AE\u30A \u30AB\u30A \u30D3\u30A7\u30F ) is a katakana-based writing system once used to write Holo Taiwanese, when Taiwan was ruled by Japan. It functioned as a phonetic guide to hanzi, much like furigana in Japanese or Zhuyin fuhao in Chinese. There were similar systems for other languages in Taiwan as well, including Hakka and Formosan languages.The system was imposed by Japan at the time, and used in a few dictionaries, as well as textbooks."#en .
Column 122 is part of the unicode set of characters: (\u30BF\u30A \u30F2\u30A1\u30CC \u30AE\u30A \u30AB\u30A \u30D3\u30A7\u30F ) (with column 122 in bold: \u30F2).
The error message is correct: \u30F2 is a (valid) unicode character, not a hexadecimal character.
Why does Jena think it should be hex, and what do I do about it?
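One observation about the line as quoted here: some escapes, such as \u30A and \u30F, have only three hexadecimal digits before a space, while N-Triples requires exactly four hexadecimal digits after \u. That may be an artifact of how the line was pasted, so treat it as a hypothesis and not a diagnosis. The sketch below, with illustrative names, finds the first \u that is not followed by four hex digits, which is one way to check a suspect line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeCheck {
    // A backslash and 'u' NOT followed by exactly-enough (four) hex digits.
    private static final Pattern MALFORMED =
            Pattern.compile("\\\\u(?![0-9A-Fa-f]{4})");

    // Returns the 0-based index of the first malformed escape, or -1.
    static int firstMalformedEscape(String line) {
        Matcher m = MALFORMED.matcher(line);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // Backslashes are doubled so the strings hold literal escape text.
        System.out.println(firstMalformedEscape("a\\u30BF\\u30A b")); // 7
        System.out.println(firstMalformedEscape("\\u30F2"));          // -1
    }
}
```

Parsers usually report 1-based columns, so the index returned here would need an offset of one when compared against an error message.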
