I have been using Geany to create Java programs, and until now I was able to compile them successfully. I wrote the simple Java program below in Geany, but compiling it produced an illegal character error (\u0000).
public class SumOfCubedDigits
{
    public static void main(String[] args)
    {
        for (int i = 1; i <= 9; i++)
        {
            for (int j = 0; j <= 9; j++)
            {
                for (int k = 0; k <= 9; k++)
                {
                    double iCubed = Math.pow(i, 3);
                    double jCubed = Math.pow(j, 3);
                    double kCubed = Math.pow(k, 3);
                    double cubedDigits = iCubed + jCubed + kCubed;
                    int concatenatedDigits = (i*100 + j*10 + k);
                    if (cubedDigits == concatenatedDigits)
                    {
                        System.out.println(concatenatedDigits);
                    }
                }
            }
        }
    }
}
I recreated the program in nano and it compiled successfully. I then copied it across to Geany under a different name, SumTest.java, compiled it, and got the same illegal character error. Clearly the error lies with the Geany IDE on the Raspberry Pi. I'd like to know how I can fix the editor so that it creates programs that compile successfully, as it is not just this program: it is any Java program created in Geany.
This might be a problem with the encoding Geany uses when saving the source file.
If you compile the file with javac without specifying the -encoding parameter, the platform's default encoding is used. On a modern Linux system this is likely to be UTF-8; on Windows it is typically one of the ANSI code pages (for example Cp1252).
To find out what the default encoding is, you can compile and run a small Java program:
import java.nio.charset.Charset;

public class DefaultCharsetPrinter {
    public static void main(String[] argv) {
        System.out.println(Charset.defaultCharset());
    }
}
This should print the name of the default encoding used by java programs.
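For example, on a typical modern Linux system a run might look like this (the output depends entirely on your platform's settings):
$ javac DefaultCharsetPrinter.java
$ java DefaultCharsetPrinter
UTF-8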
In Geany you can set the file encoding in menu Document > Set Encoding. You need to set this to the same value used by javac. The Geany manual describes additional options for setting the encoding.
As you are seeing a lot of errors complaining about the null character, it is most likely that Geany stores the file in an encoding with multiple bytes per character (for instance UTF-16) while javac uses an encoding with a single byte per character. If I save your source file as UTF-16 and then try to compile it with javac using UTF-8 encoding, I get the same error messages that you see. After saving the file as UTF-8 in Geany, the file compiles without problems.
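Alternatively, you can leave the file as the editor saved it and tell javac about the encoding explicitly, for example (assuming the file really was saved as UTF-16):
# assuming the source file was saved as UTF-16 by the editor
javac -encoding UTF-16 SumOfCubedDigits.java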
I had the same problem with a file I generated using the command echo "" > Main.java in Windows PowerShell.
I searched for the problem and it seemed to have something to do with encoding. I checked the encoding of the file using file -i Main.java and the result was text/plain; charset=utf-16le.
Later I deleted the file and recreated it in Git Bash using touch Main.java, and with this the file compiled successfully. I checked the file encoding using the file -i command and this time the result was Main.java: text/x-c; charset=us-ascii.
Next I searched the internet and found that to create an empty file in PowerShell we can use the New-Item cmdlet. I created the file using New-Item Main.java and checked its encoding; again the result was Main.java: text/x-c; charset=us-ascii, and this time it compiled successfully.
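For what it's worth, the underlying cause appears to be that the > redirection operator in Windows PowerShell writes UTF-16LE by default, whereas New-Item creates a plain empty file. If you prefer redirection, a sketch of forcing the encoding explicitly with Out-File:
# Windows PowerShell: pick the file encoding explicitly instead of
# relying on the UTF-16LE default of the > operator
"" | Out-File -Encoding ascii Main.java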
Related
I have a Java project in which I want to take input from the user.
I wrote the code in Eclipse and it was running without any problems at all.
However, when I export my classes into an executable jar file using Eclipse and try to run it in the Windows cmd, the Scanner(System.in) can't read characters in UTF-8 (Greek characters), or something else is going on that I haven't thought of.
This is the part of the code where I run into the problem:
String yesORno = inp.stringScanner(); // basically a nextLine()
while (!(yesORno.equals("ΝΑΙ") || yesORno.equals("ΟΧΙ"))) { // ΝΑΙ and ΟΧΙ are Greek characters, not Latin
    System.out.println("Παρακαλώ πληκτρολογίστε 'ΝΑΙ' ή 'ΟΧΙ'"); // please type ΝΑΙ or ΟΧΙ in Greek
    yesORno = inp.stringScanner(); // take input again
}
inp is an object of another class that I use to take input, in this case with the method stringScanner():
public String stringScanner() {
    Scanner in = new Scanner(System.in);
    return in.nextLine();
}
So when I run the code in Eclipse and enter some sample characters for testing, the input is recognised and the loop exits. That is what I want to happen every time.
But when I run the jar file, the jar for some reason doesn't recognise the Greek ΝΑΙ, so yesORno.equals("ΝΑΙ") never returns true and the while loop doesn't stop.
The same happens with ΟΧΙ.
I have tried running the jar file from a .bat file like:
start java -Dfile.encoding=UTF-8 -jar Myfile.jar
but that did not solve it.
I've done a lot of research to resolve this problem but I have found nothing.
I would appreciate your help.
The JVM argument -Dfile.encoding tells the JVM which default encoding to assume for (text) files it may encounter. This includes stdin, stdout and stderr – mapped to System.in, System.out and System.err. But the argument will not change anything in the operating system itself.
Most probably, your Windows cmd is using the Windows-1253 encoding, not UTF-8. Telling the JVM via the -Dfile.encoding argument that it is UTF-8 would be an outright lie …
Try start java -Dfile.encoding=Windows-1253 -jar Myfile.jar or start java -Dfile.encoding=ISO-8859-7 -jar Myfile.jar.
If your system is set up with Windows-1253, the second option may cause other problems, as ISO-8859-7 and Windows-1253 are not fully compatible. But for a test it should do the job.
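To see which code page your console actually uses, you can run chcp in the same cmd window before starting the jar; on a Greek Windows system the console typically reports 737 (the OEM code page) or 1253 (the ANSI one). For example:
C:\> chcp
Active code page: 737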
According to the documentation, the way you use the scanner will always depend on the operating system's encoding settings.
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Scanner.html#%3Cinit%3E(java.io.InputStream)
Look at the alternative constructors - you can specify the encoding there directly. Your code could look like this:
Scanner in = new Scanner(System.in, "UTF-8");
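Applied to the stringScanner() helper from the question, a minimal sketch could look like this (the wrapper class name is hypothetical, and the charset you pass must match what the console actually sends – if your console uses Windows-1253, pass that instead):
import java.util.Scanner;

public class Input { // hypothetical name for the question's input-helper class
    // Passing the charset explicitly removes the dependence on the
    // JVM's default encoding.
    public String stringScanner() {
        Scanner in = new Scanner(System.in, "UTF-8");
        return in.nextLine();
    }
}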
I have a Jenkins job running and I just want to list all the files in a directory. Every file name contains a Chinese character. The problem is that Jenkins has trouble reading those names: it just turns the Chinese characters into "?". The second problem is that there are actually more than 100 files, but Jenkins only gives me 20 – presumably many names now look identical because of the question marks.
Does anyone know how I can fix this? The problem only occurs on Jenkins (running on Linux); on my local machine in Eclipse it works.
File resourcePath = new File("resources/china_data/");
File[] files = resourcePath.listFiles();
for (final File file : files)
{
    System.out.println(file.getName());
}
An alternative solution is to use the newer java.nio.file.Path API in place of the java.io.File API.
Also try setting the following system properties early in your code:
System.setProperty("sun.jnu.encoding", "utf-8");
System.setProperty("file.encoding", "UTF-8");
Assuming you are using System.out.println, this happens when the program runs with an ASCII locale:
$ cat Main.java
import java.util.*;
import java.io.*;

class Main {
    public static void main(String[] args) throws Exception {
        File resourcePath = new File("resources/china_data/");
        File[] files = resourcePath.listFiles();
        for (final File file : files)
        {
            System.out.println(file.getName());
        }
    }
}
$ javac Main.java
$ LC_CTYPE=C java Main
???????
When the program runs with a UTF-8 capable locale, either from the environment or configured through Java, you get the expected result:
$ LC_CTYPE=en_US.UTF-8 java Main
中华人民共和国
$ LC_CTYPE=C java -Dfile.encoding=UTF-8 Main
中华人民共和国
If you're not sure how to configure your server, you can also do this from within Java:
System.setOut(new PrintStream(System.out, true, "UTF-8"));
I am having a bug with the Java debug plugin in VS Code on Windows 10 when I try to print some special characters with this code:
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class App {
    public static void main(String[] args) throws IOException {
        String hello = "こんにちは世界!";
        System.out.println(hello);
        FileOutputStream oStream = new FileOutputStream("output.txt");
        Writer out = new BufferedWriter(new OutputStreamWriter(oStream, "UTF-8"));
        try {
            out.write(hello);
        } finally {
            out.close();
        }
    }
}
If the file encoding is set to UTF-8, then when I debug my code in VS Code, both the debug console and the file show broken characters. If I change the file encoding to UTF-8 with BOM, both the debug console and the file show the correct characters. But UTF-8 with BOM is not a solution either, because Java's UTF-8 decoder does not recognize an initial BOM. So changing the file encoding to UTF-8 with BOM breaks my project: every time I compile my code with javac or with build tools like Gradle, Maven, ..., it throws this error:
> gradle build
> Task :compileJava FAILED
D:\Workspace\Code\~SourceCode\Java\TestEncode\src\main\java\App.java:1: error: illegal character: '\ufeff'
?import java.io.BufferedWriter;
^
(I don't know what magic M$ uses to make UTF-8 with BOM work here!)
Does anybody know a fix or workaround for this?
For a given Java file, I'd like to check whether it is syntactically correct, i.e. whether it has semicolons in the right places, matching parentheses, etc. Importantly, I need to check the file in isolation from all of its dependencies.
This other answer is promising; however, it is really doing a semantic check rather than a syntactic check. It does what a compiler would do: check all the import statements and verify the external classes that are referenced in the code.
Is there a way to do a true syntax check? (A check that only inspects the raw text against Java's formal grammar)
Create or use a Java source code parser. For some parser generators there are public Java grammars available - you could use one of them to generate the parser.
E.g. the Java 8 grammar for ANTLR (no idea about the quality of that grammar, you'd have to do your own evaluation - but it was written by the author of ANTLR, so it should be OK, I guess).
As Jiri suggested, use the ANTLR library to put together a syntax checker.
Download and extract the ANTLR grammars from here
Download the ANTLR4 jar from here
Run the following command to generate classes from the Java 8 grammar:
java -jar antlr-4.5.3-complete.jar ~/Downloads/grammars-v4-master/java8/Java8.g4
Copy the .java files that were created into your project
Then you can write your syntax checker:
public static void main(String[] args) throws IOException {
    ANTLRInputStream input = new ANTLRFileStream(args[0]);
    Java8Lexer lexer = new Java8Lexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    Java8Parser parser = new Java8Parser(tokens);
    final StringBuilder errorMessages = new StringBuilder();
    parser.addErrorListener(new BaseErrorListener() {
        @Override
        public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
            String err = String.format("Failed to parse at line %d:%d due to %s", line, charPositionInLine + 1, msg);
            errorMessages.append(err);
            errorMessages.append(System.lineSeparator());
        }
    });
    parser.compilationUnit();
    int syntaxErrors = parser.getNumberOfSyntaxErrors();
    if (syntaxErrors == 0) {
        System.out.println(args[0] + ": PASS");
    } else {
        System.out.println(args[0] + ": FAILED (" + syntaxErrors + " syntax errors)");
    }
}
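Assuming the main method above lives in a class called SyntaxChecker (an illustrative name) and the generated parser classes plus the ANTLR jar are on the classpath, a test run might look like:
$ javac -cp antlr-4.5.3-complete.jar *.java
$ java -cp .:antlr-4.5.3-complete.jar SyntaxChecker SomeFile.java
SomeFile.java: PASS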
I was researching how to do a quick Java syntax checker for use in vim. I was inspired by Fidel's answer (thanks for the example checker!) but it didn't suffice for me, as I needed it to work in a standalone way.
Here is a step-by-step process of what had to be done in order to get a runnable command:
Run:
# download and generate .java classes
wget https://repo1.maven.org/maven2/org/antlr/antlr4/4.5.3/antlr4-4.5.3.jar -O antlr4-4.5.3.jar
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/java8/Java8.g4 -O Java8.g4
java -jar antlr4-4.5.3.jar ./Java8.g4
Then write a syntax checker, perhaps similar to Fidel's, and put it in package checker; place the file directly under the ./checker directory
Also place all of the generated .java classes under that package (i.e. add package checker; as the first line of all the files)
Run:
# prepare directory structure and compile
mkdir -p checker && mv *.java ./checker/
javac -cp antlr4-4.5.3.jar ./checker/*.java
Prepare a Manifest file that will look similar to:
Class-Path: antlr4-4.5.3.jar
Main-Class: checker.<name-of-your-checker-class>
Run:
# package into a jar file and run to test everything works
jar cfm checker.jar Manifest.txt checker/*.class
java -jar ./checker.jar <filename-you-want-to-run-syntax-check-against>
For my vim usage I then created a simple shell script that wraps running the jar; it looks like this:
#!/bin/bash
DIR=$(readlink -f $0)
DIR=${DIR:0:(-3)} # strip the script's file name; put your script name's length here instead of '3'
java -jar ${DIR}checker.jar "$@"
Then make it executable and symlink it under $PATH:
chmod a+x <file-name>
sudo ln -s /path/to/the/script /usr/bin/java-syntax
And finally I added a keybinding like this to my .vimrc file (i.e. to run the syntax check when the F10 key is pressed):
" java syntax validation
map <F10> :!java-syntax % <CR>
I also stored the process in a GitHub repository, where all the commands are prepared in a makefile: it suffices to run make to build and package the checker. Feel free to use it as inspiration.
I am working on a small program to find text in a text file, but I am getting a different result depending on how I run my program.
When running my program from Netbeans I get 866 matches.
When running my program by double-clicking the .jar file in the dist folder, I get 1209 matches (the correct number).
It seems that when I'm running the program from Netbeans, it doesn't get to the end of the text file. Is that to be expected?
Text File in question
Here is my code for reading the file:
@FXML
public void loadFile(){
    //Loading file
    try{
        linelist.clear();
        aclist.clear();
        reader = new Scanner(new File(filepathinput));
        while(reader.hasNext()){
            linelist.add(reader.nextLine());
        }
        for(int i = 0; i < linelist.size()-1; i++){
            if(linelist.get(i).startsWith("AC#")){
                aclist.add(linelist.get(i));
            }
        }
    }
    catch(java.io.FileNotFoundException e){
        System.out.println(e);
    }
    finally{
        String accountString = String.valueOf(aclist.size());
        account.setText(accountString);
        reader.close();
    }
}
The problem is an incompatibility between the Java app's (i.e. the JVM's) default file encoding and the input file's encoding.
The file's encoding is "ANSI", which commonly maps to the Windows-1252 encoding (or one of its variants) on Windows machines.
When running the app from the command prompt, the JVM (and therefore the Scanner, implicitly) uses the system default file encoding, which is Windows-1252. Reading a file in that same encoding under this setup does not cause the problem.
However, Netbeans by default sets the project encoding to UTF-8, so when the app runs from Netbeans its file encoding is UTF-8. Reading the file with this encoding confuses the scanner. The character "ï" (0xEF) in the text "Caraïbes" is the cause of the problem: since 0xEF is also the first byte of the UTF-8 BOM sequence (0xEF 0xBB 0xBF), it somehow messes up the scanner.
As a solution, either specify the encoding of the scanner explicitly:
reader = new Scanner(file, "windows-1252");
or convert the input file to UTF-8 using Notepad or, better, Notepad++, and set the scanner's encoding to UTF-8 explicitly instead of relying on the system default:
reader = new Scanner(file, "utf-8");
However, when different OSes are considered, working with UTF-8 everywhere is the preferred way of dealing with multi-platform environments, so the second option is the way to go.
It can also depend on filepathinput: the jar and Netbeans might be referring to two different files, possibly with the same name in different locations. Can you give more information on the value of the filepathinput variable?