How to print IPA characters with mvn:exec? - java

I have the following simple program that prints the IPA word ˈabsəluːt [1]. Unfortunately, executing this program with mvn exec:exec prints the word as ?abs?lu?t [2]. How can I make it print correctly, i.e. as in [1]?
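package dp4j.encodingtest;
import java.io.File;
import java.io.IOException;
// FileUtils is assumed to come from Apache Commons IO
import org.apache.commons.io.FileUtils;
public class App {
    public static void main(String[] args) throws IOException {
        String s = "ˈabsəluːt";
        System.out.println(s);
        // Writes the same string to a file using the platform default encoding
        FileUtils.writeStringToFile(new File("s.txt"), s);
    }
}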
The mvn exec:exec command:
mvn "-Dexec.args=-classpath %classpath dp4j.encodingtest.App"
-Dexec.executable=C:\\jdk1.7.0_25\\bin\\java.exe exec:exec
Even writing the word to the s.txt file produces the incorrect output shown in [2].

The issue lies with the application that displays the word, in this case the console. Nothing can be done from Java or Maven except making sure that your Java source file encoding is UTF-8 (since a string literal is used).
If you are running it from an IDE, try changing the console font in the IDE's options/preferences to ‘Lucida Sans’, which has partial support for IPA extensions, or to another available font with IPA support.
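One way to check whether the string literal itself survived compilation is to print its code points in hexadecimal, since that output is plain ASCII and does not depend on the console or its font. The following is a minimal sketch and not part of the original program; the class name CodePointDump and the method dump are made up for illustration.

```java
class CodePointDump {
    // Formats each Unicode code point of s as U+XXXX, separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The IPA word from the question, written with escapes so this file stays ASCII-only.
        String s = "\u02C8abs\u0259lu\u02D0t";
        // Prints: U+02C8 U+0061 U+0062 U+0073 U+0259 U+006C U+0075 U+02D0 U+0074
        System.out.println(dump(s));
    }
}
```

To test a real source file, replace the escaped string with the literal as typed. If the output still lists U+02C8, U+0259 and U+02D0, the literal is intact and the remaining problem is most likely on the display side; other values suggest the source file was compiled with a different encoding.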

Related

Logging files names that contain Norwegian letters in the file name in Unix OS using a Jar executable

I have a simple java program that when run is supposed to traverse through the whole directory on a Unix server and log all files on the fileserver that contain Norwegian letters "å,ø,æ".
This is how it looks on the fileserver using winSCP:
In the end the logs.log file should look like this:
2022-10-25 14:27:02 INFO Logger:99 - File: 'DN_Oppmålings.pdf'
2022-10-25 14:27:02 INFO Logger:99 - File: 'Salg_av_gærden.pdf'
However, this is how it ends up in the log file: all Norwegian letters are represented with a square.
I can't figure out why this happens. It probably has something to do with encodings, because when I run it locally on Windows everything works as expected and I get the result I need. But when I build the project as an executable jar and run it on the server, the output is wrong.
Here is the code I am using.
public static void renameFiles3(File[] files) throws IOException {
    for (File filename : files) {
        if (filename.isDirectory()) {
            renameFiles3(filename.listFiles());
        } else {
            String fileNameString = filename.getName();
            if (fileNameString.contains("å") || fileNameString.contains("ø") || fileNameString.contains("æ")) {
                // Log the name in single quotes, matching the expected log output
                logger.info("File: '" + fileNameString + "'");
            }
        }
    }
}
public static void main(String[] args) {
    File[] files = new File(path).listFiles();
    try {
        renamer.renameFiles3(files);
    } catch (IOException ex) {
        logger.error(ex.toString());
    }
}
Someone pointed out that the encoding should be specified, but I am not sure how that is done. If I run "locale" command on the Unix server this is what I get as output.
[e1111111#ilt repository]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
I use Putty to run the jar file. Here are the configs.
Stacktrace of the error I get when running the jar:
java.nio.file.NoSuchFileException: ./documentRepository/DN_Oppm�lings.pdf
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:430)
at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
at java.base/java.nio.file.Files.move(Files.java:1422)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:105)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:89)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:89)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:89)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:89)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:89)
at com.example.fixfilenamesonfileserver.Renamer.renameFiles2(Renamer.java:89)
at com.example.fixfilenamesonfileserver.Renamer.main(Renamer.java:154)
What makes it even stranger is that I can create, for instance, a folder with mkdir containing Norwegian letters in its name and it is displayed correctly, and a file with Norwegian letters that I create myself is also logged correctly.
Some time ago I wrote an answer for a very similar problem.
As stated in that answer, the problem could be related to the use of different charsets on your local Windows laptop (probably cp1252 or some variant) and on your server.
Please consider reviewing the charset in effect in the JVM in each environment, and adapt the value of the file.encoding system property on your laptop and on the server if necessary; that may help you solve the problem.
Running your jar with an explicit value for the file.encoding JVM property will probably make the application work properly:
java -Dfile.encoding=UTF-8 -jar your_app.jar
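To review which charset is in effect in each environment, a small diagnostic can be run on both the laptop and the server and the output compared. This is a minimal sketch; the class name EncodingReport and the method collect are made up for illustration, and sun.jnu.encoding is an implementation-specific property that may be absent on some JVMs.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

class EncodingReport {
    // Collects the encoding-related settings the JVM is currently running with.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("defaultCharset", Charset.defaultCharset().name());
        report.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific property (assumption: may be null on some JVMs).
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Running it once with and once without -Dfile.encoding=UTF-8 shows whether the option actually changes what the JVM reports on that machine.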
I suspect there is no problem with your Java nor your file.
The problem is likely with the app you use to view that text. That app is using a font that lacks a glyph for those characters.
Edit your Question to note the app and OS if you want further assistance.
Assuming you are printing the letters to a terminal, the problem is most likely the terminal you use. Make sure it is set to code page 65001 (chcp 65001) and uses a font that fully supports Norwegian letters. I have encountered similar problems while trying to display multilingual text, because a single font often lacks support for multiple languages.
So, to summarize: first set the terminal code page with chcp 65001, then change the terminal font to one that fully supports Norwegian letters, and then run the jar file from the terminal like this: java -jar <jarname>.jar

VSCode not reading Cyrillic from console

Here is the problem: I have started using VS Code, and I can't read Cyrillic characters from the console.
My code:
import java.util.Scanner;
class App {
public static void main(String[] args) throws Exception {
Scanner input = new Scanner(System.in, "UTF-8");
String word = input.nextLine();
System.out.println(word);
}
}
Now when I enter any Cyrillic string, it prints an empty string back to me. If I write something like
System.out.println("Привет"); // Cyrillic symbols
it prints "Привет", which is fine. So I am guessing it has something to do with reading the string rather than outputting it.
chcp command gives Active code page: 65001
I have tried with and without setting the encoding, but neither works. Is there something I missed?
Thanks in advance
I've tested the code on my machine and got the same result: nothing is shown.
You can see that when it is run in an external Windows PowerShell or Command Prompt, the result is different but still not shown correctly:
When we change the code page to GBK (936), the Cyrillic characters are displayed correctly:
When I change the integrated terminal's encoding in VS Code and execute the code again, it still shows nothing:
I've filed a GitHub issue about these different results between the external Command Prompt and the integrated terminal in VS Code. I'm also doing some research; if I find any useful information, I will update this answer.
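The general mechanism behind such code page effects can be sketched in plain Java: when the side that writes bytes and the side that reads them assume different charsets, the text comes out changed. This is only an illustration and does not reproduce the VS Code behavior described above; the class name CodePageMismatch and the method transcode are made up, and ISO-8859-1 is used as the mismatched charset only because every JVM is guaranteed to provide it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CodePageMismatch {
    // Encodes text with one charset and decodes the bytes with another,
    // mimicking a terminal and a program that disagree about the encoding.
    static String transcode(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        // The Cyrillic word from the question, written with escapes so this file stays ASCII-only.
        String word = "\u041F\u0440\u0438\u0432\u0435\u0442";
        String same = transcode(word, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = transcode(word, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        // Prints: matching charsets preserve the text: true
        System.out.println("matching charsets preserve the text: " + same.equals(word));
        // Prints: mismatched charsets give 12 chars instead of 6
        System.out.println("mismatched charsets give " + mixed.length() + " chars instead of " + word.length());
    }
}
```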

How to parse file patterns using Apache commons CLI

I'm trying to parse my command line arguments using the apache commons CLI. It might be a bit heavy handed for the example here, but it makes sense in the context of the program I'm creating. I'm trying to read a file pattern filter, similar to what grep uses to select files to process.
My Argument looks like this:
Program --input *.*
I've written a test program to see what the parser is seeing:
public static void main(String[] args) throws ParseException {
Options options = new Options();
options.addOption(new Option(INPUT_FILTER_SHORT, INPUT_FILTER_LONG, true, INPUT_FILTER_DESCRIPTION));
CommandLineParser parser = new BasicParser();
CommandLine cmd = parser.parse(options, args);
System.out.println(cmd.getOptionValue(INPUT_FILTER_SHORT));
}
This prints out:
.classpath
If I change my arguments to:
Program --input test.txt
I get the output:
test.txt
I'm assuming that I have to do something to tell Apache Commons CLI that * is not a special character? I can't seem to find anything about this online.
I'm experiencing this on Windows (7). I'm fairly certain it's the *.* which is causing the issue as when I swap to using patterns that don't use *, the expected pattern shows up.
Your problem isn't really to do with Commons CLI, but to do with how the shell and the Java executable together process the parameters.
To eliminate other factors, and see what's going on, use a short Java program:
public class ArgsDemo {
public static void main(String[] args) {
for(int i=0; i<args.length; i++) {
System.out.println("" + i + ": " + args[i]);
}
}
}
Play with java ArgsDemo hello world, java ArgsDemo * etc. and observe what happens.
On UNIX and Linux:
Java does no special processing of *. However, the shell does. So if you did:
$ mkdir x
$ cd x
$ touch a b
$ java -jar myjar.jar MyClass *
... then MyClass.main() would be invoked with the parameter array ["a","b"] -- because the UNIX shell expands * to files in the current directory.
You can suppress this by escaping:
$ java -jar myjar.jar MyClass \*   // main() sees ["*"]
(Note that a UNIX shell wouldn't expand *.* to .classpath because this form would ignore "hidden" files starting with .)
On Windows:
cmd.exe does not do UNIX-style wildcard expansion. If you supply * as a parameter to a command in Windows, the command gets a literal *. So for example, PKUNZIP *.zip passes *.zip to PKUNZIP.EXE, and it's up to that program to expand the wildcard if it wants to.
Since some release of Java 7, the Java executable for Windows does some wildcard to filename expansion of its own, before passing the parameters to your main() class.
I've not been able to find clear documentation of Java-for-Windows' wildcard expansion rules, but you should be able to control it with quoting, escaping the quotes to prevent cmd.exe interpreting them:
> java.exe -jar myjar.jar MyClass """*.*"""
(Untested as I don't have a Windows box handy, and quoting in cmd.exe is a bit of a beast - do please experiment and either edit the above or leave a comment)
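Once the pattern reaches main() as a literal string, the program can expand it itself. The following is a minimal sketch using the glob support in java.nio.file (available since Java 7); the class name GlobDemo and the method match are made up for illustration, and the sketch only looks at a single directory without recursing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class GlobDemo {
    // Returns the sorted names of the entries in dir whose file name matches the glob pattern.
    static List<String> match(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Uses the first argument as the pattern, or "*" when none is given.
        String pattern = args.length > 0 ? args[0] : "*";
        for (String name : match(Paths.get("."), pattern)) {
            System.out.println(name);
        }
    }
}
```

With this approach the value returned by cmd.getOptionValue(...) would be passed to match as the pattern, so the result no longer depends on whether the shell or the launcher expands wildcards, provided the pattern arrives unexpanded.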

Changing the color of output

I was trying to change the color of console output in Java. My program's output displays a warning message, so I wanted to change the color of that warning.
After searching Stack Overflow and reading the answers to similar questions, I came across Jansi and JLibs, but they are not working. This is the sample code I wrote using Jansi; I included the jar in the classpath.
import static org.fusesource.jansi.Ansi.*;
import static org.fusesource.jansi.Ansi.Color.*;
class Test
{
public static void main(String[] args)
{
System.out.println( ansi().eraseScreen().fg(RED).a("Hello").fg(GREEN).a(" World").reset());
}
}
However, it does nothing except print this: ←[2J←[31mHello←[32m World←[m
I am using Windows 7 and JDK 1.7, and I am not using Eclipse.
Any help?
From Jansi:
Using jansi is easy. Before you start sending ANSI escape sequences to System.out make sure you run: AnsiConsole.systemInstall();
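For context, the ←[31m in the question's output is an ANSI escape sequence shown raw: the escape character (U+001B) followed by [31m. The sketch below builds such sequences by hand with the standard library only, to show what is being sent to the console; it is not Jansi's API, and the class name AnsiSketch and the method colored are made up for illustration. A console that does not interpret these sequences prints them literally, as in the question.

```java
class AnsiSketch {
    // The ASCII escape character that starts every ANSI sequence.
    static final String ESC = "\u001B";

    // Wraps text in a color sequence (e.g. 31 = red, 32 = green) followed by a reset.
    static String colored(int colorCode, String text) {
        return ESC + "[" + colorCode + "m" + text + ESC + "[0m";
    }

    public static void main(String[] args) {
        // Shows "Hello" in red and " World" in green on a console that interprets ANSI sequences.
        System.out.println(colored(31, "Hello") + colored(32, " World"));
    }
}
```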

Standard in with Processing (processing.org)

This simple code has never worked for me in processing:
try {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String str = "";
while (str != null) {
System.out.print("> prompt ");
str = in.readLine();
println(str);
}
}
catch (IOException e) {
}
Probably because the console output box cannot be used for input, unlike in Eclipse. Is there a simple workaround, or am I forced to do something like a dialog box (or keyPressed handling) for standard in?
If you are using the Processing IDE, Processing does not support this behavior natively. If you export your sketch and edit the Java files, or use Eclipse, Proclipsing, core.jar, etc., you can access System.in like in any other Java application. However, this would defeat the purpose of Processing, which typically doesn't run from the command line and is graphical in nature.
Best practice would be to capture the keystrokes with the keyPressed() method. For example:
String str = "";
void keyPressed() {
str += key;
}
Then, in your draw() loop, you can process the text accumulated in str and clear it if you want.
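The accumulate-and-clear idea can be sketched as a small plain-Java helper that collects typed characters and hands back a complete line when Enter is pressed. This is not Processing code; the class name LineBuffer and the method onKey are made up for illustration, and it assumes that Enter arrives as '\n' or '\r' and Backspace as '\b'. In a sketch, keyPressed() would call onKey(key) and act on any non-null result.

```java
class LineBuffer {
    private final StringBuilder current = new StringBuilder();

    // Feeds one typed character; returns the finished line when Enter is
    // pressed, or null while the line is still being typed.
    String onKey(char key) {
        if (key == '\n' || key == '\r') {
            String line = current.toString();
            current.setLength(0); // start a fresh line
            return line;
        }
        if (key == '\b') {
            // Backspace removes the last character, if there is one.
            if (current.length() > 0) {
                current.setLength(current.length() - 1);
            }
            return null;
        }
        current.append(key);
        return null;
    }

    public static void main(String[] args) {
        LineBuffer buffer = new LineBuffer();
        for (char c : "hello\n".toCharArray()) {
            String line = buffer.onKey(c);
            if (line != null) {
                System.out.println(line); // prints: hello
            }
        }
    }
}
```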
If you wanted something more sophisticated that would have a better UX, I suggest using something like ControlP5's TextField or TextArea.
Your program works perfectly. I named it test, exported it as an applet, and tested it with Cygwin as well as the Windows command prompt:
$ cd test/applet
$ java -jar test.jar
Output (I typed "hello" and hit enter):
prompt> hello
hello
prompt>
I also tried quickly on an Ubuntu terminal through ssh, but had issues connecting to the X11 server. Consider Xvfb (http://en.wikipedia.org/wiki/Xvfb) if that is an issue.
Just to confirm, I was able to run the SharedCanvasServer example included in Library->Network where I added a System.out.println to dump debug to the executing terminal.
java -cp "core.jar;net.jar;SharedCanvasServer.jar" SharedCanvasServer
