I am encountering an issue when saving/creating a file using Java:
java.io.IOException: No such file or directory
at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[na:1.7.0_79]
My application runs on Linux, but the file is stored on a Windows mount.
The exception is hit every time I try to create a file whose name contains Chinese characters.
Could this happen because of an encoding difference between Linux and Windows?
When I run and store on the same OS (app on Linux storing to Linux, and likewise for Windows), everything works smoothly.
Any help is very appreciated.
The code I use to create the file:
File imgPath = new File(fullpath.toString());
if (!imgPath.exists()) {
    FileUtils.forceMkdir(imgPath);
    imgPath.setWritable(true, false);
}
fullpath.append(File.separator).append(fileName);
outputStream = new FileOutputStream(new File(fullpath.toString()));
Thanks a lot.
Note: I'm a fairly new user and can't comment directly yet (only on my questions and answers so far), so I'm posting this as an answer.
Windows uses UTF-16 for file names while Linux typically uses UTF-8 (assuming you haven't installed anything that changes the defaults). UTF-8 and UTF-16 support the same range of characters, but, if I remember correctly, they differ in how text is stored: UTF-8 uses 8-bit code units and UTF-16 uses 16-bit units, so the same text is stored and read a little differently. An InputStreamReader converts characters from their external representation in a specified encoding to Java's internal representation. This Stack Overflow post (Difference between UTF-8 and UTF-16?) explains exactly how the byte sequences differ: they are identical for basic ASCII but different for other characters, such as Chinese. I would suggest looking for solutions along that line (I have to get to class!). I could be entirely wrong, but this is probably a good starting place. Good luck.
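To make the difference concrete, here is a minimal, self-contained check (the specific character is just an example) showing that the same character occupies a different number of bytes, with entirely different byte values, in each encoding:

```java
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    public static void main(String[] args) {
        String zhong = "中"; // U+4E2D, a CJK character
        // UTF-8 needs three bytes for this code point, UTF-16 needs two,
        // and the byte sequences have nothing in common.
        System.out.println(zhong.getBytes(StandardCharsets.UTF_8).length);    // 3
        System.out.println(zhong.getBytes(StandardCharsets.UTF_16BE).length); // 2
    }
}
```

Which bytes the operating system actually sees for a file name then depends on which encoding the JVM and the mounted filesystem agree on, which is exactly where a Linux-to-Windows mount can go wrong.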
I have a Java EE application running on WildFly 10. I am using Terminator (a terminal emulator) to monitor what's going on and Sublime Text 2 to open the log files.
Now I am sending XML over HTTP, and for some reason the encoding is messed up (I am German, so äüöß get mangled). It should be UTF-8, since everything I use is UTF-8 by default; I double-checked anyway, and yes, it's UTF-8, but the encoding is still wrong.
When I check the log files, the terminal output, or anything else, all I see are question marks instead of ä, ö, ü and ß.
So does anyone have productive ideas that could help me?
Try this command in jboss-cli.sh
/subsystem=undertow/servlet-container=default:write-attribute(name=default-encoding,value=UTF-8)
then
reload
In general, it is not clear whether you have a problem displaying national characters in the OS (check the locale and LANG environment variables, etc.) or a programming error.
Also, if you are URL-decoding the XML, be sure to specify the encoding.
For example:
URLDecoder.decode(xml, "UTF-8")
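As a quick illustration (the percent-encoded input is made up for this example), decoding the same bytes with the wrong charset is one common way umlauts get mangled:

```java
import java.net.URLDecoder;

public class DecodeDemo {
    public static void main(String[] args) throws Exception {
        // "Gr%C3%BC%C3%9Fe" is the UTF-8 percent-encoding of "Grüße"
        System.out.println(URLDecoder.decode("Gr%C3%BC%C3%9Fe", "UTF-8"));      // Grüße
        // Decoding the same bytes as ISO-8859-1 produces mojibake, not "Grüße"
        System.out.println(URLDecoder.decode("Gr%C3%BC%C3%9Fe", "ISO-8859-1"));
    }
}
```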
I am writing a program in Java with the Eclipse IDE, and I want to write my comments in Greek, so I changed the encoding under Window -> Preferences -> General -> Content Types -> Text -> Java Source File to UTF-8. The comments in my code are now fine, but when I run the program some words contain weird characters, e.g. San Germ�n instead of San Germán. If I change the encoding to ISO-8859-1, everything looks right when I run the program, but the comments in my code show weird characters again. So, what is going wrong?
Edit: My program uses Java Swing, and the weird characters under UTF-8 appear in Strings shown in JTable cells.
Edit 2: OK, I solved my problem: I keep the UTF-8 encoding for the Java file but change the encoding of the strings: String k = new String(myStringInByteArray, "ISO-8859-1");
This is most likely due to the compiler not using the correct character encoding when reading your source. This is a very common source of error when moving between systems.
The typical way to solve it is to use plain ASCII (which is identical in both Windows 1252 and UTF-8) and the "\u1234" encoding scheme (unicode character 0x1234), but it is a bit cumbersome to handle as Eclipse (last time I looked) did not transparently support this.
The property file editor does, though, so a reasonable suggestion could be that you put all your strings in a property file, and load the strings as resources when needing to display them. This is also an excellent introduction to Locales which are needed when you want to have your application be able to speak more than one language.
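A sketch of that property-file approach (the bundle content is inlined here to keep the example self-contained; a real project would ship a messages.properties file on the classpath and load it with ResourceBundle.getBundle):

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class BundleDemo {
    public static void main(String[] args) throws Exception {
        // Inline stand-in for a messages.properties file. Before Java 9,
        // .properties files are read as ISO-8859-1, so non-Latin text is
        // conventionally written as \u-escapes, which survive any file encoding.
        String props = "city=San Germ\\u00E1n\n";
        try (Reader reader = new StringReader(props)) {
            ResourceBundle bundle = new PropertyResourceBundle(reader);
            System.out.println(bundle.getString("city")); // San Germán
        }
    }
}
```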
I have this problem that has been dropped on me, after a couple of days of unsuccessful searches and workaround attempts.
I have an internal Java Swing program, distributed via JNLP/Web Start to OS X and Windows computers, that, among other things, downloads some files from WebDAV.
Recently, on a test machine with OS X 10.8 and Java 7, file and directory names with accented characters started having those characters replaced by question marks.
No problem on OSX with versions of Java before 7.
Example:
XXXYYY_è_ABCD/
becomes
XXXYYY_?_ABCD/
Using java.text.Normalizer (NFD, NFC, NFKD, NFKC) on the original string, the result is different but still wrong:
XXXYYY_e?_ABCD/
or
XXXYYY_e_ABCD/
I know, from correspondence between [andrew.brygin at oracle.com] and [mik3hall at gmail.com] that
Yes, file.encoding is set based on the locale that the jvm is running
on, and if you run your java vm in xxxx.UTF-8 locale, the
file.encoding should be UTF-8, set to MacRoman will be problematic.
So I believe Oracle/OpenJDK7 behaves correctly. That said, as Andrew
Thompson pointed out, if all previous Apple JDK releases use MacRoman
as the file.encoding for english/UTF-8 locale, there is a
"compatibility" concern here, it might worth putting something in the
release note to give Oracle/OpenJDK MacOS user a heads up.
original mail
From Joni Salonen's blog (java-and-file-names-with-invalid-characters) I know that:
You probably know that Java uses a “default character encoding” to
convert binary data to Strings. To read or write text using another
encoding you can use an InputStreamReader or OutputStreamWriter. But
for data-to-text conversions deep in the API you have no choice but to
change the default encoding.
and
What about file.encoding?
The file.encoding system property can also be used to set the default
character encoding that Java uses for I/O. Unfortunately it seems to
have no effect on how file names are decoded into Strings.
Executing locale from inside the JNLP invariably prints:
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
The most similar problem on Stack Overflow with a solution is this:
encoding-issues-on-java-7-file-names-in-os-x
but the solution there is to wrap the execution of the Java program in a script:
#!/bin/bash
export LC_CTYPE="UTF-8" # Try other options if this doesn't work
exec java your.program.Here
But I don't think this option is available to me because of Web Start, and I haven't found any way to set the LC_CTYPE environment variable from within the program.
Any solutions or workarounds?
P.S. :
If we run the program directly from the shell, it writes the files/directories correctly even on OS X 10.8 + Java 7.
The problem appears only with the combination of JNLP + OS X + Java 7.
I take it it's acceptable to have a maximally ASCII representation of the file name, which works in virtually any encoding.
First, you want to use specifically NFKD, so that maximum information is retained in the ASCII form. For example, "2⁵" becomes "25" rather than just "2", and the ligature "ﬁ" becomes "fi" rather than being dropped entirely, once the non-ASCII and non-control characters are filtered out.
String str = "XXXYYY_è_ABCD/";
str = Normalizer.normalize(str, Normalizer.Form.NFKD);
str = str.replaceAll("[^\\x20-\\x7E]", "");
// The name is now XXXYYY_e_ABCD/ no matter what the system encoding is
You would then always pass filenames through this filter to get their filesystem name. The only thing you lose is some uniqueness, i.e. the file asdé.txt maps to the same name as asde.txt, and under this scheme the two cannot be differentiated.
EDIT: After experimenting with OS X some more I realized my answer was totally wrong, so I'm redoing it.
If your JVM supports -Dfile.encoding=UTF-8 on the JVM command line, that might fix the issue. I believe that is a standard property but I'm not certain about that.
HFS Plus, like other POSIX-compliant file systems, stores filenames as bytes. But unlike Linux's ext3 filesystem, it forces filenames to be valid decomposed UTF-8. This can be seen here with the Python interpreter on my OS X system, starting in an empty directory.
$ python
Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53)
>>> import os
>>> os.mkdir('\xc3\xa8')
>>> os.mkdir('e\xcc\x80')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 17] File exists: 'e\xcc\x80'
>>> os.mkdir('\x8f')
>>> os.listdir('.')
['%8F', 'e\xcc\x80']
>>> ^D
$ ls
%8F è
This proves that the directory name on your filesystem cannot be Mac-Roman encoded (i.e. with byte value 8F where the è is seen), as long as it's an HFS Plus filesystem. But of course, the JVM is not assured of an HFS Plus filesystem, and SMB and NFS do not have the same encoding guarantees, so the JVM should not assume this scheme.
Therefore, you have to convince the JVM to interpret file and directory names with UTF-8 encoding, in order to read the names as java.lang.String objects correctly.
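For example, if your launcher lets you pass JVM flags (an assumption here: Web Start restricts which system properties an application may set, and whether a given JVM version honors file.encoding set this way varies):

```shell
# Hypothetical launch line for a plain (non-Web Start) invocation.
# sun.jnu.encoding is the undocumented property some JVMs consult for
# file-name encoding; file.encoding covers default I/O conversions.
java -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 -jar app.jar
```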
Shot in the dark: file.encoding does not influence how file names are created, just how the content gets written into the file; see this post: http://jonisalonen.com/2012/java-and-file-names-with-invalid-characters/
Here is a short entry from Apple: http://developer.apple.com/library/mac/#qa/qa1173/_index.html
Comparing this to http://docs.oracle.com/javase/tutorial/i18n/text/normalizerapi.html I would assume you want to use
normalized_string = Normalizer.normalize(target_chars, Normalizer.Form.NFD);
to normalize the file names before you pass them to the File constructor. Does this help?
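A minimal check of what NFD actually does to a precomposed character (this only demonstrates the decomposition itself; whether it fixes the Web Start issue depends on what the target filesystem expects):

```java
import java.text.Normalizer;

public class NormalizeDemo {
    public static void main(String[] args) {
        String precomposed = "\u00E8"; // 'è' as a single code point
        String decomposed = Normalizer.normalize(precomposed, Normalizer.Form.NFD);
        System.out.println(precomposed.length()); // 1
        System.out.println(decomposed.length());  // 2: 'e' followed by U+0300 (combining grave)
    }
}
```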
I don't think there is a real solution to this problem, right now.
In the meantime I came to the conclusion that the "C" environment variables printed from inside the program come from the Java Web Start sandbox, and (by design, apparently) you can't influence them from the JNLP.
The accepted (as accepted by the company) workaround/compromise was to launch the JNLP using javaws from a bash script.
Apparently, launching the JNLP from the browser or from Finder creates a new sandbox environment with LANG not set (so it defaults to "C", which is equivalent to ASCII).
Launching the JNLP from the command line instead picks up the right LANG from the system default, inheriting it from the shell.
This at least preserves the auto-updating feature of the JNLP and its dependencies.
Anyway, we sent a bug report to Oracle, but personally I'm not expecting it to be resolved anytime soon, if ever.
It's a bug in the old-school java.io.File API, maybe just on Mac? Anyway, the new java.nio API works much better. I have several files containing Unicode characters in their names and content that failed to load using java.io.File and related classes. After converting all my code to use java.nio.Path, EVERYTHING started working. And I replaced org.apache.commons.io.FileUtils (which has the same problem) with java.nio.Files...
...and be sure to read and write the content of file using an appropriate charset, for example:
Files.readAllLines(myPath, StandardCharsets.UTF_8)
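A self-contained sketch of that approach (the temp file here is just a stand-in for a real path; the point is the explicit charset on both the write and the read):

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

public class NioRoundTrip {
    // Writes the text with an explicit charset and reads it back the same way,
    // so the platform default encoding never enters the picture.
    static String roundTrip(String text) throws Exception {
        Path path = Files.createTempFile("demo", ".txt");
        try {
            Files.write(path, Collections.singletonList(text), StandardCharsets.UTF_8);
            return Files.readAllLines(path, StandardCharsets.UTF_8).get(0);
        } finally {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("àéî")); // àéî
    }
}
```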
If two different machines have different character encodings, how can a Java program ensure that the same file is read the same way on both machines? Is this possible in Java, or do we have to set the encodings of both machines manually?
It sounds like you just want to use something like:
InputStream inputStream = new FileInputStream(...);
Reader reader = new InputStreamReader(inputStream, "UTF-8"); // Or whatever encoding
Basically you don't have to use the platform default encoding, and you should almost never do so. It's a pain that FileReader always uses the platform default encoding :( I prefer to explicitly specify the encoding, even if I'm explicitly specifying that I want to use the platform default :)
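Putting that together as a small runnable sketch (the file and text are made up; try-with-resources needs Java 7+):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class ExplicitEncoding {
    // Writes text with an explicit charset, then reads it back the same way.
    // FileReader/FileWriter would silently use the platform default instead.
    static String writeThenRead(File file, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)) {
            w.write(text);
        }
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("demo", ".txt");
        System.out.println(writeThenRead(file, "héllo")); // héllo
        file.delete();
    }
}
```

Because both sides name the charset explicitly, the program behaves identically no matter what either machine's default encoding is.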
You don't need to change the machine's settings.
You can use any java.io.Reader subclass that allows you to set the character encoding. For instance InputStreamReader, like so:
new InputStreamReader(new FileInputStream("file.txt"), "UTF-8");
If you are in control of reading and writing the files in both environments, pick one encoding and use it on both sides; see Working with text files in Java.
If you have control over only the read side, there are two cases.
You know the encoding used to write the file: identify it and use that same encoding to read the file.
You don't know the encoding used to write the file: the best you can do is guess the encoding; see Character Encoding Detection Algorithm.
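A very rough sketch of the "guess" case, covering only byte-order-mark sniffing (an assumption of this example: many real files carry no BOM at all, and proper detectors such as ICU4J's CharsetDetector use statistical analysis, so treat this as illustrative only):

```java
public class BomSniffer {
    // Guesses an encoding from a byte-order mark at the start of the data.
    // Absence of a BOM proves nothing; "unknown" is the honest default.
    static String guessEncoding(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF)
            return "UTF-8";
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF)
            return "UTF-16BE";
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE)
            return "UTF-16LE";
        return "unknown";
    }

    public static void main(String[] args) {
        byte[] utf8Bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 65};
        System.out.println(guessEncoding(utf8Bom)); // UTF-8
    }
}
```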
UPDATE
If your issue is that you are not seeing the output properly in the Eclipse console, then the problem might be with the encoding setting of Eclipse itself. Read this article on how to fix Eclipse.