When I try to detect the existence of files whose names were encoded in UTF-8 using the FileExists function, the files cannot be found.
I found that on the ColdFusion server the Java File Encoding was originally set to "UTF-8". For some unknown reason it reverted to the default "ASCII". I suspect this is the issue.
For example, a user uploaded a photo named 云拼花.jpg while the server's Java file encoding was set to UTF-8. Now, with the server's Java file encoding set to ASCII, I use
<cfif FileExists("#currentpath##pic#")>
The result is "not found", i.e. the file does not exist. However, if I simply display it using:
<IMG SRC="/images/#pic#">
The image displays. This causes issues when I try to test for the existence of the images: the images are there, but FileExists can't find them.
Now the directory has a mix of file names encoded in either UTF-8 or ASCII. Is there any way to:
force any uploaded file name to UTF-8 encoding
check for the existence of the file
regardless of the CF Admin Java File Encoding setting?
Add this to your page.
<cfprocessingdirective pageencoding="utf-8">
This tells ColdFusion to read the page itself as UTF-8 and should fix the issue.
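If you want to double-check which encoding the underlying JVM is actually using (presumably what the CF Admin "Java File Encoding" setting maps to), printing the relevant system properties is a quick way; a minimal Java sketch, not ColdFusion code:

// Minimal sketch: show the JVM properties that govern the default charset
// (file.encoding) and how file names are encoded on disk (sun.jnu.encoding).
// Run this on the same JVM that hosts ColdFusion.
public class ShowFileEncoding {
    public static void main(String[] args) {
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
    }
}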
We have a Java application (OpenJDK 1.8) - a service generating some payload using FreeMarker templates (version 2.3.31, from Maven). The content translations are handled using resource bundles (.properties files with translations, e.g. template.properties, template_fi.properties, template_bg.properties, ...). The properties files are UTF-8 encoded and everything works fine.
When migrating to Java 11 (Zulu OpenJDK 11), we started to have an issue with translations that were not "Latin", i.e. contained characters outside the ISO-8859-1 charset. All characters outside that charset were changed to ?. (The resource files were still UTF-8 encoded; changing the content using native2ascii did not help.)
After some experimenting, we solved the encoding issue using the system property:
-Djava.util.PropertyResourceBundle.encoding=ISO-8859-1
I'm looking for an explanation - WHY? I find the property value counterintuitive and I'd like to understand the process.
According to the documentation, I understand that ResourceBundle is supposed to read the properties using ISO-8859-1 and throw an exception when encountering an invalid character. The system property mentioned above should enable having the property file encoded in UTF-8. Yet the working solution was explicitly setting the ISO-8859-1 encoding.
And indeed, testing a pure Java implementation, the proper output is achieved using the UTF-8 encoding:
System.setProperty("java.util.PropertyResourceBundle.encoding","UTF-8");
// "ISO-8859-1" <- not working
// System.setProperty("java.util.PropertyResourceBundle.encoding","ISO-8859-1");
Locale locale = Locale.forLanguageTag("bg-BG");
ResourceBundle testBundle = ResourceBundle.getBundle("test", locale);
System.out.println(testBundle.getString("name"));
// return encoded, so the terminal doeesn't break the non-latin characters
System.out.println(
Base64.getEncoder()
.encodeToString(testBundle.getString("name").getBytes("UTF-8")));
I assume that the FreeMarker library somehow makes some encoding changes internally, but I am not sure what or why; FreeMarker's localized-string lookup is internally a simple resource bundle.
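For comparison, in plain Java the charset can also be forced per lookup with a ResourceBundle.Control, which takes the system property out of the picture; this is only a sketch against the same test bundle and would not change what FreeMarker does internally:

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Utf8BundleTest {
    public static void main(String[] args) {
        ResourceBundle bundle = ResourceBundle.getBundle(
                "test", Locale.forLanguageTag("bg-BG"),
                new ResourceBundle.Control() {
                    @Override
                    public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                                    ClassLoader loader, boolean reload)
                            throws IllegalAccessException, InstantiationException, IOException {
                        if (!"java.properties".equals(format)) {
                            return super.newBundle(baseName, locale, format, loader, reload);
                        }
                        // Open the .properties resource with an explicit UTF-8 reader
                        // instead of letting PropertyResourceBundle pick the charset.
                        String resourceName = toResourceName(toBundleName(baseName, locale), "properties");
                        try (InputStream is = loader.getResourceAsStream(resourceName)) {
                            return is == null ? null
                                    : new PropertyResourceBundle(
                                          new InputStreamReader(is, StandardCharsets.UTF_8));
                        }
                    }
                });
        System.out.println(bundle.getString("name"));
    }
}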
I have a Spring Boot application which renders an XML document into a PDF. The document contains French characters like é à. While running the application through STS I have no issues; the PDF is generated as expected. But while running the application through the command line using java -jar target\application.jar, the generated PDF has the French characters as é Ã. I am converting the XML into a byte[] and creating the PDF. I couldn't figure out a solution. Any help is much appreciated.
Two options:
Force the encoding with the file.encoding argument, such as -Dfile.encoding=utf-8.
java -Dfile.encoding=utf-8 -jar target\application.jar
(better) When you convert the xml file into a byte array, specify the encoding:
Reader reader = new InputStreamReader(new FileInputStream("/path/to/xml/file"), StandardCharsets.UTF_8);
// do your file reading ...
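A slightly fuller, self-contained sketch of the same idea, assuming the XML on disk is UTF-8 (the path is a placeholder):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class XmlPayload {
    public static void main(String[] args) throws IOException {
        // Read the XML with an explicit charset instead of the JVM default ...
        String xml = new String(Files.readAllBytes(Paths.get("/path/to/xml/file")),
                StandardCharsets.UTF_8);
        // ... and be explicit again when producing the byte[] for the PDF renderer.
        byte[] payload = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println(payload.length + " bytes read");
    }
}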
I am trying to copy files to an NFSv3-mounted volume during a Spark job. Some of the file names contain umlauts. For example:
Malformed input or input contains unmappable characters:
/import/nfsmountpoint/Währungszählmaske.pdf
The error occurs in the following line of scala code:
//targetPath is String and looks ok
val target = Paths.get(targetPath)
The file encoding is shown as ANSI X3.4-1968, although the Linux locale on the Spark machines is set to en_US.UTF-8.
I already tried to change the locale for the Spark job itself using the following arguments:
--conf 'spark.executor.extraJavaOptions=-Dsun.jnu.encoding=UTF8 -Dfile.encoding=UTF8'
--conf 'spark.driver.extraJavaOptions=-Dsun.jnu.encoding=UTF8 -Dfile.encoding=UTF8'
This solves the error, but the filename on the target volume looks like this:
/import/nfsmountpoint/W?hrungsz?hlmaske.pdf
The volume mountpoint is:
hnnetapp666.mydomain:/vol/nfsmountpoint on /import/nfsmountpoint type nfs (rw,nosuid,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,noacl,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=4.14.1.36,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=4.14.1.36)
Is there a possible way to fix this?
Solved this by setting the encoding settings mentioned above and manually converting the file names from and to UTF-8 (see the linked solution for the encoding conversion).
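The conversion boils down to re-interpreting the name's raw bytes in the right charset; a rough, hypothetical sketch of that pattern (which direction is correct depends on where the garbled name came from):

import java.nio.charset.StandardCharsets;

public class FileNameRecode {
    // Hypothetical helper: repairs a name whose UTF-8 bytes were mistakenly
    // decoded as ISO-8859-1 (the classic mojibake case). The reverse direction
    // (string -> ISO-8859-1 view of UTF-8 bytes) is the mirror image of this call.
    static String reinterpretAsUtf8(String name) {
        byte[] raw = name.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "WÃ¤hrungszÃ¤hlmaske.pdf" is what "Währungszählmaske.pdf" looks like
        // after such a mis-decoding; the helper turns it back.
        System.out.println(reinterpretAsUtf8("WÃ¤hrungszÃ¤hlmaske.pdf"));
    }
}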
Just using NFSv4 with UTF-8 support would have been an easier solution.
Hi Stack Overflow community, I am having an issue reading a file from my Java webapp. I want to get a file from a directory in my webapp and then convert it to PDF. Everything works just fine in my development environment (Windows), but when I put this on the server (Linux), when the server reaches the code that reads my .doc file to convert it, Java throws this exception:
com.sun.star.lang.IllegalArgumentException - Unsupported URL <file:///
Here is the code:
fileDocToConvert = new File(GET_REAL_PATH()+repo_Name+slash+fileName);
The fileDocToConvert path then becomes: /usr/share/tomcat7/webapps/myapp/repo_name/exemple.doc
The exception is thrown when I try to convert:
OpenOfficeConnection connection = new SocketOpenOfficeConnection(8100);
connection.connect();
DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
// the exception fires on the next call:
// converter.convert(docFile, pdfFile);
I am using:
jodConverter 2.2.1, OpenOffice 3, Java 7, Tomcat 7
I start the OpenOffice service this way:
soffice --headless --accept="socket,host=127.0.0.1,port=8100;urp;" --nofirststartwizard
I can't find a way to solve this issue.
Thank you in advance.
I resolved the problem by installing some missing components of OpenOffice (Calc, Writer). The problem was that OpenOffice couldn't understand the path of the file given to it.
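For reference, the conversion itself is just the standard JODConverter 2.x call; a minimal sketch with placeholder paths:

import java.io.File;

import com.artofsolving.jodconverter.DocumentConverter;
import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;

public class DocToPdf {
    public static void main(String[] args) throws Exception {
        // Placeholder paths; in the webapp these come from GET_REAL_PATH() etc.
        File docFile = new File("/usr/share/tomcat7/webapps/myapp/repo_name/exemple.doc");
        File pdfFile = new File("/usr/share/tomcat7/webapps/myapp/repo_name/exemple.pdf");

        OpenOfficeConnection connection = new SocketOpenOfficeConnection(8100);
        connection.connect();
        try {
            DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
            // fails with "Unsupported URL" if the needed OpenOffice component/filter is missing
            converter.convert(docFile, pdfFile);
        } finally {
            connection.disconnect();
        }
    }
}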
Thanks for your help, millimoose.
I got the java.lang.Exception: Unsupported URL <file:////... error message when multiple LibreOffice instances were started by different users with the same, conflicting port setting.
The problem for me was that OpenOffice, or another program that uses OpenOffice components, could not understand the path it was given as a place to save the file. Save the file you are trying to save somewhere else on your computer and see if that works.
I have a Java class that uploads a text file from a Windows client to a Linux server.
The file I am trying to upload is encoded using Cp1252 or ISO-8859-1.
When the file is uploaded, it becomes encoded as UTF-8, and strings containing accents like éèà can't be read.
The command
file -i *
on the Linux server tells me that it's encoded as UTF-8.
I think the encoding was changed during the upload, so I added this code to my servlet:
String currentEncoding=System.getProperty("file.encoding");
System.setProperty("file.encoding", "Cp1252");
item.write(file);
System.setProperty("file.encoding", currentEncoding);
In the jsp file, I have this code:
<form name="formUpload"
action="..." method="post"
enctype="multipart/form-data" accept-charset="ISO-8859-1">
The library I use to upload the file is Apache Commons FileUpload.
Does anyone have a clue? I'm really running out of ideas!
Thanks,
Otmane MALIH
Setting the system property file.encoding only works when you start Java (e.g. with -Dfile.encoding=...); changing it at runtime, as in your servlet, has no effect. Instead, you will have to open the file with this code:
public static BufferedWriter createWriter( File file, Charset charset ) throws IOException {
    FileOutputStream stream = new FileOutputStream( file );
    return new BufferedWriter( new OutputStreamWriter( stream, charset ) );
}
Use Charset.forName("iso8859-1") (or StandardCharsets.ISO_8859_1) as the charset parameter.
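A usage sketch (path and content are placeholders), writing the upload through that helper instead of relying on file.encoding:

// Usage sketch: createWriter is the helper above; path and content are placeholders.
File target = new File("/path/on/server/upload.txt");
try (BufferedWriter out = createWriter(target, Charset.forName("iso8859-1"))) {
    out.write("éèà");
}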
[EDIT] Your problem is most likely the file command. macOS is the only OS in the world which can tell you the encoding of a file with confidence. Windows and Linux have to guess, and this guess can be wrong.
So what you need to do is open the file with an editor in which you specify the encoding. You need to do that on Windows (to make sure that the file really was saved with Cp1252; some applications ignore the platform and always save their data in UTF-8).
And you need to do the same on Linux. If you just open the file, the editor will take the platform encoding (which is UTF-8 on modern Linux systems) and try to read the file with that -> ISO-8859-1 umlauts will be garbled. But if you open the file with ISO-8859-1, then UTF-8 will be garbled. That's the only way to be sure what the encoding of a text file really is.
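If you want a check that does not depend on an editor's guess, you can ask Java whether the bytes even decode under a given charset; a minimal sketch (note that ISO-8859-1 accepts any byte sequence, so only the UTF-8 check is really telling):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.*;
import java.nio.file.*;

public class CharsetCheck {
    // Strict decode: returns false if the file contains byte sequences
    // that are invalid in the given charset.
    static boolean decodesAs(Path path, Charset charset) throws IOException {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(Files.readAllBytes(path)));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get(args[0]);
        System.out.println("valid UTF-8:      " + decodesAs(p, StandardCharsets.UTF_8));
        System.out.println("valid ISO-8859-1: " + decodesAs(p, StandardCharsets.ISO_8859_1));
    }
}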