Java | CSV issue with ^M - java

I have a problem with the annoying ^M character when exporting data and writing it to a CSV file for download. I did some research and found that this happens when the file being read comes from a Windows system: Windows uses a CR/LF pair (CR is displayed as ^M) to mark the end of a line, while UNIX uses only LF.
Can anyone offer a solution to this problem, such as removing or replacing ^M before passing the data to the writer (writer.write(columnToBeInserted);)?

You could use unix2dos and dos2unix to convert UNIX and Windows files respectively. Both are available on *nix and Windows platforms.
Links for Windows
Dos2Unix
Unix2Dos
Also see How to convert files from Dos to Unix in java

As you read each line, do
line = line.replaceAll("\\p{Cntrl}", "");
Or use a tool to do it for you.
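Note that \p{Cntrl} matches every control character, so tabs are removed as well. If only the carriage return is the problem, a narrower variant can be sketched as follows. The class and method names (CsvFieldCleaner, stripCarriageReturns) are made up for illustration, and the StringWriter merely stands in for whatever writer the export actually uses.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper for illustration; not part of any library.
final class CsvFieldCleaner {

    // Removes every carriage return (displayed as ^M); line feeds are kept.
    static String stripCarriageReturns(String value) {
        return value == null ? null : value.replace("\r", "");
    }

    public static void main(String[] args) throws IOException {
        Writer writer = new StringWriter(); // stand-in for the real CSV writer
        String columnToBeInserted = "first line\r\nsecond line";
        writer.write(stripCarriageReturns(columnToBeInserted));
        System.out.println(writer.toString().contains("\r")); // prints "false"
    }
}
```

String.replace works on literal text, so no regular expression is compiled for each column value.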

In a Linux/UNIX environment there are utilities called dos2unix and unix2dos, which convert files from Windows to Linux format and vice versa.
On Windows, check this link and download the utility, which will convert from Windows to Linux format: http://www.sg-chem.net/u2win/

Related

Windows couldn't remove file with large path

Google Web Toolkit (GWT) generates a huge number of temporary files in the temp directory (C:\Users\User01\AppData\Local\Temp).
Example of a file path:
C:\Users\User01\AppData\Local\Temp\gwt-codeserver-1101830889369654349.tmp\com.company01.web.builder.BuildingsWeb\compile-2\gen\com\company01\web\theme\custom_pluto123\client\base\progressbar\Css3ProgressBarAppearance_Css3ProgressBarTemplate_render_SafeHtml__SafeHtml_text__Css3ProgressBarStyles_style__SafeStyles_wrapStyles__SafeStyles_progressBarStyles__SafeStyles_progressTextStyles__SafeStyles_widthStyles___SafeHtmlTemplatesImpl.java
The above file path contains 437 characters.
When I tried to remove files of this type from Windows Explorer, it crashed. I also tried to remove or rename the file from the command prompt, which reports: The filename or extension is too long.
Finally, I deleted it by running a custom Java program.
Now, my question is: why couldn't Windows remove it? If it's not supported by the OS, how does Java remove it?
Note:
I tried all of the above commands/actions with proper UAC (Run as administrator) in Windows 7 Ultimate and the File System was NTFS
Windows has a path length limit of 260 characters (MAX_PATH), but the Unicode versions of its API functions allow paths of up to 32,767 characters.
Windows Explorer sadly cannot handle long paths.
Java seems to use the Unicode API and can therefore create and remove long paths.
Resources:
https://support.microsoft.com/en-us/kb/320081
https://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx
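The question does not show the custom Java program that deleted the files. A minimal sketch of one way to write it, assuming Java 7 or later and the java.nio.file API, is given below. The class name LongPathCleaner and the method deleteRecursively are hypothetical, and whether a particular JVM version copes with paths beyond MAX_PATH has not been verified here.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration; not the asker's actual program.
final class LongPathCleaner {

    // Deletes the given directory tree: files first, then each emptied directory.
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc; // a failure while listing the directory
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LongPathCleaner <directory-to-delete>
        deleteRecursively(Paths.get(args[0]));
    }
}
```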

Java difference between running from netbeans and cmd

I have a program that writes text data to files. When I run it from NetBeans, the files are in the correct encoding and can be read with Notepad. When I run it from cmd using java -cp ....jar, the encoding is different.
What might be the issue?
P.S. I've checked that the same JRE version (v 1.8.0_31) is used in both cases.
Netbeans startup scripts may specify a different encoding than your system default. You can check in your netbeans.conf.
You can set the file.encoding property when invoking java. For example, java -Dfile.encoding=UTF8 -cp... jar.
If you do not want to be surprised when running your code in different environments, an even better solution is to specify the encoding in your source code.
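As a rough sketch of specifying the encoding in code, the writer below is opened with an explicit charset instead of the platform default. UTF-8 is an assumption here, as are the class name ExplicitEncodingWriter and the file name output.txt; the question does not say which encoding the files should use.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class; the names are illustrative only.
final class ExplicitEncodingWriter {

    // Writes the text as UTF-8 regardless of the file.encoding system property.
    static void writeUtf8(Path target, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // \u00e9 is "e" with an acute accent, written as an escape to keep the source ASCII-only.
        writeUtf8(Paths.get("output.txt"), "caf\u00e9");
    }
}
```

With the charset fixed in the code, the output is the same whether the program is started from NetBeans or from cmd.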
Further reading:
file encoding: Character and Byte Streams
netbeans.conf encoding options: How To Display UTF8 In Netbeans 7?

How to execute a .vbs file in a linux OS from java

I have a .vbs file that converts a docx file to PDF, and I run this .vbs from Java on Windows. Since I need this program to run on a Linux-based OS, I don't know whether this solution will work.
The .vbs and the Java code that I use for the project are at this link: http://mydailyjava.blogspot.mx/2013/05/converting-microsoft-doc-or-docx-files.html
Note: I tried other solutions to convert the docx file to PDF, but these solutions (docx4j, xdocreports, jodConverter) cause loss of formatting in the final PDF file, so those APIs are not an option.
It is unlikely that you would be able to run the mentioned programs on Linux, since for that you would need:
Microsoft Word installed, to open the Word file and print it
Microsoft scripting host, to execute the vbs script
A batch script interpreter that can access the scripting host
Since all these items are Microsoft software, they don't run natively on Linux.
So you will have to find alternatives as suggested by vzamanillo, or maybe find a way to run this in a WINE environment, but then that's not really Linux.
Since you're doing this from java, if you are open to using 3rd party software you could try jWordConvert.

Opening Files with Java while Working in Cygwin

I am running Cygwin on a Windows 7 machine, and using script files to execute Java programs in batch. My problem is this: I try to pass in a Cygwin / Linux path to a file, via the command line, and Java converts all of the forward slashes to backslashes.
For instance:
java program $scratchname/path_to_folder/ filename_$i.txt
Within Java, I take the directory and add the file name to open the file, which usually works with no issues as long as I'm using a Windows command line. However, in Cygwin Java converts this to:
home\scratch\path_to_folder
which Cygwin doesn't like.
I don't think this is a simple matter of replacing the backslashes with forward slashes, because Java seems to default to the Windows path conventions when I try to open the file. I'm guessing this is because Cygwin is pointed to the Windows installation of the JVM.
How can I force Java to use Cygwin / Linux path name conventions on a Windows system?
Java is a Windows program, and as such, only understands Windows paths; launching it from a Cygwin shell can't change that. You can use cygpath to convert paths back and forth.
Reference link: https://cygwin.com/cygwin-ug-net/using-effectively.html
Example case:
java -jar path/to/your-1.0.jar "$(cygpath -aw /home/YOUR_USER/path/to/file.txt)"
Options:
-a provides the absolute path
-w uses the Windows format

How to convert files from Dos to Unix

I have several files that I want to convert from DOS to Unix. Is there any API or method which will help me to do this?
There are linux tools that can do this (dos2unix for example).
In Java it can be done with String.replaceAll().
DOS uses \r\n for line termination, while UNIX uses a single \n.
String unixText = windowsText.replaceAll("\r\n", "\n"); // DOS2UNIX
So no, no API exists. Yes, it is dead easy.
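Applied to whole files, the same replacement could look like the sketch below. It assumes the files are UTF-8 text and small enough to load into memory; the class name Dos2UnixSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; assumes small UTF-8 text files.
final class Dos2UnixSketch {

    // Replaces every CR LF pair with a single LF.
    static String toUnix(String text) {
        return text.replace("\r\n", "\n");
    }

    // Rewrites the file in place with Unix line endings.
    static void convertFile(Path file) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Files.write(file, toUnix(content).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Dos2UnixSketch file1.txt file2.txt ...
        for (String name : args) {
            convertFile(Paths.get(name));
        }
    }
}
```

String.replace is used instead of replaceAll because the search text is a literal, so no regular expression is needed.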
There is a utility in Linux/Unix called dos2unix that will help you convert your files from DOS to Unix format. To install it, simply type the following in a console (you may need root privileges):
yum install dos2unix
To do the conversion, use the dos2unix command followed by the filename. For example:
[aniket@localhost ~]$ dos2unix sample.txt
dos2unix: converting file sample.txt to UNIX format ...
For all files in a directory you can simply use
dos2unix *
Just an alternative way (to the one parasietje described) using dos2unix. Say all your DOS files are in one folder:
Runtime.getRuntime().exec(new String[] {"sh", "-c", "dos2unix /path/to/dos/files/*"}); // run through a shell so that * is expanded
Most Unix/Linux distributions provide the unix2dos and dos2unix utilities.
EDIT:
Just copy your files to a Unix machine and run dos2unix *.
You can also find this utility for Windows and do the same.
String unixText = windowsText.replaceAll("\r\n", "\n"); // DOS2UNIX
The above line should remove all \r but for some reason it also removes the \n so I had to add it back when printing unixText to a file:
unixText + "\n"
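The answer does not show how windowsText was read. One possible explanation, and it is only an assumption, is that the text was read line by line with BufferedReader.readLine(), which returns each line without its terminator, so there is no \n left for replaceAll to keep. The hypothetical sketch below (class ReadLineDemo) reads that way and appends \n itself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo; the reading strategy is an assumption about the answer above.
final class ReadLineDemo {

    // Reads the text line by line and re-joins the lines with a single \n each.
    static String normalize(String dosText) throws IOException {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(dosText))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() has already dropped \r\n (or \n), so add the Unix terminator back
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(normalize("a\r\nb\r\n").equals("a\nb\n")); // prints "true"
    }
}
```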
