How to save a string with code page 1250 into a RandomAccessFile in Java

I have a text file containing a string encoded in code page 1250. I want to save the text into a RandomAccessFile, but when I read the bytes back from the RandomAccessFile I get a string with different characters. Some solution...

If you're using writeUTF() then you should read its JavaDoc to learn that it always writes modified UTF-8.
If you want to use another encoding, then you'll have to "manually" do the encoding and somehow store the length of the byte[] as well.
For example:
RandomAccessFile raf = ...;
String writeThis = ...;
byte[] cp1250Data = writeThis.getBytes("cp1250");
raf.writeInt(cp1250Data.length);
raf.write(cp1250Data);
Reading would work similarly:
RandomAccessFile raf = ...;
int length = raf.readInt();
byte[] cp1250Data = new byte[length];
raf.readFully(cp1250Data);
String string = new String(cp1250Data, "cp1250");
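Putting the two fragments above together, a minimal self-contained sketch might look like the following. The class name Cp1250Records, the helper names writeString and readString, and the temporary file are all hypothetical, chosen only for illustration; the sample text uses Unicode escapes so the source compiles regardless of the compiler's default encoding.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;

final class Cp1250Records {
    // "windows-1250" is the canonical Java charset name for code page 1250.
    static final Charset CP1250 = Charset.forName("windows-1250");

    // Writes a 4-byte length prefix followed by the cp1250-encoded bytes.
    static void writeString(RandomAccessFile raf, String s) throws IOException {
        byte[] data = s.getBytes(CP1250);
        raf.writeInt(data.length);
        raf.write(data);
    }

    // Reads the length prefix, then exactly that many bytes, and decodes them.
    static String readString(RandomAccessFile raf) throws IOException {
        int length = raf.readInt();
        byte[] data = new byte[length];
        raf.readFully(data);
        return new String(data, CP1250);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file, used only for this demonstration.
        File file = File.createTempFile("cp1250-demo", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            // Czech sample text written with Unicode escapes.
            writeString(raf, "\u017Elu\u0165ou\u010Dk\u00FD k\u016F\u0148");
            raf.seek(0); // rewind before reading the record back
            System.out.println(readString(raf));
        }
    }
}
```

Passing a Charset object instead of a charset name also avoids the checked UnsupportedEncodingException that getBytes(String) and new String(byte[], String) declare.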

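This code writes and reads a string using the 1250 code page. Of course, you will need to clean it up and handle exceptions properly before putting it in production :)
import java.io.*;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

public static void main(String[] args) throws Exception {
    File file = new File("/toto.txt");
    String myString = "This is a test";
    // Encode the text as windows-1250 while writing; closing the writer flushes it.
    try (Writer w = new OutputStreamWriter(new FileOutputStream(file), Charset.forName("windows-1250"))) {
        w.write(myString);
    }
    // windows-1250 uses one byte per character, so the file length is a safe buffer size.
    CharBuffer b = CharBuffer.allocate((int) file.length());
    try (Reader r = new InputStreamReader(new FileInputStream(file), Charset.forName("windows-1250"))) {
        r.read(b);
    }
    b.flip(); // read(CharBuffer) does not flip, so without this toString() returns an empty string
    System.out.println(b.toString());
}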

Related

How to read n base64 encoded characters from a file at a time and decode and write to another file?

Currently I have a source file which has base64 encoded data (20 mb in size approx). I want to read from this file, decode the data and write to a .TIF output file. However I don't want to decode all 20MB data at once. I want to read a specific number of characters/bytes from the source file, decode it and write to destination file. I understand that the size of the data I read from the source file has to be in multiples of 4 or else it can't be decoded?
Below is my current code where I decode it all at once
public void writeOutput(File file) throws IOException {
    BufferedReader br = new BufferedReader(new FileReader(file));
    StringBuilder sb = new StringBuilder();
    String line = br.readLine();
    while (line != null) {
        ....
        //Read line by line and append to sb
    }
    byte[] decoded = Base64.getMimeDecoder().decode(sb.toString());
    File outputFile = new File("output.tif");
    OutputStream out = new BufferedOutputStream(new FileOutputStream(outputFile));
    out.write(decoded);
    out.flush();
}
How can I read specific number of characters from source file and decode and then write to output file so I don't have to load everything in memory?
Here is a simple method to demonstrate doing this, by wrapping the Base64 Decoder around an input stream and reading into an appropriately sized byte array.
public static void readBase64File(File inputFile, File outputFile, int chunkSize) throws IOException {
    // try-with-resources closes both streams even if decoding fails part-way
    try (InputStream base64Stream = Base64.getMimeDecoder().wrap(new FileInputStream(inputFile));
         FileOutputStream fout = new FileOutputStream(outputFile)) {
        byte[] chunk = new byte[chunkSize];
        int read;
        // the wrapping stream decodes on the fly, so only one chunk is held in memory
        while ((read = base64Stream.read(chunk)) != -1) {
            fout.write(chunk, 0, read);
        }
    }
}

Read file content as bytes in Java

I have 2 java classes. Let them be class A and class B.
Class A gets String input from the user and stores the input as bytes in the file; Class B should then read the file and display the bytes as a String.
CLASS A:
File file = new File("C:\\FILE.txt");
file.createNewFile();
FileOutputStream fos = new FileOutputStream(file);
String fwrite = user_input_1+"\n"+user_input_2;
fos.write(fwrite.getBytes());
fos.flush();
fos.close();
In CLASS B, I wrote the code to read the file, but I don't know how to read the file content as bytes.
CLASS B:
fr = new FileReader(file);
br = new BufferedReader(fr);
arr = new ArrayList<String>();
int i = 0;
while((getF = br.readLine()) != null){
arr.add(getF);
}
String[] sarr = (String[]) arr.toArray(new String[0]);
The FILE.txt has the following lines
[B@3ce76a1
[B@36245605
I want both these lines to be converted into their respective string values and then display it. How to do it?
Are you required to store the data as the String representation of a byte[]? Take a look at object serialization (Object Serialization Tutorial); with it you don't have to worry about low-level, line-by-line read or write methods.
Since you are writing a byte array through the FileOutputStream, the opposite operation would be to read the file using the FileInputStream, and construct the String from the byte array:
File file = new File("C:\\FILE.txt");
long fileLength = file.length();
byte[] bytes = new byte[(int) fileLength];
try (FileInputStream fis = new FileInputStream(file)) {
fis.read(bytes);
}
String result = new String(bytes);
However, there are better ways of writing the String to a file.
You could write it using a FileWriter and read it using a FileReader (possibly wrapping them in the corresponding BufferedWriter/BufferedReader), which avoids creating an intermediate byte array. Or better yet, use Apache Commons' IOUtils or Google's Guava libraries.
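As one possible illustration of writing and reading a String with an explicit encoding, here is a small sketch. It swaps in java.nio.file.Files from the standard library (Java 7+) rather than the classes and libraries named above, and it assumes UTF-8 is an acceptable encoding; the class name TextFileRoundTrip and the temporary file are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TextFileRoundTrip {
    // Encodes the text explicitly as UTF-8 instead of relying on the platform default.
    static void writeText(Path path, String text) throws IOException {
        Files.write(path, text.getBytes(StandardCharsets.UTF_8));
    }

    // Reads the whole file and decodes it with the same charset used for writing.
    static String readText(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("file-demo", ".txt"); // hypothetical scratch file
        writeText(path, "first input\nsecond input");
        System.out.println(readText(path));
        Files.delete(path);
    }
}
```

Using the same named charset on both sides is what keeps the round trip stable across machines with different default encodings.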

ANSI to UTF-8 through Java : some lines are lost

I wanted to convert some files from ANSI (Arabic) to UTF-8. It works but after the new file is created, it is missing some lines (at the end). Any ideas why?
This is the code :
public class CustomFileConverter {
private static final char BYTE_ORDER_MARK = '\uFEFF';
public void createFile(String inputFile, String outputFile) throws IOException{
FileInputStream input = new FileInputStream(inputFile);
InputStreamReader inputStreamReader = new InputStreamReader(input, "windows-1256"); // Arabic
char[] data = new char[1024];
int i = inputStreamReader.read(data);
if(new File(outputFile).exists()){
new File(outputFile).delete();
}
FileOutputStream output = new FileOutputStream(outputFile,true);
Writer writer = new OutputStreamWriter(output,"UTF-8");
String text = "";
writer.write(BYTE_ORDER_MARK);
while(i !=-1){
String str = new String(data,0,i);
text = text+str;
i = inputStreamReader.read(data);
}
// System.out.print(text); It is printed Completely
writer.write(text);
// File lacks some final lines...
output.close();
input.close();
}
}
When wrapping an output stream in a writer and writing to the writer, the writer may cache data before actually forwarding it to the output stream.
Since you're closing the output stream (file) before flushing the writer, there may be unwritten data which can no longer be written to the file since the output stream is closed.
Instead of closing the FileOutputStream (output), close the Writer (writer). That flushes the writer's contents to the file and also closes both the writer itself and the wrapped FileOutputStream.
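A minimal sketch of that advice, assuming a streaming copy is acceptable in place of building the whole text in memory: try-with-resources closes the writer, which flushes it first. The class name CharsetConverter and the method name convert are hypothetical, and the byte order mark from the question is left out to keep the example short.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

final class CharsetConverter {
    // Decodes the input file with the source charset and re-encodes it with the target charset.
    static void convert(File in, Charset source, File out, Charset target) throws IOException {
        try (Reader reader = new InputStreamReader(new FileInputStream(in), source);
             Writer writer = new OutputStreamWriter(new FileOutputStream(out), target)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n); // write only the chars actually read
            }
        } // leaving the block closes the writer, which flushes any buffered data
    }
}
```

For the question's case the call might be CharsetConverter.convert(new File(inputFile), Charset.forName("windows-1256"), new File(outputFile), StandardCharsets.UTF_8).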

Export string with bytes to file without encoding

I stored bytes within a string in Java
String header ="00110011000000011001000000000001001011000000000100000010000000000000000000000000000000000000000000000000000000000000000000000000";
Now I want to write that String to a file, but export it as a series of bits rather than encoded as text.
Writing to the file looks like this:
BufferedWriter writer = new BufferedWriter (new FileWriter("test.epd"));
writer.write(header);
How can I do this? (The string in the real program will be longer, around 8 kB.)
I would use BinaryCodec from Apache Commons Codec.
String headerb = "00110011000000011001000000000001001011000000000100000010000000000000000000000000000000000000000000000000000000000000000000000000";
BinaryCodec codec = new BinaryCodec();
//I have no idea why this method is not static.
//you may use BinaryCodec.fromAscii(headerb.toCharArray()) instead
byte[] bval = codec.toByteArray(headerb);
File file = new File("test.epd");
//Files here is Guava's com.google.common.io.Files, not java.nio.file.Files
Files.write(bval, file );
//Test that when the file is read, we retrieve the same string
byte[] byteArray = Files.toByteArray(file);
String asciiString = BinaryCodec.toAsciiString(byteArray);
System.out.println(asciiString);
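If adding a library is not an option, the same idea can be sketched with the standard library alone. This version assumes each group of 8 characters becomes one byte in reading order, most significant bit first; that layout is an assumption about the intended file format, so check it against whatever consumes the file. The class name BitStringWriter and the method name bitsToBytes are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class BitStringWriter {
    // Converts a string of '0'/'1' characters to bytes: 8 characters per byte,
    // most significant bit first (assumed layout).
    static byte[] bitsToBytes(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("bit string length must be a multiple of 8");
        }
        byte[] out = new byte[bits.length() / 8];
        for (int i = 0; i < out.length; i++) {
            // Radix 2 parses the 8-character group as a binary number in 0..255.
            out[i] = (byte) Integer.parseInt(bits.substring(i * 8, i * 8 + 8), 2);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("header-demo", ".epd"); // hypothetical scratch file
        Files.write(path, bitsToBytes("0011001100000001"));      // writes raw bytes, not text
        System.out.println(Files.size(path) + " bytes written");
        Files.delete(path);
    }
}
```

An 8 kB bit string yields about 1 kB of output, so holding the whole array in memory is not a concern at that size.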

Java FileInputStream

I am trying to use a FileInputStream to essentially read in a text file, and then output it in a different text file. However, I always get very strange characters when I do this. I'm sure it's some simple mistake I'm making, thanks for any help or pointing me in the right direction. Here's what I've got so far.
File sendFile = new File(fileName);
FileInputStream fileIn = new FileInputStream(sendFile);
byte buf[] = new byte[1024];
while(fileIn.read(buf) > 0) {
System.out.println(buf);
}
The file it is reading from is just a big text file of regular ASCII characters. Whenever I do the System.out.println, however, I get the output [B@a422ede. Any ideas on how to make this work? Thanks
This happens because you are printing a byte array object itself, rather than printing its content. You should construct a String from the buffer and a length, and print that String instead. The constructor to use for this is
String s = new String(buf, 0, len, charsetName);
Above, len should be the value returned by the call of the read() method. The charsetName should represent the encoding used by the underlying file.
If you're reading from a file to another file, you shouldn't convert the bytes to a string at all, just write the bytes read into the other file.
If your intention is to convert a text file from an encoding to another, read from a new InputStreamReader(in, sourceEncoding), and write to a new OutputStreamWriter(out, targetEncoding).
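For the plain file-to-file case, where the bytes are never turned into a String, a minimal sketch could look like this. The class name StreamCopy and the method name copy are hypothetical; the important detail is that each write uses the length returned by read() rather than the full buffer.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    // Copies source to target byte for byte and returns the number of bytes copied.
    static long copy(File source, File target) throws IOException {
        long total = 0;
        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(target)) {
            byte[] buf = new byte[1024];
            int len;
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len); // only the bytes read in this pass
                total += len;
            }
        }
        return total;
    }
}
```

Because no decoding happens, the copy is exact whatever encoding the text file uses.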
That's because printing buf prints a reference to the byte array, not the bytes themselves as a String, as you might expect. You need to do new String(buf) to turn the byte array into a string.
Also consider using BufferedReader rather than creating your own buffer. With it you can just do
String line = new BufferedReader(new FileReader("filename.txt")).readLine();
Your loop should look like this:
int len;
while((len = fileIn.read(buf)) > 0) {
System.out.write(buf, 0, len);
}
You are (a) using the wrong method and (b) ignoring the length returned by read(), other than checking that it is > 0. So you are printing junk at the end of each buffer.
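An object's default toString() method returns the class name followed by the object's hash code, not its contents.
byte buf[] is an array, and arrays are objects, so they inherit that behavior.
You can print the contents like this: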
File sendFile = new File(fileName);
FileInputStream fileIn = new FileInputStream(sendFile);
byte buf[] = new byte[1024];
int len;
while((len = fileIn.read(buf)) > 0) {
// print only the bytes actually read in this pass
System.out.println(Arrays.toString(Arrays.copyOf(buf, len)));
}
or
File sendFile = new File(fileName);
FileInputStream fileIn = new FileInputStream(sendFile);
byte buf[] = new byte[1024];
int len=0;
while((len=fileIn.read(buf)) > 0) {
for(int i=0;i<len;i++){
System.out.print(buf[i]);
}
System.out.println();
}
