How to encrypt a file with newline feed - java

I am encrypting a file, but the encrypted output comes out as one continuous string. I want the output to keep the same line structure as my input file. See the example below:
input file:
===========
Language,English
System Name,TSLGN0
Number of board SPC,12
.
.
Output Encrypted file:
========================
ADCDE12345456
ABCDDDDDDDEDEDAAAADDDD12333
ABCDE123456789
.
.
What I am getting:
760bad166e25ea1e2f6a741363816a15703f2e20524503eee544f69909dd69af760bad166e25ea1e2f
Code below:
BufferedWriter bwr = new BufferedWriter(new FileWriter(new File("C:\\Crypto_Out.txt")));
mbr = new BufferedReader(new FileReader("C:\\Crypto_In.txt"));
while ((line = mbr.readLine()) != null)
{
    enSecretText = encrypt(line);
    bwr.write(enSecretText.toString());
}
bwr.flush();
bwr.close();
Please suggest a fix.

Encryption treats a file as a stream of bytes. It is not interested in the meaning assigned to those bytes, just in how to encrypt them, so your ciphertext will be a continuous stream of bytes. It is up to you how to handle that ciphertext.
If you want the ciphertext as letters, then encode it as Base64. If you want to add newlines to your Base64 you can do so, but you must remove the newlines before decoding the Base64 to get back to the original ciphertext bytes.
Decrypting the ciphertext bytes will get you back to your original text.
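For example, here is a minimal sketch of that Base64-per-line framing. The encrypt/decrypt bodies below are placeholders for your own cipher calls:
import java.io.*;
import java.util.Base64;

public class LineCrypto {
    // Placeholders: substitute your own cipher calls here.
    static byte[] encrypt(byte[] plain) { return plain; }
    static byte[] decrypt(byte[] cipher) { return cipher; }

    static void encryptFile(File in, File out) throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader(in));
             BufferedWriter w = new BufferedWriter(new FileWriter(out))) {
            String line;
            while ((line = r.readLine()) != null) {
                // Base64 turns arbitrary ciphertext bytes into printable letters;
                // the newline is cosmetic framing, not part of the ciphertext.
                w.write(Base64.getEncoder().encodeToString(encrypt(line.getBytes("UTF-8"))));
                w.newLine();
            }
        }
    }

    static void decryptFile(File in, File out) throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader(in));
             BufferedWriter w = new BufferedWriter(new FileWriter(out))) {
            String line;
            // readLine() strips the newline again before the Base64 is decoded.
            while ((line = r.readLine()) != null) {
                w.write(new String(decrypt(Base64.getDecoder().decode(line)), "UTF-8"));
                w.newLine();
            }
        }
    }
}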

Just add a newline at the end of every iteration.
while ((line = mbr.readLine()) != null)
{
    enSecretText = encrypt(line);
    bwr.write(enSecretText.toString());
    bwr.newLine();
}

You have to add bwr.newLine(); after bwr.write.

Related

String format when reading from file

I have this example. It reads the line "hello" from a file saved as UTF-8. Here is my question:
Strings are stored in Java in UTF-16 format. So when it reads the line "hello" it converts it to UTF-16. So string s is UTF-16 with a UTF-16 BOM... Am I right?
filereader = new FileReader(file);
read = new BufferedReader(filereader);
String s = null;
while ((s = read.readLine()) != null)
{
    System.out.println(s);
}
So when I do this:
s = s.replace("\uFEFF", "A");
nothing happens. Should the above find and replace the UTF-16 BOM? Or is it eventually in UTF-8 format? I am a little confused about this.
Thank you
Try using the Apache Commons IO library and the class org.apache.commons.io.input.BOMInputStream to get rid of this kind of problem.
Example:
String defaultEncoding = "UTF-8";
InputStream inputStream = new FileInputStream(file);
try
{
    BOMInputStream bOMInputStream = new BOMInputStream(inputStream);
    ByteOrderMark bom = bOMInputStream.getBOM();
    String charsetName = bom == null ? defaultEncoding : bom.getCharsetName();
    InputStreamReader reader = new InputStreamReader(new BufferedInputStream(bOMInputStream), charsetName);
    // your code...
}
finally
{
    inputStream.close();
}
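Where the // your code... comment sits you could, for instance, read lines as usual; the (UTF-8) BOM is consumed by BOMInputStream and never reaches the Reader:
BufferedReader br = new BufferedReader(reader);
String line;
while ((line = br.readLine()) != null)
{
    System.out.println(line);   // no stray BOM character in the lines
}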
For what concerns the BOM itself, as @seand said, it is just metadata used when reading/writing/storing strings. It lives in the underlying bytes, and you cannot replace or modify it unless you work at the binary level or re-encode the string.
Let's look at a few examples:
String str = "Hadoop";
byte bt1[] = str.getBytes();
System.out.println(bt1.length); // 6
byte bt2a[] = str.getBytes("UTF-16");
System.out.println(bt2a.length); // 14
byte bt2b[] = str.getBytes("UTF-16BE");
System.out.println(bt2b.length); // 14
byte bt3[] = str.getBytes("UTF-16LE");
System.out.println(bt3.length); // 12
In the UTF-16 (which defaults to Big Endian) and UTF-16BE versions, you get 14 bytes because of the BOM being inserted to distinguish between BE and LE. If you specify UTF-16LE you get 12 bytes because of no BOM is being added.
You cannot strip the BOM from a string with a simple replace, as you tried. Because the BOM, if present, is only part of the underlying byte stream that, memory side, is being handled as a string by the java framework. And you can't manipulate it like you manipulate characters that are part of the string itself.
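If you do need to drop a BOM by hand without Commons IO, a byte-level sketch along these lines works (only the UTF-8 BOM, EF BB BF, is handled here):
import java.io.*;

public class SkipUtf8Bom {
    static InputStream open(File file) throws IOException {
        PushbackInputStream in = new PushbackInputStream(new FileInputStream(file), 3);
        byte[] maybeBom = new byte[3];
        int n = in.read(maybeBom, 0, 3);
        boolean isBom = n == 3
                && maybeBom[0] == (byte) 0xEF
                && maybeBom[1] == (byte) 0xBB
                && maybeBom[2] == (byte) 0xBF;
        if (!isBom && n > 0) {
            in.unread(maybeBom, 0, n);   // not a BOM: push the bytes back
        }
        return in;   // a BOM, if present, has been silently consumed
    }
}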

Convert bytes to string from a txt file

In our school project we are supposed to make a program which uses txt files to store data. I decided to simply store the data as bytes, with UTF-8 encoding.
When I write to the file I first convert my string to bytes.
Then when I try to read the info back from the file, I cannot, because the bytes I have stored in the txt file are in string form, and it's impossible to convert from String to bytes without an encoding.
Here are the relevant parts of my code:
try
{
    byte[] bytes = line.getBytes("UTF-8");
    out.println(bytes);
}
catch (IOException q)
{
    System.out.println("Cant encode pls");
}
^Writing
Reading:
try
{
    byte[] readbyte = in.readLine().getBytes("UTF-8");
    String str = new String(readbyte, "UTF-8");
    System.out.println(str + " Decoded");
    model1.addElement(str);
}
Unfortunately the reading only gives me back what I started with, since I simply encode the string from the txt file and decode it right back.
So how do I fix this?
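The underlying problem is that out.println(bytes) writes the array's toString() (something like [B@1e63b3d), not the bytes themselves. A minimal sketch of a real byte round trip (the file name is made up):
import java.io.FileOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ByteRoundTrip {
    public static void main(String[] args) throws Exception {
        Path path = Paths.get("data.txt");
        String line = "hello";

        // Writing: the raw UTF-8 bytes go into the file.
        try (FileOutputStream out = new FileOutputStream(path.toFile())) {
            out.write(line.getBytes("UTF-8"));
        }

        // Reading: get the same bytes back and decode with the same charset.
        byte[] readBytes = Files.readAllBytes(path);
        System.out.println(new String(readBytes, "UTF-8") + " Decoded");
    }
}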

Issue with java String and String Builder

I'm working on a project in which I have to decrypt XML text from a PHP server and parse it in Java. I have decrypted the XML text using CipherInputStream and seen the XML file fully printed, but I'm facing a weird issue while trying to store the XML text in a Java String. I'm working with the code below:
public String decrypt(InputStream Fis){
    Cipher c = Cipher.getInstance(algo + "/CBC/NoPadding");
    String add = "";
    StringBuilder getAns = new StringBuilder();
    c.init(Cipher.DECRYPT_MODE, key);
    CipherInputStream cis = new CipherInputStream(Fis, c);
    byte[] encData = new byte[16];
    int dummy;
    while ((dummy = cis.read(encData)) != -1)
    {
        System.out.println(new String(encData, "UTF-8").trim());
        add = (new String(encData, "UTF-8").trim());
        getAns.append(add);
        dummy = cis.read(encData);
    }
    System.out.println(getAns);
    ...
}
It prints the full XML text in LogCat via the first println statement inside the while loop, but when I print the StringBuilder getAns, only part of the text is printed.
I have also tried just using a String:
add = add + (new String(encData, "UTF-8").trim());
and also an ArrayList<String>, but it still only prints part of the text.
I guess this may be due to a silly mistake, but I'm struggling with it. Any help will be appreciated.
You are reading some bytes from the input stream inside the while condition:
while ((dummy = cis.read(encData)) != -1)
This already populates the encData byte array, with the number of bytes actually read held in the dummy variable. Afterwards you read some bytes again:
dummy = cis.read(encData);
This overwrites the bytes you read one step before. Delete that line!
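With that line gone, the loop can also honour the byte count that read(...) returns instead of relying on trim() (a sketch, reusing the question's variables):
while ((dummy = cis.read(encData)) != -1)
{
    // Decode only the dummy bytes actually read; the last block may be short.
    getAns.append(new String(encData, 0, dummy, "UTF-8"));
}
Note that decoding UTF-8 on arbitrary 16-byte boundaries can still split a multi-byte character; wrapping cis in an InputStreamReader avoids that entirely.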
Finally caught the issue: it's with System.out.println(), which has certain limits for printing. This function can only print around 4060 characters at a time; I found this by using getAns.substring(10000, 15000).

Java Charset InputStreamReader, File Channel Differences

I'm trying to read a (Japanese) file that is encoded as UTF-16.
When I read it using an InputStreamReader with a charset of "UTF-16", the file is read correctly:
try {
    InputStreamReader read = new InputStreamReader(new FileInputStream("JapanTest.txt"), "UTF-16");
    BufferedReader in = new BufferedReader(read);
    String str;
    while ((str = in.readLine()) != null) {
        System.out.println(str);
    }
    in.close();
} catch (Exception e) {
    System.out.println(e);
}
However, when I use file channels and read from a byte array, the Strings aren't always converted correctly:
File f = new File("JapanTest.txt");
fis = new FileInputStream(f);
channel = fis.getChannel();
MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0L, channel.size());
buffer.position(0);
int get = Math.min(buffer.remaining(), 1024);
byte[] barray = new byte[1024];
buffer.get(barray, 0, get);
Charset charSet = Charset.forName("UTF-16");
// endOfLinePos is a calculated value and defines the number of bytes to read
rowString = new String(barray, 0, endOfLinePos, charSet);
System.out.println(rowString);
The problem I've found is that I can only read characters correctly when the MappedByteBuffer is at position 0. If I increment the buffer's position and then read a number of bytes into a byte array and convert them to a string with the UTF-16 charset, the bytes are not converted correctly. I haven't faced this issue with files encoded in UTF-8, so is this only an issue with UTF-16?
More details:
I need to be able to read any line from the file channel, so I build a list of line-ending byte positions and then use those positions to get the bytes for any given line and convert them to a string.
The code unit of UTF-16 is 2 bytes, not 1 byte as in UTF-8. The bit patterns and single-byte code units make UTF-8 self-synchronizing: a decoder can start reading at any point, and if it lands on a continuation byte it can either backtrack or lose at most a single character.
With UTF-16 you must always work with pairs of bytes; you cannot start or stop reading at an odd byte offset. You must also know the endianness, and use either UTF-16LE or UTF-16BE when not reading from the start of the file, because at that point there will be no BOM.
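A sketch of reading a single line from an offset under those constraints (the byte order is assumed to be little-endian here, and both offsets must be even):
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class Utf16LineRead {
    // lineStart and lineEnd come from a pre-built index of line-ending positions.
    static String readLine(FileChannel channel, int lineStart, int lineEnd) throws Exception {
        MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0L, channel.size());
        byte[] lineBytes = new byte[lineEnd - lineStart];
        buffer.position(lineStart);
        buffer.get(lineBytes);
        // Past the BOM there is nothing in the bytes to mark the byte order,
        // so the endianness must be named explicitly.
        return new String(lineBytes, StandardCharsets.UTF_16LE);
    }
}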
You can also encode the file as UTF-8.
Possibly the InputStreamReader does some transformations that a plain new String(...) does not. As a workaround (and to verify this assumption) you could try wrapping the data read from the channel, like new InputStreamReader(new ByteArrayInputStream(barray)).
Edit: Forget that :) - Channels.newReader() would be the way to go.
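That is, something along these lines:
// Channels.newReader decodes straight off the channel with the given charset.
BufferedReader in = new BufferedReader(Channels.newReader(channel, "UTF-16"));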

Corrupt Gzip string due to character encoding

I have some corrupted Gzip log files that I'm trying to restore. The files were transferred to our servers through a Java-backed web page. The files have always been sent as plain text, but we recently started to receive log files that were Gzipped. These Gzipped files appear to be corrupted and are not unzip-able, and the originals have been deleted. I believe this is caused by the character encoding in the method below.
Is there any way to revert the process to restore the files to their original zipped format? I have the resulting String's binary data in a database BLOB.
Thanks for any help you can give!
private String convertStreamToString(InputStream is) throws IOException {
    /*
     * To convert the InputStream to String we use the
     * Reader.read(char[] buffer) method. We iterate until the
     * Reader returns -1 which means there's no more data to
     * read. We use the StringWriter class to produce the string.
     */
    if (is != null) {
        Writer writer = new StringWriter();
        char[] buffer = new char[1024];
        try {
            Reader reader = new BufferedReader(
                    new InputStreamReader(is, "UTF-8"));
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } finally {
            is.close();
        }
        return writer.toString();
    } else {
        return "";
    }
}
If this is the method that was used to convert the InputStream to a String, then your data is almost certainly lost.
The problem is that UTF-8 has quite a few byte sequences that are simply not legal (i.e. they don't represent any value). These sequences will be replaced with the Unicode replacement character.
That character is the same no matter which invalid byte sequence was decoded. Therefore the specific information in those bytes is lost.
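A quick demonstration of why the damage is not invertible: two different invalid byte sequences (0x8B, the second byte of the gzip magic number, among them) decode to the identical replacement character:
import java.nio.charset.StandardCharsets;

public class LossyDecodeDemo {
    public static void main(String[] args) {
        // Neither 0x8B nor 0xFF can start a valid UTF-8 sequence, so both
        // decode to U+FFFD and become indistinguishable.
        String a = new String(new byte[] { 0x1f, (byte) 0x8b }, StandardCharsets.UTF_8);
        String b = new String(new byte[] { 0x1f, (byte) 0xff }, StandardCharsets.UTF_8);
        System.out.println(a.equals(b)); // true
    }
}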
If that's the code you have, you should never have converted to a Reader (or indeed a String); only preserving the data as a stream (or byte array) would have avoided corrupting binary files. Once it has been read into the string, illegal sequences (and there are many in UTF-8) WILL have been discarded.
So no, unless you are quite lucky, there is no way to recover the information. You'll have to provide another process in which you handle the pure stream and insert it as a pure BLOB, not a CLOB.
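For the replacement process, a sketch of the stream-to-BLOB path (table and column names are made up):
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class BlobStore {
    // The upload bytes flow straight into the BLOB without ever becoming a String.
    static void store(Connection conn, String name, InputStream upload) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO uploaded_files (name, data) VALUES (?, ?)")) {
            ps.setString(1, name);
            ps.setBinaryStream(2, upload);   // raw bytes pass through untouched
            ps.executeUpdate();
        }
    }
}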
