Java: how to handle the BOM in UTF-8 files?

I use Java for file reading. Here's my code:
public static String[] fajlbeolvasa(String s) throws IOException
{
    ArrayList<String> list = new ArrayList<>();
    // try-with-resources closes the reader even if readLine() throws
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(s), "UTF8")))
    {
        String line;
        while ((line = reader.readLine()) != null)
        {
            list.add(line);
        }
    }
    return list.toArray(new String[0]);
}
However, when I read the file, the output is garbled.
For example: "Farkasgyep\305\261". Maybe something is wrong with the BOM.
How can I solve this problem in Java? I would be grateful for any help.

You can check for a BOM as follows. This treats the file content as a byte[], so it should work with your file:
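// Hex form of the UTF-8 BOM bytes (EF BB BF); a class-level constant so this method can see it
private static final String BOM_HEX_ENCODE = "efbbbf";

// Hex is org.apache.commons.codec.binary.Hex; LOGGER is the enclosing class's logger
private static boolean isBOMPresent(byte[] content) {
    boolean result = false;
    byte[] bom = new byte[3];
    try (ByteArrayInputStream is = new ByteArrayInputStream(content)) {
        int bytesRead = is.read(bom);
        // Compare only when all three bytes were actually read
        if (bytesRead == bom.length) {
            String stringContent = new String(Hex.encodeHex(bom));
            if (BOM_HEX_ENCODE.equalsIgnoreCase(stringContent)) {
                result = true;
            }
        }
    } catch (Exception e) {
        LOGGER.error(e);
    }
    return result;
}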
Then, if you need to remove it, you can use this:
public static byte[] removeBOM(byte[] fileWithBOM) {
final String BOM_HEX_ENCODE = "efbbbf";
if (isBOMPresent(fileWithBOM)) {
ByteBuffer bb = ByteBuffer.wrap(fileWithBOM);
byte[] bom = new byte[3];
bb.get(bom, 0, bom.length);
byte[] contentAfterFirst3Bytes = new byte[fileWithBOM.length - 3];
bb.get(contentAfterFirst3Bytes, 0, contentAfterFirst3Bytes.length);
return contentAfterFirst3Bytes;
} else {
return fileWithBOM;
}
}
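If the file is read through a Reader anyway, the BOM can also be dropped after decoding, without extra libraries. The sketch below assumes the file really is UTF-8; the class and method names (BomAwareReader, stripBom, readLines) are made up for illustration. It relies on the fact that Java's UTF-8 decoder turns the bytes EF BB BF into the single character U+FEFF and does not remove it. Note also that \305\261 is the octal form of the UTF-8 bytes C5 B1, which encode "ű", so the garbled output in the question may come from how the text is printed rather than from the BOM.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BomAwareReader {
    // U+FEFF is what the UTF-8 bytes EF BB BF decode to.
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading BOM character, if there is one. */
    static String stripBom(String text) {
        return (!text.isEmpty() && text.charAt(0) == BOM) ? text.substring(1) : text;
    }

    /** Reads all lines as UTF-8 and drops a BOM from the first line only. */
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // lines is still empty only while handling the first line
                lines.add(lines.isEmpty() ? stripBom(line) : line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a file that starts with a BOM
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', 'b', '\n', 'c'};
        System.out.println(readLines(new ByteArrayInputStream(data))); // prints [ab, c]
    }
}
```

For a real file, pass a FileInputStream (or Files.newInputStream) to readLines instead of the in-memory stream.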

Related

how to return a value from this try catch function in java?

So I have this method:
public static String ip (String url){
try {
String webPagea = url;
URL urla = new URL(webPagea);
URLConnection urlConnectiona = urla.openConnection();
InputStream isa = urlConnectiona.getInputStream();
InputStreamReader isra = new InputStreamReader(isa);
int numCharsReada;
char[] charArraya = new char[1024];
StringBuffer sba = new StringBuffer();
while ((numCharsReada = isra.read(charArraya)) > 0) {
sba.append(charArraya, 0, numCharsReada);
}
String resulta = sba.toString();
return resulta;
} catch (Exception e)
{
}
**(compile error)
}
I want the method above to return the resulta string when called from another class, like this:
private class t1 implements Runnable{
public void run() {
String getip= ip("http://google.com");
}
}
But I get a compile error saying that I didn't add a return statement at the spot marked with the two stars above.
Also, in general, when I define a string inside a try/catch like above, I can't access it outside the try/catch. What am I doing wrong?
Example:
public void haha(String data)
{
    try {
        String test = "test6";
    } catch (Exception e) {
    }
    String vv = test; <-- test cannot be found
}
To be clear, I want the output of the page, not its source code: if the website outputs text, I want just the text, not the HTML.
Cheers
The scope of the string resulta is limited to the try block. Modify your code so that resulta is declared outside the try block, like this. That also gives the method a return statement on every path, which fixes the compile error:
public static String ip (String url){
String resulta = "";
try {
String webPagea = url;
URL urla = new URL(webPagea);
URLConnection urlConnectiona = urla.openConnection();
InputStream isa = urlConnectiona.getInputStream();
InputStreamReader isra = new InputStreamReader(isa);
int numCharsReada;
char[] charArraya = new char[1024];
StringBuffer sba = new StringBuffer();
while ((numCharsReada = isra.read(charArraya)) > 0) {
sba.append(charArraya, 0, numCharsReada);
}
resulta = sba.toString();
} catch (Exception e) {
}
return resulta;
}
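The same scoping rule can be tried without a network connection. The following is a minimal sketch, assuming a Reader as the input source; ScopeDemo and readAll are hypothetical names. The variable is declared before the try block, so it is visible at the return statement, and the method returns on both the normal path and the exception path.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ScopeDemo {
    /** Reads everything from the reader; returns "" if reading fails. */
    static String readAll(Reader source) {
        String result = "";                  // declared outside try, visible at the return
        try (Reader in = source) {
            StringBuilder sb = new StringBuilder();
            char[] chunk = new char[1024];
            int n;
            while ((n = in.read(chunk)) > 0) {
                sb.append(chunk, 0, n);
            }
            result = sb.toString();
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it
            System.err.println("Read failed: " + e.getMessage());
        }
        return result;                       // reached whether or not an exception occurred
    }

    public static void main(String[] args) {
        System.out.println(readAll(new StringReader("hello"))); // prints hello
    }
}
```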

String[] cannot be converted to State[]

Just wondering what I have done wrong here. I'm getting an error in the method setLine():
error: incompatible types: String[] cannot be converted to State[]
I'm not sure how to fix it, since I need the line to be split and stored in that State array so I can determine whether it is a state or a location when reading from a CSV file.
public static void readFile(String inFilename)
{
FileInputStream fileStrm = null;
InputStreamReader rdr;
BufferedReader bufRdr;
int stateCount = 0, locationCount = 0;
String line;
try
{
fileStrm = new FileInputStream(inFilename);
rdr = new InputStreamReader(fileStrm);
bufRdr = new BufferedReader(rdr);
line = bufRdr.readLine();
while (line != null)
{
if (line.startsWith("STATE"))
{
stateCount++;
}
else if (line.startsWith("LOCATION"))
{
locationCount++;
}
line = bufRdr.readLine();
}
fileStrm.close();
State[] state = new State[stateCount];
Location[] location = new Location[locationCount];
}
catch (IOException e)
{
if (fileStrm != null)
{
try { fileStrm.close(); } catch (IOException ex2) { }
}
System.out.println("Error in file processing: " + e.getMessage());
}
}
public static void processLine(String csvRow)
{
String thisToken = null;
StringTokenizer strTok;
strTok = new StringTokenizer(csvRow, ":");
while (strTok.hasMoreTokens())
{
thisToken = strTok.nextToken();
System.out.print(thisToken + " ");
}
System.out.println("");
}
public static void setLine(State[] state, Location[] location, int stateCount, int locationCount, String line)
{
int i;
state = new State[stateCount];
state = line.split("="); <--- ERROR
for( i = 0; i < stateCount; i++)
{
}
}
public static void writeOneRow(String inFilename)
{
FileOutputStream fileStrm = null;
PrintWriter pw;
try
{
fileStrm = new FileOutputStream(inFilename);
pw = new PrintWriter(fileStrm);
pw.println();
pw.close();
}
catch (IOException e)
{
if (fileStrm != null)
{
try
{
fileStrm.close();
}
catch (IOException ex2)
{}
}
System.out.println("Error in writing to file: " + e.getMessage());
}
}
The error means exactly what it says: String[] cannot be converted to State[]. line.split("=") returns a String[], and assigning it to a State[] is like trying to store an Integer in a String: the two types are unrelated (neither is a subtype of the other).
To solve the problem you need a method that converts the String[] into a State[]. Something like this:
private State[] toStateArray(String[] strings){
final State[] states = new State[strings.length];
for(int i = strings.length-1; i >= 0; i--){
states[i] = new State(strings[i]); // here you have to decide how to convert String to State
}
return states;
}
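Put together, the split and the conversion might look like the sketch below. The State class here is a hypothetical stand-in with a single name field, since the question does not show the real one, and StateParser and parseStates are illustrative names as well.

```java
// Hypothetical stand-in for the State class from the question
class State {
    private final String name;

    State(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class StateParser {
    /** Splits the line on '=' and wraps each token in a State object. */
    static State[] parseStates(String line) {
        String[] tokens = line.split("=");      // split() always yields a String[]
        State[] states = new State[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            states[i] = new State(tokens[i].trim());
        }
        return states;
    }

    public static void main(String[] args) {
        for (State s : parseStates("STATE=Victoria")) {
            System.out.println(s.getName());
        }
    }
}
```

How a token maps to a State (one field, several fields, validation) depends on the real class, so the constructor call is the part to adapt.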

Byte encryption for text file

I have an app in which I have to read a .txt file so that I can store some values and keep them. This is working pretty well, except for the fact that I want to make those values non-readable or "non-understandable" for external users.
My idea was to convert the file content into hex or binary and, in the reading process, change it back to char. The thing is that I don't have access to methods such as String.format because of my compiler.
Here's how I'm currently reading and keeping the values:
byte[] buffer = new byte[1024];
int len = myFile.read(buffer);
String data = null;
int i=0;
data = new String(buffer,0,len);
Class to open and manipulate the file:
public class File {
private boolean debug = false;
private FileConnection fc = null;
private OutputStream os = null;
private InputStream is = null;
private String fileName = "example.txt";
private String pathName = "logs/";
final String rootName = "file:///a:/";
public File(String fileName, String pathName) {
super();
this.fileName = fileName;
this.pathName = pathName;
if (!pathName.endsWith("/")) {
this.pathName += "/"; // add a slash
}
}
public boolean isDebug() {
return debug;
}
public void setDebug(boolean debug) {
this.debug = debug;
}
public void write(String text) throws IOException {
write(text.getBytes());
}
public void write(byte[] bytes) throws IOException {
if (debug)
System.out.println(new String(bytes));
os.write(bytes);
}
private FileConnection getFileConnection() throws IOException {
// check if subfolder exists
fc = (FileConnection) Connector.open(rootName + pathName);
if (!fc.exists() || !fc.isDirectory()) {
fc.mkdir();
if (debug)
System.out.println("Dir created");
}
// open file
fc = (FileConnection) Connector.open(rootName + pathName + fileName);
if (!fc.exists())
fc.create();
return fc;
}
/**
* release resources
*/
public void close() {
if (is != null)
try {
is.close();
} catch (IOException e) {
e.printStackTrace();
}
is = null;
if (os != null)
try {
os.close();
} catch (IOException e) {
e.printStackTrace();
}
os = null;
if (fc != null)
try {
fc.close();
} catch (IOException e) {
e.printStackTrace();
}
fc = null;
}
public void open(boolean writeAppend) throws IOException {
fc = getFileConnection();
if (!writeAppend)
fc.truncate(0);
is = fc.openInputStream();
os = fc.openOutputStream(fc.fileSize());
}
public int read(byte[] buffer) throws IOException {
return is.read(buffer);
}
public void delete() throws IOException {
close();
fc = (FileConnection) Connector.open(rootName + pathName + fileName);
if (fc.exists())
fc.delete();
}
}
I would like to know a simple way to read this content. Binary or hex, both would work for me.
So, with some understanding of the question, I believe you're really looking for a form of obfuscation? As mentioned in the comments, the easiest way to do this is likely a form of cipher.
Consider this example implementation of a shift cipher:
Common
int shift = 11;
Writing
// Get the data to be written to the file.
String data = ...
// cipher the data.
char[] chars = data.toCharArray();
for (int i = 0; i < chars.length; ++i) {
chars[i] = (char)(chars[i] + shift);
}
String cipher = new String(chars);
// Write the data to the cipher file.
...
Reading
// Read the cipher file.
String data = ...
// Decipher the data.
char[] chars = data.toCharArray();
for (int i = 0; i < chars.length; ++i) {
chars[i] = (char)(chars[i] - shift);
}
String decipher = new String(chars);
// Use data as required.
...
Here's an example implementation on Ideone. The output:
Data : I can read this IP 192.168.0.1
Cipher : T+nly+}plo+st~+T[+<D=9<AC9;9<
Decipher: I can read this IP 192.168.0.1
I tried to keep this as low level as possible in order to satisfy the Java 3 requirement.
Note that this is NOT secure by any means. Shift ciphers (like most ciphers in a bubble) are trivial to break by malicious entities. Please do not use this if security is an actual concern.
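For reference, the two loops above can be folded into one helper that is runnable on its own. This is only a sketch of the same shift cipher; ShiftCipher and shift are illustrative names, and it offers no more protection than the original.

```java
class ShiftCipher {
    /** Adds 'amount' to every char; use the negative amount to reverse it. */
    static String shift(String text, int amount) {
        char[] chars = text.toCharArray();
        for (int i = 0; i < chars.length; ++i) {
            chars[i] = (char) (chars[i] + amount);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        int shift = 11;
        String data = "I can read this IP 192.168.0.1";
        String cipher = shift(data, shift);
        String decipher = shift(cipher, -shift);
        System.out.println("Round trip ok: " + data.equals(decipher)); // prints Round trip ok: true
    }
}
```

When the ciphered text is written to a file, the bytes have to be read back with the same character encoding, otherwise shifted characters outside the ASCII range may not survive the round trip.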
Your solution is too complex. With Java 8, you can try:
String fileName = "configFile.txt";
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
// TODO: process each line; for example, print it
stream.forEach(System.out::println);
} catch (IOException e) {
e.printStackTrace();
}

Cache file in memory and read in parallel

I have a program (a simple log parser) that is slow because in some cases it has to do a full scan of the input file. So I thought I would pre-cache the entire file (~100 MB) in memory and read it with multiple threads.
In the current configuration I use a BufferedReader for the "main read" and a RandomAccessFile to jump to a specific offset and read what I need.
I've tried it this way:
..
Reader reader = null;
if (cache) {
// caching file in memory
br = new BufferedReader(new FileReader(file));
buffer = new StringBuilder();
for (String line = br.readLine(); line != null; line = br.readLine()) {
buffer.append(line).append(CR);
}
br.close();
reader = new StringReader(buffer.toString());
} else {
reader = new FileReader(file);
}
br = new BufferedReader(reader);
for (String line = br.readLine(); line != null; line = br.readLine()) {
offset += line.length() + 1; // the +1 accounts for the line separator
matcher = Constants.PT_BEGIN_COMPOSITION.matcher(line);
if (matcher.matches()) {
linecount++;
record = new Record();
record.setCompositionCode(matcher.group(1));
matcher = Constants.PT_PREFIX.matcher(line);
if (matcher.matches()) {
record.setBeginComposition(Constants.SDF_DATE.parse(matcher.group(1)));
record.setProcessId(matcher.group(2));
if (cache) {
executor.submit(new PubblicationParser(buffer, offset, record));
} else {
executor.submit(new PubblicationParser(file, offset, record));
}
records.add(record);
} else {
br.close();
throw new ParseException(line, 0);
}
}
}
PubblicationParser has an init() method that chooses which custom reader to use, a RandomAccessFileReader or a StringBuilderReader:
if (file != null) {
this.logReader = new RandomAccessFileReader(file, offset);
} else if (sb != null) {
this.logReader = new StringBuilderReader(sb, (int) offset);
}
And these are my two custom readers:
//
public class StringBuilderReader implements LogReader {
public static final String CR = System.getProperty("line.separator");
private final StringBuilder sb;
private int offset;
public StringBuilderReader(StringBuilder sb, int offset) {
super();
this.sb = sb;
this.offset = offset;
}
@Override
public String readLine() throws IOException {
if (offset >= sb.length()) {
return null;
}
int indexOf = sb.indexOf(CR, offset);
if (indexOf < 0) {
indexOf = sb.length();
}
String substring = sb.substring(offset, indexOf);
offset = indexOf + CR.length();
return substring;
}
@Override
public void close() throws IOException {
// TODO Auto-generated method stub
}
}
//
public class RandomAccessFileReader implements LogReader {
private static final String FILEMODE_R = "r";
private final RandomAccessFile raf;
public RandomAccessFileReader(File file, long offset) throws IOException {
this.raf = new RandomAccessFile(file, FILEMODE_R);
this.raf.seek(offset);
}
@Override
public void close() throws IOException {
raf.close();
}
@Override
public String readLine() throws IOException {
return raf.readLine();
}
}
The problem is that the "cache way" is very slow, and I don't understand why.
First make sure that it is indeed the I/O that makes your application slow, and not something else (e.g., inefficient logic in your parser). For that, you could use a Java profiler (JProfiler, for example).
If it is indeed I/O, then it might be better to use some ready-made solution to load the file into memory - essentially that's what you are trying to implement yourself.
Have a look at MappedByteBuffer and ByteBuffer.
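As a rough illustration of that suggestion, the sketch below maps a file into memory once and lets each task read a line from a byte offset through its own duplicate() view, so threads do not share a read position. MappedLogReader, map and readLineAt are hypothetical names, lines are assumed to end with '\n' and to be UTF-8 encoded, and a single mapping is limited to 2 GB, which is enough for a file of about 100 MB. Offsets here are byte offsets, so an offset computed from line.length() only matches when every character takes one byte and the separator is a single character.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedLogReader {
    /** Maps the whole file read-only; the mapping stays valid after the channel is closed. */
    static MappedByteBuffer map(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads one line starting at a byte offset; returns null at or past the end. */
    static String readLineAt(ByteBuffer shared, int offset) {
        ByteBuffer view = shared.duplicate();   // own position and limit, same memory
        if (offset >= view.limit()) {
            return null;
        }
        int end = offset;
        while (end < view.limit() && view.get(end) != '\n') {
            end++;                              // absolute get() does not move the position
        }
        byte[] bytes = new byte[end - offset];
        view.position(offset);
        view.get(bytes);                        // relative bulk read from 'offset'
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("log", ".txt");
        Files.write(tmp, "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        MappedByteBuffer mapped = map(tmp);
        System.out.println(readLineAt(mapped, 6)); // prints second
    }
}
```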

How to get stream output as string?

In my servlet I am running a few command-line commands in the background, and I've successfully printed their output to the console.
My doGet():
public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
{
String[] command =
{
"zsh"
};
Process p = Runtime.getRuntime().exec(command);
new Thread(new SyncPipe(p.getErrorStream(), response.getOutputStream())).start();
new Thread(new SyncPipe(p.getInputStream(), response.getOutputStream())).start();
PrintWriter stdin = new PrintWriter(p.getOutputStream());
stdin.println("source ./taxenv/bin/activate");
stdin.println("python runner.py");
stdin.close();
int returnCode = 0;
try {
returnCode = p.waitFor();
}
catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Return code = " + returnCode);
}
class SyncPipe implements Runnable
{
public SyncPipe(InputStream istrm, OutputStream ostrm) {
istrm_ = istrm;
ostrm_ = ostrm;
}
public void run() {
try
{
final byte[] buffer = new byte[1024];
for (@SuppressWarnings("unused")
int length = 0; (length = istrm_.read(buffer)) != -1; )
{
// ostrm_.write(buffer, 0, length);
((PrintStream) ostrm_).println();
}
}
catch (Exception e)
{
e.printStackTrace();
}
}
private final OutputStream ostrm_;
private final InputStream istrm_;
}
Now I want to save the output that goes to ostrm_ into a String or List and use it inside doGet().
How can I achieve this?
==============================EDIT============================
Based on answers below, I've edited my code as follows
int length = 0; (length = istrm_.read(buffer)) != -1; )
{
// ostrm_.write(buffer, 0, length);
String str = IOUtils.toString(istrm_, "UTF-8");
//((PrintStream) ostrm_).println();
System.out.println(str);
}
Now, how do I get str from the Runnable class into my doGet()?
You can use Apache Commons IO.
Here is the documentation of IOUtils.toString() from its Javadoc:
Gets the contents of an InputStream as a String using the specified character encoding. This method buffers the input internally, so there is no need to use a BufferedInputStream.
Parameters:
input - the InputStream to read from
encoding - the encoding to use, null means platform default
Returns: the requested String
Throws:
NullPointerException - if the input is null
IOException - if an I/O error occurs
Example Usage:
String str = IOUtils.toString(yourInputStream, "UTF-8");
You can use something like the following:
(EDIT: also added the calling code)
public void run() {
try
{
String out = getAsString(istrm_);
((PrintStream) ostrm_).println(out);
} catch (Exception e) {
e.printStackTrace();
}
}
public static String getAsString(InputStream is) throws Exception {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int cur = -1;
while((cur = is.read()) != -1 ){
baos.write(cur);
}
return getAsString(baos.toByteArray());
}
public static String getAsString(byte[] arr) throws Exception {
    // Decode all bytes at once; casting each byte to char corrupts non-ASCII text
    return new String(arr, java.nio.charset.StandardCharsets.UTF_8);
}
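To hand the collected text back to the calling method, one option is a Callable instead of a Runnable, because a Callable returns a value through a Future. The sketch below is one possible shape under that assumption, not the only way to do it; StreamCollector and collect are hypothetical names, and UTF-8 is assumed as in the IOUtils example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class StreamCollector implements Callable<String> {
    private final InputStream in;

    StreamCollector(InputStream in) {
        this.in = in;
    }

    /** Drains the stream and returns its content decoded as UTF-8. */
    @Override
    public String call() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int length;
        while ((length = in.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Runs a collector on a worker thread and waits for its result. */
    static String collect(InputStream in) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> result = pool.submit(new StreamCollector(in));
            return result.get();             // blocks until the stream is fully read
        } catch (Exception e) {
            throw new IllegalStateException("Could not collect stream output", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        InputStream fake = new ByteArrayInputStream("line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(collect(fake));
    }
}
```

In doGet() the same idea would apply to p.getInputStream() and p.getErrorStream(): submit one collector per stream before calling p.waitFor(), then call get() on each Future to obtain the strings.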
