How to convert a FileInputStream into a String in Java?

In my Java project, I'm passing a FileInputStream to a function and need to convert it to a String (a plain typecast does not work).
How can I do that?
public static void checkfor(FileInputStream fis) {
    String a = "";
    // a = fis; // does not compile: how do I convert the FileInputStream into a String?
    System.out.println(a); // print the string here
}

You can't convert it to a String directly; you have to read the stream's contents. You could add something like this to your method:
//Commented this out because this is not an efficient way to achieve that
//StringBuilder builder = new StringBuilder();
//int ch;
//while((ch = fis.read()) != -1){
//    builder.append((char)ch);
//}
//
//System.out.println(builder.toString());
Use Aubin's solution instead:
public static String getFileContent(FileInputStream fis, String encoding) throws IOException {
    try (BufferedReader br = new BufferedReader(new InputStreamReader(fis, encoding))) {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null) {
            sb.append(line);
            sb.append('\n');
        }
        return sb.toString();
    }
}

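Using the IOUtils.toString function from Apache Commons IO:
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import org.apache.commons.io.IOUtils;
InputStream inStream = new FileInputStream("filename.txt");
String body = IOUtils.toString(inStream, StandardCharsets.UTF_8.name());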
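If adding Commons IO is not an option, the JDK alone can do much the same on Java 9 or later. The following is only a minimal sketch, assuming the whole file fits in memory; the class name StreamToString and the method name readAll are made up for illustration, and a ByteArrayInputStream stands in for the FileInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class StreamToString {

    // Reads every remaining byte of the stream and decodes it in one step.
    // InputStream.readAllBytes() requires Java 9 or later.
    static String readAll(InputStream in, Charset charset) throws IOException {
        return new String(in.readAllBytes(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; with a real file this would be new FileInputStream("filename.txt").
        byte[] data = "line 1\r\nline 2\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.print(readAll(in, StandardCharsets.UTF_8));
        }
    }
}
```

Because the bytes are decoded as a whole, line terminators are kept exactly as they appear in the file.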

Don't make the mistake of relying on, needlessly converting, or losing end-of-line characters. Read character by character, and don't forget to use the proper character encoding to interpret the stream.
public String getFileContent(FileInputStream fis) throws IOException {
    StringBuilder sb = new StringBuilder();
    Reader r = new InputStreamReader(fis, "UTF-8"); // or whatever encoding
    int ch = r.read();
    while (ch >= 0) {
        sb.append((char) ch); // cast to char, otherwise the numeric code is appended
        ch = r.read();
    }
    return sb.toString();
}
If you want to make this a little more efficient, you can read into an array of characters instead, although looping over single characters can still be quite fast.
public String getFileContent(FileInputStream fis) throws IOException {
    StringBuilder sb = new StringBuilder();
    Reader r = new InputStreamReader(fis, "UTF-8"); // or whatever encoding
    char[] buf = new char[1024];
    int amt = r.read(buf); // number of chars read, or -1 at end of stream
    while (amt > 0) {
        sb.append(buf, 0, amt);
        amt = r.read(buf);
    }
    return sb.toString();
}

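From an answer I edited elsewhere:
static String convertStreamToString(java.io.InputStream is) {
    if (is == null) {
        return "";
    }
    java.util.Scanner s = new java.util.Scanner(is);
    s.useDelimiter("\\A"); // \A matches only the start of input, so the whole stream is one token
    String streamString = s.hasNext() ? s.next() : "";
    s.close();
    return streamString;
}
This returns an empty string for a null or empty stream instead of throwing. Note that this Scanner constructor decodes with the platform default charset; pass a charset name as the second constructor argument if the encoding is known.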

Use the following code:
try {
    FileInputStream fis = new FileInputStream("filename.txt");
    int i = 0;
    while ((i = fis.read()) != -1) { // read() returns -1 at the end of the stream
        System.out.print((char) i); // converts each byte to a character (only correct for single-byte text such as ASCII)
    }
    fis.close();
} catch (Exception ex) {
    System.out.println(ex);
}
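Casting each byte to a char only works while every character fits in one byte. The sketch below illustrates what happens with UTF-8 input that contains a two-byte character; the class and method names are hypothetical, and a ByteArrayInputStream stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class ByteCastDemo {

    // Treats every byte as one character, like the byte-by-byte loop above.
    static String castBytes(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    // Decodes the bytes as UTF-8 through a Reader.
    static String decodeUtf8(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        Reader r = new InputStreamReader(in, StandardCharsets.UTF_8);
        int ch;
        while ((ch = r.read()) != -1) {
            sb.append((char) ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "caf" plus U+00E9 (e with acute accent): 4 characters, 5 bytes in UTF-8.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(castBytes(new ByteArrayInputStream(utf8)).length());  // prints 5
        System.out.println(decodeUtf8(new ByteArrayInputStream(utf8)).length()); // prints 4
    }
}
```

The cast version yields two wrong characters (U+00C3 and U+00A9) in place of the single accented letter, which is why the answers that go through an InputStreamReader with an explicit encoding are the safer choice.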

Related

How to get HTML code as String from a HTML file in java?

I need to get the HTML code of an existing HTML file as a String. How can I do that in Java?
I tried the following, but the result was not valid HTML syntax.
File htmlFile = new File(filePath);
StringBuilder contentBuilder = new StringBuilder();
String str;
try {
BufferedReader in = new BufferedReader(new FileReader(htmlFile));
while ((str = in.readLine()) != null) {
contentBuilder.append(str);
}
in.close();
} catch (IOException e) {
}
String htmlCodeAsString = contentBuilder.toString();
You can try this:
StringBuilder bldr = new StringBuilder();
String str;
BufferedReader in = new BufferedReader(new FileReader("filename.html"));
while ((str = in.readLine()) != null) {
    bldr.append(str);
}
in.close();
String content = bldr.toString();
You can also use a Scanner:
Scanner scanner = new Scanner(new File("test.html"));
String text = scanner.useDelimiter("\\A").next();
scanner.close();
The regex \A matches the beginning of the input, so the delimiter never occurs again and next() returns the whole file as one token. Scanner's internal buffer starts at 1024 characters and is enlarged by the Scanner as needed.
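The readLine() loops above drop the line terminators, so the markup ends up on a single line. If the line breaks should survive, java.nio.file.Files can read the file in one call. This is a minimal sketch, assuming the file is UTF-8 and fits in memory; HtmlFileReader and readHtml are illustrative names, and the temporary file exists only to make the example runnable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class HtmlFileReader {

    // Reads the whole file and decodes it as UTF-8 (an assumption about the file).
    static String readHtml(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".html");
        try {
            Files.write(tmp, "<html>\n<body>Hi</body>\n</html>\n".getBytes(StandardCharsets.UTF_8));
            System.out.print(readHtml(tmp)); // line breaks are preserved
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```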

Multiple Operations with FileInputStream

I have a Java class to read a file and set some data to a class. I have a class
FileMetadata.java
public class FileMetadata implements Serializable {
private String location;
private Double size;
private String content;
private List<String> lines;
private String md5Digest;
//parameterized constructor
//getters and setters
}
After reading the file, I want to set the values in this class.
This is my method to read the file.
FileUtil.java
public static void readFile(File file) throws IOException {
FileInputStream fis = null;
List<String> fileLines = new ArrayList<String>();
try {
fis = new FileInputStream(file);
StringBuilder sb = new StringBuilder();
StringBuilder sbLine = new StringBuilder();
int ch;
while ((ch = fis.read()) != -1) {
String line = "" + (char) ch;
sb.append(line);
if(line.matches("(\r|\n)")) {
fileLines.add(sbLine.toString());
sbLine.setLength(0);
} else {
sbLine.append(line);
}
}
String md5 = DigestUtils.md5Hex(fis);
System.out.println(md5);
System.out.println(sb.toString());
System.out.println(getFileSizeInKB(file));
for(String str : fileLines) {
System.out.println(str);
}
} finally {
if (fis != null) {
fis.close();
}
}
}
But the list does not come out right: an empty string is added after each line, because each line in the file ends with "\r\n". The "\r" ends the line, and then the "\n" adds the now empty StringBuilder again, so the list gets an extra empty string after every line.
I could check the length before adding to the list, but if the file contains a genuinely empty line, I do want that in the list.
You can try using the BufferedReader class; readLine() treats "\r\n" as a single line terminator. Compute the MD5 from a fresh stream, because the first one has already been read to the end.
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while ((line = br.readLine()) != null) {
    ..............
}
br.close();
FileInputStream fis = new FileInputStream(file);
String md5 = DigestUtils.md5Hex(fis);
fis.close();
(There is no exception handling in this example.)
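The content, the list of lines, and the digest can also be derived from a single read of the file. The sketch below is only one possible arrangement: it assumes the file is UTF-8 and small enough to hold in memory, it uses the JDK's MessageDigest in place of Commons Codec's DigestUtils, and FileSummary with its fields is a hypothetical stand-in for the FileMetadata class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class, standing in for FileMetadata.
final class FileSummary {
    final String content;
    final List<String> lines;
    final String md5Hex;

    private FileSummary(String content, List<String> lines, String md5Hex) {
        this.content = content;
        this.lines = lines;
        this.md5Hex = md5Hex;
    }

    // Builds all three values from the same byte array.
    static FileSummary of(byte[] bytes) throws IOException, NoSuchAlgorithmException {
        String content = new String(bytes, StandardCharsets.UTF_8); // assumed encoding

        // readLine() accepts \n, \r, or \r\n as one terminator and keeps empty lines.
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(content))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }

        // The digest is computed over the raw bytes, not the decoded text.
        byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
        String md5Hex = String.format("%032x", new BigInteger(1, digest));

        return new FileSummary(content, lines, md5Hex);
    }
}
```

With a real file this could be called as FileSummary.of(Files.readAllBytes(file.toPath())).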

Java: reading utf-8 file page by page using FileInputStream

I need some code that will allow me to read one page at a time from a UTF-8 file.
I've used this code:
File fileDir = new File("DIRECTORY OF FILE");
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(fileDir), "UTF8"));
String str;
while ((str = in.readLine()) != null) {
    System.out.println(str);
}
in.close();
After surrounding it with a try/catch block it runs, but it outputs the entire file!
Is there a way to amend this code to display just one page of text at a time?
The file is in UTF-8 format, and after viewing it in Notepad++ I can see that it contains FF (form feed) characters to denote the next page.
You will need to look for the form feed character by comparing each character to 0x0C.
For example:
int c = in.read(); // read() returns an int so that -1 can signal end of stream
while (c != -1) {
    if (c == 0x0C) {
        // form feed
    } else {
        // handle displayable character
    }
    c = in.read();
}
EDIT added an example of using a Scanner, as suggested by Boris
Scanner s = new Scanner(new File("a.txt")).useDelimiter("\u000C");
while ( s.hasNext() ) {
String str = s.next();
System.out.println( str );
}
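Building on the Scanner idea, a single page can be picked out by counting tokens. This is just a sketch under a few assumptions: pages are numbered from 0, the form feed is the only page separator, and PageReader and readPage are names invented for the example.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

// Hypothetical helper, for illustration only.
final class PageReader {

    // Returns the page with the given zero-based number,
    // or null if the input has fewer pages than that.
    static String readPage(Reader source, int pageNo) {
        try (Scanner s = new Scanner(source)) {
            s.useDelimiter("\f"); // form feed, U+000C
            for (int i = 0; s.hasNext(); i++) {
                String page = s.next();
                if (i == pageNo) {
                    return page;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the file reader here.
        System.out.println(readPage(new StringReader("first\fsecond\fthird"), 1)); // prints "second"
    }
}
```

For a file, the Reader could be new InputStreamReader(new FileInputStream(fileDir), StandardCharsets.UTF_8). Closing the Scanner also closes the underlying reader.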
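If the file is valid UTF-8 and the pages are split by the character U+00FF, aka (char) 0xFF, aka "\u00FF", 'ÿ', then a buffered reader will do. If the separator is a raw byte 0xFF instead, there is a problem: 0xFF is never a valid byte in UTF-8, so the file is not valid UTF-8 and the decoder will not pass the byte through as that character.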
int soughtPageno = ...; // Counted from 0
int currentPageno = 0;
try (BufferedReader in = new BufferedReader(new InputStreamReader(
        new FileInputStream(fileDir), StandardCharsets.UTF_8))) {
    String str;
    while ((str = in.readLine()) != null && currentPageno <= soughtPageno) {
        // Search again after every cut, because str is shortened inside the loop.
        for (int pos = str.indexOf('\u00FF'); pos >= 0; pos = str.indexOf('\u00FF')) {
            if (currentPageno == soughtPageno) {
                System.out.println(str.substring(0, pos));
                ++currentPageno;
                break;
            }
            ++currentPageno;
            str = str.substring(pos + 1);
        }
        if (currentPageno == soughtPageno) {
            System.out.println(str);
        }
    }
}
For a byte 0xFF (wrong, hacked UTF-8), use a wrapping InputStream between the FileInputStream and the reader:
class PageInputStream extends InputStream {
    private final InputStream in;
    private int pageno; // pages still to skip
    private boolean eof = false;

    PageInputStream(InputStream in, int pageno) {
        this.in = in;
        this.pageno = pageno;
    }

    @Override
    public int read() throws IOException {
        if (eof) {
            return -1;
        }
        // Skip whole pages until the requested one starts.
        while (pageno > 0) {
            int c = in.read();
            if (c == 0xFF) {
                --pageno;
            } else if (c == -1) {
                eof = true;
                in.close();
                return -1;
            }
        }
        int c = in.read();
        if (c == 0xFF) { // end of the requested page
            c = -1;
            eof = true;
            in.close();
        }
        return c;
    }
}
Take this as an example; a bit more work remains to be done.
You can use a regex to detect form feed (page break) characters. Try something like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
File fileDir = new File("DIRECTORY OF FILE");
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(fileDir), "UTF8"));
String str;
Pattern pageBreak = Pattern.compile("(^.*)(\\f)(.*$)");
while ((str = in.readLine()) != null) {
    Matcher match = pageBreak.matcher(str);
    boolean pageBreakFound = match.find();
    if (pageBreakFound) {
        String textBeforeLineBreak = match.group(1);
        // group(2) contains the form feed character
        // group(3) contains the text after the form feed character
        // Do whatever logic you want now that you know you hit a page boundary
    }
    System.out.println(str);
}
in.close();
The parentheses around portions of the regex denote capture groups, which are recorded in the Matcher object. The \f matches the form feed character.
Here's the regex documentation for Java: http://docs.oracle.com/javase/tutorial/essential/regex/

Weird character at the beginning of the file?

When reading from a file, the first line that I read has a weird character (using BufferedReader). How do I delete this character? I know I can do it manually, but I want to do it the right way.
Using the relevant code from the link that the OP provided, here is an answer to the question which works as intended.
import java.io.*;
public class UTF8ToAnsiUtils {
// FEFF because this is the Unicode char represented by the UTF-8 byte order mark (EF BB BF).
public static final String UTF8_BOM = "\uFEFF";
public static void main(String args[]) {
try {
if (args.length != 2) {
System.out
.println("Usage : java UTF8ToAnsiUtils utf8file ansifile");
System.exit(1);
}
boolean firstLine = true;
FileInputStream fis = new FileInputStream(args[0]);
BufferedReader r = new BufferedReader(new InputStreamReader(fis,
"UTF8"));
FileOutputStream fos = new FileOutputStream(args[1]);
Writer w = new BufferedWriter(new OutputStreamWriter(fos, "Cp1252"));
for (String s = ""; (s = r.readLine()) != null;) {
if (firstLine) {
s = UTF8ToAnsiUtils.removeUTF8BOM(s);
firstLine = false;
}
w.write(s + System.getProperty("line.separator"));
w.flush();
}
w.close();
r.close();
System.exit(0);
}
catch (Exception e) {
e.printStackTrace();
System.exit(1);
}
}
private static String removeUTF8BOM(String s) {
if (s.startsWith(UTF8_BOM)) {
s = s.substring(1);
}
return s;
}
}
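If the goal is only to read the file without the stray first character, the BOM can also be skipped once when the reader is opened. The following is a minimal sketch of that idea, assuming UTF-8 input; BomSkipper and openUtf8 are hypothetical names, not an existing API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class BomSkipper {

    // Wraps the stream in a UTF-8 reader positioned after a leading BOM, if any.
    static BufferedReader openUtf8(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.mark(1); // remember the start so one character can be un-read
        if (reader.read() != '\uFEFF') {
            reader.reset(); // no BOM: rewind so the first character is not lost
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // EF BB BF is the UTF-8 encoding of the BOM, followed by "hi".
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        try (BufferedReader r = openUtf8(new ByteArrayInputStream(withBom))) {
            System.out.println(r.readLine()); // prints "hi"
        }
    }
}
```

Every later readLine() call then works on clean text, so no per-line check is needed.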

How to read BufferedReader faster

I want to optimize this code:
InputStream is = rp.getEntity().getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
String text = "";
String aux = "";
while ((aux = reader.readLine()) != null) {
text += aux;
}
The thing is that I don't know how to read the content of the BufferedReader and copy it into a String faster than the code above.
I need it to take as little time as possible.
Thank you.
Using string concatenation in a loop is the classic performance killer (because Strings are immutable, the entire, increasingly large String is copied for each concatenation). Do this instead:
StringBuilder builder = new StringBuilder();
String aux = "";
while ((aux = reader.readLine()) != null) {
builder.append(aux);
}
String text = builder.toString();
You can try Apache IOUtils.toString. This is what they do:
StringWriter sw = new StringWriter();
char[] buffer = new char[1024 * 4];
int n = 0;
while (-1 != (n = input.read(buffer))) {
sw.write(buffer, 0, n);
}
String text = sw.toString();
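When a BufferedReader reads from a Socket, readLine() blocks until the peer sends more data or closes the connection. Checking bufferedReader.ready() first avoids blocking, but note that ready() only reports whether data is available right now, so the loop may stop before the peer has finished sending:
BufferedReader br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
StringBuilder sb = new StringBuilder();
String line = "";
while (br.ready() && (line = br.readLine()) != null) {
    sb.append(line + "\r\n");
}
String result = sb.toString();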
One line solution:
String result = reader.lines().collect(joining(lineSeparator()));
Imports:
import java.io.*;
import static java.lang.System.lineSeparator;
import static java.util.stream.Collectors.joining;
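For reference, here is the same idea as a small runnable class. It is a sketch rather than a drop-in replacement: JoinLines and readAll are made-up names, a StringReader stands in for the network stream, and "\n" is used as the separator so the result does not depend on the platform.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class JoinLines {

    // Joins all remaining lines with "\n"; the original terminators are discarded.
    // BufferedReader.lines() (Java 8+) wraps any IOException in an UncheckedIOException.
    static String readAll(BufferedReader reader) {
        return reader.lines().collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("first\r\nsecond\r\n"));
        System.out.println(readAll(reader));
    }
}
```

Note that joining puts the separator between lines only, so a trailing line break in the input does not appear in the result.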
I wrote a simple function to do this using a StringBuilder and a loop that catches IOException inside.
public String getString(BufferedReader bufferedReader) {
    StringBuilder stringBuilder = new StringBuilder();
    String line = null;
    do {
        try {
            if ((line = bufferedReader.readLine()) != null) {
                stringBuilder.append(line).append(System.lineSeparator());
            }
        } catch (IOException e) {
            e.printStackTrace();
            line = null; // stop reading, otherwise a failing reader would loop forever
        }
    } while (line != null);
    return stringBuilder.toString();
}
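You can use a StringBuffer (StringBuilder offers the same methods without synchronization and is usually preferred when only one thread is involved):
StringBuffer stringBuffer = new StringBuffer();
String aux;
while ((aux = reader.readLine()) != null) {
    stringBuffer.append(aux);
}
String text = stringBuffer.toString();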
