I have a TXT file in which I'd like to change this String
<!DOCTYPE Publisher
PUBLIC "-//Springer-Verlag//DTD A++ V2.4//EN" "http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd">
into this one <!DOCTYPE Publisher> using Java.
I wrote the following function but it seems not to be working.
public void replace() {
try {
File file = new File("/home/zakaria/Bureau/PhD/test2/file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null) {
oldtext += line + "\n";
}
reader.close();
String newtext = oldtext
.replaceAll("<!DOCTYPE Publisher\nPUBLIC \"-//Springer-Verlag//DTD A++ V2.4//EN\" \"http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd\">",
"<!DOCTYPE Publisher>");
FileWriter writer = new FileWriter("/home/zakaria/Bureau/PhD/test2/file.txt");
writer.write(newtext);
writer.close();
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
What did I do wrong?
Try this simple code:
public static void replace() {
try {
File file = new File("resources/abc.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
boolean found = false;
while ((line = reader.readLine()) != null) {
if (line.trim().startsWith("<!DOCTYPE Publisher")) {
found = true;
}
if (line.trim().endsWith("A++V2.4.dtd\">")) {
oldtext += "<!DOCTYPE Publisher>\n";
found = false;
continue;
}
if (found) {
continue;
}
oldtext += line + "\n";
}
reader.close();
FileWriter writer = new FileWriter("resources/file.txt");
writer.write(oldtext);
writer.close();
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
To begin with, you are fortunate that it didn't change anything at all.
Otherwise you'd have lost your original file...
Never modify a file in place!
Create a temporary file, write the modified content to it, and only then rename it over your original file.
Also, the string you want to replace is pretty complicated, and you don't want to use .replace(), since that replaces every occurrence.
Do it like this:
final String quoted
= Pattern.quote("<!DOCTYPE Publisher\nPUBLIC \"-//Springer-Verlag//DTD A++ V2.4//EN\" \"http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd\">");
final Pattern pattern = Pattern.compile(quoted);
final Path victim = Paths.get("/home/zakaria/Bureau/PhD/test2/file.txt");
final Path tmpfile = Files.createTempFile("tmp", "foo");
final byte[] content = Files.readAllBytes(victim);
final String s = new String(content, StandardCharsets.UTF_8);
final String replacement = pattern.matcher(s).replaceFirst("<!DOCTYPE Publisher>");
try (
final OutputStream out = Files.newOutputStream(tmpfile);
) {
out.write(replacement.getBytes(StandardCharsets.UTF_8));
out.flush();
}
Files.move(tmpfile, victim, StandardCopyOption.REPLACE_EXISTING);
If the text you want to eliminate is on the second and subsequent lines, as in your demo-input
<!DOCTYPE Publisher
PUBLIC "-//Springer-Verlag//DTD A++ V2.4//EN"
"http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd">
and no lines between the first and last in the tag contain a closing >, then you can do the following:
while(more lines to process)
if "<!DOCTYPE Publisher" is not found
read line and output it
else
//This is the first line in a <!DOCTYPE tag
read the line and output it, appending '>' to the end
while the next line does NOT end with a '>'
discard it (don't output it)
discard the line that ends with '>' as well
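The pseudocode above could be sketched in Java roughly as follows. This is only an illustration under the same assumption (no line inside the tag other than the last one ends with '>'); the class and method names are hypothetical, and the lines are passed in as a list rather than read from the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DoctypeCollapser {
    // Hypothetical helper: collapses a multi-line <!DOCTYPE Publisher ...> tag into one line.
    static List<String> collapse(List<String> lines) {
        List<String> out = new ArrayList<>();
        boolean skipping = false;
        for (String line : lines) {
            if (skipping) {
                // Discard continuation lines, including the one that closes the tag.
                if (line.trim().endsWith(">")) {
                    skipping = false;
                }
                continue;
            }
            if (line.trim().startsWith("<!DOCTYPE Publisher")) {
                out.add("<!DOCTYPE Publisher>");
                // Keep skipping only if the tag is not already closed on this line.
                skipping = !line.trim().endsWith(">");
            } else {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "<?xml version=\"1.0\"?>",
            "<!DOCTYPE Publisher",
            "PUBLIC \"-//Springer-Verlag//DTD A++ V2.4//EN\"",
            "\"http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd\">",
            "<Publisher/>");
        collapse(input).forEach(System.out::println);
    }
}
```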
Try with this regexp:
String newtext = oldtext.replaceAll(
"<!DOCTYPE Publisher\nPUBLIC \"-\\/\\/Springer-Verlag\\/\\/DTD A[+][+] V2[.]4\\/\\/EN\"[ ]\"http:\\/\\/devel[.]springer[.]de\\/A[+][+]\\/V2[.]4\\/DTD\\/A[+][+]V2[.]4[.]dtd\">", "<!DOCTYPE Publisher>");
The only changes are escaping the forward slashes (harmless, though Java does not require it) and putting the dots and plus signs in square brackets so that they are matched literally.
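If the DOCTYPE appears only once in the file, another option is String.replace, which treats its first argument as literal text, so nothing has to be escaped. The sketch below is an illustration with hypothetical class and method names; it also shows that the unescaped string does not match when passed to replaceAll.

```java
class LiteralReplaceDemo {
    static final String OLD_DOCTYPE =
        "<!DOCTYPE Publisher\nPUBLIC \"-//Springer-Verlag//DTD A++ V2.4//EN\" "
        + "\"http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd\">";

    // String.replace takes literal text, not a regex, and replaces every occurrence.
    static String simplify(String text) {
        return text.replace(OLD_DOCTYPE, "<!DOCTYPE Publisher>");
    }

    public static void main(String[] args) {
        String text = OLD_DOCTYPE + "\n<Publisher/>\n";
        // As a regex, "A++" means "one or more A" (possessive quantifier),
        // so the pattern does not match the literal text "A++" and nothing changes.
        String unchanged = text.replaceAll(OLD_DOCTYPE, "<!DOCTYPE Publisher>");
        System.out.println(unchanged.equals(text));
        System.out.println(simplify(text));
    }
}
```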
Related
Relatively new to programming. I want to read a URL, modify the text string, then write it to a line-separated csv textfile.
The read & modify parts run. Also, outputting the string to terminal (using Eclipse) looks fine (csv, line by line), like this;
data_a,data_b,data_c,...
data_a1,data_b1,datac1...
data_a2,data_b2,datac2...
.
.
.
But I'm unable to write the same string to a file. It just becomes a one-liner (see attempts no. 1 & 2 in the for-loops below):
data_a,data_b,data_c,data_a1,data_b1,datac1,data_a2,data_b2,datac2...
I guess I'm looking for a way to, in the FileWriter or BufferedWriter loops, convert the string finalDataA to array string (i.e. include the string suffix "[0]") but I have not yet found such an approach that would not give errors of the type "Cannot convert String to String[]". Any suggestions?
String data = "";
String dataHelper = "";
try {
URL myURL = new URL(url);
HttpURLConnection myConnection = (HttpURLConnection) myURL.openConnection();
if (myConnection.getResponseCode() == URLStatus.HTTP_OK.getStatusCode()) {
BufferedReader in = new BufferedReader(new InputStreamReader(myConnection.getInputStream()));
while ((data = in.readLine()) != null) {
dataHelper = dataHelper + "\n" + data;
}
in.close();
String trimmedData = dataHelper.trim().replaceAll(" +", ",");
String parts[] = trimmedData.split(Pattern.quote(")"));// ,1.,");
String dataA = parts[1];
String finalDataA[] = dataA.split("</PRE>");
// parts 2&3 removed in this example
// Console output for testing purpose - This prints out many many lines of csv-data
System.out.println(finalDataA[0]);
//This returns the value 1
System.out.println(finalDataA.length);
// Attempt no. 1 to write to file - writes a oneliner
for(int i = 0; i < finalDataA.length; i++) {
try (BufferedWriter bw = new BufferedWriter(new FileWriter(pathA, true))) {
String s;
s = finalDataA[i];
bw.write(s);
bw.newLine();
bw.flush();
}
}
// Attempt no. 2 to write to file - writes a oneliner
FileWriter fw = new FileWriter(pathA);
for (int i = 0; i < finalDataA.length; i++) {
fw.write(finalDataA[i] + "\n");
}
fw.close();
}
} catch (Exception e) {
System.out.println("Exception" +e);
}
Create the BufferedWriter and the FileWriter ahead of the for loop, not every time around it.
From your code comments, finalDataA has one element, so the for-loop will be executed only once. Try splitting finalDataA[0] into rows.
Something like this:
String endOfLineToken = "..."; // your variant; note that split() treats it as a regex
String[] lines = finalDataA[0].split(endOfLineToken);
// try-with-resources opens the writer once and closes (and flushes) it after the loop
try (BufferedWriter bw = new BufferedWriter(new FileWriter(pathA, true)))
{
    for (String line : lines)
    {
        bw.write(line);
        bw.write(endOfLineToken); // to put back line endings
        bw.newLine();
    }
}
catch (IOException e)
{
    e.printStackTrace();
}
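For illustration, here is a self-contained sketch of the same idea. It assumes the block of text contains ordinary line breaks and splits on them with the \R regex (Java 8+); the class name, method name, and temporary output file are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CsvRowWriter {
    // Splits the block on any line break (\R) and writes one row per line.
    static void writeRows(String block, Path target) throws IOException {
        // The writer is created once, before the loop, and closed automatically.
        try (BufferedWriter bw = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String row : block.split("\\R")) {
                bw.write(row);
                bw.newLine(); // platform line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("rows", ".csv"); // hypothetical output file
        writeRows("data_a,data_b,data_c\ndata_a1,data_b1,data_c1", target);
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

Using newLine() rather than a hard-coded "\n" matters on Windows, where some editors show "\n"-only files as a single line.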
I want to replace a word in a txt file in Java. I already have my regular expression and the method for reading the txt file. But I have no idea how to replace a word in it using my regular expression.
Any suggestions or examples?
public class BTest
{
public static void main(String args[])
{
try
{
File file = new File("file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null)
{
oldtext += line + "\r\n";
}
reader.close();
// replace a word in a file
String newtext = oldtext.replaceAll("drink", "Love");
//To replace a line in a file
//String newtext = oldtext.replaceAll("This is test string 20000", "blah blah blah");
FileWriter writer = new FileWriter("file.txt");
writer.write(newtext);
writer.close();
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
}
}
Read the file into one string. Then replace all instances of the word with the new word.
String response = "test string".replaceAll("regex here", "new text");
Then write the new text to a file
FileWriter writer = new FileWriter("out.txt");
writer.write(response);
writer.close();
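Put together, the three steps might look like the sketch below. The class name, method name, sample text, and the word-boundary regex are assumptions for illustration; the file is read and written as UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class WordReplacer {
    // Reads the whole file, applies the regex replacement, and writes the result back.
    static void replaceWord(Path file, String regex, String replacement) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String updated = text.replaceAll(regex, replacement);
        Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("words", ".txt"); // hypothetical input file
        Files.write(file, "I drink tea, he drinks coffee".getBytes(StandardCharsets.UTF_8));
        // \b restricts the match to the whole word "drink", so "drinks" is left alone.
        replaceWord(file, "\\bdrink\\b", "Love");
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

Note that replaceAll gives `$` and `\` a special meaning in the replacement string; Matcher.quoteReplacement can be used if the replacement should be taken literally.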
I am trying to get one chapter of a book at a time. I am using Paul Siegmann's epublib library. I am able to get all the text from the book, but I am not sure how to get a single chapter or where to go from here.
// find InputStream for book
InputStream epubInputStream = assetManager
.open("the_planet_mappers.epub");
// Load Book from inputStream
mThePlanetMappersBookEpubLib = (new EpubReader()).readEpub(epubInputStream);
Spine spine = new Spine(mThePlanetMappersBookEpubLib.getTableOfContents());
for (SpineReference bookSection : spine.getSpineReferences()) {
Resource res = bookSection.getResource();
try {
InputStream is = res.getInputStream();
BufferedReader r = new BufferedReader(new InputStreamReader(is));
String line;
while ((line = r.readLine()) != null) {
line = Html.fromHtml(line).toString();
Log.i("Read it ", line);
mEntireBook.append(line);
}
} catch (IOException e) {
}
I don't know if you're still looking for an answer, but...
I'm working on this too right now. This is the code I use to retrieve the content of the whole epub file:
public ArrayList<String> getBookContent(Book bi) {
// GET THE CONTENTS OF ALL PAGES
StringBuilder string = new StringBuilder();
ArrayList<String> listOfPages = new ArrayList<>();
Resource res;
InputStream is;
BufferedReader reader;
String line;
Spine spine = bi.getSpine();
for (int i = 0; spine.size() > i; i++) {
res = spine.getResource(i);
try {
is = res.getInputStream();
reader = new BufferedReader(new InputStreamReader(is));
while ((line = reader.readLine()) != null) {
// FIRST PAGE LINE -> <?xml version="1.0" encoding="utf-8" standalone="no"?>
if (line.contains("<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>")) {
string.delete(0, string.length());
}
// ADD THAT LINE TO THE FINAL STRING REMOVING ALL THE HTML
string.append(Html.fromHtml(formatLine(line)));
// LAST PAGE LINE -> </html>
if (line.contains("</html>")) {
listOfPages.add(string.toString());
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
return listOfPages;
}
private String formatLine(String line) {
if (line.contains("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd")) {
line = line.substring(line.indexOf(">") + 1, line.length());
}
// REMOVE STYLES AND COMMENTS IN HTML
if ((line.contains("{") && line.contains("}"))
|| ((line.contains("/*")) && line.contains("*/"))
|| (line.contains("<!--") && line.contains("-->"))) {
line = line.substring(line.length());
}
return line;
}
As you may have noticed, I still need to improve the filter, but I have every chapter of the book in my ArrayList. Now I just need to read from that ArrayList, e.g. myList.get(0), and it's done.
To display the text properly, I'm using the bluejamesbond:textjustify library (https://github.com/bluejamesbond/TextJustify-Android).
It is easy to use and powerful.
I hope this helps, and if anybody finds a better way to filter that HTML, please let me know.
I have a problem when trying to replace a String in a file.
My file contains:
<!-- Header -->
<header fontName="Arial" size="24"/>
<!-- Content -->
<content>
<fontName="Arial" size="11"/>
</content>
How can I replace fontName and size only under <!-- Header -->?
This is my replacement code:
public class StringReplacement {
public static void main(String args[])
{
try
{
File file = new File("file.xml");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null)
{
oldtext += line + "\r\n";
}
reader.close();
// replace a word in a file
//String newtext = oldtext.replaceAll("drink", "Love");
//To replace a line in a file
String newtext = oldtext.replaceAll("Arial", "Times New Roman");
FileWriter writer = new FileWriter("file.xml");
writer.write(newtext);
writer.close();
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
}
}
But it replaces every occurrence, not just the one in the header.
If you are sure that this is the format of the file you can simply do the following:
String newtext = oldtext.replaceAll("header fontName=\"Arial\"", "header fontName=\"Times New Roman\"");
By the way, use a StringBuilder to append Strings.
In your read loop while((line = reader.readLine()) != null) you could test if you found the <!-- Header --> line (and not yet the <!-- Content --> line), and do your replace only in the header block.
boolean inHeader = false;
while((line = reader.readLine()) != null) {
if (line.equals("<!-- Header -->")) {
inHeader = true;
} else if (line.equals("<!-- Content -->")) {
inHeader = false;
}
if (inHeader) {
line = line.replaceAll("Arial", "Times New Roman");
}
oldtext += line + "\r\n";
}
And remove the line
String newtext = oldtext.replaceAll("Arial", "Times New Roman");
EDIT: It would probably be cleaner to detect arbitrary tags rather than hardcoding Header and Content. That would require a regular expression to match <!-- (tag) --> and test if tag is equal to "Header", but this approach is easier, of course.
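A rough sketch of the regex-based variant mentioned in the EDIT: it assumes each section marker sits on a line of its own and consists of a single word inside the comment. The class name, method name, and pattern are hypothetical, and the lines are passed in as a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectionReplacer {
    // Matches a marker line such as "<!-- Header -->" and captures the tag name.
    private static final Pattern MARKER = Pattern.compile("<!--\\s*(\\w+)\\s*-->");

    // Replaces `from` with `to` only on lines that follow the marker named `section`.
    static List<String> replaceInSection(List<String> lines, String section,
                                         String from, String to) {
        List<String> out = new ArrayList<>();
        String current = null; // name of the section we are currently inside
        for (String line : lines) {
            Matcher m = MARKER.matcher(line.trim());
            if (m.matches()) {
                current = m.group(1);
            } else if (section.equals(current)) {
                line = line.replace(from, to);
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "<!-- Header -->",
            "<header fontName=\"Arial\" size=\"24\"/>",
            "<!-- Content -->",
            "<content>",
            "<fontName=\"Arial\" size=\"11\"/>",
            "</content>");
        replaceInSection(lines, "Header", "Arial", "Times New Roman")
            .forEach(System.out::println);
    }
}
```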
I have a string like this in a file
<script>
Evening</script>
I have written code to replace this string, but it does not recognize the newline character.
That is, I want to replace the above string with:
<h1>Done</h1>
The code looks like this:
package stringreplace;
import java.io.*;
import org.omg.CORBA.Request;
public class stringreplace {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
FileReader fr = null;
BufferedReader br = null;
try
{
fr = new FileReader("G://abc.html");
br = new BufferedReader(fr);
String newtext="";
String line="";
String matchExist1 = "<script>\r\nEvening</script>";
String newpattern = "<h1>Done</h1>";
String matchExist2 = "</body>";
String newpattern2 = "<script>alpha</script></body>";
StringBuffer sb = new StringBuffer();
while((line=br.readLine())!=null)
{
int ind2 = line.indexOf(matchExist1);
System.out.println(ind2);
int ind3 = line.indexOf(matchExist2);
if((ind2==-1) || (ind3==-1))
{
line = line.replaceFirst(matchExist1,newpattern);
line = line.replaceFirst(matchExist2,newpattern2);
sb.append(line+"\n");
}
//sb.append(line+"\n");
else if((ind2!=-1) || (ind3!=-1))
{
String tag = "</body>";
line = line.replaceFirst("</body>",tag);
sb.append(line+"\n");
}
}
br.close();
FileWriter fw = new FileWriter("G://abc.html");
fw.write(sb.toString());
fw.close();
System.out.println("done");
System.out.println(sb);
}
catch (Exception e)
{
System.out.println(e);
}
}
}
But it does not recognize the newline character.
Since you are reading only one input line at a time, you can hardly expect to match a pattern that spans two lines. You must first fix your read so that it holds at least two lines at once. Once you've done that, @sterna's answer will do the trick.
I don't think you can be sure what your newline looks like. So instead of matching a specific sequence, I would use \s+, which matches one or more whitespace characters, including all newline characters.
String matchExist1 = "<script>\\s+Evening</script>";
Edit:
Of course, you first have to fix the problem mgc described (+1). Then you can make use of my answer!
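Combining both answers, a minimal sketch could work on the whole file content as one string, so that the pattern can span the line break. The class and method names are hypothetical, and the file I/O is left out to keep the example short.

```java
class MultiLineReplace {
    // \s+ matches one or more whitespace characters, so it covers \n, \r\n, and indentation.
    static String replaceScript(String html) {
        return html.replaceAll("<script>\\s+Evening</script>", "<h1>Done</h1>");
    }

    public static void main(String[] args) {
        // The whole content is held in one string, so the match can cross the line break.
        String html = "<body>\r\n<script>\r\nEvening</script>\r\n</body>";
        System.out.println(replaceScript(html));
    }
}
```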