finding a specific valued string from an xml file

I have to read an XML file line by line in Java.
The file has lines of the format:
<CallInt xsi:type="xsd:int">124</CallInt>
I need to pick up only the tag name CallInt and the value 124 from the above line.
I tried StringTokenizer, split() etc., but nothing worked.
Can anyone help me with this?
Some code:
BufferedReader buf = new BufferedReader(new FileReader(myxmlfile));
String line;
while((line = buf.readLine())!=null)
{
String s = line;
// Scanning for the tag and the integer value code???
}

You should really use a small XML parser.
If you have to read line by line, and the format is guaranteed to be line-based, use indexOf() to find the delimiters around the content you want to extract, then use substring():
int cut0 = line.indexOf('<');
if (cut0 != -1) {
int cut1 = line.indexOf(' ', cut0);
if (cut1 != -1) {
String tagName = line.substring(cut0 + 1, cut1);
int cut2 = line.indexOf('>', cut1); // insert more ifs as needed...
int cut3 = line.indexOf('<', cut2);
String value = line.substring(cut2 + 1, cut3);
}
}
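The indexOf()/substring() idea above can be packaged as a small self-contained helper. This is only a sketch under the same assumption that each element sits on one line; the class name TagLineParser and the method parseLine are made up for illustration and are not part of the original answer.

```java
class TagLineParser {
    /**
     * Returns {tagName, value} for a line such as
     * <CallInt xsi:type="xsd:int">124</CallInt>,
     * or null if the line does not have that shape.
     */
    static String[] parseLine(String line) {
        int open = line.indexOf('<');
        if (open == -1) {
            return null;
        }
        int endOfOpenTag = line.indexOf('>', open);
        if (endOfOpenTag == -1) {
            return null;
        }
        // The tag name ends at the first space (attributes follow) or at '>'
        int space = line.indexOf(' ', open);
        int nameEnd = (space != -1 && space < endOfOpenTag) ? space : endOfOpenTag;
        // The value runs up to the '<' that starts the closing tag
        int close = line.indexOf('<', endOfOpenTag);
        if (close == -1) {
            return null;
        }
        String tagName = line.substring(open + 1, nameEnd);
        String value = line.substring(endOfOpenTag + 1, close);
        return new String[] { tagName, value };
    }

    public static void main(String[] args) {
        String[] parts = parseLine("<CallInt xsi:type=\"xsd:int\">124</CallInt>");
        System.out.println(parts[0] + " = " + parts[1]); // prints "CallInt = 124"
    }
}
```

Lines that hold only an opening or closing tag return null, so the caller can skip them. Anything less regular than one element per line is better left to a real parser.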

Here's a small example with StAX.
Note that I've removed the reference to the schema for simplicity (it will fail as is otherwise).
XML file called "test.xml", in path "/your/path":
<thingies>
<thingie foo="blah"/>
<CallInt>124</CallInt>
</thingies>
Code
XMLInputFactory factory = null;
XMLStreamReader reader = null;
// code is Java 6 style, no try with resources
try {
factory = XMLInputFactory.newInstance();
// coalesces all characters in one event
factory.setProperty(XMLInputFactory.IS_COALESCING, true);
reader = factory.createXMLStreamReader(new FileInputStream(new File(
"/your/path/test.xml")));
boolean readCharacters = false;
while (reader.hasNext()) {
int event = reader.next();
switch (event) {
case (XMLStreamConstants.START_ELEMENT): {
if (reader.getLocalName().equals("CallInt")) {
readCharacters = true;
}
break;
}
case (XMLStreamConstants.CHARACTERS): {
if (readCharacters) {
System.out.println(reader.getText());
readCharacters = false;
}
break;
}
}
}
}
catch (Throwable t) {
t.printStackTrace();
}
finally {
try {
if (reader != null) {
reader.close();
}
}
catch (Throwable t) {
t.printStackTrace();
}
}
Output
124
Here is an interesting SO thread on schemas and StAX.

Android Reading a large text efficiently in Java

My code is too slow.
How can I make my code more efficient? Currently it takes several minutes to read the file, which is far too long. Can this be done faster? There is no stack trace, because the code works; it is just too slow.
Thanks!
The problem code:
private void list(){
String strLine2="";
wwwdf2 = new StringBuffer();
InputStream fis2 = this.getResources().openRawResource(R.raw.list);
BufferedReader br2 = new BufferedReader(new InputStreamReader(fis2));
if(fis2 != null) {
try {
LineNumberReader lnr = new LineNumberReader(br2);
String linenumber = String.valueOf(lnr);
int i=0;
while (i!=1) {
strLine2 = br2.readLine();
wwwdf2.append(strLine2 + "\n");
String contains = String.valueOf(wwwdf2);
if(contains.contains("itisdonecomplet")){
i++;
}
}
// Toast.makeText(getApplicationContext(), strLine2, Toast.LENGTH_LONG).show();
Toast.makeText(getApplicationContext(), wwwdf2, Toast.LENGTH_LONG).show();
} catch (IOException e) {
e.printStackTrace();
}
}
}

Use StringBuilder instead of StringBuffer.
StringBuffer is synchronized, and you don't need that.
Don't use String.valueOf, which builds a string, negating the benefit of using a StringBuffer/StringBuilder. You are building a string from the whole buffer, checking it, discarding the string, then constructing nearly the same string again on the next iteration.
Use if (wwwdf2.indexOf("itisdonecomplet") >= 0) instead, which avoids creating the string.
But this will still be fairly slow: although you no longer construct a string, you still search the whole buffer on every iteration.
You can make this a lot faster by searching only the very end of the buffer. For example, you could use wwwdf2.indexOf("itisdonecomplet", Math.max(0, wwwdf2.length() - strLine2.length() - "itisdonecomplet".length())).
Although, as blackapps points out in a comment, you could simply check whether strLine2 contains that string.
Don't use string concatenation inside a call to append; make two separate calls:
wwwdf2.append(strLine2);
wwwdf2.append("\n");
You don't check if you reach the end of the file. Check if strLine2 is null, and break the loop if it is.
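Putting these points together, a minimal sketch might look like the following. The class name SentinelReader and the method readUntilMarker are hypothetical, and a StringReader stands in for the InputStreamReader over the raw resource so that the example runs outside Android.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SentinelReader {
    /** Reads lines until one contains the marker or the input ends. */
    static String readUntilMarker(Reader source, String marker) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of input, which ends the loop
            while ((line = reader.readLine()) != null) {
                text.append(line);
                text.append('\n');
                // Search only the line just read, not the whole buffer
                if (line.contains(marker)) {
                    break;
                }
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String data = "first\nsecond\nitisdonecomplet\nignored\n";
        System.out.print(readUntilMarker(new StringReader(data), "itisdonecomplet"));
    }
}
```

Each line is searched once, so the work grows linearly with the size of the file instead of quadratically.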

My newly created code (my test device is a Samsung S8):
private void list(){
String strLine2="";
wwwdf2 = new StringBuilder();
InputStream fis2 = this.getResources().openRawResource(R.raw.list);
BufferedReader br2 = new BufferedReader(new InputStreamReader(fis2));
if(fis2 != null) {
try {
LineNumberReader lnr = new LineNumberReader(br2);
String linenumber = String.valueOf(lnr);
int i=0;
while (i!=1) {
strLine2 = br2.readLine();
wwwdf2.append(strLine2);
wwwdf2.append("\n");
if (wwwdf2.indexOf("itisdonecomplet") >= 0){
i++;
}
}
// Toast.makeText(getApplicationContext(), strLine2, Toast.LENGTH_LONG).show();
Toast.makeText(getApplicationContext(), wwwdf2, Toast.LENGTH_LONG).show();
} catch (IOException e) {
e.printStackTrace();
}
}
}

I am using the epublib and I am trying to get the entire chapter of a book at a time

I am trying to get one chapter of a book at a time. I am using Paul Siegmann's epublib library. I am able to get all the text from the book, but I am not sure how to split it into chapters or where to go from there.
// find InputStream for book
InputStream epubInputStream = assetManager
.open("the_planet_mappers.epub");
// Load Book from inputStream
mThePlanetMappersBookEpubLib = (new EpubReader()).readEpub(epubInputStream);
Spine spine = new Spine(mThePlanetMappersBookEpubLib.getTableOfContents());
for (SpineReference bookSection : spine.getSpineReferences()) {
Resource res = bookSection.getResource();
try {
InputStream is = res.getInputStream();
BufferedReader r = new BufferedReader(new InputStreamReader(is));
String line;
while ((line = r.readLine()) != null) {
line = Html.fromHtml(line).toString();
Log.i("Read it ", line);
mEntireBook.append(line);
}
} catch (IOException e) {
e.printStackTrace();
}
}

I don't know if you're still looking for an answer, but...
I'm working on this too right now. This is the code I have to retrieve the content of the whole EPUB file:
public ArrayList<String> getBookContent(Book bi) {
// GET THE CONTENTS OF ALL PAGES
StringBuilder string = new StringBuilder();
ArrayList<String> listOfPages = new ArrayList<>();
Resource res;
InputStream is;
BufferedReader reader;
String line;
Spine spine = bi.getSpine();
for (int i = 0; spine.size() > i; i++) {
res = spine.getResource(i);
try {
is = res.getInputStream();
reader = new BufferedReader(new InputStreamReader(is));
while ((line = reader.readLine()) != null) {
// FIRST PAGE LINE -> <?xml version="1.0" encoding="utf-8" standalone="no"?>
if (line.contains("<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>")) {
string.delete(0, string.length());
}
// ADD THAT LINE TO THE FINAL STRING REMOVING ALL THE HTML
string.append(Html.fromHtml(formatLine(line)));
// LAST PAGE LINE -> </html>
if (line.contains("</html>")) {
listOfPages.add(string.toString());
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
return listOfPages;
}
private String formatLine(String line) {
if (line.contains("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd")) {
line = line.substring(line.indexOf(">") + 1, line.length());
}
// REMOVE STYLES AND COMMENTS IN HTML
if ((line.contains("{") && line.contains("}"))
|| ((line.contains("/*")) && line.contains("*/"))
|| (line.contains("<!--") && line.contains("-->"))) {
line = line.substring(line.length());
}
return line;
}
As you may have noticed, I still need to improve the filter, but I have every chapter of the book in my ArrayList. Now I just need to read from that ArrayList, e.g. myList.get(0), and it is done.
To show the text properly, I'm using the bluejamesbond:textjustify library (https://github.com/bluejamesbond/TextJustify-Android).
It is easy to use and powerful.
I hope it helps you, and if anybody finds a better way to filter that HTML, please let me know.

How can I read from the next line of a text file, and pause, allowing me to read from the line after that later?

I wrote a program that generates random numbers into two text files and random letters into a third, according to two constants. Now I need to read from each text file, line by line, and put the lines together. The problem is that the suggestion found here doesn't really help my situation. When I try that approach it just reads all lines until it's done, without giving me the option to pause it, go to a different file, etc.
Ideally I would like to find some way to read just the next line, and then later go to the line after that. Maybe some kind of variable to hold my place in the file, or something similar.
public static void mergeProductCodesToFile(String prefixFile,
String inlineFile,
String suffixFile,
String productFile) throws IOException
{
try (BufferedReader br = new BufferedReader(new FileReader(prefixFile)))
{
String line;
while ((line = br.readLine()) != null)
{
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(productFile, true))))
{
out.print(line); //This will print the next digit to the right
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
}
}
}
}
EDIT: The digits are created by the following code. Basically, constants tell it how many characters to create in each line and how many lines to create. Now I need to combine the files without deleting anything from any of them.
public static void writeRandomCodesToFile(String codeFile,
char fromChar, char toChar,
int numberOfCharactersPerCode,
int numberOfCodesToGenerate) throws IOException
{
for (int i = 1; i <= PRODUCT_COUNT; i++)
{
int I = 0;
if ("inline.txt".equals(codeFile)) // compare String contents with equals(), not ==
{
for (I = 1; I <= CHARACTERS_PER_CODE; I++)
{
int digit = (int)(Math.random() * 10);
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(codeFile, true))))
{
out.print(digit); //This will print the next digit to the right
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
System.exit(1);
}
}
}
if ("prefix.txt".equals(codeFile) || "suffix.txt".equals(codeFile))
{
for (I = 1; I <= CHARACTERS_PER_CODE; I++)
{
Random r = new Random();
char digit = (char)(r.nextInt(26) + 'a');
digit = Character.toUpperCase(digit);
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(codeFile, true))))
{
out.print(digit);
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
System.exit(1);
}
}
}
//This will take the text file to the next line
if (I >= CHARACTERS_PER_CODE)
{
{
Random r = new Random();
char digit = (char)(r.nextInt(26) + 'a');
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(codeFile, true))))
{
out.println(""); //This will return a new line for the next loop
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
System.exit(1);
}
}
}
}
System.out.println(codeFile + " was successfully created.");
}// end writeRandomCodesToFile()

Staying close to your code, it would be something like this:
public static void mergeProductCodesToFile(String prefixFile, String inlineFile, String suffixFile, String productFile) throws IOException {
try (BufferedReader prefixReader = new BufferedReader(new FileReader(prefixFile));
BufferedReader inlineReader = new BufferedReader(new FileReader(inlineFile));
BufferedReader suffixReader = new BufferedReader(new FileReader(suffixFile))) {
StringBuilder line = new StringBuilder();
String prefix, inline, suffix;
while ((prefix = prefixReader.readLine()) != null) {
//assuming that nothing fails and the files are equals in # of lines.
inline = inlineReader.readLine();
suffix = suffixReader.readLine();
line.append(prefix).append(inline).append(suffix).append("\r\n");
// write it
...
}
} finally {/*close writers*/}
}
Some exceptions may be thrown.
I hope you don't implement it in one single method.
You can make use of iterators too, or a very simple reader class (method).
I wouldn't use a List to load the data unless I could guarantee that the files are small and that I can spare the memory.
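To make the idea concrete, here is a runnable sketch of reading the three files in lockstep. The class name ProductCodeMerger and the use of java.nio.file.Path are my assumptions, not part of the code above. Note that it overwrites the output file rather than appending to it as the question's code does, and it stops as soon as any of the three inputs runs out of lines.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ProductCodeMerger {
    static void merge(Path prefixFile, Path inlineFile, Path suffixFile, Path productFile)
            throws IOException {
        try (BufferedReader prefixReader = Files.newBufferedReader(prefixFile, StandardCharsets.UTF_8);
             BufferedReader inlineReader = Files.newBufferedReader(inlineFile, StandardCharsets.UTF_8);
             BufferedReader suffixReader = Files.newBufferedReader(suffixFile, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(productFile, StandardCharsets.UTF_8)) {
            String prefix, inline, suffix;
            // Each reader keeps its own position: one readLine() call advances
            // that reader by exactly one line and leaves the others untouched
            while ((prefix = prefixReader.readLine()) != null
                    && (inline = inlineReader.readLine()) != null
                    && (suffix = suffixReader.readLine()) != null) {
                out.write(prefix + inline + suffix);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("codes");
        Path prefix = Files.write(dir.resolve("prefix.txt"), Arrays.asList("AB", "CD"), StandardCharsets.UTF_8);
        Path inline = Files.write(dir.resolve("inline.txt"), Arrays.asList("12", "34"), StandardCharsets.UTF_8);
        Path suffix = Files.write(dir.resolve("suffix.txt"), Arrays.asList("XY", "ZW"), StandardCharsets.UTF_8);
        Path product = dir.resolve("product.txt");
        merge(prefix, inline, suffix, product);
        System.out.println(Files.readAllLines(product, StandardCharsets.UTF_8)); // [AB12XY, CD34ZW]
    }
}
```

The reader object itself is the "variable that holds your place": as long as it stays open, the next readLine() call continues where the previous one stopped.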

My approach, as we discussed, is to store the data and interleave it. As Sergio said in his answer, make sure memory isn't a problem given the size of the files and how much memory the data structures will use.
//the main method we're working on
public static void mergeProductCodesToFile(String prefixFile,
String inlineFile,
String suffixFile,
String productFile) throws IOException
{
List<String> prefix = read(prefixFile);
List<String> inline = read(inlineFile);
List<String> suffix = read(suffixFile);
String fileText = interleave(prefix, inline, suffix);
//write the single string to file however you want
//wrap the calls above in try/catch if you want to do your own error handling
}
//helper methods
private static List<String> read(String filename) throws IOException
{
List<String> list = new ArrayList<String>();
//I just prefer Scanner. Use whatever you want.
try (Scanner reader = new Scanner(new File(filename)))
{
while(reader.hasNextLine())
{ list.add(reader.nextLine()); }
}
return list;
}
//I'm going to build the whole file in one string, but you could also have this method return one line at a time (something like an iterator) and output it to the file to avoid creating the massive string
private static String interleave(List<String> one, List<String> two, List<String> three)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < one.size(); i++)//notice no checking on size equality of words or the lists. you might want this
{
sb.append(one.get(i)).append(two.get(i)).append(three.get(i)).append("\n");
}
return sb.toString();
}
Obviously this still leaves something to be desired in terms of memory and performance, and there are ways to make it slightly more extensible to other situations, but it's a good starting point. With C#, I could more easily make use of an iterator so that interleave gives you one line at a time, potentially saving memory. Just a different idea!

i want to change the text in a file, my code is searching the word but not replacing the word

I am trying to replace a string in a JS file which has content like this:
........
minimumSupportedVersion: '1.1.0',
........
Now I'm trying to replace 1.1.0 with 1.1.1. My code finds the text but does not replace it. Can anyone help me with this? Thanks in advance.
public class replacestring {
public static void main(String[] args)throws Exception
{
try{
FileReader fr = new FileReader("G:/backup/default0/default.js");
BufferedReader br = new BufferedReader(fr);
String line;
while((line=br.readLine()) != null) {
if(line.contains("1.1.0"))
{
System.out.println("searched");
line.replace("1.1.0","1.1.1");
System.out.println("String replaced");
}
}
}
catch(Exception e){
e.printStackTrace();
}
}
}

First, make sure you assign the result of replace() to something, otherwise it is lost. Remember, String is immutable; it can't be changed in place:
line = line.replace("1.1.0","1.1.1");
Second, you will need to write the changes back to some file. I'd recommend that you create a temporary file, to which you write each line, and when finished, delete the original file and rename the temporary file into its place.
Something like:
File original = new File("G:/backup/default0/default.js");
File tmp = new File("G:/backup/default0/tmpdefault.js");
boolean replace = false;
try (FileReader fr = new FileReader(original);
BufferedReader br = new BufferedReader(fr);
FileWriter fw = new FileWriter(tmp);
BufferedWriter bw = new BufferedWriter(fw)) {
String line = null;
while ((line = br.readLine()) != null) {
if (line.contains("1.1.0")) {
System.out.println("searched");
line = line.replace("1.1.0", "1.1.1");
System.out.println("String replaced");
}
// Write every line, changed or not, so the rest of the file is not lost
bw.write(line);
bw.newLine();
}
replace = true;
} catch (Exception e) {
e.printStackTrace();
}
// Doing this here because I want the files to be closed!
if (replace) {
if (original.delete()) {
if (tmp.renameTo(original)) {
System.out.println("File was updated successfully");
} else {
System.err.println("Failed to rename " + tmp + " to " + original);
}
} else {
System.err.println("Failed to delete " + original);
}
}
You may also like to take a look at The try-with-resources Statement to make sure you are managing your resources properly.
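The same temporary-file idea can also be sketched with java.nio.file, where Files.move with REPLACE_EXISTING replaces the separate delete-and-rename steps. This is an illustration under my own assumptions: the class LineReplacer and the method replaceInFile are hypothetical names, the file is assumed to be UTF-8, and the temporary file is not cleaned up if an exception occurs.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class LineReplacer {
    /** Rewrites the file line by line, replacing every occurrence of target. */
    static void replaceInFile(Path file, String target, String replacement) throws IOException {
        // Create the temporary file next to the original so the final move
        // stays on the same file system
        Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "replace", ".tmp");
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // String is immutable: replace() returns a new String
                writer.write(line.replace(target, replacement));
                writer.newLine();
            }
        }
        // Both files are closed at this point; swap the new file into place
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path js = Files.createTempFile("default", ".js");
        Files.write(js, Arrays.asList("var a = 1;", "minimumSupportedVersion: '1.1.0',"), StandardCharsets.UTF_8);
        replaceInFile(js, "1.1.0", "1.1.1");
        System.out.println(Files.readAllLines(js, StandardCharsets.UTF_8));
    }
}
```

Because readLine() strips line terminators and newLine() writes the platform separator, the rewritten file may end up with different line endings than the original.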

If you're working with Java 7 or above, use the new file I/O API (NIO.2) as follows:
// Get the file path
Path jsFile = Paths.get("C:\\Users\\UserName\\Desktop\\file.js");
// Read all the contents
byte[] content = Files.readAllBytes(jsFile);
// Create a buffer
StringBuilder buffer = new StringBuilder(
new String(content, StandardCharsets.UTF_8)
);
// Search for version code
int pos = buffer.indexOf("1.1.0");
if (pos != -1) {
// Replace if found
buffer.replace(pos, pos + 5, "1.1.1");
// Overwrite with new contents
Files.write(jsFile,
buffer.toString().getBytes(StandardCharsets.UTF_8),
StandardOpenOption.TRUNCATE_EXISTING);
}
I'm assuming your script file size doesn't cross into MBs; use buffered I/O classes otherwise.

remove '#' symbol from the beginning of the string in java

Sample data in a CSV file:
##Troubleshooting DHCP Configuration
#Module 3: Point-to-Point Protocol (PPP)
##Configuring HDLC Encapsulation
Hardware is HD64570
So I want to get the lines as:
#Troubleshooting DHCP Configuration
Module 3: Point-to-Point Protocol (PPP)
#Configuring HDLC Encapsulation
Hardware is HD64570
I have written this sample code:
public class ReadCSV {
public static BufferedReader br = null;
public static void main(String[] args) {
ReadCSV obj = new ReadCSV();
obj.run();
}
public void run() {
String sCurrentLine;
try {
br = new BufferedReader(new FileReader("D:\\compare\\Genre_Subgenre.csv"));
try {
while ((sCurrentLine = br.readLine()) != null) {
if(sCurrentLine.charAt(0) == '#'){
System.out.println(sCurrentLine);
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
I am getting the error below:
##Troubleshooting DHCP Configuration
#Module 3: Point-to-Point Protocol (PPP)
##Configuring HDLC Encapsulation
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(Unknown Source)
at example.ReadCSV.main(ReadCSV.java:19)
Please suggest how to do this.

Steps:
Read the CSV file line by line.
Use line.replaceFirst("#", "") to remove the first # from each line.
Write the modified lines to an output stream (file or String), whichever suits you.

If the variable s contains the content of the CSV file as a String,
s = s.replace("##", "#");
will replace all occurrences of "##" with "#". Note that this leaves a single leading # (as in the Module 3 line) untouched.

You need something like String line = buffer.readLine().
Check that the line is not empty (charAt(0) throws on an empty string), then check the first character with line.charAt(0) == '#'.
Get the new String with String newLine = line.substring(1).

This is a rather trivial question. Rather than do the work for you, I'll outline the steps that you need to take without gifting you the answer.
Read in the file line by line.
Take the first line and check whether its first character is a #. If it is, create a substring of this line excluding the first character (or use fileLine.replaceFirst("#", "");).
Store this line in an array-like data structure, or simply replace the current variable with the edited one (fileLine = fileLine.replaceFirst("#", "");).
Repeat until there are no more lines left in the file.
If you want to save these changes to the file, simply overwrite the old file with the new lines (e.g. a FileWriter with its second parameter, append, set to false will overwrite).
Make an attempt and show us what you have tried; people are more likely to help if they believe you have thoroughly attempted the problem yourself first.
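For reference, the core step can be sketched as below. The stack trace in the question ("String index out of range: 0" from charAt) suggests the file contains an empty line, so this sketch uses startsWith("#"), which is safe on empty strings. The class LeadingHashStripper and its method names are hypothetical, and the empty string in the sample list is my assumption about where the blank line sits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LeadingHashStripper {
    /** Removes a single leading '#'; all other lines, including empty ones, are returned unchanged. */
    static String stripOneLeadingHash(String line) {
        // startsWith() simply returns false for an empty line, where charAt(0) would throw
        return line.startsWith("#") ? line.substring(1) : line;
    }

    static List<String> stripAll(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(stripOneLeadingHash(line));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "##Troubleshooting DHCP Configuration",
                "#Module 3: Point-to-Point Protocol (PPP)",
                "",
                "Hardware is HD64570");
        for (String line : stripAll(sample)) {
            System.out.println(line);
        }
    }
}
```

Unlike replaceFirst("#", ""), which removes the first # anywhere in the line, this only touches a # in the first position.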

package stackoverflow.q_25054783;
import java.util.Arrays;
public class RemoveHash {
public static void main(String[] args) {
String [] strArray = new String [3];
strArray[0] = "##Troubleshooting DHCP Configuration";
strArray[1] = "#Module 3: Point-to-Point Protocol (PPP)";
strArray[2] = "##Configuring HDLC Encapsulation";
System.out.println("Original array: " + Arrays.toString(strArray));
for (int i = 0; i < strArray.length; i++) {
strArray[i] = strArray[i].replaceFirst("#", "");
}
System.out.println("Updated array: " + Arrays.toString(strArray));
}
}
//Output:
//Original array: [##Troubleshooting DHCP Configuration, #Module 3: Point-to-Point Protocol (PPP), ##Configuring HDLC Encapsulation]
//Updated array: [#Troubleshooting DHCP Configuration, Module 3: Point-to-Point Protocol (PPP), #Configuring HDLC Encapsulation]

OpenCSV reads CSV file line by line and gives you an array of strings, where each string is one comma separated value, right? Thus, you are operating on a string.
You want to remove '#' symbol from the beginning of the string (if it is there). Correct?
Then this should do it:
CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
// startsWith() is safe on empty values, unlike charAt(0)
if (nextLine.length > 0 && nextLine[0].startsWith("#")) {
nextLine[0] = nextLine[0].substring(1);
}
}
This removes the first '#' symbol from each line of the CSV file.

private static List<String> getFileContentWithoutFirstChar(File f){
try (BufferedReader input = new BufferedReader(new InputStreamReader(new FileInputStream(f), StandardCharsets.UTF_8))){
List<String> lines = new ArrayList<String>();
for(String line = input.readLine(); line != null; line = input.readLine()) {
// keep empty lines as they are; substring(1) would throw on them
lines.add(line.isEmpty() ? line : line.substring(1));
}
return lines;
} catch(IOException e) {
e.printStackTrace();
System.exit(1);
return null;
}
}
private static void writeFile(List<String> lines, File f){
try(BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(f), StandardCharsets.UTF_8))){
for(String line : lines){
bw.write(line);
bw.newLine();
}
bw.flush();
}catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args){
File f = new File("file/path");
List<String> lines = getFileContentWithoutFirstChar(f);
f.delete();
writeFile(lines, f);
}
