String to Text - Line Break instead of comma - java

I need help before I despair completely :D
As you will see, I have tried several approaches, even if they differ only slightly. My problem is that I have a string which I want (or rather have) to output, which means I need it in a text file. Not a big problem, eh? The actual problem is that I want line breaks instead of commas. I know I could just replace them after the file is written, but that is unnecessary if there is a better way.
The Output looks like this
[/rechtschreibung/_n, /rechtschreibung/_nauf, /rechtschreibung/_naus,
/rechtschreibung/_Ndrangheta, ....]
I want it to look like this
/rechtschreibung/_n
/rechtschreibung/_nauf
/rechtschreibung/_naus
/rechtschreibung/_Ndrangheta
I won't need this method later, because I will store this and some other information in an SQL database, but it will help me build the program step by step and learn some more Java ;)
So here is my code snippet
BufferedWriter bw = null;
//PrintWriter out
//= new PrintWriter(new BufferedWriter(new FileWriter("foo.out")));
try {
bw = new BufferedWriter(new FileWriter("bfwr.txt"));
bw.write(test5.getWoerterListe().toString());
bw.newLine();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
/*
try {
PrintWriter out = new PrintWriter(new FileWriter("prwr.txt"));
out.print(test5.getWoerterListe());
out.close();
System.out.printf("Testing String");
} catch (IOException e) {
e.printStackTrace();
}
*/
/*
try {
FileWriter test10 = new FileWriter("test.txt");
test10.write(test5.getWoerterListe().toString());
test10.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
*/
Please be nice to me :D
Assistance appreciated =)
EDIT #1
Code directly before first one.
Oberordner test2 = new Oberordner("http://www.duden.de/definition");
Unterordner test3 = new Unterordner(test2.getOberOrdner());
WoerterListe test5 = new WoerterListe(test3.getUnterOrdnerURL());
test5.setWoerterListe();
and the very end of WoerterListe.java:
public ArrayList<String> getWoerterListe(){
return WoerterListe;
}
Additional information: the string is not hard-coded because there are tens of thousands of words like '/rechtschreibung/*'.
By the way, the site's language is German, so unfortunately I have to use German words =(

I'm not a Java developer and you didn't state what getWoerterListe() returns, but here's my guess.
getWoerterListe() probably returns a list of strings, and the default behaviour of toString() on a list is to produce the elements separated by ", " and wrapped in square brackets. So instead of calling toString() on the list, loop through it and write out each element followed by a line break (BufferedWriter.newLine() in Java).
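A minimal sketch of that loop, assuming getWoerterListe() returns a List<String> (as the question's EDIT #1 suggests). The class name ListToLines and the sample words are placeholders for illustration only. Note also that the BufferedWriter in the question is never closed, so buffered data may never reach the file; try-with-resources takes care of that here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class ListToLines {

    // Writes every element on its own line instead of relying on
    // List.toString(), which produces "[a, b, c]".
    static void writeLines(List<String> words, Path target) throws IOException {
        // try-with-resources flushes and closes the writer automatically.
        try (BufferedWriter bw = Files.newBufferedWriter(target)) {
            for (String word : words) {
                bw.write(word);
                bw.newLine(); // platform-specific line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for test5.getWoerterListe() (hypothetical sample data).
        List<String> words = Arrays.asList("/rechtschreibung/_n", "/rechtschreibung/_nauf");
        writeLines(words, Paths.get("bfwr.txt"));
    }
}
```

If the list is already complete in memory, Files.write(path, words) should achieve the same in a single call.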

Code:
String s = "[/rechtschreibung/_n, /rechtschreibung/_nauf, "
+ "/rechtschreibung/_naus, /rechtschreibung/_Ndrangheta, ....]";
String srp = s.replaceAll("\\[|\\]|\\.+" ,"");
String[] sp = srp.split(",");
for (int i = 0; i < sp.length; i++) {
System.out.println(sp[i].trim());
}
Output:
/rechtschreibung/_n
/rechtschreibung/_nauf
/rechtschreibung/_naus
/rechtschreibung/_Ndrangheta
Explanation:
I assumed [/rechtschreibung/_n, /rechtschreibung/_nauf, /rechtschreibung/_naus, /rechtschreibung/_Ndrangheta, ....] is a String. I removed all unnecessary characters ([, ], and any run of .) from it. After that, I split it on , and printed each trimmed element on its own line.

Related

OutputStreamWriter only writing one item into file

I have used the following code to write elements from an ArrayList into a file, to be retrieved later using StringTokenizer. It works perfectly for 3 other ArrayLists, but for this particular one it throws an exception when reading with .nextToken(), and further troubleshooting with .countTokens() shows that there is only 1 token. The delimiter for both writing and reading is the same, ",", as for the other ArrayLists.
I'm puzzled why it doesn't work the way it does with the other lists when I have not changed the code structure.
=================Writing to file==================
public static void copy_TimeZonestoFile(ArrayList<AL_TimeZone> timezones, Context context){
try {
FileOutputStream fileOutputStream = context.openFileOutput("TimeZones.dat",Context.MODE_PRIVATE);
OutputStreamWriter writerFile = new OutputStreamWriter(fileOutputStream);
int TZsize = timezones.size();
for (int i = 0; i < TZsize; i++) {
writerFile.write(
timezones.get(i).getRegion() + "," +
timezones.get(i).getOffset() + "\n"
);
}
writerFile.flush();
writerFile.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
==========Reading from file (nested in thread/runnable combo)===========
public void run() {
if (fileTimeZones.exists()){
System.out.println("Timezone file exists. Loading.. File size is : " + fileTimeZones.length());
try{
savedTimeZoneList.clear();
BufferedReader reader = new BufferedReader(new InputStreamReader(openFileInput("TimeZones.dat")));
String lineFromTZfile = reader.readLine();
while (lineFromTZfile != null ){
StringTokenizer token = new StringTokenizer(lineFromTZfile,",");
AL_TimeZone timeZone = new AL_TimeZone(token.nextToken(),
token.nextToken());
savedTimeZoneList.add(timeZone);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e){
e.printStackTrace();
}
}
}
===================Trace======================
I/System.out: Timezone file exists. Loading.. File size is : 12373
W/System.err: java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
at com.cryptotrac.trackerService$1R_loadTimeZones.run(trackerService.java:215)
W/System.err: at java.lang.Thread.run(Thread.java:764)
It appears that this line of your code is causing the java.util.NoSuchElementException to be thrown.
AL_TimeZone timeZone = new AL_TimeZone(token.nextToken(), token.nextToken());
That probably means that at least one of the lines in file TimeZones.dat does not contain precisely two strings separated by a single comma.
This can be easily checked by making sure that the line that you read from the file is a valid line before you try to parse it.
Using method split, of class java.lang.String, is preferable to using StringTokenizer. Indeed the javadoc of class StringTokenizer states the following.
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
Try the following.
String lineFromTZfile = reader.readLine();
while (lineFromTZfile != null ){
String[] tokens = lineFromTZfile.split(",");
if (tokens.length == 2) {
// valid line, proceed to handle it
}
else {
// optionally handle an invalid line - maybe write it to the app log
}
lineFromTZfile = reader.readLine(); // Read next line in file.
}
There are probably several things wrong. I'd actually expect you to run into an infinite loop, because you read only the first line of the file and then parse it repeatedly.
You should check following things:
Make sure that you are writing the file correctly. What does the written file exactly contain? Are there new lines at the end of each line?
Make sure that the data written (in this case, "region" and "offset") never contain a comma, otherwise parsing will break. I expect that there is a very good chance that "region" contains a comma.
When reading files you always need to assume that the file (format) is broken. For example, assume that readLine will return an empty line or something that contains more or less than one comma.
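The checks above could be sketched as follows. This is only an illustration: the class TimeZoneLineParser, its method names and the sample data are made up, and it returns plain String pairs instead of the question's AL_TimeZone objects.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class TimeZoneLineParser {

    // Returns {region, offset} for a well-formed line, or null otherwise.
    static String[] parseLine(String line) {
        if (line == null || line.trim().isEmpty()) {
            return null; // empty line
        }
        // Limit -1 keeps trailing empty fields, so "a," yields two fields.
        String[] fields = line.split(",", -1);
        if (fields.length != 2 || fields[0].isEmpty() || fields[1].isEmpty()) {
            return null; // too few or too many commas, or an empty field
        }
        return fields;
    }

    static List<String[]> readAll(BufferedReader reader) throws IOException {
        List<String[]> result = new ArrayList<>();
        String line;
        // readLine() is called on every iteration, so the loop always advances.
        while ((line = reader.readLine()) != null) {
            String[] fields = parseLine(line);
            if (fields != null) {
                result.add(fields);
            } else {
                System.err.println("Skipping malformed line: " + line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file content: one good line, one empty, one with an extra comma.
        String data = "Europe/Berlin,+01:00\n\nAmerica/Indiana, Knox,-06:00\n";
        List<String[]> zones = readAll(new BufferedReader(new StringReader(data)));
        System.out.println(zones.size() + " valid line(s)"); // prints "1 valid line(s)"
    }
}
```

If a region can legitimately contain a comma, a different delimiter (a tab, for instance) on both the writing and the reading side would be the simpler fix.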

I need to contain all matches of a Regex into a text file; I'm new to java programming

I'm trying to save all matches found to a text document. I have been banging my head on my desk for the past 3 hours and figured it was time I asked for help.
My current issue is with the List<String>, and I'm not sure whether the information added to it is wrong or whether it's my file-writing code. Nothing is written to the file, and with other means of printing, such as writer.println(returnvalue), only one of the matches is written rather than all of them. I print the matches to the console just to make sure they are found, and they are.
Edit2: Sorry, this is my first question on Stack Overflow. I guess my question is: how would you print all the data from a list to a text file?
Edit3: My newest problem is printing out all matches; I am currently stuck printing only the last match. Any advice?
public static void RegexChecker(String TheRegex, String line){
String Result= "";
List<String> returnvalue = new ArrayList<String>();
Pattern checkRegex = Pattern.compile(TheRegex);
Matcher regexMatcher = checkRegex.matcher(line);
int count = 0 ;
FileWriter writer = null;
try {
writer = new FileWriter("output.txt");
} catch (IOException e1) {
e1.printStackTrace();
}
while ( regexMatcher.find() ){
if (regexMatcher.group().length() != 0){
returnvalue.add(regexMatcher.group());
System.out.println( regexMatcher.group().trim() );
}
for(String str: returnvalue) {
try {
out.write(String.valueOf(returnvalue.get(i)));
writer.write(str);
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Move the for loop out of the while loop. You want to write to the file only after all matches have been added to the list. The for-each block needs some modifications as well.
The for-each construct gives you values from iteration over the collection. You need not obtain the values again using an index.
Try this:
while (regexMatcher.find()) {
if (regexMatcher.group().length() != 0) {
returnvalue.add(regexMatcher.group());
System.out.println(regexMatcher.group().trim());
}
}
try {
for (String str : returnvalue) {
writer.write(str + "\n");
}
writer.flush();
writer.close();
} catch (IOException e) {
e.printStackTrace();
}
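For comparison, here is a self-contained variant of the same idea that separates collecting the matches from writing them and lets try-with-resources close the file. It is only a sketch; the class RegexMatchWriter, its method names and the sample input are invented for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMatchWriter {

    // Collects every non-empty match of the regex in the input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            if (!m.group().isEmpty()) {
                matches.add(m.group());
            }
        }
        return matches;
    }

    // Writes the matches once, one per line, after the search has finished.
    static void writeAll(List<String> matches, String fileName) throws IOException {
        try (PrintWriter out = new PrintWriter(fileName)) {
            for (String match : matches) {
                out.println(match);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input: extracts the digit runs 1, 22 and 333.
        writeAll(findAll("\\d+", "a1b22c333"), "output.txt");
    }
}
```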

Why is website crawling taking forever?

public class Parser {
public static void main(String[] args) {
Parser p = new Parser();
p.matchString();
}
parserObject courseObject = new parserObject();
ArrayList<parserObject> courseObjects = new ArrayList<parserObject>();
ArrayList<String> courseNames = new ArrayList<String>();
String theWebPage = " ";
{
try {
URL theUrl = new URL("http://ocw.mit.edu/courses/");
BufferedReader reader =
new BufferedReader(new InputStreamReader(theUrl.openStream()));
String str = null;
while((str = reader.readLine()) != null) {
theWebPage = theWebPage + " " + str;
}
reader.close();
} catch (MalformedURLException e) {
// do nothing
} catch (IOException e) {
// do nothing
}
}
public void matchString() {
// this is my regex that I am using to compare strings on input page
String matchRegex = "#\\w+(-\\w+)+";
Pattern p = Pattern.compile(matchRegex);
Matcher m = p.matcher(theWebPage);
int i = 0;
while (!m.hitEnd()) {
try {
System.out.println(m.group());
courseNames.add(i, m.group());
i++;
} catch (IllegalStateException e) {
// do nothing
}
}
}
}
What I am trying to achieve with the above code is to get the list of departments on the MIT OpenCourseWare website. I am using a regular expression that matches the pattern of the department names as in the page source, and a Pattern object and a Matcher object to find() and print the department names that match. But the code is taking forever to run, and I don't think reading in a web page using BufferedReader takes that long. So I think I am either doing something horribly wrong or parsing websites takes a ridiculously long time. I would appreciate any input on how to improve performance or correct a mistake in my code, if any. I apologize for the badly written code.
The problem is with the code
while ((str = reader.readLine()) != null)
theWebPage = theWebPage + " " +str;
The variable theWebPage is a String, which is immutable. For each line read, this code creates a new String with a copy of everything that's been read so far, with a space and the just-read line appended. This is an extraordinary amount of unnecessary copying, which is why the program is running so slowly.
I downloaded the web page in question. It has 55,000 lines and is about 3.25MB in size. Not too big. But because of the copying in the loop, lines end up being copied about 1.5 billion times in total (1/2 of 55,000 squared); the first line alone is copied on every one of the roughly 55,000 iterations. The program is spending all its time copying and garbage collecting. I ran this on my laptop (2.66GHz Core2Duo, 1GB heap) and it took 15 minutes to run when reading from a local file (no network latency or web crawling countermeasures).
To fix this, make theWebPage into a StringBuilder instead, and change the line in the loop to be
theWebPage.append(" ").append(str);
You can convert theWebPage to a String using toString() after the loop if you wish. When I ran the modified version, it took a fraction of a second.
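Put together, the fixed read loop might look like the sketch below. The class PageReader and its method are illustrative names, and a StringReader stands in for the network stream so the example runs offline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class PageReader {

    // Accumulates all lines in one growable buffer, so each line is
    // appended once rather than re-copying the whole text per line.
    static String readAll(BufferedReader reader) throws IOException {
        StringBuilder page = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            page.append(' ').append(line);
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for new InputStreamReader(theUrl.openStream()).
        BufferedReader reader = new BufferedReader(new StringReader("first\nsecond\nthird"));
        System.out.println(readAll(reader)); // prints " first second third"
    }
}
```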
BTW your code is using a bare code block within { } inside a class. This is an instance initializer (as opposed to a static initializer). It gets run at object construction time. This is legal, but it's quite unusual. Notice that it misled other commenters. I'd suggest converting this code block into a named method.
Is this your whole program? Where is the declaration of parserObject?
Also, shouldn't all of this code be in your main() prior to calling matchString()?
parserObject courseObject = new parserObject();
ArrayList<parserObject> courseObjects = new ArrayList<parserObject>();
ArrayList<String> courseNames = new ArrayList<String>();
String theWebPage=" ";
{
try {
URL theUrl = new URL("http://ocw.mit.edu/courses/");
BufferedReader reader = new BufferedReader(new InputStreamReader(theUrl.openStream()));
String str = null;
while((str = reader.readLine())!=null)
{
theWebPage = theWebPage+" "+str;
}
reader.close();
} catch (MalformedURLException e) {
} catch (IOException e) {
}
}
You are also catching exceptions and not displaying any error messages. You should always display an error message and do something when you encounter an exception. For example, if you can't download the page, there is no reason to try to parse an empty string.
From your comment I learned about static blocks in classes (thank you, didn't know about them). However, from what I've read you need to put the keyword static before the start of the block {. Also, it might just be better to put the code into your main, that way you can exit if you get a MalformedURLException or IOException.
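To make the difference between the two kinds of block concrete: a block marked static runs once when the class is initialized, while a bare { } block (as in the question) is an instance initializer that runs on every object construction, before the constructor body. A small sketch, with InitializerDemo as a made-up class name:

```java
import java.util.ArrayList;
import java.util.List;

class InitializerDemo {

    // Records the order in which the pieces run.
    static final List<String> LOG = new ArrayList<>();

    static {
        LOG.add("static initializer"); // runs once, at class initialization
    }

    {
        LOG.add("instance initializer"); // runs on every new, before the constructor body
    }

    InitializerDemo() {
        LOG.add("constructor");
    }

    public static void main(String[] args) {
        new InitializerDemo();
        new InitializerDemo();
        // prints [static initializer, instance initializer, constructor,
        //         instance initializer, constructor]
        System.out.println(LOG);
    }
}
```

So the bare block in the question is legal without the static keyword; it just runs as part of new Parser().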
You can, of course, solve this assignment with the limited JDK 1.0 API, and run into the issue that Stuart Marks helped you solve in his excellent answer.
Or, you just use a popular de-facto standard library, like for instance, Apache Commons IO, and read your website into a String using a no-brainer like this:
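// using this...
import org.apache.commons.io.IOUtils;
import java.nio.charset.StandardCharsets;
// run this...
try (InputStream is = new URL("http://ocw.mit.edu/courses/").openStream()) {
    // Pass the charset explicitly (UTF-8 is an assumption about the page);
    // the single-argument overload is deprecated in recent Commons IO versions.
    theWebPage = IOUtils.toString(is, StandardCharsets.UTF_8);
}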

Parse String Output to File

The first part of this “Frankenstein-ed” Java works perfectly, however the second part outputs some jumbled nonsense. So the variable of result will be my input from the user. I had to first UpperCase the string before I did the parsing for some dumb reason, it’s hard when you come from the Database/Analysis background and know you do something in seconds and not get an error... I gave credit where credit is due within the code...
myfile.txt ---> [Ljava.lang.String;@19821f
import java.io.*;
/*http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#split%28java.lang.String%29*/
public class StringParser {
public static void main (String arg[])
throws FileNotFoundException {
String result = "eggs toast bacon bacon butter ice beer".toUpperCase();
String[] resultU = result.split("\\s");
String[] y = resultU;
{
for (int x=0; x< resultU.length; x++)
System.out.println(resultU[x]);
/*http://www.javacoffeebreak.com/java103/java103.html#output*/
FileOutputStream out; // declare a file output object
PrintStream p; // declare a print stream object
try
{
// Create a new file output stream
// connected to "myfile.txt"
out = new FileOutputStream("myfile.txt");
// Connect print stream to the output stream
p = new PrintStream( out );
p.println (resultU);
p.close();
}
catch (Exception e)
{
System.err.println ("Error writing to file");
}
}
}
}
Do you realize you're overwriting the same file for each element in your array?
You should use
out = new FileOutputStream("myfile.txt", true); // appends to existing file
As well as printing the actual element, not the String representation of the whole array
p.println(resultU[x]); // resultU without index prints the whole array - yuk!
Although you should probably update your code to only create the output File once and just write each element of the array to the same output stream, as the current method is a bit inefficient.
Something like
public static void main(String[] args) {
String result = "eggs toast bacon bacon butter ice beer".toUpperCase();
PrintStream p = null;
try {
p = new PrintStream(new FileOutputStream("myfile.txt"));
for (String s : result.split("\\s")) {
p.println(s);
p.flush(); // probably not necessary
}
} catch (Exception e) {
e.printStackTrace(); // should really use a logger instead!
} finally {
try {
p.close(); // wouldn't need this in Java 7!
} catch (Exception e) {
}
}
}
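As the comment on p.close() hints, from Java 7 on the finally block can be replaced by try-with-resources. A minimal sketch of that variant; the class WordsToFile and its method are illustrative names, not part of the original code:

```java
import java.io.FileNotFoundException;
import java.io.PrintStream;

class WordsToFile {

    // Upper-cases the text, splits it on whitespace and writes one word per line.
    static void writeWords(String text, String fileName) throws FileNotFoundException {
        // The stream is closed automatically, even if println throws.
        try (PrintStream p = new PrintStream(fileName)) {
            for (String word : text.toUpperCase().split("\\s+")) {
                p.println(word);
            }
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        writeWords("eggs toast bacon bacon butter ice beer", "myfile.txt");
    }
}
```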
You have to iterate the array and write each element one after one.
FileOutputStream out; // declare a file output object
PrintStream p; // declare a print stream object
try
{
out = new FileOutputStream("myfile.txt");
p = new PrintStream( out );
for(String str:resultU)
{
p.println (str);
}
p.close();
}
catch (Exception e)
{
System.err.println ("Error writing to file");
}
Your line
p.println (resultU);
is printing a string representation of the array itself, not the elements in it. To print the elements, you'll need to loop through the array and print them out individually. The Arrays class has a convenience method to do this for you, of course.
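The convenience method is presumably Arrays.toString, which prints the elements in [a, b, c] form. If one element per line is wanted without an explicit loop, String.join (Java 8+) is another option. A short sketch with made-up names:

```java
import java.util.Arrays;

class ArrayPrinting {

    // "[EGGS, TOAST]" style: handy for debugging output.
    static String bracketed(String[] items) {
        return Arrays.toString(items);
    }

    // One element per line, joined with the platform line separator.
    static String onePerLine(String[] items) {
        return String.join(System.lineSeparator(), items);
    }

    public static void main(String[] args) {
        String[] resultU = "eggs toast bacon".toUpperCase().split("\\s");
        System.out.println(bracketed(resultU)); // prints [EGGS, TOAST, BACON]
        System.out.println(onePerLine(resultU));
    }
}
```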
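That "jumbled nonsense" is the default Object.toString() output for an array: the type name ([Ljava.lang.String;) followed by the object's hash code in hexadecimal. It is not the array's contents, but that's not important right now.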
The solution to your problem is this:
try {
    FileOutputStream out = new FileOutputStream("myfile.txt", true);
    PrintStream p = new PrintStream(out); // the variable name p was missing
    for (String s : resultU)
        p.println(s);
    p.close();
} catch (Exception e) {
    e.printStackTrace();
}
This replaces your entire for loop.

printing HL7 message in console

I'm passing an object to constructor and then adding parameters of this object to HL7.
ORU_R01 is the type of HL7.
When I print the HL7 message to the console, only the last OBX is printed.
What is wrong with my code?
How can I write this HL7 message to a socket?
Is there a simpler way to handle HL7 in Java?
public class FlexSMessageHL7 {
private FileWriter writeHL7ToFile;
private PrismaflexSMessage sMessage;
private ORU_R01 message;
private int i = 0;
private OBX obx = null;
public FlexSMessageHL7(FlexSMessage sMessage) {
this.sMessage = sMessage;
this.message = new ORU_R01();
createHL7SMessage();
}
public void createHL7SMessage() {
// Populate the MSH Segment
MSH msh = message.getMSH();
try {
msh.getFieldSeparator().setValue("|");
msh.getEncodingCharacters().setValue("^~\\&");
msh.getDateTimeOfMessage().setValue(sMessage.getTime().toString());
msh.getSendingApplication().getNamespaceID().setValue(String.valueOf(sMessage.getMachID()));
} catch (DataTypeException e) {
e.printStackTrace();
}
// Populate the OBR Segment:time
OBR obr = message.getPATIENT_RESULT().getORDER_OBSERVATION().getOBR();
try {
obr.getObservationDateTime().setValue(String.valueOf(sMessage.getTime()));
} catch (DataTypeException e) {
e.printStackTrace();
}
// Populate the PID Segment:PatientId
PID pid = message.getPATIENT_RESULT().getPATIENT().getPID();
try {
pid.getPatientID().getIDNumber().setValue(sMessage.getPatID());
} catch (HL7Exception e) {
e.printStackTrace();
}
// Populate the OBX Segment:Param_Code, time, Measure_Value
while (i < sMessage.getMsgInfo()) {
for (PrismaflexSRecord sRecord : sMessage.getsRecordCollection()) {
try {
obx = message.getPATIENT_RESULT().getORDER_OBSERVATION().getOBSERVATION(i).getOBX();
obx.getSetIDOBX().setValue(String.valueOf(i));
obx.getObservationIdentifier().getIdentifier().setValue(sRecord.getParamCode());
obx.getDateTimeOfTheObservation().setValue(String.valueOf(sRecord.getTimeStamp()));
obx.getObservationIdentifier().getNameOfCodingSystem().setValue(String.valueOf(sRecord.getMeasureValue()));
i++;
} catch (HL7Exception e) {
e.printStackTrace();
}
}
}
try {
writeHL7ToFile = new FileWriter(File.createTempFile("prismaflexOutputFrom3001HL7", "txt", new File
("c:\\tmp\\prismaflex")));
writeHL7ToFile.write(message.getMSH().toString());
writeHL7ToFile.flush();
} catch (IOException e) {
e.printStackTrace();
}
// Now, Encode the message and look at the output
try {
Parser parser = new PipeParser();
String encodedMessage = parser.encode(message);
System.out.println("Printing HL7 Encoded Message:");
System.out.println(encodedMessage);
} catch (HL7Exception e) {
e.printStackTrace();
}
}
}
As Nicholas Orlowski pointed out, the problem is the segment terminators, which according to the HL7 standard are CR characters. In a Windows command prompt a CR only moves the cursor back to the beginning of the line, so each segment is overwritten by the next one. For console output you therefore need to replace the line endings with something else.
For a recent HL7 app using HAPI, which You also seem to be using, I made a little helper method to achieve this function:
private static String replaceNewlines(String input) {
return input.replaceAll("\\r", "\n");
}
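The function can be used on all platforms: it replaces each CR character with LF ("\n"), which consoles render as a line break. If you need the OS-specific line ending instead, use System.lineSeparator() as the replacement.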
Then I can use it to output to console as follows:
LOGGER.trace("Generated message contents:\n" + replaceNewlines(outMessage.encode()));
In this case I am using log4j for logging to console, not simple console printout, but the problem was the same for me.
Hope it helps!
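A variant of that helper using the platform line separator could look like this. It is only a sketch: it assumes segments are terminated by a bare CR, and the class name and the sample message are invented for illustration, not real HAPI output.

```java
class Hl7ConsoleFormat {

    // Replaces each CR segment terminator with the platform line separator
    // so that every segment appears on its own console line.
    static String forConsole(String encodedMessage) {
        return encodedMessage.replace("\r", System.lineSeparator());
    }

    public static void main(String[] args) {
        // Hypothetical, heavily shortened message with two OBX segments.
        String encoded = "MSH|^~\\&|demo\rOBX|1|x\rOBX|2|y\r";
        System.out.print(forConsole(encoded));
    }
}
```

String.replace works on literal text, so no regular expression is involved here.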
Have you considered using HAPI? It is written for Java; its counterpart nHAPI is written for .NET. Details here:
http://hl7api.sourceforge.net/
I had a similar problem in my Python HL7py library. The console often does not display CR characters well. I had to write a helper that changed CR to LF (line feed) to display the lines correctly. Hope that helps.
It won't display in the console but it will when you write to the file. Try looking at the variable in debug mode and writing it to a file.
