EDIT: children is an array of directories. This code loops through the array, enters each directory, and loads all the files listed there into the array webs. Then, for each file, the readFile function is supposed to read the file.
My code is:
for (File cat: children) {
File[] webs = cat.listFiles();
System.out.println(" Indexing category: " + cat.getName());
for (File f: webs) {
Web w = readFile(f);
// Do things with w
}
}
I'm getting this error:
org.htmlparser.util.ParserException: Error in opening a connection to 209800.webtrec
209801.webtrec
...
422064.webtrec
422071.webtrec
422087.webtrec
422089.webtrec
422112.webtrec
422125.webtrec
422127.webtrec
;
java.io.IOException: File Name Too Long
at java.io.UnixFileSystem.canonicalize0(Native Method)
at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:172)
at java.io.File.getCanonicalPath(File.java:576)
at org.htmlparser.http.ConnectionManager.openConnection(ConnectionManager.java:848)
at org.htmlparser.Parser.setResource(Parser.java:398)
at org.htmlparser.Parser.<init>(Parser.java:317)
at org.htmlparser.Parser.<init>(Parser.java:331)
at IndexGenerator.IndexGenerator.readFile(IndexGenerator.java:156)
at IndexGenerator.IndexGenerator.main(IndexGenerator.java:101)
It's strange because I don't see any of those files in that directory.
Thanks!
EDIT2: This is the readFile function. It loads the contents of the file into a string and parses it. The files are HTML files.
private static Web readFile(File file) {
try {
FileInputStream fin = new FileInputStream(file);
FileChannel fch = fin.getChannel();
// map the contents of the file into ByteBuffer
ByteBuffer byteBuff = fch.map(FileChannel.MapMode.READ_ONLY,
0, fch.size());
// convert ByteBuffer to CharBuffer
// CharBuffer chBuff = Charset.defaultCharset().decode(byteBuff);
CharBuffer chBuff = Charset.forName("UTF-8").decode(byteBuff);
String f = chBuff.toString();
// Close the input stream. Doing this also closes the channel associated with it
fin.close();
Parser parser = new Parser(f);
Visitor visit = new Visitor();
parser.visitAllNodesWith((NodeVisitor)visit);
return new Web(visit.getCat(), visit.getBody(), visit.getTitle());
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ParserException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return null;
}
Okay, finally I got the solution.
It was a very stupid error. I had a file in that directory that contained the names of all the empty HTML files I had deleted in a previous task. I was trying to parse it, and the parser interpreted its contents as a URL rather than as HTML (since it contains no tags and a lot of dots). I couldn't find the file easily because I have millions of files in that folder.
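A cheap sanity check before parsing would have surfaced the offending file sooner. The sketch below is only an illustration: the class name HtmlSniffer and the method looksLikeHtml are hypothetical and not part of htmlparser, and the heuristic merely tests whether the first non-whitespace character opens a tag.

```java
// Hypothetical helper (not part of htmlparser): a rough pre-filter that
// rejects content which cannot be markup, such as a plain list of file names.
class HtmlSniffer {

    // Returns true if the first non-whitespace character is '<'.
    // This is only a heuristic; it does not validate the HTML.
    static boolean looksLikeHtml(String content) {
        if (content == null) {
            return false;
        }
        String trimmed = content.trim();
        return !trimmed.isEmpty() && trimmed.charAt(0) == '<';
    }

    public static void main(String[] args) {
        System.out.println(looksLikeHtml("<html><body>hi</body></html>"));
        System.out.println(looksLikeHtml("209800.webtrec\n209801.webtrec"));
    }
}
```

In readFile, such a check could run on the decoded string f just before the parser is constructed, logging file.getName() and returning null when it fails, so the stray file is named directly instead of appearing inside a ParserException.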
class HelloWorld {
public static void main(String args[]) {
File file = new File("d://1.mp4");
FileInputStream fr = null;
FileOutputStream fw = null;
byte a[] = new byte[(int) file.length()];
try {
fr = new FileInputStream(file);
fw = new FileOutputStream("d://2.mp4");
fr.read(a);
fw.write(a);
fw.write(a);
fw.write(a);
fw.write(a);
fw.write(a);
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
try {
fr.close();
fw.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
Here I call fw.write(a) five times and the file size increases to 5x, but the original 1.mp4 and the copy 2.mp4 both have the same playing length, i.e. 3:30 minutes. Why?
Duplicating the bytes of a file does not necessarily mean that software reading it will see the content duplicated. For example, a video player might read the data until it encounters some terminator and not look any further. That terminator would sit at the end of the first copy of the file data.
You could open the new file with a hex editor and check if you can see the data of the original video file five times in a row.
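The same check can be done programmatically instead of with a hex editor. This is a minimal sketch with hypothetical names (RepeatCheck, countRepetitions); it reads both files fully into memory, so it assumes files well below 2 GB.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical checker: counts how many back-to-back copies of the original
// file make up the copy, or returns -1 if the copy is not an exact repetition.
class RepeatCheck {

    static int countRepetitions(Path original, Path copy) throws IOException {
        // Both files are loaded completely, so this suits moderately sized files only.
        byte[] src = Files.readAllBytes(original);
        byte[] dst = Files.readAllBytes(copy);
        if (src.length == 0 || dst.length % src.length != 0) {
            return -1;
        }
        int count = dst.length / src.length;
        for (int i = 0; i < count; i++) {
            byte[] chunk = Arrays.copyOfRange(dst, i * src.length, (i + 1) * src.length);
            if (!Arrays.equals(src, chunk)) {
                return -1;
            }
        }
        return count;
    }
}
```

For the code in the question, a result of 5 would confirm that the data really is written five times even though the player only shows one clip.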
FileOutputStream fooStream = new FileOutputStream("FilePath", false);
This overwrites the existing content instead of appending to it, so the resulting file is the same size as the original.
I will get dynamic paths from a database. Example:
1. xyz/abc/file1.txt
2. pqr/file2.txt
Now I need to append these paths to an existing directory (e.g. /users/rama/) and save the file there.
My final path should look like /users/rama/xyz/abc/file1.txt
I am able to create directories such as xyz/abc if they don't exist, but the problem is file1.txt is also created as directory instead of file.
I am able to create directories such as xyz/abc if they don't exist,
but the problem is file1.txt is also created as directory instead of
file.
Because you are creating directories all the way up to and including *.txt. Below is example code to achieve what you want:
String prefix = "/users/rama/";
String filePath = "xyz/abc/file1.txt";
// concatenation => /users/rama/xyz/abc/file1.txt
String fullPath = prefix.concat(filePath);
PrintWriter writer;
try {
// Getting the directory path : /users/rama/xyz/abc
int lastIndexOfSlash = fullPath.lastIndexOf("/");
// Take the substring from fullPath: the index was computed on fullPath, not filePath
String path = fullPath.substring(0, lastIndexOfSlash);
File file = new File(path);
// If /users/rama/xyz/abc doesn't exist, create it.
if(!file.exists()) {
file.mkdirs();
}
// Creating the file.
writer = new PrintWriter(fullPath, "UTF-8");
writer.println("content");
writer.close();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
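The same idea can also be written with java.nio.file, which separates "create the parent directories" from "create the file" explicitly. This is a sketch under assumptions: PathWriter and writeUnder are hypothetical names, and the relative path is assumed not to start with a slash (otherwise resolve would ignore the base directory).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PathWriter {

    // Creates any missing parent directories of base/relative, then writes
    // the file itself, so the last path element ends up as a regular file.
    static Path writeUnder(Path base, String relative, String content) throws IOException {
        Path target = base.resolve(relative).normalize();
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // does nothing if the directory already exists
        }
        return Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for /users/rama/ here.
        Path base = Files.createTempDirectory("rama");
        Path written = writeUnder(base, "xyz/abc/file1.txt", "content");
        System.out.println(written + " is a regular file: " + Files.isRegularFile(written));
    }
}
```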
I am really confused. I used the
String path = myclass.class.getProtectionDomain().getCodeSource().getLocation()
.toString().replace("file:/", "") + "Myfile.txt"
(The .replace("file:/", "") is there or else it outputs file:/C:Insertpathhere)
to get the containing directory of the running jar. When I print this in the console, it prints C:/users/username/desktop/Myfile.txt. However, when I use a BufferedWriter with the same variable path, it writes the file to C:/users/username/desktop/Maze.jarMyfile.txt (where Maze.jar is the name of the jar file).
I'm really stumped, can anyone help?
Full code:
(Where maz is a 2D character array of a generated map and genmaze is a 2D String array.)
String path = myclass.class.getProtectionDomain().getCodeSource().getLocation().toString().replace("file:/", "") + "Myfile.txt"
System.out.println(Values.mazegen);
BufferedWriter writer = null;
try {
writer = new BufferedWriter(new FileWriter(path));
for(int i=0;i<r;i++){
for(int j=0;j<c;j++){
genmaze[i][j] = Character.toString(maz[i][j]);
writer.write(maz[i][j]);
System.out.print(maz[i][j]);
}
writer.newLine();
System.out.println();
}
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} finally {
try {
writer.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
This works as it should:
ProtectionDomain pd = Y.class.getProtectionDomain();
CodeSource cs = pd.getCodeSource();
URL url = cs.getLocation();
System.out.println( "URL=" + url );
String path = url.toString().replace("file:", "") + "Myfile.txt";
and the file has the correct name.
It shows the correct URL: that of a directory if I run the .class file directly, or that of the jar if I pack it into a jar.
Please note, however, that I left the '/' out of the replaced string (I replace "file:" rather than "file:/"). Removing the slash as well would produce a relative pathname on Unix-like systems, and there's no telling where the file would end up.
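When the class is packed into a jar, getLocation() points at the jar file itself, which is why appending "Myfile.txt" to it yields Maze.jarMyfile.txt. A hedged sketch of one way around this follows; JarDir and siblingOf are hypothetical names, and the code assumes the code source is a file: URL on the local file system.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class JarDir {

    // If the code source is a file (a jar), use its parent directory;
    // if it is already a directory (classes run unpacked), use it directly.
    static Path siblingOf(Path codeSource, String fileName) {
        Path dir = Files.isDirectory(codeSource) ? codeSource : codeSource.getParent();
        return dir.resolve(fileName);
    }

    public static void main(String[] args) throws URISyntaxException {
        URL url = JarDir.class.getProtectionDomain().getCodeSource().getLocation();
        // Converting through a URI avoids string surgery on "file:/" prefixes.
        Path codeSource = Paths.get(url.toURI());
        System.out.println(siblingOf(codeSource, "Myfile.txt"));
    }
}
```

The resulting Path can be handed to Files.newBufferedWriter in place of the FileWriter used in the question.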
I'm using a memory-mapped buffer to read a file. Initially I get the channel size and map that many bytes of the file into memory, with an initial position of 0 because I want to map the file from the beginning. Then another 400KB of data is added to the file, and I want to map only those 400KB. But something is wrong in my code that I can't figure out, and I'm getting this:
260java.io.IOException: Channel not open for writing - cannot extend file to required size
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:812)
at trailreader.main(trailreader.java:55
So here's my code
BufferedWriter bw;
FileInputStream fileinput = null;
try {
fileinput = new FileInputStream("simple.csv");
} catch (FileNotFoundException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
FileChannel channel = fileinput.getChannel();
MappedByteBuffer ByteBuffer;
try {
ByteBuffer = fileinput.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
/*
* Add some 400 bytes to simple.csv. outside of this program...
*/
//following line throw exception.
try {
ByteBuffer = fileinput.getChannel().map(FileChannel.MapMode.READ_ONLY, channel.size(), 400);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
So in my code I'm trying to read the additional data that has been added, but it's not working. I know the problem is channel.size(), but I'm not able to fix it.
channel.size() is always the current end of file. You are attempting to map 400 bytes past it. It isn't there. You need something like:
ByteBuffer = fileinput.getChannel().map(FileChannel.MapMode.READ_ONLY, channel.size()-400, 400);
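If the amount of appended data is not known in advance, one option is to remember the size seen at the first mapping and map from there to the current end. The following is a sketch, not the poster's code: TailMapper, mapAppended and previousSize are hypothetical names, and it assumes the file only ever grows.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TailMapper {

    // Maps only the bytes between a previously recorded size and the
    // current end of the file. Returns null if nothing was appended.
    static MappedByteBuffer mapAppended(Path file, long previousSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long currentSize = channel.size();
            if (currentSize <= previousSize) {
                return null;
            }
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, previousSize, currentSize - previousSize);
        }
    }
}
```

The caller would store channel.size() after the first map call and pass that value as previousSize once the file has grown.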
I'm trying to read a file in Eclipse and print it. The problem is that I always get an error saying the file or directory doesn't exist. I have to use relative paths.
The relevant part of the project layout is:
uva.pfc.refactoringEngine.core <--Project
...
src
uva.pfc.refactoringengine.core.actions <-- Actual Package
...
CreateEnumSetPlusClas.java <-- File from which I want to read the EnumSetPlus.java file
...
EnumSetPlus.java <-- File I want to read and print
This is the code:
String total = "";
String line; // holds each line as it is read
File actual = new File("src/EnumSetPlus.java");
FileReader filereader = null;
try {
filereader = new FileReader(actual);
}
catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
BufferedReader input = new BufferedReader(filereader);
try {
while ((line = input.readLine()) != null)
{
total += line + "\n";
}
input.close();
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.out.println(total);
I think the problem is that I have to do something for the file path to be recognised by the Eclipse project.
Could you help me?
Thanks in advance.
I'd use getClass().getResourceAsStream("/EnumSetPlus.txt"). This looks for the file at the root of the classpath (which is bin/, but all files from src go to bin). You then get an InputStream, which you can adapt to a Reader via new InputStreamReader(stream, encoding).
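Put together, the approach might look like the sketch below. ResourceReader, readAll and readResource are hypothetical names, UTF-8 is an assumed encoding, and the null check is there because getResourceAsStream returns null when the resource is not on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResourceReader {

    // Reads every line from the stream, appending "\n" after each one.
    static String readAll(InputStream stream, Charset charset) throws IOException {
        StringBuilder total = new StringBuilder();
        try (BufferedReader input = new BufferedReader(new InputStreamReader(stream, charset))) {
            String line;
            while ((line = input.readLine()) != null) {
                total.append(line).append('\n');
            }
        }
        return total.toString();
    }

    // Looks the resource up on the classpath; the leading "/" means "from the root".
    static String readResource(String name) throws IOException {
        InputStream stream = ResourceReader.class.getResourceAsStream(name);
        if (stream == null) {
            throw new IOException("Resource not found on classpath: " + name);
        }
        return readAll(stream, StandardCharsets.UTF_8);
    }
}
```

A call such as ResourceReader.readResource("/EnumSetPlus.txt") would then replace the File and FileReader code from the question.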
In Eclipse the current working directory is src by default.
Try this
File actual = new File("EnumSetPlus.txt");
Also I would look into Kevin's answer too. :-)
Try:
String filePath = "EnumSetPlus.java"; // no leading slash: ClassLoader resource names are resolved from the classpath root
File actual = new File(ClassLoader.getSystemResource(filePath).getFile());
Your example says that you want to read a file called EnumSetPlus.java but the source code is looking for a file called EnumSetPlus.txt.