My goal is to check whether a file with a particular name (or part of a name) exists in a folder on the network, also taking into account all folders below it. To do so I need a way to efficiently get a list of all files and folders in and below a given folder. My recursive function does ~2500 items/s on a local drive, but only a few per second on a network drive. I need something faster.
The core question is: what is the fastest way to get a list of items in a folder including the attribute isDirectory or something similar?
I put my hope on the walkFileTree functionality of java.nio, but I am unable to use it. (version: 8.4.0.150421 (R2014b) with Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot™ 64-Bit Server VM mixed mode)
Current problem: I am unable to use any functionality from java.nio
java.io works, e.g. create a file object:
jFile = java.io.File('C:\')
% then use jFile.list or jFile.isDirectory or jFile.toPath, it all works!
Naively calling nio fails:
java.nio.file.Files('C:\')
% -> No constructor 'java.nio.file.Files' with matching signature found.
I realize java.nio.file works a bit differently: to use the methods in Files a Path is needed, which can be constructed with java.nio.file.Paths.get. This method takes a string. But this also fails:
java.nio.file.Paths.get('C:\') % -> No method 'get' with matching signature found for class 'java.nio.file.Paths'.
However the method exists:
methods java.nio.file.Paths
% -> Methods for class java.nio.file.Paths:
equals getClass notify toString
get hashCode notifyAll wait
So what is going wrong here? I am not allowed to feed a matlab string? Should I use a Java string? This too fails:
jString = java.lang.String('C:\');
java.nio.file.Paths.get(jString)
% -> No method 'get' with matching signature found for class 'java.nio.file.Paths'.
An Oracle workaround is to create the path via java.io, but feeding that to java.nio also fails:
path = java.io.File('C:\').toPath;
java.nio.file.Files.isDirectory(path)
% -> No method 'isDirectory' with matching signature found for class 'java.nio.file.Files'.
So I am not getting any closer to even trying the walkFileTree. I can not get java.nio to do anything in Matlab.
Help: so does anybody have any idea on how to call the java.nio.file functions or answer my core question?
PS: examples of straightforward methods tried so far without java.nio. The examples do not include the recursive part, but they show the horrible performance.
strategy 1: recursively use Matlab's 'dir' function. It is a nice function, as it also gives attributes, but it is a bit slow. In my starting network folder (contains 150 files/folders, path stored as string Sdir) the following command takes 34.088842 sec :
tic;d=dir(Sdir);toc
strategy 2: use java.io.File to speed things up, this hardly helps, because isDirectory needs calling.. Using a heuristic on the names of the items is too dangerous, I am forced to use folders with dots in them. Example in same dir, 31.315587 sec:
tic;jFiles = java.io.File(Sdir).listFiles;
LCVdir = arrayfun(@isDirectory, jFiles, 'UniformOutput',0);
toc
Those java.nio.file methods have variadic signatures. Looks like Matlab is unable to do the auto-boxing needed to make them work transparently, so you will need to call them with the array form of their arguments.
The signature for java.nio.file.Paths.get is get(String first, String... more). This is equivalent to get(String first, String[] more).
>> java.nio.file.Paths.get('C:\', javaArray('java.lang.String', 0))
ans =
C:\
>> class(ans)
ans =
sun.nio.fs.UnixPath
Similarly, the signature for java.nio.file.Files.isDirectory is isDirectory(Path path, LinkOption... options), so you need to supply the options argument.
>> p = java.nio.file.Paths.get('/usr/local', javaArray('java.lang.String', 0));
>> java.nio.file.Files.isDirectory(p, javaArray('java.nio.file.LinkOption', 0))
ans =
logical
1
>>
BTW, the Files.walkFileTree method will require you to implement a custom java.nio.file.FileVisitor subclass, which you will need to do in Java, not plain Matlab.
Also, since you're on a network drive, the network file I/O might actually be your bottleneck here, so don't get your hopes too high for the Java NIO solution to be much faster. To make this really fast, the traversal needs to be run on a machine that has fast access to the filesystem data, or even better, something that has indexed it for efficient searching.
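To make the FileVisitor remark concrete, here is a minimal Java sketch of such a helper. The class name NameFinder and its find method are made up for illustration, and the code sticks to Java 7 features because that is the JVM bundled with R2014b. It collects every file or folder under a root whose name contains a given fragment; whether it beats dir on a network share is not something this sketch can promise.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of any library). To call it from Matlab,
// declare it public in its own NameFinder.java and put it on the Java class path.
final class NameFinder {
    // Returns the paths of all files and folders at or below root
    // whose name contains namePart.
    static List<String> find(String root, final String namePart) {
        final List<String> hits = new ArrayList<String>();
        try {
            Files.walkFileTree(Paths.get(root), new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                    check(dir);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    check(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException e) {
                    return FileVisitResult.CONTINUE; // skip unreadable entries
                }

                private void check(Path p) {
                    Path name = p.getFileName(); // null for a root such as C:\
                    if (name != null && name.toString().contains(namePart)) {
                        hits.add(p.toString());
                    }
                }
            });
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return hits;
    }
}
```

Because the visitor receives each entry's BasicFileAttributes along with its path, no separate isDirectory call per item is needed, which is the part that hurt in strategy 2.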
So this trivial question has generated a disproportionate amount of discussion here. It feels more like a playful puzzle but not full-on codegolf.
The javadoc for the NIO.2 Path class includes this part:
A Path is considered to be an empty path if it consists
solely of one name element that is empty.
followed by the "empty path maps to default directory" clause -- that behavior is well understood and not relevant to the question here.
The discussion arose from some of our junior developers asking: given a Path instance p, how should they test for an empty path condition? Turns out the rest of their team (with more experience) had each been doing their own thing, and while all of their approaches "worked", they wanted to converge on the officially correct way; I believe there may have been a round of beers at stake.
Testing for consists solely of one name element is trivial (p.getNameCount() == 1). Testing for that is empty means obtaining that name element (p.getName(0) or p.getFileName()), which... is also a Path instance that needs to be tested for emptiness...
Calling p.toString() and then testing for isEmpty() felt distasteful, because the emptiness test is being done on a String representation of the path, not the path instance itself. This sparked some philosophical debate about the completeness of the Path API and the meaning of canonical representations. I think they were already two beers in by then.
One developer pointed to the Path#resolve(Path other) method's javadocs, which contain the note If other is an empty path then this method trivially returns this path. So his emptiness test uses an isolated Path instance, and tests for isolated.resolve(p).equals(isolated), which seemed suspiciously too clever and apparently led to raised voices.
Another developer admitted to testing whether p was an instance of sun.nio.fs.UnixPath and then abusing reflection to access its private isEmpty() method. I wasn't present to ask what he does for Windows platforms, and suspect this wouldn't work in Java 9+ anyway.
In the end, they said they grudgingly settled on p.toString().length() == 0 but nobody was happy about it. None of them like the idea that the Path class depends on an "emptiness" quality that they could only apparently measure using methods of the String class, either before construction or after conversion. Presumably this solution was good enough for them to figure out who bought the beers, anyway.
Anyhow, once I heard about it I had to admit I was at a loss as to the best practice. What do the experts do for this case? Convert to String and be done with it, or stay within the NIO.2 API and take advantage of the resolve behavior, or...? (If you live near our other team, they might buy you a beer.)
Ideally, toString() should not be used for overall comparisons. And while you could use the resolve method, you really shouldn’t. (I won’t even address the use of reflection for this.)
I believe you all are over-thinking this problem. Just write what you mean. If you want to test if a Path is equal to the empty path, then do exactly that:
if (somePath.equals(Paths.get("")))
I suppose you could store the empty path in a constant, but it’s so trivial that I wouldn’t bother. It might even make the code harder to read instead of making it easier.
If you don’t want to do that, then your first instinct was correct: test for the conditions described in the documentation.
if (somePath.getNameCount() == 1 &&
somePath.getFileName().toString().isEmpty())
Or:
if (somePath.getNameCount() == 1 && somePath.endsWith(""))
I would prefer using equals, because when someone else reads the code, they will see code that shows your intent: to check whether a path is equal to the empty path.
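For comparison, here is a small self-contained sketch of the two checks side by side. The class and method names (EmptyPathCheck, viaEquals, viaNameElements) are invented for this illustration only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative wrapper around the two tests discussed above.
final class EmptyPathCheck {
    // States the intent directly: is p equal to the empty path?
    static boolean viaEquals(Path p) {
        return p.equals(Paths.get(""));
    }

    // Follows the javadoc wording: exactly one name element, and it is empty.
    static boolean viaNameElements(Path p) {
        return p.getNameCount() == 1 && p.getFileName().toString().isEmpty();
    }

    public static void main(String[] args) {
        Path[] samples = { Paths.get(""), Paths.get("a"), Paths.get("a", "b") };
        for (Path p : samples) {
            System.out.println("'" + p + "' equals: " + viaEquals(p)
                    + ", name elements: " + viaNameElements(p));
        }
    }
}
```

Both methods should agree on every path created from the default file system; the equals version is simply shorter to read.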
I need to define another method in ITuple.class like
public Object getValue(int i);
but with Float
public Object getValue(float j);
How can I add it?
I'm new to Storm, so is there an existing method that does the same job? I searched and couldn't find one.
I think you are misunderstanding what getValue(int) does. Here is the description from the javadocs:
Object getValue(int i)
Gets the field at position i in the tuple. Returns object since tuples are dynamically typed.
As you can see, the int argument is the position in the tuple; i.e. the index. Tuple positions are inherently integers, so adding an alternative that takes a floating point argument doesn't make any sense.
Supposing (hypothetically) that it did make sense to add a getValue(float) overload, then the way to do it would be to:
download the source code (".java" files),
modify the interface in the ITuple.java source file
modify the source files for classes that implement the interface
build them all to produce new JAR files
use those JAR files in your application
... and repeat this patching procedure every time you upgraded your Apache Storm release. That is probably a bad idea, even if what you were doing made sense.
But attempting to modify ".class" files directly is an even worse idea.
.class files are compiled Java files. I assume you found this file in a library.
I'm afraid this code isn't yours and you cannot edit it.
What you can do, however, is extend the ITuple interface in a MyITuple interface and add whatever you want to it.
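A rough sketch of that idea follows. To stay self-contained it does not use Storm at all: TupleLike is a stand-in for ITuple reduced to the one method under discussion, and FloatAwareTuple, ListBackedTuple and getFloatValue are hypothetical names. Note that Storm would still hand your code its own tuple implementation, so an extension like this only helps for objects you construct yourself.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for Storm's ITuple (assumption: only the method discussed here).
interface TupleLike {
    Object getValue(int i);
}

// Hypothetical extension that adds a typed accessor without editing the original.
interface FloatAwareTuple extends TupleLike {
    Float getFloatValue(int i);
}

// Simple implementation backed by a fixed list of values.
final class ListBackedTuple implements FloatAwareTuple {
    private final List<Object> values;

    ListBackedTuple(Object... values) {
        this.values = Arrays.asList(values);
    }

    @Override
    public Object getValue(int i) {
        return values.get(i); // the int is the position, as in the javadoc above
    }

    @Override
    public Float getFloatValue(int i) {
        // Converts any numeric field at position i to a Float.
        return ((Number) getValue(i)).floatValue();
    }
}
```

The index stays an int here too; only the return type changes, which is probably what the question was really after.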
So, given that Java has little to no support for unsigned types, I'm right now writing a small API to handle these (for now, I have UnsignedByte and UnsignedInt). The algorithm is simple: store each of them in the next wider type (byte->short, int->long), extend the Number class and implement some calculation and representation utility methods.
The problem is: it is actually very verbose - and boring - to have to, every time, code things like:
UnsignedByte value = new UnsignedByte(15);
UnsignedByte convert = new UnsignedByte(someIntValue);
I was wondering: is there any way to implement, on Eclipse, something like a "file pre-processor", in a way that it will automatically replace some pre-defined strings with other pre-defined strings before compiling the files?
For example: replace U(x) with new UnsignedByte(x), so it would be possible to use:
UnsignedByte value = U(15);
UnsignedByte convert = U(someIntValue);
Yes, I could create a method called U(...) and use import static, but even then, it would be so much trouble doing it for every class that I would use my unsigned types.
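For reference, the static-factory variant mentioned above might look roughly like this. It is only a guess at the shape of the UnsignedByte class described in the question: the range check, the short backing field and the factory name U are assumptions, not the asker's actual code.

```java
// Hypothetical minimal version of the UnsignedByte type described above.
final class UnsignedByte extends Number {
    private final short value; // 0..255, stored in the next wider type

    UnsignedByte(int v) {
        if (v < 0 || v > 255) {
            throw new IllegalArgumentException("out of range: " + v);
        }
        this.value = (short) v;
    }

    // Short factory method. With a static import of this method
    // (from whatever package the class lives in), callers can write U(15).
    static UnsignedByte U(int v) {
        return new UnsignedByte(v);
    }

    @Override public int intValue() { return value; }
    @Override public long longValue() { return value; }
    @Override public float floatValue() { return value; }
    @Override public double doubleValue() { return value; }
    @Override public String toString() { return Integer.toString(value); }
}
```

Each file that wants the short form still needs one static import line, which is exactly the per-class overhead the question is trying to avoid.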
I could write a simple Java program that would replace these expressions in a file, but the problem is: How could I integrate that on Eclipse, in a way that it would call/use it every time a Java file is compiled?
I would recommend using Eclipse Templates for doing this instead. I know it's not exactly what you asked for, but it's very simple and can be achieved out of the box.
When you write sysout in Eclipse and press Ctrl+Space it gives you an option to replace that with System.out.println();
You can find more information in the following link
How to add shortcut keys for java code in eclipse
I can point you at how one project I know of does this, they have a set of Python scripts that generate a whole set of classes (java files) from a template base file. They run the script manually, as opposed to part of the build.
Have a look here for the specific example. In this code they have a class for operating on double, but from this class they want to generate code to operate on float, int, etc all in the same way.
There is, of course, a big debate about whether generated code should be checked in or not to source repository. I leave that issue aside and hope that the above example is good to get you going.
I've got a bit of an interesting challenge
To the point:
I want to allow a user to enter an expression in a text field, and have that string treated as a python expression. There are a number of local variables I would like to make available to this expression.
I do have a solution though it will be cumbersome to implement. I was thinking of keeping a Python class source file, with a function that has a single %s in it. When the user enters his expression, we simply do a string format, and then call Jython's interpreter, to spit out something we can execute. There would have to be a number of variable declaration statements in front of that expression to make sure the variables we want to expose to the user for his expression.
So the user would be presented with a text field, he would enter
x1 + (3.5*x2) ** x3
and we would do our interpreting process to come up with an open delegate object. We then punch the values into this object from a map, and call execute, to get the result of the expression.
Any objections to using Jython, or should I be doing something other than modifying source code? I would like to think that some kind of mutable object akin to C#'s Expression object, where we could do something like
PythonExpression expr = new PythonExpression(userSuppliedText)
expr.setDefaultNamespace();
expr.loadLibraries("numPy", /*other libraries?*/);
//comes from somewhere else in the flow, but effectively we get
Map<String, Double> symbolValuesByName = new HashMap<>(){{
put("x1", 3.0);
put("x2", 20.0);
put("x3", 2.0);
}};
expr.loadSymbols(symbolValuesByName);
Runnable exprDelegate = expr.compile();
//sometime later
exprDelegate.run();
but, I'm hoping for a lot, and it looks like Jython is as good as it gets. Still, modifying source files and then passing them to an interpreter seems really heavy-handed.
Does that sound like a good approach? Do you guys have any other libraries you'd suggest?
Update: NumPy does not work with Jython
I should've discovered this one on my own.
So now my question shifts: Is there any way that from a single JVM process instance (meaning, without ever having to fork) I can compile and run some Python code?
If you simply want to parse the expressions, you ought to be able to put something together with a Java parser generator.
If you want to parse, error check and evaluate the expressions, then you will need a substantial subset of the functionality of a full Python interpreter.
I'm not aware of a subset implementation.
If such a subset implementation exists, it is unclear that it would be any easier to embed / call than to use a full Python interpreter ... like Jython.
If the powers that be dictate that "thou shalt use python", then they need to pay for the extra work it is going to cause you ... and the next guy who is going to need to maintain a hybrid system across changes in requirements, and updates to the Java and Python / Jython ecosystems. Factor it into the project estimates.
The other approach would be to parse the full Python expression grammar, but limit what your evaluator can handle ... based on what is actually required, and what is implementable in your project's time-frame. Limit the types supported and the operations on the types. Limit the built-in functions supported. Etcetera.
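As an illustration of how small such a restricted evaluator can be, here is a hand-written recursive-descent sketch in plain Java. It is not Jython and not a Python parser: it only handles doubles, named variables, parentheses and the operators + - * / **, with ** binding the way Python's does. The class name MiniExpr and its eval method are invented for this example.

```java
import java.util.Map;

// Hypothetical restricted evaluator: doubles, variables, + - * / ** and parentheses.
final class MiniExpr {
    private final String src;
    private final Map<String, Double> vars;
    private int pos;

    private MiniExpr(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    static double eval(String text, Map<String, Double> vars) {
        MiniExpr p = new MiniExpr(text, vars);
        double v = p.sum();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    // Consumes op if it comes next; "*" must not swallow half of "**".
    private boolean accept(String op) {
        skipSpaces();
        if (!src.startsWith(op, pos)) return false;
        if (op.equals("*") && src.startsWith("**", pos)) return false;
        pos += op.length();
        return true;
    }

    private double sum() {
        double v = product();
        while (true) {
            if (accept("+")) v += product();
            else if (accept("-")) v -= product();
            else return v;
        }
    }

    private double product() {
        double v = unary();
        while (true) {
            if (accept("*")) v *= unary();
            else if (accept("/")) v /= unary();
            else return v;
        }
    }

    private double unary() {
        if (accept("-")) return -unary();
        return power();
    }

    // As in Python, ** is right-associative and binds tighter than a leading minus.
    private double power() {
        double base = atom();
        if (accept("**")) return Math.pow(base, unary());
        return base;
    }

    private double atom() {
        if (accept("(")) {
            double v = sum();
            if (!accept(")")) throw new IllegalArgumentException("missing ) at " + pos);
            return v;
        }
        int start = pos;
        while (pos < src.length() && (Character.isLetterOrDigit(src.charAt(pos))
                || src.charAt(pos) == '.' || src.charAt(pos) == '_')) pos++;
        String token = src.substring(start, pos);
        if (token.isEmpty()) throw new IllegalArgumentException("operand expected at " + pos);
        if (Character.isDigit(token.charAt(0)) || token.charAt(0) == '.') {
            return Double.parseDouble(token); // numeric literal
        }
        Double value = vars.get(token);       // otherwise a variable name
        if (value == null) throw new IllegalArgumentException("unknown variable " + token);
        return value;
    }
}
```

With a map holding x1 = 3.0, x2 = 20.0 and x3 = 2.0, calling MiniExpr.eval("x1 + (3.5*x2) ** x3", map) evaluates the question's sample expression. Function calls, comparisons and non-numeric types are deliberately left out; each would be a further grammar rule to add only if the requirements call for it.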
Assuming that you go down the Java calling Jython route, there is a lot of material on how to implement it here: http://www.jython.org/jythonbook/en/1.0/JythonAndJavaIntegration.html
I need to change the signature of a method used all over the codebase.
Specifically, the method void log(String) will take two additional arguments (Class c, String methodName), which need to be provided by the caller, depending on the method where it is called. I can't simply pass null or similar.
To give an idea of the scope, Eclipse found 7000 references to that method, so if I change it the whole project will go down. It will take weeks for me to fix it manually.
As far as I can tell, Eclipse's refactoring plugin is not up to the task, but I really want to automate it.
So, how can I get the job done?
Great, I can copy a previous answer of mine and I just need to edit a tiny little bit:
I think what you need to do is use a source code parser like javaparser to do this.
For every java source file, parse it to a CompilationUnit, create a Visitor, probably using ModifierVisitor as base class, and override (at least) visit(MethodCallExpr, arg). Then write the changed CompilationUnit to a new File and do a diff afterwards.
I would advise against changing the original source file, but creating a shadow file tree may be a good idea (e.g. old file: src/main/java/com/mycompany/MyClass.java, new file: src/main/refactored/com/mycompany/MyClass.java; that way you can diff the entire directories).
Eclipse is able to do that using Refactor -> Change Method signature and provide default values for the new parameters.
For the class parameter the default value should be this.getClass(), but you are right in your comment: I don't know how to handle the method name parameter.
IntelliJ IDEA shouldn't have any trouble with this.
I'm not a Java expert, but something like this could work. It's not a perfect solution (it may even be a very bad solution), but it could get you started:
Change the method signature with IntelliJ's refactoring tools, and specify default values for the 2 new parameters:
c: this.getClass()
methodName: Thread.currentThread().getStackTrace()[1].getMethodName()
or better yet, simply specify null as the default values.
I think that there are several steps to dealing with this, as it is not just a technical issue but a 'situation':
Decline to do it in short order due to the risk.
Point out the issues caused by not using standard frameworks but reinventing the wheel (as Paul says).
Insist on using Log4j or equivalent if making the change.
Use Eclipse refactoring in sensible chunks to make the changes and deal with the varying defaults.
I have used Eclipse refactoring on quite large changes for fixing old smelly code - nowadays it is fairly robust.
Maybe I'm being naive, but why can't you just overload the method name?
void thing(paramA) {
thing(paramA, THE_DEFAULT_B, THE_DEFAULT_C)
}
void thing(paramA, paramB, paramC) {
// new method
}
Do you really need to change the calling code and the method signature? What I'm getting at is it looks like the added parameters are meant to give you the calling class and method to add to your log data. If the only requirement is just adding the calling class/method to the log data then Thread.currentThread().getStackTrace() should work. Once you have the StackTraceElement[] you can get the class name and method name for the caller.
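A minimal sketch of that idea is shown below. CallerAwareLog, log and demo are illustrative names, and the code searches for its own frame instead of assuming a fixed index, since the exact number of leading frames returned by getStackTrace() can differ between JVMs.

```java
// Illustrative logger that works out its caller from the stack trace.
final class CallerAwareLog {
    // Builds "ClassName.methodName: text" for whoever called log(...).
    static String log(String text) {
        StackTraceElement[] stack = Thread.currentThread().getStackTrace();
        StackTraceElement caller = null;
        // Find this method's own frame; the frame right after it is the caller.
        for (int i = 0; i < stack.length - 1; i++) {
            if (stack[i].getClassName().equals(CallerAwareLog.class.getName())
                    && stack[i].getMethodName().equals("log")) {
                caller = stack[i + 1];
                break;
            }
        }
        String origin = (caller == null)
                ? "unknown"
                : caller.getClassName() + "." + caller.getMethodName();
        String line = origin + ": " + text;
        System.out.println(line);
        return line;
    }

    // Example caller, so the output shows "demo" as the method name.
    static String demo() {
        return log("hello");
    }
}
```

The existing log(String) signature stays untouched, so none of the 7000 call sites need to change. The price is a stack walk on every log call, which may matter on hot paths.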
If the lines you need replaced fall into a small number of categories, then what you need is Perl:
find -name '*.java' | xargs perl -pi -e 's/log\(([^,)]*?)\)/log(\1, "foo", "bar")/g'
I'm guessing that it wouldn't be too hard to hack together a script which would put the classname (derived from the filename) in as the second argument. Getting the method name in as the third argument is left as an exercise to the reader.
Try refactoring with IntelliJ. It has a feature called SSR (Structural Search and Replace), which lets you refer to classes, method names, etc. for context. (seanizer's answer is more promising; I upvoted it.)
I agree with Seanizer's answer that you want a tool that can parse Java. That's necessary but not sufficient; what you really want is a tool that can carry out a reliable mass-change.
To do this, you want a tool that can parse Java, can pattern match against the parsed code, install the replacement call, and spit out the answer without destroying the rest of the source code.
Our DMS Software Reengineering Toolkit can do all of this for a variety of languages, including Java. It parses complete java systems of source, builds abstract syntax trees (for the entire set of code).
DMS can apply pattern-directed, source-to-source transformations to achieve the desired change.
To achieve the OP's effect, he would apply the following program transformation:
rule replace_legacy_log(s:STRING): expression -> expression
" log(\s) " -> " log( \s, \class\(\), \method\(\) ) "
What this rule says is, find a call to log which has a single string argument, and replace it with a call to log with two more arguments determined by auxiliary functions class and method.
These functions determine the containing method name and containing class name for the AST node root where the rule finds a match.
The rule is written in "source form", but actually matches against the AST and replaces found ASTs with the modified AST.
To get back the modified source, you ask DMS to simply prettyprint (to make a nice layout) or fidelity print (if you want the layout of the old code preserved). DMS preserves comments, number radixes, etc.
If the existing application has more than one definition of the "log" function, you'll need to add a qualifier:
... if IsDesiredLog().
where IsDesiredLog uses DMS's symbol table and inheritance information to determine if the specific log refers to the definition of interest.
In fact, your problem is not finding a click'n'play engine that will allow you to replace all occurrences of
log("some weird message");
by
log(this.getClass(), new Exception().getStackTrace()[1].getMethodName());
Such a replacement is unlikely to work in every case (static methods, for example).
I would suggest taking a look at Spoon. This tool parses and transforms source code, letting you carry out the change in a slow (it is, obviously, code-based) but controlled way.
However, you could also consider replacing your current method with one that explores the stack trace to get the information or, even better, using log4j internally with a log formatter that displays the correct information.
I would search and replace log( with log(#class, #methodname,
Then write a little script in any language (even java) to find the class name and the method names and to replace the #class and #method tokens...
Good luck
If the class and method name are required for "where did this log come from?" type data, then another option is to print out a stack trace in your log method. E.g.
public void log(String text)
{
StringWriter sw = new StringWriter();
PrintWriter pw = new PrintWriter(sw, true);
new Throwable().printStackTrace(pw);
pw.flush();
sw.flush();
String stackTraceAsLog = sw.toString();
//do something with text and stackTraceAsLog
}