I have got a third party dll file with header like below.
class MyClass
{
public:
MyClass() {};
virtual ~MyClass() {};
int Double(int x);
};
I cannot connect with the dll file using JNA (as I cannot extern here). So I have created a wrapper dll. From that wrapper dll , I am trying to connect with the third party dll. wrapper dll header file (Sample.h)
extern "C" {
int __declspec(dllexport) Double(int x);
}
wrapper dll's cpp file
typedef int (__cdecl *MYPROC)(int);
int func(int x)
{
HINSTANCE hinstLib;
MYPROC ProcAdd;
BOOL fFreeResult, fRunTimeLinkSuccess = FALSE;
int result;
printf("inside the function");
// Get a handle to the DLL module.
hinstLib = LoadLibrary(TEXT("example_dll.dll"));
// If the handle is valid, try to get the function address.
if (hinstLib != NULL)
{
ProcAdd = (MYPROC) GetProcAddress(hinstLib, "Double");
// If the function address is valid, call the function.
if (NULL != ProcAdd)
{
fRunTimeLinkSuccess = TRUE;
result = ProcAdd(x);
printf("Result: %d", result);
} else {
printf("function not found");
}
// Free the DLL module.
fFreeResult = FreeLibrary(hinstLib);
}
// If unable to call the DLL function, use an alternative.
if (! fRunTimeLinkSuccess)
printf("Message printed from executable\n");
return result;
}
Using a C program(compiled by g++) I can connect with both the dll and get the data.
But from a Java program(using JNA), I can connect with the first dll but first dll cannot connect with the second dll.
Could anybody please enlighten me how to fix the problem? Of if there are any other way to fix it?
Related
I'm trying to write an Android app which calls some Rust code via swig. I have written a simple test for the linkages. A version that has an ordinary Java main program works fine. My Android version fails with an error:
java.lang.UnsatisfiedLinkError: dlopen failed: cannot locate symbol
"log" referenced by
"/data/app/hk.jennyemily.work.try3-1/lib/x86_64/libsimpleswig.so"...
at java.lang.Runtime.loadLibrary0(Runtime.java:989)
My Rust code does not import "log".
So what do I do to fix this?
The Android Java basically only does the following so far (I have not yet coded the calls to the library):
static {
System.loadLibrary("simpleswig");
}
Here is my Rust code:
use std::ffi::CString;
use std::os::raw::c_char;
pub struct SimpleData {
pub str: CString,
}
#[no_mangle]
pub extern "C" fn newSimpleData() -> *mut SimpleData {
Box::into_raw(Box::new(SimpleData {
str: CString::new("hello").unwrap(),
}))
}
#[no_mangle]
pub extern "C" fn say(ptr: *mut SimpleData) -> *const c_char {
let sd = unsafe {
assert!(!ptr.is_null());
&mut *ptr
};
sd.str.as_ptr()
}
#[no_mangle]
pub extern "C" fn release(ptr: *mut SimpleData) {
if ptr.is_null() {
return;
}
unsafe {
let _ = Box::from_raw(ptr);
}
}
There is also the basic Swig code for wrapping the Rust code.
The Swig and Rust code is linked into a dynamic library (.so).
Edit: the actual C compile and link line is:
$CC -shared $SIMPDIR/simple_wrap.c $SIMPDIR/libsimple.a -o $SIMPDIR/libsimpleswig.so -I"${JAVALOC}/include" -I"${JAVALOC}/include/linux"
Here is the stand-alone Java main program, which works without any problems with the same Rust and Swig code.
public class test {
static {
System.loadLibrary("simpleswig");
}
public static void main(String argv[]) {
System.out.println("starting...");
simpleswig.SWIGTYPE_p_SimpleData sd = simpleswig.simpleswig.newSimpleData();
System.out.println( simpleswig.simpleswig.say(sd));
System.out.println("releasing...");
simpleswig.simpleswig.release(sd);
System.out.println("done.");
}
}
I'm trying to call some java classes from c++ code by using the JNI. Today I experienced a very strange behaviour in my programm. I suspect that the c++ code is not wait until the java side finished it's work and I don't know why.
The C++ code (in a shared object library) is running in it's own C++ thread. It is using a existing JavaVM of a java app that is already up and running. The references to the VM and the ClassLoader where fetched while the java application was loading the shared object library in JNI_Onload. Here I call the java method of a java object I created with JNI in the C++ thread:
env->CallVoidMethod(javaClassObject, javaReceiveMethod, intParam, byteParam, objectParam);
uint32_t applicationSideResponseCode = getResponseCode(objectClass, objectParam, env);
resp.setResponseCode(applicationSideResponseCode);
std::string applicationData = getApplicationData(serviceResultClass, serviceResultObject, env);
resp.setData(applicationData);
The Java javaReceiveMethod is accessing the database and fetching some applicationData which is stored in the objectParam. Unfortunately the C++ code fetched the applicationData before the java class completed it's work. applicationData is null and the JNI crashes. Why? I can't find any documentation of Oracle stating that CallVoidMethod is executed asynchronous?
Edit
I verified that no exception has occured on the java method. Everything seems fine, except that Java is still busy while C++ is trying to access the data.
Edit
I can confirm that if I debug the java application two threads are shown. On main thread, that is executing javaReceiveMethod and one thread that is fetching the applicationData. How can I solve this problem? Idle in the second thread until the data is available?
Edit
In my C++ code I'm creating a new object of the java class that I want to call:
jmethodID javaClassConstructor= env->GetMethodID(javaClass, "<init>", "()V");
jobject serviceObject = env->NewObject(javaClass, serviceConstructor);
jobject javaClassObject = env->NewObject(javaClass, javaClassConstructor);
After that I call the code as shown above. After more debugging I can say that the method is called in a thread named Thread-2 (I don't know if that is the c++ thread or a new one from JNI). It is definitely not the java main thread. However the work of the method is interrupted. That means If I debug the code, I can see that the data would be set soon but in the next debug step the getApplicationData method is executed (which can only occur if c++ is calling it).
Edit
The Java method I call:
public int receive(int methodId, byte[] data, ServiceResult result){
log.info("java enter methodID = " + methodId + ", data= " + data);
long responseCode = SEC_ERR_CODE_SUCCESS;
JavaMessageProto msg;
try {
msg = JavaMessageProto (data);
log.info("principal: " + msg.getPrincipal());
JavaMessage message = new JavaMessage (msg);
if(methodId == GET_LISTS){
//this is shown in console
System.out.println("get lists");
responseCode = getLists(message);
//this point is not reached
log.info("leave");
}
//[... different method calls here...]
if(responseCode != SEC_ERR_CODE_METHOD_NOT_IMPLEMENTED){
//ToDoListMessageProto response = message.getProtoBuf();
JavaMessageProto response = JavaMessageProto.newBuilder()
.setToken(message.getToken())
.setPrincipal(message.getPrincipal()).build();
byte[] res = response.toByteArray();
result.setApplicationData(response.toByteArray());
}
else{
result.setApplicationData("");
}
} catch (InvalidProtocolBufferException e) {
responseCode = SEC_ERR_CODE_DATA_CORRUPTED;
log.severe("Error: Could not parse Client message." + e.getMessage());
}
result.setResponseCode((int)responseCode);
return 0;
}
The second method is
public long getLists(JavaMessage message) {
log.info("getLists enter");
String principal = message.getPrincipal();
String token = message.getToken();
if(principal == null || principal.isEmpty()){
return SEC_ERR_CODE_PRINCIPAL_EMPTY;
}
if(token == null || token.isEmpty()){
return SEC_ERR_CODE_NO_AUTHENTICATION;
}
//get user object for authorization
SubjectManager manager = new SubjectManager();
Subject user = manager.getSubject();
user.setPrincipal(principal);
long result = user.isAuthenticated(token);
if(result != SEC_ERR_CODE_SUCCESS){
return result;
}
try {
//fetch all user list names and ids
ToDoListDAO db = new ToDoListDAO();
Connection conn = db.getConnection();
log.info( principal + " is authenticated");
result = db.getLists(conn, message);
//this is printed
log.info( principal + " is authenticated");
conn.close(); //no exception here
message.addId("testentry");
//this not
log.info("Fetched lists finished for " + principal);
} catch (SQLException e) {
log.severe("SQLException:" + e.getMessage());
result = SEC_ERR_CODE_DATABASE_ERROR;
}
return result;
}
CallVoidMethod is executed synchronously.
Maybe you have an excepion on c++ side?, do you use c++ jni exception checks?:
env->CallVoidMethod(javaClassObject, javaReceiveMethod, intParam, byteParam, objectParam);
if(env->ExceptionOccurred()) {
// Print exception caused by CallVoidMethod
env->ExceptionDescribe();
env->ExceptionClear();
}
The C++ code (in a shared object library) is running in it's own C++ thread. It is using a existing JavaVM of a java app that is already up and running.
its not clear whether you have attached current thread to virtual machine. Make sure env is comming from AttachCurrentThread call. You will find example here: How to obtain JNI interface pointer (JNIEnv *) for asynchronous calls.
I have build an application connecting R and java using the Rserve package.
In that, i am getting the error as "evaluation successful but object is too big to transport". i have tried increasing the send buffer size value in Rconnection class also. but that doesn't seem to work.
The object size which is being transported is 4 MB
here is the code from the R connection file
public void setSendBufferSize(long sbs) throws RserveException {
if (!connected || rt == null) {
throw new RserveException(this, "Not connected");
}
try {
RPacket rp = rt.request(RTalk.CMD_setBufferSize, (int) sbs);
System.out.println("rp is send buffer "+rp);
if (rp != null && rp.isOk()) {
System.out.println("in if " + rp);
return;
}
} catch (Exception e) {
e.printStackTrace();
LogOut.log.error("Exception caught" + e);
}
//throw new RserveException(this,"setSendBufferSize failed",rp);
}
The full java class is available here :Rconnection.java
Instead of RServe, you can use JRI, that is shipped with rJava package.
In my opinion JRI is better than RServe, because instead of creating a separate process it uses native calls to integrate Java and R.
With JRI you don't have to worry about ports, connections, watchdogs, etc... The calls to R are done using an operating system library (libjri).
The methods are pretty similar to RServe, and you can still use REXP objects.
Here is an example:
public void testMeanFunction() {
// just making sure we have the right version of everything
if (!Rengine.versionCheck()) {
System.err.println("** Version mismatch - Java files don't match library version.");
fail(String.format("Invalid versions. Rengine must have the same version of native library. Rengine version: %d. RNI library version: %d", Rengine.getVersion(), Rengine.rniGetVersion()));
}
// Enables debug traces
Rengine.DEBUG = 1;
System.out.println("Creating Rengine (with arguments)");
// 1) we pass the arguments from the command line
// 2) we won't use the main loop at first, we'll start it later
// (that's the "false" as second argument)
// 3) no callback class will be used
engine = REngine.engineForClass("org.rosuda.REngine.JRI.JRIEngine", new String[] { "--no-save" }, null, false);
System.out.println("Rengine created...");
engine.parseAndEval("rVector=c(1,2,3,4,5)");
REXP result = engine.parseAndEval("meanVal=mean(rVector)");
// generic vectors are RVector to accomodate names
assertThat(result.asDouble()).isEqualTo(3.0);
}
I have a demo project that exposes a REST API and calls R functions using this package.
Take a look at: https://github.com/jfcorugedo/RJavaServer
I want to load my own native libraries in my java application. Those native libraries depend upon third-party libraries (which may or may not be present when my application is installed on the client computer).
Inside my java application, I ask the user to specify the location of dependent libs. Once I have this information, I am using it to update the "LD_LIBRARY_PATH" environment variable using JNI code. The following is the code snippet that I am using to change the "LD_LIBRARY_PATH" environment variable.
Java code
public static final int setEnv(String key, String value) {
if (key == null) {
throw new NullPointerException("key cannot be null");
}
if (value == null) {
throw new NullPointerException("value cannot be null");
}
return nativeSetEnv(key, value);
}
public static final native int nativeSetEnv(String key, String value);
Jni code (C)
JNIEXPORT jint JNICALL Java_Test_nativeSetEnv(JNIEnv *env, jclass cls, jstring key, jstring value) {
const char *nativeKey = NULL;
const char *nativeValue = NULL;
nativeKey = (*env)->GetStringUTFChars(env, key, NULL);
nativeValue = (*env)->GetStringUTFChars(env, value, NULL);
int result = setenv(nativeKey, nativeValue, 1);
return (jint) result;
}
I also have corresponding native methods to fetch the environment variable.
I can successfully update the LD_LIBRARY_PATH (this assertion is based on the output of C routine getenv().
I am still not able to load my native library. The dependent third-party libraries are still not detected.
Any help/pointers are appreciated. I am using Linux 64 bit.
Edit:
I wrote a SSCE (in C) to test if dynamic loader is working. Here is the SSCE
#include
#include
#include
#include
int main(int argc, const char* const argv[]) {
const char* const dependentLibPath = "...:";
const char* const sharedLibrary = "...";
char *newLibPath = NULL;
char *originalLibPath = NULL;
int l1, l2, result;
void* handle = NULL;
originalLibPath = getenv("LD_LIBRARY_PATH");
fprintf(stdout,"\nOriginal library path =%s\n",originalLibPath);
l1 = strlen(originalLibPath);
l2 = strlen(dependentLibPath);
newLibPath = (char *)malloc((l1+l2)*sizeof(char));
strcpy(newLibPath,dependentLibPath);
strcat(newLibPath,originalLibPath);
fprintf(stdout,"\nNew library path =%s\n",newLibPath);
result = setenv("LD_LIBRARY_PATH", newLibPath, 1);
if(result!=0) {
fprintf(stderr,"\nEnvironment could not be updated\n");
exit(1);
}
newLibPath = getenv("LD_LIBRARY_PATH");
fprintf(stdout,"\nNew library path from the env =%s\n",newLibPath);
handle = dlopen(sharedLibrary, RTLD_NOW);
if(handle==NULL) {
fprintf(stderr,"\nCould not load the shared library: %s\n",dlerror());
exit(1);
}
fprintf(stdout,"\n The shared library was successfully loaded.\n");
result = dlclose(handle);
if(result!=0) {
fprintf(stderr,"\nCould not unload the shared library: %s\n",dlerror());
exit(1);
}
return 0;
}
The C code also does not work. Apparently, the dynamic loader is not rereading the LD_LIBRARY_PATH environment variable. I need to figure out how to force the dynamic loader to re-read the LD_LIBRARY_PATH environment variable.
See the accepted answer here:
Changing LD_LIBRARY_PATH at runtime for ctypes
In other words, what you're trying to do isn't possible. You'll need to launch a new process with an updated LD_LIBRARY_PATH (e.g., use ProcessBuilder and update environment() to concatenate the necessary directory)
This is a hack used to manipulate JVM's library path programmatically. NOTE: it relies on internals of ClassLoader implementation so it might not work on all JVMs/versions.
String currentPath = System.getProperty("java.library.path");
System.setProperty( "java.library.path", currentPath + ":/path/to/my/libs" );
// this forces JVM to reload "java.library.path" property
Field fieldSysPath = ClassLoader.class.getDeclaredField( "sys_paths" );
fieldSysPath.setAccessible( true );
fieldSysPath.set( null, null );
This code uses UNIX-style file path separators ('/') and library path separator (':'). For cross-platform way of doing this use System Properties to get system-specific separators: http://download.oracle.com/javase/tutorial/essential/environment/sysprop.html
I have successfully implemented something similar for CollabNet Subversion Edge, which depends on the SIGAR libraries across ALL Operating Systems (we support Windows/Linux/Sparc both 32 bits and 64 bits)...
Subversion Edge is a web application that helps one managing Subversion repositories through a web console and uses SIGAR to the SIGAR libraries helps us provide users data values directly from the OS... You need to update the value of the property "java.library.path" at runtime. (https://ctf.open.collab.net/integration/viewvc/viewvc.cgi/trunk/console/grails-app/services/com/collabnet/svnedge/console/OperatingSystemService.groovy?revision=1890&root=svnedge&system=exsy1005&view=markup Note that the URL is a Groovy code, but I have modified it to a Java here)...
The following example is the implementation in URL above... (On Windows, your user will be required to restart the machine if he/she has downloaded the libraries after or downloaded them using your application)... The "java.library.path" will update the user's path "usr_paths" instead of System path "sys_paths" (permissions exception might be raised depending on the OS when using the latter).
133/**
134 * Updates the java.library.path at run-time.
135 * #param libraryDirPath
136 */
137 public void addDirToJavaLibraryPathAtRuntime(String libraryDirPath)
138 throws Exception {
139 try {
140 Field field = ClassLoader.class.getDeclaredField("usr_paths");
141 field.setAccessible(true);
142 String[] paths = (String[])field.get(null);
143 for (int i = 0; i < paths.length; i++) {
144 if (libraryDirPath.equals(paths[i])) {
145 return;
146 }
147 }
148 String[] tmp = new String[paths.length+1];
149 System.arraycopy(paths,0,tmp,0,paths.length);
150 tmp[paths.length] = libraryDirPath;
151 field.set(null,tmp);
152 String javaLib = "java.library.path";
153 System.setProperty(javaLib, System.getProperty(javaLib) +
154 File.pathSeparator + libraryDirPath);
155
156 } catch (IllegalAccessException e) {
157 throw new IOException("Failed to get permissions to set " +
158 "library path to " + libraryDirPath);
159 } catch (NoSuchFieldException e) {
160 throw new IOException("Failed to get field handle to set " +
161 "library path to " + libraryDirPath);
162 }
163 }
The Bootstrap services (Groovy on Grails application) class of the console runs a service and executes it with the full path to the library directory... UNiX-based servers do not need to restart the server to get the libraries, but Windows boxes do need a server restart after the installation. In your case, you would be calling this as follows:
String appHomePath = "/YOUR/PATH/HERE/TO/YOUR/LIBRARY/DIRECTORY";
String yourLib = new File(appHomePath, "SUBDIRECTORY/").getCanonicalPath();
124 try {
125 addDirToJavaLibraryPathAtRuntime(yourLib);
126 } catch (Exception e) {
127 log.error("Error adding the MY Libraries at " + yourLib + " " +
128 "java.library.path: " + e.message);
129 }
For each OS you ship your application, just make sure to provide a matching version of the libraries for the specific platform (32bit-Linux, 64bit-Windows, etc...).
I am trying to process files one at a time that are stored over a network. Reading the files is fast due to buffering is not the issue. The problem I have is just listing the directories in a folder. I have at least 10k files per folder over many folders.
Performance is super slow since File.list() returns an array instead of an iterable. Java goes off and collects all the names in a folder and packs it into an array before returning.
The bug entry for this is http://bugs.sun.com/view_bug.do;jsessionid=db7fcf25bcce13541c4289edeb4?bug_id=4285834 and doesn't have a work around. They just say this has been fixed for JDK7.
A few questions:
Does anybody have a workaround to this performance bottleneck?
Am I trying to achieve the impossible? Is performance still going to be poor even if it just iterates over the directories?
Could I use the beta JDK7 builds that have this functionality without having to build my entire project on it?
Although it's not pretty, I solved this kind of problem once by piping the output of dir/ls to a file before starting my app, and passing in the filename.
If you needed to do it within the app, you could just use system.exec(), but it would create some nastiness.
You asked. The first form is going to be blazingly fast, the second should be pretty fast as well.
Be sure to do the one item per line (bare, no decoration, no graphics), full path and recurse options of your selected command.
EDIT:
30 minutes just to get a directory listing, wow.
It just struck me that if you use exec(), you can get it's stdout redirected into a pipe instead of writing it to a file.
If you did that, you should start getting the files immediately and be able to begin processing before the command has completed.
The interaction may actually slow things down, but maybe not--you might give it a try.
Wow, I just went to find the syntax of the .exec command for you and came across this, possibly exactly what you want (it lists a directory using exec and "ls" and pipes the result into your program for processing): good link in wayback (Jörg provided in a comment to replace this one from sun that Oracle broke)
Anyway, the idea is straightforward but getting the code right is annoying. I'll go steal some codes from the internets and hack them up--brb
/**
* Note: Only use this as a last resort! It's specific to windows and even
* at that it's not a good solution, but it should be fast.
*
* to use it, extend FileProcessor and call processFiles("...") with a list
* of options if you want them like /s... I highly recommend /b
*
* override processFile and it will be called once for each line of output.
*/
import java.io.*;
public abstract class FileProcessor
{
public void processFiles(String dirOptions)
{
Process theProcess = null;
BufferedReader inStream = null;
// call the Hello class
try
{
theProcess = Runtime.getRuntime().exec("cmd /c dir " + dirOptions);
}
catch(IOException e)
{
System.err.println("Error on exec() method");
e.printStackTrace();
}
// read from the called program's standard output stream
try
{
inStream = new BufferedReader(
new InputStreamReader( theProcess.getInputStream() ));
processFile(inStream.readLine());
}
catch(IOException e)
{
System.err.println("Error on inStream.readLine()");
e.printStackTrace();
}
} // end method
/** Override this method--it will be called once for each file */
public abstract void processFile(String filename);
} // end class
And thank you code donor at IBM
How about using File.list(FilenameFilter filter) method and implementing FilenameFilter.accept(File dir, String name) to process each file and return false.
I ran this on Linux vm for directory with 10K+ files and it took <10 seconds.
import java.io.File;
import java.io.FilenameFilter;
public class Temp {
private static void processFile(File dir, String name) {
File file = new File(dir, name);
System.out.println("processing file " + file.getName());
}
private static void forEachFile(File dir) {
String [] ignore = dir.list(new FilenameFilter() {
public boolean accept(File dir, String name) {
processFile(dir, name);
return false;
}
});
}
public static void main(String[] args) {
long before, after;
File dot = new File(".");
before = System.currentTimeMillis();
forEachFile(dot);
after = System.currentTimeMillis();
System.out.println("after call, delta is " + (after - before));
}
}
An alternative is to have the files served over a different protocol. As I understand you're using SMB for that and java is just trying to list them as a regular file.
The problem here might not be java alone ( how does it behaves when you open that directory with Microsoft Explorer x:\shared ) In my experience it also take a considerably amount of time.
You can change the protocol to something like HTTP, only to fetch the file names. This way you can retrieve the list of files over http ( 10k lines should't be too much ) and let the server deal with file listing. This would be very fast, since it will run with local resources ( those in the server )
Then when you have the list, you can process them one by exactly the way you're doing right now.
The keypoint is to have an aid mechanism in the other side of the node.
Is this feasible?
Today:
File [] content = new File("X:\\remote\\dir").listFiles();
for ( File f : content ) {
process( f );
}
Proposed:
String [] content = fetchViaHttpTheListNameOf("x:\\remote\\dir");
for ( String fileName : content ) {
process( new File( fileName ) );
}
The http server could be a very small small and simple file.
If this is the way you have it right now, what you're doing is to fetch all the 10k files information to your client machine ( I don't know how much of that info ) when you only need the file name for later processing.
If the processing is very fast right now it may be slowed down a bit. This is because the information prefetched is no longer available.
Give it a try.
A non-portable solution would be to make native calls to the operating system and stream the results.
For Linux
You can look at something like readdir. You can walk the directory structure like a linked list and return results in batches or individually.
For Windows
In windows the behavior would be fairly similar using FindFirstFile and FindNextFile apis.
I doubt the problem is relate to the bug report you referenced.
The issue there is "only" memory usage, but not necessarily speed.
If you have enough memory the bug is not relevant for your problem.
You should measure whether your problem is memory related or not. Turn on your Garbage Collector log and use for example gcviewer to analyze your memory usage.
I suspect that it has to do with the SMB protocol causing the problem.
You can try to write a test in another language and see if it's faster, or you can try to get the list of filenames through some other method, such as described here in another post.
If you need to eventually process all files, then having Iterable over String[] won't give you any advantage, as you'll still have to go and fetch the whole list of files.
If you're on Java 1.5 or 1.6, shelling out "dir" commands and parsing the standard output stream on Windows is a perfectly acceptable approach. I've used this approach in the past for processing network drives and it has generally been a lot faster than waiting for the native java.io.File listFiles() method to return.
Of course, a JNI call should be faster and potentially safer than shelling out "dir" commands. The following JNI code can be used to retrieve a list of files/directories using the Windows API. This function can be easily refactored into a new class so the caller can retrieve file paths incrementally (i.e. get one path at a time). For example, you can refactor the code so that FindFirstFileW is called in a constructor and have a seperate method to call FindNextFileW.
JNIEXPORT jstring JNICALL Java_javaxt_io_File_GetFiles(JNIEnv *env, jclass, jstring directory)
{
HANDLE hFind;
try {
//Convert jstring to wstring
const jchar *_directory = env->GetStringChars(directory, 0);
jsize x = env->GetStringLength(directory);
wstring path; //L"C:\\temp\\*";
path.assign(_directory, _directory + x);
env->ReleaseStringChars(directory, _directory);
if (x<2){
jclass exceptionClass = env->FindClass("java/lang/Exception");
env->ThrowNew(exceptionClass, "Invalid path, less than 2 characters long.");
}
wstringstream ss;
BOOL bContinue = TRUE;
WIN32_FIND_DATAW data;
hFind = FindFirstFileW(path.c_str(), &data);
if (INVALID_HANDLE_VALUE == hFind){
jclass exceptionClass = env->FindClass("java/lang/Exception");
env->ThrowNew(exceptionClass, "FindFirstFileW returned invalid handle.");
}
//HANDLE hStdOut = GetStdHandle(STD_OUTPUT_HANDLE);
//DWORD dwBytesWritten;
// If we have no error, loop thru the files in this dir
while (hFind && bContinue){
/*
//Debug Print Statment. DO NOT DELETE! cout and wcout do not print unicode correctly.
WriteConsole(hStdOut, data.cFileName, (DWORD)_tcslen(data.cFileName), &dwBytesWritten, NULL);
WriteConsole(hStdOut, L"\n", 1, &dwBytesWritten, NULL);
*/
//Check if this entry is a directory
if (data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY){
// Make sure this dir is not . or ..
if (wstring(data.cFileName) != L"." &&
wstring(data.cFileName) != L"..")
{
ss << wstring(data.cFileName) << L"\\" << L"\n";
}
}
else{
ss << wstring(data.cFileName) << L"\n";
}
bContinue = FindNextFileW(hFind, &data);
}
FindClose(hFind); // Free the dir structure
wstring cstr = ss.str();
int len = cstr.size();
//WriteConsole(hStdOut, cstr.c_str(), len, &dwBytesWritten, NULL);
//WriteConsole(hStdOut, L"\n", 1, &dwBytesWritten, NULL);
jchar* raw = new jchar[len];
memcpy(raw, cstr.c_str(), len*sizeof(wchar_t));
jstring result = env->NewString(raw, len);
delete[] raw;
return result;
}
catch(...){
FindClose(hFind);
jclass exceptionClass = env->FindClass("java/lang/Exception");
env->ThrowNew(exceptionClass, "Exception occured.");
}
return NULL;
}
Credit:
https://sites.google.com/site/jozsefbekes/Home/windows-programming/miscellaneous-functions
Even with this approach, there are still efficiencies to be gained. If you serialize the path to a java.io.File, there is a huge performance hit - especially if the path represents a file on a network drive. I have no idea what Sun/Oracle is doing under the hood but if you need additional file attributes other than the file path (e.g. size, mod date, etc), I have found that the following JNI function is much faster than instantiating a java.io.File object on a network the path.
JNIEXPORT jlongArray JNICALL Java_javaxt_io_File_GetFileAttributesEx(JNIEnv *env, jclass, jstring filename)
{
//Convert jstring to wstring
const jchar *_filename = env->GetStringChars(filename, 0);
jsize len = env->GetStringLength(filename);
wstring path;
path.assign(_filename, _filename + len);
env->ReleaseStringChars(filename, _filename);
//Get attributes
WIN32_FILE_ATTRIBUTE_DATA fileAttrs;
BOOL result = GetFileAttributesExW(path.c_str(), GetFileExInfoStandard, &fileAttrs);
if (!result) {
jclass exceptionClass = env->FindClass("java/lang/Exception");
env->ThrowNew(exceptionClass, "Exception Occurred");
}
//Create an array to store the WIN32_FILE_ATTRIBUTE_DATA
jlong buffer[6];
buffer[0] = fileAttrs.dwFileAttributes;
buffer[1] = date2int(fileAttrs.ftCreationTime);
buffer[2] = date2int(fileAttrs.ftLastAccessTime);
buffer[3] = date2int(fileAttrs.ftLastWriteTime);
buffer[4] = fileAttrs.nFileSizeHigh;
buffer[5] = fileAttrs.nFileSizeLow;
jlongArray jLongArray = env->NewLongArray(6);
env->SetLongArrayRegion(jLongArray, 0, 6, buffer);
return jLongArray;
}
You can find a full working example of this JNI-based approach in the javaxt-core library. In my tests using Java 1.6.0_38 with a Windows host hitting a Windows share, I have found this JNI approach approximately 10x faster then calling java.io.File listFiles() or shelling out "dir" commands.
I wonder why there are 10k files in a directory. Some file systems do not work well with so many files. There are specifics limitations for file systems like max amount of files per directory and max amount of levels of subdirectory.
I solve a similar problem with an iterator solution.
I needed to walk across huge directorys and several levels of directory tree recursively.
I try FileUtils.iterateFiles() of Apache commons io. But it implement the iterator by adding all the files in a List and then returning List.iterator(). It's very bad for memory.
So I prefer to write something like this:
private static class SequentialIterator implements Iterator<File> {
private DirectoryStack dir = null;
private File current = null;
private long limit;
private FileFilter filter = null;
public SequentialIterator(String path, long limit, FileFilter ff) {
current = new File(path);
this.limit = limit;
filter = ff;
dir = DirectoryStack.getNewStack(current);
}
public boolean hasNext() {
while(walkOver());
return isMore && (limit > count || limit < 0) && dir.getCurrent() != null;
}
private long count = 0;
public File next() {
File aux = dir.getCurrent();
dir.advancePostition();
count++;
return aux;
}
private boolean walkOver() {
if (dir.isOutOfDirListRange()) {
if (dir.isCantGoParent()) {
isMore = false;
return false;
} else {
dir.goToParent();
dir.advancePostition();
return true;
}
} else {
if (dir.isCurrentDirectory()) {
if (dir.isDirectoryEmpty()) {
dir.advancePostition();
} else {
dir.goIntoDir();
}
return true;
} else {
if (filter.accept(dir.getCurrent())) {
return false;
} else {
dir.advancePostition();
return true;
}
}
}
}
private boolean isMore = true;
public void remove() {
throw new UnsupportedOperationException();
}
}
Note that the iterator stop by an amount of files iterateds and it has a FileFilter also.
And DirectoryStack is:
public class DirectoryStack {
private class Element{
private File files[] = null;
private int currentPointer;
public Element(File current) {
currentPointer = 0;
if (current.exists()) {
if(current.isDirectory()){
files = current.listFiles();
Set<File> set = new TreeSet<File>();
for (int i = 0; i < files.length; i++) {
File file = files[i];
set.add(file);
}
set.toArray(files);
}else{
throw new IllegalArgumentException("File current must be directory");
}
} else {
throw new IllegalArgumentException("File current not exist");
}
}
public String toString(){
return "current="+getCurrent().toString();
}
public int getCurrentPointer() {
return currentPointer;
}
public void setCurrentPointer(int currentPointer) {
this.currentPointer = currentPointer;
}
public File[] getFiles() {
return files;
}
public File getCurrent(){
File ret = null;
try{
ret = getFiles()[getCurrentPointer()];
}catch (Exception e){
}
return ret;
}
public boolean isDirectoryEmpty(){
return !(getFiles().length>0);
}
public Element advancePointer(){
setCurrentPointer(getCurrentPointer()+1);
return this;
}
}
private DirectoryStack(File first){
getStack().push(new Element(first));
}
public static DirectoryStack getNewStack(File first){
return new DirectoryStack(first);
}
public String toString(){
String ret = "stack:\n";
int i = 0;
for (Element elem : stack) {
ret += "nivel " + i++ + elem.toString()+"\n";
}
return ret;
}
private Stack<Element> stack=null;
private Stack<Element> getStack(){
if(stack==null){
stack = new Stack<Element>();
}
return stack;
}
public File getCurrent(){
return getStack().peek().getCurrent();
}
public boolean isDirectoryEmpty(){
return getStack().peek().isDirectoryEmpty();
}
public DirectoryStack downLevel(){
getStack().pop();
return this;
}
public DirectoryStack goToParent(){
return downLevel();
}
public DirectoryStack goIntoDir(){
return upLevel();
}
public DirectoryStack upLevel(){
if(isCurrentNotNull())
getStack().push(new Element(getCurrent()));
return this;
}
public DirectoryStack advancePostition(){
getStack().peek().advancePointer();
return this;
}
public File[] peekDirectory(){
return getStack().peek().getFiles();
}
public boolean isLastFileOfDirectory(){
return getStack().peek().getFiles().length <= getStack().peek().getCurrentPointer();
}
public boolean gotMoreLevels() {
return getStack().size()>0;
}
public boolean gotMoreInCurrentLevel() {
return getStack().peek().getFiles().length > getStack().peek().getCurrentPointer()+1;
}
public boolean isRoot() {
return !(getStack().size()>1);
}
public boolean isCurrentNotNull() {
if(!getStack().isEmpty()){
int currentPointer = getStack().peek().getCurrentPointer();
int maxFiles = getStack().peek().getFiles().length;
return currentPointer < maxFiles;
}else{
return false;
}
}
public boolean isCurrentDirectory() {
return getStack().peek().getCurrent().isDirectory();
}
public boolean isLastFromDirList() {
return getStack().peek().getCurrentPointer() == (getStack().peek().getFiles().length-1);
}
public boolean isCantGoParent() {
return !(getStack().size()>1);
}
public boolean isOutOfDirListRange() {
return getStack().peek().getFiles().length <= getStack().peek().getCurrentPointer();
}
}
Using an Iterable doesn't imply that the Files will be streamed to you. In fact its usually the opposite. So an array is typically faster than an Iterable.
Are you sure it's due to Java, not just a general problem with having 10k entries in one directory, particularly over the network?
Have you tried writing a proof-of-concept program to do the same thing in C using the win32 findfirst/findnext functions to see whether it's any faster?
I don't know the ins and outs of SMB, but I strongly suspect that it needs a round trip for every file in the list - which is not going to be fast, particularly over a network with moderate latency.
Having 10k strings in an array sounds like something which should not tax the modern Java VM too much either.