AWS Lambda and Java Reflections (Guava)

I am trying to use Guava's reflection utilities (ClassPath) in my AWS Lambda function, but it does not seem to work in production.
The code I am trying to run is supposed to create a Map<String, Class> from class name to class.
Code:
val converterClassMap by lazy {
    val cl = ClassLoader.getSystemClassLoader()
    ClassPath.from(cl).getTopLevelClasses("converters").asSequence()
        .mapNotNull { it.load().kotlin }
        .filter { it.simpleName?.endsWith("Converter") == true }
        .associateBy({ it.simpleName }, { it })
}
Running this code locally works perfectly, but in production on Lambda the map is empty and the lookup fails with this error:
Key PaginationConverter is missing in the map.: java.util.NoSuchElementException
Has anyone else run into this problem?

One more thing to check. Your code contains the line
val cl = ClassLoader.getSystemClassLoader()
which means it uses the system classloader to scan for classes.
Try using the classloader of one of your own classes instead:
class SomeClassFromYourCodeNotALibrary
val cl = SomeClassFromYourCodeNotALibrary::class.java.classLoader
This works reliably regardless of how many classloaders the application uses. The AWS Lambda runtime, for example, may load your code through its own classloaders.
If it still does not work, try logging the classloader type and classpath, e.g. println(cl) and println((cl as? URLClassLoader)?.getURLs()?.joinToString(", "))
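For anyone who wants the same diagnostic from Java, here is a minimal sketch that prints what the system loader, the loader of one of your own classes, and the thread context loader look like. ClassLoaderProbe and describe are names made up for this illustration. On Java 9+ the system loader is not a URLClassLoader, so the URL list may simply be absent.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Guava or the AWS Lambda runtime.
final class ClassLoaderProbe {

    /** Describes a loader: its type, plus its URLs when it happens to expose them. */
    static String describe(ClassLoader cl) {
        if (cl == null) {
            return "bootstrap loader (represented as null)";
        }
        StringBuilder sb = new StringBuilder(cl.getClass().getName());
        if (cl instanceof URLClassLoader) {
            List<String> urls = new ArrayList<>();
            for (URL url : ((URLClassLoader) cl).getURLs()) {
                urls.add(url.toString());
            }
            sb.append(" urls=").append(urls);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Compare the three loaders; if they differ, scanning the wrong one can yield an empty result.
        System.out.println("system:  " + describe(ClassLoader.getSystemClassLoader()));
        System.out.println("own:     " + describe(ClassLoaderProbe.class.getClassLoader()));
        System.out.println("context: " + describe(Thread.currentThread().getContextClassLoader()));
    }
}
```

If the "own" line differs from the "system" line when run inside Lambda, that would support passing your own class's loader to ClassPath.from.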

Related

javassist - How can I replace a method body without extra startup flags on Java 17?

I'm trying to redefine a method at runtime using javassist, but I'm running into some issues on the last step, because of the weird requirements I have for this:
I can't require the user to add startup flags
My code will necessarily run after the class has already been defined/loaded
My code looks like this:
val cp = ClassPool.getDefault()
val clazz = cp.get("net.minecraft.world.item.ItemStack")
val method = clazz.getDeclaredMethod(
    "a",
    arrayOf(cp.get("net.minecraft.world.level.block.state.IBlockData"))
)
method.setBody(
    """
    {
        double destroySpeed = this.c().a(this, $1);
        if (this.s()) {
            return destroySpeed * this.t().k("DestroySpeedMultiplier");
        } else {
            return destroySpeed;
        }
    }
    """.trimIndent()
)
clazz.toClass(Items::class.java)
(I'm dealing with obfuscated method references, hence the weird names)
However, calling .toClass() causes an error as there are then two duplicate classes on the class loader - and to my knowledge there's no way to unload a single class.
My next port of call to update the class was to use the attach API and an agent, but that requires a startup flag to be added (on Java 9+, I'm running J17), which I can't do given my requirements. I have the same problem trying to load an agent on startup.
I have tried patching the server's jar file itself by using .toBytecode(), but I didn't manage to write the new class file to the jar - this method sounds promising, so it's absolutely on the table to restart the server after patching the jar.
Is there any way I can get this to work with my requirements? Or is there any alternative I can use to change a method's behavior?
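On the jar-patching idea specifically: one way to write the bytes returned by .toBytecode() back into a jar is the JDK's built-in zip file system. The sketch below is only that, a sketch. JarEntryPatcher is a made-up name, and it assumes the jar is not signed and is not in use by a running JVM while it is rewritten (so the patch would run before the restart). For the class in the question, the entry name would be net/minecraft/world/item/ItemStack.class.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Collections;

// Hypothetical helper for illustration; uses only the JDK zip file system provider.
final class JarEntryPatcher {

    /** Opens an existing jar as a file system; fails if the jar does not exist. */
    private static FileSystem open(Path jar) throws IOException {
        URI uri = URI.create("jar:" + jar.toUri());
        return FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap());
    }

    /** Overwrites an existing entry; the jar is rewritten when the file system is closed. */
    static void replaceEntry(Path jar, String entryName, byte[] newBytes) throws IOException {
        try (FileSystem fs = open(jar)) {
            Path entry = fs.getPath("/" + entryName);
            if (!Files.exists(entry)) {
                // Refuse to add new entries silently; a typo in the name should be visible.
                throw new NoSuchFileException(entryName);
            }
            Files.write(entry, newBytes);
        }
    }

    /** Reads an entry back, e.g. to verify the patch. */
    static byte[] readEntry(Path jar, String entryName) throws IOException {
        try (FileSystem fs = open(jar)) {
            return Files.readAllBytes(fs.getPath("/" + entryName));
        }
    }
}
```

Working on a copy of the jar and swapping it in afterwards would keep a broken patch from leaving the server unstartable.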

pyspark: call a custom java function from pyspark. Do I need Java_Gateway?

I wrote the following MyPythonGateway.java so that I can call my custom java class from Python:
import py4j.GatewayServer;

public class MyPythonGateway {
    public String findMyNum(String input) {
        return MyUtility.parse(input).getMyNum();
    }

    public static void main(String[] args) {
        // Expose an instance of this class as the Py4J entry point
        GatewayServer server = new GatewayServer(new MyPythonGateway());
        server.start();
    }
}
and here is how I used it in my Python code:
from py4j.java_gateway import JavaGateway

def main():
    gateway = JavaGateway()  # connect to the JVM
    myObj = gateway.entry_point.findMyNum("1234 GOOD DAY")
    print(myObj)

if __name__ == '__main__':
    main()
Now I want to use MyPythonGateway.findMyNum() function from PySpark, not just a standalone python script. I did the following:
myNum = sparkcontext._jvm.myPackage.MyPythonGateway.findMyNum("1234 GOOD DAY")
print(myNum)
However, I got the following error:
... line 43, in main:
myNum = sparkcontext._jvm.myPackage.MyPythonGateway.findMyNum("1234 GOOD DAY")
File "/home/edamameQ/spark-1.5.2/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 726, in __getattr__
py4j.protocol.Py4JError: Trying to call a package.
So what did I miss here? I don't know whether I need to run MyPythonGateway as a separate Java application to start a gateway server when using PySpark. Please advise. Thanks!
Below is exactly what I need:
input.map(f)
def f(row):
    # call MyUtility.java
    # x = MyUtility.parse(row).getMyNum()
    # return x
What would be the best way to approach this? Thanks!

First of all, the error you see usually means that the class you're trying to use is not accessible, so most likely it is a CLASSPATH issue.
Regarding the general idea, there are two important issues:
You cannot access SparkContext inside an action or transformation, so using the PySpark gateway won't work (see "How to use Java/Scala function from an action or a transformation?" for some details). If you want to use Py4J from the workers, you'll have to start a separate gateway on each worker machine.
You really don't want to pass data between Python and the JVM this way. Py4J is not designed for data-intensive tasks.

In PySpark, before calling the method
myNum = sparkcontext._jvm.myPackage.MyPythonGateway.findMyNum("1234 GOOD DAY")
you have to import the MyPythonGateway Java class as follows:
from py4j.java_gateway import java_import
java_import(spark.sparkContext._jvm, "myPackage.MyPythonGateway")
myPythonGateway = spark.sparkContext._jvm.MyPythonGateway()
myPythonGateway.findMyNum("1234 GOOD DAY")
Also specify the jar containing myPackage.MyPythonGateway with the --jars option of spark-submit.

If inputs is an RDD, something like the following might work. You can't access the JVM variable (attached to the Spark context) inside the executors that run an RDD map function, and to my knowledge there is no PySpark equivalent of @transient lazy val, so the function opens its own gateway connection once per partition. This assumes a Py4J gateway server is reachable from each worker.
import py4j.java_gateway

def pythonGatewayIterator(iterator):
    results = []
    jvm = py4j.java_gateway.JavaGateway().jvm
    mygw = jvm.myPackage.MyPythonGateway()
    for value in iterator:
        results.append(mygw.findMyNum(value))
    return results

inputs.mapPartitions(pythonGatewayIterator)

All you need to do is compile the jar and add it to the PySpark classpath with the --jars or --driver-class-path spark-submit options. Then access the class and method with the code below:
sc._jvm.com.company.MyClass.func1()
where sc is the Spark context.
Tested with Spark 2.3. Keep in mind that you can call JVM class methods only from the driver program, not from the executors.
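Note that the sc._jvm.package.Class.method() form goes through the class name, which as far as I know only reaches static members, while findMyNum in the question is an instance method. A minimal sketch of a static variant follows. MyUtility is not shown in the question, so firstDigits below is a hypothetical stand-in for MyUtility.parse(input).getMyNum(), and the class name MyPythonGatewaySketch is made up as well.

```java
// Hypothetical sketch. For real use from PySpark, declare the class public in its
// own file and package (e.g. myPackage) and ship the jar with --jars.
final class MyPythonGatewaySketch {

    /** Stand-in for MyUtility.parse(input).getMyNum(): returns the first run of digits, or "". */
    static String firstDigits(String input) {
        int start = -1;
        for (int i = 0; i < input.length(); i++) {
            boolean digit = Character.isDigit(input.charAt(i));
            if (digit && start < 0) {
                start = i;                         // first digit of the run
            } else if (!digit && start >= 0) {
                return input.substring(start, i);  // run ended
            }
        }
        return start >= 0 ? input.substring(start) : "";
    }

    /** Static, so it can be called through the class name without creating an instance. */
    public static String findMyNum(String input) {
        return firstDigits(input);
    }

    public static void main(String[] args) {
        System.out.println(findMyNum("1234 GOOD DAY"));
    }
}
```

With a static method there is no need to construct the object on the Python side first, though the restriction to the driver program still applies.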

Spark can't load static files from webjars

I am using Spark Framework in my application, and use
staticFileLocation("/META-INF/resources/");
so that I can use webjars, which contain css and js files in there. I also have my own resources in my project's src/main/resources/META-INF/resources folder, because my Gradle build picks them up from there.
My build uses a fat-jar approach, where everything ends up in a single jar, and all files are served perfectly by Spark.
My problem is that when I run some unit tests standalone from Eclipse, the webjars are not served by Spark even though I ensured that they are on the classpath. Only my own project's static resources are served.
@Test
public void testStartup() throws InterruptedException {
    InputStream schemaIS = this.getClass().getClassLoader().getResourceAsStream("META-INF/resources/webjars/bootstrap/3.2.0/js/bootstrap.min.js");
    System.out.println(schemaIS == null);
    staticFileLocation("/META-INF/resources/");
    // depending on the trailing / the bootstrap js is found, but Spark never serves it
}
I think this has something to do with classloaders, but I cannot find a way to make this work. The Spark code says "The thread context class loader will be used for loading the resource." I also see that the code itself removes the trailing slash, which makes a big difference in a plain getResourceAsStream call.
Is it a bug in Spark, or is there any way to make it work properly?

Note that removing the leading slash is required by Jetty, not by Spark.
Unfortunately, with Spark you cannot mix static files (in a physical directory/folder) with files served as resources from a jar. Serving from several jars does not work in Spark either.
I had a look at this a few weeks ago and came to the conclusion that this is a minor weakness in Spark (or a bug, if you prefer).
The only way forward I found was to reverse-engineer Spark and figure out how Jetty works. With the following Nashorn JavaScript snippets I managed to make webjars and static files work together.
Unless the Spark author changes the code to allow tailor-made context handlers, this will not help you within Spark itself. But if you are willing to use Jetty directly, this code can be adapted.
The code is for Nashorn jjs (from JDK 8) but can easily be ported to Java. With it I was able to use three separate webjars (jquery, bootstrap, angular), while the rest of my client code was in a physical directory/folder named public.
app.js:
with(new JavaImporter(
    org.eclipse.jetty.server
  , org.eclipse.jetty.server.handler
)) {
    var server = new Server(4567);
    var ctxs = new ContextHandlerCollection();
    ctxs.setHandlers(Java.to([
        load('src/static.js')
      , load('src/webjars.js')
    ], Handler.class.getName().concat('[]')));
    server.setHandler(ctxs);
    server.start();
    server.join();
}
src/static.js:
(function () {
    var context;
    with(new JavaImporter(
        org.eclipse.jetty.server.handler
      , org.eclipse.jetty.util.resource
    )) {
        context = new ContextHandler();
        context.setContextPath("/");
        var handler = new ResourceHandler();
        handler.setBaseResource(Resource.newResource("public"));
        context.setHandler(handler);
    }
    return context;
})();
src/webjars.js:
(function () {
    var context;
    with(new JavaImporter(
        org.eclipse.jetty.server.handler
      , org.eclipse.jetty.util.resource
    )) {
        context = new ContextHandler();
        context.setContextPath("/");
        var handler = new (Java.extend(ResourceHandler, {
            getResource: function(req) {
                var path = req.getUri();
                var resource = Resource.newClassPathResource(path);
                if (resource == null || !resource.exists()) {
                    resource = Resource.newClassPathResource("META-INF/resources/webjars" + path);
                }
                return resource;
            }
        }))();
        handler.setDirectoriesListed(true); // true when debugging, false in production
        context.setHandler(handler);
    }
    return context;
})();
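Before digging into Jetty, it may help to confirm which lookups actually see the webjar file from the unit test. The JDK-only sketch below (ResourceLookupProbe is a made-up name) compares a lookup through the thread context class loader, which expects names without a leading slash, with Class.getResource, which accepts a leading slash and strips it before delegating. It does not involve Spark or Jetty at all.

```java
import java.net.URL;

// Hypothetical diagnostic helper for illustration; JDK classes only.
final class ResourceLookupProbe {

    /** Looks a name up through the thread context class loader; returns null when it is not visible. */
    static URL viaContextLoader(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(name);
    }

    /** Looks a name up relative to this class; a leading slash means "from the classpath root". */
    static URL viaClass(String name) {
        return ResourceLookupProbe.class.getResource(name);
    }

    public static void main(String[] args) {
        // Resource path taken from the question's test.
        String file = "META-INF/resources/webjars/bootstrap/3.2.0/js/bootstrap.min.js";
        System.out.println("context loader, no leading slash: " + viaContextLoader(file));
        System.out.println("context loader, leading slash:    " + viaContextLoader("/" + file));
        System.out.println("Class.getResource, leading slash: " + viaClass("/" + file));
    }
}
```

If the first line prints a URL inside the bootstrap webjar while Spark still does not serve the file, the classpath is fine and the limitation is in how the static file location is resolved.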

Getting error on files.readAllLines

I am working on Android. To read the file content, I am using the method
List<String> lines = Files.readAllLines(wiki_path);
But when I use this method I get this error:
The method readAllLines(Path) is undefined for the type MediaStore.Files.
Why can the compiler not find the method?
Path wiki_path = Paths.get("C:/tutorial/wiki", "wiki.txt");
try {
    List<String> lines = Files.readAllLines(wiki_path);
    for (String line : lines) {
        if (url.contains(line)) {
            other.put(TAG_Title, name);
            other.put(TAG_URL, url);
            otherList.add(other);
            break;
        }
    }
}

The method you're trying to use is a member of java.nio.file.Files, but that class (and indeed that package) is not available on Android before API level 26. Even if the Java 7 version existed, you're trying to use a method introduced in Java 8. The Files class you've imported is android.provider.MediaStore.Files, which is an entirely different class.
Even if it compiled, the path you're providing looks very much like a Windows path, which wouldn't work on an Android device.
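A workaround that avoids java.nio.file entirely is to read the lines with java.io. This is a minimal sketch: LineReaderSketch is a made-up name, UTF-8 is an assumed encoding, and StandardCharsets plus try-with-resources assume a reasonably recent API level. On a device the File would point at app storage rather than a C:/ path.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; uses java.io instead of java.nio.file.Files.
final class LineReaderSketch {

    /** Reads every line of the file into a list, assuming UTF-8 content. */
    static List<String> readAllLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

The loop from the question can then iterate over LineReaderSketch.readAllLines(file) unchanged.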

Launching Java Subprocess using parent process Classpath

I want to launch a java subprocess, with the same java classpath and dynamically loaded classes as the current java process. The following is not enough, because it doesn't include any dynamically loaded classes:
String classpath = System.getProperty("java.class.path");
Currently I'm searching for each needed class with the code below. However, on some machines this fails for some classes/libs because the source variable is null. Is there a more reliable and simpler way to get the location of the libs that are used by the current JVM process?
String stax = ClassFinder.classPath("javax.xml.stream.Location");

public static String classPath(String qualifiedClassName) throws NotFoundException {
    try {
        Class qc = Class.forName( qualifiedClassName );
        CodeSource source = qc.getProtectionDomain().getCodeSource();
        if ( source != null ) {
            URL location = source.getLocation();
            String f = location.getPath();
            f = URLDecoder.decode(f, "UTF-8"); // decode URL to avoid spaces being replaced by %20
            return f.substring(1);
        } else {
            throw new ClassFinder().new NotFoundException(qualifiedClassName+" (unknown source, likely rt.jar)");
        }
    } catch ( Exception e ) {
        throw new ClassFinder().new NotFoundException(qualifiedClassName);
    }
}

See my previous question which covers getting the classpath as well as how to launch a sub-process.
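For the launching part, a minimal sketch of starting a child JVM with the parent's static classpath plus any extra locations you have collected yourself. ChildJvmLauncher, buildCommand and launch are made-up names, and the sketch assumes the java launcher sits under java.home/bin. It only covers what java.class.path reports plus the entries you pass in; classes defined at runtime from byte arrays are not carried over.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; JDK classes only.
final class ChildJvmLauncher {

    /** Builds the command line: the parent's java binary, its classpath plus extras, then main class and args. */
    static List<String> buildCommand(String mainClass, List<String> extraClasspath, List<String> args) {
        String javaBin = System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
        StringBuilder cp = new StringBuilder(System.getProperty("java.class.path", ""));
        for (String entry : extraClasspath) {
            if (cp.length() > 0) {
                cp.append(File.pathSeparator);
            }
            cp.append(entry);
        }
        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.add("-cp");
        command.add(cp.toString());
        command.add(mainClass);
        command.addAll(args);
        return command;
    }

    /** Starts the child and lets it share the parent's stdin/stdout/stderr. */
    static Process launch(List<String> command) throws IOException {
        return new ProcessBuilder(command).inheritIO().start();
    }
}
```

Keeping buildCommand separate from launch makes it easy to log the exact command line when the child fails to find a class.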

I want to launch a java subprocess, with the same java classpath and dynamically loaded classes as the current java process.
You mean invoke a new JVM?
Given that...
it is possible to plug in all sorts of agents and instrumentation into a JVM that can transform classes at load time
it is possible to take a byte array and turn it into a class
it is possible to have complex class loader hierarchies with varying visibility between classes and have the same classes loaded multiple times
...there is no general, magic, catch-all and foolproof way to do this. You should design your application and its class loading mechanisms to achieve this goal. If you allow 3rd party plug-ins, you'll have to document how this works and how they have to register their libraries.

If you look at the javadoc for Class.getClassLoader, you'll see that the "bootstrap" classloader is typically represented as null. String.class.getClassLoader() will return null on the standard Sun JVM implementations. I think this implementation detail carries over into the CodeSource handling. As such, I wouldn't imagine you need to worry about any class that comes from the bootstrap classloader, as long as your sub-process uses the same JVM implementation as the current process.
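In code, that suggests treating a null CodeSource as "nothing to add to the child classpath" rather than as an error. The sketch below is one way to do it, with made-up names (CodeSourceLocator, locationOf). It converts the location through a URI instead of decoding the path and dropping the first character, which is an assumption about what the question's substring(1) was meant to achieve.

```java
import java.net.URISyntaxException;
import java.nio.file.Paths;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper for illustration; JDK classes only.
final class CodeSourceLocator {

    /** Returns where a class was loaded from, or empty for classes without a code source (e.g. bootstrap classes). */
    static Optional<String> locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return Optional.empty();
        }
        try {
            // URI conversion handles %20 and platform-specific leading slashes.
            return Optional.of(Paths.get(source.getLocation().toURI()).toString());
        } catch (URISyntaxException | RuntimeException e) {
            // Non-file locations (for example jrt:) cannot become a local path; report the raw URL instead.
            return Optional.of(source.getLocation().toString());
        }
    }

    public static void main(String[] args) {
        System.out.println("String:    " + locationOf(String.class));
        System.out.println("own class: " + locationOf(CodeSourceLocator.class));
    }
}
```

Empty results can simply be skipped when assembling the classpath for the sub-process.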
