Java Runtime query - toString() and T


In the following expression:
T(org.apache.commons.io.IOUtils).toString(T(java.lang.Runtime)
.getRuntime().exec(T(java.lang.Character).toString(105)
.concat(T(java.lang.Character).toString(100))).getInputStream())
Does the '105' in toString(105) refer to an itemized object within the Character class?
and
Why is the 'T', which I believe expresses a generic type, and is used 4 times in this expression, a necessary feature of Java?

The toString() method that seems to be invoked here is actually the toString(char) (static) method of java.lang.Character. Quoting the documentation:
public static String toString(char c)
Returns a String object representing the specified char.
The result is a string of length 1 consisting solely of the specified char.
Parameters:
c - the char to be converted
Returns:
the string representation of the specified char
Since:
1.4
Note that 100 and 105 are also valid char values where 100 == 'd' and 105 == 'i'.
Update: after knowing the context, I am now confident that this code is intended to be injected into a template for a web page. The template engine used provides special syntax for accessing static methods where T(Classname) resolves to just Classname (not Classname.class!) in the resulting Java code.
So your code would be translated to:
org.apache.commons.io.IOUtils.toString(java.lang.Runtime
.getRuntime().exec(java.lang.Character.toString(105)
.concat(java.lang.Character.toString(100))).getInputStream())
The full qualification of the class names is necessary because we do not know if those classes are imported on the attacked site (or if the template engine even allows imports or class names must always be fully qualified).
A more readable version of the code that assumes imports is
IOUtils.toString(
    Runtime.getRuntime().exec(
        Character.toString(105).concat(Character.toString(100))
    ).getInputStream()
)
And after a little de-obfuscation...
IOUtils.toString(Runtime.getRuntime().exec("id").getInputStream())
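The de-obfuscation step can be checked in plain Java without running any command. This is only a sketch: the class name DecodePayload is made up for illustration, and the explicit (char) casts are an addition so that the snippet also compiles on Java versions that lack an int overload of Character.toString.

```java
// Hypothetical helper for illustration; it decodes the string but executes nothing.
class DecodePayload {
    // Rebuilds the command string from the character codes used in the payload.
    static String decode() {
        return Character.toString((char) 105)
                .concat(Character.toString((char) 100));
    }

    public static void main(String[] args) {
        System.out.println(decode()); // prints "id"
    }
}
```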

Whatever this is, it is definitely NOT meaningful Java code.
And the fact that you can provide it as a search query on some site is not evidence that it is Java either.
I suspect that this is actually some custom (site-specific?) query language. That makes it futile to try to understand it as a Java snippet.
Your theory that T could denote a generic type parameter doesn't work. Java would not allow you to write T(...) if that was the case.
Furthermore, if we assume that org.apache.commons.io.IOUtils, java.lang.Runtime and so on are intended to refer to Java class objects, then the correct Java syntax would be org.apache.commons.io.IOUtils.class, java.lang.Runtime.class and so on.
So what does it mean?
Well a bit of Googling found me some other examples that look like yours. For instance;
https://github.com/VikasVarshney/ssti-payload
appears to generate "code" that is reminiscent of your example. This is SSTI (Server-Side Template Injection). The T(...) type operator suggests that the target is the Spring Expression Language (SpEL) rather than the Java EE Expression Language (EL).
And I think this particular example is an attempt to run the Linux id program ... which would output some basic information about the user and group ids for the account running your web server.
Does it matter? Well only if your site is vulnerable to SSTI attacks!
How would you know if your site is vulnerable?
By understanding the nature of SSTI with respect to EL and other potential attack vectors ... and auditing your codebase and configurations.
By using a vulnerability scanner to test your site and/or your code-base.
By employing the services of a trustworthy IT security company to do some penetration testing.
In this case, you could also try to use curl to repeat the attempted attack ... as the hacker would have done ... based on what is in your logs. Just see if it actually works. Note that running the id program does no actual damage to your system. The harm would be in the information that is leaked to a hacker ... if they succeed.
Note that if this hack did succeed, then the hacker would probably try to run other programs. These could do some damage to your system, depending on how well your server was hardened against such things.

Related

Create dynamic classes with reserved words as variables

This question was once asked without a satisfactory answer besides "why would you want to do this" at Reserved words as variable or method names. I'm going to ask it again, and provide context that explains why it is necessary, and even the direction to a proper solution.
I am writing code that builds classes dynamically to match the schema of a database, which I have no control over. For the most part, the code is working cleanly, but in about .1% of the column cases, there are reserved words in Java being used as column names. The following code is being used to create the dynamic field in the class:
evalClass.addField(CtField.make("public " + columnType + " " + columnName + ";", evalClass));
Now, with Java the language, this results in an issue, however in JVM byte code, this should be perfectly legal, so there should be a way to dynamically create this field and access it using byte-code operations. Does anybody have any examples of how this would be done in a way that would support arbitrary strings, including spaces and reserved words? Thanks!

It's not clear which part you are stuck on. Any bytecode manipulation library should let you do this.
For example, using ASM, you just pass your string directly to visitField. There are no hoops to jump through.
Note that even at the bytecode level, there are still a few restrictions on field names. In particular, they can't be more than 65535 bytes long in MUTF8 encoding.

You picked the only way where this doesn’t work—Javassist’s source level API. It should be obvious to you that if you use the identifier to construct source code, the identifier must adhere to the source code rules. Besides, using the already known intended structure to construct source code which has to be parsed again to reconstitute the intention, is the most inefficient way of processing byte code.
You can use the Bytecode level API to overcome these limitations. As a side note, most other byte code processing libraries do not have a source level API at all, so with them you would have used a byte code level API right from the start.
That said, you should rethink your premise. Generated classes whose fields can only be accessed via Reflection or other generated code, do not offer any advantage over, e.g. a HashMap mapping from identifiers to values or arrays intrinsically associating columns with positions.
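As a rough illustration of the map-based alternative mentioned above, the sketch below stores one row as a map from column name to value. The class name RowExample and the sample column names are hypothetical; the point is only that keys such as "class" or "first name" need no special treatment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical row holder: column names are plain map keys, so reserved
// words and names containing spaces work without any bytecode tricks.
class RowExample {
    static Map<String, Object> sampleRow() {
        Map<String, Object> row = new LinkedHashMap<>(); // keeps column order
        row.put("class", "economy");  // reserved word used as a column name
        row.put("first name", "Ada"); // column name containing a space
        row.put("int", 42);           // another reserved word
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = sampleRow();
        System.out.println(row.get("class"));
        System.out.println(row.get("first name"));
    }
}
```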

Forcing devs to explicitly define keys for configuration data

We are working in a project with multiple developers and currently the retrieval of values from a configuration file is somewhat "wild west":
Everybody uses some string to retrieve a value from the Config object
Those keys are spread across multiple classes and packages
Sometimes they are not even declared as constants
Naming of the keys is inconsistent and the config file (.properties) looks messy
I would like to sort that out and force everyone to explicitly define their configuration keys. Ideally in one place to streamline how config keys actually look.
I was thinking of using an Enum as a key and turning my retrieval method from
getConfigValue(String key)
into something like
getConfigValue(ConfigKey)
NOTE: I am using this approach since the Preferences API seems a bit overkill to me plus I would actually like to have the configuration in a simple file.
What are the cons of this approach?

First off, FWIW, I think it's a good idea. But you did specifically ask what the "cons" are, so:
The biggest "con" is that it ties any class that needs to use configuration data to the ConfigKey class. Adding a config key used to mean adding a string to the code you were working on; now it means adding to the enum and to the code you were working on. This is (marginally) more work.
You're probably not markedly increasing inter-dependence otherwise, since I assume the class that getConfigValue is part of is the one on which you'd define the enum.
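A minimal sketch of the enum-keyed approach under discussion might look like the following. Only the names getConfigValue and ConfigKey come from the question; the Config class, the enum constants and the property names are assumptions made up for illustration.

```java
import java.util.Properties;

// Hypothetical wrapper around java.util.Properties.
class Config {
    private final Properties properties;

    Config(Properties properties) {
        this.properties = properties;
    }

    // Only declared keys can be looked up; an arbitrary String will not compile.
    String getConfigValue(ConfigKey key) {
        return properties.getProperty(key.propertyName());
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("cache.size", "2048");
        System.out.println(new Config(p).getConfigValue(ConfigKey.CACHE_SIZE));
    }
}

// Hypothetical enum: every configuration key is declared once, in one place.
enum ConfigKey {
    CACHE_SIZE("cache.size"),
    CACHE_TIMEOUT("cache.timeout");

    private final String propertyName;

    ConfigKey(String propertyName) {
        this.propertyName = propertyName;
    }

    String propertyName() {
        return propertyName;
    }
}
```

This also makes the dependency described above visible: every caller of getConfigValue now has to reference ConfigKey.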

The other downside to consolidation is if you have multiple projects on different parts of the same code base. When you develop, you have to deal with delivery dependencies, which can be a PITA.
Say Project A and Project B are scheduled to get released in that order. Suddenly political forces change in the 9th hour and you have to deliver B before A. Do you repackage the config to deal with it? Can your QA cycles deal with repackaging or does it force a reset in their timeline.
Typical release issues, but just one more thing you have to manage.

From your question, it is clear that you intend to write a wrapper class for the raw Java Properties API, with the intention that your wrapper class provides a better API. I think that is a good approach, but I'd like to suggest some things that I think will improve your wrapper API.
My first suggested improvement is that an operation that retrieves a configuration value should take two parameters rather than one, and be implemented as shown in the following pseudocode:
class Configuration {
    public String getString(String namespace, String localName) {
        return properties.getProperty(namespace + "." + localName);
    }
}
You can then encourage each developer to define a string constant value to denote the namespace for whatever class/module/component they are developing. As long as each developer (somehow) chooses a different string constant for their namespace, you will avoid accidental name clashes and promote a somewhat organised collection of property names.
My second suggested improvement is that your wrapper class should provide type-safe access to property values. For example, provide getString(), but also provide methods with names such as getInt(), getBoolean(), getDouble() and getStringList(). The int/boolean/double variants should retrieve the property value as a string, attempt to parse it into the appropriate type, and throw a descriptive error message if that fails. The getStringList() method should retrieve the property value as a string and then split it into a list of strings based on using, say, a comma as a separator. Doing this will provide a consistent way for developers to get a list value.
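The first two suggestions could be sketched roughly as follows. The class name TypedConfiguration is hypothetical, and throwing IllegalArgumentException for a missing or malformed value is an assumption of this sketch, not something the suggestions prescribe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical wrapper; the method names follow the suggestions above.
class TypedConfiguration {
    private final Properties properties;

    TypedConfiguration(Properties properties) {
        this.properties = properties;
    }

    // Looks up "<namespace>.<localName>"; a missing key is treated as an error here.
    String getString(String namespace, String localName) {
        String key = namespace + "." + localName;
        String value = properties.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    // Parses the value as an int and reports the offending key on failure.
    int getInt(String namespace, String localName) {
        String value = getString(namespace, localName);
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Property " + namespace + "."
                    + localName + " is not an integer: '" + value + "'", e);
        }
    }

    // Splits a comma-separated value into trimmed, non-empty items.
    List<String> getStringList(String namespace, String localName) {
        List<String> result = new ArrayList<>();
        for (String item : getString(namespace, localName).split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("billing.retries", "3");
        p.setProperty("billing.hosts", "a, b ,c");
        TypedConfiguration cfg = new TypedConfiguration(p);
        System.out.println(cfg.getInt("billing", "retries"));
        System.out.println(cfg.getStringList("billing", "hosts"));
    }
}
```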
My third suggested improvement is that your wrapper class should provide some additional methods such as:
int getDurationMilliseconds(String namespace, String localName);
int getDurationSeconds(String namespace, String localName);
int getMemorySizeBytes(String namespace, String localName);
int getMemorySizeKB(String namespace, String localName);
int getMemorySizeMB(String namespace, String localName);
Here are some examples of their intended use:
cacheSize = cfg.getMemorySizeBytes(MY_NAMESPACE, "cache_size");
timeout = cfg.getDurationMilliseconds(MY_NAMESPACE, "cache_timeout");
The getMemorySizeBytes() method should convert string values such as "2048 bytes" or "32MB" into the appropriate number of bytes, and getMemorySizeKB() does something similar but returns the specified size in terms of KB rather than bytes. Likewise, the getDuration<units>() methods should be able to handle string values like "500 milliseconds", "2.5 minutes", "3 hours" and "infinite" (which is converted into, say, -1).
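The parsing step behind getMemorySizeBytes() might be sketched like this. The class MemorySizeParser, the accepted unit spellings, and the choice of 1 KB = 1024 bytes are all assumptions of the sketch; it also returns long rather than int to avoid overflow for large sizes.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for values such as "2048 bytes" or "32MB".
class MemorySizeParser {
    // A whole number followed by an optional unit; case is ignored.
    private static final Pattern SIZE = Pattern.compile(
            "\\s*(\\d+)\\s*(bytes|kb|mb)?\\s*", Pattern.CASE_INSENSITIVE);

    static long parseBytes(String text) {
        Matcher m = SIZE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a memory size: '" + text + "'");
        }
        long amount = Long.parseLong(m.group(1));
        // A missing unit is interpreted as bytes (an assumption of this sketch).
        String unit = m.group(2) == null ? "bytes" : m.group(2).toLowerCase(Locale.ROOT);
        switch (unit) {
            case "kb":
                return amount * 1024L;
            case "mb":
                return amount * 1024L * 1024L;
            default:
                return amount;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseBytes("2048 bytes")); // prints 2048
        System.out.println(parseBytes("32MB"));       // prints 33554432
    }
}
```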
Some people may think that the above suggestions have nothing to do with the question that was asked. Actually, they do, but in a sneaky sort of way. The above suggestions will result in a configuration API that developers will find to be much easier to use than the "raw" Java Properties API. They will use it to obtain that ease-of-use benefit. But using the API will have the side effect of forcing the developers to adopt a namespace convention, which will help to solve the problem that you are interested in addressing.
Or to look at it another way, the main con of the approach described in the question is that it offers a win-lose situation: you win (by imposing a property-naming convention on developers), but developers lose because they swap the familiar Java Properties API for another API that doesn't offer them any benefits. In contrast, the improvements I have suggested are intended to provide a win-win situation.

Why is it wrong to use numbers in Java method names?

Sometime ago, I remember being told not to use numbers in Java method names. Recently, I had a colleague ask me why and, for the life of me, I could not remember.
According to Sun (and now Oracle) the general naming convention for method names is:
Methods should be verbs, in mixed case
with the first letter lowercase, with
the first letter of each internal word
capitalized.
Code Conventions of Java
This doesn't specifically say that numbers can't be used, although by omission you can see that it's not advised.
Consider the situation (that my colleague has) where you want to perform some logic based on a specific year, for instance, a new policy that takes effect in 2011, and so your application must act on the information and process it based on its year. Common sense could tell you that you could call the method:
boolean isSessionPost2011(int id) {}
Is it acceptable to use numbers in method names (despite the wording of the standard)? If not, why?
Edit: "This doesn't specifically say that numbers can't be used, although by omission you can see that it's not advised." Perhaps I worded this incorrectly. The standard says 'Methods should be verbs'. I read this to say that considering a number is not a verb, then method names should not use numbers.

The standard Java class library is full of classes and methods with numbers in it, like Graphics2D.

The method seems ... overly specific.
Couldn't you instead use:
boolean isSessionAfter(int id, Date date)
?
That way the next time you have a policy applied to anything after a particular date, you don't need to copy-paste the old method and change the number - you just call it with a different date.
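A rough sketch of the generalized method could look like the following. The SessionCheck class and its in-memory map are stand-ins invented for illustration, and java.time.LocalDate is used here in place of the Date parameter shown above.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; the map stands in for wherever sessions are really stored.
class SessionCheck {
    private final Map<Integer, LocalDate> sessionDates = new HashMap<>();

    void addSession(int id, LocalDate date) {
        sessionDates.put(id, date);
    }

    // The cut-off is a parameter, so no year is baked into the method name.
    boolean isSessionAfter(int id, LocalDate cutoff) {
        LocalDate date = sessionDates.get(id);
        return date != null && date.isAfter(cutoff);
    }

    public static void main(String[] args) {
        SessionCheck check = new SessionCheck();
        check.addSession(1, LocalDate.of(2012, 3, 1));
        LocalDate endOf2011 = LocalDate.of(2011, 12, 31);
        System.out.println(check.isSessionAfter(1, endOf2011)); // prints true
    }
}
```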

Sure, it's acceptable to use numbers in method names. But as per your example, that's why it's generally frowned upon. Let's say that there is now a new policy in place for the year 2012. Now, there's a new policy in place for 2014. And maybe 2020! So, you have four methods that are roughly equivalent.
What you want isn't a boolean but rather a strategy to do something, or do nothing, based on whether or not a policy was found. Hence, a method void processPolicy(Structure yourStructure); would be a better approach - now you can shield that you're doing a lookup based on the year, and don't have to have separate methods per year, or even limit it to just one policy per year (maybe a policy takes place in two different years, for example, or just three months).

The Java Language Specification seems fairly specific on this topic:
3.8 Identifiers
An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.
...
The Java letters include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024). The $ character should be used only in mechanically generated source code or, rarely, to access preexisting names on legacy systems.
The "Java digits" include the ASCII digits 0-9 (\u0030-\u0039).
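The rule quoted above can be tried out with two methods from java.lang.Character. This is only a sketch: the class IdentifierCheck is hypothetical, and it checks the letter/digit rule only, so it does not reject reserved words such as "class".

```java
// Hypothetical checker for the "Java letter, then Java letters or digits" rule.
class IdentifierCheck {
    static boolean isValidIdentifier(String name) {
        // The first character must be a Java letter, never a digit.
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        // Every later character may be a Java letter or a Java digit.
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidIdentifier("isSessionPost2011")); // prints true
        System.out.println(isValidIdentifier("2011Session"));       // prints false
    }
}
```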

This doesn't specifically say that numbers can't be used, although by omission you can see that it's not advised.
I certainly wouldn't read the Java Style Guide that way. And judging from numerous examples in the Java class libraries, neither do they.
I guess the only caveat is that the JSG recommends use of meaningful names. And the corollary is that you should only use numbers in identifiers when they are semantically meaningful. Good examples are
"3D",
"i18n" ( == internationalization ),
"2020" (the year),
"X509" (a standard), and so on.
Even "int2Real" is meaningful in a folksy way.
UPDATE
@biziclomp has raised the case of LayoutManager2, and claims that the 2 conveys no meaning.
Here's what the javadoc says about the purpose of this interface:
This minimal extension to LayoutManager is intended for tool providers who wish to the creation of constraint-based layouts. It does not yet provide full, general support for custom constraint-based layout managers.
From this, I would say that the 2 in the name is meaningful. Basically, it is saying that you can view this as a successor to LayoutManager. I guess that could have been said in words, but see the examples above of how numbers are used as shorthand.
@BlueRaja writes:
The 2 does not explain anything - how is LayoutManager2 any different from LayoutManager?
The advice of the Style Guide is NOT that names should explain things. Rather, it advises that they should be meaningful. (For the explanation, refer to the javadoc.) Obviously meaningfulness is relative, but there is a practical limit on the amount of information you can put into an identifier before it becomes hard to read and hard to type.
My take is that the identifier should remind the reader of the meaning of the thing (class, field, method, etc.) that is named.
It is a trade-off.

Methods should be verbs, in mixed case with the first letter lowercase, with the first letter of each internal word capitalized.
This phrasing alone already shows that they use a more general meaning of "verb" than the usual one, under which only "is" would be the verb; neither "session" nor "post" is a verb. The sentence means something like "Method names should be verbs or verbal phrases, ...", and numbers can very well be parts of verbal phrases.
The idea is that a complete method call can be read as a complete sentence, with the subject being the object before the dot, the verb being the method name, and additional objects being the arguments to the method:
if (buffer.isEmpty())
    buffer.append(word);
(Most such sentences would be either questioning or imperative ones.)
From a naming-convention viewpoint, the only problem with your method name is that the subject of the sentence (the session) is not the this object of your method but a parameter. I don't think this can be avoided in Java (someone please prove me wrong).
For multiple-parameter methods the Smalltalk approach would work better:
"Hello" replace: "e" with: "x"
(where replace:with: is one method of the string class.)

Yes, in some circumstances. For example, maybe you want to handle X.509 certificates. I think it would be perfectly acceptable to write a method called handleX509Certificate.

The only problem I see with using numbers in method names is that it may be an indication that something in your design could be improved upon. (I hesitate to say "is wrong.") For instance, in your example, you stated that you have a specific policy which comes into effect after 2011. However, having a method specifically to check for that year seems overly specific and magic-number-y. I'd instead suggest creating a generalized function to check if an event occurred after a specified date as Anon suggested.
(Anon's answer popped up while I was halfway through mine, so my apologies if it seems like I'm just duplicating what he said. I felt that mine expanded on what he was saying a bit, so I thought I'd post it anyway.)

I would consider calling your method something else. Nothing against numbers exactly, but what happens if the project slips its release date? You'll have a method called post2011 when it should now be called post2012. Consider calling it postProjNameImplementation instead, maybe?

Using numbers is not bad in itself, but it is not very common.
In this specific case, I don't think isSessionPost2011(int id) {} is a good name. isSessionPostYear(int id, int year) {} would be better, as it is more extensible for future uses.

The fact that it is a coding convention, and its use of the verb "should", suggest that digits are permitted but advised against in method names. However, in your example, why not generalize the code as follows?
session.isPostYear(int year);

We use 'em all the time, like the example you showed. Also for interface versions, like IConnection2 and IConnection3.
Eclipse doesn't complain that it's a nontraditional name, either. :)
But acceptable? That's kind-of up to you.

Don't ever forget - rules are made to be broken. The only absolute should be that there are no absolutes.

I don't believe there's a per se reason to avoid numbers in identifiers, although in the case you describe, I don't know that I'd use it. Rather, I'd name the method something like boolean isPolicyXyzApplicable(int id).
If this is a policy that's expected to change more over time, consider splitting policies out into different classes so you don't end up growing a long vine of if(isPolicyX) ... else if(isPolicyY) ... else if(isPolicyZ) ... in your methods. Once this is factored out, use an abstract or interface method Policy.isApplicableTo(transaction) and a collection of Policy objects to determine what to do.
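The refactoring described above might be sketched as follows. Only Policy.isApplicableTo(transaction) comes from the text; the Transaction type, the StartDatePolicy class, the dates and the applicablePolicies helper are assumptions made up for this example.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PolicyExample {
    // Hypothetical transaction type carrying only the date the policies look at.
    static final class Transaction {
        final LocalDate date;

        Transaction(LocalDate date) {
            this.date = date;
        }
    }

    interface Policy {
        boolean isApplicableTo(Transaction transaction);
    }

    // A policy that applies to everything on or after its start date.
    static final class StartDatePolicy implements Policy {
        private final LocalDate start;

        StartDatePolicy(LocalDate start) {
            this.start = start;
        }

        @Override
        public boolean isApplicableTo(Transaction transaction) {
            return !transaction.date.isBefore(start);
        }
    }

    // Replaces the if / else-if chain: ask each policy whether it applies.
    static List<Policy> applicablePolicies(List<Policy> policies, Transaction transaction) {
        List<Policy> result = new ArrayList<>();
        for (Policy policy : policies) {
            if (policy.isApplicableTo(transaction)) {
                result.add(policy);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Policy> policies = Arrays.asList(
                new StartDatePolicy(LocalDate.of(2011, 1, 1)),
                new StartDatePolicy(LocalDate.of(2014, 1, 1)));
        Transaction transaction = new Transaction(LocalDate.of(2012, 6, 1));
        System.out.println(applicablePolicies(policies, transaction).size()); // prints 1
    }
}
```

Adding a policy for another year then means adding an object to the list rather than a new method with a year in its name.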

As long as you have a reason for using numbers, then imho I think it's fine.
For your example, there might be two isSessionPost methods, so how would you name them? isSessionPost and isSessionPost2? Not very clear, to be honest.
Just remember that all names must be meaningful and you won't go wrong.

I think in your case it's OK to use it as a one-off marker, specifically if you expect that the method will only live for a short period of time and eventually be deprecated.
If I understand your use case, you need to bring some legacy data into the new version of your application. If this is the case, then definitely add this method, mark it @deprecated and retire it when all your clients are updated.
On the other hand, Ralph here has a valid point. Don't let this project slip into 2012 :)

nothing is wrong
String int2string(int i)
User findUser4Id(long id)
void startHibern8();
wow! this website doesn't like these method names! I got captchaed!

Best sandboxed expression language for JVM

I want an expression language that runs on the JVM and includes support for
math expressions, including operator priority
string expressions, like substring, etc
supports named functions
this allows me to decorate and control exactly who and what functions can be executed.
read/write variables that are "typeless" / allow type conversion in a controlled manner.
does not allow arbitrary java scriptlets.
it should not be possible to include constructs like new Someclass()
cannot execute arbitrary static or otherwise method
does not allow any OGNL like expressions.
I only want the functions I map to be available.
support for control constructs like if this then that is for the moment optional.
must be embeddable.
This previous stackoverflow question is similar, but:
does not really answer "how" or "what" as does the above,
allows java object expressions, throwing an exception from a SecurityManager to stop method execution, which is nasty and wrong.
java object like expressions should be an error at parse time.
jexel seems to be the closest possible match, but the license is a bit horrible (GPL/Commercial).

If you only want the scripts to output text, then Apache Velocity fits your constraints quite well. It runs in an environment where it only has access to the objects you give it, but can do things like basic math.
The Apache license is a bit friendlier than GPL too.

Java for Clojure users

I've been using Lisp on and off, and I'm catching up with Clojure.
The good thing about Clojure is that I can use all the Java functions naturally, and the bad thing about Clojure is also that I have to know the Java functions.
For example, I had to spend some time (googling) to find the square root function in Java (Math/sqrt in Clojure notation).
Could you recommend me some good information resource for Java functions (libraries) for clojure users that are not so familiar with Java?
It can be anything - good books, webpages, forums or whatever.

I had similar problems when I first started using Clojure. I had done some Java development years ago, but was still pretty unfamiliar with the libraries out there.
Intro
I find the easiest way to use Java is to not really use it. I think a book would be a little bit much to just get started using Java from Clojure. There isn't that much you really need to know, unless you really start getting down into the JVM/Java libraries. Let me explain.
Spend more time learning how to use Clojure inside and out, and become familiar with Clojure-Contrib. For instance, sqrt is in generic.math-functions in clojure.contrib.
Many of the things you'll need are in fact already in Clojure–but still plenty are not.
Become familiar with calling conventions and syntactic sugar in Clojure for using Java. e.g. Math/sqrt, as per your example, is calling the static method (which is just a function, basically) sqrt from the class Math.
Anyway, here's a guide that should help you get started if you find yourself really needing to use Java. I'm going to assume you've done some imperative OO programming, but not much else. And even if you haven't, you should be okay.
Isaac's Clojurist's Guide to Java
Classes
A class is a bundle of methods (functions which act on instances of the class) that can also be a data type. For example, (Double. 1.2) creates a new instance of the class Double with the value 1.2. The trailing period is syntactic sugar for calling the class's constructor, which initializes the new object with the values you provide.
Now, look at the Double class in the Java 6 API:
Double
public Double(double value)
Constructs a newly allocated Double object that represents the
primitive double argument.
Parameters:
value - the value to be represented by the Double.
So you can see what happened there. You "built" a new Double with value 1.2, which is a double. It is a little confusing, but really a Double is a class that represents a primitive double and can do things relating to doubles.
Static Methods
For instance, to parse a Double value out of a string, we can use the static method (meaning we don't need a particular instance of Double, we can just call it like we called sqrt) parseDouble(String s):
(Double/parseDouble "1.2") => 1.2
Not too tricky there.
Nonstatic Methods
Say we want to use a Java class that we initialized to something. Not too difficult:
(-> (String. "Hey there") ;; make a new String object
    (.toUpperCase))       ;; pass it to .toUpperCase (look up -> to see what it does)
                          ;; toUpperCase is a non-static method
=> "HEY THERE"
So now we've used a method which is not static, and which requires a real, live String object to deal with. Let's look at how the docs say it works:
toUpperCase
public String toUpperCase()
Converts all of the characters in this String to upper case using
the rules of the default locale. This method is equivalent to
toUpperCase(Locale.getDefault()).
Returns:
the String, converted to uppercase.
So here we have a method which returns a string (as shown by the "String" after the public in the definition) and takes no parameters. But wait! It does take a parameter. In Python, it'd be the implicit parameter self: this is called this in Java.
We could also use the method like this: (.toUpperCase (String. "Hey there")) and get the same result.
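When reading the JavaDocs it can help to see how the Clojure forms used so far would be written in Java itself. The sketch below is only an illustration; the class InteropEquivalents and its method names are made up, while the library calls are the ones discussed above.

```java
// Hypothetical class pairing each Clojure interop form with its Java spelling.
class InteropEquivalents {
    // Clojure: (Math/sqrt x), a static method call on the class Math
    static double squareRoot(double x) {
        return Math.sqrt(x);
    }

    // Clojure: (Double/parseDouble s), another static method call
    static double parse(String s) {
        return Double.parseDouble(s);
    }

    // Clojure: (.toUpperCase (String. s)), a constructor call followed by
    // an instance method call on the newly created object
    static String shout(String s) {
        return new String(s).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(squareRoot(16.0));
        System.out.println(parse("1.2"));
        System.out.println(shout("Hey there"));
    }
}
```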
More on Methods
Since you deal with mutable data and classes in Java, you need to be able to apply functions to Classes (instances of Classes, really) and not expect a return value.
For instance, say we're dealing with a JFrame from the javax.swing library. We might need to do a number of things to it, not with it (you generally operate with values, not on them in functional languages). We can, like this:
(doto (JFrame. "My Frame!") ;; clever name
  (.setContentPane ... here we'd add a JPanel or something to the JFrame)
  (.pack) ;; this simply arranges the stuff in the frame–don't worry about it
  (.setVisible true)) ;; this makes the Frame visible
doto just passes its first argument to all the other functions you supply it, and passes it as the first argument to them. So here we're just doing a lot of things to the JFrame that don't return anything in particular. All these methods are listed as methods of the JFrame in the documentation (or its superclasses… don't worry about those yet).
Wrapping up
This should prepare you for now exploring the JavaDocs yourself. Here you'll find everything that is available to you in a standard Java 1.6 install. There will be new concepts, but a quick Google search should answer most of your questions, and you can always come back here with specific ones.
Be sure to look into the other important Clojure functions like proxy and reify as well as extend-type and its friends. I don't often use them, but when I need to, they can be invaluable. I still am understanding them myself, in fact.
There's a ton out there, but it's mostly a problem of volume rather than complexity. It's not a bad problem to have.
Additional reading:
Static or Nonstatic? ;; a guide to static vs. nonstatic methods
The Java Class Library ;; an overview of what's out there, with a nice picture
The JavaDocs ;; linked above
Clojure Java Interop Docs ;; from the Clojure website
Best Java Books ;; as per clartaq's answer

Really, any good Java book can get you started. See for example the answer to the question about the best Java book people have read so far. There are lots of good sources there.
Once you have a little Java under your belt, using it is all just a matter of simple Clojure syntax.
Mastering the content of the voluminous Java libraries is a much bigger task than figuring out how to use them in Clojure.

My first question would be: what do you exactly need? There are many Java libraries out there. Or do you just need the standard libraries? In that case the answer given by dbyrne should be enough.
Keep in mind that in general you are better off using the Clojure data structures like sequences instead of the Java equivalents.

Start with the Sun (now Oracle) Java Tutorials: http://download.oracle.com/javase/tutorial/index.html
Then dive into the Java 6 API docs:
http://download-llnw.oracle.com/javase/6/docs/
Then ask questions on #clojure IRC or the mailing list, and read blogs.
For a deep dive into Java the language, I recommend Bruce Eckel's free Thinking in Java:
http://www.mindview.net/Books/TIJ/

I think the plain old Java 6 API Specification should be pretty much all you need.
