In the following expression:
T(org.apache.commons.io.IOUtils).toString(T(java.lang.Runtime)
.getRuntime().exec(T(java.lang.Character).toString(105)
.concat(T(java.lang.Character).toString(100))).getInputStream())
Does the '105' in toString(105) refer to an itemized object within the Character class?
and
Why is the 'T', which I believe expresses a generic type, and is used 4 times in this expression, a necessary feature of Java?
The toString() method that seems to be invoked here is actually the toString(char) (static) method of java.lang.Character. Quoting the documentation:
public static String toString(char c)
Returns a String object representing the specified char.
The result is a string of length 1 consisting solely of the specified char.
Parameters:
c - the char to be converted
Returns:
the string representation of the specified char
Since:
1.4
Note that 100 and 105 are also valid char values where 100 == 'd' and 105 == 'i'.
Update: after knowing the context, I am now confident that this code is intended to be injected into a template for a web page. The template engine used provides special syntax for accessing static methods where T(Classname) resolves to just Classname (not Classname.class!) in the resulting Java code.
So your code would be translated to:
org.apache.commons.io.IOUtils.toString(java.lang.Runtime
.getRuntime().exec(java.lang.Character.toString(105)
.concat(java.lang.Character.toString(100))).getInputStream())
The full qualification of the class names is necessary because we do not know if those classes are imported on the attacked site (or if the template engine even allows imports or class names must always be fully qualified).
A more readable version of the code that assumes imports is
IOUtils.toString(
Runtime.getRuntime().exec(
Character.toString(105).concat(Character.toString(100))
).getInputStream()
)
And after a little de-obfuscation...
IOUtils.toString(Runtime.getRuntime().exec("id").getInputStream())
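The de-obfuscation step can be checked in isolation. The following is a minimal sketch (the class and method names are made up for illustration) that rebuilds the command string from the numeric character codes without executing anything:

```java
class DeobfuscationSketch {
    // Builds a string from numeric character codes, one Character.toString call
    // per code, mirroring the toString(...).concat(...) chain in the payload.
    static String decode(int... charCodes) {
        String decoded = "";
        for (int code : charCodes) {
            decoded = decoded.concat(Character.toString((char) code));
        }
        return decoded;
    }

    public static void main(String[] args) {
        System.out.println(decode(105, 100)); // prints "id"
    }
}
```

The explicit (char) cast makes the call resolve to Character.toString(char); the expression language in the payload performs that conversion itself.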
Whatever this is, it is definitely NOT meaningful Java code.
And the fact that you can provide it as a search query on some site is not evidence that it is Java either.
I suspect that this is actually some custom (site-specific?) query language. That makes it futile to try to understand it as a Java snippet.
Your theory that T could denote a generic type parameter doesn't work. Java would not allow you to write T(...) if that was the case.
Furthermore, if we assume that org.apache.commons.io.IOUtils, java.lang.Runtime and so on are intended to refer to Java class objects, then the correct Java syntax would be org.apache.commons.io.IOUtils.class, java.lang.Runtime.class and so on.
So what does it mean?
Well, a bit of Googling found me some other examples that look like yours. For instance:
https://github.com/VikasVarshney/ssti-payload
appears to generate "code" that is reminiscent of your example. This is SSTI - Server Side Template Injection. The T(...) type operator suggests that it is targeting the Spring Expression Language (SpEL) rather than Java EE Expression Language (EL).
And I think this particular example is an attempt to run the Linux id program ... which would output some basic information about the user and group ids for the account running your web server.
Does it matter? Well only if your site is vulnerable to SSTI attacks!
How would you know if your site is vulnerable?
By understanding the nature of SSTI with respect to EL and other potential attack vectors ... and auditing your codebase and configurations.
By using a vulnerability scanner to test your site and/or your code-base.
By employing the services of a trustworthy IT security company to do some penetration testing.
In this case, you could also try to use curl to repeat the attempted attack ... as the hacker would have done ... based on what is in your logs. Just see if it actually works. Note that running the id program does no actual damage to your system. The harm would be in the information that is leaked to a hacker ... if they succeed.
Note that if this hack did succeed, then the hacker would probably try to run other programs. These could do some damage to your system, depending on how well your server was hardened against such things.
In a huge project with tens of thousands of Java files, there are a couple of Java classes where developers may pass strings as parameters to the constructor of a class I implemented:
public byte[] getProductReport(List<String> products, Date from, Date to) {
// ... do some stuff before...
List<ReportParameterDto> reportParameters = new ArrayList<>();
reportParameters.add(new ReportParameterDto("From (YYYY.MM.DD)", ParameterType.DATE, from));
reportParameters.add(new ReportParameterDto("To_(YYYY.MM.DD)", ParameterType.DATE, to));
reportParameters.add(new ReportParameterDto("Products", ParameterType.SELECT, someList));
return ReportFromCRServerHelper.downloadReport("ProductReporot", reportParameters, ReportFormat.PDF);
}
If a developer uses wrong string values, downloading the requested report (from a remote report server) will fail at runtime.
In this example I would like to have some validation checking - during compilation - in order to avoid these errors before they are found by the customer.
I have API methods to obtain parameter values from a report which I hope to use
during compilation of the above method.
In my example the compilation should fail and throw an error highlighting how parameters should look instead:
"From (JJJJ-MM)" is invalid --> should be "From_(JJJJ-MM)"
"Products" is invalid --> should be "PRODUCT_LIST"
Can I detect these parameters (used in above ReportParameterDto constructors) through JAVAX annotation processing?
The few tutorials / blogs that I found dealt with validating parameters in method signatures, not the values passed into methods.
Or are there more elegant tools available?
A compile-time tool like the Checker Framework can validate the arguments to a method or constructor. You annotate the parameter types to indicate the permitted values, and then when javac runs it issues a warning if an argument may not be compatible with the parameter.
If there is a limited number of possible values, then you can use the Fake Enum Checker to treat strings or integers as enumerated values. (Using a Java enum is also a good idea, but may not be possible or convenient because of other code that expects a string or integer.)
If the number of possible values is unlimited, then you can use the Constant Value Checker -- for example, to supply a regular expression that any constant string argument must satisfy.
You can also define your own compile-time validation if you want different functionality than is available in the checkers that are distributed with the Checker Framework.
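Where a plain Java enum is an option, as mentioned above, the following minimal sketch shows how it moves the check to compile time without any extra tooling. All names here are hypothetical: ReportParameter stands in for ReportParameterDto, and the two server-side names are taken from the "should be" values in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReportParameterSketch {
    // Hypothetical enum: one constant per parameter name the report server accepts.
    enum ProductReportParameter {
        FROM("From_(JJJJ-MM)"),
        PRODUCT_LIST("PRODUCT_LIST");

        private final String serverName;

        ProductReportParameter(String serverName) {
            this.serverName = serverName;
        }

        String serverName() {
            return serverName;
        }
    }

    // Hypothetical stand-in for ReportParameterDto that takes the enum, not a String.
    static final class ReportParameter {
        final ProductReportParameter name;
        final Object value;

        ReportParameter(ProductReportParameter name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Collects the names that would be sent to the report server.
    static List<String> serverNames(List<ReportParameter> parameters) {
        List<String> names = new ArrayList<>();
        for (ReportParameter parameter : parameters) {
            names.add(parameter.name.serverName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<ReportParameter> parameters = new ArrayList<>();
        parameters.add(new ReportParameter(ProductReportParameter.FROM, "2020-01"));
        parameters.add(new ReportParameter(ProductReportParameter.PRODUCT_LIST, Arrays.asList("A", "B")));
        // new ReportParameter("Products", ...) would not compile: a String is not accepted.
        System.out.println(serverNames(parameters));
    }
}
```

The trade-off is that the enum has to be kept in sync with the report server by hand (or generated from it), whereas a checker could in principle query the server during compilation.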
I'm trying to execute this string in java using reflection
String methodCall = "com.mypackage.util.MathUtil.myFunction(4,\"abc\")";
This code does the job for me (after a little string parsing):
Class.forName("com.mypackage.util.MathUtil").getDeclaredMethod("myFunction", int.class, String.class).invoke(null, 4, "abc");
The problem with the above solution is that I need to know the parameter types before invoking the method, and unfortunately I don't have them.
As a workaround, I can get all declared methods using Class.forName("com.mypackage.util.MathUtil").getDeclaredMethods(), iterate over them, match on name and parameter count, and manually check the types with some logic to identify the appropriate method.
Can Java do this heavy lifting for me with something like this?
Class.forName("com.mypackage.util.MathUtil").getDeclaredMethod("myFunction").invoke(null, 4, "abc");
This code should try to match the appropriate method and can throw NoSuchMethodException or an ambiguity error when two or more similar methods match. Also, feel free to suggest other ways to achieve this use case.
The core problem of identifying the appropriate method with types was eliminated with the help of BeanShell.
String methodCall = "com.mypackage.util.MathUtil.randomNumbers(4,\"abc\")";
Interpreter i = new Interpreter();
String result = i.eval(methodCall).toString();
The performance of this eval execution is actually pretty good (~10-20ms), and I'm using this solution in a standalone framework, so I need not worry much. It also gives me the additional benefit of allowing complete Java snippets in the framework for customisation purposes.
Special thanks to @jCoder for the solution.
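Where an interpreter is not an option, the manual matching described in the question (iterate over the declared methods, match name and parameter count, then check the types) can be sketched with plain reflection. This is a rough sketch under stated assumptions: ReflectiveInvoker and the MathUtil stand-in are hypothetical, only static methods are considered, and primitive widening (for example int to long) and varargs are not handled.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class ReflectiveInvoker {
    // Maps a primitive type to its wrapper so boxed arguments can be tested against it.
    static Class<?> box(Class<?> type) {
        if (type == int.class) return Integer.class;
        if (type == long.class) return Long.class;
        if (type == double.class) return Double.class;
        if (type == float.class) return Float.class;
        if (type == boolean.class) return Boolean.class;
        if (type == char.class) return Character.class;
        if (type == byte.class) return Byte.class;
        if (type == short.class) return Short.class;
        return type; // already a reference type
    }

    // True if every argument is an instance of the corresponding (boxed) parameter type.
    static boolean accepts(Method method, Object[] args) {
        Class<?>[] parameterTypes = method.getParameterTypes();
        if (parameterTypes.length != args.length) return false;
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null) {
                if (parameterTypes[i].isPrimitive()) return false;
            } else if (!box(parameterTypes[i]).isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    // Finds the single static method that matches by name and argument types, then invokes it.
    static Object invokeStatic(Class<?> owner, String name, Object... args) {
        List<Method> matches = new ArrayList<>();
        for (Method candidate : owner.getDeclaredMethods()) {
            if (candidate.getName().equals(name)
                    && Modifier.isStatic(candidate.getModifiers())
                    && accepts(candidate, args)) {
                matches.add(candidate);
            }
        }
        if (matches.isEmpty()) throw new IllegalArgumentException("No matching method: " + name);
        if (matches.size() > 1) throw new IllegalStateException("Ambiguous call, candidates: " + matches);
        try {
            Method target = matches.get(0);
            target.setAccessible(true);
            return target.invoke(null, args);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("Invocation failed: " + name, e);
        }
    }
}

// Hypothetical stand-in for com.mypackage.util.MathUtil.
class MathUtil {
    static String myFunction(int count, String text) { return text + count; }
    static String myFunction(String first, String second) { return first + second; }
    static String describe(Object value) { return "object"; }
    static String describe(String value) { return "string"; }

    public static void main(String[] args) {
        System.out.println(ReflectiveInvoker.invokeStatic(MathUtil.class, "myFunction", 4, "abc")); // prints "abc4"
    }
}
```

Calling invokeStatic(MathUtil.class, "describe", "x") reports an ambiguity because both overloads accept a String; the Java compiler would pick the most specific one, which this sketch deliberately does not attempt.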
I have been trying to find the exact term for "tracking a method's parameter" for Java programming language and I generally found "taint analysis", but still not sure if I am on the right path.
What I want is to keep track of a method's parameter and see which parts of the method (in scope) the parameter affects. For example, if a parameter is assigned to another variable, I also want to keep track of the assigned variable. By "parts", I mean lines of code, statements or branches of a control flow graph.
I also checked for tools and came across the Checker Framework and FindBugs; however, it seems that they don't quite satisfy my needs, or I couldn't manage to make them work for my needs.
Please tell if "taint analysis" is the right term that I am looking for. Also, any other tool suggestions are welcome.
Below is an edited version of the code from the Checker Framework Live Demo. When the variable String input inside processRequest() is tainted, I expect to get a warning or an error for all of the lines inside the executeQuery() method, because a tainted variable is passed as its parameter.
import org.checkerframework.checker.tainting.qual.*;
public class TaintingExampleWithWarnings {
String getUserInput() {
return "taintedStr";
}
void processRequest() {
@Tainted String input = getUserInput();
executeQuery(input); //error: pass tainted string to executeQeury()
}
public void executeQuery(@Untainted String input) {
// Do some SQL Query
String token = input + " Hello World";
String tokens[] = token.split(" ");
for(int i=0; i<tokens.length; i++)
{
System.out.println((i+1)+"String: "+tokens[i]);
}
}
/* To eliminate warning in line 10, replace line 10 by
* executeQuery(validate(input)); */
/*@Untainted*/ public String validate(String userInput) {
// Do some validation here
@SuppressWarnings("tainting")
@Untainted String result = userInput;
return result;
}
}
The Tainting Checker of the Checker Framework issues a warning on exactly the defective line of your code:
% javac -g TaintingExampleWithWarnings.java -processor tainting
TaintingExampleWithWarnings.java:10: error: [argument.type.incompatible] incompatible types in argument.
executeQuery(input); //error: pass tainted string to executeQeury()
^
found : @Tainted String
required: @Untainted String
1 error
This pinpoints the defect and indicates exactly what you need to fix in your program.
"I expect to get a warning or an error for all of the lines inside executeQuery() method"
The implementation of executeQuery() is correct; it's the use of executeQuery() that is problematic.
(Background: A modular analysis is one that works one method at a time. A modular analysis relies on specifications of methods.)
Type-checking is an example of a modular analysis. Its specifications are user-written annotations on formal parameters.
When type-checking the body of executeQuery(), the type-checker assumes
that the formal parameter declarations are correct.
When type-checking a call to executeQuery(), the type-checker verifies that the arguments are legal.
If there is even one type-checking error somewhere in your program, then
your program might behave unsafely (possibly at some other location).
If you want to know all the possible places that taint could flow to in your program, then you need to use a non-modular, whole-program analysis. Furthermore, the whole-program analysis would need to ignore every user-written annotation in the program. Such an analysis is possible to do and is a reasonable desire, but it is not addressed by the tools you mentioned in your question.
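The modular behaviour described above (the body of a method is checked against its parameter declarations, and each call is checked against them separately) can be imitated in plain Java with a wrapper type. This is only an analogy, not how the Tainting Checker is implemented: the Untainted class below is a hypothetical stand-in for the @Untainted annotation, and its validation rule is an arbitrary placeholder.

```java
class ModularCheckSketch {
    // Hypothetical wrapper type that plays the role of the @Untainted qualifier.
    static final class Untainted {
        private final String value;

        private Untainted(String value) {
            this.value = value;
        }

        // The only way to obtain an Untainted value is to pass validation.
        static Untainted validate(String userInput) {
            if (!userInput.matches("[A-Za-z0-9 ]*")) {
                throw new IllegalArgumentException("rejected input");
            }
            return new Untainted(userInput);
        }

        String value() {
            return value;
        }
    }

    // The body is checked on the assumption that the parameter really is untainted.
    static String executeQuery(Untainted input) {
        return input.value() + " Hello World";
    }

    public static void main(String[] args) {
        String input = "taintedStr";
        // executeQuery(input); // rejected at the call site: a String is not an Untainted
        System.out.println(executeQuery(Untainted.validate(input))); // prints "taintedStr Hello World"
    }
}
```

As with the checker, the compiler reports the problem at the call, not on the lines inside executeQuery().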
I have a method that will process a Collection<Nodes> that is passed in as a parameter. This Collection will be modified, therefore I thought it would be good to first make a copy of it. How do I name the parameter and local variable, e.g. nodes in the example below?
List<Nodes> process(Collection<Nodes> nodes) {
List<Nodes> nodes2 = new ArrayList<>(nodes);
...
}
As another example consider the following where the variable is an int parsed from a String parameter:
public void processUser(final String userId) {
final int userId2 = Integer.parseInt(userId);
...
A good approach to the variable-naming problem is to use names that suggest the actual meaning of the variable. In your example, the names say nothing about what the method does or what the variables mean, which is why it is hard to pick a name.
There are many cases like yours in the JDK, e.g. Arrays#copyOf:
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
@SuppressWarnings("unchecked")
T[] copy = ((Object)newType == (Object)Object[].class)
? (T[]) new Object[newLength]
: (T[]) Array.newInstance(newType.getComponentType(), newLength);
System.arraycopy(original, 0, copy, 0,
Math.min(original.length, newLength));
return copy;
}
In this case they call the parameter original and the local variable copy which perfectly expresses that the returned value is a copy of the parameter. Precisely, copying is what this method does and it is named accordingly.
Using the same reasoning for your case (consider refactoring to give more meaningful names to your method and variables) I would name your local copy of nodes something like processedNodes, to express what that variable is and to be consistent with your method's name.
Edit:
The name of the new method you added in your edit does not provide hints about what it does either. I'll assume that it modifies some properties (maybe in a database) of the user whose id is passed via parameter.
If that is the case (or similar), I think an appropriate approach would be that every method should have a single responsibility. According to your method's name, it should process the user, and for that you need an int userId. The responsibility of parsing a String userId should be outside the scope of this method.
Using the proposed approach has, among others, the following advantages:
Your class won't change if you have to add additional validation to your input.
Your class won't be responsible for handling NumberFormatException, which should be the application's responsibility.
Your processUser method won't change if you have to handle different types of inputs (e.g. float userId).
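A minimal sketch of this separation follows. The class name, the parseUserId helper and the return value of processUser are assumptions made for illustration, since the question does not show what processing is done:

```java
class UserService {
    // Parsing lives at the boundary, so invalid input is rejected before processing starts.
    static int parseUserId(String rawUserId) {
        try {
            return Integer.parseInt(rawUserId.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a valid user id: " + rawUserId, e);
        }
    }

    // The processing method only ever sees an int, so no second name is needed.
    static String processUser(int userId) {
        return "processed user " + userId;
    }

    public static void main(String[] args) {
        System.out.println(processUser(parseUserId("42"))); // prints "processed user 42"
    }
}
```

With the conversion moved out, the naming question for the local variable disappears, because processUser only has one userId.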
It ultimately comes down to what you want to communicate to future programmers. The computer obviously doesn't care; it's other people you're talking to. So the biggest factor is going to be what those people need to know:
What is the logical (abstract, conceptual) meaning of this variable?
What aspects of how this variable is used could be confusing to programmers?
What are the most important things about this variable?
Looking at your first example, it's kind of hard to understand enough about your program to really choose a good name. The method is called process; but methods generally speaking implement computational processes, so this name really doesn't tell me anything at all. What are you processing? What is the process? Who are you processing it for, and why? Knowing what the method does, and the class it's in, will help to inform your variable name.
Let's add some assumptions. Let's say you're building an application that locates Wi-fi access points in a building. The Node in question is a wireless node, with subclasses Repeater, AccessPoint, and Client. Let's also say it's an online-processed dataset, so the collection of nodes given may change at any time in response to a background thread receiving updates in what nodes are currently visible. Your reason for copying the collection at the head of the method is to isolate yourself from those changes for the duration of local processing. Finally, let's assume that your method is sorting the nodes by ping time (explaining why the method takes a generic Collection but returns the more specific List type).
Now that we better understand your system, let's use that understanding to choose some names that communicate the logical intention of your system to future developers:
class NetworkScanner {
List<Node> sortByPingTime(Collection<Node> networkNodes) {
final ArrayList<Node> unsortedSnapshot;
synchronized(networkNodes) {
unsortedSnapshot = new ArrayList<>(networkNodes);
}
return Utils.sort(unsortedSnapshot, (x,y) -> x.ping < y.ping);
}
}
So the method is sortByPingTime to define what it does; the argument is networkNodes to describe what kind of node we're looking at. And the variable is called unsortedSnapshot to express two things about it that aren't visible just by reading the code:
It's a snapshot of something (implying that the original is somehow volatile); and
It has no order that matters to us (suggesting that it might have, by the time we're done with it).
We could put nodes in there, but that's immediately visible from the input argument. We could also call this snapshotToSort but that's visible in the fact that we hand it off to a sort routine immediately below.
This example remains kind of contrived. The method is really too short for the variable name to matter much. In real life I'd probably just call it out, because picking a good name would take longer than anyone will ever waste figuring out how this method works.
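For reference, here is a self-contained variant of the snippet above that compiles against the JDK alone. Utils.sort is not a JDK method, so this sketch swaps in List.sort with a Comparator; the Node class and its pingMillis field are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical wireless node with a measured ping time.
class Node {
    final String name;
    final long pingMillis;

    Node(String name, long pingMillis) {
        this.name = name;
        this.pingMillis = pingMillis;
    }
}

class NetworkScanner {
    List<Node> sortByPingTime(Collection<Node> networkNodes) {
        final List<Node> unsortedSnapshot;
        // Copy under the collection's lock so that concurrent updates cannot interfere.
        synchronized (networkNodes) {
            unsortedSnapshot = new ArrayList<>(networkNodes);
        }
        // Sort the private copy in place, lowest ping first; the caller's collection is untouched.
        unsortedSnapshot.sort(Comparator.comparingLong((Node node) -> node.pingMillis));
        return unsortedSnapshot;
    }

    public static void main(String[] args) {
        List<Node> visible = Arrays.asList(new Node("repeater", 30), new Node("accessPoint", 10));
        for (Node node : new NetworkScanner().sortByPingTime(visible)) {
            System.out.println(node.name + " " + node.pingMillis);
        }
    }
}
```

Synchronizing on the argument only helps if the background thread uses the same lock, which is an assumption carried over from the scenario above.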
Other related notes:
Naming is inherently a bit subjective. My name will never work for everyone, especially when multiple human languages are taken into account.
I find that the best name is often no name at all. If I can get away with making something anonymous, I will--this minimizes the risk of the variable being reused, and reduces symbols in IDE 'find' boxes. Generally this also pushes me to write tighter, more functional code, which I view as a good thing.
Some people like to include the variable's type in its name; I've always found that a bit odd because the type is generally immediately obvious, and the compiler will usually catch me if I get it wrong anyway.
"Keep it Simple" is in full force here, as everywhere. Most of the time your variable name will not help someone avoid future work. My rule of thumb is, name it something dumb, and if I ever end up scratching my head about what something means, choose that occasion to name it something good.
I tend to give names that reflect and emphasize the major things, so that a potential reader (including myself after a couple of months) can immediately get what is done inside the method just from its signature.
The API under discussion receives an input, does some processing and returns the output. These are the three main things here.
If it is not important, what processing is done and what is the type of input, the most generic is this form:
List<Nodes> process(Collection<Nodes> input) {
List<Nodes> output = new ArrayList<>(input);
...
}
and
public void process(final String input) {
final int output = Integer.parseInt(input);
...
If it is important to provide more information about the processing and the type of the input, names like processCollection, inputCollection and processUser, inputUserId are more appropriate, but the local variable is still output, which is a clear and self-explanatory name:
List<Nodes> processCollection(Collection<Nodes> inputCollection) {
List<Nodes> output = new ArrayList<>(inputCollection);
...
}
and
public void processUser(final String inputUserId) {
final int output = Integer.parseInt(inputUserId);
...
It depends on the use case and sometimes it is even more appropriate to elaborate the processing, which is done: asArray or asFilteredArray etc instead of processCollection.
Someone may prefer the source-destination terminology to the input-output - I do not see the major difference between them. If this serves telling the method story with its title, it is good enough.
It depends on what you are going to do with the local variable.
For example in the first example it seems that is likely that variable nodes2 will actually be the value returned in the end. My advice is then to simply call it result or output.
In the second example it is less clear what you want to achieve. I guess that userIdAsInt should be fine for the local variable. However, if an int is always expected here and you still want to keep the parameter as a String (perhaps you want to push that validation out of the method), I think it is more appropriate to name the local variable userId and the parameter userIdAsString or userIdString, which hints that String, although accepted here, is not the canonical representation of a userId, which is an int.
It certainly depends on the actual context. I would not use approaches from other programming languages, such as _, which is fine for naming bash scripts, for instance. IMO my is also not a good choice: it looks like a piece of code copied from a tutorial (at least in Java).
The most simple solution is to name method parameter nodesParam or nodesBackup and then you can simply go with nodes as a copy or to be more specific you can call it nodesCopy.
Anyway, your method process has some tasks to do and maybe it is not the best place for making copies of the nodes list. You can make a copy in the place where you invoke the method, then you can simply use nodes as a name of your object:
List<Nodes> process(Collection<Nodes> nodes) {
// do amazing things here
// ...
}
// ...
process(new ArrayList<>(nodes))
// ...
Just my guess: you have a collection and you want to keep the original version and modify the copy. Maybe a real solution for you is to use java.util.stream.Stream.
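As a rough illustration of the Stream idea, the sketch below derives a new list without touching the caller's collection, so no explicit copy (and no second name) is needed. Strings stand in for the Nodes type from the question, and the filter and map steps are placeholders for whatever the real processing is.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class StreamCopySketch {
    // Returns a new list; the nodes collection passed in is only read, never modified.
    static List<String> process(Collection<String> nodes) {
        return nodes.stream()
                .filter(node -> !node.isEmpty()) // placeholder processing step
                .map(String::toUpperCase)        // placeholder processing step
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("a", "", "b");
        System.out.println(process(nodes)); // prints "[A, B]"
        System.out.println(nodes.size());   // prints "3", the original is unchanged
    }
}
```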
Simply put, when naming the variable, I consider a few things.
How is the copy created? (Is it converted from one type to another?...)
What am I going to do with the variable?
Is the name short, but/and meaningful?
Considering the same examples you have provided in the question, I will name variables like this:
List<Nodes> process(Collection<Nodes> nodes) {
List<Nodes> nodesCopy = new ArrayList<>(nodes);
...
}
This is probably just a copy of the collection, hence the name nodesCopy. Meaningful and short. If you use nodesList, that can mean it is not just a Collection; but also a List (more specific).
public void processUser(final String userId) {
final int userIdInt = Integer.parseInt(userId);
...
The String userId is parsed and the result is an integer (int)! It is not just a copy. To emphasize this, I would name this as userIdInt.
It is better not to use an underscore _, because it often indicates instance variables. And the my prefix: not much of a meaning there, and it is nooby (local will do better).
When it comes to method parameter naming conventions, if the thing a method parameter represents will not be represented by any other variable, use a method parameter name that makes it very clear what that method parameter is in the context of the method body. For example, primaryTelephoneNumber may be an acceptable method parameter name in a JavaBean setter method.
If there are multiple representations of a thing in a method context (including method parameters and local variables), use names that make it clear to humans what that thing is and how it should be used. For example, providedPrimaryTelephoneNumber, requestedPrimaryTelephoneNumber, dirtyPrimaryTelephoneNumber might be used for the method parameter name and parsedPrimaryTelephoneNumber, cleanPrimaryTelephoneNumber, massagedPrimaryTelephoneNumber might be used for the local variable name in a method that persists a user-provided primary telephone number.
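A short sketch of that convention, using two of the names suggested above. The ContactSketch class and its clean-up rule (keep only digits and a plus sign) are assumptions made purely for illustration:

```java
class ContactSketch {
    private String primaryTelephoneNumber;

    // "provided" marks the raw value from the caller; "clean" marks the value that is safe to store.
    void updatePrimaryTelephoneNumber(String providedPrimaryTelephoneNumber) {
        String cleanPrimaryTelephoneNumber = providedPrimaryTelephoneNumber.replaceAll("[^0-9+]", "");
        this.primaryTelephoneNumber = cleanPrimaryTelephoneNumber;
    }

    String getPrimaryTelephoneNumber() {
        return primaryTelephoneNumber;
    }

    public static void main(String[] args) {
        ContactSketch contact = new ContactSketch();
        contact.updatePrimaryTelephoneNumber(" +1 (555) 010-9999 ");
        System.out.println(contact.getPrimaryTelephoneNumber()); // prints "+15550109999"
    }
}
```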
The main objective is to use names that make it clear to humans reading the source code today and tomorrow as to what things are. Avoid names like var1, var2, a, b, etc.; these names add extra effort and complexity in reading and understanding the source code.
Don't get too caught up in using long method parameter names or local variable names; the source code is for human readability and when the class is compiled method parameter names and local variable names are irrelevant to the machine.