CodeQL Scanning JAR Files - java

I'm just getting started with CodeQL and have had plenty of success scanning Python projects. Now I'm starting to scan Java projects, and I'm struggling to scan precompiled ones.
From what I've gathered, the CodeQL CLI includes an autobuilder for Java code and will build the projects for me. I'm trying to scan projects already compiled and published to the Maven Central repository.
Question:
Is it possible to scan compiled Java source code (i.e., bytecode, class files) contained within a JAR file with CodeQL?
If so, how can I invoke CodeQL from the CLI to scan JAR files?
Thanks for any insight!

As mentioned in the other answer, for Java CodeQL observes the compiler as the code is built and creates its database from that process. It is therefore not possible to build a database from a JAR containing only compiled classes. It is, however, possible to use compiled classes as part of a project (e.g. in the form of Maven dependencies, or the JDK itself): CodeQL will record that these classes are used, but it has no insight into what they do. That means no data flow or taint flow will be available for them unless CodeQL explicitly models it; see the list of supported frameworks.
However, since your plan is to run queries against projects from Maven Central, it is probably easiest to obtain ready-made databases from lgtm.com, or to use the Query Console on lgtm.com directly (see the documentation). For most projects, lgtm.com is able to build the project on its own.
lgtm.com is owned by Semmle, which originally created CodeQL and was acquired by GitHub.

From what I read, it does not seem to work on compiled classes. You will need the source code, whether that exists as a source JAR (which you would need to unzip before processing) or a GitHub project.
Usually you provide the build command when creating the database, e.g. --language=java --command='mvn clean install -DskipTests', and this requires source code.
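For reference, creating and then analyzing a database from source with the CLI looks roughly like this (the database name and output file are placeholders, and exact flags may vary between CLI versions):

```shell
# Build a CodeQL database by observing the Maven build (requires source code).
codeql database create my-java-db \
  --language=java \
  --command='mvn clean install -DskipTests'

# Analyze the database with the default Java query suite.
codeql database analyze my-java-db \
  --format=sarif-latest \
  --output=results.sarif
```

There is no equivalent invocation that accepts a JAR of class files as input, which is the limitation described above.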

Related

committing an executable with separately committed Java packages

My application has 4 Java packages.
com.me.utilities
com.me.widget
com.me.analysis
com.me.interface
All packages depend upon the utilities. The widget package depends upon the interface package.
The utilities might be valuable to other applications, so they ought to be a package of their own. The analysis does not depend on the widget or the interface, so it ought to be a package of its own. The interface might change, because the organization it interfaces to might go out of business, so it ought to be a package of its own.
This is just one application that produces one executable.
On the basis of this organization I do commits on each package but not on the executable. I want to start committing the executable. One way would be to commit the executable to a new git repository without any connection to the source, but that sounds reckless: an executable with no formal way to tie it to its source code.
Another way, which sounds a little inefficient, would be to start a new git repository that "adds" the source code of all 4 Java packages, each of which has many Java files, and also "adds" the executable. This seems a little strange because it fails to respect the 4 existing git repositories that already track their respective collections of source code.
What is the right way to tie these 4 packages together with their common executable?
I use SmartGit for routine commits and I use command line git for reverting. I am willing to stop using SmartGit if the solution to this inquiry necessitates it.
It looks like what you're looking for is an artifact repository, like Nexus, Artifactory, JCenter, etc.
That's where you typically publish the artifacts produced by a build every time you do a release.
That's also what allows build tools and IDEs to fetch the artifacts for the libraries a project uses. So if you end up turning your utilities package into a separate library, used by several different projects, you'll need to publish it in such an artifact repository. Both Gradle and Maven fetch their artifacts from, and can publish artifacts to, such repositories.
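In Maven, for instance, the publish target is configured in the pom.xml via a distributionManagement section (the ids and URLs below are placeholders), after which mvn deploy pushes the built artifact there:

```xml
<!-- Placeholder ids/URLs; match them to your Nexus/Artifactory setup. -->
<distributionManagement>
  <repository>
    <id>releases</id>
    <url>https://repo.example.com/repository/maven-releases/</url>
  </repository>
  <snapshotRepository>
    <id>snapshots</id>
    <url>https://repo.example.com/repository/maven-snapshots/</url>
  </snapshotRepository>
</distributionManagement>
```

Credentials for those server ids go in ~/.m2/settings.xml, not in the pom.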

Distribution of java class library code

I need to put some old java class library code that I have into a repo, from where others can check it out and build it. You know, like any public repo.
But, I'm not sure what the best way to do this is in the java world. In old-fashioned projects, we just used to supply the build scripts and a list of dependencies. You gathered or installed the dependencies separately before running the build scripts.
But these days for many languages, you have package managers and the like that pull from remote locations and your build scripts need to include dependency fetching.
Basically, I'm not familiar with how Java libs and programs are packaged.
Should I include the (dependency) libs in the repo, and update them whenever a new version is out?
Does Java now have a package manager that will pull in the latest versions of the dependencies?
Do I leave it up to the people checking out to download the libs themselves before they run the build scripts?
I'd prefer it if the solution didn't involve installing a huge package manager. Gradle wants to pull in like 150MB+ of stuff and as far as I am aware, it isn't ubiquitous on java deployments.
Thanks.
Use Maven. I believe these days it's the #1 "package manager" (not a term that's usually used to describe it, but quite apt) by a large margin. It's built into NetBeans and IntelliJ IDEA, and I believe Eclipse as well.
However, it won't just "pull the latest versions" of your dependencies, since that could break your application; it uses only the versions you specify. You should therefore periodically update (and test) your code, to reduce incompatibilities when someone uses your library in an application that directly or indirectly pulls newer versions of the same libs (a bit of "DLL hell"), or reduce your use of third-party libraries in general.
You should also consider publishing your library in a compiled form to Maven Central so that using your library would be as easy as adding a dependency to the pom.xml. The problem that Maven solves, after all, is not so much making it easy to build your library (since just bundling the dependencies gets you most of the way), but making it easy to use your library.
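Once the library is on Maven Central, the consumer's side is just a dependency entry in their pom.xml (the coordinates below are made up for illustration):

```xml
<dependency>
  <groupId>com.example</groupId>
  <artifactId>your-library</artifactId>
  <version>1.0.0</version>
</dependency>
```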

Importing protocol buffer definitions between Maven projects

I currently manage a few separate Maven projects in which I use Protobufs as a serialization format and over the wire. I am using David Trott's maven-protoc plugin to generate the code at compile time.
All is good and well until I want those project to communicate between one another — or rather, use each other's protobufs. The protobuf language has an "import" directive which does what I want but I'm faced with the challenge of having project A exporting a ".proto" file (or possibly some intermediate format?) for project B to depend upon.
Maven provides a way for a project to bundle resources but AFAIK, these are meant to be used at runtime by the code and not by a goal during the compile / source generation phase — at least I haven't been able to find documentation that describes what I want to achieve.
I've found another way to achieve this, and it doesn't involve any Maven magic. Diving into the code of the maven-protoc plugin, I found that this is a supported use case: the plugin will look for and collect any .proto files in dependency jars and unpack them into a temporary directory. That directory is then passed as an import path to the protoc invocation.
All that needs to happen is for the .proto file to be included in the dependency's package, which I did by making it a resource:
projects/a/src/main/resources/a.proto
Now in projects/b/pom.xml, add 'a' as a regular Maven dependency and just import a.proto from b.proto as if it existed locally:
b.proto:
import "a.proto";
This isn't ideal, since file names may clash between various projects, but that should occur rarely enough.
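For completeness, a sketch of project b's side (coordinates are hypothetical): with a.proto packaged as a resource in a, project b only declares a as a regular dependency in projects/b/pom.xml and the maven-protoc plugin resolves the import:

```xml
<dependency>
  <groupId>com.example</groupId>
  <artifactId>a</artifactId>
  <version>1.0.0</version>
</dependency>
```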
You can package your .proto files in a separate .jar/.zip in the project where they are generated, and publish them in your repository using a dedicated classifier. Using the assembly plugin might help here to publish something close to "source jars" that are built during releases.
Then, on projects using them, add previously created artifact as dependency.
Use the dependency plugin with the "unpack-dependencies" goal, and bind it to a phase before "compile".
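The unpack step above might look roughly like this in the consuming project's pom.xml (the phase, classifier, and output directory are illustrative and should match your protoc plugin's configuration):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack-protos</id>
      <!-- Run before compile so the .proto files exist when code generation happens. -->
      <phase>generate-sources</phase>
      <goals>
        <goal>unpack-dependencies</goal>
      </goals>
      <configuration>
        <classifier>proto</classifier>
        <includes>**/*.proto</includes>
        <outputDirectory>${project.build.directory}/protos</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>
```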

what happens when we Build a java web project

I am a newbie to J2EE and I am not able to understand the directory structure created when building a Java web project. After a bit of googling I understood what we store in WEB-INF, but:
1) What do we store in META-INF?
2) How does the target folder get created?
3) Where do we specify which files should be placed in the target folder?
I am using Maven to build the project which is a spring-hibernate based project.
thanks in advance
1) What's the purpose of META-INF?
2) Maven creates the target folder for you. It's where all of the Maven plugins dump their work by default.
3) Maven has mechanisms for excluding files from it.
The key to understanding Maven is that Maven works on conventions. That means that Maven will do a lot of things really well with almost no effort on your part if you structure your project according to Maven's expectations. For example, this is how you differentiate between Java classes and resources in the source directory:
src/main/java/com/mycompany/MyObj.java
src/main/resources/my/company/spring.context.xml
src/test/java/com/mycompany/MyObjTest.java
src/test/resources/my/company/spring.context.xml
When you run mvn test, it will compile all of that, copy it appropriately over to the target folder, load the JUnit runner, and give you a classpath that lets Spring have easy access to the spring context under the test folder. It'll run the tests and drop the reports under target.
Maven is not like Ant. In Ant, you have to tell it everything. Maven works on the opposite end in that it assumes everything by default until you tell it otherwise.
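A minimal pom.xml illustrates how little you need to declare when you follow the conventions (the coordinates are placeholders); everything else, such as source locations, the target folder, and the test phase, is assumed by default:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
</project>
```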
This is a common problem because Java has grown so big. It's often hard to tell where one technology ends and another begins. You need to familiarize yourself with the documentation for all the various components you are using.
For instance, if you have a 'target' folder then I assume you are using Maven. Maven is a Java build and dependency-management tool. When you 'mavenize' a project, you agree to adhere to a set of conventions, and Maven in turn does a lot of the grunt work for you (compiling code, fetching dependent libraries, and running tests). Part of what Maven does is create the standard Maven directories, in this case 'target'.
more maven info - http://maven.apache.org/
As for META-INF, this is part of the Java EE spec. It does have a purpose concerning packaging and deployment, but you'll generally not find yourself using it very often. It's the same principle as Maven: you adhere to the Java EE standard, and Java EE-compliant tools do most of the work for you.
For more info look at this link - http://java.sun.com/blueprints/guidelines/designing_enterprise_applications/packaging_deployment/index.html
In general to understand these you should check out some tutorials on Java EE and refer to your container's examples and documentation.
1) What is the purpose of META-INF?
2-3) The target folder is created by Maven, which manages all dependencies, etc.: one, two

How to examine required libraries?

I'm developing a web application with a lot of libraries like Spring, Apache CXF, Hibernate, Apache Axis, Apache Commons, and so on. Each of these frameworks comes with a lot of *.jar libraries.
For development I simply take all of the delivered libraries and add them to my classpath.
For deployment, not all of these libraries are required, so is there a quick way to determine the required libraries (*.jar) that are used by my source code?
If you move your project to use Maven such things become easier:
mvn dependency:analyze
mvn dependency:tree
For your example, Maven + IDE + nice dependency diagrams could help a lot.
See an example of this: it's much easier this way to figure out what happens in a project, and you don't need to add "all delivered libraries" to your project, just what's required.
JDepend traverses Java class file directories and generates design quality metrics for each Java package. JDepend allows you to automatically measure the quality of a design in terms of its extensibility, reusability, and maintainability, to manage package dependencies effectively.
So, as a quick, dirty, and potentially inefficient way, you can try this in Eclipse:
Create two copies of your project.
In project copy #2 remove all the jars from the classpath.
Pick a source file that now has errors because it can't resolve a class reference. Pick one of the unresolved classes and note its fully qualified class name.
Do Control-Shift-T and locate the unresolved class. You should be able to see which jar it's contained in, since all the jars are still in the classpath of project copy #1.
Add the jar that contains this unresolved class back into your classpath in project copy #2, then repeat steps 3 and 4 until all class references are resolved.
Unfortunately you're not done yet since the jar files themselves may also have dependencies. Two ways to deal with this:
Go read the documentation for all the third-party packages you're using. Each package should tell you what its dependencies are.
Run your application and see if you get any ClassNotFoundExceptions. If you do, then use Control-Shift-T to figure out what jar that class comes from and add it to your classpath. Repeat until your project runs without throwing any ClassNotFoundExceptions.
The problem with #2 is that you don't really know you've resolved all the dependencies since you can't simulate every possible execution path your project might take.
