Adding Google Guava to Android project - significantly slows down the build

Adding Google Guava to Android project - significantly slows down the build - java

After adding Google Guava r09 to our Android project the build time increased significantly, especially the DEX generation phase. I understand that DEX generation takes all our classes + all jars we depend on and translates them to DEX format. Guava is a pretty big jar around 1.1MB
Can it be the cause for the build slowdown?
Are there anything can be done to speed this up?
P.S. Usually I build from Intellij, but I also tried building with Maven - same results.
Thanks
Alex

For what it's worth, my gut is that this isn't the cause. It's hard to take a long time doing anything with a mere 1.1MB of bytecode; I've never noticed dex taking any significant time. But let's assume it is the issue for sake of argument.
If it matters enough, you could probably slice up the Guava .jar to remove whole packages you don't use. It is composed of several pieces that aren't necessarily all inter-related.
I don't think this is going to speed things up, but maybe worth mentioning: if you run the build through Proguard (the optimizer now bundled with the SDK), it can remove unused classes before you get to DEX (and, do a bunch of other great optimization on the byte code). But of course that process probably takes longer itself than dex-ing.

Related

Why is it important that Java (and other JVM languages) is highly portable? [duplicate]

I've been thinking about it lately, and it seems to me that most advantages given to JIT compilation should more or less be attributed to the intermediate format instead, and that jitting in itself is not much of a good way to generate code.
So these are the main pro-JIT compilation arguments I usually hear:
Just-in-time compilation allows for greater portability. Isn't that attributable to the intermediate format? I mean, nothing keeps you from compiling your virtual bytecode into native bytecode once you've got it on your machine. Portability is an issue in the 'distribution' phase, not during the 'running' phase.
Okay, then what about generating code at runtime? Well, the same applies. Nothing keeps you from integrating a just-in-time compiler for a real just-in-time need into your native program.
But the runtime compiles it to native code just once anyways, and stores the resulting executable in some sort of cache somewhere on your hard drive. Yeah, sure. But it's optimized your program under time constraints, and it's not making it better from there on. See the next paragraph.
It's not like ahead-of-time compilation had no advantages either. Just-in-time compilation has time constraints: you can't keep the end user waiting forever while your program launches, so it has a tradeoff to do somewhere. Most of the time they just optimize less. A friend of mine had profiling evidence that inlining functions and unrolling loops "manually" (obfuscating source code in the process) had a positive impact on performance on his C# number-crunching program; doing the same on my side, with my C program filling the same task, yielded no positive results, and I believe this is due to the extensive transformations my compiler was allowed to make.
And yet we're surrounded by jitted programs. C# and Java are everywhere, Python scripts can compile to some sort of bytecode, and I'm sure a whole bunch of other programming languages do the same. There must be a good reason that I'm missing. So what makes just-in-time compilation so superior to ahead-of-time compilation?
EDIT To clear some confusion, maybe it would be important to state that I'm all for an intermediate representation of executables. This has a lot of advantages (and really, most arguments for just-in-time compilation are actually arguments for an intermediate representation). My question is about how they should be compiled to native code.
Most runtimes (or compilers for that matter) will prefer to either compile them just-in-time or ahead-of-time. As ahead-of-time compilation looks like a better alternative to me because the compiler has more time to perform optimizations, I'm wondering why Microsoft, Sun and all the others are going the other way around. I'm kind of dubious about profiling-related optimizations, as my experience with just-in-time compiled programs displayed poor basic optimizations.
I used an example with C code only because I needed an example of ahead-of-time compilation versus just-in-time compilation. The fact that C code wasn't emitted to an intermediate representation is irrelevant to the situation, as I just needed to show that ahead-of-time compilation can yield better immediate results.

Greater portability: The
deliverable (byte-code) stays
portable
At the same time, more platform-specific: Because the
JIT-compilation takes place on the
same system that the code runs, it
can be very, very fine-tuned for
that particular system. If you do
ahead-of-time compilation (and still
want to ship the same package to
everyone), you have to compromise.
Improvements in compiler technology can have an impact on
existing programs. A better C
compiler does not help you at all
with programs already deployed. A
better JIT-compiler will improve the
performance of existing programs.
The Java code you wrote ten years ago will run faster today.
Adapting to run-time metrics. A JIT-compiler can not only look at
the code and the target system, but
also at how the code is used. It can
instrument the running code, and
make decisions about how to optimize
according to, for example, what
values the method parameters usually
happen to have.
You are right that JIT adds to start-up cost, and so there is a time-constraint for it,
whereas ahead-of-time compilation can take all the time that it wants. This makes it
more appropriate for server-type applications, where start-up time is not so important
and a "warm-up phase" before the code gets really fast is acceptable.
I suppose it would be possible to store the result of a JIT compilation somewhere, so that it could be re-used the next time. That would give you "ahead-of-time" compilation for the second program run. Maybe the clever folks at Sun and Microsoft are of the opinion that a fresh JIT is already good enough and the extra complexity is not worth the trouble.

The ngen tool page spilled the beans (or at least provided a good comparison of native images versus JIT-compiled images). Executables that are compiled ahead-of-time typically have the following benefits:
Native images load faster because they don't have much startup activities, and require a static amount of fewer memory (the memory required by the JIT compiler);
Native images can share library code, while JIT-compiled images cannot.
Just-in-time compiled executables typically have the upper hand in these cases:
Native images are larger than their bytecode counterpart;
Native images must be regenerated whenever the original assembly or one of its dependencies is modified.
The need to regenerate an image that is ahead-of-time compiled every time one of its components is a huge disadvantage for native images. On the other hand, the fact that JIT-compiled images can't share library code can cause a serious memory hit. The operating system can load any native library at one physical location and share the immutable parts of it with every process that wants to use it, leading to significant memory savings, especially with system frameworks that virtually every program uses. (I imagine that this is somewhat offset by the fact that JIT-compiled programs only compile what they actually use.)
The general consideration of Microsoft on the matter is that large applications typically benefit from being compiled ahead-of-time, while small ones generally don't.

Simple logic tell us that compiling huge MS Office size program even from byte-codes will simply take too much time. You'll end up with huge starting time and that will scare anyone off your product. Sure, you can precompile during installation but this also has consequences.
Another reason is that not all parts of application will be used. JIT will compile only those parts that user care about, leaving potentially 80% of code untouched, saving time and memory.
And finally, JIT compilation can apply optimizations that normal compilators can't. Like inlining virtual methods or parts of the methods with trace trees. Which, in theory, can make them faster.

Better reflection support. This could be done in principle in an ahead-of-time compiled program, but it almost never seems to happen in practice.
Optimizations that can often only be figured out by observing the program dynamically. For example, inlining virtual functions, escape analysis to turn stack allocations into heap allocations, and lock coarsening.

Maybe it has to do with the modern approach to programming. You know, many years ago you would write your program on a sheet of paper, some other people would transform it into a stack of punched cards and feed into THE computer, and tomorrow morning you would get a crash dump on a roll of paper weighing half a pound. All that forced you to think a lot before writing the first line of code.
Those days are long gone. When using a scripting language such as PHP or JavaScript, you can test any change immediately. That's not the case with Java, though appservers give you hot deployment. So it is just very handy that Java programs can be compiled fast, as bytecode compilers are pretty straightforward.
But, there is no such thing as JIT-only languages. Ahead-of-time compilers have been available for Java for quite some time, and more recently Mono introduced it to CLR. In fact, MonoTouch is possible at all because of AOT compilation, as non-native apps are prohibited in Apple's app store.

I have been trying to understand this as well because I saw that Google is moving towards replacing their Dalvik Virtual Machine (essentially another Java Virtual Machine like HotSpot) with Android Run Time (ART), which is a AOT compiler, but Java usually uses HotSpot, which is a JIT compiler. Apparently, ARM is ~ 2x faster than Dalvik... so I thought to myself "why doesn't Java use AOT as well?".
Anyways, from what I can gather, the main difference is that JIT uses adaptive optimization during run time, which (for example) allows ONLY those parts of the bytecode that are being executed frequently to be compiled into native code; whereas AOT compiles the entire source code into native code, and code of a lesser amount runs faster than code of a greater amount.
I have to imagine that most Android apps are composed of a small amount of code, so on average it makes more sense to compile the entire source code to native code AOT and avoid the overhead associated from interpretation / optimization.

It seems that this idea has been implemented in Dart language:
https://hackernoon.com/why-flutter-uses-dart-dd635a054ebf
JIT compilation is used during development, using a compiler that is especially fast. Then, when an app is ready for release, it is compiled AOT. Consequently, with the help of advanced tooling and compilers, Dart can deliver the best of both worlds: extremely fast development cycles, and fast execution and startup times.

One advantage of JIT which I don't see listed here is the ability to inline/optimize across separate assemblies/dlls/jars (for simplicity I'm just going to use "assemblies" from here on out).
If your application references assemblies which might change after install (e. g. pre-installed libraries, framework libraries, plugins), then a "compile-on-install" model must refrain from inlining methods across assembly boundaries. Otherwise, when the referenced assembly is updated we would have to find all such inlined bits of code in referencing assemblies on the system and replace them with the updated code.
In a JIT model, we can freely inline across assemblies because we only care about generating valid machine code for a single run during which the underlying code isn't changing.

The difference between platform-browser-dynamic and platform-browser is the way your angular app will be compiled.
Using the dynamic platform makes angular sending the Just-in-Time compiler to the front-end as well as your application. Which means your application is being compiled on client-side.
On the other hand, using platform-browser leads to an Ahead-of-Time pre-compiled version of your application being sent to the browser. Which usually means a significantly smaller package being sent to the browser.
The angular2-documentation for bootstrapping at https://angular.io/docs/ts/latest/guide/ngmodule.html#!#bootstrap explains it in more detail.

Why size of apk is so low?

I have an android project with total size 280 MB. But when I build it, the apk created is just 6 Mb. How is such a compression factor achieved. I have many third party libraries and resources in my project.

The most direct answer is that you are compiling an APK, not compressing it. You are changing the nature of the content, not making the content smaller.
Lots of things happen during a compile that can reduce the size of the pre-compiled content.
Here is an explanation of the compile process for C++ here on SO that is pretty good. APK compilation is different but the ideas are largely the same.
How does the compilation/linking process work?
here is one that is more specific to the APK compile process. I'm not familiar enough with that process to know how accurate the article is but it seems decent.
http://www.alittlemadness.com/2010/06/07/understanding-the-android-build-process/

What is the purpose of bytecode in Java?

Given that I can compile 300 classes in seconds, an implementation of Java could just be given Java source files instead of bytecode as an input, then compile and cache the input source code, and never compile it again (e.g python does this, and lots of language implementations do the same except don't even bother to cache):
This initial compilation experience would be equivalent to the installation process that users are already used to
This would remove the need for implementing the non trivial task of verification in the bytecode interpreter (which really is just reimplementing parts of the compile time checks), reducing implementation complexity.
Java as it currently is, verifies the input bytecode every time it starts, even if it already verified it before. Point 2 would of course reduce startup times because it eliminates this step (although the current Java platform could also cache the "checked" status somewhere to reduce startup times, I'm not sure if it does this)
This would allow implementations to compile however they want (or not at all), e.g for performance. Android doesn't even use Java bytecode, it uses dalvik bytecode, because they claim it's more suitable for their needs (e.g more performant on their hardware). If bytecode didn't exist, this design decision made by Google would have been completely transparent.
This would promote open source
This answers why distribute bytecode instead of native code, but to be clear, I'm wondering why even have a compiled format for distribution at all? Assuming that compilation is important, why not just have the runtime compile source and cache it?
The only remaining rationale I can come up with is for obfuscation, but...
the way current compilers compile, the code can be mechanically decompiled pretty accurately
source code can be obfuscated too
...so this point is reduced to that intuition would say that bytecode is more complicated than source code, thus having a bytecode distribution format allows to trick businessmen into thinking their IP is protected (i.e bytecode would be to "add value", but there is no technical reason for it).
Why is the Java platform designed for distributing bytecode to the user, as opposed to distributing source code to the user? I could not find an explanation of this anywhere on the internet. Is there a big reason I am missing here?
If you give a reason, you should probably state whether it's a reason that the language designers initially had, or a reason that is still valid today.

You are thinking just inside your little world. There are some compelling reasons to compile the source and deliver bytecode instead:
Download time (Applets were supposed to become a widely accepted web technology) - the user has no need for the source, so why retain the source? Reducing the amount of information transmitted means faster downloads.
Reduces startup time. Compiling on every run, takes additional time. If you can compile 300 classes a second that would mean an additional 5-10 Seconds startup time on the JRE alone nowadays. And machines were a little slower in 1995, you know.
Java is aimed at a multitude of platforms. Some platforms are not as powerful as your PC. Think of embedded and mobile devices. They may have neither the storage nor the power to compile code.
Bytecode allows any language to be compiled to bytecode - not just Java. There are plenty of other languages that can compile to bytecode. Would you prefer to install a new compiler for each of them?
Companies are often reluctant to give "the source" out of their hands. Java would have more acceptance problems if programs were to be delivered "in source".
Bytecode is a form of machine code simple enough to execute in hardware directly (There are a few embedded designs that have partial native bytecode support).
I'm sure there are more pros for bytecode I haven't even though about.

99% vs 1% in terms of code compilation or build

This is in terms of code compilation and nothing else.. :)
So, I am a newbie in my company and predictably got stuck with an awesomely slow computer. And I am having a big problem with my Netbeans running out of memory/resource every time I make a build. I am compiling my JAVA files.
I was using 7.0, and even though I was getting this error, I got by it by compiling the source packages in chunks. (sometimes I had to compile the selected ones more than once)
But ever since I moved to 7.2, this problem is getting worse. I have to now compile the packages in even more smaller chunks. Sometimes package by package and file by file. Hence costing me a lot of time and even lot of hair.
I have no idea which packages to compile first. The netbeans was taking care of that. Therefore, taking resources.
Most of my colleagues have powerful computers and have no problem building the whole source base. So, I started getting the complied packages and only building the required ones.
So, is this the correct approach or building the whole source (even though I just make changes to 1% of the total code base, at any given time)?
Almost everyone in this company is building the whole code base, at least once, even though most of the changes are only in 1%.

It is far better to build the entire project and have it work as designed, then build 99% of it and it doesn't work. There's no indication that the 1% is critical or non-critical code, and as a beginner, you can't tell that just right off the bat.
I would inform your teammates/IT personnel about the slow build and ask what can be done to resolve it, instead of building the code in chunks.

Maybe you should highlight the issues with a developer having a slow machine impeding the work you are doing, when you explain the difference in lost productivity versus hardware cost, you will shortly have a new machine.
Then you can stop worrying about building "99%" and get on to real issues.

It's better to build the entire project. Try tune netbeans.conf
netbeans_default_options="-J-client -J-Xss4m -J-Xms128m -J-XX:PermSize=128m -J-XX:MaxPermSize=512m -J-Dapple.laf.useScreenMenuBar=true -J-Dapple.awt.graphics.UseQuartz=true -J-Dsun.java2d.noddraw=true-J-XX:+UseParNewGC -J-XX:+UseConcMarkSweepGC -J-XX:+CMSClassUnloadingEnabled -J-XX:+CMSPermGenSweepingEnabled"

So, is this the correct approach or building the whole source (even
though I just make changes to 1% of the total code base, at any given
time)?
I think that you can build only some parts of project only if you perfectly know all internal dependencies and can guaranty that no unexpected behaviour in nearby module happens after your modifications were made. It is my opinion. Moreover, you can change code and compile it succesful, but the entire project build can fail the same.
P.S. You should get company to buy you a new computer.

In theory you could walk through all dependencies and make yourself a dependency hierarchy map, and you should only have to compile the code you've changed plus everything that depends on it. However it's not necessarily 100% foolproof and requires A LOT of effort for very little gain. It's not something that I would expect to be a newbies responsibility to sort out, rather your superiors should get you sorted out with some appropriate kit.

How can I profile Java memory usage without an instumented JVM (Java 1.1.8)

I am currently trying to determine the cause of high memory usage in a Java application running on an exotic platform where I know of no instrumented JVM.
I have the source to the application, and can make changes to the source for the purposes of testing.
How can I debug memory usage under these conditions?
If more info is needed, I'll be happy to provide. I'm just a little lost trying to use such an old jvm without much tooling to speak of.

If I were in your shoes I would approach it with:
Find the functional areas you know
need attention.
Make backup copy of code
Start inserting print statements
with start and end times
See what takes a lot of time and
narrow it down.

For Java 5 and later this can be done using Java agents. For earlier versions - including 1.1.8 - you must load native agents to do this. If you cannot instrument your code, you must do the work needed yourself.
One approach to get most of the way is to use a Java 1.1 compatible version of log4j which allows you to essentially write out strings prepended with a timestamp. This can then be massaged afterwards into extracting answers to whatever you want to know.

If you need memory profiling - and I'd recommend against this - you could start serializing objects out to disk, then measuring disk size as a rough estimate of memory size.
If you really want to dig into where you're usually not supposed to be, try the sun.misc package, although I don't know how much of that was around in 1.1.x.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.