How to minify different JavaScript files at runtime using Java

I'm trying to build (or find an existing one I can use) a web filter that will compress a JavaScript file at runtime. I've tried building one based on YUICompressor, but I'm getting weird errors out of it when I try to pass a String-based source into it instead of an actual file.
Now I'm expecting to get bombarded with responses like 'real-time compression/minification is a bad idea', but there's a reason I don't want to do it at build time.
I've got a JavaScript web application that lazy-loads its JavaScript; it only loads what it actually needs. The JavaScript files can specify dependencies, and I already have a filter that concatenates the requested files, plus any dependencies not already loaded, into a single response. This means there is a large number of different combinations of JavaScript that could be sent to the user, which makes building all the bundles at build time impractical.
So, to restate: ideally I'm looking for an existing real-time JavaScript minification filter I can just plug into my app.
If one doesn't exist, I'm looking for tips on what I can use as building blocks. YUICompressor hasn't quite got me there, and Google Closure seems to only be a web API.
Cheers,
Peter

Take a look at The JavaScript Minifier from Douglas Crockford. The source is here: JSMin.java. It's not a filter and only contains the code to minify. We've made it into a filter where we combine and minify JavaScript on the fly as well. It works well, especially if browsers and a CDN cache the results.
Update: I left out that we cache them on the server too. They're only regenerated if any of the assets needed to produce the combined and minified output have changed. Basically, instead of "compiling" them at build time, we handle each combination once at runtime.
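For a rough idea of the shape of that, here's a minimal sketch of the caching half, assuming a JSMin.java port whose API is a JSMin(InputStream, OutputStream) constructor plus a jsmin() method (check that against whichever copy of JSMin.java you pick up; names vary between ports):

```java
// Minimal sketch: minify a combined JavaScript String with a JSMin.java port and
// cache the result per bundle key. The JSMin constructor/method used below are an
// assumption about one common port; adjust to the copy you actually use.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

public class MinifyCache {

    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    /** key = the requested file combination; source = the concatenated JavaScript. */
    public String minify(String key, String source) {
        return cache.computeIfAbsent(key, k -> {
            try {
                ByteArrayInputStream in =
                        new ByteArrayInputStream(source.getBytes(StandardCharsets.UTF_8));
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                new JSMin(in, out).jsmin();   // assumed API of the JSMin.java port
                return out.toString(StandardCharsets.UTF_8.name());
            } catch (Exception e) {
                return source;   // fall back to the unminified source rather than failing the request
            }
        });
    }
}
```

The invalidation part (regenerating when any of the underlying files change) is left out here; in practice you'd fold something like the files' last-modified times into the cache key.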

"I already have a filter that will concatenate the requested files and any dependencies not already loaded into a single response"
Sooo.. Why not just minify those prior to loading/concatenating them?
(And yes, compression on the fly is horribly expensive and definitely not worth doing for every .js file served up)

Related

Generate redirects from a list in Jakarta Tomcat

I'm pretty experienced in HTML, CSS, JavaScript, SQL, IIS, and a little Apache, but have essentially no knowledge of Java or Tomcat. I have a client with a low budget and a legacy web site based on a proprietary CMS (an ancient version of this) built on Jakarta Tomcat. Upgrading is not an option, and paying TMS to develop is also not an option. I'm much cheaper.
The URLs of the pages and documents on the site tend to be pretty long and not very meaningful to humans. For one reason or another, when they are doing promotions they want shorter URLs for particular content. For example, they may want http://{server}/naftir.html to redirect to http://{server}/cmspreview/content.jsp?id=com.tms.cms.section.Section_1013_sub_options.
I've solved this with a kludge: putting (for example) a naftir.html file in the root directory of the server and writing the redirect in there. But the {whatever}.html files are piling up, and it seems there should be a better solution, e.g. edit the 404 page to look in a list (or MySQL table) of short names and desired redirects, do the redirection if found, and otherwise display the 404. Or some other method based on a list of short names and redirect URLs rather than loads of files.
Any ideas?
You can easily configure a lot of this kind of stuff using a tool called urlrewrite. It's a Filter that you configure in WEB-INF/web.xml to run for all the URLs you want to re-map. Then, there is a "rewrite" configuration file where you can map specific incoming URLs to outgoing URLs. You can even use parametric replacements like mapping /foo/* to /a/b/c/d/*.
You can simplify the configuration by mapping the urlrewrite filter to all URLs, but then you lose a bit of performance on all the URLs that aren't ultimately rewritten.
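If adding a library to this legacy setup isn't appealing, the lookup-table idea from the question also maps onto a very small servlet Filter. A rough sketch follows; the redirects.properties name and location are made up, and the mapping format is just one key=target entry per line:

```java
// Rough sketch of a hand-rolled alternative: look the short name up in a properties
// file and redirect if it's there, otherwise let the request fall through to the
// normal handling (including the 404 page).
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ShortUrlFilter implements Filter {

    private final Properties redirects = new Properties();

    public void init(FilterConfig config) throws ServletException {
        // e.g. /naftir.html=/cmspreview/content.jsp?id=com.tms.cms.section.Section_1013_sub_options
        try (InputStream in = config.getServletContext()
                .getResourceAsStream("/WEB-INF/redirects.properties")) {
            if (in != null) {
                redirects.load(in);
            }
        } catch (IOException e) {
            throw new ServletException(e);
        }
    }

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        String target = redirects.getProperty(((HttpServletRequest) req).getServletPath());
        if (target != null) {
            ((HttpServletResponse) res).sendRedirect(target);
        } else {
            chain.doFilter(req, res);
        }
    }

    public void destroy() {}
}
```

Map the filter to /* (or just *.html) in web.xml, and each new short name becomes a one-line edit to the properties file instead of another file in the document root.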

Java crawler with custom file save ability

I'm looking for an open-source web crawler written in Java which, in addition to the usual web crawler features such as depth limits and multi-threading, has the ability to handle each file type in a custom way.
To be more precise, when a file is downloaded (or is about to be downloaded), I want to handle the saving of the file myself. HTML files should be saved in one repository, images in another location, and other files somewhere else. Also, the repository might not be just a simple file system.
I've heard a lot about Apache Nutch. Does it have the ability to do this? I'm looking to achieve this as simply and quickly as possible.
On the assumption that you want a lot of control over how the crawler works, I would recommend crawler4j. There are many examples, so you can get a quick glimpse of how things work.
You can easily handle resources based on their content type (take a look at the Page.java class; it's the class of the object that holds information about a fetched resource).
There are no limitations regarding the repository; you can use anything you wish.
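A rough sketch of the per-type routing, based on crawler4j's WebCrawler and Page classes; the Page accessor names here (getContentType(), getContentData(), getWebURL()) are from memory of one crawler4j release, and the store* methods are placeholders for whatever repositories you plug in:

```java
// Sketch of custom per-type handling in a crawler4j crawler. Verify the Page
// accessors against the crawler4j version you use; the store* methods are
// placeholders for your own repositories (file system, database, ...).
import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;

public class SortingCrawler extends WebCrawler {

    @Override
    public void visit(Page page) {
        String url = page.getWebURL().getURL();
        String contentType = page.getContentType();   // e.g. "text/html", "image/png"
        byte[] data = page.getContentData();

        if (contentType != null && contentType.startsWith("text/html")) {
            storeHtml(url, data);
        } else if (contentType != null && contentType.startsWith("image/")) {
            storeImage(url, data);
        } else {
            storeOther(url, data);
        }
    }

    private void storeHtml(String url, byte[] data)  { /* write to the HTML repository */ }
    private void storeImage(String url, byte[] data) { /* write to the image repository */ }
    private void storeOther(String url, byte[] data) { /* write anywhere else */ }
}
```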

AppEngine: only loading necessary jars

I'd like to include a template library for generating user emails. However, only a tiny percentage of requests require this jar. I'd like to kick off a task that will load the jar and then send my email in the background, returning to the user ASAP.
How can I defer loading the jar until it's required?
It's occurred to me to upload multiple versions of the app, one with the jar and other email utilities, and one without. I'd be sad to lose the way I currently use versions, though, which is to specify incremental improvements. Any other suggestions?
I don't think this is possible, I'm afraid. You could, however, use multiple versions of your app in the way you suggest and still use versions for incremental improvements: since you can use any string as the version "number" for your app, you could have "mail-1" and "1" as the two copies of version 1. You end up with half the number of versions available, but you can still use them for both purposes.

Is it possible to keep referenced files inside the service itself?

I'm working on a JAX-WS service in Eclipse. At some point, this service opens and uses a couple of XSLT stylesheets.
My question is, can you somehow import and keep these two files in the project itself, as you can with a library? For convenience's sake, I basically want my service to work as is, without the trouble of shipping the XSLTs alongside the service, placing them in particular locations on the server, explaining to people how and where they must go, etc.
On a related note, how come when I write new File("D:\x.xslt"), the service looks for it in "C:\Users\Tudor\Desktop\eclipseJ2EE\eclipse\D:\x.xslt"? As in, {eclipse_path}/{fileName}. I would have understood if it looked for the file in the root of the Apache Tomcat server, but not the install directory of Eclipse. Anyway, how do I change that behaviour?
You can store the XSLT files within your source classpath and load them via the ClassLoader.
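A minimal sketch of that, assuming the stylesheet is packaged under /xslt on the classpath (the resource path is made up; put it wherever suits your project):

```java
// Load an XSLT that is bundled on the classpath and build a Transformer from it.
// "/xslt/transform1.xslt" is a placeholder path; adjust it to where you put the file.
import java.io.InputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class XsltLoader {

    public Transformer loadTransformer() throws Exception {
        InputStream xslt = getClass().getResourceAsStream("/xslt/transform1.xslt");
        if (xslt == null) {
            throw new IllegalStateException("XSLT not found on the classpath");
        }
        return TransformerFactory.newInstance().newTransformer(new StreamSource(xslt));
    }
}
```

Because the stylesheet is loaded as a classpath resource, it travels inside the WAR with the rest of the service and nothing has to be copied to a particular directory on the server.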
If you are using Spring you can also use the ResourceLoader to load resources.
It's rarely a good idea to use File instances with relative paths directly, since the base directory often differs between server environments.
Hope this helps.

Java content APIs for a large number of files

Does anyone know of any Java libraries (open source) that provide features for handling a large number of files (read/write) on disk? I am talking about 2-4 million files (most of them PDFs and MS docs). It is not a good idea to store all the files in a single directory, and instead of re-inventing the wheel, I am hoping this has been done by many people already.
Features I am looking for:
1) Able to write/read files from disk
2) Able to create random directories/sub-directories for new files
3) Provide version/audit (optional)
I was looking at the JCR API and it looks promising, but it starts with a workspace, and I'm not sure what the performance will be when there are many nodes.
Edit: JCR does look pretty good. I'd suggest trying it out to see how it actually performs for your use case.
If you're running your system on Windows and noticed a horrible n^2 performance hit at some point, you're probably running up against the performance hit incurred by automatic 8.3 filename generation. Of course, you can disable 8.3 filename generation, but as you pointed out, it would still not be a good idea to store large numbers of files in a single directory.
One common strategy I've seen for handling large numbers of files is to create directories for the first n letters of the filename. For example, document.pdf would be stored in d/o/c/u/m/document.pdf. I don't recall ever seeing a library to do this in Java, but it seems pretty straightforward. If necessary, you can create a database to store the lookup table (mapping keys to the uniformly-distributed random filenames), so you won't have to rebuild your index every time you start up. If you want to get the benefit of automatic deduplication, you could hash each file's content and use that checksum as the filename (but you would also want to add a check so you don't accidentally discard a file whose checksum matches an existing file even though the contents are actually different).
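A sketch of that sharding scheme in plain Java, with the optional content-hash filename; the shard depth of 5 and the choice of SHA-1 are arbitrary:

```java
// Sketch of the "directories from the first n letters" layout described above,
// plus storing content under its checksum for cheap deduplication. Depth and
// hash algorithm are arbitrary choices; a real dedup check should also compare
// contents when a checksum already exists.
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public class ShardedStore {

    private final Path baseDir;

    public ShardedStore(Path baseDir) {
        this.baseDir = baseDir;
    }

    /** document.pdf -> <base>/d/o/c/u/m/document.pdf */
    public Path pathFor(String fileName) {
        Path dir = baseDir;
        int depth = Math.min(5, fileName.length());
        for (int i = 0; i < depth; i++) {
            dir = dir.resolve(String.valueOf(Character.toLowerCase(fileName.charAt(i))));
        }
        return dir.resolve(fileName);
    }

    /** Store content under its SHA-1 checksum so identical files collapse to one path. */
    public Path store(byte[] content) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(content)) {
            hex.append(String.format("%02x", b));
        }
        Path target = pathFor(hex.toString());
        Files.createDirectories(target.getParent());
        Files.write(target, content);
        return target;
    }
}
```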
Depending on the sizes of the files, you might also consider storing the files themselves in a database--if you do this, it would be trivial to add versioning, and you wouldn't necessarily have to create random filenames because you could reference them using an auto-generated primary key.
Combine the functionality in the java.io package with your own custom solution.
The java.io package can write and read files from disk and create arbitrary directories or sub-directories for new files. There is no external API required.
The versioning or auditing would have to be provided with your own custom solution. There are many ways to handle this, and you probably have a specific need that needs to be filled. Especially if you're concerned about the performance of an open-source API, it's likely that you will get the best result by simply coding a solution that specifically fits your needs.
It sounds like your module should scan all the files on startup and form an index of everything that's available. Based on the method used for sharing and indexing these files, it can rescan the files every so often or you can code it to receive a message from some central server when a new file or version is available. When someone requests a file or provides a new file, your module will know exactly how it is organized and exactly where to get or put the file within the directory tree.
It seems that it would be far easier to just engineer a solution specific to your needs.
