I'm porting an old Java project to GAE. It has some servlets, which generate html pages with static images in them. In the original project these images are stored on the filesystem next to the servlets.
I tried GCS first: I uploaded my files and made them publicly readable. I can then reach the files through their public links and embed those links in the HTML output. But I have a feeling that this isn't the right solution. The load time seems quite slow, as if the images don't "travel internally", and I have to set the permission on every single image.
So my question is, how to get an "internal" URL for a file located on GCS in your GAE application?
I've found some Java examples, but in my case I don't think I need the image object in the source; I just need a URL to pass on to the HTML source.
As far as I know I could simply deploy the images with the source as resources, but there are quite a lot of them.
If there are other solutions, like Datastore, I'm open to those too, but I thought GCS would be the easiest.
Google Cloud Storage is as fast an option for loading images as any other. A browser reads a link and asks the server (in this case GCS) to deliver an image. There is no "internal" URL that can work faster - the speed reflects the bandwidth/distance between GCS and the browser which asked for an image.
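As a rough sketch of embedding the public links from a servlet, a small helper could build them from the bucket and object name. The `https://storage.googleapis.com/<bucket>/<object>` form is the usual public link; the class name and the bucket and object names in the usage note are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds public links for objects in a bucket (hypothetical helper, not part of any SDK). */
final class GcsLinks {
    private static final String HOST = "https://storage.googleapis.com/";

    /** Returns the public URL of an object, percent-encoding each path segment. */
    static String publicUrl(String bucket, String objectName) {
        StringBuilder url = new StringBuilder(HOST).append(bucket);
        for (String segment : objectName.split("/")) {
            url.append('/').append(encode(segment));
        }
        return url.toString();
    }

    private static String encode(String segment) {
        try {
            // URLEncoder targets form encoding, so turn its "+" back into "%20" for use in a path.
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

The servlet would then write something like `"<img src=\"" + GcsLinks.publicUrl("my-bucket", "img/logo small.png") + "\">"` into its output. The browser still fetches the image from GCS directly, exactly as described above.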
You can speed it up by using a CDN, where your image is stored on local servers throughout the world. It only makes sense if you serve content to a very large number of users, and it is a critical part of how fast a page loads.
Another way to speed up page load time is to use image sprites instead of individual images. This way you cut the number of requests from a browser to a server (i.e. GCS). If your images do not change frequently, and most pages need the same "collection" of images (i.e. they are not shown dynamically), this is a very good solution.
I am planning to develop an e-commerce application with a Tomcat server. Kindly suggest where I need to save images and how to serve them to the application.
You can save images in your project path,
you can save them on your local machine,
you can save them in the database too.
I prefer to save the image path in the database and then load the image into the application by reading it from that location.
There are many ways to read an image from a location.
You have to be familiar with file reading, BufferedImage, file writing, and Base64 encoding and decoding in Java, so that you can send an image in a string format.
You should start coding the application and then post the problems you run into.
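To illustrate the "send an image in a string format" idea, here is a minimal sketch that Base64-encodes image bytes into a data URI. It assumes Java 8 or later for `java.util.Base64`; the class name is made up, and the path would come from whatever column you store it in.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper that turns an image into a Base64 data URI. */
final class ImageDataUri {
    /** Encodes raw image bytes as a data URI usable in an img src attribute. */
    static String toDataUri(byte[] imageBytes, String mimeType) {
        String base64 = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + base64;
    }

    /** Reads the file at the given path (e.g. one loaded from the database) and encodes it. */
    static String toDataUri(Path imagePath, String mimeType) throws IOException {
        return toDataUri(Files.readAllBytes(imagePath), mimeType);
    }
}
```

Base64 makes the payload roughly a third larger than the raw file, so this is probably only worth it for small images.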
This question is highly opinion based, so here comes my opinion. If I were developing an e-commerce application, I would rather put my images on a CDN than keep them in my application, since e-commerce applications tend to have a lot of images and they can add a lot of weight when you package your application. So it's better to keep the images in a separate location (a CDN is recommended, since CDNs have their own caching) to make efficient use of them.
I am building an application where audio data is uploaded to my GAE server, processed, and displayed as a response to an HTTP GET request.
Part of the data I wish to display is in the format of a graph. What I am having a hard time understanding is how to create my response in such a way that I can include graphs.
From what I understand, one approach might be to create the graph using this API:
http://googleappsdeveloper.blogspot.ca/2011/09/visualize-your-data-charts-in-google.html
And then store it as a blob in my datastore. I can then create a JSP to serve the blob as an image? Not sure if I am understanding this correctly. Specifically, I'm not sure about being able to access all of this functionality from GAE, and if I'm doing this in a convoluted way.
I am quite new to GAE and web programming in general, so I greatly appreciate feedback and suggestions on how to do this in the simplest/quickest way. I wouldn't mind links to relevant resources as well.
You have two main ways to go:
1) Send only data in your response and let your front end (your website or app) parse it and render it as a graph.
You can write the data to show in your response, and it is a good idea to give it a structure (so that your front end can easily interpret and validate the data). Common formats are JSON and XML (they let you give a custom hierarchical structure to your data; for example, you can organize the graph data in column form).
The way to build a graph depends on the technology you use in your front end; you can either use a third-party library or build your own.
2) Create graphs in your web application, store them and allow users to get them via HTTP. Once you have found a way to build a graph image from data, you need to store it. GAE gives you two storage systems, the Blobstore and Google Cloud Storage.
I think you can save files in the Blobstore only by direct upload via HTTP, so if you are creating the image inside your GAE web app there is no easy way to use it (you would have to open an HTTP connection).
Google Cloud Storage, on the other hand, can be accessed with the dedicated libraries (https://developers.google.com/appengine/docs/java/googlecloudstorageclient/getstarted), which you need to download, add to your project during development, and activate. There are tutorials for this (https://developers.google.com/appengine/docs/java/googlecloudstorageclient/).
To serve images, you can bypass the intermediate code that would read the image from GCS and serve it as a response by using the Images service. Once you have generated a so-called "serving URL" for a given image, the Images service lets clients access the image directly via HTTP (https://developers.google.com/appengine/docs/java/images/).
Finally, the first option is interesting because (if you can use it) it is simpler and lighter on the server side (the side you pay for). With the second option you can still cache the images to avoid useless computation, and it is maybe more correct from a certain point of view, but it is more complex.
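For option 1, the data-only response could look roughly like the sketch below. It hand-builds a small JSON object with `x` and `y` columns; the class name and the column layout are assumptions for illustration, and in a real project a JSON library would probably be a better choice.

```java
/** Hypothetical serializer for graph data sent to the front end. */
final class GraphJson {
    /** Serializes paired x/y samples as {"x":[...],"y":[...]}. */
    static String toJson(double[] xs, double[] ys) {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("xs and ys must have the same length");
        }
        return "{\"x\":" + array(xs) + ",\"y\":" + array(ys) + "}";
    }

    // Note: NaN and infinite values are not valid JSON numbers and are not handled here.
    private static String array(double[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }
}
```

A servlet would set the content type to `application/json` and write this string to the response; the front end then draws the graph with whatever charting library it uses.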
I'll try to be short with the description of my situation:
I'm making a restaurant recommendation web site. I want users to be able to add a new restaurant and upload 1 picture of the restaurant (restaurant's profile picture). That picture will later be displayed when users search for the restaurants. Each time a new restaurant is added I want to create new folder for that restaurant and place the uploaded picture there.
I can upload an image and place on my file system, but when I try to display it it doesn't show.
<img src="\C:\glassfish4\glassfish\domains\domain1\applications\__internal\WebAppName\profilePicture.jpg" />
This is where I was originally placing the images, but they wouldn't show.
If I understood correctly from here, it is not good practice to reference images in this manner, because the browser will then look for that location on the user's machine.
I tried to place the images in:
C:\glassfish4\glassfish\domains\domain1\eclipseApps\WebAppName
because I figured that this is where WebContent folder is (please, correct me if I am wrong), and that this location is accessible with:
<img src="http://localhost:8080/WebAppName/..."/>
but that worked only for a short time. As soon as I redeployed my app the folders in which the pictures were placed had gone away (and they were created).
So my question(s) are:
How and where to place these images, and what should my src attribute look like in an html document (should it be like C:\... or http://localhost/...)?
What are conventions, practices for this, and how is this generally done?
And does redeployment have anything to do with my pictures being gone?
I found this post, but it did not solve my problem.
Note: - I am using glassfish4, and Java Servlets, JSP, JSTL/EL, and generally Java.
Thanks in advance!
And does redeployment have anything to do with my pictures being gone?
It does. When you redeployed your application, GAS removed the application directory at ${GAS_INST_DIR}\domains\${YOUR_DOMAIN_NAME}\application{YOUR_APP_BUNDLE_NAME}. Since you stored your images there, they were gone after the redeployment.
How and where to place these images
The most straightforward way is to put your files somewhere outside the application server folder. That could be a solution, but I would say just half a solution. Let's assume you store your pictures in a local folder /var/application/data. Later you decide to cluster your application. Now you are in trouble again: each instance has its own /var/application/data directory, and as a rule you do not know which node will handle a request for storing an image.
What are conventions, practices for this, and how is this generally done?
I would say it is you who decides which way to go according to the needs of your application. I will list the most obvious ways. All have their own strengths and weaknesses.
1) You can put the images in a local folder. The strong side is simplicity. Once you decide to cluster, you would have to rework this approach. If you go this way, the most general approach would be to create a servlet that loads your images; in this case your src= will point to the servlet. I did not find a good example right away, but I think this example will give you an idea of how to do it. The only thing I would suggest is using a finally block or, if you use JDK 1.7, try-with-resources for closing the stream. You will also need to pass the file name as a parameter to the servlet.
2) Store images in a database. It could be an RDBMS or NoSQL. The downside is that not all RDBMSs work efficiently with binary data. Again, src could point to a servlet that loads images from the DB. Here you should design your DB accordingly so you can retrieve images efficiently. I would not choose this approach, but this is just a personal opinion. I cannot say how efficient NoSQL databases are at storing binary data; you should test that yourself.
3) Consider WebDAV. In this case your src attribute will be a link to a resource on a WebDAV server. You can use it in a clustered environment, and the implementation is relatively simple.
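For the local-folder approach, the part of the servlet that maps the file-name parameter to a file could look roughly like this. It is only a sketch: the class name is made up, and the servlet wiring itself is left out so the snippet needs nothing beyond the JDK. The check against the base directory is my own addition, so that a parameter like `../../something` cannot escape the image folder.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper: resolves requested image names inside one base folder. */
final class ImageLocator {
    private final Path baseDir;

    ImageLocator(Path baseDir) {
        this.baseDir = baseDir.toAbsolutePath().normalize();
    }

    /** Returns the file for the requested name, or empty if it is missing or lies outside baseDir. */
    Optional<Path> locate(String fileName) {
        try {
            Path candidate = baseDir.resolve(fileName).normalize();
            if (!candidate.startsWith(baseDir) || !Files.isRegularFile(candidate)) {
                return Optional.empty();
            }
            return Optional.of(candidate);
        } catch (InvalidPathException e) {
            // The parameter contained characters that are not legal in a path.
            return Optional.empty();
        }
    }
}
```

In the servlet's doGet you would call `locate(request.getParameter("name"))`, send a 404 when the result is empty, and otherwise set the content type and copy the file with `Files.copy(path, response.getOutputStream())`. The parameter name `name` is just an example.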
and what should my src attribute look like in an html document (should it be like C:\... or http://localhost/...)?
It depends on the approach you choose. See items 1-3.
Hope that helps.
I want to use java.net.url to crawl some websites and retrieve some data.
I am confused about the following issues--
(1) Suppose I configure the crawler to visit a video sharing webpage, e.g. YouTube. Now the crawler is set to visit a specific YouTube video page. Does this mean that when the crawler actually visits that page, it will by default download all elements on that page, including the FLV video? Or can I control which files to retrieve? The aim is to minimise bandwidth utilisation on Google App Engine. Specifically, initially I want only the HTML web page itself to be retrieved, without retrieving images/videos/other attachments on that web page. Is this possible, either on Google App Engine or as part of a regular Java web app?
(2) What is the quick and easy way to know the exact bandwidth being utilised for visiting a single specific site? So that I can keep track of bandwidth utilisation?
Also keeping the above 2 issues in mind, do you recommend usage of java.net.url or low level API? Or do you think I should not stick with App Engine (and use for eg. AWS)?
(1) Your crawler will only load what the web-server responds for a specific URL, which normally is pure HTML. In case of YouTube, just right-click with your browser on a page and select View Source. That is what you'll download if you load the page automatically. No video, just text.
(2) When you read the content of the web page, just count the bytes you receive. That is your bandwidth.
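Both points can be seen in a small sketch with java.net.URL: it reads only the response body for the one URL you give it, and the length of the returned array is the number of body bytes received. The class name is made up, and note that this count does not include HTTP headers or the request itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

/** Hypothetical helper for a crawler: fetches one URL and nothing it links to. */
final class PageFetcher {
    /** Downloads the body served for this URL; images, videos and scripts are not followed. */
    static byte[] fetch(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                body.write(buffer, 0, read);
            }
            return body.toByteArray();
        }
    }
}
```

Usage would be along the lines of `byte[] html = PageFetcher.fetch(new URL(pageAddress)); int bytesUsed = html.length;`, where `pageAddress` is whatever page you are crawling.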
The requirement is to keep a server-side copy of a complete web page, exactly as it was rendered in the client browser, as a historical record. These records are revisited later.
We are trying to store the HTML of the rendered web page. The HTML is then rendered using resources such as JavaScript, CSS and images present on the server side. These resources keep changing, so old records are no longer rendered perfectly.
Is there any other way to solve this? We are also thinking of converting the page into a PDF using the iText or Apache FOP API, but they do not take the effect of JavaScript on the page into account during conversion. Are there any APIs available in Java to achieve this?
So far, no approach has worked perfectly. Please suggest one.
Edit:
In summary, the requirement is to create an exact copy of the rendered web page on the server side to store user activities on that page.
wkhtmltopdf should do this quite nicely for you. It will take a URL, and return a pdf.
code.google.com/p/wkhtmltopdf
Example:
wkhtmltopdf http://www.google.com google.pdf
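Since the question asks about Java, one possible way to call the tool from Java is through ProcessBuilder. This is only a sketch: it assumes the wkhtmltopdf binary is installed and on the PATH, and the class and method names are made up.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper that shells out to the wkhtmltopdf command-line tool. */
final class PdfSnapshot {
    /** Builds the same command line as the example above: wkhtmltopdf <url> <output>. */
    static List<String> command(String pageUrl, String outputFile) {
        return Arrays.asList("wkhtmltopdf", pageUrl, outputFile);
    }

    /** Runs the tool and returns its exit code (0 normally means success). */
    static int render(String pageUrl, String outputFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(pageUrl, outputFile))
                .inheritIO() // forward the tool's console output to this process
                .start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain characters such as `&`.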
Depending on just how sophisticated your javascript is, and depending on how faithfully you want to capture what the client saw, you may be undertaking an impossible task.
At a high level, you have the following options:
Keep a copy of everything you send to the client
Get the client to return back exactly whatever it has rendered
Build your system in such a way that you can actually fetch all historical versions of the constituent resources if/when you need to reproduce a browser's view.
You can do #1 using JSP filters etc, but it doesn't address issues like the javascript fetching dynamic html content during rendering on the client.
Getting the client to return what they are seeing (#2) is tricky, and bandwidth intensive.
So I would opt for #3. To make a website that renders dynamic content versioned, you have to do several things. First, all data sources need to be versioned too, so any query would need to specify the version. "Version" can be a timestamp or some generation counter that you maintain. If you take this approach, you also need to ensure that any JavaScript you feed to the client does not fetch external resources directly. Rather, it should ask your system for any resources, and your system would in turn fetch the external content (or reuse it from a cache).
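A minimal in-memory sketch of timestamp-based versioning might look like this. The class and method names are made up, and a real system would keep the history in a database or on disk rather than in a map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical store that keeps every version of a resource, keyed by timestamp. */
final class VersionedStore {
    // resource name -> (version timestamp -> content)
    private final Map<String, TreeMap<Long, String>> versions = new HashMap<>();

    /** Records the content of a resource as of the given timestamp. */
    void put(String name, long timestamp, String content) {
        versions.computeIfAbsent(name, k -> new TreeMap<>()).put(timestamp, content);
    }

    /** Returns the content that was current at the given timestamp, if any version existed then. */
    Optional<String> getAsOf(String name, long timestamp) {
        TreeMap<Long, String> history = versions.get(name);
        if (history == null) {
            return Optional.empty();
        }
        // floorEntry picks the latest version written at or before the timestamp.
        return Optional.ofNullable(history.floorEntry(timestamp)).map(Map.Entry::getValue);
    }
}
```

To reproduce a past view, every resource request for that page would carry the timestamp of the original visit and be answered with `getAsOf(name, thatTimestamp)`.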
The answer would depend on the server technology being used to write the HTML. Are you using Java/JSPs or Servlets or some sort of an HTTPResponse object to push the HTML/data to the browser?
If only the CSS/JS/HTML are changing, why don't you just take snapshots of your client-side codebase and store them as website versions?
If other data is involved (like XML/JSON) take a snapshot of those and version that as well. Then the snapshot of the client codebase as mentioned above with the contemporary snapshot of the data should together give you the exact rendering of your website as at that point of time.
A very resource-consuming requirement but...
You haven't said which application server and framework you are using. If you're generating responses in your own code, you can just store them while generating.
Another possibility is to write a filter that wraps the servlet's OutputStream and logs everything that is written to it; you just have to ensure your filter is at the top of the filter chain.
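The core of such a filter is a stream that copies everything written to it into a second stream. Here is a rough sketch of that piece only; the class name is made up, and hooking it into a servlet filter (through a response wrapper that hands out this stream) is not shown.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stream that forwards every write to the real output and to a copy. */
final class TeeOutputStream extends OutputStream {
    private final OutputStream primary;
    private final OutputStream copy;

    TeeOutputStream(OutputStream primary, OutputStream copy) {
        this.primary = primary;
        this.copy = copy;
    }

    @Override
    public void write(int b) throws IOException {
        primary.write(b);
        copy.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        primary.write(b, off, len);
        copy.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        primary.flush();
        copy.flush();
    }

    @Override
    public void close() throws IOException {
        try {
            primary.close();
        } finally {
            copy.close(); // close the copy even if closing the primary fails
        }
    }
}
```

The copy could be a ByteArrayOutputStream per request, which you store once the response is complete.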
Another solution, which is very powerful, the easiest to manage and generic, but possibly the most resource-consuming: write a transparent proxy server that sits between the user and the application server, forwards each call to the app server and returns the exact response, while saving each request and response.
If you're storing the html page, why not the references to the js, css, and images too?
I don't know what your implementation is now, but you should create a filesystem with all of the html pages and resources, and create references to the locations in a db. You should be backing up the resources in the filesystem every time you change them!
I use this implementation for an image archive. When a client passes us the url of an image we want to be able to go back and check out exactly what the image was at that time they sent it (since it's a url it can change at any time). I have a script that will download the image as soon as we receive the url, store it in the filesystem, and then store the path to the file in the db along with other various details. This is similar to what you need, just a couple more rows in your table for the js, css, images paths.
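The download step could look roughly like the sketch below. This is not the actual script described above, just an illustration: the class name and the `<timestamp>-<file name>` naming scheme are assumptions, and the returned path is what you would store in the DB row.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical archiver: saves a copy of a remote resource at the moment its URL is received. */
final class ResourceArchive {
    /** Downloads the resource and stores it as archiveDir/<receivedAtMillis>-<fileName>. */
    static Path archive(URL source, Path archiveDir, long receivedAtMillis) throws IOException {
        Files.createDirectories(archiveDir);
        String path = source.getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        if (fileName.isEmpty()) {
            fileName = "index"; // URL ended in a slash, so pick a placeholder name
        }
        Path target = archiveDir.resolve(receivedAtMillis + "-" + fileName);
        try (InputStream in = source.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

The same call works for js, css and image URLs, so one table with the original URL, the timestamp and the stored path may be enough.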