AWS - where to store image/documents for application use? - java

I have a J2EE application running on AWS.
Users need to upload an image or PDF to the application for internal use.
What is the right way to get or create a path on AWS to store the images?
The images/PDFs will not be exposed for anyone to download; they are only for the J2EE application.
I searched and found "buckets", but buckets seem to be exposed to the outside world for manual upload, so I am not sure this is the right way to go.

You can implement a file upload feature in your application (a page that the user can access) which streams the file into memory within the application (example here for a Spring web application). Once the image is in memory, you can store it in a secured AWS S3 bucket using the AWS SDK.
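Holding an upload in memory is only safe if its size is capped. As a minimal sketch of that step, assuming Java 11+ (for `InputStream.readNBytes(int)`), the hypothetical helper `UploadBuffer.readBounded` below reads the request stream into a byte array and rejects anything over a limit. The class name, method name, and the limit are illustrative, not part of any framework or of the AWS SDK; the resulting bytes are what you would hand to the SDK's upload call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: reads an uploaded stream into memory with a size cap. */
final class UploadBuffer {
    private UploadBuffer() {}

    /**
     * Returns the full content of the stream if it is at most maxBytes long.
     * Throws IOException if the upload is larger than maxBytes.
     */
    static byte[] readBounded(InputStream in, int maxBytes) throws IOException {
        if (maxBytes < 0 || maxBytes == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("maxBytes out of range: " + maxBytes);
        }
        // Ask for one byte more than the cap so an oversized upload is detectable.
        byte[] data = in.readNBytes(maxBytes + 1);
        if (data.length > maxBytes) {
            throw new IOException("Upload exceeds " + maxBytes + " bytes");
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the request body of an upload (10 bytes, 1 KiB cap).
        byte[] content = readBounded(new ByteArrayInputStream(new byte[10]), 1024);
        System.out.println("read " + content.length + " bytes");
    }
}
```

For large files it is usually better to stream directly to the storage service instead of buffering the whole file, so treat the cap as a guard for small documents and images.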

AWS offers multiple storage options, but the best fit here is an S3 bucket. By default an S3 bucket is private and not open to the outside world. You can manage permissions on the bucket so that only your application is authorized to upload files and view them. There are a couple of benefits to using S3 for file uploads:
Extremely high durability (designed for 99.999999999%)
High availability (designed for 99.99%)
High scalability
Effectively unlimited storage
Low cost, and even lower cost for archived data with lifecycle rules
Versioning
and so on.
Your application can also scale independently, without being limited by storage.

Related

Design Choices for storing images on Google App Engine

I'm developing an application that I'm thinking of hosting on GAE, hopefully within the free tier. It's an "own-time" project and I find the GAE documentation pretty incomprehensible when it comes to working out which products are available, how much they cost, and how they should be used.
The app relies on users being able to upload images with some meta data. The meta data needs to be searchable and allow the images to be displayed.
My problem comes with where to save the images. I'm storing the metadata in the DataStore. Google seem to imply that I should store the images in either Cloud Storage or the Blob Store, with a preference for Cloud Storage. This seems to be chargeable.
I also see mention of the Image Service - is this something else I should consider?
From Default Google Cloud Storage bucket:
Applications can use a Default Google Cloud Storage bucket, which has
free quota and doesn't require billing to be enabled for the app. You
create this free default bucket in the Google Cloud Platform Console
App Engine settings page for your project.
So you can use GCS without being charged as long as you don't exceed the free quota.
You also probably want to stick with the standard environment for its free quota. From App Engine Pricing:
App Engine applications run as instances within the standard
environment or the flexible environment.
Instances within the standard environment have access to a daily limit
of resource usage that is provided at no charge defined by a set of
quotas. Beyond that level, applications will incur charges as
outlined below. To control your application costs, you can set a
spending limit. To estimate costs for the standard environment,
use the pricing calculator.
For instances within the flexible environment, services and APIs are
priced as described below.
And for images-related options, from Overview of Images API for Java:
Java 8 on App Engine supports Java's native image manipulation
classes such as AWT and Java2D alongside the App Engine
Images API.

File storage for Java based web application

I'm in the process of designing a Java-based web application (Spring-based, to be specific). One of the key requirements is that the application has to accept many files of various formats (pdf, jpeg, dwg, png, etc.) uploaded by the user, and users must also be able to download them back to their local computer. There will be thousands of files being uploaded/downloaded.
I am thinking of two approaches:
Upload the documents to the same box where server is running. Mostly all the documents will be uploaded to, and downloaded from box where Tomcat is running. I'm worried that, as the documents grow in number, this may impact overall performance.
Upload/download documents to another server dedicated for storing/retrieving of documents.
If the 2nd approach is taken, how can a Spring application upload/download files to/from a remote server? And which approach is used in similar applications?
Or could you suggest any other optimal way of handling this requirement?
Thanks in advance.
Ganesh
Many modern applications built like this are going to use an external storage system like Amazon S3 to store these files, which buys you all kinds of nice features - high availability for downloads, an effectively unlimited pool of disk space, data replication, and so on.
There's a tutorial available for integrating spring with Amazon S3. You should check that out. Regardless of whether you choose S3 or something else, the approach will be similar.
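One way to keep the approach similar across backends is to hide the storage behind a small interface. The sketch below is an assumption for illustration: `FileStore` and `InMemoryFileStore` are hypothetical names, not part of Spring or the AWS SDK. An S3-backed or disk-backed class would implement the same two methods, and the controllers would depend only on the interface.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction: the web layer depends on this, not on a specific backend. */
interface FileStore {
    /** Stores the content under the given key, replacing any previous content. */
    void put(String key, byte[] content);

    /** Returns the stored content, or an empty Optional if the key is unknown. */
    Optional<byte[]> get(String key);
}

/** In-memory stand-in, mainly useful for tests and local development. */
final class InMemoryFileStore implements FileStore {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    @Override
    public void put(String key, byte[] content) {
        files.put(key, content.clone()); // defensive copy of the caller's array
    }

    @Override
    public Optional<byte[]> get(String key) {
        byte[] stored = files.get(key);
        return stored == null ? Optional.empty() : Optional.of(stored.clone());
    }
}
```

Usage would look like `store.put("docs/report.pdf", bytes)` on upload and `store.get("docs/report.pdf")` on download; swapping the backend then means changing only which implementation is wired in.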
Have you thought about using a DB? You could store those files as BLOBs. Here is a tutorial for this: link
"This tutorial walks you through the steps of building an interesting Spring MVC web application that allows the user to upload files from her computer to the server. The files are stored in database using Hibernate."
As to two approaches you consider:
Either way you will have to manage those files, back them up and check if there is enough space to store more files. Also this may cause some security issues as you accept all files.
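The security concern about accepting all files can be reduced with a basic check before anything is stored. The following is a minimal sketch, assuming Java 9+ and a whitelist built from the formats named in the question; `UploadValidator` and its methods are hypothetical names. An extension check says nothing about the actual file content, so it complements rather than replaces content inspection.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical pre-storage check for client-supplied file names. */
final class UploadValidator {
    // Assumed whitelist, based on the formats mentioned in the question.
    private static final Set<String> ALLOWED = Set.of("pdf", "jpeg", "jpg", "png", "dwg");

    private UploadValidator() {}

    /** Strips any directory part the client may have sent ("../x.png" becomes "x.png"). */
    static String baseName(String fileName) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        return fileName.substring(slash + 1);
    }

    /** Returns the lower-cased extension without the dot, or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Accepts only non-hidden names whose final extension is on the whitelist. */
    static boolean isAllowed(String fileName) {
        String name = baseName(fileName);
        return !name.isEmpty() && !name.startsWith(".") && ALLOWED.contains(extensionOf(name));
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("report.PDF"));
        System.out.println(isAllowed("shell.jsp"));
    }
}
```

Storing the file under `baseName(...)` (or under a generated identifier) rather than the raw client name also avoids path traversal through names such as `../../etc/passwd`.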
I recommend using a JCR (Java Content Repository) implementation such as Apache Jackrabbit, or a framework built on top of one such as Apache Sling.
Apache Sling™ is a framework for RESTful web applications based on an extensible content tree, with the following features:
Content resolution that maps a request URL to a content node in the content repository
Servlet resolution that maps a content node and a request method to a Servlet handling the request
Default servlets supporting WebDAV, content creation from web forms and JSON representation
A Javascript client library, allowing access to the content repository through AJAX
Support for server-side scripting with Javascript, JSP, Ruby, Velocity and Scala
OSGi-based extensibility through Apache Felix – the Felix Web Console was originally developed by the Apache Sling project

Document storage strategy for Java web apps

We have multiple web applications for different functional areas. There is no overlap of functionality between these apps and hence they are fairly independent. All these apps generate content like PDF and XML data. Currently all these apps are storing these documents in a path relative to their web root. The documents are accessed using url relative to the app specific web root.
Now we want to move to a design where these apps store the data/files in one central location and these documents can be accessed thru a URL outside of the specific application web root. Also we want these documents to be available even if the specific application is down.
We experimented with Apache Jackrabbit etc., but most of these are CMS tools that provide a lot more than what we want. We don't need full CMS capabilities since we don't really intend to do any web publishing, editing, etc. We just need a simple way for multiple apps to store files in one single location and later access them through a URL. Something like cloud storage, probably.
Are there any tools out there that could help us implement this? Or Design pattern?
We need
beans from multiple apps to be able to save files in one central location (we can't use fixed disk drive location)
common url based access to these resources
We use: Java web apps on Tomcat 7 using JSF/Myfaces
Use an Apache web server (or other web server). Save the files in a folder published through HTTP by the server. To save them you can use any protocol that allows file transfer (FTP, SCP...).
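As a rough sketch of this idea, assume the published folder is reachable from each app as a configurable filesystem path (for example a network mount) and that the web server maps a base URL to it. `SharedDocumentRoot`, the example host name, and the directory layout are all hypothetical. If the folder is only reachable over FTP or SCP, the `Files.write` call would be replaced by the corresponding client call, while the path check and URL mapping stay the same.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: writes into the folder a web server publishes and returns the public URL. */
final class SharedDocumentRoot {
    private final Path root;   // directory published by the web server (assumed to be mounted locally)
    private final URI baseUrl; // public URL mapped to that directory; must end with "/"

    SharedDocumentRoot(Path root, URI baseUrl) {
        this.root = root.toAbsolutePath().normalize();
        this.baseUrl = baseUrl;
    }

    /** Saves the content under root/relativeName and returns the URL it is served from. */
    URI save(String relativeName, byte[] content) throws IOException {
        Path target = root.resolve(relativeName).normalize();
        // Refuse names such as "../x" or absolute paths that would leave the published folder.
        if (!target.startsWith(root)) {
            throw new IllegalArgumentException("Path escapes the document root: " + relativeName);
        }
        Files.createDirectories(target.getParent());
        Files.write(target, content);
        // Assumes the name contains only URL-safe characters; otherwise it needs encoding first.
        String urlPath = root.relativize(target).toString().replace('\\', '/');
        return baseUrl.resolve(urlPath);
    }
}
```

With a base URL of `http://files.example.com/docs/`, saving `app1/report.pdf` yields `http://files.example.com/docs/app1/report.pdf`, and the file stays available even when the application that wrote it is down.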

Document management system using Google Cloud Storage

I am currently studying different aspects of Google App Engine, building some small apps and deploying them to the cloud. Recently, while installing the command-line tool for Cloud Storage (gsutil), I came across versioning support in Cloud Storage and was able to retrieve old or deleted objects through gsutil. So is building a document management system on GAE with Google Cloud Storage a good idea, or should I be using the Google Drive SDK?
Please guide me on this problem.
Thanks in advance
Completely different products for completely different use cases.
Google Cloud Storage is plain storage in the cloud, with no further abstractions. If you want to build a document management system from scratch, it is a reasonable choice of storage provider.
If you build an app on top of Google Drive, you inherit a file system abstraction, user management, a permissions model, and so on. But you don't own the users or their drives. Additionally, Drive's quota management is fine-tuned for per-user usage. Many people assume that creating a single Drive account and logically sharing it among their users at the application level will work. It's unlikely to scale due to the quota limitations.

Amazon access credentials in Android App

Amazon Web Services (AWS) provides a ready-to-use library for making calls to SDB, S3, SNS, etc. right from your Android app. This makes it really easy for a mobile developer who is not familiar with web services and web applications to create a completely scalable cloud-based app.
We pass the Amazon access credentials in these API calls to connect to our cloud account. My questions are:
How do I use key rotation effectively in the app? Since I would be distributing the app, a change of key could mean a period of disruption for existing users.
Would hard coding the Amazon Access Credentials inside the code (as a field Constant etc) make it vulnerable to extraction? Via decompiling etc.?
I talked to the Amazon Advocate for our region and he told that Amazon client library is not designed for such a purpose.
It could be used for in-house apps (not being published), like client-demo apps.
If you're bundling the credentials with an app to be published on the open market (not recommended), use IAM and create a separate credential with restricted access.
If you're building an app like Instagram, you may have to setup a web server to proxy your calls to Amazon (effectively making the client library useless).
Obviously, I was not very convinced. I think an entire client library to Amazon communication (bypassing the need for a webserver) could be a great advantage for Mobile devs.
Re:
Would hard coding the Amazon Access Credentials inside the code (as a field Constant etc) make it vulnerable to extraction? Via decompiling etc.?
Yes, by looking for strings and patterns in the binary. Also decompiling, but that'd often not be necessary.
The first question is, what sort of threats are you trying to protect against? Governments? Paid hackers? Or you just want to make it not easy to gain access other than via the app?
Limit the access the keys have to just the data that the app needs.
Store the keys in the app in several pieces. Modify them in some way (e.g. ROT47), then recombine them when sending to the service.
Don't put all of the key information into the app. Require use of another security device such as Amazon MFA.
Install monitoring to detect unusual patterns of access that could indicate access from outside of the app.
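To make the "several pieces plus ROT47" suggestion concrete, here is a minimal sketch. `Rot47` and `KeyParts` are hypothetical names and the stored strings are placeholders, not real credentials. This is obfuscation only: it hides the key from a plain string search of the binary, but anyone who decompiles the app can reverse it, so it does not replace restricted IAM credentials or monitoring.

```java
/** ROT47: rotates the 94 printable ASCII characters '!'..'~' by 47 positions. */
final class Rot47 {
    private Rot47() {}

    /** Applying this twice returns the original string, so it both encodes and decodes. */
    static String apply(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '!' && c <= '~') {
                out.append((char) ('!' + (c - '!' + 47) % 94));
            } else {
                out.append(c); // spaces and non-ASCII characters are left untouched
            }
        }
        return out.toString();
    }
}

/** Hypothetical holder: the key is stored as separately encoded placeholder fragments. */
final class KeyParts {
    // Placeholder fragments, produced ahead of time with Rot47.apply(...).
    private static final String PART_A = Rot47.apply("EXAMPLE-");
    private static final String PART_B = Rot47.apply("PLACEHOLDER");

    private KeyParts() {}

    /** Decodes and joins the fragments only at the moment the key is needed. */
    static String assemble() {
        return Rot47.apply(PART_A) + Rot47.apply(PART_B);
    }
}
```

In a real build the encoded fragments would be pasted in as literals rather than computed from the plain text at class-load time as done here for readability.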
