Newbie: query Amazon S3 database in Android client (Java)

I am new to the Amazon S3 service. I have an Amazon S3 database; the directory (bucket) structure looks like this:
-All buckets
  -MyCompany
    -MyProduct
      -Product_1
        -sub1_prod1
        -sub1_prod2
        ...
      -Product_2
        -sub2_prod1
        -sub2_prod2
        ...
As you can see above, under the MyProduct bucket I have several product buckets (e.g. Product_1), and under each product bucket I have several sub-products (e.g. sub1_prod1). Each sub-product contains multiple files.
Now I want to implement Java code in my Android client to query all my products under the MyProduct bucket. How can I do this? I am using the AmazonS3Client class provided by the Amazon Android SDK.
P.S.:
I am able to create my AmazonS3Client object using my credentials.
AmazonS3Client s3 = new AmazonS3Client(myCred);
I know how to upload files to an S3 bucket in Java code, but I am not sure how to query the S3 database and get the result in my Android client, that is, to get all the file names under each sub-product bucket.

I have an Amazon S3 database
IMHO, Amazon S3 is not a database, any more than a directory of files is a database. You may wish to consider other Amazon AWS services that are actual databases, such as DynamoDB or RDS.
that's to get all the file names under each sub_product bucket
By reading the documentation, it would appear that you will need to use some flavor of listObjects().
The brute-force approach would be to use the listObjects() that just takes the bucket name. That will give you a list of everything, and you would need to sort them into the tree structure yourself.
The less-brute-force approach would be to use the listObjects() that takes the bucket name and a prefix, or the listObjects() that takes a ListObjectsRequest parameter. To use filesystem terms, this will tell you the files and subdirectories in that directory. This way, you can download the pieces more easily. However, this may require a lot of HTTP requests.
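For illustration, here is a minimal sketch of the prefix-based flavor, assuming the bucket is named MyProduct and the object keys follow the layout in the question (both are assumptions, since S3 has no real directories, only key prefixes). On Android, remember to run this off the main thread.

import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

// s3 is the AmazonS3Client built from myCred, as in the question
ListObjectsRequest request = new ListObjectsRequest()
        .withBucketName("MyProduct")          // assumed bucket name
        .withPrefix("Product_1/sub1_prod1/"); // one sub-product "directory"

ObjectListing listing = s3.listObjects(request);
while (true) {
    for (S3ObjectSummary summary : listing.getObjectSummaries()) {
        String key = summary.getKey(); // e.g. "Product_1/sub1_prod1/file1.txt"
        // strip the prefix if you only want the bare file name
    }
    if (!listing.isTruncated()) break;        // results arrive in pages of up to 1000 keys
    listing = s3.listNextBatchOfObjects(listing);
}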

Related

Microservice architecture for uploading and downloading multiple files at once

I'm trying to create a new microservice with Spring Boot for uploading and downloading multiple files at once.
These files (PDF, XML, ZIP, TIFF, ...) can, based on some conditions, be stored in a storage service like S3 or in another kind of storage. This microservice has to implement the logic to work out where these files are, download them temporarily to a local folder, and then return them to the client application.
The goal is to hide the recovery logic and the type of storage where the files reside to the client applications.
Each of my business entities has several files associated with it, so for the upload API I was thinking of using a multipart request to send all the files of one entity together.
I would like to do the same for the download API: given the ID of an entity, the API has to return all the files associated with it.
I don't know the best way to achieve this goal.
I have seen that there is such a thing as a multipart response, but I don't know whether it is reliable.
Another idea is to download the files to a temporary shared folder and send the client application the list of paths where they are.
Another is to always download the files to a local (not shared) folder and send the client application the list of URLs it has to use to get them.
What do you think about it? Any other option?
Thanks for your help!
If you're looking to hide the fact that you're using S3 as the storage backing, my guess is that you're trying to either a) ensure you have the flexibility to change the storage backend at a later date, and/or b) put your own authentication in front of the upload/download to ensure users have permissions to read/write contents.
In either case, this sounds like a good job for a performant API gateway to ensure you maximize throughput. Instead of writing a custom service, you can write a configuration for something like Traefik that would a) authenticate requests, b) proxy the request to S3 directly, and c) rewrite the host and path to mask the usage of S3 as a storage backend. If you choose to use Traefik, take a look at the Routers section and the ReplacePathRegex middleware.
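As a rough illustration of that idea (a sketch, not a tested configuration: the path prefix and bucket name are placeholders), a Traefik dynamic configuration with a router, the ReplacePathRegex middleware, and an S3-pointing service might look like this:

http:
  routers:
    files:
      rule: "PathPrefix(`/files/`)"
      middlewares:
        - s3-path
      service: s3
  middlewares:
    s3-path:
      replacePathRegex:
        regex: "^/files/(.*)"
        replacement: "/my-bucket/$1"   # hypothetical bucket name
  services:
    s3:
      loadBalancer:
        passHostHeader: false          # send the S3 host upstream, not yours
        servers:
          - url: "https://s3.amazonaws.com"

An authentication middleware (e.g. forwardAuth) would be added to the same router to cover the permissions requirement.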

How to properly use S3 to deliver and store files in a web application?

So we are planning to move static content to S3 for operational reasons. I just want to understand where to place S3 in the workflow of handling a request.
If the website requires an image, should the request hit our service first, which would fetch the image from S3 (reverse proxy), or should the client request the file directly?
How to hide file names and path names, and manage permissions in requests for files?
The same questions apply to uploading new content.
How to handle the S3 quota and parallel requests?
I was going to comment, but this turned into a full answer instead...
Either. If your assets are public, the lowest-weight method is to just request them from a public S3 bucket. If they're not, though, it's probably easiest to use CloudFront rather than rolling your own auth around S3 requests.
You can make it look like your asset A.jpeg in S3.yourBucket/A.jpeg is at yourWebsite.com/A.jpeg using CloudFront. If you want to also obscure the filename A, you need to use e.g. API Gateway to serve you the file without revealing anything about it to your front end. If it were me, I wouldn't bother.
Unless you absolutely have to, don't let users upload to the same bucket that other users download from. There are several approaches to uploads, depending on the use case. Pre-signed URLs are good for one-time use. You can also just provide the user with AWS credentials that are allowed only to write to the upload bucket, by using Cognito.
There's no S3 quota. You get charged for reads and writes. For a simple site, these charges will be tiny. If you're worried, you can use CloudFront to rate-limit your users. You can also use API Gateway to create limits for individual users. S3 is extremely parallelizable.
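To make the pre-signed-URL option concrete, here is a minimal sketch with the AWS SDK for Java; the credentials, bucket, and key below are all placeholders:

import com.amazonaws.HttpMethod;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import java.net.URL;
import java.util.Date;

// a one-time upload URL for a dedicated upload bucket (names are hypothetical)
AmazonS3Client s3 = new AmazonS3Client(new BasicAWSCredentials("accessKey", "secretKey"));
Date expiration = new Date(System.currentTimeMillis() + 15 * 60 * 1000); // valid for 15 minutes
URL uploadUrl = s3.generatePresignedUrl("upload-bucket", "user123/photo.jpg",
        expiration, HttpMethod.PUT);
// hand uploadUrl to the client; an HTTP PUT to it writes the object
// without the client ever holding AWS credentials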

Upload content to AWS S3 and serve the content via CloudFront

I need to upload static content like images to AWS S3 and get a link back so that the image can be accessed through the CloudFront CDN (content delivery network). I am new to AWS; I read that the S3 bucket is linked to the CDN, and I believe it's all configuration based. From Java code I am able to upload to S3 and get the bucket-based URL back. How can I retrieve the CDN URL for the same uploaded image from Java code? Could you please help me out here?
The S3 bucket has to be manually linked to AWS CloudFront by creating a distribution in the CloudFront console.
Once a distribution is created, you can see a new domain mapped to the distribution. Using that domain name, you can build the CDN URL in your Java code.
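For example (the distribution domain below is a made-up placeholder; the real one is shown in the CloudFront console or returned by the CloudFront API):

// the object key is whatever was used for the S3 upload
String distributionDomain = "d1234abcd5678.cloudfront.net"; // hypothetical domain
String key = "images/photo.jpg";                            // hypothetical key
String cdnUrl = "https://" + distributionDomain + "/" + key;
// cdnUrl now serves the same object as the bucket-based URL, but via the CDN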

Java Spark accessing EMC object store through the S3 API

Can I get a reference for the APIs for Java + Spark SQL accessing an EMC object store via the S3 API? I tried many S3 APIs (the aws-java-sdk 1.7.4 jar) but got stuck on an error related to the bucket name, because my bucket name contains an underscore ("_"). My object store is on EMC, which allows bucket names with "_", but I want to access it from Spark SQL through the S3 API.
The trouble is that the S3A connectors all expect the bucket name to be a valid hostname, but "_" isn't allowed in DNS names.
AWS now forbids new buckets with underscores, and the people who maintain the S3 connectors for tools like Spark aren't going to do anything with bug reports about this other than close them as "wontfix".
Sorry, but you'll just have to rename your bucket.

S3 Bucket Signed URLs to grant access to pictures

I'm having a brainstorming issue on how to get user-uploaded pictures viewed only by the friends of the user who uploaded them.
So what I've come up with so far is:
Create a DynamoDB table for each user, with a dynamic list of friends/new friends added.
Generate a signed URL for every user-uploaded picture.
Allow access to the signed URL for every friend listed in the DynamoDB table to view said picture(s).
Does this sound correct? Also, would I technically have just one bucket for ALL user-uploaded pictures? Something about my design sounds off...
Can anyone give me a quick tutorial on how to accomplish this via Java?
There are two basic approaches:
Permissions in Amazon S3, or
Application-controlled access to objects in Amazon S3
Permissions in Amazon S3
You can provide credentials (either via IAM or Amazon Cognito) that allow users to access a particular path within an Amazon S3 bucket. For example, each user could have their own path within the bucket.
Your application would generate URLs that include signatures that identify them as that particular user and Amazon S3 would grant access to the objects.
One benefit of this approach is that you could provide the AWS credentials to the users and they could interact directly with AWS, such as using the AWS Command-Line Interface (CLI) to upload/download files without having to always go via your application.
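A minimal sketch of the Cognito flavor on Android (the identity pool ID and region are placeholders; the pool's IAM role is what actually restricts each user to their own path):

import android.content.Context;
import com.amazonaws.auth.CognitoCachingCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3Client;

// context is an Android Context; the pool ID below is hypothetical
CognitoCachingCredentialsProvider provider = new CognitoCachingCredentialsProvider(
        context,
        "us-east-1:00000000-0000-0000-0000-000000000000",
        Regions.US_EAST_1);
AmazonS3Client s3 = new AmazonS3Client(provider);
// s3 calls now run with the per-user credentials vended by Cognito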
Application-controlled access to objects in Amazon S3
In this scenario, users have no permissions within Amazon S3. Instead, each time your application wishes to generate a URL to an object in S3 (e.g. in an <img> tag), you create a pre-signed URL. This grants access to the object for a limited time. Generating the URL only takes a couple of lines of code and can be done within the application, without any communication with AWS.
There is no need to store pre-signed URLs. They are generated on-the-fly.
The benefit of this approach is that your application has full control over which objects they can access. Friends could share pictures with other users and the application would grant access, whereas the first method only grants access to objects within the user's specific path.
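To illustrate the "couple of lines of code", a minimal sketch with the AWS SDK for Java; the bucket, key, and lifetime are made up:

import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;
import java.net.URL;
import java.util.Date;

// s3 is an AmazonS3Client created with the application's own credentials
GeneratePresignedUrlRequest request =
        new GeneratePresignedUrlRequest("pictures-bucket", "alice/beach.jpg") // hypothetical names
                .withMethod(HttpMethod.GET)
                .withExpiration(new Date(System.currentTimeMillis() + 60 * 60 * 1000)); // 1 hour
URL url = s3.generatePresignedUrl(request);
// embed url.toString() as the <img> src; the link stops working after expiry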
