I have an AWS S3 bucket with the following structure/hierarchy:
customer/name/firstname/123.gz
customer/name/firstname/456.gz
customer/name/firstname/789.gz
I need to get the count of all the .gz files under customer/name/firstname using the Java SDK.
Could I please have Java code showing how to do it?
There are several ways to get list of files from S3. This is one of them:
/**
* @param bucketName bucket name (i.e. customer)
* @param path path within the given bucket (i.e. name/firstname)
* @param pattern pattern that matches the required files (i.e. "\\w+\\.gz")
*/
private List<String> getFileList(String bucketName, String path, Pattern pattern) throws AmazonS3Exception {
ListObjectsV2Request request = createRequest(bucketName, path);
return s3.listObjectsV2(request).getObjectSummaries().stream()
.map(file -> FilenameUtils.getName(file.getKey()))
.filter(fileName -> pattern.matcher(fileName).matches())
.sorted()
.collect(Collectors.toList());
}
private static ListObjectsV2Request createRequest(String bucketName, String path) {
ListObjectsV2Request request = new ListObjectsV2Request();
request.setPrefix(path);
request.withBucketName(bucketName);
return request;
}
P.S. I suppose that you already have S3 credentials in your home directory and successfully initialized AmazonS3 s3 instance.
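Note that a single listObjectsV2 call returns at most 1,000 keys, so if all you need is the count it is worth paging through the results with the continuation token. A minimal sketch under that assumption (reusing the already initialized AmazonS3 s3 instance; the method name is just illustrative):
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
// Counts the .gz objects under the given prefix, paging through all result batches.
private long countGzFiles(AmazonS3 s3, String bucketName, String path) {
ListObjectsV2Request request = new ListObjectsV2Request()
.withBucketName(bucketName)
.withPrefix(path);
long count = 0;
ListObjectsV2Result result;
do {
result = s3.listObjectsV2(request);
count += result.getObjectSummaries().stream()
.filter(summary -> summary.getKey().endsWith(".gz"))
.count();
request.setContinuationToken(result.getNextContinuationToken());
} while (result.isTruncated());
return count;
}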
Related
I want to archive all the files and subdirectories in an S3 directory to some other S3 location using Java. Is there any direct way to copy one S3 directory to another in Java or Scala?
There is no API call to operate on whole directories in Amazon S3.
In fact, directories/folders do not exist in Amazon S3. Rather, each object stores the full path in its filename (Key).
If you wish to copy multiple objects that have the same prefix in their Key, your code will need to loop through the objects, copying one object at a time.
A bit wordy, but does the job: reasonable logging, multithreading via TransferManager, handling continuation token for "folders" with more than 1000 keys:
/**
* Copies all content from s3://sourceBucketName/sourceFolder to s3://destinationBucketName/destinationFolder.
*/
public void copyAll(String sourceBucketName, String sourceFolder, String destinationBucketName, String destinationFolder) {
log.info("Copying data from s3://{}/{} to s3://{}/{}", sourceBucketName, sourceFolder, destinationBucketName, destinationFolder);
TransferManager transferManager = TransferManagerBuilder.standard()
.withS3Client(client)
.build();
try {
ListObjectsV2Request request = new ListObjectsV2Request()
.withBucketName(sourceBucketName)
.withPrefix(sourceFolder);
ListObjectsV2Result objects;
do {
objects = client.listObjectsV2(request);
List<Copy> transfers = new ArrayList<>();
for (S3ObjectSummary object : objects.getObjectSummaries()) {
String sourceKey = object.getKey();
String sourceRelativeKey = sourceKey.substring(sourceFolder.length());
String destinationKey = destinationFolder + sourceRelativeKey;
transfers.add(transferManager.copy(sourceBucketName, sourceKey, destinationBucketName, destinationKey));
}
for (Copy transfer : transfers) {
log.debug(transfer.getDescription());
transfer.waitForCompletion();
}
log.info("Copied batch of {} objects. Last object: {}", transfers.size(), transfers.isEmpty() ? "None" : transfers.get(transfers.size() - 1).getDescription());
request.setContinuationToken(objects.getNextContinuationToken());
} while (objects.isTruncated());
log.info("Copy operation completed successfully from s3://{}/{} to s3://{}/{}", sourceBucketName, sourceFolder, destinationBucketName, destinationFolder);
} catch (InterruptedException e) {
// Resetting interrupt flag and returning control to the caller.
Thread.currentThread().interrupt();
throw new RuntimeException(e);
} finally {
transferManager.shutdownNow(false);
}
}
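A hypothetical usage example (assuming copier is an instance of the class that contains copyAll and that its client field is an initialized AmazonS3):
// Copy everything under reports/2020/ in source-bucket to archive/reports/2020/ in archive-bucket.
copier.copyAll("source-bucket", "reports/2020/", "archive-bucket", "archive/reports/2020/");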
I want to change all files in a folder on GCP (Cloud Storage) to be publicly shared.
I see how to do this via gsutil.
How can I do this via the Java API?
Here is my try:
public static void main(String[] args) throws Exception {
//// more setting up code here...
GoogleCredential credential = GoogleCredential.fromStream(credentialsStream, httpTransport, jsonFactory);
credential = credential.createScoped(StorageScopes.all());
final Storage storage = new Storage.Builder(httpTransport, jsonFactory, credential)
.setApplicationName("monkeyduck")
.build();
final Storage.Objects.Get getRequest1 = storage.objects().get(bucketName, "sounds/1.0/arabic_test22/1000meters.mp3");
final StorageObject object1 = getRequest1.execute();
System.out.println(object1);
final List<ObjectAccessControl> aclList = new ArrayList<>();
// final ObjectAccessControl acl = new ObjectAccessControl()
// .setRole("PUBLIC-READER")
// .setProjectTeam(new ObjectAccessControl.ProjectTeam().setTeam("viewers"));
final ObjectAccessControl acl = new ObjectAccessControl()
.setRole("READER").setEntity("allUsers");
//System.out.println(acl);
aclList.add(acl);
object1.setAcl(aclList);
final Storage.Objects.Insert insertRequest = storage.objects().insert(bucketName, object1);
insertRequest.getMediaHttpUploader().setDirectUploadEnabled(true);
insertRequest.execute();
}
}
I get NPE because insertRequest.getMediaHttpUploader() == null
Instead of using objects().insert(), try using the ACL API:
ObjectAccessControl oac = new ObjectAccessControl();
oac.setEntity("allUsers");
oac.setRole("READER");
Insert insert = service.objectAccessControls().insert(bucketName, "sounds/1.0/arabic_test22/1000meters.mp3", oac);
insert.execute();
About the folder matter: in Cloud Storage the concept of a "folder" does not exist; there are only buckets and object names.
The fact that you can see files grouped inside folders (I'm talking about the Cloud Storage Browser) is only a graphical representation. With the API you will always handle a bucket and an object name.
Knowing this, Objects: list provides a prefix parameter which you can use to filter all the objects whose names start with it. If you think of the start of your object name as the folder, this filter can achieve what you're looking for.
From the documentation of the API, I quote:
In conjunction with the prefix filter, the use of the delimiter
parameter allows the list method to operate like a directory listing,
despite the object namespace being flat. For example, if delimiter
were set to "/", then listing objects from a bucket that contains the
objects "a/b", "a/c", "d", "e", "e/f" would return objects "d" and
"e", and prefixes "a/" and "e/".
I am trying to upload a file from a java class to aws S3.
I am using the exact code as given here
The only parts I changed are these:
private static String bucketName = "s3-us-west-2.amazonaws.com/<my-bucket-name>";
private static String keyName = "*** Provide key ***";
private static String uploadFileName = "/home/...<localpath>.../test123";
I am not sure what to add in Provide key. But even if I leave it this way, I get an error like this:
Error Message: The bucket is in this region: null.Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: PermanentRedirect; Request ID: *******)
HTTP Status Code: 301
AWS Error Code: PermanentRedirect
Error Type: Client
Instead of s3-us-west-2.amazonaws.com/<my-bucket-name> you should put <my-bucket-name>.
You need to specify the name of your bucket. keyName is the place inside your specified bucket where you want to store your file. It should be something like this:
private static String bucketName = "xyz";// xyz is my bucket name at s3
private static String keyName = "upload/test/"; // this is the location where the file will be stored
private static String uploadFileName = "/home/...<localpath>.../test123";
Ensure you have used the correct region.
Changing the region from us-west-2 to us-east-1 worked for me.
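For example, a minimal sketch of pinning the client to a region with the v1 SDK (replace the region with the one your bucket actually lives in):
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
// Build the client against the bucket's own region; a mismatch is what triggers
// the 301 PermanentRedirect error shown above.
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
.withRegion(Regions.US_WEST_2) // replace with your bucket's region
.build();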
I am trying to get the file size (content-length) using Amazon S3 JAVA sdk.
public Long getObjectSize(AmazonS3Client amazonS3Client, String bucket, String key)
throws IOException {
Long size = null;
S3Object object = null;
try {
object = amazonS3Client.getObject(bucket, key);
size = object.getObjectMetadata().getContentLength();
} finally {
if (object != null) {
//object.close();
}
}
return size;
}
1. With the close() call left commented out, this results in 50 calls (the connection pool size), after which I start getting connection pool errors.
2. If that line is uncommented, it takes a very long time to make the calls.
I followed this and this, but I'm not sure what I am doing wrong here.
Any help on this?
I'm guessing at what your actual question is, but I think you can reduce your code and eliminate the need to create an S3Object at all by doing something like:
public Long getObjectSize(AmazonS3Client amazonS3Client, String bucket, String key)
throws IOException {
return amazonS3Client.getObjectMetadata(bucket, key).getContentLength();
}
That should remove the need to call object.close() which you appear to be having issues with.
For v2 of the Amazon S3 Java SDK, try something like this:
HeadObjectRequest headObjectRequest =
HeadObjectRequest.builder()
.bucket(bucket)
.key(key)
.build();
HeadObjectResponse headObjectResponse =
s3Client.headObject(headObjectRequest);
Long contentLength = headObjectResponse.contentLength();
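If you don't already have a v2 client, a minimal sketch of building one (the region here is only an example):
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
// headObject() issues a HEAD request, so the object body is never downloaded.
S3Client s3Client = S3Client.builder()
.region(Region.US_EAST_1) // replace with your bucket's region
.build();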
So we have two SDKs.
For v1 of the Amazon S3 Java SDK, use the following:
client.getObjectMetadata(bucket, key).getContentLength();
where client is an instance of AmazonS3 coming from import com.amazonaws.services.s3.AmazonS3; and the Gradle dependency implementation 'com.amazonaws:aws-java-sdk-s3:1.12.353'.
For v2 of the Amazon S3 Java SDK, use the following:
return client.headObject(HeadObjectRequest.builder().bucket(bucket).key(key).build()).contentLength();
where client is an instance of S3Client coming from import software.amazon.awssdk.services.s3.S3Client; and the Gradle dependency implementation 'software.amazon.awssdk:s3:2.18.35'.
Is it possible to upload a txt/pdf/png file to Amazon S3 in a single action, and get the uploaded file URL as the response?
If so, is AWS Java SDK the right library that I need to add in my java struts2 web application?
Please suggest a solution for this.
No, you cannot get the URL in a single action, but you can in two :)
First of all, you may have to make the file public before uploading, because it makes no sense to get a URL that no one can access. You can do so by setting the ACL as Michael Astreiko suggested.
You can get the resource URL either by calling getResourceUrl or getUrl.
AmazonS3Client s3Client = (AmazonS3Client)AmazonS3ClientBuilder.defaultClient();
s3Client.putObject(new PutObjectRequest("your-bucket", "some-path/some-key.jpg", new File("somePath/someKey.jpg")).withCannedAcl(CannedAccessControlList.PublicRead));
s3Client.getResourceUrl("your-bucket", "some-path/some-key.jpg");
Note1:
The difference between getResourceUrl and getUrl is that getResourceUrl will return null when an exception occurs.
Note2:
The getUrl method is not defined in the AmazonS3 interface; you have to cast the object to AmazonS3Client if you use the standard builder.
You can work it out for yourself given the bucket and the file name you specify in the upload request.
e.g. if your bucket is mybucket and your file is named myfilename:
https://mybucket.s3.amazonaws.com/myfilename
The s3 bit will be different depending on which region your bucket is in. For example, I use the Southeast Asia region, so my URLs look like:
https://mybucket.s3-ap-southeast-1.amazonaws.com/myfilename
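As a rough sketch in Java (treat the host format as an assumption to adapt to your bucket's region):
// Build a virtual-hosted-style URL by hand; for a regional endpoint use e.g.
// "s3-ap-southeast-1.amazonaws.com" instead of "s3.amazonaws.com".
String url = String.format("https://%s.s3.amazonaws.com/%s", bucketName, keyName);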
For AWS SDK 2+
String key = "filePath";
String bucketName = "bucketName";
PutObjectResponse response = s3Client
.putObject(PutObjectRequest.builder().bucket(bucketName).key(key).build(), RequestBody.fromFile(file));
GetUrlRequest request = GetUrlRequest.builder().bucket(bucketName).key(key).build();
String url = s3Client.utilities().getUrl(request).toExternalForm();
@hussachai's and @Jeffrey Kemp's answers are pretty good, but they have one thing in common: the URL returned is virtual-hosted-style, not path-style. For more info on S3 URL styles, refer to AWS S3 URL Styles. In case some people want a path-style S3 URL generated, here's how. Basically everything is the same as in their answers, with only one client setting changed, as below:
AmazonS3Client s3Client = (AmazonS3Client) AmazonS3ClientBuilder.standard()
.withRegion("us-west-2")
.withCredentials(DefaultAWSCredentialsProviderChain.getInstance())
.withPathStyleAccessEnabled(true)
.build();
// Upload a file as a new object with ContentType and title specified.
PutObjectRequest request = new PutObjectRequest(bucketName, stringObjKeyName, fileToUpload);
s3Client.putObject(request);
URL s3Url = s3Client.getUrl(bucketName, stringObjKeyName);
logger.info("S3 url is " + s3Url.toExternalForm());
This will generate url like:
https://s3.us-west-2.amazonaws.com/mybucket/myfilename
Similarly, if you want the link through s3Client, you can use the following:
System.out.println("filelink: " + s3Client.getUrl("your_bucket_name", "your_file_key"));
A bit old, but still, for anyone stumbling upon this in the future:
you can do it with one line, assuming you have already set up the credentials provider and the AmazonS3Client.
It will look like this:
String ImageURL = String.valueOf(s3.getUrl(
ConstantsAWS3.BUCKET_NAME, //The S3 Bucket To Upload To
file.getName())); //The key for the uploaded object
And if you haven't set up the credentials provider and the AmazonS3Client yet, just add them before getting the URL, like this:
CognitoCachingCredentialsProvider credentialsProvider = new CognitoCachingCredentialsProvider(
getApplicationContext(),
"POOL_ID", // Identity pool ID
Regions.US_EAST_1 // Region
);
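Then the client can be created from that provider, for example (this follows the AWS Mobile SDK for Android pattern; the region is only an example):
// Create the S3 client from the Cognito credentials provider.
AmazonS3 s3 = new AmazonS3Client(credentialsProvider);
s3.setRegion(Region.getRegion(Regions.US_EAST_1)); // match your bucket's region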
The method below uploads a file into a particular folder in a bucket and returns the generated URL of the uploaded file (here uploadFolder is a field holding the target folder prefix):
private String uploadFileToS3Bucket(final String bucketName, final File file) {
final String uniqueFileName = uploadFolder + "/" + file.getName();
LOGGER.info("Uploading file with name= " + uniqueFileName);
final PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, uniqueFileName, file);
amazonS3.putObject(putObjectRequest);
return ((AmazonS3Client) amazonS3).getResourceUrl(bucketName, uniqueFileName);
}
If you're using the AWS SDK for JavaScript, the data object returned by upload() contains the object URL in data.Location:
const AWS = require('aws-sdk')
const s3 = new AWS.S3(config)
s3.upload(params).promise()
.then((data)=>{
console.log(data.Location)
})
.catch(err=>console.log(err))
System.out.println("Link : " + s3Object.getObjectContent().getHttpRequest().getURI());
With this code you can retrieve the link of already uploaded file to S3 bucket.
To make the file public before uploading, you can use the withCannedAcl method of PutObjectRequest:
myAmazonS3Client.putObject(new PutObjectRequest('some-grails-bucket', 'somePath/someKey.jpg', new File('/Users/ben/Desktop/photo.jpg')).withCannedAcl(CannedAccessControlList.PublicRead))
String url = myAmazonS3Client.getUrl('some-grails-bucket', 'somePath/someKey.jpg').toString();