Java: How to print the contents of an S3 bucket

I am using JDK 11 and virtual-host-style access (AWS SDK for Java version 2) to create/access objects in an AWS S3 bucket, following:
https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/examples-s3-objects.html#list-object
While I was able to create objects in the designated bucket, I am not able to print the list of contents/objects in the bucket, although, as far as I can tell from the permissions, everyone is granted permission to view the objects in the bucket. The error message is:
software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404
This is how the S3Client is created:
adapterSmsS3Client = S3Client.builder()
        .region(Region.US_WEST_2)
        .credentialsProvider(StaticCredentialsProvider.create(AwsBasicCredentials.create(ACCESS_KEY, SECRET_KEY)))
        .endpointOverride(URI.create(BASE_URL))
        .build();
And this is the way I am trying to print the list:
public static void listBucketObjects(S3Client s3, String bucketName) {
    ListBucketsResponse res1 = s3.listBuckets();
    ListObjectsRequest listObjects = ListObjectsRequest
            .builder()
            .bucket(BUCKET_NAME)
            .build();
    ListObjectsResponse res = s3.listObjects(listObjects);
    List<S3Object> objects = res.contents();
    for (ListIterator iterVals = objects.listIterator(); iterVals.hasNext(); ) {
        S3Object myValue = (S3Object) iterVals.next();
        System.out.print("\n The name of the key is " + myValue.key());
        System.out.print("\n The object is " + calKb(myValue.size()) + " KBs");
        System.out.print("\n The owner is " + myValue.owner());
    }
}
BUCKET_NAME is the name of the bucket on S3 (not a URL).
I would like to mention that if I use path-style requests (AWS SDK for Java version 1), following:
https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/examples-s3-objects.html
I am able to print contents from the same bucket. However, we do not intend to go that way.
Any insight on why I am getting the "key does not exist" error, or a potential resolution?

If you had a problem with permissions you would have gotten a 403 Forbidden, not a 404 NoSuchKey.
What are the names of your objects in the bucket? My guess is that you have some special characters or URL-encoded characters that cause the problem. See https://aws.amazon.com/premiumsupport/knowledge-center/404-error-nosuchkey-s3/?nc1=h_ls for more details.
I also suggest using listObjectsV2 instead of the V1 operation.
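For reference, here is a minimal sketch of paging through a bucket with listObjectsV2 in SDK v2. The bucket name is a placeholder, and it assumes region and credentials are resolved from the default provider chain:
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;

// Minimal sketch: region/credentials come from the default provider chain
try (S3Client s3 = S3Client.create()) {
    ListObjectsV2Request request = ListObjectsV2Request.builder()
            .bucket("my-bucket") // placeholder bucket name
            .build();
    // The paginator transparently follows continuation tokens across pages
    s3.listObjectsV2Paginator(request).contents()
            .forEach(obj -> System.out.println(obj.key() + " (" + obj.size() + " bytes)"));
}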

Related

Adding an attachment on Azure CosmosDB

I am looking for some help on how to add an attachment on CosmosDB. Here is a little background.
Our application is currently on IBM Bluemix and we are using CloudantDB. We use CloudantDB to store attachments (PDF files). We are now moving to Azure PaaS App Service and planning to use CosmosDB. I am looking for help on how to create an attachment on CosmosDB using the Java API. What API do I need to use? I want to do a small POC.
Thanks,
Personally, I feel that in Azure, if you want to put files into DocumentDB, you will pay a high query cost. Instead, the normal practice would be to use Azure Blob storage and save the link in a field, then return the URL if it's public, or the binary data if you want it secured.
However, you could store it using:
var myDoc = new { id = "42", Name = "Max", City = "Aberdeen" }; // this is the document you are trying to save
var attachmentStream = File.OpenRead("c:/Path/To/File.pdf"); // this is the document stream you are attaching
var client = await GetClientAsync();
var createUrl = UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName);
Document document = await client.CreateDocumentAsync(createUrl, myDoc);
await client.CreateAttachmentAsync(document.SelfLink, attachmentStream, new MediaOptions()
{
    ContentType = "application/pdf", // your application type
    Slug = "78", // this is actually the attachment ID
});
See WORKING WITH ATTACHMENTS; I have answered a similar question here.
What client API can I use?
You could follow the Cosmos DB Java SDK to CRUD attachments:
import com.microsoft.azure.documentdb.*;
import java.util.UUID;

public class CreateAttachment {
    // Replace with your DocumentDB end point and master key.
    private static final String END_POINT = "***";
    private static final String MASTER_KEY = "***";

    public static void main(String[] args) throws Exception, DocumentClientException {
        DocumentClient documentClient = new DocumentClient(END_POINT,
                MASTER_KEY, ConnectionPolicy.GetDefault(),
                ConsistencyLevel.Session);
        String uuid = UUID.randomUUID().toString();
        Attachment attachment = getAttachmentDefinition(uuid, "application/text");
        RequestOptions options = new RequestOptions();
        // getDocumentLink() is assumed to return the self-link of the target document
        ResourceResponse<Attachment> attachmentResourceResponse =
                documentClient.createAttachment(getDocumentLink(), attachment, options);
    }

    private static Attachment getAttachmentDefinition(String uuid, String type) {
        return new Attachment(String.format(
                "{" +
                "  'id': '%s'," +
                "  'media': 'http://xstore.'," +
                "  'MediaType': 'Book'," +
                "  'Author': 'My Book Author'," +
                "  'Title': 'My Book Title'," +
                "  'contentType': '%s'" +
                "}", uuid, type));
    }
}
In the documentation it says the total file size we can store is 2 GB:
"Azure Cosmos DB allows you to store binary blobs/media either with Azure Cosmos DB (maximum of 2 GB per account)"
Is that the max we can store?
Yes. The size of attachments is limited in DocumentDB. However, there are two methods for creating an Azure Cosmos DB document attachment:
1. Store the file as an attachment to a document
The raw attachment is included as the body of the POST. Two headers must be set:
Slug – the name of the attachment.
contentType – set to the MIME type of the attachment.
2. Store the URL for the file in an attachment to a document
The body of the POST includes the following:
id – the unique name that identifies the attachment, i.e. no two attachments will share the same id. The id must not exceed 255 characters.
Media – the URL link or file path where the attachment resides.
The following is an example:
{
    "id": "device\A234",
    "contentType": "application/x-zip-compressed",
    "media": "www.bing.com/A234.zip"
}
If your files exceed the limit, you could try storing them the second way. For more details, please refer to the blog.
In addition, note that Cosmos DB attachments support a garbage collection mechanism, which ensures the media is garbage collected when all of the outstanding references are dropped.
Hope it helps you.

How to fetch the latest modified time of a file in S3 in Java

I am new to S3. I need to fetch a file from S3, update it, and store it back to S3, so I need to fetch the last modified time of this file in an existing module. It would be good if the answer is in Java.
This gets a list of objects in the bucket, and also prints out each object's name, file size, and last modified date.
ObjectListing objects = conn.listObjects(bucket.getName());
do {
    for (S3ObjectSummary objectSummary : objects.getObjectSummaries()) {
        System.out.println(objectSummary.getKey() + "\t" +
                objectSummary.getSize() + "\t" +
                StringUtils.fromDate(objectSummary.getLastModified()));
    }
    objects = conn.listNextBatchOfObjects(objects);
} while (objects.isTruncated());
The output will look something like this:
myphoto1.jpg 251262 2011-08-08T21:35:48.000Z
myphoto2.jpg 262518 2011-08-08T21:38:01.000Z
A reference for S3 examples, including the one above (listing a bucket's contents), is at: http://docs.ceph.com/docs/master/radosgw/s3/java/
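If you only need the timestamp of a single known file rather than a full listing, a minimal sketch along these lines (AWS SDK for Java v1; the bucket name and key are placeholders) fetches just the object metadata:
import java.util.Date;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectMetadata;

// Assumes an already configured client; "my-bucket" and the key are placeholders
AmazonS3 s3 = new AmazonS3Client();
ObjectMetadata metadata = s3.getObjectMetadata("my-bucket", "path/to/file.pdf");
Date lastModified = metadata.getLastModified();
System.out.println("Last modified: " + lastModified);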

Getting an id for an existing dir on Drive

I am trying to get an id for an existing dir on Google Drive.
com.google.api.services.drive.model.About about = drive.about().get().execute();
com.google.api.services.drive.Drive.Children.List list =
        drive.children().list(about.getRootFolderId());
Iterator<Entry<String, Object>> itr = list.entrySet().iterator();
Entry<String, Object> s;
while (itr.hasNext()) {
    s = itr.next();
    System.out.println(s.getKey() + "::" + s.getValue());
}
Right now this code is giving the output:
folderId::0APcEBFk-CF2pUk9PVA
which is probably not the correct id, because I have 2 dirs and 3 files in my Google Drive.
I must be missing something; what is the right way to get the id of an existing dir?
I have seen this question, and it would be helpful if I could get an equivalent Java example. I am using the Google Drive of the same account that owns the app.
Create a list of files and iterate:
Drive service = new Drive.Builder(httpTransport, jsonFactory, credential1).build();
String query = "'" + about.getRootFolderId() + "' " + "in parents";
List<File> files = service.files().list().setQ(query).execute().getItems();
return files;
This should do.
Comparing your code to the example here, you didn't call execute() on list; I think you're iterating over the request arguments.
Also, the documentation says:
folderId – To list all files in the root folder, use the alias root as the value for folderId.
so you can skip getting the About.
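Putting both suggestions together, a rough sketch (Drive API v2 Java client; assumes service is an already authorized Drive instance) that lists the folders directly under the root:
import com.google.api.services.drive.model.File;
import com.google.api.services.drive.model.FileList;

// The 'root' alias avoids the extra About request
FileList result = service.files().list()
        .setQ("'root' in parents and mimeType = 'application/vnd.google-apps.folder'")
        .execute();
for (File f : result.getItems()) {
    System.out.println(f.getTitle() + " :: " + f.getId());
}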

Find all the attached volumes for an EC2 instance

I'm using the code below to get all the available volumes under EC2, but I can't find any EC2 API to get the volumes already attached to an instance. Please let me know how to get all attached volumes using an instanceId.
EC2Api ec2Api = computeServiceContext.unwrapApi(EC2Api.class);
List<String> volumeLists = new ArrayList<String>();
if (null != volumeId) {
    volumeLists.add(volumeId);
}
String[] volumeIds = volumeLists.toArray(new String[0]);
LOG.info("the volume IDs got from user is ::" + Arrays.toString(volumeIds));
Set<Volume> ec2Volumes = ec2Api.getElasticBlockStoreApi().get()
        .describeVolumesInRegion(region, volumeIds);
Set<Volume> availableVolumes = Sets.newHashSet();
for (Volume volume : ec2Volumes) {
    if (volume.getSnapshotId() == null
            && volume.getStatus() == Volume.Status.AVAILABLE) {
        LOG.debug("available volume with no snapshots ::" + volume.getId());
        availableVolumes.add(volume);
    }
}
The AWS Java SDK now provides a method to get all the block device mappings for an instance. You can use that to get a list of all the attached volumes:
// First get the EC2 instance from the id
DescribeInstancesRequest describeInstancesRequest = new DescribeInstancesRequest().withInstanceIds(instanceId);
DescribeInstancesResult describeInstancesResult = ec2.describeInstances(describeInstancesRequest);
Instance instance = describeInstancesResult.getReservations().get(0).getInstances().get(0);

// Then get the mappings
List<InstanceBlockDeviceMapping> mappingList = instance.getBlockDeviceMappings();
for (InstanceBlockDeviceMapping mapping : mappingList) {
    System.out.println(mapping.getEbs().getVolumeId());
}
You can filter the output of the EC2 DescribeVolumes API call. There are various attachment.* filters available; the one you want is filtering by attached instance ID. Try the following code:
Multimap<String, String> filter = ArrayListMultimap.create();
filter.put("attachment.instance-id", instanceId);
filter.put("attachment.status", "attached");
Set<Volume> volumes = ec2Api.getElasticBlockStoreApi().get()
        .describeVolumesInRegionWithFilter(region, volumeIds, filter);
The filter is a Multimap with the keys and values you want to filter on. You can actually specify the same filter multiple times, for example to get all volumes attached to a number of different instances (see the sketch below).
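For example, a sketch of the same call with two hypothetical instance IDs, so that volumes attached to either instance are returned:
// Hypothetical instance IDs for illustration
Multimap<String, String> filter = ArrayListMultimap.create();
filter.put("attachment.instance-id", "i-0aaa1111");
filter.put("attachment.instance-id", "i-0bbb2222");
filter.put("attachment.status", "attached");
Set<Volume> attached = ec2Api.getElasticBlockStoreApi().get()
        .describeVolumesInRegionWithFilter(region, volumeIds, filter);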
You can use volumeAttachmentApi.listAttachmentsOnServer() to do this:
NovaApi novaApi = context.unwrapApi(NovaApi.class);
VolumeApi volumeApi = novaApi.getVolumeExtensionForZone(region).get();
VolumeAttachmentApi volumeAttachmentApi = novaApi.getVolumeAttachmentExtensionForZone(region).get();
volumeAttachmentApi.listAttachmentsOnServer(serverId);

Listing objects in an AWS S3 bucket

I was trying to print all the objects in a bucket, but I am getting an error:
Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 301, AWS Service: Amazon S3, AWS Request ID: 758A7CBF1A29FD74, AWS Error Code: PermanentRedirect, AWS Error Message: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint., S3
At the moment I only have the following code:
public class S3Download {
    /**
     * @param args
     */
    public static void main(String[] args) {
        AmazonS3 s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
        Region usWest2 = Region.getRegion(Regions.US_WEST_2);
        s3.setRegion(usWest2);

        String bucketName = "apireleasecandidate1";

        ListObjectsRequest listObjectRequest = new ListObjectsRequest().withBucketName(bucketName);
        ObjectListing objectListing;
        do {
            objectListing = s3.listObjects(listObjectRequest);
            for (S3ObjectSummary objectSummary : objectListing.getObjectSummaries()) {
                System.out.println(" - " + objectSummary.getKey() + " " + "(size = " +
                        objectSummary.getSize() + ")");
            }
            listObjectRequest.setMarker(objectListing.getNextMarker());
        } while (objectListing.isTruncated());
    }
}
I found this solution on Amazon's website.
Does anyone know what I am missing?
For Scala developers, here is a recursive function to execute a full scan and map of the contents of an AmazonS3 bucket using the official AWS SDK for Java:
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.{S3ObjectSummary, ObjectListing, GetObjectRequest}
import scala.collection.JavaConversions.{collectionAsScalaIterable => asScala}

def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {
  def scan(acc: List[T], listing: ObjectListing): List[T] = {
    val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
    val mapped = (for (summary <- summaries) yield f(summary)).toList
    // include the accumulator on the final page, otherwise earlier batches are dropped
    if (!listing.isTruncated) acc ::: mapped
    else scan(acc ::: mapped, s3.listNextBatchOfObjects(listing))
  }
  scan(List(), s3.listObjects(bucket, prefix))
}
To invoke the curried map() function above, pass the already constructed (and properly initialized) AmazonS3Client object (refer to the official AWS SDK for Java API Reference), the bucket name, and the prefix in the first parameter list, and pass the function f() you want to apply to each object summary in the second parameter list.
For example:
map(s3, bucket, prefix)(s => println(s))
will print all the files.
val tuple = map(s3, bucket, prefix)(s => (s.getKey, s.getOwner, s.getSize))
will return the full list of (key, owner, size) tuples in that bucket/prefix.
val totalSize = map(s3, "bucket", "prefix")(s => s.getSize).sum
will return the total size of its contents (note the additional sum() folding function applied at the end of the expression).
You can combine map() with many other functions, as you would normally do with monads in functional programming.
It appears that your bucket "apireleasecandidate1" is not in the us-west-2 region; I think it is in the US Classic (us-east-1) region. You should modify your code to remove the setRegion() call.
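One way to confirm where the bucket actually lives is to ask S3 for its location; a minimal sketch with SDK v1, reusing the bucket name from the question:
// getBucketLocation returns the region constraint; "US" means the classic US Standard region
AmazonS3 s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
String location = s3.getBucketLocation("apireleasecandidate1");
System.out.println("Bucket region: " + location);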
