Copy S3 objects with "subdirectories" using s3Client.copyObject method - java

I have an S3-Bucket with two files:
s3://bucketA/objectA/objectB/fileA
s3://bucketA/objectA/objectB/fileB
I want to use the s3Client in Java to create a copy of objectA named objectC, using the copyObject method of s3Client.
s3://bucketA/objectA/ ---Copy-To---> s3://bucketA/objectC/
The problem is that the contents of objectA are not being copied into objectC: objectC does not contain objectB, fileA, or fileB. How can I copy the contents of the object as well?
Here is my code (I am using Kotlin):
s3client.copyObject(CopyObjectRequest("bucketA", "objectA", "bucketA", "objectC"))
I checked in the S3 console and this works (it creates a folder called objectC), but I'm unable to get the contents of objectA into objectC.

What is happening is that the SDK call is not making a recursive copy of the objects. S3 has no real directories, only keys that share a prefix, and copyObject copies exactly one object. So the easiest solution is to use the AWS CLI, which handles the recursion for you:
aws s3 cp s3://source-awsexamplebucket s3://destination-awsexamplebucket --recursive --storage-class STANDARD
Note that you have to take the size and number of the objects into consideration. If the data set is too big, a batching mechanism could be built to help your system cope with the load. You can read further in the AWS documentation.
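Applied to the question's bucket and prefixes, the same-bucket copy would be something like:
aws s3 cp s3://bucketA/objectA/ s3://bucketA/objectC/ --recursive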
Now, assuming you need to do this programmatically: the algorithm has two parts, listing and copying. Something along these lines will work.
// List everything under the source prefix and copy each object to the
// destination prefix, preserving the rest of the key (pagination is
// handled in the sketch below).
ListObjectsV2Request request = new ListObjectsV2Request()
        .withBucketName("bucketA")
        .withPrefix("objectA/");
ListObjectsV2Result result = s3.listObjectsV2(request);
for (S3ObjectSummary os : result.getObjectSummaries()) {
    String destKey = os.getKey().replaceFirst("^objectA/", "objectC/");
    s3.copyObject("bucketA", os.getKey(), "bucketA", destKey);
}
// exception handling and error handling omitted for brevity
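One caveat: listObjectsV2 returns at most 1,000 keys per call, so for larger prefixes you have to follow the continuation token. A minimal sketch of that loop, using the same bucket and prefixes as above:

ListObjectsV2Request request = new ListObjectsV2Request()
        .withBucketName("bucketA")
        .withPrefix("objectA/");
ListObjectsV2Result result;
do {
    result = s3.listObjectsV2(request);
    for (S3ObjectSummary os : result.getObjectSummaries()) {
        String destKey = os.getKey().replaceFirst("^objectA/", "objectC/");
        s3.copyObject("bucketA", os.getKey(), "bucketA", destKey);
    }
    // Resume the listing where the previous page stopped.
    request.setContinuationToken(result.getNextContinuationToken());
} while (result.isTruncated());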


AWS CDK getting DynamoDB Stream ARN returns null

Is it possible to get the Stream ARN of a DynamoDB table using the AWS CDK?
I tried the below, but when I access the stream ARN using getTableStreamArn it returns null.
ITable table = Table.fromTableArn(this, "existingTable", <<existingTableArn>>);
System.out.println("ITable Stream Arn : " + table.getTableStreamArn());
I tried using fromTableAttributes as well, but the stream ARN is still empty.
ITable table = Table.fromTableAttributes(
    this, "existingTable", TableAttributes.builder().tableArn(<<existingTableArn>>).build());
This is not possible with the fromTableArn method. Please see the documentation here:
https://docs.aws.amazon.com/cdk/api/latest/docs/aws-dynamodb-readme.html#importing-existing-tables
If you intend to use the tableStreamArn (including indirectly, for example by creating an @aws-cdk/aws-lambda-event-sources.DynamoEventSource on the imported table), you must use the Table.fromTableAttributes method and the tableStreamArn property must be populated.
That value is most likely not available while your Java code is running. With the CDK there is a multi-step process to get your code to execute:
1. Your Java code is executed and triggers the underlying JSII layer.
2. JSII executes the underlying JavaScript/TypeScript implementation of the CDK.
3. The TypeScript layer produces the CloudFormation template.
4. The CloudFormation template (and other assets) is sent to the AWS API.
5. CloudFormation executes the template and provisions the resources.
Some attributes are only available during step 5; before that they only contain internal references that are eventually put into the CloudFormation template. If I recall correctly, the Table Stream ARN is one of them.
That means if you want that value, you have to create a CloudFormation Output for it, which will be populated during the deployment.
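A minimal sketch of both steps, assuming you can supply the stream ARN yourself (existingTableArn and existingStreamArn are placeholders, not values from the question):

// Import the table together with its stream ARN so constructs that need
// the stream (e.g. a DynamoEventSource) can use it.
ITable table = Table.fromTableAttributes(this, "existingTable",
        TableAttributes.builder()
                .tableArn(existingTableArn)
                .tableStreamArn(existingStreamArn)
                .build());

// Expose the value as a CloudFormation Output; it is resolved at deploy time.
CfnOutput.Builder.create(this, "TableStreamArn")
        .value(table.getTableStreamArn())
        .build();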

What is the Tensorflow Java Api `toGraphDef` equivalent in Python?

I am using the Tensorflow Java Api to load an already created Tensorflow model into the JVM.
I am using this as an example: tensorflow/examples/LabelImage.java
Here is my simple Scala code:
import java.nio.file.{Files, Path, Paths}
import org.tensorflow.{Graph, Session, Tensor}
def readAllBytesOrExit(path: Path): Array[Byte] = Files.readAllBytes(path)
val graphDef = readAllBytesOrExit(Paths.get("PATH_TO_A_SINGLE_FILE_DESCRIBING_TF_MODEL.pb"))
val g = new Graph()
g.importGraphDef(graphDef)
val session = new Session(g)
val result: Tensor = session.runner().feed("input", image).fetch("output").run().get(0)
How do I save my model so that both the Session and the Graph are stored in the same file, as described by "PATH_TO_A_SINGLE_FILE_DESCRIBING_TF_MODEL.pb" above?
Described here, it mentions:
The serialized representation of the graph, often referred to as a GraphDef, can be generated by toGraphDef() and equivalents in other language APIs.
What are the equivalents in other language APIs? I don't find it obvious.
Note: I already looked at the mnist_saved_model.py under tensorflow_serving but saving it through that procedure gives me a .pb file and a variables folder. When trying to load that .pb file I get: java.lang.IllegalArgumentException: Invalid GraphDef
Currently, with the Java API of TensorFlow, I have only found how to save a graph as a GraphDef (i.e. without its variables and metadata). In the Python API, the equivalent is the graph's as_graph_def() method. Saving the GraphDef can be done by just writing the Array[Byte] to a file:
Files.write(Paths.get(modelDir, modelName), myGraph.toGraphDef)
Here myGraph is a Java object of the Graph class.
I would suggest saving your model from the Python API instead, using the SavedModel API defined here. It will save your model in a folder with both the serialized graph in a .pb file and the variables in a subfolder. Note the tag_constants you use, as you'll need them in your Scala/Java code to load the model with its variables. The graph and session with variables are then easily loaded with the SavedModelBundle Java class from the Java API, which returns a wrapper with both the graph and the session containing the variable values:
val model = SavedModelBundle.load(modelDir, modelTag)
If you have already tried this, maybe you can share your code so we can see why it returned an invalid GraphDef.
Another option is to freeze your graph, i.e. turn your variable nodes into constant nodes so that everything is self-contained in the .pb file. More info here on the freezing part.
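For completeness, a minimal sketch of the loading side with the Java API, assuming the model was exported with the "serve" tag and has ops named "input" and "output" (all three are assumptions, not taken from the question):

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;

public class LoadSavedModel {
    public static void main(String[] args) {
        // Load the folder produced by the Python SavedModel exporter.
        try (SavedModelBundle model = SavedModelBundle.load("/path/to/export_dir", "serve")) {
            try (Tensor input = Tensor.create(new float[][] {{1f, 2f, 3f}});
                 Tensor output = model.session().runner()
                         .feed("input", input)
                         .fetch("output")
                         .run()
                         .get(0)) {
                System.out.println(output);
            }
        }
    }
}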

Java method that writes to file does nothing when invoked from a JSP

Hey, all! I have a class method whose primary function is to get a Map object, which works fine; however, it's an expensive operation that doesn't need to be done every time, so I'd like to have the results stored in an XML file using JAXB, to be read from for the majority of calls and updated infrequently.
When I run a class that calls it out of NetBeans the file is created no problem with exactly what I want -- but when I have my JSP call the method nothing happens whatsoever, even though the rest of the information is passed normally. I have the feeling it's somehow lacking write privileges, but the file is just in the root directory so I'm not sure what I'm missing. Thanks for the help!
The code looks roughly like this:
public class DataHandler {
    ...
    public void config() {
        MapHolder bucket = new MapHolder();
        MapExporter exp = new MapExporter();
        Map map = makeMap();
        bucket.setMap(map);
        exp.exportMap(bucket);
    }
}
And then the JSP has a JavaBean of DataHandler, and this line:
databean.config();
It's probably a tad more fragmented than it needs to be; the whole bucket rigamarole was because I was stumbling while learning how to write a map to an XML file. MapHolder is just a class that I wrap around the map, MapExporter just uses a JAXB marshaller, and it all does work properly when run from NetBeans.
OK, turns out I'm just dumb; everything was working fine, the file was just being stored in a folder at the localhost location. Whoops! That'd be my inexperience with web development at work.
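For anyone who hits the same thing: relative paths inside a servlet container resolve against the server's working directory, not the project root. A sketch of one way to avoid the surprise, assuming MapExporter is given an overload that accepts a destination file (that extra parameter is hypothetical, not part of the original class):

import java.io.File;

public void config(File outputDir) {
    MapHolder bucket = new MapHolder();
    MapExporter exp = new MapExporter();
    bucket.setMap(makeMap());
    // Write to an explicit location instead of whatever the servlet
    // container happens to use as its working directory.
    exp.exportMap(bucket, new File(outputDir, "map.xml"));
}

In a servlet environment, getServletContext().getRealPath("/") is one way to obtain such a base directory.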

JavaScript scope resolve time

I'm writing an app that loads JavaScript dynamically using Rhino (or a browser); I've got two files:
// in a file called fooDefinition.js
var definition = {
    foo: function(data) { return bar(data); },
    loadFile: "barLib.js"
};
now, bar() is defined like this:
// in a file called barLib.js
function bar(data) {
    return data + " -> bar!";
}
This is what I want to do:
load fooDefinition.js into the environment
read the value of loadFile (in this case: "barLib.js") and load the file (NOTE: load the file through external mechanism, not through javascript itself!)
call foo
external mechanism & example usage (Java pseudo code):
// assume engine is a statefull engine (rhino for example)
String str = /*content of fooDefinition.js*/;
engine.eval(str);
String fileToLoad = engine.eval("definition.loadFile");
engine.load(IOUtils.readFileToString(new File(fileToLoad)));
String result = engine.eval("definition.foo('myData')");
I've tried this in Google Chrome's JS console and no error was thrown.
I wonder, is this the correct way to accomplish such a task?
TL;DR:
Are the attributes of an object loaded and checked when the object is defined?
If your engine is stateful, that is, it keeps track of defined variables, then yes, your approach is correct and will work as expected.
But if it is not, your approach will fail, because when you call the following
String fileToLoad = engine.eval("definition.loadFile");
your engine doesn't have any info about the definition object, and as a result it throws an exception (in JavaScript).
It seems your engine is stateful, so everything should work correctly.
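A runnable sketch of the same flow using the javax.script API (the engine lookup and the inlined bar() stand in for the question's external loading mechanism). It shows that bar only has to exist by the time definition.foo() is invoked, not when definition is defined:

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class LateBindingDemo {
    public static void main(String[] args) throws Exception {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
        // Defining foo() before bar() exists throws no error: foo's body is
        // not resolved until it is called.
        engine.eval("var definition = { foo: function(d) { return bar(d); }, loadFile: 'barLib.js' };");
        String fileToLoad = (String) engine.eval("definition.loadFile");
        // Load the dependency through the external mechanism (inlined here
        // instead of reading barLib.js from disk).
        engine.eval("function bar(data) { return data + ' -> bar!'; }");
        System.out.println(engine.eval("definition.foo('myData')")); // myData -> bar!
    }
}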

Checking for successful S3 copy operation?

I'm trying to implement a move operation using the Amazon S3 Java API.
The problem I am having is that the CopyObjectResult object returned by the AmazonS3Client.copyObject method doesn't seem to have a clear indicator of whether the operation was successful or not.
Given that after this operation I am going to be deleting a file, I'd want to make sure that the operation was successful.
Any suggestions?
This is what I ended up doing,
def s3 = createS3Client(credentials)
def originalMeta = s3.getObjectMetadata(originalBucketName, key)
s3.copyObject(originalBucketName, key, newBucketName, newKey)
def newMeta = s3.getObjectMetadata(newBucketName, newKey)
// check that md5 matches to confirm this operation was successful
return originalMeta.contentMD5 == newMeta.contentMD5
Note that this is Groovy code, but it is extremely similar to how the Java code would work.
I don't like having to make two additional operations to check the metadata, so if there is anyway to do this more efficiently let me know.
I'm pretty sure you can just use the CopyObjectResult object's getETag method to confirm that, once created, the new object has a valid ETag. From the javadoc:
getETag
public String getETag() — Gets the ETag value for the new object that was created in the associated CopyObjectRequest. Returns: The ETag value for the new object. See Also: setETag(String)
setETag
public void setETag(String etag) — Sets the ETag value for the new object that was created from the associated copy object request. Parameters: etag - The ETag value for the new object. See Also: getETag()
Always trust the data.
It's been a couple of years since I did a similar function with the Amazon PHP SDK, but it should work.
The AWS documentation says the source and destination ETag are identical for a successfully copied object.
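A minimal sketch of that check in Java (bucket and key names are placeholders); note the comparison is only straightforward for non-multipart uploads, where the ETag is the MD5 of the content:

// Compare the source object's ETag with the one returned by copyObject
// before deleting the original, so the "move" only completes on success.
ObjectMetadata sourceMeta = s3.getObjectMetadata("source-bucket", "source-key");
CopyObjectResult copyResult = s3.copyObject("source-bucket", "source-key",
        "dest-bucket", "dest-key");
if (sourceMeta.getETag().equals(copyResult.getETag())) {
    s3.deleteObject("source-bucket", "source-key");
}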
