When trying to do a delete via the AWS Java SDK I get the error:
The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: 52N303HS3D535K28KSN3R3803VVV4KQNSO5AEMVJF66Q9ASUAAJG)
I have a DeleteItemSpec defined that looks like this:
DeleteItemSpec deleteItemSpec = new DeleteItemSpec()
.withPrimaryKey("pk", messageId)
.withConditionExpression("#ip > :val")
.withNameMap(new NameMap()
.with("#ip", "timestamp"))
.withValueMap(new ValueMap()
.withNumber(":val", 0))
.withReturnValues(ReturnValue.NONE);
And my table is created like this
List<AttributeDefinition> attributeDefinitions = new ArrayList<>();
attributeDefinitions.add(new AttributeDefinition()
.withAttributeName("pk")
.withAttributeType(ScalarAttributeType.S));
attributeDefinitions.add(new AttributeDefinition()
.withAttributeName("timestamp")
.withAttributeType(ScalarAttributeType.N));
List<KeySchemaElement> keySchema = new ArrayList<>();
keySchema.add(new KeySchemaElement()
.withAttributeName("pk")
.withKeyType(KeyType.HASH));
keySchema.add(new KeySchemaElement()
.withAttributeName("timestamp")
.withKeyType(KeyType.RANGE));
I'm wondering if the sort key on timestamp is causing this issue. Do I need to specify the timestamp as something other than > 0?
The issue is that you must specify both the hash and range key when you delete an item. Your hash key is "pk" and your range key is "timestamp", but you are only passing the hash key into the withPrimaryKey method.
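A rough sketch of the same DeleteItemSpec with both key parts supplied is below; timestampValue is a hypothetical placeholder for the item's actual sort key value, and table is the Document API Table handle (neither appears in the question):
// Both the hash key ("pk") and the range key ("timestamp") are needed to identify the item.
DeleteItemSpec deleteItemSpec = new DeleteItemSpec()
.withPrimaryKey("pk", messageId, "timestamp", timestampValue)
.withReturnValues(ReturnValue.NONE);
table.deleteItem(deleteItemSpec);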
It looks like you are trying to delete multiple items at a time, which a single DeleteItem call cannot do. You will first need to run a query on the hash key, applying your condition expression there so that you only retrieve the keys of the items you want to delete. You will then need to call the delete API individually for each record, or use the batch write API to delete records in batches, while still specifying the hash and range key for each individual item.
I have written code to fetch data from Google Datastore in my Google Cloud Dataflow program. I am able to fetch all fields of the entity except the Id field, which is an autogenerated field. I have tried using entity.getKey() but I am getting null.
Below is my code snippet:
Datastore datastore = DataflowDatastoreService.getDatastoreObject(null, null, null);
Query.Builder queryBuilder = Query.newBuilder();
Filter filter1 = Filter.newBuilder()
.setPropertyFilter(PropertyFilter.newBuilder()
.setProperty(PropertyReference.newBuilder().setName("cId"))
.setOp(PropertyFilter.Operator.EQUAL)
.setValue(Value.newBuilder().setIntegerValue(1059438885900008L).build()).build()).build();
Filter filter2 = Filter.newBuilder()
.setPropertyFilter(PropertyFilter.newBuilder()
.setProperty(PropertyReference.newBuilder().setName("active"))
.setOp(PropertyFilter.Operator.EQUAL)
.setValue(Value.newBuilder().setBooleanValue(Boolean.TRUE).build()).build()).build();
Filter composeFilter = Filter.newBuilder().setCompositeFilter(CompositeFilter.newBuilder()
.addFilters(filter1).setOp(Operator.AND).addFilters(filter2).build()).build();
queryBuilder.addKind(KindExpression.newBuilder().setName("MyMaster").build());
queryBuilder.setFilter(composeFilter).build();
RunQueryRequest request = DataflowDatastoreService.makeRequest(queryBuilder.build(), null);
RunQueryResponse response = datastore.runQuery(request);
QueryResultBatch batch = response.getBatch();
List<EntityResult> entityResutls = batch.getEntityResultsList();
List<Entity> myEntities = new ArrayList<>();
for (EntityResult entityResult : entityResutls) {
    myEntities.add(entityResult.getEntity());
}
Map<String, Value> entityMap = myEntities.get(0).getPropertiesMap();
In my code I am able to get all the fields into entityMap, but I am not getting the key. Is there any other way through which I can fetch all the fields along with the Id?
Note: I'm not a Java user; this answer is based on Python experience.
Indeed, entities returned in a regular query result do not contain the entity key/ID. Attempting to obtain that from the entity is rather inefficient: you need to reach back to the datastore for each individual entity (setting aside why that doesn't appear to be working for you).
If I need the entity keys/IDs I'd instead use keys-only queries - obtaining the keys, from which I can easily get:
the key IDs, locally, without making actual datastore calls (in Python via key.id(); I don't know the Java equivalent)
the entities via direct key lookup, which can be batched for efficiency.
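In Java (which, again, I don't use) the equivalent would presumably look something like the rough, untested sketch below, reusing the proto-style API and the DataflowDatastoreService helper from the question; projecting the special __key__ property is how a keys-only query is expressed:
// Keys-only query: project only the special "__key__" property.
Query.Builder keysOnlyQuery = Query.newBuilder();
keysOnlyQuery.addKind(KindExpression.newBuilder().setName("MyMaster").build());
keysOnlyQuery.setFilter(composeFilter);
keysOnlyQuery.addProjection(Projection.newBuilder()
.setProperty(PropertyReference.newBuilder().setName("__key__")));

RunQueryRequest keysRequest = DataflowDatastoreService.makeRequest(keysOnlyQuery.build(), null);
RunQueryResponse keysResponse = datastore.runQuery(keysRequest);
for (EntityResult keyResult : keysResponse.getBatch().getEntityResultsList()) {
    // For a root entity the key path has a single element holding the numeric Id.
    System.out.println(keyResult.getEntity().getKey().getPathList().get(0).getId());
}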
entity.getKey().getPathList().get(0).getId()
This helped me achieve the result: getting the entity Id through the getKey method.
I have a list of objects which I want to insert into a collection. mongoTemplate.insert(list) works fine, but now I want to change it to an upsert, as my list can contain duplicate objects which are already in the collection. What I want is to insert the entire list and, along the way, check whether each item is already present in the collection: if it is, skip it, otherwise insert it.
You can try the continueOnError or ordered flag, like this:
db.collection.insert(myArray, {continueOnError: true})
OR,
db.collection.insert(myArray, {ordered: false})
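Since you're in Java, roughly the same thing with the plain Java driver would look like the sketch below (untested; the connection string, database and collection names are just placeholders). ordered(false) is the driver-side counterpart of {ordered: false}: duplicate-key failures are reported but don't stop the rest of the batch.
MongoClient client = MongoClients.create("mongodb://localhost:27017"); // placeholder connection string
MongoCollection<Document> collection = client.getDatabase("test").getCollection("items");
List<Document> myArray = Arrays.asList(
new Document("_id", 1).append("name", "bob"),
new Document("_id", 2).append("name", "alice"));
try {
    collection.insertMany(myArray, new InsertManyOptions().ordered(false));
} catch (MongoBulkWriteException e) {
    // Duplicate _id errors (code 11000) land here; all non-duplicate documents
    // in the batch have still been inserted.
}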
You need to create a unique index on your object's id field (if there is no unique constraint already), so that an error is raised when you try to insert a document with the same id.
With the unique constraint in place you can insert the array, or use a bulk insert.
When using insert you can set the flag continueOnError: true, which lets the insertion continue past the errors raised by the unique constraint for ids that already exist.
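A rough sketch of creating such a unique index with the Java driver (collection is assumed to be a MongoCollection<Document> handle, and externalId is just a placeholder for whatever field identifies your objects):
// Any insert whose externalId already exists in the collection will now fail
// with a duplicate-key error instead of creating a second copy.
collection.createIndex(Indexes.ascending("externalId"), new IndexOptions().unique(true));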
The only way to do a bulk-upsert operation is the method MongoCollection.bulkWrite (or at least: the only way I know... ;-))
To use it, you have to convert your documents to the appropriate WriteModel: for upserting whole documents on a per-document basis this is ReplaceOneModel (UpdateOneModel would need an update document built from operators such as $set, not a bare replacement document).
List<Document> toUpdate = ...;
MongoCollection<Document> coll = ...;
// Convert each Document to a ReplaceOneModel<Document>
List<ReplaceOneModel<Document>> bulkOperationList = toUpdate.stream()
.map(doc -> new ReplaceOneModel<>(
Filters.eq("_id", doc.get("_id")), // identify by the same _id
doc, // replacement document
new ReplaceOptions().upsert(true))) // insert if no match
.collect(Collectors.toList());
// Write to DB
coll.bulkWrite(bulkOperationList);
(Disclaimer: I only typed this code, I never ran it)
I've been using Azure's Cosmos DB for a while. Recently I did a bulk import using a stored procedure in a collection of my database, and that used to work fine. Now I have to do the same in another collection which uses partitioning. I searched the Azure code samples and modified my previous bulk insert function like this:
public void createMany(JSONArray aDocumentList, PartitionKey aPartitionKey) throws DocumentClientException {
List<String> aList = new ArrayList<String>();
for(int aIndex = 0; aIndex < aDocumentList.length(); aIndex++) {
JSONObject aJsonObj = aDocumentList.getJSONObject(aIndex);
aList.add(aJsonObj.toString());
}
String aSproc = getCollectionLink() + BULK_INSERTION_PROCEDURE;
RequestOptions requestOptions = new RequestOptions();
requestOptions.setPartitionKey(aPartitionKey);
String result = documentClient.executeStoredProcedure(aSproc,
requestOptions,
new Object[] { aList }).getResponseAsString();
}
but this code gives me this error:
com.microsoft.azure.documentdb.DocumentClientException: Message: {"Errors":["Encountered exception while executing function. Exception = Error: {\"Errors\":[\"Requests originating from scripts cannot reference partition keys other than the one for which client request was submitted.\"]}\r\nStack trace: Error: {\"Errors\":[\"Requests originating from scripts cannot reference partition keys other than the one for which client request was submitted.\"]}\n at callback (bulkInsertionStoredProcedure.js:1:1749)\n at Anonymous function (bulkInsertionStoredProcedure.js:689:29)"]}
I'm not quite certain what that error actually means. Since the partition key is just a JSON key in the document, why would it be needed in the other documents as well? Do I need to append it to each of my documents (under the partition key field)? Could anyone please tell me what I'm missing here? I've searched over the internet and haven't found anything useful that could make it work.
I've already answered this question here. The gist of it is that the documents you're inserting with your SPROC must have a partition key that matches the one you pass with the request options:
// ALL documents inserted must have a parititionKey value that matches
// "aPartitionKey" value
requestOptions.setPartitionKey(aPartitionKey);
So if aPartitionKey == 123456 then all the documents you are inserting with the SPROC are required to belong to that partition. If you have documents spanning multiple partitions that you want to bulk insert, you will have to group them by partition key and run the SPROC separately for each group.
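If your documents do span multiple partitions, a rough sketch of that grouping is below (it assumes the collection's partition key path is /pk, so each JSON document carries a pk field; substitute your real partition key):
// Group documents by partition key value, then run the bulk-insert sproc once per
// group so every document in a call matches the PartitionKey on the RequestOptions.
Map<String, JSONArray> docsByPartition = new HashMap<>();
for (int aIndex = 0; aIndex < aDocumentList.length(); aIndex++) {
    JSONObject aJsonObj = aDocumentList.getJSONObject(aIndex);
    docsByPartition.computeIfAbsent(aJsonObj.getString("pk"), key -> new JSONArray()).put(aJsonObj);
}
for (Map.Entry<String, JSONArray> entry : docsByPartition.entrySet()) {
    createMany(entry.getValue(), new PartitionKey(entry.getKey()));
}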
I'm new to Couchbase. I'm using Java for this. I'm trying to remove a document from a bucket by looking up its ID with query parameters (assuming the ID is unknown).
Let's say I have a bucket called test-data. In that bucket I have a document with an ID of 555 and content of {"name":"bob","num":"10"}.
I want to be able to remove that document by querying using 'name' and 'num'.
So far I have this (hardcoded):
String statement = "SELECT META(`test-data`).id from `test-data` WHERE name = \"bob\" and num = \"10\"";
N1qlQuery query = N1qlQuery.simple(statement);
N1qlQueryResult result = bucket.query(query);
List<N1qlQueryRow> row = result.allRows();
N1qlQueryRow res1 = row.get(0);
System.out.println(res1);
//output: {"id":"555"}
So I'm getting JSON that has the document's ID in it. What would be the best way to extract that ID so that I can then remove the queried document from the bucket using its ID? Am I doing too many steps? Is there a better way to extract the document's ID?
bucket.remove(docID)
Ideally I'd like to use something like an N1qlQueryResult to get this going but I'm not sure how to set that up.
N1qlQueryResult result = bucket.query(select("META.id").fromCurrentBucket().where((x("num").eq("\""+num+"\"")).and(x("name").eq("\""+name+"\""))));
But that isn't working at the moment.
Any help or direction would be appreciated. Thanks.
There might be a better way, which is running this kind of query:
delete from `test-data` use keys '00000874a09e749ab6f199c0622c5cb0' returning raw META(`test-data`).id
or, if your fields have an index:
delete from `test-data` where name='bob' and num='10' returning raw META(`test-data`).id
This query deletes the document with the given document key (which is meta.id) and returns the document id of the deleted document if it deletes any document; it returns an empty result if no documents were deleted.
You can implement this query with the Couchbase SDK as follows:
Statement statement = deleteFrom("test-data")
.where(x("name").eq(s("bob")).and(x("num").eq(s("10"))))
.returningRaw(meta(i("test-data")).get("id"));
You can make this statement parameterized or just execute it as is.
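For completeness, a rough sketch of executing that statement with the synchronous bucket API from the question. Because of RETURNING RAW, I'm assuming each result row carries just the bare id value, so it's read via byteValue() here rather than as a JSON object:
// Run the DELETE ... RETURNING RAW statement built above.
N1qlQueryResult deleteResult = bucket.query(N1qlQuery.simple(statement));
if (deleteResult.finalSuccess()) {
    for (N1qlQueryRow deletedRow : deleteResult.allRows()) {
        // Each row holds only the deleted document's id, e.g. "555".
        System.out.println(new String(deletedRow.byteValue()));
    }
}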
EDIT:
I was actually incorrect. I was querying the table when I meant to query an index, which explains my error. Vikdor's solution is a valid one, though.
ORIGINAL:
I have a table with a hash-range key schema in DynamoDB. I need to be able to get all items associated with a specific hash key, but it seems to require a range key condition. My issue is that I want EVERY range key, but there is no wildcard option. As of right now my range key is a string, and the only way I could think to do this is by querying for all range keys greater than or equal to the smallest ASCII character I can use, since the documentation says it sorts based on ASCII character values.
I looked into scanning, but it appears that it simply reads the entire table, which is NOT an option.
Is there any better way to query for all values of a hash key, or can anyone confirm that the method with the ASCII character will work?
but it seems to require a range key condition.
This doesn't sound right; a range key condition is not required.
I use DynamoDBMapper with a DynamoDBQueryExpression to query all the records with a given hash key, as follows:
DynamoDBQueryExpression<DomainObject> query =
new DynamoDBQueryExpression<DomainObject>();
DomainObject hashKeyValues = new DomainObject();
hashKeyValues.setHashKey(hashKeyValue);
query.setHashKeyValues(hashKeyValues);
// getMapper() returns a DynamoDBMapper object with the appropriate
// AmazonDynamoDBClient object.
List<DomainObject> results = getMapper().query(query);
HTH.
You can use DynamoDB's query API, which allows you to query the database using conditional expressions on the hash/range keys. You can see examples of the API here. Here is a relevant example:
ItemCollection<QueryOutcome> items = table.query("theHashFieldName", "theHashValueToQuery");
You can also query using more complex expressions. E.g.:
DynamoDB dynamoDB = new DynamoDB(
new AmazonDynamoDBClient(new ProfileCredentialsProvider()));
Table table = dynamoDB.getTable("TableName");
QuerySpec spec = new QuerySpec()
.withKeyConditionExpression("Id = :v_id")
.withValueMap(new ValueMap()
.withString(":v_id", "TheId"));
ItemCollection<QueryOutcome> items = table.query(spec);
Iterator<Item> iterator = items.iterator();
Item item = null;
while (iterator.hasNext()) {
item = iterator.next();
System.out.println(item.toJSONPretty());
}