How to delete an item in DynamoDB using Java?

I know this sounds like a simple question, but for some reason I can't find a clear answer online or through StackOverflow.
I have a DynamoDB table named "ABC". The primary key is "ID", a String, and one of the other attributes is "Name", also a String. How can I delete an item from this table using Java?
AmazonDynamoDBClient dynamoDB;
.
.
.
DeleteItemRequest dir = new DeleteItemRequest();
dir.withConditionExpression("ID = 214141").withTableName("ABC");
DeleteItemResult deleteResult = dynamoDB.deleteItem(dir);
I get a validation exception:
Exception in thread "main" com.amazonaws.AmazonServiceException: 1 validation error detected: Value null at 'key' failed to satisfy constraint: Member must not be null (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: RQ70OIGOQAJ9MRGSUA0UIJLRUNVV4KQNSO5AEMVJF66Q9ASUAAJG)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1160)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:748)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:467)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:302)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:3240)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.deleteItem(AmazonDynamoDBClient.java:972)
at DynamoDBUploader.deleteItems(DynamoDBUploader.java:168)
at Main.main(Main.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
If I need to know the hash key in order to delete an item from a DynamoDB table, then I think I need to redesign my table to make deletes efficient. My table looks like this:
ID | Name         | Date       | Value
---------------------------------------
1  | TransactionA | 2015-06-21 | 30
2  | TransactionB | 2015-06-21 | 40
3  | TransactionC | 2015-06-21 | 50
Basically, I would like to easily delete all transactions with Date "2015-06-21". How can I do this simply and quickly without having to deal with the Hash Key ID?

DynamoDB identifies the item to delete by its primary key, so you must supply the key itself; a condition expression cannot stand in for it, which is what the ValidationException about a null 'key' is telling you. With the v2 client you are using, the key is a map from attribute name to AttributeValue, and DeleteItemRequest has a fluent API for it:
Map<String, AttributeValue> keyToDelete = new HashMap<>();
keyToDelete.put("ID", new AttributeValue("214141"));

DeleteItemRequest dir = new DeleteItemRequest()
        .withTableName("ABC")
        .withKey(keyToDelete);
DeleteItemResult deleteResult = dynamoDB.deleteItem(dir);
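As for the follow-up about deleting every transaction with a given Date: DynamoDB can only delete items by primary key, so a common pattern is to query a global secondary index on Date and then delete the matches by their ID. Below is a minimal sketch with the document API; the GSI itself and its name "Date-index" are assumptions (you would need to add such an index to the table):

import com.amazonaws.services.dynamodbv2.document.*;
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec;
import com.amazonaws.services.dynamodbv2.document.utils.NameMap;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;

Table table = new DynamoDB(dynamoDB).getTable("ABC");
// Hypothetical GSI with Date as its partition key.
Index dateIndex = table.getIndex("Date-index");

ItemCollection<QueryOutcome> matches = dateIndex.query(new QuerySpec()
        // "Date" is a DynamoDB reserved word, hence the #d placeholder.
        .withKeyConditionExpression("#d = :d")
        .withNameMap(new NameMap().with("#d", "Date"))
        .withValueMap(new ValueMap().withString(":d", "2015-06-21")));

for (Item item : matches) {
    // Deletes still go through the base table's primary key;
    // the index is only used to find the IDs.
    table.deleteItem("ID", item.getString("ID"));
}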

For Kotlin:
I have a table with:
Partition key: account_id (String)
Sort key: message_id (String)
To delete an item from DynamoDB I do the following:
fun deleteMessageById(messageId: String, accountId: String) {
    val item = HashMap<String, AttributeValue>()
    item["account_id"] = AttributeValue(accountId)
    item["message_id"] = AttributeValue(messageId)

    val deleteRequest = DeleteItemRequest().withTableName(tableName).withKey(item)
    dynamoConfiguration.amazonDynamoDB().deleteItem(deleteRequest)
}

Related

How to Retrieve Firestore Documents with a Specific Hashmap Field Value?

--- Users (Collection)
     |
     --- p0A1fXH4l2TpvGE2lo0x
          |
          --- List (HashMap)
               |
               --- ID (String) (Value: UQx4CWRgnVLOdKEY3AKJ)
               |
               --- NAME (String) (Value: ...)
In Firestore, how can I find the documents that have a list ID equal to UQx4CWRgnVLOdKEY3AKJ? Before deleting the list, I need to remove it from the users who have used it. How can I determine which documents are using this list ID so I can delete them?
If I understand correctly, the List field inside the user document is a Map which contains only String values. So if your database schema looks exactly like this:
db
|
--- Users (collection)
     |
     --- $uid (document)
          |
          --- List (map) //👈
               |
               --- ID: "UQx4CWRgnVLOdKEY3AKJ"
               |
               --- NAME: "Taha Sami"
To get all users where the ID field within the List holds the value of UQx4CWRgnVLOdKEY3AKJ, a query like this will do the trick:
Query queryByListId = db.collection("Users").whereEqualTo("List.ID", "UQx4CWRgnVLOdKEY3AKJ");
// 👆
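To consume the matches (for example, to remove the List from each user before deleting the list itself), a one-time get() is the usual pattern with the Android Firestore SDK; a brief sketch:

queryByListId.get().addOnCompleteListener(task -> {
    if (task.isSuccessful()) {
        for (QueryDocumentSnapshot doc : task.getResult()) {
            // Each of these user documents uses the list and can be updated here.
            Log.d("TAG", doc.getId());
        }
    }
});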
It looks like your List (HashMap) may have multiple child objects, and you want to search across all of those for a specific ID value.
There is no way to search across all objects in a map field in Firestore. If you want to search across all ID values, add an additional array field with just these values, e.g.
ListIDs: ["UQx4CWRgnVLOdKEY3AKJ", ...]
With that field in place, you can then use the array-contains operator to query for matching documents.
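With the Java SDK, that query would look something like the sketch below, assuming the extra ListIDs array field suggested above:

Query queryByAnyListId = db.collection("Users")
        .whereArrayContains("ListIDs", "UQx4CWRgnVLOdKEY3AKJ");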

How to get the next free item in DynamoDB in concurrent environment

I have a table in DynamoDB with my users (Partition key = key, Sort key = no):
key   | isActive
user1 | true
user2 | false
...   | ...
In my code I need to return the next user that is not active (isActive = false). What is the best way to do this, given that I have:
A huge table
Concurrent environment
I wrote code that works, BUT I'm not sure it is a good solution because it relies on a Scan with a filter expression:
public String getFreeUser() throws IOException {
    Table table = dynamoDB.getTable("usersTableName");
    ScanSpec spec = new ScanSpec()
            .withFilterExpression("isActive = :is_active")
            .withValueMap(new ValueMap().withBoolean(":is_active", false));
    ItemCollection<ScanOutcome> items = table.scan(spec);
    Iterator<Item> iterator = items.iterator();
    Item item = null;
    while (iterator.hasNext()) {
        item = iterator.next();
        try {
            // Conditionally claim the user: the update only succeeds if
            // isActive is still false, so concurrent callers cannot claim
            // the same user twice.
            UpdateItemSpec updateItemSpec = new UpdateItemSpec()
                    .withPrimaryKey(new PrimaryKey("key", item.getString("key")))
                    .withUpdateExpression("set #ian=:is_active_new")
                    .withConditionExpression("isActive = :is_active_old")
                    .withNameMap(new NameMap()
                            .with("#ian", "isActive"))
                    .withValueMap(new ValueMap()
                            .withBoolean(":is_active_old", false)
                            .withBoolean(":is_active_new", true))
                    .withReturnValues(ReturnValue.ALL_OLD);
            UpdateItemOutcome outcome = table.updateItem(updateItemSpec);
            return outcome.getItem().getString("key");
        } catch (Exception e) {
            // Condition failed: another caller claimed this user first,
            // so move on to the next candidate.
        }
    }
    throw new IOException("No free users were found");
}
GSI + Query == GOOD
userID (PK) | isActive | otherAttribute | ...
user1       | true     | foo            | ...
user2       | false    | bar            | ...
user3       | true     | baz            | ...
user4       | false    | 42             | ...
...

GSI:

userID | isActive (GSI-PK)
user1  | true
user2  | false
user3  | true
user4  | false
Add a GSI with a hash key of isActive. This will allow you to directly query the items where isActive == false.
The benefit versus scan-and-filter is that reads will be much more efficient. The cost is that your GSI requires its own storage, so if your table is huge (as per your assumption) then you might want to consider a sparse index.
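A minimal sketch of that query with the document API, assuming a hypothetical GSI named "isActive-index". Note that GSI key attributes must be of type string, number, or binary, so isActive would have to be stored as, say, the strings "true"/"false" rather than a native boolean:

Index index = dynamoDB.getTable("usersTableName").getIndex("isActive-index");
ItemCollection<QueryOutcome> inactiveUsers = index.query(new QuerySpec()
        .withKeyConditionExpression("isActive = :v")
        .withValueMap(new ValueMap().withString(":v", "false")));
for (Item user : inactiveUsers) {
    System.out.println(user.getString("key")); // candidate free users
}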
Sparse Index GSI + Query == BETTER
userID (PK) | isNotActive | otherAttribute | ...
user1       |             | foo            | ...
user2       | true        | bar            | ...
user3       |             | baz            | ...
user4       | true        | 42             | ...
...

GSI:

userID | isNotActive (GSI-PK)
user2  | true
user4  | true
Consider replacing the attribute isActive with isNotActive, and don't give this attribute to active users at all. That is, inactive users will have isNotActive = true, while active users will not have the attribute. You can then create your GSI on this isNotActive attribute. Since it only contains the inactive users, it will be smaller and cheaper to store and query.
Note that when a user becomes active you will need to delete this attribute, and add it back when an active user becomes inactive.
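A sketch of that attribute flip with the document API, using the hypothetical isNotActive attribute from the tables above:

// User becomes active: remove the sparse attribute so the item
// drops out of the GSI entirely.
table.updateItem(new UpdateItemSpec()
        .withPrimaryKey("key", "user2")
        .withUpdateExpression("remove isNotActive"));

// User becomes inactive again: add the attribute back so the item
// reappears in the sparse index (stored as a string, as noted above).
table.updateItem(new UpdateItemSpec()
        .withPrimaryKey("key", "user2")
        .withUpdateExpression("set isNotActive = :t")
        .withValueMap(new ValueMap().withString(":t", "true")));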
Attribute Projections
Regardless of which GSI you decide is best for you, if you know which attribute(s) you will need when querying these inactive users - even if it's just "all of them" - you can project these to your GSI so that you don't need to do the second lookup by key. This will increase the size of your GSI, but may be a tradeoff worth making depending on your table size, the ratio of active to inactive users, and your expected access patterns.
UPDATE
In response to the first comment: to be clear, the GSI key (now labelled "GSI-PK") is not the userID. I could put the isActive or isNotActive column on the far left in the GSI tables, but that's not how it appears in the AWS console, so I've left the columns in the original order for consistency with the way AWS displays them.
Re the second comment on concurrency, you're right I didn't address this. My solution will work in a concurrent environment except for one thing - you can only do eventually consistent reads, not strongly consistent reads. What this means is that a very recent newly inactive user (and by recent I mean a fraction of a second in most circumstances) might not have replicated to the GSI yet. Similarly, a user that has recently changed from inactive to active might not have updated the GSI yet. You'll need to consider whether eventually consistent reads are acceptable for your use case.
Another consideration: if this is going to be a very large table and the query results total more than 1 MB, you're going to get a paginated result anyway, because DynamoDB enforces that limit. Without a global table lock, you're going to get some inconsistency due to updates from other clients between page queries, in which case eventually consistent reads will need to work for you.

How to successfully separate "ordered items" from "product items" in a shopping software system design?

I am developing an app for ordering products online. I will first present part of the ER diagram and then explain.
Here the "products" are food items, and they go into the fresh_products table. As you can see, a fresh_product consists of a product species, category, type, size, and product grade.
When I place an order, the order ID and order_status are saved in the order table. All the ordered items are saved in the order_item table, and as you can see, order_item has a direct link to fresh_products.
At certain times, management will decide to change the existing fresh_products. For example, take a fish product such as Tuna. This fish item has a category called Loin (fished with a loin); after some time, management decides to remove or rename it because they no longer fish with a loin.
Now my design is affected by the above change. Why? A change to any fresh_products field directly affects the order_item rows, which hold information about already-ordered items. So if you rename Loin, every order_item that had Loin will now be linked to the new name, which is wrong. You can't delete a fresh_product either, because existing order history is bound to it.
My Suggestion
As a solution to this problem, I am thinking of removing the relationship between fresh_products and order_item. Instead, I will add String fields to order_item representing all the fields of fresh_products, for example productType, productCategory, and so on.
Now I don't have a direct connection to fresh_products but still have all the information needed. At the same time, any fresh_product can change without affecting already-purchased items in order_item.
My question is, is my suggestion the best way to solve this issue? If you have better solutions, I am open.
Consider the following; in this example, only the product name changes, but you could easily add columns for each attribute (although at some point this would move from sublime to ridiculous)...
Also, I rarely use correlated subqueries, so apologies if there's an error there...
create table product_history
(product_id int not null
,product_name varchar(12) not null
,date date not null
,primary key (product_id,date)
);
insert into product_history values
(1,'soda','2020-01-01'),
(1,'pop','2020-01-04'),
(1,'cola','2020-01-07');
create table order_detail
(order_id int not null
,product_id int not null
,quantity int not null default 1
,primary key(order_id,product_id)
);
insert into order_detail values
(100,1,4);
create table orders
(order_id serial primary key
,customer_id int not null
,order_date date not null
);
insert into orders values
(100,22,'2020-01-05');
SELECT o.order_id
, o.customer_id
, o.order_date
, ph.product_id
, ph.product_name
, od.quantity
FROM orders o
JOIN order_detail od
ON od.order_id = o.order_id
JOIN product_history ph
ON ph.product_id = od.product_id
WHERE ph.date IN
( SELECT MAX(x.date)
FROM product_history x
WHERE x.product_id = ph.product_id
AND x.date <= o.order_date
);
| order_id | customer_id | order_date | product_id | product_name | quantity |
| -------- | ----------- | ---------- | ---------- | ------------ | -------- |
| 100 | 22 | 2020-01-05 | 1 | pop | 4 |

Defining new schema for Spark Rows

I have a DataFrame and one of its columns contains a string of JSON. So far, I've implemented the Function interface as required by the JavaRDD.map method: Function<Row,Row>(). Within this function, I'm parsing the JSON, and creating a new row whose additional columns came from values in the JSON. For example:
Original row:
+------+----------------------------------+
| id   | json                             |
+------+----------------------------------+
| 1    | {"id":"abcd", "name":"dmux",...} |
+------+----------------------------------+
After applying my function:
+------+----------+-----------+
| id   | json_id  | json_name |
+------+----------+-----------+
| 1    | abcd     | dmux      |
+------+----------+-----------+
I'm running into trouble when trying to create a new DataFrame from the returned JavaRDD. Now that I have these new rows, I need to create a schema. The schema is highly dependent on the structure of the JSON, so I'm trying to figure out a way of passing schema data back from the function along with the Row object. I can't use broadcast variables as the SparkContext doesn't get passed into the function.
Other than looping through each column in a row in the caller of Function, what options do I have?
You can create a StructType. This is Scala, but it would work the same way:
val newSchema = StructType(Array(
  StructField("id", LongType, false),
  StructField("json_id", StringType, false),
  StructField("json_name", StringType, false)
))
val newDf = sqlContext.createDataFrame(rdd, newSchema)
Incidentally, you need to make sure your rdd is of type RDD[Row].
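Since the question is in Java, a rough equivalent with the Spark 1.x Java API would look like the sketch below, assuming rdd is the JavaRDD<Row> returned by your map function:

import java.util.Arrays;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Build the schema for the flattened rows produced by the map function.
StructType newSchema = DataTypes.createStructType(Arrays.asList(
        DataTypes.createStructField("id", DataTypes.LongType, false),
        DataTypes.createStructField("json_id", DataTypes.StringType, false),
        DataTypes.createStructField("json_name", DataTypes.StringType, false)));

DataFrame newDf = sqlContext.createDataFrame(rdd, newSchema);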

Records in DB are not one by one. Hibernate

I have a DB table Users with two columns: ID (auto-increment) and NAME (unique).
When I add a new record to the DB everything is OK; this record gets ID = 1.
When I try to add a record with an existing name, I get an error:
com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry
When I then add the next record with valid data, it gets ID = 3.
Is there any way to avoid this? I want the IDs to be like this:
1 | 2 | 3
not
1 | 3 | 5 etc.
Or should I first check whether the name exists? Which option is better?
My code:
public boolean save() {
    hibernate.beginTransaction();
    hibernate.save(userObject);
    hibernate.getTransaction().commit();
    hibernate.close();
    return true;
}
There are different ID generation strategies. If the ID is generated from a sequence, the sequence's last number is incremented as soon as the "next value" is requested, even if the insert then fails, which is what creates the gaps. The problem you described won't occur if the ID is generated based on a table.
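Regarding the "check first" option: a minimal sketch, assuming a User entity with a unique name property and a getName() getter (both assumptions). Keep the unique constraint in place as a safety net, since another transaction can still insert the same name between the check and the save:

public boolean saveIfNameFree() {
    Transaction tx = hibernate.beginTransaction();
    // Look the name up first, so a duplicate insert never reaches
    // (and burns) the auto-increment counter.
    Long count = (Long) hibernate
            .createQuery("select count(u) from User u where u.name = :name")
            .setParameter("name", userObject.getName())
            .uniqueResult();
    if (count > 0) {
        tx.rollback();
        hibernate.close();
        return false;
    }
    hibernate.save(userObject);
    tx.commit();
    hibernate.close();
    return true;
}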
