How to check for an item in DynamoDB using Java?

Hi, I'm trying to create a Java program which checks for a value present in Amazon DynamoDB.
Actually, I'm trying to write a Java program for a retail store, where the program checks for a certain product (value) in DynamoDB and returns the quantity present in the store. I already created a table in DynamoDB and used the Scan API to retrieve items. But now I want to check whether a certain value is present, and also return the quantity.
It would be really helpful if anyone could help me with a snippet.
PS: I'm new to Java.
Thanks

Given the following data model for a table ConsumerTable, with leaseKey as the indexed field:
{
  "checkpoint": "49572533067729099438069946639296561054199125138133221378",
  "checkpointSubSequenceNumber": 0,
  "leaseCounter": 1372794,
  "leaseKey": "shardId-000000000000",
  "leaseOwner": "pout_172.21.63.243",
  "ownerSwitchesSinceCheckpoint": 0
}
the Java code to query Amazon DynamoDB by the indexed field would be:
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Table;

import java.util.Map;
import java.util.Optional;

public AmazonDynamoDB getDbConnection() {
    // Reads credentials from the named profile in ~/.aws/credentials
    AmazonDynamoDBClient dynamoDB = new AmazonDynamoDBClient(
            new ProfileCredentialsProvider("your-aws-auth-profile"));
    dynamoDB.setRegion(Region.getRegion(Regions.US_WEST_2));
    return dynamoDB;
}

public void getData() {
    DynamoDB dynamoDB = new DynamoDB(getDbConnection());
    Table consumerOffsetTable = dynamoDB.getTable("ConsumerTable");
    // getItem returns null when no matching item exists, hence the Optional wrapper
    Optional.ofNullable(consumerOffsetTable.getItem("leaseKey", "shardId-000000000000"))
            .ifPresent(item -> {
                Map<String, Object> itemMap = item.asMap();
                System.out.println(itemMap.get("leaseKey") + " -> " + itemMap.get("checkpoint"));
            });
}
The Maven dependency would be:
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-dynamodb</artifactId>
    <version>1.11.115</version>
</dependency>
In your product/quantity example, index the product attribute; the lookup would then be similar to the example above.
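For instance, a minimal sketch of that lookup, assuming a table named "Products" with "product" as the partition key and a numeric "quantity" attribute (all three names are placeholders for your actual schema):
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;

public int getQuantity(String productName) {
    DynamoDB dynamoDB = new DynamoDB(getDbConnection());
    Table products = dynamoDB.getTable("Products");        // placeholder table name
    // getItem returns null when the product is not present in the store
    Item item = products.getItem("product", productName);  // placeholder key attribute
    return item == null ? 0 : item.getInt("quantity");     // placeholder quantity attribute
}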
Resources
For AWS access using the SDK: http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html
For DynamoDB APIs:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ScanJavaDocumentAPI.html
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryingJavaDocumentAPI.html

Related

BatchDelete is not deleting all values in the list

I'm trying to delete multiple rows from DynamoDB.
I'm querying only one DynamoDB table. On the basis of the partition key, it returns two values as output, which are saved in a list.
When I delete these values using BatchDelete, only the first element gets deleted. Sometimes, at random, the second value also gets deleted, but that doesn't happen every time.
DynamoDBQueryExpression<Abc> queryExpression = new DynamoDBQueryExpression<Abc>()
        .withHashKeyValues(abc);
List<Abc> xyz = dynamoDBMapper.query(Abc.class, queryExpression);
// xyz has size 2
dynamoDBMapper.batchDelete(xyz);
Should I use sleep, or is there another way?
If you look at Java V1 here:
https://github.com/awsdocs/aws-doc-sdk-examples
you will see it's marked as deprecated.
I strongly recommend that you upgrade to the AWS SDK for Java V2 API.
When working with Java V2 and DynamoDB, the Enhanced Client offers a straightforward way to map client-side classes to DynamoDB tables. This is documented in the Java V2 Developer Guide here:
Mapping items in DynamoDB tables
To use the Enhanced Client to delete multiple items, you can use this Java code:
package com.example.dynamodb;

import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbEnhancedClient;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbTable;
import software.amazon.awssdk.enhanced.dynamodb.Key;
import software.amazon.awssdk.enhanced.dynamodb.TableSchema;
import software.amazon.awssdk.enhanced.dynamodb.model.BatchWriteItemEnhancedRequest;
import software.amazon.awssdk.enhanced.dynamodb.model.DeleteItemEnhancedRequest;
import software.amazon.awssdk.enhanced.dynamodb.model.WriteBatch;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.DynamoDbException;

/*
 * Before running this code example, create an Amazon DynamoDB table named Customer with these columns:
 *  - id - the id of the record, which is the key
 *  - custName - the customer name
 *  - email - the email value
 *  - registrationDate - an Instant value for when the item was added to the table
 *
 * Also, ensure that you have set up your development environment, including your credentials.
 *
 * For information, see this documentation topic:
 * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
 */
public class EnhancedBatchDeleteItems {

    public static void main(String[] args) {
        ProfileCredentialsProvider credentialsProvider = ProfileCredentialsProvider.create();
        Region region = Region.US_EAST_1;
        DynamoDbClient ddb = DynamoDbClient.builder()
                .region(region)
                .credentialsProvider(credentialsProvider)
                .build();

        DynamoDbEnhancedClient enhancedClient = DynamoDbEnhancedClient.builder()
                .dynamoDbClient(ddb)
                .build();

        deleteBatchRecords(enhancedClient);
        ddb.close();
    }

    public static void deleteBatchRecords(DynamoDbEnhancedClient enhancedClient) {
        try {
            DynamoDbTable<Customer> mappedTable = enhancedClient.table("Customer", TableSchema.fromBean(Customer.class));
            Key key1 = Key.builder()
                    .partitionValue("id110")
                    .build();

            Key key2 = Key.builder()
                    .partitionValue("id120")
                    .build();

            BatchWriteItemEnhancedRequest request = BatchWriteItemEnhancedRequest.builder()
                    .writeBatches(WriteBatch.builder(Customer.class)
                                    .mappedTableResource(mappedTable)
                                    .addDeleteItem(DeleteItemEnhancedRequest.builder()
                                            .key(key1)
                                            .build())
                                    .build(),
                            WriteBatch.builder(Customer.class)
                                    .mappedTableResource(mappedTable)
                                    .addDeleteItem(DeleteItemEnhancedRequest.builder()
                                            .key(key2)
                                            .build())
                                    .build())
                    .build();

            // Delete these two items from the table.
            enhancedClient.batchWriteItem(request);
            System.out.println("Records deleted");

        } catch (DynamoDbException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
You can find this example and other Java V2 DynamoDB examples in the AWS Code Examples GitHub repository.
I do not see any issue with your code; I have tested something similar and it works for me with no issue. My first suggestion would be to ensure you wrap your code in a try/catch block:
try {
    DynamoDBMapper mapper = new DynamoDBMapper(client);
    Reply key = new Reply();
    key.setPk("1");

    DynamoDBQueryExpression<Reply> queryExpression = new DynamoDBQueryExpression<Reply>()
            .withHashKeyValues(key)
            .withReturnConsumedCapacity(ReturnConsumedCapacity.TOTAL)
            .withScanIndexForward(false);

    List<Reply> latestReplies = mapper.query(Reply.class, queryExpression);

    // Log keys here to be sure you are deleting the correct items
    for (Reply c : latestReplies) {
        System.out.println(c.getPk());
    }

    mapper.batchDelete(latestReplies);
} catch (Throwable t) {
    System.err.println("Error running: " + t);
    t.printStackTrace();
}
I would suggest that you implement some logging, just to be sure that the items you read are the ones you expect to be deleted.
It may also be worth enabling CloudTrail data plane logging, which will allow you to see all the data plane events being executed on the table.
You may also enable HTTP wire logging to get another level of detail; however, this is not advised for production workloads, as the logging is quite verbose.

How to pass the “values” attribute as an array of strings in the payload object with Google's AutoML Tables?

We are using the Google Cloud service AutoML Tables for online prediction.
We have created, trained, and deployed the model. The model is giving predictions via the Google console. We are trying to integrate this model into our Java code.
We are not able to pass the “values” attribute as an array of strings in the payload object in our Java code. We haven't found anything about this in the documentation.
Please find the link we are using for this:
https://cloud.google.com/automl-tables/docs/samples/automl-tables-predict
Please find the JSON object in the screenshot.
Could you please let us know how to pass the “values” attribute as an array of strings in the payload object?
Thanks.
Based on the reference you are following, to populate "values" you need to define it in main(). You can refer to the Value.Builder class if you need to set numbers, nulls, and other value types.
List<Value> values = new ArrayList<>();
values.add(Value.newBuilder().setStringValue("This is test data.").build());
// add more elements in values as needed
This values list will be used in a Row, which accepts an iterable of protobuf Values. See Row.newBuilder().addAllValues().
Row row = Row.newBuilder().addAllValues(values).build();
Using these, the payload is complete and a prediction request can be built:
ExamplePayload payload = ExamplePayload.newBuilder().setRow(row).build();
PredictRequest request =
        PredictRequest.newBuilder()
                .setName(name.toString())
                .setPayload(payload)
                .putParams("feature_importance", "true")
                .build();
PredictResponse response = client.predict(request);
Your full prediction code should look like this:
import com.google.cloud.automl.v1beta1.AnnotationPayload;
import com.google.cloud.automl.v1beta1.ExamplePayload;
import com.google.cloud.automl.v1beta1.ModelName;
import com.google.cloud.automl.v1beta1.PredictRequest;
import com.google.cloud.automl.v1beta1.PredictResponse;
import com.google.cloud.automl.v1beta1.PredictionServiceClient;
import com.google.cloud.automl.v1beta1.Row;
import com.google.cloud.automl.v1beta1.TablesAnnotation;
import com.google.protobuf.NullValue;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class TablesPredict {

    public static void main(String[] args) throws IOException {
        // TODO(developer): Replace these variables before running the sample.
        String projectId = "your-project-id";
        String modelId = "TBL9999999999";

        // Values should match the input expected by your model.
        List<Value> values = new ArrayList<>();
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setStringValue("blue-colar").build());
        values.add(Value.newBuilder().setStringValue("married").build());
        values.add(Value.newBuilder().setStringValue("primary").build());
        values.add(Value.newBuilder().setStringValue("no").build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setStringValue("yes").build());
        values.add(Value.newBuilder().setStringValue("yes").build());
        values.add(Value.newBuilder().setStringValue("cellular").build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setStringValue("unknown").build());

        predict(projectId, modelId, values);
    }

    static void predict(String projectId, String modelId, List<Value> values) throws IOException {
        // Initialize client that will be used to send requests. This client only needs to be created
        // once, and can be reused for multiple requests. After completing all of your requests, call
        // the "close" method on the client to safely clean up any remaining background resources.
        try (PredictionServiceClient client = PredictionServiceClient.create()) {
            // Get the full path of the model.
            ModelName name = ModelName.of(projectId, "us-central1", modelId);
            Row row = Row.newBuilder().addAllValues(values).build();
            ExamplePayload payload = ExamplePayload.newBuilder().setRow(row).build();

            // Feature importance gives you visibility into how the features in a specific prediction
            // request informed the resulting prediction. For more info, see:
            // https://cloud.google.com/automl-tables/docs/features#local
            PredictRequest request =
                    PredictRequest.newBuilder()
                            .setName(name.toString())
                            .setPayload(payload)
                            .putParams("feature_importance", "true")
                            .build();

            PredictResponse response = client.predict(request);

            System.out.println("Prediction results:");
            for (AnnotationPayload annotationPayload : response.getPayloadList()) {
                TablesAnnotation tablesAnnotation = annotationPayload.getTables();
                System.out.format(
                        "Classification label: %s%n", tablesAnnotation.getValue().getStringValue());
                System.out.format("Classification score: %.3f%n", tablesAnnotation.getScore());
                // Get features of top importance
                tablesAnnotation
                        .getTablesModelColumnInfoList()
                        .forEach(
                                info ->
                                        System.out.format(
                                                "\tColumn: %s - Importance: %.2f%n",
                                                info.getColumnDisplayName(), info.getFeatureImportance()));
            }
        }
    }
}
For testing purposes, I used Google's test dataset (gs://cloud-ml-tables-data/bank-marketing.csv) and used the code above to send a prediction request.

Connect To MongoDB using Apache Mahout

I'm trying to generate recommendations using Apache Mahout, using MongoDB to create the data model as per the MongoDBDataModel. My code is as follows:
import java.net.UnknownHostException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

import com.mongodb.MongoException;

public class usingMongo {

    public static void main(String[] args) throws UnknownHostException, MongoException, TasteException {
        final long startTime = System.nanoTime();

        MongoDBDataModel model = new MongoDBDataModel("AdamsLaptop", 27017,
                "test", "ratings100k", false, false, null);
        System.out.println("connected to mongo ");

        UserSimilarity UserSim = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.5, UserSim, model);
        UserBasedRecommender UserRecommender = new GenericUserBasedRecommender(model, neighborhood, UserSim);
        List<RecommendedItem> UserRecommendations = UserRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : UserRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID()
                    + " as a user similar to you also rated it " + recommendation.getValue() + " USER");
        }

        ItemSimilarity ItemSim = new PearsonCorrelationSimilarity(model); // LogLikelihoodSimilarity(model);
        GenericItemBasedRecommender ItemRecommender = new GenericItemBasedRecommender(model, ItemSim);
        List<RecommendedItem> ItemRecommendations = ItemRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : ItemRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID()
                    + " as a user similar to you also rated it " + recommendation.getValue() + " ITEM");
        }

        final long duration = System.nanoTime() - startTime;
        System.out.println(duration);
    }
}
I can't see where I've gone wrong; despite numerous changes and lots of trial and error, the error message remains the same:
Exception in thread "main" java.lang.NullPointerException
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.getID(MongoDBDataModel.java:743)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.buildModel(MongoDBDataModel.java:570)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.<init>(MongoDBDataModel.java:245)
at recommender.usingMongo.main(usingMongo.java:24)
Any suggestions? Here's an example of my data within MongoDB:
{ "_id" : ObjectId("56ddf61f5960960c333f3dcb"),"userId" : 1, "movieId" : 292, "rating" : 4, "timestamp" : 847116936 }
I successfully integrated MongoDB data with Mahout.
The structure of the data in MongoDB depends on the kind of similarity algorithm you use. For example:
UserSimilarity
MongoDBDataModel datamodel = new MongoDBDataModel("127.0.0.1", 27017, "testing", "ratings", true, true, null);
where user_id and item_id are integer values, preference is a float value, and created_at is a timestamp.
SVDRecommender
where user_id and item_id are MongoDB ObjectIds, preference is a float value, and created_at is a timestamp.
The obvious troubleshooting step is to check whether the MongoDB server is running. As per the exception, it is running, so I think the problem lies in the structure of your data.
Use user_id instead of userId, item_id instead of itemId, and preference instead of rating. I don't know if this will make any difference; I followed one of the tutorials online, but can't find it at the moment.
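For illustration, a document shaped for MongoDBDataModel's default field names would look something like this (values carried over from the sample in the question):
{ "user_id" : 1, "item_id" : 292, "preference" : 4.0, "created_at" : 847116936 }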
It's working, but it's too slow when I have more than 10,000 users with 1,000 items.
I think the problem is that Mahout assumes default names for some fields that need to reside in your MongoDB instance: the item ID, user ID, and preference, which default to item_id, user_id, and preference. The solution might lie in using another MongoDBDataModel constructor that lets you pass the names of those fields in your MongoDB instance as parameters, or in redesigning your collection schema.
I hope that makes sense.
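As a hedged sketch of that constructor-based approach (the exact parameter list varies across Mahout versions, so verify it against the MongoDBDataModel Javadoc for your release), it would look roughly like this:
// Assumption: an overload that also accepts the user ID, item ID, and preference
// field names plus an optional ID-mapping collection; check your version's Javadoc.
MongoDBDataModel model = new MongoDBDataModel(
        "AdamsLaptop", 27017, "test", "ratings100k",
        false, false, null,
        "userId",   // field holding the user ID in your collection
        "movieId",  // field holding the item ID
        "rating",   // field holding the preference value
        null);      // optional user/item ID mapping collection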

Spring Data MongoDB Aggregation $out pipeline

I'm working with Spring Data MongoDB and using aggregation to fetch documents.
List<AggregationOperation> operationsList = new ArrayList<AggregationOperation>();
operationsList.add(Aggregation.unwind("calendarEvent"));
operationsList.add(Aggregation.match(criteria));
operationsList.add(getMacroEventProjectionFields());
if (start <= 0) {
    start = 1;
}
operationsList.add(Aggregation.skip(start - 1));
if (limit > 0) {
    operationsList.add(Aggregation.limit(limit));
}
Aggregation aggregation = Aggregation.newAggregation(operationsList);
AggregationResults<MacroEvent> groupResults = mongoTemplate.aggregate(
        aggregation, mongoTemplate.getCollectionName(KALiEvent.class),
        MacroEvent.class);
The collection contains thousands of records, and it was working fine until yesterday. But today it has started throwing the following exception:
Error [exception: aggregation result exceeds maximum document size (16MB)]
I Googled this issue and found that an aggregation can only return results up to the maximum MongoDB document size, which is 16 MB. The only workaround I found for this is to use the $out pipeline stage, which does the following:
Takes the documents returned by the aggregation pipeline and writes them to a specified collection. The $out operator must be the last stage in the pipeline. The $out operator lets the aggregation framework return result sets of any size.
But I just can't figure out how this can be done using Spring Data MongoDB, and how I can use $out in the code pasted above.
In order to avoid that 16 MB exception, I've created my own OutOperation class that performs the MongoDB $out pipeline stage:
package com.myop;

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import org.springframework.data.mongodb.core.aggregation.AggregationOperation;
import org.springframework.data.mongodb.core.aggregation.AggregationOperationContext;

public class OutOperation implements AggregationOperation {

    /**
     * The name of the collection where the aggregation results will be stored.
     */
    private String collectionName;

    public OutOperation(String collectionName) {
        this.collectionName = collectionName;
    }

    @Override
    public DBObject toDBObject(AggregationOperationContext context) {
        return new BasicDBObject("$out", collectionName);
    }
}
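A hedged usage sketch: add the OutOperation as the last stage (the output collection name "aggregatedEvents" is hypothetical, and Query is org.springframework.data.mongodb.core.query.Query), run the aggregation, then read the results back from the output collection in pages instead of as one oversized result:
// $out must be the last stage in the pipeline.
operationsList.add(new OutOperation("aggregatedEvents"));
Aggregation aggregation = Aggregation.newAggregation(operationsList);
mongoTemplate.aggregate(aggregation,
        mongoTemplate.getCollectionName(KALiEvent.class), MacroEvent.class);

// The documents now live in "aggregatedEvents"; page through them rather than
// receiving everything as a single 16 MB-capped aggregation result.
List<MacroEvent> firstPage = mongoTemplate.find(
        new Query().limit(100), MacroEvent.class, "aggregatedEvents");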

Parsing Information from URL Using Jsoup

I need help with my Java project using Jsoup (if you think there is a more efficient way to achieve the purpose, please let me know). The purpose of my program is to parse certain useful information from different URLs and put it in a text file. I am not an expert in HTML or JavaScript, so it has been difficult for me to code in Java exactly what I want to parse.
On the website referenced in the code below, which serves as one of the examples, the information I'm interested in parsing with Jsoup is everything you can see in the table under “Routing” (Route, Location, Vessel/Voyage, Container Arrival Date, Container Departure Date; i.e. Origin, Seattle SSA Terminal T18, 26 Jun 15 A, 26 Jun 15 A... and so on).
So far, with Jsoup we have only been able to parse the title of the website; we have been unsuccessful in getting any of the body.
Here is the code that I used, which I got from an online source:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Jsouptest71115 {

    public static void main(String[] args) throws Exception {
        String url = "http://google.com/gentrack/trackingMain.do "
                + "?trackInput01=999061985";
        Document document = Jsoup.connect(url).get();

        String title = document.title();
        System.out.println("title : " + title);

        String body = document.select("body").text();
        System.out.println("Body: " + body);
    }
}
Working code:
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.ArrayList;

public class Sample {

    public static void main(String[] args) {
        String url = "http://homeport8.apl.com/gentrack/blRoutingPopup.do";
        try {
            Connection.Response response = Jsoup.connect(url)
                    .data("blNbr", "999061985") // tracking number
                    .method(Connection.Method.POST)
                    .execute();

            Element tableElement = response.parse().getElementsByTag("table")
                    .get(2).getElementsByTag("table")
                    .get(2);

            Elements trElements = tableElement.getElementsByTag("tr");
            ArrayList<ArrayList<String>> tableArrayList = new ArrayList<>();

            for (Element trElement : trElements) {
                ArrayList<String> columnList = new ArrayList<>();
                for (int i = 0; i < 5; i++) {
                    columnList.add(i, trElement.children().get(i).text());
                }
                tableArrayList.add(columnList);
            }

            System.out.println("Origin/Location: "
                    + tableArrayList.get(1).get(1)); // row and column number
            System.out.println("Discharge Port/Container Arrival Date: "
                    + tableArrayList.get(5).get(3));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Output:
Origin/Location: SEATTLE SSA TERMINAL (T18), WA  
Discharge Port/Container Arrival Date: 23 Jul 15  E
You need to utilize document.select("..."), a method whose input is a CSS selector. To learn more about CSS selectors, just Google them, or read this. Using CSS selectors you can easily identify parts of a web page's body.
In your particular case you will have a different problem, though: the table you are after is inside an iframe. If you look at the HTML of the web page you are visiting, its (the iframe's) URL is "http://homeport8.apl.com/gentrack/blRoutingFrame.do", and if you visit this URL directly to access its content, you will get an exception, which is perhaps some restriction on the server's side. To get the content properly you need to visit two URLs via Jsoup: 1. http://homeport8.apl.com/gentrack/trackingMain.do?trackInput01=999061985 and 2. http://homeport8.apl.com/gentrack/blRoutingFrame.do?trackInput01=999061985
From the first URL you'll get nothing useful, but from the second URL you'll get the tables of interest. Then try using document.select("table"), which will give you a list of tables; iterate over this list and find the table of your interest. Once you have the table, use Element.select("tr") to get its rows, and then for each "tr" use Element.select("td") to get the cell data.
The web page you are visiting doesn't use CSS class and id selectors, which would have made reading it with Jsoup a lot easier, so I'm afraid iterating over document.select("table") is your best and easiest option, as in the sketch below.
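A minimal sketch of that iteration (the URLs and tracking number come from the question; carrying the first response's cookies over to the frame request is an assumption about why the frame URL rejects direct visits):
// Fetch the main page first, then the iframe content with the same session cookies.
Connection.Response main = Jsoup.connect(
        "http://homeport8.apl.com/gentrack/trackingMain.do?trackInput01=999061985").execute();
Document doc = Jsoup.connect(
        "http://homeport8.apl.com/gentrack/blRoutingFrame.do?trackInput01=999061985")
        .cookies(main.cookies())
        .get();
// Walk every table -> row -> cell and print it, to locate the routing table.
for (Element table : doc.select("table")) {
    for (Element tr : table.select("tr")) {
        StringBuilder row = new StringBuilder();
        for (Element td : tr.select("td")) {
            row.append(td.text()).append(" | ");
        }
        System.out.println(row);
    }
}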
Good Luck.
