I'm using ORMLite to manage database tables which contain lists of lookup values for a data collection application. These lookup values are periodically updated from a remote server. However, I'd like to be able to preserve the data in a specific column while creating or updating the records, since I would like to store usage counts (specific to the device) associated with each lookup value. Here's how I'm updating the records:
// build list of new records
final List<BaseLookup> rows = new ArrayList<BaseLookup>();
for (int i = 0; i < jsonRows.length(); i++) {
    JSONObject jsonRow = jsonRows.getJSONObject(i);
    // parse jsonRow into a new BaseLookup object and add to rows
    ...
}
// add the new records
dao.callBatchTasks(new Callable<Void>() {
    public Void call() throws Exception {
        for (BaseLookup row : rows) {
            // this is where I'd like to preserve the existing
            // value (if any) of the "usageCount" column
            Dao.CreateOrUpdateStatus result = dao.createOrUpdate(row);
        }
        return null;
    }
});
I've considered fetching and merging each record individually within the loop, but this seems like it would perform poorly (some tables have a few thousand records). Is there a simpler or more integrated way to accomplish this?
I'd like to be able to preserve the data in a specific column while creating or updating the records, since I would like to store usage counts (specific to the device) associated with each lookup value
If you have to update certain columns from the JSON data but you want to set usageCount to usageCount + 1, then you have a couple of options.
You could build an update statement using the dao.updateBuilder() method and the UpdateBuilder class, update the columns to their new values (and usageCount to usageCount + 1) where the id matches, and watch the return value to make sure a row was updated. If none were, then you create the object.
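A minimal sketch of that UpdateBuilder approach, assuming the dao is declared as Dao<BaseLookup, Integer> and using illustrative "id", "name" and "usageCount" column names (adapt them to the real BaseLookup mapping):
UpdateBuilder<BaseLookup, Integer> ub = dao.updateBuilder();
ub.updateColumnValue("name", row.getName());        // columns that come from the JSON
// usageCount is deliberately not listed, so its existing value is preserved;
// to increment it instead you could use:
// ub.updateColumnExpression("usageCount", "usageCount + 1");
ub.where().eq("id", row.getId());
if (ub.update() == 0) {
    dao.create(row); // no existing row matched the id, so insert the new one
}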
However, it would be easier to just (sketched below):
get the BaseLookup from the database
if null, call dao.create() to persist a new entry
otherwise update columns and increment the usageCount
and save it back with a dao.update(...)
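Here is a rough sketch of that approach inside the existing batch task. It assumes BaseLookup exposes getId(), getUsageCount() and setUsageCount(); adjust to the real class:
for (BaseLookup row : rows) {
    BaseLookup existing = dao.queryForId(row.getId());
    if (existing == null) {
        dao.create(row);                              // brand new lookup value
    } else {
        row.setUsageCount(existing.getUsageCount());  // carry over the device-local count
        dao.update(row);                              // overwrite the rest from the server
    }
}
Because this runs inside callBatchTasks(), the per-row query and update are wrapped in a single transaction (or batch), so the cost should stay reasonable even for a few thousand rows.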
Related
I have a schema created in Apache Ignite with 10 columns, three of which are indexed (say A and B are string type, C is int type). The total number of rows is around 40,000,000. Here is how I create the cache table:
CacheConfiguration<AffinityKey<Long>, Object> cacheCfg = new CacheConfiguration<>();
cacheCfg.setName(CACHE_NAME);
cacheCfg.setDataRegionName("MY_DATA_REGION");
cacheCfg.setBackups(1);
QueryEntity queryEntity = new QueryEntity(AffinityKey.class, Object.class)
        .setTableName("DataCache")
        .addQueryField("Field_A", String.class.getName(), null)
        .addQueryField("Field_B", String.class.getName(), null)
        .addQueryField("Field_C", Integer.class.getName(), null)
        .addQueryField("Field_D", Integer.class.getName(), null);
List<QueryIndex> queryIndices = new ArrayList<>();
List<String> groupIndices = new ArrayList<>();
groupIndices.add("Field_A");
groupIndices.add("Field_B");
groupIndices.add("Field_C");
queryIndices.add(new QueryIndex(groupIndices, QueryIndexType.SORTED));
queryEntity.setIndexes(queryIndices);
cacheCfg.setQueryEntities(Arrays.asList(queryEntity));
ignite.getOrCreateCache(cacheCfg);
I'm trying to query the Ignite cache with a SQL statement like
select * from DataCache where
Field_A in (...) and Field_B in (...) and Field_C in (...)
with each IN clause containing roughly 1,000 to 5,000 values. The query is not fast, even slower than querying Google BigQuery directly. I just wonder if there's any way to improve the query performance when using IN clauses in SQL.
You don't say how you created your table, but I'm guessing you have three indexes, one on each column. I suspect you'll need a group index, i.e. one index across all three columns. With so many elements in your IN clauses, it may also be beneficial to rewrite them as a JOIN.
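As a rough illustration of the JOIN rewrite (a sketch, not a drop-in solution): Ignite's H2-based SQL engine supports joining against inline key tables via TABLE(...), which avoids building giant IN lists. The field and cache names below come from the question; the value arrays are placeholders and each list is assumed to contain distinct values:
String[] aVals = { /* Field_A values */ };
String[] bVals = { /* Field_B values */ };
Integer[] cVals = { /* Field_C values */ };

SqlFieldsQuery qry = new SqlFieldsQuery(
        "SELECT d.* FROM DataCache d " +
        "JOIN TABLE(a VARCHAR = ?) ta ON d.Field_A = ta.a " +
        "JOIN TABLE(b VARCHAR = ?) tb ON d.Field_B = tb.b " +
        "JOIN TABLE(c INT = ?) tc ON d.Field_C = tc.c");
qry.setArgs(aVals, bVals, cVals);

List<List<?>> rows = ignite.cache(CACHE_NAME).query(qry).getAll();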
I am using Hadoop MapReduce to process XML files, and I am storing the resulting JSON data directly in MongoDB. How can I ensure that only non-duplicate records are stored in the database when executing the BulkWriteOperation?
The duplicate criteria are based on product image and product name. I do not want to use a Morphia layer, where we could assign indexes to the class members.
Here is my reducer class:
public class XMLReducer extends Reducer<Text, MapWritable, Text, NullWritable> {
    private static final Logger LOGGER = Logger.getLogger(XMLReducer.class);

    protected void reduce(Text key, Iterable<MapWritable> values, Context ctx)
            throws IOException, InterruptedException {
        LOGGER.info("reduce()------Start for key>" + key);
        Map<String, String> insertProductInfo = new HashMap<String, String>();
        try {
            MongoClient mongoClient = new MongoClient("localhost", 27017);
            DB db = mongoClient.getDB("test");
            BulkWriteOperation operation = db.getCollection("product").initializeOrderedBulkOperation();
            for (MapWritable entry : values) {
                for (Entry<Writable, Writable> extractProductInfo : entry.entrySet()) {
                    insertProductInfo.put(extractProductInfo.getKey().toString(),
                            extractProductInfo.getValue().toString());
                }
                if (!insertProductInfo.isEmpty()) {
                    BasicDBObject basicDBObject = new BasicDBObject(insertProductInfo);
                    operation.insert(basicDBObject);
                }
            }
            // How can I check for duplicates before executing the bulk operation?
            operation.execute();
            LOGGER.info("reduce------end for key" + key);
        } catch (Exception e) {
            LOGGER.error("General Exception in XMLReducer", e);
        }
    }
}
EDIT: After the suggested answer I have added:
BasicDBObject query = new BasicDBObject("product_image", basicDBObject.get("product_image"))
.append("product_name", basicDBObject.get("product_name"));
operation.find(query).upsert().updateOne(new BasicDBObject("$setOnInsert", basicDBObject));
operation.insert(basicDBObject);
I am getting an error like: com.mongodb.MongoInternalException: no mapping found for index 0
Any help will be appreciated. Thanks.
I suppose it all depends on what you really want to do with the "duplicates" here as to how you handle it.
For one, you can always use .initializeUnorderedBulkOperation(), which won't abort the whole batch on a duplicate key from your index (which you need in order to stop duplicates); the remaining operations are still attempted and the duplicate failures are reported when you call .execute():
BulkWriteResult result = operation.execute();
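A hedged sketch of that first option. It assumes a unique compound index on product_image + product_name (the duplicate criteria from the question), created once, e.g. at job setup. Note that in the Java driver the duplicate-key failures surface as a BulkWriteException thrown by execute(); with an unordered bulk the non-duplicate documents are still written:
DBCollection product = db.getCollection("product");
product.createIndex(new BasicDBObject("product_image", 1).append("product_name", 1),
        new BasicDBObject("unique", true));

BulkWriteOperation bulk = product.initializeUnorderedBulkOperation();
// ... add bulk.insert(basicDBObject) for each document, as in the reducer ...
try {
    BulkWriteResult result = bulk.execute();
    LOGGER.info("inserted: " + result.getInsertedCount());
} catch (BulkWriteException bwe) {
    LOGGER.info("duplicates skipped: " + bwe.getWriteErrors().size());
}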
On the other hand, you can just use "upserts" instead and use operators such as $setOnInsert to only make changes where no duplicate existed:
BasicDBObject basicdbobject = new BasicDBObject(insertProductInfo);
BasicDBObject query = new BasicDBObject("key", basicdbobject.get("key"));
operation.find(query).upsert().updateOne(new BasicDBObject("$setOnInsert", basicdbobject));
So you basically look up the value of the field that holds the "key" to determine a duplicate with a query, and only actually change data where that "key" was not found, in which case a new document is "inserted".
In either case the default behaviour here is to "insert" the first unique "key" value and then ignore all other occurrences. If you want to do other things, like "overwrite" or "increment" values where the same key is found, then the .update() "upsert" approach is the one you want, but you will use other update operators for those actions.
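For the "increment where the same key is found" case just mentioned, a small sketch: $setOnInsert writes the document only when it is new, while $inc bumps a counter either way. The seen_count field is purely illustrative; the key fields follow the question's edit:
BasicDBObject query = new BasicDBObject("product_image", basicDBObject.get("product_image"))
        .append("product_name", basicDBObject.get("product_name"));
BasicDBObject update = new BasicDBObject("$setOnInsert", basicDBObject)
        .append("$inc", new BasicDBObject("seen_count", 1));
operation.find(query).upsert().updateOne(update);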
How do I compare a list of records against the database? I have more than 1,000 records in a list and need to validate them against the database. How do I validate each record from the list against the database? Should I select all the data from the database into a list and then compare the values? Please advise.
The code below lists the values to validate against the database.
private void validatepart(HttpServletRequest req, Vector<String> errors) {
    Parts bean = (Parts) req.getAttribute("partslist");
    Vector<PartInfo> partList = bean.getPartList();
    int sz = partList.size();
    for (int i = 0; i < sz; i++) {
        PartInfo part = (PartInfo) partList.elementAt(i);
        System.out.println(part.getNumber());
        System.out.println(part.getName());
    }
}
This depends on what you mean by compare. If it's just one field, you can execute a query such as select * from parts_table where part_number = ?, and it's not much of a stretch to add more fields to that query. If nothing is returned, you know the record doesn't exist.
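A small sketch of that per-record check, assuming a JDBC Connection named conn is available; the table and column names (parts_table, part_number, part_name) are placeholders:
try (PreparedStatement ps = conn.prepareStatement(
        "SELECT 1 FROM parts_table WHERE part_number = ? AND part_name = ?")) {
    ps.setString(1, part.getNumber());
    ps.setString(2, part.getName());
    try (ResultSet rs = ps.executeQuery()) {
        if (!rs.next()) {
            errors.add("No match in database for part " + part.getNumber());
        }
    }
}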
If you need to compare and know exactly which values are different, then you can try something like this:
List<String> compareObjects(PartInfo filePart, PartInfo dbPart) {
    List<String> different = new LinkedList<String>();
    if (!filePart.getNumber().equals(dbPart.getNumber())) {
        different.add("number");
    }
    // repeat for all your fields
    return different;
}
If your list of objects that you need to validate against the database includes a primary key, then you could just build a list of those primary key values and run a query like:
SELECT <PRIMARY KEY FIELD> FROM <TABLE> WHERE <PRIMARY KEY FIELD> IN (<LIST OF PRIMARY KEYS>) ORDER BY <PRIMARY KEY FIELD> ASC;
Once you get that list back, you can compare the results. My instinct would be to put your data (and the query results too) into Set objects and then call removeAll() to get the items not in the database (reverse this for the items in the database but not in your set):
yourDataSet.removeAll(queryResults);
This assumes that you have an equals() method implemented in your PartInfo object. You can see the Java API documentation for more details.
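A rough sketch of that set-difference idea, keyed on the part number alone so no custom equals() is needed; PART_NUMBER is a placeholder column name and rs is assumed to be the ResultSet of the IN query above:
Set<String> yourDataSet = new HashSet<String>();
for (PartInfo part : partList) {
    yourDataSet.add(part.getNumber());
}
Set<String> queryResults = new HashSet<String>();
while (rs.next()) {
    queryResults.add(rs.getString("PART_NUMBER"));
}
yourDataSet.removeAll(queryResults); // what's left is missing from the database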
I have a requirement to remove duplicate values from a result set, based on some unique identifier.
while (resultSet.next()) {
    int seqNo = resultSet.getInt("SEQUENCE_NO");
    String tableName = resultSet.getString("TABLE_NAME");
    String columnName = resultSet.getString("COLUMN_NAME");
    String filter = resultSet.getString("FILTER");
}
From the above iteration, I am getting 2 rows from the result set. They have the same sequence number, the same table name and the same filter, but different column names:
1 PRODUCTFEES CHARGETYPE PRODUCTID
1 PRODUCTFEES PRODUCTCODE PRODUCTID
My requirement is to remove the duplicated table name, sequence number and filter. I want to get output something like the following:
1 PRODUCTFEES CHARGETYPE PRODUCTCODE PRODUCTID
From the example you provide, it seems like you want to output all distinct values of each column individually (there are 4 columns in the table, but you output 5 values).
Since the question is tagged java, one approach would be to use an implementation of Set for each of the columns, so that duplicates won't get through, and then output all the elements of each Set:
LinkedHashSet[] sets = new LinkedHashSet[] {
        new LinkedHashSet(),
        new LinkedHashSet(),
        new LinkedHashSet(),
        new LinkedHashSet() };
while (resultSet.next()) {
    sets[0].add(resultSet.getInt("SEQUENCE_NO"));
    sets[1].add(resultSet.getString("TABLE_NAME"));
    sets[2].add(resultSet.getString("COLUMN_NAME"));
    sets[3].add(resultSet.getString("FILTER"));
}
StringBuilder buf = new StringBuilder();
for (LinkedHashSet set : sets) {
    for (Object value : set) {      // append all distinct values of each set
        buf.append(value).append(' ');
    }
}
But it might be simpler to address this in the SQL query itself: either run SELECT DISTINCT columnX for each of the columns and output the results without further manipulation, or use an aggregation function that concatenates all distinct values. The implementation depends heavily on the DBMS you're using (GROUP_CONCAT for MySQL, LISTAGG for Oracle, ...). A similar question for Oracle: How to use Oracle's LISTAGG function with a unique filter?
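For example, on MySQL the whole reshaping can be pushed into the query (a sketch only; MY_TABLE is a placeholder and the column names follow the question):
String sql = "SELECT SEQUENCE_NO, TABLE_NAME, "
        + "GROUP_CONCAT(DISTINCT COLUMN_NAME SEPARATOR ' ') AS COLUMN_NAMES, FILTER "
        + "FROM MY_TABLE GROUP BY SEQUENCE_NO, TABLE_NAME, FILTER";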
Based on the different outputs, I'd say you not only need to remove duplicates but also regroup the data from them.
In that case you need to fill a new data structure inside the while(resultSet.next()) loop, and afterwards loop over the newly arranged data and produce the output accordingly.
In pseudocode this would be as follows:
while resultset.next()
    if newdata-array has unique key
        add column-name to found entry in newdata-array
    else
        create new entry in newdata-array with column-name

while newdata-array.next()
    output seq, table-name
    while entry.column-names.next()
        output column-name
    output product-id
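A possible Java rendering of that pseudocode, grouping the column names under a composite key of sequence number, table name and filter (column names are taken from the question; requires Java 8):
Map<String, List<String>> grouped = new LinkedHashMap<>();
while (resultSet.next()) {
    String groupKey = resultSet.getInt("SEQUENCE_NO") + " "
            + resultSet.getString("TABLE_NAME") + " "
            + resultSet.getString("FILTER");
    grouped.computeIfAbsent(groupKey, k -> new ArrayList<>())
            .add(resultSet.getString("COLUMN_NAME"));
}
for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
    String[] parts = entry.getKey().split(" ");
    // sequence number and table name, then every column name, then the filter
    System.out.println(parts[0] + " " + parts[1] + " "
            + String.join(" ", entry.getValue()) + " " + parts[2]);
}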
I have 6 columns in a table. I have a select query which selects some records from the table. While iterating over the result set, I'm using the following logic to extract the values of the columns:
Statement select = conn.createStatement();
ResultSet result = select.executeQuery(
        "SELECT * FROM D724933.ECOCHECKS WHERE ECO = '" + localeco + "' AND CHK_TOOL = '" + checknames[i] + "'");
while (result.next()) { // process results one row at a time
    String eco = result.getString(1);
    mapp2.put("ECO", eco);
    String chktool = result.getString(2);
    mapp2.put("CHECK_TOOL", chktool);
    String lastchktime = result.getString(3);
    mapp2.put("LAST_CHECK_TIME", lastchktime);
    String status = result.getString(4);
    mapp2.put("STATUS", status);
    String statcmts = result.getString(5);
    mapp2.put("STATUS_COMMENTS", statcmts);
    String details = result.getString(6);
    mapp2.put("DETAILS_FILE", details);
}
I have 2 questions here:
1. Is there any better approach than using result.getString()?
2. Let's say another column gets added to the table at a later point. Is there any way my code can handle this new addition without my having to change it at that point?
You can use ResultSetMetaData to determine the number and names of the columns in your ResultSet and handle them that way. Note, however, that changing the number of columns in the database in a way that affects your code, and expecting the code to still work, may not always be a good idea.
Additionally, note that you're overwriting the values in your map on each iteration of the loop. You probably want to add those maps to some sort of List.
Finally, keep in mind that getString can return null for nullable columns, so guard against that if downstream code does not expect null values in the map.
Statement select = conn.createStatement();
ResultSet result = select.executeQuery(
        "SELECT * FROM D724933.ECOCHECKS WHERE ECO = '" + localeco + "' AND CHK_TOOL = '" + checknames[i] + "'");
ResultSetMetaData rsmd = result.getMetaData();
int numberOfColumns = rsmd.getColumnCount();
List<Map<String, String>> data = new ArrayList<Map<String, String>>();
while (result.next()) { // process results one row at a time
    Map<String, String> mapp2 = new HashMap<String, String>();
    for (int col = 1; col <= numberOfColumns; col++) {
        mapp2.put(rsmd.getColumnName(col), result.getString(col));
    }
    data.add(mapp2);
}
Each of the get family of methods on ResultSet has an overloaded variant that takes a column name as argument. You can use this instead to reduce reliance on ordering of columns.
ResultSet results = ...;
results.getString(1);
You could do this:
results.getString("name");
But the preferred way of handling this sort of problem is to impose an ordering of your own on the result set, by explicitly selecting the columns you want in the initial query.
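For example, naming the columns in the query pins their positions regardless of how the table evolves. Only ECO and CHK_TOOL are known from the question; the remaining column names are placeholders:
ResultSet result = select.executeQuery(
        "SELECT ECO, CHK_TOOL, STATUS, DETAILS_FILE "
        + "FROM D724933.ECOCHECKS WHERE ECO = '" + localeco + "'");
while (result.next()) {
    String eco = result.getString(1);      // position 1 is ECO because the query says so
    String chkTool = result.getString(2);  // position 2 is CHK_TOOL
    // columns added to the table later cannot shift these positions
}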
If your table adds a new column then obviously you have to change your code, because your code uses a hardcoded index, i.e. getString(1). Instead, use ResultSetMetaData's getColumnCount() and add some logic to fetch that many column values dynamically.
On your first question: ResultSet has getXXX() methods with two kinds of parameters, a String column name and an int column index. You used the index rather than the column name; the index performs slightly faster.
It is bad practice to use SELECT *; instead, you should select only the columns you are interested in. The reason is exactly what you mentioned: what happens if your DB changes? You don't want to go through the whole codebase to find and edit every SELECT * statement.
You don't need to put the result into your own map because you can already do:
result.getString("DETAILS_FILE");
But there are already other answers explaining that.
It would further help to use a constant instead of the string "DETAILS_FILE". You can use the constant in the SELECT and in result.getString(). If your DB changes, you only need to introduce a new constant or change an existing one.
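A small sketch of that constant approach; it assumes DETAILS_FILE is also the real column name in the table (swap in whatever the schema actually uses):
private static final String DETAILS_FILE = "DETAILS_FILE";

String sql = "SELECT " + DETAILS_FILE + " FROM D724933.ECOCHECKS WHERE ECO = '" + localeco + "'";
ResultSet result = select.executeQuery(sql);
while (result.next()) {
    mapp2.put(DETAILS_FILE, result.getString(DETAILS_FILE));
}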