MongoDB insert in multiple threads - java

I am using MongoDB as my database. When I insert a document and then insert it again very shortly afterwards, the document ends up in the database twice, even though I check whether the database already contains it before inserting. I think the reason is that I run the update method asynchronously: the first write takes some time, so when the second call checks whether the document exists, the first insert has not finished yet.
Update method:
public static void updateAndInsert(final String collection, final String where, final String whereValue, final DBObject value)
{
    Utils.runAsync(new Runnable()
    {
        @Override
        public void run()
        {
            if (!contains(collection, where, whereValue))
                insert(collection, value);
            else
                db.getCollection(collection).update(new BasicDBObject(where, whereValue), new BasicDBObject("$set", value));
        }
    });
}
How can I make sure it only inserts it once?

A question without a question. Wow! :D
You shouldn't do it that way, because there are no transactions in MongoDB. But you do have atomic operations on single documents.
Better to use an upsert. In the query part of the upsert you specify the same condition you currently check in your contains method. (Maybe have a look here: http://techidiocy.com/upsert-mongodb-java-example/ or just google for MongoDB and upsert.)
This way the contains check, the insert and the update all happen in a single query. That's the way you should do it with MongoDB!
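For illustration, a minimal sketch of that upsert with the legacy driver the question already uses (same collection/field names as in your method; the third argument of update enables upsert, the fourth disables multi-update):

public static void upsert(final String collection, final String where, final String whereValue, final DBObject value)
{
    // One atomic operation: inserts the document when no match exists,
    // otherwise applies the $set update to the matching document.
    db.getCollection(collection).update(
            new BasicDBObject(where, whereValue),  // query - the old "contains" check
            new BasicDBObject("$set", value),      // update to apply
            true,                                  // upsert: insert if nothing matches
            false);                                // multi: touch at most one document
}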

Related

How to replace whole SQL table data frequently?

I have a Spring application that runs a cron job. Every few minutes the cron fetches new data from an external API. The data should be stored in a database (MySQL) in place of the old data, i.e. the old data should be overwritten rather than updated. The application also exposes a REST API so clients can read the data from the database, so there must never be a moment where a client sees empty or partial data because an update is in progress.
So far I have tried deleting all the old data and inserting the new data via Spring Data's deleteAll and saveAll methods, but there is a window in which a client can read only part of the data.
@Override
@Transactional
public List<Country> overrideAll(@NonNull Iterable<Country> countries) {
    removeAllAndFlush();
    List<CountryEntity> countriesToCreate = stream(countries.spliterator(), false)
            .map(CountryEntity::from)
            .collect(toList());
    List<CountryEntity> createdCountries = repository.saveAll(countriesToCreate);
    return createdCountries.stream()
            .map(CountryEntity::toCountry)
            .collect(toList());
}

private void removeAllAndFlush() {
    repository.deleteAll();
    repository.flush();
}
I also thought about loading the new data into a temporary table and, once the data is complete, simply replacing the main table with the temporary one. Is that a good idea? Any other ideas?
It's a good idea. You can minimize the downtime by building the data in another table until it's ready and then switching tables quickly by renaming. This also improves perceived performance for users, because no records need to be locked the way they are with UPDATE/DELETE.
In MySQL you can use RENAME TABLE if you don't have triggers on the table. It can rename multiple tables at once and it works atomically (like a transaction: if any error happens, no change is made). For example:
RENAME TABLE countries TO countries_old, countries_new TO countries;
DROP TABLE countries_old;
Refer here for more details
https://dev.mysql.com/doc/refman/5.7/en/rename-table.html
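As a rough sketch of how the swap could be wired from the Spring side (the table name countries_new and the use of a plain JdbcTemplate are illustrative, not taken from the original code):

// Build the fresh data in a shadow table, then swap it in atomically.
jdbcTemplate.execute("CREATE TABLE countries_new LIKE countries");

// ... bulk-insert the freshly fetched rows into countries_new here ...

// Readers keep hitting the old table until the rename happens, then see the new data at once.
jdbcTemplate.execute("RENAME TABLE countries TO countries_old, countries_new TO countries");
jdbcTemplate.execute("DROP TABLE countries_old");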

Java method for MongoDB collection.save()

I'm having a problem with MongoDB in Java when I try to add documents with a customized _id field. When I insert a new document into the collection, I want the document to be ignored if its _id already exists.
In the Mongo shell, collection.save() can be used for this, but I cannot find the equivalent method in the MongoDB Java driver.
Just to add an example: I have a collection of documents containing website information, with the URL as the _id field (which is unique).
I want to add some more documents. Some of them might already exist in the collection, so I want to add all the new documents except the duplicate ones.
This can be achieved with collection.save() in the Mongo shell, but I can't find the equivalent method in the MongoDB Java driver.
Hopefully someone can share the solution. Thanks in advance!
In the MongoDB Java driver, you could try using a BulkWriteOperation obtained from the initializeUnorderedBulkOperation() method of the DBCollection object (the one that represents your collection); an unordered bulk operation keeps processing the remaining writes when individual ones fail. It is used as follows:
MongoClient mongo = new MongoClient("localhost", port_number);
DB db = mongo.getDB("db_name");
DBCollection col = db.getCollection("collection_name"); // placeholder name for your collection

ArrayList<DBObject> objectList = new ArrayList<DBObject>(); // fill this list with your objects to insert

BulkWriteOperation operation = col.initializeUnorderedBulkOperation();
for (int i = 0; i < objectList.size(); i++) {
    operation.insert(objectList.get(i));
}
BulkWriteResult result = operation.execute();
With this method the documents are sent with per-document error handling, so a document with a duplicated _id produces an error as usual, but the operation still continues with the rest of the documents. Note that execute() throws a BulkWriteException if any write failed; the partial result, including how many documents were really inserted (getInsertedCount()), is then available from the exception's getWriteResult().
This can prove a bit inefficient if lots of data is inserted this way, though. This is just sample code (found on journaldev.com and edited to fit your situation); you may need to adapt it to your current configuration, and it is untested.
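A short sketch of reading the insert count in both the success and the partial-failure case, continuing from the snippet above:

try {
    BulkWriteResult result = operation.execute();
    System.out.println("Inserted: " + result.getInsertedCount());
} catch (BulkWriteException e) {
    // Some inserts failed (e.g. duplicate _id); the remaining ones were still applied.
    System.out.println("Inserted: " + e.getWriteResult().getInsertedCount());
    System.out.println("Failed:   " + e.getWriteErrors().size());
}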
I guess save is doing something like this.
fun save(doc: Document, col: MongoCollection<Document>) {
    if (doc.getObjectId("_id") == null) {
        doc.put("_id", ObjectId()) // generate a new id only when the document has none
    }
    // upsert: insert the document if no match exists, otherwise replace it
    col.replaceOne(Document("_id", doc.getObjectId("_id")), doc, ReplaceOptions().upsert(true))
}
Maybe they removed save so you decide how to generate the new id.
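For completeness, a rough Java equivalent of that idea with the current driver (a sketch, assuming a MongoCollection<Document> named col, not an official replacement for save):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOptions;
import org.bson.Document;
import org.bson.types.ObjectId;

static void save(MongoCollection<Document> col, Document doc) {
    if (doc.getObjectId("_id") == null) {
        doc.put("_id", new ObjectId()); // generate an id when the document has none
    }
    // Insert when the _id is not present yet, otherwise replace the existing document.
    col.replaceOne(Filters.eq("_id", doc.getObjectId("_id")), doc, new ReplaceOptions().upsert(true));
}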

Mybatis query returning incorrect results. Possible caching issue?

I have a JUnit test that uses MyBatis. At the start of the test I retrieve a count of the records in the table.
At the end of the test I expect an additional record to be present in the table, and have an assert to verify this condition. However I find that the second query returns exactly the same number of records as the first one did.
I know a new record has definitely been inserted in the table.
I thought this may be related to caching, so I tried flushing all caches associated with the session. I also tried using setCacheEnabled(false), but still the same result.
Here's my code fragment -
@Test
public void config_0_9() {
    session.getConfiguration().setCacheEnabled(false);
    cfgMapper = session.getMapper(CfgMapper.class);
    int oldRecords = cfgMapper.countByExample(null);

    messageReprocessor.processSuspendedMessages();
    session.commit();

    int newRecords = cfgMapper.countByExample(null);
    assertTrue(newRecords == oldRecords + 1);
}

Update multiple rows in MongoDB java Driver

I want to update multiple rows in my collection called "Users". Right now I am updating both rows separately, but I want to do the same in one query.
My current code:
coll.update(new BasicDBObject().append("key", k1), new BasicDBObject().append("$inc", new BasicDBObject().append("balance", 10)));
coll.update(new BasicDBObject().append("key", k2), new BasicDBObject().append("$inc", new BasicDBObject().append("balance", -10)));
How can I make these two separate updates in one statement?
First, let me translate your Java code to shell syntax so people can read it:
db.coll.update({key: k1}, {$inc:{balance:10}})
db.coll.update({key: k2}, {$inc:{balance:-10}})
Now, the reason you will never be able to do this in one update is that there is no way to provide a different update clause per matching document. You could group your updates so that you can do this (pseudo-ish):
set1 = getAllKeysForBalanceIncrease();
set2 = getAllKeysForBalanceDecrease();
db.coll.update({key:{$in:set1}}, {$inc:{balance:10}}, false, true)
db.coll.update({key:{$in:set2}}, {$inc:{balance:-10}}, false, true)
In other words, you can update multiple documents within one atomic write but the update operation will be static for all documents. So aggregating all documents that require the same update is your only optimization path.
The $in clause can be composed in Java through :
ObjectId[] oidArray = getAllKeysEtc();
query = new BasicDBObject("key", new BasicDBObject("$in", oidArray));
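Putting that together with the legacy driver (a sketch; getAllKeysForBalanceIncrease/Decrease are the hypothetical helpers from the pseudocode above, and the last two flags are upsert and multi, matching the shell calls):

ObjectId[] increaseKeys = getAllKeysForBalanceIncrease();
ObjectId[] decreaseKeys = getAllKeysForBalanceDecrease();

coll.update(new BasicDBObject("key", new BasicDBObject("$in", increaseKeys)),
        new BasicDBObject("$inc", new BasicDBObject("balance", 10)),
        false,  // upsert
        true);  // multi: update every matching document
coll.update(new BasicDBObject("key", new BasicDBObject("$in", decreaseKeys)),
        new BasicDBObject("$inc", new BasicDBObject("balance", -10)),
        false,
        true);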
In MongoDB you do not have transactions that span multiple documents. Only writes on a document are atomic.
But you can do updates with:
public WriteResult update(DBObject q,
                          DBObject o,
                          boolean upsert,
                          boolean multi)
But note, this will not be in a transaction.

Database deletes failed during inserts

I have two Java apps: one of them inserts records into Table1.
The second application reads the first N items and removes them.
When the first application inserts data intensively, the second one fails with CannotSerializeTransactionException whenever I try to delete rows. I don't see why this should be a problem: inserted items become visible to select/delete only once the insert transaction has finished. How can I fix it? Thanks.
TransactionTemplate tt = new TransactionTemplate(platformTransactionManager);
tt.setIsolationLevel(Connection.TRANSACTION_SERIALIZABLE);
tt.execute(new TransactionCallbackWithoutResult() {
    @Override
    protected void doInTransactionWithoutResult(TransactionStatus status) {
        List<Record> records = getRecords(); // jdbc select
        if (!records.isEmpty()) {
            try {
                processRecords(records); // no database access
                removeRecords(records);  // jdbc delete - exception here
            } catch (CannotSerializeTransactionException e) {
                log.info("Transaction rollback");
            }
        } else {
            pauseProcessing();
        }
    }
});
pauseProcessing() - sleep
public void removeRecords(int changeId) {
    String sql = "delete from RECORDS where ID <= ?";
    getJdbcTemplate().update(sql, new Object[]{changeId});
}
Are you using Connection.TRANSACTION_SERIALIZABLE in the first application as well? It looks like the first application locks the table, so the second one cannot access it (cannot start its transaction). Maybe Connection.TRANSACTION_REPEATABLE_READ would be enough?
You can probably also configure the second application not to throw an exception when it cannot access the resource, but to wait for it instead.
This sounds as if you're reading uncommitted data. Are you sure you're properly setting the isolation level?
It seems to me that you're mixing up constants from two different classes: shouldn't you be passing TransactionDefinition.ISOLATION_SERIALIZABLE instead of Connection.TRANSACTION_SERIALIZABLE to the setIsolationLevel method?
Why do you set the isolation level anyway? Oracle's default isolation level (read committed) is usually the best compromise between consistency and speed and should work nicely in your case.
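For illustration, a minimal sketch of the suggested fix, assuming the same TransactionTemplate setup as in the question:

TransactionTemplate tt = new TransactionTemplate(platformTransactionManager);
// Spring's own constant is the one setIsolationLevel expects.
tt.setIsolationLevel(TransactionDefinition.ISOLATION_SERIALIZABLE);
// Or simply leave the isolation level at the default (read committed) and set nothing at all.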
