Spring PagingAndSortingRepository delete entry during processing - java

I'm using Spring's PagingAndSortingRepository to paginate database entries.
During processing I need to delete some entries.
When I call the repository to delete, the entry is deleted, but then the problem is with the next pageable: I'm not getting the expected number of elements from the next Pageable (pageRequest.next()).
Is there any way to iterate with pagination and perform CRUD operations in parallel?
Part of the code:
while (!onePage.isEmpty()) {
    while (pageIterator.hasNext()) {
        Object nextElement = pageIterator.next();
        if (!falseCondition) {
            log.info("sending message with Id {}", nextElement.getId());
            repository.deleteById(nextElement.getId());
        } else {
            log.info("Lost connection");
            return;
        }
    }
    pageRequest = pageRequest.next();
    onePage = repository.findAll(pageRequest);
    pageIterator = onePage.iterator();
}
Many thanks.

As @ruba pointed out in the example, it is not a Hibernate issue. Even if you use the JDBC API directly you will have to handle this situation. I can propose a solution:
You can implement a custom Spring Data JPA repository method to which the service passes the pageRequest, but which translates it into an offset and a limit. So instead of calling pageRequest.next(), you do the following, which takes into account the items deleted in the current page.
long nextPageNumber = pageRequest.getPageNumber() + 1;
long nextOffset = nextPageNumber * pageRequest.getPageSize() - itemsDeletedInCurrentPage;
int limit = pageRequest.getPageSize();

List<Item> itemsInNextPage = em.createQuery(query)
        .setFirstResult((int) nextOffset)
        .setMaxResults(limit)
        .getResultList();
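To make this concrete, below is a minimal sketch of such a custom repository fragment. The names ItemRepositoryCustom, Item, and itemsDeletedInCurrentPage are assumptions for illustration, not names from the original post.
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.data.domain.Pageable;

// Hypothetical custom fragment; all names here are illustrative.
public interface ItemRepositoryCustom {
    List<Item> findNextPageAdjusted(Pageable pageRequest, int itemsDeletedInCurrentPage);
}

class ItemRepositoryCustomImpl implements ItemRepositoryCustom {

    @PersistenceContext
    private EntityManager em;

    @Override
    public List<Item> findNextPageAdjusted(Pageable pageRequest, int itemsDeletedInCurrentPage) {
        // Shift the offset back by the number of rows deleted while processing the
        // current page, so that surviving rows are not skipped on the next page.
        long nextOffset = (pageRequest.getPageNumber() + 1L) * pageRequest.getPageSize()
                - itemsDeletedInCurrentPage;
        return em.createQuery("select i from Item i", Item.class)
                .setFirstResult((int) nextOffset)
                .setMaxResults(pageRequest.getPageSize())
                .getResultList();
    }
}
Your repository interface would then extend both PagingAndSortingRepository and this fragment, and the service would track how many items it deleted while processing each page.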

Related

Page<> vs Slice<> when to use which?

I've read in the Spring Data JPA documentation about two different return types you can use when paging dynamic queries built from repositories:
Page and Slice
Page<User> findByLastname(String lastname, Pageable pageable);
Slice<User> findByLastname(String lastname, Pageable pageable);
So, I've tried to find articles discussing the main differences and typical usages of both, how performance changes, and how sorting affects each type of query.
Does anyone have this kind of knowledge, articles, or a good source of information?
Page extends Slice and knows the total number of elements and pages available by triggering a count query. From the Spring Data JPA documentation:
A Page knows about the total number of elements and pages available. It does so by the infrastructure triggering a count query to calculate the overall number. As this might be expensive depending on the store used, Slice can be used as return instead. A Slice only knows about whether there’s a next Slice available which might be just sufficient when walking through a larger result set.
The main difference between Slice and Page is that the latter provides non-trivial pagination details such as the total number of records (getTotalElements()), the total number of pages (getTotalPages()), and next-page availability (hasNext()) for the query conditions, whereas the former only knows whether a next page is available (hasNext()). Slice can give a significant performance benefit when you deal with a huge table with a rapidly growing number of records.
Let's dig deeper into the technical implementation of both variants.
Page
static class PagedExecution extends JpaQueryExecution {
    @Override
    protected Object doExecute(final AbstractJpaQuery repositoryQuery, JpaParametersParameterAccessor accessor) {
        Query query = repositoryQuery.createQuery(accessor);
        return PageableExecutionUtils.getPage(query.getResultList(), accessor.getPageable(),
                () -> count(repositoryQuery, accessor));
    }

    private long count(AbstractJpaQuery repositoryQuery, JpaParametersParameterAccessor accessor) {
        List<?> totals = repositoryQuery.createCountQuery(accessor).getResultList();
        return (totals.size() == 1 ? CONVERSION_SERVICE.convert(totals.get(0), Long.class) : totals.size());
    }
}
If you look at the code snippet above, PagedExecution#doExecute internally calls PagedExecution#count to get the total number of records satisfying the condition.
Slice
static class SlicedExecution extends JpaQueryExecution {
    @Override
    protected Object doExecute(AbstractJpaQuery query, JpaParametersParameterAccessor accessor) {
        Pageable pageable = accessor.getPageable();
        Query createQuery = query.createQuery(accessor);
        int pageSize = 0;
        if (pageable.isPaged()) {
            pageSize = pageable.getPageSize();
            createQuery.setMaxResults(pageSize + 1);
        }
        List<Object> resultList = createQuery.getResultList();
        boolean hasNext = pageable.isPaged() && resultList.size() > pageSize;
        return new SliceImpl<>(hasNext ? resultList.subList(0, pageSize) : resultList, pageable, hasNext);
    }
}
If you look at the code snippet above, to find out whether a next set of results exists (for hasNext()), SlicedExecution#doExecute always fetches one extra element (createQuery.setMaxResults(pageSize + 1)) and drops it based on the page size (hasNext ? resultList.subList(0, pageSize) : resultList).
Application:
Page
Use when the UI/GUI needs to display all the results at the initial stage of the search/query itself, with page numbers to traverse (e.g., a bank statement with page numbers).
Slice
Use when the UI/GUI does not need to show all the results at the initial stage of the search/query itself, but instead loads more records on scrolling or on a next-button click (e.g., a Facebook feed search).
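For context, here is a minimal sketch of how a caller sees the difference. Since Java cannot overload on return type alone, the sketch assumes the two variants are declared under distinct derived names (Spring Data ignores the text between find and By when deriving the query); the userRepository bean and User entity follow the snippets above and are otherwise illustrative.
// Assumed repository declarations (distinct names, same derived query):
// Page<User>  findPageByLastname(String lastname, Pageable pageable);
// Slice<User> findSliceByLastname(String lastname, Pageable pageable);

Pageable firstPage = PageRequest.of(0, 20, Sort.by("lastname"));

Page<User> page = userRepository.findPageByLastname("Smith", firstPage);
long total = page.getTotalElements();   // available because an extra count query ran
int pages = page.getTotalPages();

Slice<User> slice = userRepository.findSliceByLastname("Smith", firstPage);
if (slice.hasNext()) {                  // no count query; one extra row was fetched and dropped
    Slice<User> next = userRepository.findSliceByLastname("Smith", slice.nextPageable());
}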

Spring data elasticsearch bulk index and delete

I'm new to the community so I apologise if I do something wrong.
I'm using Spring Data Elasticsearch (2.0.4/2.4)
and I would like to perform a bulk insert and a bulk delete.
But ElasticsearchTemplate only contains a bulkIndex method:
@Override
public void bulkIndex(List<IndexQuery> queries) {
    BulkRequestBuilder bulkRequest = client.prepareBulk();
    for (IndexQuery query : queries) {
        bulkRequest.add(prepareIndex(query));
    }
    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    if (bulkResponse.hasFailures()) {
        Map<String, String> failedDocuments = new HashMap<String, String>();
        for (BulkItemResponse item : bulkResponse.getItems()) {
            if (item.isFailed())
                failedDocuments.put(item.getId(), item.getFailureMessage());
        }
        throw new ElasticsearchException(
                "Bulk indexing has failures. Use ElasticsearchException.getFailedDocuments() for detailed messages ["
                        + failedDocuments + "]", failedDocuments
        );
    }
}
So I have created a bulk method to handle both, but I can't access the prepareIndex method because it is private.
Are you aware of any way to index and delete documents in a single bulk request, or should I use reflection to change the visibility of the prepareIndex method?
Or is there an easy way to create an IndexRequest from a model/POJO?
Not sure which versions you mean with
(2.0.4/2.4)
Currently there is no support for bulk deletes, and no way to combine different operations like index/update in one request.
Can you file an issue in Jira to add support for bulk delete and a possibility to have different operations in one call? Though this won't make it into the next release, I'm afraid.
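As a stopgap until such support exists, one option is to bypass ElasticsearchTemplate and build the bulk request against the underlying transport client, which does allow index and delete operations in the same call. This is only a sketch under assumptions: it presumes you can obtain the Client (for example via elasticsearchTemplate.getClient() in those versions, or by injecting the Client bean directly) and that you serialize your POJOs to JSON yourself; the index, type, ids, and jsonForDocument1 are placeholders.
// Sketch only: one bulk request that both indexes and deletes, via the underlying transport client.
Client client = elasticsearchTemplate.getClient(); // or inject the Client bean directly

BulkRequestBuilder bulk = client.prepareBulk();
bulk.add(client.prepareIndex("myindex", "mytype", "1")
        .setSource(jsonForDocument1));              // JSON you serialize from your POJO yourself
bulk.add(client.prepareDelete("myindex", "mytype", "2"));

BulkResponse response = bulk.execute().actionGet();
if (response.hasFailures()) {
    // Inspect response.getItems() for per-operation failures, as bulkIndex does above.
}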

Want to iterate through half of a MongoDB collection and then iterate through the remaining half with another query

I'm getting this error:
Exception in thread "main" com.mongodb.MongoCursorNotFoundException:
Query failed with error code -5 and error message 'Cursor 304054517192
not found on server mongodb2:27017' on server mongodb2:27017 at
com.mongodb.operation.QueryHelper.translateCommandException(QueryHelper.java:27)
at
com.mongodb.operation.QueryBatchCursor.getMore(QueryBatchCursor.java:215)
at
com.mongodb.operation.QueryBatchCursor.hasNext(QueryBatchCursor.java:103)
at
com.mongodb.MongoBatchCursorAdapter.hasNext(MongoBatchCursorAdapter.java:46)
at com.mongodb.DBCursor.hasNext(DBCursor.java:155) at
org.jongo.MongoCursor.hasNext(MongoCursor.java:38) at
com.abc.Generator.Generate(Generator.java:162) at
com.abc.main.main(main.java:72)
which I assume is because the query ran for too long.
So I'm planning to query Mongo using find() and iterate through half of the collection, then use another find() query and iterate through the remaining half.
Could you help with how to place the cursor directly at the halfway point of the collection? The documentation does not seem to provide any function for it.
I'm basically just using a find() and iterating through a collection with 100000 records, while connected to a server via SSH.
MongoCollection history = jongo.getCollection("historyCollection");
MongoCursor<MyClass> allHistories = history.find().as(MyClass.class);
//---Iterate thru all histories
while (allHistories.hasNext()) {
MyClass oneHistory = allHistories.next();
}
Solved it by having the Mongo collection ordered by ObjectIds that were timestamps. This way, I was able to use the greater-than operator on the ObjectId to split the iteration into batches.
private MongoCursor<PersonDBO> ReadFewProfilesFromDB(final String objectIdAfterWhichToSearchFrom, final Integer FIND_LIMIT) {
    MongoCursor<PersonDBO> aBatchOfProfiles = null;
    try {
        if (objectIdAfterWhichToSearchFrom.equals(START_OBJECTID_OF_MONGO_BATCHES)) {
            aBatchOfProfiles = personProfile.find().limit(FIND_LIMIT).as(PersonDBO.class);
        } else {
            // Jongo binds the ObjectId parameter to the '#' placeholder in the query string.
            aBatchOfProfiles = personProfile.find("{_id: {$gt: #}}", new ObjectId(objectIdAfterWhichToSearchFrom))
                    .limit(FIND_LIMIT).as(PersonDBO.class);
        }
    } catch (Exception e) {
        logger.error("Problem while trying to find {} personProfiles, starting from objectID {}. {}, {}",
                FIND_LIMIT, objectIdAfterWhichToSearchFrom, e.getMessage(), e.getCause());
    }
    if (aBatchOfProfiles == null) {
        logger.error("profiles collection is null. Nothing more to iterate OR there was an exception when finding profiles. If exception, there would be an error printed above.");
        return null;
    }
    return aBatchOfProfiles;
}
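To complete the picture, here is a minimal sketch of a batch loop that drives this method; the getId() accessor on PersonDBO and the batch size are assumptions for illustration, not part of the original post.
// Hypothetical driver loop; assumes PersonDBO exposes getId() returning its ObjectId as a String.
String lastSeenId = START_OBJECTID_OF_MONGO_BATCHES;
final Integer BATCH_SIZE = 1000;

while (true) {
    MongoCursor<PersonDBO> batch = ReadFewProfilesFromDB(lastSeenId, BATCH_SIZE);
    if (batch == null || !batch.hasNext()) {
        break; // nothing left to iterate
    }
    while (batch.hasNext()) {
        PersonDBO profile = batch.next();
        // ... process the profile ...
        lastSeenId = profile.getId(); // remember where this batch ended
    }
    // The next find() starts after lastSeenId, so each server-side cursor only lives
    // for one small batch and cannot time out.
}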

Arrange entities added to GAE by newest first when added?

I created a Google App Engine client using Eclipse and the Android demo Google hands out. I created the backend and a few models. When I add an entity from Android to my database on GAE it orders it by date, not by newest created first. The key is just the current date and time on Android. I'm not sure how to work with the backend, as Google created it for me in my project. Is there a quick change I can make so that, instead of ordering by date when I add an item, it will just keep the newest listings on top?
Edited question: this is my endpoint class that Google generated for me. How can I modify it to return the newest added entities first?
@Api(name = "quotesendpoint", namespace = @ApiNamespace(ownerDomain = "projectquotes.com", ownerName = "projectquotes.com", packagePath = ""))
public class quotesEndpoint {

    /**
     * This method lists all the entities inserted in datastore.
     * It uses HTTP GET method and paging support.
     *
     * @return A CollectionResponse class containing the list of all entities
     *         persisted and a cursor to the next page.
     */
    @SuppressWarnings({ "unchecked", "unused" })
    @ApiMethod(name = "listquotes")
    public CollectionResponse<quotes> listquotes(
            @Nullable @Named("cursor") String cursorString,
            @Nullable @Named("limit") Integer limit) {
        EntityManager mgr = null;
        Cursor cursor = null;
        List<quotes> execute = null;
        try {
            mgr = getEntityManager();
            Query query = mgr.createQuery("select from quotes as quotes");
            if (cursorString != null && cursorString != "") {
                cursor = Cursor.fromWebSafeString(cursorString);
                query.setHint(JPACursorHelper.CURSOR_HINT, cursor);
            }
            if (limit != null) {
                query.setFirstResult(0);
                query.setMaxResults(limit);
            }
            execute = (List<quotes>) query.getResultList();
            cursor = JPACursorHelper.getCursor(execute);
            if (cursor != null)
                cursorString = cursor.toWebSafeString();
            // Tight loop for fetching all entities from datastore and accommodate
            // for lazy fetch.
            for (quotes obj : execute)
                ;
        } finally {
            mgr.close();
        }
        return CollectionResponse.<quotes> builder().setItems(execute)
                .setNextPageToken(cursorString).build();
    }
}
The order you see in the Datastore viewer in GAE is not significant; it is just a display of the current data in your datastore, shown in increasing order of entity id (if using auto ids). This could coincidentally also be increasing order of date. You cannot modify this display pattern.
What matters is the order seen by your queries, and this is determined by indexes. So if you need to get your entities in descending order of date, then as long as your date property is left indexed, GAE will automatically maintain an index for it. You just need to query your entities specifying a descending sort order on the date property.
EDIT:
Based on the code added, the modifications below will query the entities in descending order of date.
1. Add a new date property to your entity:
private Date entrydate;
2. While creating an entity, set the current date on this property:
yourentity.setEntryDate(new Date());
3. While querying, set an ordering based on descending order of date:
query.setOrdering("entrydate desc");
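A side note on step 3: the generated endpoint above uses a JPA EntityManager, and setOrdering belongs to the JDO Query API rather than javax.persistence.Query. If your backend is indeed JPA, the same ordering can be expressed in the query string itself; a minimal sketch, assuming the entrydate property from step 1:
// JPA variant of step 3: let the query itself return the newest entities first.
Query query = mgr.createQuery("select q from quotes as q order by q.entrydate desc");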

ATG Repository API

I'm trying to update multiple records via an ATG class extending GenericService.
However, I'm running up against a roadblock.
How do I do a multiple-insert query, where I can keep adding all the items/rows to the cached object and then do a single sync with the table using item.add()?
Sample code:
The first part clears out the rows in the table before insertion happens (it would be mighty helpful if anyone knows of a way to clear all rows in a table without having to loop through and delete them one by one).
MutableRepository repo = (MutableRepository) feedRepository;
RepositoryView view = null;
try {
    view = getFeedRepository().getView(getFeedRepositoryFeedDataDescriptorName());
    RepositoryItem[] items = null;
    if (view != null) {
        QueryBuilder qb = view.getQueryBuilder();
        Query getFeedsQuery = qb.createUnconstrainedQuery();
        items = view.executeQuery(getFeedsQuery);
    }
    if (items != null && items.length > 0) {
        // remove all items in the repository
        for (RepositoryItem item : items) {
            repo.removeItem(item.getRepositoryId(), getFeedRepositoryFeedDataDescriptorName());
        }
    }
    for (RSSFeedObject rfo : feedEntries) {
        MutableRepositoryItem feedItem = repo.createItem(getFeedRepositoryFeedDataDescriptorName());
        feedItem.setPropertyValue(DB_COL_AUTHOR, rfo.getAuthor());
        feedItem.setPropertyValue(DB_COL_FEEDURL, rfo.getFeedUrl());
        feedItem.setPropertyValue(DB_COL_TITLE, rfo.getTitle());
        feedItem.setPropertyValue(DB_COL_FEEDURL, rfo.getPublishedDate());
        RepositoryItem item = repo.addItem(feedItem);
    }
The way I interpret your question is that you want to add multiple repository items to your repository but you want to do it fairly efficiently at a database level. I suggest you make use of the Java Transaction API as recommended in the ATG documentation, like so:
TransactionManager tm = ...
TransactionDemarcation td = new TransactionDemarcation();
try {
    try {
        td.begin(tm);
        ... do repository item work ...
    } finally {
        td.end();
    }
} catch (TransactionDemarcationException exc) {
    ... handle the exception ...
}
Assuming you are using a SQL repository in your example, the SQL INSERT statements will be issued after each call to addItem but will not be committed until/if the transaction completes successfully.
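To connect this back to the original code, here is a minimal sketch of the demarcation wrapped around the repository work. It assumes the GenericService component has a transactionManager property configured (typically pointing at /atg/dynamo/transaction/TransactionManager), and the helper methods removeExistingFeedItems and addFeedItems are hypothetical stand-ins for the two loops from the question, assumed to handle their own RepositoryExceptions.
// Sketch only: the repository work from the question, demarcated as one transaction.
public void rebuildFeedItems(List<RSSFeedObject> feedEntries) {
    TransactionDemarcation td = new TransactionDemarcation();
    try {
        try {
            td.begin(getTransactionManager());   // assumed injected TransactionManager property
            removeExistingFeedItems();           // the removeItem loop from the question
            addFeedItems(feedEntries);           // the createItem/addItem loop from the question
        } finally {
            td.end();                            // ends the demarcation begun above
        }
    } catch (TransactionDemarcationException exc) {
        if (isLoggingError()) {
            logError("Transaction failed while rebuilding feed items", exc);
        }
    }
}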
ATG does not provide support for deleting multiple records in a single SQL statement. You can use transactions, as @chrisjleu suggests, but there is no way to do the equivalent of a DELETE WHERE ID IN ('1', '2', ...). Your code looks correct.
It is possible to invoke stored procedures or execute custom SQL through an ATG Repository, but that isn't generally recommended for portability/maintenance reasons. If you did that, you would also need to flush the appropriate portions of the item/query caches manually.
