How to use Spring Data Mongo distinct method with query and limit - java

Looking at the following code:
mongoOps.getCollection("FooBar")
.distinct("_id", query(where("foo").is("bar")).limit(10).getQueryObject());
I would expect this to return only the first 10 distinct _ids of collection FooBar.
But unfortunately, running this against a Collection having more than 10 documents matching the criteria, it returns all of them ignoring the limit(10) specified here.
_id is an ObjectId.
How can I achieve this?
Is it a bug in Spring Data?
I'm already aware how I can achieve this using an aggregate but I'm trying to simplify the code if possible since using an aggregate takes many more lines of code.
FYI: I'm using Spring Data Mongodb 1.10.10 and unfortunately, updating is currently not an option.
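A possible explanation, offered with some caution: the MongoDB distinct command has no limit option, and getQueryObject() hands over only the filter criteria, so limit(10) never reaches the server. Short of an aggregate, the returned values can be truncated on the client. The sketch below is plain Java; DistinctLimit and firstN are hypothetical names, and the list stands in for whatever distinct(...) returns. All distinct values are still transferred from the server, so this shortens the code, not the work.

```java
import java.util.ArrayList;
import java.util.List;

final class DistinctLimit {
    private DistinctLimit() {}

    // Returns a copy of at most the first n values; the input list is left untouched.
    static <T> List<T> firstN(List<T> values, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        return new ArrayList<>(values.subList(0, Math.min(n, values.size())));
    }
}
```

It would be called on the list returned by the distinct(...) call from the question, with 10 as the second argument. Since _id values are unique anyway, an ordinary find with limit(10) would also yield at most 10 distinct _ids.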

Related

Retrieving limited columns using jpa specifications

I am using spring boot JPA specifications for executing complex queries. However, the table I am querying contains more than 20 columns and I need to pull just 3. I tried cq.multiselect(...) but it didn't work and returned me the entity with all the columns.
On investigation, I got to know that it's a bug with specifications that's not yet fixed. Another option was to use projections but specifications can't be combined with projections. An attempt to do so returns the complete entity.
I do not want to switch to Querydsl or the @Query approach since this is existing code and I am stuck with specifications. Any pointers on how to limit the number of columns will be much appreciated :)
I learned that Spring Boot JPA Specifications don't provide any way to limit the number of columns. There is a bug report for this feature that has been pending for more than 3 years now, and I'm not sure it will be fixed soon.
Use QueryDslPredicateExecutor.findAll(Predicate predicate, Pageable pageable) and then get the actual results back using PageRequest.of(0, limit).
Querydsl limit record example

Geospatial Querying With Spring Data Mongodb

I have a project using Spring Boot 1.5.7 and Spring Data Mongodb 1.10.7. In my entity (a POJO annotated with @Document), I want to perform a geospatial query against one of the fields using the Spring Data repositories' "findByXxxWithin" scheme, passing in a Box containing the bounding box of the search area.
The field I want to query against is a polygon, not a simple point. I have found that a double[] point definition is easy to query and some examples show that you can also use a GeoJsonPoint. For polygons, it doesn't seem that easy. If my entity declares a GeoJsonPoint, the within search using the Box always comes back empty. The GeoJson definition of a polygon is actually a 3-dimensional array of values, in my case, doubles. Defining the data in this manner also results in an empty result. Any attempt to use a POJO that contains the polygon definition fails. The only success I've had is using a double[][].
I would like to have a more accurate GeoJson representation of the data in my objects that spring data is capable of querying against. Is this possible? Also, there are several other geospatial query operations available to Mongodb, such as $geoIntersects. Are these available through spring data? Or perhaps a lower level api I can use to directly formulate these queries against mongo if spring data does not support them?
Let me describe a project of mine with MongoDB and Spring Data that closely resembles your problem.
I have a document which has a georeference (latitude and longitude). I used the org.geojson.LngLatAlt object to store longitudes and latitudes. You can also have multiple LngLatAlt objects, as I use a Set (java.util.Set) of them. This solves the problem of representation in your document.
Now that your data is in MongoDB, you can make geospatial queries using Spring Data.
At first, it may look like Spring Data is quite inefficient at geospatial queries, and you may be tempted to use native MongoDB queries, which are definitely good and efficient. But the thing to learn here is that Spring also provides a way to make such queries; it is not very direct, but it is equally efficient.
With Spring Data, you can make spatial queries using org.springframework.data.geo.Box or org.springframework.data.geo.Circle objects. Box is used for BBOX queries and Circle is used for sphere queries. Once you have your org.springframework.data.geo.Shape objects, you can build Criteria objects for a Query.
Code snippet -
@Autowired
private MongoTemplate mongoTemplate;
Box bbox = ShapeUtils.getBBox(coordinates);
Query q = new Query(Criteria.where("lngLatAlt").within(bbox));
List<Lead> leads = mongoTemplate.find(q, Lead.class);
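ShapeUtils.getBBox in the snippet above is not shown and is presumably the author's own helper. A minimal sketch of what such a helper might compute, assuming the coordinates arrive as [longitude, latitude] pairs (BoundingBoxes and its method name are hypothetical):

```java
final class BoundingBoxes {
    private BoundingBoxes() {}

    // Returns {minLng, minLat, maxLng, maxLat} for an array of [lng, lat] pairs.
    static double[] of(double[][] coordinates) {
        if (coordinates == null || coordinates.length == 0) {
            throw new IllegalArgumentException("at least one coordinate is required");
        }
        double minLng = Double.POSITIVE_INFINITY;
        double minLat = Double.POSITIVE_INFINITY;
        double maxLng = Double.NEGATIVE_INFINITY;
        double maxLat = Double.NEGATIVE_INFINITY;
        for (double[] c : coordinates) {
            minLng = Math.min(minLng, c[0]);
            maxLng = Math.max(maxLng, c[0]);
            minLat = Math.min(minLat, c[1]);
            maxLat = Math.max(maxLat, c[1]);
        }
        return new double[] {minLng, minLat, maxLng, maxLat};
    }
}
```

The two corners could then be passed to Spring Data as new Box(new Point(minLng, minLat), new Point(maxLng, maxLat)). This simple min/max approach does not handle areas that cross the antimeridian.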
Please let me know if my solution is clear and relevant. Or if you need some more clarifications.
Regards

Is it possible to filter MongoDB query results?

I am developing a simple web application that fetches data from MongoDB.
What I need to do, is to show data matching the query on the webpage. Let's say the user has to choose a
programming language [Java, C#, Python]
project creation time [all, max week ago, max month ago]
implemented algorithm [heapsort, quicksort, mergesort]
Now, my MongoDB collection contains all types of object, some of which are not necessarily an algorithm at all (this is unavoidable, unfortunately).
Because of this fact, I have a specific query that finds all the documents which are eligible to further processing.
FindIterable<Document> docs = collection.find(Filters.ex(programmingLanguage));
And here comes my final question:
When I already have a FindIterable object, can I filter it so that only specific documents from previously selected documents will be chosen?
For example, I need a line of code, that will give me only documents created no longer than a month ago which are written in Java, given docs object.
Desirably I would implement it like this:
create function that applies additional filter on a FindIterable object
public static FindIterable<Document> applyLastMonth(FindIterable<Document> docs) {
return docs.<magicfunction>(Filters.gte("date", dateMonthAgo()))
}
and apply it to wherever it is needed. Is it possible?
My problem is much more complex, so please do not solve the example given above, I just want to be able to filter results returned by other query, so that I don't look at dozens of cases in my code. Unfortunately I found out that docs.filter(...) does not work for me, as it replaces the old query with the new one.
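One way to get reusable filter steps, sketched here under assumptions rather than as a definitive answer, is to filter the already fetched documents on the client with composable predicates. FindIterable<Document> is an Iterable, and org.bson.Document implements Map<String, Object>, so the helper below works on plain maps. The class name, the field names "language" and "date", and the use of java.util.Date are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class DocumentFilters {
    private DocumentFilters() {}

    // Matches documents whose "language" field equals the given value.
    static Predicate<Map<String, Object>> language(String language) {
        return doc -> language.equals(doc.get("language"));
    }

    // Matches documents whose "date" field is a Date at or after the cutoff.
    static Predicate<Map<String, Object>> createdSince(Date cutoff) {
        return doc -> {
            Object date = doc.get("date");
            return date instanceof Date && !((Date) date).before(cutoff);
        };
    }

    // Iterates the documents once and keeps those accepted by the filter.
    static <D extends Map<String, Object>> List<D> apply(Iterable<D> docs, Predicate<? super D> filter) {
        List<D> matches = new ArrayList<>();
        for (D doc : docs) {
            if (filter.test(doc)) {
                matches.add(doc);
            }
        }
        return matches;
    }
}
```

Predicates chain with and(...), for example language("Java").and(createdSince(cutoff)). The trade-off is that every document matched by the first query is still pulled from the server; to filter on the server instead, the conditions would have to be combined before calling find, for instance with the driver's Filters.and(...).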

Join Postgresql rows with Mongodb documents based on specific columns

I'm using MongoDB and PostgreSQL in my application. We need MongoDB because any number of new fields might be added later, and we store the data for those fields in MongoDB.
We are storing our fixed field values in PostgreSQL and custom field values in MongoDB.
E.g.
**Employee Table (RDBMS):**
id Name Salary
1 Krish 40000
**Employee Collection (MongoDB):**
{
<some autogenerated id of mongodb>
instanceId: 1 (The id of SQL: MANUALLY ASSIGNED),
employeeCode: A001
}
We get the records from SQL, and from their ids, we fetch related records from MongoDB. Then map the result to get the values of new fields and send on UI.
Now I'm searching for some optimized solution to get the MongoDB results in PostgreSQL POJO / Model so I don't have to fetch the data manually from MongoDB by passing ids of SQL and then mapping them again.
Is there any way through which I can connect MongoDB with PostgreSQL through columns (Here Id of RDBMS and instanceId of MongoDB) so that with one fetch, I can get related Mongo result too. Any kind of return type is acceptable but I need all of them at one call.
I'm using Hibernate and Spring in my application.
Using Spring Data might be the best solution for your use case, since it supports both:
JPA
MongoDB
You can still get all data in one request but that doesn't mean you have to use a single DB call. You can have one service call which spans two database calls. Because the PostgreSQL row is probably the primary entity, I advise you to share the PostgreSQL primary key with MongoDB too.
There's no need to have separate IDs. This way you can simply fetch the SQL row and the Mongo document by the same ID. Sharing the same ID also lets you process those requests concurrently and merge the results before returning from the service call. The service method's duration is then not the sum of the two repository calls but the maximum of the two.
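A minimal sketch of that concurrent fetch-and-merge, assuming both lookups can be expressed as functions from the shared ID to a map of field values. EmployeeLookupService and the two lookup functions are hypothetical stand-ins for the JPA and MongoDB repositories:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class EmployeeLookupService {
    private final Function<Long, Map<String, Object>> sqlLookup;   // stand-in for the JPA repository
    private final Function<Long, Map<String, Object>> mongoLookup; // stand-in for the MongoDB repository

    EmployeeLookupService(Function<Long, Map<String, Object>> sqlLookup,
                          Function<Long, Map<String, Object>> mongoLookup) {
        this.sqlLookup = sqlLookup;
        this.mongoLookup = mongoLookup;
    }

    // Starts both lookups concurrently and merges the fixed and custom fields into one map.
    Map<String, Object> findById(long id) {
        CompletableFuture<Map<String, Object>> fixed =
                CompletableFuture.supplyAsync(() -> sqlLookup.apply(id));
        CompletableFuture<Map<String, Object>> custom =
                CompletableFuture.supplyAsync(() -> mongoLookup.apply(id));
        return fixed.thenCombine(custom, (fixedFields, customFields) -> {
            Map<String, Object> merged = new LinkedHashMap<>(fixedFields);
            merged.putAll(customFields); // custom fields win if a key exists in both
            return merged;
        }).join();
    }
}
```

The sketch runs on the common fork-join pool; in a real service a dedicated executor for blocking database calls would probably be a better fit.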
Astonishingly, yes, you potentially can. There's a foreign data wrapper named mongo_fdw that allows PostgreSQL to query MongoDB. I haven't used it and have no opinion as to its performance, utility or quality.
I would be very surprised if you could effectively use this via Hibernate, unless you can convince Hibernate that the FDW mapped "tables" are just views. You might have more luck with EclipseLink and their "NoSQL" support if you want to do it at the Java level.
Separately, this sounds like a monstrosity of a design. There are many sane ways to do what you want within a decent RDBMS, without going for a hybrid database platform. There's a time and a place for hybrid, but I really doubt your situation justifies the complexity.
Just use PostgreSQL's json / jsonb support for dynamic mappings. Or use traditional options like storing json as text fields, storing XML, or even EAV mapping. Don't build a Rube Goldberg machine.

Reading Multiple Resultset using JPA

I am using JPA(Eclipselink) to execute SQL Server Stored Procedure which returns multiple Resultsets.
As per my knowledge, easiest way to call a SP is:
entityManager.createNativeQuery("exec sp_name").getResultList();
After executing the SP I can only read the single (or very first) ResultSet.
Can someone please suggest how I can retrieve the next ResultSets (or ResultLists())?
I can't answer for EclipseLink specifically, and I'm not sure what the JPA spec says, but most features of JPA took their cue from Hibernate, and Hibernate's limitations on stored procedures are:
The procedure must return a result set. Note that since these servers can return multiple result sets and update counts, Hibernate will iterate the results and take the first result that is a result set as its return value. Everything else will be discarded.
My guess is that JPA defines the same limitation.
EclipseLink has extended support for stored procedures through its StoredProcedureCall class and NamedStoredProcedureQuery annotation. You can create a JPA Query from a StoredProcedureCall using the JpaEntityManager interface's createQuery(Call) API.
StoredProcedureCall provides additional support over JPA native SQL queries, including in, out and inout parameters, cursored output parameters, and typing. StoredProcedureCall supports calls with both a result set and output parameters, but does not currently support multiple result sets.
What is being returned in your second result set, and how do you want the result returned? You could subclass and customize your SQLServerPlatform in EclipseLink and override the executeStoredProcedure() method to process multiple result sets. It should not be too hard, and you could contribute the code back to EclipseLink if successful. Or you could log an enhancement request for this feature. Looking at the code, it should be simple enough to implement; the bigger issue is how to return the multiple result sets.
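At the plain JDBC level, multiple result sets are read by looping over Statement.getMoreResults(). The sketch below shows that loop; it is not EclipseLink's implementation, MultiResultReader is a hypothetical name, and how the java.sql.Connection is obtained from the JPA provider is left out because it is provider-specific.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class MultiResultReader {
    private MultiResultReader() {}

    // Executes the call and returns one list of rows per result set, in the order received.
    static List<List<Object[]>> readAll(Connection connection, String call) throws SQLException {
        List<List<Object[]>> resultSets = new ArrayList<>();
        try (CallableStatement statement = connection.prepareCall(call)) {
            boolean isResultSet = statement.execute();
            // Stop when the current result is neither a result set nor an update count.
            while (isResultSet || statement.getUpdateCount() != -1) {
                if (isResultSet) {
                    try (ResultSet rs = statement.getResultSet()) {
                        resultSets.add(readRows(rs));
                    }
                }
                isResultSet = statement.getMoreResults();
            }
        }
        return resultSets;
    }

    // Copies every row of the result set into an Object[] with one slot per column.
    private static List<Object[]> readRows(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        List<Object[]> rows = new ArrayList<>();
        while (rs.next()) {
            Object[] row = new Object[columns];
            for (int i = 0; i < columns; i++) {
                row[i] = rs.getObject(i + 1); // JDBC columns are 1-based
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Returning a list of row lists is just one answer to the question of how to hand back several result sets; update counts between result sets are skipped here.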
