Hibernate-Search @IndexedEmbedded with HashMap, how to include keyset in index - java

I'm trying to integrate Hibernate Search into an application. The application's entities can have multiple properties that are stored in several languages. This is accomplished by splitting the non-multilingual and multilingual properties into separate entities. An example snippet of this split looks like this (Hibernate annotations omitted, as the database part is working fine):
@Indexed
public class Assignment {
    @DocumentId
    private UUID id;
    @IndexedEmbedded
    private Map<String, AssignmentI18n> i18n;
    // Other properties
}
public class AssignmentI18n {
    @DocumentId
    @FieldBridge(impl = AssignmentI18nBridge.class)
    private AssignmentI18nId id;
    @Field
    private String title;
    @Field
    private String description;
    @Field
    private String requirements;
    public static class AssignmentI18nId {
        private UUID assignmentId;
        private String iso;
    }
}
Now I would like to make this data searchable using Hibernate Search by treating it as a single entity in the index. With the annotations set up this way, that does happen; however, all entries of the multilingual fields are stored in the same field in the index. Basically, my index structure looks like this:
id
i18n.title
i18n.description
i18n.requirements
As all values of the multilingual data are indexed in the same field, I can no longer distinguish which language they belong to. Is there a way to make the index look more like this?
id
i18n.nl.title
i18n.en.title
i18n.nl.description
i18n.en.description
i18n.nl.requirements
i18n.en.requirements
Basically, I would like to add the HashMap key to the index field name. I've looked into the possibility of treating the map as a field with a custom FieldBridge, but that doesn't seem like the correct approach.

If you want the indexed fields to look like the ones you describe, use a custom field bridge. That is how you could get this structure, but since your map value is quite complex, it would take quite a lot of custom code to create all the fields.
You could create a feature request for Hibernate Search. I could imagine that this type of feature would be of general use: basically a way to define, either via an @IndexedEmbedded option or via an additional annotation, how the map key becomes part of the Lucene field name. That said, have you thought about how exactly you would then search this index? Does the user somehow specify a locale, and depending on this locale you would target the appropriate fields? Also, how does your approach configure different stemmers depending on the language?
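A minimal sketch of the naming step such a bridge would have to perform, written in plain Java without any Hibernate Search API. I18nFieldNames and flatten are hypothetical names, and the nested map is a simplified stand-in for Map<String, AssignmentI18n>; a real field bridge would add each resulting name/value pair to the Lucene document instead of returning a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hibernate Search: it only computes the
// field names and values that a custom bridge would have to emit.
final class I18nFieldNames {

    private I18nFieldNames() {
    }

    // prefix: name of the embedded property, e.g. "i18n"
    // translations: ISO code -> (property name -> text), a simplified
    // stand-in for Map<String, AssignmentI18n>
    static Map<String, String> flatten(String prefix,
                                       Map<String, Map<String, String>> translations) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> language : translations.entrySet()) {
            for (Map.Entry<String, String> property : language.getValue().entrySet()) {
                if (property.getValue() == null) {
                    continue; // nothing to index for a missing translation
                }
                // The map key (ISO code) becomes part of the field name.
                String fieldName = prefix + "." + language.getKey() + "." + property.getKey();
                fields.put(fieldName, property.getValue());
            }
        }
        return fields;
    }
}
```

With per-language field names like these, a query could target only the fields for the user's locale, which is also where a language-specific analyzer would be chosen.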

Related

Basic Java MVC: Beans and associative entities with attributes

I'm working on a movie database project for college where I need to create a Java-based website to do advanced searches in a huge PostgreSQL database. I'm not using Hibernate or similar tools. Here's part of the ER diagram for the database:
As you can see, the associative entity actormovie links the entities actor and movie while also listing the character portrayed. I have created two simple beans, Actor and Movie, with attributes, getters and setters.
This is my first java web project with focus on MVC, so I'm more than a little lost. My question is: Should I create a bean mapping the associative table? If not, what do I do with the as_character attribute?
The answer is yes.
This is because the Actor->Movie relation has a property of its own, as_character. You could find a way to avoid creating a class, but in the long run it will cause problems (a silly bug because you forgot about it, or because someone else didn't know about it; something you don't want to deal with).
If this is your first attempt, I think what may confuse you is how to represent the relationship.
The first approach that comes to mind, most of the time, is to have an ActorMovie class like:
public class ActorMovie {
Integer actor_id;
Integer movieid;
String as_character;
//getters, setters, equals, hashCode, toString
}
But you can also think of it as a value belonging to Actor (or Movie) and have it like:
public class ActorMovie {
Integer movieid;
String as_character;
//getters, setters, equals, hashCode, toString
}
and an Actor class:
public class Actor {
Integer actor_id;
String name;
String sex;
Set<ActorMovie> movies;
//getters, setters, equals, hashCode, toString
}
Both of them solve the problem; they just change how you will interact with the data in your code. To learn when one is better than the other, you have to try both and see what changes, so choose whichever feels more "natural" and see the results.
No, you don't need to. The Actor class will have a list of Movie, and the Movie class a list of Actor. Just map it like this.
For the character attribute, you could create a map in the Actor class, where the key is the movie and the value is the character (or a list of characters, because an actor can play several characters in the same movie):
Map<Movie, List<String>> characters = new HashMap<>();
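A rough sketch of how that map could be filled, assuming simplified Movie and Actor beans. The addCharacter helper and the fields shown are hypothetical; the one real requirement is that Movie defines equals and hashCode, since it is used as a map key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified bean: two Movie objects with the same id count as the same key.
final class Movie {
    final int movieId;
    final String title;

    Movie(int movieId, String title) {
        this.movieId = movieId;
        this.title = title;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Movie && ((Movie) other).movieId == movieId;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(movieId);
    }
}

final class Actor {
    final Map<Movie, List<String>> characters = new HashMap<>();

    // Hypothetical helper: records one row of the actormovie table.
    void addCharacter(Movie movie, String asCharacter) {
        characters.computeIfAbsent(movie, m -> new ArrayList<>()).add(asCharacter);
    }
}
```

Calling addCharacter once per actormovie row groups all characters an actor plays in the same movie under a single key.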
It depends on how your project will develop. If you are not using object-relational mapping (as you stated in the question), you should create a bean mapping for the associative table. If you introduce object-relational mapping later on, Actor can own the as_character property, and you then should not create a bean mapping for the associative table.

Filtering by a ref's properties in App Engine using Objectify

Hypothetical code:
@Entity
public class MyEvent {
    @Id Long id;
    @Index String name;
    Ref<Location> myLocation;
}
@Entity
public class Location {
    @Id Long id;
    @Index String city;
    @Index String country;
}
Is there a way for me to do a filter to find all events within a particular city? This seems like it would need a join, which isn't supported, but I wanted to double check since I can't find a definitive answer.
Also, what is the correct way of structuring the data if this type of filtering isn't possible? Would I need to have a denormalized MyEvent entity with all the fields that I could possibly filter on?
As you can read in the Objectify documentation, Ref properties are more of an Objectify sugar than a Datastore feature. They're stored as Key properties in the Datastore, and so it's not possible to query against the properties of the entity which the key might point to.
If you want to query for events within a city, you could either store the event's city on the event itself and query against that, or query all locations within a city and then query for any events matching those locations (that is, by querying against the location keys, which are stored on the events). A third option would be to make cities into actual entities with a collection-type field of Events. You could also make use of ancestor queries - see the "Datastore Queries" docs for more information.
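The second option (look up the locations first, then the events that point at them) can be sketched with plain Java collections. This is only an in-memory illustration of the two steps and makes no Objectify or Datastore calls; TwoStepLookup and its nested classes are hypothetical stand-ins for the entities above, with locationId playing the role of the stored key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TwoStepLookup {

    static final class Location {
        final long id;
        final String city;

        Location(long id, String city) {
            this.id = id;
            this.city = city;
        }
    }

    static final class Event {
        final String name;
        final long locationId; // stands in for the key stored on the event

        Event(String name, long locationId) {
            this.name = name;
            this.locationId = locationId;
        }
    }

    static List<String> eventNamesInCity(String city, List<Location> locations, List<Event> events) {
        // Step 1: find the locations in the city and keep only their ids.
        Set<Long> locationIds = new HashSet<>();
        for (Location location : locations) {
            if (location.city.equals(city)) {
                locationIds.add(location.id);
            }
        }
        // Step 2: find the events whose stored location key is one of those ids.
        List<String> names = new ArrayList<>();
        for (Event event : events) {
            if (locationIds.contains(event.locationId)) {
                names.add(event.name);
            }
        }
        return names;
    }
}
```

In a real application each step would be its own Datastore query, which is the cost of this approach compared with storing the city directly on the event.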

Modeling nested documents with Spring-Data-MongoDB

I have a MongoDB database that represents snippets of public gene information like so:
{
_id: 1,
symbol: "GENEA",
db_references: {
"DB A": "DBA000123",
"DB B": ["ABC123", "DEF456"]
}
}
I am trying to map this to a @Document-annotated POJO class, like this:
@Document
public class Gene {
    @Id
    private int id;
    private String symbol;
    private Map<String,Object> db_references;
    //getters and setters
}
Because of the nature of MongoDB's schema-less design, the db_references field can contain a long list of possible keys, with values sometimes being arrays or other key-value pairs. My primary concern is the speed at which I can fetch multiple Gene documents and slice up their db_references.
My question: what is the best way to represent this field to optimize fetching performance? Should I define a custom POJO and map this field to it? Should I make it a BasicDBObject? Or would it be best not to map the documents at all with Spring Data and just use the MongoDB Java driver and parse the DBObjects returned?
Sorry to see your question hasn't been answered yet.
If db_references represent an actual concept within the domain you are much better off capturing this domain knowledge in a class. It is always a good idea and MongoDB helps with it a lot.
Thus, you can store this list of nested objects inside the MongoDB document and fetch the whole aggregate in a single query. Spring Data should handle the deserialization as well.
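One way such a domain class might look, as a plain-Java sketch with no Spring Data or MongoDB driver types. DbReferences, fromRaw and idsFor are hypothetical names, and the sketch assumes each value is either a single identifier or a collection of identifiers.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain class capturing "database name -> identifiers".
final class DbReferences {

    private final Map<String, List<String>> idsByDatabase = new LinkedHashMap<>();

    // Normalizes the loosely typed map: a single value becomes a one-element list.
    // Nested key-value pairs are kept only as their string form here, a simplification.
    static DbReferences fromRaw(Map<String, Object> raw) {
        DbReferences references = new DbReferences();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            List<String> ids = new ArrayList<>();
            Object value = entry.getValue();
            if (value instanceof Collection) {
                for (Object item : (Collection<?>) value) {
                    ids.add(String.valueOf(item));
                }
            } else if (value != null) {
                ids.add(String.valueOf(value));
            }
            references.idsByDatabase.put(entry.getKey(), ids);
        }
        return references;
    }

    // Returns an empty list for a database that has no entry.
    List<String> idsFor(String database) {
        return idsByDatabase.getOrDefault(database, List.of());
    }
}
```

Callers then always work with a list per database and never have to check whether a value was stored as a string or as an array.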

Lucene/Hibernate Search - Query associated collections?

I'm writing a Seam-based application, making use of JPA/Hibernate and Hibernate Search (Lucene). I have an object called Item that has a many-to-many relation to an object Keyword. It looks like this (some annotations omitted):
@Indexed
public class Item {
    ...
    @IndexedEmbedded
    private List<Keyword> keywords;
    ...
}
@Indexed
public class Keyword {
    ...
    @Field
    private String value;
    ...
}
I'd like to be able to run a query for all Item objects that contain a particular keyword value. I've set up numerous test objects in my database, and it appears the indexes are being created properly. However, when I create and run a query for "keywords.value" = <MY KEYWORD VALUE> I always get 0 results returned.
Does Hibernate Search/Lucene have the ability to run queries of this type? Is there something else I should be doing? Are there additional annotations that I could be missing?
Hibernate Search is perfectly suited to that kind of query, but it can be done in a simpler way.
On your problem: text indexed by Hibernate Search (Lucene) is going to be analysed, and the default analyser applies:
Lower-casing of the input
Splitting into separate terms on whitespace
So if you're defining the queries as TermQuery (I'm assuming that's what you did, as it's the simplest form), then you have to match against the lower-case form of a single token (with no spaces).
Bearing this in mind, you could dump all your keywords in a single Blob String on the Item entity, without needing to map it as separate keywords, chaining them in a single string separated by whitespaces.
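To make the matching rule concrete, here is a simplified stand-in for the analysis described above, in plain Java without Lucene. KeywordTerms, analyze and matchesTerm are hypothetical names; the sketch only lower-cases and splits on whitespace, and compares the query term exactly, the way an unanalysed term query would.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class KeywordTerms {

    private KeywordTerms() {
    }

    // Simplified stand-in for index-time analysis:
    // lower-case the input, then split it into terms on whitespace.
    static List<String> analyze(String text) {
        return Arrays.stream(text.trim().toLowerCase(Locale.ROOT).split("\\s+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // The query term is compared as-is, without any analysis,
    // so it only matches if it is already a lower-case single token.
    static boolean matchesTerm(String indexedText, String queryTerm) {
        return analyze(indexedText).contains(queryTerm);
    }
}
```

Under these assumptions, matchesTerm("Garden Tools", "garden") is true while matchesTerm("Garden Tools", "Garden") is false, which is the kind of mismatch that produces zero results.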

solrj: how to store and retrieve List<POJO> via multivalued field in index

My use case is an index which holds titles of online media. The provider of the data associates a list of categories with each title. I am using SolrJ to populate the index via an annotated POJO class
e.g.
@Field("title")
private String title;
@Field("categories")
private List<Category> categoryList;
The associated POJO is
public class Category {
private Long id;
private String name;
...
}
My question has two parts:
a) Is this possible via SolrJ? The docs only contain an example of @Field using a List of String, so I assume the serialization/marshalling only supports simple types?
b) How would I set up the schema to hold this? I have a naive assumption that I just need to set multiValued=true on the required field and it will all work by magic.
I'm just starting to implement this so any response would be highly appreciated.
The answer is as you thought:
a) You only have simple types available, so you will have a List of a single simple type, e.g. String. The point is that you can't represent complex types inside the Lucene document, so you won't be able to deserialize them either.
b) The problem is that you are trying to apply relational thinking to a "document store". That will probably work only up to a certain point. If you want to represent categories inside a Lucene document, just use the name string; it is not necessary to store an id as well.
The only reason to store an id as well is if, besides the search, you want to do a lookup in an RDBMS. If you want to do this, you need to make sure that the id and the category name stay linked. This does not work for every 1:n relation. (It is possible for every 1:n relation where the related table consists only of required fields. If you have an optional field, you need to put something like a filler constant in the field, if possible.)
However, if these 1:n relations are not sparse, it is actually possible, provided you maintain the order in which you add fields to the document. So the category relation can probably be represented as long as you don't sort the lists.
You could implement a method that returns a Category by instantiating it with the values at positions 0...n. So if you want the first category, its values will be at position 0 of every list related to this category.
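The positional approach could be sketched as follows, in plain Java without SolrJ. It assumes the ids and names were indexed as two multivalued fields in the same order and come back in that order; CategoryLists and zip are hypothetical names, and Category is reduced to the two fields from the question.

```java
import java.util.ArrayList;
import java.util.List;

final class Category {
    final Long id;
    final String name;

    Category(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class CategoryLists {

    private CategoryLists() {
    }

    // Rebuilds Category objects from two parallel lists:
    // position i of each list belongs to the same category.
    static List<Category> zip(List<Long> ids, List<String> names) {
        if (ids.size() != names.size()) {
            // A sparse relation (a missing value in one list) breaks the pairing.
            throw new IllegalArgumentException("ids and names must have the same length");
        }
        List<Category> categories = new ArrayList<>(ids.size());
        for (int i = 0; i < ids.size(); i++) {
            categories.add(new Category(ids.get(i), names.get(i)));
        }
        return categories;
    }
}
```

The length check is where the "filler constant" for optional fields matters: without it, a missing value would shift every later pair.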
