Store Key-Value structure in Elasticsearch via Spring-Data

What is the best way to store a document (a Java POJO) that contains a Map, using the Spring Data Elasticsearch @Document annotation?
@Document(indexName = "downloadclienterrors", type = "downloadclienterror")
public class DownloadClientErrorLogElasticsearch {
    @Id
    private Long id;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String host;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String shortMessage;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String fullMessage;
    @Field(type = FieldType.Date)
    private String clientTimestamp;
    private Integer level;
    private Map<String, String> additionalFieldList;
    ...
}
With the POJO defined as in this first class, I can store it in my Elasticsearch instance through my repository.
This is how I then add data to the map. I want to stay flexible about which JSON fields I add, because they vary with my client software.
additionalFieldList.put("url", "http://www.google.de");
additionalFieldList.put("user_agent", "Browser/1.0.0 Windows");
My problem is that the fields inside additionalFieldList (for example additionalFieldList.url and additionalFieldList.user_agent) also need to be marked as not_analyzed.
I would like the map to behave the way a String does with the FieldIndex.not_analyzed annotation, but of course only for the values in the map.
@Field(type = FieldType.String, index = FieldIndex.not_analyzed)
private Map<String, String> additionalFieldList;
But that doesn't work: when I try to store the document, I get an ugly exception.
I am quite new to this area, so if someone knows a way to do this, or a better way to design such a document in Elasticsearch, I would love to hear some comments.
Thanks in advance and grey greetings from Hamburg,
Tommy Ziegler

You can use the @Mapping annotation to configure dynamic_templates.
Just put your mapping file on your classpath and annotate your POJO with @Mapping.
Mapping example
JSON
{
    "downloadclienterrors": {
        "dynamic_templates": [
            {
                "additionalFieldList": {
                    "path_match": "additionalFieldList.*",
                    "mapping": {
                        "type": "string",
                        "index": "not_analyzed"
                    }
                }
            }
        ]
        ...
    }
}
POJO
@Mapping(mappingPath = "/downloadclienterrors.json")
@Document(indexName = "downloadclienterrors", type = "downloadclienterror")
public class DownloadClientErrorLogElasticsearch {
    ...
}
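To make the effect of path_match more concrete, here is a minimal plain-Java sketch of how a pattern such as "additionalFieldList.*" lines up with the dotted field paths that the map entries produce. It is only an illustration of the matching idea, not Elasticsearch's implementation; the class name PathMatchSketch and its helper methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only: mimics simple '*' wildcard matching on
// dotted field paths. Elasticsearch does this matching itself.
class PathMatchSketch {

    // Returns true if fieldPath matches the glob, where '*' stands for any
    // run of characters and every other character is taken literally.
    static boolean pathMatches(String glob, String fieldPath) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.matches(regex.toString(), fieldPath);
    }

    // Builds the full dotted path for each map key, e.g. "additionalFieldList.url".
    static Map<String, String> flatten(String prefix, Map<String, String> values) {
        Map<String, String> out = new LinkedHashMap<>();
        values.forEach((key, value) -> out.put(prefix + "." + key, value));
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> additionalFieldList = new LinkedHashMap<>();
        additionalFieldList.put("url", "http://www.google.de");
        additionalFieldList.put("user_agent", "Browser/1.0.0 Windows");

        // Every map entry falls under the template; top-level fields do not.
        for (String path : flatten("additionalFieldList", additionalFieldList).keySet()) {
            System.out.println(path + " -> " + pathMatches("additionalFieldList.*", path));
        }
        System.out.println("host -> " + pathMatches("additionalFieldList.*", "host"));
    }
}
```

Because the template is keyed on the path rather than on fixed field names, new keys added by the client software pick up the not_analyzed mapping without any change to the POJO.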

What you have to do is create another class, Additional, and put additionalFieldList there.
Something like this:
public class Additional {
    private Map<String, String> additionalFieldList;
}
Then use this class in your POJO:
@Document(indexName = "downloadclienterrors", type = "downloadclienterror")
public class DownloadClientErrorLogElasticsearch {
    @Id
    private Long id;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String host;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String shortMessage;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String fullMessage;
    @Field(type = FieldType.Date)
    private String clientTimestamp;
    private Integer level;
    @Field(type = FieldType.Nested)
    private Additional additional;
    ...
}

Related

Hibernate search with ResultTransformer (DTO)

I need to collect a specific DTO from Hibernate Search results. I added all the dependencies in Maven and wrote the following query based on the official documentation (I removed the unnecessary code, which would only confuse, and left only what is needed for the search):
public List<QuestionDto> search(String text) {
    FullTextQuery query = fullTextEntityManager.createFullTextQuery(queryBuilder
            .simpleQueryString()
            .onField("description")
            .matching(text)
            .createQuery())
            .setProjection("id", "description", "title", "countValuable", "persistDateTime", "user.fullName", "tags")
            .setResultTransformer(new ResultTransformer() {
                @Override
                public Object transformTuple(Object[] tuple, String[] aliases) {
                    return QuestionDto.builder()
                            .id(((Number) tuple[0]).longValue())
                            .title((String) tuple[2])
                            .description((String) tuple[1])
                            .countValuable(((Number) tuple[3]).intValue())
                            .persistDateTime((LocalDateTime) tuple[4])
                            .username((String) tuple[5])
                            .tags((List<TagDto>) tuple[6])
                            .build();
                }
                @Override
                public List transformList(List collection) {
                    return collection;
                }
            });
    return query.getResultList();
}
But for some reason, tags comes back as null.
Does anyone have an idea why?
Entity Question
@Indexed
public class Question {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    @Field(store = Store.YES)
    private String title;
    private Integer viewCount = 0;
    @Field(store = Store.YES)
    private String description;
    @Field(store = Store.YES)
    private LocalDateTime persistDateTime;
    @Field(store = Store.YES)
    private Integer countValuable = 0;
    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @IndexedEmbedded(includePaths = "fullName")
    private User user;
    @ManyToMany(fetch = FetchType.LAZY)
    @IndexedEmbedded(includeEmbeddedObjectId = true, includePaths = {"name", "description"})
    private List<Tag> tags;
Entity Tag
public class Tag {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    @Field(store = Store.YES)
    private String name;
    @Field(store = Store.YES)
    private String description;
    private LocalDateTime persistDateTime;
    @ManyToMany(mappedBy = "tags", fetch = FetchType.LAZY)
    @ContainedIn
    private List<Question> questions;

There are two problems here:
Projections on multi-valued fields are not supported.
You are trying to project on an "object" field: "tags" does not hold a value by itself, it just has sub-fields ("tags.name", "tags.description"). Projections on "object" fields are not supported at the moment.
If you're using the Elasticsearch backend, you can take advantage of the _source projection (org.hibernate.search.elasticsearch.ElasticsearchProjectionConstants#SOURCE), which will return the string representation of the Elasticsearch document, formatted as JSON. You can then parse it (with Gson, for example) to extract whatever information you need.
And of course, you always have the option of not using projections at all, then extracting the information from the entities loaded from the database.
Note that in Hibernate Search 6 (currently in Beta), we're going to add built-in support for projections on multi-valued fields, but it's unlikely to be added to Hibernate Search 5, which is in maintenance mode (no new features or improvements, only bugfixes).
In a more distant future, we will probably add more direct support for DTOs.
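As a rough sketch of the "no projections" option: once the query returns loaded entities instead of tuples, the DTO can be assembled in plain Java, including the multi-valued tags. The classes below are hypothetical, stripped-down stand-ins for the Question and Tag entities and their DTOs (the real ones have more fields and a builder); only the mapping step is the point.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-ins; field names follow the question, everything else is assumed.
class DtoFromEntitiesSketch {

    static class Tag {
        final String name;
        final String description;
        Tag(String name, String description) { this.name = name; this.description = description; }
    }

    static class Question {
        final Long id;
        final String title;
        final List<Tag> tags;
        Question(Long id, String title, List<Tag> tags) { this.id = id; this.title = title; this.tags = tags; }
    }

    static class TagDto {
        final String name;
        final String description;
        TagDto(String name, String description) { this.name = name; this.description = description; }
    }

    static class QuestionDto {
        final Long id;
        final String title;
        final List<TagDto> tags;
        QuestionDto(Long id, String title, List<TagDto> tags) { this.id = id; this.title = title; this.tags = tags; }
    }

    // Maps one loaded entity to its DTO, copying each tag into a TagDto.
    static QuestionDto toDto(Question question) {
        List<TagDto> tags = question.tags.stream()
                .map(tag -> new TagDto(tag.name, tag.description))
                .collect(Collectors.toList());
        return new QuestionDto(question.id, question.title, tags);
    }

    // Maps the whole result list, e.g. the entities returned by the query.
    static List<QuestionDto> toDtos(List<Question> loaded) {
        return loaded.stream().map(DtoFromEntitiesSketch::toDto).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Question question = new Question(1L, "Sorting in Lucene",
                Arrays.asList(new Tag("java", "Java language"), new Tag("lucene", "Apache Lucene")));
        QuestionDto dto = toDtos(Arrays.asList(question)).get(0);
        System.out.println(dto.title + " has " + dto.tags.size() + " tags");
    }
}
```

The trade-off is that the entities are loaded from the database, and reading the lazy tags association has to happen while the persistence context is still open.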

ModelMapper - convert a Date inside a Collection<Object> to String (java)

I've searched a lot in this forum and other websites, but I'm still stuck with my problem.
I'm actually using modelmapper to convert an entity to a DTO.
Here is the entity:
@Entity
public class Candidate implements Serializable {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;
    @Column(name = "name")
    private String lastname;
    @Column(name = "firstname")
    private String firstname;
    @Column(name = "phone")
    private String phoneNumber;
    @Column(name = "mail")
    private String email;
    @Column(name = "title")
    private int title;
    @OneToMany(mappedBy = "candidateId")
    private Collection<Candidature> Interviews;
Here is the Candidature entity (the one you find in the first entity's collection):
public class Candidature implements Serializable {
    @Id
    @NotBlank
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "candidat_id")
    private Candidate candidateId;
    @Column(name = "interview")
    @Temporal(TemporalType.DATE)
    private Date dateInterview;
    @Column(name = "status")
    private String status;
And here is the DTO:
public class CandidateDTO {
    private Long id;
    private String lastname;
    private String firstname;
    private String phoneNumber;
    private String email;
    private String title;
    private String dateLastInterview;
As you can see, there are some differences.
The problem I face is that the last attribute of the DTO (dateLastInterview) comes from the Collection<Candidature>; more precisely, it must be the last dateInterview converted to a String.
Converting a Date to a String is not a problem, and neither is getting the last item of a Collection.
But I can't make it work with ModelMapper.
Here is a sample of the code I tried:
modelMapper = new ModelMapper();
Converter<Candidate, CandidateDTO> converter = new Converter<Candidate, CandidateDTO>()
{
    @Override
    public CandidateDTO convert(MappingContext<Candidate, CandidateDTO> mappingContext) {
        Candidate candidate = mappingContext.getSource();
        CandidateDTO cdto = new CandidateDTO();
        List<Candidature> list = (List) candidate.getInterviews();
        Date date = list.get(list.size()-1).getDateInterview();
        DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
        String dateInterviewConverted = df.format(date);
        mappingContext.getDestination().setTitle(mappingContext.getSource().getTitle());
        mappingContext.getDestination().setDateLastInterview(dateInterviewConverted);
        return cdto;
    }
};
modelMapper.createTypeMap(Candidate.class, CandidateDTO.class).setConverter(converter);
(Instead of the last line above, I also tried modelMapper.addConverter(converter); with the same result.)
But it doesn't work: all attributes come back null.
I previously succeeded using
map().setTitle(source.getTitle());
map().setDateLastInterview(dateInterviewConverted);
and then converting the Date to a String in my DTO's setter, but it seems that this logic shouldn't live there; it belongs in the ModelMapper configuration or in the class that uses it.
Do you have an idea? I'm new to ModelMapper, and I keep searching Google without finding (or maybe understanding?) an answer that helps.
Thanks

OK, I think I succeeded.
Using a converter was the right thing, but I wasn't using it correctly. The two types that you put inside <> are the types of the attributes concerned by the converter, not of the whole classes.
For example, with the first converter I wanted to configure how the Collection (coming from a Candidate object) becomes a String (to match the attribute of the DTO).
Then you only have to create a PropertyMap with the class and its DTO class, and in the configure() method you mention only the attributes that need special handling (the other ones are mapped correctly since they follow the standard mapping).
Converter<Collection<Candidature>, String> convertLastDateToString = new Converter<Collection<Candidature>, String>() {
    public String convert(MappingContext<Collection<Candidature>, String> context) {
        List<Candidature> candidatureList = (List)context.getSource();
        String dateInterviewConverted = "";
        if (candidatureList.size() > 0) {
            Date lastInterview = candidatureList.get(0).getDateInterview();
            for (int i = 0; i < candidatureList.size(); i++) {
                if (candidatureList.get(i).getDateInterview().after(lastInterview)) {
                    lastInterview = candidatureList.get(i).getDateInterview();
                }
            }
            // converts the Date to String
            DateFormat df = new SimpleDateFormat(DATE_FORMAT);
            dateInterviewConverted = df.format(lastInterview);
        }
        return dateInterviewConverted;
    }
};
// allows custom conversion for the title attribute
// the source (Candidate) has a title attribute of type int
// the destination (CandidateDTO) has a title attribute of type String
Converter<Integer, String> convertTitleToString = new Converter<Integer, String>(){
    public String convert(MappingContext<Integer, String> context){
        return Title.values()[context.getSource()].toString();
    }
};
// define explicit mappings between source and destination properties
// this only concerns the attributes that need custom mapping
PropertyMap<Candidate, CandidateDTO> candidateMapping = new PropertyMap<Candidate, CandidateDTO>()
{
    protected void configure()
    {
        // to map these two attributes, they will use the corresponding converters
        using(convertTitleToString).map(source.getTitle()).setTitle(null);
        using(convertLastDateToString).map(source.getCandidatures()).setDateLastInterview(null);
    }
};
// add the mapping settings to the ModelMapper
modelMapper.addMappings(candidateMapping);
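The core of the first converter, picking the most recent date from a collection and formatting it, can also be written without casting the Collection to a List. The following is a minimal plain-Java sketch of just that step, assuming the "yyyy-MM-dd" pattern from the question for DATE_FORMAT; the class and method names are hypothetical and ModelMapper itself is left out.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.Objects;

// Hypothetical helper: only the "latest date to String" logic, no ModelMapper.
class LastInterviewSketch {

    // Assumed pattern; the answer's DATE_FORMAT constant is not shown in the original.
    static final String DATE_FORMAT = "yyyy-MM-dd";

    // Returns the most recent non-null date formatted as a String,
    // or an empty String when there is no date at all.
    static String latestAsString(Collection<Date> interviewDates) {
        return interviewDates.stream()
                .filter(Objects::nonNull)
                .max(Comparator.naturalOrder())
                .map(date -> new SimpleDateFormat(DATE_FORMAT).format(date))
                .orElse("");
    }

    public static void main(String[] args) throws ParseException {
        DateFormat df = new SimpleDateFormat(DATE_FORMAT);
        Collection<Date> dates = Arrays.asList(
                df.parse("2019-03-01"), df.parse("2020-01-15"), df.parse("2018-12-24"));
        System.out.println(latestAsString(dates)); // 2020-01-15
        System.out.println(latestAsString(Collections.<Date>emptyList()).isEmpty()); // true
    }
}
```

Inside a converter, the same expression could be fed from the dateInterview values of the source collection, which avoids a ClassCastException when the persistence provider hands back a Collection that is not a List.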

Hibernate Search/Lucene : String field cannot be used for sorting "indexed with multiple values per document, use SORTED_SET instead"

I have the following model.
public class FeatureMeta {
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Long id;
    @Column(unique=true)
    private String uri;
    @Column
    @Field
    private String name;
    @Field
    @Column
    private String businessDesc;
    @Field
    @Column
    private String logicalDesc;
    .
    .
}
I am trying to sort the documents by "name" as follows :
org.hibernate.search.jpa.FullTextQuery jpaQuery =
fullTextEntityManager.createFullTextQuery(aggrBuilder.build(), FeatureMeta.class);
.
.
SortFieldContext sortCtx = queryBuilder.sort().byField("name",SortField.Type.STRING);
jpaQuery.setSort(sortCtx.createSort());
.
But Lucene throws the following exception:
java.lang.IllegalStateException: Type mismatch: name was indexed with multiple values per document, use SORTED_SET instead
    at org.apache.lucene.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:678)
    at org.apache.lucene.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:189)
    at org.apache.lucene.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:646)
    at org.apache.lucene.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:626)
    at org.apache.lucene.uninverting.UninvertingReader.getSortedDocValues(UninvertingReader.java:256)
    at org.apache.lucene.index.DocValues.getSorted(DocValues.java:262)
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getSortedDocValues(FieldComparator.java:762)
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getLeafComparator(FieldComparator.java:767)
    at org.apache.lucene.search.FieldValueHitQueue.getComparators(FieldValueHitQueue.java:183)
    at org.apache.lucene.search.TopFieldCollector$SimpleFieldCollector.getLeafCollector(TopFieldCollector.java:164)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:812)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:535)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:523)
    at org.hibernate.search.query.engine.impl.LazyQueryState.search(LazyQueryState.java:103)
Any tips?

EDIT: Actually, before anything else you should check which analyzer you are using on the name field. The analyzer probably has a tokenizer, which will result in multi-valued fields, which cannot be sorted on.
Try adding a different field for sorting, and use an analyzer with a KeywordTokenizer on this field:
@AnalyzerDef(name = "sort_analyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    }
)
public class FeatureMeta {
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Long id;
    @Column(unique=true)
    private String uri;
    @Column
    @Field
    @Field(name = "name_sort", analyzer = @Analyzer(definition = "sort_analyzer"))
    private String name;
    @Field
    @Column
    private String businessDesc;
    @Field
    @Column
    private String logicalDesc;
    .
    .
}
Then sort on this new field, instead of the default one:
SortFieldContext sortCtx = queryBuilder.sort().byField("name_sort",SortField.Type.STRING);
Original answer (the points I made are still valid):
Not sure what causes the exception in your case, but try fixing these issues in your code:
Add a @SortableField annotation on the name property
Do not use queryBuilder.sort().byField("name",SortField.Type.STRING), just use queryBuilder.sort().byField("name")
If it doesn't work, maybe you should try to wipe your indexes and reindex.
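To illustrate why the choice of tokenizer matters for sorting, here is a small plain-Java sketch that contrasts the two behaviours. The two methods are simplified stand-ins, not Lucene's analyzers: one splits on whitespace the way a tokenizing analyzer roughly would, the other keeps the whole value as a single lowercased token, as a KeywordTokenizer followed by a lowercase filter would (ASCII folding is left out). The sample value is made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Simplified stand-ins for analyzers; not Lucene code.
class SortTokenSketch {

    // Tokenizing analyzer stand-in: one token per whitespace-separated word.
    static List<String> tokenizingAnalyzer(String value) {
        return Arrays.stream(value.trim().split("\\s+"))
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Keyword analyzer stand-in: the whole value becomes exactly one token.
    static List<String> keywordAnalyzer(String value) {
        return Collections.singletonList(value.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String name = "Customer Churn Score";
        // Three values for one document: there is no single value to sort by.
        System.out.println(tokenizingAnalyzer(name));
        // One value for one document: usable as a sort key.
        System.out.println(keywordAnalyzer(name));
    }
}
```

With several tokens per document there is no single term to order by, which is what the "indexed with multiple values per document" message points at; the separate name_sort field keeps full-text search on name while providing one sortable value.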

Find top order by date desc issue

My elasticsearch version is: "2.4.2",
spring boot version is: "1.4.2.RELEASE".
My repository:
public interface StockDetailsEsRepository extends ElasticsearchRepository<StockDetails, String> {
StockDetails findTopByOrderByDateDesc();
}
StockDetails Class:
@Document(indexName = "stock_details", type = "daily", replicas = 0)
public class StockDetails {
    @Id
    private String id;
    @Field(type = FieldType.Date, format = DateFormat.custom, pattern = "yyyy-MM-dd")
    @JsonDeserialize(using = CustomLocalDateDeserializer.class)
    @JsonSerialize(using = CustomLocalDateSerializer.class)
    private LocalDate date;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    @JsonDeserialize(using = BigDecimalDeserializer.class)
    @JsonSerialize(using = BigDecimalSerializer.class)
    private BigDecimal openPrice;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    @JsonDeserialize(using = BigDecimalDeserializer.class)
    @JsonSerialize(using = BigDecimalSerializer.class)
    private BigDecimal maxPrice;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    @JsonDeserialize(using = BigDecimalDeserializer.class)
    @JsonSerialize(using = BigDecimalSerializer.class)
    private BigDecimal minPrice;
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    @JsonDeserialize(using = BigDecimalDeserializer.class)
    @JsonSerialize(using = BigDecimalSerializer.class)
    private BigDecimal closePrice;
    @Field(type = FieldType.Long)
    private Long transactionsNumber = 0L;
    @Field(type = FieldType.Long)
    private Long volume;
    private Stock stock;
The problem is that when I call the query method from my repository, I get the following exception:
java.lang.NullPointerException: null
at org.springframework.data.elasticsearch.core.ElasticsearchTemplate.queryForPage(ElasticsearchTemplate.java:308)
at org.springframework.data.elasticsearch.core.ElasticsearchTemplate.queryForObject(ElasticsearchTemplate.java:252)
I debugged the ElasticsearchTemplate.queryForPage method, and the CriteriaQuery criteriaQuery argument is null. The same problem occurs when I change the method name in my repository to findTopByOrderByStockTickerDesc. It's weird, because the official Spring Data Elasticsearch documentation has almost the same example: http://docs.spring.io/spring-data/elasticsearch/docs/current/reference/html/#repositories.limit-query-result
findTopByOrderByAgeDesc();
Of course, I can achieve my goal another way, for example:
Sort sort = new Sort(new Sort.Order(Sort.Direction.DESC, "date"));
StockDetails topByOrderByDateDesc = stockDetailsEsRepository.findAll(new PageRequest(0, 1, sort))
.getContent()
.stream()
.findFirst()
.get();
But I would like to use the method described in the official documentation. Has anyone had a similar problem?
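One detail of the paging workaround is worth a note: it ends in findFirst().get(), which throws NoSuchElementException when no document exists. Below is a minimal plain-Java sketch of the same "sort by date descending, take the first" idea that returns an Optional instead. The in-memory list stands in for the repository, and the reduced StockDetails class and the latest method are hypothetical.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// In-memory illustration only; the real data would come from the repository.
class LatestByDateSketch {

    // Hypothetical, reduced version of the StockDetails document.
    static class StockDetails {
        final String id;
        final LocalDate date;
        StockDetails(String id, LocalDate date) { this.id = id; this.date = date; }
    }

    // Sorts by date descending and returns the first element, if there is one.
    static Optional<StockDetails> latest(List<StockDetails> all) {
        return all.stream()
                .sorted(Comparator.comparing((StockDetails s) -> s.date).reversed())
                .findFirst();
    }

    public static void main(String[] args) {
        List<StockDetails> index = Arrays.asList(
                new StockDetails("a", LocalDate.of(2017, 1, 2)),
                new StockDetails("b", LocalDate.of(2017, 1, 5)));
        System.out.println(latest(index).map(s -> s.id).orElse("none")); // b
        System.out.println(latest(Collections.<StockDetails>emptyList())
                .map(s -> s.id).orElse("none")); // none
    }
}
```

Applied to the workaround, this amounts to stopping at findFirst() and letting the caller decide what an empty result means.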

Disable indexing fields for all fields in the document in Elasticsearch

I am using Elasticsearch in my Java Spring application, and Spring JPA is used to work with Elasticsearch.
I have a document, and a corresponding Java class, in which none of the fields should be analyzed (I search them for exact matches using the termFilter statement in the Java API).
In my case I have to annotate each field with
@Field(type = FieldType.String, index = FieldIndex.not_analyzed)
and I end up with something like this:
@JsonInclude(JsonInclude.Include.NON_NULL)
@JsonIgnoreProperties(ignoreUnknown = true)
@Document(indexName = "message", type = "message")
public class Message implements Serializable {
    @Id
    @NotNull
    @JsonProperty("id")
    private String id;
    @JsonProperty("userName")
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String userName;
    @NotNull
    @JsonProperty("topic")
    @Field(index = FieldIndex.not_analyzed, type = FieldType.String)
    private String topic;
    @NotNull
    @JsonProperty("address")
    @Field(index = FieldIndex.not_analyzed, type = FieldType.String)
    private String address;
    @NotNull
    @JsonProperty("recipient")
    @Field(index = FieldIndex.not_analyzed, type = FieldType.String)
    private String recipient;
}
Is it possible to put a single annotation on the class instead of duplicating it on every field?

You can achieve your goal without @Field annotations by using raw mappings plus dynamic templates.
Specify the path to your mappings JSON file using the @Mapping annotation:
@Mapping(mappingPath = "/mappings.json")
Then define your mapping in mappings.json like this:
{
    "mappings": {
        "message": {
            "dynamic_templates": [
                {
                    "notanalyzed": {
                        "match": "*",
                        "match_mapping_type": "string",
                        "mapping": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            ]
        }
    }
}
Note: I didn't test it, so please check for typos.
