Reading CSV data in Spring Batch (creating a custom LineMapper) - java

I've been doing a bit of work writing some batch processing code on CSV data. I found a tutorial online and so far have been using it without really understanding how or why it works, which means I'm unable to solve a problem I'm currently facing.
The code I'm working with is below:
@Bean
public LineMapper<Employee> lineMapper() {
    DefaultLineMapper<Employee> lineMapper = new DefaultLineMapper<Employee>();
    DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
    lineTokenizer.setNames(new String[] { "id", "firstName", "lastName" });
    lineTokenizer.setIncludedFields(new int[] { 0, 1, 2 });
    BeanWrapperFieldSetMapper<Employee> fieldSetMapper = new BeanWrapperFieldSetMapper<Employee>();
    fieldSetMapper.setTargetType(Employee.class);
    lineMapper.setLineTokenizer(lineTokenizer);
    lineMapper.setFieldSetMapper(fieldSetMapper);
    return lineMapper;
}
I'm not entirely clear on what setNames or setIncludedFields is really doing. I've looked through the docs, but still don't know what's happening under the hood. Why do we need to give names to the lineTokenizer? Why can't it just be told how many columns of data there will be? Is its only purpose so that the fieldSetMapper knows which fields to map to which data objects (do they all need to be named the same as the fields in the POJO?)?
I have a new problem where I have CSVs with a large number of columns (about 25-35) that I need to process. Is there a way to generate the columns in setNames programmatically with the variable names of the POJOs, rather than editing them in by hand?
Edit:
An example input file may be something like:
test.csv:
field1, field2, field3,
a,b,c
d,e,f
g,h,j
The DTO:
public class Test {
    private String field1;
    private String field2;
    private String field3;
    // setters, getters, and constructor
}

I see the confusion, so I will try to clarify how the key interfaces work together. A LineMapper is responsible for mapping a single line from your input file to an instance of your domain type. The default implementation provided by Spring Batch is the DefaultLineMapper, which delegates the work to two collaborators:
LineTokenizer: which takes a String and tokenizes it into a FieldSet (which is similar to the ResultSet in the JDBC world, where you can get fields by index or name)
FieldSetMapper: which maps the FieldSet to an instance of your domain type
So the process is: String -> FieldSet -> Object.
Each interface comes with a default implementation, but you can provide your own if needed.
DelimitedLineTokenizer
The names attribute in DelimitedLineTokenizer is used to create named fields in the FieldSet. This allows you to get a field by name from the FieldSet (again, similar to ResultSet methods where you can get a field by name). The includedFields attribute lets you select a subset of fields from your input file, as in your use case where you have 25 fields and only need to extract a subset of them.
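To make the relationship between names and includedFields concrete, here is a minimal plain-Java sketch. It is not Spring Batch's implementation: TokenizerSketch and its tokenize method are hypothetical stand-ins, a Map plays the role of the FieldSet, and quoting or escaping of delimiters is ignored. It only illustrates that each name is paired with the included column at the same position.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TokenizerSketch {

    // Splits a delimited line, keeps only the included column positions,
    // and pairs each kept value with the name at the same index in names.
    static Map<String, String> tokenize(String line, String[] names, int[] includedFields) {
        if (names.length != includedFields.length) {
            throw new IllegalArgumentException("one name is needed per included field");
        }
        String[] tokens = line.split(",", -1);
        Map<String, String> fieldSet = new LinkedHashMap<>();
        for (int i = 0; i < includedFields.length; i++) {
            fieldSet.put(names[i], tokens[includedFields[i]].trim());
        }
        return fieldSet;
    }

    public static void main(String[] args) {
        // Five columns in the file, but only columns 0 and 2 are of interest.
        Map<String, String> fieldSet = tokenize(
                "a,b,c,d,e",
                new String[] { "id", "lastName" },
                new int[] { 0, 2 });
        System.out.println(fieldSet.get("lastName")); // prints "c"
        System.out.println(fieldSet);                 // prints "{id=a, lastName=c}"
    }
}
```

In this sketch the names exist purely so that later code can ask for a value by name instead of by column position.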
BeanWrapperFieldSetMapper
This FieldSetMapper implementation expects a type and uses the JavaBean naming conventions for getters/setters to set fields on the target object from the FieldSet.
Is there a way to generate the columns in setNames programmatically with the variable names of the POJOs, rather than editing them in by hand?
This is what the BeanWrapperFieldSetMapper will do. If you provide field names in the FieldSet, the mapper will call the setter of each field having the same name. The name matching is fuzzy in the sense that it tolerates close matches. Here is an excerpt from the Javadoc:
Property name matching is "fuzzy" in the sense that it tolerates close matches,
as long as the match is unique. For instance:
* Quantity = quantity (field names can be capitalised)
* ISIN = isin (acronyms can be lower case bean property names, as per Java Beans recommendations)
* DuckPate = duckPate (capitalisation including camel casing)
* ITEM_ID = itemId (capitalisation and replacing word boundary with underscore)
* ORDER.CUSTOMER_ID = order.customerId (nested paths are recursively checked)
This mapper is also configurable with a custom ConversionService if needed. If this still does not cover your use case, you need to provide a custom mapper.
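If you would rather not type 25-35 names by hand, one option is to derive the array from the POJO with reflection and pass it to setNames. The following is only a sketch under assumptions: FieldNames and namesOf are hypothetical names, the nested Test class stands in for the DTO from the question, and it relies on getDeclaredFields, whose Javadoc does not guarantee any particular order. Since the order of the names must match the column order of the file, check the result before relying on it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;

public class FieldNames {

    // Stand-in for the DTO from the question.
    static class Test {
        private String field1;
        private String field2;
        private String field3;
    }

    // Collects the names of the instance fields declared directly on the given class.
    // Static and compiler-generated (synthetic) fields are skipped.
    static String[] namesOf(Class<?> type) {
        return Arrays.stream(type.getDeclaredFields())
                .filter(f -> !Modifier.isStatic(f.getModifiers()) && !f.isSynthetic())
                .map(Field::getName)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // The resulting array could be handed to lineTokenizer.setNames(...).
        System.out.println(Arrays.toString(namesOf(Test.class)));
    }
}
```

Fields inherited from a superclass are not included by this sketch.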

Related

ZK MVVM Validation - Dependent Property Array content?

I'm playing around with the ZK 8 MVVM form validation system and generally it seems to do what I want, but I wonder what the definition of the dependent property index is...
Let's take a simple validator...
public class FormValidator extends AbstractValidator {
    @Override
    public void validate(final ValidationContext ctx) {
        Property[] properties = ctx.getProperties("firstName");
        Object value0 = properties[0].getValue();
        Object value1 = properties[1].getValue();
    }
}
So, when this is called before the save command, for every property, I get a Property[] array of length 2. But somehow, I have yet to find out what is stored in [0] and what is stored in [1]. Sometimes it seems that [0] stores the current value (which may or may not be valid according to the field validator there) and [1] the last valid entry... But sometimes it seems to be the other way round...
The examples in the documentation always seem to simply take the first element ([0]) for validation, but I would like to understand what both parts of this pair actually mean...
Anyone got an idea for that?
I might be off the mark with my answer, but if you are using ZK 8, you should look into using form binding.
That way you do not have to handle Properties in your validator and can retrieve a proxy object matching the bean you use for your form.
If you are using a User POJO with firstName and lastName attributes:
User myProxy = (User) ctx.getProperty().getValue();
And then you can validate both fields by simply calling getFirstName and getLastName on myProxy.
Hope it helps.

How can I use capitalized property in a hibernate entity

In our product we use auto-generated Hibernate entities to link a customizable database schema to our server software. The entity names and property names are taken from the database. In particular, the property names usually cannot be changed, as they are also used in user code unrelated to the Hibernate data layer (e.g. Python scripts).
Some of these property names are capitalized, which seems to cause some problems. HQL statements using those property names fail with an Exception, e.g.:
org.hibernate.QueryException: could not resolve property List_id
at org.hibernate.QueryException.generateQueryException(QueryException.java:137)
at org.hibernate.QueryException.wrapWithQueryString(QueryException.java:120)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.doCompile(QueryTranslatorImpl.java:234)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.compile(QueryTranslatorImpl.java:158)
at org.hibernate.engine.query.spi.HQLQueryPlan.<init>(HQLQueryPlan.java:126)
at org.hibernate.engine.query.spi.HQLQueryPlan.<init>(HQLQueryPlan.java:88)
at org.hibernate.engine.query.spi.QueryPlanCache.getHQLQueryPlan(QueryPlanCache.java:190)
Some code snippet for the example Exception:
@Entity(name = "ListItem")
@Table(name = "LIST_ITEM")
public class ListItem
        extends HibernatePojoClass
{
    private String List_id = "";
    @Column(name = "`LIST_ID`", length = 8)
    public String getList_id() {
        return List_id;
    }
    public void setList_id(String List_id) {
        this.List_id = List_id;
    }
    ...
and the HQL statement:
select li.id, li.List_id from ListItem li
The exception occurs when hibernate tries to transform the hql statement to a sql statement.
Why does this happen?
It seems that when I use li.list_id in the hql statement, the property is resolved (while this leads to another error); can I prevent this implicit "capitalization change" somehow?
If you use
@Column(name = "`LIST_ID`", length = 8)
public String getList_id() {
    return List_id;
}
you should refer to that property as list_id in HQL, of course.
Hibernate can use a naming strategy to generate column names. ImprovedNamingStrategy from Hibernate 4 will convert the column name to lower case, even if you specify it explicitly. I am not sure about the quotes, but for this:
@Column(name = "LIST_ID", length = 8)
public String getList_id() {
    return List_id;
}
using ImprovedNamingStrategy you will get list_id as the column name.
You can try to use your own naming strategy to generate correct column names.
JPA has 2 basic access modes: property access and field access.
Property access requires you to adhere to the Java Beans convention, which means you need a field name that starts with a lower case character and a corresponding getter/setter which has the same character in upper case, i.e. field listId would require a getter getListId().
Thus you'd need to use field access in order to have Hibernate use the field name as it is. Another advantage of using field access on an entity's id would be that you'd not need to do any lazy loading in order to just get the id - which wouldn't be possible with property access in Hibernate.
For more information have a look at sections 2.2 and 2.3 of the JPA specification.
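The "capitalization change" from the question can be reproduced with the JDK's own JavaBeans introspector, without Hibernate. This is a minimal sketch: PropertyNames and propertyNames are hypothetical names, and ListItem is a stripped-down stand-in for the entity with the annotations removed. That Hibernate's property access derives names by the same JavaBeans rule is an assumption here, although it is consistent with li.list_id being resolved in the question.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class PropertyNames {

    // Stand-in for the entity from the question, without the Hibernate annotations.
    public static class ListItem {
        private String List_id = "";

        public String getList_id() {
            return List_id;
        }

        public void setList_id(String List_id) {
            this.List_id = List_id;
        }
    }

    // Lists the JavaBeans property names the JDK derives from the getters and setters.
    // Object.class is the stop class, so the inherited "class" property is left out.
    static List<String> propertyNames(Class<?> type) throws IntrospectionException {
        BeanInfo info = Introspector.getBeanInfo(type, Object.class);
        List<String> names = new ArrayList<>();
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            names.add(pd.getName());
        }
        return names;
    }

    public static void main(String[] args) throws IntrospectionException {
        // The part of the getter name after "get" has its first letter lower-cased.
        System.out.println(Introspector.decapitalize("List_id")); // prints "list_id"
        System.out.println(propertyNames(ListItem.class));        // prints "[list_id]"
    }
}
```

The field is still called List_id, but the property derived from getList_id() is list_id, which is why the two access modes see different names.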
A final word of advice though: as already stated multiple times in my comments you should try and stick with the Java code conventions. Some advantages of doing so:
It'll be easier to communicate with others such as people here on SO (e.g. a name starting with a capital letter normally is assumed to be a class name).
You'll have fewer problems with libraries in the Java ecosystem since most of them use the same conventions or are based on them (e.g. JavaBeans, JavaEL, etc.)
It'll be easier to spot errors, e.g. when using a class rather than a field or variable etc.
You'll be less dependent on IDE features like code coloring, error highlighting etc.

java/jackson - resolve during parsing

I have a group which contains a list of persons:
class Person {
    ...
}
class Group {
    public Person findPerson(String name) {
        ...
    }
}
Say I have an input JSON (a representation of SomeDataClass, see below) which refers to a person by name:
{
...
"person" : "Bill"
}
I am using Jackson to parse this input JSON. By default, Jackson parses the person field to a String. Is it possible to change this, such that the person is resolved/looked up during parsing?
class SomeDataClass {
    ...
    @JsonProperty("person")
    protected Person person;
}
Note that I do not want to create a new person. I want to look it up by calling the function findPerson on an instance of Group. This means that I must have access to the group during parsing. There are several groups at runtime, so it is not a singleton.
Update:
I am aware of the @JsonDeserialize(using = XYZ.class) possibility, but this does not allow me to pass the group to the custom deserializer. As said, there are multiple groups, so it is not a singleton.
I do not think this is possible with Jackson. You could try to store your reference to the group in a ThreadLocal, so your deserializer is using the correct group.
Jackson does have support for Object Ids, via the @JsonIdentityInfo annotation. But it is assumed that references using ids ("Bill" in this case) can be resolved by matching definitions within the JSON content, so this may not work for your case.
You may need to handle resolution yourself: if you define setPerson(String), the method itself could try to locate the actual instance to use. But that does require use of a ThreadLocal, as mentioned.
Another alternative could be a custom deserializer that reads "attributes" via DeserializationContext; but you still need to provide such mappings, so it does not help a lot.
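The ThreadLocal idea combined with a setPerson(String) setter could look roughly like the sketch below. It uses plain Java only: CURRENT_GROUP and parseWith are hypothetical names, the bodies of Person and Group are invented for illustration, and the direct call to setPerson stands in for the point where Jackson would invoke the setter during readValue.

```java
import java.util.HashMap;
import java.util.Map;

public class ThreadLocalLookup {

    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Group {
        private final Map<String, Person> members = new HashMap<>();
        void add(Person p) { members.put(p.name, p); }
        Person findPerson(String name) { return members.get(name); }
    }

    // Hypothetical holder: the group that applies to parsing on the current thread.
    static final ThreadLocal<Group> CURRENT_GROUP = new ThreadLocal<>();

    static class SomeDataClass {
        Person person;

        // A data-binding library would call this setter with the raw JSON string;
        // the setter resolves it against the group bound to the current thread.
        void setPerson(String name) {
            Group group = CURRENT_GROUP.get();
            if (group == null) {
                throw new IllegalStateException("no group bound to this thread");
            }
            this.person = group.findPerson(name);
        }
    }

    // Binds the group, runs the parsing step, and always clears the binding afterwards.
    static SomeDataClass parseWith(Group group, String personName) {
        CURRENT_GROUP.set(group);
        try {
            SomeDataClass data = new SomeDataClass();
            data.setPerson(personName); // stands in for the Jackson readValue call
            return data;
        } finally {
            CURRENT_GROUP.remove();
        }
    }

    public static void main(String[] args) {
        Group group = new Group();
        Person bill = new Person("Bill");
        group.add(bill);
        SomeDataClass data = parseWith(group, "Bill");
        System.out.println(data.person == bill); // prints "true"
    }
}
```

Clearing the ThreadLocal in a finally block matters when threads are pooled, since a stale group would otherwise leak into the next parse on the same thread.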

Dynamically show fields of the generic type through intellisense

In Servoy, a development and deployment platform, you have the possibility to use what is called a JSFoundSet which is an object containing record objects defined by its SQL. Such a JSFoundSet is created as follows using the appropriate annotation:
/** @type {JSFoundSet<db:/database/table_name>} */
var fs = null;
From this point on you can use the variable fs in your code to get or set the values of the properties of table_name. So if I create a foundset of table Customer and this table contains the columns id, firstName and lastName, then the Servoy platform provides intellisense that allows me to do this:
fs.id = 1;
fs.firstName = 'John';
fs.lastName = 'Doe';
Since I use a lot of Java too, I wanted to see if I can create something similar in Java. So I want to create a class FoundSet of a certain generic type, say of type Customer in our example, after which in my code I can create an object of this class and then access the public fields (set/get) of FoundSet. While typing I wish to see these fields show up through intellisense.
Is there a library of some sort that allows me to define annotations, as in the Servoy example, to accomplish this?

How to set field for objects in list by field name with LambdaJ?

Code:
class MyClass {
    private String field1;
    private Long field2;
    //getters and setters also here
}
List<MyClass> myClassList = new ArrayList<>();
//getting my list filled
Now I need to set e.g. field1 for all objects in list to some value. I can do it with:
forEach(myClassList).setField1("some value");
But how can I set some field dynamically, passing field name as string "field1" or "field2" etc.?
What you're asking is contrary to the main principle on which lambdaj is based. I designed it to let you invoke the methods of your beans in a strongly typed way. That way you get all the help your favorite IDE can give you, such as autocompletion. Moreover, if you decide to rename that method, your IDE will be able to change the name for you automatically, or at least you'll get a compilation error instead of finding the problem only at runtime.
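Outside lambdaj, setting a property chosen by name at runtime can be done with the JDK's java.beans.PropertyDescriptor, at the cost of exactly the compile-time safety described above. A minimal sketch, assuming MyClass has conventional getters and setters for both fields; SetByName and setAll are hypothetical names:

```java
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class SetByName {

    public static class MyClass {
        private String field1;
        private Long field2;

        public String getField1() { return field1; }
        public void setField1(String field1) { this.field1 = field1; }
        public Long getField2() { return field2; }
        public void setField2(Long field2) { this.field2 = field2; }
    }

    // Looks up the setter for the named property once, then invokes it on every element.
    // A misspelled property name or a value of the wrong type only fails at runtime.
    static <T> void setAll(List<T> items, Class<T> type, String property, Object value)
            throws Exception {
        PropertyDescriptor pd = new PropertyDescriptor(property, type);
        for (T item : items) {
            pd.getWriteMethod().invoke(item, value);
        }
    }

    public static void main(String[] args) throws Exception {
        List<MyClass> myClassList = new ArrayList<>();
        myClassList.add(new MyClass());
        myClassList.add(new MyClass());

        setAll(myClassList, MyClass.class, "field1", "some value");
        setAll(myClassList, MyClass.class, "field2", 42L);

        System.out.println(myClassList.get(1).getField1()); // prints "some value"
        System.out.println(myClassList.get(0).getField2()); // prints "42"
    }
}
```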
