CSV file structure for a Java object

I want a CSV format for Order objects. My Order object will have order details, order line details, and item details. Please find the Java object below:
Order {
    OrderNo, OrderName, Price,
    OrderLine {
        OrderLineNo, OrderLinePrice,
        Item {
            ItemNo, ItemName, ItemDescription
        }
    }
}
Can anyone please guide me on creating a CSV format for this?

Have a POJO class for the object for which you want to create the CSV file, and use java.io.FileWriter to write/append values to the CSV file. This Java Code Geeks link will help you with this.
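For illustration, here is a minimal sketch of that flattening approach (a sketch only, not a definitive implementation: it presumes the Order, OrderLine and Item POJOs from the question exist with conventional getters, and the class and method names are made up). It writes one CSV row per item, repeating the order and order-line columns on each row:

import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

public class OrderCsvWriter {

    // Assumes Order/OrderLine/Item POJOs with conventional getters (names are illustrative).
    public static void writeOrders(List<Order> orders, String path) throws IOException {
        try (FileWriter writer = new FileWriter(path)) {
            // Header row: parent columns repeated before each item's columns
            writer.write("OrderNo,OrderName,Price,OrderLineNo,OrderLinePrice,ItemNo,ItemName,ItemDescription\n");
            for (Order order : orders) {
                for (OrderLine line : order.getOrderLines()) {
                    for (Item item : line.getItems()) {
                        writer.write(String.join(",",
                                String.valueOf(order.getOrderNo()), order.getOrderName(), String.valueOf(order.getPrice()),
                                String.valueOf(line.getOrderLineNo()), String.valueOf(line.getOrderLinePrice()),
                                String.valueOf(item.getItemNo()), item.getItemName(), item.getItemDescription()) + "\n");
                    }
                }
            }
        }
    }
}

Repeating the parent columns on every row is the usual way to flatten a nested structure into CSV; a reader can group the rows back together by OrderNo and OrderLineNo.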

If you are feeling adventurous, I'm building support for nested elements in CSV in uniVocity-parsers.
The 2.0.0-SNAPSHOT version supports parsing nested beans with annotations. We are planning to release the final version in a couple of weeks. Writing support has not been implemented yet, so that part you'll have to do manually (should be fairly easy with the current API).
Parsing this sort of structure is more complex, but the parser seems to be working fine for most cases. Have a look at that test case:
Input CSV:
1,Foo
Account,23234,HSBC,123433-000,HSBCAUS
Account,11234,HSBC,222343-130,HSBCCAD
2,BAR
Account,1234,CITI,213343-130,CITICAD
Note that the first column of each row identifies which bean will be read: rows whose first column matches the identityValue of the @Nested annotation ("Account" here) are read into the nested ClientAccount list, while the other rows populate the Client bean itself.
POJOs:
enum ClientType {
    PERSONAL(2),
    BUSINESS(1);

    int typeCode;

    ClientType(int typeCode) {
        this.typeCode = typeCode;
    }
}

public static class Client {
    @EnumOptions(customElement = "typeCode", selectors = { EnumSelector.CUSTOM_FIELD })
    @Parsed(index = 0)
    private ClientType type;

    @Parsed(index = 1)
    private String name;

    @Nested(identityValue = "Account", identityIndex = 0, instanceOf = ArrayList.class, componentType = ClientAccount.class)
    private List<ClientAccount> accounts;
}

public static class ClientAccount {
    @Parsed(index = 1)
    private BigDecimal balance;

    @Parsed(index = 2)
    private String bank;

    @Parsed(index = 3)
    private String number;

    @Parsed(index = 4)
    private String swift;
}
Code to parse the input
public void parseCsvToBeanWithList() {
    final BeanListProcessor<Client> clientProcessor = new BeanListProcessor<Client>(Client.class);

    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setLineSeparator("\n");
    settings.setRowProcessor(clientProcessor);

    CsvParser parser = new CsvParser(settings);
    parser.parse(new StringReader(CSV_INPUT));

    List<Client> rows = clientProcessor.getBeans();
}
If you find any issue using the parser, please report it by updating this issue.

Related

Apache Beam how to filter data based on date value

I am trying to read records from a CSV file and filter the records based on the date. I have implemented this in the following way. But is this a correct way?
The steps are:
Creating pipeline
Read the data from a file
Perform necessary filtering
Create a MapElement Object and convert the OrderRequest to String
Mapping the OrderRequest Entity to String
Write the output to a file
Code:
// Creating pipeline
Pipeline pipeline = Pipeline.create();

// For transformations: reading from a file
PCollection<String> orderRequest = pipeline
        .apply(TextIO.read().from("src/main/resources/ST/STCheck/OrderRequest.csv"));

PCollection<OrderRequest> pCollectionTransformation = orderRequest
        .apply(ParDo.of(new DoFn<String, OrderRequest>() {
            private static final long serialVersionUID = 1L;

            @ProcessElement
            public void processElement(ProcessContext c) {
                String rowString = c.element();
                if (!rowString.contains("order_id")) {
                    String[] strArr = rowString.split(",");
                    OrderRequest orderRequest = new OrderRequest();
                    orderRequest.setOrder_id(strArr[0]);

                    // Parse the order date and the fixed cutoff date for comparison
                    String source1 = strArr[1];
                    DateTimeFormatter fmt1 = DateTimeFormat.forPattern("mm/dd/yyyy");
                    DateTime d1 = fmt1.parseDateTime(source1);
                    System.out.println(d1);

                    String source2 = "4/24/2017";
                    DateTimeFormatter fmt2 = DateTimeFormat.forPattern("mm/dd/yyyy");
                    DateTime d2 = fmt2.parseDateTime(source2);
                    System.out.println(d2);

                    orderRequest.setOrder_date(strArr[1]);
                    System.out.println(strArr[1]);
                    orderRequest.setAmount(Double.valueOf(strArr[2]));
                    orderRequest.setCounter_id(strArr[3]);

                    if (DateTimeComparator.getInstance().compare(d1, d2) > -1) {
                        c.output(orderRequest);
                    }
                }
            }
        }));

// Create a MapElements object and convert the OrderRequest to String
MapElements<OrderRequest, String> mapElements = MapElements.into(TypeDescriptors.strings())
        .via((OrderRequest orderRequestType) -> orderRequestType.getOrder_id() + " "
                + orderRequestType.getOrder_date() + " " + orderRequestType.getAmount() + " "
                + orderRequestType.getCounter_id());

// Mapping the OrderRequest entity to String
PCollection<String> pStringList = pCollectionTransformation.apply(mapElements);

// Now writing the elements to a file
pStringList.apply(TextIO.write().to("src/main/resources/ST/STCheck/OrderRequestOut.csv").withNumShards(1)
        .withSuffix(".csv"));

// To run the pipeline
pipeline.run();
System.out.println("We are done!!");
Pojo Class:
public class OrderRequest implements Serializable {
    String order_id;
    String order_date;
    double amount;
    String counter_id;
}
Though I am getting the correct result, is this the correct way? My two main problems are:
1) How do I access individual columns, so that I can specify conditions based on a column's value?
2) Can we specify headers when reading the data?
Yes, you can process CSV files like this using TextIO.read() provided they do not contain fields embedding newlines and you can recognize/skip the header lines. Your pipeline looks good, though as a minor style issue I would probably have the first ParDo do only the parsing, followed by a Filter that looked at the date to filter things out.
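As a rough sketch of that split (not from the original answer; it assumes the OrderRequest setters/getters shown in the question, Beam's built-in Filter transform, and the Joda-Time classes already used above; note that Joda-Time uses uppercase MM for month, while lowercase mm means minutes):

// Sketch: a ParDo that only parses, followed by a Filter on the date.
PCollection<OrderRequest> parsed = orderRequest.apply("ParseCsv",
        ParDo.of(new DoFn<String, OrderRequest>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
                String row = c.element();
                if (row.contains("order_id")) {
                    return; // skip the header line
                }
                String[] cols = row.split(",");
                OrderRequest order = new OrderRequest();
                order.setOrder_id(cols[0]);
                order.setOrder_date(cols[1]);
                order.setAmount(Double.valueOf(cols[2]));
                order.setCounter_id(cols[3]);
                c.output(order);
            }
        }));

PCollection<OrderRequest> filtered = parsed.apply("FilterByDate",
        Filter.by((OrderRequest order) -> {
            DateTimeFormatter fmt = DateTimeFormat.forPattern("MM/dd/yyyy");
            DateTime orderDate = fmt.parseDateTime(order.getOrder_date());
            DateTime cutoff = fmt.parseDateTime("04/24/2017");
            return !orderDate.isBefore(cutoff);
        }));

The behaviour matches the original DoFn (keep rows on or after the cutoff date); the parsing and the date condition are just easier to test and reuse separately.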
If you want to automatically infer the header lines, you could read the first line in your main program (using standard Java libraries, or Beam's FileSystems class), extract it manually, and pass it into your parsing DoFn.
I agree a more columnar approach would be more natural. We have this in Python as our Dataframes API which is now available for general use. You would write something like
with beam.Pipeline() as p:
    df = p | beam.dataframe.io.read_csv("src/main/resources/ST/STCheck/OrderRequest.csv")
    filtered = df[df.order_date > limit]
    filtered.write_csv("src/main/resources/ST/STCheck/OrderRequestOut.csv")

Camel Exchange body record size

Scenario:
A CSV file is sent to my endpoint, a POJO transforms the data for Java, and the message is sent to one of my routes, let's say the ("direct:consume") route; then a processor processes the file, manipulating the message and creating a new output.
Issue:
If the file contains only one line, the code breaks.
If the file contains multiple lines, the code works.
Tried:
Tried to find a way to determine the number of records coming in exchange.getIn().getBody()
Read up on Stack Overflow
Read the Camel documentation about Exchange
Checked Java code for Object/Objects-to-List conversion without knowing the record count
Code:
public void process(Exchange exchange) throws Exception {
    List<Cars> output = new ArrayList<Cars>();
    List<Wehicle> rows = (List<Wehicle>) exchange.getIn().getBody(); // <-- fails here
    for (Wehicle row : rows) {
        output.add(new Cars(row));
    }
    exchange.getIn().setBody(output);
    exchange.getIn().setHeader("CamelOverruleFileName", "CarEntries.csv");
}
Wehicle
...
@CsvRecord(separator = ",", skipFirstLine = true, crlf = "UNIX")
public class Wehicle {
    @DataField(pos = 1)
    public String CouponCode;
    @DataField(pos = 2)
    public String Price;
}
...
Cars
@CsvRecord(separator = ",", crlf = "UNIX", generateHeaderColumns = true)
public class Cars {
    @DataField(pos = 1, columnName = "CouponCode")
    private String CouponCode;
    @DataField(pos = 2, columnName = "Price")
    private String Price;

    public Cars(Wehicle origin) {
        this.CouponCode = Utilities.addQuotesToString(origin.CouponCode);
        this.Price = origin.Price;
    }
}
Input:
"CouponCode","Price"
"ASD/785", 1900000
"BWM/758", 2000000
Question:
How do I dynamically create a List regardless of whether I get one object or multiple objects?
-- exchange.getIn().getBody() returns an Object
How do I check the number of records in the Camel exchange message?
-- exchange.getIn().getBody() has no size/length method
Is there any other way of doing this?
I haven't used Java for a long time, plus I'm quite new to Camel.
After rechecking the official documentation, it seems the following changes solve the issue.
Code:
public void process(Exchange exchange) throws Exception {
    List<Cars> output = new ArrayList<Cars>();
    List records = exchange.getIn().getBody(List.class);
    List<Wehicle> rows = (List<Wehicle>) records;
    for (Wehicle row : rows) {
        output.add(new Cars(row));
    }
    exchange.getIn().setBody(output);
    exchange.getIn().setHeader("CamelOverruleFileName", "CarEntries.csv");
}
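As a side note (not part of the original answer, just a defensive sketch): if the body can still arrive as a single unmarshalled Wehicle rather than a List when the file has only one record, it can be normalised by hand before the loop:

Object body = exchange.getIn().getBody();
List<Wehicle> rows;
if (body instanceof List) {
    @SuppressWarnings("unchecked")
    List<Wehicle> asList = (List<Wehicle>) body;
    rows = asList;
} else {
    // Single record: wrap it so the rest of the processor stays unchanged
    rows = java.util.Collections.singletonList((Wehicle) body);
}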

How to properly handle comma inside a quoted string using opencsv?

I'm trying to read a CSV file that contains strings, both quoted and not.
If a string is quoted, its quote chars should be preserved.
Besides that, if a string contains a comma, it should not be split.
I've tried multiple ways, but nothing has worked so far.
Current test data:
"field1 (with use of , we lose the other part)",some description
field2,"Dear %s, some text"
When getting the 1st field of the mapped bean:
Expected result:
"field1 (with use of , we lose the other part)"
field2
Current result:
"field1 (with use of
field2
Here is the code:
public class CsvToBeanReaderTest {

    @Test
    void shouldIncludeDoubleQuotes() {
        String testData =
                "\"field1 (with use of , we lose the other part)\",some description\n"
                        +
                "field2,\"Dear %s, some text\"";

        RFC4180ParserBuilder rfc4180ParserBuilder = new RFC4180ParserBuilder();
        rfc4180ParserBuilder.withQuoteChar(ICSVWriter.NO_QUOTE_CHARACTER);
        ICSVParser rfc4180Parser = rfc4180ParserBuilder.build();

        CSVReaderBuilder builder = new CSVReaderBuilder(new StringReader(testData));
        CSVReader reader = builder
                .withCSVParser(rfc4180Parser)
                .build();

        List<TestClass> result = new CsvToBeanBuilder<TestClass>(reader)
                .withType(TestClass.class)
                .withEscapeChar('\"')
                .build()
                .parse();

        result.forEach(testClass -> System.out.println(testClass.getField1()));
    }

    private List<TestClass> readTestData(String testData) {
        return new CsvToBeanBuilder<TestClass>(new StringReader(testData))
                .withType(TestClass.class)
                .withSeparator(',')
                .withSkipLines(0)
                .withIgnoreEmptyLine(true)
                .build()
                .parse();
    }

    public static final class TestClass {
        @CsvBindByPosition(position = 0)
        private String field1;

        @CsvBindByPosition(position = 1)
        private String description;

        public String toCsvFormat() {
            return String.join(",",
                    field1,
                    description);
        }

        public String getField1() {
            return field1;
        }
    }
}
I've found out that if I comment out or remove rfc4180ParserBuilder.withQuoteChar(ICSVWriter.NO_QUOTE_CHARACTER); the string will be parsed correctly, but I will lose the quote char, which should not be lost. Are there any suggestions on what can be done? (I would prefer not to switch to other CSV libraries.)

Spring ElasticSearch: Issue when combining multiple Queries

I've searched high and low for an answer and can't seem to find anything, so apologies if this has been asked before.
I have a simple Spring Boot API which allows students to add their college projects. These projects are stored in ES, which I'm trying to make searchable.
FWIW, from a use-case POV, a user can search for a project with a query string that matches either the project name or a hashtag for the project, e.g. "Java".
Right now with the code below, if I search for projects which have a given hashtag I can retrieve them no problem. I get exactly the results I'm looking for.
However, once I try to add a second query, where I want to limit the results to only students whose userId is contained in an ArrayList, my search fails. By "fails" I mean it returns everything from the index.
Can anyone spot what I'm doing wrong?
Many thanks
private Iterable<UserProjects> searchCollegeProjects(Integer page, Integer size, List<String> hashtags, List<String> studentUserIds) {
    PageRequest pageRequest = PageRequest.of(page, size);

    QueryBuilder projectQueryBuilder = null;
    if (hashtags != null) {
        projectQueryBuilder =
                QueryBuilders
                        .multiMatchQuery(hashtags.toString(), "projectName", "projectHashtag")
                        .fuzziness(Fuzziness.AUTO);
    }

    BoolQueryBuilder userIdQueryBuilder = null;
    if (studentUserIds != null) {
        userIdQueryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.termsQuery("userId.keyword", studentUserIds));
    }

    Query searchQuery = new NativeSearchQueryBuilder()
            .withQuery(projectQueryBuilder)
            .withQuery(userIdQueryBuilder)
            .withFilter(boolQuery()
                    .mustNot(QueryBuilders.termsQuery("isDraft", true)))
            .withPageable(PageRequest.of(page, size))
            .build();

    SearchHits<UserProjects> projectHits;
    try {
        projectHits =
                elasticsearchOperations
                        .search(searchQuery, UserProjects.class,
                                IndexCoordinates.of(elasticProjectsIndex));
    } catch (Exception ex) {
        logger.error("Error unable to perform search query" + ex);
        throw new GeneralSearchException(ex.toString());
    }

    List<UserProjects> projectMatches = new ArrayList<>();
    projectHits.forEach(srchHit -> {
        projectMatches.add(srchHit.getContent());
    });

    long totalCount = projectHits.getTotalHits();
    Page<UserProjects> resultPage = PageableExecutionUtils.getPage(
            projectMatches,
            pageRequest,
            () -> totalCount);
    return resultPage;
}
Also below is my POJO for UserProjects:
@Getter
@Setter
@Document(indexName = "college.userprojects")
@JsonIgnoreProperties(ignoreUnknown = true)
public class UserProjects {
    @Id
    @Field(type = FieldType.Auto, name = "_id")
    private String projectId;

    @Field(type = FieldType.Text, name = "userId")
    private String userId;

    @Field(type = FieldType.Text, analyzer = "autocomplete_index", searchAnalyzer = "lowercase", name = "projectName")
    private String projectName;

    @Field(type = FieldType.Auto, name = "projectHashTag")
    private List<String> projectHashTag;

    @Field(type = FieldType.Boolean, name = "isDraft")
    private Boolean isDraft;
}
EDIT: Just to be clear, if I remove this line from the native query:
.withQuery(userIdQueryBuilder)
I get back projects (correctly) which contain the query text in either their projectName or projectHashtag fields. Putting the above line back in causes the search to return everything.

What is the way to insert a huge JSON data into a SQLite DB in android

I have a large JSON request; that is, it has around 50k rows with 15 columns that I have to insert into an SQLite DB with the same structure. In other words, I have to copy the same data held in a Postgres DB into the SQLite DB within my app. Is there some efficient way to do it? Is there some API or something that could help with the work?
I should mention that I'm able to do this with ORMLite for JSON data that isn't large, but when I try it with bigger payloads my app crashes with an out-of-memory error.
Please, if you have some idea or some example that I could follow, I would appreciate it a lot! Thanks in advance!
You can also use Google's official Gson streaming library.
Gson Streaming: Gson Streaming
JsonReader plays a very important role in parsing JSON with the streaming library.
Because the JSON is so large, you can't load it completely into memory. Even using the standard JSONObject will result in an out-of-memory error on many devices.
What I've done in a similar situation was to use Jackson for parsing it. Jackson can do it from the stream, so memory usage is not a problem at all. The downside is the API, which is not as straightforward to use compared to the normal options.
Here is an example I found: Jackson Streaming
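To make the streaming idea concrete, here is a minimal sketch using Gson's JsonReader (the Row and RowDao types and the field names are invented for illustration; they stand in for your ORMLite entity and DAO). The point is that only one object is held in memory at a time:

import com.google.gson.stream.JsonReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StreamingImporter {

    // Streams a large JSON array and inserts row by row, so the whole
    // payload never sits in memory at once. Row/RowDao are placeholders.
    public void importRows(InputStream in, RowDao dao) throws Exception {
        try (JsonReader reader = new JsonReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            reader.beginArray();
            while (reader.hasNext()) {
                Row row = new Row();
                reader.beginObject();
                while (reader.hasNext()) {
                    String name = reader.nextName();
                    if ("id".equals(name)) {
                        row.id = reader.nextLong();
                    } else if ("name".equals(name)) {
                        row.name = reader.nextString();
                    } else {
                        reader.skipValue(); // ignore columns we don't map
                    }
                }
                reader.endObject();
                dao.insert(row); // ideally batched inside a single transaction
            }
            reader.endArray();
        }
    }
}

Wrapping all the inserts in one SQLite transaction (for example with ORMLite's TransactionManager.callInTransaction) usually makes the biggest difference to import speed.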
Gson streaming works for me (Gson - streaming), but I have to say it takes its time: around 4 minutes for 68K rows with 10 columns, for example. But anyway, it solves my issue. So I have this JSON response:
"preciosArtPK": {
    "codLista": 1,
    "codArticulo": 11348,
    "cansiVenta": 1,
    "fecVigencia": 1435781252000
},
"siglaVenta": "UN",
"precioVenta": 0,
"margenPct": 100,
"codUsuario": 1,
"vigente": "S",
"nomModulo": "MIGRACION"
The JSON above is part of an array response, but I have that "preciosArtPK" object to include in the Gson serialization. How could I do that? I have the class that handles my serialization:
@DatabaseTable(tableName = "preciosart")
public class PreciosArt {
    public static final String PRECIOS_COD_LISTA = "_id";
    public static final String PRECIOS_COD_ARTICULO = "cod_articulo";
    public static final String PRECIOS_CANSI_VENTA = "cansi_venta";
    public static final String PRECIOS_FEC_VIGENCIA = "fec_vigencia";
    public static final String PRECIOS_SIGLA_VENTA = "sigla_venta";
    public static final String PRECIOS_PRECIO_VENTA = "precio_venta";
    public static final String PRECIOS_MARGEN_PCT = "margen_pct";
    public static final String PRECIOS_COD_USUARIO = "cod_usuario";
    public static final String PRECIOS_VIGENTE = "vigente";
    public static final String PRECIOS_NOM_MODULO = "nom_modulo";

    @DatabaseField(id = true, unique = true, columnName = PRECIOS_COD_LISTA)
    private Integer codLista;

    @DatabaseField(unique = true, columnName = PRECIOS_COD_ARTICULO)
    @SerializedName("codArticulo") // probar
    private Integer codArticulo;

    @DatabaseField(unique = true, columnName = PRECIOS_CANSI_VENTA)
    private Integer cansiVenta;

    @DatabaseField(unique = true, columnName = PRECIOS_FEC_VIGENCIA)
    private Long fecVigencia;

    @DatabaseField(columnName = PRECIOS_SIGLA_VENTA)
    @SerializedName("siglaVenta")
    private String siglaVenta;

    @DatabaseField(columnName = PRECIOS_PRECIO_VENTA)
    @SerializedName("precioVenta")
    private Double precioVenta;

    @DatabaseField(columnName = PRECIOS_MARGEN_PCT)
    @SerializedName("margenPct")
    private Float margenPct;

    @DatabaseField(columnName = PRECIOS_COD_USUARIO)
    @SerializedName("codUsuario")
    private Integer codUsuario;

    @DatabaseField(columnName = PRECIOS_VIGENTE)
    @SerializedName("vigente")
    private String vigente;

    @DatabaseField(columnName = PRECIOS_NOM_MODULO)
    @SerializedName("nomModulo")
    private String nomModulo;
but this doesn't fill those fields (codArticulo, cansiVenta and fecVigencia). I read about deserializing these nested JSON structures by creating another class for them, so I did the same:
#SerializedName("codLista")
private Integer codLista;
#SerializedName("codArticulo")
private Integer codArticulo;
#SerializedName("cansiVenta")
private Integer cansiVenta;
#SerializedName("fecVigencia")
private Long fecVigencia;
The problem is: how could I fill those fields from the deserialized JSON? I use ORMLite to do the work; this method is for that purpose:
public long updatePreciosart(PreciosArt preciosArt) {
    long resultUpdate = -1;
    try {
        getHelper().getPreciosartDao().createOrUpdate(preciosArt);
        resultUpdate = 0;
    } catch (SQLException e) {
        e.printStackTrace();
        resultUpdate = -1;
    }
    return resultUpdate;
}
I hope you understand what the problem is, and I hope you can help me with it! Thanks again!
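For what it's worth (this is not from the thread, just a sketch of the nested-class approach mentioned in the question; the wrapper class name is invented), Gson can fill those fields if the nested "preciosArtPK" object is declared as its own type and referenced from PreciosArt with @SerializedName, with the values then copied into the flat ORMLite columns before calling updatePreciosart:

// Hypothetical wrapper matching the nested "preciosArtPK" object in the JSON.
public class PreciosArtPK {
    @SerializedName("codLista")
    private Integer codLista;
    @SerializedName("codArticulo")
    private Integer codArticulo;
    @SerializedName("cansiVenta")
    private Integer cansiVenta;
    @SerializedName("fecVigencia")
    private Long fecVigencia;
}

// Inside PreciosArt (left out of the ORMLite column mapping), Gson then
// populates the nested object directly:
@SerializedName("preciosArtPK")
private PreciosArtPK preciosArtPK;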
