Need to parse XML with a strange looking Schema - java

As I understood it before taking this job, XML uses a series of key-value pairs, and up until today that seemed fairly straightforward given how I was using it. Basically, I need to parse an XML document like this on Android:
<MailingAddress Caption="Mailing Address" PropertyType="CRM.Address" FieldType="4" DisplayType="0" ValueType="0" IsRequired="False">
<CRM.Address>
<Street Caption="Street" PropertyType="System.String" FieldType="1" DisplayType="1" ValueType="1" MaxDataLength="400" IsRequired="False" />
<City Caption="City" PropertyType="System.String" FieldType="1" DisplayType="1" ValueType="1" MaxDataLength="400" IsRequired="False" />
<State Caption="State" PropertyType="System.String" FieldType="1" DisplayType="1" ValueType="1" MaxDataLength="200" IsRequired="False" />
<PostalCode Caption="Postal Code" PropertyType="System.String" FieldType="1" DisplayType="1" ValueType="1" MaxDataLength="100" IsRequired="False" />
<Country Caption="Country" PropertyType="System.String" FieldType="1" DisplayType="1" ValueType="1" MaxDataLength="200" IsRequired="False" />
</CRM.Address>
</MailingAddress>
Does anyone know how I might go about parsing this or know of a parser that would be useful to me? Am I going to have to write my own parser?

This looks like well-formed XML. If you have a schema (XSD), generate POJOs from it with the XSD-to-Java command and deserialize the document into them. The values from the XML are then available as plain Java objects, and you can do whatever you want with them from Java.
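If no schema is available, the same data can also be read with the standard DOM API (`javax.xml.parsers`, which ships with both the JDK and Android). The following is only a minimal sketch under that assumption: the class name `AddressFieldReader` and the method `readFields` are made up for illustration, and the sample string is a shortened copy of the XML from the question. Note that here the "values" live in attributes rather than in element text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: collects the attributes of every field under <CRM.Address>.
class AddressFieldReader {

    /** Returns field name -> (attribute name -> attribute value). */
    static Map<String, Map<String, String>> readFields(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        Map<String, Map<String, String>> fields = new LinkedHashMap<>();
        NodeList containers = doc.getElementsByTagName("CRM.Address");
        if (containers.getLength() == 0) {
            return fields; // nothing to read
        }
        NodeList children = containers.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            Map<String, String> attributes = new LinkedHashMap<>();
            NamedNodeMap raw = child.getAttributes();
            for (int j = 0; j < raw.getLength(); j++) {
                Node attribute = raw.item(j);
                attributes.put(attribute.getNodeName(), attribute.getNodeValue());
            }
            fields.put(child.getNodeName(), attributes);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<MailingAddress Caption=\"Mailing Address\" PropertyType=\"CRM.Address\">"
                + "<CRM.Address>"
                + "<Street Caption=\"Street\" PropertyType=\"System.String\" MaxDataLength=\"400\" />"
                + "<City Caption=\"City\" PropertyType=\"System.String\" MaxDataLength=\"400\" />"
                + "</CRM.Address>"
                + "</MailingAddress>";
        readFields(xml).forEach((name, attributes) ->
                System.out.println(name + ": caption=" + attributes.get("Caption")
                        + ", maxLength=" + attributes.get("MaxDataLength")));
    }
}
```

Looking attributes up by name, as `main` does, avoids relying on the order in which the parser hands them back.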

Related

Parse an XML into Java - MetaData Format

I have seen some XML parsing in Java, but I really don't know how to apply it to my code.
Here is my xml file:
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:georss="http://www.georss.org/georss" xmlns:gml="http://www.opengis.net/gml" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xml:base="https://adomain.com">
<id>https://sharepoint.mydomain/aFile)</id>
<category term="SP.File" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
<title />
<updated>2015-05-18T07:13:18Z</updated>
<author>
Bla Bla<name />
</author>
<content type="application/xml">
<m:properties>
<d:CheckInComment />
<d:CheckOutType m:type="Edm.Int32">2</d:CheckOutType>
<d:ContentTag>{63FD2CFA-D223-405B-86B3-D59B34ECEBBE},3,1</d:ContentTag>
<d:CustomizedPageStatus m:type="Edm.Int32">0</d:CustomizedPageStatus>
<d:ETag>"{63FD2CFA-D223-405B-86B3-D59B34ECEBBE},3"</d:ETag>
<d:Exists m:type="Edm.Boolean">true</d:Exists>
<d:Length m:type="Edm.Int64">638367</d:Length>
<d:Level m:type="Edm.Byte">2</d:Level>
<d:MajorVersion m:type="Edm.Int32">0</d:MajorVersion>
<d:MinorVersion m:type="Edm.Int32">1</d:MinorVersion>
<d:Name>aName.pdf</d:Name>
<d:ServerRelativeUrl>/mydomain.com/afile</d:ServerRelativeUrl>
<d:TimeCreated m:type="Edm.DateTime">2014-09-03T15:30:22Z</d:TimeCreated>
<d:TimeLastModified m:type="Edm.DateTime">2014-09-03T15:30:25Z</d:TimeLastModified>
<d:Title />
<d:UIVersion m:type="Edm.Int32">1</d:UIVersion>
<d:UIVersionLabel>0.1</d:UIVersionLabel>
</m:properties>
</content>
</entry>
I am trying to get the metadata of a file from SharePoint which is displayed in xml format.
How can I get the data which is inside the content and also the title and the author like this:
Author BlaBla
Title Bla
Type application/xml
TimeLastModified xx/xx/xxxx
The easiest way to parse XML files is the DOM parser. You can find the documentation here and a few tutorials here and here.
Also, a related question in stackoverflow here.
You can use JAXB, which is widely used for this kind of parsing nowadays. Converting XML to Java objects is called unmarshalling: http://www.javatpoint.com/jaxb-tutorial
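To make the DOM suggestion concrete, here is a rough sketch of pulling the requested fields out of the Atom entry. It assumes the namespace URIs shown in the question's XML. The class name `AtomEntryReader` and its helper methods are invented for this example. The parser has to be namespace-aware, otherwise the Atom `title` and the `d:`-prefixed properties cannot be told apart reliably.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for the SharePoint Atom entry shown in the question.
class AtomEntryReader {
    // Namespace URIs copied from the xmlns declarations in the question's XML.
    static final String ATOM = "http://www.w3.org/2005/Atom";
    static final String DATA = "http://schemas.microsoft.com/ado/2007/08/dataservices";

    static Map<String, String> read(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        Element entry = doc.getDocumentElement();
        Map<String, String> result = new LinkedHashMap<>();
        result.put("Author", text(entry, ATOM, "author"));
        result.put("Title", text(entry, ATOM, "title"));
        Element content = first(entry, ATOM, "content");
        result.put("Type", content == null ? "" : content.getAttribute("type"));
        result.put("TimeLastModified", text(entry, DATA, "TimeLastModified"));
        return result;
    }

    // First descendant with the given namespace and local name, or null if absent.
    static Element first(Element scope, String namespace, String localName) {
        NodeList nodes = scope.getElementsByTagNameNS(namespace, localName);
        return nodes.getLength() == 0 ? null : (Element) nodes.item(0);
    }

    // Trimmed text of that descendant, or an empty string if it is absent.
    static String text(Element scope, String namespace, String localName) {
        Element element = first(scope, namespace, localName);
        return element == null ? "" : element.getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<entry xmlns=\"" + ATOM + "\" xmlns:d=\"" + DATA + "\">"
                + "<title />"
                + "<author>Bla Bla<name /></author>"
                + "<content type=\"application/xml\">"
                + "<d:TimeLastModified>2014-09-03T15:30:25Z</d:TimeLastModified>"
                + "</content>"
                + "</entry>";
        read(xml).forEach((key, value) -> System.out.println(key + " " + value));
    }
}
```

The timestamp comes back as the raw ISO 8601 string; reformatting it for display is a separate step.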

XML Parsing in java to get it in Key, Value Pair?

There are many XML parsing techniques that I am not yet familiar with. I want to parse XML form data and get the form output as key-value pairs. Which XML parsing technique makes it easy to get the values as key-value pairs for the following XML format?
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
<control for="9bd2f8fd2421eb0b0a410feaa1f482c50551486a" name="first-name" type="input" datatype="string">
<resources lang="en">
<label>First Name</label>
<help />
<hint>Your first or given name
</hint>
<alert />
</resources>
<resources lang="fr">
<label>Prénom</label>
<help />
<hint>
Votre prénom
</hint>
<alert />
</resources>
<value>Rahul</value>
</control>
<control for="8532f26e19a5b33200f56bb839c5f3aa2fa3a25f" name="last-name" type="input" datatype="string">
<resources lang="en">
<label>Last Name</label>
<help />
<hint>Your last name</hint>
<alert />
</resources>
<resources lang="fr">
<label>Nom de famille</label>
<help />
<hint>Votre nom de famille</hint>
<alert />
</resources>
<value>Sharma
</value>
</control>
</metadata>
Note: I need to capture only the English-language values. For the above XML I need the following output:
First Name - Rahul
Last Name - Sharma
This might push you in the right direction:
Which is the best library for XML parsing in java
To capture only the English values, you would have to employ natural language processing to recognize the language of the text you've captured with the XML parser. Luckily, there are libraries for identifying English sentences. Here is a post outlining Java libraries that identify the language of a text:
How to detect language of user entered text?
Then, after removing the text that is not English, you can go through the remaining entries and build the dictionary.
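For the XML in the question specifically, each `resources` element already carries a `lang` attribute, so filtering on `lang="en"` may be enough without any language detection. Below is a minimal DOM-based sketch under that assumption. The class `FormValueReader` and its methods are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: maps each control's English label to its value.
class FormValueReader {

    static Map<String, String> englishLabelsToValues(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        Map<String, String> result = new LinkedHashMap<>();
        NodeList controls = doc.getElementsByTagName("control");
        for (int i = 0; i < controls.getLength(); i++) {
            Element control = (Element) controls.item(i);
            String label = null;
            NodeList resources = control.getElementsByTagName("resources");
            for (int j = 0; j < resources.getLength(); j++) {
                Element resource = (Element) resources.item(j);
                if ("en".equals(resource.getAttribute("lang"))) {
                    label = firstText(resource, "label");
                    break; // only the English block is of interest
                }
            }
            if (label != null) {
                result.put(label, firstText(control, "value"));
            }
        }
        return result;
    }

    // Trimmed text of the first descendant with this tag name, or "" if there is none.
    static String firstText(Element scope, String tagName) {
        NodeList nodes = scope.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<metadata>"
                + "<control name=\"first-name\">"
                + "<resources lang=\"en\"><label>First Name</label></resources>"
                + "<resources lang=\"fr\"><label>Pr\u00e9nom</label></resources>"
                + "<value>Rahul</value>"
                + "</control>"
                + "</metadata>";
        englishLabelsToValues(xml).forEach((label, value) ->
                System.out.println(label + " - " + value));
    }
}
```

Trimming the text matters here because some values in the sample, such as `Sharma`, are followed by a line break inside the element.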

JAXB Parsing an XML file with variables (e.g. $(var1) )

I'd like to parse an XML file that contains variables (which I define myself inside the XML).
Here's an example of the XML files:
<parameters>
<parameter name="parent-id" value="1" />
<parameter name="child-id" value="1" />
</parameters>
<Parents>
<Parent id="$(parent-id)">
<Children>
<Child id="$(child-id)">
</Child>
</Children>
</Parent>
</Parents>
Is there a utility or some standard way to do so in Java? (using JAXB possibly)
Or should I implement this "mini" parsing mechanism by myself?
(A mechanism that identifies the variables and plants them inside the XML, and only later calls JAXB flows)
Thanks a lot in advance!
Use an XSLT transformation to convert your XML into an XSLT stylesheet and then execute the XSLT stylesheet. It's simple enough to convert
<parameters>
<parameter name="parent-id" value="1" />
<parameter name="child-id" value="1" />
</parameters>
into
<xsl:param name="parent-id" select="1" />
<xsl:param name="child-id" select="1" />
and
<Parent id="$(parent-id)">
into
<Parent id="{$parent-id}">
and to add a wrapper xsl:stylesheet and xsl:template element, and then you're done.
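If XSLT feels heavy for this, the "mini" pre-processing step the question describes can also be sketched in plain Java: substitute every `$(name)` in the XML text, then hand the result to JAXB. This is only an illustration, not a standard utility. `PlaceholderResolver` and `resolve` are made-up names, and the parameter map is assumed to have been filled from the `<parameter>` elements beforehand.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processor: replaces $(name) placeholders before unmarshalling.
class PlaceholderResolver {
    // Matches $(name), capturing everything up to the closing parenthesis.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\(([^)]+)\\)");

    static String resolve(String template, Map<String, String> parameters) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = parameters.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Undefined variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put("parent-id", "1");
        parameters.put("child-id", "1");
        String template = "<Parent id=\"$(parent-id)\"><Child id=\"$(child-id)\" /></Parent>";
        System.out.println(resolve(template, parameters));
    }
}
```

One caveat with this text-level approach: values are inserted verbatim, so a value containing `<`, `&` or `"` would have to be XML-escaped first.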

ZK - Adding component to Treeitem depending on model content

So, I want to add a specific component (either Checkbox or Textbox) depending on the AttType field of my current node. My zul file looks like this:
<tree id="permissionTree" width="100%"
model="@bind(vm.treeModel)" style="text-align:left;">
<treecols>
<treecol label="Item" width="400px" />
<treecol label="Wert" />
</treecols>
<template name="model" var="node">
<treeitem>
<treerow>
<treecell label="@load(node.data.name)" />
<treecell> HERE COMPONENT DEPENDING ON node.data.AttType </treecell>
</treerow>
</treeitem>
</template>
</tree>
How can I accomplish this? Oh and I want the Textbox/Checkbox value to be bound to my model as a String, that would be pretty nice.
Thanks for any suggestions.
Edit: I made a little "workaround" for myself. Since I have only 3 possible input types, I just defined them hard-coded:
<tree id="permissionTree" width="100%"
model="@bind(vm.treeModel)" style="text-align:left;">
<treecols>
<treecol label="Item" />
<treecol label="Wert" />
</treecols>
<template name="model" var="node">
<treeitem open="@bind(node.open)" onClick="@command('expandNode', item=node)">
<treerow>
<treecell label="@load(node.data.name)" />
<treecell>
<textbox visible="@load(node.data.isTextbox)" value="@bind(node.data.value)" />
<textbox visible="@load(node.data.isTextarea)" rows="6" width="300px" value="@bind(node.data.value)" />
<checkbox visible="@load(node.data.isCheckbox)" checked="@bind(node.data.checkboxValue)" />
</treecell>
</treerow>
</treeitem>
</template>
</tree>
And in the constructor of a TreeNode I set the isTextbox/isTextarea/isCheckbox values according to the type. This way the model binding still works :)
As you asked in your previous question: use treeitem renderer and add items you need there. I don't think binding will work for new components.

Can I format data that is to be written in CSV file using java

I have some code to write data into a CSV file, but it writes data into a CSV
without formatting it properly. I want to bold some specific text. Is that possible?
CSV is just a plain text format, so you can't do any formatting.
If you want formatting, consider using an Excel library such as Apache POI or Jasper Reports. (Of course, you then end up with an Excel file rather than a CSV, so depending on your situation that may or may not be appropriate.)
As a side note, there are some strange nuances to writing CSV (such as making sure quotes, commas, etc. are properly escaped). There's a nice lightweight library I've used called OpenCSV that might make your life easier if you choose to stick with plain old CSV.
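To illustrate the escaping nuance, here is a small hand-rolled sketch of the usual CSV quoting convention (the one described in RFC 4180): a field containing a comma, a double quote or a line break is wrapped in double quotes, and embedded double quotes are doubled. This is not the API of any library. `CsvLine`, `escape` and `join` are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing RFC 4180 style quoting for a single CSV record.
class CsvLine {

    // Quotes the field only when it contains a character that would break the record.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are escaped by doubling them.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins the escaped fields with commas into one record (no trailing line break).
    static String join(List<String> fields) {
        StringBuilder record = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                record.append(',');
            }
            record.append(escape(fields.get(i)));
        }
        return record.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("id", "name, title", "said \"hello\"")));
    }
}
```

A library handles the reading side and further edge cases as well, so this is mainly useful for seeing what "properly escaped" means.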
CSV is a plain text file format, so you cannot apply any text effects.
Not that I am aware of, CSV is a plain text format.
If you are creating a CSV so that you can open it in Excel, then I would suggest taking a look at the MS Excel XML format.
http://en.wikipedia.org/wiki/Microsoft_Office_XML_formats#Excel_XML_Spreadsheet_example
An example would be as follows (taken from the wikipedia link, and this makes some text BOLD)
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook
xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Darl McBride</Author>
<LastAuthor>Bill Gates</LastAuthor>
<Created>2007-03-15T23:04:04Z</Created>
<Company>SCO Group, Inc.</Company>
<Version>11.8036</Version>
</DocumentProperties>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
<WindowHeight>6795</WindowHeight>
<WindowWidth>8460</WindowWidth>
<WindowTopX>120</WindowTopX>
<WindowTopY>15</WindowTopY>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom" />
<Borders />
<Font />
<Interior />
<NumberFormat />
<Protection />
</Style>
<Style ss:ID="s21">
<Font x:Family="Swiss" ss:Bold="1" />
</Style>
</Styles>
<Worksheet ss:Name="Sheet1">
<Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="5"
x:FullColumns="1" x:FullRows="1">
<Row>
<Cell>
<Data ss:Type="String">Text in cell A1</Data>
</Cell>
</Row>
<Row>
<Cell ss:StyleID="s21">
<Data ss:Type="String">Bold text in A2</Data>
</Cell>
</Row>
<Row ss:Index="4">
<Cell ss:Index="2">
<Data ss:Type="Number">43</Data>
</Cell>
</Row>
<Row>
<Cell ss:Index="2" ss:Formula="=R[-1]C/2">
<Data ss:Type="Number">21.5</Data>
</Cell>
</Row>
</Table>
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<Print>
<ValidPrinterInfo />
<HorizontalResolution>600</HorizontalResolution>
<VerticalResolution>600</VerticalResolution>
</Print>
<Selected />
<Panes>
<Pane>
<Number>3</Number>
<ActiveRow>5</ActiveRow>
<ActiveCol>1</ActiveCol>
</Pane>
</Panes>
<ProtectObjects>False</ProtectObjects>
<ProtectScenarios>False</ProtectScenarios>
</WorksheetOptions>
</Worksheet>
</Workbook>
In plain old regular CSV, no. But there is no reason why you could not encode your data before writing it out. For example, the HTML tag for bold is <b></b>. This would signal to you exactly which portions of the text are bold, and since it comes from a well-known standard it is still human-readable too. The main drawback is that you have to parse your data after you read it :(
Something else to consider, since you are writing the data out why not write it out as comma separated values in RTF or some other format that does support bold etc? Normally CSV is plain text but there is no reason why you couldn't write it out another way. Just remember to read it back in the same format...
