In my case, I am able to read the XML file and parse it to get its content, but the metadata only provides the file type, which is "application/xml".
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.xml.XMLParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;
public class XmlParserExample {
public static void main(String[] args) throws IOException, SAXException, TikaException {
BodyContentHandler handler = new BodyContentHandler();
XMLParser parser = new XMLParser();
Metadata metadata = new Metadata();
ParseContext pcontext = new ParseContext();
FileInputStream inputstream = new FileInputStream(new File("example.xml"));
parser.parse(inputstream, handler, metadata, pcontext);
System.out.println("Contents of the document:" + handler.toString());
System.out.println("Metadata of the document:");
String[] metadataNames = metadata.names();
for(String name : metadataNames) {
System.out.println(name + ": " + metadata.get(name));
}
}
}
The snippet above prints the whole XML content and the Content-Type (as metadata). But I also want to fetch the XML tags, so that I can create a HashMap, which is a requirement in my case.
Below is my dummy example.xml:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE PubmedArticleSet SYSTEM "http://dtd.nlm.nih.gov/ncbi/pubmed/out/pubmed_190101.dtd">
<PubmedArticleSet>
<PubmedArticle>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">27483086</PMID>
<DateCompleted>
<Year>2018</Year>
<Month>05</Month>
<Day>02</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>05</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1532-849X</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>26</Volume>
<Issue>4</Issue>
<PubDate>
<Year>2017</Year>
<Month>Jun</Month>
</PubDate>
</JournalIssue>
<Title>Journal of prosthodontics : official journal of the American College of Prosthodontists</Title>
<ISOAbbreviation>J Prosthodont</ISOAbbreviation>
</Journal>
<ArticleTitle>The Use of CADCAM Technology for Fabricating Cast Gold Survey Crowns under Existing Partial Removable Dental Prosthesis. A Clinical Report.</ArticleTitle>
<Pagination>
<MedlinePgn>321-326</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1111jopr.12525</ELocationID>
<Abstract>
<AbstractText>The fabrication of a survey crown under an existing partial removable dental prosthesis (PRDP) has always been a challenge to many dental practitioners. This clinical report presents a technique for fabricating accurate cast gold survey crowns to fit existing PRDPs using CAD/CAM technology. The report describes a technique that would digitally scan the coronal anatomy of a cast gold survey crown and an abutment tooth under existing PRDPs planned for restoration, prior to any preparation. The information is stored in the digital software where all the coronal anatomical details are preserved without any modifications. The scanned designs are then applied to the scanned teeth preparations, sent to the milling machine and milled into full-contour clear acrylic resin burn-out patterns. The acrylic resin patterns are tried in the patient's mouth the same day to verify the full seating of the PRDP components. The patterns are then invested and cast into gold crowns and cemented in the conventional manner.</AbstractText>
<CopyrightInformation>© 2016 by the American College of Prosthodontists.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>El Kerdani</LastName>
<ForeName>Tarek</ForeName>
<Initials>T</Initials>
<AffiliationInfo>
<Affiliation>Department of Restorative Dental Sciences, Division of Prosthodontics, University of Florida College of Dentistry, Gainesville, FL.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Roushdy</LastName>
<ForeName>Sally</ForeName>
<Initials>S</Initials>
<AffiliationInfo>
<Affiliation>Department of Restorative Dental Sciences, Division of Prosthodontics, University of Florida College of Dentistry, Gainesville, FL.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D002363">Case Reports</PublicationType>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2016</Year>
<Month>08</Month>
<Day>02</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>J Prosthodont</MedlineTA>
<NlmUniqueID>9301275</NlmUniqueID>
<ISSNLinking>1059-941X</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList>
<Chemical>
<RegistryNumber>7440-57-5</RegistryNumber>
<NameOfSubstance UI="D006046">Gold</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>D</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000368" MajorTopicYN="N">Aged</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017076" MajorTopicYN="Y">Computer-Aided Design</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D003442" MajorTopicYN="Y">Crowns</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D000044" MajorTopicYN="N">Dental Abutments</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017267" MajorTopicYN="Y">Dental Prosthesis Design</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D003832" MajorTopicYN="Y">Denture, Partial, Removable</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006046" MajorTopicYN="N">Gold</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D008297" MajorTopicYN="N">Male</DescriptorName>
</MeshHeading>
</MeshHeadingList>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">CADM</Keyword>
<Keyword MajorTopicYN="N">cast gold</Keyword>
<Keyword MajorTopicYN="N">milled acrylic resin patterns</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="accepted">
<Year>2016</Year>
<Month>06</Month>
<Day>13</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2016</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2018</Year>
<Month>5</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2016</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">27483086</ArticleId>
<ArticleId IdType="doi">10.111pr.12525</ArticleId>
</ArticleIdList>
</PubmedData>
</PubmedArticle>
<PubmedArticle>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">27483087</PMID>
<DateCompleted>
<Year>2018</Year>
<Month>08</Month>
<Day>07</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>08</Month>
<Day>07</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">2326-5205</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>68</Volume>
<Issue>11</Issue>
<PubDate>
<Year>2016</Year>
<Month>11</Month>
</PubDate>
</JournalIssue>
<Title>Arthritis &amp; rheumatology (Hoboken, N.J.)</Title>
</Journal>
<ArticleTitle>Reply.</ArticleTitle>
<Pagination>
<MedlinePgn>2826-2827</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10t.39831</ELocationID>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Hitchon</LastName>
<ForeName>Carol Ann</ForeName>
<Initials>CA</Initials>
<AffiliationInfo>
<Affiliation>University of Manitoba, Winnipeg, Manitoba, Canada.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Koppejan</LastName>
<ForeName>Hester</ForeName>
<Initials>H</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Trouw</LastName>
<ForeName>Leendert A</ForeName>
<Initials>LA</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Huizinga</LastName>
<ForeName>Tom J W</ForeName>
<Initials>TJ</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Toes</LastName>
<ForeName>René E M</ForeName>
<Initials>RE</Initials>
<AffiliationInfo>
<Affiliation>Leiden University Medical Center, Leiden, The Netherlands.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>El-Gabalawy</LastName>
<ForeName>Hani S</ForeName>
<Initials>HS</Initials>
<AffiliationInfo>
<Affiliation>University of Manitoba, Winnipeg, Manitoba, Canada.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<GrantList CompleteYN="Y">
<Grant>
<GrantID>MOP‐77700</GrantID>
<Agency>CIHR</Agency>
<Country>Canada</Country>
</Grant>
</GrantList>
<PublicationTypeList>
<PublicationType UI="D016422">Letter</PublicationType>
<PublicationType UI="D013485">Research Sup</PublicationType>
<PublicationType UI="D016420">Comment</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2016</Year>
<Month>10</Month>
<Day>09</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>Arthritis Rheumatol</MedlineTA>
<NlmUniqueID>101623795</NlmUniqueID>
<ISSNLinking>2326-5191</ISSNLinking>
</MedlineJournalInfo>
<CommentsCorrectionsList>
<CommentsCorrections RefType="CommentOn">
<RefSource>dff</RefSource>
<PMID Version="1">27483211</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="CommentOn">
<RefSource>Arthritis Rheumato</RefSource>
<PMID Version="1">26946484</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2016</Year>
<Month>07</Month>
<Day>26</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2016</Year>
<Month>07</Month>
<Day>28</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2016</Year>
<Month>10</Month>
<Day>28</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2016</Year>
<Month>10</Month>
<Day>28</Day>
<Hour>6</Hour>
<Minute>1</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2016</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">27483087</ArticleId>
<ArticleId IdType="doi">efre</ArticleId>
</ArticleIdList>
</PubmedData>
</PubmedArticle>
</PubmedArticleSet>
Kindly help me out on this.
Thanks
My suggestion: If you want to read an XML file, and then parse its contents, you are probably better off using a purpose-built XML parser, rather than Tika.
There are various options - each with its own pros and cons (for example speed, memory consumption).
Here is one approach - it reads the entire file into memory, but you already do that with your Tika approach, so I assume file size is not a problem.
The code assumes there is a file called pubmed.xml which contains the XML presented in the question.
It reads the XML from file, and handles each element as a DOM node.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;
...
public void parseUsingDom() {
try {
File xmlFile = new File("C:/tmp/pubmed.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
NodeList articles = doc.getElementsByTagName("Article");
for (int i = 0; i < articles.getLength(); i++) {
Node article = articles.item(i);
if (article.getNodeType() == Node.ELEMENT_NODE) {
Element articleElement = (Element) article;
String title = articleElement
.getElementsByTagName("ArticleTitle")
.item(0).getTextContent();
System.out.println("");
System.out.println("Title : " + title);
NodeList authors = articleElement.getElementsByTagName("Author");
for (int j = 0; j < authors.getLength(); j++) {
Node author = authors.item(j);
if (author.getNodeType() == Node.ELEMENT_NODE) {
Element authorElement = (Element) author;
String foreName = authorElement
.getElementsByTagName("ForeName")
.item(0).getTextContent();
String lastName = authorElement
.getElementsByTagName("LastName")
.item(0).getTextContent();
System.out.println("Author : " + lastName + ", " + foreName);
}
}
}
}
} catch (Exception e) {
System.err.print(e);
}
}
The program prints the following output, just as a demo of what is possible:
Title : The Use of CADCAM Technology for Fabricating Cast Gold Survey Crowns under Existing Partial Removable Dental Prosthesis. A Clinical Report.
Author : El Kerdani, Tarek
Author : Roushdy, Sally
Title : Reply.
Author : Hitchon, Carol Ann
Author : Koppejan, Hester
Author : Trouw, Leendert A
Author : Huizinga, Tom J W
Author : Toes, René E M
Author : El-Gabalawy, Hani S
In your case, you would capture the relevant values in a hash map, of course.
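As a rough sketch of that last step, the same DOM traversal can fill a map instead of printing. The class name ArticleMapSketch, the method authorsByTitle, and the choice of article title as the key are assumptions made for illustration; the inline XML is a cut-down stand-in for pubmed.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ArticleMapSketch {

    // Maps each ArticleTitle to the "LastName, ForeName" strings found under that Article.
    // Keying by title is an assumption; a PMID may be a better key for real data.
    static Map<String, List<String>> authorsByTitle(Document doc) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList articles = doc.getElementsByTagName("Article");
        for (int i = 0; i < articles.getLength(); i++) {
            Element article = (Element) articles.item(i);
            String title = article.getElementsByTagName("ArticleTitle").item(0).getTextContent();
            List<String> names = new ArrayList<>();
            NodeList authors = article.getElementsByTagName("Author");
            for (int j = 0; j < authors.getLength(); j++) {
                Element author = (Element) authors.item(j);
                String last = author.getElementsByTagName("LastName").item(0).getTextContent();
                String fore = author.getElementsByTagName("ForeName").item(0).getTextContent();
                names.add(last + ", " + fore);
            }
            result.put(title, names);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Cut-down stand-in for the XML in the question.
        String xml = "<PubmedArticleSet><PubmedArticle><MedlineCitation><Article>"
                + "<ArticleTitle>Reply.</ArticleTitle><AuthorList>"
                + "<Author><LastName>Hitchon</LastName><ForeName>Carol Ann</ForeName></Author>"
                + "<Author><LastName>Koppejan</LastName><ForeName>Hester</ForeName></Author>"
                + "</AuthorList></Article></MedlineCitation></PubmedArticle></PubmedArticleSet>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(authorsByTitle(doc));
    }
}
```

A LinkedHashMap is used here only so that entries keep document order; a plain HashMap would work the same way if order does not matter.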
I'm trying to get data from an XML file and use this data in Processing. When doing so I get an NPE, and I can't quite figure out where I went wrong. The XML has several layers, and I have to get data from this "child":
http://i62.tinypic.com/2mb90g.png
My code looks like this:
XML xml;
void setup(){
xml = loadXML("parker.xml");
XML[] children = xml.getChildren("kml");
XML[] Folder=children[0].getChildren("Folder");
XML[] Placemark=Folder[1].getChildren("Placemark");
XML[] Polygon=Placemark[2].getChildren("Polygon");
XML[] outerBoundaryIs=Polygon[3].getChildren("outerBoundaryIs");
XML[] LinearRing=outerBoundaryIs[4].getChildren("LinearRing");
for (int i = 0; i < LinearRing.length; i++) {
float coordinates = children[i].getFloat("coordinates");
println(coordinates);
}
}
Best Chris
Stack trace:
[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
at processing.data.XML.<init>(XML.java:187)
at processing.core.PApplet.loadXML(PApplet.java:6310)
at processing.core.PApplet.loadXML(PApplet.java:6300)
at XMLtryout.setup(XMLtryout.java:21)
at processing.core.PApplet.handleDraw(PApplet.java:2359)
at processing.core.PGraphicsJava2D.requestDraw(PGraphicsJava2D.java:240)
at processing.core.PApplet.run(PApplet.java:2254)
at java.lang.Thread.run(Thread.java:744)
XML file:
https://www.dropbox.com/s/xn3thjskhlf2wai/parker.xml
This error may be caused by a missing XML declaration at the top of your XML file:
<?xml version="1.0" encoding="utf-8"?>
or there's some non-printable garbage at the start of your file.
The 'Content is not allowed in prolog' error indicates that you have some content between the XML declaration and the appearance of the document element, for example
<?xml version="1.0" encoding="utf-8"?>
content here is not allowed
<kml xmlns="http://earth.google.com/kml/2.1">
...
</kml>
The XML file you linked is OK, though, so it seems that either
you're reading the XML bytes incorrectly before passing them to the XML parser,
or (more likely) you're not reading the XML at all, which can happen when you read from a web URL and get an error response. I assume you get an HTTP 40x error that you don't notice, and then read the response (usually HTML) as XML, which causes the error. Remember that applets can usually only read resources from the server they were loaded from, which might be what causes the error.
To verify this, attempt to read the URL content and output it as text, and check if it looks ok.
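One way to run that check is sketched below, assuming the data is available as a local file; for a URL, the bytes would be read from the response instead. The class name PrologCheck and the method describeStart are made up for illustration. Besides printing the start of the data as text, it looks for a UTF-8 byte order mark, which may also trigger this error when the XML is passed to the parser through a Reader or a String.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PrologCheck {

    // Describes what the data starts with, to help explain "Content is not allowed in prolog".
    static String describeStart(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return "UTF-8 byte order mark";
        }
        if (data.length > 0 && data[0] == '<') {
            return "starts with '<'";
        }
        return "unexpected leading content";
    }

    public static void main(String[] args) throws IOException {
        // "parker.xml" is the file name from the question; another path can be passed as the first argument.
        byte[] data = Files.readAllBytes(Paths.get(args.length > 0 ? args[0] : "parker.xml"));
        System.out.println(describeStart(data));
        // Print the first 200 bytes as text, so that an HTML error page is easy to spot.
        int n = Math.min(200, data.length);
        System.out.println(new String(data, 0, n, StandardCharsets.UTF_8));
    }
}
```

If the printed text starts with something like an HTML tag or an error message, the parser never saw the XML in the first place.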
A simpler approach is to use the standard DOM parser. Try it like this:
public static void setUp(){
try {
File file = new File("parker.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
doc.getDocumentElement().normalize();
System.out.println("Root element " + doc.getDocumentElement().getNodeName());
NodeList nodeLst = doc.getElementsByTagName("LinearRing");
System.out.println("Information");
for (int s = 0; s < nodeLst.getLength(); s++) {
Node fstNode = nodeLst.item(s);
if (fstNode.getNodeType() == Node.ELEMENT_NODE) {
Element fstElmnt = (Element) fstNode;
NodeList fstNmElmntLst = fstElmnt.getElementsByTagName("coordinates");
Element fstNmElmnt = (Element) fstNmElmntLst.item(0);
// getTextContent() returns the text inside the <coordinates> element
System.out.println("coordinates : " + fstNmElmnt.getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
That should display:
Root element kml
Information
coordinates :
10.088512,56.154109,0
10.088512,56.154109,0
10.088512,56.154109,0
10.088511,56.15411,0
10.088511,56.15411,0
10.08851,56.15411,0
10.08851,56.154111,0
10.088509,56.154111,0
10.088508,56.154111,0
10.088508,56.154111,0
10.088507,56.154111,0
10.088506,56.154111,0
10.088506,56.154111,0
10.088505,56.154111,0
10.088504,56.154111,0
10.088504,56.154111,0
10.088503,56.15411,0
10.088503,56.15411,0
10.088502,56.15411,0
10.088502,56.154109,0
10.088502,56.154109,0
10.088501,56.154109,0
10.088501,56.154108,0
10.088501,56.154108,0
10.088501,56.154108,0
10.088501,56.154107,0
10.088285,56.154094,0
10.088104,56.154372,0
10.088061,56.154401,0
10.087988,56.15445,0
10.087915,56.154611,0
10.08791,56.154613,0
10.087912,56.154613,0
10.087877,56.1548,0
10.08772,56.15482,0
10.087558,56.154911,0
10.087421,56.155111,0
10.087316,56.155308,0
10.087328,56.15538,0
10.087372,56.155413,0
10.087446,56.155453,0
10.087484,56.155487,0
10.08747,56.155601,0
10.08772,56.155616,0
10.087719,56.155618,0
10.088618,56.155671,0
10.089096,56.1557,0
10.089096,56.155699,0
10.089138,56.155701,0
10.089127,56.155706,0
10.089004,56.155787,0
10.08888,56.155853,0
10.088799,56.155806,0
10.088571,56.155914,0
10.088455,56.155946,0
10.088112,56.156081,0
10.088184,56.156138,0
10.087733,56.156353,0
10.087489,56.156421,0
10.087288,56.156341,0
10.087268,56.156333,0
10.086893,56.156182,0
10.08684,56.156271,0
10.087049,56.156373,0
10.086893,56.156455,0
10.086664,56.156575,0
10.086443,56.156698,0
10.086425,56.156708,0
10.085983,56.156955,0
10.085655,56.157139,0
10.085462,56.157276,0
10.085272,56.157233,0
10.085176,56.157328,0
10.084917,56.157393,0
10.084883,56.157458,0
10.08495,56.157513,0
10.084947,56.157524,0
10.084943,56.157539,0
10.084855,56.15787,0
10.084855,56.15787,0
10.084321,56.157317,0
10.085553,56.156195,0
10.085555,56.156194,0
10.085553,56.156194,0
10.085734,56.156035,0
10.085821,56.155977,0
10.085937,56.155932,0
10.085993,56.155942,0
10.086031,56.155959,0
10.086171,56.15592,0
10.086227,56.155901,0
10.086392,56.155841,0
10.086513,56.155786,0
10.08664,56.155699,0
10.086686,56.155657,0
10.086727,56.155605,0
10.086777,56.155486,0
10.086861,56.155289,0
10.086916,56.155134,0
10.087006,56.154899,0
10.087075,56.154706,0
10.087094,56.154649,0
10.0871,56.154574,0
10.087112,56.154464,0
10.087111,56.154362,0
10.087112,56.154279,0
10.087112,56.154279,0
10.087113,56.15427,0
10.087114,56.154198,0
10.087108,56.15413,0
10.087091,56.154054,0
10.086992,56.153698,0
10.087,56.153678,0
10.087031,56.153647,0
10.087036,56.153648,0
10.087046,56.153652,0
10.087035,56.153647,0
10.087039,56.153645,0
10.087072,56.153612,0
10.087367,56.153308,0
10.087371,56.153303,0
10.08742,56.15323,0
10.087568,56.152963,0
10.087568,56.152962,0
10.087569,56.152962,0
10.08757,56.152961,0
10.087571,56.15296,0
10.087573,56.152959,0
10.087574,56.152959,0
10.087575,56.152958,0
10.087577,56.152958,0
10.087579,56.152958,0
10.087581,56.152957,0
10.087582,56.152957,0
10.087584,56.152957,0
10.087586,56.152958,0
10.087588,56.152958,0
10.087589,56.152958,0
10.087591,56.152959,0
10.087592,56.152959,0
10.087593,56.15296,0
10.087594,56.152961,0
10.087595,56.152962,0
10.087596,56.152963,0
10.087596,56.152964,0
10.087596,56.152965,0
10.087596,56.152965,0
10.087614,56.152967,0
10.087921,56.152988,0
10.088134,56.153019,0
10.088311,56.15304,0
10.088454,56.153052,0
10.088469,56.153378,0
10.08847,56.153445,0
10.088473,56.153597,0
10.088473,56.153597,0
10.088473,56.153597,0
10.088703,56.153614,0
10.088703,56.153614,0
10.088703,56.153614,0
10.088704,56.153614,0
10.088705,56.153614,0
10.088705,56.153615,0
10.088706,56.153615,0
10.088706,56.153615,0
10.088707,56.153616,0
10.088707,56.153616,0
10.088707,56.153616,0
10.088707,56.153617,0
10.088707,56.153617,0
10.088707,56.153617,0
10.088707,56.153618,0
10.088512,56.154108,0
10.088512,56.154109,0
coordinates :
10.086779,56.155487,0
10.086778,56.155488,0
10.086779,56.155487,0
10.086779,56.155487,0
coordinates :
10.08847,56.153602,0
10.088469,56.153602,0
10.088469,56.153602,0
10.08847,56.153602,0
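PS: kml is the root element, so it is what loadXML() returns; it is not a child that you need to look up with getChildren("kml").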
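Each coordinates element holds a whitespace-separated list of longitude,latitude,altitude tuples rather than a single number, so the text still has to be split before it can be used. A minimal sketch of that step follows; the class name CoordinateSketch and the method parse are hypothetical, and a missing altitude is assumed to mean 0.

```java
import java.util.ArrayList;
import java.util.List;

public class CoordinateSketch {

    // Splits KML <coordinates> text into {longitude, latitude, altitude} triples.
    static List<double[]> parse(String text) {
        List<double[]> points = new ArrayList<>();
        for (String tuple : text.trim().split("\\s+")) {
            if (tuple.isEmpty()) {
                continue; // happens only when the whole text is blank
            }
            String[] parts = tuple.split(",");
            double lon = Double.parseDouble(parts[0]);
            double lat = Double.parseDouble(parts[1]);
            // Assumption: treat a missing altitude as 0.
            double alt = parts.length > 2 ? Double.parseDouble(parts[2]) : 0.0;
            points.add(new double[] {lon, lat, alt});
        }
        return points;
    }

    public static void main(String[] args) {
        // Two tuples taken from the output above.
        String text = "\n10.086779,56.155487,0\n10.086778,56.155488,0\n";
        for (double[] p : parse(text)) {
            System.out.println("lon=" + p[0] + " lat=" + p[1] + " alt=" + p[2]);
        }
    }
}
```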
I'm trying to import a KML file with JAK, but I get the following error:
javax.xml.bind.UnmarshalException: unexpected element (uri:"http://earth.google.com/kml/2.2", local:"kml"). Expected elements are
...
and then a big list
Has anyone else run into this problem?
This is the code:
final Kml kml = Kml.unmarshal(new File("../data/Eemskanaal.kml"));
final Placemark placemark = (Placemark) kml.getFeature();
Point point = (Point) placemark.getGeometry();
List<Coordinate> coordinates = point.getCoordinates();
for (Coordinate coordinate : coordinates) {
System.out.println(coordinate.getLatitude());
System.out.println(coordinate.getLongitude());
System.out.println(coordinate.getAltitude());
}
And this is the kml file:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
<Document>
<name>BU00100107 Verspreide huizen Eemskanaal (ten zuiden)</name>
<description><![CDATA[description]]></description>
<Placemark>
<name>BLA!</name>
<description><![CDATA[]]></description>
<styleUrl>#style1</styleUrl>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<tessellate>1</tessellate>
<coordinates>
6.941796,53.314914,0.000000
6.942705,53.310923,0.000000
6.952713,53.305394,0.000000
6.954853,53.300262,0.000000
6.954239,53.296317,0.000000
6.962271,53.295483,0.000000
6.995900,53.287338,0.000000
6.995013,53.285264,0.000000
6.996842,53.281429,0.000000
6.991748,53.278255,0.000000
6.990729,53.275234,0.000000
6.988361,53.274477,0.000000
6.990120,53.271780,0.000000
6.984540,53.272709,0.000000
6.984543,53.274393,0.000000
6.980317,53.274404,0.000000
6.975829,53.272503,0.000000
6.974816,53.271125,0.000000
6.963342,53.271937,0.000000
6.955082,53.265909,0.000000
6.945183,53.269634,0.000000
6.940684,53.273351,0.000000
6.935942,53.273875,0.000000
6.934392,53.276351,0.000000
6.929104,53.272181,0.000000
6.909544,53.265952,0.000000
6.908803,53.269015,0.000000
6.909151,53.278897,0.000000
6.888166,53.279161,0.000000
6.887788,53.279639,0.000000
6.886750,53.280950,0.000000
6.886729,53.280977,0.000000
6.888260,53.281856,0.000000
6.895912,53.286254,0.000000
6.892976,53.288089,0.000000
6.891571,53.290803,0.000000
6.887323,53.298046,0.000000
6.887729,53.309725,0.000000
6.887583,53.309816,0.000000
6.888683,53.311891,0.000000
6.893966,53.313119,0.000000
6.924732,53.311548,0.000000
6.929655,53.312392,0.000000
6.934810,53.315353,0.000000
6.941796,53.314914,0.000000
</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<tessellate>1</tessellate>
<coordinates>
6.905549,53.283453,0.000000
6.908790,53.282516,0.000000
6.912146,53.283305,0.000000
6.916480,53.287575,0.000000
6.916764,53.288072,0.000000
6.915251,53.288369,0.000000
6.915097,53.290097,0.000000
6.912526,53.292361,0.000000
6.908052,53.290971,0.000000
6.905569,53.288875,0.000000
6.905549,53.283453,0.000000
</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
</Placemark>
</Document>
</kml>
Any other solutions are also welcome
Here is my quick and dirty way :)
public static Kml getKml(InputStream is) throws Exception {
String str = IOUtils.toString( is );
IOUtils.closeQuietly( is );
str = StringUtils.replace( str, "xmlns=\"http://earth.google.com/kml/2.2\"", "xmlns=\"http://www.opengis.net/kml/2.2\" xmlns:gx=\"http://www.google.com/kml/ext/2.2\"" );
ByteArrayInputStream bais = new ByteArrayInputStream( str.getBytes( "UTF-8" ) );
return Kml.unmarshal(bais);
}
I am unfamiliar with Jak, but if you're using the OGC Schema, the namespace is different. You have
http://earth.google.com/kml/2.2
The OGC namespace is
http://www.opengis.net/kml/2.2
The Google extension schema uses
http://www.google.com/kml/ext/2.2
as well. The namespace you're using was used by Google before KML was given to the OGC as an open standard.
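To see which namespace a given file actually declares, the root element can be inspected with the standard DOM parser before handing the file to a KML library. This is only a sketch: the class name KmlNamespaceCheck and the method rootNamespace are made up for illustration, and the inline string stands in for a real file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class KmlNamespaceCheck {

    static final String OGC_NS = "http://www.opengis.net/kml/2.2";

    // Returns the namespace URI of the root element, or null if it has none.
    static String rootNamespace(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, getNamespaceURI() returns null
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a file that still uses the pre-OGC Google namespace.
        String old = "<kml xmlns=\"http://earth.google.com/kml/2.2\"><Document/></kml>";
        String ns = rootNamespace(old.getBytes(StandardCharsets.UTF_8));
        System.out.println(ns + (OGC_NS.equals(ns) ? " (OGC)" : " (not the OGC namespace)"));
    }
}
```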
The namespace is incorrect, but if you have 1700 files and this is the only problem, you might consider simply using the two-argument form of Kml.unmarshal(File file, boolean validate).
final Kml kml = loadXMLFile("../data/Eemskanaal.kml");
private static Kml loadXMLFile(String path) {
Kml kml = null;
try {
// Load from a File so that the argument is read as a path
kml = Kml.unmarshal(new File(path));
} catch (RuntimeException ex) {
kml = Kml.unmarshal(new File(path), false);
}
return kml;
}
I've also used the following cheezy perl script to correct my files.
$ cat ~/bin/fixxmlns
#!/usr/bin/perl
for (@ARGV) {
open (FH,"<$_");
open (NFH,">$_.x");
$look = 1;
while ($r = <FH>) {
if ($look && $r =~ /earth.google.com/) {
$r = qq{<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:gx="http://www.google.com/kml/ext/2.2">\n};
$look = 0;
}
print NFH $r;
}
close (NFH);
close (FH);
rename("$_", "$_.orig");
rename("$_.x", "$_");
}
$ fixxmlns *.kml
$ find parentdir -name "*.kml" -print0 | xargs -0 fixxmlns