SPSSReader reader = new SPSSReader(args[0], null);
Iterator it = reader.getVariables().iterator();
while (it.hasNext())
{
System.out.println(it.next());
}
I am using this SPSSReader to read an SPSS file. Every string is printed with some junk characters appended to it.
Obtained Result :
StringVariable: nameogr(nulltpc{)(10)
NumericVariable: weightppuo(nullf{nd)
DateVariable: datexsgzj(nulllanck)
DateVariable: timeppzb(null|wt{l)
DateVariable: datetimegulj{(null|ns)
NumericVariable: commissionyrqh(nullohzx)
NumericVariable: priceeub{av(nullvlpl)
Expected Result :
StringVariable: name (10)
NumericVariable: weight
DateVariable: date
DateVariable: time
DateVariable: datetime
NumericVariable: commission
NumericVariable: price
Thanks in advance :)
I tried recreating the issue and found the same thing.
Since that library is commercially licensed (see here), I assume this is the developers' way of ensuring that a license is bought: the regular download only contains a demo version for evaluation (see the licensing note before the download).
As that library is rather old (copyright of the website is 2003-2008, requirement for the library is Java 1.2, no generics, Vectors are used, etc), I would recommend a different library as long as you are not limited to the one used in your question.
After a quick search, it turned out that there is an open source spss reader here which is also available through Maven here.
Using the example on the github page, I put this together:
import com.bedatadriven.spss.SpssDataFileReader;
import com.bedatadriven.spss.SpssVariable;
public class SPSSDemo {
public static void main(String[] args) {
try {
SpssDataFileReader reader = new SpssDataFileReader(args[0]);
for (SpssVariable var : reader.getVariables()) {
System.out.println(var.getVariableName());
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
I wasn't able to find stuff that would print NumericVariable or similar things but as those were the classnames of the library you were using in the question, I will assume that those are not SPSS standardized. If they are, you will either find something like that in the library or you can open an issue on the github page.
Using the employees.sav file from here I got this output from the code above using the open source library:
resp_id
gender
first_name
last_name
date_of_birth
education_type
education_years
job_type
experience_years
monthly_income
job_satisfaction
No additional characters no more!
Edit regarding the comment:
That is correct. I read through some SPSS material, though, and from my understanding there are only string and numeric variables, which are then formatted in different ways. The version published on Maven only gives you access to the type code of a variable (to be honest, I have no idea what that is), but the GitHub version (1.3-SNAPSHOT, which unfortunately does not appear to be published on Maven) does give you more, because the write and print formats have been introduced there.
You can clone or download the library and run mvn clean package (assuming you have maven installed) and use the generated library (found under target\spss-reader-1.3-SNAPSHOT.jar) in your project to have the methods SpssVariable#getPrintFormat and SpssVariable#getWriteFormat available.
Those return an SpssVariableFormat, from which you can get more information. As I have no clue what all that is about, the best I can do is link you to the source here, where the references to what was implemented should help you further. (I assume that the link referenced in the documentation of SpssVariableFormat#getType is probably the most helpful for determining what kind of format you have.)
If absolutely NOTHING works with that, I guess you could use the demo version of the library in the question to determine the type through it.next().getClass().getSimpleName() as well, but I would resort to that only if there is no other way of determining the format.
I am not sure, but looking at your code, it.next() is returning a Variable object.
There has to be some method to be chained to the Variable object, something like it.next().getLabel() or it.next().getVariableName(). toString() on an Object is not always meaningful. Check toString() method of Variable class in SPSSReader library.
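To illustrate the difference between printing an object and calling an accessor, here is a minimal sketch. DemoVariable and getName() are hypothetical stand-ins invented for this example; they are not the SPSSReader API, so check that library's Variable class for the real method names.

```java
// Entry point: compares implicit toString() output with an accessor call.
final class ToStringDemo {
    public static void main(String[] args) {
        DemoVariable v = new DemoVariable("name", 10);
        System.out.println(v);           // println calls v.toString() implicitly
        System.out.println(v.getName()); // prints only the raw name field
    }
}

// Hypothetical stand-in for a library variable class (an assumption, not SPSSReader).
final class DemoVariable {
    private final String name;
    private final int width;

    DemoVariable(String name, int width) {
        this.name = name;
        this.width = width;
    }

    // Accessor that returns just the variable name.
    String getName() {
        return name;
    }

    // toString() is free to decorate the name with extra details.
    @Override
    public String toString() {
        return "StringVariable: " + name + " (" + width + ")";
    }
}
```

If the real class offers an accessor like this, printing its result avoids whatever toString() adds.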
In Eclipse I want (for example) that code like this
public Foo bar() {
}
gets formatted to this
public Foo bar()
{
}
via the clean up function.
But to do that I have to check "Format source code" in the clean up profile.
But that also formats code like this
alert.setHeaderText("blablablablablablablablablablablablablablablabla");
to this
alert.setHeaderText(
"blablablablablablablablablablablablablablablabla");
which I absolutely do not want. Is there any possible way to stop Eclipse from cutting lines like that?
Go to Window->Preferences->Java->Code Style->Formatter. Create a new formatter profile. Click Edit, pick the Line Wrapping tab, and set the line wrapping policy to Do not wrap.
For more details, refer to the link below:
http://eclipsesource.com/blogs/2013/07/09/invisible-chaos-mastering-white-spaces-in-eclipse/
You can configure the style to which code is formatted. Under
Preferences: Java -> CodeStyle -> Formatter
Then look for "Line wrapping".
I want to write an eclipse plugin for my computer science lecturer. It's a custom editor with an xml based file format.
I've implemented syntax highlighting and other features, but I'm stuck when it comes to using annotations/markers to show invalid content.
This is what it looks like if the content is valid:
image of valid content http://image-upload.de/image/aPcsaa/6c799a671c.png
This is what it looks like if the content is invalid:
image of invalid content http://image-upload.de/image/4TdooQ/04d662f397.png
As you can see, invalid attributes get marked, but the problem is that all formatting seems to be lost between these annotations.
I use org.eclipse.jface.text.reconciler.Reconciler for delayed parsing of the content to create the document model. The model is used to format the text and display annotations. All of this happens in the void process(DirtyRegion dirtyRegion) method of the Reconciler.
For text formatting I use <ITextViewer>.changeTextPresentation(textAttributes, false) and for annotation handling I use
IAnnotationModel annotationModel = <ISourceViewer>.getAnnotationModel();
IAnnotationModelExtension annotationModelExtension = (IAnnotationModelExtension) annotationModel;
annotationModelExtension.replaceAnnotations(oldAnnotations, newAnnotations);
Since the Reconciler does not use the swt thread I have to use this construct to avoid exceptions:
<ITextViewer>.getTextWidget().getDisplay().asyncExec(new Runnable() {
@Override
public void run() {
<ITextViewer>.changeTextPresentation(textAttributes, false);
updateAnnotations(nodes);
}
});
By the way ITextViewer and ISourceViewer are meant as the same viewer object.
As the annotation type I've tested org.eclipse.ui.workbench.texteditor.spelling and some others, including custom types, but all with the same result.
I'm not quite sure what I'm doing wrong. Could it be because it all happens in a single call?
I hope someone can help me with this problem.
Thank you in advance.
I'm new to Scala, so I may be off base on this; I want to know if the problem is my code. Given the Scala file httpparse, simplified to:
object Http {
import java.io.InputStream;
import java.net.URL;
def request(urlString:String): (Boolean, InputStream) =
try {
val url = new URL(urlString)
val body = url.openStream
(true, body)
}
catch {
case ex:Exception => (false, null)
}
}
object HTTPParse extends Application {
import scala.xml._;
import java.net._;
def fetchAndParseURL(URL:String) = {
val (true, body) = Http request(URL)
val xml = XML.load(body) // <-- Error happens here in .load() method
"True"
}
}
Which is run with (URL doesn't matter, this is a joke example):
scala> HTTPParse.fetchAndParseURL("http://stackoverflow.com")
The result invariably:
java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/html4/strict.dtd
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1187)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:973)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEnti...
I've seen the Stack Overflow thread on this with respect to Java, as well as the W3C's System Team Blog entry about not trying to access this DTD via the web. I've also isolated the error to the XML.load() method, which is a Scala library method as far as I can tell.
My Question: How can I fix this? Is this something that is a by product of my code (cribbed from Raphael Ferreira's post), a by product of something Java specific that I need to address as in the previous thread, or something that is Scala specific? Where is this call happening, and is it a bug or a feature? ("Is it me? It's her, right?")
I've bumped into the SAME issue, and I haven't found an elegant solution (I'm thinking of posting the question to the Scala mailing list). Meanwhile, I found a workaround: implement your own SAXParserFactoryImpl so you can set the f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true); property. The good thing is that it doesn't require any change to the Scala code base (I agree that it should be fixed, though).
First I'm extending the default parser factory:
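package mypackage;

import com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

public class MyXMLParserFactory extends SAXParserFactoryImpl {
    public MyXMLParserFactory() throws SAXNotRecognizedException, SAXNotSupportedException, ParserConfigurationException {
        super();
        // Turn off validation and the loading of DTDs for every parser this factory creates.
        super.setFeature("http://xml.org/sax/features/validation", false);
        super.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false);
        super.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        super.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    }
}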
Nothing special, I just want the chance to set the property.
(Note that this is plain Java code; most probably you can write the same in Scala too.)
And in your Scala code, you need to configure the JVM to use your new factory:
System.setProperty("javax.xml.parsers.SAXParserFactory", "mypackage.MyXMLParserFactory");
Then you can call XML.load without validation
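As a rough, self-contained illustration of what the load-external-dtd switch does, the sketch below sets it on the stock JAXP factory and parses a document whose DOCTYPE points at an unresolvable host. It assumes the default JDK parser (which is Xerces-based and recognizes this feature URI). The class name NoDtdFetchDemo, the helper rootElement and the unreachable.invalid URL are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class NoDtdFetchDemo {
    // Parses the XML text and returns the qualified name of the root element.
    static String rootElement(String xml) throws Exception {
        SAXParserFactory f = SAXParserFactory.newInstance();
        f.setValidating(false);
        // Xerces-specific switch: do not load the external DTD named in the DOCTYPE.
        f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        SAXParser parser = f.newSAXParser();

        final String[] root = new String[1];
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (root[0] == null) {
                    root[0] = qName; // remember only the first element seen
                }
            }
        });
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        // The system identifier is deliberately unreachable (hypothetical URL).
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://unreachable.invalid/strict.dtd\">"
                + "<html><body/></html>";
        System.out.println(rootElement(xml));
    }
}
```

Because the DTD is never loaded, entities that are only declared in it (such as &nbsp;) remain undefined, so documents that use them will still fail to parse.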
Without addressing the problem for now, what do you expect to happen if the function request returns false below?
def fetchAndParseURL(URL:String) = {
val (true, body) = Http request(URL)
What will happen is that an exception will be thrown. You could rewrite it this way, though:
def fetchAndParseURL(URL:String) = (Http request(URL)) match {
case (true, body) =>
val xml = XML.load(body)
"True"
case _ => "False"
}
Now, to fix the XML parsing problem, we'll disable DTD loading in the parser, as suggested by others:
def fetchAndParseURL(URL:String) = (Http request(URL)) match {
case (true, body) =>
val f = javax.xml.parsers.SAXParserFactory.newInstance()
f.setNamespaceAware(false)
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
val MyXML = XML.withSAXParser(f.newSAXParser())
val xml = MyXML.load(body)
"True"
case _ => "False"
}
Now, I put that MyXML stuff inside fetchAndParseURL just to keep the structure of the example as unchanged as possible. For actual use, I'd separate it in a top-level object, and make "parser" into a def instead of val, to avoid problems with mutable parsers:
import scala.xml.Elem
import scala.xml.factory.XMLLoader
import javax.xml.parsers.SAXParser
object MyXML extends XMLLoader[Elem] {
override def parser: SAXParser = {
val f = javax.xml.parsers.SAXParserFactory.newInstance()
f.setNamespaceAware(false)
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
f.newSAXParser()
}
}
Import the package it is defined in, and you are good to go.
This is a scala problem. Native Java has an option to disable loading the DTD:
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
There is no equivalent in Scala.
If you want to fix it yourself somehow, check scala/xml/parsing/FactoryAdapter.scala and put the line in
def loadXML(source: InputSource): Node = {
  // create parser
  val parser: SAXParser = try {
    val f = SAXParserFactory.newInstance()
    f.setNamespaceAware(false)
    // <-- insert here
    f.newSAXParser()
  } catch {
    case e: Exception =>
      Console.err.println("error: Unable to instantiate parser")
      throw e
  }
GClaramunt's solution worked wonders for me. My Scala conversion is as follows:
package mypackage
import org.xml.sax.{SAXNotRecognizedException, SAXNotSupportedException}
import com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
import javax.xml.parsers.ParserConfigurationException
@throws(classOf[SAXNotRecognizedException])
@throws(classOf[SAXNotSupportedException])
@throws(classOf[ParserConfigurationException])
class MyXMLParserFactory extends SAXParserFactoryImpl() {
super.setFeature("http://xml.org/sax/features/validation", false)
super.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
super.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false)
super.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
}
As mentioned in the original post, it is necessary to place the following line in your code somewhere:
System.setProperty("javax.xml.parsers.SAXParserFactory", "mypackage.MyXMLParserFactory")
It works. After some detective work, the details as best I can figure them:
Trying to parse a developmental RESTful interface, I build the parser and get the above (rather, a similar) error. I try various parameters to change the XML output, but get the same error. I try to connect to an XML document I quickly whip up (cribbed stupidly from the interface itself) and get the same error. Then I try to connect to anything, just for kicks, and get the same (again, likely only similar) error.
I started questioning whether it was an error with the sources or the program, so I started searching around, and it looks like an ongoing issue- with many Google and SO hits on the same topic. This, unfortunately, made me focus on the upstream (language) aspects of the error, rather than troubleshoot more downstream at the sources themselves.
Fast forward and the parser suddenly works on the original XML output. I confirmed that some additional work had been done server side (just a crazy coincidence?). I don't have either earlier XML, but I suspect that it is related to the document identifiers being changed.
Now, the parser works fine on the RESTful interface, as well as any well-formatted XML I can throw at it. It also fails on all XHTML DTDs I've tried (e.g. www.w3.org). This is contrary to what @SeanReilly expects, but seems to jibe with what the W3C states.
I'm still new to Scala, so I can't determine if I have a special or a typical case. Nor can I be assured that this problem won't recur for me in another form down the line. It does seem that pulling XHTML will continue to cause this error unless one uses a solution similar to those suggested by @GClaramunt and @J-16 SDiZ. I'm not really qualified to know if this is a problem with the language or with my implementation of a solution (likely the latter).
For the immediate timeframe, I suspect that the best solution would have been for me to ensure that it was possible to parse that XML source, rather than see that others have had the same error and assume there was a functional problem with the language.
Hope this helps others.
There are two problems with what you are trying to do:
Scala's xml parser is trying to physically retrieve the DTD when it shouldn't. J-16 SDiZ seems to have some advice for this problem.
The Stack Overflow page you are trying to parse isn't XML. It's HTML 4 Strict.
The second problem isn't really possible to fix in your Scala code. Even once you get around the DTD problem, you'll find that the source just isn't valid XML (empty tags aren't closed properly, for example).
You have to either parse the page with something besides an XML parser, or investigate using a utility like tidy to convert the html to xml.
My knowledge of Scala is pretty poor, but couldn't you use ConstructingParser instead?
val xml = new java.io.File("xmlWithDtd.xml")
val parser = scala.xml.parsing.ConstructingParser.fromFile(xml, true)
val doc = parser.document()
println(doc.docElem)
For scala 2.7.7 I managed to do this with scala.xml.parsing.XhtmlParser
Setting Xerces switches only works if you are using Xerces. An entity resolver works for any JAXP parser.
There are more generalized entity resolvers out there, but this implementation does the trick when all I'm trying to do is parse valid XHTML.
http://code.google.com/p/java-xhtml-cache-dtds-entityresolver/
Shows how trivial it is to cache the DTDs and forgo the network traffic.
In any case, this is how I fix it. I always forget. I always get the error. I always go fetch this entity resolver. Then I'm back in business.
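For readers who just want the shape of the entity-resolver approach, here is a minimal sketch. It is not the code from the linked project: instead of serving cached DTD copies, it answers every external entity request with an empty document, which is the simplest way to keep a JAXP parser off the network. The names EmptyDtdResolverDemo and rootElement, and the unreachable.invalid URL, are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class EmptyDtdResolverDemo {
    // Parses the XML text and returns the qualified name of the root element.
    static String rootElement(String xml) throws Exception {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            // Called for the external DTD subset; returning an empty source
            // means nothing is fetched from the system identifier.
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (root[0] == null) {
                    root[0] = qName; // remember only the first element seen
                }
            }
        };
        // SAXParser.parse registers the handler as the entity resolver as well.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://unreachable.invalid/strict.dtd\">"
                + "<html><body/></html>";
        System.out.println(rootElement(xml));
    }
}
```

A caching resolver would instead look up publicId in a local table and return the stored DTD, which also keeps named entities such as &nbsp; working. The empty-source shortcut above does not.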
I work on the localization of Java software, and my projects have both .properties files and XML resources. We currently use comments to instruct translators to not translate certain strings, but the problem with comments is that they are not machine-readable.
The only solution I can think of is to prefix each do-not-translate key with something like _DNT_ and train our translation tools to ignore these entries. Does anyone out there have a better idea?
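To make the prefix idea concrete, this is roughly what the filtering step in a translation tool could look like. It is only a sketch: the _DNT_ prefix comes from the paragraph above, while the class DntFilterDemo, the method translatableKeys and the sample keys are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class DntFilterDemo {
    // Marker prefix for keys that must not be sent to translators.
    static final String DNT_PREFIX = "_DNT_";

    // Returns the keys that should go to translation, in sorted order.
    static Set<String> translatableKeys(Properties props) {
        Set<String> keys = new TreeSet<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(DNT_PREFIX)) {
                keys.add(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) throws IOException {
        // Sample resource content (made up for the example).
        Properties props = new Properties();
        props.load(new StringReader(
                "greeting=Hello\n"
                + "_DNT_product.name=Acme Widget\n"
                + "farewell=Goodbye\n"));
        System.out.println(translatableKeys(props));
    }
}
```

The trade-off is that the prefix becomes part of every key, so application code has to look the strings up under their prefixed names.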
Could you break the files up into ones to be translated and ones not to be translated, and then only send them the ones that are to be translated? (I don't know the structure, so it's hard to know whether that is practical.)
The Eclipse JDT also uses comments to prevent the translation of certain Strings:
How to write Eclipse plug-ins for the international market
I think your translation tool should work in a similar way?
The simplest solution is to not put do-not-translate strings (DNTs) in your resource files.
.properties files don't offer much in the way of metadata handling, and since you don't need the data at runtime, its presence in .properties files would be a side-effect rather than something that is desirable. Consider too, partial DNTs where you have something that cannot be translated contained in a translatable string (e.g. a brand name or URI).
"IDENTIFIER english en en en" -> "french fr IDENTIFIER fr fr"
As far as I am aware, even standards like XLIFF do not take DNTs into consideration and you'll have to manage them through custom metadata files, terminology files and/or comments (such as the note element in XLIFF).
As axelclk's link shows, Eclipse provides a
//$NON-NLS-1$
comment to tell the project that the first string on this line should not be translated. You can find all other strings by calling
Source->Externalize Strings
The externalized strings cover all the languages you want to support.
The file that contains the translations looks like this:
PluginPage.Error1 = text1
PluginPage.Error2 = text2
The class that reads the translations:
private static final String BUNDLE_NAME = "com.plugin.name"; //$NON-NLS-1$
private static final ResourceBundle RESOURCE_BUNDLE = ResourceBundle.getBundle(BUNDLE_NAME);
private PluginMessages() {
}
public static String getString(String key) {
// TODO Auto-generated method stub
try {
return RESOURCE_BUNDLE.getString(key);
} catch (MissingResourceException e) {
return '!' + key + '!';
}
}
And you can call it like:
String msg = PluginMessages.getString("PluginPage.Error2"); //$NON-NLS-1$
EDIT:
When a string is externalized and you want to use the original string, you can delete the externalized string from all properties files except the default one. When the bundle cannot find a message file matching the current locale, the default is used.
But this does not work at runtime.
If you do decide to use do-not-translate comments in your properties files, I would recommend you follow the Eclipse convention. It's nothing special, but life will be easier if we all use the same magic string!
(Eclipse doesn't actually support DO-NOT-TRANSLATE comments yet, as far as I know, but Tennera Ant-Gettext has an implementation of the above scheme which is used when converting from resource bundles to Gettext PO files.)