How to parse JSON array with no object name - java

How would I parse this JSON array in Java? I'm confused because there is no object. Thanks!
EDIT: I'm an idiot! I should have read the documentation... that's probably what it's there for...
[
{
"id":"63565",
"name":"Buca di Beppo",
"user":null,
"phone":"(408)377-7722",
"address":"1875 S Bascom Ave Campbell, California, United States",
"gps_lat":"37.28967000",
"gps_long":"-121.93179700",
"monhh":"",
"tuehh":"",
"wedhh":"",
"thuhh":"",
"frihh":"",
"sathh":"",
"sunhh":"",
"monhrs":"",
"tuehrs":"",
"wedhrs":"",
"thuhrs":"",
"frihrs":"",
"sathrs":"",
"sunhrs":"",
"monspecials":"",
"tuespecials":"",
"wedspecials":"",
"thuspecials":"",
"frispecials":"",
"satspecials":"",
"sunspecials":"",
"description":"",
"source":"ripper",
"worldsbarsname":"BucadiBeppo31",
"url":"www.bucadebeppo.com",
"maybeDupe":"no",
"coupontext":"",
"couponimage":"0",
"distance":"1.00317",
"images":[
0
]
}
]

It is perfectly valid JSON. It is an array containing one object.
In JSON, arrays and objects don't have names. Only attributes of objects have names.
This is all described clearly by the JSON syntax diagrams at http://json.org. (FWIW, the site has translations in a number of languages ...)
How do you parse it? There are many libraries for parsing JSON. Many of them are linked from the site above. I suggest you use one of those rather than writing your own parsing code.
In response to this comment:
OTOH, writing your own parser is a reasonable project, and a good exercise for both learning JSON and learning Java (or whatever language). A reasonable parser can be written in about 500 lines of text.
In my opinion (having written MANY parsers in my time), writing a parser for a language is a very inefficient way to gain a working understanding the syntax of a language. And depending on how you implement the parser (and the nature of the language syntax specification) you can easily get an incorrect understanding.
A better approach is to read the language's syntax specification, which the OP has now done, and which you would have to do in order to implement a parser.
Writing a parser can be a good learning exercising, but it is really a learning exercise in writing parsers. Even then, you need to pick an appropriate implementation approach, and an appropriate language to be parsed.

It's an array containing one element. That element is an object. The object (dictionary) contains about 20 name/value pairs.

Related

How to represent a context free grammar in code?

I want to write a compiler as a personal project and am in the process of reading and understanding parsers (LL(k), LR(k), SLR etc.)
All of these parsers are based on some grammar which comes from the user and this grammar is generally written in a text file (for eg. in ANTLR, it comes in a .g4 file, which is a text file IMO). If I want my parser to create its parsing tables from such a grammar file, what is the best way to parse it and represent the productions in code?
EDIT:
For example, let's say I have this grammar:
S -> 'a'|'b'|'('S')'|T
T -> '*'S
I was thinking of parsing this given grammar and storing it as an ArrayList<ArrayList<String>>. This way every item in the ArrayList will be a collection of productions from the same non-terminal:
// with this type of a representation, I can assign an id to each production
//For example, production S -> 'a' has id 01 or T -> '*'S has an id of 10 and so on
{
{"S", "'a'", "'b'", "'('S')'", "T"},
{"T", "'*'S"}
}
I am not sure about representing the grammar as an AST, because then I don't know how to assign Ids to each production. But the above representation of a grammar seems pretty naive design to me and I am suspicious that there should be some standard way to doing this which will be easier to work with.

Load a Perl Hash into Java

I have a big .pm File, which only consist of a very big Perl hash with lots of subhashes. I have to load this hash into a Java program, do some work and changes on the data lying below and save it back into a .pm File, which should look similar to the one i started with.
By now, i tried to convert it linewise by regex and string matching, converting it into a XML Document and later Elementwise parse it back into a perl hash.
This somehow works, but seems quite dodgy. Is there any more reliable way to parse the perl hash without having a perl runtime installed?
You're quite right, it's utterly filthy. Regex and string for XML in the first place is a horrible idea, and honestly XML is probably not a good fit for this anyway.
I would suggest that you consider JSON. I would be stunned to find java can't handle JSON and it's inherently a hash-and-array oriented data structure.
So you can quite literally:
use JSON;
print to_json ( $data_structure, { pretty => 1 } );
Note - it won't work for serialising objects, but for perl hash/array/scalar type structures it'll work just fine.
You can then import it back into perl using:
my $new_data = from_json $string;
print Dumper $new_data;
Either Dumper it to a file, but given you requirement is multi-language going forward, just using native JSON as your 'at rest' data is probably a more sensible choice.
But if you're looking at parsing perl code within java, without a perl interpreter? No, that's just insanity.

Parsing Serialized Java objects in Python

The string at the bottom of this post is the serialization of a java.util.GregorianCalendar object in Java. I am hoping to parse it in Python.
I figured I could approach this problem with a combination of regexps and key=val splitting, i.e. something along the lines of:
text_inside_brackets = re.search(r"\[(.*)\]", text).group(1)
and
import parse
for x in [parse('{key} = {value}', x) for x in text_inside_brackets.split('=')]:
my_dict[x['key']] = x['value']
My question is: What would be a more principled / robust approach to do this? Are there any Python parsers for serialized Java objects that I could use for this problem? (do such things exist?). What other alternatives do I have?
My hope is to ultimately parse this in JSON or nested Python dictionaries, so that I can manipulate it it any way I want.
Note: I would prefer to avoid a solution relies on Py4J mostly because it requires setting up a server and a client, and I am hoping to do this within a single
Python script.
java.util.GregorianCalendar[time=1413172803113,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/New_York",offset=-18000000,dstSavings=3600000,useDaylight=true,transitions=235,lastRule=java.util.SimpleTimeZone[id=America/New_York,offset=-18000000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2014,MONTH=9,WEEK_OF_YEAR=42,WEEK_OF_MONTH=3,DAY_OF_MONTH=13,DAY_OF_YEAR=286,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=2,AM_PM=0,HOUR=0,HOUR_OF_DAY=0,MINUTE=0,SECOND=3,MILLISECOND=113,ZONE_OFFSET=-18000000,DST_OFFSET=3600000]
The serialized form of a GregorianCalendar object contains quite a lot of redundancy. In fact, there are only two fields that matter, if you want to reconstitute it:
the time
the timezone
There is code for extracting this in How to convert Gregorian string to Gregorian Calendar?
If you want a more principled and robust approach, I echo mbatchkarov's suggestion to use JSON.

ANTLR: Multiple ASTs using the same ambiguous grammar?

I'm building an ANTLR parser for a small query language. The query language is by definition ambiguous, and we need all possible interpretations (ASTs) to process the query.
Example:
query : CLASSIFIED_TOKEN UNCLASSIFIED_TOKEN
| ANY_TOKEN UNCLASSIFIED_TOKEN
;
In this case, if input matches both rules, I need to get 2 ASTs with both interpretations. ANTLR will return the first matched AST.
Do you know a simple way to get all possible ASTs for the same grammar? I'm thinking about running parser multiple times, "turning off" already matched rules between iterations; this seems dirty. Is there a better idea? Maybe other lex/parser tool with java support that can do this?
Thanks
If I were you, I'd remove the ambiguities. You can often do that by using contextual information to determine which grammar rules actually trigger. For instance, in
C* X;
in C (not your language, but this is just to make a point), you can't tell if this is just a pointless multiplication (legal to write in C), or a declaration of a variable X of type "pointer to C". So, there are two valid (ambiguous) parses. But if you know that C is a type declaration (from some context, perhaps an earlier code declaration), you can hack the parser to kill off the inappropriate choices and end up with just the one "correct" parse, no ambiguities.
If you really don't have the context, then you likely need a GLR parser, which happily generate both parses in your final tree. I don't know of any available for Java.
Our DMS Software Reengineering Toolkit [not a Java-based product] has GLR parsing support, and we use that all the time to parse difficult languages with ambiguities. The way we handle the C example above is to produce both parses, because the GLR parser is happy to do this, and then if we have additional information (such as symbol table table), post-process the tree to remove the inappropriate parses.
DMS is designed to support the customized analysis and transformation of arbitrary languages, such as your query language, and makes it easy to define the grammar. Once you have a context-free grammar (ambiguities or not), DMS can parse code and you can decide what to do later.
I doubt you're going to get ANTLR to return multiple parse trees without wholesale rewriting of the code.
I believe you're going to have to partition the ambiguities, each into its own unambiguous grammar and run the parse multiple times. If the total number of ambiguous productions is large you could have an unmanageable set of distinct grammars. For example, for three binary ambiguities (two choices) you'll end up with 8 distinct grammars, though there might be slightly fewer if one ambiguous branch eliminates one or more of the other ambiguities.
Good luck

Incremental streaming JSON library for Java

Can anyone recommend a JSON library for Java which allows me to give it chunks of data as they come in, in a non-blocking fashion? I have read through A better Java JSON library and similar questions, and haven't found precisely what I'd like.
Essentially, what I'd like is a library which allows me to do something like the following:
String jsonString1 = "{ \"A broken";
String jsonString2 = " json object\" : true }";
JSONParser p = new JSONParser(...);
p.parse(jsonString1);
p.isComplete(); // returns false
p.parse(jsonString2);
p.isComplete(); // returns true
Object o = p.getResult();
Notice the actual key name ("A broken json object") is split between pieces.
The closest I've found is this async-json-library which does almost exactly what I'd like, except it cannot recover objects where actual strings or other data values are split between pieces.
There are a few blocking streaming/incemental JSON parsers (as per Is there a streaming API for JSON?); but for async nothing yet that I am aware of.
The lib you refer to seems badly named; it does not seem to do real asynchronous processing, but merely allow one to parse sequence of JSON documents (which multiple other libs allow doing as well)
If there were people who really wanted this, writing one is not impossible -- for XML there is Aalto, and handling JSON is quite a bit simpler than XML.
For what it is worth, there is actually this feature request to add non-blocking parsing mode for Jackson; but very few users have expressed interest in getting that done (via voting for the feature request).
EDIT: (2016-01) while not async, Jackson ObjectMapper allows for convenient sub-tree by sub-tree binding of parts of the stream as well -- see ObjectReader.readValues() (ObjectReader created from ObjectMapper), or short-cut versions of ObjectMapper.readValues(...). Note the trailing s in there, which implies a stream of Objects, not just a single one.
Google Gson can incrementally parse Json from an InputStream
https://sites.google.com/site/gson/streaming
I wrote such a parser: JsonParser.java. See examples how to use it:JsonParserTest.java.

Categories

Resources