Mibble MIB Parser - extracting comments from the MIB - java

I am using the Mibble MIB Parser to extract all simple data types from an MIB file. I've been successful until my attempt to extract comment text.
Take the following object definition as an example:
invBookList OBJECT-TYPE
    SYNTAX      INTEGER {
                    mobydick(1),     -- call me ishmael
                    paradiselost(2), -- aComment
                    1984(3),         -- aComment
                    solaris(4)       -- aComment
                }
    MAX-ACCESS  read-only
    STATUS      current
    DESCRIPTION
        "A few Books for an example."
    ::= { invMasterList 43 }
According to Mibble's API, the OBJECT-TYPE can be accessed by extracting an SnmpObjectType and then calling the appropriate getter method, which I have done; I can successfully extract all of the text except the comments in the INTEGER syntax.
I have tried calling getSyntax().getComment() on the SnmpObjectType, but it always returns null. getSyntax() will extract the INTEGER syntax, e.g.:
mobydick(1),paradiselost(2),1984(3),solaris(4)
but unfortunately it strips out the comments.
Anyone out there have experience with the Mibble parser who knows how to extract the comments?
Many Thanks.

First, you need to use version 2.9 of Mibble. Then look into MibWriter.java to understand how to use the API:
https://github.com/cederberg/mibble/blob/master/src/java/net/percederberg/mibble/MibWriter.java
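A minimal sketch of what that might look like with the 2.9 API, assuming the enumeration's named values are exposed as MibValueSymbol entries and that the trailing "-- ..." comments are attached to them via a getComment() accessor (cross-check the exact calls against MibWriter.java, which is the code that actually prints those comments):

import java.io.File;

import net.percederberg.mibble.Mib;
import net.percederberg.mibble.MibLoader;
import net.percederberg.mibble.MibValueSymbol;
import net.percederberg.mibble.snmp.SnmpObjectType;
import net.percederberg.mibble.type.IntegerType;

public class EnumCommentDump {
    public static void main(String[] args) throws Exception {
        MibLoader loader = new MibLoader();
        // Let the loader resolve imports from the same directory as the MIB file.
        loader.addDir(new File(args[0]).getParentFile());
        Mib mib = loader.load(new File(args[0]));
        for (Object o : mib.getAllSymbols()) {
            if (!(o instanceof MibValueSymbol)) {
                continue;
            }
            MibValueSymbol symbol = (MibValueSymbol) o;
            if (!(symbol.getType() instanceof SnmpObjectType)) {
                continue;
            }
            SnmpObjectType objectType = (SnmpObjectType) symbol.getType();
            if (objectType.getSyntax() instanceof IntegerType) {
                IntegerType syntax = (IntegerType) objectType.getSyntax();
                // Each named number of the enumeration is a MibValueSymbol.
                // getComment() here is an assumption based on the comment support added in 2.9.
                for (MibValueSymbol value : syntax.getAllSymbols()) {
                    System.out.println(value.getName() + " -> " + value.getComment());
                }
            }
        }
    }
}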

Related

Why is my list empty when I am parsing a correct JSON response?

I am trying to print the ids from a JSON response, but I am not able to understand why I am getting a blank list. I have verified the JSONPath (SECTIONS_IDS_JSONPATH) with an online evaluator and it gives me the correct results.
public static void main(String[] args) {
String SECTIONS_IDS_JSONPATH = "$.[*].instructionEvents[*].sectionId";
String sectionsData = "{\"sections\":[{\"id\":\"8da1cf5d-3150-4e11-b2af-338d1df20475\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"courseCredits\":[{\"minimum\":4,\"maximum\":null,\"measure\":\"hour\",\"increment\":null}],\"academicPeriodId\":\"8b7a8e9e-5417-42a3-9c90-8d47226b5987\",\"reservedSeatsMaximum\":0,\"maxEnrollment\":0,\"hours\":[],\"sites\":[\"All Campuses\"],\"instructors\":[],\"instructionEvents\":[{\"id\":\"9d0c49e2-1579-43c3-b25a-2f85f551e62d\",\"sectionId\":\"8da1cf5d-3150-4e11-b2af-338d1df20475\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"days\":[\"monday\",\"wednesday\",\"friday\"],\"startTm\":\"2019-01-01T09:45:00-05:00\",\"endTm\":\"2024-12-01T10:45:00-05:00\",\"localizations\":[],\"instructionalMethod\":\"Lecture\"}]},{\"id\":\"ad3f63ad-e642-4938-a9fd-318afd2d1ad0\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"courseCredits\":[{\"minimum\":4,\"maximum\":null,\"measure\":\"hour\",\"increment\":null}],\"academicPeriodId\":\"8b7a8e9e-5417-42a3-9c90-8d47226b5987\",\"reservedSeatsMaximum\":0,\"maxEnrollment\":20,\"hours\":[],\"sites\":[\"All Campuses\"],\"instructors\":[{\"id\":\"c26572de-f9c8-4623-ba6a-79997b33f1c6\",\"sectionId\":\"ad3f63ad-e642-4938-a9fd-318afd2d1ad0\",\"role\":\"primary\",\"persons\":[{\"id\":\"c1b50d79-5505-4a33-9316-b4b1f52c0ca3\",\"names\":[{\"firstName\":\"BanColoFac-1\",\"lastName\":\"CTester\",\"preferred\":true}]}]}],\"instructionEvents\":[{\"id\":\"af8fb500-29f5-4451-95d5-a11215298cd4\",\"sectionId\":\"ad3f63ad-e642-4938-a9fd-318afd2d1ad0\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"days\":[\"tuesday\",\"thursday\"],\"startTm\":\"2019-01-01T10:00:00-05:00\",\"endTm\":\"2024-12-01T10:50:00-05:00\",\"localizations\":[],\"instructionalMethod\":\"Lecture\"}]},{\"id\":\"a1422391-e2b9-4bc4-907b-371fcea01d70\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"courseCredits\":[{\"minimum\":4,\"maximum\":null,\"measure\":\"hour\",\"increment\":null}],\"academicPeriodId\":\"8b7a8e9e-5417-42a3-9c90-8d47226b5987\",\"reservedSeatsMaximum\":0,\"maxEnrollment\":20,\"hours\":[],\"sites\":[\"All Campuses\"],\"instructors\":[{\"id\":\"808daae1-3ec6-47ec-9af0-5392199bdf78\",\"sectionId\":\"a1422391-e2b9-4bc4-907b-371fcea01d70\",\"role\":\"primary\",\"persons\":[{\"id\":\"793cc9b3-57c7-4a2d-8984-07a1fb6834a9\",\"names\":[{\"firstName\":\"Andrew\",\"lastName\":\"Adams\",\"preferred\":true}]}]}],\"instructionEvents\":[{\"id\":\"730b4206-684d-4413-bf20-9bec5c1dc900\",\"sectionId\":\"a1422391-e2b9-4bc4-907b-371fcea01d70\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"days\":[\"tuesday\",\"thursday\"],\"startTm\":\"2019-01-01T10:00:00-05:00\",\"endTm\":\"2024-12-01T10:50:00-05:00\",\"localizations\":[],\"instructionalMethod\":\"Lecture\"},{\"id\":\"8bc059ab-a8f8-4469-8e79-bbc71f7fa3fd\",\"sectionId\":\"a1422391-e2b9-4bc4-907b-371fcea01d70\",\"courseId\":\"e8a65581-ed1c-43f0-90a7-7b9d51b35062\",\"days\":[\"monday\",\"wednesday\",\"friday\"],\"startTm\":\"2019-05-26T09:00:00-04:00\",\"endTm\":\"2021-05-26T09:50:00-04:00\",\"localizations\":[],\"instructionalMethod\":\"Lecture\"}]}]}";
List<String> ids = JsonPath.parse(sectionsData).read(SECTIONS_IDS_JSONPATH);
System.out.println(ids);
}
Alright, since this question might get deleted if nothing else ever happens, I'd better post this as an answer.
As explained by Andreas, you should use the JSONPath $.*[*].instructionEvents[*].sectionId instead. Quoting from the comment:
The syntax $.[*] is undefined, I can't find any documentation/example doing that. The JSONPath Online Evaluator [based on JSONPath-Plus, implemented in JavaScript] treats it as $..[*], but the Java library treats it differently. Since the outer part of the JSON is {"sections":[ ... ]}, you have an object, so you need a property selector (.prop or .*). Once you've selected your property (.sections, or .* since there's only one), the property is an array, so you need an array selector ([2] or [*]). Hence you can use $.sections[*] or $.*[*] to match all sections.
Indeed, looking at this massive JSONPath Comparison, we can see that the syntax in question is not listed for any implementation.
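For what it's worth, a trimmed-down sketch of the corrected call (Jayway json-path assumed on the classpath; the JSON is cut down to just the structure that matters):

import com.jayway.jsonpath.JsonPath;
import java.util.List;

public class SectionIdsDemo {
    public static void main(String[] args) {
        // Cut-down version of the payload from the question: an object whose
        // "sections" property is an array of sections with instructionEvents.
        String sectionsData = "{\"sections\":[{\"instructionEvents\":[{\"sectionId\":\"8da1cf5d\"}]},"
                + "{\"instructionEvents\":[{\"sectionId\":\"ad3f63ad\"},{\"sectionId\":\"a1422391\"}]}]}";
        // Property selector first ($.* or $.sections), then the array selector.
        List<String> ids = JsonPath.parse(sectionsData).read("$.*[*].instructionEvents[*].sectionId");
        System.out.println(ids); // prints the three sectionId values
    }
}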

How can I efficiently extract text from bunch for web pages without extra information

I have a list of around 1 million web pages, and I want to efficiently extract just the text from those pages. Currently I am using the BeautifulSoup library in Python to get the text from the HTML, and the requests library to fetch the HTML of each page. This approach extracts some extra information in addition to the text, for example any JavaScript listed in the body.
Could you please suggest a suitable and efficient way to do this task? I looked at Scrapy, but it looks like it crawls a specific website. Can I pass it a list of specific web pages to get information from?
Thank you in advance.
Yes, you can use Scrapy to crawl a set of URLs in a generic fashion.
You simply need to set them on the start_urls list attribute of your spider, or reimplement the start_requests spider method to yield requests from any data source, and then implement your parse callback to perform the generic content extraction you want.
You can use html-text to extract text from them, and regular Scrapy selectors to extract additional data like the one you mention.
In Scrapy you can also set up your own parser, e.g. BeautifulSoup, and call it from your parse method.
To extract text from generic pages, I traverse only the body, excluding comments and some tags like script, style, etc.:
# soup is the page parsed with BeautifulSoup; snippets collects the extracted text
for snippet in soup.find('body').descendants:
    if isinstance(snippet, bs4.element.NavigableString) \
            and not isinstance(snippet, EXCLUDED_STRING_TYPES) \
            and snippet.parent.name not in EXCLUDED_TAGS:
        snippet = re.sub(UNICODE_WHITESPACES, ' ', snippet)
        snippet = snippet.strip()
        if snippet != '':
            snippets.append(snippet)
with
EXCLUDED_STRING_TYPES = (bs4.Comment, bs4.CData, bs4.ProcessingInstruction, bs4.Declaration)
EXCLUDED_TAGS = ['script', 'noscript', 'style', 'pre', 'code']
UNICODE_WHITESPACES = re.compile(u'[\t\n\x0b\x0c\r\x1c\x1d\x1e\x1f \x85\xa0\u1680\u2000\u2001\u2002\u2003\u2004'
u'\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]+')

Indexing external text data to lucene index in GraphDB

Is it possible to index data that is external to the RDF?
For example, in the RDF there is a triple whose object is a link to an external file. Can the content of that file be indexed instead of the link value?
I suspect that the other answer here misunderstood the question. The question refers to external content - i.e., whether GraphDB's Lucene is able to index the content available at http://example.org, rather than the RDF literal associated with it (and then return in searches the triple pointing to that content).
From what I was able to try, no, this is not currently supported.
Absolutely. Lucene is a core part of GraphDB and it offers the standard functionality that comes with a standalone Lucene. The data will have to be present as a string literal, e.g. <http://www.example.org/> rdfs:label "An example webpage url."@en .
Then you can configure a Lucene Index:
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
luc:index luc:setParam "uris" .
luc:include luc:setParam "literals" .
luc:moleculeSize luc:setParam "1" .
luc:includePredicates luc:setParam "http://www.w3.org/2000/01/rdf-schema#label" .
}
And once you have the configuration, you can create the index.
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
luc:myTestIndex luc:createIndex "true" .
}
And, given the index and your data, you can query it.
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
SELECT * {
?subj luc:myTestIndex "web*"
}
Since you are asking about the subject of something which contains the string web*, you'll get <http://www.example.org/>. If you had other triples linking to this one, they might have also appeared.
More information about the way in which GraphDB interacts with Lucene and its Full-Text-Search capabilities can be found within the GraphDB documentation.

How to replace a query string in an Apache Velocity template?

In my web application I'm trying to prevent users from inserting JavaScript in the freeText parameter when they're running a search.
To do this, I've written code in the header Velocity file to check whether the query string contains a parameter called freeText and, if so, use the replace method to replace the characters within the parameter value. However, when you load the page, it still displays the original query string - I'm unsure how to replace the original query string with my new one, which has the replaced characters.
This is my code:
#set($freeTextParameter = "$request.getParameter('freeText')")
freeTextParameter: $freeTextParameter
#if($freeTextParameter)
##Do the replacement:
#set($replacedQueryString = "$freeTextParameter.replace('confirm','replaced')")
replacedQueryString after doing the replace: $replacedQueryString
The query string now: $request.getQueryString()
The freeText parameter now: $request.getParameter('freeText')
#end
In the code above, the replacedQueryString variable has changed as expected (i.e. the replacement has been carried out), but $request.getQueryString() and $request.getParameter('freeText') are still the same as before, as if the replacement had never happened.
Seeing as there is a request.getParameter method which works fine for getting the parameters, I assumed there would be a request.setParameter method to do the same thing in reverse, but there isn't.
A Java String is an immutable object, which means that the replace() method returns an altered string without changing the original one.
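To make that concrete, a tiny illustration (hypothetical values):
String original = "confirm('x')";
String altered = original.replace("confirm", "replaced");
// original still holds "confirm('x')"; only altered holds "replaced('x')"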
Since the parameters map given by the HttpServletRequest object cannot be modified, this approach doesn't work well if your templates rely on $request.getParameter('freeText').
Instead, if you use VelocityTools, you can refer to $params.freeText in your templates. You can then tune your WEB-INF/tools.xml file to make this parameters map alterable:
<?xml version="1.0"?>
<tools>
    <toolbox scope="request">
        <tool key="params" readOnly="false"/>
        ...
    </toolbox>
    ...
</tools>
(Version 2.0+ of the tools is required).
Then, in your header, you can do:
#set($params.freeText = $params.freeText.replace('confirm','replaced'))
I managed to fix the issue myself - it turned out that there was another file (which gets called on every page) in which the $!request.getParameter('freeText') variable is used. I have updated that file so that it uses the new $!replacedQueryString variable (i.e. the one with the JavaScript stripped out) instead of the existing $!request.getParameter('freeText') variable. This now prevents the JavaScript from being executed on every page.
So, this is the final working code in the header Velocity file:
#set($freeTextParameter = "$!m.request.httpRequest.getParameter('freeText')")
#if($freeTextParameter)
#set($replacedQueryString = "$freeTextParameter.replace('confirm','').replace('<','').replace('>','').replace('(','').replace(')','').replace(';','').replace('/','').replace('\"','').replace('&','').replace('+','').replace('script','').replace('prompt','').replace('*','').replace('.','')")
#end

Converting HTTP Response (Java "Properties" stream format) into NSDictionary

I am working on an iPhone application which contains HTTP requests and responses.
The format of the response is a key/value format compatible with the Java "Properties" stream format.
I want to store the response in an NSDictionary. Could you suggest a way to do this?
Thank you.
sangee
Edit:
Thanks guys for the quick replies!!!
Is there any other way to store them in an NSDictionary?
I just want to store the album name and description in an array like this:
mutablearray = [wrwr, dsf, my album];
Could you please let me know if this is possible or not?
Thanks again!!!
This is the response I got for my HTTP request...
GR2PROTO
debug_album= debug_gallery_version= debug_user=admin debug_user_type=Gallery_User debug_user_already_logged_in= server_version=2.12 status=0 status_text=Login successful.
#GR2PROTO debug_album= debug_gallery_version= debug_user=admin debug_user_type=Gallery_User debug_user_already_logged_in=1
album.name.1=wrwr album.title.1=wrwr album.summary.1= album.parent.1=0 album.resize_size.1=640 album.thumb_size.1=100 album.perms.add.1=true album.perms.write.1=true album.perms.del_item.1=true album.perms.del_alb.1=true album.perms.create_sub.1=true album.info.extrafields.1=Description
album.name.2=dsf album.title.2=dsf album.summary.2= album.parent.2=0 album.resize_size.2=640 album.thumb_size.2=100 album.perms.add.2=true album.perms.write.2=true album.perms.del_item.2=true album.perms.del_alb.2=true album.perms.create_sub.2=true album.info.extrafields.2=Description
album.name.3=my album album.title.3=my album album.summary.3= album.parent.3=0 album.resize_size.3=640 album.thumb_size.3=100 album.perms.add.3=true album.perms.write.3=true album.perms.del_item.3=true album.perms.del_alb.3=true album.perms.create_sub.3=true album.info.extrafields.3=Description
If you can, I would recommend serializing the data as JSON (or XML, if you have to) and parsing it using TouchJSON or a similar parser. If you really can't, then you'll have to implement your own parser--take a look at NSScanner.
Look at NSStream and the Stream Programming Guide for Cocoa.
Back in the day when Java was fully integrated into Cocoa, NSStream mapped onto Java streams. It still might. IIRC (it's been a while), NSStream will return a properly populated NSDictionary from a Java stream.
Edit:
It looks like the text returned is just a space-delimited hash, which is the Java version of a dictionary. It takes the form key=value space key=value. The only tricky part is that some of the hashes are nested.
The first line, for example, is nested:
debug_album{
    debug_gallery_version{
        debug_user=admin
        debug_user_type=Gallery_User
        debug_user_already_logged_in{
            server_version=2.12
            status=0
            status_text=Login successful.
        }
    }
}
You need a recursive scanner to parse that. The "key=space" pattern indicates a nested dictionary.
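Just to make the flat key=value part of that format concrete, here is a rough Java sketch of splitting one line into key/value pairs (illustrative only; on the iPhone the equivalent would be written with NSScanner, with the nesting handling described above layered on top):

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GalleryLineParser {
    // A value runs until the next "key=" token or the end of the line,
    // so values containing spaces (e.g. "my album") stay intact.
    private static final Pattern PAIR = Pattern.compile("(\\S+)=(.*?)(?=\\s+\\S+=|\\s*$)");

    public static Map<String, String> parseLine(String line) {
        Map<String, String> dict = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            dict.put(m.group(1), m.group(2));
        }
        return dict;
    }

    public static void main(String[] args) {
        // Sample taken from the response pasted in the question.
        String line = "album.name.3=my album album.title.3=my album album.summary.3= album.parent.3=0";
        System.out.println(parseLine(line));
    }
}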
