Is it possible to use IMAP + paging? - java

I have a requirement to build an IMAP client as a web application.
I implemented sorting like this:
// userFolder is an IMAPFolder
Message[] messages = userFolder.getMessages();
Arrays.sort(messages, new Comparator<Message>()
{
    public int compare(Message message1, Message message2)
    {
        int returnValue = 0;
        try
        {
            if (sortCriteria == SORT_SENT_DATE)
            {
                returnValue = message1.getSentDate().compareTo(message2.getSentDate());
            }
        } catch (Exception e)
        {
            System.out.println(e.getMessage());
            e.printStackTrace();
        }
        if (sortType == SORT_TYPE_DESCENDING)
        {
            returnValue = -returnValue;
        }
        return returnValue;
    }
});
The code snippet is not complete; it is just a brief excerpt.
SORT_SENT_DATE and SORT_TYPE_DESCENDING are my own constants.
This solution works, but its logic breaks down for paging.
Since this is a web-based application, I can't expect the server to load all messages for every user and sort them.
(We do have situations with more than 1000 simultaneous users whose mailboxes hold more than 1000 messages each.)
It also does not make sense for the web server to load all messages, sort them, and return just a small part (say 1-20),
and then on the next request load and sort everything again to return 21-40. Caching is possible, but what is the guarantee that the user will actually make another request?
I heard there is a class called FetchProfile. Can that help me here? (I guess it would still load all messages, but only the information that is required.)
Is there any other way to achieve this?
I need a solution that also works for search operations (searching with paging).
I have built an architecture to create a SearchTerm, but there too I would need paging.
For reference, I have asked the same question at:
http://www.coderanch.com/t/461408/Other-JSE-JEE-APIs/java/it-possible-use-IMAP-paging

You would need a server with the SORT extension and even that may not be enough. Then you issue SORT on the specific mailbox and FETCH only those message numbers that fall into your view.
Update based on comments:
For servers where the SORT extension is not available, the next best thing is to FETCH the header field that represents the sort key for all items (e.g. FETCH 1:* BODY[HEADER.FIELDS (SUBJECT)] for subject or FETCH 1:* BODY[HEADER.FIELDS (DATE)] for sent date), then sort on that key. This gives you a list of sorted message numbers, which should be equivalent to what the SORT command would return.
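The client-side fallback above can be sketched in plain Java. This is only an illustration under assumptions: the map of message number to sent date stands in for whatever your IMAP library returns from the header FETCH, and the class and method names (ImapPaging, sortByDate, page) are made up for this example. It sorts the message numbers by date and then cuts out the one page whose messages you would FETCH in full.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of JavaMail or any IMAP library.
class ImapPaging {

    /** Returns the message numbers ordered by sent date, ties broken by message number. */
    public static List<Integer> sortByDate(Map<Integer, Instant> sentDates, boolean descending) {
        List<Map.Entry<Integer, Instant>> entries = new ArrayList<>(sentDates.entrySet());
        entries.sort((a, b) -> {
            int c = a.getValue().compareTo(b.getValue());
            if (c == 0) {
                c = Integer.compare(a.getKey(), b.getKey());
            }
            return descending ? -c : c;
        });
        List<Integer> numbers = new ArrayList<>();
        for (Map.Entry<Integer, Instant> entry : entries) {
            numbers.add(entry.getKey());
        }
        return numbers;
    }

    /** Returns the message numbers on the given zero-based page; empty if past the end. */
    public static List<Integer> page(List<Integer> sorted, int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

Only the numbers returned by page(...) need a full FETCH, so the expensive part of each request stays proportional to the page size rather than the mailbox size.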
If a server-side cache is allowed, then the best way is to keep a cache of envelopes (in the IMAP ENVELOPE sense) and update it using the techniques described in RFC 4549. Sorting and paging are easy given this cache.
There are two IMAP APIs for Java: the official JavaMail API and Ristretto. Ristretto is more low-level and should allow you to implement anything described above. JavaMail may be able to do so as well, but I don't have much experience with it.

Related

How we can get forest data directory in MarkLogic

I am trying to get the forest data directory in MarkLogic. I used the following method, which uses the Server Evaluation Call Interface and runs the query as admin. Is this the right approach? If not, please let me know how I can get the forest data directory.
ServerEvaluationCall forestDataDirCall = client.newServerEval()
    .xquery("admin:forest-get-data-directory(admin:get-configuration(), admin:forest-get-id(admin:get-configuration(), \"" + forestName +"\"))");
for (EvalResult forestDataDirResult : forestDataDirCall.eval()) {
    String forestDataDir = null;
    forestDataDir = forestDataDirResult.getString();
    System.out.println("forestDataDir is " + forestDataDir);
}
I see no reason for needing to hit the server evaluation endpoint to ask this question to the server. MarkLogic comes with a robust REST based management API including getters for almost all items of interest.
Knowing that, you can use what is documented here:
http://yourserver:8002/manage/v2/forests
Results can be in JSON, XML or HTML
It is the getter for forest configurations. Which forests you care about can be found by iterating over all forests or by reaching through the database config and then to the forests. It all depends on what you already know from the outside.
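As a rough sketch of what such a call could look like from Java, the snippet below builds a GET request with the JDK's own java.net.http classes. Treat the details as assumptions to check against the Management API documentation: the /properties sub-resource, the format=json parameter, and the expectation that the response contains the data directory are my reading of the API, and the class and method names are made up. The request is only built here, not sent; sending it also requires the authentication your server is configured for.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds, but does not send, a Management API request.
class ForestRequests {

    /** URI of the properties of one forest on the manage port (8002). */
    public static URI forestPropertiesUri(String host, String forestName) {
        // URLEncoder targets form encoding, so turn '+' back into '%20' for use in a path.
        String encoded = URLEncoder.encode(forestName, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create("http://" + host + ":8002/manage/v2/forests/" + encoded
                + "/properties?format=json");
    }

    /** GET request asking for a JSON response. */
    public static HttpRequest forestPropertiesRequest(String host, String forestName) {
        return HttpRequest.newBuilder(forestPropertiesUri(host, forestName))
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

The request would then be passed to an HttpClient configured with your credentials, and the data directory read from the returned properties.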
References:
Management API
Scripting Administrative Tasks

Is it possible to get Direct Messages from Twitter by a specific user using the Twitter4j library?

I'm using the Twitter4j library to develop a project that works with Twitter. One of the things I need is to get the direct messages. I'm using the following code:
try {
    List<DirectMessage> loStatusList = loTwitter.getDirectMessages();
    for (DirectMessage loStatus : loStatusList) {
        System.out.println(loStatus.getId() + ",@" + loStatus.getSenderScreenName() + "," + loStatus.getText() + "|");
    }
} catch (Exception e) {
    e.printStackTrace();
}
It works fine, but the code returns a list of the most recent messages in general. What I want is to get the direct messages through some kind of filter that finds only those from a user I specify.
For example, I need to see only the DMs from user @TwitterUser.
Is this possible with this library?
All suggestions are welcome. If I should use another library, I would be grateful if you let me know.
It looks like the actual Twitter API doesn't support a direct filter on that API, by username anyway. (See Twitter API doc: GET direct_messages.)
Which means, you'd have to make multiple calls to the API with pagination enabled, and cache the responses into a list.
Here is an example of pagination with Twitter4J getDirectMessages().
In that example, use the existing:
List<DirectMessage> messages;
But inside the loop, do:
messages.addAll(twitter.getDirectMessages(paging));
Note: you only would have to do this once. And in fact, you should persist these to a durable local cache like Redis or something. Because once you have the last message id, you can ask the Twitter API to only return "messages since id" with the since_id param.
Anyway, then on the client side you'd just do your filtering with the usual means in Java. For example:
// Joe is on Twitter as @joe
private static final String AT_JOE = "Joe";
// Java 8 lambda to filter by screen name
List<DirectMessage> messagesFromJoe = messages.stream()
    .filter(message -> message.getSenderScreenName().equals(AT_JOE))
    .collect(Collectors.toList());
Above, getSenderScreenName() was discovered by reading the Twitter4J API doc for DirectMessage.
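The note about caching and "messages since id" can be sketched without Twitter4J at all. Everything here is hypothetical: Dm is a stand-in for the few DirectMessage fields that matter, DmCache is an in-memory placeholder for whatever durable store you use, and matching screen names case-insensitively is my assumption. The idea is to keep messages keyed by id, so that overlapping pages do not create duplicates and the highest id is always available as the next since_id value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory cache; a real one would persist to a durable store.
class DmCache {

    /** Stand-in for the direct message fields used in this sketch. */
    static final class Dm {
        final long id;
        final String senderScreenName;
        final String text;

        Dm(long id, String senderScreenName, String text) {
            this.id = id;
            this.senderScreenName = senderScreenName;
            this.text = text;
        }
    }

    // Keyed by message id so that re-fetched messages overwrite instead of duplicating.
    private final TreeMap<Long, Dm> byId = new TreeMap<>();

    /** Adds one fetched page; messages already cached are replaced, not duplicated. */
    void addAll(List<Dm> fetched) {
        for (Dm dm : fetched) {
            byId.put(dm.id, dm);
        }
    }

    /** Highest id seen so far, or -1 if nothing is cached yet. */
    long sinceId() {
        return byId.isEmpty() ? -1L : byId.lastKey();
    }

    /** Cached messages from one sender, oldest first. */
    List<Dm> fromSender(String screenName) {
        List<Dm> result = new ArrayList<>();
        for (Dm dm : byId.values()) {
            if (dm.senderScreenName.equalsIgnoreCase(screenName)) {
                result.add(dm);
            }
        }
        return result;
    }
}
```

After the initial full fetch, each later request only needs to ask for messages newer than sinceId() and feed the result back through addAll.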

query my sql db from snmp agent

I am new to SNMP and am using SNMP4J to create an SNMP agent. My Java application needs to listen for SNMP requests, query the database based on the incoming OID, and send a response back. I have source code for an SNMP agent, but how does the agent query the database based on the incoming OID? Do I need to register all OIDs in my database as managed objects in the agent so that it can do the lookup when a request arrives? In other words, how do I point the agent at my datastore?
This is the code I am using:
http://shivasoft.in/blog/java/snmp/creating-snmp-agent-server-in-java-using-snmp4j/
List<Oid> oidList = impl.getOidList(); // get data from db
for (Oid oid : oidList) {
    agent.registerManagedObject(MOScalarFactory.createReadOnly(
        new OID(oid.getOid()), oid.getValue()));
}
I am trying to register the managed objects with the data in the database. Is this correct?
I get a duplicate registration exception on the second row even though the OID is unique.
.1.3.6.1.4.1.1166.1.6.1.2.2.1.3.1.1
.1.3.6.1.4.1.1166.1.6.1.2.2.1.3.1.2
I don't think this is the right way, because what if the database is huge?
Any help or tips are greatly appreciated.
Problem
You get org.snmp4j.agent.DuplicateRegistrationException because there can be only one ManagedObject per MOContextScope. Each registration assigns a ManagedObject to a context scope. The second registration tries to put a second object into a scope that is already occupied, so the exception is thrown.
Also, each scalar OID SHOULD end with .0. You can check this in any MIB browser such as iReasoning: pick any value, and if it is a scalar, the trailing zero is appended automatically even though it is not mentioned in the MIB file. So the most "correct" way is to use solution 4.1.
Solution 1 - own MOScalar
Write your own MOScalar with narrower bounds.
You should override getLowerBound, getUpperBound, isLowerIncluded and isUpperIncluded to get separate context scopes for your objects.
I would suggest returning the scalar's own OID every time and including both boundaries.
In other words, the upper and lower bounds should both return the OID you registered.
Solution 2 - own MOServer
Write your own MOServer. With blackJack and others...
You can mostly copy and paste the existing code, except for this field:
private SortedMap<MOScope, ManagedObject> registry;
It should look like this
private SortedMap<MOScope, Set<ManagedObject>> registry;
And it would affect registration, unregistration and other logic.
DefaultMOServer - 678 lines incl. comments. In fact you should fix several classes:
query.matchesQuery(object)
private boolean matchesQuery(MOQuery query, ManagedObject object) {
    if ((query.matchesQuery(object)) && object.getScope().isOverlapping(query.getScope())) {
        if (object instanceof MOScalar) {
            MOScalar moScalar = (MOScalar) object;
            return query.getLowerBound().compareTo(moScalar.getID()) <= 0 &&
                   query.getUpperBound().compareTo(moScalar.getID()) >= 0;
        } else {
            return true;
        }
    }
    return false;
}
protected void fire...Event(ManagedObject objects, MOQuery query) {
protected void fire...Event(Set<ManagedObject> objects, MOQuery query) {
if (lookupListener != null) {
for (ManagedObject mo : objects) {
ManagedObject other = lookup(new DefaultMOQuery(contextScope));
Set<ManagedObject> other = lookup(new DefaultMOQuery(contextScope), false);
And so on...
Solution 3 - Tables
Use table rows.
You can add a table and append rows.
You would be able to access cells as the
<tableEntryOID>.<columnSubID>.<rowIndexOID>
You may use this question as tutorial.
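To make the cell addressing concrete, here is a small string-level sketch. It does not use SNMP4J, and the helper name is made up. Reading the OIDs from the question as table cells (entry ...2.2.1, column 3, row index 1.1 or 1.2) is only my interpretation of their shape.

```java
import java.util.StringJoiner;

// Hypothetical helper that composes <tableEntryOID>.<columnSubID>.<rowIndexOID>.
class TableOids {

    /** Builds the dotted OID of one table cell from its three parts. */
    public static String cellOid(String tableEntryOid, int columnSubId, int... rowIndex) {
        if (rowIndex.length == 0) {
            throw new IllegalArgumentException("row index needs at least one sub-identifier");
        }
        StringJoiner oid = new StringJoiner(".");
        oid.add(tableEntryOid);
        oid.add(Integer.toString(columnSubId));
        for (int sub : rowIndex) {
            oid.add(Integer.toString(sub));
        }
        return oid.toString();
    }
}
```

With that reading, cellOid("1.3.6.1.4.1.1166.1.6.1.2.2.1", 3, 1, 1) and cellOid("1.3.6.1.4.1.1166.1.6.1.2.2.1", 3, 1, 2) give the two OIDs from the question (without the leading dot), which is why a table may model that data more naturally than a set of scalars.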
Solution 4 - OIDs fixup
Make your OIDs use different context scopes.
Solution 4.1
Adding trailing zero
agent.registerManagedObject(
    MOScalarFactory.createReadOnly(
        new OID(oid.getOid()).successor(),
        oid.getValue()
    )
);
This will append .0 to the same-level properties.
snmpget -v2c -c public localhost:2001 oid.getOid().0
Also, any MIB browser will append .0 to each scalar OID defined in the MIB file. You can check this with iReasoning, the most popular browser. Even hrSystemUptime (.1.3.6.1.2.1.25.1.1, shown at the bottom left) is requested as hrSystemUptime.0 (.1.3.6.1.2.1.25.1.1.0), as shown at the top.
Solution 4.2
Separate OID at the base.
static final OID sysDescr1 = new OID("1.3.6.1.4.1.5.6.1.8.9"),
sysDescr2 = new OID("1.3.6.1.4.1.5.6.2.2.5");
Fix the database OIDs to get separate contextScopes.
In addition
You can try reading SNMP4J-Agent-Instrumentation-Guide.pdf, although it didn't help me.
You can attach the sources to your IDE to read about the trailing zero and other nuances. This helped me a lot in understanding DefaultMOServer.
The correct pom.xml entries to get the latest version:
<repositories>
    <repository>
        <id>SNMP4J</id>
        <url>https://oosnmp.net/dist/release/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>org.snmp4j</groupId>
        <artifactId>snmp4j-agent</artifactId>
        <version>2.3.2</version>
    </dependency>
</dependencies>
First of all, OIDs start with a number, not a dot. The syntax you are using comes from NET-SNMP and is non-standard.
Second, please read the SNMP4J-Agent-Instrumentation-Guide.pdf document, which describes in detail how to instrument an agent for a MIB. You got the duplicate registration exception because you registered a scalar as a sub-tree. Scalar OIDs have to end with the ".0" instance suffix.
Using the CommandResponder interface amounts to reinventing the wheel. You will most likely never manage to implement a secure and standards-conformant SNMP agent when you start from scratch. Using SNMP4J-Agent and its instrumentation hooks will save you a lot of work and trouble.
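The first two points can be applied to the database values before they ever reach the agent. The following is a plain-string sketch with a made-up helper name and no SNMP4J dependency: it drops the NET-SNMP style leading dot and appends the ".0" instance suffix. It assumes every row really is a scalar object, which you would need to confirm against your MIB.

```java
// Hypothetical helper for cleaning up OID strings read from the database.
class ScalarOids {

    /** Turns ".1.3.6.1.2.1.25.1.1" into the scalar instance OID "1.3.6.1.2.1.25.1.1.0". */
    public static String toScalarInstance(String objectOid) {
        String oid = objectOid.trim();
        if (oid.startsWith(".")) {
            oid = oid.substring(1);
        }
        // Accept only dotted decimal sub-identifiers, e.g. "1.3.6.1".
        if (!oid.matches("\\d+(\\.\\d+)*")) {
            throw new IllegalArgumentException("not a dotted numeric OID: " + objectOid);
        }
        return oid + ".0";
    }
}
```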

How do I invoke a template from another template in Play Framework?

I have a template accountlist.scala.html looking like this:
@(accounts: models.domain.AccountList)
@titlebar = {<p>some html</p>}
@content = {
    @for(account <- accounts) {
        <p>@account.name</p>
    }
}
@main(titlebar)(content)
... and another template account.scala.html like this:
@(account: models.domain.Account)
@titlebar = {<p>@account.name</p>}
@content = {
    @for(transaction <- account.getTransactions()) {
        <p>@transaction.detail</p>
    }
}
@main(titlebar)(content)
From both of them I am invoking the template main.scala.html.
I have access to the entire Account POJO in the first view, accountlist.scala.html, so there is really no need for me to call the server to get the details of the account when I go to the view that displays them. I would just like to change the view on the client side. How can I call the second view, account.scala.html, from accountlist.scala.html when a user clicks on an account in the list? I am ready to change the templates as needed.
I have provided a previous answer, which is still available at the end of this post. From your comments, however, I understand that you are asking for something else without realizing how dangerous it is.
There are three ways of handling your use case, let's start with the worst one.
A stateful web application
If you have loaded data into a POJO from some data source and you want to reuse the POJO across multiple requests, what you are trying to do is implement some kind of client state on the server, such as a cache. Web applications were developed this way for a long time, and it was a major source of bugs and errors. What happens if the underlying account data in the database is updated between one HTTP request and the next? Your client won't see it, because it uses cached data. Additionally, what happens when the user presses the back button in the browser? How do you get notified on the server side so that you can keep track of where the user is in the navigation flow? For these and other reasons, Play! is designed to be stateless. If you are serious about web applications, you should read up on the REST architectural style.
A stateless web application
In a stateless web application, you are not allowed to keep data between two HTTP requests, so you have two ways to handle it:
Generate the user interface in a single shot
This is the approach to use when your account data is small. You embed all the necessary data from each account into your page and generate the detail view up front, keeping it hidden and showing it only when the user clicks. Note that you can either generate the HTML on the server side and use JavaScript to make only certain parts of your DOM visible, or transfer just a JSON representation of your accounts and use some kind of templating library to build the necessary UI directly on the client.
Generate the user interface when required
This approach becomes necessary when the account data structure contains too much information and you don't want to transfer all of it for every account to the client up front. For example, if you know the user is going to be interested in the details of only a few accounts, you want to request the details only when the user asks for them.
For example, in your list of accounts you will have a button associated with each account, called details, and you will use the account id to send a new request to the server.
@(accounts: models.domain.AccountList)
@titlebar = {<p>some html</p>}
@content = {
    @for(account <- accounts) {
        <p>@account.name <button class="details" href="@routes.Controllers.account(account.id)">details</button></p>
    }
}
Please note that you can also generate the user interface on the client side, but you will still need to retrieve the data structures from the server when the user clicks on the button. This ensures that the user sees the latest available state of the account.
Old answer
If your goal is to reuse your views, Play views are nothing other than Scala classes, so you can import them:
@import packagename._
and then use them in another template:
@for(account <- accounts) {
    @account(account)
}
The question reveals a misunderstanding of play framework templates. When compiling the play project the template code is transformed to html, css and javascript.
You can not "invoke"/link another template showing the account transactions from a href attribute of your Account row. However, you can do any of the following:
In case you have loaded all transactions from all accounts to the client in one go: extend the template to generate separate <div> sections for each account showing the transactions. Also generate javascript to 1) hide the overview div and 2) show the specific transaction div when clicking on one of the accounts in the overview. Please see the knockout library proposed by Edmondo1984 or the accordion or tabs in twitter bootstrap.
In case you only load the account overview from the server: generate a link such as href="@routes.Controllers.account(account.id)" (see Edmondo1984's answer) and make another template to view this data.
Since the question concerned a case in which you got all data from the server, go by option 1.

Efficient way to GET multiple HTML pages simultaneously

So I'm working on web scraping for a certain website. The problem is:
Given a set of URLs (on the order of 100s to 1000s), I would like to retrieve the HTML of each URL efficiently, especially time-wise. I need to be able to make 1000s of requests every 5 minutes.
This should usually imply using a pool of threads to do requests from a set of not yet requested urls. But before jumping into implementing this, I believe that it's worth asking here since I believe this is a fairly common problem when doing web scraping or web crawling.
Is there any library that has what I need?
So I'm working on web scraping for a certain website.
Are you scraping a single server, or are you scraping pages from multiple hosts? If it is the former, the server you are scraping may not like too many concurrent connections from a single IP address.
If it is the latter, this is really a general question on how many outbound connections you should open from a machine. There is physical limit, but it is pretty large. Practically, it would depend on where that client is getting deployed. The better the connectivity, the higher number of connections it can accommodate.
You might want to look at the source code of a good download manager to see if they have a limit on the number of outbound connections.
Definitely use asynchronous I/O, but you would still do well to limit the number of connections.
Your bandwidth utilization will be the sum of all of the HTML documents that you retrieve (plus a little overhead) no matter how you slice it (though some web servers may support compressed HTTP streams, so certainly use a client capable of accepting them).
The optimal number of concurrent threads depends a great deal on your network connectivity to the sites in question. Only experimentation can find an optimal number. You can certainly use one set of threads for retrieving HTML documents and a separate set of threads to process them to make it easier to find the right balance.
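A minimal sketch of the retrieval side using only the JDK is shown below. The thread count is the knob to experiment with. The fetch function is deliberately left to the caller (an HTTP client call in practice) and, like the class name, is an assumption of this example rather than anything from a specific library. Failed downloads are recorded as null so that one bad URL does not abort the batch.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: downloads a batch of URLs with at most `threads` running at once.
class BoundedFetcher {

    /** Maps each distinct URL to its fetched content, or to null if the fetch failed. */
    public static Map<String, String> fetchAll(List<String> urls, int threads,
                                               Function<String, String> fetch) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every distinct URL; the pool size caps the concurrency.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String url : new LinkedHashSet<>(urls)) {
                Callable<String> task = () -> fetch.apply(url);
                pending.put(url, pool.submit(task));
            }
            // Collect results in submission order.
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                try {
                    results.put(entry.getKey(), entry.getValue().get());
                } catch (ExecutionException failed) {
                    results.put(entry.getKey(), null);
                }
            }
            return results;
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", interrupted);
        } finally {
            pool.shutdown();
        }
    }
}
```

Processing the returned HTML can then run on a separate pool, which keeps the two thread counts independently tunable as described above.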
I'm a big fan of HTML Agility Pack for web scraping in the .NET world but cannot make a specific recommendation for Java. The following question may be of use in finding a good, Java based scraping platform
Web scraping with Java
I would start by researching asynchronous communication. Then take a look at Netty.
Keep in mind there is always a limit to how fast one can load a web page. For an average home connection, it will be around a second. Take this into consideration when programming your application.
http://www.jsoup.org, but only for the scraping part! I think you should implement the thread pooling yourself.
Update
If this approach fits your needs, you can download the complete class files here:
http://codetoearn.blogspot.com/2013/01/concurrent-web-requests-with-thread.html
AsyncWebReader webReader = new AsyncWebReader(5/*number of threads*/, new String[]{
    "http://www.google.com",
    "http://www.yahoo.com",
    "http://www.live.com",
    "http://www.wikipedia.com",
    "http://www.facebook.com",
    "http://www.khorasannews.com",
    "http://www.fcbarcelona.com",
    "http://www.khorasannews.com",
});
webReader.addObserver(new Observer() {
    @Override
    public void update(Observable o, Object arg) {
        if (arg instanceof Exception) {
            Exception ex = (Exception) arg;
            System.out.println(ex.getMessage());
        } /*else if (arg instanceof List) {
            List vals = (List) arg;
            System.out.println(vals.get(0) + ": " + vals.get(1));
        } */else if (arg instanceof Object[]) {
            Object[] objects = (Object[]) arg;
            HashMap result = (HashMap) objects[0];
            String[] success = (String[]) objects[1];
            String[] fail = (String[]) objects[2];
            System.out.println("Failed");
            for (int i = 0; i < fail.length; i++) {
                String string = fail[i];
                System.out.println(string);
            }
            System.out.println("-----------");
            System.out.println("success");
            for (int i = 0; i < success.length; i++) {
                String string = success[i];
                System.out.println(string);
            }
            System.out.println("\n\nresult of Google: ");
            System.out.println(result.remove("http://www.google.com"));
        }
    }
});
Thread t = new Thread(webReader);
t.start();
t.join();
