Tracking when an email is received and opened - java

How can I track an email?
I'm using Java on the server side for sending emails. I want to track whether an email is delivered, opened, etc. How can I do that?

Well, e-mail isn't that simple.
You can use some techniques to get a better idea of when an e-mail is opened, but none of them is guaranteed to work.
One usual approach is to include an image (a beacon) that makes a request to your server when it's loaded.
Of course, some mail clients will ask the user whether to load external content. If the user says no, there's nothing you can do.
As posted in a comment, you can look up more here: http://en.wikipedia.org/wiki/Email_tracking and by googling the subject; a lot has been written about it.
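As a rough illustration of the beacon idea, here is a minimal sketch that builds the image tag to embed in an HTML mail body. The endpoint URL, the `id` query parameter and the class name are all made up for the example; your server would have to serve a tiny image at that URL and log each request together with the id.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class TrackingPixel {

    // Builds the beacon URL; "id" is a hypothetical query parameter name
    // that identifies which message was opened.
    static String beaconUrl(String baseUrl, String messageId) {
        return baseUrl + "?id=" + URLEncoder.encode(messageId, StandardCharsets.UTF_8);
    }

    // Builds a 1x1 image tag to append to the HTML body of the mail.
    static String beaconTag(String baseUrl, String messageId) {
        return "<img src=\"" + beaconUrl(baseUrl, messageId)
                + "\" width=\"1\" height=\"1\" alt=\"\">";
    }

    public static void main(String[] args) {
        // The host below is a placeholder, not a real tracking endpoint.
        System.out.println(beaconTag("https://mail.example.com/open.gif", "msg 42"));
    }
}
```

A logged request only tells you that some client loaded the image; a missing request does not mean the mail was not read.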


Make my request only available for my Client

I created an application that needs information from my website. But I don't want this information to be accessible by any other way. My client-sided app has to be the only entity that can get this information. How can I achieve that?
After some research, I have found these solutions, but I am not sure which approach is best:
Custom user agent
Password in the request
httpassword, but how to handle it in Java?
If someone wants to get this information outside of your app, there is no way to prevent it if the person really wants to get it. They can just decompile your code and analyse the function. This should be clear.
But to keep ordinary users out, you can use a robots.txt, a user agent check, custom HTTP headers and the other things you mentioned. Encryption could be helpful too.
I would suggest a private subdomain, combined with API keys (sent via HTTP headers) and encryption.
You can whitelist the IP address of your application from your website.
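To make the API-key-in-a-header suggestion a bit more concrete, here is a minimal server-side sketch. The header name `X-Api-Key` and the class are assumptions for the example, and the headers are assumed to have been copied into a map by whatever web framework you use. As said above, a key shipped inside a client app can still be extracted by decompiling it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;

final class ApiKeyCheck {

    // Hypothetical header name; any custom header would do.
    static final String HEADER = "X-Api-Key";

    // Returns true only if the request carries the expected key.
    // The map is assumed to use exactly the header spelling above.
    static boolean isAuthorized(Map<String, String> headers, String expectedKey) {
        String supplied = headers.get(HEADER);
        if (supplied == null) {
            return false;
        }
        // MessageDigest.isEqual compares in constant time, which avoids
        // leaking how many leading characters of the key were correct.
        return MessageDigest.isEqual(
                supplied.getBytes(StandardCharsets.UTF_8),
                expectedKey.getBytes(StandardCharsets.UTF_8));
    }
}
```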

parsing commands sent through email

I am really unsure as to where I should be posting this, but if this is the wrong place, could someone direct me to where I could get an answer? I want to be able to send commands to my email address, and have the commands parsed and executed when the message arrives. I.e. I send an email and it contains this:
public class sentThruEmail {
    public static void main(String[] args) {
        System.out.println("Hello");
    }
}
I would want to configure my program to recognize when a new email arrives from a specific address, open it, compile it (in this case compile the Java) and then execute it on the machine that the program is running on.
How can I go about figuring out how to do this? Any help would be wonderful, thanks!
EDIT: Or maybe the first step would be how to recognize that an email was received from an address at all? In Java, how could I go about that: recognizing that an email was received and printing something to the screen to signal it?
This would be a very bad idea. Blindly accepting email from an untrusted source, compiling it, and executing it is an enormous security hole.
There are several parts of this to consider:
Security
As others mention, there are security risks to consider, here. If that worries you (and it should!), you may want to consider some of the following:
Digitally signing these "command" emails, and verifying that signature before looking at the email
PGP is a popular choice for this
Running your program (which reads the emails) in a "sandbox" environment, such as being chroot'ed or in a jail
Only run a very limited set of commands - perhaps just ones you invent.
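The last point (a small set of commands you invent yourself, instead of compiling arbitrary code) could look roughly like the sketch below. The command names PING and ECHO and the class are invented for the example; the input is assumed to be one line of text already extracted from a verified mail.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class CommandDispatcher {

    // Whitelist: only commands registered here can ever run.
    private final Map<String, Function<String, String>> handlers = new LinkedHashMap<>();

    CommandDispatcher() {
        handlers.put("PING", arg -> "PONG");
        handlers.put("ECHO", arg -> arg);
    }

    // Parses "NAME argument..." and runs the matching handler, if any.
    String handleLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "ERROR empty command";
        }
        String[] parts = trimmed.split("\\s+", 2);
        String name = parts[0].toUpperCase(Locale.ROOT);
        String arg = parts.length > 1 ? parts[1] : "";
        Function<String, String> handler = handlers.get(name);
        if (handler == null) {
            // Anything not on the whitelist is rejected, never executed.
            return "ERROR unknown command: " + name;
        }
        return handler.apply(arg);
    }
}
```

Nothing from the mail is ever passed to a compiler or a shell here; the worst a sender can do is call one of the handlers you wrote.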
Getting Mail
If you still want to do this, given the security issues, you will need your program to read mail.
You probably want to use IMAP for this, or POP.
Parsing
Once you have the mail, you need to parse the contents.
You could just compile it directly if you are only sending code.
You could also send the code as an attachment with a certain MIME type to identify it. That way you could still send a 'normal' email (perhaps with commentary about what this code is for), but your program would be able to cleanly separate out the code.
Responding
How will you communicate results back? Or do you care?
You may want to send a reply email (use SMTP), or update a webpage. A webpage is nice since if you are running the web server locally, you can just write a file directly.
Examples
The standard "confirmation email" system has a lot of similarity to what you describe. Someone sends an email to an automated system, it reads it, does some processing, and replies. Search around for those systems and I'm sure you'll get started.
I created the Exquisite Corpse Emailer project, which does much of what you describe (but in Perl). It only accepts a very small set of limited commands, but it listens to an email address on IMAP, parses the text, updates a database as a result, etc.
But if you want to do it, look into socket programming, as you would need to connect to your email provider (if it allows terminal login).

Can page scraping be detected?

So I just created an application that does page scraping for me, and ran it. It worked fine. I was wondering whether someone would be able to figure out that the page was being scraped, whether or not they had written code for that purpose.
I wrote the code in Java, and it's pretty much just checking for one line of the HTML code.
I thought I'd get some insight on that before I add any more code to this program. I mean it's useful, and all, but it's almost like a hack.
Seems like the worst case scenario as a result of this page scraper isn't too bad as I can just use another device later and the IP will be different. Also it might not matter in a month. The website seems to be getting quite a lot of web traffic anyways at the moment. Whoever edits the page is probably asleep now, and it really hasn't accomplished anything at this point so this could go unnoticed.
Thanks for such fast responses. I think it might have gone unnoticed. All I did was copy a header, so just text. I guess that is probably similar to how browser copy-paste works. The page was just edited this morning, including the text I was trying to get. If they did notice anything, they haven't announced it, so all is good.
It is a hack. :)
There's no way to programmatically determine if a page is being scraped. But, if your scraper becomes popular or you use it too heavily, it's quite possible to detect scraping statistically. If you see one IP grab the same page or pages at the same time every day, you can make an educated guess. Same if you see requests on another timer.
You should try to obey the robots.txt file if you can, and rate limit yourself, to be polite.
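For the "rate limit yourself" part, a minimal sketch might look like this. The user-agent string, the URL in it and the delay bounds are placeholders; checking robots.txt is not shown.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Random;

final class PoliteFetcher {

    // Placeholder identity; point the URL at a page describing your bot.
    static final String USER_AGENT = "ExampleScraper/0.1 (+https://example.com/bot-info)";

    // Picks a pause in [minMillis, maxMillis) so requests are spaced out
    // and do not hammer the server.
    static long nextDelayMillis(Random random, long minMillis, long maxMillis) {
        return minMillis + (long) (random.nextDouble() * (maxMillis - minMillis));
    }

    // Prepares (but does not yet send) a GET request with an honest User-Agent.
    static HttpURLConnection open(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        return conn;
    }
}
```

Between two fetches you would call something like `Thread.sleep(PoliteFetcher.nextDelayMillis(random, 5_000, 15_000))`; the bounds are up to you and the site's tolerance.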
As a sysadmin myself, yes I'd probably notice but ONLY based on the behavior of the client. If a client had a weird user agent, I'd be suspicious. If a client browsed the site too quickly or in very predictable intervals, I'd be suspicious. If certain support files were never requested (favicon.ico, various linked in CSS and JS files), I'd be suspicious. If the client were accessing odd (not directly accessible) pages, I'd be suspicious.
Then again I'd have to actually be looking at my logs. And this week Slashdot has been particularly interesting, so no I probably wouldn't notice.
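The "very predictable intervals" signal could be checked against log timestamps roughly as follows. This is only a sketch of one heuristic: the class name, the minimum of four requests and the threshold parameter are my own choices, not something a standard tool prescribes.

```java
final class ScrapeHeuristics {

    // Returns true when the gaps between one client's requests are nearly
    // constant, i.e. their standard deviation is small relative to the mean.
    static boolean hasRegularIntervals(long[] timestampsMillis, double maxRelativeStdDev) {
        if (timestampsMillis.length < 4) {
            return false; // too few requests to say anything
        }
        int n = timestampsMillis.length - 1;
        double[] gaps = new double[n];
        double sum = 0;
        for (int i = 0; i < n; i++) {
            gaps[i] = timestampsMillis[i + 1] - timestampsMillis[i];
            sum += gaps[i];
        }
        double mean = sum / n;
        if (mean <= 0) {
            return false; // timestamps not increasing; ignore
        }
        double squaredDiffs = 0;
        for (double gap : gaps) {
            squaredDiffs += (gap - mean) * (gap - mean);
        }
        double stdDev = Math.sqrt(squaredDiffs / n);
        return stdDev / mean <= maxRelativeStdDev;
    }
}
```

A client hitting the page exactly once a minute would be flagged with a threshold like 0.05, while a human's irregular clicks would not.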
It depends on how you have implemented this and how smart the detection tools are.
First, take care of the User-Agent. If you do not set it explicitly, it will be something like "Java-1.6". Browsers send their own "unique" user agents, so you can mimic browser behavior and send the User-Agent of MSIE or Firefox (for example).
Second, check the other HTTP headers. Some browsers probably send their own specific headers. Take one browser as an example and follow it, i.e. add the same headers to your requests (even if you do not need them).
A human user acts relatively slowly. A robot may act very quickly, i.e. retrieve the page and then "click" a link, i.e. perform yet another HTTP GET. Put a random sleep between these operations.
A browser does not retrieve only the main HTML. It then downloads images and other resources. If you really do not want to be detected, you have to parse the HTML and download these resources too, i.e. actually behave like a browser.
And the last point. It is obviously not your case, but it is almost impossible to implement a robot that passes a CAPTCHA. This is yet another way to detect a robot.
Happy hacking!
If your scraper acts like a human, then there is hardly any chance of it being detected as a scraper. But if your scraper acts like a robot, then it's not difficult to detect.
To act like a human you will need to:
Look at what a browser sends in the HTTP headers and simulate them.
Look at what a browser requests when accessing the page and request the same resources with the scraper
Time your scraper to access at the speed of a normal user
Send requests at random intervals of time instead of at fixed intervals
If possible make requests from a dynamic IP rather than a static one
Assuming you wrote the page scraper in a normal manner, i.e., it fetches the whole page and then does pattern recognition to extract what you want from the page, all someone might be able to tell is that the page was fetched by a robot rather than a normal browser. All their logs will show is that the entire page was fetched; they can't tell what you do with it once it's in your RAM.
To the server serving the page, there's no difference whether you download a page into the browser or download a page and screen scrape it. Both actions just require an HTTP request; whatever you do with the resulting HTML on your end is none of the server's business.
Having said that, a sophisticated server could conceivably detect activity that doesn't look like a normal browser. For example, a browser should request any additional resources linked to from the page, something that usually doesn't happen when screen scraping. Or requests with an unusual frequency coming from a particular address. Or simply the HTTP User-Agent header.
Whether a server tries to detect these things or not depends on the server; most don't.
I'd like to put my two cents in for others that may be reading this. In the past couple of years web scraping has been frowned upon more and more by the court system. I've cited a lot of examples in a blog post I recently wrote.
You should definitely abide by the robots.txt, but also look at the website's T&Cs to make sure you are not in violation. There are definitely ways that people can identify that you are web scraping, and there could be consequences for doing so. If web scraping is not disallowed by the website's Terms and Conditions, then have fun, but make sure to still be conscientious. Don't destroy a webserver with an out-of-control bot; throttle yourself to make sure you don't impact the server!
For full disclosure, I am a co-founder of Distil Networks and we help companies identify and stop web scrapers and bots.

What sort of encoding/image/formatting issues are there when building a web mail client that pulls email via POP3

What sort of encoding/image/formatting issues are there when building a web mail client that pulls email via POP3?
Some things I can think of that I know I will have to handle:
attachments
inline images
html emails
What other possible headaches are there?
It's quite a lot of work, and there are already a lot of solutions out there - but that shouldn't deter you! Your three points cover almost everything in general terms. The fact that the mail is coming through POP3 isn't all that relevant: POP3, IMAP, or even EWS (Exchange Web Services) all require attention to the following points:
Attachments can be referred to inline in an email (combo of your 1,2,3) - as in an email can include an IMAGE which is itself an attachment.
There are many MIME types you have to support.
Emails can be single-part, multipart/mixed, multipart/alternative, and combinations thereof. A good newsletter will send you a text and an HTML version of the same data, leaving the client to choose which one to display. That email could have one or more attachments... and an attachment can be another text/HTML email with another attachment... and this goes on ad nauseam.
HTML: as you've already pointed out, rendering email HTML inside your page without its styles clashing with yours is tricky, plus you'll want to filter out bad content, such as JavaScript includes and embedded images which might have privacy implications.
There are several character encodings that can be used - this ties into MIME types, but is worth independently noting (for headache-worthiness alone).
Basically you have to be jack of a number of trades to generate and decode emails.
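As one small taste of the character-encoding headaches: headers such as Subject can carry RFC 2047 "encoded words" like `=?UTF-8?B?...?=`, where the charset is named inline. Below is a minimal sketch that handles only the Base64 ("B") form using the JDK alone; the class name is made up, the quoted-printable ("Q") form is left out, and a real client would more likely rely on a mail library for this.

```java
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EncodedWordDecoder {

    // Matches =?charset?B?payload?= (Base64 form only).
    private static final Pattern B_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Bb]\\?([^?]*)\\?=");

    // Replaces each Base64 encoded word with its decoded text and leaves
    // everything else untouched. Throws if the charset name is unknown.
    static String decode(String headerValue) {
        Matcher m = B_WORD.matcher(headerValue);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(headerValue, last, m.start());
            Charset charset = Charset.forName(m.group(1));
            byte[] bytes = Base64.getDecoder().decode(m.group(2));
            out.append(new String(bytes, charset));
            last = m.end();
        }
        out.append(headerValue, last, headerValue.length());
        return out.toString();
    }
}
```

The same "charset is declared per part" idea applies to message bodies, where the Content-Type header of each MIME part names the charset.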
Many!
I highly suggest reading the POP3 RFC as a starter.
http://www.faqs.org/rfcs/rfc1939.html
You can download a few open source projects to see how they implemented the RFCs.
I agree with Pierre that you should read the specs to fully understand what's going on behind the scenes.
One thing I would add, though, is that the key things I would be worried about are the security of the mailboxes you are reading and spam. Emails often contain calls to JavaScript/images that can be used to track whether the message has been opened. This is the key reason many mail clients don't show images unless you turn them on.
Along with the other methods you are using you will probably have to make sure you parse the message and take out any calls that could potentially cause privacy issues unless the senders are trusted.
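A deliberately naive sketch of "taking out" such calls is shown below: it blanks the `src` of images that point at remote hosts before the HTML is displayed. The class name is invented, and a regex like this only covers simple, well-formed tags; it does not touch scripts, CSS or unquoted attributes, so a real client should use a proper HTML parser or sanitizer.

```java
import java.util.regex.Pattern;

final class RemoteImageBlocker {

    // Matches <img ... src="http://..."> (also https:// and //host/...),
    // with single or double quotes around the URL.
    private static final Pattern REMOTE_IMG = Pattern.compile(
            "(<img\\b[^>]*?\\bsrc\\s*=\\s*)([\"'])(?:https?:)?//[^\"']*\\2",
            Pattern.CASE_INSENSITIVE);

    // Empties the src of remote images so the mail client makes no request
    // to the sender's server; inline cid: images are left alone.
    static String blockRemoteImages(String html) {
        return REMOTE_IMG.matcher(html).replaceAll("$1$2$2");
    }
}
```

You could keep the original URL elsewhere and restore it when the user clicks "show images", which is what many mail clients appear to do.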

Access Hotmail Unread Mail Count via Java

I want to write an application using Java 6 that can check a user's Hotmail inbox for the 'unread message count'!
There is a JavaScript API, but I will not have a browser instance, and it seems that I need one to use it. (See Stack Overflow question 964392.)
I can use POP3, but since it does not support flags, I can only tell how many 'new' messages there are in the user's inbox since the last time I checked, not how many unread messages there are. (This is my current implementation; it's not what is required, but it is currently all I can achieve.)
There is IMAP access, but only for 'premium users'(Hotmail users who pay).
There's also HttpMail access, but this is poorly documented, and from testing, seems it's also only for premium users.
Unfortunately, this similar question on msdn suggests this is impossible
EDIT:
All I can offer is a half-solution. You could create the html page containing the script suggested by the people on MSDN but instead of setting the value of an input box to the number of unread messages - you could use Ajax to post this number back to your application. This is, of course, not a very robust solution since it depends on the browser and may very well not be cross platform. Another thing you can do is read up on running Javascript on the JVM. I don't know how good that solution is, either, but I think it's more robust once (or rather if) you can get it to work.
One potential option could be to use HtmlUnit, a headless web browser for Java, to make the requests. HtmlUnit has very good, but not perfect, JavaScript support to handle creating the dynamic content.
