JavaScript not being properly executed in HtmlUnit

I'm currently developing some tests with HtmlUnit. They load a page that contains braintree.js (Braintree's form encryption library). I have a bunch of them running, but I'm stuck where the script calls crypto. The JS in question is:
(function() {
    try {
        var ab = new Uint32Array(32);
        crypto.getRandomValues(ab);
        sjcl.random.addEntropy(ab, 1024, "crypto.getRandomValues");
    } catch (e) {}
})();
HtmlUnit is throwing:
EcmaError, ReferenceError, "'crypto' is not defined."
I suppose HtmlUnit doesn't include crypto. Would it be possible to include a crypto library myself?

Based on your comment, I have to tell you that HtmlUnit is a pain in the neck when it comes to JavaScript. It will complain a lot about variables not being defined and unknown functions and so on.
Real browsers are more flexible, e.g. they accept syntactically incorrect pieces of JavaScript. HtmlUnit expects everything to be perfect, without any kind of error. Furthermore, even if you didn't miss a semicolon, HtmlUnit might still complain.
My advice:
Make sure your JavaScript is syntactically correct.
Avoid the use of complex libraries (jQuery seems to be properly supported).
If you can use non-minified versions of libraries, it's worth giving them a try.
Try to avoid complex jQuery methods (e.g. adding events dynamically to elements).
And the most important one: try switching between different BrowserVersions. Internet Explorer (ironically) has proven to provide the best results when it comes to interpreting JavaScript.

Related

How can I verify in Selenium that a language switcher changes the content language?

Scenario:
Go to the website: https://www.virtuality.fashion/it/home_it/
Click on the flag of France at the top right
Verify that the page language has changed to French
How can I verify that the language has been changed and the page content is now in French?
I am trying to automate this with Selenium. The steps are simple, so I had no issues up to clicking the flag, but I cannot find any method or option in Selenium that confirms the page content language has changed.
I thought of copying all the content into Google Translate and comparing it with the English version, but this seems quite complex if there are more pages to verify.

I don't believe Selenium is a good tool for content testing. You'll just waste your time and resources.
Anyway, if there are no alternatives for verifying that localization files are correctly applied at lower testing levels, I'd add some smart validation with a bit of randomization.
After switching the language you can pick a random element (defined in your page object) that has text, and then send that text to some language detection service. Assuming you have frequent builds, randomization may help you detect anomalies in different site areas.
But it definitely makes no sense to parse the entire content via Selenium and validate it against some hardcoded expected result. Any content change may cause failures and increase support time.
As an alternative, if you have static content, you may take a look at specialized solutions like Applitools.
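The random spot check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the texts are assumed to have been read already from page-object elements (for example via getText()), and the tiny stop-word lists are a crude local stand-in for a real language detection service, not a replacement for one.

```java
import java.util.List;
import java.util.Locale;
import java.util.Random;
import java.util.Set;

class LanguageSpotCheck {
    // Hypothetical, deliberately tiny stop-word lists (assumption for illustration only).
    private static final Set<String> FRENCH =
            Set.of("les", "des", "et", "est", "une", "pour", "dans", "nous", "vous", "avec");
    private static final Set<String> ITALIAN =
            Set.of("il", "gli", "di", "che", "è", "per", "una", "con", "sono", "della");

    // Picks one text at random so that different builds exercise different page areas.
    static String pickRandom(List<String> texts, Random random) {
        return texts.get(random.nextInt(texts.size()));
    }

    // Counts stop-word hits per language; returns "fr", "it", or "unknown" on a tie.
    static String guessLanguage(String text) {
        int french = 0;
        int italian = 0;
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (FRENCH.contains(word)) french++;
            if (ITALIAN.contains(word)) italian++;
        }
        if (french == italian) return "unknown";
        return french > italian ? "fr" : "it";
    }

    public static void main(String[] args) {
        // In a real test these strings would come from the page after clicking the flag.
        List<String> texts = List.of(
                "Bienvenue dans notre boutique pour les femmes et les hommes",
                "Découvrez les nouveautés et les offres pour vous");
        String sample = pickRandom(texts, new Random());
        System.out.println(sample + " -> " + guessLanguage(sample));
    }
}
```

A test would then assert that the guess for the sampled text equals the expected language code, and log the sampled text so a failure can be reproduced.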

You can use Selenium and specify the language in desiredCapabilities:
"desiredCapabilities": {
    "browserName": "chrome",
    "chromeOptions": {
        "args": ["--lang=en-GB"],
        "prefs": {
            "intl": {
                "accept_languages": "en-GB"
            }
        }
    }
}

filter out encoded javascript content from request

I have a problem where I am trying to cleanse the request content to strip out HTML and javascript if included in the input parameters.
This is basically to protect against XSS attacks and the ideal mechanism would be to validate input and encode the output but due to some restrictions I cannot work on the output end.
All I can do at this time is to try to cleanse the input through a filter. I am using ESAPI to canonicalize the input parameters and also using jsoup with the most restrictive Whitelist.none() option to strip all HTML.
This works as long as the malicious JavaScript is within some HTML tags, but it fails for a URL with JavaScript code that has no HTML surrounding it, e.g.:
http://example.com/index.html?a=40&b=10&c='-prompt``-'
ends up showing an alert on the page. This is kind of what I am doing right now:
param = encoder.canonicalize(param, false, false);
param = Jsoup.clean(param, Whitelist.none());
So the question is:
Is there some way through which I can make sure that my input is stripped of all HTML and javascript code at the filter?
Should I throw in some regex validations but is there any regex that will take care of the cases that are getting past the check I have right now?

DISCLAIMER:
If output-escaping is not allowed in your internet-facing solution, you are in a NO-WIN SCENARIO. It's like antivirus on Windows: You'll be able to detect specific and known attacks, but you will be unable to detect or defend against unknown attacks. If your employer insists on this path, your due diligence is to make management aware of this fact and get their acceptance of the risks in writing. Every time I've confronted management with this, they've opted for the correct solution--output escaping.
================================================================
First off... watch out when using Jsoup in any kind of cleaning/filtering/input validation situation.
Upon receiving invalid HTML, like
<script>alert(1);
Jsoup will add in the missing </script> tag.
This means that if you're using Jsoup to "cleanse" HTML, it first transforms INVALID HTML into VALID HTML, before it begins processing.
"So the question is: Is there some way through which I can make sure that my input is stripped of all HTML and javascript code at the filter? Should I throw in some regex validations but is there any regex that will take care of the cases that are getting past the check I have right now?"
No. ESAPI's input validation is not appropriate for your use case, because HTML is not a regular language and ESAPI's validation rules are regular expressions. The fact is that you cannot do what you ask ("make sure that my input is stripped of all HTML and javascript code at the filter") and still have a functioning web application that requires user-defined HTML/JavaScript.
You can stack the deck in your favor a little bit: I would choose something like OWASP's HTML Sanitizer, and test your implementation against the XSS inputs listed here.
Many of those inputs are taken from OWASP's XSS Filter evasion cheat sheet, and will at least exercise your application against known attempts. But you will never be secure without output escaping.
===================UPDATE FROM COMMENTS==================
So the use case is to try to block all HTML and JavaScript. My recommendation is to implement Caja, since it encapsulates HTML, CSS, and JavaScript.
JavaScript, though, is also difficult to manage through input validation because, like HTML, JavaScript is a non-regular language. Additionally, each browser has its own implementation that deviates in different ways from the ECMAScript spec. If you want to protect your input from being interpreted, you'd ideally need a parser for each browser family attempting to interpret user input in order to block it.
All of that, when all you've really got to do is make sure that the output is escaped. Sorry to beat a dead horse, but I have to stress that output escaping is 100x more important than rejecting user input. You want both, but if forced to choose one or the other, output escaping is less work overall.
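To make the output-escaping point concrete, here is a minimal sketch of HTML escaping in plain Java. The class and method names are hypothetical, and it only covers values written into HTML element content or quoted attribute values. A value written into a JavaScript context (which is likely where the '-prompt``-' payload from the question lands) needs a JavaScript-specific encoder instead, such as ESAPI's encodeForJavaScript; in production a maintained encoder library is a safer choice than hand-rolled code like this.

```java
class HtmlEscaper {
    // Replaces the five characters that are significant in HTML text and quoted attributes.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The unterminated tag from the Jsoup example is rendered harmless as text.
        System.out.println(escapeHtml("<script>alert(1);"));
    }
}
```

The escaping is applied at the point where the value is written into the page, not when the request arrives, so the stored or forwarded data stays unchanged.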

How do I use packages I created in java in javascript?

I've got a package login that has a few classes inside of it...I thought my code would look like this:
importPackage(login);
var password = document.form1.pword.value;
var hash = JPP7.toHash(password);
where JPP7 is the class of mine that does the hashing. I am using a colleague's JavaScript code, but I know next to nothing about JavaScript. Am I going to have to give an absolute path to the package folder?

(Converted from my comment above)
You cannot (at least not easily, and not out of the box) use Java code in JavaScript. Despite the similarity in their names, the two languages are very different and incompatible with each other. See also What's the difference between Javascript and Java?.
There was once an implementation of the JVM in JavaScript, but it now seems to be dead. I wouldn't have recommended using it in any sort of production code anyway.

Include a dynamically created GWT widget in a different Web page

I have the following requirement: based on some user input, I need to generate an HTML form that the user can embed in a separate Web application. I thought of doing this with GWT since I'm familiar with it.
I'm clear on the input parsing and widget generation part. What I don't know how to do is how to export the root widget's (most probably a Panel) compiled code, so the user can take the code and include it in some other page.
Something like:
// exportCode() is hypothetical; it stands for the API I am looking for
String rootPanelCode = rootPanel.exportCode();
DialogBox codeDialog = new DialogBox();
codeDialog.setText(rootPanelCode);
Then the user copies the displayed code in some HTML file:
<script type="text/javascript" language="javascript">
//copied code goes here
</script>
Requiring a particular <div id="required_id" /> in the HTML file is not a problem. Or maybe javascript code is not enough, and the user is required to download a zip file with js and html files, copy those to a directory and reference them in the page. This again is not a problem.
Is my use case possible with GWT?
Thanks in advance.

I'd say... no :) Mainly because when a GWT application is started it first runs the bootstrap file, which in turn chooses the particular permutation for the current browser. So the code that you would get might include some stuff that wouldn't work in all browsers. This might be sidestepped by providing some sort of "lightweight" bootstrap file/method to download, but I doubt that would work.
Besides, the JS code you get is heavily optimized (and with GWT 2.0 the JS file contains JS, CSS and even images), for example, when possible strings are put into variables for performance reasons - but these variables are usually grouped together and put in one place in the compiled JS file, so even if you could somehow get to the code that creates your form, it could contain references to some undefined variables. In other words, the compiled code is meant to be used as a whole.
A more "elegant" solution (and more importantly, feasible with GWT ;)) is to export the form to some sort of abstract form/language, maybe JSON, so that you could parse/recreate it easily in the other web app:
{
    "form1": [
        { "label1": "value1" },
        { "label2": "value2" }
    ]
}
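On the generating side, producing a description in the JSON shape shown above might look like the following sketch. The class and method names are hypothetical, the form is assumed to be reducible to ordered label/value pairs, and the hand-written string quoting is only there to keep the example dependency-free; a JSON library would normally do this job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FormJsonExporter {
    // Wraps a string in double quotes and escapes the characters JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Serializes the fields, in iteration order, as an array of single-entry objects.
    static String export(String formName, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("{").append(quote(formName)).append(":[");
        boolean first = true;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (!first) sb.append(",");
            sb.append("{").append(quote(field.getKey())).append(":")
              .append(quote(field.getValue())).append("}");
            first = false;
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("label1", "value1");
        fields.put("label2", "value2");
        System.out.println(export("form1", fields));
    }
}
```

The other web application would then parse this description and build the form with whatever UI technology it already uses.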
Hmm, I just thought of a possible hack: with the right use of code splitting it might be possible to separate out the code responsible for the form creation. That might make it easier to "export", but it's not a complete solution (and I wouldn't recommend it; it's just an interesting, possible hack).

How to webscrape scholar.google.com in Java?

I want to write a Java function grabTopResults(String f) such that grabTopResults("automata theory") returns a list of the top 100 cited papers on scholar.google.com for "automata theory".
Does anyone have suggestions for what libraries will make my life easy?
Thanks!

As I'm sure Google can afford the bandwidth, I'll ignore the question of whether this is immoral/illegal/prohibited by Google's T&C.
First thing you need to do is figure out what HTTP request (or requests) you need to issue in order to obtain the page with the data you need. Once you've figured this out, use HttpClient to issue the same request from Java code. The previous link shows example code that explains how to do this.
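As a rough sketch of the first step, the request could be built like this. Note the assumptions: it uses the JDK's own java.net.http API (Java 11+) rather than the HttpClient library the answer links to, the class and method names are hypothetical, the /scholar path is assumed, and the q parameter name is taken from the TestPlan script further down the page. The example only builds the request and does not send it.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class ScholarRequestBuilder {
    // Builds a GET request for the search page; the query is form-encoded for the URL.
    static HttpRequest buildSearchRequest(String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        // Assumed URL layout: /scholar path with the search terms in "q".
        URI uri = URI.create("http://scholar.google.com/scholar?q=" + encoded);
        return HttpRequest.newBuilder(uri).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildSearchRequest("automata theory");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a call such as HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), after which the response body goes to the HTML parser.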
Once you've downloaded the content of the relevant page, you'll need to use a HTML parser to extract the data you're interested in. The Jericho parser suggested by peperg is a good choice.
If the Google police come knocking, you've never heard of me, OK?

I use http://jericho.htmlparser.net/docs/index.html . Google Scholar doesn't have an API ( http://code.google.com/p/google-ajax-apis/issues/detail?id=109 ). Of course, scraping is not allowed by Google (read the terms of use: automated requests are forbidden).

Below is a bit of example code which gets the titles on the first page using the open source product TestPlan. It is a standalone product, but if you really need it I could help you integrate it into your Java code (it is written in Java itself).
GotoURL http://scholar.google.com/
SubmitForm with
%Params:q% automate theory
end
set %Items% as response //div[@class='gs_r']
foreach %Item% in %Items%
set %Title% as selectIn %Item% h3
Notice %Title%
end
This produces output like the below (my IP is in Germany, thus a German response). Obviously you could format it however you like, or write it to a file; this is just a rough test.
00000000-00 GOTOURL http://scholar.google.com/
00000001-00 SUBMITFORM default
00000002-00 NOTICE [ZITATION] Stochastic complexity in statistical inquiry theory
00000003-00 NOTICE AUTOMATED THEORY FORMATION IN MATHEMATICS1
00000004-00 NOTICE Constraint generation via automated theory formation
00000005-00 NOTICE [BUCH] Automated theorem proving: after 25 years
00000006-00 NOTICE [BUCH] Introduction to the Theory of Computation
00000007-00 NOTICE [ZITATION] Computer-controlled systems: theory and design
00000008-00 NOTICE [BUCH] … , randomness & incompleteness: papers on algorithmic information theory
00000009-00 NOTICE [BUCH] Automatic control systems
00000010-00 NOTICE [BUCH] VLSI physical design automation: theory and practice
00000011-00 NOTICE Singular Control Systems.
