Different results by different methods on Chrome Performance Metrics? - java

I am writing code to automate the calculation of certain page performance metrics, but the page size I get differs depending on the method I use.
What I want to achieve is to read the values shown in this screenshot:
Methods I am using:
This method gives a different page load time and different transferred sizes:
totalBytes and netData return very different numbers, both far from what the screenshot shows.
public void testing() throws HarReaderException {
JavascriptExecutor js1=((JavascriptExecutor)driver);
try {
Thread.sleep(5000);
}catch(Exception e) {e.printStackTrace();}
String url=driver.getCurrentUrl();
System.out.println("Current URL :"+url);
long pageLoadTime= (Long)js1.executeScript("return (window.performance.timing.loadEventEnd-window.performance.timing.responseStart)");
long TTFB= (Long)js1.executeScript("return (window.performance.timing.responseStart-window.performance.timing.navigationStart)");
long endtoendRespTime= (Long)js1.executeScript("return (window.performance.timing.loadEventEnd-window.performance.timing.navigationStart)");
Date date = new Date();
//Timestamp ts=new Timestamp(date.getTime());
System.out.println("PageLoadTime Time :"+pageLoadTime);
System.out.println("TTFB :"+TTFB);
System.out.println("Customer perceived Time :"+endtoendRespTime);
System.out.println("timeStamp");
String scriptToExecute = "var performance = window.performance || window.mozPerformance || window.msPerformance || window.webkitPerformance || {}; var network = performance.getEntries() || {}; return network;";
String netData = ((JavascriptExecutor)driver).executeScript(scriptToExecute).toString();
System.out.println("Net data: " + netData);
String anotherScript = "return performance\n" +
" .getEntriesByType(\"resource\")\n" +
" .map((x) => x.transferSize)\n" +
" .reduce((a, b) => (a + b), 0);"; //I have tried encodedSize here as well, still gives different results
System.out.println("THIS IS HOPEFULLY THE TOTAL TRANSFER SIZE " + js1.executeScript((anotherScript)).toString());
int totalBytes = 0;
for (LogEntry entry : driver.manage().logs().get(LogType.PERFORMANCE)) {
if (entry.getMessage().contains("Network.dataReceived")) {
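// Match only the digits of the value; "(.*?)," fails when dataLength is the last key in the JSON object.
// I tried encodedLength and other methods but always get different results from the actual page
Matcher dataLengthMatcher = Pattern.compile("\"dataLength\":(\\d+)").matcher(entry.getMessage());
// group(1) throws IllegalStateException unless find() succeeded, so check it first
if (dataLengthMatcher.find()) {
totalBytes = totalBytes + Integer.parseInt(dataLengthMatcher.group(1));
}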
//Do whatever you want with the data here.
}
}
System.out.println(totalBytes);
}
Setting up the Selenium Chrome driver, enabling performance logging and the BrowserMob proxy:
@BeforeTest
public void setUp() {
// start the proxy
proxy = new BrowserMobProxyServer();
proxy.start(0);
//get the Selenium proxy object - org.openqa.selenium.Proxy;
Proxy seleniumProxy = ClientUtil.createSeleniumProxy(proxy);
// configure it as a desired capability
DesiredCapabilities capabilities = DesiredCapabilities.chrome();
LoggingPreferences logPrefs = new LoggingPreferences();
logPrefs.enable(LogType.PERFORMANCE, Level.ALL);
capabilities.setCapability(CapabilityType.LOGGING_PREFS, logPrefs);
capabilities.setCapability(CapabilityType.PROXY, seleniumProxy);
ChromeOptions options = new ChromeOptions();
options.addArguments("--incognito");
capabilities.setCapability(ChromeOptions.CAPABILITY, options);
//set chromedriver system property
System.setProperty("webdriver.chrome.driver", driverPath);
driver = new ChromeDriver(capabilities);
// enable more detailed HAR capture, if desired (see CaptureType for the complete list)
proxy.enableHarCaptureTypes(CaptureType.REQUEST_CONTENT, CaptureType.RESPONSE_CONTENT);
}
Methods I am using to analyze the page:
This method is supposed to return the Load time shown in the Chrome inspector, but it always returns a smaller number (I think it returns the time of the last response received rather than DOMContentLoaded or Load):
public double calculatePageLoadTime(String filename) throws HarReaderException {
HarReader harReader = new HarReader();
de.sstoehr.harreader.model.Har har = harReader.readFromFile(new File(filename));
HarLog log = har.getLog();
// Access all pages elements as an object
long startTime = log.getPages().get(0).getStartedDateTime().getTime();
// Access all entries elements as an object
List<HarEntry> hentry = log.getEntries();
long loadTime = 0;
int entryIndex = 0;
//Output "response" code of entries.
for (HarEntry entry : hentry)
{
long entryLoadTime = entry.getStartedDateTime().getTime() + entry.getTime();
if(entryLoadTime > loadTime){
loadTime = entryLoadTime;
}
entryIndex++;
}
long loadTimeSpan = loadTime - startTime;
Double webLoadTime = ((double)loadTimeSpan) / 1000;
double webLoadTimeInSeconds = Math.round(webLoadTime * 100.0) / 100.0;
return webLoadTimeInSeconds;
}
I am getting the total number of requests by reading the HAR file for the page, but for some reason it is always 10% less than the actual count:
public int getNumberRequests(String filename) throws HarReaderException {
HarReader harReader = new HarReader();
de.sstoehr.harreader.model.Har har = harReader.readFromFile(new File(filename));
HarLog log = har.getLog();
return log.getEntries().size();
}
Testing this on Google gives very different results with each method, usually 10-200% off from the correct numbers.
Why does this happen? Is there a simple way to get these metrics properly from Chrome, or a library that makes this easier? My task is to automate performance analysis on thousands of pages.

I analyzed this repeatedly on my system and came to the following conclusion:
The resource size you are currently reading is the amount of resources fetched up to the moment the page load event fires.
To overcome this, keep capturing the resource size after the page load event as well, until it stabilizes. It will then match the actual console values.
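The "capture until it stabilizes" idea can be sketched as a small polling helper. This is a minimal sketch under assumptions, not a Selenium or Chrome API: `StableValuePoller`, `waitUntilStable`, and its parameters are hypothetical names, and the metric is passed in as a `LongSupplier` so the helper stays independent of the browser.

```java
import java.util.function.LongSupplier;

/** Polls a numeric metric until it stops changing (hypothetical helper, not a Selenium API). */
final class StableValuePoller {

    /**
     * Samples the metric until the same value has been read requiredStableReads
     * times in a row, or until maxReads samples have been taken.
     * Returns the last value read in either case.
     */
    static long waitUntilStable(LongSupplier metric, int requiredStableReads,
                                int maxReads, long pauseMillis) {
        long last = metric.getAsLong();
        int stableReads = 1;
        for (int reads = 1; reads < maxReads && stableReads < requiredStableReads; reads++) {
            sleep(pauseMillis);
            long current = metric.getAsLong();
            // An unchanged value extends the stable streak; a changed value restarts it.
            stableReads = (current == last) ? stableReads + 1 : 1;
            last = current;
        }
        return last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the metric to settle", e);
        }
    }
}
```

With the code from the question, the supplier might be `() -> ((Number) js1.executeScript(anotherScript)).longValue()`, called for example as `waitUntilStable(supplier, 3, 30, 1000)` to require three identical reads one second apart. The read counts and the pause are guesses to tune per site; a page that keeps fetching in the background may never settle, which is why `maxReads` caps the wait.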

Related

Jsoup fetches wrong results

I am working with Jsoup. The URL works well in the browser, but Jsoup fetches the wrong result on the server. I set maxBodySize to 0 as well, but it still only gets the first few tags. Moreover, the data even differs from what the browser shows. Can you give me a hand?
String queryUrl = "http://www.juso.go.kr/addrlink/addrLinkApi.do?confmKey=U01TX0FVVEgyMDE3MDYyODE0MTYyMzIyMTcw&currentPage=1&countPerPage=20&keyword=연남동";
Document document = Jsoup.connect(queryUrl).maxBodySize(0).get();
Are you aware that this endpoint returns paginated data? Your URL asks for 20 entries from the first page. I assume that the order of these entries is not specified so you can get different data each time you call this endpoint - check if there is a URL parameter that can determine specific sort order.
Anyway, to read all 2037 entries you have to fetch the pages sequentially. Examine the following code:
final String baseUrl = "http://www.juso.go.kr/addrlink/addrLinkApi.do";
final String key = "U01TX0FVVEgyMDE3MDYyODE0MTYyMzIyMTcw";
final String keyword = "연남동";
final int perPage = 100;
int currentPage = 1;
while (true) {
System.out.println("Downloading data from page " + currentPage);
final String url = String.format("%s?confmKey=%s&currentPage=%d&countPerPage=%d&keyword=%s", baseUrl, key, currentPage, perPage, keyword);
final Document document = Jsoup.connect(url).maxBodySize(0).get();
final Elements jusos = document.getElementsByTag("juso");
System.out.println("Found " + jusos.size() + " juso entries");
if (jusos.size() == 0) {
break;
}
currentPage += 1;
}
In this case we ask for 100 entries per page (the maximum this endpoint supports) and call it 21 times, for as long as a request for a given page returns any <juso> element. I hope this helps solve your problem.

Looping stops when assertion fails

There is an Excel sheet in which all 16 URLs are listed in one column. Once a page has loaded, I need to verify that its title matches the expected title, which is also stored in the Excel sheet. I am able to do this with a for loop. It runs through all URLs if every check passes, but stops at the first failure. I need it to run to completion and report which URLs passed and which failed. I have written the code below.
rowCount = suite_pageload_xls.getRowCount("LoadURL");
for(i=2,j=2;i<=rowCount;i++,j++) {
String urlData = suite_pageload_xls.getCellData("LoadURL", "URL", i);
Thread.sleep(3000);
long start = System.currentTimeMillis();
APP_LOGS.debug(start);
driver.navigate().to(urlData);
String actualtitle = driver.getTitle();
long finish = System.currentTimeMillis();
APP_LOGS.debug(finish);
APP_LOGS.debug(urlData+ "-----" +driver.getTitle());
long totalTime = finish - start;
APP_LOGS.debug("Total time taken is "+totalTime+" ms");
String expectedtitle = suite_pageload_xls.getCellData("LoadURL", "Label", j);
Assert.assertEquals(actualtitle, expectedtitle);
if (actualtitle.equalsIgnoreCase(expectedtitle)) {
APP_LOGS.debug("PAGE LABEL MATCHING....");
String resultpass = "PASS";
APP_LOGS.debug(resultpass);
APP_LOGS.debug("***********************************************************");
} else {
APP_LOGS.debug("PAGE LABEL NOT MATCHING....");
String resultfail = "FAIL";
APP_LOGS.debug(resultfail);
APP_LOGS.debug("***********************************************************");
}
}
Kindly help me in this regard.
This is the correct behavior of the assertion: it throws an AssertionError when the assertion fails, which ends the loop.
You could store the actual titles and expected titles in arrays and perform the assertions all at once.
For better assertions I suggest you try AssertJ: you can directly compare two lists, the actual and the expected, and it will report the complete difference.
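The "collect first, assert once" approach can be sketched without any extra library. This is a minimal illustration under assumptions: `TitleChecks` and its methods are hypothetical names, not part of TestNG or AssertJ. The loop records every mismatch and the single failure is raised only after all URLs have been visited.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Collects mismatches instead of failing on the first one (hypothetical helper). */
final class TitleChecks {
    private final List<String> failures = new ArrayList<>();
    private int checked = 0;

    /** Records one comparison; never throws, so the calling loop keeps running. */
    void check(String url, String expectedTitle, String actualTitle) {
        checked++;
        if (!Objects.equals(expectedTitle, actualTitle)) {
            failures.add(url + ": expected \"" + expectedTitle
                    + "\" but was \"" + actualTitle + "\"");
        }
    }

    /** Returns a copy of the mismatches recorded so far. */
    List<String> failures() {
        return new ArrayList<>(failures);
    }

    /** Call once after the loop: fails with a report that lists every mismatch. */
    void assertAll() {
        if (!failures.isEmpty()) {
            throw new AssertionError(failures.size() + " of " + checked
                    + " titles did not match:\n" + String.join("\n", failures));
        }
    }
}
```

In the loop from the question, `Assert.assertEquals(actualtitle, expectedtitle)` would become `checks.check(urlData, expectedtitle, actualtitle)`, with one `checks.assertAll()` after the loop. TestNG's `SoftAssert` and AssertJ's `SoftAssertions` are built around the same idea if a library is preferred.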

Dataframes are slow to parse through small amount of data

I have 2 classes doing a similar task in Apache Spark but the one using data frame is many times slower than the "regular" one using RDD. (30x)
I would like to use data frame since it will eliminate a lot of code and classes we have but obviously I can't have it be that much slower.
The data set is not big. We have 30-odd files, each containing JSON data about events triggered by activities in another piece of software. Each file holds between 0 and 100 events.
A data set with 82 events will take about 5 minutes to be processed with data frames.
Sample code:
public static void main(String[] args) throws ParseException, IOException {
SparkConf sc = new SparkConf().setAppName("POC");
JavaSparkContext jsc = new JavaSparkContext(sc);
SQLContext sqlContext = new SQLContext(jsc);
conf = new ConfImpl();
HashSet<String> siteSet = new HashSet<>();
// last month
Date yesterday = monthDate(DateUtils.addDays(new Date(), -1)); // method that returns the date on the first of the month
Date startTime = startofYear(new Date(yesterday.getTime())); // method that returns the date on the first of the year
// list all the sites with a metric file
JavaPairRDD<String, String> allMetricFiles = jsc.wholeTextFiles("hdfs:///somePath/*/poc.json");
for ( Tuple2<String, String> each : allMetricFiles.toArray() ) {
logger.info("Reading from " + each._1);
DataFrame metric = sqlContext.read().format("json").load(each._1).cache();
metric.count();
boolean siteNameDisplayed = false;
boolean dateDisplayed = false;
do {
Date endTime = DateUtils.addMonths(startTime, 1);
HashSet<Row> totalUsersForThisMonth = new HashSet<>();
for (String dataPoint : Conf.DataPoints) { // This is a String[] with 4 elements for this specific case
try {
if (siteNameDisplayed == false) {
String siteName = parseSiteFromPath(each._1); // method returning a parsed String
logger.info("Data for site: " + siteName);
siteSet.add(siteName);
siteNameDisplayed = true;
}
if ( dateDisplayed == false ) {
logger.info("Month: " + formatDate(startTime)); // SimpleFormatDate("yyyy-MM-dd")
dateDisplayed = true;
}
DataFrame lastMonth = metric.filter("event.eventId=\"" + dataPoint + "\"").filter("creationDate >= " + startTime.getTime()).filter("creationDate < " + endTime.getTime()).select("event.data.UserId").distinct();
logger.info("Distinct for last month for " + dataPoint + ": " + lastMonth.count());
totalUsersForThisMonth.addAll(lastMonth.collectAsList());
} catch (Exception e) {
// data does not fit the expected model so there is nothing to print
}
}
logger.info("Total Unique for the month: " + totalUsersForThisMonth.size());
startTime = DateUtils.addMonths(startTime, 1);
dateDisplayed = false;
} while ( startTime.getTime() < commonTmsMetric.monthDate(yesterday).getTime());
// reset startTime for the next site
startTime = commonTmsMetric.StartofYear(new Date(yesterday.getTime()));
}
}
There are a few things that are not efficient in this code, but judging by the logs they only add a few seconds to the whole processing.
I must be missing something big.
I have run this with 2 executors and with 1 executor, and the difference is 20 seconds out of 5 minutes.
This is running with Java 1.7 and Spark 1.4.1 on Hadoop 2.5.0.
Thank you!
There are a few things, but it's hard to say without seeing the breakdown of the different tasks and their times. The short version is that you are doing way too much work in the driver and not taking advantage of Spark's distributed capabilities.
For example, you are collecting all of the data back to the driver program (toArray() and your for loop). Instead, you should just point Spark SQL at the files it needs to load.
For the operators, it seems like you're doing many aggregations in the driver. Instead, you could use the driver to generate the aggregations and have Spark SQL execute them.
Another big difference between your in-house code and the DataFrame code is going to be Schema inference. Since you've already created classes to represent your data, it seems likely that you know the schema of your JSON data. You can likely speed up your code by adding the schema information at read time so Spark SQL can skip inference.
I'd suggest re-visiting this approach and trying to build something using Spark SQL's distributed operators.

executePhantomJS on Remotewebdriver

If I use a PhantomJSDriver directly, it works perfectly:
driver = new PhantomJSDriver(capabilities);
driver.executePhantomJS( "var page = this;");
How can I make it work with RemoteWebDriver?
driver = new RemoteWebDriver(capabilities);
driver.executePhantomJS( "var page = this;");
UPDATE
My code
capabilities = DesiredCapabilities.phantomjs();
driver = new RemoteWebDriver(capabilities);
driver.executePhantomJS( "var page = this; binary =0;mimetype=''; count = 0;id=0; bla = '{';"
+"page.onResourceReceived = function(request) {"
+ "if(id !== request.id){"
+"bla += '\"'+count+ '\":'+JSON.stringify(request, undefined, 4)+',';"
+"if(request.contentType.substring(0, 11) =='application'){"
+"console.log(request.contentType);"
+ "mimetype = request.contentType;"
+ "binary++;"
+ "}"
+"count++;"
+ "id = request.id;"
+ "}"
+"};");
Java gives the error: The method executePhantomJS(String) is undefined for the type RemoteWebDriver.
If I use executeScript, it does not work.
I need to run 100 tests in parallel, so I can't use a local web driver.
I guess that you want to run PhantomJSDriver on your Selenium Grid. This is how it works for me (C# factory implementation):
public IWebDriver CreateWebDriver(string identifier)
{
if (identifier.ToLower().Contains("ghostdriver"))
{
return new RemoteWebDriver(new Uri(ConfigurationManager.AppSettings["Selenium.grid.Url"]), DesiredCapabilities.PhantomJS());
}
}
Or try this one:
Console.WriteLine("Creating GhostDriver (PhantomJS) driver.");
//Temporary commented for testing purposes
IWebDriver ghostDriver = new PhantomJSDriver("..\\..\\..\\MyFramework\\Drivers");
ghostDriver.Manage().Window.Maximize();
//ghostDriver.Manage().Window.Size = new Size(1920, 1080);
ghostDriver.Manage()
.Timeouts()
.SetPageLoadTimeout(new TimeSpan(0, 0, 0,
Convert.ToInt32(ConfigurationManager.AppSettings["Driver.page.load.time.sec"])));
return ghostDriver;
In case you wonder why ConfigurationManager is there: I avoid hard-coded values, so they are read from the App.config file.
If you want to run PhantomJS scripts with RemoteWebDriver (for using the Selenium Grid), I used the following solution (only C# unfortunately):
I had to extend the RemoteWebDriver so it can run PhantomJS commands:
public class RemotePhantomJsDriver : RemoteWebDriver
{
public RemotePhantomJsDriver(Uri remoteAddress, ICapabilities desiredCapabilities) : base(remoteAddress, desiredCapabilities)
{
this.CommandExecutor.CommandInfoRepository.TryAddCommand("executePhantomScript", new CommandInfo("POST", $"/session/{this.SessionId.ToString()}/phantom/execute"));
}
public Response ExecutePhantomJSScript(string script, params object[] args)
{
return base.Execute("executePhantomScript", new Dictionary<string, object>() { { "script", script }, { "args", args } });
}
}
After this you can use the ExecutePhantomJSScript method to run any JavaScript code that needs to interact with the PhantomJS API. The following example gets the page title through the PhantomJS API (Web Page Module):
RemotePhantomJsDriver driver = new RemotePhantomJsDriver(new Uri("http://hub_host:hub_port/wd/hub"), DesiredCapabilities.PhantomJS());
driver.Navigate().GoToUrl("http://stackoverflow.com");
var result = driver.ExecutePhantomJSScript("var page = this; return page.title");
Console.WriteLine(result.Value);
driver.Quit();

Selenium webdriver not logging into live.com

I've been attempting to log in to live.com with Selenium WebDriver, but every attempt results in an "invalid email or password" error. The credentials are correct; I tested them in a physical browser. I'm not certain, but I believe sendKeys is not working. Code below.
protected static Map<String,String> settings = new HashMap<String,String>();
public static WebDriver pwb;
In Main:
// Default settings
settings.put("browser","chrome");
settings.put("os","WINDOWS8");
// Sort arguments and put into HashMap
grab(args);
// Set browser settings
DesiredCapabilities cap = DesiredCapabilities.htmlUnitWithJs();
cap.setBrowserName(settings.get("browser"));
cap.setPlatform(Platform.extractFromSysProperty(settings.get("os")));
d.log(cap.asMap().toString());
// Create web browser object
pwb = new HtmlUnitDriver(cap);
login();
In login():
Load("http://login.live.com",0); // custom load function
d.log(pwb.getTitle());
if(pwb.getTitle().contains("Sign in to your")){ // Full login
d.log("Executing full login.\nEmail:" + settings.get("email") + "\nPassword:"+settings.get("password"));
// if (pwb instanceof JavascriptExecutor) {
// ((JavascriptExecutor) pwb).executeScript("document.getElementsByName('login')[0].click();"); //value = " + settings.get("email") + "");
// ((JavascriptExecutor) pwb).executeScript("document.getElementsByName('passwd')[0].click();"); //.value = " + settings.get("password") + "");
// }
// TODO get login working
pwb.findElement(By.id("i0116")).clear();
pwb.findElement(By.id("i0116")).sendKeys(settings.get("email"));
pwb.findElement(By.id("i0118")).clear();
pwb.findElement(By.id("i0118")).sendKeys(settings.get("password"));
pwb.findElement(By.name("SI")).click();
// Attempted = true;
}
If any more information is needed, please let me know.
