How to webscrape information with Java

I've been doing a lot of research on web scraping. I'm trying to load the stats of different characters from a game I play into my program, and as there are over 150 of them I can't type them all in manually. One of the pages I'm trying to scrape is the Kassadin page whose URL appears in the code below.
I'm trying to get these values:
HEALTH, ATTACK DAMAGE, HEALTH REGEN., ATTACK SPEED, MANA, ARMOR, RANGE, MOV. SPEED
So far what I have is below. It uses the JSoup library and prints the page as a single line of text, and I'm not sure how to extract the values from that string.
package WebScraper;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
public class getWikiStats {
    public void getChampStats() {
        try {
            // Fetch and parse the page
            Document doc = Jsoup.connect("http://leagueoflegends.wikia.com/wiki/Kassadin").get();
            // text() flattens the whole document into one whitespace-normalized string
            String website = doc.text();
            System.out.println(website);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Is there a better way to do this? Or is there a way to get the variables from that line of text?
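On the second part of the question (getting the values out of the single line of text), one possible approach is a regular expression per label. The sketch below is only an illustration: it assumes each label in the flattened text is immediately followed by its number (for example "Health 564"), which may not match the real page, and the class name `StatTextParser` and the sample string are made up for this example. Selecting the specific elements with Jsoup's `select(...)` is usually more robust than parsing flattened text, but that depends on the page markup, which is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls "LABEL number" pairs out of a flattened page text.
final class StatTextParser {
    // Labels taken from the question.
    static final String[] LABELS = {
        "HEALTH", "ATTACK DAMAGE", "HEALTH REGEN.", "ATTACK SPEED",
        "MANA", "ARMOR", "RANGE", "MOV. SPEED"
    };

    static Map<String, Double> parse(String pageText) {
        Map<String, Double> stats = new LinkedHashMap<>();
        for (String label : LABELS) {
            // Assumption: the number follows the label directly, optionally after a colon.
            Pattern pattern = Pattern.compile(
                    Pattern.quote(label) + "\\s*:?\\s*(\\d+(?:\\.\\d+)?)",
                    Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(pageText);
            if (matcher.find()) {
                stats.put(label, Double.parseDouble(matcher.group(1)));
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        // Made-up sample text, not copied from the real page.
        String sample = "Health 564 (+78) Attack damage 58.8 Health regen. 6 Mov. speed 335";
        System.out.println(parse(sample));
    }
}
```

Labels that are not found are simply left out of the map, so the caller can tell a missing stat from a zero.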

How can I figure out what elements to use as parameters for cssQuery

I would really like to understand how to actually extract the data I want from a website. I have done it with an IMDb top chart, following a tutorial on YouTube, but I am still confused about how to know what syntax to pass as the row.select parameters.
I have tried doing the same with other websites such as Best Buy, getting the price and name of specific laptops, and I failed, most likely because I passed the wrong parameters (cssQuery).
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import javax.swing.*;
import java.io.IOException;
public class Scraper {
    static String title;
    static final String url = "https://www.imdb.com/chart/top";
    public static void main(String args[]) throws IOException {
        final Document document = Jsoup.connect(url).get();
        // Each <tr> of the chart table is one movie
        for (Element row : document.select("table.chart.full-width tr")) {
            // Selectors are evaluated relative to the current row
            final String title = row.select(".titleColumn a").text();
            final String rating = row.select(".imdbRating").text();
            System.out.println(title);
            System.out.println(rating);
        }
    }
}
From what I understand of your question, you don't know which CSS class to put in your code. To find it, inspect the page: right-click on it and choose Inspect (Inspect Element), then press Ctrl+Shift+C and hover over any element on the page to see its tag and class names in the developer tools.

Android's BitmapFactory returns corrupted image

I am participating in a project where we are building an app that solves Rubik's Cubes. Initially we started with a desktop app using JavaFX, but we decided to switch over to an Android app.
Since I already implemented a working model at least for color recognition, I wanted to reuse that and just build another UI around it. That is where I am stuck right now because I cannot even get Android's bitmap API to work. Unfortunately it looks like I need to stick with it since Swing/AWT/JavaFX image libraries are not available.
So I implemented a JUnit test which I cleaned up a bit:
package de.uniks.rubiksapp;
import android.content.res.Resources;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.robolectric.RobolectricTestRunner;
import org.robolectric.RuntimeEnvironment;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
@RunWith(RobolectricTestRunner.class)
public class TestBitmap {
    @Test
    public void testBitmap() {
        Resources resources = RuntimeEnvironment.application.getResources();
        InputStream testImageStream = resources.openRawResource(R.drawable.test_front);
        Bitmap testImageBitmap = BitmapFactory.decodeStream(testImageStream);
        //Bitmap testImageBitmap = BitmapFactory.decodeResource(resources, R.drawable.test_front);
        // --- Pixel readout would be here --- //
        System.out.println(testImageBitmap.getWidth() + "x" + testImageBitmap.getHeight() + "px");
        FileOutputStream out = null;
        try {
            out = new FileOutputStream("test.png");
            testImageBitmap.compress(Bitmap.CompressFormat.PNG, 100, out);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (out != null) {
                    out.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
My problem currently is that the bitmaps I get from BitmapFactory seem to be corrupted. When I try to read specific pixels using Bitmap.getPixel(), they are always black. So I tried saving the image back to disk, which works, but the resulting files are only around 40 bytes in size and cannot be opened.
Initially, I tried using BitmapFactory.decodeResource() instead of BitmapFactory.decodeStream(), but that always returns images with a size of 100x100px, although my source images are 1240x800px, even when I use BitmapFactory.Options and set inScaled = false. At least the size is correct when using BitmapFactory.decodeStream().
Thanks for any help!

How to get text of a toaster message in mobile app using Appium

I am trying to verify a toast message in an Android mobile app, but I am not able to get its text because it doesn't show up in uiautomatorviewer.
I found some information suggesting that it can be done with OCR, by taking screenshots and extracting the text from them.
Can anyone explain step by step how to do this using Java in an Appium project?
You can follow the instructions at the links below to install Tesseract on your machine:
For Mac: http://emop.tamu.edu/Installing-Tesseract-Mac
For Windows: http://emop.tamu.edu/Installing-Tesseract-Windows8
After installing Tesseract on your machine, you need to add the Tesseract Java library as a dependency in your project. If you are using Maven, adding the dependency below will work:
<dependency>
    <groupId>org.bytedeco.javacpp-presets</groupId>
    <artifactId>tesseract</artifactId>
    <version>3.04-1.1</version>
</dependency>
Also, 'Step 3' mentioned by Ivan does not need to be followed.
If you are using TestNG, the Tesseract API needs to be initialised only once. Instead of initialising it every time, initialise it in a 'BeforeTest', 'BeforeSuite' or 'BeforeClass' method, whichever fits your framework, and close the API in the matching 'AfterTest', 'AfterSuite' or 'AfterClass' method.
Below is the code that I have written to achieve it.
import static org.bytedeco.javacpp.lept.pixDestroy;
import static org.bytedeco.javacpp.lept.pixRead;
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
import org.bytedeco.javacpp.lept.PIX;
import org.bytedeco.javacpp.tesseract.TessBaseAPI;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.testng.annotations.AfterSuite;
import org.testng.annotations.BeforeSuite;
public class BaseTest {
    // `driver` (used below) is assumed to be the Appium driver provided elsewhere in the framework
    static TessBaseAPI api = new TessBaseAPI();
    @BeforeSuite
    public void beforeSuit() throws IOException {
        // Start every run with an empty screenshots directory
        File screenshotsDirec = new File("target/screenshots");
        if (screenshotsDirec.exists())
            FileUtils.forceDelete(screenshotsDirec);
        FileUtils.forceMkdir(screenshotsDirec);
        System.out.println("Initializing TessEract library");
        if (api.Init("/opt/local/share", "eng") != 0) {
            System.err.println("Could not initialize tesseract.");
        }
    }
    public synchronized boolean verifyToastMessage(String msg)
            throws IOException {
        TakesScreenshot takeScreenshot = ((TakesScreenshot) driver);
        // Take all screenshots first, then run OCR, so the toast is not missed while OCR is running
        File[] screenshots = new File[5];
        for (int i = 0; i < screenshots.length; i++) {
            screenshots[i] = takeScreenshot.getScreenshotAs(OutputType.FILE);
        }
        String outText;
        boolean isMsgContains = false;
        for (int i = 0; i < screenshots.length; i++) {
            PIX image = pixRead(screenshots[i].getAbsolutePath());
            api.SetImage(image);
            outText = api.GetUTF8Text().getString().replaceAll("\\s", "");
            System.out.println(outText);
            // The OCR text has its whitespace removed, so strip it from the expected message too
            isMsgContains = outText.contains(msg.replaceAll("\\s", ""));
            pixDestroy(image);
            if (isMsgContains) {
                break;
            }
        }
        return isMsgContains;
    }
    @AfterSuite()
    public void afterTest() {
        try {
            api.close();
        } catch (Exception e) {
            api.End();
            e.printStackTrace();
        }
    }
}
I would also like to add that reading and verifying toast messages this way is not very reliable: in one of my tests this code captures the toast message successfully, while in another it fails because the screenshot capture only starts once the toast has already disappeared. That is why I tried to make this code as efficient as possible, but even that did not solve the problem.
Follow this discussion on Appium forum: https://discuss.appium.io/t/verifying-toast/3676.
Basic steps to verify a Toaster are:
Perform action to trigger the toast message to appear on screen
Take x number of screenshots
Increase resolutions of all screenshots
Use Tesseract OCR to detect the toast message.
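The "increase resolution" step in the list above can be sketched with the standard `java.awt` image classes. This is a minimal illustration rather than the method used in the linked discussion: the class name `ScreenshotScaler`, the scale factor and the choice of bicubic interpolation are all assumptions, and whether upscaling actually improves OCR accuracy depends on the screenshots.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper: enlarges a screenshot before it is handed to the OCR engine.
final class ScreenshotScaler {
    static BufferedImage upscale(BufferedImage source, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int width = source.getWidth() * factor;
        int height = source.getHeight() * factor;
        BufferedImage scaled = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = scaled.createGraphics();
        try {
            // Bicubic interpolation is an assumption; other hints may suit OCR better.
            graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            graphics.drawImage(source, 0, 0, width, height, null);
        } finally {
            graphics.dispose();
        }
        return scaled;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ScreenshotScaler <input.png> <output.png>");
            return;
        }
        BufferedImage source = ImageIO.read(new File(args[0]));
        if (source == null) {
            System.err.println("Not a readable image: " + args[0]);
            return;
        }
        // Doubling the size is an arbitrary choice for this sketch.
        ImageIO.write(upscale(source, 2), "png", new File(args[1]));
    }
}
```

The enlarged file can then be passed to `pixRead` in place of the original screenshot.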
Refer to this repo for using the Java OCR library (see the example at the bottom):
import org.bytedeco.javacpp.*;
import static org.bytedeco.javacpp.lept.*;
import static org.bytedeco.javacpp.tesseract.*;
public class BasicExample {
    public static void main(String[] args) {
        BytePointer outText;
        TessBaseAPI api = new TessBaseAPI();
        // Initialize tesseract-ocr with English, without specifying tessdata path
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize tesseract.");
            System.exit(1);
        }
        // Open input image with leptonica library
        PIX image = pixRead(args.length > 0 ? args[0] : "/usr/src/tesseract/testing/phototest.tif");
        api.SetImage(image);
        // Get OCR result
        outText = api.GetUTF8Text();
        System.out.println("OCR output:\n" + outText.getString());
        // Destroy used object and release memory
        api.End();
        outText.deallocate();
        pixDestroy(image);
    }
}

Java Properties: loading null from a .txt file

So I'm currently working on reading information from a text file for the first time, and from what I have pieced together, the following code should work and return 100 and 16:
package Utility;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;
public class textReader {
    public textReader()
    {}
    public Object fetchElement(String fileName, String keyName)
    {
        Properties properties = new Properties();
        try {
            properties.load(new FileInputStream("P:/Real_Time_Survival/Real_Time_Survivial_Game/assets" + fileName));
        } catch (IOException e) {
        }
        return properties.getProperty("keyName");
    }
}
but when called from the main class with
textReader ready = new textReader();
ready.fetchElement("Sprites/ExampleSprite/Default/SpriteData.txt", "FrameDuration");
ready.fetchElement("Sprites/ExampleSprite/Default/SpriteData.txt", "AnimationFrames");
it returns null (I had the system print those lines, but cut that out due to formatting errors). Any idea why this won't work?
I'm going to stick my neck out and guess you left off a "/" after "assets".
I, um, put the keyName variable in as a string literal.
Yep this is solved, 10/10 best dumb mistakes ever
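For reference, the two fixes mentioned in the comments above (the missing "/" after "assets" and passing the `keyName` variable instead of the literal "keyName") can be sketched as follows. This is a minimal illustration, not the poster's final code: the class name `SpriteDataReader` and the `Reader` overload are assumptions added so the lookup can be tried without the real file, and the sample keys are the ones from the question.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical rewrite of the question's textReader class.
final class SpriteDataReader {
    // Resolving the file name against the directory avoids forgetting the "/" separator.
    static String fetchElement(Path assetsDir, String fileName, String keyName) {
        Path file = assetsDir.resolve(fileName);
        try (Reader reader = Files.newBufferedReader(file)) {
            return fetchElement(reader, keyName);
        } catch (IOException e) {
            // Surface the failure instead of swallowing it, so a wrong path is visible.
            throw new UncheckedIOException("Could not read " + file, e);
        }
    }

    static String fetchElement(Reader reader, String keyName) {
        Properties properties = new Properties();
        try {
            properties.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Use the variable, not the string literal "keyName".
        return properties.getProperty(keyName);
    }

    public static void main(String[] args) {
        String data = "FrameDuration=100\nAnimationFrames=16\n";
        System.out.println(fetchElement(new StringReader(data), "FrameDuration"));   // prints 100
        System.out.println(fetchElement(new StringReader(data), "AnimationFrames")); // prints 16
    }
}
```

Rethrowing the `IOException` instead of leaving the catch block empty would have exposed the path problem immediately, since the original code silently returned null when the file could not be opened.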

Upload image twitter4j

I introduced myself to Twitter4j yesterday, and am now testing out features for an upcoming program of mine. As the title suggests, I am trying to upload an image to Twitter, without any luck. Here's my code:
import static java.awt.Toolkit.getDefaultToolkit;
import static javax.swing.JOptionPane.ERROR_MESSAGE;
import static javax.swing.JOptionPane.showMessageDialog;
import java.awt.Image;
import java.io.File;
import javax.swing.Icon;
import javax.swing.ImageIcon;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.examples.tweets.UploadMultipleImages;
import twitter4j.media.ImageUpload;
import twitter4j.media.ImageUploadFactory;
public final class UpdateStatus {
    static File file = new File("/images/Done.jpg");
    public static void main(String[] args) {
        for(int i=0;i<2;i++){
            Twitter twitter = new TwitterFactory().getInstance();
            Status status=null;
            try {
                ImageUpload.upload(file,"22");
            } catch (TwitterException e) {
                System.err.println("Shit...");
                System.exit(3);
            }
        }
        System.out.println("Done");
    }
}
The image I'm trying to upload is Done.jpg, and is in a folder in the package. I've used this method for images in other programs, so I am pretty sure it works. Though, this gives me an error message before I run the code, saying "Cannot make a static reference to the non-static method upload(File, String) from the type ImageUpload". Any ideas that could help me? :D
You need to ensure the following before testing your code:
Register your app at https://apps.twitter.com/ and get OAuth tokens to be able to connect your app to Twitter and perform the desired action.
You will get a consumer key, a consumer secret, an access token and an access token secret.
If you want to post updates, please ensure you configure your app permissions to have Read and Write access; the default access is Read Only.
After you have the required access tokens, you need to create a Twitter instance using those tokens. This instance can then be used to perform the required action. See the sample code below to upload an image:
// Imports needed by this snippet
import java.io.File;
import twitter4j.StatusUpdate;
import twitter4j.Twitter;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;
// Configure OAuth credentials (replace the placeholder strings with your own values)
ConfigurationBuilder twitterConfigBuilder = new ConfigurationBuilder();
twitterConfigBuilder.setDebugEnabled(true);
twitterConfigBuilder.setOAuthConsumerKey("consumerkey");
twitterConfigBuilder.setOAuthConsumerSecret("consumersecret");
twitterConfigBuilder.setOAuthAccessToken("accesstoken");
twitterConfigBuilder.setOAuthAccessTokenSecret("accesstokensecret");
Twitter twitter = new TwitterFactory(twitterConfigBuilder.build()).getInstance();
String statusMessage = "Watch out this interesting offer I came across today";
File file = new File("/images/Done.jpg");
StatusUpdate status = new StatusUpdate(statusMessage);
status.setMedia(file); // set the image to be uploaded here.
twitter.updateStatus(status); // throws TwitterException on failure
Hope this helps.
ImageUpload.upload is not a static method, but an instance method.
You need to create an instance of ImageUpload, and call the method from the instance.
Checking the documentation of ImageUpload, it is an interface. So you'll need to instantiate a class that implements ImageUpload.
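The static-versus-instance point can be illustrated without Twitter4j at all. In the sketch below, `Uploader` and `LoggingUploader` are hypothetical stand-ins for an interface such as `ImageUpload` and one of its implementations; they are not part of Twitter4j. The question's imports already include `ImageUploadFactory`, which appears to be the intended way to obtain a real `ImageUpload` instance, but that is not shown here.

```java
import java.io.File;

// Hypothetical stand-in for an interface such as ImageUpload.
interface Uploader {
    // Instance method: it can only be called on an object, not on the type itself.
    String upload(File image, String message);
}

// Hypothetical implementation; a real one would send the file to a service.
final class LoggingUploader implements Uploader {
    @Override
    public String upload(File image, String message) {
        return "uploaded " + image.getName() + " with message " + message;
    }
}

final class UploaderDemo {
    public static void main(String[] args) {
        File file = new File("/images/Done.jpg");
        // Uploader.upload(file, "22");  // does not compile: upload is not static
        Uploader uploader = new LoggingUploader(); // instantiate an implementing class
        System.out.println(uploader.upload(file, "22"));
    }
}
```

The commented-out line reproduces the "Cannot make a static reference to the non-static method" error from the question; calling the method on an instance of an implementing class compiles.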
