I'm developing an app that scrapes data from a website with Jsoup. I was able to get the data from a single page.
But now I need to handle pagination. I was told it would have to be done with Selenium WebDriver, but I do not know how to work with it. Could someone tell me how I can do this?
public class MainActivity extends AppCompatActivity {
private String url = "http://www.yudiz.com/blog/";
private ArrayList<String> mAuthorNameList = new ArrayList<>();
private ArrayList<String> mBlogUploadDateList = new ArrayList<>();
private ArrayList<String> mBlogTitleList = new ArrayList<>();
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
new Description().execute();
}
private class Description extends AsyncTask<Void, Void, Void> {
@Override
protected Void doInBackground(Void... params) {
try {
// Connect to the web site
Document mBlogDocument = Jsoup.connect(url).get();
// Using Elements to get the Meta data
Elements mElementDataSize = mBlogDocument.select("div[class=author-date]");
// Locate the content attribute
int mElementSize = mElementDataSize.size();
for (int i = 0; i < mElementSize; i++) {
Elements mElementAuthorName = mBlogDocument.select("span[class=vcard author post-author test]").select("a").eq(i);
String mAuthorName = mElementAuthorName.text();
Elements mElementBlogUploadDate = mBlogDocument.select("span[class=post-date updated]").eq(i);
String mBlogUploadDate = mElementBlogUploadDate.text();
Elements mElementBlogTitle = mBlogDocument.select("h2[class=entry-title]").select("a").eq(i);
String mBlogTitle = mElementBlogTitle.text();
mAuthorNameList.add(mAuthorName);
mBlogUploadDateList.add(mBlogUploadDate);
mBlogTitleList.add(mBlogTitle);
}
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void result) {
// Set description into TextView
RecyclerView mRecyclerView = (RecyclerView)findViewById(R.id.act_recyclerview);
DataAdapter mDataAdapter = new DataAdapter(MainActivity.this, mBlogTitleList, mAuthorNameList, mBlogUploadDateList);
RecyclerView.LayoutManager mLayoutManager = new LinearLayoutManager(getApplicationContext());
mRecyclerView.setLayoutManager(mLayoutManager);
mRecyclerView.setAdapter(mDataAdapter);
}
}
}
Problem statement (as I understand it): the scraper should move on to the next page, using the pagination links at the bottom of the blog page, until all pages have been processed.
If we inspect the "next" button in the pagination, we see the following HTML:
<a class="next_page" href="http://www.yudiz.com/blog/page/2/">
As long as this link is part of the HTML the server returns, Jsoup can follow it on its own, without Selenium. We need to make Jsoup pick up this URL on each iteration of the loop and scrape the next page. This can be done as follows:
String url = "http://www.yudiz.com/blog/";
while (url != null) {
try {
Document doc = Jsoup.connect(url).get();
System.out.println(doc.getElementsByTag("title").text());
// perform your data extractions on doc here.
// follow the "next" link if there is one; otherwise stop
Element next = doc.getElementsByClass("next_page").first();
url = (next != null) ? next.absUrl("href") : null;
} catch (IOException e) {
e.printStackTrace();
// stop here; otherwise the loop would retry the same failing page forever
url = null;
}
}
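The loop above can also be guarded against link cycles and runaway crawls. The following is only a sketch of that idea and needs no library to run: the in-memory `PAGES` map, the `example.test` URLs, and the `maxPages` cap are all hypothetical stand-ins, and the regular expression merely takes the place of Jsoup's `getElementsByClass("next_page")` so the example is self-contained. In a real app you would keep using Jsoup for the fetching and parsing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PaginationSketch {
    // Hypothetical stand-in for the network: URL -> HTML of that page.
    static final Map<String, String> PAGES = new HashMap<>();
    static {
        PAGES.put("http://example.test/blog/",
                "<a class=\"next_page\" href=\"http://example.test/blog/page/2/\">Next</a>");
        PAGES.put("http://example.test/blog/page/2/",
                "<a class=\"next_page\" href=\"http://example.test/blog/page/3/\">Next</a>");
        PAGES.put("http://example.test/blog/page/3/", "<p>last page, no next link</p>");
    }

    // Stand-in for the Jsoup selector; used only so the sketch has no dependencies.
    static final Pattern NEXT_LINK =
            Pattern.compile("<a class=\"next_page\" href=\"([^\"]+)\"");

    static List<String> crawl(String startUrl, int maxPages) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String url = startUrl;
        // Stop when there is no next link, the cap is reached, or a URL repeats.
        while (url != null && visited.size() < maxPages && seen.add(url)) {
            String html = PAGES.get(url);
            if (html == null) {
                break; // treat a missing page like a failed request
            }
            visited.add(url); // this is where the page's data would be extracted
            Matcher m = NEXT_LINK.matcher(html);
            url = m.find() ? m.group(1) : null;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(crawl("http://example.test/blog/", 50));
    }
}
```

The `seen` set and the page cap are a design choice rather than a requirement: they only matter if a site's "next" link ever points back to a page that was already visited.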
I want to collect the URLs of several images on a page into a String[] and then show them in a slideshow.
With some help I have already managed to list them in the Log. However, I am not sure how to turn them into a String[]. I did this with Jsoup.
The result should look like this: [http://teste.com/image1.png, http://teste.com/image2.png]
public class ImageScrapAsyncTask extends AsyncTask<String, Void, Document> {
@Override
protected Document doInBackground(String... urls) {
try {
return Jsoup.connect(urls[0]).get();
} catch (IOException e) {
e.printStackTrace();
return null;
}
}
@Override
protected void onPostExecute(Document document) {
if (document != null) {
Elements imgElements = document.select("img");
List<String> images = new ArrayList<>();
for (Element element : imgElements) {
String image = element.attr("data-src");
Log.d("IMAGE_URL", image);
if(image!=null && !image.equals("")){
images.add(image);
}
}
}
}
}
Try this,
String[] imagesArray = images.toArray(new String[0]);
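Note that in the question's code `images` is declared inside the `if` block of `onPostExecute`, so the conversion has to happen there, after the loop has finished. Here is a small plain-Java sketch of the filter-then-convert step; the class name, the `toImageArray` helper, and the `raw` input list (standing in for the `data-src` values) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ImageArraySketch {
    // Copies the non-empty URLs into a String[]; 'raw' stands in for the data-src values.
    static String[] toImageArray(List<String> raw) {
        List<String> images = new ArrayList<>();
        for (String src : raw) {
            if (src != null && !src.isEmpty()) {
                images.add(src);
            }
        }
        // Passing an empty array lets toArray allocate one of the right size.
        return images.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "http://teste.com/image1.png", "", "http://teste.com/image2.png");
        // prints [http://teste.com/image1.png, http://teste.com/image2.png]
        System.out.println(Arrays.toString(toImageArray(raw)));
    }
}
```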
I've got this code, which fetches the "rate" data from an API. Along with "rate", I need to get the "name". When I fetch "name" as well, it usually ends up on its own row below the "rate".
I need both on the same row of the ListView, like [Rate Name].
In other words, I need to read two values from each object of a JSON array and bind them to the array adapter, so that both are displayed in the same row of the ListView, which is more user-friendly.
The code below is the AsyncTask. It works fine, but I need to add one more value and make sure each row shows one rate and one name, iterating through the loop and adding more rows in the same order as needed.
public class AsyncTaskParseJson extends AsyncTask<String, String, String> {
// the url of the web service to call
String yourServiceUrl = "eg: URL";
@Override
protected void onPreExecute() {
}
String filename = "bitData";
@Override
protected String doInBackground(String... arg0) {
try {
// create new instance of the httpConnect class
httpConnect jParser = new httpConnect();
// get json string from service url
String json = jParser.getJSONFromUrl(yourServiceUrl);
// parse returned json string into json array
JSONArray jsonArray = new JSONArray(json);
// loop through json array and add each currency to item in arrayList
//Custom Loop Initialise
for (int i = 1; i < 8; i++) {
JSONObject json_message = jsonArray.getJSONObject(i);
// The second JSONObject which needs to be added
JSONObject json_name = jsonArray.getJSONObject(i);
if (json_message != null) {
//add each currency to ArrayList as an item
items.add(json_message.getString("rate"));
String bitData = json_message.getString("rate");
String writeData = bitData + ',' +'\n';
FileOutputStream outputStream;
File file = getFileStreamPath(filename);
// first check if file exists, if not create it
if (file == null || !file.exists()) {
try {
outputStream = openFileOutput(filename, MODE_PRIVATE);
outputStream.write(writeData.getBytes());
outputStream.write("\r\n".getBytes());
outputStream.close();
} catch (Exception e) {
e.printStackTrace();
}
}
// if file already exists then append bit data to it
else if (file.exists()) {
try {
outputStream = openFileOutput(filename, Context.MODE_APPEND);
outputStream.write(writeData.getBytes());
outputStream.write("\r\n".getBytes());
outputStream.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
} catch (JSONException e) {
e.printStackTrace();
}
return null;
}
// below method will run when service HTTP request is complete, will then bind text in arrayList to ListView
@Override
protected void onPostExecute(String strFromDoInBg) {
ListView list = (ListView) findViewById(R.id.rateView);
ArrayAdapter<String> rateArrayAdapter = new ArrayAdapter<String>(BitRates.this, android.R.layout.simple_expandable_list_item_1, items);
list.setAdapter(rateArrayAdapter);
}
}
Just create a custom class, Item:
public class Item {
private String name;
private String rate;
public Item(String n, String r) {
this.name = n;
this.rate = r;
}
// create getters and setters here
// ArrayAdapter shows toString() for each row, so return "rate name"
@Override
public String toString() {
return rate + " " + name;
}
}
Now, in your background task, put the name and rate into an Item:
public class MainActivity extends Activity {
// define this inside the class
public static List<Item> items = new ArrayList<>();
// this has to be done in doInBackground, inside the loop
Item item = new Item(json_message.getString("name"), json_message.getString("rate"));
items.add(item);
Now pass this list to the adapter in onPostExecute:
ListView list = (ListView) findViewById(R.id.rateView);
ArrayAdapter<Item> rateArrayAdapter = new ArrayAdapter<Item>(BitRates.this, android.R.layout.simple_expandable_list_item_1, items);
list.setAdapter(rateArrayAdapter);
I hope that helps.
For a fully custom row layout, follow this link; it should make the idea clear.
https://devtut.wordpress.com/2011/06/09/custom-arrayadapter-for-a-listview-android/
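By default, ArrayAdapter fills each row with the `toString()` of the object at that position, so a value class whose `toString()` returns "rate name" is enough to get both values on one row without a custom adapter. The sketch below shows just that class in plain Java so the row text can be checked off-device; the `RateItem` name and the sample values are invented for illustration and do not come from the API in the question.

```java
import java.util.ArrayList;
import java.util.List;

class RateItem {
    private final String name;
    private final String rate;

    RateItem(String name, String rate) {
        this.name = name;
        this.rate = rate;
    }

    String getName() {
        return name;
    }

    String getRate() {
        return rate;
    }

    // This string is what a default ArrayAdapter<RateItem> would show for the row.
    @Override
    public String toString() {
        return rate + " " + name;
    }
}

class RateItemDemo {
    public static void main(String[] args) {
        List<RateItem> items = new ArrayList<>();
        // Made-up sample values; in the app these come from the JSON objects.
        items.add(new RateItem("US Dollar", "1.25"));
        items.add(new RateItem("Euro", "1.10"));
        for (RateItem item : items) {
            System.out.println(item); // one line per list row
        }
    }
}
```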
This should be really simple, and I've been getting close to figuring it out, but I'm starting to lose my sanity.
I'm simply trying to pull data from an API.
How do I pull JSON data from an object (inside another object)?
This is the first day I've ever heard of JSON, so I'll be the first to admit I have very little knowledge about this. Thanks for your time!
This post: "JSON parsing in android: No value for x error" was super helpful and got me to where I am now, which is trying to open "list" and then get into "0" so I can gather weather data.
I've been using jsonviewer.stack to try to understand this stuff.
Api url: http://api.openweathermap.org/data/2.5/forecast/city?id=4161254&APPID=e661f5bfc93d47b8ed2689f89678a2c9
My code:
public class MainActivity extends AppCompatActivity {
TextView citySpot;
TextView cityTemp;
public class DownloadTask extends AsyncTask<String, Void, String> {
@Override
protected String doInBackground(String... urls) {
String result = "";
URL url;
HttpURLConnection urlConnection = null;
try {
url = new URL(urls[0]);
urlConnection = (HttpURLConnection) url.openConnection();
InputStream in = urlConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
int data = reader.read();
while (data != -1) {
char current = (char) data;
result += current;
data = reader.read();
}
return result;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(String result) {
super.onPostExecute(result);
try {
// this works flawlessly and gathers the city name
JSONObject container = new JSONObject(result);
JSONObject cityTest = container.getJSONObject("city");
citySpot.setText(cityTest.getString("name"));
// this is my failed attempt to get inside of the "list" folder
JSONObject listList = new JSONObject(result);
JSONObject listTest = listList.getJSONObject("list");
JSONObject listTestOne = listTest.getJSONObject("0");
JSONObject listTestTwo = listTestOne.getJSONObject("main");
cityTemp.setText(listTestTwo.getString("temp"));
}
catch (JSONException e) {
e.printStackTrace();
}
}
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
citySpot = (TextView) findViewById(R.id.cityName);
cityTemp = (TextView) findViewById(R.id.textView3);
DownloadTask task = new DownloadTask();
task.execute("http://api.openweathermap.org/data/2.5/forecast/city?id=4161254&APPID=e661f5bfc93d47b8ed2689f89678a2c9");
}
}
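The direct cause of the failure is that "list" in this response is a JSON array, not an object, so with org.json it has to be read with getJSONArray("list") followed by getJSONObject(0), rather than getJSONObject("list") and getJSONObject("0"). Alternatively, you could use the Gson library to parse the response. Add the line below to build.gradle: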
compile group: 'com.google.code.gson', name: 'gson', version: '2.3.1'
Create beans that match your response. Use field names that match the keys in the JSON response. Sample below.
public class ResultsResponse {
private List<MyList> list;
}
public class MyList {
// "main" is a nested object in the response, not a string
private Main main;
}
public class Main {
private double temp;
}
MyList can hold further nested objects or lists if you need them. Finally, try
GsonBuilder gsonBuilder = new GsonBuilder();
ResultsResponse response = gsonBuilder.create().fromJson(jsonString, ResultsResponse.class);
The response object should then have your list populated.
I have been searching for a way to retrieve a public user timeline from Facebook without any authentication or sign-in. The easiest library I found is facebook4j, but I can't figure out how to achieve my goal (I don't understand the access token).
I have achieved my goal for Twitter using twitter4j, and my code is as follows:
class TweetFetch extends AsyncTask<String,String,Void>{
private String convertedSingerName;
private ArrayAdapter<String> nameAdapter;
String user;
ProgressDialog loadDialog;
@Override
protected void onPreExecute() {
//get a loading animation
/*loadDialog = new ProgressDialog(getApplicationContext());
loadDialog.setMessage("Loading...");
loadDialog.setCancelable(false);
loadDialog.show();*/
setProgressBarIndeterminateVisibility(true);
convertedSingerName = Normalizer.normalize(singerName, Normalizer.Form.NFD);
convertedSingerName = convertedSingerName.replaceAll("[^\\p{ASCII}]", "");
convertedSingerName = convertedSingerName.replaceAll("\\s", "");
user = convertedSingerName;
nameAdapter = (ArrayAdapter<String>) tweetList.getAdapter();
getTweetOAuthDetails();
}
@Override
protected Void doInBackground(String... params) {
try{
Paging page = new Paging(1, 50);//page number, number per page
statuses = twitter.getUserTimeline(user, page);
Log.i("Status count ", statuses.size() + " Feeds");
for (int i=0 ; i<statuses.size() ; i++){
twitter4j.Status status = statuses.get(i);
publishProgress(status.getText());
Log.i("Tweet Count" + (i+1), status.getText() + "\n\n");
}
}catch (TwitterException te){
te.printStackTrace();
}
return null;
}
@Override
protected void onProgressUpdate(String... values) {
nameAdapter.add(values[0]);
}
@Override
protected void onPostExecute(Void aVoid) {
//loadDialog.dismiss();
setProgressBarIndeterminateVisibility(false);
}
}
Is there any way I can use the same approach with facebook4j for Facebook? If this is not achievable with facebook4j, what is the simplest way?
Thanks.
I'm developing an Android app. I'm using Jsoup to retrieve elements from a page, then iterating over the collection to get each individual part. I'm not sure how to save each instance of an element as a separate variable. I think I can use a for loop for this, but I don't quite understand how. How would I determine how many elements there are to loop over? How would I use it? I'm retrieving elements from here: http://lapi.transitchicago.com/api/1.0/ttarrivals.aspx?key=201412abc85d49b2b83f907f9e329eaa&mapid=40380. My code is below:
public class TestStation extends Activity {
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.test_station);
StrictMode.ThreadPolicy policy = new StrictMode.ThreadPolicy.Builder().permitAll().build();
StrictMode.setThreadPolicy(policy);
Intent intent = getIntent();
String value = intent.getExtras().getString("value");
Uri my = Uri.parse("http://lapi.transitchicago.com/api/1.0/ttarrivals.aspx?key=201412abc85d49b2b83f907f9e329eaa&mapid="+value);
String myUrl = my.toString();
Document doc = null;
TextView tv1 = (TextView) findViewById(R.id.tv1);
try {
doc = Jsoup.connect(myUrl).userAgent("Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de-de) AppleWebKit/523.10.3 (KHTML, like Gecko) Version/3.0.4 Safari/523.10").get();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Elements elem = doc.select("eta");
Iterator<Element> iterator = elem.iterator();
while(iterator.hasNext())
{ Element div = iterator.next();
Elements arrT = div.select("arrT");
Elements prdt = div.select("prdt");
Elements destNm = div.select("destNm");
Elements rt = div.select("rt");
String DestNm = destNm.text();
String Rt = rt.text();
tv1.setText(String.valueOf (Rt));
I would like to store each instance (there are many) of arrT, prdt, and destNm as a different variable. How would I go about doing this? Thank you for your help.
You could use a generic type list
ArrayList<Elements> xx = new ArrayList<Elements>();
then in your while loop
xx.add(arrT);
Could you make a class with getters and setters for the properties you're interested in, and create an array of that?
EDIT:
To create your class:
public class YourClass {
// holds the text of one arrT element, e.g. arrT.text()
private String arrT;
...
public void setArrT(String input) {
arrT = input;
}
public String getArrT() {
return arrT;
}
....
}
Give it a go.
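Put together, the class-plus-list idea might look like the sketch below. The field names follow the tags in the question (arrT, prdt, destNm, rt); the `Arrival` class name and the sample values are made up, and the commented-out line only indicates where the Jsoup calls from the question's loop would go. Because the list grows as you add to it, you do not need to know the number of elements in advance; `size()` gives the count afterwards.

```java
import java.util.ArrayList;
import java.util.List;

class Arrival {
    private final String arrT;
    private final String prdt;
    private final String destNm;
    private final String rt;

    Arrival(String arrT, String prdt, String destNm, String rt) {
        this.arrT = arrT;
        this.prdt = prdt;
        this.destNm = destNm;
        this.rt = rt;
    }

    String getArrT() { return arrT; }
    String getPrdt() { return prdt; }
    String getDestNm() { return destNm; }
    String getRt() { return rt; }
}

class ArrivalListSketch {
    public static void main(String[] args) {
        List<Arrival> arrivals = new ArrayList<>();
        // Inside the question's while loop this would be roughly:
        // arrivals.add(new Arrival(arrT.text(), prdt.text(), destNm.text(), rt.text()));
        // Made-up sample values so the sketch runs without the API:
        arrivals.add(new Arrival("20141201 10:05:00", "20141201 10:00:00", "Howard", "Red"));
        arrivals.add(new Arrival("20141201 10:09:00", "20141201 10:01:00", "95th/Dan Ryan", "Red"));

        // One object per eta element; the list size is the number of arrivals.
        for (int i = 0; i < arrivals.size(); i++) {
            Arrival a = arrivals.get(i);
            System.out.println(a.getRt() + " to " + a.getDestNm() + " arrives " + a.getArrT());
        }
    }
}
```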
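Converting between a Java String (Unicode) and UTF-8 bytes
try {
String string = "\u201cGoodbye Athens thanks";
// encode the string as UTF-8 bytes
byte[] utf8 = string.getBytes("UTF-8");
// decode the UTF-8 bytes back into a string
string = new String(utf8, "UTF-8");
} catch (UnsupportedEncodingException e) {
// cannot happen: every Java platform is required to support UTF-8
}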
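On Java 7 and later (and on Android API 19 and later) the same round trip can be written with `StandardCharsets.UTF_8`, which avoids the checked exception entirely. This is a sketch of that variant; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {
    // Encode a String as UTF-8 bytes.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a String.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u201cGoodbye Athens thanks";
        byte[] utf8 = encode(original);
        String back = decode(utf8);
        // The opening curly quote (U+201C) takes 3 bytes in UTF-8, so the
        // byte count is larger than the character count.
        System.out.println(original.length() + " chars, " + utf8.length
                + " bytes, round trip equal: " + original.equals(back));
    }
}
```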