I am trying to download the source code of a web page.
But the whole code is not showing up; only a small part is downloaded every time.
public class MainActivity extends AppCompatActivity {
public class DownloadTask extends AsyncTask < String , Void , String >
{
@Override
protected String doInBackground(String... params) {
String content ="";
URL url ;
HttpURLConnection conn = null;
try {
url = new URL (params[0]);
conn = (HttpURLConnection)url.openConnection();
InputStream is = conn.getInputStream();
InputStreamReader isr = new InputStreamReader(is);
int data = isr.read();
while(data!=-1)
{
char c = (char) data;
content += c;
data = isr.read();
}
Log.i("The Code is ",content);
}
catch (Exception e)
{
e.printStackTrace();
}
return content;
}
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
String result =" ";
DownloadTask DT = new DownloadTask();
try {
result = DT.execute("https://www.google.co.in").get();
}
catch (Exception e)
{
e.printStackTrace();
}
Log.i("The Code is ",result);
}
}
It's important to close the InputStreamReader. That might not be the cause of your problem, but it's good practice.
while(data!=-1)
{
char c = (char) data;
content += c;
data = isr.read();
}
isr.close();
is.close();
I think your first page downloads fine, but when you try to load it again and again you might run into problems. As I said, this might not be a fix, but it's important. Hope this helps someone.
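As a minimal sketch of the same idea, try-with-resources closes the reader (and the stream it wraps) even when read() throws. The class name StreamClosingSketch and the helper readAll are hypothetical, and UTF-8 is an assumed encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class StreamClosingSketch {
    // Reads the whole stream as text. The reader declared in the try header
    // is closed automatically, which also closes the wrapped InputStream.
    static String readAll(InputStream in) throws IOException {
        StringBuilder content = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int data;
            while ((data = reader.read()) != -1) {
                content.append((char) data);
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for conn.getInputStream().
        InputStream in = new ByteArrayInputStream(
                "<html>hi</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in));
    }
}
```

In the AsyncTask from the question, the same pattern would wrap conn.getInputStream() in place of the in-memory stream used here.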
I am trying to download the HTML of a page. After it downloads, I try to log it. Everything goes smoothly, but the HTML stops at a certain point every time, even though there is a lot more HTML to show.
I tried a different page, my own page that just has some instructions for my company, and it worked perfectly. Is there a limit, maybe? I tried it with and without urlconnection.connect(), and there is no difference.
public class MainActivity extends AppCompatActivity {
public class DownloadHTML extends AsyncTask<String, Void, String>{
@Override
protected String doInBackground(String... urls) {
URL url;
String result = "";
HttpURLConnection urlConnection = null;
try {
url = new URL(urls[0]);
urlConnection = (HttpURLConnection)url.openConnection();
InputStream in = urlConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
int data = reader.read();
while (data!=-1){
char current = (char) data;
result += current;
data = reader.read();
}
return result;
} catch (Exception e) {
e.printStackTrace();
return "Fail";
}
}
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
String Result = "";
DownloadHTML task = new DownloadHTML();
try {
Result = task.execute("http://www.posh24.se/kandisar").get();
} catch (Exception e) {
e.printStackTrace();
}
Log.i("URL", Result);
}
}
Here is the splitting code, which doesn't work.
try {
Result = task.execute("http://www.posh24.se/kandisar").get();
String[] splitStrings = Result.split("<div class=\"channelListEntry\">");
Pattern p = Pattern.compile("<img src=\"(.*?)\"");
Matcher m = p.matcher(splitStrings[0]);
while (m.find()){
CelebUrls.add(m.group(1));
}
p = Pattern.compile("alt=\"(.*?)\"");
m = p.matcher(splitStrings[0]);
while (m.find()){
CelebNames.add(m.group(1));
}
} catch (Exception e) {
e.printStackTrace();
}
Log.i("URL", Arrays.toString(CelebUrls.toArray()));
}
}
Modifying your method like this will give you the content of the HTML page in UTF-8 format.
(In this case it's UTF-8 because the page is encoded that way; if in doubt, you can pass Charset.forName("utf-8") as the second parameter to the constructor of InputStreamReader.)
When testing your example implementation, I only got output with various unreadable characters.
Ignore the class and method changes; I only made them to have a standalone example.
public class ParsingTest {
static String doInBackground(String address) {
URL url;
StringBuilder result = new StringBuilder(1000);
HttpURLConnection urlConnection = null;
try {
url = new URL(address);
urlConnection = (HttpURLConnection)url.openConnection();
InputStream in = urlConnection.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
String line = reader.readLine();
while (line != null){
result.append(line);
result.append("\n");
line = reader.readLine();
}
return result.toString();
} catch (Exception e) {
e.printStackTrace();
return "Fail";
}
}
public static void main(String[] args) {
String result = doInBackground("http://www.posh24.se/kandisar");
System.out.println(result);
}
}
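The charset remark above can be sketched as follows. This is only an illustration under assumptions: the class name CharsetSketch and the helper readLines are hypothetical, and an in-memory byte array stands in for the HTTP response body.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    // Decodes the stream with an explicit charset instead of the platform default.
    static String readLines(InputStream in, Charset charset) throws IOException {
        StringBuilder result = new StringBuilder();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.append(line).append('\n');
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        // "K\u00e4ndisar" contains a non-ASCII letter, so the charset matters.
        byte[] body = "K\u00e4ndisar".getBytes(StandardCharsets.UTF_8);
        System.out.print(readLines(new ByteArrayInputStream(body), StandardCharsets.UTF_8));
    }
}
```

Decoding the same bytes with a different charset, such as ISO-8859-1, yields different characters, which is one way unreadable output can arise.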
If the only part that interests you is the images of the top 100, you can adjust the while loop to:
String line = reader.readLine();
while (line != null){
if (line.contains("<div class=\"channelListEntry\">")) {
reader.readLine();
reader.readLine();
line = reader.readLine().trim();
// At this point it's probably easier to use a List<String> for the result instead
result.append(line);
result.append("\n");
}
line = reader.readLine();
}
This is only a simplified example based on the current design of the page, where the img comes three lines after the declaration of the div.
If you want to, you can also extract the URL of the image and the alt description directly at this point. Instead of a complicated regex, you could rely on String#indexOf.
private static final String SRC = "src=\"";
private static final String ALT = "\" alt=\"";
private static final String END = "\"/>";
public static void extract(String image) {
int index1 = image.indexOf(SRC);
int index2 = image.indexOf(ALT);
int index3 = image.indexOf(END);
System.out.println(image);
System.out.println(image.substring(index1 + SRC.length(), index2));
System.out.println(image.substring(index2 + ALT.length(), index3));
}
Note that if you directly process the content from the page your app does not require the memory to store the full page.
It's taking too long to compile the code (more than 5 minutes, only for this app).
Also, when it's finally done, the complete HTML is not displayed in logcat, only part of it.
Can you please point out what's wrong with the code?
Is it because of "InputStream" reading character by character (as the HTML is huge)?
public class MainActivity extends AppCompatActivity {
public class DownloadTask extends AsyncTask<String, Void, String> {
@Override
protected String doInBackground(String... urls) {
String result = "";
URL url;
HttpURLConnection urlConnection = null;
try {
url = new URL(urls[0]);
urlConnection = (HttpURLConnection) url.openConnection();
InputStream in = urlConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
int data = reader.read();
while (data != -1) {
char current = (char) data;
result += current;
data = reader.read();
}
return result;
} catch (Exception e) {
e.printStackTrace();
return "Failed";
}
}
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
DownloadTask task = new DownloadTask();
String result = null;
try {
result = task.execute("http://www.amazon.com").get();
} catch (Exception e) {
e.printStackTrace();
}
Log.i("Result",result);
}
Yes. The more calls you make like that, the worse your performance. You should be reading multiple kilobytes at a time, not single characters. If you need to loop over the content one character at a time, do that afterwards.
Also, use a StringBuilder! Using + on a String inside a loop is highly inefficient: every concatenation creates a new String object and copies everything read so far. StringBuilder avoids that.
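A minimal sketch of both suggestions together, reading into a char[] buffer and appending to a StringBuilder. The class name BufferedDownloadSketch, the helper readAll, and the 8192-character buffer size are assumptions for illustration:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class BufferedDownloadSketch {
    // Reads up to 8192 characters per call instead of one at a time,
    // and accumulates them in a StringBuilder instead of using String +=.
    static String readAll(Reader reader) throws IOException {
        StringBuilder result = new StringBuilder();
        char[] buffer = new char[8192];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            result.append(buffer, 0, n); // append only the characters actually read
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for new InputStreamReader(in).
        String page = readAll(new StringReader("<html><body>hello</body></html>"));
        System.out.println(page.length());
    }
}
```

In the question's doInBackground, the Reader would be the InputStreamReader created from urlConnection.getInputStream().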
I checked across StackOverflow for answers, but I did not find much. So, I am doing this for practice, like Hello World for working with JSON, I am getting JSON response from openweather API.
I write the name of the city in EditText and press the button to search for it and display JSON string in the logs.
public class MainActivity extends AppCompatActivity {
EditText city;
public void getData(View view){
String result;
String cityName = city.getText().toString();
getWeather weather = new getWeather();
try {
result = weather.execute(cityName).get();
System.out.println(result);
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
public class getWeather extends AsyncTask<String, Void, String>{
@Override
protected String doInBackground(String... urls) {
URL url;
HttpURLConnection connection = null;
String result = "";
try {
String finalString = urls[0];
finalString = finalString.replace(" ", "%20");
String fullString = "http://api.openweathermap.org/data/2.5/forecast?q=" + finalString + "&appid=a18dc34257af3b9ce5b2347bb187f0fd";
url = new URL(fullString);
connection = (HttpURLConnection) url.openConnection();
InputStream in = connection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
int data = reader.read();
while(data != -1){
char current = (char) data;
result += current;
data = reader.read();
}
return result;
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
city = (EditText) findViewById(R.id.editText);
}
}
What can I do to not get that message?
weather.execute(cityName).get()
When you call get(), you wait for the AsyncTask to finish. The UI thread is therefore blocked for the whole duration of the heavy operation.
From documentation of get():
Waits if necessary for the computation to complete, and then retrieves its result.
Remove get().
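To illustrate the non-blocking pattern outside Android, here is a plain-Java analogy that uses CompletableFuture rather than AsyncTask. It is a sketch, not the Android implementation: the names CallbackSketch, fetch, and fetchAsync are hypothetical, and fetch merely fakes the network call. In an AsyncTask, the equivalent place to handle the result is onPostExecute, which runs on the UI thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

class CallbackSketch {
    // Hypothetical stand-in for the network request made in doInBackground.
    static String fetch(String city) {
        return "{\"city\":\"" + city + "\"}";
    }

    // Starts the work on another thread and hands the result to a callback,
    // so the caller never has to block waiting for it.
    static CompletableFuture<Void> fetchAsync(String city, Consumer<String> onResult) {
        return CompletableFuture.supplyAsync(() -> fetch(city)).thenAccept(onResult);
    }

    public static void main(String[] args) {
        CompletableFuture<Void> done = fetchAsync("London", System.out::println);
        System.out.println("caller continues without waiting for the result");
        // join() is here only so this demo does not exit before the callback runs;
        // an Activity would not wait like this.
        done.join();
    }
}
```

The design point is the same as in the answer: the caller passes along what to do with the result instead of waiting for it.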
When I'm trying to download html using this method:
public class DownloadHtml extends AsyncTask<String, Void, String> {
@Override
protected String doInBackground(String... urls) {
String result = "";
URL url;
HttpURLConnection connection = null;
try {
url = new URL(urls[0]);
connection = (HttpURLConnection) url.openConnection();
InputStream inputStream = connection.getInputStream();
InputStreamReader reader = new InputStreamReader(inputStream);
int data = reader.read();
while (data != -1) {
char currentChar = (char) data;
result += currentChar;
data = reader.read();
}
return result;
} catch (Exception e) {
e.printStackTrace();
return "Failed";
}
}
}
And logging a result
DownloadHtml downloadHtml = new DownloadHtml();
String result = null;
try {
result = downloadHtml.execute("http://stackoverflow.com").get();
} catch (Exception e) {
e.printStackTrace();
}
Log.i("Html", result);
I am getting only a small part of it.
Is there a way to get the whole HTML of the web page?
The solution was simple. It looks like Log.i doesn't print everything in one go.
When I tried to extract all the links from the HTML, they were printed successfully, so the whole page had been downloaded.
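If the full text still needs to appear in logcat, one option is to log it in pieces. Logcat caps the length of a single entry (commonly cited as roughly 4 KB; the exact limit is an assumption here), so the sketch below splits the string into chunks of at most 4000 characters. The class name LogChunkSketch and the helper chunk are hypothetical, and System.out.println stands in for Log.i so the example runs outside Android:

```java
import java.util.ArrayList;
import java.util.List;

class LogChunkSketch {
    // Splits text into consecutive pieces of at most maxLen characters.
    static List<String> chunk(String text, int maxLen) {
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += maxLen) {
            int end = Math.min(text.length(), start + maxLen);
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        StringBuilder html = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            html.append("<p>row ").append(i).append("</p>");
        }
        // On Android, each part would be passed to Log.i("Html", part) instead.
        for (String part : chunk(html.toString(), 4000)) {
            System.out.println(part);
        }
    }
}
```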
I am trying to pass a StringBuilder object to my main activity. I checked my JSON file and the parsing code, and they work well. However, when I try to fetch the StringBuilder, I get an error:
A resource was acquired at attached stack trace but never released. See java.io.Closeable for information on avoiding resource leaks.
java.lang.Throwable: Explicit termination method 'close' not called
The code for Server.java is below:
public class Server extends Activity {
static StringBuilder stringBuilder = new StringBuilder();
public Server(){
try {
JSONObject obj = new JSONObject(loadJSONFromAsset());
JSONArray project = obj.getJSONArray("project");
for (int i = 0; i < project.length(); i++) {
JSONObject ss = project.getJSONObject(i);
stringBuilder.append(ss.getString("title") + "\n");
JSONArray post = ss.getJSONArray("posts");
for(int j = 0; j < post.length();j++){
JSONObject posts = post.getJSONObject(j);
stringBuilder.append(posts.getString("id") +"\n");
JSONArray tag = posts.getJSONArray("tags");
for(int k = 0; k < tag.length();k++){
stringBuilder.append(tag.getString(k) +"\n");
}
}
}
}
catch (JSONException e) {
stringBuilder.append("error");
e.printStackTrace();
}
}
public String getString(){
return stringBuilder.toString();
}
public String loadJSONFromAsset() {
String json = null;
try {
InputStream is = getAssets().open("cat.json");
int size = is.available();
byte[] buffer = new byte[size];
is.read(buffer);
is.close();
json = new String(buffer, "UTF-8");
} catch (IOException ex) {
ex.printStackTrace();
return null;
}
return json;
}
}
And here is my MainActivity.java:
public class MainActivity extends Activity {
TextView jsonDataTextView;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
jsonDataTextView = (TextView) findViewById(R.id.textView);
Server s = new Server();
jsonDataTextView.setText(s.stringBuilder.toString());
}
}
Is there any solution?
If an exception is thrown in the try block of your loadJSONFromAsset method, the InputStream is never closed, and it should be. Close it in a finally block:
public String loadJSONFromAsset() {
String json = null;
InputStream is = null;
try {
is = getAssets().open("cat.json");
int size = is.available();
byte[] buffer = new byte[size];
is.read(buffer);
json = new String(buffer, "UTF-8");
}
catch (IOException ex) {
ex.printStackTrace();
}
finally {
if (is != null) {
try {
is.close();
}
catch (IOException ex) {
// Do you want to handle this exception?
}
}
}
return json;
}
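A related point, offered as a hedged sketch: available() only reports an estimate of the bytes readable without blocking, and a single read(buffer) call is allowed to return fewer bytes than requested. Reading in a loop until -1 avoids relying on either. The class name ReadFullySketch and the helper readUtf8 are hypothetical, and an in-memory stream stands in for getAssets().open("cat.json"):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadFullySketch {
    // Copies the stream into memory in a loop and decodes it as UTF-8.
    // try-with-resources closes the stream whether or not an exception occurs.
    static String readUtf8(InputStream source) throws IOException {
        try (InputStream is = source;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = is.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] json = "{\"project\":[]}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readUtf8(new ByteArrayInputStream(json)));
    }
}
```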
Note: Something bothers me in your Server constructor. In the catch block you append "error" to your StringBuilder, but at that point the StringBuilder may not be empty: one or more appends in the try block may have succeeded before something went wrong.
Note 2: Server does not need to be an Activity. It can be a plain class that receives a Context:
public class Server {
private Context mContext;
public Server(Context context) {
mContext = context;
...
}
...
public String loadJSONFromAsset() {
...
mContext.getAssets().open("cat.json");
}
}
Then, in your MainActivity:
Server s = new Server(this);