I am trying to fetch a service provider's recharge plan information into my Java program. The website contains dynamic data, and when I fetch the URL using URLConnection I only get the static content. I want to automate fetching the recharge plans of different websites in my program.
package com.fs.store.test;
import java.net.*;
import java.io.*;
public class MyURLConnection
{
private static final String baseTataUrl = "https://www.tatadocomo/pre-paypacks";
public MyURLConnection()
{
}
public void getMeData()
{
URLConnection urlConnection = null;
BufferedReader in = null;
try
{
URL url = new URL(baseTataUrl);
urlConnection = url.openConnection();
HttpURLConnection connection = null;
connection = (HttpURLConnection) urlConnection;
in = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()/*,"UTF-8"*/));
String currentLine = null;
StringBuilder line = new StringBuilder();
while((currentLine = in.readLine()) != null)
{
System.out.println(currentLine);
line = line.append(currentLine.trim());
}
}catch(IOException e)
{
e.printStackTrace();
}
finally{
try{
if (in != null) in.close(); // in stays null when the connection could not be opened
}
catch(Exception e){
e.printStackTrace();
}
}
}
public static void main (String args[])
{
MyURLConnection test = new MyURLConnection();
System.out.println("About to call getMeData()");
test.getMeData();
}
}
You must use one of the HtmlEditorKits, with JavaScript enabled in your browser, and then get the content.
See examples: oreilly
Inspect the traffic. Firefox has a TamperData plugin, for instance. Then you may be able to communicate with the server more directly.
Use Apache's HttpClient to facilitate the communication, instead of a plain URL.
Maybe use a JSON library if JSON data is coming back.
This involves more details, but you might then be able to skip some of the loading.
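Once the traffic inspection reveals the request that actually delivers the plan data, that request can be reproduced directly. The following is only a minimal sketch, assuming the data comes back from a JSON endpoint. The class name and the header choice are illustrative assumptions, and it uses the JDK's HttpURLConnection rather than Apache HttpClient to stay dependency-free.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DirectDataRequest {

    // Prepares (but does not yet send) a GET request that mimics the browser's script call.
    static HttpURLConnection prepare(String endpoint) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(endpoint).openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("Accept", "application/json");
        // Some sites expect this header on script-driven requests;
        // whether a given site does is an assumption to verify in the inspector.
        con.setRequestProperty("X-Requested-With", "XMLHttpRequest");
        con.setConnectTimeout(5000);
        con.setReadTimeout(5000);
        return con;
    }

    // Sends the prepared request and returns the response body as a string.
    static String fetch(String endpoint) throws IOException {
        HttpURLConnection con = prepare(endpoint);
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            con.disconnect();
        }
        return body.toString();
    }
}
```

Calling `DirectDataRequest.fetch(...)` with the endpoint seen in the network inspector would return the raw JSON text, which can then be handed to a JSON library.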
Related
I am creating a program that has to fetch the prices of some products from the web. I managed to do this for the first few products, but then I hit a URL that is either answered with a 503 response from the server or not fully read (the tags with the price are not included in the output). Here is my code:
import java.net.*;
import java.io.*;
import java.util.Properties;
public class Test {
public static void main(String[] args) throws Exception {
new Test().connect();
}
public void connect() {
try {
String url = "https://antoshka.ua/ua/nabir-lakiv-dlya-nigtiv-make-it-real-rusalonka-3-sht6282464.html",
proxy = "proxy.mydomain.com",
port = "8080";
URL server = new URL(url);
Properties systemProperties = System.getProperties();
systemProperties.setProperty("http.proxyHost",proxy);
systemProperties.setProperty("http.proxyPort",port);
HttpURLConnection connection = (HttpURLConnection)server.openConnection();
connection.connect();
InputStream in = connection.getInputStream();
readResponse(in);
} catch(Exception e) {
e.printStackTrace();
}
}
public void readResponse(InputStream is) throws IOException {
BufferedInputStream bis = new BufferedInputStream(is);
ByteArrayOutputStream buf = new ByteArrayOutputStream();
int result = bis.read();
while(result != -1) {
byte b = (byte)result;
buf.write(b);
result = bis.read();
}
System.out.println(buf.toString());
}
}
And here is the url I try to read: https://antoshka.ua/ua/nabir-lakiv-dlya-nigtiv-make-it-real-rusalonka-3-sht6282464.html
If you browse it in incognito mode you will see this
That might be the cause of the problem. It might also mean that this page is protected against bots.
Also, waiting for 6 seconds after this command
server.openConnection();
might solve your problem.
My advice is to use a REST API (if one exists). I'm not Russian, so I can't find this web page's REST API for you.
According to the Android Documentation:
Assuming you have a web server with a certificate issued by a well known CA, you can make a secure request with code as simple as this:
URL url = new URL("https://wikipedia.org");
URLConnection urlConnection = url.openConnection();
InputStream in = urlConnection.getInputStream();
copyInputStreamToOutputStream(in, System.out);
Yes, it really can be that simple.
I am currently working on an Android project and now want to start introducing a User System (DataBase of users/passwords, registration/log in capabilities, etc.).
Here is how we are currently handling the connection (in the Android section of the project):
package com.example.payne.simpletestapp;
import org.osmdroid.util.BoundingBox;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
public class ServerConnection {
public String serverAddress;
public int port;
public ServerConnection(String addr, int port){
this.serverAddress = addr;
this.port = port;
}
public ServerConnection(String addr){
this.port = 80;
this.serverAddress = addr;
}
/**
* Periodically sends a request to the server.
* <p></p>
* Receives all the alerts for the territory.
*
* @return the server's response body
*/
public String ping(BoundingBox boundingBox) throws Exception {
String param =
"?nord=" + boundingBox.getLatNorth() +
"&sud=" + boundingBox.getLatSouth() +
"&est=" + boundingBox.getLonEast() +
"&ouest=" + boundingBox.getLonWest();
return this.getRequest("/alertes", param);
}
public String getRequest(final String path, final String param) throws Exception {
final Response result = new Response();
Thread connection = new Thread(new Runnable() {
@Override
public void run() {
try{
URL obj = new URL(serverAddress + path + param);
HttpURLConnection con = (HttpURLConnection) obj.openConnection();
// optional default is GET
con.setRequestMethod("GET");
// int responseCode = con.getResponseCode();
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
String inputLine;
StringBuffer response = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
result.response = response.toString();
} catch (Exception e){
e.printStackTrace();
}
}
});
connection.start();
connection.join();
return result.response;
}
public Boolean postAlert(Alerte alerte) {
boolean success = false;
alerte.log();
String param = "?type=" + alerte.type +
"&lat=" + alerte.getLatitude() +
"&lng=" + alerte.getLongitude();
try {
this.getRequest("/putAlert", param);
success = true;
} catch (Exception e) {
e.printStackTrace();
}
return success;
}
}
Note that elsewhere, another member of the team seems to have been establishing connections differently (in the 'backend' section of the project, which is used by the server, and not necessarily by Android):
private String getRssFeed() {
try {
String rss = "";
URL rssSource = new URL("https://anURLwithHTTPS.com/");
URLConnection rssSrc = rssSource.openConnection();
BufferedReader in = new BufferedReader(
new InputStreamReader(
rssSrc.getInputStream(), StandardCharsets.UTF_8));
String inputLine;
while ((inputLine = in.readLine()) != null) {
rss += inputLine;
}
in.close();
String rssCleaned = rss.replaceAll("&lt;", "<").replaceAll("&gt;", ">").substring(564);
return rssCleaned;
} catch (MalformedURLException ex) {
Logger.getLogger(AlertesFluxRss.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(AlertesFluxRss.class.getName()).log(Level.SEVERE, null, ex);
}
return "";
}
The questions are:
Will the first code presented (the Android-related one) use an encrypted connection (HTTPS)?
What about the second one?
How do we test if the information that was sent is encrypted?
Is casting to HttpURLConnection a threat? The documentation seems to indicate that it would automatically recast into "HttpsURLConnection".
Since HttpsURLConnection extends HttpURLConnection, it has all of the capacities HttpURLConnection and more, right?
I did find this, but it wasn't exactly what I was looking for.
Our current idea for registering new users would be to send their information within a URL through an encrypted connection (along the lines of https://myWebsite.com/api/?user=Bob&pass=amazing) to the server (hosted on https://www.heroku.com).
If there are better ways to do such a thing, please let me know. Suggestions will be gladly considered. I'm not too knowledgeable when it comes to libraries either, so please share anything you know that might make my approach more secure and faster to implement.
Thank you so much for your help in advance! :)
Will the first code presented (the Android-related one) use an encrypted connection (HTTPS)?
Not without seeing the URL the 'first code presented' uses, which isn't shown. In the Android sample you quoted, the URL string "https://wikipedia.org" guarantees it, by definition. If it starts with https:, it uses HTTPS, which is encrypted: otherwise, it doesn't, and isn't.
What about the second one?
Yes, for the reason given above.
How do we test if the information that was sent is encrypted?
You don't. You use an https: URL. The rest is automatic. It works. Worked for 22 years. No testing required. Don't test the platform. If you must check it, attempt a typecast to HttpsURLConnection. If it's HTTPS, that will succeed: otherwise, fail.
Is casting to HttpURLConnection a threat?
No. I don't understand why you ask.
The documentation seems to indicate that it would automatically recast into "HttpsURLConnection".
No it doesn't. It indicates that an HttpsURLConnection is created. No casting.
Since HttpsURLConnection extends HttpURLConnection, it has all of the capacities HttpURLConnection and more, right?
Right.
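The typecast check mentioned in the answer above can be sketched as follows. This is only an illustration, and the class and method names are made up for the example. It uses instanceof instead of a cast, so that a plain HTTP connection yields false rather than a ClassCastException.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;

public class HttpsCheck {

    // Returns true when the connection object created for the URL is an HTTPS one.
    // openConnection() only creates the object; no network traffic happens here.
    static boolean usesHttps(String address) throws IOException {
        URLConnection con = new URL(address).openConnection();
        return con instanceof HttpsURLConnection;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(usesHttps("https://wikipedia.org"));
        System.out.println(usesHttps("http://wikipedia.org"));
    }
}
```

The result depends only on the scheme of the URL string, which is the point the answer makes: the platform picks the connection class from the https: prefix.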
I am using this Java program to get a CSV file from the specified URL. The only problem is that it requires a username and password, which I know. How can I modify this program to include the username and password?
public class TestGetCSV {
public static void main(String[] args) {
try {
URL url12 = new URL("https://wd5-impl-services1.workday.com/ccx/service/customreport2/yahoo3/ISU-YINT061-ProjectsFE/YINT061.05-getWorkersForOpenText?format=csv" );
URLConnection urlConn = url12.openConnection();
HttpURLConnection conn = (HttpURLConnection)urlConn;
conn.setInstanceFollowRedirects( false );
InputStreamReader inStream = new InputStreamReader(urlConn.getInputStream());
BufferedReader buff= new BufferedReader(inStream);
String content2 = buff.readLine();
while (content2 !=null) {
System.out.println(content2);
content2 = buff.readLine();
}
}
catch (Exception e){
e.printStackTrace();
}
}
}
As this is basic authentication based on HTTP headers, I believe you can just open the URL
https://<username>:<password>@wd5-impl-services1.workday.com/ccx/service/customreport2/yahoo3/ISU-YINT061-ProjectsFE/YINT061.05-getWorkersForOpenText?format=csv
instead.
However, please keep in mind that this way your credentials are supplied in plaintext.
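It is not guaranteed that URLConnection turns user info embedded in the URL into an Authorization header, so a more explicit alternative is to set that header yourself. A minimal sketch, assuming the server accepts standard HTTP Basic authentication; the class and method names are made up for illustration:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthRequest {

    // Builds the value of the Authorization header for HTTP Basic authentication.
    static String basicAuthValue(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Opens a connection to the address with the Authorization header already set.
    // Nothing is sent until the caller reads from the connection.
    static HttpURLConnection open(String address, String user, String password)
            throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
        con.setRequestProperty("Authorization", basicAuthValue(user, password));
        return con;
    }
}
```

The returned connection can then be read with getInputStream() exactly as in the question. Basic credentials are only Base64-encoded, not encrypted, so this is best used with https: URLs only.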
I'm writing a Java program which hits a list of URLs and first needs to know whether each URL exists. I don't know how to go about this and can't find the Java code to use.
The URL is like this:
http://ip:port/URI?Attribute=x&attribute2=y
These are URLs on our internal network that would return an XML if valid.
Can anyone suggest some code?
You could just use HttpURLConnection. If the URL is not valid, you won't get anything back.
HttpURLConnection connection = null;
try {
    URL myurl = new URL("http://www.myURL.com");
    connection = (HttpURLConnection) myurl.openConnection();
    // Use the HEAD request method to reduce load, as Subirkumarsao said.
    connection.setRequestMethod("HEAD");
    int code = connection.getResponseCode();
    System.out.println("" + code);
} catch (IOException e) {
    // Handle invalid URL
}
Or you could ping it like you would from CMD and record the response.
String myurl = "google.com";
String ping = "ping " + myurl;
try {
    Runtime r = Runtime.getRuntime();
    // Start the ping process once and read its output.
    Process p = r.exec(ping);
    BufferedReader in = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String inLine;
    BufferedWriter write = new BufferedWriter(new FileWriter("C:\\myfile.txt"));
    while ((inLine = in.readLine()) != null) {
        write.write(inLine);
        write.newLine();
    }
    write.flush();
    write.close();
    in.close();
} catch (Exception ex) {
    // Code here for what you want to do with invalid URLs
}
A malformed URL will give you an exception.
To know whether the URL is active or not, you have to hit the URL. There is no other way.
You can reduce the load by requesting only the headers from the URL.
package com.my;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;
public class StrTest {
public static void main(String[] args) throws IOException {
try {
URL url = new URL("http://www.yaoo.coi");
InputStream i = null;
try {
i = url.openStream();
} catch (UnknownHostException ex) {
System.out.println("THIS URL IS NOT VALID");
}
if (i != null) {
System.out.println("Its working");
}
} catch (MalformedURLException e) {
e.printStackTrace();
}
}
}
output : THIS URL IS NOT VALID
Open a connection and check if the response contains valid XML? Was that too obvious or do you look for some other magic?
You may want to use HttpURLConnection and check for error status:
HttpURLConnection javadoc
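Putting the suggestions above together (a HEAD request plus a check of the status code) might look like the following. This is a sketch under assumptions: the class name, the 3-second timeouts, and the choice to treat 2xx and 3xx as "exists" are illustrative and not taken from the answers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class UrlChecker {

    // Treats 2xx and 3xx status codes as "the URL exists".
    static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    // Sends a HEAD request and reports whether the server answered with a success status.
    // Malformed URLs, non-HTTP URLs, unknown hosts and timeouts all count as "does not exist".
    static boolean exists(String address) {
        HttpURLConnection con = null;
        try {
            con = (HttpURLConnection) new URL(address).openConnection();
            con.setRequestMethod("HEAD");
            con.setConnectTimeout(3000);
            con.setReadTimeout(3000);
            return isSuccess(con.getResponseCode());
        } catch (IOException | ClassCastException e) {
            return false;
        } finally {
            if (con != null) {
                con.disconnect();
            }
        }
    }
}
```

Some servers do not support HEAD and answer with an error status even though a GET would succeed, so a fallback to GET may be needed for such hosts.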
I would like to be able to fetch a web page's HTML and save it to a String, so I can do some processing on it. Also, how could I handle various types of compression?
How would I go about doing that using Java?
I'd use a decent HTML parser like Jsoup. It's then as easy as:
String html = Jsoup.connect("http://stackoverflow.com").get().html();
It handles GZIP and chunked responses and character encoding fully transparently. It offers more advantages as well, like HTML traversing and manipulation by CSS selectors, as jQuery can do. You only have to grab it as a Document, not as a String.
Document document = Jsoup.connect("http://google.com").get();
You really don't want to run basic String methods or even regex on HTML to process it.
See also:
What are the pros and cons of leading HTML parsers in Java?
Here's some tested code using Java's URL class. I'd recommend doing a better job than I do here of handling the exceptions or passing them up the call stack, though.
public static void main(String[] args) {
URL url;
InputStream is = null;
BufferedReader br;
String line;
try {
url = new URL("http://stackoverflow.com/");
is = url.openStream(); // throws an IOException
br = new BufferedReader(new InputStreamReader(is));
while ((line = br.readLine()) != null) {
System.out.println(line);
}
} catch (MalformedURLException mue) {
mue.printStackTrace();
} catch (IOException ioe) {
ioe.printStackTrace();
} finally {
try {
if (is != null) is.close();
} catch (IOException ioe) {
// nothing to see here
}
}
}
Bill's answer is very good, but you may want to do some things with the request, like compression or user-agents. The following code shows how you can add support for various types of compression to your requests.
URL url = new URL(urlStr);
HttpURLConnection conn = (HttpURLConnection) url.openConnection(); // Cast shouldn't fail
HttpURLConnection.setFollowRedirects(true);
// allow both GZip and Deflate (ZLib) encodings
conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
String encoding = conn.getContentEncoding();
InputStream inStr = null;
// create the appropriate stream wrapper based on
// the encoding type
if (encoding != null && encoding.equalsIgnoreCase("gzip")) {
inStr = new GZIPInputStream(conn.getInputStream());
} else if (encoding != null && encoding.equalsIgnoreCase("deflate")) {
inStr = new InflaterInputStream(conn.getInputStream(),
new Inflater(true));
} else {
inStr = conn.getInputStream();
}
To also set the user-agent add the following code:
conn.setRequestProperty ( "User-agent", "my agent name");
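To get from the wrapped stream to a String, the decoded bytes still have to be read and converted. The sketch below is an illustration only: it handles just gzip, assumes the page is UTF-8, and simulates the compressed response in memory instead of contacting a server. The class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ResponseDecoder {

    // Wraps the raw stream according to the Content-Encoding value; unknown or
    // missing encodings are passed through unchanged.
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Reads the whole (already decoded) stream into a UTF-8 string.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Simulate a gzip-encoded response body in memory instead of hitting a server.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        }
        InputStream body = decode(new ByteArrayInputStream(compressed.toByteArray()), "gzip");
        System.out.println(readAll(body));
    }
}
```

With a real connection, the arguments would be conn.getInputStream() and conn.getContentEncoding().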
Well, you could go with the built-in libraries such as URL and URLConnection, but they don't give very much control.
Personally I'd go with the Apache HTTPClient library.
Edit: HTTPClient has been set to end of life by Apache. The replacement is: HTTP Components
None of the above-mentioned approaches download the web page text as it looks in the browser. These days a lot of data is loaded into browsers through scripts in HTML pages. None of the above-mentioned techniques support scripts; they just download the HTML text only. HtmlUnit supports JavaScript, so if you are looking to download the web page text as it looks in the browser, then you should use HtmlUnit.
You'd most likely need to extract code from a secure web page (HTTPS protocol). In the following example, the HTML file is saved to c:\temp\filename.html. Enjoy!
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;
/**
* <b>Get the Html source from the secure url </b>
*/
public class HttpsClientUtil {
public static void main(String[] args) throws Exception {
String httpsURL = "https://stackoverflow.com";
String FILENAME = "c:\\temp\\filename.html";
BufferedWriter bw = new BufferedWriter(new FileWriter(FILENAME));
URL myurl = new URL(httpsURL);
HttpsURLConnection con = (HttpsURLConnection) myurl.openConnection();
con.setRequestProperty ( "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0" );
InputStream ins = con.getInputStream();
InputStreamReader isr = new InputStreamReader(ins, "Windows-1252");
BufferedReader in = new BufferedReader(isr);
String inputLine;
// Write each line into the file
while ((inputLine = in.readLine()) != null) {
System.out.println(inputLine);
bw.write(inputLine);
}
in.close();
bw.close();
}
}
To do so using NIO.2's powerful Files.copy(InputStream in, Path target):
URL url = new URL( "http://download.me/" );
Files.copy( url.openStream(), Paths.get("downloaded.html" ) );
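A slightly more defensive variant of the two lines above is sketched here. The class and method names are made up for illustration. It closes the stream with try-with-resources and overwrites an existing target file, which the plain two-argument Files.copy call refuses to do.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PageSaver {

    // Copies whatever the URL serves into the target file and returns the byte count.
    // try-with-resources closes the stream even if the copy fails.
    static long save(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

For example, `PageSaver.save(new URL("http://download.me/"), Paths.get("downloaded.html"))` would correspond to the snippet above.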
On a Unix/Linux box you could just run 'wget' but this is not really an option if you're writing a cross-platform client. Of course this assumes that you don't really want to do much with the data you download between the point of downloading it and it hitting the disk.
Get help from this class; it fetches the page source and filters out some information.
public class MainActivity extends AppCompatActivity {
EditText url;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate( savedInstanceState );
setContentView( R.layout.activity_main );
url = ((EditText)findViewById( R.id.editText));
DownloadCode obj = new DownloadCode();
try {
String des=" ";
String tag1= "<div class=\"description\">";
String l = obj.execute( "http://www.nu.edu.pk/Campus/Chiniot-Faisalabad/Faculty" ).get();
url.setText( l );
url.setText( " " );
String[] t1 = l.split(tag1);
String[] t2 = t1[0].split( "</div>" );
url.setText( t2[0] );
}
catch (Exception e)
{
Toast.makeText( this,e.toString(),Toast.LENGTH_SHORT ).show();
}
}
// input, extrafunctionrunparallel, output
class DownloadCode extends AsyncTask<String,Void,String>
{
@Override
protected String doInBackground(String... WebAddress) // string of webAddress separate by ','
{
String htmlcontent = " ";
try {
URL url = new URL( WebAddress[0] );
HttpURLConnection c = (HttpURLConnection) url.openConnection();
c.connect();
InputStream input = c.getInputStream();
int data;
InputStreamReader reader = new InputStreamReader( input );
data = reader.read();
while (data != -1)
{
char content = (char) data;
htmlcontent+=content;
data = reader.read();
}
}
catch (Exception e)
{
Log.i("Status : ",e.toString());
}
return htmlcontent;
}
}
}
Jetty has an HTTP client which can be used to download a web page.
package com.zetcode;
import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.api.ContentResponse;
public class ReadWebPageEx5 {
public static void main(String[] args) throws Exception {
HttpClient client = null;
try {
client = new HttpClient();
client.start();
String url = "http://example.com";
ContentResponse res = client.GET(url);
System.out.println(res.getContentAsString());
} finally {
if (client != null) {
client.stop();
}
}
}
}
The example prints the contents of a simple web page.
In the Reading a web page in Java tutorial, I have written six examples of downloading a web page programmatically in Java using URL, JSoup, HtmlCleaner, Apache HttpClient, Jetty HttpClient, and HtmlUnit.
I used the actual answer to this post (URL) and write the output into a file.
package test;
import java.net.*;
import java.io.*;
public class PDFTest {
public static void main(String[] args) throws Exception {
try {
URL oracle = new URL("http://www.fetagracollege.org");
BufferedReader in = new BufferedReader(new InputStreamReader(oracle.openStream()));
String fileName = "D:\\a_01\\output.txt";
PrintWriter writer = new PrintWriter(fileName, "UTF-8");
String inputLine;
while ((inputLine = in.readLine()) != null) {
System.out.println(inputLine);
writer.println(inputLine);
}
// Close the writer so the buffered output is actually flushed to the file.
writer.close();
in.close();
} catch(Exception e) {
// Report the failure instead of silently swallowing it.
e.printStackTrace();
}
}
}