public class Crawler {
public static void main(String[] args) {
List<String> Web = new ArrayList<String>();
Web.add("www.thehindu.com");
Web.add("www.indianexpress.com");
Web.add("www.ndtv.com");
Web.add("www.tehekla.com");
try {
for (int i = 0; i < Web.size(); i ++) {
// URL my_url = new URL("http://www.thehindu.com/");
String a = Web.get(i).toString();
System.out.println(a);
URL my_url = new URL(a);
BufferedReader br = new BufferedReader(new InputStreamReader(my_url.openStream()));
String strTemp = "";
while(null != (strTemp = br.readLine())) {
System.out.println(strTemp);
}
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
When I try to run this code, I get the following error:
java.net.MalformedURLException: no protocol: www.thehindu.com
Try adding http:// before each URL.
You need to put http:// before each website address:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
public class Crawler {
public static void main(String[] args) {
List<String> Web = new ArrayList<String>();
Web.add("http://www.thehindu.com");
Web.add("http://www.indianexpress.com");
Web.add("http://www.ndtv.com");
Web.add("http://www.tehekla.com");
try {
for (int i = 0; i < Web.size(); i++) {
// URL my_url = new URL("http://www.thehindu.com/");
String a = Web.get(i).toString();
System.out.println(a);
URL my_url = new URL(a);
BufferedReader br = new BufferedReader(new InputStreamReader(
my_url.openStream()));
String strTemp = "";
while (null != (strTemp = br.readLine())) {
System.out.println(strTemp);
}
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
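If the list of addresses comes from somewhere you don't control, a small helper can add the missing protocol for you. This is only a sketch: `withScheme` and `parseOrNull` are made-up names, and defaulting to `http://` is an assumption (the site may prefer `https://`).

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlSchemeSketch {

    // Hypothetical helper: prepends "http://" when the address has no scheme.
    static String withScheme(String address) {
        return address.contains("://") ? address : "http://" + address;
    }

    // Parses the address, returning null instead of throwing when it is malformed.
    static URL parseOrNull(String address) {
        try {
            return new URL(withScheme(address));
        } catch (MalformedURLException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Without the helper, new URL("www.thehindu.com") throws "no protocol".
        System.out.println(parseOrNull("www.thehindu.com"));
    }
}
```

With this, the entries in the original list could stay as they are and be passed through `withScheme` before `new URL(...)` is called.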
Thanks in advance for any input!
I'm getting a little familiar with how to read data from websites with Java and have tried to do this by reading data using a URLConnectionReader.
Unfortunately I get an UnknownHostException when I test the whole thing in a Java online compiler (https://www.jdoodle.com/online-java-compiler/).
Have I forgotten any imports? I proceeded according to a tutorial.
Code: (designed for online-java-compiler jdoodle):
import java.net.*;
import java.io.*;
public class URLConnectionReader {
public static void main(String[] args)
{
String output = getUrlContents("https://www.tradegate.de/orderbuch_umsaetze.php?isin=NO0010892359");
System.out.println(output);
}
private static String getUrlContents(String theUrl)
{
StringBuilder content = new StringBuilder();
try
{
URL url = new URL(theUrl);
URLConnection urlConnection = url.openConnection();
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
String line;
while ((line = bufferedReader.readLine()) != null)
{
content.append(line + "\n");
}
bufferedReader.close();
}
catch(Exception e)
{
e.printStackTrace();
}
return content.toString();
}
}
Error message:
java.net.UnknownHostException: www.tradegate.de
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:220)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.base/java.net.Socket.connect(Socket.java:591)
at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:285)
at java.base/sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:173)
at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:182)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:474)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:569)
at java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:265)
at java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:372)
at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1187)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081)
at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1515)
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:250)
at URLConnectionReader.getUrlContents(URLConnectionReader.java:21)
at URLConnectionReader.main(URLConnectionReader.java:8)
I separated the code into two classes as follows, and your code works without any exceptions:
class Main:
public class Main {
public static void main(String[] args) {
URLConnectionReader urlcr = new URLConnectionReader();
String output =
urlcr.getUrlContents("https://www.tradegate.de/orderbuch_umsaetze.php?isin=NO0010892359");
System.out.println(output);
}
}
and URLConnectionReader class:
import java.net.*;
import java.io.*;
public class URLConnectionReader {
public String getUrlContents(String theUrl)
{
StringBuilder content = new StringBuilder();
try
{
URL url = new URL(theUrl);
URLConnection urlConnection = url.openConnection();
BufferedReader bufferedReader = new BufferedReader(
new InputStreamReader(urlConnection.getInputStream()));
String line;
while ((line = bufferedReader.readLine()) != null)
{
content.append(line + "\n");
}
bufferedReader.close();
}
catch(Exception e)
{
e.printStackTrace();
}
return content.toString();
}
}
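One thing worth noting about `getUrlContents`: the reader is only closed when no exception is thrown. Below is a rough sketch of the same read loop with try-with-resources. The class and method names are made up, UTF-8 is an assumed encoding, and an in-memory stream stands in for `urlConnection.getInputStream()` so it runs without network access. Also keep in mind that an `UnknownHostException` means the host name could not be resolved, so the network access of the environment is worth checking; whether that is the cause in the online compiler is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamTextSketch {

    // Reads every line from the stream; try-with-resources closes the reader
    // even if readLine() throws.
    static String readAll(InputStream in) {
        StringBuilder content = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) { // UTF-8 is an assumption
            String line;
            while ((line = reader.readLine()) != null) {
                content.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return content.toString();
    }

    public static void main(String[] args) {
        // The in-memory stream stands in for urlConnection.getInputStream().
        InputStream in = new ByteArrayInputStream("a\nb".getBytes(StandardCharsets.UTF_8));
        System.out.print(readAll(in));
    }
}
```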
This is my code. I'm trying to compare two .csv files, match them, and save the common part in another file. How do I do it?
This is the content of the item_no.csv file
1
2
3
4
5
This is the content of item_desc.csv file
1,chocolate,100
2,biscuit,20
3,candy,10
4,lollipop,5
5,colddrink,50
6,sandwitch,70
EDIT This is the expected output:
1,chocolate,100
2,biscuit,20
3,candy,10
4,lollipop,5
5,colddrink,50
This is my code:
package fuu;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import com.sun.org.apache.xerces.internal.impl.xpath.regex.ParseException;
public class Demo {
public static void main(String[] args) throws ParseException, IOException {
// TODO Auto-generated method stub
BufferedReader br = new BufferedReader(new FileReader("/home/yotta/eclipse/workspace/Test/WebContent/doc/item_no.csv"));
BufferedReader br1 = new BufferedReader(new FileReader("/home/yotta/eclipse/workspace/Test/WebContent/doc/item_desc.csv"));
String line = null;
String line1 = null;
String line2 = null;
String[] str=null;
String[] str1=null;
try {
while((line = br.readLine())!=null){
str = line.split(",");
System.out.println(str[0]);
}
while((line1 = br1.readLine())!=null){
str1 = line1.split(",");
System.out.println(str1[0]+" "+str1[1]+" "+str1[2]);
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
You could separate the different steps.
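import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class Demo {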
public static void main(String[] args) throws IOException {
Map<String, String> descMap = new HashMap<>();
String line;
// read all item descriptions
try (BufferedReader br1 = new BufferedReader(new FileReader("item_desc.csv"))) {
while ((line = br1.readLine()) != null) {
int itemNbrSeparator = line.indexOf(',');
String itemNbr = line.substring(0, itemNbrSeparator);
descMap.put(itemNbr, line);
}
}
List<String> matched = new ArrayList<>();
// read the item numbers and store each matched
try (BufferedReader br = new BufferedReader(new FileReader("item_no.csv"))) {
while ((line = br.readLine()) != null) {
if (descMap.containsKey(line)) {
System.out.println(descMap.get(line));
matched.add(descMap.get(line));
}
}
}
// output all matched
Path outFile = Paths.get("item_match.csv");
Files.write(outFile, matched, Charset.defaultCharset(), new LinkOption[0]);
}
}
One way is this:
List<String> lines1 = new ArrayList<String>();
while ((line = br.readLine()) != null) {
str = line.split(",");
lines1.add(line);
System.out.println(str[0]);
}
List<String> lines2 = new ArrayList<String>();
while ((line = br1.readLine()) != null) {
str = line.split(",");
System.out.println(str[0]);
if(lines1.contains(str[0])){
lines2.add(line);
}
}
// print the matched description lines, not the item numbers
for (String l : lines2) {
System.out.println(l);
}
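Both approaches above boil down to the same idea: collect the item numbers, then keep every description line whose first column is one of them. Here is a small sketch of just that matching step on in-memory lists, so the file reading and writing can be handled separately. `CsvMatchSketch` and `match` are made-up names, and it assumes plain comma-separated lines without quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CsvMatchSketch {

    // Keeps each description line whose first column appears in itemNumbers.
    static List<String> match(List<String> itemNumbers, List<String> descLines) {
        Set<String> wanted = new HashSet<>();
        for (String number : itemNumbers) {
            wanted.add(number.trim());
        }
        List<String> matched = new ArrayList<>();
        for (String line : descLines) {
            // Assumes no quoted fields: the key is everything before the first comma.
            String key = line.split(",", 2)[0].trim();
            if (wanted.contains(key)) {
                matched.add(line);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> numbers = Arrays.asList("1", "2");
        List<String> descriptions = Arrays.asList("1,chocolate,100", "2,biscuit,20", "6,sandwitch,70");
        for (String line : match(numbers, descriptions)) {
            System.out.println(line);
        }
    }
}
```

A `HashSet` lookup avoids scanning the whole number list for every description line, which `List.contains` would do.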
Hello, I'm creating a Hangman game and I want the array list of words to come from the internet. It's not initializing for me. Can anyone help? This is the code.
public String getaword()
{
try
{
URL url = new URL ("http://dictionary-thesaurus.com/wordlists/Adjectives%28929%29.txt");
//URLConnection urlConnection = (URLConnection)url.openConnection();
//inStream = new InputStreamReader(urlConnection.getInputStream());
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
String str=null;
ArrayList<String> lines = new ArrayList<String>();
while((str = in.readLine()) != null)
{
lines.add(str);
words = lines.toArray(new String[lines.size()]);
}
}
catch (Exception e)
{
e.getStackTrace();
}
Random r = new Random();
int num;
num = r.nextInt(words.length);
return words[num];
}
Try this.
public static void main(String[] args) {
ArrayList<String> lines = new ArrayList<String>();
try {
URL url = new URL ("http://dictionary-thesaurus.com/wordlists/Adjectives(929).txt");
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
String str = null;
while((str = in.readLine()) != null) {
lines.add(str);
}
}
catch (Exception e) {
e.printStackTrace();
}
System.out.println(lines);
}
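Once the list loads, picking the random word can be kept separate from the download. The sketch below is only an illustration: `WordPickerSketch`, `readWords` and `pick` are made-up names, and it reads from any `Reader` so you can try it with a `StringReader` before pointing it at `new InputStreamReader(url.openStream())`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class WordPickerSketch {

    // Collects the non-blank lines from the reader, one word per line.
    static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    words.add(line.trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Picks one word at random; fails loudly if nothing was loaded.
    static String pick(List<String> words, Random random) {
        if (words.isEmpty()) {
            throw new IllegalStateException("word list is empty");
        }
        return words.get(random.nextInt(words.size()));
    }

    public static void main(String[] args) {
        List<String> words = readWords(new StringReader("big\nsmall\nred\n"));
        System.out.println(pick(words, new Random()));
    }
}
```

Throwing on an empty list makes a failed download visible right away instead of surfacing later as a confusing error in the game.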
I am working on a simple server in Java that should be able to transfer a file between computers. I am getting a NullPointerException on line 77 of Protocol.java. Here is the stack trace:
java.lang.NullPointerException
at Protocol.processInput(Protocol.java:77)
at Server.main(Server.java:41)
Why does this happen? There are no null references on line 77!
Client.java:
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.ArrayList;
import javax.swing.JFileChooser;
import javax.swing.JOptionPane;
import javax.swing.filechooser.FileNameExtensionFilter;
public class Client {
private static boolean filein = false;
private static ArrayList<String> fln = new ArrayList<>();
public static void main(String[] args) throws IOException {
if (args.length != 2) {
System.err.println(
"Usage: java Client <host name> <port number>");
System.exit(1);
}
String hostName = args[0];
int portNumber = Integer.parseInt(args[1]);
try (
Socket kkSocket = new Socket(hostName, portNumber);
PrintWriter out = new PrintWriter(kkSocket.getOutputStream(), true);
BufferedReader in = new BufferedReader(
new InputStreamReader(kkSocket.getInputStream()));
) {
BufferedReader stdIn =
new BufferedReader(new InputStreamReader(System.in));
String fromServer;
String fromUser = null;
while ((fromServer = in.readLine()) != null) {
System.out.println("Server: " + fromServer);
if (fromServer.equals("#file")) { filein = true;
fromUser = "";}
else if(fromServer.equals("#end#")) {
filein = false;
JFileChooser chooser = new JFileChooser();
int returnVal = chooser.showSaveDialog(null);
if(returnVal == JFileChooser.APPROVE_OPTION) {
String fname = chooser.getSelectedFile().getAbsolutePath();
File f = new File(fname);
f.createNewFile();
PrintWriter p = new PrintWriter(f);
for(int i = 0; i < fln.size(); i++) {
p.println(fln.get(i));
}
p.close();
JOptionPane.showMessageDialog(null, "File saved!");
}
}
else if (filein == true) {
fln.add(fromServer);
System.out.println(fln.get(fln.size() - 1));
}
if (fromServer.equals("Bye."))
break;
if (!filein) fromUser = stdIn.readLine();
else if (filein) fromUser = "#contintueFileRun";
if (fromUser != null) {
System.out.println("Client: " + fromUser);
out.println(fromUser);
}
}
} catch (UnknownHostException e) {
System.err.println("Don't know about host " + hostName);
System.exit(1);
} catch (IOException e) {
System.err.println("Couldn't get I/O for the connection to " +
hostName);
System.exit(1);
}
}
}
Server.java:
import java.io.*;
import java.net.*;
import static java.lang.System.out;
/**
* Title: FTP Server
* @author Galen Nare
* @version 1.0
*/
public class Server {
public static void main(String[] args) throws IOException {
out.println("Starting server!");
if (args.length != 1) {
System.err.println("Usage: java Server <port number>");
System.exit(1);
}
int portNumber = Integer.parseInt(args[0]);
try (
ServerSocket serverSocket = new ServerSocket(portNumber);
Socket clientSocket = serverSocket.accept();
PrintWriter out =
new PrintWriter(clientSocket.getOutputStream(), true);
BufferedReader in = new BufferedReader(
new InputStreamReader(clientSocket.getInputStream()));
) {
String inputLine, outputLine;
// Initiate conversation with client
Protocol kkp = new Protocol();
outputLine = kkp.processInput("");
out.println(outputLine);
while ((inputLine = in.readLine()) != null) {
outputLine = kkp.processInput(inputLine);
out.println(outputLine);
if (outputLine.equals("Bye."))
break;
}
} catch (IOException e) {
System.out.println("Exception caught when trying to listen on port "
+ portNumber + " or listening for a connection");
System.out.println(e.getMessage());
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
And finally, Protocol.java:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;
public class Protocol {
enum ServerState {
STARTING,
WAITING
}
ArrayList<String> lns;
boolean fileout = false;
int i = 0;
private ServerState state = ServerState.STARTING;
public String processInput(String theInput) throws Exception {
String theOutput = "";
if (state == ServerState.STARTING) {
theOutput = "Hello, Client!";
state = ServerState.WAITING;
}
if (!theInput.equals("")) {
if(theInput.length() > 10 && theInput.startsWith("e")) {
if (theInput.substring(0,11).equalsIgnoreCase("executecmd ")) {
theOutput = theInput.substring(11);
System.out.println(theOutput);
try {
@SuppressWarnings("unused")
Process child = Runtime.getRuntime().exec(theInput.substring(11));
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
theOutput = "Executed " + theInput.substring(11) + ".";
}
} else if (theInput.equalsIgnoreCase("stop")) {
theOutput = "Stopping Server!";
System.exit(0);
} else if (theInput.equalsIgnoreCase("executecmd")) {
theOutput = "Usage: executecmd <command [-options]>";
} else if (theInput.equalsIgnoreCase("getfile")) {
theOutput = "Usage: getfile <file>";
} else if(theInput.length() > 7 && theInput.startsWith("g")) {
System.out.println("in");
if (theInput.substring(0,8).equalsIgnoreCase("getfile ")) {
theOutput = theInput.substring(8);
File f = new File(theInput.substring(8));
Scanner scan = new Scanner(f);
ArrayList<String> lns = new ArrayList<>();
while(scan.hasNext()) {
lns.add(scan.nextLine());
}
for (int i=0; i < lns.size(); i++) {
System.out.println(lns.get(i));
}
scan.close();
lns.add("#end#");
theOutput = "#file";
fileout = true;
}
} else if (fileout && i < lns.size()) {
theOutput = lns.get(i);
i++;
} else if (fileout && i == lns.size()) {
i = 0;
fileout = false;
} else {
theOutput = "That is not a command!";
}
}
System.out.print(theOutput);
return theOutput;
}
}
Thanks in advance!
You're never initializing lns in Protocol, so it's always a null reference. You may be able to get away with just changing the declaration to:
private List<String> lns = new ArrayList<String>();
(I've made it private and changed the type to List just out of habit...)
You should also consider giving it a more readable name - is it meant to represent lines? If so, call it lines!
(Next, consider why you weren't able to diagnose this yourself. Did you step through this in the debugger? Why did you think there were no null references on line 77? What diagnostic steps did you take in terms of adding extra logging etc? It's important to use errors like this as a learning experience to make future issues more tractable.)
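It may also be worth looking at the getfile branch in the question: it declares `ArrayList<String> lns = new ArrayList<>();` inside `processInput`, which creates a local variable that hides the field of the same name. The stripped-down sketch below shows the effect; the class and method names are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ShadowingSketch {

    private List<String> lines = new ArrayList<>();

    // Mirrors the question: the local declaration hides the field,
    // so the field never receives the data.
    void loadIntoLocal(List<String> source) {
        List<String> lines = new ArrayList<>(source); // local variable, shadows the field
        lines.add("#end#");
    }

    // Assigns to the field, so later calls can see the data.
    void loadIntoField(List<String> source) {
        lines = new ArrayList<>(source);
        lines.add("#end#");
    }

    int size() {
        return lines.size();
    }

    public static void main(String[] args) {
        ShadowingSketch sketch = new ShadowingSketch();
        sketch.loadIntoLocal(Arrays.asList("a", "b"));
        System.out.println(sketch.size()); // field is still empty here
        sketch.loadIntoField(Arrays.asList("a", "b"));
        System.out.println(sketch.size());
    }
}
```

So even with the field initialized, the file contents would only reach the later calls if the getfile branch assigns to the field rather than declaring a new local.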
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
public class Test {
List<String> knownWordsArrayList = new ArrayList<String>();
List<String> wordsArrayList = new ArrayList<String>();
List<String> newWordsArrayList = new ArrayList<String>();
String toFile = "";
public void readKnownWordsFile() {
try {
FileInputStream fstream2 = new FileInputStream("knownWords.txt");
BufferedReader br2 = new BufferedReader(new InputStreamReader(fstream2, "UTF-8"));
String strLine;
while ((strLine = br2.readLine()) != null) {
knownWordsArrayList.add(strLine.toLowerCase());
}
HashSet h = new HashSet(knownWordsArrayList);
// h.removeAll(knownWordsArrayList);
knownWordsArrayList = new ArrayList<String>(h);
// for (int i = 0; i < knownWordsArrayList.size(); i++) {
// System.out.println(knownWordsArrayList.get(i));
// }
} catch (Exception e) {
// TODO: handle exception
}
}
public void readFile() {
try {
// Open the file that is the first
// command line parameter
FileInputStream fstream = new FileInputStream("Smallville 4x02.de.srt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
String numberedLineRemoved = "";
String strippedInput = "";
String[] words;
String trimmedString = "";
String temp = "";
// Read File Line By Line
while ((strLine = br.readLine()) != null) {
temp = strLine.toLowerCase();
// Print the content on the console
numberedLineRemoved = numberedLine(temp);
strippedInput = numberedLineRemoved.replaceAll("\\p{Punct}", "");
if ((strippedInput.trim().length() != 0) || (!strippedInput.contains("")) || (strippedInput.contains(" "))) {
words = strippedInput.split("\\s+");
for (int i = 0; i < words.length; i++) {
if (words[i].trim().length() != 0) {
wordsArrayList.add(words[i]);
}
}
}
}
HashSet h = new HashSet(wordsArrayList);
h.removeAll(knownWordsArrayList);
newWordsArrayList = new ArrayList<String>(h);
// HashSet h = new HashSet(wordsArrayList);
// wordsArrayList.clear();
// newWordsArrayList.addAll(h);
for (int i = 0; i < newWordsArrayList.size(); i++) {
toFile = newWordsArrayList.get(i) + ".\n";
// System.out.println(newWordsArrayList.get(i) + ".");
System.out.println();
}
System.out.println(newWordsArrayList.size());
// Close the input stream
in.close();
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
public String numberedLine(String string) {
if (string.matches(".*\\d.*")) {
return "";
} else {
return string;
}
}
public void writeToFile() {
try {
// Create file
FileWriter fstream = new FileWriter("out.txt");
BufferedWriter out = new BufferedWriter(fstream);
out.write(toFile);
// Close the output stream
out.close();
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
public static void main(String[] args) {
Test test = new Test();
test.readKnownWordsFile();
test.readFile();
test.writeToFile();
}
}
How can I read äöüß from file?
Would string.toLowerCase() handle these properly as well?
And when I go to print words containing any of äöüß, how can I print the word properly?
When I print to console I get
Außerdem
weiß
for Außerdem
weiß
How can I fix this?
I tried:
BufferedReader br = new BufferedReader(new InputStreamReader(in, "UTF-8"));
But now I'm getting aufkl?ren instead of aufklären, and it's messing up in other places as well.
I updated the code to see if it would write to the file properly, but I'm only getting one word in the file.
You need to read files using the charset that was used to create the file. If you're on a Windows machine, that's probably cp1252. So:
BufferedReader br = new BufferedReader(new InputStreamReader(in, "Cp1252"));
If that doesn't work, most text editors are capable of telling you what encoding is used for a given document.
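To see the effect of the charset in isolation, here is a small sketch that decodes the same bytes twice. The names are made up, and the demo bytes are produced with ISO-8859-1 only because ß has the same byte value there as in cp1252; for a real cp1252 file you would pass `Charset.forName("windows-1252")` instead. The `\u00DF` escape is ß, written that way so the source file's own encoding doesn't get in the way.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class CharsetReadSketch {

    // Decodes the stream with the given charset and returns its lines.
    static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // One byte per character, as a single-byte encoding would store it.
        byte[] fileBytes = "Au\u00DFerdem".getBytes(StandardCharsets.ISO_8859_1);
        String matching = readLines(new ByteArrayInputStream(fileBytes), StandardCharsets.ISO_8859_1).get(0);
        String mismatched = readLines(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8).get(0);
        // Compare against the expected word instead of printing it,
        // so the console encoding does not affect the result.
        System.out.println(matching.equals("Au\u00DFerdem"));
        System.out.println(mismatched.equals("Au\u00DFerdem"));
    }
}
```

Note that the console has its own encoding too, so a word can be decoded correctly and still print wrongly; comparing strings as above separates the two problems.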