How to load a custom XML feed with Jsoup? [java]

I have a Java method that fetches a feed document (via HTTP) and then parses the feed (which is neither JSON nor XML).
This is the method:
public ArrayList<ArrayList<String>> getFeed(String type)
{
    String feed = "";
    String address = "";
    Document file;
    // FEED URLs
    switch (type) {
        case "news":
            address = "https://[domain]/svc/feeds/news/6001?subtree=false&imagesize=medium-square";
            break;
        case "events":
            address = "http://[domain]/svc/feeds/events/6001?subtree=true&imagesize=medium-square&from=%5bfromDate%5d&to=%5btoDate";
    }
    try {
        HttpURLConnection connection = (HttpURLConnection) (new URL(address)).openConnection();
        //TODO: #Test
        //----------------------------\/--THIS ONE WILL CAUSE ERRORS!!
        file = (Document) connection.getContent();
        connection.disconnect();
        // OUTPUT
        feed = file.getElementsByAttribute("pre").text();
        stream = new StringReader(feed);
    } catch (Exception e) {}
    // BEGIN PARSING, THEN OUTPUT
    try {
        return parse();
    } catch (FeedParseException e) {}
    // default
    return null;
}
It's not working: the 'file' object causes a NullPointerException.
So how do I debug this more precisely, given that the failing code doesn't seem to be open source?
P.S.: I'm not testing the "events" case, so don't worry about the GET parameters there.
Here's my stack trace:
I don't see how it helps, though...

You can pass the URL directly to Jsoup.
Instead of:
HttpURLConnection connection = (HttpURLConnection) (new URL(address)).openConnection();
//TODO: #Test
//----------------------------\/--THIS ONE WILL CAUSE ERRORS!!
file = (Document)connection.getContent();
connection.disconnect();
do:
file = Jsoup
        .connect(address)
        .timeout(10 * 1000)
        .ignoreContentType(true)
        .get();
(Tested with Jsoup 1.8.3.)
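If the feed is XML rather than HTML, it can also help to tell Jsoup to use its XML parser so the document isn't normalized into an HTML structure. A minimal sketch under that assumption (the URL is a placeholder):

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class FeedFetcher {
    // Fetches the feed and parses it with Jsoup's XML parser instead of the
    // default HTML parser, so the element tree keeps the feed's own tags.
    public static Document fetchXml(String address) throws IOException {
        return Jsoup.connect(address)
                .timeout(10 * 1000)
                .ignoreContentType(true)   // the feed is not served as text/html
                .parser(Parser.xmlParser())
                .get();
    }
}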

Related

Exception is java.net.ProtocolException: Invalid HTTP method: PATCH

I am trying to update user metadata using the HttpURLConnection PATCH API.
I googled and found this useful link, which I used:
https://stackoverflow.com/questions/25163131/httpurlconnection-invalid-http-method-patch
These are my steps to update the user metadata:
1. Call the database to get the user information that needs to be updated; suppose the database returns 1000 users.
2. Call the GET xxx/users/{userId} API 1000 times to check whether each database user exists; suppose it returns 800 active users that need to be updated afterwards.
3. Call the PATCH xxx/users/{userId} API 800 times to update the user metadata.
My code works fine if the record count is 200-250 or fewer, but if the size increases to, say, 1000, the application throws an exception saying:
java.net.ProtocolException: Invalid HTTP method: PATCH
Here is my code.
public static void main(String[] args) throws FileNotFoundException, IOException {
    // Call the DB to get user metadata
    List<CEUser> ca = getUserMetada(prop);
    // Call the GET users/{userId} API to check whether each user exists
    List<CEUser> activeUsers = getActiveUsers(ca, prop);
    // Call the PATCH users/{userId} API to update user metadata
    updateUsername(activeUsers, prop);
}
public List<CEUser> getActiveUsers(List<CEUser> CEUsers, Properties prop) {
    try {
        List<CEUser> activeCEUsers = new ArrayList<>();
        for (CEUser ca : CEUsers) {
            URL url = new URL(prop.getCeBaseURL() + "users/" + ca.getUserId());
            HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
            httpConnection.setRequestMethod("GET");
            httpConnection.setRequestProperty("Accept", "application/json");
            httpConnection.connect();
            if (httpConnection.getResponseCode() == 200)
                activeCEUsers.add(ca);
            httpConnection.disconnect();
        }
        return activeCEUsers;
    } catch (Exception e) {
        System.out.println("Exception occurred in getActiveUsers() method ");
    }
}
private static void allowMethods(String... methods) {
    try {
        Field methodsField = HttpURLConnection.class.getDeclaredField("methods");
        Field modifiersField = Field.class.getDeclaredField("modifiers");
        modifiersField.setAccessible(true);
        modifiersField.setInt(methodsField, methodsField.getModifiers() & ~Modifier.FINAL);
        methodsField.setAccessible(true);
        String[] oldMethods = (String[]) methodsField.get(null);
        Set<String> methodsSet = new LinkedHashSet<>(Arrays.asList(oldMethods));
        methodsSet.addAll(Arrays.asList(methods));
        String[] newMethods = methodsSet.toArray(new String[0]);
        methodsField.set(null /* static field */, newMethods);
    } catch (NoSuchFieldException | IllegalAccessException e) {
        throw new IllegalStateException(e);
    }
}
public List<CEUser> updateUsername(List<CEUser> ceUsers, Properties prop) {
    try {
        allowMethods("PATCH");
        List<CEUser> updatedUsername = new ArrayList<>();
        for (CEUser ca : ceUsers) {
            // Construct username
            String username = "some static email";
            // Construct the email as JSON to send in the body
            String json = constructJson("email", username);
            URL url = new URL(prop.getCeBaseURL() + "users/" + ca.getUserId());
            HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
            httpConnection.setRequestMethod("PATCH");
            httpConnection.setDoOutput(true);
            httpConnection.setRequestProperty("Accept-Charset", "UTF-8");
            httpConnection.setRequestProperty("Content-Type", "application/json;charset=UTF-8");
            try (OutputStream output = httpConnection.getOutputStream()) {
                output.write(json.getBytes("UTF-8"));
            }
            httpConnection.connect();
            if (httpConnection.getResponseCode() == 200) {
                ca.setUsername(username); // set updated username
                updatedUsername.add(ca);
            }
            httpConnection.disconnect();
        }
        return updatedUsername;
    } catch (Exception e) {
        System.out.println("Exception occurred in updateUsername() method");
    }
}
Any idea why the same code works for 200-250 records but not for 1000?
Thanks
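If the server (or a gateway in front of it) honors the common X-HTTP-Method-Override convention, one way to sidestep the reflection hack entirely is to tunnel PATCH through POST. This is only a sketch under that assumption; check whether your API supports the header before relying on it:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PatchWorkaround {
    // Sends a PATCH tunneled through POST. Assumes the server honors the
    // X-HTTP-Method-Override header; verify this against your API first.
    public static int patchViaPost(String address, String json) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("X-HTTP-Method-Override", "PATCH");
        conn.setRequestProperty("Content-Type", "application/json;charset=UTF-8");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes("UTF-8"));
        }
        int responseCode = conn.getResponseCode();
        conn.disconnect();
        return responseCode;
    }
}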

opennlp.tools.postag.POSTaggerME.train() java.lang.NullPointerException

I have the same problem: I get InputStream = null. I'm using IntelliJ IDEA and OpenNLP 1.9.1 on Ubuntu 18.04.
public void makeDataTrainingModel() {
    model = null;
    System.out.println("POS model started");
    //InputStream dataIn = null;
    InputStreamFactory dataIn = null;
    try {
        dataIn = new InputStreamFactory() {
            public InputStream createInputStream() throws IOException {
                return NLPClassifier.class.getResourceAsStream("/home/int/src/main/resources/en-pos.txt");
            }
        };
        // I get the null pointer here in dataIn
        ObjectStream<String> lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
        ObjectStream<POSSample> sampleStream = new WordTagSampleStream(lineStream);
        // This training part DOES NOT WORK?
        model = POSTaggerME.train("en", sampleStream, TrainingParameters.defaultParams(), null);
    } catch (IOException e) {
        // Failed to read or parse training data, training failed
        e.printStackTrace();
    } finally {
        if (dataIn != null) {
            System.out.println("InputStreamFactory was not created!");
        }
    }
    System.out.println("POS model done...");
    System.out.println("Success generate model...");
    // write the data model
    OutputStream modelOut = null;
    try {
        String currentDir = new File("").getAbsolutePath();
        modelOut = new BufferedOutputStream(new FileOutputStream(currentDir + "//src//main//resources//example-bad-model.dat"));
        model.serialize(modelOut);
    } catch (IOException e) {
        // Failed to save model
        e.printStackTrace();
    } finally {
        if (modelOut != null) {
            try {
                modelOut.close();
            } catch (IOException e) {
                // Failed to correctly save model.
                // Written model might be invalid.
                e.printStackTrace();
            }
        }
    }
    System.out.println("Model generated and treated successfully...");
}
I get a null pointer in the input stream, and this error:
InputStreamFactory was not created!
Exception in thread "main" java.lang.NullPointerException
    at java.io.Reader.<init>(Reader.java:78)
    at java.io.InputStreamReader.<init>(InputStreamReader.java:113)
    at opennlp.tools.util.PlainTextByLineStream.reset(PlainTextByLineStream.java:57)
    at opennlp.tools.util.PlainTextByLineStream.<init>(PlainTextByLineStream.java:48)
    at opennlp.tools.util.PlainTextByLineStream.<init>(PlainTextByLineStream.java:39)
    at NLPClassifier.makeDataTrainingModel(NLPClassifier.java:98)
    at NlpProductClassifier.main(NlpProductClassifier.java:39)
Data looks like this:
profit_profit shell_environment 384912_CD bucks_currency
salary_profit finger_body 913964_CD usd_currency
profit_profit faith_law 3726_CD rur_currency
profit_profit game_entertainment 897444_CD dollar_currency
got_buy gift_jewelery 534841_CD rub_currency
Why does the stream not open, and why does it throw an exception?
If getResourceAsStream returns null, it means that the resource wasn't found.
You should check for null and handle it, for example by throwing an exception (an IOException or FileNotFoundException in this case, since IOException and its subclasses are allowed by the throws declaration). You shouldn't pass the null on to the rest of your code.
NLPClassifier.class.getResourceAsStream("/home/int/src/main/resources/en-pos.txt") won't work, because resources have the same structure as Java packages, except that dots are replaced with slashes. It's not a path in the file system.
Change it to: getResourceAsStream("/en-pos.txt") (because your file is at the root of the package hierarchy)
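A minimal sketch of that null check, assuming en-pos.txt sits at the root of the classpath as suggested above (the class name is illustrative):

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.util.InputStreamFactory;

public class SafeFactory {
    // Fails fast with a clear message instead of handing null to OpenNLP.
    public static InputStreamFactory forClasspathResource(final String name) {
        return new InputStreamFactory() {
            @Override
            public InputStream createInputStream() throws IOException {
                InputStream in = SafeFactory.class.getResourceAsStream(name);
                if (in == null) {
                    throw new FileNotFoundException("Resource " + name + " not found on the classpath");
                }
                return in;
            }
        };
    }
}

The line stream would then be built as new PlainTextByLineStream(SafeFactory.forClasspathResource("/en-pos.txt"), "UTF-8").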
I changed my code as Erwin Bolwidt suggested:
/* I commented out this part:
return NLPClassifier.class.getResourceAsStream("/home/interceptor/src/main/resources/en-pos.txt");
*/
// Instead, load from my resources location (/Project/src/main/resources):
return getClass().getClassLoader().getResourceAsStream("en-pos.txt");
After that, I found "Apache OpenNLP: java.io.FileInputStream cannot be cast to opennlp.tools.util.InputStreamFactory", which describes a similar problem with other methods. @schrieveslaach says:
You need an instance of InputStreamFactory which will retrieve your InputStream. Additionally, the factory argument must not be null (TokenNameFinderFactory there; posFactory in my case)!
/*
 * The factory must not be null. Pass posModel.getFactory() instead of null:
 * model = POSTaggerME.train("en", sampleStream, TrainingParameters.defaultParams(), null);
 */
model = POSTaggerME.train("en", sampleStream, TrainingParameters.defaultParams(), posModel.getFactory());
The full code of the project is in this repo: https://github.com/AlexTitovWork/NLPclassifier

Android TabLayout and using LoaderCallbacks to populate a Fragment

So I'm working on an assignment where I have to create a TabLayout representing different categories of news, where the news is retrieved using the Bing search API and the JSON is parsed and used to populate the ListView in the three Fragments that make up the TabLayout. I'm also using a ViewPager.
My issue is that for some reason, the content of all three Fragments is the same... same article results. Why is this? I'm using Loader IDs, and the loader is initialized in the onActivityCreated() method. Is there a way I can load the articles relevant to the current Fragment when the user swipes over to that tab?
Here are the relevant methods of my Fragments. They're almost identical in each Fragment, with the exception of the LOADER_ID and CATEGORY_NAME values.
@Override
public void onActivityCreated(Bundle savedInstanceState) {
    super.onActivityCreated(savedInstanceState);
    // Get a reference to the ConnectivityManager to check the state of network connectivity
    ConnectivityManager connMgr = (ConnectivityManager)
            getContext().getSystemService(Context.CONNECTIVITY_SERVICE);
    // Get details on the currently active default data network
    NetworkInfo networkInfo = connMgr.getActiveNetworkInfo();
    // If there is a network connection, fetch data
    if (networkInfo != null && networkInfo.isConnected()) {
        // Get a reference to the LoaderManager, in order to interact with loaders.
        LoaderManager loaderManager = getLoaderManager();
        // Initialize the loader. Pass in the int ID constant defined above and pass in null for
        // the bundle. Pass in this fragment for the LoaderCallbacks parameter (which is valid
        // because this fragment implements the LoaderCallbacks interface).
        loaderManager.initLoader(WorldFragment.LOADER_ID, null, this);
    } else {
        // Otherwise, display an error.
        // First, hide the loading indicator so the error message will be visible.
        View loadingIndicator = getActivity().findViewById(R.id.loading_indicator);
        loadingIndicator.setVisibility(View.GONE);
        // Update the empty state with a no-connection error message.
        mEmptyStateTextView.setText(R.string.no_internet_connection);
    }
}

@Override
public Loader<List<Article>> onCreateLoader(int id, Bundle args) {
    return new ArticleLoader(this.getContext(), WorldFragment.CATEGORY_NAME);
}

@Override
public void onLoadFinished(Loader<List<Article>> loader, List<Article> articles) {
    // Hide the loading indicator because the data has been loaded
    View loadingIndicator = getActivity().findViewById(R.id.loading_indicator);
    loadingIndicator.setVisibility(View.GONE);
    // Set the empty state text to display "No articles found."
    mEmptyStateTextView.setText(R.string.no_articles);
    // Clear the adapter of previous article data
    adapter.clear();
    // If there is a valid list of {@link Article}s, then add them to the adapter's
    // data set. This will trigger the ListView to update.
    if (articles != null && !articles.isEmpty()) {
        adapter.addAll(articles);
    }
}

@Override
public void onLoaderReset(Loader<List<Article>> loader) {
    adapter.clear();
}
And here is the source for my ArticleLoader class. The "fetchArticleData" call is what retrieves the Article objects by parsing the JSON into Article objects.
public class ArticleLoader extends AsyncTaskLoader<List<Article>> {
    private static final String LOG_TAG = ArticleLoader.class.getName();
    private String category;

    public ArticleLoader(Context context, String category) {
        super(context);
        this.category = category;
    }

    @Override
    protected void onStartLoading() {
        forceLoad();
    }

    @Override
    public List<Article> loadInBackground() {
        if (category == null) {
            return null;
        }
        // Perform the network request, parse the response, and extract a list of articles.
        List<Article> articles = QueryUtils.fetchArticleData(category);
        return articles;
    }
}
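One side note on the loader itself: onStartLoading() calls forceLoad() unconditionally, so the request re-runs every time a tab's fragment starts. A common AsyncTaskLoader pattern, sketched here as additions inside ArticleLoader and only an assumption about the desired behavior, is to cache the last result:

// Sketch: cache the loaded articles so swiping between tabs does not
// re-trigger the network request each time the loader starts.
private List<Article> cached;

@Override
protected void onStartLoading() {
    if (cached != null) {
        deliverResult(cached);   // reuse the previous result on restart
    } else {
        forceLoad();             // no result yet, so do the work
    }
}

@Override
public void deliverResult(List<Article> data) {
    cached = data;               // remember the result for later starts
    super.deliverResult(data);
}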
As per request, here is the QueryUtils class.
public class QueryUtils {
    private static final String LOG_TAG = QueryUtils.class.getSimpleName();
    private static final String REQUEST_BASE_URL = "https://api.cognitive.microsoft.com/bing/v5.0/news/";
    private static final String API_KEY = "redacted";

    /**
     * Create a private constructor because no one should ever create a {@link QueryUtils} object.
     * This class is only meant to hold static variables and methods, which can be accessed
     * directly from the class name QueryUtils (an object instance of QueryUtils is not needed).
     */
    private QueryUtils() {
    }

    /**
     * Query the news API and return a list of {@link Article} objects.
     */
    public static List<Article> fetchArticleData(String category) {
        // Create the URL connection
        HttpURLConnection conn = createUrlConnection(category);
        // Perform the HTTP request to the URL and receive a JSON response back
        String jsonResponse = null;
        try {
            jsonResponse = makeHttpRequest(conn);
        } catch (IOException e) {
            Log.e(LOG_TAG, "Problem making the HTTP request.", e);
        }
        // Extract relevant fields from the JSON response and create a list of {@link Article}s
        List<Article> articles = extractFeatureFromJson(jsonResponse);
        // Return the list of {@link Article}s
        return articles;
    }

    /**
     * Returns a new connection for the given category.
     */
    private static HttpURLConnection createUrlConnection(String category) {
        URL url = null;
        HttpURLConnection conn = null;
        try {
            url = new URL(QueryUtils.REQUEST_BASE_URL);
            conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Category", category);
            conn.setRequestProperty("Ocp-Apim-Subscription-Key", QueryUtils.API_KEY);
            conn.setDoInput(true);
        } catch (Exception e) {
            Log.e(LOG_TAG, "Problem building the URL connection ", e);
        }
        return conn;
    }

    /**
     * Make an HTTP request to the given URL and return a String as the response.
     */
    private static String makeHttpRequest(HttpURLConnection conn) throws IOException {
        String jsonResponse = "";
        // If the connection is null, then return early.
        if (conn == null) {
            return jsonResponse;
        }
        InputStream inputStream = null;
        try {
            conn.connect();
            // If the request was successful (response code 200),
            // then read the input stream and parse the response.
            if (conn.getResponseCode() == 200) {
                inputStream = conn.getInputStream();
                jsonResponse = readFromStream(inputStream);
            } else {
                Log.e(LOG_TAG, "Error response code: " + conn.getResponseCode());
            }
        } catch (IOException e) {
            Log.e(LOG_TAG, "Problem retrieving the article JSON results.", e);
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
            if (inputStream != null) {
                // Closing the input stream could throw an IOException, which is why
                // the method signature specifies that an IOException could be thrown.
                inputStream.close();
            }
        }
        return jsonResponse;
    }

    /**
     * Convert the {@link InputStream} into a String which contains the
     * whole JSON response from the server.
     */
    private static String readFromStream(InputStream inputStream) throws IOException {
        StringBuilder output = new StringBuilder();
        if (inputStream != null) {
            InputStreamReader inputStreamReader = new InputStreamReader(inputStream, Charset.forName("UTF-8"));
            BufferedReader reader = new BufferedReader(inputStreamReader);
            String line = reader.readLine();
            while (line != null) {
                output.append(line);
                line = reader.readLine();
            }
        }
        return output.toString();
    }

    /**
     * Return a list of {@link Article} objects that has been built up from
     * parsing the given JSON response.
     */
    private static List<Article> extractFeatureFromJson(String articleJSON) {
        // If the JSON string is empty or null, then return early.
        if (TextUtils.isEmpty(articleJSON)) {
            return null;
        }
        // Create an empty ArrayList that we can start adding articles to
        List<Article> articles = new ArrayList<>();
        // Try to parse the JSON response string. If there's a problem with the way the JSON
        // is formatted, a JSONException will be thrown.
        // Catch the exception so the app doesn't crash, and print the error message to the logs.
        try {
            // Create a JSONObject from the JSON response string
            JSONObject baseJsonResponse = new JSONObject(articleJSON);
            // Extract the JSONArray associated with the key called "value",
            // which represents a list of articles.
            JSONArray articleArray = baseJsonResponse.getJSONArray("value");
            // For each article in the articleArray, create an {@link Article} object
            for (int i = 0; i < articleArray.length(); i++) {
                // Get a single article at position i within the list of articles
                JSONObject currentArticle = articleArray.getJSONObject(i);
                // Extract the value for the key called "name"
                String articleName = currentArticle.getString("name");
                // Extract the value for the key called "url"
                String articleSource = currentArticle.getString("url");
                // Extract the value for the key called "image"
                JSONObject imageObject = currentArticle.getJSONObject("image");
                String imageSource = imageObject.getJSONObject("thumbnail").getString("contentUrl");
                // Create a new {@link Article} object with the name, url, and image
                // from the JSON response.
                Article article = new Article(articleName, articleSource, imageSource);
                // Add the new {@link Article} to the list of articles.
                articles.add(article);
            }
        } catch (JSONException e) {
            // If an error is thrown when executing any of the above statements in the "try" block,
            // catch the exception here, so the app doesn't crash. Print a log message
            // with the message from the exception.
            Log.e("QueryUtils", "Problem parsing the article JSON results", e);
        }
        // Return the list of articles
        return articles;
    }
}
Does anyone know what I'm doing wrong?
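One thing worth checking: createUrlConnection() sends the category as an HTTP request header ("Category"), but as far as I know the Bing News API v5.0 expects it as a query parameter. If the header is ignored, all three fragments issue the identical request and therefore show identical articles. A sketch of building the URL with the category in the query string (the parameter name is an assumption; verify it against the API version you use):

// Hypothetical replacement for createUrlConnection(): put the category in
// the query string instead of a request header. Requires java.net.URLEncoder.
private static HttpURLConnection createUrlConnection(String category) {
    HttpURLConnection conn = null;
    try {
        URL url = new URL(REQUEST_BASE_URL + "?category="
                + URLEncoder.encode(category, "UTF-8"));
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Ocp-Apim-Subscription-Key", API_KEY);
    } catch (Exception e) {
        Log.e(LOG_TAG, "Problem building the URL connection ", e);
    }
    return conn;
}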

How to calculate a file size from URL in java

I'm attempting to get a bunch of pdf links from a web service and I want to give the user the file size of each link.
Is there a way to accomplish this task?
Thanks
Using a HEAD request, you can do something like this:
private static int getFileSize(URL url) {
    URLConnection conn = null;
    try {
        conn = url.openConnection();
        if (conn instanceof HttpURLConnection) {
            ((HttpURLConnection) conn).setRequestMethod("HEAD");
        }
        conn.getInputStream();
        return conn.getContentLength();
    } catch (IOException e) {
        throw new RuntimeException(e);
    } finally {
        if (conn instanceof HttpURLConnection) {
            ((HttpURLConnection) conn).disconnect();
        }
    }
}
The accepted answer is prone to NullPointerException, doesn't work for files > 2GiB and contains an unnecessary call to getInputStream(). Here's the fixed code:
public long getFileSize(URL url) {
    HttpURLConnection conn = null;
    try {
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD");
        return conn.getContentLengthLong();
    } catch (IOException e) {
        throw new RuntimeException(e);
    } finally {
        if (conn != null) {
            conn.disconnect();
        }
    }
}
Update: the accepted answer was since updated, but it still has issues.
Try the HTTP HEAD method; it returns the HTTP headers only. The Content-Length header should contain the information you need.
Did you already try getContentLength() on the URL connection?
If the server sends a valid header, you should get the size of the document.
But be aware that the web server might also return the file in chunks. In that case, IIRC, the content length method returns either the size of one chunk (Java <= 1.4) or -1 (> 1.4).
The HTTP response has a Content-Length header, so you could query the URLConnection object for this value.
Once the URL connection has been opened, you can try something like this:
List values = urlConnection.getHeaderFields().get("Content-Length");
if (values != null && !values.isEmpty()) {
    // getHeaderFields() returns a Map with key = (String) header name,
    // value = List of String values for that header field.
    // Just use the first value here.
    String sLength = (String) values.get(0);
    if (sLength != null) {
        // parse the length into an integer...
        ...
    }
}
In case you are on Android, here's a solution in Java:
/** @return the file size of the given file url, or -1L if there was any kind of error while doing so */
@WorkerThread
public static long getUrlFileLength(String url) {
    try {
        final HttpURLConnection urlConnection = (HttpURLConnection) new URL(url).openConnection();
        urlConnection.setRequestMethod("HEAD");
        final String lengthHeaderField = urlConnection.getHeaderField("content-length");
        Long result = lengthHeaderField == null ? null : Long.parseLong(lengthHeaderField);
        return result == null || result < 0L ? -1L : result;
    } catch (Exception ignored) {
    }
    return -1L;
}
And in Kotlin:
/** @return the file size of the given file url, or -1L if there was any kind of error while doing so */
@WorkerThread
fun getUrlFileLength(url: String): Long {
    return try {
        val urlConnection = URL(url).openConnection() as HttpURLConnection
        urlConnection.requestMethod = "HEAD"
        urlConnection.getHeaderField("content-length")?.toLongOrNull()?.coerceAtLeast(-1L)
            ?: -1L
    } catch (ignored: Exception) {
        -1L
    }
}
If your app targets Android N or above, you can use this instead:
/** @return the file size of the given file url, or -1L if there was any kind of error while doing so */
@WorkerThread
fun getUrlFileLength(url: String): Long {
    return try {
        val urlConnection = URL(url).openConnection() as HttpURLConnection
        urlConnection.requestMethod = "HEAD"
        urlConnection.contentLengthLong.coerceAtLeast(-1L)
    } catch (ignored: Exception) {
        -1L
    }
}
You can try this. Note that when the response uses a transfer encoding such as chunked, the total length is unknown, so Content-Length is only meaningful when no Transfer-Encoding header is set:
private long getContentLength(HttpURLConnection conn) {
    String transferEncoding = conn.getHeaderField("Transfer-Encoding");
    if (transferEncoding == null) {
        return conn.getHeaderFieldInt("Content-Length", -1);
    } else {
        return -1;
    }
}
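For reference, a hedged usage sketch of the helper above (the URL is a placeholder):

import java.net.HttpURLConnection;
import java.net.URL;

public class FileSizeExample {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com/some.pdf"); // placeholder URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD");       // ask for headers only, no body
        long size = getContentLength(conn);  // -1 when the length is unknown
        System.out.println("Size: " + size + " bytes");
        conn.disconnect();
    }

    // Same logic as the helper above: trust Content-Length only when the
    // response is not using a transfer encoding such as chunked.
    private static long getContentLength(HttpURLConnection conn) {
        String transferEncoding = conn.getHeaderField("Transfer-Encoding");
        if (transferEncoding == null) {
            return conn.getHeaderFieldInt("Content-Length", -1);
        }
        return -1;
    }
}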

apache.commons.fileupload throws MalformedStreamException

I have got this piece of code (I didn't write it; I'm just maintaining it):
public class MyMultipartResolver extends CommonsMultipartResolver {
    public List parseEmptyRequest(HttpServletRequest request) throws IOException, FileUploadException {
        String contentType = request.getHeader(CONTENT_TYPE);
        int boundaryIndex = contentType.indexOf("boundary=");
        InputStream input = request.getInputStream();
        byte[] boundary = contentType.substring(boundaryIndex + 9).getBytes();
        MultipartStream multi = new MultipartStream(input, boundary);
        multi.setHeaderEncoding(getHeaderEncoding());
        ArrayList items = new ArrayList();
        boolean nextPart = multi.skipPreamble();
        while (nextPart) {
            Map headers = parseHeaders(multi.readHeaders());
            // String fieldName = getFieldName(headers);
            String subContentType = getHeader(headers, CONTENT_TYPE);
            if (subContentType == null) {
                FileItem item = createItem(headers, true);
                OutputStream os = item.getOutputStream();
                try {
                    multi.readBodyData(os);
                } finally {
                    os.close();
                }
                items.add(item);
            } else {
                multi.discardBodyData();
            }
            nextPart = multi.readBoundary();
        }
        return items;
    }
}
I am using commons-fileupload.jar version 1.2.1, and the code obviously uses some deprecated methods...
Anyway, while trying to use this code to upload a very large file (780 MB), I get this:
org.apache.commons.fileupload.MultipartStream$MalformedStreamException: Stream ended unexpectedly
    at org.apache.commons.fileupload.MultipartStream$ItemInputStream.makeAvailable(MultipartStream.java:983)
    at org.apache.commons.fileupload.MultipartStream$ItemInputStream.read(MultipartStream.java:887)
    at java.io.InputStream.read(InputStream.java:89)
    at org.apache.commons.fileupload.util.Streams.copy(Streams.java:94)
    at org.apache.commons.fileupload.util.Streams.copy(Streams.java:64)
    at org.apache.commons.fileupload.MultipartStream.readBodyData(MultipartStream.java:593)
    at org.apache.commons.fileupload.MultipartStream.discardBodyData(MultipartStream.java:619)
It is thrown from the 'multi.discardBodyData();' line.
My question:
How can I avoid this error and succeed in collecting the FileItems?
Catch the exception and handle it, for example:
catch (org.apache.commons.fileupload.MultipartStream.MalformedStreamException e) {
    e.printStackTrace();
    return ERROR;
}
Handle it either by reading/closing the InputStream or by returning an error result from your Struts action.
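If reworking the code is an option, the streaming API that ships with commons-fileupload 1.2 may avoid the hand-rolled MultipartStream handling altogether. A sketch under the assumption of a servlet environment (the output path is a placeholder):

import java.io.FileOutputStream;
import java.io.InputStream;
import javax.servlet.http.HttpServletRequest;
import org.apache.commons.fileupload.FileItemIterator;
import org.apache.commons.fileupload.FileItemStream;
import org.apache.commons.fileupload.servlet.ServletFileUpload;
import org.apache.commons.fileupload.util.Streams;

public class StreamingUploadSketch {
    // Reads each part of the multipart request as a stream, so even very
    // large files are never buffered wholly in memory.
    public void handle(HttpServletRequest request) throws Exception {
        ServletFileUpload upload = new ServletFileUpload();
        FileItemIterator iter = upload.getItemIterator(request);
        while (iter.hasNext()) {
            FileItemStream item = iter.next();
            InputStream in = item.openStream();
            if (!item.isFormField()) {
                // "/tmp/upload.bin" is a placeholder destination
                FileOutputStream out = new FileOutputStream("/tmp/upload.bin");
                Streams.copy(in, out, true); // copy the part to disk
            } else {
                in.close(); // skip ordinary form fields in this sketch
            }
        }
    }
}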
