I have implemented HTML-to-PDF conversion using openhtmltopdf in a Struts 2 action, and it works very well. However, with very large input, e.g. HTML data > 3 MB (PDF file ~1.6 MB), the application crashes under a JMeter test of 50 hits with the message java.lang.OutOfMemoryError: Java heap space.
If I increase the Java heap limit with the -Xmx option, I only get a few extra hits before it crashes.
The code I use looks like this:
First, clean the HTML:
public class HtmlToXhtmlConverterHTMLCleaner2 extends AbstractHtmlToXhtmlConverter
implements IHtmlToXhtmlConverter {
public HtmlToXhtmlConverterHTMLCleaner2(String htmlData) {
super(htmlData);
}
@Override
public void convert() {
final HtmlCleaner cleaner = new HtmlCleaner();
CleanerProperties cleanerProperties = cleaner.getProperties();
cleanerProperties.setAdvancedXmlEscape(true);
cleanerProperties.setOmitXmlDeclaration(true);
cleanerProperties.setOmitDoctypeDeclaration(false);
cleanerProperties.setTranslateSpecialEntities(true);
cleanerProperties.setTransResCharsToNCR(true);
cleanerProperties.setRecognizeUnicodeChars(true);
cleanerProperties.setIgnoreQuestAndExclam(true);
cleanerProperties.setUseEmptyElementTags(false);
cleanerProperties.setPruneTags("script");
final XmlSerializer xmlSerializer = new PrettyXmlSerializer(cleanerProperties);
try {
final TagNode rootTagNode = cleaner.clean(htmlData);
this.xhtmlData = xmlSerializer.getAsString(rootTagNode);
} catch (Exception ex) {
ex.printStackTrace();
}
}
Then convert the cleaned HTML to PDF:
public class PDFConverterHtmlToPdf extends AbstractPDFConverter implements IPDFConverter {
ByteArrayOutputStream pdfData;
public PDFConverterHtmlToPdf(String xhtmlData, String cssFile) {
super();
this.xhtmlData = xhtmlData;
this.cssFile = cssFile;
}
@Override
public void convert() {
pdfData = new ByteArrayOutputStream();
try {
// There are more options on the builder than shown below.
PdfRendererBuilder builder = new PdfRendererBuilder();
if(cssFile != null && cssFile.length() > 0){
builder.withHtmlContent(xhtmlData, cssFile);
} else {
builder.withHtmlContent(xhtmlData, "");
}
builder.toStream(pdfData);
builder.run();
} catch (Exception e) {
e.printStackTrace();
}
}
}
Then send the data from the Struts 2 action:
private void buildPdfContent(String htmlContent) {
String pdfConverterCssFile = "http://localhost:8080/DocumentConverterApi/css/htmlToPdf.css";
PDFConverterHelp pdfConverterHelp = new PDFConverterHelp("demo.pdf",
htmlContent, pdfConverterCssFile);
pdfConverterHelp.build();
inputStream = new ByteArrayInputStream(pdfConverterHelp.getPDFFile().toByteArray());
pdfConverterHelp.closePdfData();
contentDisposition = "inline;filename=\"" + "demo.pdf\"";
}
Am I doing something wrong?
Is there any other way to implement it without the risk of crashing the application?
I'm trying to implement a wrapped "move" function with Xodus, but something is not working out right:
@Override
public boolean move(String appId, String name, String targetName) {
final boolean[] success = new boolean[1];
final Environment env = manager.getEnvironment(xodusRoot, appId);
final VirtualFileSystem vfs = manager.getVirtualFileSystem(env);
env.executeInTransaction(
new TransactionalExecutable() {
@Override
public void execute(@NotNull final Transaction txn) {
File file = vfs.openFile(txn, name, false);
InputStream input = vfs.readFile(txn, file);
if(input != null) {
File targetFile = vfs.openFile(txn, targetName, true);
DataOutputStream output = new DataOutputStream(vfs.writeFile(txn, targetFile));
try {
output.write(ByteStreams.toByteArray(input));
} catch (IOException e) {
e.printStackTrace();
}
vfs.deleteFile(txn, name);
success[0] = true;
}
}
});
// vfs.shutdown();
// env.close();
return success[0];
}
The problem is that the file gets moved but the byte array is not copied. I am not sure whether this is because of the multiple VFS operations in the same transaction. Can someone give me a hint as to why the bytes from the source file are not being copied properly?
Looks like you are trying to implement another version of VirtualFileSystem.renameFile(..).
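If the copy does have to be done by hand, one thing worth checking is that the DataOutputStream in the question is never flushed or closed, so buffered bytes may never reach the target file. This is an assumption, not a confirmed diagnosis. Below is a minimal JDK-only sketch of a copy that always closes both streams; `StreamCopy` and `copyAndClose` are hypothetical names, not part of Xodus.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    private StreamCopy() {}

    /**
     * Copies all bytes from in to out and closes both streams,
     * even if the copy fails. Returns the number of bytes copied.
     */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        // try-with-resources closes target first, then source
        try (InputStream source = in; OutputStream target = out) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = source.read(buffer)) != -1) {
                target.write(buffer, 0, read);
                total += read;
            }
            return total;
        }
    }
}
```

In the move method above, such a helper would take the place of the output.write(...) call, for example copyAndClose(input, vfs.writeFile(txn, targetFile)), before vfs.deleteFile is called.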
The context is as follows:
I've got objects that represent Tweets (from Twitter). Each object has an id, a date and the id of the original tweet (if there was one).
I receive a file of tweets (where each tweet is in the format 05/04/2014 12:00:00, tweetID, originalID and is on its own line) and I want to save them as an XML file where each field has its own tag.
I want to then be able to read the file and return a list of Tweet objects corresponding to the Tweets from the XML file.
After writing the XML parser that does this I want to test that it works correctly. I've got no idea how to test this.
The XML Parser:
public class TweetToXMLConverter implements TweetImporterExporter {
//there is a single file used for the tweets database
static final String xmlPath = "src/main/resources/tweetsDataBase.xml";
//some "defines", as we like to call them ;)
static final String DB_HEADER = "tweetDataBase";
static final String TWEET_HEADER = "tweet";
static final String TWEET_ID_FIELD = "id";
static final String TWEET_ORIGIN_ID_FIELD = "original tweet";
static final String TWEET_DATE_FIELD = "tweet date";
static File xmlFile;
static boolean initialized = false;
@Override
public void createDB() {
try {
Element tweetDB = new Element(DB_HEADER);
Document doc = new Document(tweetDB);
doc.setRootElement(tweetDB);
XMLOutputter xmlOutput = new XMLOutputter();
// pretty-print the XML output
xmlOutput.setFormat(Format.getPrettyFormat());
xmlOutput.output(doc, new FileWriter(xmlPath));
xmlFile = new File(xmlPath);
initialized = true;
} catch (IOException io) {
System.out.println(io.getMessage());
}
}
@Override
public void addTweet(Tweet tweet) {
if (!initialized) {
//TODO throw an exception? should not come to pass!
return;
}
SAXBuilder builder = new SAXBuilder();
try {
Document document = (Document) builder.build(xmlFile);
Element newTweet = new Element(TWEET_HEADER);
newTweet.setAttribute(new Attribute(TWEET_ID_FIELD, tweet.getTweetID()));
newTweet.setAttribute(new Attribute(TWEET_DATE_FIELD, tweet.getDate().toString()));
if (tweet.isRetweet())
newTweet.addContent(new Element(TWEET_ORIGIN_ID_FIELD).setText(tweet.getOriginalTweet()));
document.getRootElement().addContent(newTweet);
} catch (IOException io) {
System.out.println(io.getMessage());
} catch (JDOMException jdomex) {
System.out.println(jdomex.getMessage());
}
}
//break glass in case of emergency
@Override
public void addListOfTweets(List<Tweet> list) {
for (Tweet t : list) {
addTweet(t);
}
}
@Override
public List<Tweet> getListOfTweets() {
if (!initialized) {
//TODO throw an exception? should not come to pass!
return null;
}
try {
SAXBuilder builder = new SAXBuilder();
Document document;
document = (Document) builder.build(xmlFile);
List<Tweet> $ = new ArrayList<Tweet>();
for (Object o : document.getRootElement().getChildren(TWEET_HEADER)) {
Element rawTweet = (Element) o;
String id = rawTweet.getAttributeValue(TWEET_ID_FIELD);
String original = rawTweet.getChildText(TWEET_ORIGIN_ID_FIELD);
Date date = new Date(rawTweet.getAttributeValue(TWEET_DATE_FIELD));
$.add(new Tweet(id, original, date));
}
return $;
} catch (JDOMException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return null;
}
}
Some usage:
private TweetImporterExporter converter;
List<Tweet> tweetList = converter.getListOfTweets();
for (String tweetString : lines)
converter.addTweet(new Tweet(tweetString));
How can I make sure that the XML file I read (which contains tweets) corresponds to the file I receive (in the form stated above)?
How can I make sure the tweets I add to the file correspond to the ones I tried to add?
Assuming that you have the following model:
public class Tweet {
private Long id;
private Date date;
private Long originalTweetid;
//getters and setters
}
The process would be the following:
create an instance of TweetToXMLConverter
create a list of Tweet instances that you expect to receive after parsing the file
feed the converter the list you generated
compare the list obtained by parsing the file with the list you created at the start of the test
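import java.util.ArrayList;
import java.util.List;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

public class MainTest {
    private TweetToXMLConverter converter;
    // initialized here so that add() in setup() does not throw a NullPointerException
    private List<Tweet> tweets = new ArrayList<Tweet>();

    @Before
    public void setup() {
        Tweet tweet = new Tweet(1, "05/04/2014 12:00:00", 2);
        Tweet tweet2 = new Tweet(2, "06/04/2014 12:00:00", 1);
        Tweet tweet3 = new Tweet(3, "07/04/2014 12:00:00", 2);
        tweets.add(tweet);
        tweets.add(tweet2);
        tweets.add(tweet3);
        converter = new TweetToXMLConverter();
        // the converter ignores addTweet() calls until the database file has been created
        converter.createDB();
        converter.addListOfTweets(tweets);
    }

    @Test
    public void testParse() {
        List<Tweet> parsedTweets = converter.getListOfTweets();
        // JUnit convention: expected value first, actual value second
        Assert.assertEquals(tweets.size(), parsedTweets.size());
        for (int i = 0; i < parsedTweets.size(); i++) {
            // assuming that both lists are in the same order
            Assert.assertEquals(tweets.get(i), parsedTweets.get(i));
        }
    }
}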
I am using JUnit for the actual testing.
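Note that assertEquals on the list elements is only meaningful if Tweet defines value equality; with the default Object.equals, two tweets with identical fields still compare as different. A minimal sketch of such a class follows, under several assumptions: the fields are a String id, a nullable String original id and a LocalDateTime, the `fromLine` factory is hypothetical, and the date pattern is read as day/month/year (the sample line does not disambiguate day and month).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Objects;

final class Tweet {
    // Assumed pattern: day/month/year, as in "05/04/2014 12:00:00"
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss");

    final String id;
    final String originalId; // null when the tweet is not a retweet
    final LocalDateTime date;

    Tweet(String id, String originalId, LocalDateTime date) {
        this.id = id;
        this.originalId = originalId;
        this.date = date;
    }

    /** Parses one line of the form "date, tweetID[, originalID]". */
    static Tweet fromLine(String line) {
        String[] parts = line.split(",");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed tweet line: " + line);
        }
        LocalDateTime date = LocalDateTime.parse(parts[0].trim(), FORMAT);
        String id = parts[1].trim();
        String originalId =
                (parts.length > 2 && !parts[2].trim().isEmpty()) ? parts[2].trim() : null;
        return new Tweet(id, originalId, date);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Tweet)) return false;
        Tweet other = (Tweet) o;
        return Objects.equals(id, other.id)
                && Objects.equals(originalId, other.originalId)
                && Objects.equals(date, other.date);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, originalId, date);
    }
}
```

With equals and hashCode in place, the whole round trip can also be checked in one call by comparing the two lists directly, since List.equals compares element by element.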
Hello, I am making an Android app that pulls some data from a wiki. At first I planned to parse the HTML, but someone pointed out to me that XML would be much easier to work with. Now I am stuck trying to find a way to parse the XML correctly. I am currently trying to parse from this web address:
http://zelda.wikia.com/api.php?action=query&list=categorymembers&cmtitle=Category:Games&cmlimit=500&format=xml
I am trying to get the titles of each of the games into a string array and I am having some trouble. I don't have an example of the code I was trying out, but it used XmlPullParser. My app crashes every time I try to do anything with it. Would it be better to save the XML locally and parse it from there, or would I be okay reading from the web address? And how would I go about parsing this correctly into a string array? Please help me, and thank you for taking the time to read this.
If you need to see code or anything I can get it later tonight, I am just not near my PC at this time. Thank you.
Whenever you find yourself writing parser code for simple formats like the one in your example you're almost always doing something wrong and not using a suitable framework.
For instance - there's a set of simple helpers for parsing XML in the android.sax package included in the SDK and it just happens that the example you posted could be easily parsed like this:
public class WikiParser {
public static class Cm {
public String mPageId;
public String mNs;
public String mTitle;
}
private static class CmListener implements StartElementListener {
final List<Cm> mCms;
CmListener(List<Cm> cms) {
mCms = cms;
}
@Override
public void start(Attributes attributes) {
Cm cm = new Cm();
cm.mPageId = attributes.getValue("", "pageid");
cm.mNs = attributes.getValue("", "ns");
cm.mTitle = attributes.getValue("", "title");
mCms.add(cm);
}
}
public void parseInto(URL url, List<Cm> cms) throws IOException, SAXException {
HttpURLConnection con = (HttpURLConnection) url.openConnection();
try {
parseInto(new BufferedInputStream(con.getInputStream()), cms);
} finally {
con.disconnect();
}
}
public void parseInto(InputStream docStream, List<Cm> cms) throws IOException, SAXException {
RootElement api = new RootElement("api");
Element query = api.requireChild("query");
Element categoryMembers = query.requireChild("categorymembers");
Element cm = categoryMembers.requireChild("cm");
cm.setStartElementListener(new CmListener(cms));
Xml.parse(docStream, Encoding.UTF_8, api.getContentHandler());
}
}
Basically, called like this:
WikiParser p = new WikiParser();
ArrayList<WikiParser.Cm> res = new ArrayList<WikiParser.Cm>();
try {
p.parseInto(new URL("http://zelda.wikia.com/api.php?action=query&list=categorymembers&cmtitle=Category:Games&cmlimit=500&format=xml"), res);
} catch (MalformedURLException e) {
} catch (IOException e) {
} catch (SAXException e) {}
Edit: This is how you'd create a List<String> instead:
public class WikiParser {
private static class CmListener implements StartElementListener {
final List<String> mTitles;
CmListener(List<String> titles) {
mTitles = titles;
}
@Override
public void start(Attributes attributes) {
String title = attributes.getValue("", "title");
if (!TextUtils.isEmpty(title)) {
mTitles.add(title);
}
}
}
public void parseInto(URL url, List<String> titles) throws IOException, SAXException {
HttpURLConnection con = (HttpURLConnection) url.openConnection();
try {
parseInto(new BufferedInputStream(con.getInputStream()), titles);
} finally {
con.disconnect();
}
}
public void parseInto(InputStream docStream, List<String> titles) throws IOException, SAXException {
RootElement api = new RootElement("api");
Element query = api.requireChild("query");
Element categoryMembers = query.requireChild("categorymembers");
Element cm = categoryMembers.requireChild("cm");
cm.setStartElementListener(new CmListener(titles));
Xml.parse(docStream, Encoding.UTF_8, api.getContentHandler());
}
}
and then:
WikiParser p = new WikiParser();
ArrayList<String> titles = new ArrayList<String>();
try {
p.parseInto(new URL("http://zelda.wikia.com/api.php?action=query&list=categorymembers&cmtitle=Category:Games&cmlimit=500&format=xml"), titles);
} catch (MalformedURLException e) {
} catch (IOException e) {
} catch (SAXException e) {}
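For trying the parsing logic outside Android (for example in a plain unit test), roughly the same idea can be sketched with the SAX parser that ships with the JDK. This is not the android.sax API used above: `TitleExtractor` is a hypothetical name, and unlike the RootElement version it collects the title attribute of every cm element without checking the api/query/categorymembers path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class TitleExtractor {
    private TitleExtractor() {}

    /** Returns the non-empty "title" attribute of every <cm> element in the stream. */
    static List<String> titles(InputStream xml)
            throws IOException, SAXException, ParserConfigurationException {
        final List<String> titles = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(xml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("cm".equals(qName)) {
                    String title = attributes.getValue("title");
                    if (title != null && !title.isEmpty()) {
                        titles.add(title);
                    }
                }
            }
        });
        return titles;
    }
}
```

Because the method takes an InputStream, the same code can be fed either a saved local file or the stream of an HTTP connection.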
Does anyone know where to find a little how to on using dbpedia spotlight in java or scala? Or could anyone explain how it's done? I can't find any information on this...
The DBpedia Spotlight wiki pages would be a good place to start.
And I believe the installation page has listed the most popular ways (using a jar, or set up a web service) to use the application.
It includes instructions on using the Java/Scala API with your own installation, or calling the Web Service.
Running your own server with the full service requires downloading some additional data, which is a good time to make yourself a coffee.
You need to download DBpedia Spotlight (the jar file). After that, you can use the following two classes (author: pablomendes); I only made some changes.
public class db extends AnnotationClient {
//private final static String API_URL = "http://jodaiber.dyndns.org:2222/";
private static String API_URL = "http://spotlight.dbpedia.org:80/";
private static double CONFIDENCE = 0.0;
private static int SUPPORT = 0;
private static String powered_by ="non";
private static String spotter ="CoOccurrenceBasedSelector";//"LingPipeSpotter"=Annotate all spots
//AtLeastOneNounSelector"=No verbs and adjs.
//"CoOccurrenceBasedSelector" =No 'common words'
//"NESpotter"=Only Per.,Org.,Loc.
private static String disambiguator ="Default";//Default ;Occurrences=Occurrence-centric;Document=Document-centric
private static String showScores ="yes";
@SuppressWarnings("static-access")
public void configiration(double CONFIDENCE,int SUPPORT,
String powered_by,String spotter,String disambiguator,String showScores){
this.CONFIDENCE=CONFIDENCE;
this.SUPPORT=SUPPORT;
this.powered_by=powered_by;
this.spotter=spotter;
this.disambiguator=disambiguator;
this.showScores=showScores;
}
public List<DBpediaResource> extract(Text text) throws AnnotationException {
LOG.info("Querying API.");
String spotlightResponse;
try {
String Query=API_URL + "rest/annotate/?" +
"confidence=" + CONFIDENCE
+ "&support=" + SUPPORT
+ "&spotter=" + spotter
+ "&disambiguator=" + disambiguator
+ "&showScores=" + showScores
+ "&powered_by=" + powered_by
+ "&text=" + URLEncoder.encode(text.text(), "utf-8");
LOG.info(Query);
GetMethod getMethod = new GetMethod(Query);
getMethod.addRequestHeader(new Header("Accept", "application/json"));
spotlightResponse = request(getMethod);
} catch (UnsupportedEncodingException e) {
throw new AnnotationException("Could not encode text.", e);
}
assert spotlightResponse != null;
JSONObject resultJSON = null;
JSONArray entities = null;
try {
resultJSON = new JSONObject(spotlightResponse);
entities = resultJSON.getJSONArray("Resources");
} catch (JSONException e) {
//throw new AnnotationException("Received invalid response from DBpedia Spotlight API.");
}
LinkedList<DBpediaResource> resources = new LinkedList<DBpediaResource>();
if(entities!=null)
for(int i = 0; i < entities.length(); i++) {
try {
JSONObject entity = entities.getJSONObject(i);
resources.add(
new DBpediaResource(entity.getString("@URI"),
Integer.parseInt(entity.getString("@support"))));
} catch (JSONException e) {
LOG.error("JSON exception "+e);
}
}
return resources;
}
}
second class
/**
* @author pablomendes
*/
public abstract class AnnotationClient {
public Logger LOG = Logger.getLogger(this.getClass());
private List<String> RES = new ArrayList<String>();
// Create an instance of HttpClient.
private static HttpClient client = new HttpClient();
public List<String> getResu(){
return RES;
}
public String request(HttpMethod method) throws AnnotationException {
String response = null;
// Provide a custom retry handler if necessary
method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
new DefaultHttpMethodRetryHandler(3, false));
try {
// Execute the method.
int statusCode = client.executeMethod(method);
if (statusCode != HttpStatus.SC_OK) {
LOG.error("Method failed: " + method.getStatusLine());
}
// Read the response body.
byte[] responseBody = method.getResponseBody(); //TODO Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
// Deal with the response.
// Use caution: ensure correct character encoding and is not binary data
response = new String(responseBody);
} catch (HttpException e) {
LOG.error("Fatal protocol violation: " + e.getMessage());
throw new AnnotationException("Protocol error executing HTTP request.",e);
} catch (IOException e) {
LOG.error("Fatal transport error: " + e.getMessage());
LOG.error(method.getQueryString());
throw new AnnotationException("Transport error executing HTTP request.",e);
} finally {
// Release the connection.
method.releaseConnection();
}
return response;
}
protected static String readFileAsString(String filePath) throws java.io.IOException{
return readFileAsString(new File(filePath));
}
protected static String readFileAsString(File file) throws IOException {
byte[] buffer = new byte[(int) file.length()];
@SuppressWarnings("resource")
BufferedInputStream f = new BufferedInputStream(new FileInputStream(file));
f.read(buffer);
return new String(buffer);
}
static abstract class LineParser {
public abstract String parse(String s) throws ParseException;
static class ManualDatasetLineParser extends LineParser {
public String parse(String s) throws ParseException {
return s.trim();
}
}
static class OccTSVLineParser extends LineParser {
public String parse(String s) throws ParseException {
String result = s;
try {
result = s.trim().split("\t")[3];
} catch (ArrayIndexOutOfBoundsException e) {
throw new ParseException(e.getMessage(), 3);
}
return result;
}
}
}
public void saveExtractedEntitiesSet(String Question, LineParser parser, int restartFrom) throws Exception {
String text = Question;
int i=0;
//int correct =0 ; int error = 0;int sum = 0;
for (String snippet: text.split("\n")) {
String s = parser.parse(snippet);
if (s!= null && !s.equals("")) {
i++;
if (i<restartFrom) continue;
List<DBpediaResource> entities = new ArrayList<DBpediaResource>();
try {
entities = extract(new Text(snippet.replaceAll("\\s+"," ")));
System.out.println(entities.get(0).getFullUri());
} catch (AnnotationException e) {
// error++;
LOG.error(e);
e.printStackTrace();
}
for (DBpediaResource e: entities) {
RES.add(e.uri());
}
}
}
}
public abstract List<DBpediaResource> extract(Text text) throws AnnotationException;
public void evaluate(String Question) throws Exception {
evaluateManual(Question,0);
}
public void evaluateManual(String Question, int restartFrom) throws Exception {
saveExtractedEntitiesSet(Question,new LineParser.ManualDatasetLineParser(), restartFrom);
}
}
main()
public static void main(String[] args) throws Exception {
String Question ="Is the Amazon river longer than the Nile River?";
db c = new db ();
c.configiration(0.0, 0, "non", "CoOccurrenceBasedSelector", "Default", "yes");
System.out.println("resource : "+c.getResu());
}
I just want to add one small fix to your answer.
Your code runs if you add the evaluate method call:
public static void main(String[] args) throws Exception {
String question = "Is the Amazon river longer than the Nile River?";
db c = new db ();
c.configiration(0.0, 0, "non", "CoOccurrenceBasedSelector", "Default", "yes");
c.evaluate(question);
System.out.println("resource : "+c.getResu());
}
Lamine
In the request method of the second class (AnnotationClient) in Adel's answer, the author Pablo Mendes left an unfinished TODO:
TODO Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
It produces an annoying warning, which can be removed by replacing
byte[] responseBody = method.getResponseBody(); //TODO Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
// Deal with the response.
// Use caution: ensure correct character encoding and is not binary data
response = new String(responseBody);
with
Reader in = new InputStreamReader(method.getResponseBodyAsStream(), "UTF-8");
StringWriter writer = new StringWriter();
org.apache.commons.io.IOUtils.copy(in, writer);
response = writer.toString();
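If Commons IO is not on the classpath, a similar read with an explicit charset can be sketched with the JDK alone, assuming Java 9 or later for InputStream.readAllBytes. `ResponseReader` and `readUtf8` are hypothetical names. Like the StringWriter variant above, this still holds the whole body in memory; it only makes the character encoding explicit.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ResponseReader {
    private ResponseReader() {}

    /** Reads the whole stream, decodes it as UTF-8 and closes the stream. */
    static String readUtf8(InputStream body) throws IOException {
        try (InputStream in = body) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

In the request method this would be called as response = ResponseReader.readUtf8(method.getResponseBodyAsStream()).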
How can I use Apache Commons IO to download a file and print out the number of bytes saved? I tried using
import static org.apache.commons.io.FileUtils.copyURLToFile;
public static void Download() {
URL dl = null;
File fl = null;
try {
fl = new File(System.getProperty("user.home").replace("\\", "/") + "/Desktop/Screenshots.zip");
dl = new URL("http://ds-forums.com/kyle-tests/uploads/Screenshots.zip");
copyURLToFile(dl, fl);
} catch (Exception e) {
System.out.println(e);
}
}
but I cannot display bytes or a progress bar. Which method should I use?
public class download {
public static void Download() {
URL dl = null;
File fl = null;
String x = null;
try {
fl = new File(System.getProperty("user.home").replace("\\", "/") + "/Desktop/Screenshots.zip");
dl = new URL("http://ds-forums.com/kyle-tests/uploads/Screenshots.zip");
OutputStream os = new FileOutputStream(fl);
InputStream is = dl.openStream();
CountingOutputStream count = new CountingOutputStream(os);
dl.openConnection().getHeaderField("Content-Length");
IOUtils.copy(is, os);//begin transfer
os.close();//close streams
is.close();//^
} catch (Exception e) {
System.out.println(e);
}
}
If you are looking for a way to get the total number of bytes before downloading, you can obtain this value from the Content-Length header in the HTTP response.
If you just want the final number of bytes after the download, it is easiest to check the size of the file you just wrote.
However, if you want to display the current progress, i.e. how many bytes have been downloaded so far, you might want to extend Apache's CountingOutputStream and wrap the FileOutputStream in it, so that every time the write methods are called it counts the bytes passing through and updates the progress bar.
Update
Here is a simple implementation of DownloadCountingOutputStream. I am not sure if you are familiar with using ActionListener or not but it is a useful class for implementing GUI.
public class DownloadCountingOutputStream extends CountingOutputStream {
private ActionListener listener = null;
public DownloadCountingOutputStream(OutputStream out) {
super(out);
}
public void setListener(ActionListener listener) {
this.listener = listener;
}
@Override
protected void afterWrite(int n) throws IOException {
super.afterWrite(n);
if (listener != null) {
listener.actionPerformed(new ActionEvent(this, 0, null));
}
}
}
This is the usage sample :
public class Downloader {
private static class ProgressListener implements ActionListener {
@Override
public void actionPerformed(ActionEvent e) {
// e.getSource() gives you the object of DownloadCountingOutputStream
// because you set it in the overridden method, afterWrite().
System.out.println("Downloaded bytes : " + ((DownloadCountingOutputStream) e.getSource()).getByteCount());
}
}
public static void main(String[] args) {
URL dl = null;
File fl = null;
String x = null;
OutputStream os = null;
InputStream is = null;
ProgressListener progressListener = new ProgressListener();
try {
fl = new File(System.getProperty("user.home").replace("\\", "/") + "/Desktop/Screenshots.zip");
dl = new URL("http://ds-forums.com/kyle-tests/uploads/Screenshots.zip");
os = new FileOutputStream(fl);
is = dl.openStream();
DownloadCountingOutputStream dcount = new DownloadCountingOutputStream(os);
dcount.setListener(progressListener);
// this line gives you the total length of the source stream as a String.
// you may want to convert to integer and store this value to
// calculate percentage of the progression.
dl.openConnection().getHeaderField("Content-Length");
// begin transfer by writing to dcount, not os.
IOUtils.copy(is, dcount);
} catch (Exception e) {
System.out.println(e);
} finally {
IOUtils.closeQuietly(os);
IOUtils.closeQuietly(is);
}
}
}
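If Commons IO cannot be used, the same counting idea can be sketched on top of java.io.FilterOutputStream. This is a swapped-in JDK-only alternative, not the Commons IO class: `ProgressOutputStream` and its LongConsumer callback are hypothetical names chosen for illustration.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.function.LongConsumer;

final class ProgressOutputStream extends FilterOutputStream {
    private final LongConsumer onProgress; // receives the running byte total
    private long count;

    ProgressOutputStream(OutputStream out, LongConsumer onProgress) {
        super(out);
        this.onProgress = onProgress;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        advance(1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // write the block in one call instead of FilterOutputStream's byte-by-byte default
        out.write(b, off, len);
        advance(len);
    }

    private void advance(int n) {
        count += n;
        onProgress.accept(count);
    }

    long getByteCount() {
        return count;
    }
}
```

It would be used like dcount above: wrap the FileOutputStream, pass a callback that prints or updates a progress bar, and copy the input stream into the wrapper rather than into the underlying stream.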
commons-io has IOUtils.copy(inputStream, outputStream). So:
OutputStream os = new FileOutputStream(fl);
InputStream is = dl.openStream();
IOUtils.copy(is, os);
And IOUtils.toByteArray(is) can be used to get the bytes.
Getting the total number of bytes is a different story. Streams don't give you any total - they can only give you what is currently available in the stream. But since it's a stream, it can have more coming.
That's why HTTP has its own way of specifying the total number of bytes: the Content-Length response header. So you'd have to call url.openConnection() and then call getHeaderField("Content-Length") on the URLConnection object. It returns the number of bytes as a string, or null if the server did not send the header. Then use Integer.parseInt(bytesString) (or Long.parseLong for files larger than 2 GB) and you'll get your total.
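The header handling and the percentage calculation can be sketched as below. `DownloadProgress`, `parseContentLength` and `percent` are hypothetical helper names; returning -1 for "unknown" mirrors what URLConnection.getContentLengthLong() does when the header is missing.

```java
final class DownloadProgress {
    private DownloadProgress() {}

    /** Parses a Content-Length header value; returns -1 if it is missing or invalid. */
    static long parseContentLength(String header) {
        if (header == null) {
            return -1;
        }
        try {
            long value = Long.parseLong(header.trim());
            return value >= 0 ? value : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Returns progress as 0..100, or -1 when the total size is unknown. */
    static int percent(long downloaded, long total) {
        if (total <= 0) {
            return -1;
        }
        return (int) Math.min(100, downloaded * 100 / total);
    }
}
```

A progress listener could then call percent(bytesSoFar, total) on each update and fall back to showing only the raw byte count when the result is -1.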