I am getting a NullPointerException. Is it possible to get this link?
Element element = document.select("div.tw-absolute.tw-bottom-0.tw-left-0.tw-overflow-hidden.tw-right-0.tw-top-0.video-player__container").first();
System.out.println(element.absUrl("src"));
I tried this too, and it throws a NullPointerException as well:
Element video = document.select("video").first();
String absSrc = video.absUrl("src");
System.out.println(absSrc);
The relevant HTML:
<div class = "tw-absolute tw-bottom-0 tw-left-0 tw-overflow-hidden tw-right-0 tw-top-0 video-player__container" data-test-selector="video-player__video-container">
<video playsinline="" webkit-playsinline="" src="https://clips-media-assets2.twitch.tv/40487770748-offset-9048.mp4?token=%7B%22authorization%22:%7B%22forbidden%22:false,%22reason%22:%22%22%7D,%22chansub%22:%7B%22restricted_bitrates%22:%5B%5D%7D,%22device_id%22:%226518a1542e035018%22,%22expires%22:1609419047,%22https_required%22:true,%22privileged%22:false,%22user_id%22:500437676,%22version%22:2,%22vod_id%22:850278065%7D&sig=5e17731db577b99e535c4aad3eacc70c0cc34521"></video>
link: https://www.twitch.tv/scream/clip/BrightOilyAppleMcaT
Looks like this one will, again, require a lot of work to unpick.
Here's what I can tell you just from a quick look:
When you make the initial request, the HTML in the response does not contain the value you're looking for. It must therefore come from a subsequent HTTP request that is fired once the page has loaded, i.e. JavaScript is communicating with back-end servers to fetch JSON payloads. In one of those payloads you'll find ".mp4".
If you use Chrome developer tools, you can switch to the "Network" tab, click on each request after the first one, and check its "Preview" tab. Some requests return JSON; others are just .css, .png, etc., which you can ignore. In the JSON responses, search for some generic value you're interested in, such as ".mp4". Once you've found it:
... you then need to recreate the headers, the request body (as it's not empty), the HTTP method (POST), and pass any relevant cookies (in the headers).
You're going to need anywhere between 1 and 5 HTTP requests to obtain this JSON payload. Once you have it, you can parse it.
This is another one of those jobs that's so big I'm not going to begin to try to do it for you.
If it were me doing the job, I'd check the Twitch API docs https://dev.twitch.tv/docs/api/ to see if there's a better/easier way that's just 1-2 requests.
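The replay step described above (POST, headers, body, cookies) could be sketched with the JDK's java.net.http classes as follows. This is only a sketch under assumptions: the endpoint, body, and cookie values are placeholders you would copy from the Network tab, not real Twitch values, and the class and method names are made up for illustration. The regular expression just looks for a quoted URL containing ".mp4" and ignores JSON escaping, so a real JSON parser would be more robust.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: none of these names come from Twitch or from the original post.
final class ClipRequestSketch {
    // Matches a double-quoted http(s) URL that contains ".mp4".
    private static final Pattern MP4_URL =
            Pattern.compile("\"(https?://[^\"]*?\\.mp4[^\"]*)\"");

    /** Builds (but does not send) a POST request mirroring what DevTools showed. */
    static HttpRequest buildRequest(String endpoint, String jsonBody, String cookieHeader) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .header("Cookie", cookieHeader)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** Returns the first quoted URL containing ".mp4" in the response text, if any. */
    static Optional<String> firstMp4Url(String responseBody) {
        Matcher m = MP4_URL.matcher(responseBody);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

To actually send the request you would pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and feed the response body to firstMp4Url.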
You can change the CSS query as below.
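Element element = document.select("div.tw-absolute.tw-bottom-0.tw-left-0.tw-overflow-hidden.tw-right-0.tw-top-0.video-player__container > video").first();
// first() returns null when the selector matches nothing, so guard before reading the attribute
if (element != null) {
    String src = element.attr("src");
    System.out.println(src);
}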
Related
I am using JSOUP to fetch the documents from a website.
Below is my code
String webPageUrl = "https://mwcc.ms.gov/#/electronicDataInterchange";
Document doc = Jsoup.connect(webPageUrl).get();
Elements links = doc.getElementsByAttribute("a[href]");
Below line of code is not working. It is supposed to return an element but doesn't:
doc.getElementsByAttribute("a[href]")
Can someone please point out the mistake in my code?
That page seems to be an Angular application, which means it loads some (probably all or most) of its content via JavaScript.
The fact that the URL contains the fragment separator # is already a strong indicator of that, because when you make an HTTP request, everything after the separator is cut off (i.e. not sent to the server), so the actual request will just be for https://mwcc.ms.gov/.
As far as I know, JSoup does not support running JavaScript, so you might need to look into a more involved scraping tool (possibly one running a full browser engine).
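To see the fragment point concretely, here is a small sketch using only java.net.URI. The class and method names are made up for illustration; the URL is the one from the question.

```java
import java.net.URI;

// Hypothetical demo class, not part of JSoup or of the original code.
final class FragmentDemo {
    /** Returns the part of the URL that is sent to the server: everything before '#'. */
    static String withoutFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    public static void main(String[] args) {
        String pageUrl = "https://mwcc.ms.gov/#/electronicDataInterchange";
        URI uri = URI.create(pageUrl);
        System.out.println("path      = " + uri.getPath());
        System.out.println("fragment  = " + uri.getFragment());
        System.out.println("requested = " + withoutFragment(pageUrl));
    }
}
```

Separately, getElementsByAttribute takes an attribute name rather than a CSS selector, so doc.select("a[href]") is the selector form. That only helps once the links are actually present in the fetched HTML.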
The Java API for CICS is here. Does anyone know if there is any method to add a couple of radio buttons to a web form using this API?
Here's my code to create the radio buttons:
HttpRequest req = HttpRequest.getHttpRequestInstance();
String msg = "ZEUSBANK ANTI-FRAUD CHECK BY SHE0008.<br> "
+ "When investigation is complete. Tick the check box and submit.<br>";
String template = "<form><input type=\"radio\"> YES<br><input type=\"radio\"> NO<br></form>";
HttpResponse resp = new HttpResponse();
Document doc = new Document();
doc.createText(msg);
doc.appendFromTemplate(template);
resp.setMediaType("text/plain");
resp.sendDocument(doc, (short)200, "OK", ASCII);
But when I open it in a browser, it prints plain text and doesn't render the HTML tags.
Fixed it: I just changed the media type from text/plain to text/html and it works.
As you've already discovered, you needed to send the response with the text/html content type.
If you're planning to do more Java web-based work through CICS Java, you might want to investigate the embedded WebSphere Liberty. It adds support for Java EE features, which includes JSF, JSP and Servlets, which can make web development in Java a lot easier.
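One more detail about the template string in the question: radio inputs only behave as a single either/or choice when they share a name attribute. A minimal sketch of building such a template with plain string handling follows. The class, method, and group name are hypothetical, nothing here uses the CICS API, and the labels are assumed to be plain text that needs no HTML escaping.

```java
// Hypothetical helper that only builds the HTML string; sending it is left to the CICS code.
final class RadioFormTemplate {
    /** Builds a form whose radio buttons share one name, so only one can be selected. */
    static String radioGroup(String groupName, String... labels) {
        StringBuilder html = new StringBuilder("<form>");
        for (String label : labels) {
            html.append("<input type=\"radio\" name=\"").append(groupName)
                .append("\" value=\"").append(label).append("\"> ")
                .append(label).append("<br>");
        }
        return html.append("</form>").toString();
    }

    public static void main(String[] args) {
        System.out.println(radioGroup("verdict", "YES", "NO"));
    }
}
```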
Tri,
I haven't used CICS for 15 years, so I doubt I'm an expert anymore. But looking quickly at the API, it seems like all the presentation logic would be in your regular Java code. You would then format appropriate messages and invoke the CICS API to update the server & get a response.
There doesn't seem to be any 'BMS-related' methods at all (which is a good thing).
The only 'field' method I see is com.ibm.cics.server.FormField but that only has get() methods, not set().
Are you just starting with Java CICS, or are you just stuck on this particular issue? If you have some sample code of what you are trying, post it so we can see if anyone has any ideas.
HTH, Jim
I am currently taking a course in app development and I am trying to use Facebook's API for GET requests on certain events. My goal is to get a JSON file containing all comments made on a certain event.
However, some events return only an "id" key with an ID number, such as this:
{
"id": "116445769058883"
}
That happens with this event:
https://www.facebook.com/events/116445769058883/
However, other events, such as https://www.facebook.com/events/1964003870536124/, return only the latest comment for some reason.
I am experimenting with the Facebook Graph API Explorer:
https://developers.facebook.com/tools/explorer/
This is the GET request that I have been using in the explorer:
GET -> /v.10/facebook-id/?fields=comments
Any ideas? It's really tricky to understand the response since both events have the privacy set to OPEN.
Starting from v2.4, the API is declarative, which means you need to specify which fields you want it to return.
For example, if you want the first name and last name of the user, make a GET request to /me?fields=first_name,last_name; otherwise you will only get back the default fields, which are id and name.
If you want to see what fields are available for a given endpoint, use the metadata parameter, e.g. GET /me?metadata=true
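As a small illustration of the fields parameter, a request path can be assembled like this. The helper is hypothetical and only builds the path. It assumes the field names are simple identifiers that need no URL encoding, and it leaves out the host, API version, and access token.

```java
// Hypothetical helper; it only builds the path and does not call the Graph API.
final class GraphQuery {
    /** Returns "/node?fields=a,b,..." or just "/node" when no fields are requested. */
    static String path(String node, String... fields) {
        if (fields.length == 0) {
            return "/" + node; // the API would then return only its default fields
        }
        return "/" + node + "?fields=" + String.join(",", fields);
    }

    public static void main(String[] args) {
        System.out.println(path("me", "first_name", "last_name"));
    }
}
```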
I have a request in the form of JSON, which looks like this:
{"User":{"email":"test#test.com","FName":"fname"}}
When I try to send it via REST Assured, the U in User changes to lower case.
To send the request I have created my own serializable classes. What the endpoint receives looks like this:
{"user":{"email":"test#test.com","FName":"fname"}}
but somehow it is not changing the case of the remaining fields. I don't know why this is happening.
I've even tried to create a filter for the request specification, but couldn't get any further with that either. I then thought of first converting the serialized object with Gson and checking the case of User, but still no luck.
The error I get is:
The class, User,does not match the payload object for payload.
Please note that I am trying to use another team's service, so I don't have access to their code base (although that shouldn't be needed). Note the space between the first comma and User in the above message; is that significant?
I finally got around it by converting the object into a JSON string payload
and passing that string as the form parameter.
I still couldn't figure out why the formParam option in REST Assured did not let the serialized object go through, but I got around it this time.
Thanks for the suggestions all.
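For anyone hitting the same thing, the workaround amounts to building the payload string yourself so the key case is exactly what the service expects. A minimal sketch follows. The class and method names are hypothetical, the escaping only covers quotes and backslashes, and the REST Assured call that passes the string as a form parameter is left out.

```java
// Hypothetical helper that hand-builds the payload; a JSON library would handle escaping fully.
final class UserPayload {
    /** Escapes backslashes and double quotes only (an assumption about the input). */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the payload with the key written as "User", exactly as the service expects. */
    static String toJson(String email, String fName) {
        return "{\"User\":{\"email\":\"" + escape(email)
                + "\",\"FName\":\"" + escape(fName) + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(toJson("test#test.com", "fname"));
    }
}
```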
I have read hundreds of SO posts and studied several Java HTTP proxy sources, but I could not find a solution for my problem.
I wrote a web app that proxies HTTP requests. The web app works, but links and referrers break because the "root" of the proxied page points to the root of my server and not to the path of my proxy servlet.
To make it more clear:
My ProxyServlet gets a Request "http://myserver.com/proxy/ProxyServlet?foo=bar"
The ProxyServlet now fetches the pagecontent from ServerX (e.g. "http://original.com/test.html")
The content of the page is delivered to the browser by just reading and writing from one stream to the other and copying the headers.
The browser displays the page. The URL shown in the browser is the original request ("http://myserver.com/proxy/ProxyServlet?foo=bar"), but all relative links now point to
"http://myserver.com/XXX.html" instead of "http://myserver.com/proxy/ProxyServlet/XXX.html"
Is there a response-header where I can change the "path" so that relative links correctly point to my ProxyServlet?
(Rewriting the page-content and replacing links would be too difficult, because the page contains relatively addressed elements such as javascript code and other active content...)
(Changing the mapping for my Servlet to "/*" is also not possible... it must be accessed via this path...)
You are reinventing a "reverse proxy" and are missing the "URL rewriting" feature.
Off the top of my search results, here's an open source proxy servlet that does this:
http://j2ep.sourceforge.net/docs/rewrite.html
Also, you should know there is probably something wrong with the system architecture if you have to do this. Dropping in a standalone proxy like Apache, nginx, or Varnish should always be an option, as you will HAVE to add one (or more!) as you start scaling.
It sounds like the page you're proxying uses root-relative links, e.g. <a href="/XXX.html">, which means "no matter where this link is found, resolve it against the document root". If you have control of it, the best fix is for the proxy target to be more lenient in its linking and use <a href="XXX.html"> instead. If you can't do that, then you need to rewrite these URLs. Some example code, using JSoup:
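// Parse with getDisplayUrl() as the base URI so absUrl() has something to resolve against
Document doc = Jsoup.parse(rawBody, getDisplayUrl());
// Rewrite href on stylesheet links and anchors to absolute URLs
for (Element cssALink : doc.select("link[rel=stylesheet],a[href]")) {
    cssALink.attr("href", cssALink.absUrl("href"));
}
// Rewrite src on scripts and images to absolute URLs
for (Element imgJsLink : doc.select("script[src],img[src]")) {
    imgJsLink.attr("src", imgJsLink.absUrl("src"));
}
return doc.toString();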
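To see why the links escape the servlet path in the first place, here is a small sketch of how a browser-style resolver treats the two kinds of link, using only java.net.URI. The class name is made up; the URLs are the ones from the question.

```java
import java.net.URI;

// Hypothetical demo of relative-reference resolution; not part of the proxy code.
final class LinkResolution {
    /** Resolves href against the URL of the page it appears on. */
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String page = "http://myserver.com/proxy/ProxyServlet?foo=bar";
        // A root-relative link ignores the servlet path entirely.
        System.out.println(resolve(page, "/XXX.html"));
        // A document-relative link replaces the last path segment ("ProxyServlet").
        System.out.println(resolve(page, "XXX.html"));
        // Only a base ending in "/" keeps the servlet segment.
        System.out.println(resolve("http://myserver.com/proxy/ProxyServlet/", "XXX.html"));
    }
}
```

Note that even document-relative links would land on /proxy/XXX.html rather than /proxy/ProxyServlet/XXX.html unless the page is served from a URL ending in a slash, which is one more reason rewriting the URLs tends to be the more predictable route.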