I am trying to detect the colors of different elements in a web page (saved on my machine). I am currently writing the code in Python. My initial approach is:
Find the word "color" in the different tags of the HTML file using regular expressions.
Try to read the hex value.
But this approach is very naive. I am new to website design; can you please help me with this?
There can be multiple stylesheets and many cascading styles. You don't know which elements visually end up being the "background" elements. If you're looking for something robust that will work on most web pages, I think you need to leverage a browser's rendering engine and focus on identifying what a user would see.
Consider using a web browser to render the page, taking a screenshot, and then doing image processing to find the most frequent color near the sides of the page. You can use a scriptable browser like PhantomJS.
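As a rough sketch of the image-processing step only: assuming the screenshot has already been decoded into rows of (r, g, b) tuples (the rendering and decoding steps are not shown), the most frequent color near the edges could be counted like this. The names dominant_border_color, to_hex, and margin are made up for illustration and do not come from any library.

```python
from collections import Counter


def dominant_border_color(pixels, margin=10):
    """Return the most frequent (r, g, b) tuple within `margin` pixels of any edge.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Returns None for an empty image.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    counts = Counter()
    for y, row in enumerate(pixels):
        near_top_or_bottom = y < margin or y >= height - margin
        for x, color in enumerate(row):
            near_left_or_right = x < margin or x >= width - margin
            if near_top_or_bottom or near_left_or_right:
                counts[color] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def to_hex(color):
    """Format an (r, g, b) tuple as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*color)
```

Counting only the border band keeps a large hero image or article body in the middle of the page from outvoting the actual page background.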
If you're new to programming, this approach is going to be wayyyyy over your head.
In Java you can use jsoup. It's quite good:
Document doc = Jsoup.connect("http://YourPage.html").get();
Elements colors = doc.select("[bgcolor]");
I don't know anything about Java or Python, but could you have it parse the HTML code and look for something like 'background-color: <color>'?
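A minimal sketch of that idea in Python, using only the standard library's html.parser and re modules. It collects bgcolor attributes and background-color declarations in inline style attributes; the class and function names are illustrative, and it assumes the saved page has already been read into a string.

```python
import re
from html.parser import HTMLParser

# Matches "background-color: <value>" inside an inline style attribute.
BG_COLOR_RE = re.compile(r"background-color\s*:\s*([^;]+)", re.IGNORECASE)


class BackgroundColorFinder(HTMLParser):
    """Collects (tag, color) pairs from bgcolor and inline style attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("bgcolor"):
            self.found.append((tag, attrs["bgcolor"].strip()))
        # A valueless attribute is reported as None, hence the "or".
        match = BG_COLOR_RE.search(attrs.get("style") or "")
        if match:
            self.found.append((tag, match.group(1).strip()))


def find_background_colors(html_text):
    """Return a list of (tag, color string) pairs found in the markup."""
    finder = BackgroundColorFinder()
    finder.feed(html_text)
    return finder.found
```

This only sees colors written directly on the elements. Colors that come from embedded or external stylesheets are not resolved, which is the limitation the rendering-based answer above points out.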
I have a color image of a document containing text, images, and tables.
The document can have two columns.
The document is composed of areas, each with a header and text (the header has a bigger font, can have a different font color, and may carry something like a sub-header with additional data).
This is an example image, but the real one can be in color:
What I need to do:
I need to find these areas of text with headers in the document image.
What I need to know:
A method for dividing the document into its particular parts.
I tried with OpenCV in Java (if someone has a Python or C++ version, I can convert it to Java myself). I found a few similar problems on Stack Overflow, but none of them helped me. Note that my OpenCV knowledge is not very good; it comes only from online tutorials and Stack Overflow.
Is there a good solution to my problem using OpenCV, or do I need to use something else, such as a different library or application, to achieve this?
The one and only requirement is that it must be runnable from the command line.
If I had these areas I could do what I need next, but this is the step that stops me.
Have you solved the problem?
I'm working on a similar problem.
My solution is to use HoughLines: https://docs.opencv.org/3.4.0/d9/db0/tutorial_hough_lines.html
You can use text detection combined with dilation to detect bold text, i.e. headers, and then group the text boxes between two consecutive headers as the text under the first header.
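The grouping step could be sketched like this, in Python since the asker can convert it. It assumes an earlier detection step (not shown) has already produced text boxes, each with a top coordinate and an is_header flag; the dictionary keys and the function name are hypothetical. It also assumes a single column, so a two-column page would need to be split into columns first.

```python
def group_boxes_by_header(boxes):
    """Group detected text boxes into sections, one per header.

    `boxes` is a list of dicts with keys "y" (top coordinate), "text",
    and "is_header" (True if an earlier step flagged the box as a header).
    Returns a list of {"header": str or None, "body": [str, ...]} dicts.
    Text that appears before the first header gets header None.
    """
    sections = []
    current = {"header": None, "body": []}
    # Walk the boxes top to bottom.
    for box in sorted(boxes, key=lambda b: b["y"]):
        if box["is_header"]:
            # A new header closes the previous section, if it has content.
            if current["header"] is not None or current["body"]:
                sections.append(current)
            current = {"header": box["text"], "body": []}
        else:
            current["body"].append(box["text"])
    if current["header"] is not None or current["body"]:
        sections.append(current)
    return sections
```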
I own a sports apparel company and I'm looking to have an applet built that will allow customers to see how their team names will look in certain colors on jerseys. Below you can see the final result of a competitor site's Flash applet where text is rendered on 2D surfaces/images.
My requirements: I need users to be able to set the font, primary text color, outline text color, and text style (arched or straight).
So my question: is this sort of text rendering possible with only JavaScript/PHP?
If so, what limitations do you foresee? I've been told the arching and outline text color may be issues. I've also been told that I may have to upload library files to a server where the actual rendering may take place.
If not, what scripting would you guys recommend? I'm trying to stay away from Flash because it's slow and costly.
I'll be passing this on to our developers, so please feel free to be as detailed as possible. I figured I'd save them some legwork!
Thank you!
Depending on how complex you want your graphics to be, HTML5 drawing abilities could be used. Check out the Raphaël library, for instance; WebGL/canvas renderers already have a lot of features in modern browsers.
As for the server-rendering solution, it's also possible with GD2 (PHP), but IMHO that would be less convenient; at least try something other than PHP (btw, what's your backend running on?)
Your competitor's solution with a Java applet honestly seems the easiest, except that it requires a JRE, which few people are eager to install =)
That's kind of a high-level question, but yes, you can definitely use JavaScript for it.
If there's a problem with getting characters to look right, you can always save each letter as a separate image and have JavaScript place them next to each other in the preview. I'd try to see how close you could get with the existing fonts first.
Layering the text (one color in a larger font, then a different color in a smaller font on top of it) will give you the outline effect you're looking for.
I am trying to make a small news crawler.
I got everything working after many tries.
The problem is that nearly every HTML news page has more than 50 images.
Many of them are too small, so I am filtering them simply by checking their size.
Only images larger than 200x200 are taken.
But there are many large images on a single page,
and some news articles do not have any related image.
Let's take an example:
Link to News - http://timesofindia.indiatimes.com/india/Over-9-3-lakh-TB-patients-in-India-undetected-Report/articleshow/24600851.cms
My code got this image - Image no. 0 http://timesofindia.indiatimes.com/photo/10905539.cms
Image height - 300
Image width - 450
But this image has nothing to do with the article's topic.
In simple words: "How do I get the correct image dynamically?"
I do not want to write code for each website.
A blank image is better than a wrong image.
I would recommend an approach where you judge the relevance of an image based on its position: if an image appears inside the article, it's probably an image about the article itself (except for ads, which are very wide).
You can find out the source of the image and decide whether it should interest you. For instance, ad images usually come from a different server that doesn't belong to the site (in your case, indiatimes.com).
Consider the alt text. The alt text usually contains either the title completely or some words from the title.
Also, the article does not have any relevant image associated with the title.
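The size, source, and alt-text checks above could be combined roughly as follows. This is a sketch under assumptions: the crawler has already extracted each image's src, width, height, and alt text into dictionaries, the function names are invented for illustration, and the domain comparison is deliberately naive (it keeps the last two host labels, which is wrong for suffixes such as .co.uk). The position-based check is not covered because it needs the page's DOM.

```python
from urllib.parse import urlparse


def site_domain(url):
    """Naively reduce a URL to its last two host labels, e.g. 'indiatimes.com'."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def pick_article_image(page_url, title, images, min_size=200):
    """Return the src of the most plausible article image, or None.

    `images` is a list of dicts with keys "src", "width", "height", "alt".
    An image qualifies only if it is larger than min_size in both
    dimensions, is served from the page's own domain, and shares at
    least one longer word with the article title in its alt text.
    """
    title_words = {w.lower() for w in title.split() if len(w) > 3}
    best_src, best_score = None, 0
    for img in images:
        if img["width"] <= min_size or img["height"] <= min_size:
            continue  # too small to be the main picture
        if site_domain(img["src"]) != site_domain(page_url):
            continue  # probably an ad or third-party widget
        alt_words = {w.lower() for w in (img.get("alt") or "").split()}
        score = len(title_words & alt_words)
        if score > best_score:
            best_src, best_score = img["src"], score
    return best_src
```

Returning None when nothing matches follows the asker's stated preference that a blank image is better than a wrong one.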
I also suggest jsoup:
jsoup: Java HTML Parser
jsoup is a Java library for working with real-world HTML. It provides
a very convenient API for extracting and manipulating data, using the
best of DOM, CSS, and jquery-like methods.
jsoup implements the WHATWG HTML5 specification, and parses HTML to
the same DOM as modern browsers do.
scrape and parse HTML from a URL, file, or string
find and extract data, using DOM traversal or CSS selectors
manipulate the HTML elements, attributes, and text
clean user-submitted content against a safe white-list, to prevent XSS attacks
output tidy HTML
I don't have much experience in Java, but I am attempting to write a simple roguelike game to familiarize myself with it, and I am just wondering how I would go about creating an interface like this:
Are there any obvious ways that you would go about something like this? Being new to Java, I really have no idea what the best method would be.
Sorry to be vague!
Thanks
There is no such (simple) component in the JDK. If you don't need color, a JTextArea can be used to display ASCII art (after setting a fixed-width font). You will need to take care not to run into character-set issues (if you don't stick to 7-bit US-ASCII).
Writing a component that handles color display (and maybe escape sequences, in essence emulates a console window) wouldn't be too hard, but if you just started with Java it may prove to be an unwelcome challenge.
You could also just write your game in Java and leave displaying the ASCII to the system console (your game would simply output to stdout).
Edit: Color ASCII could be achieved by converting your internal format to (simple) HTML, and that HTML could be displayed using a JLabel. It's probably not the most elegant method, but it should be reasonably simple to implement (and with today's hardware, speed should not be an issue with this approach either). This approach builds on the fact that you can just call JLabel.setText() and pass a string that starts with an <html> tag. The JLabel then interprets the whole text as HTML.
Check out Trystan's Ascii Panel, and his blog and tutorial on making a roguelike here.
Better late than never, right? You may want to check Zircon Project.
I want to implement a list like the Stack Overflow question list (where each row has multiple items: text, tags, user, time, etc.) in GWT. What would be the most appropriate approach?
I tried using a FlowPanel with HTML elements inside it, so the result is DIVs inside a DIV.
But then the CSS is a pain (I was unable to get right-aligned multiple rows, a left-aligned user profile image, etc.).
If I use a table, GWT does not support row rendering. I need to work cell by cell, which is again a pain.
So, any suggestions?
(Please exclude GXT, SmartGWT, and other heavyweight frameworks; I just want to use core GWT.)
Cheers,
The major answer here is 'it depends'.
The general way I try to approach anything with GWT is to come up with an HTML mockup. Once you have a static version of the layout you want, complete with CSS, it's actually quite straightforward to convert this into GWT code. See this article on 'tags first gwt' for a well-written example.
The point to keep in mind with GWT is that ultimately, the browser is going to have to deal with a DOM structure you build up, so if you can make it correct without GWT, it is far easier to then make it correct with GWT.
Use a DockPanel for contents like multiple items, text, tags, user, time, etc. Then add the DockPanel to a FlexTable. FlexTable supports adding rows.
How about the GWT Grid? Even that supports text and tables.