Introduction
The title is a bit complicated so let's break it down:
I have an image submitted by a user.
The image is a top view of a landscape featuring clearly marked regions. For example, if this were a park, the image would be a top view of the park's layout.
I need to allow the user to classify different elements in the image and estimate the area occupied by those elements. Continuing with the park analogy: the park may have two pavilions and a sand volleyball court. I must allow the user to mark the points of interest (say, the volleyball court) and compute their area (given the overall dimensions of the depicted park).
Current Ideas
I think I should create a BufferedImage and use that as the background of a canvas.
I'm not sure about the user input. My first idea was to have users drag rectangles associated with a specific feature (e.g. a red rectangle for volleyball courts) onto the relevant region of the image. Rectangles work because the elements are mostly rectangular, but I don't know whether users can resize them.
To reiterate, the main problem is determining the area occupied by physical structures in a given image. No machine vision, just plain old Mouse Events.
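A minimal sketch of the direction I'm imagining, assuming Swing (the RegionPanel class, the metersPerPixel scale, and the red-for-volleyball convention are all my own placeholders, not a settled design):

```java
import java.awt.*;
import java.awt.event.*;
import java.awt.image.BufferedImage;
import javax.swing.*;

// Panel that paints the uploaded image and lets the user drag out a rectangle.
public class RegionPanel extends JPanel {
    private final BufferedImage background;
    private final double metersPerPixel; // derived from the known park dimensions
    private Rectangle selection;         // rectangle currently being dragged
    private Point anchor;

    public RegionPanel(BufferedImage background, double metersPerPixel) {
        this.background = background;
        this.metersPerPixel = metersPerPixel;

        MouseAdapter mouse = new MouseAdapter() {
            @Override public void mousePressed(MouseEvent e) {
                anchor = e.getPoint();
                selection = new Rectangle(anchor);
            }
            @Override public void mouseDragged(MouseEvent e) {
                // Allow dragging in any direction from the anchor point.
                selection.setBounds(Math.min(anchor.x, e.getX()),
                                    Math.min(anchor.y, e.getY()),
                                    Math.abs(e.getX() - anchor.x),
                                    Math.abs(e.getY() - anchor.y));
                repaint();
            }
            @Override public void mouseReleased(MouseEvent e) {
                // Area in square meters = pixel area * (meters per pixel)^2.
                double area = selection.width * selection.height
                            * metersPerPixel * metersPerPixel;
                System.out.printf("Marked region: %.1f m^2%n", area);
            }
        };
        addMouseListener(mouse);
        addMouseMotionListener(mouse);
    }

    @Override protected void paintComponent(Graphics g) {
        super.paintComponent(g);
        g.drawImage(background, 0, 0, null);
        if (selection != null) {
            g.setColor(Color.RED); // e.g. red = volleyball court
            ((Graphics2D) g).draw(selection);
        }
    }
}
```

The area falls out of the pixel dimensions once I know the image's real-world scale, so the hard part is still the input handling.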
How should I be approaching the user input dilemma? Any APIs I should be digging through?
Please let me know if I can improve the question and explanation.
Related
So I am new to libGDX and I am thinking about making a 2D quest game. I want to know how to interact with objects on the background image. For example, I have this location.
Is it possible to get the window's coordinates so that, if the player is next to it, he can press "E" or do something like that? (I want to check the interaction possibilities using the player's coordinates and the interactive object's coordinates.)
I was thinking about putting certain images in the background, but I don't know if that's a good idea. I also saw something about TextureRegion, but I don't think it can help me, because I still don't know how to get the coordinates of a specific area in the background.
Colour-code the objects. From the appearance of your game it's a retro adventure, so you are using a restricted palette. A restricted palette has advantages! If you give windows two colours, doors two different and separate colours, and other objects two further separate colours, then you can check when the player gets near a window (or whatever else) by seeing which active colours are present within a radius around the player's location.
So, for example, the floor and walls are coloured of course, but they never trigger anything. This does limit you to one window per room, or one type of object per room (unless you introduce a new window colour pair to distinguish them). The code would simply sample all the pixels in an area around the player; a rectangle might work better than a circle.
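A rough sketch of that sampling in libGDX, assuming you keep a CPU-side Pixmap of the background alongside the Texture you actually draw (the colour constants, class name, and radius are placeholders):

```java
import com.badlogic.gdx.Gdx;
import com.badlogic.gdx.graphics.Pixmap;

public class TriggerSampler {
    // RGBA8888 values reserved for interactive objects (placeholder values).
    public static final int WINDOW_COLOR = 0x4060FFFF;
    public static final int DOOR_COLOR   = 0x804000FF;

    private final Pixmap pixmap;

    public TriggerSampler(String backgroundFile) {
        // Keep a CPU-side copy of the background just for sampling;
        // the Texture you render can be created from the same file.
        pixmap = new Pixmap(Gdx.files.internal(backgroundFile));
    }

    /** Returns true if any pixel of the given colour lies within `radius`
     *  pixels of (px, py). Note Pixmap's y-axis runs top-down, so flip
     *  the player's y first if your camera's y runs bottom-up. */
    public boolean isNear(int px, int py, int color, int radius) {
        for (int x = px - radius; x <= px + radius; x++) {
            for (int y = py - radius; y <= py + radius; y++) {
                if (x < 0 || y < 0
                        || x >= pixmap.getWidth() || y >= pixmap.getHeight())
                    continue;
                if (pixmap.getPixel(x, y) == color) return true;
            }
        }
        return false;
    }
}
```

Then in your update loop you would check something like isNear(playerX, playerY, TriggerSampler.WINDOW_COLOR, 8) together with Gdx.input.isKeyJustPressed(Input.Keys.E) to fire the interaction.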
A portion of my app involves the user drawing images that will be later strung together in a PDF. The user is free to use the entire screen to draw. Once the user is done drawing, I'd like to trim off all of the white space before adding the image to a PDF. This is where I am having problems. I've thought of two different methods to determine the location of trimmable white space and both seem clumsy.
My first thought was to have each stylus motion event check whether it has gone outside the box so far and, if so, expand the box to accommodate it. Unfortunately, I could see polling on every motion event being bad for performance. I can't just look at up and down events, because the user could draw something like the letter V.
Then I thought I could look at all the pixels (using getPixel()) and see where the highest, lowest, rightmost and leftmost black pixels are. Again this seems like a really inefficient way to find the box. I'm sure I could skip some pixels to improve performance, but I can't skip too many.
Is there a standard way of doing what I want to do? I haven't been able to find anything.
Inside your editor, where you record that a pixel has been drawn upon, you can update the maximum and minimum X and Y, and then use them later to crop the image.
If the user is drawing, aren't you already handling the onTouchEvent callback in order to capture the drawing events? If so, it shouldn't be a big deal to keep a minX, maxX, minY and maxY and check each recorded drawing event against these values.
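A hedged sketch of that, assuming a custom View handles the drawing (the class name and the 4-pixel margin are placeholders):

```java
import android.content.Context;
import android.graphics.Bitmap;
import android.view.MotionEvent;
import android.view.View;

public class DrawingView extends View {
    // Extremes of every point the stylus has touched so far.
    private float minX = Float.MAX_VALUE, minY = Float.MAX_VALUE;
    private float maxX = -1, maxY = -1;

    public DrawingView(Context context) {
        super(context);
    }

    @Override
    public boolean onTouchEvent(MotionEvent event) {
        float x = event.getX(), y = event.getY();
        minX = Math.min(minX, x);
        minY = Math.min(minY, y);
        maxX = Math.max(maxX, x);
        maxY = Math.max(maxY, y);
        // ... existing drawing logic goes here ...
        invalidate();
        return true;
    }

    /** Crops the finished drawing to the tracked bounds plus a small margin. */
    public Bitmap cropToDrawing(Bitmap full) {
        int left   = Math.max(0, (int) minX - 4);
        int top    = Math.max(0, (int) minY - 4);
        int right  = Math.min(full.getWidth(),  (int) maxX + 4);
        int bottom = Math.min(full.getHeight(), (int) maxY + 4);
        return Bitmap.createBitmap(full, left, top, right - left, bottom - top);
    }
}
```

The four comparisons per event are trivial next to the drawing work you're already doing, so performance shouldn't be a concern.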
My app takes a picture of the user, and I want the user to mark where their eyes, mouth, and ears are on the image. The ears are just one point each, but the mouth is a selection of the whole mouth area.
I have the taking-a-picture part done, but how can I go about allowing the user to highlight the mouth area and then get those pixels?
http://developer.android.com/guide/topics/ui/ui-events.html has details on using touch events, which you'll need to understand and implement.
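Beyond the docs, one hedged sketch of the mouth selection: collect the touch points into a Path and use a Region for the point-in-outline test. The class and method names below are placeholders, not a fixed API:

```java
import android.graphics.Bitmap;
import android.graphics.Path;
import android.graphics.Rect;
import android.graphics.RectF;
import android.graphics.Region;
import java.util.ArrayList;
import java.util.List;

/** Collects a freehand outline from touch events and reads back the
 *  pixels enclosed by it. */
public class MouthSelector {
    private final Path outline = new Path();

    /** Feed this from onTouchEvent: isDown for ACTION_DOWN, else ACTION_MOVE. */
    public void onTouch(float x, float y, boolean isDown) {
        if (isDown) outline.moveTo(x, y);
        else outline.lineTo(x, y);
    }

    /** Returns the ARGB values of every pixel inside the outline. */
    public List<Integer> pixelsInside(Bitmap photo) {
        RectF boundsF = new RectF();
        outline.computeBounds(boundsF, true);
        Rect bounds = new Rect();
        boundsF.round(bounds);

        // Region.contains() gives a cheap point-in-path test.
        Region region = new Region();
        region.setPath(outline, new Region(bounds));

        List<Integer> pixels = new ArrayList<>();
        for (int y = bounds.top; y < bounds.bottom; y++) {
            for (int x = bounds.left; x < bounds.right; x++) {
                if (region.contains(x, y)
                        && x >= 0 && y >= 0
                        && x < photo.getWidth() && y < photo.getHeight()) {
                    pixels.add(photo.getPixel(x, y));
                }
            }
        }
        return pixels;
    }
}
```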
I'm working on a small project which requires changing the clothes (shirt/pants, etc.) of a person in any 2D image they choose to upload. Somehow the edges need to be detected and the relevant areas filled with new patterns. I do see a lot of other complications, but let's assume only simple patterns have to be filled.
For a web application, is it possible to do it in HTML5? Any other alternatives?
For a standalone application, what kind of technology would be preferred, C++/Java?
Update
Based on Bart's comment:
Any pointers along the lines of Bart's would be really helpful.
Assumption: a clear, traceable 'standing' human figure in the 2D image.
Since it's a still image, there is no real-time scenario to handle.
A way to do this is to require the user to take two pictures. One is the picture with the user in it; the other must be taken from the same camera position and orientation, but with the user stepping out of the frame.
Since both pictures share the same background, you can compare them pixel by pixel and flag those pixels that differ by more than some threshold. Of course, the threshold must be chosen so that camera noise isn't detected as a difference. Once you have the collection of differing pixels, you can filter them and calculate an approximate silhouette for the user from the pixels on the edge.
A simplification of the above method can be done if you have control over the background. You could use a bluescreen to avoid having to have a second picture with the background.
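A rough sketch of the pixel-difference pass with plain BufferedImage (the class name is a placeholder, and the per-channel threshold is something you would tune against your camera's noise):

```java
import java.awt.image.BufferedImage;

public class SilhouetteMask {
    /** Marks pixels whose summed per-channel difference between the two
     *  shots exceeds `threshold`. Assumes identical image dimensions. */
    public static boolean[][] diffMask(BufferedImage withUser,
                                       BufferedImage emptyScene,
                                       int threshold) {
        int w = withUser.getWidth(), h = withUser.getHeight();
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int a = withUser.getRGB(x, y);
                int b = emptyScene.getRGB(x, y);
                // Sum of absolute differences across R, G, and B.
                int diff = Math.abs(((a >> 16) & 0xFF) - ((b >> 16) & 0xFF))
                         + Math.abs(((a >> 8) & 0xFF) - ((b >> 8) & 0xFF))
                         + Math.abs((a & 0xFF) - (b & 0xFF));
                mask[y][x] = diff > threshold; // true = likely the person
            }
        }
        return mask;
    }
}
```

From there you would still need to filter stray flagged pixels (camera noise survives any fixed threshold) before tracing the silhouette, as noted above.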
My question is similar to this one, but is more specific in scope.
In my card game application, I would like for users to be able to click on words located in a scanned jpeg image. Please see this sample Pokemon trading card.
In this case, the user should be able to hover his mouse over the text "Scratch", upon which a pulsing rectangular border will appear around the text, indicating that it is clickable. The problem is how to detect the border of the text. There will be an array of words KNOWN BEFOREHAND that the user may click on (these will be retrieved from a database on a card-by-card basis). To continue our example, the array in this case will be ["Scratch", "Live Coal"]. Once the user clicks on "Scratch", the application must know via a call-back that "Scratch" was chosen instead of "Live Coal".
I was thinking of using optical character recognition libraries to solve this problem, but the open-source options for this are poor in quality (e.g. GOCR) and/or not well-tested on multiple platforms (e.g. Tesseract). I only care about Windows and Mac compatibility. Am I missing an obvious/simpler solution/algorithm that does not require OCR? I cannot simply hand-code in bounding boxes for each card, as there will be thousands of scanned cards in my database. The user may also upload his own custom card scans with an accompanying array of clickable text.
Text color is not always black. See this panorama of different card and text styles that will be permitted. The black cards have white text, and the third-to-last card (Zekrom) has black text with a white outline.
Solutions in any programming language are appreciated. However, please note that I am looking for open-source algorithms and/or libraries. If there is a solution in Ruby or Java, even better, as my code is primarily in these two languages.
EDIT: I forgot to mention that the order of the words/phrases in the array will be the same as on the card. Thus, the array will be ["Scratch", "Live Coal"] instead of ["Live Coal", "Scratch"]. I am mentioning this because it can potentially simplify the task. Thus, for this example, I can simply look for black pixels (though I have to watch out for the black star in the white circle). However, there will be more difficult cases where there is descriptive text under the attack name in a smaller font (again, see the panorama for examples).
For simplicity, I would just write a program that allows you to visually draw a bounding box around your text, but you could do this by detecting differences in pixel color. Since the text is black, you could find the upper-left-most black pixel without large indents and within the bottom half of the card.
When the cursor is stationary, check whether there is a black pixel underneath it or within 4 pixels around it. If there is, scan left, right, up, and down from the cursor for the first run of three consecutive non-black pixels (three, because there might still be a non-black pixel between the letters), and use those locations to draw the rectangle. You could use OpenCV for this.
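As a hedged sketch of that pixel-scan idea in plain Java rather than OpenCV (the class name, luminance threshold, and search radius are guesses; for the white-on-black cards you would flip the comparison):

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class TextBoxFinder {
    /** Grows a bounding box around dark pixels near (cx, cy).
     *  `radius` limits the search and `threshold` decides what counts as dark. */
    public static Rectangle boxAround(BufferedImage card, int cx, int cy,
                                      int radius, int threshold) {
        Rectangle box = null;
        for (int y = Math.max(0, cy - radius);
                 y < Math.min(card.getHeight(), cy + radius); y++) {
            for (int x = Math.max(0, cx - radius);
                     x < Math.min(card.getWidth(), cx + radius); x++) {
                int rgb = card.getRGB(x, y);
                // Average the channels for a crude luminance value.
                int lum = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF)
                           + (rgb & 0xFF)) / 3;
                if (lum < threshold) {
                    if (box == null) box = new Rectangle(x, y, 1, 1);
                    else box.add(x, y); // expand the box to include this pixel
                }
            }
        }
        return box; // null if no dark pixels near the cursor
    }
}
```

With OpenCV you would get the same effect from a threshold pass followed by a contour search, which would also cope better with the smaller descriptive text under the attack names.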