On Spy++ and capturing and sending messages to/from SunAwtFrame [duplicate]

Background:
I was trying to program an auto clicker that sends clicks to an application running in the background (Roblox, not trying to do anything malicious). I was able to get the window and perform commands like closing it. However, when I try to send clicks to the window, SendMessage returns 0. (I'm using SendMessage so I don't activate the window.)
Minimum reproducible example:
import win32gui
import win32con
import win32api
hwnd = win32gui.FindWindow(None, "Roblox")
while True:
    lParam = win32api.MAKELONG(100, 100)
    temp = win32gui.SendMessage(hwnd, win32con.WM_LBUTTONDOWN, None, lParam)
    win32gui.SendMessage(hwnd, win32con.WM_LBUTTONUP, None, lParam)
    print(temp)
Things I tried:
I tried changing the window to see if it was the wrong window, or if it didn't see the window
I tried sending the message normally:
lParam = win32api.MAKELONG(100, 100) # Get the coordinates and change to long
temp = win32gui.SendMessage(hwnd, win32con.WM_LBUTTONDOWN, None, lParam) # Send message to handle
win32gui.SendMessage(hwnd, win32con.WM_LBUTTONUP, None, lParam) # Release key from sent message to handle
I tried it with other windows, and it worked, but not for Roblox
I tried with other commands and it works, but clicks don't. This works: (So I know it's the right window)
temp = win32gui.SendMessage(hwnd, win32con.WM_CLOSE, 0, 0) # Close window with SendMessage

You cannot do that.
Let's start by rephrasing the problem statement to more easily follow along, why that is the case:
"How do I convince a program that has chosen to ignore mouse input messages—by decision or coincidence—to acknowledge mouse input messages?"
As it turns out, that part is actually solved. As the documentation for WM_LBUTTONDOWN notes:
If an application processes this message, it should return zero.
And zero you get, so there's no reason to question the fact that the message has been handled to the extent deemed necessary by the application. This probably falls down the "coincidence" branch, where the application just isn't interested in mouse messages any more than passing them on to DefWindowProc, the kitchen sink for all messages that aren't relevant enough to even ignore.
Key insight here is: A program that needs to process and respond to mouse input can decide to ignore mouse input messages [1]. (And clients that are based on mouse message handling can easily identify fake input messages, too, and respond by, y'know, not responding altogether.)
So, in essence, sending (or posting) fake mouse messages isn't going to work. Reliably. Ever.
Which leaves you with essentially 3 alternatives:
UI Automation
Accessing a custom automation interface
SendInput (a consolidated version combining keybd_event and mouse_event)
The first two options are listed for completeness only. They are commonly available for applications that actively support being automated. Games generally don't, and protecting against those avenues is easy, and cheap: An application doesn't have to do anything.
SendInput won't work, either. As far as the system is concerned, injected input is processed the same way as any other input (this blog post offers a helpful illustration). Specifically, when a mouse click is injected over a window, that window comes to the foreground. So that fails the requirement to have the application "in the background".
Even if that wasn't the case, injected input is easily and reliably identifiable. A low-level mouse hook is all that's required to get an MSLLHOOKSTRUCT, whose flags field readily gives this information away. With a low-level hook's ability to prevent input from being passed on to the system, a return 1; is all that's needed to filter out those input events.
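To make the flag test concrete, here is a minimal Java sketch of the bit check a hook procedure would apply to MSLLHOOKSTRUCT.flags. It installs no hook (that requires native code), and the class and method names are made up for illustration; only the two LLMHF_* constants and their values come from the Windows headers.

```java
// Illustration only: the bit test a low-level mouse hook would run on
// MSLLHOOKSTRUCT.flags. No hook is installed; the names are hypothetical.
final class InjectedInputCheck {
    // Values as defined in the Windows SDK headers.
    static final int LLMHF_INJECTED = 0x00000001;
    static final int LLMHF_LOWER_IL_INJECTED = 0x00000002;

    /** True if the event was injected rather than produced by hardware. */
    static boolean isInjected(int flags) {
        return (flags & LLMHF_INJECTED) != 0;
    }

    /** True if the event was injected by a process at a lower integrity level. */
    static boolean isInjectedFromLowerIntegrity(int flags) {
        return (flags & LLMHF_LOWER_IL_INJECTED) != 0;
    }

    public static void main(String[] args) {
        System.out.println("flags=0 injected? " + isInjected(0));
        System.out.println("flags=LLMHF_INJECTED injected? " + isInjected(LLMHF_INJECTED));
    }
}
```

A single AND against the flags field is the whole detection step, which is why filtering injected input is so cheap for the target application.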
And that covers all supported ways to automate a foreign application. It's a dead end so dead that it's not worth beating.
Now, if automating an application that runs in the background using fake input summarizes the requirements, then your only option is to run the application in a virtualized environment (which ensures that a click stays confined within the virtual environment and won't bring the application to the front). Keep in mind that all restrictions covered above still apply, and you cannot use any of the methods above. You would have to implement and install a custom mouse driver that generates input that's indiscernible from genuine hardware-sourced input events.
But even then, applications have ways of discovering that they are running in a virtualized environment, and refuse to operate when they do.
The bottom line is: Cheating is hard. Really hard. And your problem doesn't have an easy solution.
[1] Mouse input messages are generated by the system as a convenience. They represent a useful (and lossy) abstraction over hardware input events. The full fidelity of those hardware input events is generally not required by "standard" applications.
Games, on the other hand, will usually use lower-level input processing infrastructure, such as Raw Input, and not even look at any of the higher-level processing artifacts.

Related

Command Messages - classification design question

I'm trying to design one service using event sourcing. One use case is - when command arrives, I have to execute some logic and transform object from one MODE to other one. Now, the "problem" I'm facing is to decide between these two options for command messages:
Option 1: (put mode directly into name of the command message)
SetToManualModeCommand{
    int val = 22;
}
SetToAutoModeCommand{
    int val = 22;
}
SetToSemiAutoModeCommand{
    int val = 22;
}
SetTo...ModeCommand{
    int val = 22;
}
Option 2: (put mode as an enum inside single command message)
enum Mode{
    AUTO,
    SEMI_AUTO,
    MANUAL;
}
SetToModeCommand{
    Mode mode;
    int val = 22;
}
Modes do not change often, but they can change. If they do, option 1 requires a new class, while option 2 requires one additional enum value.
Do you see any drawbacks or benefits to either of these two options, or are they more or less the same?

From the context you’ve given I don’t see a convincing argument either way. I know you’re talking about commands, not events, but try thinking about it from the event subscriber’s point of view. Is the significant event that the mode changed in some way, or that it changed to a specific value? In other words, would subscribers want a single ModeChanged event with details inside the event, or would some subscribers want just ModeChangedToManual and others just ModeChangedToAuto, etc.? You may also want to consider the event storage system you’re using and how easy it is to create a filtered subscription.
It’s convenient (not required) that each command creates a single event. If subscribers would prefer a single event and you have 4 commands issuing that event it makes the system just a tiny bit more complicated, and those tiny bits tend to add up. If it’s better for subscribers that you have 4 separate events, then have 4 separate commands.

As a general principle - make the implicit explicit. Therefore make a command for each 'mode'. But I honestly don't think there is much in it.
Clearly, what you actually do is very dependent on the context of the system you are building. How often do new modes get introduced?
I would say, however, that you should definitely have an event for each mode.

See what your code wants. If the command-sending code is a switch statement that encodes your mode, and the event-consuming code is also a switch statement that decodes your mode, then just send the mode as a command/event payload.
If the consuming code for each mode lives in a different place, then make it easy for each place to subscribe to the right mode: make different events for each mode.
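As a rough illustration of the "mode as payload" shape, here is a small Java sketch assuming one command maps to one event. Mode and SetToModeCommand follow the question; ModeChanged, handle and describe are hypothetical names introduced only for this example.

```java
// Sketch of option 2: a single command carrying the mode as data.
// ModeChanged, handle() and describe() are hypothetical, not from the question.
final class ModeCommandSketch {
    enum Mode { AUTO, SEMI_AUTO, MANUAL }

    static final class SetToModeCommand {
        final Mode mode;
        final int val;

        SetToModeCommand(Mode mode, int val) {
            this.mode = mode;
            this.val = val;
        }
    }

    static final class ModeChanged {
        final Mode mode;
        final int val;

        ModeChanged(Mode mode, int val) {
            this.mode = mode;
            this.val = val;
        }
    }

    /** Command side: one command produces one event with the same payload. */
    static ModeChanged handle(SetToModeCommand command) {
        return new ModeChanged(command.mode, command.val);
    }

    /** Consumer side: a single subscriber that decodes the mode with a switch. */
    static String describe(ModeChanged event) {
        switch (event.mode) {
            case AUTO:
                return "auto:" + event.val;
            case SEMI_AUTO:
                return "semi-auto:" + event.val;
            case MANUAL:
                return "manual:" + event.val;
            default:
                throw new IllegalStateException("unhandled mode " + event.mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(handle(new SetToModeCommand(Mode.MANUAL, 22))));
    }
}
```

If the branches of that switch end up in unrelated parts of the system, that is the signal to split ModeChanged into one event type per mode instead.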

My general approach is to not include classification as structure. Having classification as data seems to be easier.
Another example that is similar to yours would be different types of customers: GoldCustomer, SilverCustomer, or BronzeCustomer. A while later (2-3 years) the business may decide to add a PlatinumCustomer.
An enumeration may be better but for anything that is not fixed I would not use structure (neither class nor enumeration). Field positions in some sport may be an enumeration since these are fixed. If they change then it is significant enough to warrant code changes.
To that end I would have some domain concept represent the mode in a more generic/fluid fashion but YMMV.

Where are WindowEvents fired from?

I have a really weird problem: one of the windows in my app (let's call it "Window A") is consistently placing itself (or being put) behind the window that brings it out ("Window B"). Even if I click on Window A, Window B immediately comes forward again.
There's nothing obvious in the code as to why this might be happening. I can write a windowActivated() or windowDeactivated(), but by the time they're called the information on who actually switched the windows is of course long gone.
How can I get to the point where those events are fired?

I've found what was causing the bug and thought I'd post it here in case it's of use to others. I never found an answer to my original question, and I still think it may well have shown me where the bug was coming from. It certainly was non-obvious.
It turns out that if you have a custom focus traversal policy (Container.setFocusTraversalPolicy(FocusTraversalPolicy)) and its getFirstComponent() passes back a component which is not focusable[*], then whenever the window is brought to the front either programmatically or by the user, it will be sent back one step in the z-order hierarchy.
I found the problem by good old brute force: the offending window is part of an inheritance hierarchy like so:
      AbstractSuperclass
         /          \
        /            \
BuggyWindow      NonBuggyWindow
I made a ToyWindow class to also descend from AbstractSuperclass. It didn't have the bug. I laboriously copied in code from BuggyWindow until the bug exhibited. That was in a long method which is called when the window is first displayed; by successive deletion I arrived at the offending block of code, where a number of widgets have their isEditable() and isEnabled() set to false. Since other windows have all their widgets disabled (in View mode), there was obviously more to it than that. Then I remembered the custom focus traversal policy.
I wrote a toy program with the important elements, and was able to reliably trigger the bug. I added checks for focusability to all of the methods in my custom focus traversal policy. Bug go bye-bye.
Thanks to those who responded and I do apologise for the lack of information. I didn't want to waste people's time by shoving huge amounts of code at you. It meant you didn't get what you're used to here, and that was unfortunate.
[*] I'm taking non-focusable as !(isFocusable() && isEnabled()) since I couldn't quickly get sufficient information to understand exactly when a component might be focusable but not enabled (or the other way around), and it was good enough for my purposes. (Oh, how I wish the JRE would have comments better than e.g. "isFocusable() returns whether this component can be focussed"..."isEnabled() returns whether this component is enabled" -- ##&$!!!)
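The kind of check described above might look like the following Java sketch. It is not the poster's code: FocusableOnlyPolicy and its fixed component order are assumptions made for illustration, and "focusable" uses the same working definition as the footnote (isFocusable() && isEnabled()).

```java
import java.awt.Component;
import java.awt.Container;
import java.awt.FocusTraversalPolicy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical policy over a fixed component order that never hands back
// a component which cannot take focus.
final class FocusableOnlyPolicy extends FocusTraversalPolicy {
    private final List<Component> order;

    FocusableOnlyPolicy(Component... order) {
        this.order = new ArrayList<>(Arrays.asList(order));
    }

    /** Working definition from the post: focusable and enabled. */
    static boolean canTakeFocus(Component c) {
        return c != null && c.isFocusable() && c.isEnabled();
    }

    /** Scans from 'start' in direction 'step', wrapping around; null if nothing qualifies. */
    private Component scan(int start, int step) {
        int n = order.size();
        for (int i = 0; i < n; i++) {
            Component c = order.get(Math.floorMod(start + i * step, n));
            if (canTakeFocus(c)) {
                return c;
            }
        }
        return null;
    }

    @Override
    public Component getFirstComponent(Container root) {
        return scan(0, 1);
    }

    @Override
    public Component getLastComponent(Container root) {
        return scan(order.size() - 1, -1);
    }

    @Override
    public Component getDefaultComponent(Container root) {
        return getFirstComponent(root);
    }

    @Override
    public Component getComponentAfter(Container root, Component c) {
        return scan(order.indexOf(c) + 1, 1);
    }

    @Override
    public Component getComponentBefore(Container root, Component c) {
        return scan(order.indexOf(c) - 1, -1);
    }
}
```

Such a policy would be installed with Container.setFocusTraversalPolicy(...), as in the post; every method goes through the same scan, so the focusability check cannot be forgotten in one of them.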

Java System.in.read() vs Event Handler?

Assuming I have something like this:
char x = (char) System.in.read();
if (x == 'a') {
    // Do something;
}
How much different is it from something like:
public void handle(KeyEvent event) {
    switch (event.getCode()) {
        case A: // Do something;
            break;
        case ENTER: // Do something else;
            break;
    }
}
I mean when should I use the first and when the second? What are the pros and cons?

The two approaches are getting input from the user in two different ways.
The first is reading characters from the JVM's "standard input" stream. If you ran your application without redirecting standard input, this stream is likely to be coming from the "console window" where you launched the JVM. The keystrokes on the console window are processed by the console / OS line editor until the user types ENTER. When that happens a line of characters is delivered to the input stream ready to be read by the JVM / Java application.
The second is processing keyboard events directly. However, this only works in a GUI application. It only sees the keyboard events directed at the application's window(s).
I mean when should I use the first and when the second?
Use the first in a console-based application where you don't need to see characters at the instant the user presses a key.
Use the second when you have a GUI based application and you want to get input from the user interactively.
What are the pros and cons?
That is self-evident from the above. However, there are a couple additional "cons".
With the System.in approach:
The System.in stream could be coming from a file. If you need to be sure (or as sure as possible) that you are talking to a real user, use System.console() ... and check that you don't get null.
If you want to see the user's characters as they are typed, you need to use a 3rd-party library.
With the EventHandler approach:
This won't work on a "headless" system.
It is heavy-weight on the Java side as well. Something on the Java side needs to deal with key-up / key-down events, echoing, line-editing, and so on. If you are intercepting the events in your code, it may well be your code that has to do the heavy lifting.
I don't think there is a way to use it without launching a window. (Otherwise there would be no way for the end user to know where the "input focus" is for the characters that he / she is typing.)
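The System.console() check mentioned above could be sketched like this in Java. InputSource and readLine are hypothetical names; the sketch assumes that a null Console means input is redirected, which holds on older JDKs but should be treated as a heuristic on recent releases, where a Console may be returned even without a terminal.

```java
import java.io.BufferedReader;
import java.io.Console;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

// Sketch: prefer the interactive console, fall back to plain standard input.
final class InputSource {
    /** Reads one line from the console if present, otherwise from the fallback reader. */
    static String readLine(Console console, BufferedReader fallback) {
        if (console != null) {
            return console.readLine();
        }
        try {
            return fallback.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Console console = System.console();
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        System.out.println(console == null
                ? "System.console() returned null; reading System.in"
                : "console available");
        System.out.println("read: " + readLine(console, stdin));
    }
}
```

Either path still delivers whole lines only after ENTER; seeing individual keystrokes remains a job for a GUI event handler or a third-party library.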

Check state of and information about an external application

Here (http://www.desktop-macros.com/) is a program which records sequences of mouse clicks and key strokes on a PC and then plays it back to perform some user-defined actions.
Now, what I'd like to achieve is a bit more demanding: I'd like for example to launch a browser with mouse clicks, wait until it's started (i.e. its application screen is visible) and then again perform some mouse&keyboard actions.
Of course it would also be useful to obtain other information, like the position and dimensions of the window.
Is it possible to make such fancy OS-related operations (like checking whether an application is fully-loaded) with Java? Maybe there are some non-standard libraries with useful API?
If not, could you recommend some way/language of solving such an issue?

I use AutoHotkey, which has the WinWait command that waits for a window with a given title. But I rely on Send {Enter}, not on the mouse, to do things.

Introduce Delay after keyReleased() event

So, I'm working with swing and I need to find a clean (non-CPU-hogging-way) to introduce a delay on a text field. Basically, users will enter a number into this field and the keyReleased() event checks that the input fits a few parameters and then assigns the value to a data storage element in the program. If the data is invalid, it displays a message. Since the routine is called every time they type a letter (unless they type VERY fast), the input process becomes quite annoying (as in general one or two characters of data are not going to fit the allowed parameters).
I've tried setting up a timer object and a timer task for it; however, it doesn't seem to work very well (because it delays the thread the program is running on). The option to just wait until the data reaches a certain length is also not possible since (as stated before) the input can vary in length.
Anyone got any ideas? Thanks!

I've done stuff like this fairly frequently, and I have two suggestions.
The standard way of dealing with this is to use the InputVerifier class. This however only operates when the input component loses focus - it's designed to prevent the user navigating out of an input field when it's invalid, but it doesn't check until then.
The other way I've done this is to check validity on every keystroke, but not to bring up a message when it's invalid. Instead use a color to indicate validity - e.g. color it red when it's invalid and black when valid. This isn't nearly as intrusive as the message. You can use a tooltip to give more detailed feedback.
You can also combine these methods.

Write a custom DocumentFilter. Read the section from the Swing tutorial on Text Component Features for more information.
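For a sense of what such a filter involves, here is a minimal Java sketch that only admits digits. The digits-only rule, the class name DigitsOnlyFilter and the afterTyping helper are assumptions for illustration; the real validity rules from the question would go where isAllDigits is called.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DocumentFilter;
import javax.swing.text.PlainDocument;

// Sketch of a DocumentFilter that rejects any edit containing a non-digit.
final class DigitsOnlyFilter extends DocumentFilter {
    static boolean isAllDigits(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isDigit(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    @Override
    public void insertString(FilterBypass fb, int offset, String text, AttributeSet attrs)
            throws BadLocationException {
        if (text != null && isAllDigits(text)) {
            super.insertString(fb, offset, text, attrs);
        }
    }

    @Override
    public void replace(FilterBypass fb, int offset, int length, String text, AttributeSet attrs)
            throws BadLocationException {
        // A null text means a pure deletion, which is always allowed here.
        if (text == null || isAllDigits(text)) {
            super.replace(fb, offset, length, text, attrs);
        }
    }

    /** Demo helper: appends each chunk to a filtered document and returns the result. */
    static String afterTyping(String... chunks) {
        PlainDocument doc = new PlainDocument();
        doc.setDocumentFilter(new DigitsOnlyFilter());
        try {
            for (String chunk : chunks) {
                doc.insertString(doc.getLength(), chunk, null);
            }
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(afterTyping("12", "ab", "3"));
    }
}
```

On a text field the filter would be attached with ((AbstractDocument) field.getDocument()).setDocumentFilter(new DigitsOnlyFilter()). Because invalid characters never reach the document, there is nothing left to report after each key release.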
