Encoding of key codes (virtual?) sent to virtualbox-guest-system - java

How to send a string to a virtualbox-guest-machine?
This is my code:
public void testKeyboard() throws Exception {
IVirtualBox b = connect();
List<IMachine> machines = b.getMachines();
for (IMachine m : machines) {
MachineState d = b.getMachineStates(Arrays.asList(m)).iterator().next();
if (d == MachineState.Running) {
ISession s = manager.getSessionObject();
m.lockMachine(s, LockType.Shared);
IConsole console = s.getConsole();
IKeyboard k = console.getKeyboard();
k.putScancodes(Arrays.asList(25, 25 | 0x80)); // <- sends the character P
s.unlockMachine();
}
}
}
Microsoft says it's 0x50.
java.awt.event.KeyEvent also says it's 0x50:
/** Constant for the "P" key. */
public static final int VK_P = 0x50;
In VirtualBox it's different:
P=25
2=0x50
b=0x30
Why in the world is the code of P = 25 in virtualbox?

Virtual key codes and keyboard scan codes are two different things.
Keyboards (still) talk to PCs using a really dusty old protocol. Many years ago I implemented a toy version of the protocol in an OS driver and it was quite annoying. Making it worse, keyboards can use 3 different sets of keyboard scancodes which originated with different flavors of old PCs.
Here is a reference for the 3 sets: https://www.vetra.com/scancodes.html
The constants you discovered line up with Set 1. Note that Set 1 has different codes used to press the key and release the key!
The virtual key constants used by the Java KeyEvent class and by that Windows list both seem to be based closely on ASCII characters, since 'P' = 0x50 in ASCII. But it's the job of the operating system keyboard driver to translate from keyboard scancodes to more logical sets of constants. There is no universal constant for a key.
Since VirtualBox is emulating the physical keyboard interface to the guest OS, its IKeyboard API takes raw scancodes.
For the VirtualBox GUI, it must have a function which translates from host OS key constants back to scancodes for the guest OS, which might be easier if you can use that, but IKeyboard appears to be a very low-level interface which bypasses that.
Depending on your use case, perhaps there is a different API you can also use here; or perhaps you can control your VM with guest software like SSH or VNC.

Scan codes identify physical key positions rather than characters; the keyboard layout configured in the guest decides which character each one produces. You can send codes based on the scan codes for a USB keyboard, which all modern systems are likely to understand. The codes are documented in Appendix C of the Microsoft document "Keyboard Scan Code Specification". You will see there that character P is in position 26 in the 1-indexed table of key location. This would explain a 0-indexed code of 25 producing the character P.
You will see that "key down" needs you to send 25 as the scan code (0x19) and then 0x99 for key up. This is what your (25 | 0x80) is doing. So your code at the moment sends a message meaning "Key P has been pressed (make in the table)" followed by "Key P has been released (break in the table)"
The direct link to the doc I'm referencing here is: https://download.microsoft.com/download/1/6/1/161ba512-40e2-4cc9-843a-923143f3456c/scancode.doc
N.B. These codes are quite different to ASCII codes, which are where you are getting the 0x50 from.
Sending strings via the keyboard
If you want to send a whole string via the keyboard this way, what you will need is a mapping table from the ASCII code of each character to its key make and key break codes. You can then take the string character by character and work out which scan code pairs to send for each character. Note that you'll need to deal with case by adding a "Shift down" before the pair and a "Shift up" afterwards. This is doable with a fairly long mapping table; I'd use a plain HashMap (not a WeakHashMap, which is allowed to drop entries whose keys are garbage collected).
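The mapping-table idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name Set1Scancodes and the method encode are made up for this example, the table covers only letters, digits, space and Enter, and it assumes scan code set 1 with a US layout in the guest.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns text into scan code set 1 make/break codes (US layout assumed). */
final class Set1Scancodes {
    private static final int LEFT_SHIFT = 0x2A; // make code of the left Shift key
    private static final int BREAK_BIT = 0x80;  // in set 1, break code = make code | 0x80
    private static final Map<Character, Integer> MAKE = new HashMap<>();

    static {
        // Make codes for 'a'..'z', listed in alphabetical order.
        int[] letters = {0x1E, 0x30, 0x2E, 0x20, 0x12, 0x21, 0x22, 0x23, 0x17,
                         0x24, 0x25, 0x26, 0x32, 0x31, 0x18, 0x19, 0x10, 0x13,
                         0x1F, 0x14, 0x16, 0x2F, 0x11, 0x2D, 0x15, 0x2C};
        for (int i = 0; i < letters.length; i++) {
            MAKE.put((char) ('a' + i), letters[i]);
        }
        // Top-row digits: '1'..'9' are 0x02..0x0A, '0' is 0x0B.
        for (int d = 1; d <= 9; d++) {
            MAKE.put((char) ('0' + d), 0x01 + d);
        }
        MAKE.put('0', 0x0B);
        MAKE.put(' ', 0x39);
        MAKE.put('\n', 0x1C); // Enter
    }

    /** Returns the make/break sequence for the text, wrapping upper-case letters in Shift. */
    static List<Integer> encode(String text) {
        List<Integer> codes = new ArrayList<>();
        for (char ch : text.toCharArray()) {
            boolean shifted = Character.isUpperCase(ch);
            Integer make = MAKE.get(Character.toLowerCase(ch));
            if (make == null) {
                throw new IllegalArgumentException("No mapping for character: " + ch);
            }
            if (shifted) codes.add(LEFT_SHIFT);             // Shift down
            codes.add(make);                                // key down
            codes.add(make | BREAK_BIT);                    // key up
            if (shifted) codes.add(LEFT_SHIFT | BREAK_BIT); // Shift up
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(encode("Pb"));
    }
}
```

The resulting list has the same shape as the argument passed to k.putScancodes(...) in the question, so it could presumably be handed over directly; punctuation and other layouts would need more table entries.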

Related

Why are the ModuleEntry32 text values from WinAPI displaying as Chinese characters?

I have been using the JNA (Java Native Access) library to access the memory of processes. I have been writing some code to enumerate through all modules of a process, and the struct MODULEENTRY32 is obtained properly - I am getting their handles and base addresses properly. However, the "String" values szModule and szExePath (which are char arrays) that are returned give me random Chinese characters.
JNA provides helper classes for structs such as MODULEENTRY32 (they call it MODULEENTRY32W) for using functions such as Module32First and Module32Next, which I've been using. They have sort of their own toString method for szModule and szExePath, and those return the random Chinese chars as well. I have tried to encode/decode it myself, and would get close to the "right" values (encoding to UTF-16, then decoding to ISO) but it still is a bit off - as in I can't use equals/equalsIgnoreCase to compare it with another String.
Below is roughly an example of what I am getting when printing out szModule and szExePath in the format szModule:szExePath returned from the Module32First/Module32Next calls:
瑮汤⹬汤l: 瑮汤⹬汤l
䕋乒䱅㈳䐮䱌: 䕋乒䱅㈳䐮䱌
䕋乒䱅䅂䕓搮汬: 䕋乒䱅䅂䕓搮汬
档潲敭敟晬搮汬: 档潲敭敟晬搮汬
䕖卒佉⹎汤l: 䕖卒佉⹎汤l
獭捶瑲搮汬: 獭捶瑲搮汬
And here is roughly how I am enumerating:
// hSnapshot is valid, and I already called "Module32First" - this loops through any other modules
while(this.moduleBaseAddr == null && this.moduleHandle == null) {
Tlhelp32.MODULEENTRY32W currentModuleEntry32 = new Tlhelp32.MODULEENTRY32W();
if(this.kernel32.Module32Next(hSnapshot, currentModuleEntry32)) {
currentModuleEntry32.read();
String currentModuleName = currentModuleEntry32.szModule();
System.out.println(currentModuleName + ": " + currentModuleEntry32.szModule());
if(currentModuleName.equals(MODULE_NAME)) {
this.moduleBaseAddr = currentModuleEntry32.modBaseAddr;
this.moduleHandle = currentModuleEntry32.hModule.getPointer();
break;
}
}else{
break;
}
}
Does anyone have any insight on solving this issue?
You are mixing ANSI function mappings and Unicode structure mappings.
Most Windows functions have two versions of the function, one ending in A and one in W, with comments in the documentation. For example, CreateProcess has two versions, CreateProcessA and CreateProcessW, where the documentation states:
The processthreadsapi.h header defines CreateProcess as an alias which automatically selects the ANSI or Unicode version of this function based on the definition of the UNICODE preprocessor constant. Mixing usage of the encoding-neutral alias with code that not encoding-neutral can lead to mismatches that result in compilation or runtime errors. For more information, see Conventions for Function Prototypes.
That link states:
New Windows applications should use Unicode to avoid the inconsistencies of varied code pages and for ease of localization.
Unfortunately, Module32First and Module32Next do not follow the usual SDK convention. There is no -A version of these functions; the unsuffixed names are the ANSI versions, so the mapping you have created is ANSI (really ASCII). The byte string returned for szModule in the first line of the output in your question is 6e74646c6c2e646c6c3a206e74646c6c2e646c6c which in ASCII or UTF-8 decodes to ntdll.dll: ntdll.dll. Because you are using the MODULEENTRY32W (Unicode) structure mapping, these bytes are interpreted as UTF-16, resulting in the characters you are seeing in your output.
The Unicode mappings are Module32FirstW and Module32NextW, and are the functions you should be using. These are mapped in JNA's Kernel32 class. I highly recommend you use the JNA mappings rather than reinventing the wheel.
Incidentally, JNA's Kernel32Util class already handles all of this and offers a List<Tlhelp32.MODULEENTRY32W> getModules(int processID) method using the correct mappings, that you may find useful.
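The effect described above can be reproduced without any Windows API at all. The following sketch (the class and method names are invented for this illustration) decodes the single-byte text "ntdll.dll" plus its terminating NUL as little-endian UTF-16, which is what happens when ANSI output lands in a wide-character structure, and then reverses the mistake.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of single-byte text being misread as UTF-16LE. */
final class WideNarrowMismatch {
    /** Decodes single-byte text as if it were UTF-16LE, as a W structure field would. */
    static String misread(byte[] ansiBytes) {
        return new String(ansiBytes, StandardCharsets.UTF_16LE);
    }

    /** Reverses misread(): recovers the original text up to the first NUL. */
    static String recover(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.UTF_16LE);
        String s = new String(raw, StandardCharsets.US_ASCII);
        int nul = s.indexOf('\0');
        return nul >= 0 ? s.substring(0, nul) : s;
    }

    public static void main(String[] args) {
        // Ten bytes: nine characters plus the C string terminator.
        byte[] raw = "ntdll.dll\0".getBytes(StandardCharsets.US_ASCII);
        String garbled = misread(raw);
        System.out.println(garbled);          // pairs of ASCII bytes fused into CJK characters
        System.out.println(recover(garbled)); // the original module name
    }
}
```

The recovery step is only a diagnostic aid; calling the W functions with the W structure avoids the problem in the first place.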

Raw memory access Java/Python

Memory-mapped hardware
On some computing architectures, pointers can be used to directly
manipulate memory or memory-mapped devices.
Assigning addresses to pointers is an invaluable tool when programming
microcontrollers. Below is a simple example declaring a pointer of
type int and initialising it to a hexadecimal address in this example
the constant 0x7FFF:
int *hardware_address = (int *)0x7FFF;
In the mid 80s, using the BIOS to access the video capabilities of PCs
was slow. Applications that were display-intensive typically used to
access CGA video memory directly by casting the hexadecimal constant
0xB8000 to a pointer to an array of 80 unsigned 16-bit int values.
Each value consisted of an ASCII code in the low byte, and a colour in
the high byte. Thus, to put the letter 'A' at row 5, column 2 in
bright white on blue, one would write code like the following:
#define VID ((unsigned short (*)[80])0xB8000)
void foo() {
VID[4][1] = 0x1F00 | 'A';
}
Is such a thing possible in Java/Python in the absence of pointers?
EDIT:
Is such an access possible:
char* m_ptr=(char*)0x603920;
printf("\nm_ptr: %c",*m_ptr);
?
I'm totally uncertain of the context and thus useful application of what you're trying to do, but here goes:
The Java Native Interface should allow direct memory access within the process space. Similarly, Python can load a C module that provides an access method.
Unless you've got a driver loaded by the system to do the interfacing, however, any hardware device memory will be out-of-bounds. Even then, the driver / kernel module must be the one to address non-application space memory.
If you are on an operating system with /dev/mem, you can create a MappedByteBuffer onto it and do this sort of thing.
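A minimal sketch of the MappedByteBuffer approach is shown below, demonstrated on an ordinary temporary file. The class MappedPeek and its methods are names made up for this example. Whether the same call succeeds on /dev/mem is an assumption that depends on the operating system, the JVM and the permissions of the process (device files can report a size of zero, which may make the mapping call fail), so treat it as something to try rather than a guarantee.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: reads one byte at a given offset through a memory mapping. */
final class MappedPeek {
    /** Maps a single byte of the file at the offset and returns it as 0..255. */
    static int peek(Path file, long offset) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, offset, 1);
            return buf.get(0) & 0xFF;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the bytes to a temporary file so the demo has something to map. */
    static Path writeTemp(byte[] data) {
        try {
            Path tmp = Files.createTempFile("peek", ".bin");
            tmp.toFile().deleteOnExit();
            return Files.write(tmp, data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path tmp = writeTemp(new byte[] {0x10, 0x41, 0x7F});
        System.out.println((char) peek(tmp, 1)); // the byte stored at offset 1
    }
}
```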

Capturing Joystick Input in C or Java

I need to capture joystick input using C or Java (whichever is easier).
There are answers to similar questions but they all use C++ or C#.
The program only needs to get the direction and amount by which the joystick is tilted.
I'm using Windows7, so I'll probably need to use winmm.dll as is explained here.
I would appreciate if someone could explain how to do so in C or Java.
There are premade libraries for both languages. The more important question would be the language you have to use or which one you'd prefer primarily. It doesn't necessarily make sense to add C code just to add such functionality to a Java program. In a similar way, you wouldn't want to call Java from C, just to get joystick input.
First hit on Google for "java joystick" has been this one. Haven't tried it yet.
As for C++ code (and most likely C# too) you should be able to use the same code in C, assuming it's pure Windows API code (cause that one is in C too). So you shouldn't have any issues adapting these.
Edit:
Regarding the answer you linked: You should be able to use this solution 1:1 in C (in Java you'd have to write code essentially doing the same). But instead of declaring everything yourself, just #include <windows.h> and you should be ready to go (in C).
I recommend the Simple Directmedia Layer library for a pure C solution. The library is pleasant to use, and their documentation and code examples are great:
SDL_Joystick *joy;
// Initialize the joystick subsystem
SDL_InitSubSystem(SDL_INIT_JOYSTICK);
// Check for joystick
if(SDL_NumJoysticks()>0){
// Open joystick
joy=SDL_JoystickOpen(0);
...
The C# solution is indeed pure Windows API code!
In C just #include <windows.h> and instead of [DllImport("winmm.dll")]
you link to winmm.lib. The following example should make it clear:
void printJoystickData()
{
// The captured data will be written to the following struct:
//typedef struct joyinfo_tag {
// UINT wXpos;
// UINT wYpos;
// UINT wZpos;
// UINT wButtons;
//} JOYINFO,*PJOYINFO,*LPJOYINFO;
JOYINFO joystickInfo;
// The ID of the joystick you are using. If there is only one joystick use JOYSTICKID1.
UINT joystickId = JOYSTICKID1;
MMRESULT errorCode = joyGetPos(joystickId, &joystickInfo);
switch(errorCode)
{
case JOYERR_NOERROR: // No error. joystickInfo now contains the captured data.
{
printf("The X position (left/right tilt) is %u\n", joystickInfo.wXpos);
printf("The Y position (up/down tilt) is %u\n", joystickInfo.wYpos);
printf("The Z position (usually the throttle) is %u\n", joystickInfo.wZpos);
// These values range from 0 to 65535. 32768 is the centre position.
// Test button 1. You can do the same for JOY_BUTTON2, JOY_BUTTON3 etc.
// wButtons is a UINT that is the OR of all pressed button flags.
if(joystickInfo.wButtons & JOY_BUTTON1)
printf("Button 1 was pressed.");
break;
}
case JOYERR_PARMS: fprintf(stderr, "Invalid parameters to joyGetPos."); break;
case JOYERR_NOCANDO: fprintf(stderr, " Failed to capture joystick input."); break;
case JOYERR_UNPLUGGED: fprintf(stderr, "The joystick identified by joystickId isn't plugged in."); break;
case MMSYSERR_NODRIVER: fprintf(stderr, "No (active) joystick driver available."); break;
}
}
The Simple Directmedia Layer library (suggested by blahdiblah) also looks promising but for what I needed to do I think the code above is simpler.
For Java use the Central Nexus Device API as suggested by Mario. The download includes documentation.
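Whichever API delivers the raw axis readings, the question asks for the direction and amount of tilt. A rough sketch of that arithmetic follows, assuming the range of 0 to 65535 with 32768 as the centre that the comments in the C example describe (a real device may report different limits). The class JoystickTilt and its methods are hypothetical names for this illustration.

```java
/** Hypothetical helper: converts raw axis readings into tilt amount and direction. */
final class JoystickTilt {
    private static final int CENTRE = 32768; // assumed centre of a 0..65535 axis

    /** Maps a raw axis reading to the range -1.0..1.0, with 0.0 at the centre. */
    static double normalise(int raw) {
        double v = (raw - CENTRE) / 32768.0;
        return Math.max(-1.0, Math.min(1.0, v));
    }

    /** Tilt amount from 0.0 (centred) to 1.0 (fully deflected). */
    static double amount(int rawX, int rawY) {
        return Math.min(1.0, Math.hypot(normalise(rawX), normalise(rawY)));
    }

    /** Tilt direction in degrees, measured in the raw x/y coordinate system. */
    static double angleDegrees(int rawX, int rawY) {
        return Math.toDegrees(Math.atan2(normalise(rawY), normalise(rawX)));
    }

    public static void main(String[] args) {
        System.out.println(amount(0, 32768));           // stick pushed fully along -x
        System.out.println(angleDegrees(65535, 32768)); // stick pushed along +x
    }
}
```

A small dead zone around the centre is usually worth adding in practice, since most sticks do not rest exactly at the midpoint.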
The sample you have linked to doesn't actually use anything object-oriented, which means you can quite easily port it to C.
C supports structs the same as C# (which are allocated on the stack), so that's basically copy-paste.
The one thing that might trip you up is the [DllImport] attribute. The purpose of this attribute is to p/invoke (platform invoke) an unmanaged DLL from within managed C# code. Here you would use extern to access the Windows API.
Refer to this URL for the Gamepad.c and Gamepad.h files:
https://github.com/elanthis/gamepad
Open the joystick using
STATE.fd = open(STATE.device, O_RDWR|O_NONBLOCK);
Structure Definition:
STATE is a structure object.
//It is in Gamepad.h file
open returns -1 on failure.
Set the flag value (defined while declaring variables for joystick) if opened successfully.
Read the joystick input using
(read(STATE[gamepad].fd, &je, sizeof(je)) > 0)
Structure Definition:
je is a structure object
//It is in joystick.h
je is updated now.
je.type is one among the three things mentioned in the joystick.h header file
If a button is pressed, then je.number is an int that denotes the button number as specified by the manufacturer. If a stick is moved, then je.number denotes the axis as specified by the manufacturer.
The magnitude is present in the je.value which is assigned to the stick's corresponding variable accordingly.
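For readers who want the Java side of the same idea, the event record read from the joystick device can be decoded by hand. The sketch below assumes the struct js_event layout from linux/joystick.h (a 32-bit timestamp, a signed 16-bit value, an 8-bit type and an 8-bit number, 8 bytes in total) and a little-endian machine; the class JsEvent is a name invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical decoder for one 8-byte Linux joystick event record. */
final class JsEvent {
    static final int BUTTON = 0x01; // button pressed or released
    static final int AXIS = 0x02;   // stick moved
    static final int INIT = 0x80;   // flag set on the synthetic initial-state events

    final long time;   // event timestamp in milliseconds
    final short value; // button state or axis position
    final int type;    // BUTTON or AXIS, possibly OR-ed with INIT
    final int number;  // button or axis number

    private JsEvent(long time, short value, int type, int number) {
        this.time = time;
        this.value = value;
        this.type = type;
        this.number = number;
    }

    /** Decodes one record; assumes the bytes come from a little-endian host. */
    static JsEvent parse(byte[] raw) {
        if (raw.length != 8) {
            throw new IllegalArgumentException("A joystick event is 8 bytes long");
        }
        ByteBuffer b = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        long time = b.getInt() & 0xFFFFFFFFL; // unsigned 32-bit
        short value = b.getShort();
        int type = b.get() & 0xFF;
        int number = b.get() & 0xFF;
        return new JsEvent(time, value, type, number);
    }

    boolean isAxis() {
        return (type & ~INIT) == AXIS;
    }

    boolean isButton() {
        return (type & ~INIT) == BUTTON;
    }
}
```

The eight bytes would typically come from reading the joystick device file in blocks of that size, for example with a FileInputStream.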

Java Simulating the keyboard

How would I send text to the computer (like a keyboard) via a Java class?
I have considered using the Robot class to press and release each key, but that would be tedious and there is no way to get the KeyCode from a char.
No, there is also a soft way (well, on Windows it works at least ;-)):
private static void outputString(Robot robot,String str)
{
Toolkit toolkit = Toolkit.getDefaultToolkit();
boolean numlockOn = toolkit.getLockingKeyState(KeyEvent.VK_NUM_LOCK);
int[] keyz=
{
KeyEvent.VK_NUMPAD0,
KeyEvent.VK_NUMPAD1,
KeyEvent.VK_NUMPAD2,
KeyEvent.VK_NUMPAD3,
KeyEvent.VK_NUMPAD4,
KeyEvent.VK_NUMPAD5,
KeyEvent.VK_NUMPAD6,
KeyEvent.VK_NUMPAD7,
KeyEvent.VK_NUMPAD8,
KeyEvent.VK_NUMPAD9
};
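if(!numlockOn)
{
// Num Lock must be on for Alt+numpad codes; tap the key (press and release).
robot.keyPress(KeyEvent.VK_NUM_LOCK);
robot.keyRelease(KeyEvent.VK_NUM_LOCK);
}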
for(int i=0;i<str.length();i++)
{
int ch=(int)str.charAt(i);
String chStr=""+ch;
if(ch <= 999)
{
chStr="0"+chStr;
}
robot.keyPress(KeyEvent.VK_ALT);
for(int c=0;c<chStr.length();c++)
{
int iKey=(int)(chStr.charAt(c)-'0');
robot.keyPress(keyz[iKey]);
robot.keyRelease(keyz[iKey]);
}
robot.keyRelease(KeyEvent.VK_ALT);
}
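if(!numlockOn)
{
// Restore the original Num Lock state by tapping the key again.
robot.keyPress(KeyEvent.VK_NUM_LOCK);
robot.keyRelease(KeyEvent.VK_NUM_LOCK);
}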
}
Try using this:
http://javaprogrammingforums.com/java-se-api-tutorials/59-how-sendkeys-application-java-using-robot-class.html
Use a GUI testing framework (even if you do not use it for testing). I recommend FEST. In FEST you can search for GUI elements and automate all kinds of user interactions including entering text.
For example once you have a text field fixture (the FEST term for a wrapper that lets you control the component), you can do
JTextComponentFixture fixture = ...;
fixture.enterText("Some text");
@JavaCoder-1337 Not exactly...
Although some switch-case (hard way?) is still needed to handle some (special) characters, most of the characters can be handled fairly easily.
How much you need depends on your target audience, but whatever the case, you can handle it through a combination of:
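KeyEvent.getExtendedKeyCodeForChar(int yourChar); - Which
handles the most basic ones; a-zA-Z are translated to their base
(a-z) key events, and a few other chars are also handled similarly (base key only, no modifiers, thus no casing is applied). Note that AWTKeyStroke.getAWTKeyStroke(char yourChar) describes a KEY_TYPED event, so its getKeyCode() is VK_UNDEFINED and cannot be used for this.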
As you can imagine, this method is particularly effective for simplifying English handling, since the language makes little use of accented letters compared to many others.
Normalizer.normalize(String textToNormalize, Form.NFD); - Which decomposes most composed (accented) characters, like áàãâä, éèêë, íìîï, etc., and their uppercase equivalents, into their base elements. Example: á (225) becomes a (97) followed by the combining acute accent ´ (769).
If your send(String text) method is able to send accents, simply swap the accent (in the example it's VK_DEAD_ACUTE) and its letter so that they are sent in the proper order, and you will get the desired á output, eliminating the need for an á filter.
Combined with the first simplification, for this example, that makes 1/3 [´] instead of 3/3 [a,á,´] switch-case needed!
These are only a few of many simplifications that can be done to shorten that dreadfully long switch-case method that is (unwisely) suggested by many fellow programmers.
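The decomposition step is easy to check in isolation. The small sketch below (the class AccentSplit is a name made up for this example) prints the code points that NFD normalisation produces for á, which is U+00E1 (225) in composed form.

```java
import java.text.Normalizer;
import java.util.Arrays;

/** Hypothetical demo: split accented characters into base letter plus combining mark. */
final class AccentSplit {
    /** Returns the code points of the NFD (canonical decomposition) form of the text. */
    static int[] decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD).codePoints().toArray();
    }

    public static void main(String[] args) {
        // U+00E1 decomposes into 'a' (97) followed by U+0301 COMBINING ACUTE ACCENT (769).
        System.out.println(Arrays.toString(decompose("\u00E1")));
    }
}
```

Because the accent comes after the letter in decomposed text but a dead key has to be pressed before it, the two need to be swapped before sending.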
For example, you can easily handle casing by detecting if the character to be sent is uppercase, and then detecting the current capslock state to invert the casing operation, if needed:
boolean useShift = Character.isUpperCase(aChar);
useShift = Toolkit.getDefaultToolkit().getLockingKeyState(KeyEvent.VK_CAPS_LOCK) ? !useShift : useShift;
if (useShift) {
keyPress(KeyEvent.VK_SHIFT);
sendChar(aChar);
keyRelease(KeyEvent.VK_SHIFT);
} else {
sendChar(aChar);
}
Another option (the one that I use), which is even simpler, is to code a macro in a tool/language that is (far) more suited for this kind of operation (I use and recommend AutoHotKey), and simply call its execution from Java:
Runtime rt = Runtime.getRuntime();
//"Hello World!" is a command-line param, forwarded to the ahk script as its text-to-send.
rt.exec(".../MyJavaBot/sendString.ahk \"Hello World!\"");

ASCII non readable characters 28, 29 31

I am processing a file which I need to split based on the separator.
The following code shows the separators defined for the files I am processing
private static final String component = Character.toString((char) 31);
private static final String data = Character.toString((char) 29);
private static final String segment = Character.toString((char) 28);
Can someone please explain the significance of these specific separators?
Looking at the ASCII codes, these separators are file, group and unit separators. I don't really understand what this means.
Found this here. Cool website!
28 – FS – File separator The file
separator FS is an interesting control
code, as it gives us insight in the
way that computer technology was
organized in the sixties. We are now
used to random access media like RAM
and magnetic disks, but when the ASCII
standard was defined, most data was
serial. I am not only talking about
serial communications, but also about
serial storage like punch cards, paper
tape and magnetic tapes. In such a
situation it is clearly efficient to
have a single control code to signal
the separation of two files. The FS
was defined for this purpose.
29 – GS – Group separator
Data storage was one
of the main reasons for some control
codes to get in the ASCII definition.
Databases are most of the time setup
with tables, containing records. All
records in one table have the same
type, but records of different tables
can be different. The group separator
GS is defined to separate tables in a
serial data storage system. Note that
the word table wasn't used at that
moment and the ASCII people called it
a group.
30 – RS – Record separator
Within a group (or table) the records
are separated with RS or record
separator.
31 – US – Unit separator
The smallest data items to be stored
in a database are called units in the
ASCII definition. We would call them
field now. The unit separator
separates these fields in a serial
data storage environment. Most current
database implementations require that
fields of most types have a fixed
length. Enough space in the record is
allocated to store the largest
possible member of each field, even if
this is not necessary in most cases.
This costs a large amount of space in
many situations. The US control code
allows all fields to have a variable
length. If data storage space is
limited—as in the sixties—this is a
good way to preserve valuable space.
On the other hand is serial storage
far less efficient than the table
driven RAM and disk implementations of
modern times. I can't imagine a
situation where modern SQL databases
are run with the data stored on paper
tape or magnetic reels...
The ASCII separator control characters range from 28 to 31 (0x1C to 0x1F):
31 Unit Separator
30 Record Separator
29 Group Separator
28 File Separator
Sample invocation:
char record_separator = 0x1E;
String s = "hello" + record_separator + "world";
These characters are control characters. They're not meant to be written or read by humans, but by computers. You should treat them in your program like any other character.
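Splitting on these characters works like splitting on any other delimiter. The sketch below assumes the nesting suggested by the constant names in the question (segments separated by FS, data elements by GS, components by US); the class SeparatorSplit and that nesting are assumptions for illustration, not a description of any particular file format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical parser: segments (FS) contain data elements (GS) made of components (US). */
final class SeparatorSplit {
    static final char FS = 0x1C; // file separator, 28
    static final char GS = 0x1D; // group separator, 29
    static final char RS = 0x1E; // record separator, 30 (unused here)
    static final char US = 0x1F; // unit separator, 31

    /** Splits on one literal separator character, keeping empty trailing fields. */
    private static String[] split(String text, char separator) {
        return text.split(Pattern.quote(String.valueOf(separator)), -1);
    }

    /** Returns segment -> data element -> component lists. */
    static List<List<List<String>>> parse(String input) {
        List<List<List<String>>> segments = new ArrayList<>();
        for (String segment : split(input, FS)) {
            List<List<String>> elements = new ArrayList<>();
            for (String element : split(segment, GS)) {
                elements.add(Arrays.asList(split(element, US)));
            }
            segments.add(elements);
        }
        return segments;
    }

    public static void main(String[] args) {
        String input = "a" + US + "b" + GS + "c" + FS + "d";
        System.out.println(parse(input));
    }
}
```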
