GestureKit

Disclaimer

It is very important to me that I say this upfront. GestureKit works far better with the Teensy 4.0 hardware integration option (a Teensy 4.0 was $20-30 on Amazon when I last checked). Without it, the input polling rate bottlenecks very quickly, because all output shares the operating system's single input queue instead of being funneled through its own device.

GestureKit is intended only for accessibility and for novel, non-exploitative ergonomic uses. I started this project after a couple of years of friendship with three people who experience a wide range of forearm, wrist, and general hand/finger disabilities. They use Svalboards (svalboard.com) for everyday office work, and one of them uses an Azeron Cyborg (azeron.com) for gaming to assist with their limited mobility. GestureKit became my attempt at furthering their accessibility options by diversifying the gestures a single key can detect, and with them the outputs that key can trigger.

GestureKit is not a product and I am not selling it, so I say this not out of liability concern but to urge users to consider their intentions with this project. It is a personal project that I use for my own convenience and for the benefit of my friends.

The Problem

PC games and productivity applications often demand complex, precisely-timed multi-key sequences executed repeatedly over hours. Performing these manually is fatiguing, inconsistent, and error-prone during long sessions. Existing macro tools treat the keyboard as a single device. They cannot distinguish simultaneous gestures on different keys, lack hold-duration awareness, and produce output with fixed timing that feels robotic and is trivially detectable by pattern analysis.

What GestureKit Does

GestureKit hooks into the OS-level keyboard event stream globally (even when another window is focused) and routes each key event to that key's dedicated, independent state machine. Each state machine classifies the press pattern into a gesture type, then fires the bound macro sequence through an executor that applies multi-layer human-like timing randomization. Every key operates in complete isolation. You can perform simultaneous gestures on different keys without interference, and multiple macro sequences execute concurrently.
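As a rough illustration of the per-key routing described above, the sketch below keeps one state machine per key in a Map and dispatches each event to the machine that owns it. The names RawKeyEvent, KeyMachine, and Router are hypothetical and not taken from the GestureKit source, and the machine here only records hold durations rather than classifying gestures.

```typescript
// Hypothetical event shape; the real listener's event type may differ.
type RawKeyEvent = { key: string; kind: "down" | "up"; timeMs: number };

// One independent state machine per key: it only ever sees its own events.
class KeyMachine {
  private downAt: number | null = null;
  readonly holds: number[] = [];

  handle(ev: RawKeyEvent): void {
    if (ev.kind === "down") {
      // Ignore OS auto-repeat while the key is already down.
      if (this.downAt === null) this.downAt = ev.timeMs;
    } else if (this.downAt !== null) {
      this.holds.push(ev.timeMs - this.downAt);
      this.downAt = null;
    }
  }
}

// Routes each event to the machine for that key, creating it on first use.
class Router {
  private machines = new Map<string, KeyMachine>();

  machineFor(key: string): KeyMachine {
    let machine = this.machines.get(key);
    if (!machine) {
      machine = new KeyMachine();
      this.machines.set(key, machine);
    }
    return machine;
  }

  dispatch(ev: RawKeyEvent): void {
    this.machineFor(ev.key).handle(ev);
  }
}
```

Because no state is shared between machines, interleaved events on two keys cannot affect each other's timing.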

Architecture Overview

The system follows a clean pipeline architecture:

Input Capture → Gesture Detection → Binding Lookup → Cooldown Gating → Traffic Control → Sequence Execution → Key Output

The input listener captures global keyboard and mouse events via node-global-key-listener, normalizes key names, tracks modifier state, and dispatches to the active gesture detector. The detector resolves the gesture type, the binding lookup finds the matching macro, and the executor sends the output key sequence through the selected backend with randomized inter-key timing, echo hits for ability confirmation, and modifier-aware traffic control to prevent key collisions.
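The middle of that pipeline (binding lookup followed by cooldown gating) might look something like the sketch below. All type and function names are hypothetical, the "key:gesture" lookup key format is an assumption, and the real executor receives far more than a list of key names.

```typescript
// Hypothetical shapes for illustration; not GestureKit's actual types.
type DetectedGesture = { key: string; type: string };
type Binding = { id: string; cooldownMs: number; sequence: string[] };

// Cooldown gating: a binding may fire again only after its cooldown elapses.
class CooldownGate {
  private lastFired = new Map<string, number>();

  allow(bindingId: string, cooldownMs: number, nowMs: number): boolean {
    const last = this.lastFired.get(bindingId);
    if (last !== undefined && nowMs - last < cooldownMs) return false;
    this.lastFired.set(bindingId, nowMs);
    return true;
  }
}

// Gesture, then binding lookup, then cooldown gate, then the sequence
// that would be handed on to traffic control and the executor.
function resolveGesture(
  gesture: DetectedGesture,
  bindings: Map<string, Binding>,
  gate: CooldownGate,
  nowMs: number,
): string[] | null {
  const binding = bindings.get(`${gesture.key}:${gesture.type}`);
  if (!binding) return null; // unbound gesture: nothing to do
  if (!gate.allow(binding.id, binding.cooldownMs, nowMs)) return null;
  return binding.sequence;
}
```

Passing the current time in as a parameter, rather than reading a clock inside the gate, keeps the stage easy to test.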

Gesture Detection Systems

GestureKit implements two complete detection systems, selectable at startup.

Alpha System — 12 Gestures

The original system using elongating multi-press windows. Supports single, double, triple, and quadruple taps, each with normal, long, and super long hold variants.

4 tap counts × 3 hold durations = 12 gesture types per key

Elongating detection window with configurable timing

Long and super long variants fall back to each other when one is unbound

Await jail mechanism prevents accidental triggers after rapid multi-taps
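The 4 × 3 gesture grid and the long/super long fallback can be sketched as follows. This is only an illustration under assumptions: the threshold values (400 ms and 900 ms), the gesture naming scheme, and the function names are all hypothetical, and the elongating window and await jail logic are left out.

```typescript
type Hold = "normal" | "long" | "super_long";
const TAP_NAMES: string[] = ["single", "double", "triple", "quadruple"];

// Assumed thresholds; the source only says the timing is configurable.
function classifyHold(heldMs: number, longMs = 400, superLongMs = 900): Hold {
  if (heldMs >= superLongMs) return "super_long";
  if (heldMs >= longMs) return "long";
  return "normal";
}

// tapCount is clamped to 1..4; names like "double_long" are hypothetical.
function gestureName(tapCount: number, hold: Hold): string {
  const index = Math.min(Math.max(Math.floor(tapCount), 1), 4) - 1;
  const tap = TAP_NAMES[index] ?? "single";
  return hold === "normal" ? tap : `${tap}_${hold}`;
}

// Long and super long fall back to each other when one of them is unbound.
function resolveBound(tapCount: number, hold: Hold, bound: Set<string>): string | null {
  const exact = gestureName(tapCount, hold);
  if (bound.has(exact)) return exact;
  if (hold === "normal") return null;
  const other = gestureName(tapCount, hold === "long" ? "super_long" : "long");
  return bound.has(other) ? other : null;
}
```

For example, a double tap held past the super long threshold resolves to the double long binding when only that one is bound.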

Omega System — 13 Gestures (Primary)

A streamlined system designed for low-latency responsiveness. Long gestures fire immediately when the hold threshold is crossed with no waiting for key release.

quick / long — base gestures

quick_toggle / long_toggle — while W or Y is held past threshold

quick_f2 / long_f2 — F2 toggle layer

quick_q_toggle / long_q_toggle — Q modifier layer

quick_s_toggle / long_s_toggle — S group targeting layer

combo_7_4 — chord gesture where key 4 fires during key 7 hold

Instant-quick optimization: if no long binding exists for a key, quick fires immediately on release with zero delay
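A minimal sketch of the base quick / long detection for a single key is shown below, assuming a polled tick() in place of whatever timer the real detector uses. The class and method names are hypothetical, the toggle, F2, Q, S, and combo layers are omitted, and the handling of the instant-quick case (release always yields quick when no long binding exists) is one interpretation of the description above.

```typescript
type OmegaGesture = "quick" | "long";

class OmegaKey {
  private thresholdMs: number;
  private hasLongBinding: boolean;
  private downAt: number | null = null;
  private longFired = false;

  constructor(thresholdMs: number, hasLongBinding: boolean) {
    this.thresholdMs = thresholdMs;
    this.hasLongBinding = hasLongBinding;
  }

  down(nowMs: number): void {
    if (this.downAt !== null) return; // ignore auto-repeat
    this.downAt = nowMs;
    this.longFired = false;
  }

  // Called while the key is held: long fires as soon as the threshold is crossed.
  tick(nowMs: number): OmegaGesture | null {
    if (this.downAt === null || this.longFired || !this.hasLongBinding) return null;
    if (nowMs - this.downAt < this.thresholdMs) return null;
    this.longFired = true;
    return "long";
  }

  up(nowMs: number): OmegaGesture | null {
    if (this.downAt === null) return null;
    const heldMs = nowMs - this.downAt;
    const alreadyLong = this.longFired;
    this.downAt = null;
    this.longFired = false;
    if (alreadyLong) return null; // long already fired during the hold
    if (!this.hasLongBinding) return "quick"; // instant-quick: no threshold to consult
    return heldMs < this.thresholdMs ? "quick" : "long";
  }
}
```

The release handler returns nothing after a long has fired, so one physical press never produces two gestures.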

Per-Key Calibrated Thresholds

Each of the 33 monitored input keys has an independently calibrated hold-duration threshold that marks the boundary between a quick and a long press. These thresholds come from a calibration wizard that collects timing samples from the user's actual keypresses and computes statistical boundaries with confidence scores. The system therefore adapts to individual typing characteristics: a key you naturally hold slightly longer gets a higher threshold.
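One plausible way to turn timing samples into a threshold and a confidence score is sketched below. The source does not say which statistics the wizard actually uses, so the formula here (a boundary placed between the two cluster means, weighted by their standard deviations, and a confidence derived from their separation) is purely an assumption, as are all names.

```typescript
type Calibration = { thresholdMs: number; confidence: number };

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function sampleStdDev(xs: number[]): number {
  if (xs.length < 2) return 0;
  const m = mean(xs);
  const variance = xs.reduce((a, x) => a + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance);
}

// quickMs and longMs are hold durations from deliberate quick and long presses.
// Both arrays must be non-empty.
function calibrate(quickMs: number[], longMs: number[]): Calibration {
  const mq = mean(quickMs);
  const ml = mean(longMs);
  const sq = sampleStdDev(quickMs);
  const sl = sampleStdDev(longMs);
  const spread = sq + sl;
  // Place the boundary between the two means, closer to the tighter cluster.
  const thresholdMs = spread === 0 ? (mq + ml) / 2 : mq + (ml - mq) * (sq / spread);
  // Separation in units of combined spread, squashed into the range 0..1.
  const separation = spread === 0 ? (ml > mq ? Infinity : 0) : (ml - mq) / spread;
  const confidence = Math.max(0, Math.min(1, separation / 4));
  return { thresholdMs, confidence };
}
```

With well-separated samples the confidence saturates at 1, while overlapping quick and long samples push it toward 0, which is the kind of signal a calibration heatmap could display.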

Execution Pipeline

Macro sequences are more than simple key replay. Each step in a sequence supports:

Buffer tiers — low (129–163ms), medium (229–263ms), high (513–667ms) inter-key delays

Echo hits — rapid re-presses during the buffer phase (1–4 hits) to ensure ability registration despite game lag, with tier-aware timing ranges

Dual key presses — two keys pressed near-simultaneously with configurable offset

Hold-through-next — a modifier key held down through the next step's execution and released mid-buffer

Scroll steps — mouse wheel output with configurable magnitude

Timer steps — TTS countdown announcements via the say package

Traffic control — modifier-aware conflict resolution that only delays output when the conflicting modifier (Shift/Alt) is actually physically held
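To make the step options above more concrete, here is a sketch of how a step and its buffer delay could be modeled. The tier ranges are the ones listed above; the Step field names, the even spacing of echo hits, and the use of a plain uniform random source are simplifying assumptions (the source describes tier-aware echo timing and a layered randomizer instead).

```typescript
// Inter-key delay ranges in ms, taken from the tier list above.
const BUFFER_TIERS = {
  low: [129, 163],
  medium: [229, 263],
  high: [513, 667],
} as const;
type Tier = keyof typeof BUFFER_TIERS;

// Hypothetical step shapes; the field names are assumptions.
type Step =
  | { kind: "key"; key: string; tier: Tier; echoHits?: number; holdThroughNext?: boolean }
  | { kind: "dual"; keys: [string, string]; offsetMs: number; tier: Tier }
  | { kind: "scroll"; magnitude: number }
  | { kind: "timer"; seconds: number; message: string };

// Only key and dual steps carry a buffer tier.
function stepTier(step: Step): Tier | null {
  return step.kind === "key" || step.kind === "dual" ? step.tier : null;
}

// Pick an integer delay inside the tier's range; rand is injectable for tests.
function bufferDelay(tier: Tier, rand: () => number = Math.random): number {
  const [min, max] = BUFFER_TIERS[tier];
  return min + Math.floor(rand() * (max - min + 1));
}

// Spread 1 to 4 echo hits evenly inside the buffer window (offsets from step start).
function echoOffsets(delayMs: number, hits: number): number[] {
  const n = Math.min(Math.max(Math.floor(hits), 1), 4);
  return Array.from({ length: n }, (_, i) => Math.round((delayMs * (i + 1)) / (n + 1)));
}
```

A discriminated union keeps each step kind's options explicit, so an executor can switch on kind and get type checking for the fields it reads.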

Human-Like Timing Randomization

All timing values pass through a multi-layer randomization system that produces distributions matching natural human motor patterns:

Hash-based entropy — MurmurHash3 mixing for deterministic but random-looking values

Gaussian sweet-spot bias — configurable probability boosts toward values humans naturally produce (e.g. 29ms and 33ms key-down durations)

History-based correction — tracks recent output and adjusts weights to maintain realistic distributions over time

Per-value noise overlay — final entropy layer that prevents detectable periodicity

This replaces a naive Math.random() approach and produces output that passes statistical tests (Kolmogorov-Smirnov, chi-squared, autocorrelation, runs test), all implemented in the frontend for live verification.
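The first two layers can be sketched roughly as below. fmix32 is the standard MurmurHash3 32-bit finalizer; everything around it (the seed and counter mixing, the bias probability, the sigma, the function names) is an assumption for illustration, and the history-based correction and noise overlay layers are not shown.

```typescript
// MurmurHash3 32-bit finalizer (fmix32): scrambles the bits of a 32-bit integer.
function fmix32(h: number): number {
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Deterministic value in [0, 1) derived from a seed and a counter (assumed mixing).
function uniform(seed: number, counter: number): number {
  return fmix32((seed ^ Math.imul(counter + 1, 0x9e3779b9)) >>> 0) / 4294967296;
}

// With probability biasProb, draw near a sweet spot (Gaussian via Box-Muller);
// otherwise draw uniformly from [min, max]. All values are integer milliseconds.
function biasedDuration(
  seed: number,
  counter: number,
  min: number,
  max: number,
  sweetSpots: number[],
  biasProb: number,
  sigma: number,
): number {
  const base = counter * 4;
  if (sweetSpots.length > 0 && uniform(seed, base) < biasProb) {
    const index = Math.floor(uniform(seed, base + 1) * sweetSpots.length);
    const spot = sweetSpots[index] ?? min;
    const u1 = Math.max(uniform(seed, base + 2), 1e-12); // avoid log(0)
    const u2 = uniform(seed, base + 3);
    const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    return Math.min(max, Math.max(min, Math.round(spot + z * sigma)));
  }
  return min + Math.floor(uniform(seed, base + 1) * (max - min + 1));
}
```

Because every value is a pure function of the seed and counter, a given sequence can be replayed exactly, which is convenient when feeding samples to statistical tests.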

Three Output Backends

RobotJS

Uses the Windows SendInput API. Easiest setup. Shares the input queue with mouse movement, causing potential stutter under heavy output. Mitigated by RepeatPolice (deduplication), Output Pacing (cyclic delays), and Queue Pressure monitoring.

Interception

Kernel-level driver. Input is indistinguishable from physical keyboard events. Lowest latency, maximum application compatibility. Requires driver installation.

Teensy 4.0

USB HID via serial. A physical Teensy microcontroller acts as a real USB keyboard. Output goes through a separate USB device with zero host CPU contention, no mouse stutter, and no queue pressure. Serial protocol: KEY:keyname:duration. Includes auto-reconnect on USB disconnect.

The executor automatically routes key output to the active backend. In Teensy mode, all software-mode workarounds (RepeatPolice, Output Pacing, Queue Pressure monitoring) are disabled since they are unnecessary.

Multi-Character Profile System

GestureKit supports 7 character-specific profiles (built for SWTOR classes). Each profile defines its own ability bindings and inherits the shared targeting and utility bindings. A profile configures four things: D key behavior (continuous R-stream toggle for Tank, burst cycles with configurable timing for Rage/Mercs, or single-press for Sorcs/Sniper); the S key ability (Guard with dual-key output, Cleanse, Shield Probe, etc.); per-key bindings for keys 1–6, A, B, H, I, U, and the special keys; and DPS targeting with dynamic group member slot assignment, including config mode, a DPS designation phase, and Q+5/Q+6 intercept for target-of-target sequences.

Profiles are defined in code as TypeScript arrays of binding objects, compiled into O(1) lookup maps at startup.
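Compiling an array of binding objects into a constant-time lookup table could look like the sketch below. The BindingDef shape, the "key:gesture" map key, and the rule that profile-specific bindings override shared ones are assumptions; the source only states that profiles inherit the shared bindings.

```typescript
// Hypothetical binding shape; the real binding objects carry more fields.
type BindingDef = { key: string; gesture: string; sequence: string[] };

// Build the lookup table once at startup.
function compileProfile(shared: BindingDef[], profile: BindingDef[]): Map<string, BindingDef> {
  const table = new Map<string, BindingDef>();
  // Shared bindings go in first, so profile-specific entries replace them.
  for (const binding of [...shared, ...profile]) {
    table.set(`${binding.key}:${binding.gesture}`, binding);
  }
  return table;
}

// O(1) average-case lookup at gesture time.
function lookup(table: Map<string, BindingDef>, key: string, gesture: string): BindingDef | undefined {
  return table.get(`${key}:${gesture}`);
}
```

Doing the array walk once at startup means the hot path during gameplay is a single Map read per gesture.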

Special Key Behaviors

Several keys have behaviors that bypass the standard gesture → binding → executor pipeline:

D key — toggles an R-stream (periodic R keypresses at configurable intervals) with TTS announcements. Supports continuous, burst-slow, burst-fast, and single-press modes per profile

S key — dual-purpose: quick tap fires a profile-specific ability, long hold activates group member targeting mode where number keys 1–4 target specific party members

C key — hybrid quick/long detection with double-tap producing Escape

Equals key — gap-based tap counting (ignores hold duration) with long hold triggering R-stream

F2 key — independent toggle modifier creating a separate gesture layer with long hold also triggering R-stream

ENTER key — pauses and resumes the entire gesture system for chat typing

Electron Desktop Dashboard

An 8-page React and Tailwind CSS dashboard running in Electron with full context isolation:

Dashboard — engine start/stop, active profile display, calibration confidence heatmap, live gesture activity feed, queue pressure indicator

Input Monitor — real-time key event stream visualization

Gesture Gallery — browse and configure gesture type definitions

Profiles — full CRUD with drag-and-drop key pool builder for assigning input/output keys and managing bindings

Calibration — per-key calibration wizard with sample collection, statistical analysis, and confidence scoring

Traffic Controller — conflict key map and queue status monitoring

Timing Engine — buffer tier sample generation with live statistical verification (KS test, chi-squared, autocorrelation, runs test)

Execution Pipeline — active execution monitoring

The bridge layer dynamically imports core engine modules and falls back to mock data when the engine is not available, so the dashboard works standalone for UI development.

Engineering Highlights

Per-Key Isolation with Concurrent Execution

The core design challenge. Each of the 33 input keys has a completely independent state machine. If keys A and B are both mid-gesture simultaneously, they run in parallel without shared state. The executor enforces per-binding overlap prevention (same macro will not stack) while allowing different macros to run concurrently.
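The overlap rule (the same macro never stacks, different macros run side by side) can be approximated with a set of running binding ids. This is a sketch with hypothetical names, not the executor's actual bookkeeping.

```typescript
// Tracks which bindings currently have a sequence in flight.
class OverlapGuard {
  private running = new Set<string>();

  // Returns false if this binding is already executing.
  tryStart(bindingId: string): boolean {
    if (this.running.has(bindingId)) return false;
    this.running.add(bindingId);
    return true;
  }

  finish(bindingId: string): void {
    this.running.delete(bindingId);
  }

  get activeCount(): number {
    return this.running.size;
  }
}

// Runs a sequence unless the same binding is still executing.
// Different binding ids never block each other.
async function runExclusive(
  guard: OverlapGuard,
  bindingId: string,
  run: () => Promise<void>,
): Promise<boolean> {
  if (!guard.tryStart(bindingId)) return false;
  try {
    await run();
    return true;
  } finally {
    guard.finish(bindingId); // always release, even if the sequence throws
  }
}
```

Releasing the id in a finally block matters: a sequence that throws would otherwise lock its binding out for the rest of the session.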

Modifier-Aware Traffic Control

When a macro outputs SHIFT+K, there is a risk that the physical keyboard's Shift key (held for movement) contaminates unmodified keypresses in other concurrent sequences. The compiled profile identifies which keys appear with both modified and unmodified variants, and the traffic controller only delays output when the conflicting modifier is actually physically held, checked via a live callback to the input listener's modifier state.
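The two halves of that idea, finding conflict keys at compile time and checking the physical modifier state at output time, might be sketched like this. The OutputKey shape and function names are hypothetical, and isHeld stands in for the live callback into the input listener.

```typescript
// Hypothetical output shape: a key with an optional modifier.
type OutputKey = { key: string; modifier?: "shift" | "alt" };

// Compile step: find keys that appear both with a modifier and without one.
function findConflicts(outputs: OutputKey[]): Map<string, Set<string>> {
  const modified = new Map<string, Set<string>>();
  const plain = new Set<string>();
  for (const out of outputs) {
    if (out.modifier) {
      const mods = modified.get(out.key) ?? new Set<string>();
      mods.add(out.modifier);
      modified.set(out.key, mods);
    } else {
      plain.add(out.key);
    }
  }
  const conflicts = new Map<string, Set<string>>();
  modified.forEach((mods, key) => {
    if (plain.has(key)) conflicts.set(key, mods);
  });
  return conflicts;
}

// Output step: delay an unmodified key only if a conflicting modifier
// is physically held right now.
function shouldDelay(
  key: string,
  conflicts: Map<string, Set<string>>,
  isHeld: (modifier: string) => boolean,
): boolean {
  const mods = conflicts.get(key);
  if (!mods) return false;
  return Array.from(mods).some((m) => isHeld(m));
}
```

Keys that never appear in both forms are never delayed, so most output skips the check entirely.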

Key Suppression Feedback Loop

When the executor outputs a key via RobotJS, that synthetic keypress gets captured by the global input listener and would trigger a gesture. The suppression system temporarily blocks specific keys in the gesture detector for a configurable duration after synthetic output, preventing infinite feedback loops.
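A time-windowed suppression list is one simple way to implement this. The sketch below uses hypothetical names and takes the current time as a parameter; it is not the project's actual suppression code.

```typescript
// Remembers, per key, the time until which captured events should be ignored.
class KeySuppressor {
  private until = new Map<string, number>();

  // Called by the executor right before it emits a synthetic keypress.
  suppress(key: string, durationMs: number, nowMs: number): void {
    const previous = this.until.get(key) ?? 0;
    this.until.set(key, Math.max(previous, nowMs + durationMs));
  }

  // Called by the gesture detector for every captured event.
  isSuppressed(key: string, nowMs: number): boolean {
    const end = this.until.get(key);
    if (end === undefined) return false;
    if (nowMs >= end) {
      this.until.delete(key); // window expired, clean up
      return false;
    }
    return true;
  }
}
```

Taking the maximum of the old and new deadlines means a second synthetic press of the same key can only extend the window, never shorten it.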

Teensy Serial Protocol with Auto-Reconnect

The Teensy communicates via a request-response protocol over serial: the host sends a command such as KEY:n:45 and the Teensy acknowledges it with OK:n. If USB disconnects mid-session, pending commands are rejected and a background reconnect loop attempts recovery up to 5 times at 2-second intervals. The engine keeps running without crashing.
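The host side of that exchange can be sketched with a formatter, a parser, and the retry policy as plain functions. The KEY:name:duration and OK:name shapes and the 5 attempts at 2-second intervals come from the description above; the newline framing, the key-name validation, and all function names are assumptions, and the actual serial I/O is left out.

```typescript
// Build one command line. The trailing newline is an assumption about framing.
function formatKeyCommand(key: string, durationMs: number): string {
  if (key.includes(":") || key.includes("\n")) {
    throw new Error(`invalid key name: ${key}`);
  }
  return `KEY:${key}:${Math.round(durationMs)}\n`;
}

// Parse an acknowledgement line; returns the key name, or null if it is not an OK.
function parseAck(line: string): string | null {
  const match = /^OK:(.+)$/.exec(line.trim());
  return match ? match[1] ?? null : null;
}

const MAX_RECONNECT_ATTEMPTS = 5;
const RECONNECT_INTERVAL_MS = 2000;

// Wait before the given attempt (1-based), or null once attempts are exhausted.
function reconnectDelay(attempt: number): number | null {
  return attempt >= 1 && attempt <= MAX_RECONNECT_ATTEMPTS ? RECONNECT_INTERVAL_MS : null;
}
```

A reconnect loop would call reconnectDelay with an increasing attempt number, sleep for the returned interval, try to reopen the port, and give up when it returns null.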

Use Case

GestureKit was originally built for playing MMOs and typing pesky emails, activities that involve 30–80 or more common outputs, with barely any hand movement needed to reach the output you intended. It is applicable to any scenario where complex, precisely-timed key sequences need to be bound to simple gesture inputs: games, DAWs, video editing, accessibility tools, or any keyboard-driven workflow requiring consistent execution over long sessions.

As for the future, I brought up Svalboards and Azeron Cyborgs as devices that already work like any normal mouse and keyboard, but I have interest down the line in seeing if I could make GestureKit work with something like a QuadStick (quadstick.com). I would need to better understand first whether their system would even benefit from my project or if the feature overlap is too redundant to bother. There are far more advanced things in the world already when it comes to assistive technology for differently abled or uniquely inclined individuals. I only aim to expand on that or create access to a tool that I could not easily find from my own searches.

Tech Stack

Runtime — Node.js 18+ with ES Modules

Language — TypeScript 5 (strict mode)

Input Capture — node-global-key-listener (global keyboard hooks)

Output — RobotJS (SendInput), Interception driver (kernel), Teensy 4.0 (USB HID via serialport)

Desktop — Electron 33 with context isolation

Frontend — React 18, Vite 6, Tailwind CSS 3, Recharts, D3, Lucide icons

Testing — Vitest

Hardware — Teensy 4.0 microcontroller (PJRC, VID 16C0)

Project Overview

Dashboard Walkthrough

Gesture Detection Deep Dive