Parsing ANSI/VT Terminal Keycode Sequences
==========================================
Parsing input from an ANSI keyboard stream can be optimised by combining
all keycode sequences into a superset that represents all sources:

   <esc> {'?' or 'O' or '['} {<num>{;<num>}} {'@'-'~'}

This can be parsed with the following 'C' code. It returns a character
code < 0x100 for normal character keys, 0x180+n for function keys, and
0x1C0+n for editing keys. With function and editing keys, the modifier
keys are returned in bit 4 and bit 5.

   unsigned char ansikey[]={                 /* Translation table */
   0x84,0xCF,0xCE,0xCD,0xCC,0xC5,0xC9,0xCA,  /* f4,Up,Dwn,Rgt,Lft,Bgn,End,Nxt    */
   0xC8,0x09,0x1B,0x95,0x94,0x0D,0x00,0x00,  /* Hme,Tab,Clr,Shf5,Shf4,Kent,(N),(O)*/
   0x81,0x82,0x83,0x84,0x85,0xCA,0xCB,0x00,  /* f1,f2,f3,f4,f5,PgDn,PgUp,(W)     */
   0x3D,0x00,0xD5,0x00,0x00,0x00,0x00,0x00,  /* K=,(Y),ShTAB,([),(\),(]),(^),(_) */
   0x00,0xC8,0xC6,0xC7,0xC9,0xCB,0xCA,0xC8,  /* 0,Home,Ins,Del,End,PgUp,PgDn,Home */
   0xC9,0x8D,0x80,0x81,0x82,0x83,0x84,0x85,  /* End,Num,F0,F1,F2,F3,F4,F5        */
   0x00,0x86,0x87,0x88,0x89,0x8A,0x00,0x8B,  /* (16),F6,F7,F8,F9,F10,(22),F11    */
   0x8C,0x80,0x8E,0x00,0x8F,0xC1,0x00,0x00,  /* F12,F13,F14,(27),F15,F16,(30),F17 */
   0x80,0x00,0xC3,0x00                       /* F18,F19,F20,(35)                 */
   };

   int ch=0;                                 /* Character pressed */
   int key=0;                                /* Key lookup        */
   int mod=0;                                /* Modifier          */

   fflush(stdout);
   read(STDIN_FILENO, &ch, 1);               /* Read without flushing */
   ch=ch & 0xFF;
   if (ch != 27)     return ch;              /* Not <esc>         */
   if (kbhit() == 0) return ch;              /* Nothing pending   */

   /* Read an ANSI/VT key sequence as:
    *    <esc> [ (<num>) (;<num>) <0x40-0x7F>
    * or <esc> O <0x40-0x7F>
    * or <esc> ? <0x40-0x7F>
    * or <esc> <0x40-0x7F>
    */

   while ((ch=getchar()) == 27);             /* Get following non-<esc> character */
   if (ch=='O' || ch=='?') {                 /* Convert <esc>O or <esc>? to <esc>[0 */
     ch=getchar();                           /* Get character after <esc>? or <esc>O */
   } else {
     if (ch=='[') {                          /* Parse <esc>[(<num>)(;<num>) */
       while ((ch=getchar())<'@') {          /* Parse through non-alphas */
         if (ch>='0' && ch<='9') {           /* Digit, add to current num */
           mod=mod*10+(ch-'0');
         }
         if (ch==';') { key=mod; mod=0; }    /* Semicolon, parse next number */
       }
     }
   }
   if (ch<'@' || ch>0x7F) return ch;         /* Not a valid terminator, return */

   /* At this point:
      <esc> <x>                mod=0,   key=0,   ch=<x>
      <esc> ? <x>              mod=0,   key=0,   ch=<x>
      <esc> O <x>              mod=0,   key=0,   ch=<x>
      <esc> [ <x>              mod=0,   key=0,   ch=<x>
      <esc> [ <n> <x>          mod=<n>, key=0,   ch=<x>
      <esc> [ <n> ; <m> <x>    mod=<m>, key=<n>, ch=<x>
   */

   if (ch == '~') {                          /* <esc>[...~ */
     if (key == 0) { key=mod; mod=0; }
     key=key+32;
   } else {
     if (ch >= '`') return (ch & 0x3F) | 0x100; /* <esc>? or <esc>O, lower case */
     else           key=(ch & 0x1F);            /* <esc>...<upper case>         */
   }
   if (key >= (int)sizeof(ansikey)) return 0x100; /* Outside table, eg <esc>[200~ */
   if (mod) mod=mod-1;                       /* Convert modifiers to bitmap */
   ch=ansikey[key];                          /* Translate keypress */
   if (mod & 1) ch=ch ^ 0x10;                /* SHIFT pressed */
   if (mod & 4) ch=ch ^ 0x20;                /* CTRL pressed  */
   if (mod & 2) ch=ch ^ 0x30;                /* ALT pressed   */
   return (ch | 0x100);

Example code:
 * JGH Console Library - mdfs.net/System/C/Lib
 * PDP11 ANSI parsing - mdfs.net/Info/Comp/PDP11/ProgTips
 * PDP11 BBC BASIC Console I/O - mdfs.net/Software/PDP11/BBCBASIC

References
----------
 * invisible-island.net/xterm/ctlseqs/ctlseqs.html
 * www.gnu.org/software/screen/manual/html_node/Input-Translation.html#Input-Translation

Updates
-------
22-Jun-2020: Testing suggests malformed sequences are ignored.
             Slight formatting changes.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

ANSI/VT terminal keyboard input sequences are:
  <esc>[ followed by up to 2 8-bit numeric parameters separated by ';'
         and terminated with an uppercase letter or '~', or
  <esc>? followed by a single letter, or
  <esc>O followed by a single letter, or
  <esc>  followed by a single uppercase letter.
where <esc> is CHR$27, and "letter" is taken to mean '@' (CHR$64) to
<del> (CHR$127), and "uppercase letter" is taken to mean '@' (CHR$64) to
'_' (CHR$95).

Sometimes a leading <esc> is followed by another <esc> - parsing code
should ignore any additional <esc>s until a non-<esc> is received.

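The parsing code calls kbhit() to test whether more input is pending, which
is not part of standard C or POSIX. The following is a minimal sketch of such
a poll, assuming a POSIX system that provides select(). The helper name
input_pending() is hypothetical, and the kbhit() wrapper is only one possible
way to supply the function the parsing code expects.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if at least one byte can be read from fd
 * without blocking, 0 if nothing is pending or select() fails. */
int input_pending(int fd)
{
    fd_set readable;
    struct timeval no_wait = {0, 0};   /* zero timeout: poll, do not wait */

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    return select(fd + 1, &readable, NULL, NULL, &no_wait) > 0;
}

/* One possible kbhit(): poll standard input (an assumption, not the
 * author's implementation). */
int kbhit(void)
{
    return input_pending(STDIN_FILENO);
}
```

select() only reports bytes the process has not yet read, so characters
already held in a stdio buffer by getchar() are not seen by this poll. The
terminal also needs to be in non-canonical mode for single keypresses to
arrive without waiting for Return.
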
Keyboard input needs the ability to poll for pending characters without
waiting when none are present, as some sequences are terminated by the
*absence* of a character.

  <esc> -> <nothing>                      -> <esc>
           <esc>                          -> loop until non-<esc> received
           '?' <letter>                   -> keycode sequence
           <uppercase letter> (not '[')   -> keycode sequence
           '[' (<num>) <uppercase letter> -> keycode sequence, <num> defaults to 1
           '[' <num> (';'<num>) '~'       -> keycode sequence, second <num> defaults to 1
           '[' <anything else>            -> malformed sequence
           otherwise                      -> malformed sequence

If the terminating character is '~', the first number must be present and
is a keycode number, the second number is an optional modifier value.
For example:
  <esc>[1~   <esc>[2~   <esc>[1;2~   <esc>[2;4~

If the terminating character is an uppercase letter, the letter is the
keycode value, and the optional (or first) number is the modifier value.
For example:
  <esc>[A   <esc>[B   <esc>[1B   <esc>[2C   <esc>[5D

The modifier value defaults to 1, and after subtracting 1 is a bitmap of
modifier keys being pressed: b3=Meta, b2=Ctrl, b1=Alt, b0=Shift.
For example:
  <esc>[20~     is key 20             -> function key 9
  <esc>[20;2~   is key 20, modifier 2 -> Shift-function key 9
  <esc>[H       is key H              -> Home
  <esc>[5C      is key C, modifier 5  -> Ctrl-Right

Malformed sequences appear to be ignored and are treated as 'no key pressed'.

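The rules above can be sketched as a small decoder that works on the
characters following <esc>[ and returns the key and the modifier bitmap. This
is an illustration only: the names parse_csi() and struct keyseq are
hypothetical, it reads from a string rather than from the terminal, and where
two numbers precede a letter it follows the C code earlier in this document
and takes the last number as the modifier.

```c
#include <stddef.h>

/* Hypothetical result type, for illustration only. */
struct keyseq {
    int key;    /* keycode number for '~', otherwise the terminating letter */
    int mod;    /* modifier bitmap after subtracting 1: b0=Shift, b1=Alt, b2=Ctrl */
    char term;  /* terminating character */
};

/* Decode the text following "<esc>[". Returns 1 for a keycode sequence,
 * 0 for a malformed one. */
int parse_csi(const char *s, struct keyseq *out)
{
    int num[2]  = {0, 0};   /* up to two numeric parameters */
    int have[2] = {0, 0};   /* whether each parameter had any digits */
    int n = 0;              /* index of the parameter being read */
    int mod;

    while (*s != '\0' && *s < '@') {
        if (*s >= '0' && *s <= '9') {
            num[n] = num[n] * 10 + (*s - '0');
            have[n] = 1;
        } else if (*s == ';') {
            if (++n > 1) return 0;        /* more than two parameters */
        } else {
            return 0;                     /* unexpected character */
        }
        s++;
    }

    if (*s == '~') {
        if (!have[0]) return 0;           /* keycode number is required */
        out->key = num[0];
        mod = have[1] ? num[1] : 1;
    } else if (*s >= '@' && *s <= '_') {
        out->key = *s;                    /* the letter is the keycode */
        mod = have[1] ? num[1] : (have[0] ? num[0] : 1);
    } else {
        return 0;                         /* missing or invalid terminator */
    }

    out->term = *s;
    out->mod = (mod > 0) ? mod - 1 : 0;   /* modifier value to bitmap */
    return 1;
}
```

For instance, parse_csi("20;2~", &k) gives key 20 with modifier bitmap 1
(Shift), and parse_csi("5C", &k) gives key 'C' with modifier bitmap 4 (Ctrl).
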
These are DEC VT/vt200 sequences
--------------------------------
<esc>[<num>(;<num>)~
<esc>[1~  - Home/Find     <esc>[16~ - (f5)           <esc>[31~  - F17
<esc>[2~  - Insert        <esc>[17~ - F6             <esc>[32~  - F18 (Print)
<esc>[3~  - Delete        <esc>[18~ - F7             <esc>[33~  - F19 (Cancel)
<esc>[4~  - End/Select    <esc>[19~ - F8             <esc>[34~  - F20 (Pause)
<esc>[5~  - PgUp          <esc>[20~ - F9             <esc>[35~  -
<esc>[6~  - PgDn          <esc>[21~ - F10            <esc>[36~  -
<esc>[7~  - Home          <esc>[22~ -
<esc>[8~  - End           <esc>[23~ - F11
<esc>[9~  - (NumLock)     <esc>[24~ - F12
<esc>[10~ - F0            <esc>[25~ - F13/Print
<esc>[11~ - F1            <esc>[26~ - F14/Scroll
<esc>[12~ - F2            <esc>[27~ -
<esc>[13~ - F3            <esc>[28~ - F15/Break
<esc>[14~ - F4            <esc>[29~ - F16/Menu       <esc>[200~ - start of paste
<esc>[15~ - F5            <esc>[30~ -                <esc>[201~ - end of paste

These are xterm sequences
-------------------------
<esc>[(<num>)<uppercase letter>
<esc>[@ - (output is Insert)
<esc>[A - Up              <esc>[K  -                 <esc>[U - (output is PageDown)
<esc>[B - Down            <esc>[L  -                 <esc>[V - (output is PageUp)
<esc>[C - Right           <esc>[M  -                 <esc>[W -
<esc>[D - Left            <esc>[N  -                 <esc>[X -
<esc>[E -                 <esc>[O  - vt52/vt100      <esc>[Y -
<esc>[F - End             <esc>[1P - F1              <esc>[Z -
<esc>[G - Keypad 5        <esc>[1Q - F2              <esc>[[ -
<esc>[H - Home            <esc>[1R - F3              <esc>[\ -
<esc>[I -                 <esc>[1S - F4              <esc>[^ -
<esc>[J - Clear           <esc>[T  -                 <esc>[_ -

<esc>[A to <esc>[D are the same as the ANSI output sequences.
The <num> is normally omitted if no modifier keys are pressed, but most
implementations always emit the <num> for F1-F4.

These are vt100 sequences
-------------------------
<esc>O<letter> or <esc>?<letter>
<esc>O@ -                 <esc>OK -                  <esc>OV -
<esc>OA - Keypad 8 Up     <esc>OL -                  <esc>OW -
<esc>OB - Keypad 2 Down   <esc>OM - Keypad Enter     <esc>OX - Keypad =
<esc>OC - Keypad 6 Right  <esc>ON -                  <esc>OY -
<esc>OD - Keypad 4 Left   <esc>OO -                  <esc>OZ -
<esc>OE - Shift+Keypad 5  <esc>OP - F1               <esc>O[ -
<esc>OF - Keypad 1 End    <esc>OQ - F2               <esc>O\ -
<esc>OG -                 <esc>OR - F3               <esc>O] -
<esc>OH - Keypad 7 Home   <esc>OS - F4               <esc>O^ -
<esc>OI -                 <esc>OT -                  <esc>O_ -
<esc>OJ -                 <esc>OU -

Lower case terminators translate directly to keypad characters:
<esc>O` -                 <esc>Ok - Keypad +         <esc>Ov    - Keypad 6
<esc>Oa -                 <esc>Ol - Keypad ,         <esc>Ow    - Keypad 7
<esc>Ob -                 <esc>Om - Keypad -         <esc>Ox    - Keypad 8
<esc>Oc -                 <esc>On - Keypad .         <esc>Oy    - Keypad 9
<esc>Od -                 <esc>Oo - Keypad /         <esc>Oz    - Keypad :
<esc>Oe -                 <esc>Op - Keypad 0         <esc>O{    - Keypad ;
<esc>Of -                 <esc>Oq - Keypad 1         <esc>O|    - Keypad <
<esc>Og -                 <esc>Or - Keypad 2         <esc>O}    - Keypad =
<esc>Oh - Keypad (        <esc>Os - Keypad 3         <esc>O~    - Keypad >
<esc>Oi - Keypad )        <esc>Ot - Keypad 4         <esc>O<7F> - Keypad ?
<esc>Oj - Keypad *        <esc>Ou - Keypad 5

I have not found any documentation on <esc>O` to <esc>Og (and <esc>?` to
<esc>?g), but it is safe to assume they can be treated as keypad keys <20>
to <27>.

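The lower case table above amounts to keeping the bottom six bits of the
terminating character, which is what the parsing code earlier in this
document does with (ch & 0x3F) before adding 0x100. A minimal sketch of that
mapping on its own follows. The function name keypad_char() is hypothetical,
and the -1 return for other terminators is an assumption made for this
example.

```c
/* Hypothetical helper: map the terminator of <esc>O<x> or <esc>?<x> to the
 * keypad character it represents. Only '`' (0x60) to <7F> are handled;
 * anything else returns -1. */
int keypad_char(int term)
{
    if (term < '`' || term > 0x7F)
        return -1;              /* not a lower case terminator */
    return term & 0x3F;         /* 0x60-0x7F maps onto 0x20-0x3F */
}
```

So 'p' (0x70) becomes '0' (0x30), 'k' (0x6B) becomes '+' (0x2B), and the
undocumented '`' to 'g' fall on codes 0x20 to 0x27.
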
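These are vt52 sequences
------------------------
<esc>@ -                  <esc>L  -                  <esc>U -
<esc>A - Up               <esc>M  -                  <esc>V -
<esc>B - Down             <esc>N  -                  <esc>W -
<esc>C - Right            <esc>OP - F1               <esc>X -
<esc>D - Left             <esc>OQ - F2               <esc>Y -
<esc>E -                  <esc>OR - F3               <esc>Z -
<esc>F -                  <esc>OS - F4               <esc>[ - <esc>[ sequence
<esc>G -                  <esc>P  - F1               <esc>\ -
<esc>H -                  <esc>Q  - F2               <esc>] -
<esc>I -                  <esc>R  - F3               <esc>^ -
<esc>J -                  <esc>S  - F4               <esc>_ -
<esc>K -                  <esc>T  -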