TypeNet Benchmark for development of authentication keystroke technologies (github.com)

🤖 AI Summary
The TypeNet Benchmark is a controlled-release dataset of keystroke embedding vectors intended to accelerate research on large-scale keystroke-biometric authentication. Researchers request access by emailing a signed license to atvs@uam.es; once approved, they receive credentials and a download window. The repository contains 128-dimensional embeddings for 130K subjects (100K desktop, 30K mobile) produced by TypeNet, a recurrent model with two 128-unit LSTM layers (tanh activation), batch normalization, dropout (0.5), and recurrent dropout (0.2). Input sequences are truncated or padded to M=50 key events. Three model variants, trained with softmax, contrastive, and triplet losses, each produce separate .npy files named Embedding_vectors_LOSS_SCENARIO.npy (shape: Subject × Session × 128; sessions indexed 0–15).

Significance and technical implications: the embeddings were derived from the large Aalto keystroke corpora (Dhakal et al. for desktop, Palin et al. for mobile) under an open-set evaluation. Training used ~68K desktop / 30K mobile subjects (the softmax variant was limited to 10K classes by hardware), while the test sets comprise 100K and 30K subjects respectively. The benchmark includes an experimental protocol (gallery sizes G=1..10, Euclidean-distance scoring, EER computation) for reproducing the reported authentication results. It supports metric-learning comparisons at scale and enables reproducible research on free-text keystroke authentication under realistic, high-cardinality conditions. Public work based on this benchmark must cite publications [1,4].
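The protocol summarized above (gallery of G enrollment sessions, Euclidean-distance scoring, EER) can be sketched roughly as follows. This is a minimal illustration, not the benchmark's official evaluation script: the function names, the choice of the first G sessions as gallery, the use of one query session per impostor, the pooled (rather than per-subject) EER, and the synthetic stand-in data are all assumptions made for the example. With real data, `emb` would instead come from `np.load` on a file following the Embedding_vectors_LOSS_SCENARIO.npy naming pattern.

```python
import numpy as np


def genuine_and_impostor_scores(emb, G=5):
    """Score queries against galleries with mean Euclidean distance.

    emb: array of shape (subjects, sessions, dim).
    Assumption: the first G sessions of each subject form the gallery
    and the remaining sessions are used as queries.
    """
    n_subj = emb.shape[0]
    gallery = emb[:, :G, :]
    query = emb[:, G:, :]
    genuine, impostor = [], []
    for s in range(n_subj):
        # Genuine: each query session of subject s vs. the gallery of s,
        # averaging the distance over the G gallery samples.
        diff = query[s][:, None, :] - gallery[s][None, :, :]
        genuine.append(np.linalg.norm(diff, axis=-1).mean(axis=1))
        # Impostor: one query session from every other subject vs. gallery of s.
        others = np.delete(np.arange(n_subj), s)
        diff = query[others, 0, :][:, None, :] - gallery[s][None, :, :]
        impostor.append(np.linalg.norm(diff, axis=-1).mean(axis=1))
    return np.concatenate(genuine), np.concatenate(impostor)


def equal_error_rate(genuine, impostor):
    """Pooled EER for distance scores (a comparison is accepted if d <= t)."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor <= t).mean() for t in thresholds])  # false accepts
    frr = np.array([(genuine > t).mean() for t in thresholds])    # false rejects
    i = np.argmin(np.abs(far - frr))  # threshold where the two rates are closest
    return (far[i] + frr[i]) / 2


# Synthetic stand-in for a real embedding file (hypothetical data):
# 20 subjects, 15 sessions, 128-dimensional embeddings clustered per subject.
rng = np.random.default_rng(0)
centers = rng.normal(size=(20, 1, 128))
emb = centers + 0.1 * rng.normal(size=(20, 15, 128))

genuine, impostor = genuine_and_impostor_scores(emb, G=5)
eer = equal_error_rate(genuine, impostor)
print(f"EER with G=5: {eer:.4f}")
```

Sweeping `G` from 1 to 10 with the same two functions would give the kind of gallery-size comparison the protocol describes; the threshold loop is quadratic in the number of scores, so at the benchmark's full scale a sorted-score or per-subject computation would be preferable.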