TESID: Textualised Encrypted Sequential Identifiers


Simple TESID usage
	Numerical ID	↔	TESID
Sample values:
	0	↔	w2ej
	1	↔	w6um
	2	↔	x45g
	3	↔	6mqv
		⋮
	1234	↔	ghyc
		⋮
	1000000	↔	v9w5
		⋮
(2²⁰ − 1)	1048575	↔	atcw
(2²⁰)	1048576	↔	8qwm6y
		⋮
	123456789	↔	38cbuk
		⋮
(2³⁰ − 1)	1073741823	↔	3eipc7
(2³⁰)	1073741824	↔	n3md95r4
		⋮

Typed TESID usage
Type^†	Numerical ID	↔	TESID
Sample values (with sparsity 256):
Project (0)	0	↔	w2ej
Project (0)	1	↔	dh2h
Project (0)	2	↔	pet7
Project (0)	3	↔	kmfv
Task (1)	0	↔	w6um
Task (1)	1	↔	a6xy
Task (1)	2	↔	qd5m
Task (1)	3	↔	bycz
User (2)	0	↔	x45g
User (2)	1	↔	7xgj
User (2)	2	↔	n5sj
User (2)	3	↔	cyvq

^†The types used here (Project/Task/User) are just simple examples. You will choose your own types, and each type will have a number (discriminant) associated with it.
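
The way a type and a numerical ID are folded into one number can be sketched in a few lines of Python. This only illustrates the arithmetic described in the algorithm section below (multiply by the sparsity, add the discriminant). The names `combine` and `split` are made up for this sketch and are not part of the tesid library, and the check that the discriminant is smaller than the sparsity is an assumption needed for the split to be unambiguous.

```python
from typing import Tuple

SPARSITY = 256  # same sparsity as the sample values above

def combine(discriminant: int, numeric_id: int, sparsity: int = SPARSITY) -> int:
    """Fold a type discriminant into an ID, giving the number that gets encrypted."""
    if not 0 <= discriminant < sparsity:
        raise ValueError("discriminant must be smaller than the sparsity")
    return numeric_id * sparsity + discriminant

def split(value: int, sparsity: int = SPARSITY) -> Tuple[int, int]:
    """Recover (discriminant, numeric_id) from a decrypted number."""
    numeric_id, discriminant = divmod(value, sparsity)
    return discriminant, numeric_id

# Project = 0, Task = 1, User = 2, as in the table above.
assert combine(0, 0) == 0    # Project 0 is encrypted like plain ID 0 (w2ej)
assert combine(1, 0) == 1    # Task 0 is encrypted like plain ID 1 (w6um)
assert combine(2, 0) == 2    # User 0 is encrypted like plain ID 2 (x45g)
assert combine(0, 1) == 256  # Project 1
assert split(257) == (1, 1)  # Task 1
```

This is why the first row of each type in the table matches the first three rows of the simple table: with ID 0, the number being encrypted is just the discriminant.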

(The JavaScript library is loaded on this page, and TESIDCoder and TypedTESIDCoder are exposed as globals so you can play with them directly if you like. Remember to use big integers like 0n—normal numbers like 0 will net you a TypeError.)

What it is

TESID converts numbers into short random strings.

If you allocate IDs centrally and don’t actively want to expose your ID sequence, TESID lets you store sequential numeric identifiers in your database while exposing cryptographically secure pseudorandom, yet fairly short, strings to users. This avoids disclosing the original ID sequence, which can leak valuable information (for example, roughly how many records exist).

TESID is commonly an alternative to UUIDs, which are massive overkill and very unwieldy in centralised ID allocation scenarios.

The strings TESID produces are designed to be human‐friendly: quite short, not mixed case, and with typical confusables (1/l, 0/o) removed. This allows people to usefully interact with the IDs, by typing, handwriting, speech and more.

The algorithm

The TESID encoding algorithm takes a non‐negative integer ID, and turns it into a string following these three steps:

First, if desired, it does type discrimination and sparsification, by multiplying by a sparsity factor and adding a discriminant. (This allows you to avoid TESID reuse or confusion across types, and make valid TESIDs harder to guess.)
Secondly, it scrambles the number with a real cryptographic block cipher (a slight variant of Speck) within certain ranges:
- a 20‐bit block size for values 0 to 2²⁰ − 1 (that is, a number in the range 0–1,048,575 will become some number in the range 0–1,048,575),
- 30‐bit for values 2²⁰ to 2³⁰ − 1 (so a number in the range 1,048,576–1,073,741,823 will become some number in the range 0–1,073,741,823),
- 40‐bit for values 2³⁰ to 2⁴⁰ − 1,
- &c. for 50‐bit, 60‐bit, 70‐bit, 80‐bit, 90‐bit and 100‐bit.
Thirdly, it converts the scrambled number to a string using base conversion with a base‐32 alphabet, with leading padding to the appropriate length for its range: 4 characters for 20‐bit values, 6 for 30‐, 8 for 40‐, &c. up to 20 for 100‐bit values.
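
The second and third steps can be sketched roughly as follows. This is not the library’s code: the cipher is replaced by an identity placeholder, so the output will not match real TESIDs, and the alphabet ordering (digits 2–9, then lowercase letters without l and o) is an assumption based on the description of the alphabet above. The function names are invented for this sketch.

```python
# Assumed alphabet: 32 characters, no 0/1/l/o, not mixed case.
ALPHABET = "23456789abcdefghijkmnpqrstuvwxyz"

def block_bits(n: int) -> int:
    """Smallest block size (20, 30, ..., 100 bits) whose range contains n."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    for bits in range(20, 101, 10):
        if n < 2 ** bits:
            return bits
    raise ValueError("ID must be less than 2**100")

def to_base32(n: int, length: int) -> str:
    """Base-32 conversion, padded on the left to exactly `length` characters."""
    chars = []
    for _ in range(length):
        n, digit = divmod(n, 32)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

def encode_sketch(n: int, encrypt=lambda value, bits: value) -> str:
    """Range selection and padding only; `encrypt` stands in for the block cipher."""
    bits = block_bits(n)
    scrambled = encrypt(n, bits)            # placeholder, NOT the real cipher
    return to_base32(scrambled, bits // 5)  # 5 bits per character: 20 → 4, 30 → 6

assert len(encode_sketch(2**20 - 1)) == 4
assert len(encode_sketch(2**20)) == 6
assert len(encode_sketch(2**100 - 1)) == 20
```

The point of the sketch is the length rule: the block size is chosen from the size of the input number, and each 10 extra bits adds two characters.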

Decoding is a fairly straightforward reversal of encoding.

The end result of this technique is that you get nice short IDs for as long as is possible, but avoid exposing the numeric sequence. (In the absence of sparsity and discrimination, you’ll get about a million four‐character TESIDs, a billion six‐, a trillion eight‐, and so on.)
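
To see how sparsity eats into those numbers, here is a small calculation. It is only a sketch of the arithmetic implied by the ranges above, and the function name `ids_per_length` is invented for it.

```python
def ids_per_length(sparsity=1):
    """Yield (characters, IDs available per type) for each TESID length."""
    below = 0  # IDs already covered by shorter lengths
    for bits in range(20, 101, 10):
        # IDs whose value id * sparsity + discriminant stays under 2**bits.
        # Exact when sparsity is a power of two, a close approximation otherwise.
        total = 2 ** bits // sparsity
        yield bits // 5, total - below
        below = total

plain = dict(ids_per_length())
assert plain[4] == 1_048_576       # "about a million"
assert plain[6] == 2**30 - 2**20   # "about a billion"

sparse = dict(ids_per_length(256))
assert sparse[4] == 4_096          # 2**20 / 256
assert sparse[6] == 2**22 - 2**12
```

So with sparsity 256, each type gets 4,096 four‐character TESIDs before moving on to six characters.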

See the algorithms page for a more detailed description.

Implementations

Rust

tesid on crates.io
Zero dependencies beyond core (no_std‐compatible)
Rust 1.56.0 or later
Licensed BlueOak-1.0.0 OR MIT OR Apache-2.0
The reference implementation, with:
- The best error handling (because algebraic data types make it practical);
- The best in‐code documentation (because I wrote it first and didn’t copy everything to the others, though I did copy all applicable tests); and
- Generally the easiest code to understand (because Rust is very good for bitwise stuff); also
- The best performance (because Rust); and
- Finite‐sized integer types (which render it very slightly more restrictive than implementations that use arbitrary‐sized integers).

Python

tesid on PyPI
Zero dependencies
Python 3.6 or later
Licensed BlueOak-1.0.0 OR MIT
Pretty boring, really, compared with the Rust and JavaScript versions, each with their own gimmick 🙂

JavaScript

tesid on npm
Zero dependencies
TypeScript type definitions
Requires complete native big integer support
Engine baselines (bearing in mind it’s for backend use):
- Node.js 12.7 or later (10.4 apart from module syntax)
- Chrome 67
- Firefox 68
- Safari 15 (14 lacks DataView#getBigUint64)
Licensed BlueOak-1.0.0 OR ISC
Written for compactness and performance (not readability or precise error reporting)
With all functionality retained, 1680 bytes minified (915 gzipped)

Code sample

This sample code is written in Python, as it has the prettiest syntax. But you can do just the same in both Rust and TypeScript.

# --- First, simple usage (using the key from above!) ---
from tesid import TESIDCoder

secret_key = '000102030405060708090a0b0c0d0e0f'
coder = TESIDCoder(secret_key)

assert coder.encode(0)          == 'w2ej'
assert coder.encode(1)          == 'w6um'
assert coder.encode(2)          == 'x45g'
assert coder.encode(2**20 - 1)  == 'atcw'
assert coder.encode(2**20)      == '8qwm6y'
assert coder.encode(2**30 - 1)  == '3eipc7'
assert coder.encode(2**30)      == 'n3md95r4'
assert coder.encode(2**100 - 1) == 'ia2bvpjaiju7g5uaxn5t'
# coder.encode(2**100) would raise ValueError.

assert coder.decode('w2ej') == 0


# --- Second, convenient typed usage ---
from tesid import TypedTESIDCoder, SplitDecode
from enum import Enum

class Type(Enum):
    Project = 0
    Task    = 1
    User    = 2

typed_coder = TypedTESIDCoder(coder, 256, Type)

assert typed_coder.encode(Type.Project, 0) == 'w2ej'
assert typed_coder.encode(Type.Task,    0) == 'w6um'
assert typed_coder.encode(Type.User,    0) == 'x45g'
assert typed_coder.encode(Type.Project, 1) == 'dh2h'
assert typed_coder.encode(Type.Task,    1) == 'a6xy'
assert typed_coder.encode(Type.User,    1) == '7xgj'
assert typed_coder.decode(Type.Project, 'w2ej') == 0
# typed_coder.decode(Type.Task, 'w2ej') would raise ValueError.
assert typed_coder.split_decode('w2ej') == \
    SplitDecode(id=0, discriminant=Type.Project)

Getting started

So you think you’d like to use TESID? Here’s a set of steps for getting started.

Read about the considerations when choosing TESID. There are a couple of things you should be aware of that may make you decide TESID is not suitable for your situation.
Select a TESID library. Currently, implementations are provided for Rust, Python and JavaScript, each under the name tesid in the usual package repository; see above for more details.
Decide whether and how to use sparsity and discrimination (explained in the linked sections of the “More general information” and “Design rationale” pages). In brief: use them if you want to use TESID on more than one type (use sparsity and a discriminant), if you want valid TESIDs to be less guessable (use large sparsity), or if you want to use the probabilistic solution to undesirable patterns (use moderate to large sparsity).
Generate a key. It should probably be stored as a secret in a config file or environment variable. Ideally use different keys for different environments.
Hook up TESID encoding and decoding everywhere you expose IDs.
Go forth and prosper.
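
As a rough idea of what “hooking up” can look like, here is a framework‐agnostic sketch. The helper names `public_id` and `lookup_id` are hypothetical. The coder is anything with `encode` and `decode` methods like the `TESIDCoder` in the code sample above, and the sketch assumes `decode` signals a malformed string with `ValueError`, as the other failures in that sample do.

```python
from typing import Optional

def public_id(coder, row_id: int) -> str:
    """Database ID → string to put in URLs and API responses."""
    return coder.encode(row_id)

def lookup_id(coder, tesid: str) -> Optional[int]:
    """String from a URL → database ID, or None if it is not a valid TESID."""
    try:
        return coder.decode(tesid)
    except ValueError:  # assumption: raised for malformed or out-of-range input
        return None
```

A request handler would then treat `None` like any other unknown ID (typically a 404), so invalid TESIDs and valid TESIDs for missing rows look the same from outside.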

Generating keys

TESID uses a 128‐bit key for its cryptography; libraries take this as a 32‐character big‐endian lowercase hexadecimal string.

This key should be randomly generated. Here are a few command‐line techniques you can use:

openssl rand -hex 16: (Requires OpenSSL.)
python -c 'import secrets; print(secrets.token_hex(16))': (Requires Python 3.6+.)
</dev/random head -c 16 | hexdump -e '4 "%08x" "\n"': (Should run in any typical POSIX environment, I think—Linux, macOS, BSD, &c.)
node -e 'console.log(crypto.getRandomValues(new Uint8Array(16)).reduce((s, b) => s + b.toString(16).padStart(2, "0"), ""))': (Requires Node.js 15+. The JavaScript works in browsers too, and is what the “Generate new random key” button in the demo above does, though as a matter of security principle you should not use my button without vetting all the code running on my site: wiser to run it locally on your own terms.)
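
If you would rather generate and load the key from Python code, something like the following works. It is a sketch only: the environment variable name `TESID_KEY` and both function names are made up for illustration, and the format check simply mirrors the 32‐character lowercase hexadecimal requirement stated above.

```python
import os
import re
import secrets

def generate_key() -> str:
    """16 random bytes as 32 lowercase hexadecimal characters."""
    return secrets.token_hex(16)

def load_key(environ=os.environ, name="TESID_KEY") -> str:
    """Read the key from an environment variable and check its shape."""
    key = environ.get(name, "")
    if not re.fullmatch(r"[0-9a-f]{32}", key):
        raise RuntimeError(f"{name} must be 32 lowercase hexadecimal characters")
    return key
```

Failing loudly at startup when the key is missing or malformed is safer than falling back to a default key, since a default would make your TESIDs predictable to anyone who knows it.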

More information (background, alternatives, research, &c.)

This page is an overview of TESID: general information and an introduction. I have three more pages that are worth reading if you’re interested in the problem space, in understanding when TESID is appropriate, in what alternatives there are, &c.:

More general information: a miscellaneous bag of information about TESID, the supporting research, and other related discussions.
Algorithms: prose descriptions of the TESID encoding and decoding algorithms, including the pieces that make them up.
Design rationale: some analysis of the reasoning behind TESID and specific design choices.

There’s some definite overlap between the pages, but also a lot of non‐overlap.
