A proposal for Identity Bars (dkolf.de)

Abstract

This document proposes a concept I call "identity bars" as a form of Identicon to reliably distinguish large numbers (hash values) in a visual way.

Author: David Heiko Kolf, 2024-06-29.

Motivation

On the internet it is often necessary to have a reliable way to distinguish between different entities. Is the file I want to download really the same or is someone trying to give me a different file that just pretends to be what I am looking for? Am I encoding my message to the person I want to contact rather than to someone just pretending to be them? Is this data set really the one I am looking for or just some other data with a similar name?

Fortunately there are ways to achieve just that. I can ask for a unique "hash value" to be calculated for my file and compare that. The same hash value can be calculated for a cryptographic public key to generate a fingerprint. In a database I can generate a universally unique identifier to distinguish data sets.

All of those solutions suffer from one problem: They are usually displayed as very large hexadecimal numbers. Those are easy for a computer to compare, but a challenge for a human.

However, humans are usually quite good at telling apart images, so I was looking for a way to turn long numbers into distinguishable pictures.

Similar solutions

QR codes

QR codes are a way to turn data into an image. However, that image is again meant to be decoded by computers. For a human it would be even worse for comparison or recognition than the raw data.

Identicons

Identicons are approaches to solve the issue of turning some numbers into identifiable pictures.

Most identicon solutions I found had some disadvantages I hoped to avoid: Either they used just a limited (or even unpredictable) amount of data from the original value, or they started to remind me more of a QR code. What I was looking for was a way to represent a value of at least 128 bits in such a manner that it could still be converted back to its original value (to prove that no data got lost) and would not look like random noise to a human. I liked the aesthetics of the GitHub Identicons and used them as a starting point.

Approach

I started with the GitHub Identicons, symmetrical 5x5 pixel sprites of a single color. By being symmetrical they are quickly recognizable. Such a sprite already covers 15 bits (3x5) plus a few more for the color. However, I do not want to depend on the color as it can easily degrade (when printed, viewed on a weak display or by people with a different color perception).
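The mirroring can be illustrated with a minimal sketch. The function name `sprite_from_bits` and the bit order (three bits per row, least significant bit first) are assumptions made for this example only; the reference implementation may assign the bits differently.

```python
def sprite_from_bits(bits15):
    """Build a horizontally mirrored 5x5 sprite from a 15-bit integer.

    Assumed bit order (not taken from the reference implementation):
    row y uses bits 3*y .. 3*y+2 for the left, inner and centre cell.
    """
    if not 0 <= bits15 < (1 << 15):
        raise ValueError("expected a value that fits into 15 bits")
    rows = []
    for y in range(5):
        # Three independent cells per row: left edge, inner column, centre.
        half = [(bits15 >> (y * 3 + x)) & 1 for x in range(3)]
        # Mirror the two outer cells around the centre column.
        rows.append(half + half[1::-1])
    return rows


if __name__ == "__main__":
    for row in sprite_from_bits(0b101_011_110_001_101):
        print("".join("#" if cell else "." for cell in row))
```

Only 3 of the 5 cells in each row carry information, which is where the 3x5 = 15 bits come from.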

The remaining bits (including those already used for choosing the color) are laid out in a horizontal line with a thickness of 4 bits. The one exception is the second column, where the 16th bit is placed in a fifth position above the line. Aligning the data in a horizontal line should make it fit more easily into tables and other text layouts.
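As a rough size estimate, the length of the line follows from simple arithmetic. The sketch below assumes that the first 15 bits go into the sprite, the 16th bit sits above the line, and every further column holds exactly 4 bits. The helper name `bar_columns` is hypothetical.

```python
def bar_columns(total_bits):
    """Number of 4-bit columns in the horizontal line.

    Assumption: 15 bits are consumed by the sprite and 1 bit by the
    extra position above the line; the rest is packed 4 bits per column.
    """
    if total_bits < 16:
        raise ValueError("need at least 16 bits for sprite and extra bit")
    remaining = total_bits - 16
    # Round up so that a partially filled last column is still counted.
    return -(-remaining // 4)


if __name__ == "__main__":
    for bits in (64, 128, 256):
        print(bits, "bits ->", bar_columns(bits), "columns")
```

Under these assumptions a 128-bit value needs 28 columns and a SHA-256 hash needs 60.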

So far the horizontal line would still look like random noise. In order to create more recognizable shapes, the Scale3x algorithm is now applied. The Scale3x algorithm was originally developed for upscaling pixel graphics from older low-resolution games, and it helps to follow diagonals. By applying Scale3x to random bits, arbitrary shapes are created. Scale3x does not introduce new colors and it preserves the original data.
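For readers unfamiliar with Scale3x, here is a minimal sketch of the commonly published rule set, applied to a grid of 0/1 values. Treating pixels outside the image as background is an assumption of this sketch, and the reference implementation may handle the border differently.

```python
def scale3x(grid, background=0):
    """Upscale a 2D list of pixel values by a factor of 3 using Scale3x."""
    height, width = len(grid), len(grid[0])

    def px(y, x):
        # Assumption: everything outside the image counts as background.
        if 0 <= y < height and 0 <= x < width:
            return grid[y][x]
        return background

    out = [[background] * (width * 3) for _ in range(height * 3)]
    for y in range(height):
        for x in range(width):
            # Neighbourhood:  a b c
            #                 d e f
            #                 g h i
            a, b, c = px(y - 1, x - 1), px(y - 1, x), px(y - 1, x + 1)
            d, e, f = px(y, x - 1), px(y, x), px(y, x + 1)
            g, h, i = px(y + 1, x - 1), px(y + 1, x), px(y + 1, x + 1)
            block = [e] * 9  # default: plain 3x3 copy of the source pixel
            if b != h and d != f:
                block[0] = d if d == b else e
                block[1] = b if (d == b and e != c) or (b == f and e != a) else e
                block[2] = f if b == f else e
                block[3] = d if (d == b and e != g) or (d == h and e != a) else e
                # block[4] always stays e, so the source pixel is preserved.
                block[5] = f if (b == f and e != i) or (h == f and e != c) else e
                block[6] = d if d == h else e
                block[7] = h if (d == h and e != i) or (h == f and e != g) else e
                block[8] = f if h == f else e
            for k, value in enumerate(block):
                out[y * 3 + k // 3][x * 3 + k % 3] = value
    return out


if __name__ == "__main__":
    for row in scale3x([[1, 0], [0, 1]]):
        print("".join("#" if cell else "." for cell in row))
```

Every output value is copied from an input pixel, so no new colors appear, and the centre of each 3x3 block always equals the source pixel, so the original bits can be read back by sampling every third pixel.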

Reference implementation in Python

The reference implementation, written in Python 3, is a module that creates the output images in PNG format. For easier integration in web pages, it also contains functions that encode the original IDs as Base64 in filenames, which can in turn be used to generate the images on demand.
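The idea of carrying the ID inside the filename can be sketched as follows. The helpers `encode_name` and `decode_name`, the URL-safe alphabet, the stripped padding and the `.png` suffix are all assumptions of this example and do not describe the module's actual filename format.

```python
import base64


def encode_name(prefix, idhash):
    """Hypothetical helper: embed the ID bytes in a URL-safe filename."""
    token = base64.urlsafe_b64encode(idhash).decode("ascii").rstrip("=")
    return f"{prefix}{token}.png"


def decode_name(prefix, filename):
    """Hypothetical helper: recover the ID bytes from such a filename."""
    if not (filename.startswith(prefix) and filename.endswith(".png")):
        raise ValueError("unexpected filename format")
    token = filename[len(prefix):-len(".png")]
    # Restore the padding that was stripped when encoding.
    token += "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token)


if __name__ == "__main__":
    key_id = bytes.fromhex("7751fa765167b422")
    name = encode_name("idbar-", key_id)
    print(name, decode_name("idbar-", name).hex())
```

Because the filename alone determines the image, a server needs no database lookup to render it on request.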

The module provides the following public functions:

makepng(idhash)
    Converts an array of bytes with a length of at least 4 bytes into a PNG image. The image is returned as an array of bytes.
makefilename(prefix, idhash)
    Creates a filename that contains the idhash in a Base64 encoded format.
makefilecontent(filename)
    Takes a filename created by makefilename and returns the PNG image.
makedatauri(idhash)
    Creates a PNG image and returns it as a Data URI ready for embedding in HTML pages.
wsgiapp(environ, start_response)
    A WSGI implementation returning a PNG image by calling makefilecontent.
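A hedged usage sketch: the code below builds one HTML table row for a file hash and takes the Data URI function as a parameter, so it runs without the module installed. The helper names `sha256_bytes` and `table_row` are made up for this example, and the HTML layout is only one possible choice.

```python
import hashlib
import html


def sha256_bytes(data):
    """Return the raw 32-byte SHA-256 digest of some bytes."""
    return hashlib.sha256(data).digest()


def table_row(name, digest, makedatauri):
    """Render one table row: hex digest, file name and ID bar image.

    makedatauri is expected to behave like the module function of the
    same name: it takes the ID bytes and returns a Data URI string.
    """
    uri = makedatauri(digest)
    return (
        "<tr><td><code>{}</code></td><td>{}</td>"
        '<td><img src="{}" alt="ID bar"></td></tr>'
    ).format(digest.hex(), html.escape(name), html.escape(uri, quote=True))


if __name__ == "__main__":
    # Stand-in for the real function so the sketch runs on its own.
    fake = lambda idhash: "data:image/png;base64,AAAA"
    print(table_row("example.txt", sha256_bytes(b"example"), fake))
```

With the reference module importable (assuming it is named `idbar`, after the file `idbar.py`), the call would presumably be `table_row(name, digest, idbar.makedatauri)`.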

Example use cases

File list with hashes

SHA-256 | File | ID bar
eb3bf160688fb395a2db6bc52eeff4f7855a6321d2b41bdc754554d13f4e7d44 | dkjson-2.8.lua | (ID bar image)
0a6a01ebf01af09a3fc1f1afbba17676eb5f4b04f2fad938470d3f34098069fb | idbar.py | (ID bar image)

OpenPGP key ring

Long ID | Name | ID bar
7751 fa76 5167 b422 | Erika Mustermann | (ID bar image)
4c73 b287 a0f8 1442 | Jane Doe | (ID bar image)
dad9 83fe b853 3911 | John Doe | (ID bar image)
f9c0 f0c7 4965 ebcf | Max Mustermann | (ID bar image)