2024-01-02: Site is back on track, self-hosted on a rpi because you get the Internet you fund.

GLOBAL UUID DATABASE - 446141 UUID, 1193573 comments.

World's most complete UUID database.

Check the /blog/ !

What is this website ?

This website is the world's most complete annotated UUID database, which happens to be an index of all the numbers between 0 and 2^128.
Not all the numbers between 0 and 2^128 are RFC4122-compliant, but all are welcome in my database.
Various uses can be made of the data in here :
This database started on 2018-01-01, made with dedication by your neighbourly django adept.
As of 2024-01-25, this is very much still alive, even if the time I can invest in computers when I'm not paid for that is inversely proportional to numerous things out there. Hey, at least it's back online and should stay so for a while, right ?

How could I contribute ?

Well, you could just add some UUID to the database !
It's pretty simple, since the UUID annotation form is presented on every UUID page. You just have to access the UUID page, the URL should be https://uuid.pirate-server.com/<UUID>. Then:
  1. Fill the UUID name, something useful and short.
  2. Optionally, put some comments on the context, your sources (URL), etc.
  3. Feel free to sign your contribution using the "author" field in the form.
And submit !
You can also use the following cURL command line :
  uuid="put an actual uuid here";
  title="my super uuid";
  details="some comments, like: it's a cool uuid";
  author="my acquisition script";
  email="hello"; # advanced anti-noise technique
  curl --referer "https://uuid.pirate-server.com/$uuid" -XPOST --data-urlencode "title=${title}" --data-urlencode "details=${details}" --data-urlencode "author=${author}" --data-urlencode "email=${email}" https://uuid.pirate-server.com/comment
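The same submission can be sketched in Python with only the standard library. This is a minimal sketch, assuming the /comment endpoint accepts exactly the form fields and the Referer header used by the cURL command above; the function name build_comment_request is hypothetical, and the request is only built here, not sent.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://uuid.pirate-server.com"


def build_comment_request(uuid_str, title, details, author, email="hello"):
    """Build (but do not send) the POST request described by the cURL command."""
    # urlencode() escapes spaces, '&' and '=' so free-text fields cannot break the form body.
    body = urllib.parse.urlencode(
        {"title": title, "details": details, "author": author, "email": email}
    ).encode("ascii")
    return urllib.request.Request(
        f"{BASE_URL}/comment",
        data=body,
        headers={"Referer": f"{BASE_URL}/{uuid_str}"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_comment_request(
        "00000000-0000-0000-0000-000000000000",
        "my super uuid",
        "some comments, like: it's a cool uuid",
        "my acquisition script",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
    # To actually submit the annotation: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to check what would go over the wire before pushing a long list of UUIDs.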
  
Another thing to know when contributing is that the best-ranked comment gets elected as the UUID name, so if you find the authoritative name for some UUID, feel free to upvote yours.

What is a UUID ?

There are a lot of links below, but if you're reading this you might want the definition right now. UUIDs are 128-bit identifiers, conceived to be globally unique (hence the other name for those identifiers, Globally Unique IDentifier, GUID). Their generation does not require any form of central registration authority.
Their uniqueness is mostly based on the entropy embedded in the 128 bits, slightly lower than 128 bits for UUIDs respecting the specification, which have some bits reserved for the version and the variant.
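Possible UUID versions :
  UUIDv1 : based on time and MAC address
  UUIDv2 : UUIDv1 carrying a UNIX id, defined for DCE
  UUIDv3 : based on an MD5 hash
  UUIDv4 : random
  UUIDv5 : based on a SHA1 hash
The UUID specification defines several parts for the hexadecimal representation of the 128 bits :
  time_low, time_mid, time_hi_and_version, clock_seq and node
Each of those fields can have its own section. Let's begin with the most important one, trivially visible : the time-high-and-version field, which sits in the middle of a UUID.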

Version and Variant

The Variant of a UUID is encoded in the leading bits of the second half of the UUID, which is after the bit 64, or starting at the first MSB of the 9th byte. (I need to check those offsets with the endianness issues). Possible values are :
  0xx : reserved, NCS backward compatibility
  10x : RFC 4122
  110 : reserved, Microsoft backward compatibility
  111 : reserved for future definition
The Version of a UUID is stored as a uint4 in the first nibble of the 7th byte, which is the first character of the middle field in the hexadecimal representation of a UUID. The allowed versions are 1 to 5, so nothing above 5 should be seen. It's worth mentioning that most of the valid UUIDs I collected in the database were UUIDv1 (based on time and MAC) and UUIDv4 (random).
Excellent snippets from the cryptanalysis linked below (bytes are counted from 1) :
The UUID version is defined      : Ui[7] = (Ci[7] & 0x0F) | 0x40.
Identifies belonging to RFC 4122 : Ui[9] = (Ci[9] & 0x3F) | 0x80.
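As a minimal sketch, the two masks above can be applied to 16 arbitrary bytes in Python. The function name stamp_v4 is hypothetical, and the snippet assumes the indices in the formulas are 1-based, which matches the "7th byte" and "9th byte" wording used earlier.

```python
import os
import uuid


def stamp_v4(raw16: bytes) -> uuid.UUID:
    """Apply the two masks to 16 candidate bytes (C) to obtain a UUIDv4 (U)."""
    if len(raw16) != 16:
        raise ValueError("expected exactly 16 bytes")
    b = bytearray(raw16)
    # The formulas count bytes from 1, so U[7] and U[9] are b[6] and b[8] here.
    b[6] = (b[6] & 0x0F) | 0x40  # high nibble becomes the version, 4
    b[8] = (b[8] & 0x3F) | 0x80  # two top bits become 10, the RFC 4122 variant
    return uuid.UUID(bytes=bytes(b))


if __name__ == "__main__":
    u = stamp_v4(os.urandom(16))
    print(u, u.version, u.variant == uuid.RFC_4122)
```

Six bits out of 128 are overwritten, which leaves 122 bits coming from the input bytes.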
The Java documentation claims that "Variant 2" UUIDs are "Leach-Salz", but has the good property of listing the bitmasks :
The layout of a variant 2 (Leach-Salz) UUID is as follows:
The most significant long consists of the following unsigned fields:

0xFFFFFFFF00000000 time_low
0x00000000FFFF0000 time_mid
0x000000000000F000 version
0x0000000000000FFF time_hi

The least significant long consists of the following unsigned fields:
0xC000000000000000 variant
0x3FFF000000000000 clock_seq
0x0000FFFFFFFFFFFF node
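
The masks above can be checked against Python's uuid module. This is a small sketch; the helper names extract and split_fields are made up for the illustration, and the two "longs" are simply the upper and lower 64 bits of the UUID integer.

```python
import uuid

# Masks for the most significant 64 bits, copied from the list above.
MSB_MASKS = {
    "time_low": 0xFFFFFFFF00000000,
    "time_mid": 0x00000000FFFF0000,
    "version": 0x000000000000F000,
    "time_hi": 0x0000000000000FFF,
}
# Masks for the least significant 64 bits.
LSB_MASKS = {
    "variant": 0xC000000000000000,
    "clock_seq": 0x3FFF000000000000,
    "node": 0x0000FFFFFFFFFFFF,
}


def extract(value: int, mask: int) -> int:
    """Return the bits selected by mask, shifted down to bit 0."""
    shift = (mask & -mask).bit_length() - 1  # index of the lowest set bit of the mask
    return (value & mask) >> shift


def split_fields(u: uuid.UUID) -> dict:
    """Split a UUID into the seven fields named by the masks."""
    msb, lsb = u.int >> 64, u.int & 0xFFFFFFFFFFFFFFFF
    fields = {name: extract(msb, mask) for name, mask in MSB_MASKS.items()}
    fields.update({name: extract(lsb, mask) for name, mask in LSB_MASKS.items()})
    return fields


if __name__ == "__main__":
    for name, value in split_fields(uuid.NAMESPACE_DNS).items():
        print(f"{name:10} {value:#x}")
```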

UUIDv0

There is only one, the NIL UUID : 00000000-0000-0000-0000-000000000000

UUIDv1

Time

The timestamps embedded in the UUIDv1 and UUIDv2 are 60-bit timestamps; they should be the number of 100-nanosecond intervals between the generation time and 1582-10-15 00:00:00, date of the Gregorian reform of the Christian calendar.
Because Microsoft chose 1601-01-01 00:00:00, date of the first 400-year cycle reset of the newly-applied Gregorian calendar, some UUIDs have been shifted by 18 years, 2 months and 17 days (6653 days).
Finally, because the UNIX epoch started on 1970-01-01 00:00:00, some UUIDs have been miscalculated using the latter timestamp as starting point, hence having an offset of a bit more than 387 years (141427 days).
Those three timestamps are shown on the UUID detailed pages, as there is no canonical way to determine which timestamp starting point was used.
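Here is a minimal sketch of how the three candidate dates can be computed for a given UUIDv1. The names EPOCHS and candidate_dates are hypothetical and this is not the code running on the site.

```python
import datetime
import uuid

# The three candidate starting points discussed above.
EPOCHS = {
    "gregorian": datetime.datetime(1582, 10, 15),
    "microsoft": datetime.datetime(1601, 1, 1),
    "unix": datetime.datetime(1970, 1, 1),
}


def candidate_dates(u: uuid.UUID) -> dict:
    """Interpret the 60-bit timestamp of a UUIDv1 against each starting point."""
    # u.time counts 100-nanosecond ticks; datetime only resolves microseconds.
    elapsed = datetime.timedelta(microseconds=u.time // 10)
    return {name: epoch + elapsed for name, epoch in EPOCHS.items()}


if __name__ == "__main__":
    # Hand-built UUIDv1 whose timestamp is 0x01B21DD213814000 ticks.
    u = uuid.UUID("13814000-1dd2-11b2-8000-000000000000")
    for name, date in candidate_dates(u).items():
        print(f"{name:10} {date.isoformat()}")
```

With that hand-built UUID, the Gregorian interpretation lands exactly on 1970-01-01 00:00:00, which is a convenient sanity check.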
The RFC incorrectly states that the « rollover » date should occur « around A.D. 3400, depending on the specific algorithm used », but without considering the clock sequence entropy, the rollover will occur in a bit more than three thousand years, in 5236 :
>>> import datetime, uuid
>>> (datetime.datetime(1582, 10, 15) + datetime.timedelta(microseconds=uuid.UUID('ffffffff-ffff-1fff-8000-000000000000').time//10))
datetime.datetime(5236, 3, 31, 21, 21, 0, 684697)
  

Clock sequence

This field is the third source of entropy for the UUIDv1, defined in the RFC 4122. The RFC made some attempts at defining methods to add entropy in this field; it should be considered random, and might be shared by UUIDs generated during the same boot of a specific operating system.

UUIDv2

Those UUIDs are a mysterious kind. I did not find any of those during my research. They have been defined by the Open Group, and were meant to be used within the DCE RPC protocol, which is still in use, mostly in Microsoft environments. (Nowadays, people tend to use JSON over HTTP, but back in the day the trend was to use undocumented binary formats and raw TCP sockets; an interoperability problem partly solved by DCE RPC.)

UNIX ids

Defined for UUIDv2, those UUIDv2 are just UUIDv1 with the UNIX id replacing the first 4 bytes (it's a uint32), plus some optional "Group / User / Domain" values indicating which type of UID is in the first four bytes.
The sec_rgy_domain_t value replaces the clock_seq_low byte, and is therefore an 8-bit integer (0-255), but only three values have been defined by the specification :
typedef signed32    sec_rgy_domain_t;
const signed32      sec_rgy_domain_person = 0; // uint32 = UID
const signed32      sec_rgy_domain_group  = 1; // uint32 = GID
const signed32      sec_rgy_domain_org    = 2; // does not seem to attribute meaning to the first uint32
  
Once again, I never found any of those on the Internet, but I'm describing those here for archiving purposes.
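For the record, decoding such a UUID could be sketched as follows. This assumes the layout described above (UNIX id in the first four bytes, domain in clock_seq_low); the function name decode_v2 and the sample UUID are made up, since I have no real specimen to test against.

```python
import uuid

# The three domain values defined by the specification.
DOMAINS = {0: "person (UID)", 1: "group (GID)", 2: "org"}


def decode_v2(u: uuid.UUID):
    """Return (domain name, local id) of a version 2 UUID."""
    if u.version != 2:
        raise ValueError("not a version 2 UUID")
    local_id = u.time_low     # first 4 bytes: the uint32 UNIX id
    domain = u.clock_seq_low  # 8-bit sec_rgy_domain_t value
    return DOMAINS.get(domain, f"undefined ({domain})"), local_id


if __name__ == "__main__":
    # Hypothetical UUIDv2: GID 1000 (0x3e8), domain 1.
    print(decode_v2(uuid.UUID("000003e8-1dd2-21b2-8001-001122334455")))
```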

UUIDv3

Those UUIDs are made of three parts :
>>> import hashlib, uuid
>>> hashlib.md5(uuid.NAMESPACE_DNS.bytes + bytes("random.org","ascii")).hexdigest()
'c4ca5056bca54b3ed3bed6580842e1a4'
>>> uuid.UUID(bytes=hashlib.md5(uuid.NAMESPACE_DNS.bytes + bytes("random.org","ascii")).digest())
UUID('c4ca5056-bca5-4b3e-d3be-d6580842e1a4')
>>> uuid.uuid3(uuid.NAMESPACE_DNS, "random.org")
UUID('c4ca5056-bca5-3b3e-93be-d6580842e1a4')

UUIDv4

Same as UUIDv3, but here the entropy is supposedly random. As usual, there are some bits (nibbles) reserved for version and variant :
>>> set([(uuid.uuid4().bytes[8] >> 4) for x in range(10000)])
{8, 9, 10, 11}
>>> set([(uuid.uuid4().bytes[6] >> 4) for x in range(10000)])
{4}

UUIDv5

Same as UUIDv3, but with SHA1 in lieu of MD5.
An MD5 digest is exactly 16 bytes, but a SHA1 digest is 20 bytes, so only the first 16 bytes of the resulting hash are picked.
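Both name-based versions can be rebuilt by hand and compared with the standard library. This is a minimal sketch; name_based_uuid is a hypothetical helper, and the name is assumed to be encoded as UTF-8 as Python's uuid module does.

```python
import hashlib
import uuid


def name_based_uuid(namespace: uuid.UUID, name: str, version: int) -> uuid.UUID:
    """Rebuild a UUIDv3 (MD5) or UUIDv5 (SHA1) from a namespace and a name."""
    algo = {3: hashlib.md5, 5: hashlib.sha1}[version]
    digest = algo(namespace.bytes + name.encode("utf-8")).digest()
    b = bytearray(digest[:16])  # MD5 is already 16 bytes, SHA1 is cut down from 20
    b[6] = (b[6] & 0x0F) | (version << 4)  # version nibble
    b[8] = (b[8] & 0x3F) | 0x80            # RFC 4122 variant bits
    return uuid.UUID(bytes=bytes(b))


if __name__ == "__main__":
    for version, reference in ((3, uuid.uuid3), (5, uuid.uuid5)):
        mine = name_based_uuid(uuid.NAMESPACE_DNS, "random.org", version)
        print(version, mine, mine == reference(uuid.NAMESPACE_DNS, "random.org"))
```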

Should I add all the UUIDs in the world ?

Yes, as long as they are global. Don't add disk partition UUIDs specific to your hardware (even though historical hardware might have its place in this database), but add anything that anyone might want to look up online.
Here are a few uncommon UUIDs that were added to the database :
Ideally, I would like to collect all the strange Microsoft Windows UUIDs, and all the strange UUIDs from all the strange places that might exist.
I already made some scripts to push new UUIDs to the database, collected from various sources such as websites, source code listings, and workstations. Those scripts are listed below :
Automation script to push long lists : github gist
Automation script to push MSDN copypasted CSIDL : github gist
Automation script that contains several sources : https://uuid.pirate-server.com/c

What about publishing data and collecting IPs ?

I don't store IP addresses except in HTTP logs, and all the data is CC-BY-NC-SA.
Complete UUID list : full.txt (It's a dump from 2018-04-29, I should automate this.)
Also, for your delighted eyes, here is a full index of all the UUIDs I have: /full-index/.

Is there an API ?

Yes !

What's next ?

I definitely need to implement the following features :
I should also document my operations in some sort of technical blog maybe, it's fun and I'm learning a lot of things. All the static UUID sources from a single Windows workstation will be embedded and documented in the /c collection script anyways. (For now, I track my discoveries in a temporary account on a usonian website that has been deleted, which means I now regret not writing proper HTML, heh.)
Some links about UUIDs are presented below. I would like to express my sympathy to other archivists (Harald Tveit Alvestrand, Simon Mourier, the lad at famkruithof.net, Jason Scott from textfiles.com, as well as all the archive.org volunteers). Also, thanks to Mikołaj Zalewski for his large database (of 59052 entries) and his parsers & extractors.

Special kind of UUIDs

The UUID specification has been used and abused by a lot of people, and various creative uses of it were made.

Sources

Sources that have been or will be integrated (most can't be complete and must be maintained):