In this article:
- Why this database was created
- What are the underlying technologies
- Some early strategies and stories
- Information on the collection scripts
- Some « fun facts » about broken things
Why this database was created
I used to do DFIR (digital forensics and incident response) work at my previous employer. When doing forensics, specifically on Microsoft Windows workstations, you end up scrolling through log listings, file listings, registry listings, permissions listings, etc.
While some text-based artifacts could be read and understood (« InstallerFunction », « PointlessMediaMenu »), the most unnerving thing was to stumble on UUIDs, paste them into Google, and find out that each one was a standard UUID used in every Windows OS since MS DOS to do a pointless thing.
Sometimes, though, the UUID in question is the only handle you have to trace back the attack path, because it's a CLSID that the Microsoft Windows registry maps to a DLL. This allows an attacker to reference « 45640654065 » instead of « remote-code-execution.exe », which isn't human-readable.
I had the need for a global UUID database several times when doing forensics, and started a draft of this database in my hotel room in Dublin, in the week that ended with our team earning a SANS forensics coin. (Great training btw, thanks, $employer).
Fast forward a few months: it's December, the new year will start in a few days, no one is home, and I finally have an occasion to reopen my long list of abandoned projects. Time to build this global UUID database.
How this database was created
I created this database on 2018-01-01, back when it did not hold many UUIDs. It's a standard Django application whose main objects are UUIDs, indexed by their UUID value. There are « Comments », mapped to UUIDs, and that's pretty much all. (I recently added « Labels », but that feature is not mature yet).
The whole thing had little to no CSS, a simple API using the « Referer » field as an additional mandatory parameter, and a « full list » special page. (Which cannot exist anymore given the database size).
Having collected a few UUIDs, I could not resist publishing it on 2018-01-01, such a nice date:
Here is a GLOBAL UUID DATABASE : https://t.co/73lVFkqCmm
— GLOBAL UUID DATABASE (@582a1cb9) December 30, 2017
Feel free to use my GLOBAL UUID DATABASE.
I'm proud of this GLOBAL UUID DATABASE.
Have a nice day using the GLOBAL UUID DATABASE.
(did you know v1 had timestamps ?)
Happy new GLOBAL UUID DATABASE. pic.twitter.com/PvoNriYn1f
I started with approximately 8000 UUIDs, which were mostly extracted from the Microsoft Windows CLSIDs of my host.
Gathering more UUIDs
I searched online for funny keywords (« error », « strange », « program », « erroneous », etc.), appending « UUID » or « GUID », to find new and interesting UUIDs as well as new UUID families (« UUID namespaces »).
Also, whenever I stumbled upon a new UUID, I would look it up on Google, and usually find a list of related UUIDs. At the beginning, I mostly found MSDN listings. Fun fact: they are OCR'd, and sometimes contain '?' and 'L' instead of proper hexadecimal digits. You cannot just copy-paste the data.
Microsoft regularly embeds "?" or "L" in UUID listings on MSDN. pic.twitter.com/coF4O4cKtR
— GLOBAL UUID DATABASE (@582a1cb9) January 2, 2018
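To avoid ingesting such OCR damage, one option is to triage every UUID-shaped token before upload. The sketch below is only an illustration, not the code used for the database: the function name `triage` and the sample text are made up, and only the 8-4-4-4-12 shape check reflects how UUID strings are laid out.

```python
import re

# A well-formed UUID string: 8-4-4-4-12 hexadecimal digits.
STRICT = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def triage(text):
    """Split UUID-shaped tokens into (valid, damaged) lists.

    A token is "UUID-shaped" when its dash-separated groups have the
    lengths 8-4-4-4-12, whatever characters they contain. Damaged tokens
    are the ones holding a non-hexadecimal character such as '?' or 'L'.
    """
    valid, damaged = [], []
    for token in text.split():
        token = token.strip("{}()[],;\"'")
        if [len(group) for group in token.split("-")] != [8, 4, 4, 4, 12]:
            continue  # not UUID-shaped at all, ignore it
        if STRICT.fullmatch(token):
            valid.append(token)
        else:
            damaged.append(token)
    return valid, damaged


if __name__ == "__main__":
    # Hypothetical listing: the second identifier is invented OCR damage.
    sample = (
        "IUnknown 00000000-0000-0000-C000-000000000046\n"
        "IBroken  {0000031?-0000-0000-C000-00000000004L}\n"
    )
    ok, broken = triage(sample)
    print("valid:", ok)
    print("needs manual review:", broken)
```

Damaged entries still need a human to look at the original page, since nothing in the string says which digit the OCR replaced.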
Microsoft search wasn't that helpful :
— GLOBAL UUID DATABASE (@582a1cb9) January 31, 2018
Sometimes I would also stumble on GitHub listings, which were way more useful. They usually included a source (here: mingw, itself linking to the Windows SDK), which could then be fetched directly.
- https://github.com/EddieRingle/portaudio/blob/master/src/hostapi/wasapi/mingw-include/propkey.h
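Headers like this one declare identifiers through the `DEFINE_GUID` and `DEFINE_PROPERTYKEY` macros, so a small parser is enough to turn them into a flat list. This is a minimal sketch and not one of the parsers from my folder: `parse_guid_macros` is a hypothetical name, and it assumes each macro's arguments are plain numeric literals.

```python
import re
import uuid

# DEFINE_GUID(name, l, w1, w2, b1..b8) and
# DEFINE_PROPERTYKEY(name, l, w1, w2, b1..b8, pid)
MACRO = re.compile(r"DEFINE_(?:GUID|PROPERTYKEY)\s*\(\s*(\w+)\s*,([^)]*)\)")


def parse_guid_macros(header_text):
    """Return {symbol name: uuid.UUID} for every macro found in the text."""
    found = {}
    for name, raw_args in MACRO.findall(header_text):
        try:
            # int(x, 0) accepts 0x-prefixed literals; drop C suffixes (L, UL).
            args = [int(a.strip().rstrip("uUlL"), 0) for a in raw_args.split(",")]
        except ValueError:
            continue  # non-literal argument, skip this macro
        if len(args) < 11:
            continue
        data1, data2, data3 = args[0:3]
        data4 = args[3:11]  # eight single bytes; a trailing pid is ignored
        if any(not 0 <= byte <= 0xFF for byte in data4):
            continue
        node = int.from_bytes(bytes(data4[2:8]), "big")
        found[name] = uuid.UUID(
            fields=(data1, data2, data3, data4[0], data4[1], node)
        )
    return found


if __name__ == "__main__":
    # Made-up declaration, only there to show the expected shape.
    sample = (
        "DEFINE_GUID(IID_Example, 0x12345678, 0x1234, 0x5678, "
        "0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0);"
    )
    for symbol, value in parse_guid_macros(sample).items():
        print(value, symbol)
```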
I eventually ended up with a folder like this one, full of flat lists and parsers. (I removed the parsers from the listing below, but you'll get the idea).
$ find . -maxdepth 1 -type f -perm 644 |sort | xargs wc -l
1778 ./alltclsid.txt
13 ./android_effects
100 ./apple-uid
111 ./baselinemgt
82 ./bh-win-04-seki-up2
111 ./biosbits
78 ./bluetooth-logs
19 ./boot_bcd
366 ./bt
62 ./btresponses
419 ./canonical_names
153 ./clspush
30 ./control_panel
7 ./dce-sec-acl-manager
24 ./dcom
44 ./dcomcnfg
47499 ./dmde
37 ./dsdt.alaska.ami.intl
13 ./dsdt.dsl.1
33973 ./dsdt.dsl.1.old
19 ./dsdt.dsl.uuid
19 ./dsdt.unknown
2125 ./EDK2_2015_GUIDs-2017-04-27.csv
35 ./efivar-guids
126 ./efivar-protocol-guids
77 ./extended-rights-reference
52 ./fdisk
20 ./flowerpower
65 ./folder_type_identifiers
21 ./gppref
104 ./gpt-guid
155 ./impacket-dcerpc-v5-epm
148 ./impacket-dcerpc-v5-epm-knownuuid
1141 ./ISMMC.html
203 ./linux-uuid
7 ./.linux-uuid.py.swp
8 ./.linux-uuid.swp
61 ./ms_audit
24 ./ms-dcom-assign
36 ./msgpp
3813 ./msi-guids-windows.html
15 ./mstrust
70 ./mstscax
42 ./ms-vds-assign
54 ./nmap_nse_msrpc
23 ./pset2
17 ./psetid
48 ./rdp_h
2645 ./reactos
21 ./reactos-acuuid
795 ./schema_nt4.txt
794 ./shellbags_tln.pl
928 ./shellbags_xp.pl
13 ./sony_smartband
40 ./updateGroupGUIDs
152 ./vc-redist
971 ./vc-redist-packages-and-related-registry-entries
11 ./ves-sony-w64
17 ./windows-azure-permissions
4572 ./wine
31 ./yaho
104437 total
This was (and is still) useful for static lists found in weird places, but definitely not suited for a massive collection initiative. I had to automate.
Automating the gathering
Because gathering lists manually is slow, painful and error-prone, I automated some things, starting with the enumeration of CLSIDs in the Microsoft Windows registry :
A first collection script was developed, which simply enumerated a lot of weird places in Microsoft Windows, then produced a CSV.
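The real collection script is written in PowerShell. For readers who prefer Python, the same idea (walk `HKEY_CLASSES_ROOT\CLSID`, emit a CSV) could be sketched with the standard `winreg` module. This is a stand-in, not a port of my script: the function names and the two-column CSV layout are assumptions, and the registry part only runs on Windows.

```python
import csv
import io
import sys


def iter_clsids():
    """Yield (clsid, default_name) pairs from HKEY_CLASSES_ROOT\\CLSID."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, "CLSID") as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            clsid = winreg.EnumKey(root, index)
            try:
                # The unnamed (default) value usually holds a display name.
                name = winreg.QueryValue(root, clsid)
            except OSError:
                name = ""
            yield clsid.strip("{}").lower(), name or ""


def to_csv(rows):
    """Serialize (uuid, name) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["uuid", "name"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__" and sys.platform == "win32":
    print(to_csv(iter_clsids()), end="")
```

Not every subkey under `CLSID` is a well-formed UUID, so the output still deserves a validation pass before upload.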
Current status :
— GLOBAL UUID DATABASE (@582a1cb9) January 26, 2018
- 8063/10052 UUIDs in my GLOBAL UUID DATABASE can be found in one single w10 host
- the collection script is ready (gathering part ok)
- next step : `get-help iwr` to push those in the database
- next next step : oneline this :) pic.twitter.com/2yiPXuH4HZ
Then, I implemented the upload feature, allowing for a fully automated operation mode.
So far, no one on the internet has executed it in an attempt to contribute to the database, except :
- xer, when asked directly in a private message
- probably some Polish CERT, since the installed software is what you would expect on a Polish detonation VM
Hello,
— GLOBAL UUID DATABASE (@582a1cb9) January 27, 2018
Please copy-paste the following command line in an elevated powershell instance on your Microsoft Windows operating systems.
Thank you and best regards,
Your future self.
iex (iwr https://t.co/yXgB9MTAyj).content pic.twitter.com/SibrfAGSBz
There is way more than one class of UUIDs in the Microsoft Windows namespaces. The DCOM (related to RPC) namespace is full of them as well, so I had to look in different places, and wrote the appropriate code:
I'll add the Win32_DCOMApplication WMI class UUID source to my database :) https://t.co/lMcc55BbLs
— GLOBAL UUID DATABASE (@582a1cb9) January 27, 2018
Here is a version of the script, gathering the following UUIDs on a workstation:
- MSI
- CLSID (disabled)
- Activex
- comapplications
- WMI (disabled)
Here is the code : [PowerShell listing, 197 lines, not included in this copy]
Unfortunately, it requires at least PowerShell 3, while older Microsoft Windows hosts such as Windows 7 ship with PowerShell 2.0 by default. Still, it was useful for gathering a lot of UUIDs.
Nevertheless, I gathered 10000 UUIDs !
I reached 10000 UUIDs in my GLOBAL UUID DATABASE.https://t.co/73lVFkqCmm
— GLOBAL UUID DATABASE (@582a1cb9) January 24, 2018
Soon : powershell2 one-liner to collect & push unknown ActiveX, MSI and CLSID uuids. pic.twitter.com/Fa5303u30O
Ingesting specification documents
A typical UUID would find its definition in a specification document:
- The Bluetooth specification (which, for the record, does not mention Harald Bluetooth at all)
- The ACPI & EFI specifications
- A lot of Microsoft Windows Server Protocols specifications
- The RFCs
While extracting the UUIDs from the text-based ones proved to be simple, the PDF-only documents were trickier. Thankfully, poppler and its pdftotext tool were there to convert from PDF to text. And extracting from the Windows_Server_Protocols.zip file is more convenient :
MORE UUIDs in my GLOBAL UUID DATABASE, the Active Directory Schema Classes : https://t.co/OvyLJBNkJB
— GLOBAL UUID DATABASE (@582a1cb9) January 15, 2018
Current count : 9762 pic.twitter.com/XftxXu8eEN
Finally, all those specification UUIDs have been ingested via two methods :
- For small or simple lists, a large copy-paste to some TSV, then uploaded using my TSV uploader
- For complex sources (XML with a lot of attributes, broken PDF tables), I had to rely on pdftotext and other custom scripts. Regular expressions for the win :)
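The second method boils down to "convert to text, then grep". Here is a rough sketch of that pipeline, under assumptions: it expects the `pdftotext` binary from poppler on the PATH, the two-column TSV layout (UUID, label) is a guess and not the actual format of my uploader, and the helper names are invented.

```python
import re
import subprocess

UUID_RE = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")


def pdf_to_text(pdf_path):
    """Run pdftotext and return the extracted text.

    "-layout" keeps table columns roughly aligned; "-" writes to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def uuid_rows(text):
    """Yield (uuid, label), the label being the rest of the line."""
    for line in text.splitlines():
        for match in UUID_RE.finditer(line):
            rest = line[:match.start()] + " " + line[match.end():]
            yield match.group(0).lower(), " ".join(rest.split())


def to_tsv(rows):
    """Serialize (uuid, label) pairs, one tab-separated pair per line."""
    return "".join(f"{value}\t{label}\n" for value, label in rows)


if __name__ == "__main__":
    sample = "Win32_VolumeChangeEvent   455CE053-2552-4051-A3E4-C4200DC31B70\n"
    print(to_tsv(uuid_rows(sample)), end="")
```

A line-based pass like this misses identifiers that a PDF table wraps across two lines, which is where the custom scripts come in.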
Diving in the different types of UUID
There are different versions & variants of UUIDs, as well as some vendor-defined custom flavors.
To highlight this, I added per-version colors (using Django templates), so that one can instantly rejoice whenever a UUIDv1 is to be seen. UUIDv1 are green, and the rest have funny colors as well.
Just added some per-version colors on my database. I do wonder what is the source for all those UUIDv3 (hash-based, MD5) and UUIDv5 (hash-based, SHA1).https://t.co/73lVFkqCmm pic.twitter.com/VYmsdweCpo
— GLOBAL UUID DATABASE (@582a1cb9) February 27, 2018
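For the curious, the version and variant can be read straight from the standard `uuid` module. The sketch below is not the site's template code: only "UUIDv1 is green" comes from this article, the other colors and the `classify` helper are placeholders.

```python
import uuid

# Only the green for v1 is taken from the article; the rest is made up.
VERSION_COLORS = {1: "green", 3: "orange", 4: "grey", 5: "purple"}


def classify(text):
    """Return the variant, version and display color of a UUID string."""
    value = uuid.UUID(text)
    # The version nibble only means something for the RFC 4122 variant.
    version = value.version if value.variant == uuid.RFC_4122 else None
    return {
        "uuid": str(value),
        "variant": value.variant,
        "version": version,
        "color": VERSION_COLORS.get(version, "black"),
    }


if __name__ == "__main__":
    for candidate in (
        "abf76256-392d-11d2-86c7-00c04f8edb8a",  # w64:MARKER, a UUIDv1
        "00000000-0000-0000-C000-000000000046",  # IUnknown, Microsoft variant
    ):
        print(classify(candidate))
```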
Also, the timestamps do not consistently respect the specification. To get information out of them, I had to :
- Include the extracted date in UUID listings (searches)
- Compute alternative date decodings (using different epochs, namely 1582-10-15, 1601-01-01 and 1970-01-01)
- https://uuid.pirate-server.com/search?q=example
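The alternative decodings can be sketched as follows. A UUIDv1 carries a 60-bit count of 100 ns ticks, which the specification anchors at 1582-10-15; the other two epochs are the Windows FILETIME epoch and the Unix epoch. The function name is hypothetical and the site's own implementation may differ.

```python
import uuid
from datetime import datetime, timedelta

EPOCHS = {
    "1582-10-15": datetime(1582, 10, 15),  # epoch from the UUID specification
    "1601-01-01": datetime(1601, 1, 1),    # Windows FILETIME epoch
    "1970-01-01": datetime(1970, 1, 1),    # Unix epoch
}


def decode_timestamps(text):
    """Decode the 60-bit timestamp of a UUID against each candidate epoch."""
    ticks = uuid.UUID(text).time  # number of 100 ns intervals
    offset = timedelta(microseconds=ticks // 10)
    return {name: epoch + offset for name, epoch in EPOCHS.items()}


if __name__ == "__main__":
    decoded = decode_timestamps("66666972-912e-11cf-a5d6-28db04c10000")
    for name, moment in decoded.items():
        print(name, moment.isoformat(sep=" ", timespec="seconds"))
```

With the standard epoch, the w64:RIFF identifier above decodes to 1996-04-08 11:04:28, which matches the date shown in the search results. The two other decodings only make sense for generators that ignored the specification.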
Also, Adobe has a funny use of the available UUIDs, but I didn't (yet) implement anything to make use of this.
- https://www.adobe.com/devnet-docs/acrobatetk/tools/AdminGuide/identify.html
- https://uuid.pirate-server.com/search?q=AC76BA86
Broken UUIDs
Since UUIDs are commonly used and stored in their text-based representation rather than in their 16-byte binary form, non-UUID strings can end up in some places. For example, Microsoft uses non-UUID strings in some WMI locations.
Since WMI is an interoperability layer with non-trivial interfaces, my guess is that the missing trailing digits were zeroes, and that a standard C string-handling routine stripped them. Since one random byte in 256 is equal to zero, this could have been expected.
Information on 455ce053-2552-4051-a3e4-c4200dc31b70:
0PAD8_wmi:CIMV2:Win32_VolumeChangeEvent
CIMV2:Win32_VolumeChangeEvent
Qualifier : abstract : True
Qualifier : Locale : 1033
Qualifier : UUID : 455CE053-2552-4051-A3E4-C4200DC31B7 <- missing bits
...................455CE053-2552-4051-A3E4-C4200DC31B70 <- padded to have 6 'node' bytes
...........................................123456789012 <- ruler to highlight the nibble count
Properties:DriveName, EventType, SECURITY_DESCRIPTOR, TIME_CREATED
See more here :
I arbitrarily padded them with zeroes, so that they could be recorded as UUIDs, but I honestly don't know if it's the correct choice.
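The padding itself is trivial. A minimal sketch, assuming (as I did) that the missing digits sit at the end of the string; `pad_short_uuid` is an illustrative name, not the function used on the site.

```python
import re
import uuid


def pad_short_uuid(text):
    """Right-pad a truncated UUID string with zero nibbles.

    Raises ValueError when the input has no hexadecimal digits, too many
    of them, or anything else than hexadecimal digits and dashes.
    """
    digits = text.strip("{}").replace("-", "")
    if not re.fullmatch(r"[0-9a-fA-F]{1,32}", digits):
        raise ValueError(f"cannot pad {text!r} into a UUID")
    return uuid.UUID(hex=digits.ljust(32, "0"))


if __name__ == "__main__":
    print(pad_short_uuid("455CE053-2552-4051-A3E4-C4200DC31B7"))
```

Padding on the right is a choice: if the digit was lost somewhere else, the recorded UUID is simply wrong.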
Enough of the UUID families I found, here are the numbers !
Statistics on UUIDs
Since UUIDv1 values embed timestamps, it is possible to extract the day of week and the hour of day. Since gnuplot is one of my tools of choice, I plotted all the UUIDs I had on a grid :
- X-axis : 7 days
- Y-axis : 24 hours
And here are the day-of-week vs hour-of-day graph.There is a very visible night + week-end intersection :) pic.twitter.com/iYMSNPrQ9E
— GLOBAL UUID DATABASE (@582a1cb9) March 9, 2018
I guess the week-end isn't at the end of the 7-day listing because the UUIDv1 epoch, 1582-10-15, was a Friday rather than a Monday. There's clearly a pattern, which means the data isn't fully random, which means my database yields more than 0 bits of entropy :)
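The binning behind such a graph can be sketched in a few lines. This is not the gnuplot pipeline I used: it relies on Python's calendar weekday (Monday is 0), which puts the week-end at the end of the axis whatever the epoch, and it takes the timestamps at face value without any timezone correction.

```python
import uuid
from collections import Counter
from datetime import datetime, timedelta

GREGORIAN_EPOCH = datetime(1582, 10, 15)  # a Friday


def v1_datetime(value):
    """Decode the 100 ns tick counter of a UUIDv1 into a datetime."""
    return GREGORIAN_EPOCH + timedelta(microseconds=value.time // 10)


def weekday_hour_histogram(uuid_strings):
    """Count UUIDv1 values per (weekday, hour) cell; Monday is weekday 0."""
    counts = Counter()
    for text in uuid_strings:
        value = uuid.UUID(text)
        if value.version != 1:
            continue  # only version 1 carries a timestamp
        moment = v1_datetime(value)
        counts[(moment.weekday(), moment.hour)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "66666972-912e-11cf-a5d6-28db04c10000",  # w64:RIFF
        "7473696c-912f-11cf-a5d6-28db04c10000",  # w64:LIST
    ]
    for (day, hour), count in sorted(weekday_hour_histogram(sample).items()):
        print(day, hour, count)  # three columns, ready for a plotting tool
```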
A lot of curious facts can be extracted from timestamps, expect more in the future. For now, deduce what you want from the listing below :
Results for "w64:"
66666972-912e-11cf-a5d6-28db04c10000 1996-04-08 11:04:28 w64:RIFF
7473696c-912f-11cf-a5d6-28db04c10000 1996-04-08 11:12:02 w64:LIST
abf76256-392d-11d2-86c7-00c04f8edb8a 1998-08-21 19:32:26 w64:MARKER
925f94bc-525a-11d2-86dc-00c04f8edb8a 1998-09-22 20:26:50 w64:SUMMARYLIST
20746d66-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:10:34 w64:FMT
61746164-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:12:23 w64:DATA
65766177-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:12:30 w64:WAVE
6b6e756a-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:12:40 w64:JUNK
6c76656c-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:12:42 w64:LEVL
74636166-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:12:55 w64:FACT
74786562-acf3-11d3-8cd1-00c04f8edb8a 1999-12-07 22:12:55 w64:BEXT
As well, here is another graph I made:
Today at @ToulouseHacking we chatted about reviewing those UUIDv1 timestamps. Here are two graphs focusing on the UUIDv1 data, mostly from Microsoft.
— GLOBAL UUID DATABASE (@582a1cb9) March 9, 2018
I wonder what happened in 2000 that made them stop generating UUIDv1 at this crazy rate. API stabilization ? #uuid pic.twitter.com/s0iCjXs0Sm
New UUID namespaces
My quest for new APIs, namespaces and specifications brought me to new places, and made me discover :
- That some publications have a « LSID », a unique publication identifier. It is not what I was initially searching for (a UUID for each living species), but it's enough, since I can now have a lizard in my database :
Said reptile, which discovery paper UUID is now in some Golang and Apache GUID test suites ;https://t.co/4Zv7PxhCza pic.twitter.com/sK0h5mzfV5
— GLOBAL UUID DATABASE (@582a1cb9) February 14, 2018
I also discovered funny attack surfaces (yes, my main job is security-related), but this is for another article ;)
Digging in old sources
Some old UUIDs are only to be found in legacy places, such as the Microsoft Windows SDK or some old Microsoft Windows builds, and only in binary form. One example is the legendary « UUID.LIB » file, which isn't that simple to parse.
Those obscure VS files are COFF object files in an AR archive. That's a nice way to pack things. pic.twitter.com/JRzeG27E2s
— GLOBAL UUID DATABASE (@582a1cb9) February 2, 2018
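The outer AR layer, at least, is easy to walk: an 8-byte magic string, then 60-byte text headers each followed by the member data, aligned on even offsets. Here is a minimal reader, as a sketch and not my actual parser; it stops at listing members and does not decode the COFF objects inside.

```python
import io

AR_MAGIC = b"!<arch>\n"


def iter_ar_members(stream):
    """Yield (name, data) for each member of an ar archive.

    Header layout (60 bytes): name[16] mtime[12] uid[6] gid[6] mode[8]
    size[10] and a two-byte end marker.
    """
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = stream.read(60)
        if len(header) < 60:
            return  # end of archive
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        name = header[0:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        data = stream.read(size)
        if size % 2:
            stream.read(1)  # skip the padding byte after odd-sized members
        yield name, data


if __name__ == "__main__":
    # Tiny hand-made archive with one member, to exercise the reader.
    header = f"{'demo.obj/':<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{4:<10}`\n"
    archive = AR_MAGIC + header.encode("ascii") + b"COFF"
    for member_name, payload in iter_ar_members(io.BytesIO(archive)):
        print(member_name, len(payload))
```

In Microsoft .lib files, the first members are typically special ones (named « / » and « // ») holding the symbol index and the long names, so real object members often show up as « /offset » references. Once a raw 16-byte GUID is located inside an object, `uuid.UUID(bytes_le=...)` converts it from the little-endian Windows layout.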
Also, here is a great place to visit in the Microsoft Windows SDK, the banned APIs header file :
Banned APIs pic.twitter.com/bYJOd57btf
— GLOBAL UUID DATABASE (@582a1cb9) March 18, 2018
Current status
Now that I have ingested most of the UUIDs I could find, I catch myself annoying random people on the Internet and on Twitter, searching every once in a while for « UUID » and « GUID ».
This led me to the sphere of UUID people on Twitter. More on that later ;)
ICallInterceptor was designed in 1998-12-01 on a Dell.https://t.co/b8FE7vc55k
— GLOBAL UUID DATABASE (@582a1cb9) March 10, 2018
The current status of the database is :
- I found some malware-shipping websites (download-dll.exe) that included MSI uninstallation UUIDs, and each of these websites is being dumped with custom scripts. « requests_html » is a great Python module.
- I plan to add more sections to my PowerShell acquisition script, and to make it compatible with PowerShell version 2 as well. I should run it on the Microsoft-provided virtual machines (modern.ie)
- I'm preparing a set of blog posts, the first of which you've just finished reading :)
Thanks for reading, and have a nice day !