"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.

Search This Blog

Sunday, August 14, 2005

uuid

Termite currently has a dependency on libuuid to generate universally unique identifiers. Since there is no need for the result to conform to the uuid_t data type, just that the result is unique across space and time, I wrote a uuid generator in Scheme which will make running Termite easier on more platforms. I loosely followed the pattern for generating primary keys in EJB Design Patterns which does not require a singleton or a database, etc. i.e. multiple instances of this uuid maker can exist in the same address space or in different address spaces, and on different hosts. I had fun replacing the absolutely horrible C code from libuuid with short, readable, Scheme.

> (load "my-uuid.scm")
"/home/pdlogan/dev/termite/my-uuid.scm"
> (define make-uuid (make-uuid-maker))
> (make-uuid)
"1A10E03B-4FF3486B-00000057-F7CA93A2"
The result is a 128-bit unsigned integer but the only requirement is that it is a unique Scheme object across space and time (within a reasonable amount of certainty). The code is actually following the convention for uuid's, i.e. display each 32-bit part in hex separated by dashes.

Each call to make-uuid-maker returns a new closure around an "address part" and an "object part". Every uuid this closure makes will be unique to its IP address and unique to the internal id of the object in the specific Scheme runtime. (Note: the original uuid generators use the network card MAC address but the IP address is easier to get and just as good for this purpose.)

Each call to one of these closures combines its address part and object part to a "time part" and a "random part". The time part is roughly the number of milliseconds since January 1, 1970 mod (2^32)-1. The random part is a random 32-bit unsigned integer. So within the closure's IP address and randomly allocated object, each resulting uuid is also made unique to the millisecond and a random number *within* the millisecond.

(define (make-uuid-maker)
  (let ((address-part (or (hex-encode-ip-address)
     (hex-encode-random-integer)))
 (object-id-part (hex-encode-random-object-id)))
    (lambda ()
      (let ((time-part  (hex-encode-current-milliseconds))
     (random-part (hex-encode-random-integer)))
 (string-append time-part "-" address-part "-" object-id-part "-" random-part)))))

The IP address is the host's internet address if it has one, or an intranet address, otherwise the address part becomes just another 32-bit random integer.

(define (hex-encode-ip-address)
  (let* ((address (or (get-internet-address)
        (get-intranet-address))))
    (if address
 (lengthen-with-leading-zeros
  (number->string (string->number
     (let ((result ""))
       (for-each
                                (lambda (n)
       (set! result (string-append result (number->string n))))
    (u8vector->list address))
       result))
    16))
 #f)))

The object id closed within a uuid maker is also random. If a closure is always created at a specific point, e.g. in initialization, then this object runs the risk of having the same id each time. Instead a random number of objects (not more than 100) are created before the one captured in the closure.

(define (hex-encode-random-object-id)
  (let loop ((n (random-integer 100)))
    (let ((id (object->serial-number (cons 1 2))))
      (if (= n 0)
   (lengthen-with-leading-zeros (number->string id 16))
   (loop (- n 1))))))
Well that was fun. If you see any serious flaws let me know. There are some Gambit-specific dependencies here but nothing too difficult to replace. The formatting's really bad, but I am off to the beach for the day so later.

5 comments:

Calvin Spealman said...

The fact that we have so many non-conforming UUID generators, and all for little or no reason, is the cause of the real problem. Most people do what you just did, without even looking for libuuid first. If there really is any considerable lack of libuuid deployment (which I really doubt), then that is the only reason.

PJE said...

You might want to look at draft-leach-uuids-guids-01.txt, which is sort of a defacto standard for GUID formats, as it documents all the various legacy GUID bit fields, as well as several variations of GUID generation methods, and the standard string format for them (which is *not* 4 32-bit fields, by the way). As it happens, GUIDs are not just arbitrary 128-bit numbers, they have a number of internal fields identifying their format and version. If you're just making up your own new unique ID system, it'd probably be better to call it something else and use an obviously different format, so that people don't think they can use your implementation with something that expects Leach-style UUIDs or GUIDs.

PJE said...

By the way, here's a compliant implementation in Python.

Patrick Logan said...

Calvin -- At least I did look at libuuid first. 8^(

The reason I wrote this is simple... Termite does not need a standard uuid and libuuid is not widely deployed. I am not interested in solving that problem per se, I just want Termite to be more accessible when it is released.

I do think a Termite application would want to use a standard uuid when communicating with the outside world. That could be done with libuuid and in particular a "uuidgen" server that is not necessarily installed in every node of Termite.

PJE -- I agree this should have a different name because it is not derived from, nor meant to be compatible with, any standard uuid.

Also PJE -- I'll look at that Python code. Thanks.

Anonymous said...

(Note: the original uuid generators use the network card MAC address but the IP address is easier to get and just as good for this purpose.)

Not true.

If you have home users, for instance, behind NAT routers, odds are they all have 192.168.1.x as their IP address, for
some low value of x.

In corporate environments, internal networks often use the same 192.168.x.y subnet and then use NAT to "go legal" to talk to the outside world.

So this really might not be unique enough for your needs.

MAC IDs aren't all unique either, but they're a heck of a lot closer to it...

Of course, your uuid's will probably still be unique enough, because of the other fields. But don't delude yourself about the IP address being "just as good" as the MAC address.

Blog Archive

About Me

Portland, Oregon, United States
I'm usually writing from my favorite location on the planet, the pacific northwest of the u.s. I write for myself only and unless otherwise specified my posts here should not be taken as representing an official position of my employer. Contact me at my gee mail account, username patrickdlogan.