
COM Registration

bclothier edited this page Jun 2, 2018 · 17 revisions

Given that Rubberduck is a COM Add-in, we must deal with registering components that should be visible either to the user's code (in the case of Rubberduck's API) or to the Visual Basic Editor (in the case of Rubberduck's UI, such as tool windows or custom menus). In order for Rubberduck to work, those components must have appropriate entries in the registry.

Because COM registration is often shrouded in mystery, myths are common, such as the belief that regasm.exe is mandatory for registration or that the entries must live in the HKLM hive. Neither is true, but one consequence is a large amount of cargo cult programming around COM registration. What does trip people up is that COM is a very large technology with several related technologies (e.g. OLE, Automation, COM+, to name a few), so the requirements for a working COM component vary depending on how one intends to use it.

Because we intend to interoperate with the VBA editor and to be available to VBA code, we must meet the requirements imposed by the VBE, which implicitly means those of Automation. Therefore, a minimal COM registration is not acceptable. Furthermore, an Automation client cannot use a COM component that is not part of a type library, so we must describe not only the COM component but also its type library in order to be acceptable to Automation clients such as Visual Basic. Finally, because Rubberduck is written in .NET, we must accommodate additional keys specific to COM interop.

Recommended reading on what makes up a minimal COM registration, and on what else can be added to it, is Larry's series of blog posts.

Branches of registry used for COM registration

The takeaway is that we must have entries in the following sub-branches of the Software\Classes tree.

CLSID = Describes the class ID of a component. This contains all the key data required to activate the component, including the DLL it's located in among other things.

Interface = Describes the interface that a COM component implements. This seems opaque, but because we are using .NET COM interop, it actually refers to a universal marshaler, which is why we see the same GUID {00020424-0000-0000-C000-000000000046} as its ProxyStubClsid32. COM interop then knows what to do with it and associates it with the correct COM class behind the scenes.

TypeLib = Describes the type library that a COM component belongs to. Note that simply having a registry entry pointing to the TLB file is not sufficient. All the COM components' CLSID and Interface entries must in turn contain a TypeLib subkey that refers back to this key.

Record = This subtree is not documented (or at least I could not find any official documentation), but it seems to list all COM-visible enumerations. Each enumeration has its own GUID, and this key enables clients such as Visual Basic to locate the enumeration, and thus its constant members, in the given DLL.

<ProgId> = Provides a registry lookup of the CLSID. For example, a registry entry named Rubberduck.AssertClass will have a subkey CLSID with the same GUID as found within the CLSID node. That enables clients such as VBA to locate the actual CLSID from a human-friendly name in calls such as CreateObject("Rubberduck.AssertClass"). This is not required for the COM component to function, but we like being friendly to our users, so we include ProgIds.

HKCU and HKLM

Both hives have their own Software\Classes subtree. Therefore, the same set of registry entries can be written to either one simply by changing the root, without any changes to the keys, names, or values. Writing to HKCU is attractive because it requires no admin privileges.

However, most COM lookups are actually done via HKCR, which is basically a merge of HKLM\Software\Classes and HKCU\Software\Classes, with the entries in HKCU taking precedence. That implies that if there is a COM registration in both places with differing contents, the entries in HKCU will effectively trump those in HKLM.
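
To illustrate the precedence rule, here is a minimal Python sketch that models the two Software\Classes subtrees as plain dictionaries. It does not read the real registry, the function name merged_classes_view is made up for this page, and the real merge is more nuanced than a flat dictionary overlay, so treat it as a mental model only.

```python
def merged_classes_view(hklm_classes, hkcu_classes):
    """Return a simplified model of HKCR: HKLM entries overlaid by HKCU entries."""
    merged = dict(hklm_classes)   # start from the machine-wide registrations
    merged.update(hkcu_classes)   # per-user registrations take precedence
    return merged


# Hypothetical values, keyed by a path relative to the Classes subtree.
hklm = {r"Rubberduck.AssertClass\CLSID": "{machine-wide CLSID}"}
hkcu = {r"Rubberduck.AssertClass\CLSID": "{per-user CLSID}"}

# prints {per-user CLSID}
print(merged_classes_view(hklm, hkcu)[r"Rubberduck.AssertClass\CLSID"])
```

If the two hives disagree about a registration, a client looking through the merged view only ever sees the per-user one.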

It is important to note that UAC factors into this. As explained here, running a process as an administrator might cause the HKCU registration to stop working. Therefore, if someone needs to run the VBE host or Visual Studio as an administrator, attempting to load Rubberduck will fail. In that situation, anyone who is going to run the VBE host as an administrator might just as well install using the HKLM registration. For Visual Studio, the debug build is always registered in HKCU; this can be worked around by attaching to the process rather than launching the VBE host directly from the elevated Visual Studio.

32/64 bits and registry redirection

Because we need to support both 32-bit and 64-bit VBA hosts, we need to ensure the appropriate registry entries are written to the appropriate places. All registry entries end up in Software\Classes, but on a 64-bit machine we also need to write to Software\Classes\Wow6432Node to make them accessible to a 32-bit host.

Note that redirection only applies to some branches of the registry. In this case, we only need to handle redirection for the CLSID and Interface sub-branches. The TypeLib sub-branch is effectively a symbolic link, and Microsoft recommends against writing to both the 64-bit and the 32-bit redirected keys for TypeLib. There is no redirection for the Record sub-branch. Therefore, both TypeLib and Record need to be written only once, to either HKLM or HKCU, without regard to the bitness of the installing machine.
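
To make this rule concrete, here is a minimal Python sketch that only computes which keys an installer would need to write; it does not touch the registry. The name target_paths and the machine_is_64_bit flag are made up for illustration, and the set of redirected branches simply encodes what this page states.

```python
# Branches that, per this page, need a separate copy under Wow6432Node.
REDIRECTED_BRANCHES = {"CLSID", "Interface"}


def target_paths(branch, subkey, machine_is_64_bit):
    """List the key paths (relative to HKCU or HKLM) to write for one entry."""
    paths = ["\\".join(["Software", "Classes", branch, subkey])]
    if machine_is_64_bit and branch in REDIRECTED_BRANCHES:
        # Extra copy so that a 32-bit host can see the entry as well.
        paths.append("\\".join(["Software", "Classes", "Wow6432Node", branch, subkey]))
    return paths


for path in target_paths("CLSID", "{69E194DA-43F0-3B33-B105-9B8188A6F040}", True):
    print(path)
```

With these assumptions, a CLSID or Interface entry yields two paths on a 64-bit machine, while a TypeLib or Record entry always yields one.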

Reference on registry redirection:

Sample Registry Entries

CLSID Branch

A full-featured CLSID subkey for a COM component implemented in .NET that is usable by Automation clients would generally have the following subkeys:

CLSID Guid

Generic form:

...\CLSID\{<GUID of the implementing class>}
@="<fully qualified class name>"

Example:

...\CLSID\{69E194DA-43F0-3B33-B105-9B8188A6F040}
@="Rubberduck.UnitTesting.AssertClass"

The subkey defines the actual CLSID of the COM component. Typically each .NET class will have its own CLSID, and it should be globally unique. This is the key used by COM to uniquely identify a class out of all other classes and thus be able to activate the class in question.

The GUID should correspond to the implementing class's Guid attribute. In the case of the AssertClass class, we have the attribute [Guid(RubberduckGuid.AssertClassGuid)], which references the RubberduckGuid static class to provide the GUID for that class.

The default value typically corresponds to the class name as found within the .NET namespace. This technically isn't required but it is also used within viewing tools such as the object browser to provide more information.

Implemented Categories

Generic form:

...\CLSID\{<CLSID>}\Implemented Categories
...\CLSID\{<CLSID>}\Implemented Categories\{<CATID GUID>}

Example:

...\CLSID\{69E194DA-43F0-3B33-B105-9B8188A6F040}\Implemented Categories
...\CLSID\{69E194DA-43F0-3B33-B105-9B8188A6F040}\Implemented Categories\{62C8FE65-4EBB-45e7-B440-6E39B2CDBF29}

Implemented categories, aka CATIDs, are not strictly required, and their meaning is generally host-defined. It is up to the client to make use of a CATID and change its behavior accordingly. For example, some controls may have a certain CATID set to indicate that they are a particular type of control to be displayed in a toolbox. When regasm.exe runs, it generates this key with the same GUID {62C8FE65-4EBB-45e7-B440-6E39B2CDBF29}, which maps to HKEY_CLASSES_ROOT\Component Categories\{62C8FE65-4EBB-45e7-B440-6E39B2CDBF29}. That entry identifies the .NET Category, so we can tell that the class is implemented in .NET. Visual Basic or other COM consumers probably won't care about that, but it is useful to .NET to avoid re-importing COM type metadata, so it's recommended to keep the key.

InProcServer32

Generic form:

...\CLSID\{<CLSID>}\InprocServer32
@="<DLL name>"
"ThreadingModel"="<ThreadingModel>"
"Class"="<Fully qualified class name>"
"Assembly"="<Assembly name>, Version=<Assembly version>, Culture=<Assembly culture>, PublicKeyToken=<Assembly public key>"
"RuntimeVersion"="<CLR runtime>"
"CodeBase"="<Path to the DLL containing the class>"

Example:

...\CLSID\{69E194DA-43F0-3B33-B105-9B8188A6F040}\InprocServer32
@="mscoree.dll"
"ThreadingModel"="Both"
"Class"="Rubberduck.UnitTesting.AssertClass"
"Assembly"="Rubberduck, Version=2.1.6642.37961, Culture=neutral, PublicKeyToken=null"
"RuntimeVersion"="v4.0.30319"
"CodeBase"="C:\GitHub\Rubberduck\Rubberduck.Deployment\bin\Debug\Rubberduck.dll"

The InprocServer32 subkey describes how the class component may be activated and where it is located on the computer. It also includes some .NET-specific data. Normally, a C++ COM registration would have the default value pointing to its own DLL. However, because we are doing COM interop, we need to point to mscoree.dll instead, and additionally pass some data for .NET to use when activating the COM-visible object. This is why we have four additional values, whereas COM itself technically requires only the default value (the DLL name) and the ThreadingModel. Those four additional values are used by COM interop to locate the class within the .NET assembly and to use the correct CLR runtime (keep in mind that the CLR version is distinct from the .NET Framework version).
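
As an illustration of how these values fit together, here is a minimal Python sketch that assembles them into a dictionary, with the empty string standing in for the default value. The function name and its parameters are hypothetical, the defaults are copied from the example above, and nothing is written to the registry.

```python
def inproc_server32_values(class_name, assembly_name, assembly_version, code_base,
                           runtime_version="v4.0.30319", threading_model="Both"):
    """Build the InprocServer32 values for a COM-visible .NET class."""
    assembly = (f"{assembly_name}, Version={assembly_version}, "
                "Culture=neutral, PublicKeyToken=null")
    return {
        "": "mscoree.dll",                  # default value: the CLR shim, not our own DLL
        "ThreadingModel": threading_model,  # required by COM itself
        "Class": class_name,                # the remaining four are read by COM interop
        "Assembly": assembly,
        "RuntimeVersion": runtime_version,
        "CodeBase": code_base,
    }


values = inproc_server32_values(
    "Rubberduck.UnitTesting.AssertClass", "Rubberduck", "2.1.6642.37961",
    r"C:\GitHub\Rubberduck\Rubberduck.Deployment\bin\Debug\Rubberduck.dll")
for name, data in values.items():
    print(repr(name), "=", repr(data))
```

The sketch hard-codes a neutral culture and a null public key token because that is what the example shows; a signed or localized assembly would need those to be parameters too.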

Note that there is no such thing as InProcServer64; on 64-bit systems, registry redirection distinguishes between the 32-bit and 64-bit versions of a class. In the case of COM interop this is somewhat irrelevant: because we build for Any CPU, the Assembly value remains the same in both the 64-bit and the 32-bit path. There is also an InProcServer key, which is used for 16-bit processes and is not used by us.

Version-specific InprocServer32

Generic form:

...\CLSID\{<CLSID>}\InprocServer32\<Version>
<same keys as non-version-specific InProcServer32>

Example:

...\CLSID\{69E194DA-43F0-3B33-B105-9B8188A6F040}\InprocServer32\2.1.6642.37961
"Class"="Rubberduck.UnitTesting.AssertClass"
"Assembly"="Rubberduck, Version=2.1.6642.37961, Culture=neutral, PublicKeyToken=null"
"RuntimeVersion"="v4.0.30319"
"CodeBase"="C:\GitHub\Rubberduck\Rubberduck.Deployment\bin\Debug\Rubberduck.dll"

We can additionally describe version-specific behavior and customize accordingly. In practice, however, only one version is actually used, and there shouldn't be multiple versions active at the same time anyway. Thus, there is generally only one version subkey. Note that the default value is not set, since it is already declared in the non-version-specific InProcServer32 subkey. We can only customize the class, assembly, runtime version, and path to the file, and at this time they are all the same as in the parent key.

ProgId

Generic form:

...\CLSID\{<CLSID>}\ProgId
@="<ProgId>"

Example:

...\CLSID\{8D052AD8-BBD2-4C59-8DEC-F697CA1F8A66}\ProgId
@="Rubberduck.AssertClass"

This is required whenever a ProgId is provided. In this case, it corresponds to the ProgId attribute on the class. We use the RubberduckProgId static class to map all of those.

Interface branch

Though the CLSID is the linchpin of a COM component, we always access the component via interfaces, and a coclass may in fact implement several interfaces, some documented, some not. There is no mechanism for enumerating them except what is already reported via the type library and the registry. Therefore, when we implicitly make an IUnknown::QueryInterface call, we look up the interface ID (aka IID), and this is where this branch becomes relevant: it describes the interfaces.

Interface ID (IID)

Generic form:

...\Interface\{<GUID>}
@="<Interface name>"

Example:

...\Interface\{69E194DB-43F0-3B33-B105-9B8188A6F040}
@="IAssert"

This key identifies the actual interface. This should map to an actual interface, which in this case is the IAssert interface. This interface is used by the AssertClass we saw earlier because we have the attribute ComDefaultInterface set. Note that the attribute is not strictly required; simply implementing the interface is sufficient though the attribute is useful when a class implements more than one COM-visible interface so you can explicitly specify which is to be its default interface.

Furthermore, the GUID corresponds to the interface's ID (IID) and is declared in the same way as a CLSID, as illustrated in the IAssert interface, which also maps back to the RubberduckGuid class.

ProxyStubClsid32

Generic form:

...\Interface\{<IID>}\ProxyStubClsid32
@="{00020424-0000-0000-C000-000000000046}"

Example:

...\Interface\{69E194DB-43F0-3B33-B105-9B8188A6F040}\ProxyStubClsid32
@="{00020424-0000-0000-C000-000000000046}"

Similar to the CLSID, we need to describe how to "activate" an interface and for that we can use the universal marshaler that is also used by classes. However, by itself, that is not enough; the universal marshaler depends on having a type library, since we aren't providing our own proxy DLL. For that reason, the next key is vital.

TypeLib

Generic form:

...\Interface\{<IID>}\TypeLib
@="{<TypeLib GUID>}"
"Version"="<TypeLib Version>"

Example:

...\Interface\{69E194DB-43F0-3B33-B105-9B8188A6F040}\TypeLib
@="{E07C841C-14B4-4890-83E9-8C80B06DD59D}"
"Version"="2.1"

The universal marshaler relies on the type library, and this key provides the information it needs to look up the type library and thus be able to marshal the interface. Refer to the following TypeLib branch section for more information.
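
Putting the three Interface keys together, here is a minimal Python sketch that builds them for one interface. The function name interface_entries is hypothetical, key paths are relative to Software\Classes, the empty string stands in for the default value, and nothing is written to the registry.

```python
# The universal marshaler CLSID quoted throughout this page.
UNIVERSAL_MARSHALER_CLSID = "{00020424-0000-0000-C000-000000000046}"


def interface_entries(iid, name, typelib_guid, typelib_version):
    """Build the Interface keys that tie one IID to the marshaler and its type library."""
    base = "Interface\\{" + iid + "}"
    return {
        base: {"": name},
        base + "\\ProxyStubClsid32": {"": UNIVERSAL_MARSHALER_CLSID},
        base + "\\TypeLib": {"": "{" + typelib_guid + "}", "Version": typelib_version},
    }


entries = interface_entries("69E194DB-43F0-3B33-B105-9B8188A6F040", "IAssert",
                            "E07C841C-14B4-4890-83E9-8C80B06DD59D", "2.1")
for key, values in entries.items():
    print(key, values)
```

The only per-interface inputs are the IID and the name; the marshaler CLSID is constant, and the TypeLib values are shared by every interface exported from the same assembly.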

TypeLib branch

As indicated previously, because we use the universal marshaler rather than building our own proxy DLL, we must provide a type library. It's also likely that Visual Basic mandates a type library, since it uses type libraries internally to provide metadata both to its users and for its own purposes. For that reason, a minimal COM registration won't suffice and we must provide this information.

By default, if we do not do any merging, a type library is exported for each .NET assembly, so each generated DLL file has its own TLB file, which can describe all of the DLL's COM-visible classes, interfaces, and enumerations. Therefore, the registry entries for all of a DLL's interfaces refer back to the same type library registry entry.

It should be noted that, in practice, regasm.exe only writes this entry to the 64-bit path, never to the 32-bit path. That is odd, since it does write both the 32-bit and 64-bit paths for CLSID and Interface but never adjusts for TypeLib. Whether this is a bug is uncertain, as it is apparently not an impediment for Visual Basic consumers looking up the information. It may have to do with the fact that the TypeLib entry also describes the platform in its key, rather than relying on the path, as explained later in the platform-specific path section.

TypeLib GUID

Generic form:

...\TypeLib\{<GUID>}

Example:

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}

Like other COM objects, the type library must have its own GUID to uniquely identify the library. In a .NET project, this corresponds to the assembly-level Guid attribute defined in AssemblyInfo.cs.

Version

Generic form:

...\TypeLib\{<TypeLib GUID>}\<Version>
@="<TypeLib name>"

Example:

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}\2.1
@="Rubberduck"

There can be multiple versions of the same type library, though in practice we only use one at a given time. Because, at the time of writing, Rubberduck is at version 2.1 (for example, assembly version 2.1.6642.37961), .NET takes the major and minor parts and provides 2.1 when using regasm.exe. We've followed that convention, since smaller changes should generally be non-breaking. Whenever we introduce a breaking change, the version should be bumped.

Furthermore, the version must match the one used by the TypeLib subkey of each Interface entry, as noted above, so that the correct type library can be located. There is no version-neutral lookup.

The name of the type library can be anything and is usually shown in the object browser. In this case, we simply use Rubberduck as the name.
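
The major.minor convention described in this section can be sketched in a few lines of Python. The function name typelib_version is made up for illustration; it only mirrors the convention stated here and is not taken from regasm.exe or from Rubberduck's installer.

```python
def typelib_version(assembly_version):
    """Derive the <Version> subkey name (major.minor) from a full assembly version."""
    parts = assembly_version.split(".")
    if len(parts) < 2:
        raise ValueError("expected at least major.minor, got " + repr(assembly_version))
    return parts[0] + "." + parts[1]


print(typelib_version("2.1.6642.37961"))  # prints 2.1
```

Because the build and revision parts are dropped, every 2.1 build maps to the same TypeLib version key, which is why the version only needs to be bumped for breaking changes.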

Locale

Generic form:

...\TypeLib\{<TypeLib GUID>}\<Version>\<Locale>

Example:

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}\2.1\0

If a type library has locale-specific data, one can specify which locale the type library supports. In practice, however, it's easiest to specify 0, indicating that it is not locale-specific, and to leave internationalization to the implementation.

Platform-specific path

Generic form:

...\TypeLib\{<TypeLib GUID>}\<Version>\<Locale>\<Platform>
@="<Path to TLB>"

Example:

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}\2.1\0\win32
@="C:\GitHub\Rubberduck.Deployment\bin\Debug\Rubberduck.x32.tlb"

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}\2.1\0\win64
@="C:\GitHub\Rubberduck.Deployment\bin\Debug\Rubberduck.x64.tlb"

Each type library file is bound to a bitness; a separate type library must be generated for 32-bit and for 64-bit (and also for 16-bit, but who uses those...). Accordingly, we describe the platform with <Platform>, which can be win16, win32, or win64, and store the path to the appropriate TLB file as its default value.
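
As a rough illustration of how the GUID, version, locale, and platform segments nest, here is a minimal Python sketch that builds the platform keys for a set of TLB files. The function name and the tlb_paths parameter are hypothetical, the paths in the usage lines are placeholders, and the locale defaults to 0 as discussed above.

```python
ALLOWED_PLATFORMS = {"win16", "win32", "win64"}


def typelib_platform_keys(typelib_guid, version, tlb_paths, locale="0"):
    """Map each platform key (relative to Software\\Classes) to its TLB path."""
    keys = {}
    for platform, tlb_path in tlb_paths.items():
        if platform not in ALLOWED_PLATFORMS:
            raise ValueError("unknown platform: " + platform)
        key = "\\".join(["TypeLib", "{" + typelib_guid + "}", version, locale, platform])
        keys[key] = tlb_path  # stored as the key's default value
    return keys


keys = typelib_platform_keys(
    "E07C841C-14B4-4890-83E9-8C80B06DD59D", "2.1",
    {"win32": "Rubberduck.x32.tlb", "win64": "Rubberduck.x64.tlb"})
for key, path in keys.items():
    print(key, "->", path)
```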

Flags

Generic form:

...\TypeLib\{<TypeLib GUID>}\<Version>\FLAGS
@="<Flags>"

Example:

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}\2.1\FLAGS
@="0"

The <Flags> value corresponds to the LIBFLAGS enumeration. In practice it is normally 0, as no flags are set. Further research is needed to determine whether any of the flags are required.

HelpDir

Generic form:

...\TypeLib\{<TypeLib GUID>}\<Version>\HELPDIR
@="<Path to directory with help files>"

Example:

...\TypeLib\{E07C841C-14B4-4890-83E9-8C80B06DD59D}\2.1\HELPDIR
@="C:\GitHub\Rubberduck.Deployment\bin\Debug\"

Most COM components can have a HelpFile and a HelpContextId. Presumably, in the case where the HelpFile isn't an absolute path, this registry entry is used to locate the directory that should contain the help file. We currently don't have a help file, so this is largely moot until we actually create one...

Record Branch

The Record branch lists all enumerations that may be defined in a DLL. Those enumerations can be given a Guid attribute, just like a class or an interface, and the entry is presumably used by Visual Basic or other consumers to map an enumeration member to its literal value as part of compilation.

Enum GUID

Generic form:

...\Record\{<GUID>}

Example:

...\Record\{3E077C17-5678-3605-8449-FEABE42C9725}

Like a CLSID or an IID, it is the unique identifier for an enumeration. In this case, it corresponds to the DeclarationType enum, which can have a Guid attribute assigned.

Version

Generic form:

...\Record\{<Enum GUID>}\<Version>
"Class"="<Fully qualified class name>"
"Assembly"="<Assembly name>, Version=<Assembly version>, Culture=<Assembly culture>, PublicKeyToken=<Assembly public key>"
"RuntimeVersion"="<CLR Runtime>"
"CodeBase"="<Path to the DLL containing the enumerations>"

Example:

...\Record\{3E077C17-5678-3605-8449-FEABE42C9725}\2.1.6642.37961
"Class"="Rubberduck.API.VBA.DeclarationType"
"Assembly"="Rubberduck, Version=2.1.6642.37961, Culture=neutral, PublicKeyToken=null"
"RuntimeVersion"="v4.0.30319"
"CodeBase"="C:\GitHub\Rubberduck\Rubberduck.Deployment\bin\Debug\Rubberduck.dll"

Record allows for multiple versions of the same enumeration GUID, though it is expected that there'll be only one in use and it would be mapped to the currently loaded version anyway. Curiously, it uses the full version as the assembly version and not major/minor as type library does.

The data is very similar to what we saw with CLSID's InprocServer32 key, providing enough information on where to locate the assembly and be able to load it. Note that it even uses the same key Class which is misleading, since it's certainly not referring to a class but an enumeration.

ProgId Branch

Directly under Software\Classes, we can create registry entries for each ProgId. Those are used by calls such as CreateObject to locate the CLSID using a human-friendly string.

ProgID

Generic form:

...\<ProgId>
@="<Fully qualified class name>"

Example:

...\Rubberduck.AssertClass
@="Rubberduck.UnitTesting.AssertClass"

The <ProgId> should correspond to the class's ProgId attribute. The <Fully qualified class name> isn't required but is normally generated by regasm.exe, presumably as an aid; it is also shown in viewing tools.

CLSID

Generic form:

...\<ProgId>\CLSID
@="{<CLSID>}"

Example:

...\Rubberduck.AssertClass\CLSID
@="{69E194DA-43F0-3B33-B105-9B8188A6F040}"

This links a ProgId to the CLSID which enables the marshaler to complete the mapping and thus locate the implementing class. As noted earlier, the CLSID branch has ProgId that backlinks to this as well.
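
The two-way link between a ProgId and its CLSID can be checked with a small Python sketch. It works on a plain dictionary that stands in for Software\Classes, with the empty string as the default value; the function names are made up for this page, and the sample data is illustrative rather than read from a real registry.

```python
def clsid_from_prog_id(classes, prog_id):
    """Forward lookup: read the default value of the ProgId's CLSID subkey."""
    return classes[prog_id + "\\CLSID"][""]


def prog_id_round_trips(classes, prog_id):
    """Check that the CLSID's ProgId subkey points back at the same ProgId."""
    clsid = clsid_from_prog_id(classes, prog_id)
    back_link = classes.get("CLSID\\" + clsid + "\\ProgId", {}).get("")
    return back_link == prog_id


classes = {
    "Rubberduck.AssertClass\\CLSID": {"": "{69E194DA-43F0-3B33-B105-9B8188A6F040}"},
    "CLSID\\{69E194DA-43F0-3B33-B105-9B8188A6F040}\\ProgId": {"": "Rubberduck.AssertClass"},
}

print(clsid_from_prog_id(classes, "Rubberduck.AssertClass"))
print(prog_id_round_trips(classes, "Rubberduck.AssertClass"))  # prints True
```

A check like this is a cheap way to catch a registration where the ProgId and CLSID entries were generated from different GUIDs.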
