Commit 32e9200: Docs: Add BL1 design doc

Authored by RcColes, committed by Anton-TF.

Change-Id: I24858e887b418d83022670170fa8d865f119b1ce
Signed-off-by: Raef Coles <[email protected]>

1 parent 376e3d6. 1 file changed (+302, -0) under docs/technical_references/design_docs.

########################
BL1 Immutable bootloader
########################

:Author: Raef Coles
:Organization: Arm Limited

************
Introduction
************

Some devices that use TF-M will require initial boot code that is stored in ROM.
There are a variety of reasons that this might happen:

- The device cannot access flash memory without a driver, so needs some setup
  to be done before main images on flash can be booted.
- The device has no on-chip secure flash, and therefore cannot otherwise
  maintain a tamper-resistant root of trust.
- The device has a security model that requires an immutable root of trust.

Henceforth any bootloader stored in ROM will be referred to as BL1, as it would
necessarily be the first stage in the boot chain.

TF-M provides a reference second-stage flash bootloader BL2, in order to allow
easier integration. This bootloader implements all secure boot functionality
needed to provide a secure chain of trust.

A reference ROM bootloader BL1 has now been added with the same motivation:
allowing easier integration of TF-M for platforms that do not have their own
BL1 and require one.

****************************
BL1 Features and Motivations
****************************

The reference ROM bootloader provides the following features:

- A split between code being stored in ROM and in other non-volatile memory.

  - This can allow significant cost reduction in fixing bugs compared to
    ROM-only bootloaders.

- A secure boot mechanism that allows upgrading the next boot stage (which
  would usually be BL2).

  - This allows for the fixing of any bugs in the BL2 image.
  - Alternatively, this could allow the removal of BL2 in some devices that are
    constrained in flash space but have ROM.

- A post-quantum asymmetric signature scheme for verifying the next boot stage
  image.

  - This can allow devices to be securely updated even if attacks
    involving quantum computers become viable. This could extend the lifespans
    of devices that might be deployed in the field for many years.

- A mechanism for passing boot measurements to the TF-M runtime so that they
  can be attested.
- Tooling to create and sign images.
- Fault Injection (FI) and Differential Power Analysis (DPA) mitigations.

*********************************
BL1_1 and BL1_2 split bootloaders
*********************************

BL1 is split into two distinct boot stages: BL1_1, which is stored in ROM, and
BL1_2, which is stored in other non-volatile storage. This storage would usually
be either trusted or untrusted flash, but on platforms without flash memory it
can be OTP. As BL1_2 is verified against a hash stored in OTP, it is immutable
after provisioning even if stored in mutable storage.

Bugs in ROM bootloaders usually cannot be fixed once a device is provisioned or
in the field. As ROM code is immutable, the only option is to fix the bug in
newly manufactured devices.

However, it can be very expensive to change the ROM code of devices once
manufacturing has begun, as it requires changes to the photolithography masks
that are used to create the device. This cost varies depending on the complexity
of the device and of the process node that it is being fabricated on, but can be
large, both in engineering time and material/process costs.

By placing the majority of the immutable bootloader in other storage, we can
mitigate the costs associated with changing ROM code, as a new BL1_2 image can
be used at provisioning time with minimal changeover cost. BL1_1 contains a
minimal codebase responsible mainly for the verification of the BL1_2 image.

The bootflow is as follows. For simplicity this assumes that the boot stage
after BL1 is BL2, though this is not necessarily the case:

1) BL1_1 begins executing in place from ROM
2) BL1_1 copies BL1_2 into RAM
3) BL1_1 verifies BL1_2 against the hash stored in OTP
4) BL1_1 jumps to BL1_2, if the hash verification has succeeded
5) BL1_2 copies the primary BL2 image from flash into RAM
6) BL1_2 verifies the BL2 image using asymmetric cryptography
7) If verification fails, BL1_2 repeats steps 5 and 6 with the secondary BL2
   image
8) BL1_2 jumps to BL2, if either image has successfully verified
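
The control flow of steps 2 to 8 can be sketched in a few lines of Python. This
is an illustrative model only, not the BL1 implementation: the function names
are hypothetical, SHA-256 is assumed as the hash used for the BL1_2 check, and
the asymmetric verification of BL2 is passed in as a callback rather than
implemented.

```python
import hashlib
import hmac
from typing import Callable, Optional


def bl1_1_verify_bl1_2(bl1_2_storage: bytes, otp_hash: bytes) -> Optional[bytes]:
    """Steps 2-4: returns the verified RAM copy of BL1_2, or None on failure."""
    ram_copy = bytes(bl1_2_storage)  # step 2: stand-in for the copy into RAM
    digest = hashlib.sha256(ram_copy).digest()  # SHA-256 is an assumption here
    if hmac.compare_digest(digest, otp_hash):  # step 3: compare with OTP hash
        return ram_copy  # step 4: BL1_1 may jump to this copy
    return None


def bl1_2_select_bl2(
    primary: bytes, secondary: bytes, verify: Callable[[bytes], bool]
) -> Optional[bytes]:
    """Steps 5-8: try the primary image, then fall back to the secondary."""
    for flash_image in (primary, secondary):
        ram_copy = bytes(flash_image)  # step 5: stand-in for the copy into RAM
        if verify(ram_copy):  # step 6: hypothetical signature check callback
            return ram_copy  # step 8: BL1_2 may jump to this copy
    return None  # neither image verified, so nothing is booted
```

Note that in both stages the check runs on the RAM copy rather than on the
original storage, so the bytes that were verified are the bytes that are
executed.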

.. Note::
    The BL1_2 image is not encrypted, so if it is placed in untrusted flash it
    will be possible to read the data in the image.

Some optimizations have been made specifically for the case where BL1_2 has been
stored in OTP:

OTP can be very expensive in terms of chip area, though new technologies like
antifuse OTP decrease this cost. Because of this, the code size of BL1_2 has
been minimized. Code-sharing has been configured so that BL1_2 can call
functions stored in ROM. Care should be taken that OTP is sized such that it is
possible to include versions of the functions used via code-sharing, in case the
ROM functions contain bugs. Less space is needed than if all code were
duplicated, as it is assumed that most functions will not contain bugs.

As OTP memory frequently has low performance, BL1_2 is copied into RAM before
it is executed. BL1_2 also copies the next stage image into RAM before
authenticating it, which allows the next stage to be stored in untrusted flash.
This requires that the device have sufficient RAM to contain both the BL1_2
image and the next stage image at the same time. Note that this is done even if
BL1_2 is located in XIP-capable flash, as it both allows the use of untrusted
flash and simplifies the image upgrade logic.

.. Note::
    BL1_2 enables TF-M to be used on devices that contain no secure flash, though
    the ITS service will not be available. Other services that depend on ITS will
    not be available without modification.

*************************************
Secure boot / Image upgrade mechanism
*************************************

BL1_2 verifies the authenticity of the next stage image via asymmetric
cryptography, using a public key that is provisioned into OTP.

BL1_2 implements a rollback protection counter in OTP, which is used to prevent
the next stage image being downgraded to a less secure version.

BL1_2 has two image slots, which allows image upgrades to be performed. The
primary slot is always tried first. If verification of the primary image fails
(either due to an invalid signature or due to a version lower than the rollback
protection counter), the secondary slot is tried next, subject to the same
checks.
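
The acceptance rule described above can be written as a small predicate plus a
slot loop. This is a minimal sketch under stated assumptions: the ``Image``
type, its fields and ``select_slot`` are hypothetical names, and the signature
check is reduced to a boolean flag.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Image:
    version: int
    signature_ok: bool  # stand-in for the asymmetric signature check result


def is_acceptable(image: Image, rollback_counter: int) -> bool:
    """An image is rejected if its signature is invalid or its version is
    lower than the rollback protection counter."""
    return image.signature_ok and image.version >= rollback_counter


def select_slot(slots: Sequence[Image], rollback_counter: int) -> Optional[int]:
    """Return the index of the first acceptable slot (0 = primary), or None."""
    for index, image in enumerate(slots):
        if is_acceptable(image, rollback_counter):
            return index
    return None
```

With a counter value of 3, a primary image at version 2 is skipped even if its
signature is valid, and a correctly signed secondary image at version 4 is
selected instead.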

BL1_2 contains no image upgrade logic. For OTA updates of the next stage image
to be implemented, a later stage in the system must handle downloading new
images and placing them in the required slot.

********************************************
Post-Quantum signature verification in BL1_2
********************************************

BL1_2 uses a post-quantum asymmetric signature scheme to verify the next stage.
The scheme used is Leighton-Micali Signatures (henceforth LMS). LMS is
standardised in `NIST SP800-208
<https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-208.pdf>`_
and `IETF RFC8554 <https://datatracker.ietf.org/doc/html/rfc8554>`_.

LMS is a stateful hash-based signature scheme, meaning that:

1) It is constructed from a cryptographic hash function, in this case SHA256.

   - This function can be accelerated by existing hardware accelerators, which
     can make LMS verification relatively fast compared to other post-quantum
     signature schemes that cannot be accelerated in hardware yet.

2) Each private key can only be used to sign a certain number of images.

   - BL1_2 uses the SHA256_H10 parameter set, meaning each key can sign 1024
     images.

The main downside, the limited number of possible signatures, can be mitigated
by limiting the number of image upgrades that are done. As BL2 is often
currently not upgradable, it is not anticipated that this limit will be
problematic. If BL1 is being used to directly boot a TF-M/NS combined image, the
limit is more likely to be problematic, and care should be taken to examine the
likely number of updates.
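
The 1024 figure follows from the tree height of 10 in the parameter set
(2\ :sup:`10` one-time keys per LMS key). A rough way to examine the update
budget is sketched below; the function names and the release rate are
hypothetical, and every image signed with the key counts against the budget
whether or not it is ever deployed.

```python
def signatures_per_key(tree_height: int) -> int:
    """Number of one-time signatures available from one LMS key."""
    return 2 ** tree_height


def years_until_exhausted(tree_height: int, signed_images_per_year: int) -> float:
    """Years one key lasts at a given (assumed constant) signing rate."""
    return signatures_per_key(tree_height) / signed_images_per_year


# Hypothetical example: one signed release per month with the H10 parameter set.
budget_years = years_until_exhausted(tree_height=10, signed_images_per_year=12)
```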

LMS public keys are 32 bytes in size, and LMS signatures are 1912 bytes in size.
The signature size is larger than some asymmetric schemes, though most devices
should have enough space in flash to accommodate this.

The main upside of LMS, aside from the security against attacks involving
quantum computers, is that it is relatively simple to implement. The software
implementation that is used by BL1 is ~3KiB in size, which is considerably
smaller than the corresponding RSA implementation, which is at least 6.5KiB.
This simplicity of implementation is useful to avoid bugs.

BL1 will use MbedTLS as the source for its implementation of LMS.

.. Note::
    As of the time of writing, the LMS code is still in the process of being
    merged into MbedTLS, so BL1 currently does not support asymmetric
    verification of the next boot stage. Currently, the next boot stage is
    hash-locked, so cannot be upgraded.

    The GitHub pull request for LMS can be found `here
    <https://github.com/ARMmbed/mbedtls/pull/4826>`_.

*********************
BL1 boot measurements
*********************

BL1 outputs boot measurements in the same format as BL2, utilising the same
shared memory area. These measurements can then be included in the attestation
token, allowing the attestation of the version of the boot stage after BL1.

***********
BL1 tooling
***********

Image signing scripts are provided for BL1_1 and BL1_2. While the BL1_2 script
is named ``create_bl2_img.py``, it can be used for any next stage image.

- ``bl1/bl1_1/scripts/create_bl1_2_img.py``
- ``bl1/bl1_2/scripts/create_bl2_img.py``

These sign (and encrypt in the case of ``create_bl2_img.py``) a given image file
and append the required headers.

**************************
BL1 FI and DPA mitigations
**************************

BL1 reuses the FI countermeasures used in the TF-M runtime, which are found in
``lib/fih/``.

BL1 implements countermeasures against DPA, which are primarily targeted
towards being able to handle cryptographic material without leaking its
contents. The functions with these countermeasures are found in
``bl1/bl1_1/shared_lib/util.c``.

``bl_secure_memeql`` tests if memory regions have the same value:

- It avoids early exits, to prevent timing attacks.
- It compares chunks in random orders to prevent DPA trace correlation analysis.
- It inserts random delays to prevent DPA trace correlation analysis.
- It performs loop integrity checks.
- It uses FIH constructs.
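
The structure of such a comparison can be illustrated in Python. This is a
sketch of the general technique, not the code in ``util.c``: the function name
is hypothetical, it works byte by byte rather than in chunks, and it leaves out
the random delays and FIH constructs. Python also gives no real timing or power
guarantees, so only the control flow is meaningful here.

```python
import random


def secure_memeql_sketch(a: bytes, b: bytes) -> bool:
    """Compare two buffers with no early exit and a randomised visiting order."""
    if len(a) != len(b):
        return False  # lengths are assumed not to be secret

    order = list(range(len(a)))
    random.SystemRandom().shuffle(order)  # OS-backed randomness for the order

    diff = 0
    visited = 0
    for index in order:
        diff |= a[index] ^ b[index]  # accumulate differences, never break out
        visited += 1

    # Loop integrity check: a skipped iteration (for example due to a fault)
    # would leave the counter short of the buffer length.
    if visited != len(a):
        raise RuntimeError("loop integrity check failed")

    return diff == 0
```

Accumulating the XOR of every byte pair means the loop does the same amount of
work whether the first byte or the last byte differs.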

``bl_secure_memcpy`` copies memory regions:

- It copies chunks in random orders to prevent DPA trace correlation analysis.
- It inserts random delays to prevent DPA trace correlation analysis.
- It performs loop integrity checks.
- It uses FIH constructs.
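
A copy with a randomised order can be sketched the same way. As before, this is
an illustration of the idea under simplifying assumptions (hypothetical name,
per-byte rather than per-chunk, no random delays, no FIH constructs), not the
BL1 implementation.

```python
import random


def secure_memcpy_sketch(dst: bytearray, src: bytes) -> None:
    """Copy src into the start of dst, visiting the bytes in a random order."""
    if len(dst) < len(src):
        raise ValueError("destination buffer is too small")

    order = list(range(len(src)))
    random.SystemRandom().shuffle(order)  # OS-backed randomness for the order

    copied = 0
    for index in order:
        dst[index] = src[index]
        copied += 1

    # Loop integrity check: every byte must have been written exactly once.
    if copied != len(src):
        raise RuntimeError("loop integrity check failed")
```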

**************************
Using BL1 on new platforms
**************************

New platforms must define the following macros in their ``region_defs.h``:

- ``BL1_1_HEAP_SIZE``
- ``BL1_1_STACK_SIZE``
- ``BL1_2_HEAP_SIZE``
- ``BL1_2_STACK_SIZE``
- ``BL1_1_CODE_START``
- ``BL1_1_CODE_LIMIT``
- ``BL1_1_CODE_SIZE``
- ``BL1_2_CODE_START``
- ``BL1_2_CODE_LIMIT``
- ``BL1_2_CODE_SIZE``
- ``PROVISIONING_DATA_START``
- ``PROVISIONING_DATA_LIMIT``
- ``PROVISIONING_DATA_SIZE``
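
When choosing values for the ``*_START``, ``*_LIMIT`` and ``*_SIZE`` triples, it
can help to sanity-check the layout offline. The sketch below assumes the
convention ``LIMIT = START + SIZE - 1`` (the last valid address of the region),
which should be confirmed against an existing platform's ``region_defs.h``. All
addresses and sizes are hypothetical, as are the ``Region`` and
``find_overlaps`` names.

```python
from typing import Dict, List, NamedTuple, Tuple


class Region(NamedTuple):
    start: int
    size: int

    @property
    def limit(self) -> int:
        # Assumed convention: LIMIT is the last valid address of the region.
        return self.start + self.size - 1


def overlaps(a: Region, b: Region) -> bool:
    """True if the two address ranges share at least one byte."""
    return a.start <= b.limit and b.start <= a.limit


def find_overlaps(regions: Dict[str, Region]) -> List[Tuple[str, str]]:
    """Return every pair of region names whose address ranges overlap."""
    names = sorted(regions)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if overlaps(regions[a], regions[b])
    ]


# Hypothetical values, for illustration only.
layout = {
    "BL1_1_CODE": Region(start=0x00000000, size=0x4000),
    "BL1_2_CODE": Region(start=0x20000000, size=0x2000),
    "PROVISIONING_DATA": Region(start=0x20002000, size=0x2400),
}
```

Calling ``find_overlaps(layout)`` on this example returns an empty list, since
the regions are adjacent at most.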

The ``PROVISIONING_DATA_*`` defines are used to locate where the data to be
provisioned into OTP can be found. These are required as the provisioning bundle
needs to contain the entire BL1_2 image, usually >= 8KiB in size, which is too
large to be placed in the static data area as is done for all other dummy
provisioning data. On development platforms with reprogrammable ROM, this is
often placed in unused ROM. On production platforms, this should be located in
RAM and then filled with provisioning data. The format of the provisioning data
that should be located in the ``PROVISIONING_DATA_*`` region can be found in
``bl1/bl1_1/lib/provisioning.c`` in the struct
``bl1_assembly_and_test_provisioning_data_t``.

If the platform is storing BL1_2 in flash, it must set
``BL1_2_IMAGE_FLASH_OFFSET`` to the flash offset of the start of BL1_2.

The platform must also implement the HAL functions defined in the following
headers:

- ``bl1/bl1_1/shared_lib/interface/trng.h``
- ``bl1/bl1_1/shared_lib/interface/crypto.h``
- ``bl1/bl1_1/shared_lib/interface/otp.h``

If the platform integrates a CryptoCell-312, then it can reuse the existing
implementation.

***********
BL1 Testing
***********

New tests have been written to test both the HAL implementation and the
integration of those functions for verifying images. These tests are stored in
the ``tf-m-tests`` repository, under the ``test/bl1/`` directory, and further
subdivided into BL1_1 and BL1_2 tests.

--------------

*Copyright (c) 2022, Arm Limited. All rights reserved.*
