Grow block size check in zfs_rangelock_cb() needs improvement

### System information

Type | Version/Name
 --- | ---
Distribution Name	| Ubuntu
Distribution Version	| 24.04
Kernel Version	| 16.14.0
Architecture	| x86_64
OpenZFS Version	| zfs-2.4.99


### Describe the problem you're observing
In zfs_rangelock_cb() there is a test to determine if the file block size could increase. This check is necessary because in order to safely increase the file block size, the entire file range [0..UINT64_MAX] needs to be locked for writing. No other readers or writers may be active in the file. The current code checks to see if the write is beyond the current block size, and if the block size is less than the current max block size for the file system. If these conditions are true, the range lock is adjusted to cover the total range.

The current check:
```
        /*
         * If we need to grow the block size then lock the whole file range.
         */
        uint64_t end_size = MAX(zp->z_size, new->lr_offset + new->lr_length);
        if (end_size > zp->z_blksz && (!ISP2(zp->z_blksz) ||
            zp->z_blksz < ZTOZSB(zp)->z_max_blksz)) {
                new->lr_offset = 0;
                new->lr_length = UINT64_MAX;
        }
```
This check is not strong enough. If the file already has more than one block, then the current block size cannot change. So if the file block size is less than the maximum block size supported by the file system, and there are multiple blocks in the file, the current code will almost always extend the rangelock to its maximum size. This means that all writes become serialized and even reads are slowed as they will more often contend with writes.

Suggested fix:
```
        /*
         * If we might grow the block size then lock the whole file range.
         * NB: this test should match the check in zfs_grow_blocksize
         */
        uint64_t end_size = MAX(zp->z_size, new->lr_offset + new->lr_length);
        if (zp->z_size <= zp->z_blksz && end_size > zp->z_blksz &&
            (!ISP2(zp->z_blksz) || zp->z_blksz < ZTOZSB(zp)->z_max_blksz)) {
                new->lr_offset = 0;
                new->lr_length = UINT64_MAX;
        }
```

### Describe how to reproduce the problem
While it is probably rare for this situation to occur, it is easy to reproduce:
1. Set the file system block size to something small (e.g. 32k): `zfs set recordsize=32k test`
2. Create a file larger than the block size: `dd if=/dev/urandom of=/test/testfile bs=32k count=1000`
3. Set the file system block size to something larger: `zfs set recordsize=128k test`

At this point almost all writes to `testfile` will trigger the over-locking and IO performance will suffer. Unfortunately, there is no easy way to observe what is happening (beyond the performance degradation) as there is no observability in the code for range locking. This issue was discovered via kernel debugging.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Grow block size check in zfs_rangelock_cb() needs improvement #18046

System information

Describe the problem you're observing

Describe how to reproduce the problem

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Type	Version/Name
Distribution Name	Ubuntu
Distribution Version	24.04
Kernel Version	16.14.0
Architecture	x86_64
OpenZFS Version	zfs-2.4.99

Grow block size check in zfs_rangelock_cb() needs improvement #18046

Description

System information

Describe the problem you're observing

Describe how to reproduce the problem

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions