-
CVE-2025-40007
- EPSS 0.03%
- Veröffentlicht 20.10.2025 15:26:53
- Zuletzt bearbeitet 21.10.2025 19:31:25
- Quelle 416baaa9-dc9f-4396-8d5f-8c081f
- CVE-Watchlists
- Unerledigt
In the Linux kernel, the following vulnerability has been resolved:
netfs: fix reference leak
Commit 20d72b00ca81 ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1. The rationale was that the
requet's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()). That works most of the
time if all goes well.
However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.
This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch). This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation(). The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.
All of this is super-fragile code. Finding out which code paths will
lead to an eventual completion and which do not is hard to see:
- Some functions like netfs_create_write_req() allocate a request, but
will never submit any I/O.
- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
and then netfs_put_request(); however, netfs_unbuffered_read() can
also fail early before submitting the I/O request, therefore another
netfs_put_request() call must be added there.
A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.
For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile & obscure. But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.
I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further. Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE(). It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).
All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.Verknüpft mit AI von unstrukturierten Daten zu bestehenden CPE der NVD
Daten sind bereitgestellt durch das CVE Programm von einer CVE Numbering Authority (CNA) (Unstrukturiert).
HerstellerLinux
≫
Produkt
Linux
Default Statusunaffected
Version <
8df142e93098b4531fadb5dfcf93087649f570b3
Version
20d72b00ca814d748f5663484e5c53bb2bf37a3a
Status
affected
Version <
4d428dca252c858bfac691c31fa95d26cd008706
Version
20d72b00ca814d748f5663484e5c53bb2bf37a3a
Status
affected
Version
1a8360c2eed3b292ed654c2ac61b09de4a80e298
Status
affected
HerstellerLinux
≫
Produkt
Linux
Default Statusaffected
Version
6.16
Status
affected
Version <
6.16
Version
0
Status
unaffected
Version <=
6.16.*
Version
6.16.10
Status
unaffected
Version <=
*
Version
6.17
Status
unaffected
| Typ | Quelle | Score | Percentile |
|---|---|---|---|
| EPSS | FIRST.org | 0.03% | 0.064 |
| Quelle | Base Score | Exploit Score | Impact Score | Vector String |
|---|