-

CVE-2025-38166

In the Linux kernel, the following vulnerability has been resolved:

bpf: fix ktls panic with sockmap

[ 2172.936997] ------------[ cut here ]------------
[ 2172.936999] kernel BUG at lib/iov_iter.c:629!
......
[ 2172.944996] PKRU: 55555554
[ 2172.945155] Call Trace:
[ 2172.945299]  <TASK>
[ 2172.945428]  ? die+0x36/0x90
[ 2172.945601]  ? do_trap+0xdd/0x100
[ 2172.945795]  ? iov_iter_revert+0x178/0x180
[ 2172.946031]  ? iov_iter_revert+0x178/0x180
[ 2172.946267]  ? do_error_trap+0x7d/0x110
[ 2172.946499]  ? iov_iter_revert+0x178/0x180
[ 2172.946736]  ? exc_invalid_op+0x50/0x70
[ 2172.946961]  ? iov_iter_revert+0x178/0x180
[ 2172.947197]  ? asm_exc_invalid_op+0x1a/0x20
[ 2172.947446]  ? iov_iter_revert+0x178/0x180
[ 2172.947683]  ? iov_iter_revert+0x5c/0x180
[ 2172.947913]  tls_sw_sendmsg_locked.isra.0+0x794/0x840
[ 2172.948206]  tls_sw_sendmsg+0x52/0x80
[ 2172.948420]  ? inet_sendmsg+0x1f/0x70
[ 2172.948634]  __sys_sendto+0x1cd/0x200
[ 2172.948848]  ? find_held_lock+0x2b/0x80
[ 2172.949072]  ? syscall_trace_enter+0x140/0x270
[ 2172.949330]  ? __lock_release.isra.0+0x5e/0x170
[ 2172.949595]  ? find_held_lock+0x2b/0x80
[ 2172.949817]  ? syscall_trace_enter+0x140/0x270
[ 2172.950211]  ? lockdep_hardirqs_on_prepare+0xda/0x190
[ 2172.950632]  ? ktime_get_coarse_real_ts64+0xc2/0xd0
[ 2172.951036]  __x64_sys_sendto+0x24/0x30
[ 2172.951382]  do_syscall_64+0x90/0x170
......

After calling bpf_exec_tx_verdict(), the size of msg_pl->sg may increase,
e.g., when the BPF program executes bpf_msg_push_data().

If the BPF program sets cork_bytes and sg.size is smaller than cork_bytes,
it will return -ENOSPC and attempt to roll back to the non-zero copy
logic. However, during rollback, msg->msg_iter is reset, but since
msg_pl->sg.size has been increased, subsequent executions will exceed the
actual size of msg_iter.
'''
iov_iter_revert(&msg->msg_iter, msg_pl->sg.size - orig_size);
'''

The changes in this commit are based on the following considerations:

1. When cork_bytes is set, rolling back to non-zero copy logic is
pointless and can directly go to zero-copy logic.

2. We can not calculate the correct number of bytes to revert msg_iter.

Assume the original data is "abcdefgh" (8 bytes), and after 3 pushes
by the BPF program, it becomes 11-byte data: "abc?de?fgh?".
Then, we set cork_bytes to 6, which means the first 6 bytes have been
processed, and the remaining 5 bytes "?fgh?" will be cached until the
length meets the cork_bytes requirement.

However, some data in "?fgh?" is not within 'sg->msg_iter'
(but in msg_pl instead), especially the data "?" we pushed.

So it doesn't seem as simple as just reverting through an offset of
msg_iter.

3. For non-TLS sockets in tcp_bpf_sendmsg, when a "cork" situation occurs,
the user-space send() doesn't return an error, and the returned length is
the same as the input length parameter, even if some data is cached.

Additionally, I saw that the current non-zero-copy logic for handling
corking is written as:
'''
line 1177
else if (ret != -EAGAIN) {
	if (ret == -ENOSPC)
		ret = 0;
	goto send_end;
'''

So it's ok to just return 'copied' without error when a "cork" situation
occurs.

Verknüpft mit AI von unstrukturierten Daten zu bestehenden CPE der NVD
Diese Information steht angemeldeten Benutzern zur Verfügung.
Daten sind bereitgestellt durch das CVE Programm von einer CVE Numbering Authority (CNA) (Unstrukturiert).
HerstellerLinux
Produkt Linux
Default Statusunaffected
Version < 328cac3f9f8ae394748485e769a527518a9137c8
Version d3b18ad31f93d0b6bae105c679018a1ba7daa9ca
Status affected
Version < 2e36a81d388ec9c3f78b6223f7eda2088cd40adb
Version d3b18ad31f93d0b6bae105c679018a1ba7daa9ca
Status affected
Version < 57fbbe29e86042bbaa31c1a30d2afa16c427e3f7
Version d3b18ad31f93d0b6bae105c679018a1ba7daa9ca
Status affected
Version < 603943f022a7fe5cc83ca7005faf34798fb7853f
Version d3b18ad31f93d0b6bae105c679018a1ba7daa9ca
Status affected
Version < 54a3ecaeeeae8176da8badbd7d72af1017032c39
Version d3b18ad31f93d0b6bae105c679018a1ba7daa9ca
Status affected
HerstellerLinux
Produkt Linux
Default Statusaffected
Version 4.20
Status affected
Version < 4.20
Version 0
Status unaffected
Version <= 6.1.*
Version 6.1.142
Status unaffected
Version <= 6.6.*
Version 6.6.94
Status unaffected
Version <= 6.12.*
Version 6.12.34
Status unaffected
Version <= 6.15.*
Version 6.15.3
Status unaffected
Version <= *
Version 6.16
Status unaffected
Zu dieser CVE wurde keine CISA KEV oder CERT.AT-Warnung gefunden.
EPSS Metriken
Typ Quelle Score Percentile
EPSS FIRST.org 0.03% 0.06
CVSS Metriken
Quelle Base Score Exploit Score Impact Score Vector String