7.5 HIGH
- CVSS version (CVSS): 3.1
- Attack Vector (AV): Network (N)
- Attack Complexity (AC): Low (L)
- Privileges Required (PR): None (N)
- User Interaction (UI): None (N)
- Scope (S): Unchanged (U)
- Confidentiality (C): High (H)
- Integrity (I): None (N)
- Availability (A): None (N)
- Modified Attack Vector (MAV): Network (N)
- Modified Attack Complexity (MAC): Low (L)
- Modified Privileges Required (MPR): None (N)
- Modified User Interaction (MUI): None (N)
- Modified Confidentiality (MC): High (H)
- Modified Scope (MS): Unchanged (U)
- Modified Integrity (MI): None (N)
- Modified Availability (MA): None (N)
by @LeSuisse Activity log
- Created suggestion
- @LeSuisse accepted
- @LeSuisse published on GitHub
NLTK: URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read
NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1.
References
-
-
https://github.com/nltk/nltk/pull/3575 x_refsource_MISC
Affected products
- ==< 3.10.0-rc1
Matching in nixpkgs
pkgs.python313Packages.nltk
Natural Language Processing ToolKit
Package maintainers
-
@bengsparks Ben Sparks <benjamin.sparks@protonmail.com>