by Lucian Constantin

CSO Senior Writer

Critical flaw in AI testing framework MLflow can lead to server and data compromise

News Analysis

Mar 24, 2023 | 5 mins

Machine Learning, Vulnerabilities

The now-patched vulnerability in the popular MLflow platform could expose AI and machine-learning models stored in the cloud and allow for lateral movement.

Credit: VAKS-Stock Agency / Shutterstock

MLflow, an open-source framework that’s used by many organizations to manage their machine-learning tests and record results, received a patch for a critical vulnerability that could allow attackers to extract sensitive information, such as SSH keys and AWS credentials, from servers. The attacks can be executed remotely without authentication because MLflow doesn’t implement authentication by default and an increasing number of MLflow deployments are directly exposed to the internet.

“Basically, every organization that uses this tool is at risk of losing their AI models, having an internal server compromised, and having their AWS account compromised,” Dan McInerney, a senior security engineer with cybersecurity startup Protect AI, told CSO. “It’s pretty brutal.”

McInerney found the vulnerability and reported it privately to the MLflow project. It was fixed in version 2.2.1 of the framework, which was released three weeks ago, but the release notes don’t mention any security fix.
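Because the release notes are silent on the fix, the version number is the only quick signal of exposure. The following is a minimal sketch of such a check, assuming MLflow was installed as the Python package `mlflow`; the helper names are hypothetical, and the comparison ignores pre-release suffixes for simplicity.

```python
from importlib import metadata

# First release containing the fix, according to the article.
PATCHED = (2, 2, 1)


def parse_version(text):
    """Turn '2.2.1' or '2.2.1.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def is_patched(version_text):
    """True if the version is 2.2.1 or later (pre-release tags are ignored)."""
    return parse_version(version_text) >= PATCHED


def installed_mlflow_version():
    """Return the installed MLflow version string, or None if not installed."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_mlflow_version()
    if version is None:
        print("MLflow is not installed in this environment")
    elif is_patched(version):
        print(f"MLflow {version} includes the fix")
    else:
        print(f"MLflow {version} predates 2.2.1 and should be upgraded")
```

Running the script inside the environment that hosts the MLflow server reports whether that installation predates the patched release.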

Local and remote file inclusion via path traversal

MLflow is written in Python and is designed to automate machine-learning workflows. It has multiple components that allow users to deploy models from various ML libraries; manage their lifecycle including model versioning, stage transitions and annotations; track experiments to record and compare parameters and results; and even package ML code in a reproducible form to share with other data scientists. MLflow can be controlled through a REST API and command-line interface.

All these capabilities make the framework a valuable tool for any organization experimenting with machine learning. Scans using the Shodan search engine reinforce this, showing a steady increase of publicly exposed MLflow instances over the past two years, with the current count sitting at over 800. However, it’s safe to assume that many more MLflow deployments exist inside internal networks and could be reachable by attackers who gain access to those networks.

“We reached out to our contacts at various Fortune 500’s [and] they’ve all confirmed they’re using MLflow internally for their AI engineering workflow,” McInerney tells CSO.

The vulnerability found by McInerney is tracked as CVE-2023-1177 and is rated 10 (critical) on the CVSS scale. He describes it as local and remote file inclusion (LFI/RFI) via the API, where remote, unauthenticated attackers can send specially crafted requests to the API endpoint that force MLflow to expose the contents of any readable file on the server.

For example, the attacker can include JSON in the request and set the source parameter to the path of any file on the server, and the application will return that file’s contents. One such target is the SSH keys, which are usually stored in the .ssh directory inside the local user’s home directory. The attacker does not need to know the user’s home directory in advance, because they can first read the /etc/passwd file, which is available on every Linux system and lists all users and their home directories. The other parameters sent in the malicious request can be arbitrary; they do not need to refer to anything that exists.
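The underlying weakness is that a client-supplied path is used without confirming it stays inside the directory the server intends to expose. The sketch below illustrates the general kind of containment check that blocks this class of bug. It is not MLflow’s actual fix; the function name and the example root directory are hypothetical.

```python
import os


def is_inside(root, requested):
    """Return True if `requested` resolves to a location inside `root`.

    Both paths are resolved first, so '..' segments, absolute paths and
    symlinks cannot be used to step outside the allowed directory.
    """
    root_real = os.path.realpath(root)
    # os.path.join discards root_real when `requested` is absolute,
    # which is why the containment comparison below is still needed.
    target_real = os.path.realpath(os.path.join(root_real, requested))
    return os.path.commonpath([root_real, target_real]) == root_real


if __name__ == "__main__":
    artifact_root = "/srv/example-artifacts"  # hypothetical artifact directory
    for candidate in ("models/run1/model.pkl", "../../etc/passwd", "/etc/passwd"):
        verdict = "allow" if is_inside(artifact_root, candidate) else "reject"
        print(f"{candidate}: {verdict}")
```

A server applying a check like this would refuse a source value such as /etc/passwd instead of returning the file.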

What makes the vulnerability worse is that most organizations configure their MLflow instances to use Amazon AWS S3 for storing their models and other sensitive data. According to Protect AI’s review of the configuration of the publicly available MLflow instances, seven out of ten used AWS S3. This means that attackers can set the source parameter in their JSON request to be the s3:// URL of the bucket used by the instance to steal models remotely.

It also means that AWS credentials are likely stored locally on the MLflow server so the framework can access S3 buckets, and these credentials are typically stored in a file at ~/.aws/credentials under the user’s home directory. Exposure of AWS credentials can be a serious breach because, depending on the IAM policy, it can give attackers lateral movement capabilities into an organization’s AWS infrastructure.
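Since the flaw exposes any file the server process can read, one way to gauge the impact is to list which secrets the MLflow service account can actually open. The following is a minimal sketch of such an audit. The key file names are common defaults and an assumption, not an exhaustive list, and the function name is hypothetical.

```python
import os

# Locations named in the article, plus common default SSH key file names.
# The key file names are an assumption; adjust them for the host being audited.
SENSITIVE_RELATIVE_PATHS = [
    ".aws/credentials",
    ".ssh/id_rsa",
    ".ssh/id_ed25519",
]


def readable_secrets(home_dir):
    """List sensitive files under home_dir that the current process can read."""
    found = []
    for relative in SENSITIVE_RELATIVE_PATHS:
        path = os.path.join(home_dir, relative)
        if os.path.isfile(path) and os.access(path, os.R_OK):
            found.append(path)
    return found


if __name__ == "__main__":
    # Run this as the same user that runs the MLflow server.
    for path in readable_secrets(os.path.expanduser("~")):
        print(f"readable by this account: {path}")
```

Any path it prints is a file that an unpatched, unauthenticated instance could have handed to a remote attacker, so the corresponding keys are candidates for rotation.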

Lack of default authentication leads to insecure deployments

Requiring authentication for accessing the API endpoint would prevent exploitation of this flaw, but MLflow does not implement any authentication mechanism. Basic authentication with a static username and password can be added by deploying a proxy server like nginx in front of the MLflow server and forcing authentication through that. Unfortunately, almost none of the publicly exposed instances use such a setup.
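To make concrete what such a proxy checks on each request, here is a minimal sketch of HTTP Basic authentication against a single static username and password. It illustrates the mechanism only and is not nginx’s implementation; the function names are hypothetical.

```python
import base64
import hmac


def basic_auth_header(username, password):
    """Build the Authorization header value a client sends for Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def is_authorized(header_value, username, password):
    """Check a received Authorization header against the static credentials."""
    if not header_value:
        return False  # no header at all: the proxy would answer 401
    expected = basic_auth_header(username, password)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        header_value.encode("utf-8"), expected.encode("utf-8")
    )


if __name__ == "__main__":
    header = basic_auth_header("Aladdin", "open sesame")
    print(header)
    print(is_authorized(header, "Aladdin", "open sesame"))
    print(is_authorized("Basic bogus", "Aladdin", "open sesame"))
```

Basic authentication only base64-encodes the credentials, so it protects them in transit only when the proxy also terminates TLS.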

“I can hardly call this a safe deployment of the tool, but at the very least, the safest deployment of MLflow as it stands currently is to keep it on an internal network, in a network segment that is partitioned away from all users except those who need to use it, and put behind an nginx proxy with basic authentication,” McInerney says. “This still doesn’t prevent any user with access to the server from downloading other users’ models and artifacts, but at the very least it limits the exposure. Exposing it on a public internet facing server assumes that absolutely nothing stored on the server or remote artifact store server contains sensitive data.”

Lucian Constantin writes about information security, privacy, and data protection for CSO.
