ORCID Number
0000-0003-0032-8201
Date of Award
Summer 7-10-2025
Access Type
Thesis - Open Access
Degree Name
Master of Science in Computer Science
Department
Electrical Engineering and Computer Science
Committee Chair
Shafika Showkat Moni
Committee Chair Email
monis@erau.edu
First Committee Member
Omar Ochoa
First Committee Member Email
Omar.Ochoa@erau.edu
Second Committee Member
Laxima Niure Kandel
Second Committee Member Email
Laxima.NiureKandel@erau.edu
College Dean
James W. Gregory
Abstract
Large language models (LLMs) already power mission-critical tasks such as command-and-control chat, satellite ground-station automation, military analytics, and cyber-defense. Because most of these services are offered through application programming interfaces (APIs) that still expose full or top-k logits and lack mature safeguards, they present a serious, often overlooked attack surface. Earlier work has shown how to rebuild the output projection layer or distill surface behavior, but no attack has produced a deployable clone within a tight query budget. In this thesis, we address this problem by presenting a practical pipeline for cloning LLMs under constrained settings. The approach first estimates the output projection matrix by collecting fewer than 10,000 top-k logit responses from the target model and analyzing them with singular value decomposition (SVD). It then trains smaller student models of varying transformer depth on publicly available data to reproduce the teacher model’s internal representations and outputs. Experiments show that a 6-layer student matches 97.6% of the teacher model’s hidden-state geometry, with only a 7.31% increase in perplexity and a negative log-likelihood (NLL) of 7.58. A lighter 4-layer variant runs 17.1% faster and uses 18.1% fewer parameters while maintaining strong performance. The entire attack completes in under 24 GPU hours without triggering API rate limits. These findings show that even adversaries with limited budgets can recreate the knowledge inside modern LLMs, highlighting the urgent need for stronger API protections and safer deployment practices.
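To illustrate the SVD step the abstract describes, the following is a minimal Python/NumPy sketch on synthetic data; the matrix sizes, variable names, and rank threshold are assumptions chosen for demonstration, not the thesis's actual pipeline (which works from fewer than 10,000 top-k API responses rather than fully observed synthetic logits).

import numpy as np

rng = np.random.default_rng(0)
n_queries, vocab_size, hidden_dim = 2000, 4096, 256  # illustrative sizes

# Simulate the teacher's final layer: each logit vector is h @ W.T,
# so the collected logit matrix has rank equal to the hidden size.
H = rng.normal(size=(n_queries, hidden_dim))   # hidden states per query
W = rng.normal(size=(vocab_size, hidden_dim))  # output projection matrix
logits = H @ W.T                               # (n_queries, vocab_size)

# SVD of the collected logits: the number of significant singular
# values reveals the hidden dimension, and the leading right-singular
# vectors span W's column space up to an invertible transform.
_, S, Vt = np.linalg.svd(logits, full_matrices=False)
rank = int((S > S[0] * 1e-8).sum())            # heuristic rank cutoff
W_est = Vt[:rank].T                            # (vocab_size, rank)

print(rank)  # prints 256 for this synthetic teacher

Recovering the projection only up to an invertible linear transform is the expected outcome for this class of extraction attack; the transform cancels when the estimate is used to supervise a student model, as in the distillation stage the abstract describes.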
Scholarly Commons Citation
Gharami, Kanchon, "In the Shadow of Prompts: Adversarial Attacks and Model Cloning in Large Language Models" (2025). Doctoral Dissertations and Master's Theses. 909.
https://commons.erau.edu/edt/909
Included in
Artificial Intelligence and Robotics Commons, Electrical and Computer Engineering Commons, Information Security Commons