Date of Award

Spring 2026

Access Type

Thesis - Open Access

Degree Name

Master of Science in Computer Science

Department

Electrical Engineering and Computer Science

Committee Chair

Omar Ochoa

Committee Chair Email

ochoao@erau.edu

Committee Advisor

Omar Ochoa

Committee Advisor Email

ochoao@erau.edu

First Committee Member

Alejandro Vargas

First Committee Member Email

vargasar@erau.edu

Second Committee Member

Laxima Niure Kandel

Second Committee Member Email

niurekal@erau.edu

College Dean

James W. Gregory

Abstract

While prompt engineering is pivotal for shaping Large Language Model (LLM) outputs, the impact of confidence framing on behavioral calibration remains underexplored. This study investigates how psychological framing, using techniques such as capability praise, role amplification, and doubt induction, affects linguistic tone, objective accuracy, and internal calibration. A 1,080-trial experimental matrix evaluated six diverse models across factual, logical, coding, and cybersecurity domains. Analysis using the Kruskal-Wallis H-test revealed highly significant behavioral shifts across all measured dimensions, providing strong evidence that the applied frames exert a substantial influence on model performance.
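The non-parametric analysis described above can be sketched as follows. This is a minimal illustration of the Kruskal-Wallis H-test applied to framing conditions, not the thesis's actual pipeline; the scores and group names below are synthetic placeholders.

```python
# Hedged sketch: Kruskal-Wallis H-test comparing a score metric across
# framing conditions. The values are synthetic, not the study's data;
# in the experiment, each group would hold per-trial scores gathered
# under one framing condition.
from scipy.stats import kruskal

# Synthetic per-trial accuracy scores for three hypothetical frames.
praise = [0.82, 0.75, 0.78, 0.71, 0.80, 0.77]   # capability praise
role_amp = [0.74, 0.70, 0.72, 0.69, 0.73, 0.71]  # role amplification
doubt = [0.61, 0.65, 0.58, 0.63, 0.60, 0.64]     # doubt induction

# The H-test ranks all observations jointly and asks whether the rank
# distributions differ across groups, without assuming normality.
h_stat, p_value = kruskal(praise, role_amp, doubt)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A small p-value (conventionally below 0.05) indicates that at least one framing condition shifts the score distribution, which is the kind of omnibus result the study reports before examining individual frames.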

The findings identify a distinct cognitive trade-off. While confidence-boosting language produced more assertive and fluent outputs, it significantly degraded factual reliability and internal calibration in larger proprietary models. However, a paradox was observed in small language models, where authoritative or more confident frames acted as a corrective focusing mechanism that improved calibration. In the cybersecurity domain, doubt-inducing frames successfully weaponized alignment guardrails, increasing aggregate refusal rates from 43.3% to 71.1%. These results suggest that linguistic confidence is a trailing indicator of internal alignment rather than a marker of latent truth. This work establishes that overconfident framing introduces critical vulnerabilities in factual and logical domains, while simultaneously offering a potential reliability boost for lightweight models in regulated enterprise environments.
