OBSCURA: A Dark Web Command Line Search Tool

Faculty Mentor Name

Jesse Chiu, Jon Haass

Format Preference

Poster

Abstract

Dark web activity is an increasingly relevant domain within cybersecurity and intelligence operations, yet it remains underrepresented in Embry-Riddle’s extracurricular activities due to operational risk, ethical concerns, and technical barriers. To address this gap, a locally executed analysis system was developed to enable controlled, offline examination of archived dark web content for educational and research purposes.

The project resulted in the design and implementation of a modular command-line interface (CLI) tool that performs keyword-based searching across a locally stored corpus of dark web pages retrieved using cURL. Archived pages are stored in HTML format and processed through a lightweight indexing pipeline that parses page content, normalizes text, and enables efficient keyword matching without requiring live network access. This architecture eliminates repeated interaction with dark web services while preserving analytical fidelity.
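The abstract does not include implementation details, but the indexing pipeline it describes (parse archived HTML, normalize text, build a keyword index over locally stored pages) could be sketched roughly as follows. All function and variable names here are hypothetical illustrations, not the project's actual code:

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text from an archived HTML page, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def normalize(text):
    """Lowercase the text and reduce it to simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(pages):
    """pages: {doc_id: raw_html}. Returns an inverted index: token -> set of doc_ids."""
    index = defaultdict(set)
    for doc_id, html in pages.items():
        parser = TextExtractor()
        parser.feed(html)
        for token in normalize(" ".join(parser.chunks)):
            index[token].add(doc_id)
    return index

# Stand-in corpus: in the described system these would be files fetched once
# via cURL and stored locally, so no live network access is needed at search time.
pages = {
    "page1.html": "<html><body><p>Hidden marketplace listing</p></body></html>",
    "page2.html": "<html><body><p>Forum post about marketplace rules</p></body></html>",
}
index = build_index(pages)
```

Because the index is built entirely from the archived files, every later query runs offline, which is what lets the architecture avoid repeated contact with dark web services.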

The system returns ranked search results with contextual excerpts, providing a streamlined workflow for exploratory analysis. A complementary graphical user interface was developed to improve usability for non-expert users while retaining CLI functionality for advanced users. By isolating data acquisition and analysis behind a controlled local interface, the tool significantly reduces exposure risk while supporting repeatable, auditable analysis workflows relevant to cybersecurity education, including open-source intelligence (OSINT) research.
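A minimal sketch of the ranked-results-with-excerpts behavior described above, assuming a simple term-frequency score and a fixed-width excerpt window around the first match (the actual ranking scheme is not specified in the abstract, and all names here are illustrative):

```python
def search(query, docs):
    """Rank documents by raw term frequency of the query terms.

    docs: {doc_id: extracted_text}. Returns a list of
    (doc_id, score, excerpt) tuples, highest score first.
    """
    terms = [t.lower() for t in query.split()]
    results = []
    for doc_id, text in docs.items():
        lowered = text.lower()
        score = sum(lowered.count(t) for t in terms)
        if score == 0:
            continue  # document matches no query term
        # Build a short contextual excerpt around the first matching term.
        pos = min((lowered.find(t) for t in terms if t in lowered), default=0)
        start = max(0, pos - 30)
        excerpt = text[start:pos + 50].strip()
        results.append((doc_id, score, excerpt))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Stand-in documents: in practice these would be the normalized texts
# produced by the indexing pipeline over the archived corpus.
docs = {
    "a.html": "Vendors discuss escrow and escrow disputes on the forum.",
    "b.html": "A single mention of escrow appears here.",
}
hits = search("escrow", docs)
```

Returning an excerpt alongside each score lets an analyst judge relevance without opening the archived page, which supports the streamlined exploratory workflow the abstract describes.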
