TrustME

Fall 2024 CSCI 5541 NLP: Class Project - University of Minnesota

Phish and Chips

Jeremiah Johnson

Ronit Motwani

Jundong Zhang



Abstract

In this project, we present a novel approach to the widespread problem of phishing. We propose a method that analyzes the commonality and readability of the words in a given text to estimate the probability that the message is a phishing attempt, without compromising privacy. Our approach creates a foundation for further research and improvement.


Teaser Figure

The following figure conveys our workflow for a given input sentence. It displays the two key metrics that formed the basis of our analysis -- word commonality and sentence readability.


Introduction / Background / Motivation

What did you try to do? What problem did you try to solve? Articulate your objectives using absolutely no jargon.

The rise of online communication through emails and text messages has given scammers new ways to spread deceptive messages. Many existing methods for detecting these scams focus on spotting suspicious words, checking links, or using advanced computer models. However, these tools often miss how the overall writing style of a message can signal danger. We believe that phishing messages are often written using simpler language and more common words to trick a wider audience. Based on this idea, our research looks at how easy a message is to read and how frequently its words appear in everyday language. By analyzing these features, we aim to develop a more effective system for identifying phishing messages that current methods might overlook.

How is it done today, and what are the limits of current practice?

Current phishing detection research has focused extensively on machine learning models and structural analysis. Researchers such as Uddin and Sarker (2024) (reference) explored transformer-based models, while Çolhak et al. (2024) (reference) demonstrated that combining AI-driven text and HTML structure analysis improves detection. However, these models often rely solely on surface-level features, such as explicit phishing keywords and malicious link detection. This leaves a critical gap in detecting messages designed to appear genuine through subtle text manipulation. Our research proposes a more nuanced approach by incorporating lexical patterns, emphasizing readability and word commonality as previously unexplored dimensions in phishing detection.

Who cares? If you are successful, what difference will it make?

Our work has significant implications for advancing phishing detection systems by offering an innovative lexicographical analysis layer. Cybersecurity professionals, developers of secure communication systems, and email service providers could integrate our proposed approach into their existing frameworks. The potential to detect phishing messages based on readability and linguistic patterns adds a new dimension to message screening processes. By focusing on how messages are constructed rather than solely on what they contain, our research highlights the untapped potential of text-based analysis in strengthening online security and mitigating the risks posed by phishing attacks while protecting message confidentiality.


Approach

What did you do exactly? How did you solve the problem? Why did you think it would be successful? Is anything new in your approach?

Our initial approach aimed to develop a contextual understanding of phishing messages by leveraging word relationship data from external APIs. The idea was to extract contextual associations between words, enabling a deeper semantic analysis of message content. However, this approach faced significant limitations due to insufficient relational data from the available APIs, such as ConceptNet and DBpedia, which restricted our ability to build a comprehensive knowledge-based model. Recognizing these constraints, we revisited the core intent behind phishing messages -- to deceive as many users as possible through easily digestible and widely understood language (reference). This reflection led us to formulate a new theory: phishing messages are likely crafted using simpler sentence structures and more commonly used words to maximize their reach. Shifting our focus, we designed a detection system that combines readability scoring, which evaluates how easily a message can be understood, with word commonality analysis based on frequency metrics found in WordsAPI. This revised approach enabled us to explore phishing detection through a unique lexical lens, grounded in the fundamental principles of language accessibility and deceptive communication tactics.
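The combined scoring described above can be sketched as follows. This is a minimal illustration, not our exact implementation: the `WORD_FREQUENCIES` table is a hypothetical stand-in for the per-word frequency values our pipeline retrieves from WordsAPI, the syllable counter is a rough vowel-group heuristic, and the equal weighting of the two components is an assumption.

```python
import re

# Hypothetical stand-in for WordsAPI frequency data (per-million counts).
# In the real pipeline these values come from WordsAPI lookups.
WORD_FREQUENCIES = {
    "your": 9000.0, "account": 1200.0, "has": 8000.0, "been": 7000.0,
    "suspended": 15.0, "click": 300.0, "here": 2500.0, "to": 26000.0,
    "verify": 40.0, "now": 3000.0,
}
MAX_FREQUENCY = 26000.0  # normalizes commonality onto [0, 1]

def count_syllables(word: str) -> int:
    """Rough syllable count: runs of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

def commonality_score(text: str) -> float:
    """Mean normalized word frequency; words missing from the table count as 0."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    freqs = [WORD_FREQUENCIES.get(w, 0.0) / MAX_FREQUENCY for w in words]
    return sum(freqs) / max(1, len(freqs))

def trustme_score(text: str) -> float:
    """Combine readability (clamped to [0, 1]) and commonality into one score."""
    readability = min(max(flesch_reading_ease(text) / 100.0, 0.0), 1.0)
    return 0.5 * readability + 0.5 * commonality_score(text)
```

A higher combined score indicates simpler, more common language, which under our theory is more characteristic of phishing.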

What problems did you anticipate? What problems did you encounter? Did the very first thing you tried work?

We ran into limitations with our API: we were allotted only twenty-five thousand requests per day on our paid WordsAPI plan. This forced us both to limit the sample size of the messages we used and to cap the number of keywords we requested per message. Running our model multiple times while tweaking the code also ate heavily into this quota. Additionally, parsing each message and extracting its keywords made it difficult to run a robust sample size without our application timing out and losing progress. By contrast, since calculating the readability of a given text is relatively straightforward with the Python package nltk, we did not anticipate encountering challenges during that part of the implementation, and we indeed obtained the readability scores smoothly.
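One practical way to stretch a daily request quota like the one above is to cache frequency lookups so that a word repeated across messages costs only one API call. The sketch below illustrates the idea with a hypothetical `fetch_frequency_from_api` stub standing in for the real, quota-counted HTTP request; the call counter exists only to make the caching effect visible.

```python
import functools

# Hypothetical stand-in for a WordsAPI frequency lookup; the real call
# would be an HTTP request counted against the daily quota.
def fetch_frequency_from_api(word: str) -> float:
    return {"verify": 40.0, "here": 2500.0}.get(word, 0.0)

API_CALLS = {"count": 0}  # tracks how many "requests" were actually made

@functools.lru_cache(maxsize=None)
def word_frequency(word: str) -> float:
    """Cache lookups so repeated words cost one request each across the run."""
    API_CALLS["count"] += 1
    return fetch_frequency_from_api(word)
```

With this in place, re-running the model on overlapping vocabulary no longer multiplies the request count, though a cache alone does not solve per-message keyword caps or timeouts.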


Results

How did you measure success? What experiments were used? What were the results, both quantitative and qualitative? Did you succeed? Did you fail? Why?

We measured success simply: did our lexical analysis (combining commonality and readability scores) correctly determine whether a message was phishing, judged against the message's ground-truth label? We wanted to answer whether phishing messages use more common language than non-phishing messages, and whether we could detect that difference. Contrary to our prediction, however, our model was not able to reliably distinguish the word choices and readability of phishing messages from those of non-phishing messages. We initially tested with a threshold of 0.5, since the score was normalized on a scale of zero to one, but the model leaned heavily toward classifying messages as non-phishing. When we lowered the threshold to around 0.33, the classifications became evenly split between phishing and non-phishing, but accuracy decreased as a result, since non-phishing messages began to be classified as phishing. Ultimately we determined that our results were inconclusive: the metrics we used for classification were insufficient to create a distinction on our tested dataset.
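The thresholding experiment can be illustrated with the following sketch. The scores and labels here are made-up values chosen only to show the mechanics: lowering the threshold flags more messages as phishing, but can trade missed phishing messages for false positives without improving accuracy.

```python
def classify(score: float, threshold: float = 0.5) -> str:
    """Label a normalized [0, 1] score; higher means more phishing-like."""
    return "phishing" if score >= threshold else "non-phishing"

def accuracy(scores, labels, threshold):
    """Fraction of predictions matching the ground-truth labels."""
    preds = [classify(s, threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Illustrative (made-up) scores and ground-truth labels:
scores = [0.40, 0.34, 0.30, 0.45, 0.36, 0.28]
labels = ["phishing", "phishing", "phishing",
          "non-phishing", "non-phishing", "non-phishing"]

acc_default = accuracy(scores, labels, 0.5)   # everything below 0.5 -> non-phishing
acc_lowered = accuracy(scores, labels, 0.33)  # more flagged, but false positives too
```

In this toy data, the 0.5 threshold predicts non-phishing for every message, while 0.33 splits the predictions yet gains no accuracy, mirroring the inconclusive pattern we observed.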




Conclusion and Future Work

How easily are your results able to be reproduced by others? Did your dataset or annotation affect other people's choice of research or development projects to undertake? Does your work have potential harm or risk to our society? What kinds? If so, how can you address them? What limitations does your model have? How can you extend your work for future research?


Our research framework is designed with replicability in mind, leveraging widely available tools, publicly accessible datasets, and APIs. The model's implementation relies on standard machine learning libraries, including Hugging Face's Transformers for RoBERTa fine-tuning, along with common readability scoring algorithms and word frequency retrieval. Our methodology is clearly defined, with step-by-step processes for data preparation, feature extraction, and score aggregation. Future researchers can reproduce our results by following these procedures, adjusting parameters, or applying the model to new datasets in pursuit of better scores.

Our research utilized a dataset consisting of general messages, covering a wide range of topics typically encountered in everyday online communication. While this dataset provided valuable insights into common linguistic patterns in phishing and benign messages, it limited our ability to explore more specialized phishing tactics aimed at specific industries or professional contexts. A more targeted dataset focusing on business-related messages, such as corporate emails or financial correspondence, could enhance the applicability of our model by exposing it to niche vocabulary and context-specific phishing strategies. Future research could benefit from developing or sourcing such domain-specific datasets, enabling a deeper understanding of how lexical features manifest in professionally oriented phishing attempts. This adaptation could inspire new lines of research in business email compromise (BEC) detection and tailored phishing prevention systems.

While our model processes textual data by extracting keywords and calculating scores based on word commonality and readability formulas, it inherently carries a potential privacy risk. Analyzing message content could, in theory, expose sensitive information if messages were stored or transmitted insecurely. However, our approach minimizes this risk by focusing solely on lexical features rather than the message's actual content. The model processes the text locally, extracts relevant scores, and discards the original message, ensuring that sensitive information is neither retained nor shared. Since we operate on abstracted numerical representations rather than raw text, the possibility of data leakage is significantly reduced. To further mitigate privacy concerns, future implementations could enhance security by applying encryption protocols, processing messages entirely on-device, or anonymizing data before analysis. This ensures that our research remains both technically sound and ethically responsible.

While our model offers a novel approach to phishing detection through lexical analysis, it faces certain limitations due to the evolving nature of phishing messages. Historically, phishing emails were often poorly constructed, featuring grammatical errors, awkward phrasing, and low lexical sophistication -- traits our model was designed to detect through readability and word commonality scoring. However, the rise of advanced LLMs capable of generating highly coherent and convincing phishing messages challenges the effectiveness of our approach. As these models become more accessible, phishing attempts may increasingly exhibit professional writing quality, making lexical analysis alone insufficient. To address this, future research could integrate AI-generated text checkers into our TrustME scoring system, adding an adaptive layer that assesses the likelihood of a message being machine-generated. This would strengthen the detection process by balancing lexical evaluation with modern AI-driven checks, ensuring that the scoring system remains robust against more sophisticated phishing tactics.