浪人
DE | EN
TryHackMe - BankGPT Write-Up
tech

TryHackMe - BankGPT Write-Up

Back to Blog
3 min read

🏦 TryHackMe - BankGPT (Room Walkthrough & Learning Notes)

Room: https://tryhackme.com/room/bankgpt
Topic: Prompt Injection • LLM Security • Context-Based Manipulation

Info: This write-up describes exclusively the approach and learning content. It contains no flags, passwords, hashes, or confidential values in accordance with TryHackMe guidelines.


🔍 Challenge Overview

The BankGPT room provides a simulated banking chatbot powered by a Large Language Model.
The bot is designed to:

  • follow internal security policies
  • not disclose confidential information
  • mask or redact sensitive values
  • explain processes rather than output concrete data

The task is to investigate how prompt injection techniques and contextual formulations affect these protection mechanisms.

This is not a classic technical exploit, but rather a test of:

  • argumentation and reasoning
  • context crafting
  • social influence
  • weaknesses in LLM behavior

🪟 The User Interface:

Bankgpt

After starting the room and navigating to the page via the link, you find yourself on a typical minimal chatbot UI.


🎯 Core Idea of the Challenge

Direct requests for confidential data are consistently rejected by the bot.

However, when the same requests are embedded in a context such as:

  • compliance audits
  • internal audit processes
  • classification or verification workflows

the model responds increasingly openly - sometimes with additional internal descriptions or metadata.

Typical interaction flow:

  1. The model provides secure or redacted placeholder values
  2. It mentions internal designations or classifications
  3. It provides additional context information with polite requests
  4. Certain phrasings unexpectedly weaken protection mechanisms

This demonstrates:

The guidelines are implemented at the conversation level - not based on a genuine assessment of sensitivity.

This creates potential attack surfaces.


🧩 Central Methods & Insights

The challenge demonstrates several real prompt injection patterns:


🟡 Authority and Process Framing

Formulations that resemble internal communication build more trust:

  • audit inquiries
  • system or compliance validation
  • verification logic rather than data collection

The model more frequently interprets such prompts as legitimate workflows.


🟡 Redaction & Classification Queries

Instead of requesting sensitive data, the bot can be made to:

  • explain redactions or placeholders
  • describe internal data types or designations
  • explain storage or reference concepts

While only metadata is discussed, even this can be security-relevant.


🟡 “Debug / Integrity Check / Verification”

In development or support contexts, LLMs often respond more freely to:

  • check and confirmation instructions
  • repetitions or echoes of values
  • simulated log or protocol outputs

This highlights risks in:

  • support chatbots
  • internal tools
  • DevOps automations

when outputs are not additionally secured.


🟡 Format and Demonstration Vulnerabilities

A particularly important learning point:

When a model is only supposed to “show a format,” it can still output real values.

Format validation is often not interpreted as data disclosure.

This has immediate relevance for:

  • AI-driven business processes
  • assistant systems in enterprises
  • security-critical areas

🧠 Security Lessons from the Challenge

BankGPT clearly shows:

  • conversational guidelines alone provide no real protection
  • LLMs evaluate phrasing, not security risks
  • harmless-sounding process language can lead to data leaks
  • disclosing metadata can already be critical
  • prompt injection remains a real, current threat

Secure systems require protection at: ✔ Data access level
✔ System architecture
✔ Permission models
not just in chat response logic.


🏁 Conclusion

BankGPT is an excellent practical introduction to:

  • adversarial prompts
  • analysis of conversational attack surfaces
  • LLM abuse scenarios
  • modern security questions around AI systems

The challenge requires:

  • critical thinking
  • communication awareness
  • understanding of human and technical security factors

and very clearly demonstrates what risks can arise when AI systems are integrated into operational workflows.

👉 Room link for reference:
https://tryhackme.com/room/bankgpt