cross-posted from: https://programming.dev/post/37726760

  • Guardrails can be bypassed: With prompt injection, ChatGPT agents can be manipulated into breaking built-in policies and solving CAPTCHAs.
  • CAPTCHA defenses are weakening: The agent solved not only simple CAPTCHAs but also image-based ones, even adjusting its cursor to mimic human behavior.
  • Enterprise risk is real: Attackers could reframe real controls as “fake” to bypass them, underscoring the need for context integrity, memory hygiene, and continuous red teaming.
  • AmbiguousProps@lemmy.today

    I posted this elsewhere, but CAPTCHAs have always been used to train models, and they have always had to improve, even before LLMs blew up. This article was posted from a site with an .ai TLD, and seems to be doing the whole Sam Altman “I’m scared of AI, AGI is right around the corner! I certainly don’t have a vested interest in making you think it does more than it actually does” routine.