SecureAGI

Security is the core towards AGI

Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs

May 13, 2025

A comprehensive study analyzing over 1,400 adversarial prompts to assess the susceptibility of leading LLMs to prompt injection and jailbreak attacks, proposing layered defense strategies.

Yingjing Lu

More Stories

A Systematic Evaluation of Jailbreak Risks in Large Language Models

May 1, 2025

This study introduces a comprehensive benchmark to assess the vulnerability of LLMs to jailbreak attacks, revealing their persistent weaknesses despite increasing safety efforts.

Yingjing Lu

Guided by the Machine: A Framework for Mechanistic Interpretability in Language Models

April 12, 2025

A novel framework for interpretability that leverages guiding signals to reverse-engineer how transformer models represent and compute high-level behaviors.

Yingjing Lu

Privacy Auditing of Large Language Models: Advancements in Canary Design

March 25, 2025

This article explores innovative methodologies for enhancing privacy audits in large language models through improved canary generation techniques.

Yingjing Lu

LLM Security: A Comprehensive Survey of Vulnerabilities, Attacks, and Defenses

February 21, 2025

An in-depth look at the evolving security landscape of Large Language Models, highlighting key vulnerabilities, attack vectors, and defense mechanisms.

Yingjing Lu

DeepSeek V3 Deep Dive: Training Methodologies and Their Impact

January 5, 2025

This article takes a deep dive into the training methodologies that make DeepSeek V3 efficient to train.

Yingjing Lu

DeepSeek V3 Intro: Groundbreaking Does Not Cost as Much as People Think

January 3, 2025

An introduction to DeepSeek V3, a new state-of-the-art LLM that does not cost a lot of money to train. This article covers what it is.

Yingjing Lu

LLM Security Issues - Misinformation and Social Engineering

December 4, 2024

Large Language Models (LLMs) have transformed the technological landscape, finding applications in everything from customer support and creative writing to research assistance and programming. However, their ubiquity also exposes them to significant security risks. Attackers can manipulate these models in subtle but impactful ways, undermining their reliability and potentially causing real-world harm. This article examines the primary security vulnerabilities in LLMs, provides concrete examples of attacks, and discusses mitigation strategies.

Yingjing Lu

LLM Security Issues - Model Manipulation

November 12, 2024

Yingjing Lu

LLM Security Issues - An Overview

October 15, 2024

Large Language Models (LLMs) like OpenAI’s GPT series, Google’s Bard, and Meta’s LLaMA have revolutionized the way humans interact with artificial intelligence (AI). However, as their capabilities grow, so do the potential security vulnerabilities they introduce. This article explores the primary security concerns associated with LLMs, organized into key categories.

Yingjing Lu

Unlocking the Potential of Multi-Modal Large Language Models: A Comprehensive Guide to Training with Text, Images, and Voice

July 12, 2024

Multi-modal large language models (LLMs) promise groundbreaking advancements by integrating text, image, and voice data into unified AI systems. This article explores the essential steps, techniques, and challenges involved in training such sophisticated models.

Yingjing Lu

Demystifying Large Language Models: Exploring Different Types and Their Applications

June 15, 2024

Large Language Models (LLMs) are revolutionizing the way we interact with technology, but their diversity can be overwhelming—this guide breaks down the different types of LLMs, their unique strengths, and practical applications.

Yingjing Lu

The Dark Side of AI Language Models: Understanding the Security Risks

March 2, 2024

As AI language models become more advanced and widely used, it's crucial to understand the potential security risks they pose. From personal information leaks to copyright infringement within their training data, these models can have unintended consequences that you should be aware of.

Yingjing Lu

Enhancing Your Daily Life and Work Leveraging Potential of Generative AI

January 23, 2023

By embracing the power of generative AI, you can unlock new possibilities, enhance your skills, and achieve greater success in your personal and professional endeavors. This article lists some ways you can use it to improve your life and work.

Yingjing Lu

What is: Large Language Models and Generative AI

March 16, 2022

Large language models and generative AI are being used in everything from chatbots and virtual assistants to content creation and language translation. Soon, you might find yourself having a heart-to-heart with your smartphone, getting writing tips from your computer, or even watching a movie script written entirely by AI!

Yingjing Lu