Publications - QuantaAlpha

2026

EpochX: Building the Infrastructure for an Emergent Agent Civilization

Agent Infrastructure Human-Agent Collaboration Skill Marketplace

Paper HF Code Product

2026

Idea2Paper: What Should an End-to-End Research Agent Really Do?

Research Agent End-to-End Scientific Writing

Paper Code Product

2026

Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

Research Automation Knowledge Graph Scientific Narrative

Paper Code Product

2026

Story2Proposal: A Scaffold for Structured Scientific Paper Writing

Multi-Agent Framework Scientific Writing Visual Contract

Paper Code Product

AAAI 2026 Oral

Easy for Children, Hard for AI: The Limits of Multimodal LLMs in Early Childhood Learning

Multimodal LLM Benchmark Oral

Paper

AAAI 2026

PsyPARSE: Retrieval-Augmented Slow Thinking for Personalized Empathetic Counseling

Empathetic Counseling RAG Slow Thinking

Paper

2026

Chain of Mindset: Reasoning with Adaptive Cognitive Modes

LLM Reasoning Adaptive Cognitive Modes Chain of Mindset

Paper HF Code

2026

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

AI4Investment Self-Evolving Alpha Mining

Paper HF Code

2026

Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

Agent Safety Intrinsic Risk Sensing Hierarchical Defense

Paper HF Code

2026

Controlled Self-Evolution for Algorithmic Code Optimization

Self-Evolution CodeAgent Genetic Algorithm EffiBench

Paper HF Code

ACL 2026

MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

CodeAgent Memory Bug Fixing SWE-bench

Paper HF Code

2026

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Video Reasoning DeepResearch Web Retrieval VideoDR

Paper HF Code

ACL 2026

KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions

Agent Memory Person Understanding Digital Companion

Paper HF Code

2026

EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines

Self-Evolution DeepResearch Finite State Machine Multi-hop QA

arXiv HF

2026

FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

Agent Safety Financial Security Regulatory Compliance

arXiv HF Code

ACL 2026

Does Memory Need Graphs? A Unified Framework and Empirical Analysis for Long-Term Dialog Memory

Agent Memory Graph Structure Dialog Memory

Paper HF Code

ACL 2026

CloneMem: Benchmarking Long-Term Memory for AI Clones

Agent Memory AI Clone Temporal Reasoning

Paper HF Code

ACL 2026

RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction

Agent Memory Cross-session Dialog Real-world Interaction

Paper HF Code

2026

DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation

LoRA MoE Dynamic Rank PEFT

arXiv HF

ACL 2026

MirrorQA: Benchmarking Multimodal LLMs on Mirror-Orientation Reasoning

Multimodal LLM Mirror Reasoning Benchmark

ACL 2026

Tiny Scales, Great Challenges: The Limits of Multimodal LLMs in Scale Recognition

Multimodal LLM Scale Recognition Benchmark

ACL 2026

SafetyMem: Adaptive Jailbreak Defense via Dual-Component Safety Memory

Jailbreak Defense Safety Memory LLM Safety

ICLR 2026

Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

Foundation Model EEG Signal Poster

Paper

ACL 2026

LiveCANNBench: Benchmark SWE AI Coding for Ascend CANN

SWE AI Coding Benchmark Ascend CANN

2026

Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure

CodeAgent Agent Infrastructure Embeddable

Paper HF Code

2026

SemaClaw: A Step Towards General-Purpose Personal AI Agents Through Harness Engineering

Personal AI Agent Harness Engineering General Purpose

Paper HF Code

2025

🐙 Octopus: Agentic Multimodal Reasoning with Six-Capability Orchestration

Multimodal Reasoning Agentic Framework arXiv Preprint

Paper

NeurIPS 2025

🧠 SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning

Self-Evolution Trajectory Optimization Poster

Paper HF Code

AAAI 2026 Oral

🧩 GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks

CodeAgent Benchmark Oral

Paper HF Code

EMNLP 2025 Findings

ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense

Agent Safety Self-learning Hierarchical Reasoning

Paper Code

NeurIPS 2025 Spotlight

🔍 RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories

CodeAgent Repository Understanding Spotlight

Paper HF Code

ACL 2025 Findings

Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs

LLM Safety Essence Driven Defense Jailbreak

Paper Code

2025

ShieldLearner: A New Paradigm for Jailbreak Attack Defense in LLMs

LLM Safety Self-learning Jailbreak

Paper

Publications (30+)
