Monthly archive

June 2026 AI News Archive

95 archived articles for June 2026, grouped by publication day.

Source overview

arXiv cs.AI: 47 articles (latest: Jun 19, 2026)

arXiv cs.LG: 18 articles (latest: Jun 19, 2026)

arXiv cs.CL: 15 articles (latest: Jun 19, 2026)

The Decoder: 5 articles (latest: Jun 19, 2026)

Hugging Face: 3 articles (latest: Jun 17, 2026)

MIT Tech Review AI: 2 articles (latest: Jun 19, 2026)

OpenAI: 2 articles (latest: Jun 3, 2026)

TechCrunch: 1 article (latest: Jun 19, 2026)

Jun 19, 2026

TechCrunch

Encryption, spyware, and now Mythos: History shows why cyber export control doesn’t work

For the last 30 years, stopping the flow of cybersecurity-related software has proven to be ineffective. It's unclear why it would work now with Anthropic’s cybersecurity model Mythos.

The Decoder

Amazon drops its OpenAI drama film after signing a $50 billion deal with Sam Altman's company

Amazon MGM Studios has dropped "Artificial," the nearly finished OpenAI film directed by Luca Guadagnino with Andrew Garfield as Sam Altman. Amazon struck a $50 billion partnership with OpenAI in February. According to…

MIT Tech Review AI

A startup claims it broke through a bottleneck that’s holding back LLMs

Miami-based AI startup Subquadratic came out of stealth mode last month with a huge claim. It announced that it had solved a mathematical bottleneck that had been holding back large language models for almost a decade.…

The Decoder

OpenAI researchers show small doses of "beneficial trait" training make AI models broadly safer and harder to manipulate

OpenAI researchers show that reinforcement learning on desired behavioral traits like truthfulness and corrigibility works across domains. Training on health data also improved deception detection, and the model scored…

The Decoder

Website "In the Weights" shows whether AI models know who you are

Two former OpenAI employees have built a website called "In the Weights" that reveals which people AI models can recall purely from their training data. A strength score of up to 996 shows how deeply a person is embedde…

arXiv cs.AI

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

arXiv:2606.19348v1 Announce Type: cross Abstract: We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models -- DeepSeek-V4-Pro with 1.6T parameters (49B activated)…

arXiv cs.AI

LLM agent safety, multi-turn red-teaming, jailbreak benchmarks, adversarial robustness, safety-critical systems

arXiv:2606.20408v1 Announce Type: cross Abstract: Large language model (LLM) agents are increasingly proposed as supervisory components for safety-critical systems, yet their robustness under sustained, adaptive adversa…

arXiv cs.AI

Connect the Dots: Training LLMs for Long-Lifecycle Agents with Cross-Domain Generalization Via Reinforcement Learning

arXiv:2606.20002v1 Announce Type: cross Abstract: This work presents a general framework for training large language models (LLMs) to "Connect the Dots" (CoD), a meta-capability required by long-lifecycle agents: as an…

arXiv cs.AI

Hidden Anchors in Multi-Agent LLM Deliberation

arXiv:2606.19494v1 Announce Type: new Abstract: Multi-agent LLM deliberation, where agents exchange and revise answers over several rounds, is increasingly used to improve reasoning and accuracy, yet how and why it work…

arXiv cs.AI

LLM Doesn't Know What It Doesn't Know: Detecting Epistemic Blind Spots via Cross-Model Attribution Divergence on Clinical Tabular Data

arXiv:2606.19509v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly applied to structured clinical data, yet whether they can recognize the limits of their own knowledge on such tasks remains u…

arXiv cs.AI

Uncertainty Decomposition for Clarification Seeking in LLM Agents

arXiv:2606.19559v1 Announce Type: new Abstract: Recent position papers argue that the classical aleatoric/epistemic uncertainty framework is insufficient for interactive large language model (LLM) agents and call for un…

arXiv cs.AI

Analyzing the Narration Gap in LLM-Solver Loops

arXiv:2606.19588v1 Announce Type: new Abstract: Formal tools such as SAT and SMT solvers are increasingly embedded in language model reasoning pipelines when a safety or security critical question can be formulated in l…

arXiv cs.AI

Which Pairs to Compare for LLM Post-Training?

arXiv:2606.19607v1 Announce Type: new Abstract: Preference-based post-training has become a central paradigm for aligning language models. A common data-collection strategy is to generate a small set of completions for…

arXiv cs.AI

Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents

arXiv:2606.19704v1 Announce Type: new Abstract: Agent benchmarks are growing fast, but no single benchmark touches more than four or five of the dimensions that deployment exposes. This paper aggregates the largest coor…

arXiv cs.AI

Beyond Entropy: Learning from Token-Level Distributional Deviations for LLM Reasoning

arXiv:2606.19771v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has significantly advanced Large Language Model (LLM) reasoning; however, it faces a fundamental optimization instabi…

arXiv cs.AI

ORAgentBench: Can LLM Agents Solve Challenging Operations Research Tasks End to End?

arXiv:2606.19787v1 Announce Type: new Abstract: Large language models are increasingly deployed as autonomous agents for multi-step tasks in executable environments, yet their ability to perform realistic operations res…

arXiv cs.AI

Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring

arXiv:2606.20138v1 Announce Type: new Abstract: LLMs can personalize education, although current static-prompt tutoring systems struggle to adapt to diverse academic disciplines. We develop and test a system with subjec…

arXiv cs.AI

Navigating Unreliable Parametric and Contextual Knowledge: Explicit Knowledge Conflict Resolution for LLM Inference

arXiv:2606.20245v1 Announce Type: new Abstract: Large language models (LLMs) have achieved strong performance across a wide range of language-based tasks by leveraging both extensive parametric knowledge and in-context…

arXiv cs.AI

What Do Safety-Aligned LLMs Learn From Mixed Compliance Demonstrations?

arXiv:2606.20508v1 Announce Type: new Abstract: Prior work has shown that in-context demonstrations can jailbreak language models, but it remains unclear how models interpret different types of compliance demonstrations…

arXiv cs.AI

Human-AI Agent Interaction in a Business Context

arXiv:2606.18716v1 Announce Type: cross Abstract: As AI agents are increasingly integrated into core business processes, understanding and designing effective interaction patterns between humans and AI agents becomes cr…

arXiv cs.AI

Exposing the Unsaid: Visualizing Hidden LLM Bias through Stochastic Path Aggregation

arXiv:2606.19344v1 Announce Type: cross Abstract: Large Language Models (LLMs) exhibit representational and syntactic biases that are difficult to evaluate due to the stochastic nature of text generation. Standard audit…

arXiv cs.AI

Where to Place the Query? Unveiling and Mitigating Positional Bias in In-Context Learning for Diffusion LLMs via Decoding Dynamics

arXiv:2606.19349v1 Announce Type: cross Abstract: While In-Context Learning (ICL) is extensively studied in Autoregressive (AR) LLMs, its mechanism within Diffusion Large Language Models (dLLMs) remains largely unexplor…

arXiv cs.AI

Information Lattice Learning as Probabilistic Graphical Model Structure Learning

arXiv:2606.19366v1 Announce Type: cross Abstract: Information lattice learning (ILL) learns interpretable rules of a signal by alternately projecting the signal onto a partition lattice that encodes a hierarchy of abstr…

arXiv cs.AI

Cost-Optimal LLM Routing with Limited User Feedback under User Satisfaction Guarantees

arXiv:2606.19376v1 Announce Type: cross Abstract: Inference costs for large language model (LLM) applications are rapidly growing, driven by surging demand and rising infrastructure cost. Users expect high-quality respo…

arXiv cs.AI

Interpretable and Verifiable Hardware Generation with LLM-Driven Stepwise Refinement

arXiv:2606.19387v1 Announce Type: cross Abstract: Large language models (LLMs) have achieved remarkable success in software development. However, they are susceptible to hallucinations, meaning that they can introduce s…

arXiv cs.AI

Secure Coding Drift in LLM-Assisted Post-Quantum Cryptography Development: A Gamified Fix

arXiv:2606.19474v1 Announce Type: cross Abstract: The transition to Post Quantum Cryptography (PQC) introduces considerable implementation complexity, requiring strict adherence to constant-time execution, side channel…

arXiv cs.AI

Techniques for Peak Memory Reduction for LoRA Fine-tuning of LLMs on Edge Devices

arXiv:2606.19528v1 Announce Type: cross Abstract: Fine-tuning of Large Language Models (LLMs) using Low-Rank Adaptation (LoRA) on an end-user's data offers personalized experiences while keeping data private, but faces…

arXiv cs.AI

FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines

arXiv:2606.19605v1 Announce Type: cross Abstract: Multi-step LLM pipelines fail through interactions among retrieval, reasoning, and formatting steps, so prompt-only optimization can miss bottlenecks in the chain. We pr…

arXiv cs.AI

NRITYAM: Language Models Meet Art and Heritage of Dance

arXiv:2606.19727v1 Announce Type: cross Abstract: Language models have become essential tools in shaping modern workflows. However, their global effectiveness hinges on a nuanced understanding of local socio-cultural co…

arXiv cs.AI

SafeSpec: Fast and Safe LLM via Dynamic Reflective Sampling

arXiv:2606.19755v1 Announce Type: cross Abstract: Speculative inference accelerates large language model (LLM) decoding but provides no inherent safety guarantees. Existing safety defenses are largely incompatible with…

arXiv cs.AI

Measuring Biological Capabilities and Risks of AI Agents

arXiv:2606.19899v1 Announce Type: cross Abstract: This paper addresses a rapidly emerging policy challenge: how to generate and interpret credible evidence about the biological capabilities and risks of AI scientists, o…

arXiv cs.AI

Confidence Calibration for Multimodal LLMs: An Empirical Study through Medical VQA

arXiv:2606.19950v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) show great potential in medical tasks, but their elicited confidence often misaligns with actual accuracy, potentially leading t…

arXiv cs.AI

Hierarchical Control in Multi-Agent Games: LLM-based Planning and RL Execution

arXiv:2606.20014v1 Announce Type: cross Abstract: Reinforcement learning (RL) has achieved strong performance in sequential decision-making, yet scaling to complex multi-agent environments remains challenging due to spa…

arXiv cs.AI

A Neuromorphic Reinforcement Learning Framework for Efficient Pathfinding in Robotic Mobile Fulfillment Systems

arXiv:2606.20031v1 Announce Type: cross Abstract: Dynamic environmental changes, confined workspaces, and stringent real-time constraints make pathfinding in Robotic Mobile Fulfillment Systems (RMFS) a challenging probl…

arXiv cs.AI

Evaluating and Enhancing Negation Comprehension in Remote Sensing MLLMs

arXiv:2606.20177v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in various Remote Sensing (RS) tasks. However, their ability to comprehend negation remains…

arXiv cs.AI

Editorial Alignment: A Participatory Approach to Engaging Editorial Expertise in LLM-mediated Knowledge Dissemination

arXiv:2606.20258v1 Announce Type: cross Abstract: The emergence of LLM-driven information services is reshaping the conditions under which public knowledge institutions operate, threatening to absorb the editorial funct…

arXiv cs.AI

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

arXiv:2606.20373v1 Announce Type: cross Abstract: Large Language Models (LLMs) show promise for code compilation tasks, but applying them to runtime performance tuning is difficult due to complex microarchitectural effe…

arXiv cs.AI

Multi-View Decompilation for LLM-Based Malware Classification

arXiv:2606.20436v1 Announce Type: cross Abstract: Malware analysts often inspect compiled binaries through decompiled pseudo-C, when source code is unavailable. Recent work suggests that large language models (LLMs) can…

arXiv cs.AI

Calibration Without Comprehension: Diagnosing the Limits of Fine-Tuning LLMs for Vulnerability Detection in Systems Software

arXiv:2606.20502v1 Announce Type: cross Abstract: Whether LLMs scoring well on vulnerability benchmarks genuinely reason about security or merely pattern-match on contaminated data remains unresolved. We present CWE-Tra…

arXiv cs.AI

Efficient and Sound Probabilistic Verification for AI Agents

arXiv:2606.20510v1 Announce Type: cross Abstract: Securing AI agents that operate in complex digital environments has become a critical need, and runtime monitoring approaches that formulate and enforce policies express…

arXiv cs.AI

How Transparent is DiffusionGemma?

arXiv:2606.20560v1 Announce Type: cross Abstract: LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. Howe…

arXiv cs.AI

PCBSchemaGen: Reward-Guided LLM Code Synthesis for Printed Circuit Boards (PCB) Schematic Design with Structured Verification

arXiv:2602.00510v2 Announce Type: replace Abstract: Most LLM code-synthesis benchmarks rely on unit tests as the reward oracle, but PCB schematic design has none: correctness is defined by structured physical constraint…

arXiv cs.AI

RetailBench: Benchmarking long horizon reasoning and coherent decision making of LLM agents in realistic retail environments

arXiv:2606.15862v2 Announce Type: replace Abstract: Large language model (LLM) agents have made rapid progress on short-horizon, well-scoped tasks, yet their ability to sustain coherent decisions in dynamic long-horizon…

arXiv cs.AI

TxBench-PP: Analyzing AI Agent Performance on Small-Molecule Preclinical Pharmacology

arXiv:2606.19245v2 Announce Type: replace Abstract: Artificial intelligence (AI) agents promise to accelerate drug discovery by compressing interpretation and decision-making loops, but practical deployment requires tru…

arXiv cs.AI

Reinforcement-aware Knowledge Distillation for LLM Reasoning

arXiv:2602.22495v3 Announce Type: replace-cross Abstract: Reinforcement learning (RL) post-training has recently driven major gains in long chain-of-thought reasoning large language models (LLMs), but the high inference…

arXiv cs.AI

The Autonomy Tax: Defense Training Breaks LLM Agents

arXiv:2603.19423v2 Announce Type: replace-cross Abstract: Large language model (LLM) agents increasingly rely on external tools (file operations, API calls, database transactions) to autonomously complete complex multi-…

arXiv cs.AI

Automated Standardization of Legacy Biomedical Metadata Using an Ontology-Constrained LLM Agent

arXiv:2604.08552v2 Announce Type: replace-cross Abstract: Scientific metadata are often incomplete and noncompliant with community standards, limiting dataset findability, interoperability, and reuse. Even when standard…

arXiv cs.AI

FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning

arXiv:2604.11556v2 Announce Type: replace-cross Abstract: LLM-assisted software development has become increasingly prevalent, and can generate large-scale systems, such as compilers. It becomes crucial to strengthen th…

arXiv cs.AI

"Important You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

arXiv:2606.03090v2 Announce Type: replace-cross Abstract: The emergence of large language models (LLMs) has significantly accelerated recent research on LLM-based automatic grading (AG) systems. Benefiting from the stro…

arXiv cs.AI

Gaming-Resistant Insurance Contracts for Autonomous AI Agents: Strategy-Proof Toll Mechanism Design

arXiv:2606.16326v2 Announce Type: replace-cross Abstract: Paper A defines a time-consistent actuarial runtime that prices each side-effect-bearing action against a contractually fixed safe default and gates execution ag…

arXiv cs.AI

Statistical Foundations of LLM-based A/B Testing: A Surrogacy Framework for Human Causal Inference

arXiv:2606.17165v2 Announce Type: replace-cross Abstract: Organizations and researchers show increasing interest in using large language models (LLMs) in place of human participants in A/B tests, in the hope of experime…

arXiv cs.AI

Mitigating Anchoring Bias in LLM-Based Agents for Energy-Efficient 6G Autonomous Networks

arXiv:2606.18272v2 Announce Type: replace-cross Abstract: This paper presents an autonomous agentic resource negotiation framework designed to enable zero-touch network slicing in 6G architectures using Large Language M…

arXiv cs.CL

Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence

arXiv:2606.19353v1 Announce Type: new Abstract: In-Context Learning (ICL) allows LLMs to adapt to new tasks from a few demonstrations, but its reliability remains a concern: predictions are highly sensitive to both prom…

arXiv cs.CL

Characterizing Narrative Content in Web-scale LLM Pretraining Data

arXiv:2606.19468v1 Announce Type: new Abstract: The narrative composition of web-scale LLM pretraining corpora remains largely unexplored even though narrative is a fundamental mode of human communication. We present th…

arXiv cs.CL

Code-Switching Reveals Language Anchoring in Multilingual LLMs

arXiv:2606.19668v1 Announce Type: new Abstract: Multilingual Large Language Models (MLLMs) are increasingly expected to handle Code-Switched (CS) inputs, yet mixing languages frequently degrades performance relative to…

arXiv cs.CL

AtomMem: Building Simple and Effective Memory System for LLM Agents via Atomic Facts

arXiv:2606.19847v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate strong reasoning and generation abilities, but their fixed context windows limit long-term information accumulation and reuse acro…

arXiv cs.CL

Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

arXiv:2606.19852v1 Announce Type: new Abstract: Information extraction from pathology reports is essential for cancer staging, tumor registry population. Yet key data remains embedded in narrative reports, making manual…

arXiv cs.CL

GEMS: Geometric Constraints Enable Multi-Semantic Superposition in LLMs

arXiv:2606.19946v1 Announce Type: new Abstract: Activation steering controls model behavior by modifying intermediate hidden states at inference time without retraining. Existing methods handle only single-direction inj…

arXiv cs.CL

Your Mouse and Eyes Secretly Leak Your Preference: LLM Alignment using Implicit Feedback from Users

arXiv:2606.20482v1 Announce Type: new Abstract: To align a Large Language Model (LLM), most existing methods collect explicit human feedback and train a reward model to predict the human preference based on the response…

arXiv cs.CL

StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs

arXiv:2606.20527v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models jud…

arXiv cs.CL

Displacement Is Not Direction: Evaluating Fidelity Metrics for Quantized LLM Deployment

arXiv:2606.19558v1 Announce Type: cross Abstract: Fidelity metrics, such as per-token KL divergence (KLD) against a high-precision reference, are often used in practice as low-cost proxies for benchmark quality. We test…

arXiv cs.CL

S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

arXiv:2603.25702v2 Announce Type: replace Abstract: Block-diffusion language models offer a promising path toward faster-than-autoregressive generation by combining block-wise autoregressive decoding with within-block p…

arXiv cs.CL

Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

arXiv:2605.17443v2 Announce Type: replace Abstract: We analyze how automatic speech recognition (ASR) errors propagate through ASR-LLM cascades in Korean spoken question answering (SQA), focusing on downstream semantic…

arXiv cs.CL

Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio

arXiv:2606.17041v3 Announce Type: replace Abstract: Meta-analysis is a demanding form of evidence synthesis that combines literature retrieval, PI/ECO-guided study selection, and statistical aggregation. Its structured,…

arXiv cs.CL

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

arXiv:2603.16941v2 Announce Type: replace-cross Abstract: Speech Large Language Models (SpeechLLMs) process spoken input directly, retaining cues such as accent and perceived gender that were previously removed in casca…

arXiv cs.CL

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

arXiv:2604.18105v2 Announce Type: replace-cross Abstract: Integrating large language models (LLMs) into automatic speech recognition (ASR) has become a mainstream paradigm in recent years. Although existing LLM-based AS…

arXiv cs.CL

ESBMC-GraphPLC: Formal Verification of Graphical PLCopen XML Ladder Diagram Programs Using SMT-Based Model Checking

arXiv:2606.18941v2 Announce Type: replace-cross Abstract: PLCopen XML defines two encoding formats for IEC 61131-3 Ladder Diagram programs: a textual encoding using elements, and a graphical encoding that represents run…

arXiv cs.LG

Algebraic Dead Directions in LayerNorm Transformers: A Forward-Pass-Only Diagnostic at LLM Scale

arXiv:2606.19491v1 Announce Type: new Abstract: Pretrained transformers sit near singular minima of the loss, where the Fisher information metric degenerates along dead directions: directions in parameter space along wh…

arXiv cs.LG

Matching Markets meet Cumulative Prospect Theory: Towards Optimal and Adversarially Robust Learning

arXiv:2606.19883v1 Announce Type: new Abstract: We study a multi-agent multi-armed bandit problem in the competitive setup with two-sided matching markets under a human centric decision making model. To capture human pr…

arXiv cs.LG

Kolmogorov-Arnold Reservoir Computing

arXiv:2606.19984v1 Announce Type: new Abstract: Reservoir computing offers a lightweight framework for forecasting dynamical systems but may struggle to capture long-range dependencies due to limited representational ca…

arXiv cs.LG

Activation- and Influence-Aware Ranks (AIR): Function-Preserving SVD Compression for LLMs

arXiv:2606.19993v1 Announce Type: new Abstract: We present Activation- and Influence-Aware Ranks (AIR), an SVD-based LLM compression framework that guides each weight matrix's low-rank approximation with a backward-sign…

arXiv cs.LG

VIMPO: Value-Implicit Policy Optimization for LLMs

arXiv:2606.20008v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards has become a central tool for improving the reasoning ability of large language models, but current methods face a trade-off…

arXiv cs.LG

Quantile of Means: A Bonus-Free Ensemble Method for Minimax Optimal Reinforcement Learning

arXiv:2606.20107v1 Announce Type: new Abstract: Optimal Reinforcement Learning (RL) algorithms typically rely on carefully constructed count-based uncertainty estimates to drive exploration. Although theoretically sound…

arXiv cs.LG

Quantum-classical physics-informed Kolmogorov-Arnold networks for PDEs

arXiv:2606.20326v1 Announce Type: new Abstract: We develop QCPIKAN, the first quantum-classical physics-informed Kolmogorov-Arnold network designed to solve partial differential equations (PDEs). Built upon Chebyshev-po…

arXiv cs.LG

FloatDoor: Platform-Triggered Backdoors in LLMs

arXiv:2606.19535v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in sensitive settings such as software engineering, where their outputs directly shape downstream artifacts. Recen…

arXiv cs.LG

Online Dynamic Batching with Formal Guarantees for LLM Training

arXiv:2606.19989v1 Announce Type: cross Abstract: Modern LLM training breaks a core assumption behind offline batch samplers: the true training cost of a sample is only observable after preprocessing, augmentation, temp…

arXiv cs.LG

The Correctness Illusion in LLM-Generated GPU Kernels

arXiv:2606.20128v1 Announce Type: cross Abstract: Benchmarks for LLM-generated GPU kernels (KernelBench, TritonBench, GEAK) score correctness through fixed-shape, small-sample allclose-style checks. The number of inputs…

arXiv cs.LG

Indexed Bellman Information Complexity

arXiv:2606.11171v2 Announce Type: replace Abstract: We develop indexed Bellman information complexity, a representation-level theory of interactive decision making centered on information indices and reference histories…

arXiv cs.LG

From Drift to Coherence: Stabilizing Beliefs in LLMs

arXiv:2606.17832v2 Announce Type: replace Abstract: Large language models (LLMs) are often hypothesized to perform implicit Bayesian inference, yet a key coherence condition, the martingale property of predictive belief…

arXiv cs.LG

Monotonic Kolmogorov-Arnold Networks: A Theoretical and Empirical Study of Monotonicity as an Inductive Bias

arXiv:2606.17886v2 Announce Type: replace Abstract: Monotonicity has been a long-running architectural inductive bias for neural networks, motivated by tabular, scientific, and economic settings where outputs are known…

arXiv cs.LG

Zero-Shot Active Feature Acquisition via LLM-Elicitation

arXiv:2606.18933v2 Announce Type: replace Abstract: Active feature acquisition (AFA) sequentially selects which features to observe to reach a classification or ranking decision. Its central limitation is reliance on la…

arXiv cs.LG

Bioacoustic Geolocation: Species Sounds as Geographic Signals

arXiv:2505.18726v3 Announce Type: replace-cross Abstract: Can we determine someone's geographic location solely from the sounds they hear? Are acoustic signals enough to localize within a country, state, or even city? I…

arXiv cs.LG

Toward all-optical unsupervised Hebbian learning in deep photonic neuromorphic networks

arXiv:2601.22300v3 Announce Type: replace-cross Abstract: We propose a deep photonic neuromorphic network (PNN) architecture based on phase-change material (PCM) synapses and local optical feedback for online, unsupervi…

arXiv cs.LG

LLM-Based Synthetic Ground Truth Generation for Audio-Based Emotion Classification via In-Context Learning

arXiv:2606.14784v2 Announce Type: replace-cross Abstract: Understanding human states and interaction dynamics is a core goal of human-computer interaction (HCI). As interaction paradigms become more immersive, virtual r…

arXiv cs.LG

OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic Testing

arXiv:2606.19149v2 Announce Type: replace-cross Abstract: Automated vulnerability discovery in large codebases remains challenging: traditional static analysis produces high false-positive rates, while dynamic approache…

Jun 18, 2026

The Decoder

ChatGPT's new health upgrade beats doctor-written answers, OpenAI says

OpenAI has upgraded ChatGPT's healthcare capabilities with GPT-5.5 Instant. In the company's own comparative tests, the model now outscores answers written by doctors in accuracy, clarity, and completeness. The error ra…

The Decoder

Anthropic brings Artifacts to Claude Code, letting teams share live pages from coding sessions

Claude Code can now turn work results into interactive web pages called "artifacts" and share them with your team. The pages pull from the full session context, update automatically when something changes, and keep a ve…

Jun 17, 2026

Simon Willison

GLM-5.2 is probably the most powerful text-only open weights LLM

Chinese AI lab Z.ai released GLM-5.2 to their coding plan subscribers on June 13th, and then yesterday (June 16th) released the full open weights under an MIT license. Similar in size to their previous GLM-5 and GLM-5.1…

Hugging Face

MolmoMotion: Language-guided 3D motion forecasting

Hugging Face

GLM-5.2: Built for Long-Horizon Tasks

Jun 16, 2026

Ars Technica

Critical Copilot vulnerability allowed hackers to steal 2FA code from users

SearchLeak exploit shows why the industry's approach to LLM security fails over and over.

Jun 5, 2026

MIT Tech Review AI

The Meta hack shows there’s more to AI security than Mythos

On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they…

Jun 4, 2026

Hugging Face

Nemotron 3.5 Content Safety: Customizable Multimodal Safety for Global Enterprise AI

Jun 3, 2026

OpenAI

Introducing new capabilities to GPT-Rosalind

GPT-Rosalind advances life sciences research with enhanced biological reasoning, medicinal chemistry expertise, genomics analysis, and experimental workflow capabilities.

Jun 1, 2026

OpenAI

OpenAI frontier models and Codex are now available on AWS

OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS environments, controls, and procurement workflows they already use. Customers can g…


Gaming-Resistant Insurance Contracts for Autonomous AI Agents: Strategy-Proof Toll Mechanism Design

Statistical Foundations of LLM-based A/B Testing: A Surrogacy Framework for Human Causal Inference

Mitigating Anchoring Bias in LLM-Based Agents for Energy-Efficient 6G Autonomous Networks

Quantifying Aleatoric Uncertainty of In-Context Learning for Robust Measure of LLM Prediction Confidence

Characterizing Narrative Content in Web-scale LLM Pretraining Data

Code-Switching Reveals Language Anchoring in Multilingual LLMs

AtomMem: Building Simple and Effective Memory System for LLM Agents via Atomic Facts

Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

GEMS: Geometric Constraints Enable Multi-Semantic Superposition in LLMs

Your Mouse and Eyes Secretly Leak Your Preference: LLM Alignment using Implicit Feedback from Users

StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs

Displacement Is Not Direction: Evaluating Fidelity Metrics for Quantized LLM Deployment

S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

ESBMC-GraphPLC: Formal Verification of Graphical PLCopen XML Ladder Diagram Programs Using SMT-Based Model Checking

Algebraic Dead Directions in LayerNorm Transformers: A Forward-Pass-Only Diagnostic at LLM Scale

Matching Markets meet Cumulative Prospect Theory: Towards Optimal and Adversarially Robust Learning

Kolmogorov-Arnold Reservoir Computing

Activation- and Influence-Aware Ranks (AIR): Function-Preserving SVD Compression for LLMs

VIMPO: Value-Implicit Policy Optimization for LLMs

Quantile of Means: A Bonus-Free Ensemble Method for Minimax Optimal Reinforcement Learning

Quantum-classical physics-informed Kolmogorov-Arnold networks for PDEs

FloatDoor: Platform-Triggered Backdoors in LLMs

Online Dynamic Batching with Formal Guarantees for LLM Training

The Correctness Illusion in LLM-Generated GPU Kernels