Eye on AI Weekly Research Watch


By: Craig Spencer Smith

Episodes

VISTA: View-Consistent Self-Verified Training for GUI Grounding

Jun 15 2026

GUI grounding, teaching AI to click the right button on a screen, sounds simple but is surprisingly brittle. A core training problem is that reinforcement learning often collapses: on hard instances every rollout fails, so there is no useful learning signal; on easy ones every rollout succeeds, which is equally uninformative. VISTA addresses this by generating multiple crops of the same GUI screenshot and comparing model predictions across these geometrically different but semantically equivalent views. A self-verification mechanism further stabilizes training by anchoring on cases where the model has already produced a correct answer. Results across five benchmarks show consistent accuracy improvements, with the strongest gains on the most challenging GUI grounding tasks. Applications include desktop automation agents, accessibility tools, and software testing frameworks.

Authors: Xinyu Qiu, Yunzhu Zhang, Heng Jia, Shuheng Shen, Changhua Meng, Linchao Zhu

Paper: https://arxiv.org/abs/2606.14579v1

3 min
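
The view-consistency idea in the VISTA summary can be illustrated with a small sketch. This is not the paper's implementation: the function names, the crop format, the per-axis median, and the 10-pixel tolerance are all assumptions made for illustration. It only shows how clicks predicted on different crops of one screenshot can be mapped back to a shared coordinate frame and checked for agreement.

```python
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical crop format: (left, top, width, height) in full-screenshot pixels.
Crop = Tuple[float, float, float, float]


def to_full_coords(pred: Point, crop: Crop) -> Point:
    """Map a click predicted inside a crop back to full-screenshot coordinates."""
    left, top, _, _ = crop
    return (left + pred[0], top + pred[1])


def view_agreement(preds: List[Point], crops: List[Crop], tol: float = 10.0) -> float:
    """Fraction of views whose mapped-back click lies within `tol` pixels
    (on each axis) of the per-axis median click. The tolerance is an assumption."""
    full = [to_full_coords(p, c) for p, c in zip(preds, crops)]
    mx = median(x for x, _ in full)
    my = median(y for _, y in full)
    close = sum(1 for x, y in full if abs(x - mx) <= tol and abs(y - my) <= tol)
    return close / len(full)


# Three crops of the same screenshot; the target sits at (300, 200) in full coordinates.
crops = [(0, 0, 800, 600), (100, 50, 600, 400), (200, 100, 400, 300)]
consistent = [(300, 200), (200, 150), (100, 100)]    # all map back to (300, 200)
one_outlier = [(300, 200), (200, 150), (350, 250)]   # last maps back to (550, 350)

score_consistent = view_agreement(consistent, crops)
score_outlier = view_agreement(one_outlier, crops)
```

An agreement score like this gives a graded signal even when no rollout is known to be correct, which is the kind of situation the summary describes as uninformative for plain reinforcement learning.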

CARE: Controlling LLM-Generated Policies through Auditable Review of Evidence in Scientific Experimentation

Jun 15 2026

High-throughput scientific experimentation, such as screening thousands of chemical compounds, is expensive and irreversible, which makes it a dangerous domain for unconstrained AI autonomy. CARE addresses this by keeping a proven non-LLM optimizer as the default while allowing an LLM to propose challenger strategies, and it authorizes a challenger only when pre-outcome evidence actually supports the switch. Every decision is logged in an auditable trail. On chemistry benchmarks, this outperforms all other evaluated methods, improving best-found outcomes significantly over a strong baseline. Applications extend to drug discovery, materials science, process optimization in manufacturing, and any high-stakes experimental domain where AI creativity needs to be harnessed without sacrificing accountability or safety.

Authors: Guanyu Liu, Weiyi Kong, Zeyu Wang, Boer Zhang, Baiqing Li, Peiyu Zhang, Tianyu Shi

Paper: https://arxiv.org/abs/2606.14581v1
2 min
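
The default-plus-challenger pattern in the CARE summary can be sketched as a gate with an audit log. This is a minimal illustration, not the paper's method: the class name, the scalar `evidence_score`, and the fixed threshold are all hypothetical stand-ins for whatever pre-outcome evidence review CARE actually performs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GatedSelector:
    """Keeps the default optimizer's proposal unless the challenger is authorized."""
    threshold: float  # hypothetical evidence threshold, not taken from the paper
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def choose(self, step: int, default_proposal: str,
               challenger_proposal: str, evidence_score: float) -> str:
        # The decision uses only evidence available before the experiment runs.
        authorized = evidence_score >= self.threshold
        chosen = challenger_proposal if authorized else default_proposal
        # Every decision is recorded, whether or not the challenger was used.
        self.audit_log.append({
            "step": step,
            "default": default_proposal,
            "challenger": challenger_proposal,
            "evidence_score": evidence_score,
            "authorized": authorized,
            "chosen": chosen,
        })
        return chosen


selector = GatedSelector(threshold=0.8)
first = selector.choose(0, "default-batch-A", "llm-batch-B", evidence_score=0.5)
second = selector.choose(1, "default-batch-A", "llm-batch-B", evidence_score=0.9)
```

Here `first` falls back to the default proposal and `second` switches to the challenger, and both decisions appear in `selector.audit_log` for later review.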

A Temporal Planning Framework for Disruption Aware Dynamic Route Optimization in Heterogeneous Railway Systems

Jun 15 2026

Railway networks are extraordinarily complex: trains of different gauges share limited track, single-track sections require precise coordination, and unexpected disruptions cascade through entire timetables. Most optimization research stops at high-level scheduling, leaving the messy operational details (track switching, gauge compatibility, disruption response) to human operators under pressure. This framework models the entire problem using PDDL 2.1 temporal planning, generating timestamped, conflict-free operational plans that account for gauge constraints and stochastic disruptions such as blocked tracks or engine failures. Tested on 200 benchmark instances with up to 1,000 track points and 120 trains, it demonstrates practical viability for real-world railway systems seeking to reduce reliance on manual intervention during disruptions.

Authors: Pollob Chandra Ray, Sabah Binte Noor, Fazlul Hasan Siddiqui

Paper: https://arxiv.org/abs/2606.14582v1
3 min
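
To make "timestamped, conflict-free" concrete, here is a small validator sketch. It is not the framework's PDDL 2.1 model and it does no planning: the `Occupancy` record and `find_conflicts` function are hypothetical, and the check covers only one rule, that two different trains must not occupy the same track segment during overlapping time intervals.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Occupancy:
    """One timestamped plan step: a train holds a segment during [start, end)."""
    train: str
    segment: str
    start: float
    end: float


def find_conflicts(plan: List[Occupancy]) -> List[Tuple[str, str, str]]:
    """Return (segment, train_a, train_b) for every pair of different trains
    whose occupancy intervals on the same segment overlap."""
    conflicts = []
    for a, b in combinations(plan, 2):
        same_segment = a.segment == b.segment
        overlap = a.start < b.end and b.start < a.end  # half-open interval test
        if a.train != b.train and same_segment and overlap:
            conflicts.append((a.segment, a.train, b.train))
    return conflicts


plan = [
    Occupancy("T1", "S1", 0, 10),
    Occupancy("T2", "S1", 5, 15),   # overlaps T1 and T3 on S1
    Occupancy("T3", "S1", 10, 20),  # starts exactly when T1 leaves, so no T1 conflict
]
conflicts = find_conflicts(plan)
```

A plan would count as conflict-free under this single rule when `find_conflicts` returns an empty list; gauge compatibility and disruption handling would need additional checks not shown here.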
