LLM & Human-Machine Collaboration Reading Party

Online, 2023–2024

Introduction

Before the summer holiday, we successfully organized a month-long reading party focused on LLMs and robotics (see Cycle 0: LLM and Robotics). During that month, we explored various applications of LLMs in the field of robotics. From September 2023, we are reviving the reading party. Our theme will continue to revolve around LLMs and human-machine collaboration. To make our discussions more in-depth and specific, we will set a particular research direction each month. Each direction will be hosted by a PhD student working in that area, whose primary task will be to recommend relevant, outstanding papers. Participating students will then delve into these papers, sharing insights and engaging in discussions. Note: a paper need not use LLMs, as long as it falls under the topic and has potential for LLM-based improvement.

The reading party is scheduled to be held every two weeks, with two students presenting each time. Each student will have 30 minutes to introduce and interpret the paper(s), followed by a 15-minute discussion and Q&A session. Given that some papers in the LLM field are shorter or more straightforward, some students may present multiple papers within their allocated slot. We have therefore set aside a total of 90 minutes for each session, ensuring that every presentation receives ample time and attention. Furthermore, we will switch to a different specific research topic or direction each month, so that students from various research backgrounds can benefit and gain insights from the sessions.

Candidate topics:

  • LLM for Reasoning
  • LLM for Game-Theoretic Problems
  • LLM for Planning
  • LLM for Embodied AI
  • LLM for HCI
  • LLM for Science
  • LLM for Synthetic Data Generation
  • TBD

How to join?

Contacts: Yang Li

Email: yang.li-4@manchester.ac.uk

Homepage: https://liyang.page/reading-party.html

Update

The current cycle is Cycle 1: LLM for Reasoning.

The upcoming presenters are:

  • Mingrui Li
  • Zhe Cao, Master's student at Shanghai Jiaotong University

Papers to be Presented:

  • TBD
  • TBD

Date & Time:

Wed, Jan 17th, 2024, 11:30 - 13:00 (London time) / 19:30 - 21:00 (Beijing time)

Welcome to Join Us! Email yang.li-4@manchester.ac.uk

Cycle 1: LLM for Reasoning

Week 7: 3rd Jan 2024

Speaker 1: Bin Zhang, PhD student at Institute of Automation, Chinese Academy of Sciences & SenseTime Research

  • Slides Link
  • Stackelberg Decision Transformer for Asynchronous Action Coordination in Multi-Agent Reinforcement Learning
  • Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach

Speaker 2: Tengye Xu, PhD student at Zhejiang University

  • Slides Link
  • Text2Reward: Dense Reward Generation with Language Models for Reinforcement Learning

Week 6: 20th Dec 2023

Speaker 1: Lei Yuan, Assistant Professor at Nanjing University

Week 5: 5th Dec 2023

Speaker 1: Xue Yan, PhD student at Institute of Automation, Chinese Academy of Sciences

  • Slides Link
  • Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

Speaker 2: Yang Li, PhD student at University of Manchester

Week 4: 15th Nov 2023

Speaker 1: Ziyu Wan, PhD student at SJTU

Speaker 2: Tenglong Liu, PhD student at NUST

Week 3: 1st Nov 2023

Speaker 1: Yudi Zhang, PhD student

  • Slides Link
  • Efficient Human-AI Coordination via Preparatory Language-Based Convention
  • Evaluating Multi-Agent Coordination Abilities in Large Language Models

Speaker 2: Yiyu Jiang, PhD student at UoM

Week 2: 18th Oct 2023

Speaker 1: Muning Wen, PhD student at Shanghai Jiaotong University

  • Better Zero-Shot Reasoning with Self-Adaptive Prompting
    • Paper Link
    • Slides Link
    • Summary (by BARD): the authors propose a novel prompt-design method for large language models (LLMs) that leverages the LLM’s own zero-shot outputs to improve its reasoning capabilities. The method, called Consistency-based Self-adaptive Prompting (COSP), requires no handcrafted responses or ground-truth labels, and can improve LLM performance on a variety of reasoning tasks. In the zero-shot setting, COSP selects and builds a set of examples from the LLM’s zero-shot outputs via carefully designed criteria that combine consistency, diversity, and repetition. These criteria encourage the LLM to learn from its own predictions and reduce the sensitivity of its performance to the choice of examples. Experiments show that COSP can improve performance on zero-shot reasoning tasks by up to 15% over zero-shot baselines, and it can match or exceed few-shot baselines that require handcrafted examples. Overall, COSP is a promising method for improving the zero-shot reasoning capabilities of LLMs: it is easy to implement and requires no additional human effort. (A toy code sketch of the consistency-based selection idea follows below.)
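    • Code sketch: to make the selection criterion concrete, below is a minimal, hypothetical Python illustration of the consistency part of COSP-style example selection. The candidate record layout and the entropy-based ranking are our simplification for illustration, not the paper's actual implementation (which also scores diversity and penalizes repetition).

      import math
      from collections import Counter

      def answer_entropy(sampled_answers):
          # Shannon entropy of the final answers sampled for one question;
          # lower entropy means the model agrees with itself more often.
          counts = Counter(sampled_answers)
          n = len(sampled_answers)
          return -sum((c / n) * math.log(c / n) for c in counts.values())

      def select_demonstrations(candidates, k=3):
          # Rank candidate (question, rationale, answer) records by the
          # self-consistency of their sampled answers; keep the top k as
          # in-context examples for a second prompting pass.
          return sorted(candidates, key=lambda c: answer_entropy(c["sampled_answers"]))[:k]

      # Toy usage with answers sampled from zero-shot chain-of-thought runs.
      candidates = [
          {"question": "Q1", "rationale": "...", "answer": "8",
           "sampled_answers": ["8", "8", "8", "7"]},   # mostly consistent
          {"question": "Q2", "rationale": "...", "answer": "3",
           "sampled_answers": ["3", "5", "2", "3"]},   # much noisier
      ]
      print(select_demonstrations(candidates, k=1))    # keeps the Q1 candidate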

Speaker 2: Weiqin Zu, Master's student at ShanghaiTech University

Week 1: 7th Oct 2023

Speaker 1: Wenhao Zhang, Undergraduate student at Shanghai Jiaotong University

  • MindAgent: Emergent Gaming Interaction
    • Paper Link
    • Slides Link
    • Summary (by ChatGPT): The paper “MindAgent: Emergent Gaming Interaction” introduces the “MindAgent” system for multi-agent interactions, emphasizing the capabilities of Large Language Models (LLMs) in complex task planning. The authors present “CUISINE WORLD,” a novel gaming benchmark to assess multi-agent collaboration efficiency. Extensive evaluations highlight the potential of LLMs such as GPT-4 in scheduling multiple agents and collaborating with human players.

Speaker 2: Yang Li, PhD student at UoM

  • Self-Refined Large Language Model as Automated Reward Function Designer for Deep Reinforcement Learning in Robotics
    • Paper Link
    • Slides Link
    • Summary (by ChatGPT): The paper titled “Self-Refined Large Language Model as Automated Reward Function Designer for Deep Reinforcement Learning in Robotics” addresses the challenge of designing high-performing reward functions for Deep Reinforcement Learning (DRL) in robotics. Recognizing the potential of Large Language Models (LLMs) in tasks requiring common-sense knowledge, the authors introduce a novel LLM framework with a self-refinement mechanism for automated reward function design. Through evaluations on various robotic control tasks, the LLM-designed reward functions demonstrate competitive performance, often rivaling or surpassing manually designed functions. (A rough code sketch of such a self-refinement loop follows below.)
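    • Code sketch: a rough Python illustration of the self-refinement idea (our sketch under assumptions, not the paper's actual implementation), where llm and evaluate_in_env are hypothetical callables standing in for an LLM API and a DRL training-and-evaluation run.

      def design_reward(llm, task_description, evaluate_in_env, n_rounds=3):
          # Hypothetical self-refinement loop: the LLM drafts a reward
          # function as code, a policy is trained and evaluated with it,
          # and the resulting feedback is handed back for revision.
          reward_code = llm(
              f"Write a Python reward function for this robotic task:\n"
              f"{task_description}"
          )
          for _ in range(n_rounds):
              # Feedback could be training curves, success rates, or
              # observed failure modes from rollouts.
              feedback = evaluate_in_env(reward_code)
              reward_code = llm(
                  f"Task: {task_description}\n"
                  f"Current reward function:\n{reward_code}\n"
                  f"Evaluation feedback: {feedback}\n"
                  "Revise the reward function to address this feedback."
              )
          return reward_code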

Speaker 3: Zhe Cao, Master's student at Shanghai Jiaotong University

  • Open X-Embodiment: Robotic Learning Datasets and RT-X Models
    • Paper Link
    • Slides Link
    • Summary (by ChatGPT): The paper “Open X-Embodiment: Robotic Learning Datasets and RT-X Models” presents an open, large-scale dataset for robot learning, curated from 21 institutions worldwide. This dataset encompasses diverse behaviors, robot embodiments, and environments, aiming to facilitate the training of generalized robotic policies. By consolidating data from 22 different robots, the authors introduce a high-capacity model named RT-X, which demonstrates positive transfer across various robotic platforms.

Cycle 0: LLM and Robotics

Read More