
COMP329 Autonomous Mobile Robotics - Agents and Autonomous Systems

一、Course Basic Information(课程基础信息)

  • Course Name(课程名称):Autonomous Mobile Robotics(自主移动机器人学)
  • Course Modules(课程模块)
  • Part 1: Agents and Robots(智能体与机器人)
  • Part 2: Properties of Autonomy(自主性的属性)
  • Part 4: The Robot Control Architecture(机器人控制架构)
  • Lecturer(授课教师):Dr Terry R. Payne
  • Department(所属院系):Department of Computer Science, UNIVERSITY OF LIVERPOOL(利物浦大学计算机科学系)
  • Semester(授课学期):Semester 1 - 2025-26(2025-26学年第一学期)

二、Core Concepts of Robots(机器人核心概念)

1. Origin and Definitions of Terminology(术语起源与定义)

  • Origin of "Robot":The word "robot" was first used in Karel Capek’s play "Rossum’s Universal Robots" in 1920, and robots were conceptualised as "mechanical men" at that time. The Czech word "robota" roughly translates to the feudal term "corvée", referring to unpaid labor provided to one’s liege lord.(“Robot”一词源于1920年Karel Capek的戏剧《Rossum’s Universal Robots》(《罗萨姆的万能机器人》),当时机器人被定义为“机械人(mechanical men)”;其词源可追溯至捷克语“robota”,大致对应封建时期的“corvée(徭役)”,即向领主提供的无偿劳动。)
  • Origin of "Robotics":Isaac Asimov coined the term "robotics" in 1942.(1942年,Isaac Asimov(艾萨克·阿西莫夫)创造了“robotics(机器人学)”这一术语。)
  • Authoritative Definitions(权威定义)
  • Definition by Robot Institute of America (1980):A programmable, multifunction manipulator designed to move material, parts, tools, or specialized devices through variable programmed motions for the performance of a variety of tasks.(美国机器人协会(1980年)定义:一种可编程的多功能操纵器,设计用于通过可变的编程运动移动材料、零件、工具或专用设备,以执行各种任务。)
  • Definition by Russell and Norvig (2003):A physical agent that performs tasks by manipulating the physical world.(Russell和Norvig(2003年)定义:通过操纵物理世界来执行任务的物理智能体(physical agent)。)

2. Relationship Between Robots, Teleoperation and Autonomy(机器人与远程操作、自主性的关系)

  • Tele-operated Robots:Many vehicles considered "autonomous" are actually tele-operated, relying on continuous human control and lacking the ability to make independent decisions.(许多被认为是“自主”的设备实际为远程操作机器人,需依赖人工持续控制,无法独立决策。)
  • Autonomous Robots:The core feature is the ability to make independent decisions without real-time human intervention, and adjust behaviors independently according to environmental changes.(自主机器人的核心特征是能够自主制定决策,无需人工实时干预,可根据环境变化独立调整行为。)

三、Core Concepts of Agents(智能体核心概念)

1. Definition and Core Attributes of Agents(智能体的定义与核心属性)

  • Basic Definition:An agent is a computer system capable of independent (autonomous) action on behalf of its user or owner, which figures out what needs to be done to satisfy design objectives rather than constantly being told. It can be further described as a computer system situated in a specific environment and capable of autonomous action in that environment to meet its delegated objectives. The core lies in "autonomy", i.e., the ability to act independently.(智能体是“能够代表用户或所有者自主行动的计算机系统,可自主判断需执行的操作以满足设计目标,而非持续依赖指令”;进一步可表述为“处于特定环境中,能通过自主行动实现委托目标的计算机系统”,核心在于“自主性(autonomy)”,即独立行动能力。)
  • Core Decision Requirements:An agent needs to solve two key problems: "what action to perform" and "when to perform an action".(智能体需解决两个关键问题——“执行何种行动(what action to perform)”与“何时执行行动(when to perform an action)”。)
  • Typical Cases:NASA’s Deep Space 1 (DS1) and the Autonomous Asteroid Exploration Project. DS1, launched in 1998, is a representative spacecraft with autonomous decision-making capabilities.(美国国家航空航天局(NASA)的“深空1号(Deep Space 1,DS1)”和“自主小行星探测项目(Autonomous Asteroid Exploration Project)”,其中DS1于1998年发射,是具备自主决策能力的航天器代表。)

2. Interaction Mode Between Agents and Environment(智能体与环境的交互模式)

  • Core Elements of Interaction:An agent obtains "percepts" through "sensors" and performs "actions" through "effectors". The core question is "what actions to take for a given state of the environment".(智能体通过“传感器(Sensors)”获取“感知信息(Percepts)”,再通过“执行器(Effectors)”执行“行动(Action)”,核心问题是“针对特定环境状态应采取何种行动”。)
  • Core Questions of Mobile Robotics:In the context of mobile robotics, the interaction between an agent and the environment requires solving three core questions: "Where am I?", "Where am I going?", and "How do I get there?" These are the core contents of the robotics module in the course.(在移动机器人场景中,智能体与环境的交互需解决三个核心问题——“我在哪里(Where am I?)”“我要去哪里(Where am I going?)”“如何到达那里(How do I get there?)”,这也是课程中机器人学模块的核心内容。)

3. Hierarchy and Characteristics of Autonomy(自主性的层级与特征)

  • Autonomy Spectrum:Autonomy exists on a continuous spectrum from "no autonomy (e.g., simple machines)" to "full autonomy (e.g., humans)". Autonomy is adjustable—when it is more beneficial to delegate decision-making to a higher authority, the agent can transfer decision-making authority.(自主性存在从“无自主性(如简单机器)”到“完全自主性(如人类)”的连续光谱,且自主性可调整——当交由更高权威决策更有利时,智能体可转移决策权限。)
  • Cases of Simple Agents(简单智能体案例)
  • Control Systems:e.g., Thermostat. Its delegated goal is to maintain room temperature, and it only achieves this through two simple actions: "heating on/off", with a single decision logic.(控制系统:如恒温器(Thermostat),其委托目标是维持室温,仅通过“加热/关闭加热”两种简单行动实现,决策逻辑单一。)
  • Software Demons:e.g., the "biff program" in UNIX systems. Its function is to monitor incoming emails and prompt via GUI, with a fixed action mode and no complex decision-making.(软件守护进程:如UNIX系统的“biff程序”,功能是监控新邮件并通过图形界面(GUI)提示,行动模式固定,无复杂决策。)

4. Differences Between Agents and Objects(智能体与对象的差异)

  • Autonomy(自主性):Agents possess stronger autonomy and can independently decide whether to perform actions requested by other agents, whereas objects passively execute instructions, have no independent decision-making ability, and must complete actions as required.(自主性:智能体具备更强自主性,可自主决定是否执行其他智能体请求的行动;对象则被动执行指令,无自主决策能力,需按要求完成行动。)
  • Behavioral Flexibility(行为灵活性):Agents can exhibit flexible reactive, pro-active, and social behaviors, whereas objects have no such behavior model and only execute according to preset logic.(行为灵活性:智能体可表现出反应式(reactive)、主动式(pro-active)、社交式(social)灵活行为;对象无此类行为模型,仅按预设逻辑执行。)
  • Proactivity(主动性):Agents are not passive service providers; multi-agent systems are inherently multi-threaded, with each agent having at least one active control thread, whereas objects passively respond to calls and have no active control thread.(主动性:智能体非被动服务提供者,多智能体系统本质为多线程,每个智能体至少有一个主动控制线程;对象被动响应调用,无主动控制线程。)
  • Motivation for Action(行动动机):Agents act out of "personal gain", seeking to improve their own utility in line with rationality, whereas objects have no concept of "interest" or "utility" and execute actions only to complete tasks.(行动动机:智能体基于“自身利益(personal gain)”执行行动,以提升自身效用(utility)为目标,符合理性;对象无“利益”或“效用”概念,仅为完成任务而执行行动。)

5. Relationship Between Agents and Artificial Intelligence (AI)(智能体与人工智能的关系)

  • Building useful agents does not require solving all AI problems; it only needs to enable them to select correct actions in a limited domain, i.e., "a little intelligence goes a long way".(构建有用的智能体无需解决所有AI问题,仅需使其在特定领域(limited domain)选择正确行动,即“少量智能即可发挥巨大作用(a little intelligence goes a long way)”。)
  • Commercial Case:NETBOT achieved commercial profitability by simplifying agent functions (reducing "intelligence level"), proving that agents focused on specific domains are more practical.(商业案例:NETBOT公司通过简化智能体功能(降低“智能程度”),最终实现商业盈利,证明聚焦特定领域的智能体更具实用价值。)

四、Behavioral Attributes and Other Characteristics of Autonomous Agents(自主智能体的行为属性与其他特征)

1. Three Core Behavioral Attributes(三大核心行为属性)

  • Reactive:Need to maintain continuous interaction with the environment and respond to environmental changes in a timely manner (ensuring the timeliness of responses). The real environment is dynamic and information is incomplete, so agents need to consider the possibility of action failure and judge whether the action is worth executing. However, programs in a fixed environment (e.g., the operating environment of a compiler) do not require such responsiveness and can be executed blindly.(反应式:需与环境保持持续交互,及时响应环境变化(确保响应具有时效性)。现实环境动态且信息不完整,智能体需考虑行动失败可能性,判断行动是否值得执行;而固定环境(如编译器的运行环境)中的程序无需此类响应能力,可盲目执行。)
  • Pro-active:Goal-driven, not only able to respond to environmental stimuli (e.g., "stimulus-response" rules), but also able to proactively generate and try to achieve goals, identify potential opportunities, rather than being driven only by events.(主动式:以目标为导向(goal-driven),不仅能对环境刺激做出反应(如“刺激-响应”规则),还能主动生成并尝试实现目标,识别潜在机会,而非仅受事件驱动。)
  • Social Ability:In multi-agent environments (e.g., the real world, the Internet), agents need to interact with other agents (or humans) through cooperation, coordination, and negotiation. Some goals can only be achieved through social interaction.(社交能力:在多智能体环境(如现实世界、互联网)中,智能体需通过协作(cooperation)、协调(coordination)、协商(negotiation)与其他智能体(或人类)交互,部分目标仅能通过社交实现。)
  • Cooperation:Working together as a team to achieve a common goal, applicable when a single agent cannot achieve the goal independently or cooperation can improve efficiency (e.g., obtaining results faster).(协作:团队协作实现共同目标,适用于单个智能体无法独立完成目标或协作可提升效率(如更快获得结果)的场景。)
  • Coordination:Managing dependencies between different activities. For example, players in a football match need to coordinate to complete team tasks, and RoboSoccer is a typical case.(协调:管理不同活动间的依赖关系,如足球比赛中队员需协调配合以完成团队任务,机器人足球(RoboSoccer)是典型案例。)
  • Negotiation:Reaching agreements on matters of common interest through a "proposal-counter-proposal" process to achieve compromise. For example, roommates negotiating TV usage time ("watch football tonight, watch a movie tomorrow").(协商:针对共同利益达成协议,通过“提议-反提议”过程实现妥协,如室友间协商电视使用时间(“今晚看足球,明天看电影”)。)

2. Other Key Characteristics(其他关键属性)

  • Mobility:Software agents can move in electronic networks, while robots can move in a nondeterministic environment.(移动性:软件智能体可在电子网络中移动,机器人则可在非确定性环境(nondeterministic environment)中移动。)
  • Rationality:An agent’s actions need to be goal-oriented and will not intentionally take actions that hinder the achievement of goals.(理性:智能体的行动需以实现目标为导向,不会故意采取阻碍目标实现的行动。)
  • Veracity:An agent will not knowingly communicate false information.(诚实性:智能体不会故意传递虚假信息。)
  • Benevolence:An agent's inherent willingness to help others presupposes that its goals do not conflict with theirs.(善意性:智能体帮助他人的内在意愿以各智能体间目标不冲突为前提。)
  • Learning/Adaption:Refers to whether an agent can improve its performance over time.(学习/适应性:指智能体是否能随时间推移提升性能。)

五、Agent Control Architecture and Task Planning(智能体控制架构与任务规划)

1. Agent State and Control Loop(智能体状态与控制循环)

  • Definition of State:An agent has an "internal state" used to record the environment state and interaction history. Let the set of all internal states be \(I\).(智能体存在“内部状态(internal state)”,用于记录环境状态与交互历史,设所有内部状态的集合为\(I\)。)
  • Control Loop Process(控制循环流程)
  1. The agent starts from the initial internal state \(i_0\);(智能体从初始内部状态\(i_0\)启动;)
  2. It observes the environment state \(e\) and generates the percept \(see(e)\) through the "see function";(观察环境状态\(e\),通过“感知函数(see function)”生成感知信息\(see(e)\);)
  3. It updates the internal state to \(next(i_0, see(e))\) through the "next function";(通过“状态更新函数(next function)”将内部状态更新为\(next(i_0, see(e))\);)
  4. It determines and executes the action \(action(next(i_0, see(e)))\) through the "action function";(通过“行动选择函数(action function)”确定行动\(action(next(i_0, see(e)))\)并执行;)
  5. It returns to step 2 to enter the next cycle.(返回步骤2,进入下一轮循环。)

2. Definition of Core Functions(核心函数定义)

  • See Function (Perception Function):The agent’s ability to observe the environment, with a mapping relationship of \(see: E \to Per\) (where \(E\) is the set of environment states and \(Per\) is the set of percepts), and the output is percepts.(感知函数:智能体观察环境的能力,映射关系为\(see: E \to Per\)(\(E\)为环境状态集合,\(Per\)为感知信息集合),输出为感知信息。)
  • Action Function (Action Selection Function):The agent’s decision-making ability, with a mapping relationship of \(action: I \to Ac\) (where \(Ac\) is the set of actions), mapping from internal states to specific actions.(行动选择函数:智能体的决策能力,映射关系为\(action: I \to Ac\)(\(Ac\)为行动集合),从内部状态映射到具体行动。)
  • Next Function (State Update Function):The agent’s ability to update its understanding of the environment, with a mapping relationship of \(next: I \times Per \to I\), generating a new internal state by combining the current internal state and new percepts.(状态更新函数:智能体更新环境认知的能力,映射关系为\(next: I \times Per \to I\),结合当前内部状态与新感知信息,生成新的内部状态。)
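The control loop and the three abstract functions above can be sketched as a small Python routine. The toy environment, percept, and internal-state types below are illustrative assumptions, not part of the course material:

```python
def run_agent(i0, see, next_fn, action, env_states):
    """Run the abstract agent control loop over observed environment states.

    i0      : initial internal state (an element of I)
    see     : E -> Per, the perception function
    next_fn : (I, Per) -> I, the state update function
              (named next_fn because 'next' shadows a Python builtin)
    action  : I -> Ac, the action selection function
    Returns the list of actions chosen at each step.
    """
    i = i0
    actions = []
    for e in env_states:
        p = see(e)                  # step 2: observe, generating see(e)
        i = next_fn(i, p)           # step 3: update the internal state
        actions.append(action(i))   # step 4: select and execute the action
    return actions                  # (loop repeats for each new e)

# Toy instantiation (assumed for illustration): the environment state is a
# number, the percept is its sign, the internal state counts positive
# percepts, and the action reports that count.
acts = run_agent(
    i0=0,
    see=lambda e: e > 0,
    next_fn=lambda i, p: i + (1 if p else 0),
    action=lambda i: f"count={i}",
    env_states=[3, -1, 5],
)
print(acts)  # ['count=1', 'count=1', 'count=2']
```

Note that the agent's behaviour depends on its whole interaction history only through the internal state \(i\): the three functions are pure mappings, exactly as in the abstract definitions.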

3. Agent Task Planning and Utility Functions(智能体任务规划与效用函数)

  • Core Requirement of Task Planning:Humans only need to tell the agent "what to do" without telling it "how to do it", which requires specific mechanisms to achieve this goal.(人类只需告知智能体“做什么(what to do)”,无需告知“怎么做(how to do it)”,需通过特定机制实现此目标。)
  • Definition of Utility Function:By assigning "utility values" to "environment states that we want the agent to achieve", the agent’s task is transformed into "achieving states that maximize utility", with a mapping relationship of \(u: E \to \mathbb{R}\) (where \(\mathbb{R}\) is the set of real numbers), i.e., each environment state corresponds to a real-number utility value.(效用函数:通过为“希望智能体实现的环境状态”分配“效用值(utility value)”,使智能体的任务转化为“实现效用最大化的状态”,映射关系为\(u: E \to \mathbb{R}\)(\(\mathbb{R}\)为实数集合),即每个环境状态对应一个实数效用值。)
  • Local Utility Functions(局部效用函数)
  • Calculation Methods of Run Value:For the "run" of an agent, the value can be calculated by "minimum utility of state", "maximum utility of state", "sum of state utilities", "average of state utilities", etc.(运行价值计算方式:针对智能体的“运行过程(run)”,价值可通过“状态最小效用”“状态最大效用”“状态效用总和”“状态效用平均值”等方式计算。)
  • Limitation:Assigning utilities to individual states makes it difficult to reflect the "long-term perspective", so a "discount mechanism (e.g., discounting future rewards in reinforcement learning)" needs to be introduced to reduce the utility weight of long-term states.(局限性:为单个状态分配效用难以体现“长期视角”,需引入“折扣机制(如强化学习中的未来奖励折扣)”,降低远期状态的效用权重。)
  • Case Calculation (4x3 Deterministic Environment)
    • Environment Setting:Default state reward \(r=-0.04\), target state reward \(r=+1\), environment determinism \(p=1.0\) (the agent will definitely reach the expected position);(环境设定:默认状态奖励\(r=-0.04\),目标状态奖励\(r=+1\),环境确定性\(p=1.0\)(智能体必达预期位置);)
    • Optimal Action Sequence:[Up, Up, Right, Right, Right];(最优行动序列:[Up, Up, Right, Right, Right];)
    • Cumulative Reward Calculation:\(r=(-0.04 \times 4) + 1.0 = 1.0 - 0.16 = 0.84\). The negative reward (\(-0.04\)) is used to encourage the agent to reach the target as soon as possible.(累加奖励计算:\(r=(-0.04 \times 4) + 1.0 = 1.0 - 0.16 = 0.84\),负奖励(\(-0.04\))的作用是激励智能体尽快到达目标。)
  • Extension to Non-Deterministic Environment:When the environment is non-deterministic (e.g., each action reaches its intended cell with probability \(p=0.8\) and slips sideways with probability \(0.1\) to either side), the value of a plan must account for all paths that reach the target. For the 5-action sequence above, the direct path succeeds with probability \(0.8^5 = 0.32768\), and one additional slip path around the obstacle contributes \(0.1^4 \times 0.8 = 0.00008\), giving a total probability of \(p = 0.32776\). The utility value therefore needs to be dynamically calculated from path probabilities, and reinforcement learning is built on this model.(非确定性环境扩展:当环境非确定(如每步行动以\(p=0.8\)到达预期位置,以各\(0.1\)的概率向两侧滑移),需计算所有能到达目标的路径的总概率——对上述5步行动序列,直接路径成功概率为\(0.8^5 = 0.32768\),另有一条绕过障碍物的滑移路径贡献\(0.1^4 \times 0.8 = 0.00008\),总概率为\(p = 0.32776\);效用值需结合路径概率动态计算,强化学习(Reinforcement Learning)基于此模型构建。)
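The arithmetic of both cases can be checked in a few lines of Python. The reward values and probabilities are those given above; the slip-path term corresponds to the non-deterministic setting:

```python
# Deterministic 4x3 grid: 4 intermediate steps at r = -0.04 each, then the
# goal state at r = +1.0, following [Up, Up, Right, Right, Right].
step_reward, goal_reward, n_steps = -0.04, 1.0, 4
cumulative = step_reward * n_steps + goal_reward
print(round(cumulative, 2))   # 0.84

# Non-deterministic case: each action succeeds with p = 0.8 and slips
# sideways with p = 0.1 per side. The direct path needs all 5 actions
# to succeed; one slip path around the obstacle also reaches the goal.
p_direct = 0.8 ** 5           # 0.32768
p_slip = 0.1 ** 4 * 0.8       # 0.00008
p_total = p_direct + p_slip
print(round(p_total, 5))      # 0.32776
```

The small per-step penalty makes shorter runs strictly more valuable, which is exactly why the negative default reward pushes the agent toward the goal.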

六、Core Challenges and Control Architectures of Mobile Robotics(移动机器人的核心挑战与控制架构)

1. Core Challenges and Solution Elements(核心挑战与解决要素)

  • Three Core Questions:Continuing the core questions of agent-environment interaction—"Localisation (Where am I?)", "Goal (Where am I going?)", and "Path (How do I get there?)".(延续智能体与环境交互的核心问题——“定位(Where am I?)”“目标(Where am I going?)”“路径(How do I get there?)”。)
  • Key Solution Elements(关键解决要素)
  • Locomotion and Kinematics:Enabling robot movement, balancing "manoeuvrability" and "control difficulty";(移动与运动学:实现机器人移动,平衡“机动性(manoeuvrability)”与“控制难度”;)
  • Perception:Enabling robots to "perceive the environment", handling the uncertainty of sensor input and environmental changes, and balancing "cost", "data volume", and "computation volume";(感知:使机器人“感知环境”,处理传感器输入的不确定性与环境变化,平衡“成本”“数据量”与“计算量”;)
  • Localisation and Mapping:Determining the robot’s position, building an environment map, and clarifying the relative positions of environmental features;(定位与地图构建:确定机器人位置,构建环境地图,明确环境特征的相对位置;)
  • Planning and Navigation:Planning a path to the target, realizing path tracking, and avoiding obstacles.(规划与导航:规划到达目标的路径,实现路径跟踪,规避障碍物。)
  • Additional Difficulties(额外难点)
  • Dynamic Environment:The environment changes continuously (e.g., object occlusion), and no concise model is available. It is necessary to solve problems such as "environment representation method (e.g., cells/items)", "resolution selection", and "state explosion";(动态环境:环境持续变化(如物体遮挡),无简洁模型可用,需解决“环境表示方式(如单元格/物品)”“分辨率选择”“状态爆炸”等问题;)
  • Multiple Uncertainties:Sensors have errors, and the process of extracting information from sensors also has errors, leading to uncertainty in environmental cognition.(多源不确定性:传感器存在误差,传感器信息提取过程也存在误差,导致环境认知的不确定性。)

2. Implementation Logic of Localisation, Perception and Navigation(定位、感知与导航的实现逻辑)

  • Perception and Localisation:Robots extract "features" of the environment through sensors, and maps record the relative relationships between features. The localisation process is "identifying features → matching positions in the map where these features can be observed" to determine their own position.(感知与定位:机器人通过传感器提取环境“特征(features)”,地图记录特征间的相对关系;定位过程即“识别特征→匹配地图中可观察该特征的位置”,实现自身位置确定。)
  • Navigation:Planning a path based on the map (e.g., marking obstacles and targets through "cells with distance values"), and avoiding sudden obstacles in real time during path execution to ensure path effectiveness.(导航:结合地图规划路径(如通过“距离值单元格”标记障碍物与目标),在路径执行过程中实时规避突发障碍物,确保路径有效性。)
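The "cells with distance values" idea can be sketched as a breadth-first wavefront expanded from the goal; the grid encoding below ('#' for obstacles, '.' for free cells) is an illustrative assumption:

```python
from collections import deque

def distance_grid(grid, goal):
    """Assign a distance value to every reachable free cell via BFS from
    the goal (a wavefront sketch of cell-based path planning).

    grid : list of equal-length strings; '#' marks an obstacle, '.' a free cell
    goal : (row, col) of the target cell
    Returns a dict mapping each reachable cell to its step distance.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    frontier = deque([goal])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1  # one step further out
                frontier.append((nr, nc))
    return dist

# From any start cell, repeatedly stepping to a neighbour with a smaller
# distance value yields a shortest obstacle-free path to the goal.
d = distance_grid(["....", ".#..", "...."], (0, 3))
print(d[(2, 0)])  # 5
```

Obstacle cells never receive a value, so sudden obstacles can be handled by re-running the wavefront on the updated grid, which matches the "re-plan around unexpected obstacles" behaviour described above.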

3. Mainstream Control Architectures(主流控制架构)

  • Classical/Deliberative Architecture
  • Core Features:Complete environment modeling, function-based, horizontal decomposition;(核心特征:完整环境建模(complete modelling)、基于函数(function based)、水平分解(horizontal decomposition);)
  • Process:Sensors → Perception → Localisation/Map Building → Cognition/Planning → Motion Control → Actuators.(流程:传感器(Sensors)→感知(Perception)→定位/地图构建(Localisation/Map Building)→认知/规划(Cognition/Planning)→运动控制(Motion Control)→执行器(Actuators)。)
  • Behaviour Based Architecture
  • Core Features:Sparse or no modeling, behaviour-based, vertical decomposition, bottom-up;(核心特征:稀疏建模或无建模(sparse or no modelling)、基于行为(behaviour based)、垂直分解(vertical decomposition)、自下而上(bottom up);)
  • Process:Sensors → Parallel processing of multiple behaviors (e.g., "Discover new area", "Detect goal position", "Avoid Obstacles", "Follow right/left wall") → Coordination and Fusion (e.g., fusion via vector summation) → Actuators.(流程:传感器(Sensors)→多行为并行处理(如“发现新区域”“检测目标位置”“规避障碍物”“沿墙行驶”)→协调与融合(如向量求和融合)→执行器(Actuators)。)
  • Hybrid Architecture
  • Core Features:Combines the advantages of the two architectures above; there is no single optimal way to combine them. A typical model is "low-level modules (localisation, obstacle avoidance, data collection) adopt a behaviour-based architecture, and high-level cognitive modules (planning, map building) adopt a deliberative architecture".(核心特征:结合前两种架构的优势,无统一最优结合方式,典型模式为“低层模块(定位、避障、数据收集)采用行为式架构,高层认知模块(规划、地图构建)采用审议式(deliberative)架构”。)
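Coordination by vector summation, the fusion scheme mentioned for the behaviour-based architecture, can be sketched in a few lines. The behaviour names and the specific vectors are illustrative assumptions:

```python
def fuse(vectors):
    """Sum the (vx, vy) motion vectors proposed by all active behaviours
    into a single command for the actuators (vector-summation fusion)."""
    vx = sum(v[0] for v in vectors)
    vy = sum(v[1] for v in vectors)
    return (vx, vy)

# Each behaviour runs in parallel and independently proposes a motion
# vector; e.g. goal seeking pulls right while obstacle avoidance pushes up.
goal_seeking = (1.0, 0.0)    # "Detect goal position" behaviour
avoid_obstacle = (0.0, 0.5)  # "Avoid Obstacles" behaviour
command = fuse([goal_seeking, avoid_obstacle])
print(command)  # (1.0, 0.5)
```

No behaviour needs a model of the others: the coordinator alone resolves their combined effect, which is what makes the decomposition vertical and bottom-up.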

七、Course Content Summary(课程内容总结)

  • Part 1:Defines the concept of robots, analyzes the connotation of autonomy, and compares the differences between autonomous robots and tele-operated robots; introduces the concept of agents, and explains the definition, attributes of agents, differences from objects, and the relationship between agents and AI.(第一部分:定义机器人概念,解析自主性的内涵,对比自主机器人与远程操作机器人的差异;引入智能体概念,阐述智能体的定义、属性、与对象的差异,以及智能体与AI的关系。)
  • Part 2:Focuses on the core behavioral attributes (reactive, pro-active, social ability) and other key attributes (mobility, rationality, etc.) of autonomous agents; subsequent modules will deeply explore the "attributes of physical agents (i.e., robots)".(第二部分:聚焦智能体的核心行为属性(反应式、主动式、社交能力)与其他关键属性(移动性、理性等);后续模块将深入探讨“物理智能体(即机器人)的属性”。)
  • Part 4:Identifies the core challenges of mobile robotics, proposes key elements to solve the challenges, and introduces three control architectures (classical, behaviour-based, hybrid), providing an implementation framework for autonomous robot control.(第四部分:明确移动机器人的核心挑战,提出解决挑战的关键要素,介绍经典、行为式、混合三种控制架构,为机器人自主控制提供实现框架。)
  • Core Reading:Chapter 1 of "The robotics primer" by Matarić, Maja J. It is concise and covers the definition of robots, the concept of autonomy, and compares autonomous robots with tele-operated robots.(核心阅读:Matarić, Maja J. 所著《The robotics primer》第一章,内容简短,涵盖机器人定义、自主性概念,对比自主机器人与远程操作机器人。)
  • Supplementary Reading on Agents and Autonomy:"An Introduction to MultiAgent Systems" by Mike Wooldridge, focusing on "2.5 Abstract Agent Architectures and Runs".(智能体与自主性补充阅读:Mike Wooldridge 所著《An Introduction to MultiAgent Systems》,重点阅读“2.5 抽象智能体架构与运行(abstract agent architectures and runs)”。)
  • Reference Materials on Probabilistic Robotics:Course materials of "Introduction to Mobile Robotics" by Professor Wolfram Burgard from the University of Freiburg, Germany. The website is http://ais.informatik.uni-freiburg.de/teaching/ss19/robotics/.(概率机器人参考资料:德国弗莱堡大学(University of Freiburg)Wolfram Burgard教授的“Introduction to Mobile Robotics”课程资料,网址为http://ais.informatik.uni-freiburg.de/teaching/ss19/robotics/。)