Planning and learning for weakly-coupled distributed agents

Stochastic planning has gained popularity over classical planning in recent years by offering principled ways to model the uncertainty inherent in many real-world applications. Markov decision process (MDP) based approaches are particularly well suited to problems that involve sequential decision making. Partially observable stochastic games (POSGs) are a multi-agent extension of the MDP framework to decentralized decision making. While elegant and expressive, this model has been shown to be intractable. Many real-world problems exhibit additional structure that can be leveraged for computational gain. The focus of this thesis is on a class of problems in which the agents are largely independent but may interact by constraining each other's actions. The best-known solution concepts in game theory are computing a Nash equilibrium and performing strategy elimination; this dissertation addresses both. First, we show that finding a Nash equilibrium where each agent achieves reward at least k is NP-hard. Then, we present the Multi-Agent Action Elimination (MAAE) algorithm, which performs iterated elimination of dominated strategies. We show experimentally that it is able to solve much larger problems than the general-purpose algorithm by pruning a large portion of the policy space. In situations where the agent does not have a model of the environment, it has to learn to act with just the reward as feedback. In this dissertation, we explore reinforcement learning based approaches. In particular, we develop the Bound-Based Exploration with Soft Pruning (BBESP) algorithm, a Q-learning based algorithm that carries over many of the key ideas from our planning algorithm. It uses learned bounds to perform soft pruning of actions as well as to focus an agent's exploration. We show that this approach learns faster than traditional Q-learning on this class of problems.
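
The record itself contains no code, but to make the bound-based soft-pruning idea from the abstract concrete, the following is a minimal, hypothetical Python sketch of tabular Q-learning in which actions whose learned upper bound falls below the best learned lower bound are explored only rarely. The toy chain MDP, the bound-update rule, and the pruning probability are illustrative assumptions made for this sketch; it is not the thesis's actual MAAE or BBESP algorithm.

import random

N_STATES, N_ACTIONS = 5, 3
GAMMA, ALPHA, EPISODES = 0.9, 0.1, 500
R_MIN, R_MAX = 0.0, 1.0
V_MIN, V_MAX = R_MIN / (1 - GAMMA), R_MAX / (1 - GAMMA)

def step(s, a):
    # Toy chain MDP (an assumption for this sketch): action 0 moves right,
    # other actions stay put; reaching the last state gives reward 1 and ends.
    s2 = min(s + 1, N_STATES - 1) if a == 0 else s
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

# Point estimate of Q(s, a) plus optimistic (hi) and pessimistic (lo) bounds.
Q    = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
Q_hi = [[V_MAX] * N_ACTIONS for _ in range(N_STATES)]
Q_lo = [[V_MIN] * N_ACTIONS for _ in range(N_STATES)]

def select_action(s, eps=0.1, pruned_prob=0.01):
    # Soft pruning: actions whose upper bound is below the best lower bound are
    # not removed outright, only explored with a much smaller probability.
    best_lo = max(Q_lo[s])
    active = [a for a in range(N_ACTIONS) if Q_hi[s][a] >= best_lo]
    pruned = [a for a in range(N_ACTIONS) if Q_hi[s][a] < best_lo]
    if pruned and random.random() < pruned_prob:
        return random.choice(pruned)
    if random.random() < eps:
        return random.choice(active)
    return max(active, key=lambda a: Q[s][a])

for _ in range(EPISODES):
    s, done = 0, False
    while not done:
        a = select_action(s)
        s2, r, done = step(s, a)
        # Standard Q-learning backups; the bounds are backed up the same way
        # from their own tables, so Q_lo <= Q <= Q_hi is preserved.
        nxt, nxt_hi, nxt_lo = (0.0, 0.0, 0.0) if done else (max(Q[s2]), max(Q_hi[s2]), max(Q_lo[s2]))
        Q[s][a]    += ALPHA * (r + GAMMA * nxt    - Q[s][a])
        Q_hi[s][a] += ALPHA * (r + GAMMA * nxt_hi - Q_hi[s][a])
        Q_lo[s][a] += ALPHA * (r + GAMMA * nxt_lo - Q_lo[s][a])
        s = s2

print("greedy action in start state:", max(range(N_ACTIONS), key=lambda a: Q[0][a]))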

Detailed description

Main author: Guo, Anyuan
Format: Thesis
Degree: Doctor of Philosophy
Language: en_US
Published: University of Massachusetts Amherst, 2008
Online access: http://ir.vnulib.edu.vn/handle/123456789/1714