Planning and learning for weakly-coupled distributed agents
Main author: Guo, Anyuan
Format: Thesis (Doctor of Philosophy)
Language: en_US
Publisher: University of Massachusetts Amherst, 2008
Online access: http://ir.vnulib.edu.vn/handle/123456789/1714
Abstract:
Stochastic planning has gained popularity over classical planning in recent years by offering principled ways to model the uncertainty inherent in many real world applications. In particular, Markov decision process (MDP) based approaches are particularly well suited to problems that involve sequential decision making.
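The sequential decision making that MDPs capture can be illustrated with value iteration, the standard dynamic-programming solution method. The tiny two-state MDP below is purely hypothetical (its states, actions, transition probabilities, and rewards are not from the dissertation); it is only a minimal sketch of how Bellman backups compute optimal state values.

```python
GAMMA = 0.9  # discount factor

# transitions[state][action] = list of (probability, next_state, reward)
# A made-up two-state MDP for illustration only.
transitions = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go":   [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 2.0)],
        "go":   [(1.0, "s0", 0.0)],
    },
}

def value_iteration(transitions, gamma=GAMMA, tol=1e-6):
    """Compute optimal state values by repeated Bellman optimality backups."""
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            # max over actions of the expected one-step return plus
            # the discounted value of the successor state
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(transitions)
```

Here "stay" in s1 earns 2 forever, so V(s1) = 2 / (1 - 0.9) = 20, and V(s0) follows from the "go" backup.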
Partially observable stochastic games (POSGs) are a multi-agent extension of the MDP framework to decentralized decision making. While elegant and expressive, this model has been shown to be intractable. Many real world problems exhibit additional structure that can be leveraged for computational gain. The focus of this thesis is on a class of problems in which the agents are largely independent but may interact by constraining each other’s actions.
Two of the best-known solution concepts in game theory are computing a Nash equilibrium and performing strategy elimination. This dissertation addresses both of these solution concepts. First, we show that finding a Nash equilibrium in which each agent achieves reward at least k is NP-hard. Then, we present the Multi-Agent Action Elimination (MAAE) algorithm, which performs iterated elimination of dominated strategies. We show experimentally that it is able to solve much larger problems than the general-purpose algorithm by pruning a large portion of the policy space.
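The classic routine underlying this approach can be sketched on an ordinary normal-form game. The payoff matrix below is a made-up prisoner's-dilemma-style example, and the code shows plain iterated elimination of strictly dominated pure strategies, not the MAAE algorithm itself, which operates over the policy spaces of weakly-coupled agents.

```python
# payoffs[(row_strategy, col_strategy)] = (row player's payoff, column player's payoff)
# Hypothetical prisoner's-dilemma-style payoffs for illustration.
payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def eliminate(row_strats, col_strats, payoffs):
    """Iteratively delete pure strategies that are strictly dominated,
    re-checking after every deletion until nothing more can be removed."""
    row_strats, col_strats = list(row_strats), list(col_strats)
    changed = True
    while changed:
        changed = False
        for r in row_strats[:]:
            # r is dominated if some other row does strictly better
            # against every remaining column strategy
            if any(all(payoffs[(r2, c)][0] > payoffs[(r, c)][0] for c in col_strats)
                   for r2 in row_strats if r2 != r):
                row_strats.remove(r)
                changed = True
        for c in col_strats[:]:
            if any(all(payoffs[(r, c2)][1] > payoffs[(r, c)][1] for r in row_strats)
                   for c2 in col_strats if c2 != c):
                col_strats.remove(c)
                changed = True
    return row_strats, col_strats

surviving = eliminate(["C", "D"], ["C", "D"], payoffs)
```

In this example defection strictly dominates cooperation for both players, so elimination leaves only ("D", "D") — the pruning collapses the game without ever enumerating joint strategy profiles.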
In situations where the agent does not have a model of the environment, it has to learn to act with only the reward as feedback. In this dissertation, we explore reinforcement-learning-based approaches. In particular, we develop the Bound-Based Exploration with Soft Pruning (BBESP) algorithm. It is a Q-learning-based algorithm that carries over many of the key ideas from our planning algorithm. It uses learned bounds to perform soft pruning of actions as well as to focus an agent's exploration. We show that this approach learns faster than traditional Q-learning on this class of problems.
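The general idea of bound-based soft pruning can be sketched on top of ordinary Q-learning. Everything below is hypothetical: the tiny chain environment, the one-step upper-bound backup (valid here only because the toy environment is deterministic), and all constants are illustrative stand-ins, not the dissertation's actual BBESP bound computations.

```python
import random

random.seed(0)

ACTIONS = ("left", "right")
ALPHA, GAMMA, RMAX = 0.5, 0.9, 1.0
UPPER = RMAX / (1 - GAMMA)  # a priori upper bound on any discounted return

def step(s, a):
    """Deterministic 4-state chain: reward 1 for reaching state 3, then reset to 0."""
    s2 = min(s + 1, 3) if a == "right" else max(s - 1, 0)
    return (0 if s2 == 3 else s2), (1.0 if s2 == 3 else 0.0)

Q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}    # lower estimates
U = {(s, a): UPPER for s in range(4) for a in ACTIONS}  # learned upper bounds

def candidates(s):
    """Soft pruning: skip actions whose upper bound falls below the best
    lower bound at this state. Since U >= Q* >= Q throughout, the
    optimal action is never pruned."""
    best_lower = max(Q[(s, a)] for a in ACTIONS)
    return [a for a in ACTIONS if U[(s, a)] >= best_lower]

s = 0
for _ in range(5000):
    a = random.choice(candidates(s))  # exploration restricted to unpruned actions
    s2, r = step(s, a)
    # ordinary Q-learning update on the lower estimate
    Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
    # tighten the upper bound with an optimistic one-step backup
    U[(s, a)] = min(U[(s, a)], r + GAMMA * max(U[(s2, b)] for b in ACTIONS))
    s = s2
```

As the bounds tighten, provably suboptimal actions (here, moving away from the goal) drop out of `candidates`, so exploration concentrates on the promising part of the action space — the effect the abstract describes.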