Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Concepts:
Mesos sharing improves cluster utilization and avoids
per-framework data replication.
Mesos shares resources in a fine-grained manner,
allowing frameworks to achieve data locality by taking
turns reading data stored on each machine.
Mesos introduces a distributed two-level scheduling mechanism called
resource offers.
Mesos decides how many resources to offer each framework,
while frameworks decide which resources to accept and which
computations to run on them.
Purpose:
Organizations will want to run multiple frameworks in the
same cluster, picking the best one for each application.
Sharing a cluster between frameworks improves utilization and
allows applications to share access to large datasets that
may be too costly to replicate.
(Different frameworks use the same dataset.)
Solution concerns:
1. Support a wide array of both current and future frameworks,
each of which will have different scheduling needs based
on its programming model, communication pattern, task dependencies,
and data placement.
2. Be highly scalable.
3. Be fault-tolerant and highly available.
Design decision:
Delegating control over scheduling to the frameworks.
This is accomplished through a new abstraction,
called a resource offer, which encapsulates a bundle of
resources that a framework can allocate on a cluster node to run tasks.
Mesos decides how many resources to offer each framework,
based on an organizational policy such as fair sharing,
while frameworks decide which resources to accept
and which tasks to run on them.
Mesos allows framework developers to build specialized
frameworks targeted at particular problem domains rather
than one-size-fits-all abstractions.
Mesos aims to enable fine-grained sharing between multiple
cluster computing frameworks, while giving these frameworks
enough control to achieve placement goals such as data locality.
(notes:)
1. Mesos provides resources. A resource represents
what a single node in the cluster can offer.
2. Mesos decides how many resources to offer each framework.
3. The framework decides which resources to take (i.e., which nodes in the cluster).
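The split of responsibilities in the notes above can be illustrated with a minimal sketch. It is not Mesos code: the `Offer` shape, `master_choose_offers`, `framework_choose`, and the `quota` parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Offer:
    """A bundle of free resources on one cluster node (hypothetical shape)."""
    node: str
    cpus: int
    mem_gb: int


def master_choose_offers(free: List[Offer], quota: int) -> List[Offer]:
    """Level 1: the master decides HOW MANY resources to offer a framework."""
    return free[:quota]


def framework_choose(offers: List[Offer], preferred_nodes: Set[str]) -> List[Offer]:
    """Level 2: the framework decides WHICH of the offered resources to accept."""
    return [o for o in offers if o.node in preferred_nodes]


free = [Offer("node1", 4, 8), Offer("node2", 2, 4), Offer("node3", 4, 8)]
offered = master_choose_offers(free, quota=2)
accepted = framework_choose(offered, preferred_nodes={"node2", "node3"})
print([o.node for o in accepted])  # ['node2']
```

Note that node3 is never accepted here even though the framework prefers it, because the master did not include it in this offer; the framework can only pick from what it is offered.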
How Mesos achieves isolation, scalability, and fault tolerance:
Design Philosophy:
1. Define a minimal interface that enables efficient
resource sharing across frameworks.
2. 'Push' control of task scheduling and execution to the
frameworks.
3. Common functionality is provided as a 'library' built on top of Mesos instead of being part of the Mesos API.
Design detail:
1. A master process that manages the slave daemons.
(Mesos has one master process, which can be placed anywhere.)
2. Slave daemons running on each cluster node,
and frameworks that run tasks on these slaves.
(Mesos runs one daemon process on each cluster node, which takes direction from the master Mesos process.)
3. The master implements fine-grained sharing across
frameworks using 'resource offers'.
4. Each resource offer contains a list of free resources
on multiple slaves.
5. The master decides 'how many' resources
(a resource corresponds to capacity on one node in the cluster)
to offer to each framework according to a given organizational policy.
6. To support a diverse set of policies,
the master employs a modular architecture that makes
it easy to add new allocation modules via a plugin mechanism.
(A plug-in module implements the resource offer policy.)
7. A framework running on top of Mesos consists of two
components:
*a scheduler that registers with the master
to be offered resources,
*and an executor process that is launched on
slave nodes to run the framework’s tasks.
8. While the master determines how many resources are offered
to each framework, the frameworks' schedulers select
which of the offered resources to use. When a framework
accepts offered resources, it passes to Mesos a description
of the tasks it wants to run on them. In turn,
Mesos launches the tasks on the corresponding slaves.
9. A 'framework' has the ability to 'reject' offers.
It does so when the resources offered by the Mesos master
do not satisfy its constraints, such as data locality.
The Mesos master only uses the policy plugin to pick which resources
to offer to which 'framework'. The 'framework' uses the reject mechanism
to weed out unsuitable resources, such as a resource (i.e., a node)
without data locality.
With "delay scheduling", the 'framework' waits for a limited time for an
offer on a node that stores its data before it accepts a resource.
10. Two mechanisms: resource offer allocation
(performed by allocation modules in the master),
and resource isolation (performed by slaves).
One cluster node runs the master Mesos process, which manages the slave
Mesos daemons on the other nodes.
The master implements resource offers to the 'frameworks'.
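Item 9 above (rejecting offers, combined with delay scheduling) can be sketched as follows. This is a simplified illustration, not the Mesos scheduler API: the `DelayScheduler` class, `on_offer`, and `max_skips` are hypothetical, and the wait is counted in rejected offers rather than in elapsed time to keep the example deterministic.

```python
class DelayScheduler:
    """Hypothetical framework scheduler that prefers nodes holding its data."""

    def __init__(self, data_nodes, max_skips):
        self.data_nodes = set(data_nodes)  # nodes that store the input data
        self.max_skips = max_skips         # how many non-local offers to reject
        self.skips = 0

    def on_offer(self, node):
        """Return True to accept the offer, False to reject it."""
        if node in self.data_nodes:
            self.skips = 0
            return True                    # data-local offer: accept
        if self.skips >= self.max_skips:
            self.skips = 0
            return True                    # waited long enough: give up on locality
        self.skips += 1
        return False                       # reject and wait for a better offer


sched = DelayScheduler(data_nodes={"node3"}, max_skips=2)
decisions = [sched.on_offer(n) for n in ["node1", "node2", "node1", "node3"]]
print(decisions)  # [False, False, True, True]
```

The bounded wait is the point of the design: the framework trades a short delay for a good chance at a data-local node, but it never waits indefinitely.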
Master/Slave functionalities:
- Master:
- Allocation policy: decides how many resources to offer to a particular 'framework'.
- Resource offer allocation is done by a pluggable allocation module.
- Resources can be reallocated quickly because most tasks are short.
- Resources are only reallocated when tasks finish.
- If resources are not freed quickly enough, the allocation module also has the ability to revoke (kill) tasks.
- Before killing a task, Mesos gives its framework a grace period to clean it up.
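- Isolation:
- Isolation modules: multiple executors can run on a single node.
- Each executor belongs to one 'framework'.
- A 'framework' can install filters on the master to screen the resources offered to it.
- When the master offers resources to a 'framework', it first marks those resources and makes them available to that framework. If the 'framework' does not accept the offer within some period, the master revokes the offer and gives the resources to other 'frameworks'.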
- Soft-state design:
- The master can completely reconstruct its internal state from the periodic messages it gets from the slaves and from the framework schedulers.
- A hot-standby design, where the master is shadowed by several backups that are ready to take over when the master fails.
- Allows a framework to register multiple schedulers such that when one fails, another one is notified by the Mesos master to take over.
- Slave:
- delay scheduling
- Reject offered resources
- resource isolation
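The pluggable allocation module listed under the master can be pictured with a small sketch. Everything here is an assumption made for illustration: the `next_framework` interface, the `FairSharing` and `StrictPriority` classes, and the `reallocate` helper are hypothetical, and resources are reduced to a single integer count per framework. Fair sharing is the policy named in these notes; strict priority is added only as a second example policy.

```python
from typing import Dict, List


class FairSharing:
    """Hypothetical module: favor the framework holding the fewest resources."""

    def next_framework(self, allocated: Dict[str, int]) -> str:
        # sorted() makes ties deterministic (alphabetical order).
        return min(sorted(allocated), key=lambda name: allocated[name])


class StrictPriority:
    """Hypothetical module: always favor the highest-priority framework."""

    def __init__(self, order: List[str]) -> None:
        self.order = order

    def next_framework(self, allocated: Dict[str, int]) -> str:
        return next(name for name in self.order if name in allocated)


def reallocate(freed: int, allocated: Dict[str, int], module) -> Dict[str, int]:
    """Give each resource freed by a finished task to the module's pick."""
    result = dict(allocated)  # leave the caller's mapping untouched
    for _ in range(freed):
        result[module.next_framework(result)] += 1
    return result


current = {"hadoop": 3, "mpi": 1}
fair = reallocate(2, current, FairSharing())
prio = reallocate(2, current, StrictPriority(["hadoop", "mpi"]))
print(fair)  # {'hadoop': 3, 'mpi': 3}
print(prio)  # {'hadoop': 5, 'mpi': 1}
```

Because the master only calls `next_framework`, swapping the policy does not touch the reallocation loop, which is the property the plugin mechanism is meant to provide.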