
Actors, MapReduce and Java EE app servers

4 replies
Ant Kutschera
Joined: 2012-01-28

Hi,

Actors can be used to break tasks down into smaller chunks and process
them on multiple cores, or even multiple machines via remote actors.
Additionally, they can wrap mutable data, so you don't need to write
your own locking mechanisms on such data.
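
A minimal sketch of the "wrap mutable data" point, assuming only the JVM standard library. It is not the Scala actors or Akka API; `CounterActor`, `increment`, `get`, `stop` and `CounterDemo` are hypothetical names. A single-threaded executor stands in for the actor's mailbox, so the mutable field is only ever touched by one thread and needs no lock:

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical mini-actor: every message is processed on one mailbox thread.
final class CounterActor {
  private val mailbox = Executors.newSingleThreadExecutor()
  private var count = 0 // mutable state, confined to the mailbox thread

  // Fire-and-forget message: enqueue an increment.
  def increment(): Unit = mailbox.execute(new Runnable {
    def run(): Unit = count += 1
  })

  // Request/reply message: queued behind all earlier messages.
  def get(): Int = mailbox.submit(new Callable[Int] {
    def call(): Int = count
  }).get()

  // Stop accepting messages so the mailbox thread can exit.
  def stop(): Unit = mailbox.shutdown()
}

object CounterDemo {
  // Several sender threads hammer the same actor concurrently.
  def run(senders: Int, perSender: Int): Int = {
    val actor = new CounterActor
    try {
      val threads = (1 to senders).map { _ =>
        new Thread(new Runnable {
          def run(): Unit = (1 to perSender).foreach(_ => actor.increment())
        })
      }
      threads.foreach(_.start())
      threads.foreach(_.join())
      actor.get()
    } finally actor.stop()
  }
}
```

Calling `CounterDemo.run(4, 1000)` sends 4000 increments from four threads. None of them is lost, because the senders never touch `count` directly.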

MapReduce systems like Apache Hadoop also break tasks down into smaller
chunks and process them on multiple cores and multiple servers, but
have the benefit over actors that they work with distributed file
systems and databases, so you get persistence out of the box. As far
as I know, Hadoop does not help with locking on mutable data.
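
For readers unfamiliar with the model, the map and reduce phases can be sketched in memory with plain Scala collections. This is only an illustration of the programming model, not Hadoop's API, and it shows none of the distribution or persistence discussed above; `WordCount`, `mapPhase` and `reducePhase` are hypothetical names:

```scala
object WordCount {
  // Map phase: each chunk is processed independently and emits (word, 1) pairs.
  def mapPhase(chunk: String): Seq[(String, Int)] =
    chunk.toLowerCase.split("\\W+").filter(_.nonEmpty).map(w => (w, 1)).toSeq

  // Reduce phase: group the pairs by word and sum the counts per word.
  def reducePhase(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2).sum) }

  // In a real cluster the chunks would live on different machines;
  // here they are simply elements of a local sequence.
  def run(chunks: Seq[String]): Map[String, Int] =
    reducePhase(chunks.flatMap(mapPhase))
}
```

Because `mapPhase` looks at one chunk at a time and shares no state, the chunks could be handed to different cores or machines without any locking.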

Java EE application servers don't break work down into little tasks;
rather, each single request (say, a call to a web service hosted on
the app server) is a "little" task. Once there is enough load on the
server (i.e. enough calls to the web service), the work is evenly
distributed across multiple cores. You can deploy to a cluster to
scale out and have the processing done on multiple machines.

The question is: if I can choose between Java EE, MapReduce and
Actors, is the only real "use case" for Actors the one where I want
to avoid locking on mutable data?

If that is the case, then why do I read about people using Akka to
support the back end of web servers, and why did someone invest time
to create remote actors, without building a distributed file system?

Cheers,
Ant

Viktor Klang
Joined: 2008-12-17
Re: Actors, MapReduce and Java EE app servers


On Jan 28, 2012 11:16 AM, "Ant Kutschera" <ant [dot] kutschera [at] gmail [dot] com> wrote:
> If that is the case, then why do I read about people using Akka to
> support the back end of web servers, and why did someone invest time
> to create remote actors, without building a distributed file system?

Why did people invest time in creating distributed filesystems without building a distributed actor framework?

Use whatever distributed file system you want with Akka.


Ant Kutschera
Joined: 2012-01-28
Re: Actors, MapReduce and Java EE app servers

You could, but Akka doesn't (1) optimise running a task (i.e.
responding to a message) physically close to where the data is
persisted. You have to program it to do that, which is no simple
task. So compared to Hadoop out-of-the-box, Akka won't run a
persistable MapReduce job as fast.

(1) - I don't know this for sure, but I've not read anything to
suggest Akka has this functionality out of the box.


Viktor Klang
Joined: 2008-12-17
Re: Re: Actors, MapReduce and Java EE app servers

Hi Ant,

On Jan 28, 2012 1:39 PM, "Ant Kutschera" <ant [dot] kutschera [at] gmail [dot] com> wrote:
>
> You could, but Akka doesn't (1) optimise running a task (i.e.
> responding to a message) physically close to where the data is
> persisted.  You have to program it to do that, which is no simple
> task.

As long as you know where to route (consistent hashing etc.), you can very easily create your own router that inspects the messages and forwards them to the right node.
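
A rough sketch of the consistent-hashing part, assuming nothing beyond the JVM standard library. It is not Akka's router API; `HashRing`, `nodeFor`, `replicas` and the node names are hypothetical. Each node is placed on a ring at many virtual positions, and a message key is routed to the first position at or after its own hash:

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest
import java.util.TreeMap

// Hypothetical consistent-hash ring mapping message keys to node names.
final class HashRing(nodes: Seq[String], replicas: Int = 100) {
  private val ring = new TreeMap[Long, String]()

  // Place each node on the ring at `replicas` virtual positions.
  for (node <- nodes; i <- 0 until replicas) ring.put(hash(node + "#" + i), node)

  // First 8 bytes of an MD5 digest, packed into a Long.
  private def hash(key: String): Long = {
    val digest = MessageDigest.getInstance("MD5")
      .digest(key.getBytes(StandardCharsets.UTF_8))
    digest.take(8).foldLeft(0L)((acc, b) => (acc << 8) | (b & 0xffL))
  }

  // Walk clockwise to the first virtual node at or after the key's hash,
  // wrapping around to the start of the ring if there is none.
  def nodeFor(key: String): String = {
    require(!ring.isEmpty, "ring has no nodes")
    val entry = ring.ceilingEntry(hash(key))
    if (entry != null) entry.getValue else ring.firstEntry().getValue
  }
}
```

A custom router would extract a key from each message, call `nodeFor(key)`, and forward the message to that node. When a node is removed, only the keys that mapped to it move; all other keys keep their node, which is what makes the scheme useful for keeping processing close to the data.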

You're going to see much more interesting opportunities in Akka 2.1, 2.2 and 2.3 with built-in clustering, automatic rebalancing, built-in consistent routing, event sourcing etc.

> So compared to Hadoop out-of-the-box, Akka won't run a
> persistable MapReduce job as fast.

First of all, Akka is not an MR framework, and I don't like speculation on performance. So show me some benchmarks, then we can discuss the results.

Cheers,
V


Ant Kutschera
Joined: 2012-01-28
Re: Actors, MapReduce and Java EE app servers

Sounds great, Viktor - thanks. I'll keep an eye out.


Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland