
Re: 2.8 copy method needs improvement

3 replies
Naftoli Gugenheim
Joined: 2008-12-17,

Here's another idea: you write a compiler plugin that for any class with a @copyable annotation generates a copy method. Then if you want to subclass a case class, annotate it with your annotation.

-------------------------------------
adriaN kinG wrote:

The design of the automatically generated copy method, introduced in Scala 2.8
for case classes, is suboptimal.

The main problem with the design of the copy method is that it is incompatible
with inheritance. The copy method is located in the class to be copied, and
provides defaults for its arguments, which are the values for the copied
object's class parameters. Given the way default arguments are implemented, this
design implies that if you generate the copy method for a concrete class A,
there is, in the general case, no consistent way to generate the copy method (as
currently phrased) for a concrete subclass B of A.

There is no shortage of philosophical controversy regarding inheritance in
general, and regarding inheritance from concrete classes in particular; some
people consider the latter not to be good programming practice. Nonetheless,
Scala has so far placed no restrictions on inheriting from concrete classes, and
there are use cases in which the conciseness of concrete inheritance outweighs
any possible difficulty in later refactoring. Scala should continue to support
concrete inheritance without restriction.

There are alternative designs that provide all the benefits of the copy method,
but that are consistent with inheritance, and so would be more useful to Scala
programmers.

Current Design
--------------

Consider the following class hierarchy:

   class A (val i: Int)
   class B (val j: Int) extends A(1)
   class C (val s: String, override val i: Int) extends A(i)

In the current design, in order to generate a copy method automatically for A,
you declare it as a case class:

   case class A (i: Int)

The compiler transforms this (ignoring case class features not relevant to this
discussion) into something like:

   class A (val i: Int) {
     def copy (i: Int = this.i): A = new A(i)
   }

Given an A:

   val a: A = A(1)

you can invoke this method with:

   a.copy()

to get an unaltered copy, or:

   a.copy(i = 2)

to get a copy with an altered class parameter value.

What if you want to get a copy of a B instead of an A? It is currently legal
(although deprecated) to declare:

   case class B (j: Int) extends A(1)

but if you invoke copy on a B:

   val b: B = B(3)
   b.copy()

you get an A instance rather than a B instance, because the copy method is not
generated for a class like B that already inherits a copy method (from A, in
this case).
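
To make the behaviour concrete, here is a minimal sketch. One assumption: B is
written as a plain class rather than a case class, because newer compilers
reject case-to-case inheritance outright; the inherited copy method behaves the
same way. The name InheritedCopyDemo is only for illustration.

```scala
// A is a case class, so the compiler generates A.copy.
case class A(i: Int)

// Assumption: B is a plain subclass (not a case class) so that the sketch
// compiles on compilers that prohibit case-to-case inheritance.
class B(val j: Int) extends A(1)

object InheritedCopyDemo {
  def main(args: Array[String]): Unit = {
    val b = new B(3)
    val copied = b.copy()            // resolves to the copy method inherited from A
    println(copied.isInstanceOf[B])  // false: the result is a plain A
    println(copied.i)                // 1
  }
}
```

The value j = 3 is silently dropped, since the inherited method knows nothing
about B's class parameters.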

You might imagine the Scala compiler could automatically generate a copy method
for B that would have the effect of:

   class B (val j: Int) extends A(1) {
     override def copy (j: Int = this.j): B = new B(j)
   }

This looks reasonable enough, but an analogous method for C:

   class C (val s: String, override val i: Int) extends A(i) {
     def copy (s: String = this.s, i: Int = this.i): C = new C(s,i)
   }

runs into trouble. The default-argument mechanism wants to generate this method
in A:

   def copy$default$1: Int = this.i

and this method in C:

   def copy$default$1: String = this.s

A method in C with the same name and parameter list as one in A has to override
it, and an override cannot change the return type from Int to String, so it is
not possible to generate a copy method of this form for C. To avoid such a
situation, the current design refuses to generate any copy method in classes
like B or C, where the superclass already has a copy method.
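
The clash can be reproduced by hand in ordinary source. This is only a sketch:
copyDefault1 is a hypothetical stand-in for the synthetic copy$default$1, and
the offending declaration is left commented out so that the rest compiles.

```scala
class A(val i: Int) {
  // Stand-in for the getter the compiler derives from A.copy(i: Int = this.i).
  def copyDefault1: Int = this.i
}

class C(val s: String, override val i: Int) extends A(i) {
  // The getter that C.copy(s: String = this.s, ...) would need has the same
  // name and the same (empty) parameter list, but a different result type.
  // Uncommenting it is rejected by the compiler: it would have to override
  // A's getter, and String does not conform to Int.
  // def copyDefault1: String = this.s
}

object DefaultGetterClashDemo {
  def main(args: Array[String]): Unit = {
    val c = new C("hi!", 6)
    println(c.copyDefault1) // 6: only the inherited Int-typed getter exists
  }
}
```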

Alternative 1: Method in Companion Object
-----------------------------------------

An alternative to the current design of the copy method would be to generate a
method not in the class to be copied but in its companion object. For the
classes in the above hierarchy, such methods might look like:

   object A {
     def copy (a: A) (i: Int = a.i): A = new A(i)
   }
   object B {
     def copy (b: B) (j: Int = b.j): B = new B(j)
   }
   object C {
     def copy (c: C) (s: String = c.s, i: Int = c.i): C = new C(s,i)
   }

Given:

   val a: A = A(1)
   val b: B = B(3)
   val c: C = C("hi!",6)

such copy methods might be invoked with:

   A.copy(a)(i = 2)
   B.copy(b)(j = 4)
   C.copy(c)(s = "yo!")

You could even explicitly specify something like the current behavior for
subclasses of case classes, in which only the superclass portion of an object is
copied, if for some reason you considered that useful:

   A.copy(b)(i = 5)        // results in an A instance

Because the copy method is not inherited, there are no issues with inheritance
in this design. You can copy concrete class instances at any level of the class
hierarchy.
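
Put together as ordinary hand-written source, the companion-object variant can
be sketched as follows. The assumptions here are that the classes are plain
(non-case) classes instantiated with new, and that the demo object name
CompanionCopyDemo is only illustrative; nothing below is generated by the
compiler.

```scala
class A(val i: Int)
object A {
  // Defaults in the second parameter list may refer to the first list.
  def copy(a: A)(i: Int = a.i): A = new A(i)
}

class B(val j: Int) extends A(1)
object B {
  def copy(b: B)(j: Int = b.j): B = new B(j)
}

class C(val s: String, override val i: Int) extends A(i)
object C {
  def copy(c: C)(s: String = c.s, i: Int = c.i): C = new C(s, i)
}

object CompanionCopyDemo {
  def main(args: Array[String]): Unit = {
    val b = new B(3)
    val c = new C("hi!", 6)

    val b2 = B.copy(b)(j = 4)      // a B with j changed
    val c2 = C.copy(c)(s = "yo!")  // a C with s changed, i kept
    val c3 = C.copy(c)()           // unaltered copy: the empty list is required
    val a2 = A.copy(b)(i = 5)      // copies only the A portion of b

    println(b2.j)                  // 4
    println(s"${c2.s} ${c2.i}")    // yo! 6
    println(s"${c3.s} ${c3.i}")    // hi! 6
    println(a2.isInstanceOf[B])    // false
  }
}
```

Each companion gets its own default getters, and companions do not inherit from
one another, so the three copy methods never collide.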

You might object that this approach is less elegant than the current one,
because the name of the class being copied appears in the invocation of the
copy method; it is not possible to copy something without specifying the type of
the resulting object. This is a reasonable objection, but it also applies to
the current design of the copy method: even though the name of the destination
class does not appear in the invocation, there is no way to invoke the
automatically generated copy method without knowing the class of the resulting
object, because the copy method is generated for only one class in a given
hierarchy (the resulting object is necessarily of that class).

Another possible objection is that the proposed syntax is more verbose than that
of the current design. One way to remedy this would be to rename "copy" to
"apply":

   object A {
     def apply (a: A) (i: Int = a.i): A = new A(i)
   }
   object B {
     def apply (b: B) (j: Int = b.j): B = new B(j)
   }
   object C {
     def apply (c: C) (s: String = c.s, i: Int = c.i): C = new C(s,i)
   }

The corresponding invocations of these methods would be more succinct:

   A(a)(i = 2)
   B(b)(j = 4)
   C(c)(s = "yo!")

If you are familiar with C++ or similar languages, you might recognize this
syntax (without the second argument list) as being similar to that of a copy
constructor. In fact, a number of people have noted that, in various languages,
copy constructors offer an alternative to an inheritable "clone" or "copy"
method; if you google "clone method vs copy constructor" you will find some
interesting comments on the issue. Ideas vary according to the language being
discussed, of course, but the preponderance of opinion seems to be that a copy
constructor, or something like it, offers less scope for trouble than an
inheritable method.

Alternative 2: Copy/Setter Methods
----------------------------------

In my own pre-2.8 code, I have often needed to make a copy of an immutable
object in which a single value is changed. Lacking an automatically generated
copy method, I have written individual methods to make these copies. These
methods amount to annoying boilerplate, and I would be grateful if a compiler
generated them automatically.

For example, for the above classes, imagine that the compiler generated:

   class A (val i: Int) {
     def i (new_i: Int): A = new A(new_i)
   }
   class B (val j: Int) extends A(1) {
     def j (new_j: Int): B = new B(new_j)
   }
   class C (val s: String, override val i: Int) extends A(i) {
     def s (new_s: String): C = new C(new_s,i)
     override def i (new_i: Int): C = new C(s,new_i)
   }

These setter-like copy methods are less prone to trouble with inheritance, as
illustrated for the class parameter i in C: where a class parameter is
overridden, it is usually of the same type in the subclass as in the superclass,
and the copy/setter method overrides the corresponding superclass method, with
the obvious meaning. (Strictly speaking, the subclass class parameter's type
might be a subtype of the corresponding superclass class parameter's type, in
which case the generated copy/setter method would overload the superclass's
method of the same name rather than overriding it. However, this does not create
an illegal declaration, as in the situation with copy$default$1 described
above.)

Invocations of copy/setter methods read nicely when chained:

   val e =
     Employee(
       firstName = "Bob",lastName = "Johnson",title = Engineer,
       hireDate = "10-Jan-2003",nationality = UnitedStates)
   val e2 =
     e.firstName("Sue").lastName("Ewings").title(Janitor).
       hireDate("7-Jul-2005")
   val e3 =
     e.firstName("Jules").lastName("Fournier").hireDate("8-Sep-2006").
       nationality(France)

although such usage presumably creates intermediate objects that immediately
become garbage (a sufficiently sophisticated VM could optimize the garbage
away).

This type of copy method also avoids requiring the name of the copied class to
appear in the invocation of the method; subclasses that override a superclass
parameter (that is, have a corresponding class parameter of the same name)
appear to have "virtual" copy methods.
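
The "virtual" behaviour can be tried out by hand. In this sketch the methods
are given the hypothetical names withI, withJ and withS instead of reusing the
parameter names, purely so that it compiles as ordinary source; a
compiler-generated version could of course choose its names differently.

```scala
class A(val i: Int) {
  def withI(newI: Int): A = new A(newI)
}

class B(val j: Int) extends A(1) {
  def withJ(newJ: Int): B = new B(newJ)
}

class C(val s: String, override val i: Int) extends A(i) {
  def withS(newS: String): C = new C(newS, i)
  // Overrides A.withI with a more specific (covariant) result type.
  override def withI(newI: Int): C = new C(s, newI)
}

object CopySetterDemo {
  def main(args: Array[String]): Unit = {
    val a: A = new C("hi!", 6)       // static type A, dynamic type C
    val a2 = a.withI(7)              // dispatches to C.withI
    println(a2.isInstanceOf[C])      // true

    val c2 = new C("hi!", 6).withS("yo!").withI(8) // chained calls
    println(s"${c2.s} ${c2.i}")      // yo! 8

    val b2 = new B(3).withJ(4)
    println(b2.j)                    // 4
  }
}
```

Note that in this sketch B, which does not redeclare i, inherits withI
unchanged, so calling withI on a B still yields a plain A.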

Conclusion
----------

If Scala intends to continue offering full support for inheritance from concrete
classes, then either of the above approaches to an automatic copy method would
be less problematic than the current design. The two approaches are not mutually
exclusive, and in combination would allow a variety of succinct and powerful
copying idioms in Scala.

It would be too bad if the current design of the copy method led to the removal
of case class inheritance, particularly since (at least with the limited testing
I've done) all pre-2.8 bugs related to case class inheritance appear to be fixed
in 2.8, and there are a number of compelling use cases for case classes that
inherit from one another (see, for example, all the use cases cited in replies
to
http://old.nabble.com/-scala--Do-you-use-case-class-inheritance--to22869562.html#a22874094).

A


Kevin Wright
Joined: 2009-06-09,
Re: 2.8 copy method needs improvement
Method synthesis in compiler plugins is still a somewhat arcane practice. The tricky part is that you need to do it after typer, so that you have the information needed to generate the method.
The catch is that this leaves the symbol table in an inconsistent state, thanks to errors caused by attempts at referencing the copy method before it's known to the compiler. Because of this, you can't even go back and re-type units after adding the method, as this would then attempt to insert duplicate symbols.
I have yet to figure out the best way to deal with the problem. Ideally, if namer/typer fail on a unit then they should leave the AST and symtab unchanged; sadly this is not the case and the symtab *is* changed by such failures.


--
Kevin Wright

mail/google talk: kev [dot] lee [dot] wright [at] googlemail [dot] com
wave: kev [dot] lee [dot] wright [at] googlewave [dot] com
skype: kev.lee.wright
twitter: @thecoda

Chris Twiner
Joined: 2008-12-17,
Re: 2.8 copy method needs improvement

+1. Kevin and I have tried numerous ways to get annotation-based generation working.

If other code refers to what your plugin is supposed to generate, then typer will fail on that code. Yet you need typer to have run in order to get the typed tree.

I have discussed exactly this problem at length; the outcome was to ignore it, given the deprecation of case class inheritance:

http://code.google.com/p/scala-scales/wiki/VirtualConstructorPreSIP

http://news.gmane.org/find-root.php?message_id=%3cc10a34630904230833m2a6cbe86ke930b05fa4e41369%40mail.gmail.com%3e

The last link never got a direct answer; we just got the deprecation warning instead.

This just keeps coming up; perhaps it should be an error rather than a warning. It would also be lovely if the compiler made synthetics easier to create.


Kevin Wright
Joined: 2009-06-09,
Re: 2.8 copy method needs improvement
I've got synthesis working in *most* cases, but I run into problems with the last few (singletons are my biggest pain-in-the-proverbial right now). Running two passes of namer/typer seems to be a good approach at the moment: an early pass with errors suppressed, then code generation, then a second pass on the units that failed the first time round. Getting the symtab right, though, is an absolute nightmare.
There are solutions, but they're daunting:
1. Spawn my own global for the first pass. This would avoid a polluted symtab, but would also cause an undesirable duplication of effort, as ALL units in the system would have to be fully named and typed twice.
2. Become sufficiently acquainted with the compiler that I can successfully fix inconsistencies in the symtab after the first pass, either by rolling back after failed units, by rolling forward to patch up broken relationships between singletons and their classes, or by checking for the presence of a symbol before attempting to insert it and deleting it first if it is already present. The main risk with this approach is that it's a bit fragile and prone to being broken by future changes in compiler behaviour.
Once I get a solution in place, I'm going to have to extract some sort of general framework for code-synthesising plugins, given that this seems to be a fairly common problem.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland