2D/3D vector math package -- design choices, and where should it go?

I do rather a lot of scientific programming (I run a neurobiology lab) and have found the lack of basic 2D/3D vector types in almost every language endlessly annoying.  So I've created a Scala package.

Right now, the package lets you do things like this with immutable vectors:
  val a = Vec2I(2,3)   // a.x==2 and a.y==3
  val b = 2.vec2       // b.x==2 and b.y==2
  val c = 2*(a+b)      // c.x==8 and c.y==12, as they should
  val d = 2.0*(a+b)    // Same thing, except the type is promoted to Vec2D
  val origin = mouse_event.getPoint   // A Java MouseEvent
  val movement = (new_event.getPoint - origin)/zoom_factor

  // Angle between two vectors; "^" is a unit-vector hat
  val e = Math.acos( ((3,5)^) ** Vec2I(2,3)^ )  

Basically, you can do anything you might find in an intro calculus text, with transparent conversions to and from existing Java types and appropriately typed tuples (and they're case classes, so matching on Vec2I(x,5) works).  Since efficient operations require using primitive types, I've got 2D and 3D types for Int, Long, Float, and Double (Vec2I, Vec2L, ... , Vec3D).

In addition, the package has mutable vectors (since all the object creation overhead takes ~20x longer than vector addition of ints).  Here the syntax is less pretty, but thanks to the JVM and scala compiler, it runs at least as fast as hand-rolled array access.  For example, one could:
  val movement = ((new_event.getPoint.m -= origin) /= zoom_factor)
where the ".m" means create a mutable copy, and the assignment operators put the computation into the LHS copy as expected from C++ convention.  Or one could
  val f = 1 +: 2 *: (0.5f,1.5f).m
where the operator: notation means to stuff the result in the RHS of the argument.  Most useful is stuff like:
  val vector_sum = (Vec2D_m(0,0) /: vector_list)(_ += _)

I find these capabilities very handy--it required writing tons of boilerplate code, but now as far as I can tell, everything just works.  So I have four questions:

(1) Is this of sufficiently generic interest that I should figure out a way to make this available?

(2) If so, what would be the appropriate venue?  At one extreme, I could throw it up on Sourceforge or somesuch and hope whoever wants it will find it.  At the other extreme, since it is a pretty basic language feature, a refined version could be incorporated into an existing package or possibly even the core Scala library.  (I do not think that incorporation into something like scalala is the best idea; the use cases are almost entirely nonoverlapping, since scalala is for full-strength large-vector/matrix linear algebra, and my package is basically to make life less annoying when dealing with screen coordinates and other 2D data.)

(3) Also, if so, I am uncertain how verbose the library should be, and what some of the syntax should be.  Right now I'm on the very short end of verbosity:
  2.vec2 extends 2 to a 2D vector, i.e. Vec2I(2,2)
  val g = (2,3,4).m converts an (Int,Int,Int) tuple to a mutable int vector Vec2I_m(2,3,4)
  val h = g.im.L converts to an immutable vector of Longs
  a ** b is a dot product
  (a X b)^ is the unit vector of a cross product
and so on.  Input from anyone who has opinions on this is welcome.  One thing I am especially uncomfortable with is using
  += for addition stored in the left-hand argument
  +: for addition stored in the right-hand argument
for mutable vectors.  On the one hand, the natural right-associativity of +: makes the code easier to write.  On the other hand, the different precedence of operators ending in = and : seems like it's asking for trouble.

(4) Has someone already done this, and better, so that I should just use their code instead?

Anyway, thanks for any comments people may have.

  --Rex

Re: should we move the discussion somewhere else?

I think we should discuss possible designs for a common Vector & Matrix
library API somewhere else than on a mailing list, where it would be
easier to show code as well. It seems kind of hard to keep track of
things here. But what's a good place? a Wiki, GitHub? web forum ...?

Or should we just keep the discussion here, what do you think?

Erkki

Rex Kerr wrote:
> I do rather a lot of scientific programming (I run a neurobiology lab)
> and have found the lack of basic 2D/3D vector types in almost every
> language endlessly annoying. So I've created a Scala package.
>
> Right now, the package lets you do things like this with immutable
> vectors:
> val a = Vec2I(2,3) // a.x==2 and a.y==3
> val b = 2.vec2 // b.x==2 and b.y==2
> val c = 2*(a+b) // c.x==8 and c.y==12, as they should
> val d = 2.0*(a+b) // Same thing, except the type is promoted to Vec2D
> val origin = mouse_event.getPoint // A Java MouseEvent
> val movement = (new_event.getPoint - origin)/zoom_factor
>
> // Angle between two vectors; "^" is a unit-vector hat
> val e = Math.acos( ((3,5)^) ** Vec2I(2,3)^ )
>
> Basically, you can do anything you might find in an intro calculus
> text, with transparent conversions to and from existing Java types and
> appropriately typed tuples (and they're case classes, so matching on
> Vec2I(x,5) works). Since efficient operations require using primitive
> types, I've got 2D and 3D types for Int, Long, Float, and Double
> (Vec2I, Vec2L, ... , Vec3D).
>
> In addition, the package has mutable vectors (since all the object
> creation overhead takes ~20x longer than vector addition of ints).
> Here the syntax is less pretty, but thanks to the JVM and scala
> compiler, it runs at least as fast as hand-rolled array access. For
> example, one could:
> val movement = ((new_event.getPoint.m -= origin) /= zoom_factor)
> where the ".m" means create a mutable copy, and the assignment
> operators put the computation into the LHS copy as expected from C++
> convention. Or one could
> val f = 1 +: 2 *: (0.5f,1.5f).m
> where the operator: notation means to stuff the result in the RHS of
> the argument. Most useful is stuff like:
> val vector_sum = (Vec2D_m(0,0) /: vector_list)(_ += _)
>
> I find these capabilities very handy--it required writing tons of
> boilerplate code, but now as far as I can tell, everything just
> works. So I have four questions:
>
> (1) Is this of sufficiently generic interest that I should figure out
> a way to make this available?
>
> (2) If so, what would be the appropriate venue? At one extreme, I
> could throw it up on Sourceforge or somesuch and hope whoever wants it
> will find it. At the other extreme, since it is a pretty basic
> language feature, a refined version could be incorporated into an
> existing package or possibly even the core Scala library. (I do not
> think that incorporation into something like scalala is the best idea;
> the use cases are almost entirely nonoverlapping, since scalala is for
> full-strength large-vector/matrix linear algebra, and my package is
> basically to make life less annoying when dealing with screen
> coordinates and other 2D data.)
>
> (3) Also, if so, I am uncertain how verbose the library should be, and
> what some of the syntax should be. Right now I'm on the very short
> end of verbosity:
> 2.vec2 extends 2 to a 2D vector, i.e. Vec2I(2,2)
> val g = (2,3,4).m converts an (Int,Int,Int) tuple to a mutable int
> vector Vec2I_m(2,3,4)
> val h = g.im.L converts to an immutable vector of Longs
> a ** b is a dot product
> (a X b)^ is the unit vector of a cross product
> and so on. Input from anyone who has opinions on this is welcome.
> One thing I am especially uncomfortable with is using
> += for addition stored in the left-hand argument
> +: for addition stored in the right-hand argument
> for mutable vectors. On the one hand, the natural right-associativity
> of +: makes the code easier to write. On the other hand, the
> different precedence of operators ending in = and : seems like it's
> asking for trouble.
>
> (4) Has someone already done this, and better, so that I should just
> use their code instead?
>
> Anyway, thanks for any comments people may have.
>
> --Rex
>

Re: should we move the discussion somewhere else?

On Wed, Sep 2, 2009 at 3:45 AM, Erkki Lindpere wrote:
> I think we should discuss possible designs for a common Vector & Matrix
> library API somewhere else than on a mailing list, where it would be easier
> to show code as well. It seems kind of hard to keep track of things here.
> But what's a good place? a Wiki, GitHub? web forum ...?
>
> Or should we just keep the discussion here, what do you think?

I agree with moving. The thread is approaching 60 messages on a topic
thats only of interest to a subset of Scala users.

I'd suggest a broader scope like "Mathematical and Numerical computing
in Scala", might give the forum more appeal and more longevity beyond
the current discussion.

I'd prefer an established neutral provider such as GitHub or Google
Groups+Code, and/or request a new sublist from EPFL (like XML did?).
Conveniently searchable archives are a must.

Erkki, If you create one and post the details, I'll certainty sign up.

-Ben

>
> Erkki
>
> Rex Kerr wrote:
>>
>> I do rather a lot of scientific programming (I run a neurobiology lab) and
>> have found the lack of basic 2D/3D vector types in almost every language
>> endlessly annoying.  So I've created a Scala package.
>>
>> Right now, the package lets you do things like this with immutable
>> vectors:
>>  val a = Vec2I(2,3)   // a.x==2 and a.y==3
>>  val b = 2.vec2       // b.x==2 and b.y==2
>>  val c = 2*(a+b)      // c.x==8 and c.y==12, as they should
>>  val d = 2.0*(a+b)    // Same thing, except the type is promoted to Vec2D
>>  val origin = mouse_event.getPoint   // A Java MouseEvent
>>  val movement = (new_event.getPoint - origin)/zoom_factor
>>
>>  // Angle between two vectors; "^" is a unit-vector hat
>>  val e = Math.acos( ((3,5)^) ** Vec2I(2,3)^ )
>> Basically, you can do anything you might find in an intro calculus text,
>> with transparent conversions to and from existing Java types and
>> appropriately typed tuples (and they're case classes, so matching on
>> Vec2I(x,5) works).  Since efficient operations require using primitive
>> types, I've got 2D and 3D types for Int, Long, Float, and Double (Vec2I,
>> Vec2L, ... , Vec3D).
>>
>> In addition, the package has mutable vectors (since all the object
>> creation overhead takes ~20x longer than vector addition of ints).  Here the
>> syntax is less pretty, but thanks to the JVM and scala compiler, it runs at
>> least as fast as hand-rolled array access.  For example, one could:
>>  val movement = ((new_event.getPoint.m -= origin) /= zoom_factor)
>> where the ".m" means create a mutable copy, and the assignment operators
>> put the computation into the LHS copy as expected from C++ convention.  Or
>> one could
>>  val f = 1 +: 2 *: (0.5f,1.5f).m
>> where the operator: notation means to stuff the result in the RHS of the
>> argument.  Most useful is stuff like:
>>  val vector_sum = (Vec2D_m(0,0) /: vector_list)(_ += _)
>>
>> I find these capabilities very handy--it required writing tons of
>> boilerplate code, but now as far as I can tell, everything just works.  So I
>> have four questions:
>>
>> (1) Is this of sufficiently generic interest that I should figure out a
>> way to make this available?
>>
>> (2) If so, what would be the appropriate venue?  At one extreme, I could
>> throw it up on Sourceforge or somesuch and hope whoever wants it will find
>> it.  At the other extreme, since it is a pretty basic language feature, a
>> refined version could be incorporated into an existing package or possibly
>> even the core Scala library.  (I do not think that incorporation into
>> something like scalala is the best idea; the use cases are almost entirely
>> nonoverlapping, since scalala is for full-strength large-vector/matrix
>> linear algebra, and my package is basically to make life less annoying when
>> dealing with screen coordinates and other 2D data.)
>>
>> (3) Also, if so, I am uncertain how verbose the library should be, and
>> what some of the syntax should be.  Right now I'm on the very short end of
>> verbosity:
>>  2.vec2 extends 2 to a 2D vector, i.e. Vec2I(2,2)
>>  val g = (2,3,4).m converts an (Int,Int,Int) tuple to a mutable int vector
>> Vec2I_m(2,3,4)
>>  val h = g.im.L converts to an immutable vector of Longs
>>  a ** b is a dot product
>>  (a X b)^ is the unit vector of a cross product
>> and so on.  Input from anyone who has opinions on this is welcome.  One
>> thing I am especially uncomfortable with is using
>>  += for addition stored in the left-hand argument
>>  +: for addition stored in the right-hand argument
>> for mutable vectors.  On the one hand, the natural right-associativity of
>> +: makes the code easier to write.  On the other hand, the different
>> precedence of operators ending in = and : seems like it's asking for
>> trouble.
>>
>> (4) Has someone already done this, and better, so that I should just use
>> their code instead?
>>
>> Anyway, thanks for any comments people may have.
>>
>>  --Rex
>>
>

Re: should we move the discussion somewhere else?

I can host a forum (aka bulletin board) on my home server for the purpose if
there are enough people that are interested; it wouldn't be a high-speed forum
for 1000's of visitors but it would be free to use and enough for a small
thing like this.

Otherwise, I don't really see what this mailing list lacks and that another
medium would provide (Google Wave isn't widely used yet, sadly), especially
since KDE KMail structures all e-mails from this mailing list into threads for
me :-)

On Tuesday 01 September 2009 19:45:33 Erkki Lindpere wrote:
> I think we should discuss possible designs for a common Vector & Matrix
> library API somewhere else than on a mailing list, where it would be
> easier to show code as well. It seems kind of hard to keep track of
> things here. But what's a good place? a Wiki, GitHub? web forum ...?
>
> Or should we just keep the discussion here, what do you think?
>
> Erkki
>
> Rex Kerr wrote:
> > I do rather a lot of scientific programming (I run a neurobiology lab)
> > and have found the lack of basic 2D/3D vector types in almost every
> > language endlessly annoying. So I've created a Scala package.
> >
> > Right now, the package lets you do things like this with immutable
> > vectors:
> > val a = Vec2I(2,3) // a.x==2 and a.y==3
> > val b = 2.vec2 // b.x==2 and b.y==2
> > val c = 2*(a+b) // c.x==8 and c.y==12, as they should
> > val d = 2.0*(a+b) // Same thing, except the type is promoted to
> > Vec2D val origin = mouse_event.getPoint // A Java MouseEvent
> > val movement = (new_event.getPoint - origin)/zoom_factor
> >
> > // Angle between two vectors; "^" is a unit-vector hat
> > val e = Math.acos( ((3,5)^) ** Vec2I(2,3)^ )
> >
> > Basically, you can do anything you might find in an intro calculus
> > text, with transparent conversions to and from existing Java types and
> > appropriately typed tuples (and they're case classes, so matching on
> > Vec2I(x,5) works). Since efficient operations require using primitive
> > types, I've got 2D and 3D types for Int, Long, Float, and Double
> > (Vec2I, Vec2L, ... , Vec3D).
> >
> > In addition, the package has mutable vectors (since all the object
> > creation overhead takes ~20x longer than vector addition of ints).
> > Here the syntax is less pretty, but thanks to the JVM and scala
> > compiler, it runs at least as fast as hand-rolled array access. For
> > example, one could:
> > val movement = ((new_event.getPoint.m -= origin) /= zoom_factor)
> > where the ".m" means create a mutable copy, and the assignment
> > operators put the computation into the LHS copy as expected from C++
> > convention. Or one could
> > val f = 1 +: 2 *: (0.5f,1.5f).m
> > where the operator: notation means to stuff the result in the RHS of
> > the argument. Most useful is stuff like:
> > val vector_sum = (Vec2D_m(0,0) /: vector_list)(_ += _)
> >
> > I find these capabilities very handy--it required writing tons of
> > boilerplate code, but now as far as I can tell, everything just
> > works. So I have four questions:
> >
> > (1) Is this of sufficiently generic interest that I should figure out
> > a way to make this available?
> >
> > (2) If so, what would be the appropriate venue? At one extreme, I
> > could throw it up on Sourceforge or somesuch and hope whoever wants it
> > will find it. At the other extreme, since it is a pretty basic
> > language feature, a refined version could be incorporated into an
> > existing package or possibly even the core Scala library. (I do not
> > think that incorporation into something like scalala is the best idea;
> > the use cases are almost entirely nonoverlapping, since scalala is for
> > full-strength large-vector/matrix linear algebra, and my package is
> > basically to make life less annoying when dealing with screen
> > coordinates and other 2D data.)
> >
> > (3) Also, if so, I am uncertain how verbose the library should be, and
> > what some of the syntax should be. Right now I'm on the very short
> > end of verbosity:
> > 2.vec2 extends 2 to a 2D vector, i.e. Vec2I(2,2)
> > val g = (2,3,4).m converts an (Int,Int,Int) tuple to a mutable int
> > vector Vec2I_m(2,3,4)
> > val h = g.im.L converts to an immutable vector of Longs
> > a ** b is a dot product
> > (a X b)^ is the unit vector of a cross product
> > and so on. Input from anyone who has opinions on this is welcome.
> > One thing I am especially uncomfortable with is using
> > += for addition stored in the left-hand argument
> > +: for addition stored in the right-hand argument
> > for mutable vectors. On the one hand, the natural right-associativity
> > of +: makes the code easier to write. On the other hand, the
> > different precedence of operators ending in = and : seems like it's
> > asking for trouble.
> >
> > (4) Has someone already done this, and better, so that I should just
> > use their code instead?
> >
> > Anyway, thanks for any comments people may have.
> >
> > --Rex

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Scala vector and matrix math library

p, li { white-space: pre-wrap; }Hello all,
I just read the recent thread ("2D/3D vector math package [...]") and also think it would be a great idea to add a library like this to the standard Scala distribution (or in scalax, or somewhere else where everybody can find it), and I would happy to help doing this myself.


However, I've got very different ideas and goals compared to those mentioned in that thread, so I don't feel like continuing with it. I don't want *just* a math-oriented, easy-to-use but maybe slow vector math solution, but a solution that can adapt to be fast and/or easy to use depending on the situation and the way it is used, so that it is usable for heavy computations that require good performance, but also for ordinary users that just want a vector library that works and don't care as much about performance. If something is to be used by a fair amount of developers, it has to be a good solution for all those developers.


To further adhere to this ideal, I will, if I ever create a library for this, publish it on GitHub and generously accept patches to it if the patches contain sensible features that also adhere to the ideal.


Some features I've thought about including:


- Completely immutable data structures. The kind that trigger HotSpot caching and inlining. I needn't explain more. This can give incredible performance improvements in the long run because of JIT optimizations, while being really useful for concurrent programming etc because of the immutability. It would of course also be possible to create mutable/immutable implementations à la the scala.collection library, but there should at least be one part of the vector library that's *completely* immutable IMO.


- All classes are generic. If you only make a vector library that only uses Floats or Doubles etc, someone is going to want to have an Int-based vector library, and thus a new one has to be written. With the new Numeric class in Scala 2.8, it's possible to make generic classes that take number types and that don't use boxing/unboxing with help from the @specialized annotation. I have attached an example implementation of a small system that demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note that the code is optimized for speed and not necessarily for legibility.


- Support for dependant vectors. This is only possible if an approach similar to the scala.collection mutable/immutable approach is taken. Dependant vectors are vectors that don't store values by themselves (e.g. there are no variables stored inside of the vector), but instead a dependant vector has a function stored inside of it that tells it how to get to it's values. This is really useful if one vector's value depends on another vector's state. I'll show what I mean if I ever get started on a library like this.


- Usage of existing traits/classes. A Vector will of course extend a Product, and maybe an Ordered, a Seq or a Tuple (if those classes also get the @specialized annotation when 2.8 is released).


- Matrices and other general math tools, of course. The primary focus will be on vectors, though.


Comments? Suggestions in general?

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: vector and matrix as function of indice

Hello
vector can be function of index. matrix can be funcion of indice.
so , I think math vector lib can use this aspect.

easy to see , unit matrix of any dimention can be defined as mathematical
way.
(i==j --> 0 , i!=j --> 1)

using pattern matching can be usefull.

and this way is independent from strage type(on memory , on disk , on
network)

shuji yamamoto.

On Sat, 22 Aug 2009 00:05:58 +0900, David Flemström
wrote:

> Hello all,
> I just read the recent thread ("2D/3D vector math package [...]") and
> also
> think it would be a great idea to add a library like this to the standard
> Scala distribution (or in scalax, or somewhere else where everybody can
> find
> it), and I would happy to help doing this myself.
>
> However, I've got very different ideas and goals compared to those
> mentioned
> in that thread, so I don't feel like continuing with it. I don't want
> *just* a
> math-oriented, easy-to-use but maybe slow vector math solution, but a
> solution
> that can adapt to be fast and/or easy to use depending on the situation
> and
> the way it is used, so that it is usable for heavy computations that
> require
> good performance, but also for ordinary users that just want a vector
> library
> that works and don't care as much about performance. If something is to
> be
> used by a fair amount of developers, it has to be a good solution for all
> those developers.
>
> To further adhere to this ideal, I will, if I ever create a library for
> this,
> publish it on GitHub and generously accept patches to it if the patches
> contain sensible features that also adhere to the ideal.
>
> Some features I've thought about including:
>
> - Completely immutable data structures. The kind that trigger HotSpot
> caching
> and inlining. I needn't explain more. This can give incredible
> performance
> improvements in the long run because of JIT optimizations, while being
> really
> useful for concurrent programming etc because of the immutability. It
> would of
> course also be possible to create mutable/immutable implementations à la
> the
> scala.collection library, but there should at least be one part of the
> vector
> library that's *completely* immutable IMO.
>
> - All classes are generic. If you only make a vector library that only
> uses
> Floats or Doubles etc, someone is going to want to have an Int-based
> vector
> library, and thus a new one has to be written. With the new Numeric
> class in
> Scala 2.8, it's possible to make generic classes that take number types
> and
> that don't use boxing/unboxing with help from the @specialized
> annotation. I
> have attached an example implementation of a small system that
> demonstrates
> what I mean (ref1.scala); it only compiles with Scala 2.8. Note that the
> code
> is optimized for speed and not necessarily for legibility.
>
> - Support for dependant vectors. This is only possible if an approach
> similar
> to the scala.collection mutable/immutable approach is taken. Dependant
> vectors
> are vectors that don't store values by themselves (e.g. there are no
> variables
> stored inside of the vector), but instead a dependant vector has a
> function
> stored inside of it that tells it how to get to it's values. This is
> really
> useful if one vector's value depends on another vector's state. I'll
> show what
> I mean if I ever get started on a library like this.
>
> - Usage of existing traits/classes. A Vector will of course extend a
> Product,
> and maybe an Ordered, a Seq or a Tuple (if those classes also get the
> @specialized annotation when 2.8 is released).
>
> - Matrices and other general math tools, of course. The primary focus
> will be
> on vectors, though.
>
> Comments? Suggestions in general?
>
> David Flemström
> david [dot] flemstrom [at] gmail [dot] com

Scala vector and matrix math library

Note that in this discussion, Im wearing an "Interface Hat". In
general my thinking and comments are geared towards discovering &
negotiating an minimal, abstract Vector/Matrix API that might be

- Accepted by existing and forthcoming libraries (eg Scalala)
- Accepted by clients needing math library functionality

Im not really concerned with what's the best concrete implementation,
except to ensure the API supports & enables implementers and clients
alike.

On Sat, Aug 29, 2009 at 1:26 AM, Rex Kerr wrote:
> On Fri, Aug 28, 2009 at 1:21 AM, Ben Hutchison
> wrote:
>>
>> If someone really wants a Vector[Int], a unit vector can be defined
>> for it. The quantization is really severe, but its possible: (2, 1)
>> becomes (1, 0)
>
> Ouch.  That is possible, but is it useful?  I can't think of a case where it
> would be.  Why make library implementers implement junk "unit" methods?

I didnt mean thats its particularly useful, but rather that it allows
a library simplification: Vector[T] can be parametrized by any numeric
T without needing to make special cases for discrete integers.

Im not sure that implementers need define a unit vector method. I
suspect an abstract method could be crafted that would work for most
types of number.

IMHO keeping the set of abstract interfaces in a Vector library to the
absolute minimal set that spans the problem domain is key to
successful adoption, so anything that gets included needs a very solid
justification. Part of the reason Java collections was successfully
reimplemented is because it only defines about 7 interfaces in total.

Implementers of the abstract API can of course offer extra types
beyond the core API, as they see fit. "IntVector extends
Vector[Int]"or RationalVector, for example.

>>
>> > Not entirely.  You can assign to mutable vectors.  You can't to
>> > non-mutable
>> > vectors.  If we call := the element-assignment operation, then
>>
>> By assign, you mean: replace the entire content of Vector x with
>> Vector y (x := y) ?
>>
>> Can't this be composed by combining all the element-wise assignments
>> (which are defined for the immutable case as copy-with-modification),
>> without introducing an extra trait at the interface level?
>
> No, copy-with-modification requires a var, while internal updates work fine
> with vals.

Point taken, however, I personally dont feel a "reassign a vector to
new value without needing a var" use-case justifies the inclusion of a
separate Mutable trait in a core library.

I think it should require a var. After all, you have just assigned a
completely new value to your vector.

> Well, I just benchmarked it (based on the same code posted in my reply to
> David but I added a + method and played with val/var switching).  And I see,
> for 2-ary vector-vector addition:
>   Mutable vector (+=)  2.0 ns / op
>   Mutable vector (+)  9.8 ns / op
>   Immutable vector (+)  10.2 ns / op
> (it's an even bigger difference for vector + two integer arguments).

Thanks for benchmarking this. It seems its a matter of priorities and
tastes. Certainly, mutable vectors have their use-cases and any API
must not preclude using them.

As I mentioned previously, following the Scala collections idiom (same
method, differing semantics) is my preferred way to support this, as
it does not seem to require a separate Mutable trait.

Efficient mutable reassignment can be supported via something like
"val vectorWithNewValue = vectorWithOldValue.reassign(newValue)",
where reassign() composes element-wise assignments.

-Ben

Re: Scala vector and matrix math library

On Fri, Aug 21, 2009 at 11:05 AM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
- All classes are generic. If you only make a vector library that only uses Floats or Doubles etc, someone is going to want to have an Int-based vector library, and thus a new one has to be written. With the new Numeric class in Scala 2.8, it's possible to make generic classes that take number types and that don't use boxing/unboxing with help from the @specialized annotation. I have attached an example implementation of a small system that demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note that the code is optimized for speed and not necessarily for legibility.

I've now benchmarked this, and the results are not encouraging.  I have benchmarks for two things: accumulating vectors from arrays of previously-computed vector values, and accumulating vectors with values computed on the fly.  Previously-computed vectors are added with methods that take a vector; if computed on the fly, they're passed as two arguments to a + function to avoid the temptation to create an object.

These give the following timings for 2.8.0.r18509-b20090819020207, which to me appears from these benchmarks to not have @specialized working yet (even with -Xexperimental and -Xfuture):

(baseline "do-it-yourself" method)
Primitive math, on-the-fly: 100  <-- defined to be 100
Primitive math, stored: 290

(true regardless of @specialized, final, -optimise)
Immutable Vec2[Int], on-the-fly: 3800
Immutable Vec2[Int], stored: 9000

(true regardless of final, -optimise)
Immutable Vec2I, on-the-fly: 1700
Immutable Vec2I, stored: 2600

(true regardless of @specialized, final, -optimise)
Mutable Vec2[Int], on-the-fly: 2700
Mutable Vec2[Int], stored: 3000

(true if not both final and -optimise)
Mutable Vec2I, on-the-fly: 220
Mutable Vec2I, stored: 500

(both final and -optimise)
Mutable Vec2I, on-the-fly: 70
Mutable Vec2I, stored: 150

Bottom line: for immutables, generics are 2x slower when adding two primitives, 3x when combining two generics, and the best immutables are 10-20x slower than mutables.  For mutables, generics are 6-12x slower than unoptimized&final dedicated mutables, and optimized final dedicated mutables are 3x faster again (and, surprisingly, actually 1.5-2x faster than roll-your-own primitives).  Oh, and by the way, according to Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I takes 16 (primitives take 8).

I think this is pretty clear evidence that if you actually want a high-performance library for 2 and 3 component vectors using primitive types, that the library I've already written to do this (or another using a similar strategy) is at this point the way to go, and that mutable vectors is the way to go.  I hope that eventually the @specialized version will be closer to final -optimise, but it looks to me like the compiler team have their work cut out for them to make this happen.

A couple of details about the code:

Vec2[Int] is from a generic that looks like so (you can imagine the var version):
  final class Vec2[@specialized T](val x:T , val y:T)(implicit arith : Numeric[T]) {
    def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
    // ...
  }

Vec2I is hand-rolled using primitive types and looks like so (again, imagine vars):
  final class Vec2I(val x:Int , val y:Int) {
    def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
    // ...
  }

  --Rex

Re: Scala vector and matrix math library

Specialization has to be enabled by -Yspecialize. I couldn't find the
code you used for benchmarking, so I couldn't post back timings. Your
results are what I'd expect, but the specialized transformation should
get you close to the hand-written specialized versions.

iulian

On Mon, Aug 24, 2009 at 9:54 AM, Rex Kerr wrote:
> On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> wrote:
>>
>> - All classes are generic. If you only make a vector library that only
>> uses Floats or Doubles etc, someone is going to want to have an Int-based
>> vector library, and thus a new one has to be written. With the new Numeric
>> class in Scala 2.8, it's possible to make generic classes that take number
>> types and that don't use boxing/unboxing with help from the @specialized
>> annotation. I have attached an example implementation of a small system that
>> demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note
>> that the code is optimized for speed and not necessarily for legibility.
>
> I've now benchmarked this, and the results are not encouraging.  I have
> benchmarks for two things: accumulating vectors from arrays of
> previously-computed vector values, and accumulating vectors with values
> computed on the fly.  Previously-computed vectors are added with methods
> that take a vector; if computed on the fly, they're passed as two arguments
> to a + function to avoid the temptation to create an object.
>
> These give the following timings for 2.8.0.r18509-b20090819020207, which to
> me appears from these benchmarks to not have @specialized working yet (even
> with -Xexperimental and -Xfuture):
>
> (baseline "do-it-yourself" method)
> Primitive math, on-the-fly: 100  <-- defined to be 100
> Primitive math, stored: 290
>
> (true regardless of @specialized, final, -optimise)
> Immutable Vec2[Int], on-the-fly: 3800
> Immutable Vec2[Int], stored: 9000
>
> (true regardless of final, -optimise)
> Immutable Vec2I, on-the-fly: 1700
> Immutable Vec2I, stored: 2600
>
> (true regardless of @specialized, final, -optimise)
> Mutable Vec2[Int], on-the-fly: 2700
> Mutable Vec2[Int], stored: 3000
>
> (true if not both final and -optimise)
> Mutable Vec2I, on-the-fly: 220
> Mutable Vec2I, stored: 500
>
> (both final and -optimise)
> Mutable Vec2I, on-the-fly: 70
> Mutable Vec2I, stored: 150
>
> Bottom line: for immutables, generics are 2x slower when adding two
> primitives, 3x when combining two generics, and the best immutables are
> 10-20x slower than mutables.  For mutables, generics are 6-12x slower than
> unoptimized&final dedicated mutables, and optimized final dedicated mutables
> are 3x faster again (and, surprisingly, actually 1.5-2x faster than
> roll-your-own primitives).  Oh, and by the way, according to
> Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I takes 16
> (primitives take 8).
>
> I think this is pretty clear evidence that if you actually want a
> high-performance library for 2 and 3 component vectors using primitive
> types, that the library I've already written to do this (or another using a
> similar strategy) is at this point the way to go, and that mutable vectors
> is the way to go.  I hope that eventually the @specialized version will be
> closer to final -optimise, but it looks to me like the compiler team have
> their work cut out for them to make this happen.
>
> A couple of details about the code:
>
> Vec2[Int] is from a generic that looks like so (you can imagine the var
> version):
>   final class Vec2[@specialized T](val x:T , val y:T)(implicit arith :
> Numeric[T]) {
>     def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
>     // ...
>   }
>
> Vec2I is hand-rolled using primitive types and looks like so (again, imagine
> vars):
>   final class Vec2I(val x:Int , val y:Int) {
>     def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
>     // ...
>   }
>
>   --Rex
>
>

Re: Scala vector and matrix math library

Ah.  Unfortunately, in this particular case specialization makes things even worse.  With mutable generic int vectors, it goes from 2700&3000 to 4000&4000.  With immutable generics it goes from 3800&9000 to 8000&14000.

Also, memory usage goes up even more, from 54 to 62 bytes.

  --Rex

On Mon, Aug 24, 2009 at 8:19 AM, Iulian Dragos <jaguarul [at] gmail [dot] com> wrote:
Specialization has to be enabled by -Yspecialize. I couldn't find the
code you used for benchmarking, so I couldn't post back timings. Your
results are what I'd expect, but the specialized transformation should
get you close to the hand-written specialized versions.

iulian

On Mon, Aug 24, 2009 at 9:54 AM, Rex Kerr<ichoran [at] gmail [dot] com> wrote:
> On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> <david [dot] flemstrom [at] gmail [dot] com> wrote:
>>
>> - All classes are generic. If you only make a vector library that only
>> uses Floats or Doubles etc, someone is going to want to have an Int-based
>> vector library, and thus a new one has to be written. With the new Numeric
>> class in Scala 2.8, it's possible to make generic classes that take number
>> types and that don't use boxing/unboxing with help from the @specialized
>> annotation. I have attached an example implementation of a small system that
>> demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note
>> that the code is optimized for speed and not necessarily for legibility.
>
> I've now benchmarked this, and the results are not encouraging.  I have
> benchmarks for two things: accumulating vectors from arrays of
> previously-computed vector values, and accumulating vectors with values
> computed on the fly.  Previously-computed vectors are added with methods
> that take a vector; if computed on the fly, they're passed as two arguments
> to a + function to avoid the temptation to create an object.
>
> These give the following timings for 2.8.0.r18509-b20090819020207, which to
> me appears from these benchmarks to not have @specialized working yet (even
> with -Xexperimental and -Xfuture):
>
> (baseline "do-it-yourself" method)
> Primitive math, on-the-fly: 100  <-- defined to be 100
> Primitive math, stored: 290
>
> (true regardless of @specialized, final, -optimise)
> Immutable Vec2[Int], on-the-fly: 3800
> Immutable Vec2[Int], stored: 9000
>
> (true regardless of final, -optimise)
> Immutable Vec2I, on-the-fly: 1700
> Immutable Vec2I, stored: 2600
>
> (true regardless of @specialized, final, -optimise)
> Mutable Vec2[Int], on-the-fly: 2700
> Mutable Vec2[Int], stored: 3000
>
> (true if not both final and -optimise)
> Mutable Vec2I, on-the-fly: 220
> Mutable Vec2I, stored: 500
>
> (both final and -optimise)
> Mutable Vec2I, on-the-fly: 70
> Mutable Vec2I, stored: 150
>
> Bottom line: for immutables, generics are 2x slower when adding two
> primitives, 3x when combining two generics, and the best immutables are
> 10-20x slower than mutables.  For mutables, generics are 6-12x slower than
> unoptimized&final dedicated mutables, and optimized final dedicated mutables
> are 3x faster again (and, surprisingly, actually 1.5-2x faster than
> roll-your-own primitives).  Oh, and by the way, according to
> Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I takes 16
> (primitives take 8).
>
> I think this is pretty clear evidence that if you actually want a
> high-performance library for 2 and 3 component vectors using primitive
> types, that the library I've already written to do this (or another using a
> similar strategy) is at this point the way to go, and that mutable vectors
> is the way to go.  I hope that eventually the @specialized version will be
> closer to final -optimise, but it looks to me like the compiler team have
> their work cut out for them to make this happen.
>
> A couple of details about the code:
>
> Vec2[Int] is from a generic that looks like so (you can imagine the var
> version):
>   final class Vec2[@specialized T](val x:T , val y:T)(implicit arith :
> Numeric[T]) {
>     def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
>     // ...
>   }
>
> Vec2I is hand-rolled using primitive types and looks like so (again, imagine
> vars):
>   final class Vec2I(val x:Int , val y:Int) {
>     def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
>     // ...
>   }
>
>   --Rex
>
>



--
« Je déteste la montagne, ça cache le paysage »
Alphonse Allais

Re: Scala vector and matrix math library

p, li { white-space: pre-wrap; }The final @specialized pattern, when finished, *should* make a and b in the following code *exactly* identical when it comes to memory and performance.


Note that I deliberately added Numeric to the non-generic version to make the two implementations have equal performance; the generic version IS normally slower because it uses the Numeric class, but not by much. What I want to demonstrate, however, is that it's Numeric that is the bottleneck, and not the fact that B is generic; part of the issue is that Numeric itself isn't compiled with @specialized since it would enhance performance greatly if it was.


If you want to do a real benchmark, copy the Numeric code from http://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/src/library/scala/... and compile it, along with the code for Ordering, with @specialized. I'll try to do this myself when I get time.


The code:
class A(var x: Int, var y: Int) {
private val arith = Numeric.IntIsIntegral
def +(that: A) = new A(arith.plus(this.x, that.x),
arith.plus(this.y, that.y))
}
class B[@specialized T](var x: T, var y: T)(implicit arith: Numeric[T]) {
def +(that: B) = new B(arith.plus(this.x, that.x),
arith.plus(this.y, that.y))
}
var a = new A(2, 3)
a = a + a
var b = new B[Int](2, 3)
b = b + b


Quick question: Why doesn't the code for Numeric already include @specialized? It doesn't hurt until the compiler becomes aware of it, and it would in my opinion be better to hunt down all of the classes that need specialized now instead of waiting until the feature goes out of beta.


On Monday 24 August 2009 15:29:27 Rex Kerr wrote:
> Ah. Unfortunately, in this particular case specialization makes things
> even worse. With mutable generic int vectors, it goes from 2700&3000 to
> 4000&4000. With immutable generics it goes from 3800&9000 to 8000&14000.
>
> Also, memory usage goes up even more, from 54 to 62 bytes.
>
> --Rex
>
> On Mon, Aug 24, 2009 at 8:19 AM, Iulian Dragos <jaguarul [at] gmail [dot] com> wrote:
> > Specialization has to be enabled by -Yspecialize. I couldn't find the
> > code you used for benchmarking, so I couldn't post back timings. Your
> > results are what I'd expect, but the specialized transformation should
> > get you close to the hand-written specialized versions.
> >
> > iulian
> >
> > On Mon, Aug 24, 2009 at 9:54 AM, Rex Kerr<ichoran [at] gmail [dot] com> wrote:
> > > On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> > >
> > > <david [dot] flemstrom [at] gmail [dot] com> wrote:
> > >> - All classes are generic. If you only make a vector library that only
> > >> uses Floats or Doubles etc, someone is going to want to have an
> >
> > Int-based
> >
> > >> vector library, and thus a new one has to be written. With the new
> >
> > Numeric
> >
> > >> class in Scala 2.8, it's possible to make generic classes that take
> >
> > number
> >
> > >> types and that don't use boxing/unboxing with help from the
> > >> @specialized annotation. I have attached an example implementation of
> > >> a small system
> >
> > that
> >
> > >> demonstrates what I mean (ref1.scala); it only compiles with Scala
> > >> 2.8.
> >
> > Note
> >
> > >> that the code is optimized for speed and not necessarily for
> > >> legibility.
> > >
> > > I've now benchmarked this, and the results are not encouraging. I have
> > > benchmarks for two things: accumulating vectors from arrays of
> > > previously-computed vector values, and accumulating vectors with values
> > > computed on the fly. Previously-computed vectors are added with
> > > methods that take a vector; if computed on the fly, they're passed as
> > > two
> >
> > arguments
> >
> > > to a + function to avoid the temptation to create an object.
> > >
> > > These give the following timings for 2.8.0.r18509-b20090819020207,
> > > which
> >
> > to
> >
> > > me appears from these benchmarks to not have @specialized working yet
> >
> > (even
> >
> > > with -Xexperimental and -Xfuture):
> > >
> > > (baseline "do-it-yourself" method)
> > > Primitive math, on-the-fly: 100 <-- defined to be 100
> > > Primitive math, stored: 290
> > >
> > > (true regardless of @specialized, final, -optimise)
> > > Immutable Vec2[Int], on-the-fly: 3800
> > > Immutable Vec2[Int], stored: 9000
> > >
> > > (true regardless of final, -optimise)
> > > Immutable Vec2I, on-the-fly: 1700
> > > Immutable Vec2I, stored: 2600
> > >
> > > (true regardless of @specialized, final, -optimise)
> > > Mutable Vec2[Int], on-the-fly: 2700
> > > Mutable Vec2[Int], stored: 3000
> > >
> > > (true if not both final and -optimise)
> > > Mutable Vec2I, on-the-fly: 220
> > > Mutable Vec2I, stored: 500
> > >
> > > (both final and -optimise)
> > > Mutable Vec2I, on-the-fly: 70
> > > Mutable Vec2I, stored: 150
> > >
> > > Bottom line: for immutables, generics are 2x slower when adding two
> > > primitives, 3x when combining two generics, and the best immutables are
> > > 10-20x slower than mutables. For mutables, generics are 6-12x slower
> >
> > than
> >
> > > unoptimized&final dedicated mutables, and optimized final dedicated
> >
> > mutables
> >
> > > are 3x faster again (and, surprisingly, actually 1.5-2x faster than
> > > roll-your-own primitives). Oh, and by the way, according to
> > > Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I
> > > takes
> >
> > 16
> >
> > > (primitives take 8).
> > >
> > > I think this is pretty clear evidence that if you actually want a
> > > high-performance library for 2 and 3 component vectors using primitive
> > > types, that the library I've already written to do this (or another
> > > using
> >
> > a
> >
> > > similar strategy) is at this point the way to go, and that mutable
> >
> > vectors
> >
> > > is the way to go. I hope that eventually the @specialized version will
> >
> > be
> >
> > > closer to final -optimise, but it looks to me like the compiler team
> > > have their work cut out for them to make this happen.
> > >
> > > A couple of details about the code:
> > >
> > > Vec2[Int] is from a generic that looks like so (you can imagine the var
> > > version):
> > > final class Vec2[@specialized T](val x:T , val y:T)(implicit arith :
> > > Numeric[T]) {
> > > def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
> > > // ...
> > > }
> > >
> > > Vec2I is hand-rolled using primitive types and looks like so (again,
> >
> > imagine
> >
> > > vars):
> > > final class Vec2I(val x:Int , val y:Int) {
> > > def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
> > > // ...
> > > }
> > >
> > > --Rex
> >
> > --
> > « Je déteste la montagne, ça cache le paysage »
> > Alphonse Allais


David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

On Mon, Aug 24, 2009 at 10:26 AM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
If you want to do a real benchmark, copy the Numeric code from http://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/src/library/scala/Numeric.scala and compile it, along with the code for Ordering, with @specialized. I'll try to do this myself when I get time.

It's more complicated than that.  Ordering[T] extends java.util.Comparator[T] with PartialOrdering[T], object Numeric defines conversions that use Integral[T], and so on.  There are all sorts of things that potentially need tweaking for @specialized to work.  I'm afraid I don't have the time to work it all out.

Also, I'm not sure @specialized is going to save most of the typing for mutable vectors specifically since is nice to be able to
  val v = Vec2F_m(3.0f,2.5f)
  val u = Vec2I_m(1,2)
  v += u
and have it run quickly (i.e. with zero object instantiations).  Defining an implicit conversion from Vec2I_m to Vec2F_m makes the addition operation take ~10x longer than if there's a method in Vec2F_m that understands integers, e.g.
  def +=(v:Vec2I_m) = { x+=v.x ; y+=v.y ; this }
You'd really need a Numeric[T,S] for this to not need retyping.

  --Rex

Re: Scala vector and matrix math library

Hey Rex,

Did you try @specialized with non-final classes? The way @specialized
works is by creating subclasses for each primitive type requested, so it
may be that final is preventing it from doing that.

Ismael

Re: Scala vector and matrix math library

On Sat, Aug 22, 2009 at 1:05 AM, David
Flemström wrote:
> Hello all,
> I just read the recent thread ("2D/3D vector math package [...]") and also
> think it would be a great idea to add a library like this to the standard
> Scala distribution (or in scalax, or somewhere else where everybody can find
> it), and I would happy to help doing this myself.
>
> However, I've got very different ideas and goals compared to those mentioned
> in that thread, so I don't feel like continuing with it.

We agree that standardisation / defragmentation of math libraries is
desirable, however...

I'd be interested to hear specifically what in Scalala you think is
unsuitable as a starting point for a standard Scala math library.
Specifically.

And any detailed survey of Commons Math will reveal a massive
investment of time/effort in building that library that would be
inefficient to repeat. (Although it has some rough patches too - eg
Vector3D [http://commons.apache.org/math/apidocs/org/apache/commons/math/geometry/Vector3D.html].)

Rex has fairly easily posed some challenges for your proposed Vector[Int].

Personally, I think effort invested in understanding, reusing,
consolidating and adapting the existing Java and Scala assets would be
more fruitful than proposing new ones.

-Ben

Re: Scala vector and matrix math library

p, li { white-space: pre-wrap; }On Sunday 23 August 2009 05:56:51 you wrote:
> Personally, I think effort invested in understanding, reusing,
> consolidating and adapting the existing Java and Scala assets would be
> more fruitful than proposing new ones.
>
> -Ben


Of course, I have to agree with this.
However, it seems that some libraries seem to be too specialized already to allow for them being changed.


The scalala library, for instance, does not enforce dimensions with the type system meaning that it's valid to pass any kind of vector to a method taking a vector type, and you have to add checks like:
require(vector.size == 3, "Wrong vector dimension")


...or use partial functions everywhere to manually enforce this. I would argue that it would be better to follow the Scala convention of having Vector2, Vector3, Vector4 etc for this purpose (but it could of course be useful to have an abstract class Vector, too, for generic methods that take any kind of vector). This type system is of course not ideal for scalala, because making classes up to Matrix512_512 etc will require a lot of classes, so you can't really adapt Scalala to have this feature, while the same would be possible for a less math-oriented and more computation-oriented library, the latter being something that you from time to time have a need for.


What also might become an issue for a generally accepted vector library is the size of data types. Scalala is already very memory-optimized; a 3D DenseVector does "only" take up 56 bytes of memory (8 byte class header, 4 byte array reference, 4 byte padding, 8 byte array header (on most JVMs), 8*3 bytes for the values, 4 bytes for the array length and 4 bytes padding) which is pretty good already.
However, comparing that to a generic vector that is dimension specialized (e.g. Vector2, Vector3 etc), that uses Numeric for its operations and doesn't utilize an Array for storing the data: Such a Vector3[Double] would take up only 32 bytes of memory (8 class header, 8*3 for values, 4 for Numeric reference and 4 padding), and a Vector3[Float] would only take up 24 bytes (8 header, 4*3 values, 4 for Numeric reference). To give Scalala the same memory usage, you would have to restructure the whole type system (but it would be possible, indeed, and I could actually do some research in order to maybe implement something if you're interested).


Another issue is that the current Scalala doesn't easily allow for immutable data structures that are completely immutable (so that HotSpot can apply optimizations). In order to accomplish this (while still leaving the option open for mutable versions of the same structure à la the scala.collection library), you have to put the actual data variables at the bottom of the type tree, so that you have def's all the way down, have a trait with the correct "update" methods named Mutability or similar, and then store the actual data variables at the leafs of the type structure.


E.g. (and I write this off-hand so don't blame me if it doesn't compile):
trait Vector extends Product { //skipped Iterable etc to make things easier
def dimension = productArity
}


trait Vector2 extends Vector {
def x: Double, def y: Double
def productArity = 2
def productElement(n: Int): Double = if(n == 0) x else if (n == 1) y
else error("Out of range")
def productPrefix = "Vector2"
}
class ImmutVec2(protected var xt: Double, protected var yt: Double)
extends Vector2 {
def x = xt; def y = yt
}


trait Mutable2 extends Vector2 {
def x_=(v: Double) = (xt = v); def y_=(v: Double) = (yt = v)
}
class MutVec2(xt: Double, yt: Double) extends ImmutVec2(xt, yt)
with Mutable2


This is not how things are done in Scalala, however, and would also require some major restructuring and breakage in order to work.


If you can show me alternative ways of solving the issues in Scalala I've mentioned here, then I'll reconsider writing a new library and instead look if Scalala can't be adapted to suit more purposes, or if I maybe should accept the performance penalties Scalala gives me and perform optimizations elsewhere in my applications (the vector math will be the bottleneck however).


Regards,
David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

David,

I've been playing with making type hierarchies (and verifying my suspicions with performance testing); this has made me wonder about the following example you gave:

On Sun, Aug 23, 2009 at 2:24 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
trait Vector extends Product { //skipped Iterable etc to make things easier
def dimension = productArity
}


trait Vector2 extends Vector {
def x: Double, def y: Double
def productArity = 2
def productElement(n: Int): Double = if(n == 0) x else if (n == 1) y
else error("Out of range")
def productPrefix = "Vector2"
}
class ImmutVec2(protected var xt: Double, protected var yt: Double)
extends Vector2 {
def x = xt; def y = yt
}


trait Mutable2 extends Vector2 {
def x_=(v: Double) = (xt = v); def y_=(v: Double) = (yt = v)
}
class MutVec2(xt: Double, yt: Double) extends ImmutVec2(xt, yt)
with Mutable2

This type of hierarchy isn't hard to implement.  The problem is that instead of saving work for the library creator, it actually makes them do extra work if they want performance.

For example, if you try to define any math inside Vector2 above, you have to perform a method call for each value you want to get (which slows things down by a factor of 2-10x).  In fact, you have to keep re-implementing the same code all the way down the chain until you hit something marked "final":

trait ImmutVec2(/* ... */) extends Vector 2 {
  /* ... */
  def dotProduct(v:ImmutVec2) = x*v.x + y*v.y  // No way to avoid v.x() method call
}

final class MutVec2/* . . . */ {
  /* ... */
  def dotProduct(v:MutVec2) = x*v.x + y*v.y  // Now we see v.x is just a field lookup!
}

It seems to me that what the hierarchy buys us is the ability to relatively easily create a new vector type with lousy performance, at the expense of having to populate the entire inheritance tree with duplicate code (except for the type).

Am I missing something?

  --Rex
 

Re: Scala vector and matrix math library

I thought I would inject myself one more time into this thread, just
for a word about how Scalala is structured right now. I'm not the
primary author of the library, but I've done a lot of the refactoring
for 2.8, so I thought I could share what our goals are.

Scalala is intended as something of a matlab replacement, and not for
graphics and other low dimensional stuff. Daniel Ramage and I are both
natural language processing/machine learning students, and the library
reflects that. We care about high dimensional problems of varying size
(say 300-30000 dimensions). Low dimensions are almost always useless
to us, and so we can't really represent what we want in a fixed size
class, or one that can be effectively inlined. The Tensor hierarchy
reflects that. (In my other project, scalanlp, we now have an
implementation of Tensor over arbitrary T types, which for me are
usually strings.)

The Tensor hierarchy *is* intended to support several speed
enhancements, though they're removed right now, mostly in the form of
taking advantage of expression trees to tighten loops (so, make v + 4
* y be a scaleAdd rather than a multiply then add). I think Dan plans
to support MTJ-style calling out to LAPACK to make things really
blaze.

As has been pointed out, it would be good to have immutability in the
hierarchy, and it would be good to have Vector2 and Vector3.
Dan's said he's very open to contributions.

Anyway, nothing concrete, but I thought I would represent out what I
see as Scalala's focus.

Re: Scala vector and matrix math library

I have also looked a bit into this topic.
I was wondering how practical it would be to have additional compiler phase
plugins for optimization of some cases while maintaining a generic
interface.
In ScalaBox2D, the authoring was experimenting with that.

How would this work from a design and architectural point of view?

* reference
http://www.box2d.org/forum/viewtopic.php?f=11&t=3349&start=20

http://github.com/Villane/vecmath/tree/8763344240fb82069cfe7a7dac6f30e03...

Re: Scala vector and matrix math library

The problem with creating simultaneously flexible and high-performance vector math code is not that it's hard to do, it's just that there is *so much* repetitive code to write that no sane person would try to do it by hand.  It's tricky to shift all that burden to the compiler via @specialized, because this is @very_very_specialized in terms of sensible operations that should happen quickly.

But given that all the code follows from a much smaller set of basic operations and rules for what makes things run quickly, it seems like automatic code generation should in theory be possible.

Hence, I think the right answer for now is to write a program that automatically generates a high performance small vector library.  I've played with this a little and there don't seem to be any show-stoppers for creating the leaves of the library (which are all final for performance, and have overloaded methods to do math with each other).  I haven't worked out how to cleanly auto-generate the right class hierarchy yet, but that shouldn't be too daunting.

The hierarchy I'm thinking of now is:

Scalar[T]   <-- need this for things like 2+v to work
  ScalarTypename
Vec[T] extends Product with Iterable[T]  // Anything else needed?
  VecIsReference[T] extends Vec[T]
  VecIsRational[T] extends Vec[T]
  VecIsMutable[T] extends Vec[T]
  VecTypename extends Vec[Typename]  (with VecIsRational[T])
  Vec#[T] extends Vec[T]
    Vec#Typename extends Vec#[Typename] with VecTypename
      final Vec#T extends Vec#Typename
      final MVec#T extends Vec#Typename with VecIsMutable[Typename]
      final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
        with VecIsReference[Typename]

Here, there'd be separate classes for Typename in {Int,Long,Float,Double} (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be {2,3,4,5,6} for optimal performance, given my benchmarking.

The MVec2I series would be mutable; the RefVec2I series would be mutable and index into an array instead of carrying the values in itself (so you could store vector data in arrays for compactness and lose as little performance as possible).

Overall, this would require a library with 97 classes (!!) if it went up to N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).  The final classes at the leaves of the tree would have ~20 (tiny) methods per operator for static versions (~60 for mutable versions).

By the way, Vec#[T] would be a non-abstract class, so you could actually use that if you wanted something that worked for your favorite type T and didn't require stellar performance; I also think you could mix in VecIsReference[T] and it would work, though I haven't actually written the code.

Also by the way, I am using Vec instead of Vector because of the convention by people who write collections libraries that "a Vector is an Array, not something to do math on"--the namespace is by now hopelessly conflicted.

  --Rex

Re: Scala vector and matrix math library

Hi Rex,

Thanks for putting your ideas down in class-hierarchy form.

On Fri, Aug 28, 2009 at 2:50 AM, Rex Kerr wrote:
> The hierarchy I'm thinking of now is:
>
> Scalar[T]   <-- need this for things like 2+v to work
> ScalarTypename

Please explain - in general, you cant add a scalar and a vector..?

> Vec[T] extends Product with Iterable[T]  // Anything else needed?

- Name choice: see end of message.
- Is T a subtype of Numeric?

>   VecIsReference[T] extends Vec[T]
>   VecIsRational[T] extends Vec[T]

- What do these classes bring beyond what's expressed in type param T
- is it performance?
- Im personally not much concerned by the performance issues, or at
least I dont want them to obscure accurate modeling of the underlying
domain. My preference would be to model the shape (ie interface) of an
abstract Vector trait hierarchy first, ensure it can cover the known
use cases, and only then worry about concrete implementations.
- What does the "ls" in the name signify?

>   VecIsMutable[T] extends Vec[T]

- What operations does this trait define? In Scala collections,
mutable collections re-interpret the meaning of some existing methods
to do in-place updates - I think a Vector library should do likewise.
Thus, can mutable & immutable share the same API, but with differing
semantics?

>   VecTypename extends Vec[Typename]  (with VecIsRational[T])

I dont understand what this one does.

>   Vec#[T] extends Vec[T]
>     Vec#Typename extends Vec#[Typename] with VecTypename
>       final Vec#T extends Vec#Typename
>       final MVec#T extends Vec#Typename with VecIsMutable[Typename]
>       final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
>         with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.

- I agree these defined-size {2..6} classes would be useful.
- Im not sure Mutability needs to be redefined per class, cant comment
without knowing what it might contain.

>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).

To me, any library claiming "standard" status must have provide a
purely abstract package, that other existing or future libraries can
implement. That follows directly from the "Separate Interface from
Implementation" principle.

As well as abstract types, the abstract package will need some Factory
API so that the creation operations can be virtual calls resolving to
a particular implementation.

However good you believe your implementation to be, it is inevitable
that someone will have alternate needs or ideas. Currently, the lack
of a separate, standardized interface forces each implementation (and
we know of at least 6) to publish a new interface.

It should be possible to map Scalala onto the abstractions, or
Commons-Math, or Java3D, by means of some implicit wrappers.

By all means provide a decent default impl in a separately
downloadable package, as you outline, but any library that fuses
interface and impl will simply become Yet Another Vector & Matrix
library.

Sometimes less is more. Defining a library of interfaces that doesnt
require a particular implementation will contribute much more lasting
value, IMO.

> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.

There's plenty of counter-examples: Scalala, Java3d, Commons-Math. I
prefer to call a vector "Vector".

-Ben

Re: Scala vector and matrix math library

On Thu, Aug 27, 2009 at 8:43 PM, Ben Hutchison <ben [at] playscapegames [dot] com> wrote:
On Fri, Aug 28, 2009 at 2:50 AM, Rex Kerr<ichoran [at] gmail [dot] com> wrote:
> Scalar[T]   <-- need this for things like 2+v to work
>   ScalarTypename

Please explain - in general, you cant add a scalar and a vector..?

2+v is shorthand for new Vector2(2,2)+v (if v has length 2).  This kind of thing is handy for vectors whose components are supposed to be indexes, for example.

Besides which, you need Scalar for 2*v.
 
> Vec[T] extends Product with Iterable[T]  // Anything else needed?

- Name choice: see end of message.
- Is T a subtype of Numeric?

Numeric is weird in that it knows how to do math on types, but isn't a type itself upon which math can be done.  So I would have Vec[T] (or AbstractVector[T] or whatever) actually define as abstract methods the operations on elements much like Numeric does.  It would then be trivial to create a NumericVector that used Numeric to implement those operations.

It might be that AbstractVector[T] should be a subtype of Numeric[T]; like Numeric, AbstractVector needs to know how to do math on its elements.
 
>   VecIsReference[T] extends Vec[T]
>   VecIsRational[T] extends Vec[T]

- What do these classes bring beyond what's expressed in type param T

Reference vector types would point into Array[T], and have methods to help with the pointing.  Or maybe they should point into Sequence[T].  They'd better not point into Vector[T] (i.e. scala.collections.Vector[T]) as that would be supremely confusing.

Rational vector types can be unit vectors, and have a notion of "approximately equal" i.e. equal to within the precision of the representation of the rational number (e.g. Float or Double).
 
- is it performance?

Not in this case.  It is in the Mutable case.
 
- Im personally not much concerned by the performance issues, or at
least I dont want them to obscure accurate modeling of the underlying
domain. My preference would be to model the shape (ie interface) of an
abstract Vector trait hierarchy first, ensure it can cover the known
use cases, and only then worry about concrete implementations.

A low-performance nicely-structured hierarchy is of limited use for small vectors.  If you are content with low performance, use a library that is designed for gigantic vectors and use that for small vectors.  Both things should be kept in mind.  I think both are attainable, but there may be some extra bits to the interface that are only there for performance reasons.
 
- What does the "ls" in the name signify?

It's my way of trying to keep the Vec part of the name first.  Perhaps a bad idea.
 
>   VecIsMutable[T] extends Vec[T]

- What operations does this trait define? In Scala collections,
mutable collections re-interpret the meaning of some existing methods
to do in-place updates - I think a Vector library should do likewise.
Thus, can mutable & immutable share the same API, but with differing
semantics?

Not entirely.  You can assign to mutable vectors.  You can't to non-mutable vectors.  If we call := the element-assignment operation, then
  val v = ImmutableVector(0,0)
  val u = MutableVector(1,1)
  var w = SomeSortOfVector(2,2)  // Can't we just use vars?
  v := u  // Nonsense, shouldn't compile
  u := v  // You'll want to do this all the time
  w = u  // Bad, want a copy, not a reference
  w = u.copy // Bad, needless object creation
 
>   VecTypename extends Vec[Typename]  (with VecIsRational[T])

I dont understand what this one does.

This provides concrete implementations for the primitive types.  So
  VectorInteger extends AbstractVector[Integer]
  VectorDouble extends AbstractVector[Double] with RationalVector[Double]
although as written VectorDouble could just extend RationalVector.  Hm.  There was some reason why I had thought to do it this way originally, but I think I changed the design.
 
>   Vec#[T] extends Vec[T]
>     Vec#Typename extends Vec#[Typename] with VecTypename
>       final Vec#T extends Vec#Typename
>       final MVec#T extends Vec#Typename with VecIsMutable[Typename]
>       final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
>         with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.

- I agree these defined-size {2..6} classes would be useful.
- Im not sure Mutability needs to be redefined per class, cant comment
without knowing what it might contain.

The mutable guys of a specific size override their getter and setter with a var:
  MVector2F(x0:Float,y0:Float) /* . . . */ {
    override var x = x0
    override var y = y0
    /* . . . */
  }
 
To me, any library claiming "standard" status must have provide a
purely abstract package, that other existing or future libraries can
implement. That follows directly from the "Separate Interface from
Implementation" principle.

Sure.  That's what Vector2[T], Vector3[T], ... are for.  They're abstract, but when Numeric is finalized they could be extended to non-abstract versions.  Or you could supply the needed element-wise methods.
 
As well as abstract types, the abstract package will need some Factory
API so that the creation operations can be virtual calls resolving to
a particular implementation.

Agreed.
 
Currently, the lack of a separate, standardized interface forces each implementation (and
we know of at least 6) to publish a new interface.

It should be possible to map Scalala onto the abstractions, or
Commons-Math, or Java3D, by means of some implicit wrappers.

It should be possible to map *an appropriately selected subset* of Scalala or Commons-Math or Java3D or java.awt.geom or whatever onto it.  But if you want it to be entirely generic, then you're really stuck with implementing a giant abstraction of group theory and Clifford algebras.

Extreme abstraction is just another form of extreme specialization, if it goes beyond what is commonly needed.
 
Sometimes less is more. Defining a library of interfaces that doesnt
require a particular implementation will contribute much more lasting
value, IMO.

Can you give some examples of these sorts of libraries?  It'd be easier for me to understand what you're looking for, and why you think it's of lasting value if you could give a concrete example.
 
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.

There's plenty of counter-examples: Scalala, Java3d, Commons-Math. I
prefer to call a vector "Vector".

So do I.  Except:

> import scalala.tensor._
> import scala.collection._
> val a = Vector(1,2,3)    // Um...confusing, much?

Now, if we could get rid of Vector from the Scala class hierarchy...but I fear it's already too late for that, as it is for Java, C++, etc. etc..

Maybe we should call the class ^->.  It's almost like putting an arrow over the type....

  --Rex

Re: Scala vector and matrix math library

On Fri, Aug 28, 2009 at 1:08 PM, Rex Kerr wrote:
> Rational vector types can be unit vectors, and have a notion of
> "approximately equal" i.e. equal to within the precision of the
> representation of the rational number (e.g. Float or Double).

I was thinking about this issue after your earlier queries to David.

If someone really wants a Vector[Int], a unit vector can be defined
for it. The quantization is really severe, but its possible: (2, 1)
becomes (1, 0), (2, 2) as well, given some rule for boundary cases
such as x == y. All computer maths involves quantization errors, its
just far more marked for the Int case than for floating-point.

Similarly, equality can be defined within an tolerance specified as an Int.

That might avoid needing an extra RationalVector abstraction to handle
these operations.

> A low-performance nicely-structured hierarchy is of limited use for small
> vectors.

Im not convinced that a nicely structured library implies "low"
performance, just not highest performance. And as you've noted,
seemingly costly abstraction can often be partly optimized away by the
compiler or runtime.

If you really want highest performance, you really shouldn't be
running on the JVM, due to its lack of user-defined value types: the
fact than Vectors are reference, not value, objects means that you
cannot eg compose 3 Vector3s into a 3DTriangle class where the 9
vertex coordinates of the Triangle are all stored together.

So, we're on the JVM because we want abstraction, reuse, portability
and good tools, along with /decent/, not best, performance. Lets do
the job properly and ensure a Vector library is nicely structured.

> Not entirely.  You can assign to mutable vectors.  You can't to non-mutable
> vectors.  If we call := the element-assignment operation, then

By assign, you mean: replace the entire content of Vector x with
Vector y (x := y) ?

Can't this be composed by combining all the element-wise assignments
(which are defined for the immutable case as copy-with-modification),
without introducing an extra trait at the interface level?

Also, object creation is already cheapish, and relatively speaking,
only getting cheaper with time. Copy-GC JVMs like Hotspot allocate
into a defragmented heap region, so allocation is a (synchronized)
pointer increment. Copy GC does work approx proportional to surviving
objects, so cleanup of short lived objects is cheap as well. Im
dubious about the value of object reuse myself.

>> It should be possible to map Scalala onto the abstractions, or
>> Commons-Math, or Java3D, by means of some implicit wrappers.
>
> It should be possible to map *an appropriately selected subset* of Scalala
> or Commons-Math or Java3D or java.awt.geom or whatever onto it.  But if you
> want it to be entirely generic, then you're really stuck with implementing a
> giant abstraction of group theory and Clifford algebras.

Ive experienced cases of speculative generalization and the poor APIs
that result. Good generalizations really are harvested.

In the case of Vector/Matrix libraries, we've got 6 examples in
JVM-land alone. Surely we have something to harvest that reflects the
real-world challenges those 6 libraries have faced..

> Can you give some examples of these sorts of libraries?  It'd be easier for
> me to understand what you're looking for, and why you think it's of lasting
> value if you could give a concrete example.

Firstly, what Im looking for: to build maths-reliant applications on
top of something fairly solid and lasting, and without being coupled
to a particular implementation forever.

When I write code that references, eg Scalala's Vector, the fate of my
app is partly coupled to Scalala, and thus to unpredictable factors
like Daniel Ramage's future career choices. If I write most of my
client against a minimal, abstracted API expressly designed for
re-implementation, I feel less vulnerable to the changing fortunes in
the underlying library.

As for an example, java.util.Collections. Its *far* from perfect, but
its been an resounding success as an example of an API that 3rd
parties can and do reimplement. In Javaland, when you want to write a
fancy new type of concurrent hashtable, you don't need to start by
defining what Map means.

>> > Also by the way, I am using Vec instead of Vector because of the
>> > convention
>
>> import scalala.tensor._
>> import scala.collection._
>> val a = Vector(1,2,3)    // Um...confusing, much?
>
> Now, if we could get rid of Vector from the Scala class hierarchy.

Scala already has a means to address this namespace-collision problem:

import scala.collection.{Vector => CVector, _}

-Ben

Re: Scala vector and matrix math library

Hi folks,

I'm the primary author of Scalala. This has been a really fascinating
discussion to read through! Sorry to not have chimed in earlier. I'm glad
to see all these issues being vetted in a wider community forum. The
scalala user group currently has 18 members (and growing!)
http://groups.google.com/group/scalala and several community members have
already submitted code that has been integrated into the mainline tree. So
we're a small community at the moment, but I'd really like to encourage
everyone reading this thread to click around and play with Scalala, even
though its documentation is still spotty.

I was going to reply to a bunch of these messages individually, but thought
I'd collate some major points of response here.

= Scalala's design philosphy =

My original motivation for creating scalala was to be able to abstract away
from directly manipulating basic vector/matrix data types and to provide a
more matlab-like syntax for interacting with them. I believe that the
language of linear algebra can and should be specified in terms of
operators, constructors, and selectors, and as such should be really a
minimal syntax-driven library design. Originally, scalala was just an
operator-supporting wrapper around MTJ types until I found some of MTJ's
Java-y design decisions too limiting. It evolved to provide a more elegant
type system while heavily borrowing from MTJ's internals to provide
optimized performance on common routines (although only some of those code
paths are optimized at present).

= Regarding a (new?) standard API for vector / matrix math =

I'd argue that the standard API for vector math should be based on those
operators, constructors, and selectors, that make the most sense from a code
usability perspective. And, of course, performance shouldn't be sacrificed
in the process. Scalala's design choices are a work in progress - the goal
is to form a scala-y synthesis of the good parts of existing Java libraries
while taking a page from more mature numerical packages like Matlab and R.

Scalala's design factors out operators from data types, and puts almost all
the hard work into the operators. There's an overview of usage here
http://code.google.com/p/scalala/wiki/QuickStart. Because the operators are
type preserving, the efforts of library writers / users can go toward other
things (like making better specialized data representations, such as a
ImmutableVector2 or what have you.)

As a result, it is my hope that code written using scalala is more or less
future-proof in that similar libraries might make similar choices about how
mathematical statements should be written. Hence if bad things happen to
Scalala, one could plug in the next cool linear algebra library and only the
data types might change. I'd love to have more discussion about that.

val x = DenseVector(3)(0,1,2); // construct a vector [dense] of size 3
val z = x * 2 + 1 value; // take the value of the given expression
val A = DenseMatrix(3,4)(1); // construct a dense matrix of ones
val b = A.t * z value; // transpose and multiple matrix by vector
z :+= x; // incrementing
z := x; // assignment
// etc ..., more at the QuickStart linked above

An operator-driven style of coding puts some demands on the type system that
supports those operators, and also implies some data type choices. In
particular, efficiently supporting three-argument-or-more specialized
methods like "A * x + b" requires building object-based expression trees
that can in turn be farmed out to specialized BLAS / LAPACK calls (which is
a waste of time on 2D vectors, admittedly, but critical for the larger
vectors I work with). The current design of the library reflects the desire
to support new data representations easily (no need to re-implement all the
math on each data type) while preserving type-safety across operations.
It'll get even better in scala 2.8 with David Hall's improvements.

= Regarding 2D / 3D specializations and immutability =

We hadn't had the use case of specialized 2D and 3D vectors in mind from the
start, but I'm completely open to adding support for 2D/3D specializations
to Scalala, although I'll be the first to admit that with vectors so short,
you'll likely need specialized compile time optimizations to avoid spending
all the time in expression tree construction.

Ben Hutchison wrote:
>
> Firstly, what Im looking for: to build maths-reliant applications on
> top of something fairly solid and lasting, and without being coupled
> to a particular implementation forever.
>
> When I write code that references, eg Scalala's Vector, the fate of my
> app is partly coupled to Scalala, and thus to unpredictable factors
> like Daniel Ramage's future career choices. If I write most of my
> client against a minimal, abstracted API expressly designed for
> re-implementation, I feel less vulnerable to the changing fortunes in
> the underlying library.
>

Just to clarify this a bit more, I think this ought to be thought of on two
levels. There are data types and there are operators/usages. I think
defining a truly "minimal API" in the sense that Ben wants might plausibly
be limited only to the operator/usage half. Regarding a hierarchy of types,
I'd argue that Scalala's definition is pretty reasonable. Where it misses
desiderata from the current discussion is in immutability (definitely
addable) and specialization to non-Double values (trickier). I'm open to
adding both of these to Scalala if anyone's up to it.

Daniel Ramage

Re: Scala vector and matrix math library

On Sat, Aug 29, 2009 at 6:41 AM, dramage wrote:
> I'm the primary author of Scalala.  This has been a really fascinating
> discussion to read through!  Sorry to not have chimed in earlier.

Hi Daniel.

Good to have your input to the discussion.

I am pretty impressed by the design of Scalala. If a standardized
Vector/Matrix interface is developed, I think it makes sense to lift
some concepts from Scalala into that interface, and to ensure that
Scala can easily be an implementation of it (or likely initially, the
primary impl. of it).

As has been mentioned, there's features people wish for that Scalala
doesnt presently provide (eg: fixed-size Vectors; support for other
number types than double via type param, immutability), but I dont see
any of them as really big obstacles. Especially if you're happy for
Scalala to evolve in those directions.

> Scalala's design factors out operators from data types, and puts almost all
> the hard work into the operators.

Personally, this is one aspect of Scalala that I feel a little uneasy
& uncertain about: the TensorOp design. I'd like to understand better
why you chose that style (a DSL with deferred evaluation?) over using
instance methods on the tensor classes.

Separated operations are not directly visible in an objects API. Does
finding out what operations are available in some context becomes
harder for a client programmer to navigate?

Another consequence is that operators defined independently of the
data they operate on can only use the public API of the data object,
and thus they cannot access private encapsulated state? (I see this as
somewhat analogous to the Anemic- vs Rich- Domain debate that has run
in enterprise development circles for the last 5-10 years. Anemic
domains separate data from operations over the data, while rich
domains combine them. The names "Rich" and "Anemic" are a bit
prejudicial.)

Because the operators are
> type preserving, the efforts of library writers / users can go toward other
> things (like making better specialized data representations, such as a
> ImmutableVector2 or what have you.)

If operations are defined in traits, can library writers can still
focus on making better specialized data representations, and simply
the mixin the operation traits that their implementation will
provide...?

Ie along the lines of Odersky/Zenger's "Scalable Component
Abstractions" paper, which I feel outlines a really nice &
straightforward way to create components and get reuse in Scala-based
software.

-Ben

Re: Scala vector and matrix math library

Ben Hutchison wrote:
> On Sat, Aug 29, 2009 at 6:41 AM, dramage wrote:
>> Scalala's design factors out operators from data types, and puts almost all
>> the hard work into the operators.
>
> Personally, this is one aspect of Scalala that I feel a little uneasy
> & uncertain about: the TensorOp design. I'd like to understand better
> why you chose that style (a DSL with deferred evaluation?) over using
> instance methods on the tensor classes.
I am also curious. Is this to allow some kind runtime of optimisations?
(Is it possible with this approach to restructure the "AST" of the
expressions runtime to minimise the multiplication work? (like:
(A_mx2)*(B_2xn)*(C_nxk), all dense matrices, and the amount of work is
not -necessarily- the same for (A*B)*C and A*(B*C)), but at runtime the
proper might be selected (is it an NP problem?).)
Bests, gabor

Re: Scala vector and matrix math library

On Fri, Aug 28, 2009 at 1:21 AM, Ben Hutchison <ben [at] playscapegames [dot] com> wrote:
If someone really wants a Vector[Int], a unit vector can be defined
for it. The quantization is really severe, but its possible: (2, 1)
becomes (1, 0)

Ouch.  That is possible, but is it useful?  I can't think of a case where it would be.  Why make library implementers implement junk "unit" methods?  It seems like this should be isolated as a separate trait so you only have to bother with it when it's useful.  The default implementation of make-a-unit-vector will not produce anything sane here, and any tolerances will have to be implemented very carefully to have consistent and known behavior on corner cases.  (If the tolerance is 10, it matters whether it's inclusive or not (or half-inclusive on one side); if the tolerance is 1e-12, it doesn't really matter because you shouldn't expect an exact value of 1e-12 anyway.)
 
So, we're on the JVM  because we want abstraction, reuse, portability
and good tools, along with /decent/, not best, performance. Lets do
the job properly and ensure a Vector library is nicely structured.

I think we basically agree--we want a nice structure and decent performance.  I just emphasize the performance part more because I think that's relatively hard, whereas coming up with a sane class hierarchy is relatively easy.  Coming up with the *uniquely objectively most sane* class hierarchy might be hard or impossible, though.
 
> Not entirely.  You can assign to mutable vectors.  You can't to non-mutable
> vectors.  If we call := the element-assignment operation, then

By assign, you mean: replace the entire content of Vector x with
Vector y (x := y) ?

Can't this be composed by combining all the element-wise assignments
(which are defined for the immutable case as copy-with-modification),
without introducing an extra trait at the interface level?

No, copy-with-modification requires a var, while internal updates work fine with vals.
 
Also, object creation is already cheapish, and relatively speaking,
only getting cheaper with time. .... Im
dubious about the value of object reuse myself.

Well, I just benchmarked it (based on the same code posted in my reply to David but I added a + method and played with val/var switching).  And I see, for 2-ary vector-vector addition:
  Mutable vector (+=)  2.0 ns / op
  Mutable vector (+)  9.8 ns / op
  Immutable vector (+)  10.2 ns / op
(it's an even bigger difference for vector + two integer arguments).

My original 2D/3D vector library post stressed the "easy to use immutable vector" part of it.  I agree that for many applications you don't care about the extra 8 ns--if you're using it for mouse updates, they'll be coming in hundreds of us apart or more.  On the other hand, a factor of 5 is huge if your application's bottleneck is vector math (as David Flemström said his was, and as a lot of my code is--though David seems very fond of immutable vectors for some reason).
 
As for an example, java.util.Collections. Its *far* from perfect, but
its been an resounding success as an example of an API that 3rd
parties can and do reimplement.

Okay.  java.util.Collections is full of things that are for performance only, especially as you get deeper into the hierarchy.  But I agree that it's laid out reasonably nicely by separate functionality, especially at the top.  I am comfortable with that sort of compromise.  (E.g. you have an ArrayList and a HashSet for performance, but the top level distinction is between lists and sets, not between arrays and hashtables.)
   --Rex

Re: Scala vector and matrix math library

p, li { white-space: pre-wrap; }On Friday 28 August 2009 17:26:14 Rex Kerr wrote:
> (as David Flemström
> said his was, and as a lot of my code is--though David seems very fond of
> immutable vectors for some reason.


That's because I'll be using vectors in a heavily threaded environment, and would rather want my *vector* fields to be @synchronized instead of having to use @synchronized vector *values* and locks for vector mutation. As you know, immutability is invaluable for concurrent programming, but you can't build an immutable library on top of a mutable one, while it's very easy to build a mutable system on top of an immutable one, that also is very performant.


Also, I won't be updating my vectors that often - 120 times per second perhaps in non-critical parts of the program, and as many updates as the CPU can muster in only some locations where I'm better off using "var x: Float; var y: Float" instead anyways since method calls are expensive too, no matter how much you reduce object creations - but they will instead be used in loads and loads of different locations simultaneously, and thus immutablilty is an important aspect there too.


And really, when you think about it: how often do you really find yourself in situations where you need to do 100.000.000+ vector mutations per second? What is it that you are doing; drawing 3D fractals, doing real-time fluid simulation or calculating weather forecasts? If so, then the JVM is the wrong platform for you anyways. Go with Haskell or Scheme or Erlang or C; if you still need semi-platform-independence, use C and LLVM.


BTW, a Scala compiler for LLVM would be awesome. One can dream, right?


David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

On Sat, Aug 29, 2009 at 3:18 AM, David
Flemström wrote:
> On Friday 28 August 2009 17:26:14 Rex Kerr wrote:
>> (as David Flemström
>> said his was, and as a lot of my code is--though David seems very fond of
>> immutable vectors for some reason.
>
> That's because I'll be using vectors in a heavily threaded environment, and
> would rather want my *vector* fields to be @synchronized instead of having
> to use @synchronized vector *values* and locks for vector mutation.

I agree that this design style - fine-grained objects are immutable,
coarse grained entities ("objects with identity") may be mutable - is
a great way to build software, especially for concurrency, and an
important use case for any general library to meet.

-Ben

Re: Scala vector and matrix math library

On Fri, Aug 28, 2009 at 1:18 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
That's because I'll be using vectors in a heavily threaded environment, and would rather want my *vector* fields to be @synchronized

Ah, that makes sense now.  The safety of immutability would be a big win there--it usually takes a lot longer to negotiate locks than it does to create an object.
 
And really, when you think about it: how often do you really find yourself in situations where you need to do 100.000.000+ vector mutations per second?

Personally?  Every day.  Any time you have numeric data of a reasonable size, there's the opportunity to do calculations on it, and some data is intrinsically 2D or 3D or more (e.g. GPS coordinates).  A lot of algorithms are N^2 or worse, and even N log(N) can take a while if you have collected an interestingly large quantity of data.
 
Go with Haskell or Scheme or Erlang or C

Huh?  Java (and Scala, when written for performance) is faster than any Erlang I know of, and is generally faster than any Scheme implementation I know of.  It's roughly even with Haskell GHC, and within a factor of 2 of C.

Basically, as soon as you're not using C/C++ or other lower level procedural languages, you're as well off or better using the JVM as anything else.  (I did write real-time image processing code in C++ written in C-like style when I really needed performance.  Painful.  I'd rather not have to do that again.)

  --Rex

Re: Scala vector and matrix math library

p, li { white-space: pre-wrap; }On Friday 28 August 2009 20:00:36 Rex Kerr wrote:
> Huh? Java (and Scala, when written for performance) is faster than any
> Erlang I know of, and is generally faster than any Scheme implementation I
> know of. It's roughly even with Haskell GHC, and within a factor of 2 of
> C.
>
> Basically, as soon as you're not using C/C++ or other lower level
> procedural languages, you're as well off or better using the JVM as
> anything else. (I did write real-time image processing code in C++ written
> in C-like style when I really needed performance. Painful. I'd rather not
> have to do that again.)
My point was rather that those languages don't have the same overheads for operating with small data structures that the JVM has while still being acceptable programming languages for high-performance tasks. Oh, and if you don't run Haskell/Erlang/Scheme interpreted or in a VM but instead compile them (or alternatively compile them to LLVM intermediate BC which is possible to some extent since the compiler support is there in many cases), you can easily match hand-written assembly because of the extreme static optimizations that can be done in those languages (because of laziness, immutability and extreme modularity; you can isolate one function, see when it returns certain output values depending on its input values, and then just replace it with a lookup table for instance).

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

I think that a compiler plugin that would allow for basic macros would do the
trick in a situation like this. We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

For example, this will still yield a getter method x for class B and C:

trait A {
def x: Int
}

class B extends A {
val x: Int = 3
}
class C extends A {
var x: Int = 3
}

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO. The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

On Thursday 27 August 2009 18:50:41 Rex Kerr wrote:
> The problem with creating simultaneously flexible and high-performance
> vector math code is not that it's hard to do, it's just that there is *so
> much* repetitive code to write that no sane person would try to do it by
> hand. It's tricky to shift all that burden to the compiler via
> @specialized, because this is @very_very_specialized in terms of sensible
> operations that should happen quickly.
>
> But given that all the code follows from a much smaller set of basic
> operations and rules for what makes things run quickly, it seems like
> automatic code generation should in theory be possible.
>
> Hence, I think the right answer for now is to write a program that
> automatically generates a high performance small vector library. I've
> played with this a little and there don't seem to be any show-stoppers for
> creating the leaves of the library (which are all final for performance,
> and have overloaded methods to do math with each other). I haven't worked
> out how to cleanly auto-generate the right class hierarchy yet, but that
> shouldn't be too daunting.
>
> The hierarchy I'm thinking of now is:
>
> Scalar[T] <-- need this for things like 2+v to work
> ScalarTypename
> Vec[T] extends Product with Iterable[T] // Anything else needed?
> VecIsReference[T] extends Vec[T]
> VecIsRational[T] extends Vec[T]
> VecIsMutable[T] extends Vec[T]
> VecTypename extends Vec[Typename] (with VecIsRational[T])
> Vec#[T] extends Vec[T]
> Vec#Typename extends Vec#[Typename] with VecTypename
> final Vec#T extends Vec#Typename
> final MVec#T extends Vec#Typename with VecIsMutable[Typename]
> final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
> with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.
>
> The MVec2I series would be mutable; the RefVec2I series would be mutable
> and index into an array instead of carrying the values in itself (so you
> could store vector data in arrays for compactness and lose as little
> performance as possible).
>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes). The final
> classes at the leaves of the tree would have ~20 (tiny) methods per
> operator for static versions (~60 for mutable versions).
>
> By the way, Vec#[T] would be a non-abstract class, so you could actually
> use that if you wanted something that worked for your favorite type T and
> didn't require stellar performance; I also think you could mix in
> VecIsReference[T] and it would work, though I haven't actually written the
> code.
>
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.
>
> --Rex

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

On Thu, Aug 27, 2009 at 1:36 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

That's why you write the Scala vector package producer in Scala (like I did for my trial project).  Then you don't have anything extra to carry around.
 
Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

I should have explained in more detail why there would be so many methods per operation.

trait A {
  def x : Int
  def +(a:A) : A
}

final class B(val x:Int) extends A {
  def +(a:A) = new B(x + a.x)   // a.x must use a getter
  def +(b:B) = new B(x + b.x)   // b.x can access field directly
  def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
}

final class C(private val a:Array[Int],private val i:Int) extends A {
  def x = a(i)
  def +(a:A) = new B(x + a.x)  // a.x is a getter, x need not be (haven't checked)
  def +(b:B) = new B(x + b.x)  // b.x is a field access
  def +(c:C) = new B(x + c.x)  // Should -optimise to a(i)+c.a(c.i)
}

So it is entirely true that if you want to use A in your code, you'll have to go through an extra method call and things will be slower.

But it is also entirely true that you are not required to use A unless you honestly don't know when you've got a B and when you've got a C.  You can mix B and C with no superfluous method calls because B and C know about each other and are final, and can use class-specific field access instead of falling back on the generic getter method in A.

This gives the best of both worlds: if you know your types and you know what you're doing, you can write code that is as fast as using primitives by hand.  If you want to be blind to immutability or references or whatever, only then do you need to use A and the only cost is that you'll pay an extra method call.

The only place you lose is in code size.

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO.

Did you see all the benchmarks I posted?  For mutable vectors, extra method calls triple (or more) how long a simple arithmetic operation takes.  IMO a factor of three is quite significant.  I stopped using Python in favor of Java over factors of three, for example.

(Specifically, the relevant elapsed times for 2-vectors are:
  primitive+primitive (calculated on the fly) = 100
  primitive+primitive (from array) = 200
  final vector+primitive (calc on the fly) = 80
  final vector+final vector (from array) = 140
  trait+primitive (calc on the fly) = 400
  trait+trait (from array) = 400
)
 
The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

I agree.  Once it's stabilized this is definitely a thing to do.  (Not the only thing, but it'd make everyone's life a lot easier.)  I'm not sure whether one actually wants to include Numeric[T] as a field, though--this adds 4-8 bytes to every single vector.  Why not

trait GenericVector[T] extends Vector with Numeric[T] { ... }

and then create vectors with something like

val a = new GenericVector[BigInt] with Numeric.BigIntIsIntegral

if that can work (extra typing, but a bit of typing might be worth the space if you're using something smaller than a BigInt).  Numeric does not define operators directly, so there shouldn't be any risk of name collision.
   --Rex

Re: Scala vector and matrix math library

On Thu, Aug 27, 2009 at 8:46 PM, Rex Kerr wrote:
> final class B(val x:Int) extends A {
>   def +(a:A) = new B(x + a.x)   // a.x must use a getter
>   def +(b:B) = new B(x + b.x)   // b.x can access field directly
>   def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
> }

Maybe this is already possible to implement by exploiting the type
system in some way. See
http://michid.wordpress.com/2008/10/29/meta-programming-with-scala-condi...

BR,
John

Re: Scala vector and matrix math library

p, li { white-space: pre-wrap; }On Thursday 27 August 2009 20:46:02 Rex Kerr wrote:
> trait A {
>   def x : Int
>   def +(a:A) : A
> }
>
> final class B(val x:Int) extends A {
>   def +(a:A) = new B(x + a.x)   // a.x must use a getter
>   def +(b:B) = new B(x + b.x)   // b.x can access field directly
>   def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
> }
>
> final class C(private val a:Array[Int],private val i:Int) extends A {
>   def x = a(i)
>   def +(a:A) = new B(x + a.x)  // a.x is a getter, x need not be (haven't
> checked) def +(b:B) = new B(x + b.x)  // b.x is a field access
>   def +(c:C) = new B(x + c.x)  // Should -optimise to a(i)+c.a(c.i)
> }
Well, see, here's where I'm not so sure about how things work. Have you really tested this and seen that what you say in the comments in fact is true?


I know that Scala does "stabilization" (in lack of a better word) by seeing where getters can be removed entirely and where they have to stay being methods.


However, as long as you are extending something that already uses a getter, you can't "override" that getter with a stable val. Think about the following example:


val x: A = new B(3)


Obviously, when you call x.x, you will use a getter because that's what the trait has defined (and the JVM makes a difference between field accesses and method calls, so it cant transform the one into another except by HotSpoting. Same true for scalac), and ergo, the B class (even by itself) will in fact contain a getter ("def x = $x" where $x is some private generated variable) and not a pure final val, because the val is "unstable".


...so, all of the + methods above will in fact use a getter all of the time, unavoidably, with my line of reasoning. Please prove me wrong on this one, preferrably by publishing your test suite like Mr. Juma has proposed.

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

On Thu, Aug 27, 2009 at 3:01 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
On Thursday 27 August 2009 20:46:02 Rex Kerr wrote:
> trait A {
>   def x : Int
>   def +(a:A) : A
> }
>
> final class B(val x:Int) extends A {
>   def +(a:A) = new B(x + a.x)   // a.x must use a getter
>   def +(b:B) = new B(x + b.x)   // b.x can access field directly
>   def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
> }
>
> final class C(private val a:Array[Int],private val i:Int) extends A {
>   def x = a(i)
>   def +(a:A) = new B(x + a.x)  // a.x is a getter, x need not be
>   def +(b:B) = new B(x + b.x)  // b.x is a field access
>   def +(c:C) = new B(x + c.x)  // Should -optimise to a(i)+c.a(c.i)
> }
Well, see, here's where I'm not so sure about how things work. Have you really tested this and seen that what you say in the comments in fact is true?

Yes, I've tested, both with javap -c and with benchmarking, with four gigantic caveats:
  (1) The HotSpot compiler is extraordinarily finicky about what it will and will not optimize.
  (2) Some of my tests used enough memory so the GC played a role
  (3) 2.8.0 behaves differently from 2.7.5 in terms of what it leaves as invokevirtual and what it unwraps to getfield/putfield
  (4) There was a compiler bug that was just fixed as of the 18583 nightly build that was making it hard for me to get consistent benchmarks.  (18583 looks like it has a couple of debugging statements in it that weren't commented out, incidentally.)

Think about the following example:


val x: A = new B(3)


Obviously, when you call x.x, you will use a getter because that's what the trait has defined (and the JVM makes a difference between field accesses and method calls, so it cant transform the one into another except by HotSpoting. Same true for scalac), and ergo, the B class (even by itself) will in fact contain a getter ("def x = $x" where $x is some private generated variable) and not a pure final val, because the val is "unstable".


...so, all of the + methods above will in fact use a getter all of the time, unavoidably, with my line of reasoning. Please prove me wrong on this one

Let's first tackle the code generation issue.  With 2.7.x without optimise, we see that class C has a getter method:
public int x();
  Code:
   0:    aload_0
   1:    invokespecial    #52; //Method a:()[I
   4:    aload_0
   5:    invokespecial    #54; //Method i:()I
   8:    iaload
   9:    ireturn
And when we use the +(b:B) form, for example, we see:
public B $plus(B);
  Code:
   0:    new    #34; //class B
   3:    dup
   4:    aload_0
   5:    invokevirtual    #38; //Method x:()I
   8:    aload_1
   9:    invokevirtual    #44; //Method B.x:()I
   12:    iadd
   13:    invokespecial    #41; //Method B."<init>":(I)V
   16:    areturn
which is exactly what you'd expect.  But with -optimise we now get
public B $plus(B);
  Code:
   0:    new    #34; //class B
   3:    dup
   4:    aload_0
   5:    invokespecial    #37; //Method a:()[I
   8:    aload_0
   9:    invokespecial    #40; //Method i:()I
   12:    iaload
   13:    aload_1
   14:    getfield    #48; //Field B.x:I
   17:    iadd
   18:    invokespecial    #43; //Method B."<init>":(I)V
   21:    areturn
that is, scalac has created the getter but doesn't use it for B.  So it is entirely avoidable.  However, 2.8.0 doesn't bother to avoid it, possibly trusting that HotSpot will elide the unnecessary invokevirtuals.  (2.8.0 produces the same code as 2.7.x without optimise in this case.)

Anyway, it is entirely possible for the compiler to omit correct bytecode that avoids getters and setters in the hierarchy I gave.  Whether it makes a difference for performance, however, requires testing.
 
, preferrably by publishing your test suite like Mr. Juma has proposed.

I've been trying to fix at least some of problems (1)-(4) before sharing any code.  Iulian seems to have fixed the compiler bug.  I've fixed (2).  (3) you just have to live with.  The thing I'm attaching seems to play somewhat better with JIT compilation; I assume this will continue to get better, so we should be aiming more for the (reasonable) best case.

In this case it doesn't appear to matter whether the class is final--the JIT figures it out and fixes it.  Since I saw other cases where it did appear to matter, I'm still a little uneasy, but I'm relieved that the most stable test code I've used so far (in terms of irrelevant changes not causing sizable differences in performance) indicates that most getter/setter code is optimized away entirely by the JIT compiler *if* it's in nice small bite-sized pieces in separate methods.  Otherwise it's very hit or miss, and some scalac optimizations can help out.  (For example, 2.8.0 was inlining code from another class defined in the same file--that really helped HotSpot out in the middle of a huge test method.)

Test code is attached, for anyone interested--play around with switching which classes are used, if you're interested!  The output reports ns / op, that is, nanoseconds per vector operation.  (In this case, it's a 2-ary +=.)  You have to manually change the code and recompile and run to change which classes get used--it wasn't worth the time to try to add every single interesting case, and some of them interfere with each other anyway.

The biggest difference I see is a switch between 2.0 ns/op when the code knows I'm using leaf classes (i.e. may as well be final) or all leaf classes are the same (even though they get passed to methods that take superclasses); 2.6 ns/op when I'm using different classes (e.g. type V as data and W as the accumulator--as long as the JVM knows that two leaves exist and are used somewhere it slows down a little); and 4.0+ ns/op when I have an array of almost all one leaf class that is slightly poisoned by another leaf class (meaning that the superclass is really required).  Using the fully generic one is hopeless--integers get boxed, which takes raises it to 11.4 ns/op.

This all goes to remind me how much I hate trying to benchmark anything to do with the JVM, how necessary it is, and what an amazingly good job JIT can do in some but not all cases.

Anyway, even though the compiler can emit supposedly optimized code in some cases, the HotSpot compiler can do the same job behind the scenes, so custom methods appear to not be highly necessary for mutable/immutable/reference/whatever distinctions: having everyone use the superclass only seems to impose a 30% penalty at worst.  I'm not sure overcoming that is enough to warrant tripling (or more) the size of the code.

  --Rex

Re: Scala vector and matrix math library

Just to point out, it's not possible in scala to directly access a field; even when you use val or var, you are still going through generated getters/setters to support for uniform access.

final class B(val x:Int) extends A {
  def +(a:A) = new B(x + a.x)   // a.x must use a getter
  def +(b:B) = new B(x + b.x)   // b.x can access field directly
  def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
}

If you run javap on the above, you'll note the only difference in bytecode between the overloaded A and B variants of + is one instruction: invokevirtual in the case of B and invokeinterface for A. Indeed a small savings, but nothing to write home about; certainly not worth the pain of virtually identical overloaded methods.

 - Colin

Re: Scala vector and matrix math library

Hey Rex,

On Thu, 2009-08-27 at 14:46 -0400, Rex Kerr wrote:

> Did you see all the benchmarks I posted? For mutable vectors, extra
> method calls triple (or more) how long a simple arithmetic operation
> takes. IMO a factor of three is quite significant. I stopped using
> Python in favor of Java over factors of three, for example.

Is the code for the various benchmarks you have been talking about
available anywhere so that others can verify? HotSpot performance can
vary quite a bit depending on the machine, JVM settings, version, etc.
so it would be good to do that.

Best,
Ismael

Re: Scala vector and matrix math library


 BTW, are there plans for meta programming in Scala ?

   http://groovy.codehaus.org/Compile-time+Metaprogramming+-+AST+Transformations

 That would be an alternative to a compiler plugin wouldn't it ?


On Thu, Aug 27, 2009 at 1:36 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
I think that a compiler plugin that would allow for basic macros would do the
trick in a situation like this. We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

For example, this will still yield a getter method x for class B and C:

trait A {
 def x: Int
}

class B extends A {
 val x: Int = 3
}
class C extends A {
 var x: Int = 3
}

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO. The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

On Thursday 27 August 2009 18:50:41 Rex Kerr wrote:
> The problem with creating simultaneously flexible and high-performance
> vector math code is not that it's hard to do, it's just that there is *so
> much* repetitive code to write that no sane person would try to do it by
> hand.  It's tricky to shift all that burden to the compiler via
> @specialized, because this is @very_very_specialized in terms of sensible
> operations that should happen quickly.
>
> But given that all the code follows from a much smaller set of basic
> operations and rules for what makes things run quickly, it seems like
> automatic code generation should in theory be possible.
>
> Hence, I think the right answer for now is to write a program that
> automatically generates a high performance small vector library.  I've
> played with this a little and there don't seem to be any show-stoppers for
> creating the leaves of the library (which are all final for performance,
> and have overloaded methods to do math with each other).  I haven't worked
> out how to cleanly auto-generate the right class hierarchy yet, but that
> shouldn't be too daunting.
>
> The hierarchy I'm thinking of now is:
>
> Scalar[T]   <-- need this for things like 2+v to work
>   ScalarTypename
> Vec[T] extends Product with Iterable[T]  // Anything else needed?
>   VecIsReference[T] extends Vec[T]
>   VecIsRational[T] extends Vec[T]
>   VecIsMutable[T] extends Vec[T]
>   VecTypename extends Vec[Typename]  (with VecIsRational[T])
>   Vec#[T] extends Vec[T]
>     Vec#Typename extends Vec#[Typename] with VecTypename
>       final Vec#T extends Vec#Typename
>       final MVec#T extends Vec#Typename with VecIsMutable[Typename]
>       final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
>         with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.
>
> The MVec2I series would be mutable; the RefVec2I series would be mutable
> and index into an array instead of carrying the values in itself (so you
> could store vector data in arrays for compactness and lose as little
> performance as possible).
>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).  The final
> classes at the leaves of the tree would have ~20 (tiny) methods per
> operator for static versions (~60 for mutable versions).
>
> By the way, Vec#[T] would be a non-abstract class, so you could actually
> use that if you wanted something that worked for your favorite type T and
> didn't require stellar performance; I also think you could mix in
> VecIsReference[T] and it would work, though I haven't actually written the
> code.
>
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.
>
>   --Rex


David Flemström
david [dot] flemstrom [at] gmail [dot] com

Re: Scala vector and matrix math library

Hey Folks,

i've only been dipping in and out of this thread, but if you are seriously considering an API for matrix maths, may i suggest that you use Geometric Algebra (see also Clifford Algebra) as the basis for the API. You will get huge leap in abstraction and usability. The graphics and physics community are very, very excited about this way of structuring the basic operations. In fact, folks are so excited people have started to look at hardware accelerations for Geometric Algebra.

If you'd like an example, here is a C++ template-based lib for Clifford Algebra. i can't remember if they make use of BLAS/LAPACK to effect and/or optimize stuff underneath or if this is a grounds up implementation.

Best wishes,

--greg

On Thu, Aug 27, 2009 at 10:44 AM, Maxime Lévesque <maxime [dot] levesque [at] gmail [dot] com> wrote:

 BTW, are there plans for meta programming in Scala ?

   http://groovy.codehaus.org/Compile-time+Metaprogramming+-+AST+Transformations

 That would be an alternative to a compiler plugin wouldn't it ?


On Thu, Aug 27, 2009 at 1:36 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
I think that a compiler plugin that would allow for basic macros would do the
trick in a situation like this. We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

For example, this will still yield a getter method x for class B and C:

trait A {
 def x: Int
}

class B extends A {
 val x: Int = 3
}
class C extends A {
 var x: Int = 3
}

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO. The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

On Thursday 27 August 2009 18:50:41 Rex Kerr wrote:
> The problem with creating simultaneously flexible and high-performance
> vector math code is not that it's hard to do, it's just that there is *so
> much* repetitive code to write that no sane person would try to do it by
> hand.  It's tricky to shift all that burden to the compiler via
> @specialized, because this is @very_very_specialized in terms of sensible
> operations that should happen quickly.
>
> But given that all the code follows from a much smaller set of basic
> operations and rules for what makes things run quickly, it seems like
> automatic code generation should in theory be possible.
>
> Hence, I think the right answer for now is to write a program that
> automatically generates a high performance small vector library.  I've
> played with this a little and there don't seem to be any show-stoppers for
> creating the leaves of the library (which are all final for performance,
> and have overloaded methods to do math with each other).  I haven't worked
> out how to cleanly auto-generate the right class hierarchy yet, but that
> shouldn't be too daunting.
>
> The hierarchy I'm thinking of now is:
>
> Scalar[T]   <-- need this for things like 2+v to work
>   ScalarTypename
> Vec[T] extends Product with Iterable[T]  // Anything else needed?
>   VecIsReference[T] extends Vec[T]
>   VecIsRational[T] extends Vec[T]
>   VecIsMutable[T] extends Vec[T]
>   VecTypename extends Vec[Typename]  (with VecIsRational[T])
>   Vec#[T] extends Vec[T]
>     Vec#Typename extends Vec#[Typename] with VecTypename
>       final Vec#T extends Vec#Typename
>       final MVec#T extends Vec#Typename with VecIsMutable[Typename]
>       final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
>         with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.
>
> The MVec2I series would be mutable; the RefVec2I series would be mutable
> and index into an array instead of carrying the values in itself (so you
> could store vector data in arrays for compactness and lose as little
> performance as possible).
>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).  The final
> classes at the leaves of the tree would have ~20 (tiny) methods per
> operator for static versions (~60 for mutable versions).
>
> By the way, Vec#[T] would be a non-abstract class, so you could actually
> use that if you wanted something that worked for your favorite type T and
> didn't require stellar performance; I also think you could mix in
> VecIsReference[T] and it would work, though I haven't actually written the
> code.
>
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.
>
>   --Rex


David Flemström
david [dot] flemstrom [at] gmail [dot] com




--
L.G. Meredith
Managing Partner
Biosimilarity LLC
1219 NW 83rd St
Seattle, WA 98117

+1 206.650.3740

http://biosimilarity.blogspot.com

Re: Scala vector and matrix math library

p, li { white-space: pre-wrap; }On Thursday 27 August 2009 17:54:04 Askia wrote:
> I have also looked a bit into this topic.
> I was wondering how practical it would be to have additional compiler phase
> plugins for optimization of some cases while maintaining a generic
> interface.
> In ScalaBox2D, the authoring was experimenting with that.
>
> How would this work from a design and architectural point of view?
Something that would be interesting to have, if we can gurantee that everyone who uses the library also activates the compiler plugin (I don't know how the plugin architecture works, but if this is possible), would be to have a plugin that:
- transforms all immutable case classes
→ that don't have any fields apart from the ones in the constructor
→ that also have been marked with a specific annotation,
into parameterlist elements and individual variables.


So, e.g.:


@inlined
case class A(val b: Int, val c: Int)
def someDef(something: A) = println(something.toString)
val d = new A(2, 3)
someDef(d)


...becomes:
@inlined
case class A(val b: Int, val c: Int) {
//... transformed methods that have param lists
//with the above variables instead of "this"
}
def someDef(inline_something_$b: Int, inline_something_$c: Int) =
println(A.toString(inline_something_$b, inline_something_$c))
val d_$b = 2
val d_$c = 3
someDef(d_$b, d_$c)


Imagine the speed improvements we would get! It would be *very* difficult to implement, though, and you wouldn't be able to change the inlined class's parameter list after it has been referenced in external unrecompilable code.

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Name Collisions With Private Fields and Public Getters/Setters

The following won't compile:

class MyClass(var x: int) {
def x: int = {
println("x getter")
this.x
}

def x_=(x: int) {
println("x setter")
this.x = x
}
}

But the getter and setter will yield to an error stating that a method
was defined twice and there are ambiguous references even. Okay, I
understand the problem.

On the contrary, it is quite common for a class to keeps its fields
private and guard the (public) setters and getters, given the fields are
vars and not vals. For instance, a getter may return a lazily-created
value, and a setter might dismiss unacceptable values breaking a class
invariance, etc.

The examples in the book I'm reading, ``Programming in Scala,'' don't
give a hint on best practices in such cases. The book seems to suggest
to choose non-human-readable var names for fields (e.g. `re' and `im'
for complex numbers, `n' and `d' for rationals and so so on) and choose
well-readable method names on the public interface.

The best I have come up with yet is to prepend a dollar sign to the
private fields, like so:

class User(private var $userName: String, private var $password: String) {
def this() {
this(null, null)
}

def password = $password

def password_=(password: String) {
$password = password
}

def userName = $userName

def userName_=(userName: String) {
$userName = userName
}
}

This looks like a lot of boilerplate to me. What do you think? Would you
think this is a best-practice naming scheme?

I for one would love to be able to overload the intrinsic auto-generated
getters and setters without the compiler picking on me, but that's all
possibly water under the bridge ...

Re: Name Collisions With Private Fields and Public Getters/Set

Philip, the problem here is that your getter/setter is precisely the private field. Scala has a shortcut for that: val and var. When you declare "var x", Scala creates a field x, a getter x and a setter x_=. You can't create fields directly: any field must have at least a getter, which may be private as needed. This is done to enforce the uniform access principle: ALL fields are accessed through getters and setters.   If the getter and setter don't do anything, use val/var.   One the other hand, suppose the password was stored encrypted, and with a salt. Then your class might go like this:   class User(val name: String, initialPassword: String) {   private lazy var encryptedPassword = encrypt(initialPassword, salt)   private lazy var salt = scala.util.Random.nextInt     private def encrypt(plainText: String, salt: Int): String = { ... }   private def decrypt(encryptedText: String, salt: Int): String = { ... }     def password = decrypt(encryptedPassword, salt)   def password_=(newPassword: String) = encrypt(newPassword, salt) }   There, see? If the private field is not exactly the same thing as the public field, its name will naturally be something else. Even if the value is the same, but you did some validation, you might call the private field validatedX and the getter X.
On Wed, Aug 26, 2009 at 6:56 PM, Philip Köster <philip [dot] koester [at] web [dot] de> wrote:
The following won't compile:

class MyClass(var x: int) {
       def x: int = {
               println("x getter")
               this.x
       }

       def x_=(x: int) {
               println("x setter")
               this.x = x
       }
}

But the getter and setter will yield to an error stating that a method was defined twice and there are ambiguous references even. Okay, I understand the problem.

On the contrary, it is quite common for a class to keeps its fields private and guard the (public) setters and getters, given the fields are vars and not vals. For instance, a getter may return a lazily-created value, and a setter might dismiss unacceptable values breaking a class invariance, etc.

The examples in the book I'm reading, ``Programming in Scala,'' don't give a hint on best practices in such cases.  The book seems to suggest to choose non-human-readable var names for fields (e.g. `re' and `im' for complex numbers, `n' and `d' for rationals and so so on) and choose well-readable method names on the public interface.

The  best I have come up with yet is to prepend a dollar sign to the private fields, like so:

class User(private var $userName: String, private var $password: String) {
       def this() {
               this(null, null)
       }
       
       def password = $password
       
       def password_=(password: String) {
               $password = password
       }

       def userName = $userName

       def userName_=(userName: String) {
               $userName = userName
       }
}

This looks like a lot of boilerplate to me. What do you think? Would you think this is a best-practice naming scheme?

I for one would love to be able to overload the intrinsic auto-generated getters and setters without the compiler picking on me, but that's all possibly water under the bridge ...



--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.

Re: Name Collisions With Private Fields and Public Getters/Sett

On Wed, Aug 26, 2009 at 11:56:37PM +0200, Philip Köster wrote:
> The best I have come up with yet is to prepend a dollar sign to the
> private fields, like so:
>
> class User(private var $userName: String, private var $password: String) {

This is a grave error. The dollar sign is not for user code. You will
break in spectacular (or worse, subtle) fashion by using it.

The convention I'm aware of is leading underscore, _userName.

Re: Name Collisions With Private Fields and Public Getters/Sett

> This is a grave error. The dollar sign is not for user code. You will
> break in spectacular (or worse, subtle) fashion by using it.

Thanks for your answer. Honestly, I've never in my career as a
programmer been a fan of prefixes such as `$', `_', `n', `l', `lp',
`the' or, to make matters worse as they do in Redmond, `lpsz', or
whatever. The underscore looks like something reserved to me, especially
the double underscore, but a dollar prefix on fields is perfectly doable
in Java, which is something that writers of Scala libraries should take
into account.

In the end, it's a matter of taste. I like neither of the two, but given
the choice, I'd go for the dollar sign.

But as I said, I'm looking for a solution that needs no prefixes at all,
making me able to choose natural identifiers.

Regards
---Phil

Re: Name Collisions With Private Fields and Public Getters/Sett

On Thu, Aug 27, 2009 at 12:37:01AM +0200, Philip Köster wrote:
> The underscore looks like something reserved to me, especially the
> double underscore, but a dollar prefix on fields is perfectly doable
> in Java, which is something that writers of Scala libraries should
> take into account.
>
> In the end, it's a matter of taste. I like neither of the two, but
> given the choice, I'd go for the dollar sign.

Maybe I'm misreading you but you don't seem to understand, I'm not
talking about a matter of personal taste. If you use the '$' your code
is going to break. It's in the scala language spec: "The '$' character
is reserved for compiler synthesized identifiers. User programs should
not define identifiers which contain '$' characters." I can assure you
knowing how the $ is used in the compiler that your $-using future is an
unpleasant one indeed.

It's in the java language spec as well: "The $ character should be used
only in mechanically generated source code or, rarely, to access
preexisting names on legacy systems."

Re: Name Collisions With Private Fields and Public Getters/Sett

Paul Phillips wrote:

> Maybe I'm misreading you but you don't seem to understand, I'm not
> talking about a matter of personal taste. If you use the '$' your code
> is going to break. It's in the scala language spec: "The '$' character
> is reserved for compiler synthesized identifiers. User programs should
> not define identifiers which contain '$' characters." I can assure you
> knowing how the $ is used in the compiler that your $-using future is an
> unpleasant one indeed.

Digging out an older discussion: When I posted my question I had been
learning Scala for 48 hours. Section 6.10 in ``PIS" (not a complimentary
abbreviation for an excellent book, really, I mean "Programming in
Scala") clearly says: ``The `$' character also counts as a letter,
however it is reserved for identifiers generated by the Scala compiler.
Identifiers in user programs should not contain `$' characters, even
though it will compile; if they do this might lead to name clashes with
identifiers generated by the Scala compiler."

As for the name clashes: Section 10.5 points out that in Scala, unlike
in Java, share the same namespace so that Scala implements the
uniform-access principle. I wasn't aware of `override val' when I posted
my question. The leading underscore for private fields is a common
pattern, so I think I will follow it when I need distinct IDs.

So, Paul, you were of course right with everything you said.

---Phil

PS: ``Watership Down" is one of my all-time favorites, too. I just
reread it recently---marvellous.

Re: Name Collisions With Private Fields and Public Getters/Set

So, how did learning Scala work for you? And, in particular, what are your thoughts on the private fields issue now?

On Mon, Oct 12, 2009 at 3:02 AM, Philip Köster <philip [dot] koester [at] web [dot] de> wrote:

Paul Phillips wrote:

Maybe I'm misreading you but you don't seem to understand, I'm not talking about a matter of personal taste.  If you use the '$' your code is going to break.  It's in the scala language spec: "The '$' character is reserved for compiler synthesized identifiers. User programs should not define identifiers which contain '$' characters." I can assure you knowing how the $ is used in the compiler that your $-using future is an unpleasant one indeed.

Digging out an older discussion: When I posted my question I had been learning Scala for 48 hours. Section 6.10 in ``PIS" (not a complimentary abbreviation for an excellent book, really, I mean "Programming in Scala") clearly says: ``The `$' character also counts as a letter, however it is reserved for identifiers generated by the Scala compiler. Identifiers in user programs should not contain `$' characters, even though it will compile; if they do this might lead to name clashes with identifiers generated by the Scala compiler."

As for the name clashes: Section 10.5 points out that in Scala, unlike in Java, share the same namespace so that Scala implements the uniform-access principle. I wasn't aware of `override val' when I posted my question. The leading underscore for private fields is a common pattern, so I think I will follow it when I need distinct IDs.

So, Paul, you were of course right with everything you said.

---Phil

PS: ``Watership Down" is one of my all-time favorites, too. I just reread it recently---marvellous.



--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.

Re: Name Collisions With Private Fields and Public Getters/Set

> So, how did learning Scala work for you?

I'm moving forth slowly, typing in most of the book's examples and
making a lot of experiments and unit tests. I think that the
Odersky/Spoon/Venners book is written in an excellent didactic
fashion---I'd never have expected CS gurus to write such good and lucid
English and explain the matter so clearly and easy to grasp. Five stars
for the best-composed tech book I have read in years. Only very few
questions remain open, and it's easy to get answers here if there's
something that the book doesn't covered or is outdated already.

> And, in particular, what are
> your thoughts on the private fields issue now?

I'm fine with the way things are. In C++ and Java, a member `f' is
always distinct from `f()', but not in Scala. I like being able to omit
the `f()' for ``pure" methods that have no (conceptual) side effects,
and I find it very clever to allow for fields overriding methods like this:

class A {
def name = "A"
}

class B extends A {
override val name = "B"
}

No, no, I'm very happy with Scala. It's pleasant both from an
aesthetical as well as from a functional point of view, and I'm sure it
is going to be my next favorite language.

Regards
---Ph.

Re: Name Collisions With Private Fields and Public Getters/Set

> I like being able to omit the `f()' for ``pure" methods that have no (conceptual) side effects

Correction: ``omit the `()'," i.e. the empty parameter list.

Re: Name Collisions With Private Fields and Public Getters/Sett

Correcting a typo:

> As for the name clashes: Section 10.5 points out that in Scala, unlike
> in Java, share the same namespace [...]

That should read ``fields and methods share the same namespace."

Re: Name Collisions With Private Fields and Public Getters/Sett

> Maybe I'm misreading you but you don't seem to understand, I'm not
> talking about a matter of personal taste. If you use the '$' your code
> is going to break. It's in the scala language spec: "The '$' character
> is reserved for compiler synthesized identifiers. User programs should
> not define identifiers which contain '$' characters." I can assure you
> knowing how the $ is used in the compiler that your $-using future is an
> unpleasant one indeed.

Okay, thanks for the update. This is clearly a Scala novelty that I
haven't been aware of. (I already noticed `_=' was translated to `_$eq'
or something resembling although I thought on a byte-code level
everything was allowed, even identifiers such as `42'.)

AFWIF, I think having private vars take on a different ID than public
accessors is not a perfect solution. A solution I can live with, maybe,
but a solution I don't find satisfactory.

But I'll abandon the `$', promised. :)

---Ph.

Copyright © 2013 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland