This page is no longer maintained — Please continue to the home page at www.scala-lang.org

2D/3D vector math package -- design choices, and where should it go?

77 replies
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
I do rather a lot of scientific programming (I run a neurobiology lab) and have found the lack of basic 2D/3D vector types in almost every language endlessly annoying.  So I've created a Scala package.

Right now, the package lets you do things like this with immutable vectors:
  val a = Vec2I(2,3)   // a.x==2 and a.y==3
  val b = 2.vec2       // b.x==2 and b.y==2
  val c = 2*(a+b)      // c.x==8 and c.y==12, as they should
  val d = 2.0*(a+b)    // Same thing, except the type is promoted to Vec2D
  val origin = mouse_event.getPoint   // A Java MouseEvent
  val movement = (new_event.getPoint - origin)/zoom_factor

  // Angle between two vectors; "^" is a unit-vector hat
  val e = Math.acos( ((3,5)^) ** Vec2I(2,3)^ )  

Basically, you can do anything you might find in an intro calculus text, with transparent conversions to and from existing Java types and appropriately typed tuples (and they're case classes, so matching on Vec2I(x,5) works).  Since efficient operations require using primitive types, I've got 2D and 3D types for Int, Long, Float, and Double (Vec2I, Vec2L, ... , Vec3D).

In addition, the package has mutable vectors (since all the object creation overhead takes ~20x longer than vector addition of ints).  Here the syntax is less pretty, but thanks to the JVM and scala compiler, it runs at least as fast as hand-rolled array access.  For example, one could:
  val movement = ((new_event.getPoint.m -= origin) /= zoom_factor)
where the ".m" means create a mutable copy, and the assignment operators put the computation into the LHS copy as expected from C++ convention.  Or one could
  val f = 1 +: 2 *: (0.5f,1.5f).m
where the operator: notation means to stuff the result in the RHS of the argument.  Most useful is stuff like:
  val vector_sum = (Vec2D_m(0,0) /: vector_list)(_ += _)

I find these capabilities very handy--it required writing tons of boilerplate code, but now as far as I can tell, everything just works.  So I have four questions:

(1) Is this of sufficiently generic interest that I should figure out a way to make this available?

(2) If so, what would be the appropriate venue?  At one extreme, I could throw it up on Sourceforge or somesuch and hope whoever wants it will find it.  At the other extreme, since it is a pretty basic language feature, a refined version could be incorporated into an existing package or possibly even the core Scala library.  (I do not think that incorporation into something like scalala is the best idea; the use cases are almost entirely nonoverlapping, since scalala is for full-strength large-vector/matrix linear algebra, and my package is basically to make life less annoying when dealing with screen coordinates and other 2D data.)

(3) Also, if so, I am uncertain how verbose the library should be, and what some of the syntax should be.  Right now I'm on the very short end of verbosity:
  2.vec2 extends 2 to a 2D vector, i.e. Vec2I(2,2)
  val g = (2,3,4).m converts an (Int,Int,Int) tuple to a mutable int vector Vec2I_m(2,3,4)
  val h = g.im.L converts to an immutable vector of Longs
  a ** b is a dot product
  (a X b)^ is the unit vector of a cross product
and so on.  Input from anyone who has opinions on this is welcome.  One thing I am especially uncomfortable with is using
  += for addition stored in the left-hand argument
  +: for addition stored in the right-hand argument
for mutable vectors.  On the one hand, the natural right-associativity of +: makes the code easier to write.  On the other hand, the different precedence of operators ending in = and : seems like it's asking for trouble.

(4) Has someone already done this, and better, so that I should just use their code instead?

Anyway, thanks for any comments people may have.

  --Rex

Warren Henning
Joined: 2008-12-31,
User offline. Last seen 42 years 45 weeks ago.
Re: 2D/3D vector math package -- design choices, and where sho

On Thu, Aug 20, 2009 at 5:15 PM, Rex Kerr wrote:
> (1) Is this of sufficiently generic interest that I should figure out a way
> to make this available?

If you get feedback from people showing interest (especially in
helping you work on it), it's probably worth the effort to release it
and provide basic mailing list support. However, your basic target
audience, people doing scientific programming, often use MATLAB,
Fortran, C, or something like that and generally have never heard of
functional programming languages.

What all Scala users would be interested in is experience reports on
using Scala for a domain many of us have participated in at one time
but have no current vested interest in.

As for your questions about design choices, I would think the basic
guiding principle would be "what helps me do science the best?" Do
what makes sense to a scientist, or what would be the easiest to
explain to a scientific colleague who perhaps has never used Scala.

I hope Scala serves you well in your scientific endeavors and please
let us know how it goes!

Warren

Jonathan Merritt
Joined: 2009-01-18,
User offline. Last seen 1 year 23 weeks ago.
Re: 2D/3D vector math package -- design choices, and where shou

Hi Rex,

On 21/08/2009, at 10:15 AM, Rex Kerr wrote:
> I do rather a lot of scientific programming (I run a neurobiology
> lab) and have found the lack of basic 2D/3D vector types in almost
> every language endlessly annoying. So I've created a Scala package.
>
> Right now, the package lets you do things like this with immutable
> vectors:

I'm in a similar position; I'm doing biomechanics. I have a major
interest in new programming languages, but also in the coding
practices (documentation, unit testing, agile development, etc.), that
have evolved to serve the business world so well.

I generally find that "simple" vector operations are not a problem for
me, because the algorithms behind them are (of course! :-) simple.
The bigger problems arise when I suddenly realize that I need a matrix
decomposition of some kind, or an optimization algorithm. IMHO, those
are the areas that are the "killer domain" of the more traditional
approaches (Matlab, Fortran, SciPy, Scilab, Octave, etc. - or should I
say: BLAS and Lapack!). They are sufficiently complicated operations
that a "roll your own" approach would often be a buggy waste of time.

What would be *really* nice is a standard numerical library for the
JVM. But all we have at the moment is fragmentation. As a new
developer I wondered: "should I use Colt, Commons-Math, JAMA, or
what?" :-) Often they have overlapping functionality, and the lack of
a well-publicized "leader" makes me (as a sole scientific developer)
hesitant to invest heavily in any of them.

In Scala, the leading numerical contender at the moment seems to be
scalala:
http://code.google.com/p/scalala/
Maybe that project would be a good place to investigate integrating
your code?

Jonathan Merritt.

Ben Hutchison
Joined: 2009-01-21,
User offline. Last seen 42 years 45 weeks ago.
Re: 2D/3D vector math package -- design choices, and where sho

I have been thinking about this issue for the past few days, and up it
pops on the mailing list! I agree with sentiments expressed that its
unfortunate not to have a standard defn for Vector, Matrix etc on JVM.

Some standardization would be great. Thats's collectively a matter for
users (including us) to agree to converge on one or two, and/or create
Implicit Conversions that allow different libraries to interop.

Here's a list of the mentioned implementations, including both Scala
and Java ones, plus info on developers and release history, ordered in
approximate descending order of my preference.

I include Java libs because IMO they could be Scalafied with some
implicit conversions, albeit at some performance cost (how much? Is
this a real problem?)

* Commons-Math [http://commons.apache.org/math/] Last release August
2009. Circa 6 committers. Java. Broadest maths coverage, includes
matrices & vectors but also much else.

* Scalala [http://code.google.com/p/scalala/] Less than 1 year old. 1
main committer Daniel Ramage. Scala. Focus on Tensors (including
Vector and Matrices).

* JAMA [http://math.nist.gov/javanumerics/jama/] Last release 2005.
Multiple contributors. Java. Matrix focus.

* Colt.[http://acs.lbl.gov/~hoschek/colt/] 1 primary dev: Wolfgang
Hoschek. Last release Sep 2004. Java Pre-1.5.

Also, AFAIK Erkki Lindpere's ScalaBox2 includes it own maths classes,
but its not general purpose, rather a symptom of the lack of a
standard library.

-Ben

On Fri, Aug 21, 2009 at 2:27 PM, Jonathan Merritt wrote:

> What would be *really* nice is a standard numerical library for the JVM.
>  But all we have at the moment is fragmentation. As a new developer I
> wondered: "should I use Colt, Commons-Math, JAMA, or what?" :-)  Often they
> have overlapping functionality, and the lack of a well-publicized "leader"
> makes me (as a sole scientific developer) hesitant to invest heavily in any
> of them.
>
> In Scala, the leading numerical contender at the moment seems to be scalala:
>  http://code.google.com/p/scalala/
> Maybe that project would be a good place to investigate integrating your
> code?
>
> Jonathan Merritt.
>
>

Swade Leather
Joined: 2009-01-31,
User offline. Last seen 42 years 45 weeks ago.
Re: 2D/3D vector math package -- design choices, and where sho

Thanks for the interesting discussion. I'm a scientist who writes
numerical code that runs on very large data sets regularly,
performance is a key consideration for me. Java/JVM-based languages
are seriously lacking in well-optimized linear algebra libraries and
"rolling your own" isn't really feasible for the exact reasons that
Jonathan described.

I've been using MTJ with good success. Once compiled with ATLAS
support via NNI I get performance that is comparable to my existing
C/C++ and Fortran code. I understand that Common-Math may soon be
using MTJ for its linalg stuff (because of speed concerns). Neither
JAMA or Colt is really comparable in terms of speed from my
experience.

For performance details see
http://www.ri.reslab.no/uploads/White%20papers/boh_hpj.pdf

On Fri, Aug 21, 2009 at 2:34 AM, Ben Hutchison wrote:
> I have been thinking about this issue for the past few days, and up it
> pops on the mailing list! I agree with sentiments expressed that its
> unfortunate not to have a standard defn for Vector, Matrix etc on JVM.
>
> Some standardization would be great. Thats's collectively a matter for
> users (including us) to agree to converge on one or two, and/or create
> Implicit Conversions that allow different libraries to interop.
>
> Here's a list of the mentioned implementations, including both Scala
> and Java ones, plus info on developers and release history, ordered in
> approximate descending order of my preference.
>
> I include Java libs because IMO they could be Scalafied with some
> implicit conversions, albeit at some performance cost (how much? Is
> this a real problem?)
>
> * Commons-Math [http://commons.apache.org/math/] Last release August
> 2009. Circa 6 committers. Java. Broadest maths coverage, includes
> matrices & vectors but also much else.
>
> * Scalala [http://code.google.com/p/scalala/] Less than 1 year old. 1
> main committer Daniel Ramage. Scala. Focus on Tensors (including
> Vector and Matrices).
>
> * JAMA [http://math.nist.gov/javanumerics/jama/] Last release 2005.
> Multiple contributors. Java. Matrix focus.
>
> * Colt.[http://acs.lbl.gov/~hoschek/colt/]  1 primary dev: Wolfgang
> Hoschek. Last release Sep 2004. Java Pre-1.5.
>
>
> Also, AFAIK Erkki Lindpere's ScalaBox2 includes it own maths classes,
> but its not general purpose, rather a symptom of the lack of a
> standard library.
>
> -Ben
>
> On Fri, Aug 21, 2009 at 2:27 PM, Jonathan Merritt wrote:
>
>> What would be *really* nice is a standard numerical library for the JVM.
>>  But all we have at the moment is fragmentation. As a new developer I
>> wondered: "should I use Colt, Commons-Math, JAMA, or what?" :-)  Often they
>> have overlapping functionality, and the lack of a well-publicized "leader"
>> makes me (as a sole scientific developer) hesitant to invest heavily in any
>> of them.
>>
>> In Scala, the leading numerical contender at the moment seems to be scalala:
>>  http://code.google.com/p/scalala/
>> Maybe that project would be a good place to investigate integrating your
>> code?
>>
>> Jonathan Merritt.
>>
>>
>

David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Scala vector and matrix math library
p, li { white-space: pre-wrap; }Hello all,
I just read the recent thread ("2D/3D vector math package [...]") and also think it would be a great idea to add a library like this to the standard Scala distribution (or in scalax, or somewhere else where everybody can find it), and I would happy to help doing this myself.


However, I've got very different ideas and goals compared to those mentioned in that thread, so I don't feel like continuing with it. I don't want *just* a math-oriented, easy-to-use but maybe slow vector math solution, but a solution that can adapt to be fast and/or easy to use depending on the situation and the way it is used, so that it is usable for heavy computations that require good performance, but also for ordinary users that just want a vector library that works and don't care as much about performance. If something is to be used by a fair amount of developers, it has to be a good solution for all those developers.


To further adhere to this ideal, I will, if I ever create a library for this, publish it on GitHub and generously accept patches to it if the patches contain sensible features that also adhere to the ideal.


Some features I've thought about including:


- Completely immutable data structures. The kind that trigger HotSpot caching and inlining. I needn't explain more. This can give incredible performance improvements in the long run because of JIT optimizations, while being really useful for concurrent programming etc because of the immutability. It would of course also be possible to create mutable/immutable implementations à la the scala.collection library, but there should at least be one part of the vector library that's *completely* immutable IMO.


- All classes are generic. If you only make a vector library that only uses Floats or Doubles etc, someone is going to want to have an Int-based vector library, and thus a new one has to be written. With the new Numeric class in Scala 2.8, it's possible to make generic classes that take number types and that don't use boxing/unboxing with help from the @specialized annotation. I have attached an example implementation of a small system that demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note that the code is optimized for speed and not necessarily for legibility.


- Support for dependant vectors. This is only possible if an approach similar to the scala.collection mutable/immutable approach is taken. Dependant vectors are vectors that don't store values by themselves (e.g. there are no variables stored inside of the vector), but instead a dependant vector has a function stored inside of it that tells it how to get to it's values. This is really useful if one vector's value depends on another vector's state. I'll show what I mean if I ever get started on a library like this.


- Usage of existing traits/classes. A Vector will of course extend a Product, and maybe an Ordered, a Seq or a Tuple (if those classes also get the @specialized annotation when 2.8 is released).


- Matrices and other general math tools, of course. The primary focus will be on vectors, though.


Comments? Suggestions in general?

David Flemström
david [dot] flemstrom [at] gmail [dot] com
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: 2D/3D vector math package -- design choices, and where sho

Also, AFAIK Erkki Lindpere's ScalaBox2 includes it own maths classes,
but its not general purpose, rather a symptom of the lack of a
standard library.

-Ben


Thanks!  I wasn't aware of this one--and it's exactly the type of itch that I was motivated to scratch with my library: how do you do graphics or 2D physics effectively without a standard 2D vector library?

  --Rex

ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
Hi David,

I share your goals, but unfortunately they can't all be simultaneously met.  What I came up with was my compromise for 2.7.  2.8 does raise a few extra interesting possibilities.

On Fri, Aug 21, 2009 at 11:05 AM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
However, I've got very different ideas and goals compared to those mentioned in that thread, so I don't feel like continuing with it. I don't want *just* a math-oriented, easy-to-use but maybe slow vector math solution, but a solution that can adapt to be fast and/or easy to use depending on the situation and the way it is used, so that it is usable for heavy computations that require good performance, but also for ordinary users that just want a vector library that works and don't care as much about performance. If something is to be used by a fair amount of developers, it has to be a good solution for all those developers.

That's why I have both mutable and immutable versions.  The immutable ones are as fast as I can make them given that they have unavoidable object creation even if the compiler can optimize away a lot of them.  The mutable ones are as fast as using arrays of primitive data types; you can't really get faster than that.
 
Some features I've thought about including:


- Completely immutable data structures. The kind that trigger HotSpot caching and inlining. I needn't explain more.

I've already benchmarked this.  With short vectors--2D and 3D--the overhead of just *one* object creation is so much higher than the cost of a basic arithmetic operation like + or * that it's almost impossible to get speed savings just because the compiler can optimize away almost all the object creations.  For example, a test where one is doing one each of +, *, >>, and &, to create two values and then summing those in a vector (so the vector operation is two more +es), I see the following timings:
  time with primitive vars for each coordinate = 100
  time with immutable vectors, vector-vector addition: 1700
  time with immutable vectors, vector-(x,y) addition: 900
  time with mutable vectors, vector-vector addition: 2100
  time with mutable vectors, vector-(x,y) addition: 1050
  time with mutable vectors, vector-vector +=: 400
  time with mutable vectors, vector-(var,var) +=: 70
Here, on the right, (x,y) means I'm using a method that takes two values; vector means I'm using a constructor to create a new vector; addition means that a new vector is (at least formally) constructed to hold the result, while += means that the result replaces an existing vector.

You can see that even though the immutability helps over mutable vectors when you need some level of new object creation, the compiler and JVM are not yet smart enough to optimize away to the level you can reach with mutable vectors
 
- All classes are generic. If you only make a vector library that only uses Floats or Doubles etc, someone is going to want to have an Int-based vector library, and thus a new one has to be written. With the new Numeric class in Scala 2.8, it's possible to make generic classes that take number types and that don't use boxing/unboxing with help from the @specialized annotation.

I really hope this is possible.  For now, I'm deeply skeptical that it will work as well as hoped because there are lots of vector operations that are nonidentical between different base types.  For example, the length of a vector requires a square root operation that's handled differently in all types, and the default float/double-to-int translation (truncation) is not sensible (compared to rounding).  You also don't even have unit vectors with integer vectors.  To fix this, you could make a generic base class that was overridden sanely for integer and floating-point types (leaving open the option to do it with complex types or whatever).  Unfortunately, this then would prevent optimizations that are easy to do when classes are marked final, and I currently see up to ~50% improvements when I use a mutable final class as compared with a mutable non-final class.

For an intermediate-performance case, though, I agree that this would be great (when 2.8 is out).
 
- Support for dependant vectors. This is only possible if an approach similar to the scala.collection mutable/immutable approach is taken. Dependant vectors are vectors that don't store values by themselves (e.g. there are no variables stored inside of the vector), but instead a dependant vector has a function stored inside of it that tells it how to get to it's values. This is really useful if one vector's value depends on another vector's state. I'll show what I mean if I ever get started on a library like this.

This is an interesting idea, but you'd need to isolate the indirected vectors with different types or the compiler/JVM would have a lot of trouble figuring out when it did and when it did not need to make method calls as opposed to inlining.
 
- Usage of existing traits/classes. A Vector will of course extend a Product, and maybe an Ordered, a Seq or a Tuple (if those classes also get the @specialized annotation when 2.8 is released).

Implicit conversions can take care of anything that doesn't work with extension.
   --Rex

David Hall 4
Joined: 2009-08-21,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

On Fri, Aug 21, 2009 at 9:55 AM, Rex Kerr wrote:
> On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> wrote:
>>
>> However, I've got very different ideas and goals compared to those
>> mentioned in that thread, so I don't feel like continuing with it. I don't
>> want *just* a math-oriented, easy-to-use but maybe slow vector math
>> solution, but a solution that can adapt to be fast and/or easy to use
>> depending on the situation and the way it is used, so that it is usable for
>> heavy computations that require good performance, but also for ordinary
>> users that just want a vector library that works and don't care as much
>> about performance. If something is to be used by a fair amount of
>> developers, it has to be a good solution for all those developers.
>
> That's why I have both mutable and immutable versions.  The immutable ones
> are as fast as I can make them given that they have unavoidable object
> creation even if the compiler can optimize away a lot of them.  The mutable
> ones are as fast as using arrays of primitive data types; you can't really
> get faster than that.
>
>>
>> Some features I've thought about including:
>>
>> - Completely immutable data structures. The kind that trigger HotSpot
>> caching and inlining. I needn't explain more.
>
> I've already benchmarked this.  With short vectors--2D and 3D--the overhead
> of just *one* object creation is so much higher than the cost of a basic
> arithmetic operation like + or * that it's almost impossible to get speed
> savings just because the compiler can optimize away almost all the object
> creations.  For example, a test where one is doing one each of +, *, >>, and
> &, to create two values and then summing those in a vector (so the vector
> operation is two more +es), I see the following timings:
>   time with primitive vars for each coordinate = 100
>   time with immutable vectors, vector-vector addition: 1700
>   time with immutable vectors, vector-(x,y) addition: 900
>   time with mutable vectors, vector-vector addition: 2100
>   time with mutable vectors, vector-(x,y) addition: 1050
>   time with mutable vectors, vector-vector +=: 400
>   time with mutable vectors, vector-(var,var) +=: 70
> Here, on the right, (x,y) means I'm using a method that takes two values;
> vector means I'm using a constructor to create a new vector; addition means
> that a new vector is (at least formally) constructed to hold the result,
> while += means that the result replaces an existing vector.
>
> You can see that even though the immutability helps over mutable vectors
> when you need some level of new object creation, the compiler and JVM are
> not yet smart enough to optimize away to the level you can reach with
> mutable vectors

Have you run the tests with and without XX:+DoEscapeAnalysis ?

ijuma
Joined: 2008-08-20,
User offline. Last seen 22 weeks 2 days ago.
Re: Scala vector and matrix math library

Hi,

On Fri, 2009-08-21 at 12:55 -0400, Rex Kerr wrote:
> Unfortunately, this then would prevent optimizations that are easy to
> do when classes are marked final, and I currently see up to ~50%
> improvements when I use a mutable final class as compared with a
> mutable non-final class.

How did you benchmark this? Did you use the -server JIT with an
appropriate warm-up period? If there is a single implementation of a
non-final class in a hot call site, the performance should be identical
to a final class (basically HotSpot assumes that there are no subclasses
and deoptimises the code if a subclass shows up later).

Ismael

ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
I always used -server, and at David Hall's suggestion I've tried both -XX:+DoEscapeAnalysis and not, which makes very little difference.

What does make the difference is whether I mark the class final or not--and this only happens when I use -optimise.  Incidentally, there is also an intermittent (!) compiler bug in 2.7.4 that sometimes optimizes away the entire loop.  I should try 2.7.5 and the latest build of 2.8 in the trunk to see if it's still there.

Anyway, it's not JVM optimization that matters with final, it's scalac optimization.

And, incidentally, I found a case where the improvement is 3x, not 50%.

  --Rex

On Fri, Aug 21, 2009 at 1:33 PM, Ismael Juma <mlists [at] juma [dot] me [dot] uk> wrote:
Hi,

On Fri, 2009-08-21 at 12:55 -0400, Rex Kerr wrote:
> Unfortunately, this then would prevent optimizations that are easy to
> do when classes are marked final, and I currently see up to ~50%
> improvements when I use a mutable final class as compared with a
> mutable non-final class.

How did you benchmark this? Did you use the -server JIT with an
appropriate warm-up period? If there is a single implementation of a
non-final class in a hot call site, the performance should be identical
to a final class (basically HotSpot assumes that there are no subclasses
and deoptimises the code if a subclass shows up later).

Ismael



ijuma
Joined: 2008-08-20,
User offline. Last seen 22 weeks 2 days ago.
Re: Scala vector and matrix math library

Hi Rex,

On Fri, 2009-08-21 at 14:10 -0400, Rex Kerr wrote:
> I always used -server, and at David Hall's suggestion I've tried both
> -XX:+DoEscapeAnalysis and not, which makes very little difference.
>
> What does make the difference is whether I mark the class final or
> not--and this only happens when I use -optimise.

Ok, that explains it. So, this is not a JIT optimisation, but one
performed by the Scala compiler. It makes sense that it would only
happen if the class is final.

Ismael

Ben Hutchison
Joined: 2009-01-21,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

On Sat, Aug 22, 2009 at 1:05 AM, David
Flemström wrote:
> Hello all,
> I just read the recent thread ("2D/3D vector math package [...]") and also
> think it would be a great idea to add a library like this to the standard
> Scala distribution (or in scalax, or somewhere else where everybody can find
> it), and I would happy to help doing this myself.
>
> However, I've got very different ideas and goals compared to those mentioned
> in that thread, so I don't feel like continuing with it.

We agree that standardisation / defragmentation of math libraries is
desirable, however...

I'd be interested to hear specifically what in Scalala you think is
unsuitable as a starting point for a standard Scala math library.
Specifically.

And any detailed survey of Commons Math will reveal a massive
investment of time/effort in building that library that would be
inefficient to repeat. (Although it has some rough patches too - eg
Vector3D [http://commons.apache.org/math/apidocs/org/apache/commons/math/geometry/Vector3D.html].)

Rex has fairly easily posed some challenges for your proposed Vector[Int].

Personally, I think effort invested in understanding, reusing,
consolidating and adapting the existing Java and Scala assets would be
more fruitful than proposing new ones.

-Ben

David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
p, li { white-space: pre-wrap; }On Sunday 23 August 2009 05:56:51 you wrote:
> Personally, I think effort invested in understanding, reusing,
> consolidating and adapting the existing Java and Scala assets would be
> more fruitful than proposing new ones.
>
> -Ben


Of course, I have to agree with this.
However, it seems that some libraries seem to be too specialized already to allow for them being changed.


The scalala library, for instance, does not enforce dimensions with the type system meaning that it's valid to pass any kind of vector to a method taking a vector type, and you have to add checks like:
require(vector.size == 3, "Wrong vector dimension")


...or use partial functions everywhere to manually enforce this. I would argue that it would be better to follow the Scala convention of having Vector2, Vector3, Vector4 etc for this purpose (but it could of course be useful to have an abstract class Vector, too, for generic methods that take any kind of vector). This type system is of course not ideal for scalala, because making classes up to Matrix512_512 etc will require a lot of classes, so you can't really adapt Scalala to have this feature, while the same would be possible for a less math-oriented and more computation-oriented library, the latter being something that you from time to time have a need for.


What also might become an issue for a generally accepted vector library is the size of data types. Scalala is already very memory-optimized; a 3D DenseVector does "only" take up 56 bytes of memory (8 byte class header, 4 byte array reference, 4 byte padding, 8 byte array header (on most JVMs), 8*3 bytes for the values, 4 bytes for the array length and 4 bytes padding) which is pretty good already.
However, comparing that to a generic vector that is dimension specialized (e.g. Vector2, Vector3 etc), that uses Numeric for its operations and doesn't utilize an Array for storing the data: Such a Vector3[Double] would take up only 32 bytes of memory (8 class header, 8*3 for values, 4 for Numeric reference and 4 padding), and a Vector3[Float] would only take up 24 bytes (8 header, 4*3 values, 4 for Numeric reference). To give Scalala the same memory usage, you would have to restructure the whole type system (but it would be possible, indeed, and I could actually do some research in order to maybe implement something if you're interested).


Another issue is that the current Scalala doesn't easily allow for immutable data structures that are completely immutable (so that HotSpot can apply optimizations). In order to accomplish this (while still leaving the option open for mutable versions of the same structure à la the scala.collection library), you have to put the actual data variables at the bottom of the type tree, so that you have def's all the way down, have a trait with the correct "update" methods named Mutability or similar, and then store the actual data variables at the leafs of the type structure.


E.g. (and I write this off-hand so don't blame me if it doesn't compile):
trait Vector extends Product { //skipped Iterable etc to make things easier
def dimension = productArity
}


trait Vector2 extends Vector {
def x: Double, def y: Double
def productArity = 2
def productElement(n: Int): Double = if(n == 0) x else if (n == 1) y
else error("Out of range")
def productPrefix = "Vector2"
}
class ImmutVec2(protected var xt: Double, protected var yt: Double)
extends Vector2 {
def x = xt; def y = yt
}


trait Mutable2 extends Vector2 {
def x_=(v: Double) = (xt = v); def y_=(v: Double) = (yt = v)
}
class MutVec2(xt: Double, yt: Double) extends ImmutVec2(xt, yt)
with Mutable2


This is not how things are done in Scalala, however, and would also require some major restructuring and breakage in order to work.


If you can show me alternative ways of solving the issues in Scalala I've mentioned here, then I'll reconsider writing a new library and instead look if Scalala can't be adapted to suit more purposes, or if I maybe should accept the performance penalties Scalala gives me and perform optimizations elsewhere in my applications (the vector math will be the bottleneck however).


Regards,
David Flemström
david [dot] flemstrom [at] gmail [dot] com
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
On Fri, Aug 21, 2009 at 11:05 AM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
- All classes are generic. If you only make a vector library that only uses Floats or Doubles etc, someone is going to want to have an Int-based vector library, and thus a new one has to be written. With the new Numeric class in Scala 2.8, it's possible to make generic classes that take number types and that don't use boxing/unboxing with help from the @specialized annotation. I have attached an example implementation of a small system that demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note that the code is optimized for speed and not necessarily for legibility.

I've now benchmarked this, and the results are not encouraging.  I have benchmarks for two things: accumulating vectors from arrays of previously-computed vector values, and accumulating vectors with values computed on the fly.  Previously-computed vectors are added with methods that take a vector; if computed on the fly, they're passed as two arguments to a + function to avoid the temptation to create an object.

These give the following timings for 2.8.0.r18509-b20090819020207, which to me appears from these benchmarks to not have @specialized working yet (even with -Xexperimental and -Xfuture):

(baseline "do-it-yourself" method)
Primitive math, on-the-fly: 100  <-- defined to be 100
Primitive math, stored: 290

(true regardless of @specialized, final, -optimise)
Immutable Vec2[Int], on-the-fly: 3800
Immutable Vec2[Int], stored: 9000

(true regardless of final, -optimise)
Immutable Vec2I, on-the-fly: 1700
Immutable Vec2I, stored: 2600

(true regardless of @specialized, final, -optimise)
Mutable Vec2[Int], on-the-fly: 2700
Mutable Vec2[Int], stored: 3000

(true if not both final and -optimise)
Mutable Vec2I, on-the-fly: 220
Mutable Vec2I, stored: 500

(both final and -optimise)
Mutable Vec2I, on-the-fly: 70
Mutable Vec2I, stored: 150

Bottom line: for immutables, generics are 2x slower when adding two primitives, 3x when combining two generics, and the best immutables are 10-20x slower than mutables.  For mutables, generics are 6-12x slower than unoptimized&final dedicated mutables, and optimized final dedicated mutables are 3x faster again (and, surprisingly, actually 1.5-2x faster than roll-your-own primitives).  Oh, and by the way, according to Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I takes 16 (primitives take 8).

I think this is pretty clear evidence that if you actually want a high-performance library for 2 and 3 component vectors using primitive types, that the library I've already written to do this (or another using a similar strategy) is at this point the way to go, and that mutable vectors is the way to go.  I hope that eventually the @specialized version will be closer to final -optimise, but it looks to me like the compiler team have their work cut out for them to make this happen.

A couple of details about the code:

Vec2[Int] is from a generic that looks like so (you can imagine the var version):
  final class Vec2[@specialized T](val x:T , val y:T)(implicit arith : Numeric[T]) {
    def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
    // ...
  }

Vec2I is hand-rolled using primitive types and looks like so (again, imagine vars):
  final class Vec2I(val x:Int , val y:Int) {
    def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
    // ...
  }

  --Rex

ijuma
Joined: 2008-08-20,
User offline. Last seen 22 weeks 2 days ago.
Re: Scala vector and matrix math library

Hey Rex,

Did you try @specialized with non-final classes? The way @specialized
works is by creating subclasses for each primitive type requested, so it
may be that final is preventing it from doing that.

Ismael

Iulian Dragos
Joined: 2008-12-18,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

Specialization has to be enabled by -Yspecialize. I couldn't find the
code you used for benchmarking, so I couldn't post back timings. Your
results are what I'd expect, but the specialized transformation should
get you close to the hand-written specialized versions.

iulian

On Mon, Aug 24, 2009 at 9:54 AM, Rex Kerr wrote:
> On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> wrote:
>>
>> - All classes are generic. If you only make a vector library that only
>> uses Floats or Doubles etc, someone is going to want to have an Int-based
>> vector library, and thus a new one has to be written. With the new Numeric
>> class in Scala 2.8, it's possible to make generic classes that take number
>> types and that don't use boxing/unboxing with help from the @specialized
>> annotation. I have attached an example implementation of a small system that
>> demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note
>> that the code is optimized for speed and not necessarily for legibility.
>
> I've now benchmarked this, and the results are not encouraging.  I have
> benchmarks for two things: accumulating vectors from arrays of
> previously-computed vector values, and accumulating vectors with values
> computed on the fly.  Previously-computed vectors are added with methods
> that take a vector; if computed on the fly, they're passed as two arguments
> to a + function to avoid the temptation to create an object.
>
> These give the following timings for 2.8.0.r18509-b20090819020207, which to
> me appears from these benchmarks to not have @specialized working yet (even
> with -Xexperimental and -Xfuture):
>
> (baseline "do-it-yourself" method)
> Primitive math, on-the-fly: 100  <-- defined to be 100
> Primitive math, stored: 290
>
> (true regardless of @specialized, final, -optimise)
> Immutable Vec2[Int], on-the-fly: 3800
> Immutable Vec2[Int], stored: 9000
>
> (true regardless of final, -optimise)
> Immutable Vec2I, on-the-fly: 1700
> Immutable Vec2I, stored: 2600
>
> (true regardless of @specialized, final, -optimise)
> Mutable Vec2[Int], on-the-fly: 2700
> Mutable Vec2[Int], stored: 3000
>
> (true if not both final and -optimise)
> Mutable Vec2I, on-the-fly: 220
> Mutable Vec2I, stored: 500
>
> (both final and -optimise)
> Mutable Vec2I, on-the-fly: 70
> Mutable Vec2I, stored: 150
>
> Bottom line: for immutables, generics are 2x slower when adding two
> primitives, 3x when combining two generics, and the best immutables are
> 10-20x slower than mutables.  For mutables, generics are 6-12x slower than
> unoptimized&final dedicated mutables, and optimized final dedicated mutables
> are 3x faster again (and, surprisingly, actually 1.5-2x faster than
> roll-your-own primitives).  Oh, and by the way, according to
> Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I takes 16
> (primitives take 8).
>
> I think this is pretty clear evidence that if you actually want a
> high-performance library for 2 and 3 component vectors using primitive
> types, that the library I've already written to do this (or another using a
> similar strategy) is at this point the way to go, and that mutable vectors
> is the way to go.  I hope that eventually the @specialized version will be
> closer to final -optimise, but it looks to me like the compiler team have
> their work cut out for them to make this happen.
>
> A couple of details about the code:
>
> Vec2[Int] is from a generic that looks like so (you can imagine the var
> version):
>   final class Vec2[@specialized T](val x:T , val y:T)(implicit arith :
> Numeric[T]) {
>     def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
>     // ...
>   }
>
> Vec2I is hand-rolled using primitive types and looks like so (again, imagine
> vars):
>   final class Vec2I(val x:Int , val y:Int) {
>     def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
>     // ...
>   }
>
>   --Rex
>
>

ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
Ah.  Unfortunately, in this particular case specialization makes things even worse.  With mutable generic int vectors, it goes from 2700&3000 to 4000&4000.  With immutable generics it goes from 3800&9000 to 8000&14000.

Also, memory usage goes up even more, from 54 to 62 bytes.

  --Rex

On Mon, Aug 24, 2009 at 8:19 AM, Iulian Dragos <jaguarul [at] gmail [dot] com> wrote:
Specialization has to be enabled by -Yspecialize. I couldn't find the
code you used for benchmarking, so I couldn't post back timings. Your
results are what I'd expect, but the specialized transformation should
get you close to the hand-written specialized versions.

iulian

On Mon, Aug 24, 2009 at 9:54 AM, Rex Kerr<ichoran [at] gmail [dot] com> wrote:
> On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> <david [dot] flemstrom [at] gmail [dot] com> wrote:
>>
>> - All classes are generic. If you only make a vector library that only
>> uses Floats or Doubles etc, someone is going to want to have an Int-based
>> vector library, and thus a new one has to be written. With the new Numeric
>> class in Scala 2.8, it's possible to make generic classes that take number
>> types and that don't use boxing/unboxing with help from the @specialized
>> annotation. I have attached an example implementation of a small system that
>> demonstrates what I mean (ref1.scala); it only compiles with Scala 2.8. Note
>> that the code is optimized for speed and not necessarily for legibility.
>
> I've now benchmarked this, and the results are not encouraging.  I have
> benchmarks for two things: accumulating vectors from arrays of
> previously-computed vector values, and accumulating vectors with values
> computed on the fly.  Previously-computed vectors are added with methods
> that take a vector; if computed on the fly, they're passed as two arguments
> to a + function to avoid the temptation to create an object.
>
> These give the following timings for 2.8.0.r18509-b20090819020207, which to
> me appears from these benchmarks to not have @specialized working yet (even
> with -Xexperimental and -Xfuture):
>
> (baseline "do-it-yourself" method)
> Primitive math, on-the-fly: 100  <-- defined to be 100
> Primitive math, stored: 290
>
> (true regardless of @specialized, final, -optimise)
> Immutable Vec2[Int], on-the-fly: 3800
> Immutable Vec2[Int], stored: 9000
>
> (true regardless of final, -optimise)
> Immutable Vec2I, on-the-fly: 1700
> Immutable Vec2I, stored: 2600
>
> (true regardless of @specialized, final, -optimise)
> Mutable Vec2[Int], on-the-fly: 2700
> Mutable Vec2[Int], stored: 3000
>
> (true if not both final and -optimise)
> Mutable Vec2I, on-the-fly: 220
> Mutable Vec2I, stored: 500
>
> (both final and -optimise)
> Mutable Vec2I, on-the-fly: 70
> Mutable Vec2I, stored: 150
>
> Bottom line: for immutables, generics are 2x slower when adding two
> primitives, 3x when combining two generics, and the best immutables are
> 10-20x slower than mutables.  For mutables, generics are 6-12x slower than
> unoptimized&final dedicated mutables, and optimized final dedicated mutables
> are 3x faster again (and, surprisingly, actually 1.5-2x faster than
> roll-your-own primitives).  Oh, and by the way, according to
> Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I takes 16
> (primitives take 8).
>
> I think this is pretty clear evidence that if you actually want a
> high-performance library for 2 and 3 component vectors using primitive
> types, that the library I've already written to do this (or another using a
> similar strategy) is at this point the way to go, and that mutable vectors
> is the way to go.  I hope that eventually the @specialized version will be
> closer to final -optimise, but it looks to me like the compiler team have
> their work cut out for them to make this happen.
>
> A couple of details about the code:
>
> Vec2[Int] is from a generic that looks like so (you can imagine the var
> version):
>   final class Vec2[@specialized T](val x:T , val y:T)(implicit arith :
> Numeric[T]) {
>     def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
>     // ...
>   }
>
> Vec2I is hand-rolled using primitive types and looks like so (again, imagine
> vars):
>   final class Vec2I(val x:Int , val y:Int) {
>     def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
>     // ...
>   }
>
>   --Rex
>
>



--
« Je déteste la montagne, ça cache le paysage »
Alphonse Allais

David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
p, li { white-space: pre-wrap; }The final @specialized pattern, when finished, *should* make a and b in the following code *exactly* identical when it comes to memory and performance.


Note that I deliberately added Numeric to the non-generic version to make the two implementations have equal performance; the generic version IS normally slower because it uses the Numeric class, but not by much. What I want to demonstrate, however, is that it's Numeric that is the bottleneck, and not the fact that B is generic; part of the issue is that Numeric itself isn't compiled with @specialized since it would enhance performance greatly if it was.


If you want to do a real benchmark, copy the Numeric code from http://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/src/library/scala/... and compile it, along with the code for Ordering, with @specialized. I'll try to do this myself when I get time.


The code:
class A(var x: Int, var y: Int) {
private val arith = Numeric.IntIsIntegral
def +(that: A) = new A(arith.plus(this.x, that.x),
arith.plus(this.y, that.y))
}
class B[@specialized T](var x: T, var y: T)(implicit arith: Numeric[T]) {
def +(that: B) = new B(arith.plus(this.x, that.x),
arith.plus(this.y, that.y))
}
var a = new A(2, 3)
a = a + a
var b = new B[Int](2, 3)
b = b + b


Quick question: Why doesn't the code for Numeric already include @specialized? It doesn't hurt until the compiler becomes aware of it, and it would in my opinion be better to hunt down all of the classes that need specialized now instead of waiting until the feature goes out of beta.


On Monday 24 August 2009 15:29:27 Rex Kerr wrote:
> Ah. Unfortunately, in this particular case specialization makes things
> even worse. With mutable generic int vectors, it goes from 2700&3000 to
> 4000&4000. With immutable generics it goes from 3800&9000 to 8000&14000.
>
> Also, memory usage goes up even more, from 54 to 62 bytes.
>
> --Rex
>
> On Mon, Aug 24, 2009 at 8:19 AM, Iulian Dragos <jaguarul [at] gmail [dot] com> wrote:
> > Specialization has to be enabled by -Yspecialize. I couldn't find the
> > code you used for benchmarking, so I couldn't post back timings. Your
> > results are what I'd expect, but the specialized transformation should
> > get you close to the hand-written specialized versions.
> >
> > iulian
> >
> > On Mon, Aug 24, 2009 at 9:54 AM, Rex Kerr<ichoran [at] gmail [dot] com> wrote:
> > > On Fri, Aug 21, 2009 at 11:05 AM, David Flemström
> > >
> > > <david [dot] flemstrom [at] gmail [dot] com> wrote:
> > >> - All classes are generic. If you only make a vector library that only
> > >> uses Floats or Doubles etc, someone is going to want to have an
> >
> > Int-based
> >
> > >> vector library, and thus a new one has to be written. With the new
> >
> > Numeric
> >
> > >> class in Scala 2.8, it's possible to make generic classes that take
> >
> > number
> >
> > >> types and that don't use boxing/unboxing with help from the
> > >> @specialized annotation. I have attached an example implementation of
> > >> a small system
> >
> > that
> >
> > >> demonstrates what I mean (ref1.scala); it only compiles with Scala
> > >> 2.8.
> >
> > Note
> >
> > >> that the code is optimized for speed and not necessarily for
> > >> legibility.
> > >
> > > I've now benchmarked this, and the results are not encouraging. I have
> > > benchmarks for two things: accumulating vectors from arrays of
> > > previously-computed vector values, and accumulating vectors with values
> > > computed on the fly. Previously-computed vectors are added with
> > > methods that take a vector; if computed on the fly, they're passed as
> > > two
> >
> > arguments
> >
> > > to a + function to avoid the temptation to create an object.
> > >
> > > These give the following timings for 2.8.0.r18509-b20090819020207,
> > > which
> >
> > to
> >
> > > me appears from these benchmarks to not have @specialized working yet
> >
> > (even
> >
> > > with -Xexperimental and -Xfuture):
> > >
> > > (baseline "do-it-yourself" method)
> > > Primitive math, on-the-fly: 100 <-- defined to be 100
> > > Primitive math, stored: 290
> > >
> > > (true regardless of @specialized, final, -optimise)
> > > Immutable Vec2[Int], on-the-fly: 3800
> > > Immutable Vec2[Int], stored: 9000
> > >
> > > (true regardless of final, -optimise)
> > > Immutable Vec2I, on-the-fly: 1700
> > > Immutable Vec2I, stored: 2600
> > >
> > > (true regardless of @specialized, final, -optimise)
> > > Mutable Vec2[Int], on-the-fly: 2700
> > > Mutable Vec2[Int], stored: 3000
> > >
> > > (true if not both final and -optimise)
> > > Mutable Vec2I, on-the-fly: 220
> > > Mutable Vec2I, stored: 500
> > >
> > > (both final and -optimise)
> > > Mutable Vec2I, on-the-fly: 70
> > > Mutable Vec2I, stored: 150
> > >
> > > Bottom line: for immutables, generics are 2x slower when adding two
> > > primitives, 3x when combining two generics, and the best immutables are
> > > 10-20x slower than mutables. For mutables, generics are 6-12x slower
> >
> > than
> >
> > > unoptimized&final dedicated mutables, and optimized final dedicated
> >
> > mutables
> >
> > > are 3x faster again (and, surprisingly, actually 1.5-2x faster than
> > > roll-your-own primitives). Oh, and by the way, according to
> > > Runtime.getRuntime.freeMemory, Vec2[Int] takes 54 bytes while Vec2I
> > > takes
> >
> > 16
> >
> > > (primitives take 8).
> > >
> > > I think this is pretty clear evidence that if you actually want a
> > > high-performance library for 2 and 3 component vectors using primitive
> > > types, that the library I've already written to do this (or another
> > > using
> >
> > a
> >
> > > similar strategy) is at this point the way to go, and that mutable
> >
> > vectors
> >
> > > is the way to go. I hope that eventually the @specialized version will
> >
> > be
> >
> > > closer to final -optimise, but it looks to me like the compiler team
> > > have their work cut out for them to make this happen.
> > >
> > > A couple of details about the code:
> > >
> > > Vec2[Int] is from a generic that looks like so (you can imagine the var
> > > version):
> > > final class Vec2[@specialized T](val x:T , val y:T)(implicit arith :
> > > Numeric[T]) {
> > > def +(ax:T,ay:T) = new Vt[T](arith.plus(x,ax),arith.plus(y,ay))
> > > // ...
> > > }
> > >
> > > Vec2I is hand-rolled using primitive types and looks like so (again,
> >
> > imagine
> >
> > > vars):
> > > final class Vec2I(val x:Int , val y:Int) {
> > > def +(ax:Int,ay:Int) = new V(x+ay,y+ay)
> > > // ...
> > > }
> > >
> > > --Rex
> >
> > --
> > « Je déteste la montagne, ça cache le paysage »
> > Alphonse Allais


David Flemström
david [dot] flemstrom [at] gmail [dot] com
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
On Mon, Aug 24, 2009 at 10:26 AM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
If you want to do a real benchmark, copy the Numeric code from http://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/src/library/scala/Numeric.scala and compile it, along with the code for Ordering, with @specialized. I'll try to do this myself when I get time.

It's more complicated than that.  Ordering[T] extends java.util.Comparator[T] with PartialOrdering[T], object Numeric defines conversions that use Integral[T], and so on.  There are all sorts of things that potentially need tweaking for @specialized to work.  I'm afraid I don't have the time to work it all out.

Also, I'm not sure @specialized is going to save most of the typing for mutable vectors specifically since is nice to be able to
  val v = Vec2F_m(3.0f,2.5f)
  val u = Vec2I_m(1,2)
  v += u
and have it run quickly (i.e. with zero object instantiations).  Defining an implicit conversion from Vec2I_m to Vec2F_m makes the addition operation take ~10x longer than if there's a method in Vec2F_m that understands integers, e.g.
  def +=(v:Vec2I_m) = { x+=v.x ; y+=v.y ; this }
You'd really need a Numeric[T,S] for this to not need retyping.

  --Rex

Ben Hutchison
Joined: 2009-01-21,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

Hi David,

You make some good points. I need some more time to think over them and respond.

If we can agree on the Separate Interface from Implementation
principle, then 2 hopefully separable issues emerge:

1. Negotiating a suitable common interfaces for Vector, Matrix, Tensor
etc. Currently, every library redefines these, and that obstructs
interop and reuse.

2. Developing the most full-featured, performant, correct or memory
efficient impl. This can hopefully be done by individual libraries in
either cooperation or competition.

To me, its reaching (even partial) agreement on some common interfaces
that offers the most value to the community as a whole. I think both
Scalala's modeling of the problem domain and your critique are useful
steps down that road.

-Ben

On Mon, Aug 24, 2009 at 4:24 AM, David
Flemström wrote:
> On Sunday 23 August 2009 05:56:51 you wrote:
>> Personally, I think effort invested in understanding, reusing,
>> consolidating and adapting the existing Java and Scala assets would be
>> more fruitful than proposing new ones.
>>
>> -Ben
>
> Of course, I have to agree with this.
> However, it seems that some libraries seem to be too specialized already to
> allow for them being changed.
>
> The scalala library, for instance, does not enforce dimensions with the type
> system meaning that it's valid to pass any kind of vector to a method taking
> a vector type, and you have to add checks like:
> require(vector.size == 3, "Wrong vector dimension")
>
> ...or use partial functions everywhere to manually enforce this. I would
> argue that it would be better to follow the Scala convention of having
> Vector2, Vector3, Vector4 etc for this purpose (but it could of course be
> useful to have an abstract class Vector, too, for generic methods that take
> any kind of vector). This type system is of course not ideal for scalala,
> because making classes up to Matrix512_512 etc will require a lot of
> classes, so you can't really adapt Scalala to have this feature, while the
> same would be possible for a less math-oriented and more
> computation-oriented library, the latter being something that you from time
> to time have a need for.
>
> What also might become an issue for a generally accepted vector library is
> the size of data types. Scalala is already very memory-optimized; a 3D
> DenseVector does "only" take up 56 bytes of memory (8 byte class header, 4
> byte array reference, 4 byte padding, 8 byte array header (on most JVMs),
> 8*3 bytes for the values, 4 bytes for the array length and 4 bytes padding)
> which is pretty good already.
> However, comparing that to a generic vector that is dimension specialized
> (e.g. Vector2, Vector3 etc), that uses Numeric for its operations and
> doesn't utilize an Array for storing the data: Such a Vector3[Double] would
> take up only 32 bytes of memory (8 class header, 8*3 for values, 4 for
> Numeric reference and 4 padding), and a Vector3[Float] would only take up 24
> bytes (8 header, 4*3 values, 4 for Numeric reference). To give Scalala the
> same memory usage, you would have to restructure the whole type system (but
> it would be possible, indeed, and I could actually do some research in order
> to maybe implement something if you're interested).
>
> Another issue is that the current Scalala doesn't easily allow for immutable
> data structures that are completely immutable (so that HotSpot can apply
> optimizations). In order to accomplish this (while still leaving the option
> open for mutable versions of the same structure à la the scala.collection
> library), you have to put the actual data variables at the bottom of the
> type tree, so that you have def's all the way down, have a trait with the
> correct "update" methods named Mutability or similar, and then store the
> actual data variables at the leafs of the type structure.
>
> E.g. (and I write this off-hand so don't blame me if it doesn't compile):
> trait Vector extends Product { //skipped Iterable etc to make things easier
> def dimension = productArity
> }
>
> trait Vector2 extends Vector {
> def x: Double, def y: Double
> def productArity = 2
> def productElement(n: Int): Double = if(n == 0) x else if (n == 1) y
> else error("Out of range")
> def productPrefix = "Vector2"
> }
> class ImmutVec2(protected var xt: Double, protected var yt: Double)
> extends Vector2 {
> def x = xt; def y = yt
> }
>
> trait Mutable2 extends Vector2 {
> def x_=(v: Double) = (xt = v); def y_=(v: Double) = (yt = v)
> }
> class MutVec2(xt: Double, yt: Double) extends ImmutVec2(xt, yt)
> with Mutable2
>
> This is not how things are done in Scalala, however, and would also require
> some major restructuring and breakage in order to work.
>
> If you can show me alternative ways of solving the issues in Scalala I've
> mentioned here, then I'll reconsider writing a new library and instead look
> if Scalala can't be adapted to suit more purposes, or if I maybe should
> accept the performance penalties Scalala gives me and perform optimizations
> elsewhere in my applications (the vector math will be the bottleneck
> however).
>
> Regards,
> David Flemström
> david [dot] flemstrom [at] gmail [dot] com

ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
On Mon, Aug 24, 2009 at 8:18 PM, Ben Hutchison <ben [at] playscapegames [dot] com> wrote:
Hi David,

You make some good points. I need some more time to think over them and respond.

If we can agree on the Separate Interface from Implementation
principle, then 2 hopefully separable issues emerge:

I'm not sure this is fully possible in this case.  There are important performance differences, especially for small vectors, between generic and specialized code.  Without a specialized interface, you can't access the specialized code.

For example, I just ran some tests comparing Array[Int]-based vectors to lots of separately declared var Ints.  On my machine, Array[Int] is 6x slower for 2-element vectors, but by 8 elements it's only ~2x slower (which continues to be true, more or less, through 16 elements, at which point I got tired of typing out var a:Int,var b:Int,... (it was actually 1.8x at 16 elements; I think it would asymptote at 1.5x)).  Also, there's an extra 24 bytes of overhead for Array[Int], which is pretty noticable until you have 4-6 elements.  Thus, for performance, one would probably want Vec2,...,Vec6 before switching over to VecN.  Also, for performance, one would want separate classes for primitive types.  This could be maddening for someone who plans on dealing only with giant arrays of doubles and wants a nice class hierarchy with AbstractTensor at the top. 

It would be good to have general guidelines for interfaces so that they aren't more different than they need to be, but I do wonder how much those differences reflect underlying assumptions and preferences for coding style and the domain of the problem.

  --Rex
 
Ben Hutchison
Joined: 2009-01-21,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

Ok, Ive now got time to make a more considered reply.

I would summarize my main interest/agenda as "Harvesting some common
interfaces representing those characteristics of Vectors and Matrices
that most current libraries agree upon". The aim being to allow
applications to be built against the interfaces, rather than
libraries, at least in part.

On Mon, Aug 24, 2009 at 4:24 AM, David
Flemström wrote:
>
> The scalala library, for instance, does not enforce dimensions with the type
> system meaning that it's valid to pass any kind of vector to a method taking
> a vector type, and you have to add checks like:
> require(vector.size == 3, "Wrong vector dimension")
>
> ...or use partial functions everywhere to manually enforce this.

Yes, however: it seems a standardized Vector trait (similar to
scalala.tensor.Vector) with variable size, could be subclassed by
Vector2 ... VectorN, who then fix the size at 2 .. N?

I would
> argue that it would be better to follow the Scala convention of having
> Vector2, Vector3, Vector4 etc for this purpose (but it could of course be
> useful to have an abstract class Vector, too, for generic methods that take
> any kind of vector). This type system is of course not ideal for scalala,
> because making classes up to Matrix512_512 etc will require a lot of
> classes, so you can't really adapt Scalala to have this feature, while the
> same would be possible for a less math-oriented and more
> computation-oriented library, the latter being something that you from time
> to time have a need for.

Ok, if the common interface just exposed the variable size vector
form, the choice of what fixed size subtypes to create could be a
particular library writers decision I think?

>
> What also might become an issue for a generally accepted vector library is
> the size of data types. [snip]

Does memory use affect the interface, or just the implementation?

>
> Another issue is that the current Scalala doesn't easily allow for immutable
> data structures that are completely immutable (so that HotSpot can apply
> optimizations). In order to accomplish this (while still leaving the option
> open for mutable versions of the same structure à la the scala.collection
> library), you have to put the actual data variables at the bottom of the
> type tree, so that you have def's all the way down, have a trait with the
> correct "update" methods named Mutability or similar, and then store the
> actual data variables at the leafs of the type structure.

I agree that both immutability and mutability would need to be
supportable by any widely accepted interface, and that an immutable
library would useful.

Immutable libraries need any update methods to return the updated
copy, which imposes a certain style on client-code.

//immutable
val scaledVector = aVector.scale(2)
//mutable
aVector.scale(2)

However, the first API style works for either case, as in the Scala
collections API. So my proposal would be that update methods return an
updated-result object provided by the object invoked upon. Then a
separate Mutable trait is not required in core API.

I agree that's not how Scalala works. In-place modifier methods might
be offered as a Mutable- or Scalala- specific extension API.

>
> If you can show me alternative ways of solving the issues in Scalala I've
> mentioned here, then I'll reconsider writing a new library and instead look
> if Scalala can't be adapted to suit more purposes,

You probably cant make Scalala work just how you want. But I encourage
you to look at the common overlap of your needs with Scalala's design,
and see if there's some basic abstractions your proposed library and
Scalala (and ideally some of the Java libs) could jointly implement.

To me, it seems there are. I know that sounds idealistic and
wishy-washy, but currently there are at least 6 incompatible
Vector/Matrix libs accessible from Scala (Colt, Jama, Java3D, MTJ,
Commons-Math and Scalala). And you are proposing a 7th...

Isnt it about time people agreed to agree?

-Ben

Ben Hutchison
Joined: 2009-01-21,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

On Wed, Aug 26, 2009 at 1:46 AM, Rex Kerr wrote:
> On Mon, Aug 24, 2009 at 8:18 PM, Ben Hutchison

>> If we can agree on the Separate Interface from Implementation
>> principle, then 2 hopefully separable issues emerge:
>
> I'm not sure this is fully possible in this case.  There are important
> performance differences, especially for small vectors, between generic and
> specialized code.  Without a specialized interface, you can't access the
> specialized code.

Sure. But cant the Specialized impl. still implement the generic
interface? Maybe going thru the specialized API will be faster, but
not everyone needs that.

>
[snip]  Also, for performance, one would want
> separate classes for primitive types.  This could be maddening for someone
> who plans on dealing only with giant arrays of doubles and wants a nice
> class hierarchy with AbstractTensor at the top.

If you just want giant mutable arrays of doubles, then most of the
existing matrix libs try to address your needs.

What I actually see in short supply is decent abstraction, elegance
and reuse. Things Ive learned (the hard way) to value greatly.

Personally, Im not seeking bare metal speed from doing maths in Scala.
If one is, why not use the super-fast C/Fortran libs? Im after
convenience, abstraction and power. Im happy to wait for @specialized
and Hotspot improvements to get more performance.

Also, the lack of user-definable value types on the JVM imposes
annoying limits on what performance is achievable anyway. Hence all
the double arrays.

-Ben

Bradley Buchsbaum
Joined: 2009-04-15,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
Might multi-dimensional arrays fit in to the mix as well?

for java, there is:

http://www.unidata.ucar.edu/software/netcdf/java/docs/ucar/multiarray/package-summary.html


Brad Buchsbaum

On Wed, Aug 26, 2009 at 8:26 AM, Ben Hutchison <ben [at] playscapegames [dot] com> wrote:
Ok, Ive now got time to make a more considered reply.

I would summarize my main interest/agenda as "Harvesting some common
interfaces representing those characteristics of Vectors and Matrices
that most current libraries agree upon". The aim being to allow
applications to be built against the interfaces, rather than
libraries, at least in part.

On Mon, Aug 24, 2009 at 4:24 AM, David
Flemström<david [dot] flemstrom [at] gmail [dot] com> wrote:
>
> The scalala library, for instance, does not enforce dimensions with the type
> system meaning that it's valid to pass any kind of vector to a method taking
> a vector type, and you have to add checks like:
> require(vector.size == 3, "Wrong vector dimension")
>
> ...or use partial functions everywhere to manually enforce this.

Yes, however: it seems a standardized Vector trait (similar to
scalala.tensor.Vector) with variable size, could be subclassed by
Vector2 ... VectorN, who then fix the size at 2 .. N?

I would
> argue that it would be better to follow the Scala convention of having
> Vector2, Vector3, Vector4 etc for this purpose (but it could of course be
> useful to have an abstract class Vector, too, for generic methods that take
> any kind of vector). This type system is of course not ideal for scalala,
> because making classes up to Matrix512_512 etc will require a lot of
> classes, so you can't really adapt Scalala to have this feature, while the
> same would be possible for a less math-oriented and more
> computation-oriented library, the latter being something that you from time
> to time have a need for.

Ok, if the common interface just exposed the variable size vector
form, the choice of what fixed size subtypes to create could be a
particular library writers decision I think?

>
> What also might become an issue for a generally accepted vector library is
> the size of data types. [snip]

Does memory use affect the interface, or just the implementation?

>
> Another issue is that the current Scalala doesn't easily allow for immutable
> data structures that are completely immutable (so that HotSpot can apply
> optimizations). In order to accomplish this (while still leaving the option
> open for mutable versions of the same structure à la the scala.collection
> library), you have to put the actual data variables at the bottom of the
> type tree, so that you have def's all the way down, have a trait with the
> correct "update" methods named Mutability or similar, and then store the
> actual data variables at the leafs of the type structure.

I agree that both immutability and mutability would need to be
supportable by any widely accepted interface, and that an immutable
library would useful.

Immutable libraries need any update methods to return the updated
copy, which imposes a certain style on client-code.

//immutable
val scaledVector = aVector.scale(2)
//mutable
 aVector.scale(2)

However, the first API style works for either case, as in the Scala
collections API. So my proposal would be that update methods return an
updated-result object provided by the object invoked upon. Then a
separate Mutable trait is not required in core API.

I agree that's not how Scalala works. In-place modifier methods might
be offered as a Mutable- or Scalala- specific extension API.

>
> If you can show me alternative ways of solving the issues in Scalala I've
> mentioned here, then I'll reconsider writing a new library and instead look
> if Scalala can't be adapted to suit more purposes,

You probably cant make Scalala work just how you want. But I encourage
you to look at the common overlap of your needs with Scalala's design,
and see if there's some basic abstractions your proposed library and
Scalala (and ideally some of the Java libs) could jointly implement.

To me, it seems there are. I know that sounds idealistic and
wishy-washy, but currently there are at least 6 incompatible
Vector/Matrix libs accessible from Scala (Colt, Jama, Java3D, MTJ,
Commons-Math and Scalala). And you are proposing a 7th...

Isnt it about time people agreed to agree?

-Ben



--
Bradley R. Buchsbaum
Rotman Research Institute
3560 Bathurst St.
Toronto, ON Canada M6A 2E1
email: bbuchsbaum [at] rotman-baycrest [dot] on [dot] ca
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
On Wed, Aug 26, 2009 at 8:46 AM, Ben Hutchison <ben [at] playscapegames [dot] com> wrote:
On Wed, Aug 26, 2009 at 1:46 AM, Rex Kerr<ichoran [at] gmail [dot] com> wrote:
> On Mon, Aug 24, 2009 at 8:18 PM, Ben Hutchison <ben [at] playscapegames [dot] com>

> There are important
> performance differences, especially for small vectors, between generic and
> specialized code.  Without a specialized interface, you can't access the
> specialized code.

Sure. But cant the Specialized impl. still implement the generic
interface? Maybe going thru the specialized API will be faster, but
not everyone needs that.

It should be able to.  There might be some cases where one could get into problem with name collisions if one was careless, but the obvious things I've thought of work out with a little thought.

>
[snip]  Also, for performance, one would want
> separate classes for primitive types.  This could be maddening for someone
> who plans on dealing only with giant arrays of doubles and wants a nice
> class hierarchy with AbstractTensor at the top.

If you just want giant mutable arrays of doubles, then most of the
existing matrix libs try to address your needs.

What I actually see in short supply is decent abstraction, elegance
and reuse. Things Ive learned (the hard way) to value greatly.

In equally short supply is tiny mutable or immutable arrays of non-Doubles.  This allows a completely different but I would argue equally important elegance.

This is useful both with small data sets (e.g. screen coordinates) and vast ones (where you can't fit everything in memory so you want to generate a stream of, say, cell phone GPS coordinates that you use to build summary statistics).  In a lot of cases where the underlying data is 2-3 dimensional, you can use the abstraction that your data is actually a (2-3)xN matrix, but it's best not to be forced to use that abstraction unless it's the appropriate one--you have to keep more indices straight, keep everything in memory or develop a really clever caching scheme so you don't have to, etc..

Personally, Im not seeking bare metal speed from doing maths in Scala.
If one is, why not use the super-fast C/Fortran libs? Im after
convenience, abstraction and power.

What I like about Scala (at least in principle) is that one doesn't really need to give up anything, even speed (to within a factor of 2 or so, anyway).  For example, why not wrap the super-fast C/Fortran libs with the pretty generic framework?
   --Rex

ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
David,

I've been playing with making type hierarchies (and verifying my suspicions with performance testing); this has made me wonder about the following example you gave:

On Sun, Aug 23, 2009 at 2:24 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
trait Vector extends Product { //skipped Iterable etc to make things easier
def dimension = productArity
}


trait Vector2 extends Vector {
def x: Double, def y: Double
def productArity = 2
def productElement(n: Int): Double = if(n == 0) x else if (n == 1) y
else error("Out of range")
def productPrefix = "Vector2"
}
class ImmutVec2(protected var xt: Double, protected var yt: Double)
extends Vector2 {
def x = xt; def y = yt
}


trait Mutable2 extends Vector2 {
def x_=(v: Double) = (xt = v); def y_=(v: Double) = (yt = v)
}
class MutVec2(xt: Double, yt: Double) extends ImmutVec2(xt, yt)
with Mutable2

This type of hierarchy isn't hard to implement.  The problem is that instead of saving work for the library creator, it actually makes them do extra work if they want performance.

For example, if you try to define any math inside Vector2 above, you have to perform a method call for each value you want to get (which slows things down by a factor of 2-10x).  In fact, you have to keep re-implementing the same code all the way down the chain until you hit something marked "final":

trait ImmutVec2(/* ... */) extends Vector 2 {
  /* ... */
  def dotProduct(v:ImmutVec2) = x*v.x + y*v.y  // No way to avoid v.x() method call
}

final class MutVec2/* . . . */ {
  /* ... */
  def dotProduct(v:MutVec2) = x*v.x + y*v.y  // Now we see v.x is just a field lookup!
}

It seems to me that what the hierarchy buys us is the ability to relatively easily create a new vector type with lousy performance, at the expense of having to populate the entire inheritance tree with duplicate code (except for the type).

Am I missing something?

  --Rex
 
David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
p, li { white-space: pre-wrap; }I just wanted to demonstrate the general structure in my example. Various optimizations could of course be done, but I think that if we want to use some of the *memory* optimizations some of the others have mentioned (aka "dynamically" choosing either an array or individual fields for the storage model), you'll have to have at least 1 unaviodable method call per vector element, thus sacrificing some *performance*. We could of course say that all vectors should use arrays (or whatever) and stick to that but then we have the issue of mutability again. I think the simplest way to solve this would be to simply pay for that extra method call.


Oh, and in the following code:


case class X(var y: Int)
val x = X(2)
val z = x.y


...the getter method y is called when assigning to z in all cases anyways (unless, maybe, if X is final and has no superclasses that define y). You might know this already; I just want to highlight that you get direct field accesses when writing simply var; it's not that simple.


On Wednesday 26 August 2009 23:32:38 Rex Kerr wrote:
> David,
>
> I've been playing with making type hierarchies (and verifying my suspicions
> with performance testing); this has made me wonder about the following
> example you gave:
>
> On Sun, Aug 23, 2009 at 2:24 PM, David Flemström
>
> <david [dot] flemstrom [at] gmail [dot] com>wrote:
> > trait Vector extends Product { //skipped Iterable etc to make things
> > easier def dimension = productArity
> > }
> >
> >
> > trait Vector2 extends Vector {
> > def x: Double, def y: Double
> > def productArity = 2
> > def productElement(n: Int): Double = if(n == 0) x else if (n == 1) y
> > else error("Out of range")
> > def productPrefix = "Vector2"
> > }
> > class ImmutVec2(protected var xt: Double, protected var yt: Double)
> > extends Vector2 {
> > def x = xt; def y = yt
> > }
> >
> >
> > trait Mutable2 extends Vector2 {
> > def x_=(v: Double) = (xt = v); def y_=(v: Double) = (yt = v)
> > }
> > class MutVec2(xt: Double, yt: Double) extends ImmutVec2(xt, yt)
> > with Mutable2
>
> This type of hierarchy isn't hard to implement. The problem is that
> instead of saving work for the library creator, it actually makes them do
> extra work if they want performance.
>
> For example, if you try to define any math inside Vector2 above, you have
> to perform a method call for each value you want to get (which slows things
> down by a factor of 2-10x). In fact, you have to keep re-implementing the
> same code all the way down the chain until you hit something marked
> "final":
>
> trait ImmutVec2(/* ... */) extends Vector 2 {
> /* ... */
> def dotProduct(v:ImmutVec2) = x*v.x + y*v.y // No way to avoid v.x()
> method call
> }
>
> final class MutVec2/* . . . */ {
> /* ... */
> def dotProduct(v:MutVec2) = x*v.x + y*v.y // Now we see v.x is just a
> field lookup!
> }
>
> It seems to me that what the hierarchy buys us is the ability to relatively
> easily create a new vector type with lousy performance, at the expense of
> having to populate the entire inheritance tree with duplicate code (except
> for the type).
>
> Am I missing something?
>
> --Rex


David Flemström
david [dot] flemstrom [at] gmail [dot] com
phkoester
Joined: 2009-08-23,
User offline. Last seen 42 years 45 weeks ago.
Name Collisions With Private Fields and Public Getters/Setters

The following won't compile:

class MyClass(var x: int) {
def x: int = {
println("x getter")
this.x
}

def x_=(x: int) {
println("x setter")
this.x = x
}
}

But the getter and setter will yield to an error stating that a method
was defined twice and there are ambiguous references even. Okay, I
understand the problem.

On the contrary, it is quite common for a class to keeps its fields
private and guard the (public) setters and getters, given the fields are
vars and not vals. For instance, a getter may return a lazily-created
value, and a setter might dismiss unacceptable values breaking a class
invariance, etc.

The examples in the book I'm reading, ``Programming in Scala,'' don't
give a hint on best practices in such cases. The book seems to suggest
to choose non-human-readable var names for fields (e.g. `re' and `im'
for complex numbers, `n' and `d' for rationals and so so on) and choose
well-readable method names on the public interface.

The best I have come up with yet is to prepend a dollar sign to the
private fields, like so:

class User(private var $userName: String, private var $password: String) {
def this() {
this(null, null)
}

def password = $password

def password_=(password: String) {
$password = password
}

def userName = $userName

def userName_=(userName: String) {
$userName = userName
}
}

This looks like a lot of boilerplate to me. What do you think? Would you
think this is a best-practice naming scheme?

I for one would love to be able to overload the intrinsic auto-generated
getters and setters without the compiler picking on me, but that's all
possibly water under the bridge ...

phkoester
Joined: 2009-08-23,
User offline. Last seen 42 years 45 weeks ago.
Re: Name Collisions With Private Fields and Public Getters/Sett

I should have written `Int' rather than `int', sorry. (Wrote free-hand,
hope I got my point across.)

---Ph.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Name Collisions With Private Fields and Public Getters/Sett

On Wed, Aug 26, 2009 at 11:56:37PM +0200, Philip Köster wrote:
> The best I have come up with yet is to prepend a dollar sign to the
> private fields, like so:
>
> class User(private var $userName: String, private var $password: String) {

This is a grave error. The dollar sign is not for user code. You will
break in spectacular (or worse, subtle) fashion by using it.

The convention I'm aware of is leading underscore, _userName.

Kevin Wright
Joined: 2009-06-09,
User offline. Last seen 49 weeks 3 days ago.
Re: Name Collisions With Private Fields and Public Getters/Set
Risky choice, $ is used heavily by the compiler for generated artifacts.
Your code may work now, but there's a possibility that you'll run into a naming conflict with some future version of Scala.
Have you considered using an inner singleton object to contain these private fields?

On Wed, Aug 26, 2009 at 11:01 PM, Philip Köster <philip [dot] koester [at] web [dot] de> wrote:
I should have written `Int' rather than `int', sorry. (Wrote free-hand, hope I got my point across.)

---Ph.

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Name Collisions With Private Fields and Public Getters/Set
Philip, the problem here is that your getter/setter is precisely the private field. Scala has a shortcut for that: val and var. When you declare "var x", Scala creates a field x, a getter x and a setter x_=. You can't create fields directly: any field must have at least a getter, which may be private as needed. This is done to enforce the uniform access principle: ALL fields are accessed through getters and setters.   If the getter and setter don't do anything, use val/var.   One the other hand, suppose the password was stored encrypted, and with a salt. Then your class might go like this:   class User(val name: String, initialPassword: String) {   private lazy var encryptedPassword = encrypt(initialPassword, salt)   private lazy var salt = scala.util.Random.nextInt     private def encrypt(plainText: String, salt: Int): String = { ... }   private def decrypt(encryptedText: String, salt: Int): String = { ... }     def password = decrypt(encryptedPassword, salt)   def password_=(newPassword: String) = encrypt(newPassword, salt) }   There, see? If the private field is not exactly the same thing as the public field, its name will naturally be something else. Even if the value is the same, but you did some validation, you might call the private field validatedX and the getter X.
On Wed, Aug 26, 2009 at 6:56 PM, Philip Köster <philip [dot] koester [at] web [dot] de> wrote:
The following won't compile:

class MyClass(var x: int) {
       def x: int = {
               println("x getter")
               this.x
       }

       def x_=(x: int) {
               println("x setter")
               this.x = x
       }
}

But the getter and setter will yield to an error stating that a method was defined twice and there are ambiguous references even. Okay, I understand the problem.

On the contrary, it is quite common for a class to keeps its fields private and guard the (public) setters and getters, given the fields are vars and not vals. For instance, a getter may return a lazily-created value, and a setter might dismiss unacceptable values breaking a class invariance, etc.

The examples in the book I'm reading, ``Programming in Scala,'' don't give a hint on best practices in such cases.  The book seems to suggest to choose non-human-readable var names for fields (e.g. `re' and `im' for complex numbers, `n' and `d' for rationals and so so on) and choose well-readable method names on the public interface.

The  best I have come up with yet is to prepend a dollar sign to the private fields, like so:

class User(private var $userName: String, private var $password: String) {
       def this() {
               this(null, null)
       }
       
       def password = $password
       
       def password_=(password: String) {
               $password = password
       }

       def userName = $userName

       def userName_=(userName: String) {
               $userName = userName
       }
}

This looks like a lot of boilerplate to me. What do you think? Would you think this is a best-practice naming scheme?

I for one would love to be able to overload the intrinsic auto-generated getters and setters without the compiler picking on me, but that's all possibly water under the bridge ...



--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.
phkoester
Joined: 2009-08-23,
User offline. Last seen 42 years 45 weeks ago.
Re: Name Collisions With Private Fields and Public Getters/Set

> Have you considered using an inner singleton object to contain these
> private fields?

No, I'm afraid I haven't. (Been learning Scala for two days now. ;))
Mind to share an example?

phkoester
Joined: 2009-08-23,
User offline. Last seen 42 years 45 weeks ago.
Re: Name Collisions With Private Fields and Public Getters/Sett

> This is a grave error. The dollar sign is not for user code. You will
> break in spectacular (or worse, subtle) fashion by using it.

Thanks for your answer. Honestly, I've never in my career as a
programmer been a fan of prefixes such as `$', `_', `n', `l', `lp',
`the' or, to make matters worse as they do in Redmond, `lpsz', or
whatever. The underscore looks like something reserved to me, especially
the double underscore, but a dollar prefix on fields is perfectly doable
in Java, which is something that writers of Scala libraries should take
into account.

In the end, it's a matter of taste. I like neither of the two, but given
the choice, I'd go for the dollar sign.

But as I said, I'm looking for a solution that needs no prefixes at all,
making me able to choose natural identifiers.

Regards
---Phil

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: Name Collisions With Private Fields and Public Getters/Set
On Wed, Aug 26, 2009 at 7:38 PM, Philip Köster <philip [dot] koester [at] web [dot] de> wrote:
Have you considered using an inner singleton object to contain these private fields?

No, I'm afraid I haven't. (Been learning Scala for two days now. ;)) Mind to share an example?

 class User(initialName: String, initialPassword: String) {
   private object fields {
     var name: String = initialName;
     var password: String = initialPassword;
   }
   def name = fields.name
   def name_=(newName: String) = fields.name = newName
   def password = fields.password
   def password_=(newPassword: String) = fields.password = newPassword
 }
--
Daniel C. Sobral

Something I learned in academia: there are three kinds of academic reviews: review by name, review by reference and review by value.
David Hall 4
Joined: 2009-08-21,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

I thought I would inject myself one more time into this thread, just
for a word about how Scalala is structured right now. I'm not the
primary author of the library, but I've done a lot of the refactoring
for 2.8, so I thought I could share what our goals are.

Scalala is intended as something of a matlab replacement, and not for
graphics and other low dimensional stuff. Daniel Ramage and I are both
natural language processing/machine learning students, and the library
reflects that. We care about high dimensional problems of varying size
(say 300-30000 dimensions). Low dimensions are almost always useless
to us, and so we can't really represent what we want in a fixed size
class, or one that can be effectively inlined. The Tensor hierarchy
reflects that. (In my other project, scalanlp, we now have an
implementation of Tensor over arbitrary T types, which for me are
usually strings.)

The Tensor hierarchy *is* intended to support several speed
enhancements, though they're removed right now, mostly in the form of
taking advantage of expression trees to tighten loops (so, make v + 4
* y be a scaleAdd rather than a multiply then add). I think Dan plans
to support MTJ-style calling out to LAPACK to make things really
blaze.

As has been pointed out, it would be good to have immutability in the
hierarchy, and it would be good to have Vector2 and Vector3.
Dan's said he's very open to contributions.

Anyway, nothing concrete, but I thought I would represent out what I
see as Scalala's focus.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: Name Collisions With Private Fields and Public Getters/Sett

On Thu, Aug 27, 2009 at 12:37:01AM +0200, Philip Köster wrote:
> The underscore looks like something reserved to me, especially the
> double underscore, but a dollar prefix on fields is perfectly doable
> in Java, which is something that writers of Scala libraries should
> take into account.
>
> In the end, it's a matter of taste. I like neither of the two, but
> given the choice, I'd go for the dollar sign.

Maybe I'm misreading you but you don't seem to understand, I'm not
talking about a matter of personal taste. If you use the '$' your code
is going to break. It's in the scala language spec: "The '$' character
is reserved for compiler synthesized identifiers. User programs should
not define identifiers which contain '$' characters." I can assure you
knowing how the $ is used in the compiler that your $-using future is an
unpleasant one indeed.

It's in the java language spec as well: "The $ character should be used
only in mechanically generated source code or, rarely, to access
preexisting names on legacy systems."

phkoester
Joined: 2009-08-23,
User offline. Last seen 42 years 45 weeks ago.
Re: Name Collisions With Private Fields and Public Getters/Sett

> Maybe I'm misreading you but you don't seem to understand, I'm not
> talking about a matter of personal taste. If you use the '$' your code
> is going to break. It's in the scala language spec: "The '$' character
> is reserved for compiler synthesized identifiers. User programs should
> not define identifiers which contain '$' characters." I can assure you
> knowing how the $ is used in the compiler that your $-using future is an
> unpleasant one indeed.

Okay, thanks for the update. This is clearly a Scala novelty that I
haven't been aware of. (I already noticed `_=' was translated to `_$eq'
or something resembling although I thought on a byte-code level
everything was allowed, even identifiers such as `42'.)

AFWIF, I think having private vars take on a different ID than public
accessors is not a perfect solution. A solution I can live with, maybe,
but a solution I don't find satisfactory.

But I'll abandon the `$', promised. :)

---Ph.

Ricky Clarkson
Joined: 2008-12-19,
User offline. Last seen 3 years 2 weeks ago.
Re: Name Collisions With Private Fields and Public Getters/Set

For what it's worth, I used to be able to crash javac by cleverly
placing $ signs in my code. :)

2009/8/27 Philip Köster :
>> Maybe I'm misreading you but you don't seem to understand, I'm not talking
>> about a matter of personal taste.  If you use the '$' your code is going to
>> break.  It's in the scala language spec: "The '$' character is reserved for
>> compiler synthesized identifiers. User programs should not define
>> identifiers which contain '$' characters." I can assure you knowing how the
>> $ is used in the compiler that your $-using future is an unpleasant one
>> indeed.
>
> Okay, thanks for the update. This is clearly a Scala novelty that I haven't
> been aware of. (I already noticed `_=' was translated to `_$eq' or something
> resembling although I thought on a byte-code level everything was allowed,
> even identifiers such as `42'.)
>
> AFWIF, I think having private vars take on a different ID than public
> accessors is not a perfect solution. A solution I can live with, maybe, but
> a solution I don't find satisfactory.
>
> But I'll abandon the `$', promised. :)
>
> ---Ph.
>

askia
Joined: 2008-08-24,
User offline. Last seen 4 years 8 weeks ago.
Re: Scala vector and matrix math library

I have also looked a bit into this topic.
I was wondering how practical it would be to have additional compiler phase
plugins for optimization of some cases while maintaining a generic
interface.
In ScalaBox2D, the authoring was experimenting with that.

How would this work from a design and architectural point of view?

* reference
http://www.box2d.org/forum/viewtopic.php?f=11&t=3349&start=20

http://github.com/Villane/vecmath/tree/8763344240fb82069cfe7a7dac6f30e03...

David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
p, li { white-space: pre-wrap; }On Thursday 27 August 2009 17:54:04 Askia wrote:
> I have also looked a bit into this topic.
> I was wondering how practical it would be to have additional compiler phase
> plugins for optimization of some cases while maintaining a generic
> interface.
> In ScalaBox2D, the authoring was experimenting with that.
>
> How would this work from a design and architectural point of view?
Something that would be interesting to have, if we can gurantee that everyone who uses the library also activates the compiler plugin (I don't know how the plugin architecture works, but if this is possible), would be to have a plugin that:
- transforms all immutable case classes
→ that don't have any fields apart from the ones in the constructor
→ that also have been marked with a specific annotation,
into parameterlist elements and individual variables.


So, e.g.:


@inlined
case class A(val b: Int, val c: Int)
def someDef(something: A) = println(something.toString)
val d = new A(2, 3)
someDef(d)


...becomes:
@inlined
case class A(val b: Int, val c: Int) {
//... transformed methods that have param lists
//with the above variables instead of "this"
}
def someDef(inline_something_$b: Int, inline_something_$c: Int) =
println(A.toString(inline_something_$b, inline_something_$c))
val d_$b = 2
val d_$c = 3
someDef(d_$b, d_$c)


Imagine the speed improvements we would get! It would be *very* difficult to implement, though, and you wouldn't be able to change the inlined class's parameter list after it has been referenced in external unrecompilable code.

David Flemström
david [dot] flemstrom [at] gmail [dot] com
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
The problem with creating simultaneously flexible and high-performance vector math code is not that it's hard to do, it's just that there is *so much* repetitive code to write that no sane person would try to do it by hand.  It's tricky to shift all that burden to the compiler via @specialized, because this is @very_very_specialized in terms of sensible operations that should happen quickly.

But given that all the code follows from a much smaller set of basic operations and rules for what makes things run quickly, it seems like automatic code generation should in theory be possible.

Hence, I think the right answer for now is to write a program that automatically generates a high performance small vector library.  I've played with this a little and there don't seem to be any show-stoppers for creating the leaves of the library (which are all final for performance, and have overloaded methods to do math with each other).  I haven't worked out how to cleanly auto-generate the right class hierarchy yet, but that shouldn't be too daunting.

The hierarchy I'm thinking of now is:

Scalar[T]   <-- need this for things like 2+v to work
  ScalarTypename
Vec[T] extends Product with Iterable[T]  // Anything else needed?
  VecIsReference[T] extends Vec[T]
  VecIsRational[T] extends Vec[T]
  VecIsMutable[T] extends Vec[T]
  VecTypename extends Vec[Typename]  (with VecIsRational[T])
  Vec#[T] extends Vec[T]
    Vec#Typename extends Vec#[Typename] with VecTypename
      final Vec#T extends Vec#Typename
      final MVec#T extends Vec#Typename with VecIsMutable[Typename]
      final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
        with VecIsReference[Typename]

Here, there'd be separate classes for Typename in {Int,Long,Float,Double} (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be {2,3,4,5,6} for optimal performance, given my benchmarking.

The MVec2I series would be mutable; the RefVec2I series would be mutable and index into an array instead of carrying the values in itself (so you could store vector data in arrays for compactness and lose as little performance as possible).

Overall, this would require a library with 97 classes (!!) if it went up to N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).  The final classes at the leaves of the tree would have ~20 (tiny) methods per operator for static versions (~60 for mutable versions).

By the way, Vec#[T] would be a non-abstract class, so you could actually use that if you wanted something that worked for your favorite type T and didn't require stellar performance; I also think you could mix in VecIsReference[T] and it would work, though I haven't actually written the code.

Also by the way, I am using Vec instead of Vector because of the convention by people who write collections libraries that "a Vector is an Array, not something to do math on"--the namespace is by now hopelessly conflicted.

  --Rex

David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

I think that a compiler plugin that would allow for basic macros would do the
trick in a situation like this. We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

For example, this will still yield a getter method x for class B and C:

trait A {
def x: Int
}

class B extends A {
val x: Int = 3
}
class C extends A {
var x: Int = 3
}

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO. The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

On Thursday 27 August 2009 18:50:41 Rex Kerr wrote:
> The problem with creating simultaneously flexible and high-performance
> vector math code is not that it's hard to do, it's just that there is *so
> much* repetitive code to write that no sane person would try to do it by
> hand. It's tricky to shift all that burden to the compiler via
> @specialized, because this is @very_very_specialized in terms of sensible
> operations that should happen quickly.
>
> But given that all the code follows from a much smaller set of basic
> operations and rules for what makes things run quickly, it seems like
> automatic code generation should in theory be possible.
>
> Hence, I think the right answer for now is to write a program that
> automatically generates a high performance small vector library. I've
> played with this a little and there don't seem to be any show-stoppers for
> creating the leaves of the library (which are all final for performance,
> and have overloaded methods to do math with each other). I haven't worked
> out how to cleanly auto-generate the right class hierarchy yet, but that
> shouldn't be too daunting.
>
> The hierarchy I'm thinking of now is:
>
> Scalar[T] <-- need this for things like 2+v to work
> ScalarTypename
> Vec[T] extends Product with Iterable[T] // Anything else needed?
> VecIsReference[T] extends Vec[T]
> VecIsRational[T] extends Vec[T]
> VecIsMutable[T] extends Vec[T]
> VecTypename extends Vec[Typename] (with VecIsRational[T])
> Vec#[T] extends Vec[T]
> Vec#Typename extends Vec#[Typename] with VecTypename
> final Vec#T extends Vec#Typename
> final MVec#T extends Vec#Typename with VecIsMutable[Typename]
> final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
> with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.
>
> The MVec2I series would be mutable; the RefVec2I series would be mutable
> and index into an array instead of carrying the values in itself (so you
> could store vector data in arrays for compactness and lose as little
> performance as possible).
>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes). The final
> classes at the leaves of the tree would have ~20 (tiny) methods per
> operator for static versions (~60 for mutable versions).
>
> By the way, Vec#[T] would be a non-abstract class, so you could actually
> use that if you wanted something that worked for your favorite type T and
> didn't require stellar performance; I also think you could mix in
> VecIsReference[T] and it would work, though I haven't actually written the
> code.
>
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.
>
> --Rex

David Flemström
david [dot] flemstrom [at] gmail [dot] com

Maxime Lévesque
Joined: 2009-08-18,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

 BTW, are there plans for meta programming in Scala ?

   http://groovy.codehaus.org/Compile-time+Metaprogramming+-+AST+Transformations

 That would be an alternative to a compiler plugin wouldn't it ?


On Thu, Aug 27, 2009 at 1:36 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
I think that a compiler plugin that would allow for basic macros would do the
trick in a situation like this. We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

For example, this will still yield a getter method x for class B and C:

trait A {
 def x: Int
}

class B extends A {
 val x: Int = 3
}
class C extends A {
 var x: Int = 3
}

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO. The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

On Thursday 27 August 2009 18:50:41 Rex Kerr wrote:
> The problem with creating simultaneously flexible and high-performance
> vector math code is not that it's hard to do, it's just that there is *so
> much* repetitive code to write that no sane person would try to do it by
> hand.  It's tricky to shift all that burden to the compiler via
> @specialized, because this is @very_very_specialized in terms of sensible
> operations that should happen quickly.
>
> But given that all the code follows from a much smaller set of basic
> operations and rules for what makes things run quickly, it seems like
> automatic code generation should in theory be possible.
>
> Hence, I think the right answer for now is to write a program that
> automatically generates a high performance small vector library.  I've
> played with this a little and there don't seem to be any show-stoppers for
> creating the leaves of the library (which are all final for performance,
> and have overloaded methods to do math with each other).  I haven't worked
> out how to cleanly auto-generate the right class hierarchy yet, but that
> shouldn't be too daunting.
>
> The hierarchy I'm thinking of now is:
>
> Scalar[T]   <-- need this for things like 2+v to work
>   ScalarTypename
> Vec[T] extends Product with Iterable[T]  // Anything else needed?
>   VecIsReference[T] extends Vec[T]
>   VecIsRational[T] extends Vec[T]
>   VecIsMutable[T] extends Vec[T]
>   VecTypename extends Vec[Typename]  (with VecIsRational[T])
>   Vec#[T] extends Vec[T]
>     Vec#Typename extends Vec#[Typename] with VecTypename
>       final Vec#T extends Vec#Typename
>       final MVec#T extends Vec#Typename with VecIsMutable[Typename]
>       final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
>         with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.
>
> The MVec2I series would be mutable; the RefVec2I series would be mutable
> and index into an array instead of carrying the values in itself (so you
> could store vector data in arrays for compactness and lose as little
> performance as possible).
>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).  The final
> classes at the leaves of the tree would have ~20 (tiny) methods per
> operator for static versions (~60 for mutable versions).
>
> By the way, Vec#[T] would be a non-abstract class, so you could actually
> use that if you wanted something that worked for your favorite type T and
> didn't require stellar performance; I also think you could mix in
> VecIsReference[T] and it would work, though I haven't actually written the
> code.
>
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.
>
>   --Rex


David Flemström
david [dot] flemstrom [at] gmail [dot] com

Meredith Gregory
Joined: 2008-12-17,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
Hey Folks,

i've only been dipping in and out of this thread, but if you are seriously considering an API for matrix maths, may i suggest that you use Geometric Algebra (see also Clifford Algebra) as the basis for the API. You will get huge leap in abstraction and usability. The graphics and physics community are very, very excited about this way of structuring the basic operations. In fact, folks are so excited people have started to look at hardware accelerations for Geometric Algebra.

If you'd like an example, here is a C++ template-based lib for Clifford Algebra. i can't remember if they make use of BLAS/LAPACK to effect and/or optimize stuff underneath or if this is a grounds up implementation.

Best wishes,

--greg

On Thu, Aug 27, 2009 at 10:44 AM, Maxime Lévesque <maxime [dot] levesque [at] gmail [dot] com> wrote:

 BTW, are there plans for meta programming in Scala ?

   http://groovy.codehaus.org/Compile-time+Metaprogramming+-+AST+Transformations

 That would be an alternative to a compiler plugin wouldn't it ?


On Thu, Aug 27, 2009 at 1:36 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
I think that a compiler plugin that would allow for basic macros would do the
trick in a situation like this. We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

For example, this will still yield a getter method x for class B and C:

trait A {
 def x: Int
}

class B extends A {
 val x: Int = 3
}
class C extends A {
 var x: Int = 3
}

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO. The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

On Thursday 27 August 2009 18:50:41 Rex Kerr wrote:
> The problem with creating simultaneously flexible and high-performance
> vector math code is not that it's hard to do, it's just that there is *so
> much* repetitive code to write that no sane person would try to do it by
> hand.  It's tricky to shift all that burden to the compiler via
> @specialized, because this is @very_very_specialized in terms of sensible
> operations that should happen quickly.
>
> But given that all the code follows from a much smaller set of basic
> operations and rules for what makes things run quickly, it seems like
> automatic code generation should in theory be possible.
>
> Hence, I think the right answer for now is to write a program that
> automatically generates a high performance small vector library.  I've
> played with this a little and there don't seem to be any show-stoppers for
> creating the leaves of the library (which are all final for performance,
> and have overloaded methods to do math with each other).  I haven't worked
> out how to cleanly auto-generate the right class hierarchy yet, but that
> shouldn't be too daunting.
>
> The hierarchy I'm thinking of now is:
>
> Scalar[T]   <-- need this for things like 2+v to work
>   ScalarTypename
> Vec[T] extends Product with Iterable[T]  // Anything else needed?
>   VecIsReference[T] extends Vec[T]
>   VecIsRational[T] extends Vec[T]
>   VecIsMutable[T] extends Vec[T]
>   VecTypename extends Vec[Typename]  (with VecIsRational[T])
>   Vec#[T] extends Vec[T]
>     Vec#Typename extends Vec#[Typename] with VecTypename
>       final Vec#T extends Vec#Typename
>       final MVec#T extends Vec#Typename with VecIsMutable[Typename]
>       final RefVec#T extends Vec#Typename with VecIsMutable[Typename]
>         with VecIsReference[Typename]
>
> Here, there'd be separate classes for Typename in {Int,Long,Float,Double}
> (perhaps abbreviated to I,L,F,D in the final leaf versions), and # would be
> {2,3,4,5,6} for optimal performance, given my benchmarking.
>
> The MVec2I series would be mutable; the RefVec2I series would be mutable
> and index into an array instead of carrying the values in itself (so you
> could store vector data in arrays for compactness and lose as little
> performance as possible).
>
> Overall, this would require a library with 97 classes (!!) if it went up to
> N=6 (stopping at N=3 leaves it at a more reasonable 46 classes).  The final
> classes at the leaves of the tree would have ~20 (tiny) methods per
> operator for static versions (~60 for mutable versions).
>
> By the way, Vec#[T] would be a non-abstract class, so you could actually
> use that if you wanted something that worked for your favorite type T and
> didn't require stellar performance; I also think you could mix in
> VecIsReference[T] and it would work, though I haven't actually written the
> code.
>
> Also by the way, I am using Vec instead of Vector because of the convention
> by people who write collections libraries that "a Vector is an Array, not
> something to do math on"--the namespace is by now hopelessly conflicted.
>
>   --Rex


David Flemström
david [dot] flemstrom [at] gmail [dot] com




--
L.G. Meredith
Managing Partner
Biosimilarity LLC
1219 NW 83rd St
Seattle, WA 98117

+1 206.650.3740

http://biosimilarity.blogspot.com
ichoran
Joined: 2009-08-14,
User offline. Last seen 2 years 3 weeks ago.
Re: Scala vector and matrix math library
On Thu, Aug 27, 2009 at 1:36 PM, David Flemström <david [dot] flemstrom [at] gmail [dot] com> wrote:
We can't really rely on an external tool in
this case, because it would be really tedious to carry around all the time.

That's why you write the Scala vector package producer in Scala (like I did for my trial project).  Then you don't have anything extra to carry around.
 
Oh and I don't really see how creating macroized classes would improve
performance. You will never be able to avoid that method call required for
element access (that we talked about before), for example, because then you
can't have polymorphism without throwing away either the immutable or the
mutable library, and "reference vectors" become plain impossible to implement.

I should have explained in more detail why there would be so many methods per operation.

trait A {
  def x : Int
  def +(a:A) : A
}

final class B(val x:Int) extends A {
  def +(a:A) = new B(x + a.x)   // a.x must use a getter
  def +(b:B) = new B(x + b.x)   // b.x can access field directly
  def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
}

final class C(private val a:Array[Int],private val i:Int) extends A {
  def x = a(i)
  def +(a:A) = new B(x + a.x)  // a.x is a getter, x need not be (haven't checked)
  def +(b:B) = new B(x + b.x)  // b.x is a field access
  def +(c:C) = new B(x + c.x)  // Should -optimise to a(i)+c.a(c.i)
}

So it is entirely true that if you want to use A in your code, you'll have to go through an extra method call and things will be slower.

But it is also entirely true that you are not required to use A unless you honestly don't know when you've got a B and when you've got a C.  You can mix B and C with no superfluous method calls because B and C know about each other and are final, and can use class-specific field access instead of falling back on the generic getter method in A.

This gives the best of both worlds: if you know your types and you know what you're doing, you can write code that is as fast as using primitives by hand.  If you want to be blind to immutability or references or whatever, only then do you need to use A and the only cost is that you'll pay an extra method call.

The only place you lose is in code size.

I guess that the only thing a macro could help doing would be to give us
purely native vector implementations for Int..Double, but that won't increase
performance significantly enough IMO.

Did you see all the benchmarks I posted?  For mutable vectors, extra method calls triple (or more) how long a simple arithmetic operation takes.  IMO a factor of three is quite significant.  I stopped using Python in favor of Java over factors of three, for example.

(Specifically, the relevant elapsed times for 2-vectors are:
  primitive+primitive (calculated on the fly) = 100
  primitive+primitive (from array) = 200
  final vector+primitive (calc on the fly) = 80
  final vector+final vector (from array) = 140
  trait+primitive (calc on the fly) = 400
  trait+trait (from array) = 400
)
 
The flexibility that Numeric gives is
just extraordinary and I think it would be good to try to use it, in some way,
even if this means introducing a "trait GenericVector[T] extends Vector { val
arith: Numeric[T]}" and having some vector implementations use Numeric and
some use primitives.

I agree.  Once it's stabilized this is definitely a thing to do.  (Not the only thing, but it'd make everyone's life a lot easier.)  I'm not sure whether one actually wants to include Numeric[T] as a field, though--this adds 4-8 bytes to every single vector.  Why not

trait GenericVector[T] extends Vector with Numeric[T] { ... }

and then create vectors with something like

val a = new GenericVector[BigInt] with Numeric.BigIntIsIntegral

if that can work (extra typing, but a bit of typing might be worth the space if you're using something smaller than a BigInt).  Numeric does not define operators directly, so there shouldn't be any risk of name collision.
   --Rex

ijuma
Joined: 2008-08-20,
User offline. Last seen 22 weeks 2 days ago.
Re: Scala vector and matrix math library

Hey Rex,

On Thu, 2009-08-27 at 14:46 -0400, Rex Kerr wrote:

> Did you see all the benchmarks I posted? For mutable vectors, extra
> method calls triple (or more) how long a simple arithmetic operation
> takes. IMO a factor of three is quite significant. I stopped using
> Python in favor of Java over factors of three, for example.

Is the code for the various benchmarks you have been talking about
available anywhere so that others can verify? HotSpot performance can
vary quite a bit depending on the machine, JVM settings, version, etc.
so it would be good to do that.

Best,
Ismael

Colin Bullock
Joined: 2009-01-23,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
Just to point out, it's not possible in scala to directly access a field; even when you use val or var, you are still going through generated getters/setters to support for uniform access.

final class B(val x:Int) extends A {
  def +(a:A) = new B(x + a.x)   // a.x must use a getter
  def +(b:B) = new B(x + b.x)   // b.x can access field directly
  def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
}

If you run javap on the above, you'll note the only difference in bytecode between the overloaded A and B variants of + is one instruction: invokevirtual in the case of B and invokeinterface for A. Indeed a small savings, but nothing to write home about; certainly not worth the pain of virtually identical overloaded methods.

 - Colin
David Flemström
Joined: 2009-08-10,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library
p, li { white-space: pre-wrap; }On Thursday 27 August 2009 20:46:02 Rex Kerr wrote:
> trait A {
>   def x : Int
>   def +(a:A) : A
> }
>
> final class B(val x:Int) extends A {
>   def +(a:A) = new B(x + a.x)   // a.x must use a getter
>   def +(b:B) = new B(x + b.x)   // b.x can access field directly
>   def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
> }
>
> final class C(private val a:Array[Int],private val i:Int) extends A {
>   def x = a(i)
>   def +(a:A) = new B(x + a.x)  // a.x is a getter, x need not be (haven't
> checked) def +(b:B) = new B(x + b.x)  // b.x is a field access
>   def +(c:C) = new B(x + c.x)  // Should -optimise to a(i)+c.a(c.i)
> }
Well, see, here's where I'm not so sure about how things work. Have you really tested this and seen that what you say in the comments in fact is true?


I know that Scala does "stabilization" (in lack of a better word) by seeing where getters can be removed entirely and where they have to stay being methods.


However, as long as you are extending something that already uses a getter, you can't "override" that getter with a stable val. Think about the following example:


val x: A = new B(3)


Obviously, when you call x.x, you will use a getter because that's what the trait has defined (and the JVM makes a difference between field accesses and method calls, so it cant transform the one into another except by HotSpoting. Same true for scalac), and ergo, the B class (even by itself) will in fact contain a getter ("def x = $x" where $x is some private generated variable) and not a pure final val, because the val is "unstable".


...so, all of the + methods above will in fact use a getter all of the time, unavoidably, with my line of reasoning. Please prove me wrong on this one, preferrably by publishing your test suite like Mr. Juma has proposed.

David Flemström
david [dot] flemstrom [at] gmail [dot] com
John Nilsson
Joined: 2008-12-20,
User offline. Last seen 42 years 45 weeks ago.
Re: Scala vector and matrix math library

On Thu, Aug 27, 2009 at 8:46 PM, Rex Kerr wrote:
> final class B(val x:Int) extends A {
>   def +(a:A) = new B(x + a.x)   // a.x must use a getter
>   def +(b:B) = new B(x + b.x)   // b.x can access field directly
>   def +(c:C) = new B(x + c.x)   // Not sure what actually happens here...
> }

Maybe this is already possible to implement by exploiting the type
system in some way. See
http://michid.wordpress.com/2008/10/29/meta-programming-with-scala-condi...

BR,
John

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland