To Subset or Not To Subset?

16 replies
Randall R Schulz

Hi,

Another suggestion keeps coming up in my company's discussions about
adopting Scala: Should the coding requirements forbid use of parts of
the language?

Personally, I find this notion so bizarre I can't even comprehend what
motivates it. I think it's some sort of fear of the unknown, or
something.

Has anyone dealt with this? If so, how did you address it? Did you
accept a subsetting of the language in your organization? What are the
pros and cons of doing so?

Randall Schulz

John Nilsson
Re: To Subset or Not To Subset?

I guess there is an argument to be made for discouraging use of some features. Some features are often a code smell. For example, integer division is a feature of Java that I usually encounter as a subtle bug.
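
A minimal sketch of how integer division can hide a bug. Scala inherits Java's semantics here; the object and method names are hypothetical, chosen only for illustration:

```scala
object IntegerDivisionDemo {
  // Int / Int truncates toward zero, so the fractional part is silently lost.
  def ratioTruncated(hits: Int, total: Int): Int = hits / total

  // Converting one operand to Double first keeps the fraction.
  def ratio(hits: Int, total: Int): Double = hits.toDouble / total
}
```

With hits = 1 and total = 2, the first method yields 0 while the second yields 0.5, which is usually what the author of the expression had in mind.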

BR
John

Jim Powers
Re: To Subset or Not To Subset?
On Thu, Jun 9, 2011 at 10:32 AM, Randall R Schulz <rschulz [at] sonic [dot] net> wrote:
> Personally, I find this notion so bizarre I can't even comprehend what
> motivates it. I think it's some sort of fear of the unknown, or
> something.

Agreed it's bizarre, but not completely uncommon.  For instance I understand that Google has style guidelines to limit what features of C++ you can use in production code.  I don't work at Google so I don't know the details.
In certain security contexts restrictions are placed on what language/library features can be used because the compilers and runtimes have to get a "certification" and you can use only those features certified.
Not my taste though.
--
Jim Powers
ichoran
Re: To Subset or Not To Subset?
I have never had to deal with this, but I would argue the following:

* Scala does not have any gratuitous and dangerous features.  The only features that are cleanly separable from its core functionality are in-line XML and the use of the Symbol type; otherwise, things like higher-kinded types and implicit conversions are a central part of why using Scala is an advantage.

* You may wish to have coding guidelines--stress on _guidelines_--for those parts of the language that are less familiar and where people cannot necessarily be counted on to do something sensible, either because the feature is novel (good practice needs to be learned) or because it's different from Java (bad practice needs to be unlearned).  Some guidelines may include things like (depending on personal preference):
  - Use symbolic method names only for
    . Mathematical operations (using standard symbols, or as close as you can get)
    . Collections operations (using the Scala collections standards)
    . Parsing (using the Scala collections standards)
    . Actors (using the standard for any familiar actor library)
  - Reuse the same variable name in chained collections operations if the data is the same, and a different variable name if the data has changed:
    "Hiya".map(c => c * c.asDigit).map(i => i*(i+1)).filter(i => (i&0x2)==0x2) // Yes
    "Hiya".map(x => x * x.asDigit).map(x => x*(x+1)).filter(x => (x&0x2)==0x2) // No
  - Handle typical cannot-proceed conditions with Option (for simple failure) or Either (where you need to know what went wrong) instead of with exceptions.
  - Do not use implicit conversions to convert between simple existing types.
  - Do not use implicit arguments with simple existing types.
  - Declare a case class instead of using tuples of size 4 or larger.  Do not nest tuples more than two deep.  Nest case classes instead of having a single class with more than 6 arguments.
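
As a rough sketch of the Option/Either and case-class guidelines above (the names parsePort, parsePortEither and Endpoint are made up for illustration and are not from the thread):

```scala
object FailureHandlingDemo {
  // Simple failure: None says only that no valid port could be read.
  def parsePort(s: String): Option[Int] =
    try {
      val n = s.trim.toInt
      if (n >= 0 && n <= 65535) Some(n) else None
    } catch {
      case _: NumberFormatException => None
    }

  // When the caller needs to know what went wrong, Left carries a message.
  def parsePortEither(s: String): Either[String, Int] =
    try {
      val n = s.trim.toInt
      if (n >= 0 && n <= 65535) Right(n) else Left("port out of range: " + n)
    } catch {
      case _: NumberFormatException => Left("not a number: " + s)
    }

  // A named case class in place of a 4-tuple (String, Int, String, Boolean).
  final case class Endpoint(host: String, port: Int, path: String, secure: Boolean)
}
```

Callers can then pattern match on the result or use getOrElse, and the fields of Endpoint are read by name rather than as _1 through _4.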

If the pro-subset folk(s) are amenable to rational argument, for any feature that is considered, the thing to do is to show a use case where that feature is _clearly_ the right thing to do; then suggest that if necessary, a guideline is established that directs people to use the feature where it is the right thing to do.

  --Rex

paulbutcher
Re: To Subset or Not To Subset?

On 9 Jun 2011, at 16:52, Jim Powers wrote:
> Agreed it's bizarre, but not completely uncommon. For instance I understand that Google has style guidelines to limit what features of C++ you can use in production code. I don't work at Google so I don't know the details.

In a C++ context, this can certainly make sense. For a bunch of reasons. These don't make sense for Scala, but for context:

a) Some language features may not be supported on one of the platforms you need to support. For example, we forbid the use of RTTI in our C++ code because Android doesn't support it.

b) Some language features are just plain broken. For example exception specifications.

c) Some language features existed for good reason "back in the day" but have now outlived their usefulness. For example setjmp/longjmp.

--
paul.butcher->msgCount++

Joshua.Suereth
Re: To Subset or Not To Subset?
This might come from twitter's style guide, which does reduce the allowed features, IIRC.   Google tends to reduce languages (like C++) to choose one style of development.  I find these to be anti-patterns to productivity.  The beauty of Scala is that you can use a style appropriate to the problem you're working on and switch between styles as necessary.
Also, the collections library has already gone beyond what most companies would limit themselves to, so it's not like you can escape forever without learning how to use these advanced features...

Meredith Gregory
Re: To Subset or Not To Subset?
Dear Randall,
i would second and amplify Josh's argument regarding the collections lib. i would hold that up as the standard for feature set considerations. 
Best wishes,
--greg

H-star Development
Re: To Subset or Not To Subset?

Subsetting can keep newcomers from messing things up more than necessary. The rule
doesn't make sense for advanced programmers.

Bernd Johannes
Re: To Subset or Not To Subset?

Hi Randall,

> Another suggestion keeps coming up in my company's discussions about
> adopting Scala: Should the coding requirements forbid use of parts of
> the language?
>
> Personally, I find this notion so bizarre I can't even comprehend what
> motivates it. I think it's some sort of fear of the unknown, or
> something.

I understand that motivation, and some months ago I would have agreed. Some
things are easy in Scala, and some obviously aren't, and might even look
outright dangerous.

But now I think the solution is training, not subsetting. Subsetting
means that you have to work around artificial constraints, probably
producing code that doesn't solve a given task the way it should in Scala.
At the same time your team does not evolve and stays... well, constrained.

Maybe you can assess or estimate the knowledge level of your colleagues and
suggest appropriate work packages for each knowledge level.

For "everyday scripting" and small projects Scala works like a charm, and I
guess anybody can come to terms with such minimal design requirements
(including me :-).

High performance / reentrant coding or clever modular design in a complex
project is another topic. But here the chosen language is only a feature.
If somebody doesn't know how to do it right he will screw things up with any
language.
And if Joe knows what it should look like he will dance with the compiler until
he's there.

But from your own experience (if you still remember your scala beginning days)
you might want to discourage / encourage some specific practices.

I chose, for example, to almost always specify the return type of a method.
In my first Scala steps I didn't, and I faced bizarre compiler type errors
in places where I couldn't figure out what was going wrong. So I learned that
type inference changes can pop up as errors in rather unexpected places...
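
A small sketch of that convention; the object and method names are hypothetical and only illustrate the practice:

```scala
object ReturnTypeDemo {
  // The declared result type documents the intent. If a later edit to the
  // body produces something other than Option[String], the compiler reports
  // the mismatch here, at the definition, not at a distant call site.
  def firstWord(line: String): Option[String] =
    line.trim.split("\\s+").headOption.filter(_.nonEmpty)

  // Callers rely on the declared type, not on whatever was inferred.
  def firstWordLength(line: String): Int =
    firstWord(line).map(_.length).getOrElse(0)
}
```

Leaving out the annotation on firstWord would still compile, but its inferred type would then follow any change to the body, and the resulting error would show up wherever the method is used.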

So coding conventions / best practices / black pages for me is the way to go.

What is still missing?
I think Scala is still not settled regarding "template solutions" / "design
patterns". There are still many things to try and compare to find the best way
to get things done. And as Scala is more expressive than Java it will take a
little longer to get there.

Just my 5 cents.
Greetings
Bernd

Philippe Lhoste
Re: To Subset or Not To Subset?

On 09/06/2011 18:41, Paul Butcher wrote:
> c) Some language features existed for good reason "back in the day" but have now
> outlived their usefulness. For example setjmp/longjmp.

I never used this pair in some years of C programming. But I understand they are far from
being outdated. I know that Lua uses it for error recovery / handling. I have read that
some languages, like Scheme, use them to implement continuations. Their usage is very rare
and quite specialized (language implementations only?) but not outlived.

I criticize only the taken example, not the point itself... :-)

Philippe Lhoste
Re: To Subset or Not To Subset?

On 09/06/2011 16:32, Randall R Schulz wrote:
> Another suggestion keeps coming up in my company's discussions about
> adopting Scala: Should the coding requirements forbid use of parts of
> the language?

Martin Odersky defined levels of mastery of Scala, both for the plain coder and for the
library designer.
It can make sense to enforce these guidelines, depending on the kind of job and on the
level of the coder. I wouldn't set that as hard rules, as long as there is a good peer
review of each check-in. If a newbie uses a Manifest, why not, as long as he shows a good
understanding of the concept? But it is also likely to be misused. I can imagine an
enthusiastic newbie abusing parallel collections to massage a list of a hundred files at
most, etc.

Back when our company adopted Java 1.5 syntax, there were some rules: avoid static imports,
avoid auto-boxing, etc. Today we make some cautious breaches of these rules, but they
served as warnings and training wheels... :-)

Kevin Wright 2
Re: Re: To Subset or Not To Subset?
If your goal is to reduce complexity and produce more readable code, then why not target that explicitly, instead of simply stating that "this particular subset will produce simpler code"?
This kind of thing is rampant in medicine and the whole "nutritional supplements" industry.  It's known as a surrogate outcome and is (to put it lightly) abused [1].  I'd like to think we're better than that!
Just measure the complexity directly.  Perform code reviews and reject any code that the reviewers find hard to understand.  In cases of doubt, invite more reviewers and take a vote; sometimes the correct resolution is simply to add a few comments.
If we measure the surrogate outcome (language subset used), then there's a very real risk of either producing no benefit whatsoever, or of actively harming the codebase by denying use of a language feature that is obviously the cleanest solution to some given problem.

[1] For example, clinical studies have clearly shown that people with naturally higher levels of anti-oxidants in their blood are at lower risk of heart attack. It therefore seems obvious that supplementing with anti-oxidants will be beneficial, and so many pills are explicitly marketed as being anti-oxidants.
In practice, these supplements seem to upset the balance in some bio-regulation mechanism; as a result, they actually increase your risk of heart attack.  If you read the backs of bottles and adverts, you'll notice that they only ever claim the surrogate outcome (higher anti-oxidant levels).  The manufacturers then cleverly arrange for such adverts to accompany magazine articles mentioning the benefits of intrinsically higher levels, and rely on readers to make the false assumption that these pills will benefit you.
I'd recommend reading Ben Goldacre's "Bad Science" for more of this kind of stuff. It really is a fascinating book, and contains a lot that is relevant to our industry.

Peter C. Chapin 2
RE: To Subset or Not To Subset?

Subsetting makes sense in some contexts, but I'm not sure about Scala. For
example, Ada is one of the few (the only?) languages that provide a standard
way of creating subsets. Using pragma Restrictions you can tell the compiler
to disable certain collections of features depending on your needs. Why do
this?

1. Using a subset might allow for a simplified runtime environment, an
important feature if you are targeting a small embedded system. For example,
if your Ada compiler supports the "high integrity" system annex you can turn
off exceptions and then you don't need to deal with exception handling in
the run time system.

2. Using a subset can allow for deeper static analysis of program
properties. Turning off exceptions, for example, makes flow control easier
to analyze. Ada's high integrity system annex also allows you to disable
dynamic memory allocation so that static bounds on memory consumption can be
more easily computed (by the compiler or associated tools).

Of course doing this basically disables your ability to use the standard
library, at least in the examples above, but the people who are interested
in this feature are usually quite happy to live with that. In part it's a
cultural thing. It also works because Ada's intended application domain is
quite different than Scala's.

These comments probably don't apply at all to the case of work-a-day coding
standards (in either language).

Peter

dcsobral
Re: Re: To Subset or Not To Subset?

On Fri, Jun 10, 2011 at 06:33, Philippe Lhoste wrote:
> On 09/06/2011 18:41, Paul Butcher wrote:
>>
>> c) Some language features existed for good reason "back in the day" but
>> have now
>> outlived their usefulness. For example setjmp/longjmp.
>
> I never used this pair in some years of C programming. But I understand they
> are far from being outdated. I know that Lua uses it for error recovery /
> handling. I have read that some languages, like Scheme, use them to
> implement continuations. Their usage is very rare and quite specialized
> (language implementations only?) but not outlived.
>
> I criticize only the taken example, not the point itself... :-)

Anyone who hasn't used -- or come into contact with code that uses --
setjmp/longjmp is way too far from the OS level. :-)

Randall R Schulz
Re: To Subset or Not To Subset?

On Thursday June 9 2011, Randall R Schulz wrote:
> Hi,
>
> Another suggestion keeps coming up in my company's discussions about
> adopting Scala: Should the coding requirements forbid use of parts of
> the language?
>
> ...

Again, thank you all for the great discussion.

(And please don't let this bring it to a halt!)

Randall Schulz

Justin du coeur
Re: To Subset or Not To Subset?
On Fri, Jun 10, 2011 at 8:36 AM, Peter C. Chapin <PChapin [at] vtc [dot] vsc [dot] edu> wrote:
> Subsetting makes sense in some contexts, but I'm not sure about Scala. For
> example Ada is one of the few (the only?) language that provides a standard
> way of creating subsets. Using pragma Restrictions you can tell the compiler
> to disable certain collections of features depending on your needs. Why do
> this?

Hmm.  To me, the two reasons you give are at right angles to why a typical company would want to subset, which has little to do with technical motives.  Indeed, I find much of this discussion a bit odd, because it feels like it has missed the point.  If I was running a mammoth engineering division, the reason I would think about subsetting would be *to make the code easier to understand*.
This ties very, very closely to the discussion about entry-level programmers.  The thing is, a *big* company typically has a zillion programmers, most of them doing relative drudgework.  These aren't the high-flyers who want to learn all the ins and outs of Scala -- these are people who have done a certificate in programming, *maybe* a BS, and are just trying to do their job as directed.  They're doing very routine application programming.
If I'm running a hundred such programmers, I want an extremely well-defined training program, that I am confident they can pick up quickly *and* I am confident will suffice to let them do their jobs.  And yes, it is entirely likely that I am willing to sacrifice a bit of code quality for that -- it's a valid tradeoff.
So in such a circumstance, I'm likely to want to take Martin's definition of the features relevant to the basic (or more likely mid-level) application programmer, and reify those as rules or at least guidelines for my application code.  *Personally* I would just make those guidelines, but I suspect the average engineering manager wants something more hard-and-fast.  
And the sole motivation for that is to avoid surprises in the code.  I want everyone writing the application code at a roughly similar level, in a roughly similar way, because I *want* to be able to treat my employees as interchangeable, so I want code that I am confident they will all understand.  And that probably means declaring some of Scala's more entertaining features off-limits for that application code.  (Library coders probably get freer rein, but they're more senior people.)
Mind, all of the above is why I don't work for large companies: I'm a startup guy, in part because I always want to be able to use the most-correct solution to hand, without worrying about this sort of stuff.  I'm playing Devil's Advocate here -- but I suspect I'm correct...
Erik Engbrecht
Re: To Subset or Not To Subset?

I don't think subsetting the language itself makes much sense, but I think it would be worthwhile to demarcate sections of the standard library that shouldn't be used because they aren't actively maintained, suffer some significant design flaws, or seem to be a common source of bugs or WTF mailing list posts.  There also may be places where using the features is ok, but attempts shouldn't be made to extend them because doing so requires high levels of wizardry that are better applied to other problems.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland