Parsing command line arguments in a "scalaesque" way

6 replies
Lukas Pustina
Joined: 2009-12-01
Hello everybody,
I’m quite new to Scala and I’m using it in a production environment for the first time. I still lack the intuition for Scala that every experienced programmer develops for a favorite language over time, and I was hoping you could help me make my code more "scalaesque" (as in "pythonic"). I hope you don’t mind.
In my class hierarchy, I pass command line parameters from abstract classes down to lower, more specific classes. Each class in the hierarchy evaluates the parameters it is interested in and passes the remaining arguments to the next class. The method looks as follows:
  override protected def parseArgs( args: String* ): Array[String] = {
    val leftArgs = super.parseArgs(args: _*)
    var remainingArgs: List[String] = Nil
    val it = leftArgs.elements
    for ( s <- it ) {
      s match {
        case "-f" => filename = it.next
        case "-v" => verbose = true
        case "-s" => someother_parameter = it.next
        case _ => remainingArgs ::= s
      }
    }
    remainingArgs.reverse.toArray
  }
Even though the code works perfectly, to my padawan eyes it doesn’t look optimal. It smells. Do you have any suggestions for making it less verbose and more scalaesque?
I’m happy about any suggestions,
Lukas
PS: I’m not looking for a "better" command line parsing framework, but for a better understanding of Scala.
--
Lukas Pustina
University of Bonn                  Tel:    +49-228-73-4838
Institute of Computer Science IV    Fax:    +49-228-73-4571
Roemerstr. 164                      E-Mail: pustina [at] cs [dot] uni-bonn [dot] de
53117 Bonn                          http:   http://iv.cs.uni-bonn.de/pustina/





dcsobral
Joined: 2009-04-23
Re: Parsing command line arguments in a "scalaesque" way
I don't think this is bad, but the pattern of your code suggests a partial function-like solution.

Say, for instance, you have this:

val pf: PartialFunction[Any, Unit] = {
  case ("-f", arg: String) => filename = arg
  case "-v" => verbose = true
  case ("-s", arg: String) => someotherParameter = arg
}

You can then do this:

val (argsToBeProcessed, remainingArgs) = args partition (pf isDefinedAt _)
argsToBeProcessed foreach pf

It would require some other process to get through the args first and convert it to a List[Any], which can either be (String, String) or String, depending on whether an arg is required or not.

An interesting thing about it is that if you have a list of such partial functions, you can then do this:

val onePf = pfs reduceLeft (_ orElse _)

So various modules can provide their own args requirements, and then you can join all of them into a single partial function, which you can use as described above.

Alternatively, you could structure your code like this:

val pf: PartialFunction[List[String], List[String]] = {
  case "-f" :: (arg: String) :: tail => filename = arg; tail
  case "-v" :: tail => verbose = true; tail
  case "-s" :: (arg: String) :: tail => someotherParameter = arg; tail
}

You still have partial functions, and you still can concatenate them with "orElse". But instead of using "partition" and requiring some other code to do preprocessing on the args list, you can do this:

def processArgs(args: List[String], pf: PartialFunction[List[String], List[String]]): List[String] = args match {
  case Nil => Nil
  case _ => if (pf isDefinedAt args) processArgs(pf(args)) else args.head :: processArgs(args.tail)
}

Each call to "pf" extracts used arguments from the list and returns the rest. If there are no usable arguments (tested by isDefinedAt), then we separate the first argument in the list to be returned, and process the rest. If there are usable arguments, we let pf consume them and process whatever was remaining.

Of course, there IS a pretty good solution for argument processing, but you stated that was not your interest. :-)
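The first approach leaves open the step that turns the raw arguments into a List[Any]. Here is a minimal sketch of that step, assuming the set of flags that take a value is known up front. The names PartitionSketch, flagsWithValue, group and parse are made up for illustration, and the three vars stand in for the fields used in the thread.

```scala
object PartitionSketch {
  // Hypothetical settings, standing in for the fields used in the thread.
  var filename: String = ""
  var verbose: Boolean = false
  var someotherParameter: String = ""

  // Assumption: these flags consume the string that follows them.
  val flagsWithValue = Set("-f", "-s")

  // Pair each value-taking flag with its value; keep everything else as a bare String.
  def group(args: List[String]): List[Any] = args match {
    case flag :: value :: rest if flagsWithValue(flag) => (flag, value) :: group(rest)
    case other :: rest => other :: group(rest)
    case Nil => Nil
  }

  val pf: PartialFunction[Any, Unit] = {
    case ("-f", arg: String) => filename = arg
    case "-v" => verbose = true
    case ("-s", arg: String) => someotherParameter = arg
  }

  // Apply pf to everything it understands and return what it does not.
  def parse(args: List[String]): List[Any] = {
    val (argsToBeProcessed, remainingArgs) = group(args).partition(pf.isDefinedAt(_))
    argsToBeProcessed.foreach(pf)
    remainingArgs
  }
}
```

With this sketch, PartitionSketch.parse(List("-v", "-f", "a.txt", "x")) sets verbose and filename and returns List("x"). A value-taking flag at the very end of the list, with no value after it, is left unprocessed.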
--
Daniel C. Sobral

I travel to the future all the time.
Tim Perrett 2
Joined: 2009-04-06
Re: Parsing command line arguments in a "scalaesque" way

You might find this option parser helpful:

http://gist.github.com/246481

To be honest, I've been meaning to make a bunch of modifications to
that for some time (argument sorting and less mutability)... it was
originally from Aaron Harnly so the credit goes to him for the
original effort.

Cheers, Tim


Ricky Clarkson
Joined: 2008-12-19
Re: Parsing command line arguments in a "scalaesque" way

http://github.com/paulp/optional worked really well for me, though it
looks like http://github.com/alexy/optional is relevant. paulp's
optional deliberately works on 2.7.x, and alexy's apparently works on
2.8. I have only tried paulp's.

I know you said you wanted to learn Scala, not an options framework,
but I do think this is worth a look.


Lukas Pustina
Joined: 2009-12-01
Re: Parsing command line arguments in a "scalaesque" way
Hello Daniel,
Thanks for your answer. I really like the idea of using partial functions, but I haven’t fully understood it yet. I have a follow-up question and a question about the code I sent previously.
> Alternatively, you could structure your code like this:
>
> val pf: PartialFunction[List[String], List[String]] = {
>   case "-f" :: (arg: String) :: tail => filename = arg; tail
>   case "-v" :: tail => verbose = true; tail
>   case "-s" :: (arg: String) :: tail => someotherParameter = arg; tail
> }
>
> You still have partial functions, and you still can concatenate them with "orElse". But instead of using "partition" and requiring some other code to do preprocessing on the args list, you can do this:
>
> def processArgs(args: List[String], pf: PartialFunction[List[String], List[String]]): List[String] = args match {
>   case Nil => Nil
>   case _ => if (pf isDefinedAt args) processArgs(pf(args)) else args.head :: processArgs(args.tail)
> }

Is this a mistake, or do I just not get it? Why is only the remaining list of strings passed to processArgs, with the partial function omitted, in the second case clause?


> In my class hierarchy, I pass command line parameters from abstract classes down to lower, more specific classes. Each class in the hierarchy evaluates the parameters it is interested in and passes the remaining arguments to the next class. The method looks as follows:
>
>   override protected def parseArgs( args: String* ): Array[String] = {
>     val leftArgs = super.parseArgs(args: _*)
>     var remainingArgs: List[String] = Nil
>     val it = leftArgs.elements
>     for ( s <- it ) {
>       s match {
>         case "-f" => filename = it.next
>         case "-v" => verbose = true
>         case "-s" => someother_parameter = it.next
>         case _ => remainingArgs ::= s
>       }
>     }
>     remainingArgs.reverse.toArray
>   }

In this code, I construct a List by repeatedly prepending to it. Since this reverses the order of the String elements, I have to reverse the list before returning it. Is this "bad style" or something that is usually done? How much does it cost? Further, is it sensible to go this way: use a list, reverse it, and transform it to an Array? Is there something more efficient?
Thanks,
Lukas






dcsobral
Joined: 2009-04-23
Re: Parsing command line arguments in a "scalaesque" way


On Wed, Dec 2, 2009 at 8:57 AM, Lukas Pustina <pustina [at] cs [dot] uni-bonn [dot] de> wrote:
> Is this a mistake, or do I just not get it? Why is only the remaining list of strings passed to processArgs, with the partial function omitted, in the second case clause?

I might have made a mistake, granted, but I think this is correct. I tested most of that code, but not all of it.
Here is how we use this. We call processArgs(args, pf), where args is the list of args yet to be processed, and pf is the partial function that's processing them.  This function will apply pf to _all_ args, but it will use recursion to do so. Any args that cannot be processed by pf will be returned.
Now, notice that pf will process at most one argument, which may be composed of more than one string, and return the rest. For example:
case "-f" :: (arg: String) :: tail => filename = arg; tail

That takes the first two strings of the list ("-f" and whatever follows it), uses them to assign "filename", and then returns the "tail" of the list.
So here is how processArgs works:
1. If there are no arguments to be processed, bail out returning Nil (if there were no arguments to be processed, no arguments were left unprocessed).
2. So, there are arguments to be processed, but can pf process them? Verify this with "isDefinedAt". Note that this will check only the NEXT argument in the list. If the list is List("-x", "-f", "filename"), it won't process, because pf is trying to match against the beginning of it.
3. If it can be processed, then process it, and return the result of processArgs on the list returned by pf -- the remaining arguments.
4. If it cannot be processed, return that element that could not be processed (args.head -- "-x" in the example above) concatenated with the result of processArgs on the remainder of the list (args.tail).
Let's consider how it would process that example list, List("-x", "-f", "filename").
1. List("-x", "-f", "filename") is not Nil, so go to the default case.
2. pf.isDefinedAt(List("-x", "-f", "filename")) is false, because none of pf's cases can match a list beginning with "-x".
3. Return "-x" :: processArgs(List("-f", "filename")) (args.head and args.tail respectively)
Now we are processing List("-f", "filename"), which was passed to processArgs by the recursion above.
4. List("-f", "filename") is not Nil, so go to the default case.
5. pf.isDefinedAt(List("-f", "filename")) is true, because one of pf's cases can match it.
6. return processArgs(pf(List("-f", "filename")))
Now we are inside pf:
7. List("-f", "filename") matches "-f" :: (arg: String) :: tail. arg is bound to "filename", and tail is bound to Nil.
8. Set filename, return tail (Nil).
We are back to processArgs again, processing the result of pf, which was Nil.
9. Nil is Nil, return Nil
Back to step 6,
10. return Nil
Back to step 3,
11. return "-x" :: Nil
And that's it.
Note that we cannot call "pf" on "tail" because we do not know if "pf" can process "tail". We have to recurse first, which will then test if it is possible.  
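As declared, processArgs takes two parameters, so the recursive calls need to pass pf along. The version quoted earlier, with a single argument in the recursive calls, would not compile as written. Here is a sketch with that one change, otherwise following the steps above. The enclosing object, the Handler alias and the three vars are assumptions added only to make it self-contained.

```scala
object ProcessArgsSketch {
  // Hypothetical settings, standing in for the fields used in the thread.
  var filename: String = ""
  var verbose: Boolean = false
  var someotherParameter: String = ""

  // Shorthand for the handler type used throughout the discussion.
  type Handler = PartialFunction[List[String], List[String]]

  // Each case consumes one option from the front of the list and returns the rest.
  val pf: Handler = {
    case "-f" :: arg :: tail => filename = arg; tail
    case "-v" :: tail => verbose = true; tail
    case "-s" :: arg :: tail => someotherParameter = arg; tail
  }

  def processArgs(args: List[String], pf: Handler): List[String] = args match {
    case Nil => Nil
    case _ =>
      // pf is passed on in both recursive calls.
      if (pf.isDefinedAt(args)) processArgs(pf(args), pf)
      else args.head :: processArgs(args.tail, pf)
  }
}
```

Calling ProcessArgsSketch.processArgs(List("-x", "-f", "filename"), ProcessArgsSketch.pf) follows the trace above: it returns List("-x") and sets filename to "filename".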




> In this code, I construct a List by repeatedly prepending to it. Since this reverses the order of the String elements, I have to reverse the list before returning it. Is this "bad style" or something that is usually done? How much does it cost? Further, is it sensible to go this way: use a list, reverse it, and transform it to an Array? Is there something more efficient?

Reverse is O(n). Calling it once at the end of an algorithm is fine. Calling it repeatedly is not fine. In your case, I assume there are many classes. If the parameters are mostly optional, most classes will not remove any parameter, but the parameter list will be very small or empty. If they are mostly non-optional, most classes will remove one parameter, but the parameter list is still small.
Neither case is particularly good, but neither is particularly bad, since a parameter list is small by a computer's standards, and since this happens only once, before the program starts.
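For comparison, here is a sketch of two ways to collect the leftover arguments in order. The first prepends and reverses once, as in the original method. The second appends to a scala.collection.mutable.ListBuffer, which needs no reverse. The isOption predicate is a hypothetical stand-in for the match on "-f", "-v" and "-s". Both variants do O(n) work, so for a handful of arguments the choice is mostly a matter of taste.

```scala
import scala.collection.mutable.ListBuffer

object RemainingArgsSketch {
  // Prepend to an immutable List (constant time each), then reverse once at the end.
  def prependThenReverse(args: Seq[String], isOption: String => Boolean): Array[String] = {
    var remaining: List[String] = Nil
    for (s <- args if !isOption(s)) remaining ::= s
    remaining.reverse.toArray
  }

  // Append to a ListBuffer (constant time each); the order is already right.
  def appendInOrder(args: Seq[String], isOption: String => Boolean): Array[String] = {
    val remaining = ListBuffer.empty[String]
    for (s <- args if !isOption(s)) remaining += s
    remaining.toArray
  }
}
```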
In my example, the recursion is not tail-recursive, which is not good but, again, because parameter lists are small, it shouldn't matter. As a bonus, however, I do not have to reverse the list. But that is not the main benefit.
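If tail recursion were wanted, one possible variant carries the unhandled arguments in an accumulator and reverses it once at the end, trading the "no reverse" bonus for constant stack depth. This is only a sketch. The names TailRecSketch, Handler, loop and unhandled are illustrative and not from the thread.

```scala
import scala.annotation.tailrec

object TailRecSketch {
  type Handler = PartialFunction[List[String], List[String]]

  def processArgs(args: List[String], pf: Handler): List[String] = {
    // unhandled collects rejected arguments in reverse order.
    @tailrec
    def loop(rest: List[String], unhandled: List[String]): List[String] = rest match {
      case Nil => unhandled.reverse
      case _ if pf.isDefinedAt(rest) => loop(pf(rest), unhandled)
      case head :: tail => loop(tail, head :: unhandled)
    }
    loop(args, Nil)
  }
}
```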
The idea is that you call each class and get a partial function back from each of them. You then concatenate these partial functions with "orElse". Finally, you process the argument list with the result. Since all your handlers are concatenated, the only parameters that will not be processed are the ones not supported.
This is clearly more efficient, as the list is traversed only once. In your case, you traverse the list once for every handler that exists, though the list slowly shrinks.
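A sketch of that idea, assuming two hypothetical classes (FileOptions and LogOptions) that each expose the options they understand as a partial function. The handlers are joined with orElse and the argument list is walked once with the result. Note that reduceLeft requires at least one handler.

```scala
object CombinedHandlersSketch {
  type Handler = PartialFunction[List[String], List[String]]

  // Hypothetical classes; each one only knows about its own options.
  class FileOptions {
    var filename: String = ""
    val handler: Handler = { case "-f" :: arg :: tail => filename = arg; tail }
  }

  class LogOptions {
    var verbose: Boolean = false
    val handler: Handler = { case "-v" :: tail => verbose = true; tail }
  }

  def processArgs(args: List[String], pf: Handler): List[String] = args match {
    case Nil => Nil
    case _ =>
      if (pf.isDefinedAt(args)) processArgs(pf(args), pf)
      else args.head :: processArgs(args.tail, pf)
  }

  // Join all handlers into one, then process the list a single time.
  // Assumes handlers is non-empty, since reduceLeft throws on an empty list.
  def parse(args: List[String], handlers: List[Handler]): List[String] =
    processArgs(args, handlers.reduceLeft(_ orElse _))
}
```

Only arguments that none of the handlers accepts come back from parse, which is the "not supported" remainder described above.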

James.Strachan
Joined: 2009-07-08
Re: Parsing command line arguments in a "scalaesque" way

Tim Perrett-3 wrote:
>
> You might find this option parser helpful:
>
> http://gist.github.com/246481
>
> To be honest, I've been meaning to make a bunch of modifications to
> that for some time (argument sorting and less mutability)... it was
> originally from Aaron Harnly so the credit goes to him for the
> original effort.
>
> Cheers, Tim
>

Thanks Tim - I took this code and hacked it a little so that arguments and
options print themselves out nicely, and added a maven/sbt build & tests.

If anyone wants to keep hacking, the code is here...
http://github.com/jstrachan/scopt

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland