questioning FP
Hi All,
(I welcome any sort of replies, even containing foul language, as
long as there's also information inside.)
I'm about to give a lecture on FP. Now, some concepts in FP I love, but for some I have a hard time finding a good angle. Specifically, the IO monad.
Consider the following imperative code:
service.fetchData
service.close
Obviously, both method calls create side effects.
Now (discarding iteratees), let's consider doing this with the IO monad:
for (data <- service.fetchData;
_ <- service.close)
yield data
where all methods are changed to return IO[Data], IO[Unit] and also the calling code must now return an IO[X] for some X.
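For readers who want something concrete to run, here is a minimal sketch of the IO version above. The IO class, the Service class and its log are assumptions made up for illustration (this is not scalaz's IO); the point is only that building the for comprehension performs nothing until unsafePerformIO is called.

```scala
object IoSketch {
  // Hypothetical, minimal IO type: a description of an action that only
  // runs when unsafePerformIO is called.
  final class IO[A](run: () => A) {
    def unsafePerformIO(): A = run()
    def map[B](f: A => B): IO[B] = new IO[B](() => f(run()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO[B](() => f(run()).unsafePerformIO())
  }
  object IO {
    def apply[A](a: => A): IO[A] = new IO[A](() => a)
  }

  type Data = String

  // Hypothetical service that records what it did, so the order is observable.
  final class Service {
    val log = scala.collection.mutable.ListBuffer.empty[String]
    def fetchData: IO[Data] = IO { log += "fetch"; "payload" }
    def close: IO[Unit] = IO { log += "close"; () }
  }

  // Same shape as the for comprehension in the post.
  def fetchAndClose(service: Service): IO[Data] =
    for {
      data <- service.fetchData
      _    <- service.close
    } yield data
}
```

Calling fetchAndClose only builds a value; the service's log stays empty until the caller runs the result.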
The question is, what have we gained?
Instead of sequencing by line order, we're explicitly sequencing in the for comprehension. Not much of a gain here, since any mistake that causes code to be written in the wrong order in the imperative case can also be made in the functional case.
I also don't see anything that makes the code less prone to other mistakes, such as forgetting to close or calling close twice. Obviously, there are ways to fix this, such as service.withData(f: Data => X), but again, the more imperative style of silently calling close inside withData doesn't lose to the functional style that must return IO[X].
What am I missing? I'd appreciate any other examples which show how the use of IO is superior by making code that is more robust.
Regards,
Ittay
Re: Re: questioning FP
On Wednesday, October 12, 2011 9:58:40 AM UTC-4, Adriaan Moors wrote:
Traverse is already polymorphic in the effect. It needs to be Applicative specifically because it describes how effects are combined. That is, Applicative provides a distributive law, allowing us to distribute M over T. When we traverse a data structure, we are lifting its constructors into the M idiom so we can combine the effects on the elements into a larger effect on the whole structure.
But it's fun to think about whether there's an even more abstract way to describe such a distributive law than the pair Applicative[M] and Traverse[T].
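To make the "lifting constructors into the M idiom" remark concrete, here is a small sketch of traverse for List. The Applicative trait below is a cut-down, hypothetical definition (not the scalaz one), and parsePositive is an invented example function.

```scala
object TraverseSketch {
  // Hypothetical, minimal Applicative type class.
  trait Applicative[M[_]] {
    def pure[A](a: A): M[A]
    def map2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C]
  }
  object Applicative {
    implicit val optionApplicative: Applicative[Option] =
      new Applicative[Option] {
        def pure[A](a: A): Option[A] = Some(a)
        def map2[A, B, C](ma: Option[A], mb: Option[B])(f: (A, B) => C): Option[C] =
          for { a <- ma; b <- mb } yield f(a, b)
      }
  }

  // Traverse for List: lift Nil and :: into M and combine the per-element
  // effects into one effect on the whole list.
  def traverse[M[_], A, B](as: List[A])(f: A => M[B])(
      implicit M: Applicative[M]): M[List[B]] =
    as.foldRight(M.pure(List.empty[B])) { (a, acc) =>
      M.map2(f(a), acc)(_ :: _)
    }

  // Invented example effect: parsing may fail.
  def parsePositive(s: String): Option[Int] =
    s.toIntOption.filter(_ > 0)
}
```

With Option as the effect, one failing element makes the whole traversal None; swapping in another Applicative changes how the effects combine without touching traverse.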
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 6:40 PM, Runar Bjarnason <runarorama [at] gmail [dot] com> wrote:
you're right -- I blame my over-eager abstraction trigger finger
Re: Re: questioning FP
You can always go back and ignore all this. It'll just keep coming back up on the list.
OK, having gone through the slides and begun reading the papers you pointed out w.r.t. DDC, it's not as if the type signatures on the effectual functions in DDC are a piece of cake either.
From the slides:
updateInt :: forall (r1, r2 : region) . Int r1 -> Int r2 -(e1)> () :- e1 = Read r2 \/ Write r1
All to do the logical equivalent of assignment. Not only that but DDC gives you dependent types and a proof system.
I've been recently doing my best to learn ATS (http://www.ats-lang.org/) which has a lot in common with DDC including dependent types and an integrated proof system. It's not as if the type signatures in ATS are any better.
If dependent types and integrated theorem proving are part of Scala's future, super cool, but I don't think that that's what you were going for. In the meantime, traverse really doesn't look all that bad[1].
--
Jim Powers
[1] In fact, it looks downright awesome.
Re: questioning FP
Jim Powers skrev 2011-10-12 17:16:
> If dependent types and integrated theorem proving are part of Scala's
> future super cool, but I don't think that that's what you were going
> for. In the mean-time traverse really doesn't look all that bad[1].
I have to point out that in Scala the decision has already been made to
allow uncontrolled side effects by default. Monads (or applicatives)
will never be a solution for controlling side effects in Scala, unless,
in the unlikely event, the language is completely re-designed.
So, that only leaves the option of adding explicit notation for
specifying that functions/methods lack side effects, which is quite the
opposite from Haskell. The first step is obviously a way to specify that
a function/method is pure...
/Jesper Nordenberg
Re: questioning FP
On Wednesday, October 12, 2011 12:27:32 PM UTC-4, Jesper Nordenberg wrote:
Let's reword that: "The decision has already been made to make Scala Turing complete. Types will never be a solution for determining correctness in Scala, unless, in the unlikely event, the language is completely re-designed."
It doesn't require a redesign of the language. We could get a very long way by being explicit about effects in our libraries. Imagine if there were a complete and useful subset of the standard library that declared effects in its types. I think that would be sufficient, and all you need is higher kinds.
Now, it would be even better if we also had kind polymorphism and tail call elimination, but that's another story.
Re: questioning FP
Runar Bjarnason skrev 2011-10-12 18:58:
> It doesn't require a redesign of the language. We could get a very long
> way by being explicit about effects in our libraries. Imagine if there
> were a complete and useful subset of the standard library that declared
> effects in its types. I think that would be sufficient, and all you need
> is higher kinds.
That's merely for documentation of side effects not control of them. You
will get no compiler error (or warning) if a function that's supposed to
be pure performs a side effect.
/Jesper Nordenberg
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 15:05, Jesper Nordenberg wrote:
> Runar Bjarnason skrev 2011-10-12 18:58:
>>
>> It doesn't require a redesign of the language. We could get a very long
>> way by being explicit about effects in our libraries. Imagine if there
>> were a complete and useful subset of the standard library that declared
>> effects in its types. I think that would be sufficient, and all you need
>> is higher kinds.
>
> That's merely for documentation of side effects not control of them. You
> will get no compiler error (or warning) if a function that's supposed to be
> pure performs a side effect.
That means it will not detect bugs.
It's like using Option in Scala. Two, three years ago, I heard many
complaints that using Option would not prevent null errors in the
programs. Well, since then I learned that if I kept to null-free
libraries, and "sanitized" nulls whenever I interfaced with a Java
library that used them, it truly reduced null problems a great deal.
The same thing applies here, except, perhaps, that doing I/O will not
cause instant exceptions like trying to use a null value. Still, the
problem will be reduced -- isolated at certain APIs.
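A small sketch of the "sanitize nulls at the boundary" habit described above. The settings map and the helper names are invented for illustration; the only library facts relied on are that java.util.HashMap.get returns null for a missing key and that Option(x) is None when x is null.

```scala
object NullBoundary {
  import java.util.{HashMap => JHashMap}

  // A Java API that reports "missing" by returning null.
  private val settings = new JHashMap[String, String]()
  settings.put("host", "localhost")

  // Sanitize at the boundary: Option(x) is None when x is null.
  def setting(key: String): Option[String] = Option(settings.get(key))

  // Everything past the boundary is null-free.
  def hostLength: Int = setting("host").map(_.length).getOrElse(0)
  def portOrDefault: String = setting("port").getOrElse("8080")
}
```

Nothing stops other code from calling the Java map directly, which is the same situation as with IO in Scala: the discipline lives in the libraries, not in the compiler.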
Re: Re: questioning FP
Daniel Sobral wrote:
Can you please share a snippet where there are bugs and how IO reveals them? This is the sort of thing I was looking for.
It is isolated only if those APIs are used at the top of the program. Otherwise, the IO monad needs to propagate up the call hierarchy. And if I do it at the top of the program, is there real benefit in it? I'm about to call unsafePerformIO several lines below anyway, so why invest in wrapping just to unwrap?
Ittay
Re: Re: questioning FP
On Wednesday, October 12, 2011 6:51:34 PM UTC-4, Ittay Dror wrote:
I think this needs to be repeated: The idea of "call hierarchy" is applicable only to programming that is first-order and single-threaded. That said, if you call a function that depends on IO, then you too depend on IO whether you want to pretend otherwise or not.
Don't do that. Use map and flatMap instead. If you can say "f(x.unsafePerformIO)", then you can just as well say "x map f".
Re: Re: questioning FP
BTW, thanks for replying. I have a feeling that we're talking about different things. Maybe the next message will contain the "joining" argument
Runar Bjarnason wrote:
Re: Re: questioning FP
Runar Bjarnason wrote: Can you explain what is first-order? And why multi threaded doesn't have a call hierarchy? (a function can have several clients)
of course. so? as an analogy, if my function does division, I depend on math, so do I wrap its result in a Math[_] monad?
at the "end of the world", i call unsafePerformIO, right? You also said that normally I'd call my effectual functions up front and then yield a function that is pure, where most of my logic is. Am I correct so far?
if so, then what is the point of wrapping the effectual functions in IO, if I'm going to call unsafePerformIO anyway?
* if most of the program logic is in those functions, then it means the IO values propagated through several layers (I assume)
* if those functions are trivial / fundamental, then I just wrapped a simple thing in IO to unwrap it immediately after.
Re: Re: questioning FP
On Thursday, October 13, 2011 11:27:49 AM UTC-4, Ittay Dror wrote:
A higher-order function is a function that takes a function as its argument. First-order just means programming without those. In scala, we have higher-order functions like map and flatMap. When working with monads, this is how we lift non-monadic code into the monad.
This is closely related to continuation-passing. Consider these two methods:
def timesTwo(x: Int): Int = x * 2
def timesTwoK(x: Int, k: Int => Nothing): Nothing = k(x * 2)
The former is called like this: f(timesTwo(n))
The latter is called like this: timesTwoK(n, f)
The former is a first-order function that simply returns the result of the computation. The latter is a higher-order function that doesn't return, but instead accepts a function as its argument. That function is called the continuation. Instead of accepting a result from the function, we tell it how to continue without us.
The "map" and "flatMap" methods are exactly the same way. Instead of getting the result out of a monad (like IO), we pass the continuation to map or flatMap. That continuation will receive the result if and when it's available.
Ordinarily you would close over some division function and carry on, but it's not completely insane to use a data type instead, for example if you want to be polymorphic in the division function, or if you want to defer optimization of mathematical expressions to your math engine.
Take a simpler use case: multiplication. The expression 3 * 4 * 5 could be written like this:
List(3, 4, 5)
Abstracting over the multiplication operator. We can inject it later like this:
list.foldLeft(1)(_ * _)
What's the point of constructing a list, if I'm just going to fold it later anyway?
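One way to read the rhetorical question: the list is a description that can be interpreted more than once. A tiny sketch, with value names chosen here for illustration:

```scala
object FoldLater {
  // The "expression" 3 * 4 * 5 with the operator left out.
  val factors: List[Int] = List(3, 4, 5)

  // Inject the operator later; different interpreters for the same data.
  val product: Int = factors.foldLeft(1)(_ * _)
  val sum: Int = factors.foldLeft(0)(_ + _)
  val shown: String = factors.mkString(" * ")
}
```

The analogy being drawn is that an IO value is likewise data until something interprets it.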
Re: Re: questioning FP
Runar Bjarnason wrote:
If you have a higher-order function that doesn't call its argument but passes it somewhere else, then a monad makes sense (in fact, I wrote about such a monad in www.tikalk.com/incubator/blog/functional-programming-scala-rest-us). But it is a specific use case and should be treated as such. The monad is future/promise, not IO[_].
Re: Re: questioning FP
Runar Bjarnason wrote: When folding the list, information is added. When calling unsafePerformIO, no information is created. It just takes out a delayed action.
Let me state again my issues:
1. I don't code with random functions named 'foo' that I know nothing about.
2. If side effects are dangerous, IO[_] does not protect me from taking them. Calling a method 'doDangerousAction: Unit' is no different than 'doDangerousAction: IO[Unit]'. If this is the logic I need to call, then this is the logic I need to call. And wrapping it in IO[_] doesn't make the action safer. It will eventually be called anyway.
3. IO[_] makes code trickier to work with (need to work with for comprehensions, use traverse/sequence etc.)
4. IO[_] does not prevent bugs:
Imagine 'close(resource: Resource): Unit'
Now this imperative code:
close(resource)
close(resource) // throws an exception that the resource is already closed.
Or with a higher order function:
def doWithResource(f: Resource => Unit): Unit = {
f(resource)
f(resource)
}
doWithResource(close)
Now this is of course a simple case, we need to imagine some sort of complex code in between and around the calls that made the developer get confused about the flow.
Now with IO:
val io = close(resource)
val io2 = io.flatMap(_ => close(resource))
Or:
(close(resource), close(resource))
will result in the same exception.
I grant that it is hard to imagine doing the above by mistake since IO makes the code more intentional. We've made things difficult for the developer. To that, I have two things:
1. The same has been said about checked exceptions. But we know these are a failed experiment. Mainly due to the abstraction consideration.
2. We've traded one source of error for another, because when using IO[_] the developer can forget to return the result of close, and then the resource will not be closed, leading to a leak (which is harder to debug).
(about doWithResource: if it returns IO[Unit] or List[IO[Unit]] makes no difference since I'll need to sequence it and return it, maybe for the second case I can check the list size is 1, but this is a runtime check)
So I'm still baffled as to where the usefulness lies. Basically I think it is like checked exceptions: it looks nice to tag methods that can fail, but pretty soon you find out that you must use these methods, you can't do anything with the exception, and it makes your code more difficult to use.
To contrast with other monads, take Option for example: when a function returns Option it not only documents the fact that it may be returning None, I'm also able to do something about it. If I see a 'get(key: K): Option[V]' I can use it with 'get(key) getOrElse defaultValue'. So: 1) I've avoided the dangers of NPE, 2) I've reduced it to a simple value, so my code, and clients' code, remains simple. In other words, Option is not an opaque wrapper like IO[_].
Some really smart people are saying I'm missing something. But the arguments are generally philosophical and with a lot of jargon. Maybe you guys are so used to the benefits of IO[_] that you give higher-level arguments. So please, go back to the basics. Show me an example of imperative code and how introducing IO[_] removes the bugs without introducing the possibility of new ones (such as forgetting to return the IO instance).
Thanks,
ittay
Re: Re: questioning FP
On Thursday, October 13, 2011 3:51:02 PM UTC-4, Ittay Dror wrote:
Right, so your code is mostly first-order. It probably has a lot of repetition. You can improve on that by abstracting more.
There is a huge difference. The latter call is referentially transparent. The former call will mingle the performing of the dangerous action with program logic. I want a clean separation between describing what is done and actually doing it. If you don't want that, that's fine. Your code, your rules.
No it doesn't. It makes code easier to work with. In addition to having no side-effects, referential transparency, equational reasoning, compositionality, clean separation of concerns, I can also make use of functions like traverse and sequence which remove a lot of tedious repetition from my code.
Sure it does. Take for example any kind of race condition where you accidentally interleave two side-effects (like writing to a file while getting some content to write to that same file). That kind of interleaving is prevented if the type system forces you to sequence effects monadically.
The IO monad is not at all like checked exceptions, precisely because it is a monad.
This is what types are for. Unit is not the same as IO[Unit], and the compiler will complain if you don't return anything.
Re: Re: questioning FP
Runar Bjarnason wrote:
Yes, mostly first order, not a lot of repetition.
Sure, in HOF, I might call a function several times just for the sake of not keeping a val of the result. In this case, the function I pass to the HOF should be idempotent.
I also agree that in such cases, IO[_] is a good approach, but why force me to use it everywhere? Give me a println: Unit and make flatMap accept f: Any => IO[Unit]. Then I need to wrap println for this case only.
OK, I'll bite. I have two threads. One writes to a file the other gets some content. Two questions:
1. how does using IO[_] prevent the race? At the 'run' method of each thread I call unsafePerformIO which will now run the actions in parallel. Unless of course they are synchronized, which can be done with normal methods.
2. how do i avoid this sequencing when the IO actions can actually be done in parallel (two different files, inserting a cache value into a concurrent hash map)?
Why will the compiler complain? I call close() which returns IO[_] and I forgot to return it. My calling method used to be pure, so it returned Int. It still returns Int. Compiler is happy. Another scenario is that my method was required to return IO and was already returning it from another IO action.
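The scenario in the last paragraph can be sketched as follows. The IO class and Resource are hypothetical stand-ins (not a real library); whether the compiler warns about the dropped value depends on which lint flags are enabled, but without them the code below is accepted.

```scala
object DiscardedEffect {
  // Hypothetical, minimal IO type, as a stand-in for a real library.
  final class IO[A](run: () => A) {
    def unsafePerformIO(): A = run()
    def map[B](f: A => B): IO[B] = new IO[B](() => f(run()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO[B](() => f(run()).unsafePerformIO())
  }
  object IO {
    def apply[A](a: => A): IO[A] = new IO[A](() => a)
  }

  final class Resource { var closed: Boolean = false }

  def close(r: Resource): IO[Unit] = IO { r.closed = true }

  // Still returns Int, still type-checks: the close action is built and
  // silently dropped, so the resource is never closed.
  def leaky(r: Resource): Int = {
    close(r)
    42
  }

  // The effect is part of the result, so the caller cannot lose it
  // without also losing the Int.
  def careful(r: Resource): IO[Int] = close(r).map(_ => 42)
}
```

So the type only helps when the method's own return type is forced to change; a method that keeps returning a plain value can still drop an IO on the floor.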
Re: Re: questioning FP
On Thu, Oct 13, 2011 at 6:35 PM, Ittay Dror <ittay [dot] dror [at] gmail [dot] com> wrote:
No need to go to concurrent threads. Just think of nesting side-effects accidentally. You're writing to a log, and getting the message (claiming to be a String) has the side-effect of switching to the next log file if the current one is full. You end up writing your message to the wrong file. Anything like this:
f(x)
Where f has some sequence of side-effects, and a method on x, that f calls somewhere in this sequence, has a side-effect that interferes with it. If you've written enough "enterprise" software, there's no way you wouldn't have come across this kind of bug.
But, again, this is not the purpose of the IO monad. Its main purpose is separating effects from pure code. Doing this just also happens to prevent interleaving of effects.
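Here is a sketch of the f(x) bug described above, in plain side-effecting style. The Logger class, its two in-memory "files" and sneakyMessage are all invented for illustration.

```scala
object InterleavedLog {
  import scala.collection.mutable.ListBuffer

  // Hypothetical logger with two "files" and a pointer to the current one.
  final class Logger {
    val files: Vector[ListBuffer[String]] =
      Vector(ListBuffer.empty[String], ListBuffer.empty[String])
    var current: Int = 0

    def rotate(): Unit = { current = 1 }

    // This is "f": it picks the target file, then asks for the message.
    def write(message: => String): Unit = {
      val target = files(current) // f's effect sequence starts here
      val text = message          // evaluating the message may rotate the log
      target += text              // lands in the file chosen before rotation
    }
  }

  // This is "x": it claims to be a String, but producing it rotates the log.
  def sneakyMessage(log: Logger): String = {
    log.rotate()
    "disk almost full"
  }
}
```

If producing the message were typed as an IO[String], it could not be passed where a String is expected, so the rotation would have to be sequenced explicitly before or after the write.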
Re: Re: questioning FP
Runar Bjarnason wrote:
As Martin has demonstrated, you can take any imperative program and imagine every expression is returning an IO[_] and ';' or '\n' as bind. So I don't think IO[_] solves such bugs. But it creates the potential for other bugs, where an effectful computation fails to include one of its effects in the result.
I was questioning whether this separation benefits anything or is only good from a purist point of view where everything is black & white.
In a language with a lot of use of HOFs, that are open to everyone to use, I see how the use of IO is necessary to the point of making all low level functions return IO[_].
Maybe the difference in Scala is that we have classes with open recursion and protected methods. So I can extend functionality while keeping the contract closed. And then the drawbacks of using IO[_] outweigh the benefits.
Re: Re: questioning FP
On Oct 14, 2011, at 1:36, Ittay Dror <ittay [dot] dror [at] gmail [dot] com> wrote:
It does, but I only know that because I tried it both ways. There's an expression in the southern USA that "you can't tell a young soldier anything", meaning I have to try things for myself and not be told by you. Because I don't yet have the benefit of experience.
Re: Re: questioning FP
(1) My trials have only illustrated to me that an IO monad is either a waste of time or I'm doing it wrong, and
(2) Nobody has ever provided an example of where it does anything useful, so if I'm doing it wrong I don't know how I could find this out.
When this thread began, I was willing to concede that there may be uses for it, but although I appreciate the efforts of people arguing for it, the reasoning has been unconvincing at best. So I'm now coming to the conclusion that the uses are only to appeal to personal style, not to solve practical programming problems.
So that this is not all just disagreeable rhetoric, let me give three examples where I have tried in the past to use something like an IO monad, why I rejected it, and what the superior solution was.
(1) TIFF image writer.
TIFF images have a number of required components in their header, and many more optional components. I had an application where the header information needed to be assembled in a not-entirely-straightforward way, and I was running into problems with forgetting to initialize parts of the header. I attempted using an IO marker, but that only told me what I already knew: there was a bunch of code dealing with output. The logic was still wrong. So I switched to using a finite state machine in state space (of the Header[False,False,False] style); the IO monad itself didn't assist with this at all. The equivalent of unsafePerformIO now would throw a compile time error if the header was not properly constructed, and everything was sorted out. But IO[Whatever] only got in the way, so I removed it. I needed a specific finite state machine, not an abstraction for IO, to help keep things straight.
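For readers unfamiliar with the Header[False,False,False] style, here is a rough sketch of a type-level state machine. The two tracked fields (width and height) and every name below are assumptions for illustration, not the actual TIFF writer.

```scala
object TiffHeaderSketch {
  sealed trait Flag
  sealed trait True extends Flag
  sealed trait False extends Flag

  // The type parameters record which required fields have been set.
  final class Header[HasWidth <: Flag, HasHeight <: Flag] private (
      val width: Int,
      val height: Int) {
    def withWidth(w: Int): Header[True, HasHeight] =
      new Header[True, HasHeight](w, height)
    def withHeight(h: Int): Header[HasWidth, True] =
      new Header[HasWidth, True](width, h)
  }
  object Header {
    val empty: Header[False, False] = new Header[False, False](0, 0)
  }

  // Only a fully initialised header is accepted. Passing
  // Header.empty.withWidth(640) here is a compile-time type error.
  def write(h: Header[True, True]): String =
    s"TIFF ${h.width}x${h.height}"
}
```

Nothing here mentions IO: the guarantee comes entirely from the phantom type parameters, which is the point being made in the post.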
(2) Data computation engine.
I have a daemon that sits around looking for data to appear in a directory; when it does, it grabs it, performs some conversions and computations, then spits it out again in another directory while removing the originals. Since files can appear (and disappear!) asynchronously, it's a little tricky to keep things straight. I speculated when I started this project that using an IO monad might help keep things straight. But no: in order to parallelize the computations, I had actors responsible for reading, moving, computing, etc., and there was no sensible way to push IO above the actor level. Within each actor, the concerns were well-separated; the input actor was, as far as the next actor could tell, just a source for data--I had already abstracted over whether or not the data involved IO. Likewise for the output actor. The input actor itself had scarcely a line of code that didn't involve IO, so any marker trait was redundant, and since the input or output statements were (nearly) consecutive, a state machine was also redundant. IO[Whatever] only got in the way--the solution was to have dedicated actors.
(3) Image processing tool.
I analyze scientific images that are sometimes too large for memory; thus, there's lots of IO required in order to keep the relevant portion of the data set (and partially completed computations) available. I had initially thought that using an IO monad would help me keep track of what was going on: where did I need to be careful, and where not? Where were the expensive operations and where were they inexpensive? This worked reasonably well initially; although I wasn't in any danger of forgetting where IO was when I first wrote the program, it seemed as though when I returned to the code later, I'd have a better understanding of the flow of data and caching and so forth. However, when I returned to the code to modify it, I found that the opposite was true: I wanted to change the sites of caching based on profiling. I already knew exactly where the expensive computations were--no thanks to IO markers--and now I had to change a whole bunch of type signatures because I wanted to change the location of IO. This was actually an anti-pattern: the point of caching is to abstract over whether or not there is IO and/or side-effects, and handle them transparently locally. IO[Whatever] got in the way with a vengeance because it was carried upstream too far.
Now, I fully admit that I was not using any established IO monad library; the concepts seemed straightforward enough to me to use them on my own, so maybe I missed some key insights and properties. But I don't think so. I think IO monads are mostly a red herring. You might want to collect your data for IO, but you don't want an IO monad for that, you want a collection (possibly a lazy one). You might want to transform your data, but you want e.g. an appropriate applicative functor for that, not an IO monad. You might want to separate IO concerns from others, but if so, just do it: write self-sufficient IO methods or actors or classes, and put the IO there.
My conclusion is that dividing things into IO and not-IO is an unhelpful abstraction in Scala since different IO tasks have very little in common. Abstraction is useful when there are shared properties; with IO, the only thing that is really shared is that we happened to label these things with "IO". (Why is writing something to a locked file and reading it back "IO" when writing a value in shared memory and reading it back is not? Do you really want to label keyboard input, writing to a file, and socket communication with the same marker? Isn't it a lot more useful to mark each one separately, if you need markers?) IO often does not commute, but the IO monad won't prevent errors of commutation (a state machine can). Side-effects are fundamentally different from not-side-effects, but IO is not fundamentally different from any other side effect unless you have no other side effects around, and even then, that-there-are-side-effects are unlikely to be as interesting as what-the-side-effects-are.
Thus, I think Ittay had the right conclusion: it is useful only for purity. Purity aside, you almost always want something else.
I'm quite open to counterexamples; I am still rather mystified how so many intelligent people could find something so valuable without being able to illustrate a clear use-case that depends exactly on the IOness (not e.g. the use of monad transformers). So I hold out hope that I'm still missing something valuable. But it's a little frustrating that it's boiling down to "try it and you'll see" when I've tried to try it and found the opposite.
--Rex
On Fri, Oct 14, 2011 at 7:55 AM, Runar Oli <runarorama [at] gmail [dot] com> wrote:
Re: Re: questioning FP
On Friday, October 14, 2011 4:28:50 PM UTC-4, Rex Kerr wrote:
It is. See scalaz.IORef and scalaz.STRef.
No, I don't want any marker. I want to describe these things declaratively with a DSL that I can pass to an interpreter at a convenient time.
That's the thing though. Much of the usefulness of an IO datatype is that it is a monad. So wherever you see M[_]:Monad or M[_]:Applicative, you can pass IO for M. The usefulness is precisely that it has no IOness. It's just another DSL, and manipulating expressions written in that DSL has no side-effects. This is why it's useful. I'm sorry you were not able to make use of it. Maybe the situation would improve if we built a more complete library of IO widgets.
I can relate a story where having an IO monad solved a real problem at work. This was not very long ago, for a web-based application talking to a database via JDBC. This was a million-LOC enterprise wossname with all the trimmings. The part that talked to the database was designed with a kind of strategy pattern, where you would inherit from a class named Command and overload a method named "body". Commands were executed using execute, the implementation of which went something like this:
1. Read the inputs and validate them.
2. Open a database connection.
3. If all is well, execute the body, passing the inputs.
4. Close the database connections.
Plus some complicated error-handling etc, but you get the idea.
At some point we started noticing that the database was accumulating row locks while the application was running with about 5000 concurrent users. If left alone, the app would become unresponsive, so we resorted to manually killing database sessions.
The reason this was occurring is that some commands actually depended on other commands, so occasionally a programmer would implement a command so that it would call another Command's execute method directly. The nested session would of course depend on rows that were already being manipulated (and not committed yet) by the outer session.
Any red-blooded functional programmer would at this point be screaming "use a monad!" Wherever you have an inner thing that depends on an outer thing, but needs shared context, you have a monad. The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
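A rough sketch of the chaining idea, under stated assumptions: the command type is made generic in the connection type C so the example runs without a database (in the story C would be java.sql.Connection), andThenK is a name invented here for the Kleisli-style composition, and FakeConn and the two commands are made up.

```scala
object ConnectionChain {
  // A command assumes an open connection, does something, and passes it on.
  type Cmd[C, A] = C => (A, C)

  // Kleisli-style composition: run `first`, then feed its result and the
  // same connection to `next`. Neither command opens a connection itself.
  def andThenK[C, A, B](first: Cmd[C, A])(next: A => Cmd[C, B]): Cmd[C, B] =
    c0 => {
      val (a, c1) = first(c0)
      next(a)(c1)
    }

  // Hypothetical stand-in for a connection: it records executed statements.
  final case class FakeConn(id: Int, executed: Vector[String]) {
    def exec(sql: String): FakeConn = copy(executed = executed :+ sql)
  }

  val findUser: Cmd[FakeConn, Int] =
    c => (7, c.exec("SELECT id FROM users"))

  def lockRow(userId: Int): Cmd[FakeConn, Boolean] =
    c => (true, c.exec(s"UPDATE users SET locked = 1 WHERE id = $userId"))

  // Two primitive commands acting as "one command" on one connection.
  val combined: Cmd[FakeConn, Boolean] = andThenK(findUser)(lockRow)
}
```

Because a command never opens a connection, a nested session (and with it the row-lock pile-up described above) cannot be expressed in this style.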
Of course, this was in Java, which as everybody knows, is just a functional academic language, and we should all pat ourselves on the back for not being that academic.
Re: Re: questioning FP
On a lark I decided to put my rather poor FP skills to the test to see if I could model Runar's example above. Here's my current version:
https://gist.github.com/1291831
Like I said, I pretty much suck at FP (in particular Scala FP because things are generally less clear than in Haskell, especially when it comes to type inference), but that may be a good thing in this case: more people may be able to understand my example. It was a hoot to do and let me play around with Scalaz7, which is simply amazing.
While doing this example I came across RegionT: https://github.com/scalaz/scalaz/blob/scalaz7/core/src/main/scala/scalaz/effect/RegionT.scala
Which is clearly the "right way" to handle connections and transactions. Also, the current implementation is rather brittle in the face of error conditions. Validations clearly could be used to good effect here.
Frankly, I find this amazing given the capability differences between Scala's and Java's type systems. Must have been fun like a tooth abscess.
Re: Re: questioning FP
https://gist.github.com/1292001
Note that this guarantees (in the type system) that if executeQuery can access a Connection, then it is open. It also guarantees that the connection is finally closed (because there is no combinator to close a connection inside an IOM, so you must do so via withConnection).
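The gist itself is not reproduced here, so the following is only a sketch of the general shape such a guarantee can take, with every name (Conn, IOM, executeQuery, withConnection) either borrowed from the message or invented. In this sketch the guarantee comes from encapsulation: IOM values cannot be run or built from outside, so the only way to run one is through withConnection.

```scala
object WithConnectionSketch {
  // Hypothetical stand-in for a database connection.
  final class Conn {
    var open: Boolean = false
    var closes: Int = 0
  }

  // A computation that needs an open connection. Its constructor and run
  // function are hidden, so client code can only combine IOM values.
  final class IOM[A] private[WithConnectionSketch] (
      private[WithConnectionSketch] val run: Conn => A) {
    def map[B](f: A => B): IOM[B] = new IOM[B](c => f(run(c)))
    def flatMap[B](f: A => IOM[B]): IOM[B] =
      new IOM[B](c => f(run(c)).run(c))
  }

  // A primitive; there is deliberately no primitive that closes.
  def executeQuery(sql: String): IOM[String] =
    new IOM[String](c => { require(c.open, "connection must be open"); s"rows for: $sql" })

  // The only way to run an IOM: open, run, and always close.
  def withConnection[A](c: Conn)(body: IOM[A]): A = {
    c.open = true
    try body.run(c)
    finally { c.open = false; c.closes += 1 }
  }
}
```

A program is assembled with map and flatMap and handed to withConnection once; the connection is open for exactly the duration of that call.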
On Sunday, October 16, 2011 11:07:50 PM UTC-4, Jim Powers wrote:
Re: Re: questioning FP
Sorry, I meant to say that the current implementation of my "toy"/"example" is brittle. I can hardly say that about RegionT.
--
Jim Powers
Re: Re: questioning FP
From looking quickly at the source, it seems RegionT does not implement the subtyping and region inference parts of the paper. Are there any plans to support this?
(This is not a criticism. RegionT is already awesome per se.)
Best,
Nicolas.
Re: questioning FP
> I can relate a story where having an IO monad solved a real problem at work. This was not very long ago, for a web-based application talking to a database via JDBC. This was a million-LOC enterprise wossname with all the trimmings. The part that talked to the database was designed with a kind of strategy pattern, where you would inherit from a class named Command and overload a method named "body". Commands were executed using execute, the implementation of which went something like this:
>
> 1. Read the inputs and validate them.
> 2. Open a database connection.
> 3. If all is well, execute the body, passing the inputs.
> 4. Close the database connections.
>
> Plus some complicated error-handling etc, but you get the idea.
>
> At some point we started noticing that the database was accumulating row locks while the application was running with about 5000 concurrent users. If left alone, the app would become unresponsive, so we resorted to manually killing database sessions.
>
> The reason this was occurring is that some commands actually depended on other commands, so occasionally a programmer would implement a command so that it would call another Command's execute method directly. The nested session would of course depend on rows that were already being manipulated (and not committed yet) by the outer session.
>
> Any red-blooded functional programmer would at this point be screaming "use a monad!" Wherever you have an inner thing that depends on an outer thing, but needs shared context, you have a monad. The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
>
> type IO[A] = java.sql.Connection => (A, java.sql.Connection)
>
> Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
Thanks for this illuminating example!
> Of course, this was in Java, which as everybody knows, is just a functional academic language, and we should all pat ourselves on the back for not being that academic.
;-)
Heiko
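The connection-threading type quoted above can be sketched roughly as follows. This is a minimal illustration, not the code from the original project: `Conn` is a hypothetical stand-in for java.sql.Connection (so the sketch runs without a database), and `DbIO`, `readBalance`, `writeBalance`, and `execute` are names assumed for the example.

```scala
object ConnectionThreading {
  // Hypothetical stand-in for java.sql.Connection; it only records statements.
  final case class Conn(log: Vector[String])

  // Same shape as the quoted type: take a connection, return a result and the connection.
  final case class DbIO[A](run: Conn => (A, Conn)) {
    def flatMap[B](f: A => DbIO[B]): DbIO[B] = DbIO[B] { c0 =>
      val (a, c1) = run(c0)
      f(a).run(c1)
    }
    def map[B](f: A => B): DbIO[B] = DbIO[B] { c0 =>
      val (a, c1) = run(c0)
      (f(a), c1)
    }
  }

  // Primitive commands (hypothetical): neither opens nor closes a connection.
  def readBalance(account: String): DbIO[Int] =
    DbIO[Int](c => (100, c.copy(log = c.log :+ s"SELECT balance for $account")))

  def writeBalance(account: String, amount: Int): DbIO[Unit] =
    DbIO[Unit](c => ((), c.copy(log = c.log :+ s"UPDATE $account SET balance = $amount")))

  // An "outer" command built by chaining primitives over one shared connection.
  val deposit: DbIO[Int] =
    for {
      old <- readBalance("alice")
      _   <- writeBalance("alice", old + 50)
    } yield old + 50

  // Only the runner "opens" a connection; here that just means creating the stand-in.
  def execute[A](cmd: DbIO[A]): (A, Vector[String]) = {
    val (a, c) = cmd.run(Conn(Vector.empty))
    (a, c.log)
  }
}
```

Because no primitive can open a connection of its own, nesting one command inside another cannot create a second session; the only way to combine commands is to chain them over the connection they are handed.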
Re: Re: questioning FP
Runar Bjarnason wrote:
Isn't STRef a solution to allow using mutable data structures for performance gains? Personally, I'd just use the mutable data structure directly (making it hidden/private in some scope if I'm afraid it would accidentally be shared)
Indeed, in this use case, as in HOF, the use of IO is valid. However, not all development uses these use cases. So not all functions that do side-effects should return IO, just those involved in these use cases.
In other words, IO[_] is useful to solve a use case, not as a means to get referential transparency.
Sounds like you had a problem with separation of concerns.
My approach would have been:
* create an Executor class that accepts a Command, opens a connection, calls command.body and closes the database connection.
* This can even be a method withConnection{f: Connection => Unit}
* commands can call other commands' body method
* There's no problem composing f: Connection => Unit and g: Connection => Unit with an operator: `f andAgain g`, where toAgain(f).andAgain(g) = {a => f(a); g(a)}. Of course, the class-based approach also allows creating a sort of HOF that invokes g in the middle of f.
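For comparison, the approach in the list above might look something like this. It is only a sketch: `Conn` is a hypothetical stand-in for a connection (a buffer that records statements), the implicit class plays the role of the toAgain conversion, and `insertUser` and `insertAudit` are made-up commands.

```scala
object AndAgainSketch {
  // Enrichment playing the role of the toAgain conversion.
  implicit class Again[C](f: C => Unit) {
    def andAgain(g: C => Unit): C => Unit = { c => f(c); g(c) }
  }

  // Hypothetical stand-in for a connection: a mutable buffer of statements.
  type Conn = scala.collection.mutable.ArrayBuffer[String]

  val insertUser: Conn => Unit = c => { c += "INSERT user"; () }
  val insertAudit: Conn => Unit = c => { c += "INSERT audit"; () }

  // Both steps receive the same connection, in order.
  val both: Conn => Unit = insertUser andAgain insertAudit

  // The executor owns the connection lifecycle; commands never open one.
  def withConnection(f: Conn => Unit): Conn = {
    val c = scala.collection.mutable.ArrayBuffer.empty[String] // "open"
    f(c)
    c // a real version would close the connection here
  }
}
```

Note that each step returns Unit, so a later step cannot depend on a value computed by an earlier one except through side effects on the connection.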
Re: Re: questioning FP
On Oct 15, 2011, at 5:06, Ittay Dror <ittay [dot] dror [at] gmail [dot] com> wrote:
It's a solution for allowing working with mutable data structures in a way that the type system guarantees is referentially transparent.
You've just described our unsafePerformIO. And we had an "andAgain" function except it was called "bind" and its type signature wasn't naff.
Re: Re: questioning FP
Runar Oli wrote:
I'm not arguing there's no place for lazy IO or the IO monad in programming. Of course there are use cases where people "redesign the wheel" without considering they're actually implementing the IO[_] "design pattern" (or State[_] for that matter).
My point was that IO[_] is not a silver bullet. In many (I think the majority) of cases, side-effecting functions are easier to use (without incurring bugs, of course) than trying to make them referentially transparent with IO[_]. In other words, referential transparency is not a holy grail.
Re: questioning FP
Hi all,
Thanks for this very interesting thread. After enjoying listening to the various arguments I would like to add my view: Except for Roland's "idea" of using Actors for something (?), only the FP/monadic party (Runar) has shown a well thought out and field-tested approach for properly dealing with side effects. Therefore, as long as nobody comes up with a real alternative (except for ignoring), I think I will be a FP fanboy.
By the way: While I am, despite my age, still a young soldier and only a real-world, enterprise-system OO developer, I find the FP approach neither hard to understand nor hard to apply.
Just my two cents,
Heiko
Re: Re: questioning FP
Maybe; it'd depend on what the widgets were. (And on how well they were documented.) As I said, I was rolling my own.
Ouch. It seems like there are two serious problems here: first, redundant and possibly conflicting open/closing; and second, poor definition of what state of the database should be used when nesting commands--do you use the partial manipulations or not?
That sort of works (although the type you've specified works only for input? Or you're using A to store state in the type system as well as wrap input?). What's not clear, however, is how this is superior to:
(1) adding another method to the parent class that separates out the DB-open-close stuff from the read/write of an existing database (and using the appropriate one when calling externally vs. to each other), or
(2) Wrapping the database interface to cache open/close requests so you don't have to worry about it, or
(3) refactoring as a bunch of primitive commands that you chain by sequential appearance in a method (or in a list over which you fold or map the connection) rather than Kleisli composition. (I.e. do the same thing, but write methods that implement only java.sql.Connection => A; if you're not enforcing via types only certain combinations, there isn't much point in returning the connection.)
Still, it looks _somewhat_ promising, so I appreciate the example.
--Rex
Re: Re: questioning FP
On Oct 15, 2011, at 3:27, Rex Kerr wrote:
>
> That sort of works (although the type you've specified works only for input?
A combinator for writing to the database just has type IO[Unit].
> Or you're using A to store state in the type system as well as wrap input?). What's not clear, however, is how this is superior to:
I'm sure I could have solved it with an ad hoc solution like an AbstractBeanFactoryProxy or something simple like that, but the way I see it is that I needed nested dependencies with shared context, and that's what a monad is.
> (or in a list over which you fold or map the connection) rather than Kleisli composition.
That is Kleisli composition.
> (I.e. do the same thing, but write methods that implement only java.sql.Connection => A;
That's perfectly reasonable, and that's just the Reader monad. I wanted to use something more State-like to discourage parallelism.
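To make the Reader remark concrete, here is a minimal sketch of functions of shape Connection => A composed as a Reader. `Conn` is a hypothetical read-only stand-in for a connection, and `Reader`, `balance`, and `total` are names assumed for the illustration.

```scala
object ReaderSketch {
  // Reader: a computation that only reads a shared context C.
  final case class Reader[C, A](run: C => A) {
    def map[B](f: A => B): Reader[C, B] = Reader[C, B](c => f(run(c)))
    def flatMap[B](f: A => Reader[C, B]): Reader[C, B] =
      Reader[C, B](c => f(run(c)).run(c))
  }

  // Hypothetical stand-in for a connection: account name -> balance.
  type Conn = Map[String, Int]

  def balance(account: String): Reader[Conn, Int] =
    Reader[Conn, Int](c => c.getOrElse(account, 0))

  // Both lookups see the same context; nothing is returned alongside the result.
  val total: Reader[Conn, Int] =
    for {
      a <- balance("alice")
      b <- balance("bob")
    } yield a + b
}
```

In flatMap the same `c` is handed to both steps, so nothing in the types says one step must finish before the next starts. A State-like shape instead passes along the connection returned by the previous step, which at least suggests strict sequencing.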
Re: Re: questioning FP
>
>
> On Oct 15, 2011, at 3:27, Rex Kerr <ichoran [at] gmail [dot] com> wrote:
>
>>
>> That sort of works (although the type you've specified works only for input?
>
> A combinator for writing to the database just has type IO[Unit].
Okay, but that's just a marker.
>> Or you're using A to store state in the type system as well as wrap input?). What's not clear, however, is how this is superior to:
>
> I'm sure I could have solved it with an ad hoc solution like an AbstractBeanFactoryProxy or something simple like that, but the way I see it is that I needed nested dependencies with shared context, and that's what a monad is.
I did agree that it solved the problem.
>>> (or in a list over which you fold or map the connection) rather than Kleisli composition.
>
> That is Kleisli composition.
It's a special case, no? I thought that type transformations along the chain were part of the point of Kleisli composition. (Then again, given your type signature, it doesn't look like you were doing this.)
>> (I.e. do the same thing, but write methods that implement only java.sql.Connection => A;
>
> That's perfectly reasonable, and that's just the Reader monad. I wanted to use something more State-like to discourage parallelism.
Seems like it's just as easy to use or not use parallelism in each. If you're not updating the types, you can always grab something out of order and try to use it twice in parallel. java.sql.Connection isn't going to know any better.
--Rex
Re: Re: questioning FP
On Oct 15, 2011, at 14:51, Rex Kerr wrote:
>
> > A combinator for writing to the database just has type IO[Unit].
>
> Okay, but that's just a marker.
>
No.
Re: Re: questioning FP
Depends what you call a marker, I guess.
It does help you compose input and output operations into one.
It doesn't help you use input to create output.
It doesn't help you keep your output straight.
In normal use cases it doesn't do anything _different_ than methods that call other methods, but it might be slightly more convenient as you don't have to keep passing the same argument (but it's slightly less convenient because you have to keep dragging Unit along, unless you have an (A => A) => (A => (Unit,A)) handy, which is pretty easy to arrange).
So, okay, it's not _just_ a marker. It just doesn't seem to have novel properties. You need one more level of sophistication for that.
--Rex
Re: Re: questioning FP
Runar's example is using a state monad where the A is the type of the state. All of the functions composed using Kleisli composition in this case can participate in composing a new state, or not! (leaving the state the same).
From the Scalaz source, if you look at the definition of map and flatMap for State:
def map[B](f: A => B): State[S, B] = state(apply(_) match { case (s, a) => (s, f(a)) })
def flatMap[B](f: A => State[S, B]): State[S, B] = state(apply(_) match { case (s, a) => f(a)(s) })
Meaning each composed function can emit an output (in the above example: java.sql.Connection) and a potential update to the state. Depending on how the state monad is used you can either get the state or the final result (or both) at the end of the computation.
Now, if you think purely functionally then functions using the above signature have no way to get another connection to, well, anything. Pure functions produce outputs exclusively based on their inputs. In the case of functions running in a State monad those inputs are the function inputs and the accumulated state so far. Your fold example is less flexible than what Runar is talking about because it is required to produce an A, but Runar can do things like lift the identity function into the State and simply do a no-op. If you hack around your fold example and try to achieve the same level of flexibility while retaining a functional-programming idiom you will eventually stumble upon the State monad.
--
Jim Powers
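A self-contained sketch of the map and flatMap definitions quoted above, without depending on Scalaz. It uses the same (S, A) ordering as the quoted source; `get`, `modify`, `noop`, and `counter` are illustrative names, and `noop` shows the "lift identity and do nothing" step mentioned in the message.

```scala
object StateSketch {
  // Minimal State in the ordering quoted above: S => (S, A).
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] = State[S, B] { s =>
      val (s1, a) = run(s)
      (s1, f(a))
    }
    def flatMap[B](f: A => State[S, B]): State[S, B] = State[S, B] { s =>
      val (s1, a) = run(s)
      f(a).run(s1)
    }
  }

  def get[S]: State[S, S] = State[S, S](s => (s, s))
  def modify[S](f: S => S): State[S, Unit] = State[S, Unit](s => (f(s), ()))

  // Lifting identity: a step that takes part in the chain but leaves the state alone.
  def noop[S]: State[S, Unit] = modify[S](identity)

  val counter: State[Int, String] =
    for {
      _ <- modify[Int](_ + 1)
      _ <- noop[Int]
      n <- get[Int]
    } yield s"count=$n"
}
```

Running `counter` from an initial state yields both the final state and the result, so the caller can keep either or both.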
Re: Re: questioning FP
Jim Powers wrote:
State monad is S => (A, S), so the state here is Connection, which doesn't change (or at least, not in a way where you can distinguish I think), so it looks like A, the result, is produced without referential transparency (Runar, correct me please).
Re: Re: questioning FP
Well, actually, it is a state monad, and it depends on the version of Scalaz which order the result and the state are in:
Scalaz 6:
sealed trait State[S, +A] { def apply(s: S): (S, A)
Scalaz 7:
sealed trait StateT[S, F[_], A] { def apply(s: S): F[(A, S)] = ...
trait StateTs { type State[S, A] = StateT[S, Identity, A]
The order of the values returned in the tuple has no significance; you only need to be consistent. In Runar's example he puts the state in the second position instead of the first. No biggie.
--
Jim Powers
Re: Re: questioning FP
Jim Powers wrote:
I wasn't referring to order. The state is the argument to the function (Connection in this case). And you're supposed to return a new state instance (e.g., pop an element from a stack state). Otherwise, the function pulls a value from thin air ==> not referentially transparent. In Runar's example, I suspect that the function will return the same value when invoked again on the returned connection since its state is opaque.
Ittay
Re: Re: questioning FP
Ah, okay. Nothing wrong with using a state monad; it's a perfectly sensible way to build a finite state machine. I'm not that familiar with it, so I didn't recognize it from the signature.
Except again, it has nothing to do with IO. This is equally useful if you want to build an entirely immutable data structure where you have constraints on particular combinations of values in that structure; if said structure is from an external library, it's easier to build a state monad than to try other means of statefully wrapping the library.
I do not argue that state monads are unhelpful (overkill sometimes, but still helpful, and more flexible than a fold, I agree). I just argue that creating a state monad and labeling it "IO" is a strange thing to do; it has to do with linking values (state) and types. Whether or not the state is internal or external (i.e. IO) is immaterial, and in a particular application you want to know _which_ IO process it's helping you with, not that it has something to do with some IO process.
--Rex
Re: Re: questioning FP
State in the State Monad does not refer to a finite state machine, it is arbitrary state. For example:
import scalaz._
import Scalaz._
case class Foo(x: Int)
def plusOne: Foo => Foo = (f => f.copy(x = f.x + 1))
val plusTwo: State[Foo, Unit] = for {
  _ <- modify[Foo](plusOne)
  _ <- modify[Foo](plusOne)
} yield ()
plusTwo ~> Foo(0)
==> Foo(2)
You can now compose plusTwos together:
val plusFour: State[Foo, Unit] = for {
  _ <- plusTwo
  _ <- plusTwo
} yield ()
plusFour ~> Foo(0)
==> Foo(4)
--
Jim Powers
Re: Re: questioning FP
On Saturday, October 15, 2011 2:22:32 PM UTC-4, Jim Powers wrote:
It does. It's essentially a Mealy machine, which is characterized by the fact that outputs depend on both inputs and the current state. I.e. a state transition edge in a Mealy machine can be modeled with a function of type:
type Mealy[S, A, B] = (A, S) => (B, S)
Of course, if we curry the argument, this is just:
A => S => (B, S)
If we take the result type of this and wrap it:
case class State[S, B](apply: S => (B, S))
then a Mealy machine is just an arrow in the Kleisli category under the State monad:
A => State[S, B]
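As a runnable illustration of the construction above, here is a small sketch using the same (B, S) ordering. The `runningTotal` transition and the `feed` helper are hypothetical names introduced only for this example.

```scala
object MealySketch {
  // State with the (B, S) ordering used above.
  final case class State[S, B](run: S => (B, S)) {
    def flatMap[C](f: B => State[S, C]): State[S, C] = State[S, C] { s =>
      val (b, s1) = run(s)
      f(b).run(s1)
    }
    def map[C](f: B => C): State[S, C] = State[S, C] { s =>
      val (b, s1) = run(s)
      (f(b), s1)
    }
  }

  // A Mealy transition A => State[S, B]: output depends on the input and the current state.
  // Hypothetical example: emit the running total of the inputs seen so far.
  val runningTotal: Int => State[Int, Int] =
    in => State[Int, Int](total => (total + in, total + in))

  // Feed a list of inputs through the machine, collecting the outputs in order.
  def feed[S, A, B](step: A => State[S, B])(inputs: List[A]): State[S, List[B]] =
    inputs.foldRight(State[S, List[B]](s => (Nil, s))) { (a, rest) =>
      step(a).flatMap(b => rest.map(b :: _))
    }
}
```

`feed` is just repeated Kleisli-style chaining: each transition runs against the state left by the previous one.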
Re: Re: questioning FP
--Rex
On Sat, Oct 15, 2011 at 7:28 PM, Runar Bjarnason <runarorama [at] gmail [dot] com> wrote:
Re: Re: questioning FP
OK sure. I guess what I was going for is that State can meet all your stateful needs. It's not a thing for "special" State needs :-).
Then again, I don't know of any program that doesn't generate outputs based on inputs and current state.
--
Jim Powers
On Oct 15, 2011 7:28 PM, "Runar Bjarnason" <runarorama [at] gmail [dot] com> wrote:
Re: Re: questioning FP
Rex Kerr wrote:
Re: questioning FP
well spoken, thanks for writing this up! Especially this part I wholeheartedly agree with:
Functional programming is very useful and provides tools of great utility, the IO monad is simply not part of this set. And since we are on scala-debate, I may add that from the fact that the IO monad is actually necessary in purely functional languages I conclude that those languages—while interesting in an academic sense—are fundamentally broken in practice. I’m not saying that it is impossible to write practical software with them, it’s just not true that purely functional programming should be portrayed as the goal everyone should aspire to reach. And there is mounting evidence for the fact that Scala is sitting much closer to the sweet spot than most other established programming languages out there; this is carefully formulated to indicate that improvement is still possible ;-)
This mail has sparked a question in my mind: I do think that actors are a perfect way of encapsulating IO, it gives you all the goodies without spoiling the rest of the code base (which would naturally also be written using actors ;-) ). Can anybody explain what could be wrong with this intuition? Actually it is more than intuition, since I have implemented quite a few systems which use this approach. CS theoretical issues are also welcome.
Regards,
Roland
On Oct 14, 2011, at 22:28 , Rex Kerr wrote:
Roland Kuhn
Typesafe – Enterprise-Grade Scala from the Experts
twitter: @rolandkuhn
Re: Re: questioning FP
Video worth a thousand words? http://vimeo.com/25786102
--
Jim Powers
Re: Re: questioning FP
In many ways, though, it's _not_ like using Option. The main problem with nulls in Java--at least in my experience--is that some APIs would return null only when something disastrous happened, while others would return null in the normal course of events; if you didn't remember which was which, your program would die.
The reason to use null in the normal course of events is when a result may not be available, which is exactly what Option[X] captures explicitly. And Option not only is a marker that something might be null (that is slightly helpful but not _that_ helpful) but it gives you tools for dealing with data that might not be there (e.g. map). So you get the same concept you were using before except clearly instead of as a hack, and more powerful tools to deal with it. What's not to like?*
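A small sketch of the point about Option giving you tools rather than just a marker. The lookup table and the function names are hypothetical, chosen only for illustration.

```scala
object OptionSketch {
  // Hypothetical lookup table standing in for an API that may have no result.
  val ages: Map[String, Int] = Map("alice" -> 30)

  // Absence is part of the return type rather than a null the caller must remember.
  def ageOf(name: String): Option[Int] = ages.get(name)

  // map transforms the value only when it is present.
  def nextBirthday(name: String): Option[Int] = ageOf(name).map(_ + 1)

  // fold forces the caller to say what happens in both cases.
  def describe(name: String): String =
    nextBirthday(name).fold(s"$name: unknown")(a => s"$name turns $a")
}
```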
The situation with IO, however, is only analogous if people write APIs that opaquely perform IO into existing streams _and_ if IO gives you a way to avoid doing this. Personally, I have had hundreds, possibly thousands of problems caused by null-used-as-option, and I'm struggling to remember a single time where I had an unexpected-IO problem that was not introduced as part of a debugging process. Furthermore, I have yet to see an example where IO actually does something for me that is relevant to IO (like data type conversion or input validation or auto-closing streams or whatever) and which does often cause problems. There are certainly some wonderful capabilities out there with traverse (weird name, though!), but the wonderfulness isn't IO-dependent; it just makes working with an IO marker a bit less annoying.
"Here is a very awkward solution to a problem you don't really have--and look, with these nifty tricks we can make it only slightly awkward!" isn't a very good selling point for a capability. There are all sorts of things that help out with IO: building a FSM to make sure things happen in order, writing robust possibly even monadic converters, packaging output up first and then sending it out all in one go in something that manages resources, and so on.
Some of us are still having difficulty seeing the problem, so it's feeling like Hungarian notation in dynamic languages: yes, you can use it to annotate in information that the language itself does not, but the burden is sufficiently high for the benefit one receives that it is not widely regarded as a good practice (indeed, it's mostly regarded as a painful waste of time). Maybe we should annotate mutation also, and what is in units of meters, and so on; but I really doubt that IO[Mutated[Meter[Array[Double]]]] is going to illuminate more than it confuses. (In particular, if I want to change the units to Centimeters, getting at it is going to be a pain.)
An Array[Double] with IO with Mutable with Meter, where the traits are all virtual and only kept track of by the compiler, might be doable as a marker system, but there are a lot of corner cases to work through.
--Rex
* What's not to like is the performance overhead. Pity, but in critical cases one can just pay more attention and use null after all.