Tuesday, December 31, 2013

"Parallel and Concurrent Programming in Haskell" by Simon Marlow; O'Reilly Media

I want to start writing this, but it's been forever since I actually read this book. I'm concerned I won't be able to do it justice.

So here's the deal: I read this book once many months ago (June?), and I've been re-reading it since. Some of the details are blurry, but the gist of the book is still with me.

Without further ado, here's the review:



======
Review
======

Disclaimer: I'm reviewing this book under the O'Reilly Blogger Review program. (Though I ended up purchasing a hard copy afterwards anyway.)

This is The Book that sold me on Haskell for concurrent and parallel programming. Sure, I've read several articles on the benefits of functional languages for programming in the multi-core world, but those benefits didn't really sink in until I saw for myself how elegant the code could be.

In brief, the main benefits I got from reading this book were:

* Surveyed parallel programming (in Haskell)
* Surveyed concurrent programming (in Haskell)
* Saw the elegance of the approaches for myself
* Learned about laziness gotchas in parallel contexts
* Learned a bit about what's next and left to improve
* Learned what modules to turn to and watch when in need

I hope I never have to look at OpenCL or CUDA C++ again for parallel programming. The way Repa/Accelerate handles this is beautiful.

The chapters on concurrent programming showed me how much having concurrency primitives built into a language changes async programming. Having forkIO to run computations in their own lightweight threads, plus a scheduler in the run-time, makes it very convenient.
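
To make the forkIO point concrete, here is a minimal sketch of my own (not taken from the book). It forks one lightweight thread per range of numbers and collects each result through an MVar. The names sumRange and sumConcurrently are made up for illustration; forkIO and the MVar functions come from the base library.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical worker: sums a range and hands the result back through an MVar.
-- ($!) forces the sum inside the worker thread instead of handing back a thunk.
sumRange :: Int -> Int -> MVar Int -> IO ()
sumRange lo hi box = putMVar box $! sum [lo .. hi]

-- Fork one lightweight thread per range, then wait for every result.
sumConcurrently :: [(Int, Int)] -> IO Int
sumConcurrently ranges = do
  boxes <- mapM spawn ranges
  results <- mapM takeMVar boxes -- takeMVar blocks until the worker is done
  return (sum results)
  where
    spawn (lo, hi) = do
      box <- newEmptyMVar
      _ <- forkIO (sumRange lo hi box)
      return box

main :: IO ()
main = do
  total <- sumConcurrently [(1, 1000), (1001, 2000)]
  print total -- prints 2001000
```

Note that this shows concurrency (independent threads coordinating), not guaranteed parallel speed-up; whether the threads actually run on several cores depends on how the program is compiled and run.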

In sum, I highly recommend this book. 10/10, one of my top 10 books of 2013.

Saturday, December 14, 2013

10 Ways to Incorporate Haskell into a Modern, Functional, CS Curriculum

I had some time to spare while I was waiting for dinner to be ready. Dinner was being prepared by the local China Gate, and it would probably take on the order of 10 to 15 minutes. So I sat down in the lobby with a notepad in hand.

The topic on my mind: how could Haskell be incorporated into a modern CS curriculum? I decided to run as radically as I could with this, writing a rough draft of what such a curriculum might look like.

I won't make any arguments for or against functional programming here. I refer readers instead to papers like this one and this one, talks like this one or this one, and books like this one. This is an exercise in ideation.

Without further ado, let's begin:

0. Haskell for Introductory Computer Science

Imagine this is the first time you're learning programming. You've never been exposed to mutation, to functions, to compiling, algorithms, or any of the details of architecture. Where do we start?

Four weeks of Haskell. Enough to get through the first few chapters of LYAH and be able to start thinking about recursion. End each week with a simple exercise, for example, upper-casing a list of strings, upper-casing only the first letter of every string that is longer than 2 characters - little things to build confidence. The Udacity Introduction to Computer Science course has many appropriate ideas for beginning level exercises.
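
For a sense of scale, here is one possible solution to the two exercises mentioned above. It's a sketch, and the function names upperAll and capitalizeLong are my own invention.

```haskell
import Data.Char (toUpper)

-- Exercise 1: upper-case every string in a list.
upperAll :: [String] -> [String]
upperAll = map (map toUpper)

-- Exercise 2: upper-case only the first letter of every string
-- that is longer than 2 characters; leave shorter strings alone.
capitalizeLong :: [String] -> [String]
capitalizeLong = map capitalize
  where
    capitalize (c : rest) | length rest >= 2 = toUpper c : rest
    capitalize s = s

main :: IO ()
main = do
  print (upperAll ["hi", "haskell"])              -- ["HI","HASKELL"]
  print (capitalizeLong ["hi", "cat", "haskell"]) -- ["hi","Cat","Haskell"]
```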

With that introductory period out of the way, now's the time to show why computer science is relevant! Take the time to showcase the areas: operating systems, networking, algorithms, programming languages, cryptography, architecture, hardware, and more. Make it relevant:

* Operating systems: Why are there so many? What do they do? How does my application (email, browser, Steam) run from beginning to end?
* Algorithms: How do you figure out if two characters have collided in a video game? How do you sort a list of address contacts alphabetically by last name?
* Networking: How do you send an instant message or an email to a friend on the other side of the world?
* Programming languages: As with operating systems, why are there so many, and what does each one do?

There are many applicable introductory exercises here that can set the pace for future courses.

1. Haskell for Basic Algorithms

This one, and the later algorithms course on this "Top 10 List", deserve special attention.

Algorithms are fundamentally pure constructs. You give them a well-defined input, and receive a well-defined output.

Take plenty of time to provide weekly exercises. Teaching sorting algorithms, trees, string matching algorithms, and more will be a delight here, I predict.

It's also a good time to introduce basic run-time analysis, e.g., Big O notation.

This is also a beautiful time to introduce QuickCheck in conjunction with invariants.
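
Here's a rough sketch of what "algorithm plus invariants" might look like for a sorting exercise. The properties are written in the shape QuickCheck expects (functions from an input to Bool). To keep the sketch free of dependencies, I check them against a few hand-picked lists; with the QuickCheck package installed, one would instead import Test.QuickCheck and call quickCheck on each property to get randomly generated inputs. The names insertionSort, prop_sorted, prop_matchesLibrary, and samples are my own.

```haskell
import Data.List (sort)

-- Insertion sort as a pure function: same input, same output, every time.
insertionSort :: [Int] -> [Int]
insertionSort = foldr insert []
  where
    insert x [] = [x]
    insert x (y : ys)
      | x <= y = x : y : ys
      | otherwise = y : insert x ys

-- Invariant 1: the output is in non-decreasing order.
prop_sorted :: [Int] -> Bool
prop_sorted xs = and (zipWith (<=) ys (drop 1 ys))
  where
    ys = insertionSort xs

-- Invariant 2: the output agrees with the library sort.
prop_matchesLibrary :: [Int] -> Bool
prop_matchesLibrary xs = insertionSort xs == sort xs

-- Hand-picked inputs standing in for QuickCheck's random generation.
samples :: [[Int]]
samples = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]

main :: IO ()
main = print (all prop_sorted samples && all prop_matchesLibrary samples) -- True
```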

2. Haskell for Data Structures

Very similar to the basic algorithms course, except now we teach students about some basic ways to organize data. Lists, vectors, trees, hash maps, and graphs - these should be enough to keep most students (and practitioners) well-equipped for years!
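
As a taste of what an early exercise could be, here is a minimal, unbalanced binary search tree. This is only a sketch under my own naming (Tree, insert, member, toList, fromList); a real course would go on to balancing and to the library structures.

```haskell
-- An unbalanced binary search tree: smaller values left, larger values right.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Insert a value, ignoring duplicates.
insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x t@(Node l v r)
  | x < v = Node (insert x l) v r
  | x > v = Node l v (insert x r)
  | otherwise = t

-- Look a value up by walking down a single branch.
member :: Ord a => a -> Tree a -> Bool
member _ Leaf = False
member x (Node l v r)
  | x < v = member x l
  | x > v = member x r
  | otherwise = True

-- An in-order traversal yields the elements in sorted order.
toList :: Tree a -> [a]
toList Leaf = []
toList (Node l v r) = toList l ++ [v] ++ toList r

fromList :: Ord a => [a] -> Tree a
fromList = foldr insert Leaf

main :: IO ()
main = do
  let t = fromList [3, 1, 4, 1, 5, 9, 2, 6 :: Int]
  print (toList t)               -- [1,2,3,4,5,6,9]
  print (member 4 t, member 7 t) -- (True,False)
```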

QuickCheck and frequent programming exercises will do well here.

If an advanced version of this course is desired, I highly recommend starting from here to brainstorm a variant: Purely Functional Data Structures

3. Haskell for Networking

This can be very low-level (OSI network stack, TCP window buffering, etc.), it can be very high-level (HTTP, distributed systems), or some mix of the two.

I think the most important concepts students can come out of this course with would be:

* Validating data at the boundaries and the dangers/usefulness of IO
* How to communicate with other systems
* How to write their own protocols, and why in general, they shouldn't reinvent the wheel
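
To illustrate the first point, validating data at the boundary, here's a small sketch. The wire format ("PING" or "SEND <user> <text>") is invented purely for this example, as are the names Command and parseCommand. The idea is that raw text from the network is turned into a typed value once, at the edge, and everything past that point only ever sees a valid Command.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical protocol, made up for this sketch:
--   PING
--   SEND <user> <text...>
data Command = Ping | Send String String
  deriving (Eq, Show)

-- Parse one line of untrusted input into a Command, or explain why not.
parseCommand :: String -> Either String Command
parseCommand line = case words line of
  ["PING"] -> Right Ping
  ("SEND" : user : rest)
    | null rest -> Left "SEND needs a message body"
    | not (all isAlphaNum user) -> Left "user name must be alphanumeric"
    | otherwise -> Right (Send user (unwords rest))
  _ -> Left ("unrecognized command: " ++ line)

main :: IO ()
main =
  mapM_
    (print . parseCommand)
    ["PING", "SEND alice hello there", "SEND bob", "DROP TABLE"]
```

The first two inputs come back as Right values; the last two come back as Left values with an error message, so bad input never gets past the parser.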

4. Haskell for Operating Systems

Teach them to ask - what is an operating system? How do I manage my resources?

It's worth surveying the concepts of: memory management, file systems, data persistence, concurrency, parallelism, process management, task scheduling, and possibly a bit more.

Great projects in this course include: 1) write your own shell, 2) write a simple, local task manager.

5. Haskell for Comparative Programming Languages

Let's talk types, functions, composition, and problem solving using different approaches. Ideally, such a course would come after learning how to design solutions to mid-sized programming challenges.

After that, have students write an interpreter for a Lisp.
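
To give a feel for how small the core of such an interpreter can be, here's a sketch of an evaluator for arithmetic S-expressions. It skips parsing entirely and supports only a few operators; the names SExpr, eval, and apply are my own choices, and a real Lisp would add definitions, lambdas, and an environment.

```haskell
-- A tiny S-expression type: numbers, symbols, and lists.
data SExpr = Num Integer | Sym String | List [SExpr]
  deriving (Eq, Show)

-- Evaluate an expression, reporting errors as Left values.
eval :: SExpr -> Either String Integer
eval (Num n) = Right n
eval (Sym s) = Left ("unbound symbol: " ++ s)
eval (List (Sym op : args)) = do
  vals <- mapM eval args -- evaluate the arguments first
  apply op vals
eval (List _) = Left "expected (operator args...)"

-- Apply a built-in operator to already-evaluated arguments.
apply :: String -> [Integer] -> Either String Integer
apply "+" vals = Right (sum vals)
apply "*" vals = Right (product vals)
apply "-" [x] = Right (negate x)
apply "-" (x : rest) = Right (x - sum rest)
apply op _ = Left ("cannot apply operator: " ++ op)

main :: IO ()
main =
  -- Represents the Lisp expression (+ 1 (* 2 3)).
  print (eval (List [Sym "+", Num 1, List [Sym "*", Num 2, Num 3]])) -- Right 7
```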

6. Haskell for Compiler Construction

More on: write your own language. This course should cover parsing, lexical analysis, type analysis, and the conversion from source to assembly.
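
As a miniature of the "source to assembly" step, here's a sketch that compiles arithmetic expressions down to instructions for a made-up stack machine, along with a tiny runner for those instructions. Expr, Instr, compile, and run are all hypothetical names for this example; a real course would target a real instruction set.

```haskell
-- Source language: integer literals, addition, multiplication.
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

-- Target language: instructions for a simple stack machine.
data Instr = Push Int | AddI | MulI
  deriving (Eq, Show)

-- Compile in post-order: operands first, then the operator.
compile :: Expr -> [Instr]
compile (Lit n) = [Push n]
compile (Add a b) = compile a ++ compile b ++ [AddI]
compile (Mul a b) = compile a ++ compile b ++ [MulI]

-- Run a program; Nothing signals a malformed program (stack underflow
-- or leftover values).
run :: [Instr] -> Maybe Int
run = go []
  where
    go [result] [] = Just result
    go stack (Push n : is) = go (n : stack) is
    go (y : x : stack) (AddI : is) = go (x + y : stack) is
    go (y : x : stack) (MulI : is) = go (x * y : stack) is
    go _ _ = Nothing

main :: IO ()
main = do
  let program = compile (Add (Lit 1) (Mul (Lit 2) (Lit 3)))
  print program       -- [Push 1,Push 2,Push 3,MulI,AddI]
  print (run program) -- Just 7
```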

7. Haskell for Advanced Algorithms

This one's going to be fun. Unleash the power of equational reasoning to put together a course that runs through: graph theory, network flow analysis, greedy algorithms, memoization, and more.

This would also be a great time to discuss the price one pays for purity with regard to asymptotic performance, and how to overcome that, if necessary.

Also, an extended treatment of algorithmic analysis in the presence of laziness would be valuable here.
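
Memoization and laziness meet nicely in the classic Fibonacci example, sketched below. The naive version recomputes the same subproblems over and over; the second version leans on a lazily built list so that each element is computed at most once. The names slowFib, fibs, and memoFib are mine.

```haskell
-- Naive recursion: exponential time, since subproblems are recomputed.
slowFib :: Int -> Integer
slowFib n
  | n < 2 = toInteger n
  | otherwise = slowFib (n - 1) + slowFib (n - 2)

-- A lazily built, self-referential list. Each element is computed at most
-- once and then shared, which is memoization by way of laziness.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)

-- Indexing walks the list, so a lookup costs time linear in n.
memoFib :: Int -> Integer
memoFib n = fibs !! n

main :: IO ()
main = do
  print (slowFib 20) -- 6765
  print (memoFib 90) -- far out of practical reach for slowFib
```

The linear-time lookup is a small instance of the purity trade-off mentioned above: a mutable array would give constant-time access, and part of the course could be about when that difference matters.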

8. Haskell for Introductory Design of Programs

Really, this should be higher up in this list, and a very early course.

The goal of this course is to come out of it knowing how to:

* Get a big picture of what they're trying to build
* Break it down into the smaller pieces
* Iterate 'til they get the big picture running

It's a great time to teach some basic ideas for testing, how to experiment with the REPL, and how to take advantage of the type system for simple things.

On a more social level, it's a wonderful time to also guide students towards collaborative design, e.g., how to work together and make it fun and efficient.

9. Haskell for High-Performance Computing

This could be a very fun course. It affords the opportunity to allow students to run wild with simulations and experiments of their choosing, while learning about what it means to do high-performance computing in a functional language.

Given that, it should teach some basic tools that will be applicable to most or all projects. How does one benchmark well? What are the evils of optimization? What is over-optimization? When is optimization needed? What tools exist right now to harness parallelism in Haskell (Repa, Accelerate, etc.)? When is data distribution needed? Why is parallelism important? How is parallelism different from concurrency? How can the type system be wielded to help keep units (km/s, etc.) consistent across calculations?
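
On the last question, the simplest approach I know of is to wrap each unit in its own newtype, so that mixing them up becomes a compile-time error. Here's a minimal sketch; the type and function names are made up for illustration, and there are dedicated libraries that take this idea much further.

```haskell
-- One wrapper type per unit; all of them are plain Doubles at run-time.
newtype Kilometers = Kilometers Double deriving (Eq, Show)
newtype Seconds = Seconds Double deriving (Eq, Show)
newtype KmPerSecond = KmPerSecond Double deriving (Eq, Show)

-- The type signatures document, and enforce, which units go where.
speed :: Kilometers -> Seconds -> KmPerSecond
speed (Kilometers d) (Seconds t) = KmPerSecond (d / t)

distance :: KmPerSecond -> Seconds -> Kilometers
distance (KmPerSecond v) (Seconds t) = Kilometers (v * t)

main :: IO ()
main = do
  let v = speed (Kilometers 150) (Seconds 60)
  print v                         -- KmPerSecond 2.5
  print (distance v (Seconds 10)) -- Kilometers 25.0
  -- speed (Seconds 60) (Kilometers 150) would be rejected by the type checker.
```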

I'd advocate for letting students choose their own projects built in parallel with the course taking place. A simple default is to optimize the much-lauded matrix multiply to larger and larger scales (distributed even, if they want to go so far!). Writing a collision detection engine for a simple game would be pretty interesting, as well.
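
For the matrix multiply default, the starting point could be as plain as the list-based sketch below (Matrix and multiply are my own names). It is a correctness baseline and nothing more: it's the thing students would benchmark first and then try to beat with better data layouts and parallelism.

```haskell
import Data.List (transpose)

-- A matrix as a list of rows. Simple to read, not built for speed.
type Matrix = [[Int]]

-- Each output cell is the dot product of a row of a and a column of b.
-- Assumes the number of columns in a equals the number of rows in b.
multiply :: Matrix -> Matrix -> Matrix
multiply a b = [[sum (zipWith (*) row col) | col <- transpose b] | row <- a]

main :: IO ()
main = print (multiply [[1, 2], [3, 4]] [[5, 6], [7, 8]]) -- [[19,22],[43,50]]
```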

Notably (in my opinion) absent topics:

* Hardware: CPUs, cache hierarchies, main memory, storage, interconnects
* Advanced data structures
* Cryptography
* Web development
* Type systems
* Cross-disciplinary development, e.g., Haskell for CS + [Physics, Chemistry, Biology, etc.]

These topics are absent from my list for no reason other than I didn't think of them 'til the list was done and articulated. There's so much one can learn and apply at the level of abstraction that computer science (and mathematics) affords that we could specialize further and further. For my own sake, I'm setting a limit. :)

Final Notes

I've just brainstormed a curriculum in Haskell. There are a lot of details missing, but it's a working draft.

There are also other things to consider, beyond the technical aspects. Consider the social aspects. How do we teach students to work together? How do we keep learning engaging and fun? How do we help students connect to the greater community of developers that exists outside of academia? How do we keep the lessons relevant to the lives that they lead? How do we encourage them to pursue their passions?

To Be Honest...

I had forgotten that I ramped up this blog. It's been so long since I've looked at it, and I only came across it again since I've been actively considering writing. I was surprised to see that people were still periodically coming here over the past few months, even though it's been nearly six months since I last posted (thanks!).

I've been spending more time publishing mini-blogs over at https://thoughtstreams.io/. I have three categories so far: "Marconi Updates", "Haskell", and "Philosophy Behind Software, Love, and Life".

Thoughtstreams is a very interesting platform. More than any other blogging engine I've ever used, it tries not to get in the way. I just write, as little or as much as I like, and post. It even supports Markdown syntax, with basic support for code blocks! I've felt more compelled to write a little every day using that. It helps!

Marconi Updates is my micro-blog (shared with flaper87 and kgriffs). Herein, I write about our efforts on the OpenStack Marconi project, a distributed queuing system written entirely in Python. Most of my updates are storage-centric, since that's one of the more interesting aspects of the project to me.

Haskell is all about Haskell! I've developed a sort of infatuation with the language over the past three years, even though I've yet to use it to develop personal or professional projects. The appeal of a functional (double-meaning there), statically-typed language with elegant, terse syntax really gets the language geek in me going. I've found this channel to be a great place to gush about the language, tidbits I find interesting, and what I hope to do with it.

Philosophy Behind Software, Love, and Life is a new experiment, shared with flaper87. In this microblog, I want to share more personal thoughts and beliefs. Thus far, I've written about mindfulness and about some of the personal struggles I've encountered in working with technology. I hope to share more of my personal story therein, and how it guides my beliefs on my journey through this world.

So that's the story. I think I isolated this blog to be too much about reviewing books in the past, and that really damaged my morale about posting here. In the future, I may discontinue this blog entirely and move over to a service like ThoughtStreams, or I may even spin up my own blog! For the time being, though, books need reviewing, I love writing, and I haven't made the time to designate any other place as my primary review blog spot.