Saturday, December 14, 2013

10 Ways to Incorporate Haskell into a Modern, Functional, CS Curriculum

I had some time to spare while I was waiting for dinner to be ready. Dinner was being prepared by the local China Gate, and it would probably take on the order of 10 to 15 minutes. So I sat down in the lobby with a notepad in hand.

The topic on my thoughts: how could Haskell be incorporated into a modern CS curriculum? I decided to run as radically as I could with this, writing a rough draft of what such a curriculum might look like.

I won't make any arguments for or against functional programming here. I refer readers instead to papers like this one and this one, talks like this one or this one, and books like this one. This is an exercise in ideation.

Without further ado, let's begin:

0. Haskell for Introductory Computer Science

Imagine this is the first time you're learning programming. You've never been exposed to mutation, to functions, to compiling, algorithms, or any of the details of architecture. Where do we start?

Four weeks of Haskell. Enough to get through the first few chapters of LYAH and be able to start thinking about recursion. End each week with a simple exercise, for example, upper-casing a list of strings, upper-casing only the first letter of every string that is longer than 2 characters - little things to build confidence. The Udacity Introduction to Computer Science course has many appropriate ideas for beginning level exercises.

With that introductory period out of the way, now's the time to show why computer science is relevant! Take the time to show case the areas: operating systems, networking, algorithms, programming languages, cryptography, architecture, hardware, and more. Make it relevant:

* Operating systems: Why are there so many? What does it do? How does my application (email, browser, Steam) run from beginning to end?
* Algorithms: How do you figure out if two characters have collided in a video game? How do you sort a list of address contacts alphabetically by last name?
* Networking: How do you send an instant message or an email to a friend on the other side of the world?
* Programming Languages: As with operating systems.

There are many applicable introductory exercises here that can set the pace for future courses.

1. Haskell for Basic Algorithms

This one, and the latter algorithms course on this "Top 10 List", deserve special attention.

Algorithms are fundamentally pure constructs. You give them a well-defined input, and receive a well-defined output

Take plenty of time to provide weekly exercises. Teaching sorting algorithms, trees, string matching algorithms, and more will be a delight here, I predict.

It's also a good time to introduce basic run-time analysis, e.g., Big O notation.

This is also a beautiful time to introduce QuickCheck in conjunction with invariants.

2. Haskell for Data Structures

Very similar to the basic algorithms course, except now we teach students about some basic ways to organize data. Lists, vectors, trees, hash maps, and graphs - these should be enough to keep most students (and practitioners) well-equipped for years!

QuickCheck and frequent programming exercises will do well here.

If an advanced version of this course is desired, I highly recommend starting from here to brainstorm a variant: Purely Functional Data Structures

3. Haskell for Networking

This can be very low-level (OSI network stack, TCP window buffering, etc.), it can be very high-level (HTTP, distributed systems), or some mix of the two.

I think the most important concepts students can come of this course with would be:

* Validating data at the boundaries and the dangers/usefulness of IO
* How to communicate with other systems
* How to write their own protocols, and why in general, they shouldn't reinvent the wheel

4. Haskell for Operating Systems

Teach them to ask - what is an operating system? How do I manage my resources?

It's worth surveying the concepts of: memory management, file systems, data persistence, concurrency, parallelism, process management, task scheduling, and possibly a bit more.

Great projects in this course include: 1) write your own shell, 2) write a simple, local task manager.

5. Haskell for Comparative Programming Languagues

Let's talk types, functions, composition, and problem solving using different approaches. Ideally, such a course would come after learning how to design solutions to mid-sized programming challenges.

After that, have students write an interpreter for a Lisp.

6. Haskell for Compiler Construction

More on: write your own language. This course should cover parsing, lexical analysis, type analysis, and the conversion from source to assembly.

7. Haskell for Advanced Algorithms

This one's going to be fun. Unleash the power of equational reasoning to put together a course that runs through: graph theory, network flow analysis, greedy algorithms, memoization, and more.

This would also be a great time to discuss how the price one pays for purity in regards to asymptotic performance, and how to overcome that, if necessary.

Also, an extended treatment of algorithmic analysis in the presence of laziness would be valuable here.

8. Haskell for Introductory Design of Programs

Really, this should be higher up in this list, and a very early course.

The goal of this course is to come out of it knowing how to:

* Get a big picture of what they're trying to build
* Break it down into the smaller pieces
* Iterate 'til they get the big picture running

It's a great time to teach some basic ideas for testing, how to experiment with the REPL, and how to take advantage of the type system for simple things.

On a more social level, it's a wonderful time to also guide students towards collaborative design, e.g., how to work together and make it fun and efficient.

9. Haskell for High-Performance Computing

This could be a very fun course. It affords the opportunity to allow students to run wild with simulations and experiments of their choosing, while learning about what it means to do high-performance computing in a functional language.

Given that, it should teach some basic tools that will be applicable to most or all projects. How does one benchmark well? What are the evils of optimization? What is over-optimization? When is optimization needed? What tools exist right now to harness parallelism in Haskell (Repa, Accelerate, etc.)? When is data distribution needed? Why is parallelism important? How is parallelism different than concurrency? How can the type system be wielded to help keep units (km/s, etc.) consistent across calculations?

I'd advocate for letting students choose their own projects built in parallel with the course taking place. A simple default is to optimize the much-lauded matrix multiply to larger and larger scales (distributed even, if they want to go so far!). Writing a collision detection engine for a simple game would be pretty interesting, as well.

Notably (in my opinion) absent topics:

* Hardware: CPUs, cache hierarchies, main memory, storage, interconnects
* Advanced data structures
* Cryptography
* Web development
* Type systems
* Cross-disciplinary development, e.g., Haskell for CS + [Physics, Chemistry, Biology, etc.]

These topics are absent from my list for no reason other than I didn't think of them 'til the list was done and articulated. There's so much one can learn and apply at the level of abstraction that computer science (and mathematics) affords that we could specialize further and further. For my own sake, I'm setting a limit. :)

Final Notes

I've just brain-stormed a curriculum in Haskell. There's a lot of details missing, but it's a working draft.

There's also other things to consider, beyond the technical aspects. Consider the social aspects. How we teach students to work together? How do we keep learning engaging and fun?  How do we help students connect to the greater community of developers that exist outside of academia? How do we keep the lessons relevant to the lives that they lead? How do we encourage them to pursue their passions?

2 comments:

  1. As an alternative, consider researching the ways in which programming has been integrated into other curricula, especially math and science. As an example, check out the previous work of Dr. Rex Page at the University of Oklahoma.

    ReplyDelete
  2. Jim,

    Thanks for sharing this information. I'm rather curious about how to combine disciplines rather than segregate them further, so this is interesting to me. I think taking away these barriers between disciplines will make any material taught immediately more relevant, and therefore, more engaging.

    I'll take a look into Dr. Rex Page's work, all the more so since I've been amused by the front page quote: "Procedural programming is a
    monumental waste of human energy. Save energy! Submit a paper to TFP 2014."

    ReplyDelete