Tuesday, December 31, 2013

"Parallel and Concurrent Programming in Haskell" by Simon Marlow; O'Reilly Media;

I want to start writing this, but it's been forever since I actually read this book. I'm concerned I won't be able to do it justice.

So here's the deal: I've read this book once many months ago (June?), and I've been re-reading it since. Some of the details are blurry, but the gist of the book is still with me.

Without further ado, here's the review:



======
Review
======

Disclaimer: I'm reviewing this book under the O'Reilly Blogger Review program. (Though I ended up purchasing a hard copy afterwards anyway.)

This is The Book that sold me on Haskell for concurrent and parallel programming. Sure, I've read several articles on the benefits of functional languages for programming in the multi-core world, but that didn't really sink in until I saw how elegant it could be in a functional language.

In brief, the main benefits I got from reading this book were:

* Surveyed parallel programming (in Haskell)
* Surveyed concurrent programming (in Haskell)
* Saw the elegance of the approaches for myself
* Learned about laziness gotchas in parallel contexts
* Learned a bit about what's next and left to improve
* Learned what modules to turn to and watch when in need

I hope I never have to look at OpenCL or CUDA C++ again for parallel programming. The way Repa/Accelerate handles this is beautiful.

The chapters on concurrent programming showed me how much having concurrency primitives built into a language changes async programming. Having forkIO to spawn computations and a scheduler in the runtime makes it very convenient.
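To make that concrete, here's a minimal sketch of my own (not an example from the book) of forkIO handing a result back to the main thread through an MVar:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  box <- newEmptyMVar
  -- forkIO spawns a lightweight thread; the GHC runtime schedules it for us
  _ <- forkIO $ putMVar box (sum [1 .. 1000000 :: Integer])
  -- takeMVar blocks the main thread until the worker fills the MVar
  total <- takeMVar box
  print total
```

The book goes much further, of course, building richer abstractions (STM, Async) on top of primitives like these.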

In sum, I highly recommend this book. 10/10, one of my top 10 books of 2013.

Saturday, December 14, 2013

10 Ways to Incorporate Haskell into a Modern, Functional, CS Curriculum

I had some time to spare while I was waiting for dinner to be ready. Dinner was being prepared by the local China Gate, and it would probably take on the order of 10 to 15 minutes. So I sat down in the lobby with a notepad in hand.

The topic on my mind: how could Haskell be incorporated into a modern CS curriculum? I decided to run as radically as I could with this, writing a rough draft of what such a curriculum might look like.

I won't make any arguments for or against functional programming here. I refer readers instead to papers like this one and this one, talks like this one or this one, and books like this one. This is an exercise in ideation.

Without further ado, let's begin:

0. Haskell for Introductory Computer Science

Imagine this is the first time you're learning programming. You've never been exposed to mutation, functions, compilation, algorithms, or any of the details of computer architecture. Where do we start?

Four weeks of Haskell. Enough to get through the first few chapters of LYAH and be able to start thinking about recursion. End each week with a simple exercise - for example, upper-casing a list of strings, or upper-casing only the first letter of every string that is longer than 2 characters - little things to build confidence. The Udacity Introduction to Computer Science course has many appropriate ideas for beginning-level exercises.
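As an illustration (my own sketch, not from any syllabus), those two exercises stay pleasantly small in Haskell:

```haskell
import Data.Char (toUpper)

-- Exercise 1: upper-case every string in a list
upperAll :: [String] -> [String]
upperAll = map (map toUpper)

-- Exercise 2: upper-case the first letter of each string longer than 2 characters
capFirstLong :: [String] -> [String]
capFirstLong = map cap
  where
    cap s@(c:cs) | length s > 2 = toUpper c : cs
    cap s                       = s

main :: IO ()
main = do
  print (upperAll ["hi", "there"])      -- ["HI","THERE"]
  print (capFirstLong ["hi", "there"])  -- ["hi","There"]
```

Small enough to fit on a whiteboard, but each one exercises recursion-by-proxy (map), pattern matching, and guards.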

With that introductory period out of the way, now's the time to show why computer science is relevant! Take the time to showcase the areas: operating systems, networking, algorithms, programming languages, cryptography, architecture, hardware, and more. Make it relevant:

* Operating systems: Why are there so many? What do they do? How does my application (email, browser, Steam) run from beginning to end?
* Algorithms: How do you figure out if two characters have collided in a video game? How do you sort a list of address contacts alphabetically by last name?
* Networking: How do you send an instant message or an email to a friend on the other side of the world?
* Programming Languages: As with operating systems.

There are many applicable introductory exercises here that can set the pace for future courses.

1. Haskell for Basic Algorithms

This one, and the later algorithms course on this "Top 10 List", deserve special attention.

Algorithms are fundamentally pure constructs: you give them a well-defined input and receive a well-defined output.

Take plenty of time to provide weekly exercises. Teaching sorting algorithms, trees, string matching algorithms, and more will be a delight here, I predict.

It's also a good time to introduce basic run-time analysis, e.g., Big O notation.

This is also a beautiful time to introduce QuickCheck in conjunction with invariants.
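For instance (a sketch of mine, assuming the QuickCheck package is available), two invariants of any sort function practically write themselves as properties:

```haskell
import Data.List (sort)
import Test.QuickCheck (quickCheck)

-- Invariant 1: the output is in non-decreasing order
prop_ordered :: [Int] -> Bool
prop_ordered xs = ordered (sort xs)
  where
    ordered (x:y:rest) = x <= y && ordered (y:rest)
    ordered _          = True

-- Invariant 2: sorting never adds or drops elements
prop_sameLength :: [Int] -> Bool
prop_sameLength xs = length (sort xs) == length xs

main :: IO ()
main = do
  quickCheck prop_ordered      -- QuickCheck generates random lists for each property
  quickCheck prop_sameLength
```

Students state the invariant once, and QuickCheck hammers it with generated inputs - a very different habit than hand-picking a few test cases.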

2. Haskell for Data Structures

Very similar to the basic algorithms course, except now we teach students about some basic ways to organize data. Lists, vectors, trees, hash maps, and graphs - these should be enough to keep most students (and practitioners) well-equipped for years!

QuickCheck and frequent programming exercises will do well here.

If an advanced version of this course is desired, I highly recommend starting from here to brainstorm a variant: Purely Functional Data Structures

3. Haskell for Networking

This can be very low-level (OSI network stack, TCP window buffering, etc.), it can be very high-level (HTTP, distributed systems), or some mix of the two.

I think the most important concepts students can come out of this course with would be:

* Validating data at the boundaries and the dangers/usefulness of IO
* How to communicate with other systems
* How to write their own protocols, and why, in general, they shouldn't reinvent the wheel

4. Haskell for Operating Systems

Teach them to ask - what is an operating system? How do I manage my resources?

It's worth surveying the concepts of: memory management, file systems, data persistence, concurrency, parallelism, process management, task scheduling, and possibly a bit more.

Great projects in this course include: 1) write your own shell, 2) write a simple, local task manager.

5. Haskell for Comparative Programming Languages

Let's talk types, functions, composition, and problem solving using different approaches. Ideally, such a course would come after learning how to design solutions to mid-sized programming challenges.

After that, have students write an interpreter for a Lisp.

6. Haskell for Compiler Construction

More on: write your own language. This course should cover parsing, lexical analysis, type analysis, and the conversion from source to assembly.

7. Haskell for Advanced Algorithms

This one's going to be fun. Unleash the power of equational reasoning to put together a course that runs through: graph theory, network flow analysis, greedy algorithms, memoization, and more.

This would also be a great time to discuss the price one pays for purity with regard to asymptotic performance, and how to overcome it, if necessary.

Also, an extended treatment of algorithmic analysis in the presence of laziness would be valuable here.

8. Haskell for Introductory Design of Programs

Really, this should be higher up in this list, and a very early course.

The goal of this course is to come out of it knowing how to:

* Get a big picture of what they're trying to build
* Break it down into the smaller pieces
* Iterate 'til they get the big picture running

It's a great time to teach some basic ideas for testing, how to experiment with the REPL, and how to take advantage of the type system for simple things.

On a more social level, it's a wonderful time to also guide students towards collaborative design, e.g., how to work together and make it fun and efficient.

9. Haskell for High-Performance Computing

This could be a very fun course. It affords the opportunity to allow students to run wild with simulations and experiments of their choosing, while learning about what it means to do high-performance computing in a functional language.

Given that, it should teach some basic tools that will be applicable to most or all projects. How does one benchmark well? What are the evils of optimization? What is over-optimization? When is optimization needed? What tools exist right now to harness parallelism in Haskell (Repa, Accelerate, etc.)? When is data distribution needed? Why is parallelism important? How is parallelism different from concurrency? How can the type system be wielded to keep units (km/s, etc.) consistent across calculations?

I'd advocate for letting students choose their own projects built in parallel with the course taking place. A simple default is to optimize the much-lauded matrix multiply to larger and larger scales (distributed even, if they want to go so far!). Writing a collision detection engine for a simple game would be pretty interesting, as well.

Notably (in my opinion) absent topics:

* Hardware: CPUs, cache hierarchies, main memory, storage, interconnects
* Advanced data structures
* Cryptography
* Web development
* Type systems
* Cross-disciplinary development, e.g., Haskell for CS + [Physics, Chemistry, Biology, etc.]

These topics are absent from my list for no reason other than I didn't think of them 'til the list was done and articulated. There's so much one can learn and apply at the level of abstraction that computer science (and mathematics) affords that we could specialize further and further. For my own sake, I'm setting a limit. :)

Final Notes

I've just brainstormed a curriculum in Haskell. There are a lot of details missing, but it's a working draft.

There are also other things to consider beyond the technical aspects. Consider the social aspects. How do we teach students to work together? How do we keep learning engaging and fun? How do we help students connect to the greater community of developers that exists outside of academia? How do we keep the lessons relevant to the lives that they lead? How do we encourage them to pursue their passions?

To Be Honest...

I had forgotten that I ramped up this blog. It's been so long since I've looked at it, and I only came across it again since I've been actively considering writing. I was surprised to see that people were still periodically coming here over the past few months, even though it's been nearly six months since I last posted (thanks!).

I've been spending more time publishing mini-blogs over at https://thoughtstreams.io/. I have three categories so far: "Marconi Updates", "Haskell", and "Philosophy Behind Software, Love, and Life".

Thoughtstreams is a very interesting platform. More than any other blogging engine I've ever used, it tries not to get in the way. I just write, as little or as much as I like, and post. It even supports Markdown syntax, with basic support for code blocks! I've felt more compelled to write a little every day using that. It helps!

Marconi Updates is my micro-blog (shared with flaper87 and kgriffs). Herein, I write about our efforts on the OpenStack Marconi project, a distributed queuing system written entirely in Python. Most of my updates are storage-centric, since that's one of the more interesting aspects of the project to me.

Haskell is all about Haskell! I've developed a sort of infatuation with the language over the past three years, even though I've yet to use it to develop personal or professional projects. The appeal of a functional (double meaning there), statically-typed language with elegant, terse syntax really gets the language geek in me going. I've found this channel to be a great place to gush about the language, tidbits I find interesting, and what I hope to do with it.

Philosophy Behind Software, Love, and Life is a new experiment, shared with flaper87. In this microblog, I want to share more personal thoughts and beliefs. Thus far, I've written about mindfulness and about some of the personal struggles I've encountered in working with technology. I hope to share more of my personal story therein, and how it guides my beliefs on my journey through this world.

So that's the story. I think I pigeonholed this blog into being too much about reviewing books in the past, and that really damaged my morale about posting here. In the future, I may discontinue this blog entirely and move over to a service like ThoughtStreams, or I may even spin up my own blog! For the time being, though, books need reviewing, I love writing, and I haven't made the time to designate any other place as my primary review blog spot.

Wednesday, August 21, 2013

"ZeroMQ: Messaging for Many Applications" by Pieter Hintjens; O'Reilly Media;

Building Distributed Systems and Communities

I was pleasantly surprised by this book. I came in expecting to learn about the magic of ZeroMQ, and I came out knowing not only more about ZeroMQ itself, but also distributed systems and community building.

The first chapter primes the rest of the text: it not only prepares the reader to learn more about the world of distributed computing, but also begins instilling a certain excitement about how things could be so much easier, and how ZeroMQ enables this. This is surprising, as very few technical texts manage to become page-turners so quickly.

The next four chapters are a deep dive into ZeroMQ (and distributed systems) best practices. Heartbeats, publish/subscribe, request/reply, round-robin task distribution, and then composing all of these patterns and more. It's a collection of best practices without seeming like a dictionary. I could spend a few months practicing and studying these best practices and I'd feel like I knew more about how to build reliable systems.

The last half of the book is all about how to build communities and processes that last. Building distributed systems requires effective communication. There's a wealth of knowledge in the last portion of the book about how to enable that. Some great advice includes:

- separate maintainers from contributors
- use GitHub (which can be interpreted as: make issue tracking, persistent comments, and code review *really* easy)

A good chunk of the community building advice is available as the ZMQ C4 guide. It's elaborated on much more deeply in this book, which I appreciate. C4 is the contract; this book is the rationale.

If you're interested in or need to build a system that's highly concurrent, has high reliability and/or performance requirements, and even if you can't use ZMQ to build said system, get this book anyway. You'll learn a lot in the process that you can apply to making network (and team) communication much more effective.

Monday, July 22, 2013

Sarah Mei. The Insufficiency of Good Design. Ruby Conf 2012.

Link: https://www.youtube.com/watch?v=UgrVdHYEZGg

Watch this at some point. If you work with people, even if you're not in the software development industry, you'll benefit from the ideas shared in this video.

I learned about this video today from a colleague. It turns out there's a lot of wisdom in it. It actually surprised me quite a few times. According to the video, which in turn cites a Carnegie Mellon University study conducted in 2007, the best predictor of quality in code is...

Good Communication

This trumps technical qualifications and domain-specific knowledge. I tried hard to find the exact paper Sarah spoke of, but the closest I could come to that study was this article published in 1999. As a side note, many of the papers listed on the parent page are very interesting from an organizational point of view. *bookmarked*

Before I go further, allow me to quote Conway's Law:

Any organization that designs a system will inevitably produce a design whose structure is a copy of the organization's communication structure.

Studies seem to indicate that this is real - real as in "take a look at projects you've worked on and see for yourself" real.

So given that Conway's Law is something that is affecting you right now, it means that you can actually debug your communication patterns. This was one of the surprising points Sarah brought up. It turns out that by reflecting on code that's been written, and "code smells" that keep coming up, you can determine missing links in communication. You could continue to solve those code smells. You could refactor them all away - once per sprint, even! However, that's only addressing the symptoms. The underlying problem is more likely one of communication, and if you address that, the anti-pattern in the code should go away. A lot of conjecture here - take-home experiments?

To sum it up:
Every piece of bad code is trying to tell you something

It's up to you to listen. 

"21st Century C" by Ben Klemens; O'Reilly Media;

Disclaimer: I've received this book as a part of the O'Reilly Blogger Review Program.

In 21st Century C, Ben Klemens sets out to show you how much C has changed over the past two decades. Promising to show you how to make the most of modern C, he takes the reader on a tour of the language, including features added by recent standards, and indicates which features we should stop using.



Verdict: [||||||    ] (6 out of 10)

tl;dr - I wouldn't recommend this book in general. If you're looking for idiomatic C, and you're experienced enough with C to know what works and what doesn't, and are able to make judgment on a case-by-case basis with regards to the author's advice, you might be able to find a few interesting gems in here.

For the reader interested in a more detailed review:

I found Klemens' writing to be entertaining. The references to punk rock became a little groan-inducing after some time, but groans are better than snores - I wasn't put to sleep by the writing. His character definitely shows through in his writing. I appreciate this, as all too many textbooks suffer from the formalism-first problem. Readers are humans, and Klemens addresses this well.

The tour of C history was also very interesting. The coverage of C from inception up to the latest standard (C11) and how things have changed since then was especially valuable to me. I even appreciated the mild coverage that was given regarding C on Windows, where C really seems to be an afterthought.

One more good point before I jump into the criticisms - the chapter covering pointer semantics was solid. Klemens explains how to use them, several of the associated gotchas, and what to look out for. Succinct, clear, and full of examples.

Things are not so bright everywhere in the book. One particular section that concerned me was in Chapter 1: The Unified Header. Klemens contends that every project should use an "allheaders.h" because "Your compiler doesn't think 2,400 lines is a big deal any more, and this compiles in under a second". While I agree that compilation speed won't be a problem, the unified header approach has a few problems that are more troublesome for long-term maintenance. First and foremost, since C has very limited support for namespacing, it will be very difficult for newcomers to your C library to determine where each imported function and structure comes from. And when it comes time to refactor, the process will likely be harder with a unified header than with modules clearly separated by purpose, relationship, and abstraction level.

I leave you with this final note - one of the strengths of this book is that it is very opinionated. Unfortunately, this is also this book's greatest weakness. Take every claim made within with a grain of salt and you should emerge wiser.

Friday, June 14, 2013

"Python Cookbook (2013)" by David Beazley, Brian K. Jones; O'Reilly Media;

I recently purchased the Python Cookbook. The opinions below are all mine. This wasn't a gift, or an item gifted to me for review purposes - I really wanted this book!

Read more below to see why.

Python 3: The Power is Yours!


This book is an amazing resource for making the most of Python 3. It makes apparent what you're missing out on in the land of Python 2, and shows what can be accomplished more concisely in Python 3.

For example, as early as chapter 1, you're introduced to advanced unpacking techniques introduced in Python 3:

head, *tail = elems

The equivalent in Python 2 is:

head, tail = elems[0], elems[1:]

The chapter on generators and iteration techniques alone is worth the price of this book. It shows you how to "Loop Like a Native", all the way through to using partial function application combined with lambdas to lazily loop over the contents of a file, ending when there's nothing left to read. Elegant!

There are more gems here, but I'll leave them for you to discover. The authors did a wonderful job of putting this book together, and I highly recommend this to anyone interested in becoming an effective Pythonista!

Thursday, March 14, 2013

A Dry Spell, Some New Books

It's been a while, readers. Sometimes things get so busy that finding the time to write a few sentences seems like more than I can afford! Here's a few quick points:

Joy of Clojure 2e: I reviewed this while it was still in early release format. It looks like a pretty solid book. I feel like I would've gotten more out of it if I had more experience with the JVM platform and Lisps overall.

New:

  • ZeroMQ, O'Reilly Media: It looks like a very promising read. The technology has been around for awhile, and I've seen it come up more than once in casual technical discussion. I grabbed a copy of the book and I'll be reading up on it soon.
  • Clojure Programming, O'Reilly Media: Lisps continue to elude me, and I thought I'd dive in with this O'Reilly publication. Having been published very recently, and having received rather high reviews, I'm excited to see what this will teach me, especially with how meta-programmable I've heard Lisps can be.
'til next time - happy reading!

Saturday, February 23, 2013

Busy Times, Agile Reviewing

Though I promised a review of Hackers and Painters a week ago, it's been busier than I anticipated.

I'll say this much - the last few chapters of Hackers and Painters are very interesting. They'll make you question how much of your systems and applications you develop in low-level, compiled languages and how much more you could be getting done if you used a more powerful language. For aspiring system designers, these chapters are a very worthwhile read.

Interesting side note: I wonder what Agile book reviewing would be like? Perhaps I'll adopt that methodology instead of reviewing all at once, with a retrospective at the end of each book that summarizes the findings.

Friday, February 15, 2013

Updates and Upcoming Reviews

It's been a busy week, but there's still time to read. I've just finished reading Hackers and Painters by Paul Graham. I'll have a review of that one up by the end of the weekend. Up next, I'm looking at (in no particular order):

  1. Working with Legacy Code, Michael C. Feathers
  2. Refactoring: Improving the Design of Existing Code, Martin Fowler
  3. Beautiful Architecture, Diomidis Spinellis, Georgios Gousios
  4. Introducing Erlang, Simon St. Laurent
Happy reading!

Saturday, February 9, 2013

"Beautiful Testing" by Adam Goucher, Tim Riley; O'Reilly Media;



Disclaimer: I've read portions of the book, not all of it. I'll specify these sections as I review. I'm reviewing the eBook version. (The PDF is great!). I received this book as a gift for participating in an O'Reilly Velocity survey.

To begin with, I found that this book is very friendly to jumping into any chapter at any time - there are no ordering dependencies. This is very nice, since at times I'll be more interested in the business aspect and at others I'll want to do a technical deep dive. The quality of the writing has also been mostly solid so far. The authors' passions shine through in most cases, and there has only been one section where the writing felt too dry.

There are three parts to this book: Beautiful Testers, Beautiful Process, and Beautiful Tools. I found this division into tracks to be very effective, and further facilitates jumping in depending on one's mood and needs.

In preparation to review this book, I read the following chapters:
1. Was It Good For You?
3. Building Open Source QA Communities
4. Collaboration is the Cornerstone of Beautiful Performance Testing
14. Test-Driven Development: Driving New Standards of Beauty
15. Beautiful Testing as the Cornerstone of Business Success
17. Beautiful Testing is Efficient Testing

"Was It Good For You?" is an opinionated and energizing introduction to this book. In this chapter, Linda Wilkinson attempts to define what a tester is and what value a tester brings to the table. It felt like a keynote at a major conference, and left me feeling a little inspired after I read through it. Even though there are no ordering dependencies, I highly recommend starting at this chapter. It adds a lot to the book.

"Building Open Source QA Communities" is an interesting tale of what it takes to get volunteers involved and engaged, and what it takes to lose them. It was a memorable read.

"Collaboration is the Cornerstone of Beautiful Performance Testing" is all about working with others. It tells a story about how difficult things can be when miscommunication arises, and how a little effort in collaboration goes a long way towards really understanding the requirements. The back-and-forth dynamics detailed in the chapter make for an entertaining read.

The three chapters I read from the Beautiful Process part of the book dove deeper into testing practice. Chapter 17 was particularly pleasant to read. It was concise and humorous, and presented a neat mnemonic for determining testing priority (SLIME).

I've not read any part of the Beautiful Tools section of the book, but the chapter on "Testing Network Services in Multimachine Scenarios" looks particularly interesting.

Another note about the book overall: there are many great diagrams throughout, in full color. These make for convenient print-outs that summarize the knowledge in a chapter!

What could be better? I feel like the balance between parts could have been a bit better. I would have appreciated a little more on the side of Beautiful Tools. The landscape of testing tools only continues to grow. In fact, if another edition of this book is ever released, that would be at the top of my wishlist for new content!

Check it out: Beautiful Testing