Shorten Map and Each Commands

Or: Pithy Procs for Working with Enumerables

Typing out code blocks every time you want to use map or each is a bit of a pain, and often unnecessary.

You may know that (1..5).map { |n| n.to_s }  can be rewritten as  (1..5).map &:to_s.

But what does the latter really mean?

The ampersand lets Ruby know you’re passing something to be used as a block and not an argument. It also calls #to_proc on the something. Symbol#to_proc is implemented in C, but for our purposes, it maps to something roughly like this:
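Here's a rough sketch of that idea in plain Ruby. It is not the real C implementation; SketchSymbol is a made-up stand-in class so we don't have to monkey-patch Symbol itself.

```ruby
# Hypothetical stand-in for Symbol, used only to illustrate the idea.
class SketchSymbol
  def initialize(name)
    @name = name
  end

  # `&object` calls this and uses the returned Proc as the block.
  def to_proc
    name = @name
    # The first block argument becomes the receiver of the named method.
    lambda { |receiver, *args| receiver.public_send(name, *args) }
  end
end

(1..5).map(&SketchSymbol.new(:to_s)) # => ["1", "2", "3", "4", "5"]
```

The ampersand does the rest: it calls to_proc on whatever follows it and passes the result along as the block.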

Here’s another example:

(1..5).each { |n| puts(n) }  can be rewritten as (1..5).each &method(:puts).

Method#to_proc is also written in C, but the documentation includes a Ruby equivalent:
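What follows is a minimal sketch of the idea, not the documentation's own snippet. SketchMethod is a hypothetical wrapper name.

```ruby
# Hypothetical wrapper around a Method object, for illustration only.
class SketchMethod
  def initialize(method_object)
    @method_object = method_object
  end

  # Build a lambda that forwards its arguments (and any block) to the method.
  def to_proc
    method_object = @method_object
    lambda { |*args, &block| method_object.call(*args, &block) }
  end
end

doubler = SketchMethod.new(2.method(:*))
(1..5).map(&doubler) # => [2, 4, 6, 8, 10]
```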

What about something like (1..5).map { |n| n * 2 } ? (1..5).map &2.method(:*).

What else can we pass in? Anything with to_proc. Lambdas? Check.
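A quick illustration with a lambda (DOUBLE is just an illustrative name):

```ruby
# A lambda is already a Proc, so its to_proc simply returns itself.
DOUBLE = ->(n) { n * 2 }

(1..5).map(&DOUBLE) # => [2, 4, 6, 8, 10]
```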

 

Implementing #to_proc

Why not add a to_proc method to a command class?

In the above, only one PurchaseCommand is initialized; the same command is used for all items in the cart. It might be a bit clearer to write it this way:
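A minimal sketch of both forms. CartItem is a hypothetical struct, and this PurchaseCommand only records what it ships; a real command class would do the actual purchasing work.

```ruby
# Hypothetical cart item; the names here are illustrative only.
CartItem = Struct.new(:name, :price)

class PurchaseCommand
  attr_reader :shipment

  def initialize
    @shipment = [] # names of the items purchased so far
  end

  # The work done for each item in the cart.
  def call(item)
    @shipment << item.name
    item.price
  end

  # Lets `&command` hand this object to any method that takes a block.
  def to_proc
    method(:call).to_proc
  end
end

cart = [CartItem.new("book", 12), CartItem.new("pen", 3)]

# Inline form: one command is created and reused for every item.
cart.each(&PurchaseCommand.new)

# Named form: the same thing, but the single command is easier to see.
command = PurchaseCommand.new
cart.each(&command)
command.shipment # => ["book", "pen"]
```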

This also allows us to hold a reference to the command for later use if, for example, we wanted to do something with its shipment.

Not just for map and each!

This isn’t limited to each and map.

(1..10).partition &:odd?

Be careful, though. While arr.reject &:nil?  is shorter than arr.reject { |i| i.nil? } , arr.compact  is shorter still, and arguably more intention-revealing. At least if you know your enumerable methods.
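Both points in one small sketch (the sample array is made up):

```ruby
# partition splits the range by the predicate into two arrays.
odds, evens = (1..10).partition(&:odd?)
odds  # => [1, 3, 5, 7, 9]
evens # => [2, 4, 6, 8, 10]

# reject(&:nil?) and compact give the same result for an array.
arr = [1, nil, 2, nil, 3]
arr.reject(&:nil?) == arr.compact # => true
```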


Reorder Git Commits Through Rebasing

The other day I was working on a pull request when I discovered I had forgotten something trivially simple that should have gone in the commit before the one I just made. If I hadn’t made that other commit, I could just --amend it. I could git reset HEAD^, amend my commit, then restage and redo my second commit, but that’s a lot of steps and could get ugly in a hurry if the target commit was more than one commit back.

What would be perfect would be just making an extra ‘oops’ commit and squashing it later. But squash and fixup meld into the previous commit. If only there were a way to change the order of commits so I could squash as desired.

As it turns out, there is!

Reorder Git Commits with an Interactive Rebase

Reordering git commits is about as intuitive as it gets. Simply:

  1. start an interactive rebase – in my case with git rebase --interactive HEAD~3
  2. change the order of the commits in your editor
  3. save and quit.

You can even squash or fixup at the same time.

It’s even right there in the comments, though a bit opaquely: “These lines can be re-ordered; they are executed from top to bottom.”

‘Executed’ refers to the commands in your rebase (e.g. pick). So the reason this works is the pick command actually adds the commit. Order matters. I guess I always thought pick meant “yes keep this one” rather than “add this one to the end of the queue.”

[Screenshot: rebase todo list before reordering]

[Screenshot: rebase todo list after reordering, with a fixup]

As always, the standard history-changing caveat applies. If you’ve already pushed changes to a shared branch, you’re going to have to live with your oops commit. If you’re working on a private topic branch or you haven’t pushed yet, rebase away ;)


Boilerplate karma.conf.js with karma init

I’ve worked on a number of Angular apps, but most of them were either set up by someone else or created using Yeoman or something similar.

Earlier today I was trying to get Karma and Jasmine working so I could write some tests for a small new application. I had spent some time googling boilerplate karma.conf.js files and trying to fill in the gaps. Finally, I stumbled across the karma init command, which walks you through generating a boilerplate karma.conf.js.

The bigger lesson

Aside from the obvious lesson of “use karma init,” I think the other takeaway here is the importance of familiarizing oneself with the tools one uses every day.


GROUP BY and aggregate functions

Imagine you have the following table ‘users’:

id  username  level  points
1   jsmith    10     1000
2   fbloggs   10     2000
3   jdoe      3      100
4   mbloggs   3      1000
5   bdavis    8      500

You can do some pretty interesting stuff with this data. You could determine the results of a contest for who can acquire the most points. Let’s try that.

Whoa, hang on a minute. That’s not right. What just happened here?

MAX() is an aggregate function. What does that mean? It means its job is to take in many rows, perform an operation to combine all the data, and return a single result. We got our single result. But what happened to username? It got aggregated. We asked for two pieces of information that didn’t necessarily have anything to do with each other.

It’s worth pointing out that Postgres would not let you make this mistake.

Let’s use Postgres for now to be safe.

jsmith and mbloggs are tied for second place. Listing users that are tied on the same line, separated by commas would be a better way to see this.

Aggregate functions such as string_agg() in Postgres or GROUP_CONCAT() in MySQL can do this.

Can we simplify the SQL at all? What if we want just the user with the most points? What about this:

The username column is now in an aggregate function too, so Postgres shouldn’t complain.

Wow. We’ve managed to shoot ourselves in the foot even with Postgres. Since this is the second time we’ve made a mistake like this, let’s take a moment to dig a bit deeper about why this doesn’t work the way we might think it should in this particular situation.

What is MAX()? It returns the maximum value for a particular column given a group of rows. It’s a piece of statistical information about a GROUP of rows. You don’t need statistics about a single row.

For instance, it’s more apparent that it wouldn’t make sense to do this:

The number of rows doesn’t really have much to do with any of the ids.

Picking a Winner Correctly

So how would you accomplish this? Like this:

-or-

Notice that we use one query (or subquery) to find the highest point value, and a second query to find all users (there could be a tie) with that value. Aggregates aren’t for working with single rows.
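The same two-step shape can be sketched in Ruby instead of SQL, assuming the users table has been loaded into an array of hashes. USERS and winners are illustrative names, not part of any query above.

```ruby
# The sample rows from the users table, as plain hashes.
USERS = [
  { username: "jsmith",  level: 10, points: 1000 },
  { username: "fbloggs", level: 10, points: 2000 },
  { username: "jdoe",    level: 3,  points: 100 },
  { username: "mbloggs", level: 3,  points: 1000 },
  { username: "bdavis",  level: 8,  points: 500 }
].freeze

def winners(users)
  # Step 1: aggregate many rows down to a single value.
  top_score = users.map { |user| user[:points] }.max
  # Step 2: a separate pass finds every row with that value (ties included).
  users.select { |user| user[:points] == top_score }
end

winners(USERS).map { |user| user[:username] } # => ["fbloggs"]
```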

Using Aggregate Functions Correctly

So what would be an appropriate way to use aggregate functions and GROUP BY? What if we wanted to look at the amount of participation of the various user levels?

Here we’re dealing with groups of rows that have something in common: level. We group all users of the same level into a single pseudo-row. Note that, just as it doesn’t make sense to use aggregate functions on a single row, it doesn’t make sense to talk about a single column of a group of rows. Users of level 10 don’t have a single points value, they have many. How do you want them aggregated? Do you want to add them up? Concatenate them into a list? Find the highest?
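As a rough Ruby analogy (not the SQL itself), grouping by level and then choosing how to aggregate might look like this. USERS and aggregate_points_by_level are assumed names for illustration.

```ruby
# The sample rows from the users table, as plain hashes.
USERS = [
  { username: "jsmith",  level: 10, points: 1000 },
  { username: "fbloggs", level: 10, points: 2000 },
  { username: "jdoe",    level: 3,  points: 100 },
  { username: "mbloggs", level: 3,  points: 1000 },
  { username: "bdavis",  level: 8,  points: 500 }
].freeze

# Groups users by level, then lets the caller's block decide how each
# group's list of points is combined into a single value.
def aggregate_points_by_level(users)
  users.group_by { |user| user[:level] }
       .transform_values { |group| yield group.map { |user| user[:points] } }
end

totals = aggregate_points_by_level(USERS, &:sum) # add them up
bests  = aggregate_points_by_level(USERS, &:max) # find the highest
totals[10] # => 3000
bests[10]  # => 2000
```

Without the block there is no sensible answer, which is the same reason a bare points column makes no sense next to GROUP BY level.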

The one sort of gray area/exception in this case is level 8. Because there is only one row in the group, there is technically only one points value, only one id, etc. It might be helpful to think of these cases not as a single row, but as a set or array containing one row. [5] instead of 5.


Using git rebase onto

Have you ever branched off of master, done some work, committed, pushed, and opened a pull request only to find out your code belongs in some other long-running branch?

What do you do? Delete your branch and rewrite your code? Stash it and apply onto a new branch? Cherry pick your commits to a new branch?

How about git rebase onto?

I don’t know what your git workflow looks like, but just to start with an easy example, let’s say your server runs the production branch and normal development happens against master. You’ve opened a PR against master, but now you’ve been asked to make it a hotfix into production. There are several commits in master (and thus, on your branch) that are not ready for prime time. How do you fix it in one command?

git rebase --onto production master

The syntax is just git rebase --onto [target-parent] [current-parent]. target-parent and current-parent can be any git ref.

This, of course, results in a non-fast-forward update, so in order to push your changes, you’ll need to do one of two things:

  1. If your organization doesn’t frown on it, use the force! git push --force origin yourbranchname
  2. If you or your organization are uncomfortable with force pushing, you can rename the branch before pushing. git branch -m yournewbranchname

A couple last words. Never, ever force push a shared branch. It will cause problems for someone else and everyone will hate you. Because git rebase onto requires force pushing, don’t do that on a shared branch either. But really, why would you ever do either one in the first place!? You shouldn’t be committing directly to a shared branch anyway. If you write all of your code in personal topic branches that get merged into the shared branches, you can use cool things like git push force and git rebase onto. This is why we can have nice things.

Also, git rebase can be somewhat dangerous if you don’t know what you’re doing. More to your sanity than your code, but still. Make sure you know what to do if you fuck things up.


Easier Bisecting with Tests

You’ve probably used – or at least watched someone use – git bisect before. It’s a great way to track down which commit introduced a bug. It usually looks something like this:

git bisect start
git bisect good [ref before bug existed]
git bisect bad [ref after bug was introduced...often HEAD]

[Check if bug is there…yup!]
git bisect bad
[Check for the bug…no.]
git bisect good
[Check for bug…no.]
git bisect good
[Bug? Yes.]
git bisect bad

[SHA where bug was introduced is isolated. Yay!]
git bisect reset

As you might know, there’s a way to have a script automatically do all the tedious good/bad checking for you: git bisect run. What you might not know is how easy it is to use.

When I go to fix a bug, usually the first thing I do is write a failing test that should pass when the bug is fixed. This forces me to think about the problem in very concrete terms, and gives me an indicator to know when I’ve fixed it. TDD.

As it turns out, this test is all you need to use git bisect run!

Normally, the test would probably go in an existing file in the test/spec directory. To start with, though, let’s put it in its own file. Because git bisect works by checking out lots of commits, a new file is your best bet for avoiding potential merge conflicts.
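A minimal sketch of such a standalone file, assuming a made-up slugify function as the code under test. It relies on git bisect run's exit-status convention: 0 means good, 125 means skip this commit, and any other status from 1 to 127 means bad.

```ruby
# bisect_test.rb (hypothetical file name): a standalone check for git bisect run.

# Stand-in for the code under test, defined inline so this sketch runs on its
# own. In a real project this would be a require of your application code.
def slugify(title)
  title.downcase.strip.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# The failing test, reduced to a single yes/no question.
def bug_present?
  slugify("  Hello, World!  ") != "hello-world"
end

if bug_present?
  warn "bug reproduced"
  exit 1 # a non-zero status tells bisect this commit is bad
end
# Falling off the end exits with status 0, which marks the commit as good.
```

With a file like this, the command handed to git bisect run would be something like ruby bisect_test.rb.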

C’est tout. That’s it. We’re ready to use git bisect run.

git bisect start
git bisect good [ref before bug existed]
git bisect bad [ref after bug was introduced]
git bisect run [command to run your test]

[Lots of automated checking and test output]
[SHA where bug was introduced is isolated. Yay!]
git bisect reset

Now that you’ve found the offending ref and run git bisect reset it’s safe to move your test to the appropriate file.

Quick addendum:

Why would you care when the bug was introduced? Why not just fix it and move on?

  • In many cases, you really don’t
  • If your project actively maintains multiple versions, knowing where the bug was introduced will tell you which are affected without having to test every last one
  • If you’re having a hard time figuring out what’s causing the bug, sometimes looking at the commit that introduced it can narrow it down
  • If you want to do some kind of post-mortem to figure out who wrote the bug, why a test wasn’t written, etc

How I Test Controllers

I’ve been trying to improve the specs I write lately. My method before was mostly copying and pasting from my past projects. Somewhere way back I adapted them from an early version of Michael Hartl’s Rails Tutorial, making various modifications over the years.

Recently I decided to scrap my specs and start over from scratch. Since my controller specs were particularly ugly, I decided I’d start with them.

I wanted to share what I learned, so here goes. In true test-driven fashion, we’ll write a little one-controller app, specs first. Our app will be a list of favorite books. We’ll start with the most basic skeleton:

 

Because this app is so simple, there will be nothing at all interesting in our model, and thus nothing to test. We’ll trust ActiveRecord to do its job since that code is already tested ;) Here’s our model:

 

Now let’s start with index. What do we expect an index action to do?

  • We need to get an array of all books from ActiveRecord via the model.
  • We need to assign that array to a variable for the view
  • We need to render the index template
  • We need to indicate to the browser that the request was successful (HTTP status 200)
  • We don’t need to set any flash messages
  • There’s really only one path to test. The only abnormal instance would be an empty array of books, but the controller doesn’t really know or care whether this happens – just the model and the view.

In the past I would have done a Factory Girl create in a before block to test getting the array. But the controller isn’t really responsible for getting the ActiveRecord object, only asking the model for it and passing the result on to the view. Besides, we should probably write integration specs to make sure everything works together, so we’re going to test that anyway. That’s why these days I like to use mocks in testing my controllers.
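To make the idea concrete without loading Rails or RSpec, here is a framework-free sketch. SketchBooksController and FakeBookModel are hypothetical stand-ins; a real controller spec would use RSpec's own mocking instead of a hand-rolled fake.

```ruby
# Stand-in controller: it only asks the model for books and exposes the result.
class SketchBooksController
  attr_reader :books

  def initialize(model)
    @model = model # in Rails this would simply be the Book constant
  end

  def index
    @books = @model.all
    :index # name of the template to render
  end
end

# Hand-rolled mock: returns a canned result and counts calls to `all`.
class FakeBookModel
  attr_reader :all_calls

  def initialize(result)
    @result = result
    @all_calls = 0
  end

  def all
    @all_calls += 1
    @result
  end
end

model = FakeBookModel.new([:book_one, :book_two])
controller = SketchBooksController.new(model)
controller.index    # => :index
controller.books    # => [:book_one, :book_two]
model.all_calls     # => 1
```

No database is touched, which is why specs written this way stay fast.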

Let’s go ahead and create a Factory Girl factory anyway. We can use attributes_for  as a convenient way to get our params hash, and the factory will be useful later for our integration specs and possibly model if we add anything interesting like validations.

 

Time to write our first failing spec

 

And we get our failure: Book didn’t receive all. Let’s modify the index action in the controller:

 

And it passes. Now let’s write our first real spec.

 

Of course it fails, because while we called all on Book to get our first expectation to pass, we didn’t assign it to a variable for the view. Let’s fix that.

 

And it passes. On to our next spec. It should respond with success. Because this is the normal response for Rails, let’s set ourselves up for failure.

 

And write our spec:

 

Failure. Let’s fix it and set up our next failure.

 

Success.

 

Failure.

 

Success.

 

Failure.

 

One could argue that it seems excessive to write code to make such simple specs fail, and I’ll readily admit I don’t always do this. But this is what the rhythm of TDD should feel like, and it’s not completely safe to trust specs you haven’t seen fail.

#show, #new, and #edit will be very similar – there’s only one path and they simply render a view. I’ll include the full code at the end.

The next interesting challenge is #create. Let’s take a crack at that.

What do we expect a create action to do?

  • We need to instantiate a new Book with the params we’re given
  • We need to assign the object to a variable for the view in case we need to display it
  • We need to ask the book to persist itself
  • If the save reports success, we want to:
    • Set a flash message indicating success
    • Redirect to index
    • Because it’s a redirect, index is responsible for anything that happens after that. We’ve already tested that.
  • If the save fails, we want to:
    • Render the new template
    • Indicate to the browser that the request was successful (HTTP status 200) – despite the fact that the save was not
    • We don’t need to display any flash messages
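The branching described above can be sketched in plain Ruby. This is not the actual Rails action; SketchResponse, create_book, and FakeBook are hypothetical names, and the flash text is made up.

```ruby
# What the action decided to do: redirect or render, where to, and any flash.
SketchResponse = Struct.new(:action, :target, :flash)

# The success and failure paths of a create action, minus the framework.
def create_book(book)
  if book.save
    SketchResponse.new(:redirect, :index, { notice: "Book saved." })
  else
    # Re-render the form; no flash message on failure.
    SketchResponse.new(:render, :new, {})
  end
end

# Hand-rolled double whose save result we control.
FakeBook = Struct.new(:will_save) do
  def save
    will_save
  end
end

create_book(FakeBook.new(true)).action  # => :redirect
create_book(FakeBook.new(false)).target # => :new
```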

 

Failure – Book didn’t receive new

 

Success. Speaking of which, it’s time to branch our specs for success vs failure.

 

It fails.

 

It passes.

 

It fails.

 

It passes.

 

It fails.

 

It passes. The failure path behaves virtually identically to the new action. We’ll look at a way to exploit this and other similarities to DRY up our specs in a follow-up post on shared examples. For now, notice that book is a local variable. This allows us to call save while still setting up our spec to check whether the book is assigned to an instance variable for failure.

Here is the final create action and corresponding spec:

 

 

So what are the takeaways?

  • Write specs that test the thing you’re testing. Use mocks. This makes tests easier to write, easier to read, and super fast. On my machine, 29 specs run in less than two tenths of a second without Spork.
  • Describe the behavior of the thing you’re testing. Do this before you write your specs. When you’re done, your specs are documentation. Try rspec spec --format documentation
  • Follow the rhythm of TDD: red, green, refactor. For easy stuff like this, refactoring might not be necessary.

It should go without saying, but these are just guidelines. Fast tests are a means, not an end. There are cases where mocking becomes awkward. TDD is a tool, not a dogma.

I’ve posted the finished code on GitHub.

I hope this is helpful to others, and I welcome feedback.


TDD, Dogma, and Professional Courtesy

I had the privilege of attending RailsConf last month and seeing DHH’s keynote. A lot of feathers were ruffled. Mine weren’t. Maybe that’s why I’ve been only peripherally aware of the ensuing debate. But I did watch today’s discussion. It reinforced what I thought was going on: people are stuck in their own point of view and talking past each other.

In his keynote, DHH challenged the idea of programming as hard science. I personally do think of myself as a software engineer and programming as hard science.

Hard. Fucking. Science.

But – if I understand correctly – I see what he’s getting at. Programmers learn design patterns, principles, and techniques and assume they are universal truths because science. The reality is, however, that even in hard science there’s often room for subjectivity and multiple approaches. Even in mathematics – the hard science – there is often more than one way to arrive at a solution. There may even be many correct solutions, any one of which can be used, depending on what you’re doing.

I don’t think becoming French poets or software writers will make us immune to this sort of black-and-white thinking – text editor holy wars have been around as long as text editors have.

What will? Professional courtesy. If we see ourselves as craftspeople – whether that craft be  writing, engineering, or otherwise – shouldn’t we treat our fellow programmers with the respect we’d like to receive ourselves? Oughtn’t we withhold judgment of their process until we’ve seen the deliverable – the code?

Can we let go of our absolutes? Absolutes like “If you don’t write your tests first, your code is shit.” And like “Writing tests first leads to bad design.”

I’ll start.

I love TDD. The rhythm works for me. For many things. But sometimes I write tests last. And sometimes I don’t write tests at all. Good design drives good tests and good development – regardless of which happens first or whether testing happens at all. Testing is a tool, not the deliverable. Every problem is not a nail and testing is not a silver bullet.

I love using mocks and isolating the thing I’m testing. It makes my tests run really fast and I can easily isolate what I broke and where. But they sometimes come at a cost and sometimes that cost isn’t worth it.

I like RSpec and TextMate 2, but there are plenty of other great tools for writing code. I like design patterns and principles and learning them has made me a better coder. But there isn’t a design pattern for every programming problem. Sometimes principles conflict and there are tradeoffs to be made. Good code is a little bit subjective. We sometimes have to choose between DRYer code and code that’s easier to read. Between code that’s faster and code that’s more easily reusable.

If everyone did things the same way, how on earth would we learn from each other?


Lessons Learned about Timestamps

I ran into an extremely frustrating issue lately where some Rails code periodically behaved in a very unexpected way.

After trying everything I could think of, I ran some code in the console to list the occurrences of the behavior, the user they happened to, and the time they happened at. A pattern quickly emerged. It was happening between 18:00 and 0:00 every day. Suspicious. Sounded like a time offset issue. It occurred to me to check something else, and sure enough: before daylight saving time ended, it happened between 18:00 and midnight. After it ended, it happened between 17:00 and 01:00. It perfectly matched my DST offset.

What was happening? Rails was writing timestamps to the database in local time, but when it went to pull them back out, it assumed they were in UTC and converted accordingly.

Why? One line of code in environment.rb:

The only explanation I can think of goes something like this: PostgreSQL doesn’t store UTC offsets with its timestamps by default. Rails 2.3 uses the default settings rather than forcing the type. When it encounters a timestamp without an offset, instead of checking default timezone, it assumes UTC.
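The write-local, read-as-UTC mismatch can be reproduced in plain Ruby with no Rails or database involved. The fixed UTC-6 offset, the date, and the helper names below are assumptions chosen for illustration.

```ruby
# Writes a time as a bare timestamp string, dropping its UTC offset.
def store_without_offset(time)
  time.strftime("%Y-%m-%d %H:%M:%S")
end

# Reads the stored digits back as though they had been UTC all along.
def read_as_utc(stored)
  Time.utc(*stored.scan(/\d+/).map(&:to_i))
end

written_at = Time.new(2013, 7, 1, 20, 30, 0, "-06:00") # 20:30 local time
read_back  = read_as_utc(store_without_offset(written_at))

(read_back - written_at) / 3600 # => -6.0, off by exactly the local offset
written_at.getutc.day           # => 2, the real instant is already July 2 in UTC
read_back.day                   # => 1, but the round trip says July 1
```

The error is always the size of the local offset, so it changes by an hour when daylight saving time starts or ends.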

As Rails 2 is no longer supported, there’s no sense in creating a pull request, but perhaps this will help someone else. My solution? Remove that line and let it save timestamps in UTC as recommended.


Generators in Rails 2.3

The other day I started looking into how to create generators in Rails 2.3. I came across a couple of articles on the topic that provided a good starting point, but I was still a bit confused about a couple of things.

I couldn’t figure out exactly how to get information to templates at first. The solution: simply add an attr_reader corresponding to the instance variable in initialize(). In this case, hello.

and in templates/foo.erb, just put an erb tag with a local variable name.
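A framework-free sketch of the mechanism, using only ERB from the standard library. SketchGenerator is a hypothetical stand-in for a generator class, and I'm assuming the template is evaluated against the generator's binding.

```ruby
require "erb"

# Stand-in for a generator class; Rails is not loaded here.
class SketchGenerator
  attr_reader :hello

  def initialize(hello)
    @hello = hello
  end

  # Renders the template source with this object's binding, so a bare
  # `hello` in the template resolves to the attr_reader above.
  def render(template_source)
    ERB.new(template_source).result(binding)
  end
end

SketchGenerator.new("Hello, world").render("Greeting: <%= hello %>")
# => "Greeting: Hello, world"
```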

 

At this point, you might be wondering, like I did, “What about erb files? How do you differentiate between erb to interpret now in the generation process and that for later when a template is rendered, say?”

 

Note that the double % occurs only in the opening erb tag.
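The escaping can be seen with plain ERB from the standard library. The generate helper and the template text are made up for illustration.

```ruby
require "erb"

# Renders generator-time ERB; `name` is available to the template
# because it is a local variable in this method's binding.
def generate(source, name)
  ERB.new(source).result(binding)
end

source = "<h1><%= name %></h1> <%%= @books.size %>"
generate(source, "books")
# => "<h1>books</h1> <%= @books.size %>"
```

The single-% tag is evaluated during generation, while the double-% tag comes out as an ordinary erb tag for the view to evaluate later.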

When I discovered that the first link above contained a link to the code for Rails 2.3 generators and dug around a bit in there, most of this became clear pretty quickly.

Something that took a bit more effort to suss out was how to accept runtime options that had a string value instead of boolean.

 

Note that the “Default: mysql” bit is just part of the help output. You’d have to write logic to actually make this function as promised.
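Here is a standalone sketch using OptionParser from the standard library rather than the generator's own hook. The option name, description text, and parse_generator_options helper are assumptions for illustration.

```ruby
require "optparse"

# Parses generator-style arguments and returns the resulting options hash.
def parse_generator_options(argv)
  # Seeding the hash is the logic that makes "Default: mysql" actually true.
  options = { database: "mysql" }

  parser = OptionParser.new do |opt|
    # "=name" makes the option take a string value instead of acting as a flag.
    opt.on("--database=name", String,
           "Database adapter to preconfigure",
           "Default: mysql") { |value| options[:database] = value }
  end

  parser.parse(argv)
  options
end

parse_generator_options(["--database=postgresql"]) # => database is "postgresql"
parse_generator_options([])                        # => database stays "mysql"
```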
