Tuesday, December 20, 2011

Constant Pain with Non-Constant Constants

This thought has crossed my mind before... since it is on my mind now I thought I might share. Perhaps it will help someone on their software development career path... or perhaps it is just me venting...

The Art of Coding


One of the greatest challenges of coding is being able to understand it. Skill in coding is being able to understand the problem well enough to create a solution (otherwise known as coding). Great Skill is being able to write a solution so that someone else (including the future you) can read and understand what it is they are looking at. This concept isn't new, but yet we seem to be either challenged with the concept... or distracted. Even back in 1952 (yes 1952!!), in the computer history archive a set of lecture notes which in them states: "It is quite difficult to decipher coded programs even with notes and even if you yourself made the program several months ago." It also states: "I think the answer is simple: to make it easy one must make coding comprehensible". Well said!

The Point of Constants


One area that seems to be a constant :) struggle is constants. There are at least a couple of justifications for using constants I can think of: 1) a point of reference for tools (autocomplete with intellisense in IDEs and refactoring), 2) A common holder of value for which if it changes will be realized across the system. The main point of a constant isn't DRY or reuse... a common point of confusion for some reason.

IMO the stronger argument is point 2. if you were coding in VI or emacs, point 1 may not be of value, but point 2 certainly would be. If you name the reference of a constant the same name as the value of the constant... then you lose any the value of point 2.
It must also be said that constants, although they add value, they make code less readable (see examples below). If your constants have no value, they don't achieve points 1 and 2, then they need to be whacked!


Examples of Losing with Constants


This is real production code... some variables changed to obscure identify.


// example 1 Really?? where does this end?
String EMPTY_STRING = "";

// this isn't readable... it takes minutes (instead of seconds) to understand its value
// example 2
String ADVERTISE_PROPERTY_PREFIX = OUR_PROPERTY_PREFIX + "advertise.";
String ADVERTISE_TRADER_ADDRESS_PROPERTY = ADVERTISE_PROPERTY_PREFIX + "host";
String ADVERTISE_CONTEXT_PROPERTY = ADVERTISE_PROPERTY_PREFIX + "context";
String ADVERTISE_TYPE_PROPERTY = ADVERTISE_PROPERTY_PREFIX + "type";

// example 3
// here's a common one
static final String TABLE_NAME = "email";

// Column names for all the fields in the email table
static final String ID = TABLE_NAME + ".email_id";
static final String CUSTOMER_ID = TABLE_NAME + ".customer_id";
static final String ADDRESS = TABLE_NAME + ".address";
// ...
static final String FIELDS = ID + ", " + CUSTOMER_ID + ", " + ADDRESS + ", " + CREATED_BY + ", " + LAST_MOD_BY
+ ", " + CREATE_DATE + ", " + LAST_MOD_DATE + ", " + TERM_DATE + ", " + WEB_USER_ID;

// List of fields names with formatting used for select statements
static final String SELECT_FIELDS = ID + ", " + CUSTOMER_ID + ", " + ADDRESS + ", " + CREATED_BY + ", "
+ LAST_MOD_BY + ", " + "TO_CHAR(" + CREATE_DATE + ",'" + DATE_TIME_FORMAT + "')," + "TO_CHAR("
+ LAST_MOD_DATE + ",'" + DATE_TIME_FORMAT + "')," + "TO_CHAR(" + TERM_DATE + ",'" + DATE_TIME_FORMAT
+ "')," + WEB_USER_ID;

// SQL used to create a new EMAIL record
static final String CREATE_SQL = "INSERT INTO " + TABLE_NAME + " (" + FIELDS
+ ") VALUES (?,?,?,?,?,SYSDATE,SYSDATE,NULL,?)";

// SQL used to retrieve an email record by id
static final String RETRIEVE_BY_ID_SQL = "SELECT " + SELECT_FIELDS + " FROM " + TABLE_NAME + " WHERE " + TERM_DATE
+ " IS NULL AND " + ID + "=?";


For example 1... where does it end? would you create a constant A = "A"? When can EMPTY_STRING not be an empty string or ""? In this case, we get no value and less readability.

Example 2 is the chaining of constants, commonly found when there are a large number of property values being retrieved from a file or sys env. While not as bad as example 1, it is less readable. Having a prefix constant may be of value, but rarely is the chaining of constants of value. It isn't worth trading the potential for a company name change or something similar for readability. Or put another way, it isn't worth trading something that is unlikely to happen for something else which is more likely to be needed (that being the readability of the code)

Example 3 I see all the time... argghhhhh... please... just stop!

Examples of good constants would include, urls and database connections strings or a project defined standard date format.

May your constants always be final!

Wednesday, December 14, 2011

Advanced Spock Techniques



In recent years there have been a couple of tools that stand out when it comes to helping me be productive. One of those is the groovy test framework Spock. It is worthy of an introductory blog post... but that isn't this post. One of the challenges to Spock is the documentation isn't a complete as one would hope. The Spock Basics is fantastic when getting started but it is what it claims to be basics. I've been speaking with NFJS on the subject of Spock for the last half of 2011. On 2 occasions I've had the look our red shirted friend in the picture has when asked about controlling aspects of mocks in spock... meaning I didn't have the answer. In this case, google didn't discover it either... a quick email to the Peter Niederwieser reveals the previously undocumented solution. (at least that's my understanding)... and thanks Peter!



Here we will discuss 2 advanced aspects of mocking with spock. If you need more details on mocking in general hit the Spock site... or wait patiently for a future post:)

Problem Setup:

When mocking with Groovy we get 2 benefits from the verification of the mock. 1) order of execution, 2) failure if another method on the mock was executed but was not setup in the demand configuration.



void testDeposit() {

Account account = new Account(TEST_ACCOUNT_NO, 100)

def mock = new MockFor(AccountDao)
mock.demand.findAccount(TEST_ACCOUNT_NO) { account }
mock.demand.updateAccount(account) { assertEquals 150, account.balance }

def dao = mock.proxyDelegateInstance()
def service = new AccountServiceImpl(dao)

service.deposit TEST_ACCOUNT_NO, 50
mock.verify dao
}


Spock Cleanup


A similar solution in Spock might look like this:

def "deposit into account test refactor"() {

def mock = Mock(AccountDao)
def service = new AccountServiceImpl(mock)

when:
service.deposit nmr, amt

then:
mock.findAccount(_) >> account
1 * mock.updateAccount(_)
0 * mock.createAccount(_)

and:
account.balance == total

where:
nmr | account | amt | total
"101" | new Account("101", 100) | 50 | 150
"101" | new Account("101", 0) | 50 | 50
}


However this solution doesn't guarantee order nor do we want to use the 0 * mock. on all possible methods.

Spock Mock Solution


It turns out that we can repeatedly use the then: dsl to determine the order of events. So we can modify the Spock code above with the following changes.


then:
1 * mock.findAccount("1234") >> account

then:
_ * mock.updateAccount(_)

In this case we are saying that the findAccount() method will be called exactly once and return the account object and "then" the updateAccount will be call any number of times with any type of argument. The test will fail if the updateAccount is ever called prior to the findAccount being invoked first.

We can use this same technique to solve the strictness concern we have as well by adding one more then: block of code as outlined below:

then:
1 * mock.findAccount("1234") >> account

then:
_ * mock.updateAccount(_)

then:
0 * mock._

In this case the last 0 * mock._ says to expect nothing else. Ahhh.. the beauties of a dynamic language:)

Happy Coding... may your builds never fail and your tests always green bar!

Thursday, October 13, 2011

MongoDB Grails and Copying Collections

I currently find myself working on a project where Grails and MongoDB are the technology stack... in fact that in part is some of the reason for being interested in the project. I thought I would share 2 aspects of my learning so far:

Grails + Mongo


We spent at least 2 weeks trying to understand which approach we were going to take for our persistent tier. The general concensus was to use the Mongo GORM. We started down that path and ran into a bug and then a limitation on embedding. We got amazing support on the bug followed by the next release of the GORM option which fixed our initial document embedding concerns. We ran into more problems.

We switched to the morphia driver... once again... challenges caused lots of frustration. The challenges included getting other plugins to work with this driver, the fact that the morphia "domain" objects couldn't be in the standard grails domain folder, which further complicated mocking. Frustrations with getting full capabilities of getting spring security to work. It turns out that with spring security there are a number of relational assumptions (along with assumptions of objects being domain classes.. meaning they live in the grails domain package space) which cause problems when you are using NoSQL.

The morphia had another limitation we couldn't live with... we wanted the ability to have different types of objects mapped to the same collection. It appears to me that the Mongo efforts in the Grails space are bolt ons to a rational model. You can get it to work when thinking with a RDBMS mindset. They seem to break down with a NoSQL mindset... this certainly could be attributed to user error (meaning me). I believe we gave it a good try... however the project is under the gun and we didn't have time to "play" with it or wait for responses.

Alas we created our own solution... one of our significant constraints was to be prepared for extremely large scale. This had us duplicating a lot of documents not common in RDBMS solutions... along with the fact we needed to control the number of queries. Ideally we want 1 query for a page... certainly for the main page which is high laden with dynamic content.

Mongo Copying Documents - Collection to Collection


I'm really getting to like Mongo... a lot. One of the downsides is there is limited documentation... including limited forum comments on it. Because of this, I've decided to post interesting solutions to problems as I discover them... time permitting. so for todays tip. I recently needed to duplicate a "record" in mongo speak a document in a collection. There is so much information out there on how not to duplicate records that a google search at this point usually points you in the wrong direction. My situation was such that we have a CMS component to our application, which sets a publish day on content. What I needed was the ability to duplicate a record (so the BSON id would need to change and set a new publish day.

The first thing to recognize for those new to Mongo is that the Mongo console (the command line client to Mongo) works with javascript... so get your javascript fu ready... no.. not your jQuery fu... your plain jane generic javascript fu:)

Here is the solution I came up with:

db.NewsItem.find({newsType: "Blah"}).forEach( function(x){ x._id = new ObjectId(); x.publishDay=283; db.NewsItem.insert(x)} );


So it turns out that you can pass a lambda expression to the forEach method of the collection. The trick is to assign a new ObjectId() to the _id, make any other assignments and then insert into your collection of choice... in this case back into the same collection. This comes in handy as well when you want to backup a collection real quick. (yea, I know you could use the mongodump and mongorestore, for a quick temp solution this just seems more convenient)

db.NewsItem.find({publishDay : 285}).forEach( function(x){ db.NewsItemBackup.insert(x)} );


Happy Coding!

Wednesday, October 5, 2011

JavaOne: Rocking the Gradle

Presenting Rocking the Gradle at JavaOne tomorrow 12:30pm at the Parc 55... see you there!

Tuesday, January 11, 2011

Throughput and High Velocity

Wow… has it really be 11 months since I’ve blogged. I had many ideas for blogs half written or half thought out… but last year was extremely busy. It’s a new year… and I’m back :) Let’s not focus on the past, let’s get right into it.

I had a client recently asking how to improve throughput. It took me a few minutes of questioning to understand what the subject really was. Thoughtput? Are you having scaling issues? It turns out, the interest is closer to developer throughput, although I don’t think they used that term and it may have been broader than that. The interest is in the acceleration of moving an idea into a product. Immediately coming to mind is the book “The Goal”, whose focus it is to increase the throughput of a manufacturing plant. Although the creation of software is nothing like an assembly line (although the analogies seem to be common place), it is however interesting to consider some of the similarities involved with the theory of constraints. The challenge is, in my opinion, that the constraints for software development are more virtual and harder to measure than assembly line constraints.

Consider what slows down the development of software. Of all my years of software development, the issue has never been an issue of programmer to keyboard. In other words, it isn’t how you type or how fast you type, although it certainly needs to be reasonable. It often times isn’t how long a developer has been developing… although again there is a required threshold. Where have there been issues? Let’s see if you list is similar to my list:
  • Poorly written requirements. The user seems to know what they want, but was unable to communicate it effectively or the BA was unable to capture it appropriately.
  • Lack of direction, mission and purpose of the entire team… either we have too much time, too much money or a total lack of leadership. Let’s chalk this up to priorities.
  • Lack of knowing what to do next… BA or programmer… or PM
  • Lack of knowing how what a programmer is working on fits into the whole
  • Lack of team coordination
  • Large team sizes
I’m sure there are plenty more… but what I would say is the #1 issue for teams in trying to achieve high velocity is clarity of thought. I mean the ability to see a problem as the user sees the problem. Being able to see a solution and understand the problem in abstract terms and abstract ways; For a team to have the same thought or to hold within their minds the same mental model / picture. This is a tough task as differing priorities, differing locations, differing experiences and differing thoughts among the team members results in a number of differing ideas to the solution. It brings to mind to me of the scene in the last samurai, where Koyamada tells Tom Cruise “Too Many Mind”… “No Mind”

As Dr. Covey says in 7 Habits of Highly Effective People (which is what we are talking about… right?), you must start with the end in mind. Clarity positions people to be as effective as they can be. In the real world, a user really may not have the clarity themselves or may not understand the problem fully to articulate the desired clarity. This is why an agile development process is so effective. It provides an opportunity for the user to discover the clarity through regular feedback with the developers. Or in other words to detect deltas in the forming solution from the ideal or user desired solution. It also places the developer close to the user, so that through the creation of software when the output drifts from the user mental model it is caught quickly and corrected. Also developers are pairing with, and communicating with other developers regularly so that differences in the mental model are detected early.
What we are seeking for efficiencies sake is the fastest mechanism to detect that something is just not right.

So at the time of development what are the common practices that occur in a corporate development which cause inefficiencies or put another way what are the practices that cause the error detection to be delayed, either the detection itself is delayed or the communication of the detection is delayed.
  • Developers are several degrees removed from the user. So there are either 2 human “hops” for error recognition and for communication.
  • Developers are too junior. Junior developers are just less capable of thinking and communicating in abstract software terms and are less able to detect deficiencies in the solution model.
  • Separation of the team members. Numerous studies show the increase cost of communication of people in just 2 different rooms, its more expensive for 2 different floors and way expensive when there is an ocean between members of the team. There is an additional cost that isn’t well documented in the fact the difference in the mental model or solution set are very difficult off-shore.
  • Communication is all facilitated through paper. This is the worse of all the communication modalities.

So what can you do to increase throughput? What can you do to increase team velocity?
  1. Hire highly skilled developers, developers skilled in their language who also have a proven record of quickly translating problem sets into solution sets.
  2. Provide them with quick, and easy access to the user
  3. Put the developers in the same work space
  4. Reduce the team size, which results in less possibilities of mental model differences.
You are looking for a team with a high attention to detail and ability to detect differences in the mental model during the development process. Then you are looking for well-lubricated spontaneous communication.

This isn’t fool proof but it does solve one of the largest most common problems I see in corporate software development.

A number of ideas came to mind in the process of writing this blog... and there are some assumptions baked into the content. Based on that, I will likely blog deeper on several of these subjects in the future.