Dependency Injection and Circular References

I’ve been using dependency injection (DI) in my code for over a year now. I wasn’t a huge fan of it when I started. The only benefit I have seen from using it is with unit testing. When code is unit tested, it’s easier to “hide” the parts of the code you don’t want to test.

I’ve also read about the performance hit for using DI. All code has a cost and I wanted to be preemptive about getting the most out of our code. The company I work for uses Ninject for their DI tool. Many performance benchmarks place Ninject near the bottom of the list for performance. DI’s performance hit is mostly during startup, but that is one of the places I feel we need to be faster.

We didn’t do anything complex with Ninject/DI, just basic constructor injection. That is, when the DI container creates an object, it also supplies the constructor arguments (the IWeapon object below) based on a Ninject config file.

// Code from Ninject's site.
class Samurai
{
    readonly IWeapon weapon;
    public Samurai(IWeapon weapon) 
    {
        this.weapon = weapon;
    }
    public void Attack(string target) 
    {
        this.weapon.Hit(target);
    }
}

class Sword : IWeapon
{
    public void Hit(string target) 
    {
        Console.WriteLine("Chopped {0} clean in half", target);
    }
}

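When we use the code, we ask the Ninject kernel for a Samurai instead of calling the constructor ourselves (here kernel is the Ninject IKernel that holds our bindings).

Samurai sam = kernel.Get<Samurai>();

If you’re new to DI, you’ll assume that this code will break because nobody passed in an IWeapon, but Ninject/DI will use its config file to supply whatever class you have bound to IWeapon.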

This means that removing Ninject and its config file will break the code, as nothing supplies the IWeapon object anymore. To get around this, I was updating our code to bind the dependency manually in the default constructor. The updated code would look like the following:

// Code from Ninject's site, with a default constructor added.
class Samurai
{
    readonly IWeapon weapon;
    
    // Default constructor replaces Ninject.
    public Samurai()
    {
        this.weapon = new Sword();
    }

    public Samurai(IWeapon weapon) 
    {
        this.weapon = weapon;
    }
    public void Attack(string target) 
    {
        this.weapon.Hit(target);
    }
}

class Sword : IWeapon
{
    public void Hit(string target) 
    {
        Console.WriteLine("Chopped {0} clean in half", target);
    }
}

If you’re still reading, thanks 🙂 Ninject purists will state that I’m losing the advantages Ninject provides. If an implementation changes, I can’t make the fix in one Ninject config file, but must update code in several places. I’m okay with this performance tradeoff. My implementations don’t change that much and I have find/replace tools that I am comfortable with. Using this method, I still get the advantages of basic DI for unit testing that I wanted in the first place.
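To make the unit testing point concrete, here is a minimal sketch of the same pattern. It is written in Python purely so it can be run without a .NET project, and it is an illustration rather than our production code: RecordingWeapon is a hypothetical test double, and the optional constructor argument plays the role of the two C# constructors above.

```python
class Sword:
    """Default weapon, mirroring the Sword class above."""

    def hit(self, target):
        return f"Chopped {target} clean in half"


class Samurai:
    def __init__(self, weapon=None):
        # No argument: fall back to a hard-coded Sword, like the default constructor.
        # With an argument: plain constructor injection, no container involved.
        self.weapon = weapon if weapon is not None else Sword()

    def attack(self, target):
        return self.weapon.hit(target)


class RecordingWeapon:
    """Hypothetical test double: records targets instead of doing real work."""

    def __init__(self):
        self.targets = []

    def hit(self, target):
        self.targets.append(target)
        return "recorded"


# In a unit test we inject the fake and inspect what the Samurai did with it.
fake = RecordingWeapon()
Samurai(fake).attack("the evildoers")

# Normal code uses the default and gets a real Sword.
default_result = Samurai().attack("the evildoers")
```

The test never touches Sword, which is the whole benefit: the part I don’t want to test is swapped out through the constructor.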

Here is where the problem comes in and the title of the article. By using a DI tool such as Ninject, I can easily create circular dependencies in my projects. It turns out that in the project I was attempting to remove Ninject from, two assemblies were each referencing the other. Because we are using Ninject, there is no direct reference to the other project in either project’s references. It’s all brought in at runtime by the DI tool.
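One way to spot this kind of problem earlier is to write down which assembly uses which, including the links that only exist through DI bindings, and check that list for cycles. The sketch below is a rough illustration in Python and has nothing to do with Ninject itself; the assembly names are made up and find_cycle is a hypothetical helper.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of names, or None if there is none.

    graph maps each assembly name to the assemblies it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in graph}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we have looped back.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in list(graph):
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Made-up assemblies: the last two only know about each other via DI bindings.
dependencies = {
    "Company.Web": ["Company.Services"],
    "Company.Services": ["Company.Data"],
    "Company.Data": ["Company.Services"],
}
cycle = find_cycle(dependencies)
```

Here cycle comes back as Company.Services, Company.Data, Company.Services, which is exactly the kind of loop the compiler would have refused if the references had been explicit.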

At this time, my plans to remove Ninject are on hold, as it would take a significant refactoring of the code to remove the circular reference. I am by no means saying not to use DI, only to be careful, as it makes it easier to do some things that you don’t want to do.

Hogan Haake

The Trouble With Being a Software Developer

When my wife and I first got married, she took care of our monthly bills. Each month bills would pile into the desk and once a month or so, she would go up to the office and pay the bills. Then after paying the bills, she would sort and file them in a cabinet for reference. Most of the time, we never look at the bills in the cabinet, but occasionally, we review a previous bill to compare it against the current one. Sometimes it is to track the trend of an electric bill, but most often it is to see where a rate increased.

Eventually it was decided that it was my turn to handle the bills (something about pulling my weight around the house). We both soon discovered that I am horrible at organizing and filing away the bills into folders. After much deliberation, we decided that I would scan the bills and place them in folders on the computer. In order to best organize them, they were placed in a “Bills/2013/<person we owe money to>” structure. The problem I have run into is that at the beginning of the year, I end up creating a folder for the new year, then manually creating 20 plus new folders with the correct names. I would just copy the whole directory, but then it would include all of the PDF files of the bills in that folder.

Being a lazy guy, I endure this task, telling myself that I should automate it but never bothering. This year, my father mentioned to me that he has the same problem with his personal and professional folders. I figured that with a need greater than just mine, I would create a program that copies just the folders. I look for every opportunity to “pay back” my parents for all of the opportunities they have provided me in my life.

I spent about an hour writing a simple program that would copy the folders without the files (including sub directories). It wasn’t hard and I didn’t have to look anything up as it was a simple program. I actually had more trouble trying to email the file to Dad, as his email provider didn’t like .exe or .zip files…
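For the curious, the idea fits in a few lines. The sketch below is in Python and is only an illustration of the approach, not the program itself; the function name copy_folder_structure and the example paths are made up, and it assumes the destination is not inside the source folder.

```python
import os


def copy_folder_structure(source, destination):
    """Recreate every directory under source inside destination, skipping files."""
    created = []
    for current_dir, _subdirs, _files in os.walk(source):
        # Path of this directory relative to the top of the source tree.
        relative = os.path.relpath(current_dir, source)
        target = os.path.normpath(os.path.join(destination, relative))
        # Only directories are created; the file list from os.walk is ignored.
        os.makedirs(target, exist_ok=True)
        created.append(target)
    return created


# Example call with hypothetical paths:
# copy_folder_structure(r"C:\Bills\2013", r"C:\Bills\2014")
```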

I sent the program to dad and he was quite happy to have it. He sent it to other people in his office that had the same problem. Simple code that was a hit in a local accounting office.

The story would have ended there, but I was doing some work on my company’s software Continuous Integration build server. I had to copy some files over the network. I was looking into the properties for “xcopy” and came across the following documentation.

xcopy /t /e "C:\Your Folder" "C:\New Folder"
/t = Copies the subdirectory structure, but not the files
/e = Copies subdirectories, including empty ones

This got me thinking, and I did a Google search (https://www.google.com/search?q=Copy+folders+without+files). Google returned about 17 million results from people looking for or solving this relatively simple problem. Some result pages offered multiple solutions, from the built-in Windows commands to elaborate custom projects that make my simple copy program seem insignificant.

The point I realized is that being a software engineer, I should have been smarter. It is very easy for me to write a quick program to solve a problem, but maybe I should be looking for alternate existing solutions before I dig in and just start writing code. Maybe I wouldn’t have missed that big play during the game if I wasn’t on the sofa writing the solution to a problem that has already been solved many times! How many other places in my professional career should I be looking for existing solutions instead of making my own?

Hogan Haake

SQL Exists Syntax

About this post: This post is designed to help you understand how to use the EXISTS keyword in your SQL query environment. All of the examples will be based on a fictional two table database for posting messages to an Internet message board. The examples will be generic so that they can apply to any database (Oracle, SQL Server, etc.) that supports the EXISTS syntax.

As part of my job, I regularly deal with SQL coding. My current focus is on Oracle databases, but I occasionally get to play in DB2 and other platforms. I recently encountered some SQL code that I was not familiar with. In the WHERE clause of the code was an EXISTS statement with a sub query listed inside. It looked something like the following

SELECT  *
FROM    Users U
WHERE   EXISTS (SELECT NULL
                FROM   PostRating PR
                WHERE  PR.UserID = U.ID
                AND    PR.SpamVotes >= 5)

Hard as I tried, I could not make sense of the query listed above. Why would you select NULL from the table, and what does it mean? In order to explore this further, we need to have a simple problem to solve.

First, there will be two database tables for this example. I’m going to write generic syntax to get the point across. You’ll have to modify this to your platform. The tables only have the columns that are relevant to the example. Were this a production table, they would have many more columns.

 

CREATE TABLE Users (
    ID      INTEGER,
    Name    VARCHAR(50)
);

INSERT INTO Users (ID, Name) VALUES (1, 'Nice Guy');
INSERT INTO Users (ID, Name) VALUES (2, 'Bad Guy');

CREATE TABLE PostRating (
    UserID      INTEGER,
    PostID      INTEGER,
    SpamVotes   INTEGER
);

INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (1, 1, 1);
INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (2, 2, 20);

 

For the tables above, the Users table lists the valid users in the system. The PostRating table is used by people on the message board to mark a message as spam. Using the community to mark messages as spam can help clean up an open message board from unwanted content.

Before digging into the original query above, let’s explore with some basic testing how EXISTS works.

 

--Returns 2 rows.
SELECT *
FROM   Users

--Still returns 2 rows.
SELECT *
FROM   Users
WHERE  EXISTS (SELECT NULL
               FROM   PostRating)

--Returns 0 rows.
--Note that the sub query returns zero rows.
SELECT *
FROM   Users
WHERE  EXISTS (SELECT NULL
               FROM   PostRating
               WHERE  UserID = 999)

--Returns 2 rows.
--Note that the sub query returns 1 row.
SELECT *
FROM   Users
WHERE  EXISTS (SELECT NULL
               FROM   PostRating
               WHERE  UserID = 1)

--Still returns 2 rows.
--The sub query is for an Oracle database, but it works
--the same for any table that has rows in it.
SELECT *
FROM   Users
WHERE  EXISTS (SELECT NULL
               FROM   dual)

--Blows up, must be a query inside of the parentheses.
SELECT *
FROM   Users
WHERE  EXISTS (NULL)

 

Hopefully at this point, you have come to the same conclusion that I did after all the testing. If the query listed inside the EXISTS returns any rows at all (even rows containing nothing but NULL), then the condition is true. If no rows come back, then the condition is false and the whole query returns nothing.
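If you don’t have a database handy, the same experiment can be run with the SQLite engine that ships with Python. This is only a sketch under that assumption: SQLite stands in for Oracle here (so there is no dual table), and count_users is a small helper made up for this example.

```python
import sqlite3

# In-memory database with the two example tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER, Name TEXT);
    INSERT INTO Users (ID, Name) VALUES (1, 'Nice Guy');
    INSERT INTO Users (ID, Name) VALUES (2, 'Bad Guy');
    CREATE TABLE PostRating (UserID INTEGER, PostID INTEGER, SpamVotes INTEGER);
    INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (1, 1, 1);
    INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (2, 2, 20);
""")


def count_users(where_clause):
    """Count the Users rows that survive the given WHERE clause."""
    sql = "SELECT COUNT(*) FROM Users WHERE " + where_clause
    return conn.execute(sql).fetchone()[0]


# Sub query returns rows, so every user comes back.
any_rating = count_users("EXISTS (SELECT NULL FROM PostRating)")
# Sub query returns no rows, so no user comes back.
no_match = count_users("EXISTS (SELECT NULL FROM PostRating WHERE UserID = 999)")
# Sub query returns one row, so again every user comes back.
one_match = count_users("EXISTS (SELECT NULL FROM PostRating WHERE UserID = 1)")
```

The three counts come out as 2, 0 and 2, matching the comments on the queries above.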

Now you might be asking why anybody would use this syntax. The power of this statement comes when you join the sub query of the EXISTS to the main query. You can quickly exclude records that don’t meet the criteria. In the original example above, I wanted to list all users that have had one or more posts listed as spam 5 or more times.

SELECT  *
FROM    Users U
WHERE   EXISTS (SELECT NULL
                FROM   PostRating PR
                WHERE  PR.UserID = U.ID
                AND    PR.SpamVotes >= 5)

By joining the Users table to the PostRating table in the WHERE clause of the sub query inside the EXISTS, I’m essentially asking, “Does this user have any post that has been marked as spam 5 or more times?” The result is that only 1 record will return, for the ‘Bad Guy’ user; ‘Nice Guy’ is ignored, as their count is only at 1.
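The joined version can be tried the same way. Again this is a sketch that assumes SQLite (bundled with Python) in place of a real server, and it filters on the SpamVotes column from the table definition earlier in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER, Name TEXT);
    INSERT INTO Users (ID, Name) VALUES (1, 'Nice Guy');
    INSERT INTO Users (ID, Name) VALUES (2, 'Bad Guy');
    CREATE TABLE PostRating (UserID INTEGER, PostID INTEGER, SpamVotes INTEGER);
    INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (1, 1, 1);
    INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (2, 2, 20);
""")

# The sub query is re-evaluated for each Users row because it refers to U.ID.
spammers = conn.execute("""
    SELECT U.ID, U.Name
    FROM   Users U
    WHERE  EXISTS (SELECT NULL
                   FROM   PostRating PR
                   WHERE  PR.UserID = U.ID
                   AND    PR.SpamVotes >= 5)
""").fetchall()
```

Only the (2, 'Bad Guy') row is returned, because that is the only user whose sub query finds a matching PostRating row.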

This would be great for an admin report to help know which users might be removed from the system for abuse, and it leaves alone the good users who are not abusing the system.

One final thought. Upon my initial research for EXISTS, I could only think that there are other ways to write the same query and get the result you’re looking for. However, I inherited a large code base where the developers preferred the EXISTS syntax. I don’t have the time to re-write and test working code, so I have learned to work with it. Now that I understand it, I have added it to my skills and even find it quite useful!
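As an illustration of those “other ways”, the sketch below runs the EXISTS query next to an IN version and a JOIN version and compares the results. It assumes SQLite (bundled with Python) and the example tables from this post; it says nothing about which form performs best on your platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER, Name TEXT);
    INSERT INTO Users (ID, Name) VALUES (1, 'Nice Guy');
    INSERT INTO Users (ID, Name) VALUES (2, 'Bad Guy');
    CREATE TABLE PostRating (UserID INTEGER, PostID INTEGER, SpamVotes INTEGER);
    INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (1, 1, 1);
    INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (2, 2, 20);
""")

queries = {
    "exists": """SELECT U.Name FROM Users U
                 WHERE EXISTS (SELECT NULL FROM PostRating PR
                               WHERE PR.UserID = U.ID AND PR.SpamVotes >= 5)""",
    "in": """SELECT Name FROM Users
             WHERE ID IN (SELECT UserID FROM PostRating WHERE SpamVotes >= 5)""",
    # DISTINCT keeps a user with several spammy posts from appearing twice.
    "join": """SELECT DISTINCT U.Name FROM Users U
               JOIN PostRating PR ON PR.UserID = U.ID
               WHERE PR.SpamVotes >= 5""",
}

results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}
```

All three entries in results hold the single ‘Bad Guy’ row for this data.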

Let me know if you have questions or comments.

Hogan Haake

Oracle Sequence Promotes Poorly Maintainable Code

I started my professional career working with Microsoft’s SQL Server. I spent twelve years off and on learning how to design a database and write stored procedures in T-SQL. Then this last October, I switched jobs and was exposed to a new database platform, Oracle. Since this switch, I have used every curse word I know and invented new ones to express my frustration at interacting with an Oracle 10x something database. I’ll leave the rest of my rantings for another post and just focus on one aspect of Oracle that has frustrated me recently.

I started creating my first new table in Oracle and started defining the columns. I always start with an ID column that is typically used as the primary key of the table. As I went to select the column type, I didn’t see anything labeled “autonumber”. Trying again, I looked for integer, but that isn’t there either. Oracle only supports the “Number” column. There, you can provide the precision before and after the decimal point. After selecting the number column, I looked all over for something that would mark the column as unique and set for an autonumber sequence. Striking out quickly, it was time to ask Google and start learning about Sequence objects.

Oracle tables have no built-in mechanism for auto numbering (at least not in this version; identity columns only arrived later, in Oracle 12c). Instead, you must create a separate Sequence object and use it each time a record is inserted into the table.

CREATE SEQUENCE customers_seq START WITH 1000 INCREMENT BY 1 NOCACHE NOCYCLE;

Then each time the sequence is used, it looks something like

INSERT INTO customers (ID, Name…) VALUES (customers_seq.nextval, 'Hogan Haake'…);

Comparing this to the SQL Server I’m used to, if a column is autonumbered, you just exclude it in the insert and it automatically gets the next ID on insert.

INSERT INTO customers (Name, …) VALUES ('Hogan Haake'…)

At this point, Oracle people could argue that I’m just lazy, or I just need to learn a new way. They are right on both counts, but there is more to the story! I recently came across some bad code in part of my application where the developer didn’t use the sequence’s nextval for an insert, instead converting the current date into a number [YYMMDD format] and inserting that into the table as a unique value. While that method worked, the unique number they were generating was quite far away from the current sequence. The system has been in production for two years now and the sequence number is about 6 months away from a “collision” with the manual numbers incorrectly inserted in the ID column.

Current Sequence Value    Manual Sequence Value
107,000                   120,210 (first inserted 2012-Feb-10th)

The current sequence value is fast approaching the first manual sequence value. It was fortunate that the bug was found before it caused corrupt data and long nights for me. Due to the complexity of the system and time constraints, the simple fix of incrementing the next value of the sequence to 500,000 to avoid any future collisions with “unique” numbers was chosen. It would be nice to fix the offending code with the correct sequence number, but management decided the code worked enough that we could move on to other problems.
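The numbers above are easy to check. The sketch below is a small Python illustration, not code from the system; manual_id is a made-up name for the YYMMDD conversion described earlier, and the values are the ones quoted in this post.

```python
from datetime import date


def manual_id(d):
    """Build the YYMMDD number the offending code used as an ID."""
    return int(d.strftime("%y%m%d"))


# First manually inserted ID, from 2012-Feb-10th.
first_manual = manual_id(date(2012, 2, 10))

# Distance between the real sequence and that first manual value.
current_sequence = 107_000
gap = first_manual - current_sequence

# After the fix the sequence starts at 500,000; date-based IDs stay below
# that value for every date before the year 2050.
new_start = 500_000
last_manual_before_2050 = manual_id(date(2049, 12, 31))
```

first_manual works out to 120210 and gap to 13210, so the sequence only had about thirteen thousand values left before it reached the first manual ID. The last date-based value before 2050 is 491231, which is why jumping the sequence to 500,000 keeps the two ranges apart for the foreseeable future.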

In a SQL Server environment, if you try to insert a value into an autonumber (identity) field, an error is produced unless you have deliberately turned on IDENTITY_INSERT for the table, which prevents this type of mistake from happening by accident.

I’m not sure what other issues I’m going to encounter with this new environment, but I sure miss SQL Server. If you still don’t think SQL Server is better, consider community support. Who would you rather trust for help?

Pinal Dave (SQL Server) or Don Burleson (Oracle)?

Hogan

Blog Engine .NET Tag Cloud Optimization

For all of the amazing blogs and websites that are out there, hundreds exist that are just average. Most of us are not experts, like Eric Lippert, who can write with authority on a single topic for years and have interesting and new things to say. Instead, there are blogs like mine. I write about whatever is interesting to me at the time. As the years progress, what I write about changes, and I want my website [Blog Engine .NET] to reflect that.

As I create new posts, they are shown on the front page till they get old. However, I have found that the default Tag Cloud widget is only interested in what I have written about most. For example, I took a motorcycle trip with my brother, so I wrote extensively about it. Since that trip almost two years ago, Arkansas was the top item on my Tag Cloud. I have nothing against that wonderful state, but I felt that it made my website out of date.

I have made a simple one line code change to the Tag Cloud that makes it more relevant. Instead of using tags from all 190 posts on my site from the last 2 years, I changed it to only consider tags from posts that are less than 1 year old. After this change was made, the Tag Cloud immediately updated and more accurately reflected what I have been writing about recently.
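The logic of the change is simple enough to sketch outside of Blog Engine .NET. The Python below is an illustration of the idea only, not the widget’s code: the Post class, its fields and recent_tag_counts are hypothetical stand-ins, and the sample posts are invented.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    """Hypothetical stand-in for a blog post; not the Blog Engine .NET class."""
    tags: list
    created: datetime
    visible: bool = True


def recent_tag_counts(posts, now, max_age=timedelta(days=365)):
    """Count tags, but only from visible posts newer than max_age."""
    cutoff = now - max_age
    counts = Counter()
    for post in posts:
        if post.visible and post.created > cutoff:
            counts.update(post.tags)
    return counts


posts = [
    Post(["Arkansas", "Motorcycle"], datetime(2011, 7, 1)),  # old trip post
    Post(["SQL"], datetime(2013, 3, 1)),
    Post(["SQL", "Oracle"], datetime(2013, 5, 1)),
]
counts = recent_tag_counts(posts, now=datetime(2013, 6, 1))
```

With these sample posts the old Arkansas tag drops out and the cloud is built from SQL (2) and Oracle (1) only.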

Now for the details, I’ll assume you’re familiar with Blog Engine .NET:

1. This example uses version 2.0; I have not tested it with other versions.
2. Go to the file (from the root of the source code download) BlogEngine.NET\widgets\Tag cloud\widget.ascx.cs
3. Using your favorite text editor (because we don’t need to compile anything) go to line 197.
  a. This is inside the method private static SortedDictionary<string, int> CreateRawList()
      The old method:

foreach (var tag in Post.Posts.Where(post => post.IsVisibleToPublic

With a minor modification highlighted, we can adjust the time to just the last year’s worth of posts.

4. With the minor change made in the method, save the file and upload it to the same relative path on your Blog Engine .NET installation. The next time a page with the Tag Cloud on it is requested, the file will automatically re-compile and the cloud will be up to date.

Sorry that I used images instead of text, but I wanted the color to come out nice. If you’re nervous to make the code change yourself, you can download the one file to install yourself.

widget.ascx.zip (1.84 kb)

Happy Coding!

Hogan Haake