Monday, July 31, 2006

Bots Soon to High-Card Humans

The online poker rooms are flourishing in what seems to be the Eldorado of our time. But not for long. Around the corner lurk the bots, or computer programs designed for playing poker, currently in hard training to win your hard-earned cash.

Ten years ago I was an avid poker player, constantly on the lookout for new games. Most of the friends with whom I used to play for nickels and dimes are still playing today. They have taken their chip stacks online, and the limits have increased, but from the look of it they are doing well financially. Online poker ten years back consisted of a yearly e-mail tournament and games played over Internet Relay Chat (IRC), including first-generation bots. Online discussions took place on a spam-filled USENET newsgroup; in short, the situation wasn't very exciting. Today, you can hardly visit a web site without seeing advertising for online poker, and poker sites and fora are mushrooming. According to PartyGaming, online poker generated $2.6 billion in gross gaming yield and represented 20% of global online gaming revenue during 2005. To call the development during these ten years an explosion would be an understatement.

But artificial intelligence is about to take over the tables. We have already seen it happen in chess, where Deep Blue beat world champion Garry Kasparov in 1997, as well as in checkers and backgammon. In these games it is believed that the best software program is superior to, or at least on par with, the world's best human players. Poker is somewhat different from chess and backgammon, but they do share a lot of common ground. In poker the cards decide the outcome of a single hand, and in backgammon the dice decide the outcome of a single game. But in the long run the winner is the one who makes the best choices. The skill level of the computer players is steadily improving, but it is not only smart algorithms that make them tough opponents for a human. Poker players are notorious for playing long sessions, and in the wee hours the quality of play decreases at the same rate as the players' eyelids close. Poker bots don't have this problem, nor are they affected by the other major shortcoming of a human poker player: being on tilt, or playing suboptimally out of anger from losing a recent pot.

The game of online poker, with its huge revenue, is an attractive market for bot writers. Sure, online poker rooms prohibit bots, but in reality there are a lot of poker bots that go undetected. With software dominating the world of chess, checkers and backgammon, poker is the next game. Online poker is not the same lucrative business it was a few years ago; the competition has stiffened, and I am convinced that bots have already started to take a piece of the action. I predict that in just a few years a majority of the big money winners in online poker will be bots. When that situation occurs it is questionable whether online poker can survive at the same level as we see today. With more poker bots getting closer and closer to an optimal game, the playing field will be more even, and the earnings will not be enough to beat the poker rooms' fees, unless some unwary humans stick around to feed them.

Building a poker bot to play in an online poker room where its participation is banned requires more than writing the logic needed at the poker table. The poker servers are fighting tooth and nail in a war for their very existence. They are adopting multiple measures of defense: spyware-like functionality that monitors running processes on your computer, pop-up screens (a.k.a. bot challenges), analysis of playing patterns, etc. This war between sites and bots is not fought in the open; both sides prefer to keep a low profile here. The sites do not want to scare away their human clientele, while the bots are fighting detection. Because of their clandestine existence, it is difficult to evaluate the exact state of poker bots today. Some universities are doing research in the field, and the University of Alberta seems to lead the way.

It is exciting to know that my fellow finalist Daniel Crenna is writing a framework for hooking up poker bots to play against each other. I hope that his endeavor will help budding poker bot authors to improve their software. It will be very exciting to see the final results of this project! A similar commercial product, Poker Academy, is also available on the market.

Sunday, July 30, 2006

Writing Documentation

Writing software documentation is probably the most boring part of a project for a developer. However, having blogged during the development process makes it easier. I am able to take some blog entries and paste them into the help files with very little editing. Also, by continuously writing blog articles during the past months, it is easier to fight writer's block.

I am using the free Shalom Help Maker to generate a standard CHM-file, easily accessible from within FeedJournal. Writing the help files is instructive because it places you in the user's shoes, and any design flaw becomes much more apparent. However, I have been working hard to keep the application design simple, and I hope that it is intuitive enough for users, so that they will not need to resort to the help system.

Saturday, July 29, 2006

Deep Linking from RSS

One of the more distinctive and perhaps controversial features of FeedJournal is that it can filter out the meat of an article published on the web.

How does it accomplish this? FeedJournal has four ways of retrieving the actual content for the next issue.

Actual Content
In the trivial case, a site (like this blog for example) decides to include the full article text within its RSS feed. FeedJournal simply publishes the content; no surprises here. By the way, this is how all standard RSS aggregators work. The problem arises when a site decides to publish only summaries or teasers of the full article text. FeedJournal needs to deal with this because it is an offline RSS reader: users cannot click on their printed newspaper to read the full article.

Linked Content
The <link> tag inside the RSS feed specifies the URL for the full article. In case the RSS only includes summaries of the full articles, FeedJournal retrieves the text from this URL.
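As a rough illustration of reading the <link> tag, here is a minimal Python sketch. It is not FeedJournal's actual code (FeedJournal is written in C#), and the sample feed, the example.com URLs and the function name article_links are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed that only carries teasers, not full articles.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First article</title>
      <link>http://example.com/articles/1.html</link>
      <description>Teaser only...</description>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/articles/2.html</link>
      <description>Another teaser...</description>
    </item>
  </channel>
</rss>"""


def article_links(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    pairs = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing.
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        pairs.append((title.strip(), link.strip()))
    return pairs


links = article_links(SAMPLE_RSS)
```

Each link in the result is the URL an offline reader would then have to download to get the full article text.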

Rewritten Link
In most cases, just following this link is not a good solution. The web page typically includes lots of irrelevant content, like a navigation menu, a blogroll, or other articles. FeedJournal lets the user write a regular expression for each feed, automatically rewriting the article’s URL to the URL of the printer-friendly version. As an example, the URL to a full article in the International Herald Tribune is http://www.iht.com/articles/2006/07/28/news/mideast.php while the link to the printer-friendly version is http://www.iht.com/bin/print_ipub.php?file=/articles/2006/07/28/news/mideast.php. By inserting bin/print_ipub.php?file=/ in the middle of the URL we reach the printer-friendly article. This version is much more suitable for publishing in FeedJournal, because it contains more or less only the meat of the article.
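The International Herald Tribune rewrite can be sketched with a regular expression as follows. This is an illustration in Python, not FeedJournal's own rule format; the pattern, the replacement template and the function name rewrite_link are assumptions made for the example.

```python
import re

# Hypothetical per-feed rule: a pattern matching the article URL and a
# replacement template that produces the printer-friendly URL.
IHT_PATTERN = r"^(http://www\.iht\.com)(/articles/.+)$"
IHT_REPLACEMENT = r"\1/bin/print_ipub.php?file=\2"


def rewrite_link(url, pattern, replacement):
    """Return the rewritten URL, or the URL unchanged if the pattern does not match."""
    return re.sub(pattern, replacement, url, count=1)


article_url = "http://www.iht.com/articles/2006/07/28/news/mideast.php"
printer_url = rewrite_link(article_url, IHT_PATTERN, IHT_REPLACEMENT)
```

Because re.sub leaves a string untouched when nothing matches, a rule written for one feed does no harm if it is accidentally applied to a link from another site.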

Filtered Content
“More or less”, I said in the last sentence. There are usually some unwanted elements left in the printer-friendly version, like a header and a footer. These can be filtered out by letting FeedJournal begin the article after a specified substring in the HTML document source. Likewise, another substring can be selected as ending the relevant content.
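A minimal Python sketch of this kind of substring filtering is shown below. The marker strings, the sample page and the function name extract_between are invented for illustration; how FeedJournal handles a missing marker is not stated here, so the fallback behavior is an assumption.

```python
def extract_between(html, start_marker, end_marker):
    """Return the text after start_marker and before end_marker.

    Assumed fallback: if a marker is not found, the document start or
    end is used instead, so nothing is lost by a stale rule.
    """
    start = html.find(start_marker)
    begin = 0 if start == -1 else start + len(start_marker)
    end = html.find(end_marker, begin)
    if end == -1:
        end = len(html)
    return html[begin:end]


# Hypothetical printer-friendly page with a header and footer left in.
page = (
    '<html><div id="header">Site header</div>'
    "<!-- article start --><p>Body text.</p><!-- article end -->"
    '<div id="footer">Footer</div></html>'
)
body = extract_between(page, "<!-- article start -->", "<!-- article end -->")
```

Searching for the end marker only from the start position onward avoids cutting the article short when the same string also appears in the header.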

By applying these functions it is possible to scoop, or extract, the meat of almost any web-published article. Of course it is only necessary to do this once for every feed. To my knowledge, FeedJournal is the only aggregator that has the functionality described in the last three sections.

Is this legal, you ask? Wouldn’t a site owner require each user to actually visit the web site to read the content and click on all those fancy ads sprinkled all over? Well, my stance is that if the content is freely available on the web, I am free to do whatever I want with it for my own purposes. Keep in mind that we are not actually republishing the site’s content; we are only filtering it for our own use. Essentially, I think of this as a pop-up or ad blocker running in your browser.

What is interesting to note is that some web sites have tried to include in their copyright notice a paragraph limiting the usage of their content. Digg.com, for example, initially had a clause in their copyright notice effectively prohibiting RSS aggregators from using their RSS feeds! Today, it has been removed.

As long as FeedJournal is used for personal use, and the issues are not sold or made available publicly, I do not see any legal problems with the deep linking.

Friday, July 28, 2006

Time for Code Freeze

Time, quality, resources and scope. Those are the four variables in software project management. As the deadline closes in, I only have the luxury of changing scope. Sure, there are more features I planned to get into this version, but the scope will be cut in order to make the release stable and have a timely delivery. Time is a rare resource for me these days with being a new father, having a full-time job, following the latest news about the regional conflict, and blogging/developing FeedJournal. Despite that, I am proud of what I have accomplished so far with my project in Visual C# 2005 Express Edition.

One week remains until release, and the time has come for Code Freeze: no more new features. Until August 6th I will work on finalizing documentation, web site, and of course testing.

FeedJournal will become a commercial project in version 2.0. Until then the fully functional version 1.0 will be the one submitted for the Made In Express Contest under a shared source license. That basically means that the source is available but there are no rights to use this source code in your own projects. I plan to add plenty of cool features before a commercial release plus some optimizations under the hood. I would like to thank everyone who contacted me with feature requests or comments. What you should expect to see in future versions are:

  • Sections
  • Layout templates
  • Images
  • Browser integration for publishing a selected web page in the next issue
  • PDF import (nice for those online crossword puzzles and Sudokus)
  • Advanced article scoring using user-defined keywords and extended RSS tags
  • Scheduled publishing and printing
  • HTTP authentication and cookie support
  • Improved throttling support and adhering to servers’ TTL setting

There are also some ideas I had envisioned early on in the project, prior to starting implementation, that have been moved to the recycle bin. There is nothing strange about this; it is a normal reality check once you get down to the fine points of how things are supposed to work. For example, I had planned to rank articles according to web popularity (Technorati, Digg, delicious, Google, Yahoo, etc.). After much research into the various service APIs it is clear to me that this is simply not practical. Having x articles would mean sending x web requests to each of the various services. Until they support a technique for bundling together multiple requests into one, this feature will not be a part of FeedJournal.

In the meantime, I am looking for serious beta testers for future FeedJournal versions who will be rewarded for constructive feedback and testing.

Sunday, July 23, 2006

Test-Driven Development

Test-Driven Development is a paradigm shift with its novel approach to software development. It puts the fun back in development, while improving quality and end-user satisfaction.

Unit testing is one of the cornerstones in agile development. A comprehensive suite of automatic tests for your source code generates many benefits. Your confidence in modifying the source code will skyrocket because you know that if you unintentionally break something the tests will catch it.

But this is just the first step. By writing a test for a new feature before the actual implementation, you are set up to reap additional benefits. For most developers uninitiated in this technique, it sounds counterintuitive and like a big waste of time. Still, this is the way that I have developed software for the last couple of years, and I really can't see myself writing a substantial amount of code not using this technique.

The technique I described above is called Test-Driven Development (TDD) and is steadily gaining popularity. It stems from one of twelve extreme programming practices, and many friends and colleagues to whom I introduced it swear by TDD today. The main benefit gained from a test-driven approach is that you are approaching the solution from the top, from the user’s perspective.

Too many potential killer apps are destroyed by programmers who develop bottom-up. When it comes to designing the interface (like a GUI or API), they find themselves tied down to their early, ill-informed low-level design choices. By doing TDD you let the requirements drive the development. As a positive side effect, it is a thrill to develop software this way: you are always focused on reaching one specific goal, usually not that far away. It is difficult to lose focus on what you are out to accomplish in the source code while trying to get a test to pass.

Automatic tests can also be used to specify programming assignments. You hand your colleague a set of tests that lack implementation and ask him to implement the required classes so that the tests will pass.
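To make the test-first idea concrete, here is a small sketch using Python's built-in unittest module, a member of the xUnit family. The function shorten_headline and its behavior are hypothetical and not taken from FeedJournal. In TDD the test class is written first and fails; the implementation is then added to make it pass.

```python
import unittest


def shorten_headline(text, max_length):
    """Implementation written after the tests below, to make them pass.

    Assumes max_length is at least 3 so that the ellipsis always fits.
    """
    if len(text) <= max_length:
        return text
    # Reserve three characters for the trailing ellipsis.
    return text[: max_length - 3].rstrip() + "..."


class ShortenHeadlineTest(unittest.TestCase):
    """These tests act as the specification handed to the implementer."""

    def test_short_headline_is_unchanged(self):
        self.assertEqual(shorten_headline("Bots win", 20), "Bots win")

    def test_long_headline_is_cut_and_marked(self):
        self.assertEqual(
            shorten_headline("Bots Soon to High-Card Humans", 12),
            "Bots Soon...",
        )

    def test_result_never_exceeds_limit(self):
        self.assertLessEqual(len(shorten_headline("x" * 100, 10)), 10)
```

Saved to a file, the suite can be run with python -m unittest followed by the file name. Handing a colleague only the test class, with the function body removed, turns it into exactly the kind of programming assignment described above.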

So how do you do this in practice? Well, if I were you I would head straight over to xUnit’s Wikipedia entry, which lists free tools for almost any language/platform. For .NET development, NUnit is the testing framework of choice. Good luck with your tests!

Friday, July 21, 2006

FeedJournal Sample Issue


Yes folks, we have a world premiere: the first sample of a FeedJournal issue is available for your viewing pleasure! Let me remind you that the purpose of the FeedJournal project is to generate a PDF newspaper based on RSS feeds, intended for printing. The PDF file is available for download here. In order to open it you will need Adobe Reader or Foxit Reader.

The content spans a selection of last week’s blog entries from the Made In Express Contest finalists. I chose these feeds, not because I want to plug the contest, but because I want to avoid breaching copyright law by republishing other blogs’ articles.

So what can you see in this sample issue? The following settings are in use: A4 paper size (a European standard), 4 columns, 0 points line spacing, 8 points column spacing, 30 points page margin and 10 points margin between headline and article text. Furthermore, the headline is Times New Roman (22 pt bold), article text is Times New Roman (8pt), publishing date is Lucida Console (5pt) and news source is Arial (9pt italic). All of these settings, and others, can be customized from the application’s Options dialog.
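As a worked example of what these numbers imply, the sketch below computes the width of a single column. It assumes that the page margin applies to both the left and right edge, that all columns are equally wide, and that A4 is 210 mm wide at 72 points per inch. The function names are made up, and this is not necessarily how FeedJournal itself lays out a page.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72.0


def mm_to_points(mm):
    """Convert millimetres to typographic points (1/72 inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def column_width(page_width_pt, page_margin_pt, columns, column_spacing_pt):
    """Width of one column, assuming equal columns and symmetric margins."""
    usable = page_width_pt - 2 * page_margin_pt
    gaps = (columns - 1) * column_spacing_pt
    return (usable - gaps) / columns


a4_width_pt = mm_to_points(210)  # A4 paper is 210 mm x 297 mm
width_pt = column_width(a4_width_pt, page_margin_pt=30, columns=4, column_spacing_pt=8)
```

Under these assumptions each column comes out at roughly 127.8 points, or about 45 mm, which gives a feel for why the 8-point body text is needed to fit a readable line.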

But there are also things that you cannot see in the sample, such as images. Besides the masthead (newspaper lingo for the first page logo), there are no images. Future versions of FeedJournal will include support for images contained in blog entries. Another thing not visible in the sample issue is the already implemented support for long articles to jump between pages if they do not fit. The reason is simply that there weren’t any long articles available in the selected time span.

And let me finally take the opportunity to congratulate my fellow finalists for getting published in the newspaper! ;-)

I would appreciate any feedback regarding the sample issue. Don’t hesitate to contact me through the blog comments or by e-mail at contact@feedjournal.com.

Wednesday, July 19, 2006

The Modern Emigrant

When I decided to move to a different part of the world two and a half years ago, I knew it wouldn’t be easy. My confidence was strengthened by the knowledge that I adapt quickly to new situations. The Internet has been a big help for me in staying in touch with my roots, but there are still hurdles which emigrants will always face.

Deep in Swedish culture we find Wilhelm Moberg’s classic book “The Emigrants” (in Swedish “Utvandrarna”). It tells the story of a family’s decision to move away from hard times in Sweden to try their luck in America, and their subsequent lives there. The story was written in the mid-twentieth century and takes place a hundred years earlier. The framework of the story is still relatable today in many parts of the world where people flee poverty or worse, seeking new golden opportunities elsewhere.

The situation for me, and others who decide to emigrate from modern western countries, was different. I didn’t escape anything; I had a good life in Sweden with friends, family and employment. I came here for the very modern reason of having met my life companion over the Internet. Coming here I left my entire family behind. In the days of Moberg’s story, moving to another continent basically meant being cut off entirely from relations with your old country. News and other communications traveled at a laughable speed compared with today’s global broadband networks.

The Internet as a communications medium has really transformed the life and experiences of the modern emigrant. Living on a different continent, I still have the possibility to listen to live Swedish radio, watch the latest newscasts and read the latest local news from my old hometown. I don’t miss one episode of my favorite Swedish reality TV show. I strike up a text, voice or video conversation with old buddies no matter if they are back home in Sweden or playing poker in Las Vegas. Tools like Miranda (IM), Skype (VoIP), Juice (podcast), µTorrent (BitTorrent) and Firefox (web browser) are closing the distance between being home and away. All of this is of major importance because it lets me stay in touch with my roots and also use my mother tongue, to which I will always return for inner peace.

The Internet as an e-commerce tool is growing more slowly but at a steady rate. I still buy my books from Amazon, just as I did before I emigrated. Amazon’s price, service and delivery time are the same here and there. I spent considerable time in my new country trying to locate book shops which carried the right books for me, but in the end it was a waste of my time. Amazon is simply what I am used to and where I feel comfortable browsing and buying books.

Still, some of the problems emigrants face are eternal. The language barrier is a major obstacle for every emigrant. Adapting to a new language is a hurdle that you have to clear in order to assimilate into your new society. Going to the supermarket and asking them in English (assuming it is not the country’s native tongue) where to find the mayonnaise is like putting a huge stamp on your forehead saying: “I don’t belong here”. Belonging and assimilation are equivalent for the emigrant, and the answer lies in mastering the language. And by language, I also include body language, which is just as hard to master as the spoken language.

There is still another hurdle the emigrant must pass in order to assimilate, and it’s a tough one once you have passed a certain age. You are not familiar with local celebrities in your new country, no matter if it is sports stars, singers, TV hosts, authors, business men, politicians or actors; you are simply clueless. It is similar to trying to solve a New York Times crossword puzzle for a non-American or a Guardian puzzle for a non-Brit. All those clues about “Jeopardy host” or “Folk singer Guthrie” make you put your pen down in despair.

What all of this boils down to is that emigration is and continues to be a great adventure and learning experience for me. Looking back upon my decision to live in Israel, I don’t regret it for a second.

Monday, July 17, 2006

Rockets and Progress

In an instant the situation here has deteriorated. Rockets explode closer and closer to our home (so far a safe distance away), and it will surely take time before we will see an end to it all. Native Israelis are more relaxed about the situation, more adapted, or perhaps it's just an image they are putting up. Having recently become a father makes me worry about my family's security. Between closely monitoring the latest headlines, spending time with our 1.5-month-old daughter Noa and working my butt off at my day job, the contest deadline is slowly closing in.

I started to write the FeedJournal help file but I haven't decided on a format yet. HTML is attractive because I can easily host an online version of the help files, while keeping them up to date with minimal maintenance. CHM files are more standard and look more professional though. The jury is still out...

The application itself is starting to be finalized and I haven't forgotten my promise to submit a sample PDF newspaper issue here. Patience! For now here is a screenshot of the main window.

Tuesday, July 11, 2006

SQL Server 2005 - Everywhere Edition

Microsoft recently announced a new edition in their SQL Server 2005 family of products: Everywhere Edition. This is a free and lightweight version of SQL Server 2005.

So what is different from the Express Edition that is required for us Made In Express finalists? I can only speak for myself, but the Everywhere Edition would be more suitable than the Express Edition for my Windows Forms application for a number of reasons (source: Steel Price's blog):

  • runtime size is only 1.4MB (in-process),
  • single data file without transaction log,
  • smaller redistributable package,
  • embeddable in applications.

In short, Everywhere Edition is more system resource friendly! Of course, there are some limitations in Everywhere compared with Express; for example, it cannot run as a service and lacks multiuser support. These issues are not relevant for my project FeedJournal though.

So what do you say, Microsoft: can we use the Everywhere Edition in the Made In Express Contest?

Sunday, July 9, 2006

Creativity as Driving Force

Take a minute to remember the last project you completed alone. Can you remember the satisfaction of seeing the pieces fall into place to build a greater whole, of putting on the finishing touches and perhaps launching it publicly? This satisfaction, in its best moments, defines one of the greatest feelings in life. It is the driving force for artists, hackers, bloggers, journalists, and anyone who lets their creativity be a central part of their day-to-day activities.

Creativity can be manifested in different ways for different people. The force of it is just as powerful though, no matter if it is being used for cooking, gardening, writing, drawing, composing music, or anything else. I am a software developer, and my choice of profession has a lot to do with getting an outlet for my desire to be creative. It is my firm belief that people gain happiness and satisfaction from nurturing and giving in to their creative impulses.

Society brings with it social pressures and expectations as well as stereotypes we are expected to fit into. These are stifling our creativity, and it is up to each and every one of us to find our own way. To find our own way is not an easy thing and we must constantly fight this war, in order not to fall back into stereotypical behavior and dissatisfaction. It is by leveraging our creativity that we can force this issue to our advantage. The war against stereotypical behavior must be fought on many fronts: professionally, as a family, in your relations with close ones, and of course in your dealings with yourself.

A personal example: Last year I decided that I wanted to start my own software development business. This decision stemmed from the fact that I had always had creative positions in my professional life until I emigrated from Sweden to Israel. Here in my new country I started to work in a position where my creative juices weren’t flowing like I was used to. I was more and more missing the development work I had always been doing in the corporate world. I decided that the best way for me to express my creativity was by building software applications on my own time.

It is the first time I am creating a large software project on my own, including marketing and selling it. It is nothing short of amazing. Every day I am learning something new, and the fact that I am my own boss means that I can focus my creativity in the direction where I am feeling most productive that day. One day it means writing articles or blog entries, another day it means coding or web site building.

Going it alone in the software business is of course not a new phenomenon (small shareware shops were heard of decades ago), but there is a rather recent term for it: micro-ISV (Independent Software Vendor). It denotes one or a few individuals who are building and selling software products. The term was coined by Eric Sink in an MSDN Magazine article, and it stuck in the industry. Today, there are books, active forums, podcasts and web sites dedicated to helping out the budding or already blossoming micro-ISV entrepreneur. These micro-ISV and shareware people who are going it alone have all made a proactive choice about using their creativity daily.

I think there is a reason we are seeing an entrepreneurial boom in the software industry today. The growth and acceptance of e-commerce, along with more powerful and affordable Rapid Application Development tools (Visual Studio Express comes to mind), make it much easier for the single software developer to make a living today. It’s a beautiful world where creativity can pay the bills!

Wednesday, July 5, 2006

Why FeedJournal? (or why the information age matters)

The idea of an RSS-syndicated newspaper came to me when I was subscribing to a morning newspaper last year. I hadn't had a morning paper for years then, so it was all a bit new to me. I really enjoyed having access to news hot off the press, which I could read without having to stare into the computer monitor; for example in the comfort of my bed, sofa, or while traveling. But there were two things I strongly disliked about it: the monthly subscription was fairly expensive and I didn't really care for a majority of the content in the newspaper. The competing newspaper had a few sections that I would much rather read, but I couldn't afford to spend my time reading more than one morning newspaper. I knew that there were better ways out there for accessing relevant news in a comfortable way. I just needed to find them.

Content is king. There are no two ways about it. When people were talking about the information age ten to fifteen years ago I didn't get it. I didn't see how the management and distribution of content could become so central in a society that it would name a whole time period. But I am starting to see it now: how a low signal-to-noise ratio can kill the greatest endeavor; how the delivery of timely and to-the-point information can be of extreme value; and how the production of high quality content in itself can form an outstanding business plan.

I’ll say it again, the "production of high quality content in itself can form an outstanding business plan". Traditionally and historically the great content producers also had to be great content deliverers in order to survive. They had to make sure that the newspapers or books were printed and delivered to make any kind of business. Today, all of this has changed. Today, we have electronic delivery of the same content that used to make up newspapers and books, through for example the World Wide Web.

But along with the change of delivery method we as customers are losing out on some of the great and time-proven ways of accessing the content. We need to make a compromise between reading a newspaper online with all the latest events, or in paper format with news that in our fast-paced life is already old (just by a few hours but still old). This is where FeedJournal comes into play. FeedJournal serves as a content deliverer and presents information from whichever sources you want in a traditional format that was the default way of reading news for a very, very long time.

Of course, this is just one out of many of FeedJournal’s benefits over a traditional newspaper. It also empowers the user with the option of collecting multiple feeds to create a newspaper that is tailored for her own needs: with the local team’s results, the stock portfolio’s development or even personal e-mail. It gives the user the possibility to choose the deadline to be the exact moment she wants, not six hours before it will actually be read. And of course the paper size can be decided: A4, A3, letter size or why not an index card version that you can put in your Hipster-PDA? Gone are the monthly subscription fees for delivery, you only need to pay for the actual content in case your favorite news source doesn’t provide it for free on the web already.

Saturday, July 1, 2006

Microsoft CodePlex

A few days ago Microsoft officially launched their open-source project hosting web site CodePlex. It is great to see Microsoft finally embracing and supporting the open-source community with an initiative like this. Like all web launches these days CodePlex is a work in progress, and even though the functionality is still a little thin, I see great projects coming out of it very soon. What makes CodePlex stand out compared to the established player, SourceForge, is the user interface and user friendliness. Sure, it is limited to .NET projects but isn't that what we all are passionate about?

FeedJournal will not be hosted on CodePlex, but I will definitely consider submitting other projects of mine there, or joining something interesting. The reason I will keep FeedJournal off CodePlex is that I plan to take FeedJournal commercial after the publication of the free 1.0 version, which I will submit to the Made In Express Contest.