Marketplaces Enforce Master-of-None Mentality

Marketplaces are great. On my Android phone I have, at my fingertips, a huge number of applications that just work. Marketplaces provide us with a sense of security. To uninstall an app, there is guaranteed to be exactly one thing you must do. To install an app, there is exactly one way to install it. It is self-contained; there are no dependencies I have to install. Configuration is minimal, if it exists at all. Discovering how to launch your app is straightforward. It just works.

Let's contrast that with a typical Linux system. I use Arch Linux. So, when I go to install an application, I use pacman -S someapp. And I cross my fingers and pray that it works. Usually it does. Sometimes I have to manually download and install things that aren't in this blessed "marketplace" of sorts. It's never as seamless as "closed" markets though. A Linux application can do anything. It could corrupt my system (if I give it sudo), it could trash my home directory, it could install spam that I could never figure out how to uninstall.

These are two sides of a coin. They are naturally at odds. There isn't really a good way of curing these problems with Linux. Most people would say they aren't problems, but rather design choices (myself included).

Dependencies... how I miss thee

So, what's this all about? If you look on the Android Marketplace, the iOS App Store, or, god forbid, the Windows Store, you'll see a stark difference compared to Arch Linux's packages. And no, it's not the open source aspect.

If you want to search through a file in Linux, you'll probably use something like:

cat somefile | grep 'something'

You'll use the cat utility to read the file in and pipe the contents to grep, where grep will search across the file for "something".

How do you do that on Android? Or Windows 8/RT?

Basically, you can't. At least, not in a good way. With Android, file managers are possible, and most of them include some basic searching capabilities, but you won't get the power of grep. You won't be able to do awesome shit like you can by combining the strengths of different applications.

If I wanted to write a file search utility for Android, I'd have to first build a sub-par file browser to navigate to the file, and then implement my actual search functionality.

Markets enforce master-of-none mentality

I once had a magnificent plan to port my scripting language to Android. How much work would that require?

  1. File browsing/saving/loading
  2. Text editor (syntax highlighting, searching, etc. More than just a text box)
  3. My programming language

And that's just the start. If I want to provide APIs in my language to search in files, I have to implement that. If I want network access, I have to provide that. There is no netcat, or grep that people could utilize instead of my sub-par APIs.

Why netcat doesn't exist in markets

If you wanted to implement a netcat utility in any marketplace, it'd be fairly pointless. The power of netcat comes from being able to pipe it to other places that the original authors never even dreamed of. What's that, you want to make a crude, one-way TCP/IP proxy?

nc -l -p 8080 | nc example.com 80

You want something that can encrypt a file and send it off somewhere?

openssl aes-256-cbc -salt -e < file-to-transfer | nc example.com 9999

How would you do this in a marketplace application? Sure, maybe you could cobble together some solution like finding a dedicated TCP proxy. And then finding a file encrypter and a TCP/IP program that can send files... but this requires that someone developed such an application beforehand.

You can't just create some general purpose utility. You must create some "multi" purpose utility where you came up with all of the interesting use cases you could and implement them. If you missed one, then there just isn't a solution to that problem. There is no way to combine your program and some other program to solve the problem. It's all or nothing.

It's not just markets

If you notice, desktop Windows does this to a certain extent as well. Its I/O redirection is downright terrible (although I hear PowerShell is nice). This is probably why you see all-in-one applications everywhere. Linux has a general "air" about it that encourages you to make things modular and enable the utilization of other tools where possible.

However, marketplaces are the only place where this is actually enforced. Windows 8 has extremely limited IPC functions. Oh, you gave me a (very limited) search API that works across every application, big whoop. Windows 8 especially enforces it. Did you know that you can't make a general purpose text editor in Windows 8? Impossible. There is no way to open every file with a single application. They force you to declare which file extensions you'll be allowed to edit (and no, * doesn't work).

Finally, the bugs

Have you ever encountered a bug in a walled-garden application? Of course you have. Would you say you encounter them more than in desktop applications? Probably. Developers can't worry about only one thing because if they don't implement it, then their application can't do it. You get a feature request in your netcat wannabe for sending text on-demand instead of files. Now you have to implement some kind of text editor. Now some people want to be able to return an automated response that returns the current date and time. Yeah, good luck with keeping up with the wishes of your users.

Developers can't just worry about the one thing they do well. They also have to worry about all the things people might want to combine to make their application more useful. This is why I believe that most market applications have more bugs than their counterparts in desktop operating systems.

For the picky

Yes, I know I probably have some false assumptions, but I'm not far off. I'm no pro in Android and such. It's probably possible to do some rudimentary IPC and maybe even some kind of dependency stuff... but it's not the norm, and I know it's probably not easy for you OR the end user.

Posted: 4/30/2013 4:22:51 AM

A Proposal For Spam-Free Writeable APIs

I've been interested in Bitcoin recently, but it would appear I'm too late to the party to make any money on mining. So, what's the next best thing? Taking their idea and using it elsewhere.

The idea behind Bitcoin is to make a particular thing a rare commodity. Now let's pretend we have a website like, say, http://stackoverflow.com. We want to make a public API for it that is writable. Current options appear to be

  1. API keys which require a human to register
  2. ????

I'll throw a second option into the mix. "API Coins" which require a fair bit of computing power to create and are only good in a certain context.

Let's say you wanted to make an account at stackoverflow with a machine that didn't require any human interaction, or rather, didn't require a captcha, valid email, personal info, etc. In theory, a program could register it completely in an automated fashion.

My proposal to prevent masses of spam bots: make it expensive. Use a Bitcoin-like scheme. Instead of SHA256, I'd go for scrypt because it's memory-hard, so GPUs gain far less of an edge over CPUs, and thus it's practical to execute from JavaScript.

So, when you visit the register page I provide something like

  1. Conditions a hash must match (difficulty)
  2. The value hashed must contain a certain provided phrase (to prevent pre-mining of API coins)
  3. That's it!

You calculate a hash which matches and poof! You've got an API key. Ideally, this would be a process that would take no more than 5 minutes on the slowest of hardware. Now, when you need to perform an operation, there will be another hash request, but it won't be as intense as the creation of your API key... but if you're a bad boy, your API key will get banned and you'll have to generate a new one.
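To make the mint-and-verify cycle concrete, here is a minimal sketch in C#. A few loud assumptions: it uses SHA-256 from the standard library as a stand-in for scrypt (which .NET doesn't ship), "difficulty" is taken to mean a number of leading zero bits in the hash, and the names ApiCoin, Mine, and IsValid are invented for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ApiCoin
{
    // Hash "phrase:counter". SHA-256 is only a stand-in for scrypt here.
    static byte[] Hash(string phrase, long counter)
    {
        using (var sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(phrase + ":" + counter));
        }
    }

    // Count how many zero bits the hash starts with.
    static int LeadingZeroBits(byte[] hash)
    {
        int bits = 0;
        foreach (byte b in hash)
        {
            if (b == 0) { bits += 8; continue; }
            for (int mask = 0x80; (b & mask) == 0; mask >>= 1) bits++;
            break;
        }
        return bits;
    }

    // Server side: a single hash is enough to check a submitted counter.
    public static bool IsValid(string phrase, long counter, int difficultyBits)
    {
        return LeadingZeroBits(Hash(phrase, counter)) >= difficultyBits;
    }

    // Client side: brute-force counters until one satisfies the difficulty.
    public static long Mine(string phrase, int difficultyBits)
    {
        long counter = 0;
        while (!IsValid(phrase, counter, difficultyBits)) counter++;
        return counter;
    }
}
```

The asymmetry is the whole point: the client does roughly 2^difficultyBits hashes on average, the server does one. The per-operation request would be the same call with a smaller difficultyBits.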

Now, how does our site know that API keys are "valid" without pre-mining risk? The key is to make the nonce phrase be random and unique, but slightly persistent. So, when the request is made to get the nonce, it is stored for, say, an hour. If the API key isn't "found" within that window, the nonce is considered invalid. This would prevent batching of API key creation.
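Here is a rough sketch of that "random, unique, but slightly persistent" phrase bookkeeping. ChallengeStore, Issue, and Redeem are hypothetical names, the store is a plain in-memory dictionary (not thread-safe, and a real site would want something shared between servers), and the clock is passed in only so expiry is easy to test.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public class ChallengeStore
{
    readonly Dictionary<string, DateTime> issued = new Dictionary<string, DateTime>();
    readonly TimeSpan lifetime;
    readonly Func<DateTime> now; // injectable clock

    public ChallengeStore(TimeSpan lifetime, Func<DateTime> now)
    {
        this.lifetime = lifetime;
        this.now = now;
    }

    // Hand out a fresh random phrase and remember when it was issued.
    public string Issue()
    {
        var bytes = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        string phrase = BitConverter.ToString(bytes).Replace("-", "");
        issued[phrase] = now();
        return phrase;
    }

    // A phrase can be redeemed only once, and only before it expires.
    public bool Redeem(string phrase)
    {
        DateTime when;
        if (!issued.TryGetValue(phrase, out when)) return false;
        issued.Remove(phrase);
        return now() - when <= lifetime;
    }
}
```

Because each phrase is forgotten as soon as it is redeemed, a mined hash can't be replayed, and because unredeemed phrases age out, stockpiling work ahead of time doesn't help.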

To help enforce these "hard" checkpoints, if a user, say, wanted to post a comment, they'd be given a request like the API key request: a certain difficulty and a phrase to be contained within the pre-hash value. Ideally, this would be significantly easier than generation of an API key. You could also enforce throttling at this phase by increasing the difficulty for their account as they post more and more things.

The other awesome part about this scheme? It's anonymous other than the IP address in the logs. You can be reasonably sure that it's a human posting while getting absolutely no personal information and storing absolutely no personal information. No passwords needed. You effectively have a sort of private key instead, stored in a cookie or some such.

This also enables awesomely easy registration for users of your API. "What's an API key?" crops up plenty. Eliminate the need for it!

Some unsolved problems with this approach, however:

  1. How to link accounts with it? Assuming you'd want multiple API keys to each API user?
  2. Password to facilitate linking accounts?
  3. What if you lose your key?
  4. What about those mystical FPGA scrypt machines I've heard rumors about?

I might throw together an extremely simple "micro-blog" thing (Twitter clone) that uses this concept just to see how it turns out. The hardest thing would probably be implementing scrypt in JavaScript.

Note: One last thing. This isn't to "stop" spam. It's rather to make your site so expensive to spam that it's not profitable. Sure, you can always rent out a few hundred EC2 VMs or some such and compute a few hundred API tokens, but how much is that going to cost? How much do you expect to make from spamming that site?

Posted: 3/31/2013 4:11:10 AM

Breaking Changes For Everyone!

So, remember how I said there would be no more breaking changes to the router of BarelyMVC? Well, the whole "making it testable" effort showed that the current API sucked major balls. We need some way to simply get an IServerContext into the created HttpHandler. It's not really possible without magic the way the API currently is... So, it's changing.

The Proof Of Concept for a tiny taste of the new API is here. Highlights:

  • Fluent API: blog.Handles("/blog/new").With((c)=> c.New()).RequiresAuthentication()
  • Worry less about getting data from routes/forms into your HttpHandler methods
  • Treat handlers more like controllers
  • No more reliance on static class members like HttpContext.Current
  • Will reduce code duplication for adding similar routes on the same "controller"
  • STILL no reflection or manual casting required! Not even an explicit generic parameter!

With the way I foresee this working, I can honestly say it looks significantly better than ASP.Net MVC's way of routing. I mean, we're talking FLUENT API cool. I'd dare to say it's also better than OpenRasta's form of routing.

In case you were too lazy to look at that gist, here is an example:

var blog=router.Controller(() => new BlogController());
blog.Handles("/foo/bar").With((c) => c.View());

Can it read any more like plain English? I don't believe so. And still, no magic, no reflection, no casting. Just good ol' fashioned generic delegates and some neat compiler support for implicit generic parameters.
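For the curious, here is a stripped-down sketch of how an API shaped like that can work with nothing but generic delegates and type inference. This is not BarelyMVC's actual implementation: Router, ControllerBox, and RouteBuilder are stand-in names, handlers simply return a string, and there is no IServerContext or authentication.

```csharp
using System;
using System.Collections.Generic;

public class Router
{
    readonly Dictionary<string, Func<string>> routes = new Dictionary<string, Func<string>>();

    // T is inferred from the lambda's return type, so callers never spell it out.
    public ControllerBox<T> Controller<T>(Func<T> create)
    {
        return new ControllerBox<T>(this, create);
    }

    internal void Add(string path, Func<string> run) { routes[path] = run; }

    public string Execute(string path) { return routes[path](); }
}

public class ControllerBox<T>
{
    readonly Router router;
    readonly Func<T> create;

    public ControllerBox(Router router, Func<T> create)
    {
        this.router = router;
        this.create = create;
    }

    public RouteBuilder<T> Handles(string path)
    {
        return new RouteBuilder<T>(router, create, path);
    }
}

public class RouteBuilder<T>
{
    readonly Router router;
    readonly Func<T> create;
    readonly string path;

    public RouteBuilder(Router router, Func<T> create, string path)
    {
        this.router = router;
        this.create = create;
        this.path = path;
    }

    // A new controller is created per request, then the chosen action runs on it.
    public RouteBuilder<T> With(Func<T, string> action)
    {
        router.Add(path, () => action(create()));
        return this;
    }
}

public class BlogController
{
    public string View() { return "blog post"; }
}
```

With these in place, the two lines from the example compile as written, and router.Execute("/foo/bar") ends up calling View() on a freshly created BlogController. Since With returns the builder, further fluent calls could hang off it.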

So, yes, it's a huge breaking change, but your code will suck less after migrating. Trust me, I have about 50 lines of code just for routing for this blog. I don't take breaking changes to routing lightly.

Posted: 2/14/2013 7:17:37 AM

BarelyMVC Roadmap

So, I've been working on BarelyMVC recently and established that there isn't a formal roadmap. I think that's a bit of a disgrace and wish to change that. So, here is the roadmap target for version 1.0 (in order, sorta):

  1. Rework to use IServerContext so the entire framework is easily mocked and unit testable (and as a result, so is the application built on top of it) (note: the API should be fairly stable throughout this conversion)
  2. Strive for better unit test coverage(Don't plan on measuring it, but a lot better than right now)
  3. Get session support built into FSCAuth
  4. Integrate CacheGen into BarelyMVC
  5. Documentation and a tutorial or two
  6. Visual Studio and/or MonoDevelop project templates
  7. Compare and contrast document between ASP.Net MVC and BarelyMVC
  8. Set up a CI and/or nightly build server

Posted: 1/20/2013 7:50:03 PM

ASP.Net MVC and BarelyMVC performance comparison

So, I was very curious as to how ASP.Net MVC and BarelyMVC stacked up against each other performance-wise. So, I did some benchmarking! I believe these numbers are fairly accurate, but I didn't build a dedicated machine for it, so they should be taken with a small grain of salt.

First off, the two test projects can be downloaded here. It's just two bare-bones projects: the ASP.Net MVC "welcome to MVC" type template site, and my recreation of that in BarelyMVC using the standard BarelyMVC style.

Platform

  • Arch Linux 64-bit (kernel 3.x)
  • 8 GB of RAM
  • Two 500 GB hard drives stuck together with RAID-1
  • Mono 2.10.8
  • AMD Phenom II X6 (6 cores)
  • Release mode builds with debugging disabled in the web.config
  • Served using Mono's xsp
  • Barebones sample. No database or other I/O
  • Each test was "warmed up" (I loaded a page before beginning the test, to let things compile where needed)
  • ab -n 10000 -c <concurrency> http://127.0.0.1:8080/ was the command used

And, here are the fancy charts I made:

[Chart: requests/second performance measurement]

[Chart: time/request performance measurement]

As you can tell, BarelyMVC blows ASP.Net MVC out of the water! Big things to note are that BarelyMVC can serve a request in just over 1ms in the best case. ASP.Net MVC needs at least 7ms. Also, an interesting thing to note is that BarelyMVC's performance with very high concurrency actually stays kinda sane. A 100ms request is bearable. A 300ms request is approaching noticeably slow. I also had results for a concurrency level of 1000, but they made the graph harder to read. Hint: ASP.Net MVC didn't get better (although BarelyMVC started getting a bit insane as well).

Requests per second is known as a fairly useless metric, but I still think it has some use to show how much load a server can handle in massive concurrent usage.

Anyway, if you're considering making something that has to stand a lot of load and you're open to alternative (i.e., non-Microsoft) frameworks, you should definitely take a look at BarelyMVC. Its API is fairly stable now and it's quickly approaching beta status. It's raw and to the metal with as little magic as possible... but thanks to T4 and lambdas, it's still easy to read, write, and debug. (Also BSD licensed! :) )

Posted: 1/18/2013 4:30:51 AM

CacheGen Proof Of Concept

So, I finally have CacheGen to where I can probably integrate it into this website. I did some rough concurrency testing (spawning 60 threads accessing the cache with random clearing). It's a rough test, but it does show that there isn't anything obviously wrong with it at least.

So, the code it generates is brilliantly simple as well. Some good use cases for this:

  • Keep all your cache settings in one place
  • Statically typed and named! No more remembering manual casts or magic strings
  • Make your caching logic testable! It generates code against an easily mockable interface
  • Switch out your caching layer with ease.

Now, I'm only going to elaborate on the last point. "Why would I ever want to change out my caching layer!?"

Here's why. You built Bookface 1.0 and a few dozen users are on it. People start talking though and suddenly you have a few thousand (or more). Your page response times have crept up into the seconds range. Something must be done. After upgrading servers, and expanding some of the hardware, you find the bottleneck. Your web server's caches are being cleared too often. There isn't anything you can do though; the memory is maxed out as it is. So, obvious choice: use something like memcached for distributed caching on a dedicated server or two.

What makes switching to memcached or something like it so hard? It requires code changes! Luckily for you, you used CacheGen though. Why? All of your caching is in one place, and your interface to the caching method (CacheMechanism) is in one single simple class. It's trivial to implement a two-level cache between ASP.Net and memcached at this point and all of your code relying on your cache will just magically work without being changed.
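To give a feel for what that swap could look like, here is a rough sketch of a two-level cache behind a single interface. Everything here is assumed for illustration: ICacheMechanism is a hypothetical shape for the CacheMechanism abstraction (CacheGen's real one may differ), and DictionaryCache stands in for both the ASP.Net cache and a memcached client so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;

public interface ICacheMechanism
{
    object Get(string key);              // returns null when the key is missing
    void Set(string key, object value);
}

// Stand-in for a real backend (ASP.Net cache, memcached client, ...).
public class DictionaryCache : ICacheMechanism
{
    readonly Dictionary<string, object> items = new Dictionary<string, object>();

    public object Get(string key)
    {
        object value;
        return items.TryGetValue(key, out value) ? value : null;
    }

    public void Set(string key, object value) { items[key] = value; }
}

// Check the fast local cache first, then fall back to the slower shared one.
public class TwoLevelCache : ICacheMechanism
{
    readonly ICacheMechanism local;
    readonly ICacheMechanism shared;

    public TwoLevelCache(ICacheMechanism local, ICacheMechanism shared)
    {
        this.local = local;
        this.shared = shared;
    }

    public object Get(string key)
    {
        object value = local.Get(key);
        if (value == null)
        {
            value = shared.Get(key);
            if (value != null) local.Set(key, value); // promote to the local level
        }
        return value;
    }

    public void Set(string key, object value)
    {
        local.Set(key, value);
        shared.Set(key, value);
    }
}
```

Code that only ever talks to the interface doesn't care whether it was handed a DictionaryCache or a TwoLevelCache, which is the "just magically work" part.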

This is what I think makes CacheGen especially awesome. It manages your caching settings, makes everything statically typed, AND lets you have an almost unreal amount of flexibility.

It's not quite ready for primetime yet. I've proved that it should work; the thing to do now is clean up the API some and add some more unit testing to see if I can catch more bugs.

Anyway, I don't expect this process to take too long. I plan to tag an alpha release for this relatively soon (within the month).

Posted: 12/8/2012 7:21:17 AM

Making Caching Awesome Again

So, I've recently begun a new project. It's nowhere near stable, or even usable. But I think it's a good enough time to introduce it.

It's called CacheGen and it's hosted at bitbucket and BSD licensed.

Basically, it's a T4 template that generates really awesome helpers for caching. It's geared at ASP.Net, but my goal is to make it so it can be used in other ways as well. So, how does it work? You give it a list of specifications for something you want to cache. Let's say you have a string that's really expensive to generate, but doesn't belong in a database. To go on my theme of markdown being slow, let's call it MarkdownTransform.

You tell the T4 template to create a cache item for it like this:

var tmp=new CacheObject
{
  Name="MarkdownTransform",
  ValueType="string"
};
var cache=new CacheGen("Earlz.Example.MyCache");
cache.Items.Add(tmp);
Write(cache.ToString());

And then, you can use the MyCache class like this:

var text=MyCache.MarkdownTransform ?? (MyCache.MarkdownTransform=Markdown.Translate(foobar));

Much much easier. But, there is a flaw in this. If two requests hit an empty cache at the same time, both will run the translation, so we could consume double the server resources required for this. This wouldn't be a problem in this case, but imagine a very heavy SQL query or something. This isn't quite what we want when things are really expensive. So, let's use some other fancy syntax:

var text=MyCache.MarkdownTransformCache.LockGetOrLoad(()=>Markdown.Translate(foobar));

What this will do instead is raise a flag so that other requests for MarkdownTransform will block. During this time, it will execute the expensive code while the other threads sleep. This way, it only gets executed once. And, when it finally gets the results back, the cache will be loaded with it and the other threads will be able to access it. So, instead of computing the expensive thing multiple times, we just hold off other requests for a little while so the expensive thing is done only once.
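Since none of this is implemented yet, here is only a minimal sketch of the lock-first idea, using a plain lock. CacheSlot is a made-up name and not what CacheGen will necessarily generate; it holds a single value and ignores expiry entirely.

```csharp
using System;

public class CacheSlot<T> where T : class
{
    readonly object gate = new object();
    T value;

    // Returns the cached value, running load() only when the slot is empty.
    // Threads that arrive while load() is running wait on the lock, then
    // find the slot already filled.
    public T LockGetOrLoad(Func<T> load)
    {
        lock (gate)
        {
            if (value == null) value = load();
            return value;
        }
    }

    // Empty the slot so the next caller reloads it.
    public void Clear()
    {
        lock (gate) { value = null; }
    }
}
```

The trade-off is the one described above: every reader takes the lock, so a slow load() stalls all of them, but the expensive work happens once per fill instead of once per concurrent request.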

Sure, you can do this same thing with ASP.Net's cache, but how much code would it require for each time you did this? Hence why it's T4. Also, it's completely statically typed. No casting!

Now, what if you wanted to cache the markdown for multiple posts? Or just want to cache entire post objects? Well, I thought of this too:

You'd make the CacheObject like this in T4:

var tmp=new CacheObject
{
  Name="Transforms",
  KeyType="ObjectId",
  ValueType="BlogPostData"
};

Then, you could apply the same lock-first type behavior like so:

var post=MyCache.TransformsCache.LockGetOrLoad(objectid, ()=>LoadPost());

Or, if you prefer the (probably safer) possibly-execute-twice behavior:

var post=MyCache.Transforms[objectid] ?? (MyCache.Transforms[objectid] = LoadPost());

These are the ideas anyway. None of it is actually implemented yet, and a lot more research must be done to ensure that this is a sane way to go about it. But, you can expect to get something similar to this.

Posted: 12/6/2012 5:17:43 AM

MarkdownSharp is freaking slow!

So, I use MarkdownSharp on this website for translating my posts from markdown to HTML. During a profiling session, I found that MarkdownSharp is horribly HORRIBLY slow. It was the bottleneck of my entire website. On my personal machine, requests per second went from 325 (or ~22ms per request) to 720 (or ~9ms per request) when I cached the markdown translated text.

Personally, I find it pitiful that Markdown was more of a bottleneck than my database. Getting my post data from my database: ~5ms. Translating all of the text in the posts to HTML: ~15ms. What the hell is wrong with this!?

I'm sure Markdown isn't trivial to translate of course, and Mono generally sucks at string operations anyway. But still, I find it absolutely ridiculous that the only thing so far I've had to cache is markdown translations.

Anyway, this website is updated now and should be capable of handling about 100 requests per second. I suspect the primary reason I can't achieve anything higher than that is a combination of Apache having a lot of overhead and my VPS being rather weak anyway. But, this should be plenty for this low traffic blog.

Posted: 12/3/2012 4:36:54 AM

How to unit test T4 code generators

So, you're like me and have a 500+ line T4 template that is a steaming pile of... code. And of course, there's no syntax highlighting without addons, no IntelliSense, generally horrible Visual Studio support, and it's near impossible to unit test.

Well, my friends, I have just the thing for you! After beating my head against a wall for several days, I've found salvation!

To use my method requires some "neat" code, and it requires a few assumptions:

  • You're using T4 to generate C# code
  • You're then taking this C# code and including it in your project
  • You're ready to do some major refactoring for an awesome experience in the end (and your T4 template will be so much cleaner afterwards!)
  • You can have at least 2 files tied to T4 (one T4 template and one include file)

The cornerstone of making T4 testable is to separate out the "logic" and the "content", where the content is the generated C# code, and the logic is what you do to generate it.

To do this and enforce this separation cleanly, you must have two files. One of these files is the T4 "view", and another file is the logic, which is capable of being normally compiled outside of T4.

Example Untestable T4 Code

Let's start with a simple example. You have a simple T4 template which takes a file like so:

foo=bar
biz=baz

and turns it into

public class GeneratedClass
{
  public string foo = @"bar";
  public string biz = @"baz";
}

Here is a simple (and untestable) T4 template for it:

<#@ template language="C#v3.5" hostspecific="true"#>
<#@ assembly name="System.Core" #>
<#@ import namespace="System.IO" #>
<#@ import namespace="System" #>
<#@ import namespace="System.Linq" #>

using System;
namespace Earlz.SampleT4
{
    public class GeneratedClass
    {
<#
    string filename="fields.txt";

    string path=Path.GetDirectoryName(Host.TemplateFile);
    var f=File.OpenText(Path.Combine(path, filename));
    string text=f.ReadToEnd();
    text=text.Replace("\r", ""); //strip extra line endings (if needed)
    var lines=text.Split('\n');
    foreach(var line in lines)
    {
        if(line.Trim().Length==0) continue; //skip blank lines (e.g. after a trailing newline)
        var parts=line.Split('='); //split for each element
#>
        public string <#= parts[0] #> = @"<#= parts[1].Replace("\"", "\"\"") #>";
<#
    }
#>
    }
}

There are a few rumored methods of testing this T4 file:

  • Run an integration test manually comparing the code (very very brittle)
  • I can't think of anything else worth mentioning

The Great Refactor

Now, what I suggest:

Let's refactor this and make it so we can eliminate some logic out of our "view". In this case, the view should worry about getting the code into the generated file and that's it. The logic on the other hand should work at a level of abstraction.

Here's what we're going to do:

  • Eliminate most of this code from the view
  • Create a new file for the logic
  • Use a very clever trick so that our logic file will compile outside of T4 and work when included in T4
  • Make it so that instead of just outputting text, we're building an object model that happens to be easily translatable to text

Because I came prepared, I already have the object model abstraction built. You can catch it at the bottom of the last code snippet. (It's easy to rip out and put elsewhere.)

Compiling Code In and Out of T4

So, we create a new T4 logic file and name it something like "GenerateClassLogic.tt.cs". Now, you may be thinking "but you can't use the same code in T4 and a regular project!" WRONG!

I came across a very nifty trick. Behold:

//<#+
/*That line above is very carefully constructed to be awesome and make it so this works!*/
#if NOT_IN_T4
//Apparently T4 places classes into another class, making namespaces impossible
namespace MyNamespace.Foo.Bar
{
    using System;
    using System.Linq;
    using System.Text;
    using System.Collections.Generic;
    using System.IO;
#endif
//regular ol' C# classes and code...

#if NOT_IN_T4
} //end the namespace
#endif
//#>

The key parts are the first and last lines. These begin with a comment so that the C# compiler will ignore them outside of T4, but inside of T4 these instruct the transformer to include this as a "class feature" (which ends up being a nested class).

Note here also that you must define a compilation symbol (NOT_IN_T4 here). This is super easy to do. If you add it to your project, it won't carry over to the T4 template, which makes this an easy way to add in a few key things that can't be done without knowing whether we're executing within T4 or not.

So, now we have a T4 logic file that will compile inside and outside of T4. Perfect for unit testing! All we need now is some logic!

The Result

Here is what I came up with:

//<#+
/*That line above is very carefully constructed to be awesome and make it so this works!*/
#if NOT_IN_T4
//Apparently T4 places classes into another class, making namespaces impossible
namespace Earlz.SampleT4.Internal
{
    using System;
    using System.Linq;
    using System.Text;
    using System.Collections.Generic;
    using System.IO;
#endif
    //regular ol' C# classes and code...

    public class GenerateClassFromText : ClassGenerator
    {
        public GenerateClassFromText(string text)
        {
            Init(text);
        }
        public GenerateClassFromText(string templatefile, string filename)
        {
            string path=Path.GetDirectoryName(templatefile);
            var f=File.OpenText(Path.Combine(path, filename));
            Init(f.ReadToEnd());
        }
        public void Init(string text)
        {
            text=text.Replace("\r", ""); //strip extra line endings (if needed)
            var lines=text.Split('\n');
            foreach(var line in lines)
            {
                if(line.Trim().Length==0) continue; //skip blank lines (e.g. after a trailing newline)
                var parts=line.Split('='); //split for each element
                var field=new Field
                {
                    Accessibility="public",
                    Name=parts[0],
                    Type="string",
                    InitialValue=string.Format("@\"{0}\"",parts[1].Replace("\"", "\"\""))
                };
                Fields.Add(field);
            }
        }

    }

    //shove this all into one file so we don't force implementers to hand combine this or copy over more than 2 files
    public class ClassGenerator : CodeElement
    {
        virtual public List<Property> Properties
        {
            get;
            protected set; //a private accessor isn't allowed on a virtual property
        }
        virtual public List<Method> Methods
        {
            get;
            protected set;
        }
        virtual public List<Field> Fields
        {
            get;
            protected set;
        }
        virtual public string Namespace
        {
            get;set;
        }
        virtual public string OtherCode
        {
            get;set;
        }
        public virtual string BaseClass
        {
            get;set;
        }
        public ClassGenerator()
        {
            Properties=new List<Property>();
            Methods=new List<Method>();
            Fields=new List<Field>();
            Accessibility="";
        }
        public override string ToString ()
        {
            StringBuilder sb=new StringBuilder();
            sb.Append("namespace "+Namespace);
            sb.AppendLine("{");
            sb.AppendLine(PrefixDocs);
            sb.Append(GetTab(1)+Accessibility+" class "+Name);
            if(string.IsNullOrEmpty(BaseClass))
            {
                sb.AppendLine();
            }
            else
            {
                sb.AppendLine(": "+BaseClass);
            }
            sb.AppendLine(GetTab(1)+"{");
            foreach(var p in Properties)
            {
                sb.AppendLine(p.ToString());
            }
            foreach(var m in Methods)
            {
                sb.AppendLine(m.ToString());
            }
            foreach(var f in Fields)
            {
                sb.AppendLine(f.ToString());
            }
            sb.AppendLine(OtherCode);
            sb.AppendLine(GetTab(1)+"}");
            sb.AppendLine("}");
            return sb.ToString();
        }

    }
    abstract public class CodeElement
    {
        public const string Tab="    ";
        public string Name
        {
            get;
            set;
        }
        public string Accessibility
        {
            get;
            set;
        }
        string prefixdocs;
        virtual public string PrefixDocs
        {
            get
            {
                return prefixdocs;
            }
            set
            {
                prefixdocs=GetTab(2)+"///<summary>\n"+GetTab(2)+"///"+value+"\n"+GetTab(2)+"///</summary>";
            }
        }
        public override string ToString ()
        {
            throw new NotImplementedException();
        }
        public static string GetTab(int nest)
        {
            string tmp="";
            for(int i=0;i<nest;i++)
            {
                tmp+=Tab;
            }
            return tmp;
        }
        protected CodeElement()
        {
            Accessibility="";
            PrefixDocs="";
        }
    }
    public class Property : CodeElement
    {
        public string Type
        {
            get;set;
        }
        public string GetMethod
        {
            get;
            set;
        }
        public string SetMethod
        {
            get;
            set;
        }
        public override string ToString ()
        {
            string tmp=GetTab(2)+PrefixDocs+"\n";
            tmp+=GetTab(2)+CodeElement.Tab+Accessibility+" "+Type+" "+Name+"{\n";
            if(GetMethod!=null)
            {
                tmp+=GetTab(2)+GetMethod+"\n";
            }
            if(SetMethod!=null)
            {
                tmp+=GetTab(2)+SetMethod+"\n";
            }
            tmp+=GetTab(2)+"}\n";
            return tmp;
        }
        public Property()
        {
            GetMethod="get;";
            SetMethod="set;";
        }
    }
    public class Field : CodeElement
    {
        public string Type
        {
            get;
            set;
        }
        public string InitialValue
        {
            get;
            set;
        }
        public override string ToString ()
        {
            string tmp=GetTab(2)+PrefixDocs+"\n";
            tmp+=GetTab(2)+Accessibility+" " +Type+" " +Name;
            if(InitialValue!=null)
            {
                tmp+="="+InitialValue+";";
            }else{
                tmp+=";";
            }
            return tmp;
        }
    }
    public class Method : CodeElement
    {
        public string ReturnType
        {
            get;
            set;
        }
        public List<MethodParam> Params
        {
            get;set;
        }
        public string Body
        {
            get;set;
        }
        public Method()
        {
            Params=new List<MethodParam>();
            Body="";
            ReturnType="void";
        }
        public override string ToString ()
        {
            string tmp=GetTab(2)+PrefixDocs+"\n";
            tmp+=GetTab(2)+Accessibility+" "+ReturnType+" "+Name+"("; // append, so the PrefixDocs line above is kept
            for(int i=0;i<Params.Count;i++)
            {
                tmp+=Params[i].ToString();
                if(i==Params.Count-1)
                {
                    tmp+=")";
                }
                else
                {
                    tmp+=", ";
                }
            }
            if(Params.Count==0)
            {
                tmp+=")";
            }
            tmp+="\n"+GetTab(2)+"{\n";
            tmp+=Body;
            tmp+="\n"+GetTab(2)+"}";
            return tmp;
        }
    }
    public class MethodParam
    {
        public string Name{get;set;}
        public string Type{get;set;}
        public override string ToString ()
        {
            return Type+" "+Name;
        }
    }
#if NOT_IN_T4
} //end the namespace
#endif
//#>

Wow, so much cleaner! Plus, go modify it in Visual Studio. What's that? There's actually IntelliSense!? Yes! There is! It will also throw compiler errors when you screw stuff up.

I also trimmed the T4 view down significantly and, of course, put in an include statement for our logic file:

<#@ template language="C#v3.5" hostspecific="true"#>
<#@ assembly name="System.Core" #>
<#@ import namespace="System.IO" #>
<#@ import namespace="System.Text" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="System" #>
<#@ import namespace="System.Linq" #>

<#
    string filename="fields.txt";
    var gen=new GenerateClassFromText(Host.TemplateFile, filename);
    gen.Namespace="Earlz.SampleT4";
    gen.Name="GeneratedClass";
    gen.Accessibility="public";
#>

<#= gen.ToString() #>

<#@ include file="GenerateClassLogic.tt.cs" #>

Wow that's simple!

Now note: None of the stuff in your "view" is testable. You should always keep that in mind and keep it as simple as humanly possible.

If you'll notice, other than a bit of boilerplate XML documentation, the generated code is exactly the same. This is intentional :)

Unit Testing T4

What were we trying to do again? Ah yes. Tests. Let's add some NUnit tests for this!

And so, by simply adding a reference to our project in the NUnit test project, we instantly get access to the logic of the code generator. Here is the quick testing I did:

using System.Linq;      // for Any() and Single()
using NUnit.Framework;

[TestFixture]
public class CodeGeneratorTests
{
    string TestText=
@"Foo=Bar
Biz=Baz
TestQuotes=""foo bar""";

    [Test]
    public void EnsureFieldWritten()
    {

        var gen=new GenerateClassFromText(TestText);
        Assert.IsTrue(gen.Fields.Any(x=>x.Name=="Foo"));
        Assert.IsTrue(gen.Fields.Any(x=>x.Name=="Biz"));
    }
    [Test]
    public void EnsureFieldsPublic()
    {
        var gen=new GenerateClassFromText(TestText);
        var tmp=gen.Fields.Single(x=>x.Name=="Foo");
        Assert.AreEqual("public", tmp.Accessibility); // NUnit takes (expected, actual)
    }
    [Test]
    public void EnsureFieldsQuoted()
    {
        var gen=new GenerateClassFromText(TestText);
        var tmp=gen.Fields.Single(x=>x.Name=="TestQuotes");
        Assert.AreEqual(@"@""""""foo bar""""""", tmp.InitialValue); // NUnit takes (expected, actual)
    }
}
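The expected value in EnsureFieldsQuoted is easier to read once the quoting rule is spelled out. Here is a minimal sketch of that rule, inferred only from the test's expected string: the value is wrapped in a C# verbatim string literal and every embedded quote is doubled. LiteralHelper and ToVerbatimLiteral are hypothetical names for illustration, and the real generator may well do this differently.

```csharp
using System;

public static class LiteralHelper
{
    // Hypothetical helper (not part of the original generator):
    // wraps a raw value in a C# verbatim string literal, @"...",
    // doubling every embedded double quote as verbatim strings require.
    public static string ToVerbatimLiteral(string raw)
    {
        if (raw == null) throw new ArgumentNullException("raw");
        return "@\"" + raw.Replace("\"", "\"\"") + "\"";
    }
}
```

Under that assumption, a raw value of "foo bar" (with the quotes included) comes out as @"""foo bar""", which is the text the test compares against.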

Conclusion and Remarks

So, in conclusion, we've learned that T4 is actually capable of being tamed (when I first learned T4, I wouldn't have thought it possible either). The main thing to do is maintain a separation of "content" and logic. Now of course, there are some gotchas to watch for:

  • This only works when your T4 generator's target is to generate C# code
  • When you add a reference to something, you must add it to both the logic file and the view
  • It's still very difficult to reference external assemblies (particularly project assemblies). This doesn't solve that problem at all
  • You must define a compiler symbol for your project (NOT_IN_T4 in the logic file above) if you wish to run unit tests against it
  • When other people use your T4 template outside of your project (and thus don't need to test it), they must ensure that the logic file is not compiled outside of the T4
  • Unit testing that a piece of code is generated is very difficult and brittle (hence the need for abstractions where possible to make this easier)
  • My abstraction mechanisms for building objects aren't really that good. They're good enough for me, but please someone improve them! (if you do it I'll link to you from here)
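To make the compiler symbol gotchas concrete, here is a rough skeleton of a logic file that both T4 and the regular C# compiler can digest. It follows the same NOT_IN_T4 guard seen at the end of the logic file above. The Sample.Logic namespace and the Greeter class are placeholders made up for this sketch, and it assumes NOT_IN_T4 is defined only in the build settings of the project that compiles the file (never when T4 processes it).

```csharp
//<#+
// Skeleton of a dual-use logic file (a sketch, not the original file).
// T4 treats everything between the class feature markers as class members,
// so the namespace wrapper has to disappear when T4 includes the file.
// Assumption: NOT_IN_T4 is defined in the compiling project's build settings.
#if NOT_IN_T4
namespace Sample.Logic // placeholder namespace
{
#endif
    public class Greeter // placeholder class standing in for the generator logic
    {
        public string Greet(string name)
        {
            return "Hello, " + name;
        }
    }
#if NOT_IN_T4
} //end the namespace
#endif
//#>
```

Anyone who only consumes the template simply leaves the symbol undefined and excludes the file from compilation, which is what the fifth gotcha is getting at.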

Happy code generation!

Posted: 11/21/2012 3:46:10 AM

An analysis of the history of programming paradigms

Hi, so, when did functional programming become such a huge thing that every language implements it? What led to its popularity? And I'm sure some of you may be wondering: now that we have functional programming in mainstream languages, what's next? Well, I'm going to attempt an educated guess at that. But first, we need a history lesson on the different programming paradigms, and why they came to be implemented in the popular languages that businesses use each day.

As usual, hardly anything in computer science is "new". There was a LOT of experimentation in the 60s and 70s with different programming languages and thus different paradigms. I'd even argue that everything language-related was tried at least once during that period. I'm not going to cover that here, though; instead I'm going to cover when each paradigm was introduced to mainstream languages.

Also, one last thing: This is mostly educated guesses. I have no proof to back me up. It's about like answering "why did Pokemon become popular"... No proof exists, but we can make guesses at why.

The First Mainstream Language

First, we have procedural programming. This was especially well marked by the creation and rise of C. I believe it became popular because you could write code which translated very closely to assembly language. Resources were scarce, but writing directly in assembly language had begun to be impractical. Now, you may ask, why didn't other paradigms become popular at this point?

I'll list the problems with each paradigm:

Functional programming was there with Lisp and friends. However, garbage collection is practically required and never comes free. With resources being scarce, this was not the best way to go. Also, at this point most people knew assembly language and were still familiar with low level details like punch cards. Putting a huge amount of abstraction on top of that meant it would be quite hard to learn.

Stack-oriented programming was there with Forth and friends. This didn't require garbage collection, but was still a huge layer of abstraction on top of the actual instruction set. That didn't necessarily make it slower, though. My best guess is that it was simply harder for assembly-skilled programmers to adapt to.

OK, so procedural programming was the most natural step forward from assembly coding, because assembly basically is procedural programming. C introduced many things, though. The biggest is that it made cross-compiling feasible, and the language was fairly simple, which made writing new compilers easy. Looking into all this in detail, though, really makes me wonder why Forth didn't win out against C as the most popular mainstream language.

Object Oriented Programming

Next on the list of big paradigm shifts: object oriented programming. This of course existed long before it went mainstream. The turning point where it really became popular was the rise of the GUI. Objects are a natural fit for GUI elements. My guess for why it didn't go mainstream sooner is that it made compilers more complicated: they had to worry about more than just doing a single pass over the code and calling it good. Eventually, compilers caught up, though, and C++ replaced C in the main programming language spot. The idea of object oriented programming really got kicked into the mainstream with the popularity of Java. You can see some more history about object oriented programming at this very helpful Programmers.Stackexchange answer.

Garbage Collection

So, what's next? Probably the biggest one is garbage collection. Memory (and good collection algorithms) was finally cheap enough to let programmers forget about managing memory. Of course, this existed long before it went mainstream, but most programmers considered it slow and wasteful (which it arguably was at the time). I think the big reason garbage collection went so mainstream is that we finally reached a tipping point where computing time (and resources) was cheaper than programming time.

Generics

Generics would probably be next: statically typing an object which can take more than a single type. C++ was the first of the big mainstream languages to get it (as templates), though Ada has had generics since it was first designed in the 70s. I'm not sure whether it became "mainstream" before it hit Java and .Net or not. It arguably has been popular for some time now. I believe the primary reason generics became so mainstream is that object oriented programming became mainstream. Doing OOP in a statically-typed way is quite cumbersome without generics. People were beginning to realize that duplicating code and using tons of explicit casts was really a bad practice.
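To show what "tons of explicit casts" looks like in practice, here is a small sketch contrasting a pre-generics collection with a generic one in C#. GenericsDemo and its two methods are made-up names for illustration only.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class GenericsDemo
{
    // Pre-generics style: ArrayList stores everything as object, so each
    // read needs a cast, and a wrong element type only fails at run time.
    public static int SumUntyped(ArrayList items)
    {
        int total = 0;
        foreach (object item in items)
        {
            total += (int)item; // throws InvalidCastException for a non-int element
        }
        return total;
    }

    // Generic style: List<int> is checked at compile time, no casts needed.
    public static int SumTyped(List<int> items)
    {
        int total = 0;
        foreach (int item in items)
        {
            total += item;
        }
        return total;
    }
}
```

Both methods do the same job, but only the first one can blow up at run time because somebody slipped a string into the list.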

Functional Programming

Next up is everyone's recent favorite: functional programming. I actually saw this trend develop (and was a programmer at the time). Functional programming really seemed to hit the mainstream with .Net support, though JavaScript has had first-class functions since 1995. JavaScript wasn't really widely used, though, until at least the early 2000s, with the advent of modern browsers and more adherence to standards (and the beginning of intense hatred for IE 6). So, I wouldn't really consider functional programming to have become mainstream until everyone's favorite languages started adding functional aspects. The primary driving reason for functional programming is that everyone's new PC suddenly started coming with a dual-core processor, and concurrent programming became something everyone was concerned with. Functional programming is a natural fit for concurrent tasks: purely functional code has no shared mutable state, so it is just as content working in parallel. This has also lifted the popularity of many languages. Haskell is beginning to be considered "not just a research language". F# is actually used in some production products, and Scala appears to be where all of the modern JVM programmers are at. JavaScript is now seeing a huge amount of use, and writing it "properly" naturally requires a functional style.
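Here is a tiny sketch of why stateless code parallelizes so easily, using PLINQ. It assumes .NET 4 or later (where AsParallel is available), and ParallelDemo, Square and SumOfSquares are names invented for this example.

```csharp
using System.Linq;

public static class ParallelDemo
{
    // A pure function: the result depends only on the argument,
    // and nothing outside the function is modified.
    public static long Square(int n)
    {
        return (long)n * n;
    }

    // Because Square touches no shared state, the work can be split across
    // cores with AsParallel() and no locks; the sum is the same either way.
    public static long SumOfSquares(int count)
    {
        return Enumerable.Range(1, count)
                         .AsParallel()
                         .Select(n => Square(n))
                         .Sum();
    }
}
```

If Square wrote to a shared counter instead, the same one-line change to AsParallel() would need locking to stay correct, which is exactly the pain functional style avoids.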

So, what's next? This is only a guess, but this is what I think the big paradigm of the next decade will be:

Next?: Metaprogramming

I'm of course no stranger to this. I use T4 in a lot of my projects. It's a way to take tedious code and turn it into something that just works, and that wouldn't be possible by other means. Another example is all of the dependency injection frameworks out there now. That's really just a step away from metaprogramming: writing programs which write themselves. And of course, reflection in .Net (and Java?) is commonplace already. Still, there aren't many mainstream languages at this point which make metaprogramming particularly easy, though we're already seeing a rise in this with mainstream languages like Ruby and Python. Where I think metaprogramming really shines, though, is in statically-typed languages... where there isn't a lot of easy-to-use support other than some fairly basic APIs. T4 of course is an exception (and my favorite one), but even T4 has definitely not made it to mainstream usage.
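As a taste of the "fairly basic APIs" end of the spectrum, here is a small reflection sketch in C#: it discovers an object's public properties at run time instead of having the listing written by hand. SamplePoint, ReflectionDemo and Describe are made-up names, and this is only an illustration of the idea, not a recommendation over T4.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class SamplePoint // hypothetical sample type
{
    public int X { get; set; }
    public int Y { get; set; }
}

public static class ReflectionDemo
{
    // Builds "Name=value" pairs for every public instance property,
    // discovered through reflection rather than hand-written per type.
    public static string Describe(object obj)
    {
        PropertyInfo[] props = obj.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance);
        var parts = props
            .OrderBy(p => p.Name, StringComparer.Ordinal) // stable ordering by name
            .Select(p => p.Name + "=" + p.GetValue(obj, null));
        return string.Join(", ", parts.ToArray());
    }
}
```

Reflection like this pays its cost at run time on every call, whereas a code generator such as T4 would emit the equivalent hand-written method once, at build time.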

So, why isn't metaprogramming already all the rage? I think the big reason is compiler complexity. It can be an enormously difficult thing to implement an interpreter within a compiler. Other than this, though, I don't have a good reason for why it isn't already the rage, and I truly think it's just a matter of time. All of the problems it used to have, such as code bloat and memory issues, really don't matter a whole lot now.

Posted: 11/4/2012 6:49:53 AM