Showing posts with label c#. Show all posts

Tuesday, 3 November 2009

Regexes from the perspective of a noob. Also autumn images.



Pictures this time are from a visit last week to Anglesey Abbey. I've no idea what the round-leaved shrub is called but it is lovely, isn't it?
The purple berries likewise. I did buy one of these a couple of years ago but it failed so comprehensively that I can't even remember where I planted it. The last photo is a Japanese Maple. But you knew that.

There comes a time, for any programmer, when they have to get their hands dirty with regular expressions. Yuk. They look like maths but are not really, as the rules are fairly random. This cannot be helped, as the system grew organically and is used by too many languages and too much legacy code to change.

One thing I dislike very much and will change in my own code is seeing a regex like this:

string matchString = @"^(?\\d{4}-\\d{2}-\\d{2})\\s*(?



What on earth does that mean? (Actually I had to break this line up so it is readable. In my code editor it is all on one line but of course that wraps and "pre" tags don't.) You look in tutorial books and on MSDN and other places and all the examples seem to be like this. Do you know, you are allowed to approach this like a normal programmer, not a machine, and break it all up into smaller chunks, each of which has a sensible name and a comment?


string dateMatch = @"(?\d{4}-\d{2}-\d{2})"; // 0000-00-00 pattern
string timeMatch = @"(?

Then add them all together:

string matchString = @"^" + dateMatch + @"\s*" + timeMatch + findExit + exitCodeMatch; 

Isn't that so much nicer? I for one certainly feel that regexes are slightly less likely to come back and haunt me later if I compose them in this way. But as I said in the title, I am still a noob in this area.


By the way, the @ signs are peculiar to C# (they mark a verbatim string) and mean yes, I do want those backslashes. The code won't compile without them.
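To show the whole idea in one place, here is a minimal sketch of a regex composed from named, commented fragments. The group names "date" and "time" and the 00:00:00 time pattern are my assumptions for illustration (the names in the listing above did not survive), and LogLineRegexSketch and ExtractDate are hypothetical names, not part of the exercise.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogLineRegexSketch
{
    // Each fragment gets a sensible name and a comment.
    // The group names "date" and "time" are assumed for this sketch.
    const string DateMatch = @"(?<date>\d{4}-\d{2}-\d{2})"; // 0000-00-00 pattern
    const string TimeMatch = @"(?<time>\d{2}:\d{2}:\d{2})"; // 00:00:00 pattern (assumed)

    // Then add them all together.
    public const string MatchString = @"^" + DateMatch + @"\s*" + TimeMatch;

    // Returns the text captured by the "date" group, or null if the line does not match.
    public static string ExtractDate(string line)
    {
        Match m = Regex.Match(line, MatchString);
        return m.Success ? m.Groups["date"].Value : null;
    }
}
```

Calling LogLineRegexSketch.ExtractDate("2009-11-03 14:05:59 finished") should hand back "2009-11-03". Without the @, each fragment would need doubled backslashes: @"\d" and "\\d" are the same string.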

The more eagle-eyed of you may notice that this example comes from an exercise in the MCTS Microsoft .Net Framework - Application Development Foundation book. Don't worry, I composed the regexes myself and there may well have been a better way of doing it.

Wednesday, 14 October 2009

Models, Views and Controllers

Pork Scratchings
Another nerdy post, I'm sorry. To make up for that, and because I was glued to every one of Jamie Oliver's adventures in the US, I have a gluttonous photo for you. I bought a bag of pigskin from the Farmers' Market on Saturday and made my very own Porky Scratchings from it. I used the recipe from Jamie's America, which involves stirring cayenne, cinnamon, seasalt, honey and garlic into the cooked scratchings and putting them back in the oven for 8 minutes. This was gorgeous but a little too much like Nigella Lawson's Union Square Café bar nuts from Nigella Bites (same topping but with brown sugar and rosemary). Next time I'll try just fennel, garlic and seasalt.

Well, for some time now, I've had the first chapter of a book on ASP.NET MVC on my desktop. It's about a site called NerdDinner. For some reason, I decided to actually take a look at it and started to work through the example. And I instantly fell in love! I've known I want to program in C# for several years now, and that I feel comfortable in Visual Studio. I am also aware that there are things called "Content Management Systems", lots of them, and only one of those written in .Net. What I wanted from a CMS would be a way of managing content (of course), so the customer can change what is on the page, keeping it up to date. I would therefore need a membership system. I am aware that it is important to keep data storage well away from the page itself, so that database tables and the classes that represent them can be looked after. The problem with what we can now call the "traditional" ASP.NET structure, with code-behind pages and databound gridviews, is that it is easy to get into an almighty mess, which at some point needs to be sorted out. Then it looks like way too big a task.

I looked at various "Content Management Systems". Why are they called that anyway? Why not "Website Management Systems" or something else? Wordpress is supposed to be easy but I didn't get on with it, as it is pretty rigid. Fair enough, it was designed for blogging anyway. I investigated Drupal. Huge learning curve and written (like many CMSs) in PHP. No thanks. I looked at DotNetNuke. Two problems: it dumps a big load of mysterious files and language in your website, and seems to require a whole database all to itself. Third problem, I could in no way get this to work on my Vista PC with SQL Server, with or without IIS. Forget it. This was disappointing, as I really would prefer to stick with Visual Studio.

I tried Django late last spring, using various books and the help on the Django site. Python is a language that certainly appeals, it is clean, nerdy in a good way and does not carry all the bloat (written in various styles) that PHP does. To my frustration, I could not get beyond the most trivial examples. Neither could I work out where to keep my CSS, images and Javascript files. The writers of all these books seemed to regard the matter as unimportant. Examples were written with inline CSS (yuk!).

So, ASP.NET MVC. It doesn't call itself a "Content Management System", any more than .Net itself does. It's a Framework. That sounds much more open. As long as you have C# 3.0 (check) and some kind of MS SQL database (apparently Web Developer Express 2008 will do) you're in.

What appeals to me is that so many other modern buzzwords and ideas seem to be a part of this. They have copied ideas from Ruby on Rails (I had tried that as well and found it difficult, now I would be back in my comfort zone). It was based on the old-fashioned but newly discovered pattern "Model, View, Controller". This separates the pretty web pages from the content in the database and is all run with a set of "controllers". There is a huge emphasis on testability, even designing with behaviour and testability in mind from the start. You are encouraged to import and use open-source test tools such as NUnit and Moq (I'm learning). The great jQuery library is part of the framework.

As for the parts I already know well, CSS can go anywhere the views can find it (typically in a "content" folder). Web pages are built up using a master page and come out with nice clean html. Any javascript or id attributes are what I put there myself and not generated out of monolithic .Net widgets.

Unless I want them of course. MVC is not the dark side of the .Net. You can (fairly) easily add in bits of web form-type ASP.Net when you want to.

I'm so happy, I'm getting incoherent. Once I've worked through the superb Steven Sanderson book Pro ASP.NET MVC Framework, I'm going right back to my portfolio and rewriting everything in MVC. It's light, it's programmer-like and I love it.

Tuesday, 29 September 2009

Many threads make light work



Today is the final episode in my threading story. Next time, I'll talk about bread, and possibly Wicken Fen as well. The image on the left is of my coffee cup in front of its cafetière. The strange felt object is a coffee cosy, kindly made by my sister. Bet you don't have one of those!

Previous episode

As I hinted before, the separation of incoming stream from the list of words created, nice though it is as far as Object Orientation is concerned, makes even more sense if you are building up words from several streams and adding them to the same wordlist. Like trying to find out how many times the word "thriller" came up just after Michael Jackson's death (I'm trying to think of a one-word example!).

So, this time we'll start with an array of readers. This one has three. I amended the assignReaderContent function to give each one two different paragraphs from The Star by Arthur C Clarke (the first 6, in fact). I then created a single WordList and a list of WordListBuilder and a list of Threads. All Threads (there will be three) must share the same WordList. The function setupBuilderList is passed all three lists plus the WordList object, and associates them all together:

    static void setupBuilderList(List<WordListBuilder> bList,ICharReader[] rArray, List<Thread>threadList, WordList wList)
    {
        for (int i = 0; i < rArray.Length; i++)
        {
            bList.Add(new WordListBuilder(rArray[i], wList));
            WordListBuilder.buildWordListStarterOp listStarter = bList[i].buildWordListStarter;
            threadList.Add(new Thread(new ParameterizedThreadStart(listStarter)));
        }
    }
There is a lot going on here, and quite a few new parts of .NET and the classes I have created to explain. Of course, at this point there are no Threads or WordListBuilders, so we have to make one builder to enclose each reader and associate each with a thread. We create a new builder for each reader, as before, passing the reader and the wordlist to the constructor. Then we declare a delegate, called buildWordListStarterOp, for the builder in question to communicate with the rest of the operation. The starter op in question is a member function called buildWordListStarter, which has to be passed an object (as does the delegate):

        public void buildWordListStarter(object state)
        {
            if (state is BuildWordListParams)
            {
                BuildWordListParams oP = (BuildWordListParams)state;
                oP.myWordListBuilder.buildWordList();
            }
        }

        public delegate void buildWordListStarterOp(object state);

In the setup function, a ParameterizedThreadStart object is required, so we can give the thread something to work with. The parameter has to be of type System.Object. The buildWordListStarter member function expects to be handed an object of the class BuildWordListParams:

class BuildWordListParams
    {
        public WordListBuilder myWordListBuilder;
        public BuildWordListParams(WordListBuilder a)
        {
            myWordListBuilder = a;
        }
    }

Of course, since we are only passing a builder, and a builder is already a System.Object, it is not necessary to go to the trouble of declaring BuildWordListParams. On the other hand, it is now there, and if I wanted to pass something else, for example an instruction to ignore common words such as "the" and "it", I can just add it to this declaration and it remains just one object.

This class simply passes on an object of type WordListBuilder. Assuming all goes well (and I confess, we don't have any exception handling here) this means you can take the BuildWordListParams object and just call the buildWordList member function of the builder inside it, the same one we have been using all along.

In setupBuilderList, this buildWordListStarterOp is called listStarter and is then passed to the new Thread in its constructor. It is perhaps not necessary to delve into the workings of threads in general. I take the view that the .NET team write Thread libraries so I don't have to.
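Stripped of the word lists, the pattern of handing a thread one parameter object looks roughly like this. It is only a sketch: WorkParams, ThreadStartSketch, Worker and RunOne are hypothetical names standing in for BuildWordListParams and friends, and the "work" is just measuring a string.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for BuildWordListParams: one object carrying
// everything the thread needs, plus a slot for the result.
public class WorkParams
{
    public string Name;
    public int Result;
    public WorkParams(string name) { Name = name; }
}

public static class ThreadStartSketch
{
    // Matches ParameterizedThreadStart: returns void, takes a single object.
    static void Worker(object state)
    {
        WorkParams p = state as WorkParams;
        if (p != null)
        {
            p.Result = p.Name.Length; // the "work" for this sketch
        }
    }

    // Starts one thread, hands it the parameter object and waits for it to finish.
    public static int RunOne(string name)
    {
        WorkParams p = new WorkParams(name);
        Thread t = new Thread(new ParameterizedThreadStart(Worker));
        t.Start(p); // the object passed to Start arrives as "state" in Worker
        t.Join();
        return p.Result;
    }
}
```

ThreadStartSketch.RunOne("thriller") returns 8. Adding another field to WorkParams is all it takes to pass more information, which is the argument for keeping the wrapper class.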

We have a list of builders and a list of threads. We now set up the timer with just the same function as last time.

The last static function (startThreads) is to start each thread off, giving it the correct parameter as it goes:

    static void startThreads(List<Thread> tList, List<WordListBuilder> bList)
    {
        int i = 0;
        foreach (Thread t in tList)
        {
            BuildWordListParams p = new BuildWordListParams(bList[i]);
            t.Start(p);
            i++;
        }
    }

Each Thread is passed the corresponding WordListBuilder.

One more, rather important, point. If you have three threads able to access a resource, such as a list of words, at the same time, you may well end up with rubbish results. The content of both member functions of WordList, addNewWord and makeReportCalledBack, is enclosed in a lock statement. This ensures only one thread (we hope, containing a builder) will access it at any one time. It will then be released for another thread to use.
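Here is a small sketch of what the lock buys you, assuming a private lock object and a Dictionary in place of my list of Word objects (SharedCounts and LockSketch are hypothetical names, and the lambda needs C# 3.0). Several threads add the same word and the total still comes out right.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class SharedCounts
{
    private readonly object _sync = new object(); // the object every thread locks on
    private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();

    public void Add(string word)
    {
        lock (_sync) // only one thread at a time gets past this line
        {
            int n;
            _counts.TryGetValue(word, out n); // n stays 0 if the word is new
            _counts[word] = n + 1;
        }
    }

    public int CountOf(string word)
    {
        lock (_sync)
        {
            int n;
            return _counts.TryGetValue(word, out n) ? n : 0;
        }
    }
}

public static class LockSketch
{
    // Starts several threads that all add the same word, then returns the total.
    public static int Run(int threadCount, int addsPerThread)
    {
        SharedCounts counts = new SharedCounts();
        List<Thread> threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            Thread worker = new Thread(() =>
            {
                for (int j = 0; j < addsPerThread; j++) counts.Add("that");
            });
            threads.Add(worker);
            worker.Start();
        }
        foreach (Thread t in threads) t.Join(); // wait for every worker
        return counts.CountOf("that");
    }
}
```

LockSketch.Run(3, 1000) returns 3000. Take the lock statements out and the total is no longer guaranteed, because Dictionary is not safe for several writers at once.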

Here are some screenshots:

Here, you can see that the main thread is number 9, and that numbers 10-12 have been assigned to the different builders.

Here, the main thread has finished, all builders have been assigned their char readers and all threads have been started.


We are getting the first report from thread 6, which is the Timer/WordList.

At this point most or all of the content has been read. I counted up instances of the word "that" in my 6 paragraphs and there are, indeed 10.

Here, two of the threads have hit the EndOfStreamException and have been closed.
What I do find odd is that all reports cease at this point. The previous version kept making the same report over and over again. I do not understand this, and suspect I have introduced a bug.

At some point I will try to get all of my files up here in a zip, so you can have them, if you want.

Friday, 25 September 2009

Timing and threads

Jackie's carrots
Yesterday was market day and I forgot to take my camera with me. Sorry. But I will include an image from our market that, like Biddy Baxter, I took earlier. This one is from Jackie's famous fruit and veg stall. She's a real quick thinker, so no dithering!

Previous Episode

I said I'd get on to processing inputs from different sources. This is two problems really: one is, there might well be delays in transmission. In this case, it would be better to make regular reports (every 8 secs?) so you don't think your computer has died. You need a timer for this. The other is that your various sources have to be able to add words and counts to the same list without treading on each other and making a mess of the count. This suggests threading to me. Luckily, both timers and threads come from the same library in C#, as they've already foreseen you might need both for some tasks. Whatever language you use, you will most likely need a special library for threads, because they go deeper than I want to into the workings of the machine in question. I am NOT a back-end developer (oooh missus!).

Thing is, I had to learn this as I went along, as I haven't even thought about simultaneous processes since Uni (and I'm too old to use that abbreviation, besides it was a Poly then). I turned to the one of my C# books I have only recently started reading again: Pro C# with .NET 3.0 by Andrew Troelsen. I usually refer to the more webby Pro ASP.NET 3.5 in C# 2008 by Matthew MacDonald and Mario Szpuszta. Both are Apress books.

So, I now have a slightly different char reader called SlowedDownCharReader. This one uses a private random number to produce slight delays in transmission, which is perhaps a bit more like real life:

      Thread.Sleep(this._random.Next(200));

happens at the beginning of the fetchNextChar method. Random.Next(200) returns a whole number from 0 to 199 and Thread.Sleep takes milliseconds, so each delay is anything up to a fifth of a second. This soon mounts up when you only get one char at a time! Note that the word Thread is already coming up. Random delays and timers are all things that affect the current thread of execution. Using these classes requires a using directive for System.Threading at the head of all relevant files.

Back in the main function, I've written a static setupTimer function. This is where it gets tricky. Since the operation we want to happen every 8 seconds is the WordList method makeReport, we need to create a C# delegate for the Timer to call. This has to be a void function passed a System.Object parameter:

    public delegate void makeReportOp(object state);

Here is the whole setupTimer function:

    static void setupTimer(WordList list, int secsToStart,int intervalSecs)
    {
        int millisecsToStart = secsToStart * 1000;
        int millisecsInterval = intervalSecs * 1000;
        WordList.makeReportOp reporter = new WordList.makeReportOp(list.makeReportCalledBack);
        TimerCallback timeCB = new TimerCallback(reporter);
        Timer everySoOften = new Timer(timeCB, null, millisecsToStart, millisecsInterval);
    }

First of all, the time before the Timer is called and the interval time are passed as seconds and then multiplied by 1000. No one thinks in milliseconds! Now the delegate is created and passed the WordList method to call. I have created a new report method, which sorts the list and writes it to console just as before but writes a short message before and after, so you can see what is going on. The method is locked, so that only one thread can access it (we are envisaging only one wordlist at a time). This message contains the thread number as well, because it is different from the main thread of the program. A TimerCallback delegate is created and passed the reporter delegate. Once that is done, all that is needed is to create a Timer object and pass it the callback, so it knows what to do. It is the Timer which works in its own thread of execution.

This all leads to a strange effect. Every 8 seconds (the interval I chose) a report is made to the console. These get progressively longer as more words get added. At some point, the fetchNextChar will generate its exception, print out a message to the console and stop the main thread. Nothing stops the timer except switching off the console window, so it will carry on printing the same list. I'm told killing threads is a difficult business. But this test case was imagining a continuous process of reporting on different streams coming in, so we won't worry about that.
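For what it is worth, a System.Threading.Timer can be stopped without killing anything: calling Dispose on it is enough. The documentation also says to keep a reference to the Timer for as long as you are using it, because one with no references left can be garbage collected. A minimal sketch under those assumptions follows; TimerSketch, Report and RunFor are hypothetical names, and counting ticks stands in for printing a report.

```csharp
using System;
using System.Threading;

public static class TimerSketch
{
    static int _ticks;

    // Matches TimerCallback: returns void, takes a single object.
    // It runs on a thread-pool thread, not on the main thread.
    static void Report(object state)
    {
        Interlocked.Increment(ref _ticks);
    }

    // Runs a timer for roughly totalMs milliseconds and returns how often it fired.
    public static int RunFor(int intervalMs, int totalMs)
    {
        _ticks = 0;
        // The using block keeps a live reference to the timer while it is needed
        // and calls Dispose at the end, which stops further callbacks from being queued.
        using (Timer everySoOften = new Timer(new TimerCallback(Report), null, 0, intervalMs))
        {
            Thread.Sleep(totalMs);
        }
        return _ticks;
    }
}
```

With a start delay of 0 the first callback is due straight away, so TimerSketch.RunFor(50, 300) should return at least 1. The exact number depends on timing.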

Next time we'll set up 3-4 threads all putting their words into the same list.

Next Episode

Wednesday, 23 September 2009

Chars, words and sorted lists


Pictures today are autumn things in the garden. First is some pink Sedum, looking much like pink cauliflower (you can get pink varieties). The second is bright orange Pyracantha berries before the blackbirds scoff them all.

I've recently been working on a little task which is to run on the console. Not a website at all, much simpler. I found myself starting up my Visual Studio and setting up a project for the first time. Even, rather later, gathering my nice new classes into one place and making them into a library. Hot stuff! I should certainly create libraries more often.

The task was to create an instance of a character reader and assign it a chunk of text as a string. This can then be accessed, one char at a time. In a friendly way it lets you know you have got to the end by throwing an EndOfStreamException. The output wanted was a list of words in the string sorted by number of occurrences and then alphabetically. After that it got more complicated but that will do for the next time.

I got my text from the first 3 paras of a classic short story "The Star" by Arthur C Clarke. Thank you Wikipedia.

I decided to create a class wrapper for a list of words called WordList. Also a WordListBuilder to process each character as it came and wrap things up at the end. This character reader class is called RealTimeCharReader, there will be another type later.
So:

    ICharReader myCharReader = new RealTimeCharReader();
    assignReaderContent(myCharReader);
    WordListBuilder myWordListBuilder = new WordListBuilder(myCharReader, 
         new WordList());

This means I want an instance of ICharReader, which implements IDisposable, so has to have a Dispose() method, to make sure it cleans up. It also has to have a method fetchNextChar() which delivers a character, moves the pointer along and throws an exception when it is finished.
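A sketch of what such a reader might look like, as far as the description above goes. The interface follows that description, but StringCharReader and CharReaderSketch are hypothetical: this version takes its text in the constructor, whereas my real reader is given its content by assignReaderContent.

```csharp
using System;
using System.IO;

// The contract described above: one char at a time, plus Dispose for cleaning up.
public interface ICharReader : IDisposable
{
    char fetchNextChar();
}

// Hypothetical string-backed reader, for illustration only.
public class StringCharReader : ICharReader
{
    private string _content;
    private int _position;

    public StringCharReader(string content)
    {
        _content = content ?? string.Empty;
    }

    // Delivers the next char, moves the pointer along and
    // throws EndOfStreamException once the text is used up.
    public char fetchNextChar()
    {
        if (_position >= _content.Length)
        {
            throw new EndOfStreamException();
        }
        return _content[_position++];
    }

    public void Dispose()
    {
        _content = string.Empty;
        _position = 0;
    }
}

public static class CharReaderSketch
{
    // Reads until the reader signals the end, and always disposes it.
    public static string ReadAll(ICharReader reader)
    {
        string result = string.Empty;
        try
        {
            while (true) result += reader.fetchNextChar();
        }
        catch (EndOfStreamException)
        {
            // the expected way out of the loop
        }
        finally
        {
            reader.Dispose();
        }
        return result;
    }
}
```

CharReaderSketch.ReadAll(new StringCharReader("abc")) gives back "abc". Using an exception as the normal way of ending a loop is unusual, but that is the contract the task specified.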

assignReaderContent is a static function allowing me to pass in whatever string I feel like in the Main() function.

WordListBuilder is going to have to take each char and either add it to a new word, or finish the word, eat any nonletter characters and start a new word. Each word is then added to the WordList. A Word is another new class containing a string and a counter. If a word turns up more than once, you can just increment the counter. Both the char reader and the wordlist are passed to the WordListBuilder to do the real work. Here is the call:

    myWordListBuilder.buildWordList();
    myWordListBuilder.BuilderWordList.makeReport();

buildWordList will process the list, makeReport will sort it and print the results to the console.

buildWordList consists of a try/catch structure containing a forever while loop which takes each returned char and decides what to do with it. The catch intercepts the EndOfStreamException and exits the while loop. Any other runtime exception in the reader gets caught by a more general catch. Either way the finally clause means Dispose is always invoked.

    public void buildWordList()
    {
        char nextChar;
        string nextWord = string.Empty;
                
        try{
            while (true){            
                nextChar = BuilderReader.fetchNextChar();
                if(char.IsLetter(nextChar)){
                    nextWord += nextChar;
                }
                else{
                   // treat all punctuation, digits and whitespace the same
                    if(nextWord.Length > 0){
                        completeWord(nextWord);
                        nextWord = string.Empty;                            
                    }
                }
            }
         }
             
         catch(EndOfStreamException ex){
             // add last word - even if it is partial - discussion point?
            completeWord(nextWord);
            Console.WriteLine(ex.Message);
         }
         catch(Exception ex){
             Console.WriteLine(ex.Message);
         }
         finally{
             BuilderReader.Dispose();
         }
    }

    private void completeWord(string s){
        if(s.Length > 0){
            Word w = new Word(s);
            BuilderWordList.addNewWord(w);
         }
     }

The local variable nextWord is just a string, set to empty string. Since you have to add all words, including the last one - even if it has only partially arrived before transmission ceased, I added a short method completeWord, to be called whenever a non-letter is read or an EndOfStreamException is hit. This calls the WordList method addNewWord if there is anything in the string.


    public void addNewWord(Word w)
    {  
        // check word exists
        bool foundMatch = false;
        foreach (Word listWord in MyList){
             // if yes, then just increment count
             if (w.Content == listWord.Content){
                  foundMatch = true;
                  listWord.incrementCount();
             }
        }
        if (!foundMatch){
             MyList.Add(w);
        }
    }

Again, a simple method (as I prefer it!). It simply iterates through the list of Word objects in the member variable MyList. If it finds a match it sets the boolean foundMatch to true and adds one to the count (or I could have used Count++). If it doesn't, then a new word is added to the list.

So now we have a list which contains a max of one entry for each word but it is not in order. I needed to create a class implementing IComparer and pass it to the sort function for the list of words. An IComparer class needs to implement the function Compare. In this case we have to compare counts and then alphabetical order. It can be done in one pass, which I think makes life a lot easier!

    public class WordComparer : IComparer<Word>
    {
        public int Compare(Word x, Word y)
        {
            if (x.Count > y.Count){
                return -1;
            }
            else if (x.Count < y.Count){
                return 1;
            }
            else{
            // they're the same word count, so sort alphabetically
                return string.Compare(x.Content, y.Content);
            }
        }
    }

The only unexpected part of this was that I first assigned 1 as the result if x.Count was the greater. This gave me a reverse sorted list! So I swapped the values round. Notice for the alphabetical order I just use string.Compare, no further specification needed.

Nearly there! The sort function in WordList is very short:

    private void sortList()
    {
        WordComparer wc = new WordComparer();
        MyList.Sort(wc);
    }

Passing an instance of IComparer enables the list to sort itself.
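To see the ordering in action, here is a self-contained sketch. The Word class below is a cut-down, hypothetical version with public fields (my real one has a counter and an incrementCount method), SortSketch is a made-up name, and I have used CompareTo plus an ordinal string comparison here instead of the if/else chain and plain string.Compare above.

```csharp
using System;
using System.Collections.Generic;

// Cut-down, hypothetical Word: just the text and how often it occurred.
public class Word
{
    public string Content;
    public int Count;
    public Word(string content, int count) { Content = content; Count = count; }
}

public class WordComparer : IComparer<Word>
{
    public int Compare(Word x, Word y)
    {
        // y before x on purpose: a negative result puts x first,
        // so the higher count ends up at the top of the list.
        int byCount = y.Count.CompareTo(x.Count);
        if (byCount != 0) return byCount;
        // Same count, so fall back to alphabetical order.
        return string.Compare(x.Content, y.Content, StringComparison.Ordinal);
    }
}

public static class SortSketch
{
    // Sorts three sample words and returns them as "word:count" pairs.
    public static string SortedSummary()
    {
        List<Word> words = new List<Word>();
        words.Add(new Word("star", 2));
        words.Add(new Word("that", 10));
        words.Add(new Word("light", 2));

        words.Sort(new WordComparer());

        List<string> parts = new List<string>();
        foreach (Word w in words) parts.Add(w.Content + ":" + w.Count);
        return string.Join(",", parts.ToArray());
    }
}
```

SortSketch.SortedSummary() returns "that:10,light:2,star:2": highest count first, then alphabetical among the ties.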

The last function simply iterates through the list and writes each word to the console, together with its count.

What's this all good for? Well, you might be monitoring someone's Tweets for rude words or buzzwords to display a tag cloud. The next time we'll consider monitoring several streams (Tweeters?) at once with possible delays in transmission.

By the way, I even managed to put all my classes in a library and make the small project refer to it. I'm quite proud.

I don't have any idea how to place a zip file here. I doubt if it would be that interesting at this stage but I ought to find out.

Next Episode

Thursday, 27 August 2009

MS SQL is too expensive


Hi,

Today's picture was taken by my daughter at the RHS Rosemoor in Devon last week. I love those big scary ferny things. They have large brush like flowers under all the leaves as well.

Spent much of yesterday researching how to become a "reseller" of hosting space. In other words, if I am going to write websites for people, I want somewhere to host their websites so I can keep an eye on them. I like ASP.Net and C#, which implies a Microsoft server and MS SQL. There are lots of these around, some are very cheap. But ALL of the cheap ones have lots of truly scary reviews. Problem is, it looks like you have to pay a certain amount to Microsoft per MS database and there is no getting around that.

So why do I want it? MySQL is free and as far as I can see does the same job (is relational and all that). I can't use DotNetNuke but then I haven't yet managed to get that to run on my PC anyway. Looks like I can get a deal with a reputable host (for example 1&1 but it could be someone else) and use ASP.Net and MySQL (it runs on a Windows PC, it can run on a Windows server, surely?).

Ok, next step is how to connect to MySQL, the easiest way possible via Visual Studio 2008. I recommend this tutorial by Dr. Jay Krishnaswamy. This looks really lovely until you get to the part where you change the database type. MySQL just does not appear. The magic (and it took me ages to find this out) is on the MySQL website itself. Look through the downloads page (there aren't very many) and choose the latest stable connector for MySQL and .Net. This is an msi file. Scary though it is, I just downloaded it and let it do its stuff.

Next time I opened up VisStudio and had another go at configuring the data source, MySQL appeared in the list! After that it is all a matter of making sure your WAMP server is on and choosing the database and table you want. You also have to follow the above tutorial, to avoid errors (for example, VisStudio does not like square brackets in SQL queries).

I now have a page which shows a gridview of a MySQL table on localhost. Next stop the external server!