Friday, May 12, 2006

class or struct...

In C# the struct keyword doesn't mean the same thing as in C or C++. First of all, methods and properties may be associated with it. Next, it's treated as a value type (what I'll loosely call a base type here). By base types I mean int, long and so on: types that are copied when passed as arguments, rather than passed as a reference to their location.
This means that structs are not constructed in the same way as classes. It also means that the rules that apply to boxing and unboxing the basic types apply to structs too.
From this we can discern that structs should only be used in extremely rare cases. An example would be a map point or perhaps a complex number. In other words, a new type that its users should be able to treat exactly like an int or a long.

Code shouldn't be allowed to use structs unless it comes with a strong justification for their use, along with an account of the penalties of using such a construct. If in any doubt, a class should be used instead of a struct.

Monday, February 13, 2006

Default Arguments (to use or not to use)...

Whether default arguments or function overloading should be used is a topic that isn't so hotly debated anymore. It appears to be a rarer and rarer occurrence that default arguments should, or even can, be used.
At least for .NET (not an issue in C#) default arguments should not be used. The reason is simple: when code is compiled against a library, the default value is placed in the calling binary. So if a library changes its default value, the value won't change in an assembly using that library until that assembly is recompiled.

Sunday, February 12, 2006

Threading and CORBA

A while back CORBA was all the rage, especially as a means of replacing DCOM. CORBA was meant to allow seamless integration between machines, to share load and to let programs span multiple machines. Now the only places that really still see such protocols are those where communication speed between machines matters. Elsewhere, protocols such as XML-RPC and SOAP tend to replace them. These are better simply because they can be more easily debugged using packet dumps.

I ran across an interesting issue the other week. One of the guys at work palmed off a problem to me that wasn't mine. The logs showed that a callback was being called out of order and that the program was hanging permanently.
The problem itself was extremely simple, though seeing the behaviour wasn't. Imagine you have a routine call that is expected to give asynchronous callbacks. This means that the callbacks may arrive in any order (i.e. not the order they occurred in, and sometimes before the method call has returned). The solution is simply to lock the method so that it can return any required information, and then allow the callbacks to run.

The problem, of course, was that because the callbacks could occur out of order, the program had been written so that if the second callback came before the first then it would wait. This was implemented by putting the thread into a suspended state. Of course this meant that the thread was now unable to complete, which prevented the first callback, or any future callbacks, from occurring.

The code has now simply been changed to accumulate all the information from both callbacks and then process it appropriately once it has all arrived. Beautiful in its simplicity, without any need to worry about threads and their states.
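To make the idea concrete, here is a minimal sketch of the accumulate-then-process approach in Python. It is not the actual code from work: the CallbackCollector class, its method names and the "first"/"second" keys are all made up for illustration, and it assumes exactly two callbacks that each fire once.

```python
import threading


class CallbackCollector:
    """Gathers results from two callbacks that may arrive in any order.

    Hypothetical helper: the names here are illustrative only.
    """

    def __init__(self, on_complete):
        self._lock = threading.Lock()
        self._results = {}
        self._on_complete = on_complete

    def _store(self, key, value):
        # Record the result under the lock, then check whether both are in.
        with self._lock:
            self._results[key] = value
            ready = len(self._results) == 2
        # Process outside the lock; no callback thread is ever suspended.
        if ready:
            self._on_complete(self._results["first"], self._results["second"])

    def first_callback(self, value):
        self._store("first", value)

    def second_callback(self, value):
        self._store("second", value)


processed = []
collector = CallbackCollector(lambda a, b: processed.append((a, b)))

# Deliver the callbacks out of order, each on its own thread.
late = threading.Thread(target=collector.second_callback, args=("second result",))
early = threading.Thread(target=collector.first_callback, args=("first result",))
late.start()
late.join()
early.start()
early.join()
```

Whichever callback happens to arrive last does the processing, so nothing ever has to wait on anything else.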

Tuesday, January 24, 2006

The importance of contextual metadata (part 2)

As the techniques for discovering metadata advance, improvements have been made so that metadata may appear in a document in a contextual location.

For example, by using the structure of a paragraph together with metadata recognition routines, it would be possible to do something like the following.

"John and Sara searched for kransky at their local supermarket in Sydney."
becomes
"<paragraph><sentence><name>John</name> and <name>Sara</name> searched for <food>kransky</food> at their local supermarket in <location>Sydney</location>.</sentence></paragraph>"

I've dumbed it down so that nothing conflicts. The information now has a context, and it's still possible to drill down on name (it'd have to be extracted to a separate field, like in the previous post). With the information constructed in this fashion it's also possible to search for a sentence or paragraph that contains a food and a person, or for a sentence that contains the name John and a location. Of course the ability to search such marked up data relies on you having either a very good search engine at your disposal or the willingness to write one. ;-)
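As a rough sketch of what such a search could look like, the Python below runs over the marked up sentence from above using the standard library's ElementTree. The sentences_containing and sentence_has functions are my own invention for illustration, and a real search engine would obviously index this rather than walk the tree every time.

```python
import xml.etree.ElementTree as ET

MARKED_UP = (
    "<paragraph><sentence><name>John</name> and <name>Sara</name> "
    "searched for <food>kransky</food> at their local supermarket in "
    "<location>Sydney</location>.</sentence></paragraph>"
)


def sentence_has(sentence, tag, text):
    """True if the sentence holds an element with this tag (and text, if given)."""
    return any(
        text is None or element.text == text
        for element in sentence.iter(tag)
    )


def sentences_containing(root, required):
    """Return the plain text of sentences matching every (tag, text) pair.

    A text of None means "any element with that tag".
    """
    return [
        "".join(sentence.itertext())
        for sentence in root.iter("sentence")
        if all(sentence_has(sentence, tag, text) for tag, text in required)
    ]


root = ET.fromstring(MARKED_UP)
# A sentence that contains a food and a person.
food_and_person = sentences_containing(root, [("food", None), ("name", None)])
# A sentence that contains the name John and a location.
john_and_place = sentences_containing(root, [("name", "John"), ("location", None)])
# A name that does not appear anywhere.
no_match = sentences_containing(root, [("name", "Bob")])
```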

Monday, January 02, 2006

Harriet's Damage




Poor Harriet has taken a beating since I first got her. Her trouble started when she lost her fairing in her first accident. A broken clutch lever and broken indicators were also a result of this. I got to spend a few weeks not being able to breathe without pain from this little accident.

The later photos show Harriet with a modified front. Less cosmetic and more serious damage was done this time. A good reason not to trust cars, especially P platers, to give way at a give way sign. The old girl is going in on Tuesday, and hopefully we'll be able to get her up and beautiful again.

I have a new helmet and gear ready to go, so all I'm missing is the bike, for me to be ready... Hopefully my caution, more experience and a little bit of luck will allow me to avoid further incidents to myself and Harriet.

Wednesday, December 28, 2005

C# inheritance

Software can be built using a number of programming methodologies. With something like the waterfall approach, inheritance is all planned out. With other, more dynamic programming methodologies it's not so obvious which methods should be overridden and which shouldn't be.

Java is a language where, if a method is overridden in the child, calls made through the parent type will automatically execute the child's version. In C# the designers have decided to go for a literal approach when it comes to the language and inheritance. This is much like C++, which is understandable considering that C++ can be compiled to IL just as C# and VB.NET are.
Take the following example. The class Car is derived from Automobile, and Automobile has a method called TopSpeed. If the virtual and override keywords aren't used, then when somebody assigns a Car to a variable of type Automobile and calls TopSpeed, the code for Automobile will be executed.

I believe this is a case of bad language design. If a child class has a method with the same name as a parent's method, it should implicitly override the parent method. This raises the question: if one is writing libraries in the infrastructure of a project that are to be used by other developers, should all methods be made virtual? I believe that if you wish to allow flexible extensibility by your programmers then a strong argument might be made for it.

Thursday, December 22, 2005

The importance of contextual metadata... (Part 1)

As the amount of information in the world increases, so does the need to be able to search it better. The two main tools that we have to help with this are statistical metrics and metadata. (I'm ignoring things such as anchor weighting, since I'm assuming a non-web environment, and constant boosting for certain authors, since I'm assuming that all information is important.)

When I say statistical metrics I'm referring to things like this: in a search for "soy or linseed", statistically one word will occur more often than the other, so it should probably be rated more highly, and a document containing both soy and linseed is more important than one containing just one of the search terms, and so on.
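Here is a small Python sketch of that kind of scoring. Note the assumptions: the four documents are invented purely for illustration, and I've gone with the usual convention that the rarer term gets the higher weight (an inverse document frequency style weighting), which is only one way of reading the paragraph above.

```python
import math

# Hypothetical mini-corpus, invented purely for illustration.
DOCUMENTS = {
    "doc1": "soy milk and soy sauce",
    "doc2": "linseed oil",
    "doc3": "soy and linseed bread",
    "doc4": "soy beans",
}


def term_weights(query_terms, tokenised):
    """Weight each query term by how rare it is across the corpus."""
    total = len(tokenised)
    weights = {}
    for term in query_terms:
        containing = sum(1 for words in tokenised.values() if term in words)
        # Rarer terms get a larger weight; terms found nowhere get none.
        weights[term] = math.log(1 + total / containing) if containing else 0.0
    return weights


def rank(query_terms, documents):
    """Order documents by the summed weight of the query terms they contain."""
    tokenised = {name: set(text.split()) for name, text in documents.items()}
    weights = term_weights(query_terms, tokenised)
    scores = {
        name: sum(weights[term] for term in query_terms if term in words)
        for name, words in tokenised.items()
    }
    return sorted(scores, key=lambda name: scores[name], reverse=True), weights


ranking, weights = rank(["soy", "linseed"], DOCUMENTS)
```

With this data the document containing both terms comes out on top, followed by the one containing only the rarer term.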

The next aid to finding interesting results is metadata. Pre-existing metadata for a document is nice to have but is often incorrect or inaccurate, so a judgement has to be made about how much weight to give it, usually on a case by case basis (that is, by inspecting the data to be searched over).

Next we have created metadata. This data helps to define things about the document itself. For example, people or places can be extracted. This allows us to drill down on pre-existing values in searches. Other contextual information can be gathered from the text of the document, such as identifying the title of a document or a heading and making it more important.
What does a search for bush give us? A plant, a president, and a pro footballer. By recognising people we can limit those documents to a president and a pro footballer, by searching for bush inside the people metadata.
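A toy version of that drill-down in Python might look like the following. The three documents and their "people" field are hypothetical stand-ins for whatever an extraction step would really produce, and the function names are mine.

```python
# Hypothetical documents; "people" stands in for extracted metadata.
DOCUMENTS = [
    {"id": "gardening", "text": "Prune the bush in early spring.", "people": []},
    {"id": "politics", "text": "Bush addressed the nation.", "people": ["Bush"]},
    {"id": "sport", "text": "Bush scored twice on Sunday.", "people": ["Bush"]},
]


def search_text(term, documents):
    """Plain full-text search: matches the plant as well as the people."""
    term = term.lower()
    return [doc["id"] for doc in documents if term in doc["text"].lower()]


def search_people(term, documents):
    """Search only inside the extracted people metadata."""
    term = term.lower()
    return [
        doc["id"]
        for doc in documents
        if any(term in person.lower() for person in doc["people"])
    ]


everything = search_text("bush", DOCUMENTS)
people_only = search_people("bush", DOCUMENTS)
```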

Attributes in .NET

Attributes were definitely a bit of a blank for me initially when it came to .NET. Why would you want to have metadata within your code? What would be the point of that? It turns out that it's not that silly, especially when we come to thinking about code in terms of reflection and runtime discovery.

Note: Java has the same abilities as .NET in this area, though they're not as well documented. http://www-128.ibm.com/developerworks/java/library/j-dyn0429/ explains how Java bytecode has attributes, that each function is simply an attribute, and that custom attributes may exist. At this time, though, there doesn't appear to be any real way to utilise this information.

Back to metadata attributes in .NET: it's possible to use these attributes to dynamically find items in classes at runtime. This is of course especially useful if you'd like to dynamically discover and load classes at runtime. Think plugins...

First of all, create a custom attribute. This is done simply by deriving a class from the System.Attribute class. You can use the AttributeUsage attribute to say that it's only valid for classes and interfaces.
Write a custom interface, and apply your custom attribute to the interface declaration.
Implement your interface. It is now possible to use reflection to find all the classes that implement that interface. Simply get the current assembly (Assembly.GetExecutingAssembly()) and get all its types. For each type, attempt to get all instances of our custom attribute. If the attribute exists then we know that the class implements our custom interface. (As far as I know an attribute declared on an interface isn't automatically reported for the classes that implement it, so you may need to mark the classes as well or look at each type's interfaces.) Also check that the type isn't abstract or an interface, so that you can actually create an instance of it.

And now you have seen how reflection can be used to find and dynamically instantiate classes. This method is used (though slightly differently) to find webservice methods. You may also want to look at http://www.xml-rpc.net/, which uses the same sort of reflection to know which methods to marshal on a website. So there you have it. Attributes and reflection.

Wednesday, November 09, 2005

XML Python and Characters...

A little while ago Python moved from having only 8-bit strings, which were treated as byte arrays, to also supporting unicode strings. Personally I think that if you're going to move, you have to decide on a direction and move.

On a more interesting note, Python's xml.dom.minidom provides support for parsing XML. When parsing a UTF-8 encoded string it converts the Text and CDATA nodes into Python unicode strings. Nice, if its routines can guess information about the source correctly.
Now imagine you have a CDATA node with the contents "\r\r\n" (using the C programming language representations of the carriage return and line feed characters). A person would usually expect the XML parser to give a unicode string representation of "\r\r\n". This is not the case. In fact the character string becomes "\n\n". So if you're thinking of reliably extracting textual data from an XML document in Python, I can only recommend staying away from Python's minidom (Python 2.3.5 under win32).
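The snippet below is a small reproduction of the case described above, written for a current Python rather than the 2.3.5 I was using, so treat it as a sketch of the test and not a record of the original run. For what it's worth, the XML 1.0 specification (section 2.11) asks parsers to normalise line endings before handing text to the application, so this is probably by design and not a minidom quirk.

```python
from xml.dom import minidom

# A CDATA section holding carriage return, carriage return, line feed.
source = "<root><![CDATA[\r\r\n]]></root>"

document = minidom.parseString(source)
# Join the data of every child node in case the parser splits the section.
extracted = "".join(node.data for node in document.documentElement.childNodes)

# repr() makes the surviving characters visible.
print(repr(extracted))
```

If the carriage returns really matter to you, they'll need to travel some other way than as raw characters inside a CDATA section.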

Saturday, November 05, 2005

Extracting Raw XHTML from an XML document...

One of the cited reasons that a person might want to use XHTML, or XML safe HTML, is that a document or text fragment can simply be extracted from within another document, ready for display. While this is a good idea, it may fall down in a number of places.

Yesterday I was misassigned a bug. Not my code, and we haven't started non-ownership fixing yet. Anyway, the bug was a simple highlighting bug. The highlighted words were all squished together, e.g. "<em>tag1</em><em>tag2</em>"

Now the way this text looked in the XML was "<containingtag>...<em>tag1</em> <em>tag2</em>..</containingtag>". So why has the extraction using an XML DOM parser failed?

System.Xml.XmlDocument (Microsoft .NET).
An XML document doesn't need to treat whitespace between tags as important, and by default it shouldn't. Hence the implementation of XmlDocument will eat the whitespace gap between tags.
So if we wish to use .InnerXml to get our snippet in a preserved state, we'll need to do the following.

Set the PreserveWhitespace property on the document to true before loading the XML. This will allow you to get the inner XML from tags and preserve any whitespace you've placed between your HTML tags.

Friday, November 04, 2005

Nostalgia


I was having a little bit of nostalgia about my old house mate ginnly (Virginia) and, more importantly, flight. I said that I'd go for my pilot's licence after I got my motorcycle licence.
Well I managed to get my bike licence a little while back, and I'm now still on my L's and have had two accidents so far. So this here is a little bit of a kick in my pants to get me up and going, so that I'll start getting lessons again. I can only assume that she's still getting lessons down in Melbourne.
So after I'm not so horrendously broke it should be flying lessons once again.

Creation

With the creation of anything, in this case a blog, I feel that there should be a purpose or a statement. While this is a blog that allows me to write about whatever I feel, I do intend to use it as a place to write technical information about programming and the techniques around it. I'd also like to include information about myself, but I will endeavour to keep these two areas quite separate.