Monthly Archives: July 2013

TStringList.Assign… Bug, Feature, or Documentation Error?

I was recently revisiting my improved implementation of TStringList (see http://www.riversoftavg.com/efficiencytlistntstringlist.htm).  Now this code is very old, but its point about the efficiencies to be gained when copying sorted TStringLists applies.  Unfortunately, my stringlist, and Delphi’s TStringList, has a bug.

Suppose you have 2 TStringLists and the following code:

Strings1.Assign(Strings2);
if not Strings1.Equals(Strings2) then
  raise EListError.Create('Copy unsuccessful');

Do you think the exception will ever occur?  If you are like me, you would say no, and you would be wrong.

Here is the TStringList.Assign method:

procedure TStringList.Assign(Source: TPersistent);
begin
 inherited Assign(Source);
 if Source is TStringList then
 begin
   FCaseSensitive := TStringList(Source).FCaseSensitive;
   FDuplicates := TStringList(Source).FDuplicates;
   FSorted := TStringList(Source).FSorted;
 end;
end;

The inherited Assign clears the TStrings, copies some properties over, and then copies the strings using AddStrings.  The problem is subtle, but the error is that TStringList sets the Self properties after the strings have been copied (To be precise, the trouble is that the Self TStringList does not set the properties properly before copying the strings.  But more on that in a moment).

If Self has its Sorted property set to True, things start to take a wrong turn.  Basically, 3 errors can occur:

  • If the Source TStringList is not sorted, the 2 TStringLists are guaranteed to not be equal, e.g.,
Source := TStringList.Create;
Source.Add('Hello');
Source.Add('World');
Source.Add('Hello');
Dest := TStringList.Create;
Dest.Sorted := True;
Dest.Assign(Source);
if not Dest.Equals(Source) then raise exception.create('oops');
  • If the Source TString is sorted but the CaseSensitive property is not the same, the 2 TStringLists could possibly not be equal as they sort the strings differently. 
  • If the Source TStringList is sorted but contains duplicates, the 2 TStringLists could still not be equal.  The default value for the Duplicates property is dupIgnore, which means that the Self TStringList could be just throwing those duplicates away.  So not only are they not in the same order, but the 2 TStringLists don’t even contain the same number of items!

So how should the Assign method look?  The first thought is to put the TStringList property assigns before the inherited call, e.g.,

procedure TStringList.Assign(Source: TPersistent);
begin
 if Source is TStringList then
 begin
   FCaseSensitive := TStringList(Source).FCaseSensitive;
   FDuplicates := TStringList(Source).FDuplicates;
   FSorted := TStringList(Source).FSorted;
 end;
 inherited Assign(Source);
end;

This is better, but there is still a bug.  The problem is that even if the Source TStringList is Sorted and Duplicates = dupIgnore, there still may be duplicates in the Source TStringList!  Setting the Duplicates property does not actually do anything – it only works when strings are added.  So the Source TStringList could have added duplicate strings and then set its Duplicates property to dupIgnore or dupError afterwards.

The solution is to temporarily set the Self TStringList to unsorted and Duplicates to dupAccept.  Copy the strings and then copy the actual property values.  So, our final code looks like this:

procedure TStringList.Assign(Source: TPersistent);
begin
 if Source is TStringList then
 begin
   FDuplicates := dupAccept;
   FSorted := False;
 end;
 inherited Assign(Source);
 if Source is TStringList then
 begin
   FCaseSensitive := TStringList(Source).FCaseSensitive;
   FDuplicates := TStringList(Source).FDuplicates;
   FSorted := TStringList(Source).FSorted;
 end;
end;

One thing I am unsure about.  What should be done if the Source is a TStrings but is not implemented by a TStringList?  In that case, should the strings be sorted or not as they are added to Self?  My decision was that the Self TStringList should keep its own property values, even if that means the 2 objects, Self and Source, will not be equal.

Postscript

So will this “bug” ever be fixed?  My instinct says that it won’t.  This behavior has been around way too long and has become the expected behavior.  After all, the current behavior is a quick way to strip duplicate strings from a TStrings.  Changing the behavior now may break user’s code.  So that is why the title of this post asks is it a bug, is it a feature, or is it just a documentation error?

Happy CodeSmithing!

Delphi Component Writers… Descend from TPersistent!

I have been busy converting the RiverSoftAVG code to support Delphi XE4.  I promise to discuss this process in more detail later, but right now, I have to talk about a pet peeve of mine.  I guarantee my recommendation is controversial with some people, and perhaps, I should avoid it for my very first real blog post, but I just have to go with it…

Descend your objects from TPersistent, not TObject

It is amazing how useful the TPersistent class is, and how little it is actually used.  Embarcadero is especially guilty of this.  The TPersistent class is not just about persisting a class, it is arguably more useful to use the class to copy from one class to another.  Unfortunately, Embarcadero leaves large swaths of the Vcl/FMX library descending from TObject instead of TPersistent.

One of my favorite tricks when developing a class is to use the AssignTo protected method as a method to assign my class to a TStrings object.  I override the AssignTo method, detect if the destination object is TStrings and then fill the TStrings with appropriate text.

For example, my handy-dandy string some type of list/collection class could have:

procedure TMyStringCollection.AssignTo(Dest: TPersistent);
begin
  if Dest is TStrings then begin
    TStrings(Dest).BeginUpdate;
    try
      TStrings(Dest).Clear;
      for i := 0 to Count - 1 do
        TStrings(Dest).AddObject(MyString[i], MyItem[i]);
    finally
      TStrings(Dest).EndUpdate;
    end;
  end
  else begin
    inherited AssignTo(Dest);
  end;
end;

This little trick makes it easy to quickly see the contents of my list in a TListBox, TMemo,… you name it.  You just do code like:

ListBox1.Items := MyStringCollection1;

If Embarcadero had made TObjectList descend from a persistent TList, that AssignTo could easily have been:

procedure TObjectList.AssignTo(Dest: TPersistent);
begin
  if Dest is TStrings then begin
    TStrings(Dest).BeginUpdate;
    try
      TStrings(Dest).Clear;
      for i := 0 to Count - 1 do
        TStrings(Dest).AddObject(Items[i].ToString, Items[i]);
    finally
      TStrings(Dest).EndUpdate;
    end;
  end
  else begin
    inherited AssignTo(Dest);
  end;
end;

Wouldn’t that be incredibly useful for quickly dumping a TObjectList?  Unfortunately, they didn’t.  TList doesn’t descend from TPersistent; it descends from TObject.  Even worse, TList<T> (the generics version of TList) and TObjectList<T> don’t descend from TPersistent (and this is where my latest aggravation comes from).  With generics lists, Embarcadero has no idea what I will put in the generic.  If they had descended from TPersistent, I could have overridden AssignTo and done whatever I want.  Not just assign to TStrings, but any other class.

In order to support mobile, I need to get away from using TLists and casting Pointers around.  Embarcadero recommends replacing TLists with generics.  Now I have a hard choice of using their generic classes and losing TStrings compatibility or making my own and losing compatibility with their generics.  Most likely, I will descend from their generics classes and graft on an Assign and AssignTo method, but it is far from ideal.Because I also need my lists/tables to persist (the IECS overrides the hash table/dictionary so that they can be read and write properties), I will have to make my own basic generics classes.

The Counter-arguments

Granted, there is the argument to be made that Embarcadero should have put the Assign/AssignTo code at the TObject level and left TPersistent for classes that truly persist themselves.  And I can completely agree with that, unfortunately they didn’t.  Perhaps in the future they will do so, it is not the first time that they added important new methods to TObject after many years of leaving them out.  I had Equals and ToString in all my objects almost a decade before they got around to adding it.  If they add them, I will rejoice (and for goodness sake, add in Clone as well).  (At the same time, I must admit I will be cursing them because now I have to handle 2 inheritance hierarchies in supporting legacy code 🙂 ).

There is also an argument that object purists will make that classes should not know about other classes.  And this one, I totally understand and I must admit it does make me a little uncomfortable.  However, I consider TStrings a basic class, only one level below TObject and TPersistent (actually, anything in Classes.pas is basic (such as TCollection) to my mind and can be supported).  And the Assign/AssignTo paradigm was expressly designed to avoid this.  TStrings does not need to know about my class to be assigned back and forth to it.  My class just needs to know about TStrings.

So the next time you are tempted to descend your incredibly useful class from TObject instead of from TPersistent, just don’t.

That’s all for now from the Delphi CodeSmith.