In this issue, our survey of C++ compilers continues and several contributors make critical comments about the cost and complexity of using C++. I have held the second instalment of my compiler-writing series over to Overload 8 for various reasons that will, I hope, become clear in that issue!
Before I tackle the primary subject of OS/2 C/C++ compilers I'd like to take a little space to expand on my column in the last issue.
For some reason I completely forgot to mention the most outstanding feature of Salford Software's C and C++ compilers - their debug support and in particular, runtime debugging. Anyone who has written more than the most trivial of programs will have tripped over memory problems (though they may not realise it yet).
1. Dangling pointers (and references in C++), i.e., using a pointer variable that is no longer attached to underlying memory. For example, returning a pointer or reference to a local (auto) object. Often such abuse appears to work because the associated memory has not been reused yet. This makes matters worse because the defect will manifest only rarely, and will get past many programmer-contrived test suites.
2. Writing beyond the end of an object - or sometimes before the beginning but this is a far less frequent problem. C's mechanisms for handling array parameters (and dynamic arrays) make this problem particularly vicious and frequently impossible (or effectively so) to detect statically (at compile time).
3. Memory leaks - the commonest form of resource leakage. This is another problem that is difficult or impossible to detect statically and which needs special tools to detect dynamically. In a way, it is the exact reverse of `dangling pointers' because it occurs when all pointers and references to dynamically assigned memory are lost before that memory is freed. The real sting in this problem is that it often only manifests as a serious problem when a program has been running for hours, days or possibly weeks. Virtual memory resources make it worse by delaying ultimate collapse.
4. Reading uninitialised memory. Any attempt to read from memory that your program has not previously written to will exhibit undefined behaviour. Unfortunately, undefined behaviour often manifests by doing exactly what you expected. That makes it rather difficult to detect.
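The four categories above can be made concrete with a short sketch. The function names are purely illustrative; each defective line is left as a comment because executing it would be undefined behaviour, and the code that actually runs shows one conventional way of avoiding that defect.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// 1. Dangling pointer: returning the address of a local object.
//    const char* greeting() { char buf[] = "hi"; return buf; }  // buf dies here
// Returning by value keeps the data alive for the caller.
std::string greeting() { return "hi"; }

// 2. Writing beyond the end: a raw array parameter carries no length.
//    void fill(int a[]) { for (int i = 0; i <= 10; ++i) a[i] = 0; }
//    (this overruns a 10-element array by one element)
// Passing a container lets the loop bound come from the object itself.
void fill(std::vector<int>& a)
{
    for (std::size_t i = 0; i < a.size(); ++i)
        a[i] = 0;
}

// 3. Memory leak: the only pointer to the block is overwritten.
//    int* p = new int[10]; p = new int[20];  // first block is lost
// 4. Uninitialised read: the value of total is indeterminate.
//    int total; total += 1;
// A vector owns its storage and value-initialises its elements.
int sum_of_zeros(std::size_t n)
{
    std::vector<int> v(n);      // n elements, each initialised to 0
    int total = 0;              // explicitly initialised accumulator
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;               // storage is released when v goes out of scope
}
```

None of this removes the need for runtime checking tools; it only narrows the places where the defects can hide.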
I recently had an instance in my training room where a programmer was puzzled because his program always ran correctly the first time and failed the second. The first time the program ran, his assumption that a variable was zero agreed with the memory provided. On running the program again he got the same storage with the results he had written to it during the first execution.
Clever operating systems such as Windows NT make this kind of problem harder to detect because they clean up storage before re-allocating it to a new task.
I believe NT stamps 0xDEADBEEF all over freed memory? Clearly not a vegetarian operating system... - Ed.
There are tools available to tackle these problems and I hope readers will write in to describe anything that helps them reduce the incidence of these problems.
I was being a bit optimistic when I wrote my last column but my latest information is that version 7 will ship on March 25th (I guess those who had the sense to get to our AGM may already know that).
...if Symantec's team had turned up! - Ed.
The upgrade price is GBP89 and the special offer price (until the end of May) is GBP149 - at these prices it must be worth considering upgrading your machine to 16 Mbytes. The way things are going you are likely to need that much for sensible performance with the next generation of OS's and development tools anyway.
I mentioned last time that the parser had been disconnected from the rest of the compiler. I can now be more precise and tell you that it has been wired into the editor so that your code is being parsed while you write it. This is not intrusive in that it lets you write whatever you want to and ignores what it does not understand. Maybe we could persuade Symantec to provide an option which was a bit more intrusive (i.e., howls when you write code that will not parse).
The environment (IDE) is one of the best that I have used, though it will take those of you used to the cruder early 1990s PC IDEs some time to get to grips with it. Those coming from powerful workstation environments may wonder what is so special.
The MSWindows support is based on MFC so at least you will have a few familiar bugs and defects to work round. By the way, if your code is in any way critical you should only use MFC (and code generated using it) if you are familiar with the details of MFC. It is far too late to discover an MFC problem when running mission-critical code. I, like many others, can live with the bugs in such Microsoft products as Word for Windows 6 because I do not multi-task critical software alongside it.
Those of you who think you can just install a product and jump in to using it at once will find this a difficult product but that is because such attitudes are unrealistic. If you want something better you must expect to invest some effort in learning to use the new product.
There is an exception to this, namely GNU C and G++. Sean added a comment about UNIX users expecting these to be free. This is fundamentally true but - and in a PC world it is a big but - you will still have to get a copy as well as copies of all the other tools you will need such as debuggers, profilers etc. The cost of this in a UNIX context (remembering that Unix was designed with programmers in mind) is very low. In addition the tools will work just about straight out of the box (well, for Unix gurus they will).
The cost in a PC environment is quite different. Here we expect the basic commercial tools to cost a few (very few) hundred pounds. For the novice, the tools must work directly without any fiddling. It was in this context that I was suggesting that the GNU development tools were not suitable and not that low cost: the actual delivery of the free software will cost close to the price of a low end PC C/C++ IDE such as Turbo C++.
I would still argue that the entire GNU development environment, debuggers and all, is free. However, Francis' point is well taken - GNU software does not always run "out of the box" and can therefore prove expensive to get running - Ed.
How about one of the Linux specialists writing a series on using G++ with Linux? Such a series could be at one of two levels. That for experienced UNIX users and professional programmers could focus on quality programming and tool support. On the other hand there is a place for a series for inexperienced UNIX users and part-time programmers aiming at leading the reader from the start. The former would seem appropriate for publication here while the latter would, I think, better fit C Vu.
It would also be nice to see the Macintosh specialists report on the compilers available for their system. I find people often assume that everyone else will know as much as they do about what is available. It isn't true: others know more, less, or the same but different.
Anyone out there use Symantec, MPW or Code Warrior on the Mac? Write it up and send it to Francis to collate! - Ed.
One point that is well worth keeping in mind is that there is a strong relationship between Metaware and IBM. Metaware wrote the SOM compiler for IBM and also provide a direct C++ to SOM compilation system.
What is SOM? Well that is a little complicated to answer in the current context but I'll give a brief (and I hope not too inaccurate) answer. One of the growth areas in current computing is DLLs and forms of object linking. The problem from the C++ point of view is that any change in a class declaration changes the object module so that relinking is often not enough. This is particularly problematical when your program utilises a DLL. If the DLL version does not have the same layout for classes that your code expects there will be a horrible crunch.
SOM tackles this problem by providing an extra layer of indirection in a language independent way. This means that for a relatively small overhead (less than 15%) in performance your program can use both current and future versions of other SOM conforming software.
The real fun starts as we move into distributed systems and support via DSOM.
Er, yes, but what does SOM actually stand for? - Ed.
As you would expect from a high quality compiler specialist, this is an excellent compiler. The IDE is pretty rudimentary, which is less significant for those who already have OS/2 development tools from which they can, to a large extent, build their own IDE.
I wish Watcom would go out and negotiate with companies such as Blue Sky and Kaseworks. Add products from these companies to Watcom compiler technology and you have something really special. The problem is that full products from these companies are expensive to buy for any but the specialist developer. Once you have tried special versions attached to a compiler you are likely to want the full product if your work merits it.
If you need to support more than one platform on an Intel x86 based machine this is a compiler you should consider very seriously.
Watcom, above, notwithstanding? - Ed.
With this release Borland includes OWL for OS/2. This is not a perfect match for OWL for MSWindows but it is a pretty good one. The product is well up to Borland's normal standard.
The down side is that it is a separate set of tools at a separate purchase price. What we really need is an x86 platform developer's CD with both these tools and the MSWindows ones together.
In the meantime, if you need to develop for both Microsoft and IBM GUIs on an Intel x86 platform this has got to be worth serious consideration. The pity is that other priorities at both Borland and Novell (I think they are still responsible) have delayed the development of OWL for Appware.
As always with products from IBM this is a solid well constructed product. I don't mean that it is entirely bug free - I don't think that there are any products of this complexity for which you can say that. However if your code does not behave the way you expect the chances are pretty high that your expectations were wrong.
Of course, with a language still under development and refinement, it may be that you know about the current state of the language while this compiler is still implementing the 1992 version, but even the best of firms has this kind of problem.
The development environment is among the best that I have used and the bundled KASE:Set from Kaseworks puts all the other code generators for AFXs to shame.
If you program solely for IBM platforms (OS/2 etc.) then by all means look at the other products but this is the one that you will buy. I can hardly wait to get my hands on the next version.
I hope you can understand why I get so irritated by those who ask me what is the best C++ compiler. There is no such thing and anyone who tries to give you an answer without first checking what you want to do is too ignorant to be worth listening to.
People who answer questions without asking any of their own are unlikely to provide useful answers.
I have used a wide range of programming languages over the last twenty years; C++ is unique both in the facilities it offers and in the continuing effort required to use it competently. I don't mind the effort needed to use the expressive power of the language but the effort required to circumvent soluble problems is a continual irritation. In short C++ programming is not only hard, but also harder than it needs to be.
I am not saying that programming in C++ is wrong; far from it - I frequently need its power of expression, but this power often comes at an excessive cost. It takes considerable practice on the violin to play a tune (I can't), but anyone can play one on a Stylophone (at least I can). The other difference is that there are many more ways to play the tune - the results may be much better but the cost is higher. It is always necessary to consider the costs and C++ is pricing itself out of the market. If I have a program to be written and a choice of a trainee programmer and Visual Basic for a couple of weeks or an experienced C++ programmer for a couple of weeks (or an inexperienced one for a few months) which route am I going to take? The Visual Basic program may not be as elegant or efficient, but it is far cheaper.
Having just made some claims about the unnecessary cost of using C++ I should come up with some justifications! A continual problem for me is the unhelpful defaults of many features of the language, for instance:
* member functions don't default to virtual;
* default constructors, copy constructors, and assignment operators are generated automatically.
Other problems for the developer are caused by:
* the lack of a syntax for referring to classes by their relationships ("my base class"),
* with the addition of "exception handling" C++ is no longer a "better C", and
* constraints on the program that cannot be checked automatically (e.g., the "one definition rule").
Allow me to elucidate...
The default is "justified" on the basis that the overhead of a virtual function call is avoided except where explicitly requested. However, I cannot believe that the cost of dynamic binding is significant in the majority of cases. In speed terms suppose that dynamic binding adds 20% to the function call overhead and 10% of the program's execution time is spent in the function call overhead - this is almost certainly an overestimate and still only gives a 2% performance hit. Of more relevance are small classes that have large numbers of instances. These may not be able to stand the overhead of a vtable reference in the memory mapping of the class.
Before anyone writes in and tells me that I should just put virtual before almost all member function declarations let me point out that this is my argument. It is the need to know this is desirable and the time spent overriding the language default that are unnecessary costs.
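To make the consequence of the default concrete, here is a minimal sketch with hypothetical classes Shape and Circle. One member function is left at the language default and one is explicitly declared virtual; both are then called through a reference to the base class.

```cpp
#include <cassert>
#include <string>

struct Shape
{
    // Non-virtual by default: the call is bound to the static type.
    std::string name() const { return "shape"; }
    // Explicitly virtual: the call is bound to the dynamic type.
    virtual std::string kind() const { return "shape"; }
    virtual ~Shape() {}
};

struct Circle : Shape
{
    std::string name() const { return "circle"; }          // hides, does not override
    virtual std::string kind() const { return "circle"; }  // overrides
};

// Calls both functions through a reference to the base class.
std::string describe(const Shape& s)
{
    return s.name() + "/" + s.kind();
}
```

Passing a Circle to describe() yields "shape/circle": only the function that was explicitly marked virtual reaches the derived class.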
In addition, (and this is common to a number of the other points) it is impossible to override the defaults in library code that is outside my control. To cite a particular example of a problem library: there are a number of classes in the MFC library that should (allegedly :-) have virtual destructors but don't. If the default were "correct" then this would be very unlikely to have happened. It is not just Microsoft that make this error - it is also a problem with the current draft of the proposed "Standard Library".
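The destructor problem can be sketched as follows, using hypothetical classes Base and Derived and a counter added purely so that the effect can be observed. With the virtual keyword present, deleting through a pointer to the base class runs the derived destructor; without it, the same delete has undefined behaviour.

```cpp
#include <cassert>

int derived_destroyed = 0;   // counts runs of ~Derived (for illustration only)

struct Base
{
    virtual ~Base() {}       // remove "virtual" and the delete below is undefined
};

struct Derived : Base
{
    ~Derived() { ++derived_destroyed; }
};

// Destroys a Derived object through a pointer to Base.
void destroy_through_base()
{
    Base* p = new Derived;
    delete p;                // runs ~Derived, then ~Base, because ~Base is virtual
}
```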
The committee recently clarified that the generated copy constructor and copy assignment operator perform memberwise copy and memberwise assignment respectively. Such copying or assigning of an uninitialised value causes undefined behaviour so you may not even get to your destructor - Ed.
Any class that manages a resource needs to declare the "big three" to avoid problems. Of course to change the language to prevent automatic generation for classes which contain pointers (or member/base classes without the corresponding functions) leads to a problem about how to code copy constructors and assignment operators.
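As an illustration of what declaring the "big three" involves, here is a minimal sketch of a hypothetical Buffer class that owns a block of memory. It is one conventional way of writing the three functions, not the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class Buffer
{
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new char[n])
    {
        std::memset(data_, 0, size_);
    }
    // 1. Destructor: releases the resource.
    ~Buffer() { delete [] data_; }
    // 2. Copy constructor: makes a deep copy instead of sharing the pointer.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new char[other.size_])
    {
        std::memcpy(data_, other.data_, size_);
    }
    // 3. Copy assignment: allocates first, so a failed new leaves *this intact.
    Buffer& operator=(const Buffer& other)
    {
        if (this != &other)
        {
            char* fresh = new char[other.size_];
            std::memcpy(fresh, other.data_, other.size_);
            delete [] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }
    char& at(std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
    char* data_;
};
```

With the compiler-generated versions, two Buffer objects would end up holding the same pointer and the second destructor would delete it again.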
Naturally, tools like "lint" can be used to check for these functions (and some of the other problems mentioned). However, the need for such aids complicates the development process and (as mentioned above) does not help if it is library code in error.
It would be nice to say, for instance, "the direct base class with this function", but instead one must identify the specific base class whose member function is to be called and hope that anyone adding a class between them in the inheritance graph updates the reference. C++ would be simpler to use if this process were automated. (Of course, if one gets the design right first time...)
The advent of "exception handling" changed all that. This flow control mechanism affects every piece of code and needs to be understood by the programmer. As indicated above it is possible to produce correct code without a clear understanding of the "class" mechanism. However, a lack of understanding of "exception handling" is far too likely to lead to problem code like the following:
void f()
{
    char* buf1 = new char[100];
    char* buf2 = new char[100];
    if (buf1 && buf2)
    {
        // Something
    }
    delete [] buf2;
    delete [] buf1;
}
This is now badly broken - if an exception is thrown anywhere between initialising buf1 and deleting it, then the memory that it references will "leak". Of course, on many platforms losing a few bytes like this may not be an issue, but the same problem exists with more complex objects and other types of resource.
Some other languages that use exception handling also include "garbage collection" which trades these problems for another, more intractable set (when you find you have insufficient control over the "garbage collection" process you have no options). In C++ the code can be fixed (below) but the style seems less natural to those moving from C or early C++ implementations:
void g()
{
    char* buf1 = NULL;
    char* buf2 = NULL;
    try
    {
        buf1 = new char[100];
        buf2 = new char[100];
        if (buf1 && buf2)
        {
            // Something
        }
        delete [] buf2;
        delete [] buf1;
    }
    catch (...)
    {
        delete [] buf2;
        delete [] buf1;
        throw;
    }
}
Naturally, this is not the only solution, but unless you wish to obscure meaning by avoiding the direct use of pointers in this type of code then the alternatives are equally long winded.
This means that if both you and the developer of a library you are using decide to define the same "entity" then there need be no diagnostic and the program could do anything! Just imagine what trying to police such a requirement without diagnostic aids does to your development costs.
At the time of writing the language standardisation process has reached a stage where the chance of fixing any of these problems is remote. The cost will now fall on the developer.
First of all, let me say that I think Alan makes an excellent point about the demands that C++ places on developers. There is no doubt that the learning curve for a language as complex as C++ is much steeper than for, say, C. It may not be so clear-cut that the benefits are correspondingly higher too and so I shall not attempt to argue that point. I shall, however, put on my compiler-writer / X3J16 hat and respond to several of Alan's more specific points.
My thanks to Derek Jones for providing typical execution times on two very different architectures - Ed.
As for the draft Standard Library making the mistake of using non-virtual destructors - I can't think of any library classes that are intended to be used as base classes, with the exception (sic) of the exception class hierarchy which does have virtual destructors.
class Derived : public Base // #1
{
public:
    typedef Base inherited; // #2
    void f() { inherited::f(); }
};
Admittedly, this suffers from the multiple base class problem too, and if you change #1 without changing #2...
void g()
{
    char* buf1 = NULL;
    char* buf2 = NULL;
    try
    {
        buf1 = new char[100];
        buf2 = new char[100];
        if (buf1 && buf2)
        {
            // Something
        }
        delete [] buf2;
        delete [] buf1;
    }
    catch (...)
    {
        delete [] buf2;
        delete [] buf1;
        throw;
    }
}
Is this fixed? Not quite! What happens if new fails? It throws an exception and does not return. In the example above, testing that buf1 and buf2 are not null pointers is redundant. In fact, it makes no difference in the above case but the fact that new throws bad_alloc instead of returning zero will "break" almost every program written before exception handling. One common trick in use today is to add the statement:
set_new_handler(0);
near the beginning of main() which often sets the behaviour of global operator new back to the "old" behaviour. This was not portable and in Austin (March `95) the committee voted to remove this "hack" and provide a standard way to use new without having to deal with exceptions - see The Casting Vote in this issue for more details.
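For readers who want to see what exception-free allocation looks like in code, the form found in standard C++ today is new (std::nothrow), declared in the header <new>. The sketch below uses it with a hypothetical function use_buffer(); whether this spelling matches the Austin resolution in every detail is an assumption on my part.

```cpp
#include <cassert>
#include <new>      // std::nothrow

// Pre-exception style: test the result, use it, release it.
// With plain new, failure would throw std::bad_alloc instead of
// yielding a null pointer, and the test below would be redundant.
bool use_buffer()
{
    char* buf = new (std::nothrow) char[100];
    if (!buf)
        return false;       // allocation failed, nothing to release
    buf[0] = 'x';           // Something
    delete [] buf;
    return true;
}
```

The null test is meaningful here precisely because this form of new reports failure through its return value.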
One solution to this problem is to embrace the "initialisation is resource acquisition" idiom where the "resource", in this case memory, is "acquired" by a constructor and released by the corresponding destructor. The draft Standard Library provides several ways to do this - for the example above, it would be more "natural" to use the vector template class:
void g()
{
    vector<char> buf1(100);
    vector<char> buf2(100);
    // Something
}
This does mean, of course, that you need to "know" even more about C++ and its library but the benefits are more maintainable programs since you no longer clutter up functions with error-prone housekeeping code.
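The idiom itself can be sketched without the library. The class CharBuffer below is hypothetical, and the counter live_buffers exists only so that the clean-up can be observed; the point is that the destructors run during stack unwinding, so an exception thrown part-way through g() leaks nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

int live_buffers = 0;        // number of CharBuffer objects currently alive

class CharBuffer
{
public:
    // The constructor acquires the memory...
    explicit CharBuffer(std::size_t n) : data_(new char[n]) { ++live_buffers; }
    // ...and the destructor releases it, however the scope is left.
    ~CharBuffer() { delete [] data_; --live_buffers; }
    char* get() { return data_; }
private:
    CharBuffer(const CharBuffer&);              // copying disabled:
    CharBuffer& operator=(const CharBuffer&);   // declared, never defined
    char* data_;
};

// May throw part-way through; both buffers are released either way.
void g(bool fail)
{
    CharBuffer buf1(100);
    CharBuffer buf2(100);
    if (fail)
        throw std::runtime_error("Something went wrong");
    buf1.get()[0] = buf2.get()[0] = 'x';
}

// Returns true if g() threw and no buffer was left alive.
bool g_cleans_up()
{
    try { g(true); }
    catch (const std::runtime_error&) { return live_buffers == 0; }
    return false;
}
```

Note that g() contains no try block and no delete: the housekeeping has moved into the class.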
/* file1.c */
int a;
/* file2.c */
void a();
int main()
{
    a();
}
On some systems, a C compiler will successfully link this program because it uses only names for linkage, not types. Some systems might give a link-time message - I once saw the very mysterious "too far to jump" message from a linker presented with the above code. Now consider a C++ system: it typically encodes a function's calling sequence into the name. This means that the link-names of a (the variable) and a (the function) will be different. So C++ has actually helped us here!
I have received some private comments about George's last column so I feel compelled to explain my position: like Francis for CVu, I do not edit George's column (other than to correct typos) which means he may well be more controversial than you care for - he also might be completely wrong! That is for you, dear reader, to decide. I hope that George's columns will encourage several of you to respond - in the past, a particularly barbed attack on the C++ standards committee (CVu5.6) caused me to write a somewhat outraged response (CVu6.1) - Ed.
I like C++; it has the potential for being a great language but it is also exceptionally complicated, almost, I think, to a degree where the designers themselves do not understand the implications of their decisions.
What I would like to see is a concerted effort to simplify the language itself and make it easier to use with predictable results - predictable, that is, to the ordinary working programmer not just to balding whiz kids.
The language designers seem prone to introducing things that make their lives easier, often by allowing compilers to implicitly support something which would otherwise have to be made explicit.
One area that is a minefield of unwanted complexity is that of overloading. What is so wrong about forcing programmers to disambiguate close decisions? Doing so might persuade them to look more carefully at their designs and reconsider the degree to which they overload things. By the way, it would be no bad thing if the designers reversed their habit of overloading new, subtly different, meanings onto keywords like static. Actually that keyword is a complete disaster akin to the term chosen for new style function declarations: "prototypes". Both words are already in active use in computer science for other purposes.
Enough of this pre-amble. Let me come to the point of this article - overloading, and specifically operator overloading. Before dealing with the latter let me take a quick look at function overloading.
Once you have function overloading you need a method to resolve uses of an overloaded function name. The first part is to collect all the candidates for the decision.
The rule is currently simple (I say currently, because I do not understand namespaces well enough to be sure that it will remain simple in future.)
Start by examining the current scope, remembering that where the call is to a member function - always identifiable because an object or pointer to object will decorate the call - the initial scope is that class's scope.
Search that scope for all declarations of the required identifier, if any are found that is your complete candidate set.
Otherwise repeat the process for each scope containing that scope.
Keep going until you either obtain a candidate set or have failed while searching the global scope.
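The effect of stopping at the first scope that yields any declaration can be seen in a small sketch. The classes are hypothetical, and the using-declaration in Widened is the mechanism standard C++ provides for widening the candidate set deliberately.

```cpp
#include <cassert>
#include <string>

struct Base
{
    std::string f(int) const { return "Base::f(int)"; }
};

struct Derived : Base
{
    // This declaration is found first, so the search stops here and
    // Base::f(int) never joins the candidate set.
    std::string f(double) const { return "Derived::f(double)"; }
};

struct Widened : Base
{
    using Base::f;   // brings Base::f(int) into this scope as a candidate
    std::string f(double) const { return "Widened::f(double)"; }
};
```

Calling f(1) on a Derived object converts the int to double and calls Derived::f(double), even though Base::f(int) would have been an exact match; the same call on a Widened object reaches Base::f(int).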
In the next stage trim the candidate set to those that have the right number of parameters (being careful to leave in appropriate versions of declarations that fit by using default parameters.)
Now look to see if one of the candidates has the types of its parameters exactly matching those of the arguments in the call. If so, use it (if two match at this stage, take the programmer out and shoot him/her - it's probably an acne-ridden male teenage, bedroom whiz kid hacker, but to say so would make me guilty of so many -isms that the PC world would put out a contract on me.)
Don't worry George, the PC police do not roam the pages of Overload - Ed.
If not, you will have to go into best fit mode and start playing games with type conversions. This stage needs drastic simplification because the rules are just too fine grained for good sense. It may mean that ambiguity rarely arises, but it also means that sometimes the resolution is not the one that you expected, leaving some subtle defect in your work. I much prefer to have a compiler require me to be more precise than to have it second-guess me. Now that we have a range of new-style casts, disambiguation through casting an argument is much less dangerous.
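Two small cases, with hypothetical function names, illustrate both halves of that complaint: a resolution that is legal but probably not the one expected, and an ambiguity that a new-style cast on the argument removes.

```cpp
#include <cassert>
#include <string>

std::string pick(bool)               { return "bool"; }
std::string pick(const std::string&) { return "string"; }

std::string width(long)   { return "long"; }
std::string width(double) { return "double"; }

// A string literal converts to bool by a standard conversion, which
// ranks above the user-defined conversion to std::string.
std::string literal_choice() { return pick("hello"); }

// width(1) would be ambiguous: int -> long and int -> double rank equally.
// A cast on the argument states the intent and makes the match exact.
std::string cast_choice() { return width(static_cast<long>(1)); }
```

The first call compiles without complaint on the compilers I would expect to meet and silently selects pick(bool), which is exactly the kind of close call a support tool could usefully warn about.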
The end result is that function overloading is fine. You only have to use it for constructors. If we shout loud enough the granularity of resolution might be coarsened or one of the providers of support tools might provide a tool that would warn of close calls.
Noted :-) - Ed.
Good programmers (usually those whose employers have supported them with training and time to develop skills) will use function overloading with care. Bad programmers, well I doubt that anything will make them better (but see my column in CVu7.3).
Before we start providing any overloading on operators, the language has a fully defined set of operators, each appropriately overloaded (or not provided if inappropriate) in the context of the built-in types. Whatever mechanism implementors use to support these operators, it is inaccessible.
On the other hand, programmers who wish to overload an operator must do so by providing a function to do the work. Despite the slightly eccentric form of such an operator function, it is a function and is subject to exactly the overloading rules that pertain to other functions.
This can lead to some weird behaviour. Consider the following:
void fn()
{
    int i;
    i = 1 + 2;            // the RHS will, I think,
                          // be statically evaluated
                          // by the compiler.
    i = operator+ (1,2);  // does what?
}
Well that explicit call to the operator+ function won't be able to call the normal `+' for ints because no such function exists (well it may be an implicit function provided by the compiler implementor - but we cannot use that). Instead it will have to search global scope for any available user provided versions. These certainly will not be for two int arguments because the language rules explicitly forbid users providing their own versions for parameter lists that do not include any user defined types.
That rule is, in itself, an error because it prevents users from providing their own mixed mode arithmetic via operators. One of the eccentricities of C++ is the automatic type conversion rules it inherited from C and this rule prevents me from fixing that.
Actually, the language doesn't forbid this - but only when at least one operand is a user-defined type is the full search performed, otherwise only built-in operators are considered - Ed.
Next case. Consider:
void fn()
{
    MyType m(...);          // initialised with appropriate values
    int i;
    i = m + 1;              // A
    i = operator + (m, 1);  // B
    i = m.operator + (1);   // C
}
At line A the compiler first looks in the scope of MyType to see if I have provided an operator + function.
If I have, it starts the normal process of overload resolution, but what is the candidate set? Only those in the current scope? Those in the current scope and the built-in ones? Those in the current scope, built-ins and globals? All those from the current scope outwards through all enclosing scopes to global scope?
If you truly know the answer to this question, I take my hat off to you. I don't. Of one thing I am certain, the normal name hiding rules for nested scopes do not apply to operators. They cannot, or else declaring an operator will hide and inhibit the use of all versions in enclosing scopes.
Now suppose that as well as an in-class definition of MyType::operator+(MyType) there is a file scope (or wider) definition of operator+(MyType, int). Under what circumstances will this exact match be found? Only if no resolution (however bad) can be found in class? Never (i.e., the in-class version hides the other)? Always?
Suppose that MyType provides a conversion to YourType. When will versions of operator+ with YourType as the left operand be considered?
Now let me turn to line B above (explicit call to operator+). I assume that this can only consider versions provided in the scope where it is used or in some outer containing scope. However I have to confess that I am not entirely sure of this.
Whether I am right or wrong, it is certainly the case that the explicit use of an operator function will result in quite different overload rules from those that are used when I use the operator itself.
Obviously line C only searches within the scope of MyType and its enclosing scopes. Obviously? What about the case where MyType contains an operator YourType() function? Of course you already know the rules for this situation. You do, don't you? Oh, well, perhaps I over simplified the rules for overloading functions, or did I?
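For what it is worth, the three forms can be tried out with a sketch like the one below. The constructor taking an int, and the return values 100 and 200 (chosen only so that the selected function can be identified), are assumptions added for the demonstration, and the results noted are those given by the rules as they stand in standard C++ today, which postdate the draft under discussion.

```cpp
#include <cassert>

struct MyType
{
    int value;
    MyType(int v) : value(v) {}                  // converting constructor
    // In-class version: returns 100 so the choice is visible.
    int operator+(MyType) const { return 100; }
};

// File-scope version: an exact match for (MyType, int); returns 200.
int operator+(MyType, int) { return 200; }

int line_a(MyType m) { return m + 1; }            // A: operator syntax
int line_b(MyType m) { return operator+(m, 1); }  // B: explicit function call
int line_c(MyType m) { return m.operator+(1); }   // C: explicit member call
```

Line A considers member and non-member versions together and picks the file-scope exact match. Line B is an ordinary function call made from a non-member function, so it sees only the file-scope version. Line C looks only in the class, and converts 1 to MyType to make the call.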
Even those that can answer all the above consistently may find that they are not so sure when we throw namespaces and templates into the mixture. When we get operators defined in template classes, or worse still get offered template operator functions, we really do need a very clear understanding of the overloading rules for both functions and for operators.
On the other hand, I think that any global provision of operators is highly dangerous. Frankly, I would like to see producers of class libraries completely avoid the provision of out-of-class operators. If they must provide them, please do so by providing the functionality in-class and wrapping it up in an inline function (see Francis Glassborow's article in Overload 6). Such inline operator functions should be in a separate header file so that the user determines their availability not the library provider.
The rules for operator overloading need to be cleaned up and made comprehensible to mere mortals such as I. Until they are, the best advice is `do not use them, they will introduce unexpected behaviour into your work and that of your clients'.
Finally, could our new editor (congratulations on your first issue) either write a detailed explanation of overloading or commission some other expert to do so. I guess it might even take several issues.
George Wendle
Thank you George. A detailed explanation of overloading would be very likely to fill several issues of Overload! Perhaps I'll take up the challenge after I finish my cOOmpiler series or maybe I can persuade someone else to write a series on overloading? Just to add more spice to the issue, the Standards committee have been making changes to operator name lookup too - see my Casting Vote column in this issue - Ed.
I read George Wendle's article "Overloading on const is wrong" in Overload 6 with great interest. I have always been a keen advocate of const and the idea of const-correctness in code: it permits the visible expression of certain design level decisions in code for the benefit of both the compiler and the human. So where should we draw the line: why should some member functions be const and not others, what are the exceptions to the rule, and would you like biscuits with your const?
I thought that was "would you like fries with your const?" - Ed.
Sean also raised an issue in reply to an old letter of mine. Why should the assignment operator return a non-const reference to its left hand operand?
void fn(D&);
void fn(const D&);
Looking over my code, I only ever use const overloading in the context of a class and I have been unable to find any functions overloaded on const that do not differ in either return type or argument count. Clearly something interesting, and hopefully useful, is going on if I feel the accessibility of the current object should dictate the result type. George cites the classic example of operator[]. Providing a subscript operator for a vector, string or map class is practically a fundamental requirement:
string motd = "hello";
motd[0] = 'j';
What such an operator must also ensure is the preservation of const-ness. Consider a string class with only one subscript operator:
char& operator[](size_t) const;
If it did not return a reference, the change to motd above would not be possible. However, not declaring it const would actually prevent routines passed references to const strings from reading through the string a character at a time. There is a problem with this one size fits all approach:
const string greeting = "hey";
greeting[2] = 'p';
This is legal, but is clearly a violation of the expected semantics. The solution is to overload on const-ness to determine the level of access the user should have:
char operator[](size_t) const;
char& operator[](size_t);
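As a rough illustration of the overloaded pair, here is a sketch built around a hypothetical Text class (a thin wrapper over std::string, not the string class under discussion). The helper first_char is also an assumption, added only to show which overload a reference to const selects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal string wrapper; only the subscript pair matters here.
class Text {
public:
    Text(const char* s) : data_(s) {}

    // Read-only access for const objects: returns a copy of the character.
    char operator[](std::size_t i) const { return data_[i]; }

    // Writable access for non-const objects: returns a reference.
    char& operator[](std::size_t i) { return data_[i]; }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};

// Reads through a reference to const, so the const overload is selected.
inline char first_char(const Text& t) { return t[0]; }
```

With this pair, motd[0] = 'j' compiles for a non-const Text, while greeting[2] = 'p' on a const Text is rejected by the compiler because the const overload returns a plain char rather than a reference.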
One issue that has previously caused problems with iterator classes is that they often fail to preserve the const-ness of what they are iterating over, i.e., through an iterator I can gain writable access to const objects. Alternatively, the iterator provides only lowest common denominator access -- but it is frustrating being given read-only access to a writable object! The STL addresses this problem in a disarmingly simple manner by requiring both const and non-const iterators. For example, for access from the first element a container class would include the declarations
iterator begin();
const_iterator begin() const;
Overloading should only be used to give similar concepts similar names, and this is clearly the case here. Suggesting that the const version should be renamed begin_const breaks with this, causing the programmer to do the name mangling instead of the compiler.
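The same idea can be sketched with a hypothetical fixed-size container, IntBox, in which raw pointers stand in for real iterator classes. The names IntBox and sum are assumptions for illustration; only the begin/end overloading pattern comes from the text.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size container; raw pointers stand in for iterator classes.
class IntBox {
public:
    typedef int*       iterator;
    typedef const int* const_iterator;

    IntBox() { for (std::size_t i = 0; i < size_; ++i) items_[i] = 0; }

    // One name, two levels of access, chosen by the const-ness of the container.
    iterator       begin()       { return items_; }
    const_iterator begin() const { return items_; }
    iterator       end()         { return items_ + size_; }
    const_iterator end()   const { return items_ + size_; }

private:
    static const std::size_t size_ = 4;
    int items_[size_];
};

// Summing needs only read access, so it takes the container by reference to const
// and therefore receives const_iterators from begin() and end().
inline int sum(const IntBox& box)
{
    int total = 0;
    for (IntBox::const_iterator it = box.begin(); it != box.end(); ++it)
        total += *it;
    return total;
}
```

Inside sum, an attempt to write *it = 0 would fail to compile, whereas code holding a non-const IntBox gets writable iterators from the same function name.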
const Type& operator[](const Key&) const
throw(out_of_range);
Type& operator[](const Key&);
On the whole, behavioural differences between const and non-const versions of an overloaded pair should be either non-existent or minimal.
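One way the subscript pair declared above might be realised is sketched below, assuming a hypothetical Lookup template wrapping std::map. The exception specification is left out because dynamic exception specifications were removed from the language in C++17; the helper throws_on_missing is likewise an assumption, present only to exercise the const overload.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical associative array wrapping std::map.
template <typename Key, typename Type>
class Lookup {
public:
    // const version: cannot insert, so a missing key is reported by exception.
    const Type& operator[](const Key& key) const
    {
        typename std::map<Key, Type>::const_iterator found = table_.find(key);
        if (found == table_.end())
            throw std::out_of_range("Lookup: key not found");
        return found->second;
    }

    // non-const version: a missing key is inserted with a default value.
    Type& operator[](const Key& key) { return table_[key]; }

private:
    std::map<Key, Type> table_;
};

// Goes through a reference to const, so the throwing overload is the one called.
inline bool throws_on_missing(const Lookup<std::string, int>& ages,
                              const std::string& key)
{
    try {
        (void)ages[key];
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

The two versions here do differ in behaviour for a missing key (insert versus throw), which is exactly the kind of difference the paragraph above suggests keeping to a minimum.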
I agree and the STL gets around this by simply not defining a non-const version of the subscript operator for map (STL's associative array template class) - Ed.
However, there is an example I feel would be useful that breaks with this requirement. One of the few areas that the C standard I/O library wins out over its C++ counterpart is pattern matching on input. As its name suggests, the scanf function implements a simple generic scanner, albeit a somewhat insecure and idiosyncratic one. Taking advantage of the difference between non-const references and const references or values it is not hard to imagine an equivalent facility for C++:
cin >> day >> '/' >> month >> '/' >> year;
For such a scheme to work well, the type of literal strings would have to be const char* rather than char*. Sean made a proposal to rid C++ of this irksome piece of C heritage; sadly it was not accepted by the powers that be.
And I haven't yet discovered why the Core WG did not adopt this proposal - Ed.
The functionality described could be implemented using manipulators (see "Writing your own stream manipulators", Overload 5):
cin >> day >> match('/') >> month >>
match('/') >> year;
These could take advantage of templates and template specialisation. However, I do not believe there are any proposals to standardise such a cluster of classes and it would be good to have a simple version already in place that echoed the versatility of scanf in softer, safer tones. Perhaps const-ness in C++ has not been taken far enough?
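A non-template sketch of such a manipulator follows. The struct match mirrors the name used in the fragment above, but its implementation here is an assumption, as is the helper read_date; it handles single characters only and signals a mismatch by setting failbit on the stream.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical manipulator object: holds the character the input must contain.
struct match {
    explicit match(char c) : expected(c) {}
    char expected;
};

// Extracts one character (skipping leading whitespace when skipws is set, which
// is the default) and sets failbit if it is not the expected one.
inline std::istream& operator>>(std::istream& in, const match& m)
{
    char actual;
    if (in >> actual && actual != m.expected)
        in.setstate(std::ios_base::failbit);
    return in;
}

// Parses "day/month/year"; returns false if the pattern does not match.
inline bool read_date(std::istream& in, int& day, int& month, int& year)
{
    in >> day >> match('/') >> month >> match('/') >> year;
    return !in.fail();
}
```

Once a separator fails to match, the stream is in a failed state and the remaining extractions in the chain do nothing, so a single test of the stream at the end is enough.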
cout << "The temperature at " << time
<< " on " << date
<< " is " << temperature
<< '.' << endl;
The result of each call to operator<< is a reference to the ostream that was used for output. Chaining is also present in the C language itself; it is not just restricted to the library:
a = b = c;
The result of each assignment is a modifiable lvalue of the left hand side and not a copy of that value.
Only in C++ I'm afraid! In C, the result of an assignment is not an lvalue - Ed.
The proposed standard library, and much of my own code, follows this idiom. Non-const member functions that might otherwise return void often return *this.
coord.radius(new_r).radians(new_theta);
motd.assign(subject).append(" is ").append(opinion);
dir_list.sort().reverse();
The last example is, for some reason, currently not possible with the STL. It appears to be an oversight that hopefully will be rectified by the library committee: first, it is clearly useful; second, it is important that all library components are written to a common style which, in this case, is that of chainability.
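The chaining idiom can be sketched with a hypothetical DirList class whose mutating member functions return *this rather than void. The class and its members are assumptions for illustration; they are not part of the STL.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical directory listing whose mutators return *this instead of void.
class DirList {
public:
    DirList& add(const std::string& name) { names_.push_back(name); return *this; }
    DirList& sort()    { std::sort(names_.begin(), names_.end()); return *this; }
    DirList& reverse() { std::reverse(names_.begin(), names_.end()); return *this; }

    const std::vector<std::string>& names() const { return names_; }

private:
    std::vector<std::string> names_;
};
```

With this in place, dir_list.sort().reverse() reads as one step and leaves the names in descending order; returning void from sort would force two separate statements.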
* The definition of assignment for the built-in types;
* Compiler generated assignment operators return a non-const reference;
* Assignment operators in the fledgling C++ standard library return non-const references;
* Many of the good authors in the C++ community support this as a standard idiom (e.g., Stroustrup, Coplien, Meyers, etc.).
These are, to say the least, quite persuasive reasons. This is clearly standard form, yet the Ellemtel guide suggests that returning a const reference is better. To probe this decision we must better understand what coding rules and recommendations might help us to achieve:
1. readability, e.g., indentation, identifier names;
2. defined-ness, e.g., the result of a[i++] = i++ is not well defined;
3. security, e.g., use of gets can seriously affect the health of your program;
4. insurance against accident, e.g., declaring without definition a private copy constructor and assignment operator prevents accidental copying of certain classes of objects;
5. conformance to expectation, i.e., preservation of the Principle of least astonishment;
6. interoperability, i.e., the ability to mix with other components written to a standard form.
In other words, rules and recommendations are a response to, and a preventative cure for, possible problems. What are the problems that the Ellemtel guide is trying to lay to rest? Unfortunately only one example is given:
(a = b) = c;
This is a pointless and pathological piece of code, but how does it measure up against the criteria for a problem seeking a solution:
1. This is quite readable -- pointless, yes, but with parentheses forcing the precedence it is easy to see what is going on. Indeed, it might be argued that the chained assignment without parentheses offers more scope for confusion.
2. This is well defined: a is assigned the value of b, and then a is overwritten by an assignment from c. Again, pointless, but certainly well defined.
3. It is also secure -- no problems with dangling pointers, corrupting memory, etc.
4. You have to force the precedence to get this code fragment, so such code is unlikely to be produced by accident. I don't know about you, but my typos are normally quite simple: I have yet to accidentally enclose a well formed expression with balanced parentheses -- and not notice!
5. In the light of what I mentioned earlier I would expect this example to compile cleanly.
6. If a, b and c are iterators or containers, this code conforms to the signature requirements for assignment laid out by the STL for containers and assignable iterators.
The only problem I was able to make out was that the authors of the guide were uncomfortable with C and C++! If they wish to break a de facto (bordering on de jure) standard, they will have to do better than one contrived and weak example. By this, I do not mean that many weak and contrived examples will strengthen their case ;-)
The Ellemtel guide even states, inadvertently, why you should ignore their recommendation:
Designing a class library is like designing a language! If you use operator overloading, use it in a uniform manner; do not use it if it can easily give rise to misunderstanding.
I have already described the uniform manner above. In other words, a non-const reference returned from an assignment is not a problem but an expectation: the absence of a problem does not require a solution, but expectations should be met.
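For concreteness, a sketch of the conventional form, using a hypothetical Celsius value type. The class is an assumption; the point is only the return type of operator=, which lets both (a = b) = c and a call chained onto an assignment compile and behave as described above.

```cpp
#include <cassert>

// Hypothetical value type with a conventional assignment operator.
class Celsius {
public:
    explicit Celsius(double degrees = 0.0) : degrees_(degrees) {}

    // Returns a non-const reference to the left-hand operand, matching the
    // built-in types and the compiler-generated operator.
    Celsius& operator=(const Celsius& rhs)
    {
        degrees_ = rhs.degrees_;
        return *this;
    }

    // A non-const member, callable on the result of an assignment.
    Celsius& add(double delta) { degrees_ += delta; return *this; }

    double degrees() const { return degrees_; }

private:
    double degrees_;
};
```

Had operator= returned const Celsius&, the expression (a = b).add(0.5) would be rejected, because add is a non-const member function.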
We are pleased to see Kevlin Henney so thoroughly scrutinising one of the recommendations in our public domain document. We are prepared to change this in our forthcoming book "Industrial Strength C++".
The document was last updated in 1992, and at that time there were quite a few writers who advocated a const reference to this as return value. Actually, we got the idea from Scott Meyers after a speech at USENIX C++ 1991 in Washington. Also, Rob Murray's widely acknowledged book, "C++ Strategies and Tactics" recommends this (page 32, 2.2.1 Return value of operator=):
Assignment operators should return a constant reference to the assigned-to object.
One reason why a const reference might actually be of least astonishment is that this is the way it works in C. Try this in your favourite C compiler:
int main()
{
    int x = 1;
    int y = 2;
    int z = 3;
    (z = y) = x; /* From Sun C compiler:
                    illegal lhs of assignment operator
                  */
    return 0;
}
In C++, on the other hand, this code is legal since by default the result of an assignment expression is a non-const reference to the object assigned to. This is the motivation as to why a non-const reference is appropriate as return value for overloaded assignment operators.
Why have this incompatibility between C and C++? We really don't know! Maybe Bjarne had a bad day in the early eighties when he decided to change this? ;-)
I asked Bjarne Stroustrup about this gratuitous difference between C and C++ and got the following response - Ed.
Why make the change? Why not? The value of:
(a = b)
is a which is an lvalue. Also, we have found real examples of the general form:
T& f(T& a, const T& b) { return a=b; }
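As an illustrative sketch only (not part of the quoted reply), the general form above can be written as a template, here given the hypothetical name assign_and_return. It compiles for a type T only when assigning to a T yields a non-const lvalue of that type, which is the case for the built-in types and for classes that follow the conventional signature.

```cpp
#include <cassert>
#include <string>

// The general form quoted above, written as a template. The return statement
// binds the result of the assignment to T&, so it is rejected for any T whose
// operator= returns a const reference.
template <typename T>
T& assign_and_return(T& a, const T& b)
{
    return a = b;
}
```

For example, assign_and_return(s, std::string("new")).append("er") assigns to s and then modifies it through the returned reference.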
Whilst putting this issue together, I was reading Scott Meyers' column in The C++ Report, January 1995, where he talks about writing max and min functions. He notes that maintaining const-correctness is very difficult with templates and I can now see a parallel between that and the assignment operator. Like Mats and Erik above, I may well change my view on this - Ed.
Mirrored from http://www.accu.org/