The View from the C++ Standard meeting September 2013

I apologize as this report is late due to several back-to-back conferences through September. At this meeting, the most important thing was to address as many as possible of the National Body (NB) comments on the draft C++14 CD. This will put us in good shape for the release of C++14 in 2014. Please look at my blog series to get an idea of the major content. However, this meeting did make some interesting minor changes to that content. It is fairly normal to decouple features that are still controversial. The biggest changes are the move of VLA (or what we called Array of Runtime Bound) and dynarray into a library Array Extension TS, and the adoption of the single quote as a digit separator for C++14.

First, the number of comments for this draft is considerably less than that for C++11, reflecting the smaller content and higher quality. There were 115 NB comments. There are 85 gathered in:

N3733 ISO/IEC CD 14882, C++ 2014, National Body Comments

which has the official NB responses. Due to a minor misunderstanding at ISO HQ, where the Canadian comments were incorrectly sent to the C committee, the Canadian comments were placed in a separate document:

N3771

Canadian C++14 Comments

Almost all comments in both documents were addressed, which is what is important.

Sixty-six of these comments were for core: 36 from N3770 and all 30 from N3771. Core dealt with both sets but referred some of the extension requests and design issues to EWG for initial evaluation. Of the 66 aimed at core, most issues had resolutions in ready state, which means they will be available to be moved at the next C++ Standard meeting in Issaquah in Feb 2014. A few of the remaining ones are still being drafted. Nine were rejected and six are left pending as open feature requests. Core status on the NB comments is in:

N3770

C++ CD Comment Status, Rev. 1

There were several additional papers which still needed to be injected into C++14. These were:

N3760

[[deprecated]] attribute

N3781

Single-Quotation-Mark as a Digit Separator

N3778

C++ Sized Deallocation

Some explanation is required. N3760 was a feature approved by EWG to add the [[deprecated]] attribute, but it had not yet been reviewed in core. This effectively approves it for C++14.
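As a minimal sketch of what N3760 enables, the snippet below marks one function as deprecated and steers callers to a replacement. The function names are hypothetical, and the wording of any diagnostic is up to the compiler.

```cpp
// Hypothetical names for illustration; only the [[deprecated]] attribute
// itself comes from N3760.
[[deprecated("use scaled_value() instead")]]
int legacy_value(int x) { return x * 2; }

int scaled_value(int x) { return x * 2; }

int caller() {
    // Writing legacy_value(21) here would still compile, but compilers are
    // encouraged to warn and may show the message given in the attribute.
    return scaled_value(21);
}
```

The attribute does not change behavior; it only gives the compiler something to report at each use of the marked entity.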

The next one is the bike shed. I even talked about it in my Going Native C++ talk on C++14: Through the looking glass:

http://channel9.msdn.com/Events/GoingNative/2013/Cpp14-Through-the-Looking-Glass#comments

The original solution used an underscore as the separator, with a double radix point (..) for disambiguation when the separator underscore and the leading underscore of a user-defined literal became ambiguous; it was rejected at the last C++ Standard meeting in Bristol. There were 3 NB comments asking for a solution and, after some consideration, the committee went back and approved one of Daveed’s proposals (one of my original favorites): using a single quote as the digit separator. A single quote is already used by some countries as a separator, although you may be more familiar with the use of the comma in NA and the decimal point in Europe. The original concern was that a single quote would lead some language tools to treat it as an opening quote and start looking for a closing single quote. I always felt this was an overly cautious concern. In this meeting, it was made clear that what we should care about most should be, in order:

C++ Language Users
C++ Language Implementors
C++ Language Tools

When viewed in this light, it became clear what we should care about first: if such a change is good for the language, is asked for by the users (3 NBs), and is easy enough to implement, then we should do it. So this was approved for C++14. C++14 digit separators will be universal and do not interfere with user-defined literals. For example, the number twelve can be written 12, 014, or 0XC. The literals 1048576, 1'048'576, 0X100000, 0x10'0000, and 0'004'000'000 all have the same value.
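The equalities listed above can be checked at compile time. This is a small sketch assuming a C++14 compiler; the two constant names are hypothetical.

```cpp
// Every literal below denotes the same value; the apostrophes are ignored
// when the value is computed.
static_assert(12 == 014 && 014 == 0XC, "decimal, octal, and hex twelve");
static_assert(1048576 == 1'048'576, "decimal with separators");
static_assert(0X100000 == 0x10'0000, "hex with a separator");
static_assert(0'004'000'000 == 1048576, "a leading 0 still means octal");

// Hypothetical constants: separators need not fall on groups of three.
constexpr long long one_mebibyte = 1'048'576;
constexpr long long grouped_oddly = 10'48'576;
```

Because the separator carries no meaning, grouping is purely a readability choice left to the programmer.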

One area where this change can be backwards incompatible is C++11 macro invocation. This is because single quotes delimit a character literal in C++2011, whereas they are digit separators in C++14. This means the following example from the paper is valid in both Standards but produces different results:

#define M(x, ...) __VA_ARGS__

int x[2] = { M(1'2,3'4) };

// C++2011: int x[2] = {};

// C++2014: int x[2] = { 3'4 };

This was approved with a great sigh of relief, as many, myself included, do not want to spend any more time on this topic. However, enough people felt that it was important to have a solution, even a slightly unsatisfactory one, as the universal preference is still the single underscore.

The next paper, N3778, was also not new; it clarifies N3663, which was approved for C++14, with one important change. The global deallocation function now takes a second parameter describing the size of the memory to be deallocated. Static member deallocation functions already have this. Its omission at the global level has performance consequences. It seems that this improvement has been implemented by EDG and GCC, and in the latter case it yielded significant performance improvements. The new information indicates it does not break valid C++11 code. But there is a potential ABI breakage here, as the interface now has that additional size parameter. The paper states that “The primary problem occurs when the system allocation library is new, but an interposed user allocation library is old. In new programs, calls to the unsized version would go to the user library, but calls to the sized version would go to the system library. However, as currently defined, by default the sized version calls the unsized version. Programmers that desire the improved performance must take positive action. The intent is that in some future standard, this default will change. In that case, there would be a mismatch in allocators.”
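To make the size parameter concrete, here is a minimal sketch using the class-level form that static member functions already have; N3778 adds the analogous global signature, void operator delete(void*, std::size_t) noexcept. Widget, last_freed_size, and free_one_widget are hypothetical names, and malloc/free merely stand in for a real allocator.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class with its own allocation functions.
struct Widget {
    double payload[4];

    static std::size_t last_freed_size;

    static void* operator new(std::size_t n) {
        if (void* p = std::malloc(n)) return p;
        throw std::bad_alloc();
    }

    // Sized form: the compiler passes the object's size, so an allocator
    // could pick the right size class without looking it up from p.
    static void operator delete(void* p, std::size_t n) noexcept {
        last_freed_size = n;
        std::free(p);
    }
};

std::size_t Widget::last_freed_size = 0;

// Allocates and frees one Widget, then reports the size seen by delete.
std::size_t free_one_widget() {
    delete new Widget();
    return Widget::last_freed_size;
}
```

An allocator that keeps per-size free lists is the kind of implementation that stands to benefit, since it no longer has to recover the block size from the pointer.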

The Library Evolution Working Group (LEWG) is a somewhat new group that started in the last few meetings to parallel the Evolution Working Group, which reviews and approves features before they go to Core. Library Evolution will review and approve features before they go to the Library Working Group. LEWG proposed the creation of several new TSs for future Standard inclusion:

Library Fundamental TS
Array Extension TS
Parallelism Extension TS
Concurrency Extension TS

Library Fundamental TS will start with the N3672 optional feature, which was narrowly passed but drew a large number of NB comments indicating dissatisfaction with its proposed interface. To answer them, it has now been moved out of C++14 into a TS.

Array Extension TS contains both the VLA (or ARB) and dynarray features, which were originally approved for C++14. VLA is fine, but dynarray has an interesting property which allows it to switch from the heap to the stack. This capability was not well understood before and is probably even more problematic to implement properly. This effectively became the new bike shed after the digit separator was approved. There is a desire for VLA as a facility to transition existing uses of T[] and ptr+size more safely than what C99 VLA offers. T[] and ptr+size are a major source of bugs in C++. You cannot cast a VLA. VLAs decay to a pointer at the drop of a hat and you cannot get begin and end iterators. You can’t use Standard Algorithms on such arrays. So it was deemed that VLA alone would do more harm than good. But dynarray provides the migration out of that into a library array facility that does neatly solve all these problems. However, we found there were too many remaining questions about its implementation. There was also a desire to keep these two facilities together, even though one is better understood than the other. So after a great deal of discussion, it was decided to pull both features from the C++14 draft and provide them both as a Library Array Extension TS. This will provide time for the implementation of dynarray to stabilize.
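The contrast between ptr+size and a library array facility can be sketched as follows. Since dynarray is not assumed to be available, std::vector stands in for it here, and all function names are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// ptr+size style: the array has decayed to a pointer, so the length travels
// separately and nothing checks that the two agree.
int sum_raw(const int* data, std::size_t size) {
    int total = 0;
    for (std::size_t i = 0; i < size; ++i) total += data[i];
    return total;
}

// Container style: begin()/end() come with the object, so Standard
// Algorithms apply directly. std::vector is only a stand-in for dynarray.
int sum_container(const std::vector<int>& data) {
    return std::accumulate(data.begin(), data.end(), 0);
}

// Builds a buffer whose length is known only at run time.
std::vector<int> make_runtime_sized(std::size_t n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 1);  // fills with 1, 2, ..., n
    return v;
}
```

The difference from dynarray is that std::vector always uses the heap and can be resized, whereas dynarray was meant to have a fixed run-time bound and possibly live on the stack.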

Between Standard meetings, we had an SG1 concurrency meeting in July, hosted by Nvidia, where we discussed the idea of hosting several TSs for SG1. Two of these materialized as TSs this week. The Parallelism Extension TS contains N3724 which was revised to:

N3554

A Parallel Algorithms Library

The Concurrency Extension TS contains N3731 and N3721. N3731 was revised to

N3785

Executors and schedulers, revision 3

and N3721 was revised to:

N3784

Improvements to std::future<T> and Related APIs

post-meeting. This is the beginning of a whole set of advanced parallelism features which include the following:

Parallelism TS to be started when any of the following three are available:

Parallel Algorithms: this initiated it in N3554
Data-Based Parallelism. (Vector, SIMD, ...)
Task-based parallelism (cilk, OpenMP, fork-join)

In addition, other candidates for the Parallelism TS in future are:

MapReduce
Pipelines

The concurrency TS will also follow the idea of whatever is ready to be publishable first:

Future Extensions (then, wait_any, wait_all): This is N3784
Executors: This is N3785
Resumable Functions, await (with futures)

Additional content could come in the form of:

Counters
Queues
Concurrent Vector
Unordered Associative Containers

A further Synchronization TS could appear in future with the following features:

Latches and Barriers
upgrade_lock

Technically, SG5 Transactional Memory is really also about advanced synchronization, but is too big to fit into the Synchronization TS, and will appear as its own TS.

Library had 42 comments, which were also mostly addressed at this meeting, with very few rejected, although N3770 still shows them as unresolved due to the larger workload. One of the significant ones was GB9, which removes gets from the library. Complex imaginary constants now finally have user-defined literal suffixes. Specifically, there was controversy before on using i_f for imaginary float, because there was concern that if as a UDL suffix might be a problem, as “if” is also a C++ keyword. The only caveat is that there must be no space between the "" and if, so it must appear as follows:

operator""if (…)

This is now in C++14.
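A short sketch of the new suffixes in use, assuming a C++14 standard library that provides std::literals::complex_literals; the two function names are hypothetical.

```cpp
#include <complex>

// Builds a complex<float> using the if suffix discussed above.
std::complex<float> make_float_complex() {
    using namespace std::complex_literals;  // brings operator""if into scope
    return 1.0f + 2.0if;                    // real part 1, imaginary part 2
}

// The plain i suffix yields complex<double>.
std::complex<double> make_double_complex() {
    using namespace std::complex_literals;
    return 3.0 + 4.0i;
}
```

Inside a literal such as 2.0if the suffix is part of a single token, so it never collides with the if keyword.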

US21 was a comment to deprecate rand and its friends according to N3742. The proposal is to deprecate std::rand(), std::srand(), RAND_MAX, and random_shuffle(). The rationale for deprecating rand and its friends is that they are perceived as giving inefficient or poor results, depending on what implementation you may be using. The rationale for deprecating random_shuffle() is that one overload is specified so as to depend on rand, while the other overload is specified so as to require a hard-to-produce distribution object from the user; such a distribution is already an implicit part of shuffle, which we retain. The advice was to move to C++11 <random>. Deprecated here means we move them into Annex D and put the world on notice that we might someday remove them from the Std. Vendors will likely still support them, but there is no guarantee. The problem is that this generated a great deal of controversy over whether the perceived inefficiency is merely an implementation detail, which resulted in insufficient consensus for deprecating rand. There might be more consensus to deprecate random_shuffle.
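For readers wondering what moving to C++11 <random> looks like in practice, here is a minimal sketch. The function names and the choice of std::mt19937 are my own assumptions, not something N3742 prescribes.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Replaces the classic "std::rand() % 6 + 1": the distribution maps the
// engine's output onto [1, 6] without modulo bias.
int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> face(1, 6);
    return face(engine);
}

// Replaces std::random_shuffle(first, last): the caller supplies the engine
// explicitly instead of relying on hidden global state.
std::vector<int> shuffled_deck(std::mt19937& engine) {
    std::vector<int> deck{1, 2, 3, 4, 5};
    std::shuffle(deck.begin(), deck.end(), engine);
    return deck;
}
```

A caller would typically construct one engine, for example std::mt19937 engine(std::random_device{}()), and pass it to both functions.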

For the remaining TSs, the Networking TS has its first document:

N3783

Network Byte Order Conversion

The Filesystem TS has already delivered a document and it will be moving to a PDTS.

N3803

Programming Languages -- C++ Standard Library -- File System Technical Specification

For all the TSs, there was a discussion on the appropriate namespace for a TS. For TR1 we had an additional namespace called tr1, but that was not uniformly implemented by all compilers, with some putting it in the Standard namespace and others following the TR1 prescription. The idea of an experimental namespace tag for a TS means that there is no uncertainty that this is an experimental interface.

So what is emerging is the following:

std::experimental:: ...

Expect that now for TSs, as TSs are not normative. Many of these TSs are still meant to be shipped near the C++14 timeframe.

These plenary sessions are getting longer and longer because there are thirteen subgroups reporting, so I will group those into a separate blog. Yes, you heard it, thirteen subgroups.

While this drive for C++14 is happening, there are still some parts of the Committee working on large and small features beyond C++14. This part will describe the many future feature proposals. Many of these proposals may only get full air time during the plenary session and these plenary sessions are getting longer and longer because there are so many subgroups reporting.

Last time I checked, there were ten subgroups. Now, there are thirteen.

http://isocpp.org/std/the-committee

http://isocpp.org/files/img/wg21-structure.png

The new additions are:

SG11, Databases: Bill Seymour

SG12, Undefined and Unspecified Behavior: Gabriel Dos Reis

SG13, Graphics: Herb Sutter

The SGs act as mini-evolution working groups that are tasked with adding significant new features to C++.

Most of these will require a combination of approvals through the Evolution Working Group and/or Library Evolution Working Group. After the features are approved through the EWG or LEWG, they are moved to each of Core WG or Library WG for wording refinement. Doing it this way enables massive parallelism and still allows each of the Evolution groups to continue to work to treat small features that do not fit into any particular category.

For instance, I also had a proposal to add restrict-like aliasing semantics to C++, and this was reviewed in EWG.

N3635

Towards restrict-like semantics for C++

Up till now, when you want to make sure 2 pointer arguments of a function do not alias, you have to borrow from C99’s restrict facility. This is a C feature that various C++ compilers have added as a non-standard extension to meet the demand for such a facility. But C++ has found a number of issues restrict does not address. Among them, there is no way to use it for overlapping array elements or member aliasing. It seems to work well really only for arguments. There are others, and the result is that we do not want something exactly like C99 restrict. So our paper proposes a facility called alias grouping, where the user can mark, using C++11 attributes, pointers that can be aliased together, say green pointers as being separate from the blue pointers, even though they are the same pointer types. This has the advantage that it is easy for the user to define, is non-intrusive and backwards compatible, and can be ignored if the compiler does not understand the attribute.

Most members of EWG loved the idea and urged us to develop it fully for the entire C++ language. This is an example of a future C++ feature that is addressed at the EWG level, and not by a study group. There are many others. For the most part, EWG tasks itself with addressing small-ish features and annoyances that do not necessarily fit in any subgroup. This list is maintained by Ville in N3811.

Starting with Concurrency, SG1 had ten or so National Body comments to address.

One of the issues was a discussion of N3630. This paper has three proposals covering:

Require that return-from-main and exit join with outstanding async operations.
Remove the requirement that releasing an async operation’s shared state shall block.
Require that ~thread and thread::operator= implicitly join.

Working backwards, on the issue of thread destructor behavior, there was simply not enough consensus for a change, because some argued this was deliberate design.

We devoted a great deal of discussion to the issue of whether async destructors should block. There are currently three services in C++11 that can return a future: packaged tasks, promises, and async. Of these, only async blocks on destruction. There were in fact at least 6 possible positions that we considered, and the subsequent straw poll votes (Strongly for, for, neutral, against, strongly against) were:

A. ~future will not block unless returned from async: 20-1-1-0-0
B. Add detach() to future to prevent blocking: 0-1-1-8-12
C. Deprecate async without replacement: 15-6-1-1-1. Threads? Exceptions? Deprecate it now and not establish wrong usage: 12-4-2-0-4
D. Split off task responsibility from future. NOT VOTED.
E. Add launch mode nonblocking async. NOT VOTED.
F. Is_evil flag could block destructor, or a special return_from_async_launch_async: 3-8-2-4-4

As you can see, the only position that received considerable support was A, which advises that future destructors will not block unless the future was returned from async, making that the notable exception.
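The asymmetry between async and the other sources of futures can be observed with a small sketch like the one below. The function names are hypothetical, and the 50 ms sleep is an arbitrary stand-in for real work.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Returns true if the task had finished by the time the future's scope ended.
bool async_future_waits_in_destructor() {
    std::atomic<bool> finished{false};
    {
        std::future<void> f = std::async(std::launch::async, [&finished] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            finished = true;
        });
    }  // ~future blocks here until the task completes
    return finished.load();
}

// A future obtained from a promise does not wait for anything on destruction.
bool promise_future_returns_immediately() {
    std::promise<int> p;
    {
        std::future<int> f = p.get_future();
    }  // ~future returns at once even though no value was ever set
    return true;  // reaching this line shows the destructor did not hang
}
```

The first function relies on exactly the blocking behavior being debated: without it, the lambda could still be writing to finished after the function had returned.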

One of the design issues discussed was that std::async serves two concerns which are conflated: it is both a value return mechanism and a task control mechanism. When there is a value, callers block naturally and you won’t get to the destructor anyway. When there is no value, blocking becomes a potential problem if you look at it as a task control mechanism, because other asyncs can block behind it.

But there are programming models where blocking is useful so that work does not run on; especially when you wish to use it as a task control mechanism for returning a value, you might want it to block. This is because futures returned from std::async are not intended to be passed across library/API boundaries without first calling .get() or .wait(), after which ~future will not block.

As a comparison to the other popular programming model, OpenMP parallel regions also have an implicit barrier at the end. You have to specify the nowait clause to make it not block and wait. But OpenMP tasks do not have an implicit barrier. You have to specify the taskwait directive for it to block and wait.

I think that in future, the proposed concurrency TS may allow Executors to help separate these concerns.

After significant discussion, the only part that we tried to carry was N3776, an attempt to clarify the position that ~future and ~shared_future don’t block except possibly in the presence of async.

There was an attempt to issue a deprecation along the lines of position C, deprecating async without replacement. This motion was actually almost put forward. But before it even went to the mock plenary, Nicolai Josuttis circulated a petition arguing that the lack of a replacement would seriously jeopardize existing usage patterns, in effect invalidating all the C++11 courses and material that have been taught so far. There was so much concern raised by this point alone, along with the near certainty that the motion would be defeated by NBs (as there were many who supported the petition), that it was deemed that the motion should not be brought forward at all. It died even before it reached the operating table.

Other papers that were discussed include how atomics work with signal handlers. While we felt there was sufficient resolution, this was delayed at core and was not moved at this meeting. Another paper that was moved concerned the prohibition on Out-of-Thin-Air (OOTA) results. The issue here is that the current wording for OOTA prohibited too much, specifically including PowerPC in the relaxed memory model. In a code example using Dekker's Algorithm, there is no way to tell whether a reordered result was manufactured out of thin air or was deliberately generated as a specific value. From N3710, which describes this problem well:

Consider the following example, where x and y are atomic variables initialized to zero, and ri are local variables:

Thread 1:
  r1 = x.load(memory_order_relaxed);
  y.store(r1, memory_order_relaxed);

Thread 2:
  r2 = y.load(memory_order_relaxed);
  x.store(r2, memory_order_relaxed);

Effectively Thread 1 copies x to y, and thread 2 copies y to x. The section 1.10 specification allows each load to see either the initializing store of zero, or the store in the other thread.

This famously allows both r1 and r2 to have final values of 42, or any other "out of thin air" value. This occurs if each load sees the store in the other thread. It effectively models an execution in which the compiler speculates that both atomic variables will have a value of 42, speculatively stores the resulting values, and then performs the loads to confirm that the speculation was correct and nothing needs to be undone.

No known implementations actually produce such results. However, it is extraordinarily hard to write specifications that prevent them without also preventing legitimate compiler and hardware optimizations. As a first indication of the complexity, note that the following variation of the preceding example should ideally allow x = y = 42, and some existing implementations can produce such a result:

Thread 1:
  r1 = x.load(memory_order_relaxed);
  y.store(r1, memory_order_relaxed);

Thread 2:
  r2 = y.load(memory_order_relaxed);
  x.store(42, memory_order_relaxed);

In this case, the load in each thread actually can see the store in the other thread, without problems. The two operations in thread 2 are independent and unordered, so either the compiler or hardware can reorder them.

Essentially this issue has been an open issue in the Java specification for about 10 years. The major advantage that we have in C++ is that the problem is confined to non-synchronizing atomics, i.e. memory_order_relaxed, and some memory_order_consume uses (or read-modify-write operations that effectively weaken the ordering on either the read or write to one of those). Many of us expect those to be rarely used.

In general, there is difficulty formally describing OOTA results, and the current description in the C++11 Standard was simply wrong. So it was deemed best to remove that description in the Standard, and replace it with normative encouragement to discourage implementers from generating OOTA results.
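The second variation quoted from N3710 can be written as a complete program, as in the sketch below. The function name is hypothetical, and a single run will usually show only one of the permitted outcomes.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Runs the two threads once and returns the observed (r1, r2).
std::pair<int, int> run_relaxed_example() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;

    std::thread t1([&] {
        r1 = x.load(std::memory_order_relaxed);
        y.store(r1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_relaxed);
        x.store(42, std::memory_order_relaxed);  // independent of the load above
    });

    t1.join();  // join() orders the reads of r1 and r2 below after the threads
    t2.join();
    return {r1, r2};
}
```

Each of r1 and r2 can only be 0 or 42 here, because 42 is actually stored by the program; seeing both equal to 42 requires the load and the store in thread 2 to be reordered.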

Further discussions were carried on regarding vectorization, resumable functions, and coroutines. All seem encouraging. There were continued discussions on taskgroups, concurrent containers, and counters.

SG3 on FileSystems still has some work left to do for the TS, but is largely complete. They are starting to think about a second TS.

SG4 on Networking expects a PDTS at the February meeting, but that depends on when the Library Fundamental TS ships, because there is a dependency on string_view. This is a new kind of string that, unlike the original string class, is a reference to an actual string.

Within SG5 Transactional Memory (TM), we have put forward a specification proposal in N3718, and it was presented to full Evolution for the first time. We obtained fantastic feedback which offered guidance as to how TM can work well within C++. There was still general approval of the design, but the guidance meant that we will need to further simplify the proposal and integrate it more fully within C++. The most interesting guidance is to not conflate invariance and synchronization. Herb Sutter, in particular, gave specific feedback indicating that what is desired is a simple way of offering composable synchronization over current locks. We also gave an evening session to acquaint and educate members on the design.

SG8 on Concept-lite has a proposed paper which will be turned into a TS in future, but still has some work to do.

SG10 on Feature Test has N3745 which was passed in EWG. But it is really a living document that is non-normative as the Standard changes.

SG11 has formally started, to support Databases. SG12 is another new group that will discuss Undefined and Unspecified Behaviour and educate the user community about it. There will be a paper that lists where something is undefined or unspecified.

SG13 is a group that will cover Graphics. It is also new and just starting to meet, but MS has great interest in leading it.

The next meeting will be in February 2014, where we will continue triage of the remaining defects and NB comments. If it becomes possible to complete the work and issue a C++14 Draft International Standard (DIS), then we will ballot through the summer, even through the June C++ meeting, as there is no problem in having a meeting while a ballot is ongoing. If we still need more time to complete the work, then we have until November for the next meeting after June, giving us the necessary months to complete the C++14 ballot. Either way, I would say we are in very good shape to ship C++14 in 2014.