One advantage of HLE and RTM locks that is not highlighted in your post, or in the reference specification, is that they reduce (if not eliminate) the need for Wait-Free programming. Using HLE or RTM removes the nasty side effect of a thread being preempted while it holds a lock, which is the principal reason for writing code in Wait-Free form. (Wait-Free code avoids having a lock held for the duration of a preemption.)
In the reference manual:
8.3.3 Requirements for HLE Locks
An XRELEASE prefixed instruction must restore the value of the elided lock to the
value it had before the lock acquisition.
When this rule is followed, different sections of a protected structure can be updated concurrently, as shown in your example (provided the updates do not conflict).
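To make the rule concrete, here is a minimal sketch of a lock whose release restores the pre-acquisition value. It uses plain C11 atomics, so it shows only the shape of the lock word, not actual elision. The names elidable_lock, bump, and counters are hypothetical. With GCC on x86 the memory orders could, I believe, additionally be OR-ed with __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE to request the XACQUIRE/XRELEASE prefixes, but that is left out here to keep the sketch portable.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_int word; } elidable_lock;

static void elidable_lock_acquire(elidable_lock *l) {
    /* Try to swap in 1; if the old value was not 0, someone else holds it. */
    while (atomic_exchange_explicit(&l->word, 1, memory_order_acquire) != 0) {
        /* Spin on a plain load until the lock looks free again. */
        while (atomic_load_explicit(&l->word, memory_order_relaxed) != 0) { }
    }
}

static void elidable_lock_release(elidable_lock *l) {
    /* The caller must hold the lock. */
    assert(atomic_load_explicit(&l->word, memory_order_relaxed) == 1);
    /* Write back 0, the value the word had before the acquisition.
       This is the property section 8.3.3 asks of an XRELEASE. */
    atomic_store_explicit(&l->word, 0, memory_order_release);
}

/* Two independent slots guarded by one lock (illustrative data only). */
static int counters[2];
static elidable_lock table_lock = { 0 };

static void bump(int slot) {
    elidable_lock_acquire(&table_lock);
    counters[slot]++;   /* updates to different slots do not conflict */
    elidable_lock_release(&table_lock);
}
```

With elision, two threads calling bump(0) and bump(1) would touch different slots and could in principle proceed concurrently, because neither actually writes the lock word.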
Why not permit an XRELEASE that writes a different value, so that the first such release wins and all subsequent XRELEASE (or XEND) operations abort?
An example might be an XACQUIRE that sets a lock flag in the least significant bit of a pointer and then uses the pointer to reference the object. After processing, the XRELEASE may need to restore the pointer to a different value (say a different pointer, NULL, or a different lock state). This can occur in linked-list management. You could say that HLE/RTM should not be used in these cases, but that re-exposes the problem of thread preemption.
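A rough sketch of the linked-list case I have in mind, again in portable C11 atomics rather than real XACQUIRE/XRELEASE instructions. The names node, head_lock, head_unlock, and pop_front are hypothetical, and the code assumes nodes are at least 2-byte aligned so the low bit of the pointer is free for the flag. The point is the last store in pop_front: it releases the lock by writing a value different from the one seen at acquisition, which the rule quoted above does not allow for an elided lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BIT ((uintptr_t)1)

typedef struct node { struct node *next; int value; } node;

/* Acquire: set the lsb of the head pointer and return the untagged pointer. */
static node *head_lock(atomic_uintptr_t *head) {
    for (;;) {
        uintptr_t old = atomic_load_explicit(head, memory_order_relaxed);
        if ((old & LOCK_BIT) == 0 &&
            atomic_compare_exchange_weak_explicit(head, &old, old | LOCK_BIT,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return (node *)old;
    }
}

/* Release: publish a new head with the lock bit clear. */
static void head_unlock(atomic_uintptr_t *head, node *new_head) {
    assert(((uintptr_t)new_head & LOCK_BIT) == 0);  /* alignment assumption */
    atomic_store_explicit(head, (uintptr_t)new_head, memory_order_release);
}

/* Remove the first node; returns 1 and stores its value in *out, or 0 if empty. */
static int pop_front(atomic_uintptr_t *head, int *out) {
    node *first = head_lock(head);
    if (first == NULL) {
        head_unlock(head, NULL);        /* same value as before acquisition */
        return 0;
    }
    *out = first->value;
    head_unlock(head, first->next);     /* DIFFERENT value than before acquisition */
    return 1;
}
```

Only the empty-list path restores the original value. The successful pop is exactly the release that I would like the hardware to let "win".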