Internal compiler error with lock/unlock

Internal compiler error with lock/unlock

Hello,

I have run into an ICE with the program below. The compiler version I use is: Intel Fortran 12.0.3 under Linux. The program may not be correct, but the compiler does not give any message, except for the ICE.

Regards,

Arjen

-----

! primes_coarrays.f90 --

! Determine the first 1000 primes - using coarrays

!

program primes_coarrays

use, intrinsic :: iso_fortran_env, only: lock_type

implicit none

type(lock_type), codimension[*] :: primes_lock

integer, dimension(2), codimension[*] :: range

integer, dimension(1000), codimension[*] :: prime

integer, codimension[*] :: number_primes

logical, codimension[*] :: new_results

logical, codimension[*] :: new_task

logical, codimension[*] :: ready

ready = .false.

new_results = .false.

new_task = .false.

number_primes = 0

range(2) = 0

!

! Create tasks in image 1 and handle them in all images

!

do while ( .not. ready )

if ( this_image() == 1 ) then

call add_task

endif

call get_task

enddo

write(*,'(10i5)') prime

contains

!

! Subroutine to post a new task (consisting of a

! range of integers in which to look for primes)

!

! Loop over the images to see which one wants a

! new task

!

subroutine add_task

integer :: i

do i = 1,num_images()

if ( .not. new_task[i] ) then

range(1) = range(2) + 1

range(2) = range(2) + 100

range[i] = range[1]

new_task[i] = .true.

endif

enddo

end subroutine add_task

!

! Subroutine to get a task and search for

! primes inside the new range

!

subroutine get_task

integer, dimension(100) :: new_prime

integer :: lower

integer :: upper

logical :: isprime

logical :: got_task

integer :: np

integer :: i

integer :: j

integer :: residue

integer :: maxindex

got_task = .false.

np = 0

if ( new_task ) then

lower = range(1)

upper = range(2)

new_task = .false.

got_task = .true.

endif

if ( got_task ) then

write(*,*) 'Range: ',lower, upper, this_image()

do i = lower,upper

isprime = .true.

do j = 2,int(sqrt(real(i)))

residue = mod(i,j)

if ( residue == 0 ) then

isprime = .false.

exit

endif

enddo

if ( isprime ) then

np = np + 1

new_prime(np) = i

endif

enddo

endif

!

! Now put the results in image 1

!

if ( got_task ) then

lock( primes_lock )

maxindex = min( size(prime) - number_primes[1], np )

prime(number_primes+1:number_primes+maxindex)[1] = &

new_prime(1:maxindex)

number_primes[1] = number_primes[1] + maxindex

ready[1] = number_primes[1] >= size(prime)

unlock( primes_lock )

endif

end subroutine get_task

end program primes_coarrays

28 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.

Arjen,

This is a stab in the dark as I do not have a current version of IFV supporting coarrays.

I notice you have:

use, intrinsic :: iso_fortran_env, only:lock_type

And in the contained subroutine get_task you use lock(primes_lock) and unlock(primes_lock).
get_task does not have implicit none, as to if this makes a difference I cannot say.
However, I can say that although you are using intrinsic :: iso_fortran_env, only:lock_type, you are not using the declarations for lock and unlock. Meaning lock and unlock may not be fully declared to the compiler, in particular the calling interface and/or if they are intrinsic or not.

as to if adding lock and unlock to

use, intrinsic :: iso_fortran_env, only:lock_type, lock, unlock

I have no way of testing this here.

Jim Dempsey

www.quickthreadprogramming.com

This has nothing to do with the USE - I can reproduce the problem and will report it to development.

Steve - Intel Developer Support

Issue ID is DPD200169012. Compiling with -O0 avoids the ICE, but it is not clear to me if the program works correctly then. I find that the DO WHILE (.NOT. READY) loop never terminates and the program goes CPU-bound. I have not studied it enough to see if the code has a bug.

Steve - Intel Developer Support

Steve,

I am pretty sure it does contain bugs ;). Sofar I was able to get it to work somewhatby commenting
the lock/unlock statements. With this option I should now be able to make it work properly. I am at
this point not interested in getting the ultimate performance - the program is a mere toy that I use
to get a feeling for coarrays (and to illustrate their practicaluse).

Regards,

Arjen

Jim,

I took the use statement from John Reid's article in the Fortran Forum, but as lock and unlock
are _statements_, not subroutines, the compiler should know about them without the use statement,
even if the proper argument type comes from elsewhere.

Regards,

Arjen

Arjen,

Yes, I understand lock and unlock are not subroutines or functions. Instead they are statement keywords which I assumed (incorrectly) the declarations come in as an intrinsic with a USE, specifically the USE that brought in the lock type. Not having the latest IVF, are lock/unlock brought in by way of option switch? If so, then why isn't the lock type declared in that manner as well? (like INTEGER, ... are)

Jim

www.quickthreadprogramming.com

I do not know - I simply followed John Reid's article, as that is my authorative text at the moment.
To use coarrays I do have to use the -coarrays option to the compiler. It does seem a bit odd that
you have bring the lock_type via a separate statement.

Regards,

Arjen

The need for the -coarray switch in order to use any of the coarray-related features is an "implementation expediency". In an ideal world it would not be needed and omitting the switch would allow you to use the features (including LOCK, CRITICAL, etc.) in a single-image environment. It would have required a lot more work to offer that alternative and we chose not to. Perhaps someday.

Yes, LOCK_TYPE must be imported from ISO_FORTRAN_ENV - LOCK is a statement. The standards committee wanted to avoid breaking programs by adding a new intrinsic type. ISO_FORTRAN_ENV is, technically, an intrinsic module that requires the programmer to explicitly request its use.

Steve - Intel Developer Support

Steve,

If you take notice of the original source code file the file contained

use iso_fortran_env, only:lock_type

IOW it shouldn't have included lock nor unlock intrinsic keywords (assuming they are declared in iso_fortran_env). Meaning user had a programming error .AND. the compiler behaved as if lock and unlocked were sucked in in violation of the only clause.

What I am guessing here is the intrinsic keywords were somehow defined as external*** but the calling convention of the INTERFACE somehow did not also come in to play. IOW lock/unlock being called with incorrect calling interface.

Please note that this is speculation as I cannot reproduce the problem.

This is why I had asked that the OP add lock and unlock to the only clause.
If this corrects for the problem then this is indicative of my assumption being correct.

Jim Dempsey

www.quickthreadprogramming.com

LOCK and UNLOCK are statements, not module procedures. I could reproduce the problem easily and in fact cut it down to about five lines.

Steve - Intel Developer Support

I have improved the program and it is producing the results I expected, except that it does not
stop - despite the use of a STOP statement and the output showing that all threads (8 in total) have reached
the end of the program.

Below you find my updated version.

Any idea what I am doing wrong?

Regards,

Arjen

----

! primes_coarrays.f90 --

! Determine the first 1000 primes - using coarrays

!

program primes_coarrays

use, intrinsic :: iso_fortran_env

implicit none

type(lock_type), codimension[*] :: primes_lock

integer, dimension(2) :: range_priv

integer, dimension(2), codimension[*] :: range

integer, dimension(1000), codimension[*] :: prime

integer, codimension[*] :: number_primes

logical, codimension[*] :: new_results

logical, codimension[*] :: new_task

logical, codimension[*] :: ready

ready = .false.

new_results = .false.

new_task = .false.

number_primes = 0

range_priv(2) = 0

!

! Create tasks in image 1 and handle them in all images

!

do while ( .not. ready )

if ( this_image() == 1 ) then

call add_task

endif

sync all

call get_task

enddo

if ( this_image() == 1 ) then

write(*,'(10i5)') prime

endif

write(*,*) 'Image done:', this_image()

sync all

stop

contains

!

! Subroutine to post a new task (consisting of a

! range of integers in which to look for primes)

!

! Loop over the images to see which one wants a

! new task

!

subroutine add_task

integer :: i

do i = 1,num_images()

if ( .not. new_task[i] ) then

range_priv(1) = range_priv(2) + 1

range_priv(2) = range_priv(2) + 100

range(:)[i] = range_priv(:)

new_task[i] = .true.

write(*,*) 'Assign', i, range_priv

endif

enddo

end subroutine add_task

!

! Subroutine to get a task and search for

! primes inside the new range

!

subroutine get_task

integer, dimension(100) :: new_prime

integer :: lower

integer :: upper

logical :: isprime

integer :: np

integer :: i

integer :: j

integer :: residue

integer :: maxindex

np = 0

if ( new_task ) then

lower = range(1)

upper = range(2)

!

! Determine what primes we have in this range

!

write(*,*) 'Range: ',lower, upper, this_image()

do i = lower,upper

isprime = .true.

do j = 2,int(sqrt(real(i)))

residue = mod(i,j)

if ( residue == 0 ) then

isprime = .false.

exit

endif

enddo

if ( isprime ) then

np = np + 1

new_prime(np) = i

endif

enddo

write(*,*) 'Found:', np, ' primes'

write(*,'(5x,10i5)' ) new_prime(1:np)

!

! Now put the results in image 1

!

lock( primes_lock[1] )

number_primes = number_primes[1]

write(*,*) 'Storing primes:', number_primes, np, this_image()

maxindex = min( size(prime) - number_primes, np )

prime(number_primes+1:number_primes+maxindex)[1] = &

new_prime(1:maxindex)

number_primes[1] = number_primes + maxindex

ready = number_primes[1] >= size(prime)

if ( ready ) then

do i = 1,num_images()

ready[i] = .true.

enddo

endif

!

! We are done - get a new task, if any

!

new_task = .false.

write(*,*) 'Done - new task:', this_image(), ready

unlock( primes_lock[1] )

endif

write(*,*) 'Done?', this_image(), ready

end subroutine get_task

end program primes_coarrays

Try inserting "sync all" in front of "do while(.not. ready)".

This is because without sync all, you have a data race condition with "new_task[i] = .true." in add_task (run by 1st thread) against "new_task = .false." in program prior to call new_task run by ancillary threads.

IOW "new_task[2] = .true." could occur prior to thread 2 issuing "new_task = .false."

Jim Dempsey

www.quickthreadprogramming.com

Try using ERROR STOP instead of STOP.

Steve - Intel Developer Support

Ah, true! Thanks - I will experiment with getting rid of the "sync all" inside the do-while loop,
as I do not think it should be necessary.

Regards,

Arjen

No, the program still hangs.

Regards,

Arjen

When you force a break all - what is happening in each of the threads?

Jim

www.quickthreadprogramming.com

In the output I see thefinal message from all threads. So as far as I can tell, all threads have left
the do-loop and are executing the STOP statement.

Regards,

Arjen

Let's do some diagnostics

In the main program, at end you have

sync all ! or you had sync all

stop ! or error stop

Insert a WRITE(*,*) "passed sync all"

between sync all and stop

This should let you know if the program is hanging on thw sync or on the stop
(leave the former write in front of the synch all too)

if all threads pass the final sync all then consider:
a) remove STOP and let threads fall out of program END (past contains)
b) insert

do while(this_image() .ne. 1)
call sleepqq(1)
end do
call exit(0)

Try a) first

Jim Dempsey

www.quickthreadprogramming.com

Hi Jim,

I verified that all threads pass the sync all statements, so then I moved on to possible solution a.
That did not help.

Then I applied possible solution b.That does not help either! The program simply keeps running,
that is: it "hangs" in the call to "exit", as a print statement after that does not get executed.

Regards,

Arjen

Arjen,

Then what happens if you call abort()?

Not that you want to use abort(), but it may get you by while Intel investigates the problem.

If abort terminates the app you may have to add statements to flushbuffered I/O too.

If you are curious and have the C runtime library source code available, you could re-edit the program wo the way it should work then debug step out of the main program (for each thread) and see where the hangup is occuring.

Jim Dempsey

www.quickthreadprogramming.com

Jim,

no luck: the same result.

Regards,

Arjen

Arjen,

What happens with

program HelloCoarrays
write(*,*) "Hello world from ", this_image()
end program HelloCoarrays

(compile for co-arrays)

Does the above hang on exit too?

Jim

www.quickthreadprogramming.com

Hi Jim,

no, that works okay. Hm, maybe I should reduce my program to the point
where it does correctly finish (at the expense of doing even less useful things)

I will let yyou know how I fare with that.

Regards,

Arjen

I have experimented with a reduced version of the program, but nothing definite transpires.
The preliminary conclusion I think I can formulate is that it is the number of updates to the
prime array that is causing this. When I limited that (losing lots of information), it did finish
when all threads finished.

For the time being I can investigate an alternative solution - have thread 1 do all the updating.

Regards,

Arjen

Try this: at the very end of the program insert

if(this_image() == 1) then
read(*,'(A)') line ! wait for user input
endif
end program yourProgramNameHere

add/reuse some character variable for line

then run the program but do not enter the text. i.e. let the program hang waiting for input.
In a seperate window run your favorite utility to see what is running on the system, in particular do you see one image running or multiple images running. The seperate images should have exited. If they have exited, then go back to the waiting image and press Return. See if this exits the program.

What I am asking you to check for is if there is a exit sequencing problem. If this seems to work then consider replacing the read with a call sleepqq(100) or some time interval long enough for the other images to exit.

Hopefully this will provide you with a work around until Intel can fix the problem.

If it doesn't work and the time difference is significant, the I suggest adding code at the end to get your process ID then issue a KILL process of that ID. A rather nasty way of working around the problem.

Jim Dempsey

www.quickthreadprogramming.com

Hi Jim,

what I see is that even though the threads reach the STOP statement, except for thread 1 that is
waiting for the input, all remain active.

As this is an example, not a critical thing, I will leave this version for the moment and see if my alternative
has more of a chance to work.

Regards,

Arjen

The internal compiler error has been fixed in our sources. I expect the fix to appear in a compiler later this year.

Steve - Intel Developer Support

Leave a Comment

Please sign in to add a comment. Not a member? Join today