Crash when spawning function

I am using Cilk to parallelize a tree walk (part of the MCF program in SPEC). The code runs fine sequentially, but the Cilk version crashes. Depending on the problem I feed into the program, it crashes at different points. This is what I see for a smaller problem (a few hundred nodes in the tree):

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe8d7a700 (LWP 12286)]
__cilkrts_resume (w=0x622240, f=0x6637c0, sf=0x7fffffffc8c0) at ../../../cilkplus/libcilkrts/runtime/sysdep-unix.c:371
371	    ((volatile char *)sp)[-1];

This is what I see in gdb for the big problem (thousands of nodes in the tree):

[Switching to Thread 0x7fffe8f29700 (LWP 12154)]
do_lookup_x (new_hash=2300512604, old_hash=0x7fffe8f28d18, ref=, result=0x7fffe8f28d00, 
    scope=, i=0, flags=5, skip=0x0, undef_map=0x7ffff7ffa528) at dl-lookup.c:124
124	dl-lookup.c: No such file or directory.
in dl-lookup.c

I guess there is some stack corruption going on, or something to that effect. Has anyone seen this before, and how do you work around it?

Code for the specific function:

long
refresh_pot_cilk(node_t *node)
{
  long c1 = 0, c2 = 0;
  long lc = 0;

  if (!node) return 0;

  c2 = cilk_spawn refresh_pot_cilk(node->sibling);

  if( node->orientation == UP ) {
    node->potential = node->basic_arc->cost + node->pred->potential;
  } else { /* == DOWN */
    node->potential = node->pred->potential - node->basic_arc->cost;
    lc ++;
  }

  c1 = cilk_spawn refresh_pot_cilk(node->child);

  cilk_sync;
  return lc + c1 + c2;
}

Clearly you're running on a Linux variant. Which one? What version?

Also, you're clearly not running the Cilk runtime released with the Intel compiler, since sysdep-unix.c was created as part of the reorganization prior to the GCC release. Which version of the Cilk runtime are you running?

  • If you've got the source, you can look in libcilkrts/include/internal/cilk_version.h
  • Or you can define the CILK_VERSION environment variable to point to a file. After a run that file should have information about the Cilk runtime. If CILK_VERSION is defined as stdout or stderr, the version information will be written there instead of to a file.

- Barry

Hello lorrden,
Can you please tell me what command you used to compile your source-code?

Thanks,

Balaji V. Iyer.

Linux version: Linux 2.6.32-34-server #77-Ubuntu SMP Tue Sep 13 20:54:38 UTC 2011 x86_64 GNU/Linux

I use Cilk from the gcc branch linked on the website (checked out from the svn repo). It was built with a plain ./configure, make and make install; no additional flags were used.

The host compiler for the initial bootstrapping is reported as "gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3". The runtime was installed along with the Cilk version of gcc.

My program is built using the following in a subdirectory of my sources (-O3 results in an ICE, but that seems to have been reported in another thread):

gcc -std=gnu99 -g -O0 -I../ -L/opt/cilk/lib -pthread -lcilkrts -o test ../*.c

(Using a proper path for pointing out the correct gcc, naturally).

The CILK_VERSION file contains the following:

Cilk runtime initialized: Thu Dec 08 21:30:56 2011
Cilk runtime information
========================
Cilk version: 2.0.0 Build 2068
Built by holm on host dbtest
Compilation date: Dec  7 2011 15:05:15
Compiled with GCC V4.7.0
System information
==================
Cilk runtime path: /opt/cilk/lib/libcilkrts.so.5
System OS: Linux, release 2.6.32-34-server
System architecture: x86_64
Thread information
==================
System cores: 4
Cilk workers requested: 3
Thread creator: Private

Hello Lorrden,
Can you try adding "-ldl" when you compile the source code, as shown below?

gcc -std=gnu99 -g -O0 -I../ -L/opt/cilk/lib -pthread -lcilkrts -ldl -o test ../*.c

Thanks,

Balaji V. Iyer.

Yeah, I've already tried that and the behaviour is the same. I will try to reduce my code size a bit to see if I can get a relatively small test case.

I have traced the issue to a stack overflow (using valgrind). Essentially, the program runs out of stack space in the cilkified version (the normal recursive version works fine). The problem with MCF is that the trees are rather unbalanced (at least initially). Is there any way to set the worker threads' stack sizes, i.e. through something like pthread_attr_setstacksize?

You should be able to use __cilkrts_set_param to specify the size of stacks for worker threads.
At the beginning of your program, try calling

__cilkrts_set_param("stack size", "4000000");

to specify that worker stacks should be about 4 million bytes.
Does that fix the problem? If not, we would be interested in getting an example code that illustrates the error.

Jim

I have this test case now. It runs fine sequentially, and it also runs fine with Cilk until you pass the value 15081 as a CLI parameter. The value is used to construct a tree with the given number of nodes (in this case the nodes are all chained together without any branches). I tried to change the stack size as recommended, but it does not help; the application still goes down at this point (I even tried something ridiculously large like 128 MiB).
What does help is making the tree balanced (uncomment the commented-out parts of the make_tree function to build a balanced tree).

If you use the value 15081 (this is on my system; I am not sure it is consistent across machines), you get the message "The Cilk Plus runtime system detected a corruption in its data structures. This is most likely caused by an application bug. Aborting execution." If you use a larger value (17857 in my case), the application simply crashes.

#include <stdio.h>
#include <stdlib.h>

#ifndef NOCILK
#include <cilk/cilk.h>
#include <cilk/cilk_api.h>
#else
#define cilk_spawn
#define cilk_sync
#define __cilkrts_set_param(a, b)
#endif

typedef struct tree_node_t {
  struct tree_node_t *left;
  struct tree_node_t *right;
  int val;
} tree_node_t;

long gNodes = 0;

tree_node_t *
make_tree(long nodes)
{
  if (nodes <= 0) return NULL;

  tree_node_t *node = malloc(sizeof(tree_node_t));
  if (!node) {
    fprintf(stderr, "out of memory\n");
    exit(1);
  }
  nodes --;

  node->left = make_tree(nodes/*/2 + (nodes&1)*/); // Add one if odd
  node->right = NULL;
//  node->right = make_tree(nodes/2);
  return node;
}

long
tag_tree(tree_node_t *node)
{
  long l, r;
  if (!node) return 1;
  
  l = cilk_spawn tag_tree(node->left);
  r = cilk_spawn tag_tree(node->right);

  cilk_sync;

  node->val = l+r+1;

  return node->val;
}

int
main(int argc, const char *argv[argc])
{
  if (argc != 2) return 1;

  // __cilkrts_set_param("stack size", "128000000");

  long treesize = strtol(argv[1], NULL, 10);
  tree_node_t *root = make_tree(treesize);
  
  long res = tag_tree(root);

  printf("tree result: %ld\n", res);
  return 0;
}

Thanks for the test case! It is quite helpful in figuring out what may be going on.

I believe you are running into a limit on spawn depth, rather than a limit on stack size. Every time a function f() is spawned in Cilk, f() uses not only space on the stack, but also pushes an entry onto the worker's deque --- a runtime data structure on each worker that Cilk uses to keep track of work that can be stolen. Every level of nested spawn pushes an additional element onto the deque, and your example overflows this deque for large enough inputs.

Unfortunately, in the current runtime code, the deque is a fixed size.

In global_state.cpp, you should see the line

g->ltqsize = 1024; /* FIXME */

This indicates that you can nest spawns to a depth of roughly 1024. When spawns are nested more deeply than that, runtime data structures may get corrupted in the current implementation.
Normally, most Cilk programs don't generate a large spawn depth. Spawning a perfectly balanced binary tree to a depth of 1024 would mean a tree with 2^1024 leaves. Because your benchmark generates unbalanced trees, however, you are much more likely to run up against this limit.
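
To see how close a given input comes to such a limit, one option is to measure the height of the tree before running the parallel walk: tag_tree spawns at every level, so its nesting depth of spawns follows the height. The following is only a sketch. tree_height is a hypothetical helper, not part of the Cilk runtime, and it uses an explicit heap-allocated stack so that it cannot overflow the C stack on a degenerate tree. The depth at which the runtime actually fails may differ from the nominal deque size, as the numbers reported above suggest.

```c
#include <stdlib.h>

typedef struct tree_node_t {
  struct tree_node_t *left;
  struct tree_node_t *right;
  int val;
} tree_node_t;

/* Hypothetical helper: number of nodes on the longest root-to-leaf path.
   Iterative depth-first walk; returns -1 if memory runs out. */
long
tree_height(const tree_node_t *root)
{
  struct item { const tree_node_t *node; long depth; };
  size_t cap = 64, top = 0;
  long best = 0;
  struct item *stack;

  if (!root) return 0;
  stack = malloc(cap * sizeof *stack);
  if (!stack) return -1;

  stack[top].node = root;
  stack[top].depth = 1;
  top++;

  while (top > 0) {
    struct item cur = stack[--top];
    const tree_node_t *kids[2] = { cur.node->left, cur.node->right };

    if (cur.depth > best) best = cur.depth;

    for (int i = 0; i < 2; i++) {
      if (!kids[i]) continue;
      if (top == cap) {  /* grow the explicit stack when it is full */
        struct item *bigger = realloc(stack, 2 * cap * sizeof *stack);
        if (!bigger) { free(stack); return -1; }
        stack = bigger;
        cap *= 2;
      }
      stack[top].node = kids[i];
      stack[top].depth = cur.depth + 1;
      top++;
    }
  }

  free(stack);
  return best;
}
```

For the chained tree in the test case, tree_height(root) equals the node count given on the command line, while the balanced variant of make_tree gives a height of roughly log2 of that count.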

This limit on spawn depth is a more general issue with the current runtime. A workaround may be to increase g->ltqsize and recompile the runtime. Does that help?

Alternatively, depending on how much you know about the structure of the tree in advance, you might be able to change some of the "cilk_spawn" statements into ordinary function calls and decrease your spawn depth. You might also be able to track the depth of spawning in your code explicitly, and convert spawns to calls if this depth is too large. Both of these approaches reduce the available parallelism of the application, though.
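
The explicit depth tracking could look roughly like the sketch below. It is an illustration, not tested against the runtime in question. tag_tree_limited, the spawn_depth parameter and MAX_SPAWN_DEPTH are hypothetical names, and the value 512 is an assumed safety margin below the deque size, not a documented constant. The USE_CILK switch mirrors the NOCILK trick from the test case, so the same file also builds serially.

```c
#include <stddef.h>

#ifdef USE_CILK
#include <cilk/cilk.h>
#else
/* Serial fallback, same idea as the NOCILK branch of the test case. */
#define cilk_spawn
#define cilk_sync
#endif

typedef struct tree_node_t {
  struct tree_node_t *left;
  struct tree_node_t *right;
  int val;
} tree_node_t;

/* Assumed safety margin; choose it well below the runtime's deque size. */
#define MAX_SPAWN_DEPTH 512

long
tag_tree_limited(tree_node_t *node, int spawn_depth)
{
  long l, r;
  if (!node) return 1;

  if (spawn_depth < MAX_SPAWN_DEPTH) {
    /* Still below the limit: spawn one child, call the other.
       Both are counted one level deeper, which is conservative. */
    l = cilk_spawn tag_tree_limited(node->left, spawn_depth + 1);
    r = tag_tree_limited(node->right, spawn_depth + 1);
    cilk_sync;
  } else {
    /* Limit reached: plain calls only, so no further nested spawns. */
    l = tag_tree_limited(node->left, spawn_depth);
    r = tag_tree_limited(node->right, spawn_depth);
  }

  node->val = l + r + 1;
  return node->val;
}
```

The walk would be started as tag_tree_limited(root, 0). On a chained tree everything below the limit runs serially, so this trades parallelism for safety; the recursion still uses C stack space for every level.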

Jim

PS. Some other bits of information that may be useful:

  1. To create a balanced binary tree, the second spawn is unnecessary. You can write:

    l = cilk_spawn tag_tree(node->left);
    r = tag_tree(node->right);
    cilk_sync;

  2. The __cilkrts_set_param("stack size") does not change the size of the normal C stack for user-created threads, but only the stacks created for Cilk worker threads. The initial thread for a program counts as a "user-created" thread, so you may also need to set the size of that C stack as well, using normal OS-specific mechanisms. (But you should only need to change this limit if your program also crashes serially, which doesn't seem to apply in your case.)
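
As one example of such an OS-specific mechanism, POSIX systems expose the stack limit through getrlimit and setrlimit. The sketch below is an assumption about how one might use them, not something the Cilk runtime requires; raise_stack_limit is a hypothetical helper. Whether a limit raised at run time applies to the already running initial thread depends on the platform, so setting the limit in the shell before launching (for example with ulimit -s) is the more common route.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft stack limit to at least `bytes`,
   clamped to the hard limit. Returns 0 on success, -1 on failure. */
int
raise_stack_limit(rlim_t bytes)
{
  struct rlimit rl;

  if (getrlimit(RLIMIT_STACK, &rl) != 0) return -1;

  /* Nothing to do if the current soft limit is already large enough. */
  if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= bytes) return 0;

  /* An unprivileged process cannot exceed the hard limit. */
  if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
    bytes = rl.rlim_max;

  rl.rlim_cur = bytes;
  return setrlimit(RLIMIT_STACK, &rl);
}
```

A call such as raise_stack_limit((rlim_t)64 * 1024 * 1024) at the top of main would request about 64 MiB for the C stack, independently of the "stack size" parameter used for Cilk worker threads.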
