    akki, May 19, 2009 1:31 AM PDT
    Length of a DNA sequence

    Is there a limit on the length of a DNA sequence in the database file?

    Clay Breshears (Intel), May 21, 2009 2:23 PM PDT
    Re: Length of a DNA sequence

    Quoting - akki
    Is there a limit on the length of a DNA sequence in the database file?

    Well, the Amoeba dubia has 670 billion base pairs, but that seems a bit excessive.  Humans have around 3.2 billion base pairs.  Again, probably too big for our purposes.

    Let's put a limit of 1 million (10^6) base pairs per sequence.  That would mean 12501 lines for one sequence (1 descriptor line and 12500 lines of sequence).

    --clay
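The line count above can be checked with a short sketch. The 80-bases-per-line width is an assumption inferred from the figures in the post (10^6 / 12500 = 80); the thread does not state it directly, and `lines_per_sequence` is a hypothetical helper name.

```python
MAX_BASES = 10**6   # limit proposed in the post
LINE_WIDTH = 80     # assumption: inferred from 10^6 bases / 12500 lines


def lines_per_sequence(bases, width=LINE_WIDTH):
    """Return 1 descriptor line plus the lines needed to hold `bases` bases."""
    sequence_lines = (bases + width - 1) // width  # integer ceiling division
    return 1 + sequence_lines


print(lines_per_sequence(MAX_BASES))  # 12501
```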

    akki, May 22, 2009 1:02 AM PDT
    Re: Length of a DNA sequence


    Quoting - Clay Breshears (Intel)
    Well, the Amoeba dubia has 670 billion base pairs, but that seems a bit excessive.  Humans have around 3.2 billion base pairs.  Again, probably too big for our purposes.

    Let's put a limit of 1 million (10^6) base pairs per sequence.  That would mean 12501 lines for one sequence (1 descriptor line and 12500 lines of sequence).

    --clay

    Thanks clay.

    My math is somewhat rusty, so please correct me if I miscalculated. But with the 2^31 - 1 (> 2 * 10^9) limit specified for the number of sequences in another post (http://software.intel.com/en-us/forums/showthread.php?t=65703), and sequences up to 1 million (10^6) bytes long, do I really need to make provisions for 2 * 10^15 bytes (~ 2 petabytes) of storage for each file?! :O

    What I'm really wondering is, does this mean that I cannot read everything (or at least one of the files) into memory at the beginning?
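As a rough check of the worst-case estimate above, here is a minimal sketch. It assumes one byte per base and ignores descriptor lines and newline characters; `worst_case_bytes` is a hypothetical helper, not something defined in the thread.

```python
MAX_SEQUENCES = 2**31 - 1   # sequence-count limit cited from the other thread
MAX_BASES = 10**6           # per-sequence limit from this thread


def worst_case_bytes(sequences=MAX_SEQUENCES, bases_per_sequence=MAX_BASES):
    """Upper bound on raw sequence data, assuming one byte per base."""
    return sequences * bases_per_sequence


total = worst_case_bytes()
print(total)               # 2147483647000000
print(total > 2 * 10**15)  # True, so a little over 2 petabytes
```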


    邓辉, May 24, 2009 9:40 AM PDT
    Re: Length of a DNA sequence

    Quoting - akki

    Thanks clay.

    My math is somewhat rusty, so please correct me if I miscalculated. But with the 2^31 - 1 (> 2 * 10^9) limit specified for the number of sequences in another post (http://software.intel.com/en-us/forums/showthread.php?t=65703), and sequences up to 1 million (10^6) bytes long, do I really need to make provisions for 2 * 10^15 bytes (~ 2 petabytes) of storage for each file?! :O

    What I'm really wondering is, does this mean that I cannot read everything (or at least one of the files) into memory at the beginning?

    I hope that reading and writing the files will account for a smaller share of the total.


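    In the office tower there are offices; in the offices there are programmers.
    The programmers write programs, then trade the programs for wine money.
    Sober, they just sit online; drunk, they come to sleep offline.
    Drunk and sober, day after day; online and offline, year after year.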

    lixianyu, May 25, 2009 8:46 AM PDT
    Re: Length of a DNA sequence

    Quoting - akki

    Thanks clay.

    My math is somewhat rusty, so please correct me if I miscalculated. But with the 2^31 - 1 (> 2 * 10^9) limit specified for the number of sequences in another post (http://software.intel.com/en-us/forums/showthread.php?t=65703), and sequences up to 1 million (10^6) bytes long, do I really need to make provisions for 2 * 10^15 bytes (~ 2 petabytes) of storage for each file?! :O

    What I'm really wondering is, does this mean that I cannot read everything (or at least one of the files) into memory at the beginning?

    Yes, you're right! We cannot read everything into memory at the beginning.
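If a file really were too large to hold in memory, one option would be to stream it one sequence at a time. The sketch below is only an illustration: it assumes each descriptor line starts with `>` (the FASTA convention, which this thread does not confirm), and `iter_sequences` is a hypothetical helper.

```python
import io


def iter_sequences(handle):
    """Yield (descriptor, sequence) pairs one at a time from an open text file.

    Assumption: descriptor lines start with '>' and every other non-empty
    line belongs to the sequence of the most recent descriptor.
    """
    descriptor = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if descriptor is not None:
                yield descriptor, "".join(chunks)
            descriptor = line[1:].strip()
            chunks = []
        elif line and descriptor is not None:
            chunks.append(line)
    if descriptor is not None:
        yield descriptor, "".join(chunks)


# Small in-memory sample standing in for a real database file.
sample = io.StringIO(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")
for name, sequence in iter_sequences(sample):
    print(name, len(sequence))
```

Only one sequence (at most 10^6 bases under the limit above) is held in memory at a time, at the cost of re-reading the file for any later pass.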

    Clay Breshears (Intel), May 27, 2009 10:29 AM PDT
    Re: Length of a DNA sequence

    Quoting - akki

    Thanks clay.

    My math is somewhat rusty, so please correct me if I miscalculated. But with the 2^31 - 1 (> 2 * 10^9) limit specified for the number of sequences in another post (http://software.intel.com/en-us/forums/showthread.php?t=65703), and sequences up to 1 million (10^6) bytes long, do I really need to make provisions for 2 * 10^15 bytes (~ 2 petabytes) of storage for each file?! :O

    What I'm really wondering is, does this mean that I cannot read everything (or at least one of the files) into memory at the beginning?

    There doesn't seem to be anything wrong with your math.  In http://software.intel.com/en-us/forums/showthread.php?t=65734 I've stated the assumption that the raw database will fit into memory.

    --clay

    akki, May 27, 2009 11:16 PM PDT
    Re: Length of a DNA sequence


    Quoting - Clay Breshears (Intel)
    There doesn't seem to be anything wrong with your math.  In http://software.intel.com/en-us/forums/showthread.php?t=65734 I've stated the assumption that the raw database will fit into memory.

    --clay

    Alright. Thanks clay.

