TRUNCATE command

Long before ALLOCATE/DEALLOCATE, we had to do memory allocation in assembly language. For large-scale general-purpose programs, like a statistical analysis package, we have to allocate large arrays to handle the design capacity for numbers of variables, size of analysis, etc. For instance, if you allow up to 32000 variables and matrices of up to 200 x 200 elements, you want to truncate these down to the actual size needed before you start allocating additional memory based on the size of the problem discovered.

At that time, at the Institute for Social Research at Michigan, when we created OSIRIS to run on an IBM 360 in about 100K, it was especially important, and we created a TRUNC subroutine to truncate the space to the actual problem size after setup interpretation and before additional critical memory was allocated.

It would be nice to have a truncate command for allocated memory. MOVE_ALLOC can accomplish it, but you need a short series of commands and data movement. Something like TRUNCATE (array(newsize)), at least for linear arrays, would be very helpful and a lot more efficient and simpler to use.

Has this ever been considered?

Not that I am aware of. The typical usage would be to not allocate until you knew the problem size. This TRUNCATE feels to me like a solution superseded by modern practice.

Steve

That's right, but when you allow 32000 possible values and end up having 10, it's a waste.

For instance, when reading a data dictionary where you don't know the ultimate number of variables (which may be integer, character, or real), I use an input and output location vector, a # decimals vector, a recoded result vector, a 24-character names vector, a REAL(8) data vector, a variable width vector, and two REAL*8 missing-data vectors, with a maximum allowed of 32000 variables.

Calculation and algorithm vectors and matrices are allocated exactly as needed after the problem size is completely determined--I assume that is what you mean by modern practice--but we actually used it in the 1960s.

All of this could be truncated down after the task is completely determined, before other allocations increase the total demand on memory.

The modern practice seems to be over-allocate and not worry about it because memory is so cheap and abundant now, or do a lot of move_alloc steps, which, by the way, have a heavy code memory load themselves.

So in modern usage I find it the most efficient to permanently define some of the initial large vectors and leave others just large, especially since deallocate seems to require a big code hit. It's not a big problem because I can allow any size statistical analysis (of which there are 47 possible) with up to 32000 variables/unlimited records and the code loads in less than 2MB.

I just think a truncate command would be fairly trivial--just update the chain.

A separate subroutine is not required - the language already has something that gives the externally visible outcome that you want. A processor may not deliver this outcome efficiently - but that's an issue for implementations around how much effort they put into optimising particular syntactic patterns. For example:

INTEGER, ALLOCATABLE :: big_array(:)
ALLOCATE(big_array(1000000))
!....
big_array = big_array(:required_size)

Chances are that an implementation today will create a temporary for the right hand side and then allocate-and-assign that copy (which is inefficient), but it doesn't have to do that - this statement could just be a fiddle of the extent or whatever stored in the descriptor for big_array. If the underlying memory manager supported resizing of memory blocks then you could also make available the memory that's no longer required for other purposes, or the implementation could leave it marked in use in the memory manager in case there was a later...

big_array = [ big_array, some_other_data ]

Either way, I don't think there's a standard way that you as a programmer can tell what's happening behind the scenes.

Separate to the above - if any allocate to new size/copy data/move_alloc cycle was expensive relative to the time to do the IO to read the data in, then do two passes of the data - first pass determines the size, second pass actually copies things over. Good input formats allow the size of data sets to be determined without requiring two passes.

Typically though, IO is going to be an order of magnitude slower than any associated memory management operations. To avoid "a lot of move_alloc steps", memory arrays are typically best grown in an exponential fashion: pick some sort of initial allocation, start reading data, and when the allocation is exhausted double the previous allocation (or scale it by some similar factor n); repeat until the data is exhausted, then chop back to the required size at the end. An alternative to chopping at the end is to remember the index of the last validly defined location in the array and use that as a substitute for the equivalent of UBOUND(...); at worst you will be wasting n - 1 times the amount of memory that you really needed.
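A minimal sketch of the grow-then-chop strategy described above. The helper name resize, the initial size of 4, the doubling factor, and the loop standing in for the read are all assumptions made for illustration; none of them come from the thread.

```fortran
program grow_demo
  implicit none
  integer, allocatable :: values(:)
  integer :: n_used, i

  allocate(values(4))          ! small initial guess (arbitrary)
  n_used = 0

  ! Stand-in for a read loop: 1000 items arrive one at a time.
  do i = 1, 1000
    ! Allocation exhausted: double it before storing the next item.
    if (n_used == size(values)) call resize(values, 2 * size(values))
    n_used = n_used + 1
    values(n_used) = i
  end do

  ! Data exhausted: chop back to the size actually used.
  call resize(values, n_used)

  print *, size(values), values(n_used)

contains

  ! Hypothetical helper: resize array to new_size elements, keeping as
  ! many leading elements as fit. Assumes array is allocated with lower
  ! bound 1 and new_size >= 0.
  subroutine resize(array, new_size)
    integer, allocatable, intent(inout) :: array(:)
    integer, intent(in) :: new_size
    integer, allocatable :: tmp(:)
    integer :: n_keep

    n_keep = min(size(array), new_size)
    allocate(tmp(new_size))
    tmp(:n_keep) = array(:n_keep)
    call move_alloc(tmp, array)   ! array now owns tmp's storage
  end subroutine resize
end program grow_demo
```

With doubling, 1000 items need only nine resize calls (4 up to 1024 elements, then one chop), so the number of MOVE_ALLOC steps grows with the logarithm of the data size rather than with the data size itself.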

Thanks for the big_array tip; I wasn't aware of that. I may use that, and it's visually elegant.

It's not a big issue for me. I just like to minimize memory usage, coming from my days when memory was at a premium, and I wondered whether anyone else had thought about it.

I have very few areas left where truncation would help, and they are not really significant; it would just be elegant to be able to truncate.

What I meant by modern practice is to determine the problem size and then allocate arrays to that size. No waste. The problem with the concept of TRUNCATE is that for constant-sized arrays, the compiler would no longer be able to tell, at compile time, how large the array is, and any truncated space would be wasted since it is allocated by the linker. For allocatable arrays, you would typically lose some of the truncated space due to memory allocator alignment issues.

I am having a difficult time understanding how this TRUNCATE actually saves anything.

Steve

I think I see van Eck's point of view. It appears to be a particular case of knowing one way of how something is done in one language (or programming system, to be more general), being in possession of several large codes that use that particular method, and being faced with having to port those codes to a language that does not have that particular method as a built-in feature.

IanH has already stated one set of reasons why TRUNCATE is not needed since the requested functionality may be had in other ways. The compiler option /assume:realloc_lhs is relevant in this context.

A just-in-time memory allocator would make a TRUNCATE intrinsic superfluous. Therefore we can see that the TRUNCATE intrinsic would give the programmer fine-grain control over implementation details (in this case, memory allocator), which is not typically done in a high level compiled language such as Fortran.

Conversely, were TRUNCATE to be provided, there would be a potential penalty, either in terms of memory efficiency (is all the discarded memory allocation reclaimable?) or allocator overhead (how much shuffling needs to be done to do the reclamation?), and users may even ask for the ability to control this overhead.

Steve, that's exactly right and it's what I do. But with a stat package with user input you can't always do that before setup interpretation. For instance, if you allow an unlimited number of variables you need a large vector to hold them until you know how many there are. If you allow an option such as variables=all, you don't know how many there are beforehand and you don't know their characteristics until you read the data dictionary. For a data transformation command, you need several copies of these vectors.

Once setup interpretation is done, you can allocate exactly what's needed for the calculations, but you still have the too-large initial vectors.

The big_array=big_array(:size) works, but has a severe penalty in this scenario. Tests of a single instance show an object code hit of about 7k. For 47 commands this would be about a 390k permanent hit. While it does save allocated memory for a given command, there is only one command in play at a time so the overall savings is not there. BTW, similarly, deallocation of multiple arrays cost a lot of code too, so it's rarely beneficial to deallocate explicitly.

So a simple truncate command that updates the allocation/unallocated tables or chains could save a lot. I wouldn't be concerned about memory fragmentation since you are still better off than before.

Fortran has C interoperability capabilities.

Look into using these functions to map an array descriptor onto C malloc'ed, then realloc'ed, memory. Note that there is no guarantee that the realloc in the system's C runtime library will shrink the block in place on a reduction in size; it may instead perform a malloc/copy.

Jim Dempsey

www.quickthreadprogramming.com
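A rough sketch of the C interoperability route suggested above, assuming the C runtime's malloc, realloc, and free are bound through hand-written interfaces. The Fortran-side names c_malloc, c_realloc, and c_free are hypothetical. Note that the result is a Fortran POINTER array rather than an ALLOCATABLE one, so it does not behave like an allocatable array elsewhere in a program, and the caller has to free the memory explicitly.

```fortran
program realloc_demo
  use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_int, &
                                         c_sizeof, c_f_pointer, c_associated
  implicit none

  ! Hand-written bindings to the C runtime (Fortran-side names are
  ! hypothetical; the bound C functions are the standard ones).
  interface
    function c_malloc(nbytes) bind(c, name="malloc") result(p)
      import :: c_ptr, c_size_t
      integer(c_size_t), value :: nbytes
      type(c_ptr) :: p
    end function c_malloc
    function c_realloc(p_old, nbytes) bind(c, name="realloc") result(p)
      import :: c_ptr, c_size_t
      type(c_ptr), value :: p_old
      integer(c_size_t), value :: nbytes
      type(c_ptr) :: p
    end function c_realloc
    subroutine c_free(p) bind(c, name="free")
      import :: c_ptr
      type(c_ptr), value :: p
    end subroutine c_free
  end interface

  integer, parameter :: n_max = 32000, n_used = 10
  type(c_ptr) :: raw, shrunk
  integer(c_int), pointer :: v(:)
  integer :: i

  ! Reserve the design capacity and view it as a Fortran array.
  raw = c_malloc(int(n_max, c_size_t) * c_sizeof(0_c_int))
  if (.not. c_associated(raw)) error stop "malloc failed"
  call c_f_pointer(raw, v, [n_max])
  v(:n_used) = [(i, i = 1, n_used)]

  ! Shrink to the size actually needed. realloc may or may not move the
  ! block, so v must be re-mapped from the returned address.
  shrunk = c_realloc(raw, int(n_used, c_size_t) * c_sizeof(0_c_int))
  if (.not. c_associated(shrunk)) error stop "realloc failed"
  raw = shrunk
  call c_f_pointer(raw, v, [n_used])

  print *, size(v), v(n_used)

  call c_free(raw)
  nullify(v)
end program realloc_demo
```

Whether the shrink releases memory in place is up to the C library, as noted above; the sketch only shows that the first n_used values survive the call and that the Fortran view has the new extent.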

Dear nvaneck,

I agree with you about the need for a truncate command, which I believe would make the acquisition of data from unknown data files less erratic.

Have you found an efficient and elegant way to handle the problem?

No, I haven't; I still would like one and it would be useful. What I've done is reserve a global maximum number of variables, currently 1,000, and then, after the user enters the problem parameters, allocate the necessary amount for each subsequent array, leaving the initial, larger arrays allocated. I still have to reserve large buffers and vectors for data input, but otherwise I am depending on the cheapness of memory to make it less necessary as everyone ramps up the memory they have in their machines and moves to 64-bit systems.

There is a significant code size penalty to truncating by re-allocate/move-alloc, and a further code size penalty to explicitly deallocating arrays that are no longer necessary, so I now leave them allocated and let the routine exit deal with it instead. What I would typically save by move/realloc is lost or overwhelmed by the increase in the code size. Also, making careful and judicious use of automatic allocation on the stack has helped, but it's a bit risky.

I forgot to mention: The maximum variables setting is user controlled, so they get an insufficient memory message and then can set it higher when needed. Ultimate maximum number of variables is 10,000, and at least one State agency user has needed to handle over 8,000.

Also, MOVE_ALLOC has been useful at certain times, like initially reserving 100 cells per variable when constructing frequency tables and re-allocating for those that need more as the data is processed.

I still would like a TRUNCATE command, but it looks like it's not going to happen.

Any compiler provided truncate facility isn't going to come for free from an object code point of view.

That said, with the following program I see minimal increase in the final exe size if I progressively comment in or out the lines marked #A, when I use a command line compile with 14.0.0 with the options:

/O1 /warn:all /standard-semantics /threads [ /libs:static | /libs:dll ]

PROGRAM TruncateMe
  IMPLICIT NONE
  CALL main()
CONTAINS
  SUBROUTINE main()
    INTEGER, DIMENSION(:), ALLOCATABLE :: a1(:), a2(:), a3(:), a4(:)
    INTEGER :: new_size(4)
    INTEGER :: i
    REAL :: r(4)
    !****
    a1 = [(i,i=1,1000)]
    a2 = a1
    a3 = a1
    a4 = a1
    
    CALL RANDOM_SEED()
    CALL RANDOM_NUMBER(r)
    new_size = MAX(1, INT(SIZE(a1) * r))
    
    CALL truncate(a1, new_size(1))      ! #A
    CALL truncate(a2, new_size(2))      ! #A
    CALL truncate(a3, new_size(3))      ! #A
    CALL truncate(a4, new_size(4))      ! #A
    
    PRINT *, a1(SIZE(a1))
    PRINT *, a2(SIZE(a2))
    PRINT *, a3(SIZE(a3))
    PRINT *, a4(SIZE(a4))
  END SUBROUTINE main
  SUBROUTINE truncate(array, sz)  
    INTEGER, INTENT(IN), ALLOCATABLE :: array(:)   ! <-- intent(in) !!!
    INTEGER, INTENT(IN) :: sz
    INTEGER, ALLOCATABLE :: tmp(:)
    tmp = array(:sz)
    CALL MOVE_ALLOC(tmp, array)  ! <-- Variable definition context!!!!!
  END SUBROUTINE truncate
END PROGRAM TruncateMe

Perhaps there would be a size increase if I inlined the truncate subroutine (I didn't test), but an increased object size is what you'd expect if you inline something. If you enable higher optimization you will see a size change, but again, I suspect that is due to inlining (I didn't look at the generated assembly).

I'm interested in any use cases/source examples you might have.

If Dr Fortran is reading this... what is of more concern is that no executable should have been generated at all, because the example code violates C539 in F2008!

I also looked at using the a1 = a1(:new_size(1)) type form, which I "expect" (but not forever...) compilers to not handle as well as the process in the truncate procedure. Then I saw about a 500 byte increase in source with each statement, but there was no difference between three and four statements (which surprises me).

Very interesting and clever example. I may be able to make use of something like that. I think I'll play around with it and see what happens in my situation.

Ian, thanks for pointing out the issue with MOVE_ALLOC and INTENT(IN). Seems to me I saw an interpretation go by on this, but on the face of it I agree with you. I will research this more, but I think an error is indeed warranted.

Your truncate subroutine is really just the opposite of the more normal "reallocate larger" routine that MOVE_ALLOC was designed for. Your use of it here is absolutely appropriate.

Steve

If I'm understanding this, it appears a=a(:new_size) is the simplest "truncate" command I was looking for, and Ian's TRUNCATE routine may be better for executable size. The simpler version has the happy ability to work for any type of array, whereas TRUNCATE requires a separate version for each.

Thanks guys, I wasn't aware of MOVE_ALLOC but it looks like a very useful tool!

A previous discussion from 1995 can be found here: http://computer-programming-forum.com/49-fortran/899591db6f22e11e.htm

We already have the MOVE_ALLOC/INTENT(IN) issue on our list from http://software.intel.com/en-us/forums/topic/392663 Issue ID is DPD200244422.

Steve

So is Ian's code legal Fortran for reallocating an allocatable array if you change the intent to INTENT(INOUT)?

I tried out the routine and it has an acceptable, minimal impact on code size. It just needs an explicit allocation of tmp to trap any insufficient memory issues. The alternative of a=a(:new_size) gives a hit of 1K+ and won't trap failures, so it's not really a good idea.

Yes, Ian's code is fine as long as "array" is INTENT(INOUT). You don't want INTENT(OUT) as that will deallocate it on entry to the routine! But the simple assignment works too, though it MIGHT make an extra copy along the way.

Steve
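A sketch of the truncate routine with the two changes discussed above: "array" declared INTENT(INOUT), and tmp allocated explicitly with STAT= so that an insufficient-memory condition can be trapped. The extra stat argument and the test values (1000 elements cut to 10) are assumptions added for illustration, not part of Ian's original code.

```fortran
program truncate_demo
  implicit none
  integer, allocatable :: a(:)
  integer :: i, stat

  allocate(a(1000))
  a = [(i, i = 1, 1000)]

  call truncate(a, 10, stat)
  if (stat /= 0) error stop "truncate: allocation failed"

  print *, size(a), a(size(a))

contains

  ! Shrink array to its first sz elements.
  ! Assumes array is allocated with lower bound 1 and 0 <= sz <= size(array).
  subroutine truncate(array, sz, stat)
    integer, allocatable, intent(inout) :: array(:)  ! INOUT, not IN or OUT
    integer, intent(in) :: sz
    integer, intent(out) :: stat                     ! 0 on success
    integer, allocatable :: tmp(:)

    allocate(tmp(sz), stat=stat)
    if (stat /= 0) return        ! array is left untouched on failure
    tmp = array(:sz)             ! shapes match, so no reallocation of tmp
    call move_alloc(tmp, array)  ! old storage of array is released here
  end subroutine truncate
end program truncate_demo
```

As noted earlier in the thread, a routine like this is type-specific: a separate version (or a generic interface over several versions) would be needed for each array type, whereas a = a(:new_size) works for any type but cannot report an allocation failure.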

Thanks, Steve, Ian,

I think I'm going to find it useful as a REALLOC routine in general...

Neal

The MOVE_ALLOC/INTENT(IN) issue has been fixed for a release later this year.

Steve

nvaneck,

I do not know if your program has this behavior, but it seems like you have an unknown input data size (quantity) at the point of read-in and that it is impractical (given the number of code changes) to handle it other than with something like TRUNCATE. In the programs that I have experience with, the unknown-sized data is usually read in one read loop. Might it be better to modify the read-in subroutine to read into a temporary, then copy to the properly allocated destination array? The temporary read-in array can be re-used multiple times (and thus need only be allocated once).

Jim Dempsey

www.quickthreadprogramming.com
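One way the read-into-a-temporary suggestion above might look. The names scratch and load are hypothetical, and the fill loop is a stand-in for the real read; the 32000 capacity is the design maximum mentioned earlier in the thread.

```fortran
program scratch_demo
  implicit none
  integer, parameter :: max_items = 32000   ! design capacity
  integer, allocatable :: scratch(:)        ! allocated once, reused
  integer, allocatable :: first(:), second(:)

  allocate(scratch(max_items))

  call load(3, first)     ! each destination ends up exactly sized
  call load(5, second)

  print *, size(first), size(second)
  deallocate(scratch)

contains

  ! Hypothetical read-in routine: fills the shared scratch buffer, then
  ! copies only the items actually read into an exactly sized array.
  subroutine load(n_items, dest)
    integer, intent(in) :: n_items
    integer, allocatable, intent(out) :: dest(:)
    integer :: n, i, stat

    n = 0
    do i = 1, n_items                 ! stand-in for the real read loop
      if (n == size(scratch)) exit    ! capacity reached
      n = n + 1
      scratch(n) = 10 * i
    end do

    allocate(dest(n), stat=stat)
    if (stat /= 0) error stop "load: allocation failed"
    dest = scratch(:n)
  end subroutine load
end program scratch_demo
```

Only the single scratch buffer is ever oversized, so no destination array needs truncating afterwards.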

Yes, Jim, for data reading that's exactly what I do.

It's actually a case allowing maximum space for various arrays while interpreting user-supplied parameters. Once the setup is fully interpreted, actual required dimensions of remaining arrays are used, but some initial vectors and arrays can be large. Some of the time vectors can be left large as the code to move_alloc is not worth the recovery gain. The REALLOC solution handles the remaining ones.

A global setting, i.e., setting maximum allowed variables, is under user control, which provides help if there is limited memory on the application machine (not likely anymore except in developing countries), and local maximums can also be set.
