Updatable logical arrays in Prolog: fixing the semantics of setarg/3

Let us recall that setarg(I,Term,Value) installs Value as the I-th argument of Term and takes care to put back the old value found there on backtracking.

Setarg/3 is probably the most widelly used (at least in SICStus, Aquarius, Eclipse, ProLog-by-BIM, BinProlog), incarnation of backtrackable mutable arrays (overloaded on Prolog's universal term type).

Unfortunately setarg/3 lacks a logical semantics and is implemented differently in in various systems. This is may be the reason why the standardization (see its absence from the Prolog ISO Draft) of setarg/3 can hardly reach a consensus in the predictable future.

Ideally, setarg/3 should work as if a new term (array) had been created identical to the old one, except for the given element. There's no reason to `destroy' a priori the content of the updated cell or to make it subject to ugly side effects. Should this have to happen for implementation reasons, a run-time error should be produced, at least, although this is not the case in any implementation we know of.

Let us start with an inefficient but fairly clean array-update operation, setarg/4.

setarg(I,OldTerm,NewValue,NewTerm):-
  functor(OldTerm,F,N),
  functor(NewTerm,F,N),
  arg(I,NewTerm,NewValue),
  copy_args_except_I(N,I,OldTerm,NewTerm).

copy_args_except_I(0,_,_,_):-!.
copy_args_except_I(I,I,Old,New):-!,I1 is I-1,
  copy_args_except_I(I1,I,Old,New).
copy_args_except_I(N,I,Old,New):-N1 is N-1,arg(N,Old,X),arg(N,New,X),
  copy_args_except_I(N1,I,Old,New).

We can suppose that functor and arg are specified by a (finite) set of facts describing their behavior on the signature of the program. For a given program, we can obviously see setarg/4 as being specified similarly by a finite set of facts.

Furthermore,suppose that all uses of setarg/3 are replaced by calls to setarg/4 with the new states passed around with DCG-style chained variables.

This looks like a good definition of the intended meaning of a program using setarg/3.

We will show that actual implementations (Sicstus and BinProlog) can be made to behave accordingly to this semantics through a small, source level wrapping into a suitable ADT.

Let '$box'/1 be a new functor not in the signature of any user program. By defining

safe_arg(I,Term,Val):-arg(I,Term,'$box'(Val)).

safe_setarg(I,Term,Val):-setarg(I,Term,'$box'(Val)).

Using '$box'/1 in safe_arg/3 (safe_setarg/3) ensures that cell I of the functor Term will be indirectly accessed (updated) even if it contains a variable which in a WAM would be created on place and therefore it would be subject of unpredictable side-effects.

The reason of the draconian warning in some Prolog manuals manual

...setarg is only safe if there is no further use of the old value of the replaced argument...
. will therefore disappear and a purely logical setarg (with a translation semantics expressible in term of setarg/4) can be implemented. Not only this ensures referential transparency and allows normal references to the old content of the updated cells but it also makes incompatible implementations of setarg (Sicstus, Eclipse, BinProlog) work exactly the same way5.

To finish the job properly, something like the following ADT can be created.

new_array(Size,Array):-
  functor(Array,'$array',Size).

update_array(Array,I,Val):-
  safe_setarg(I,Array,Val).

access_array(Array,I,Value):-
  safe_arg(I,Array,Value).

We suggest to use this ADT in your program instead of basic setarg when performance is not an absoulte requirement.

A new change_arg/3 builtin has been added to BinProlog to allow, efficient failure-driven iteration with persistent information. It works like setarg/3 except that side-effects are permanent. Sould unsafe heap objects be generated through the precess change_arg/3 signals a run-time error. This is not the case as far the result is either a constant (which is does not need new heap allocation) or the result of moving a preexistent heap object to a new location.

For instance the (Haskell-style) fold/4 predicate (see library/high.pl) uses change_arg/3 to avoid painful iteration and slow side-effects on the dynamic database. The implementation is competitive is speed with hand-written explicitely recursive code and uses only memory proportional to the size of the answer.

% fold,foldl based on safe failure driven destructive change_arg
foldl(Closure,Null,List,Final):-fold(Closure,Null,X^member(X,List),Final).

fold(Closure,Null,I^Generator,Final):-
  fold0(s(Null),I,Generator,Closure,Final).

fold0(Acc,I,Generator,Closure,_):-
  term_append(Closure,args(SoFar,I,O),Selector),
  Generator,
    arg(1,Acc,SoFar),
    Selector,
    change_arg(1,Acc,O),
  fail.
fold0(Acc,_,_,_,Final):-
  arg(1,Acc,Final).

?- foldl(+,0,[1,2,3],R).
?- fold(*,1,X^member(X,[1,2,3]),R).